Abstract
The problem of control is highly relevant to natural language generation (NLG) when a system can produce one-to-many mappings from its input to output texts. This thesis discusses the control of natural language generation from the point of view of stylistic variation. The scenario is one in which the user specifies values on the stylistic dimensions presented to her, and a text conforming to the specified style has to be produced. This involves two issues: (1) the identification of the stylistic dimensions; and (2) the efficient control of the generation process in a search space that can become very large.
Stylistic variation is analysed following the methodology of Biber (1988) in order to obtain the stylistic dimensions present in the texts of a corpus. This approach also produces a scoring function for each stylistic dimension, so a score on each dimension can be obtained for any text, whether it comes from the corpus or is produced by a generator. However, this is not sufficient by itself to guide a generator to produce texts with the stylistic scores specified by a user: it could be used in a 'generate-and-test' approach, but this is not efficient.
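As a rough illustration of what such a scoring function can look like, the sketch below follows the general shape of Biber-style dimension scores: feature rates are standardised against the corpus, then features loading positively on a dimension are added and those loading negatively are subtracted. The feature names, the rates, and the function `dimension_score` are hypothetical and are not taken from the thesis; the actual features and loadings would come from the factor analysis of the corpus.

```python
from statistics import mean, pstdev
from typing import Dict, List, Sequence

Rates = Dict[str, float]  # feature name -> frequency per 1,000 words (assumed unit)


def dimension_score(
    text: Rates,
    corpus: List[Rates],
    positive: Sequence[str],
    negative: Sequence[str],
) -> float:
    """Sum of standardised rates of positive features minus negative ones.

    Assumption: every feature named in `positive` or `negative` is present
    in `text` and in every entry of `corpus`.
    """
    score = 0.0
    for sign, features in ((1.0, positive), (-1.0, negative)):
        for feature in features:
            values = [rates[feature] for rates in corpus]
            mu, sigma = mean(values), pstdev(values)
            if sigma == 0:
                # A feature that never varies in the corpus carries no
                # stylistic information, so it is skipped.
                continue
            score += sign * (text[feature] - mu) / sigma
    return score


# Hypothetical two-text corpus with two features.
corpus = [
    {"pronouns": 10.0, "nouns": 30.0},
    {"pronouns": 30.0, "nouns": 10.0},
]
score = dimension_score(corpus[1], corpus, positive=["pronouns"], negative=["nouns"])
```

A function of this kind can score a finished text, which is why on its own it only supports generate-and-test: the generator must realise a complete text before its style can be measured.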
To guide the generator, a link between stylistic scores and generator decisions has to be obtained. This research implements an approach that predicts the scores in the stylistic dimensions from a configuration of generator decisions. The resulting prediction function was introduced into the generator to help choose the best path in the search space in accordance with the style desired by the user, yielding an efficient control mechanism for the generation of texts in different styles.
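The idea of steering the search with a prediction function can be sketched as follows. This is a minimal illustration under strong assumptions, not the mechanism implemented in the thesis: the prediction function is assumed to be additive (an intercept plus one learned contribution per decision), the search is a simple greedy pass over the decision points, and all decision names and weights are invented for the example.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Tuple[float, ...]          # one value per stylistic dimension
ChoicePoint = Dict[str, Vector]     # option name -> assumed contribution


def predict(choices: Sequence[str], points: Sequence[ChoicePoint],
            intercept: Vector) -> Vector:
    """Predicted style scores for the decisions taken so far."""
    total = list(intercept)
    for option, point in zip(choices, points):
        for i, weight in enumerate(point[option]):
            total[i] += weight
    return tuple(total)


def distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two score vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def guided_choices(points: Sequence[ChoicePoint], intercept: Vector,
                   target: Vector) -> List[str]:
    """Greedily pick, at each decision point, the option whose predicted
    scores lie closest to the target style. No text is realised here."""
    chosen: List[str] = []
    for point in points:
        best = min(
            sorted(point),  # sorted for deterministic tie-breaking
            key=lambda opt: distance(
                predict(chosen + [opt], points, intercept), target
            ),
        )
        chosen.append(best)
    return chosen


# Hypothetical decision points with contributions on a single dimension.
points = [
    {"active": (1.0,), "passive": (-1.0,)},
    {"pronoun": (0.5,), "full_np": (-0.5,)},
]
plan = guided_choices(points, intercept=(0.0,), target=(1.5,))
```

The contrast with generate-and-test is that each candidate decision is evaluated by the predictor before any text exists, so only one path through the search space is ever realised. A greedy pass is the simplest possible strategy and may miss the best overall configuration; a beam or best-first search over the same predictor would be a natural alternative.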
Date of Award: Nov 2004
Supervisor: Roger Evans