This thesis analyses the structure of natural language queries to document repositories, with the aim of finding better methods for information retrieval. The exponential growth of information on the Web and in other large document repositories over recent decades motivates research into helping end users find the information relevant to their needs. A problem shared by several related research areas, such as information retrieval, text summarisation and question answering, is how to derive concise textual expressions that describe what a document is about and that can serve as a bridge between queries and document content. In current approaches, such expressions are typically generated from shallow features, for example by simply selecting a few of the most frequently occurring keywords. However, such approaches are inadequate for generating expressions that truly resemble user queries.
|Date of Award