PCFG learning by nonterminal partition search

Anja Belz

Research output: Chapter in Book/Conference proceeding with ISSN or ISBN › Conference contribution with ISSN or ISBN › peer-review

Abstract

PCFG Learning by Partition Search is a general grammatical inference method for constructing, adapting and optimising PCFGs. Given a training corpus of examples from a language, a canonical grammar for the training corpus, and a parsing task, Partition Search PCFG Learning constructs a grammar that maximises performance on the parsing task and minimises grammar size. This paper describes Partition Search in detail, also providing theoretical background and a characterisation of the family of inference methods it belongs to. The paper also reports an example application to the task of building grammars for noun phrase extraction, a task that is crucial in many applications involving natural language processing. In the experiments, Partition Search improves parsing performance by up to 21.45% compared to a general baseline and by up to 3.48% compared to a task-specific baseline, while reducing grammar size by up to 17.25%.

Original language: English
Title of host publication: Grammatical Inference: Algorithms and Applications: Proceedings of the 6th International Colloquium: ICGI 2002
Editors: P. Adriaans, H. Fernau, M. van Zaanen
Place of Publication: Berlin, Germany
Publisher: Springer
Pages: 14-27
Number of pages: 14
Volume: 2484/2
ISBN (Electronic): 1611-3349
ISBN (Print): 0302-9743
Publication status: Published - 1 Jan 2002
Event: Grammatical Inference: Algorithms and Applications: Proceedings of the 6th International Colloquium: ICGI 2002 - Amsterdam, The Netherlands, September 23-25, 2002
Duration: 1 Jan 2002 → …

Publication series

Name: Lecture Notes in Computer Science

Keywords

  • Partition search
