Learning grammars for noun phrase extraction by partition search

Anja Belz

University of Brighton

Research output: Chapter in Book/Conference proceeding with ISSN or ISBN › Conference contribution with ISSN or ISBN › peer-review

Abstract

This paper describes an application of Grammar Learning by Partition Search to noun phrase extraction, an essential task in information extraction and many other NLP applications. Grammar Learning by Partition Search is a general method for automatically constructing grammars for a range of parsing tasks; it constructs an optimised probabilistic context-free grammar by searching a space of nonterminal set partitions, looking for a partition that maximises parsing performance and minimises grammar size. The idea is that the considerable time and cost involved in building new grammars can be avoided if instead existing grammars can be automatically adapted to new parsing tasks and new domains. This paper presents results for applying Partition Search to the tasks of (i) identifying flat NP chunks, and (ii) identifying all NPs in a text. For NP chunking, Partition Search improves a general baseline result by 12.7%, and a method-specific baseline by 2.2%. For NP identification, Partition Search improves the general baseline by 21.45%, and the method-specific one by 3.48%. Even though the grammars are nonlexicalised, results for NP identification closely match the best existing results for lexicalised approaches.
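As an illustration of what "a partition of the nonterminal set" does to a grammar, the following minimal sketch merges nonterminals that fall into the same partition block and re-estimates rule probabilities. It is not the paper's implementation: the function name apply_partition, the toy rule counts, the labels NP-SBJ and NP-OBJ, and the use of relative-frequency estimation are all assumptions made for this example, and the search over candidate partitions is not shown.

```python
from collections import defaultdict


def apply_partition(rule_counts, block_of):
    """Merge nonterminals according to a partition and re-estimate probabilities.

    rule_counts: dict mapping (lhs, rhs_tuple) -> observed count (assumed input)
    block_of:    dict mapping a nonterminal to the label of its partition block;
                 symbols that are not listed keep their own name
    Returns a dict mapping merged (lhs, rhs_tuple) rules to probabilities.
    """
    # Rename every symbol to its block label; rules that become identical
    # after renaming have their counts pooled.
    merged = defaultdict(float)
    for (lhs, rhs), count in rule_counts.items():
        new_lhs = block_of.get(lhs, lhs)
        new_rhs = tuple(block_of.get(sym, sym) for sym in rhs)
        merged[(new_lhs, new_rhs)] += count

    # Relative-frequency estimate: normalise counts per left-hand side.
    lhs_totals = defaultdict(float)
    for (lhs, _), count in merged.items():
        lhs_totals[lhs] += count
    return {rule: count / lhs_totals[rule[0]] for rule, count in merged.items()}


# Hypothetical toy grammar with two fine-grained NP categories.
rule_counts = {
    ("NP-SBJ", ("DT", "NN")): 3,
    ("NP-SBJ", ("NN",)): 1,
    ("NP-OBJ", ("DT", "NN")): 2,
    ("NP-OBJ", ("NP-OBJ", "PP")): 2,
}

# One candidate partition: both NP categories share a single block "NP".
block_of = {"NP-SBJ": "NP", "NP-OBJ": "NP"}

pcfg = apply_partition(rule_counts, block_of)
```

Under this partition the four original rules collapse into three rules with left-hand side NP, so the merged grammar is smaller. A search procedure in the spirit of the abstract would score each candidate partition by the parsing performance and size of the resulting grammar.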

Original language	English
Title of host publication	Proceedings of the LREC 2002 workshop on linguistic knowledge acquisition and representation: bootstrapping annotated language data
Place of Publication	Amsterdam/Philadelphia
Publisher	John Benjamins Publishing Company
Pages	0-0
Number of pages	1
Publication status	Published - 1 Jan 2002
Event	Proceedings of the LREC 2002 workshop on linguistic knowledge acquisition and representation: bootstrapping annotated language data - Las Palmas, Canary Islands, Spain Duration: 1 Jan 2002 → …

Workshop

Workshop	Proceedings of the LREC 2002 workshop on linguistic knowledge acquisition and representation: bootstrapping annotated language data
Period	1/01/02 → …

Keywords

Grammar learning
Partition search

Access to Document

learning-grammars-for-np-extraction-FINAL.pdf (Other version, 238 KB; Licence: Unspecified)

Cite this

@inproceedings{936a6ac149ac4941b6ce27bdae164e94,

title = "Learning grammars for noun phrase extraction by partition search",

abstract = "This paper describes an application of Grammar Learning by Partition Search to noun phrase extraction, an essential task in information extraction and many other NLP applications. Grammar Learning by Partition Search is a general method for automatically constructing grammars for a range of parsing tasks; it constructs an optimised probabilistic context-free grammar by searching a space of nonterminal set partitions, looking for a partition that maximises parsing performance and minimises grammar size. The idea is that the considerable time and cost involved in building new grammars can be avoided if instead existing grammars can be automatically adapted to new parsing tasks and new domains. This paper presents results for applying Partition Search to the tasks of (i) identifying flat NP chunks, and (ii) identifying all NPs in a text. For NP chunking, Partition Search improves a general baseline result by 12.7%, and a method-specific baseline by 2.2%. For NP identification, Partition Search improves the general baseline by 21.45%, and the method-specific one by 3.48%. Even though the grammars are nonlexicalised, results for NP identification closely match the best existing results for lexicalised approaches.",

keywords = "Grammar learning, Partition search",

author = "Anja Belz",

year = "2002",

month = jan,

day = "1",

language = "English",

pages = "0--0",

booktitle = "Proceedings of the LREC 2002 workshop on linguistic knowledge acquisition and representation: bootstrapping annotated language data",

publisher = "John Benjamins Publishing Company",

note = "Proceedings of the LREC 2002 workshop on linguistic knowledge acquisition and representation: bootstrapping annotated language data ; Conference date: 01-01-2002",

}

Belz, A 2002, Learning grammars for noun phrase extraction by partition search. in Proceedings of the LREC 2002 workshop on linguistic knowledge acquisition and representation: bootstrapping annotated language data. John Benjamins Publishing Company, Amsterdam/Philadelphia, pp. 0-0, Proceedings of the LREC 2002 workshop on linguistic knowledge acquisition and representation: bootstrapping annotated language data, 1/01/02.

Learning grammars for noun phrase extraction by partition search. / Belz, Anja.
Proceedings of the LREC 2002 workshop on linguistic knowledge acquisition and representation: bootstrapping annotated language data. Amsterdam/Philadelphia: John Benjamins Publishing Company, 2002. p. 0-0.

TY - GEN

T1 - Learning grammars for noun phrase extraction by partition search

AU - Belz, Anja

PY - 2002/1/1

Y1 - 2002/1/1

N2 - This paper describes an application of Grammar Learning by Partition Search to noun phrase extraction, an essential task in information extraction and many other NLP applications. Grammar Learning by Partition Search is a general method for automatically constructing grammars for a range of parsing tasks; it constructs an optimised probabilistic context-free grammar by searching a space of nonterminal set partitions, looking for a partition that maximises parsing performance and minimises grammar size. The idea is that the considerable time and cost involved in building new grammars can be avoided if instead existing grammars can be automatically adapted to new parsing tasks and new domains. This paper presents results for applying Partition Search to the tasks of (i) identifying flat NP chunks, and (ii) identifying all NPs in a text. For NP chunking, Partition Search improves a general baseline result by 12.7%, and a method-specific baseline by 2.2%. For NP identification, Partition Search improves the general baseline by 21.45%, and the method-specific one by 3.48%. Even though the grammars are nonlexicalised, results for NP identification closely match the best existing results for lexicalised approaches.

KW - Grammar learning

KW - Partition search

M3 - Conference contribution with ISSN or ISBN

SP - 0

EP - 0

BT - Proceedings of the LREC 2002 workshop on linguistic knowledge acquisition and representation: bootstrapping annotated language data

PB - John Benjamins Publishing Company

CY - Amsterdam/Philadelphia

T2 - Proceedings of the LREC 2002 workshop on linguistic knowledge acquisition and representation: bootstrapping annotated language data

Y2 - 1 January 2002

ER -
