Abstract
Referring expression generation has recently been the subject of the first Shared Task Challenge in NLG. In this paper, we analyse the systems that participated in the Challenge in terms of their algorithmic properties, comparing new techniques to classic ones, based on results from a new human task-performance experiment and from the intrinsic measures that were used in the Challenge. We also consider the relationship between different evaluation methods, showing that extrinsic task-performance experiments and intrinsic evaluation methods yield results that are not significantly correlated. We argue that this highlights the importance of including extrinsic evaluation methods in comparative NLG evaluations.
Original language | English |
---|---|
Title of host publication | INLG '08 Proceedings of the Fifth International Natural Language Generation Conference |
Place of Publication | Stroudsburg, PA, USA |
Publisher | Association for Computational Linguistics |
Pages | 50-58 |
Number of pages | 9 |
Publication status | Published - 1 Jan 2008 |
Event | INLG '08 Proceedings of the Fifth International Natural Language Generation Conference, Salt Fork State Park, Ohio, USA. Duration: 1 Jan 2008 → … |