Data-to-text generation systems tend to be knowledge-based and manually built, which limits their reusability and makes them time- and cost-intensive to create and maintain. Methods for automating (part of) the system-building process exist, but do such methods risk a loss in output quality? In this paper, we investigate the cost/quality trade-off in generation system building. We compare six data-to-text systems created by predominantly automatic techniques against six systems for the same domain created by predominantly manual techniques. We evaluate the systems using intrinsic automatic metrics and human quality ratings. We find some correlation between the degree of automation in the system-building process and output quality, with more automation tending to mean lower evaluation scores. We also find discrepancies between the results of the automatic evaluation metrics and the human-assessed evaluation experiments. We discuss caveats in assessing system-building cost and the implications of the discrepancies between automatic and human evaluation.
Title of host publication: Empirical Methods in Natural Language Generation
Editors: E. Krahmer, M. Theune
Place of publication: Berlin, Heidelberg
Number of pages: 21
Publication status: Published - 1 Jan 2010
Series: Lecture Notes in Computer Science