Abstract
Studies assessing rating scales are very common in psychology and related fields, but are rare in NLP. In this paper we assess discrete and continuous scales used for measuring quality assessments of computer-generated language. We conducted six separate experiments designed to investigate the validity, reliability, stability, interchangeability and sensitivity of discrete vs. continuous scales. We show that continuous scales are viable for use in language evaluation, and offer distinct advantages over discrete scales.
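The paper itself does not ship code; as a purely illustrative sketch of the kind of comparison the abstract describes, the snippet below contrasts inter-rater agreement on a discrete (Likert-style) scale with agreement on a continuous (slider-style) scale. All ratings, scale ranges, and metric choices (weighted Cohen's kappa, Pearson correlation) are assumptions for demonstration only and are not taken from the study.

```python
# Illustrative sketch (not from the paper): comparing rater agreement on
# a discrete vs. a continuous quality-rating scale. Data are made up.
from scipy.stats import pearsonr
from sklearn.metrics import cohen_kappa_score

# Hypothetical quality ratings of 8 generated sentences by two raters.
rater1_discrete = [3, 4, 2, 5, 3, 4, 1, 2]   # 1-5 Likert-style scale
rater2_discrete = [3, 5, 2, 4, 3, 4, 2, 2]

rater1_continuous = [0.55, 0.80, 0.30, 0.95, 0.60, 0.75, 0.10, 0.35]  # slider in [0, 1]
rater2_continuous = [0.50, 0.85, 0.25, 0.90, 0.65, 0.70, 0.20, 0.30]

# Agreement on the discrete scale: quadratically weighted Cohen's kappa.
kappa = cohen_kappa_score(rater1_discrete, rater2_discrete, weights="quadratic")

# Agreement on the continuous scale: Pearson correlation between raters.
r, _ = pearsonr(rater1_continuous, rater2_continuous)

print(f"Discrete scale   (weighted kappa): {kappa:.2f}")
print(f"Continuous scale (Pearson r):      {r:.2f}")
```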
Original language | English |
---|---|
Title of host publication | The 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies |
Place of Publication | Stroudsburg, PA, USA |
Publisher | Association for Computational Linguistics |
Pages | 230-235 |
Number of pages | 6 |
ISBN (Print) | 9781932432886 |
Publication status | Published - 2011 |
Event | The 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Portland, Oregon, USA Duration: 19 Jun 2011 → 24 Jun 2011 |
Conference
Conference | The 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies |
---|---|
Period | 19/06/11 → 24/06/11 |