Morphological complexity and unsupervised learning: validating Russian inflectional classes using high frequency data

Dunstan Brown; Roger Evans

University of Brighton

Research output: Chapter in Book/Conference proceeding with ISSN or ISBN › Conference contribution with ISSN or ISBN › peer-review

Abstract

This paper addresses the question of whether it is possible to use machine learning techniques on linguistic data to validate linguistic theory. We determine how readily inflectional classes recognized by linguists can be inferred by an unsupervised learning method when it is presented with the paradigms of a small number (80) of high frequency Russian noun lexemes. We interpret this as a measure of the validity of the linguistic theory. Inflectional classes are of particular interest, because they constitute a kind of autonomous morphological complexity which has no direct relationship to other levels of linguistic description, and hence there is no other objective way of assessing a theoretical characterisation of them. Using the same method, we also examine the status of principal parts and defaults in inflectional classes, and the relationship between inflectional classes and stress in Russian nominal morphology. Our experiments suggest that this is an effective and interesting technique for shedding additional light on theoretical claims.

Original language	English
Title of host publication	Current issues in Morphological Theory: (Ir)regularity, analogy and frequency. Selected papers from the 14th International Morphology Meeting
Editors	Ferenc Kiefer, Mária Ladányi, Péter Siptár
Place of Publication	Amsterdam
Publisher	John Benjamins Publishing Co.
Pages	135-162
Number of pages	28
ISBN (Electronic)	9789027273833
ISBN (Print)	9789027248404
Publication status	Published - 1 May 2012
Event	Current issues in Morphological Theory: (Ir)regularity, analogy and frequency. Selected papers from the 14th International Morphology Meeting - Budapest, 13–16 May, 2010 Duration: 1 May 2012 → …

Publication series

Name	Current Issues in Morphological Theory

Conference

Conference	Current issues in Morphological Theory: (Ir)regularity, analogy and frequency. Selected papers from the 14th International Morphology Meeting
Period	1/05/12 → …

Access to Document

http://benjamins.com/#catalog/books/cilt.322.07bro/details

Licence: Unspecified

Cite this

Brown, D., & Evans, R. (2012). Morphological complexity and unsupervised learning: validating Russian inflectional classes using high frequency data. In F. Kiefer, M. Ladányi, & P. Siptár (Eds.), Current issues in Morphological Theory: (Ir)regularity, analogy and frequency. Selected papers from the 14th International Morphology Meeting (pp. 135-162). (Current Issues in Morphological Theory). John Benjamins Publishing Co. http://benjamins.com/#catalog/books/cilt.322.07bro/details

Brown, Dunstan ; Evans, Roger. / Morphological complexity and unsupervised learning: validating Russian inflectional classes using high frequency data. Current issues in Morphological Theory: (Ir)regularity, analogy and frequency. Selected papers from the 14th International Morphology Meeting. editor / Ferenc Kiefer ; Mária Ladányi ; Péter Siptár. Amsterdam : John Benjamins Publishing Co., 2012. pp. 135-162 (Current Issues in Morphological Theory).

@inproceedings{6ab38d519d7a44b28ad5afa39843e3b2,
title = "Morphological complexity and unsupervised learning: validating Russian inflectional classes using high frequency data",
abstract = "This paper addresses the question of whether it is possible to use machine learning techniques on linguistic data to validate linguistic theory. We determine how readily inflectional classes recognized by linguists can be inferred by an unsupervised learning method when it is presented with the paradigms of a small number (80) of high frequency Russian noun lexemes. We interpret this as a measure of the validity of the linguistic theory. Inflectional classes are of particular interest, because they constitute a kind of autonomous morphological complexity which has no direct relationship to other levels of linguistic description, and hence there is no other objective way of assessing a theoretical characterisation of them. Using the same method, we also examine the status of principal parts and defaults in inflectional classes, and the relationship between inflectional classes and stress in Russian nominal morphology. Our experiments suggest that this is an effective and interesting technique for shedding additional light on theoretical claims.",
author = "Dunstan Brown and Roger Evans",
year = "2012",
month = may,
day = "1",
language = "English",
isbn = "9789027248404",
series = "Current Issues in Morphological Theory",
publisher = "John Benjamins Publishing Co.",
pages = "135--162",
editor = "Ferenc Kiefer and M{\'a}ria Lad{\'a}nyi and P{\'e}ter Sipt{\'a}r",
booktitle = "Current issues in Morphological Theory: (Ir)regularity, analogy and frequency. Selected papers from the 14th International Morphology Meeting",
note = "Current issues in Morphological Theory: (Ir)regularity, analogy and frequency. Selected papers from the 14th International Morphology Meeting ; Conference date: 01-05-2012",
}

Brown, D & Evans, R 2012, Morphological complexity and unsupervised learning: validating Russian inflectional classes using high frequency data. in F Kiefer, M Ladányi & P Siptár (eds), Current issues in Morphological Theory: (Ir)regularity, analogy and frequency. Selected papers from the 14th International Morphology Meeting. Current Issues in Morphological Theory, John Benjamins Publishing Co., Amsterdam, pp. 135-162, Current issues in Morphological Theory: (Ir)regularity, analogy and frequency. Selected papers from the 14th International Morphology Meeting, 1/05/12. <http://benjamins.com/#catalog/books/cilt.322.07bro/details>

Morphological complexity and unsupervised learning: validating Russian inflectional classes using high frequency data. / Brown, Dunstan; Evans, Roger.
Current issues in Morphological Theory: (Ir)regularity, analogy and frequency. Selected papers from the 14th International Morphology Meeting. ed. / Ferenc Kiefer; Mária Ladányi; Péter Siptár. Amsterdam: John Benjamins Publishing Co., 2012. p. 135-162 (Current Issues in Morphological Theory).

TY - GEN
T1 - Morphological complexity and unsupervised learning: validating Russian inflectional classes using high frequency data
AU - Brown, Dunstan
AU - Evans, Roger
PY - 2012/5/1
Y1 - 2012/5/1
AB - This paper addresses the question of whether it is possible to use machine learning techniques on linguistic data to validate linguistic theory. We determine how readily inflectional classes recognized by linguists can be inferred by an unsupervised learning method when it is presented with the paradigms of a small number (80) of high frequency Russian noun lexemes. We interpret this as a measure of the validity of the linguistic theory. Inflectional classes are of particular interest, because they constitute a kind of autonomous morphological complexity which has no direct relationship to other levels of linguistic description, and hence there is no other objective way of assessing a theoretical characterisation of them. Using the same method, we also examine the status of principal parts and defaults in inflectional classes, and the relationship between inflectional classes and stress in Russian nominal morphology. Our experiments suggest that this is an effective and interesting technique for shedding additional light on theoretical claims.
M3 - Conference contribution with ISSN or ISBN
SN - 9789027248404
T3 - Current Issues in Morphological Theory
SP - 135
EP - 162
BT - Current issues in Morphological Theory: (Ir)regularity, analogy and frequency. Selected papers from the 14th International Morphology Meeting
A2 - Kiefer, Ferenc
A2 - Ladányi, Mária
A2 - Siptár, Péter
PB - John Benjamins Publishing Co.
CY - Amsterdam
T2 - Current issues in Morphological Theory: (Ir)regularity, analogy and frequency. Selected papers from the 14th International Morphology Meeting
Y2 - 1 May 2012
ER -

Brown D, Evans R. Morphological complexity and unsupervised learning: validating Russian inflectional classes using high frequency data. In Kiefer F, Ladányi M, Siptár P, editors, Current issues in Morphological Theory: (Ir)regularity, analogy and frequency. Selected papers from the 14th International Morphology Meeting. Amsterdam: John Benjamins Publishing Co. 2012. p. 135-162. (Current Issues in Morphological Theory).
