Standard

Internet data in the study of language change: A case study of alternations in Russian comparatives and a program to work with such data. / Magomedova, V. D.; Slioussar, N. A.

In: Komp'juternaja Lingvistika i Intellektual'nye Tehnologii, 2014, p. 379-390.

Research output: Contribution to journal › Article › peer-review

BibTeX

@article{60839ee288044c3caa52866afa798c3d,
title = "Internet data in the study of language change: A case study of alternations in Russian comparatives and a program to work with such data",
abstract = "The Internet is a unique source of non-standard forms, which gives us a novel opportunity to analyze fine-grained dynamics of language change. We used this opportunity to study the decay of historic consonant alternations in Russian. In standard Russian, these alternations are present in some verb forms and in comparatives (e.g. suxoj 'dry' - sushe 'drier', ljubit' 'to love' - ljublju 'I love'), as well as before certain derivational suffixes. Verb forms have been recently studied by Slioussar and Kholodilova (2013), and we looked at comparatives. Two groups of adjectives were selected: ones that have normative comparatives with alternations and ones that do not, but native speakers still try to generate such forms. In the first group, some adjectives like ubogij 'poky' have up to 30\% of comparatives without alternations, but, unlike with verbs, no significant correlation with adjective frequency or its other characteristics was found. The second group consisted primarily of compound adjectives ending in -gij, -kij, -xij. Here, the most important factor is whether the second part of the compound is used as an independent adjective. If it is not (e.g. as in dlinnorukij 'long-armed'), most comparatives lack alternations. Searching for forms on the Internet, we faced many problems. The counts provided by search engines are extremely inaccurate, only the first thousand results are shown, and the results cannot be downloaded in a convenient format and contain a lot of typos and other irrelevant data. We present a program called Lingui-Pingui that we developed to solve these and some other problems.",
keywords = "Comparative, Consonants, Historical alternations, Search optimization",
author = "Magomedova, {V. D.} and Slioussar, {N. A.}",
year = "2014",
language = "Russian",
pages = "379--390",
journal = "Компьютерная лингвистика и интеллектуальные технологии",
issn = "2221-7932",
publisher = "Российский государственный гуманитарный университет",

}

RIS

TY - JOUR

T1 - Internet data in the study of language change

T2 - A case study of alternations in Russian comparatives and a program to work with such data

AU - Magomedova, V. D.

AU - Slioussar, N. A.

PY - 2014

Y1 - 2014

N2 - The Internet is a unique source of non-standard forms, which gives us a novel opportunity to analyze fine-grained dynamics of language change. We used this opportunity to study the decay of historic consonant alternations in Russian. In standard Russian, these alternations are present in some verb forms and in comparatives (e.g. suxoj 'dry' - sushe 'drier', ljubit' 'to love' - ljublju 'I love'), as well as before certain derivational suffixes. Verb forms have been recently studied by Slioussar and Kholodilova (2013), and we looked at comparatives. Two groups of adjectives were selected: ones that have normative comparatives with alternations and ones that do not, but native speakers still try to generate such forms. In the first group, some adjectives like ubogij 'poky' have up to 30% of comparatives without alternations, but, unlike with verbs, no significant correlation with adjective frequency or its other characteristics was found. The second group consisted primarily of compound adjectives ending in -gij, -kij, -xij. Here, the most important factor is whether the second part of the compound is used as an independent adjective. If it is not (e.g. as in dlinnorukij 'long-armed'), most comparatives lack alternations. Searching for forms on the Internet, we faced many problems. The counts provided by search engines are extremely inaccurate, only the first thousand results are shown, and the results cannot be downloaded in a convenient format and contain a lot of typos and other irrelevant data. We present a program called Lingui-Pingui that we developed to solve these and some other problems.

AB - The Internet is a unique source of non-standard forms, which gives us a novel opportunity to analyze fine-grained dynamics of language change. We used this opportunity to study the decay of historic consonant alternations in Russian. In standard Russian, these alternations are present in some verb forms and in comparatives (e.g. suxoj 'dry' - sushe 'drier', ljubit' 'to love' - ljublju 'I love'), as well as before certain derivational suffixes. Verb forms have been recently studied by Slioussar and Kholodilova (2013), and we looked at comparatives. Two groups of adjectives were selected: ones that have normative comparatives with alternations and ones that do not, but native speakers still try to generate such forms. In the first group, some adjectives like ubogij 'poky' have up to 30% of comparatives without alternations, but, unlike with verbs, no significant correlation with adjective frequency or its other characteristics was found. The second group consisted primarily of compound adjectives ending in -gij, -kij, -xij. Here, the most important factor is whether the second part of the compound is used as an independent adjective. If it is not (e.g. as in dlinnorukij 'long-armed'), most comparatives lack alternations. Searching for forms on the Internet, we faced many problems. The counts provided by search engines are extremely inaccurate, only the first thousand results are shown, and the results cannot be downloaded in a convenient format and contain a lot of typos and other irrelevant data. We present a program called Lingui-Pingui that we developed to solve these and some other problems.

KW - Comparative

KW - Consonants

KW - Historical alternations

KW - Search optimization

UR - http://www.scopus.com/inward/record.url?scp=84904818552&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:84904818552

SP - 379

EP - 390

JO - Компьютерная лингвистика и интеллектуальные технологии

JF - Компьютерная лингвистика и интеллектуальные технологии

SN - 2221-7932

ER -

ID: 9219036