Quantitative Properties of Russian Adjective-Noun Collocations across Dictionaries and Corpora

Research output: Publications in books, reports, collections, conference proceedings; article in conference proceedings; peer-reviewed

Abstract

The paper discusses the differences between collocations extracted from a number of Russian dictionaries, paying attention to their corpus-based frequency characteristics. The aim of the study was, first, to analyze how collocations and set expressions are described in Russian explanatory and specialized dictionaries and to what extent their data coincide, and, second, to investigate how collocations presented in dictionaries are reflected in text corpora. This makes it possible to examine the interrelation between the “manually” collected data and modern corpora (the Russian National Corpus and ruTenTen). We tested the hypothesis that a high collocation frequency corresponds to the item being represented in several dictionaries. We considered 180 collocations built according to the “adjective / participle + noun” model. The results show that the dictionary data are heterogeneous and that the choice of lexical items does not coincide with their frequency characteristics: the examples are low-frequency, and about 34% are absent from the disambiguated subcorpus. Explanatory dictionaries and collocation dictionaries show the smallest overlap.
Original language: English
Title of host publication: Proceedings of the Computational Models in Language and Speech Workshop (CMLS 2020) co-located with 16th International Conference on Computational and Cognitive Linguistics (TEL 2020)
Pages: 202–211
Status: Published - 2020
Event: Computational Models in Language and Speech Workshop (CMLS 2020)
co-located with 16th International Conference on Computational and Cognitive Linguistics (TEL 2020)
- Kazan, Russian Federation
Duration: 12 Nov 2020 to 13 Nov 2020

Publication series

Name: CEUR Workshop Proceedings
Volume: 2780

Conference

Conference: Computational Models in Language and Speech Workshop (CMLS 2020)
co-located with 16th International Conference on Computational and Cognitive Linguistics (TEL 2020)
Abbreviated title: CMLS 2020, TEL 2020
Country: Russian Federation
City: Kazan
Period: 12/11/20 to 13/11/20

Scopus subject areas

  • Arts and Humanities (all)
