Collocations in Russian lexicography and Russian collocations database

Research outputpeer-review

2 Citations (Scopus)

Abstract

The paper presents the issue of collocability and collocations in Russian and gives a survey of a wide range of dictionaries both printed and online ones that describe collocations. Our project deals with building a database that will include dictionary and statistical collocations. The former can be described in various lexicographic resources whereas the latter can be extracted automatically from corpora. Dictionaries differ among themselves, the information is given in various ways, making it hard for language learners and researchers to acquire data. A number of dictionaries were analyzed and processed to retrieve verified collocations, however the overlap between the lists of collocations extracted from them is still rather small. This fact indicates there is a need to create a unified resource which takes into account collocability and more examples. The proposed resource will also be useful for linguists and for studying Russian as a foreign language. The obtained results can be important for machine learning and for other NLP tasks, for instance, automatic clustering of word combinations and disambiguation.

Original languageEnglish
Title of host publicationLREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings
EditorsNicoletta Calzolari, Frederic Bechet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Place of PublicationParis
PublisherEuropean Language Resources Association (ELRA)
Pages3198-3206
Number of pages9
ISBN (Electronic)9791095546344
Publication statusPublished - 2020
Event12th International Conference on Language Resources and Evaluation, LREC 2020: LREC 2020 - Marseille
Duration: 11 May 202016 May 2020

Publication series

NameLREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings

Conference

Conference12th International Conference on Language Resources and Evaluation, LREC 2020
Abbreviated titleLREC 2020
CountryFrance
CityMarseille
Period11/05/2016/05/20

Scopus subject areas

  • Language and Linguistics
  • Education
  • Library and Information Sciences
  • Linguistics and Language

Fingerprint Dive into the research topics of 'Collocations in Russian lexicography and Russian collocations database'. Together they form a unique fingerprint.

Cite this