The paper describes an integrated Russian collocations database that will contain automatically extracted collocations evaluated against dictionaries and linguistic expertise. Quantitative methods applied to corpus data allow researchers to evaluate the presented results. Statistical association measures are used to evaluate collocation strength and to produce lists of the most plausible collocations. The extracted collocations are supplied with statistical values and information. The aim of the project is to provide information about the collocational preferences of frequent lexemes, supplemented with additional information and examples from general and specialized corpora and other resources. The tool can be used in natural language processing for parse filtering and word sense disambiguation, as well as in lexicography, machine translation and language learning.
Translated title of the contribution: Russian Collocations Database: Preliminary Observations
Original language: Russian
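Title of host publication: Структурная и прикладная лингвистика (Structural and Applied Linguistics)
Subtitle of host publication: Межвузовский сборник. Выпуск 12. К 60-летию отделения прикладной, компьютерной и математической лингвистики СПбГУ (Interuniversity collection. Issue 12. On the 60th anniversary of the Department of Applied, Computational and Mathematical Linguistics, St. Petersburg State University)
Editors: И.С. Николаев (I.S. Nikolaev)
Place of Publication: СПб. (St. Petersburg)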
Publisher: RWTH Aachen University
Pages: 212–220
State: Published - 2019

    Research areas

  • database, collocability, corpora, dictionaries, statistics

    Scopus subject areas

  • Arts and Humanities(all)

ID: 62338093