The paper discusses adjective-noun word sketches produced for 20 Russian headwords. We analysed the differences between this output and collocations extracted from Russian dictionaries, and also validated the collocates by expert evaluation. The aim was to study to what extent the two data sources coincide and to investigate how collocations presented in dictionaries are reflected in a large Web corpus. Comparison with the gold standard shows low precision, whereas expert evaluation gives higher values. According to human assessment, logDice tends to extract more peculiar examples than joint frequency.
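
As an illustration of the two ranking criteria and the precision measure mentioned in the abstract, the following is a minimal sketch only. It assumes the standard logDice definition, 14 + log2(2·f_xy / (f_x + f_y)). The counts, the placeholder collocate names (`colloc_a` and so on), and the helper names are hypothetical and are not taken from the paper's data or code.

```python
import math

def log_dice(f_xy, f_x, f_y):
    """Standard logDice: 14 + log2(2 * f_xy / (f_x + f_y))."""
    return 14 + math.log2(2 * f_xy / (f_x + f_y))

def precision(extracted, gold):
    """Share of extracted collocates that also occur in the gold standard."""
    extracted = set(extracted)
    if not extracted:
        return 0.0
    return len(extracted & set(gold)) / len(extracted)

# Hypothetical toy counts: headword frequency and, for each candidate,
# (joint frequency with the headword, the candidate's own frequency).
F_HEADWORD = 1000
candidates = {
    "colloc_a": (50, 100_000),  # frequent pair, but a very common adjective
    "colloc_b": (20, 40),       # rarer pair, adjective strongly tied to the noun
    "colloc_c": (30, 5_000),
}

# Ranking by raw joint frequency.
rank_by_freq = sorted(candidates, key=lambda c: candidates[c][0], reverse=True)

# Ranking by logDice, which penalises collocates that are frequent everywhere.
rank_by_logdice = sorted(
    candidates,
    key=lambda c: log_dice(candidates[c][0], F_HEADWORD, candidates[c][1]),
    reverse=True,
)

if __name__ == "__main__":
    print("by joint frequency:", rank_by_freq)
    print("by logDice:        ", rank_by_logdice)
```

On these toy counts the two criteria reverse the order of the candidates, which is the kind of divergence the abstract refers to; `precision` can then be applied to either ranked list against a dictionary-derived set of collocates.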

Original language: English
Title of host publication: RASLAN 2020 - 14th Workshop on Recent Advances in Slavonic Natural Language Processing, Proceedings
Editors: Ales Horak, Pavel Rychly, Adam Rambousek
Publisher: Tribun EU
Pages: 125-131
Number of pages: 7
ISBN (Electronic): 9788026316008
State: Published - 2020
Event: 14th Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2020 - Virtual, Brno, Czech Republic
Duration: 8 Dec 2020 - 10 Dec 2020

Publication series

Name: Recent Advances in Slavonic Natural Language Processing
Volume: 2020-December
ISSN (Print): 2336-4289

Conference

Conference: 14th Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2020
Country/Territory: Czech Republic
City: Virtual, Brno
Period: 8/12/20 - 10/12/20

    Research areas

  • Word sketches, Collocations, Evaluation, Dictionaries, Russian language

    Scopus subject areas

  • Artificial Intelligence
  • Computational Theory and Mathematics
  • Information Systems
  • Software

ID: 71868467