This paper evaluates and compares vector-space semantic models based on the Word2Vec word embedding algorithm and on a count-based, association-oriented algorithm by measuring the association strength between Russian nouns and adjectives. A dataset of nouns and their associated adjectives serves as the test set for a pseudodisambiguation task. The models are trained on corpora of Russian fiction. A measure of lexical association anomaly is applied, which evaluates the similarity between the initial noun and the resulting attributive phrase. Association strength results are reported for models with different parameter values, and the best parameter value combinations are proposed. The test items on which the models err are manually annotated, and the model errors are categorized by their linguistic nature and compositionality features.
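
As a rough illustration of the pseudodisambiguation setup described above, the following minimal sketch scores each noun against an attested adjective and a confounder adjective and counts how often the attested one scores higher. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy three-dimensional vectors, the English stand-in words, the additive composition of the phrase vector, the use of cosine similarity as the association score, and the helper names `compose`, `association` and `pseudodisambiguation_accuracy`.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def compose(noun_vec, adj_vec):
    """Hypothetical phrase vector: element-wise sum (an assumed composition)."""
    return [a + b for a, b in zip(noun_vec, adj_vec)]


def association(noun, adj, vectors):
    """Similarity between the noun and the composed adjective-noun phrase."""
    phrase = compose(vectors[noun], vectors[adj])
    return cosine(vectors[noun], phrase)


def pseudodisambiguation_accuracy(test_items, vectors):
    """Share of items where the attested adjective outscores the confounder."""
    correct = 0
    for noun, attested, confounder in test_items:
        if association(noun, attested, vectors) > association(noun, confounder, vectors):
            correct += 1
    return correct / len(test_items)


# Toy vectors and test triples, invented purely for this sketch.
TOY_VECTORS = {
    "tea": [0.9, 0.1, 0.0],
    "road": [0.0, 0.2, 0.9],
    "hot": [0.8, 0.3, 0.1],
    "long": [0.1, 0.1, 0.8],
}
TOY_ITEMS = [("tea", "hot", "long"), ("road", "long", "hot")]

accuracy = pseudodisambiguation_accuracy(TOY_ITEMS, TOY_VECTORS)
print(f"pseudodisambiguation accuracy: {accuracy:.2f}")
```

In a real experiment the vectors would come from the trained models, and the items that the model gets wrong would be the ones passed on to manual annotation.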

Original language: English
Title of host publication: Analysis of Images, Social Networks and Texts - 5th International Conference, AIST 2016, Revised Selected Papers
Editors: Natalia Loukachevitch, Alexander Panchenko, Konstantin Vorontsov, Valeri G. Labunets, Andrey V. Savchenko, Dmitry I. Ignatov, Sergey I. Nikolenko, Mikhail Yu. Khachay
Publisher: Springer Nature
Pages: 236-247
Number of pages: 12
ISBN (Print): 9783319529196
State: Published - 1 Jan 2017
Event: 5th International Conference on Analysis of Images, Social Networks and Texts, AIST 2016 - Yekaterinburg, Russian Federation
Duration: 7 Apr 2016 to 9 Apr 2016

Publication series

Name: Communications in Computer and Information Science
Volume: 661
ISSN (Print): 1865-0929

Conference

Conference: 5th International Conference on Analysis of Images, Social Networks and Texts, AIST 2016
Country/Territory: Russian Federation
City: Yekaterinburg
Period: 7/04/16 to 9/04/16

Scopus subject areas

  • Computer Science (all)
  • Mathematics (all)

Research areas

  • Association measures, Distributional semantics, Selectional restrictions, Vector-space representation evaluation, Vector-space semantic models

ID: 47480880