In this paper we present distributed vector space models based on word embeddings and on a specific association-oriented, count-based distributional algorithm, and apply them to measuring association strength in Russian syntagmatic relations, namely between nouns and adjectives. We discuss the compositional properties of the vectors representing nouns, adjectives, and adjective-noun compositions, and propose two methods for detecting whether a syntactic association is possible. The accuracy of the proposed measures is evaluated with a pseudo-disambiguation test, on which all models score highly. Model errors are manually annotated and classified by their linguistic nature and compositionality features.
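
The pseudo-disambiguation test mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the cosine score between adjective and noun vectors stands in for the proposed association measures, and the function names, the toy three-dimensional vectors, and the transliterated word pairs are all hypothetical. Each attested adjective-noun pair is matched with a confounder adjective, and a model counts as correct when it scores the attested pair higher.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pseudo_disambiguation_accuracy(triples, vectors, score=cosine):
    """Share of triples where the attested adjective outscores the confounder.

    triples: iterable of (noun, attested_adjective, confounder_adjective).
    vectors: mapping from word to its vector.
    score:   association measure; cosine is an assumption made for this sketch.
    """
    triples = list(triples)
    correct = 0
    for noun, attested, confounder in triples:
        attested_score = score(vectors[attested], vectors[noun])
        confounder_score = score(vectors[confounder], vectors[noun])
        if attested_score > confounder_score:
            correct += 1
    return correct / len(triples)


# Hypothetical toy vectors (not taken from any trained model).
TOY_VECTORS = {
    "chaj": [0.9, 0.1, 0.0],         # "tea"
    "krepkij": [0.8, 0.2, 0.1],      # "strong"
    "stol": [0.1, 0.0, 0.9],         # "table"
    "derevjannyj": [0.0, 0.1, 0.9],  # "wooden"
}

# Each triple: noun, adjective attested with it, randomly assigned confounder.
TOY_TRIPLES = [
    ("chaj", "krepkij", "derevjannyj"),
    ("stol", "derevjannyj", "krepkij"),
]

accuracy = pseudo_disambiguation_accuracy(TOY_TRIPLES, TOY_VECTORS)
print(accuracy)
```

Passing the measure in as the `score` argument keeps the evaluation loop independent of any particular model, so several association measures could be compared on the same set of triples.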

Original language: English
Pages (from-to): 112-121
Number of pages: 10
Journal: Komp'juternaja Lingvistika i Intellektual'nye Tehnologii
State: Published - 1 Jan 2016
Event: 2016 International Conference on Computational Linguistics and Intellectual Technologies, Dialogue 2016 - Moscow, Russian Federation
Duration: 1 Jun 2016 to 4 Jun 2016

    Research areas

  • Adjective-noun phrases, Association measures, Compositional collocations, Distributional semantics, Pseudo-disambiguation, Russian corpora, Word2Vec

    Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
  • Computer Science Applications

ID: 47480962