Estimating syntagmatic association strength using distributional word representations

G. T. Bukia
E. V. Protopopova
P. V. Panicheva
O. A. Mitrofanova

In this paper we present distributed vector space models based on word embeddings, together with an association-oriented count-based distributional algorithm, and apply them to measuring association strength in Russian syntagmatic relations (specifically, between nouns and adjectives). We discuss the compositional properties of the vectors representing nouns, adjectives and adjective-noun compositions, and propose two methods for detecting whether a syntactic association is possible. The accuracy of the proposed measures is evaluated with a pseudo-disambiguation test, on which all models achieve high accuracy. The model errors are manually annotated and classified according to their linguistic nature and compositionality features.
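As an illustration of the evaluation idea, a pseudo-disambiguation test can be sketched as follows: for each noun, the model is given an attested adjective and a randomly chosen confounder adjective, and it scores a point when it ranks the attested one higher. The sketch below is not the authors' implementation. It assumes plain cosine similarity between noun and adjective vectors as a stand-in for the two measures proposed in the paper, and the function names, the toy three-dimensional vectors and the transliterated word list are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pseudo_disambiguation_accuracy(triples, vectors):
    """Share of triples where the attested adjective outscores the confounder.

    triples: iterable of (noun, attested_adjective, confounder_adjective)
    vectors: dict mapping each word to its embedding (a list of floats)
    """
    triples = list(triples)
    correct = 0
    for noun, attested, confounder in triples:
        score_attested = cosine(vectors[noun], vectors[attested])
        score_confounder = cosine(vectors[noun], vectors[confounder])
        if score_attested > score_confounder:
            correct += 1
    return correct / len(triples)


# Hypothetical toy embeddings, invented for illustration only.
TOY_VECTORS = {
    "chaj": [0.9, 0.1, 0.0],         # "tea"
    "krepkij": [0.8, 0.2, 0.1],      # "strong"
    "stol": [0.1, 0.2, 0.9],         # "table"
    "derevjannyj": [0.0, 0.1, 0.9],  # "wooden"
}

# Each triple: (noun, attested adjective, confounder adjective).
TOY_TRIPLES = [
    ("chaj", "krepkij", "derevjannyj"),
    ("stol", "derevjannyj", "krepkij"),
]

if __name__ == "__main__":
    print(pseudo_disambiguation_accuracy(TOY_TRIPLES, TOY_VECTORS))
```

In a real setting the vectors would come from a trained embedding model and the triples from corpus-attested adjective-noun pairs. Any other association measure can replace `cosine` without changing the evaluation loop.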

Original language	English
Pages (from-to)	112-121
Number of pages	10
Journal	Komp'juternaja Lingvistika i Intellektual'nye Tehnologii
State	Published - 1 Jan 2016
Event	2016 International Conference on Computational Linguistics and Intellectual Technologies, Dialogue 2016 - Moscow, Russian Federation. Duration: 1 Jun 2016 → 4 Jun 2016

Research areas

Adjective-noun phrases, Association measures, Compositional collocations, Distributional semantics, Pseudo-disambiguation, Russian corpora, Word2Vec

Scopus subject areas

Language and Linguistics
Linguistics and Language
Computer Science Applications

ID: 47480962