Evaluating distributional semantic models with Russian noun-adjective compositions

Polina Panicheva, Ekaterina Protopopova, Grigoriy Bukia, Olga Mitrofanova

Research output: Publications in books, reports, collections, conference proceedings > Article in conference proceedings

Abstract

In this paper, vector-space semantic models based on the Word2Vec word embedding algorithm and on a count-based association-oriented algorithm are evaluated and compared by measuring association strength between Russian nouns and adjectives. A dataset of nouns and associated adjectives is used as the test set for a pseudo-disambiguation task. Models are trained on corpora of Russian fiction. A measure of lexical association anomaly is applied, which evaluates the similarity between the initial noun and the resulting attributive phrase. Results of association strength are reported for models characterized by different parameter values, and the best parameter value combinations are proposed. The test exemplars responsible for model errors are manually annotated, and the errors are categorized in terms of their linguistic nature and compositionality features.
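The evaluation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how the attributive phrase vector is built or which similarity function is used, so the additive composition, the cosine similarity, the toy three-dimensional vectors, and all function names below are assumptions made for illustration only. Each test item pairs a noun with an attested adjective and a confounder adjective, and the model makes an error when the attested phrase does not score higher than the confounder phrase.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def compose(noun_vec, adj_vec):
    """Phrase vector as the element-wise sum (an assumed composition function)."""
    return [a + b for a, b in zip(noun_vec, adj_vec)]


def association(noun, adj, vectors):
    """Similarity between the noun and the composed adjective-noun phrase."""
    noun_vec = vectors[noun]
    return cosine(noun_vec, compose(noun_vec, vectors[adj]))


def pseudo_disambiguation_error(test_set, vectors):
    """Share of items where the attested adjective does not beat the confounder."""
    errors = 0
    for noun, attested, confounder in test_set:
        if association(noun, attested, vectors) <= association(noun, confounder, vectors):
            errors += 1
    return errors / len(test_set)


# Hypothetical toy vectors (transliterated Russian words); real models would
# supply vectors trained on a corpus.
TOY_VECTORS = {
    "chai": [0.9, 0.1, 0.0],         # "tea"
    "goryachiy": [0.8, 0.2, 0.1],    # "hot"
    "stol": [0.1, 0.2, 0.9],         # "table"
    "derevyannyy": [0.0, 0.1, 0.9],  # "wooden"
}

# (noun, attested adjective, confounder adjective)
TOY_TEST_SET = [
    ("chai", "goryachiy", "derevyannyy"),
    ("stol", "derevyannyy", "goryachiy"),
]

error_rate = pseudo_disambiguation_error(TOY_TEST_SET, TOY_VECTORS)
print(f"error rate: {error_rate:.2f}")
```

With trained embeddings, the same loop would be run once per parameter setting, and the items counted as errors would be the ones passed on for manual annotation.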

Original language: English
Title of host publication: Analysis of Images, Social Networks and Texts - 5th International Conference, AIST 2016, Revised Selected Papers
Editors: Natalia Loukachevitch, Alexander Panchenko, Konstantin Vorontsov, Valeri G. Labunets, Andrey V. Savchenko, Dmitry I. Ignatov, Sergey I. Nikolenko, Mikhail Yu. Khachay
Publisher: Springer
Pages: 236-247
Number of pages: 12
ISBN (Print): 9783319529196
DOI: https://doi.org/10.1007/978-3-319-52920-2_22
Publication status: Published - 1 Jan 2017
Event: 5th International Conference on Analysis of Images, Social Networks and Texts, AIST 2016 - Yekaterinburg, Russian Federation
Duration: 7 Apr 2016 → 9 Apr 2016

Publication series

Name: Communications in Computer and Information Science
Volume: 661
ISSN (Print): 1865-0929

Conference

Conference: 5th International Conference on Analysis of Images, Social Networks and Texts, AIST 2016
Country: Russian Federation
City: Yekaterinburg
Period: 7/04/16 → 9/04/16

Fingerprint

Semantics
Association reactions
Chemical analysis
Measures of Association
Compositionality
Model Error
Test Set
Anomaly
Vector space
Error Rate
Count
Vector spaces
Linguistics
Model
Model-based
Corpus
Similarity

Scopus subject areas

  • Computer Science (all)
  • Mathematics (all)

Cite this

Panicheva, P., Protopopova, E., Bukia, G., & Mitrofanova, O. (2017). Evaluating distributional semantic models with Russian noun-adjective compositions. In N. Loukachevitch, A. Panchenko, K. Vorontsov, V. G. Labunets, A. V. Savchenko, D. I. Ignatov, S. I. Nikolenko, ... M. Y. Khachay (Eds.), Analysis of Images, Social Networks and Texts - 5th International Conference, AIST 2016, Revised Selected Papers (pp. 236-247). (Communications in Computer and Information Science; Vol. 661). Springer. https://doi.org/10.1007/978-3-319-52920-2_22
Panicheva, Polina ; Protopopova, Ekaterina ; Bukia, Grigoriy ; Mitrofanova, Olga. / Evaluating distributional semantic models with Russian noun-adjective compositions. Analysis of Images, Social Networks and Texts - 5th International Conference, AIST 2016, Revised Selected Papers. editor / Natalia Loukachevitch ; Alexander Panchenko ; Konstantin Vorontsov ; Valeri G. Labunets ; Andrey V. Savchenko ; Dmitry I. Ignatov ; Sergey I. Nikolenko ; Mikhail Yu. Khachay. Springer, 2017. pp. 236-247 (Communications in Computer and Information Science).
@inproceedings{03fbe27c04c14d61a2095f5e47ec885f,
title = "Evaluating distributional semantic models with Russian noun-adjective compositions",
abstract = "In the paper vector-space semantic models based on Word2Vec word embeddings algorithm and a count-based association-oriented algorithm are evaluated and compared by measuring association strength between Russian nouns and adjectives. A dataset of nouns and associated adjectives is used as the test set for pseudodisambiguation task. Models are trained with corpora of Russian fiction. A measure of lexical association anomaly is applied evaluating similarity between the initial noun and the resulting attributive phrase. Results of association strength are reported for models characterized by different parameter values; the best parameter value combinations are proposed. The test exemplars producing the error rate are manually annotated, and the model errors are categorized in terms of their linguistic nature and compositionality features.",
keywords = "Association measures, Distributional semantics, Selectional restrictions, Vector-space representation evaluation, Vector-space semantic models",
author = "Polina Panicheva and Ekaterina Protopopova and Grigoriy Bukia and Olga Mitrofanova",
year = "2017",
month = "1",
day = "1",
doi = "10.1007/978-3-319-52920-2_22",
language = "English",
isbn = "9783319529196",
series = "Communications in Computer and Information Science",
publisher = "Springer",
pages = "236--247",
editor = "Natalia Loukachevitch and Alexander Panchenko and Konstantin Vorontsov and Labunets, {Valeri G.} and Savchenko, {Andrey V.} and Ignatov, {Dmitry I.} and Nikolenko, {Sergey I.} and Khachay, {Mikhail Yu.}",
booktitle = "Analysis of Images, Social Networks and Texts - 5th International Conference, AIST 2016, Revised Selected Papers",
address = "Germany",

}

Panicheva, P, Protopopova, E, Bukia, G & Mitrofanova, O 2017, Evaluating distributional semantic models with Russian noun-adjective compositions. in N Loukachevitch, A Panchenko, K Vorontsov, VG Labunets, AV Savchenko, DI Ignatov, SI Nikolenko & MY Khachay (eds), Analysis of Images, Social Networks and Texts - 5th International Conference, AIST 2016, Revised Selected Papers. Communications in Computer and Information Science, vol. 661, Springer, pp. 236-247, Yekaterinburg, Russian Federation, 7/04/16. https://doi.org/10.1007/978-3-319-52920-2_22

Evaluating distributional semantic models with Russian noun-adjective compositions. / Panicheva, Polina; Protopopova, Ekaterina; Bukia, Grigoriy; Mitrofanova, Olga.

Analysis of Images, Social Networks and Texts - 5th International Conference, AIST 2016, Revised Selected Papers. ed. / Natalia Loukachevitch; Alexander Panchenko; Konstantin Vorontsov; Valeri G. Labunets; Andrey V. Savchenko; Dmitry I. Ignatov; Sergey I. Nikolenko; Mikhail Yu. Khachay. Springer, 2017. pp. 236-247 (Communications in Computer and Information Science; Vol. 661).

TY  - GEN
T1  - Evaluating distributional semantic models with Russian noun-adjective compositions
AU  - Panicheva, Polina
AU  - Protopopova, Ekaterina
AU  - Bukia, Grigoriy
AU  - Mitrofanova, Olga
PY  - 2017/1/1
Y1  - 2017/1/1
N2  - In the paper vector-space semantic models based on Word2Vec word embeddings algorithm and a count-based association-oriented algorithm are evaluated and compared by measuring association strength between Russian nouns and adjectives. A dataset of nouns and associated adjectives is used as the test set for pseudodisambiguation task. Models are trained with corpora of Russian fiction. A measure of lexical association anomaly is applied evaluating similarity between the initial noun and the resulting attributive phrase. Results of association strength are reported for models characterized by different parameter values; the best parameter value combinations are proposed. The test exemplars producing the error rate are manually annotated, and the model errors are categorized in terms of their linguistic nature and compositionality features.
AB  - In the paper vector-space semantic models based on Word2Vec word embeddings algorithm and a count-based association-oriented algorithm are evaluated and compared by measuring association strength between Russian nouns and adjectives. A dataset of nouns and associated adjectives is used as the test set for pseudodisambiguation task. Models are trained with corpora of Russian fiction. A measure of lexical association anomaly is applied evaluating similarity between the initial noun and the resulting attributive phrase. Results of association strength are reported for models characterized by different parameter values; the best parameter value combinations are proposed. The test exemplars producing the error rate are manually annotated, and the model errors are categorized in terms of their linguistic nature and compositionality features.
KW  - Association measures
KW  - Distributional semantics
KW  - Selectional restrictions
KW  - Vector-space representation evaluation
KW  - Vector-space semantic models
UR  - http://www.scopus.com/inward/record.url?scp=85014236498&partnerID=8YFLogxK
U2  - 10.1007/978-3-319-52920-2_22
DO  - 10.1007/978-3-319-52920-2_22
M3  - Conference contribution
AN  - SCOPUS:85014236498
SN  - 9783319529196
T3  - Communications in Computer and Information Science
SP  - 236
EP  - 247
BT  - Analysis of Images, Social Networks and Texts - 5th International Conference, AIST 2016, Revised Selected Papers
A2  - Loukachevitch, Natalia
A2  - Panchenko, Alexander
A2  - Vorontsov, Konstantin
A2  - Labunets, Valeri G.
A2  - Savchenko, Andrey V.
A2  - Ignatov, Dmitry I.
A2  - Nikolenko, Sergey I.
A2  - Khachay, Mikhail Yu.
PB  - Springer
ER  - 

Panicheva P, Protopopova E, Bukia G, Mitrofanova O. Evaluating distributional semantic models with Russian noun-adjective compositions. In Loukachevitch N, Panchenko A, Vorontsov K, Labunets VG, Savchenko AV, Ignatov DI, Nikolenko SI, Khachay MY, editors, Analysis of Images, Social Networks and Texts - 5th International Conference, AIST 2016, Revised Selected Papers. Springer. 2017. p. 236-247. (Communications in Computer and Information Science). https://doi.org/10.1007/978-3-319-52920-2_22