Similarity between the association measures: A case study of noun phrases

Research output: Publications in books, reports, collections, conference proceedings › Article in conference proceedings › Research › Peer-reviewed

Abstract

Collocation extraction has gained much attention in natural language processing, and its results are important in various areas of applied linguistics. The research compares more than a dozen association measures on a subset of the Russian Web corpus. The paper studies automatically extracted Adj-Noun collocations. The aim of the experiments is twofold: first, to examine the differences between the statistical measures, and second, to find the most effective one for the Russian data. The former involves calculating Spearman’s rank correlation coefficient; the latter involves evaluating the extracted lists against a Russian dictionary, i.e. identifying collocations that are both automatically extracted and manually collected. The results are not straightforward, but one can distinguish groups of measures that are relatively interchangeable. In addition, the produced bigrams can be considered collocations by experts and thus may enrich dictionaries.
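The two-step procedure described in the abstract can be sketched roughly as follows. This is only an illustration, not the author's implementation: the record does not list which association measures were used, so PMI and t-score are chosen here as two common examples, and the bigram labels, counts, corpus size and "dictionary" set are all invented placeholders.

```python
import math

def pmi(o11, f1, f2, n):
    """Pointwise mutual information (log base 2) of a bigram.
    o11: bigram frequency, f1/f2: word frequencies, n: corpus size."""
    return math.log2(o11 * n / (f1 * f2))

def t_score(o11, f1, f2, n):
    """t-score: (observed - expected) / sqrt(observed)."""
    expected = f1 * f2 / n
    return (o11 - expected) / math.sqrt(o11)

def ranks(values):
    """Ranks starting at 1 for the smallest value; ties get the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

def precision_at_k(scored, gold, k):
    """Share of the k best-scored bigrams that also occur in the gold set."""
    top = sorted(scored, key=scored.get, reverse=True)[:k]
    return sum(1 for bigram in top if bigram in gold) / k

# Hypothetical toy data: (bigram freq, adjective freq, noun freq).
N = 1_000_000
counts = {
    "pair_a": (50, 200, 300),
    "pair_b": (500, 20000, 30000),
    "pair_c": (5, 10, 20),
    "pair_d": (100, 5000, 900),
}
gold = {"pair_a", "pair_c"}  # stands in for dictionary collocations

pmi_scores = {b: pmi(*c, N) for b, c in counts.items()}
t_scores = {b: t_score(*c, N) for b, c in counts.items()}

bigrams = list(counts)
rho = spearman([pmi_scores[b] for b in bigrams],
               [t_scores[b] for b in bigrams])
print("Spearman rho between PMI and t-score:", round(rho, 3))
print("PMI precision@2:", precision_at_k(pmi_scores, gold, 2))
print("t-score precision@2:", precision_at_k(t_scores, gold, 2))
```

A rho close to 1 would suggest that two measures rank the candidates almost identically and are roughly interchangeable, while precision against the gold set gives a simple way to compare their usefulness.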

Original language: English
Title of host publication: Proceedings of the Tenth Workshop on Recent Advances in Slavonic Natural Languages Processing
Editors: Pavel Rychly, Adam Rambousek, Ales Horak
Publisher: Tribun EU
Pages: 21-27
Number of pages: 7
Volume: 2018-December
ISBN (electronic): 9788026315179
Publication status: Published - 1 Dec 2018
Event: 12th Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2018 - Karlova Studanka, Czech Republic
Duration: 7 Dec 2018 to 9 Dec 2018

Publication series

Name: Recent Advances in Slavonic Natural Language Processing
ISSN (print): 2336-4289

Conference

Conference: 12th Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2018
Country: Czech Republic
City: Karlova Studanka
Period: 7/12/18 to 9/12/18

Fingerprint

Glossaries
Linguistics
Processing
Experiments

Scopus subject areas

  • Artificial Intelligence
  • Computational Theory and Mathematics
  • Information Systems
  • Software

Cite this

Khokhlova, M. (2018). Similarity between the association measures: A case study of noun phrases. In P. Rychly, A. Rambousek, & A. Horak (Eds.), Proceedings of the Tenth Workshop on Recent Advances in Slavonic Natural Languages Processing (Vol. 2018-December, pp. 21-27). (Recent Advances in Slavonic Natural Language Processing). Tribun EU.
Khokhlova, Maria. / Similarity between the association measures : A case study of noun phrases. Proceedings of the Tenth Workshop on Recent Advances in Slavonic Natural Languages Processing. editor / Pavel Rychly ; Adam Rambousek ; Ales Horak. Vol. 2018-December Tribun EU, 2018. pp. 21-27 (Recent Advances in Slavonic Natural Language Processing).
@inproceedings{94e4d2cbeba04344832923dffe8c7dc4,
title = "Similarity between the association measures: A case study of noun phrases",
abstract = "Collocation extraction has gained much attention in natural language processing, and its results are important in various areas of applied linguistics. The research compares more than a dozen association measures on a subset of the Russian Web corpus. The paper studies automatically extracted Adj-Noun collocations. The aim of the experiments is twofold: first, to examine the differences between the statistical measures, and second, to find the most effective one for the Russian data. The former involves calculating Spearman’s rank correlation coefficient; the latter involves evaluating the extracted lists against a Russian dictionary, i.e. identifying collocations that are both automatically extracted and manually collected. The results are not straightforward, but one can distinguish groups of measures that are relatively interchangeable. In addition, the produced bigrams can be considered collocations by experts and thus may enrich dictionaries.",
keywords = "Collocability, Collocations, Corpora, Gold standard, Statistical measures, Statistics",
author = "Maria Khokhlova",
year = "2018",
month = "12",
day = "1",
language = "English",
volume = "2018-December",
series = "Recent Advances in Slavonic Natural Language Processing",
publisher = "Tribun EU",
pages = "21--27",
editor = "Pavel Rychly and Adam Rambousek and Ales Horak",
booktitle = "Proceedings of the Tenth Workshop on Recent Advances in Slavonic Natural Languages Processing",
address = "Czech Republic",

}

Khokhlova, M 2018, Similarity between the association measures: A case study of noun phrases. in P Rychly, A Rambousek & A Horak (eds), Proceedings of the Tenth Workshop on Recent Advances in Slavonic Natural Languages Processing. vol. 2018-December, Recent Advances in Slavonic Natural Language Processing, Tribun EU, pp. 21-27, Karlova Studanka, Czech Republic, 7/12/18.

TY - GEN

T1 - Similarity between the association measures

T2 - A case study of noun phrases

AU - Khokhlova, Maria

PY - 2018/12/1

Y1 - 2018/12/1

N2 - Collocation extraction has gained much attention in natural language processing, and its results are important in various areas of applied linguistics. The research compares more than a dozen association measures on a subset of the Russian Web corpus. The paper studies automatically extracted Adj-Noun collocations. The aim of the experiments is twofold: first, to examine the differences between the statistical measures, and second, to find the most effective one for the Russian data. The former involves calculating Spearman’s rank correlation coefficient; the latter involves evaluating the extracted lists against a Russian dictionary, i.e. identifying collocations that are both automatically extracted and manually collected. The results are not straightforward, but one can distinguish groups of measures that are relatively interchangeable. In addition, the produced bigrams can be considered collocations by experts and thus may enrich dictionaries.

KW - Collocability

KW - Collocations

KW - Corpora

KW - Gold standard

KW - Statistical measures

KW - Statistics

UR - http://www.scopus.com/inward/record.url?scp=85062198775&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85062198775

VL - 2018-December

T3 - Recent Advances in Slavonic Natural Language Processing

SP - 21

EP - 27

BT - Proceedings of the Tenth Workshop on Recent Advances in Slavonic Natural Languages Processing

A2 - Rychly, Pavel

A2 - Rambousek, Adam

A2 - Horak, Ales

PB - Tribun EU

ER -

Khokhlova M. Similarity between the association measures: A case study of noun phrases. In Rychly P, Rambousek A, Horak A, editors, Proceedings of the Tenth Workshop on Recent Advances in Slavonic Natural Languages Processing. Vol. 2018-December. Tribun EU. 2018. p. 21-27. (Recent Advances in Slavonic Natural Language Processing).