Research output: Publications in books, reports, collections, conference proceedings › Article in conference proceedings › Peer-reviewed
Evaluation and Combining Association Measures for Collocation Extraction. / Захаров, Виктор Павлович.
Proceedings of the International Conference IMS-2017 (St. Petersburg; Russian Federation, 21-24 June 2017). 2017. pp. 125-134 (ACM INTERNATIONAL CONFERENCE PROCEEDINGS SERIES).
TY - GEN
T1 - Evaluation and Combining Association Measures for Collocation Extraction
AU - Захаров, Виктор Павлович
N1 - Conference code: XX
PY - 2017
Y1 - 2017
N2 - The paper deals with collocation extraction from corpus data. It describes experiments whose objective was to study collocation extraction based on statistical association measures. A large number of formulas have been created to integrate the different factors that determine the association between collocation components. The paper focuses on bigram collocations. The precision data obtained for the measures suggest that, when collocation extraction is not used for some special purpose, measures such as MI.log_f, log-Dice, and minimum sensitivity should be used. At the same time, various ways of combining them are desirable and useful. To exploit the advantages of the individual measures, we propose creating a combined list of collocations extracted by different measures, together with a number of parameters that allow collocates in the combined list to be ranked in a reasonable way.
AB - The paper deals with collocation extraction from corpus data. It describes experiments whose objective was to study collocation extraction based on statistical association measures. A large number of formulas have been created to integrate the different factors that determine the association between collocation components. The paper focuses on bigram collocations. The precision data obtained for the measures suggest that, when collocation extraction is not used for some special purpose, measures such as MI.log_f, log-Dice, and minimum sensitivity should be used. At the same time, various ways of combining them are desirable and useful. To exploit the advantages of the individual measures, we propose creating a combined list of collocations extracted by different measures, together with a number of parameters that allow collocates in the combined list to be ranked in a reasonable way.
KW - metrics
KW - collocation extraction
KW - association measures
KW - evaluation
KW - correlation
KW - ranking
U2 - 10.1145/3143699.3143717
DO - 10.1145/3143699.3143717
M3 - Conference contribution
SN - 978-1-4503-5437-0
T3 - ACM INTERNATIONAL CONFERENCE PROCEEDINGS SERIES
SP - 125
EP - 134
BT - Proceedings of the International Conference IMS-2017 (St. Petersburg; Russian Federation, 21-24 June 2017)
T2 - 2017 International Conference on Internet and Modern Society, IMS 2017
Y2 - 21 June 2017 through 23 June 2017
ER -
ID: 34962509
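
The abstract names three association measures (MI.log_f, log-Dice, minimum sensitivity) and a combined list ranked across measures, but this record does not give the formulas or the ranking parameters. The sketch below is only an illustration: the formulas are the definitions commonly documented for corpus tools such as Sketch Engine (an assumption, not taken from the paper), the mean-rank combination is a hypothetical stand-in for the paper's own parameters, and all function names and counts are invented for the example.

```python
import math

# f_xy: bigram frequency, f_x / f_y: frequencies of the two words,
# n: corpus size in tokens. All definitions below are assumed, not from the paper.

def mi_score(f_xy, f_x, f_y, n):
    """Pointwise mutual information (base 2)."""
    return math.log2(f_xy * n / (f_x * f_y))

def mi_log_f(f_xy, f_x, f_y, n):
    """MI.log_f: MI weighted by the natural log of (bigram frequency + 1)."""
    return mi_score(f_xy, f_x, f_y, n) * math.log(f_xy + 1)

def log_dice(f_xy, f_x, f_y):
    """logDice: 14 + log2 of the Dice coefficient."""
    return 14 + math.log2(2 * f_xy / (f_x + f_y))

def min_sensitivity(f_xy, f_x, f_y):
    """Minimum sensitivity: the smaller of the two conditional proportions."""
    return min(f_xy / f_x, f_xy / f_y)

def combined_ranking(counts, n):
    """Order bigrams by mean rank across the three measures.

    counts maps a bigram to (f_xy, f_x, f_y). Mean rank is a hypothetical
    combination rule used here only to illustrate a merged list.
    """
    measures = [
        lambda c: mi_log_f(*c, n),
        lambda c: log_dice(*c),
        lambda c: min_sensitivity(*c),
    ]
    rank_sums = {bigram: 0 for bigram in counts}
    for score in measures:
        ordered = sorted(counts, key=lambda b: score(counts[b]), reverse=True)
        for rank, bigram in enumerate(ordered, start=1):
            rank_sums[bigram] += rank
    return sorted(counts, key=lambda b: rank_sums[b] / len(measures))

# Made-up toy counts, for illustration only.
toy_counts = {"strong tea": (50, 60, 70), "the of": (5, 500, 800)}
ranking = combined_ranking(toy_counts, n=100_000)
```

Mean rank is used here because it needs no rescaling of scores that live on different scales; a different combination rule could be substituted without changing the measure functions.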