Evaluation and Combining Association Measures for Collocation Extraction

Research output

Abstract

The paper deals with collocation extraction from corpus data and describes experiments that study collocation extraction based on statistical association measures. A large number of formulas have been created to integrate the different factors that determine the association between collocation components. The paper focuses on bigram collocations. The precision data obtained for the measures suggest that, when collocation extraction is not intended for a special purpose, measures such as MI.log_f, log-Dice, and minimum sensitivity should be used. At the same time, various ways of combining them are desirable and useful. To exploit the advantages of the individual measures, we propose creating a combined list of collocations extracted by different measures, along with a number of parameters that allow the collocates in the combined list to be ranked in a reasonable way.
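As a rough illustration of the measures named in the abstract, the sketch below computes MI.log_f, log-Dice, and minimum sensitivity for bigrams from raw frequencies and then merges the per-measure top lists. The formulas follow their commonly documented forms (the natural logarithm in MI.log_f is an assumption), the frequency counts are invented toy values, and the combined ranking by number of votes and mean rank is a hypothetical stand-in: the abstract does not say which ranking parameters the paper actually proposes.

```python
import math


def mi_score(f_xy, f_x, f_y, n):
    """Pointwise mutual information (base 2) for a bigram in a corpus of n tokens."""
    return math.log2(f_xy * n / (f_x * f_y))


def mi_log_f(f_xy, f_x, f_y, n):
    """MI weighted by the log of the co-occurrence frequency (natural log assumed)."""
    return mi_score(f_xy, f_x, f_y, n) * math.log(f_xy + 1)


def log_dice(f_xy, f_x, f_y):
    """log-Dice: 14 plus the base-2 log of the Dice coefficient."""
    return 14 + math.log2(2 * f_xy / (f_x + f_y))


def min_sensitivity(f_xy, f_x, f_y):
    """Minimum sensitivity: the smaller of the two conditional relative frequencies."""
    return min(f_xy / f_x, f_xy / f_y)


def combined_ranking(candidates, n, top_k=2):
    """Merge the top_k lists of each measure (hypothetical scheme, not the paper's).

    candidates maps a bigram to its counts (f_xy, f_x, f_y).
    Returns rows (bigram, votes, mean_rank), sorted by most votes, then best mean rank.
    """
    measures = {
        "MI.log_f": lambda f: mi_log_f(f[0], f[1], f[2], n),
        "logDice": lambda f: log_dice(*f),
        "min_sens": lambda f: min_sensitivity(*f),
    }
    ranks = {}  # bigram -> ranks it received in the lists that include it
    for score in measures.values():
        ordered = sorted(candidates, key=lambda b: score(candidates[b]), reverse=True)
        for rank, bigram in enumerate(ordered[:top_k], start=1):
            ranks.setdefault(bigram, []).append(rank)
    rows = [(b, len(r), sum(r) / len(r)) for b, r in ranks.items()]
    rows.sort(key=lambda row: (-row[1], row[2]))
    return rows


# Toy counts (f_xy, f_x, f_y), invented purely for illustration.
toy_counts = {
    ("strong", "tea"): (30, 100, 60),
    ("of", "the"): (500, 30000, 60000),
    ("red", "herring"): (8, 400, 10),
}
combined = combined_ranking(toy_counts, n=1_000_000, top_k=2)
for bigram, votes, mean_rank in combined:
    print(" ".join(bigram), votes, mean_rank)
```

Counting votes before averaging ranks favours candidates that several measures agree on, which is one simple way to use the strengths of the separate measures; other weightings would be equally defensible.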
Original language: English
Title of host publication: Proceedings of the International Conference IMS-2017 (St. Petersburg; Russian Federation, 21-24 June 2017)
Pages: 125-134
Number of pages: 10
DOIs: https://doi.org/10.1145/3143699.3143717
Publication status: Published - 2017

Publication series

Name: ACM INTERNATIONAL CONFERENCE PROCEEDINGS SERIES

Cite this

Захаров, В. П. (2017). Evaluation and Combining Association Measures for Collocation Extraction. In Proceedings of the International Conference IMS-2017 (St. Petersburg; Russian Federation, 21-24 June 2017) (pp. 125-134). (ACM INTERNATIONAL CONFERENCE PROCEEDINGS SERIES). https://doi.org/10.1145/3143699.3143717
Захаров, Виктор Павлович. / Evaluation and Combining Association Measures for Collocation Extraction. Proceedings of the International Conference IMS-2017 (St. Petersburg; Russian Federation, 21-24 June 2017). 2017. pp. 125-134 (ACM INTERNATIONAL CONFERENCE PROCEEDINGS SERIES).
@inproceedings{5c6d166f8950476b99fc92b5514a39e1,
title = "Evaluation and Combining Association Measures for Collocation Extraction",
abstract = "The paper deals with collocation extraction from corpus data and describes experiments that study collocation extraction based on statistical association measures. A large number of formulas have been created to integrate the different factors that determine the association between collocation components. The paper focuses on bigram collocations. The precision data obtained for the measures suggest that, when collocation extraction is not intended for a special purpose, measures such as MI.log_f, log-Dice, and minimum sensitivity should be used. At the same time, various ways of combining them are desirable and useful. To exploit the advantages of the individual measures, we propose creating a combined list of collocations extracted by different measures, along with a number of parameters that allow the collocates in the combined list to be ranked in a reasonable way.",
keywords = "metrics, Collocation extraction, association measures, evaluation, correlation, ranking",
author = "Захаров, {Виктор Павлович}",
year = "2017",
doi = "10.1145/3143699.3143717",
language = "English",
isbn = "978-1-4503-5437-0",
series = "ACM INTERNATIONAL CONFERENCE PROCEEDINGS SERIES",
pages = "125--134",
booktitle = "Proceedings of the International Conference IMS-2017 (St. Petersburg; Russian Federation, 21-24 June 2017)",
}

Захаров, ВП 2017, Evaluation and Combining Association Measures for Collocation Extraction. in Proceedings of the International Conference IMS-2017 (St. Petersburg; Russian Federation, 21-24 June 2017). ACM INTERNATIONAL CONFERENCE PROCEEDINGS SERIES, pp. 125-134. https://doi.org/10.1145/3143699.3143717

Evaluation and Combining Association Measures for Collocation Extraction. / Захаров, Виктор Павлович.

Proceedings of the International Conference IMS-2017 (St. Petersburg; Russian Federation, 21-24 June 2017). 2017. p. 125-134 (ACM INTERNATIONAL CONFERENCE PROCEEDINGS SERIES).

TY - GEN
T1 - Evaluation and Combining Association Measures for Collocation Extraction
AU - Захаров, Виктор Павлович
PY - 2017
Y1 - 2017
AB - The paper deals with collocation extraction from corpus data and describes experiments that study collocation extraction based on statistical association measures. A large number of formulas have been created to integrate the different factors that determine the association between collocation components. The paper focuses on bigram collocations. The precision data obtained for the measures suggest that, when collocation extraction is not intended for a special purpose, measures such as MI.log_f, log-Dice, and minimum sensitivity should be used. At the same time, various ways of combining them are desirable and useful. To exploit the advantages of the individual measures, we propose creating a combined list of collocations extracted by different measures, along with a number of parameters that allow the collocates in the combined list to be ranked in a reasonable way.
KW - metrics
KW - Collocation extraction
KW - association measures
KW - evaluation
KW - correlation
KW - ranking
U2 - 10.1145/3143699.3143717
DO - 10.1145/3143699.3143717
M3 - Conference contribution
SN - 978-1-4503-5437-0
T3 - ACM INTERNATIONAL CONFERENCE PROCEEDINGS SERIES
SP - 125
EP - 134
BT - Proceedings of the International Conference IMS-2017 (St. Petersburg; Russian Federation, 21-24 June 2017)
ER -

Захаров ВП. Evaluation and Combining Association Measures for Collocation Extraction. In Proceedings of the International Conference IMS-2017 (St. Petersburg; Russian Federation, 21-24 June 2017). 2017. p. 125-134. (ACM INTERNATIONAL CONFERENCE PROCEEDINGS SERIES). https://doi.org/10.1145/3143699.3143717