Problems of Disambiguation of Prepositional Phrases

Standard

Problems of Disambiguation of Prepositional Phrases. / Boyarsky, Kirill; Kanevsky, Eugeny; Kozlova, Anastasia.

24th International Conference “Internet and Modern Society” June 24-26, 2021, ITMO University, St. Petersburg: Proceedings. 2021. p. 98-110 (CEUR Workshop proceedings).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Harvard

Boyarsky, K, Kanevsky, E & Kozlova, A 2021, Problems of Disambiguation of Prepositional Phrases. in 24th International Conference “Internet and Modern Society” June 24-26, 2021, ITMO University, St. Petersburg: Proceedings. CEUR Workshop proceedings, pp. 98-110, Internet and Modern Society, St. Petersburg, Russian Federation, 24/06/21.

APA

Boyarsky, K., Kanevsky, E., & Kozlova, A. (2021). Problems of Disambiguation of Prepositional Phrases. In 24th International Conference “Internet and Modern Society” June 24-26, 2021, ITMO University, St. Petersburg: Proceedings (pp. 98-110). (CEUR Workshop proceedings).

Vancouver

Boyarsky K, Kanevsky E, Kozlova A. Problems of Disambiguation of Prepositional Phrases. In 24th International Conference “Internet and Modern Society” June 24-26, 2021, ITMO University, St. Petersburg: Proceedings. 2021. p. 98-110. (CEUR Workshop proceedings).

Author

Boyarsky, Kirill; Kanevsky, Eugeny; Kozlova, Anastasia. / Problems of Disambiguation of Prepositional Phrases. 24th International Conference “Internet and Modern Society” June 24-26, 2021, ITMO University, St. Petersburg: Proceedings. 2021. pp. 98-110 (CEUR Workshop proceedings).

BibTeX

@inproceedings{f4ccdb93c51e49fa98e6be28ea46161b,

title = "Problems of Disambiguation of Prepositional Phrases",

abstract = "This paper describes the features that appear in parsing procession of multiword turns (phrasemes) able to act as prepositions. These features are considered in the context of automatic analysis of Russian texts. Such phrases have a fairly high homonymy, which creates some difficulties in analysis and defining semantics and, consequently, reduces the accuracy of parsing. More than 320 phrasemes have been classified on the basis of the assumed homonymy types. In the course of the study, the phrasemes have been divided into three groups. The first group includes those phrasemes that can definitely be called prepositions, but potentially have some semantic ambiguity. The second group combines phrasemes that are characterized by the part-of-speech homonymy of preposition/adverb. The third group is characterized by phrasemes that determine the construction of two or three parsing options. The occurrence of multivariate parsing is based on the presence of one or two phrases related to different parts of speech, and a simple conjunction of a preposition with a noun. Within each group, lists of the most common phrasemes have been composed (according to the NCRL), indicating the probability that a certain phraseme may serve as a preposition. The paper also defines the basis on which the compilation of effectively removing homonymy rules for the SemSin parser may rely on. The examples provided in this paper prove that it is necessary to consider not only the direct encirclement of the phraseme, but also its remote context to remove homonymy.",

keywords = "automatic text analysis, disambiguation, homonymy, idiomaticity, prepositional phrases",

author = "Kirill Boyarsky and Eugeny Kanevsky and Anastasia Kozlova",

year = "2021",

month = dec,

day = "13",

language = "English",

series = "CEUR Workshop proceedings",

pages = "98--110",

booktitle = "24th International Conference “Internet and Modern Society” June 24-26, 2021, ITMO University, St. Petersburg",

note = "Internet and Modern Society, IMS-2021; Conference date: 24-06-2021 through 26-06-2021",

url = "http://ims.ifmo.ru/, http://ims.ifmo.ru/ru/pages/2/programma.htm",

}

RIS

TY - GEN

T1 - Problems of Disambiguation of Prepositional Phrases

AU - Boyarsky, Kirill

AU - Kanevsky, Eugeny

AU - Kozlova, Anastasia

N1 - Conference code: 24

PY - 2021/12/13

Y1 - 2021/12/13

N2 - This paper describes the features that appear in parsing procession of multiword turns (phrasemes) able to act as prepositions. These features are considered in the context of automatic analysis of Russian texts. Such phrases have a fairly high homonymy, which creates some difficulties in analysis and defining semantics and, consequently, reduces the accuracy of parsing. More than 320 phrasemes have been classified on the basis of the assumed homonymy types. In the course of the study, the phrasemes have been divided into three groups. The first group includes those phrasemes that can definitely be called prepositions, but potentially have some semantic ambiguity. The second group combines phrasemes that are characterized by the part-of-speech homonymy of preposition/adverb. The third group is characterized by phrasemes that determine the construction of two or three parsing options. The occurrence of multivariate parsing is based on the presence of one or two phrases related to different parts of speech, and a simple conjunction of a preposition with a noun. Within each group, lists of the most common phrasemes have been composed (according to the NCRL), indicating the probability that a certain phraseme may serve as a preposition. The paper also defines the basis on which the compilation of effectively removing homonymy rules for the SemSin parser may rely on. The examples provided in this paper prove that it is necessary to consider not only the direct encirclement of the phraseme, but also its remote context to remove homonymy.

AB - This paper describes the features that appear in parsing procession of multiword turns (phrasemes) able to act as prepositions. These features are considered in the context of automatic analysis of Russian texts. Such phrases have a fairly high homonymy, which creates some difficulties in analysis and defining semantics and, consequently, reduces the accuracy of parsing. More than 320 phrasemes have been classified on the basis of the assumed homonymy types. In the course of the study, the phrasemes have been divided into three groups. The first group includes those phrasemes that can definitely be called prepositions, but potentially have some semantic ambiguity. The second group combines phrasemes that are characterized by the part-of-speech homonymy of preposition/adverb. The third group is characterized by phrasemes that determine the construction of two or three parsing options. The occurrence of multivariate parsing is based on the presence of one or two phrases related to different parts of speech, and a simple conjunction of a preposition with a noun. Within each group, lists of the most common phrasemes have been composed (according to the NCRL), indicating the probability that a certain phraseme may serve as a preposition. The paper also defines the basis on which the compilation of effectively removing homonymy rules for the SemSin parser may rely on. The examples provided in this paper prove that it is necessary to consider not only the direct encirclement of the phraseme, but also its remote context to remove homonymy.

KW - automatic text analysis

KW - disambiguation

KW - homonymy

KW - idiomaticity

KW - prepositional phrases

UR - http://ceur-ws.org/Vol-3090/

M3 - Conference contribution

T3 - CEUR Workshop proceedings

SP - 98

EP - 110

BT - 24th International Conference “Internet and Modern Society” June 24-26, 2021, ITMO University, St. Petersburg

T2 - Internet and Modern Society

Y2 - 24 June 2021 through 26 June 2021

ER -

ID: 84815030