Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review
Multiword Units in Russian Everyday Speech: Empirical Classification and Corpus-Based Studies. / Богданова-Бегларян, Наталья Викторовна; Шерстинова, Татьяна Юрьевна; Блинова, Ольга Владимировна; Хохлова, Мария Владимировна; Попова, Татьяна Ивановна.
Speech and Computer: 26th International Conference, SPECOM 2024. 2024. p. 187-200 (LNAI ; Vol. 15299).
TY - GEN
T1 - Multiword Units in Russian Everyday Speech: Empirical Classification and Corpus-Based Studies
AU - Богданова-Бегларян, Наталья Викторовна
AU - Шерстинова, Татьяна Юрьевна
AU - Блинова, Ольга Владимировна
AU - Хохлова, Мария Владимировна
AU - Попова, Татьяна Ивановна
N1 - Conference code: 26
PY - 2024/11/21
Y1 - 2024/11/21
AB - The article is dedicated to the results of a research project describing the classes and functioning of multiword units in contemporary Russian everyday speech. The concept of multiword units encompasses quite diverse linguistic phenomena, making the creation of a working typology one of the project's central tasks. This typology is necessary for annotating corpus material and obtaining statistical characteristics. The identified classes of multiword units include the following units: 1) non-phraseologized collocations, 2) phraseologized collocations, 3) occasional collocations, 4) idiom forms, 5) constructions, 6) precedent texts and their elements, 7) multiword pragmatic markers, and 8) speech formulas. The article describes the methods for annotating these units using the ORD corpus of everyday spoken Russian and presents the results of a quantitative analysis of their functioning within the annotated subcorpus. The obtained data can be used to address both theoretical tasks in the field of lexical and grammatical description of Russian everyday speech and numerous tasks related to processing or generating live spoken Russian.
KW - modern Russian, everyday speech, oral discourse, multiword units, collocations, syntax, statistical analysis, speech corpus, corpus linguistics, speech technologies
M3 - Conference contribution
T3 - LNAI
SP - 187
EP - 200
BT - Speech and Computer: 26th International Conference, SPECOM 2024
T2 - 26th International Conference on Speech and Computer
Y2 - 25 November 2024 through 28 November 2024
ER -
ID: 127634221