This article presents the results of a research project describing the classes and functioning of multiword units in contemporary Russian everyday speech. The concept of multiword units encompasses quite diverse linguistic phenomena, which makes creating a working typology one of the project's central tasks. This typology is necessary for annotating corpus material and obtaining statistical characteristics. The identified classes of multiword units are: 1) non-phraseologized collocations, 2) phraseologized collocations, 3) occasional collocations, 4) idiom forms, 5) constructions, 6) precedent texts and their elements, 7) multiword pragmatic markers, and 8) speech formulas. The article describes the methods for annotating these units using the ORD corpus of everyday spoken Russian and presents the results of a quantitative analysis of their functioning within the annotated subcorpus. The obtained data can be used to address both theoretical tasks in the lexical and grammatical description of Russian everyday speech and numerous tasks related to processing or generating live spoken Russian.
Translated title of the contribution: Неоднословные единицы в русской повседневной речи: эмпирическая классификация и корпусные исследования
Original language: English
Title of host publication: Speech and Computer: 26th International Conference, SPECOM 2024
Pages: 187-200
Number of pages: 14
State: Published - 21 Nov 2024
Event: 26th International Conference on Speech and Computer - University of Novi Sad, Belgrade, Serbia
Duration: 25 Nov 2024 to 28 Nov 2024
Conference number: 26
https://specom.nw.ru/2024/
https://specom2024.ftn.uns.ac.rs

Publication series

Name: LNAI
Volume: 15299

Conference

Conference: 26th International Conference on Speech and Computer
Abbreviated title: SPECOM 2024
Country/Territory: Serbia
City: Belgrade
Period: 25/11/24 to 28/11/24

Scopus subject areas

  • Arts and Humanities (all)

ID: 127634221