
This article presents the results of a research project on the classes and functioning of multiword units in contemporary Russian everyday speech. The concept of multiword units covers quite diverse linguistic phenomena, so building a working typology was one of the project's central tasks. This typology is needed for annotating corpus material and obtaining statistical characteristics. The identified classes of multiword units are: 1) non-phraseologized collocations, 2) phraseologized collocations, 3) occasional collocations, 4) idiom forms, 5) constructions, 6) precedent texts and their elements, 7) multiword pragmatic markers, and 8) speech formulas. The article describes the methods for annotating these units in the ORD corpus of everyday spoken Russian and presents a quantitative analysis of their functioning within the annotated subcorpus. The resulting data can be used both for theoretical work on the lexical and grammatical description of Russian everyday speech and for numerous tasks related to processing or generating live spoken Russian.
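As a rough illustration of how the eight-class typology could drive a quantitative analysis, the sketch below encodes the classes as a Python enumeration and computes per-class counts and shares. The names `MWUClass` and `class_distribution` and the toy label list are hypothetical; they do not reproduce the ORD annotation scheme or any figures from the article.

```python
from collections import Counter
from enum import Enum


class MWUClass(Enum):
    """The eight classes of multiword units listed in the abstract."""
    NON_PHRASEOLOGIZED_COLLOCATION = 1
    PHRASEOLOGIZED_COLLOCATION = 2
    OCCASIONAL_COLLOCATION = 3
    IDIOM_FORM = 4
    CONSTRUCTION = 5
    PRECEDENT_TEXT = 6
    PRAGMATIC_MARKER = 7
    SPEECH_FORMULA = 8


def class_distribution(labels):
    """Return {class: (count, share)} for an iterable of MWUClass labels.

    Classes that never occur are reported with a count of 0 and a share of 0.0.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    return {
        cls: (counts[cls], counts[cls] / total if total else 0.0)
        for cls in MWUClass
    }


# Toy labels, invented purely for illustration (not data from the article).
toy_labels = [
    MWUClass.SPEECH_FORMULA,
    MWUClass.SPEECH_FORMULA,
    MWUClass.CONSTRUCTION,
    MWUClass.PRAGMATIC_MARKER,
]

for cls, (count, share) in class_distribution(toy_labels).items():
    print(f"{cls.name}: {count} ({share:.0%})")
```

Using an enumeration keeps the label set closed, so a misspelled class name fails immediately instead of silently creating a ninth category.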
Translated title: Неоднословные единицы в русской повседневной речи: эмпирическая классификация и корпусные исследования (Multiword Units in Russian Everyday Speech: Empirical Classification and Corpus Studies)
Original language: English
Title of host publication: Speech and Computer: 26th International Conference, SPECOM 2024
Status: Published - 21 Nov 2024
Event: XXVIth International Conference “Speech and Computer”: Specom 2024 - University of Novi Sad, Belgrade, Serbia
Duration: 25 Nov 2024 to 28 Nov 2024
Conference: XXVIth International Conference “Speech and Computer”
Abbreviated title: SPECOM-2024
  • modern Russian, everyday speech, oral discourse, multiword units, collocations, syntax, statistical analysis, speech corpus, corpus linguistics, speech technologies

ID: 127634221