The paper presents new research findings on the use of pragmatic markers (PMs) in spoken Russian. The study is based on two speech corpora: “One Day of Speech” (ORD, which contains mainly dialogues) and “Balanced Annotated Collection of Texts” (SAT, which contains only monologues). We explored two annotated subcorpora consisting of 321,504 tokens and 50,128 tokens, respectively. The main results are as follows: 1) extended frequency lists of PMs were compiled; 2) PMs that are frequently used in both types of speech were identified (e.g., hesitation markers such as tam ‘there’, tak ‘that way’); 3) a list of PMs used primarily in monologue speech was compiled (it includes boundary markers such as znachit ‘well’, nu vot ‘well er’, vs’o ‘that’s all’); 4) a list of PMs used primarily in dialogues was compiled (it includes, for example, “xeno”-markers such as takoj ‘like’, grit ‘says’ and meta-communicative markers such as vidish’ ‘you know’, (ja) ne znaju ‘don’t know’). Particular attention was paid to the variability of pragmatic markers, as well as to complex cases of their identification. Finally, the most common models of pragmatic marker formation (for single-word and multi-word PMs) were identified.

Original language: English
Title of host publication: Speech and Computer
Subtitle of host publication: 22nd International Conference, SPECOM 2020, St. Petersburg, Russia, October 7–9, 2020, Proceedings
Editors: Alexey Karpov, Rodmonga Potapova
Place of publication: Cham
Publisher: Springer Nature
Pages: 68-78
Number of pages: 11
ISBN (electronic): 978-3-030-60276-5
ISBN (print): 978-3-030-60275-8
Publication status: Published - 2020
Event: 22nd International Conference on Speech and Computer - St. Petersburg, Russia => Online, St. Petersburg, Russian Federation
Duration: 7 Oct 2020 → 9 Oct 2020
http://specom.nw.ru/2020/program/SPECOM-ICR2020-Conference-Program-06102020.pdf

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12335 LNAI
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: 22nd International Conference on Speech and Computer
Abbreviated title: SPECOM and ICR 2020
Country/Territory: Russian Federation
City: St. Petersburg
Period: 7/10/20 → 9/10/20

    Research areas

  • Russian Everyday Speech, Speech Corpus, Pragmatic Marker, Corpus Annotation, Monologue, Dialogue

    Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

ID: 73276083