The paper presents new research findings on the use of pragmatic markers (PMs) in spoken Russian. The study is based on two speech corpora: “One Day of Speech” (ORD, which contains mainly dialogues) and “Balanced Annotated Collection of Texts” (SAT, which contains only monologues). We explored two annotated subcorpora consisting of 321,504 tokens and 50,128 tokens, respectively. The main results are as follows: 1) extended frequency lists of PMs were compiled; 2) PMs that are frequently used in both types of speech were identified (e.g., hesitation markers such as tam ‘there’ and tak ‘that way’); 3) a list of PMs used primarily in monologue speech was compiled (it includes boundary markers such as znachit ‘well’, nu vot ‘well er’, vs’o ‘that’s all’); 4) a list of PMs used primarily in dialogues was compiled (it includes, for example, the “xeno”-markers takoj ‘like’ and grit ‘says’ and meta-communicative markers such as vidish’ ‘you know’ and (ja) ne znaju ‘don’t know’). Particular attention was paid to the variability of pragmatic markers, as well as to complex cases of their identification. Finally, the most common models of pragmatic marker formation (for single-word and multi-word PMs) were identified.

Original language: English
Title of host publication: Speech and Computer
Subtitle of host publication: 22nd International Conference, SPECOM 2020, St. Petersburg, Russia, October 7–9, 2020, Proceedings
Editors: Alexey Karpov, Rodmonga Potapova
Place of Publication: Cham
Publisher: Springer Nature
Pages: 68-78
Number of pages: 11
ISBN (Electronic): 978-3-030-60276-5
ISBN (Print): 978-3-030-60275-8
State: Published - 2020
Event: 22nd International Conference on Speech and Computer - St. Petersburg, Russia => Online, St. Petersburg, Russian Federation
Duration: 7 Oct 2020 – 9 Oct 2020
http://specom.nw.ru/2020/program/SPECOM-ICR2020-Conference-Program-06102020.pdf

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12335 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 22nd International Conference on Speech and Computer
Abbreviated title: SPECOM 2020
Country/Territory: Russian Federation
City: St. Petersburg
Period: 7/10/20 – 9/10/20

    Research areas

  • Russian everyday speech, Speech corpus, Pragmatic marker, Corpus annotation, Monologue, Dialogue

    Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

ID: 73276083