Pragmatic Markers Distribution in Russian Everyday Speech: Frequency Lists and Other Statistics for Discourse Modeling

Research output: Publications in books, reports, collections, conference proceedings; article in conference proceedings; scientific; peer-reviewed

Abstract

Pragmatic markers (PMs) are discourse units (words and multiword expressions) with a weakened referential meaning, which perform a variety of pragmatic tasks. For example, in English the common PMs are “well”, “you know”, “I think”, and many others. PMs are integral elements of spoken discourse in every language. According to the results obtained from the ORD corpus of everyday Russian, their share can reach up to 6% of the total number of words in speech of individual speakers. More than that, in some speech fragments, PMs may even exceed the share of significant units (i.e., standard words). However, despite their frequency and usualness, PMs are still poorly understood. Current NLP and discourse modeling systems lack information on PMs distribution and usage, this fact leads to noticeable shortcomings in work of these systems when they face spontaneous speech of everyday spoken discourse. In this paper we present top frequency lists of PMs for Russian dialogue and monologue spoken speech in general, and also for separate sociological groups of informants (by gender and by age). Our current list of PMs for Russian contains 450 units which are the variants of 50 main structural types. Besides, we consider the most frequent functions of PMs in spoken Russian. The presented quantitative data may be used for improvement of NLP and discourse modeling systems.
Original language: English
Title of host publication: Speech and Computer
Editors: Albert Ali Salah, Aleksey Karpov, Rodmonga Potapova
Publisher: Springer
Pages: 433-443
Volume: 11658
ISBN (electronic): 9783030260613
ISBN (print): 9783030260606
Publication status: Published - 15 Aug 2019
Event: 21st International Conference, SPECOM 2019 - Istanbul, Turkey
Duration: 20 Aug 2019 to 25 Aug 2019

Conference

Conference: 21st International Conference, SPECOM 2019
Country: Turkey
City: Istanbul
Period: 20/08/19 to 25/08/19

Cite this

Bogdanova-Beglarian, N., Sherstinova, T., Blinova, O., & Martynenko, G. (2019). Pragmatic Markers Distribution in Russian Everyday Speech: Frequency Lists and Other Statistics for Discourse Modeling. In A. Ali Salah, A. Karpov, & R. Potapova (Eds.), Speech and Computer (Vol. 11658, pp. 433-443). Springer.
Bogdanova-Beglarian, Natalia ; Sherstinova, Tatiana ; Blinova, Olga ; Martynenko, Gregory. / Pragmatic Markers Distribution in Russian Everyday Speech: Frequency Lists and Other Statistics for Discourse Modeling. Speech and Computer. editor / Albert Ali Salah ; Aleksey Karpov ; Rodmonga Potapova. Vol. 11658 Springer, 2019. pp. 433-443
@inproceedings{c6f2e64539234e12abec84e27eda55a3,
title = "Pragmatic Markers Distribution in Russian Everyday Speech: Frequency Lists and Other Statistics for Discourse Modeling",
abstract = "Pragmatic markers (PMs) are discourse units (words and multiword expressions) with a weakened referential meaning, which perform a variety of pragmatic tasks. For example, in English the common PMs are “well”, “you know”, “I think”, and many others. PMs are integral elements of spoken discourse in every language. According to the results obtained from the ORD corpus of everyday Russian, their share can reach up to 6{\%} of the total number of words in speech of individual speakers. More than that, in some speech fragments, PMs may even exceed the share of significant units (i.e., standard words). However, despite their frequency and usualness, PMs are still poorly understood. Current NLP and discourse modeling systems lack information on PMs distribution and usage, this fact leads to noticeable shortcomings in work of these systems when they face spontaneous speech of everyday spoken discourse. In this paper we present top frequency lists of PMs for Russian dialogue and monologue spoken speech in general, and also for separate sociological groups of informants (by gender and by age). Our current list of PMs for Russian contains 450 units which are the variants of 50 main structural types. Besides, we consider the most frequent functions of PMs in spoken Russian. The presented quantitative data may be used for improvement of NLP and discourse modeling systems.",
author = "Natalia Bogdanova-Beglarian and Tatiana Sherstinova and Olga Blinova and Gregory Martynenko",
note = "Bogdanova-Beglarian, N., Sherstinova, T., Blinova, O., Martynenko, G. Pragmatic Markers Distribution in Russian Everyday Speech: Frequency Lists and Other Statistics for Discourse Modeling // Speech and Computer. 21st International Conference, SPECOM 2019, Istanbul, Turkey, August 20–25, 2019, Proceedings / Ed. by A. Ali Salah, A. Karpov, R. Potapova. Lecture Notes in Computer Science book series (LNCS, vol. 11658). – Pp. 433-443.",
year = "2019",
month = "8",
day = "15",
language = "English",
isbn = "9783030260606",
volume = "11658",
pages = "433--443",
editor = "{Ali Salah}, Albert and Aleksey Karpov and Rodmonga Potapova",
booktitle = "Speech and Computer",
publisher = "Springer",
address = "Germany",

}

Bogdanova-Beglarian, N, Sherstinova, T, Blinova, O & Martynenko, G 2019, Pragmatic Markers Distribution in Russian Everyday Speech: Frequency Lists and Other Statistics for Discourse Modeling. in A Ali Salah, A Karpov & R Potapova (eds), Speech and Computer. vol. 11658, Springer, pp. 433-443, 21st International Conference, SPECOM 2019, Istanbul, Turkey, 20/08/19.

Pragmatic Markers Distribution in Russian Everyday Speech: Frequency Lists and Other Statistics for Discourse Modeling. / Bogdanova-Beglarian, Natalia; Sherstinova, Tatiana; Blinova, Olga; Martynenko, Gregory.

Speech and Computer. ed. / Albert Ali Salah; Aleksey Karpov; Rodmonga Potapova. Vol. 11658 Springer, 2019. pp. 433-443.

TY - GEN

T1 - Pragmatic Markers Distribution in Russian Everyday Speech: Frequency Lists and Other Statistics for Discourse Modeling

AU - Bogdanova-Beglarian, Natalia

AU - Sherstinova, Tatiana

AU - Blinova, Olga

AU - Martynenko, Gregory

N1 - Bogdanova-Beglarian, N., Sherstinova, T., Blinova, O., Martynenko, G. Pragmatic Markers Distribution in Russian Everyday Speech: Frequency Lists and Other Statistics for Discourse Modeling // Speech and Computer. 21st International Conference, SPECOM 2019, Istanbul, Turkey, August 20–25, 2019, Proceedings / Ed. by A. Ali Salah, A. Karpov, R. Potapova. Lecture Notes in Computer Science book series (LNCS, vol. 11658). – Pp. 433-443.

PY - 2019/8/15

Y1 - 2019/8/15

N2 - Pragmatic markers (PMs) are discourse units (words and multiword expressions) with a weakened referential meaning, which perform a variety of pragmatic tasks. For example, in English the common PMs are “well”, “you know”, “I think”, and many others. PMs are integral elements of spoken discourse in every language. According to the results obtained from the ORD corpus of everyday Russian, their share can reach up to 6% of the total number of words in speech of individual speakers. More than that, in some speech fragments, PMs may even exceed the share of significant units (i.e., standard words). However, despite their frequency and usualness, PMs are still poorly understood. Current NLP and discourse modeling systems lack information on PMs distribution and usage, this fact leads to noticeable shortcomings in work of these systems when they face spontaneous speech of everyday spoken discourse. In this paper we present top frequency lists of PMs for Russian dialogue and monologue spoken speech in general, and also for separate sociological groups of informants (by gender and by age). Our current list of PMs for Russian contains 450 units which are the variants of 50 main structural types. Besides, we consider the most frequent functions of PMs in spoken Russian. The presented quantitative data may be used for improvement of NLP and discourse modeling systems.

AB - Pragmatic markers (PMs) are discourse units (words and multiword expressions) with a weakened referential meaning, which perform a variety of pragmatic tasks. For example, in English the common PMs are “well”, “you know”, “I think”, and many others. PMs are integral elements of spoken discourse in every language. According to the results obtained from the ORD corpus of everyday Russian, their share can reach up to 6% of the total number of words in speech of individual speakers. More than that, in some speech fragments, PMs may even exceed the share of significant units (i.e., standard words). However, despite their frequency and usualness, PMs are still poorly understood. Current NLP and discourse modeling systems lack information on PMs distribution and usage, this fact leads to noticeable shortcomings in work of these systems when they face spontaneous speech of everyday spoken discourse. In this paper we present top frequency lists of PMs for Russian dialogue and monologue spoken speech in general, and also for separate sociological groups of informants (by gender and by age). Our current list of PMs for Russian contains 450 units which are the variants of 50 main structural types. Besides, we consider the most frequent functions of PMs in spoken Russian. The presented quantitative data may be used for improvement of NLP and discourse modeling systems.

UR - https://link.springer.com/chapter/10.1007/978-3-030-26061-3_44

M3 - Conference contribution

SN - 9783030260606

VL - 11658

SP - 433

EP - 443

BT - Speech and Computer

A2 - Ali Salah, Albert

A2 - Karpov, Aleksey

A2 - Potapova, Rodmonga

PB - Springer

ER -

Bogdanova-Beglarian N, Sherstinova T, Blinova O, Martynenko G. Pragmatic Markers Distribution in Russian Everyday Speech: Frequency Lists and Other Statistics for Discourse Modeling. In Ali Salah A, Karpov A, Potapova R, editors, Speech and Computer. Vol. 11658. Springer. 2019. p. 433-443