Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review
Pragmatic markers distribution in Russian everyday speech: Frequency lists and other statistics for discourse modeling. / Bogdanova-Beglarian, Natalia; Sherstinova, Tatiana; Blinova, Olga; Martynenko, Gregory.
Speech and Computer - 21st International Conference, SPECOM 2019, Proceedings. ed. / Albert Ali Salah; Alexey Karpov; Rodmonga Potapova. Vol. 11658 Springer Nature, 2019. p. 433-443 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11658 LNAI).
TY - GEN
T1 - Pragmatic markers distribution in Russian everyday speech
T2 - 21ST INTERNATIONAL CONFERENCE ON SPEECH AND COMPUTER
AU - Bogdanova-Beglarian, Natalia
AU - Sherstinova, Tatiana
AU - Blinova, Olga
AU - Martynenko, Gregory
N1 - Conference code: 21
PY - 2019/8/15
Y1 - 2019/8/15
N2 - Pragmatic markers (PMs) are discourse units (words and multiword expressions) with a weakened referential meaning, which perform a variety of pragmatic tasks. For example, common PMs in English include “well”, “you know”, “I think”, and many others. PMs are integral elements of spoken discourse in every language. According to results obtained from the ORD corpus of everyday Russian, their share can reach up to 6% of the total number of words in the speech of individual speakers. Moreover, in some speech fragments, PMs may even exceed the share of significant units (i.e., standard words). However, despite their frequency and ubiquity, PMs are still poorly understood. Current NLP and discourse modeling systems lack information on PM distribution and usage, which leads to noticeable shortcomings when these systems face the spontaneous speech of everyday spoken discourse. In this paper we present top frequency lists of PMs for Russian spoken dialogue and monologue in general, and also for separate sociological groups of informants (by gender and by age). Our current list of PMs for Russian contains 450 units, which are variants of 50 main structural types. We also consider the most frequent functions of PMs in spoken Russian. The presented quantitative data may be used to improve NLP and discourse modeling systems.
AB - Pragmatic markers (PMs) are discourse units (words and multiword expressions) with a weakened referential meaning, which perform a variety of pragmatic tasks. For example, common PMs in English include “well”, “you know”, “I think”, and many others. PMs are integral elements of spoken discourse in every language. According to results obtained from the ORD corpus of everyday Russian, their share can reach up to 6% of the total number of words in the speech of individual speakers. Moreover, in some speech fragments, PMs may even exceed the share of significant units (i.e., standard words). However, despite their frequency and ubiquity, PMs are still poorly understood. Current NLP and discourse modeling systems lack information on PM distribution and usage, which leads to noticeable shortcomings when these systems face the spontaneous speech of everyday spoken discourse. In this paper we present top frequency lists of PMs for Russian spoken dialogue and monologue in general, and also for separate sociological groups of informants (by gender and by age). Our current list of PMs for Russian contains 450 units, which are variants of 50 main structural types. We also consider the most frequent functions of PMs in spoken Russian. The presented quantitative data may be used to improve NLP and discourse modeling systems.
KW - Everyday discourse
KW - Frequency lists
KW - NLP
KW - Pragmatic markers
KW - Pragmatics
KW - Sociolinguistics
KW - Speech corpus
KW - Spoken dialogue
KW - Spoken monologue
KW - Spoken Russian
KW - Statistics
UR - http://www.scopus.com/inward/record.url?scp=85071481696&partnerID=8YFLogxK
UR - http://www.mendeley.com/research/pragmatic-markers-distribution-russian-everyday-speech-frequency-lists-other-statistics-discourse-mo
U2 - 10.1007/978-3-030-26061-3_44
DO - 10.1007/978-3-030-26061-3_44
M3 - Conference contribution
AN - SCOPUS:85071481696
SN - 9783030260606
VL - 11658
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 433
EP - 443
BT - Speech and Computer - 21st International Conference, SPECOM 2019, Proceedings
A2 - Salah, Albert Ali
A2 - Karpov, Alexey
A2 - Potapova, Rodmonga
PB - Springer Nature
Y2 - 20 August 2019 through 25 August 2019
ER -
ID: 45015234