Corpora of Russian Spontaneous Speech as a Tool for Modelling Natural Speech Production and Recognition

Standard

Corpora of Russian Spontaneous Speech as a Tool for Modelling Natural Speech Production and Recognition. / Riekhakaynen, Elena I.

10th Annual Computing and Communication Workshop and Conference (CCWC). Institute of Electrical and Electronics Engineers Inc., 2020. p. 0406-0411.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review

Harvard

Riekhakaynen, EI 2020, Corpora of Russian Spontaneous Speech as a Tool for Modelling Natural Speech Production and Recognition. in 10th Annual Computing and Communication Workshop and Conference (CCWC). Institute of Electrical and Electronics Engineers Inc., pp. 0406-0411, 10th Annual Computing and Communication Workshop and Conference, Las Vegas, Nevada, United States, 6 January 2020. https://doi.org/10.1109/CCWC47524.2020.9031251

APA

Riekhakaynen, E. I. (2020). Corpora of Russian Spontaneous Speech as a Tool for Modelling Natural Speech Production and Recognition. In 10th Annual Computing and Communication Workshop and Conference (CCWC) (pp. 0406-0411). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/CCWC47524.2020.9031251

Vancouver

Riekhakaynen EI. Corpora of Russian Spontaneous Speech as a Tool for Modelling Natural Speech Production and Recognition. In 10th Annual Computing and Communication Workshop and Conference (CCWC). Institute of Electrical and Electronics Engineers Inc. 2020. p. 0406-0411. https://doi.org/10.1109/CCWC47524.2020.9031251

Author

Riekhakaynen, Elena I. / Corpora of Russian Spontaneous Speech as a Tool for Modelling Natural Speech Production and Recognition. 10th Annual Computing and Communication Workshop and Conference (CCWC). Institute of Electrical and Electronics Engineers Inc., 2020. pp. 0406-0411

BibTeX

@inproceedings{cb4b81cdd62d426ab47da8dff41fdb41,

title = "Corpora of Russian Spontaneous Speech as a Tool for Modelling Natural Speech Production and Recognition",

abstract = "The paper presents two corpora of spontaneous Russian. The aim of the study is to describe the speech signal in a way close to the one a listener has to cope with while processing natural speech and to use the corpora for further computer simulation of spoken word recognition. The corpus of adult speech includes around two hours of recordings provided with the orthographic and acoustic-phonetic transcription performed manually by trained phoneticians. The word list imitating the mental lexicon of a listener where each phonetic realization corresponds to all possible variants of its interpretation found in the corpus was created based on the corpus. The analysis of the adult speech shows how often reduced word forms occur in spontaneous speech and allows to develop and check an algorithm of the restoration of grammatical information in noun phrases. The corpus of children's speech includes both longitudinal and experimental data (around 18 hours all together) and is the first example of the corpus of Russian children's speech provided with phonetic annotation. The preliminary analysis of the children's speech shows that at least some reduced variants can be stored in the mental lexicon of a native speaker.",

keywords = "Spontaneous speech, Children's Speech, Russian, Phonetic Reduction, Speech Processing, corpus linguistics",

author = "Riekhakaynen, {Elena I.}",

note = "E. I. Riekhakaynen, {"}Corpora of Russian Spontaneous Speech as a Tool for Modelling Natural Speech Production and Recognition,{"} 2020 10th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, 2020, pp. 0406-0411, doi: 10.1109/CCWC47524.2020.9031251.; Conference date: 06-01-2020 Through 08-01-2020",

year = "2020",

month = mar,

day = "12",

doi = "10.1109/CCWC47524.2020.9031251",

language = "English",

pages = "0406--0411",

booktitle = "10th Annual Computing and Communication Workshop and Conference (CCWC)",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

address = "United States",

url = "http://ieee-ccwc.org/",

}

RIS

TY - GEN

T1 - Corpora of Russian Spontaneous Speech as a Tool for Modelling Natural Speech Production and Recognition

AU - Riekhakaynen, Elena I.

N1 - Conference code: 10

PY - 2020/3/12

Y1 - 2020/3/12

AB - The paper presents two corpora of spontaneous Russian. The aim of the study is to describe the speech signal in a way close to the one a listener has to cope with while processing natural speech and to use the corpora for further computer simulation of spoken word recognition. The corpus of adult speech includes around two hours of recordings provided with the orthographic and acoustic-phonetic transcription performed manually by trained phoneticians. The word list imitating the mental lexicon of a listener where each phonetic realization corresponds to all possible variants of its interpretation found in the corpus was created based on the corpus. The analysis of the adult speech shows how often reduced word forms occur in spontaneous speech and allows to develop and check an algorithm of the restoration of grammatical information in noun phrases. The corpus of children's speech includes both longitudinal and experimental data (around 18 hours all together) and is the first example of the corpus of Russian children's speech provided with phonetic annotation. The preliminary analysis of the children's speech shows that at least some reduced variants can be stored in the mental lexicon of a native speaker.

KW - Spontaneous speech

KW - Children's Speech

KW - Russian

KW - Phonetic Reduction

KW - Speech Processing

KW - corpus linguistics

U2 - 10.1109/CCWC47524.2020.9031251

DO - 10.1109/CCWC47524.2020.9031251

M3 - Conference contribution

SP - 406

EP - 411

BT - 10th Annual Computing and Communication Workshop and Conference (CCWC)

PB - Institute of Electrical and Electronics Engineers Inc.

Y2 - 6 January 2020 through 8 January 2020

ER -
