DOI

https://doi.org/10.1007/978-3-642-15760-8_50
Final published version

The paper introduces CORPRES (COrpus of Russian Professionally REad Speech), developed at the Department of Phonetics, Saint Petersburg State University, as the result of a three-year project. The corpus includes samples of different speaking styles produced by four male and four female speakers. Six levels of annotation cover all phonetic and prosodic information about the recorded speech data, including labels for pitch marks and phonetic events, as well as phonetic, orthographic and prosodic transcription. The precise phonetic transcription of the data makes the corpus an especially valuable resource for both research and development purposes. The overall corpus size is 60 hours of speech. The paper describes the CORPRES design and annotation principles and gives an overall description of the data. We also discuss possible uses of the corpus in phonetic research and speech technology, as well as some findings on the Russian sound system obtained from the corpus data.

Original language	English
Title of host publication	Text, Speech and Dialogue - 13th International Conference, TSD 2010, Proceedings
Place of Publication	Berlin Heidelberg
Publisher	Springer Nature
Pages	392-399
Number of pages	8
ISBN (Print)	3642157599, 9783642157592
DOIs	https://doi.org/10.1007/978-3-642-15760-8_50
State	Published - 2010
Event	13th International Conference on Text, Speech and Dialogue, TSD 2010, Brno, Czech Republic; duration: 6 Sep 2010 → 10 Sep 2010; conference number: 13; https://www.tsdconference.org/

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	6231 LNAI
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	13th International Conference on Text, Speech and Dialogue, TSD 2010
Abbreviated title	TSD 2010
Country/Territory	Czech Republic
City	Brno
Period	6/09/10 → 10/09/10
Internet address	https://www.tsdconference.org/

Scopus subject areas

Theoretical Computer Science
Computer Science (all)

Research areas

annotation, manual transcription, phonetic transcription, Phonetics, prosodic feature labelling, speech corpus, text-to-speech

ID: 4428711

CORPRES: Corpus of Russian professionally read speech
