Standard

Using Kaldi for Phonetic Transcription: Evidence from the Corpus of Spoken Russian. / Riekhakaynen, Elena; Skorobagatko, Lada.

Proceedings of the Third International Conference on Advances in Computing Research (ACR’25). Springer Nature, 2025. p. 168-178 (Lecture Notes in Networks and Systems; Vol. 1346).

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Harvard

Riekhakaynen, E & Skorobagatko, L 2025, Using Kaldi for Phonetic Transcription: Evidence from the Corpus of Spoken Russian. in Proceedings of the Third International Conference on Advances in Computing Research (ACR’25). Lecture Notes in Networks and Systems, vol. 1346, Springer Nature, pp. 168-178, The 2025 International Conference on Advances in Computing Research, Nice, France, 7/07/25. https://doi.org/10.1007/978-3-031-87647-9_15

APA

Riekhakaynen, E., & Skorobagatko, L. (2025). Using Kaldi for Phonetic Transcription: Evidence from the Corpus of Spoken Russian. In Proceedings of the Third International Conference on Advances in Computing Research (ACR’25) (pp. 168-178). (Lecture Notes in Networks and Systems; Vol. 1346). Springer Nature. https://doi.org/10.1007/978-3-031-87647-9_15

Vancouver

Riekhakaynen E, Skorobagatko L. Using Kaldi for Phonetic Transcription: Evidence from the Corpus of Spoken Russian. In Proceedings of the Third International Conference on Advances in Computing Research (ACR’25). Springer Nature. 2025. p. 168-178. (Lecture Notes in Networks and Systems). https://doi.org/10.1007/978-3-031-87647-9_15

Author

Riekhakaynen, Elena ; Skorobagatko, Lada. / Using Kaldi for Phonetic Transcription: Evidence from the Corpus of Spoken Russian. Proceedings of the Third International Conference on Advances in Computing Research (ACR’25). Springer Nature, 2025. pp. 168-178 (Lecture Notes in Networks and Systems).

BibTeX

@inproceedings{e750dd78d080445c872eec4b6d04972e,
title = "Using Kaldi for Phonetic Transcription: Evidence from the Corpus of Spoken Russian",
abstract = "For creating a linguistically annotated speech corpus, it is useful to have a tool for an automatic phonetic transcription. We used the Kaldi tool to transcribe the recordings of radio interviews and talk shows from the Corpus of Spoken Russian. The training set included 2466 interpausal intervals (speech fragments between two pauses), and the test set – 617 ones. 15 models for monophone training and 15 models for triphone training were tested using a low-dimensional dictionary that contained only allophones. The error rates ranged from 44% to 39%. Learning through triphones coped better with the task than the one through monophones. Increasing the length of N-grams had a positive effect on the result of the model, the percentage of errors decreased to 36%. The frequency of allophone occurrence does not seem to affect the accuracy of their recognition. Vowels are recognized worse than consonants, which is consistent with what is known about how trained experts in phonetics transcribe spontaneous speech.",
keywords = "Acoustic Transcription, Automatic Speech Recognition, Natural Language Processing, Phonetic Transcription, Russian Speech",
author = "Elena Riekhakaynen and Lada Skorobagatko",
year = "2025",
month = apr,
day = "16",
doi = "10.1007/978-3-031-87647-9_15",
language = "English",
isbn = "9783031876462",
series = "Lecture Notes in Networks and Systems",
publisher = "Springer Nature",
pages = "168--178",
booktitle = "Proceedings of the Third International Conference on Advances in Computing Research (ACR{\textquoteright}25)",
address = "Germany",
note = "Conference date: 07-07-2025 through 09-07-2025",
url = "https://iicser.org/ACR25/",

}

RIS

TY - GEN

T1 - Using Kaldi for Phonetic Transcription: Evidence from the Corpus of Spoken Russian

AU - Riekhakaynen, Elena

AU - Skorobagatko, Lada

N1 - Conference code: 3

PY - 2025/4/16

Y1 - 2025/4/16

N2 - For creating a linguistically annotated speech corpus, it is useful to have a tool for an automatic phonetic transcription. We used the Kaldi tool to transcribe the recordings of radio interviews and talk shows from the Corpus of Spoken Russian. The training set included 2466 interpausal intervals (speech fragments between two pauses), and the test set – 617 ones. 15 models for monophone training and 15 models for triphone training were tested using a low-dimensional dictionary that contained only allophones. The error rates ranged from 44% to 39%. Learning through triphones coped better with the task than the one through monophones. Increasing the length of N-grams had a positive effect on the result of the model, the percentage of errors decreased to 36%. The frequency of allophone occurrence does not seem to affect the accuracy of their recognition. Vowels are recognized worse than consonants, which is consistent with what is known about how trained experts in phonetics transcribe spontaneous speech.

AB - For creating a linguistically annotated speech corpus, it is useful to have a tool for an automatic phonetic transcription. We used the Kaldi tool to transcribe the recordings of radio interviews and talk shows from the Corpus of Spoken Russian. The training set included 2466 interpausal intervals (speech fragments between two pauses), and the test set – 617 ones. 15 models for monophone training and 15 models for triphone training were tested using a low-dimensional dictionary that contained only allophones. The error rates ranged from 44% to 39%. Learning through triphones coped better with the task than the one through monophones. Increasing the length of N-grams had a positive effect on the result of the model, the percentage of errors decreased to 36%. The frequency of allophone occurrence does not seem to affect the accuracy of their recognition. Vowels are recognized worse than consonants, which is consistent with what is known about how trained experts in phonetics transcribe spontaneous speech.

KW - Acoustic Transcription

KW - Automatic Speech Recognition

KW - Natural Language Processing

KW - Phonetic Transcription

KW - Russian Speech

UR - https://www.mendeley.com/catalogue/60277827-7b74-3cc3-8758-0432160c02be/

U2 - 10.1007/978-3-031-87647-9_15

DO - 10.1007/978-3-031-87647-9_15

M3 - Conference contribution

SN - 9783031876462

T3 - Lecture Notes in Networks and Systems

SP - 168

EP - 178

BT - Proceedings of the Third International Conference on Advances in Computing Research (ACR’25)

PB - Springer Nature

Y2 - 7 July 2025 through 9 July 2025

ER -

ID: 138031353