F0 contour prediction for the Kazakh language

DOI

https://doi.org/10.1145/3330431.3330436
Final published version

Arman Kaliyev
Yuri N. Matveev
Elena E. Lyakso
Sergey V. Rybin

This article presents work on predicting fundamental frequency (F0) values for the Kazakh language. The fundamental frequency plays one of the most important roles in speech perception, yet modelling a continuous F0 contour is one of the most difficult tasks in developing intonational speech synthesis systems. The main difficulty is that a speaker can say the same sentence with different intonations and tones. In this work, we used deep neural networks to predict F0 values accurately, as close as possible to natural-sounding Kazakh speech.

Original language	English
Title of host publication	ICEMIS '19
Subtitle of host publication	Proceedings of the 5th International Conference on Engineering and MIS
Publisher	Association for Computing Machinery
ISBN (Electronic)	9781450372121
ISBN (Print)	9781450372121
State	Published - 6 Jun 2019
Event	5th International Conference on Engineering and MIS, ICEMIS 2019, Astana, Kazakhstan; Duration: 6 Jun 2019 → 8 Jun 2019

Publication series

Name	ACM International Conference Proceeding Series

Conference

Conference	5th International Conference on Engineering and MIS, ICEMIS 2019
Country/Territory	Kazakhstan
City	Astana
Period	6/06/19 → 8/06/19

Scopus subject areas

Software
Human-Computer Interaction
Computer Vision and Pattern Recognition
Computer Networks and Communications

Research areas

DNN, Fundamental frequency, Informants with atypical development, Intonation, Kazakh language, LSTM, Speech synthesis
