
This article presents work on predicting fundamental frequency (F0) values for the Kazakh language. The fundamental frequency plays one of the most important roles in speech perception, yet modelling continuous F0 is one of the most difficult tasks in developing intonational speech synthesis systems. The main difficulty is that a speaker can say the same sentence with different intonations and tones. In this work, we used deep neural networks for accurate, high-quality prediction of F0 values that are as close as possible to natural-sounding Kazakh speech.

Original language: English
Title of host publication: ICEMIS '19
Subtitle of host publication: Proceedings of the 5th International Conference on Engineering and MIS
Publisher: Association for Computing Machinery
ISBN (Electronic): 9781450372121
ISBN (Print): 9781450372121
State: Published - 6 Jun 2019
Event: 5th International Conference on Engineering and MIS, ICEMIS 2019 - Astana, Kazakhstan
Duration: 6 Jun 2019 to 8 Jun 2019

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 5th International Conference on Engineering and MIS, ICEMIS 2019
Country/Territory: Kazakhstan
City: Astana
Period: 6/06/19 to 8/06/19

    Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications

    Research areas

  • DNN, Fundamental frequency, Informants with atypical development, Intonation, Kazakh language, LSTM, Speech synthesis

ID: 46097990