Automatic speech synthesis is undergoing significant changes driven by new machine learning methods. These methods qualitatively improve the sound of synthesized speech, bringing it closer to natural human speech. Against this backdrop, and under the influence of business, the development of artificial emotional speech for human-machine interaction systems has gained new momentum. As a result, prosodic processing for the synthesis of Russian emotional speech has become an important research direction for our research group. The article presents an algorithm for predicting pause locations for three categories of emotional speech. In particular, the authors trained classifiers on three corpora of emotional speech, collected by emotional category (neutral, excited and depressed). The results can be used to create a high-quality automatic synthesizer of emotional speech.

Original language: English
Title of host publication: Proceedings of the 2018 IEEE International Conference "Quality Management, Transport and Information Security, Information Technologies", IT and QM and IS 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 653-655
Number of pages: 3
ISBN (Electronic): 9781538667576
State: Published - 5 Nov 2018
Event: 2018 IEEE International Conference "Quality Management, Transport and Information Security, Information Technologies", IT and QM and IS 2018 - St. Petersburg, Russian Federation
Duration: 24 Sep 2018 to 28 Sep 2018

Publication series

Name: Proceedings of the 2018 International Conference "Quality Management, Transport and Information Security, Information Technologies", IT and QM and IS 2018

Conference

Conference: 2018 IEEE International Conference "Quality Management, Transport and Information Security, Information Technologies", IT and QM and IS 2018
Country/Territory: Russian Federation
City: St. Petersburg
Period: 24/09/18 to 28/09/18

    Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
  • Information Systems
  • Information Systems and Management
  • Safety, Risk, Reliability and Quality

    Research areas

  • emotional speech, pause prediction, prosody, speech synthesis, statistical models

ID: 42717806