Eliciting Meaningful Units from Speech

DOI

https://doi.org/10.21437/Interspeech.2017-855
Final published version

Elicitation of information structure from speech is a crucial step in automatic speech understanding. In terms of both production and perception, we consider intonational phrase to be the basic meaningful unit of information structure in speech. The current paper presents a method of detecting these units in speech by processing both the recorded speech and its textual representation. Using syntactic information, we split text into small groups of words closely connected with each other. Assuming that intonational phrases are built from these small groups, we use acoustic information to reveal their actual boundaries. The procedure was initially developed for processing Russian speech, and we have achieved the best published results for this language with F1 equal to 0.91. We assume that it may be adapted for other languages that have some amount of read speech resources, including under-resourced languages. For comparison we have evaluated it on English material (Boston University Radio Speech Corpus). Our results, F1 of 0.76, are comparable with the top systems designed for English.
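The abstract reports results as F1 scores for detected phrase boundaries. As a rough illustration of how such a score is commonly computed, the sketch below compares a set of predicted boundary positions against a reference set. It assumes exact-match scoring on word indices; the function name `boundary_f1` and the example data are hypothetical, and the paper's actual evaluation protocol (for instance, any tolerance window) is not stated in this record.

```python
def boundary_f1(reference, predicted):
    """Return (precision, recall, F1) for two collections of boundary positions.

    Assumption: a predicted boundary counts as correct only if it matches
    a reference boundary exactly.
    """
    ref, pred = set(reference), set(predicted)
    true_pos = len(ref & pred)  # boundaries found in both sets
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical data: each number is the index of the word after which
# an intonational phrase ends.
reference = [3, 7, 12, 18]
predicted = [3, 7, 13, 18, 21]

precision, recall, f1 = boundary_f1(reference, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

F1 is the harmonic mean of precision and recall, so a detector that proposes too many boundaries and one that proposes too few are both penalized.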

Original language	English
Title of host publication	Proceedings of Interspeech 2017
Pages	2128-2132
Number of pages	5
DOIs	https://doi.org/10.21437/Interspeech.2017-855
State	Published - 2017
Event	Interspeech 2017 - Stockholm, Sweden Duration: 20 Aug 2017 → 24 Aug 2017

Conference

Conference	Interspeech 2017
Country/Territory	Sweden
City	Stockholm
Period	20/08/17 → 24/08/17
