Microphone array post-filter in frequency domain for speech recognition using short-time log-spectral amplitude estimator and spectral harmonic/noise classifier

DOI

https://doi.org/10.1007/978-3-319-66429-3_52
Final published version

Sergey Salishev
Ilya Klotchkov
Andrey Barabanov

We propose a novel, computationally efficient, real-time microphone array speech enhancement post-filter with a small delay that takes into account features of the speech signal and of recognition algorithms. The algorithm is efficient for small microphone arrays. The filter is based on applying a binary classification model to the Log Short-Term Spectral Amplitude (Log-STSA). The proposed algorithm substantially improves recognition accuracy with a minor increase in complexity compared to the Wiener post-filter and lower complexity compared to existing voice-model-based approaches. Objective tests using a dual-microphone array, the ETSI binaural noise database, the TIDIGITS database, and the CMU Sphinx 4 speech recognizer demonstrate an overall 41% error rate reduction for SNR from 15 dB to 0 dB. Subjective evaluation also demonstrates substantial noise reduction and intelligibility improvement without the musical noise artifacts common to Wiener and spectral-subtraction-based methods. Testing with the SiSEC10 four-microphone linear equispaced array database shows that recognition accuracy improves as the array base and/or the number of microphones in the array increases.
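
The abstract does not give the classifier or the gain rule, so the following is only a minimal sketch of the general idea of applying a binary speech/noise decision to log-spectral amplitudes. It is not the authors' method: the function name `binary_log_postfilter`, the per-bin SNR threshold used as a stand-in classifier, and the parameters `snr_threshold_db` and `floor_db` are all assumptions made for illustration.

```python
import math

def binary_log_postfilter(amplitudes, noise_amplitudes,
                          snr_threshold_db=3.0, floor_db=-20.0):
    """Attenuate bins classified as noise; pass bins classified as speech.

    Hypothetical stand-in classifier: a bin counts as speech when its
    log amplitude exceeds the noise estimate by more than snr_threshold_db.
    """
    eps = 1e-12  # guards against log10(0)
    filtered = []
    for amp, noise in zip(amplitudes, noise_amplitudes):
        log_amp = 20.0 * math.log10(max(amp, eps))      # amplitude in dB
        log_noise = 20.0 * math.log10(max(noise, eps))  # noise estimate in dB
        is_speech = (log_amp - log_noise) > snr_threshold_db
        gain_db = 0.0 if is_speech else floor_db        # binary gain in the log domain
        filtered.append(10.0 ** ((log_amp + gain_db) / 20.0))
    return filtered

# One frame with two frequency bins: the first is well above the noise
# estimate, the second sits at the noise level and is attenuated.
frame = binary_log_postfilter([1.0, 0.1], [0.1, 0.1])
```

Applying a fixed attenuation floor rather than zeroing noise bins is a common way to limit musical noise, but whether the paper does this is not stated in the abstract.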

Original language	English
Title of host publication	Speech and Computer (SPECOM 2017)
Publisher	Springer Nature
Pages	525-534
Number of pages	10
DOI	https://doi.org/10.1007/978-3-319-66429-3_52
Status	Published - 1 Jan 2017
Event	19th International Conference on Speech and Computer - Hatfield, United Kingdom Duration: 11 Sep 2017 → 15 Sep 2017

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher	Springer Nature
Volume	10458 LNAI
ISSN (Print)	0302-9743

Conference

Conference	19th International Conference on Speech and Computer
Abbreviated title	SPECOM 2017
Country/Territory	United Kingdom
City	Hatfield
Period	11/09/17 → 15/09/17

ID: 152227169