End-to-End Speech Recognition in Russian

DOI

https://doi.org/10.1007/978-3-319-99579-3_40
Final published version

Nikita Markovnikov
Irina Kipyatkova
Elena Lyakso

End-to-end speech recognition systems incorporating deep neural networks (DNNs) have achieved good results. We propose applying CTC (Connectionist Temporal Classification) models and attention-based encoder-decoder models to automatic recognition of continuous Russian speech. We experimented with different neural network models, such as long short-term memory (LSTM), bidirectional LSTM, and residual networks. The recognition accuracy we obtained is slightly worse than that of hybrid models, but our models can work without a large language model, and they showed better average decoding speed, which can be helpful in real systems. Experiments were performed with an extra-large vocabulary (more than 150K words) of Russian speech.

Original language	English
Title of host publication	Speech and Computer - 20th International Conference, SPECOM 2018, Proceedings
Editors	Rodmonga Potapova, Oliver Jokisch, Alexey Karpov
Publisher	Springer Nature
Pages	377-386
Number of pages	10
ISBN (print)	9783319995786
DOI	https://doi.org/10.1007/978-3-319-99579-3_40
Publication status	Published - 1 Sep 2018
Event	20th International Conference on Speech and Computer - Leipzig, Germany. Duration: 18 Sep 2018 → 22 Sep 2018

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	11096 LNAI
ISSN (print)	0302-9743
ISSN (electronic)	1611-3349

Conference

Conference	20th International Conference on Speech and Computer
Abbreviated title	SPECOM 2018
Country/Territory	Germany
City	Leipzig
Period	18/09/18 → 22/09/18

Scopus subject areas

Theoretical Computer Science
Computer Science (all)

ID: 36521378