Recent advances in deep learning for natural language processing match or improve on state-of-the-art results in many natural language processing tasks. One problem with neural network models, however, is that they require large datasets, including large labeled datasets for the corresponding problems. In this work, we suggest a data augmentation method based on extending a given dataset with synonyms for the words appearing in it. We apply this approach to the morphologically rich Russian language and show improvements for modern neural network NLP models on standard tasks such as sentiment analysis.
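
The idea of extending a dataset with synonyms can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say where the synonyms come from or how replacements are chosen, so the toy dictionary `SYNONYMS`, the replacement probability `p`, and the function names `augment` and `extend_dataset` are all assumptions made for illustration.

```python
import random

# Hypothetical toy synonym dictionary (an assumption for illustration only).
# The abstract does not specify the synonym source used in the paper.
SYNONYMS = {
    "хороший": ["отличный", "прекрасный"],
    "фильм": ["кино"],
}


def augment(tokens, synonyms, rng, p=0.5):
    """Return a copy of tokens where each word that has synonyms
    is replaced by a randomly chosen synonym with probability p."""
    out = []
    for tok in tokens:
        candidates = synonyms.get(tok.lower())
        if candidates and rng.random() < p:
            out.append(rng.choice(candidates))
        else:
            out.append(tok)
    return out


def extend_dataset(dataset, synonyms, n_copies=2, seed=0):
    """Extend a list of (tokens, label) pairs with augmented copies.
    Labels are kept unchanged; copies identical to the source are skipped."""
    rng = random.Random(seed)
    extended = list(dataset)
    for tokens, label in dataset:
        for _ in range(n_copies):
            new_tokens = augment(tokens, synonyms, rng)
            if new_tokens != tokens:
                extended.append((new_tokens, label))
    return extended
```

Calling `extend_dataset([(["хороший", "фильм"], "pos")], SYNONYMS)` returns the original example followed by any augmented variants carrying the same label. Note that for a morphologically rich language such as Russian, a plain dictionary lookup like this one would only match exact word forms; handling inflection is left out of the sketch.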

Original language: English
Title of host publication: Proceedings of the AINL FRUCT 2016 Conference
Editors: Andrey Filchenkov, Jan Zizka, Lidia Pivovarova, Sergey Balandin
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (electronic): 9789526839783
Publication status: Published - 3 Apr 2017
Event: 5th Artificial Intelligence and Natural Language FRUCT Conference, AINL FRUCT 2016 - Saint-Petersburg, Russian Federation
Duration: 10 Nov 2016 to 12 Nov 2016

Publication series

Name: Proceedings of the AINL FRUCT 2016 Conference

Conference

Conference: 5th Artificial Intelligence and Natural Language FRUCT Conference, AINL FRUCT 2016
Country/Territory: Russian Federation
City: Saint-Petersburg
Period: 10/11/16 to 12/11/16

    Research areas

  • natural language processing, data augmentation, character-level models, neural networks

    Scopus subject areas

  • Artificial Intelligence

ID: 95167988