Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review
Developing a Question Answering System on the Material of Holocaust Survivors’ Testimonies in Russian. / Bukreeva, Liudmila; Guseva, Daria; Dolgushin, Mikhail; Evdokimova, Vera; Obotnina, Vasilisa.
Speech and Computer. 2023. p. 357-366 (Lecture Notes in Computer Science; Vol. 14339).
TY - GEN
T1 - Developing a Question Answering System on the Material of Holocaust Survivors’ Testimonies in Russian
AU - Bukreeva, Liudmila
AU - Guseva, Daria
AU - Dolgushin, Mikhail
AU - Evdokimova, Vera
AU - Obotnina, Vasilisa
PY - 2023/11/22
Y1 - 2023/11/22
AB - The paper makes use of the annotated task-oriented corpus of Holocaust testimonies in Russian (ruOHQA) to train a question-answer neural network model. We start from data preprocessing, present statistical analysis of the collected corpus for approximately 1500 pairs of questions and answers and describe its strengths and limitations. Also, we carry out experiments on automatic processing of the ruOHQA corpus using pre-trained transformer-based neural network models. Finally, we explore the capability of several models to generate simplified high-quality answers to questions and compare their results. The kind of research we present allows us to extract knowledge from oral history archives more productively.
KW - Corpora
KW - Question Answering
KW - Visual History Archives
UR - https://www.mendeley.com/catalogue/10aefa45-2994-394d-9d05-d3ffaeec0649/
U2 - 10.1007/978-3-031-48312-7_29
DO - 10.1007/978-3-031-48312-7_29
M3 - Conference contribution
SN - 9783031483110
T3 - Lecture Notes in Computer Science
SP - 357
EP - 366
BT - Speech and Computer
Y2 - 29 November 2023 through 1 December 2023
ER -
ID: 114283349