Topic Modelling in Parallel and Comparable Fiction Texts (the case study of English and Russian prose)

Anastasiia Sedova, Olga Mitrofanova

Research output

Abstract

The paper addresses the processing of parallel and comparable corpora by means of topic modelling, focusing on Russian and English parallel and comparable texts. We use the Latent Dirichlet Allocation (LDA) algorithm to build topic models of fiction texts, to evaluate the compatibility of an original text and its translation(s), and to select possible translation equivalents.
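As background for the LDA step mentioned in the abstract, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling, one common inference scheme. The abstract does not say which implementation or settings the authors used, so everything here is an assumption for illustration: the function names `fit_lda` and `top_words`, the hyperparameters `alpha`, `beta` and `iters`, and the toy documents, which are made up and not taken from the paper's corpus.

```python
import random


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Fit LDA with collapsed Gibbs sampling (illustrative sketch).

    docs is a list of token lists. Returns (doc_topic, topic_word, vocab):
    doc_topic[d][k] estimates P(topic k | document d) and
    topic_word[k][w] estimates P(word w | topic k).
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    # Count tables maintained by the sampler.
    n_dk = [[0] * num_topics for _ in docs]       # document-topic counts
    n_kw = [[0] * V for _ in range(num_topics)]   # topic-word counts
    n_k = [0] * num_topics                        # tokens assigned to each topic

    # Assign every token to a random topic to start.
    z = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][word_id[w]] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = z[d][i]
                # Remove the token's current assignment from the counts.
                n_dk[d][k] -= 1
                n_kw[k][wid] -= 1
                n_k[k] -= 1
                # Unnormalised conditional probability of each topic.
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][wid] + beta) / (n_k[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][wid] += 1
                n_k[k] += 1

    # Smoothed estimates from the final counts.
    doc_topic = [
        [(n_dk[d][k] + alpha) / (len(doc) + num_topics * alpha) for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    topic_word = [
        [(n_kw[k][w] + beta) / (n_k[k] + V * beta) for w in range(V)]
        for k in range(num_topics)
    ]
    return doc_topic, topic_word, vocab


def top_words(topic_word, vocab, n=3):
    """Return the n most probable words of each topic."""
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in topic_word
    ]


# Toy, made-up documents used only to exercise the sketch.
docs = [
    "sea ship storm sea captain".split(),
    "ship captain sea storm wave".split(),
    "ball dance music dress ball".split(),
    "music dance dress ball guest".split(),
]
doc_topic, topic_word, vocab = fit_lda(docs, num_topics=2)
for words in top_words(topic_word, vocab):
    print(words)
```

In a parallel-text setting, one plausible use of such a model is to fit it separately on an original and on its translation and then compare the resulting topics, but how the paper performs that comparison is not stated in the abstract.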

Original language: English
Title of host publication: Proceedings of the International Conference on Internet and Modern Society, IMS 2017
Editors: Irina I. Tolstikova, Nikolai V. Borisov, Victor P. Zakharov, Nikolai V. Borisov, Leonid V. Smorgunov, Radomir V. Bolgov
Publisher: Association for Computing Machinery
Pages: 175-180
Number of pages: 6
ISBN (Electronic): 9781450354370
DOI: https://doi.org/10.1145/3143699.3143734
Publication status: Published - 21 Jun 2017
Event: Internet and Modern Society: International Joint Conference - ITMO University, St. Petersburg
Duration: 20 Jun 2017 to 23 Jun 2017
Conference number: 20
http://icims.ifmo.ru/
http://ims.ifmo.ru/ru/pages/28/IMS_2017.htm

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 2017 International Conference on Internet and Modern Society, IMS 2017
Abbreviated title: IMS 2017
Country: Russian Federation
City: St. Petersburg
Period: 20/06/17 to 23/06/17

Scopus subject areas

  • Human-Computer Interaction
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Software

Cite this

Sedova, A., & Mitrofanova, O. (2017). Topic Modelling in Parallel and Comparable Fiction Texts (the case study of English and Russian prose). In I. I. Tolstikova, N. V. Borisov, V. P. Zakharov, N. V. Borisov, L. V. Smorgunov, & R. V. Bolgov (Eds.), Proceedings of the International Conference on Internet and Modern Society, IMS 2017 (pp. 175-180). (ACM International Conference Proceeding Series). Association for Computing Machinery. https://doi.org/10.1145/3143699.3143734
Sedova, Anastasiia ; Mitrofanova, Olga. / Topic Modelling in Parallel and Comparable Fiction Texts (the case study of English and Russian prose). Proceedings of the International Conference on Internet and Modern Society, IMS 2017. editor / Irina I. Tolstikova ; Nikolai V. Borisov ; Victor P. Zakharov ; Nikolai V. Borisov ; Leonid V. Smorgunov ; Radomir V. Bolgov. Association for Computing Machinery, 2017. pp. 175-180 (ACM International Conference Proceeding Series).
@inproceedings{39d4d60035c9417688040e861c7e9085,
title = "Topic Modelling in Parallel and Comparable Fiction Texts (the case study of English and Russian prose)",
abstract = "The paper is devoted to processing parallel and comparable corpora by means of topic modelling. We focus our attention on Russian and English parallel and comparable texts. We use Latent Dirichlet Allocation (LDA) algorithm for building topic models of fiction texts, evaluation of compatibility for the original text and its translation(s), selection of possible translation equivalents.",
keywords = "Comparable Texts, English, Fiction, Parallel, Russian, Text Corpora, Topic Modelling",
author = "Anastasiia Sedova and Olga Mitrofanova",
year = "2017",
month = "6",
day = "21",
doi = "10.1145/3143699.3143734",
language = "English",
series = "ACM International Conference Proceeding Series",
publisher = "Association for Computing Machinery",
pages = "175--180",
editor = "Tolstikova, {Irina I.} and Borisov, {Nikolai V.} and Zakharov, {Victor P.} and Borisov, {Nikolai V.} and Smorgunov, {Leonid V.} and Bolgov, {Radomir V.}",
booktitle = "Proceedings of the International Conference on Internet and Modern Society, IMS 2017",
address = "United States",

}

Sedova, A & Mitrofanova, O 2017, Topic Modelling in Parallel and Comparable Fiction Texts (the case study of English and Russian prose). in II Tolstikova, NV Borisov, VP Zakharov, NV Borisov, LV Smorgunov & RV Bolgov (eds), Proceedings of the International Conference on Internet and Modern Society, IMS 2017. ACM International Conference Proceeding Series, Association for Computing Machinery, pp. 175-180, St. Petersburg, 20/06/17. https://doi.org/10.1145/3143699.3143734

Topic Modelling in Parallel and Comparable Fiction Texts (the case study of English and Russian prose). / Sedova, Anastasiia; Mitrofanova, Olga.

Proceedings of the International Conference on Internet and Modern Society, IMS 2017. ed. / Irina I. Tolstikova; Nikolai V. Borisov; Victor P. Zakharov; Nikolai V. Borisov; Leonid V. Smorgunov; Radomir V. Bolgov. Association for Computing Machinery, 2017. p. 175-180 (ACM International Conference Proceeding Series).

TY - GEN
T1 - Topic Modelling in Parallel and Comparable Fiction Texts (the case study of English and Russian prose)
AU - Sedova, Anastasiia
AU - Mitrofanova, Olga
PY - 2017/6/21
Y1 - 2017/6/21
N2 - The paper is devoted to processing parallel and comparable corpora by means of topic modelling. We focus our attention on Russian and English parallel and comparable texts. We use Latent Dirichlet Allocation (LDA) algorithm for building topic models of fiction texts, evaluation of compatibility for the original text and its translation(s), selection of possible translation equivalents.
AB - The paper is devoted to processing parallel and comparable corpora by means of topic modelling. We focus our attention on Russian and English parallel and comparable texts. We use Latent Dirichlet Allocation (LDA) algorithm for building topic models of fiction texts, evaluation of compatibility for the original text and its translation(s), selection of possible translation equivalents.
KW - Comparable Texts
KW - English
KW - Fiction
KW - Parallel
KW - Russian
KW - Text Corpora
KW - Topic Modelling
UR - http://www.scopus.com/inward/record.url?scp=85040721958&partnerID=8YFLogxK
U2 - 10.1145/3143699.3143734
DO - 10.1145/3143699.3143734
M3 - Conference contribution
AN - SCOPUS:85040721958
T3 - ACM International Conference Proceeding Series
SP - 175
EP - 180
BT - Proceedings of the International Conference on Internet and Modern Society, IMS 2017
A2 - Tolstikova, Irina I.
A2 - Borisov, Nikolai V.
A2 - Zakharov, Victor P.
A2 - Borisov, Nikolai V.
A2 - Smorgunov, Leonid V.
A2 - Bolgov, Radomir V.
PB - Association for Computing Machinery
ER -

Sedova A, Mitrofanova O. Topic Modelling in Parallel and Comparable Fiction Texts (the case study of English and Russian prose). In Tolstikova II, Borisov NV, Zakharov VP, Borisov NV, Smorgunov LV, Bolgov RV, editors, Proceedings of the International Conference on Internet and Modern Society, IMS 2017. Association for Computing Machinery. 2017. p. 175-180. (ACM International Conference Proceeding Series). https://doi.org/10.1145/3143699.3143734