The paper deals with the problem of thematic tagging in works of fiction, with Russian short stories of the early 20th century (1900-1930) serving as research data. The very concept of discourse theme, or topic, is argued to be fuzzy and ambivalent, all the more so in the case of literary prose. In the present study, theme is conceived as a set of keywords basically (but by no means exhaustively) defining the story’s plot. A list of 89 themes was empirically formed, embracing a wide range of topics. A sample corpus of 310 stories was manually tagged, with each story being mapped onto a set of themes. This corpus was divided into three parts corresponding to three periods: 1900-1913, 1914-1922, and 1923-1930. These periods of Russian history being radically different, the stories’ content varies greatly, too, not only with regard to political and social themes, but also in quite personal and mundane matters. The paper traces the themes’ frequency rates across the three periods, accounting for their change dynamics in terms of sociopolitical context. The sample corpus will be further used as training data in devising computational techniques for automated thematic tagging of literary fiction.

Translated title of the contribution: Тематическое аннотирование художественной литературы: на примере русских рассказов начала XX века (Thematic annotation of fiction: the case of Russian short stories of the early 20th century)
Original language: English
Pages: 265-276
Number of pages: 12
State: Published - 2021
Event: Internet and Modern Society - ITMO University, Saint Petersburg, Russian Federation
Duration: 17 Jun 2020 to 20 Jun 2020
Conference number: 23
http://ims.ifmo.ru/ru/pages/2/programma.htm

Conference

Conference: Internet and Modern Society
Abbreviated title: IMS 2020
Country/Territory: Russian Federation
City: Saint Petersburg
Period: 17/06/20 to 20/06/20

    Research areas

  • Discourse theme, Literary corpus, Russian literature, Short stories, Thematic tagging

    Scopus subject areas

  • Computer Science (all)

ID: 74160978