The paper addresses the problem of thematic tagging in works of fiction, using Russian short stories of the early 20th century (1900-1930) as research data. The very concept of discourse theme, or topic, is argued to be fuzzy and ambivalent, all the more so in the case of literary prose. In the present study, a theme is conceived as a set of keywords that basically (but by no means exhaustively) define a story’s plot. A list of 89 themes was formed empirically, embracing a wide range of topics. A sample corpus of 310 stories was manually tagged, with each story mapped onto a set of themes. The corpus was divided into three parts corresponding to three periods: 1900-1913, 1914-1922, and 1923-1930. Because these periods of Russian history differ radically, the stories’ content varies greatly as well, not only in political and social themes but also in quite personal and mundane matters. The paper traces the themes’ frequency rates across the three periods and accounts for their dynamics in terms of the sociopolitical context. The sample corpus will further be used as training data for developing computational techniques for automated thematic tagging of literary fiction.
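
The frequency comparison described above can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes that a theme's frequency rate is the share of a period's stories tagged with that theme, and the function names, theme labels, and toy records are all hypothetical. Only the three period boundaries come from the abstract.

```python
from collections import Counter, defaultdict

# Period boundaries taken from the abstract (inclusive years).
PERIODS = [(1900, 1913), (1914, 1922), (1923, 1930)]


def period_of(year):
    """Return the (start, end) period containing `year`, or None."""
    for start, end in PERIODS:
        if start <= year <= end:
            return (start, end)
    return None


def theme_rates(stories):
    """Map each period to {theme: share of that period's stories tagged with it}.

    Assumption: a "frequency rate" is the fraction of stories in a period
    that carry the theme; the paper may define it differently.
    """
    counts = defaultdict(Counter)  # period -> theme -> number of stories
    totals = Counter()             # period -> number of stories
    for year, themes in stories:
        period = period_of(year)
        if period is None:
            continue  # skip stories outside 1900-1930
        totals[period] += 1
        counts[period].update(set(themes))  # count each theme once per story
    return {
        p: {theme: n / totals[p] for theme, n in counts[p].items()}
        for p in totals
    }


# Hypothetical toy data: (publication year, manually assigned themes).
stories = [
    (1905, {"love", "death"}),
    (1910, {"love"}),
    (1916, {"war", "death"}),
    (1925, {"revolution", "love"}),
]
rates = theme_rates(stories)
```

Normalising by the number of stories in each period keeps the three sub-corpora comparable even if they differ in size.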

Translated title: Thematic annotation of fiction: the case of Russian short stories of the early 20th century
Event: XXIII Joint Scientific Conference "Internet and Modern Society"
Short title: IMS 2020
