Research output: Contribution to conference › Paper › peer-review
The paper deals with the problem of thematic tagging in works of fiction, using Russian short stories of the early 20th century (1900-1930) as research data. The very concept of discourse theme, or topic, is argued to be fuzzy and ambivalent, all the more so in the case of literary prose. In the present study, a theme is conceived as a set of keywords that basically (but by no means exhaustively) define the story’s plot. A list of 89 themes was formed empirically, embracing a wide range of topics. A sample corpus of 310 stories was manually tagged, with each story being mapped onto a set of themes. This corpus was divided into three parts corresponding to three periods: 1900-1913, 1914-1922, and 1923-1930. Because these periods of Russian history are radically different, the stories’ content varies greatly too, not only in political and social themes but also in quite personal and mundane matters. The paper traces the themes’ frequency rates across the three periods, accounting for their change dynamics in terms of sociopolitical context. The sample corpus will be further used as training data in devising computational techniques for automated thematic tagging of literary fiction.
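As a rough illustration of what tracing theme frequency rates across the three periods could involve, the Python sketch below computes, for each period, the share of stories tagged with each theme. It is not the authors' procedure: the `(year, themes)` data layout, the function names, and the theme labels in the toy data are all assumptions made for this example, and only the period boundaries come from the abstract.

```python
from collections import Counter

# Period boundaries as stated in the abstract.
PERIODS = [(1900, 1913), (1914, 1922), (1923, 1930)]


def period_of(year):
    """Return the period label (e.g. '1900-1913') that contains the year."""
    for start, end in PERIODS:
        if start <= year <= end:
            return f"{start}-{end}"
    raise ValueError(f"year {year} is outside 1900-1930")


def theme_rates(stories):
    """Map each period to {theme: share of that period's stories with the theme}.

    `stories` is an iterable of (year, themes) pairs; this layout is an
    assumption of the sketch, not the format of the actual corpus.
    """
    totals = Counter()   # number of stories per period
    counts = {}          # period -> Counter of theme occurrences
    for year, themes in stories:
        period = period_of(year)
        totals[period] += 1
        # set() ensures a theme is counted at most once per story.
        counts.setdefault(period, Counter()).update(set(themes))
    return {
        period: {theme: n / totals[period] for theme, n in themes.items()}
        for period, themes in counts.items()
    }


# Hypothetical toy data; the real corpus has 310 stories and 89 themes.
stories = [
    (1905, {"love", "death"}),
    (1910, {"love"}),
    (1916, {"war", "death"}),
    (1925, {"revolution", "love"}),
]
rates = theme_rates(stories)
```

Because a story can carry several themes, the rates within one period need not sum to 1; each value is simply the fraction of that period's stories in which the theme occurs.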
Translated title of the contribution | Тематическое аннотирование художественной литературы: на примере русских рассказов начала XX века |
---|---|
Original language | English |
Pages | 265-276 |
Number of pages | 12 |
State | Published - 2021 |
Event | Internet and Modern Society - ITMO University, Saint Petersburg, Russian Federation. Duration: 17 Jun 2020 → 20 Jun 2020. Conference number: 23. http://ims.ifmo.ru/ru/pages/2/programma.htm |
Conference | Internet and Modern Society |
---|---|
Abbreviated title | IMS 2020 |
Country/Territory | Russian Federation |
City | Saint Petersburg |
Period | 17/06/20 → 20/06/20 |
Internet address |
ID: 74160978