This article presents a system that automatically processes user comments in order to annotate speech and discourse formulas that actively function in everyday interaction, including digital communication. A Python-based program using the Telegram API was developed to automate the collection, filtering, and annotation of empirical data. In addition to building a user corpus, the study evaluated the results of automatic processing. The source material was drawn from the Telegram news channel Fontanka SPB Online. Automatic processing extracted 70 speech and discourse formulas, which were grouped according to their source lexicons. The classification of the examined multiword units was grounded in the findings of two research projects: the construction of the Pragmaticon in Moscow and the annotation of stable multiword units in Saint Petersburg. Automatic annotation made it possible to identify formulas with a high pragmatic load and to capture their specific functions in internet communication. For example, semantic irony was observed in the use of formulas such as ‘khorosho’ (‘fine’) and ‘bez problem’ (‘no problem’), which traditionally indicate agreement. The study identified the most frequent types of user responses reflected by the formulas: affirmation and negation. The results demonstrate the potential of the automatic approach for describing speech and discourse formulas in digital discourse and highlight the need to refine existing classifications of speech acts.
Original language: Russian
Title of host publication: Speech and Computer. SPECOM 2025
Place of publication: Szeged, Hungary
Publisher: Springer Nature
Pages: 278-292
Number of pages: 15
ISBN (print): 9783032079558
Publication status: Published - 2026
Event: 27th International Conference on Speech and Computer - Szeged, Hungary
Duration: 13 Oct 2025 - 14 Oct 2025
Conference number: 27
https://specom.inf.u-szeged.hu/

Publication series

Name: Lecture Notes in Computer Science
Number: 16187

Conference

Conference: 27th International Conference on Speech and Computer
Abbreviated title: Specom 2025
Country/Territory: Hungary
City: Szeged
Period: 13/10/25 - 14/10/25
Other: 27th International Conference on Speech and Computer (SPECOM 2025)

    Research areas

  • Automatic Annotation, Statistical Analysis, Modern Russian, Corpus Linguistics, Discourse Formulas, Internet Discourse, Internet Comment, Speech Formulas

ID: 144722668