Summarization Algorithms for News: A Study of the Coronavirus Theme and Its Impact on the News Extracting Algorithm

Lyudmila Gadasina, Vladislav Veklenko, Pasi Luukka

Research output: Publications in books, reports, collections, conference proceedings; article in conference proceedings; research; peer-reviewed

Abstract

Extractive summarization algorithms help identify significant information in news by extracting meaningful sentences from the original text. The information background at the time of a news release often significantly affects its content, and this background can distort the results of a text summarization algorithm. The study was conducted using the theme "coronavirus" (COVID-19) as an example, which at the time of the study was one of the main topics in news feeds. Experiments were carried out on sports news articles about football. This news area was selected because it is not related to medical topics. The TextRank algorithm was applied to the sports news in two ways. First, the key information was extracted from the source text of the news. Then, a list of COVID-related words was created, and the key information was extracted from the news without considering words from this list. The results showed that mentioning a popular theme such as COVID that is not related to sports can have a negative impact on the text summarization algorithm. We suggest that, to obtain accurate results from the algorithm, it is necessary to first compile a dictionary of terms related to the coronavirus theme and then exclude them when identifying the main content of news texts.
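The two-pass procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simplified TextRank with word-overlap sentence similarity and a fixed number of power-iteration steps, and the names `COVID_TERMS`, `tokenize`, `similarity`, and `textrank`, as well as the contents of the term list, are hypothetical and chosen only for illustration.

```python
import math
import re

# Hypothetical term list for illustration; the paper's actual dictionary is not given here.
COVID_TERMS = frozenset(
    {"coronavirus", "covid", "covid-19", "pandemic", "quarantine", "lockdown"}
)


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence, excluded=frozenset()):
    """Lowercase word tokens (hyphenated forms kept whole), minus excluded terms."""
    words = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", sentence.lower())
    return [w for w in words if w not in excluded]


def similarity(a, b):
    """Shared-word count normalized by the log lengths of both sentences."""
    if len(a) < 2 or len(b) < 2:
        return 0.0  # avoids log(1) = 0 in the denominator
    return len(set(a) & set(b)) / (math.log(len(a)) + math.log(len(b)))


def textrank(text, top_n=2, excluded=frozenset(), damping=0.85, iterations=50):
    """Return the top_n highest-ranked sentences in their original order."""
    sentences = split_sentences(text)
    tokens = [tokenize(s, excluded) for s in sentences]
    n = len(sentences)
    # Weighted, undirected sentence graph stored as an adjacency matrix.
    weights = [
        [similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping)
            + damping
            * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    best = sorted(range(n), key=lambda i: scores[i], reverse=True)[:top_n]
    return [sentences[i] for i in sorted(best)]
```

Calling `textrank(article)` and `textrank(article, excluded=COVID_TERMS)` on the same text and comparing the selected sentences mirrors, under these assumptions, the comparison the study describes. Excluded words no longer contribute to sentence similarity, so sentences linked mainly through COVID vocabulary lose rank.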

Original language: English
Title of host publication: Computational Data and Social Networks - 10th International Conference, CSoNet 2021, Proceedings
Editors: David Mohaisen, Ruoming Jin
Publisher: Springer Nature
Pages: 351-360
Number of pages: 10
ISBN (print): 9783030914332
DOI
Publication status: Published - 2021
Event: 10th International Conference on Computational Data and Social Networks, CSoNet 2021 - Virtual Online
Duration: 15 Nov 2021 to 17 Nov 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13116 LNCS
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: 10th International Conference on Computational Data and Social Networks, CSoNet 2021
City: Virtual Online
Period: 15/11/21 to 17/11/21

Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
