Extractive summarization algorithms help identify significant information in news by extracting meaningful sentences from the original text. The information background at the time of a news release often significantly affects its content, and this background can distort the results of a text summarization algorithm. The study uses the theme “coronavirus” (COVID-19) as an example, since at the time of the study it was one of the main topics in news feeds. Experiments were carried out on sports news articles about football. This news area was selected because it is not related to medical topics. The TextRank algorithm was applied to sports news extraction in two ways. First, the key information was extracted from the source text of the news. Then, a list of COVID-related words was created, and the key information was extracted from the news without considering words from this list. Our approach showed that mentions of a popular theme such as COVID, which is unrelated to sports, can have a negative impact on the text summarization algorithm. We suggest that, to obtain accurate results from the algorithm, it is necessary to first compile a dictionary of terms related to the coronavirus theme and then exclude them when identifying the main content of news texts.
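
The two-pass procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the sentence similarity measure, the preprocessing, or the actual COVID dictionary. The sketch assumes the word-overlap similarity commonly used with TextRank and a plain power iteration, and the names `COVID_TERMS`, `tokenize`, `similarity`, `textrank`, and `summarize`, as well as the sample sentences, are hypothetical and chosen only for illustration.

```python
import math
import re

# Hypothetical, illustrative term list; the study's actual dictionary
# is not given in the abstract.
COVID_TERMS = frozenset(
    {"covid", "covid-19", "coronavirus", "pandemic", "quarantine", "lockdown"}
)


def tokenize(sentence, excluded=frozenset()):
    """Lowercase a sentence, split it into words, and drop excluded terms."""
    words = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", sentence.lower())
    return [w for w in words if w not in excluded]


def similarity(a, b):
    """Word overlap normalized by log sentence lengths (assumed measure)."""
    if not a or not b:
        return 0.0
    denom = math.log(len(a)) + math.log(len(b))
    if denom == 0:  # both sentences contain a single word
        return 0.0
    return len(set(a) & set(b)) / denom


def textrank(sentences, excluded=frozenset(), damping=0.85, iterations=50):
    """Score sentences with a weighted PageRank-style power iteration."""
    tokens = [tokenize(s, excluded) for s in sentences]
    n = len(sentences)
    weights = [
        [similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping)
            + damping
            * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(sentences, k=2, excluded=frozenset()):
    """Return the k highest-scoring sentences in their original order."""
    scores = textrank(sentences, excluded)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


# Invented sample sentences, not taken from the study's data.
news = [
    "The club won the football match after a late goal.",
    "The coronavirus pandemic forced the club to play without fans.",
    "The striker scored the late goal in the final minute.",
    "Coronavirus testing delayed the match by one day during the pandemic.",
]

plain = summarize(news, k=2)                           # first pass: source text
filtered = summarize(news, k=2, excluded=COVID_TERMS)  # second pass: COVID words ignored
```

Comparing `plain` with `filtered` shows whether the excluded terms changed which sentences are selected. On an input this small the two summaries may coincide, and the sketch performs no stop-word removal or stemming, which a real pipeline would likely add.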

Original language: English
Title of host publication: Computational Data and Social Networks - 10th International Conference, CSoNet 2021, Proceedings
Editors: David Mohaisen, Ruoming Jin
Publisher: Springer Nature
Pages: 351-360
Number of pages: 10
ISBN (Print): 9783030914332
State: Published - 2021
Event: 10th International Conference on Computational Data and Social Networks, CSoNet 2021 - Virtual Online
Duration: 15 Nov 2021 - 17 Nov 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13116 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 10th International Conference on Computational Data and Social Networks, CSoNet 2021
City: Virtual Online
Period: 15/11/21 - 17/11/21

Research areas

  • Coronavirus, Extracting, News, Summarization algorithm, Text

Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)

ID: 91076367