This paper presents a method for topic labelling based on Explicit Semantic Analysis (ESA). The top words of a topic are given to ESA as input, and the algorithm returns the titles of the Wikipedia articles it considers most relevant to that input. An alternative approach, which serves as a strong baseline, uses the titles of the top search-engine results returned for a query made up of the topic words. In both methods, the titles obtained are then analysed automatically: phrases characterizing the topic are constructed from them with a graph algorithm and assigned weights. In the proposed ESA-based method, a post-processing step then sorts the candidate labels according to empirically formulated rules. Experiments were conducted on a corpus of Russian encyclopaedic texts on linguistics. The results justify applying ESA to this task: although it performs slightly worse than the search-engine method in terms of label quality, it is a reasonable alternative because it has two advantages that the baseline lacks.
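
The ESA step described in the abstract (topic words in, ranked Wikipedia titles out) can be illustrated with a minimal sketch. This is not the authors' implementation: the three toy "articles", the function names `build_index` and `esa_rank`, and the smoothed TF-IDF weighting are all assumptions made for illustration, and the graph-based phrase construction and rule-based post-processing are not shown.

```python
import math
from collections import Counter

# Hypothetical stand-in for a Wikipedia dump: article title -> article text.
ARTICLES = {
    "Phonology": "phoneme sound system syllable stress phoneme vowel",
    "Syntax": "sentence clause phrase word order grammar syntax",
    "Morphology": "morpheme affix word stem inflection morpheme",
}


def build_index(articles):
    """Build an inverted index {word: {title: tf-idf weight}}.

    Each article title acts as one ESA "concept". The smoothed idf
    log(1 + N / df) is an assumption chosen to keep weights positive.
    """
    n_articles = len(articles)
    term_counts = {title: Counter(text.split()) for title, text in articles.items()}
    doc_freq = Counter(word for counts in term_counts.values() for word in counts)
    index = {}
    for title, counts in term_counts.items():
        for word, count in counts.items():
            idf = math.log(1 + n_articles / doc_freq[word])
            index.setdefault(word, {})[title] = count * idf
    return index


def esa_rank(topic_words, index, top_k=2):
    """Sum the concept vectors of the topic words and return the top titles."""
    scores = Counter()
    for word in topic_words:
        for title, weight in index.get(word, {}).items():
            scores[title] += weight
    return [title for title, _ in scores.most_common(top_k)]


INDEX = build_index(ARTICLES)
```

Calling `esa_rank(["morpheme", "word"], INDEX)` ranks "Morphology" ahead of "Syntax", because "morpheme" occurs only in the first toy article while "word" is shared by both. A real setup would index a full Wikipedia dump and typically normalize the concept vectors.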

Original language: English
Title of host publication: Artificial Intelligence and Natural Language - 7th International Conference, AINL 2018, Proceedings
Editors: Lidia Pivovarova, Andrey Filchenkov, Jan Zizka, Dmitry Ustalov
Publisher: Springer Nature
Pages: 110-116
Number of pages: 7
ISBN (Print): 9783030012038
State: Published - 2018
Event: 7th International Conference Artificial Intelligence and Natural Language, AINL 2018 - St. Petersburg, Russian Federation
Duration: 17 Oct 2018 → 19 Oct 2018

Publication series

Name: Communications in Computer and Information Science
Volume: 930
ISSN (Print): 1865-0929

Conference

Conference: 7th International Conference Artificial Intelligence and Natural Language, AINL 2018
Country/Territory: Russian Federation
City: St. Petersburg
Period: 17/10/18 → 19/10/18

    Research areas

  • Explicit Semantic Analysis, Russian, Topic labels, Topic modelling

    Scopus subject areas

  • Computer Science(all)
  • Mathematics(all)

ID: 37684204