Explicit semantic analysis as a means for topic labelling

DOI

https://doi.org/10.1007/978-3-030-01204-5_11
Final published version

Anna Kriukova
Aliia Erofeeva
Olga Mitrofanova
Kirill Sukharev

This paper presents a method for topic labelling based on Explicit Semantic Analysis (ESA). The top words of a topic are given to ESA as input, and the algorithm returns the titles of the Wikipedia articles it considers most relevant to that input. An alternative approach, which serves as a strong baseline, uses the titles of the top search engine results obtained with the topic words as a query. In both methods, the titles are then analysed automatically: a graph algorithm constructs phrases that characterize the topic and assigns weights to them. In the proposed ESA-based method, a post-processing step then sorts the candidate labels according to empirically formulated rules. Experiments were conducted on a corpus of Russian encyclopaedic texts on linguistics. The results justify applying ESA to this task: although it performs slightly worse than the search-engine-based method in terms of label quality, it is a reasonable alternative because it has two advantages that the baseline lacks.
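
The first step described in the abstract, mapping a topic's top words to relevant article titles, might be sketched roughly as follows. This is a minimal illustration of the general ESA idea (each term is represented by its TF-IDF weights across concept articles, and a text is scored by summing the vectors of its words), not the authors' implementation. The toy `CONCEPTS` collection and the names `build_index` and `esa_titles` are hypothetical, and real ESA setups also normalize and prune the concept vectors, which is omitted here.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy stand-in for Wikipedia: title -> article text (illustration only).
CONCEPTS = {
    "Phonetics": "phonetics studies speech sounds vowel consonant articulation",
    "Syntax": "syntax studies sentence structure phrase clause word order",
    "Morphology": "morphology studies word structure morpheme affix stem",
}


def build_index(concepts):
    """Map each term to {title: tf-idf weight}, i.e. the term's concept vector."""
    term_counts = {title: Counter(text.split()) for title, text in concepts.items()}
    doc_freq = Counter(term for counts in term_counts.values() for term in counts)
    n_docs = len(concepts)
    index = defaultdict(dict)
    for title, counts in term_counts.items():
        for term, tf in counts.items():
            idf = math.log(n_docs / doc_freq[term])
            if idf > 0:  # a term found in every article carries no signal
                index[term][title] = tf * idf
    return index


def esa_titles(topic_words, index, top_n=2):
    """Sum the concept vectors of the topic words and return the best-scoring titles."""
    scores = defaultdict(float)
    for word in topic_words:
        for title, weight in index.get(word, {}).items():
            scores[title] += weight
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, _ in ranked[:top_n]]


INDEX = build_index(CONCEPTS)
```

Calling `esa_titles(["morpheme", "affix", "word"], INDEX)` ranks the toy "Morphology" article above "Syntax", because two of the three query words occur only in the former. The later steps in the abstract (building weighted phrases with a graph algorithm and rule-based sorting of candidate labels) are not covered by this sketch.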

Original language	English
Title of host publication	Artificial Intelligence and Natural Language - 7th International Conference, AINL 2018, Proceedings
Editors	Lidia Pivovarova, Andrey Filchenkov, Jan Zizka, Dmitry Ustalov
Publisher	Springer Nature
Pages	110-116
Number of pages	7
ISBN (print)	9783030012038
DOI	https://doi.org/10.1007/978-3-030-01204-5_11
Status	Published - 2018
Event	7th International Conference Artificial Intelligence and Natural Language, AINL 2018 - St. Petersburg, Russian Federation. Duration: 17 Oct 2018 → 19 Oct 2018

Publication series

Name	Communications in Computer and Information Science
Volume	930
ISSN (print)	1865-0929

Conference

Conference	7th International Conference Artificial Intelligence and Natural Language, AINL 2018
Country/Territory	Russian Federation
City	St. Petersburg
Period	17/10/18 → 19/10/18

Scopus subject areas

Computer Science (all)
Mathematics (all)

ID: 37684204