Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review
Explicit semantic analysis as a means for topic labelling. / Kriukova, Anna; Erofeeva, Aliia; Mitrofanova, Olga; Sukharev, Kirill.
Artificial Intelligence and Natural Language - 7th International Conference, AINL 2018, Proceedings. ed. / Lidia Pivovarova; Andrey Filchenkov; Jan Zizka; Dmitry Ustalov. Springer Nature, 2018. p. 110-116 (Communications in Computer and Information Science; Vol. 930).
TY - GEN
T1 - Explicit semantic analysis as a means for topic labelling
AU - Kriukova, Anna
AU - Erofeeva, Aliia
AU - Mitrofanova, Olga
AU - Sukharev, Kirill
N1 - Publisher Copyright: © Springer Nature Switzerland AG 2018. Copyright: Copyright 2018 Elsevier B.V., All rights reserved.
PY - 2018
Y1 - 2018
N2 - This paper presents a method for topic labelling based on Explicit Semantic Analysis (ESA). The top words of a topic are given to ESA as input, and the algorithm returns the titles of the Wikipedia articles it considers most relevant to that input. An alternative approach, which serves as a strong baseline, uses the titles of the top results returned by a search engine when the topic words are submitted as a query. In both methods, the obtained titles are then analysed automatically: a graph algorithm constructs phrases characterizing the topic from them and assigns weights to these phrases. In the proposed ESA-based method, a post-processing step then sorts the candidate labels according to empirically formulated rules. Experiments were conducted on a corpus of Russian encyclopaedic texts on linguistics. The results justify applying ESA to this task: although it performs slightly worse than the search-engine-based method in terms of label quality, it can be used as a reasonable alternative because it has two advantages that the baseline method lacks.
KW - Explicit Semantic Analysis
KW - Russian
KW - Topic labels
KW - Topic modelling
UR - http://www.scopus.com/inward/record.url?scp=85054881717&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-01204-5_11
DO - 10.1007/978-3-030-01204-5_11
M3 - Conference contribution
AN - SCOPUS:85054881717
SN - 9783030012038
T3 - Communications in Computer and Information Science
SP - 110
EP - 116
BT - Artificial Intelligence and Natural Language - 7th International Conference, AINL 2018, Proceedings
A2 - Pivovarova, Lidia
A2 - Filchenkov, Andrey
A2 - Zizka, Jan
A2 - Ustalov, Dmitry
PB - Springer Nature
T2 - 7th International Conference Artificial Intelligence and Natural Language, AINL 2018
Y2 - 17 October 2018 through 19 October 2018
ER -
ID: 37684204