Research output: Publications in books, reports, collections, conference proceedings › Article in conference proceedings › Peer-reviewed
This article deals with the principles of automatic label assignment for e-hypertext markup. We identified 40 topics characteristic of hypertext media and then applied an ensemble of two graph-based methods that draw on external sources to generate candidate labels: extraction of candidate labels from the Yandex search engine (Labels-Yandex) and extraction of candidate labels from Wikipedia through operations on word vector representations in Explicit Semantic Analysis (ESA). Each algorithm outputs a triplet of labels for every topic. We evaluated these results in two steps. First, two experts rated each triplet's relevance to its topic on a three-point scale (non-conformity to the topic / partial compliance with the topic / full compliance with the topic). Second, 10 assessors evaluated the individual labels, marking each with a weight of "0" (the label does not match the topic) or "1" (the label matches the topic). Our experiments show that in most cases the Labels-Yandex algorithm predicts correct labels but frequently assigns a topic a label that is relevant to the current moment rather than to the set of keywords, whereas Labels-ESA produces labels with generalized content. Thus, a combination of these methods will make it possible to mark up e-hypertext topics and to create a semantic network theory of e-hypertext.
Original language | English |
---|---|
Title of host publication | Recent Trends in Analysis of Images, Social Networks and Texts - 9th International Conference, AIST 2020, Revised Supplementary Proceedings |
Editors | Wil M. van der Aalst, Vladimir Batagelj, Alexey Buzmakov, Dmitry I. Ignatov, Anna Kalenkova, Michael Khachay, Olessia Koltsova, Andrey Kutuzov, Sergei O. Kuznetsov, Irina A. Lomazova, Natalia Loukachevitch, Ilya Makarov, Amedeo Napoli, Alexander Panchenko, Panos M. Pardalos, Marcello Pelillo, Andrey V. Savchenko, Elena Tutubalina |
Publisher | Springer Nature |
Pages | 102-114 |
Number of pages | 13 |
ISBN (print) | 9783030712136 |
DOI | |
Publication status | Published - 2021 |
Event | 9th International Conference on Analysis of Images, Social Networks, and Texts, AIST 2020 - Virtual, Online. Duration: 15 Oct 2020 → 16 Oct 2020 |
Name | Communications in Computer and Information Science |
---|---|
Volume | 1357 CCIS |
ISSN (print) | 1865-0929 |
ISSN (electronic) | 1865-0937 |
Conference | 9th International Conference on Analysis of Images, Social Networks, and Texts, AIST 2020 |
---|---|
City | Virtual, Online |
Period | 15/10/20 → 16/10/20 |
ID: 85926806