Keyword extraction from single Russian document

Mikhail Vadimovich Sandul
Elena Georgievna Mikhailova

The problem of automatically extracting keywords and key phrases from a text arises in various information retrieval and text mining tasks. The goal is to identify the terms that best describe the subject of a document. A great deal of research currently addresses this problem, but most algorithms are developed for English texts, and their applicability to Russian texts has not been sufficiently investigated. One of the best-known keyword extraction algorithms is RAKE. This article examines the effectiveness of the RAKE algorithm for texts in Russian. It also applies a hybrid method that uses the Γ-index metric to weight the phrases obtained with RAKE. The article shows that this hybrid method is more accurate than RAKE while reducing the number of selected phrases.
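
For orientation, RAKE as it is commonly described can be sketched in a few lines: candidate phrases are the runs of words between stopwords and punctuation, each word is scored as degree divided by frequency, and a phrase scores the sum of its word scores. The sketch below is an illustration under those assumptions, not the authors' implementation. The stopword list and the function names `candidate_phrases` and `rake_scores` are hypothetical, and the Γ-index re-weighting step is omitted because the abstract does not define it.

```python
import re
from collections import defaultdict

# Tiny illustrative Russian stopword list (an assumption, far from complete).
STOPWORDS = {"и", "в", "на", "с", "для", "по", "из", "это", "а", "но",
             "к", "о", "не", "что", "как"}


def candidate_phrases(text, stopwords=STOPWORDS):
    """Split text into candidate phrases at punctuation and stopwords."""
    phrases = []
    # Punctuation (anything that is not a word character, whitespace or hyphen)
    # always ends a candidate phrase.
    for fragment in re.split(r"[^\w\s-]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in stopwords:
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def rake_scores(text, stopwords=STOPWORDS):
    """Return (phrase, score) pairs sorted by descending score."""
    phrases = candidate_phrases(text, stopwords)
    freq = defaultdict(int)    # number of occurrences of each word
    degree = defaultdict(int)  # co-occurrences within phrases, incl. the word itself
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)
    word_score = {w: degree[w] / freq[w] for w in freq}
    scores = {" ".join(p): sum(word_score[w] for w in p) for p in phrases}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    for phrase, score in rake_scores("Ключевые слова и фразы"):
        print(phrase, score)
```

Because Russian is highly inflected, a practical version would likely need lemmatization and a much fuller stopword list before scoring; both are left out here to keep the sketch short.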

Original language	English
Pages (from-to)	30-36
Number of pages	7
Journal	CEUR Workshop Proceedings
Volume	2135
Publication status	Published - 1 Jan 2018
Event	3rd Conference on Software Engineering and Information Management, SEIM 2018 - Saint Petersburg, Russian Federation Duration: 14 Apr 2018 → …

Scopus subject areas

Computer Science (all)

ID: 38400996