Standard

Extended list of stop words: Does it work for keyphrase extraction from short texts? / Popova, Svetlana; Skitalinskaya, Gabriella.

Proceedings of the 12th International Scientific and Technical Conference on Computer Sciences and Information Technologies, CSIT 2017. Vol. 1. Institute of Electrical and Electronics Engineers Inc., 2017. pp. 401-404, 8098815.

Research output: Publications in books, reports, collections, conference proceedings › Article in conference proceedings › Research › Peer-reviewed

Harvard

Popova, S & Skitalinskaya, G 2017, Extended list of stop words: Does it work for keyphrase extraction from short texts? in Proceedings of the 12th International Scientific and Technical Conference on Computer Sciences and Information Technologies, CSIT 2017. vol. 1, 8098815, Institute of Electrical and Electronics Engineers Inc., pp. 401-404, 12th International Scientific and Technical Conference on Computer Sciences and Information Technologies, CSIT 2017, Lviv, Ukraine, 5/09/17. https://doi.org/10.1109/STC-CSIT.2017.8098815

APA

Popova, S., & Skitalinskaya, G. (2017). Extended list of stop words: Does it work for keyphrase extraction from short texts? In Proceedings of the 12th International Scientific and Technical Conference on Computer Sciences and Information Technologies, CSIT 2017 (Vol. 1, pp. 401-404). [8098815] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/STC-CSIT.2017.8098815

Vancouver

Popova S, Skitalinskaya G. Extended list of stop words: Does it work for keyphrase extraction from short texts? In Proceedings of the 12th International Scientific and Technical Conference on Computer Sciences and Information Technologies, CSIT 2017. Vol. 1. Institute of Electrical and Electronics Engineers Inc. 2017. p. 401-404. 8098815 https://doi.org/10.1109/STC-CSIT.2017.8098815

Author

Popova, Svetlana; Skitalinskaya, Gabriella. / Extended list of stop words: Does it work for keyphrase extraction from short texts? Proceedings of the 12th International Scientific and Technical Conference on Computer Sciences and Information Technologies, CSIT 2017. Vol. 1. Institute of Electrical and Electronics Engineers Inc., 2017. pp. 401-404.

BibTeX

@inproceedings{3bb1d79c75254de995ca3d0f852f5a8a,
title = "Extended list of stop words: Does it work for keyphrase extraction from short texts?",
abstract = "In this paper we study the problem of key phrase extraction from short texts written in Russian. As texts we consider messages posted on Internet car forums related to the purchase or repair of cars. The main assumption made is: the construction of lists of stop words for key phrase extraction can be effective if performed on the basis of a small, expert-marked collection. The results show that even a small number of texts marked by an expert can be enough to build an extended list of stop words. Extracted stop words allow to improve the quality of the key phrase extraction algorithm. Prior, we used a similar approach for key phrase extraction from scientific abstracts in the English language. In this paper we work with Russian texts. The obtained results show that the proposed approach works not only for texts that are appropriate in terms of structure and literacy, such as abstracts, but also for short texts, such as forum messages, in which many words may be misspelled and the text itself is poorly structured. Moreover, the results show that proposed approach works well not only with English texts, but also with texts in the Russian language.",
keywords = "information retrieval, keyphrase extraction, short texts, stop words",
author = "Svetlana Popova and Gabriella Skitalinskaya",
year = "2017",
month = nov,
day = "6",
doi = "10.1109/STC-CSIT.2017.8098815",
language = "English",
volume = "1",
pages = "401--404",
booktitle = "Proceedings of the 12th International Scientific and Technical Conference on Computer Sciences and Information Technologies, CSIT 2017",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",
note = "12th International Scientific and Technical Conference on Computer Sciences and Information Technologies, CSIT 2017 ; Conference date: 05-09-2017 Through 08-09-2017",

}

RIS

TY - GEN

T1 - Extended list of stop words: Does it work for keyphrase extraction from short texts?

T2 - 12th International Scientific and Technical Conference on Computer Sciences and Information Technologies, CSIT 2017

AU - Popova, Svetlana

AU - Skitalinskaya, Gabriella

PY - 2017/11/6

Y1 - 2017/11/6

AB - In this paper we study the problem of key phrase extraction from short texts written in Russian. As texts we consider messages posted on Internet car forums related to the purchase or repair of cars. The main assumption made is: the construction of lists of stop words for key phrase extraction can be effective if performed on the basis of a small, expert-marked collection. The results show that even a small number of texts marked by an expert can be enough to build an extended list of stop words. Extracted stop words allow to improve the quality of the key phrase extraction algorithm. Prior, we used a similar approach for key phrase extraction from scientific abstracts in the English language. In this paper we work with Russian texts. The obtained results show that the proposed approach works not only for texts that are appropriate in terms of structure and literacy, such as abstracts, but also for short texts, such as forum messages, in which many words may be misspelled and the text itself is poorly structured. Moreover, the results show that proposed approach works well not only with English texts, but also with texts in the Russian language.

KW - information retrieval

KW - keyphrase extraction

KW - short texts

KW - stop words

UR - http://www.scopus.com/inward/record.url?scp=85040797829&partnerID=8YFLogxK

U2 - 10.1109/STC-CSIT.2017.8098815

DO - 10.1109/STC-CSIT.2017.8098815

M3 - Conference contribution

AN - SCOPUS:85040797829

VL - 1

SP - 401

EP - 404

BT - Proceedings of the 12th International Scientific and Technical Conference on Computer Sciences and Information Technologies, CSIT 2017

PB - Institute of Electrical and Electronics Engineers Inc.

Y2 - 5 September 2017 through 8 September 2017

ER -
