Using Extended Stopwords Lists to Improve the Quality of Academic Abstracts Clustering

Research output: Contribution to journal › Article › peer-review

Links

https://link.springer.com/chapter/10.1007/978-3-319-55961-2_24

Svetlana Popova
Vera Danilova

Knowledge extraction from scientific documents plays an important role in the development of academic databases and services. We focus on the processing of abstracts of academic papers for the purposes of research data structuring, which includes various subtasks such as key phrase extraction and clustering. The use of abstracts is beneficial because authors follow the formal and stylistic requirements imposed by publishers, and, therefore, informational and language patterns can be revealed. In our view, the existence of these patterns makes it possible to apply techniques developed for one abstract-processing task to another. The aim of the paper is to demonstrate this cross-task applicability.

Original language	English
Journal	Lecture Notes in Computer Science
Volume	10034
State	Published - 2017
Externally published	Yes

Research areas

Clustering, Stopwords, Document representation, Extended stopwords list construction, Natural Language Processing

ID: 7746938