© 2016 FRUCT. This work describes the experience of creating a coreference resolution system for the Russian language. Coreference resolution is a key subtask of Information Extraction and aims to group mentions that refer to the same discourse entity. This work applied a clustering algorithm to Russian-language newswire texts. We narrowed the task to clustering Person proper names. Our approach comprised two steps: mention extraction and clustering. Mention extraction was performed with manually created grammars for Tomita-parser. For mention grouping, we used entity-level agglomerative clustering with weighted feature vectors. We ran our experiments on newswire texts annotated for the factRuEval-2016 competition, organized by Dialogue Evaluation. We compared our results with those of the competitors. As a baseline, we used the built-in Tomita-parser algorithms for name extraction and name clustering. We obtained results comparable to those of the competitors and outperformed the baseline.
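The clustering step named in the abstract can be illustrated with a minimal sketch of entity-level agglomerative clustering over person-name mentions. Everything specific here is an assumption made for illustration, not the authors' implementation: the name fields, the `WEIGHTS` and `THRESHOLD` values, and the helper names `field_score`, `similarity`, `merge` and `cluster` are all hypothetical. Mention extraction (done with Tomita-parser grammars in the paper) is not shown; the sketch starts from already-extracted, normalized name parts.

```python
from itertools import combinations

# Hypothetical feature weights and stopping threshold; the abstract
# does not report the values actually used.
WEIGHTS = {"surname": 0.6, "first": 0.3, "patronymic": 0.1}
THRESHOLD = 0.5


def field_score(a, b):
    """Compare two sets of values collected for one name field.

    1.0 for a shared value, 0.5 when an initial matches a full form,
    0.0 when either side is unknown, -1.0 when the values conflict.
    """
    if not a or not b:
        return 0.0
    if a & b:
        return 1.0
    if any(x[0] == y[0] and 1 in (len(x), len(y)) for x in a for y in b):
        return 0.5
    return -1.0


def similarity(e1, e2):
    """Weighted sum of per-field scores between two entities."""
    return sum(w * field_score(e1[f], e2[f]) for f, w in WEIGHTS.items())


def merge(e1, e2):
    """Union the field values and mention lists of two entities."""
    merged = {f: e1[f] | e2[f] for f in WEIGHTS}
    merged["mentions"] = e1["mentions"] + e2["mentions"]
    return merged


def cluster(mentions):
    """Greedily merge the most similar entity pair until none reaches THRESHOLD."""
    entities = [
        {**{f: {m[f]} if m.get(f) else set() for f in WEIGHTS}, "mentions": [i]}
        for i, m in enumerate(mentions)
    ]
    while len(entities) > 1:
        (i, j), best = max(
            (((i, j), similarity(entities[i], entities[j]))
             for i, j in combinations(range(len(entities)), 2)),
            key=lambda pair: pair[1],
        )
        if best < THRESHOLD:
            break
        entities[i] = merge(entities[i], entities[j])
        del entities[j]
    return [sorted(e["mentions"]) for e in entities]


mentions = [
    {"first": "vladimir", "surname": "putin"},
    {"surname": "putin"},
    {"first": "v", "surname": "putin"},
    {"first": "dmitry", "surname": "medvedev"},
    {"surname": "medvedev"},
]
print(cluster(mentions))  # [[0, 1, 2], [3, 4]]
```

Comparing merged entities rather than individual mentions means that evidence accumulated in a cluster (for example, both an initial and a full first name) is used in every later merge decision, which is one reading of the "entity level" clustering mentioned above.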
Original language: English
Title of host publication: Proceedings of the International FRUCT Conference on Intelligence, Social Media and Web, ISMW FRUCT 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 9-16
ISBN (Print): 978-952-68397-6-9
DOI
Publication status: Published - 2016
Event: 2016 International FRUCT Conference on Intelligence, Social Media and Web - Saint-Petersburg, Russian Federation
Duration: 28 Aug 2016 to 4 Sep 2016

Conference

Conference: 2016 International FRUCT Conference on Intelligence, Social Media and Web
Abbreviated title: ISMW FRUCT 2016
Country/Territory: Russian Federation
City: Saint-Petersburg
Period: 28/08/16 to 4/09/16

ID: 7966426