As part of our project ParaPhraser on the identification and classification of Russian paraphrases, we have collected a corpus of more than 8000 sentence pairs annotated as precise, loose or non-paraphrases. The corpus is annotated via crowdsourcing by naive native Russian speakers, but from an expert's point of view, our complex paraphrase detection model can be more successful at predicting the paraphrase class than a naive native speaker. Our paraphrase corpus is collected from news headlines and can therefore be considered a summarized news stream describing the most important events. By building a graph of paraphrases, we can detect such events. In this paper we construct two such graphs: one based on the current human annotation and one based on the complex model's predictions. The structure of the graphs is compared and analyzed, and it is shown that the model graph has larger connected components, which give a more complete picture of the important events than the human annotation graph. The predictive model appears to be better than human annotators at capturing full information about the important events from the news collection.
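The graph construction mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `paraphrase_components`, the label strings, and the toy headlines are assumptions made for illustration. Headlines are treated as nodes, pairs labeled as precise or loose paraphrases as edges, and connected components are found with a simple union-find.

```python
from collections import defaultdict


def paraphrase_components(pairs, keep_labels=("precise", "loose")):
    """Group headlines into connected components of a paraphrase graph.

    `pairs` is an iterable of (headline_a, headline_b, label) tuples.
    The label names are hypothetical; only pairs whose label is in
    `keep_labels` contribute an edge.
    """
    parent = {}

    def find(x):
        # Register unseen nodes as their own root, then walk to the root
        # with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for a, b, label in pairs:
        # Every headline becomes a node, even if it has no paraphrase edge.
        find(a)
        find(b)
        if label in keep_labels:
            union(a, b)

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)

    # Largest components first: these are the candidate "events".
    return sorted(groups.values(), key=len, reverse=True)


# Toy input (invented headlines, not taken from the corpus).
toy_pairs = [
    ("headline A", "headline B", "precise"),
    ("headline B", "headline C", "loose"),
    ("headline D", "headline E", "non"),
]
components = paraphrase_components(toy_pairs)
```

Running the same function on two label sets, one from human annotation and one from model predictions, would give two sets of components whose sizes can then be compared.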

Original language: English
Title of host publication: Advances in Soft Computing - 15th Mexican International Conference on Artificial Intelligence, MICAI 2016, Proceedings
Publisher: Springer Nature
Pages: 41-52
Number of pages: 12
Volume: 10061 LNAI
ISBN (Print): 9783319624334
Status: Published - 2017
Published externally: Yes
Event: 15th Mexican International Conference on Artificial Intelligence, MICAI 2016 - Cancun, Mexico
Duration: 22 Oct 2016 to 27 Oct 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10061 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 15th Mexican International Conference on Artificial Intelligence, MICAI 2016
Country/Territory: Mexico
City: Cancun
Period: 22/10/16 to 27/10/16

Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

ID: 7633637