This paper describes the results of the First Russian Paraphrase Detection Shared Task, held in St. Petersburg, Russia, in October 2016. Research on paraphrase extraction, detection, and generation has been developing successfully for a long time, whereas the Russian computational linguistics community has shown a surge of interest in the problem only recently. We try to close this gap by introducing ParaPhraser.ru, a project dedicated to collecting a Russian paraphrase corpus, and by organizing a Paraphrase Detection Shared Task that uses the corpus as training data. The participants applied a wide variety of techniques to paraphrase detection, from rule-based approaches to deep learning. The results reflect the following tendencies: the best scores are obtained by traditional classifiers combined with fine-grained linguistic features; however, complex neural networks, shallow methods, and purely technical methods also demonstrate competitive results.
Translated title of the contribution: ParaPhraser: Русский корпус парафраз и дорожка по распознаванию парафраз (ParaPhraser: Russian paraphrase corpus and shared task on paraphrase detection)
Original language: English
Title of host publication: Artificial Intelligence and Natural Language - 6th Conference, AINL 2017, Revised Selected Papers
Publisher: Springer Nature
Pages: 211-225
Number of pages: 15
Volume: 789
Edition: CCIS Springer, Cham
ISBN (Electronic): 978-3-319-71746-3
ISBN (Print): 978-3-319-71745-6
DOIs
State: Published - 2018
Event: 6th Conference on Artificial Intelligence and Natural Language, AINL 2017 - St. Petersburg, Russian Federation
Duration: 19 Sep 2017 - 22 Sep 2017

Publication series

Name: Communications in Computer and Information Science
Volume: 789
ISSN (Print): 1865-0929

Conference

Conference: 6th Conference on Artificial Intelligence and Natural Language, AINL 2017
Country/Territory: Russian Federation
City: St. Petersburg
Period: 19/09/17 - 22/09/17

    Research areas

  • Paraphrase corpus, Paraphrase detection, Russian paraphrase, Shared task

    Scopus subject areas

  • Language and Linguistics
  • Computer Science (all)

ID: 11888226