  • A. S. Starostin
  • V. V. Bocharov
  • S. V. Alexeeva
  • A. A. Bodrova
  • A. S. Chuchunkov
  • S. S. Dzhumaev
  • I. V. Efimenko
  • D. V. Granovsky
  • V. F. Khoroshevsky
  • I. V. Krylova
  • M. A. Nikolaeva
  • I. M. Smurov
  • S. Y. Toldova

In this paper, we describe the rules and results of the FactRuEval information extraction competition held in 2016 as part of the Dialogue Evaluation initiative in the run-up to Dialogue 2016. The systems were to extract information from Russian texts and competed in two named entity extraction tracks and one fact extraction track. The paper describes the tasks set for the participants and presents the scores achieved by the competing systems. Additionally, we discuss the scoring methods used to evaluate the results of all three tracks and provide a preliminary analysis of the state of the art in Information Extraction for Russian texts. We also provide a detailed description of the composition and general organization of the annotated corpus created for the competition by volunteers using the OpenCorpora.org platform. The corpus is publicly available and is expected to evolve in the future.

Original language: English
Title of host publication: FactRuEval 2016: Evaluation of Named Entity Recognition and Fact Extraction Systems for Russian
Pages: 702-720
Number of pages: 19
State: Published - 2016
Event: 2016 International Conference on Computational Linguistics and Intellectual Technologies, Dialogue 2016 - Moscow, Russian Federation
Duration: 1 Jun 2016 - 4 Jun 2016

Publication series

Name: Komp'juternaja Lingvistika i Intellektual'nye Tehnologii
Publisher: Российский государственный гуманитарный университет (Russian State University for the Humanities)
ISSN (Print): 2221-7932

Conference

Conference: 2016 International Conference on Computational Linguistics and Intellectual Technologies, Dialogue 2016
Country/Territory: Russian Federation
City: Moscow
Period: 1/06/16 - 4/06/16

Research areas

  • Evaluation, Fact extraction, Information extraction, Named entity recognition, Relation extraction

Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
  • Computer Science Applications

ID: 7569428