Named Entity Normalization for Fact Extraction Task

Links

http://www.dialog-21.ru/media/3456/popovametal.pdf

A. M. Popov
Yu. V. Adaskina
D. A. Andreyeva
Ja. Charabet
A. D. Moskvina
E. V. Protopopova
T. A. Yushina

The paper describes our approach to the information extraction task within
FactRuEval, an independent evaluation of Named Entity Recognition and Fact
Extraction tools. We took part in the three subtasks of the evaluation: Named
Entity Recognition per se, Entity Normalization, and Fact Extraction.
We chose a rule-based approach to the task. The three subtasks correspond to
the modules of the ‘Hurma’ parser, a tool we have developed. In addition to
traditional lexicon-based and regular-expression-based rules, it allows
creating elaborate rules to mine and normalize different kinds of entities
with regard to the specific challenges that a language such as Russian
presents to researchers. For Fact Extraction, we used a skip-gram based
algorithm with no dependencies in order to overcome the problem of data
sparsity. Preliminary results show that our Entity Extraction and
Normalization methods score reasonably high and that our Fact Extraction
score is high enough, taking into account that our expected maximum F-measure
is relatively low due to the specifics of the Gold Standard.
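
As a rough illustration of why skip-grams help with data sparsity, the sketch below enumerates k-skip-n-grams over a token list, so that word pairs separated by intervening tokens are still counted as co-occurring. This is not the Hurma implementation: it assumes that "skip-gram" in the abstract means token-level k-skip-n-grams, and the function name `skipgrams` and its parameters `n` and `k` are hypothetical.

```python
from itertools import combinations


def skipgrams(tokens, n=2, k=2):
    """Return k-skip-n-grams: ordered n-token tuples with at most k skipped tokens.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    result = []
    for start in range(len(tokens)):
        # The remaining n-1 tokens may come from the next n-1+k positions,
        # which bounds the total number of skipped tokens by k.
        window = tokens[start + 1 : start + n + k]
        for rest in combinations(range(len(window)), n - 1):
            result.append((tokens[start],) + tuple(window[i] for i in rest))
    return result


if __name__ == "__main__":
    # With k=0 this reduces to ordinary bigrams; k=1 also pairs tokens
    # that have one token between them.
    for gram in skipgrams(["a", "b", "c", "d"], n=2, k=1):
        print(gram)
```

With `k=0` the function yields plain n-grams, so increasing `k` only adds patterns, which is what makes matches more likely on a small corpus.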

Original language	English
Number of pages	11
State	Published - 2016
Event	22nd International Scientific Conference "Dialogue" - Moscow, Russian Federation. Duration: 1 Jun 2016 → 4 Jun 2016

Conference

Conference	22nd International Scientific Conference "Dialogue"
Country/Territory	Russian Federation
City	Moscow
Period	1/06/16 → 4/06/16

Research areas

Information Extraction, Named Entity Recognition, Named Entity Normalization, Fact Extraction, skip-grams

ID: 106951610