Research output: Conference materials › abstract
Named Entity Normalization for Fact Extraction Task. / Popov, A. M.; Adaskina, Yu. V.; Andreyeva, D. A.; Charabet, Ja.; Moskvina, A. D.; Protopopova, E. V.; Yushina, T. A.
2016. Abstract from the 22nd International Scientific Conference "Dialogue", Moscow, Russian Federation.
TY - CONF
T1 - Named Entity Normalization for Fact Extraction Task
AU - Popov, A. M.
AU - Adaskina, Yu. V.
AU - Andreyeva, D. A.
AU - Charabet, Ja.
AU - Moskvina, A. D.
AU - Protopopova, E. V.
AU - Yushina, T. A.
PY - 2016
Y1 - 2016
N2 - The paper describes our approach to the task of information extraction within FactRuEval, an independent evaluation of Named Entity Recognition and Fact Extraction tools. We took part in the three subtasks of the evaluation: Named Entity Recognition per se, Entity Normalization and Fact Extraction. We chose a rule-based approach to the task. The three subtasks correspond to the modules of the ‘Hurma’ parser, the tool we have developed. In addition to traditional lexicon-based and regular-expression-based rules, it allows creating elaborate rules to mine and normalize different kinds of entities with regard to the specific challenges that a language such as Russian presents to researchers. For Fact Extraction, we used a skip-gram-based algorithm with no dependencies in order to overcome the problem of data sparsity. Preliminary results show that our Entity Extraction and Normalization methods score reasonably high and our Fact Extraction score is high enough, taking into account that our expected maximum F-measure is relatively low due to the specifics of the Gold Standard.
AB - The paper describes our approach to the task of information extraction within FactRuEval, an independent evaluation of Named Entity Recognition and Fact Extraction tools. We took part in the three subtasks of the evaluation: Named Entity Recognition per se, Entity Normalization and Fact Extraction. We chose a rule-based approach to the task. The three subtasks correspond to the modules of the ‘Hurma’ parser, the tool we have developed. In addition to traditional lexicon-based and regular-expression-based rules, it allows creating elaborate rules to mine and normalize different kinds of entities with regard to the specific challenges that a language such as Russian presents to researchers. For Fact Extraction, we used a skip-gram-based algorithm with no dependencies in order to overcome the problem of data sparsity. Preliminary results show that our Entity Extraction and Normalization methods score reasonably high and our Fact Extraction score is high enough, taking into account that our expected maximum F-measure is relatively low due to the specifics of the Gold Standard.
KW - Information Extraction
KW - Named Entity Recognition
KW - Named Entity Normalization
KW - Fact Extraction
KW - skip-grams
UR - https://www.dialog-21.ru/digest/2016/online/
M3 - Abstract
T2 - 22nd International Scientific Conference "Dialogue"
Y2 - 1 June 2016 through 4 June 2016
ER -
ID: 106951610