Anaphoric annotation and corpus-based anaphora resolution › Research at SPbU

Links

http://www.dialog-21.ru/digests/dialog2014/materials/pdf/ProtopopovaEV.pdf

E. V. Protopopova
A. A. Bodrova
S. A. Volskaya
I. V. Krylova
A. S. Chuchunkov
S. V. Alexeeva
V. V. Bocharov
D. V. Granovsky

The paper describes noun phrase and anaphora annotation in OpenCorpora and compares it with that in other corpora. We discuss the choice of representative texts for anaphoric annotation and the basic principles of syntactic annotation. For noun phrase annotation we followed the two-stage scheme introduced earlier for morphological annotation: first, all noun phrases and some other syntactic units were annotated by a heterogeneous group of people; then a linguist compared all markup results and either chose the best one or corrected mistakes. We present some annotation results and cases of annotator disagreement, and then introduce our data-driven anaphora resolution system based on decision trees. We list the features used to fit the classifier and discuss their relevance, along with some changes that improved classifier performance. We also present our rule-based approach to automated noun phrase extraction using the Tomita parser. Finally, we introduce a baseline for anaphora resolution and compare it with our results.

Original language	English
Title of host publication	Based on the Materials of the Annual International Conference "Dialogue" 2014
Publisher	Russian State University for the Humanities
Pages	562-571
Number of pages	10
ISBN (print)	2221-7932
Publication status	Published - 2014

Publication series

Name	Komp'juternaja Lingvistika i Intellektual'nye Tehnologii
Volume	13
ISSN (print)	2221-7932

Scopus subject areas

Language and Linguistics
Linguistics and Language
Computer Science Applications

ID: 4682461

Anaphoric annotation and corpus-based anaphora resolution: An experiment
