The writer identification problem has recently become important because of the large number of documents available in digital form. This work investigates an approach based on the frequencies of letter combinations for classifying documents by authorship. It examines and compares four distance measures between a text of unknown authorship and an author's profile: the L1 measure, the Kullback-Leibler divergence, the base metric of the Common N-gram (CNG) method [8], and a variation of the CNG dissimilarity measure proposed in [12]. The comparison outlines the cases in which one metric outperforms the others for a specific combination of parameters. Experiments are conducted on several Russian and English corpora.
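The general idea of comparing a text with author profiles can be illustrated with a minimal sketch. It is not the implementation used in the paper: the function names, the n-gram length `n=3`, and the additive smoothing constant `eps` in the Kullback-Leibler divergence are assumptions made for illustration. The CNG dissimilarity follows the commonly cited relative-difference form, without the truncation of profiles to the most frequent n-grams, and the variation from [12] is not reproduced.

```python
from collections import Counter
from math import log


def ngram_profile(text, n=3):
    """Relative frequencies of character n-grams (n=3 is an assumed default)."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}


def l1_distance(p, q):
    """Sum of absolute frequency differences over the union of n-grams."""
    keys = set(p) | set(q)
    return sum(abs(p.get(g, 0.0) - q.get(g, 0.0)) for g in keys)


def kl_divergence(p, q, eps=1e-6):
    """KL divergence D(p || q) with additive smoothing.

    The smoothing (eps) is an assumption: without it the divergence is
    undefined whenever an n-gram of p is absent from q.
    """
    keys = set(p) | set(q)
    norm = 1.0 + eps * len(keys)  # keeps both smoothed profiles summing to 1
    total = 0.0
    for g in keys:
        pg = (p.get(g, 0.0) + eps) / norm
        qg = (q.get(g, 0.0) + eps) / norm
        total += pg * log(pg / qg)
    return total


def cng_dissimilarity(p, q):
    """Squared relative difference summed over the union of n-grams."""
    total = 0.0
    for g in set(p) | set(q):
        fp, fq = p.get(g, 0.0), q.get(g, 0.0)
        total += ((fp - fq) / ((fp + fq) / 2.0)) ** 2
    return total


def attribute(text, author_texts, distance, n=3):
    """Return the author whose profile is closest to the unknown text."""
    unknown = ngram_profile(text, n)
    return min(
        author_texts,
        key=lambda a: distance(unknown, ngram_profile(author_texts[a], n)),
    )
```

Each measure takes two profiles as plain dictionaries, so `attribute` can be called with any of the three functions as `distance` to compare how the choice of measure changes the attributed author.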

Original language: English
Title of host publication: 19th Conference of Open Innovations Association, FRUCT 2016
Editors: Tatiana Tyutina, Sergey Balandin
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 24-30
Number of pages: 7
ISBN (electronic): 9789526839752
Publication status: Published - 2016
Event: 19th Conference of Open Innovations Association, FRUCT 2016 - Jyvaskyla, Finland
Duration: 7 Nov 2016 to 11 Nov 2016

Conference

Conference: 19th Conference of Open Innovations Association, FRUCT 2016
Country/Territory: Finland
City: Jyvaskyla
Period: 7/11/16 to 11/11/16

    Scopus subject areas

  • Computer Science (all)
  • Electrical and Electronic Engineering

ID: 7614470