Literary writing style recognition via a minimal spanning tree-based approach

Research output: Contribution to journal › Article › peer-review

Links

http://ac.els-cdn.com/S0957417416302573/1-s2.0-S0957417416302573-main.pdf?_tid=4a57d1bc-24a0-11e6-aa60-00000aacb35d&acdnat=1464418334_570f0081f6a06f752dd4efaa6ce767b2

DOI

https://doi.org/10.1016/j.eswa.2016.05.032
Final published version

Dmitry Shalymov
Oleg Granichin
Lev Klebanov
Zeev Volkovich

In this paper, we address the problem of literary writing style determination using a comparison of the randomness of two given texts. We attempt to comprehend if these texts are generated from distinct probability sources that can reveal a difference between the literary writing styles of the corresponding authors. We propose a new approach based on the incorporation of the known Friedman-Rafsky two-sample test into a multistage procedure with the aim of stabilizing the process. A sampling procedure constructed with the N-grams methodology is applied to simulate samples drawn from the pooled text with the aim of evaluating the null hypothesis distribution that appears when the writing styles coincide. Next, samples from different files are selected, and the p-values of the test statistics are calculated. An empirical distribution of these values is compared numerous times with the uniform one on the interval [0, 1], and the writing styles are recognized as different if the rejection fraction in this comparison's sequence is significantly greater than 0.5. The offered approach is language independent in the community of alphabetic languages and does not involve the use of linguistics. In comparison with most existing methods, our approach does not deal with any authorship attribute determination. A text itself, or more precisely the distribution of sequential text templates and their mutual occurrences, essentially identifies the style. Experiments demonstrate the strong capability of the proposed method. © 2016 Elsevier Ltd. All rights reserved.
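
The core ingredient named in the abstract, the Friedman-Rafsky two-sample test, can be sketched as follows. This is a minimal illustration and not the authors' implementation. It assumes each text fragment has already been turned into a numeric vector (for example, N-gram frequencies), and it estimates the null distribution by permuting sample labels, which is a standard substitute for the N-gram resampling from the pooled text that the paper describes. The function names `mst_edges`, `cross_edge_count` and `friedman_rafsky_pvalue` are hypothetical.

```python
import math
import random

def mst_edges(points):
    """Edges (i, j) of a Euclidean minimal spanning tree, via Prim's algorithm."""
    n = len(points)
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known link from the tree to each vertex
    best_from = [-1] * n         # tree vertex providing that link
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Take the vertex outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges

def cross_edge_count(edges, labels):
    """Number of tree edges whose endpoints come from different samples."""
    return sum(1 for i, j in edges if labels[i] != labels[j])

def friedman_rafsky_pvalue(sample_a, sample_b, n_perm=200, seed=0):
    """Return (observed cross-edge count, permutation p-value).

    Few cross edges suggest that the two samples come from different sources.
    Assumption: label permutation stands in for the paper's N-gram resampling.
    """
    rng = random.Random(seed)
    pooled = list(sample_a) + list(sample_b)
    labels = [0] * len(sample_a) + [1] * len(sample_b)
    edges = mst_edges(pooled)    # the tree is built once on the pooled sample
    observed = cross_edge_count(edges, labels)
    hits = 0
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        if cross_edge_count(edges, shuffled) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

In the multistage procedure outlined in the abstract, p-values like the one returned here would be collected over many sampled fragment pairs and their empirical distribution compared repeatedly with the uniform distribution on [0, 1]; that outer loop is not shown in this sketch.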

Original language	English
Pages (from-to)	145-153
Number of pages	9
Journal	Expert Systems with Applications
Volume	61
DOI	https://doi.org/10.1016/j.eswa.2016.05.032
State	Published - 1 Nov 2016
Externally published	Yes

ID: 7568102
