Standard

Writer identification based on letter frequency distribution. / Diurdeva, Polina; Mikhailova, Elena; Shalymov, Dmitry.

19th Conference of Open Innovations Association, FRUCT 2016. ed. / Tatiana Tyutina; Sergey Balandin. Institute of Electrical and Electronics Engineers Inc., 2016. p. 24-30 7892179.

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Harvard

Diurdeva, P, Mikhailova, E & Shalymov, D 2016, Writer identification based on letter frequency distribution. in T Tyutina & S Balandin (eds), 19th Conference of Open Innovations Association, FRUCT 2016, 7892179, Institute of Electrical and Electronics Engineers Inc., pp. 24-30, 19th Conference of Open Innovations Association, FRUCT 2016, Jyvaskyla, Finland, 7/11/16. https://doi.org/10.23919/FRUCT.2016.7892179

APA

Diurdeva, P., Mikhailova, E., & Shalymov, D. (2016). Writer identification based on letter frequency distribution. In T. Tyutina & S. Balandin (Eds.), 19th Conference of Open Innovations Association, FRUCT 2016 (pp. 24-30). [7892179] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.23919/FRUCT.2016.7892179

Vancouver

Diurdeva P, Mikhailova E, Shalymov D. Writer identification based on letter frequency distribution. In Tyutina T, Balandin S, editors, 19th Conference of Open Innovations Association, FRUCT 2016. Institute of Electrical and Electronics Engineers Inc. 2016. p. 24-30. 7892179 https://doi.org/10.23919/FRUCT.2016.7892179

Author

Diurdeva, Polina ; Mikhailova, Elena ; Shalymov, Dmitry. / Writer identification based on letter frequency distribution. 19th Conference of Open Innovations Association, FRUCT 2016. editor / Tatiana Tyutina ; Sergey Balandin. Institute of Electrical and Electronics Engineers Inc., 2016. pp. 24-30

BibTeX

@inproceedings{3dd3ac885e344cc5b70d414a54ec4f8a,
title = "Writer identification based on letter frequency distribution",
abstract = "Lately writer identification problem has become actual due to huge amount of documents in digital form. In the current work an approach based on frequency combination of letters is investigated for solving such a task as classification of documents by authorship. This research examines and compares four different distance measures between a text of unknown authorship and an authors' profile: L1 measure, Kullback-Leibler divergence, base metric of Common N-gram method (CNG) [8] and certain variation of dissimilarity measure of CNG method which was proposed in [12]. Comparison outlines cases when some metric outperforms others with a specific parameter combination. Experiments are conducted on different Russian and English corpora.",
author = "Polina Diurdeva and Elena Mikhailova and Dmitry Shalymov",
year = "2016",
doi = "10.23919/FRUCT.2016.7892179",
language = "English",
pages = "24--30",
editor = "Tatiana Tyutina and Sergey Balandin",
booktitle = "19th Conference of Open Innovations Association, FRUCT 2016",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",
note = "19th Conference of Open Innovations Association, FRUCT 2016 ; Conference date: 07-11-2016 Through 11-11-2016",

}

RIS

TY - CONF

T1 - Writer identification based on letter frequency distribution

AU - Diurdeva, Polina

AU - Mikhailova, Elena

AU - Shalymov, Dmitry

PY - 2016

Y1 - 2016

AB - Lately writer identification problem has become actual due to huge amount of documents in digital form. In the current work an approach based on frequency combination of letters is investigated for solving such a task as classification of documents by authorship. This research examines and compares four different distance measures between a text of unknown authorship and an authors' profile: L1 measure, Kullback-Leibler divergence, base metric of Common N-gram method (CNG) [8] and certain variation of dissimilarity measure of CNG method which was proposed in [12]. Comparison outlines cases when some metric outperforms others with a specific parameter combination. Experiments are conducted on different Russian and English corpora.

UR - http://www.scopus.com/inward/record.url?scp=85018627306&partnerID=8YFLogxK

U2 - 10.23919/FRUCT.2016.7892179

DO - 10.23919/FRUCT.2016.7892179

M3 - Conference contribution

SP - 24

EP - 30

BT - 19th Conference of Open Innovations Association, FRUCT 2016

A2 - Tyutina, Tatiana

A2 - Balandin, Sergey

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 19th Conference of Open Innovations Association, FRUCT 2016

Y2 - 7 November 2016 through 11 November 2016

ER -

ID: 7614470