Standard

Investigation of text attribution methods based on frequency author profile. / Diurdeva, Polina; Mikhailova, Elena.

Databases and Information Systems - 13th International Baltic Conference, DB and IS 2018, Proceedings. ed. / Olegas Vasilecas; Gintautas Dzemyda; Audrone Lupeikiene. Springer Nature, 2018. p. 314-327 (Communications in Computer and Information Science; Vol. 838).

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Research; peer-review

Harvard

Diurdeva, P & Mikhailova, E 2018, Investigation of text attribution methods based on frequency author profile. in O Vasilecas, G Dzemyda & A Lupeikiene (eds), Databases and Information Systems - 13th International Baltic Conference, DB and IS 2018, Proceedings. Communications in Computer and Information Science, vol. 838, Springer Nature, pp. 314-327, 13th International Baltic Conference on Databases and Information Systems, DB and IS 2018, Trakai, Lithuania, 1/07/18. https://doi.org/10.1007/978-3-319-97571-9_25

APA

Diurdeva, P., & Mikhailova, E. (2018). Investigation of text attribution methods based on frequency author profile. In O. Vasilecas, G. Dzemyda, & A. Lupeikiene (Eds.), Databases and Information Systems - 13th International Baltic Conference, DB and IS 2018, Proceedings (pp. 314-327). (Communications in Computer and Information Science; Vol. 838). Springer Nature. https://doi.org/10.1007/978-3-319-97571-9_25

Vancouver

Diurdeva P, Mikhailova E. Investigation of text attribution methods based on frequency author profile. In Vasilecas O, Dzemyda G, Lupeikiene A, editors, Databases and Information Systems - 13th International Baltic Conference, DB and IS 2018, Proceedings. Springer Nature. 2018. p. 314-327. (Communications in Computer and Information Science). https://doi.org/10.1007/978-3-319-97571-9_25

Author

Diurdeva, Polina ; Mikhailova, Elena. / Investigation of text attribution methods based on frequency author profile. Databases and Information Systems - 13th International Baltic Conference, DB and IS 2018, Proceedings. editor / Olegas Vasilecas ; Gintautas Dzemyda ; Audrone Lupeikiene. Springer Nature, 2018. pp. 314-327 (Communications in Computer and Information Science).

BibTeX

@inproceedings{239153514ce04fc8bd8f8a7b8d312d3b,
title = "Investigation of text attribution methods based on frequency author profile",
abstract = "The task of text analysis with the objective to determine text{\textquoteright}s author is a challenge the solutions of which have engaged researchers since the last century. With the development of social networks and platforms for publishing of web-posts or articles on the Internet, the task of identifying authorship becomes even more acute. Specialists in the areas of journalism and law are particularly interested in finding a more accurate approach in order to resolve disputes related to the texts of dubious authorship. In this article authors carry out an applicability comparison of eight modern Machine Learning algorithms like Support Vector Machine, Naive Bayes, Logistic Regression, K-nearest Neighbors, Decision Tree, Random Forest, Multilayer Perceptron, Gradient Boosting Classifier for classification of Russian web-post collection. The best results were achieved with Logistic Regression, Multilayer Perceptron and Support Vector Machine with linear kernel using combination of Part-of-Speech and Word N-grams as features.",
keywords = "Author attribution, Frequency author profile, Text classification",
author = "Polina Diurdeva and Elena Mikhailova",
year = "2018",
month = jan,
day = "1",
doi = "10.1007/978-3-319-97571-9_25",
language = "English",
isbn = "9783319975702",
series = "Communications in Computer and Information Science",
publisher = "Springer Nature",
pages = "314--327",
editor = "Olegas Vasilecas and Gintautas Dzemyda and Audrone Lupeikiene",
booktitle = "Databases and Information Systems - 13th International Baltic Conference, DB and IS 2018, Proceedings",
address = "Germany",
note = "13th International Baltic Conference on Databases and Information Systems, DB and IS 2018 ; Conference date: 01-07-2018 Through 04-07-2018"
}

RIS

TY - GEN

T1 - Investigation of text attribution methods based on frequency author profile

AU - Diurdeva, Polina

AU - Mikhailova, Elena

PY - 2018/1/1

Y1 - 2018/1/1

N2 - The task of text analysis with the objective to determine text’s author is a challenge the solutions of which have engaged researchers since the last century. With the development of social networks and platforms for publishing of web-posts or articles on the Internet, the task of identifying authorship becomes even more acute. Specialists in the areas of journalism and law are particularly interested in finding a more accurate approach in order to resolve disputes related to the texts of dubious authorship. In this article authors carry out an applicability comparison of eight modern Machine Learning algorithms like Support Vector Machine, Naive Bayes, Logistic Regression, K-nearest Neighbors, Decision Tree, Random Forest, Multilayer Perceptron, Gradient Boosting Classifier for classification of Russian web-post collection. The best results were achieved with Logistic Regression, Multilayer Perceptron and Support Vector Machine with linear kernel using combination of Part-of-Speech and Word N-grams as features.

AB - The task of text analysis with the objective to determine text’s author is a challenge the solutions of which have engaged researchers since the last century. With the development of social networks and platforms for publishing of web-posts or articles on the Internet, the task of identifying authorship becomes even more acute. Specialists in the areas of journalism and law are particularly interested in finding a more accurate approach in order to resolve disputes related to the texts of dubious authorship. In this article authors carry out an applicability comparison of eight modern Machine Learning algorithms like Support Vector Machine, Naive Bayes, Logistic Regression, K-nearest Neighbors, Decision Tree, Random Forest, Multilayer Perceptron, Gradient Boosting Classifier for classification of Russian web-post collection. The best results were achieved with Logistic Regression, Multilayer Perceptron and Support Vector Machine with linear kernel using combination of Part-of-Speech and Word N-grams as features.

KW - Author attribution

KW - Frequency author profile

KW - Text classification

UR - http://www.scopus.com/inward/record.url?scp=85052856289&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-97571-9_25

DO - 10.1007/978-3-319-97571-9_25

M3 - Conference contribution

AN - SCOPUS:85052856289

SN - 9783319975702

T3 - Communications in Computer and Information Science

SP - 314

EP - 327

BT - Databases and Information Systems - 13th International Baltic Conference, DB and IS 2018, Proceedings

A2 - Vasilecas, Olegas

A2 - Dzemyda, Gintautas

A2 - Lupeikiene, Audrone

PB - Springer Nature

T2 - 13th International Baltic Conference on Databases and Information Systems, DB and IS 2018

Y2 - 1 July 2018 through 4 July 2018

ER -

ID: 38400560