Research output: Contribution to journal › Article › peer-review
Patterning of writing style evolution by means of dynamic similarity. / Amelin, Konstantin; Granichin, Oleg; Kizhaeva, Natalia; Volkovich, Zeev.
In: Pattern Recognition, Vol. 77, 05.2018, p. 45-64.
TY - JOUR
T1 - Patterning of writing style evolution by means of dynamic similarity
AU - Amelin, Konstantin
AU - Granichin, Oleg
AU - Kizhaeva, Natalia
AU - Volkovich, Zeev
PY - 2018/5
Y1 - 2018/5
AB - This paper proposes a new methodology for patterning writing style evolution using dynamic similarity. We divide a text into sequential, disjoint portions (chunks) of the same size and use the Mean Dependence measure to model the writing process through the association between the current text chunk and its predecessors. To expose the evolution of a style, a new two-step clustering procedure is applied. In the first phase, a distance based on the Mean Dependence between each pair of chunks is evaluated. All document chunks in a pair are embedded in a high-dimensional space using a Kuratowski-type embedding procedure and clustered by means of the introduced distance. In the second phase, the rows of the binary cluster-classification matrix of the documents are clustered via the hierarchical single-linkage clustering algorithm. In this way, the inner stylistic structure of a collection of texts is visualized as a dendrogram of the resulting classification tree. The approach is applied to writing style evolution in the "Foundation Universe" by Isaac Asimov, the "Rama" series by Arthur C. Clarke, the "Forsyte Saga" by John Galsworthy, "The Lord of the Rings" by John Ronald Reuel Tolkien, and a collection of books attributed to Romain Gary; the results demonstrate that the methodology can identify style development over time. Additional numerical experiments on author determination and author verification tasks show that the method provides accurate solutions. © 2017 Elsevier Ltd. All rights reserved.
KW - Patterning
KW - Writing style
KW - Text mining
KW - Dynamics
KW - Authorship attribution
KW - K-means
KW - Recognition
KW - Compression
KW - Plagiarism
KW - Algorithm
KW - Models
KW - Kernel
UR - http://www.scopus.com/inward/record.url?scp=85044629634&partnerID=8YFLogxK
UR - http://www.mendeley.com/research/patterning-writing-style-evolution-means-dynamic-similarity
U2 - 10.1016/j.patcog.2017.12.011
DO - 10.1016/j.patcog.2017.12.011
M3 - Article
VL - 77
SP - 45
EP - 64
JO - Pattern Recognition
JF - Pattern Recognition
SN - 0031-3203
ER -
ID: 11875344