Research output: Contribution to journal › Article › peer-review
Identification of user accounts by image comparison: The phash-based approach. / Oliseenko, Valerii D.; Abramov, Maxim V.; Tulupyev, Alexander L.
In: Scientific and Technical Journal of Information Technologies, Mechanics and Optics, Vol. 21, No. 4, 01.08.2021, p. 562-570.
TY - JOUR
T1 - Identification of user accounts by image comparison
T2 - The phash-based approach
AU - Oliseenko, Valerii D.
AU - Abramov, Maxim V.
AU - Tulupyev, Alexander L.
N1 - Publisher Copyright: © 2021, ITMO University. All rights reserved.
PY - 2021/8/1
Y1 - 2021/8/1
N2 - The study presents a new approach to identifying users across different online social networks by matching accounts that belong to the same person. To this end, it uses images extracted from users’ digital footprints. The approach compares not only the main profile images but also all graphic content published in a user’s account. It requires a pairwise, “all-to-all” comparison of the images published in two accounts from different online social networks in order to assess the probability that these accounts belong to the same user. The labeled graphical content elements are compared using pHash, a well-known perceptual hash method. A computational experiment was conducted to evaluate the approach; the F1-score reached 0.886 for three matched images. The results show that pHash image comparison can be used for account identification either as a standalone approach or as a complement to other identification approaches and existing methods for comparative analysis of accounts. Automating the proposed approach provides a tool for aggregation and makes it possible to obtain more information about users and to assess their personality features in greater depth. The results can be applied to forming a digital twin of a user for further description of his or her traits in tasks such as protection against social engineering attacks, targeted advertising, creditworthiness assessment, and other studies related to online social networks and social sciences.
KW - Data science
KW - Image processing
KW - Online social networks
KW - pHash
KW - Social engineering attacks
KW - User identification
UR - http://www.scopus.com/inward/record.url?scp=85116058457&partnerID=8YFLogxK
U2 - 10.17586/2226-1494-2021-21-4-562-570
DO - 10.17586/2226-1494-2021-21-4-562-570
M3 - Article
AN - SCOPUS:85116058457
VL - 21
SP - 562
EP - 570
JO - Scientific and Technical Journal of Information Technologies, Mechanics and Optics
JF - Scientific and Technical Journal of Information Technologies, Mechanics and Optics
SN - 2226-1494
IS - 4
ER -
ID: 86309815
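
The "all-to-all" comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each image has already been reduced to a 64-bit perceptual hash by some pHash implementation, and that two images count as a match when the Hamming distance between their hashes is small. The function names, the distance threshold of 8 bits, and the decision rule of at least three matched images are hypothetical choices made for illustration; the abstract reports an F1-score for three matched images but does not state these parameters.

```python
def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Number of bits in which two integer-encoded perceptual hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


def count_matched_images(hashes_a, hashes_b, max_distance=8):
    """Compare every hash from account A with every hash from account B
    ("all-to-all") and count the images in A that have at least one
    near-duplicate in B. max_distance is an assumed threshold."""
    matched = 0
    for hash_a in hashes_a:
        if any(hamming_distance(hash_a, hash_b) <= max_distance
               for hash_b in hashes_b):
            matched += 1
    return matched


def likely_same_user(hashes_a, hashes_b, min_matches=3, max_distance=8):
    """Hypothetical decision rule: treat two accounts as belonging to the
    same person when at least min_matches images match."""
    return count_matched_images(hashes_a, hashes_b, max_distance) >= min_matches


# Made-up hash values standing in for images from two accounts.
account_a = [0xF0F0F0F0F0F0F0F0, 0x123456789ABCDEF0,
             0x0F0F0F0F0F0F0F0F, 0xAAAAAAAAAAAAAAAA]
account_b = [0xF0F0F0F0F0F0F0F1, 0x123456789ABCDEF3,
             0x0F0F0F0F0F0F0F0E, 0x5555555555555555]

print(count_matched_images(account_a, account_b))  # 3
print(likely_same_user(account_a, account_b))      # True
```

The cost grows with the product of the two accounts' image counts, which is inherent in a pairwise comparison; precomputing the hashes once per image keeps each individual comparison down to a single XOR and bit count.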