Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review
Comparative Study of Clustering Algorithms for Inferring Psychological Profiles from VK-User Avatars Semantics. / Bushmelev, F.; Stoliarova, V.; Prusskikh, Ilya.
Proceedings of the Ninth International Scientific Conference “Intelligent Information Technologies for Industry” (IITI’25), Volume 1. Springer Nature, 2026. p. 485-496 (Lecture Notes in Networks and Systems; Vol. 1762 LNNS).
TY - GEN
T1 - Comparative Study of Clustering Algorithms for Inferring Psychological Profiles from VK-User Avatars Semantics
AU - Bushmelev, F.
AU - Stoliarova, V.
AU - Prusskikh, Ilya
N1 - Export Date: 29 March 2026; Cited By: 0; Correspondence Address: F. Bushmelev; St. Petersburg Federal Research Center of the Russian Academy of Sciences, St. Petersburg, 39, 14th Line V.O., 199178, Russian Federation; email: fvb@dscs.pro; Conference name: 9th International Scientific Conference on Intelligent Information Technologies for Industry, IITI 2025; Conference date: 5 November 2025 through 7 November 2025; Conference code: 344719
PY - 2026
Y1 - 2026
AB - Rising user activity on online social media (OSM) platforms such as VK drives cross-disciplinary research (psychology, cybersecurity, etc.) in which avatars serve as key digital footprints. While ML is widely used for this analysis, adapting universal tools to dataset-specific properties remains challenging. This study focuses on optimizing the clustering of avatar datasets. An intensive computational experiment was conducted to identify the cluster structure of such a dataset and the UMAP parameter values that yield good clustering with respect to several clustering quality indices. Our pipeline combines CLIP embeddings, UMAP dimensionality reduction, and five clustering algorithms (K-means to HDBSCAN and GMM). Hyperparameters were tuned via grid search and Bayesian optimization and evaluated on 9,000 VK avatars using four metrics (SI, DBI, CHI, DI). We demonstrate that these parameters lead to an avatar clustering whose user groups vary in their mean scores on the Big Five scales. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
KW - CLIP
KW - Clustering
KW - Dimensionality Reduction
KW - Graphical Digital Footprints
KW - Hyperparameter Tuning
KW - Online Social Media
KW - Personality Computing
KW - Barium compounds
KW - Bayesian networks
KW - Computer graphics
KW - Dimensionality reduction
KW - K-means clustering
KW - Reduction
KW - Social sciences computing
KW - Clusterings
KW - Clusterization
KW - Comparatives studies
KW - Graphical digital footprint
KW - Hyper-parameter
KW - Hyperparameter tuning
KW - Online social medias
KW - Personality computing
KW - Social networking (online)
UR - https://www.mendeley.com/catalogue/807b4a9b-b6ce-31a8-b152-3c9751bb824b/
U2 - 10.1007/978-3-032-13615-2_41
DO - 10.1007/978-3-032-13615-2_41
M3 - Article in conference proceedings
SN - 9783032136145
T3 - Lecture Notes in Networks and Systems
SP - 485
EP - 496
BT - Proceedings of the Ninth International Scientific Conference “Intelligent Information Technologies for Industry” (IITI’25), Volume 1
PB - Springer Nature
Y2 - 5 November 2025 through 7 November 2025
ER -
ID: 151441454