Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review
Adaptation of Static and Contextualized Topic Modeling Techniques to Hidden Community Detection. / Мамаев, Иван Дмитриевич; Митрофанова, Ольга Александровна.
Digital Geography. Proceedings of the International Conference on Internet and Modern Society (IMS 2022). Springer Nature, 2024. p. 85-97 (Springer Geography; Vol. Part F2317).
TY - GEN
T1 - Adaptation of Static and Contextualized Topic Modeling Techniques to Hidden Community Detection
AU - Мамаев, Иван Дмитриевич
AU - Митрофанова, Ольга Александровна
N1 - Conference code: XXIV
PY - 2024/2/21
Y1 - 2024/2/21
N2 - Today, social networks are among the main means of organizing interactions between people, and their analysis is a burning issue. One of the central tasks of social network analysis is the task of detecting hidden communities that is based on the study of user interactions. The procedures of their detection are based on three main approaches: cluster analysis, graph methods, and hybrid techniques. Recently, a new group of methods, namely, topic modeling, has begun to evolve and it allows taking into account semantic and associative links among the analyzed texts. In this paper, we conduct a series of comparative experiments with probabilistic and contextualized topic models in order to determine the most stable one. The experiments are performed on the corpus of 2020–2021 Russian LiveJournal posts which contains more than 12,500 texts. The results show that contextualized BERT models form the most stable connections between the texts that can become a sound basis for creating a model of hidden communities of Russian LiveJournal users.
KW - Topic modeling
KW - Corpus linguistics
KW - Computational semantics
KW - LDA
KW - ATM
KW - BERT
UR - https://www.mendeley.com/catalogue/01d37869-576e-3943-8463-db994eb5f740/
U2 - 10.1007/978-3-031-50609-3_7
DO - 10.1007/978-3-031-50609-3_7
M3 - Conference contribution
SN - 978-3-031-50608-6
T3 - Springer Geography
SP - 85
EP - 97
BT - Digital Geography. Proceedings of the International Conference on Internet and Modern Society (IMS 2022)
PB - Springer Nature
T2 - International Conference "Internet and Modern Society" (IMS-2022)
Y2 - 23 June 2022 through 24 June 2022
ER -
ID: 117017397