Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review
Topics in the Russian Twitter and relations between their interpretability and sentiment. / Bodrunova, Svetlana S.; Blekanov, Ivan S.; Kukarkin, Mikhail.
2019 Sixth International Conference on Social Networks Analysis, Management and Security (SNAMS). Institute of Electrical and Electronics Engineers Inc., 2019. p. 549-554.
TY - GEN
T1 - Topics in the Russian Twitter and relations between their interpretability and sentiment
AU - Bodrunova, Svetlana S.
AU - Blekanov, Ivan S.
AU - Kukarkin, Mikhail
N1 - Conference code: 6
PY - 2019/11
Y1 - 2019/11
AB - Topic modelling is widely used today to detect the hidden topicality of text corpora, including those from social media. However, for many widespread online languages, such as Russian, topic modelling is still rarely used. For Russian Twitter, only a handful of works exist, and these works lack substantial discussion of topic interpretability. The impact of various text properties upon the modelling results also remains largely unexplored. We partly cover these gaps by assessing a mid-range text corpus of a conflictual Twitter discussion in two respects. Continuing our earlier study, which applied three topic modelling algorithms (LDA, WNTM, and BTM) and assessed their quality by automated means, we here juxtapose automated assessment with human coding and link the human evaluation of topic quality to the sentiment of the topics. We show that human coding disagrees with the objective metrics on the number of interpretable topics: it shows slightly higher interpretability for the LDA algorithm, while inter-coder reliability is much higher for BTM. We discuss a range of coding issues common to all three topic models. We also find that the interpretability of a topic to human coders is linked to the presence of negative keywords among the topic descriptors, with the strongest linkage shown by BTM.
UR - https://dblp.org/db/conf/snams/snams2019.html
UR - https://ieeexplore.ieee.org/document/8931725
M3 - Conference contribution
SN - 978-1-7281-2947-1
SP - 549
EP - 554
BT - 2019 Sixth International Conference on Social Networks Analysis, Management and Security (SNAMS)
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2019 Sixth International Conference on Social Networks Analysis, Management and Security (SNAMS)
Y2 - 22 October 2019 through 25 October 2019
ER -
ID: 49786491