Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review
Methods of User Opinion Data Crawling in Web 2.0 Social Network Discussions. / Непиющих, Дмитрий Викторович; Блеканов, Иван Станиславович; Тарасов, Никита Андреевич; Максимов, Алексей Юрьевич.
Social Computing and Social Media. HCII 2024. Springer Nature, 2024. p. 72-81 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 14703 LNCS).
TY - GEN
T1 - Methods of User Opinion Data Crawling in Web 2.0 Social Network Discussions
AU - Непиющих, Дмитрий Викторович
AU - Блеканов, Иван Станиславович
AU - Тарасов, Никита Андреевич
AU - Максимов, Алексей Юрьевич
PY - 2024
Y1 - 2024
N2 - It is widely accepted that nowadays a significant part of the content on the internet is generated by users of social media platforms which form the basis of Web 2.0. That is why modern media researchers use user-generated content to test their scientific hypotheses using automated data analysis methods and data mining tools. In this study we examine the main approaches to user opinion data crawling in modern social media platforms for subsequent analysis for scientific and research purposes. We propose a data collection approach based on reverse engineering of APK applications, which allows for data extraction from social networks that will not differ in completeness from data from mobile applications. A comparative analysis of the proposed methods in terms of completeness and execution speed is also carried out. According to our findings, implementing a custom REST API is the best approach as it is both reliable and computationally efficient.
KW - API
KW - Data Collection
KW - Data Crawling
KW - SSL Pinning
KW - User Opinion
KW - Web Crawler
KW - gRPC
UR - https://link.springer.com/chapter/10.1007/978-3-031-61281-7_5
UR - https://www.mendeley.com/catalogue/e6f87ecd-4720-3fcf-bad3-afc77be8db4c/
U2 - 10.1007/978-3-031-61281-7_5
DO - 10.1007/978-3-031-61281-7_5
M3 - Conference contribution
SN - 978-3-031-61280-0
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 72
EP - 81
BT - Social Computing and Social Media. HCII 2024.
PB - Springer Nature
T2 - Social Computing and Social Media: 16th International Conference
Y2 - 29 June 2024 through 4 July 2024
ER -
ID: 125271477