Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review
Geolocation Detection Approaches for User Discussion Analysis in Twitter. / Blekanov, Ivan; Maksimov, Alexey; Nepiyushchikh, Dmitry; Bodrunova, Svetlana S.
HCI International 2022 - Late Breaking Papers. Interaction in New Media, Learning and Games: 24th International Conference on Human-Computer Interaction, HCII 2022, Virtual Event, June 26–July 1, 2022, Proceedings. Springer Nature, 2022. p. 16-29 (Lecture Notes in Computer Science; Vol. 13517).
TY - GEN
T1 - Geolocation Detection Approaches for User Discussion Analysis in Twitter
AU - Blekanov, Ivan
AU - Maksimov, Alexey
AU - Nepiyushchikh, Dmitry
AU - Bodrunova, Svetlana S.
N1 - Conference code: 24
PY - 2022
Y1 - 2022
AB - In this research, the authors consider methods for identifying the geodata of social network users within user discussions. Knowledge of user geolocation makes it possible to analyze how a discussion spreads among users in different countries. The authors do not try to determine the exact geolocation, but rather the country where the users are located. The difficulty of obtaining country-level user location data is that a high percentage of users do not state their location correctly, either describing it humorously or not stating it at all. There are various methods of obtaining data about the location of users, including text-based methods, methods based on the analysis of context, and methods based on the topology of the user graph. In this paper, we place special emphasis on a method that makes it possible to reveal the geodata of users who specified it incorrectly or did not specify it at all. To test our method, we use Twitter datasets. We propose several approaches to resolve the issues stated above. The paper highlights three approaches: the naïve approach, the naïve approach using natural language processing (NLP), and the graph approach, which is glossary-based and determines the number of outgoing connections. We introduce two measures to evaluate the proposed approaches, Recall-GEO and Precision-GEO, which are described in the paper. Finally, the accuracy of the UserGraph method is evaluated using these metrics.
KW - Geolocation detection
KW - Name entity recognition model
KW - Open street map service
KW - Social network analysis
KW - Twitter users discussion
KW - User graph analysis
UR - https://www.mendeley.com/catalogue/3716d7c2-d1b3-3483-8225-166b2c98d022/
U2 - 10.1007/978-3-031-22131-6_2
DO - 10.1007/978-3-031-22131-6_2
M3 - Conference contribution
SN - 9783031221309
T3 - Lecture Notes in Computer Science
SP - 16
EP - 29
BT - HCI International 2022 - Late Breaking Papers. Interaction in New Media, Learning and Games
PB - Springer Nature
Y2 - 26 June 2022 through 1 July 2022
ER -
ID: 100624931