Measuring prejudice and ethnic tensions in user-generated content

Research output: Contribution to journal › Article › peer-review

Olessia Koltsova
Svetlana Alexeeva
Sergey Nikolenko
Maxim Koltsov

With the spread of social media, ethnic prejudice is becoming publicly available to widening audiences and may have serious offline consequences. This creates a demand for detecting prejudice and other signs of ethnic tension in user-generated texts, a task fundamentally different from measuring prejudice with surveys, the approach traditionally developed in psychology. In this work we use a hand-coding instrument based on psychological definitions of prejudice and sociological methods of questionnaire construction. Compared to our previous research, we double our hand-coded collection, which now reaches 14,998 unique user texts retrieved from Russian-language social media. We then train computer classification algorithms to “guess” prejudice as detected by human coders and show a significant improvement in quality over our earlier results. Still, because not all aspects of prejudice are detected sufficiently well, we analyze potential causes of the low quality and outline directions for further improvement.

Original language	English
Pages (from-to)	76-81
Number of pages	6
Journal	Annual Review of CyberTherapy and Telemedicine
Volume	15
State	Published - 1 Jan 2017

Research areas

Ethnicity, Machine learning, Prejudice detection, User content

ID: 103178548