The ability of social media to rapidly disseminate judgements on ethnicity and to influence offline ethnic relations creates demand for methods of automatically monitoring ethnicity-related online content. In this study we seek to measure the overall volume of ethnicity-related discussion in Russian-language social media and to develop an approach that automatically detects various aspects of attitudes to ethnic groups. We develop a comprehensive list of ethnonyms and related bigrams covering 97 post-Soviet ethnic groups and obtain all messages containing at least one of those terms from a two-year period from all Russian-language social media (N = 2,660,222 texts). We hand-code 7,181 messages in which rare ethnicities are overrepresented and train a number of classifiers to recognize different aspects of authors’ attitudes and other text features. After calculating a number of standard quality metrics, we find that we reach good quality in detecting intergroup conflict, positive intergroup contact, and overall negative and positive sentiment. Relevance to the topic of ethnicity and general attitude to an ethnic group are the least well predicted, while some aspects, such as calls for violence against an ethnic group, are not sufficiently present in the data to be predicted.
Original language: English
Title of host publication: Digital Transformation and Global Society
Subtitle of host publication: International Conference on Digital Transformation and Global Society DTGS 2017
Publisher: Springer Nature
Pages: 16-30
ISBN (Electronic): 9783319697840
ISBN (Print): 9783319697833
State: Published - 1 Jan 2017
Event: 2nd International Conference on Digital Transformation and Global Society (DTGS) - St. Petersburg
Duration: 21 Jun 2017 to 23 Jun 2017

Publication series

Name: Communications in Computer and Information Science
Publisher: Springer Nature
Volume: 745
ISSN (Print): 1865-0929

Conference

Conference: 2nd International Conference on Digital Transformation and Global Society (DTGS)
City: St. Petersburg
Period: 21/06/17 to 23/06/17

Research areas

• Classification, Ethnic attitudes, Interethnic relations, Lexicon, Mapping, Social media

ID: 104815162