  • Mikhail Dolgushin
  • Dayana Ismakova
  • Yuliya Bidulya
  • Igor Krupkin
  • Galina Barskaya
  • Anastasiya Lesiv

The article describes the development of an online tool for moderating content in social network groups. Machine-learning classification is proposed as the system's core component. Each message's feature set is built from content features extracted from the text together with word-embedding vectors. The authors ran a series of experiments to find the best combination of vector representation, content features, and classification method. Tests on a dataset of 11 thousand Russian-language messages yielded 87% accuracy. The authors also propose an architecture for a group moderator's web application that can automatically apply classification results to manage users and control the display of posts.
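
As a rough illustration of the feature construction summarized in the abstract, the following minimal sketch concatenates a few hand-crafted content features with an averaged word-embedding vector and feeds the result to a small classifier. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy two-dimensional embeddings, the two surface features (uppercase ratio and exclamation-mark density), the English sample messages, and the choice of logistic regression. The names `content_features`, `mean_embedding`, `featurize`, `train_logreg`, and `predict_toxic` are hypothetical.

```python
import math

# Hypothetical toy embeddings; a real system would load pretrained vectors.
TOY_EMBEDDINGS = {
    "idiot": [1.0, 0.0],
    "stupid": [0.9, 0.1],
    "thanks": [0.0, 1.0],
    "great": [0.1, 0.9],
}
EMBEDDING_DIM = 2


def content_features(text):
    """Illustrative surface features: uppercase ratio and '!' density."""
    letters = [c for c in text if c.isalpha()]
    upper_ratio = sum(c.isupper() for c in letters) / len(letters) if letters else 0.0
    exclaim_density = text.count("!") / max(len(text), 1)
    return [upper_ratio, exclaim_density]


def mean_embedding(text):
    """Average the embeddings of known tokens; zeros if none are known."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    vectors = [TOY_EMBEDDINGS[t] for t in tokens if t in TOY_EMBEDDINGS]
    if not vectors:
        return [0.0] * EMBEDDING_DIM
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def featurize(text):
    """Concatenate content features with the averaged embedding."""
    return content_features(text) + mean_embedding(text)


def train_logreg(samples, labels, epochs=500, lr=0.5):
    """Plain stochastic gradient descent for logistic regression."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = 1.0 / (1.0 + math.exp(-z)) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict_toxic(text, weights, bias):
    """Return True when the predicted toxicity probability is at least 0.5."""
    z = sum(w * xi for w, xi in zip(weights, featurize(text))) + bias
    return 1.0 / (1.0 + math.exp(-z)) >= 0.5


# Made-up training messages: label 1 = toxic, 0 = non-toxic.
train_texts = [
    "You are an IDIOT!!!",
    "stupid stupid post",
    "thanks, great article",
    "Great work, thanks!",
]
train_labels = [1, 1, 0, 0]
weights, bias = train_logreg([featurize(t) for t in train_texts], train_labels)
```

Calling `predict_toxic("what an idiot", weights, bias)` then scores an unseen message with the trained weights. In a moderation service, a result like this could drive the automatic actions the abstract mentions, such as hiding a post.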

Translated title of the contribution: Служба классификации токсичных комментариев в социальной сети (Toxic comment classification service in a social network)
Original language: English
Title of host publication: Speech and Computer - 23rd International Conference, SPECOM 2021, Proceedings
Editors: Alexey Karpov, Rodmonga Potapova
Place of Publication: Cham
Publisher: Springer Nature
Pages: 157-165
Number of pages: 9
Volume: 12997
Edition: Springer
ISBN (Electronic): 978-3-030-87802-3
ISBN (Print): 9783030878016
State: Published - 2021
Externally published: Yes
Event: 23rd International Conference on Speech and Computer, SPECOM 2021 - Virtual, Online, Russian Federation
Duration: 27 Sep 2021 to 30 Sep 2021
Conference number: 23
http://specom.nw.ru/2021/

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12997 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 23rd International Conference on Speech and Computer, SPECOM 2021
Abbreviated title: SPECOM 2021
Country/Territory: Russian Federation
City: Virtual, Online
Period: 27/09/21 to 30/09/21
Internet address: http://specom.nw.ru/2021/

Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Research areas

  • Feature extraction, Moderation, Social media, Text classification, Toxic detection

ID: 97970208