The article describes the development of an online tool for moderating content in social network groups. The core of the system is a machine-learning classifier. The feature set for each message combines content features extracted from the text with word-embedding vectors. The authors ran a series of experiments to find the best combination of vector representation, content features, and classification method. Tests on a dataset of 11 thousand Russian-language messages yielded 87% accuracy. The article also proposes an architecture for a group moderator's web application that can automatically apply classification results to manage users and control the display of posts.
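The feature construction described in the abstract (content features combined with word-embedding vectors) might look roughly like the minimal Python sketch below. Everything specific in it is an assumption made for illustration: the three content features, the toy two-dimensional embedding table `TOY_EMBEDDINGS`, and the function names are hypothetical and are not taken from the paper, which would use pretrained embeddings and its own feature list.

```python
import re
from statistics import mean

# Hypothetical toy embeddings; a real system would load pretrained vectors.
TOY_EMBEDDINGS = {
    "you": [0.1, 0.3],
    "are": [0.0, 0.2],
    "great": [0.9, 0.1],
    "idiot": [-0.8, 0.7],
}
EMBEDDING_DIM = 2


def content_features(text):
    """Hand-crafted features (illustrative choices, not the authors' list)."""
    letters = [c for c in text if c.isalpha()]
    upper_ratio = sum(c.isupper() for c in letters) / len(letters) if letters else 0.0
    # Message length, number of exclamation marks, share of uppercase letters.
    return [float(len(text)), float(text.count("!")), upper_ratio]


def mean_embedding(text, embeddings, dim):
    """Average the vectors of known tokens; zeros if no token is known."""
    tokens = re.findall(r"\w+", text.lower())
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [mean(component) for component in zip(*vectors)]


def build_features(text, embeddings=TOY_EMBEDDINGS, dim=EMBEDDING_DIM):
    """Concatenate content features with the averaged embedding vector."""
    return content_features(text) + mean_embedding(text, embeddings, dim)
```

The resulting vector could then be passed to any classifier. The abstract does not state which classification method or vector representation performed best, so none is assumed here.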
| Translated title of the contribution | Служба классификации токсичных комментариев в социальной сети |
|---|---|
| Original language | English |
| Title of host publication | Speech and Computer - 23rd International Conference, SPECOM 2021, Proceedings |
| Editors | Alexey Karpov, Rodmonga Potapova |
| Place of Publication | Cham |
| Publisher | Springer Nature |
| Pages | 157-165 |
| Number of pages | 9 |
| Volume | 12997 |
| Edition | Springer |
| ISBN (Electronic) | 978-3-030-87802-3 |
| ISBN (Print) | 9783030878016 |
| State | Published - 2021 |
| Externally published | Yes |
| Event | 23rd International Conference on Speech and Computer, SPECOM 2021 - Virtual, Online, Russian Federation; Duration: 27 Sep 2021 → 30 Sep 2021; Conference number: 23; http://specom.nw.ru/2021/ |

| Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
|---|---|
| Volume | 12997 LNAI |
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |

| Conference | 23rd International Conference on Speech and Computer, SPECOM 2021 |
|---|---|
| Abbreviated title | SPECOM 2021 |
| Country/Territory | Russian Federation |
| City | Virtual, Online |
| Period | 27/09/21 → 30/09/21 |