This paper addresses prepositional phrase classification on an annotated Russian corpus of prepositional phrases. Previous research has shown that distinguishing highly confused classes, namely THEME and OBJECT, remains an open computational problem. Since a simple classifier architecture shows a significant performance drop on these classes, we propose a tree-based classifier architecture to improve performance both overall and on these classes specifically. In this architecture, a main classifier validates its decisions on the troublesome classes with a supporting classifier trained to differentiate between the classes that cause the performance drop. We experiment with various classifier types within the architecture and with various embedding models for Russian, which we use to encode the dataset. The best result we achieved is an overall F1-score of 0.76 on the validation set, using the DeepPavlov/rubert-base-cased model and SVM (Support Vector Machine) classifiers. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
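The routing idea described in the abstract can be illustrated with a minimal sketch. It assumes that the main classifier's prediction is passed to the supporting classifier whenever it falls into the confusable set, and that the supporting classifier then makes the final call; the abstract does not specify the exact validation rule. Only the class names THEME and OBJECT come from the abstract. The function names, the OTHER label, and the toy nearest-centroid models (stand-ins for trained classifiers such as SVMs over phrase embeddings) are hypothetical.

```python
from __future__ import annotations

from typing import Callable, Sequence

Vector = Sequence[float]
Classifier = Callable[[Vector], str]

# THEME and OBJECT are named in the abstract; everything else here
# (helper names, centroids, the OTHER label) is a hypothetical illustration.
CONFUSED = frozenset({"THEME", "OBJECT"})


def nearest_centroid(centroids: dict[str, Vector]) -> Classifier:
    """Build a toy classifier that returns the label of the closest centroid.

    Stand-in for a trained model; not the classifier used in the paper.
    """
    def predict(x: Vector) -> str:
        def sq_dist(c: Vector) -> float:
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(centroids, key=lambda label: sq_dist(centroids[label]))
    return predict


def two_stage_predict(x: Vector, main: Classifier, support: Classifier,
                      confused: frozenset[str] = CONFUSED) -> str:
    """Ask the main classifier first; if it picks a confusable class,
    let the supporting classifier decide between those classes."""
    label = main(x)
    if label in confused:
        return support(x)
    return label


# Toy models over 2-D "embeddings" (assumed values, for illustration only).
main_clf = nearest_centroid({
    "THEME": (1.0, 0.0),
    "OBJECT": (0.8, 0.3),
    "OTHER": (-1.0, 0.0),
})
support_clf = nearest_centroid({
    "THEME": (1.0, 1.0),
    "OBJECT": (1.0, -1.0),
})

if __name__ == "__main__":
    for point in [(0.9, -0.5), (-0.9, 0.2)]:
        print(point, main_clf(point), two_stage_predict(point, main_clf, support_clf))
```

In this sketch the supporting classifier is consulted only for inputs the main classifier assigns to a confusable class, so predictions for all other classes are left unchanged.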
Original language: English
Title of host publication: Internet and Modern Society. Human-Computer Communication (IMS 2024)
Publisher: Springer Nature
Pages: 130-139
Number of pages: 10
ISBN (Print): 9783031961762
State: Published - 2026
Event: Internet and Modern Society – IMS-2024 - ITMO University, Saint Petersburg, Russian Federation
Duration: 24 Jun 2024 → 26 Jun 2024
Conference number: XXVII
https://ims.itmo.ru

Publication series

Name: Communications in Computer and Information Science
Volume: 2534 CCIS

Conference

Conference: Internet and Modern Society – IMS-2024
Abbreviated title: IMS 2024
Country/Territory: Russian Federation
City: Saint Petersburg
Period: 24/06/24 → 26/06/24
Research areas

  • phrase embeddings, prepositional phrases, text classification, transformers, word sense disambiguation, Architecture, Classification (of information), Computational linguistics, Embeddings, Natural language processing systems, Text processing, Computational solutions, Performance, Phrase embedding, Prepositional phrase, Simple++, Text classification, Transformer, Transformer modeling, Word Sense Disambiguation, Support vector machines

ID: 151442632