This study proposes a methodology for the corpus-based analysis of Russian secondary prepositions, focusing primarily on multiword units. Secondary prepositions are units motivated by content words (nouns, adverbs, verbs), which may combine with primary prepositions to form multiword prepositions (MWPs). In some contexts a multiword preposition performs the grammatical function of a preposition in a particular position of the syntactic structure; in others, the same word sequence is a free combination. No strict boundary between secondary multiword prepositions and their equivalent free word combinations has been established. This is a problem for building language models: compound prepositional units are commonly mislabeled as free combinations or labeled inconsistently, which leads to parsing errors with far-reaching consequences. Our larger study aims to solve this problem by identifying, describing, and eventually formalizing the full inventory of Russian MWPs, which requires dedicated corpus-based research. This paper presents a statistical analysis of the use of secondary multiword prepositions in corpora, taking prepositions that express causal relations as the base material. It describes the features that multiword expressions exhibit when they function as prepositions and reports, for individual multiword expressions, the ratio of uses as prepositional units to uses as free combinations.
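
The ratio statistic mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each corpus occurrence of a candidate expression has already been annotated as either a prepositional use or a free combination. The function name `prepositional_share`, the labels `"prep"` and `"free"`, and the sample occurrences are hypothetical and invented for illustration; they are not data or code from the paper.

```python
from collections import Counter, defaultdict

# Hypothetical label set: "prep" = used as a multiword preposition,
# "free" = used as a free word combination.
VALID_LABELS = {"prep", "free"}


def prepositional_share(occurrences):
    """Return, per expression, the share of occurrences labeled "prep".

    `occurrences` is an iterable of (expression, label) pairs.
    """
    counts = defaultdict(Counter)
    for expression, label in occurrences:
        if label not in VALID_LABELS:
            raise ValueError(f"unknown label: {label!r}")
        counts[expression][label] += 1

    shares = {}
    for expression, by_label in counts.items():
        total = by_label["prep"] + by_label["free"]
        shares[expression] = by_label["prep"] / total
    return shares


# Invented sample annotations (not taken from the paper's corpora).
sample = [
    ("по причине", "prep"),
    ("по причине", "prep"),
    ("по причине", "free"),
    ("в силу", "prep"),
    ("в силу", "free"),
]

shares = prepositional_share(sample)
for expression, share in shares.items():
    print(f"{expression}: {share:.2f} of occurrences are prepositional")
```

Reporting the share of prepositional uses, rather than raw counts, makes expressions with different corpus frequencies directly comparable.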

Original language: English
Pages (from-to): 187-201
Number of pages: 15
Journal: CEUR Workshop Proceedings
Volume: 2780
State: Published - 2020
Event: 2020 Computational Models in Language and Speech Workshop, CMLS 2020 - Kazan, Russian Federation
Duration: 12 Nov 2020 - 13 Nov 2020

    Scopus subject areas

  • Computer Science(all)

    Research areas

  • Corpus statistics, Multiword prepositions, Russian language, Secondary prepositions
