This article provides a consistent formal grammatical and ontological description of a model of the Tibetan compound system, developed and used for automatic syntactic and semantic analysis of Tibetan texts and based on a hand-verified corpus. The model covers all types of Tibetan compounds previously introduced by other authors and adds a number of new classes of compounds, taking into account their derivation, structure and semantics. The article describes the tools used for ontological modeling of Tibetan compounds, with special attention to the problem of modeling the semantics of verbs and verbal compounds. Nominal and verbal compounds are considered separately, and it is noted that verbal compounds are no less important to the Tibetan language system than nominal compounds. Statistical data are given on the absolute frequency distribution of compounds of different types in the current version of the corpus annotation and on the number of ontology concepts associated with each class of compounds.

Original language: English
Title of host publication: IC3K 2019 - Proceedings of the 11th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management
Editors: Jan Dietz, David Aveiro, Joaquim Filipe
Publisher: SciTePress
Pages: 144-153
Number of pages: 10
ISBN (Electronic): 9789897583827
State: Published - 1 Jan 2019
Event: 11th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, IC3K 2019 - Vienna, Austria
Duration: 17 Sep 2019 - 19 Sep 2019

Publication series

Name: IC3K 2019 - Proceedings of the 11th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management
Volume: 2

Conference

Conference: 11th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, IC3K 2019
Country/Territory: Austria
City: Vienna
Period: 17/09/19 - 19/09/19

Scopus subject areas

  • Software

Research areas

  • Compounds, Computer Ontology, Corpus Linguistics, Immediate Constituents, Natural Language Processing, Tibetan Corpus, Tibetan Language

ID: 50358912