Idioms modeling in a computer ontology as a morphosyntactic disambiguation strategy: The case of Tibetan corpus of grammar treatises

Anastasia Dobrova, Pavel Grokhovskiy, Maria Smirnova, Nikolay Soms, Алексей Владимирович Добров

Research output

1 Citation (Scopus)

Abstract

The article describes the development of a computer ontology as a tool for processing Tibetan idioms. A computer ontology that contains a consistent specification of the meanings of lexical units, together with the relations between them, serves as a model of lexical semantics and of both syntactic and semantic valencies, reflecting the Tibetan linguistic picture of the world. The article proposes a classification of Tibetan idioms, including compounds, which are idiomatized clipped forms of syntactic groups that have frozen internal syntactic relations and often omit grammatical morphemes. It then applies this classification to idiom processing in the computer ontology and proposes methods of using the ontology to avoid ambiguity in idiom processing.

Original language: English
Title of host publication: Text, Speech, and Dialogue - 21st International Conference, TSD 2018, Proceedings
Editors: Petr Sojka, Aleš Horák, Ivan Kopecek, Karel Pala
Publisher: Springer
Pages: 76-83
Number of pages: 8
ISBN (Print): 9783030007935
DOIs: https://doi.org/10.1007/978-3-030-00794-2_8
Publication status: Published - 10 Sep 2018
Event: 21st International Conference on Text, Speech, and Dialogue, TSD 2018 - Brno
Duration: 11 Sep 2018 to 14 Sep 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11107 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 21st International Conference on Text, Speech, and Dialogue, TSD 2018
Country: Czech Republic
City: Brno
Period: 11/09/18 to 14/09/18

Fingerprint

Grammar
Ontology
Syntactics
Computer simulation
Modeling
Processing
Semantics
Linguistics
Classify
Specification
Specifications
Unit
Corpus
Strategy
Syntax
Model

Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Dobrova, A., Grokhovskiy, P., Smirnova, M., Soms, N., & Добров, А. В. (2018). Idioms modeling in a computer ontology as a morphosyntactic disambiguation strategy: The case of Tibetan corpus of grammar treatises. In P. Sojka, A. Horák, I. Kopecek, & K. Pala (Eds.), Text, Speech, and Dialogue - 21st International Conference, TSD 2018, Proceedings (pp. 76-83). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11107 LNAI). Springer. https://doi.org/10.1007/978-3-030-00794-2_8
@inproceedings{6c0f84193f034825bf0f26cbc6b6f9ec,
title = "Idioms modeling in a computer ontology as a morphosyntactic disambiguation strategy: The case of Tibetan corpus of grammar treatises",
abstract = "The article presents the experience of developing computer ontology as one of the tools for Tibetan idioms processing. A computer ontology that contains a consistent specification of meanings of lexical units with different relations between them represents a model of lexical semantics and both syntactic and semantic valencies, reflecting the Tibetan linguistic picture of the world. The article presents an attempt to classify Tibetan idioms, including compounds, which are idiomatized clips of syntactic groups that have frozen inner syntactic relations and are often characterized by omission of grammatical morphemes; and the application of this classification for idioms processing in computer ontology. The article also proposes methods of using computer ontology for avoiding idioms processing ambiguity.",
keywords = "Compounds, Computer ontology, Corpus linguistics, Idioms, Immediate constituents, Natural language processing, Tibetan corpus, Tibetan language",
author = "Anastasia Dobrova and Pavel Grokhovskiy and Maria Smirnova and Nikolay Soms and Добров, {Алексей Владимирович}",
year = "2018",
month = "9",
day = "10",
doi = "10.1007/978-3-030-00794-2_8",
language = "English",
isbn = "9783030007935",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer",
pages = "76--83",
editor = "Petr Sojka and Aleš Hor{\'a}k and Ivan Kopecek and Karel Pala",
booktitle = "Text, Speech, and Dialogue - 21st International Conference, TSD 2018, Proceedings",
address = "Germany",

}



TY - GEN

T1 - Idioms modeling in a computer ontology as a morphosyntactic disambiguation strategy

T2 - The case of Tibetan corpus of grammar treatises

AU - Dobrova, Anastasia

AU - Grokhovskiy, Pavel

AU - Smirnova, Maria

AU - Soms, Nikolay

AU - Добров, Алексей Владимирович

PY - 2018/9/10

Y1 - 2018/9/10

N2 - The article presents the experience of developing computer ontology as one of the tools for Tibetan idioms processing. A computer ontology that contains a consistent specification of meanings of lexical units with different relations between them represents a model of lexical semantics and both syntactic and semantic valencies, reflecting the Tibetan linguistic picture of the world. The article presents an attempt to classify Tibetan idioms, including compounds, which are idiomatized clips of syntactic groups that have frozen inner syntactic relations and are often characterized by omission of grammatical morphemes; and the application of this classification for idioms processing in computer ontology. The article also proposes methods of using computer ontology for avoiding idioms processing ambiguity.


KW - Compounds

KW - Computer ontology

KW - Corpus linguistics

KW - Idioms

KW - Immediate constituents

KW - Natural language processing

KW - Tibetan corpus

KW - Tibetan language

UR - http://www.scopus.com/inward/record.url?scp=85053886837&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-00794-2_8

DO - 10.1007/978-3-030-00794-2_8

M3 - Conference contribution

AN - SCOPUS:85053886837

SN - 9783030007935

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 76

EP - 83

BT - Text, Speech, and Dialogue - 21st International Conference, TSD 2018, Proceedings

A2 - Sojka, Petr

A2 - Horák, Aleš

A2 - Kopecek, Ivan

A2 - Pala, Karel

PB - Springer

ER -
