This article describes the scheme for annotating pragmatic markers (PMs) in “One Day of Speech”, a corpus of Russian everyday speech. Pragmatic markers are defined as special speech units that perform only a pragmatic function and have no lexical meaning, or only a ‘bleached’ one. Pragmatic markers are usually annotated manually because the same marker can be ambiguous across contexts. The typology of pragmatic markers comprises several groups, each marked with its own annotation tag. The annotation process was split into two stages because several PM tagging issues arose. The main problems that occurred during annotation, and possible ways of solving them, are also discussed. The paper proposes improved methods for resolving problems in the annotation of pragmatic markers in a corpus of oral speech; these methods can also be useful for linguistic annotation at other levels of oral speech.

Original language: English
Pages (from-to): 1-16
Number of pages: 16
Journal: CEUR Workshop Proceedings
Volume: 2303
State: Published - 1 Jan 2018
Event: 2018 International Workshop on Computational Models in Language and Speech, CMLS 2018 - Kazan, Russian Federation
Duration: 1 Nov 2018 → …

    Research areas

  • Corpus annotation, Corpus linguistics, Corpus of everyday speech, Pragmatic marker, Spoken speech

    Scopus subject areas

  • Computer Science (all)

ID: 43825554