The ORD corpus is a representative resource of everyday spoken Russian. It contains about 1,000 hours of long-term audio recordings of daily communication, made in real settings by research volunteers. ORD macro episodes are large communication episodes united by the setting or scene of communication, the social roles of the participants, and their general activity. The paper describes the annotation principles used for tagging macro episodes, provides current statistics on the communication situations represented in the corpus, and identifies their most common types. Because communication situations are annotated with codes, these codes can be used as filters for selecting audio data, which makes it possible to study Russian everyday speech in different communication situations and to determine and describe various registers of spoken Russian. As an example, several high-frequency word lists drawn from different communication situations are compared. The annotation of macro episodes in the ORD corpus is a prerequisite for its further pragmatic annotation.
Original language: English
Title of host publication: Speech and Computer
Subtitle of host publication: 17th International Conference, SPECOM 2015, Athens, Greece, September 20-24, 2015, Proceedings
Publisher: Springer Nature
Pages: 268-276
ISBN (electronic): 978-3-319-23132-7
ISBN (print): 978-3-319-23131-0
Publication status: Published - 2015
Event: 17th International Conference on Speech and Computer, Athens, Greece
Duration: 20 Sep 2015 to 24 Sep 2015
http://specom.nw.ru/sites/2015/index.html

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer Nature
Volume: 9319
ISSN (print): 0302-9743

Conference

Conference: 17th International Conference on Speech and Computer
Abbreviated title: SPECOM 2015
Country/Territory: Greece
City: Athens
Period: 20/09/15 to 24/09/15

    Scopus subject areas

  • Language and Linguistics

ID: 71354745