Abstract
In the last decade, linguists have become increasingly interested in corpus material, which allows a fresh approach to phenomena that have already been extensively described in academic works. The co-occurrence phenomenon has a dual nature: on the one hand, a linguistic component and, on the other, probabilistic (combinatorial) characteristics. The former has been described in numerous papers and explicitly defined in dictionaries, while the latter can be identified by a statistical approach. The present paper focuses on the process of building a gold standard that will include data from Russian dictionaries and corpora. The standard is being prepared for a Russian Collocations Database that already includes information on the collocability of words, extracted from text corpora by means of statistical measures and linguistic filters. The gold standard will also be used to evaluate the extracted collocations and to mark them as “true” collocations with references to the dictionaries.
Original language | English |
---|---|
Title of host publication | 18th Euralex International Congress, 2018 |
Editors | Vojko Gorjanc, Simon Krek, Jaka Cibej, Iztok Kosem |
Publisher | European Association for Lexicography |
Pages | 863-869 |
Number of pages | 7 |
ISBN (electronic) | 9789610600961 |
ISBN (print) | 9789610600978 |
Publication status | Published - 1 Jan 2018 |
Event | 18th Euralex International Congress, 2018 - Ljubljana, Slovenia. Duration: 17 Jul 2018 → 21 Jul 2018 |
Conference
Conference | 18th Euralex International Congress, 2018 |
---|---|
Country | Slovenia |
City | Ljubljana |
Period | 17/07/18 → 21/07/18 |
Scopus subject areas
- Language and Linguistics
Citation
Building a gold standard for a Russian collocations database. / Khokhlova, Maria.
18th Euralex International Congress, 2018. Ed. / Vojko Gorjanc; Simon Krek; Jaka Cibej; Iztok Kosem. European Association for Lexicography, 2018. pp. 863-869.
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution
TY - GEN
T1 - Building a gold standard for a Russian collocations database
AU - Khokhlova, Maria
PY - 2018/1/1
Y1 - 2018/1/1
N2 - In the last decade, linguists have become increasingly interested in corpus material, which allows a fresh approach to phenomena that have already been extensively described in academic works. The co-occurrence phenomenon has a dual nature: on the one hand, a linguistic component and, on the other, probabilistic (combinatorial) characteristics. The former has been described in numerous papers and explicitly defined in dictionaries, while the latter can be identified by a statistical approach. The present paper focuses on the process of building a gold standard that will include data from Russian dictionaries and corpora. The standard is being prepared for a Russian Collocations Database that already includes information on the collocability of words, extracted from text corpora by means of statistical measures and linguistic filters. The gold standard will also be used to evaluate the extracted collocations and to mark them as “true” collocations with references to the dictionaries.
KW - Collocations
KW - Corpora
KW - Database
KW - Dictionaries
KW - Russian language
UR - http://www.scopus.com/inward/record.url?scp=85059369182&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85059369182
SN - 9789610600978
SP - 863
EP - 869
BT - 18th Euralex International Congress, 2018
A2 - Gorjanc, Vojko
A2 - Krek, Simon
A2 - Cibej, Jaka
A2 - Kosem, Iztok
PB - European Association for Lexicography
ER -