Корпуса греческого языка: достижения, цели и задачи

Links

https://tronsky.iling.spb.ru/static/tronsky2018_01.pdf
Final published version

Тимофей Александрович Архангельский
Максим Львович Кисилиер

Methods of corpus linguistics are becoming increasingly important in Modern Greek studies. The first versions of the Thesaurus Linguae Graecae appeared more than forty years ago; despite some drawbacks, it has since become one of the most elaborate and functional corpora in the world. At least five corpora are relevant for Modern Greek. Most of them contain different types of data, differ in size (from 1.9 million tokens to 1.6 billion), and are designed for different tasks. A comparison of these corpora shows that none of them can currently fully replace the others; however, it is unlikely that all of them can be developed simultaneously. In this article we describe the Corpus of Modern Greek (http://web-corpora.net/GreekCorpus/search/?interface_language=en) and its unique features in order to demonstrate why and how it could take over the functions of most corpora of Modern Greek, except, probably, the Greek Web Corpus, elTenTen. Unlike other Greek corpora, the Corpus of Modern Greek was created by linguists for linguists and for non-professional users, and it does not require any special registration. Its structure allows it to handle different types of texts, including audio data. It has a powerful search engine that can take many detailed grammatical features into account. In addition, users of the Corpus of Modern Greek can find translations from English into Modern Greek. We hope that in the near future this option will be available for Russian as well.

Translated title of the contribution	Corpora of Modern Greek: achievements and goals
Original language	Russian
Pages (from-to)	50-59
Journal	Индоевропейское языкознание и классическая филология
Volume	XXII
Issue number	1
State	Published - 2018
Event	Индоевропейское языкознание и классическая филология (Чтения памяти И. М. Тронского), ИЛИ, Санкт-Петербург, Russian Federation; Duration: 18 Jun 2018 → 20 Jun 2018; Conference number: XXII

Scopus subject areas

Language and Linguistics

Research areas

corpus linguistics, Thesaurus Linguae Graecae, Modern Greek corpora, Corpus of Modern Greek, Greek diglossia