Corpora are undoubtedly vital tools for linguistic studies and for solving applied tasks. Although the opportunities that corpora offer are very useful, further progress in linguistic research requires additional software, since it is impossible to process huge amounts of linguistic data manually. The Sketch Engine is a corpus tool that takes as input a corpus of any language and a corresponding set of grammar patterns. The paper describes the writing of a Sketch grammar for the Russian language as part of the Sketch Engine system. The system provides information about a word's collocability within specific dependency models and generates lists of the most frequent phrases for a given word based on the appropriate models. The paper deals with two different approaches to writing rules for the grammar, based on morphological information, and also with applying word sketches to the Russian language. The data indicate that such results may find extensive use in various fields of linguistics, such as dictionary compilation, language learning and teaching, translation (including machine translation), phraseology, and information retrieval.

Original language: English
Title of host publication: Proceedings of the 7th International Conference on Language Resources and Evaluation, LREC 2010
Editors: Daniel Tapias, Irene Russo, Olivier Hamon, Stelios Piperidis, Nicoletta Calzolari, Khalid Choukri, Joseph Mariani, Helene Mazo, Bente Maegaard, Jan Odijk, Mike Rosner
Publisher: European Language Resources Association (ELRA)
Pages: 3491-3494
Number of pages: 4
ISBN (electronic): 2951740867, 9782951740860
ISBN (print): 2-9517408-6-7
Status: Published - 2010
Event: 7th International Conference on Language Resources and Evaluation, LREC 2010 - Valletta, Malta
Duration: 17 May 2010 - 23 May 2010

Publication series

Name: Proceedings of the 7th International Conference on Language Resources and Evaluation, LREC 2010

Conference

Conference: 7th International Conference on Language Resources and Evaluation, LREC 2010
Country/Territory: Malta
City: Valletta
Period: 17/05/10 - 23/05/10

    Scopus subject areas

  • Education
  • Library and Information Sciences
  • Language and Linguistics
  • Linguistics and Language

ID: 4477384