Validity of Lingvo-Statistical Parameters for the Corpus of Fiction

DOI

https://doi.org/10.1007/978-981-96-0990-1_2
Final published version

Finding variables and statistical metrics to describe rank distributions of lexemes is a relevant linguistic task. We analyze the validity of two lingvo-statistical parameters (the rank mean and entropy) for describing the frequency dictionary of fiction. We also compare the Weibull and Haustein functions as approximations of how the values of these parameters grow. The research draws on a representative sample from the Corpus of the Russian Short Stories (1900–1930), more than 1,000,000 tokens in total. The rank mean is shown to be only a relatively valid parameter for describing a large-scale corpus of fiction, while the relative validity of entropy is greatly affected by the nature of the texts analyzed. The Weibull function is shown to be preferable for approximating the parameters' growth.
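As an illustration of the two parameters named in the abstract, the sketch below computes a rank mean and a Shannon entropy from a toy frequency dictionary. The definitions used here are assumptions, not taken from the paper: the rank mean is taken as the probability-weighted mean of lexeme ranks, and entropy as the base-2 Shannon entropy of the relative frequencies. The function name `rank_statistics` and the sample sentence are hypothetical.

```python
import math
from collections import Counter


def rank_statistics(tokens):
    """Return (rank_mean, entropy) for a list of tokens.

    Assumed definitions (not confirmed by the paper):
      rank_mean = sum over ranks r of r * p_r
      entropy   = -sum over ranks r of p_r * log2(p_r)
    where p_r is the relative frequency of the lexeme at rank r.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    # Frequency dictionary ordered by descending frequency (rank 1 = most frequent).
    freqs = sorted(counts.values(), reverse=True)
    probs = [f / total for f in freqs]
    rank_mean = sum(rank * p for rank, p in enumerate(probs, start=1))
    entropy = -sum(p * math.log2(p) for p in probs)
    return rank_mean, entropy


# Toy input; a real study would use lemmatized corpus tokens.
sample_tokens = "the cat saw the dog and the dog saw the cat".split()
rank_mean, entropy = rank_statistics(sample_tokens)
print(f"rank mean = {rank_mean:.4f}, entropy = {entropy:.4f} bits")
```

Tracking how these two values change as the sample grows gives the growth curves that an approximating function would then be fitted to.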

Original language	English
Title of host publication	Literature, Language and Computing
Subtitle of host publication	Russian Contribution from the LiLaC-2023
Place of publication	Singapore
Publisher	Springer Nature
Pages	15-21
Number of pages	7
ISBN (electronic)	978-981-96-0990-1
ISBN (print)	978-981-96-0989-5
DOI	https://doi.org/10.1007/978-981-96-0990-1_2
Status	Published - Mar 2025
Event	International Conference "Literature, Language and Computing" (LiLaC: Literature, Language and Computing: Russian Contribution) - SPbU, Saint Petersburg, Russian Federation. Duration: 9 Nov 2023 → 11 Nov 2023 https://conference-spbu.ru/conference/49/

Conference

Conference	International Conference "Literature, Language and Computing" (LiLaC: Literature, Language and Computing: Russian Contribution)
Short title	LiLaC 2023
Country/Territory	Russian Federation
City	Saint Petersburg
Period	9 Nov 2023 → 11 Nov 2023
Website	https://conference-spbu.ru/conference/49/

ID: 133399730