The use of the Weibull and Haustein functions for approximating the dependence between sample size and the resulting vocabulary size is analyzed. A frequency dictionary of the short stories by A. P. Chekhov was chosen as the material for the experiment. The Haustein function is shown to be preferable, yielding far more precise approximation results. Processing the texts in chronological order is also justified for analyses along similar lines.
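As an illustration of the dependence the abstract refers to, the following is a minimal sketch, not the authors' procedure. It counts vocabulary size as a function of sample size over a token stream and evaluates a Weibull-type saturation curve against those points. The function names, the toy token stream, and the specific parameterization `v_max * (1 - exp(-(n / scale) ** shape))` are assumptions made here for illustration. The record does not state the exact form of either function used in the paper, and the Haustein function is left out because its form is not given.

```python
import math


def vocabulary_growth(tokens, step):
    """Return (sample_size, vocabulary_size) pairs, sampled every `step` tokens.

    The final token count is always included as the last point.
    """
    seen = set()
    points = []
    for i, token in enumerate(tokens, start=1):
        seen.add(token.lower())
        if i % step == 0 or i == len(tokens):
            points.append((i, len(seen)))
    return points


def weibull_curve(n, v_max, scale, shape):
    """Assumed Weibull-type saturation curve (hypothetical parameterization)."""
    return v_max * (1.0 - math.exp(-((n / scale) ** shape)))


def sum_squared_error(points, v_max, scale, shape):
    """Squared deviation between observed vocabulary sizes and the curve."""
    return sum((v - weibull_curve(n, v_max, scale, shape)) ** 2 for n, v in points)


# Toy token stream standing in for texts concatenated in chronological order.
tokens = "the cat sat on the mat and the dog sat on the cat".split()
points = vocabulary_growth(tokens, step=4)
print(points)  # [(4, 4), (8, 6), (12, 7), (13, 7)]

# Illustrative parameter values only; a real analysis would fit them to the data.
print(sum_squared_error(points, v_max=8.0, scale=6.0, shape=1.0))
```

In practice the parameters would be estimated by a least-squares fit over the observed points, and the same error measure could then be used to compare candidate functions.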
Translated title of the contribution: Approximation of the sample size - vocabulary size dependence
Original language: Russian
Title of host publication: Корпусная лингвистика – 2017 (Corpus Linguistics 2017)
Subtitle of host publication: Proceedings of the International Conference
Place of Publication: St. Petersburg
Publisher: Издательство Санкт-Петербургского университета (St. Petersburg University Press)
Pages: 151-156
State: Published - 2017
Event: Корпусная лингвистика - 2017 (Corpus Linguistics 2017) - St. Petersburg, Russian Federation
Duration: 27 Jul 2017 to 30 Jul 2017
http://phil.spbu.ru/nauka/konferencii/arhiv/konferencii-2016-2017-goda/mezhdunarodnaya-nauchnaya-konferenciya-korpusnaya-lingvistika-2017
https://events.spbu.ru/events/anons/corpora-2017/

Conference

Conference: Корпусная лингвистика - 2017 (Corpus Linguistics 2017)
Abbreviated title: CORPORA 2017
Country/Territory: Russian Federation
City: St. Petersburg
Period: 27/07/17 to 30/07/17

ID: 17315988