Statistical parameters commonly used in diagnostic procedures are in many cases not consistent in the statistical sense, because they depend strongly on sample size. This considerably devalues diagnostic results. This paper addresses the problem of verifying the consistency of parameters at the initial (pre-classification) stage of research. A complete list of parameters that may be useful for describing the lexicostatistical structure of a text was determined. Each of these parameters was subjected to the justifiability test. As a result, a number of consistent parameters were selected, which provide a tool for describing the system characteristics of any text or corpus. Because they converge rapidly to their limit values, they can be used effectively in classification procedures on text data of arbitrary size. The proposed approximation model also makes it possible to forecast the values of all parameters for any sample size.

Original language: English
Title of host publication: Text, Speech and Dialogue - 3rd International Workshop, TSD 2000, Proceedings
Editors: Petr Sojka, Ivan Kopecek, Karel Pala
Publisher: Springer Nature
Pages: 99-102
Number of pages: 4
ISBN (print): 3540410422, 9783540410423
DOI
Publication status: Published - 2000
Event: 3rd International Workshop on Text, Speech and Dialogue, TSD 2000 - Brno, Czech Republic
Duration: 13 Sep 2000 to 16 Sep 2000

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 1902
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: 3rd International Workshop on Text, Speech and Dialogue, TSD 2000
Country/Territory: Czech Republic
City: Brno
Period: 13/09/00 to 16/09/00

    Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

ID: 88462519