МАРКОВСКИЙ МОМЕНТ ОСТАНОВКИ АГЛОМЕРАТИВНОГО ПРОЦЕССА КЛАСТЕРИЗАЦИИ В ЕВКЛИДОВОМ ПРОСТРАНСТВЕ

Research output: Contribution to journal › Article › peer-review

Department of Diagnostic of Functional Systems

A. V. Orekhov

When processing large arrays of empirical data, cluster analysis remains one of the primary methods of preliminary typology, which makes it necessary to have formal rules for calculating the number of clusters. The most common way to determine the preferred number of clusters is visual analysis of dendrograms, but this approach is purely heuristic. The number of clusters and the moment at which the clustering algorithm stops depend on each other. Cluster analysis of data from n-dimensional Euclidean space using the “single linkage” method can be considered as a discrete random process. Sequences of “minimum distances” define the trajectories of this process. The “approximation-estimating test” allows us to establish the Markov moment when the growth rate of such a sequence changes from linear to parabolic, which, in turn, may be a sign of the completion of the agglomerative clustering process. Calculating the number of clusters is a critical problem in many cases of automatic typology of empirical data, for example in medicine (cytometric analysis of blood), in automated text analysis, and in other instances when the number of clusters is not known in advance.
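
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the article's exact procedure, only a simplified reading of it. The following details are assumptions made for illustration and are not stated in the abstract: the function names, the sliding window of k = 4 consecutive minimum distances, the two least-squares models compared on each window (a line a*x + b and a parabola without a linear term c*x^2 + d), and the rule of stopping at the first window where the parabola fits strictly better. The single-linkage merge distances are obtained through the well-known equivalence with Kruskal's minimum spanning tree.

```python
import math
from itertools import combinations


def single_linkage_min_distances(points):
    """Return the single-linkage merge distances in merge order.

    Single-linkage merge heights coincide with the edge weights of a
    minimum spanning tree, so Kruskal's algorithm is used here.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    heights = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two different clusters
            parent[ri] = rj
            heights.append(d)
    return heights


def _sse(t, y):
    """Residual sum of squares of the least-squares fit y ~ a*t + b."""
    n = len(y)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    stt = sum((ti - t_mean) ** 2 for ti in t)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    sty = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    return syy - sty * sty / stt


def stopping_moment(heights, k=4, tol=1e-9):
    """Index of the first merge at which growth looks parabolic.

    Hypothetical simplified criterion: on each window of k consecutive
    merge distances, compare the error of a linear fit with the error
    of a fit by c*x**2 + d and stop when the parabola is strictly
    better. Returns None if no such window exists.
    """
    x = list(range(k))
    x_squared = [v * v for v in x]
    for start in range(len(heights) - k + 1):
        window = heights[start:start + k]
        if _sse(x, window) - _sse(x_squared, window) > tol:
            return start + k - 1  # index of the last merge in the window
    return None
```

If `stopping_moment` returns index `m` for `n` input points, then `n - m` clusters exist just before merge `m`, which in this sketch is taken as the preferred number of clusters. The window size and tolerance are tuning choices of the sketch, not values taken from the article.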

Translated title of the contribution	Markov moment for the agglomerative method of clustering in Euclidean space
Original language	Russian
Pages (from-to)	76-92
Number of pages	17
Journal	ВЕСТНИК САНКТ-ПЕТЕРБУРГСКОГО УНИВЕРСИТЕТА. ПРИКЛАДНАЯ МАТЕМАТИКА. ИНФОРМАТИКА. ПРОЦЕССЫ УПРАВЛЕНИЯ
Volume	15
Issue number	1
DOIs	https://doi.org/10.21638/11702/spbu10.2019.106
State	Published - 1 Jan 2019

Scopus subject areas

Control and Optimization
Applied Mathematics
Computer Science(all)

Research areas

Cluster analysis, Least squares method, Markov moment, NUMBER

ID: 41340292