APPLYING CLUSTERING ANALYSIS FOR DISCOVERING TIME SERIES HETEROGENEITY USING SAINT PETERSBURG MORBIDITY RATE AS AN ILLUSTRATION

Standard

APPLYING CLUSTERING ANALYSIS FOR DISCOVERING TIME SERIES HETEROGENEITY USING SAINT PETERSBURG MORBIDITY RATE AS AN ILLUSTRATION. / Bure, V. M.; Staroverova, K. Yu.

In: ВЕСТНИК САНКТ-ПЕТЕРБУРГСКОГО УНИВЕРСИТЕТА. СЕРИЯ 10: ПРИКЛАДНАЯ МАТЕМАТИКА, ИНФОРМАТИКА, ПРОЦЕССЫ УПРАВЛЕНИЯ, Vol. 12, No. 4, 2016, pp. 44-50.

Research output: Scientific publications in periodicals › Article › Peer-reviewed

Harvard

Bure, VM & Staroverova, KY 2016, 'APPLYING CLUSTERING ANALYSIS FOR DISCOVERING TIME SERIES HETEROGENEITY USING SAINT PETERSBURG MORBIDITY RATE AS AN ILLUSTRATION', ВЕСТНИК САНКТ-ПЕТЕРБУРГСКОГО УНИВЕРСИТЕТА. СЕРИЯ 10: ПРИКЛАДНАЯ МАТЕМАТИКА, ИНФОРМАТИКА, ПРОЦЕССЫ УПРАВЛЕНИЯ, vol. 12, no. 4, pp. 44-50. https://doi.org/10.21638/11701/spbu10.2016.404

APA

Bure, V. M., & Staroverova, K. Y. (2016). APPLYING CLUSTERING ANALYSIS FOR DISCOVERING TIME SERIES HETEROGENEITY USING SAINT PETERSBURG MORBIDITY RATE AS AN ILLUSTRATION. ВЕСТНИК САНКТ-ПЕТЕРБУРГСКОГО УНИВЕРСИТЕТА. СЕРИЯ 10: ПРИКЛАДНАЯ МАТЕМАТИКА, ИНФОРМАТИКА, ПРОЦЕССЫ УПРАВЛЕНИЯ, 12(4), 44-50. https://doi.org/10.21638/11701/spbu10.2016.404

Vancouver

Bure VM, Staroverova KY. APPLYING CLUSTERING ANALYSIS FOR DISCOVERING TIME SERIES HETEROGENEITY USING SAINT PETERSBURG MORBIDITY RATE AS AN ILLUSTRATION. ВЕСТНИК САНКТ-ПЕТЕРБУРГСКОГО УНИВЕРСИТЕТА. СЕРИЯ 10: ПРИКЛАДНАЯ МАТЕМАТИКА, ИНФОРМАТИКА, ПРОЦЕССЫ УПРАВЛЕНИЯ. 2016;12(4):44-50. https://doi.org/10.21638/11701/spbu10.2016.404

Author

Bure, V. M.; Staroverova, K. Yu. / APPLYING CLUSTERING ANALYSIS FOR DISCOVERING TIME SERIES HETEROGENEITY USING SAINT PETERSBURG MORBIDITY RATE AS AN ILLUSTRATION. In: ВЕСТНИК САНКТ-ПЕТЕРБУРГСКОГО УНИВЕРСИТЕТА. СЕРИЯ 10: ПРИКЛАДНАЯ МАТЕМАТИКА, ИНФОРМАТИКА, ПРОЦЕССЫ УПРАВЛЕНИЯ. 2016; Vol. 12, No. 4. pp. 44-50.

BibTeX

@article{e7808e35bb81449daa3665f747b171e5,

title = "APPLYING CLUSTERING ANALYSIS FOR DISCOVERING TIME SERIES HETEROGENEITY USING SAINT PETERSBURG MORBIDITY RATE AS AN ILLUSTRATION",

abstract = "Clustering is one of the machine learning approaches to unsupervised learning. Its task is to explore the structure of data by grouping a set of objects so that objects in the same group are more similar to each other than to objects from different groups. Significant tasks of cluster analysis include determining the number of clusters in a data set, searching for stable clusters, and selecting a dissimilarity measure and an algorithm. Multidimensional clustering is often used when an object is characterized by a vector, and the dissimilarity measure or distance is selected according to the purpose and features of the task. In fields such as economics, geology, medicine, and sociology, however, data are often represented by time series. A time series is a random process rather than a random vector, so it is important to construct a similarity (or dissimilarity) measure that takes the time dependence of the data into account. The morbidity rate of Saint Petersburg from 1999 to 2014 is studied, and its 18 districts are clustered using several different similarity measures. Clustering of multidimensional time series is also of interest, and there are two approaches to it. The first splits a multidimensional time series into several univariate time series, whilst the second treats it as a whole, which preserves the influence of data interdependence. The research uses the TSclust and tseries packages in R, and the algorithms missing from these packages are implemented in R. Clustering the Saint Petersburg districts with several similarity measures yields three stable clusters, but seven districts do not belong to any cluster. Refs 10. Figs 2.",

keywords = "cluster analysis, clustering, time series similarity measure, stable clusters, cluster stability",

author = "Bure, {V. M.} and Staroverova, {K. Yu.}",

year = "2016",

doi = "10.21638/11701/spbu10.2016.404",

language = "English",

volume = "12",

pages = "44--50",

journal = "ВЕСТНИК САНКТ-ПЕТЕРБУРГСКОГО УНИВЕРСИТЕТА. ПРИКЛАДНАЯ МАТЕМАТИКА. ИНФОРМАТИКА. ПРОЦЕССЫ УПРАВЛЕНИЯ",

issn = "1811-9905",

publisher = "Издательство Санкт-Петербургского университета",

number = "4",

}

RIS

TY - JOUR

T1 - APPLYING CLUSTERING ANALYSIS FOR DISCOVERING TIME SERIES HETEROGENEITY USING SAINT PETERSBURG MORBIDITY RATE AS AN ILLUSTRATION

AU - Bure, V. M.

AU - Staroverova, K. Yu.

PY - 2016

Y1 - 2016

N2 - Clustering is one of the machine learning approaches to unsupervised learning. Its task is to explore the structure of data by grouping a set of objects so that objects in the same group are more similar to each other than to objects from different groups. Significant tasks of cluster analysis include determining the number of clusters in a data set, searching for stable clusters, and selecting a dissimilarity measure and an algorithm. Multidimensional clustering is often used when an object is characterized by a vector, and the dissimilarity measure or distance is selected according to the purpose and features of the task. In fields such as economics, geology, medicine, and sociology, however, data are often represented by time series. A time series is a random process rather than a random vector, so it is important to construct a similarity (or dissimilarity) measure that takes the time dependence of the data into account. The morbidity rate of Saint Petersburg from 1999 to 2014 is studied, and its 18 districts are clustered using several different similarity measures. Clustering of multidimensional time series is also of interest, and there are two approaches to it. The first splits a multidimensional time series into several univariate time series, whilst the second treats it as a whole, which preserves the influence of data interdependence. The research uses the TSclust and tseries packages in R, and the algorithms missing from these packages are implemented in R. Clustering the Saint Petersburg districts with several similarity measures yields three stable clusters, but seven districts do not belong to any cluster. Refs 10. Figs 2.

KW - cluster analysis

KW - clustering

KW - time series similarity measure

KW - stable clusters

KW - cluster stability

UR - http://vestnik.spbu.ru/html16/s10/s10v4/04.pdf

U2 - 10.21638/11701/spbu10.2016.404

DO - 10.21638/11701/spbu10.2016.404

M3 - Article

VL - 12

SP - 44

EP - 50

JO - ВЕСТНИК САНКТ-ПЕТЕРБУРГСКОГО УНИВЕРСИТЕТА. ПРИКЛАДНАЯ МАТЕМАТИКА. ИНФОРМАТИКА. ПРОЦЕССЫ УПРАВЛЕНИЯ

JF - ВЕСТНИК САНКТ-ПЕТЕРБУРГСКОГО УНИВЕРСИТЕТА. ПРИКЛАДНАЯ МАТЕМАТИКА. ИНФОРМАТИКА. ПРОЦЕССЫ УПРАВЛЕНИЯ

SN - 1811-9905

IS - 4

ER -

ID: 9291852