Standard

Using a Decision Tree for the Clustering Problem. / Гадасина, Людмила Викторовна; Романов, Дмитрий Вячеславович.

Proceedings - 2024 International Russian Smart Industry Conference, SmartIndustryCon 2024. Institute of Electrical and Electronics Engineers Inc., 2024. p. 273-279.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review

Harvard

Гадасина, ЛВ & Романов, ДВ 2024, Using a Decision Tree for the Clustering Problem. in Proceedings - 2024 International Russian Smart Industry Conference, SmartIndustryCon 2024. Institute of Electrical and Electronics Engineers Inc., pp. 273-279, International Scientific and Practical Conference "Industry 4.0", Sochi, 24/03/24. https://doi.org/10.1109/smartindustrycon61328.2024.10515744

APA

Гадасина, Л. В., & Романов, Д. В. (2024). Using a Decision Tree for the Clustering Problem. In Proceedings - 2024 International Russian Smart Industry Conference, SmartIndustryCon 2024 (pp. 273-279). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/smartindustrycon61328.2024.10515744

Vancouver

Гадасина ЛВ, Романов ДВ. Using a Decision Tree for the Clustering Problem. In Proceedings - 2024 International Russian Smart Industry Conference, SmartIndustryCon 2024. Institute of Electrical and Electronics Engineers Inc. 2024. p. 273-279. https://doi.org/10.1109/smartindustrycon61328.2024.10515744

Author

Гадасина, Людмила Викторовна ; Романов, Дмитрий Вячеславович. / Using a Decision Tree for the Clustering Problem. Proceedings - 2024 International Russian Smart Industry Conference, SmartIndustryCon 2024. Institute of Electrical and Electronics Engineers Inc., 2024. pp. 273-279

BibTeX

@inproceedings{5fc718ff294e443eb447a59f025e3ac0,
title = "Using a Decision Tree for the Clustering Problem",
abstract = "Clustering of datasets with mixed type data: quantitative and categorical is difficult by classical methods of cluster analysis. The study proposes to solve the problem of clustering multivariate data using a decision tree. This method requires setting the number of clusters, limiting the maximum proportion of observations that fall into each cluster, and also requires setting a target variable. The latter requirement can be fulfilled by an expert method or by experimenting with different target variables. The method was tested on the data of advertisements about residential real estate for sale in St. Petersburg. At first, tree clustering method was tested on the dataset with only quantitative data, then on the dataset with mixed types of data: quantitative and categorical. The results were compared with the results of the hierarchical method with different distance metrics. The proposed method does not require data standardization, has a higher speed of operation than hierarchical clustering and shows a clearer interpretation of the clustering results.",
keywords = "classification, clustering, decision tree, regression",
author = "Гадасина, {Людмила Викторовна} and Романов, {Дмитрий Вячеславович}",
year = "2024",
month = mar,
day = "25",
doi = "10.1109/smartindustrycon61328.2024.10515744",
language = "English",
isbn = "9798350395044",
pages = "273--279",
booktitle = "Proceedings - 2024 International Russian Smart Industry Conference, SmartIndustryCon 2024",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",
note = "International Scientific and Practical Conference {"}Industry 4.0{"}, SmartIndustryCon-2024; Conference date: 24-03-2024 through 30-03-2024",
url = "https://smartindustrycon.ru/",
}

RIS

TY  - GEN
T1  - Using a Decision Tree for the Clustering Problem
AU  - Гадасина, Людмила Викторовна
AU  - Романов, Дмитрий Вячеславович
N1  - Conference code: 4
PY  - 2024/3/25
Y1  - 2024/3/25
N2  - Clustering of datasets with mixed type data: quantitative and categorical is difficult by classical methods of cluster analysis. The study proposes to solve the problem of clustering multivariate data using a decision tree. This method requires setting the number of clusters, limiting the maximum proportion of observations that fall into each cluster, and also requires setting a target variable. The latter requirement can be fulfilled by an expert method or by experimenting with different target variables. The method was tested on the data of advertisements about residential real estate for sale in St. Petersburg. At first, tree clustering method was tested on the dataset with only quantitative data, then on the dataset with mixed types of data: quantitative and categorical. The results were compared with the results of the hierarchical method with different distance metrics. The proposed method does not require data standardization, has a higher speed of operation than hierarchical clustering and shows a clearer interpretation of the clustering results.
AB  - Clustering of datasets with mixed type data: quantitative and categorical is difficult by classical methods of cluster analysis. The study proposes to solve the problem of clustering multivariate data using a decision tree. This method requires setting the number of clusters, limiting the maximum proportion of observations that fall into each cluster, and also requires setting a target variable. The latter requirement can be fulfilled by an expert method or by experimenting with different target variables. The method was tested on the data of advertisements about residential real estate for sale in St. Petersburg. At first, tree clustering method was tested on the dataset with only quantitative data, then on the dataset with mixed types of data: quantitative and categorical. The results were compared with the results of the hierarchical method with different distance metrics. The proposed method does not require data standardization, has a higher speed of operation than hierarchical clustering and shows a clearer interpretation of the clustering results.
KW  - classification
KW  - clustering
KW  - decision tree
KW  - regression
UR  - https://www.mendeley.com/catalogue/8d69ac3e-1ba3-3b2a-839a-ebb45e72897c/
U2  - 10.1109/smartindustrycon61328.2024.10515744
DO  - 10.1109/smartindustrycon61328.2024.10515744
M3  - Conference contribution
SN  - 9798350395044
SP  - 273
EP  - 279
BT  - Proceedings - 2024 International Russian Smart Industry Conference, SmartIndustryCon 2024
PB  - Institute of Electrical and Electronics Engineers Inc.
T2  - International Scientific and Practical Conference "Industry 4.0"
Y2  - 24 March 2024 through 30 March 2024
ER  -

ID: 121544368