Standard

A study of several matrix-clustering vertical partitioning algorithms in a disk-based environment. / Galaktionov, Viacheslav; Chernishev, George; Smirnov, Kirill; Novikov, Boris; Grigoriev, Dmitry A.

Data Analytics and Management in Data Intensive Domains - XVIII International Conference, DAMDID/RCDL 2016, Revised Selected Papers. ed. / Yannis Manolopoulos; Leonid Kalinichenko; Sergei O. Kuznetsov. Springer Nature, 2017. p. 163-177 (Communications in Computer and Information Science; Vol. 706).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Harvard

Galaktionov, V, Chernishev, G, Smirnov, K, Novikov, B & Grigoriev, DA 2017, A study of several matrix-clustering vertical partitioning algorithms in a disk-based environment. in Y Manolopoulos, L Kalinichenko & SO Kuznetsov (eds), Data Analytics and Management in Data Intensive Domains - XVIII International Conference, DAMDID/RCDL 2016, Revised Selected Papers. Communications in Computer and Information Science, vol. 706, Springer Nature, pp. 163-177, 18th International Conference on Data Analytics and Management in Data-Intensive Domains, DAMDID 2016, Ershovo, Russian Federation, 11/10/16. https://doi.org/10.1007/978-3-319-57135-5_12

APA

Galaktionov, V., Chernishev, G., Smirnov, K., Novikov, B., & Grigoriev, D. A. (2017). A study of several matrix-clustering vertical partitioning algorithms in a disk-based environment. In Y. Manolopoulos, L. Kalinichenko, & S. O. Kuznetsov (Eds.), Data Analytics and Management in Data Intensive Domains - XVIII International Conference, DAMDID/RCDL 2016, Revised Selected Papers (pp. 163-177). (Communications in Computer and Information Science; Vol. 706). Springer Nature. https://doi.org/10.1007/978-3-319-57135-5_12

Vancouver

Galaktionov V, Chernishev G, Smirnov K, Novikov B, Grigoriev DA. A study of several matrix-clustering vertical partitioning algorithms in a disk-based environment. In Manolopoulos Y, Kalinichenko L, Kuznetsov SO, editors, Data Analytics and Management in Data Intensive Domains - XVIII International Conference, DAMDID/RCDL 2016, Revised Selected Papers. Springer Nature. 2017. p. 163-177. (Communications in Computer and Information Science). https://doi.org/10.1007/978-3-319-57135-5_12

Author

Galaktionov, Viacheslav ; Chernishev, George ; Smirnov, Kirill ; Novikov, Boris ; Grigoriev, Dmitry A. / A study of several matrix-clustering vertical partitioning algorithms in a disk-based environment. Data Analytics and Management in Data Intensive Domains - XVIII International Conference, DAMDID/RCDL 2016, Revised Selected Papers. editor / Yannis Manolopoulos ; Leonid Kalinichenko ; Sergei O. Kuznetsov. Springer Nature, 2017. pp. 163-177 (Communications in Computer and Information Science).

BibTeX

@inproceedings{dd2f3fb3f6564f46913a38578c403ceb,
title = "A study of several matrix-clustering vertical partitioning algorithms in a disk-based environment",
abstract = "In this paper we continue our efforts to evaluate matrix clustering algorithms. In our previous study we presented a test environment and results of preliminary experiments with the “separate” strategy for vertical partitioning. This strategy assigns a separate vertical partition for every cluster found by the algorithm, including inter-submatrix attribute group. In this paper we introduce two other strategies: the “replicate” strategy, which replicates inter-submatrix attributes to every cluster and the “retain” strategy, which assigns inter-submatrix attributes to their original clusters. We experimentally evaluate all strategies in a disk-based environment using the standard TPC-H workload and the PostgreSQL DBMS. We start with the study of record reconstruction methods in the PostgreSQL DBMS. Then, we apply partitioning strategies to three matrix clustering algorithms and evaluate both query performance and storage overhead of the resulting partitions. Finally, we compare the resulting partitioning schemes with the ideal partitioning scenario.",
keywords = "Database tuning, Experimentation, Fragmentation, Matrix clustering, PostgreSQL, TPC-H, Vertical partitioning",
author = "Viacheslav Galaktionov and George Chernishev and Kirill Smirnov and Boris Novikov and Grigoriev, {Dmitry A.}",
note = "Publisher Copyright: {\textcopyright} Springer International Publishing AG 2017. Copyright: Copyright 2017 Elsevier B.V., All rights reserved.; 18th International Conference on Data Analytics and Management in Data-Intensive Domains, DAMDID 2016 ; Conference date: 11-10-2016 Through 14-10-2016",
year = "2017",
doi = "10.1007/978-3-319-57135-5_12",
language = "English",
isbn = "9783319571348",
series = "Communications in Computer and Information Science",
publisher = "Springer Nature",
pages = "163--177",
editor = "Yannis Manolopoulos and Leonid Kalinichenko and Kuznetsov, {Sergei O.}",
booktitle = "Data Analytics and Management in Data Intensive Domains - XVIII International Conference, DAMDID/RCDL 2016, Revised Selected Papers",
address = "Germany",
}

RIS

TY - GEN

T1 - A study of several matrix-clustering vertical partitioning algorithms in a disk-based environment

AU - Galaktionov, Viacheslav

AU - Chernishev, George

AU - Smirnov, Kirill

AU - Novikov, Boris

AU - Grigoriev, Dmitry A.

N1 - Publisher Copyright: © Springer International Publishing AG 2017. Copyright: Copyright 2017 Elsevier B.V., All rights reserved.

PY - 2017

Y1 - 2017

N2 - In this paper we continue our efforts to evaluate matrix clustering algorithms. In our previous study we presented a test environment and results of preliminary experiments with the “separate” strategy for vertical partitioning. This strategy assigns a separate vertical partition for every cluster found by the algorithm, including inter-submatrix attribute group. In this paper we introduce two other strategies: the “replicate” strategy, which replicates inter-submatrix attributes to every cluster and the “retain” strategy, which assigns inter-submatrix attributes to their original clusters. We experimentally evaluate all strategies in a disk-based environment using the standard TPC-H workload and the PostgreSQL DBMS. We start with the study of record reconstruction methods in the PostgreSQL DBMS. Then, we apply partitioning strategies to three matrix clustering algorithms and evaluate both query performance and storage overhead of the resulting partitions. Finally, we compare the resulting partitioning schemes with the ideal partitioning scenario.

AB - In this paper we continue our efforts to evaluate matrix clustering algorithms. In our previous study we presented a test environment and results of preliminary experiments with the “separate” strategy for vertical partitioning. This strategy assigns a separate vertical partition for every cluster found by the algorithm, including inter-submatrix attribute group. In this paper we introduce two other strategies: the “replicate” strategy, which replicates inter-submatrix attributes to every cluster and the “retain” strategy, which assigns inter-submatrix attributes to their original clusters. We experimentally evaluate all strategies in a disk-based environment using the standard TPC-H workload and the PostgreSQL DBMS. We start with the study of record reconstruction methods in the PostgreSQL DBMS. Then, we apply partitioning strategies to three matrix clustering algorithms and evaluate both query performance and storage overhead of the resulting partitions. Finally, we compare the resulting partitioning schemes with the ideal partitioning scenario.

KW - Database tuning

KW - Experimentation

KW - Fragmentation

KW - Matrix clustering

KW - PostgreSQL

KW - TPC-H

KW - Vertical partitioning

UR - http://www.scopus.com/inward/record.url?scp=85018671430&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-57135-5_12

DO - 10.1007/978-3-319-57135-5_12

M3 - Conference contribution

AN - SCOPUS:85018671430

SN - 9783319571348

T3 - Communications in Computer and Information Science

SP - 163

EP - 177

BT - Data Analytics and Management in Data Intensive Domains - XVIII International Conference, DAMDID/RCDL 2016, Revised Selected Papers

A2 - Manolopoulos, Yannis

A2 - Kalinichenko, Leonid

A2 - Kuznetsov, Sergei O.

PB - Springer Nature

T2 - 18th International Conference on Data Analytics and Management in Data-Intensive Domains, DAMDID 2016

Y2 - 11 October 2016 through 14 October 2016

ER -

ID: 72709067