Standard

Hybrid Materialization in a Disk-Based Column-Store. / Klyuchikov, Evgeniy; Chizhov, Anton; Polyntsov, Michael; Chernishev, George; Mikhailova, Elena.

CODS-COMAD '24: Proceedings of the 7th Joint International Conference on Data Science & Management of Data (11th ACM IKDD CODS and 29th COMAD). Association for Computing Machinery, 2024. pp. 164-172 (ACM International Conference Proceeding Series).

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Research; Peer-reviewed

Harvard

Klyuchikov, E, Chizhov, A, Polyntsov, M, Chernishev, G & Mikhailova, E 2024, Hybrid Materialization in a Disk-Based Column-Store. in CODS-COMAD '24: Proceedings of the 7th Joint International Conference on Data Science & Management of Data (11th ACM IKDD CODS and 29th COMAD). ACM International Conference Proceeding Series, Association for Computing Machinery, pp. 164-172, CODS-COMAD 2024: 7th Joint International Conference on Data Science & Management of Data (11th ACM IKDD CODS and 29th COMAD), Bangalore, India, 4/01/24. https://doi.org/10.1145/3632410.3632422

APA

Klyuchikov, E., Chizhov, A., Polyntsov, M., Chernishev, G., & Mikhailova, E. (2024). Hybrid Materialization in a Disk-Based Column-Store. In CODS-COMAD '24: Proceedings of the 7th Joint International Conference on Data Science & Management of Data (11th ACM IKDD CODS and 29th COMAD) (pp. 164-172). (ACM International Conference Proceeding Series). Association for Computing Machinery. https://doi.org/10.1145/3632410.3632422

Vancouver

Klyuchikov E, Chizhov A, Polyntsov M, Chernishev G, Mikhailova E. Hybrid Materialization in a Disk-Based Column-Store. In CODS-COMAD '24: Proceedings of the 7th Joint International Conference on Data Science & Management of Data (11th ACM IKDD CODS and 29th COMAD). Association for Computing Machinery. 2024. p. 164-172. (ACM International Conference Proceeding Series). https://doi.org/10.1145/3632410.3632422

Author

Klyuchikov, Evgeniy ; Chizhov, Anton ; Polyntsov, Michael ; Chernishev, George ; Mikhailova, Elena. / Hybrid Materialization in a Disk-Based Column-Store. CODS-COMAD '24: Proceedings of the 7th Joint International Conference on Data Science & Management of Data (11th ACM IKDD CODS and 29th COMAD). Association for Computing Machinery, 2024. pp. 164-172 (ACM International Conference Proceeding Series).

BibTeX

@inproceedings{80187ac0234143069d6888f8498a23dc,
title = "Hybrid Materialization in a Disk-Based Column-Store",
abstract = "In column-oriented query processing, a materialization strategy determines when lightweight positions (row IDs) are translated into tuples. It is an important part of column-store architecture, since it defines the class of supported query plans, and, therefore, impacts overall system performance. In this paper, we continue investigating materialization strategies for a distributed disk-based column-store. We start by demonstrating cases of existing approaches fundamentally limiting resulting system performance. In order to address them, we propose a new model of hybrid materialization. The main feature of hybrid materialization is the ability to manipulate both positions and values at the same time. This way, the query engine can efficiently combine advantages of all the existing strategies and support a new class of query plans. Moreover, hybrid materialization enables the query engine to flexibly customize the materialization policy of individual attributes. We describe our vision of how hybrid materialization can be implemented in a columnar system. As an example, we use PosDB - a distributed, disk-based column-store. We present necessary data structures, the internals of a hybrid operator, and describe the algebra of such operators. Based on this implementation, we evaluate performance of late, ultra-late, and hybrid materialization strategies in several scenarios based on TPC-H queries. Our experiments demonstrate that hybrid materialization is almost two times faster than its counterparts, while providing a more flexible query model.",
keywords = "Analytic workloads, Column-stores, Databases, Hybrid materialization, Late Materialization, Query engine, Query processing",
author = "Evgeniy Klyuchikov and Anton Chizhov and Michael Polyntsov and George Chernishev and Elena Mikhailova",
note = "DBLP License: DBLP's bibliographic metadata records provided through http://dblp.org/ are distributed under a Creative Commons CC0 1.0 Universal Public Domain Dedication. Although the bibliographic metadata records are provided consistent with CC0 1.0 Dedication, the content described by the metadata records is not. Content may be subject to copyright, rights of privacy, rights of publicity and other restrictions.; CODS-COMAD 2024: 7th Joint International Conference on Data Science \& Management of Data (11th ACM IKDD CODS and 29th COMAD) ; Conference date: 04-01-2024 Through 07-01-2024",
year = "2024",
month = jan,
day = "4",
doi = "10.1145/3632410.3632422",
language = "English",
isbn = "9798400716348",
series = "ACM International Conference Proceeding Series",
publisher = "Association for Computing Machinery",
pages = "164--172",
booktitle = "CODS-COMAD '24: Proceedings of the 7th Joint International Conference on Data Science \& Management of Data (11th ACM IKDD CODS and 29th COMAD)",
address = "United States",

}

RIS

TY - GEN

T1 - Hybrid Materialization in a Disk-Based Column-Store

AU - Klyuchikov, Evgeniy

AU - Chizhov, Anton

AU - Polyntsov, Michael

AU - Chernishev, George

AU - Mikhailova, Elena

N1 - DBLP License: DBLP's bibliographic metadata records provided through http://dblp.org/ are distributed under a Creative Commons CC0 1.0 Universal Public Domain Dedication. Although the bibliographic metadata records are provided consistent with CC0 1.0 Dedication, the content described by the metadata records is not. Content may be subject to copyright, rights of privacy, rights of publicity and other restrictions.

PY - 2024/1/4

Y1 - 2024/1/4

AB - In column-oriented query processing, a materialization strategy determines when lightweight positions (row IDs) are translated into tuples. It is an important part of column-store architecture, since it defines the class of supported query plans, and, therefore, impacts overall system performance. In this paper, we continue investigating materialization strategies for a distributed disk-based column-store. We start by demonstrating cases of existing approaches fundamentally limiting resulting system performance. In order to address them, we propose a new model of hybrid materialization. The main feature of hybrid materialization is the ability to manipulate both positions and values at the same time. This way, the query engine can efficiently combine advantages of all the existing strategies and support a new class of query plans. Moreover, hybrid materialization enables the query engine to flexibly customize the materialization policy of individual attributes. We describe our vision of how hybrid materialization can be implemented in a columnar system. As an example, we use PosDB - a distributed, disk-based column-store. We present necessary data structures, the internals of a hybrid operator, and describe the algebra of such operators. Based on this implementation, we evaluate performance of late, ultra-late, and hybrid materialization strategies in several scenarios based on TPC-H queries. Our experiments demonstrate that hybrid materialization is almost two times faster than its counterparts, while providing a more flexible query model.

KW - Analytic workloads

KW - Column-stores

KW - Databases

KW - Hybrid materialization

KW - Late Materialization

KW - Query engine

KW - Query processing

UR - https://www.mendeley.com/catalogue/a671c5fd-8c95-3194-a1be-ef2311c3bdb7/

U2 - 10.1145/3632410.3632422

DO - 10.1145/3632410.3632422

M3 - Conference contribution

SN - 9798400716348

T3 - ACM International Conference Proceeding Series

SP - 164

EP - 172

BT - CODS-COMAD '24: Proceedings of the 7th Joint International Conference on Data Science & Management of Data (11th ACM IKDD CODS and 29th COMAD)

PB - Association for Computing Machinery

T2 - CODS-COMAD 2024: 7th Joint International Conference on Data Science & Management of Data (11th ACM IKDD CODS and 29th COMAD)

Y2 - 4 January 2024 through 7 January 2024

ER -

ID: 116480178