On-the-fly filtering of aggregation results in column-stores

Aggregation is a database operation that provides basic analytic capabilities by partitioning source data into groups and computing a function over the values in each group. It is common in databases, especially in the OLAP domain, which is the primary application area for column-stores. In this paper we propose a novel approach to the design of an aggregation operator inside a column-store system. The core of our approach is an analysis of the predicates in the HAVING clause that allows groups to be pruned at run time. We employ monotonicity and codomain analysis to detect groups for which the predicates can never be satisfied. Ultimately, we aim to save I/O and CPU costs by discarding groups as early as possible. We start with a high-level overview of our approach and describe its use cases. Then we give a short introduction to our system and describe a straightforward implementation of an aggregation operator. Next, we provide the theoretical foundations of our approach and present an improved algorithm. Finally, we present an experimental validation of our approach inside PosDB, a distributed, disk-based column-store engine that features late materialization and block-oriented processing. Experiments on an SSD show that our approach can provide up to a fivefold improvement over the naive version.
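
To make the pruning idea concrete, here is a minimal sketch. It is not the PosDB algorithm, only an illustration under stated assumptions. It evaluates a query of the form `GROUP BY key HAVING SUM(value) < threshold` and assumes every value is non-negative (a codomain fact), so each running sum can only grow (monotonicity). Once a partial sum reaches the threshold, the group can never satisfy the predicate and is discarded. The function name, the row layout, and the `skipped` counter are hypothetical and chosen for illustration.

```python
from typing import Dict, Iterable, Set, Tuple


def sum_having_less_than(
    rows: Iterable[Tuple[str, int]], threshold: int
) -> Tuple[Dict[str, int], int]:
    """Sketch of: GROUP BY key HAVING SUM(value) < threshold.

    Assumption (illustrative): all values are >= 0, so a group's running
    sum never decreases. Returns the surviving groups with their sums and
    the number of rows that were skipped because their group was pruned.
    """
    partial: Dict[str, int] = {}  # running sums of groups still alive
    pruned: Set[str] = set()      # groups that can never satisfy HAVING
    skipped = 0

    for key, value in rows:
        if key in pruned:
            # The group is already ruled out; its value need not be used.
            skipped += 1
            continue
        if value < 0:
            # The monotonicity argument breaks for negative values.
            raise ValueError("this sketch assumes non-negative values")
        total = partial.get(key, 0) + value
        if total >= threshold:
            # The sum can only grow, so SUM(value) < threshold is unreachable.
            partial.pop(key, None)
            pruned.add(key)
        else:
            partial[key] = total

    return partial, skipped
```

In this sketch the saving is only the skipped additions. In a column-store with late materialization, the analogous step would presumably avoid reading the value column for rows of pruned groups, which is where I/O savings would come from.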

Original language	English
Pages (from-to)	53-60
Number of pages	8
Journal	CEUR Workshop Proceedings
Volume	2135
Publication status	Published - 1 Jan 2018
Event	3rd Conference on Software Engineering and Information Management, SEIM 2018 - Saint Petersburg, Russian Federation Duration: 14 Apr 2018 → …

Scopus subject areas

Computer Science (all)

ID: 35272468