Standard

Position caching in a column-store with late materialization: An initial study. / Galaktionov, Viacheslav; Klyuchikov, Evgeniy; Chernishev, George.

In: CEUR Workshop Proceedings, Vol. 2572, 2020, p. 89-93.

Research output: Contribution to journal › Conference article › peer-review

BibTeX

@article{76604bc1b64e415f8380258128c31b2d,
title = "Position caching in a column-store with late materialization: An initial study",
abstract = "A common technique to speed up DBMS query processing is to cache parts of query results and reuse them later. In this paper we propose a novel approach which is aimed specifically at caching intermediates in a late-materialization-oriented column-store. The idea of our approach is to cache positions (row numbers) instead of data values. The small size of positional representation is a valuable advantage: cache can accommodate more entries and consider intermediates that involve “heavy” operators, e.g. joins of large tables. Position caching thrives in late materialization environments since position exchange is prevalent in them. In particular, expensive predicates and heavy joins are usually processed based on positions. Our approach is able to cache them efficiently, thus significantly reducing system load. To assess the importance of intermediates our position caching technique features a cost model that is based on usage statistics and complexity estimations. Furthermore, to allow intermediate reuse for the queries that are not fully identical, we proposed an efficient query containment checking algorithm. Several policies for cache population and eviction were proposed. Finally, our approach is enhanced by lightweight compression schemes. Experimental evaluation was performed using a stream of randomly generated Star-Schema-Benchmark-like queries. It showed up to 3 times improvement in query run times. Additionally, compressing the intermediates reduces the space requirements by up to 2 times without a noticeable performance overhead.",
author = "Viacheslav Galaktionov and Evgeniy Klyuchikov and George Chernishev",
note = "Publisher Copyright: {\textcopyright} Copyright 2020 for this paper held by its author(s).; 22nd International Workshop on Design, Optimization, Languages and Analytical Processing of Big Data, DOLAP 2020 ; Conference date: 30-03-2020",
year = "2020",
language = "English",
volume = "2572",
pages = "89--93",
journal = "CEUR Workshop Proceedings",
issn = "1613-0073",
publisher = "RWTH Aachen University",

}

RIS

TY - JOUR

T1 - Position caching in a column-store with late materialization: An initial study

T2 - 22nd International Workshop on Design, Optimization, Languages and Analytical Processing of Big Data, DOLAP 2020

AU - Galaktionov, Viacheslav

AU - Klyuchikov, Evgeniy

AU - Chernishev, George

N1 - Publisher Copyright: © Copyright 2020 for this paper held by its author(s).

PY - 2020

Y1 - 2020

N2 - A common technique to speed up DBMS query processing is to cache parts of query results and reuse them later. In this paper we propose a novel approach which is aimed specifically at caching intermediates in a late-materialization-oriented column-store. The idea of our approach is to cache positions (row numbers) instead of data values. The small size of positional representation is a valuable advantage: cache can accommodate more entries and consider intermediates that involve “heavy” operators, e.g. joins of large tables. Position caching thrives in late materialization environments since position exchange is prevalent in them. In particular, expensive predicates and heavy joins are usually processed based on positions. Our approach is able to cache them efficiently, thus significantly reducing system load. To assess the importance of intermediates our position caching technique features a cost model that is based on usage statistics and complexity estimations. Furthermore, to allow intermediate reuse for the queries that are not fully identical, we proposed an efficient query containment checking algorithm. Several policies for cache population and eviction were proposed. Finally, our approach is enhanced by lightweight compression schemes. Experimental evaluation was performed using a stream of randomly generated Star-Schema-Benchmark-like queries. It showed up to 3 times improvement in query run times. Additionally, compressing the intermediates reduces the space requirements by up to 2 times without a noticeable performance overhead.

UR - http://www.scopus.com/inward/record.url?scp=85082452619&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:85082452619

VL - 2572

SP - 89

EP - 93

JO - CEUR Workshop Proceedings

JF - CEUR Workshop Proceedings

SN - 1613-0073

Y2 - 30 March 2020

ER -

ID: 98682156