Standard

Desbordante: A Framework for Exploring Limits of Dependency Discovery Algorithms. / Strutovskiy, Maxim; Bobrov, Nikita; Smirnov, Kirill; Chernishev, George.

Proceedings of the 29th Conference of Open Innovations Association FRUCT, FRUCT 2021. ed. / Sergey Balandin; Yevgeni Koucheryavy; Tatiana Tyutina. Institute of Electrical and Electronics Engineers Inc., 2021. p. 344-354 9435469 (Conference of Open Innovation Association, FRUCT; Vol. 2021-May).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review

Harvard

Strutovskiy, M, Bobrov, N, Smirnov, K & Chernishev, G 2021, Desbordante: A Framework for Exploring Limits of Dependency Discovery Algorithms. in S Balandin, Y Koucheryavy & T Tyutina (eds), Proceedings of the 29th Conference of Open Innovations Association FRUCT, FRUCT 2021., 9435469, Conference of Open Innovation Association, FRUCT, vol. 2021-May, Institute of Electrical and Electronics Engineers Inc., pp. 344-354, 29th Conference of Open Innovations Association FRUCT, FRUCT 2021, Virtual, Tampere, Finland, 12/05/21. https://doi.org/10.23919/fruct52173.2021.9435469

APA

Strutovskiy, M., Bobrov, N., Smirnov, K., & Chernishev, G. (2021). Desbordante: A Framework for Exploring Limits of Dependency Discovery Algorithms. In S. Balandin, Y. Koucheryavy, & T. Tyutina (Eds.), Proceedings of the 29th Conference of Open Innovations Association FRUCT, FRUCT 2021 (pp. 344-354). [9435469] (Conference of Open Innovation Association, FRUCT; Vol. 2021-May). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.23919/fruct52173.2021.9435469

Vancouver

Strutovskiy M, Bobrov N, Smirnov K, Chernishev G. Desbordante: A Framework for Exploring Limits of Dependency Discovery Algorithms. In Balandin S, Koucheryavy Y, Tyutina T, editors, Proceedings of the 29th Conference of Open Innovations Association FRUCT, FRUCT 2021. Institute of Electrical and Electronics Engineers Inc. 2021. p. 344-354. 9435469. (Conference of Open Innovation Association, FRUCT). https://doi.org/10.23919/fruct52173.2021.9435469

Author

Strutovskiy, Maxim ; Bobrov, Nikita ; Smirnov, Kirill ; Chernishev, George. / Desbordante: A Framework for Exploring Limits of Dependency Discovery Algorithms. Proceedings of the 29th Conference of Open Innovations Association FRUCT, FRUCT 2021. editor / Sergey Balandin ; Yevgeni Koucheryavy ; Tatiana Tyutina. Institute of Electrical and Electronics Engineers Inc., 2021. pp. 344-354 (Conference of Open Innovation Association, FRUCT).

BibTeX

@inproceedings{806e6533a4bb49539f489a1fcf945d7d,
title = "Desbordante: A Framework for Exploring Limits of Dependency Discovery Algorithms",
abstract = "Automatic discovery of various types of database dependencies (functional, inclusion, matching, and others) is a topic that has received a great deal of attention in the recent years. The problem is formulated as following: having an unexplored dataset, find all dependencies that hold on this data. Such problem formulation arises in business and scientific applications and is aimed at the discovery of patterns in data. Metanome is a pioneering platform which was used to benchmark existing and develop new dependency discovery algorithms. It is notable since it was the first attempt to unify all existing discovery algorithms inside a single suite. However, it should be considered a research prototype rather than a system ready for industrial use. The core reason for this is the choice of the implementation platform (Java) and the absence of optimizations. In this paper we address the problem of high-performance dependency discovery. We present Desbordante - a platform that is intended to make the most of the available computational resources and thus to be more suitable for industrial use. Finally, we evaluate our system experimentally and pose a number of research questions related to the obtained performance and justify its necessity. More precisely we examine 1) whether the Java implementation is indeed worse than the C++ one, 2) is it possible to use simple tricks to improve Metanome's performance, 3) what are the exact reasons behind the performance gap, and 4) what are the user-facing benefits of switching the implementations. ",
author = "Maxim Strutovskiy and Nikita Bobrov and Kirill Smirnov and George Chernishev",
note = "Publisher Copyright: {\textcopyright} 2021 FRUCT.; 29th Conference of Open Innovations Association FRUCT, FRUCT 2021 ; Conference date: 12-05-2021 Through 14-05-2021",
year = "2021",
month = may,
day = "12",
doi = "10.23919/fruct52173.2021.9435469",
language = "English",
series = "Conference of Open Innovation Association, FRUCT",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "344--354",
editor = "Sergey Balandin and Yevgeni Koucheryavy and Tatiana Tyutina",
booktitle = "Proceedings of the 29th Conference of Open Innovations Association FRUCT, FRUCT 2021",
address = "United States",

}

RIS

TY - GEN

T1 - Desbordante: A Framework for Exploring Limits of Dependency Discovery Algorithms

AU - Strutovskiy, Maxim

AU - Bobrov, Nikita

AU - Smirnov, Kirill

AU - Chernishev, George

N1 - Publisher Copyright: © 2021 FRUCT.

PY - 2021/5/12

Y1 - 2021/5/12

AB - Automatic discovery of various types of database dependencies (functional, inclusion, matching, and others) is a topic that has received a great deal of attention in the recent years. The problem is formulated as following: having an unexplored dataset, find all dependencies that hold on this data. Such problem formulation arises in business and scientific applications and is aimed at the discovery of patterns in data. Metanome is a pioneering platform which was used to benchmark existing and develop new dependency discovery algorithms. It is notable since it was the first attempt to unify all existing discovery algorithms inside a single suite. However, it should be considered a research prototype rather than a system ready for industrial use. The core reason for this is the choice of the implementation platform (Java) and the absence of optimizations. In this paper we address the problem of high-performance dependency discovery. We present Desbordante - a platform that is intended to make the most of the available computational resources and thus to be more suitable for industrial use. Finally, we evaluate our system experimentally and pose a number of research questions related to the obtained performance and justify its necessity. More precisely we examine 1) whether the Java implementation is indeed worse than the C++ one, 2) is it possible to use simple tricks to improve Metanome's performance, 3) what are the exact reasons behind the performance gap, and 4) what are the user-facing benefits of switching the implementations.

UR - http://www.scopus.com/inward/record.url?scp=85107450137&partnerID=8YFLogxK

UR - https://www.mendeley.com/catalogue/adcbcd4a-3100-314e-9fcc-266e246d5c1b/

U2 - 10.23919/fruct52173.2021.9435469

DO - 10.23919/fruct52173.2021.9435469

M3 - Conference contribution

AN - SCOPUS:85107450137

T3 - Conference of Open Innovation Association, FRUCT

SP - 344

EP - 354

BT - Proceedings of the 29th Conference of Open Innovations Association FRUCT, FRUCT 2021

A2 - Balandin, Sergey

A2 - Koucheryavy, Yevgeni

A2 - Tyutina, Tatiana

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 29th Conference of Open Innovations Association FRUCT, FRUCT 2021

Y2 - 12 May 2021 through 14 May 2021

ER -
