
Automatic discovery of various types of database dependencies (functional, inclusion, matching, and others) has received a great deal of attention in recent years. The problem is formulated as follows: given an unexplored dataset, find all dependencies that hold on this data. This formulation arises in business and scientific applications and is aimed at discovering patterns in data. Metanome is a pioneering platform that was used to benchmark existing dependency discovery algorithms and to develop new ones. It is notable as the first attempt to unify all existing discovery algorithms inside a single suite. However, it should be considered a research prototype rather than a system ready for industrial use. The core reasons for this are the choice of implementation platform (Java) and the absence of optimizations. In this paper we address the problem of high-performance dependency discovery. We present Desbordante, a platform that is intended to make the most of the available computational resources and thus to be more suitable for industrial use. Finally, we evaluate our system experimentally, pose a number of research questions related to the obtained performance, and justify the need for such a system. More precisely, we examine 1) whether the Java implementation is indeed worse than the C++ one, 2) whether simple tricks can improve Metanome's performance, 3) what the exact reasons behind the performance gap are, and 4) what the user-facing benefits of switching implementations are.

Original language: English
Title of host publication: Proceedings of the 29th Conference of Open Innovations Association FRUCT, FRUCT 2021
Editors: Sergey Balandin, Yevgeni Koucheryavy, Tatiana Tyutina
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 344-354
Number of pages: 11
ISBN (electronic): 9789526924458
Publication status: Published - 12 May 2021
Event: 29th Conference of Open Innovations Association FRUCT, FRUCT 2021 - Virtual, Tampere, Finland
Duration: 12 May 2021 to 14 May 2021

Publication series

Name: Conference of Open Innovation Association, FRUCT
Volume: 2021-May
ISSN (print): 2305-7254

Conference

Conference: 29th Conference of Open Innovations Association FRUCT, FRUCT 2021
Country/Territory: Finland
City: Virtual, Tampere
Period: 12/05/21 to 14/05/21

    Scopus subject areas

  • Computer Science (all)
  • Electrical and Electronic Engineering

ID: 85237755