
We analyze the problem of processing very large datasets on parallel systems and find that the natural approaches to parallelization fail for two reasons: one is connected to long-range correlations in the data, and the other stems from the nonscalar nature of the data. To overcome these difficulties, a new data-processing paradigm is proposed, based on statistical simulation of the datasets. Depending on the type of data, it is realized through one of three approaches: decomposition of the statistical ensemble, decomposition based on the principle of mixing, and decomposition over the indexing variable. Examples of the proposed approach show that it scales very effectively.

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Editors: Marian Bubak, Geert Dick van Albada, Peter M.A. Sloot, Jack J. Dongarra
Publisher: Springer Nature
Pages: 239-246
Number of pages: 8
ISBN (print): 9783540221142
DOI
Publication status: Published - 2004

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3036
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

    Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

ID: 77309648