Research output: Chapter in Book/Report/Conference proceeding › Chapter › Research › peer-review
We analyze the problem of processing very large datasets on parallel systems and find that the natural approaches to parallelization fail for two reasons: one is connected to long-range correlations between data, and the other stems from the nonscalar nature of the data. To overcome these difficulties, a new paradigm of data processing is proposed, based on statistical simulation of the datasets. Depending on the type of data, it is realized through one of three approaches: decomposition of the statistical ensemble, decomposition based on the principle of mixing, and decomposition over the indexing variable. Examples of the proposed approach show that it scales very effectively.
Original language | English |
---|---|
Title of host publication | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
Editors | Marian Bubak, Geert Dick van Albada, Peter M.A. Sloot, Jack J. Dongarra |
Publisher | Springer Nature |
Pages | 239-246 |
Number of pages | 8 |
ISBN (Print) | 9783540221142 |
State | Published - 2004 |
Series name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
Volume | 3036 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
ID: 77309648