Large data volumes today raise a number of challenges. The growth of databases places new and serious hardware requirements on data centers and, in recent years, has demanded greater investment in hardware, software, and the associated operations and management. Solving the main problems through the data center's hardware infrastructure is, admittedly, much cheaper than improving the software, but it is only a temporary measure. A more general solution to the big data problem must be sought: data processing methods need to be improved using the equipment that is already available. This article discusses methods of cleaning and transforming data within the Knowledge Discovery in Databases (KDD) process that prepare data for the fast application of data mining techniques. In particular, it shows how the method can significantly reduce the data selected for query building in NoSQL databases, using MongoDB as an example.
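As a rough illustration only (not the authors' implementation), the following Python/pymongo sketch shows the general idea of reducing the selected data before mining: filtering out malformed documents and projecting away unneeded fields in a MongoDB query. The collection name "measurements" and the fields "value" and "timestamp" are hypothetical and not taken from the paper.

    from pymongo import MongoClient

    # Connect to a local MongoDB instance (connection details are assumptions).
    client = MongoClient("mongodb://localhost:27017")
    collection = client["demo_db"]["measurements"]

    # Cleaning/selection step: drop documents with a missing or mis-typed
    # target field and project only the fields the mining step needs,
    # so far less data is selected and transferred to the client.
    cursor = collection.find(
        {"value": {"$exists": True, "$type": "double"}},  # keep only well-formed values
        {"_id": 0, "value": 1, "timestamp": 1},           # projection: only needed fields
    )

    values = [doc["value"] for doc in cursor]  # input for the subsequent mining step

In this sketch the reduction comes from pushing both the cleaning filter and the field projection into the database query itself, rather than fetching whole documents and cleaning them on the application side.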

Original language: English
Pages (from-to): 428-434
Number of pages: 7
Journal: CEUR Workshop Proceedings
Volume: 1787
State: Published - 1 Jan 2016
Event: 7th International Conference Distributed Computing and Grid-technologies in Science and Education, GRID 2016 - Dubna, Russian Federation
Duration: 4 Jul 2016 - 9 Jul 2016

    Scopus subject areas

  • Computer Science (all)

    Research areas

  • Big data, Data cleaning, Database, MongoDB, Query

ID: 33811613