Data preprocessing is an important stage in machine learning. Well-prepared data increases the accuracy of predictions, even with simple models. An algorithm for converting the output data of a numerical model to a format suitable for subsequent processing has been developed and implemented in program code. A detailed data preprocessing algorithm is presented for selecting the most representative cloud parameters (features). As a result, six optimal parameters have been chosen as indicators for forecasting dangerous convective phenomena (thunderstorm, heavy rain, hail): the vertical component of velocity; the temperature deviation from the ambient temperature; the relative humidity (above the water surface); the water vapour mixing ratio; the total droplet mixing ratio; and the vertical height of the cloud. Feature selection was performed using recursive feature elimination with cross-validation, which automatically tunes the number of selected features. Cloud parameters were taken at the mature stage of cloud development. Future work will examine how the evolution of the cloud parameters from the initial stage to the dissipation stage influences the probability of a dangerous phenomenon.
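The feature selection step described in the abstract can be sketched with scikit-learn's `RFECV`, which implements recursive feature elimination with cross-validated choice of the number of features. This is only an illustrative sketch, not the authors' code: the data are synthetic, the column names (including the two extra candidates `pressure` and `ice_mixing_ratio`), the logistic-regression estimator, and the 5-fold setup are all assumptions made for the example.

```python
import numpy as np
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

# Hypothetical column names: the six parameters named in the abstract,
# plus two extra candidates invented here purely for illustration.
FEATURES = [
    "vertical_velocity",
    "temperature_deviation",
    "relative_humidity",
    "vapour_mixing_ratio",
    "droplet_mixing_ratio",
    "cloud_height",
    "pressure",            # assumption, not from the source
    "ice_mixing_ratio",    # assumption, not from the source
]


def make_synthetic_clouds(n_samples=300, seed=0):
    """Build random stand-in data; it does NOT reproduce the paper's data."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, len(FEATURES)))
    # By construction, the label depends only on the first two columns.
    signal = 2.0 * X[:, 0] + 1.5 * X[:, 1]
    noise = rng.normal(scale=0.5, size=n_samples)
    y = (signal + noise > 0).astype(int)  # 1 = dangerous phenomenon
    return X, y


def select_features(X, y):
    """Drop one feature per step; keep the count with the best CV accuracy."""
    selector = RFECV(
        estimator=LogisticRegression(max_iter=1000),
        step=1,
        cv=StratifiedKFold(n_splits=5),
        scoring="accuracy",
        min_features_to_select=1,
    )
    selector.fit(X, y)
    return selector


X, y = make_synthetic_clouds()
selector = select_features(X, y)
selected = [name for name, keep in zip(FEATURES, selector.support_) if keep]
print("number of features kept:", selector.n_features_)
print("selected features:", selected)
```

After fitting, `selector.support_` is a boolean mask over the input columns and `selector.ranking_` gives the elimination order (rank 1 marks a kept feature). The estimator passed to `RFECV` must expose `coef_` or `feature_importances_`; a different model may rank the cloud parameters differently.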
Original language: English
Title of host publication: Computational Science and Its Applications – ICCSA 2018
Subtitle of host publication: 18th International Conference, Melbourne, VIC, Australia, July 2–5, 2018, Proceedings, Part IV
Publisher: Springer Nature
Pages: 149-159
State: Published - 2018
Event: 18th International Conference on Computational Science and Its Applications, ICCSA 2018 - Melbourne, Australia
Duration: 2 Jul 2018 – 5 Jul 2018

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer Nature
Volume: 10963
ISSN (Print): 0302-9743

Conference

Conference: 18th International Conference on Computational Science and Its Applications, ICCSA 2018
Country/Territory: Australia
City: Melbourne
Period: 2/07/18 – 5/07/18

    Research areas

  • Data preprocessing, Feature selection, Machine learning, Numerical model of convective cloud, Thunderstorm, Weather forecasting

    Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)

ID: 71301266