Handling missing value on meteorological data classification with rough set based algorithm

Winda Aprianti*, Imam Mukhlash

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

Data mining is the process of finding patterns and knowledge in a large database. One data mining task is classification, the process of finding rules that predict the class of objects in the database. Weather plays an important role in human life, in areas such as social and economic welfare, agriculture, disaster management, and finance, so weather prediction is needed for planning in various fields. Most databases suffer from incompleteness, which is caused by faulty manual data entry procedures, incorrect measurements, equipment faults, and many other factors. In this research, we use an incomplete meteorological dataset. Before applying the rough set approach to obtain rules, the incomplete dataset is converted into a complete dataset by replacing each missing value with the average value of the records that have the same decision class. We then find the lower and upper approximations, obtaining certain rules from the lower approximation and possible rules from the upper approximation. To test the performance of this algorithm, we applied the rules to test data. Results on datasets containing 5%, 10%, 15%, 20%, 25%, and 30% missing values show that the accuracy of the rules decreases as the proportion of missing values increases, and that the number of rules does not affect the accuracy of the rules. The rules produced by the rough set algorithm are effective for predicting rainfall on datasets containing less than 25% missing values.
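The two steps described in the abstract, class-wise mean imputation followed by lower and upper approximation, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy attributes (`humidity`, `wind`, `rain`), and the assumption that condition attributes are already discretized before the approximations are computed are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def impute_by_class_mean(records, decision_key):
    """Fill None values with the attribute mean over records of the same decision class."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for r in records:
        for k, v in r.items():
            if k != decision_key and v is not None:
                sums[(r[decision_key], k)] += v
                counts[(r[decision_key], k)] += 1
    completed = []
    for r in records:
        filled = dict(r)
        for k, v in r.items():
            if k != decision_key and v is None:
                key = (r[decision_key], k)
                if counts[key]:  # stay None if the class has no observed value
                    filled[k] = sums[key] / counts[key]
        completed.append(filled)
    return completed


def approximations(records, condition_keys, decision_key, target):
    """Return (lower, upper) approximations of the target class as sets of record indices."""
    # Records with identical condition values are indiscernible and share a block.
    blocks = defaultdict(set)
    for i, r in enumerate(records):
        blocks[tuple(r[k] for k in condition_keys)].add(i)
    concept = {i for i, r in enumerate(records) if r[decision_key] == target}
    lower, upper = set(), set()
    for block in blocks.values():
        if block <= concept:   # block lies entirely inside the class: certain
            lower |= block
        if block & concept:    # block overlaps the class: possible
            upper |= block
    return lower, upper


# Hypothetical toy data (not from the paper).
raw = [
    {"humidity": 80.0, "rain": "yes"},
    {"humidity": None, "rain": "yes"},
    {"humidity": 90.0, "rain": "yes"},
    {"humidity": 50.0, "rain": "no"},
]
completed = impute_by_class_mean(raw, "rain")

# Assumed to be discretized already; the abstract does not describe this step.
table = [
    {"humidity": "high", "wind": "weak", "rain": "yes"},
    {"humidity": "high", "wind": "weak", "rain": "no"},
    {"humidity": "high", "wind": "strong", "rain": "yes"},
    {"humidity": "low", "wind": "weak", "rain": "no"},
]
lower, upper = approximations(table, ["humidity", "wind"], "rain", "yes")
```

In this toy table, records 0 and 1 share the same condition values but disagree on the decision, so they appear only in the upper approximation. A certain rule for `rain = yes` could be read from record 2 alone, while a possible rule could be read from records 0, 1, and 2.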

Original language: English
Pages (from-to): 1147-1156
Number of pages: 10
Journal: Global Journal of Pure and Applied Mathematics
Volume: 11
Issue number: 3
Publication status: Published - 2015

Keywords

  • Classification
  • Incomplete dataset
  • Meteorological
  • Rough set
