TY - JOUR
T1 - Handling missing value on meteorological data classification with rough set based algorithm
AU - Aprianti, Winda
AU - Mukhlash, Imam
N1 - Publisher Copyright:
© Research India Publications.
PY - 2015
Y1 - 2015
N2 - Data mining is the process of finding patterns and knowledge in large databases. One data mining task is classification, the process of finding rules for predicting the class of objects in a database. Weather plays an important role in human life, in areas such as social and economic welfare, agriculture, disaster management, and finance, so weather prediction is needed for planning in many fields. Most databases suffer from incompleteness, which is caused by faulty manual data entry procedures, incorrect measurements, equipment faults, and many other factors. In this research, we use an incomplete meteorological dataset. Before applying the rough set algorithm to obtain rules, the incomplete dataset is converted into a complete dataset by replacing each missing value with the average value of the records that have the same decision class. We then find the lower and upper approximations. Certain rules are obtained from the lower approximation and possible rules from the upper approximation. To test the performance of the algorithm, we applied the rules to test data. Results of applying the algorithm to datasets containing 5%, 10%, 15%, 20%, 25%, and 30% missing values show that the accuracy of the rules decreases as the proportion of missing values increases, and that the number of rules does not affect the accuracy of the rules. The rules produced by the rough set algorithm predict rainfall effectively for datasets containing less than 25% missing values.
AB - Data mining is the process of finding patterns and knowledge in large databases. One data mining task is classification, the process of finding rules for predicting the class of objects in a database. Weather plays an important role in human life, in areas such as social and economic welfare, agriculture, disaster management, and finance, so weather prediction is needed for planning in many fields. Most databases suffer from incompleteness, which is caused by faulty manual data entry procedures, incorrect measurements, equipment faults, and many other factors. In this research, we use an incomplete meteorological dataset. Before applying the rough set algorithm to obtain rules, the incomplete dataset is converted into a complete dataset by replacing each missing value with the average value of the records that have the same decision class. We then find the lower and upper approximations. Certain rules are obtained from the lower approximation and possible rules from the upper approximation. To test the performance of the algorithm, we applied the rules to test data. Results of applying the algorithm to datasets containing 5%, 10%, 15%, 20%, 25%, and 30% missing values show that the accuracy of the rules decreases as the proportion of missing values increases, and that the number of rules does not affect the accuracy of the rules. The rules produced by the rough set algorithm predict rainfall effectively for datasets containing less than 25% missing values.
KW - Classification
KW - Incomplete dataset
KW - Meteorological
KW - Rough set
UR - http://www.scopus.com/inward/record.url?scp=84944718311&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:84944718311
SN - 0973-1768
VL - 11
SP - 1147
EP - 1156
JO - Global Journal of Pure and Applied Mathematics
JF - Global Journal of Pure and Applied Mathematics
IS - 3
ER -