TY - JOUR
T1 - Imputation techniques for incomplete load data based on seasonality and orientation of the missing values
AU - Kamisan, Nur Arina Bazilah
AU - Lee, Muhammad Hisyam
AU - Hussin, Abdul Ghapor
AU - Zubairi, Yong Zulina
N1 - Publisher Copyright:
© 2020 Penerbit Universiti Kebangsaan Malaysia. All rights reserved.
PY - 2020/5
Y1 - 2020/5
N2 - Missing values are a recurring problem in load data. Because load data follow a daily seasonal pattern, the load usage for the next day is predictable most of the time. For this reason, a new model has been developed based on these characteristics. Data containing missing values are divided according to their seasonal pattern and, for each subdivision, the mean, the mean with standard deviation, and the third quartile are calculated before being rearranged to form a new set of values that replace the missing values. These three values are used as imputations for the missing values. To examine the effect of the orientation of the missing values on the choice of imputation, the missing values are placed in three positions: at the front, in the middle, and at the end of the data, with 5%, 15%, and 25% of values missing. The root mean square error and mean absolute error results show that the proposed techniques, particularly the mean and the third quartile value, are superior to the other, more complex methods when dealing with missing values. Mean imputation is sufficient when the missing values are at the front and in the middle of the data, while the third quartile value is superior when the missing values are at the end of the data.
AB - Missing values are a recurring problem in load data. Because load data follow a daily seasonal pattern, the load usage for the next day is predictable most of the time. For this reason, a new model has been developed based on these characteristics. Data containing missing values are divided according to their seasonal pattern and, for each subdivision, the mean, the mean with standard deviation, and the third quartile are calculated before being rearranged to form a new set of values that replace the missing values. These three values are used as imputations for the missing values. To examine the effect of the orientation of the missing values on the choice of imputation, the missing values are placed in three positions: at the front, in the middle, and at the end of the data, with 5%, 15%, and 25% of values missing. The root mean square error and mean absolute error results show that the proposed techniques, particularly the mean and the third quartile value, are superior to the other, more complex methods when dealing with missing values. Mean imputation is sufficient when the missing values are at the front and in the middle of the data, while the third quartile value is superior when the missing values are at the end of the data.
KW - Data orientation
KW - Missing values
KW - Multiple imputation
KW - Seasonal load data
KW - Seasonality
UR - http://www.scopus.com/inward/record.url?scp=85089652317&partnerID=8YFLogxK
U2 - 10.17576/jsm-2020-4905-22
DO - 10.17576/jsm-2020-4905-22
M3 - Article
AN - SCOPUS:85089652317
SN - 0126-6039
VL - 49
SP - 1165
EP - 1174
JO - Sains Malaysiana
JF - Sains Malaysiana
IS - 5
ER -