TY - JOUR
T1 - Conducting Vessel Data Imputation Method Selection Based on Dataset Characteristics
AU - Fatyanosa, Tirana Noor
AU - Firdausanti, Neni Alya
AU - Soto, Luis Francisco Japa
AU - Santos, Israel Mendonça Dos
AU - Prayoga, Putu Hangga Nan
AU - Aritsugi, Masayoshi
N1 - Publisher Copyright: © Published under licence by IOP Publishing Ltd.
PY - 2023
Y1 - 2023
AB - Time series datasets collected from marine sensors inevitably suffer from missing data, which makes the sensor data unreliable for supporting decision-making. Many methods are available for imputing missing values. However, selecting the best imputation method is not a trivial task, as it usually requires domain expertise and several trial-and-error iterations. Furthermore, carelessly performed imputation introduces a high error that can lead stakeholders to wrong assumptions. This paper provides a systematic approach that extracts characteristics of the underlying data and, based on them, recommends the least error-prone imputation method. We evaluate our proposed method using nine real-world vessel datasets. In total, we generated 3859 data samples consisting of 17 input features and 1 target feature. Experimental results show that the proposed approach achieves a weighted F1-score of 92.6%. Additionally, compared with carelessly selected imputation methods, our approach gains up to 86% on the average imputation score, with a worst-case gain of 5%. We empirically demonstrate that the proposed approach is efficient at selecting the best imputation methods.
UR - http://www.scopus.com/inward/record.url?scp=85169921086&partnerID=8YFLogxK
U2 - 10.1088/1755-1315/1198/1/012017
DO - 10.1088/1755-1315/1198/1/012017
M3 - Conference article
AN - SCOPUS:85169921086
SN - 1755-1307
VL - 1198
JO - IOP Conference Series: Earth and Environmental Science
JF - IOP Conference Series: Earth and Environmental Science
IS - 1
M1 - 012017
T2 - 10th International Seminar on Ocean, Coastal Engineering, Environmental and Natural Disaster Management, ISOCEEN 2022
Y2 - 29 November 2022
ER -