Conducting Vessel Data Imputation Method Selection Based on Dataset Characteristics

Tirana Noor Fatyanosa, Neni Alya Firdausanti, Luis Francisco Japa Soto, Israel Mendonça Dos Santos, Putu Hangga Nan Prayoga, Masayoshi Aritsugi

Research output: Contribution to journal › Conference article › peer-review

3 Citations (Scopus)

Abstract

Time series datasets collected from marine sensors inevitably suffer from missing data, which makes the sensor data unreliable for supporting decision-making. Many methods are available to impute missing values. However, selecting the best imputation method is not a trivial task, as it usually requires domain expertise and several trial-and-error iterations. Furthermore, when imputation is carried out carelessly, it produces high error that can lead stakeholders to wrong assumptions. This paper provides a systematic approach that extracts characteristics of the underlying data and, based on them, recommends the least error-prone imputation method. We evaluate the proposed method using nine real-world vessel datasets. In total, we generated 3859 data samples consisting of 17 input features and 1 target feature. Experimental results show that the proposed approach obtains a weighted F1-score of 92.6%. Additionally, compared with carelessly selected imputation methods, our approach improves the average imputation score by up to 86%, with a worst-case gain of 5%. We empirically demonstrate that the proposed approach is efficient at selecting the best imputation methods.
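The abstract does not list the 17 input features or the candidate imputation methods, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes three simple candidates (mean fill, forward fill, linear interpolation), RMSE as the error measure, and a handful of illustrative characteristics. All function names are hypothetical. Known values are hidden, each candidate imputes them, and the candidate with the lowest error becomes the target label that a classifier could later learn to predict from the characteristics.

```python
import math
import statistics


def characteristics(series):
    """Illustrative dataset characteristics (assumed; not the paper's 17 inputs)."""
    observed = [x for x in series if x is not None]
    return {
        "length": len(series),
        "missing_ratio": 1 - len(observed) / len(series),
        "mean": statistics.fmean(observed),
        "stdev": statistics.pstdev(observed),
    }


def impute_mean(series):
    """Replace every gap with the mean of the observed values."""
    fill = statistics.fmean(x for x in series if x is not None)
    return [fill if x is None else x for x in series]


def impute_ffill(series):
    """Carry the last observed value forward; leading gaps take the first observed value."""
    last = next(x for x in series if x is not None)
    out = []
    for x in series:
        if x is not None:
            last = x
        out.append(last)
    return out


def impute_linear(series):
    """Interpolate linearly between observed neighbours; edge gaps copy the nearest one."""
    known = [i for i, x in enumerate(series) if x is not None]
    out = list(series)
    for i, x in enumerate(series):
        if x is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = series[right]
        elif right is None:
            out[i] = series[left]
        else:
            w = (i - left) / (right - left)
            out[i] = series[left] + w * (series[right] - series[left])
    return out


# Hypothetical candidate set; the paper's actual candidates are not given in the abstract.
CANDIDATES = {"mean": impute_mean, "ffill": impute_ffill, "linear": impute_linear}


def rmse(truth, filled, hidden):
    """Root-mean-square error measured only at the hidden positions."""
    return math.sqrt(sum((truth[i] - filled[i]) ** 2 for i in hidden) / len(hidden))


def best_method(truth, hidden):
    """Hide the values at `hidden`, impute with each candidate, return the lowest-RMSE name."""
    masked = [None if i in hidden else x for i, x in enumerate(truth)]
    scores = {name: rmse(truth, fn(masked), hidden) for name, fn in CANDIDATES.items()}
    return min(scores, key=scores.get), scores


truth = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
label, scores = best_method(truth, hidden={2, 4})
print(label)  # linear
```

On this perfectly linear toy series, interpolation recovers the hidden values exactly, so it is chosen as the label. Pairing each such label with the output of `characteristics` would give one training sample of the kind the abstract describes (inputs plus one target).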

Original language: English
Article number: 012017
Journal: IOP Conference Series: Earth and Environmental Science
Volume: 1198
Issue number: 1
Publication status: Published - 2023
Externally published: Yes
Event: 10th International Seminar on Ocean, Coastal Engineering, Environmental and Natural Disaster Management, ISOCEEN 2022 - Surabaya, Indonesia
Duration: 29 Nov 2022 → …
