TY - JOUR
T1 - Two Stages Outlier Removal as Pre-Processing Digitizer Data on Fine Motor Skills (FMS) Classification Using Covariance Estimator and Isolation Forest
AU - Fanani, Nurul Zainal
AU - Sooai, Adri Gabriel
AU - Khamid, K.
AU - Rahmanawati, Festa Yumpi
AU - Tormasi, Alex
AU - Koczy, Laszlo T.
AU - Sumpeno, Surya
AU - Purnomo, Mauridhi Hery
N1 - Publisher Copyright: © 2021. All Rights Reserved.
PY - 2021/8
Y1 - 2021/8
N2 - Improving classification accuracy is an important problem in machine learning, especially for diverse datasets that contain outliers. Data streams and sensor readings produce large volumes of data in which considerable noise can occur, and this noise disrupts or even degrades the performance of machine learning models. Noise-free data is therefore needed to obtain good accuracy and to improve model performance. This research proposes a two-stage method for detecting and removing outliers, using a covariance estimator and an isolation forest, as pre-processing for the classification of fine motor skill (FMS). The dataset was generated by recording data directly during cursive writing with a digitizer. The data include the relative position of the stylus on the digitizer board; the x position, y position, z position, and pressure values are then used as features in the classification process. Because the observation and recording process generated a very large amount of data, some of the recorded data were outliers. In the experiments, FMS classification accuracy with the Random Forest classifier increased by 0.5-1% after outlier detection and removal using the covariance estimator and isolation forest. The highest accuracy achieved was 98.05%, compared with only about 97.3% without outlier removal.
AB - Improving classification accuracy is an important problem in machine learning, especially for diverse datasets that contain outliers. Data streams and sensor readings produce large volumes of data in which considerable noise can occur, and this noise disrupts or even degrades the performance of machine learning models. Noise-free data is therefore needed to obtain good accuracy and to improve model performance. This research proposes a two-stage method for detecting and removing outliers, using a covariance estimator and an isolation forest, as pre-processing for the classification of fine motor skill (FMS). The dataset was generated by recording data directly during cursive writing with a digitizer. The data include the relative position of the stylus on the digitizer board; the x position, y position, z position, and pressure values are then used as features in the classification process. Because the observation and recording process generated a very large amount of data, some of the recorded data were outliers. In the experiments, FMS classification accuracy with the Random Forest classifier increased by 0.5-1% after outlier detection and removal using the covariance estimator and isolation forest. The highest accuracy achieved was 98.05%, compared with only about 97.3% without outlier removal.
KW - Covariance estimator
KW - Fine motor skill
KW - Isolation forest
KW - Outlier detection
KW - Random forest
UR - http://www.scopus.com/inward/record.url?scp=85109178869&partnerID=8YFLogxK
U2 - 10.22266/ijies2021.0831.50
DO - 10.22266/ijies2021.0831.50
M3 - Article
AN - SCOPUS:85109178869
SN - 2185-310X
VL - 14
SP - 571
EP - 582
JO - International Journal of Intelligent Engineering and Systems
JF - International Journal of Intelligent Engineering and Systems
IS - 4
ER -