Robustness of Support Vector Regression and Random Forest Models: A Simulation Study

Supriadi Hia, Heri Kuswanto*, Dedy Dwi Prastyo

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

Abstract

Classical statistical methods are usually based on parametric models, whose performance depends heavily on their assumptions and is not robust to outliers in the data. The COVID-19 pandemic has changed daily life significantly, including a slowdown in economic growth. Such extreme changes can appear as outliers in time series and adversely affect the results of data analysis. Many classical methods used in official statistics are sensitive to outliers. In this work, we evaluate two machine learning methods, Support Vector Regression (SVR) and Random Forest (RF), and compare them with ARIMA to assess their robustness through simulation studies. Robustness is measured by the sensitivity of the SVR and Random Forest hyperparameters and by the model error in the presence of outliers. Simulations show that more outliers lead to higher RMSE values and, conversely, that more samples lead to lower RMSE values. The type of outlier significantly affects the RMSE of the ARIMA model: additive outliers (AO) have a worse impact than temporary changes (TC). Consecutive outliers produce a smaller mean RMSE than non-consecutive outliers. Based on the sensitivity of their hyperparameters, the SVR and Random Forest models are relatively robust to outliers in the data. Based on 100 simulation iterations, we find that SVR is more robust than ARIMA and Random Forest in modeling time series data with outliers.
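The two outlier types named in the abstract can be illustrated with a minimal sketch. It is not the authors' simulation design: the AR(1) data-generating process, the outlier magnitude `omega`, the decay rate `delta = 0.7`, and all function names are assumptions chosen for illustration. An additive outlier shifts a single observation, while a temporary change shifts an observation and then decays geometrically. The RMSE computed here only measures how far each contaminated series departs from the clean one; it is not a model forecast error.

```python
import math
import random

def simulate_ar1(n, phi=0.6, sigma=1.0, seed=0):
    """Simulate a clean AR(1) series (assumed data-generating process)."""
    rng = random.Random(seed)
    y = [0.0] * n
    for t in range(1, n):
        y[t] = phi * y[t - 1] + rng.gauss(0.0, sigma)
    return y

def add_additive_outlier(y, t0, omega):
    """AO: shift only the observation at time t0 by omega."""
    out = list(y)
    out[t0] += omega
    return out

def add_temporary_change(y, t0, omega, delta=0.7):
    """TC: shift at t0 by omega, then let the effect decay by delta per step."""
    out = list(y)
    for t in range(t0, len(y)):
        out[t] += omega * delta ** (t - t0)
    return out

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

clean = simulate_ar1(200)
ao = add_additive_outlier(clean, t0=100, omega=5.0)
tc = add_temporary_change(clean, t0=100, omega=5.0)

# Size of the contamination relative to the clean series (not a model error).
print("AO contamination RMSE:", rmse(clean, ao))
print("TC contamination RMSE:", rmse(clean, tc))
```

In a study of the kind described above, contaminated series like `ao` and `tc` would be passed to each candidate model, and the RMSE of the fitted model would be averaged over repeated draws.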

Original language: English
Title of host publication: Lecture Notes on Data Engineering and Communications Technologies
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 465-479
Number of pages: 15
Publication status: Published - 2023

Publication series

Name: Lecture Notes on Data Engineering and Communications Technologies
Volume: 165
ISSN (Print): 2367-4512
ISSN (Electronic): 2367-4520

Keywords

  • Outlier
  • Random forest
  • Robustness
  • Support vector regression
