TY - GEN
T1 - Assessing the Effectiveness of Oversampling and Undersampling Techniques for Intrusion Detection on an Imbalanced Dataset
AU - Rahma, Fayruz
AU - Rachmadi, Reza Fuad
AU - Pratomo, Baskoro Adi
AU - Purnomo, Mauridhi Hery
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Imbalanced class distribution has been a significant issue in intrusion detection systems: it can degrade their performance because they may become biased towards the majority class. We explore the effectiveness of oversampling and undersampling techniques to address this issue. Both aim to balance the class distribution and improve the performance of the intrusion detection system. Oversampling increases the number of records in the minority class to bring it closer in size to the majority class. Conversely, undersampling reduces the number of records in the majority class to bring it closer in size to the minority class. We assess the effectiveness of different oversampling and undersampling techniques, including Random OverSampling, SMOTE, ADASYN, Random UnderSampling, AllKNN, TomekLinks, SMOTEENN, and SMOTETomek. The experimental findings indicate that the raw data achieved the highest accuracy score, 0.965, whereas Random OverSampling yielded the highest F1 score, 0.589. In the per-class evaluation scores, recall and F1 generally differ sharply between classes with a large amount of data and classes that originally had a small amount of data, even though the training data had been made more balanced. We found that oversampling and undersampling can improve the performance of intrusion detection systems in specific ways, but the results still leave room for improvement. These results can serve as a reference for researchers developing intrusion detection systems.
KW - imbalanced class
KW - intrusion detection system
KW - oversampling and undersampling
UR - http://www.scopus.com/inward/record.url?scp=85182947145&partnerID=8YFLogxK
U2 - 10.1109/IEACon57683.2023.10370430
DO - 10.1109/IEACon57683.2023.10370430
M3 - Conference contribution
AN - SCOPUS:85182947145
T3 - IEACon 2023 - 2023 IEEE Industrial Electronics and Applications Conference
SP - 92
EP - 97
BT - IEACon 2023 - 2023 IEEE Industrial Electronics and Applications Conference
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 4th IEEE Industrial Electronics and Applications Conference, IEACon 2023
Y2 - 6 November 2023 through 7 November 2023
ER -