TY - GEN
T1 - Comparative Analysis of Data Balancing Methods for the Optimization of Botnet Attack Detection Models
AU - Wijaya, Apta Rasendriya
AU - Ahmad, Tohari
AU - Hostiadi, Dandy Pramana
AU - Rachman Putra, Muhammad Aidiel
AU - Alzamzami, Moch Nafkhan
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - In the era of widespread Internet connectivity, botnet threats pose an increasingly significant risk to network security. One way to deal with the risk of botnet threats is to build a reliable detection model. Previous studies have introduced botnet activity detection models using machine-learning approaches. On the other hand, the main problem of botnet activity detection models is the unbalanced proportion of data. In this regard, this study proposes a comparative analysis of data balancing methods for detecting botnet SPAM activities using machine learning algorithms, especially the k-Nearest Neighbors (k-NN) algorithm. This research aims to evaluate the effectiveness of various data balancing techniques in improving the performance of the k-NN classifier for detecting botnet SPAM. This method involves data pre-processing, labeling, splitting, balancing, and classification. The experimental results show that the best performance is obtained from the classification results without the balancing method with a weighted average accuracy of 98.45%, precision of 98.41%, recall of 98.45%, and F1-Score of 98.40%. However, the combination of k-NN with Random oversampling successfully obtained the highest recall value compared to other methods for the minor class (botnet SPAM). This shows that the balancing process has a positive impact on the performance of minority class detection.
AB - In the era of widespread Internet connectivity, botnet threats pose an increasingly significant risk to network security. One way to deal with the risk of botnet threats is to build a reliable detection model. Previous studies have introduced botnet activity detection models using machine-learning approaches. On the other hand, the main problem of botnet activity detection models is the unbalanced proportion of data. In this regard, this study proposes a comparative analysis of data balancing methods for detecting botnet SPAM activities using machine learning algorithms, especially the k-Nearest Neighbors (k-NN) algorithm. This research aims to evaluate the effectiveness of various data balancing techniques in improving the performance of the k-NN classifier for detecting botnet SPAM. This method involves data pre-processing, labeling, splitting, balancing, and classification. The experimental results show that the best performance is obtained from the classification results without the balancing method with a weighted average accuracy of 98.45%, precision of 98.41%, recall of 98.45%, and F1-Score of 98.40%. However, the combination of k-NN with Random oversampling successfully obtained the highest recall value compared to other methods for the minor class (botnet SPAM). This shows that the balancing process has a positive impact on the performance of minority class detection.
KW - Botnet Detection
KW - Information Security
KW - Machine Learning
KW - Network Infrastructure
KW - Network Security
UR - http://www.scopus.com/inward/record.url?scp=85207527036&partnerID=8YFLogxK
U2 - 10.1109/ICSCC62041.2024.10690491
DO - 10.1109/ICSCC62041.2024.10690491
M3 - Conference contribution
AN - SCOPUS:85207527036
T3 - 2024 10th International Conference on Smart Computing and Communication, ICSCC 2024
SP - 608
EP - 613
BT - 2024 10th International Conference on Smart Computing and Communication, ICSCC 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 10th International Conference on Smart Computing and Communication, ICSCC 2024
Y2 - 25 July 2024 through 27 July 2024
ER -