TY - JOUR
T1 - A New Approach of Botnet Activity Detection Models Using Combination of Univariate and ANOVA Feature Selection Techniques
AU - Hostiadi, Dandy Pramana
AU - Ahmad, Tohari
AU - Putra, Muhammad Aidiel Rachman
AU - Pradipta, Gede Angga
AU - Ayu, Putu Desiana Wulaning
AU - Liandana, Made
N1 - Publisher Copyright: © 2024, Intelligent Network and Systems Society. All Rights Reserved.
PY - 2024
Y1 - 2024
AB - The number of cyber incidents caused by malicious software, known as malware, has increased significantly. This malicious software penetrates the network, infects several computers, and forms a network of zombie computers commonly known as a botnet. Botnet threats can gravely affect valuable system resources and stored data and cause severe financial losses if not handled appropriately. Several previous studies introduced botnet detection models that use machine learning algorithms, optimize the feature selection process, and achieve high detection results. However, in those studies feature selection is carried out without first categorizing features as mandatory or non-mandatory. In fact, not all features should be subjected to selection, because some play an important role and influence detection performance. This paper proposes a detection model that optimizes feature selection techniques. The initial process categorizes features into mandatory and non-mandatory features. Feature selection is then carried out on the non-mandatory features using two approaches: Univariate and ANOVA. The best features from the feature selection results are aggregated with the mandatory features and processed in a classification model for detecting malware attacks. The aim is to obtain the best features for the classification model and so improve detection performance, measured by accuracy, precision, and recall. The classification model is a decision tree, tested on three datasets: CTU-13, NCC, and NCC-2. The experiments obtained an accuracy of 99.27% on the CTU-13 dataset, 98.96% on the NCC dataset, and 98.87% on the NCC-2 dataset. The average precision was 98.68% on the CTU-13 dataset, 98.26% on the NCC dataset, and 97.90% on the NCC-2 dataset. Finally, the average recall was 99.27% on the CTU-13 dataset, 98.96% on the NCC dataset, and 98.87% on the NCC-2 dataset. The detection results were better than those of previous research. This model can make it easier to analyze attacks and determine a response when a malware attack occurs.
KW - Botnet detection
KW - Botnet flows
KW - Feature selection
KW - Network infrastructure
KW - Network security
UR - http://www.scopus.com/inward/record.url?scp=85191804080&partnerID=8YFLogxK
U2 - 10.22266/ijies2024.0630.38
DO - 10.22266/ijies2024.0630.38
M3 - Article
AN - SCOPUS:85191804080
SN - 2185-310X
VL - 17
SP - 485
EP - 502
JO - International Journal of Intelligent Engineering and Systems
JF - International Journal of Intelligent Engineering and Systems
IS - 3
ER -