TY - JOUR
T1 - Feature Selection for Intrusion Detection Using Independence Level Test with an Exhaustive Approach
AU - Nururrahmah, Aulia Teaku
AU - Ahmad, Tohari
N1 - Publisher Copyright: © 2023, Intelligent Network and Systems Society. All Rights Reserved.
PY - 2023
Y1 - 2023
AB - Intrusion detection systems (IDSs) have been developed to detect attacks or suspicious activity on a network. IDSs are generally classified into two types: signature-based and anomaly-based detection. Many studies use anomaly-based detection because it can detect new types of attacks on the network, but it has several shortcomings. Feeding high-dimensional data directly into the classification process can lead to low accuracy and increased false alarm rates. Selecting relevant features and removing irrelevant ones before classification could overcome this issue. In this research, we propose an intrusion detection model that combines the Chi-square independence test with an exhaustive search. First, the proposed method uses the independence levels of the Chi-square test to calculate a statistical score for each feature. The feature list obtained from this first stage is then passed to an optimisation stage that uses an exhaustive search. This stage calculates the accuracy of every possible feature combination drawn from the first-stage feature list and checks each combination to determine which one yields the best accuracy. The method was tested on four datasets (KDD Cup 99, NSL-KDD, Kyoto 2006+, and UNSW-NB15) using three classifiers: support vector machine (SVM), decision tree, and Naive Bayes. It achieved its highest accuracy on the UNSW-NB15 dataset using the SVM. Accuracy, precision, recall, and F-score reached values above 95%, and the false positive rate (FPR) reached its lowest value of 1.56%.
KW - Chi-square
KW - Exhaustive search
KW - Feature selection
KW - Intrusion detection system
KW - Network Infrastructure
KW - Network security
UR - http://www.scopus.com/inward/record.url?scp=85170437731&partnerID=8YFLogxK
U2 - 10.22266/ijies2023.1031.54
DO - 10.22266/ijies2023.1031.54
M3 - Article
AN - SCOPUS:85170437731
SN - 2185-310X
VL - 16
SP - 637
EP - 648
JO - International Journal of Intelligent Engineering and Systems
JF - International Journal of Intelligent Engineering and Systems
IS - 5
ER -