TY - JOUR
T1 - Data Reduction for Optimizing Feature Selection in Modeling Intrusion Detection System
AU - Iman, Alif Nur
AU - Ahmad, Tohari
N1 - Publisher Copyright:
© 1988 ASBMB. Currently published by Elsevier Inc; originally published by American Society for Biochemistry and Molecular Biology.
PY - 2020
Y1 - 2020
N2 - With the development of and ease of access to internet networks, the potential for attacks and intrusions has increased. The intrusion detection system (IDS), an approach to overcome this problem, is grouped into two models: signature-based and anomaly-based. An anomaly-based IDS can be implemented by machine learning; one of the schemes in machine learning is data reduction. IDS datasets are usually obtained through a real-time process that yields data in undefined proportions. The purpose of data reduction is to speed up and optimize the process, improving accuracy, precision, and specifications. There are several methods to perform data reduction, one of which uses outlier detection techniques. Proper outlier detection improves the classification results of machine learning. In this research, outlier detection is done by a circle generated from the k-means clustering of all selected features. Two scenarios are designed for the evaluation: a circle generated from two points of the minimum and maximum clusters, and a circle generated from the median of all clusters. The formation of clusters by k-means clustering determines the size and direction of the outlier circle so that it dynamically adjusts to the distribution of data from the feature selection results. Using previous feature selection algorithms, a comparison is performed to evaluate the proposed method's performance. Our empirical results show that the second scenario can significantly improve the classification results in terms of accuracy, detection rate, and precision. The first and second experiments increase the accuracy by 0.02%, and the third by 0.1%. The detection rate in the first, second, and third experiments increases by 0.01%, 0.02%, and 0.07%, while precision increases by 0.04%, 0.02%, and 0.01%, respectively.
AB - With the development of and ease of access to internet networks, the potential for attacks and intrusions has increased. The intrusion detection system (IDS), an approach to overcome this problem, is grouped into two models: signature-based and anomaly-based. An anomaly-based IDS can be implemented by machine learning; one of the schemes in machine learning is data reduction. IDS datasets are usually obtained through a real-time process that yields data in undefined proportions. The purpose of data reduction is to speed up and optimize the process, improving accuracy, precision, and specifications. There are several methods to perform data reduction, one of which uses outlier detection techniques. Proper outlier detection improves the classification results of machine learning. In this research, outlier detection is done by a circle generated from the k-means clustering of all selected features. Two scenarios are designed for the evaluation: a circle generated from two points of the minimum and maximum clusters, and a circle generated from the median of all clusters. The formation of clusters by k-means clustering determines the size and direction of the outlier circle so that it dynamically adjusts to the distribution of data from the feature selection results. Using previous feature selection algorithms, a comparison is performed to evaluate the proposed method's performance. Our empirical results show that the second scenario can significantly improve the classification results in terms of accuracy, detection rate, and precision. The first and second experiments increase the accuracy by 0.02%, and the third by 0.1%. The detection rate in the first, second, and third experiments increases by 0.01%, 0.02%, and 0.07%, while precision increases by 0.04%, 0.02%, and 0.01%, respectively.
KW - Data reduction
KW - Intrusion detection system
KW - K-means clustering
KW - Machine learning
KW - Network security
UR - http://www.scopus.com/inward/record.url?scp=85095987818&partnerID=8YFLogxK
U2 - 10.22266/ijies2020.1231.18
DO - 10.22266/ijies2020.1231.18
M3 - Article
AN - SCOPUS:85095987818
SN - 2185-310X
VL - 13
SP - 199
EP - 207
JO - International Journal of Intelligent Engineering and Systems
JF - International Journal of Intelligent Engineering and Systems
IS - 6
ER -