Data Reduction for Optimizing Feature Selection in Modeling Intrusion Detection System

Alif Nur Iman, Tohari Ahmad*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

6 Citations (Scopus)


With the development of internet networks and easier access to them, the potential for attacks and intrusions has increased. The intrusion detection system (IDS), an approach to overcoming this problem, is grouped into two models: signature-based and anomaly-based. An anomaly-based IDS can be implemented with machine learning, and one of the steps in a machine learning scheme is data reduction. IDS datasets are usually obtained through a real-time process that yields data in undefined proportions. The purpose of data reduction is to speed up and optimize the process, improving accuracy, precision, and specifications. There are several methods of performing data reduction, one of which uses outlier detection techniques. Proper outlier detection has a positive impact on the classification results of machine learning. In this research, outlier detection is performed with a circle generated from the k-means clustering of all selected features. Two scenarios are designed for the evaluation: a circle generated from two points of the minimum and maximum clusters, and a circle generated from the median of all clusters. The clusters formed by k-means clustering determine the size and direction of the outlier circle, so that it dynamically adjusts to the distribution of the data from the feature selection results. The proposed method is evaluated by applying it with previous feature selection algorithms and comparing the results. Our empirical results show that the second scenario can significantly improve the classification results in terms of accuracy, detection rate, and precision. The first and second experiments increase the accuracy by 0.02%, and the third by 0.1%. The detection rate in the first, second, and third experiments increases by 0.01%, 0.02%, and 0.07%, respectively, while precision increases by 0.04%, 0.02%, and 0.01%.
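The circle-based reduction described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the circle is built from the clusters, so everything below is an assumption made for illustration: the data is two-dimensional, the "minimum" and "maximum" clusters are taken to be the centroids nearest to and farthest from the origin, the median circle is centred on the per-axis median of the centroids, and the `margin` factor and all function names are hypothetical. It is not the authors' implementation.

```python
import math
import random
from statistics import median


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on 2-D points; returns the k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            groups[nearest].append(p)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids


def circle_min_max(centroids, margin=1.5):
    """Scenario 1 (assumed reading): circle spanning two centroids,
    the ones nearest to and farthest from the origin."""
    lo = min(centroids, key=lambda c: math.hypot(*c))
    hi = max(centroids, key=lambda c: math.hypot(*c))
    center = ((lo[0] + hi[0]) / 2, (lo[1] + hi[1]) / 2)
    radius = margin * math.dist(lo, hi) / 2
    return center, radius


def circle_median(centroids, margin=1.5):
    """Scenario 2 (assumed reading): circle centred on the per-axis
    median of all centroids, wide enough to cover every centroid."""
    center = (median(c[0] for c in centroids), median(c[1] for c in centroids))
    radius = margin * max(math.dist(center, c) for c in centroids)
    return center, radius


def reduce_data(points, circle):
    """Drop points outside the circle, treating them as outliers."""
    center, radius = circle
    return [p for p in points if math.dist(p, center) <= radius]
```

Under these assumptions, `reduce_data(points, circle_median(kmeans(points, k=3)))` would return the records kept for classification. Because the circle is derived from the centroids, its position and size follow the distribution of the selected features rather than a fixed threshold.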

Original language: English
Pages (from-to): 199-207
Number of pages: 9
Journal: International Journal of Intelligent Engineering and Systems
Issue number: 6
Publication status: Published - 2020


  • Data reduction
  • Intrusion detection system
  • K-means clustering
  • Machine learning
  • Network security

