Data preprocessing and feature selection for machine learning intrusion detection systems

Tohari Ahmad, Mohammad Nasrul Aziz

Research output: Contribution to journal › Article › peer-review

70 Citations (Scopus)

Abstract

Flow-based anomaly detection remains an open problem in computer network security. Many previous studies have applied data mining to detect anomalies in intrusion detection systems (IDS). In this paper, we extend that approach to classifying the anomalous data. The motivation is twofold: much of the data is not ready for use by a classification algorithm, and the algorithm may use features that are irrelevant to the classification target. To address these two problems, we define two steps, pre-processing and feature selection, whose results are classified using k-NN, SVM, and Naive Bayes. The experimental results show that this pre-processing, combined with CFS and PSO for feature selection, works best with SVM, which achieves an accuracy of about 99.9291% on the KDD Cup99 dataset.
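
As a rough illustration of the feature selection step, the sketch below scores feature subsets with the standard CFS merit (CFS is commonly expanded as correlation-based feature selection), which rewards features that correlate with the class and penalizes features that correlate with each other. It is not the authors' implementation. Two simplifications are assumptions made here: absolute Pearson correlation is used as the correlation measure, and a greedy forward search stands in for the PSO search named in the abstract. The function names and the toy data are hypothetical, and the data is not drawn from KDD Cup99.

```python
import math
from statistics import mean


def pearson(xs, ys):
    """Absolute Pearson correlation; returns 0.0 if either column is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov / math.sqrt(vx * vy))


def cfs_merit(columns, labels, subset):
    """CFS merit: k * r_cf / sqrt(k + k * (k - 1) * r_ff)."""
    k = len(subset)
    if k == 0:
        return 0.0
    # Mean feature-class correlation over the subset.
    r_cf = mean(pearson(columns[i], labels) for i in subset)
    # Mean feature-feature correlation over all distinct pairs.
    pairs = [(i, j) for a, i in enumerate(subset) for j in subset[a + 1:]]
    r_ff = mean(pearson(columns[i], columns[j]) for i, j in pairs) if pairs else 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def greedy_cfs(columns, labels):
    """Greedy forward search (an assumed stand-in for PSO) over CFS merit."""
    selected, best = [], 0.0
    remaining = set(range(len(columns)))
    while remaining:
        cand, cand_merit = None, best
        for i in sorted(remaining):
            m = cfs_merit(columns, labels, selected + [i])
            # Accept only a strict improvement, with a small float tolerance.
            if m > cand_merit + 1e-12:
                cand, cand_merit = i, m
        if cand is None:
            break
        selected.append(cand)
        remaining.remove(cand)
        best = cand_merit
    return selected, best


# Hypothetical toy data: one informative feature, an exact duplicate of it,
# and a constant feature that carries no information.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    [0.1, 0.2, 0.1, 0.2, 0.9, 0.8, 0.9, 0.8],  # informative
    [0.1, 0.2, 0.1, 0.2, 0.9, 0.8, 0.9, 0.8],  # redundant copy of feature 0
    [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],  # constant
]
selected, merit = greedy_cfs(columns, labels)
print(selected, round(merit, 4))
```

On this toy input the search keeps only the first feature: the duplicate adds no merit because its correlation with the already selected feature cancels its correlation with the class, and the constant column lowers the merit. The selected columns would then be passed to a classifier such as k-NN, SVM, or Naive Bayes.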

Original language: English
Pages (from-to): 93-101
Number of pages: 9
Journal: ICIC Express Letters
Volume: 13
Issue number: 2
Publication status: Published - 2019

Keywords

  • Data mining
  • Feature selection
  • Intrusion detection system
  • Network security
