Undersampling Data Augmentation for BotNet Classification

Evelyn Sierra*, Tohari Ahmad, Muhammad Aidiel Rachman Putra

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Botnets pose significant threats to cybersecurity by exploiting large networks of compromised devices for malicious activities. Traditional botnet detection methods often struggle with the evolving tactics employed by botnet operators, leading to high false positive rates and reduced detection accuracy. The data used to train a detection system are often an obstacle to good performance, because such datasets are typically imbalanced. For example, the dataset used in this study has 0.97 background, 0.23 normal, and 0.07 botnet activities; each activity comprises network traffic and its associated labels. This study addresses the problem by first applying text preprocessing to obtain clean labels for each network traffic record in the dataset. It then employs RandomUnderSampler so that the samples for each label reach 2000 data points. Subsequently, classification experiments are conducted using the Random Forest, Decision Tree, Support Vector Machine, k-Nearest Neighbor, and Logistic Regression methods. The results indicate that Random Forest with a RandomUnderSampler ratio of 3:2:1 achieves the highest accuracy, reaching 0.97. In addition, the model achieves 0.97 precision, 0.95 recall, and a 0.96 F1 score.
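
The undersampling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it uses only the Python standard library rather than the RandomUnderSampler class named above, and the function name `random_undersample`, the toy class sizes, and the per-class target counts (chosen here to give a 3:2:1 ratio, with an assumed assignment of counts to classes) are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def random_undersample(samples, labels, target_counts, seed=0):
    """Keep at most target_counts[label] randomly chosen samples per label.

    Sampling is without replacement; labels missing from target_counts
    are left untouched.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    kept = []
    for label, indices in by_label.items():
        # Never request more samples than the class actually has.
        n_keep = min(target_counts.get(label, len(indices)), len(indices))
        kept.extend(rng.sample(indices, n_keep))

    kept.sort()  # preserve the original ordering of the retained samples
    return [samples[i] for i in kept], [labels[i] for i in kept]


# Toy imbalanced data (hypothetical sizes, not the dataset from the study).
labels = ["background"] * 900 + ["normal"] * 80 + ["botnet"] * 20
samples = list(range(len(labels)))  # stand-ins for feature vectors

# Hypothetical targets giving a 3:2:1 ratio across the three labels.
targets = {"background": 30, "normal": 20, "botnet": 10}

X_res, y_res = random_undersample(samples, labels, targets)
counts = Counter(y_res)
print(counts)
```

Fixing the seed makes the selection reproducible, which matters when comparing classifiers on the same resampled data. Only the majority classes lose samples; no synthetic records are created.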

Original language: English
Title of host publication: 2024 15th International Conference on Computing Communication and Networking Technologies, ICCCNT 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350370249
Publication status: Published - 2024
Event: 15th International Conference on Computing Communication and Networking Technologies, ICCCNT 2024 - Kamand, India
Duration: 24 Jun 2024 → 28 Jun 2024

Publication series

Name: 2024 15th International Conference on Computing Communication and Networking Technologies, ICCCNT 2024

Conference

Conference: 15th International Conference on Computing Communication and Networking Technologies, ICCCNT 2024
Country/Territory: India
City: Kamand
Period: 24/06/24 → 28/06/24

Keywords

  • Botnet
  • Computer Security
  • Information Security
  • Network Infrastructure
  • Network Security
