Enhancing Anomaly Classification Over Log Files through Topic Modeling and Ensemble Methods

Achmad Mujaddid Islami, Irham Maulani, Rifqi Zumadila, Anggi Malanda Yoga Putra, Bagus Jati Santoso

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Log files play a vital role in monitoring system changes, and their usage is rapidly increasing in cloud computing. To effectively address this challenge, text processing is an essential step in extracting information from unstructured text data, specifically log files, and transforming it into identifiable patterns using specific methods. Given the brevity of most log file text, this study focuses on the application of short text topic modeling and ensemble methods to classify log files and extract meaningful insights. The findings demonstrate that anomaly detection using short text topic modeling and ensemble methods surpasses the benchmark method of Latent Dirichlet Allocation (LDA) topic modeling. Notably, the classification approach utilizing GSDMM, in combination with experiments involving XGBoost, achieves the highest performance when compared to other ensemble methods such as Random Forest, Gradient Boosting, and AdaBoost. To further optimize the performance of the XGBoost method in anomaly detection classification, hyperparameter tuning is conducted using Optuna. This approach effectively identifies optimal hyperparameters for XGBoost, leading to enhanced performance. Overall, this research illustrates that the utilization of short text topic modeling and ensemble methods, along with hyperparameter optimization, significantly improves the accuracy and effectiveness of anomaly detection in log file classification.
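For readers unfamiliar with GSDMM, the short text topic model named in the abstract, the following is a minimal sketch of its collapsed Gibbs sampler in plain Python. It is not the authors' implementation. The function name `gsdmm`, the defaults for `K`, `alpha`, `beta`, and `iters`, and the sample log lines are all assumptions chosen for illustration. The abstract does not say how the resulting topics are passed to XGBoost, so that step is left out.

```python
import math
import random
from collections import Counter


def gsdmm(docs, K=8, alpha=0.1, beta=0.1, iters=15, seed=0):
    """Cluster tokenized short texts with collapsed Gibbs sampling (GSDMM).

    docs: list of token lists. Returns one cluster label per document.
    K, alpha, beta, and iters are illustrative defaults, not the paper's.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    m = [0] * K                         # documents per cluster
    n = [0] * K                         # tokens per cluster
    nw = [Counter() for _ in range(K)]  # word counts per cluster

    def update(doc, z, sign):
        # Add (sign=+1) or remove (sign=-1) a document from cluster z.
        m[z] += sign
        n[z] += sign * len(doc)
        for w in doc:
            nw[z][w] += sign

    def log_score(doc, z):
        # Unnormalized log-probability of cluster z for this document,
        # computed with the document itself already removed from the counts.
        lp = math.log(m[z] + alpha)
        for w, c in Counter(doc).items():
            for j in range(c):
                lp += math.log(nw[z][w] + beta + j)
        for i in range(len(doc)):
            lp -= math.log(n[z] + vocab_size * beta + i)
        return lp

    # Random initial assignment.
    labels = []
    for doc in docs:
        z = rng.randrange(K)
        labels.append(z)
        update(doc, z, +1)

    # Gibbs sweeps: reassign each document given all the others.
    for _ in range(iters):
        for d, doc in enumerate(docs):
            update(doc, labels[d], -1)
            logs = [log_score(doc, z) for z in range(K)]
            top = max(logs)
            weights = [math.exp(lp - top) for lp in logs]
            labels[d] = rng.choices(range(K), weights=weights)[0]
            update(doc, labels[d], +1)
    return labels


# Made-up log lines, tokenized on whitespace, purely for illustration.
sample_logs = [
    "connection timeout on node 3",
    "connection timeout on node 7",
    "disk write error block 12",
    "disk write error block 40",
]
sample_labels = gsdmm([line.split() for line in sample_logs], K=4)
print(sample_labels)
```

Each document belongs to exactly one cluster, which is what makes the model suited to short texts such as log lines. In a pipeline like the one described, the cluster labels (or per-cluster scores) could serve as features for a downstream classifier.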

Original language: English
Title of host publication: Proceeding - International Conference on Information Technology and Computing 2023, ICITCOM 2023
Editors: Hsing-Chung Chen, Cahya Damarjati, Christian Blum, Yessi Jusman, Siti Nurul Aqmariah Mohd Kanafiah, Waleed Ejaz
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 57-61
Number of pages: 5
ISBN (Electronic): 9798350359633
Publication status: Published - 2023
Event: 2023 International Conference on Information Technology and Computing, ICITCOM 2023 - Hybrid, Yogyakarta, Indonesia
Duration: 1 Dec 2023 to 2 Dec 2023

Publication series

Name: Proceeding - International Conference on Information Technology and Computing 2023, ICITCOM 2023

Conference

Conference: 2023 International Conference on Information Technology and Computing, ICITCOM 2023
Country/Territory: Indonesia
City: Hybrid, Yogyakarta
Period: 1/12/23 to 2/12/23

Keywords

  • Anomaly Classification
  • Ensemble Methods
  • Log
  • Natural Language Processing
  • Short Text
  • Topic Modelling
