TY - GEN
T1 - Drone Flight Log Anomaly Severity Classification via Sentence Embedding
AU - Silalahi, Swardiantara
AU - Ahmad, Tohari
AU - Studiawan, Hudan
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
AB - Log-based anomaly detection is a popular research topic in cybersecurity. Typically, log events are classified into only two classes: normal and anomalous. However, an anomalous message may contain words or phrases that indicate how important the anomaly is. Therefore, this paper proposes extending anomaly detection to log anomaly severity classification with four classes: normal, low, medium, and high. As an initial study, several machine learning models are used to build the detection models. To verify and evaluate the models' performance, a dataset is constructed by manually annotating drone flight log messages collected from two public drone flight log datasets. Since machine learning models cannot understand natural language, sentence embedding is used as the feature extractor: it embeds each message into a sentence-level vector that represents the linguistic features of the log message. The micro-average F1 score is selected as the main evaluation metric because the class distribution in the dataset is imbalanced. In experiments with 5-fold cross-validation, the multilayer perceptron outperforms the other models and obtains the highest F1 score of 94.788%. The proposed approach successfully recognizes and detects anomalous events in drone flight log data with promising results.
KW - anomaly detection
KW - digital forensics
KW - drone forensics
KW - forensic timeline
KW - information security
KW - sentence embedding
UR - http://www.scopus.com/inward/record.url?scp=85184802107&partnerID=8YFLogxK
U2 - 10.1109/ICoABCD59879.2023.10390959
DO - 10.1109/ICoABCD59879.2023.10390959
M3 - Conference contribution
AN - SCOPUS:85184802107
T3 - 2023 International Conference on Artificial Intelligence, Blockchain, Cloud Computing, and Data Analytics, ICoABCD 2023
SP - 100
EP - 105
BT - 2023 International Conference on Artificial Intelligence, Blockchain, Cloud Computing, and Data Analytics, ICoABCD 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 International Conference on Artificial Intelligence, Blockchain, Cloud Computing, and Data Analytics, ICoABCD 2023
Y2 - 13 November 2023 through 15 November 2023
ER -