TY - GEN
T1 - A Comparison of Deep Learning for Software Features Extraction in Forensic Online News
AU - Suarezsaga, Fredrikus
AU - Siahaan, Daniel
AU - Yuniarti, Anny
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Software features in the forensic domain are functional components of software. Software feature extraction detects these features in documents, here online news articles in the forensic category. This study aims to find a suitable deep learning model for software feature extraction. It combines deep learning models with a CRF layer: BiLSTM-CRF, BiGRU-CRF, and LSTM-CRF. The models are trained with word embedding models such as GloVe, Word2Vec, and FastText. The dataset was collected by scraping online news in the forensic category; the articles were tokenized at the word level and annotated. The experiments compare the deep learning methods with and without word embeddings. The results show an increase of 2% to 7% in performance metrics with word embeddings. Combining FastText word embeddings with BiLSTM-CRF yields the best performance, with a precision of 94.03%, a recall of 95.60%, an F1-measure of 93.66%, and an accuracy of 98.99%.
AB - Software features in the forensic domain are functional components of software. Software feature extraction detects these features in documents, here online news articles in the forensic category. This study aims to find a suitable deep learning model for software feature extraction. It combines deep learning models with a CRF layer: BiLSTM-CRF, BiGRU-CRF, and LSTM-CRF. The models are trained with word embedding models such as GloVe, Word2Vec, and FastText. The dataset was collected by scraping online news in the forensic category; the articles were tokenized at the word level and annotated. The experiments compare the deep learning methods with and without word embeddings. The results show an increase of 2% to 7% in performance metrics with word embeddings. Combining FastText word embeddings with BiLSTM-CRF yields the best performance, with a precision of 94.03%, a recall of 95.60%, an F1-measure of 93.66%, and an accuracy of 98.99%.
KW - BiLSTM
KW - CRF
KW - deep learning
KW - forensics
KW - software features extraction
KW - word embedding
UR - http://www.scopus.com/inward/record.url?scp=85172861002&partnerID=8YFLogxK
U2 - 10.1109/ICCSCE58721.2023.10237097
DO - 10.1109/ICCSCE58721.2023.10237097
M3 - Conference contribution
AN - SCOPUS:85172861002
T3 - Proceedings - 13th IEEE International Conference on Control System, Computing and Engineering, ICCSCE 2023
SP - 56
EP - 61
BT - Proceedings - 13th IEEE International Conference on Control System, Computing and Engineering, ICCSCE 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 13th IEEE International Conference on Control System, Computing and Engineering, ICCSCE 2023
Y2 - 25 August 2023 through 26 August 2023
ER -