TY - GEN
T1 - Topic Modeling for Cyber Threat Intelligence (CTI)
AU - Suryotrisongko, Hatma
AU - Ginardi, Hari
AU - Ciptaningtyas, Henning Titi
AU - Dehqan, Saeed
AU - Musashi, Yasuo
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Topic modeling algorithms from the natural language processing (NLP) discipline have been used for various applications, for instance, product recommendation in e-commerce systems. In this paper, we briefly reviewed topic modeling applications and then described our proposed idea of applying topic modeling approaches to cyber threat intelligence (CTI). We improved on previous work by implementing the BERTopic and Top2Vec approaches, enabling users to select their preferred pre-trained text/sentence embedding model, and supporting various languages. We implemented our proposed idea as a new topic modeling module for the Open Web Application Security Project (OWASP) Maryam: Open-Source Intelligence (OSINT) framework. We also described our experimental results on a leaked hacker forum dataset (nulled.io) to attract more researchers and open-source communities to participate in the Maryam project of the OWASP Foundation.
AB - Topic modeling algorithms from the natural language processing (NLP) discipline have been used for various applications, for instance, product recommendation in e-commerce systems. In this paper, we briefly reviewed topic modeling applications and then described our proposed idea of applying topic modeling approaches to cyber threat intelligence (CTI). We improved on previous work by implementing the BERTopic and Top2Vec approaches, enabling users to select their preferred pre-trained text/sentence embedding model, and supporting various languages. We implemented our proposed idea as a new topic modeling module for the Open Web Application Security Project (OWASP) Maryam: Open-Source Intelligence (OSINT) framework. We also described our experimental results on a leaked hacker forum dataset (nulled.io) to attract more researchers and open-source communities to participate in the Maryam project of the OWASP Foundation.
KW - Maryam
KW - OWASP
KW - cyber threat intelligence
KW - threat recommendation
KW - topic modeling
UR - http://www.scopus.com/inward/record.url?scp=85146934800&partnerID=8YFLogxK
U2 - 10.1109/ICIC56845.2022.10006988
DO - 10.1109/ICIC56845.2022.10006988
M3 - Conference contribution
AN - SCOPUS:85146934800
T3 - 2022 7th International Conference on Informatics and Computing, ICIC 2022
BT - 2022 7th International Conference on Informatics and Computing, ICIC 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 7th International Conference on Informatics and Computing, ICIC 2022
Y2 - 8 December 2022 through 9 December 2022
ER -