TY - GEN
T1 - Hierarchical Topic Mining and Multi-label Classification on Online News in Bahasa
AU - Esti Anggraini, Ratih Nur
AU - Machmudah, Hana
AU - Sarno, Riyanarto
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - The proliferation of online news media has resulted in a surge of online news. Consequently, properly categorizing news articles by topic becomes crucial, enabling readers to analyze and comprehend the information more easily. In this study, the researchers conducted topic mining using the Joint Spherical Tree and Text Embedding (JOSH) method, a technique that performs topic modeling in spherical space. This method embeds a tree structure into the spherical space so that each category is surrounded by its representative terms, which point in the same direction within that space. The resulting topic hierarchy was then employed for multi-label classification of Bahasa news texts acquired through web scraping from various news sites. The best topic mining result achieved a topic coherence value of 0.7882. Subsequently, a Long Short-Term Memory (LSTM) model was developed for multi-label classification, yielding good results, including a Hamming loss value of 0.9229 and a recall score of 0.9982.
KW - multi-label classification
KW - online news
KW - topic hierarchy
KW - topic mining
UR - http://www.scopus.com/inward/record.url?scp=85186516317&partnerID=8YFLogxK
U2 - 10.1109/ICAMIMIA60881.2023.10427844
DO - 10.1109/ICAMIMIA60881.2023.10427844
M3 - Conference contribution
AN - SCOPUS:85186516317
T3 - 2023 International Conference on Advanced Mechatronics, Intelligent Manufacture and Industrial Automation, ICAMIMIA 2023 - Proceedings
BT - 2023 International Conference on Advanced Mechatronics, Intelligent Manufacture and Industrial Automation, ICAMIMIA 2023 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 International Conference on Advanced Mechatronics, Intelligent Manufacture and Industrial Automation, ICAMIMIA 2023
Y2 - 14 November 2023 through 15 November 2023
ER -