TY - GEN
T1 - Aspect Based Multilabel Text Classification for Identifying Dangerous Speech Twitter Text
AU - Findawati, Yulian
AU - Pramana, Kresna Adhi
AU - Raharjo, Agus Budi
AU - Abadi, Totok Wahyu
AU - Purwitasari, Diana
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - As a subset of hate speech, dangerous speech is any expression that can increase the risk of violence being committed against other people. So far, hate speech research has only determined whether a sentence is categorized as hate speech; it does not explain which aspects of a sentence make it dangerous speech. The aspects of dangerous speech are social context, historical context, dehumanization, accusation in the mirror, attacks on women and children, loyalty to the group, and group threat. This study uses multi-label text classification to identify dangerous speech in Twitter texts based on these seven aspects. We then assign a weighted score from those aspects to differentiate dangerous speech from hate speech. The test results show that the Naive Bayes method performs best, with label-based subset accuracy (±36%), instance-based (average) accuracy (±86%), and classification accuracy (±77%). However, although Naive Bayes performs best in terms of instance-based (average) accuracy, the average difference between Naive Bayes and the other methods is only ±0.014, which indicates that the other methods also perform quite well.
AB - As a subset of hate speech, dangerous speech is any expression that can increase the risk of violence being committed against other people. So far, hate speech research has only determined whether a sentence is categorized as hate speech; it does not explain which aspects of a sentence make it dangerous speech. The aspects of dangerous speech are social context, historical context, dehumanization, accusation in the mirror, attacks on women and children, loyalty to the group, and group threat. This study uses multi-label text classification to identify dangerous speech in Twitter texts based on these seven aspects. We then assign a weighted score from those aspects to differentiate dangerous speech from hate speech. The test results show that the Naive Bayes method performs best, with label-based subset accuracy (±36%), instance-based (average) accuracy (±86%), and classification accuracy (±77%). However, although Naive Bayes performs best in terms of instance-based (average) accuracy, the average difference between Naive Bayes and the other methods is only ±0.014, which indicates that the other methods also perform quite well.
KW - dangerous speech
KW - multi-label text classification
KW - twitter texts
KW - weighted sum model
UR - http://www.scopus.com/inward/record.url?scp=85141603438&partnerID=8YFLogxK
U2 - 10.1109/ICoICT55009.2022.9914900
DO - 10.1109/ICoICT55009.2022.9914900
M3 - Conference contribution
AN - SCOPUS:85141603438
T3 - 2022 10th International Conference on Information and Communication Technology, ICoICT 2022
SP - 179
EP - 183
BT - 2022 10th International Conference on Information and Communication Technology, ICoICT 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 10th International Conference on Information and Communication Technology, ICoICT 2022
Y2 - 2 August 2022 through 3 August 2022
ER -