IndoBERT-Based Ensemble Learning for Multi-Level Multi-Label Hate Speech Detection in Indonesian Social Media

  • Institut Teknologi Sepuluh Nopember

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

Hate speech on social media platforms has become a pressing issue, with harmful content often leading to social tensions and emotional harm. In Indonesia, the complex linguistic and cultural context of online discourse presents additional challenges for effective hate speech detection. This study addresses these challenges by presenting an ensemble learning approach for hate speech detection in Indonesian social media. Leveraging IndoBERT for language understanding and combining it with Bi-LSTM and Bi-GRU models for sequence processing, we developed a robust multi-model architecture that effectively captures linguistic patterns and contextual nuances unique to Indonesian. The proposed ensemble framework was tested on a comprehensive dataset with multiple hate speech labels, including categories such as Religion, Race, Gender, and Severity. Experimental results demonstrate that the ensemble model achieved an accuracy of 86% and an F1-score of 63%, significantly outperforming individual models across most categories. This approach highlights the potential of ensemble learning for automated content moderation in Indonesian social media, providing a promising solution for managing diverse forms of online hate speech.
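
The abstract does not say how the individual models' outputs are combined or how the F1-score is averaged across labels. As a minimal sketch only, assuming soft voting (averaging per-label probabilities), a 0.5 decision threshold, and macro-averaged F1, the following pure-Python example shows how a multi-label ensemble can be scored. The label subset, the three probability tables, and all function names are hypothetical and are not taken from the paper.

```python
from typing import List

# Illustrative subset of labels; the real dataset has more categories.
LABELS = ["religion", "race", "gender"]


def soft_vote(prob_sets: List[List[List[float]]], threshold: float = 0.5) -> List[List[int]]:
    """Average per-label probabilities across models, then threshold.

    prob_sets[m][i][k] is model m's probability that sample i carries label k.
    Soft voting is an assumed combination rule, not the paper's confirmed method.
    """
    n_models = len(prob_sets)
    n_samples = len(prob_sets[0])
    n_labels = len(prob_sets[0][0])
    preds = []
    for i in range(n_samples):
        row = []
        for k in range(n_labels):
            mean_p = sum(prob_sets[m][i][k] for m in range(n_models)) / n_models
            row.append(1 if mean_p >= threshold else 0)
        preds.append(row)
    return preds


def label_accuracy(y_true: List[List[int]], y_pred: List[List[int]]) -> float:
    """Fraction of individual (sample, label) decisions that are correct."""
    total = sum(len(row) for row in y_true)
    correct = sum(t == p for rt, rp in zip(y_true, y_pred) for t, p in zip(rt, rp))
    return correct / total


def macro_f1(y_true: List[List[int]], y_pred: List[List[int]]) -> float:
    """Unweighted mean of per-label F1 scores (a label with no hits scores 0)."""
    scores = []
    for k in range(len(y_true[0])):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[k] == 1 and p[k] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[k] == 0 and p[k] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[k] == 1 and p[k] == 0)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical per-label probabilities from three models for four posts.
model_a = [[0.9, 0.1, 0.2], [0.2, 0.1, 0.1], [0.1, 0.4, 0.1], [0.1, 0.2, 0.3]]
model_b = [[0.8, 0.2, 0.1], [0.1, 0.3, 0.2], [0.2, 0.7, 0.2], [0.2, 0.1, 0.4]]
model_c = [[0.7, 0.3, 0.3], [0.3, 0.2, 0.1], [0.1, 0.6, 0.1], [0.1, 0.1, 0.2]]

y_true = [[1, 0, 0], [0, 0, 0], [0, 1, 0], [0, 0, 1]]
y_pred = soft_vote([model_a, model_b, model_c])

print("predictions:", y_pred)
print("label accuracy:", round(label_accuracy(y_true, y_pred), 3))
print("macro F1:", round(macro_f1(y_true, y_pred), 3))
```

In this toy data the ensemble misses the single positive "gender" example, so label accuracy is 11/12 while macro F1 drops to 2/3. This is one plausible reason, under the stated assumptions, why a multi-label system with sparse positive labels can report accuracy well above its F1-score.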

Original language: English
Title of host publication: 2024 Beyond Technology Summit on Informatics International Conference, BTS-I2C 2024
Editors: Ferry Wahyu Wibowo
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 456-461
Number of pages: 6
ISBN (Electronic): 9798331508579
Publication status: Published - 2024
Event: 2024 Beyond Technology Summit on Informatics International Conference, BTS-I2C 2024 - Jember, Indonesia
Duration: 19 Dec 2024 → …

Publication series

Name: 2024 Beyond Technology Summit on Informatics International Conference, BTS-I2C 2024

Conference

Conference: 2024 Beyond Technology Summit on Informatics International Conference, BTS-I2C 2024
Country/Territory: Indonesia
City: Jember
Period: 19/12/24 → …

Keywords

  • Bi-GRU
  • Bi-LSTM
  • Hate speech detection
  • IndoBERT
  • ensemble learning
  • multi-level multi-label classification
