Comparison of Deep Learning Methods in Detecting Hate Speech in Indonesian Tweets

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Hate speech harms both its targeted victims and those who witness it. It spreads not only in person or verbally but also in writing on social media, where it can be difficult to identify. Currently, hate speech detection relies on machine learning. This study generates vector representations of words using three pre-trained word embedding models: Global Vectors (GloVe), FastText, and Bidirectional Encoder Representations from Transformers (BERT). Synthetic Minority Oversampling Technique (SMOTE) and Random Over Sampling (ROS) were used as balancing methods to correct the data imbalance between classes. In addition, three deep learning architectures were used to identify sentence-level hate speech in Indonesian tweets: Bidirectional Long Short-Term Memory (BiLSTM), Convolutional Neural Network (CNN), and Recurrent Neural Network (RNN). The dataset was collected by crawling tweets through the Twitter API. After the data were preprocessed, features were extracted. In the experiments, the classifier that combined an RNN with BERT embeddings and SMOTE produced the most accurate results (95.5%).
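The two balancing methods named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `random_oversample` and `smote_like`, the toy feature vectors, and the choice of `k` are all assumptions made for illustration, and the SMOTE variant below is a simplified version of the idea (interpolating between a minority sample and one of its nearest minority neighbours).

```python
import random
from collections import Counter


def random_oversample(features, labels, seed=0):
    """ROS: duplicate minority rows at random until each class matches the majority count."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(features), list(labels)
    for cls, n in counts.items():
        pool = [x for x, y in zip(features, labels) if y == cls]
        for _ in range(target - n):
            out_x.append(rng.choice(pool))
            out_y.append(cls)
    return out_x, out_y


def smote_like(features, labels, minority, k=2, seed=0):
    """Simplified SMOTE: create synthetic minority rows on the line segment
    between a minority row and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    pool = [x for x, y in zip(features, labels) if y == minority]
    if len(pool) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    out_x, out_y = list(features), list(labels)
    for _ in range(target - counts[minority]):
        base = rng.choice(pool)
        # Rank the other minority rows by squared Euclidean distance to `base`.
        neighbours = sorted(
            (p for p in pool if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        out_x.append([a + gap * (b - a) for a, b in zip(base, other)])
        out_y.append(minority)
    return out_x, out_y


# Hypothetical toy data: four majority rows (label 0), two minority rows (label 1).
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [1.0, 1.0], [1.2, 0.9]]
y = [0, 0, 0, 0, 1, 1]

ros_x, ros_y = random_oversample(X, y)
sm_x, sm_y = smote_like(X, y, minority=1)
```

ROS only repeats existing minority rows, whereas the SMOTE-style step adds new points between them. In practice, either step would be applied to the training split only, after the tweets have been converted to embedding vectors.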

Original language: English
Title of host publication: SIET 2023 - Proceedings of the 8th International Conference on Sustainable Information Engineering and Technology
Publisher: Association for Computing Machinery
Pages: 58-63
Number of pages: 6
ISBN (Electronic): 9798400708503
Publication status: Published - 24 Oct 2023
Externally published: Yes
Event: 8th International Conference on Sustainable Information Engineering and Technology, SIET 2023 - Bali, Indonesia
Duration: 24 Oct 2023 to 25 Oct 2023

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 8th International Conference on Sustainable Information Engineering and Technology, SIET 2023
Country/Territory: Indonesia
City: Bali
Period: 24/10/23 to 25/10/23

Keywords

  • Deep Learning
  • Hate Speech
  • Imbalance Data
  • Indonesia Tweets
  • Word Embedding
