Pre-Trained Word Embeddings for Sarcasm Detection in Indonesian Tweets: A Comparative Study

Mochamad Alfan Rosid, Daniel Siahaan*, Ahmad Saikhu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Citations (Scopus)

Abstract

In affective computing, sarcasm detection is vital because sarcasm can affect the polarity determined by sentiment analysis, which makes it one of the most challenging problems researchers face in that task. Sarcasm is difficult to identify in text because the literal meaning of a person's words is the opposite of what the person actually means. Deep learning is currently widely used for sarcasm detection. To generate vector representations of words, this study employed three pre-trained word embedding models: GloVe, fastText, and BERT. It also used three deep learning architectures, namely Bidirectional Long Short-Term Memory (BiLSTM), Bidirectional Gated Recurrent Unit (BiGRU), and Convolutional Neural Network (CNN), for sentence-level sarcasm detection in tweets written in Indonesian. The dataset was collected by crawling with the Twitter API; the data were then preprocessed and features were extracted. The experimental results indicate that combining fastText embeddings with a BiGRU classifier produced the best performance, with an accuracy of 93.85%.

Original language: English
Title of host publication: Proceedings - 2022 9th International Conference on Information Technology, Computer and Electrical Engineering, ICITACEE 2022
Editors: Teguh Prakoso, Munawar Agus Riyadi, M. Arfan, Yosua Alvin Adi Soetrisno, Hadha Afrisal
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 281-286
Number of pages: 6
ISBN (Electronic): 9781665471480
Publication status: Published - 2022
Event: 9th International Conference on Information Technology, Computer and Electrical Engineering, ICITACEE 2022 - Semarang, Indonesia
Duration: 25 Aug 2022 → 26 Aug 2022

Publication series

Name: Proceedings - 2022 9th International Conference on Information Technology, Computer and Electrical Engineering, ICITACEE 2022

Conference

Conference: 9th International Conference on Information Technology, Computer and Electrical Engineering, ICITACEE 2022
Country/Territory: Indonesia
City: Semarang
Period: 25/08/22 → 26/08/22

Keywords

  • deep learning
  • pre-trained word embedding
  • sarcasm detection
  • twitter
