TY - GEN
T1 - Pre-Trained Word Embeddings for Sarcasm Detection in Indonesian Tweets
T2 - 9th International Conference on Information Technology, Computer and Electrical Engineering, ICITACEE 2022
AU - Rosid, Mochamad Alfan
AU - Siahaan, Daniel
AU - Saikhu, Ahmad
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - In affective computing, sarcasm detection is vital because sarcasm can affect the polarity of sentiment analysis. Sarcasm is one of the most challenging problems researchers face when conducting sentiment analysis. Sarcasm is difficult to identify in text because the meaning of the words a person expresses is the opposite of what the person really means. Deep learning is currently widely used for sarcasm detection. For generating vector representations of words, three distinct pre-trained word embedding models were employed in this study, namely GloVe, fastText, and BERT. Furthermore, three distinct deep learning architectures were utilized, namely Bidirectional Long Short-Term Memory (BiLSTM), Bidirectional Gated Recurrent Unit (BiGRU), and Convolutional Neural Network (CNN), for sentence-level sarcasm detection in tweets written in Indonesian. The dataset was collected through data crawling using the Twitter API. The data were then preprocessed and features were extracted. The experimental results indicate that the combination of fastText embeddings and BiGRU as the classifier produced the best performance, with an accuracy of 93.85%.
AB - In affective computing, sarcasm detection is vital because sarcasm can affect the polarity of sentiment analysis. Sarcasm is one of the most challenging problems researchers face when conducting sentiment analysis. Sarcasm is difficult to identify in text because the meaning of the words a person expresses is the opposite of what the person really means. Deep learning is currently widely used for sarcasm detection. For generating vector representations of words, three distinct pre-trained word embedding models were employed in this study, namely GloVe, fastText, and BERT. Furthermore, three distinct deep learning architectures were utilized, namely Bidirectional Long Short-Term Memory (BiLSTM), Bidirectional Gated Recurrent Unit (BiGRU), and Convolutional Neural Network (CNN), for sentence-level sarcasm detection in tweets written in Indonesian. The dataset was collected through data crawling using the Twitter API. The data were then preprocessed and features were extracted. The experimental results indicate that the combination of fastText embeddings and BiGRU as the classifier produced the best performance, with an accuracy of 93.85%.
KW - deep learning
KW - pre-trained word embedding
KW - sarcasm detection
KW - twitter
UR - http://www.scopus.com/inward/record.url?scp=85141864436&partnerID=8YFLogxK
U2 - 10.1109/ICITACEE55701.2022.9924084
DO - 10.1109/ICITACEE55701.2022.9924084
M3 - Conference contribution
AN - SCOPUS:85141864436
T3 - Proceedings - 2022 9th International Conference on Information Technology, Computer and Electrical Engineering, ICITACEE 2022
SP - 281
EP - 286
BT - Proceedings - 2022 9th International Conference on Information Technology, Computer and Electrical Engineering, ICITACEE 2022
A2 - Prakoso, Teguh
A2 - Riyadi, Munawar Agus
A2 - Arfan, M.
A2 - Soetrisno, Yosua Alvin Adi
A2 - Afrisal, Hadha
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 25 August 2022 through 26 August 2022
ER -