TY - GEN
T1 - Aspect-level Sentiment Analysis for Social Media Data in the Political Domain using Hierarchical Attention and Position Embeddings
AU - Kusumawardani, Renny Pradina
AU - Maulidani, Muhammad Wildan
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2020/8
Y1 - 2020/8
N2 - In this paper, we present our work on aspect-level sentiment analysis of social media data, specifically in the political domain. Aside from being linguistically irregular, political tweets are often ambiguous or contain sentiments of opposite polarity. To address this, we use a deep learning architecture with hierarchical attention and position embeddings to enable a finer-grained analysis of sentiments at different positions in the text. Our dataset consists of 3022 tweets in the political domain in Bahasa Indonesia, containing 1514 unique aspects. We find that there are two important factors for model performance: first, the use of a gating mechanism of appropriate complexity - in our case, LSTM gives the best performance in terms of accuracy and outperforms GRU and RNN by almost 7% in average recall. Second, the use of trainable embeddings pre-trained on data in similar domains - trainable Word2Vec embeddings trained on social media data in Bahasa Indonesia give more than 4% better accuracy than a model without trainable embeddings. Our analysis also shows that correctly predicted tweets have more variance in attention weights, in contrast to incorrectly predicted ones, in which input tokens are often assigned similar weights. This indicates the usefulness of the attention mechanism in aspect-based sentiment analysis.
AB - In this paper, we present our work on aspect-level sentiment analysis of social media data, specifically in the political domain. Aside from being linguistically irregular, political tweets are often ambiguous or contain sentiments of opposite polarity. To address this, we use a deep learning architecture with hierarchical attention and position embeddings to enable a finer-grained analysis of sentiments at different positions in the text. Our dataset consists of 3022 tweets in the political domain in Bahasa Indonesia, containing 1514 unique aspects. We find that there are two important factors for model performance: first, the use of a gating mechanism of appropriate complexity - in our case, LSTM gives the best performance in terms of accuracy and outperforms GRU and RNN by almost 7% in average recall. Second, the use of trainable embeddings pre-trained on data in similar domains - trainable Word2Vec embeddings trained on social media data in Bahasa Indonesia give more than 4% better accuracy than a model without trainable embeddings. Our analysis also shows that correctly predicted tweets have more variance in attention weights, in contrast to incorrectly predicted ones, in which input tokens are often assigned similar weights. This indicates the usefulness of the attention mechanism in aspect-based sentiment analysis.
KW - aspect-level
KW - deep learning
KW - hierarchical attention
KW - political domain
KW - position embeddings
KW - sentiment analysis
KW - social media
UR - http://www.scopus.com/inward/record.url?scp=85094587550&partnerID=8YFLogxK
U2 - 10.1109/ICoDSA50139.2020.9212883
DO - 10.1109/ICoDSA50139.2020.9212883
M3 - Conference contribution
AN - SCOPUS:85094587550
T3 - 2020 International Conference on Data Science and Its Applications, ICoDSA 2020
BT - 2020 International Conference on Data Science and Its Applications, ICoDSA 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2020 International Conference on Data Science and Its Applications, ICoDSA 2020
Y2 - 5 August 2020 through 6 August 2020
ER -