TY - GEN
T1 - A Sampling-Based Sentiment Analysis of Imbalanced Streamed Movie Reviews
AU - Shiddiqi, Ary Mazharuddin
AU - Ramadhan, Reza Wahyu
AU - Ali Dahman, Gehad Adel
AU - Asyari, Zulchair
AU - Ramadhani, Muhammad Rafi
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Sentiment analysis has gained significant importance in analyzing individuals' attitudes and perceptions toward various products, services, and entertainment mediums, including movies. Evaluating the sentiment expressed in movie reviews can provide valuable insights into how users interpret and react to specific films. However, movie review datasets often suffer from an imbalance in the distribution of positive and negative sentiment labels, which presents challenges for accurate sentiment classification. We propose a framework that harnesses streaming data for enhancing sentiment analysis algorithms. First, we create an initial model using an IMDB movie review dataset to categorize real-time review streams. To address the issue of imbalanced streamed data in movie reviews, we apply diverse sampling techniques, mitigating bias toward the dominant sentiment. This method bolsters the sentiment classifier's effectiveness. Additionally, we iteratively improve the initial model using recorded classification outcomes. We conducted comprehensive experiments on varied movie review datasets to assess our approach's effectiveness. Evaluation metrics were used for comparison, including accuracy, precision, recall, and F1-score. The results encompassed contrasting our sampling-driven method with baseline approaches. The SVC outperformed other algorithms in a native classification environment, whereas the extra tree excelled in a streamed classification environment. These outcomes underscored our framework's efficacy in enhancing sentiment analysis algorithm performance.
AB - Sentiment analysis has gained significant importance in analyzing individuals' attitudes and perceptions toward various products, services, and entertainment mediums, including movies. Evaluating the sentiment expressed in movie reviews can provide valuable insights into how users interpret and react to specific films. However, movie review datasets often suffer from an imbalance in the distribution of positive and negative sentiment labels, which presents challenges for accurate sentiment classification. We propose a framework that harnesses streaming data for enhancing sentiment analysis algorithms. First, we create an initial model using an IMDB movie review dataset to categorize real-time review streams. To address the issue of imbalanced streamed data in movie reviews, we apply diverse sampling techniques, mitigating bias toward the dominant sentiment. This method bolsters the sentiment classifier's effectiveness. Additionally, we iteratively improve the initial model using recorded classification outcomes. We conducted comprehensive experiments on varied movie review datasets to assess our approach's effectiveness. Evaluation metrics were used for comparison, including accuracy, precision, recall, and F1-score. The results encompassed contrasting our sampling-driven method with baseline approaches. The SVC outperformed other algorithms in a native classification environment, whereas the extra tree excelled in a streamed classification environment. These outcomes underscored our framework's efficacy in enhancing sentiment analysis algorithm performance.
KW - data streams
KW - imbalanced data classification
KW - sentiment analysis
UR - http://www.scopus.com/inward/record.url?scp=85182939450&partnerID=8YFLogxK
U2 - 10.1109/IEACon57683.2023.10370207
DO - 10.1109/IEACon57683.2023.10370207
M3 - Conference contribution
AN - SCOPUS:85182939450
T3 - IEACon 2023 - 2023 IEEE Industrial Electronics and Applications Conference
SP - 214
EP - 219
BT - IEACon 2023 - 2023 IEEE Industrial Electronics and Applications Conference
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 4th IEEE Industrial Electronics and Applications Conference, IEACon 2023
Y2 - 6 November 2023 through 7 November 2023
ER -