TY - GEN
T1 - Text Augmentation to Overcome Data Limitations in Sentiment Analysis for Bahasa Indonesia
AU - Ashar, Muhammad Nasry
AU - Siahaan, Daniel Oranova
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - This paper focuses on the challenges of sentiment analysis in Natural Language Processing (NLP), with a special emphasis on limited data scenarios. Sentiment analysis, a critical component of text understanding, faces specific difficulties when available data are scarce, leading to problems such as overfitting or underfitting. This study explores ensemble learning, part-of-speech (POS) tagging, and text augmentation techniques to overcome these data limitations. The experimental results demonstrate that text augmentation with the prefixes "me-", "ter-", "ber-", and "di-" is effective at increasing data variety and quantity, which contributes to improved sentiment analysis model performance. The ensemble learning model achieved an accuracy of 91.29% with significant improvements in precision, recall, and F1-score.
AB - This paper focuses on the challenges of sentiment analysis in Natural Language Processing (NLP), with a special emphasis on limited data scenarios. Sentiment analysis, a critical component of text understanding, faces specific difficulties when available data are scarce, leading to problems such as overfitting or underfitting. This study explores ensemble learning, part-of-speech (POS) tagging, and text augmentation techniques to overcome these data limitations. The experimental results demonstrate that text augmentation with the prefixes "me-", "ter-", "ber-", and "di-" is effective at increasing data variety and quantity, which contributes to improved sentiment analysis model performance. The ensemble learning model achieved an accuracy of 91.29% with significant improvements in precision, recall, and F1-score.
KW - Data Limitations
KW - Ensemble Learning
KW - Natural Language Processing
KW - Sentiment Analysis
KW - Text Augmentation
UR - https://www.scopus.com/pages/publications/85217556309
U2 - 10.1109/ICODSE63307.2024.10829895
DO - 10.1109/ICODSE63307.2024.10829895
M3 - Conference contribution
AN - SCOPUS:85217556309
T3 - Proceedings of 2024 IEEE International Conference on Data and Software Engineering: Data-Driven Innovation: Transforming Industries and Societies, ICoDSE 2024
SP - 217
EP - 222
BT - Proceedings of 2024 IEEE International Conference on Data and Software Engineering
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 IEEE International Conference on Data and Software Engineering, ICoDSE 2024
Y2 - 30 October 2024 through 31 October 2024
ER -