Abstract
The increasing volume of user-generated content on social media has amplified the spread of hate speech, posing significant challenges to online safety and content moderation. This study presents a comprehensive approach for aspect-based sentiment classification using multiple embedding and classification techniques. The dataset, sourced from the ETHOS multi-label hate speech corpus, includes eight distinct labels: violence, directed-vs-generalized, gender, race, national-origin, disability, religion, and sexual-orientation. Preprocessing techniques were applied to remove symbols, numbers, and irrelevant elements, ensuring data uniformity. Aspect category keywords were expanded using a generative language model (ChatGPT) to enhance the keyword corpus for each label. The research compares five methods: Word2Vec + cosine similarity (AC1), Global Vectors for Word Representation (GloVe) + cosine similarity (AC2), Bidirectional Encoder Representations from Transformers (BERT) embedding + cosine similarity (AC3), fine-tuned BERT (AC4), and pretrained Robustly Optimized BERT (RoBERTa) with fine-tuning (AC5). Evaluation was conducted using Repeated K-Fold cross-validation to reduce bias, measuring performance through mean accuracy, precision, recall, F1 score, and Hamming loss. Results indicate that AC5, leveraging pretrained RoBERTa fine-tuned on the dataset, achieved superior performance with a mean F1 score of 0.674, mean precision of 0.862, and mean accuracy of 0.890, highlighting the efficacy of transformer-based models in capturing complex, context-specific sentiments. This research underscores the potential of combining extensive pretraining with domain-specific fine-tuning for precise and reliable multilabel sentiment classification.
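The cosine-similarity variants (AC1 to AC3) and the Hamming loss metric named in the abstract can be illustrated with a minimal sketch. It assumes that each text and each label's keyword list has already been embedded as a vector. The toy 3-dimensional vectors, the 0.5 threshold, and the helper names `cosine`, `assign_labels`, and `hamming_loss` are hypothetical and are not taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def assign_labels(text_vec, label_vecs, threshold=0.5):
    """Return one 0/1 indicator per label: 1 if the similarity meets the threshold."""
    return [1 if cosine(text_vec, vec) >= threshold else 0
            for vec in label_vecs.values()]

def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that are predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p
                for true_row, pred_row in zip(y_true, y_pred)
                for t, p in zip(true_row, pred_row))
    return wrong / total

# Toy embeddings (assumed); real ones would come from Word2Vec, GloVe, or BERT.
label_vecs = {
    "gender":   [1.0, 0.0, 0.0],
    "race":     [0.0, 1.0, 0.0],
    "religion": [0.0, 0.0, 1.0],
}
texts = [[0.9, 0.1, 0.0], [0.0, 0.7, 0.7]]
y_true = [[1, 0, 0], [0, 1, 0]]

y_pred = [assign_labels(t, label_vecs) for t in texts]
print(y_pred)                        # [[1, 0, 0], [0, 1, 1]]
print(hamming_loss(y_true, y_pred))  # 1 of 6 label slots is wrong
```

Because each label is thresholded independently, a text can receive several labels at once, which is what makes the task multilabel. Hamming loss then counts errors per label slot instead of requiring an exact match on the full label set.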
| Original language | English |
|---|---|
| Title of host publication | 2024 Beyond Technology Summit on Informatics International Conference, BTS-I2C 2024 |
| Editors | Ferry Wahyu Wibowo |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 95-100 |
| Number of pages | 6 |
| ISBN (Electronic) | 9798331508579 |
| Publication status | Published - 2024 |
| Event | 2024 Beyond Technology Summit on Informatics International Conference, BTS-I2C 2024 - Jember, Indonesia. Duration: 19 Dec 2024 → … |
Publication series
| Name | 2024 Beyond Technology Summit on Informatics International Conference, BTS-I2C 2024 |
|---|---|
Conference
| Conference | 2024 Beyond Technology Summit on Informatics International Conference, BTS-I2C 2024 |
|---|---|
| Country/Territory | Indonesia |
| City | Jember |
| Period | 19/12/24 → … |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 16 Peace, Justice and Strong Institutions
Keywords
- Aspect-Based Sentiment Analysis (ABSA)
- Transformer
- cosine similarity
- cross-validation
- hate speech detection
- multilabel classification
- word embeddings
Fingerprint
Dive into the research topics of 'Multilabel Aspect-Based Sentiment Analysis with Diverse Embedding Techniques and Finetuned Transformers for Label Detection'. Together they form a unique fingerprint.