Multilabel Aspect-Based Sentiment Analysis with Diverse Embedding Techniques and Finetuned Transformers for Label Detection

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The increasing volume of user-generated content on social media has amplified the spread of hate speech, posing significant challenges to online safety and content moderation. This study presents a comprehensive approach for aspect-based sentiment classification using multiple embedding and classification techniques. The dataset, sourced from the ETHOS multi-label hate speech corpus, includes eight distinct labels: violence, directed-vs-generalized, gender, race, national-origin, disability, religion, and sexual-orientation. Preprocessing techniques were applied to remove symbols, numbers, and irrelevant elements, ensuring data uniformity. Aspect category keywords were expanded using a generative language model (ChatGPT) to enhance the keyword corpus for each label. The research compares five methods: Word2Vec + cosine similarity (AC1), Global Vectors for Word Representation (GloVe) + cosine similarity (AC2), Bidirectional Encoder Representations from Transformers (BERT) embedding + cosine similarity (AC3), fine-tuned BERT (AC4), and pretrained Robustly Optimized BERT (RoBERTa) with fine-tuning (AC5). Evaluation was conducted using Repeated K-Fold cross-validation to prevent bias, measuring performance through mean accuracy, precision, recall, F1 score, and Hamming loss. Results indicate that AC5, leveraging pretrained RoBERTa fine-tuned on the dataset, achieved superior performance with a mean F1 score of 0.674, mean precision of 0.862, and mean accuracy of 0.890, highlighting the efficacy of transformer-based models in capturing complex, context-specific sentiments. This research underscores the potential of combining extensive pretraining with domain-specific fine-tuning for precise and reliable multilabel sentiment classification.
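
As a rough illustration of the embedding + cosine similarity idea behind AC1 to AC3, and of the Hamming loss metric, the sketch below assigns labels by comparing a text vector with per-label keyword centroids. It is a minimal sketch, not the authors' implementation: the 3-dimensional vectors, the keyword lists, the averaging of token vectors, and the 0.5 threshold are all assumptions made for illustration, standing in for real Word2Vec, GloVe, or BERT embeddings and the ChatGPT-expanded keyword corpus.

```python
import math

# Hypothetical 3-dimensional "embeddings"; a real run would load Word2Vec,
# GloVe, or BERT vectors instead.
TOY_VECTORS = {
    "women": [1.0, 0.0, 0.0],
    "men": [0.9, 0.1, 0.0],
    "church": [0.0, 1.0, 0.0],
    "faith": [0.1, 0.9, 0.0],
    "attack": [0.0, 0.0, 1.0],
}

# Hypothetical keyword lists for three of the eight labels.
LABEL_KEYWORDS = {
    "gender": ["women", "men"],
    "religion": ["church", "faith"],
    "violence": ["attack"],
}


def embed(tokens, dim=3):
    """Average the vectors of known tokens; zero vector if none are known."""
    vectors = [TOY_VECTORS[t] for t in tokens if t in TOY_VECTORS]
    if not vectors:
        return [0.0] * dim
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# One centroid per label, built from that label's keyword vectors.
CENTROIDS = {label: embed(words) for label, words in LABEL_KEYWORDS.items()}


def predict_labels(tokens, threshold=0.5):
    """Mark a label as present when similarity to its centroid meets the threshold."""
    vec = embed(tokens)
    return {label: int(cosine(vec, c) >= threshold) for label, c in CENTROIDS.items()}


def hamming_loss(y_true, y_pred):
    """Fraction of individual label cells that are predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p for rt, rp in zip(y_true, y_pred) for t, p in zip(rt, rp))
    return wrong / total


print(predict_labels(["attack", "women"]))
print(hamming_loss([[1, 0, 1], [0, 1, 0]], [[1, 0, 0], [0, 1, 0]]))
```

In the last call, one of six label cells is wrong, so the Hamming loss is 1/6. Because each label is thresholded independently, a text can receive several labels at once, which is what makes the setup multilabel rather than multiclass.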

Original language: English
Title of host publication: 2024 Beyond Technology Summit on Informatics International Conference, BTS-I2C 2024
Editors: Ferry Wahyu Wibowo
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 95-100
Number of pages: 6
ISBN (Electronic): 9798331508579
DOIs
Publication status: Published - 2024
Event: 2024 Beyond Technology Summit on Informatics International Conference, BTS-I2C 2024 - Jember, Indonesia
Duration: 19 Dec 2024 → …

Publication series

Name: 2024 Beyond Technology Summit on Informatics International Conference, BTS-I2C 2024

Conference

Conference: 2024 Beyond Technology Summit on Informatics International Conference, BTS-I2C 2024
Country/Territory: Indonesia
City: Jember
Period: 19/12/24 → …

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 16 - Peace, Justice and Strong Institutions

Keywords

  • Aspect-Based Sentiment Analysis (ABSA)
  • Transformer
  • cosine similarity
  • cross-validation
  • hate speech detection
  • multilabel classification
  • word embeddings
