Learning Models for Software Feature Extraction from Disaster Tweets: A Comparative Study

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This study assesses how effectively different combinations of text representations and machine learning models classify disaster-related tweets, with the aim of improving emergency response. The primary objective is to identify the most reliable approach for categorizing tweets as either disaster-related or unrelated. The research uses TF-IDF for feature vectorization and Word2Vec for word embedding, combined with machine learning models such as Logistic Regression, Support Vector Machines, K-Nearest Neighbors, Decision Trees, and Random Forest. After extensive preprocessing, Word2Vec combined with Logistic Regression achieved the highest performance, with 81% accuracy, precision, and recall, and an F1 score of 80%. Other combinations, including Word2Vec with Support Vector Machines and Word2Vec with Random Forest, also performed strongly. In contrast, TF-IDF combined with K-Nearest Neighbors gave the lowest accuracy, at 68%. These findings highlight the importance of selecting appropriate word embedding techniques and machine learning models for effective text classification. Future research should explore contextual embeddings from Transformer-based models such as BERT, and incorporate temporal and semantic analysis to further improve classification accuracy and robustness.
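For readers unfamiliar with the four metrics reported above, the following is a minimal sketch of how accuracy, precision, recall, and F1 can be computed for a binary disaster/unrelated classifier. It is not the authors' evaluation code: the function name `binary_metrics`, the label encoding (1 = disaster-related, 0 = unrelated), and the toy labels are all assumptions made for illustration, and the abstract does not state whether the reported scores are per-class or averaged.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for one positive class.

    Hypothetical helper for illustration; the label encoding is assumed.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (assumed): 1 = disaster-related, 0 = unrelated.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
scores = binary_metrics(y_true, y_pred)
print(scores)
```

Because F1 is the harmonic mean of precision and recall for a single class, a reported F1 that differs from identical precision and recall values (as in 81% versus 80%) usually suggests the figures were averaged across classes, although the abstract does not confirm this.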

Original language: English
Title of host publication: ICECOS 2024 - 4th International Conference on Electrical Engineering and Computer Science, Proceeding
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 83-88
Number of pages: 6
ISBN (Electronic): 9798350368253
Publication status: Published - 2024
Event: 4th International Conference on Electrical Engineering and Computer Science, ICECOS 2024 - Hybrid, Palembang, Indonesia
Duration: 25 Sept 2024 to 26 Sept 2024

Publication series

Name: ICECOS 2024 - 4th International Conference on Electrical Engineering and Computer Science, Proceeding

Conference

Conference: 4th International Conference on Electrical Engineering and Computer Science, ICECOS 2024
Country/Territory: Indonesia
City: Hybrid, Palembang
Period: 25/09/24 to 26/09/24

Keywords

  • comparative analysis
  • disaster tweets
  • feature extraction
  • machine learning
  • word embedding
