Abstract

The wide variability of Long COVID symptoms often hampers effective disease management. These symptoms can persist long after infection and often go unrecorded in medical records, particularly in mild cases that do not require hospitalization. Social media texts offer a diverse source of information on emerging health conditions, and Natural Language Processing (NLP) techniques are essential for leveraging these data to improve our understanding of Long COVID. This study introduces an NLP approach that uses BERT-based models to detect Long COVID symptoms in social media posts. Data were collected from the social media platform Twitter using the keyword #LongCovid and then passed through a multi-stage preprocessing pipeline that included text cleaning and lexicon-based text filtering. Three models (BERT, BioBERT, and Bio+Clinical BERT) were fine-tuned and evaluated based on their F1 scores. The experimental results show that the general BERT model outperformed the domain-specific BioBERT and Bio+Clinical BERT models in multi-label text classification, achieving the highest F1 score of 89.90%. This finding highlights BERT's suitability for processing informal language on social media platforms and suggests that general-purpose language models may be more effective for health surveillance on these platforms than models pretrained on medical data. Our study contributes to academic understanding by demonstrating effective identification of symptoms, particularly Long COVID symptoms, from social media data for health monitoring.
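The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the symptom lexicon, the cleaning rules, the function names, and the choice of micro-averaged F1 (the abstract does not say which averaging was used).

```python
import re

# Hypothetical symptom lexicon; the lexicon used in the study is not
# given in the abstract.
SYMPTOM_LEXICON = {
    "fatigue": ["fatigue", "exhausted"],
    "brain_fog": ["brain fog", "memory loss"],
    "breathlessness": ["short of breath", "breathless"],
}

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")


def clean_text(text):
    """Assumed cleaning: drop URLs and @mentions, strip '#', lowercase,
    and collapse whitespace."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", " ")
    return " ".join(text.lower().split())


def lexicon_labels(text):
    """Return the set of lexicon labels whose terms occur in the cleaned
    text. Simple substring matching, so partial-word hits are possible."""
    cleaned = clean_text(text)
    return {
        label
        for label, terms in SYMPTOM_LEXICON.items()
        if any(term in cleaned for term in terms)
    }


def micro_f1(true_sets, pred_sets):
    """Micro-averaged F1 over multi-label predictions given as label sets."""
    tp = fp = fn = 0
    for true, pred in zip(true_sets, pred_sets):
        tp += len(true & pred)   # labels correctly predicted
        fp += len(pred - true)   # labels predicted but not present
        fn += len(true - pred)   # labels present but missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In such a pipeline, posts for which `lexicon_labels` returns an empty set would be filtered out before fine-tuning, and `micro_f1` would be applied to the label sets predicted by each fine-tuned model.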

Original language: English
Title of host publication: 2024 International Conference on Decision Aid Sciences and Applications, DASA 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350369106
Publication status: Published - 2024
Event: 2024 International Conference on Decision Aid Sciences and Applications, DASA 2024 - Manama, Bahrain
Duration: 11 Dec 2024 to 12 Dec 2024

Keywords

  • BERT
  • Long COVID
  • SDG 3
  • Symptoms Detection
  • Text Classification
  • Twitter

Title: Early Detection of Long COVID Symptoms from Social Media Using BERT