Domain-Adaptive Fine-Tuning of BioMedBERT for Medical Text Classification

Research output: Contribution to journal › Article › peer-review

Abstract

Accurate classification of medical notes and texts is critical for improving biomedical information retrieval and decision-support systems. In this study, we propose a hybrid deep learning model that combines BioMedBERT with cross-attention and a BiLSTM, aimed at improving the classification of disease-related abstracts across five categories. The model was evaluated on a dataset of 14,000 annotated samples drawn from the scientific medical literature and achieves a macro F1-score of 63.82, outperforming traditional methods such as sentence-embedding models (SimCSE, SBERT), zero-shot entailment approaches, and BioBERT variants paired with MLP classifiers. The findings show that while the model effectively distinguishes categories such as neoplasms and cardiovascular diseases, it still struggles with abstracts whose semantics overlap, particularly general pathological conditions. This research demonstrates the efficacy of combining domain-specific language models with sequence and attention mechanisms, offering a viable method for scalable and interpretable biomedical text classification.
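
To make the described architecture concrete, the following is a minimal PyTorch sketch of how BioMedBERT, a BiLSTM, and cross-attention could be wired together for five-way classification. The checkpoint name, layer sizes, attention configuration, and pooling strategy are illustrative assumptions, not the paper's exact setup:

import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

# Assumed Hugging Face checkpoint for BioMedBERT; the paper does not
# specify the exact identifier.
MODEL_NAME = "microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract"

class BioMedBertBiLstmClassifier(nn.Module):
    """Sketch of a BioMedBERT + BiLSTM + cross-attention classifier.

    Layer sizes and the way cross-attention fuses the two streams are
    assumptions for illustration, not the published configuration.
    """

    def __init__(self, num_classes: int = 5, lstm_hidden: int = 256):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(MODEL_NAME)
        hidden = self.encoder.config.hidden_size  # 768 for base models
        self.bilstm = nn.LSTM(
            hidden, lstm_hidden, batch_first=True, bidirectional=True
        )
        # Cross-attention: BERT token states attend over BiLSTM states.
        self.cross_attn = nn.MultiheadAttention(
            embed_dim=hidden, num_heads=8, batch_first=True,
            kdim=2 * lstm_hidden, vdim=2 * lstm_hidden,
        )
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, input_ids, attention_mask):
        token_states = self.encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state                          # (B, T, 768)
        lstm_states, _ = self.bilstm(token_states)   # (B, T, 512)
        attended, _ = self.cross_attn(
            query=token_states, key=lstm_states, value=lstm_states,
            key_padding_mask=~attention_mask.bool(),
        )                                            # (B, T, 768)
        # Mean-pool over valid tokens, then classify.
        mask = attention_mask.unsqueeze(-1)
        pooled = (attended * mask).sum(1) / mask.sum(1).clamp(min=1)
        return self.classifier(pooled)               # (B, num_classes)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = BioMedBertBiLstmClassifier()
batch = tokenizer(
    ["BRCA1 mutations are associated with breast neoplasms."],
    return_tensors="pt", padding=True, truncation=True,
)
logits = model(batch["input_ids"], batch["attention_mask"])

In this sketch the BiLSTM re-encodes the contextual token states sequentially, and the cross-attention layer lets the original BERT representations query the recurrent ones before pooling; whether the paper pools by mean, [CLS] token, or otherwise is not stated in the abstract.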

Original language: English
Pages (from-to): 28523-28529
Number of pages: 7
Journal: Engineering, Technology and Applied Science Research
Volume: 15
Issue number: 6
DOIs
Publication status: Published - 8 Dec 2025

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

Keywords

  • BioMedBERT
  • Natural Language Processing
  • domain-adaptive fine-tuning
  • machine learning
  • medical text classification
  • text classification models
