Abstract
Accurate classification of medical notes and texts is critical for improving biomedical information retrieval and decision-support systems. In this study, we propose a hybrid deep learning model that combines BioMedBERT with cross-attention and a BiLSTM, aimed at enhancing the classification of disease-related abstracts across five categories. The proposed model was evaluated on a dataset of 14k annotated samples drawn from the scientific medical literature. The proposed architecture achieves a macro F1-score of 63.82, outperforming traditional methods such as sentence-embedding models (SimCSE, SBERT), zero-shot entailment approaches, and BioBERT variants paired with MLP classifiers. The findings show that while the model effectively distinguishes categories such as neoplasms and cardiovascular diseases, abstracts with overlapping semantics, particularly general pathological conditions, remain challenging to classify. This research demonstrates the efficacy of combining domain-specific language models with sequence and attention mechanisms, offering a viable method for scalable and interpretable biomedical text classification.
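The abstract does not specify how the cross-attention between the BioMedBERT and BiLSTM components is wired. As a minimal sketch only, assuming the BiLSTM hidden states attend over BioMedBERT token embeddings via standard scaled dot-product attention, the mechanism can be written in NumPy; all dimensions and variable names below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values):
    """Scaled dot-product cross-attention (illustrative).

    queries     : (L_q, d)  e.g. BiLSTM hidden states
    keys_values : (L_kv, d) e.g. BioMedBERT token embeddings
    Returns the attended context (L_q, d) and the attention weights.
    """
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)   # (L_q, L_kv) similarity scores
    weights = softmax(scores, axis=-1)              # each row sums to 1
    context = weights @ keys_values                 # (L_q, d) weighted mixture
    return context, weights

# Toy example: 4 BiLSTM states attending over 6 token embeddings of width 8.
rng = np.random.default_rng(0)
q = rng.standard_normal((4, 8))
kv = rng.standard_normal((6, 8))
context, weights = cross_attention(q, kv)
```

In a full classifier, the attended context would typically be pooled and passed to a linear layer over the five disease categories; that wiring, again, is an assumption rather than a detail given in the abstract.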
| Original language | English |
|---|---|
| Pages (from-to) | 28523-28529 |
| Number of pages | 7 |
| Journal | Engineering, Technology and Applied Science Research |
| Volume | 15 |
| Issue number | 6 |
| DOIs | |
| Publication status | Published - 8 Dec 2025 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs):
- SDG 3: Good Health and Well-being
Keywords
- BioMedBERT
- Natural Language Processing
- domain-adaptive fine-tuning
- machine learning
- medical text classification
- text classification models
Fingerprint
Dive into the research topics of 'Domain-Adaptive Fine-Tuning of BioMedBERT for Medical Text Classification'. Together they form a unique fingerprint.