Abstract

Dengue infection is a dangerous infectious disease that threatens human health at every age and can be deadly. Class imbalance in dengue infection datasets can make the predicted results uninformative, because classifiers become biased toward the majority class at the expense of the minority classes. This study aims to improve classification accuracy by resolving the multi-class imbalance problem with a proposed new approach that assigns class weights to both minority and majority classes. In addition, the imbalanced datasets are resampled using random resampling and SMOTE. Eight classification algorithms (NN, KNN, Decision Tree, Random Forest, Naïve Bayes, AdaBoost, SVM, and Logistic Regression) were tested on the balanced datasets with 10-fold cross-validation and feature selection. The experimental results show that the proposed approach achieves higher accuracy than training on the original data. AdaBoost achieved the highest accuracy among the algorithms on the dengue infection cases, at 87.0%. We then tested the new method on a second case, hypothyroid disease, to demonstrate its effectiveness and efficiency in increasing accuracy. Thus, our new method can be applied more generally to classification problems on imbalanced datasets. On the hypothyroid cases, AdaBoost again gave the best results, with the highest accuracy of 99.7% and average AUC, F1, precision, and recall approaching 99.8%.
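The class-weighting and random-resampling ideas mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not give their weighting formula, so the inverse-frequency ("balanced") heuristic used here is an assumption, and the function names, class labels, and toy data are hypothetical.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Assumed heuristic: weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * cnt) for c, cnt in counts.items()}


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for c, cnt in counts.items():
        # Rows of class c in the original data, used as the sampling pool
        pool = [r for r, y in zip(rows, labels) if y == c]
        for _ in range(target - cnt):
            out_rows.append(rng.choice(pool))
            out_labels.append(c)
    return out_rows, out_labels


# Hypothetical three-class toy data, not the study's dataset
labels = ["majority"] * 6 + ["middle"] * 3 + ["minority"] * 1
rows = [[i] for i in range(len(labels))]

weights = balanced_class_weights(labels)
bal_rows, bal_labels = random_oversample(rows, labels)
```

Under this heuristic, rarer classes receive larger weights, and after oversampling every class has as many rows as the largest one. SMOTE differs in that it synthesizes new minority samples by interpolating between neighbours rather than duplicating existing rows.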

Original language: English
Pages (from-to): 176-192
Number of pages: 17
Journal: International Journal of Intelligent Engineering and Systems
Volume: 15
Issue number: 3
Publication status: Published - 2023

Keywords

  • Accuracy
  • Class weights
  • Classification
  • Feature selection
  • Multi-class imbalanced data
  • Random resampling
  • SMOTE

Title: A Multi-Class Classification of Dengue Infection Cases with Feature Selection in Imbalanced Clinical Diagnosis Data
