Abstract
Early detection of diabetes is essential to reducing its high mortality rate. It can be achieved by estimating the likelihood of diabetes from variables recorded in patient data. Diagnosing patients from medical data is challenging because such data are usually imbalanced, with negative cases severely outnumbering positive cases. To preprocess the imbalanced data, this paper designs an algorithm that combines resampling techniques with ensemble learning. The oversampling techniques considered are ADASYN, ROS, and SMOTE, and the undersampling techniques are RUS, Tomek, and ENN. The combined techniques SMOTE-ENN and SMOTE-Tomek are also used to handle the highly imbalanced diabetes dataset. The ensemble learning algorithms used are Random Forest, Bagging, AdaBoost, and XGBoost. In the experiments, the best performance is obtained with SMOTE-ENN and AdaBoost, which reaches a recall of 0.7330, although the F1 score of this model is 0.6459. The AdaBoost classifier also gives good and stable results across the various resampling techniques. With SMOTE-ENN, the recall of the model increased by 0.1819 and the F1 score decreased by 0.2000 relative to the original model. In medical diagnosis, higher sensitivity (recall) matters more than the F1 score, because it measures how many patients with the disease are correctly identified.
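The two simplest resampling ideas named in the abstract, ROS (random oversampling) and RUS (random undersampling), and the recall versus F1 trade-off can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names, the toy labels (90 negatives, 10 positives), and both prediction vectors are hypothetical and chosen only to show how recall can rise while F1 falls.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """ROS sketch: duplicate random rows of smaller classes up to the largest class size."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """RUS sketch: keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    X_out, y_out = [], []
    for label in counts:
        rows = [x for x, lab in zip(X, y) if lab == label]
        for x in rng.sample(rows, target):
            X_out.append(x)
            y_out.append(label)
    return X_out, y_out


def recall_and_f1(y_true, y_pred, positive=1):
    """Recall and F1 for the positive class, computed from raw counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return recall, f1


# Hypothetical imbalanced toy data: 90 negatives (0) and 10 positives (1).
X = [[i] for i in range(100)]
y = [0] * 90 + [1] * 10

X_ros, y_ros = random_oversample(X, y)
X_rus, y_rus = random_undersample(X, y)
print("ROS class counts:", Counter(y_ros))
print("RUS class counts:", Counter(y_rus))

# Hypothetical predictions: a cautious model versus one that flags positives freely.
conservative = [0] * 90 + [1] * 5 + [0] * 5            # 5 TP, 5 FN, 0 FP
aggressive = [1] * 12 + [0] * 78 + [1] * 8 + [0] * 2   # 8 TP, 2 FN, 12 FP

r_cons, f_cons = recall_and_f1(y, conservative)
r_aggr, f_aggr = recall_and_f1(y, aggressive)
print(f"conservative: recall={r_cons:.4f}, F1={f_cons:.4f}")
print(f"aggressive:   recall={r_aggr:.4f}, F1={f_aggr:.4f}")
```

In this toy case the second model finds more of the true positives at the cost of more false alarms, so its recall is higher while its F1 is lower. SMOTE, ADASYN, Tomek, ENN, and the combined variants generate or remove samples using nearest-neighbour information rather than at random and are not covered by this sketch.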
| Original language | English |
|---|---|
| Title of host publication | 2022 5th International Conference on Vocational Education and Electrical Engineering |
| Subtitle of host publication | The Future of Electrical Engineering, Informatics, and Educational Technology Through the Freedom of Study in the Post-Pandemic Era, ICVEE 2022 - Proceeding |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 1-5 |
| Number of pages | 5 |
| ISBN (Electronic) | 9781665475815 |
| Publication status | Published - 2022 |
| Event | 5th International Conference on Vocational Education and Electrical Engineering, ICVEE 2022 - Virtual, Surabaya, Indonesia. Duration: 10 Sept 2022 → 11 Sept 2022 |
Publication series
| Name | 2022 5th International Conference on Vocational Education and Electrical Engineering: The Future of Electrical Engineering, Informatics, and Educational Technology Through the Freedom of Study in the Post-Pandemic Era, ICVEE 2022 - Proceeding |
|---|---|
Conference
| Conference | 5th International Conference on Vocational Education and Electrical Engineering, ICVEE 2022 |
|---|---|
| Country/Territory | Indonesia |
| City | Virtual, Surabaya |
| Period | 10/09/22 → 11/09/22 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Keywords
- diabetes
- ensemble learning
- imbalanced dataset
- resampling
Fingerprint
Dive into the research topics of 'Performance Analysis of Resampling and Ensemble Learning Methods on Diabetes Detection as Imbalanced Dataset'. Together they form a unique fingerprint.