Synthesis Ensemble Oversampling and Ensemble Tree-Based Machine Learning for Class Imbalance Problem in Breast Cancer Diagnosis

N. Slamet Sudaryanto, Mauridhi Hery Purnomo, Diana Purwitasari, Eko Mulyanto Yuniarno

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Citations (Scopus)

Abstract

The Wisconsin Breast Cancer Database is a class-imbalanced dataset. A classifier trained on imbalanced classes tends to achieve accuracy that favors the majority class at the expense of the minority class. The ensemble oversampling methods used are SMOTE and Random Over Sampling (ROS), and the tree-based machine learning ensembles used are Random Forest, Adaptive Boosting (AdaBoost), and eXtreme Gradient Boosting (XGBoost). At the level 1 ensemble stage, the ensemble model with the best performance is selected as input for the level 2 ensemble process. The level 2 ensemble is a boosting ensemble, in which the best model chosen at level 1 serves as the base model for boosting with the XGBoost algorithm. The evaluation yielded a 10-fold cross-validation score of 0.981, accuracy of 0.987, recall of 0.980, and precision of 0.982. The proposed framework outperforms several recent classification studies in the breast cancer domain.
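The two oversampling ideas named in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation, and the abstract does not say which library or parameters were used. The function names `random_oversample` and `smote_like`, the neighbour count `k`, and the toy data are all assumptions made for illustration. The classifier stages (Random Forest, AdaBoost, XGBoost) are not shown.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """ROS: duplicate rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = [list(row) for row in X], list(y)
    for label, n in counts.items():
        rows = [row for row, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(list(rng.choice(rows)))  # exact copy of an existing row
            y_out.append(label)
    return X_out, y_out


def smote_like(X, y, minority, k=3, seed=0):
    """SMOTE-style: synthesize minority rows by interpolating towards a near neighbour.

    Assumes the minority class has at least two rows.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    rows = [row for row, lab in zip(X, y) if lab == minority]
    X_out, y_out = [list(row) for row in X], list(y)
    for _ in range(target - counts[minority]):
        base = rng.choice(rows)
        # k nearest minority neighbours by squared Euclidean distance
        neighbours = sorted(
            (r for r in rows if r is not base),
            key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        X_out.append([a + gap * (b - a) for a, b in zip(base, other)])
        y_out.append(minority)
    return X_out, y_out


# Hypothetical toy data (not from the paper): six majority rows, two minority rows.
X = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.1], [1.1, 2.2],
     [1.0, 1.8], [1.3, 2.0], [5.0, 6.0], [5.5, 6.5]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_ros, y_ros = random_oversample(X, y)
X_sm, y_sm = smote_like(X, y, minority=1)
print(Counter(y), Counter(y_ros), Counter(y_sm))
```

ROS only repeats existing minority rows, while the SMOTE-style step creates new points on the line segment between two minority rows. In practice, oversampling is usually applied to the training folds only, so that cross-validation scores are not inflated by copies of test samples.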

Original language: English
Title of host publication: Proceeding of the International Conference on Computer Engineering, Network and Intelligent Multimedia, CENIM 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 110-116
Number of pages: 7
ISBN (Electronic): 9781665476508
Publication status: Published - 2022
Event: 2022 International Conference on Computer Engineering, Network and Intelligent Multimedia, CENIM 2022 - Surabaya, Indonesia
Duration: 22 Nov 2022 to 23 Nov 2022

Publication series

Name: Proceeding of the International Conference on Computer Engineering, Network and Intelligent Multimedia, CENIM 2022

Conference

Conference: 2022 International Conference on Computer Engineering, Network and Intelligent Multimedia, CENIM 2022
Country/Territory: Indonesia
City: Surabaya
Period: 22/11/22 to 23/11/22

Keywords

  • AdaBoost
  • Ensemble
  • Imbalanced Class
  • ROS
  • Random Forest
  • SMOTE
  • XGBoost
