Binary Classification on Imbalanced Data: A Case Study for Birth Events in Indonesia

Ratih A. Ningrum*, Indah Fahmiyah, M. A. Syahputra, Aretha Levi, Neni Alya Firdausanti, Diana Nurlaily

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Classification of binary data with imbalanced classes remains an active research topic, especially for data-driven approaches, where the target class is often imbalanced. The study of class imbalance is therefore unavoidable. In this study, we classified birth events using data from the 2017 Indonesia Demographic and Health Survey (DHS). We implemented two machine learning classifiers, Logistic Regression (LR) and Support Vector Machine (SVM), to classify birth events for women in Indonesia. Several resampling techniques, including Undersampling, Oversampling, and Hybrid methods, were applied to rebalance the data distribution. The performance of each technique was evaluated using five metrics: Accuracy, Sensitivity, F1-Score, Area Under Curve, and Geometric mean. A significant discrepancy in the evaluation scores was found between the resampling methods when the LR and SVM classifiers were employed. Specifically, the scores were high for the balanced data obtained from Undersampling techniques, namely Nearmiss-1 for the LR classifier and NCL for the SVM classifier. The Accuracy, Sensitivity, F1-Score, Area Under Curve, and Geometric mean for Nearmiss-1 were 0.9859, 0.9720, 0.9858, 0.9860, and 0.9859, respectively. For NCL, the corresponding scores were 0.9829, 0.9767, 0.9882, 0.9884, and 0.9883, respectively. Overall, Undersampling techniques gave higher evaluation scores than Oversampling and Hybrid techniques for imbalanced classification on the Indonesia DHS 2017 data.
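The abstract names its evaluation metrics but does not spell out their formulas. As a minimal sketch, assuming the usual confusion-matrix definitions (in particular, Geometric mean taken as the square root of sensitivity times specificity, which the abstract does not confirm), the threshold-based metrics could be computed as below. The function name `binary_metrics` and the toy labels are hypothetical and are not taken from the paper; Area Under Curve is omitted because it needs predicted scores rather than hard labels.

```python
import math


def binary_metrics(y_true, y_pred, positive=1):
    """Compute confusion-matrix metrics for a binary classifier.

    Assumes standard definitions; G-mean is sqrt(sensitivity * specificity).
    Ratios with a zero denominator are reported as 0.0.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on positives
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    g_mean = math.sqrt(sensitivity * specificity)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "g_mean": g_mean,
    }


if __name__ == "__main__":
    # Toy imbalanced data (hypothetical): 90 negatives, 10 positives,
    # and a trivial classifier that always predicts the majority class.
    y_true = [0] * 90 + [1] * 10
    y_pred = [0] * 100
    for name, value in binary_metrics(y_true, y_pred).items():
        print(f"{name}: {value:.4f}")
```

On the toy data, the always-negative classifier reaches high accuracy while its sensitivity and Geometric mean are zero, which illustrates why imbalance-aware metrics are reported alongside Accuracy.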

Original language: English
Title of host publication: Proceedings of the International Conference on Advanced Technology and Multidiscipline, ICATAM 2021
Subtitle of host publication: "Advanced Technology and Multidisciplinary Prospective Towards Bright Future" Faculty of Advanced Technology and Multidiscipline
Editors: Prihartini Widiyanti, Prastika Krisma Jiwanti, Gunawan Setia Prihandana, Ratih Ardiati Ningrum, Rizki Putra Prastio, Herlambang Setiadi, Intan Nurul Rizki
Publisher: American Institute of Physics Inc.
ISBN (Electronic): 9780735444423
Publication status: Published - 19 May 2023
Event: 1st International Conference on Advanced Technology and Multidiscipline: Advanced Technology and Multidisciplinary Prospective Towards Bright Future, ICATAM 2021 - Virtual, Online
Duration: 13 Oct 2021 to 14 Oct 2021

Publication series

Name: AIP Conference Proceedings
Volume: 2536
ISSN (Print): 0094-243X
ISSN (Electronic): 1551-7616

Conference

Conference: 1st International Conference on Advanced Technology and Multidiscipline: Advanced Technology and Multidisciplinary Prospective Towards Bright Future, ICATAM 2021
City: Virtual, Online
Period: 13/10/21 to 14/10/21
