Data Augmentation Technique Using Two Step SMOTE for Electronic-nose Signal in Breath Ketone Level Detection

Dhiza Wahyu Firmansyah, Riyanarto Sarno*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Breath acetone concentrations have been found to correlate with blood ketone levels. Based on this evidence, predicting blood ketone levels from breath analysis with machine learning (ML) becomes possible. Nevertheless, a good ML model requires a large amount of training data, and under certain conditions, such as during the Covid-19 pandemic, collecting large amounts of data is difficult. To overcome this problem, we propose an augmentation technique that enlarges the training dataset using a two-step synthetic minority oversampling technique (SMOTE). The first step increases the amount of training data by combining it with synthetic data, while the second step balances the data at each ketone level. Because this study predicts ketone levels as numerical output values, whereas SMOTE is typically used in classification, the strategy for using SMOTE with regression is explained further. The proposed method was evaluated by feeding the data into several ML methods: deep neural network regression (DNN-R), linear regression (ML-R), RANSAC regression (RC-R), K-nearest neighbour regression (KNN-R), decision tree regression (DT-R), random forest regression (RF-R), AdaBoost regression (AB-R), gradient boosting regression (GB-R), and XGBoost regression (XGB-R). Based on the test results, compared with training without the proposed method, accuracy increased for DNN-R, ML-R, RC-R, KNN-R, DT-R, RF-R, AB-R, GB-R, and XGB-R by 0.958%, 9.51%, 35.74%, 18.133%, 8.236%, 11.348%, 9.47%, 5.093%, and 11.264%, respectively.
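
The two-step idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact procedure, so the sketch assumes that each distinct ketone level is treated as a class label for oversampling, that step one grows every level by a fixed factor, and that step two tops each level up to the size of the largest one. The function names `smote_samples` and `two_step_smote` and the parameters `growth`, `k`, and `seed` are hypothetical, and the interpolation is a plain standard-library version of the core SMOTE step rather than a call into any particular library.

```python
import random
from collections import defaultdict


def smote_samples(points, n_new, k, rng):
    """Create n_new synthetic points by interpolating between a randomly
    chosen point and one of its k nearest neighbours (the core SMOTE step)."""
    if len(points) < 2 or n_new <= 0:
        return []  # interpolation needs at least two points in a level

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(points)
        # Rank the other points of the same level by distance to `base`.
        others = sorted((p for p in points if p is not base),
                        key=lambda p: sq_dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def two_step_smote(X, y, growth=2, k=3, seed=0):
    """Hypothetical two-step augmentation for a regression target that
    takes a small set of distinct values (here: ketone levels)."""
    rng = random.Random(seed)

    # Assumption: each distinct target value acts as a class label.
    by_level = defaultdict(list)
    for features, level in zip(X, y):
        by_level[level].append(list(features))

    # Step 1: enlarge every level by combining real and synthetic samples.
    for pts in by_level.values():
        extra = (growth - 1) * len(pts)
        pts.extend(smote_samples(list(pts), extra, k, rng))

    # Step 2: balance the levels by topping each one up to the largest size.
    target = max(len(pts) for pts in by_level.values())
    for pts in by_level.values():
        pts.extend(smote_samples(list(pts), target - len(pts), k, rng))

    # Synthetic samples keep the numeric level they were generated for,
    # so the result can still be used to train a regressor.
    X_aug = [p for pts in by_level.values() for p in pts]
    y_aug = [level for level, pts in by_level.items() for _ in pts]
    return X_aug, y_aug
```

Under these assumptions, calling `two_step_smote(X, y)` on feature vectors `X` (for example, features extracted from the electronic-nose signal) and ketone levels `y` returns an enlarged, level-balanced dataset whose targets remain numeric, which is what allows a classification-oriented technique such as SMOTE to feed a regression model. A level with a single sample cannot be interpolated and is left unchanged in this sketch.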

Original language: English
Pages (from-to): 523-536
Number of pages: 14
Journal: International Journal of Intelligent Engineering and Systems
Issue number: 4
Publication status: Published - 2023


Keywords
  • Breath ketone level
  • Data augmentation
  • Electronic-nose
  • Gas sensor
  • Machine learning

