Software Defect Prediction Using a Combination of Oversampling and Undersampling Methods

Aizul Faiz Iswafaza, Siti Rochimah

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

Software testing improves software quality, but the resources it requires grow with the number of features developed; software defect prediction (SDP) was introduced to address this. Various machine learning methods have been used to build SDP models. However, SDP suffers from several problems, namely data redundancy, class imbalance, and feature redundancy. This study proposes a combined oversampling and under-sampling (COU) model to address data redundancy and class imbalance. The oversampling method is RSMOTE and the under-sampling method is ENN (edited nearest neighbors). Applying the combined model yields a new dataset that is more balanced and contains less ambiguous, noisy, and duplicated data. Deep learning is then applied to the resulting data as the prediction model, and performance is evaluated using the F-measure. The results indicate that the COU model improves the quality of SDP. Compared with the average value obtained by the RSMOTE model, the COU model increases the F-measure by 11%, reaching an average value of 0.876.
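The resampling idea in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: plain SMOTE-style interpolation is swapped in for RSMOTE (whose noise handling is not reproduced here), ENN is applied to both classes, the deep learning predictor is omitted, and the toy data and all function names are hypothetical.

```python
import math
import random
from collections import Counter


def k_nearest(point, data, k, skip=None):
    """Indices of the k rows in data closest to point (Euclidean), excluding index skip."""
    order = sorted(
        (i for i in range(len(data)) if i != skip),
        key=lambda i: math.dist(point, data[i]),
    )
    return order[:k]


def smote_like_oversample(X, y, minority, k=3, seed=0):
    """Plain SMOTE-style interpolation (an assumption, NOT RSMOTE) for binary labels.

    Adds synthetic minority rows until both classes have the same size.
    """
    rng = random.Random(seed)
    minority_rows = [x for x, label in zip(X, y) if label == minority]
    n_needed = (len(y) - len(minority_rows)) - len(minority_rows)
    X_new, y_new = list(X), list(y)
    for _ in range(max(0, n_needed)):
        j = rng.randrange(len(minority_rows))
        base = minority_rows[j]
        other = minority_rows[rng.choice(k_nearest(base, minority_rows, k, skip=j))]
        gap = rng.random()  # position on the segment between base and its neighbour
        X_new.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
        y_new.append(minority)
    return X_new, y_new


def edited_nearest_neighbours(X, y, k=3):
    """Drop every row whose label disagrees with the majority vote of its k nearest neighbours."""
    keep = []
    for i, (x, label) in enumerate(zip(X, y)):
        votes = Counter(y[j] for j in k_nearest(x, X, k, skip=i))
        if votes.most_common(1)[0][0] == label:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


def f_measure(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive (defective) class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: label 1 = defective (minority), label 0 = clean (majority).
# The last row is labelled clean but sits inside the defective cluster (a noisy row).
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.2, 0.3), (0.1, 0.1),
     (2.0, 2.0), (2.1, 2.2), (2.2, 2.1),
     (2.1, 2.0)]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]

X_bal, y_bal = smote_like_oversample(X, y, minority=1)   # step 1: balance the classes
X_clean, y_clean = edited_nearest_neighbours(X_bal, y_bal)  # step 2: remove ambiguous rows

print("original: ", Counter(y))
print("balanced: ", Counter(y_bal))
print("cleaned:  ", Counter(y_clean))
```

In this toy run the oversampling step equalises the two classes and the ENN step then discards the mislabelled row, which mirrors the "more balanced and cleaner" dataset the abstract describes. In practice a maintained library would normally be preferred over hand-written resampling; `f_measure` is included only to show how the reported metric is computed from a model's predictions.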

Original language: English
Title of host publication: Proceeding - 6th International Conference on Information Technology, Information Systems and Electrical Engineering
Subtitle of host publication: Applying Data Sciences and Artificial Intelligence Technologies for Environmental Sustainability, ICITISEE 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 127-132
Number of pages: 6
ISBN (Electronic): 9798350399615
DOIs
Publication status: Published - 2022
Event: 6th International Conference on Information Technology, Information Systems and Electrical Engineering, ICITISEE 2022 - Virtual, Online, Indonesia
Duration: 13 Dec 2022 to 14 Dec 2022

Publication series

Name: Proceeding - 6th International Conference on Information Technology, Information Systems and Electrical Engineering: Applying Data Sciences and Artificial Intelligence Technologies for Environmental Sustainability, ICITISEE 2022

Conference

Conference: 6th International Conference on Information Technology, Information Systems and Electrical Engineering, ICITISEE 2022
Country/Territory: Indonesia
City: Virtual, Online
Period: 13/12/22 to 14/12/22

Keywords

  • AEEEM
  • RSMOTE
  • combined oversampling and under-sampling
  • edited nearest neighbors
  • software defect prediction
