Abstract
Course pass rates depend on the assistance provided to students. For that assistance to be effective, students at risk of failing a course must be identified as early as possible. A list of at-risk students can be generated by predicting academic performance from historical data. However, the number of students failing (7%) is far lower than the number succeeding (93%), and this class imbalance hampers prediction performance. A widely adopted remedy for class imbalance is oversampling with synthetic samples. Many oversampling techniques neglect discrete features, whereas the existing technique for discrete features treats all features uniformly and does not select which samples serve as the basis for generating synthetic data. This limitation can introduce noise and borderline samples. This study therefore introduces a novel oversampling technique for discrete features called GLoW SMOTE-D. The technique accelerates the improvement of minority sample learning by performing multiple selections and multiple weightings, which effectively reduces noise. Experimental results show that it significantly improves the performance of the course-failure prediction model compared with various other techniques across a range of performance measures and classifiers.
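To make the idea of synthetic oversampling on discrete features concrete, the following is a minimal sketch of a generic baseline: each synthetic row is built by a per-feature majority vote over a randomly chosen minority row and its nearest minority neighbours. It is not GLoW SMOTE-D and does not reproduce its selection or weighting steps. The function names, the Hamming distance, the value of `k`, and the toy student records are all assumptions made for illustration; only the 7%/93% split comes from the abstract.

```python
import random
from collections import Counter


def hamming(a, b):
    """Count the positions where two discrete feature tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def oversample_discrete(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority rows (illustrative baseline only).

    Each synthetic row takes, for every feature, the most common value
    among a randomly chosen base row and its k nearest minority
    neighbours under Hamming distance. All features are treated
    uniformly and no samples are filtered out beforehand.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        idx = rng.randrange(len(minority))
        base = minority[idx]
        others = [row for j, row in enumerate(minority) if j != idx]
        neighbours = sorted(others, key=lambda row: hamming(base, row))[:k]
        group = [base] + neighbours
        # zip(*group) iterates over features; Counter picks the modal value.
        new_row = tuple(Counter(col).most_common(1)[0][0] for col in zip(*group))
        synthetic.append(new_row)
    return synthetic


# Hypothetical toy data: 7 failing students (minority) out of 100,
# each described by three made-up categorical features.
minority = [
    ("low", "late", "fail"),
    ("low", "missing", "fail"),
    ("medium", "late", "fail"),
    ("low", "late", "pass"),
    ("low", "missing", "fail"),
    ("medium", "missing", "fail"),
    ("low", "late", "fail"),
]
n_majority = 93

# Generate enough synthetic failing-student rows to match the majority class.
synthetic = oversample_discrete(minority, n_new=n_majority - len(minority))
print(len(minority) + len(synthetic), "minority rows after oversampling")
```

Because this baseline weights every feature equally and uses every minority row as a possible base, it has the same uniform-treatment and no-selection properties that the abstract describes as a source of noise and borderline samples.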
Original language | English |
---|---|
Pages (from-to) | 8889-8901 |
Number of pages | 13 |
Journal | IEEE Access |
Volume | 12 |
DOIs | |
Publication status | Published - 2024 |
Keywords
- Discrete
- imbalanced dataset
- oversampling
- students' failure