TY - GEN
T1 - Reducing the Error Mapping of the Students’ Performance Using Feature Selection
AU - Yamasari, Yuni
AU - Rochmawati, Naim
AU - Qoiriah, Anita
AU - Suyatno, Dwi F.
AU - Ahmad, Tohari
N1 - Publisher Copyright:
© 2021, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2021
Y1 - 2021
N2 - In an educational environment, classifying the cognitive aspect of students is critical, because lecturers need an accurate classification to make the right decisions for improving the educational environment. To the best of our knowledge, no previous research has focused on this classification process. In this paper, we propose discretization and feature selection methods to be applied before classification. For this purpose, we adopt equal-frequency discretization, whose result is evaluated using logistic regression with two regularizations: lasso and ridge. The experimental results show that four intervals with ridge regularization achieve the highest accuracy. This serves as the basis for determining the level of student performance: excellent, good, fair, or poor. Next, we remove unnecessary features using the Gain Ratio and the Gini Index. We then build classifiers to evaluate our proposed methods using k-Nearest Neighbors (k-NN), Neural Network (NN), and CN2 Rule Induction. The experimental results indicate that both discretization and feature selection can enhance the performance of the classification process. In terms of accuracy, k-NN, NN, and CN2 Rule Induction improve by about 35%, 2.14%, and 3.8% on average, respectively, compared with classification on the original features.
AB - In an educational environment, classifying the cognitive aspect of students is critical, because lecturers need an accurate classification to make the right decisions for improving the educational environment. To the best of our knowledge, no previous research has focused on this classification process. In this paper, we propose discretization and feature selection methods to be applied before classification. For this purpose, we adopt equal-frequency discretization, whose result is evaluated using logistic regression with two regularizations: lasso and ridge. The experimental results show that four intervals with ridge regularization achieve the highest accuracy. This serves as the basis for determining the level of student performance: excellent, good, fair, or poor. Next, we remove unnecessary features using the Gain Ratio and the Gini Index. We then build classifiers to evaluate our proposed methods using k-Nearest Neighbors (k-NN), Neural Network (NN), and CN2 Rule Induction. The experimental results indicate that both discretization and feature selection can enhance the performance of the classification process. In terms of accuracy, k-NN, NN, and CN2 Rule Induction improve by about 35%, 2.14%, and 3.8% on average, respectively, compared with classification on the original features.
KW - Classification
KW - Data mining
KW - Feature selection
KW - Performance
KW - Student
UR - http://www.scopus.com/inward/record.url?scp=85105881922&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-73689-7_18
DO - 10.1007/978-3-030-73689-7_18
M3 - Conference contribution
AN - SCOPUS:85105881922
SN - 9783030736880
T3 - Advances in Intelligent Systems and Computing
SP - 176
EP - 185
BT - Proceedings of the 12th International Conference on Soft Computing and Pattern Recognition, SoCPaR 2020
A2 - Abraham, Ajith
A2 - Ohsawa, Yukio
A2 - Gandhi, Niketa
A2 - Jabbar, M. A.
A2 - Haqiq, Abdelkrim
A2 - McLoone, Seán
A2 - Issac, Biju
PB - Springer Science and Business Media Deutschland GmbH
T2 - 12th International Conference on Soft Computing and Pattern Recognition, SoCPaR 2020 and 16th International Conference on Information Assurance and Security, IAS 2020
Y2 - 15 December 2020 through 18 December 2020
ER -