TY - JOUR
T1 - Identifying dominant characteristics of students' cognitive domain on clustering-based classification
AU - Yamasari, Yuni
AU - Nugroho, Supeno M.S.
AU - Yoshimoto, Kayo
AU - Takahashi, Hideya
AU - Purnomo, Mauridhi H.
N1 - Publisher Copyright: © 2020 Intelligent Network and Systems Society.
PY - 2020/2
Y1 - 2020/2
N2 - The rapid growth of information and communications technology-based educational tools generates a large volume of student data with many features (characteristics). However, the mining process in the clustering task on student data is often not done optimally, so the performance of the system decreases. To overcome this problem, we propose a discretization method on logistic regression to determine the optimal number of clusters. Additionally, we introduce a feature selection technique that combines filter- and wrapper-based procedures (HFS) to identify the dominant features of the students' cognitive domain. Furthermore, we evaluate the identification result with three clustering methods: K-means, EM, and Farthest First. Finally, we propose clustering-based classification so that the results can be measured using classification metrics. Here, we apply two evaluation techniques: cross-validation and percentage split. The experimental results indicate that our approach outperforms conventional methods in terms of classification metrics. Its average accuracy is around 10.847-11.134 percent higher than that obtained with the original features under both evaluation techniques. This approach also significantly reduces the time taken to create a prototype, by 0.0167-0.027 seconds. As a consequence, the resulting model significantly reduces the number of students placed in unsuitable classes based on the cognitive domain, namely 3-12 students.
KW - Classification
KW - Clustering
KW - Cognitive domain
KW - Features selection
KW - Student
UR - http://www.scopus.com/inward/record.url?scp=85080863405&partnerID=8YFLogxK
U2 - 10.22266/ijies2020.0229.16
DO - 10.22266/ijies2020.0229.16
M3 - Article
AN - SCOPUS:85080863405
SN - 2185-310X
VL - 13
SP - 167
EP - 180
JO - International Journal of Intelligent Engineering and Systems
JF - International Journal of Intelligent Engineering and Systems
IS - 1
ER -