The rapid growth of information and communications technology-based educational tools generates a large volume of student data with many features (characteristics). However, the clustering of student data is often not mined optimally, which degrades system performance. To overcome this problem, we propose a discretization method on logistic regression to determine the optimal number of clusters. Additionally, we introduce a hybrid feature selection technique (HFS) that combines filter-based and wrapper-based procedures to identify the dominant features of the students' cognitive domain. We then evaluate the identified features with three clustering methods: K-means, EM, and Farthest First. Finally, we propose clustering-based classification so that the results can be measured with classification metrics, using two evaluation techniques: cross-validation and percentage split. The experimental results indicate that our approach outperforms conventional methods in terms of classification metrics. Its average accuracy is around 10.847-11.134 percent higher than that of the original features under both evaluation techniques. The approach also significantly reduces the time taken to create a prototype (0.0167-0.027 seconds). In turn, the resulting model significantly reduces the number of students placed in unsuitable classes based on the cognitive domain, namely 3-12 students.
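The idea of clustering-based classification described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy scores, the choice of K-means with two clusters, the 1-nearest-neighbour classifier, and the 66% split ratio are all assumptions made for illustration. The cluster ids are treated as class labels, a classifier is trained on part of the data, and accuracy is measured on the held-out part (percentage split).

```python
import random

# Hypothetical (quiz, exam) scores for 12 students; not data from the paper.
scores = [(40, 42), (45, 48), (38, 50), (50, 44), (42, 39), (47, 52),
          (82, 85), (78, 90), (88, 80), (91, 87), (75, 83), (85, 92)]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, candidates):
    """Index of the candidate closest to point."""
    return min(range(len(candidates)), key=lambda i: sq_dist(point, candidates[i]))

def kmeans(points, k, iterations=10):
    """Plain Lloyd's K-means; returns one cluster id per point."""
    # Deterministic start (an assumption): k evenly spaced points as centroids.
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iterations):
        labels = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return [nearest(p, centroids) for p in points]

# Step 1: cluster, then treat each cluster id as a class label.
labels = kmeans(scores, k=2)

# Step 2: percentage split (66% train, the rest test) on shuffled indices.
order = list(range(len(scores)))
random.Random(0).shuffle(order)
cut = int(len(order) * 0.66)
train, test = order[:cut], order[cut:]

# Step 3: 1-nearest-neighbour classifier trained on the cluster labels.
train_points = [scores[i] for i in train]
predicted = [labels[train[nearest(scores[i], train_points)]] for i in test]

# Accuracy: share of held-out students assigned to their own cluster.
accuracy = sum(p == labels[i] for p, i in zip(predicted, test)) / len(test)
print(f"clusters: {labels}")
print(f"hold-out accuracy: {accuracy:.3f}")
```

Cross-validation would follow the same pattern, repeating steps 2 and 3 over several train/test partitions and averaging the accuracies.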

Original language: English
Pages (from-to): 167-180
Number of pages: 14
Journal: International Journal of Intelligent Engineering and Systems
Issue number: 1
Publication status: Published - Feb 2020


  • Classification
  • Clustering
  • Cognitive domain
  • Features selection
  • Student


Dive into the research topics of 'Identifying dominant characteristics of students' cognitive domain on clustering-based classification'. Together they form a unique fingerprint.
