Comparison of Data Mining Techniques on Stroke Clinical Dataset

Viko Pradana Prasetyo, Muhammad Fajrul Alam Ulin Nuha, Makhi Hakim Hakiki, Retno Aulia Vinarti*, Arif Djunaidy

*Corresponding author for this work

Research output: Contribution to journal › Conference article › peer-review


Stroke is a significant cause of mortality and morbidity worldwide, making it essential to identify individuals at risk of experiencing a stroke. This study develops a predictive model that estimates an individual's stroke risk from their medical history, and compares the effect of preprocessing techniques on the model's performance. The methodology involves two streams of analysis, one with and one without data preprocessing, each using three classification models to predict stroke risk: K-Nearest Neighbor (KNN), Decision Tree, and Support Vector Machine (SVM). The results indicate that data preprocessing improves the performance of all models. KNN and SVM show high precision and recall values, making them effective models for predicting strokes. The decision tree model also performs well with data preprocessing, despite slightly lower accuracy and recall values. These findings suggest that preprocessing is a crucial stage in machine learning and can enhance the performance of classification models in predicting stroke risk.
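The two-stream comparison described in the abstract can be sketched in a few lines. This is a minimal illustration and not the authors' pipeline: the abstract does not say which preprocessing steps were applied, so min-max scaling is assumed here. The data is synthetic and deliberately built so that one large-range feature is noise, and only KNN is shown, written in plain Python to keep the sketch dependency-free. All function names are hypothetical.

```python
import math
import random


def make_synthetic(n=300, seed=0):
    """Hypothetical stand-in data, NOT the stroke dataset used in the paper."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = 1 if rng.random() < 0.3 else 0
        noise = rng.uniform(0.0, 1000.0)                # large range, no signal
        signal = rng.gauss(2.0 if label else 0.0, 0.3)  # small range, carries the label
        rows.append(([noise, signal], label))
    return rows


def fit_min_max(X):
    """Learn per-feature (min, max) bounds from the training rows only."""
    return [(min(col), max(col)) for col in zip(*X)]


def apply_min_max(X, bounds):
    """Rescale each feature with the learned bounds (assumed preprocessing step)."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, (lo, hi) in zip(row, bounds)]
        for row in X
    ]


def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def precision_recall(y_true, y_pred):
    """Precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def run_streams(k=5, seed=0):
    """Evaluate the same KNN classifier with and without preprocessing."""
    data = make_synthetic(seed=seed)
    split = int(len(data) * 0.75)
    train, test = data[:split], data[split:]
    train_X, train_y = [x for x, _ in train], [y for _, y in train]
    test_X, test_y = [x for x, _ in test], [y for _, y in test]

    # Stream 1: raw features.
    raw_pred = [knn_predict(train_X, train_y, x, k) for x in test_X]

    # Stream 2: features rescaled with bounds fitted on the training split.
    bounds = fit_min_max(train_X)
    scaled_train = apply_min_max(train_X, bounds)
    scaled_test = apply_min_max(test_X, bounds)
    scaled_pred = [knn_predict(scaled_train, train_y, x, k) for x in scaled_test]

    return {
        "without_preprocessing": precision_recall(test_y, raw_pred),
        "with_preprocessing": precision_recall(test_y, scaled_pred),
    }


if __name__ == "__main__":
    for stream, (precision, recall) in run_streams().items():
        print(f"{stream}: precision={precision:.2f} recall={recall:.2f}")
```

Because the synthetic data is constructed to be sensitive to feature scale, any gap between the two streams here says nothing about the size of the effect reported in the paper. It only shows how such a comparison can be set up, with the scaling bounds fitted on the training split so that no test information leaks into preprocessing.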

Original language: English
Pages (from-to): 502-511
Number of pages: 10
Journal: Procedia Computer Science
Publication status: Published - 2024
Event: 7th Information Systems International Conference, ISICO 2023 - Washington, United States
Duration: 26 Jul 2023 - 28 Jul 2023


Keywords

  • Classification
  • Data Mining Techniques
  • Decision Tree
  • Health
  • KNN
  • SVM
  • Stroke

