Abstract
The objective of this study is to cluster and classify data using a combination of the k-means and C4.5 methods. The process involves clustering followed by classification. The classification step uses 10-fold cross-validation (k-folds = 10) with stratified sampling. The study evaluates illiteracy (analphabetism) in Indonesia among people aged 15 years and over (15+). The data are the percentages of illiterate people between 2017 and 2019. The dataset was obtained from https://www.bps.go.id and is accessible at https://osf.io/crwug. The Davies-Bouldin index (DBI) was used to determine the number of clusters; the optimal DBI value, 0.121, was obtained at k = 2. The resulting cluster map of Indonesian provinces shows a low cluster (C0 = 22 provinces) and a high cluster (C1 = 11 provinces) for illiteracy at k = 2. The clustering results were then classified, yielding an accuracy of 97.50%, a recall of 90.91%, a precision of 100.00%, and an AUC (optimistic) of 0.95 (excellent classification).
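To illustrate how a Davies-Bouldin index can be used to choose the number of clusters, the following is a minimal sketch in plain Python, not the implementation used in the study. The province values are hypothetical percentages invented for illustration (they are not the BPS data), the function names `kmeans` and `davies_bouldin` are assumptions, and the evenly spaced centroid initialization is a simplification chosen to keep the run deterministic. The classification step (C4.5 with stratified 10-fold cross-validation) is not covered here.

```python
import math

# Hypothetical illiteracy percentages (2017, 2018, 2019) for eight provinces.
# These numbers are invented for illustration; they are NOT the study's data.
points = [
    (1.0, 0.9, 0.8), (1.4, 1.3, 1.1), (2.0, 1.8, 1.7), (2.6, 2.4, 2.2),
    (3.1, 2.9, 2.7), (20.0, 19.5, 19.0), (21.5, 21.0, 20.4), (23.0, 22.4, 21.9),
]


def kmeans(points, k, max_iter=100):
    """Lloyd's k-means for k >= 2 with a deterministic, evenly spaced init."""
    ordered = sorted(points)
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assign every point to its nearest centroid (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        # Move each centroid to the mean of its members (keep it if empty).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


def davies_bouldin(points, labels, centroids):
    """Davies-Bouldin index: lower values mean tighter, better separated clusters."""
    k = len(centroids)
    scatter = []  # mean distance of each cluster's members to its centroid
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if members:
            scatter.append(sum(math.dist(p, centroids[c]) for p in members) / len(members))
        else:
            scatter.append(0.0)  # an empty cluster contributes no scatter
    total = 0.0
    for i in range(k):
        # Worst-case similarity ratio between cluster i and any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in range(k) if j != i
        )
    return total / k


# Score several candidate k values and keep the one with the lowest DBI.
scores = {}
for k in range(2, 5):
    labels, centroids = kmeans(points, k)
    scores[k] = davies_bouldin(points, labels, centroids)

best_k = min(scores, key=scores.get)
print({k: round(v, 3) for k, v in scores.items()}, "best k:", best_k)
```

On this toy data the lowest DBI occurs at k = 2, which separates the five low-percentage provinces from the three high-percentage ones. With real data the candidate range of k and the initialization strategy would need to be chosen with more care.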
| Original language | English |
|---|---|
| Pages (from-to) | 1-8 |
| Number of pages | 8 |
| Journal | Library Philosophy and Practice |
| Volume | 2021 |
| Publication status | Published - 2021 |
Keywords
- C4.5 Algorithm
- Child Illiteracy
- Classification
- K-Means
- Data Mining