TY - JOUR
T1 - Application of the Cluster Classification Data Mining Method to Child Illiteracy in Indonesia
AU - Arifin, Muhammad
AU - Bhawika, Gita Widi
AU - Muazar Habibi, M. A.
AU - Firdaus, Winci
AU - Agustinova, Danu Eko
AU - Rahim, Robbi
N1 - Publisher Copyright: © 2021
PY - 2021
Y1 - 2021
N2 - The objective of this study is to cluster and classify data using a combination of the k-means and C4.5 methods. The process involves clustering followed by classification. The classification step uses 10 folds (k-folds = 10) with stratified sampling. The study evaluates analphabets (illiterate people) in Indonesia aged 15 years and over (15+). The data are the percentages of analphabets between 2017 and 2019. The dataset was obtained from https://www.bps.go.id and is accessible at https://osf.io/crwug. The Davies-Bouldin index (DBI) was used to determine the number of clusters; the optimal DBI value, 0.121, was obtained at k = 2. The resulting cluster map of Indonesian provinces shows a low cluster (C0 = 22 provinces) and a high cluster (C1 = 11 provinces) of analphabets for k = 2. The clustering results were then classified, yielding an accuracy of 97.50%, a recall of 90.91%, a precision of 100.00%, and an AUC (optimistic) of 0.95 (excellent classification).
AB - The objective of this study is to cluster and classify data using a combination of the k-means and C4.5 methods. The process involves clustering followed by classification. The classification step uses 10 folds (k-folds = 10) with stratified sampling. The study evaluates analphabets (illiterate people) in Indonesia aged 15 years and over (15+). The data are the percentages of analphabets between 2017 and 2019. The dataset was obtained from https://www.bps.go.id and is accessible at https://osf.io/crwug. The Davies-Bouldin index (DBI) was used to determine the number of clusters; the optimal DBI value, 0.121, was obtained at k = 2. The resulting cluster map of Indonesian provinces shows a low cluster (C0 = 22 provinces) and a high cluster (C1 = 11 provinces) of analphabets for k = 2. The clustering results were then classified, yielding an accuracy of 97.50%, a recall of 90.91%, a precision of 100.00%, and an AUC (optimistic) of 0.95 (excellent classification).
KW - C4.5 Algorithm
KW - Child Illiteracy
KW - Classification
KW - K-Means
KW - Data Mining
UR - http://www.scopus.com/inward/record.url?scp=85103699045&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:85103699045
SN - 1522-0222
VL - 2021
SP - 1
EP - 8
JO - Library Philosophy and Practice
JF - Library Philosophy and Practice
ER -