TY - GEN
T1 - Clustering on Multidimensional Poverty Data using PAM and K-prototypes Algorithm
T2 - 2019 International Seminar on Intelligent Technology and Its Application, ISITIA 2019
AU - Wijayanto, Aris
AU - Suprapto, Yoyon K.
AU - Wulandari, D. P.
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/8
Y1 - 2019/8
AB - Poverty remains a serious concern for the Indonesian government. Through the concept of Multidimensional Poverty, experts try to understand poverty with a more comprehensive approach. Using data from the 2017 National Socio-Economic Survey (SUSENAS) and the Alkire-Foster method, this study measures poverty in terms of the various deprivations experienced by residents of Jambi Province. The study uses three dimensions: health, education, and living standard. It investigates PAM (partitioning around medoids) and K-prototypes and compares their effectiveness in clustering mixed data types, using poverty data from published governmental data. It also examines the scalability of the PAM and K-prototypes algorithms against the number of clusters for a given number of observations. Performance is evaluated by comparing the silhouette coefficient (SC) of each clustering method. In this study, clustering with K-prototypes is 59% better than PAM in terms of the SC value. The scalability test shows that the K-prototypes algorithm is faster than the PAM algorithm. Considering the SC value, we can conclude that the clusters formed are reasonable. The one-way ANOVA and Kruskal-Wallis test results show statistically significant differences between the formed clusters for 13 of the 17 variables used.
KW - Clustering Mixed Data Types
KW - K-prototypes
KW - Multidimensional Poverty
KW - PAM
UR - http://www.scopus.com/inward/record.url?scp=85078483268&partnerID=8YFLogxK
U2 - 10.1109/ISITIA.2019.8937130
DO - 10.1109/ISITIA.2019.8937130
M3 - Conference contribution
AN - SCOPUS:85078483268
T3 - Proceedings - 2019 International Seminar on Intelligent Technology and Its Application, ISITIA 2019
SP - 210
EP - 215
BT - Proceedings - 2019 International Seminar on Intelligent Technology and Its Application, ISITIA 2019
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 28 August 2019 through 29 August 2019
ER -