TY - GEN
T1 - Improving classification performance of public complaints with TF-IGM weighting
T2 - 2017 International Conference on Sustainable Information Engineering and Technology, SIET 2017
AU - Mahfud, Fakhris Khusnu Reza
AU - Tjahyanto, Aris
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/7/2
Y1 - 2017/7/2
AB - Media Center e-Wadul currently relies on manual labeling when complaints are submitted. As a result, Media Center administrators take a long time to coordinate with the regional work units (SKPD) that must respond to registered complaints. Classifying complaints by SKPD is therefore needed to speed up the complaint submission process. Text classification is challenging because the large number of features produces a high-dimensional representation. A further challenge in this research is that some features appear in almost all, or even all, classes and therefore do not characterize any single class. The proposed term weighting scheme is Term Frequency-Inverse Gravity Moment (TF-IGM), which can precisely measure the class-distinguishing power of a term, especially for the multiclass problem in this study. The well-known Term Frequency-Inverse Document Frequency (TF-IDF) and TF-Binary weighting methods are used for comparison. Classification is performed with the Support Vector Machine (SVM), Naive Bayes, and K-Nearest Neighbor (KNN) algorithms. Incoming public complaints pass through a preprocessing stage, a term weighting stage, and a classification stage. TF-IGM weighting with SVM yielded the best performance of all combinations, with accuracy, precision, recall, and F-measure of 80.11%, 80.70%, 80.10%, and 80.20%, respectively.
KW - Classification
KW - E-Government
KW - Media Center E-Wadul
KW - TF-IGM
UR - http://www.scopus.com/inward/record.url?scp=85049378363&partnerID=8YFLogxK
U2 - 10.1109/SIET.2017.8304138
DO - 10.1109/SIET.2017.8304138
M3 - Conference contribution
AN - SCOPUS:85049378363
T3 - Proceedings - 2017 International Conference on Sustainable Information Engineering and Technology, SIET 2017
SP - 220
EP - 225
BT - Proceedings - 2017 International Conference on Sustainable Information Engineering and Technology, SIET 2017
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 24 November 2017 through 25 November 2017
ER -