TY - GEN
T1 - How SVM can compensate logit based response label with various characteristics in predictor? A simulation study
AU - Riyadi, Mohammad Alfan Alfian
AU - Prastyo, Dedy Dwi
AU - Purnami, Santi Wulan
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/11/13
Y1 - 2018/11/13
N2 - In general, supervised machine learning methods for classification fall into two categories: parametric and nonparametric. Parametric methods are limited by the assumptions that must be satisfied. One way to handle this problem is to use nonparametric approaches. A state-of-the-art classification method is the support vector machine (SVM). However, the computational burden of kernel SVM limits its application to large-scale datasets, which demand long computation times. One way to cope with this limitation is an ensemble approach that splits the data and applies a learning procedure to each subset. In this work, the Clustered Support Vector Machine (CSVM) is chosen. So far, studies of CSVM have been limited to theory and direct application to real datasets. Direct application to real datasets has a weakness: it never reveals in detail how various characteristics of the predictors affect the learning process. A simulation study is therefore needed to explore how complex the data, particularly the predictors, can be while still being handled by SVM and CSVM. Ten scenarios are conducted in this simulation study. In each scenario, the response label is generated using a logistic regression model with different characteristics set in the predictors. Even though the true response label is generated from a logit model, the results show that SVM and CSVM can compete with the performance of logistic regression in some scenarios. These results show that SVM is a powerful classification method regardless of how the response label is generated.
KW - CSVM
KW - Logistic Regression
KW - SVM
KW - Simulation Study
UR - http://www.scopus.com/inward/record.url?scp=85058375423&partnerID=8YFLogxK
U2 - 10.1109/ICITEED.2018.8534762
DO - 10.1109/ICITEED.2018.8534762
M3 - Conference contribution
AN - SCOPUS:85058375423
T3 - Proceedings of 2018 10th International Conference on Information Technology and Electrical Engineering: Smart Technology for Better Society, ICITEE 2018
SP - 615
EP - 620
BT - Proceedings of 2018 10th International Conference on Information Technology and Electrical Engineering
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 10th International Conference on Information Technology and Electrical Engineering, ICITEE 2018
Y2 - 24 July 2018 through 26 July 2018
ER -