TY - JOUR
T1 - The use of entropy based fuzzy membership on weighted logistic regression for the unbalanced data
AU - Harumeka, Ajiwasesa
AU - Purnami, Santi Wulan
AU - Rahayu, Santi Puteri
N1 - Publisher Copyright: © Published under licence by IOP Publishing Ltd.
PY - 2021/11/10
Y1 - 2021/11/10
AB - Logistic regression is a popular and powerful classification method. Adding ridge regularization and optimizing with a combination of linear conjugate gradients and iteratively re-weighted least squares (IRLS) yields Truncated Regularized Iteratively Re-weighted Least Squares (TR-IRLS), which can outperform the Support Vector Machine (SVM) in processing speed, especially on large data, while achieving competitive accuracy. However, neither SVM nor TR-IRLS performs well enough on unbalanced data. The Fuzzy Support Vector Machine (FSVM) is an extension of SVM for unbalanced data that assigns a fuzzy membership to each observation. The fuzzy membership gives observations in the minority class greater importance than those in the majority class. Meanwhile, TR-IRLS was developed into Rare Event Weighted Logistic Regression (RE-WLR) by adding weighting and bias correction to logistic regression. The weighting in RE-WLR depends on the undersampling scheme, which allows for "information loss". FSVM and RE-WLR share one property: their weights are based only on class differences (minority or majority). The Entropy-Based Fuzzy Support Vector Machine (EFSVM) addresses this weakness of FSVM by considering the class certainty of each observation. As a result, EFSVM improves SVM performance on unbalanced data and even outperforms FSVM. For this reason, we propose applying entropy-based fuzzy membership (EF) to the TR-IRLS algorithm to classify large, unbalanced data. The method is called Entropy-Based Fuzzy Weighted Logistic Regression (EF-WLR). This research presents a review of EF-WLR for unbalanced data classification.
UR - http://www.scopus.com/inward/record.url?scp=85121229904&partnerID=8YFLogxK
U2 - 10.1088/1755-1315/880/1/012048
DO - 10.1088/1755-1315/880/1/012048
M3 - Conference article
AN - SCOPUS:85121229904
SN - 1755-1307
VL - 880
JO - IOP Conference Series: Earth and Environmental Science
JF - IOP Conference Series: Earth and Environmental Science
IS - 1
M1 - 012048
T2 - 4th International Conference on Science and Technology Applications in Climate Change, STACLIM 2021
Y2 - 1 July 2021 through 2 July 2021
ER -