TY - JOUR
T1 - Improved performance of fake account classifiers with percentage overlap features selection
AU - Tjahyanto, Aris
AU - Pratama, Rivanda Putra
AU - Shiddiqi, Ary Mazharuddin
N1 - Publisher Copyright: © 2024, Institute of Advanced Engineering and Science. All rights reserved.
PY - 2024/6
Y1 - 2024/6
N2 - Feature selection plays a crucial role in the development of high-performance classification models. We propose a method for detecting fake accounts that uses the percentage overlap technique to refine feature selection. We build our technique upon earlier work showing that dataset normalization improves the performance of the Naïve Bayesian classifier. Our study employs a dataset of account profiles sourced from Twitter, which we normalize using the Min-Max method. We evaluate the method through a series of experiments involving several classification algorithms, including Naïve Bayes, decision tree, k-nearest neighbors (KNN), deep learning, and support vector machines (SVM). In our experiments, the SVM and deep learning classifiers achieve 100% accuracy. We attribute these results to the percentage overlap technique, which identifies four highly informative features. The resulting models outperform models with more extensive feature sets, underscoring the efficacy of our approach.
AB - Feature selection plays a crucial role in the development of high-performance classification models. We propose a method for detecting fake accounts that uses the percentage overlap technique to refine feature selection. We build our technique upon earlier work showing that dataset normalization improves the performance of the Naïve Bayesian classifier. Our study employs a dataset of account profiles sourced from Twitter, which we normalize using the Min-Max method. We evaluate the method through a series of experiments involving several classification algorithms, including Naïve Bayes, decision tree, k-nearest neighbors (KNN), deep learning, and support vector machines (SVM). In our experiments, the SVM and deep learning classifiers achieve 100% accuracy. We attribute these results to the percentage overlap technique, which identifies four highly informative features. The resulting models outperform models with more extensive feature sets, underscoring the efficacy of our approach.
KW - Bots
KW - Fake accounts
KW - Fake users
KW - Feature selection
KW - Internet
UR - http://www.scopus.com/inward/record.url?scp=85192761004&partnerID=8YFLogxK
U2 - 10.11591/ijai.v13.i2.pp1585-1595
DO - 10.11591/ijai.v13.i2.pp1585-1595
M3 - Article
AN - SCOPUS:85192761004
SN - 2089-4872
VL - 13
SP - 1585
EP - 1595
JO - IAES International Journal of Artificial Intelligence
JF - IAES International Journal of Artificial Intelligence
IS - 2
ER -