TY - GEN
T1 - Improving the accuracy of predicting disease risk scores using SOM clustering based on noisy feature
AU - Rahayu, Endang Sri
AU - Yuniarno, Eko Mulyanto
AU - Purnama, I. Ketut Edhy
AU - Purnomo, Mauridhi Hery
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021/9/29
Y1 - 2021/9/29
N2 - In providing health services, health workers need information about a patient's disease risk to ensure that the services offered match the patient's needs. A classification algorithm can train a system to predict this disease risk information. This study shows that clustering with a Self-Organizing Map (SOM) neural network can increase the accuracy of predicting disease risk scores when the dataset contains noisy features. Clustering carried out at the preprocessing stage grouped the disease features from 841 categories into 247 categories. The SOM clustering design consists of a 15 x 20 matrix of neurons trained for 100 epochs with root mean square performance, resulting in an accuracy rate of 93.7% in 47 seconds. In the training phase, 38,659 public records from Kaggle were used, divided into seven age groups. In each age group, the system classifies disease risk scores into 11 risk score classes. The results of SOM clustering are used as predictors in the prediction system through experiments with five classification algorithms. Of these, the Fine Tree algorithm shows the largest increase in accuracy for the entire dataset, from 99.1% to 99.8%.
AB - In providing health services, health workers need information about a patient's disease risk to ensure that the services offered match the patient's needs. A classification algorithm can train a system to predict this disease risk information. This study shows that clustering with a Self-Organizing Map (SOM) neural network can increase the accuracy of predicting disease risk scores when the dataset contains noisy features. Clustering carried out at the preprocessing stage grouped the disease features from 841 categories into 247 categories. The SOM clustering design consists of a 15 x 20 matrix of neurons trained for 100 epochs with root mean square performance, resulting in an accuracy rate of 93.7% in 47 seconds. In the training phase, 38,659 public records from Kaggle were used, divided into seven age groups. In each age group, the system classifies disease risk scores into 11 risk score classes. The results of SOM clustering are used as predictors in the prediction system through experiments with five classification algorithms. Of these, the Fine Tree algorithm shows the largest increase in accuracy for the entire dataset, from 99.1% to 99.8%.
KW - accuracy
KW - SOM clustering
KW - disease risk score
KW - noisy feature
KW - prediction
UR - http://www.scopus.com/inward/record.url?scp=85119979141&partnerID=8YFLogxK
U2 - 10.1109/IES53407.2021.9593293
DO - 10.1109/IES53407.2021.9593293
M3 - Conference contribution
AN - SCOPUS:85119979141
T3 - International Electronics Symposium 2021: Wireless Technologies and Intelligent Systems for Better Human Lives, IES 2021 - Proceedings
SP - 30
EP - 35
BT - International Electronics Symposium 2021
A2 - Yunanto, Andhik Ampuh
A2 - Kusuma N, Artiarini
A2 - Hermawan, Hendhi
A2 - Putra, Putu Agus Mahadi
A2 - Gamar, Farida
A2 - Ridwan, Mohamad
A2 - Prayogi, Yanuar Risah
A2 - Ruswiansari, Maretha
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 23rd International Electronics Symposium, IES 2021
Y2 - 29 September 2021 through 30 September 2021
ER -