TY - GEN
T1 - Development of Pre-Processing for Chronic Kidney Disease Prediction Using K-Nearest Neighbors Imputer and Chi-Square
AU - Mardianto, Ricky
AU - Saikhu, Ahmad
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Chronic Kidney Disease (CKD) is a condition in which the function and/or structure of the kidneys is severely damaged, leaving them unable to filter blood as they should. The disease develops slowly and is difficult to recover from. In the early stages of CKD, symptoms often do not manifest clearly, and patients may be unaware of the disease. Among its primary hazards are complications and mortality. Machine learning is increasingly used to identify illnesses such as CKD, and machine learning algorithms can assist in identifying and predicting early-stage CKD. Early detection of CKD makes it possible to provide appropriate medical treatment and medication and to prevent risks from other diseases. Recent research indicates that accurately detecting CKD remains challenging because the data frequently contain invalid entries and numerous missing values. Consequently, optimal handling of missing values and the use of feature selection are expected to enhance predictive quality in early CKD detection. This study handled missing values with a K-Nearest Neighbors (KNN) imputer and selected features with the Chi-square test on the Chronic Kidney Disease dataset from Kaggle.com. The machine learning techniques used were Extra Tree Classifier, Random Forest, and XGBoost; the deep learning techniques used were TabNet and TabTransformer. The experimental results showed that the Extra Tree Classifier achieved the best accuracy, 99.25%. Thus, handling missing values with the KNN imputer and selecting features with the Chi-square test is an effective approach for detecting early-stage CKD.
AB - Chronic Kidney Disease (CKD) is a condition in which the function and/or structure of the kidneys is severely damaged, leaving them unable to filter blood as they should. The disease develops slowly and is difficult to recover from. In the early stages of CKD, symptoms often do not manifest clearly, and patients may be unaware of the disease. Among its primary hazards are complications and mortality. Machine learning is increasingly used to identify illnesses such as CKD, and machine learning algorithms can assist in identifying and predicting early-stage CKD. Early detection of CKD makes it possible to provide appropriate medical treatment and medication and to prevent risks from other diseases. Recent research indicates that accurately detecting CKD remains challenging because the data frequently contain invalid entries and numerous missing values. Consequently, optimal handling of missing values and the use of feature selection are expected to enhance predictive quality in early CKD detection. This study handled missing values with a K-Nearest Neighbors (KNN) imputer and selected features with the Chi-square test on the Chronic Kidney Disease dataset from Kaggle.com. The machine learning techniques used were Extra Tree Classifier, Random Forest, and XGBoost; the deep learning techniques used were TabNet and TabTransformer. The experimental results showed that the Extra Tree Classifier achieved the best accuracy, 99.25%. Thus, handling missing values with the KNN imputer and selecting features with the Chi-square test is an effective approach for detecting early-stage CKD.
KW - KNN-Imputer
KW - chi-square
KW - chronic kidney disease
KW - deep learning
KW - machine learning
KW - missing-value
UR - http://www.scopus.com/inward/record.url?scp=85210508659&partnerID=8YFLogxK
U2 - 10.1109/ICITISEE63424.2024.10730259
DO - 10.1109/ICITISEE63424.2024.10730259
M3 - Conference contribution
AN - SCOPUS:85210508659
T3 - 2024 8th International Conference on Information Technology, Information Systems and Electrical Engineering, ICITISEE 2024
SP - 179
EP - 184
BT - 2024 8th International Conference on Information Technology, Information Systems and Electrical Engineering, ICITISEE 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 8th International Conference on Information Technology, Information Systems and Electrical Engineering, ICITISEE 2024
Y2 - 29 August 2024 through 30 August 2024
ER -