TY - GEN
T1 - Patient Diagnosis Classification based on Electronic Medical Record using Text Mining and Support Vector Machine
AU - Jamaluddin, M.
AU - Wibawa, Adhi Dharma
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021/9/18
Y1 - 2021/9/18
N2 - Electronic Medical Records (EMRs) are an important element of information technology in the healthcare sector. An EMR is an electronic record containing health-related information on patients that can be created and managed by authorized physicians and staff in a healthcare service organization. The EMR serves as a framework for determining diagnosis and treatment. EMRs contain unstructured free text, which makes it difficult to extract the hidden information for use in a decision support system. This study classifies patient diagnoses from Indonesian EMRs for a clinical decision support system (CDSS), using Term Frequency-Inverse Document Frequency (TF-IDF) for feature extraction and a Support Vector Machine (SVM) as the classifier. SVM is a powerful algorithm for high-dimensional data such as text. The diagnoses classified in this paper are tuberculosis, cancer, diabetes mellitus, hypertension, and chronic kidney disease, which have high prevalence rates in Indonesia. The models are built by varying the kernel function and by applying or omitting stopword removal. The results show that TF-IDF with SVM can effectively predict diagnoses when stopword removal is applied. Classification performance increased with stopword removal on all SVM kernels, with accuracies of 89.91% for the linear kernel, 90.58% for the polynomial kernel, 90.75% for the RBF kernel, and 91.03% for the sigmoid kernel.
AB - Electronic Medical Records (EMRs) are an important element of information technology in the healthcare sector. An EMR is an electronic record containing health-related information on patients that can be created and managed by authorized physicians and staff in a healthcare service organization. The EMR serves as a framework for determining diagnosis and treatment. EMRs contain unstructured free text, which makes it difficult to extract the hidden information for use in a decision support system. This study classifies patient diagnoses from Indonesian EMRs for a clinical decision support system (CDSS), using Term Frequency-Inverse Document Frequency (TF-IDF) for feature extraction and a Support Vector Machine (SVM) as the classifier. SVM is a powerful algorithm for high-dimensional data such as text. The diagnoses classified in this paper are tuberculosis, cancer, diabetes mellitus, hypertension, and chronic kidney disease, which have high prevalence rates in Indonesia. The models are built by varying the kernel function and by applying or omitting stopword removal. The results show that TF-IDF with SVM can effectively predict diagnoses when stopword removal is applied. Classification performance increased with stopword removal on all SVM kernels, with accuracies of 89.91% for the linear kernel, 90.58% for the polynomial kernel, 90.75% for the RBF kernel, and 91.03% for the sigmoid kernel.
KW - Electronic Medical Record
KW - Support Vector Machine
KW - Text Mining
UR - http://www.scopus.com/inward/record.url?scp=85118922515&partnerID=8YFLogxK
U2 - 10.1109/iSemantic52711.2021.9573178
DO - 10.1109/iSemantic52711.2021.9573178
M3 - Conference contribution
AN - SCOPUS:85118922515
T3 - Proceedings - 2021 International Seminar on Application for Technology of Information and Communication: IT Opportunities and Creativities for Digital Innovation and Communication within Global Pandemic, iSemantic 2021
SP - 243
EP - 248
BT - Proceedings - 2021 International Seminar on Application for Technology of Information and Communication
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 International Seminar on Application for Technology of Information and Communication, iSemantic 2021
Y2 - 18 September 2021 through 19 September 2021
ER -