Advances in information technology and smartphones have led to the production of large amounts of data. Every second, millions of new data items are created as text, audio, images, and video. This environment has driven demand for big data analytics. One kind of big data produced daily is the history of healthcare services in hospitals. Important new information can be extracted from this large dataset, especially concerning patient symptoms, drug usage, and reports of new diseases. In this study, text processing techniques are applied to patient medical record text from a public hospital, covering 2017 to 2019, with a focus on patient symptoms and disease classification. The Naïve Bayes classifier and Random Forest algorithms are used to classify diseases in the medical record data, with 19 diseases in the preprocessed data. A modified list of Indonesian stop words was used to filter the symptom sentences. The results indicate that the Random Forest classification algorithm achieves the highest accuracy, around 99.9%, outperforming the Naïve Bayes classification algorithm. This experiment shows that the proposed method provides a robust system with good accuracy for classifying medical record data covering many diseases.
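
The pipeline described in the abstract (stop-word filtering of symptom sentences followed by classification) can be illustrated with a minimal sketch. It is not the authors' implementation. It covers only the Naïve Bayes side, written with the Python standard library, and it does not attempt the Random Forest classifier. The stop-word list, the toy symptom records, and the disease labels are all invented for illustration; the study's modified Indonesian stop-word list and hospital data are not reproduced here.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stop-word list; the study's modified Indonesian list is not given.
STOP_WORDS = {"dan", "yang", "di", "sejak", "dengan", "pasien"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # word frequencies per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for w in tokenize(text):
                if w in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy records for illustration only (not data from the study).
train_texts = [
    "demam tinggi dan nyeri sendi",
    "demam dengan bintik merah",
    "batuk berdahak dan sesak napas",
    "batuk kering sejak seminggu",
]
train_labels = ["dengue", "dengue", "bronchitis", "bronchitis"]

model = NaiveBayes().fit(train_texts, train_labels)
prediction = model.predict("pasien demam dan nyeri sendi")
print(prediction)
```

With real records, the same two steps would apply at larger scale: filter each symptom sentence against the stop-word list, then train and evaluate the classifier on held-out records.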

Original language: English
Title of host publication: International Electronics Symposium 2021
Subtitle of host publication: Wireless Technologies and Intelligent Systems for Better Human Lives, IES 2021 - Proceedings
Editors: Andhik Ampuh Yunanto, Artiarini Kusuma N, Hendhi Hermawan, Putu Agus Mahadi Putra, Farida Gamar, Mohamad Ridwan, Yanuar Risah Prayogi, Maretha Ruswiansari
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 5
ISBN (Electronic): 9781665443463
Publication status: Published - 29 Sept 2021
Event: 23rd International Electronics Symposium, IES 2021 - Surabaya, Indonesia
Duration: 29 Sept 2021 - 30 Sept 2021

Publication series

Name: International Electronics Symposium 2021: Wireless Technologies and Intelligent Systems for Better Human Lives, IES 2021 - Proceedings

Conference: 23rd International Electronics Symposium, IES 2021


  • disease
  • healthcare
  • naïve Bayes classification
  • random forest
  • text mining


Title: Text Mining in Healthcare for Disease Classification using Machine Learning Algorithm
