TY - GEN
T1 - Text Classification based on Sentence Features and Preprocessing Settings for Labeling eHealth Consultation Answer
AU - Rahmawati, Yunianita
AU - Siahaan, Daniel
AU - Purwitasari, Diana
AU - Wijayanti, Rahma Auri
AU - Bidan, Praktik Mandiri
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - This study identifies key aspects of doctor-patient communication by analyzing doctor responses from eHealth Consultation Answer data on the Online Healthcare Consultations (OHC) site, AloDokter. The analysis is performed through topic segmentation to help users better understand the doctors' explanations. Previously, researchers attempted topic segmentation on the same dataset using unsupervised learning and clustering-based text segmentation methods, but the results were suboptimal. To address this issue, we propose segmenting the text using text classification techniques, specifically by leveraging sentence features and refined, customized preprocessing. Sentence features are enhanced by combining word vectors with sentence embeddings. The preprocessing improvements involve optimizing the tokenization process and refining stopword handling, focusing on terms encountered in doctor-patient communication. This research aims to compare several text classification models based on sentence features and preprocessing configurations and to evaluate their performance. Additionally, we present the performance of text classification methods without sentence features or customized preprocessing for comparison. Our experiments demonstrate that the Multi-Layer Perceptron (MLP) models with sentence features achieve an average F1-score of 91%, outperforming other text classification models.
AB - This study identifies key aspects of doctor-patient communication by analyzing doctor responses from eHealth Consultation Answer data on the Online Healthcare Consultations (OHC) site, AloDokter. The analysis is performed through topic segmentation to help users better understand the doctors' explanations. Previously, researchers attempted topic segmentation on the same dataset using unsupervised learning and clustering-based text segmentation methods, but the results were suboptimal. To address this issue, we propose segmenting the text using text classification techniques, specifically by leveraging sentence features and refined, customized preprocessing. Sentence features are enhanced by combining word vectors with sentence embeddings. The preprocessing improvements involve optimizing the tokenization process and refining stopword handling, focusing on terms encountered in doctor-patient communication. This research aims to compare several text classification models based on sentence features and preprocessing configurations and to evaluate their performance. Additionally, we present the performance of text classification methods without sentence features or customized preprocessing for comparison. Our experiments demonstrate that the Multi-Layer Perceptron (MLP) models with sentence features achieve an average F1-score of 91%, outperforming other text classification models.
KW - Multi-Layer Perceptron
KW - Sentence Features
KW - Text Classification
KW - Text Segmentation
KW - eHealth Consultation Answer
UR - https://www.scopus.com/pages/publications/105004416169
U2 - 10.1109/ISRITI64779.2024.10963536
DO - 10.1109/ISRITI64779.2024.10963536
M3 - Conference contribution
AN - SCOPUS:105004416169
T3 - 7th International Seminar on Research of Information Technology and Intelligent Systems: Advanced Intelligent Systems in Contemporary Society, ISRITI 2024 - Proceedings
SP - 1065
EP - 1070
BT - 7th International Seminar on Research of Information Technology and Intelligent Systems
A2 - Wibowo, Ferry Wahyu
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 7th International Seminar on Research of Information Technology and Intelligent Systems, ISRITI 2024
Y2 - 11 December 2024
ER -