TY - GEN
T1 - Text Segmentation Methods for Annotation on eHealth Consultation with Interview Function Labels
T2 - 8th IEEE International Conference on Software Engineering and Computer Systems, ICSECS 2023
AU - Rahmawati, Yunianita
AU - Siahaan, Daniel
AU - Purwitasari, Diana
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - There have been several existing text segmentation methods. Nevertheless, no study has provided the experimental result on the performance of those methods in segmenting text of health consultation data based on sentence context and provided automatic annotation on the segmented results in the form of interview function labels. This study compares four methods with different text segmentation approaches and analyzes their performances based on their reliability concerning human expert judgment. The methods are Content Vector Segmentation (CVS), GraphSeg, K-Means, and Latent Dirichlet Allocation (LDA). This study used a greedy similarity approach to perform automatic annotation by selecting candidate labels based on the maximum value. The annotation results were evaluated using Gwet's AC1 method to assess the integrity among evaluators in clinical research. The evaluation results indicate that the CVS method outperforms other methods and has a substantial level of agreement (0.67). It is also relatively stable, with a standard deviation of 0.12.
AB - There have been several existing text segmentation methods. Nevertheless, no study has provided the experimental result on the performance of those methods in segmenting text of health consultation data based on sentence context and provided automatic annotation on the segmented results in the form of interview function labels. This study compares four methods with different text segmentation approaches and analyzes their performances based on their reliability concerning human expert judgment. The methods are Content Vector Segmentation (CVS), GraphSeg, K-Means, and Latent Dirichlet Allocation (LDA). This study used a greedy similarity approach to perform automatic annotation by selecting candidate labels based on the maximum value. The annotation results were evaluated using Gwet's AC1 method to assess the integrity among evaluators in clinical research. The evaluation results indicate that the CVS method outperforms other methods and has a substantial level of agreement (0.67). It is also relatively stable, with a standard deviation of 0.12.
KW - Data Annotation
KW - Text Segmentation
KW - eHealth Consultation
UR - http://www.scopus.com/inward/record.url?scp=85175446119&partnerID=8YFLogxK
U2 - 10.1109/ICSECS58457.2023.10256340
DO - 10.1109/ICSECS58457.2023.10256340
M3 - Conference contribution
AN - SCOPUS:85175446119
T3 - 8th International Conference on Software Engineering and Computer Systems, ICSECS 2023
SP - 72
EP - 77
BT - 8th International Conference on Software Engineering and Computer Systems, ICSECS 2023
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 25 August 2023 through 27 August 2023
ER -