TY - GEN
T1 - Identification Semantic Text of Indonesian Medical Terms from Question-Answer Data
AU - Purwitasari, Diana
AU - Juanita, Safitri
AU - Eddy Purnama, I. Ketut
AU - Raihan, Muhammad
AU - Purnomo, Mauridhi Hery
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
AB - Searching for health-related information online is becoming more difficult due to the proliferation of phrases with multiple meanings. This study examines the semantics of medical terms in a collection of doctors' answer texts, which requires finding an appropriate model for recognizing pairs of comparable Indonesian medical phrases or synonymous terminology. Our contribution is an automatic semantic text detection method that uses a word embedding approach to identify pairs of similar Indonesian medical terms. From the collection of doctors' texts, we selected 108 pairs of annotated Indonesian medical terms (Biomedical Named Entity Recognition (Bio-NER)), comprising 60 pairs of similar words and 48 pairs of dissimilar words. The dataset was processed with two word embedding approaches, FastText and BioWordVec. BioWordVec was applied in two ways: with a translation process (BioWordVec-2) and without one (BioWordVec). We compared the performance of FastText, BioWordVec, and BioWordVec-2 in terms of accuracy, specificity, and sensitivity. The results show that the BioWordVec-2 model performs better than the other models in identifying similar pairs.
KW - Indonesian Medical Terms
KW - Question-Answer Data
KW - Semantic Text
KW - Word Embedding
UR - http://www.scopus.com/inward/record.url?scp=85150469224&partnerID=8YFLogxK
U2 - 10.1109/ICITISEE57756.2022.10057601
DO - 10.1109/ICITISEE57756.2022.10057601
M3 - Conference contribution
AN - SCOPUS:85150469224
T3 - Proceeding - 6th International Conference on Information Technology, Information Systems and Electrical Engineering: Applying Data Sciences and Artificial Intelligence Technologies for Environmental Sustainability, ICITISEE 2022
SP - 565
EP - 569
BT - Proceeding - 6th International Conference on Information Technology, Information Systems and Electrical Engineering
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 6th International Conference on Information Technology, Information Systems and Electrical Engineering, ICITISEE 2022
Y2 - 13 December 2022 through 14 December 2022
ER -