TY - GEN
T1 - Semantic Role Labeling for Information Extraction on Indonesian Texts
T2 - 24th International Seminar on Intelligent Technology and Its Applications, ISITIA 2023
AU - Ariyanto, Amelia Devi Putri
AU - Fatichah, Chastine
AU - Purwitasari, Diana
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - The information extraction process includes Semantic Role Labeling (SRL) as one of its sub-tasks. SRL aims to determine the semantic role of each entity within a sentence by examining the meaning of the predicate. This helps construct the sentence structure by identifying the relationships between predicates and their corresponding arguments. SRL development is less common than Named Entity Recognition (NER) for information extraction because the SRL annotation process is complicated and labeling results are sometimes ambiguous. In event extraction, the use of NER alone is insufficient. Location entities identified by NER are still inaccurate because their geographic coordinates can indicate locations irrelevant to the actual events. SRL, on the other hand, can detect locations precisely and in depth according to the actual events. Even though the annotation process is complicated, SRL can be adjusted to the required domain and its ontology so that it can extract location entities down to the event level. This research aims to offer a comprehensive analysis of the advancement of SRL for extracting information from Indonesian texts. Indonesian is a low-resource language whose characteristics differ from those of English and for which very little literature exists, so it is interesting to study. The papers used for the review process came from IEEE, ScienceDirect, and Google Scholar from 2013 to 2023, and 15 papers were found that matched the research objectives. The study results show that most papers use Indonesian-language news articles as their dataset because news articles use formal language, which usually has a good language structure. The methods used for SRL are mostly rule-based. A weakness of rule-based development is that the rules depend heavily on a particular language or problem domain. Thus, further work can use a transformer-based deep learning approach to perform SRL on Indonesian-language texts.
AB - The information extraction process includes Semantic Role Labeling (SRL) as one of its sub-tasks. SRL aims to determine the semantic role of each entity within a sentence by examining the meaning of the predicate. This helps construct the sentence structure by identifying the relationships between predicates and their corresponding arguments. SRL development is less common than Named Entity Recognition (NER) for information extraction because the SRL annotation process is complicated and labeling results are sometimes ambiguous. In event extraction, the use of NER alone is insufficient. Location entities identified by NER are still inaccurate because their geographic coordinates can indicate locations irrelevant to the actual events. SRL, on the other hand, can detect locations precisely and in depth according to the actual events. Even though the annotation process is complicated, SRL can be adjusted to the required domain and its ontology so that it can extract location entities down to the event level. This research aims to offer a comprehensive analysis of the advancement of SRL for extracting information from Indonesian texts. Indonesian is a low-resource language whose characteristics differ from those of English and for which very little literature exists, so it is interesting to study. The papers used for the review process came from IEEE, ScienceDirect, and Google Scholar from 2013 to 2023, and 15 papers were found that matched the research objectives. The study results show that most papers use Indonesian-language news articles as their dataset because news articles use formal language, which usually has a good language structure. The methods used for SRL are mostly rule-based. A weakness of rule-based development is that the rules depend heavily on a particular language or problem domain. Thus, further work can use a transformer-based deep learning approach to perform SRL on Indonesian-language texts.
KW - Indonesian Text
KW - Information Extraction
KW - Literature Review
KW - Named Entity Recognition
KW - Semantic Role Labeling
UR - http://www.scopus.com/inward/record.url?scp=85171173907&partnerID=8YFLogxK
U2 - 10.1109/ISITIA59021.2023.10221008
DO - 10.1109/ISITIA59021.2023.10221008
M3 - Conference contribution
AN - SCOPUS:85171173907
T3 - 2023 International Seminar on Intelligent Technology and Its Applications: Leveraging Intelligent Systems to Achieve Sustainable Development Goals, ISITIA 2023 - Proceeding
SP - 119
EP - 124
BT - 2023 International Seminar on Intelligent Technology and Its Applications
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 26 July 2023 through 27 July 2023
ER -