TY - GEN
T1 - Syntaxis-based extraction method with type and function of word detection approach for machine translation of Indonesian-Tolaki and English sentences
AU - Yamin, Muh
AU - Sarno, Riyanarto
AU - Abdullah, Rachmad
AU - Untung,
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Building an Indonesian Machine Translation (MT) system requires not only syntactic analysis related to the correct spelling of words but also contextual analysis, consisting of the type and function of words, morphology, and semantics. Dictionaries are needed to translate Indonesian basic words and to capture good word translations through the semantics and context of words in a sentence or document. This study aims to extract Indonesian and Tolaki words for building a good MT system by comparing the development of Indonesian MT that focuses on deep cases of morphology and syntax. We developed morphtool to capture the morphological elements of Indonesian and Tolaki words. To handle deep syntactic cases, we build a rule that captures the function and type of a word, which can affect the translation of that word in the sentence. We combine supervised and unsupervised techniques for text extraction at the word, sentence, and document levels, following the morphonemic rules of Indonesian-Tolaki syntax. We then use a hybrid MT, combining Statistical MT (SMT) and Rule-Based MT (RBMT), for the sentence translation process. In the evaluation of the hybrid MT, the Indonesian-Tolaki to English translation performance test shows an accuracy of 0.74, while the English to Indonesian-Tolaki translation performance test shows an accuracy of 0.71. These results indicate that the proposed MT method can perform better than the SMT and RBMT methods, with an average accuracy of around 70%.
AB - Building an Indonesian Machine Translation (MT) system requires not only syntactic analysis related to the correct spelling of words but also contextual analysis, consisting of the type and function of words, morphology, and semantics. Dictionaries are needed to translate Indonesian basic words and to capture good word translations through the semantics and context of words in a sentence or document. This study aims to extract Indonesian and Tolaki words for building a good MT system by comparing the development of Indonesian MT that focuses on deep cases of morphology and syntax. We developed morphtool to capture the morphological elements of Indonesian and Tolaki words. To handle deep syntactic cases, we build a rule that captures the function and type of a word, which can affect the translation of that word in the sentence. We combine supervised and unsupervised techniques for text extraction at the word, sentence, and document levels, following the morphonemic rules of Indonesian-Tolaki syntax. We then use a hybrid MT, combining Statistical MT (SMT) and Rule-Based MT (RBMT), for the sentence translation process. In the evaluation of the hybrid MT, the Indonesian-Tolaki to English translation performance test shows an accuracy of 0.74, while the English to Indonesian-Tolaki translation performance test shows an accuracy of 0.71. These results indicate that the proposed MT method can perform better than the SMT and RBMT methods, with an average accuracy of around 70%.
KW - RBMT
KW - SMT
KW - hybrid MT
KW - machine translation
UR - http://www.scopus.com/inward/record.url?scp=85145440686&partnerID=8YFLogxK
U2 - 10.1109/ICITRI56423.2022.9970225
DO - 10.1109/ICITRI56423.2022.9970225
M3 - Conference contribution
AN - SCOPUS:85145440686
T3 - 2022 International Conference on Information Technology Research and Innovation, ICITRI 2022
SP - 101
EP - 106
BT - 2022 International Conference on Information Technology Research and Innovation, ICITRI 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 International Conference on Information Technology Research and Innovation, ICITRI 2022
Y2 - 10 November 2022
ER -