TY - GEN
T1 - A Deep Learning Approach for Word Segmentation in Javanese Letter Manuscript Transliteration
AU - Nevin, Muhammad
AU - Putra, I. Kadek Agus Ariesta
AU - Ansori, Dwinanda Bagoes
AU - Sarno, Riyanarto
AU - Haryono, Agus
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Traditionally written using the Javanese Letter or 'Aksara Jawa' script, the Javanese language encompasses a rich corpus of manuscripts that record diverse subjects such as history, culture, and traditional practices. With over 121,668 Javanese manuscript titles identified, only a fraction has been transliterated into alphabetical writing, underscoring the urgent need for efficient language preservation methods. This study evaluates deep learning models for word segmentation in Javanese manuscript transliteration. Results from experiments conducted on an unseen dataset reveal that the Bidirectional Long Short-Term Memory (BiLSTM) model consistently outperforms the other architectures across all metrics. With an accuracy of 98.62% and superior F1-score, precision, and recall, the BiLSTM model demonstrates robustness in capturing the intricate linguistic patterns and textual structures inherent in Javanese manuscripts.
AB - Traditionally written using the Javanese Letter or 'Aksara Jawa' script, the Javanese language encompasses a rich corpus of manuscripts that record diverse subjects such as history, culture, and traditional practices. With over 121,668 Javanese manuscript titles identified, only a fraction has been transliterated into alphabetical writing, underscoring the urgent need for efficient language preservation methods. This study evaluates deep learning models for word segmentation in Javanese manuscript transliteration. Results from experiments conducted on an unseen dataset reveal that the Bidirectional Long Short-Term Memory (BiLSTM) model consistently outperforms the other architectures across all metrics. With an accuracy of 98.62% and superior F1-score, precision, and recall, the BiLSTM model demonstrates robustness in capturing the intricate linguistic patterns and textual structures inherent in Javanese manuscripts.
KW - BiLSTM
KW - deep learning
KW - javanese manuscript
KW - natural language processing
KW - word segmentation
UR - https://www.scopus.com/pages/publications/85214528757
U2 - 10.1109/ICTIIA61827.2024.10761167
DO - 10.1109/ICTIIA61827.2024.10761167
M3 - Conference contribution
AN - SCOPUS:85214528757
T3 - Proceedings - 2024 2nd International Conference on Technology Innovation and Its Applications, ICTIIA 2024
BT - Proceedings - 2024 2nd International Conference on Technology Innovation and Its Applications, ICTIIA 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2nd International Conference on Technology Innovation and Its Applications, ICTIIA 2024
Y2 - 12 September 2024 through 13 September 2024
ER -