TY - JOUR
T1 - LONTAR-DETC
T2 - Dense and High Variance Balinese Character Detection Method in Lontar Manuscripts
AU - Suciati, Nanik
AU - Sutramiani, Ni Putu
AU - Siahaan, Daniel
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2022
Y1 - 2022
N2 - This paper proposes LONTAR-DETC, a method for detecting handwritten Balinese characters in Lontar manuscripts. LONTAR-DETC is a deep learning architecture based on YOLO. Detecting Balinese characters in Lontar manuscripts is challenging because the characters are dense, overlapping, highly variable, and noisy, and their classes are imbalanced. The proposed method consists of three steps: data generation, Lontar manuscript annotation, and Balinese character detection. In the first step, synthetic images with enhanced image quality are generated from the original Lontar manuscript images. In the second step, the images are annotated to build a new Lontar manuscript dataset, the Handwritten Balinese Character of Lontar manuscript (HBCL-DETC) dataset, which we created specifically for the task of detecting Balinese characters in Lontar manuscripts. HBCL-DETC contains 600 images comprising more than 100,000 Balinese characters annotated by experts. In the third step, a YOLOv4 detection model is trained on the HBCL-DETC dataset. To evaluate the reliability of the dataset, we experimented with three scenarios: in the first, the detection model was trained on original images of Lontar manuscripts; in the second, it was trained with the addition of augmented grayscale images; and in the third, it was trained on HBCL-DETC. The experimental results show that LONTAR-DETC detects Balinese characters with a high detection rate, achieving a mean average precision (mAP) of 99.55%.
AB - This paper proposes LONTAR-DETC, a method for detecting handwritten Balinese characters in Lontar manuscripts. LONTAR-DETC is a deep learning architecture based on YOLO. Detecting Balinese characters in Lontar manuscripts is challenging because the characters are dense, overlapping, highly variable, and noisy, and their classes are imbalanced. The proposed method consists of three steps: data generation, Lontar manuscript annotation, and Balinese character detection. In the first step, synthetic images with enhanced image quality are generated from the original Lontar manuscript images. In the second step, the images are annotated to build a new Lontar manuscript dataset, the Handwritten Balinese Character of Lontar manuscript (HBCL-DETC) dataset, which we created specifically for the task of detecting Balinese characters in Lontar manuscripts. HBCL-DETC contains 600 images comprising more than 100,000 Balinese characters annotated by experts. In the third step, a YOLOv4 detection model is trained on the HBCL-DETC dataset. To evaluate the reliability of the dataset, we experimented with three scenarios: in the first, the detection model was trained on original images of Lontar manuscripts; in the second, it was trained with the addition of augmented grayscale images; and in the third, it was trained on HBCL-DETC. The experimental results show that LONTAR-DETC detects Balinese characters with a high detection rate, achieving a mean average precision (mAP) of 99.55%.
KW - Balinese characters
KW - YOLOv4
KW - data generation
KW - dense
KW - high variance
UR - http://www.scopus.com/inward/record.url?scp=85124089050&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2022.3147069
DO - 10.1109/ACCESS.2022.3147069
M3 - Article
AN - SCOPUS:85124089050
SN - 2169-3536
VL - 10
SP - 14600
EP - 14609
JO - IEEE Access
JF - IEEE Access
ER -