TY - JOUR
T1 - Automatic lip reading for daily Indonesian words based on frame difference and horizontal-vertical image projection
AU - Nasuha, Aris
AU - Arifin, Fatchul
AU - Sardjono, Tri Arief
AU - Takahashi, Hideya
AU - Purnomo, Mauridhi Hery
N1 - Publisher Copyright:
© 2005 - 2017 JATIT & LLS. All rights reserved.
PY - 2017/1/31
Y1 - 2017/1/31
N2 - Automatic lip reading is an active area of research. It has been used for various purposes, such as enhancing speech recognition and aiding speech training for the deaf. There are two approaches to lip feature extraction: appearance-based and shape-based. The appearance-based approach is usually better because it provides visual features that cover not only lip structure but also teeth and tongue visibility. Its drawback, however, is that it produces too many features. This paper presents a new method that integrates frame difference and horizontal-vertical image projection. The proposed method belongs to the appearance-based approach, with image projection used for dimensionality reduction. We apply the proposed method to automatic lip reading to classify five daily words in the Indonesian language. We use 200 data samples recorded from a frontal view of the face and focused on the area around the lips. MLP (Multi-Layer Perceptron) and SVM (Support Vector Machine) are used as classifiers. Models of the proposed method are evaluated using 4-fold cross-validation. Of the four algorithms in the proposed method, the best result is achieved by the combination of folded lip image and double difference. A comparison of the proposed method with 2D-DCT (Two-Dimensional Discrete Cosine Transform) shows that the proposed method exceeds 2D-DCT in CA (Classification Accuracy) and AUC (Area Under the ROC Curve), specifically when MLP is used as the classifier. The proposed method achieves 96.5% CA and 0.9993 AUC, whereas 2D-DCT achieves 94% CA and 0.9978 AUC.
KW - Appearance based approach
KW - Indonesian language
KW - Lip reading
KW - Neural network
KW - Visual speech recognition
UR - http://www.scopus.com/inward/record.url?scp=85011705841&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:85011705841
SN - 1992-8645
VL - 95
SP - 393
EP - 402
JO - Journal of Theoretical and Applied Information Technology
JF - Journal of Theoretical and Applied Information Technology
IS - 2
ER -