TY - GEN
T1 - Adding an Emotions Filter to Javanese Text-to-Speech System
AU - Mulyanto, Edy
AU - Yuniarno, Eko Mulyanto
AU - Purnomo, Mauridhi Hery
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/7/2
Y1 - 2018/7/2
N2 - Speech is one of the main ways humans interact. A text-to-speech (TTS) system converts text into speech. Common TTS applications include visual chatbots, screen readers, and digital talking books for the blind. This research focuses on the Javanese language: it adds an emotion filter to a Javanese TTS system that has automatic syllabification and phonetic transcription. The emotion filter uses the prosody manipulation method and a predetermined rate factor. A perception test is used to evaluate the emotion filter, while the Syllable Error Rate (SER) is used to test the accuracy of the syllabification and phonetic transcription system. The Mean Opinion Score (MOS) is used to evaluate the naturalness of the speech, while the Word Error Rate (WER) is used to measure speech clarity. The SER test yields a value of 0.985%, the WER test yields a value of 25.03%, and the MOS is 3.60, obtained from 15 respondents.
AB - Speech is one of the main ways humans interact. A text-to-speech (TTS) system converts text into speech. Common TTS applications include visual chatbots, screen readers, and digital talking books for the blind. This research focuses on the Javanese language: it adds an emotion filter to a Javanese TTS system that has automatic syllabification and phonetic transcription. The emotion filter uses the prosody manipulation method and a predetermined rate factor. A perception test is used to evaluate the emotion filter, while the Syllable Error Rate (SER) is used to test the accuracy of the syllabification and phonetic transcription system. The Mean Opinion Score (MOS) is used to evaluate the naturalness of the speech, while the Word Error Rate (WER) is used to measure speech clarity. The SER test yields a value of 0.985%, the WER test yields a value of 25.03%, and the MOS is 3.60, obtained from 15 respondents.
KW - Emotion Filter
KW - Phonetic Transcription
KW - Predetermined Factor
KW - Prosody Manipulation
KW - Syllabification
KW - Text-to-Speech
UR - http://www.scopus.com/inward/record.url?scp=85066485190&partnerID=8YFLogxK
U2 - 10.1109/CENIM.2018.8711229
DO - 10.1109/CENIM.2018.8711229
M3 - Conference contribution
AN - SCOPUS:85066485190
T3 - 2018 International Conference on Computer Engineering, Network and Intelligent Multimedia, CENIM 2018 - Proceeding
SP - 142
EP - 146
BT - 2018 International Conference on Computer Engineering, Network and Intelligent Multimedia, CENIM 2018 - Proceeding
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2018 International Conference on Computer Engineering, Network and Intelligent Multimedia, CENIM 2018
Y2 - 26 November 2018 through 27 November 2018
ER -