Many studies have addressed text-to-audiovisual synthesis, but only a few have been applied to the Indonesian language. The results of the present research can be applied in a wide range of fields, e.g. the gaming industry, the animation industry, and human-computer interaction systems. Producing realistic text-to-audiovisual output requires correspondence among the speech, the mouth movements (visual phonemes, or visemes), and the phonemes spoken. This research aims to develop a text-to-audiovisual synthesizer for the Indonesian language that takes Indonesian text as input, called TTAVI (Text-To-AudioVisual synthesizer for Indonesian language). The method consists of four major parts: building the models of Indonesian visemes, converting text to speech, synchronizing the speech with the visemes, and stringing the visemes together using the morphing viseme algorithm. The morphing viseme algorithm makes the virtual character's pronunciation of phonemes produced by the TTAVI synthesizer smoother. Ten Indonesian texts were given as input to the TTAVI synthesizer, and the results were evaluated by 30 users. The users' ratings were aggregated using the Mean Opinion Score (MOS) method. The average MOS is 4.106 on a scale from 1 to 5. This indicates that users consider the TTAVI synthesizer good and that the morphing viseme algorithm makes its output smoother.
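The two computations mentioned in the abstract can be illustrated with a minimal sketch. The first function averages user ratings into a Mean Opinion Score on a 1 to 5 scale. The second produces in-between frames by linear interpolation between two viseme shapes; this is a simple stand-in chosen for illustration, since the abstract does not describe how the morphing viseme algorithm actually works. The function names, the list-of-floats viseme representation, and the sample values are all assumptions, not taken from the paper.

```python
from statistics import mean


def mean_opinion_score(ratings, low=1, high=5):
    """Average every rating in ratings[user][text] on a low..high scale."""
    flat = [score for user_scores in ratings for score in user_scores]
    if not flat:
        raise ValueError("no ratings supplied")
    if any(not (low <= score <= high) for score in flat):
        raise ValueError(f"ratings must lie between {low} and {high}")
    return mean(flat)


def blend_visemes(start, end, steps):
    """Linearly interpolate between two viseme shapes.

    Assumption: a viseme is a flat list of numbers (e.g. mouth control
    points). Returns steps + 1 frames, including both endpoints.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    if len(start) != len(end):
        raise ValueError("viseme shapes must have the same length")
    frames = []
    for i in range(steps + 1):
        t = i / steps  # blend weight: 0.0 at start, 1.0 at end
        frames.append([(1 - t) * a + t * b for a, b in zip(start, end)])
    return frames


if __name__ == "__main__":
    # Hypothetical data: two users, each rating two synthesized texts.
    print(mean_opinion_score([[4, 5], [3, 4]]))
    # Hypothetical two-parameter mouth shapes, one in-between frame.
    print(blend_visemes([0.0, 1.0], [1.0, 0.0], steps=2))
```

Inserting interpolated frames between consecutive visemes is one plausible way to avoid abrupt jumps in mouth shape, which is the kind of smoothing effect the abstract attributes to morphing.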

Original language: English
Pages (from-to): 1149-1156
Number of pages: 8
Journal: International Review on Computers and Software
Issue number: 11
Publication status: Published - Nov 2015


  • A model of Indonesian visemes
  • Audiovisual
  • Indonesian text
  • Morphing viseme
  • Viseme


Title: A text-to-audiovisual synthesizer for Indonesian by morphing viseme
