Human Voice Emotion Identification Using Prosodic and Spectral Feature Extraction Based on Deep Neural Networks

Agustinus Bimo Gumelar, Afid Kurniawan, Adri Gabriel Sooai, Mauridhi Hery Purnomo, Eko Mulyanto Yuniarno, Indar Sugiarto, Agung Widodo, Andreas Agung Kristanto, Tresna Maulana Fahrudin

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

15 Citations (Scopus)

Abstract

It is well known that the human voice carries multimodal information at the perceptual level. A modality can therefore be shared through a neural emotion network via independent stimulus processes. Expressing and identifying emotions are significant steps in human communication. Because this process is biologically adaptive in a continuous manner, human voice identification is useful for classifying emotions and identifying the specific characteristics that distinguish them. In this paper, we propose to identify the differences between six essential human voice emotions, using prosodic and spectral feature extraction with Deep Neural Networks (DNNs). Our experiment obtained an accuracy of 78.83%. We presume that a higher intensity of emotion in a sound sample leads to a correspondingly higher accuracy. Gender identification was also carried out, with an accuracy of approximately 90%. Nevertheless, in the learning process with an 80:20 training-testing composition, the reported accuracy reached 100%.
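
The abstract does not state which prosodic or spectral descriptors were extracted, nor how the DNN was configured. As an illustration only, the sketch below computes a few commonly used descriptors on one audio frame and performs an 80:20 split. The choice of features (RMS energy, zero-crossing rate and an autocorrelation pitch estimate as prosodic cues, spectral centroid as a spectral cue), all function names, the frame length, and the sample rate are assumptions made for this example and are not taken from the paper. The DNN itself is omitted.

```python
import math
import random


def rms_energy(frame):
    """Root-mean-square energy of a frame (a loudness-related prosodic cue)."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose sign differs."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def estimate_pitch_hz(frame, sample_rate, fmin=50.0, fmax=500.0):
    """Pitch estimate from the lag that maximizes the raw autocorrelation.

    The search range fmin..fmax is an assumed speech range, not from the paper.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


def spectral_centroid_hz(frame, sample_rate):
    """Magnitude-weighted mean frequency, using a direct (slow) DFT."""
    n = len(frame)
    weighted, total = 0.0, 0.0
    for k in range(n // 2 + 1):
        angle = 2 * math.pi * k / n
        re = sum(frame[t] * math.cos(angle * t) for t in range(n))
        im = -sum(frame[t] * math.sin(angle * t) for t in range(n))
        mag = math.hypot(re, im)
        weighted += (k * sample_rate / n) * mag
        total += mag
    return weighted / total if total else 0.0


def split_80_20(items, seed=0):
    """Shuffle reproducibly, then put 80% in training and 20% in testing."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Synthetic stand-in for a voiced frame: a 200 Hz sine sampled at 8 kHz.
SAMPLE_RATE = 8000
frame = [math.sin(2 * math.pi * 200 * t / SAMPLE_RATE) for t in range(400)]

# One feature vector per frame; a classifier would consume many of these.
features = [
    rms_energy(frame),
    zero_crossing_rate(frame),
    estimate_pitch_hz(frame, SAMPLE_RATE),
    spectral_centroid_hz(frame, SAMPLE_RATE),
]

train_ids, test_ids = split_80_20(range(10))
```

In practice such vectors would be computed over many frames per recording and summarized before being passed to a classifier. A dedicated audio library would normally replace the direct DFT used here for brevity.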

Original language: English
Title of host publication: 2019 IEEE 7th International Conference on Serious Games and Applications for Health, SeGAH 2019
Editors: Duarte Duque, Jeremy White, Nuno Rodrigues, Joao L. Vilaca, Nuno Dias
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728103006
Publication status: Published - Aug 2019
Event: 7th IEEE International Conference on Serious Games and Applications for Health, SeGAH 2019 - Kyoto, Japan
Duration: 5 Aug 2019 to 7 Aug 2019

Publication series

Name: 2019 IEEE 7th International Conference on Serious Games and Applications for Health, SeGAH 2019

Conference

Conference: 7th IEEE International Conference on Serious Games and Applications for Health, SeGAH 2019
Country/Territory: Japan
City: Kyoto
Period: 5/08/19 to 7/08/19

Keywords

  • DNNs
  • human voice emotion
  • prosodic feature
  • spectral feature
