Effect of different splitting criteria on the performance of speech emotion recognition

Bagus Tris Atmaja, Akira Sasou

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

3 Citations (Scopus)


Traditional speech emotion recognition (SER) evaluations have been performed merely under a speaker-independent condition; some did not even evaluate their results under this condition. This paper highlights the importance of splitting training and test data for SER by script, known as the sentence-open or text-independent criterion. The results show that employing the sentence-open criterion degraded SER performance. This finding implies that recognizing emotion from speech is difficult when the linguistic information embedded in the acoustic information differs. Surprisingly, the text-independent criterion consistently performed worse than the speaker+text-independent criterion. The full order of splitting criteria by SER performance, from the most difficult to the easiest, is text-independent, speaker+text-independent, speaker-independent, and speaker+text-dependent. The gap between speaker+text-independent and text-independent was smaller than the gaps between other criteria, reinforcing the difficulty of recognizing emotion from speech in different sentences.
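The four splitting criteria named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the `Utterance` record, the `split_utterances` function, the toy corpus, and the exact rule used for each criterion are all assumptions made for illustration. In particular, it assumes that "independent" means the held-out speakers or scripts never appear in training, and that "speaker+text-dependent" means a plain random split.

```python
import random
from typing import NamedTuple


class Utterance(NamedTuple):
    speaker: str
    script: str   # identifier of the sentence that was read
    emotion: str


def split_utterances(utts, criterion, test_speakers=frozenset(),
                     test_scripts=frozenset(), test_ratio=0.25, seed=0):
    """Return (train, test) lists for one of four assumed splitting criteria."""
    if criterion == "speaker+text-dependent":
        # Assumed: random split, so speakers and scripts overlap across sets.
        shuffled = list(utts)
        random.Random(seed).shuffle(shuffled)
        n_test = max(1, int(len(shuffled) * test_ratio))
        return shuffled[n_test:], shuffled[:n_test]

    train, test = [], []
    for u in utts:
        held_speaker = u.speaker in test_speakers
        held_script = u.script in test_scripts
        if criterion == "speaker-independent":
            # Test speakers are unseen in training; scripts may repeat.
            (test if held_speaker else train).append(u)
        elif criterion == "text-independent":
            # Test scripts are unseen in training; speakers may repeat.
            (test if held_script else train).append(u)
        elif criterion == "speaker+text-independent":
            # Both speaker and script must be unseen in training.
            if held_speaker and held_script:
                test.append(u)
            elif not held_speaker and not held_script:
                train.append(u)
            # Utterances sharing only one attribute are dropped.
        else:
            raise ValueError(f"unknown criterion: {criterion}")
    return train, test


# Hypothetical toy corpus: 4 speakers, each reading the same 4 scripts.
EMOTIONS = ["neutral", "happy", "sad", "angry"]
utterances = [
    Utterance(f"s{i}", f"t{j}", EMOTIONS[(i + j) % 4])
    for i in range(1, 5)
    for j in range(1, 5)
]

for name in ("speaker+text-dependent", "speaker-independent",
             "text-independent", "speaker+text-independent"):
    train, test = split_utterances(
        utterances, name, test_speakers={"s4"}, test_scripts={"t4"})
    print(name, len(train), len(test))
```

Under these assumptions, the speaker+text-independent split discards every utterance that shares only a speaker or only a script with the test set, so it leaves less data for both training and testing than the other criteria.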

Original language: English
Title of host publication: TENCON 2021 - 2021 IEEE Region 10 Conference
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 5
ISBN (Electronic): 9781665495325
Publication status: Published - 2021
Externally published: Yes
Event: 2021 IEEE Region 10 Conference, TENCON 2021 - Auckland, New Zealand
Duration: 7 Dec 2021 to 10 Dec 2021

Publication series

Name: IEEE Region 10 Annual International Conference, Proceedings/TENCON
ISSN (Print): 2159-3442
ISSN (Electronic): 2159-3450


Conference: 2021 IEEE Region 10 Conference, TENCON 2021
Country/Territory: New Zealand


  • Speech emotion recognition
  • data partition
  • speaker-independent
  • splitting criteria
  • text-independent

