Exploring the Impact of Spatio-Temporal Patterns in Audio Spectrograms on Emotion Recognition

Shintami Chusnul Hidayati*, Adam Satria Adidarma, Kelly Rossa Sungkono

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

Speech emotion recognition plays a vital role in enhancing human-computer interaction and improving user experience in various applications. This paper investigates the use of spatio-temporal patterns in speech emotion recognition, contrasting them with conventional methods that rely solely on spatial or temporal information. The approach uses a parallel architecture that couples Convolutional Neural Networks (CNNs) with Transformers to serve as the encoder block. This design combines the spatial feature extraction capabilities of CNNs with the temporal modeling strengths of Transformers, enabling it to capture intricate patterns and contextual relationships within speech data. We present a comprehensive experimental analysis on three benchmark datasets, shedding light on how spatio-temporal patterns affect speech emotion recognition.
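The parallel design described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no layer sizes, class count, or training details, so every name and dimension below (`init_params`, `parallel_forward`, 40 frequency bands, 4 emotion classes, 3x3 kernels) is a hypothetical choice made for illustration. The sketch also simplifies both branches: the convolutional branch is a single layer, and the Transformer branch is reduced to one single-head self-attention step without positional encoding, feed-forward sublayers, or layer normalization. Weights are random, so the output shows only how the two branches run side by side and are fused.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def conv2d_relu(x, kernels):
    """Valid 2D cross-correlation followed by ReLU.

    x: (freq, time) spectrogram; kernels: (n_k, kh, kw).
    Returns feature maps of shape (n_k, freq - kh + 1, time - kw + 1).
    """
    windows = np.lib.stride_tricks.sliding_window_view(x, kernels.shape[1:])
    out = np.einsum("ftij,kij->kft", windows, kernels)
    return np.maximum(out, 0.0)


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    tokens: (time, d_in), one token per spectrogram frame.
    Returns (time, d_model).
    """
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    return softmax(scores, axis=-1) @ v


def init_params(rng, n_bands, n_kernels=4, d_model=8, n_classes=4):
    """Random weights; all sizes are assumptions, not values from the paper."""
    return {
        "kernels": rng.normal(0.0, 0.1, (n_kernels, 3, 3)),
        "w_q": rng.normal(0.0, 0.1, (n_bands, d_model)),
        "w_k": rng.normal(0.0, 0.1, (n_bands, d_model)),
        "w_v": rng.normal(0.0, 0.1, (n_bands, d_model)),
        "w_out": rng.normal(0.0, 0.1, (n_kernels + d_model, n_classes)),
        "b_out": np.zeros(n_classes),
    }


def parallel_forward(spec, params):
    """Run both branches on the same spectrogram and fuse their features."""
    # Spatial branch: local time-frequency patterns, average-pooled per kernel.
    conv_feat = conv2d_relu(spec, params["kernels"]).mean(axis=(1, 2))
    # Temporal branch: frames attend to each other, then average over time.
    attn_feat = self_attention(
        spec.T, params["w_q"], params["w_k"], params["w_v"]
    ).mean(axis=0)
    # Fusion by concatenation, then a linear classifier over emotion classes.
    fused = np.concatenate([conv_feat, attn_feat])
    return softmax(fused @ params["w_out"] + params["b_out"])


rng = np.random.default_rng(0)
spec = rng.random((40, 100))  # stand-in for a 40-band, 100-frame spectrogram
params = init_params(rng, n_bands=40)
probs = parallel_forward(spec, params)
print(probs.shape, float(probs.sum()))
```

Concatenation is only one plausible way to fuse the two feature vectors; the abstract does not say how the branches are combined.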

Original language: English
Title of host publication: 2023 International Conference on Advanced Mechatronics, Intelligent Manufacture and Industrial Automation, ICAMIMIA 2023 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 200-205
Number of pages: 6
ISBN (Electronic): 9798350309225
DOIs
Publication status: Published - 2023
Event: 2023 International Conference on Advanced Mechatronics, Intelligent Manufacture and Industrial Automation, ICAMIMIA 2023 - Lombok, Indonesia
Duration: 14 Nov 2023 to 15 Nov 2023

Publication series

Name: 2023 International Conference on Advanced Mechatronics, Intelligent Manufacture and Industrial Automation, ICAMIMIA 2023 - Proceedings

Conference

Conference: 2023 International Conference on Advanced Mechatronics, Intelligent Manufacture and Industrial Automation, ICAMIMIA 2023
Country/Territory: Indonesia
City: Lombok
Period: 14/11/23 to 15/11/23

Keywords

  • audio signal processing
  • automation
  • spatio-temporal pattern
  • speech emotion recognition
  • technology
