
Exploring the Impact of Spatio-Temporal Patterns in Audio Spectrograms on Emotion Recognition

  • Institut Teknologi Sepuluh Nopember

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

7 Citations (Scopus)

Abstract

Speech emotion recognition plays a vital role in enhancing human-computer interaction and improving user experience in various applications. This paper investigates the use of spatio-temporal patterns in speech emotion recognition, contrasting them with conventional methods that rely solely on spatial or temporal information. The approach uses a parallel architecture that couples Convolutional Neural Networks (CNNs) with Transformers as the encoder block. This design combines the spatial feature extraction capabilities of CNNs with the temporal modeling strengths of Transformers, enabling the capture of intricate patterns and contextual relationships within speech data. We present a comprehensive experimental analysis conducted on three benchmark datasets, shedding light on the impact of spatio-temporal patterns on speech emotion recognition.
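The parallel design described in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the abstract gives no layer sizes, kernel shapes, or fusion strategy, so everything below is an assumption made for illustration. The function names (`spatial_branch`, `temporal_branch`, `parallel_embedding`), the single convolution kernel, the single attention head with identity projections, and the fusion by concatenation are all hypothetical.

```python
import math


def conv2d_valid(spec, kernel):
    """Valid 2D cross-correlation over a (frequency x time) spectrogram."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(spec) - kh + 1
    out_w = len(spec[0]) - kw + 1
    return [
        [
            sum(spec[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def spatial_branch(spec, kernel):
    """CNN-style branch (assumed): convolve, apply ReLU, then average
    over time for each output frequency row."""
    fmap = conv2d_valid(spec, kernel)
    relu = [[max(0.0, v) for v in row] for row in fmap]
    return [sum(row) / len(row) for row in relu]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_branch(spec):
    """Transformer-style branch (assumed): single-head scaled dot-product
    self-attention across time frames with identity projections,
    followed by mean pooling over time."""
    n_freq, n_time = len(spec), len(spec[0])
    # Each time frame becomes one token holding n_freq values.
    frames = [[spec[f][t] for f in range(n_freq)] for t in range(n_time)]
    scale = math.sqrt(n_freq)
    attended = []
    for q in frames:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in frames]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[f] for w, v in zip(weights, frames))
             for f in range(n_freq)]
        )
    return [sum(frame[f] for frame in attended) / n_time
            for f in range(n_freq)]


def parallel_embedding(spec, kernel):
    """Run both branches on the same spectrogram and concatenate the
    results (fusion by concatenation is an assumption)."""
    return spatial_branch(spec, kernel) + temporal_branch(spec)
```

For a spectrogram with 4 frequency bins and 6 time frames and a 3x3 kernel, `parallel_embedding` returns 2 spatial features followed by 4 temporal features. In a real model, a classifier head would map such a fused vector to emotion labels, and both branches would have learned weights.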

Original language: English
Title of host publication: 2023 International Conference on Advanced Mechatronics, Intelligent Manufacture and Industrial Automation, ICAMIMIA 2023 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 200-205
Number of pages: 6
ISBN (Electronic): 9798350309225
Publication status: Published - 2023
Event: 2023 International Conference on Advanced Mechatronics, Intelligent Manufacture and Industrial Automation, ICAMIMIA 2023 - Lombok, Indonesia
Duration: 14 Nov 2023 to 15 Nov 2023

Publication series

Name: 2023 International Conference on Advanced Mechatronics, Intelligent Manufacture and Industrial Automation, ICAMIMIA 2023 - Proceedings

Conference

Conference: 2023 International Conference on Advanced Mechatronics, Intelligent Manufacture and Industrial Automation, ICAMIMIA 2023
Country/Territory: Indonesia
City: Lombok
Period: 14/11/23 to 15/11/23

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 7 - Affordable and Clean Energy

Keywords

  • audio signal processing
  • automation
  • spatio-temporal pattern
  • speech emotion recognition
  • technology
