Abstract
Speech emotion recognition plays a vital role in enhancing human-computer interaction and improving user experience across applications. This paper investigates the use of spatio-temporal patterns in speech emotion recognition, in contrast to conventional methods that rely solely on spatial or temporal information. The approach uses a parallel architecture that couples Convolutional Neural Networks (CNNs) with Transformer encoder blocks. This design combines the spatial feature extraction capabilities of CNNs with the temporal modeling strengths of Transformers, enabling the capture of intricate patterns and contextual relationships within speech data. We present a comprehensive experimental analysis on three benchmark datasets, examining the impact of spatio-temporal patterns on speech emotion recognition.
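To make the parallel design in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a mel-spectrogram input of shape (frequency bins, time frames), a convolutional branch for spatial features, and a single self-attention layer standing in for a full Transformer encoder; the two branch outputs are concatenated before a linear classifier. All sizes, names (`cnn_branch`, `attention_branch`, `predict`), the four-class output, and the random weights are hypothetical, since the abstract does not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d_valid(x, kernel):
    """Naive 2-D valid cross-correlation over a (freq, time) spectrogram."""
    kh, kw = kernel.shape
    h, w = x.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cnn_branch(spec, kernels):
    """Spatial branch: conv -> ReLU -> global average pool (one value per kernel)."""
    return np.array([np.maximum(conv2d_valid(spec, k), 0.0).mean() for k in kernels])

def attention_branch(spec, wq, wk, wv):
    """Temporal branch: single-head self-attention over time frames, mean-pooled.
    Simplification: a real Transformer encoder adds multiple heads, residual
    connections, layer normalization, and a feed-forward sublayer."""
    frames = spec.T                                     # (time, freq): one token per frame
    q, k, v = frames @ wq, frames @ wk, frames @ wv
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))   # (time, time) attention weights
    return (weights @ v).mean(axis=0)                   # (d_model,)

def predict(spec, params):
    """Run both branches in parallel on the same input and fuse by concatenation."""
    spatial = cnn_branch(spec, params["kernels"])
    temporal = attention_branch(spec, params["wq"], params["wk"], params["wv"])
    fused = np.concatenate([spatial, temporal])
    return softmax(params["w_out"] @ fused + params["b_out"])

# Hypothetical sizes: 40 mel bins, 50 frames, 4 kernels, d_model 8, 4 emotion classes.
n_mels, n_frames, n_kernels, d_model, n_classes = 40, 50, 4, 8, 4
params = {
    "kernels": rng.standard_normal((n_kernels, 3, 3)),
    "wq": rng.standard_normal((n_mels, d_model)) * 0.1,
    "wk": rng.standard_normal((n_mels, d_model)) * 0.1,
    "wv": rng.standard_normal((n_mels, d_model)) * 0.1,
    "w_out": rng.standard_normal((n_classes, n_kernels + d_model)) * 0.1,
    "b_out": np.zeros(n_classes),
}

spec = rng.standard_normal((n_mels, n_frames))  # stand-in for a real mel-spectrogram
probs = predict(spec, params)
print(probs.shape, float(probs.sum()))
```

With untrained random weights the output probabilities carry no meaning; the sketch only illustrates how the spatial and temporal branches process the same spectrogram side by side before fusion.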
| Original language | English |
|---|---|
| Title of host publication | 2023 International Conference on Advanced Mechatronics, Intelligent Manufacture and Industrial Automation, ICAMIMIA 2023 - Proceedings |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 200-205 |
| Number of pages | 6 |
| ISBN (Electronic) | 9798350309225 |
| Publication status | Published - 2023 |
| Event | 2023 International Conference on Advanced Mechatronics, Intelligent Manufacture and Industrial Automation, ICAMIMIA 2023 - Lombok, Indonesia. Duration: 14 Nov 2023 → 15 Nov 2023 |
Publication series
| Name | 2023 International Conference on Advanced Mechatronics, Intelligent Manufacture and Industrial Automation, ICAMIMIA 2023 - Proceedings |
|---|---|
Conference
| Conference | 2023 International Conference on Advanced Mechatronics, Intelligent Manufacture and Industrial Automation, ICAMIMIA 2023 |
|---|---|
| Country/Territory | Indonesia |
| City | Lombok |
| Period | 14/11/23 → 15/11/23 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 7: Affordable and Clean Energy
Keywords
- audio signal processing
- automation
- spatio-temporal pattern
- speech emotion recognition
- technology
Fingerprint
Dive into the research topics of 'Exploring the Impact of Spatio-Temporal Patterns in Audio Spectrograms on Emotion Recognition'. Together they form a unique fingerprint.