On The Optimal Classifier For Affective Vocal Bursts And Stuttering Predictions Based On Pre-Trained Acoustic Embedding

Bagus Tris Atmaja*, Zanjabila, Akira Sasou*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

2 Citations (Scopus)

Abstract

Speech emotion recognition has recently gained interest from researchers because of its potential market applications. Compared with speech, vocal bursts are understudied and may carry richer affective information for recognizing emotion (e.g., laughter for happiness and crying for sadness). In addition, acoustic features used for affective vocalization may also be helpful for the stuttering evaluation task. As an alternative to handcrafted acoustic features, feature extractors based on pre-trained models are attracting growing attention because they are competitive at modeling universal speech embeddings. However, previous speech embedding evaluations are not well suited to emotion recognition. In this study, the researchers evaluated acoustic embeddings extracted from a model fine-tuned on an affective speech dataset for affective vocalization and stuttering predictions using different classifiers. The methods were evaluated with a baseline classifier from the previous study and five new classifiers, including an ensemble classifier. The results show improvements over the baseline methods; the ensemble classifier consistently achieved the best performance on new validation sets with balanced and unnormalized data for both affective vocal bursts and stuttering predictions.

Original language: English
Title of host publication: Proceedings of 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1690-1695
Number of pages: 6
ISBN (Electronic): 9786165904773
Publication status: Published - 2022
Event: 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2022 - Chiang Mai, Thailand
Duration: 7 Nov 2022 to 10 Nov 2022

Publication series

Name: Proceedings of 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2022

Conference

Conference: 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2022
Country/Territory: Thailand
City: Chiang Mai
Period: 7/11/22 to 10/11/22
