TY - GEN
T1 - BiLSTM-CNN Hyperparameter Optimization for Speech Emotion and Stress Recognition
AU - Gumelar, Agustinus Bimo
AU - Yuniarno, Eko Mulyanto
AU - Adi, Derry Pramono
AU - Sooai, Adri Gabriel
AU - Sugiarto, Indar
AU - Purnomo, Mauridhi Hery
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021/9/29
Y1 - 2021/9/29
N2 - Most automatic speech recognition (ASR) systems are extremely complicated, integrating many approaches and requiring a wide variety of tuning parameters. Achieving optimal ASR performance requires deep understanding and experience of each component, which confines the development of ASR systems to experts. Hyperparameters are crucial for machine learning algorithms because they directly regulate the behavior of training algorithms and have a major impact on model performance. As a result, an effective hyperparameter optimization technique that can tune any given machine learning method would considerably increase machine learning efficiency. This work investigates the use of Random Forest and Bayesian optimization to automatically tune BiLSTM-CNN systems. We built the ASR system on the BiLSTM-CNN model and customized its hyperparameter values to suit our low hardware specification during optimization. Furthermore, we gathered 1,000 clips of speech data from various movies and classified them according to emotion and stress classes. In pursuit of contextual-level understanding in our ASR system, we transcribed our speech data and used bigram textual features. Our Random Forest-optimized BiLSTM-CNN model ultimately reaches 84% accuracy with a learning runtime of under 17 seconds.
AB - Most automatic speech recognition (ASR) systems are extremely complicated, integrating many approaches and requiring a wide variety of tuning parameters. Achieving optimal ASR performance requires deep understanding and experience of each component, which confines the development of ASR systems to experts. Hyperparameters are crucial for machine learning algorithms because they directly regulate the behavior of training algorithms and have a major impact on model performance. As a result, an effective hyperparameter optimization technique that can tune any given machine learning method would considerably increase machine learning efficiency. This work investigates the use of Random Forest and Bayesian optimization to automatically tune BiLSTM-CNN systems. We built the ASR system on the BiLSTM-CNN model and customized its hyperparameter values to suit our low hardware specification during optimization. Furthermore, we gathered 1,000 clips of speech data from various movies and classified them according to emotion and stress classes. In pursuit of contextual-level understanding in our ASR system, we transcribed our speech data and used bigram textual features. Our Random Forest-optimized BiLSTM-CNN model ultimately reaches 84% accuracy with a learning runtime of under 17 seconds.
KW - Automatic Speech Recognition
KW - Bayesian Optimization
KW - BiLSTM-CNN
KW - Hyperparameter Optimization
KW - Random Forest
UR - http://www.scopus.com/inward/record.url?scp=85119973597&partnerID=8YFLogxK
U2 - 10.1109/IES53407.2021.9594024
DO - 10.1109/IES53407.2021.9594024
M3 - Conference contribution
AN - SCOPUS:85119973597
T3 - International Electronics Symposium 2021: Wireless Technologies and Intelligent Systems for Better Human Lives, IES 2021 - Proceedings
SP - 156
EP - 161
BT - International Electronics Symposium 2021
A2 - Yunanto, Andhik Ampuh
A2 - Kusuma N, Artiarini
A2 - Hermawan, Hendhi
A2 - Putra, Putu Agus Mahadi
A2 - Gamar, Farida
A2 - Ridwan, Mohamad
A2 - Prayogi, Yanuar Risah
A2 - Ruswiansari, Maretha
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 23rd International Electronics Symposium, IES 2021
Y2 - 29 September 2021 through 30 September 2021
ER -