TY - GEN
T1 - Enhancing detection of pathological voice disorder based on deep VGG-16 CNN
AU - Gumelar, Agustinus Bimo
AU - Yuniarno, Eko Mulyanto
AU - Anggraeni, Wiwik
AU - Sugiarto, Indar
AU - Mahindara, Vincentius Raki
AU - Purnomo, Mauridhi Hery
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/10/6
Y1 - 2020/10/6
N2 - The human voice production system is a sophisticated biological mechanism that can modulate pitch and loudness. Internal and external factors often damage the vocal folds and, as a result, alter the voice. The consequences show in bodily function and emotional state. It is therefore important to identify voice changes at an early stage, which provides an opportunity to address their consequences and improve the patient's quality of life. Machine Learning (ML) techniques play a critical role here because they can detect voice disorders automatically. In this experiment, we employ a Convolutional Neural Network (CNN), specifically the robust VGG-16 model. To investigate the performance of the CNN in detecting disordered speech, we used a Pathological Voice Disorder (PVD) dataset, the Respiratory Sound Database, which comprises hundreds of sampled PVD sound files. The experiment showed that the accuracy of voice pathology detection reached 92.03%.
AB - The human voice production system is a sophisticated biological mechanism that can modulate pitch and loudness. Internal and external factors often damage the vocal folds and, as a result, alter the voice. The consequences show in bodily function and emotional state. It is therefore important to identify voice changes at an early stage, which provides an opportunity to address their consequences and improve the patient's quality of life. Machine Learning (ML) techniques play a critical role here because they can detect voice disorders automatically. In this experiment, we employ a Convolutional Neural Network (CNN), specifically the robust VGG-16 model. To investigate the performance of the CNN in detecting disordered speech, we used a Pathological Voice Disorder (PVD) dataset, the Respiratory Sound Database, which comprises hundreds of sampled PVD sound files. The experiment showed that the accuracy of voice pathology detection reached 92.03%.
KW - CNN
KW - LSTM
KW - Pathological Voice Disorder
KW - VGG-16
KW - VTLP Method
UR - http://www.scopus.com/inward/record.url?scp=85112630222&partnerID=8YFLogxK
U2 - 10.1109/IBIOMED50285.2020.9487589
DO - 10.1109/IBIOMED50285.2020.9487589
M3 - Conference contribution
AN - SCOPUS:85112630222
T3 - IBIOMED 2020 - Proceedings of the 3rd International Conference on Biomedical Engineering
SP - 28
EP - 33
BT - IBIOMED 2020 - Proceedings of the 3rd International Conference on Biomedical Engineering
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 3rd International Conference on Biomedical Engineering, IBIOMED 2020
Y2 - 6 October 2020 through 8 October 2020
ER -