TY - GEN
T1 - GMM Performance Evaluation through Centroid Initialization of k-Means in Text-Independent Speaker Identification
AU - Noviyantono, Endyk
AU - Buliali, Joko Lianto
AU - Arifianto, Dhany
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021/7/29
Y1 - 2021/7/29
N2 - The Gaussian Mixture Model (GMM) is one of the most common methods for speaker identification. The quality of a GMM depends on the method used to train the Gaussians. One such method, chosen in this study, is k-Means. We evaluated the k-Means GMM using three centroid initialization methods: randomization, seeding and density analysis. Seeding uses the k-Means method, whereas density analysis uses the histogram method. We applied two evaluation criteria: the complexity of the training process and the accuracy of the speaker identification process. Experiments were conducted with three voice test durations: 2, 4 and 6 seconds. We also used nine Gaussian component counts, ranging from 4 to 20 components in steps of 2. Our proposed method, which uses density analysis, has a 33.7% lower clustering process time while achieving the highest accuracy, 95.5%.
AB - The Gaussian Mixture Model (GMM) is one of the most common methods for speaker identification. The quality of a GMM depends on the method used to train the Gaussians. One such method, chosen in this study, is k-Means. We evaluated the k-Means GMM using three centroid initialization methods: randomization, seeding and density analysis. Seeding uses the k-Means method, whereas density analysis uses the histogram method. We applied two evaluation criteria: the complexity of the training process and the accuracy of the speaker identification process. Experiments were conducted with three voice test durations: 2, 4 and 6 seconds. We also used nine Gaussian component counts, ranging from 4 to 20 components in steps of 2. Our proposed method, which uses density analysis, has a 33.7% lower clustering process time while achieving the highest accuracy, 95.5%.
KW - clustering algorithms
KW - gaussian mixture model
KW - histograms
KW - k-means
KW - speaker recognition
UR - http://www.scopus.com/inward/record.url?scp=85116272549&partnerID=8YFLogxK
U2 - 10.1109/ICERA53111.2021.9538743
DO - 10.1109/ICERA53111.2021.9538743
M3 - Conference contribution
AN - SCOPUS:85116272549
T3 - Proceeding - ICERA 2021: 2021 3rd International Conference on Electronics Representation and Algorithm
SP - 95
EP - 98
BT - Proceeding - ICERA 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 3rd International Conference on Electronics Representation and Algorithm, ICERA 2021
Y2 - 29 July 2021
ER -