TY - JOUR
T1 - Hybrid K-means, fuzzy C-means, and hierarchical clustering for DNA hepatitis C virus trend mutation analysis
AU - Al Kindhi, Berlian
AU - Sardjono, Tri Arief
AU - Purnomo, Mauridhi Hery
AU - Verkerke, Gijbertus Jacob
N1 - Publisher Copyright: © 2018 Elsevier Ltd
PY - 2019/5/1
Y1 - 2019/5/1
AB - Every single strand of DNA consists of 10 sequences of nucleotides. These sequences cannot be separated or randomly arranged because each sequence of DNA contains a certain genomic encoding. When a virus mutates, a drug or vaccine for that virus that has been given to a patient will become useless. Therefore, there is a need for a method of analysing the likely direction of DNA mutation so that preventative measures can be adapted more quickly. RNA-type viruses are able to alter the patterns of infected DNA, which is one way for such a virus to defend itself. In this paper, we propose a new hybrid clustering method that combines K-means, fuzzy C-means, and hierarchical clustering to predict the direction of DNA mutation trends. We have combined these three different approaches in a hybrid clustering method and tested it on two data sets of 1000 isolated positive hepatitis C virus (HCV)-infected and non-infected DNA strands with 37 HCV primers. We compare the results with those of eight other clustering methods, and the comparison shows that our method achieves sensitivity and specificity values of 0.998. The level of precision of cluster division is also 0.004 higher than that of the next highest among the eight methods considered for comparison. From this study, the primer trends that most often appear in isolated DNA can be found, and the origins of these trends in isolated DNA can be inferred.
KW - Fuzzy C-means
KW - Hepatitis C virus
KW - Hierarchical clustering
KW - Hybrid clustering
KW - K-means
UR - http://www.scopus.com/inward/record.url?scp=85059074386&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2018.12.019
DO - 10.1016/j.eswa.2018.12.019
M3 - Article
AN - SCOPUS:85059074386
SN - 0957-4174
VL - 121
SP - 373
EP - 381
JO - Expert Systems with Applications
JF - Expert Systems with Applications
ER -