TY - GEN
T1 - Classification of Covid-19 Variants Using Boosting Algorithm
AU - Muhammad, Izzudin
AU - Mukhlash, Imam
AU - Jamhuri, Mohammad
AU - Iqbal, Mohammad
AU - Irawan, Mohammad Isa
N1 - Publisher Copyright: © 2022 Institute of Advanced Engineering and Science (IAES).
PY - 2022
Y1 - 2022
AB - COVID-19 is a disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), a member of the coronavirus group. SARS-CoV-2 has five variants classified as variants of concern (VOC): Alpha, Beta, Delta, Gamma, and Omicron. The virus has infected more than 400 million people worldwide. The scale of the pandemic has produced a significant increase in data, so computation is needed to extract knowledge (patterns) from it. Machine learning facilitates the analysis of big data, and classification is one of its tasks. In this paper, we implement two boosting algorithms, eXtreme Gradient Boosting (XGB) and Light Gradient Boosting Machine (LGBM), to classify deoxyribonucleic acid (DNA) sequence data from the COVID-19 virus variants. We encode the data using one-hot encoding. The experimental results show that XGB achieves better accuracy than LGBM, while LGBM has a shorter computation time than XGB. The highest accuracy is 0.992.
KW - Boosting Algorithm
KW - COVID-19
KW - Classification
KW - DNA Sequencing
KW - One-Hot Encoded
UR - http://www.scopus.com/inward/record.url?scp=85142718282&partnerID=8YFLogxK
U2 - 10.23919/EECSI56542.2022.9946452
DO - 10.23919/EECSI56542.2022.9946452
M3 - Conference contribution
AN - SCOPUS:85142718282
T3 - International Conference on Electrical Engineering, Computer Science and Informatics (EECSI)
SP - 29
EP - 34
BT - Proceedings - 9th International Conference on Electrical Engineering, Computer Science and Informatics, EECSI 2022
A2 - Facta, Mochammad
A2 - Syafrullah, Mohammad
A2 - Riyadi, Munawar Agus
A2 - Subroto, Imam Much Ibnu
A2 - Irawan, Irawan
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 9th International Conference on Electrical Engineering, Computer Science and Informatics, EECSI 2022
Y2 - 6 October 2022 through 7 October 2022
ER -