TY - GEN
T1 - Optimizing genomic signal extraction of COVID-19 variants using the multi linear predictive coding (M-LPC) method
AU - Ardamayanti, Thaliah Fauz
AU - Wijaya, Ridho Nur Rohman
AU - Hidayat, Nurul
AU - Irawan, Mohammad Isa
N1 - Publisher Copyright: © 2025 Author(s).
PY - 2025/3/17
Y1 - 2025/3/17
AB - Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) is a novel coronavirus that is transmitted to humans and causes COVID-19. In some cases, COVID-19 presents symptoms similar to those of other illnesses, such as influenza, which makes it difficult to diagnose. We propose a machine learning-based feature extraction method called Multi Linear Predictive Coding (M-LPC) to classify COVID-19 genome signals. M-LPC combines the LPC principle with a sliding window technique. The approach computes basic statistical values that serve as features of DNA sequences, including nucleotide frequency, GC distribution, and the maximum, minimum, mean, and standard deviation. These features can accurately distinguish COVID-19 variants from viruses that cause similar symptoms, such as influenza. The advantage of M-LPC is that it can be applied to DNA sequences of any length once they have been converted into genomic signals, and that it generates simple features for subsequent machine learning-based classification. Our results show that M-LPC successfully extracts essential features of the COVID-19 genome signal, achieving a classification accuracy of 99.86% for three disease classes and 92.45% for 12 classes. The proposed method can therefore serve as a decision support tool for more accurate diagnosis of COVID-19.
UR - http://www.scopus.com/inward/record.url?scp=105001150755&partnerID=8YFLogxK
U2 - 10.1063/5.0262646
DO - 10.1063/5.0262646
M3 - Conference contribution
AN - SCOPUS:105001150755
T3 - AIP Conference Proceedings
BT - AIP Conference Proceedings
A2 - Ghosh, Bapan
A2 - Suryanto, Agus
A2 - Nuraini, Nuning
A2 - Shofianah, Nur
PB - American Institute of Physics
T2 - 10th International Symposium on Biomathematics, SYMOMATH 2023
Y2 - 6 August 2023 through 8 August 2023
ER -