TY - JOUR
T1 - Bayesian Model Averaging (BMA) Based on Logistic Regression for Gene Selection and Classification of Animal Tumor Disease on Microarray Data
AU - Kuswanto, Heri
AU - Fitriana, Ika Nur Laily
N1 - Publisher Copyright:
© 2022, International Journal on Advanced Science, Engineering and Information Technology. All Rights Reserved.
PY - 2022
Y1 - 2022
N2 - Tumors are among the deadly diseases frequently found in animals, yet identifying whether an animal has a tumor remains a major challenge. Tumor disease can be classified from gene expression data, which comprise hundreds of genes measured on only a small number of samples. Data with this structure, known as microarray data, are high-dimensional. Choosing a single model is problematic for high-dimensional data because it ignores model uncertainty. This research proposes using Bayesian Model Averaging (BMA) to account for model uncertainty by averaging the posterior distributions of the best models, weighted by their posterior model probabilities. Selecting the genes relevant to diagnosing animal tumors is very important; hence, variable selection is carried out using the iterative BMA algorithm. The BMA results show that, of 335 gene expressions, 12 genes were selected as relevant for classifying whether an animal has a tumor or is normal. Moreover, of the 2^335 possible models, the 12 best models were selected. The accuracy of the BMA results is assessed using the Brier score, whose value indicates that the BMA model is good enough to classify whether animals have a tumor. This research shows that BMA with logistic regression has very good predictive performance; hence, the method can be applied to classify other diseases.
KW - Animal tumor
KW - BMA
KW - Gene expression
KW - Microarray
UR - http://www.scopus.com/inward/record.url?scp=85144947455&partnerID=8YFLogxK
U2 - 10.18517/ijaseit.12.6.16462
DO - 10.18517/ijaseit.12.6.16462
M3 - Article
AN - SCOPUS:85144947455
SN - 2088-5334
VL - 12
SP - 2378
EP - 2385
JO - International Journal on Advanced Science, Engineering and Information Technology
JF - International Journal on Advanced Science, Engineering and Information Technology
IS - 6
ER -