TY - JOUR
T1 - Gene selection and classification of microarray gene expression data based on a new adaptive L1-norm elastic net penalty
AU - Alharthi, Aiedh Mrisi
AU - Lee, Muhammad Hisyam
AU - Algamal, Zakariya Yahya
N1 - Publisher Copyright: © 2021 The Authors
PY - 2021/1
Y1 - 2021/1
N2 - The removal of irrelevant and insignificant genes has always been a major step in microarray data analysis. The application of gene selection methods to biological datasets has greatly increased, supporting expert systems for cancer diagnosis with high classification accuracy. Penalized logistic regression (PLR) using the elastic net (EN) has been widely used in high-dimensional cancer classification in recent years to estimate the gene coefficients and perform gene selection simultaneously. However, the EN estimator does not satisfy the oracle properties. This paper proposes the PLR using the adaptive elastic net (AEN), abbreviated as PLRAEN, to address the inconsistency. Our method employs the ratio (BWR) as an initial weight inside the L1-norm of the EN model. To evaluate its effectiveness, several experiments were performed in a simulation study across different numbers of predictor variables, sample sizes, and correlation coefficients, and on three public gene expression datasets. Experimental results demonstrate that the proposed method consistently outperforms two other contemporary penalized methods in classification accuracy and in the number of selected genes. Therefore, we conclude that PLRAEN is a better method for gene selection in high-dimensional cancer classification.
AB - The removal of irrelevant and insignificant genes has always been a major step in microarray data analysis. The application of gene selection methods to biological datasets has greatly increased, supporting expert systems for cancer diagnosis with high classification accuracy. Penalized logistic regression (PLR) using the elastic net (EN) has been widely used in high-dimensional cancer classification in recent years to estimate the gene coefficients and perform gene selection simultaneously. However, the EN estimator does not satisfy the oracle properties. This paper proposes the PLR using the adaptive elastic net (AEN), abbreviated as PLRAEN, to address the inconsistency. Our method employs the ratio (BWR) as an initial weight inside the L1-norm of the EN model. To evaluate its effectiveness, several experiments were performed in a simulation study across different numbers of predictor variables, sample sizes, and correlation coefficients, and on three public gene expression datasets. Experimental results demonstrate that the proposed method consistently outperforms two other contemporary penalized methods in classification accuracy and in the number of selected genes. Therefore, we conclude that PLRAEN is a better method for gene selection in high-dimensional cancer classification.
KW - Adapted elastic net
KW - Cancer diagnosis
KW - Gene selection
KW - Penalized logistic regression
UR - http://www.scopus.com/inward/record.url?scp=85107088836&partnerID=8YFLogxK
U2 - 10.1016/j.imu.2021.100622
DO - 10.1016/j.imu.2021.100622
M3 - Article
AN - SCOPUS:85107088836
SN - 2352-9148
VL - 24
JO - Informatics in Medicine Unlocked
JF - Informatics in Medicine Unlocked
M1 - 100622
ER -