Abstract
Molecular descriptor selection is a pivotal tool in quantitative structure–activity relationship (QSAR) modeling. This paper proposes a novel molecular descriptor selection method that takes into account the group type to which each descriptor belongs. The method combines penalized logistic regression with the 2-sample t test, so that filtering and weighting are performed simultaneously. Specifically, the 2-sample t test is employed as a filter that removes descriptors which do not show a statistically significant difference between the classes. A weighted penalized logistic regression is then fitted, with each descriptor assigned a weight that depends on its 2-sample t test value within its descriptor type block. The proposed method is tested experimentally and compared with state-of-the-art selection methods. The results show that the proposed method is simpler and faster while delivering efficient classification performance.
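The filter-then-weight idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the Welch form of the t statistic, a cutoff `t_cut` on |t| standing in for a significance test, the block weight rule (largest |t| in the block divided by the descriptor's own |t|), an L1 penalty solved by proximal gradient, and all function names, parameter values, and the synthetic data.

```python
import numpy as np


def two_sample_t(X, y):
    """Welch two-sample t statistic for each column (class 1 vs. class 0)."""
    a, b = X[y == 1], X[y == 0]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return (a.mean(axis=0) - b.mean(axis=0)) / se


def block_weights(t_abs, groups):
    """Assumed rule: weight_j = (largest |t| in j's block) / |t_j|, so weight_j >= 1."""
    w = np.empty_like(t_abs)
    for g in np.unique(groups):
        idx = groups == g
        w[idx] = t_abs[idx].max() / t_abs[idx]
    return w


def weighted_l1_logistic(X, y, weights, lam=0.05, n_iter=2000):
    """Proximal gradient for mean logistic loss + lam * sum_j weights_j * |beta_j|."""
    n, p = X.shape
    beta, b0 = np.zeros(p), 0.0
    # Step size no larger than 1/L, where L bounds the curvature of the mean loss.
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2 + n)
    for _ in range(n_iter):
        resid = 1.0 / (1.0 + np.exp(-(b0 + X @ beta))) - y
        z = beta - step * (X.T @ resid) / n
        # Soft-thresholding with a per-descriptor threshold (the weighted L1 prox).
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam * weights, 0.0)
        b0 -= step * resid.mean()  # the intercept is not penalized
    return b0, beta


def select_descriptors(X, y, groups, t_cut=2.0, lam=0.05):
    """Filter by |t| > t_cut (assumed cutoff), then fit the weighted model."""
    t_abs = np.abs(two_sample_t(X, y))
    keep = t_abs > t_cut
    w = block_weights(t_abs[keep], groups[keep])
    _, beta = weighted_l1_logistic(X[:, keep], y, w, lam=lam)
    beta_full = np.zeros(X.shape[1])
    beta_full[keep] = beta  # filtered-out descriptors keep a zero coefficient
    return keep, beta_full


# Synthetic example: 12 descriptors in 3 blocks of 4; only columns 0 and 5 matter.
rng = np.random.default_rng(0)
n, p = 200, 12
X = rng.standard_normal((n, p))
groups = np.repeat([0, 1, 2], 4)
y = (X[:, 0] + X[:, 5] + 0.5 * rng.standard_normal(n) > 0).astype(int)

keep, beta = select_descriptors(X, y, groups)
print("kept by filter:", np.flatnonzero(keep))
print("selected by model:", np.flatnonzero(beta))
```

In this sketch a descriptor with a weak t statistic relative to the strongest one in its block receives a larger penalty, so it is more likely to be shrunk to exactly zero. Real descriptor matrices would normally be standardized before fitting, which the synthetic data here does not need.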
Original language | English |
---|---|
Article number | e2915 |
Journal | Journal of Chemometrics |
Volume | 31 |
Issue number | 10 |
Publication status | Published - Oct 2017 |
Externally published | Yes |
Keywords
- QSAR classification
- SCAD
- adaptive lasso
- descriptor selection
- penalized logistic regression