Cancer is a disease caused by the abnormal growth of cells in the body's tissues. Radiotherapy is a cancer treatment that has the side effect of killing normal cells surrounding the cancer cells. Radioprotectors are designed to reduce normal cell death and increase cancer cell death. This research identifies compounds according to their toxicity, classified by a normal cell death rate below or above 20%. The data used in this study are toxicity levels for classifying radioprotector compounds, covering 84 compounds with 217 predictors (features). Two ensemble-based machine learning approaches are applied to overcome the high dimensionality of the data, namely Logistic Regression Ensembles (LORENS) and an ensemble Support Vector Machine (AdaBoost-SVM). AdaBoost-SVM is applied to the important features selected by the Mean Decrease Gini (MDG) index. The results show that AdaBoost-SVM significantly outperforms LORENS, reaching an accuracy of 0.7889 when only the top 5% most important features are used.
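To make the feature selection step concrete, the following is a minimal sketch of the quantity behind the MDG index: the drop in Gini impurity produced by one split on one feature. In a random forest this drop is averaged over every split that uses the feature; the sketch scores a single split only and is not the implementation used in this study. The function names, the toy data, the threshold of 0.5, and the rounding rule (ceiling) for the 5% cut-off are all illustrative assumptions.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity 1 - sum(p_k^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(values, labels, threshold):
    """Impurity decrease from splitting one feature at `threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    # Child impurities are weighted by the share of samples in each child.
    weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
    return gini(labels) - weighted


def top_fraction(scores, fraction=0.05):
    """Indices of the highest-scoring features, keeping ceil(fraction * p)."""
    k = math.ceil(fraction * len(scores))  # rounding rule is an assumption
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Toy data (hypothetical): class 1 stands for a death rate above 20%.
labels = [0, 0, 0, 1, 1, 1]
informative = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]  # separates the classes
noisy = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7]        # barely separates them

scores = [
    gini_decrease(informative, labels, threshold=0.5),
    gini_decrease(noisy, labels, threshold=0.5),
]

# With 217 predictors, a 5% cut-off under ceiling rounding keeps 11 features.
n_keep = math.ceil(0.05 * 217)
```

Here the informative feature removes all impurity (a decrease of 0.5), while the noisy feature removes very little, so a ranking by this score places the informative feature first. The classifier would then be trained on the retained columns only.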
- High dimensionality