TY - JOUR
T1 - Integrating data selection and extreme learning machine to predict protein-ligand binding site
AU - Mahdiyah, Umi
AU - Imah, Elly Matul
AU - Irawan, M. Isa
N1 - Publisher Copyright: © 2015 Umi Mahdiyah, Elly Matul Imah and M. Isa Irawan.
PY - 2016
Y1 - 2016
N2 - Computer-aided drug design has been developing rapidly in recent years. Its first step is to find a protein-ligand binding site, which is a pocket or cleft on the surface of the protein that is used to bind a ligand (drug). In this study, binding-site prediction is formulated as a binary classification problem that distinguishes locations that can bind the ligand from those that cannot. The classification method used is the Extreme Learning Machine (ELM), chosen for its fast learning process. Real-world datasets are often imbalanced, and binding-site prediction is one such case. Imbalanced data can be handled in several ways. In this study, we integrate data selection with classification to overcome the inconsistency problem. The performance of integrating data selection and the Extreme Learning Machine to predict protein-ligand binding sites is measured using recall, specificity, G-mean, and CPU time. The average recall, specificity, G-mean, and CPU time obtained in this research are 91.8472%, 97.071%, 94.2647%, and 2.79 seconds, respectively.
AB - Computer-aided drug design has been developing rapidly in recent years. Its first step is to find a protein-ligand binding site, which is a pocket or cleft on the surface of the protein that is used to bind a ligand (drug). In this study, binding-site prediction is formulated as a binary classification problem that distinguishes locations that can bind the ligand from those that cannot. The classification method used is the Extreme Learning Machine (ELM), chosen for its fast learning process. Real-world datasets are often imbalanced, and binding-site prediction is one such case. Imbalanced data can be handled in several ways. In this study, we integrate data selection with classification to overcome the inconsistency problem. The performance of integrating data selection and the Extreme Learning Machine to predict protein-ligand binding sites is measured using recall, specificity, G-mean, and CPU time. The average recall, specificity, G-mean, and CPU time obtained in this research are 91.8472%, 97.071%, 94.2647%, and 2.79 seconds, respectively.
KW - Binding site protein-ligand
KW - Extreme learning machine
KW - Imbalanced data
UR - http://www.scopus.com/inward/record.url?scp=84992199794&partnerID=8YFLogxK
U2 - 10.12988/ces.2016.66114
DO - 10.12988/ces.2016.66114
M3 - Article
AN - SCOPUS:84992199794
SN - 1313-6569
VL - 9
SP - 791
EP - 797
JO - Contemporary Engineering Sciences
JF - Contemporary Engineering Sciences
IS - 13-16
ER -