Abstract

DNA microarray data contain gene expression measurements with small sample sizes and a large number of features. Class imbalance is also a common problem in microarray data: it occurs when a dataset is dominated by one class that has significantly more instances than the minority classes. A classification method is therefore needed that can handle data that are both high dimensional and imbalanced. Support Vector Machine (SVM) is a classification method that can handle large or small samples, nonlinearity, high dimensionality, overfitting, and local minima. SVM has been widely applied to DNA microarray data classification and has been shown to provide the best performance among machine learning methods. However, imbalanced data are a problem because SVM treats all samples as equally important, so the results are biased against the minority class. To overcome the imbalance, Fuzzy SVM (FSVM) is proposed. This method applies a fuzzy membership to each input point and reformulates the SVM so that different input points make different contributions to the classifier. The minority classes are given large fuzzy memberships, so FSVM pays more attention to the samples with larger fuzzy membership. Because DNA microarray data are high dimensional with a very large number of features, feature selection is performed first using the Fast Correlation-Based Filter (FCBF). In this study, the data are analyzed with SVM, with FSVM, and with both methods after applying FCBF, and the classification performance of each is obtained. Based on the overall results, FSVM on the selected features has the best classification performance compared to SVM.
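As a rough illustration of the membership idea described in the abstract, the sketch below assigns larger fuzzy memberships to minority-class samples and converts them into per-sample penalties. The abstract does not state the membership function used in the study, so the class-proportional rule here is an assumption, and the names `class_fuzzy_memberships`, `per_sample_penalties`, and `floor` are hypothetical.

```python
from collections import Counter


def class_fuzzy_memberships(labels, floor=1e-3):
    """Give each sample a membership in (0, 1]; rarer classes get larger values.

    Assumed rule (not taken from the paper):
    membership = size of the smallest class / size of the sample's own class.
    """
    counts = Counter(labels)
    smallest = min(counts.values())
    return [max(floor, smallest / counts[y]) for y in labels]


def per_sample_penalties(memberships, C=1.0):
    """Scale the SVM penalty C by each sample's membership s_i.

    In the usual FSVM objective, 1/2 ||w||^2 + C * sum_i s_i * xi_i,
    sample i is penalized with C * s_i instead of a uniform C.
    """
    return [C * s for s in memberships]


# Toy imbalanced labels: 2 minority samples, 8 majority samples.
labels = ["tumor"] * 2 + ["normal"] * 8
s = class_fuzzy_memberships(labels)
penalties = per_sample_penalties(s, C=10.0)
print(s[0], s[-1])                  # minority vs. majority membership
print(penalties[0], penalties[-1])  # minority vs. majority penalty
```

With scikit-learn, memberships like these could be passed as `sample_weight` to `SVC.fit`, which rescales `C` per sample. That is one possible way to approximate FSVM and is not necessarily the implementation used in the study.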

Original language: English
Title of host publication: Proceedings of the 13th IMT-GT International Conference on Mathematics, Statistics and their Applications, ICMSA 2017
Editors: Haslinda Ibrahim, Nazrina Aziz, Mohd Kamal Mohd Nawawi, Azizah Mohd Rohni, Jafri Zulkepli
Publisher: American Institute of Physics Inc.
ISBN (Electronic): 9780735415959
Publication status: Published - 22 Nov 2017
Event: 13th IMT-GT International Conference on Mathematics, Statistics and their Applications, ICMSA 2017 - Kedah, Malaysia
Duration: 4 Dec 2017 to 7 Dec 2017

Publication series

Name: AIP Conference Proceedings
Volume: 1905
ISSN (Print): 0094-243X
ISSN (Electronic): 1551-7616

Conference

Conference: 13th IMT-GT International Conference on Mathematics, Statistics and their Applications, ICMSA 2017
Country/Territory: Malaysia
City: Kedah
Period: 4/12/17 to 7/12/17

Title: Fuzzy support vector machine for microarray imbalanced data classification
