A Novel Prediction Method for Zinc-Binding Sites in Proteins by an Ensemble of SVM and Sample-Weighted Probabilistic Neural Network
by: Hui Li, Dechang Pi, Chuanming Chen, Hongyi Li
Format: | Article |
---|---|
Published: | IEEE 2019-01-01 |
Description
In predicting zinc-binding sites in proteins, true binding-site residues are rare and the vast majority of residues are non-binding, which makes this a typical imbalanced classification problem. To address this imbalance, this paper proposes SSWPNN, an ensemble of a support vector machine (SVM) and a sample-weighted probabilistic neural network, built on downsampling and the combination of different classifiers. The whole set is randomly downsampled without replacement multiple times; an SVM is trained as the base classifier on each subset and used to calculate sample weights, and a sample-weighted probabilistic neural network is then constructed as the strong classifier for prediction. In the experiments, the method outperformed other methods both in overall prediction performance across the four types of residues and in prediction performance for each individual residue type. On an independent test set collected by the authors in recent years, it likewise achieved better prediction performance than other methods, both overall and for each residue type. In addition, the importance of the selected features is analyzed by removing certain features and recalculating the performance-index scores. The source code and datasets are available at http://net.jitsec.cn:88/UploadedImages/SSWPNN.rar.
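The pipeline outlined in the description (repeated downsampling without replacement, a base classifier per subset to derive sample weights, then a sample-weighted probabilistic neural network) can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation, and several parts are assumptions. The base classifier here is a nearest-centroid stand-in rather than an SVM. The weighting rule (the share of base classifiers that label a sample correctly) is hypothetical, since the description does not state how weights are calculated. The function names, `sigma`, and `n_rounds` are illustrative only.

```python
import math
import random

def downsample(neg_idx, pos_idx, rng):
    """Keep all positives; draw an equal number of negatives without replacement.
    Assumes negatives outnumber positives (the imbalanced case)."""
    return pos_idx + rng.sample(neg_idx, len(pos_idx))

def fit_centroid(X, y):
    """Stand-in base classifier (NOT an SVM): nearest class centroid."""
    cents = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        cents[label] = [sum(col) / len(rows) for col in zip(*rows)]
    def predict(x):
        return min(cents, key=lambda c: math.dist(x, cents[c]))
    return predict

def sample_weights(X, y, n_rounds=10, fit_base=fit_centroid, seed=0):
    """Assumed rule: weight = share of base classifiers that label the sample correctly."""
    rng = random.Random(seed)
    pos = [i for i, t in enumerate(y) if t == 1]
    neg = [i for i, t in enumerate(y) if t == 0]
    hits = [0] * len(X)
    for _ in range(n_rounds):
        idx = downsample(neg, pos, rng)
        predict = fit_base([X[i] for i in idx], [y[i] for i in idx])
        for i, x in enumerate(X):
            hits[i] += predict(x) == y[i]
    return [h / n_rounds for h in hits]

def pnn_predict(x, X, y, w, sigma=1.0):
    """Sample-weighted PNN: weighted Gaussian Parzen-window score per class,
    normalized by the class's total weight; returns the higher-scoring class."""
    scores = {}
    for label in (0, 1):
        num = den = 0.0
        for xi, ti, wi in zip(X, y, w):
            if ti == label:
                num += wi * math.exp(-math.dist(x, xi) ** 2 / (2 * sigma ** 2))
                den += wi
        scores[label] = num / den if den > 0 else 0.0
    return max(scores, key=scores.get)

# Toy data: six non-binding residues (label 0), two binding residues (label 1).
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [0.2, 0.8], [5, 5], [5, 6]]
y = [0, 0, 0, 0, 0, 0, 1, 1]
w = sample_weights(X, y, n_rounds=5)
label_near_positives = pnn_predict([5.2, 5.5], X, y, w)
label_near_negatives = pnn_predict([0.3, 0.3], X, y, w)
```

Normalizing each class score by its total weight is one way to keep the majority class from dominating purely by count. To get closer to the described setup, an SVM could be supplied through the `fit_base` parameter, which expects a callable that takes training features and labels and returns a per-sample predict function.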