Please use this identifier to cite or link to this item: http://localhost:80/xmlui/handle/123456789/5319
Title: Prediction of Membrane Proteins Using Machine Learning Approaches
Authors: Hayat, Maqsood
Keywords: Computer science, information & general works
Issue Date: 2012
Publisher: Pakistan Institute of Engineering and Applied Sciences Islamabad, Pakistan
Abstract: Membrane proteins are basic constituents of a cell that manage its intracellular and extracellular processes. About 20-30% of the genes of eukaryotic organisms encode membrane proteins. In addition, almost 50% of drugs directly target membrane proteins. Owing to the significant role of membrane proteins in living organisms, identifying them with substantial accuracy is essential. However, the annotation of membrane proteins through conventional methods is difficult, sometimes even impossible. Therefore, membrane proteins are predicted from topogenic sequences using computational intelligence techniques. In this study, we conducted our research in two phases, covering the prediction of membrane protein types and of membrane protein structures. In Phase I, on the prediction of membrane protein types, four approaches are explored in order to increase the number of correct predictions. In the first part of Phase I, membrane protein types are predicted using a composite protein sequence representation, followed by principal component analysis in conjunction with individual classifiers. In the second part, the notion of ensemble classification is utilized. In part three, an error correction code is incorporated with a Support Vector Machine using evolutionary profiles (Position Specific Scoring Matrix) and SAAC-based features. Finally, in part four, a two-layer web predictor, Mem-PHybrid, is developed. Mem-PHybrid accomplishes the prediction in two steps. First, a query protein is identified as a membrane or a non-membrane protein. If it is a membrane protein, its type is then predicted. In the second phase of this research, the structure of a membrane protein is recognized as either an alpha-helix transmembrane protein or an outer membrane protein.
For alpha-helix transmembrane proteins, features are extracted from protein sequences by two feature extraction schemes of distinct nature: physicochemical properties and the compositional index of amino acids. Singular value decomposition is employed to extract high-variation features. A hybrid feature vector is formed by combining the different types of features. Weighted Random Forest is then used as the classification algorithm. For outer membrane proteins, on the other hand, protein sequences are represented by amino acid composition, PseAA composition, and SAAC, along with their hybrid models. Genetic programming, K-nearest neighbor, and fuzzy K-nearest neighbor are adopted as classification algorithms. Through the simulation study, we observed that our proposed approaches predict both types and structures better than existing state-of-the-art approaches. Finally, we conclude that our proposed approach for membrane proteins might play a significant role in Computational Biology, Molecular Biology, and Bioinformatics, and thus might help in applications related to drug discovery. In addition, the related web predictors provide sufficient information to researchers and academicians for future research.
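As a rough illustration of two building blocks named in the abstract, amino acid composition features and K-nearest neighbor classification, the following minimal Python sketch may help. It is not the thesis implementation: the function names, the choice of Euclidean distance, the value k=3, and the toy sequences (made up for illustration, not real proteins) are all assumptions.

```python
from collections import Counter
import math

# The 20 standard amino acids, in a fixed order for the feature vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard residue as a 20-dimensional list."""
    sequence = sequence.upper()
    # Non-standard symbols (e.g. X) are ignored in this sketch.
    counts = Counter(ch for ch in sequence if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def knn_predict(query, training, k=3):
    """Majority vote among the k training vectors nearest to the query."""
    nearest = sorted(training, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy sequences, invented for illustration only.
toy_training = [
    (amino_acid_composition("LLLVVAAILLGVIA"), "membrane"),
    (amino_acid_composition("AVLLIIGGLLVVAL"), "membrane"),
    (amino_acid_composition("KDEEKRSDNQEKDR"), "non-membrane"),
    (amino_acid_composition("EEKKDDRRSSNNQQ"), "non-membrane"),
]

prediction = knn_predict(amino_acid_composition("LLVVIIAAGGLLVV"), toy_training, k=3)
print(prediction)
```

A real predictor along the lines described in the abstract would presumably use curated benchmark datasets and richer representations (PseAA composition, SAAC, evolutionary profiles); the sketch only shows how a sequence becomes a fixed-length vector that a distance-based classifier can consume.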
URI: http://142.54.178.187:9060/xmlui/handle/123456789/5319
Appears in Collections: Thesis

Files in This Item:
File: 1640.htm
Size: 128 B
Format: HTML


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.