(147b) Protein Structure and Fold Recognition Using Amino Acid Interaction Models
2006 AIChE Annual Meeting, Topical A: Systems Biology, Proteomic Systems Biology
The basic systems idea adopted in this study is to build a separate model for each class of protein structure, representing the nonlinear dependencies and interactions between amino acids within that class. Each model is designed to predict the composition of a particular amino acid in a given class from the compositions of all, or a few, of the other amino acids. The model parameters are estimated from the training data for each class. To classify a protein sample, its observed amino acid compositions are compared with those predicted by the trained models of the different classes. The sample is assigned to the structural class whose model gives the least prediction error. The decision is tested using prediction capabilities defined in terms of statistical error.
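The classification rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the model form, so the sketch assumes plain linear least-squares models (the study refers to nonlinear dependencies), a small ridge term for numerical stability, and synthetic three-feature data. The names `fit_class_models`, `prediction_error` and `classify` are hypothetical.

```python
import numpy as np

def fit_class_models(X, ridge=1e-6):
    """Fit one linear model per feature for a single class.

    Model j predicts column j of X from the remaining columns plus an
    intercept. Linear models are an assumption made for this sketch.
    """
    n, d = X.shape
    models = []
    for j in range(d):
        others = np.delete(X, j, axis=1)
        A = np.hstack([others, np.ones((n, 1))])  # d columns in total
        # Ridge-regularised normal equations.
        coef = np.linalg.solve(A.T @ A + ridge * np.eye(d), A.T @ X[:, j])
        models.append(coef)
    return models

def prediction_error(models, x):
    """Sum of squared errors when each feature of x is predicted from the others."""
    err = 0.0
    for j, coef in enumerate(models):
        a = np.append(np.delete(x, j), 1.0)
        err += (x[j] - a @ coef) ** 2
    return err

def classify(class_models, x):
    """Assign x to the class whose models give the least prediction error."""
    errors = {label: prediction_error(m, x) for label, m in class_models.items()}
    return min(errors, key=errors.get)

# Synthetic stand-in data: each class has its own dependency between features.
rng = np.random.default_rng(0)
base = rng.uniform(0.0, 1.0, size=(200, 2))
noise = rng.normal(0.0, 0.01, size=200)
X_a = np.column_stack([base, 2 * base[:, 0] + base[:, 1] + noise])
X_b = np.column_stack([base, -base[:, 0] + 3 * base[:, 1] + noise])

class_models = {"A": fit_class_models(X_a), "B": fit_class_models(X_b)}
print(classify(class_models, np.array([0.5, 0.2, 1.2])))
```

The toy features are deliberately unconstrained. Real amino acid compositions sum to one, so a linear model that uses all of the other compositions would reproduce the remaining one exactly for every class; predicting from a subset of compositions, or using a nonlinear model, avoids that degeneracy.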
The new discrimination algorithm based on class structure models is validated on illustrative, well-studied protein structure classification problems. Established data sets from the literature serve as benchmarks for basic four-class structure prediction and for multi-class fold recognition in large sets of proteins. Different procedures for testing classification performance are adopted to compare the results of the proposed method with those of classical classification algorithms such as LDA, QDA, SVM and ANN. The proposed model-based discrimination method is observed to perform well in all cases. The results are superior to those of existing methods, especially in re-substitution and cross-validation tests. An improvement of 10-20% in overall prediction rate is achieved over the best known existing algorithms. The new approach has the potential to resolve similar system characterization problems, and extensions to other areas of proteomics are being investigated.
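The two testing procedures named above can be sketched as follows. The abstract does not say which cross-validation variant was used, so leave-one-out is assumed here. A nearest-centroid classifier stands in for the model-based method only to keep the example self-contained, and all function names and the toy data are hypothetical.

```python
import numpy as np

def fit_centroids(X, y):
    """Stand-in classifier: store the mean feature vector of each class."""
    return {int(label): X[y == label].mean(axis=0) for label in np.unique(y)}

def predict_centroid(model, x):
    """Return the label of the nearest class centroid."""
    return min(model, key=lambda label: np.linalg.norm(x - model[label]))

def resubstitution_accuracy(X, y, fit, predict):
    """Train on all samples, then test on the same samples."""
    model = fit(X, y)
    hits = sum(int(predict(model, x) == label) for x, label in zip(X, y))
    return hits / len(y)

def leave_one_out_accuracy(X, y, fit, predict):
    """Hold out each sample in turn, train on the rest, test on the held-out one."""
    hits = 0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        model = fit(X[keep], y[keep])
        hits += int(predict(model, X[i]) == y[i])
    return hits / len(y)

# Toy data: two well-separated classes.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
y = np.array([0, 0, 0, 1, 1, 1])

resub = resubstitution_accuracy(X, y, fit_centroids, predict_centroid)
loo = leave_one_out_accuracy(X, y, fit_centroids, predict_centroid)
print(resub, loo)
```

Re-substitution reuses the training samples for testing and therefore tends to be optimistic, while leave-one-out never tests a sample against a model that has seen it. Reporting both, as the study does, shows how much of the apparent accuracy survives on held-out data.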
Keywords: protein structure classification, multivariate statistics, discriminant analysis, biological systems modeling.