Breadcrumb
- Home
- Publications
- Proceedings
- 2009 Annual Meeting
- Food, Pharmaceutical & Bioengineering Division
- Poster Session: Bioengineering
- (485ah) Secondary Structure Aided Protein Structure Alignment
Structural alignment can be done via several methods, including contact map overlap maximization (MAX-CMO) that aligns proteins to maximize the number of common residue contacts. Xie and Sahinidis [1] recently developed a reduction-based exact algorithm (CMOS), which is the state-of-the-art for the MAX-CMO problem. Computational experiments demonstrate that this algorithm runs significantly faster than prior exact algorithms for MAX-CMO and is also able to resolve some hard CMO instances not solved in the past. In addition, this algorithm produces protein clusters that are in excellent agreement with the SCOP classification. Although the CMOS algorithm is faster than all competing algorithms for MAX-CMO, it still is not fast enough to allow for an all-to-all structure comparison for proteins from the Protein Data Bank in a reasonable computing time.
The present work aims at improving the CMOS algorithm through the incorporation of novel bounding techniques. Proteins comprise of some well known structural motifs (secondary structures) which are the building blocks of protein folds. We represent the 3d structure of the protein as a sequence of these secondary structure motifs. The corresponding sequences for two proteins are then compared by well-established sequence alignment algorithms, such as the Needleman-Wunsch algorithm [2], providing an alignment and a lower bound for the problem. Our preliminary computations indicate that the dynamic programming-based bounds employed in CMOS can be significantly improved by the use of the proposed approach. For the Skolnick data set, the proposed approach leads to a 1.5 fold improvement in the root node bounds of the CMOS algorithm. We further demonstrate the utility of this approach through extensive computational studies.
[1] W. Xie and N.V. Sahinidis. A reduction-based exact algorithm for the contact map overlap problem. Journal of Computational Biology, 14:637-654, 2007.
[2] S. Needleman and C. Wunsch. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol., 48:443-453, 1970.