2006 AIChE Annual Meeting
(218b) De Novo Peptide Identification Via Mixed-Integer Linear Optimization and Tandem Mass Spectrometry
In this work, we present a novel algorithm for the de novo sequencing of peptides using tandem mass spectrometry and mixed-integer linear optimization (MILP) [6]. A two-stage framework accommodates missing peaks in the tandem mass spectrum: the first stage sequences candidate peptides using the weights of single amino acids, and the second stage allows the combined weights of two or three amino acids to be used in constructing the candidate sequences. A preprocessing algorithm identifies important ions in the tandem mass spectrum that can be exploited in the problem formulation, such as ion peaks corresponding to the N- or C-terminus of the peptide and offsets indicative of post-translational modifications. Residue assignment ambiguities are subsequently resolved using a modified SEQUEST algorithm [7], which exploits information in the tandem mass spectrum that was not used in the sequencing calculations. This post-processing step replaces the combined weights in the candidate sequences derived from the second stage with permutations of amino acids consistent with those weights. The theoretical tandem mass spectrum of each candidate sequence is then predicted and cross-correlated with the experimental spectrum, and the highest-scoring sequence is reported as the most probable peptide. The significant contributions of this work include the generation of rank-ordered lists of candidate sequences and the direct incorporation of complementary ions into the sequencing calculations via constraint equations. Several computational studies will be presented to demonstrate the predictive capabilities and instrument independence of the proposed approach.
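As a rough illustration of two ideas in the abstract, the sketch below enumerates the ordered amino acid strings consistent with a combined weight (the kind of expansion the post-processing step performs) and computes the m/z of a complementary ion. It is not the MILP formulation of [6]: the function names `expand_gap` and `complementary_mz`, the tolerance, and the restriction to singly charged b/y ions are all assumptions made for this example. The residue masses are standard monoisotopic values.

```python
from itertools import combinations_with_replacement, permutations

# Standard monoisotopic residue masses in Da. Isoleucine is omitted because
# it is isobaric with leucine and cannot be distinguished by mass alone.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.00728  # mass of a proton, Da


def expand_gap(gap, tol=0.01, max_len=3):
    """Hypothetical helper: list every ordered residue string of length
    1..max_len whose summed residue mass lies within tol of gap."""
    hits = set()
    for k in range(1, max_len + 1):
        for combo in combinations_with_replacement(sorted(RESIDUE_MASS), k):
            total = sum(RESIDUE_MASS[r] for r in combo)
            if abs(total - gap) <= tol:
                # Every ordering of a matching combination is a candidate.
                hits.update("".join(p) for p in permutations(combo))
    return sorted(hits)


def complementary_mz(mz, neutral_peptide_mass):
    """Hypothetical helper: m/z of the singly charged complement of a singly
    charged b or y ion, using b_i + y_(n-i) = M + 2 * proton."""
    return neutral_peptide_mass + 2 * PROTON - mz


if __name__ == "__main__":
    print(expand_gap(114.043))   # a gap that fits N or two glycines
    print(expand_gap(128.0586))  # a gap that fits Q or G plus A
```

Each string returned by `expand_gap` would then be substituted into the candidate sequence and scored against the experimental spectrum; the scoring itself is not sketched here.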
[1] V. Dancik, T.A. Addona, K.R. Clauser, J.E. Vath, and P.A. Pevzner. De novo peptide sequencing via tandem mass spectrometry. J. Comp. Biol., 6(3):327-342, 1999.
[2] T. Chen, M.Y. Kao, M. Tepel, J. Rush, and G.M. Church. A dynamic programming approach to de novo peptide sequencing via tandem mass spectrometry. J. Comp. Biol., 8(3):325-337, 2001.
[3] B. Lu and T. Chen. A suboptimal algorithm for de novo peptide sequencing via tandem mass spectrometry. J. Comp. Biol., 10(1):1-12, 2003.
[4] B. Ma, K.Z. Zhang, C. Hendrie, C. Liang, M. Li, A. Doherty-Kirby, and G. Lajoie. PEAKS: powerful software for peptide de novo sequencing by tandem mass spectrometry. Rapid Commun. Mass Spectrom., 17:2337-2342, 2003.
[5] A. Frank and P. Pevzner. PepNovo: De novo peptide sequencing via probabilistic network modeling. Anal. Chem., 77(4):964-973, 2005.
[6] P.A. DiMaggio and C.A. Floudas. De novo peptide identification via tandem mass spectrometry and mixed-integer optimization. submitted for publication, 2006.
[7] J.K. Eng, A.L. McCormack, and J.R. Yates. An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. J. Am. Soc. Mass Spectrom., 5:976-989, 1994.