De novo peptide sequencing methods for tandem mass spectra
De novo peptide sequencing from tandem mass (MS/MS) spectra has become centrally important in proteomics. It provides essential information for studies of protein structure and function. As MS/MS spectra of various types have become available, many computational methods have been developed to infer peptide sequences from them. However, current de novo peptide sequencing methods still have limitations. The major ones include a lack of suitable models for MS/MS spectra, the limited information extracted from each spectrum, and the inefficient use of multiple spectra. This thesis addresses some of these limitations with a series of novel computational methods designed for various types of MS/MS spectra and their combinations. The main content of the thesis begins with a comprehensive review of recent developments in de novo peptide sequencing, continues with two novel methods for single-spectrum sequencing, and then presents two methods for sequencing paired spectra. The first chapter introduces the relevant background, the objectives of the study, and the structure of the thesis. The review that follows summarizes recent developments in computational methods for various experimental spectra, compares their advantages and disadvantages, and points out some future research directions. Building on these directions, the thesis next presents two novel methods, one designed for higher-energy collisional dissociation (HCD) spectra and one for electron capture dissociation (ECD) or electron transfer dissociation (ETD) spectra. These methods apply new spectrum graph models with multiple types of edges, integrate amino acid combination (AAC) information and peptide tags, and take spectrum-specific information into account to suit the different spectra. After that, the problem of sequencing multiple spectra is studied.
A framework for de novo peptide sequencing of multiple spectra is given, with applications to two different kinds of spectra pairs. In one kind, the two spectra are complementary to each other; in the other, the two spectra are similar but differ in certain properties. These methods include effective spectra merging criteria and parent mass correction steps, and they modify the previously proposed graph models to fit the merged spectra. Experiments on several experimental MS/MS spectra datasets and dataset pairs show the advantages of the proposed methods in terms of peptide sequencing accuracy. Finally, conclusions and directions for future work are given at the end of the thesis. In summary, the thesis proposes a series of novel computational methods for de novo peptide sequencing that target different types of MS/MS spectra and their combinations. Experimental results show that the proposed methods either outperform existing competing methods or fill gaps in the suite of currently available methods.
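As an illustration of the spectrum graph idea mentioned in the abstract, the following is a minimal sketch of the classic single-edge-type spectrum graph, not the multi-edge models proposed in the thesis. It assumes idealized, noise-free prefix masses as nodes, a reduced table of standard monoisotopic residue masses, and a fixed mass tolerance; the function names (`build_spectrum_graph`, `enumerate_sequences`, `prefix_masses_of`) are hypothetical and chosen only for this example.

```python
# Standard monoisotopic residue masses (Da) for a small subset of amino acids.
# A real tool would use the full table; this subset is an assumption for brevity.
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "V": 99.06841,
    "L": 113.08406,
}


def build_spectrum_graph(prefix_masses, tol=0.01):
    """Connect node i to node j when their mass gap matches one residue."""
    nodes = sorted(prefix_masses)
    edges = {i: [] for i in range(len(nodes))}
    for i, low in enumerate(nodes):
        for j in range(i + 1, len(nodes)):
            diff = nodes[j] - low
            for aa, mass in RESIDUE_MASS.items():
                if abs(diff - mass) <= tol:
                    edges[i].append((j, aa))
    return nodes, edges


def enumerate_sequences(nodes, edges):
    """List every residue string spelled by a path from the first to the last node."""
    last = len(nodes) - 1
    results = []

    def walk(i, prefix):
        if i == last:
            results.append(prefix)
            return
        for j, aa in edges[i]:
            walk(j, prefix + aa)

    walk(0, "")
    return results


def prefix_masses_of(peptide):
    """Idealized node masses: cumulative residue masses, starting at 0."""
    masses = [0.0]
    for aa in peptide:
        masses.append(masses[-1] + RESIDUE_MASS[aa])
    return masses


# Toy input (hypothetical): nodes derived from a known peptide, so the
# graph should contain exactly one full path that spells it back.
toy_nodes = prefix_masses_of("GAVL")
nodes, edges = build_spectrum_graph(toy_nodes)
candidates = enumerate_sequences(nodes, edges)
print(candidates)
```

Real spectra contain missing and spurious peaks, so a practical method has to score paths instead of enumerating them; the multiple edge types, AAC information, and peptide tags described in the abstract are presumably aimed at those cases.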
Degree: Doctor of Philosophy (Ph.D.)
Supervisor: Wu, Fang-Xiang; Kusalik, Anthony J.
Committee: Keil, Mark; McQuillan, Ian; Selvaraj, Gopalan
Copyright Date: August 2015