Novel Automated Platform for Proteoform Driven Top-Down Mass Spectrometry Proteomics

Corbett, John Rawson

dc.contributor.advisor	Patrie, Steven
dc.contributor.advisor	Yoo, Hyuntae
dc.creator	Corbett, John Rawson
dc.date.accessioned	2020-02-05T14:22:54Z
dc.date.available	2020-02-05T14:22:54Z
dc.date.created	2017-12
dc.date.issued	2017-12
dc.date.submitted	December 2017
dc.date.updated	2020-02-05T14:22:54Z
dc.description.abstract	Top-Down proteomics studies protein complexity at the intact proteoform level in order to characterize chemical modifications, such as co- and post-translational modifications and non-enzymatic protein processing (e.g., redox-active modifications, glycation). This approach captures the information content associated with the diversity of chemical and biological processes that occur in vivo, such as glycosylation, lipidation, and proteolysis, giving a more representative view of biological complexity. To obtain this information, a traditional Top-Down approach uses liquid chromatography separations in conjunction with mass spectrometry and database querying techniques to identify proteoforms. For example, this approach was used in a study highlighting differentially expressed levels of phospho-proteoforms within cardiac myofilaments and their association with different degrees of congestive heart failure. Although these strategies have been well characterized, such an approach is not applicable to large-scale proteome analysis because of the high heterogeneity of expressed proteoforms. For this type of analysis, multiple dimensions of orthogonal chromatographic separations are used to reduce proteoform complexity, with prior attempts identifying over 3,000 unique proteoforms from the HeLa S3 cell line. These Top-Down platforms have also been used for proteome-scale label-free quantitative studies; however, such approaches have often struggled because of limited quantitative dynamic range. Additionally, chromatographic separation strategies have been protein driven, reducing proteoform observation to only the most abundant species and, in some cases, causing a complete loss of proteoform information (i.e., related glycoproteoforms) due to limitations associated with charging/ionization efficiency, ion transfer, and mass spectrometer resolving power.
To address these obstacles, a novel platform that uses isoelectric point separation has been implemented in order to complete chromatographic separations at the proteoform level. Using high-resolution in-solution isoelectric focusing with superficially porous liquid chromatography and Fourier-transform mass spectrometry, a ~5x increase in observed proteoforms from cardiac myofibril tissue (1D: 112 vs. 2D: 582 proteoforms) was obtained, with species ranging from 3 to 230 kDa in size. In addition, novel data processing strategies capable of distinguishing related proteoform information separated into different mass spectra have been implemented, with the objective of establishing the three quantitative levels of Top-Down proteomics (proteoform, protein, and proteoform ratios). Standard proteins with different physicochemical properties and modification classes were studied to create calibration curves under non-spiked and spiked conditions (i.e., E. coli matrix effect), establishing a linear dynamic range of 10^2 – 10^3 and low-femtomole limits of detection. Additionally, results indicate that proteoform ratio information, outside of matrix effects, is independent of protein loading. To aid in automating the data processing strategies associated with mass spectral deconvolution and data binning, triplicate E. coli proteome analyses were completed with a sliding window approach, showing reproducible spectral intensity values (~15.1% relative standard deviation) and chromatographic precision tolerances of ± 0.2 pI units and ± 12 seconds for weighted pI and hydrophobicity calculations, respectively. Using this platform, Lipocalin-type Prostaglandin D-Synthase, a highly glycosylated cerebrospinal fluid (CSF) protein, was fully characterized with 200+ proteoforms identified, a 65x improvement compared to other non-pI based Top-Down platforms that are chromatographically protein driven.
In the future, completing CSF proteome profiling investigations will help interpret changes in proteoform modifications and expression levels and their correlation with the distinct pathobiology of different neurodegenerative and neuroinflammatory diseases.
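The summary statistics quoted in the abstract (fold improvement, percent relative standard deviation, and weighted pI within a tolerance) can be illustrated with a minimal Python sketch. The abstract does not give the dissertation's actual formulas, so everything here is an assumption for illustration: the function names are hypothetical, "weighted pI" is taken to be an intensity-weighted mean of fraction pI values, and the replicate numbers in the usage note are made up rather than taken from the thesis.

```python
from statistics import mean, stdev


def fold_change(baseline, improved):
    """Ratio of an improved count to a baseline count."""
    return improved / baseline


def percent_rsd(values):
    """Percent relative standard deviation using the sample standard deviation."""
    return 100.0 * stdev(values) / mean(values)


def weighted_mean(values, weights):
    """Weighted mean, e.g. fraction pI values weighted by spectral intensity.

    Assumption: the abstract's "weighted pI" is computed this way.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(v * w for v, w in zip(values, weights)) / total


def within_tolerance(a, b, tol):
    """True when two measurements agree within +/- tol."""
    return abs(a - b) <= tol
```

For example, `fold_change(112, 582)` gives roughly 5.2, consistent with the ~5x figure reported for the 1D versus 2D proteoform counts. With hypothetical replicate intensities, `percent_rsd([90, 100, 110])` evaluates to 10%, and `within_tolerance(5.0, 5.15, 0.2)` checks two weighted pI values against the ± 0.2 pI unit tolerance.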
dc.format.mimetype	application/pdf
dc.identifier.uri	https://hdl.handle.net/10735.1/7235
dc.language.iso	en
dc.rights	©2017 John Rawson Corbett. All Rights Reserved.
dc.subject	Proteomics
dc.subject	Mass spectrometry
dc.subject	Chromatographic analysis
dc.subject	Electronic data processing
dc.subject	Automation
dc.title	Novel Automated Platform for Proteoform Driven Top-Down Mass Spectrometry Proteomics
dc.type	Dissertation
dc.type.material	text
thesis.degree.department	Biomedical Engineering
thesis.degree.grantor	The University of Texas at Dallas
thesis.degree.level	Doctoral
thesis.degree.name	PHD

Files

Original bundle

Name: ETD-5608-006-CORBETT-260839.35.pdf
Size: 6.08 MB
Format: Adobe Portable Document Format
Description: Dissertation

License bundle

Name: LICENSE.txt
Size: 1.84 KB
Format: Plain Text
Description:

Name: PROQUEST_LICENSE.txt
Size: 5.84 KB
Format: Plain Text
Description:

Collections

UTD Theses and Dissertations