Figure - available from: Nature Methods
O-glycopeptide identification and SSGL of pGlyco3
a–c, Software comparisons of O-glycopeptide searches with IHMO HEK-293 cell line data. a, The overlaps of other tools with pGlyco3 on O-GPSMs. For glycans, only the total glycan compositions were compared; SSGL was not considered. ETD scans in the results of Byonic were mapped to their corresponding HCD scans for the comparisons. If an HCD spectrum and its sister ETD spectrum were identified as the same glycopeptide, we kept only one GPSM; otherwise, we kept both. b, The identified IHMO O-GPSMs were validated by Hex-containing results and further validated by Hex-diagnostic ions. c, The runtime comparison. d, Validation of the SSGL-FDR of pGlycoSite using the entrapment-based SSGL-FDR and OpeRATOR-based SSGL-FDR (Methods). e, Localized site-specific O-glycans of ITIH4, KNG1 and F12 proteins in human serum samples. Site-groups were discarded and SSGL assignments with maximal probability ≥0.75 are displayed. f, An annotated spectrum of the localized O-glycan and its SSGL probability for KNG1-S403. The HCD spectrum is annotated in Supplementary Fig. 11. The ScoreTable with BestPath is shown in Supplementary Data.
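The sister-scan rule in panel a (keep one GPSM when an HCD spectrum and its sister ETD spectrum give the same glycopeptide, keep both otherwise) can be illustrated with a minimal Python sketch. The `GPSM` record and its field names are hypothetical and are not taken from pGlyco3 or Byonic output; the sketch assumes ETD scans have already been mapped to their HCD scan numbers.

```python
from typing import List, NamedTuple


class GPSM(NamedTuple):
    """Hypothetical record; field names are illustrative only."""
    hcd_scan: int   # HCD scan number (ETD scans already mapped to their sister HCD scan)
    peptide: str    # peptide sequence
    glycan: str     # total glycan composition; site localization is ignored


def merge_sister_scans(gpsms: List[GPSM]) -> List[GPSM]:
    """Drop repeated (scan, peptide, glycan) records, preserving input order.

    Identical identifications from an HCD/ETD pair collapse to one GPSM;
    differing identifications for the same scan are all kept.
    """
    seen = set()
    kept = []
    for gpsm in gpsms:
        if gpsm not in seen:
            seen.add(gpsm)
            kept.append(gpsm)
    return kept
```

Comparing on the total glycan composition only mirrors the caption's statement that SSGL was not considered in the overlap analysis.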


Source publication
Article
Full-text available
Great advances have been made in mass spectrometric data interpretation for intact glycopeptide analysis. However, accurate identification of intact glycopeptides and modified saccharide units at the site-specific level and with fast speed remains challenging. Here, we present a glycan-first glycopeptide search engine, pGlyco3, to comprehensively a...

Similar publications

Article
Full-text available
The plasma proteome can help bridge the gap between the genome and diseases. Here we describe genome-wide association studies (GWASs) of plasma protein levels measured with 4,907 aptamers in 35,559 Icelanders. We found 18,084 associations between sequence variants and levels of proteins in plasma (protein quantitative trait loci; pQTL), of which 19...

Citations

... Implemented in the MetaMorpheus search engine [21], O-Pair search was shown to be orders of magnitude faster than conventional searches while also being capable of identifying additional glycopeptides due to its divide-and-conquer approach to reducing the total search space [20,22]. Another glycoproteomics software package, pGlyco3 [23], implemented a similar O-glycosite localization algorithm, with some clever adjustments to account for the glycan-first search of pGlyco vs the peptide-first search of MetaMorpheus (and MSFragger). ...
... Expanding this comparison to include Byonic [40] and pGlyco3 [23] shows that the peptide-first search method, used by both MSFragger and MetaMorpheus, outperformed other methods in this dataset. The Byonic search results are taken from Lu et al. [20] and show the limitations of conventional searches for O-glycoproteomics. ...
... Second, we evaluated FragPipe using the glycan entrapment experiment introduced by pGlyco3. In this experiment, various human cell lines were treated with an O-glycan elongation inhibitor to prevent the addition of galactose to the initial O-GalNAc [23]. Thus, glycans containing hexose are expected to be present only at low levels, in contrast to typical O-glycans that frequently contain hexose(s), such as the core 1 glycan GalNAc-Gal. ...
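The idea behind this entrapment experiment can be sketched in Python: because hexose-containing glycans should be rare in inhibitor-treated cells, the share of identified glycans that contain Hex gives a rough indicator of glycan-level errors. This is a simplified illustration, not the procedure used by pGlyco3 or FragPipe. The composition string format (for example `HexNAc(1)Hex(1)`) and both function names are assumptions made for the example.

```python
import re
from typing import Iterable

# Matches a standalone "Hex(n)" unit; the lookbehind skips "dHex(n)",
# and "HexNAc(n)" never matches because "Hex" is not followed by "(".
_HEX_UNIT = re.compile(r"(?<![A-Za-z])Hex\((\d+)\)")


def hex_count(composition: str) -> int:
    """Number of Hex units in a composition string (assumed format)."""
    match = _HEX_UNIT.search(composition)
    return int(match.group(1)) if match else 0


def hex_fraction(compositions: Iterable[str]) -> float:
    """Fraction of identifications whose glycan contains at least one Hex."""
    items = list(compositions)
    if not items:
        return 0.0
    return sum(1 for c in items if hex_count(c) > 0) / len(items)
```

A low fraction is consistent with few entrapment hits; because some hexose-containing glycans are still expected at low levels, the value is only an approximate upper bound on the error rate.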
Article
Full-text available
Identification of O-glycopeptides from tandem mass spectrometry data is complicated by the near complete dissociation of O-glycans from the peptide during collisional activation and by the combinatorial explosion of possible glycoforms when glycans are retained intact in electron-based activation. The recent O-Pair search method provides an elegant solution to these problems, using a collisional activation scan to identify the peptide sequence and total glycan mass, and a follow-up electron-based activation scan to localize the glycosite(s) using a graph-based algorithm in a reduced search space. Our previous O-glycoproteomics methods with MSFragger-Glyco allowed for extremely fast and sensitive identification of O-glycopeptides from collisional activation data but had limited support for site localization of glycans and quantification of glycopeptides. Here, we report an improved pipeline for O-glycoproteomics analysis that provides proteome-wide, site-specific, quantitative results by incorporating the O-Pair method as a module within FragPipe. In addition to improved search speed and sensitivity, we add flexible options for oxonium ion-based filtering of glycans and support for a variety of MS acquisition methods and provide a comparison between all software tools currently capable of O-glycosite localization in proteome-wide searches.
... Glycopeptides were enriched from the peptide mixture using either size exclusion chromatography or mixed-mode anion exchange cartridge (MAX), and analyzed by mass spectrometry (MS) in data-dependent acquisition mode on an Orbitrap Eclipse mass spectrometer (Thermo Fisher Scientific) [11,16,17]. Data were searched in pGlyco3 [18]. Commercial bovine (Thermo Scientific) and rabbit (Sigma) serum albumin were digested followed by glycopeptide enrichment using MAX. ...
... First, we performed deep discovery analysis using donor serum samples to identify intact glycopeptides with sites Asn⁶⁸ and Asn¹²³. We then confirmed our findings using streamlined enrichment methods, targeted LC-MS/MS analysis of ¹⁸O-labeled deglycosylated peptides as well as MS3 analysis of intact glycopeptides. These findings were validated in serum samples from twenty additional donors by targeted glycopeptide detection. ...
... Eight fractions from SEC were analyzed using LC-MS/MS-based discovery pipeline [11] (Fig. 1A). The resulting data were searched using pGlyco3 for glycopeptide identification [18]. The search was performed against the UniProt human proteome database and the in-built human N-glycan database [13]. ...
Article
Full-text available
Background Glycosylation is an enzyme-catalyzed post-translational modification that is distinct from glycation and is present on a majority of plasma proteins. N-glycosylation occurs on asparagine residues predominantly within canonical N-glycosylation motifs (Asn-X-Ser/Thr) although non-canonical N-glycosylation motifs Asn-X-Cys/Val have also been reported. Albumin is the most abundant protein in plasma whose glycation is well-studied in diabetes mellitus. However, albumin has long been considered a non-glycosylated protein due to absence of canonical motifs. Albumin contains two non-canonical N-glycosylation motifs, of which one was recently reported to be glycosylated. Methods We enriched abundant serum proteins to investigate their N-linked glycosylation followed by trypsin digestion and glycopeptide enrichment by size-exclusion or mixed-mode anion-exchange chromatography. Glycosylation at canonical as well as non-canonical sites was evaluated by liquid chromatography–tandem mass spectrometry (LC–MS/MS) of enriched glycopeptides. Deglycosylation analysis was performed to confirm N-linked glycosylation at non-canonical sites. Albumin-derived glycopeptides were fragmented by MS3 to confirm attached glycans. Parallel reaction monitoring was carried out on twenty additional samples to validate these findings. Bovine and rabbit albumin-derived glycopeptides were similarly analyzed by LC–MS/MS. Results Human albumin is N-glycosylated at two non-canonical sites, Asn⁶⁸ and Asn¹²³. N-glycopeptides were detected at both sites bearing four complex sialylated glycans and validated by MS3-based fragmentation and deglycosylation studies. Targeted mass spectrometry confirmed glycosylation in twenty additional donor samples. Finally, the highly conserved Asn¹²³ in bovine and rabbit serum albumin was also found to be glycosylated. Conclusions Albumin is a glycoprotein with conserved N-linked glycosylation sites that could have potential clinical applications.
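The canonical (Asn-X-Ser/Thr) and non-canonical (Asn-X-Cys/Val) motifs named in this abstract can be located in a protein sequence with a short Python sketch. The function name is hypothetical, and excluding proline at position X is a common convention assumed here rather than something stated in the abstract.

```python
import re
from typing import List, Tuple

# Lookaheads let overlapping motifs (e.g. "NNVT") be reported separately.
# Assumption: X may be any residue except proline.
CANONICAL = re.compile(r"N(?=[^P][ST])")
NON_CANONICAL = re.compile(r"N(?=[^P][CV])")


def find_sequons(sequence: str) -> List[Tuple[int, str]]:
    """Return (1-based Asn position, motif class) pairs sorted by position."""
    sites = []
    for kind, pattern in (("canonical", CANONICAL), ("non-canonical", NON_CANONICAL)):
        for match in pattern.finditer(sequence):
            sites.append((match.start() + 1, kind))
    return sorted(sites)
```

A motif match only marks a candidate site; whether the site is actually glycosylated has to be shown experimentally, as the study does with deglycosylation and MS3 analyses.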
... Mass spectrometry, lectin binding assays, and ELISAs are among the technologies used to identify and characterize glycans and to quantify glycosylation levels at different sites on proteins (Bagdonaite et al., 2022; Goumenou et al., 2021; Pan et al., 2011; Wu et al., 2014). Despite significant progress in the identification of intact glycopeptides using tandem mass spectrometry (Fang, Zheng et al., 2022; Polasky et al., 2020; Shen et al., 2021; Toghi Eshghi et al., 2015; Zeng et al., 2021) and glycan databases (Alocci et al., 2019; Fujita et al., 2021; Minoru Kanehisa, 2017), the high-throughput analysis of glycoproteomic data still faces considerable obstacles. A prominent challenge is the lack of specialized resources tailored to the glycoproteomic evaluation of intact glycopeptides (IGPs). ...
Preprint
Protein glycosylation plays a pivotal role in various biological processes, and the analysis of intact glycopeptides (IGPs) has emerged as a powerful approach for characterizing alterations in protein glycosylation associated with diseases. Despite the critical insights gained from IGP analysis, there is an evident scarcity of intact glycopeptide databases and specialized tools for a comprehensive glycoproteomic examination. In response to this deficiency, we have developed a Python package, "GPnotebook," which consolidates the intact glycopeptides identified from different cancer types by the Clinical Proteomic Tumor Analysis Consortium (CPTAC) and includes analytical tools for an in-depth characterization of glycopeptides. GPnotebook facilitates an array of functions including statistical profiling, differential expression analysis, glycosylation subtype categorization, investigation of glycosylation-phosphorylation interplay, survival analysis, and glycosylation enzyme assessment. We have deployed GPnotebook in a study of Pancreatic Ductal Adenocarcinoma (PDAC), thereby validating its application and demonstrating its capabilities. Our findings suggest that IGPs hold significant promise for detecting cancer-specific changes and differentiating subtypes. Consequently, GPnotebook stands out as a valuable resource for cancer researchers delving into the nuances of protein glycosylation and its correlation with cancer phenotypes.
... Numerous advanced software tools have been developed for large-scale intact N-glycopeptide identification, such as StrucGP (Shen et al. 2021), MSFragger-Glyco (Polasky et al. 2020), Glyco-Decipher (Fang et al. 2022), GPSeeker (Xiao and Tian 2019), Byonic (Bern et al. 2012), and pGlyco (Liu et al. 2017; Zeng et al. 2016, 2021). The primary motivation for these developments is to enhance the accuracy and search efficiency of intact glycopeptide identification, while the quantification of intact N/O-glycopeptides remains a considerable challenge for many of these software tools (Abrahams et al. 2020; Polasky and Nesvizhskii 2023). ...
... Recently, pGlyco has evolved to address this need, offering quantitative analysis capabilities (Zeng et al. 2021). Our recent study evaluating these tools in glycosylation analysis revealed that Byonic performs exceptionally well in identifying intact N-glycopeptides (Kawahara et al. 2021). ...
Article
The site-specific N-glycosylation changes of human plasma immunoglobulin gamma molecules (IgGs) have been shown to modulate the immune response and could serve as potential biomarkers for the accurate diagnosis of various diseases. However, quantifying intact N-glycopeptides accurately in large-scale clinical samples remains a challenge, and the quantitative N-glycosylation of plasma IgGs in patients with chronic kidney diseases (CKDs) has not yet been studied. In this study, we present a novel integrated intact N-glycopeptide quantitative pipeline (termed GlycoQuant), which combines our recently developed mass spectrometry fragmentation method (EThcD-sceHCD) and an intact N-glycopeptide batch quantification software tool (the upgraded PANDA v.1.2.5). We purified and digested human plasma IgGs from 58 healthy controls (HCs), 48 patients with membranous nephropathy (MN), and 35 patients with IgA nephropathy (IgAN) within an hour. Then, we analyzed the digested peptides without enrichment using EThcD-sceHCD-MS/MS, which provided higher spectral quality and greater identified depth. Using upgraded PANDA, we performed site-specific N-glycosylation quantification of IgGs. Several quantified intact N-glycopeptides not only distinguished CKDs from HCs, but also different types of CKD (MN and IgAN) and may serve as accurate diagnostic tools for renal tubular function. In addition, we proved the applicability of this pipeline to complex samples by reanalyzing the intact N-glycopeptides from cell, urine, plasma, and tissue samples that we had previously identified. We believe that this pipeline can be applied to large-scale clinical N-glycoproteomic studies, facilitating the discovery of novel glycosylated biomarkers. Yang Zhao and Yong Zhang have contributed equally to this work and share first authorship.
... To our knowledge, this finding has not been previously described for any haptoglobin-derived glycopeptide with multiple glycosylation sites in PMM2-CDG. We were able to do this because of the peptide- and glycosylation site-specific analysis that is now possible with recent advances in MS and database searching capabilities (22,23,37). These findings indicate substantial hypoglycosylation of these sites in haptoglobin in PMM2-CDG. ...
... Recent advances in sample preparation, MS, and software development overcome these limitations (45–47), allowing analysis of thousands of glycopeptides to identify putative glycan structure based on composition and known biosynthetic pathways, as well as the protein-specific glycosylation sites at which the glycan is attached (37,48,49). Determining definitive glycan structures requires further analysis (50,51). ...
... Database searching and data analysis. Database searching was performed using publicly available software pGlyco Version 2.2.0; pGlyco version 3.0 was used to search for glycosylation at multiple glycosylation sites (37,59). Glycan databases already available with the software were used, and Uniprot Human Reviewed protein sequences (20,432 entries, downloaded February 1, 2021) were used as protein sequence FASTA file. ...
Article
Full-text available
BACKGROUND Diagnosis of PMM2-CDG, the most common congenital disorder of glycosylation (CDG), relies on measuring carbohydrate-deficient transferrin (CDT) and genetic testing. CDT tests have false negatives and may normalize with age. Site-specific changes in protein N-glycosylation have not been reported in sera in PMM2-CDG. METHODS Using multistep mass spectrometry-based N-glycoproteomics, we analyzed sera from 72 individuals to discover and validate glycopeptide alterations. We performed comprehensive tandem mass tag-based discovery experiments in well-characterized patients and controls. Next, we developed a method for rapid profiling of additional samples. Finally, targeted mass spectrometry was used for validation in an independent set of samples in a blinded fashion. RESULTS Of the 3,342 N-glycopeptides identified, patients exhibited decrease in complex-type N-glycans and increase in truncated, mannose-rich, and hybrid species. We identified a glycopeptide from complement C4 carrying the glycan Man5GlcNAc2, which was not detected in controls, in 5 patients with normal CDT results, including 1 after liver transplant and 2 with a known genetic variant associated with mild disease, indicating greater sensitivity than CDT. It was detected by targeted analysis in 2 individuals with variants of uncertain significance in PMM2. CONCLUSION Complement C4-derived Man5GlcNAc2 glycopeptide could be a biomarker for accurate diagnosis and therapeutic monitoring of patients with PMM2-CDG and other CDGs. FUNDING U54NS115198 (Frontiers in Congenital Disorders of Glycosylation: NINDS; NCATS; Eunice Kennedy Shriver NICHD; Rare Disorders Consortium Disease Network); K08NS118119 (NINDS); Minnesota Partnership for Biotechnology and Medical Genomics; Rocket Fund; R01DK099551 (NIDDK); Mayo Clinic DERIVE Office; Mayo Clinic Center for Biomedical Discovery; IA/CRC/20/1/600002 (Center for Rare Disease Diagnosis, Research and Training; DBT/Wellcome Trust India Alliance).
... At the heart of proteomics data analysis is peptide identification by matching fragment spectra to theoretical or experimental spectra for candidate peptides [4]. Most commonly used proteomics [5–8] or glycoproteomics [9–16] search engines are based on database searching, where peptide spectrum matches (PSMs) or glycopeptide spectrum matches (GPSMs) are scored on the presence of fragment ions theoretically generated from peptide sequence and glycan but largely disregard fragment ion intensities. As a complementary approach, spectral library searching correlates the intensity pattern of fragment ions of the analyte to library spectra typically constructed from previous identification data [17], which has been reported to yield more discriminative match scores than database searching for data-dependent acquisition (DDA) analysis [18–21]. ...
... For each GPSM, candidate glycopeptides were generated by replacing the original glycan with its structural isomers. In this study, the built-in glycan databases of pGlyco [14] were used as the glycan space (2922 glycan structures for human and 7878 for mouse), appended with glycan structures uniquely present in the original identification results by StrucGP [13]. The fragment ions were extracted from the experimental (query) spectrum by matching to the m/z of theoretical fragments of the candidate glycopeptides. ...
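Matching experimental peaks to theoretical fragment m/z values, as described in the last sentence of this excerpt, is commonly done within a relative mass tolerance. The Python sketch below illustrates that general technique only; the function name, the 20 ppm default, and the nearest-peak tie-breaking are assumptions for the example and are not taken from the cited study.

```python
from bisect import bisect_left
from typing import Dict, Sequence


def match_fragments(
    observed_mz: Sequence[float],
    theoretical_mz: Sequence[float],
    tol_ppm: float = 20.0,  # assumed tolerance, not from the source
) -> Dict[float, float]:
    """Map each theoretical m/z to the closest observed peak within tol_ppm."""
    observed = sorted(observed_mz)
    matches: Dict[float, float] = {}
    for theo in theoretical_mz:
        tol = theo * tol_ppm * 1e-6          # absolute window half-width
        i = bisect_left(observed, theo - tol)  # first peak inside the window
        best = None
        while i < len(observed) and observed[i] <= theo + tol:
            if best is None or abs(observed[i] - theo) < abs(best - theo):
                best = observed[i]
            i += 1
        if best is not None:
            matches[theo] = best
    return matches
```

Sorting the observed peaks once and using binary search keeps each lookup cheap when many candidate glycopeptides are scored against the same spectrum.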
Article
Full-text available
Deep learning has achieved a notable success in mass spectrometry-based proteomics and is now emerging in glycoproteomics. While various deep learning models can predict fragment mass spectra of peptides with good accuracy, they cannot cope with the non-linear glycan structure in an intact glycopeptide. Herein, we present DeepGlyco, a deep learning-based approach for the prediction of fragment spectra of intact glycopeptides. Our model adopts tree-structured long-short term memory networks to process the glycan moiety and a graph neural network architecture to incorporate potential fragmentation pathways of a specific glycan structure. This feature is beneficial to model explainability and differentiation ability of glycan structural isomers. We further demonstrate that predicted spectral libraries can be used for data-independent acquisition glycoproteomics as a supplement for library completeness. We expect that this work will provide a valuable deep learning resource for glycoproteomics.
... Open source Yes GPS enables the analysis of diverse glycosylation patterns and simplifies data analysis. However, it relies on potentially incomplete databases for identification and requires substantial computational resources [162,163]. ...
Article
Glycosylation, the major post‐translational modification of proteins, significantly increases the diversity of proteoforms. Glycans are involved in a variety of pivotal structural and functional roles of proteins, and changes in glycosylation are profoundly connected to the progression of numerous diseases. Mass spectrometry (MS) has emerged as the gold standard for glycan and glycopeptide analysis because of its high sensitivity and the wealth of fragmentation information that can be obtained. Various separation techniques have been employed to resolve glycan and glycopeptide isomers at the front end of the MS. However, differentiating structures of isobaric and isomeric glycopeptides constitutes a challenge in MS‐based characterization. Many reports described the use of various ion mobility–mass spectrometry (IM–MS) techniques for glycomic analyses. Nevertheless, very few studies have focused on N‐ and O‐linked site‐specific glycopeptidomic analysis. Unlike glycomics, glycoproteomics presents a multitude of inherent challenges in microheterogeneity, which are further exacerbated by the lack of dedicated bioinformatics tools. In this review, we cover recent advances made towards the growing field of site‐specific glycosylation analysis using IM–MS with a specific emphasis on the MS techniques and capabilities in resolving isomeric peptidoglycan structures. Furthermore, we discuss commonly used software that supports IM–MS data analysis of glycopeptides.
... The combination of glycomics and glycoproteomics has contributed greatly to the analysis of glycosylation in a variety of organisms, in particular humans [5]. In recent years, direct analysis of intact glycopeptide/glycoproteins has been enabled due to the development of mass spectrometry technologies and bioinformatics tools [6–8]. This strategy not only identifies the glycans, proteins, and their glycosites but also obtains additional site-specific glycosylation information, which allows us to identify glycans and their respective glycosylation sites. ...
Article
Background: Drosophila melanogaster is a well-studied and highly tractable genetic model system for deciphering the molecular mechanisms underlying various biological processes. Although being one of the most critical post-translational modifications of proteins, the understanding of glycosylation in Drosophila is still lagging behind compared with that of other model organisms. Methods: In this study, we systematically investigated the site-specific N-glycan profile of Drosophila melanogaster using intact glycopeptide analysis technique. This approach identified the glycans, proteins, and their glycosites in Drosophila, as well as information on site-specific glycosylation, which allowed us to know which glycans are attached to which glycosylation sites. Results: The results showed that the majority of N-glycans in Drosophila were high-mannose type (69.3%), consistent with reports in other insects. Meanwhile, fucosylated N-glycans were also highly abundant (22.7%), and the majority of them were mono-fucosylated. In addition, 24 different sialylated glycans attached with 16 glycoproteins were identified, and these proteins were mainly associated with developmental processes. Gene ontology analysis showed that N-glycosylated proteins in Drosophila were involved in multiple biological processes, such as axon guidance, N-linked glycosylation, cell migration, cell spreading, and tissue development. Interestingly, we found that seven glycosyltransferases and four glycosidases were N-glycosylated, which suggested that N-glycans may play a regulatory role in the synthesis and degradation of N-glycans and glycoproteins. Conclusions: To our knowledge, this work represents the first comprehensive analysis of site-specific N-glycosylation in Drosophila, thereby providing new perspectives for the understanding of biological functions of glycosylation in insects.
... Once LC-MS data were collected, high-performance computational software was critical to deciphering the precise location of O-glycosites in peptide sequences. Recently published software, including MSFragger-Glyco (40), pGlyco3 (41), and O-Pair (42), employs the next-generation index-search algorithm to significantly improve search speed to tackle the highly complex HCD-pd-EThcD dataset of O-glycopeptides. The series and stepwise innovations in O-glycoprotease-based methods, HCD-pd-EThcD optimization, and software engineering establish a practical approach to mapping the in vivo O-glycosites. ...
... The determination of the "O-glycoproteome", i.e., identifying the specific location of every O-glycan (O-glycosites) on the serine, threonine, and tyrosine residues of every protein in a cell, tissue, or organ, would provide a valuable baseline for defining any consequences to changes that may accompany mutations and/or physiologic changes to the expression of enzymes responsible for the acquisition of these carbohydrate side chains. To obtain an atlas of serine- and threonine-O-glycosites with high precision and confidence from mouse tissues, organs, and fluid, we integrated the EXoO method (5,38), HCD-pd-EThcD mass spectrometry (39), and database software packages (40–42) to develop a qualitative and quantitative analysis workflow of O-glycosites (Fig. 1). The EXoO method starts with the preparation of a tryptic digest of the sample. ...
... Next, glycopeptides were fragmented by an optimized HCD-pd-EThcD LC-MS method that produces diagnostic oxonium ions, which facilitate the identification of the glycopeptide and confirm the location of the O-glycan tags within the glycopeptide sequences. The resulting LC-MS data were analyzed using complementary software, including MSFragger-Glyco, pGlyco3, and O-Pair (40–42). This integrated workflow permits qualitative and quantitative analysis of thousands of O-glycosites. ...
Article
Full-text available
The family of GalNAc-Ts (GalNAcpolypeptide:N-Acetylgalactosaminyl transferases) catalyzes the first committed step in the synthesis of O-glycans, which is an abundant and biologically important protein modification. Abnormalities in the activity of individual GalNAc-Ts can result in congenital disorders of O-glycosylation (CDG) and influence a broad array of biological functions. How site-specific O-glycans regulate biology is unclear. Compiling in vivo O-glycosites would be an invaluable step in determining the function of site-specific O-glycans. We integrated chemical and enzymatic conditions that cleave O-glycosites, a higher-energy dissociation product ions-triggered electron-transfer/higher-energy collision dissociation mass spectrometry (MS) workflow and software to study nine mouse tissues and whole blood. We identified 2,154 O-glycosites from 595 glycoproteins. The O-glycosites and glycoproteins displayed consensus motifs and shared functions as classified by Gene Ontology terms. Limited overlap of O-glycosites was observed with protein O-GlcNAcylation and phosphorylation sites. Quantitative glycoproteomics and proteomics revealed a tissue-specific regulation of O-glycosites that the differential expression of Galnt isoenzymes in tissues partly contributes to. We examined the Galnt2-null mouse model, which phenocopies congenital disorder of glycosylation involving GALNT2 and revealed a network of glycoproteins that lack GalNAc-T2-specific O-glycans. The known direct and indirect functions of these glycoproteins appear consistent with the complex metabolic phenotypes observed in the Galnt2-null animals. Through this study and interrogation of databases and the literature, we have compiled an atlas of experimentally identified mouse O-glycosites consisting of 2,925 O-glycosites from 758 glycoproteins.
... The identification of intact glycopeptides was performed by pGlyco3.0 [26]. The enzymes were set as follows: trypsin, C-term of KR with two missed cleavages; trypsin and chymotrypsin, C-term of KRFYLWM with six missed cleavages; trypsin and elastase, C-term of KRLITSAV with six missed cleavages. ...
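The enzyme settings quoted here (cleavage C-terminal to a residue set, with a cap on missed cleavages) correspond to a simple in silico digestion, sketched below in Python. This is a generic illustration, not pGlyco3's implementation: the function name and defaults are hypothetical, and the sketch applies no proline rule or peptide length filter.

```python
from typing import List


def digest(sequence: str, cleave_after: str = "KR", max_missed: int = 2) -> List[str]:
    """Cleave C-terminal to any residue in cleave_after.

    Returns every peptide spanning at most max_missed internal cleavage sites.
    """
    if not sequence:
        return []
    # Cut points are string offsets; the final residue is skipped so the
    # protein C-terminus is not added twice.
    cuts = [0]
    cuts += [i + 1 for i, aa in enumerate(sequence[:-1]) if aa in cleave_after]
    cuts.append(len(sequence))
    peptides = []
    for i in range(len(cuts) - 1):
        last = min(i + 2 + max_missed, len(cuts))
        for j in range(i + 1, last):
            peptides.append(sequence[cuts[i]:cuts[j]])
    return peptides
```

Under these assumptions, the trypsin plus chymotrypsin setting from the excerpt would be `digest(seq, cleave_after="KRFYLWM", max_missed=6)`, and trypsin plus elastase would be `digest(seq, cleave_after="KRLITSAV", max_missed=6)`.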
Preprint
Carcinoembryonic antigen (CEA) of human plasma is a biomarker of many cancer diseases, and its N-glycosylation accounts for 60% of molecular mass. It is highly desirable to characterize its glycoforms for providing additional dimension of features to increase its performance in prognosis and diagnosis of cancers. However, to systematically characterize its site-specific glycosylation is challenging due to its low abundance. Here, we developed a highly sensitive strategy for in-depth glycosylation profiling of plasma CEA through chemical proteomics combined with multi-enzymatic digestion. A trifunctional probe was utilized to generate covalent bond of plasma CEA and its antibody upon UV irradiation. As low as 1 ng/mL CEA in plasma could be captured and digested with trypsin and chymotrypsin for intact glycopeptide characterization. Twenty six out of 28 potential N-glycosylation sites were well identified, which were the most comprehensive N-glycosylation site characterization of CEA on intact glycopeptide level as far as we know. Importantly, this strategy was applied to the glycosylation analysis of plasma CEA in cancer patients. Differential site-specific glycoforms of plasma CEA were observed in patients with colorectal carcinomas (CRC) and lung cancer. The distributions of site-specific glycoforms were different as the progression of CRC, and most site-specific glycoforms were overexpressed in stage II of CRC. Overall, we established a highly sensitive chemical proteomic method to profile site-specific glycosylation of plasma CEA, which should be generally applicable to other well-established cancer glycoprotein biomarkers for improving their cancer diagnosis and monitoring performance. In Brief: A chemical proteomic approach for glycosylation profiling of proteins was established for glycosylation characterization of plasma CEA with low abundance. Although CEA has been widely used in diagnosis and prognosis of many cancers, it lacks specificity and sensitivity. We found that the glycosylation of CEA on intact glycopeptide level provided additional dimension of molecular features to improve the performance of CEA in cancer diagnosis and progression. Highlights: a chemical proteomic approach for glycosylation profiling of proteins with low abundance; glycosylation identification of plasma CEA on intact glycopeptide level with high sensitivity and reproducibility; glycosylation features of plasma CEA in cancer patients with CRC and lung cancer and in CRC patients at different progression stages.