Multilevel proteomic analyses reveal molecular diversity between diffuse-type and intestinal-type gastric cancer

Diffuse-type gastric cancer (DGC) and intestinal-type gastric cancer (IGC) are the major histological types of gastric cancer (GC). The molecular mechanisms underlying the differences between DGC and IGC are poorly understood. In this research, we carry out multilevel proteomic analyses, including proteome, phospho-proteome, and transcription factor (TF) activity profiles, of 196 cases covering DGC and IGC in Chinese patients. Integrative proteogenomic analysis reveals that ARID1A mutation is associated with opposite prognostic effects in DGC and IGC, via diverse influences on their corresponding proteomes. Systematic comparison and consensus clustering analysis identify three subtypes each of DGC and IGC, based on distinct expression patterns of proteins related to the cell cycle, extracellular matrix organization, and immune response. TF activity-based subtypes demonstrate that the disease progression of DGC and IGC is regulated by the SWI/SNF and NFKB complexes. Furthermore, inferred immune cell infiltration and immune clustering show that the Th1/Th2 ratio is an indicator of immunotherapeutic effectiveness, which is validated in an independent group of GC patients receiving anti-PD1 therapy. Our multilevel proteomic analyses enable a more comprehensive understanding of GC and can further advance precision medicine.
A summary of proteogenomic analysis of DGC. a Genes with non-silent variants in at least six patients (9%) are depicted on the OncoPrint. Bars on the right of the graph show the numbers of non-synonymous mutations. b Prognosis outcomes of corresponding gene mutations. n (with ARID1A mutant) = 7 and n (without ARID1A mutant) = 58 biologically independent samples. The points and error bars show the median of hazard ratio (HR) and 95% confidence interval (CI). c Cis- and trans-effects of significantly mutated genes (y-axis) on protein level (x-axis). The p-values are calculated by Fisher’s exact test. The related biological functions and pathways are shown at the bottom. d Altered expression of ARID1A associated with ARID1A mutations. In proteome, n (mutant) = 7 and n (WT) = 58 biologically independent samples. In TF activity profile, n (mutant) = 2 and n (WT) = 38 biologically independent samples. Boxplots show median (central line), upper and lower quartiles (box limits), min to max range. The p-values are calculated using two-sided Student’s t-test. e Pathways enriched by GSEA in ARID1A mutated patients. Nominal p-value, calculated by phenotype-based permutation test. f TLR signaling pathway was significantly enriched in ARID1A mutated patients. g Altered expression of proteins enriched in TLR signaling pathway. The p-values are calculated by two-sided Wilcoxon rank-sum test. h Altered expression of CD14 associated with ARID1A mutation. n (mutant) = 7 and n (WT) = 58 biologically independent samples. Boxplots show median (central line), upper and lower quartiles (box limits), min to max range. The p-value is calculated using two-sided Student’s t-test. i The prognostic outcome of CD14 in DGC. n (low) = 42 and n (high) = 23 biologically independent samples. The p-value is calculated using Log-rank test. Source data are provided as a Source Data file.
Integrated multilevel proteomic analyses showed different pathogenic mechanism of DGC and IGC a The association of Lauren classification with clinical information. Two-sided Fisher’s exact test is used for categorical variables. b The association of Lauren classification with clinical outcomes. n (DGC) = 83 and n (IGC) = 102 biologically independent samples. P-values are from Log-rank test. c Representative differentially expressed proteins in the featured pathways of DGC and IGC. d Microenvironment scores and immune scores of DGC and IGC. n (DGC) = 83 and n (IGC) = 100 biologically independent samples. Violin plots show median and interquartile range. The p-values are from two-sided Wilcoxon rank-sum test. e Comparison of immune cell infiltration between DGC and IGC. Two-sided Wilcoxon rank-sum test is used. The Benjamini–Hochberg (BH) adjusted p-values are 0.019 (NK cells), 0.0016 (NKT), 1.71E-9 (CD4 + T-cells), 1.89E-12 (CD4 + memory T-cells), 0.0017 (CD8 + T-cells), 0.00056 (CD8 + Tem), and 4.25E-7 (Macrophages). f Integrated analysis of cell cycle regulation pathway at protein, kinase, TF activity and phospho-site levels in DGC and IGC. g Integrated analysis of DNA mismatch repair pathway at protein, kinase and phospho-site levels in DGC and IGC. h The expression of CDK4 and ATR in DGC and IGC. n (DGC) = 10 and n (IGC) = 16 biologically independent samples. Boxplots show median (central line), upper and lower quartiles (box limits), min to max range. The p-values are calculated by two-sided Wilcoxon rank-sum test. i The prognostic analyses of CDK4 and ATR in TCGA cohort. In comparison of CDK4, n (high expression) = 9 and n (low expression) = 13 biologically independent samples. In comparison of ATR, n (high expression) = 32 and n (low expression) = 42 biologically independent samples. The p-value is calculated using Log-rank test. j Summary of signature proteins and pathways involved in DGC and IGC. ****p < 1.0e-4, ***p < 1.0e-3, **p < 0.01, *p < 0.05. 
Source data are provided as a Source Data file.
Proteomic subtyping of GC and associations with clinical outcomes a The association of proteomic subtypes with clinical outcomes in DGC and IGC. n (DGC cluster 1) = 23, n (DGC cluster 2) = 28, n (DGC cluster 3) = 28, n (IGC cluster 1) = 18, n (IGC cluster 2) = 49, and n (IGC cluster 3) = 25 biologically independent samples. P-values are from Log-rank test. b Clinical characteristics annotation in GC proteomic subtypes. c Pathways that significantly enriched in the proteomic subtypes. d GSEA revealed the cell cycle and immune related pathways are enriched in the proteomic subtypes and have opposite prognoses between DGC and IGC. n (DGC) = 79 and n (IGC) = 92 biologically independent samples. The points and error bars show the median of hazard ratio (HR) and 95% confidence interval (CI). e The association of chemotherapy with DFS in each GC proteomic subtypes. Chemotherapy: n (DGC cluster 1) = 19, n (DGC cluster 2) = 22, n (DGC cluster 3) = 22, n (IGC cluster 1) = 12, n (IGC cluster 2) = 36, and n (IGC cluster 3) = 20 biologically independent samples. No chemotherapy: n (DGC cluster 1) = 4, n (DGC cluster 2) = 6, n (DGC cluster 3) = 7, n (IGC cluster 1) = 6, n (IGC cluster 2) = 13, and n (IGC cluster 3) = 5 biologically independent samples. The points and error bars show the median of hazard ratio (HR) and 95% confidence interval (CI). f Percentage of patients with different cell cycle phases. The p-value is calculated by Fisher’s exact test. g Summary of cell cycle regulation in DGC cluster 1 and IGC cluster 3. Proteins involved in DNA replication and cell division in DGC cluster 1 and IGC cluster 3, phospho-sites of CDK1 and CDK2 substrates in DGC cluster 1 and IGC cluster 3 are shown, respectively. h KSEA analysis of CDKs kinase activities in DGC cluster 1 and IGC cluster 2. Kinases with p-value < 0.05 (permutation test) are colored in red (CDK2, p-value = 0.041) or blue (CDK1, p-value = 0.049). 
i The association of chemotherapy with DFS in GC patients with high CDK1 and low CDK2 level. n (chemotherapy) = 19 and n (no chemotherapy) = 7 biologically independent samples. The p-value is from Log-rank test. Source data are provided as a Source Data file.
Article https://doi.org/10.1038/s41467-023-35797-6
Wenhao Shi¹,²,⁹, Yushen Wang¹,²,⁹, Chen Xu³,⁹, Yan Li⁴,⁹, Sai Ge⁵, Bin Bai⁶, Kecheng Zhang⁷,
Yunzhi Wang⁴, Nairen Zheng², Juan Wang⁶, Shiqi Wang⁶, Gang Ji⁶, Jipeng Li⁶, Yongzhan Nie⁶,
Wenquan Liang⁷, Xiaosong Wu⁷, Jianxin Cui⁷, Yi Wang², Lin Chen⁷, Qingchuan Zhao⁶, Lin Shen⁵,
Fuchu He¹,²,⁸, Jun Qin²,⁴ & Chen Ding²,⁴
Gastric cancer (GC) is the third most common cause of global cancer
mortality1,2. Gastric adenocarcinomas constitute ~95% of GCs and
are classified into diffuse, intestinal, and mixed types as per the widely
used Lauren classification3. Patients diagnosed with diffuse-type GC
(DGC) and those diagnosed with intestinal-type GC (IGC) account for
30% and 54% of all GC patients, respectively4. DGC displays a scattered
cellular organization, poor adhesion, and poor cellular differentiation,
while IGC displays a tubular or glandular cellular organization with
tight adhesion junctions and less stromal component3,4. The different
pathophysiological and molecular features of DGC and IGC suggest
different mechanisms of carcinogenesis; therefore, it is imperative to
investigate the mechanistic differences between DGC and IGC.
In the past decade, large-scale genomic and transcriptomic studies
have revealed molecular characteristics of GC5–7. For
example, the Cancer Genome Atlas (TCGA) conducted whole-exome
sequencing and mRNA sequencing analyses of GC5–7. However, studies
focused on systematic comparison of DGC and IGC have been sparse. In
a study based on transcriptome analysis, Jinawath et al.
found that genes encoding extracellular matrix (ECM) proteins were
more highly expressed in DGC than in IGC, while those encoding
metabolic proteins were more highly expressed in IGC than in DGC8.
While there has been significant progress, a deeper understanding of
the different molecular pathology of DGC and IGC from proteomic data
remains lacking, which impedes the discovery of new biomarkers and
drug targets for DGC and IGC.
Received: 20 September 2021
Accepted: 3 January 2023
A full list of affiliations appears at the end of the paper. e-mail: chenlinbj@vip.sina.com; zhaoqc@fmmu.edu.cn; shenlin@bjmu.edu.cn;
hefc@nic.bmi.ac.cn; jqin1965@126.com; crickding@163.com
Nature Communications | (2023) 14:835
Previous proteomic studies focused on the proteomic landscape
of DGC9,10, which is a pathological type with poor prognosis and few
treatment options4. For example, Ge et al. identified the proteomic
subtypes and signaling pathways associated with clinical outcomes of
DGC patients, such as cell cycle, epithelial-to-mesenchymal transition
(EMT), and immune responses9. This demonstrates that DGC is
characterized by inter-patient heterogeneity at the protein level and can be
classified based on proteomic signatures. Mun et al. used proteomic
and phospho-proteomic approaches to systematically demonstrate
alterations in key biological processes, such as cell proliferation,
immune response, metabolism, and invasion of DGC10. These studies
have significantly enhanced our understanding of the molecular
heterogeneity prevalent in DGC. However, research investigating the
underlying molecular subtypes of IGC is still lacking, despite IGC
patients accounting for the highest proportion of total GC patients.
The lack of clinical proteomic research on IGC hinders a comprehensive
understanding of GC heterogeneity and the search for novel
therapeutic targets.
Transcription factor (TF) activities orchestrate the intracellular
signaling pathways during diverse biological processes in
carcinogenesis. Several TFs have been reported to promote GC progression
by different molecular mechanisms; for example, MYC promotes cell
proliferation11; FOXC1 promotes EMT12; the SWI/SNF complex governs
chromatin structure and gene transcription13,14; and the NFKB complex
promotes inflammation and immune response15. However, the
complete TF activity profiles of DGC and IGC have not been described yet.
We have previously developed an approach called TFRE, which can
detect and evaluate inferred TF activities at the proteomic level16,17.
Comprehensive analysis with TF activity profiles would provide a
panoramic view of possible pathogenic mechanisms and therapeutics
of DGC and IGC.
In addition to altered intracellular signal transduction induced by
overexpression of TFs, components of the tumor microenvironment
(TME) also affect disease progression in GC18,19. Recent studies have
demonstrated that the TME is a complex system wherein the
tumor-infiltrating immune cells play a key role in GC progression18,19.
Moreover, heterogeneity of the TME affects immunotherapeutic
effectiveness. Pembrolizumab, a monoclonal antibody directed against PD-1, is
approved by the US Food and Drug Administration (FDA) for advanced GC
patients; however, the response rate is as low as 10–26% in GC patients
with metastasis20. Therefore, the identification of predictive
biomarkers and exploration of resistance mechanisms to immunotherapy
would be important for improving therapeutic effects for GC patients.
Here, we present multilevel proteomic analyses of GC by
analyzing the proteome of 196 pairs of tumor tissues and their normal
adjacent tissues (NATs). We demonstrate the different pathogenic
mechanisms between DGC and IGC based on multi-omics data.
Moreover, we perform proteomic clustering and obtain molecular
subtypes with distinct expression levels of proteins that play a role
in cell cycle, ECM, and immune response, indicating the
heterogeneity prevalent in DGC and IGC. Additionally, we find that NFKB
and SWI/SNF complexes are crucial in distinguishing two subtypes
of DGC and IGC, respectively, and are associated with different patient
prognoses. The characterization of the immune landscape
further reveals the existence of diverse immunotherapy targets,
especially the Th1/Th2 ratio for predicting GC immunotherapeutic
effectiveness. Our integrative proteomic analyses present a
multilevel proteomic landscape that serves as a rich resource for
understanding the molecular characteristics of GC and for
identifying potential therapeutic targets in GC treatment.
Results
Comprehensive proteomic landscape of GC cohort
We collected 196 pairs of primary GC samples (DGC, n = 83;
IGC, n = 102; and mixed-type gastric cancer (MGC), n = 11) and the
NATs from treatment-naïve Chinese patients (Supplementary Table 1,
Supplementary Data 1). A schematic of the experimental design is
shown in Fig. 1a. A mass spectrometry (MS)-based label-free
quantification strategy, with reference to the Chinese Human Proteome Project
(CNHPP)21–23, was adopted for this study. A Fast-Seq workflow24
was performed to profile the proteomes of 194 paired samples.
A phospho-proteomic analysis was conducted on 184 paired samples
using a TiO2 enrichment strategy25. In addition, a concatenated tandem
array of consensus TF response elements (TFRE) for TF enrichment,
reflecting TFs' DNA binding activity, was applied to all
the samples16. Tryptic digests of the 293T cell lysate were
measured as standards for sample quality control (QC).
The average Spearman's correlation coefficients among standards in the
proteome, phospho-proteome, and TF activity profile platforms
were 0.92, 0.94, and 0.95 (Supplementary Fig. 1a), respectively.
The median coefficient of variation (CV) values among standards in the
proteome, phospho-proteome, and TF activity profile platforms were
0.28, 0.26, and 0.34 (Supplementary Fig. 1b), respectively. The
density plots of the three datasets exhibited unimodal distributions
(Supplementary Fig. 1c). These evaluations demonstrated the stability of our MS
platforms.
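The two QC metrics described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names and the toy intensities are hypothetical, and it assumes each standard run is a list of intensities for the same proteins in the same order, with no missing values.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, median, stdev


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


def mean_pairwise_spearman(runs):
    """Average Spearman correlation over all pairs of standard runs."""
    return mean(spearman(a, b) for a, b in combinations(runs, 2))


def median_cv(runs):
    """Median across proteins of the per-protein CV (sample SD / mean)."""
    return median(stdev(values) / mean(values) for values in zip(*runs))


# Hypothetical intensities of four proteins in three standard runs.
standard_runs = [
    [10.0, 20.0, 30.0, 40.0],
    [12.0, 18.0, 33.0, 41.0],
    [9.0, 22.0, 28.0, 45.0],
]
print(mean_pairwise_spearman(standard_runs))
print(median_cv(standard_runs))
```

A high mean pairwise correlation and a low median CV among repeated standards are what the text reports as evidence of platform stability.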
Upon profiling the proteomes of all the patient samples, we
identified 11,688 proteins in total (Supplementary Fig. 1d).
Furthermore, 44,750 phospho-sites were identified for 6619
phosphoproteins with a confident site localization score (Mascot ion score >20,
Supplementary Fig. 1d), and 597 TFs were identified from the
inferred TF activity profiles (Supplementary Fig. 1d). Pairwise comparison
of samples showed that the numbers of proteins, phospho-sites, and TFs
identified in the GC tumor samples were higher than those identified in
the paired NAT samples (Supplementary Fig. 1d, e). This suggested a
lower degree of differentiation and a higher degree of heterogeneity
in tumor tissues than in the NATs. Multilevel proteomics
increased the proteome coverage of kinases and TFs. For example,
phospho-proteomics increased the number of kinases detected to
325, and the TF activity profiles increased the number of TFs detected
to 756 (Supplementary Fig. 1f).
Our multilevel proteomic data enabled a comprehensive
exploration of altered protein expression between the GC tissues and
NATs. After sample QC and normalization procedures, we performed
principal-component analyses (PCAs) of proteomes,
phospho-proteomes and TF activity profiles. All datasets could separate
tumors and NATs (Supplementary Fig. 2a), indicating altered protein,
phospho-site, and TF activity landscapes in GC. By comparison with the
NATs, GC-related proteins, TFs, and phospho-sites were identified.
Among them, 1548 proteins, 123 TFs, and 163 phospho-sites were
upregulated, and 671 proteins, 20 TFs, and 194 phospho-sites were
downregulated (two-sided Wilcoxon signed-rank test, BH adjusted
p < 0.05, ratio of tumor to NAT (T/NAT) > 2 or <0.5, Fig. 1b,
Supplementary Data 3a–c) in tumor tissues. Gene set enrichment analysis
(GSEA) of the proteome demonstrated that the proteins involved in DNA
replication, cell cycle, ECM organization, and immune response were
significantly upregulated in tumor tissues, whereas those involved in
metabolism (i.e., fatty acid β-oxidation, tricarboxylic acid (TCA) cycle,
and oxidative phosphorylation) were significantly downregulated in
tumor tissues (Fig. 1b, Supplementary Fig. 2b). Pathway enrichment
analysis of TF activity profiles indicated that upregulated TFs in tumor
tissues were involved in mediating cell cycle, NF-kappa B signaling pathway,
and Ras signaling pathway, whereas upregulated TFs in NATs were involved
in calcium signaling pathway, glucagon signaling pathway, and cAMP
signaling pathway (Fig. 1b). Pathway enrichment analysis of the
phospho-proteome indicated that tumor phosphoproteins were involved in
[Fig. 1 graphic omitted; only labels survived extraction. Recoverable labels: cohort composition (DGC n = 83, IGC n = 102, MGC n = 11; samples GC 01-196); platforms (proteome, phospho-proteome with TiO2 enrichment, TF activity profile with TFRE DNA enrichment); clinical annotations (Lauren subtype, gender, age, stage, chemotherapy, signet ring cells, HER2); axes Log2(T/NAT), Log10(T/NAT), enrichment score, and z score.]
Fig. 1 | Multilevel proteomic atlas of human GC samples. a Workflow of human
gastric cancer multilevel proteomic atlas construction. b Differentially expressed
proteins in tumor tissues and NATs and their associated biological pathways. Red,
upregulated pathways in tumor tissues; blue, upregulated pathways in NATs.
c Representative differentially expressed proteins in the cell cycle at multilevel
proteomic levels (proteome, TF activity profile, and phospho-proteome).
d Regulation network of TFs and their target genes. Tissue-specific TFs are shown.
e Phospho-regulatory network in GC. Red, kinases; yellow, substrates. The main
functions or pathways of substrate proteins are labeled. f The expression of GC
signature kinases at multilevel proteomic levels (protein, phospho-site, and inferred
activity).
carcinogenesis-associated pathways/processes, such as the cell cycle,
regulation of TP53 activity, and DNA repair, whereas those in
NATs were involved in physiological functions, such as
vesicle-mediated transport, membrane trafficking, and glucose metabolism
(Fig. 1b). These analyses showed that the characteristics of tumor
tissues at the proteome, TF activity, and phospho-proteome levels were
partially consistent, with some differences.
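The differential-expression criteria stated above (BH adjusted p < 0.05 and T/NAT > 2 or < 0.5) can be sketched as follows. This is an illustrative outline only: the function names and toy values are hypothetical, and the raw p-values are assumed to come from a two-sided Wilcoxon signed-rank test computed elsewhere.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def classify(ratio, adj_p, alpha=0.05, up=2.0, down=0.5):
    """Label a feature by its T/NAT ratio and adjusted p-value."""
    if adj_p >= alpha:
        return "unchanged"
    if ratio > up:
        return "up"
    if ratio < down:
        return "down"
    return "unchanged"


# Hypothetical features: (name, T/NAT ratio, raw p-value).
features = [
    ("protein_A", 3.1, 0.001),
    ("protein_B", 0.4, 0.004),
    ("protein_C", 1.2, 0.020),
    ("protein_D", 2.5, 0.600),
]
adj = benjamini_hochberg([p for _, _, p in features])
labels = {name: classify(ratio, q) for (name, ratio, _), q in zip(features, adj)}
print(labels)
```

Requiring both a significance threshold and a fold-change threshold, as the text does, excludes features that are statistically significant but change only slightly.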
Notably, proteins involved in the cell cycle were upregulated in all
three datasets. Cell cycle proteins were evaluated based on their
altered expression patterns (WAPAL, AHCTF1, etc.), phosphorylation
patterns (CCNL1 T67, SMC4 S41, etc.), and inferred TF activities
(SMARCA5, E2F3, etc.) in GC tumor tissues (Fig. 1c). We found that GC
patients with high DNA binding activities of SMARCA5 and E2F3 and
high phosphorylation of CCNL1 (at T67) and SMC4 (at S41) in tumor
tissues had poor prognoses (Log-rank test, p < 0.05)
(Supplementary Fig. 2c). The analyses of TF activity profiles and
phospho-proteomes suggested that TFs and kinases played a unique
role in oncogenesis, a finding that cannot be observed by
analyzing the protein expression profiles alone.
A recent analysis of the GC cell lineage revealed that GC cells may
transdifferentiate into other digestive tract cell lineages26. As TFs
determine cell fate27, we first compared inferred TF activities
between the tumor tissues and NATs and found that intestine-specific
TFs, including ELF3, HNF4A, and VDR, and esophagus-specific TFs,
including RARG and ARNTL2, were upregulated in tumor tissues;
however, stomach-specific TFs, such as MYRF, were downregulated
(Supplementary Fig. 2d). Secondly, we compared the expression of these TFs' target
genes (TGs) between the tumor tissues and NATs based on the
proteomic profile. As presented in Fig. 1d, the TF-TG regulation
network showed that TG expression levels exhibited similar tendencies
to those of the tissue-specific TFs. Thirdly, we counted the
proportions of tissue-specific proteins whose expression changed between GC
tumor tissues and NATs. We found that 25.0% of
intestine-specific and 37.8% of esophagus-specific proteins, such as ELF3
and EPHA2, respectively, were upregulated in the tumor tissues,
whereas 77.1% of stomach-specific proteins, such as MUC5AC, were
downregulated (Supplementary Fig. 2e, Supplementary Data 3d). The
downregulated stomach-specific proteins in GC tumor tissues
(Supplementary Fig. 2f) indicated that the normal physiological function of the
stomach was decreased or lost in GC tumor tissues. Finally, we analyzed
the association between the expression of stomach-specific proteins and
GC patients' prognosis. We found that stomach-specific proteins, such
as VSIG2 and B4GALNT3, were associated with favorable prognosis
(Log-rank test, p < 0.05; Supplementary Fig. 2g). These results
demonstrated that the altered expression of tissue-specific TFs
affected trans-differentiation in GC and patients' prognosis, reinforcing the
observation that proteomic, phospho-proteomic, and TF activity profiles
possess distinct biological characteristics. Comprehensive analysis
of multilevel proteomics could provide novel insights into signaling
pathways and drug targets.
To explore the role of kinases in GC, we selected phospho-sites
that exhibited a larger alteration than their corresponding proteins
between the tumor tissues and NATs. Subsequently, 229 phospho-sites
were defined as GC-associated phospho-sites (Supplementary Fig. 2h).
Kinase substrate enrichment analysis (KSEA)28 of GC-associated
phospho-sites identified multiple kinases, including CDK1, CDK2, CSNK2A1,
CAMK2A, PAK1, MAPK1, and MAPK3, that were activated in GC tumor
tissues (Fig. 1e). These kinases regulated the cell cycle and several oncogenic
pathways, including the GPCR signaling pathway and the MAPK signaling
pathway. Further investigation revealed that the expression,
phosphorylation, and activity of MAPK1, MAPK3, and CDK1 had increased;
thus, these three kinases could serve as potential drug targets for GC
patients (Fig. 1f).
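As a rough illustration of how kinase activity can be inferred from substrate phospho-sites, the sketch below uses one commonly used KSEA formulation, in which a kinase's z-score compares the mean log2 fold change of its substrates with the mean over all sites. Whether the authors used exactly this scoring is an assumption; the site names, kinase names, and kinase-substrate map here are hypothetical, and the permutation-based p-values mentioned in the figure legends are not reproduced.

```python
from math import sqrt
from statistics import mean, stdev


def ksea_z_scores(site_log2fc, kinase_substrates, min_substrates=2):
    """Score kinases as z = (mean_substrates - mean_all) * sqrt(m) / sd_all.

    site_log2fc: dict mapping phospho-site ID to log2(T/NAT).
    kinase_substrates: dict mapping kinase name to a list of site IDs.
    Kinases with fewer than min_substrates quantified sites are skipped.
    """
    all_values = list(site_log2fc.values())
    background_mean = mean(all_values)
    background_sd = stdev(all_values)
    scores = {}
    for kinase, sites in kinase_substrates.items():
        observed = [site_log2fc[s] for s in sites if s in site_log2fc]
        if len(observed) < min_substrates:
            continue
        scores[kinase] = (
            (mean(observed) - background_mean) * sqrt(len(observed)) / background_sd
        )
    return scores


# Hypothetical phospho-site fold changes and kinase-substrate annotations.
site_log2fc = {
    "siteA": 2.0, "siteB": 1.5, "siteC": -1.0,
    "siteD": -0.5, "siteE": 0.0, "siteF": 1.0,
}
kinase_substrates = {
    "KINASE_X": ["siteA", "siteB", "siteF"],
    "KINASE_Y": ["siteC", "siteD"],
    "KINASE_Z": ["siteE"],
}
scores = ksea_z_scores(site_log2fc, kinase_substrates)
print(scores)
```

A positive score suggests that a kinase's substrates are, on average, more phosphorylated in tumors than the background of all sites, which is the sense in which the text calls a kinase "activated".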
Thus, our findings have so far established a comprehensive
proteomic landscape of Chinese GC patients. Moreover, these datasets
serve as a multilevel resource for investigating GC pathology and
precision medicine.
ARID1A mutation had different effects in DGC
and IGC
To investigate genetic alterations in GC, we
analyzed gene mutation frequencies based
on a panel of 274 cancer driver and GC hotspot genes among 65
DGC patients. Thirteen genes with mutations detected in at least
9% of patients are presented (Fig. 2a, Supplementary Data 2).
Among these gene mutations, TP53, CDH1, KMT2D, RHOA, ARID1A,
APC, and PIK3CA were detected as high-frequency mutations
(10.8–47.7%), consistent with previous reports5,9. To explore the
association of gene mutations with prognostic outcomes, we
calculated the hazard ratio (HR) of gene mutations based on
survival outcomes. We found that patients with ARID1A mutation
had unfavorable prognosis (Fig. 2b).
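The frequency filter described above amounts to simple counting: with 65 patients, a gene mutated in 7 patients has a frequency of 7/65 ≈ 10.8%, and 6 patients (≈ 9.2%) is the smallest count reaching the 9% cutoff. A minimal sketch follows; the function names, patient IDs, and gene labels are hypothetical placeholders, not data from the study.

```python
def mutation_frequencies(mutations_by_patient):
    """Return the percentage of patients carrying a mutation in each gene."""
    n_patients = len(mutations_by_patient)
    counts = {}
    for genes in mutations_by_patient.values():
        for gene in set(genes):  # count each gene once per patient
            counts[gene] = counts.get(gene, 0) + 1
    return {gene: 100.0 * c / n_patients for gene, c in counts.items()}


def recurrent_genes(freqs, min_percent=9.0):
    """Genes at or above the frequency cutoff, most frequent first."""
    kept = [g for g, f in freqs.items() if f >= min_percent]
    return sorted(kept, key=lambda g: -freqs[g])


# Hypothetical cohort of 65 patients: GENE_A mutated in 7, GENE_B in 3.
cohort = {f"P{i:02d}": [] for i in range(65)}
for i in range(7):
    cohort[f"P{i:02d}"].append("GENE_A")
for i in range(3):
    cohort[f"P{i:02d}"].append("GENE_B")

freqs = mutation_frequencies(cohort)
print(round(freqs["GENE_A"], 1), recurrent_genes(freqs))
```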
Genomic alterations that affect gene expression levels at the same
locus are designated as cis-effects, whereas an impact on another locus
is defined as a trans-effect29. We comprehensively characterized the cis-
and trans-effects of genetic alterations on protein level (Fig. 2c).
Compared with cis-effects, protein abundance alterations occurred more
prominently as trans-effects, and these alterations tended to cluster in
specific biological processes. For example, patients with CDH1 mutation
had lower expression of ECM proteins (Fig. 2c), consistent with the function
of CDH1 in ECM organization30. Importantly, only three genes
showed cis-effects: TP53, PIK3CA, and ARID1A. Patients with TP53 or
PIK3CA mutations had increased abundance of the corresponding proteins,
while patients with ARID1A mutation had lower ARID1A protein
expression (Fig. 2d). We also compared the TF activity of ARID1A in the TF
activity profiles between patients with ARID1A mutation and wild-type
patients. We found that the TF activity of ARID1A was also decreased
in ARID1A mutated patients (Fig. 2d). These results demonstrated
that ARID1A mutation was accompanied by decreases in both its protein
expression and its TF activity.
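The cis/trans distinction defined above can be sketched in a few lines. This is an illustration under stated assumptions: the function names and abundance values are hypothetical, proteins are assumed to be labeled by the symbol of their encoding gene, and the statistical testing used in the study is omitted.

```python
from math import log2
from statistics import median


def effect_type(mutated_gene, affected_protein):
    """Label an association as cis (same locus) or trans (another locus)."""
    return "cis" if mutated_gene == affected_protein else "trans"


def log2_median_ratio(mutant_values, wild_type_values):
    """log2 ratio of median abundance in mutant versus wild-type samples."""
    return log2(median(mutant_values) / median(wild_type_values))


# Hypothetical T/NAT abundance ratios for one protein in the two groups.
mutant = [0.4, 0.5, 0.6]
wild_type = [0.9, 1.0, 1.1]

print(effect_type("ARID1A", "ARID1A"))  # mutation and protein at the same locus
print(effect_type("ARID1A", "CD14"))    # mutation affecting another protein
print(log2_median_ratio(mutant, wild_type))
```

A negative ratio for a cis pair corresponds to the pattern the text describes for ARID1A, where the mutation is associated with lower abundance of its own protein.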
Because ARID1A was the only mutated gene correlated with
unfavorable prognosis, we further investigated how ARID1A mutation
correlated with the alteration of the cancer proteome, namely
alterations of related proteins and pathways. We mined TG data for
ARID1A31. ARID1A was primarily reported as a transcriptional
repressor32. Thus, we surveyed the TGs that were elevated in ARID1A
mutated patients. GSEA showed significantly altered
pathways between samples with and without ARID1A mutation. Based on
the normalized enrichment scores, we found that the pattern
recognition receptor signaling pathway and the TLR signaling pathway were the
most significantly enriched pathways in ARID1A mutated patients
(Fig. 2e, f). Among 15 proteins involved in the TLR signaling pathway,
CD14 and PIK3AP1 were significantly upregulated in ARID1A mutated
patients (Fig. 2g, h). Furthermore, the prognostic analysis showed
that CD14 was an unfavorable prognostic protein in DGC (Log-rank
test, p < 0.05; Fig. 2i). CD14 has been reported as a protein involved
in increasing cytokine production, increasing tumor growth, and
promoting inflammation in several cancer types33. These results
demonstrated that patients with ARID1A mutation had unfavorable
outcomes and an activated CD14-mediated TLR signaling pathway.
The ARID1A mutation was found to be an unfavorable prognostic
factor in this DGC cohort. We then surveyed the prognostic
correlation of ARID1A mutation in IGC cohorts. In Wang's cohort, we found that
ARID1A mutation was a prognostic factor associated with better
prognosis34. We then explored the TCGA cohort7 to validate the
prognostic association of ARID1A mutation, and found that patients with ARID1A
mutation in IGC had better prognoses, whereas patients with ARID1A
mutation in DGC had poor prognoses
(Supplementary Fig. 3). These results indicated that the mutation of ARID1A
had opposite prognostic effects between DGC and IGC, via diverse
influences on their corresponding proteomes. Therefore, it is
important to compare DGC and IGC based on multilevel proteomic data.
Integrated multilevel proteomics in DGC and IGC
The Lauren classification includes DGC, IGC, and MGC, among which the
former two pathological types are the major ones3. In our cohort, clinical
information showed that DGC development was significantly
dependent on age (Chi-square test, BH adjusted p = 0.023), tumor location
(Chi-square test, BH adjusted p = 0.0007), and lymphovascular invasion
(Chi-square test, BH adjusted p = 0.0018; Fig. 3a). Consistent with
current clinical knowledge, survival analysis revealed that IGC patients
had significantly prolonged survival (Log-rank test, p = 0.0309; Fig. 3b).
[Fig. 2 graphic omitted; only labels survived extraction. Panel a (OncoPrint): genes listed as TP53, CDH1, KMT2D, APC, SPTA1, PIK3CA, RHOA, FAT4, PKHD1, ARID1A, DNAH7, BAP1, and KMT2C, with alteration frequencies listed in the same order as 47.7%, 24.6%, 16.9%, 12.3%, 12.3%, 12.3%, 12.3%, 10.8%, 10.8%, 10.8%, 9.2%, 9.2%, and 9.2%; mutation classes: missense, nonsense, in-frame deletion/insertion, frameshift deletion/insertion, splice site. Other panels: Log2(HR) per mutated gene; cis- and trans-effects on proteins grouped as ECM, cell cycle, PI3K-Akt, MAPK, and TLR signaling pathways; ARID1A in mutant versus WT, Log10(T/NAT ratio), p = 0.0423 (proteome) and p = 0.0462 (TF activity profile); GSEA pathways by normalized enrichment score, including Toll-like receptor and pattern recognition receptor signaling pathways; volcano plot highlighting CD14 and PIK3AP1; CD14 in mutant versus WT, p = 0.0205; survival by CD14 level, low n = 42 and high n = 23, p = 0.0311.]
To investigate altered proteomic features of GC, we compared
proteins with significantly differential expression (Wilcoxon paired
signed-rank test, BH adjusted p < 0.05, fold change > 1.5) between
tumor tissues and NATs in DGC and IGC, respectively (Supplementary
Fig. 4a). We identified 2512 proteins upregulated in DGC, among which
2212 (88.1%) were also upregulated in IGC (68.2% of upregulated
proteins in IGC). Among 1106 proteins downregulated in DGC, 686 (62.0%)
were also downregulated in IGC (82.2% of downregulated proteins in
IGC). Based on the T/NAT ratio, we further identified 1897 proteins
differentially expressed between DGC and IGC (2-fold) and divided them
into six groups (Supplementary Fig. 4b). Proteins in group 1 (449
proteins) and group 2 (921 proteins) were upregulated in tumor tissues
in both DGC and IGC. Proteins in group 4 (374 proteins) and group 5
(142 proteins) were downregulated in tumor tissues in both DGC and IGC.
Most differential proteins (1886/1897) were included in these four
groups, and the pathways enriched in these four groups are shown in
Supplementary Fig. 4c. Proteins in group 3 (10 proteins) and group 6
(1 protein) had converse dysregulation directions between IGC and DGC
(Supplementary Fig. 4d). The prognostic analysis (Supplementary
Fig. 4e) revealed that GALNT3 and TRMT10C were oppositely associated
with prognostic outcomes in DGC and IGC, which deserves further study.
These results demonstrated that, compared with NATs, the directions of
most dysregulations in DGC and IGC tumor tissues were consistent, while
the magnitudes of change of the differential proteins differed between
DGC and IGC. Overall, DGC and IGC tumor tissues showed tumorous
characteristics compared to NATs, while the comparison based on T/NAT
ratio showed tumor heterogeneity between DGC and IGC.
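The overlap percentages and group totals above follow from simple set arithmetic, sketched below. The helper `overlap_stats` and the toy protein identifiers are hypothetical; only the counts in the final lines are taken from the text.

```python
def overlap_stats(set_a, set_b):
    """Return (shared count, % of set_a shared, % of set_b shared)."""
    shared = set_a & set_b
    pct_a = round(100.0 * len(shared) / len(set_a), 1)
    pct_b = round(100.0 * len(shared) / len(set_b), 1)
    return len(shared), pct_a, pct_b

# Hypothetical toy identifiers, not proteins from the study.
up_dgc = {"P1", "P2", "P3", "P4"}
up_igc = {"P2", "P3", "P4", "P5", "P6"}
print(overlap_stats(up_dgc, up_igc))

# Arithmetic check against counts reported in the text.
print(round(100 * 2212 / 2512, 1))   # share of DGC-upregulated proteins also up in IGC
print(round(100 * 686 / 1106, 1))    # share of DGC-downregulated proteins also down in IGC
group_sizes = [449, 921, 10, 374, 142, 1]  # groups 1 to 6 as reported
print(sum(group_sizes))
```

The six reported group sizes add up to the 1897 differential proteins, and groups 1, 2, 4, and 5 together account for 1886 of them.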
To further investigate the tumor heterogeneity between DGC and
IGC, we compared differential expression at multiple levels (proteome,
phospho-proteome, and TF activity). Based on the ratio of tumor tissues
to NATs, we found a total of 384 proteins differentially expressed
between DGC and IGC (two-sided Wilcoxon rank-sum test, BH adjusted
p < 0.05, ratio of DGC to IGC > 2 or < 0.5). Among them, 83 and 301
were upregulated in DGC and IGC, respectively (Fig. 3c, Supplementary
Fig. 5a, b, Supplementary Data 4a). Pathway enrichment analysis
demonstrated that the significantly upregulated proteins in DGC were
involved in immune system, complement cascade, ECM organization,
and cell migration, suggesting that TME proteins were major
components of the DGC proteome. In contrast, proteins upregulated
in IGC were mainly involved in DNA damage, ERBB signaling, metabolism,
and the VEGF signaling pathway. Additionally, 743 and 536
phospho-sites were enriched in DGC and IGC, respectively (two-sided
Wilcoxon rank-sum test, BH adjusted p < 0.05, ratio of DGC to IGC > 2
or < 0.5; Supplementary Data 4b). The pathway enrichment of the
phospho-proteome validated that ECM organization, immune system,
and metastasis played major roles in DGC progression, while
proliferation and metabolism played major roles in IGC progression
(Supplementary Fig. 5c).
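The selection rule described above (a rank-sum test combined with a ratio threshold of > 2 or < 0.5) can be sketched for a single protein as follows. This is a simplified illustration under stated assumptions: the p-value uses a normal approximation without tie or continuity correction, the group ratio is taken as the ratio of medians, and the raw p-value stands in for the BH adjusted value that the study applies across all proteins. The function names are hypothetical.

```python
import math
import statistics

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test via normal approximation.

    Ties receive midranks; no tie or continuity correction (assumption).
    Returns (U statistic for x, two-sided p-value).
    """
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mu) / sigma
    return u, math.erfc(abs(z) / math.sqrt(2.0))

def is_differential(dgc_ratios, igc_ratios, alpha=0.05):
    """Apply the p-value and ratio thresholds to one protein's T/NAT ratios."""
    _, p = rank_sum_test(dgc_ratios, igc_ratios)
    ratio = statistics.median(dgc_ratios) / statistics.median(igc_ratios)
    return p < alpha and (ratio > 2 or ratio < 0.5)

# Hypothetical T/NAT ratios for one protein (illustration only).
print(is_differential([1, 2, 3, 4, 5], [6, 7, 8, 9, 10]))
```

In a full analysis the p-values for all proteins would be collected first and adjusted together before thresholding.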
To explore the different effects of tumor microenvironment in
DGC, we compared the xCell scores35 of the DGC and IGC tumors.
The microenvironment and immune scores were higher in DGC than in
IGC (Wilcoxon rank-sum test, BH adjusted p < 0.05; Fig. 3d), indicating
a higher degree of tumor infiltration by immune cells in DGC than in
IGC. Subsequently, we compared immune cells prevalent in DGC and
IGC tumors and found that DGC tumors had higher infiltration of
CD4+ T cells, CD8+ T cells, and macrophages than the IGC tumors
(Wilcoxon rank-sum test, BH adjusted p < 0.05; Fig. 3e).
Pathway enrichment analysis showed that cell cycle-related processes
were upregulated in both DGC and IGC, but the specific signaling
pathways were different (Supplementary Fig. 5c). We then investigated
the different molecular mechanisms related to cell cycle in DGC and
IGC, to search for distinct potential drug targets. RB1 is a crucial TF
that suppresses cell cycle by inhibiting E2F in tumors. Phosphorylation
of RB1 by CDK4/6 causes the dissociation of E2F from the RB1-E2F
complex, releasing RB1-regulated cell cycle suppression36. For DGC
patients, we observed that RB1 possessed increased phosphorylation
and decreased TF activity compared with IGC patients, while E2F
activity and CDK4/6 levels were upregulated in DGC (Fig. 3f). This
integrative analysis demonstrated that RB1 was phosphorylated in
DGC, which promoted its dissociation from E2F, increased E2F
activity, and drove cell cycle progression in DGC. These results
indicated the possibility of employing the CDK4/6 complex as a
potential drug target for DGC.
For IGC patients, we performed a comprehensive investigation of
the DNA repair network by evaluating kinase activity, phospho-site,
and protein expression levels. Twelve proteins involved in DNA
damage, including MLH1, MSH3, and MSH6, were upregulated in IGC
patients (Fig. 3g, Wilcoxon rank-sum test, p < 0.05, ratio of IGC to
DGC > 2). Further investigation showed an increase in phospho-sites on
DNA damage proteins in IGC patients, including those on PARP1,
SMC3, and SSRP1. Additionally, our comparative analysis indicated
that phosphorylated ATM/ATR were upregulated in IGC patients
(Fig. 3g). Previous studies have indicated that ATM/ATR, core
components of the DNA repair network, are activated to initiate
homologous recombination repair in the event of DNA double-strand
breaks37. Thus, we can presume that DNA damage-related proteins can
serve as potential drug targets for IGC. To further validate these
findings, we examined these potential drug targets of DGC and IGC in
the TCGA cohort7. We compared the expression of CDK4/6 and ATM/ATR,
and found that the expression of CDK4 was higher in DGC and the
expression of ATR was higher in IGC (Fig. 3h). Further, prognostic
analysis showed that the expression of CDK4 and ATR was negatively
associated with clinical outcomes in DGC and IGC, respectively
(Fig. 3i). These results supported CDK4/6 and ATM/ATR as potential
targets for DGC and IGC, respectively (Fig. 3j).
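The Log-rank test used for the high versus low expression comparisons can be sketched as a two-group test built from the expected number of events at each event time. The function name, the toy follow-up times, and the event indicators are hypothetical; the p-value uses the chi-square distribution with one degree of freedom.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; events are 1 (event) or 0 (censored).

    Returns (chi-square statistic with 1 d.f., p-value).
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e == 1})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk, group A
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)  # at risk, group B
        d_a = sum(1 for ti, e, g in data if ti == t and e == 1 and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e == 1 and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += n_a * n_b * d * (n - d) / (n * n * (n - 1))
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of chi-square with 1 d.f.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p

# Hypothetical follow-up times in months, all events observed (illustration only).
chi2, p = logrank_test([5, 8, 12], [1, 1, 1], [20, 25, 30], [1, 1, 1])
print(round(chi2, 4), p < 0.05)
```

The sketch assumes both groups contribute patients at risk at some event time; otherwise the variance is zero and the statistic is undefined.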
PCAs of TF activity profiles could distinguish between the DGC
and IGC datasets (Supplementary Fig. 5d), indicating a large difference
in molecular features between DGC and IGC. We reasoned that certain
key TFs would be not only upregulated in tumor tissues in comparison
to NATs, but also elevated in particular tumor subtypes. We compared TF
activities between DGC and IGC and found that 24 TFs were
differentially activated between DGC and IGC (two-sided Wilcoxon rank-sum
Fig. 2 | A summary of proteogenomic analysis of DGC. a Genes with non-silent
variants in at least six patients (9%) are depicted on the OncoPrint. Bars on the right
of the graph show the numbers of non-synonymous mutations. b Prognosis outcomes
of corresponding gene mutations. n (with ARID1A mutant) = 7 and n (without
ARID1A mutant) = 58 biologically independent samples. The points and error bars
show the median of hazard ratio (HR) and 95% confidence interval (CI). c Cis- and
trans-effects of significantly mutated genes (y-axis) on protein level (x-axis). The
p-values are calculated by Fisher's exact test. The related biological functions and
pathways are shown at the bottom. d Altered expression of ARID1A associated with
ARID1A mutations. In proteome, n (mutant) = 7 and n (WT) = 58 biologically
independent samples. In TF activity profile, n (mutant) = 2 and n (WT) = 38 biologically
independent samples. Boxplots show median (central line), upper and lower
quartiles (box limits), min to max range. The p-values are calculated using
two-sided Student's t-test. e Pathways enriched by GSEA in ARID1A mutated patients.
Nominal p-value, calculated as phenotype based permutation test. f TLR signaling
pathway was significantly enriched in ARID1A mutated patients. g Altered expression
of proteins enriched in TLR signaling pathway. The p-values are calculated by
two-sided Wilcoxon rank-sum test. h Altered expression of CD14 associated with
ARID1A mutation. n (mutant) = 7 and n (WT) = 58 biologically independent samples.
Boxplots show median (central line), upper and lower quartiles (box limits), min to
max range. The p-value is calculated using two-sided Student's t-test. i The
prognostic outcome of CD14 in DGC. n (low) = 42 and n (high) = 23 biologically
independent samples. The p-value is calculated using Log-rank test. Source data are
provided as a Source Data file.
test, BH adjusted p < 0.05, ratio of DGC to IGC > 2 or < 0.5;
Supplementary Fig. 5e). Among them, 20 and 4 TFs were upregulated in DGC
and IGC, respectively. Subsequently, we proposed the concept of master
TFs, which could be further predicted by the enrichment of
corresponding downstream TGs based on the CellNet database38. As a result,
FOXC1 and MYC were regarded as the master TFs of DGC and IGC,
respectively (Supplementary Fig. 5f, g, Supplementary Data 4c). Further
analysis of TGs regulated by master TFs suggested that TGs of
FOXC1 were mainly involved in ECM organization (COL1A1 and
COL1A2), ECM-receptor interaction (COL4A1/2 and LAMB2), and
migration (WNT5A), which were dominant in DGC; while TGs of MYC
were mainly involved in ribosome biogenesis (IMP4 and NOP56), RNA
[Fig. 3 graphic, panels a-j. a Clinical annotation heatmap: gender, age (p = 0.023), tumor location (p = 0.0007), lymphovascular invasion (p = 0.0018), stage, Lauren subtype, proteomic subtype, TF activity subtype, immune subtype. b OS (p = 0.0309) and DFS (p = 0.0339) curves, percent survival vs days, DGC n = 83 and IGC n = 102. c Representative immune system and DNA damage proteins (Z score) in IGC, MGC, and DGC. d Microenvironment score (p = 1.83e-6) and immune score (p = 5.30e-6), IGC vs DGC. e Immune cell infiltration (CD4+ T-cells, CD4+ memory T-cells, CD8+ T-cells, CD8+ Tem, macrophages, NK cells, NKT). f RB1 protein expression, RB1 TF activity, RB1_T373 phosphorylation, CDK4, CDK6, E2F2, E2F5, E2F6 (Z score). g DNA damage proteins, phospho-sites, and KSEA (ATM, ATR). h CDK4 and ATR mRNA expression in the TCGA cohort. i Probability of survival vs months for CDK4 in DGC and ATR in IGC, TCGA cohort. j Summary scheme: CDK4/6, RB1, and E2F in DGC; ATM and ATR (double strand breaks, replication stress) in IGC.]
metabolism (PUS1 and RPL13A), and proliferation (FBL and DKC1),
which were dominant in IGC (Supplementary Fig. 5h, i). TG expression
levels of LAMB2 and FBL were associated with poor clinical outcomes
in DGC and IGC patients, respectively (Log-rank test, p < 0.05;
Supplementary Fig. 5j). The TF activity analysis results were consistent
with the finding that altered proteins were involved in proliferation
and DNA damage in IGC, and in ECM and immune response in DGC.
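The idea of predicting a master TF from the enrichment of its downstream TGs can be illustrated with a one-sided hypergeometric test. This is an assumed, commonly used over-representation approach; the source does not state which enrichment statistic was applied, and the function name and counts below are hypothetical.

```python
from math import comb

def enrichment_pvalue(n_background, n_targets, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected genes are drawn from n_background
    genes, of which n_targets are target genes of the TF (hypergeometric)."""
    total = comb(n_background, n_selected)
    upper = min(n_targets, n_selected)
    tail = sum(
        comb(n_targets, k) * comb(n_background - n_targets, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 20 background genes, 5 TGs of a TF,
# 4 upregulated genes, 3 of which are TGs (illustration only).
print(enrichment_pvalue(20, 5, 4, 3))
```

A small p-value means the upregulated set contains more of the TF's target genes than expected by chance, which is the kind of evidence that would nominate the TF as a master regulator.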
In summary, our comprehensive analysis based on multilevel
proteomics provided mechanistic insights and proposed CDK4/6
and ATM/ATR as potential targets for DGC and IGC, respectively.
Proteomic subtypes of GC and their association with clinical
outcomes
Clinically, tumor treatment largely depends on histological
examination. Increasing numbers of studies have shown that each tumor
histological type contains different molecular subtypes with different
prognoses and therapeutic targets39,40. Consensus
clustering41 based on proteins upregulated in DGC tumor tissues
compared with NATs identified three DGC proteomic subtypes: DGC
cluster 1 (n = 23), DGC cluster 2 (n = 28), and DGC cluster 3 (n = 28).
Similarly, we applied proteins upregulated in IGC tumor tissues
compared with NATs in consensus clustering and identified three IGC
proteomic subtypes: IGC cluster 1 (n = 18), IGC cluster 2 (n = 49), and
IGC cluster 3 (n = 25) (Supplementary Fig. 6a). Multivariate Cox
regression analysis suggested that these subtypes were significantly
associated with clinical outcomes, irrespective of other clinical
characteristics, including gender, age, TNM stage, and chemotherapy
(Log-rank test, p < 0.05; Fig. 4a, b). This result indicated that proteomic
subtyping could serve as an independent prognostic factor.
Notably, for DGC patients, DGC cluster 1 had the best prognosis,
whereas DGC cluster 2 and cluster 3 had worse prognoses; for IGC
patients, IGC cluster 1 had the best prognosis, whereas IGC cluster 3
had the worst prognosis.
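Consensus clustering rests on a consensus matrix that records how often two samples fall in the same cluster across repeated resampling runs. The sketch below computes only that matrix from precomputed runs; the resampling and the base clustering algorithm are omitted, and the function name, sample identifiers, and labels are hypothetical.

```python
from itertools import combinations

def consensus_matrix(runs):
    """runs: list of dicts mapping sample id -> cluster label for one
    resampling run (samples left out of a run are simply absent).

    Returns a dict mapping (i, j) with i < j to the fraction of runs
    containing both samples in which they share a cluster.
    """
    together = {}
    cosampled = {}
    for run in runs:
        for i, j in combinations(sorted(run), 2):
            cosampled[(i, j)] = cosampled.get((i, j), 0) + 1
            if run[i] == run[j]:
                together[(i, j)] = together.get((i, j), 0) + 1
    return {pair: together.get(pair, 0) / n for pair, n in cosampled.items()}

# Three hypothetical resampling runs over samples a-d (illustration only).
runs = [
    {"a": 0, "b": 0, "c": 1},
    {"a": 1, "b": 1, "d": 0},
    {"a": 0, "c": 0, "d": 1},
]
for pair, value in sorted(consensus_matrix(runs).items()):
    print(pair, value)
```

Pairs with consensus values near 1 are stable co-members, and the number of clusters is usually chosen so that the matrix is as close as possible to a block structure of zeros and ones.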
These subtypes showed distinct molecular features. We identified
2367 and 3154 differentially expressed proteins (two-sided Wilcoxon
rank-sum test, BH adjusted p < 0.05, fold change > 2) across the DGC
and IGC clusters, respectively. Among the six subtypes, DGC cluster 1
and IGC cluster 3 were characterized by cell cycle (such as CDK1/2 and
CDK6) and DNA replication (such as ORCS3 and AHCTF1); DGC cluster
2 and IGC cluster 2 featured ECM organization (such as DMD
and MUC5AC) and collagen formation and biosynthesis (such as CD36,
COL6A1, and LAMA2); many immune response-related proteins (such
as CD163, IDO1, and ICAM1) and proteins regulating neutrophil
degranulation and complement cascade (such as FCER1G, IL16, and
C5) were overrepresented in DGC cluster 3 and IGC cluster 1 (Fig. 4c,
Supplementary Fig. 6b, Supplementary Data 5a, b). Consistently,
DGC cluster 3 and IGC cluster 1 had the highest immune scores in DGC
and IGC, respectively; IGC cluster 2 had the highest stroma score in IGC
(Fig. 4b). KSEA analysis within each cluster based on tumor
phospho-proteomes revealed activation of subtype-specific kinases
(Supplementary Fig. 7a, b, Supplementary Data 5c). We found that PRKACA and
PRKCA were activated in DGC cluster 3 and IGC cluster 1, respectively;
TGFBR2 was activated in both IGC cluster 2 and DGC cluster 2; AURKB
was activated in both IGC cluster 3 and DGC cluster 1. These
observations suggested that DGC and IGC clusters exhibited distinct
characteristics, which were validated in tumor microenvironment, pathway
enrichment, and kinase enrichment. Among these subtypes, DGC
cluster 3 and IGC cluster 1 had similar biological processes, while DGC
cluster 1 and IGC cluster 3 showed more consistency.
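A kinase-substrate enrichment score of the kind reported by KSEA can be sketched with the z-score form commonly used for this analysis: the mean log2 fold change of a kinase's substrate sites is compared with the mean over all sites and scaled by the number of substrates and the overall standard deviation. The function name, site identifiers, and values are hypothetical, and the population standard deviation is an assumption of this sketch. The study's figure legends report permutation-test p-values, which are not reproduced here.

```python
import math
import statistics

def ksea_zscore(log2fc_by_site, substrate_sites):
    """z = (mean of substrates - mean of all sites) * sqrt(m) / sd of all sites."""
    values = list(log2fc_by_site.values())
    mean_all = statistics.mean(values)
    sd_all = statistics.pstdev(values)  # population SD (assumption)
    subs = [log2fc_by_site[s] for s in substrate_sites if s in log2fc_by_site]
    m = len(subs)
    return (statistics.mean(subs) - mean_all) * math.sqrt(m) / sd_all

# Hypothetical log2 fold changes for four phospho-sites (illustration only).
sites = {"s1": 2.0, "s2": 2.0, "s3": 0.0, "s4": 0.0}
print(ksea_zscore(sites, ["s1", "s2"]))
```

A positive score suggests that the kinase's known substrates are more phosphorylated than the background, which is read as higher inferred kinase activity.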
We observed a contrasting phenomenon in this study wherein
DGC and IGC patients with upregulated cell cycle- and immune
response-related proteins exhibited different clinical outcomes. DGC
cluster 1, featuring cell cycle, had better prognosis than DGC cluster
3, featuring immune response-related processes. Conversely, IGC
cluster 3, with molecular features similar to DGC cluster 1, had poor
prognosis (Fig. 4a). ssGSEA analysis in these clusters further
demonstrated that the NESs (normalized enrichment scores) for cell cycle- and
immune response-related signaling pathways were associated with
different clinical outcomes in DGC and IGC (Fig. 4d). Regulation of
spindle assembly and DNA unwinding involved in DNA replication
were associated with favorable prognosis in DGC. However, the
regulation of mitotic cell cycle and regulation of cell cycle phase transition
were associated with unfavorable prognosis in IGC (Log-rank test,
p < 0.05). Moreover, alternative complement pathway activation and
CD4+ α/β T cell activation were associated with favorable prognosis in
IGC, but macrophage migration and leukocyte aggregation were
associated with unfavorable prognosis in DGC (Log-rank test, p < 0.05).
Further, we validated this finding using the transcriptomic dataset of
the TCGA GC cohort7. We performed ssGSEA, calculated the NES of
pathways for every patient, and analyzed the prognostic effects of signaling
pathways based on the correlation between pathway NES and
clinical outcomes. We found that regulation of cell cycle phase
transition was a prognostically unfavorable pathway in IGC but a
favorable pathway in DGC. Conversely, leukocyte aggregation was a
prognostically unfavorable pathway in DGC but a favorable pathway
in IGC. These results validated that cell cycle- and immune
response-related signaling pathways were associated with opposite
clinical outcomes between DGC and IGC (Supplementary Fig. 6c).
As cell cycle status impacts the sensitivity of patients to adjuvant
chemotherapy42, we compared the prognoses of patients who
underwent adjuvant chemotherapy and those who did not in each subtype.
We found that DGC cluster 1 patients were insensitive to adjuvant
chemotherapy, whereas IGC cluster 3 patients were sensitive (Fig. 4e,
Supplementary Fig. 7c). To evaluate whether tumor cell cycle phases
affected the sensitivity of patients to chemotherapy, we performed
further statistical analysis and found that DGC cluster 1 had the highest
percentage of patients with upregulated S phase signature proteins,
whereas IGC cluster 3 had the highest percentage of patients with
upregulated G2M phase transition signature proteins (Fisher's exact
test, p = 0.0128; Fig. 4f, Supplementary Data 5d). As reported, S and
G2M phases are characterized by DNA replication and cell division,
Fig. 3 | Integrated multilevel proteomic analyses showed different pathogenic
mechanism of DGC and IGC. a The association of Lauren classification with clinical
information. Two-sided Fisher's exact test is used for categorical variables. b The
association of Lauren classification with clinical outcomes. n (DGC) = 83 and n
(IGC) = 102 biologically independent samples. P-values are from Log-rank test.
c Representative differentially expressed proteins in the featured pathways of DGC
and IGC. d Microenvironment scores and immune scores of DGC and IGC. n
(DGC) = 83 and n (IGC) = 100 biologically independent samples. Violin plots show
median and interquartile range. The p-values are from two-sided Wilcoxon
rank-sum test. e Comparison of immune cell infiltration between DGC and IGC.
Two-sided Wilcoxon rank-sum test is used. The Benjamini-Hochberg (BH) adjusted
p-values are 0.019 (NK cells), 0.0016 (NKT), 1.71E-9 (CD4+ T-cells), 1.89E-12
(CD4+ memory T-cells), 0.0017 (CD8+ T-cells), 0.00056 (CD8+ Tem), and 4.25E-7
(Macrophages). f Integrated analysis of cell cycle regulation pathway at protein,
kinase, TF activity and phospho-site levels in DGC and IGC. g Integrated analysis of
DNA mismatch repair pathway at protein, kinase and phospho-site levels in DGC
and IGC. h The expression of CDK4 and ATR in DGC and IGC. n (DGC) = 10 and
n (IGC) = 16 biologically independent samples. Boxplots show median (central line),
upper and lower quartiles (box limits), min to max range. The p-values are
calculated by two-sided Wilcoxon rank-sum test. i The prognostic analyses of CDK4 and
ATR in TCGA cohort. In comparison of CDK4, n (high expression) = 9 and n (low
expression) = 13 biologically independent samples. In comparison of ATR, n (high
expression) = 32 and n (low expression) = 42 biologically independent samples. The
p-value is calculated using Log-rank test. j Summary of signature proteins and
pathways involved in DGC and IGC. ****p < 1.0e-4, ***p < 1.0e-3, **p < 0.01, *p < 0.05.
Source data are provided as a Source Data file.
[Fig. 4 graphic, panels a-i. a Probability of survival vs days for DGC clusters (C1 n = 23, C2 n = 28, C3 n = 28) and IGC clusters (C1 n = 18, C2 n = 49, C3 n = 25). b Clinical annotation heatmap: proteomic subtype, gender, stage, tumor location, lymphovascular invasion, age, Ki67, signet ring cells, immune score, stroma score, microenvironment score. c Pathway heatmap (Z score): cell cycle, S phase, DNA replication, extracellular matrix organization, collagen formation, collagen biosynthesis and modifying enzymes, immune system, neutrophil degranulation, regulation of complement cascade. d Log2(HR) forest plots for cell cycle and immune pathways. e Log2(HR) of chemotherapy for DFS by cluster in DGC and IGC. f Percent of patients in G1, S, and G2M phases per proteomic subtype. g Cell cycle scheme with overexpressed proteins and phospho-sites (CDK1 and CDK2 substrates) in DGC cluster 1 and IGC cluster 3. h Kinase Z-scores (CDK2, MAPK1, MAPK3, CDK5, CDK7, CDK1, CSNK2A1). i Percent survival vs days in CDK1 high CDK2 low samples, chemotherapy n = 19 and no chemotherapy n = 7 (p = 0.0324).]
Fig. 4 | Proteomic subtyping of GC and associations with clinical outcomes.
a The association of proteomic subtypes with clinical outcomes in DGC and IGC. n
(DGC cluster 1) = 23, n (DGC cluster 2) = 28, n (DGC cluster 3) = 28, n (IGC cluster
1) = 18, n (IGC cluster 2) = 49, and n (IGC cluster 3) = 25 biologically independent
samples. P-values are from Log-rank test. b Clinical characteristics annotation in GC
proteomic subtypes. c Pathways that are significantly enriched in the proteomic
subtypes. d GSEA revealed the cell cycle and immune related pathways are enriched in
the proteomic subtypes and have opposite prognoses between DGC and IGC.
n (DGC) = 79 and n (IGC) = 92 biologically independent samples. The points and
error bars show the median of hazard ratio (HR) and 95% confidence interval (CI).
e The association of chemotherapy with DFS in each GC proteomic subtype.
Chemotherapy: n (DGC cluster 1) = 19, n (DGC cluster 2) = 22, n (DGC cluster 3) = 22,
n (IGC cluster 1) = 12, n (IGC cluster 2) = 36, and n (IGC cluster 3) = 20 biologically
independent samples. No chemotherapy: n (DGC cluster 1) = 4, n (DGC cluster
2) = 6, n (DGC cluster 3) = 7, n (IGC cluster 1) = 6, n (IGC cluster 2) = 13, and n (IGC
cluster 3) = 5 biologically independent samples. The points and error bars show the
median of hazard ratio (HR) and 95% confidence interval (CI). f Percentage of
patients with different cell cycle phases. The p-value is calculated by Fisher's exact
test. g Summary of cell cycle regulation in DGC cluster 1 and IGC cluster 3. Proteins
involved in DNA replication and cell division in DGC cluster 1 and IGC cluster 3,
phospho-sites of CDK1 and CDK2 substrates in DGC cluster 1 and IGC cluster 3 are
shown, respectively. h KSEA analysis of CDKs kinase activities in DGC cluster 1 and
IGC cluster 2. Kinases with p-value < 0.05 (permutation test) are colored in red
(CDK2, p-value = 0.041) or blue (CDK1, p-value = 0.049). i The association of
chemotherapy with DFS in GC patients with high CDK1 and low CDK2 level.
n (chemotherapy) = 19 and n (no chemotherapy) = 7 biologically independent samples.
The p-value is from Log-rank test. Source data are provided as a Source Data file.
respectively43. We further compared the key proteins involved in DNA
replication and cell division, and found that these proteins had reverse
expression patterns in DGC cluster 1 and IGC cluster 3 patients
(two-sided Wilcoxon rank-sum test, p < 0.05, fold change > 2; Fig. 4g,
Supplementary Data 5e). Thus, chemotherapy treatment strategies
should be devised after considering the cell cycle phases of the tumor
cells. We propose that proteins involved in DNA replication are worth
considering as therapeutic targets for DGC cluster 1 (Fig. 4g).
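The Fisher's exact test used to compare cell cycle phase distributions between subtypes can be illustrated on a 2 x 2 table, for example two subtypes against two phases. The comparison in the study may involve a larger table, so this is a reduced sketch; the function name and counts are hypothetical.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum probabilities of all tables at least as extreme as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Hypothetical counts: rows are two subtypes, columns are S phase and
# G2M phase patients (illustration only).
print(fisher_exact_2x2(3, 1, 1, 3))
```

The small tolerance in the comparison guards against floating-point ties between tables that are equally probable.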
Comparison of the DGC cluster 1 and IGC cluster 3
phospho-proteomes revealed the activation of subtype-specific kinases, such as
CDK2 and CDK1, respectively (Fig. 4h). Further investigation into the
differential phospho-sites showed an increase in CDK2 substrates
and a decrease in CDK1 substrates in DGC cluster 1 (two-sided
Wilcoxon rank-sum test, p < 0.05, fold change > 2; Fig. 4g,
Supplementary Data 5f). Survival analysis revealed that CDK2 was associated
with good prognosis in DGC and with poor prognosis in IGC (Log-rank
test, p < 0.05; Supplementary Fig. 7d). These observations
demonstrated that the activity of CDKs, particularly CDK1 and CDK2, was
associated with diverse prognoses among GC patients. As previously
reported, CDK1 promotes G2-M transition, whereas CDK2 promotes
DNA replication and S phase transition43. Our findings for DGC cluster 1
and IGC cluster 3 were consistent with these observations. As cell
proliferation status and cell cycle phases affect a patient's response
to chemotherapy, we attempted to predict the chemotherapeutic
response in GC patients based on CDK1 and CDK2 expression levels. In
our GC cohort, we found that patients with high CDK1 and low CDK2
levels benefited from adjuvant chemotherapy (Log-rank test, p < 0.05;
Fig. 4i), indicating that CDK1 and CDK2 levels could serve as
biomarkers to gauge chemotherapeutic response in GC patients.
In summary, our proteomic subtypes showed converse correlations
between protein features and prognoses in DGC and IGC,
which provides guidance for patient stratification and therapeutic
strategies in the clinic.
TF activity profiles and their clinical relevance
Screening the DNA-binding activity of TFs in GC can advance our
understanding of GC heterogeneity. Although proteome profiling has
made great progress in precision oncology, the existing
strategies for quantifying changes in TF activities have certain
limitations44. The sub-proteome consisting of TFs is usually neglected
in cancer proteomics because of the low abundance of TFs. Therefore, we
detected and evaluated inferred TF activities at the proteomic level by
the TFRE approach, which we previously developed16,17. We constructed the
TF activity profiles for the 196 pairs of GC tumor tissues and NATs.
We performed cluster analysis of the TF activity profiles of DGC and
IGC with 425 and 396 TFs detected in >50% of DGC and IGC patients,
respectively, and identified two subtypes in each dataset
(Supplementary Fig. 8a, b). Further analysis of the TF activity-based subtypes
demonstrated their significant correlation with patients' survival
(Log-rank test, p < 0.05), indicating the prognostic power of clustering TF
activity profiles (Fig. 5a). For convenience, TF activity-based subtypes
were designated as DGC TF cluster 1 (n = 40), DGC TF cluster 2 (n = 43),
IGC TF cluster 1 (n = 42), and IGC TF cluster 2 (n = 60), respectively.
Evaluation of the clinical features of TF activity-based subtypes
revealed that DGC TF cluster 2 comprised more patients with
lymphovascular invasion (55.3% in cluster 1 and 75.6% in cluster 2) and
a higher proportion of antrum tumors (26.3% in cluster 1 and 46.7% in
cluster 2) than DGC TF cluster 1 (Fig. 5b). The IGC TF cluster 2
comprised fewer stage I GC patients (2.6% in cluster 1 and 8.9% in
cluster 2) than IGC TF cluster 1.
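The detection filter applied before clustering (TFs detected in >50% of patients) can be sketched as below. The function name, the use of `None` for an undetected TF, and the toy values are assumptions made for illustration.

```python
def filter_by_detection(activity, min_fraction=0.5):
    """Keep TFs detected in more than min_fraction of patients.

    activity: dict mapping TF name -> list of per-patient values,
    with None where the TF was not detected (assumed encoding).
    """
    kept = {}
    for tf, values in activity.items():
        detected = sum(v is not None for v in values)
        if detected / len(values) > min_fraction:
            kept[tf] = values
    return kept

# Hypothetical TF activity values for four patients (illustration only).
toy = {
    "TF_A": [1.2, None, 0.8, 0.5],    # detected in 3/4 patients
    "TF_B": [None, None, 0.3, None],  # detected in 1/4 patients
    "TF_C": [0.1, None, None, 0.4],   # detected in exactly 2/4 patients
}
print(sorted(filter_by_detection(toy)))
```

Because the threshold is strict, a TF detected in exactly half of the patients is excluded.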
Subsequently, we identified the master TFs in each TF activity-based
subtype (Fig. 5c, Supplementary Fig. 8c-e, Supplementary
Data 6a). We found that NFKB2 dominated in IGC TF cluster 1;
SMARCE1 and TFAP4 dominated in IGC TF cluster 2; MLX and
SMARCC1 dominated in DGC TF cluster 1; NFKB1, RELA, and IRF2
dominated in DGC TF cluster 2. In total, the NFKB complex was nominated
as the master TF in IGC TF cluster 1 and DGC TF cluster 2; the SWI/SNF
complex was nominated as the master TF in IGC TF cluster 2 and DGC
TF cluster 1. Notably, for DGC patients, DGC TF cluster 1 had better
prognosis, whereas DGC TF cluster 2 had worse prognosis; for IGC
patients, IGC TF cluster 1 had better prognosis, whereas IGC TF cluster
2 had worse prognosis (Log-rank test, p < 0.05). These results showed
the diverse prognostic correlations of the NFKB complex and SWI/SNF
complex in DGC and IGC.
Pathway enrichment analysis of TGs demonstrated that the master
TFs regulated different biological functions in different clusters
(Fig. 5d, Supplementary Data 6b). For example, the NFKB complex
was involved in Rho protein signal transduction and platelet activation in
IGC TF cluster 1, while it was involved in immune response, CAMs
translation, and cell migration in DGC TF cluster 2. On the other hand, the
SWI/SNF complex was involved in translation and cell cycle progression in
IGC TF cluster 2, while it was involved in RNA splicing and DNA replication
in DGC TF cluster 1 (Fig. 5e, f).
A question that we posed was why master TFs could regulate
a different set of genes in different subtypes. As phosphorylation
is a fundamental mechanism to regulate TF activities, we explored
the effect of phosphorylation on master TFs based on the
kinase-substrate network. We compared the phosphorylation levels of
these TFs and found that phosphorylation of NFKB1 at S907,
S937, S939, and S941 was increased in DGC TF cluster 2, while
phosphorylation of TFAP4 at S124 was increased in IGC TF cluster
2 (Wilcoxon rank-sum test, BH adjusted p < 0.05, fold change > 2;
Fig. 5g). Subsequently, we screened for kinases that were possibly
responsible for these five phospho-sites by correlation analysis.
We found that 33 kinases had a significant positive correlation with
these five phospho-sites (Spearman's correlation coefficient > 0,
p < 0.05; Fig. 5h, Supplementary Data 6c). The signal transduction
network of TF activity-based subtypes is depicted in Fig. 5i.
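The selection rule quoted above (BH adjusted p < 0.05 and fold change > 2) can be illustrated with a short sketch. It assumes that raw p-values and fold changes have already been computed per phospho-site; the function names, site labels and numbers are hypothetical and are not results from this study.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def select_sites(sites, p_values, fold_changes, alpha=0.05, min_fc=2.0):
    """Keep sites with BH-adjusted p < alpha and fold change > min_fc."""
    adjusted = bh_adjust(p_values)
    return [s for s, q, fc in zip(sites, adjusted, fold_changes)
            if q < alpha and fc > min_fc]

# Hypothetical site labels and values, for illustration only.
sites = ["site_A", "site_B", "site_C", "site_D"]
raw_p = [0.01, 0.04, 0.03, 0.005]
fold = [3.0, 2.5, 1.5, 4.0]
print(select_sites(sites, raw_p, fold))
```

Applying the fold-change filter after the multiple-testing correction, as here, keeps the adjusted p-values independent of the chosen fold-change cutoff.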
In DGC TF activity cluster 2, the kinase activity of IKBKE was
correlated with phospho-site S941 of NFKB1. This indicated
that IKBKE activated NFKB1, which was consistent with
previous studies45. In IGC, ATM/ATR activity had a significantly
positive correlation with TFAP4 (phosphorylated at S124), which
was associated with the expression of cell division-related proteins.
As shown in Fig. 3g, we found that ATM/ATR had higher activities in
IGC than in DGC. These observations indicated a potential role of
ATM/ATR in regulating cell division in DGC and IGC oncogenesis
via the activation of distinct downstream TFs such as TFAP4.
Here, we elucidated the roles of TF complexes in pathological
processes and presented the kinase-TF-target gene network in
DGC and IGC subtypes by integrating multilevel proteomic
data (Fig. 5i).
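The kinase-to-phospho-site screen relies on Spearman's rank correlation. A minimal sketch of the coefficient follows, with tied values receiving average ranks; the helper names and the toy vectors are assumptions for illustration, and the accompanying significance test (p < 0.05 in the text) is not reproduced here.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical kinase activity and phospho-site abundance across five samples.
kinase_activity = [0.2, 0.5, 0.9, 1.4, 2.0]
site_abundance = [1.1, 0.9, 1.8, 1.6, 2.5]
print(round(spearman_rho(kinase_activity, site_abundance), 3))
```

A positive coefficient alone indicates co-variation rather than a direct kinase-substrate relationship, which is why the text also draws on the kinase-substrate network.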
Additionally, we found that the master TFs were correlated with
prognoses among patients treated with or without adjuvant
chemotherapy. For example, patients with higher activity of SMARCC1
in IGC or lower activity of NFKB1 in DGC who received adjuvant
chemotherapy had a good prognosis (Log-rank test, p < 0.05). Thus,
IGC patients with high SMARCC1 activity and DGC patients with low
NFKB1 activity could benefit from chemotherapy (Supplementary
Fig. 8f). Moreover, our findings concurred with previous reports that
NFKB1 is involved in resistance to chemotherapy and radiotherapy46,
indicating that NFKB1 and SMARCC1 could be potential biomarkers for
GC diagnosis and for selection of an effective treatment strategy.
Characteristics of multilevel proteomic subtyping and its
robustness
We performed consensus clustering analysis based on phospho-proteomic
data. We used phospho-sites detected in >50% of DGC and IGC patients,
corresponding to 4484 and 4739 phospho-proteins, respectively, in
consensus clustering and identified three DGC
Article https://doi.org/10.1038/s41467-023-35797-6
Nature Communications | (2023) 14:835 10
Content courtesy of Springer Nature, terms of use apply. Rights reserved
phospho-proteomic subtypes and three IGC phospho-proteomic
subtypes. We designated the subtypes as DGC phospho-proteomic
cluster 1 (n = 27), DGC phospho-proteomic cluster 2 (n = 37), and DGC
phospho-proteomic cluster 3 (n = 16) in DGC; and IGC phospho-proteomic
cluster 1 (n = 27), IGC phospho-proteomic cluster 2 (n = 26),
and IGC phospho-proteomic cluster 3 (n = 30) in IGC, respectively
(Supplementary Fig. 9). Then, we summarized the subtyping results
from the individual datasets. As shown in Supplementary Table 2, except
for the correspondence between phospho-proteomic subtypes and TF
activity-based subtypes in IGC, the classification concordance among
subtypes based on the three datasets was significant in every
comparison (chi-square test, p < 0.05). These results demonstrated
[Fig. 5 graphic. Panels a–i: Kaplan–Meier curves of TF activity-based subtypes (DGC: C1 n = 38, C2 n = 45, p = 0.022; IGC: C1 n = 40, C2 n = 62, p = 0.037); clinical annotation tracks; master TF selection; pathway enrichment (−Log10 FDR); target-gene abundance heatmaps for DGC and IGC (z score); phospho-site volcano plot (Log10 fold change versus −Log10 p value); kinase to phospho-site Spearman correlation heatmap; phospho-regulatory network.]
that our TF activity-based subtypes, proteomic subtypes and
phospho-proteomic subtypes had high classification concordance (Fig. 6a).
To explore the characteristics of each phospho-proteomic
subtype and the correspondence between proteomic subtypes and
phospho-proteomic subtypes, differentially expressed phospho-sites
(Wilcoxon rank-sum test, BH adjusted p < 0.05, fold change > 2) were
identified, and pathway enrichment analysis was performed. As shown
in Supplementary Fig. 9b, c, DGC phospho-proteomic cluster 1 was
characterized by RNA splicing, cell cycle, DNA repair and RHO GTPase
cycle, which corresponded to the characteristics of DGC proteomic
subtype cluster 1; DGC phospho-proteomic cluster 2 was characterized
by cytoskeleton organization, which resembled DGC proteomic cluster 2;
DGC phospho-proteomic cluster 3 was characterized by cadherin binding
and cell adhesion molecule binding, which resembled DGC proteomic
cluster 3; IGC phospho-proteomic cluster 1 was characterized by
cytoskeleton organization and actin cytoskeleton organization, which
corresponded to the characteristics of proteomic subtype cluster 2; IGC
phospho-proteomic cluster 2 was characterized by RNA splicing and
DNA repair; and IGC phospho-proteomic cluster 3 was characterized
by cell cycle, which corresponded to the characteristics of proteomic
subtype cluster 3.
In addition, we explored the association between TF activity-based
subtypes and proteomic subtypes. We found that 65% (15 out of 23) of
the patients in DGC proteomic subtype cluster 1 belonged to DGC TF
activity subtype cluster 1; 96% (27 out of 28) of those in DGC proteomic
subtype cluster 3 belonged to DGC TF activity subtype cluster 2;
61% (11 out of 18) of those in IGC proteomic subtype cluster 1 belonged
to IGC TF activity subtype cluster 1; and 84% (21 out of 25) of those in
IGC proteomic subtype cluster 3 belonged to IGC TF activity subtype
cluster 2. These results showed that the proteomic subtypes overlapped
significantly with the TF activity subtypes. Interestingly, patients in
DGC proteomic subtype cluster 2 were split between the two TF activity
subtypes (18 patients (64%) in TF activity subtype cluster 1 and
10 patients (36%) in TF activity subtype cluster 2, Fig. 6b).
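The concordance between two subtype assignments can be checked with a chi-square test of independence on their cross-tabulation. The sketch below builds a DGC table from the counts quoted above (15 of 23, 18 and 10, 27 of 28); the complementary counts (8 and 1) assume that every patient in those proteomic clusters was assigned to one of the two TF activity clusters, which the text does not state explicitly. The function names are illustrative, and the result is not meant to reproduce the values in Supplementary Table 2.

```python
import math

def chi2_sf(x, dof):
    """Upper-tail probability of a chi-square variable with dof degrees of freedom."""
    if x <= 0:
        return 1.0
    a, half_x = dof / 2.0, x / 2.0
    # Power series for the regularized lower incomplete gamma function P(a, x/2).
    term = 1.0 / a
    total = term
    n = 1
    while term > total * 1e-15:
        term *= half_x / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

def chi2_independence(table):
    """Pearson chi-square test on a contingency table given as a list of rows."""
    row_sums = [sum(r) for r in table]
    col_sums = [sum(c) for c in zip(*table)]
    total = sum(row_sums)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_sums) - 1) * (len(col_sums) - 1)
    return stat, dof, chi2_sf(stat, dof)

# Rows: DGC proteomic clusters 1-3; columns: DGC TF activity clusters 1-2.
# The second entry of rows 1 and 3 is inferred (assumption, see lead-in).
dgc_table = [[15, 8], [18, 10], [1, 27]]
stat, dof, p_value = chi2_independence(dgc_table)
print(f"chi2={stat:.2f}, dof={dof}, p={p_value:.2e}")
```

With small expected counts in some cells, an exact test may be the more cautious choice; the chi-square form is shown because it is the test named in the text.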
For DGC proteomic subtype cluster 2, we further explored the
clinical and molecular differences between the two TF activity-based
subtypes. As expected, patients in TF activity subtype cluster 2 had
worse prognosis (Fig. 6c) and were characterized by higher TF activity
of NFKB1 and lower TF activity of SMARCC1 (Fig. 6d). Further prognostic
analysis of TF activities revealed that the TF activity of NFKB1 was
negatively correlated with prognosis, and the TF activity of SMARCC1
was positively correlated with prognosis (Fig. 6e). Accordingly,
combining these two TF activities distinguished patients with
poor prognosis (high activity of NFKB1 and low activity of
SMARCC1) from those with good prognosis (high activity of
SMARCC1 and low activity of NFKB1), showing good prognostic
predictive capacity (Supplementary Fig. 10a). Finally, we compared the
expression of TGs of NFKB1 and SMARCC1 based on the proteomic dataset
(Fig. 6f, g). We found that SMARCC1's target genes, involved in RNA
splicing and DNA replication, were upregulated in TF activity subtype
cluster 1; NFKB1's target genes, related to immune response, were
upregulated in TF activity subtype cluster 2. These results were
consistent with pathway enrichment in DGC phospho-proteomic subtype
cluster 1 and cluster 2 (Supplementary Fig. 10b). Overall, the integrated
subtyping results suggested that proteomic subtypes coupled with TF
activity analysis could be exploited for prognostic prediction and
combinatorial therapeutic strategy.
To validate the robustness of the proteomic subtyping, Mun's
cohort10, which was subtyped into Prot 1 (immune response related
processes), Prot 2 (actin cytoskeleton and cadherin signaling), Prot 3
(metabolism), and Prot 4 (RNA processing), was used as an independent
validation cohort. Based on the 200 most representative proteins
of each proteomic subtype identified in our cohort, Mun's cohort
was reanalyzed and clustered into three proteomic subtypes: subtype
1 (n = 23), subtype 2 (n = 24), and subtype 3 (n = 27) (Supplementary
Fig. 10b, Supplementary Data 6d). The signature proteins and
subtype-specific pathways (subtype 1: spliceosome, corresponding to
Prot 3 and 4; subtype 2: ECM organization, corresponding to Prot 2; and
subtype 3: immune response, corresponding to Prot 1) are shown in
Supplementary Fig. 10c–e. We performed a chi-square test to assess the
classification concordance between our proteomic subtypes and Mun's
subtypes. The classification concordance was significant
(chi-square test, p < 0.05, Supplementary Fig. 10f). This high
concordance demonstrated that the expression pattern of the signature
proteins dominant in our subtyping could also be observed in Mun's
cohort, supporting the reliability of our subtyping.
To further validate the classification power of TF activity, we used
a Bayesian algorithm to assign patients in Mun's cohort to two
TF subtypes (NFKB1 subtype and SMARCC1 subtype)47. In Mun's
cohort, 24 and 28 cases were identified as the NFKB1 subgroup and
the SMARCC1 subgroup, respectively (Fig. 6h, i). As shown in
Supplementary Fig. 10e, we found that the patients in subtype 2 of the
proteomic subtypes were assigned to both TF activity subtypes (9 patients
in the SMARCC1 subtype and 8 patients in the NFKB1 subtype) in Mun's
cohort. We observed a similar association between TF activity
subtypes and proteomic subtype cluster 2 in our cohort and in Mun's
cohort. Also, statistical analysis showed that the classification
concordance between proteomic subtypes and TF subtypes in Mun's cohort
was significant (chi-square test, p < 0.05, Supplementary Fig. 10f).
These results showed that our TF activity subtypes were robust and
were supported by the published GC dataset.
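The text does not specify which Bayesian algorithm was used beyond the citation, so the following is only a generic sketch of how a two-class Bayesian predictor can return a probability for each TF subtype from two features. It swaps in a Gaussian naive Bayes model as an assumption; the function names, the feature layout (NFKB1 value, SMARCC1 value) and the toy training values are hypothetical.

```python
import math

def fit_gaussian_nb(samples, labels):
    """Estimate per-class prior, feature means and feature variances."""
    model = {}
    for label in sorted(set(labels)):
        rows = [s for s, l in zip(samples, labels) if l == label]
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small floor keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (n / len(samples), means, variances)
    return model

def posterior(model, sample):
    """Posterior probability of each class for one sample."""
    log_scores = {}
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for x, m, v in zip(sample, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        log_scores[label] = score
    top = max(log_scores.values())
    weights = {k: math.exp(v - top) for k, v in log_scores.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

# Hypothetical training values: (NFKB1 feature, SMARCC1 feature) per patient.
train = [(0.8, -0.5), (0.6, -0.3), (0.9, -0.6),
         (-0.4, 0.7), (-0.6, 0.5), (-0.5, 0.9)]
labels = ["NFKB1"] * 3 + ["SMARCC1"] * 3
model = fit_gaussian_nb(train, labels)
print(posterior(model, (0.7, -0.4)))
```

Each patient would then be assigned to the subtype with the larger posterior probability, which matches the two probability tracks shown for the predictor in Fig. 6h.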
Immune characterization of GC tumors
The tumor microenvironment (TME) comprises tumor cells,
cancer-associated fibroblasts, infiltrating immune cells, and endothelial
cells18. Several studies have indicated that the TME influences cancer
progression and therapeutic responses in patients19. Although recent
advances in immunotherapy and targeted drug therapy in treating GC
patients have improved patient prognosis, these therapies are effective
only for a subset of patients. It is imperative to identify indicators
of immunotherapeutic effectiveness.
To better understand immune cell infiltration in GC
tumors, we performed xCell35 analysis of the proteomic data to infer
the relative abundance of diverse cell types in the TME (Fig. 7,
Supplementary Fig. 11). Consensus clustering based on inferred cell
proportions identified the following three sets of tumors with distinct
immune signatures and stromal features: immune cluster 1 (n = 69),
immune cluster 2 (n = 65), and immune cluster 3 (n = 49; Fig. 7a, b,
Supplementary Fig. 11a). We found that immune cluster 1 had lower
immune and stroma scores (ANOVA, p < 0.001) and a higher
proportion of epithelial cells than the other clusters. As expected, ssGSEA
Fig. 5 | DGC and IGC subtypes based on TF activity profiles. a The association of
TF activity-based subtypes with clinical outcomes in DGC and IGC. n (DGC cluster
1) = 38, n (DGC cluster 2) = 45, n (IGC cluster 1) = 40, and n (IGC cluster 2) = 62
biologically independent samples. P-values are from Log-rank test. b Clinical
characteristics annotation in GC TF activity-based subtypes. c Master TFs selection
in each TF activity-based subtype. d Pathway enrichment analysis of master TFs
regulated TGs in each TF activity-based subtype. e A list of TGs regulated by master
TFs in significantly altered pathways and their abundance in each DGC TF
activity-based subtype. f A list of TGs regulated by master TFs in significantly
altered pathways and their abundance in each IGC TF activity-based subtype.
g Expression of phospho-sites in each TF activity-based subtype. The p-values are
from Wilcoxon rank-sum test. Red and orange colors, upregulated phospho-sites in
DGC cluster 2 and IGC cluster 2, respectively. h Spearman's correlation coefficients
between kinases and phospho-sites upregulated in DGC cluster 2 and IGC cluster 2.
i Phospho-regulatory network in GC. Source data are provided as a Source Data file.
analysis indicated that epithelial cell morphogenesis and positive
regulation of mitotic cell cycle phase transition were elevated in immune
cluster 1. Furthermore, canonical markers of epithelial cells, i.e.,
EPCAM, KRT18, MUC1, and CDH1, had higher expression in
immune cluster 1 than in the other clusters (Fig. 7a, Supplementary
Fig. 11b–d). Immune cluster 2 had higher immune and microenvironment
scores (two-sided ANOVA, p < 0.001), higher proportions of
CD4+ T cells, neutrophils, and macrophages, and a lower proportion of
natural killer T cells than the other clusters. This observation was
consistent with the elevation of pathways such as complement activation
and regulation of NIK/NFKB signaling in immune cluster 2.
Moreover, canonical markers of macrophages, i.e., TLR2 and ARG1 (ref. 48),
and immunotherapeutic targets, i.e., FCGR1A, CD276, and CD27 (ref. 20), had
higher expression in immune cluster 2 than in the other clusters (Fig. 7a,
Supplementary Fig. 11b–e, Supplementary Data 7a). As for immune
cluster 3, we observed a higher stroma score (two-sided ANOVA,
p < 0.001) and higher proportions of fibroblasts, lymphatic endothelial
cells, and microvascular endothelial cells than in the other clusters.
Fibroblast proliferation, ECM assembly and regulation of actin
filament-based movement were enriched in immune cluster 3. The canonical
marker of endothelial cells, DCN, had higher expression in immune
cluster 3 than in the other clusters (Fig. 7a, Supplementary Fig. 11b–d). Thus,
[Fig. 6 graphic. Panels a–i: Sankey diagrams linking TF activity, proteomic and phospho-proteomic subtypes; Kaplan–Meier curves within DGC proteomic subtype cluster 2 (TF activity subtype cluster 1 n = 18, cluster 2 n = 10, p = 0.050); TF activity heatmap (z score); forest plot of Log2(HR) per TF; target-gene expression heatmap for NFKB1 and SMARCC1 (z score); pathway enrichment (−Log10 FDR); predicted probabilities of the NFKB1 and SMARCC1 subtypes; boxplots of NFKB1 and SMARCC1 Log10(T/NAT ratio) in the SMARCC1 subtype (p = 0.0006) and the NFKB1 subtype (p = 0.0025).]
Fig. 6 | Characteristics of Multilevel Proteomic Subtyping and its Robustness.
a Sankey diagram depicting the association of samples classified into TF activity,
proteome and phospho-proteome-based subtypes. b Sankey diagram depicting the
association of samples classified into DGC proteomic cluster 2 and DGC TF
subtypes. c Prognostic outcomes of GC patients in DGC proteomic subtype cluster 2.
n (cluster 1) = 18 and n (cluster 2) = 10 biologically independent samples. The
p-value is from Log-rank test. d TF activities comparison between two TF activity
subtypes. e Prognostic outcomes of TFs with significantly differential activities in
two TF activity subtypes. n = 28 biologically independent samples. The points and
error bars show the median of hazard ratio (HR) and 95% confidence interval (CI).
f Proteins expression of target genes of two TFs. g Pathways enriched in two
subgroups of DGC proteomic subtype cluster 2. h Performance of the TF subtype
predictor based on NFKB1 and SMARCC1. i The expression of NFKB1 and SMARCC1
in two subgroups. n (SMARCC1 subtype) = 28 and n (NFKB1 subtype) = 24
biologically independent samples. Boxplots show median (central line), upper and
lower quartiles (box limits), min to max range. The p-value is calculated using
two-sided Student's t-test. Source data are provided as a Source Data file.
the immune subtypes were defined as epithelial subtype (cluster 1,
cold tumor), immune subtype (cluster 2, hot tumor), and endothelial
subtype (cluster 3, Fig. 7b).
Multivariate Cox regression analysis revealed that immune subtypes
(clusters 1–3) were associated with prognoses after adjusting for other
clinical covariates (Log-rank test, p < 0.05; Fig. 7c, d). Interestingly, we
found that DGC and IGC patients in immune cluster 3 exhibited an
opposite prognostic trend. For IGC patients, immune cluster 3 had the
best prognosis, while for DGC patients, immune cluster 3 had the worst
prognosis (the red lines; Fig. 7c, d). To address this issue, we compared
tumor infiltration of immune cells in DGC and IGC patients in immune
cluster 3. The common lymphoid progenitor, NK cells, and Th2 cells
[Fig. 7 graphic. Panels a–j: heatmap of clinical annotations, xCell scores and ssGSEA pathways across the epithelial, immune and endothelial subtypes (z score); contour plot of immune score versus stroma score; Kaplan–Meier DFS curves by immune subtype for DGC (cluster 1 n = 20, cluster 2 n = 46, cluster 3 n = 17, p = 0.040) and IGC (cluster 1 n = 49, cluster 2 n = 19, cluster 3 n = 32, p = 0.030); volcano plot of immune cell infiltration in DGC versus IGC (Log2 fold change versus −Log10 p value); Th1/Th2 ratio boxplots; survival by Th1/Th2 ratio (low n = 93, high n = 27, p = 0.0377); Log2(Th1/Th2 ratio) per patient (#1 to #14) and by response (SD/PD versus PR); schematic of T helper cell recruitment.]
exhibited higher levels in IGC patients than in DGC patients, whereas
CD4+ memory T cells, CD8+ T cells, and Th1 cells exhibited higher
levels in DGC patients than in IGC patients (Wilcoxon rank-sum test,
p < 0.1, fold change > 1.2; Fig. 7e, Supplementary Data 7b). DGC patients
had a higher Th1/Th2 ratio than IGC patients in immune cluster 3
(Fig. 7f). It has been previously reported that the Th1/Th2 ratio could be
used as a prognostic marker49. Moreover, Mohammadi et al. demonstrated
that Th1 and Th2 cells contributed differently to the immune response
to Helicobacter pylori infection-related gastritis. The Th1 cells were
involved in pathogenesis, and the Th2 cells were associated with
protection from the infection49. Th2 cells, not Th1 cells, reduced
inflammation and showed beneficial effects on GC treatment50.
Significantly, in our cohort, the Th1/Th2 ratio was negatively
associated with prognosis for all GC patients (Log-rank test, p = 0.0377;
Fig. 7g), demonstrating that the Th1/Th2 ratio could serve as a
prognostic indicator in GC patients. To validate this conclusion,
Th1/Th2 ratio values were calculated with xCell based on the TCGA
transcriptomic dataset7. We compared the Th1/Th2 ratio between DGC
and IGC, and analyzed the association of the Th1/Th2 ratio with prognosis.
We found that the Th1/Th2 ratio was higher in DGC than in IGC, and that
this ratio was negatively related to prognosis in the TCGA cohort
(Supplementary Fig. 12a, b). These results were consistent with the
results observed in our proteomic data, which validated the prognostic
effect of the Th1/Th2 ratio in GC.
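As a small illustration of the quantity used here, the per-sample Log2(Th1/Th2 ratio) can be derived from inferred Th1 and Th2 scores and then dichotomized for survival analysis. The cutoff that produced the low (n = 93) and high (n = 27) groups is not stated in this section, so the cutoff argument below is an assumption, as are the function names, the `eps` guard and the toy scores.

```python
import math

def log2_th1_th2(th1_scores, th2_scores, eps=1e-6):
    """Per-sample log2(Th1/Th2); eps guards against zero scores (assumption)."""
    return [math.log2((a + eps) / (b + eps))
            for a, b in zip(th1_scores, th2_scores)]

def split_by_cutoff(values, cutoff):
    """Label each sample 'high' or 'low' relative to a chosen cutoff."""
    return ["high" if v > cutoff else "low" for v in values]

# Hypothetical xCell-style scores for four samples (not data from this study).
th1 = [0.50, 0.10, 0.30, 0.05]
th2 = [0.125, 0.20, 0.30, 0.40]
ratios = log2_th1_th2(th1, th2)
print(split_by_cutoff(ratios, cutoff=0.0))
```

Working on the log scale makes a ratio of 2 and a ratio of 1/2 symmetric around zero, which is why the figure axes report Log2(Th1/Th2 ratio).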
We further explored why there was higher tumor infiltration
by Th1 cells in DGC than in IGC. We calculated Spearman's
correlation coefficients between Th1 cell scores and GSEA pathway
NESs. We found that generation of reactive oxygen species (ROS) had
the highest correlation with Th1 cells (Spearman's correlation
coefficient = 0.72, p = 0.0012; Supplementary Fig. 12c). This indicated
that ROS may affect Th1 cell recruitment in DGC. Notably, ROS generation
is one of the hallmarks of cancer progression, and causes oxidative
damage to DNA, proteins and lipids51. Furthermore, ROS increases the
mutational load and enhances antigen processing and presentation,
which is a common mechanism that affects the immune
microenvironment52. Subsequently, correlation analysis of pathway
NESs revealed that ROS generation, cellular response to DNA damage,
mutational load, and antigen processing and presentation via MHC
class II were significantly and positively correlated with each other
(Supplementary Fig. 12d). We found a significantly positive correlation
(Spearman's correlation coefficient = 0.77, p = 0.025) between the
Th1/Th2 ratio and mutational load in immune cluster 3 patients
(Supplementary Fig. 12e). These results suggested a potential mechanism
in which elevated ROS in DGC increased the expression of MHC class II
molecules in response to DNA damage and increased mutational load,
subsequently recruiting more Th1 cells (Supplementary Fig. 12f).
To validate the correlation between the Th1/Th2 ratio and
immunotherapeutic effectiveness, we collected a group of GC patients
treated with anti-PD1 therapy, including 7 responder cases (PR) and 7
non-responder cases (SD/PD) (Supplementary Data 7c). Formalin-fixed
paraffin-embedded (FFPE) tumor tissue sections derived from these 14
therapy-naïve GC patients were collected. Proteomic measurement
identified 7705 proteins in total. On average, 4575 proteins were
identified per sample. Immune cell infiltration in the 14 samples
was evaluated by xCell analysis based on the proteomic profiles. The
Th1/Th2 ratio values of the 14 samples were calculated as shown in Fig. 7h.
We found that the Th1/Th2 ratio was significantly higher in the
responder group than in the non-responder group (Fig. 7i). This
result suggested that the Th1/Th2 ratio could be an indicator for
predicting clinical outcomes of immunotherapy among GC patients
(Fig. 7j). Therefore, the relationship between the Th1/Th2 ratio and
immunotherapeutic effectiveness was further validated in an
independent gastric cancer anti-PD1 therapeutic patient group.
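With only 7 responders and 7 non-responders, a group comparison can also be checked without distributional assumptions. The sketch below uses an exact two-sided permutation test on the difference in means, which is a different technique from the two-sided Student's t-test reported for Fig. 7i; the function name and the toy Log2(Th1/Th2 ratio) values are hypothetical.

```python
from itertools import combinations

def permutation_test(group_a, group_b):
    """Exact two-sided permutation test on the difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)

    def mean_diff(sum_a):
        return sum_a / n_a - (total - sum_a) / n_b

    observed = abs(mean_diff(sum(group_a)))
    extreme = count = 0
    # Enumerate every way of relabelling n_a of the pooled samples as group A.
    for idx in combinations(range(len(pooled)), n_a):
        diff = abs(mean_diff(sum(pooled[i] for i in idx)))
        if diff >= observed - 1e-12:
            extreme += 1
        count += 1
    return extreme / count

# Hypothetical Log2(Th1/Th2 ratio) values, 7 per group (not data from this study).
responders = [1.2, 0.8, 1.5, 2.0, 0.9, 1.1, 1.7]
non_responders = [-0.5, -1.0, 0.1, -0.3, -1.4, 0.2, -0.8]
print(permutation_test(responders, non_responders))
```

For 7 versus 7 samples the enumeration covers 3432 relabellings, so the smallest attainable two-sided p-value is 2/3432.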
Discussion
GC is one of the main cancer types worldwide; the global 5-year survival
rates for GC patients remain ~25–30%4. In clinical diagnosis, the Lauren
classification is used for the preliminary diagnosis of GC patients.
However, the molecular characteristics of the Lauren types (mainly DGC
and IGC) are unclear, which hinders the application of appropriate
treatment approaches for patients with different pathologies. In this
study, we constructed a multilevel proteomic landscape by analyzing the
proteome, phospho-proteome, and TF activity profile datasets. The TFRE
approach16,17, a DNA pull-down-based TF activity assay, was used in this
study to infer the activity of TFs. The TFRE approach could detect and
quantify more TFs than the proteome, which provided a more detailed
proteomic landscape. The integrated analysis of the TF activity profile
and the proteome constructed the TF-TG signal transduction network, which
suggested biological mechanisms of tumor processes and potential drug
targets. The proteome, phospho-proteome, and TF activity profile
provided insights into the biological processes underlying GC, from
protein abundance and post-translational modification to TF activity,
indicating the importance of our GC protein landscape. The
phospho-proteome and TF activity profile increased the identifications
of kinases and TFs in tryptic peptide samples, allowing us to compare
the results more deeply in proteomic analyses. This study focused on
quantification analyses within platforms, while the comparison among
different platforms is also an important issue that needs further study.
Multilevel proteomic analysis indicated that DGC and IGC were
associated with different prognoses and pathogenic mechanisms,
thus requiring different therapeutic options. We found that DNA
damage was upregulated in IGC, whereas immune and ECM proteins
were upregulated in DGC. It is possible that ATM/ATR, the key kinases
in DNA mismatch repair, regulated cell proliferation in IGC by
activating the SWI/SNF complex. Therefore, we proposed ATM/ATR as
potential therapeutic targets for IGC. The potential targets for treating
DGC are CDK4/6, which regulated the cell cycle in DGC by activating the
RB1/E2F pathway53. Analysis of TCGA data revealed that 66% of the GC
patients exhibited altered expression of at least one of the following
cell cycle related genes: RB1, CCND1, CCNE1, CDK2, CDK4, CDK6,
CDKN2A, CDKN2B, E2F1, E2F2, E2F3, and E2F4 (ref. 54). Moreover, molecular
dissection of the chromosome band 7q21 amplicon in gastroesophageal
junction adenocarcinomas revealed upregulated CDK6
expression at both transcription and translation levels54. Targeting
Fig. 7 | Characterization of immune infiltration in GC. a Heatmap illustrating the
immune/stroma signatures from xCell, and ssGSEA pathway scores in each immune
subtype. P-values are from two-sided chi-square test. The p-values are 1.64E-6
(Lauren subtype), 0.017 (Gender), 0.027 (Tumor location), 0.01 (Lymphovascular
invasion), and 0.0059 (Signet ring cells). b Contour plot of two-dimensional density
based on immune score (y-axis) and stroma score (x-axis) among different immune
clusters. c, d Kaplan–Meier curves of DFS for DGC and IGC based on immune
subtypes. n (DGC cluster 1) = 20, n (DGC cluster 2) = 46, n (DGC cluster 3) = 17,
n (IGC cluster 1) = 49, n (IGC cluster 2) = 19, and n (IGC cluster 3) = 32 biologically
independent samples. P-values are from Log-rank test. e Immune cell infiltration
between DGC and IGC. The p-values are from two-sided Wilcoxon rank-sum test.
f Th1/Th2 ratio in DGC and IGC. n (DGC) = 17 and n (IGC) = 32 biologically
independent samples. Boxplots show median (central line), upper and lower
quartiles (box limits), min to max range. P-values are calculated using two-sided
Student's t-test. g The association of Th1/Th2 ratio with prognostic outcomes in all
GC patients. n (low) = 93 and n (high) = 27 biologically independent samples.
P-values are from Log-rank test. h Distribution of Th1/Th2 ratio in the GC anti-PD-1
patient group. i Comparison of Th1/Th2 ratio between responder and
non-responder groups. n (PR) = 7 and n (SD/PD) = 7 biologically independent
samples. Boxplots show median (central line), upper and lower quartiles (box
limits), min to max range. P-values are calculated using two-sided Student's t-test.
Each point represents a sample. j Summary of T helper cells recruitment mechanism
in GC. ****p < 1.0e-4, ***p < 1.0e-3, **p < 0.01, *p < 0.05. Source data are provided
as a Source Data file.
CDK4/6 have been reported to improve patient outcomes in clinical trials in a variety of tumor types53, which are also worth investigating in GC. In addition to our cohort, we validated in the TCGA cohort that CDK4/6 and ATM/ATR were potential targets for DGC and IGC, respectively. These results indicated the generality of this conclusion, suggesting that these potential targets need to be further tested in clinical trials.
Our proteomic and TF activity-based subtypes showed the reverse correlation between protein/TF features and prognoses in DGC and IGC. The validation of subtypes in an independent cohort illustrated the robustness of our subtypes. The aberrations in biological processes among subtypes provided guidance for patient stratification and therapy strategies in the clinic. Based on the analysis of proteomic subtypes, we presumed that analyzing the cell cycle phases could improve chemotherapeutic efficacy, and that CDK1/2 could be used as biomarkers predicting chemotherapeutic response. TF activity-based subtypes showed the importance of the TFs SMARCC1 and NFKB1 in DGC and IGC. The SWI/SNF chromatin remodeling complex controls stemness, differentiation, and proliferation, etc13. The NFKB complex was reported to play important roles in immune responses, cell proliferation, cell death, and inflammation, etc55. Nevertheless, it is difficult for TFs to be targeted by small molecule inhibitors as they lack functional sites or allosteric regulatory pockets that generally exist in kinases or other enzymes. In addition to the development of agents to inhibit cytoplasmic proteases that activate NFKB, direct approaches such as Proteolysis Targeting Chimeras (PROTACs) are emerging56. These treatment approaches can be employed to treat patients with the DGC TF cluster 2 subtype.
We performed immune subtyping based on the inferred immune cell scores and defined three immune subtypes (epithelial subtype, immune subtype, and endothelial subtype). Characteristics extraction and pathway enrichment analysis suggested TME-involved molecular regulation mechanisms. For example, we proposed that tumor infiltration by immune cells, such as macrophages, was associated with metastasis by activating the NFKB complex in DGC. Prognostic analysis of immune subtypes showed that DGC and IGC patients in immune cluster 3 exhibited reverse prognostic associations. Furthermore, we found that the Th1/Th2 ratio differed between DGC and IGC, and this value could serve as an indicator to predict immunotherapeutic effectiveness. This result was validated in the published TCGA cohort and an anti-PD1 immunotherapeutic patient group. Additionally, our data indicated that the recruitment of T helper cells was linked to ROS level and mutational load. We observed that patients with a high Th1/Th2 ratio, who responded to anti-PD1 therapy, had highly expressed inducers and lowly expressed scavengers of ROS. These results indicated that immunotherapy responders had a higher Th1/Th2 ratio and increased ROS level. Antioxidant therapy, which lowers the level of ROS by antioxidants, has been reported to improve clinical outcomes of tumor patients52. We believe that the Th1/Th2 ratio could serve as a biomarker to determine the selection of antioxidant therapy for GC patients, which requires further investigation.
In summary, our research performed comprehensive proteomic analyses of DGC and IGC. Multilevel proteomic subtypes were identified with distinct molecular features and clinical outcomes.
Methods
The construction of the GC cohort
The Medical Research Ethics Committees of Peking University Cancer
Hospital (2015KT70), Xijing Hospital (KY20150415), Chinese PLA
General Hospital (S2016-057-02), and Zhongshan Hospital (B2019-
200R) approved this study, and all patients provided written informed
consent for sample collection, analysis, and publishing basic and
clinicopathological information.
We selected 83 cases of diffuse-type gastric cancer (DGC), 102 cases of intestinal-type gastric cancer (IGC) and 11 cases of mixed gastric cancer (MGC) from Peking University Cancer Hospital, Xijing Hospital, and Chinese PLA General Hospital. These 196 patients underwent total or subtotal gastrectomy between 2012 and 2015, and no patient in this cohort was treated with neoadjuvant chemotherapy or chemo-radiation therapy before operation. The surgical treatments were performed by clinicians according to guidelines57. All cases were staged according to the seventh edition of the American Joint Committee on Cancer (AJCC) staging system. The corresponding NATs were selected 5 cm away from the sites at which the primary tumor tissues were sampled. The muscle layers were carefully removed using a scalpel and fine forceps, and the mucosa layers were used as NATs. Each specimen was collected within 30 min after operation, cleaned with a sterile towel, immediately transferred into sterile freezing vials and immersed in liquid nitrogen, then stored at -80 °C until use. Tumor tissues and their nearby tissues were evaluated by pathologists. Specimens in dry ice were transferred to the National Center for Protein Sciences (The PHOENIX Center, Beijing).
The date of operation was used as a surrogate for the date of initial diagnosis. Overall survival (OS) was defined as the interval between the date of initial surgical resection and the date of last known contact or death. Disease-free survival (DFS) was defined as the interval between the date of initial surgical resection and the date of progression or the last follow-up date. A total of 144 patients (~75%) received chemotherapy after surgery. Whether patients received chemotherapy was based on the clinical guidelines, the patient's prognosis, and the patient's willingness. With or without chemotherapy in this research was defined as with or without at least one cycle of adjuvant chemotherapy. Demographics, histopathologic information, primary tumor location, treatment details including chemotherapy drugs, doses and routes of administration, and outcome parameters were collected. Signet ring cell proportion, lymphovascular invasion, and Ki67 were also determined.
Sample collection of the anti-PD1 patient group
We surveyed medical records of GC patients in the Department of Pathology, Zhongshan Hospital, Fudan University (Shanghai, P. R. China), and then screened 14 GC patients treated with anti-PD1 immunotherapy after surgery from December 2018 to August 2021. The treatment response was evaluated by CT/MRI scanning following the Response Evaluation Criteria in Solid Tumors (RECIST) (version 1.1). Tumor response was assessed and categorized as a complete response (CR), partial response (PR), stable disease (SD), or progressive disease (PD). Here, patients with CR and PR were defined as responders and those with SD and PD were defined as non-responders. In the anti-PD1 patient group, 7 responders (PR) and 7 non-responders (SD/PD) were included. Detailed clinical information of each patient was included in Supplementary Data 7c. The formalin-fixed paraffin-embedded (FFPE) tissue sections derived from 14 therapy-naïve GC patients were collected, and the tumor regions were determined by pathological examination.
Cell line
Human HEK293T (Cat# CRL-11268 from ATCC; RRID: CVCL_QW54) was obtained and cultured in DMEM (GIBCO) with 10% FBS (GIBCO) in 5% CO2 at 37 °C. Cell validation using short tandem repeat (STR) markers was performed by Meixuan Biological Science and Technology Ltd. (Shanghai). In detail, the cells were first tested for species by PCR using extracted total genomic DNA, and then examined by STR profiling. Then, STR data were analyzed using the DSMZ (German Collection of Microorganisms and Cell Cultures) online STR database (http://www.dsmz.de/fp/cgi-bin/str.html). The cells tested negative for mycoplasma contamination.
Targeted exome sequencing
A capture panel was developed, which covered coding exons and flanking splicing junctions for 274 gastric cancer driver genes9. For each pair of tumor and paired NAT samples, genomic DNA was extracted using the Gentra Puregene (Qiagen). Briefly, 1 μg of genomic DNA from each sample was mechanically sheared, end repaired, and ligated to molecularly bar-coded adapters to generate sequencing libraries following the manufacturer's standard protocol (Illumina). Captured sample DNA was sequenced on an Illumina HiSeq 2000 according to the standard operating protocol.
Protein extraction and trypsin digestion
Samples were minced and lysed in lysis buffer (8 M urea, 100 mM Tris hydrochloride, pH 8.0) containing protease and phosphatase inhibitors (Thermo Scientific) followed by 1 min of sonication (3 s on and 3 s off, amplitude 25%). The lysate was centrifuged at 14,000 g for 10 min and the supernatant was collected as whole tissue extract. Protein concentration was determined by Bradford protein assay. Extracts from each sample (100 μg proteins) were reduced with 10 mM dithiothreitol at 56 °C for 30 min and alkylated with 10 mM iodoacetamide at room temperature (RT) in the dark for an additional 30 min. Samples were then digested using the filter aided proteome preparation (FASP) method58 with trypsin. Briefly, samples were transferred into a 30 kD Microcon filter (Millipore) and centrifuged at 14,000 g for 20 min. The precipitate on the filter was washed twice by adding 300 μL washing buffer (8 M urea in 100 mM Tris, pH 8.0) into the filter and centrifuging at 14,000 g for 20 min. The precipitate was resuspended in 200 μL 100 mM NH4HCO3. Trypsin with a protein-to-enzyme ratio of 50:1 (w/w) was added into the filter. Proteins were digested at 37 °C for 16 h. After tryptic digestion, peptides were collected by centrifugation at 14,000 g for 20 min and dried in a vacuum concentrator (Thermo Scientific).
Tryptic peptides were separated in a home-made reverse-phase C18 column in a pipet tip. Peptides were eluted and separated into nine fractions using a stepwise gradient of increasing acetonitrile (6%, 9%, 12%, 15%, 18%, 21%, 25%, 30%, and 35%) at pH 10. The nine fractions were combined into six fractions, dried in a vacuum concentrator (Thermo Scientific), and then analyzed by liquid chromatography tandem mass spectrometry (LC-MS/MS).
For FFPE sample preparation, sections (10 μm thick) from FFPE blocks were macro-dissected, deparaffinized with xylene, and washed with ethanol. The ethanol was removed completely and the sections were left to air-dry. Lysis buffer [0.1 M Tris-HCl (pH 8.0), 0.1 M DTT (Sigma, 43815), 1 mM PMSF (Amresco, M145)] was added to the FFPE samples, which were lysed with 4% sodium dodecyl sulfate (SDS). The extracted solution was collected, and a 4-fold volume of pre-cooled acetone was added. Subsequently, the acetone-precipitated proteins were washed with cooled acetone. The filter-aided sample preparation (FASP) procedure58 was used for protein digestion.
Phospho-peptide enrichment
Tryptic peptides were used for phospho-peptide enrichment. 15 mg TiO2-coupled beads were incubated with 500 μL binding buffer (BB) for 10 min. The TiO2 was divided equally into three 1.5 mL EP tubes, 5 mg each, and centrifuged at 2000 g for 2 min. Peptides were dissolved in 100 μL BB solution and combined with 5 mg of incubated TiO2 for 30 min. The mixture was then centrifuged at 1000 g for 2 min, and the supernatant was collected and transferred to a second EP tube containing TiO2. The phospho-peptide enrichment procedure was repeated twice and the supernatant was then discarded. The TiO2 was washed five times with BB solution. An additional washing procedure was carried out once with wash buffer 1 (30% ACN, 0.5% trifluoroacetic acid) and then twice with wash buffer 2 (80% ACN, 0.5% trifluoroacetic acid) to further remove the unphosphorylated peptides. Peptides were eluted and separated into 6 fractions using a stepwise gradient of increasing acetonitrile (0%, 2%, 5%, 8%, 10%, 40%) at pH 10. The six fractions were combined into 3 fractions, dried in a vacuum concentrator (Thermo Scientific) and then analyzed by LC-MS/MS.
Nuclear proteins extraction
The tissues were washed twice with ice-cold phosphate-buffered saline to remove blood and other contaminants, then suspended in 800 μL of Cytoplasmic Extraction Reagent I (CER I) buffer (NE-PER kit, Thermo Scientific) and homogenized using a tissue grinder. Nuclear proteins were extracted in accordance with the manufacturer's instructions59. Protein concentrations were determined using the Bradford method. Approximately 1 mg of nuclear protein was extracted from each tissue sample.
TFRE pull-down and trypsin digestion
DNA was synthesized by Genscript (Nanjing, Jiangsu Province, China). Biotinylated TFRE primers (Forward primer: 5'-CATTCAGGCTGCGCAACTGTTG-3', Reverse primer: 5'-GTGAGTTAGCTCACTCATTAGG-3') were synthesized by Sigma. Dynabeads (M-280 streptavidin) were purchased from Invitrogen. Approximately 23 pmol of biotinylated DNA was pre-immobilized on Dynabeads and then mixed with nuclear extracts (NEs) from the tissues. The mixtures were incubated for 2 h at 4 °C. The supernatant was discarded, and the Dynabeads were washed twice with NETN solution (100 mM NaCl, 20 mM Tris-HCl, 0.5 mM ethylenediaminetetraacetic acid and 0.5% (vol/vol) Nonidet P-40) and then twice with phosphate-buffered saline. The TFRE pull-down beads were resuspended in 20 μL of SDS loading buffer and boiled for 5 min at 95 °C. The samples were then loaded on 10 cm 10% SDS-polyacrylamide gel electrophoresis gels and run to 1/3 of the length. The gel was stained with Coomassie brilliant blue and then destained in 5% ethanol/10% acetic acid solution. Six bands were excised according to the molecular weight ranges and then subjected to in-gel trypsin digestion. Digestion was stopped with 0.1% formic acid, and peptides were extracted with 50% acetonitrile. The peptide solution was dried in a vacuum concentrator (Thermo Scientific) and then analyzed by LC-MS/MS.
LC-MS/MS analysis
The three kinds of peptide samples (proteome, phospho-proteome, and TF activity profile) were analyzed on Orbitrap analyzer-based mass spectrometer platforms. The proteomic peptide samples were analyzed on Orbitrap Fusion (Thermo Fisher Scientific, Rockford, IL, USA) mass spectrometers, the phospho-proteomic peptide samples on Fusion Lumos mass spectrometers (Thermo Fisher Scientific, Rockford, IL, USA), and the TF activity profile peptide samples on Q Exactive HF (Thermo Fisher Scientific, Rockford, IL, USA) mass spectrometers. Each layer dataset was acquired by the same mass spectrometer.
Dried peptide samples were re-dissolved in Solvent A (0.1% formic acid in water) and loaded onto a trap column (100 μm × 2 cm, home-made; particle size, 3 μm; pore size, 120 Å; SunChrom, USA) with a max pressure of 280 bar using Solvent A, then separated on a home-made 150 μm × 12 cm silica microcolumn (particle size, 1.9 μm; pore size, 120 Å; SunChrom, USA) with a gradient of 5-35% mobile phase B (acetonitrile and 0.1% formic acid) at a flow rate of 350 nL/min for 75 min.
The eluted peptides were ionized under 2 kV. MS was operated under a data-dependent acquisition (DDA) mode. For detection with the Fusion or Fusion Lumos mass spectrometer, a precursor scan was carried out in the Orbitrap by scanning m/z 300-1400 with a resolution of 120,000 at 200 m/z. The most intense ions selected under top-speed mode were isolated in the quadrupole with a 1.6 m/z window and fragmented by higher energy collisional dissociation (HCD) with normalized collision energy of 35%, then measured in the linear ion trap using the rapid ion trap scan rate. Automatic gain control targets were 5 × 10e5 ions with a max injection time of 50 ms for full scans and 5 × 10e3 with 35 ms for MS/MS scans. Dynamic exclusion time was set as 18 s.
The MS analysis for the Q Exactive HF was performed with one full scan (300-1400 m/z, R = 60,000 at 200 m/z) at an automatic gain control target of 3e6 ions, followed by up to 20 data-dependent MS/MS scans with HCD (target 2 × 10e3 ions, max injection time 40 ms, isolation window 1.6 m/z, normalized collision energy of 27%), detected in the Orbitrap (R = 15,000 at 200 m/z).
MS data processing
All the MS data were processed in the Firmiana60 platform. Raw files were searched against the human National Center for Biotechnology Information (NCBI) ref-seq protein database (updated on 07-04-2013, 32,015 entries) by Mascot 2.3 (Matrix Science Inc). Mass tolerances were 20 ppm for precursor and 0.5 Da for product ions for the Fusion and Fusion Lumos series. Mass tolerances were 20 ppm for precursor and 50 mmu for product ions for the Q Exactive HF series. Up to two missed cleavages were allowed. The data were also searched against a decoy database so that protein identifications were accepted at a false discovery rate (FDR) of 1%.
For proteome profiling, Carbamidomethylation (C) was set in the search engine as a fixed modification; Acetyl (Protein N-term) and Oxidation (M), as variable modifications. For phospho-proteome, Carbamidomethylation (C) was set in the search engine as a fixed modification; Phospho (ST), Phospho (Y), Acetyl (Protein N-term), and Oxidation (M), as variable modifications. Phospho-sites were reported when phospho-peptides showed an ion score >20, otherwise the precise modification site was deemed ambiguous. Phospho-sites with abundance <25% of all phospho-sites were excluded. For TF activity profiles, the search engine set Phospho (ST), Phospho (Y), DeStreak (C), Acetyl (Protein N-term), and Oxidation (M) as variable modifications.
Protein quantification and normalization
We applied the match between runs (MBR) algorithm60,61. We built a dynamic regression function based on commonly identified peptides in samples. According to the correlation value R², Firmiana chose a linear or quadratic function for regression to calculate the retention time (RT) of corresponding hidden peptides, and to check the existence of the extracted ion chromatogram (XIC) based on the m/z and calculated RT. The function evaluated the peak area values of those existing XICs. These peak area values were considered as parts of corresponding proteins.
For proteomic data normalization, protein quantifications were calculated using a label-free, intensity-based absolute quantification (iBAQ) approach62. The fraction of total (FOT) was used to represent the normalized abundance of a particular protein across samples. The FOT of a protein was defined as the protein's iBAQ divided by the total iBAQ of all identified proteins within one sample. The FOT was multiplied by 10e6 for ease of presentation. For the phospho-proteomics, the data matrix of peptides with phosphorylated modification was used for phospho-site extraction and quantification. Then, the phospho-site expression matrix was subjected to quantile normalization using normalized quantile functions22,63 implemented in the R/Bioconductor package limma v.3.24.15 (ref. 64). After that, the normalized phospho-site abundance was log2-transformed. We obtained a quantified data matrix including 44,750 phospho-sites (Supplementary Data 2c). In the TF activity profile, we also used quantile-based normalization and obtained a quantified data matrix including 597 TFs (Supplementary Data 2b). The data distribution (Supplementary Fig. 1c) showed quantile normalization was suitable for our TF activity profile, too. Finally, missing values were assigned the minimum value in each proteomic layer.
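The FOT and quantile normalization steps described above can be illustrated with a minimal pure-Python sketch. It is not the Firmiana or limma implementation: the function names `fot` and `quantile_normalize`, the toy values, and the default `scale` are assumptions made for illustration. The scale factor is left as a parameter because the text gives it as 10e6, and ties are broken by position here rather than averaged.

```python
def fot(ibaq, scale=1e6):
    """Fraction of total: each protein's iBAQ divided by the sample's total iBAQ.

    `scale` is an assumed presentation factor; set it to match the text.
    """
    total = sum(ibaq.values())
    return {protein: value / total * scale for protein, value in ibaq.items()}


def quantile_normalize(columns):
    """Quantile-normalize equally long sample columns (no missing values).

    Each value is replaced by the mean, across samples, of the values that
    share its rank. Ties are broken by position (a simplification).
    """
    n_rows = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Mean of the k-th smallest value across all samples.
    rank_means = [sum(col[k] for col in sorted_cols) / len(columns)
                  for k in range(n_rows)]
    normalized = []
    for col in columns:
        order = sorted(range(n_rows), key=lambda i: col[i])
        out = [0.0] * n_rows
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Hypothetical iBAQ values for one sample.
sample_fot = fot({"P1": 30.0, "P2": 10.0, "P3": 60.0})
# Two hypothetical samples (columns) with three phospho-sites each.
normalized = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
```

After quantile normalization every sample shares the same set of values, which is what makes abundances comparable across samples before the log2 transform.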
Quality control (QC) for MS platforms and samples data
QC was performed for platforms and samples. The average Spearman's correlation coefficient among standards (tryptic digestions of the HEK293T cell lysate, Cat# CRL-11268 from ATCC; RRID: CVCL_QW54) in the proteome platform was 0.92; the average correlation coefficient among standards in the TF activity profile platform was 0.95; and the average correlation coefficient among standards in the phospho-proteome platform was 0.94 (Supplementary Fig. 1a). The median CV values among standards in the proteome, phospho-proteome, and TF activity profile platforms were 0.28, 0.26, and 0.34, respectively (Supplementary Fig. 1b). The density of the tumor (orange) and NAT (blue) proteomes exhibited a unimodal distribution, in accordance with the proteomic quality control (Supplementary Fig. 1c). These results showed the stability of our MS platforms.
For samples data, the distribution of median values was used to discriminate the samples with insufficient proteins or phospho-sites detected. Samples with median values larger than the upper quartile + 1.5 × IQR (interquartile range) were excluded from further analyses. To evaluate the comparability of data, we compared the data distribution with boxplots and density curves. Samples with a clear bimodal distribution of protein quantification were excluded from further analyses. Furthermore, both the tumor tissue and the paired NAT were required to pass the QC procedures. In this research, 194 pairwise samples of proteomic profiles, 196 pairwise samples of TF activity profiles, and 184 pairwise samples of phospho-proteomic profiles passed the QC procedures and were used for further analyses.
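The median-based exclusion rule above can be sketched as follows. This is an illustration only: the function name `flag_high_median_samples`, the sample IDs, and the choice of the "inclusive" quartile method are assumptions, since the text does not say how quartiles were computed.

```python
import statistics


def flag_high_median_samples(sample_values):
    """Return (flagged sample IDs, fence) using the Q3 + 1.5 * IQR rule.

    sample_values maps a sample ID to its list of quantified abundances.
    """
    medians = {sid: statistics.median(vals)
               for sid, vals in sample_values.items()}
    # Quartiles of the per-sample medians (quartile method is an assumption).
    q1, _, q3 = statistics.quantiles(list(medians.values()), n=4,
                                     method="inclusive")
    fence = q3 + 1.5 * (q3 - q1)
    flagged = [sid for sid, med in medians.items() if med > fence]
    return flagged, fence


# Hypothetical samples; S5 has an unusually high median abundance.
toy_samples = {
    "S1": [1.0, 1.0, 1.0],
    "S2": [2.0, 2.0, 2.0],
    "S3": [3.0, 3.0, 3.0],
    "S4": [4.0, 4.0, 4.0],
    "S5": [100.0, 100.0, 100.0],
}
flagged_samples, upper_fence = flag_high_median_samples(toy_samples)
```

In a paired design such as this one, a flagged tumor or NAT sample would also remove its partner, because both members of a pair must pass QC.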
Principal component analysis (PCA)
PCA was performed to visualize the separation of tumor tissues and normal adjacent tissues (NATs). We performed PCA on 196 paired tumor and NAT samples to illustrate the proteomic, phospho-proteomic, and TF activity profile differences between tumor and NAT samples (Supplementary Fig. 2a). Also, we performed PCA on the TF activity profiles of 196 DGC, IGC, and MGC samples to illustrate the global molecular differences among the Lauren classifications of GC samples (Supplementary Fig. 5d). A PCA function in R was used for unsupervised clustering analysis. The 90% confidence coverage was represented by a colored ellipse for each group, which was calculated based on the mean and covariance of points in each specific group.
The screen of differently expressed proteins (DEPs)
The Wilcoxon paired signed-rank test was used to identify proteins with significantly differential expression between tumor tissues and NATs. The Wilcoxon rank-sum test was used to identify proteins with significantly differential expression between DGC and IGC. DEPs were also examined between the two clusters of TF activity-based subtypes by Wilcoxon rank-sum test in DGC and IGC, respectively. P-values were adjusted using Benjamini-Hochberg (BH) correction. Fold change was calculated by average or median ratio. Proteins with fold change values larger than a set threshold (usually 2-fold) and BH adjusted p-values < 0.05 were considered significantly different.
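The selection rule above (BH-adjusted p-value below 0.05 and fold change above a threshold) can be sketched in pure Python. The function names `bh_adjust` and `select_deps` and the toy numbers are assumptions for illustration; the raw p-values are taken as given, so the Wilcoxon tests themselves are not reimplemented here.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_deps(proteins, fold_changes, p_values, min_fc=2.0, alpha=0.05):
    """Keep proteins with fold change > min_fc and BH-adjusted p < alpha.

    Only upregulation is checked; downregulation would use the reciprocal
    threshold (fold change < 1 / min_fc).
    """
    adjusted = bh_adjust(p_values)
    return [protein
            for protein, fc, q in zip(proteins, fold_changes, adjusted)
            if fc > min_fc and q < alpha]


# Hypothetical proteins with tumor/NAT fold changes and raw p-values.
deps = select_deps(["A", "B", "C", "D"],
                   [3.0, 1.2, 2.5, 4.0],
                   [0.01, 0.04, 0.03, 0.005])
```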
Pathway enrichment analysis
DEPs or subtype signature proteins were used to perform pathway enrichment analysis according to Gene Ontology and KEGG in DAVID. Reactome or STRING-based pathway enrichment analysis was also performed. Statistical significance was considered when the FDR value was <0.05.
Kinase-substrate enrichment analysis (KSEA)
Kinase-Substrate Enrichment Analysis (KSEA) estimated changes in a kinase's activity by measuring and averaging the amounts of its identified substrates instead of a single substrate, which enhanced the signal-to-noise ratio from inherently noisy phospho-proteomic data. The ratios of identified phospho-sites between tumor tissues and NATs were used to estimate the kinase activities by the KSEA algorithm28. The information of kinase-substrate relationships was obtained from databases including PhosphoSite65 and NetworKIN 3.0. Statistical analysis was performed in R (version 4.0.4) with the Kruskal-Wallis test.
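The averaging idea behind KSEA can be sketched as below. This is a hedged illustration, not the pipeline used in the study: the z-score formula is the form used by some published KSEA implementations and is an assumption here, and the kinase name, site names, and values are hypothetical.

```python
import math
import statistics


def ksea(site_log2fc, kinase_substrates):
    """Score each kinase by the mean log2 fold change of its substrates.

    site_log2fc: phospho-site -> log2(tumor / NAT) ratio.
    kinase_substrates: kinase -> list of substrate phospho-sites.
    The z-score compares the substrate mean with the mean of all sites
    (assumed scoring; the source does not specify it).
    """
    all_values = list(site_log2fc.values())
    background_mean = statistics.mean(all_values)
    background_sd = statistics.pstdev(all_values)
    if background_sd == 0:
        raise ValueError("all phospho-site ratios are identical")
    scores = {}
    for kinase, sites in kinase_substrates.items():
        values = [site_log2fc[s] for s in sites if s in site_log2fc]
        if not values:
            continue  # no quantified substrates for this kinase
        substrate_mean = statistics.mean(values)
        z = ((substrate_mean - background_mean)
             * math.sqrt(len(values)) / background_sd)
        scores[kinase] = {"mean_log2fc": substrate_mean,
                          "z": z,
                          "n": len(values)}
    return scores


# Hypothetical phospho-site ratios and kinase-substrate annotation.
toy_sites = {"siteA": 2.0, "siteB": 2.0, "siteC": -1.0, "siteD": -3.0}
toy_scores = ksea(toy_sites, {"K1": ["siteA", "siteB"], "K2": ["siteX"]})
```

Averaging over several substrates is what reduces the influence of any single noisy phospho-site measurement.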
TFRE enrichment analysis66
We calculated the ratio of each protein between the proteome and TF activity profiles. The TFs, which were annotated in the CellNet38 database, had higher ratios than other proteins. Besides TFs, we selected proteins with ratio values >4-fold as TFRE enriched proteins. In total, 4185 proteins were regarded as TFRE enriched proteins, including 597 TFs.
Gene set enrichment analysis (GSEA)
Gene Set Enrichment Analysis (GSEA) was applied to find enriched pathways between tumor tissues and NATs. Proteins detected in >95% samples were selected, and missing values were then imputed with the minimum value of the proteomic data. It was also used to calculate the GSEA enrichment scores over 4347 pathways with at least 10 overlapping genes, for each sample. GSEA was performed by the GSEA software (http://software.broadinstitute.org/gsea/index.jsp) or the R package clusterProfiler. Gene sets including Gene Ontology, KEGG, Reactome, and HALLMARK downloaded from the Molecular Signatures Database (MSigDB v7.1, http://software.broadinstitute.org/gsea/msigdb/index.jsp) were set as background.
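The pre-filtering described above (keep proteins detected in most samples, then fill gaps with the minimum value) can be sketched as follows. The function name `filter_and_impute` and the toy matrix are assumptions, as is the reading that the minimum is taken over all observed values in the input matrix.

```python
def filter_and_impute(matrix, min_detected=0.95):
    """Keep proteins detected in more than `min_detected` of samples.

    matrix maps a protein to its per-sample values, with None for missing.
    Missing values in kept proteins are replaced by the global minimum of
    all observed values (an assumed interpretation of the text).
    """
    observed = [x for values in matrix.values()
                for x in values if x is not None]
    floor = min(observed)
    kept = {}
    for protein, values in matrix.items():
        detected = sum(x is not None for x in values) / len(values)
        if detected > min_detected:
            kept[protein] = [floor if x is None else x for x in values]
    return kept


# Hypothetical three-sample matrix; a lower threshold is used so that the
# toy example keeps a protein with one missing value.
toy_matrix = {
    "A": [1.0, None, 3.0],
    "B": [None, None, 2.0],
    "C": [0.5, 4.0, 5.0],
}
filtered = filter_and_impute(toy_matrix, min_detected=0.5)
```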
Tissue specific proteins analysis
Tissue specific annotation was from the Human Protein Atlas67. In total, 1882 proteins had tissue-specific annotation in the proteomic data, including 206 TFs. We calculated the proportion of tissue-specific protein alterations in each tissue, especially in the digestive tract including esophagus, intestine, liver and stomach. The TF-TG regulated network was built based on the gene regulatory network from CellNet38.
Kaplan-Meier analysis
Standard statistical tests were used to analyze the clinical data, including but not limited to Student's t-test, Fisher's exact test, and Log-rank test. All survival analyses among the proteomic/TF activity/immune subtypes used the Kaplan-Meier method; p-values were calculated using the Log-rank test. Hazard ratio (HR) was calculated from Cox proportional hazards regression analysis. All the survival analyses of proteomic subtyping were adjusted by other clinical covariates including gender, age, TNM stage and chemotherapy, demonstrating that our subtyping could serve as an independent survival outcome predictive factor. In addition, we performed formal statistical tests for interaction analyses. The results of interaction analyses revealed there was no significant enrichment of TNM stages in each subtype. All subtyping survival outcome analysis results were shown by DFS. For the optimal cutoff point in the K-M analysis of certain proteins, we used the function surv_cutpoint of the survminer package in R. P-values < 0.05 were considered significantly different. All the analyses of clinical data were performed in R or GraphPad Prism.
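For readers unfamiliar with the Kaplan-Meier method named above, the product-limit estimator can be sketched in a few lines. This is an illustration only; the study ran its survival analyses in R or GraphPad Prism, and the function name `kaplan_meier` and the toy follow-up data are assumptions. The Log-rank test and Cox regression are not covered by this sketch.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is 1 if the event (for DFS, progression) occurred at
    times[i], and 0 if the patient was censored at that time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_removed
    return curve


# Hypothetical follow-up in months: events at 1 and 3, censoring at 2.
toy_curve = kaplan_meier([1, 2, 3], [1, 0, 1])
```

Censored patients leave the risk set without lowering the curve, which is why the estimate drops only at observed event times.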
Consensus clustering analysis
Consensus clustering was performed using the R package ConsensusClusterPlus. Samples were clustered using Euclidean distance as the distance measure. We performed 1000 resampling repetitions in the range of 2 to 6 clusters. Log-rank tests and Kaplan-Meier survival curves were used to compare survival among the subtypes.
The protein expression matrix of the 79 paired DGC samples was used to identify the DGC proteomic subtypes with upregulated proteins in tumor tissues. The protein expression matrix of the 92 paired IGC samples was used to identify the IGC proteomic subtypes with upregulated proteins in tumor tissues. As summarized in Supplementary Fig. 6a, the clustering analysis of the tumors by protein abundance divided DGC and IGC patients into three proteomic subtypes, respectively. A consensus matrix with k = 3 appeared to have the clearest cut between clusters and showed significant association with the patients' survival. Thus, we selected 3 clusters as the best subtypes for the DGC and IGC proteomic subtypes.
For the TF activity profiles, 425 and 396 TFs detected in >50% of DGC and IGC patients, respectively, were applied for DGC and IGC subtyping. We performed consensus clustering with the same parameters as those for the proteomic subtyping. The consensus CDF and delta plots showed an increase in area for k = 2, and this provided the clearest separation among the clusters (Supplementary Fig. 8a, b). Thus, we selected 2 clusters as the best subtypes for the TF activities matrix.
For the phospho-proteome data, the phospho-sites detected in >50% of DGC and IGC patients, corresponding to 4484 and 4739 phospho-proteins, respectively, were applied for DGC subtyping and IGC subtyping. We performed consensus clustering with the same parameters as those for the proteomic subtyping. The consensus cumulative distribution function (CDF) and delta plots showed an increase in area for k = 3, and this provided the clearest separation among the clusters (Supplementary Fig. 9a). Thus, we selected 3 clusters as the best subtypes for the phospho-proteomic expression matrix.
Consensus clustering was performed with the xCell results of 183 paired GC samples. Euclidean distance and 1000 resampling repetitions in the range of 2-6 clusters were used. As summarized in Supplementary Fig. 11a, the clustering analysis of the tumors by xCell score divided 183 patients into three immune clusters. A consensus matrix with k = 3 appeared to have the clearest cut between clusters and showed significant association with the patients' survival in DGC and IGC. Thus, we selected 3 clusters as the best subtypes for the inferred immune cell score matrix.
To identify molecular signatures for each subtype in our proteomic cohort, we compared the protein expression in each subtype against all other subtypes. The statistical significance was calculated by Wilcoxon rank-sum test. For a given subtype, proteins with a fold change > 2 and p < 0.05 when compared with other subtypes were defined as signature proteins.
Bayesian predictor for NFKB1 and SMARCC1 subtypes
The Bayesian algorithm was applied to cluster subtypes based on TF activity in Mun's cohort (Fig. 6h, NFKB1 subtype and SMARCC1 subtype)47. The z scores of NFKB1 and SMARCC1 in our TF activity profiles were used to create a linear predictor score (LPS) for each patient based on TF activity subtypes. The LPS distribution of each TF activity subtype was used to estimate the likelihood that a new sample was in each of the two subtypes by applying Bayes' rule. The z scores of NFKB1 and SMARCC1 in the validation cohort were used to calculate the probability based on the predictor. The membership of the NFKB1 and SMARCC1 subtypes was assigned as above based on a cutoff of 75% certainty. Finally, 28 and 24 cases were identified as the NFKB1 subgroup and SMARCC1 subgroup, respectively.
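The LPS-plus-Bayes'-rule procedure above can be sketched as follows. This is an assumption-laden illustration rather than the authors' predictor: the weights in `WEIGHTS`, the equal priors, the normal model for each subtype's LPS distribution, and all toy values are hypothetical choices; only the 75% certainty cutoff comes from the text.

```python
from statistics import NormalDist

# Hypothetical weights: a positive LPS points toward the NFKB1 subtype.
WEIGHTS = {"NFKB1": 1.0, "SMARCC1": -1.0}


def linear_predictor_score(z_scores, weights=WEIGHTS):
    """Weighted sum of TF z scores for one patient."""
    return sum(weights[tf] * z_scores[tf] for tf in weights)


def classify(lps, lps_nfkb1, lps_smarcc1, cutoff=0.75):
    """Assign a subtype by Bayes' rule with equal priors (assumed).

    lps_nfkb1 and lps_smarcc1 are the training LPS values of each subtype;
    each is modeled as a normal distribution. Returns (label, posterior),
    with label None when neither subtype reaches the cutoff.
    """
    like_a = NormalDist.from_samples(lps_nfkb1).pdf(lps)
    like_b = NormalDist.from_samples(lps_smarcc1).pdf(lps)
    p_a = like_a / (like_a + like_b)
    if p_a >= cutoff:
        return "NFKB1", p_a
    if 1 - p_a >= cutoff:
        return "SMARCC1", 1 - p_a
    return None, max(p_a, 1 - p_a)


# Hypothetical training LPS values for the two TF activity subtypes.
train_nfkb1 = [2.0, 3.0, 4.0]
train_smarcc1 = [-4.0, -3.0, -2.0]
new_lps = linear_predictor_score({"NFKB1": 1.5, "SMARCC1": -1.5})
label, posterior = classify(new_lps, train_nfkb1, train_smarcc1)
```

Samples whose posterior falls below the cutoff for both subtypes stay unassigned, which is consistent with only some validation cases being labeled.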
Classification concordance evaluation
Based on the classification among proteomic subtypes, phospho-proteomic subtypes and TF activity-based subtypes of our cohort, we performed chi-square tests to assess the classification concordance. Except for the correspondence between phospho-proteomic subtypes and TF activity-based subtypes in IGC, the statistical results of classification concordance among subtypes based on the three datasets were all significant (chi-square test, p < 0.05, Supplementary Table 2). Based on the classification of Mun's cohort10, we performed chi-square tests to assess the classification concordance among proteomic subtypes, TF subtypes, and Mun's subtypes. The statistical results of classification concordance among subtypes were significant (chi-square test, p < 0.05, Supplementary Fig. 10f). These results demonstrated that our subtypes had high classification concordance.
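The chi-square test of concordance between two subtype assignments can be sketched with a contingency table of patient counts. The function names and the toy table are assumptions for illustration; the closed-form p-value used here is valid only for an even number of degrees of freedom (for example a 3 × 2 or 3 × 3 table), so other table shapes would need a statistics library.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a table."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi_square_p_even_df(stat, df):
    """Upper-tail p-value of the chi-square distribution for even df."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive, even df")
    half = stat / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k)
                                 for k in range(df // 2))


# Hypothetical counts: rows are three proteomic subtypes,
# columns are two TF activity-based subtypes.
toy_table = [[10, 0], [0, 10], [5, 5]]
toy_stat, toy_df = chi_square_statistic(toy_table)
toy_p = chi_square_p_even_df(toy_stat, toy_df)
```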
Cell cycle phase analysis
Cell cycle phase analysis was performed using the R package Seurat68. Cell cycle scores of patients were calculated and patients were labeled with G1, S and G2/M classification (Fig. 4f). Significantly upregulated cell cycle regulating proteins and phospho-sites were selected (fold change > 2, Wilcoxon rank-sum test, BH adjusted p < 0.05).
Master TFs nomination
Master TFs dominate GC progression. We nominated master TFs according to three criteria66 as follows: (a) the activities of TFs were upregulated in tumor tissues compared to NATs; (b) the activities of TFs were upregulated in a subtype; (c) significant enrichment based on altered TGs. In Supplementary Fig. 8e, enrichment was calculated based on DEPs between DGC and IGC using the hypergeometric test. In Fig. 5c, enrichment was calculated based on DEPs between the two TF activity-based subtypes. P-values < 0.05 were considered as significant enrichment.
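The hypergeometric enrichment in criterion (c) can be sketched as an upper-tail probability: given a background of quantified proteins, how likely is it to see at least the observed number of a TF's target genes among the DEPs by chance? The function name, argument names, and toy numbers below are assumptions for illustration.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, drawn, overlap):
    """P(X >= overlap) for a hypergeometric draw without replacement.

    population: number of background proteins.
    annotated:  number of background proteins that are targets of the TF.
    drawn:      number of DEPs.
    overlap:    number of DEPs that are targets of the TF.
    """
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    tail = sum(comb(annotated, k) * comb(population - annotated, drawn - k)
               for k in range(overlap, upper + 1))
    return tail / total


# Hypothetical case: 10 background proteins, 5 are targets of the TF,
# 5 DEPs, and all 5 DEPs are targets.
toy_enrichment_p = hypergeom_enrichment_p(10, 5, 5, 5)
```

A TF would meet criterion (c) when this probability falls below 0.05.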
Construction of signaling transduction network
In Fig. 5h, i, the network among kinases and TFs was annotated using the
calculated correlation between TFs' phospho-sites and kinases. Four
phospho-sites were considered as key phospho-sites which affected
the TFs' activities. The correlations between the kinase activities and
the phospho-sites on TFs were calculated with pairwise Spearman's
correlation coefficients. P-values < 0.05 were considered as significant
correlation. Forty significantly positive correlations were used to
construct the kinase-TF regulation network (Fig. 5h, i). TGs were from
DEPs between two TF activity-based subtypes. The TF-TG regulated
network was built on CellNet38, and visualized and generated by the
software Cytoscape (version 3.6.1).
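The pairwise correlation step can be sketched as follows: Spearman's coefficient is the Pearson correlation of the ranks, with tied values given their average rank. This is a minimal illustration with hypothetical function and feature names. The significance filter (p < 0.05) is deliberately omitted here; in practice a statistics routine such as `scipy.stats.spearmanr` would supply the p-value, and only positive correlations passing that filter would become edges.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation (undefined if either input is constant)."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def positive_edges(kinase_activity, site_abundance, min_rho=0.0):
    """Candidate kinase -> phospho-site edges with rho above min_rho."""
    edges = []
    for kinase, k_vals in kinase_activity.items():
        for site, s_vals in site_abundance.items():
            rho = spearman_rho(k_vals, s_vals)
            if rho > min_rho:
                edges.append((kinase, site, rho))
    return edges


if __name__ == "__main__":
    # Hypothetical per-patient values (illustration only).
    kinases = {"kinase_A": [0.1, 0.4, 0.5, 0.9, 1.2]}
    sites = {"TF1_site": [1.0, 1.3, 1.2, 2.0, 2.4], "TF2_site": [3.0, 2.1, 2.2, 1.0, 0.5]}
    for edge in positive_edges(kinases, sites):
        print(edge)
```

The resulting edge list (source, target, weight) is in a form that network tools such as Cytoscape can import as a table.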
xCell
The abundance of 64 cell types, microenvironment scores,
immune scores, and stroma scores were inferred from proteome data via
xCell (https://xcell.ucsf.edu/). The density distribution was generated
based on immune and stroma scores. The differential cell types
between DGC and IGC in immune cluster 3 were compared based on
xCell scores.
Reporting summary
Further information on research design is available in the Nature
Portfolio Reporting Summary linked to this article.
Data availability
The MS raw data generated in this study have been deposited in the
ProteomeXchange Consortium (dataset identifier: PXD038214) via the
iProX69 partner repository under accession code IPX0004428000. The
normalized proteome, phospho-proteome, and TF activity data
matrices are available under this accession. The MS raw data of the
anti-PD1 group have been deposited in the ProteomeXchange Consortium
(dataset identifier: PXD038188) via the iProX69 partner repository
under accession code IPX0004819000. The targeted exome sequencing
data are available in the GSA70 (Genome Sequence Archive,
https://ngdc.cncb.ac.cn/gsa-human/) under restricted access
HRA002466 and HRA003612 (fastq files) because of data privacy laws
related to patient consent for data sharing; access can be obtained
through the Request Data steps on the GSA database website or by
contacting the corresponding author. The response time for access
requests is about 2 weeks. Once access has been granted, the data will
be available to download for 3 months. The TCGA publicly available
data used in this study are available in the Genomic Data Commons
Data Portal under accession code TCGA-STAD
(https://portal.gdc.cancer.gov/)7. The remaining data are available
within the Article, Supplementary Information or Source Data file.
Source data are provided with this paper.
References
1. Wild, C., Weiderpass, E. & Stewart, B. World Cancer Report: Cancer
Research for Cancer Prevention (WHO, 2020).
2. Chen, W. et al. Cancer statistics in China, 2015. CA Cancer J. Clin.
66, 115–132 (2016).
3. Lauren, P. The two histological main types of gastric carcinoma:
diffuse and so-called intestinal-type carcinoma. Acta Pathologica
Microbiologica Scandinavica 64, 31–49 (1965).
4. Smyth, E. C., Nilsson, M., Grabsch, H. I., van Grieken, N. C. T. &
Lordick, F. Gastric cancer. Lancet 396, 635–648 (2020).
5. Cancer Genome Atlas Research N. Comprehensive molecular
characterization of gastric adenocarcinoma. Nature 513,
202–209 (2014).
6. Cristescu, R. et al. Molecular analysis of gastric cancer identifies
subtypes associated with distinct clinical outcomes. Nat. Med. 21,
449–456 (2015).
7. Blum, A., Wang, P. & Zenklusen, J. C. SnapShot: TCGA-analyzed
tumors. Cell 173, 530 (2018).
8. Jinawath, N. et al. Comparison of gene-expression profiles between
diffuse- and intestinal-type gastric cancers using a genome-wide
cDNA microarray. Oncogene 23, 6830–6844 (2004).
9. Ge, S. et al. A proteomic landscape of diffuse-type gastric cancer.
Nat. Commun. 9, 1012 (2018).
10. Mun, D. G. et al. Proteogenomic characterization of human
early-onset gastric cancer. Cancer Cell 35, 111–124.e110 (2019).
11. Baluapuri, A., Wolf, E. & Eilers, M. Target gene-independent
functions of MYC oncoproteins. Nat. Rev. Mol. Cell Biol. 21,
255–267 (2020).
12. Han, B. et al. FOXC1: an emerging marker and therapeutic target for
cancer. Oncogene 36, 3957–3963 (2017).
13. Mashtalir, N. et al. Modular organization and assembly of SWI/SNF
family chromatin remodeling complexes. Cell 175, 1272–1288.e1220
(2018).
14. Mittal, P. & Roberts, C. W. M. The SWI/SNF complex in cancer
biology, biomarkers and therapy. Nat. Rev. Clin. Oncol. 17,
435–448 (2020).
15. Liu, T., Zhang, L., Joo, D. & Sun, S. C. NF-kappaB signaling in
inflammation. Signal Transduct. Target Ther. 2, 17023 (2017).
16. Ding, C. et al. Proteome-wide profiling of activated transcription
factors with a concatenated tandem array of transcription
factor response elements. Proc. Natl Acad. Sci. USA 110,
6771–6776 (2013).
17. Shi, W. et al. Transcription factor response elements on tip: a
sensitive approach for large-scale endogenous transcription factor
quantitative identification. Anal. Chem. 88, 11990–11994 (2016).
18. Morazán-Fernández, D. et al. In silico pipeline to identify
tumor-specific antigens for cancer immunotherapy using exome
sequencing data. Phenomics. 2, (2022).
19. Jiang, Y. et al. Tumor immune microenvironment and
chemosensitivity signature for predicting response to chemotherapy in
gastric cancer. Cancer Immunol. Res. 7, 2065–2073 (2019).
20. Kim, S. T. et al. Comprehensive molecular characterization of
clinical responses to PD-1 inhibition in metastatic gastric cancer. Nat.
Med. 24, 1449–1458 (2018).
21. Xu, J. Y. et al. Integrative proteomic characterization of human lung
adenocarcinoma. Cell 182, 245–261.e217 (2020).
22. Jiang, Y. et al. Proteomics identifies new therapeutic targets of
early-stage hepatocellular carcinoma. Nature 567,
257–261 (2019).
23. Li, C. et al. Integrated omics of metastatic colorectal cancer. Cancer
Cell 38, 734–747.e739 (2020).
24. Ding, C. et al. A fast workflow for identification and quantification of
proteomes. Mol. Cell Proteom. 12, 2370–2380 (2013).
25. Mertins, P. et al. Reproducible workflow for multiplexed deep-scale
proteome and phosphoproteome analysis of tumor tissues by liquid
chromatography-mass spectrometry. Nat. Protoc. 13,
1632–1661 (2018).
26. Wang, R. et al. Single-cell dissection of intratumoral heterogeneity
and lineage diversity in metastatic gastric adenocarcinoma. Nat.
Med. 27, 141–151 (2021).
27. Teng, S. et al. Tissue-specific transcription reprogramming
promotes liver metastasis of colorectal cancer. Cell Res. 30,
34–49 (2020).
28. Wiredja, D. D., Koyuturk, M. & Chance, M. R. The KSEA App: a
web-based tool for kinase activity inference from quantitative
phosphoproteomics. Bioinformatics 33, 3489–3491 (2017).
29. Wang, L. B. et al. Proteogenomic and metabolomic characterization
of human glioblastoma. Cancer Cell 39, 509–528.e520 (2021).
30. Hansford, S. et al. Hereditary diffuse gastric cancer syndrome:
CDH1 mutations and beyond. JAMA Oncol. 1, 23–32 (2015).
31. Zheng, R. et al. Cistrome data browser: expanded datasets and new
tools for gene regulatory analysis. Nucleic Acids Res. 47,
D729–D735 (2019).
32. Nagarajan, S. et al. ARID1A influences HDAC1/BRD4 activity, intrinsic
proliferative capacity and breast cancer treatment response. Nat.
Genet 52, 187–197 (2020).
33. Cheah, M. T. et al. CD14-expressing cancer cells establish the
inflammatory and proliferative tumor microenvironment in bladder
cancer. Proc. Natl Acad. Sci. USA 112, 4725–4730 (2015).
34. Wang, K. et al. Exome sequencing identifies frequent mutation of
ARID1A in molecular subtypes of gastric cancer. Nat. Genet 43,
1219–1223 (2011).
35. Aran, D., Hu, Z. & Butte, A. J. xCell: digitally portraying the tissue
cellular heterogeneity landscape. Genome Biol. 18, 220 (2017).
36. Kolch, W., Halasz, M., Granovskaya, M. & Kholodenko, B. N. The
dynamic control of signal transduction networks in cancer cells.
Nat. Rev. Cancer 15, 515–527 (2015).
37. Jackson, S. P. & Bartek, J. The DNA-damage response in human
biology and disease. Nature 461, 1071–1078 (2009).
38. Cahan, P. et al. CellNet: network biology applied to stem cell
engineering. Cell 158, 903–915 (2014).
39. Xiao, T. et al. High-resolution and multidimensional phenotypes can
complement genomics data to diagnose diseases in the neonatal
population. Phenomics. 2, (2022).
40. Ying, W. Phenomic studies on diseases: potential and challenges.
Phenomics. 3, (2023).
41. Wilkerson, M. D. & Hayes, D. N. ConsensusClusterPlus: a class
discovery tool with confidence assessments and item tracking.
Bioinformatics 26, 1572–1573 (2010).
42. Jimenez Fonseca, P. et al. Lauren subtypes of advanced gastric
cancer influence survival and response to chemotherapy: real-world
data from the AGAMENON National Cancer Registry. Br. J.
Cancer 117, 775–782 (2017).
43. Malumbres, M. & Barbacid, M. Cell cycle, CDKs and cancer: a
changing paradigm. Nat. Rev. Cancer 9, 153–166 (2009).
44. Ding, C. et al. Proteomics and precision medicine. Small Methods 3,
1900075 (2019).
45. Guan, H. et al. IKBKE is over-expressed in glioma and contributes to
resistance of glioma cells to apoptosis via activating NF-kappaB. J.
Pathol. 223, 436–445 (2011).
46. Zhai, J. et al. Cancer-associated fibroblasts-derived IL-8 mediates
resistance to cisplatin in human gastric cancer. Cancer Lett. 454,
37–43 (2019).
47. Iqbal, J. et al. Gene expression signatures delineate biological and
prognostic subgroups in peripheral T-cell lymphoma. Blood 123,
2915–2923 (2014).
48. Steggerda, S. M. et al. Inhibition of arginase by CB-1158 blocks
myeloid cell-mediated immune suppression in the tumor
microenvironment. J. Immunother. Cancer 5, 101 (2017).
49. Mohammadi, M., Nedrud, J., Redline, R., Lycke, N. & Czinn, S. J.
Murine CD4 T-cell response to Helicobacter infection: TH1 cells
enhance gastritis and TH2 cells reduce bacterial load.
Gastroenterology 113, 1848–1857 (1997).
50. Whary, M. T. et al. Intestinal helminthiasis in Colombian children
promotes a Th2 response to Helicobacter pylori: possible
implications for gastric carcinogenesis. Cancer Epidemiol. Biomark. Prev.
14, 1464–1469 (2005).
51. Sies, H. & Jones, D. P. Reactive oxygen species (ROS) as pleiotropic
physiological signalling agents. Nat. Rev. Mol. Cell Biol. 21,
363–383 (2020).
52. Hayes, J. D., Dinkova-Kostova, A. T. & Tew, K. D. Oxidative stress in
cancer. Cancer Cell 38, 167–197 (2020).
53. O'Leary, B., Finn, R. S. & Turner, N. C. Treating cancer with selective
CDK4/6 inhibitors. Nat. Rev. Clin. Oncol. 13, 417–430 (2016).
54. Van Dekken, H. et al. Molecular dissection of the chromosome band
7q21 amplicon in gastroesophageal junction adenocarcinomas
identifies cyclin-dependent kinase 6 at both genomic and protein
expression levels. Genes, Chromosomes Cancer 47,
649–656 (2008).
55. Ankers, J. M. et al. Dynamic NF-kappaB and E2F interactions control
the priority and timing of inflammatory signalling and cell
proliferation. Elife 5, e10473 (2016).
56. Liu, J. et al. TF-PROTACs enable targeted degradation of
transcription factors. J. Am. Chem. Soc. 143, 8902–8910 (2021).
57. Shen, L. et al. Management of gastric cancer in Asia:
resource-stratified guidelines. Lancet Oncol. 14, e535–e547 (2013).
58. Wisniewski, J. R., Zougman, A., Nagaraj, N. & Mann, M. Universal
sample preparation method for proteome analysis. Nat. Methods 6,
359–362 (2009).
59. Tsai, N. P., Lin, Y. L., Tsui, Y. C. & Wei, L. N. Dual action of epidermal
growth factor: extracellular signal-stimulated nuclear-cytoplasmic
export and coordinated translation of selected messenger RNA. J.
Cell Biol. 188, 325–333 (2010).
60. Tyanova, S., Temu, T. & Cox, J. The MaxQuant computational
platform for mass spectrometry-based shotgun proteomics. Nat.
Protoc. 11, 2301–2319 (2016).
61. Cox, J. et al. Accurate proteome-wide label-free quantification by
delayed normalization and maximal peptide ratio extraction,
termed MaxLFQ. Mol. Cell Proteom. 13, 2513–2526 (2014).
62. Schwanhausser, B. et al. Global quantification of mammalian gene
expression control. Nature 473, 337–342 (2011).
63. Liu, W. et al. Large-scale and high-resolution mass
spectrometry-based proteomics profiling defines molecular subtypes of
esophageal cancer for therapeutic targeting. Nat. Commun. 12,
4961 (2021).
64. Bolstad, B. M., Irizarry, R. A., Astrand, M. & Speed, T. P. A comparison
of normalization methods for high density oligonucleotide
array data based on variance and bias. Bioinformatics 19,
185–193 (2003).
65. Hornbeck, P. V. et al. 15 years of PhosphoSitePlus(R): integrating
post-translationally modified sites, disease variants and isoforms.
Nucleic Acids Res. 47, D433–D441 (2019).
66. Zhou, Q. et al. A mouse tissue transcription factor atlas. Nat.
Commun. 8, 15089 (2017).
67. Uhlen, M. et al. Towards a knowledge-based human protein Atlas.
Nat. Biotechnol. 28, 1248–1250 (2010).
68. Butler, A., Hoffman, P., Smibert, P., Papalexi, E. & Satija, R.
Integrating single-cell transcriptomic data across different conditions,
technologies, and species. Nat. Biotechnol. 36, 411–420 (2018).
69. Ma, J. et al. iProX: an integrated proteome resource. Nucleic Acids
Res. 47, D1211–D1217 (2019).
70. Chen, T. et al. The genome sequence archive family: toward
explosive data growth and diverse data types. Genomics Proteom.
Bioinforma. 19, 578–583 (2021).
Acknowledgements
This work was supported by the National Key R&D Program of China
(2022YFA1303200 [C.D.], 2022YFA1303201 [C.D.], 2020YFE0201600
[C.D.], 2018YFE0201600 [C.D.], 2018YFE0201603 [C.D.],
2018YFA0507500 [C.D.], 2018YFA0507501 [C.D.], 2017YFA0505100
[C.D.], 2017YFA0505102 [C.D.], 2017YFA0505101 [C.D.],
2017YFC0908404 [C.D.], 2016YFA0502500 [C.D.], 2018YFA0507503
[Yi.W.]); Program of Shanghai Academic/Technology Research Leader
(22XD1420100 [C.D.]); Shuguang Program of Shanghai Education
Development Foundation and Shanghai Municipal Education Commis-
sion (19SG02 [C.D.]); National Natural Science Foundation of China
(31972933 [C.D.], 31770886 [C.D.], 31700682 [C.D.], 81902514 [S.G.]);
Major Project of Special Development Funds of Zhangjiang National
Independent innovation Demonstration Zone (ZJ2019-ZD-004 [C.D.]);
Shanghai Municipal Science and Technology Major Project
(2017SHZDZX01 [C.D.]); the Fudan original research personalized sup-
port project [C.D.]; and Chinese Academy of Medical Sciences Innova-
tion Fund for Medical Sciences (CIFMS, 2019-12M-5-063 [F.H.]).
Author contributions
C.D., J.Q., Q.Z., L.C., F.H., Yi.W., and L.S. conceived and supervised the
project. B.B., C.X., K.Z., J.W., S.W., G.J., J.L., Y.N., W.L., X.W., J.C., and S.G.
coordinated the acquisition, distribution, and quality evaluation of GC
tumors and adjacent tissues. W.S., Yun.W., and N.Z. directed and per-
formed analyses and quality control of MS data. W.S., Yus.W., Y.L., and
C.D. performed proteomic data analyses. W.S., Y.L., Yus.W., and C.D.
wrote the manuscript.
Competing interests
The authors declare no competing interests.
Additional information
Supplementary information The online version contains
supplementary material available at
https://doi.org/10.1038/s41467-023-35797-6.
Correspondence and requests for materials should be addressed to Lin
Chen, Qingchuan Zhao, Lin Shen, Fuchu He, Jun Qin or Chen Ding.
Peer review information Nature Communications thanks Bing Zhang
and the other, anonymous, reviewer(s) for their contribution to the peer
review of this work.
Reprints and permissions information is available at
http://www.nature.com/reprints
Publisher's note Springer Nature remains neutral with regard to
jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons
Attribution 4.0 International License, which permits use, sharing,
adaptation, distribution and reproduction in any medium or format, as
long as you give appropriate credit to the original author(s) and the
source, provide a link to the Creative Commons license, and indicate if
changes were made. The images or other third party material in this
article are included in the article's Creative Commons license, unless
indicated otherwise in a credit line to the material. If material is not
included in the article's Creative Commons license and your intended
use is not permitted by statutory regulation or exceeds the permitted
use, you will need to obtain permission directly from the copyright
holder. To view a copy of this license, visit
http://creativecommons.org/licenses/by/4.0/.
© The Author(s) 2023
1 School of Life Sciences, Tsinghua University, Beijing 100084, China.
2 State Key Laboratory of Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (The PHOENIX center, Beijing), Beijing Institute of Lifeomics, Beijing 102206, China.
3 Department of Pathology, Zhongshan Hospital, Fudan University, Shanghai 200032, China.
4 State Key Laboratory of Genetic Engineering and Collaborative Innovation Center for Genetics and Development, School of Life Sciences, Institute of Biomedical Sciences, Human Phenome Institute, Zhongshan Hospital, Fudan University, Shanghai 200433, China.
5 Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), Peking University Cancer Hospital and Institute, Beijing 100142, China.
6 State Key Laboratory of Cancer Biology, National Clinical Research Center for Digestive Diseases and Department of Digestive Surgery, Xijing Hospital of Digestive Diseases, Fourth Military Medical University, Xi'an 710032, China.
7 Department of General Surgery & Institute of General Surgery, Chinese PLA General Hospital First Medical Center, Beijing 100853, China.
8 Research Unit of Proteomics Driven Cancer Precision Medicine, Chinese Academy of Medical Sciences, Beijing 102206, China.
9 These authors contributed equally: Wenhao Shi, Yushen Wang, Chen Xu, Yan Li.
e-mail: chenlinbj@vip.sina.com; zhaoqc@fmmu.edu.cn; shenlin@bjmu.edu.cn; hefc@nic.bmi.ac.cn; jqin1965@126.com; crickding@163.com
... These three components such as diffuse type, signet ring cells, and a poorly differentiation have been described as 0 20 40 60 80 100 120 140 0 20 40 60 Time, months Time, months 605 47 9 5 4 2 1 0 797 55 12 5 3 2 1 0 175 12 1 0 123 8 3 1 poor prognosis features amid GC population (21). The incidence of young age at the time of diagnosis has increased over the past decade, defined as "early onset of GC", these populations have demonstrated to highlight the prevalence of these 3 features of bad prognosis, besides higher metastatic risk (19,21,30). Besides, the early onset of gastric cancer has also been associated with hereditary diffuse gastric cancer, which is an autosomal dominant cancer syndrome caused by the inactivation of germline mutations in E-cadherin, tumor suppressor gene (CDH1), and less frequently variants in CTNNA1. ...
... Also, when analyzing elderly patients, women had higher risk of death compared to elderly man. Our findings are in agreement with several publications, where female patients have worse OS than males, regardless of age, due to the greater presence of poor prognostic factors such as histology and clinical stage as mentioned before (4,21,30,36). ...
Article
Full-text available
Background Incidence of young patients (aged 40 years or younger) diagnosed with gastric carcinoma has increased worldwide. Young GC diagnosis, have clinicopathological features that differ from elderly, and is correlated with bad prognosis factors. The purpose of this work is to describe the prevalence, clinic-pathological features, and prognosis of overall survival (OS) of young Latin-American patients with GC. Methods Retrospective, observational study. Included patients treated at the National Cancer Institute [2004–2020]. Statistical analysis: χ² and t-test, Kaplan-Meier, Log-Rank and Cox-Regression. Statistical significance differences were assessed when P was bilaterally <0.05. Results A total of 2,543 patients fulfilled the inclusion criteria. Young-patients were predominantly female (54%), with diffuse-type adenocarcinoma (68%), signet-ring-cell (72%), poor-differentiation (90%), and metastatic (79%). In OS analysis, patients with metastatic disease, showed differences regarding age, young patients reported a median-OS of 8 versus 13 months for elderly patients (P=0.001). Among young patients, differences were also observed regarding gender, young-female patients had a median-OS of 5 versus 11 months for young-man (P=0.001). Conclusions This is one of the pioneer studies correlating age with gender and the prognostic features of bad prognosis in Latin-American population. Besides, supports the idea that a global effort is required to improve awareness, prevention, and early diagnosis of GC.
... The validation cohort (https://www.iprox.cn//page/subproject. html?id=IPX0004819001) is a group of GC patients treated with anti-PD1 therapy, including seven responder cases and seven nonresponder cases (Shi et al., 2023). The sample is also FFPE, and the proteins were quantified using an MS-based label-free method. ...
Article
Full-text available
Background: Immune checkpoint inhibitors (ICIs) have revolutionized cancer treatment; however, a significant proportion of gastric cancer (GC) patients do not respond to this therapy. Consequently, there is an urgent need to elucidate the mechanisms underlying resistance to ICIs and identify robust biomarkers capable of predicting the response to ICIs at treatment initiation. Methods: In this study, we collected GC tissues from 28 patients prior to the administration of anti-programmed death 1 (PD-1) immunotherapy and conducted protein quantification using high-resolution mass spectrometry (MS). Subsequently, we analyzed differences in protein expression, pathways, and the tumor microenvironment (TME) between responders and non-responders. Furthermore, we explored the potential of these differences as predictive indicators. Finally, using machine learning algorithms, we screened for biomarkers and constructed a predictive model. Results: Our proteomics-based analysis revealed that low activity in the complement and coagulation cascades pathway (CCCP) and a high abundance of activated CD8 T cells are positive signals corresponding to ICIs. By using machine learning, we successfully identified a set of 10 protein biomarkers, and the constructed model demonstrated excellent performance in predicting the response in an independent validation set (N = 14; area under the curve [AUC] = 0.959). Conclusion: In summary, our proteomic analyses unveiled unique potential biomarkers for predicting the response to PD-1 inhibitor immunotherapy in GC patients, which may provide the impetus for precision immunotherapy.
... All data were processed using the Firmiana proteomics workstation 83 , which were used in our previous studies 46,84,85 . The DIA were searched using FragPipe (v12.1) with MSFragger (2.2) 86 , and the DDA files were searched using the Mascot search engine (version 2.4, Matrix Science Inc), against the NCBI human Refseq protein database. ...
Article
Full-text available
Cetuximab therapy is the major treatment for colorectal cancer (CRC), but drug resistance limits its effectiveness. Here, we perform longitudinal and deep proteomic profiling of 641 plasma samples originated from 147 CRC patients (CRCs) undergoing cetuximab therapy with multi-course treatment, and 90 healthy controls (HCs). COL12A1, THBS2, S100A8, and S100A9 are screened as potential proteins to distinguish CRCs from HCs both in plasma and tissue validation cohorts. We identify the potential biomarkers (RRAS2, MMP8, FBLN1, RPTOR, and IMPDH2) for the initial response prediction. In a longitudinal setting, we identify two clusters with distinct fluctuations and construct the model with high accuracy to predict the longitudinal response, further validated in the independent cohort. This study reveals the heterogeneity of different biomarkers for tumor diagnosis, the initial and longitudinal response prediction respectively in the first course and multi-course cetuximab treatment, may ultimately be useful in monitoring and intervention strategies for CRC.
Article
Gastric cancer (GC) is highly metastatic and characterized by HER2 amplification. Aberrant HER2 expression drives metastasis, therapy resistance, and tumor recurrence. HER2 amplification contributes to drug resistance by upregulating DNA repair enzymes and drug afflux proteins, reducing drug efficacy. HER2 modulates transcription factors critical for cancer stem cell properties, further impacting drug resistance. HER2 activity is influenced by HER-family ligands, promoting oncogenic signaling. These features point to HER2 as a targetable driver in GC. This review outlines recent advances in HER2-mediated mechanisms and their upstream and downstream signaling pathways in GC. Additionally, it discusses preclinical research investigation that comprehends trastuzumab-sensitizing phytochemicals, chemotherapeutics, and nanoparticles as adjunct therapies. These developments hold promise for improving outcomes and enhancing the management of HER2-positive GC.
Article
Full-text available
Gastric cancer (GC) is one of the most common cancers worldwide. Most patients are diagnosed at the progressive stage of the disease, and current anticancer drug advancements are still lacking. Therefore, it is crucial to find relevant biomarkers with the accurate prediction of prognoses and good predictive accuracy to select appropriate patients with GC. Recent advances in molecular profiling technologies, including genomics, epigenomics, transcriptomics, proteomics, and metabolomics, have enabled the approach of GC biology at multiple levels of omics interaction networks. Systemic biological analyses, such as computational inference of “big data” and advanced bioinformatic approaches, are emerging to identify the key molecular biomarkers of GC, which would benefit targeted therapies. This review summarizes the current status of how bioinformatics analysis contributes to biomarker discovery for prognosis and prediction of therapeutic efficacy in GC based on a search of the medical literature. We highlight emerging individual multi-omics datasets, such as genomics, epigenomics, transcriptomics, proteomics, and metabolomics, for validating putative markers. Finally, we discuss the current challenges and future perspectives to integrate multi-omics analysis for improving biomarker implementation. The practical integration of bioinformatics analysis and multi-omics datasets under complementary computational analysis is having a great impact on the search for predictive and prognostic biomarkers and may lead to an important revolution in treatment.
Article
Background Junctional adhesion molecule 2 (JAM2) plays a pivotal role in various biological processes, including proliferation, metastasis and angiogenesis, contributing to tumor progression. While previous studies have highlighted the polarizing functions of JAM2 in different cancer types, its specific role in lung adenocarcinoma (LUAD) remains unclear. Methods In this study, we harnessed multiple public databases to analyze the expression and prognostic significance of JAM2 in LUAD. Using the Linkedomics database, Matescape database and R package, we explored the associated genes, the potential biological functions and the impact of JAM2 on the tumor microenvironment. Our findings from public databases were further validated using real‐time quantitative PCR, western blot and immunohistochemistry. Additionally, in vitro experiments were conducted to assess the influence of JAM2 on LUAD cell proliferation, invasion, migration, apoptosis and epithelial–mesenchymal transition. Furthermore, we established a xenograft model to investigate the in vivo effects of JAM2 on tumorigenesis. Results Our results revealed a significant downregulation of JAM2 in LUAD, and patients with low JAM2 expression exhibited unfavorable overall survival outcomes. Functional enrichment analysis indicated that JAM2 may be associated with processes such as cell adhesion, extracellular matrix, cell junctions and regulation of proliferation. Notably, increased JAM2 expression correlated with higher tumor microenvironment scores and reduced immune cell abundance. Furthermore, overexpression of JAM2 induced apoptosis, suppressed tumor proliferation and exhibited potential inhibitory effects on tumor invasion and migration through the modulation of epithelial–mesenchymal transition. Additionally, in vivo experiments confirmed that JAM2 overexpression led to a reduction in tumor growth. 
Conclusion Overall, our study highlights the clinical significance of low JAM2 expression as a predictor of poor prognosis in LUAD patients. Moreover, JAM2 was found to exert inhibitory effects on various aspects of tumor progression. Consequently, JAM2 emerges as a promising prognostic biomarker and a potential therapeutic target for LUAD patients.
Article
Intestinal-type gastric cancer (IGC) is the most frequent type of gastric cancer in high-incidence populations. The early stages of IGC development successively include nonatrophic gastritis (NAG), chronic atrophic gastritis (CAG) and intestinal metaplasia (IM). However, the mechanisms of IGC development through these stages remain unclear. For this study, single-cell RNA-seq data related to IGC were downloaded from the GEO database, and immune cells of the tumor microenvironment (TME) were annotated using R software. Changes in the proportion of immune cells and altered cell-to-cell interactions were explored at different disease stages using R software, with a focus on plasma cells. Additionally, IGC samples from the TCGA database were used for immune cell infiltration analysis, and a Cox proportional hazards regression model was constructed to identify possible prognostic genes. The results indicated that for precancerous lesions, interactions between immune cells were mainly dominated by chemokines to stimulate the infiltration and activation of immune cells. In tumors, intercellular movement of upregulated molecules and amplified signals were associated with the tumor necrosis factor family and immunosuppression to escape immune surveillance and promote tumor growth. Regarding prognostic analysis, IGLC3, IGLV1-44, IGKV1-16, IGHV3-21, IGLV1-51, and IGLV3-19 were found to be novel biomarkers for IGC. Our analysis of the IGC single-cell atlas together with bulk transcriptome data contributes to understanding TME heterogeneity at the molecular level during IGC development and provides insights for elucidating the mechanism of IGC and discovering novel targets for precise therapy.
Article
Gastric cancer (GC) is a major public health problem worldwide, with high mortality rates due to late diagnosis and limited treatment options. Biomarker research is essential to improve the early detection of GC. Technological advances and research methodologies have improved diagnostic tools, identifying several potential biomarkers for GC, including microRNA, DNA methylation markers, and protein-based biomarkers. Although most studies have focused on identifying biomarkers in biofluids, the low specificity of these markers has limited their use in clinical practice. This is because many cancers share similar alterations and biomarkers, so obtaining them from the site of disease origin could yield more specific results. As a result, recent research efforts have shifted towards exploring gastric juice (GJ) as an alternative source for biomarker identification. Since GJ is a waste product during a gastroscopic examination, it could provide a "liquid biopsy" enriched with disease-specific biomarkers generated directly at the damaged site. Furthermore, as it contains secretions from the stomach lining, it could reflect changes associated with the developmental stage of GC. This narrative review describes some potential biomarkers for gastric cancer screening identified in gastric juice.
Article
The rapid development of research fields such as multi-omics and artificial intelligence (AI) has made it possible to acquire and analyze the multi-dimensional big data of human phenomes. Increasing evidence has indicated that phenomics can provide a revolutionary strategy and approach for discovering new risk factors, diagnostic biomarkers and precision therapies of diseases, which holds profound advantages over conventional approaches for realizing precision medicine: first, the big data of patients' phenomes can provide remarkably richer information than that of the genomes; second, phenomic studies on diseases may expose the correlations among cross-scale and multi-dimensional phenomic parameters as well as the mechanisms underlying the correlations; and third, phenomics-based studies are big data-driven studies, which can significantly enhance the possibility and efficiency of generating novel discoveries. However, phenomic studies on human diseases are still at an early developmental stage and face multiple major challenges and tasks: first, there is a significant deficiency in analytical and modeling approaches for analyzing the multi-dimensional data of human phenomes; second, it is crucial to establish universal standards for the acquisition and management of phenomic data of patients; third, new methods and devices for the acquisition of phenomic data of patients in clinical settings should be developed; fourth, it is necessary to establish regulatory and ethical guidelines for phenomic studies on diseases; and fifth, it is important to develop effective international cooperation. It is expected that phenomic studies on diseases will profoundly and comprehensively enhance our capacity for the prevention, diagnosis and treatment of diseases.
Article
Tumor-specific antigens or neoantigens are peptides that are expressed only in cancer cells and not in healthy cells. Some of these molecules can induce an immune response, and therefore, their use in immunotherapeutic strategies based on cancer vaccines has been extensively explored. Studies based on these approaches have been triggered by the current high-throughput DNA sequencing technologies. However, there is neither a universal nor a straightforward bioinformatic protocol to discover neoantigens using DNA sequencing data. Thus, we propose a bioinformatic protocol to detect tumor-specific antigens associated with single nucleotide variants (SNVs) or "mutations" in tumoral tissues. For this purpose, we used publicly available data to build our model, including exome sequencing data from colorectal cancer and healthy cells obtained from a single case, as well as frequent human leukocyte antigen (HLA) class I alleles in a specific population. HLA data from the Costa Rican Central Valley population were selected as an example. The strategy included three main steps: (1) pre-processing of sequencing data; (2) variant calling analysis to detect tumor-specific SNVs in comparison with healthy tissue; and (3) prediction and characterization of peptides (protein fragments, the tumor-specific antigens) derived from the variants, in the context of their affinity with frequent alleles of the selected population. In our model data, we found 28 non-silent SNVs, present in 17 genes in chromosome 1. The protocol yielded 23 strong-binder peptides derived from the SNVs for frequent HLA class I alleles in the Costa Rican population. Although the analyses were performed as an example to implement the pipeline, to our knowledge, this is the first study of an in silico cancer vaccine using DNA sequencing data in the context of the HLA alleles.
It is concluded that the standardized protocol not only identifies neoantigens in a specific population but also provides a complete pipeline for the eventual design of cancer vaccines using the best bioinformatic practices. Supplementary information: The online version contains supplementary material available at 10.1007/s43657-022-00084-9.
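Step (3) of the protocol summarized above starts from candidate peptides that span the altered residue. The following is only a minimal sketch of that enumeration, not the authors' pipeline: the function name `mutant_peptides`, the 0-based `pos` convention, the default window length of 9 residues, and the toy sequence are all assumptions made for illustration, and no HLA binding-affinity prediction is performed.

```python
from typing import List


def mutant_peptides(protein: str, pos: int, alt: str, k: int = 9) -> List[str]:
    """Return every k-residue window of the mutated protein that covers `pos`.

    Hypothetical helper: `pos` is a 0-based index into `protein`, and `alt`
    is the single substituted amino acid produced by a non-silent SNV.
    """
    if len(alt) != 1:
        raise ValueError("alt must be a single amino acid")
    if not 0 <= pos < len(protein):
        raise ValueError("pos is outside the protein sequence")
    if protein[pos] == alt:
        raise ValueError("substitution is silent; no mutant peptide exists")

    # Apply the substitution to obtain the mutated sequence.
    mutated = protein[:pos] + alt + protein[pos + 1:]

    # A window starting at s covers pos when s <= pos <= s + k - 1,
    # and it must also fit inside the sequence.
    first_start = max(0, pos - k + 1)
    last_start = min(pos, len(mutated) - k)
    return [mutated[s:s + k] for s in range(first_start, last_start + 1)]


if __name__ == "__main__":
    # Toy sequence (assumption, not real data): substitute M -> A at index 10.
    toy = "ACDEFGHIKLMNPQRSTVWY"
    for peptide in mutant_peptides(toy, pos=10, alt="A"):
        print(peptide)
```

In a real workflow each window would then be scored against the selected HLA class I alleles with a dedicated binding predictor; that step is deliberately left out here.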
Article
Esophageal cancer (EC) is a type of aggressive cancer without clinically relevant molecular subtypes, hindering the development of effective strategies for treatment. To define molecular subtypes of EC, we perform mass spectrometry-based proteomic and phosphoproteomic profiling of EC tumors and adjacent non-tumor tissues, revealing a catalog of proteins and phosphosites that are dysregulated in ECs. The EC cohort is stratified into two molecular subtypes—S1 and S2—based on proteomic analysis, with the S2 subtype characterized by the upregulation of spliceosomal and ribosomal proteins, and being more aggressive. Moreover, we identify a subtype signature composed of ELOA and SCAF4, and construct a subtype diagnostic and prognostic model. Potential drugs are predicted for treating patients of the S2 subtype, and three candidate drugs are validated to inhibit EC. Taken together, our proteomic analysis defines molecular subtypes of EC, thus providing a potential therapeutic outlook for improving disease outcomes in patients with EC.
Article
The Genome Sequence Archive (GSA) is a data repository for archiving raw sequence data, which provides data storage and sharing services for worldwide scientific communities. Considering explosive data growth with diverse data types, here we present the GSA family by expanding into a set of resources for raw data archive with different purposes, namely, GSA (https://ngdc.cncb.ac.cn/gsa/), GSA for Human (GSA-Human, https://ngdc.cncb.ac.cn/gsa-human/), and Open Archive for Miscellaneous Data (OMIX, https://ngdc.cncb.ac.cn/omix/). Compared with the 2017 version, GSA has been significantly updated in data model, online functionalities, and web interfaces. GSA-Human, as a new partner of GSA, is a data repository specialized in human genetics-related data with controlled access and security. OMIX, as a critical complement to the two resources mentioned above, is an open archive for miscellaneous data. Together, all these resources form a family of resources dedicated to archiving explosive data with diverse types, accepting data submissions from all over the world, and providing free open access to all publicly available data in support of worldwide research activities.
Article
Intratumoral heterogeneity (ITH) is a fundamental property of cancer; however, the origins of ITH remain poorly understood. We performed single-cell transcriptome profiling of peritoneal carcinomatosis (PC) from 15 patients with gastric adenocarcinoma (GAC), constructed a map of 45,048 PC cells, profiled the transcriptome states of tumor cell populations, incisively explored ITH of malignant PC cells and identified significant correlates with patient survival. The links between tumor cell lineage/state compositions and ITH were illustrated at transcriptomic, genotypic, molecular and phenotypic levels. We uncovered the diversity in tumor cell lineage/state compositions in PC specimens and defined it as a key contributor to ITH. Single-cell analysis of ITH classified PC specimens into two subtypes that were prognostically independent of clinical variables, and a 12-gene prognostic signature was derived and validated in multiple large-scale GAC cohorts. The prognostic signature appears fundamental to GAC carcinogenesis and progression and could be practical for patient stratification.
Article
Advances in genomic medicine have greatly improved our understanding of human diseases. However, the phenome is not well understood. High-resolution and multidimensional phenotypes have shed light on the mechanisms underlying neonatal diseases in greater detail and have the potential to optimize clinical strategies. In this review, we first highlight the value of analyzing traditional phenotypes using a data science approach in the neonatal population. We then discuss recent research on high-resolution, multidimensional, and structured phenotypes in neonatal critical diseases. Finally, we briefly introduce current technologies available for the analysis of multidimensional data and the value that can be provided by integrating these data into clinical practice. In summary, a time series of the multidimensional phenome can improve our understanding of disease mechanisms and diagnostic decision-making, stratify patients, and provide clinicians with optimized strategies for therapeutic intervention; however, the available technologies for collecting multidimensional data and the best platform for connecting multiple modalities should be considered.
Article
Glioblastoma (GBM) is the most aggressive nervous system cancer. Understanding its molecular pathogenesis is crucial to improving diagnosis and treatment. Integrated analysis of genomic, proteomic, post-translational modification and metabolomic data on 99 treatment-naive GBMs provides insights into GBM biology. We identify key phosphorylation events (e.g., phosphorylated PTPN11 and PLCG1) as potential switches mediating oncogenic pathway activation, as well as potential targets for EGFR-, TP53-, and RB1-altered tumors. Immune subtypes with distinct immune cell types are discovered using bulk omics methodologies, validated by snRNA-seq, and correlated with specific expression and histone acetylation patterns. Histone H2B acetylation in classical-like and immune-low GBM is driven largely by BRDs, CREBBP, and EP300. Integrated metabolomic and proteomic data identify specific lipid distributions across subtypes and distinct global metabolic changes in IDH-mutated tumors. This work highlights biological relationships that could contribute to stratification of GBM patients for more effective treatment.
Article
We integrate the genomics, proteomics, and phosphoproteomics of 480 clinical tissues from 146 patients in a Chinese colorectal cancer (CRC) cohort, among which 70 had metastatic CRC (mCRC). Proteomic profiling differentiates three CRC subtypes characterized by distinct clinical prognosis and molecular signatures. Proteomic and phosphoproteomic profiling of primary tumors alone successfully distinguishes cases with metastasis. Metastatic tissues exhibit high similarities with primary tumors at the genetic but not the proteomic level, and kinase network analysis reveals significant heterogeneity between primary colorectal tumors and their liver metastases. In vivo xenograft-based drug tests using 31 primary and metastatic tumors show personalized responses, which could also be predicted by kinase-substrate network analysis no matter whether tumors carry mutations in the drug-targeted genes. Our study provides a valuable resource for better understanding of mCRC and has potential for clinical application.
Article
Gastric cancer is the fifth most common cancer and the third most common cause of cancer death globally. Risk factors for the condition include Helicobacter pylori infection, age, high salt intake, and diets low in fruit and vegetables. Gastric cancer is diagnosed histologically after endoscopic biopsy and staged using CT, endoscopic ultrasound, PET, and laparoscopy. It is a molecularly and phenotypically highly heterogeneous disease. The main treatment for early gastric cancer is endoscopic resection. Non-early operable gastric cancer is treated with surgery, which should include D2 lymphadenectomy (including lymph node stations in the perigastric mesentery and along the celiac arterial branches). Perioperative or adjuvant chemotherapy improves survival in patients with stage 1B or higher cancers. Advanced gastric cancer is treated with sequential lines of chemotherapy, starting with a platinum and fluoropyrimidine doublet in the first line; median survival is less than 1 year. Targeted therapies licensed to treat gastric cancer include trastuzumab (HER2-positive patients first line), ramucirumab (anti-angiogenic second line), and nivolumab or pembrolizumab (anti-PD-1 third line).