Mining for regulatory programs in the cancer transcriptome

June 2005
Nature Genetics 37(6):579-83

DOI:10.1038/ng1578

Source
PubMed

Authors:

Daniel R Rhodes

Thermo Fisher Scientific

Shanker Kalyana-Sundaram

University of Michigan

Vasudeva Mahavisno

University of Michigan


DNA microarrays have been widely applied to cancer transcriptome analysis. The Oncomine database contains a large collection of such data, as well as hundreds of derived gene-expression signatures. We studied the regulatory mechanisms responsible for gene deregulation in these cancer signatures by searching for the coordinate regulation of genes with common transcription factor binding sites. We found that genes with binding sites for the archetypal cancer transcription factor, E2F, were disproportionately overexpressed in a wide variety of cancers, whereas genes with binding sites for other transcription factors, such as Myc-Max, c-Rel and ATF, were disproportionately overexpressed in specific cancer types. These results suggest that alterations in pathways activating these transcription factors may be responsible for the observed gene deregulation and cancer pathogenesis.


NATURE GENETICS

VOLUME 37

NUMBER 6

JUNE 2005 579

ANALYSIS

Global gene expression profiling with DNA microarrays has been widely applied to human cancer, leading to the elucidation of complex gene-expression programs activated and repressed in various types and subtypes of cancer, a ‘molecular taxonomy’ of cancer. We and others have attempted to characterize large collections of cancer gene-expression data in terms of common ‘signatures’ of activation (refs. 1,2) or in terms of coordinately regulated processes or ‘modules’ (ref. 3). But such efforts have not focused on the regulatory mechanisms responsible for observed gene-expression alterations in cancer. Some gene-expression patterns observed from microarray data probably represent a downstream read-out of a few genetic aberrations (mutations, amplifications, deletions, translocations, etc.) that led to the activation or inactivation of a few transcription factors. In some cases, cancer-causing genetic aberrations may not be directly apparent from these downstream gene-expression read-outs. For example, a mutation in the Rb tumor suppressor that leads to dissociation and activation of E2F1 would manifest not in the differential gene-expression of Rb or E2F1, but in the coordinate activation of E2F target genes. Global methods for inferring transcriptional regulatory mechanisms from gene-expression data have been widely applied to yeast gene expression and also to the human cell cycle (ref. 4) but not yet to human cancer. We searched for cancer regulatory programs that link transcription factors to target genes that are conditionally activated in specific cancer types and subtypes (Fig. 1).

We began by defining gene-expression signatures characteristic of a wide variety of cancer types and subtypes represented in the Oncomine database (ref. 5). We used data from 65 independent studies including 6,732 microarray experiments and 70.8 million gene-expression measurements to derive 265 gene-expression signatures (Supplementary Table 1 online). Signatures were defined as sets of genes with statistically significant (Q < 0.10) differential expression in cancer, either relative to normal tissue or relative to other types or subtypes of cancer. We also derived normal tissue signatures as sets of genes differentially expressed in a single normal tissue type relative to other normal tissue types. Gene-expression signatures ranged in size from 20 genes to 2,200 genes and represented nearly every major type of cancer and normal tissue.

Next, we constructed a database of transcriptional regulatory signatures, relating transcription factors to candidate target genes by identifying putative transcription factor binding sites in the promoter sequences of human genes. We submitted all 1-kb human promoter sequences to the MATCH software program, which identifies and scores sequence matches to transcription factor binding site position weight matrices from the TRANSFAC database (ref. 6). Although a high-scoring match does not constitute a definitive transcription factor binding site and regulatory interaction, we reasoned that the sets of genes with the highest scoring matches are likely to be enriched for true target genes. After we applied a match threshold and rank filter, our database contained 361 regulatory signatures, comprising 466,491 potential regulatory interactions that represent putative transcription factor binding sites in the promoters of candidate target genes (Supplementary Table 2 online). There are several limitations to this approach, and we concede that our database is incomplete and probably contains false interactions. But the database is sufficient for large-scale enrichment analysis as well as initial hypothesis generation.

Daniel R Rhodes, Shanker Kalyana-Sundaram, Vasudeva Mahavisno, Terrence R Barrette, Debashis Ghosh & Arul M Chinnaiyan

Department of Pathology, Bioinformatics Program, Comprehensive Cancer Center, and Departments of Biostatistics and Urology, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA. Correspondence should be addressed to A.M.C. (arul@umich.edu).

Published online 26 May 2005; doi:10.1038/ng1578

With a database of gene-expression signatures and transcriptional regulatory signatures in place, we sought to identify conditional regulatory programs (CRPs), consisting of a transcription factor that coordinately regulates a set of target genes in a particular tissue type. We identified candidate CRPs by searching for disproportionate overlap of regulatory signatures with gene-expression signatures. We reasoned that if a transcription factor is responsible for the coordinate regulation of a set of genes in a given tissue type, then the candidate target genes of the transcription factor should be disproportionately over-represented in the gene-expression signature. We compared 265 gene-expression signatures with 361 regulatory signatures. We counted the degree of overlap between all signature pairs and computed the significance of the overlaps by the binomial distribution. From this analysis, we defined 311 regulatory programs that showed highly significant overlap (P < 0.00033) between a gene-expression signature and a regulatory signature (Supplementary Tables 3–6 online). Given the total number of hypotheses tested, we would expect only 31 signature pairs to have such significant overlap by chance (Q < 0.10; Fig. 2a).
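The chance expectation quoted here can be checked with simple arithmetic. The following is a minimal sketch, assuming that every one of the 265 gene-expression signatures was tested against every one of the 361 regulatory signatures retained after filtering; the function name is illustrative and does not come from the paper.

```python
def expected_false_positives(n_expression, n_regulatory, p_threshold):
    """Expected number of signature pairs passing the P threshold by chance."""
    # Under the null hypothesis, each test passes with probability p_threshold.
    return n_expression * n_regulatory * p_threshold


n_tests = 265 * 361                      # all signature pairs (assumed)
expected = expected_false_positives(265, 361, 0.00033)
global_q = expected / 311                # expected false / observed significant

print(n_tests, round(expected, 1), round(global_q, 2))
```

Under these assumptions the expected count comes out near 31.6, in line with the roughly 31 chance overlaps and the global Q of about 0.10 reported in the text.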

We first examined the 81 CRPs specific to normal human tissues. Several of the programs validated our method by specifically linking a transcription factor to the tissue type in which it is known to act. For example, CRPs 11 and 58 are composed of genes activated in normal liver tissue that have promoter binding sites for HNF4α and HNF1, respectively, two hepatocyte nuclear factors known to control liver-specific processes (refs. 7,8; Fig. 2b). CRP 222 is composed of genes activated in muscle tissue that have promoter binding sites for MEF2A (also called RSRFC4), a transcription factor with a known role in myocyte differentiation (ref. 9). CRP 24 is composed of genes activated in normal brain tissue that have promoter binding sites for the EGR (also called KROX) family of transcription factors (ref. 10). CRPs 32, 136 and 307 are composed of genes activated in normal blood cells that have binding sites for the IRF family of transcription factors, which are activated in white blood cells and have a role in host defense (ref. 11). Similarly, binding sites for NF-κB, which has a central role in immune function, are also enriched in genes expressed in blood cells (CRPs 178 and 264; ref. 12). The gene-expression signature for early progenitor cells showed an enrichment for Oct/POU family transcription factors, which function in pluripotent stem cells (refs. 13,14). Taken together, these results show that our approach can identify regulatory mechanisms responsible for gene regulation in human tissue. They also suggest sets of target genes for each of these tissue-specific regulatory programs. Other normal tissue CRPs may link transcription factors to tissue types in which they were not previously known to act. The full list of normal tissue CRPs is provided in Supplementary Table 4 online and can be explored in detail at Oncomine.

Next, we examined the 232 CRPs involving human cancer. More than half (126) of these relate one of several variant E2F binding sites to target genes in one of many cancer types, including follicular lymphoma, Burkitt lymphoma, diffuse large B-cell lymphoma (DLBCL), acute lymphoblastic leukemia, glioblastoma, medulloblastoma, leiomyosarcoma, small cell lung cancer (SCLC), squamous cell lung cancer, hepatocellular carcinoma, salivary adenoid cystic carcinoma, adrenocortical carcinoma, high-grade astrocytoma and high-grade breast carcinoma. These results reaffirm that activation of the E2F pathway is a prevalent event in human cancer (refs. 15,16) and provide hundreds of putative E2F targets activated in specific human tumors. As others have found, we show through a second layer of enrichment analysis that E2F CRPs include genes involved in several cellular proliferation-related processes such as the cell cycle, DNA replication and mRNA splicing. We also show that several E2F cancer regulatory programs, including those activated in high-grade breast cancer (CRP 99) and small cell lung cancer (CRP 96), are enriched for proteins involved in chromatin modification, including EZH2, JJAZ1 (SUZ12), CBX3, HMGA1 and BAF53A. The E2F pathway regulates EZH2 (ref. 17); perhaps its role in regulating chromatin-modifying genes is more widespread. Regulatory programs linking NF-Y binding sites to cancer signatures were also common and usually coincided with E2F cancer regulatory programs. This is not surprising as the binding of the NF-Y transcription factor to certain promoters is necessary for E2F activity. NF-Y transactivation is dependent on phosphorylation by CDK2, suggesting a potential therapeutic approach for repressing NF-Y and thus E2F cancer regulatory programs.

To confirm that the E2F cancer regulatory programs represent gene sets truly activated by E2F, we collected data from an independent study that identified transcriptional targets of the E2F family in an inducible cell line system. In total, 558 genes were significantly overexpressed upon E2F activation. We reasoned that if at least a fraction of these results represented a physiologically relevant E2F signature, and if our E2F cancer regulatory programs represented valid programs activated by E2F in human cancer in vivo, then we should find substantial overlap between the two. To test this, we selected a representative E2F binding site and its ten respective cancer regulatory programs. In nine of the ten CRPs, we found a significant enrichment of genes from the in vitro E2F signature (P < 0.005), suggesting that our approach identified valid E2F targets activated in human cancers (Supplementary Table 7 online).

To select the most promising candidate E2F targets among our CRPs, we identified target genes that are activated by E2F in the inducible cell culture system and are most common in E2F CRPs. Among the nine CRPs that showed significant overlap with the in vitro signature, eight contained three known E2F targets, including CCNE2, RRM2 (ref. 20) and EZH2 (ref. 17). Other known E2F targets activated in many cancer CRPs include TFDP1, CDC25, RPA1 and USP13. The near universal activation of these genes by E2F in cancer suggests that they are crucial mediators of carcinogenesis. For example, the observed activation of cyclin E2 by E2F is part of an autoregulatory loop, as cyclin E2 activates CDKs, which further activate E2F. Most of these E2F target genes have a role in cellular proliferation; therefore, their importance in E2F-mediated tumorigenesis is not surprising. Notably, however, EZH2 functions as a chromatin-modifying transcriptional repressor and is important in embryogenesis. Functional studies showed that EZH2 promotes invasion in breast cancer cells and is associated with a lethal phenotype in prostate cancer. EZH2 is also associated with poorly differentiated cancers relative to their well-differentiated counterparts. Perhaps the hyperactivated E2F pathway leads to EZH2 overexpression and concomitant repression of prodifferentiation genes, thus locking cells into an undifferentiated invasive phenotype. This raises the possibility that the E2F pathway is responsible for both cancer growth and dedifferentiation.

Figure 1 Overview of the method used to elucidate CRPs. Data were integrated from three sources: TRANSFAC (ref. 6), the University of California Santa Cruz (UCSC) genome browser and Oncomine (ref. 5). Putative transcription factor regulatory signatures were compared with gene-expression signatures, and their overlap was assessed using the binomial distribution to derive CRPs.

[Figure 1 flowchart, image not reproduced. TRANSFAC (transcription factor position weight matrices) and the UCSC genome browser (RefSeq 1-kb promoter sequences) feed putative binding site identification ((1) Match algorithm, (2) rank threshold), yielding putative regulatory signatures (466,491 candidate binding sites). Oncomine (microarrays) feeds differential expression analysis ((1) t-test, (2) false discovery rate correction), yielding gene-expression signatures (205 cancer signatures, 29 normal tissue signatures). Both feed the enrichment analysis, which yields CRPs of the form "Transcription factor X regulates target gene Y in tissue Z".]

Our analysis uncovered several other cancer regulatory programs involving transcription factors other than E2F. For example, CRP 120 suggests that c-Rel activates hundreds of target genes in DLBCL, and CRP 45 suggests that c-Rel activity is most apparent in de novo DLBCL relative to transformed DLBCL (Fig. 2c). The enrichment of c-Rel binding sites in the promoters of genes activated in DLBCL is consistent with the observation of c-Rel amplification in DLBCL. Although one report failed to find a link between c-Rel amplification status and downstream gene-expression changes, our results suggest that c-Rel activity is evident from DLBCL gene-expression patterns. Transformed and de novo DLBCL are markedly different at the gene-expression level though morphologically indistinguishable. Our work suggests that a key difference may be the specific activation of the c-Rel regulatory program in de novo DLBCL. Upon examination of the target genes in the c-Rel–DLBCL program, we found an enrichment of genes involved in both cell proliferation and apoptosis. We found that E2F1 was among the target gene set; our analysis also identified an E2F1-DLBCL regulatory program. These results suggested that there may be a two-tiered regulatory mechanism beginning with c-Rel activation of target genes, which include E2F1, and then E2F1 activation of its target genes, many of which have a role in cellular proliferation (Fig. 2c).
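The two-tiered pattern described above amounts to a simple set lookup: one program's transcription factor appears among the target genes of another program in the same tissue. The sketch below illustrates that lookup on toy data. The dictionary layout, the function name and every gene symbol other than E2F1 are hypothetical placeholders, and it assumes a factor name can be matched directly to a gene symbol.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy programs: (factor, tissue) -> candidate target genes.
# Only the presence of E2F1 among the c-Rel targets reflects the text.
programs: Dict[Tuple[str, str], Set[str]] = {
    ("c-Rel", "de novo DLBCL"): {"E2F1", "GENE_A", "GENE_B"},
    ("E2F1", "de novo DLBCL"): {"GENE_C", "GENE_D"},
    ("HNF4", "liver"): {"GENE_E"},
}


def two_tier_pairs(
    programs: Dict[Tuple[str, str], Set[str]]
) -> List[Tuple[str, str, str]]:
    """Return (upstream factor, downstream factor, tissue) triples in which
    the downstream factor is a target gene of the upstream factor and both
    programs belong to the same tissue."""
    pairs = []
    for (up_tf, up_tissue), up_targets in programs.items():
        for (down_tf, down_tissue) in programs:
            same_tissue = down_tissue == up_tissue
            if same_tissue and down_tf != up_tf and down_tf in up_targets:
                pairs.append((up_tf, down_tf, up_tissue))
    return sorted(pairs)


print(two_tier_pairs(programs))
```

On real data the factor-to-gene mapping would need care, because one binding-site matrix can correspond to several family members.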

Another regulatory program (CRP 240) details an abundance of c-Myc–Max binding sites among genes overexpressed in SCLC. This is consistent with the known amplification and overexpression of the Myc family of transcription factors in SCLC (refs. 28,29). Enrichment analysis of this program identified a preponderance of genes involved in DNA metabolism and the cell cycle. Furthermore, we found that this program significantly overlapped with the E2F1-SCLC regulatory program (CRP 96; P = 0.01), suggesting that several genes are dually activated by E2F1 and Myc in SCLC. Myc binding sites were also common among genes activated in normal umbilical endothelial cells (CRP 186) as well as in adrenocortical carcinoma (CRP 320). The final regulatory program that we explored suggested that ATF activates target genes in salivary carcinoma, of which a disproportionate number are involved in cell migration (CRP 177). ATF1 is activated as a fusion protein in metastatic melanoma, but no link between ATF and salivary carcinoma currently exists. If the ATF program is indeed overactivated in salivary carcinoma, then therapies targeting ATF in melanoma may be useful.

We observed that many of the transcription factors involved in cancer regulatory programs have oncogenic activity, suggesting that their predicted regulatory function in CRPs may be important in carcinogenesis. Transcription factors identified by our analysis with a causative role in cancer include E2F1 (ref. 32), Myc, c-Rel, ATF1 (ref. 30) and C-ETS-1 (ref. 35). All regulatory programs and their target genes can be explored through our web-based data-mining platform, Oncomine.

Figure 2 Regulatory programs encoded in gene-expression signatures. (a) Regulatory programs were inferred if a gene-expression signature significantly overlapped with a signature of candidate transcription factor targets. The number of significant overlaps observed (red) is compared with the number expected by chance (black). (b) A representative normal tissue regulatory program linking the HNF4 transcription factor to 78 target genes (20 shown) exclusively activated in normal liver tissue (Li) relative to several other normal tissues. (c) Two regulatory programs activated in de novo DLBCL relative to post-transformation DLBCL and follicular lymphoma. The transcription factor of the second program (E2F1) is a target gene in the first program (c-Rel), suggesting a two-tier regulatory mechanism. Red indicates relative overexpression of genes (rows) in the profiled samples (columns); blue indicates relative underexpression.

[Figure 2 image not reproduced. Panel a axes: overlap significance (negative log P value) against count of signature pairs (log scale, 100 to 10,000). Panel b: HNF4 program, liver against 19 other tissue types. Panel c: c-Rel and E2F1 programs, with sample columns grouped as transformed, follicular and de novo.]

The identification of a CRP implies that a specific transcription factor is active in a specific tissue type and is responsible for the observed gene regulation or deregulation. Activation of a transcription factor can occur either as a downstream effect of a signaling cascade (e.g., phosphorylation, nuclear translocation) or by overexpression of the transcription factor itself. To identify those CRPs that may be regulated by the latter mechanism, we searched for concomitant overexpression of the transcription factors that bind to the enriched binding sites. We found overexpression of a respective transcription factor for 79 of the 311 identified CRPs (25%). This analysis may provide a level of validity to some CRPs and, in some cases, may identify the specific transcription factor among a family that is responsible for the target gene regulation. For example, the CREB-ATF binding site is enriched among genes overexpressed in salivary carcinoma (CRP 250). We found that both CREB1 and ATF5 are significantly overexpressed in salivary carcinoma, suggesting that they may be responsible for the CREB-ATF CRP (Supplementary Fig. 1 online). We also observed concomitant overexpression of the expected transcription factor with several of the normal tissue CRPs, including HNF4A in liver, MEF2A in muscle, EGR3 and EGR4 in brain and multiple IRFs in blood (Supplementary Table 8 online).
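The check for concomitant overexpression can be pictured as an intersection between the family of factors that bind an enriched site and the overexpressed genes of the same signature. This is a minimal sketch under that reading. The mapping from binding site to family members and the signature contents are hypothetical toy data, and the function name is illustrative.

```python
from typing import Dict, List, Set


def overexpressed_factors(
    site_to_factor_genes: Dict[str, Set[str]],
    site: str,
    signature_genes: Set[str],
) -> List[str]:
    """Return the family members that bind `site` and are themselves
    members of the overexpressed gene-expression signature."""
    family = site_to_factor_genes.get(site, set())
    return sorted(gene for gene in family if gene in signature_genes)


# Hypothetical mapping and signature; only CREB1 and ATF5 being
# overexpressed in salivary carcinoma comes from the text.
site_to_factor_genes = {"CREB-ATF": {"CREB1", "ATF1", "ATF5"}}
salivary_signature = {"CREB1", "ATF5", "GENE_X"}

print(overexpressed_factors(site_to_factor_genes, "CREB-ATF", salivary_signature))
```

An empty result does not rule out a program, since the text notes that a factor can also be activated by upstream signaling without any change in its own expression.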

In summary, integrative bioinformatics analyses similar to those carried out by Segal et al. (ref. 3), our group and others (refs. 1,36) will generate new hypotheses about cancer progression. Previously, by carefully maintaining clinical annotations of the specific tissue specimens analyzed, we were able to identify gene alterations that were common to cancer regardless of tissue of origin as well as gene signatures characteristic of more aggressive dedifferentiated cancers. In this report, our integrative approach of analyzing gene-expression signatures in the context of candidate regulatory signatures identified hundreds of normal tissue and cancer CRPs, of which we have highlighted only a few. Several of these regulatory programs link a transcription factor to a tissue type in which the transcription factor is thought to act, whereas several others suggest new regulatory mechanisms in cancer and normal tissue, such as ATF pathway activation in salivary carcinoma. Furthermore, the regulatory programs uncovered by our analysis suggest candidate target genes, such as E2F1 activation by c-Rel in de novo DLBCL and chromatin-modifying genes by E2F in high-grade breast cancer. Though powerful, our approach has several limitations: (i) the number of characterized transcription factor binding sites, (ii) the accuracy of the binding sites, (iii) the facts that we only scanned 1-kb promoters and that binding sites are likely to occur outside this region and (iv) the number of genes profiled in the microarray studies and the sensitivity of the various microarray platforms. Despite these limitations, we were able to discern several regulatory mechanisms encoded in gene-expression signatures. We anticipate that our approach will become more valuable as the accuracy and coverage of transcription factor target databases improve.

METHODS

Cancer signatures. We derived cancer signatures from the Oncomine cancer microarray database. We used 65 independent data sets comprising 6,348 samples (arrays) and 70.9 million gene-expression measurements. The samples spanned 26 normal and cancer tissue types. The 65 data sets measured an average of 6,376.5 (range 507–15,294) unique genes as determined by Entrez Gene. We analyzed differential expression with Student’s t-test and false discovery rates to identify genes with significant differential expression between two classes of samples. We defined gene-expression signatures from analyses that resulted in 20 or more significant genes (Q < 0.10, mean difference > 0.5 Z-score units), for a total of 234 gene-expression signatures with an average size of 398 genes (range 20–2,997). Twenty-nine were normal human tissue signatures, and 205 were cancer signatures, of which 68 were derived from comparisons of a cancer type and other cancer types, 50 from comparisons of a cancer type and the respective normal tissue, 22 from comparisons of various molecular subtypes of cancer and 12 from comparisons of histologic subtypes of cancer.
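The selection rule described above can be sketched in a few lines. The first function computes a two-sample Student's t statistic with pooled variance. The second applies the stated cut-offs (Q < 0.10, mean difference > 0.5 Z-score units, at least 20 genes) to per-gene results that are assumed to be computed already. Function names and the input layout are illustrative, and treating the mean difference as one-directional (overexpression only) is an assumption.

```python
import math
from statistics import mean, variance
from typing import Dict, Optional, Sequence, Set, Tuple


def student_t(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sample Student's t statistic (equal-variance, pooled form)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))


def define_signature(
    gene_stats: Dict[str, Tuple[float, float]],
    q_max: float = 0.10,
    min_diff: float = 0.5,
    min_genes: int = 20,
) -> Optional[Set[str]]:
    """gene_stats maps gene -> (Q value, mean difference in Z-score units).
    Returns the genes passing both cut-offs, or None when fewer than
    min_genes pass and no signature is defined."""
    selected = {
        gene
        for gene, (q, diff) in gene_stats.items()
        if q < q_max and diff > min_diff
    }
    return selected if len(selected) >= min_genes else None
```

Converting the t statistic to a P value needs the t distribution, which the standard library does not provide, so that step is left out of this sketch.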

Regulatory signatures. We defined regulatory signatures by scanning human gene promoter sequences for the presence of experimentally defined transcription factor binding sites. We downloaded 1-kb promoter sequences from 20,647 RefSeq reference sequences from the University of California Santa Cruz genome browser (August 2004). These reference sequences mapped to 15,665 unique genes (Entrez Gene). In cases with multiple reference sequences per gene, we analyzed each promoter sequence independently. We submitted sequences sequentially to MATCH, a component of the TRANSFAC Professional Suite, which scans a sequence for the presence of transcription factor binding sites as determined by a database of position weight matrices. We applied the following settings: ‘group of matrices’ was set to ‘vertebrates’; ‘use high quality matrices’ was selected; ‘cut-off selection’ was set to 0.8 and 0.85 ‘as mat. sim and core sim. cutoff.’ For each promoter sequence, the program output ‘hits’ designated by matrix identifier and factor name. For each hit, the position, strand, core match and matrix match were provided. In total, 366 distinct matrices were identified in the promoters of human genes, although many of the matrices represent variants of the same transcription factor binding site. With the aforementioned settings, 16,159,457 hits were identified. Because in some cases, our lenient match threshold identified hits in nearly every promoter sequence, we filtered the hit list to contain only the top 2,000 hits per matrix sorted by the matrix similarity score. Five matrices with greater than 2,000 perfect matches (score = 1.0) were removed from the analysis. To ensure that our results would be robust to the selected hit threshold, the analysis was rerun with 1,500 and 2,500 hit thresholds. As expected, we obtained largely overlapping results (data not shown). After mapping reference sequences to Entrez Gene, we defined 466,491 potential regulatory interactions. Transcription factor matrices had an average of 1,292.2 potential gene targets (range 4–1,554).
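The rank filter described above can be sketched as follows. This is an illustration only: the tuple layout of a hit and the function name are assumptions, MATCH itself is not called, and the later mapping of reference sequences to Entrez Gene is omitted.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Hit = Tuple[str, str, float]  # (matrix id, promoter id, matrix similarity)


def rank_filter(
    hits: Iterable[Hit], top_n: int = 2000, perfect_score: float = 1.0
) -> Dict[str, List[str]]:
    """Drop matrices with more than top_n perfect matches, then keep the
    top_n highest-scoring promoters for each remaining matrix."""
    by_matrix = defaultdict(list)
    for matrix_id, promoter_id, score in hits:
        by_matrix[matrix_id].append((score, promoter_id))

    kept = {}
    for matrix_id, scored in by_matrix.items():
        n_perfect = sum(1 for score, _ in scored if score >= perfect_score)
        if n_perfect > top_n:
            continue  # matrix matches too many promoters perfectly to rank
        # Stable sort by score only; tied hits keep their input order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        kept[matrix_id] = [promoter for _, promoter in scored[:top_n]]
    return kept
```

Rerunning with `top_n=1500` and `top_n=2500` would correspond to the robustness check mentioned in the text.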

Enrichment analysis. We assessed each gene-expression signature ($S_{\mathrm{exp}}$) for the significant enrichment of each regulatory signature ($S_{\mathrm{reg}}$). The possible set for each gene-expression signature ($P_{\mathrm{exp}}$) was defined as the set of measured genes in each respective data set. The possible set for regulatory signatures ($P_{\mathrm{reg}}$) was defined as the set of genes with available promoter sequences. We counted the number of genes intersecting a gene-expression signature and a regulatory signature: $n = c(S_{\mathrm{exp}} \cap S_{\mathrm{reg}})$, where $c(A)$ denotes the number of elements in set $A$. We counted the number of genes in both the regulatory signature and the possible set for the gene-expression signature: $N = c(S_{\mathrm{reg}} \cap P_{\mathrm{exp}})$. Next, we computed the background probability of observing a gene in a gene-expression signature by dividing the number of genes in both the gene-expression signature and the possible set for the regulatory signature by the number of genes in both possible sets:

$$p = \frac{c(S_{\mathrm{exp}} \cap P_{\mathrm{reg}})}{c(P_{\mathrm{exp}} \cap P_{\mathrm{reg}})}$$

Finally, we calculated the probability of observing an equal or larger intersection between the gene-expression signature and regulatory signature by chance by summing the binomial distribution probabilities for all intersections of equal or larger size:

$$P = \sum_{i=n}^{N} \binom{N}{i}\, p^{i} (1 - p)^{N - i}$$

We applied the method of false discovery rates to adjust P values for multiple hypothesis testing. We calculated Q values as:

$$Q = \frac{P \times N}{R}$$

where N is the number of regulatory signatures tested against each gene-expression signature and R is the ascending order rank of the respective P value. We also calculated global Q values where N is the total number of hypotheses tested (all gene-expression signatures by all regulatory signatures). We used the global Q value to assess the significance of our entire study and used the signature-specific Q values to interpret the significance of the observed enrichments per gene-expression signature.
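The enrichment calculation described in this section can be sketched in Python as follows. Sets of gene identifiers stand in for the signatures and possible sets, and all function and argument names are illustrative. The binomial terms are summed in log space through `math.lgamma`, an implementation choice made here so that large signatures do not overflow; the paper does not say how the sum was computed.

```python
from math import exp, lgamma, log
from typing import List, Set


def binomial_tail(n: int, N: int, p: float) -> float:
    """P(X >= n) for X ~ Binomial(N, p)."""
    if n <= 0:
        return 1.0
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0 if n <= N else 0.0
    total = 0.0
    for i in range(n, N + 1):
        log_choose = lgamma(N + 1) - lgamma(i + 1) - lgamma(N - i + 1)
        total += exp(log_choose + i * log(p) + (N - i) * log(1 - p))
    return min(total, 1.0)


def enrichment_p(
    sig_exp: Set[str], sig_reg: Set[str], poss_exp: Set[str], poss_reg: Set[str]
) -> float:
    """Overlap P value built from the quantities defined in the text."""
    n = len(sig_exp & sig_reg)    # observed intersection
    N = len(sig_reg & poss_exp)   # regulatory targets that were measured
    p = len(sig_exp & poss_reg) / len(poss_exp & poss_reg)  # background rate
    return binomial_tail(n, N, p)


def q_values(p_values: List[float]) -> List[float]:
    """Q = P * N / R, with N the number of tests and R the ascending rank
    of each P value. No monotonicity step or cap at 1 is applied."""
    N = len(p_values)
    order = sorted(range(N), key=lambda i: p_values[i])
    q = [0.0] * N
    for rank, i in enumerate(order, start=1):
        q[i] = p_values[i] * N / rank
    return q
```

Passing the P values of one gene-expression signature to `q_values` gives signature-specific Q values, while passing all P values from every signature pair gives the global version.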

In vitro E2F analysis. We collected target genes for E2F1, E2F2 and E2F3 from an in vitro E2F profiling study. We created a composite signature of 588 target genes by combining all genes that were induced by any one of the E2F family members. We selected ten CRPs that corresponded to a single representative E2F binding site (V$E2F_Q4_01) for enrichment analysis. We carried out enrichment analysis by the binomial distribution exactly as described in the preceding section.

URLs. The Oncomine database is available at http://www.oncomine.org/. Promoter sequences from the University of California Santa Cruz genome browser are available at http://hgdownload.cse.ucsc.edu/goldenPath/hg17/bigZips/.

Note: Supplementary information is available on the Nature Genetics website.

ACKNOWLEDGMENTS

We thank D. Gibbs for hardware support and R. Varambally for database support. This research is supported in part by the National Institutes of Health through the University of Michigan’s Cancer Center Support Grant, pilot funds from the Dean’s Office and the Department of Pathology. D.R.R. was supported by the Medical Scientist Training Program and the Cancer Biology Training Program, and A.M.C. is a Pew Scholar.

COMPETING INTERESTS STATEMENT

The authors declare that they have no competing financial interests.

Published online at http://www.nature.com/naturegenetics/

1. Ramaswamy, S., Ross, K.N., Lander, E.S. & Golub, T.R. A molecular signature of

metastasis in primary solid tumors. Nat. Genet. 33, 49–54 (2003).

2. Rhodes, D.R. et al. Large-scale meta-analysis of cancer microarray data identifies

common transcriptional profiles of neoplastic transformation and progression. Proc.

Natl. Acad. Sci. USA 101, 9309–9314 (2004).

3. Segal, E., Friedman, N., Koller, D. & Regev, A. A module map showing conditional

activity of expression modules in cancer. Nat. Genet. 36, 1090–1098 (2004).

4. Elkon, R., Linhart, C., Sharan, R., Shamir, R. & Shiloh, Y. Genome-wide in silico

identification of transcriptional regulators controlling the cell cycle in human cells.

Genome Res. 13, 773–780 (2003).

5. Rhodes, D.R. et al. ONCOMINE: a cancer microarray database and integrated data-mining platform. Neoplasia 6, 1–6 (2004).

6. Matys, V. et al. TRANSFAC: transcriptional regulation, from patterns to profiles.

Nucleic Acids Res. 31, 374–378 (2003).

7. Sladek, F.M., Zhong, W.M., Lai, E. & Darnell, J.E. Jr. Liver-enriched transcription

factor HNF-4 is a novel member of the steroid hormone receptor superfamily. Genes

Dev. 4, 2353–2365 (1990).

8. Xanthopoulos, K.G. et al. The different tissue transcription patterns of genes for HNF-1, C/EBP, HNF-3, and HNF-4, protein factors that govern liver-specific transcription. Proc. Natl. Acad. Sci. USA 88, 3807–3811 (1991).

9. Black, B.L. & Olson, E.N. Transcriptional control of muscle development by myocyte

enhancer factor-2 (MEF2) proteins. Annu. Rev. Cell Dev. Biol. 14, 167–196 (1998).

10. O’Donovan, K.J., Tourtellotte, W.G., Millbrandt, J. & Baraban, J.M. The EGR family

of transcription-regulatory factors: progress at the interface of molecular and systems

neuroscience. Trends Neurosci. 22, 167–173 (1999).

11. Taniguchi, T., Ogasawara, K., Takaoka, A. & Tanaka, N. IRF family of transcription

factors as regulators of host defense. Annu. Rev. Immunol. 19, 623–655 (2001).

12. Caamano, J. & Hunter, C.A. NF-kappaB family of transcription factors: central regulators of innate and adaptive immune functions. Clin. Microbiol. Rev. 15, 414–429 (2002).

13. Rosner, M.H. et al. A POU-domain transcription factor in early stem cells and germ

cells of the mammalian embryo. Nature 345, 686–692 (1990).

14. Nichols, J. et al. Formation of pluripotent stem cells in the mammalian embryo depends

on the POU transcription factor Oct4. Cell 95, 379–391 (1998).

15. La Thangue, N.B. The yin and yang of E2F-1: balancing life and death. Nat. Cell Biol.

5, 587–589 (2003).

16. Zhu, W., Giangrande, P.H. & Nevins, J.R. E2Fs link the control of G1/S and G2/M

transcription. EMBO J. 23, 4615–4626 (2004).

17. Bracken, A.P. et al. EZH2 is downstream of the pRB-E2F pathway, essential for proliferation and amplified in cancer. EMBO J. 22, 5323–5335 (2003).

18. Chae, H.D., Yun, J., Bang, Y.J. & Shin, D.Y. Cdk2-dependent phosphorylation of the

NF-Y transcription factor is essential for the expression of the cell cycle-regulatory

genes and cell cycle G1/S and G2/M transitions. Oncogene 23, 4084–4088 (2004).

19. Muller, H. et al. E2Fs regulate the expression of genes involved in differentiation,

development, proliferation, and apoptosis. Genes Dev. 15, 267–285 (2001).

20. DeGregori, J., Kowalik, T. & Nevins, J.R. Cellular targets for activation by the E2F1 transcription factor include DNA synthesis- and G1/S-regulatory genes. Mol. Cell Biol. 15, 4215–4224 (1995).

21. Keenan, S.M., Lents, N.H. & Baldassare, J.J. Expression of cyclin E renders cyclin D-CDK4 dispensable for inactivation of the retinoblastoma tumor suppressor protein, activation of E2F, and G1-S phase progression. J. Biol. Chem. 279, 5387–5396 (2004).

22. Cao, R. et al. Role of histone H3 lysine 27 methylation in Polycomb-group silencing.

Science 298, 1039–1043 (2002).

23. Kleer, C.G. et al. EZH2 is a marker of aggressive breast cancer and promotes neoplastic transformation of breast epithelial cells. Proc. Natl. Acad. Sci. USA 100, 11606–11611 (2003).

24. Varambally, S. et al. The polycomb group protein EZH2 is involved in progression of

prostate cancer. Nature 419, 624–629 (2002).

25. Gilmore, T.D., Kalaitzidis, D., Liang, M.C. & Starczynowski, D.T. The c-Rel transcription factor and B-cell proliferation: a deal with the devil. Oncogene 23, 2275–2286 (2004).

26. Houldsworth, J. et al. Relationship between REL amplification, REL function, and clinical and biologic features in diffuse large B-cell lymphomas. Blood 103, 1862–1868 (2004).

27. Lossos, I.S. et al. Transformation of follicular lymphoma to diffuse large-cell lymphoma:

alternative patterns with increased or decreased expression of c-myc and its regulated

genes. Proc. Natl. Acad. Sci. USA 99, 8886–8891 (2002).

28. Nau, M.M. et al. L-myc, a new myc-related gene amplified and expressed in human

small cell lung cancer. Nature 318, 69–73 (1985).

29. Wong, A.J. et al. Gene amplification of c-myc and N-myc in small cell carcinoma of the lung. Science 233, 461–464 (1986).

30. Zucman, J. et al. EWS and ATF-1 gene fusion induced by t(12;22) translocation in

malignant melanoma of soft parts. Nat. Genet. 4, 341–345 (1993).

31. Jean, D. & Bar-Eli, M. Targeting the ATF-1/CREB transcription factors by single chain

Fv fragment in human melanoma: potential modality for cancer therapy. Crit. Rev.

Immunol. 21, 275–286 (2001).

32. Johnson, D.G., Cress, W.D., Jakoi, L. & Nevins, J.R. Oncogenic capacity of the E2F1

gene. Proc. Natl. Acad. Sci. USA 91, 12823–12827 (1994).

33. Schwab, M., Varmus, H.E. & Bishop, J.M. Human N-myc gene contributes to neoplastic

transformation of mammalian cells in culture. Nature 316, 160–162 (1985).

34. Sylla, B.S. & Temin, H.M. Activation of oncogenicity of the c-rel proto-oncogene. Mol.

Cell Biol. 6, 4709–4716 (1986).

35. Seth, A. & Papas, T.S. The c-ets-1 proto-oncogene has oncogenic activity and is positively autoregulated. Oncogene 5, 1761–1767 (1990).

36. Lamb, J. et al. A mechanism of cyclin D1 action encoded in the patterns of gene

expression in human cancer. Cell 114, 323–334 (2003).

