
An epigenomic roadmap to induced pluripotency reveals DNA methylation as a reprogramming modulator


Reprogramming of somatic cells to induced pluripotent stem cells involves a dynamic rearrangement of the epigenetic landscape. To characterize this epigenomic roadmap, we have performed MethylC-seq, ChIP-seq (H3K4/K27/K36me3) and RNA-Seq on samples taken at several time points during murine secondary reprogramming as part of Project Grandiose. We find that DNA methylation gain during reprogramming occurs gradually, while loss is achieved only at the ESC-like state. Binding sites of activated factors exhibit focal demethylation during reprogramming, while ESC-like pluripotent cells are distinguished by extension of demethylation to the wider neighbourhood. We observed that genes with CpG-rich promoters demonstrate stable low methylation and strong engagement of histone marks, whereas genes with CpG-poor promoters are safeguarded by methylation. Such DNA methylation-driven control is the key to the regulation of ESC-pluripotency genes, including Dppa4, Dppa5a and Esrrb. These results reveal the crucial role that DNA methylation plays as an epigenetic switch driving somatic cells to pluripotency.
Received 30 Aug 2014 | Accepted 21 Oct 2014 | Published 10 Dec 2014
Dong-Sung Lee1,2,3,*, Jong-Yeon Shin1,4,*, Peter D. Tonge5, Mira C. Puri5,6, Seungbok Lee1,2,3, Hansoo Park1,2,3,
Won-Chul Lee1,4, Samer M.I. Hussein5, Thomas Bleazard7, Ji-Young Yun1,4, Jihye Kim1,4, Mira Li5,
Nicole Cloonan8,9, David Wood8, Jennifer L. Clancy10, Rowland Mosbergen11, Jae-Hyuk Yi1, Kap-Seok Yang4,
Hyungtae Kim4, Hwanseok Rhee12, Christine A. Wells11,13, Thomas Preiss10,14, Sean M. Grimmond8,15,
Ian M. Rogers5,16,17, Andras Nagy5,17,18 & Jeong-Sun Seo1,2,3,4
DOI: 10.1038/ncomms6619 OPEN
1Genomic Medicine Institute (GMI), Medical Research Center, Seoul National University, Seoul 110-799, Korea. 2Department of Biomedical Sciences, Seoul
National University College of Medicine, Seoul 110-799, Korea. 3Department of Biochemistry, Seoul National University College of Medicine, Seoul 110-799,
Korea. 4Life Science Institute, Macrogen Inc., Seoul 153-781, Korea. 5Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario,
Canada M5G 1X5. 6Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada M5T 3H7. 7Faculty of Medical and Human Sciences,
University of Manchester, Manchester M13 9PT, UK. 8Queensland Centre for Medical Genomics, Institute for Molecular Bioscience, The University of
Queensland, St Lucia, Queensland 4072, Australia. 9QIMR Berghofer Medical Research Institute, Genomic Biology Lab, 300 Herston Road, Herston,
Queensland 4006, Australia. 10 Genome Biology Department, The John Curtin School of Medical Research, The Australian National University, Canberra,
Australian Capital Territory 2601, Australia. 11 Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, Brisbane,
Queensland 4072, Australia. 12 Macrogen Bioinformatics Center, Macrogen, Seoul 153-781, Republic of Korea. 13 College of Medical, Veterinary and Life
Sciences, University of Glasgow, Glasgow, Scotland G12 8TA, UK. 14 Molecular, Structural & Computational Biology Division, Victor Chang Cardiac Research
Institute, Sydney, New South Wales 2010, Australia. 15Wolfson Wohl Cancer Research Centre, Institute for Cancer Sciences, University of Glasgow, Bearsden,
Glasgow Scotland G61 1BD, UK. 16 Department of Physiology, University of Toronto, Toronto, Ontario, Canada M5T 3H7. 17 Department of Obstetrics and
Gynaecology, University of Toronto, Toronto, Ontario, Canada M5T 3H7. 18 Institute of Medical Science, University of Toronto, Toronto, Ontario, Canada M5T
3H7. * These authors contributed equally to this work. Correspondence and requests for materials should be addressed to J.-S.S. (email: jeongsun@snu.ac.kr).
NATURE COMMUNICATIONS | 5:5619 | DOI: 10.1038/ncomms6619 | www.nature.com/naturecommunications
© 2014 Macmillan Publishers Limited. All rights reserved.
Somatic cells can be reprogrammed into induced pluripotent
stem cells (iPSCs) by the expression of defined transcription
factors1–5. During the reprogramming process, the global
epigenetic landscape has to be reset to establish the epigenetic
marks of the pluripotent state through DNA methylation and
chromatin-remodelling processes2,6–9. Through the development
of a secondary reprogramming system10, iPSC generation was
initially described as a multistep process characterized by
transcriptional, DNA methylation and chromatin changes11–14.
Genome-wide analysis of specific chromatin modification
dynamics at early stages of reprogramming indicated that this
progress might be constrained by repressive epigenetic modifications,
such as H3K9me3 and DNA methylation15–18.
More recently, it has been proposed that DNA methylation
during iPSC generation functions in the silencing of genes
involved in differentiation, while also facilitating chromatin
remodelling18–20. DNA demethylation appears to play an
important role in reactivating pluripotency genes, which are
hypermethylated and silenced in somatic cells, particularly in the
late stages of the reprogramming process13. However, overall
understanding of the global dynamics of epigenetic modification
at different stages during reprogramming remains poor.
In this work, we have utilized a murine secondary
reprogramming system to sample cellular trajectories during
reprogramming and performed whole-genome bisulfite sequencing,
chromatin immunoprecipitation sequencing (ChIP-seq;
H3K4me3, H3K27me3 and H3K36me3), and RNA sequencing
(RNA-Seq) to characterize the epigenomic roadmap to pluripotency
at base resolution21,22. Our observations provide a deeper
understanding of the reprogramming process and reveal the
crucial role that DNA methylation plays in the epigenetic switch
that drives somatic cells to pluripotency.
Results and Discussion
Dynamic changes in DNA methylation during reprogramming.
The Project Grandiose secondary reprogramming samples
present a unique opportunity to profile cellular state changes
at various time points during reprogramming10,21,22. These
consisted of secondary mouse embryonic fibroblasts (2°MEF),
six intermediate time points at high doxycycline (dox)
concentrations (D2H, D5H, D8H, D11H, D16H and D18H),
three alternative intermediate time points collected for samples
treated with reduced dox concentrations (D16L, D21L
and D21Ø), the secondary iPSCs (2°iPSCs), the primary iPSCs
(1°iPSCs) used to generate the chimeric mouse and a mouse
Rosa rtTA embryonic stem cell line (ESC) for standard
comparison (Fig. 1a–c). As described in ref. 21, these samples
showed reprogramming to two distinct pluripotent states:
ESC-like cells and the ‘F-class’ consisting of stages D16H
and D18H.
In this manuscript, we describe base-resolution bisulfite
sequencing of the 13 Project Grandiose samples and investigation
of global DNA methylation changes during reprogramming
(Supplementary Data 1). The sample methylomes were scanned
using a sliding window of 30 CpGs, identifying 7,890 differentially
methylated regions (DMRs) covering 22 Mb, representing
0.81% of the mouse genome (Fig. 2a,b, Supplementary Data 2,
[Figure 1 graphic. Panel a: primary reprogramming of 1°MEF to 1°iPSC, tetraploid embryo complementation, derivation of 2°MEFs and secondary reprogramming to 2°iPSC; samples D2H, D5H, D8H, D11H, D16H, D18H, D16L, D21L and D21Ø, with dox concentrations of 1,500, 5 and 0 ng. Panel b: MethylC-Seq, ChIP-Seq (H3K4me3, H3K27me3, H3K36me3) and RNA-Seq feeding an integrative analysis of differentially methylated regions, DNA methylation change around TFBS, and DNA methylation accumulation in gain and loss. Panel c: RefSeq, DNA methylation, H3K4me3, H3K27me3 and H3K36me3 tracks around Dppa2 (Chr16qB5) in 2°MEF, D18H, 2°iPSCs and ESC.]
Figure 1 | Experimental and computational analysis overview of the study. (a) Establishment of secondary system and sample collection. (b) MethylC-
Seq was performed on samples from secondary system. DMRs were identified. RNA-Seq and ChIP-Seq data were integrated with MethylC-Seq data
based on transcripts. (c) Base-level visualization of DNA methylation and histone distribution around Dppa2.
Supplementary Figs 1a–c, 6a). Unsupervised hierarchical clustering
performed on the DNA methylation state of DMRs (Fig. 2a)
distinguished the intermediate states (D2H-D18H and D16L-D21L)
from the ESC-like pluripotent states (D21Ø, 1°iPSCs,
2°iPSCs and ESCs). DMRs were categorized into three groups
based on the changing pattern of DNA methylation (Fig. 2a).
The DMR-1 group exhibited increased methylation levels after
(DMR-1a) or during (DMR-1b) high-level reprogramming factor
expression and included genes related to development and cell
differentiation, such as the Hox family, Col25a1 and Meox2. The
DMR-2 group represented differential methylation changes
between two pluripotent states: either gradual demethylation to
F-class and methylation in the ESC-like state (DMR-2a) or
gradual methylation to F-class and acquired demethylation in the
ESC-like state (DMR-2b). A final group (DMR-3) was identified
as exhibiting low methylation levels in the ESC-like state
[Figure 2 graphic. Panel a: heatmap of DNA methylation level (scale 0 to 1) across the 13 samples for DMR-1a (n=1,819), DMR-1b (n=1,453), DMR-2a (n=553), DMR-2b (n=1,291) and DMR-3 (n=2,774), with example genes within DMRs. Panel b: 51-kb region containing Dppa4 and Dppa2 (DMR_6651, DMR_6652) with RefSeq, DNAme variation, DMR, TFBS (Nanog, Esrrb), H3K4me3, H3K36me3 and H3K27me3 tracks for each sample. Panel c: number of hyper-DMRs and hypo-DMRs (0 to 3,000) per sample. Panel d: DMRs (%) containing each feature (K4me3, TFBS, SINE, K27me3, Simple_repeat, CpG_shore, LTR, LINE, Intron, CpGi, exon, Promoter, K36me3, Enhancer). Panel e: fold enrichment (x) of the same features. Panel f: low-methylated DMRs (≤30%): K4me3 only 49.6%, both K4/K27me3 41.0%, K27me3 only 6.3%, no K4me3 or K27me3 3.1%; high-methylated DMRs (≥70%): no K4me3 or K27me3 79.7%, K4me3 only 16.6%, K27me3 only 2.8%, both K4/K27me3 0.9%.]
Figure 2 | DMRs and features affecting DNA methylation change during reprogramming. (a) Hierarchical clustering based on the DNA methylation level
of DMRs in each sample. Each DMR was centred with the mean and normalized. DMRs were clustered into six groups based on pairwise correlations.
(b) Base-level visualization of two DMRs from group DMR-3b in the promoter regions of Dppa4 and Dppa2, known ESC-pluripotency predictor genes.
(c) DMR accumulation during reprogramming. DMRs were defined as hyper- and hypo-DMRs at each time point. Dark red and dark blue bars represent
ESC-specific Hyper- and Hypo-DMRs. Other colours indicate Hyper- and Hypo-DMRs in the order of left to right. (d) Proportion of DMRs containing
various genomic features. (e) Fold enrichment of examined genomic features within DMRs. (f) Percentage of DMRs containing H3K4me3 or H3K27me3
based on the methylation level (low-methylated ≤30%, high-methylated ≥70%).
(1°iPSCs, 2°iPSCs and ESCs), with stable methylation persisting
in the F-class state and intermediate reprogramming samples,
which included multiple pluripotency genes such as Dppa2,
Dppa4, Dppa5a, Esrrb, Tcl1 and Eras (Fig. 2a, Supplementary
Data 2).
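The sliding-window scan described above can be illustrated with a minimal sketch. The source states only that a window of 30 CpGs was used; the flagging rule here (a minimum spread of window-mean methylation across samples, `min_range`) and the merging of overlapping windows are assumptions made for illustration, and the function name `scan_windows` is hypothetical.

```python
from typing import Dict, List, Tuple


def scan_windows(meth: Dict[str, List[float]], window: int = 30,
                 min_range: float = 0.3) -> List[Tuple[int, int]]:
    """Return merged (start, end) CpG-index intervals that vary across samples.

    meth maps sample name -> per-CpG methylation fraction (same CpG order).
    A window of `window` consecutive CpGs is flagged when the spread
    (max - min) of its mean methylation across samples is at least
    `min_range` (an assumed criterion, not taken from the paper).
    """
    samples = list(meth)
    n_cpg = len(meth[samples[0]])
    flagged = []
    for start in range(0, n_cpg - window + 1):
        means = [sum(meth[s][start:start + window]) / window for s in samples]
        if max(means) - min(means) >= min_range:
            flagged.append((start, start + window))
    # Merge overlapping or adjacent flagged windows into candidate regions.
    merged: List[Tuple[int, int]] = []
    for start, end in flagged:
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Toy data: sample "B" loses methylation over CpGs 40-69.
toy = {"A": [0.9] * 100, "B": [0.9] * 40 + [0.1] * 30 + [0.9] * 30}
regions = scan_windows(toy)
```

With real data the intervals would be converted from CpG indices to genomic coordinates before annotation.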
We annotated the DMRs in each sample as Hyper- or Hypo-
DMRs where they differed from a corresponding 2°MEF baseline
by over 20% (Fig. 2c). We observed a widespread gradual increase
in methylation to generate Hyper-DMRs during reprogramming,
whereas limited demethylation was observed as cells reprogrammed
to the F-class state (D16H and D18H). The steady
increase in Hyper-DMRs during both high-dox and low-dox
reprogramming challenges the notion that most changes in DNA
methylation occur at a late stage when cells acquire stable
pluripotency13. A similar trend was observed for the average
methylation level of DMRs, as methylation occurred gradually,
while demethylation did not change significantly during
transgene expression (Supplementary Fig. 2a,b). Almost all
Hypo-DMRs found in iPSCs were also observed in ESCs
(98.94%); however, this was not the case for Hyper-DMRs
(61.88%), suggesting that demethylation during reprogramming
occurred more conservatively.
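As an illustration of the annotation rule above, the following sketch labels each DMR in a sample as Hyper or Hypo when its methylation level differs from the 2°MEF baseline by more than 20%. Only the 20% threshold and the 2°MEF baseline come from the text; the function names and the toy values are hypothetical.

```python
from typing import Dict, List


def call_dmr(sample_level: float, baseline_level: float,
             threshold: float = 0.20) -> str:
    """Label one DMR relative to the baseline methylation level."""
    delta = sample_level - baseline_level
    if delta > threshold:
        return "hyper"
    if delta < -threshold:
        return "hypo"
    return "unchanged"


def count_calls(levels: Dict[str, List[float]],
                baseline: str = "2°MEF") -> Dict[str, Dict[str, int]]:
    """Count Hyper- and Hypo-DMRs per sample against the baseline sample."""
    counts = {}
    for sample, values in levels.items():
        if sample == baseline:
            continue
        calls = [call_dmr(v, b) for v, b in zip(values, levels[baseline])]
        counts[sample] = {"hyper": calls.count("hyper"),
                          "hypo": calls.count("hypo")}
    return counts


# Toy methylation levels for three DMRs (illustrative values only).
toy_levels = {
    "2°MEF": [0.10, 0.80, 0.50],
    "D8H": [0.45, 0.75, 0.55],
    "ESC": [0.60, 0.20, 0.50],
}
toy_counts = count_calls(toy_levels)
```

Counting the calls at each time point gives the kind of accumulation curve summarized in Fig. 2c.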
Table 1 | Enrichment of TFBSs in each DMR group.
Columns: DMR group; number of DMRs; then sequence-specific transcription factors and transcription regulators in the order TET1, CTCF, Oct4, SOX2, NANOG, ESRRB, ZFX, KLF4, cMYC, nMYC, E2F1, TCFCP2L1, SMAD1, STAT3, p300, EZH2, SUZ12, RING1B.
DMR-1a (n = 1,819): NE NE  NE NE NE
DMR-1b (n = 1,453): NE NE  +++++
DMR-2a (n = 553): NE NE NE NE NE NE ++ NE NE NE NE NE NE
DMR-2b (n = 1,291): + NE NE NE NE +++ NE NE + NE + NE + +++ +++ +++
DMR-3 (n = 2,774): NE NE +++ +++ +++ ++ ++ ++ +++ +++ +++ + +++ +++ +++
TFBS enrichment of total DMRs versus whole genome (n = 7,890): TET1 13.25, CTCF 3.84, Oct4 17.26, SOX2 15.99, NANOG 17.26, ESRRB 18.65, ZFX 13.25, KLF4 18.65, cMYC 13.25, nMYC 17.26, E2F1 18.65, TCFCP2L1 13.25, SMAD1 17.26, STAT3 13.25, p300 17.26, EZH2 18.65, SUZ12 13.25, RING1B 17.26.
DMR, differentially methylated region; TFBS, transcription factor-binding sites.
Fold enrichment versus total DMRs: <0.75 ≤ NE (not enriched) ≤ 1.25 ≤ + < 1.5 ≤ ++ < 1.75 ≤ +++.
[Figure 3 graphic. Panels: OCT4 (n = 3,761), SOX2 (n = 4,526), KLF4 (n = 10,875), NANOG (n = 10,343), ESRRB (n = 21,647), TCFCP2L1 (n = 26,910), SUZ12 (n = 4,215) and EZH2 (n = 5,185). Each panel plots TF expression (FPKM), average TFBS methylation difference, H3K4me3 difference and H3K27me3 difference across 2°MEF, D2H, D5H, D8H, D11H, D16H, D18H, D16L, D21L, D21Ø, 1°iPSC, 2°iPSC and ESC.]
Figure 3 | Histone modification and DNA methylation change at transcription factor-binding sites. RNA expression level (FPKM) of transcription factors
(line plots), average DNA methylation change (upper bar plots), average H3K4me3 change (blue bar plots) and average H3K27me3 change (red bar
plots) at binding sites of each transcription factor. Selected transcriptionally active genes during high-dox treatment (blue box), transcriptionally silent
genes during high-dox treatment (green box) and polycomb repressive complexes (red box) are shown.
TFBSs and histone modification are enriched in the DMRs. To
assay the distribution of histone marks, we performed ChIP-Seq
for H3K4me3, H3K27me3 and H3K36me3 (see Methods). We
determined the distribution and enrichment of these histone
marks within DMRs, as well as other genomic features including
ESC-TFBSs from published data23–26 (Supplementary Data 3).
Notably, we found that 98% of DMRs contained H3K4me3
clusters and 68% contained ESC-TFBSs (Fig. 2d). When we
assessed enrichment of each feature relative to the whole genome,
H3K4me3 marks, ESC-TFBSs, CpG islands, CpG shores and
enhancers showed more than 10-fold enrichment, followed by
promoters and H3K27me3 clusters (Fig. 2e).
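A fold-enrichment value of this kind can be sketched as the share of DMR sequence overlapping a feature divided by the share of the whole genome covered by that feature. The exact formula used in the study is not given in this section, so the definition below is an assumption, and all function names and coordinates are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one chromosome


def covered_bp(intervals: List[Interval]) -> int:
    """Total length of a list of non-overlapping intervals."""
    return sum(end - start for start, end in intervals)


def overlap_bp(a: List[Interval], b: List[Interval]) -> int:
    """Base pairs shared between two lists of non-overlapping intervals."""
    total = 0
    for s1, e1 in a:
        for s2, e2 in b:
            total += max(0, min(e1, e2) - max(s1, s2))
    return total


def fold_enrichment(dmrs: List[Interval], feature: List[Interval],
                    genome_size: int) -> float:
    """Feature density inside DMRs relative to its genome-wide density."""
    density_in_dmrs = overlap_bp(dmrs, feature) / covered_bp(dmrs)
    density_in_genome = covered_bp(feature) / genome_size
    return density_in_dmrs / density_in_genome


# Toy example: half of a 100-bp DMR overlaps a feature covering 1% of the genome.
example = fold_enrichment([(100, 200)], [(150, 250)], genome_size=10_000)
```

The quadratic overlap loop is adequate for a sketch; genome-scale interval sets would normally use a sorted sweep or an interval tree.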
Our finding that histone marks were highly enriched within
DMRs led us to explore the relationship between DNA
methylation levels and H3K4me3/H3K27me3 marks within
DMRs (Fig. 2f, Supplementary Fig. 2c, Supplementary Table 1).
DMRs exhibiting low-level methylation (less than 30%) were
frequently associated (96.9%) with H3K4me3, H3K27me3 or both. In
contrast, most DMRs with high levels of methylation (≥70%)
carried neither histone mark (79.7%),
supporting the inverse relationship between DNA
methylation and these two histone modifications. Furthermore,
CpGs inside H3K4me3 and H3K27me3 marks exhibited more
methylation change than CpGs inside the H3K36me3
mark (Supplementary Fig. 2d).
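The two percentages quoted above can be cross-checked against the pie-chart values of Fig. 2f. The assignment of each value to the low- or high-methylated group below is inferred from the requirement that each group sums to 100% and reproduces the quoted 96.9% and 79.7%; it is an assumption, not a statement from the figure itself.

```python
# Percentages of DMRs in each histone-mark category (values from Fig. 2f;
# grouping into low/high methylation is inferred, see lead-in).
low_methylated = {"H3K4me3 only": 49.6, "both": 41.0,
                  "H3K27me3 only": 6.3, "neither": 3.1}
high_methylated = {"H3K4me3 only": 16.6, "both": 0.9,
                   "H3K27me3 only": 2.8, "neither": 79.7}


def total(group: dict) -> float:
    """Sum of all categories, rounded to one decimal place."""
    return round(sum(group.values()), 1)


def with_any_mark(group: dict) -> float:
    """Share of DMRs carrying H3K4me3, H3K27me3 or both."""
    return round(sum(v for k, v in group.items() if k != "neither"), 1)


low_marked = with_any_mark(low_methylated)
high_unmarked = high_methylated["neither"]
```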
To investigate the involvement of ESC-TFBSs in reprogramming,
we performed separate enrichment analysis for each DMR
group (Table 1). Polycomb-repressive complex (PRC)-binding
sites, including SUZ12, EZH2 and RING1B, were enriched in
DMR-1 and DMR-2b. On the other hand, binding sites of
sequence-specific pluripotency-associated factors such as Nanog, Oct4 and
Klf4 (but not CTCF or TET1) were enriched in
DMR-3, the group of DMRs that are demethylated only in the
ESC-like state. These results demonstrate the dynamic changes in
DNA methylation at TFBSs, and the connection between the
pattern of changes and TFBS enrichment.
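The symbols in Table 1 follow the thresholds given in its legend. A minimal sketch of that mapping is shown below; the handling of values exactly on a boundary and the label used for values below 0.75 (whose symbol is not legible in the legend) are assumptions, and both function names are hypothetical.

```python
def relative_enrichment(group_hits: int, group_size: int,
                        total_hits: int, total_size: int) -> float:
    """Fraction of a DMR group containing a TFBS divided by the fraction
    of all DMRs containing it (fold enrichment versus total DMRs)."""
    return (group_hits / group_size) / (total_hits / total_size)


def enrichment_symbol(fold: float) -> str:
    """Map a fold-enrichment value onto the Table 1 categories."""
    if fold < 0.75:
        return "below 0.75"  # symbol not recoverable from the legend
    if fold <= 1.25:
        return "NE"
    if fold < 1.5:
        return "+"
    if fold < 1.75:
        return "++"
    return "+++"


# Toy numbers: 30% of a group versus 15% of all DMRs contain the site.
toy_fold = relative_enrichment(300, 1000, 1500, 10000)
toy_symbol = enrichment_symbol(toy_fold)
```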
Dynamic changes of TFBS methylation during reprogramming.
Interrogating methylation changes at ESC-TFBSs revealed
methylation depletion during high-dox treatment,
which was not apparent from examining DMRs (Fig. 3,
Supplementary Fig. 3; Methods). This was most obvious at the
binding sites for activated or overexpressed transcription factors
during early time points, such as OCT4, SOX2, KLF4 and
NANOG. These TFBSs also accumulated H3K4me3 modifications
after the methylation depletion. H3K27me3
marks diminished at binding sites of expressed transcription
factors early in reprogramming. In contrast, ESC-TFBSs for genes
that were not activated during high-dox reprogramming but are
known to play critical roles in ESC-like pluripotent state, such as
ESRRB and TCFCP2L1 (refs 14,27,28), showed no change in
DNA methylation and were demethylated only in the ESC-like
state. The PRC (SUZ12 and EZH2)-binding sites underwent a
gain of DNA methylation during reprogramming but showed
baseline levels of methylation in ESC.
We assessed DNA methylation changes occurring within
±40 kb of ESC-TFBSs (Fig. 4, Supplementary Fig. 4). At the
[Figure 4 graphic. Panels: OCT4 (n = 3,761), SOX2 (n = 4,526), ESRRB (n = 21,647) and SUZ12 (n = 4,215). x axis: position around TFBS (kb), from −40 to 40; rows: 2°MEF, D2H, D5H, D8H, D11H, D16H, D18H, D16L, D21L, D21Ø, 1°iPSC, 2°iPSC and ESC; colour scales (−1 to 1): normalized average CpG methylation change, H3K4me3 change and H3K27me3 change versus 2°MEF.]
Figure 4 | Histone modification and DNA methylation change around transcription factor-binding sites. Average DNA methylation change (left),
average H3K4me3 change (middle) and average H3K27me3 change (right) in the 80-kb neighbourhood of transcription factor-binding sites.
[Figure 5 graphic. Gene clusters (number of genes): Expr-1a (87), Expr-1b (38), Expr-1c (55), Expr-2a (113), Expr-2b (109), Expr-3a (41), Expr-3b (35). Heatmap tracks: mRNA expression (FPKM), DNA methylation, H3K4me3, H3K27me3, H3K36me3, CpG density, pluripotency TFBS (OCT4, SOX2, KLF4, NANOG, ESRRB, TCFCP2L1) and PRC-binding sites (EZH2, SUZ12, RING1B), with example genes listed for each cluster. Panel b: 20-kb windows around Slain1 (Expr-1a), Car4 (Expr-1b), Pecam1 (Expr-1c), Rspo2 (Expr-2a) and Abi3bp (Expr-2b) showing CpGI, DNA methylation, H3K4me3, H3K27me3 and RefSeq tracks. Panel c: % of promoters with ESC-specific H3K4me3 engagement per sample, grouped by 2°MEF methylation at the 0.7 (n = 673) and 0.3 (n = 93) thresholds. Panel d: % of promoters with ESC-specific H3K27me3 engagement per sample, grouped by 2°MEF methylation at the 0.7 (n = 86) and 0.3 (n = 659) thresholds.]
Figure 5 | Epigenetic features of gene classes and model of gene expression control. (a) Genes were separated into clusters based on gene expression
patterns and DNA methylation. The heatmap presents mRNA expression, DNA methylation level of promoter regions, normalized H3K4me3 level,
normalized H3K27me3 level, CpG densities, pluripotency transcription factor-binding sites and binding sites of PRCs. (b) Base-level visualization of
DNA methylation and histone modifications in the promoter regions of representative genes for each class across all samples. (c) Percentage of
ESC-specific H3K4me3 mark for promoters with high and low initial methylation. (d) Percentage of ESC-specific H3K27me3 mark for promoters with high
and low initial methylation.
binding sites of core ESC-pluripotency transcription factors
(OCT4, SOX2, KLF4 and NANOG), we observed rapid focal
demethylation during high-dox treatment (D2H-D18H) if the
factors were expressed. On the other hand, ESC-like cells (1°iPSC,
2°iPSC and ESC) exhibited extensive demethylation, up to 20 kb
distal from the binding sites. A similar but more delayed process
was also observed for H3K4me3 modifications. The broad
neighbourhoods around PRC-binding sites were hypermethylated
in all samples examined. Interestingly, although methylation
accumulated broadly around PRC (SUZ12, EZH2 and RING1B)-
binding sites (Fig. 4, Supplementary Fig. 4), these underwent
focal renormalization in the ESC-like pluripotent state. These sites
also carry bivalent H3K4me3 and H3K27me3 marks
in the ESC-like state24. The patterns of change in DNA methylation
and histone marks were distinct for the three types of
transcription factor shown (Figs 3 and 4). Our results show a
contrast between the focal demethylation induced
early in reprogramming and the broader demethylated regions of the
ESC-like pluripotent state. This may be a key
distinguishing feature of the pluripotent state, in which broader
demethylation is required to complete reprogramming
to the ESC-like state.
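The neighbourhood profiles discussed above can be approximated by binning CpGs by their distance to each binding site and averaging the methylation change versus 2°MEF in every bin. The ±40 kb flank follows the text; the 2-kb bin size, the function name and the input layout are assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple


def profile_around_sites(cpgs: List[Tuple[int, float]], sites: List[int],
                         flank: int = 40_000,
                         bin_size: int = 2_000) -> Dict[int, float]:
    """Average methylation difference per distance bin around binding sites.

    cpgs  : (position, methylation difference versus baseline) on one chromosome
    sites : binding-site centre positions on the same chromosome
    Returns a dict mapping the bin start offset (bp) to the mean difference.
    """
    sums: Dict[int, float] = {}
    counts: Dict[int, int] = {}
    for pos, diff in cpgs:
        for site in sites:
            offset = pos - site
            if -flank <= offset < flank:
                # Floor division keeps negative offsets in the correct bin.
                b = (offset // bin_size) * bin_size
                sums[b] = sums.get(b, 0.0) + diff
                counts[b] = counts.get(b, 0) + 1
    return {b: sums[b] / counts[b] for b in sorted(sums)}


# Toy example: three CpGs near a single site at position 1,200.
toy_profile = profile_around_sites(
    [(1000, -0.4), (1500, -0.2), (30000, 0.1)], sites=[1200])
```

Focal demethylation would appear as negative values confined to the bins nearest offset 0, whereas the broader ESC-like pattern would extend across many bins.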
We attempted to show that the dynamics of methylation
change at transcription factor-binding sites (TFBSs) could act as a
predictor of importance to the reprogramming process. We
proposed criteria for DNA-binding transcription factors of
>1.2-fold enrichment and >10% overlap in DMR-3, implying
over-representation in DMRs that underwent demethylation at
the transition to the ESC-like state, but little change early in
reprogramming. We tested a set of 118 transcription factors with
computationally predicted binding sites against these criteria29,30.
We found only three transcription factors (SOX2, MYC and
OCT4) that fulfilled our criteria, all of which are known to be
important in reprogramming to iPSCs (Supplementary Data 4).
This suggests a high specificity for the prediction criteria,
although sensitivity is low as other factors known to be
involved in reprogramming were not identified. Transcription
factors whose binding sites show significant change in
methylation late in a transition can be called important to that
transition with high confidence. We believe that methylome-
based tests of this nature could have useful application in
prediction of transcription factors involved in other cellular
transitions.
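The screening criteria above can be written as a simple filter. In this sketch, "overlap" is taken to mean the fraction of DMR-3 regions containing a predicted site and "enrichment" the ratio of that fraction to the fraction among all DMRs; both readings are assumptions, as are the class and function names and the toy counts. The DMR totals (2,774 and 7,890) come from Table 1.

```python
from dataclasses import dataclass


@dataclass
class TfStats:
    name: str
    dmr3_with_site: int      # DMR-3 regions containing a predicted site
    all_dmr_with_site: int   # all DMRs containing a predicted site


def passes_criteria(tf: TfStats, n_dmr3: int = 2774, n_all: int = 7890,
                    min_fold: float = 1.2, min_overlap: float = 0.10) -> bool:
    """Apply the >1.2 enrichment and >10% overlap thresholds to one factor."""
    overlap = tf.dmr3_with_site / n_dmr3
    background = tf.all_dmr_with_site / n_all
    if background == 0:
        return False
    return overlap / background > min_fold and overlap > min_overlap


# Hypothetical factors with made-up counts, for illustration only.
candidates = [
    TfStats("TF_A", 600, 1000),   # enriched and frequent: passes
    TfStats("TF_B", 100, 250),    # overlap below 10%: fails
    TfStats("TF_C", 400, 1100),   # frequent but not enriched: fails
]
selected = [tf.name for tf in candidates if passes_criteria(tf)]
```

Requiring both thresholds is what makes the test specific but insensitive, in line with the small number of factors reported to pass.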
Demethylation leads to precise control of gene expression. We
integrated corresponding RNA expression data22 with our DNA
methylation and histone modification data sets (Supplementary
Tables 2–4, Supplementary Data 5; Methods). Activation of
genes was associated with H3K4me3 occupancy in promoter
regions and repression was associated with either H3K27me3
occupancy or no histone mark (Supplementary Fig. 5a).
Moreover, as we observed in DMRs, engagement of both
H3K4me3 and H3K27me3 marks in promoters was dependent
on DNA methylation levels with a strong inverse relationship
(Supplementary Fig. 5b).
We selected 477 genes segregating into seven clusters on the
basis of expression and epigenetic change over the course of
reprogramming (Fig. 5a,b, Supplementary Table 5; Methods).
These groups represent genes activated early in reprogramming (Expr-1a);
genes activated late in reprogramming with either low (Expr-1b)
or full (Expr-1c) DNA methylation in 2°MEF; and genes repressed
during reprogramming with either low (Expr-2a) or full
(Expr-2b) DNA methylation in ESC. Genes in Expr-3a were
turned on, while those in Expr-3b were turned off, in high-dox;
therefore, they were differentially expressed between D16H/D18H
(F-class cells) and ESC-like cells. Expression changes of genes in
Expr-1a and Expr-2a/b are likely responsible for pluripotency, as
they were differentially expressed between 2°MEF and pluripotent
cells21. Finally, the presence of genes in Expr-1b/c explains
why F-class cells are distinct from ESC-like state cells.
The expression dynamics through reprogramming of these
genes was clear upon visualization of the categories and
representative genes from each class (Fig. 5a,b, Supplementary
Fig. 5a–d). Genes repressed by H3K27me3 with low-methylated
promoters in 2°MEF tended to be activated early in reprogramming
and had CpG-rich promoters (Expr-1a/b). These loci were
enriched in genes involved in cell adhesion, such as Epcam and
Cdh1 (Fig. 5a (Expr-1a)). In contrast, quiescence of Expr-1c genes
was initially safeguarded by DNA methylation of CpG-poor
promoters, and H3K4me3 was only acquired after late demethylation.
The same two modes of control were observed for the
genes repressed by reprogramming. However, as in the analysis of
DMRs, DNA methylation in promoter regions happened early in
reprogramming (Expr-2b), whereas demethylation was detected
exclusively in the ESC-like state, revealing that a gain of
methylation is kinetically favoured over demethylation. This is
also true for histone marks in relation to changes in gene
expression, where histone modifications, specifically the modulation
of H3K27me3, occurred early during reprogramming
(Expr-2a) within low-methylated promoters. Interestingly, the
dynamic process of histone modification alterations during
reprogramming was strongly influenced by the starting methylation
state of gene promoters (Fig. 5c,d). Genes with low-methylated
promoters at 2°MEF showed a significantly higher
rate of transition to the ESC-like state for both ESC-specific
histone marks compared with those with fully methylated
promoters. This suggests that DNA methylation presents a major
barrier during somatic cell reprogramming to ESC-like cells and
that the methylation status of a given region determines its
control by histone modifications.
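The comparison summarized in Fig. 5c,d can be sketched by grouping promoters according to their methylation in 2°MEF and computing, for a given sample, the percentage in each group that has acquired the ESC-specific mark. The 0.3 and 0.7 cut-offs mirror the low and high methylation thresholds used elsewhere in the paper and are assumed to apply here; the record layout and names are hypothetical.

```python
from typing import Dict, List


def initial_state(mef_methylation: float, low: float = 0.3,
                  high: float = 0.7) -> str:
    """Classify a promoter by its starting (2°MEF) methylation level."""
    if mef_methylation <= low:
        return "low"
    if mef_methylation >= high:
        return "high"
    return "intermediate"


def engagement_rate(promoters: List[dict], sample: str) -> Dict[str, float]:
    """Percentage of promoters per starting state carrying the mark in `sample`.

    Each promoter is a dict with 'mef_meth' (float) and 'marked' (set of
    sample names in which the ESC-specific histone mark is present).
    """
    totals: Dict[str, int] = {}
    hits: Dict[str, int] = {}
    for p in promoters:
        state = initial_state(p["mef_meth"])
        totals[state] = totals.get(state, 0) + 1
        if sample in p["marked"]:
            hits[state] = hits.get(state, 0) + 1
    return {s: 100.0 * hits.get(s, 0) / totals[s] for s in totals}


# Toy promoters (illustrative values only).
toy_promoters = [
    {"mef_meth": 0.10, "marked": {"D8H", "ESC"}},
    {"mef_meth": 0.25, "marked": {"ESC"}},
    {"mef_meth": 0.85, "marked": {"ESC"}},
    {"mef_meth": 0.90, "marked": set()},
]
rate_d8h = engagement_rate(toy_promoters, "D8H")
```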
We propose a model that describes the key mechanism of
epigenetic control of gene expression during reprogramming
[Figure 6 graphic: epigenomic control of gene expression. DNA methylation-driven control (CpG poor): late activation (Expr-1c) and repression (Expr-2b) between Off and On states. Histone modification-driven control (CpG rich): early activation (Expr-1a), late activation (Expr-1b) and repression (Expr-2a) between Off and On states; Expr-3a and Expr-3b are also indicated. Legend: H3K27me3, H3K4me3, Un-meC, 5meC.]
Figure 6 | A model summarizing DNA methylation and histone
modification-driven control of gene expression. Dashed arrow represents
the strict control of demethylation. Gene classes affected by changes
are shown in brackets accompanying arrows.
(Fig. 6). In genes with CpG-poor promoters, control is driven by
DNA methylation. Such genes may be activated by demethylation
followed by H3K4me3 engagement, producing expression profiles
characteristic of class Expr-1c/2b. In genes with CpG-rich
promoters, low methylation levels allow histone modification-
driven control. This model is supported by data showing the role
of initial methylation status as a modulator of the dynamic
changes to histone modification, and the sequential modification
of DNA methylation followed by histone marks in TFBSs. The
model also accounts for characteristic gene expression classes
(detailed in Figs 5 and 6). We predict that this mechanism may
not only apply to iPSC reprogramming but also to lineage
specification of cells. Therefore, our insights into how DNA
methylation controls the epigenetic landscape in reprogramming
to pluripotency could be crucial to a better understanding of the
mechanisms underlying general cell fate change, and could have
ramifications for stem cell-based therapies.
Methods
Cell culture and secondary reprogramming. ROSA26-rtTA-IRES-GFP mouse
ESCs, iPSCs and mouse embryonic fibroblasts (MEFs) were cultured as previously
described (ref. 31). ESCs and iPSCs were cultured in 5% CO2 at 37 °C on irradiated MEFs in
DMEM containing 15% FCS, leukaemia-inhibiting factor, penicillin/streptomycin,
L-glutamine, nonessential amino acids, sodium pyruvate and 2-mercaptoethanol.
1B 1°iPS cells were aggregated with tetraploid host embryos as described (ref. 10) and
MEFs were established from E13.5 embryos. High-dox cell samples were collected on
days 0, 2, 5, 8, 11, 16 and 18 (D2H, D5H, D8H, D11H, D16H and D18H). A
subculture of the reprogramming cells was established from day 19 and cultured in
the absence of dox, to develop a factor-independent 2°iPS cell line by day 30
(2°iPSC). Low-dox samples were maintained in 5 ng ml−1 of dox from day 8 to day 14.
On day 14 the culture was split into two, with some of the cells being cultured
until day 21 in the absence of dox (D21Ø) and the remainder being cultured in
5 ng ml−1 of dox and collected on day 16 (D16L) and day 21 (D21L). Rosa26rtTA ESCs
and 1B 1°iPSCs were collected as controls.
MethylC-Seq library generation. For all 13 samples (2°MEF, D2H, D5H, D8H,
D11H, D16H, D18H, D16L, D21L, D21Ø, 1°iPSC, 2°iPSC and rtTA ESC), 5 μg of
genomic DNA was mixed with 25 ng unmethylated cI857 Sam7 Lambda DNA
(Promega, Madison, WI, USA). The DNA was fragmented by sonication to
300–500 bp with a Covaris S2 system (Covaris) followed by end repair with the
End-It DNA End-Repair Kit (Epicenter). Paired-end universal library adaptors
provided by Illumina were ligated to the sonicated DNA as per the manufacturer’s
instructions for genomic DNA library construction. Ligated products were purified
with AMPure XP beads (Beckman, Brea, CA, USA). Adaptor-ligated DNA was
bisulfite-treated using the EpiTect Bisulfite Kit (QIAGEN) following the manu-
facturer’s instructions and then PCR-amplified using PfuTurboCx Hotstart DNA
polymerase (Agilent, Santa Clara, CA, USA) with the following PCR conditions
(2 min at 95 °C, 4 cycles of 15 s at 98 °C, 30 s at 60 °C, 4 min at 72 °C and then
10 min at 72 °C). The reaction products were purified using the MinElute gel
purification kit (QIAGEN). The sodium bisulfite non-conversion rate was
calculated as the percentage of cytosines sequenced at cytosine reference positions
in the lambda genome.
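The non-conversion rate described above can be sketched as a simple ratio. This is a minimal illustration, assuming that base calls at lambda reference-cytosine positions have already been tallied as C (not converted) or T (converted); the function name and the counts are hypothetical and are not taken from the original pipeline.

```python
def non_conversion_rate(c_calls: int, t_calls: int) -> float:
    """Percentage of base calls at lambda reference cytosines still read as C.

    Lambda DNA is unmethylated, so every C read at these positions reflects
    failed bisulfite conversion (or sequencing error).
    """
    total = c_calls + t_calls
    if total == 0:
        raise ValueError("no coverage at lambda reference cytosine positions")
    return 100.0 * c_calls / total


# Hypothetical counts, for illustration only.
rate = non_conversion_rate(c_calls=450, t_calls=99_550)
print(f"non-conversion rate: {rate:.2f}%")
```

The same quantity, expressed as a fraction rather than a percentage, is the natural choice for the per-base error rate used in the binomial test for methylated cytosines.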
ChIP library generation. ChIP was carried out as described in ref. 32. In all,
40–150 million cells were fixed with 1% formaldehyde for 10 min at room
temperature, and scraped and stored as pellets (−80 °C). Samples were lysed at
20 million cells per ml Farnham lysis buffer for 10 min and subsequently at
10 million cells per ml nuclear lysis buffer. The released chromatin was sheared to
100–500 bp (250 bp average) on ice using a SonicsVibraCell Sonicator equipped
with a 3-mm probe. For each sample, 50 μl of solubilized chromatin was used as
input DNA to normalize sequencing results and the remaining chromatin was
immunoprecipitated with 10 μg of H3K4me3 (ab8580; ref. 33), 10 μg H3K27me3
(Millipore 07-449; ref. 16) or 10 μg H3K36me3 (ab9050; ref. 16) antibodies, separately.
Antibody–chromatin complexes were pulled down with 100 μl magnetic Protein G
Dynal beads (Invitrogen) and washed six times. The chromatin was then eluted,
reverse crosslinked at 65 °C overnight and subjected to RNaseA/proteinase K
treatment. ChIP and input DNA were purified using a Qiagen Purification Column
and quantified using a Quant-it dsDNA High Sensitivity Assay (Invitrogen). For
ChIP sequencing, ChIP-seq libraries were prepared according to the protocols
described in the Illumina ChIP-seq library preparation kit. Briefly, 50 ng of
immunopurified DNA or 100 ng of genomic DNA from an input sample was
end-repaired, followed by the 3′ addition of a single adenosine nucleotide and ligation
to universal library adapters. Ligated material was separated on a 2.0% agarose gel,
followed by the excision of a 250- to 350-bp fragment and column purification
(QIAGEN). DNA libraries were prepared by PCR amplification (18 cycles).
High-throughput sequencing. MethylC-Seq DNA and ChIP DNA libraries were
sequenced using the Illumina HiSeq 2000 as per the manufacturer’s instructions.
Sequencing of libraries was performed up to 2 × 101 cycles. Image analysis and
base calling were performed with the standard Illumina pipeline version RTA 2.8.0.
Processing and alignment of MethylC-Seq data. MethylC-Seq sequencing data
were processed using the Illumina analysis pipeline, and FastQ format reads were
aligned to the NCBI37/mm9 mouse reference using the Bismark/Bowtie alignment
algorithm18,34,35. Paired-read MethylC-Seq sequences produced by the Illumina
pipeline in FastQ format were trimmed with trim threshold 1,500; we removed the
last two bases from sequences that were not trimmed and removed three bases
from sequences that were trimmed. The Bismark package version 0.7.7 was used as
the aligner using the following parameters: -e 90 -n 2 -l 32 -X 550. As up to six
independent libraries from each biological replicate were sequenced, we first
removed duplicate reads. Subsequently, the reads from all libraries of a particular
sample were combined. Unique read alignments were then subjected to post-
processing. The number of calls for each base at every reference sequence position
and on each strand was calculated. All results of aligning a read to both the Watson
and Crick converted genome sequences were combined. CpG methylation
levels were calculated for each position, using the bisulfite conversion rate, from the
number of non-converted Cs per read depth (Supplementary Data 1).
RNA-Seq library generation and sequencing. Total RNA was subjected to two
rounds of on-column DNase I treatment to remove contaminating DNA using the
RNase-Free DNase set (Qiagen PN 79254) as per the manufacturer’s protocol. The
total RNA was then analysed using the Agilent RNA 6000 Nano Kit (PN 5067-
1511) on the Agilent Bioanalyzer 2100 (PN G2939AA) to quantify yield, qualify
integrity and confirm removal of DNA contamination.
Following DNase I treatment, 5 μg total RNA from each sample was depleted
of ribosomal RNA using the Ribo-Zero rRNA Removal Kit (Epicenter PN
RZH110424) as per the manufacturer’s instructions. The ribosomal-depleted RNAs
were then run on an Agilent RNA 6000 Pico Kit (PN 5067-1513) on the Agilent
Bioanalyzer 2100 to confirm ribosomal RNA depletion. Sequencing libraries were
generated from the ribosomal-depleted RNA using the SOLiD Transcriptome
Multiplexing Kit (PN 4427046) from Applied Biosystems following the
manufacturer’s publication. Final libraries were quantified and qualified using the
Agilent High Sensitivity DNA Kit (PN 5067-4626) on the Agilent Bioanalyzer 2100.
Sequencing libraries were subsequently pooled in equimolar ratios (four
libraries per pool) and clonally amplified on SOLiD nanobeads. Clonal
amplification was completed via emulsion PCR using the SOLiD EZ Bead System
(PN 4448419, 4448418 and 4448420) coupled with SOLiD EZ Bead N200
amplification reagents (PN 4467267, 4457185, 4467281, 4467283 and 4467282).
Following emulsion PCR, clonally amplified nanobeads were enriched using the
SOLiD EZ Bead Enricher Kits (PN 4467276, 4444140 and 4453073) before being
deposited into SOLiD 6-Lane FlowChip (PN 4461826) using the SOLiD Flowchip
Deposition Kit v2 (PN 4468081) as per the manufacturer’s recommendations.
In total, two flowchips were sequenced yielding a total of eight lanes of data,
with sequencing reads generated using the SOLiD 5500xl platform generating
paired 75 bp forward and 35 bp reverse reads. To allow de-convolution of the
pooled libraries, a single 5-bp index read was generated. A total of 1,204,676,394
fragments (2,409,352,788 reads) were generated post deconvolution, ranging from
35,714,748 to 147,282,580 fragments per library.
Processing and alignment of RNA-Seq data. Sequence mapping was performed
using Applied Biosystems LifeScope v2.5 whole transcriptome (paired-end) analysis
pipeline against the NCBIM37 (mm9) genome and exon-junction libraries
constructed from the Ensembl v64 gene model. Briefly, this pipeline first removes
potential contaminant reads by aligning to a filter set containing rRNA, tRNA,
adaptor sequences and retrotransposon sequences. Following filtering, LifeScope
then aligns all reads to the genome and F3 reads to the junction library. F5 reads
are additionally aligned at a higher sensitivity to exonic sequences within insert size
distance from the paired (F3) read alignment. Read alignments are merged and
disambiguated, and a single BAM (binary alignment/mapped) file is output per
library.
BAM files were then additionally filtered to remove reads with a mapping
quality (MAPQ) < 9 and all mitochondrial reads. Alignments were then assembled
using Cufflinks (v2.0.2) using the -G parameter to quantify gene and isoform
FPKM expression values against the reference gene model (Ensembl v67).
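The post-alignment filter can be illustrated on SAM-formatted text. The sketch below is not the tool used in the study; it assumes the filter removes alignments with MAPQ below 9 (keeping MAPQ of 9 or more), and the mitochondrial contig names `chrM` and `MT` are assumptions that depend on the reference naming.

```python
from typing import Iterable, Iterator

# Assumed mitochondrial contig names; adjust to the reference in use.
MITO_NAMES = {"chrM", "MT"}


def filter_sam(lines: Iterable[str], min_mapq: int = 9) -> Iterator[str]:
    """Yield header lines plus alignments that pass the MAPQ and contig filters."""
    for line in lines:
        if line.startswith("@"):  # SAM header lines are passed through unchanged
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        rname = fields[2]      # column 3: reference sequence name
        mapq = int(fields[4])  # column 5: mapping quality
        if mapq >= min_mapq and rname not in MITO_NAMES:
            yield line
```

In practice the same filter would be applied to BAM files with a dedicated toolkit; plain text is used here only to keep the example self-contained.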
Identification of methylated cytosines. At each reference cytosine, the binomial
distribution was used to identify whether at least a subset of the genomes within the
sample were methylated, using a 0.01 FDR-corrected P value. We identified
methylcytosines while keeping the number of false-positive methylcytosine calls below 1%
of the total number of methylcytosines we identified. The probability P in the
binomial distribution B(n, P) was estimated from the number of cytosine bases
sequenced in reference cytosine positions in the unmethylated Lambda genome
(referred to as the error rate: nonconversion plus sequencing error frequency). We
interrogated the sequenced bases at each reference cytosine position one at a time,
where read depth refers to the number of reads covering that position. For each
position, the number of trials (n) in the binomial distribution was the read depth.
For each possible value of n we calculated the number of cytosines sequenced (k) at
which the probability of sequencing k cytosines out of n trials with an error rate of
p was less than the value M, where M × (number of unmethylated cytosines) < 0.01 ×
(number of methylated cytosines). If the error rate p was over 0.01, we
assumed that the cytosine was not methylated. In this way, we established the
minimum threshold number of cytosines sequenced at each reference cytosine
position at which the position could be called as methylated, so that out of all
methylcytosines identified no more than 1% would be because of the error rate.
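The per-depth threshold can be sketched as follows. This is an illustration under stated assumptions rather than the original implementation: the probability of sequencing k cytosines is taken to be the upper tail P(X ≥ k) of Binomial(n, p), and the cutoff M is supplied directly as the argument `m` instead of being derived from the counts of methylated and unmethylated cytosines. Function names are hypothetical.

```python
from math import comb
from typing import Optional


def binom_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_methylated_calls(n: int, p: float, m: float) -> Optional[int]:
    """Smallest number of C calls k at read depth n with P(X >= k) < m.

    Returns None when no k <= n qualifies, or when the error rate p exceeds
    0.01 (in which case the cytosine is treated as not methylated).
    """
    if p > 0.01:
        return None
    for k in range(1, n + 1):
        if binom_upper_tail(k, n, p) < m:
            return k
    return None


# Hypothetical error rate and cutoff, for illustration only.
thresholds = {n: min_methylated_calls(n, p=0.005, m=0.01) for n in (1, 10, 30)}
print(thresholds)
```

Precomputing the threshold once per read depth, as the text describes, avoids repeating the binomial calculation at every cytosine in the genome.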
Calculation of DNA methylation level. If the error rate was less than 0.01, we
calculated the adjusted DNA methylation level for each cytosine as follows:
\[ \text{Adjusted cytosine methylation level} = \frac{a - \dfrac{b}{\mathrm{cr}}}{a} \qquad (1) \]
(a = total Cs, b = number of converted Cs, cr = bisulfite conversion rate).
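As a worked illustration of Equation (1), the sketch below reads the formula as (a − b/cr)/a, that is, the observed number of converted Cs is first scaled up by the conversion rate and the remainder is taken as methylated. That reading is an assumption based on the variable definitions given with the equation, and the clamping of the result to the range 0 to 1 is an additional assumption that the text does not state.

```python
def adjusted_methylation_level(total_cs: int, converted_cs: int,
                               conversion_rate: float) -> float:
    """Adjusted methylation level, reading Equation (1) as (a - b/cr) / a.

    total_cs        -- a, total number of Cs covering the position
    converted_cs    -- b, number of converted Cs
    conversion_rate -- cr, bisulfite conversion rate, in (0, 1]
    """
    if total_cs <= 0:
        raise ValueError("total_cs must be positive")
    if not 0 < conversion_rate <= 1:
        raise ValueError("conversion_rate must be in (0, 1]")
    level = (total_cs - converted_cs / conversion_rate) / total_cs
    # Assumption: clamp to [0, 1] so sampling noise cannot give negative levels.
    return min(1.0, max(0.0, level))


# Hypothetical counts, for illustration only.
print(adjusted_methylation_level(total_cs=20, converted_cs=5, conversion_rate=0.995))
```

With a perfect conversion rate (cr = 1) the expression reduces to the unadjusted fraction of non-converted Cs, (a − b)/a.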
Identification of DMRs. DMRs (Fig. 2) were identified using a sliding window
approach (Supplementary Fig. 6a, Fig. 2b). Windows of 30 CpGs spanning less than 6 kb,
with coverage of more than 5 in 15 CpGs per window in all samples, were
considered, progressing one CpG per iteration. A total of 20,214,978 windows were
assessed. Windows showing a maximum difference and fold enrichment of 30% and
fourfold, respectively, with Benjamini–Hochberg-corrected FDR from analysis of variance
(ANOVA) test P values of less than 1%, were identified as differentially methylated
windows. In all, 188,529 differentially methylated windows were then joined if the
regions overlapped, or if a region and the succeeding region
covered more than 60% of the region. This set of 7,890 DMRs covering
21,618,964 bp of the whole genome is reported in Fig. 2 and Supplementary Data 2.
DMRs were then defined as Hyper-DMRs and Hypo-DMRs if the average
methylation level of each DMR in each sample was higher or lower by
more than 20% relative to 2°MEF.
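The geometric part of the sliding-window procedure can be sketched as below. This is a simplified illustration, not the authors' code: it enumerates windows of 30 consecutive CpGs spanning less than 6 kb, merges overlapping windows, and labels a region relative to 2°MEF. The coverage filter, the ANOVA test, the FDR correction and the 60% joining rule are omitted, and the 20% threshold is assumed to be an absolute difference in methylation fraction. All function names are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]


def cpg_windows(positions: List[int], size: int = 30,
                max_span: int = 6000) -> List[Interval]:
    """Windows of `size` consecutive CpGs spanning < max_span bp, step one CpG."""
    windows = []
    for i in range(len(positions) - size + 1):
        start, end = positions[i], positions[i + size - 1]
        if end - start < max_span:
            windows.append((start, end))
    return windows


def merge_overlapping(windows: List[Interval]) -> List[Interval]:
    """Join windows that overlap into contiguous regions."""
    merged: List[Interval] = []
    for start, end in sorted(windows):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def classify_dmr(sample_level: float, mef_level: float,
                 threshold: float = 0.20) -> str:
    """Label a region by its methylation difference relative to the reference."""
    diff = sample_level - mef_level
    if diff > threshold:
        return "Hyper-DMR"
    if diff < -threshold:
        return "Hypo-DMR"
    return "unchanged"
```

For example, `merge_overlapping(cpg_windows(sorted_cpg_positions))` would give candidate regions for one chromosome, given a sorted list of CpG coordinates (a hypothetical input).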
Mapping and enrichment analysis of ChIP-Seq reads. Paired-end ChIP-Seq
data were processed using the Illumina analysis pipeline, and mapping was con-
ducted using Bowtie version 0.12.8 with the following parameters: --pairtries 100 -y
-k 1 -n 3 -l 50 -I 0 -X 1000. Enrichment analysis was conducted using MACS36
with parameters of --nomodel -S -w –n –space 30.
ChIP-Seq data analysis. Enriched peaks from ChIP-Seq data were joined into
clusters where at least one sample has a peak for each modification (H3K4me3,
H3K27me3 and H3K36me3; Supplementary Fig. 6b). The total peak width of each
sample within the cluster was calculated as histone mark score within clusters.
TFBS epigenomic change analysis. ESC-TFBSs of mouse ESCs were obtained
from different studies (refs 23–25). The CpG methylation level of each TFBS in each sample
was calculated. The average CpG methylation change of each TFBS was then
calculated in each sample relative to 2°MEF (Fig. 3). For calculating CpG
methylation change around ESC-TFBSs, the same procedure was applied for 200 bp
× 400 bins around each ESC-TFBS. The same procedure using enrichment score for
30-bp window was applied for calculating average histone modification change
(Fig. 4).
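The per-TFBS methylation change relative to 2°MEF can be sketched as a difference of regional means. The data layout below (one dictionary of CpG position to methylation level per sample) and the function names are assumptions made for illustration; the binned analysis around each TFBS would apply the same calculation to each bin.

```python
from statistics import mean
from typing import Dict, Optional


def region_methylation(cpg_levels: Dict[int, float],
                       start: int, end: int) -> Optional[float]:
    """Mean methylation of CpGs with start <= position < end, or None if none."""
    values = [lvl for pos, lvl in cpg_levels.items() if start <= pos < end]
    return mean(values) if values else None


def change_vs_reference(samples: Dict[str, Dict[int, float]], reference: str,
                        start: int, end: int) -> Dict[str, float]:
    """Methylation change of each sample relative to the reference sample."""
    ref_level = region_methylation(samples[reference], start, end)
    changes: Dict[str, float] = {}
    if ref_level is None:
        return changes
    for name, cpgs in samples.items():
        level = region_methylation(cpgs, start, end)
        if name != reference and level is not None:
            changes[name] = level - ref_level
    return changes
```

A negative value indicates demethylation of the site relative to the starting fibroblasts.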
Genome annotation. Genomic regions and CpG islands were defined based on
NCBI37/mm9 coordinates downloaded from the UCSC website
(http://genome.ucsc.edu/). Promoters were arbitrarily defined as 5 kb upstream and 1 kb
downstream of transcriptional start site for each Ensembl release-67 transcript.
Gene bodies are defined as from transcription start to end sites for each transcript.
Histone modification clusters and DMRs were annotated if they overlap with their
promoters.
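The promoter definition and the overlap-based annotation can be sketched as follows. Strand handling and the half-open coordinate convention are assumptions, since the text does not specify them; the function names are hypothetical.

```python
from typing import Tuple

Interval = Tuple[int, int]


def promoter_interval(tss: int, strand: str, upstream: int = 5000,
                      downstream: int = 1000) -> Interval:
    """Half-open promoter interval: 5 kb upstream to 1 kb downstream of the TSS.

    "Upstream" follows the direction of transcription, so the interval is
    mirrored for transcripts on the minus strand.
    """
    if strand == "+":
        start, end = tss - upstream, tss + downstream
    elif strand == "-":
        start, end = tss - downstream, tss + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    return max(0, start), end


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]
```

A DMR or histone modification cluster would then be assigned to a transcript when `overlaps(region, promoter_interval(tss, strand))` is true.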
Fold-enrichment test. Fold enrichment was calculated as follows: (observed
number of X in examined region/total length of examined region (bp))/(total
number of X in reference region/reference region length (bp)), where X is a genomic
feature.
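The fold-enrichment ratio above translates directly into code. The sketch below is a plain restatement of the formula, with hypothetical argument names and example numbers.

```python
def fold_enrichment(observed_count: int, region_bp: int,
                    reference_count: int, reference_bp: int) -> float:
    """Density of a feature in the examined region relative to a reference region."""
    if region_bp <= 0 or reference_bp <= 0:
        raise ValueError("region lengths must be positive")
    if reference_count <= 0:
        raise ValueError("reference region must contain at least one feature")
    observed_density = observed_count / region_bp
    reference_density = reference_count / reference_bp
    return observed_density / reference_density


# Hypothetical numbers, for illustration only.
print(fold_enrichment(observed_count=40, region_bp=1_000_000,
                      reference_count=1_000, reference_bp=100_000_000))
```

Values above 1 indicate that the feature is denser in the examined region than in the reference.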
Gene expression pattern separation. We selected genes for each expression pattern as
described in Supplementary Table 5.
Data integration and normalization. DNA methylation levels of promoters were
calculated from 5 kb upstream and 1 kb downstream of the transcription start site.
H3K4me3 and H3K27me3 marks were considered if their peak clusters
overlapped with promoters. Overlapping H3K36me3 peaks were calculated for the
whole gene. In Fig. 5, for calculating normalized histone modification scores, the
maximum peak width was set to 1 and relative widths were calculated for
each sample in each gene.
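The normalization of histone modification scores can be illustrated for a single gene. The sketch assumes the input is the total peak width per sample for one mark at one gene; the function name and the example widths are hypothetical.

```python
from typing import Dict


def normalize_peak_widths(widths: Dict[str, float]) -> Dict[str, float]:
    """Scale per-sample peak widths so that the largest width for the gene is 1."""
    top = max(widths.values(), default=0)
    if top == 0:
        # No peak in any sample: every normalized score is 0.
        return {sample: 0.0 for sample in widths}
    return {sample: width / top for sample, width in widths.items()}


# Hypothetical peak widths (bp) for one gene and one histone mark.
print(normalize_peak_widths({"2°MEF": 500, "D8H": 1000, "2°iPSC": 2000}))
```

Because each gene is scaled by its own maximum, the scores describe the time course within a gene and are not comparable in absolute terms between genes.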
Accession codes. Methylome sequencing data are available under the European
Nucleotide Archive accession no. ERP004116
(http://www.ebi.ac.uk/ena/data/view/PRJEB4795). Long RNA-seq and ChIP-seq
sequencing data are available under the NCBI Sequence Read Archive (SRA)
accession no. SRP046744 (http://www.ncbi.nlm.nih.gov/sra). Analysed data sets
can be obtained from Stemformatics (www.stemformatics.org) (ref. 37).
References
1. Takahashi, K. & Yamanaka, S. Induction of pluripotent stem cells from mouse
embryonic and adult fibroblast cultures by defined factors. Cell 126, 663–676
(2006).
2. Maherali, N. et al. Directly reprogrammed fibroblasts show global epigenetic
remodeling and widespread tissue contribution. Cell Stem Cell 1, 55–70 (2007).
3. Takahashi, K. et al. Induction of pluripotent stem cells from adult human
fibroblasts by defined factors. Cell 131, 861–872 (2007).
4. Yu, J. et al. Induced pluripotent stem cell lines derived from human somatic
cells. Science 318, 1917–1920 (2007).
5. Park, I. H. et al. Reprogramming of human somatic cells to pluripotency with
defined factors. Nature 451, 141–146 (2008).
6. Kang, L., Wang, J., Zhang, Y., Kou, Z. & Gao, S. iPS cells can support full-term
development of tetraploid blastocyst-complemented embryos. Cell Stem Cell 5,
135–138 (2009).
7. Zhao, X. Y. et al. iPS cells produce viable mice through tetraploid
complementation. Nature 461, 86–90 (2009).
8. Onder, T. T. et al. Chromatin-modifying enzymes as modulators of
reprogramming. Nature 483, 598–602 (2012).
9. Singhal, N. et al. Chromatin-remodeling components of the BAF complex
facilitate reprogramming. Cell 141, 943–955 (2010).
10. Woltjen, K. et al. piggyBac transposition reprograms fibroblasts to induced
pluripotent stem cells. Nature 458, 766–770 (2009).
11. Samavarchi-Tehrani, P. et al. Functional genomics reveals a BMP-driven
mesenchymal-to-epithelial transition in the initiation of somatic cell
reprogramming. Cell Stem Cell 7, 64–77 (2010).
12. Mikkelsen, T. S. et al. Dissecting direct reprogramming through integrative
genomic analysis. Nature 454, 49–55 (2008).
13. Polo, J. M. et al. A molecular roadmap of reprogramming somatic cells into iPS
cells. Cell 151, 1617–1632 (2012).
14. Buganim, Y. et al. Single-cell expression analyses during cellular
reprogramming reveal an early stochastic and a late hierarchic phase. Cell 150,
1209–1222 (2012).
15. Chen, J. et al. Vitamin C modulates TET1 function during somatic cell
reprogramming. Nat. Genet. 45, 1504–1509 (2013).
16. Wang, T. et al. The histone demethylases Jhdm1a/1b enhance somatic cell
reprogramming in a vitamin-C-dependent manner. Cell Stem Cell 9, 575–587
(2011).
17. Plath, K. & Lowry, W. E. Progress in understanding reprogramming to the
induced pluripotent state. Nat. Rev. Genet. 12, 253–265 (2011).
18. Lister, R. et al. Hotspots of aberrant epigenomic reprogramming in human
induced pluripotent stem cells. Nature 471, 68–73 (2011).
19. Papp, B. & Plath, K. Epigenetics of reprogramming to induced pluripotency.
Cell 152, 1324–1343 (2013).
20. Surani, M. A., Hayashi, K. & Hajkova, P. Genetic and epigenetic regulators of
pluripotency. Cell 128, 747–762 (2007).
21. Tonge, P. D. et al. Divergent reprogramming routes lead to alternative stem cell
states. Nature doi: 10.1038/nature14047 (2014).
22. Hussein, S. M. I. et al. Genome-wide characterization of the routes to
pluripotency. Nature doi: 10.1038/nature14046 (2014).
23. Chen, X. et al. Integration of external signaling pathways with the core
transcriptional network in embryonic stem cells. Cell 133, 1106–1117
(2008).
24. Ku, M. et al. Genomewide analysis of PRC1 and PRC2 occupancy identifies two
classes of bivalent domains. PLoS Genet. 4, e1000242 (2008).
25. Wu, H. et al. Dual functions of Tet1 in transcriptional regulation in mouse
embryonic stem cells. Nature 473, 389–393 (2011).
26. Visel, A., Minovitsky, S., Dubchak, I. & Pennacchio, L. A. VISTA Enhancer
Browser--a database of tissue-specific human enhancers. Nucleic Acids Res. 35,
D88–D92 (2007).
27. Feng, B. et al. Reprogramming of fibroblasts into induced pluripotent stem cells
with orphan nuclear receptor Esrrb. Nat. Cell Biol. 11, 197–203 (2009).
28. Fischedick, G. et al. Zfp296 is a novel, pluripotent-specific reprogramming
factor. PLoS ONE 7, e34645 (2012).
29. Stormo, G. D. DNA binding sites: representation and discovery. Bioinformatics
16, 16–23 (2000).
30. Ho Sui, S. J. et al. oPOSSUM: identification of over-represented transcription
factor binding sites in co-expressed genes. Nucleic Acids Res. 33, 3154–3164
(2005).
31. Nagy, A. & Gertsenstein, M. Manipulating the Mouse Embryo: A Laboratory
Manual (Cold Spring Harbor Press, 2003).
32. O’Geen, H., Echipare, L. & Farnham, P. J. in Epigenetics Protocols 791, 265–286
(Humana Press, 2011).
33. Gaspar-Maia, A. et al. Chd1 regulates open chromatin and pluripotency of
embryonic stem cells. Nature 460, 863–868 (2009).
34. Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. L. Ultrafast and memory-
efficient alignment of short DNA sequences to the human genome. Genome
Biol. 10, R25 (2009).
35. Krueger, F. & Andrews, S. R. Bismark: a flexible aligner and methylation caller
for bisulfite-Seq applications. Bioinformatics 27, 1571–1572 (2011).
36. Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9,
R137 (2008).
37. Wells, C. A. et al. Stemformatics: visualisation and sharing of stem cell gene
expression. Stem Cell Res. 10, 387–395 (2013).
Acknowledgements
This work has been supported by the Korean Ministry of Knowledge Economy (grant no.
10037410 to J.-S.S.), by the SNUCM research fund (grant no. 0411-20100074 to J.-S.S.)
and by Macrogen Inc. (grant no. MGR03-11 and 12).
Author contributions
J.-S.S. and A.N. conceived and designed the experiments. J.-Y.S., J.-Y.Y., J.K., K.-S.Y. and
H.K. performed MethylC-Seq and ChIP-Seq experiments. P.D.T. derived iPSC lines.
M.C.P., M.L., S.M.I.H. and I.M.R. performed pull downs for ChIP-Seq. N.C. and
S.M.G. performed RNA-Seq. D.-S.L. performed sequencing data processing. D.-S.L.,
S.L., W.-C.L. and H.R. conducted bioinformatic and statistical analyses. J.-S.S., D.-S.L.,
J.-Y.S., H.P., T.B. and J.-H.Y. wrote the manuscript.
Additional information
Accession codes: Methylome sequencing data are available under the European
Nucleotide Archive accession no. ERP004116 (http://www.ebi.ac.uk/ena/data/view/PRJEB4795).
Long RNA-seq and ChIP-seq sequencing data are available under the NCBI Sequence Read
Archive (SRA) accession no. SRP046744 (http://www.ncbi.nlm.nih.gov/sra).
Supplementary Information accompanies this paper at http://www.nature.com/
naturecommunications
Competing financial interests: The authors declare no competing financial interests.
Reprints and permission information is available online at http://npg.nature.com/
reprintsandpermissions/
How to cite this article: Lee, D.-S. et al. An epigenomic roadmap to induced
pluripotency reveals DNA methylation as a reprogramming modulator. Nat. Commun.
5:5619 doi: 10.1038/ncomms6619 (2014).
This work is licensed under a Creative Commons Attribution 4.0
International License. The images or other third party material in this
article are included in the article’s Creative Commons license, unless indicated otherwise
in the credit line; if the material is not included under the Creative Commons license,
users will need to obtain permission from the license holder to reproduce the material.
To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/ncomms6619
10 NATURE COMMUNICATIONS | 5:5619 | DOI: 10.1038/ncomms6619 | www.nature.com/naturecommunications
&2014 Macmillan Publishers Limited. All rights reserved.
... The process of pluripotency reprogramming is multiphasic, involving dynamic alterations of the epigenome and transcriptome during the acquisition of pluripotency. Gene regulation through histone H3 lysine-4 trimethylation (H3K4me3) and H3K27me3 is observed at CpG-rich promoters, where DNA methylation levels are stably low in somatic cells during reprogramming [24]. On the other hand, gene regulation through DNA demethylation is observed at CpG-poor promoters, where methylation levels are high in somatic cells. ...
Article
Full-text available
Although lineage reprogramming from one cell type to another is becoming a breakthrough technology for cell-based therapy, several limitations remain to be overcome, including the low conversion efficiency and subtype specificity. To address these, many studies have been conducted using genetics, chemistry, physics, and cell biology to control transcriptional networks, signaling cascades, and epigenetic modifications during reprogramming. Here, we summarize recent advances in cellular reprogramming and discuss future directions.
... DNA methylation is another epigenetic feature that has a role in developmental programming and epigenetic reprogramming of aged cells. DNA methylation during iPS cell reprogramming facilitates chromatin remodeling while silencing differentiation-associated genes 90,91 . Despite the extensive changes in DNA methylation immediately before the final stages of MEF to iPS cell conversion, several sites in transiently accessible chromatin lack DNA methylation before OSK binding. ...
... DNA methylation is another epigenetic feature that has a role in developmental programming and epigenetic reprogramming of aged cells. DNA methylation during iPS cell reprogramming facilitates chromatin remodeling while silencing differentiation-associated genes 90,91 . Despite the extensive changes in DNA methylation immediately before the final stages of MEF to iPS cell conversion, several sites in transiently accessible chromatin lack DNA methylation before OSK binding. ...
Article
Full-text available
Over the past decade, there has been a dramatic increase in efforts to ameliorate aging and the diseases it causes, with transient expression of nuclear reprogramming factors recently emerging as an intriguing approach. Expression of these factors, either systemically or in a tissue-specific manner, has been shown to combat age-related deterioration in mouse and human model systems at the cellular, tissue and organismal level. Here we discuss the current state of epigenetic rejuvenation strategies via partial reprogramming in both mouse and human models. For each classical reprogramming factor, we provide a brief description of its contribution to reprogramming and discuss additional factors or chemical strategies. We discuss what is known regarding chromatin remodeling and the molecular dynamics underlying rejuvenation, and, finally, we consider strategies to improve the practical uses of epigenetic reprogramming to treat aging and age-related diseases, focusing on the open questions and remaining challenges in this emerging field.
... Gene circuits consisting of two transcription factors, such as PU.1-GATA1 or SOX2-OCT4, have been extensively studied as they are R. Huang et al. Johnson et al., 2015;Shalek et al., 2013). Epigenetic regulation, encompassing histone modification and DNA methylation, has emerged as a key player in cellular heterogeneity and phenotype switching (Lee et al., 2014;Bagci and Fisher, 2013;Zhang et al., 2016;Feinberg and Levchenko, 2023). ...
... Published multi-omics studies discovering novel biological insights which are not possible with single-omics data further supports our points. [3][4][5][6][7][8][9] With the increasing volume of multi-omics data present in publicly accessible biological data repositories, [10][11][12] multi-omics data integration is expected to be the core strategy of modern and future biological data analyses. ...
Article
Data from multiple omics layers of a biological system is growing in quantity, heterogeneity and dimensionality. Simultaneous multi-omics data integration is of immense interest to researchers as it has potential to unlock previously hidden biomolecular relationships leading to early diagnosis, prognosis, and expedited treatments. Many tools for multi-omics data integration are developed. However, these tools are often restricted to highly specific experimental designs, types of omics data, and specific data formats. A major limitation of the field is the lack of a pipeline that can accept data in unrefined form to preserve maximum biology in an individual dataset prior to integration. We fill this gap by developing a flexible, generic multi-omics pipeline called multiomics , to facilitate general-purpose data exploration and analysis of heterogeneous data. The pipeline takes unrefined multi-omics data as input, sample information and user-specified parameters to generate a list of output plots and data tables for quality control and downstream analysis. We have demonstrated its application on a sepsis case study. We enabled limited checkpointing functionality where intermediate output is staged to allow continuation after errors or interruptions in the pipeline and generate a script for reproducing the analysis to improve reproducibility. Our pipeline can be installed as an R package or manually from the git repository, and is accompanied by detailed documentation with walkthroughs on three case studies.
... Inducing pluripotency in terminally differentiated cells requires reprogramming, i.e. a genomewide remodelling of the epigenome, such as histone modifications [64] and DNA methylation [65,66]. This remodelling allows for two cells with the same genome (iPSCs and the differentiated cells they were reprogrammed from) to have completely different cell identities and confers iPSCs with the characteristic plastic chromatin associated with the pluripotent state ( Figure 2). ...
Preprint
Full-text available
Human pluripotent stem cells (PSCs), which include both embryonic and induced pluripotent stem cells, are widely used in fundamental and applied biomedical research. They have been instrumental for better understanding development and cell differentiation processes, disease origin and progression, and can aid in the discovery of new drugs. PSCs also hold great potential in regenerative medicine to treat or diminish the effects of certain debilitating diseases, such as degenerative disorders. However, some concerns have recently been raised over their safety for the use in regenerative medicine. One of the major concerns is the fact that PSCs are prone to errors in passing the correct number of chromosomes to daughter cells, resulting in aneuploid cells. Aneuploidy, characterised by an imbalance in chromosome number, elicits the upregulation of different stress pathways that are deleterious to cell homeostasis, impair proper embryo development and can potentiate cancer development. In this review we will summarise known molecular mechanisms recently revealed to impair mitotic fidelity in human PSCs and the consequences of the decreased mitotic fidelity of these cells. We will finish with speculative views on how the physiological characteristics of PSCs can affect the mitotic machinery and how their suboptimal mitotic fidelity may be circumvented.
... Inducing pluripotency in terminally differentiated cells requires reprogramming, i.e., a genome-wide remodelling of the epigenome, such as histone modifications [67] and DNA methylation [68,69]. This remodelling allows for two cells with the same genome (iPSCs and the differentiated cells they were reprogrammed from) to have completely different cell identities and confers iPSCs with the characteristic plastic chromatin associated with the pluripotent state ( Figure 2). ...
Article
Full-text available
Human pluripotent stem cells (PSCs), which include both embryonic and induced pluripotent stem cells, are widely used in fundamental and applied biomedical research. They have been instrumental for better understanding development and cell differentiation processes, disease origin and progression and can aid in the discovery of new drugs. PSCs also hold great potential in regenerative medicine to treat or diminish the effects of certain debilitating diseases, such as degenerative disorders. However, some concerns have recently been raised over their safety for use in regenerative medicine. One of the major concerns is the fact that PSCs are prone to errors in passing the correct number of chromosomes to daughter cells, resulting in aneuploid cells. Aneuploidy, characterised by an imbalance in chromosome number, elicits the upregulation of different stress pathways that are deleterious to cell homeostasis, impair proper embryo development and potentiate cancer development. In this review, we will summarize known molecular mechanisms recently revealed to impair mitotic fidelity in human PSCs and the consequences of the decreased mitotic fidelity of these cells. We will finish with speculative views on how the physiological characteristics of PSCs can affect the mitotic machinery and how their suboptimal mitotic fidelity may be circumvented.
... The Yamanaka transcription factors Oct3/4, Klf4, Sox2 and Myc (OKSM) epigenetically revert somatic cells to a stem cell-like state (Takahashi & Yamanaka, 2006). Cellular reprogramming via OKSM is in part mediated by remodelling of the DNA methylome (Chondronasiou et al., 2022;Gao et al., 2013;Lee et al., 2014;Nishino et al., 2011). The expression of OKSM facilitates cellular plasticity that is characteristic of more youthful cells . ...
Article
Full-text available
The objectives of this study were to measure to what extent exercise stimulates partial molecular reprogramming in skeletal muscle. We hypothesized that exercise training, beyond its known functional improvements in aged tissues, could recapitulate some of the same effects that are seen with in vivo partial reprogramming by Yamanaka factors. Using transcriptome profiling from 1) a skeletal muscle-specific in vivo Oct3/4, Klf4, Sox2, and Myc (OKSM) reprogramming-factor expression murine model, 2) an in vivo inducible muscle-specific Myc induction murine model, 3) a translatable high-volume hypertrophic exercise training approach in aged mice, and 4) human exercise muscle biopsies, we collectively defined exercise-induced genes common to partial reprogramming. Late-life exercise training lowered murine DNA methylation age according to several contemporary muscle-specific clocks. A comparison of the murine soleus transcriptome after late-life exercise training to the soleus transcriptome after OKSM induction revealed an overlapping signature. Within this signature, downregulation of specific mitochondrial and muscle-enriched genes was conserved in skeletal muscle of long-term exercise-trained humans; among these was muscle-specific Abra/Stars. Myc is the OKSM factor most induced by exercise in muscle and was elevated following exercise training in aged mice. A pulse of MYC rewired the global soleus muscle methylome, and the transcriptome after a MYC pulse partially recapitulated OKSM induction. A common signature also emerged in the murine MYC-controlled and exercise adaptation transcriptomes, including lower muscle-specific Melusin and reactive oxygen species-associated Romo1. With Myc, OKSM, and exercise training in mice as well as habitual exercise in humans, the complex I accessory subunit Ndufb11 was lower; low Ndufb11 is linked to longevity in rodents.
Our data collectively suggest: 1) a biological age-mitigating effect on the epigenetic landscape by late-life exercise training in murine skeletal muscle, 2) a common gene expression signature of partial reprogramming by OKSM and exercise training in muscle of humans and aged mice, and 3) that Myc is an exercise-responsive factor that contributes to a rewired molecular profile at the transcriptome and methylome levels. National Institutes of Health: Kevin A. Murach, AG063994; American Federation for Aging Research (AFAR): Kevin A. Murach, Junior Investigator Grant. This is the full abstract presented at the American Physiology Summit 2023 meeting and is only available in HTML format. There are no additional versions or additional content available for this abstract. Physiology was not involved in the peer review process.
... Additionally, even cells considered the same phenotype can exhibit significant differences at the single-cell level, referred to as microscopic heterogeneity, which cannot be explained by known driving forces [24,25,26,27]. Epigenetic regulation has been shown to play a crucial role in cellular heterogeneity and phenotype switching [28,29,30,31]. ...
Preprint
Full-text available
Maintaining tissue homeostasis requires proper regulation of stem cell differentiation. The Waddington landscape suggests that gene circuits in a cell form a potential landscape of different cell types, with cells developing into different cell types following attractors of the probability landscape. However, it remains unclear how adult stem cells balance the trade-off between self-renewal and differentiation. We propose that random inheritance of epigenetic states plays a crucial role in stem cell differentiation and develop a hybrid model of stem cell differentiation induced by epigenetic modifications. Our model integrates a gene regulation network, epigenetic state inheritance, and cell regeneration to form multi-scale dynamics ranging from transcription regulation to cell population. Our simulation investigates how random inheritance of epigenetic states during cell division can automatically induce cell differentiation, dedifferentiation, and transdifferentiation. We show that interfering with epigenetic modifications or introducing extra transcription factors can regulate the probabilities of dedifferentiation and transdifferentiation, revealing the mechanism of cell reprogramming. This in silico model offers insights into the mechanism of stem cell differentiation and cell reprogramming.
Article
Full-text available
The role of the sodium citrate transporter (NaCT) SLC13A5 is multifaceted and context-dependent. While aberrant dysfunction leads to neonatal epilepsy, its therapeutic inhibition protects against metabolic disease. Notably, insights regarding the cellular and molecular mechanisms underlying these phenomena are limited due to the intricacy and complexity of the latent human physiology, which is poorly captured by existing animal models. This review explores innovative technologies aimed at bridging such a knowledge gap. First, I provide an overview of SLC13A5 variants in the context of human disease and the specific cell types where the expression of the transporter has been observed. Next, I discuss current technologies for generating patient-specific induced pluripotent stem cells (iPSCs) and their inherent advantages and limitations, followed by a summary of the methods for differentiating iPSCs into neurons, hepatocytes, and organoids. Finally, I explore the relevance of these cellular models as platforms for delving into the intricate molecular and cellular mechanisms underlying SLC13A5-related disorders.
Article
Full-text available
Somatic cell reprogramming to a pluripotent state continues to challenge many of our assumptions about cellular specification, and despite major efforts, we lack a complete molecular characterization of the reprogramming process. To address this gap in knowledge, we generated extensive transcriptomic, epigenomic and proteomic data sets describing the reprogramming routes leading from mouse embryonic fibroblasts to induced pluripotency. Through integrative analysis, we reveal that cells transition through distinct gene expression and epigenetic signatures and bifurcate towards reprogramming transgene-dependent and -independent stable pluripotent states. Early transcriptional events, driven by high levels of reprogramming transcription factor expression, are associated with widespread loss of histone H3 lysine 27 trimethylation (H3K27me3), representing a general opening of the chromatin state. Maintenance of high transgene levels leads to re-acquisition of H3K27me3 and a stable pluripotent state that is alternative to the embryonic stem cell (ESC)-like fate. Lowering transgene levels at an intermediate phase, however, guides the process to the acquisition of an ESC-like chromatin and DNA methylation signature. Our data provide a comprehensive molecular description of the reprogramming routes and are accessible through the Project Grandiose portal at http://www.stemformatics.org.
Article
Full-text available
Pluripotency is defined by the ability of a cell to differentiate to the derivatives of all the three embryonic germ layers: ectoderm, mesoderm and endoderm. Pluripotent cells can be captured via the archetypal derivation of embryonic stem cells or via somatic cell reprogramming. Somatic cells are induced to acquire a pluripotent stem cell (iPSC) state through the forced expression of key transcription factors, and in the mouse these cells can fulfil the strictest of all developmental assays for pluripotent cells by generating completely iPSC-derived embryos and mice. However, it is not known whether there are additional classes of pluripotent cells, or what the spectrum of reprogrammed phenotypes encompasses. Here we explore alternative outcomes of somatic reprogramming by fully characterizing reprogrammed cells independent of preconceived definitions of iPSC states. We demonstrate that by maintaining elevated reprogramming factor expression levels, mouse embryonic fibroblasts go through unique epigenetic modifications to arrive at a stable, Nanog-positive, alternative pluripotent state. In doing so, we prove that the pluripotent spectrum can encompass multiple, unique cell states.
Article
Full-text available
Vitamin C, a micronutrient known for its anti-scurvy activity in humans, promotes the generation of induced pluripotent stem cells (iPSCs) through the activity of histone demethylating dioxygenases. TET hydroxylases are also dioxygenases implicated in active DNA demethylation. Here we report that TET1 either positively or negatively regulates somatic cell reprogramming depending on the absence or presence of vitamin C. TET1 deficiency enhances reprogramming, and its overexpression impairs reprogramming in the context of vitamin C by modulating the obligatory mesenchymal-to-epithelial transition (MET). In the absence of vitamin C, TET1 promotes somatic cell reprogramming independent of MET. Consistently, TET1 regulates 5-hydroxymethylcytosine (5hmC) formation at loci critical for MET in a vitamin C-dependent fashion. Our findings suggest that vitamin C has a vital role in determining the biological outcome of TET1 function at the cellular level. Given its benefit to human health, vitamin C should be investigated further for its role in epigenetic regulation.
Article
Reprogramming to induced pluripotent stem cells (iPSCs) proceeds in a stepwise manner with reprogramming factor binding, transcription, and chromatin states changing during transitions. Evidence is emerging that epigenetic priming events early in the process may be critical for pluripotency induction later. Chromatin and its regulators are important controllers of reprogramming, and reprogramming factor levels, stoichiometry, and extracellular conditions influence the outcome. The rapid progress in characterizing reprogramming is benefiting applications of iPSCs and is already enabling the rational design of novel reprogramming factor cocktails. However, recent studies have also uncovered an epigenetic instability of the X chromosome in human iPSCs that warrants careful consideration.
Article
Genome-scale technologies are increasingly adopted by the stem cell research community because of their potential to uncover the molecular events most informative about a stem cell state. These technologies also present enormous challenges around the sharing and visualisation of data derived from different laboratories or under different experimental conditions. Stemformatics is an easy-to-use, publicly accessible portal that hosts a large collection of exemplar stem cell data. It provides fast visualisation of gene expression across a range of mouse and human datasets, with transparent links back to the original studies. One difficulty in the analysis of stem cell signatures is the paucity of public pathways/gene lists relevant to stem cell or developmental biology. Stemformatics provides a simple mechanism to create, share and analyse gene sets, providing a repository of community-annotated stem cell gene lists that are informative about pathways, lineage commitment, and common technical artefacts. Stemformatics can be accessed at stemformatics.org.