Emily S. Wong's research while affiliated with Victor Chang Cardiac Research Institute and other places

Publications (28)


Enhancer turnover is coupled to germline replication timing
A Mouse enhancers are defined based on combinations of histone marks. B Definition of mouse recent and conserved enhancers⁹. Recent enhancers are defined as regions with mouse-specific histone marks enrichment⁹. Conserved enhancers are aligned to regions with regulatory activity in at least two other species. C Replication time across 200 kb blocks of the mouse genome (n = 8966 blocks) in PGC (n = 2 cell lines), SSC cells (n = 2 cell lines), and early somatic cell types (n = 22 cell lines). Row clustering (blocks) was carried out with k-means clustering; columns are cell-type clusters generated with hierarchical clustering. Row clusters were ordered from early (top) to late (bottom) DNA replication timing, across columns (cell-type clusters). D Numbers of recent and conserved enhancers in regions of (C) with constitutively early (blue), constitutively late (red), and dynamic (gray) replication time. E Enhancer turnover as the log fold change of conserved vs. recent enhancers for the 200 kb clusters across mean germline replication time calculated across PGC (n = 2) and SSP cell lines (n = 2). Shaded areas represent clusters with constitutive DNA replication time. F Scatterplot of mean germline replication time (PGC + SSP) across the 18 clusters shown in (C). P value from a two-sided test of the Pearson correlation coefficient. Shaded areas represent a 95% confidence interval (CI) of the best fits. G Scatterplot of germline mean DNA replication time (PGC + SSP) and log10-transformed numbers of recent and conserved enhancers. Each data point is a cluster defined in (C). The shaded region represents the 95% CI of the line of best fit. H Mean PGC and SSC DNA replication time of poised and active mouse enhancers separated by tissue and type. I Mean germline DNA replication time (PGC + SSP) versus enhancer turnover by tissue and enhancer type. Each data point corresponds to a cluster in (C). The shaded region is the 95% CI of the line of best fit. 
J The number of recent/conserved enhancers overlapping recent/ancestral retrotransposons (Fisher’s exact test, p < 2.2 × 10⁻¹⁶, two-sided, odds ratio = 2.99).
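The turnover measure and its correlation with replication time described in panels E to G can be illustrated with a short sketch. It assumes the definition given later in the Fig. 4B caption, log(number of recent enhancers / number of conserved enhancers), with a natural logarithm chosen here for illustration; the cluster values are invented and the function names are hypothetical, not taken from the paper's code.

```python
import math


def enhancer_turnover(n_recent, n_conserved):
    """Log ratio of recent to conserved enhancer counts (natural log assumed)."""
    if n_recent <= 0 or n_conserved <= 0:
        raise ValueError("counts must be positive")
    return math.log(n_recent / n_conserved)


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from first principles."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented clusters: (mean germline replication time, recent count, conserved count).
# Positive replication time is treated as early, negative as late.
clusters = [(1.0, 100, 200), (0.0, 150, 150), (-1.0, 200, 100)]

replication_time = [c[0] for c in clusters]
turnover = [enhancer_turnover(c[1], c[2]) for c in clusters]
r = pearson_r(replication_time, turnover)
```

With these toy counts, turnover rises as replication time decreases, so the correlation is negative; real data would also need the significance test and confidence band that the caption describes.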
Deep-learning model links changes in TF binding sites with enhancer turnover
A Deep-learning domain adaptive model trained with HNF4A and CEBPA binding sites in mouse and human genomes³⁷. Prediction on species-specific enhancers and their aligned non-enhancer sequences in the other species. The pie charts show the percentage of enhancers and matched non-enhancer regions with predicted HNF4A and CEBPA TFBSs with a probability threshold ≥ 0.9. Fisher’s exact test two-sided p values are shown for each enhancer vs non-enhancer comparison. B Examples of species-specific liver candidate enhancers and their sequence alignments to the other species where binding is not predicted in (A). Boxed alignment of a motif identified in the species possessing the enhancer (top sequence) and its alignment to the species without the enhancer (bottom sequence). The motif’s position-weighted matrix (PWM) logo is on the right. The logo is on the negative strand in the last example. * denotes changes to PWM in the orthologous sequence without peak. Details on the data processing of this figure are available in Supplemental Methods. C Numbers of mouse- and human-specific enhancers with predicted TFBSs divided by the total number of enhancers across replication time quintiles. The difference in enhancer proportions was tested using a one-sided Fisher’s exact test between all pairs of DNA replication time quintiles, testing for a higher proportion in the latest quintile (alternative = “greater”). P values are indicated for significant tests (p ≤ 0.05).
Enhancers do not show strong signatures of purifying selection
Derived Allele Frequency (DAF) odds ratio for recently evolved (A) and conserved human liver enhancers (B) and conserved promoters (C) compared to background genomic regions as a measure of selection pressure. Promoters and enhancers were centered based on the location of liver-specific functional motifs. p = 0.01 and 4.34 × 10⁻¹² for recent and conserved enhancers, and p = 8.58 × 10⁻¹⁶ for promoters (two-sided Fisher’s exact test, significance code *P ≤ 0.05 and ****P ≤ 0.0001). In (A–C), the shaded areas represent a 95% confidence interval from sampling the data with replacement (Methods). D Log2-transformed odds ratio of DAF scores for conserved and recent enhancers and promoters. Conservation was defined using multiple thresholds (number of species). Active and inactive enhancers were separated using STARR-seq scores to measure enhancer activity in HepG2 cells³⁹ (Methods). DAF Log ORs for recently evolved human enhancers aligned to the mouse genome where TFBS were detected or not detected using the deep-learning model trained for HNF4A and CEBPA in Fig. 2 are shown. The middle points represent the log2-transformed odds ratio values from a Fisher’s test comparing the proportion of rare and common variants against the genome. Error bars represent the 95% confidence intervals of Fisher’s exact test. Numbers of elements are shown on the right. E Log2 transformed STARR-seq activity of human liver recent and conserved enhancers separated into early (RT >0.5) and late (RT <−0.5) replicating. The quartiles in box plots represent the 25th, 50th (median), and 75th percentiles. The interquartile range (IQR) represents the difference between the 75th and 25th percentiles. The upper whiskers extend to the maximum value of data within 1.5 IQR above the 75th percentile. The lower whiskers extend to the minimum value in the data within 1.5 IQR below the 25th percentile. Outliers are values above the upper whiskers or below the lower whiskers. 
A two-sided Mann–Whitney U-test p value is shown in each case. The number of enhancers is indicated in each case.
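The box plot convention spelled out in panel E (quartiles, whiskers limited to 1.5 IQR, outliers beyond the whiskers) can be sketched as follows. This is an illustration only: the quantile interpolation method is an assumption, since plotting libraries differ, and the input values are invented.

```python
import statistics


def box_stats(values):
    """Quartiles, 1.5-IQR whiskers and outliers for one group of values."""
    data = sorted(values)
    # "inclusive" interpolation is an assumption; other tools may differ slightly.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        # Whiskers end at the most extreme data points inside the fences.
        "lower_whisker": min(inside),
        "upper_whisker": max(inside),
        "outliers": [v for v in data if v < lower_fence or v > upper_fence],
    }


# Invented activity scores with one extreme value.
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

The whiskers stop at observed data points rather than at the fences themselves, which matches the caption's wording ("maximum value of data within 1.5 IQR above the 75th percentile").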
Tissue-specific enhancers are enriched at late replicating regions
A The proportions of early and late replicating enhancers for tissue-specific and non-tissue-specific mouse recent and conserved enhancers (defined with four tissues: brain, liver, muscle, testis) (Two-sided Fisher’s exact test, p, and odds ratio values are shown in each case). B Mean mouse germline DNA replication time versus enhancer turnover rate, defined as log (number of recent enhancers/number of conserved enhancers), for tissue-specific and non-tissue-specific enhancers (shown in red and blue, respectively) across the 18 DNA replication time clusters shown in Fig. 2C. R² = 0.95 (two-sided Pearson correlation p value = 6.43 × 10⁻¹²) and 0.78 (two-sided Pearson correlation p value = 1.19 × 10⁻⁰⁶) for tissue-specific and non-tissue-specific enhancers, respectively. ANCOVA p value for the difference in slope is shown. The shaded area represents a 95% confidence interval of the best fit. C Mean replication time of developmental and housekeeping fruit fly enhancers (one-sided Mann–Whitney U-test housekeeping versus developmental, alternative = “greater,” n = 200 enhancers each class). The quartiles represent the 25th, 50th (median), and 75th percentiles. The interquartile range (IQR) represents the difference between the 75th and 25th percentiles. The upper whiskers extend to the maximum value of data within 1.5 IQR above the 75th percentile, and the lower whiskers extend to the minimum value in the data within 1.5 IQR below the 25th percentile. Outliers are values above the upper whiskers or below the lower whiskers. D, E Violin plots of tissue-specific expression scores (tau values) of human and mouse TFs separated into five quintiles depending on their respective motif enrichments at early versus late replicating enhancers (one-sided Mann–Whitney U-test, pairwise comparison of later vs. earlier replicating quintile, alternative = “greater,” significance code: ‘ns’ P > 0.05, *P ≤ 0.05, **P ≤ 0.01, and ****P ≤ 0.0001). 
P values for the significant (p ≤ 0.05) comparisons of consecutive quintiles are shown. Across the panel, mouse germline replication times are calculated as the mean across PGC and SSC cells, and human replication times are from H9 cells.
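Panels D and E report tau values without restating how they are computed. A minimal sketch, assuming the widely used definition of tau (the sum over tissues of 1 − x_i / x_max, divided by n − 1); whether the authors applied this exact variant or any prior transformation of expression is not stated in the caption, and the expression vectors below are invented.

```python
def tau(expression):
    """Tissue-specificity index: 0 for uniform expression, 1 for a single tissue."""
    n = len(expression)
    if n < 2:
        raise ValueError("tau needs expression values from at least two tissues")
    if any(x < 0 for x in expression):
        raise ValueError("expression values must be non-negative")
    peak = max(expression)
    if peak == 0:
        raise ValueError("tau is undefined when expression is zero everywhere")
    return sum(1 - x / peak for x in expression) / (n - 1)


# Invented expression profiles across four tissues.
specific_tf = tau([10, 0, 0, 0])      # expressed in one tissue only
housekeeping_tf = tau([5, 5, 5, 5])   # expressed equally everywhere
intermediate_tf = tau([10, 5, 0])     # partially restricted expression
```

Values close to 1 indicate expression concentrated in one tissue, which is the direction the caption associates with TFs whose motifs are enriched at later replicating enhancers.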
AT-rich motifs are associated with developmental TFs and are overrepresented at late replication time in mammals
A GC percentage of human liver enhancers and promoters and random genomic regions across replication time quintiles (random regions were sampled from the non-genic areas of the genome, excluding promoters and enhancers) (n = 28,175, 11,520, and 5000 enhancers, promoters, and genomic background, respectively). Difference in GC% across DNA replication time quintiles was significant for every type of sequence (Supplementary Table 3). Quartiles, whiskers, and outliers are defined in Fig. 3E. B Mean non-CpG substitutions at liver enhancers, exonic, and intergenic regions across H9 replication time quintiles. Substitutions were calculated between humans and the inferred common ancestor of Homo and Pan. The number of substitutions was adjusted by their ancestral nucleotide frequency, and log10 transformed. Error bars represent standard error (the number of regions per quintile is shown in Supplementary Table 4). C Scatterplot of the proportion of GC for TF binding motifs based on enrichment at early versus late replicating human liver enhancers. Two-sided Pearson correlation coefficient and p value are shown. The shaded area represents the 95% confidence interval of the best fit. D The bar plot shows the GC proportion of each motif. Heatmap of the GC/AT nucleotide content of TF binding motifs ordered based on their relative enrichment at early versus late replicating human liver enhancers (n = 5538 each replication time). Each column shows a human TF binding motif from the JASPAR database. E Relative enrichment of TF binding motifs at early versus late replicating liver enhancers grouped by TF class (center heatmap). The GC content of the motifs is shown on the right. Bars are colored by TF Class. Only TF classes with more than ten TFs are shown. The heatmap on the left shows the relative enrichment of homeodomain factors in early versus late replicating enhancers using JASPAR human motifs (left column) and using only the highest scoring motifs (mid column) (Methods). 
The column on the right shows the relative enrichment of homeodomain factors in early versus late GC%-matched genome background.
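Panel A groups sequences into replication time quintiles and compares their GC percentage. The sketch below shows one plausible way to compute both quantities; the rank-based quintile assignment, the handling of ambiguous bases, and all names and values are assumptions made for illustration rather than the authors' procedure.

```python
def gc_percent(seq):
    """GC percentage over unambiguous bases (A, C, G, T); other symbols are ignored."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence has no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def quintile_labels(values):
    """Assign each value a quintile from 1 (lowest values) to 5 (highest), by rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, index in enumerate(order):
        labels[index] = rank * 5 // n + 1
    return labels


# Invented replication time values for ten regions.
replication_time = [0.9, -1.2, 0.1, 1.5, -0.4, 0.6, -0.9, 1.1, -0.1, 0.3]
labels = quintile_labels(replication_time)
```

Here quintile 1 holds the lowest replication time values; which end corresponds to "early" depends on the sign convention of the replication timing data.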


Emergence of enhancers at late DNA replicating regions
  • Article
  • Full-text available

April 2024

·

10 Reads

Nature Communications

Paola Cornejo-Páramo

·

Veronika Petrova

·

Xuan Zhang

·

[...]

·

Emily S. Wong

Enhancers are fast-evolving genomic sequences that control spatiotemporal gene expression patterns. By examining enhancer turnover across mammalian species and in multiple tissue types, we uncover a relationship between the emergence of enhancers and genome organization as a function of germline DNA replication time. While enhancers are most abundant in euchromatic regions, enhancers emerge almost twice as often in late compared to early germline replicating regions, independent of transposable elements. Using a deep learning sequence model, we demonstrate that new enhancers are enriched for mutations that alter transcription factor (TF) binding. Recently evolved enhancers appear to be mostly neutrally evolving and enriched in eQTLs. They also show more tissue specificity than conserved enhancers, and the TFs that bind to these elements, as inferred by binding sequences, also show increased tissue-specific gene expression. We find a similar relationship with DNA replication time in cancer, suggesting that these observations may be time-invariant principles of genome evolution. Our work underscores that genome organization has a profound impact in shaping mammalian gene regulation.


DNA-Diffusion: Leveraging Generative Models for Controlling Chromatin Accessibility and Gene Expression via Synthetic Regulatory Elements

February 2024

·

52 Reads

·

2 Citations

The challenge of systematically modifying and optimizing regulatory elements for precise gene expression control is central to modern genomics and synthetic biology. Advancements in generative AI have paved the way for designing synthetic sequences with the aim of safely and accurately modulating gene expression. We leverage diffusion models to design context-specific DNA regulatory sequences, which hold significant potential toward enabling novel therapeutic applications requiring precise modulation of gene expression. Our framework uses a cell type-specific diffusion model to generate synthetic 200 bp regulatory elements based on chromatin accessibility across different cell types. We evaluate the generated sequences based on key metrics to ensure they retain properties of endogenous sequences: transcription factor binding site composition, potential for cell type-specific chromatin accessibility, and capacity for sequences generated by DNA diffusion to activate gene expression in different cell contexts using state-of-the-art prediction models. Our results demonstrate the ability to robustly generate DNA sequences with cell type-specific regulatory potential. DNA-Diffusion paves the way for revolutionizing a regulatory modulation approach to mammalian synthetic biology and precision gene therapy.


The potential of epigenetic therapy to target the 3D epigenome in endocrine-resistant breast cancer

January 2024

·

83 Reads

·

3 Citations

Nature Structural & Molecular Biology

Three-dimensional (3D) epigenome remodeling is an important mechanism of gene deregulation in cancer. However, its potential as a target to counteract therapy resistance remains largely unaddressed. Here, we show that epigenetic therapy with decitabine (5-Aza-mC) suppresses tumor growth in xenograft models of pre-clinical metastatic estrogen receptor positive (ER+) breast tumor. Decitabine-induced genome-wide DNA hypomethylation results in large-scale 3D epigenome deregulation, including de-compaction of higher-order chromatin structure and loss of boundary insulation of topologically associated domains. Significant DNA hypomethylation associates with ectopic activation of ER-enhancers, gain in ER binding, creation of new 3D enhancer–promoter interactions and concordant up-regulation of ER-mediated transcription pathways. Importantly, long-term withdrawal of epigenetic therapy partially restores methylation at ER-enhancer elements, resulting in a loss of ectopic 3D enhancer–promoter interactions and associated gene repression. Our study illustrates the potential of epigenetic therapy to target ER+ endocrine-resistant breast cancer by DNA methylation-dependent rewiring of 3D chromatin interactions, which are associated with the suppression of tumor growth.


Fig 3. BOM provides local and global motif importance scores. a. Genome browser tracks showing the snATAC-seq signal around a region of mouse chromosome 17 near the gene Nkx2-5 for each mouse E8.25 cell type. The locations of three cardiomyocyte-specific CREs are shown at the bottom. b. SHAP local explanation of three cardiomyocyte-specific CREs shown in a. The top four most important motifs for classifying those CREs are shown. The red and blue arrows indicate the sign (and the direction) of SHAP values. A representative name was given to every motif, and the motif count is indicated. c. Heatmaps representing the predicted probability of the CREs shown in a and b by each of the binary models trained to predict CREs specific to a cell type in mouse E8.25. d. SHAP values of top TF binding motifs in distinguishing cardiomyocyte-, endothelium-, and neural crest-associated CREs in mouse E8.25. The SHAP values were calculated based on background sets of CREs specific to other cell types. Each dot represents the SHAP value of a CRE for a given motif. The color code represents the normalized motif count: (counts - min(counts, na.rm = TRUE)) / (max(counts, na.rm = TRUE) - min(counts, na.rm = TRUE)). A positive SHAP score indicates importance in the target CRE set, while a negative value indicates importance for the background set. e. Mean SHAP values for the TF binding motifs shown in d. f. Mean expression of the TFs that bind to the motifs in d and e. Expression data are normalized counts from matched scRNA-seq experiment⁶³. g. Top 20 most important motifs in distinguishing mouse E8.25 [...]
A Bag-Of-Motif Model Captures Cell States at Distal Regulatory Sequences

January 2024

·

14 Reads

Deciphering the intricate regulatory code governing cell-type-specific gene expression is a fundamental goal in genetics. Current methods struggle to capture the complex interplay between gene distal regulatory sequences and cell context. We developed a computational approach, BOM (Bag-of-Motifs), which represents cis-regulatory sequences by the type and number of TF binding motifs it contains, irrespective of motif order, orientation, and spacing. This simple yet powerful representation allows BOM to efficiently capture the complexity of cell-type-specific information encoded within these sequences. We apply BOM to mouse, human, and zebrafish distal regulatory regions, demonstrating remarkable accuracy. Notably, the method outperforms more complex deep learning models at the same task using fewer parameters. BOM can also uncover cross-species sequence similarities unrecognized by genome alignments. We experimentally validate our in silico predictions using enhancer reporter assay, showing that motifs with the most significant explanatory power are sequence determinants of cell-type specific enhancer activity. BOM offers a novel systematic framework for studying cell-type or condition-specific cis-regulatory sequences. Using BOM, we demonstrate the existence of a highly predictive sequence code at distal regulatory regions in mammals driven by TF binding motifs.
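The bag-of-motifs idea in this abstract, representing a sequence only by which motifs it contains and how often, can be sketched in a few lines. This is a simplified illustration, not the BOM implementation: it matches exact consensus strings instead of scoring position weight matrices, the motif names and sequences are invented, and no classifier is trained. The min-max scaling at the end mirrors the normalized motif count formula quoted in the Fig 3 caption.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def count_occurrences(seq, pattern):
    """Number of (possibly overlapping) exact matches of pattern in seq."""
    width = len(pattern)
    return sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] == pattern)


def bag_of_motifs(seq, motifs):
    """Motif counts on both strands, ignoring order, orientation and spacing."""
    seq = seq.upper()
    bag = {}
    for name, consensus in motifs.items():
        forward = count_occurrences(seq, consensus)
        rc = reverse_complement(consensus)
        # Palindromic motifs would otherwise be counted twice.
        reverse = 0 if rc == consensus else count_occurrences(seq, rc)
        bag[name] = forward + reverse
    return bag


def min_max(counts):
    """Scale counts to [0, 1]; a constant vector maps to all zeros."""
    low, high = min(counts), max(counts)
    if high == low:
        return [0.0] * len(counts)
    return [(c - low) / (high - low) for c in counts]


# Hypothetical consensus motifs and two sequences with the same motifs in a different order.
motifs = {"GATA_like": "GATA", "EBOX_like": "CAGCTG"}
bag_a = bag_of_motifs("AAGATAAGGCAGCTGTTTATCAA", motifs)
bag_b = bag_of_motifs("TTTATCAACAGCTGAAGATAAGG", motifs)
scaled = min_max([2, 1, 0])
```

Both sequences yield the same bag even though their motifs appear in a different order and orientation, which is the property the abstract highlights.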



Genome-wide Transcription Factor binding maps reveal cell-specific changes in the regulatory architecture of human HSPC

August 2023

·

77 Reads

·

9 Citations

Blood

Hematopoietic stem and progenitor cells (HSPCs) rely on a complex interplay of transcription factors (TFs) to regulate differentiation into mature blood cells. A heptad of TFs - FLI1, ERG, GATA2, RUNX1, TAL1, LYL1, LMO2 - bind regulatory elements in bulk CD34+ HSPCs. However, whether specific heptad-TF combinations have distinct roles in regulating hematopoietic differentiation remained unknown. We mapped genome-wide chromatin contacts (HiC, H3K27ac HiChIP), chromatin modifications (H3K4me3, H3K27ac, H3K27me3) and 10 TF binding profiles (the Heptad, PU.1, CTCF, and STAG2) in HSPC subsets (HSC-MPP, CMP, GMP, MEP) and found that TF occupancy and enhancer-promoter interactions varied significantly across cell types and were associated with cell-type-specific gene expression. Distinct regulatory elements were enriched with specific heptad-TF combinations, including stem-cell-specific elements with ERG, and myeloid- and erythroid-specific elements with combinations of FLI1, RUNX1, GATA2, TAL1, LYL1, and LMO2. Furthermore, heptad-occupied regions in HSPCs were subsequently bound by lineage-defining TFs such as PU.1 and GATA1, suggesting that heptad factors may prime regulatory elements for use in mature cell types. We also found that enhancers with cell-type-specific heptad occupancy shared a common grammar with respect to TF binding motifs, suggesting that combinatorial binding of specific TF complexes was at least partially regulated by features encoded in specific DNA sequence motifs. Taken together, this study provides a comprehensive characterisation of the gene regulatory landscape in rare subpopulations of human HSPCs. The accompanying datasets should serve as a valuable resource for understanding adult hematopoiesis and a framework for analysing aberrant regulatory networks in leukemic cells.


Sox7-positive endothelial progenitors establish coronary arteries and govern ventricular compaction

August 2023

·

77 Reads

EMBO Reports

The cardiac endothelium influences ventricular chamber development by coordinating trabeculation and compaction. However, the endothelial-specific molecular mechanisms mediating this coordination are not fully understood. Here, we identify the Sox7 transcription factor as a critical cue instructing cardiac endothelium identity during ventricular chamber development. Endothelial-specific loss of Sox7 function in mice results in cardiac ventricular defects similar to non-compaction cardiomyopathy, with a change in the proportions of trabecular and compact cardiomyocytes in the mutant hearts. This phenotype is paralleled by abnormal coronary artery formation. Loss of Sox7 function disrupts the transcriptional regulation of the Notch pathway and connexins 37 and 40, which govern coronary arterial specification. Upon Sox7 endothelial-specific deletion, single-nuclei transcriptomics analysis identifies the depletion of a subset of Sox9/Gpc3-positive endocardial progenitor cells and an increase in erythro-myeloid cell lineages. Fate mapping analysis reveals that a subset of Sox7-null endothelial cells transdifferentiate into hematopoietic but not cardiomyocyte lineages. Our findings determine that Sox7 maintains cardiac endothelial cell identity, which is crucial to the cellular cross-talk that drives ventricular compaction and coronary artery development.


Quantitative trait and transcriptome analysis of genetic complexity underpinning cardiac interatrial septation in mice using an advanced intercross line

June 2023

·

9 Reads

Unlike single-gene mutations leading to Mendelian conditions, common human diseases are likely to be emergent phenomena arising from multilayer, multiscale, and highly interconnected interactions. Atrial and ventricular septal defects are the most common forms of cardiac congenital anomalies in humans. Atrial septal defects (ASD) show an open communication between the left and right atria postnatally, potentially resulting in serious hemodynamic consequences if untreated. A milder form of atrial septal defect, patent foramen ovale (PFO), exists in about one-quarter of the human population, strongly associated with ischaemic stroke and migraine. The anatomic liabilities and genetic and molecular basis of atrial septal defects remain unclear. Here, we advance our previous analysis of atrial septal variation through quantitative trait locus (QTL) mapping of an advanced intercross line (AIL) established between the inbred QSi5 and 129T2/SvEms mouse strains, that show extremes of septal phenotypes. Analysis resolved 37 unique septal QTL with high overlap between QTL for distinct septal traits and PFO as a binary trait. Whole genome sequencing of parental strains and filtering identified predicted functional variants, including in known human congenital heart disease genes. Transcriptome analysis of developing septa revealed downregulation of networks involving ribosome, nucleosome, mitochondrial, and extracellular matrix biosynthesis in the 129T2/SvEms strain, potentially reflecting an essential role for growth and cellular maturation in septal development. Analysis of variant architecture across different gene features, including enhancers and promoters, provided evidence for the involvement of non-coding as well as protein-coding variants. Our study provides the first high-resolution picture of genetic complexity and network liability underlying common congenital heart disease, with relevance to human ASD and PFO.


Figure 5-figure supplement 1
Figure 7
Figure 8-figure supplement 1. STRING network of E12.5 differentially expressed genes [...]
Figure 10
Quantitative trait and transcriptome analysis of genetic complexity underpinning cardiac interatrial septation in mice using an advanced intercross line

June 2023

·

64 Reads

·

1 Citation

eLife

Unlike single-gene mutations leading to Mendelian conditions, common human diseases are likely to be emergent phenomena arising from multilayer, multiscale and highly interconnected interactions. Atrial and ventricular septal defects are the most common forms of cardiac congenital anomalies in humans. Atrial septal defects (ASD) show an open communication between left and right atria postnatally, potentially resulting in serious hemodynamic consequences if untreated. A milder form of atrial septal defect, patent foramen ovale (PFO), exists in about one quarter of the human population, strongly associated with ischaemic stroke and migraine. The anatomic liabilities and genetic and molecular basis of atrial septal defects remain unclear. Here, we advance our previous analysis of atrial septal variation through quantitative trait locus (QTL) mapping of an advanced intercross line (AIL) established between the inbred QSi5 and 129T2/SvEms mouse strains, that show extremes of septal phenotypes. Analysis resolved 37 unique septal QTL with high overlap between QTL for distinct septal traits and PFO as a binary trait. Whole genome sequencing of parental strains and filtering identified predicted functional variants, including in known human congenital heart disease genes. Transcriptome analysis of developing septa revealed downregulation of networks involving ribosome, nucleosome, mitochondrial and extracellular matrix biosynthesis in the 129T2/SvEms strain, potentially reflecting an essential role for growth and cellular maturation in septal development. Analysis of variant architecture across different gene features, including enhancers and promoters, provided evidence for involvement of non-coding as well as protein coding variants. Our study provides the first high resolution picture of genetic complexity and network liability underlying common congenital heart disease, with relevance to human ASD and PFO.


Quantitative trait and transcriptome analysis of genetic complexity underpinning cardiac interatrial septation in mice using an advanced intercross line

June 2023

·

8 Reads



Citations (12)


... Such models are theoretically capable of learning a sequence-to-function mapping that captures underlying biological principles [12][13][14][15][16][17], and can thereby guide the design of synthetic enhancers with targeted activity levels. DaSilva et al. and Lal et al. demonstrated synthetic cell type-specific enhancer design in silico 18,19. De Almeida et al. experimentally validated synthetic enhancer designs, initially using STARR-seq to prove enhancer activity in a single Drosophila developmental cell type 15, then targeting enhancers to four distinct tissue types in the Drosophila embryo and confirming specificity with in vivo assays 20. ...

Reference:

Iterative deep learning-design of human enhancers exploits condensed sequence grammar to achieve cell type-specificity
DNA-Diffusion: Leveraging Generative Models for Controlling Chromatin Accessibility and Gene Expression via Synthetic Regulatory Elements

... In summary, disruptions in 3D chromatin organization in cancer revealed a connection between these structural changes, epigenetic features, and gene expression, offering critical insights into the 3D chromatin structural features of cancer. In addition to previous findings, studies have shown that drug resistance or drug efficacy causes differences in 3D chromatin organization 123,124. Understanding how drugs affect chromatin organization can provide insight into the mechanisms underlying drug resistance and efficacy, potentially leading to the development of more effective treatments. ...

The potential of epigenetic therapy to target the 3D epigenome in endocrine-resistant breast cancer

Nature Structural & Molecular Biology

... Protein-protein interaction of heptad transcription factors (TFs) friend leukemia integration factor 1 (FLI-1), GATA binding protein (GATA)1, GATA2, runt-related transcription factor 1 (RUNX1), T-cell acute lymphocytic leukemia 1 (TAL1) in MEP cells controls lineage-specific erythroid and megakaryocytic differentiation (4). In the combinatorial binding of heptad factors in bulk human hematopoietic stem progenitor cells (HSPCs), individual progenitors and cell lines reveal cell-specific changes in the regulatory architecture of these transcription factors during lineage-specific differentiation (5)(6)(7). ...

Genome-wide Transcription Factor binding maps reveal cell-specific changes in the regulatory architecture of human HSPC
  • Citing Article
  • August 2023

Blood

... Following this, the signal-dependent trans-factors interact with cis-elements, these being more degenerate in enhancers than promoters. These support the protein-protein interactions and strengthen the cooperativity of their binding to DNA [123,[133][134][135][136]. Together with trans-factors, various coactivators such as the mediator complex, lysine acetyltransferases p300, and CREB-binding protein (CBP) are recruited to maintain the nucleosome-free chromatin state (Figure 1) [137][138][139][140][141]. As a result, the plant active enhancers pea PetE and maize b1 are enriched in H3/H4ac and H3K9/K14ac, respectively [127,142]; however, elevated H3K27ac only acts as a marker of active enhancers in animals, not in plants [143,144]. ...

Decoding enhancer complexity with machine learning and high-throughput discovery

Genome Biology

... Similarly, teleost enhancers without detectable evolutionary conservation can direct human gene expression and vice versa [119]. Hence, evolutionary distant animals share similar TFs, TFBSs, and developmental gene regulatory pathways, and enhancer-promoter connections [111,[120][121][122][123]. ...

Distal regulation, silencers, and a shared combinatorial syntax are hallmarks of animal embryogenesis

Genome Research

... Using clonal tracing methods, we and others previously found that functional heterogeneity 25 among unmutated hematopoietic clones is associated in vivo with defined molecular landscapes, both at the epigenetic 26 and transcriptomic level 27. In accordance, clonal heterogeneity is also emerging as an important cause of incomplete penetrance of leukemic mutations 28. In previous work, we demonstrated that clonal behaviors were largely scripted in the epigenome of the cell of origin 26. ...

Non-genetic determinants of malignant clonal fitness at single-cell resolution

Nature

... Furthermore, many of our observations have been supported by other studies. For example, increase in short-range interaction frequencies and predominant switching of chromatin into a more active state in endocrine-resistant ER+ breast cancer may reflect decompaction of constitutive heterochromatic regions, a known hallmark of carcinogenesis 50. Differences in A/B compartment switching observed in the carboplatin resistant state support previously reported interaction strength increase in the A compartment with the parallel decrease in the B compartment under Decitabine treatment of endocrine-resistant breast cancer cell lines 50. ...

Epigenetic therapy suppresses endocrine-resistant breast tumour growth by re-wiring ER-mediated 3D chromatin interactions

... Analysis of TE compositions of cell type-specific TEPs revealed that ID_B1 (Alu superfamily) was abundant in all cell types. The top-ranked TEs showed high consistency across cell clusters but were enriched to different motifs in different EHT stages (Fig. 7B; Additional file 11: Table S6), possibly related to the variation accumulated on different TE copies during evolution [79,80]. Surprisingly, TEs were found to participate in shaping most cis-regulatory networks closely related to the EHT process. ...

Cis-acting variation is common across regulatory layers but is often buffered during embryonic development

Genome Research

... However, it is also apparent that conserved GRNs and developmental processes can utilize functionally conserved CRMs, which retain functional and mechanistic homology over hundreds of millions of years of evolution [e.g. 3,[4][5][6][7][8][9][10][11]. In these cases, the DNA sequence of the CRMs has often diverged well past the point of possible linear alignment, but non-linear alignment methods or (in some cases) analysis of common transcription factor binding sites allow for their identification. ...

Deep conservation of the enhancer regulatory code in animals
  • Citing Article
  • November 2020

Science

... Extensive functional genomic studies in an early pre-metazoan Capsaspora owczarzaki have identified dynamic cis-regulatory landscapes [36]. Among the early metazoans, enhancers were also identified in the sponge Amphimedon queenslandica [37] that can activate genes in zebrafish and mouse tissues [38]. Early metazoans have complex cellular differentiation patterns indicative of well-structured gene regulatory mechanisms [39,40]. ...

Early origin and deep conservation of enhancers in animals