Sora Yoon

University of Pennsylvania | UP · Department of Genetics

PhD


42
Publications
6,103
Reads
622
Citations


Publications (42)
Article
Full-text available
The human genome functions as a three-dimensional chromatin polymer, driven by a complex collection of chromosome interactions [1–3]. Although the molecular rules governing these interactions are being quickly elucidated, relatively few proteins regulating this process have been identified. Here, to address this gap, we developed high-throughput DNA o...
Article
Multi-enhancer hubs are spatial clusters of enhancers present across numerous developmental programs. Here, we studied the functional relevance of these three-dimensional structures in T cell biology. Mathematical modeling identified a highly connected multi-enhancer hub at the Ets1 locus, comprising a noncoding regulatory element that was a hotspo...
Article
Full-text available
Integration of single-cell RNA sequencing data between different samples has been a major challenge for analyzing cell populations. However, strategies to integrate differential expression analysis of single-cell data remain underinvestigated. Here, we benchmark 46 workflows for differential expression analysis of single-cell data with multiple bat...
Preprint
Multi-enhancer hubs are spatial clusters of enhancers which have been recently characterized across numerous developmental programs. Yet, the functional relevance of these three-dimensional (3D) structures is poorly understood. Here we show that the multiplicity of enhancers interacting with the transcription factor Ets1 is essential to control the...
Article
Full-text available
Innate lymphoid cells (ILCs) are well-characterized immune cells that play key roles in host defense and tissue homeostasis. Yet, how the three-dimensional (3D) genome organization underlies the development and functions of ILCs is unknown. Herein, we carried out an integrative analysis of the 3D genome structure, chromatin accessibility and gene e...
Article
Full-text available
The high mobility group (HMG) transcription factor TCF-1 is essential for early T cell development. Although in vitro biochemical assays suggest that HMG proteins can serve as architectural elements in the assembly of higher-order nuclear organization, the contribution of TCF-1 to the control of three-dimensional (3D) genome structures during T cel...
Preprint
Full-text available
Integration of single-cell RNA sequencing (scRNA-seq) data between different samples has been a major challenge for analyzing cell populations. However, strategies to integrate differential expression (DE) analysis of scRNA-seq data remain underinvestigated. Here, we benchmarked 41 methods for integrative DE analysis of scRNA-seq data. Batch-effect...
Preprint
Full-text available
Although the molecular rules governing genome organization are being quickly elucidated, relatively few proteins regulating this process have been identified. To address this gap, we developed a fully automated imaging pipeline, called HiDRO (high-throughput DNA or RNA labeling with optimized Oligopaints), that permits quantitative measurement of c...
Article
Full-text available
Architectural stripes tend to form at genomic regions harboring genes with salient roles in cell identity and function. Therefore, the accurate identification and quantification of these features are essential for understanding lineage-specific gene regulation. Here, we present Stripenn, an algorithm rooted in computer vision to systematically dete...
Article
Dysregulation of extracellular matrix proteins in obese adipose tissue (AT) induces systemic insulin resistance. The metabolic roles of type VI collagen and its cleavage peptide endotrophin in obese AT are well established. However, the mechanisms regulating endotrophin generation remain elusive. Herein, we identified that several endotrophin-conta...
Article
Full-text available
Higher-order chromatin structure regulates gene expression, and mutations in proteins mediating genome folding underlie developmental disorders known as cohesinopathies. However, the relationship between three-dimensional genome organization and embryonic development remains unclear. Here we define a role for bromodomain-containing protein 4 (BRD4)...
Preprint
Full-text available
Architectural stripes tend to form at genomic regions harboring genes with salient roles in cell identity and function. Therefore, the accurate identification and quantification of these features is essential for the understanding of lineage-specific gene regulation. Here, we present Stripenn, an algorithm rooted in computer vision to systematicall...
Article
Full-text available
Meta-analyses increase statistical power by combining statistics from multiple studies. Meta-analysis methods have mostly been evaluated under the condition that all the data in each study have an association with the given phenotype. However, specific experimental conditions in each study or genetic heterogeneity can result in “unassociated statis...
Article
Full-text available
Bdellovibrio bacteriovorus 109J is a predatory bacterium that preys on other Gram-negative bacteria to obtain the nutrients it needs for replication and survival. Here, we evaluated the effects that two classes of bacterial signaling molecules (acyl homoserine lactones (AHLs) and diffusible signaling factor (DSF)) have on B. bacteriovorus...
Article
Full-text available
Benchmarking RNA-seq differential expression analysis methods using spike-in and simulated RNA-seq data has often yielded inconsistent results. The spike-in data, which were generated from the same bulk RNA sample, only represent technical variability, making the test results less reliable. We compared the performance of 12 differential expression...
Article
We present an R-Shiny package, netGO, for novel network-integrated pathway enrichment analysis. The conventional Fisher's exact test (FET) considers the extent of overlap between target genes and pathway gene-sets, while recent network-based analysis tools consider only network interactions between the two. netGO implements an intuitive framework t...
Article
Full-text available
Peroxisome proliferator-activated receptor γ (PPARγ) is a master regulator of adipose tissue biology. In obesity, phosphorylation of PPARγ at Ser273 (pSer273) by cyclin-dependent kinase 5 (CDK5)/extracellular signal-regulated kinase (ERK) orchestrates diabetic gene reprogramming via dysregulation of specific gene expression. Although many recent st...
Article
Full-text available
The Hsp90 family proteins Hsp90, Grp94, and TRAP1 are present in the cell cytoplasm, endoplasmic reticulum, and mitochondria, respectively; all play important roles in tumorigenesis by regulating protein homeostasis in response to stress. Thus, simultaneous inhibition of all Hsp90 paralogs is a reasonable strategy for cancer therapy. However, since...
Article
Androgen receptor (AR) signaling plays a central role in metabolic reprogramming for prostate cancer (PCa) growth and progression. Mitochondria are metabolic powerhouses of the cell and support several hallmarks of cancer. However, the molecular links between AR signaling and the mitochondria that support the metabolic demands of PCa cells are poor...
Article
Full-text available
Dysregulation of the adipo‐osteogenic differentiation balance of mesenchymal stem cells (MSCs), which are common progenitor cells of adipocytes and osteoblasts, has been associated with many pathophysiologic diseases, such as obesity, osteopenia, and osteoporosis. Growing evidence suggests that lipid metabolism is crucial for maintaining stem cell...
Article
Full-text available
TonEBP (tonicity-responsive enhancer binding protein) is a transcriptional regulator whose expression is elevated in response to various forms of stress including hyperglycemia, inflammation, and hypoxia. Here we investigated the role of TonEBP in acute kidney injury (AKI) using a line of TonEBP haplo-deficient mice subjected to bilateral renal isc...
Article
Full-text available
Background: Gene-set analysis (GSA) has been commonly used to identify significantly altered pathways or functions from omics data. However, GSA often yields a long list of gene-sets, necessitating efficient post-processing for improved interpretation. Existing methods cluster the gene-sets based on the extent of their overlap to summarize the GSA r...
Article
Full-text available
We present a novel approach to identify human microRNA (miRNA) regulatory modules (mRNA targets and relevant cell conditions) by biclustering a large collection of mRNA fold-change data for sequence-specific targets. Bicluster targets were assessed using validated messenger RNA (mRNA) targets and exhibited on average 17.0% (median 19.4%) improve...
Article
Full-text available
A protein that targets peroxisome proliferator-activated receptor γ (PPARγ) (a key regulator of fat cell formation) for degradation suppresses the formation of fat. Excess fat can lead to obesity and cause type 2 diabetes and cardiovascular disease, yet little is known about the mechanisms through which fat cells arise from progenitor cells. Jang H...
Article
Full-text available
Pathway-based analysis in genome-wide association studies (GWAS) is widely used to uncover novel multi-genic functional associations. Many of these pathway-based methods have been used to test the enrichment of the associated genes in the pathways, but they exhibited low power and were highly affected by free parameters. We present the novel method...
Article
Full-text available
Background: In differential expression analysis of RNA-sequencing (RNA-seq) read count data for two sample groups, it is known that highly expressed genes (or longer genes) are more likely to be differentially expressed, a phenomenon called read count bias (or gene length bias). This bias had a great effect on the downstream Gene Ontology over-representati...
Article
Full-text available
O-GlcNAcylated proteins are abundant in the brain and are associated with neuronal functions and neurodegenerative diseases. Although several studies have reported the effects of aberrant regulation of O-GlcNAcylation on brain function, the roles of O-GlcNAcylation in synaptic function remain unclear. To understand the effect of aberrant O-GlcNAcyl...
Article
Full-text available
Deregulated pathways identified from transcriptome data of two sample groups have played a key role in many genomic studies. Gene-set enrichment analysis (GSEA) has been commonly used for pathway or functional analysis of microarray data, and it is also being applied to RNA-seq data. However, most RNA-seq data so far have only a small number of replicates. Thi...
Data
Contains Supporting Figures A, B and C and Comparison of one-tailed and two-tailed absolute GSEA results. (PDF)
Article
Full-text available
In differential expression (DE) analysis of RNA-seq count data, it is known that genes with a larger read number are more likely to be differentially expressed. This bias has a profound effect on the subsequent Gene Ontology (GO) analysis by perturbing the ranks of gene-sets. Another known bias is that the commonly used parametric DE analysis metho...
Article
Full-text available
Emerging evidence suggests that aberrant O-GlcNAcylation is associated with tumorigenesis. Many oncogenic factors are O-GlcNAcylated, which modulates their functions. However, it remains unclear how O-GlcNAcylation and O-GlcNAc cycling enzymes, O-GlcNAc transferase (OGT) and O-GlcNAcase (OGA), affect the development of cancer in animal models. In t...
Article
Full-text available
Background: Genome-wide expression profiles reflect the transcriptional networks specific to the given cell context. However, most statistical models try to estimate the average connectivity of the networks from a collection of gene expression data, and are unable to characterize the context-specific transcriptional regulations. We propose an approa...
