Fig 2 - available from: Genome Medicine
Stop-gained and eQTL variants from GTEx show allele-specific regulatory effects on expression. (a) Effect size of non-eQTL control, eQTL, and stop-gained variants after editing with the polyclonal allelic expression assay. Triangular points mark variants whose effect sizes significantly deviate from the control distribution. (b) Correlation between the effect sizes of variants in GTEx and effect sizes from the polyclonal allelic expression assay. (c) Correlation between the effect sizes of variants in HEK293T cells and in HeLa cells from the polyclonal allelic expression assay. (d) eQTL effect size (aFC) in GTEx tissues for the 13 edited eQTL variants shown as boxplots, with lines indicating the median effect size in GTEx fibroblasts and in the assay. Asterisks mark variants which were significant in the assay in HEK293T cells.

Source publication
Article
We present an assay to experimentally test the regulatory effects of genetic variants within transcripts using CRISPR/Cas9 followed by targeted sequencing. We applied the assay to 32 premature stop-gained variants across the genome and in two Mendelian disease genes, 33 putative causal variants of eQTLs, and 62 control variants in HEK293T cells, re...

Contexts in source publication

Context 1
... HDR rate varies greatly between loci but is very well correlated between replicates of the same variant (Spearman's rho = 0.96, p = 2.7 × 10⁻¹⁴, Additional file 4: Figure S1b), suggesting that the results of the assay are not strongly influenced by PCR amplification bias or variation in transfection efficiency. The HDR rate was not significantly correlated to NHEJ rate (Spearman's rho = 0.15, p = 0.27) as has been observed before [45], and neither HDR nor NHEJ rate was significantly correlated to two published predictors of gRNA efficiency [39,46] (Additional file 4: Figure S2). There was a minimal correlation between the gene expression level and the standard deviation of the effect size between replicates, which was reduced with the HDR filter of 0.4% (Additional file 4: Figure S1d). ...
Context 2
... the effects of genetic variants on gene expression levels using the polyclonal allelic expression assay, we first analyzed the rare stop-gained variants from GTEx that are expected to trigger NMD. As a group, the stop-gained variants show the expected negative effect sizes as compared to the control distribution (Wilcoxon p = 9.8 × 10⁻⁵, Fig. 2a). Four of these variants individually deviate significantly from the control distribution (Bonferroni-corrected z test p < 0.05, Fig. 2a and Additional file 3: Table S3). These results demonstrate our ability to capture NMD effects with the ...
Context 3
... stop-gained variants from GTEx that are expected to trigger NMD. As a group, the stop-gained variants show the expected negative effect sizes as compared to the control distribution (Wilcoxon p = 9.8 × 10⁻⁵, Fig. 2a). Four of these variants individually deviate significantly from the control distribution (Bonferroni-corrected z test p < 0.05, Fig. 2a and Additional file 3: Table S3). These results demonstrate our ability to capture NMD effects with the ...
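The per-variant test mentioned in the excerpt above (a Bonferroni-corrected z test against the control distribution) can be illustrated with a minimal sketch. The excerpt does not give the exact procedure, so everything here is an assumption: the control distribution is summarized by its sample mean and standard deviation, the test is two-sided, and the correction multiplies each p-value by the number of tested variants. The function name and the numbers are hypothetical and are not data from the study.

```python
from statistics import NormalDist, mean, stdev


def bonferroni_z_test(effects, controls, alpha=0.05):
    """Compare each effect size with the control distribution.

    Assumption: controls are treated as approximately normal, and each
    tested variant gets a two-sided z test with Bonferroni correction.
    Returns a list of (z, adjusted_p, is_significant) tuples.
    """
    mu = mean(controls)
    sd = stdev(controls)  # sample standard deviation of the controls
    std_normal = NormalDist()
    n_tests = len(effects)
    results = []
    for x in effects:
        z = (x - mu) / sd
        p = 2.0 * (1.0 - std_normal.cdf(abs(z)))  # two-sided p-value
        p_adj = min(1.0, p * n_tests)  # Bonferroni: multiply by test count
        results.append((z, p_adj, p_adj < alpha))
    return results


# Illustrative (made-up) effect sizes, not values from the paper.
controls = [-0.2, -0.1, 0.0, 0.1, 0.2]
effects = [-1.5, 0.05]
for z, p_adj, significant in bonferroni_z_test(effects, controls):
    print(f"z = {z:.2f}, adjusted p = {p_adj:.3g}, significant = {significant}")
```

With these toy inputs the strongly negative effect is flagged and the near-zero effect is not, which mirrors how individual variants are marked as deviating from the controls in Fig. 2a.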
Context 4
... they are associated and have a high posterior probability of causality based on CAVIAR fine mapping. After editing and QC filtering, 13 eQTL variants remained (Additional file 3: Table S3). The variance of the effect size of the eQTL variants was significantly higher than that of the control variants (0.49 versus 0.038; F test p = 1.7 × 10⁻⁵; Fig. 2a), which suggests that the edited eQTL variants as a whole have a greater regulatory effect than the edited control variants. Ten of the 13 variants have an effect in the same direction as the GTEx eQTL effect. Five of the eQTL variants are individually significantly different from the control distribution (Fig. 2a and Additional file ...
Context 5
... 0.038; F test p = 1.7 × 10⁻⁵; Fig. 2a), which suggests that the edited eQTL variants as a whole have a greater regulatory effect than the edited control variants. Ten of the 13 variants have an effect in the same direction as the GTEx eQTL effect. Five of the eQTL variants are individually significantly different from the control distribution (Fig. 2a and Additional file 3: Table S3), and all five of these variants have an effect in the same direction as in GTEx. Additionally, there is a significant correlation between the effect size of the edited stop-gained, non-eQTL control, and eQTL and their effects in GTEx (Spearman's rho = 0.60; p = 1.9 × 10⁻⁴, Fig. 2b), again indicating ...
Context 6
... from the control distribution (Fig. 2a and Additional file 3: Table S3), and all five of these variants have an effect in the same direction as in GTEx. Additionally, there is a significant correlation between the effect size of the edited stop-gained, non-eQTL control, and eQTL and their effects in GTEx (Spearman's rho = 0.60; p = 1.9 × 10⁻⁴, Fig. 2b), again indicating that the assay captures regulatory effects seen in the ...
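Spearman's rho, the statistic reported above for the agreement between assay and GTEx effect sizes, is the Pearson correlation of the ranks of the two variables. The sketch below computes it with the standard library only, using average ranks for ties. The helper names and the toy values are hypothetical, and the p-value calculation is left out.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Made-up effect sizes for illustration only.
assay_effects = [-1.2, -0.4, 0.1, 0.3, 0.9]
gtex_effects = [-0.8, -0.9, 0.0, 0.5, 0.4]
print(round(spearman_rho(assay_effects, gtex_effects), 2))
```

Because only ranks are used, the statistic is insensitive to the different scales on which the two sets of effect sizes are measured.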
Context 7
... 3: Table S3). Due to the small number of control variants in this replication set, we analyzed variant effect sizes rather than distinguishing significant variants in HeLa cells. The effect sizes measured in HEK293T cells and HeLa cells were largely consistent (rho = 0.3, p = 0.258), with all stop variants showing the same direction of effect (Fig. 2c) and small effects of the control variants in HeLa cells. As in HEK293T cells, the HeLa effect sizes showed a correlation with GTEx effect sizes (Spearman's rho = 0.42, p = 0.074; Additional file 4: Figure S5d), indicating that the assay is ...
Context 8
... variants is due to the variants not actually being the causal regulatory variants of their association signals. Additionally, our cell line may not perfectly recapitulate the genetic regulatory effects of GTEx fibroblast samples. To investigate this, we looked at the variation in the effect size between GTEx tissues for each of the eQTL variants (Fig. 2d), as well as the variation of the effect sizes between HEK293 and HeLa cells (Fig. 2c, Additional file 4: Figure S5d). We also looked at interindividual variation within fibroblast samples in GTEx, which may reflect more subtle cell type-specific genetic effects as well as the effects of other regulatory variants that the individuals ...
Context 9
... their association signals. Additionally, our cell line may not perfectly recapitulate the genetic regulatory effects of GTEx fibroblast samples. To investigate this, we looked at the variation in the effect size between GTEx tissues for each of the eQTL variants (Fig. 2d), as well as the variation of the effect sizes between HEK293 and HeLa cells (Fig. 2c, Additional file 4: Figure S5d). We also looked at interindividual variation within fibroblast samples in GTEx, which may reflect more subtle cell type-specific genetic effects as well as the effects of other regulatory variants that the individuals may have. We measured the effect size in eQTL heterozygotes based on the allelic ...
Context 10
... HEK293T aFC, median heterozygous aFC, and eQTL aFC. Some of the variations between the effect sizes observed in HEK293T and HeLa cells may be attributable not only to noise but also cell type-specific regulatory effects. Several of the other variants demonstrate a large range of effects in GTEx both across tissues and across individuals (Fig. 2d and Additional file 4: Figure S5c). The observed effect in the cell line, like an individual or tissue, is likely to fall somewhere in a spectrum of possible ...

Citations

... For the CRISPR assay, we selected 14 rare stop-gained variants that were good candidates, eight of which passed quality control through (1) filtering to rare stop-gained variants with expression and ASE watershed posterior >0.9, (2) filtering to multitissue outlier status in both, and (3) keeping four remaining candidates that lie in complex trait genes and the next 10 with the highest individual outlier signal and Watershed posterior. Variants were tested using the polyclonal editing assay described in (41). For the MPRA, we designed a set of synthetic DNA fragments by retrieving the genomic sequence corresponding to a 150-bp window centered at each variant of interest for the set of eOutlier-associated RVs and controls. For each variant, a reference and alternative sequence was designed that corresponded to each allele. ...
Article
Rare genetic variants are abundant across the human genome, and identifying their function and phenotypic impact is a major challenge. Measuring aberrant gene expression has aided in identifying functional, large-effect rare variants (RVs). Here, we expanded detection of genetically driven transcriptome abnormalities by analyzing gene expression, allele-specific expression, and alternative splicing from multitissue RNA-sequencing data, and demonstrate that each signal informs unique classes of RVs. We developed Watershed, a probabilistic model that integrates multiple genomic and transcriptomic signals to predict variant function, validated these predictions in additional cohorts and through experimental assays, and used them to assess RVs in the UK Biobank, the Million Veterans Program, and the Jackson Heart Study. Our results link thousands of RVs to diverse molecular effects and provide evidence to associate RVs affecting the transcriptome with human traits.
... (A recent study suggests that size of a tile negatively correlates with reproducibility of expression driven compared to that driven by ~120 bp tiles, emphasizing the importance of this consideration (93)). While AAV-transduced episomes gain histones (105) and chromosome-like nucleosome spacing (106), it is unknown whether gene-regulatory histone marks on these episomes mirror those of endogenous regulatory chromatin. For these reasons of both local sequence context and chromatin context, we suggest corroboration of MPRA findings in native genomic settings, by, for example, introducing the variant to the genome of a cell line using CRISPR methods. ...
Article
Neuropsychiatric phenotypes have long been known to be influenced by heritable risk factors, directly confirmed by the past decade of genetic studies which have revealed specific genetic variants enriched in disease cohorts. However, the initial hope that a small set of genes would be responsible for a given disorder proved false. The more complex reality is that a given disorder may be influenced by myriad small-effect noncoding variants, and/or by rare but severe coding variants, many de novo. Noncoding genomic sequences—for which molecular functions cannot usually be inferred—harbor a large portion of these variants, creating a substantial barrier to understanding higher-order molecular and biological systems of disease. Fortunately, novel genetic technologies—scalable oligonucleotide synthesis, RNA sequencing, and CRISPR—have opened novel avenues to experimentally identify biologically significant variants en masse. An especially versatile technique resulting from such innovations are Massively Parallel Reporter Assays (MPRAs), powerful molecular genetic tools that can be used to screen ≥thousands of untranscribed or untranslated sequences and their variants for functional effects in a single experiment. This approach, though underutilized in psychiatric genetics, has several useful features for the field. Here, we review methods for assaying putatively functional genetic variants and regions, emphasizing MPRAs and the opportunities they hold for dissection of psychiatric polygenicity. We discuss literature applying functional assays in neurogenetics, highlighting strengths, caveats, and design considerations—especially regarding disease-relevant variables (cell type, neurodevelopment, and sex), and ultimately propose applications of MPRA to both computational and experimental neurogenetics of polygenic disease risk.
... These have, for example, been used to discover co-transcribed gene networks involved in neuronal remodeling (79), and for in vivo assessment of a set of genes discovered to harbor de novo loss of function mutations in ASDs (80). CRISPR editing has been used in vitro to assess single transcript variant effects by comparing reference and allelic RNA and genomic DNA abundances in edited cultures (81). While this approach currently requires a separate culture of cells for each assayed variant, it still enables practicable validation/follow-up of MPRA candidates by the dozens using, e.g., 96-well plates. ...
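The comparison described in the excerpt above, between reference and edited-allele abundances in RNA and in genomic DNA, can be written as a log2 ratio of allelic ratios. The sketch below is only an illustration under stated assumptions: the excerpt does not give a formula, so the normalization of the cDNA allelic ratio by the gDNA allelic ratio, the function name, and the read counts are all hypothetical.

```python
from math import log2


def allelic_effect_size(cdna_alt, cdna_ref, gdna_alt, gdna_ref):
    """log2 of the edited/reference read ratio in cDNA over that in gDNA.

    Assumption: dividing by the gDNA ratio accounts for how many copies
    of the edited allele are present in the polyclonal culture, so a
    value below zero suggests the edited allele lowers expression.
    """
    counts = (cdna_alt, cdna_ref, gdna_alt, gdna_ref)
    if any(c <= 0 for c in counts):
        raise ValueError("all read counts must be positive")
    rna_ratio = cdna_alt / cdna_ref
    dna_ratio = gdna_alt / gdna_ref
    return log2(rna_ratio / dna_ratio)


# Made-up read counts: the edited allele is half as abundant in RNA
# as its share of the genomic DNA would predict.
print(allelic_effect_size(cdna_alt=50, cdna_ref=200, gdna_alt=100, gdna_ref=200))
```

A negative value, as in this toy case, is the direction expected for a stop-gained variant that triggers NMD.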
Preprint
Neuropsychiatric phenotypes have long been known to be influenced by heritable risk factors. The past decade of genetic studies has confirmed this directly, revealing specific common and rare genetic variants enriched in disease cohorts. However, the early hope for these studies, that only a small set of genes would be responsible for a given disorder, proved false. The picture that has emerged is far more complex: a given disorder may be influenced by myriad coding and noncoding variants of small effect size, and/or by rare but severe variants of large effect size, many de novo. Noncoding genomic sequences harbor a large portion of these variants, for which molecular functions cannot usually be inferred from sequence alone. This creates a substantial barrier to understanding the higher-order molecular and biological systems underlying disease risk. Fortunately, a proliferation of genetic technologies (namely, scalable oligonucleotide synthesis, high-throughput RNA sequencing, CRISPR, and CRISPR derivatives) has opened novel avenues to experimentally identifying biologically significant variants en masse. These advances have yielded an especially versatile technique adaptable to large-scale functional assays of variation in both genomic and transcribed noncoding regulatory features: Massively Parallel Reporter Assays (MPRAs). MPRAs are powerful molecular genetic tools that can be used to screen tens of thousands of predefined sequences for functional effects in a single experiment. This approach has several ideal features for psychiatric genetics, but remains underutilized in the field to date.
To emphasize the opportunities MPRA holds for dissecting psychiatric polygenicity, we review here its applications to date, discuss its ability to test several biological variables implicated in psychiatric disorders, illustrate this flexibility with a proof-of-principle, in vivo cell-type specific implementation of the assay, and envision future outcomes of applying MPRA to both computational and experimental neurogenetics.
Article
Understanding functional effects of genetic variants is one of the key challenges in human genetics, as much of disease-associated variation is located in non-coding regions with typically unknown putative gene regulatory effects. One of the most important approaches in this field has been molecular quantitative trait locus (molQTL) mapping, where genetic variation is associated with molecular traits that can be measured at scale, such as gene expression, splicing and chromatin accessibility. The maturity of the field and large-scale studies have produced a rich set of established methods for molQTL analysis, with novel technologies opening up new areas of discovery. In this Primer, we discuss the study design, input data and statistical methods for molQTL mapping and outline the properties of the resulting data as well as popular downstream applications. We review both the limitations and caveats of molQTL mapping as well as future potential approaches to tackle them. With technological development now providing many complementary methods for functional characterization of genetic variants, we anticipate that molQTLs will remain an important part of this toolkit as the only existing approach that can measure human variation in its native genomic, cellular and tissue context. Molecular quantitative trait locus (molQTL) mapping associates genetic variation with molecular traits that can be measured as gene expression, splicing and chromatin accessibility. In this Primer, Aguet et al. discuss the study design and implementation of molQTL mapping in various applications, with a focus on technical developments for functional characterization.
Article
Thousands of common genetic variants in the human population have been associated with disease risk and phenotypic variation by genome-wide association studies (GWAS). However, the majority of GWAS variants fall into noncoding regions of the genome, complicating our understanding of their regulatory functions, and few molecular mechanisms of GWAS variant effects have been clearly elucidated. Here, we set out to review genetic variant effects, focusing on expression quantitative trait loci (eQTLs), including their utility in interpreting GWAS variant mechanisms. We discuss the interrelated challenges and opportunities for eQTL analysis, covering determining causal variants, elucidating molecular mechanisms of action, and understanding context variability. Addressing these questions can enable better functional characterization of disease-associated loci and provide insights into fundamental biological questions of the noncoding genetic regulatory code and its control of gene expression. Expected final online publication date for the Annual Review of Biomedical Data Science, Volume 5 is August 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.