Figure 4 - available from: Scientific Reports
Distribution of effect size for significant eQTL (derived from GTEx). Y-axis represents the putative SNP effect size (slope) on gene expression. X-axis represents the SNPs grouped by the gene in which they are located. SNPs mapped to miRNA binding sites are represented by red or green dots for thyroid (a) and testis (b) tissues. Green dots represent those SNPs that are outliers for the respective distribution (z-score > 2 or z-score < −2).
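The outlier rule in the caption can be illustrated with a minimal sketch. It assumes the z-score is taken against the mean and population standard deviation of all slopes in one tissue; the caption does not state which standard deviation was used, and the function name `flag_outliers` and the example values are hypothetical.

```python
from statistics import mean, pstdev


def flag_outliers(slopes, threshold=2.0):
    """Return one boolean per slope: True where |z-score| > threshold.

    Assumption: z-scores use the population standard deviation of the
    whole list; the source caption does not specify this detail.
    """
    mu = mean(slopes)
    sigma = pstdev(slopes)
    if sigma == 0:
        # All slopes identical: no value can be an outlier.
        return [False] * len(slopes)
    return [abs((s - mu) / sigma) > threshold for s in slopes]


# Hypothetical effect sizes for one tissue; the last one is extreme.
example_slopes = [0.1, -0.1, 0.05, -0.05, 0.0, 0.1, -0.1, 0.05, -0.05, 2.0]
example_flags = flag_outliers(example_slopes)
```

In the figure's terms, slopes flagged `True` would correspond to the green dots and the remaining miRNA-binding-site SNPs to the red dots.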

Source publication
Article
Abstract Non-coding RNAs (ncRNA) have an essential role in the complex landscape of human genetic regulatory networks. One area that is poorly explored is the effect of genetic variations on the interaction between ncRNA and their targets. By integrating a significant amount of public data, the present study cataloged the vast landscape of the regu...

Contexts in source publication

Context 1
... seed matched any significant eQTL in the GTEx dataset. Thyroid and testis were the tissues that presented the highest number of eQTL in both miRNA-binding sites and lincRNAs (Supplementary Fig. S2). By analyzing both tissues, it is possible to notice that some eQTL mapped to miRNA-binding sites diverge from the mean distribution of effect size (Fig. 4). The effect size of a given eQTL is defined as the slope of the linear regression and is computed as the effect of the alternative allele relative to the reference allele (allele reported in the human genome reference sequence). This suggests that these eQTL associated with miRNA-binding sites may have a higher influence on gene ...
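The definition of effect size as a regression slope can be made concrete with a short sketch. It assumes genotypes are coded as the number of alternative alleles (0, 1 or 2) and fits a simple least-squares line of expression on that dosage. Real eQTL pipelines typically also normalise expression and adjust for covariates, which is omitted here; the function name `eqtl_slope` and the sample values are hypothetical.

```python
def eqtl_slope(alt_dosages, expression):
    """Least-squares slope of expression regressed on alternative-allele dosage.

    Assumption: dosage is the count of alternative alleles per sample
    (0 = homozygous reference, 1 = heterozygous, 2 = homozygous alternative).
    A positive slope means the alternative allele is associated with higher
    expression relative to the reference allele.
    """
    if len(alt_dosages) != len(expression):
        raise ValueError("dosages and expression must have the same length")
    n = len(alt_dosages)
    mean_x = sum(alt_dosages) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in alt_dosages)
    if sxx == 0:
        raise ValueError("all samples share one genotype; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(alt_dosages, expression))
    return sxy / sxx


# Hypothetical samples: expression rises by 0.5 per alternative allele.
example_slope = eqtl_slope([0, 1, 2, 0, 1, 2], [1.0, 1.5, 2.0, 1.0, 1.5, 2.0])
```

The sign of the slope depends on which allele is treated as the alternative, which is why the text ties the definition to the allele reported in the reference genome.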
Context 2
... suggests that these eQTL associated with miRNA-binding sites may have a higher influence on gene expression when compared to eQTL in general. A KEGG enrichment analysis was also performed on such eQTL and returned some common disease-related pathways such as cancer, diabetes, asthma and tuberculosis (Supplementary Fig. S3 for miRNA-binding sites and Supplementary Fig. S4 for lincRNAs). ...

Citations

... Additionally, reports suggest that genetic variants with extreme allele frequency differences may underlie some human health disparities across populations [72,73]. Disease-susceptibility variants are commonly located in cell-type-specific enhancers, and association networks are built on eQTLs associated with traits of clinical or pharmacological relevance [74]. The eQTLs located in the BRCA1 enhancer interact with SNPs that are directly associated with traits such as menopause, menarche, blood protein levels and body mass index. ...
Article
Genomic cis-regulatory elements support the gene transcriptional landscape and fine-tune spatiotemporal gene expression by interacting with different transcription factors and co-modulators during development. These regulatory elements are poorly conserved and highly heterogeneous, and their role in gene expression remains poorly understood. Here we use a well-known human tumor suppressor gene, Breast Cancer Type 1 (BRCA1), and the UCSC human genome browser database to report an in silico putative cis-regulatory enhancer element and its features. We report a 2 kb double elite enhancer, GH17J043079, located within intron 12 of the BRCA1 gene. The enhancer interacts with the NBR1, NBR2, TMEM106A, RPL27 and VAT1 gene promoters. GH17J043079 showed histone activity in human embryonic stem cells and cancerous cells, housed transcription factors specific to liver cells, and was enriched with Alu elements, indicating a potential for gene rearrangements. Additionally, it contained the eQTLs rs4793197, rs8176190, rs8176192, rs8176193 and rs8176194, whose allele frequencies differ across populations. Our in silico review of the features present within the GH17J043079 element in BRCA1 helps to postulate an intricate mode of transcriptional regulation. Such candidate-based analysis of features within a cis-regulatory element of a gene can help elucidate intricate genomic architecture, gene regulation and its impact on complex disorders.
... In general, comprehensive bioinformatics analysis of regulatory networks has pointed out potential interactions and new biological roles between small and long noncoding RNAs associated with complex diseases, such as gastric and colorectal cancers [97][98][99]. We understand that there are inherent limitations in predictive computational analysis; however, in this study, these limitations were minimized by only using experimentally tested and validated data, in order to avoid biased results. ...
Article
Circular RNAs (circRNAs) are a new class of long noncoding RNAs able to perform multiple functions, including sponging microRNAs (miRNAs) and RNA-Binding Proteins (RBPs). They play an important role in gastric carcinogenesis, but their involvement in gastric cancer (GC) development and progression is not well understood. We gathered miRNA and/or RBP sponge circRNAs present in GC, and assessed their biological roles through functional enrichment of their target genes or ligand RBPs. We identified 54 sponge circRNAs in GC that are able to sponge 51 miRNAs and 103 RBPs. Then, we evaluated their host gene expression using The Cancer Genome Atlas (TCGA) database and observed that COL1A2 is the most overexpressed gene, which may be due to circHIPK3/miR-29b-c/COL1A2 axis dysregulation. We identified 27 GC-related pathways that may be affected mainly by circPVT1, circHIPK3 and circNF1. Our results indicate that the circHIPK3/miR-107/BDNF/LIN28 axis may mediate chemoresistance in GC, and that circPVT1, circHIPK3, circNF1, ciRS-7 and circ_0000096 appear to be involved in gastrointestinal cancer development. Lastly, circHIPK3, circNRIP1 and circSMARCA5 were identified in different ethnic populations and may be ubiquitous modulators of gastric carcinogenesis. Overall, the studied sponge circRNAs are part of a complex RBP-circRNA-miRNA-mRNA interaction network, and are involved in the establishment, chemoresistance and progression of GC.
... They found that integration of eQTL data with GWAS data provided an overlap of information between the two that strengthened model performance. Furthermore, the cataloging of eQTLs mapped to non-coding RNA provides a better insight into how non-coding RNA affects gene expression (Branco et al., 2018), increasing the strength of regulatory information at the disposal of ML models. The growing integration of related biological features suggests this will provide clearer insight for models to be able to pinpoint the most likely disease-causing genes in a locus (Branco et al., 2018; Dai et al., 2019). ...
Article
Genome-wide association studies (GWAS) have revealed thousands of genetic loci that underpin the complex biology of many human traits. However, the strength of GWAS – the ability to detect genetic association by linkage disequilibrium (LD) – is also its limitation. Whilst the ever-increasing study size and improved design have augmented the power of GWAS to detect effects, differentiation of causal variants or genes from other highly correlated genes associated by LD remains the real challenge. This has severely hindered the biological insights and clinical translation of GWAS findings. Although thousands of disease susceptibility loci have been reported, causal genes at these loci remain elusive. Machine learning (ML) techniques offer an opportunity to dissect the heterogeneity of variant and gene signals in the post-GWAS analysis phase. ML models for GWAS prioritization vary greatly in their complexity, ranging from relatively simple logistic regression approaches to more complex ensemble models such as random forests and gradient boosting, as well as deep learning models, i.e., neural networks. Paired with functional validation, these methods show important promise for clinical translation, providing a strong evidence-based approach to direct post-GWAS research. However, as ML approaches continue to evolve to meet the challenge of causal gene identification, a critical assessment of the underlying methodologies and their applicability to the GWAS prioritization problem is needed. This review investigates the landscape of ML applications in three parts: selected models, input features, and output model performance, with a focus on prioritizations of complex disease associated loci. Overall, we explore the contributions ML has made towards reaching the GWAS end-game with consequent wide-ranging translational impact.
... In fact, previous studies have linked dysregulation of lncRNAs to ASD (Roberts et al., 2014; Ziats and Rennert, 2013). Furthermore, one study used an expression quantitative trait loci analysis to investigate the genetic variants of lincRNAs associated with clinical phenotypes, such as schizophrenia (Branco et al., 2018). These studies indicated that lincRNAs could play important roles in complex neural disorders, but to date the connection between lincRNAs and these disorders is not well understood. ...
Article
Human embryonic stem cells (hESCs) and induced pluripotent stem cells (iPSCs) can be differentiated into many different cell types of the central nervous system. One challenge when using pluripotent stem cells is to develop robust and efficient differentiation protocols that result in homogenous cultures of the desired cell type. Here, we have utilized the SMAD-inhibitors SB431542 and Noggin in a fully defined monolayer culture model to differentiate human pluripotent cells into homogenous forebrain neural progenitors. Temporal fate analysis revealed that this protocol results in forebrain-patterned neural progenitor cells that start to express early neuronal markers after two weeks of differentiation, allowing for the analysis of gene expression changes during neurogenesis. Using this system, we were able to identify many previously uncharacterized long intergenic non-coding RNAs that display dynamic expression during human forebrain neurogenesis.
... However, most eQTL analyses are focused on the associations between genotypes and protein-coding genes; only a few ncRNA-related eQTL (ncRNA-eQTL) analyses have been performed at the genome-wide level (16). In recent years, although some ncRNA-related SNP databases such as Linc-SNP 2.0 (17), MSDD (18) and lncRNASNP2 (19) have been developed to explore the relationship of ncRNAs and SNPs, no database has been developed to specifically and comprehensively quantify the association between SNP and ncRNA expression. ...
Article
Numerous studies indicate that non-coding RNAs (ncRNAs) have critical functions across biological processes, and single-nucleotide polymorphisms (SNPs) could contribute to diseases or traits through influencing ncRNA expression. However, the associations between SNPs and ncRNA expression are largely unknown. Therefore, genome-wide expression quantitative trait loci (eQTL) analysis to assess the effects of SNPs on ncRNA expression, especially in multiple cancer types, will help to understand how risk alleles contribute toward tumorigenesis and cancer development. Using genotype data and expression profiles of ncRNAs of >8700 samples from The Cancer Genome Atlas (TCGA), we developed a computational pipeline to systematically identify ncRNA-related eQTLs (ncRNA-eQTLs) across 33 cancer types. We identified a total of 6 133 278 and 721 122 eQTL-ncRNA pairs in cis-eQTL and trans-eQTL analyses, respectively. Further survival analyses identified 8312 eQTLs associated with patient survival times. Furthermore, we linked ncRNA-eQTLs to genome-wide association study (GWAS) data and found 262 332 ncRNA-eQTLs overlapping with known disease- and trait-associated loci. Finally, a user-friendly database, ncRNA-eQTL (http://ibi.hzau.edu.cn/ncRNA-eQTL), was developed for free searching, browsing and downloading of all ncRNA-eQTLs. We anticipate that such an integrative and comprehensive resource will improve our understanding of the mechanistic basis of human complex phenotypic variation, especially for ncRNA- and cancer-related studies.
... Another weakness is that the high degree of heterogeneity in disease severity or HCC-associated clinical manifestations among HCC patients, such as viral hepatitis and alcoholic and non-alcoholic steatohepatitis, may result in distinct findings regarding the link between H19 gene polymorphisms and liver tumorigenesis. In addition, although numerous non-coding variants within lncRNA genes were identified as expression quantitative trait loci [51], we failed to further prove that the upstream or intronic SNPs examined in this study affect the expression of H19 and its co-expressed genes due to a lack of expression data for H19 and its targets in our cohort. Furthermore, the genetic association detected in the present investigation might be limited to a single ethnic group unless replication experiments are performed. ...
Article
Hepatocellular carcinoma (HCC) is the most common type of primary liver cancer, whose diversified occurrence worldwide indicates a connection between genetic variations among individuals and the predisposition to such neoplasms. Mounting evidence has demonstrated that long non-coding RNA (lncRNA) H19 can have both promotive and inhibitory effects on cancer development, revealing a dual role in tumorigenesis. In this study, the link of H19 gene polymorphisms to hepatocarcinogenesis was assessed between 359 HCC patients and 1190 cancer-free subjects. We found that heterozygotes for the minor allele of H19 rs2839698 (T) and rs3741219 (G) were more inclined to develop HCC (OR, 1.291; 95% CI, 1.003–1.661; p = 0.047, and OR, 1.361; 95% CI, 1.054–1.758; p = 0.018, respectively), whereas homozygotes for the polymorphic allele of rs2107425 (TT) were correlated with a decreased risk of HCC (OR, 0.606; 95% CI, 0.410–0.895; p = 0.012). Moreover, patients who bear at least one variant allele (heterozygote or homozygote) of rs3024270 were less prone to develop late-stage tumors (for stage III/IV; OR, 0.566; 95% CI, 0.342–0.937; p = 0.027). In addition, carriers of a particular haplotype of three H19 SNPs tested were more susceptible to HCC. In conclusion, our results indicate an association between H19 gene polymorphisms and the incidence and progression of liver cancer.