Fig 5 - uploaded by Eran Tauber
Distribution of most ancient CNEs across phyla. Cladogram showing presence (green), absence (red), or mixed presence-absence (orange) of the deeply conserved RPLP1 and RPLP2 CNEs. Numbers in brackets show number of species analyzed per group for the RPLP1/RPLP2 CNEs respectively. Groups are outlined by color: blue (Protostomia), orange (Deuterostomia), green (Cnidaria, Ctenophora, and Placozoa), pink (Porifera) 


Source publication
Phylogenetic footprinting is a comparative method based on the principle that functional sequence elements will acquire fewer mutations over time than non-functional sequences. Successful comparisons of distantly related species will thus yield highly important sequence elements likely to serve fundamental biological roles. RNA regulatory elements...

Contexts in source publication

Context 1
... 6 Evolution of RPLP1 CNE over at least 670 million years. a Alignment of RPLP1 CNE in all organisms where detected. Sequence logo diagrams of each conserved motif are shown below the alignment. Motif 1a is protostome-specific whereas motif 1b appears to be the ancestral and deuterostome form. Motifs 2 and 3 are variably spaced and are present in all phyla examined. Species name color scheme matches that of Fig. 5. b Diagram showing the position and spacing of each motif in each organism in relation to the translation start site. Genus/species abbreviations are defined in Additional file 2: Table S10  ...
Context 2
... 7 Evolution of RPLP2 CNE over at least 700 million years. a Alignment of RPLP2 CNE in all organisms where detected. The three distinct sequence motifs are shown aligned below the main alignment. Species name color scheme matches that of Figure 5. b Diagram showing the position and spacing of each motif in each organism in relation to the translation start site. Genus/species abbreviations are defined in Additional file 2: Table S10  ...
Context 3
... deeply conserved sequences (≥ 350 Myr) also appear to be associated with a specific class of genes. 14 of these CNEs were found to lie completely within transcribed regions (see Methods), and all 20 were found to overlap transcribed regions by at least a third of the length of the CNE. This enrichment is significant (p < 6.5e-04, hypergeometric test) when compared to the full set of 322, of which only ~70 % overlap transcribed regions by this amount. Remarkably for such a small set of genes, a GO term overrepresentation test turned up 39 significant terms. The 17 genes associated with these 20 CNEs are enriched for genes active in processes such as post-transcriptional regulation of gene expression (q < 6.1e-4), regulation of translation (q < 4.8e-3), and translational elongation (q < 1.2e-2). This list of overrepresented GO terms is, unlike that obtained from the full set of 322 CNEs, devoid of terms relating to transcriptional regulation, matching the shift towards putative translational regulatory CNEs. Among the CNEs that we identified were previously studied regulatory elements, as well as many unidentified novel putative regulatory elements. As the majority of CNEs overlap 5′ UTRs, we calculated the likelihood of there being a conserved secondary structure in each CNE. This analysis revealed several conserved secondary structures, including an example of the well-characterized iron response element (IRE) in the 5′ UTR of the Ferritin gene (Additional file 1: Figure S4), a conserved hairpin loop bound by iron response proteins (IRPs) to help maintain iron homeostasis. We also identified novel conserved RNA structures, including a conserved, strong (−52.60 kcal/mol) hairpin loop found in the 5′ UTR of the Paramyosin gene (Additional file 1: Figure S5) identified in all four Hymenoptera species, and a hairpin loop with perfect stem complementarity but variable apical sequence conserved in the 5′ UTR of the Not1 gene (Fig. 4).
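The transcribed-region enrichment quoted in this excerpt can be approximated with a short hypergeometric calculation. The sketch below is an illustration only, not the authors' code: it assumes the "~70 %" figure corresponds to 225 of the 322 CNEs (the exact count is not given here), and the function name `hypergeom_upper_tail` is hypothetical.

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """Return P(X >= observed) for draws made without replacement."""
    total = comb(population, draws)
    upper = min(draws, successes)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


TOTAL_CNES = 322                         # full CNE set (from the text)
OVERLAPPING = round(0.70 * TOTAL_CNES)   # ASSUMPTION: "~70 %" read as 225 CNEs
DEEP_CNES = 20                           # deeply conserved CNEs (from the text)
DEEP_OVERLAPPING = 20                    # all 20 overlap transcribed regions

p_value = hypergeom_upper_tail(TOTAL_CNES, OVERLAPPING, DEEP_CNES, DEEP_OVERLAPPING)
print(f"p = {p_value:.2e}")
```

Because the true number of overlapping CNEs is only known approximately, the printed value should be read as an order-of-magnitude check against the reported bound rather than a reproduction of it.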
These three hairpins differ in their fundamental characteristics. IREs are characterized by a highly conserved apical sequence (CAGUGY; clearly demonstrated in the three hymenopteran species) with a more variable stem sequence [3]. In contrast, the 4-nucleotide apical sequence (consensus HVHN) of Not1 appears to be highly variable, whereas the stem sequence is almost perfectly conserved. More sequences are necessary to be able to reliably characterize the Paramyosin hairpin, although there does appear to be at least one variable nucleotide in the hairpin apex. The positions of the hairpins also appear to be of functional importance; all three hairpins are conserved in their position relative to the translation start site, particularly the Not1 hairpin (Fig. 4a). The Not1 hairpin loop has a stem sequence of 12 bp, and the CNE containing it is found directly adjacent to the translation start site. The CNE contains two conserved stem sequences with near-perfect complementarity, a weakly conserved apical sequence, and a highly conserved, upstream ATG-containing motif directly adjacent to the translation start site. In N. vitripennis, this CNE is present in the 5′ UTR of all four known transcripts. As the position of this CNE is so strongly conserved, we scanned the first 100 bp of every orthologous transcript in all Ensembl Metazoa species for presence of either the conserved hairpin or the conserved sequence adjacent to the translation start site. The results of this search (see Methods) indicated that in all cases where the hairpin loop is present the conserved sequence adjacent to the translation start site is present too, but not vice versa (i.e. the sequence adjacent to the translation start site may exist on its own). The presence of the sequence in the Antarctic krill Euphausia superba (Hunt and Rosato, unpublished data) and in a centipede (Strigamia maritima) shows that this CNE was an early arthropod adaptation.
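To make the hairpin descriptions above concrete, here is a minimal sketch of a scan for a perfectly complementary 12 bp stem enclosing a 4-nucleotide loop, plus patterns for the two apical consensus sequences mentioned (CAGUGY and HVHN). It is an assumption-laden illustration, not the search described in the paper's Methods: it requires exact Watson-Crick pairing (no G-U wobble, no mismatches), and all names (`find_hairpins`, `IRE_APEX`, `NOT1_LOOP`) are hypothetical.

```python
import re

# Watson-Crick complements only; G-U wobble pairs are deliberately ignored.
COMPLEMENT = str.maketrans("ACGU", "UGCA")

# IUPAC consensus patterns quoted in the text: Y = C/U, H = A/C/U, V = A/C/G.
IRE_APEX = re.compile(r"CAGUG[CU]")
NOT1_LOOP = re.compile(r"[ACU][ACG][ACU][ACGU]")


def reverse_complement(rna):
    """Return the reverse complement of an RNA string."""
    return rna.translate(COMPLEMENT)[::-1]


def find_hairpins(seq, stem_len=12, loop_len=4):
    """Return (start, stem5, loop, stem3) for every perfect-stem hairpin."""
    rna = seq.upper().replace("T", "U")
    span = 2 * stem_len + loop_len
    hits = []
    for start in range(len(rna) - span + 1):
        stem5 = rna[start:start + stem_len]
        loop = rna[start + stem_len:start + stem_len + loop_len]
        stem3 = rna[start + stem_len + loop_len:start + span]
        if stem3 == reverse_complement(stem5):
            hits.append((start, stem5, loop, stem3))
    return hits


# Synthetic example sequence (NOT taken from any real transcript).
stem = "GGCAUCGAUCGA"
example = "AA" + stem + "ACUA" + reverse_complement(stem) + "CC"
hairpins = find_hairpins(example)
```

A search restricted to the first 100 bp of a transcript, as described above, would simply pass `transcript[:100]` to the function; tolerating near-perfect stems would need a mismatch budget, which this sketch omits.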
We identified conserved putative regulatory sequences in six separate genes of the insect-specific Osiris gene cluster (Additional file 1: Figure S6). Our analysis indicates that these regions are Hymenoptera-specific, and are generally conserved in position relative to their associated gene. Since the conserved regions are associated with a specific class of genes with core functions, the fact that conserved promoter regions were identified near to six genes in the same cluster is perhaps indicative of an important developmental or regulatory role for this as-yet uncharacterized gene cluster. Two conserved sequences were identified in the 5′ UTRs of the two ribosomal stalk heterodimer genes, RPLP1 and RPLP2. Given that parts of these sequences were found to be perfectly conserved over several nucleotides, we looked for presence of the same sequences in more distant phyla. A motif elicitation analysis (see Methods) revealed three separate sequence motifs in the RPLP1 CNE, and three in the RPLP2 CNE. These motifs are present in many different phyla (Fig. 5), including both deuterostomes and protostomes, making these the third and fourth known examples of bilaterian conserved regulatory elements (Bicores) [13]. These two conserved sequences were both early innovations in Animalia. The RPLP1 CNE was found in the genomes of the placozoan Trichoplax adhaerens and the cnidarian Nematostella vectensis (starlet sea anemone). N. vectensis also contains the RPLP2 CNE, as do the ctenophore Mnemiopsis leidyi (warty comb jelly) and the poriferan Amphimedon queenslandica (a demosponge). Both CNEs were present in the majority of species that we analyzed (RPLP1: 33/38 species analyzed, RPLP2: 23/38). Previously, the most ancient CNEs identified were found conserved between Deuterostomia and Cnidaria [11], thus dating back over 670 million years [12]. The CNE on RPLP2 that we report here appears to have originated even earlier, being found in the Porifera.
This CNE is thus likely over 700 million years old [8]. The CNE on RPLP1 may also be older than 670 million years, depending on how the deep splits in the phylogeny of animals are eventually resolved [18]. The conserved regions paint an interesting evolutionary story. Firstly, in the RPLP1 CNE (Fig. 6), there are three distinct conserved sequences. The first conserved sequence appears to have two distinct forms: one found in protostomes (motif 1a) and another in deuterostomes (motif 1b), which appears to be the ancestral form as it is found in Cnidaria and Placozoa. The second and third motifs are found well conserved across both deuterostomes and protostomes, and are variably spaced; for example, all mammalian species analyzed share a similar insertion between these two motifs. The relative position of the CNE is found conserved across all phyla (Fig. 6b), remaining within 150 bp of the translation start site. The RPLP2 CNE (Fig. 7) also appears to be described best as three distinct motifs. Motif 1 is exceptionally well conserved, with no variation at all across 10 bp. Motif 2 comprises a conserved region, generally followed by a short stretch of adenine nucleotides. Motif 3 is short and does not appear to be present in either D. melanogaster or Mnemiopsis leidyi. These observations make clear that these CNEs are functionally complex, being comprised of several discrete elements punctuated by less evolutionarily constrained sequence. This is in contrast to other kinds of conservation such as ultraconserved regions, where long stretches of nucleotides (>200 bp) are found perfectly conserved between human, rat, and mouse [19], which can in some cases be deleted without a clear critical loss of function [20]. As a whole, the complexity, shared associated gene function, and age of these CNEs mark them as interesting targets for future study.
In this paper, we used a high stringency statistical approach to identify and characterize 322 ancient noncoding elements (Additional file 2: Table S4) which have remained conserved over large evolutionary distances. The bulk of the conserved sequences that we identified are specific to Hymenoptera, but nevertheless have been conserved in place for at least 180 million years of insect evolution (which occurs at a faster pace than vertebrate evolution [14]). A small proportion of the CNEs (20) that we identified were at least 350 million years old, with three stretching back further still. Two CNEs are found conserved in a range of the most basal animal clades and across a wide variety of both vertebrates and invertebrates, and are likely over 670 [12] and 700 million years old [8], likely the oldest CNEs described to date. These two ancient CNEs are located in the 5′ UTRs of two genes that are known to interact with one another: RPLP1 and RPLP2. The two protein products of these ubiquitously expressed genes, P1 and P2, form a heterodimer, two copies of which bind to the 60S acidic ribosomal protein P0 (coded by the gene RPLP0) to form the ribosomal stalk. The ribosomal stalk is involved in translational fine tuning and is crucial for the correct folding of many proteins [21]. The depth and breadth of conservation of these sequences are indicative of a fundamental regulatory role. Indeed, the 5′ UTR of RPLP2 has already been shown to have a regulatory role in Drosophila [22], being sufficient to confer full translational control on RPLP2 as a non-translated gene in the early embryo, but it was not previously known to be conserved among animals. The fact that this CNE has been previously studied and identified as a regulatory element helps to validate the idea that other CNEs that we have identified are also functional regulatory elements.
In Drosophila, the CNE we identified essentially spans the entirety of the RPLP2 5′ UTR, whereas in other organisms it is only a constituent part. In this study, we have characterized the motifs likely to be important for the function of this regulatory element and examined their evolution over time. Most of the conserved regions ...

Citations

... RPLP1 plays an important role in the elongation step of protein synthesis [11]. The C-terminal end of RPLP1 is nearly identical to the C-terminal end of the ribosomal phosphoproteins P0 and P2, which can interact with P0 and P2 to form a pentameric complex consisting of P1 and P2 dimers and a P0 monomer [12,13]. During central nervous system development, RPLP1 promotes embryonic fibroblast senescence-associated proliferation [14]. ...
Background: Cancer metastasis is the major reason for cancer-related deaths, and the mechanism of cancer metastasis is still unclear. RPLP1, a member of a group of proteins known as ribosomal proteins, is associated with tumorigenesis and primary cell immortalization and is involved in cellular transformation. However, the expression and potential function of RPLP1 in TNBC remain unclear. Methods: The expression of RPLP1 in TNBC tissues and cell lines was detected with real-time PCR and western blotting. 81 cases of TNBC tissue samples and adjacent non-tumor tissue samples were tested by immunochemistry to determine the correlation between RPLP1 expression and clinicopathological characteristics. In vitro, we determined the role and mechanistic pathways of RPLP1 in tumor metastasis in TNBC cell lines. Results: In this study, we detected high levels of RPLP1 expression in TNBC samples and cell lines. RPLP1 is upregulated in triple-negative breast cancer (TNBC) tissues and cells, and high expression levels correlate with an increased risk of recurrence and metastasis. Furthermore, high RPLP1 expression was associated with a poor prognosis and was an independent prognostic marker for TNBC. In RPLP1-induced cancer metastasis, RPLP1 may increase cancer cell invasion, which is likely the result of its effect on the cancer cell epithelial-mesenchymal transition. Conclusions: Altogether, our findings indicate that RPLP1 is a potential biomarker of poor prognosis and a candidate anti-metastasis therapeutic target in triple-negative breast cancer.
... Co-regulation of neighbouring genes is a pervasive pattern in genomes (Boutanaev et al., 2002; Spellman & Rubin, 2002; Michalak, 2008). Intriguingly, there are Hymenoptera-specific ultraconserved elements (UCEs) in the 5′ UTR of six Osiris genes (Osi7, 8, 9, 16, 20; Davies et al., 2015). These elements do not appear to have a conserved RNA secondary structure, but may play a role in translational regulation. ...
Much of the variation among insects is derived from the different ways that chitin has been molded to form rigid structures, both internal and external. In this study, we identify a highly conserved expression pattern in an insect-only gene family, the Osiris genes, that is essential for development, but also plays a significant role in phenotypic plasticity and in immunity/toxicity responses. The majority of Osiris genes exist in a highly syntenic cluster, and the cluster itself appears to have arisen very early in the evolution of insects. We used developmental gene expression in the fruit fly, Drosophila melanogaster, the bumble bee, Bombus terrestris, the harvester ant, Pogonomyrmex barbatus, and the wood ant, Formica exsecta, to compare patterns of Osiris gene expression both during development, and between alternate caste phenotypes in the polymorphic social insects. Developmental gene expression of Osiris genes is highly conserved across species, and correlated with gene location and evolutionary history. The social insect castes are highly divergent in pupal Osiris gene expression. Sets of co-expressed genes that include Osiris genes are enriched in gene ontology terms related to chitin/cuticle and peptidase activity. Osiris genes are essential for cuticle formation in both embryos and pupae, and genes co-expressed with Osiris genes affect wing development. Additionally, Osiris genes and those co-expressed seem to play a conserved role in insect toxicology defences and digestion. Given their role in development, plasticity, and protection we propose that the Osiris genes play a central role in insect adaptive evolution.
... We have furthermore identified 18 putative full coding sequences for regulatory and clock-controlled genes plus extensive fragments for three more such genes of interest and 21 preprohormone candidate contigs, the majority not previously reported. As an example of its usage outside the field of chronobiology, SuperbaSE has also been employed in the identification of an ancient conserved noncoding element in the 5′ region of the Not1 gene (Davies, Krusche, Tauber, & Ott, 2015). ...
Antarctic krill (Euphausia superba) is a crucial component of the Southern Ocean ecosystem, acting as the major link between primary production and higher trophic levels with an annual predator demand of up to 470 million tonnes. It also acts as an ecosystem engineer, affecting carbon sequestration and recycling iron and nitrogen, and has increasing importance as a commercial product in the aquaculture and health industries. Here we describe the creation of a de novo assembled head transcriptome for E. superba. As an example of its potential as a molecular resource, we relate its exploitation in identifying and characterizing numerous genes related to the circadian clock in E. superba, including the major components of the central feedback loop. We have made the transcriptome openly accessible for a wider audience of ecologists, molecular biologists, evolutionary geneticists, and others in a user-friendly format at SuperbaSE, hosted at www.krill.le.ac.uk.
Stearoyl-acyl carrier protein (ACP) desaturases (SADs) and fatty acid desaturases (FADs) are the main genes catalyzing the first and the second desaturation steps throughout the biosynthesis of oleic and linoleic acids, respectively, hence, the major determinants of oil quality and composition. To uncover the molecular mechanisms behind differential oil content and composition in olive fruit, the nucleotide and amino acid sequences of FAD2-2 and three SAD genes were characterized in 'Mari' and 'Shengeh' as two Iranian olive cultivars displaying high contrasting oil composition. In addition, the expression levels of these genes were screened at different time points during fruit development and ripening. Despite detection of a number of nucleotide sequence variations in four characterized genes, the results of amino acid analyses have shown most of them were synonymous and caused no differences at protein level. However, some of the single nucleotide variations caused non cognate amino acid substitution indicating possible conformational changes in the resultant peptide. In particular, regarding OeFAD2-2, a nonsynonymous SNP was detected in 'Shengeh' compared with other olive's FAD2-2 causing an amino acid lysine residue substitution for a glutamic acid, as well as a nonsynonymous substitution in 'Mari' (threonine to serine). The in silico prediction of three-dimensional structure of FAD2-2 revealed these substitutions may lead to structural changes in the final protein structure and function, hence, may contribute to the higher activity of FAD2-2 protein in mesocarp of 'Shengeh' than 'Mari' cultivars. According to results obtained from sequence analysis of oil biosynthesis genes, two Iranian olive cultivars were the most similar and had high sequence differences with international varieties. 
Results of temporal transcript abundance revealed that OeSAD2 had higher expression in 'Mari' (the high quality oil cultivar) than in 'Shengeh' during the critical period of oil biosynthesis. Our data corroborate previous records suggesting that OeSAD2 is the most pronounced isoform of stearoyl-acyl carrier protein desaturase responsible for oleic acid biosynthesis in olive mesocarp. In addition, our results suggest that OeSAD3 may have contributed to the higher oil quality of 'Mari' compared with 'Shengeh'. Based on our results, OeSAD2 and OeFAD2-2 are suggested as possible targets for engineering fatty acid composition, to help pyramid genes that improve the quality and quantity of olive oil and to create high quality olive cultivars.
Background: Nasonia vitripennis is an emerging insect model system with haplodiploid genetics. It holds a key position within the insect phylogeny for comparative, evolutionary and behavioral genetic studies. The draft genomes for N. vitripennis and two sibling species were published in 2010, yet a considerable amount of transcriptome data has since been produced, thereby enabling improvements to the original (OGS1.2) annotated gene set. We describe and apply the EvidentialGene method used to produce an updated gene set (OGS2). We also carry out comparative analyses showcasing the usefulness of the revised annotated gene set. Results: The revised annotation (OGS2) now consists of 24,388 genes with supporting evidence, compared to 18,850 for OGS1.2. Improvements include the nearly complete annotation of untranslated regions (UTR) for 97 % of the genes compared to 28 % of genes for OGS1.2. The fraction of RNA-Seq validated introns also grew from 85 to 98 % in this latest gene set. The EST and RNA-Seq expression data provide support for several non-protein coding loci and 7712 alternative transcripts for 4146 genes. Notably, we report 180 alternative transcripts for the gene lola. Nasonia now has among the most complete insect gene sets; only 27 conserved single copy orthologs in arthropods are missing from OGS2. Its genome also contains 2.1-fold more duplicated genes and 1.4-fold more single copy genes than the Drosophila melanogaster genome. The Nasonia gene count is larger than those of other sequenced hymenopteran species, owing both to improvements in the genome annotation and to unique genes in the wasp lineage. We identify 1008 genes and 171 gene families that deviate significantly from other hymenopterans in their rates of protein evolution and duplication history, respectively.
We also provide an analysis of alternative splicing that reveals that genes with no annotated isoforms are characterized by shorter transcripts, fewer introns, faster protein evolution and higher probabilities of duplication than genes having alternative transcripts. Conclusions: Genome-wide expression data greatly improves the annotation of the N. vitripennis genome, by increasing the gene count, reducing the number of missing genes and providing more comprehensive data on splicing and gene structure. The improved gene set identifies lineage-specific genomic features tied to Nasonia’s biology, as well as numerous novel genes. OGS2 and its associated search tools are available at http://arthropods.eugenes.org/EvidentialGene/nasonia/, www.hymenopteragenome.org/nasonia/ and waspAtlas: www.tinyURL.com/waspAtlas. The EvidentialGene pipeline is available at https://sourceforge.net/projects/evidentialgene/.