Article

From "Junk" to Gene: Curriculum vitae of a Primate Receptor Isoform Gene


Abstract

Exonization of Alu retroposons awakens public opinion, particularly when causing genetic diseases. However, often neglected, alternative "Alu-exons" also carry the potential to greatly enhance genetic diversity by increasing the transcriptome of primates chiefly via alternative splicing. Here, we report a 5' exon generated from one of the two alternative transcripts in human tumor necrosis factor receptor gene type 2 (p75TNFR) that contains an ancient Alu-SINE, which provides an alternative N-terminal protein-coding domain. We follow the primate evolution over the past 63 million years to reconstruct the key events that gave rise to a novel receptor isoform. The Alu integration and start codon formation occurred between 58 and 40 million years ago (MYA) in the common ancestor of anthropoid primates. Yet a functional gene product could not be generated until a novel splice site and an open reading frame were introduced between 40 and 25 MYA on the catarrhine lineage (Old World monkeys including apes).


... Although it is clear that theoretically, like point mutations, retronuons are probably more often a disadvantage than an advantage to the affected individual and in most cases they have no effect at all, it is remarkable that 25% of analyzed promoter regions in the human genome contain retronuon-derived sequences (Jordan et al. 2003) and that the 5′ ends of a large proportion of mRNAs contain parts of retronuons, thus indicating a role of the respective retronuons in gene regulation (van de Lagemaat et al. 2003; Oei et al. 2004; see also Franchini et al. 2004). A further striking discovery is that up to 5% of human genes harbor sequences from Alu retronuons in their protein-coding regions that arose mainly via alternative splicing (Nekrutenko and Li 2001; Sorek et al. 2002; Lev-Maor et al. 2003; Kreahling and Graveley 2004; Singer et al. 2004), although it needs to be established what percentage of the alternatively spliced mRNAs encode functional protein variants. Despite this unexpected functional potential underlying retronuon insertions, the figures still likely represent an underestimate of the significance of retronuons for the exaptation of gene regulatory elements or novel protein sequence domains over longer evolutionary time frames. ...
... However, given enough time, even the slow but constant "microevolutionary" forces of nucleotide exchanges and small indels would have comparable effects (randomization of nonaptive sequences) on the vast majority of genomic sequences that are not under purifying selection. Conversely, punctuated retropositions can take several if not tens of millions of years to become exaptations, "awaiting" additional small changes that, for example, create a functional splice site or an open reading frame (Singer et al. 2004). Importantly, despite the large degree of genome disparity (e.g., the majority of nucleotide sequences have been exchanged between mammals; Fig. 1), significant differences in body plans did not emerge. ...
... Insertion of the Alu element occurred after Anthropoidea split from prosimians and a subsequent point mutation generated an ATG start codon. This base substitution alone, however, was not sufficient for exaptation of the Alu element as a protein-coding exon, as this sequence is nonaptive (not used as part of an alternative mRNA) in Platyrrhini. Only two additional small changes in the lineage leading to Catarrhini including apes, a C→T transition to generate a GT 5′ splice site and a 7-bp deletion to provide translation into the next exon in the correct reading frame, led to generation and exaptation of this alternative exon (Singer et al. 2004). ...
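The two changes described in the excerpt above (a C→T transition that creates a GT donor site, and a 7-bp deletion that restores the reading frame) can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: the toy sequence is invented and is not the p75TNFR sequence, the helper names are hypothetical, and the sketch assumes the downstream exon is read correctly only when the new exon ends on a complete codon.

```python
def apply_transition(seq: str, pos: int, old: str = "C", new: str = "T") -> str:
    """Replace one base, checking that the expected ancestral base is present."""
    if seq[pos] != old:
        raise ValueError(f"expected {old} at position {pos}, found {seq[pos]}")
    return seq[:pos] + new + seq[pos + 1:]


def delete_bases(seq: str, start: int, length: int) -> str:
    """Remove `length` bases beginning at index `start`."""
    return seq[:start] + seq[start + length:]


def leftover_bases(coding_length: int) -> int:
    """Bases left over after whole codons, counted from the first base of the ATG."""
    return coding_length % 3


# Invented toy sequence (NOT real genomic data):
# start codon + 6 coding bases + 7 bases to be deleted + 3 coding bases + "GC"
ANCESTRAL = "ATG" + "GCTGCT" + "AAAAAAA" + "GCC" + "GC"

# Step 1: C->T transition turns the trailing "GC" into a "GT" donor dinucleotide.
with_donor = apply_transition(ANCESTRAL, len(ANCESTRAL) - 1)

# Step 2: a 7-bp deletion shifts the frame by 7 % 3 = 1 base.
derived = delete_bases(with_donor, 9, 7)

# The coding part is everything before the two-base donor site.
print("ancestral leftover bases:", leftover_bases(len(ANCESTRAL) - 2))
print("derived leftover bases:  ", leftover_bases(len(derived) - 2))
```

In this toy case the exon goes from one leftover base to none, which is the sense in which a deletion whose length is not a multiple of three can bring a new first exon into frame with the exon that follows it.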
Article
Full-text available
The application of molecular genetics, in particular comparative genomics, to the field of evolutionary biology is paving the way to an enhanced “New Synthesis.” Apart from their power to establish and refine phylogenies, understanding such genomic processes as the dynamics of change in genomes, even in hypothetical RNA-based genomes and the in vitro evolution of RNA molecules, helps to clarify evolutionary principles that are otherwise hidden among the nested hierarchies of evolutionary units. To this end, I outline the course of hereditary material and examine several issues including disparity, causation, or bookkeeping of genes, adaptation, and exaptation, as well as evolutionary contingency at the genomic level—issues at the heart of some of Stephen Jay Gould's intellectual battlegrounds. Interestingly, where relevant, the genomic perspective is consistent with Gould's agenda. Extensive documentation makes it particularly clear that exaptation plays a role in evolutionary processes that is at least as significant as—and perhaps more significant than—that played by adaptation.
... Therefore, the insertion of Alu elements into intronic regions may introduce new exons into existing, functioning genes. The evolutionary history of several such "exonization" events has been characterized in detail [8,9]. For example, in p75TNFR, the insertion of an Alu element and a series of subsequent nucleotide substitutions created a new alternative first exon [8]. ...
... The evolutionary history of several such "exonization" events has been characterized in detail [8,9]. For example, in p75TNFR, the insertion of an Alu element and a series of subsequent nucleotide substitutions created a new alternative first exon [8]. Sorek and colleagues investigated the splicing pattern of 61 Alu-containing exons using human mRNA and EST sequences [4]. ...
... We did not observe evidence of positive selection on this ADAR2 exon using these metrics (see Figure S1A). Similarly, SNP-based tests did not indicate evidence of positive selection for the alternative first exon of p75TNFR (see Figure S1B), the result of another well-known functional exonization event [8]. These data show the limitation of using sequence-based approaches to identify functional Alu exonization events. ...
Article
Full-text available
Author Summary New exons have been created and added to existing functional genes during eukaryotic genome evolution. Alu elements, a class of primate-specific retrotransposons, are a major source of new exons in primates. However, recent analyses of expressed sequence tags suggest that the vast majority of Alu-derived exons are low-abundance splice forms and represent non-functional evolutionary intermediates. In order to elucidate the evolutionary impact of Alu-derived exons, we investigated the splicing of 330 Alu-derived exons in 11 human tissues using data from high-density exon arrays with multiple oligonucleotide probes for every exon in the human genome. Our exon array analysis and further RT-PCR experiments reveal surprisingly diverse splicing patterns of these exons. Some Alu-derived exons are constitutively spliced, and some are strongly tissue-specific. In SEPN1, a gene implicated in a form of congenital muscular dystrophy, our data suggest that the muscle-specific inclusion of an Alu-derived exon results from a human-specific splicing change after the divergence of humans and chimpanzees. Our study provides novel insight into the evolutionary significance of Alu exonization events. A subset of Alu-derived exons, especially those derived from more ancient Alu elements in the genome, may have contributed to functional novelties during primate evolution.
... It has been known for some time that the recruitment of Alu elements (a retroposed primate-specific short interspersed element, SINE) as exons (Li et al., 2007; Nekrutenko et al., 2002; Post et al., 1990; Schmitz and Brosius, 2011; Singer et al., 2004) was not always advantageous for the carrier (Kim et al., 2008; Larsen et al., 2018; Makałowski et al., 1994; Wallace et al., 1991). The studies of Gil Ast's laboratory gave us mechanistic details on the generation of novel splice sites within Alu elements and on alternative splicing involving them (Lev-Maor et al., 2003; Ram et al., 2008; Sorek et al., 2002; Sorek et al., 2004; Eisenberg, 2016; Lin et al., 2016). ...
... These observations are in agreement with an analogous situation of the aforementioned exaptations of novel exons from TEs. In the case of the primate-specific Alu elements, it was shown that most exonizations required additional modifications, e.g., in order to generate splice sites, open reading frames, etc. (Krull et al., 2005;Singer et al., 2004). Also, the genomic environment of integration is an important determinant, whether a given TE will function as (potential) regulatory element (Brosius, 2009). ...
Article
Full-text available
The realization that body parts of animals and plants can be recruited or coopted for novel functions dates back to, or even predates the observations of Darwin. S.J. Gould and E.S. Vrba recognized a mode of evolution of characters that differs from adaptation. The umbrella term aptation was supplemented with the concept of exaptation. Unlike adaptations, which are restricted to features built by selection for their current role, exaptations are features that currently enhance fitness, even though their present role was not a result of natural selection. Exaptations can also arise from nonaptations; these are characters which had previously been evolving neutrally. All nonaptations are potential exaptations. The concept of exaptation was expanded to the molecular genetic level which aided greatly in understanding the enormous potential of neutrally evolving repetitive DNA— including transposed elements, formerly considered junk DNA—for the evolution of genes and genomes. The distinction between adaptations and exaptations is outlined in this review and examples are given. Also elaborated on is the fact that such distinctions are sometimes more difficult to determine; this is a widespread phenomenon in biology, where continua abound and clear borders between states and definitions are rare.
... It appears that in this species, the part of the genomic CRHR2 sequence containing the γ1-like seed sequence was lost entirely ("H" in Figure 2). This case study of the evolution of the CRHR2γ splice variant in primates confirms some of the molecular mechanisms leading to exonization of intronic sequences found in previous studies: (a) An Alu element was involved in creating a primate-specific exon and was inserted in the antisense orientation (Sorek, 2007; Sorek, Ast, & Graur, 2002); (b) following insertion of the Alu element, only a few mutations were needed for exonization, namely to create a 5′ donor splice site and an in-frame start codon, millions of years after the initial Alu retrotransposition (Krull, Brosius, & Schmitz, 2005; Singer, Männel, Hehlgans, Brosius, & Schmitz, 2004). However, importantly, in contrast to the case of the human tumor necrosis factor receptor gene type 2 reported (Singer et al., 2004) and various other genes (Nekrutenko & Li, 2001), the Alu element in the γ1 exon is part of the 5′UTR but not of the protein-coding sequence and does not contain the 5′ splice site; instead, a flanking intronic sequence was exapted as a new, alternative protein-coding sequence and the 5′ splice site formed at the 3′ end of this intronic sequence. ...
... This case study of the evolution of the CRHR2γ splice variant in primates confirms some of the molecular mechanisms leading to exonization of intronic sequences found in previous studies: (a) An Alu element was involved in creating a primate-specific exon and was inserted in the antisense orientation (Sorek, 2007; Sorek, Ast, & Graur, 2002); (b) following insertion of the Alu element, only a few mutations were needed for exonization, namely to create a 5′ donor splice site and an in-frame start codon, millions of years after the initial Alu retrotransposition (Krull, Brosius, & Schmitz, 2005; Singer, Männel, Hehlgans, Brosius, & Schmitz, 2004). However, importantly, in contrast to the case of the human tumor necrosis factor receptor gene type 2 reported (Singer et al., 2004) and various other genes (Nekrutenko & Li, 2001), the Alu element in the γ1 exon is part of the 5′UTR but not of the protein-coding sequence and does not contain the 5′ splice site; instead, a flanking intronic sequence was exapted as a new, alternative protein-coding sequence and the 5′ splice site formed at the 3′ end of this intronic sequence. It remains remarkable that an intronic sequence is exapted as coding sequence, as studies with chimeric CRHRs have shown that the N-terminal extracellular domain plays a crucial role in ligand binding and receptor activation (Liaw, Grigoriadis, Lovenberg, Souza, & Maki, 1997). ...
Article
Many G protein‐coupled receptors have splice variants, with potentially different pharmaceutical properties, expression patterns and roles. The human brain expresses three functional splice variants of the type 2 corticotropin‐releasing hormone receptor: CRHR2α, −β and −γ. CRHR2γ has only been reported in humans, but its phylogenetic distribution, and how and when during mammalian evolution it arose, is unknown. Based on genomic sequence analyses, we predict that a functional CRHR2γ is present in all Old World monkeys and apes, and is unique to these species. CRHR2γ arose by exaptation of an intronic sequence―already present in the common ancestor of primates and rodents―after retrotransposition of a short interspersed nuclear element (SINE) and mutations that created a 5’ donor splice site and in‐frame start codon, 32–43 million years ago. The SINE is not part of the coding sequence, only of the 5’ untranslated region and may therefore play a role in translational regulation. Putative regulatory elements and an alternative transcriptional start site were added earlier to this genomic locus by a DNA transposon. The evolutionary history of CRHR2γ confirms some of the earlier reported principles behind the “birth” of alternative exons. The functional significance of CRHR2γ, particularly in the brain, remains to be demonstrated.
... The primate-specific Alu elements are the second most prevalent class with almost 1.2 million copies, and thousands of them can be spliced to create Alu-exons (Lev-Maor et al., 2003; Sorek et al., 2004; Zarnack et al., 2013). Only a few Alu-exons encode novel protein isoforms (Lin et al., 2016), and for several, the evolutionary history of their exonisation has been described (Krull et al., 2005; Moller-Krull et al., 2008; Singer et al., 2004). Alu elements are particularly prone to exonisation because as few as three single nucleotide mutations from the Alu consensus sequence are sufficient to create cryptic 5' and 3' splice sites (Lev-Maor et al., 2003; Sorek et al., 2004), and the left-arm Alu sequence contains a CUAUU sequence that can serve as branchpoint (Mercer et al., 2015). ...
... Yet, exonising Alu elements have longer U-tracts than silent Alu elements (median length of 11 nt compared to 7 nt, Figure 4A and Zarnack et al., 2013), possibly suggesting a selection pressure for efficient hnRNPC repression. Previous phylogenetic maps across primate species showed that mutations in the splice site sequences usually occur significantly after integration of the Alu element (Krull et al., 2005; Singer et al., 2004), which would allow time for the U-tracts of the Alu-exon to decay through genomic drift. For example, the exonising Alu element in the REL gene (in the past called c-rel-2) obtained a 5' splice site only after the split of Old and New World monkeys (in the Catarrhini lineage), even though the Alu element itself is present in both groups. ...
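As a rough illustration of the comparison in the excerpt above, the following Python sketch measures U-tract length and takes a median over a set of sequences. It rests on assumptions that the excerpt does not confirm: "U-tract length" is taken here to be the longest uninterrupted run of U in a sequence, DNA input is converted by reading T as U, and the function names are hypothetical. The cited study may define and measure U-tracts differently.

```python
import re
from statistics import median


def longest_u_tract(seq: str) -> int:
    """Length of the longest uninterrupted run of U (T in DNA input is read as U)."""
    rna = seq.upper().replace("T", "U")
    return max((len(m.group()) for m in re.finditer(r"U+", rna)), default=0)


def median_u_tract(seqs) -> float:
    """Median of the longest U-tract across a non-empty collection of sequences."""
    return median(longest_u_tract(s) for s in seqs)


if __name__ == "__main__":
    # Invented example sequences, not taken from any dataset.
    examples = ["ACUUUUGAUU", "acttttttg", "ACG"]
    for s in examples:
        print(s, longest_u_tract(s))
    print("median:", median_u_tract(examples))
```

With per-element values like these, the two groups (exonising and silent elements) could be compared by computing one median for each group.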
Article
Full-text available
Alu elements are retrotransposons that frequently form new exons during primate evolution. Here, we assess the interplay of splicing repression by hnRNPC and nonsense-mediated mRNA decay (NMD) in the quality control and evolution of new Alu-exons. We identify 3100 new Alu-exons and show that NMD more efficiently recognises transcripts with Alu-exons compared to other exons with premature termination codons. However, some Alu-exons escape NMD, especially when an adjacent intron is retained, highlighting the importance of concerted repression by splicing and NMD. We show that evolutionary progression of 3' splice sites is coupled with longer repressive uridine tracts. Once the 3' splice site at ancient Alu-exons reaches a stable phase, splicing repression by hnRNPC decreases, but the exons generally remain sensitive to NMD. We conclude that repressive motifs are strongest next to cryptic exons and that gradual weakening of these motifs contributes to the evolutionary emergence of new alternative exons. DOI: http://dx.doi.org/10.7554/eLife.19545.001
... Many examples of Alu exonisation have been reported (Singer et al. 2004;Krull et al. 2005;Schmitz and Brosius 2011). In the human tumour necrosis factor receptor gene type 2 (p75TNFR), an alternative first codon is contributed by an insertion of AluJ, which provides a novel N-terminal protein-coding domain (Singer et al. 2004). ...
... Many examples of Alu exonisation have been reported (Singer et al. 2004;Krull et al. 2005;Schmitz and Brosius 2011). In the human tumour necrosis factor receptor gene type 2 (p75TNFR), an alternative first codon is contributed by an insertion of AluJ, which provides a novel N-terminal protein-coding domain (Singer et al. 2004). Alu integration and start codon formation occurred about 50 MYA in the common ancestor of anthropoid primates. ...
Article
Full-text available
Since their discovery, a growing body of evidence has emerged demonstrating that transposable elements are important drivers of species diversity. These mobile elements exhibit a great variety in structure, size and mechanisms of transposition, making them important putative actors in organism evolution. The vertebrates represent a highly diverse and successful lineage that has adapted to a wide range of different environments. These animals also possess a rich repertoire of transposable elements, with highly diverse content between lineages and even between species. Here, we review how transposable elements are driving genomic diversity and lineage-specific innovation within vertebrates. We discuss the large differences in TE content between different vertebrate groups and then go on to look at how they affect organisms at a variety of levels: from the structure of chromosomes to their involvement in the regulation of gene expression, as well as in the formation and evolution of non-coding RNAs and protein-coding genes. In the process of doing this, we highlight how transposable elements have been involved in the evolution of some of the key innovations observed within the vertebrate lineage, driving the group’s diversity and success.
... This fact may be responsible for their lesser chance of exonization mediated by alternative splicing (Sorek et al., 2002); ii) exonization of TEs in a protein-coding region mediated by alternative splicing including UBE2D3 (NM_181893) of Mammalian Short Interspersed Repeat Element (SINE/MIR), UBE2L3 (NM_198157) of Long INterspersed Element (LINE/L2) and UBE2V1 (NM_021988 and NM_022442) of SINE/MIR. Exonization of TEs has been primarily studied by analysis of primate SINE/Alu (Krull et al., 2005; Singer et al., 2004), but to date little evidence has been collected for SINE/MIR and LINE/L2 (Zemojtel et al., 2007). In this study, we have assessed their exonization in eight primates (human, chimpanzee, rhesus, gorilla, orangutan, marmoset, tarsier and lemur), and retraced the different stages of their separate exonization by monitoring genomic events over 63 Myr of primate evolution and comparing orthologous loci in these primates. ...
... Genomic sequence analysis showed that human UBE2D3 transcript variant 9 (NM_181893) (149AA) is an alternatively spliced transcript created by exonization of SINE/MIR at the 5′-end (Fig. 2A). Located proximal to the major transcript, the novel MIR-derived exon is required to provide a functional ATG start codon and a 5′ splice site (SS) linking it to the others via exon 2 as Singer et al. (2004) have described. To trace back the generation of the first exon of this transcript, we followed the process of the exonized TEs by) and exonized SINE/MIR in the primate UBE2D3. ...
Article
The origin of eukaryotic ubiquitin-conjugating enzymes (E2s) can be traced back to the Guillardia theta nucleomorph about 2500 million years ago (Mya). E2s are largely vertically inherited over eukaryotic evolution [Lespinet, O., Wolf, Y.I., Koonin, E.V., Aravind, L., 2002. The role of lineage-specific gene family expansion in the evolution of eukaryotes. Genome Res. 1048-1059], while mammalian E2s experienced evolution of multigene families by gene duplications, which have been accompanied by an increase in species complexity. Because of alternative splicing, primate-specific expansions of E2s happened once again at a transcriptional level. Both resulted in increasing genomic complexity and diversity of primate E2 proteomic function. The evolutionary processes of human E2 gene structure during expansions were accompanied by exon duplication and exonization of intronic sequences. Exonizations of Transposable Elements (TEs) in UBE2D3, UBE2L3 and UBE2V1 genes from primates indicate that exaptation of TEs also plays important roles in the structural innovation of primate-specific E2s and may create alternative splicing isoforms at a transcriptional level. Estimates for the ratio of dN/dS suggest that strong purifying selection has acted upon protein-coding sequences of their orthologous UBE2D2, UBE2A, UBE2N, UBE2I and Rbx1 genes from animals, plants and fungi. The similar rates of synonymous substitutions are in accordance with the neutral mutation-random drift hypothesis of molecular evolution. Systematic detection of the origin and evolution of E2s, analyzing the evolution of E2 multigene families by gene duplications and the evolutionary processes of E2s during expansions, and testing its evolutionary force using E2s from distant phylogenetic lineages may help distinguish ancestral E2s from newly created E2s, and reveal previously unknown relationships between E2s and metazoan complexity.
Analysis of these conserved proteins provides strong support for a close relationship between social amoeba and eukaryote, choanoflagellate and metazoan, and for the central roles of social amoeba and choanoflagellate in the origin and evolution of eukaryote and metazoan. Retracing the different stages of primate E2 exonization by monitoring genomic events over 63 Myr of primate evolution will advance our understanding of how TEs dynamically modified the primate transcriptome and proteome in the past, and continue to do so.
... -98 ) that constitute 11% of the human genome 55 . RNA binding proteins normally repress the expression of newly-incorporated Alu elements, but decreased repression over long evolutionary periods provides a substrate for new functions 56,57 . The risk-increasing effect of the lowly-used PTPN2 splice junction could therefore represent a harmful evolutionary by-product that attenuates the anti-inflammatory effect of PTPN2. ...
Preprint
Full-text available
The majority of immune-mediated disease (IMD) risk loci are located in non-coding regions of the genome, making it difficult to decipher their functional effects. To assess the extent to which alternative splicing contributes to IMD risk, we mapped genetic variants associated with alternative splicing (splicing quantitative trait loci or sQTL) in macrophages exposed to 24 cellular conditions. We found that genes involved in innate immune response pathways undergo extensive differential splicing in response to stimulation and detected significant sQTL effects for over 5,734 genes across all conditions. We colocalised sQTL signals for over 700 genes with IMD-associated risk loci from 21 IMDs with high confidence (PP4 ≥ 0.75). Approximately half of the colocalisations implicate lowly-used splice junctions (mean usage ratio < 0.1). Finally, we demonstrate how an inflammatory bowel disease (IBD) risk allele increases the usage of a lowly-used isoform of PTPN2, a negative regulator of inflammation. Together, our findings highlight the role alternative splicing plays in IMD risk, and suggest that lowly-used splicing events significantly contribute to complex disease risk.
... In a process called exonization, TEs can integrate into genomic regions and offer recognition by the splicing machinery as a newly recruited exon [100]. Approximately 4% of human genes contain TE motifs in their coding regions, indicating that exons may have been derived from the exonization of TEs [101][102][103][104][105][106]. Some studies have identified that exonized LINEs in the human genome provide an additional domain and produce abnormal transcripts through diverse alternative splicing mechanisms in cancers. ...
Article
Full-text available
Alternative splicing of messenger RNA (mRNA) precursors contributes to genetic diversity by generating structurally and functionally distinct transcripts. In a disease state, alternative splicing promotes incidence and development of several cancer types through regulation of cancer-related biological processes. Transposable elements (TEs), having the genetic ability to jump to other regions of the genome, can bring about alternative splicing events in cancer. TEs can integrate into the genome, mostly in the intronic regions, and induce cancer-specific alternative splicing by adjusting various mechanisms, such as exonization, providing splicing donor/acceptor sites, alternative regulatory sequences or stop codons, and driving exon disruption or epigenetic regulation. Moreover, TEs can produce microRNAs (miRNAs) that control the proportion of transcripts by repressing translation or stimulating the degradation of transcripts at the post-transcriptional level. Notably, TE insertion creates a cancer-friendly environment by controlling the overall process of gene expression before and after transcription in cancer cells. This review emphasizes the correlative interaction between alternative splicing by TE integration and cancer-associated biological processes, suggesting a macroscopic mechanism controlling alternative splicing by TE insertion in cancer.
... RepeatMasker can recognize (exonized) TEs and parts thereof (> 30 nts) in protein-coding sequence regions [28]. We first screened for TE inclusion in the NCBI Consensus CDS databank (CCDS; https://www. ...
Article
Full-text available
Advances in RNA high-throughput sequencing and large-scale functional assays yield new insights into the multifaceted activities of transposed elements and many other previously undiscovered sequence elements. Currently, no tool for easy access, analysis, quantification, and visualization of alternatively spliced exons across multiple tissues or developmental stages is available. Also, analysis pipelines demand computational skills or hardware requirements, which often are hard to meet by wet-lab scientists. We developed ExoPLOT to enable simplified access to massive RNA high throughput sequencing datasets to facilitate the analysis of alternative splicing across many biological samples. To demonstrate the functionality of ExoPLOT, we analyzed the contribution of exonized transposed elements (TE) to human CDS; mRNA splice variants containing the TE-derived exon were quantified and compared to expression levels of TE-free splice variants. For analysis, we utilized 313 human cerebrum, cerebellum, heart, kidney, liver, ovary, and testis transcriptomes, representing various pre- and postnatal developmental stages. ExoPLOT visualizes the relative expression levels of alternative transcripts, e.g., caused by the insertion of new TE-derived exons, across different developmental stages of and among multiple tissues. This tool also provides a unique link between evolution and function during exonization (gain of a new exon) and exaptation (recruitment/co-optation) of a new exon. As input for analysis, we derived a database of 1151 repeat-masked, exonized TEs, representing all prominent families of transposons in the human genome and the collection of human consensus coding sequences (CCDS). ExoPLOT screened preprocessed RNA high-throughput sequencing datasets from seven human tissues to quantify and visualize the dynamics in RNA splicing for these 1151 TE-derived exons during the entire human organ development. 
In addition, we successfully mapped and analyzed 993 recently described exonized sequences from the human frontal cortex onto these 313 transcriptome libraries. ExoPLOT's approach to preprocessing RNA deep sequencing datasets facilitates alternative splicing analysis and significantly reduces processing times. In addition, ExoPLOT's design allows studying alternative RNA isoforms other than TE-derived in a customized – coordinate-based manner and is available at http://retrogenomics3.uni-muenster.de:3838/exz-plot-d/.
... ORFs from non-genic transcripts, is largely understudied. This is surprising considering that other lines of research revealed a pervasively transcribed genome (Neme and Tautz 2016;Kapranov et al. 2012) and the exonisation of intergenic DNA and of intronic regions in particular (Singer et al. 2004;Krull et al. 2005;Jürgen Schmitz et al. 2011). Novel transcripts abound, e.g., in mouse (Neme and Tautz 2016), but properties of ORFs contained and possible functions of proteins encoded remain unclear. ...
Preprint
A recent surge of studies suggested that many novel genes arise de novo from previously non-coding DNA and not by duplication. However, since most studies concentrated on longer evolutionary time scales and rarely considered protein structural properties, it remains unclear how these properties are shaped by evolution, depend on genetic mechanisms and influence gene survival. Here we compare open reading frames (ORFs) from high coverage transcriptomes from mouse and another four mammals covering 160 million years of evolution. We find that novel ORFs pervasively emerge from intergenic and intronic regions but are rapidly lost again while relatively fewer arise from duplications but are retained over much longer times. Surprisingly, disorder and other protein properties of young ORFs do not change with gene age. Only length and nucleotide composition change, probably to avoid aggregation. Thus de novo genes resemble frozen accidents of randomly emerged ORFs which survived initial purging, likely because they are functional.
... These observations are in agreement with an analogous situation of the aforementioned exaptations of novel exons from TEs. In the case of the primate-specific Alu elements, it was shown that most exonizations required additional modifications, e.g., in order to generate splice sites, open reading frames, etc. (Krull et al., 2005; Singer et al., 2004). Also, the genomic environment of integration is an important determinant of whether a given TE will function as a (potential) regulatory element (Brosius, 2009). ...
... Naturally, such studies cannot capture the earliest stages of de novo gene emergence. This is because these stages are probably connected to a pervasively transcribed genome [43,44] in general and to the exonization of intergenic and intronic regions in particular [45][46][47]. Previous studies have also found that novel transcripts abound, for example, in mouse [43] and fruitfly [20], but the properties of the ORFs present and the possible functions of the proteins encoded remain unclear. ...
Article
A recent surge of studies has suggested that many novel genes arise de novo from previously noncoding DNA and not by duplication. However, most studies concentrated on longer evolutionary time scales and rarely considered protein structural properties. Therefore, it remains unclear how these properties are shaped by evolution, depend on genetic mechanisms and influence gene survival. Here we compare open reading frames (ORFs) from high coverage transcriptomes from mouse and another four mammals covering 160 million years of evolution. We find that novel ORFs pervasively emerge from noncoding regions but are rapidly lost again, while relatively fewer arise from the divergence of coding sequences but are retained much longer. We also find that a subset (14%) of the mouse-specific ORFs bind ribosomes and are potentially translated, showing that such ORFs can be the starting points of gene emergence. Surprisingly, disorder and other protein properties of young ORFs hardly change with gene age in short time frames. Only length and nucleotide composition change significantly. Thus, some transcribed de novo genes resemble 'frozen accidents' of randomly emerged ORFs that survived initial purging. This perspective complies with very recent studies indicating that some neutrally evolving transcripts containing random protein sequences may be translated and be viable starting points of de novo gene emergence.
... In another study, Herger MC et al. described the function of RNA Pol II transcripts originating from intronic Alu elements (aluRNAs) in nucleolar assembly and function. Earlier, Alu transcripts had been implicated in the regulation of gene expression and translation but not linked to nucleolar organization [106,107]. Through a series of experiments, Herger MC et al. demonstrated that aluRNA interacts with the multifunctional nucleolar proteins nucleolin (NCL) and nucleophosmin (NPM) and is tethered to chromatin, and that this is sufficient to target large genomic regions to the nucleolus [108]. This strongly suggests that the interaction of NCL and NPM with aluRNA is important to build up a functional nucleolus. ...
Chapter
For the last four decades, we have known that noncoding RNAs maintain critical housekeeping functions such as transcription, RNA processing, and translation. However, in the late 1990s and early 2000s, the advent of high-throughput sequencing technologies and computational tools to analyze these large sequencing datasets facilitated the discovery of thousands of small and long noncoding RNAs (lncRNAs) and their functional role in diverse biological functions. For example, lncRNAs have been shown to regulate dosage compensation, genomic imprinting, pluripotency, cell differentiation and development, immune response, etc. Here we review how lncRNAs bring about such copious functions by employing diverse mechanisms such as translational inhibition, mRNA degradation, RNA decoys, facilitating recruitment of chromatin modifiers, regulation of protein activity, regulating the availability of miRNAs by sponging mechanism, etc. In addition, we provide a detailed account of different mechanisms as well as general principles by which lncRNAs organize functionally different nuclear sub-compartments and their impact on nuclear architecture.
... In contrast, "embedded Alu RNAs" are localized within introns, transcribed by Pol II and spliced out from pre-mRNA during mRNA maturation (Deininger, 2011). Alu transcripts have been shown to regulate gene expression posttranscriptionally, being involved in alternative splicing (Singer et al, 2004), RNA editing (Mattick & Mehler, 2008) and translation efficiency (Capshew et al, 2012; Fitzpatrick & Huang, 2012), but have not been linked to the functional organization of nuclear subcompartments. In the present study, we show that Pol II-dependent aluRNAs regulate nucleolar structure and rRNA synthesis via interaction with nucleolin (NCL), a major structural and multifunctional nucleolar protein with pivotal functions in ribosome biogenesis (Ginisty et al, 1999). ...
Article
Non-coding RNAs play a key role in organizing the nucleus into functional subcompartments. By combining fluorescence microscopy and RNA deep-sequencing-based analysis, we found that RNA polymerase II transcripts originating from intronic Alu elements (aluRNAs) were enriched in the nucleolus. Antisense-oligo-mediated depletion of aluRNAs or drug-induced inhibition of RNA polymerase II activity disrupted nucleolar structure and impaired RNA polymerase I-dependent transcription of rRNA genes. In contrast, overexpression of a prototypic aluRNA sequence increased both nucleolus size and levels of pre-rRNA, suggesting a functional link between aluRNA, nucleolus integrity and pre-rRNA synthesis. Furthermore, we show that aluRNAs interact with nucleolin and target ectopic genomic loci to the nucleolus. Our study suggests an aluRNA-based mechanism that links RNA polymerase I and II activities and modulates nucleolar structure and rRNA production.
... As an aside, 5′ UTRs controlled by alternative splicing often might contribute to ORFs and therefore encode not only cis-regulatory motifs but also, potentially, functional protein domains. [50]

Enhancers are parts of genes

Unlike the core promoter region, enhancer elements often reside at large distances from the regulated transcript. However, the actual three-dimensional genome organization also depends on chromatin-mediated interactions and might differ remarkably from the linear arrangement of DNA primary sequence. ...
Article
Outdated gene definitions favored regions corresponding to mature messenger RNAs, in particular, the open reading frame. In eukaryotes, the intergenic space was widely regarded nonfunctional and devoid of RNA transcription. Original concepts were based on the assumption that RNA expression was restricted to known protein-coding genes and a few so-called structural RNA genes, such as ribosomal RNAs or transfer RNAs. With the discovery of introns and, more recently, sensitive techniques for monitoring genome-wide transcription, this view had to be substantially modified. Tiling microarrays and RNA deep sequencing revealed myriads of transcripts, which cover almost entire genomes. The tremendous complexity of non-protein-coding RNA transcription has to be integrated into novel gene definitions. Despite an ever-growing list of functional RNAs, questions concerning the mass of identified transcripts are under dispute. Here, we examined genome-wide transcription from various angles, including evolutionary considerations, and suggest, in analogy to novel alternative splice variants that do not persist, that the vast majority of transcripts represent raw material for potential, albeit rare, exaptation events. © 2015 New York Academy of Sciences.
... An interesting example was reported where an Alu element gave rise to a novel 5′ exon in the human tumor necrosis factor type 2 gene (p75TNFR), providing a novel N-terminal protein domain resulting in a novel receptor isoform. [25] In addition, gene-integrated Alus can be a source of promoters, enhancers, silencers, insulators and influence mRNA stability. [26] Thus, 7SL is a prominent example of an ncRNA that has evolved diverse functions upon retrotransposition and amplification. ...
Article
The human genome is scattered with repetitive sequences, and the ENCODE project revealed that 60–70% of the genomic DNA is transcribed into RNA. As a consequence, the human transcriptome contains a large portion of repeat-derived RNAs (repRNAs). Here, we present a hypothesis for the evolution of novel functional repeat-derived RNAs from non-coding RNAs (ncRNAs) by retrotransposition. Upon amplification, the ncRNAs can diversify in sequence and subsequently evolve new activities, which can result in novel functions. Non-coding transcripts derived from highly repetitive regions can therefore serve as a reservoir for the evolution of novel functional RNAs. We base our hypothetical model on observations reported for short interspersed nuclear elements derived from 7SL RNA and tRNAs, α satellites derived from snoRNAs and SL RNAs derived from U1 small nuclear RNA. Furthermore, we present novel putative human repeat-derived ncRNAs obtained by the comparison of the Dfam and Rfam databases, as well as several examples in other species. We hypothesize that novel functional ncRNAs can derive also from other repetitive regions and propose Genomic SELEX as a tool for their identification. WIREs RNA 2014, 5:591–600. doi: 10.1002/wrna.1243. This article is categorized under: RNA Processing > 3' End Processing; RNA Turnover and Surveillance > Turnover/Surveillance Mechanisms.
... Initially viewed as "junk" DNA without function, seminal studies in rodents [15,16] and primates [17][18][19] indicate a far more important role for SINEs in genome organization, gene evolution, and disease. For example, germ-line insertions are correlated with non-homologous genome rearrangements, generation of novel coding sequences, alteration of regulatory elements and are linked with the origin and evolution of highly conserved non-coding elements in mammals [18,[20][21][22][23][24][25][26]. ...
Article
Background: Repetitive short interspersed elements (SINEs) are retrotransposons ubiquitous in mammalian genomes and are highly informative markers to identify species and phylogenetic associations. Of these, SINEs unique to the order Carnivora (CanSINEs) yield novel insights on genome evolution in domestic dogs and cats, but less is known about their role in related carnivores. In particular, genome-wide assessment of CanSINE evolution has yet to be completed across the Feliformia (cat-like) suborder of Carnivora. Within Feliformia, the cat family Felidae is composed of 37 species and numerous subspecies organized into eight monophyletic lineages that likely arose 10 million years ago. Using the Felidae family as a reference phylogeny, along with representative taxa from other families of Feliformia, the origin, proliferation and evolution of CanSINEs within the suborder were assessed. Results: We identified 93 novel intergenic CanSINE loci in Feliformia. Sequence analyses separated Feliform CanSINEs into two subfamilies, each characterized by distinct RNA polymerase binding motifs and phylogenetic associations. Subfamily I CanSINEs arose early within Feliformia but are no longer under active proliferation. Subfamily II loci are more recent, exclusive to Felidae and show evidence for adaptation to extant RNA polymerase activity. Further, presence/absence distributions of CanSINE loci are largely congruent with taxonomic expectations within Feliformia and the less resolved nodes in the Felidae reference phylogeny present equally ambiguous CanSINE data. SINEs are thought to be nearly impervious to excision from the genome. However, we observed a nearly complete excision of a CanSINE locus in puma (Puma concolor). In addition, we found that CanSINE proliferation in Felidae frequently targeted existing CanSINE loci for insertion sites, resulting in tandem arrays.
Conclusions: We demonstrate the existence of at least two SINE families within the Feliformia suborder, one of which is actively involved in insertional mutagenesis. We find SINEs are powerful markers of speciation and conclude that the few inconsistencies with expected patterns of speciation likely represent incomplete lineage sorting, species hybridization and SINE-mediated genome rearrangement.
... As a specific example of this, Lunyak et al. [83] showed that tissue-specific transcription of a SINE sequence in the murine growth hormone locus is required for the establishment of functional chromatin domains, which in turn permit gene activation. It is now clear that TE insertions into the untranslated regions of genes are frequently associated with alternative splicing events (exonizations), and the de novo generation of exons [84][85][86][87]. In addition to exonization, TE insertions are also known to deliver novel introns. ...
Article
One of the most unexpected insights that followed from the completion of the human genome a decade ago was that more than half of our DNA is derived from transposable elements (TEs). Due to advances in high throughput sequencing technologies it is now clear that TEs comprise the largest molecular class within most metazoan genomes. TEs, once categorised as "junk DNA", are now known to influence genomic structure and function by increasing the coding and non-coding genetic repertoire of the host. In this way TEs are key elements that stimulate the evolution of metazoan genomes. This review highlights several lines of TE research including the horizontal transfer of TEs through host-parasite interactions, the vertical maintenance of TEs over long periods of evolutionary time, and the direct role that TEs have played in generating morphological novelty.
... al., mapped out the stepwise mutagenesis over millions of years that generated an alternative 5′ exon in the human tumor necrosis factor receptor gene. [70] Additionally, insertion of an inverted Alu introduces a stretch of Us into the pre-mRNA, which, together with suitable mutations, can become a functional splice site for exon generation. Therefore, mutations over time contribute to the conversion of pseudosplice sites within Alus into functional ones. ...
Article
Alus are transposable elements belonging to the short interspersed element family. They occupy over 10% of the human genome and have been spreading through genomes over the past 65 million years. In the past, they were considered junk DNA with little function that merely took up space in the genome. Today, Alus and other transposable elements are emerging as key players in cellular function, including genomic activities, regulation of gene expression and evolution. Here we summarize the current understanding of Alu function in genome and gene expression regulation in human cell nuclei.
... If TE insertion is a random event, sense and antisense insertions might occur at the same frequency. However, some TEs have been shown to preferentially insert into the antisense orientation in relation to the host gene (Lorenc and Makalowski, 2003; Makalowski et al., 1994; Medstrand et al., 2002; Singer et al., 2004; Smit, 1999; van de Lagemaat et al., 2003). Conversely, the frequencies of sense and antisense insertions in bovine exons were found to be similar for each subfamily (Almeida et al., 2007). ...
Article
Transposable elements are mobile genomic sequences that comprise a large portion of mammalian genomes. The transposable element fusion phenomenon within porcine genes has not yet been reported; therefore, we investigated transposable element fusion genes in the Sus scrofa genome. Porcine transposable element-mediated chimeric transcripts were identified and characterized. Most transposable elements preferentially inserted in an antisense orientation and into the 3’ end of porcine genes. A transposable element fusion gene between porcine mRNA and ERV class I, one of the LTR retrotransposons, was not detected. These data will be of great use to further studies focused on a better understanding of the biological function of porcine genes in relation to transposable elements.
... A recent study showed that Estrogen Receptor α (ERα), which is involved in human breast cancer, preferentially targets mammalian interspersed repeat (MIR) transposons [16]. The exonization of mutated TE sequences has proven to be a significant mechanism for the creation of novel exons [17][18][19][20]. In the TranspoGene database, 1423 human exonized TEs (TE exons), involved in ~1700 RefSeq genes, have been collected [21]. ...
Article
The exonization of transposable elements (TEs) has proven to be a significant mechanism for the creation of novel exons. Existing knowledge of the retention patterns of TE exons in mRNAs was mainly established by the analysis of Expressed Sequence Tag (EST) data and microarray data. This study seeks to validate and extend previous studies on the expression of TE exons by an integrative statistical analysis of high throughput RNA sequencing data. We collected 26 RNA-seq datasets spanning multiple tissues and cancer types. The exon-level digital expressions (indicating retention rates in mRNAs) were quantified by a double normalized measure, called the rescaled RPKM (Reads Per Kilobase of exon model per Million mapped reads). We analyzed the distribution profiles and the variability (across samples and between tissue/disease groups) of TE exon expressions, and compared them with those of other constitutive or cassette exons. We inferred the effects of four genomic factors, including the location, length, cognate TE family and TE nucleotide proportion (RTE, see Methods section) of a TE exon, on the exons' expression level and expression variability. We also investigated the biological implications of an assembly of highly-expressed TE exons. Our analysis confirmed prior studies from the following four aspects. First, with relatively high expression variability, most TE exons in mRNAs, especially those without exact counterparts in the UCSC RefSeq (Reference Sequence) gene tables, demonstrate low but still detectable expression levels in most tissue samples. Second, the TE exons in coding DNA sequences (CDSs) are less highly expressed than those in 3′ (5′) untranslated regions (UTRs). Third, the exons derived from chronologically ancient repeat elements, such as MIRs, tend to be highly expressed in comparison with those derived from younger TEs.
Fourth, the previously observed negative relationship between the lengths of exons and the inclusion levels in transcripts is also true for exonized TEs. Furthermore, our study resulted in several novel findings. They include: (1) for the TE exons with non-zero expression and as shown in most of the studied biological samples, a high TE nucleotide proportion leads to their lower retention rates in mRNAs; (2) the considered genomic features (i.e. a continuous variable such as the exon length or a category indicator such as 3′ UTR) influence the expression level and the expression variability (CV) of TE exons in an inverse manner; (3) not only the exons derived from Alu elements but also the exons from the TEs of other families were preferentially established in zinc finger (ZNF) genes.
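As an illustration of the expression measure named in this abstract, the standard RPKM definition can be sketched as follows. This is a minimal sketch of plain RPKM only; the "rescaled" second normalization used by the study is not described in the abstract and is not reproduced here. The function name and the example numbers are hypothetical.

```python
def rpkm(read_count: int, exon_length_bp: int, total_mapped_reads: int) -> float:
    """Reads Per Kilobase of exon model per Million mapped reads.

    Standard definition: reads / (exon length in kb * mapped reads in millions).
    The study's additional "rescaling" step is NOT implemented here.
    """
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and total mapped reads must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


if __name__ == "__main__":
    # Hypothetical exon: 50 reads, 200 bp long, library of 25 million mapped reads.
    print(rpkm(50, 200, 25_000_000))
```

Normalizing by both exon length and library size is what makes values comparable across exons of different lengths and across samples of different sequencing depth.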
... There is a growing body of literature demonstrating the profound influences, both at the cellular and organismal level, mediated by alterations in non-coding DNA content, in general, and ERV LTRs, in particular [80,81,82,83,84,85,86,87,88,89,90]. In humans, well known examples of such regulation include: the human salivary amylase gene whose expression in the salivary glands is mediated by an HERV-E 5′ LTR, which acts as an enhancer element for a cellular promoter [88] and the human fetal gamma globin gene where an ERV9 LTR element ~40 kb upstream acts to stimulate expression during fetal development and suppress expression after birth [87]. ...
Article
All genes in the TRIM6/TRIM34/TRIM5/TRIM22 locus are type I interferon inducible, with TRIM5 and TRIM22 possessing antiviral properties. Evolutionary studies involving the TRIM6/34/5/22 locus have predominantly focused on the coding sequence of the genes, finding that TRIM5 and TRIM22 have undergone high rates of both non-synonymous nucleotide replacements and in-frame insertions and deletions. We sought to understand if divergent evolutionary pressures on TRIM6/34/5/22 coding regions have selected for modifications in the non-coding regions of these genes and explore whether such non-coding changes may influence the biological function of these genes. The transcribed genomic regions, including the introns, of TRIM6, TRIM34, TRIM5, and TRIM22 from ten Haplorhini primates and one prosimian species were analyzed for transposable element content. In Haplorhini species, TRIM5 displayed an exaggerated interspecies variability, predominantly resulting from changes in the composition of transposable elements in the large first and fourth introns. Multiple lineage-specific endogenous retroviral long terminal repeats (LTRs) were identified in the first intron of TRIM5 and TRIM22. In the prosimian genome, we identified a duplication of TRIM5 with a concomitant loss of TRIM22. The transposable element content of the prosimian TRIM5 genes appears to largely represent the shared Haplorhini/prosimian ancestral state for this gene. Furthermore, we demonstrated that one such differentially fixed LTR provides for species-specific transcriptional regulation of TRIM22 in response to p53 activation. Our results identify a previously unrecognized source of species-specific variation in the antiviral TRIM genes, which can lead to alterations in their transcriptional regulation. These observations suggest that there has existed long-term pressure for exaptation of retroviral LTRs in the non-coding regions of these genes. 
This likely resulted from serial viral challenges and provided a mechanism for rapid alteration of transcriptional regulation. To our knowledge, this represents the first report of persistent evolutionary pressure for the capture of retroviral LTR insertions.
... Frequently, a series of changes was required to generate all the necessary genomic conditions for successful splicing. Fig. 3 illustrates the reconstruction of all steps that occurred for the emergence of the novel Alu-derived exon 1 in the human tumor necrosis factor receptor gene type 2 (p75TNFR; [27]), including the insertion of the AluJo element into the 5′ UTR of the gene and the untimed acquisition of an alternative transcription start site upstream of the element. In addition, a point mutation generated a new ATG start codon derived from an ATA sequence within the AluJo element. ...
Article
Protein-coding genes are composed of exons and introns flanked by untranslated regions. Before the mRNA of a gene can be translated into protein, the splicing machinery removes all the intronic regions and joins the protein-coding exons together. Exonization is a process, whereby genes acquire new exons from non-protein-coding, primarily intronic, DNA sequences. Genomic insertions or point mutations within DNA sequences often generate alternative splice sites, causing the splicing system to include new sequences as exons or to elongate existing exons. Because the alternative splice sites are not as efficient as the originals the new variants usually constitute a minor fraction of mature mRNAs. While the prevailing original splice variant maintains functionality, the additional sequence, free from selection pressure, evolves a new function or eventually vanishes. If the new splice variant is advantageous, selection might operate to optimize the new splice sites and consequently increase the proportion of the alternative splice variant. In some instances, the original splice variant is completely replaced by constitutive splicing of the new form. Because of the fortuitous presence of internal splice site-like structures within their sequences, portions of transposed elements frequently serve as modules of exonization. Their recruitment requires a long and versatile optimization process involving multiple changes over a time span of millions, even hundreds of millions, of years. Comparisons of corresponding genes and mRNAs in phylogenetically related species enables one to chronologically reconstruct such changes, from ancient ancestors to living species, in a stepwise manner. 
We will review this process using three different exemplary cases: (1) the evolution of a constitutively spliced mammalian-wide repeat (MIR), (2) the evolution of an alternative exon 1 from an alternative 5'-extended primary transcript containing an Alu element, and (3) a rare case of the stepwise exonization of an Alu element-derived sequence mediated by A-to-I RNA editing.
... A significant feature of Alus is their dimeric structure, involving a fusion of two slightly dissimilar arms [58]. [146] Intracellular TNFR P75TNFR Tumour necrosis factor receptor Alu Old World primate Exonization Novel isoform Various Active Singer et al., 2004 [147] Altered infectious disease resistance? [73] Colon Le antigen expression B3GALT5 Galactosyltransferase ERV Old World primate Regulatory Alternative promoter Colon, small intestine, breast Active Dunn et al., 2003 [152] Prolactin potentiation of the adaptive immune response Parathyroid gland Active McHaffie and Ralston, 1995 [164] PRKACG cAMP signalling/ regulation of metabolism [166] Altered arterial wall function? ...
Article
Transposable elements (TEs) are increasingly being recognized as powerful facilitators of evolution. We propose the TE-Thrust hypothesis to encompass TE-facilitated processes by which genomes self-engineer coding, regulatory, karyotypic or other genetic changes. Although TEs are occasionally harmful to some individuals, genomic dynamism caused by TEs can be very beneficial to lineages. This can result in differential survival and differential fecundity of lineages. Lineages with an abundant and suitable repertoire of TEs have enhanced evolutionary potential and, if all else is equal, tend to be fecund, resulting in species-rich adaptive radiations, and/or they tend to undergo major evolutionary transitions. Many other mechanisms of genomic change are also important in evolution, and whether the evolutionary potential of TE-Thrust is realized is heavily dependent on environmental and ecological factors. The large contribution of TEs to evolutionary innovation is particularly well documented in the primate lineage. In this paper, we review numerous cases of beneficial TE-caused modifications to the genomes of higher primates, which strongly support our TE-Thrust hypothesis.
... There is evidence in the literature that an Alu element can induce transcription of an adjacent protein-coding gene. For example, the first exon of a p75TNFR gene transcript variant was derived from an AluJo element (Singer et al., 2004). ...
Article
Motivation: Many genes in the human genome produce a wide variety of transcript variants resulting from alternative exon splicing, differential promoter usage, or altered polyadenylation site utilization that may function differently in human cells. Here, we present a bioinformatics method for the systematic identification of human-specific novel transcript variants that might have arisen after the human-chimpanzee divergence. Results: The procedure involved collecting genomic insertions that are unique to the human genome when compared with orthologous chimpanzee and rhesus macaque genomic regions, and that are expressed in the transcriptome as exons evidenced by mRNAs and/or expressed sequence tags (ESTs). Using this procedure, we identified 112 transcript variants that are specific to humans; 74 were associated with known genes and the remaining transcripts were located in unannotated genomic loci. The original source of inserts was mostly transposable elements including L1, Alu, SVA, and human endogenous retroviruses (HERVs). Interestingly, some non-repetitive genomic segments were also involved in the generation of novel transcript variants. Insert contributions to the transcripts included promoters, terminal exons and insertions in exons, splice donors and acceptors and complete exon cassettes. Comparison of personal genomes revealed that at least seven loci were polymorphic in humans. The exaptation of human-specific genomic inserts as novel transcript variants may have increased human gene versatility or affected gene regulation.
... The longer introns presumably provide a good environment for exonization [40]. Effects of TE exonization within the first intron are usually neutral with respect to the protein sequence, but can affect signal sequences [41]. In order to analyze whether the location bias results from potential involvement of purifying selection, we separated our data to three groups: exonizations that contain an in-frame stop codon (599 exons), exonizations that are non-symmetrical and do not contain an in-frame stop codon (216 exons), and symmetrical exons that do not contain stop codons (137 exons). ...
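The three-way grouping described in this excerpt can be sketched as below. This is an illustrative sketch, not the cited authors' pipeline: it assumes that "symmetrical" means an exon length divisible by three, that stop codons are checked in a single known reading frame, and that `phase` (a hypothetical parameter) gives the number of bases to skip before the first complete codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_in_frame_stop(exon_seq: str, phase: int = 0) -> bool:
    """Return True if any complete codon in the given frame is a stop codon."""
    seq = exon_seq.upper()
    return any(seq[i:i + 3] in STOP_CODONS for i in range(phase, len(seq) - 2, 3))


def classify_exon(exon_seq: str, phase: int = 0) -> str:
    """Assign an exonized sequence to one of the three groups in the excerpt."""
    if has_in_frame_stop(exon_seq, phase):
        return "in-frame stop"
    if len(exon_seq) % 3 == 0:
        # Length is a multiple of three, so inclusion preserves the downstream frame.
        return "symmetrical, no stop"
    return "non-symmetrical, no stop"


if __name__ == "__main__":
    # Hypothetical toy sequences, not real exons.
    for seq in ("ATGGCCTGA", "ATGGCCAAA", "ATGGCCAA"):
        print(seq, "->", classify_exon(seq))
```

Only the symmetrical, stop-free group can be inserted into a coding sequence without truncating or frame-shifting the protein, which is why the excerpt treats the groups separately when asking about purifying selection.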
Article
Insertion of transposed elements within mammalian genes is thought to be an important contributor to mammalian evolution and speciation. Insertion of transposed elements into introns can lead to their activation as alternatively spliced cassette exons, an event called exonization. Elucidation of the evolutionary constraints that have shaped fixation of transposed elements within human and mouse protein coding genes and subsequent exonization is important for understanding of how the exonization process has affected transcriptome and proteome complexities. Here we show that exonization of transposed elements is biased towards the beginning of the coding sequence in both human and mouse genes. Analysis of single nucleotide polymorphisms (SNPs) revealed that exonization of transposed elements can be population-specific, implying that exonizations may enhance divergence and lead to speciation. SNP density analysis revealed differences between Alu and other transposed elements. Finally, we identified cases of primate-specific Alu elements that depend on RNA editing for their exonization. These results shed light on TE fixation and the exonization process within human and mouse genes.
... both mechanisms have acted on TEs in mammals, especially in Alu elements. The molecular mechanisms that lead to the exonization of Alu elements have been studied in detail [72][73][74] and are discussed further in BOX 3. The formation of alternative exons from Alu elements permits new functions to be established without eliminating the original function of a protein [75]. In some cases, the insertion of Alu into an exon or the formation of a constitutive exon from Alu can be deleterious and can lead to human genetic diseases. ...
Article
Over the past decade, it has been shown that alternative splicing (AS) is a major mechanism for the enhancement of transcriptome and proteome diversity, particularly in mammals. Splicing can be found in species from bacteria to humans, but its prevalence and characteristics vary considerably. Evolutionary studies are helping to address questions that are fundamental to understanding this important process: how and when did AS evolve? Which AS events are functional? What are the evolutionary forces that shaped, and continue to shape, AS? And what determines whether an exon is spliced in a constitutive or alternative manner? In this Review, we summarize the current knowledge of AS and evolution and provide insights into some of these unresolved questions.
... 43,44 Numerically, a large contribution to genomes are short interspersed elements (SINEs), such as primate Alu elements. SINEs are retronuons derived from small npcRNAs; at various stages of decay after retroposition they can also be recruited as novel domains in protein genes [45][46][47][48][49] or might even trigger the inclusion of anonymous intronic sequences by activating cryptic splice sites. Conveniently, these splice products are generated in parallel with ordinary mRNA products, initially often in low proportions, as if the new protein is being "tested" by natural selection for utility (for a detailed summary, see Ref. 50). ...
Article
While once almost synonymous, there is an increasing gap between the expanding definition of what constitutes a gene and the conservative and narrowly defined terms code or coding, which for a long time, almost exclusively constituted the open reading frame. Much confusion results from this disparity, especially in light of the plethora of noncoding RNAs (more correctly termed "non-protein-coding RNAs") that usually are encoded and transcribed by their own genes. A simple solution would be to adopt Ed Trifonov's less constrained definition of a code as any sequence pattern that can have a biological function. Such consideration favors not only a more complex view of the gene as an entity composed of many more or less conserved subgenic modules, but also a concept of modular evolution of genes and entire genomes.
... In mammals, this seems to be facilitated by the presence of potential splice sites in some TEs [23]. One example is the human tumour necrosis factor receptor gene (p75TNFR), in which an alternative 5′ exon is exapted from a TE [24]. This process could also generate completely new genes, as suggested for two mouse genes, lungerkine and mNSC1, comprising TE sequences almost entirely [23]. ...
Article
Genomes contain a large number of genes that do not have recognizable homologues in other species. These genes, found in only one or a few closely related species, are known as orphan genes. Their limited distribution implies that many of them are probably involved in lineage-specific adaptive processes. One important question that has remained elusive to date is how orphan genes originate. It has been proposed that they might have arisen by gene duplication followed by a period of very rapid sequence divergence, which would have erased any traces of similarity to other evolutionarily related genes. However, this explanation does not seem plausible for genes lacking homologues in very closely related species. In the present article, we review recent efforts to identify the mechanisms of formation of primate orphan genes. These studies reveal an unexpected important role of transposable elements in the formation of novel protein-coding genes in the genomes of primates.
... Examples of SINE exaptation as promoters, however, are limited and represented by a sense B1 [53] and an antisense B2 [54] element in mouse. In human, an isoform of the p75TNFR gene initiates transcription from an antisense MIR SINE, with the adjacent AluJo providing an alternative translation start site [55]. Furthermore, a bioinformatics analysis reports the existence of several unvalidated antisense Alu-associated TSS [8]. ...
Article
The human neuronal apoptosis inhibitory protein (NAIP) gene is no longer principally considered a member of the Inhibitor of Apoptosis Protein (IAP) family, as its domain structure and functions in innate immunity also warrant inclusion in the Nod-Like Receptor (NLR) superfamily. NAIP is located in a region of copy number variation, with one full length and four partly deleted copies in the reference human genome. We demonstrate that several of the NAIP paralogues are expressed, and that novel transcripts arise from both internal and upstream transcription start sites. Remarkably, two internal start sites initiate within Alu short interspersed element (SINE) retrotransposons, and a third novel transcription start site exists within the final intron of the GUSBP1 gene, upstream of only two NAIP copies. One Alu functions alone as a promoter in transient assays, while the other likely combines with upstream L1 sequences to form a composite promoter. The novel transcripts encode shortened open reading frames and we show that corresponding proteins are translated in a number of cell lines and primary tissues, in some cases above the level of full length NAIP. Interestingly, some NAIP isoforms lack their caspase-sequestering motifs, suggesting that they have novel functions. Moreover, given that human and mouse NAIP have previously been shown to employ endogenous retroviral long terminal repeats as promoters, exaptation of Alu repeats as additional promoters provides a fascinating illustration of regulatory innovations adopted by a single gene.
... Alu exonization has been shown to require only one [Lev-Maor et al., 2003; Sorek et al., 2004] (Fig. 3A) or only a few [Singer et al., 2004] mutations. The new type of exonization described here was caused by a single deletion that removed intronic elements promoting (poly(T)-tail and branch site) or repressing (AG dinucleotide) splice site selection and created a more favorable 3′ splice site organization. ...
Article
Cryptic exons or pseudoexons are typically activated by point mutations that create GT or AG dinucleotides of new 5' or 3' splice sites in introns, often in repetitive elements. Here we describe two cases of tetrahydrobiopterin deficiency caused by mutations improving the branch point sequence and polypyrimidine tracts of repeat-containing pseudoexons in the PTS gene. In the first case, we demonstrate a novel pathway of antisense Alu exonization, resulting from an intronic deletion that removed the poly(T)-tail of antisense AluSq. The deletion brought a favorable branch point sequence within proximity of the pseudoexon 3' splice site and removed an upstream AG dinucleotide required for the 3' splice site repression on normal alleles. New Alu exons can thus arise in the absence of poly(T)-tails that facilitated inclusion of most transposed elements in mRNAs by serving as polypyrimidine tracts, highlighting extraordinary flexibility of Alu repeats in shaping intron-exon structure. In the other case, a PTS pseudoexon was activated by an A>T substitution 9 nt upstream of its 3' splice site in a LINE-2 sequence, providing the first example of a disease-causing exonization of the most ancient interspersed repeat. These observations expand the spectrum of mutational mechanisms that introduce repetitive sequences in mature transcripts and illustrate the importance of intronic mutations in alternative splicing and phenotypic variability of hereditary disorders.
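The first sentence of this abstract describes pseudoexon activation by point mutations that create GT or AG dinucleotides. As a rough illustration only, the following Python sketch lists every single-nucleotide substitution in a sequence that creates such a dinucleotide. The function name and the toy sequences are hypothetical and are not taken from the cited study, and the presence of a GT or AG alone does not imply a functional splice site.

```python
def splice_dinucleotide_gains(seq):
    """Return (position, ref, alt, motif) for each single substitution
    that creates a GT or AG dinucleotide absent at that spot in seq.

    Hypothetical helper for illustration; positions are 0-based.
    """
    seq = seq.upper()
    gains = []
    for i, ref in enumerate(seq):
        for alt in "ACGT":
            if alt == ref:
                continue
            mutated = seq[:i] + alt + seq[i + 1:]
            # Only the two dinucleotides overlapping position i can change.
            for start in (i - 1, i):
                if start < 0 or start + 2 > len(seq):
                    continue
                motif = mutated[start:start + 2]
                if motif in ("GT", "AG") and seq[start:start + 2] != motif:
                    gains.append((i, ref, alt, motif))
    return gains


if __name__ == "__main__":
    # Toy input: a C-to-T change at the second position turns GC into GT.
    print(splice_dinucleotide_gains("GC"))
```

A real analysis would also score the surrounding branch point and polypyrimidine tract, which the abstract identifies as the features altered in the two PTS cases.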
... Surprisingly, almost all of the genome is transcribed in some organisms, not simply the protein-coding portion [5,6,7]. Some repetitive sequence classes that were thought to only selfishly expand genome size at the expense of the host are known to regulate transcription and contribute to gene evolution [8,9,10,11,12]. Genomes contain large non-coding regions that are conserved across species [13,14,15], and lineage-specific, non-coding DNA between distantly related species is associated with the same regulatory functions; such patterns are consistent with non-coding DNA having a regulatory function [16,17]. ...
Article
The basis of genome size variation remains an outstanding question because DNA sequence data are lacking for organisms with large genomes. Sixteen BAC clones from the Mexican axolotl (Ambystoma mexicanum: c-value = 32 × 10^9 bp) were isolated and sequenced to characterize the structure of genic regions. Annotation of genes within BACs showed that axolotl introns are on average 10x longer than orthologous vertebrate introns and they are predicted to contain more functional elements, including miRNAs and snoRNAs. Loci were discovered within BACs for two novel EST transcripts that are differentially expressed during spinal cord regeneration and skin metamorphosis. Unexpectedly, a third novel gene was also discovered while manually annotating BACs. Analysis of human-axolotl protein-coding sequences suggests there are 2% more lineage specific genes in the axolotl genome than the human genome, but the great majority (86%) of genes between axolotl and human are predicted to be 1:1 orthologs. Considering that axolotl genes are on average 5x larger than human genes, the genic component of the salamander genome is estimated to be incredibly large, approximately 2.8 gigabases! This study shows that a large salamander genome has a correspondingly large genic component, primarily because genes have incredibly long introns. These intronic sequences may harbor novel coding and non-coding sequences that regulate biological processes that are unique to salamanders.
... There are three known origins of alternatively spliced exons: 1) exon shuffling, which is a form of gene duplication [9][10][11]; 2) exonization of intronic sequences [12][13][14][15][16]; and 3) change in the mode of splicing from constitutive to alternative splicing during evolution [17,18]. One mechanism responsible for the shift from constitutive to alternative splicing is accumulation of mutations in the 5′ splice site region. ...
Article
Author Summary The human genome is crowded with over one million copies of primate-specific retrotransposed elements, termed Alu. A large fraction of Alu elements are located within intronic sequences. The human transcriptome undergoes extensive RNA editing (A-to-I), to higher levels than any other tested organism. RNA editing requires the formation of a double-stranded RNA structure in order to occur. Over 90% of the editing sites in the human transcriptome are found within Alu sequences. Thus, the high level of RNA editing is indicative of extensive secondary structure formation in mRNA precursors driven by intronic Alu-Alu base pairing. Splicing is a molecular mechanism in which introns are removed from an mRNA precursor and exons are ligated to form a mature mRNA. Here, we show that Alu insertions into introns can affect the splicing of the flanking exons. We experimentally demonstrate that two Alu elements that were inserted into the same intron in opposite orientation undergo base-pairing, and consequently shift the splicing pattern of the downstream exon from constitutive inclusion in all mature mRNA molecules to alternative skipping. This emphasizes the impact of Alu elements on the primate-specific transcriptome evolution, as such events can generate new isoforms that might acquire novel functions.
Article
Alu sequences are the most abundant repetitive elements in the human genome, and have proliferated to more than one million copies. Primate-specific Alu sequences account for ∼10% of the human genome, and their spread within the genome has the potential to generate new exons. The new exons produced by Alu elements appear in various primate genes, and their functions have been elucidated. Here, we identified a new exon in the insulin-like 3 gene (INSL3), which evolved ∼50 million years ago, and led to a splicing variant with 31 extra amino acid residues in addition to the original 95 nucleotides (NTs) of INSL3. The Alu-INSL3 isoform underwent diverse changes during primate evolution; we identified that human Alu-INSL3 might be on its way to functionality and has the potential to antagonize LGR8-INSL3 function. Therefore, the present study is designed to provide an example of the evolutionary trajectory of a variant peptide hormone antagonist that was caused by the insertion of an Alu element in primates.
Article
Toxin genes in animals undergo accelerated evolution compared to non-toxin genes to be effective and competitive in prey capture, as well as to enhance their predator defense. Several mechanisms have been proposed to explain this unusual phenomenon. These include (a) frequent mutations in exons compared to introns and nonsynonymous substitutions in exons; (b) a high frequency of point mutations due to the presence of more unstable triplets in exons compared to introns; (c) Accelerated Segment Switch in Exons to alter Targeting (ASSET); (d) Rapid Accumulation of Variations in Exposed Residues (RAVERs); (e) alteration in intron-exon boundary; (f) deletion of exon; and (g) loss/gain of domains through recombination. By systematic analyses of snake venom disintegrin/metalloprotease genes, I describe a new mechanism in the evolution of these genes through exonization and intronization. In the evolution of RTS/KTS disintegrins, a new exon (10a) is formed in intron 10 of the disintegrin/metalloprotease gene. Unlike the more than 90% of new exons that are from repetitive elements in introns, exon 10a originated from a non-repetitive element. To incorporate exon 10a, part of exon 11 is intronized to retain the open reading frame. This is the first case of simultaneous exonization and intronization within a single gene. This new mechanism alters the function of toxins through drastic changes to the molecular surface via insertion of new exons and deletion of exons.
Chapter
Charles Fox and Jason Wolf have brought together leading researchers to produce a cutting-edge primer introducing readers to the major concepts in modern evolutionary genetics. This book spans the continuum of scale, from studies of DNA sequence evolution through proteins and development to multivariate phenotypic evolution, and the continuum of time, from ancient events that lead to current species diversity to the rapid evolution seen over relatively short time scales in experimental evolution studies. Chapters are accessible to an audience lacking extensive background in evolutionary genetics but also current and in-depth enough to be of value to established researchers in evolutionary biology.
Article
Processes of genome evolution, particularly primate genome evolution, are the focus of this thesis. It examines whether nucleotide sequences encoding peptide precursors are conserved in the primate lineage, and how two primate-specific gene families were created, evolved and came to be expressed. Exon shuffling, retroposition and duplications, among other mechanisms, are implicated in the creation of novel genes. Studying the creation and early history of a gene requires the description of recent genes that still harbour the features of their creation. Two such gene families retained my attention: 1. The PMCHL genes were created recently through retroposition and rearrangement events, as well as the accumulation of indels and mutations. Structural and phylogenetic analyses led to a better understanding of the origin of these primate-specific genes. The PMCHL genes are transcribed in a tissue-specific manner, and many alternative splicing events are described. Finally, PMCHL mRNA expression was compared in macaque and human brains. 2. The GUSB-derived genes were identified through fluorescent in situ hybridization on primate chromosomes. A systematic search of the human genome allowed the discovery and description of a large gene family encompassing at least 15 paralogues in human. The structural and phylogenetic analyses we performed led to a more precise description of their creation and evolution. Some members of this gene family acquired a transcriptional capacity.
Article
The diversity of dog breeds makes the domestic dog a valuable model for identifying genes responsible for many phenotypic and behavioral traits. The brain, in particular, is a region of interest for the analysis of molecular changes that are involved in dog-specific behavioral phenotypes. However, such studies are handicapped due to incomplete annotation of the dog genome. We present a high-coverage transcriptome of the dog brain using RNA-Seq. Two areas of the brain, hypothalamus and cerebral cortex, were selected for their roles in cognition, emotion, and neuroendocrine functions. We detected many novel features of the dog transcriptome, including 13,799 novel exons, 51,357 exons with unique 5' or 3' modifications, and many novel alternative splicing events. We provide some examples of novel features in genes that are related to domestication, including ADCY8, SMOC2, and PRNP. We also found 247 novel protein-coding genes and 328 noncoding RNAs, including 57 long noncoding RNAs that represent the first empirical evidence for a large fraction of noncoding RNAs in the dog. In addition, we analyze both gene expression and alternative splicing differences between the hypothalamus and cerebral cortex and find that there is very little overlap between genes that are differentially alternatively spliced and genes that are differentially expressed. We thereby suggest that researchers who want to pinpoint the genetic causes for dog breed-specific traits and diseases should not confine their studies to gene expression alone, but should consider other factors such as alternative splicing and changes in untranslated regions.
Article
The modern evolutionary synthesis leaves unresolved some of the most fundamental, long-standing questions in evolutionary biology: What is the role of sex in evolution? How does complex adaptation evolve? How can selection operate effectively on genetic interactions? More recently, the molecular biology and genomics revolutions have raised a host of critical new questions, through empirical findings that the modern synthesis fails to explain: for example, the discovery of de novo genes; the immense constructive role of transposable elements in evolution; genetic variance and biochemical activity that go far beyond what traditional natural selection can maintain; perplexing cases of molecular parallelism; and more. Here I address these questions from a unified perspective, by means of a new mechanistic view of evolution that offers a novel connection between selection on the phenotype and genetic evolutionary change (while relying, like the traditional theory, on natural selection as the only source of feedback on the fit between an organism and its environment). I hypothesize that the mutation that is of relevance for the evolution of complex adaptation, while not Lamarckian or "directed" to increase fitness, is not random, but is instead the outcome of a complex and continually evolving biological process that combines information from multiple loci into one. This allows selection on a fleeting combination of interacting alleles at different loci to have a hereditary effect according to the combination's fitness. Empirical evidence for the proposed mechanism from both molecular evolution and evolution at the organismal level is discussed, and multiple predictions are offered by which it may be tested. This proposed mechanism addresses the problem of how beneficial genetic interactions can evolve under selection, and also offers an intuitive explanation for the role of sex in evolution, which focuses on sex as the generator of genetic combinations. 
Importantly, it also implies that genetic variation that has appeared neutral through the lens of traditional theory can actually experience selection on interactions and thus has a much greater adaptive potential than previously considered. Reviewers: This article was reviewed by Nigel Goldenfeld (nominated by Eugene V. Koonin), Jürgen Brosius and W. Ford Doolittle.
Article
The Alu element has been a major source of new exons during primate evolution. Thousands of human genes contain spliced exons derived from Alu elements. However, identifying Alu exons that have acquired genuine biological functions remains a major challenge. We investigated the creation and establishment of Alu exons in human genes, using transcriptome profiles of human tissues generated by high-throughput RNA sequencing (RNA-Seq) combined with extensive RT-PCR analysis. More than 25% of Alu exons analyzed by RNA-Seq have estimated transcript inclusion levels of at least 50% in the human cerebellum, indicating widespread establishment of Alu exons in human genes. Genes encoding zinc finger transcription factors have significantly higher levels of Alu exonization. Importantly, Alu exons with high splicing activities are strongly enriched in the 5'-UTR, and two-thirds (10/15) of 5'-UTR Alu exons tested by luciferase reporter assays significantly alter mRNA translational efficiency. Mutational analysis reveals the specific molecular mechanisms by which newly created 5'-UTR Alu exons modulate translational efficiency, such as the creation or elongation of upstream ORFs that repress the translation of the primary ORFs. This study presents genomic evidence that a major functional consequence of Alu exonization is the lineage-specific evolution of translational regulation. Moreover, the preferential creation and establishment of Alu exons in zinc finger genes suggest that Alu exonization may have globally affected the evolution of primate and human transcriptomes by regulating the protein production of master transcriptional regulators in specific lineages.
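One mechanism named in this abstract is the creation or elongation of upstream ORFs (uORFs) in the 5'-UTR. A minimal Python sketch of what detecting such a uORF could look like is given below. It assumes the simplest possible definition (an ATG followed by an in-frame stop codon lying entirely within the UTR), and the function name and example sequence are hypothetical rather than the method used in the study.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(utr):
    """Return (start, end) index pairs of ATG...stop ORFs fully inside utr.

    Assumed simplification: only uORFs that terminate within the
    5'-UTR are reported; end is exclusive and includes the stop codon.
    """
    utr = utr.upper()
    uorfs = []
    for start in range(len(utr) - 2):
        if utr[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this ATG.
        for pos in range(start + 3, len(utr) - 2, 3):
            if utr[pos:pos + 3] in STOP_CODONS:
                uorfs.append((start, pos + 3))
                break
    return uorfs


if __name__ == "__main__":
    # Toy UTR: ATG at index 2, in-frame TAG ending at index 11.
    print(find_uorfs("CCATGAAATAGCC"))
```

Comparing the output for a UTR with and without a candidate Alu exon would show whether the exon introduces a new uORF or extends an existing one, under the simplified definition assumed here.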
Article
The leptin receptor (LEPR) is a crucial regulatory protein that interacts with Leptin. In our analysis of LEPR, novel AluJb-derived alternative transcripts were identified in the genome of the rhesus monkey. In order to investigate the occurrence of AluJb-derived alternative transcripts and the mechanism underlying exonization events, we conducted analyses using a number of primate genomic DNAs and adipose RNAs of tissue and primary cells derived from the crab-eating monkey. Our results demonstrate that the AluJb element was integrated into the common ancestor genome prior to the divergence of simians and prosimians. The lineage-specific exonization event of the LEPR gene in chimpanzees, orangutans, and Old World monkeys appears to have been accomplished via transition mutations of the 5' splicing site (second position of C to T). However, in New World monkeys and prosimians, the AluJb-related LEPR transcript should be silenced by the additional transversion mutation (fourth position of T to G). The AluJb-related transcript of human LEPR should also be silenced by a mutation of the 5' splicing site (first position of G to A) and the insertion of one nucleotide sequence (minus fourth position of A). Our data suggest that lineage-specific exonization events should be determined by the combination event of the formation of splicing sites and protection against site-specific mutation pressures. These evolutionary mechanisms could be major sources for primate diversification.
Article
Regensburg, Univ., Diss., 2005 (not for exchange).
Article
My dissertation encompasses five different studies that are linked by a common theme: the investigation of transposable element (TE) contributions to eukaryotic gene sequences. A detailed analysis of exonization events of LTR elements in the human genome shows the preference towards the fixation of LTR elements in gene untranslated regions, which supports the existing concept of a major role of LTR elements as a natural source of regulatory sequences. The ability of different classes of sequence similarity search methods to detect TE-derived sequences was evaluated. In general, the different search methods are found to be complementary, and combined search approaches are needed to systematically check any data set for all potential TE-associated coding sequences. On average, TE-derived exon sequences have low protein coding potential. In particular, non-coding TEs are frequently exonized but unlikely to encode protein sequences. Many of these non-coding exonized TEs may actually be involved in gene regulation via the formation of double stranded RNA complexes with complementary TE-derived exons. The investigation of the relationship between human miRNAs and TEs shows that 55 experimentally verified human miRNA genes (~12%) originated from TEs. Overall, TE-derived miRNA genes are less conserved than non TE-derived miRNAs. The potential regulatory and functional significance of TE-derived miRNAs was explored. An ab initio prediction algorithm I developed was used to discover putative cases of novel TE-derived miRNA genes. A miRNA gene family, hsa-mir-548, was found to be derived from the Made1 family of MITEs. The palindromic structure of the Made1 elements, and MITEs in general, points to a specific mechanism by which these sequences can be recognized and processed by the miRNA biogenesis pathway. MITEs may also represent an evolutionary link between siRNAs and miRNAs. An original model for a siRNA-to-miRNA evolutionary transition mediated by DNA-type TEs is proposed. 
This model is supported by the presence of evolutionary intermediate TE sequences that encode both siRNAs and miRNAs in the Arabidopsis and rice genomes. The siRNA-to-miRNA evolutionary transition is representative of a number of other regulatory mechanisms that evolved to silence TEs and were later co-opted to serve as regulators of host gene expression. Ph.D. Committee Chair: Jordan, I. King; Committee Member: Borodovsky, Mark; Committee Member: Bunimovich, Leonid; Committee Member: Choi, Jung; Committee Member: McDonald, John
Article
Transposable elements (TEs) are powerful facilitators of genome evolution, and hence of phenotypic diversity as they can cause genetic changes of great magnitude and variety. TEs are ubiquitous and extremely ancient, and although harmful to some individuals, they can be very beneficial to lineages. TEs can build, sculpt, and reformat genomes by both active and passive means. Lineages with active TEs or with abundant homogeneous inactive populations of TEs that can act passively by causing ectopic recombination are potentially fecund, adaptable, and taxonate readily. Conversely, taxa deficient in TEs or possessing heterogeneous populations of inactive TEs may be well adapted in their niche, but tend to prolonged stasis and may risk extinction by lacking the capacity to adapt to change, or diversify. Because of recurring intermittent waves of TE infestation, available data indicate a compatibility with punctuated equilibrium, in keeping with widely accepted interpretations of evidence from the fossil record. We propose a general and holistic synthesis on how the presence of TEs within genomes makes them flexible and dynamic, so that genomes themselves are powerful facilitators of their own evolution.
Article
The sirtuin family of class III histone deacetylases (HDACs) is named after their homology to yeast silent information regulator 2 (SIR2). SIR2 and its mammalian derivatives (SIRT1-7) play a central role in gene silencing, cell cycle, aging and metabolism. Here we report cDNA cloning, chromosome mapping, expression and evolutionary analysis of sirtuin genes in Sus scrofa (Tongcheng pig). Sequence analysis showed that porcine sirtuin genes contain 7 members designated SIRT1-7. Tissue distribution analysis indicated that porcine sirtuin genes are ubiquitously expressed but with the highest abundance in brain, spinal cord and genital tissue. In silico and radiation hybrid mapping analysis mapped porcine SIRT1-7 to the chromosomes 14q23, 6q11-12, 2q29, 14q19, 7p12, 2q11, and 12p15, respectively. We also isolated and characterized the genomic sequence of porcine SIRT1, which spanned a region of 31,834 bp comprising 9 exons ranging in size from 80 bp to 2121 bp. The 5' flanking genomic region preceding an open reading frame of SIRT1 has a TATA box, a small 300 bp CpG island and several putative Sp1 and p53 transcription factor binding sites. Moreover, we isolated two novel splicing SIRT6 variants with 346 bp (variant 2) in-frame deletions from lung and 327 bp (variant 3) in-frame deletions from spleen and brain. This is the first systematic report of molecular cloning and characterization of sirtuin genes in pigs, which will be helpful for a better understanding of the physiological role of sirtuin proteins in pigs.
Article
Transposable elements (TEs) are major sources of new exons in higher eukaryotes. Almost half of the human genome is derived from TEs, and many types of TEs have the potential to exonize. In this work, we conducted a large-scale analysis of human exons derived from mammalian-wide interspersed repeats (MIRs), a class of old TEs which was active prior to the radiation of placental mammals. Using exon array data of 328 MIR-derived exons and RT-PCR analysis of 39 exons in 10 tissues, we identified 15 constitutively spliced MIR exons, and 15 MIR exons with tissue-specific shift in splicing patterns. Analysis of RNAs from multiple species suggests that the splicing events of many strongly included MIR exons have been established before the divergence of primates and rodents, while a small percentage result from recent exonization during primate evolution. Interestingly, exon array data suggest substantially higher splicing activities of MIR exons when compared with exons derived from Alu elements, a class of primate-specific retrotransposons. This appears to be a universal difference between exons derived from young and old TEs, as it is also observed when comparing Alu exons to exons derived from LINE1 and LINE2, two other groups of old TEs. Together, this study significantly expands current knowledge about exonization of TEs. Our data imply that with sufficient evolutionary time, numerous new exons could evolve beyond the evolutionary intermediate state and contribute functional novelties to modern mammalian genomes.
Article
Genomic nomenclature has not kept pace with the levels and depth of analyzing and understanding genomic structure, function, and evolution. We wish to propose a general terminology that might aid the integrated study of evolution and molecular biology. Here we designate as a "nuon" any stretch of nucleic acid sequence that may be identifiable by any criterion. We show how such a general term will facilitate contemplation of the structural and functional contributions of such elements to the genome in its past, current, or future state. We focus in this paper on pseudogenes and dispersed repetitive elements, since their current names reflect the prevalent view that they constitute dispensable genomic noise (trash), rather than a vast repertoire of sequences with the capacity to shape an organism during evolution. This potential to contribute sequences for future use is reflected in the suggested terms "potonuons" or "potogenes." If such a potonuon has been coopted into a variant or novel function, an evolutionary process termed "exaptation," we employ the term "xaptonuon." If a potonuon remains without function (nonaptive nuon), it is a "nonaptation" and we term it "naptonuon." A number of examples for potonuons and xaptonuons are given.
Article
Dispersion of repetitive sequence elements is a source of genetic variability that contributes to genome evolution. Alu elements, the most common dispersed repeats in the human genome, can cause genetic diseases by several mechanisms, including de novo Alu insertions and splicing of intragenic Alu elements into mRNA. Such mutations might contribute positively to protein evolution if they are advantageous or neutral. To test this hypothesis, we searched the literature and sequence databases for examples of protein-coding regions that contain Alu sequences: 17 Alu 'cassettes' inserted within 15 different coding sequences were found. In three instances, these events caused genetic diseases; the possible functional significance of the other Alu-containing mRNAs is discussed. Our analysis suggests that splice-mediated insertion of intronic elements is the major mechanism by which Alu segments are introduced into mRNAs.
Article
The evolution, mobility and deleterious genetic effects of human Alus are fairly well understood. The complexity of regulated transcriptional expression of Alus is becoming apparent and insight into the mechanism of retrotransposition is emerging. Unresolved questions concern why mobile, highly repetitive short interspersed elements (SINEs) have been tolerated throughout evolution and why and how families of such sequences are periodically replaced. Either certain SINEs are more successful genomic parasites or positive selection drives their relative success and genomic maintenance. A complete understanding of the evolutionary dynamics and significance of SINEs requires determining whether or not they have a function(s). Recent evidence suggests two possibilities, one concerning DNA and the other RNA. Dispersed Alus exhibit remarkable tissue-specific differences in the level of their 5-methylcytosine content. Differences in Alu methylation in the male and female germlines suggest that Alu DNA may be involved in either the unique chromatin organization of sperm or signaling events in the early embryo. Alu RNA is increased by cellular insults and stimulates protein synthesis by inhibiting PKR, the eIF2 kinase that is regulated by double-stranded RNA. PKR serves other roles potentially linking Alu RNA to a variety of vital cell functions. Since Alus have appeared only recently within the primate lineage, this proposal provokes the challenging question of how Alu RNA could have possibly assumed a significant role in cell physiology.
Article
Transpositions of Alu sequences, representing the most abundant primate short interspersed elements (SINE), were evaluated as molecular cladistic markers to analyze the phylogenetic affiliations among the primate infraorders. Altogether 118 human loci, containing intronic Alu elements, were PCR analyzed for the presence of Alu sequences at orthologous sites in each of two strepsirhine, New World and Old World monkey species, Tarsius bancanus, and a nonprimate outgroup. Fourteen size-polymorphic amplification patterns exhibited longer fragments for the anthropoids (New World and Old World monkeys) and T. bancanus whereas shorter fragments were detected for the strepsirhines and the outgroup. From these, subsequent sequence analyses revealed three Alu transpositions, which can be regarded as shared derived molecular characters linking tarsiers and anthropoid primates. Concerning the other loci, scenarios are represented in which different SINE transpositions occurred independently in the same intron on the lineages leading both to the common ancestor of anthropoids and to T. bancanus, albeit at different nucleotide positions. Our results demonstrate the efficiency and possible pitfalls of SINE transpositions used as molecular cladistic markers in tracing back a divergence point in primate evolution over 40 million years old. The three Alu insertions characterized underpin the monophyly of haplorhine primates (Anthropoidea and Tarsioidea) from a novel perspective.
Article
We report the identification of a novel p75TNF receptor isoform termed icp75TNFR, which is generated by the use of an alternative transcriptional start site within the p75TNFR gene and characterized by regulated intracellular expression. The icp75TNFR protein has an apparent molecular mass of approximately 50 kDa and is recognized by antibodies generated against the transmembrane form of p75TNFR. The icp75TNFR binds tumor necrosis factor (TNF) and mediates intracellular signaling. Overexpression of the icp75TNFR cDNA results in TNF-induced activation of NF-κB in a TNF receptor-associated factor 2 (TRAF2)-dependent manner. Thus, our results provide an example of intracellular cytokine receptor activation.
Article
To study the genome-wide impact of transposable elements (TEs) on the evolution of protein-coding regions, we examined 13 799 human genes and found 533 (∼4%) cases of TEs within protein-coding regions. The majority of these TEs (∼89.5%) reside within ‘introns’ and were recruited into coding regions as novel exons. We found that TE integration often has an effect on gene function. In particular, there were two mouse genes whose coding regions consist largely of TEs, suggesting that TE insertion might create new genes. Thus, there is increasing evidence for an important role of TEs in gene evolution. Because many TEs are taxon-specific, their integration into coding regions could accelerate species divergence.
Article
A highly resolved primate cladogram based on DNA evidence is congruent with extant and fossil osteological evidence. A provisional primate classification based on this cladogram and the time scale provided by fossils and the model of local molecular clocks has all named taxa represent clades and assigns the same taxonomic rank to those clades of roughly equivalent age. Order Primates divides into Strepsirhini and Haplorhini. Strepsirhines divide into Lemuriformes and Loriformes, whereas haplorhines divide into Tarsiiformes and Anthropoidea. Within Anthropoidea when equivalent ranks are used for divisions within Platyrrhini and Catarrhini, Homininae divides into Hylobatini (common and siamang gibbon) and Hominini, and the latter divides into Pongina for Pongo (orangutans) and Hominina for Gorilla and Homo. Homo itself divides into the subgenera H. (Homo) for humans and H. (Pan) for chimpanzees and bonobos. The differences between this provisional age related phylogenetic classification and current primate taxonomies are discussed.
Article
To facilitate gene finding and for the investigation of human molecular genetics on a genome scale, we present a comprehensive survey on various statistical features of human exons. We first show that human exons with flanking genomic DNA sequences can be classified into 12 mutually exclusive categories. This classification could serve as a standard for future studies so that direct comparisons of results can be made. A database for eight categories (related to human genes in which coding regions are split by introns) was built from GenBank release 87.0 and analyzed by a number of methods to characterize statistical features of these sequences that may serve as controls or regulatory signals for gene expression. The statistical information compiled includes profiles of signals for transcription, splicing and translation, various compositional statistics and size distributions. Further analyses reveal novel correlations and constraints among different splicing features across an internal exon that are consistent with the Exon Definition model. This information is fundamental for a quantitative view of human gene organization, and should be invaluable for individual scientists to design human molecular genetics experiments.
Article
The existing classification of human Alu sequences is revised and expanded using a novel methodology and a larger set of sequence data. Our study confirms that there are two major Alu subfamilies, Alu-J and Alu-S. The Alu-S subfamily consists of at least five distinct subfamilies referred to as Alu-Sx, Alu-Sq, Alu-Sp, Alu-Sc, and Alu-Sb. The Alu-Sp and Alu-Sq subfamilies have been revealed by this study. Alu subfamilies differ from one another in a number of positions called diagnostic. In this paper the diagnostic positions are defined in quantitative terms and are used to evaluate statistical significance of the observed subfamilies. Each Alu subfamily most likely represents pseudogenes retroposed from evolving functional source Alu genes. Evidence presented in this paper indicates that Alu-Sp and Alu-Sc pseudogenes were retroposed from different source genes, during overlapping periods of time, and at different rates. Our analysis also indicates that the previously identified Alu-type transcript BC200 comes from an active Alu gene that might have existed even before the origin of dimeric Alu sequences. The source genes for Alu pseudogene families are reconstructed. It is assumed that diagnostic differences between reconstructed source genes reflect mutations that have occurred in true source Alu genes under natural selection. Some of these mutations are compensatory and are used to reconstruct a common secondary structure of Alu RNAs transcribed from the source genes. The biological function of Alu RNA is discussed in the context of its homology to the elongation-arresting domain of 7SL RNA. (ABSTRACT TRUNCATED AT 250 WORDS)
Article
Alu repetitive elements are found in approximately 1.4 million copies in the human genome, comprising more than one-tenth of it. Numerous studies describe exonizations of Alu elements, that is, splicing-mediated insertions of parts of Alu sequences into mature mRNAs. To study the connection between the exonization of Alu elements and alternative splicing, we used a database of ESTs and cDNAs aligned to the human genome. We compiled two exon sets, one of 1176 alternatively spliced internal exons, and another of 4151 constitutively spliced internal exons. Sixty-one alternatively spliced internal exons (5.2%) had a significant BLAST hit to an Alu sequence, but none of the constitutively spliced internal exons had such a hit. The vast majority (84%) of the Alu-containing exons that appeared within the coding region of mRNAs caused a frame-shift or a premature termination codon. Alu-containing exons were included in transcripts at lower frequencies than alternatively spliced exons that do not contain an Alu sequence. These results indicate that internal exons that contain an Alu sequence are predominantly, if not exclusively, alternatively spliced. Presumably, evolutionary events that cause a constitutive insertion of an Alu sequence into an mRNA are deleterious and selected against.
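The proportions quoted in this abstract can be rechecked with a few lines of arithmetic. The sketch below is only an illustration, not the authors' pipeline: the variable and function names are hypothetical, and the only inputs are the exon counts stated in the abstract itself.

```python
# Exon counts as quoted in the abstract above (not re-derived from any database).
alt_exons = 1176        # alternatively spliced internal exons
const_exons = 4151      # constitutively spliced internal exons
alt_with_alu = 61       # alternatively spliced exons with a significant Alu BLAST hit
const_with_alu = 0      # constitutively spliced exons with such a hit


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage (hypothetical helper for this sketch)."""
    return 100.0 * part / whole


alt_pct = percent(alt_with_alu, alt_exons)
const_pct = percent(const_with_alu, const_exons)

print(f"Alu-containing alternative exons:  {alt_pct:.1f}%")
print(f"Alu-containing constitutive exons: {const_pct:.1f}%")
```

Rounded to one decimal place, the first figure matches the 5.2% reported in the abstract, and the second is zero because no constitutive exon had a hit.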
Article
Introns are removed from precursor messenger RNAs in the cell nucleus by a large ribonucleoprotein complex called the spliceosome. The spliceosome contains five subcomplexes called snRNPs, each with one RNA and several protein components. Interactions of the snRNPs with each other and the intron are highly dynamic, changing in an ordered progression throughout the splicing process. This allosteric cascade of interactions is programmed into the RNA and protein components of the spliceosome, and is driven by a family of DExD/H-box RNA-dependent ATPases. The dependence of cascade progression on multiple intron-recognition events likely serves to enforce the accuracy of splicing. Here, the progression of the allosteric cascade from the first recognition event to the first catalytic step of splicing is reviewed.
Article
Different forms of tumor necrosis factor (TNF) interact with two specific receptors for TNF (TNFR) on the cell membrane to induce a variety of effects. While sharing structural similarities in their extracellular domains, the two TNFRs differ in their intracellular domain, their signal transduction, and consequently their function. In addition, one of the two TNFRs can be expressed in two differently located isoforms. This makes the TNF-TNFR system very complex. The dual TNF function for either cell death or survival upon interaction of members of the TNF ligand family with members of the TNF receptor family will be discussed.
Article
Transpositions of primate-specific Alu elements were applied as molecular cladistic markers in a phylogenetic analysis of South American primates. Seventy-four human and platyrrhine loci containing intronic Alu elements were PCR screened in various New World monkeys and the human outgroup to detect the presence of orthologous retrotransposons informative of New World monkey phylogeny. Six loci revealed size polymorphism in the amplification pattern, indicating a shared derived character state due to the presence of orthologous Alu elements confirmed by subsequent sequencing. Three markers corroborate (1) New World monkey monophyly and one marker supports each of the following callitrichine relationships: (2) Callithrix and Cebuella are more closely related to each other than to any other callitrichine, (3) the callitrichines form a monophyletic clade including Callimico, and (4) the next living relatives to the callitrichines are Cebus, Saimiri, and Aotus.