Mammalian tree and number of mammalian-specific gene families. The tree depicts the phylogenetic relationships between 30 mammalian species from different major groups. The value at each node indicates the number of families that were mapped to the branch ending in that node. We define three conservation levels: "mam-basal" (class 2, approximately older than 100 Myr, red), "mam-young" (class 1, green) and "species-specific" (class 0, blue). Branch lengths represent the approximate number of substitutions per site as inferred from previous studies (see Materials and Methods). The scale bar in the bottom left corner represents 6 substitutions per 100 nucleotides. Dotted lines have been added to some branches to improve readability.

Source publication
Article
The birth of genes that encode new protein sequences is a major source of evolutionary innovation. However, we still understand relatively little about how these genes come into being and which functions they are selected for. To address these questions, we have obtained a large collection of mammalian-specific gene families that lack homologues in...

Similar publications

Preprint
The birth of genes that encode new protein sequences is a major source of evolutionary innovation. However, we still understand relatively little about how these genes come into being and which functions they are selected for. To address these questions we have obtained a large collection of mammalian-specific gene families that lack homologues in...
Article
The environment experienced early in life often affects the traits that are developed after an individual has transitioned into new life stages and environments. Because the phenotypes induced by earlier environments are then screened by later ones, these ‘carry‐over effects’ influence fitness outcomes across the entire life cycle. While the last t...

Citations

... The activation of these host defense mechanisms triggers complex immune signaling pathways, establishing lasting immunity against pathogens [4,6]. Extensive studies have analyzed the functional properties and specific signaling pathways of gene expression in mammals and their potential in biotherapeutic applications [7,8]. However, the search for homologous proteins in salmonids, particularly Salmo salar, underscores the imperative for a more comprehensive exploration of innate immune system components in this species. ...
Article
The innate immune response in Salmo salar, mediated by pattern recognition receptors (PRRs), is crucial for defending against pathogens. This study examined DDX41 protein functions as a cytosolic/nuclear sensor for cyclic dinucleotides, RNA, and DNA from invasive intracellular bacteria. The investigation determined the existence, conservation, and functional expression of the ddx41 gene in S. salar. In silico predictions and experimental validations identified a single ddx41 gene on chromosome 5 in S. salar, showing 83.92% homology with its human counterpart. Transcriptomic analysis in salmon head kidney confirmed gene transcriptional integrity. Proteomic identification through mass spectrometry characterized three unique peptides with 99.99% statistical confidence. Phylogenetic analysis demonstrated significant evolutionary conservation across species. Functional gene expression analysis in SHK-1 cells infected by Piscirickettsia salmonis and Renibacterium salmoninarum indicated significant upregulation of DDX41, correlated with increased proinflammatory cytokine levels and activation of irf3 and interferon signaling pathways. In vivo studies corroborated DDX41 activation in immune responses, particularly when S. salar was challenged with P. salmonis, underscoring its potential in enhancing disease resistance. This is the first study to identify the DDX41 pathway as a key component in S. salar innate immune response to invading pathogens, establishing a basis for future research in salmonid disease resistance.
... In animals and plants, different tissues and sexes differ in the degree of transcriptome conservation. For example, evolutionarily young genes are disproportionately expressed in the testis of nematodes, flies and mammals (Haerty et al. 2007; Kondo et al. 2017; Villanueva-Cañas et al. 2017; Rödelsperger et al. 2021), and in male reproductive cells in plants (Cui et al. 2015; Gossmann et al. 2016; Julca et al. 2021). To test whether these patterns also characterise ...
Preprint
Complex multicellularity has emerged independently across a few eukaryotic lineages and is often associated with the rise of elaborate, tightly coordinated developmental processes. How multicellularity and development are interconnected in evolution is a major question in biology. The hourglass model of embryonic evolution depicts how developmental processes are conserved during evolution, predicting morphological and molecular divergence in early and late embryo stages, bridged by a conserved mid-embryonic (phylotypic) period linked to the formation of the basic body plan. Initially found in animal embryos, molecular hourglass patterns have recently been proposed for land plants and fungi. However, whether the hourglass pattern is an intrinsic feature of all developmentally complex eukaryotic lineages remains elusive. Here, we tested the prevalence of a (molecular) hourglass in the brown algae, the third most developmentally complex lineage on earth that has evolved multicellularity independently from animals, fungi, and plants. By exploring the evolutionary transcriptome of brown algae with distinct morphological complexities, we uncovered an hourglass pattern during embryogenesis in developmentally complex species. Filamentous algae without a canonical embryogenesis display an evolutionary transcriptome that is most conserved in multicellular stages of the life cycle, whereas unicellular stages are more rapidly evolving. Our findings suggest that transcriptome conservation in brown algae is associated with cell differentiation stages, but not necessarily linked to embryogenesis. Together with previous work in animals, plants and fungi, we provide further evidence for the generality of a developmental hourglass pattern across complex multicellular eukaryotes.
... According to Sampaio et al. (2001) there is information that R. glutinis had been identified with the species R. kratochvilovae, showing that the potential for lipid accumulation variation could be lower between closely related species and strains of the same species. The present investigation postulates the concept of how orthologous genes could preserve their function for a long time (Villanueva-Cañas et al. 2017). The main cellular components found in CON-5 and POR-3 are distributed in genes of specific groups, which fulfil different functions (Figure 2). ...
Article
Genomes of oleaginous yeast strains Rhodotorula glutinis CON-5 and Rhodotorula kratochvilovae POR-3, isolated from areas in the northern Peruvian Andes using SPAdes, were sequenced and assembled applying Illumina and de novo. Genomes of 20,515,696 and 20,738,185 bp, respectively, were determined. From the structural and functional annotations, the Basidiomycota phylum showed a similarity of 76.8% and 86.5% with 6,976 and 8,124 pairs of proteins in both yeasts respectively, with homologues in the UniProt data bank. Using OrthoVenn, a relationship between both yeasts was obtained from 450 orthologous groups. Likewise, the above-mentioned yeasts and R. toruloides (oleaginous Basidiomycota) showed 1,574 orthologous groups, indicating a good relationship. Construction of phylogenetic trees of genes encoding metabolic enzymes was also carried out, based on the ITS sequences which showed that CON-5 and POR-3 have a greater relationship with R. graminis. Their phylogenetic relationship was ascertained and determined that the enzymes involved in the metabolism of CON-5 and POR-3 are related to each other. It was also found that the protein sequences of the Basidiomycota phylum differ from Ascomycota. The study showed functional evidence regarding the lipid accumulation phenotype, an important aspect in the context of obtaining lipids or oleochemicals.
... [5] This idea is supported by the observation that new genes showed exclusive or biased expression in the testis of both Drosophila [6,7] and mammals [8]. After the retention through positive selection, these testis-specific genes may eventually acquire expression and functions in other tissues and be incorporated into other biological processes. However, to our knowledge, a genome-wide analysis of the somatic expression of young duplicate genes throughout development and at single-cell resolution is still lacking. ...
Article
Gene duplication produces the material that fuels evolutionary innovation. The “out-of-testis” hypothesis suggests that sperm competition creates selective pressure encouraging the emergence of new genes in male germline, but the somatic expression and function of the newly evolved genes are not well understood. We systematically mapped the expression of young duplicate genes throughout development in Caenorhabditis elegans using both whole-organism and single-cell transcriptomic data. Based on the expression dynamics across developmental stages, young duplicate genes fall into three clusters that are preferentially expressed in early embryos, mid-stage embryos, and late-stage larvae. Early embryonic genes are involved in protein degradation and develop essentiality comparable to the genomic average. In mid-to-late embryos and L4-stage larvae, young genes are enriched in intestine, epidermal cells, coelomocytes, and amphid chemosensory neurons. Their molecular functions and inducible expression indicate potential roles in innate immune response and chemosensory perceptions, which may contribute to adaptation outside of the sperm.
... For example, their expression profile is narrow, most often in the testis in animals (e.g. Betrán & Long, 2003; Dai et al., 2006; Zhao et al., 2014; Luis Villanueva-Cañas et al., 2017). Without strong selection constraints, they potentially evolved to refine regulatory networks, and such a lack of constrained and conserved functions might facilitate the evolution of novel functions such as hybrid inviability. ...
Article
Genetic incompatibilities are widespread between species. However, it remains unclear whether they all originated after population divergence as suggested by the Bateson–Dobzhansky–Muller model, and if not, what is their prevalence and distribution within populations. The gene presence–absence variations (PAVs) provide an opportunity for investigating gene–gene incompatibility. Here, we searched for the repulsion of coexistence between gene PAVs to identify the negative interaction of gene functions separately in two Oryza sativa subspecies. Many PAVs are involved in subspecies‐specific negative epistasis and segregate at low‐to‐intermediate frequencies in focal subspecies but at low or high frequencies in the other subspecies. Incompatible PAVs are enriched in two functional groups, defense response and protein phosphorylation, which are associated with plant immunity and consistent with autoimmunity being a known mechanism of hybrid incompatibility in plants. Genes in the two enriched functional groups are older and seldom directly interact with each other. Instead, they interact with other younger gene PAVs with diverse functions. Our results illustrate the landscape of genetic incompatibility at gene PAVs in rice, where many incompatible pairs have already segregated as polymorphisms within subspecies, and many are novel negative interactions between older defense‐related genes and younger genes with diverse functions.
... New genes have frequently been associated with the evolution of taxon-specific traits and adaptations [1-3]. For some time after their emergence, new genes are restricted to a single lineage and are often referred to as "lineage-specific genes" (LSGs). Such young genes are typically expressed at lower levels than older genes, with a high proportion expressed in specific tissues and developmental stages. ...
... Such young genes are typically expressed at lower levels than older genes, with a high proportion expressed in specific tissues and developmental stages [2,4]. In animals, new genes are often testis-biased [5-9], and it has been suggested that LSGs play an important role in the developmental divergence between species [10,11]. The analysis of expression profiles throughout development in C. elegans found an enrichment of younger genes in late embryogenesis [11], while in slime molds LSGs were found predominantly biased during the middle stages of development. ...
Article
Lineage-specific genes can contribute to the emergence and evolution of novel traits and adaptations. Tardigrades are animals that have adapted to tolerate extreme conditions by undergoing a form of cryptobiosis called anhydrobiosis, a physical transformation to an inactive desiccated state. While studies to understand the genetics underlying the interspecies diversity in anhydrobiotic transitions have identified tardigrade-specific genes and family expansions involved in this process, the contributions of species-specific genes to the variation in tardigrade development and cryptobiosis are less clear. We used previously published transcriptomes throughout development and anhydrobiosis (5 embryonic stages, 7 juvenile stages, active adults, and tun adults) to assess the transcriptional biases of different classes of genes between 2 tardigrade species, Hypsibius exemplaris and Ramazzottius varieornatus. We also used the transcriptomes of 2 other tardigrades, Echiniscoides sigismundi and Richtersius coronifer, and data from 3 non-tardigrade species (Adineta vaga, Drosophila melanogaster, and Caenorhabditis elegans) to help identify lineage-specific genes. We found that lineage-specific genes have generally low and narrow expression but are enriched among biased genes in different stages of development depending on the species. Biased genes tend to be specific to early and late development, but there is little overlap in functional enrichment of biased genes between species. Gene expansions in the 2 tardigrades also involve families with different functions despite homologous genes being expressed during anhydrobiosis in both species. Our results demonstrate the interspecific variation in transcriptional contributions and biases of lineage-specific genes during development and anhydrobiosis in 2 tardigrades.
... The MUC1 gene first appeared in mammals [17]. Many of the genes that arose in early mammalian evolution encode proteins expressed in the mammary gland, skin and immune cells [18]. Other early mammalian genes are preferentially activated in testes, supporting the potential for sexual selection. ...
... Certain of the new mammalian protein-encoding genes emerged de novo from noncoding genomic regions [18]. MUC1 has no homology with other genes except for sequences encoding a ~122 aa sea urchin sperm protein-enterokinase-agrin (SEA) domain [17,21-23]. ...
Article
The mucin 1 (MUC1) gene was discovered based on its overexpression in human breast cancers. Subsequent work demonstrated that MUC1 is aberrantly expressed in cancers originating from other diverse organs, including skin and immune cells. These findings supported a role for MUC1 in the adaptation of barrier tissues to infection and environmental stress. Of fundamental importance for this evolutionary adaptation was inclusion of a SEA domain, which catalyzes autoproteolysis of the MUC1 protein and formation of a non-covalent heterodimeric complex. The resulting MUC1 heterodimer is poised at the apical cell membrane to respond to loss of homeostasis. Disruption of the complex releases the MUC1 N-terminal (MUC1-N) subunit into a protective mucous gel. Conversely, the transmembrane C-terminal (MUC1-C) subunit activates a program of lineage plasticity, epigenetic reprogramming and repair. This MUC1-C-activated program apparently evolved for barrier tissues to mount self-regulating proliferative, inflammatory and remodeling responses associated with wound healing. Emerging evidence indicates that MUC1-C underpins inflammatory adaptation of tissue stem cells and immune cells in the barrier niche. This review focuses on how prolonged activation of MUC1-C by chronic inflammation in these niches promotes the cancer stem cell (CSC) state by establishing auto-inductive nodes that drive self-renewal and tumorigenicity.
... Previous studies, including our own, have suggested that evolutionarily young genes tend to be expressed in the testis (Levine et al., 2006; Begun et al., 2007; Soumillon et al., 2013; Zhao et al., 2014; Luis Villanueva-Cañas et al., 2017; Witt et al., 2019), though other tissues, like the brain, can also give rise to evolutionarily young genes. In contrast, the utORFs identified here are expressed more in the brain than in the testis (Figure 5-figure supplement 1) in each inferred class. ...
... Moreover, while most annotated genes are highly expressed in the testis, most utORFs are not expressed at all in the testis, yet they are expressed in the brain (Figure 5-figure supplement 1). Considering that most previously reported de novo genes also tend to be expressed in the testis (Levine et al., 2006; Begun et al., 2007; Zhao et al., 2014; Luis Villanueva-Cañas et al., 2017; Witt et al., 2019), our results may suggest that the proportion of protein-coding de novo genes might be higher in the brain than in the testis or that our MS-first approach allows the detection of otherwise-missed protein-coding de novo genes. Furthermore, these utORFs may have biologically significant effects in the brain. ...
Article
De novo gene origination, where a previously nongenic genomic sequence becomes genic through evolution, is increasingly recognized as an important source of novelty. Many de novo genes have been proposed to be protein-coding, and a few have been experimentally shown to yield protein products. However, the systematic study of de novo proteins has been hampered by doubts regarding their translation without the experimental observation of protein products. Using a systematic, mass-spectrometry-first computational approach, we identify 993 unannotated open reading frames with evidence of translation (utORFs) in Drosophila melanogaster. To quantify the similarity of these utORFs across Drosophila and infer phylostratigraphic age, we develop a synteny-based protein similarity approach. Combining these results with reference datasets on tissue- and life stage-specific transcription and conservation, we identify different properties amongst these utORFs. Contrary to expectations, the fastest-evolving utORFs are not the youngest evolutionarily. We observed more utORFs in the brain than in the testis. Most of the identified utORFs may be of de novo origin, even accounting for the possibility of false-negative similarity detection. Finally, sequence divergence after an inferred de novo origin event remains substantial, suggesting that de novo proteins turn over frequently. Our results suggest that there is substantial unappreciated diversity in de novo protein evolution: many more may exist than previously appreciated; there may be divergent evolutionary trajectories, and they may be gained and lost frequently. All in all, there may not exist a single characteristic model of de novo protein evolution, but instead, there may be diverse evolutionary trajectories.
... In the osteichthyan lineage, sparc-L locally duplicated into sparc-L1 and sparc-L2 (Bertrand et al. 2013; Enault et al. 2018), and multiple duplications and losses at this locus also produced a broad array of scpp members (Kawasaki et al. 2005; Kawasaki 2009, 2011). Some of these duplications occurred concomitantly with the origin of mammals and are associated with evolutionary innovation, as is the case for the milk Caseins and the saliva Muc7 gene (Xu et al. 2016; Luis Villanueva-Cañas et al. 2017). Genome projects have identified scpp genes at the sparc-L1/sparc-L2 locus in all osteichthyan species examined to date, although with poor support of 1-to-1 orthology between teleosts and tetrapods, reflecting a high rate of sequence divergence and gene turnover (Kawasaki and Amemiya 2014; Qu et al. 2015; Braasch et al. 2016; Lin et al. 2016; Kawasaki et al. 2017; Lv et al. 2017; Cheng et al. 2021). ...
Article
In bony vertebrates, skeletal mineralization relies on the secretory calcium-binding phosphoproteins (Scpp) family whose members are acidic extracellular proteins posttranslationally regulated by the Fam20C kinase. As scpp genes are absent from the elephant shark genome, they are currently thought to be specific to bony fishes (osteichthyans). Here, we report a scpp gene present in elasmobranchs (sharks and rays) that evolved from local tandem duplication of sparc-L 5′ exons and show that both genes experienced recent gene conversion in sharks. The elasmobranch scpp is remarkably similar to the osteichthyan scpp members as they share syntenic and gene structure features, code for a conserved signal peptide, tyrosine-rich and aspartate/glutamate-rich regions, and harbor putative Fam20C phosphorylation sites. In addition, the catshark scpp is coexpressed with sparc-L and fam20C in tooth and scale ameloblasts, similarly to some osteichthyan scpp genes. Despite these strong similarities, molecular clock and phylogenetic data demonstrate that the elasmobranch scpp gene originated independently from the osteichthyan scpp gene family. Our study reveals convergent events at the sparc-L locus in the two sister clades of jawed vertebrates, leading to parallel diversification of the skeletal biomineralization toolkit. The molecular evolution of sparc-L and its coexpression with fam20C in catshark ameloblasts provides a unifying genetic basis that suggests that all convergent scpp duplicates inherited similar features from their sparc-L precursor. This conclusion supports a single origin for the hypermineralized outer odontode layer as produced by an ancestral developmental process performed by Sparc-L, implying the homology of the enamel and enameloid tissues in all vertebrates.
... One such example is DCAF16, a protein that interacts with CUL4 (ref. 27). As highlighted throughout this Review, several CUL4-DCAF complexes regulate critical events in embryogenesis (for example, CUL4-DCAF13 regulating oogenesis and zygotic gene expression, CUL4-DCAF2 regulating zygotic divisions, and others). ...
Article
Mammalian development demands precision. Millions of molecules must be properly located in temporal order, and their function regulated, to orchestrate important steps in cell cycle progression, apoptosis, migration and differentiation, to shape developing embryos. Ubiquitin and its associated enzymes act as cellular guardians to ensure precise spatio-temporal control of key molecules during each of these important cellular processes. Loss of precision results in numerous examples of embryological disorders or even cancer. This Review discusses the crucial roles of E3 ubiquitin ligases during key steps of early mammalian development and their roles in human disease, and considers how new methods to manipulate and exploit the ubiquitin regulatory machinery — for example, the development of molecular glues and PROTACs — might facilitate clinical therapy.