Article: Literature Review

Disentangling the origins of virophages and polintons

Abstract

Virophages and polintons are part of a complex system that also involves eukaryotes, giant viruses, and other viruses and transposable elements. Virophages are cosmopolitan, being found in environments ranging from the Amazon River to Antarctic hypersaline lakes, while polintons are found in many single-celled and multicellular eukaryotes. Virophages and polintons have a shared ancestry, but their exact origins are unknown and obscured by antiquity and extensive horizontal gene transfer (HGT). Paleovirology can help disentangle the complicated gene flow between these two groups and their giant viral and eukaryotic hosts. We outline the evidence and theoretical support for polintons being descended from viruses and not vice versa. To disentangle the natural history of polintons and virophages, we suggest that there is much to be gained by embracing rigorous metagenomics and evolutionary analyses. Methods from paleovirology will play a pivotal role in unravelling ancient relationships, HGT, and patterns of cross-species transmission.

... The function and abundance of MGEs such as viruses and TEs have been extensively reviewed, and we provide only a few highlights here. TEs are present in genomes across the tree of life (e.g., Kidwell and Lisch 2001; Suzuki and Bird 2008; Kejnovsky et al. 2012; Campbell et al. 2017) and can constitute more than half the genome of many eukaryotic lineages (e.g., Kazazian 2004; Fedoroff 2012; Song and Schaack 2018). Viruses are the most abundant biological entities on Earth (e.g., Edwards and Rohwer 2005; Koonin 2017), and, like TEs, they are able to integrate into eukaryotic genomes (Chalker and Yao 2011; Koonin 2017; Song and Schaack 2018). ...
... Similarly, rapid evolution and replication of viruses create an "arms race" with the host genomes that have evolved to eliminate them (e.g., Bruscella et al. 2017; Koonin and Krupovic 2018). Consequently, replication and mobilization of MGEs is a substantial source of genetic variation in eukaryotes, and these abilities allow MGEs to both resist elimination and create an immediate and lasting impact on host evolution (e.g., Kidwell and Lisch 2001; Schaack et al. 2010; Campbell et al. 2017; Koonin and Krupovic 2018). ...
... Epigenetic mechanisms are key to eukaryotic responses to MGEs (e.g., Levine et al. 2016; Campbell et al. 2017; Song and Schaack 2018; Parhad and Theurkauf 2019). In many cases, epigenetic responses protect the host's germline by limiting TE mobilization (Chung et al. 2008; Suzuki and Bird 2008; Parhad and Theurkauf 2019). ...
Article
Through analyses of diverse microeukaryotes, we have previously argued that eukaryotic genomes are dynamic systems that rely on epigenetic mechanisms to distinguish germline (i.e., DNA to be inherited) from soma (i.e., DNA that undergoes polyploidization, genome rearrangement, etc.), even in the context of a single nucleus. Here, we extend these arguments by including two well-documented observations: (1) eukaryotic genomes interact frequently with mobile genetic elements (MGEs) like viruses and transposable elements (TEs), creating genetic conflict, and (2) epigenetic mechanisms regulate MGEs. Synthesis of these ideas leads to the hypothesis that genetic conflict with MGEs contributed to the evolution of a dynamic eukaryotic genome in the last eukaryotic common ancestor (LECA), and may have contributed to eukaryogenesis (i.e., may have been a driver in the evolution of FECA, the first eukaryotic common ancestor). Sex (i.e., meiosis) may have evolved within the context of the development of germline-soma distinctions in LECA, as this process resets the germline genome by regulating/eliminating somatic (i.e., polyploid, rearranged) genetic material. Our synthesis of these ideas expands on hypotheses of the origin of eukaryotes by integrating the roles of MGEs and epigenetics.
... Integrated Mavirus virophages reactivate upon giant virus infection and confer protection against Cafeteria roenbergensis virus in the flagellate Cafeteria burkhardae (Fischer and Hackl, 2016). The virophage-first scenario therefore suggests that an ancestral virophage evolved shortly after the origin of NCLDVs, and was able to parasitise NCLDVs by virtue of their shared promoter and poly-A sequences (Campbell et al., 2017; Fischer and Suttle, 2011). One of these lineages of ancestral virophages would have gained an integrase and endogenised into eukaryotic hosts providing an immune defence system, which gave rise to Mavericks and other elements (Fischer and Suttle, 2011; Katzourakis and Aswad, 2014). ...
... The tree topologies that we estimated (Figure 2 and Figure 3) also suggest that the most recent common ancestor of Mavericks, adenoviruses and PLVs was not a virophage, which is at odds with a virophage-first scenario. Indeed, the basal position of virophages is inconsistent with the virophage-first hypothesis which proposes that protovirophages coevolved with NCLDVs, acquired an integrase and then gave rise to Mavericks and other elements in the lineage (Campbell et al., 2017; Fischer and Suttle, 2011). However, we found that the second-highest frequency root sampled in the Bayesian MCMC was on the branch of PLV BS_539 + NCLDVs (frequency = 27.4%). ...
Article
Full-text available
Bamfordviruses are arguably the most diverse group of viruses infecting eukaryotes. They include the Nucleocytoplasmic Large DNA viruses (NCLDVs), virophages, adenoviruses, Mavericks and Polinton-like viruses. Two main hypotheses for their origins have been proposed: the ‘nuclear-escape’ and ‘virophage-first’ hypotheses. The nuclear-escape hypothesis proposes an endogenous, Maverick-like ancestor which escaped from the nucleus and gave rise to adenoviruses and NCLDVs. In contrast, the virophage-first hypothesis proposes that NCLDVs coevolved with protovirophages; Mavericks then evolved from virophages that became endogenous, with adenoviruses escaping from the nucleus at a later stage. Here, we test the predictions made by both models and consider alternative evolutionary scenarios. We use a data set of the four core virion proteins sampled across the diversity of the lineage, together with Bayesian and maximum-likelihood hypothesis-testing methods, and estimate rooted phylogenies. We find strong evidence that adenoviruses and NCLDVs are not sister groups, and that Mavericks and Mavirus acquired the rve-integrase independently. We also found strong support for a monophyletic group of virophages (family Lavidaviridae) and a most likely root placed between virophages and the other lineages. Our observations support alternatives to the nuclear-escape scenario and a billion-year evolutionary arms race between virophages and NCLDVs.
... This hypothesis was put forward based on gene similarity network analyses in which polintons and some bacteriophages shared similarities between five core genes [39,67]. There is also a competing scenario in which polintons originated from inserted virophages [68,69], which are small virus-like particles that infect the largest NCLDVs [67]. The first hypothesis is supported by the observable similarities in protein sets at the sequence or structural level [39,67]. ...
... The combination of shared gene order and primary sequence similarities between the mavirus virophage and a polinton from the slime mould Polysphondylium pallidum is consistent with the latter hypothesis [68,76]. In addition, some virophages have been demonstrated to insert into the genome of eukaryotes infected by NCLDVs and to control the transcription levels of the NCLDVs. By doing so, they improve the survival of the eukaryotic host from giant virus infections at the population level, and hence have the potential to evolve into transposable elements [68,69]. ...
Article
Full-text available
The nucleocytoplasmic large DNA viruses (NCLDVs) are a diverse group that currently includes the viruses with the largest known virions and genomes, also called giant viruses. The first giant virus was isolated and described nearly 20 years ago. Its genome was larger than that of any other known virus at the time, and it contained a number of genes that had not been previously described in any virus. The origin and evolution of these unusually complex viruses have been puzzling, and various mechanisms have been put forward to explain how some NCLDVs could have reached genome sizes and coding capacity overlapping with those of cellular microbes. Here we critically discuss the evidence and arguments on this topic. We have also updated and systematically reanalysed protein families of the NCLDVs to further study their origin and evolution. Our analyses further highlight the small number of widely shared genes and extreme genomic plasticity among NCLDVs that are shaped via combinations of gene duplications, deletions, lateral gene transfers and de novo creation of protein-coding genes. The dramatic expansions of the genome size and protein-coding gene capacity characteristic of some NCLDVs are now increasingly understood to be driven by environmental factors rather than reflecting relationships to an ancient common ancestor among a hypothetical cellular lineage. Thus, the evolution of NCLDVs is writ large viral, and their origin, like all other viral lineages, remains unknown.
... Eukaryotic TEs have immense diversity, but can be classified into ∼7 major orders: LTR, LINE, and tyrosine-recombinase (YR) retrotransposons, and DDE transposons, cryptons, helitrons, and polintons [41]. Three of them (YR retrotransposons, cryptons, polintons) were long thought absent in mammals [42], [43]. Recently, relics of non-autonomous cryptons were detected in the human genome [44], and one polinton-related element was found in chromosome 7 [45]. ...
Article
Full-text available
Protein fossils, i.e. noncoding DNA descended from coding DNA, arise frequently from transposable elements (TEs), decayed genes, and viral integrations. They can reveal, and mislead about, evolutionary history and relationships. They have been detected by comparing DNA to protein sequences, but current methods are not optimized for this task. We describe a powerful DNA-protein homology search method. We use a 64×21 substitution matrix, which is fitted to sequence data, automatically learning the genetic code. We detect subtly homologous regions by considering alternative possible alignments between them, and calculate significance (probability of occurring by chance between random sequences). Our method detects TE protein fossils much more sensitively than blastx, and > 10× faster. Of the ~7 major categories of eukaryotic TE, three were long thought absent in mammals: we find two of them in the human genome, polinton and DIRS/Ngaro. This method increases our power to find ancient fossils, and perhaps to detect non-standard genetic codes. The alternative-alignments and significance paradigm is not specific to DNA-protein comparison, and could benefit homology search generally. This is an extended version of a conference paper [1].
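As a rough illustration of what a 64×21 codon-versus-amino-acid scoring table looks like, the following Python sketch builds a toy matrix and scores an ungapped, in-frame comparison. It is not the method described in the abstract: there the matrix is fitted to sequence data, whereas this sketch hard-codes the standard genetic code and uses arbitrary match and mismatch scores. The names `toy_matrix` and `score_ungapped` and the score values are assumptions made for this example only.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order; '*' marks stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]  # 64 rows
SYMBOLS = "ACDEFGHIKLMNPQRSTVWY*"  # 20 amino acids plus stop = 21 columns


def toy_matrix(match=2, mismatch=-1):
    """Build a 64x21 table: `match` where the codon encodes the symbol, else `mismatch`.

    Hypothetical stand-in for a fitted matrix; the scores are arbitrary.
    """
    translate = dict(zip(CODONS, AMINO))
    return {
        codon: {s: (match if translate[codon] == s else mismatch) for s in SYMBOLS}
        for codon in CODONS
    }


def score_ungapped(dna, protein, matrix):
    """Sum codon-vs-residue scores for an in-frame, gap-free comparison."""
    codons = [dna[i:i + 3] for i in range(0, len(dna) - 2, 3)]
    return sum(matrix[c][a] for c, a in zip(codons, protein))


matrix = toy_matrix()
print(score_ungapped("ATGGCC", "MA", matrix))
```

A fitted matrix would replace the two fixed scores with values estimated from aligned data, which is how such a table could pick up the genetic code without being told it; the sketch also ignores frameshifts, gaps, and significance estimation.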
... Comparison of different phylogenetic hypotheses using the sequences of the core viral proteins further suggests that virophages evolved to parasitise NCLDVs shortly after the origin of the NCLDV ancestor [58]. This is consistent with the close correspondence between the promoter and poly-A sequences of virophages and NCLDVs, which would have been inherited from a recent common ancestor, with this specificity maintained throughout evolutionary time [64,65]. This lineage of primitive virophages would have further diversified into the modern virophages (lavidaviruses), Mavericks, Polinton-like viruses, cytoplasmic linear plasmids, and adenoviruses [58]. ...
Article
Paleovirology is the study of ancient viruses and how they have coevolved with their hosts. An increasingly detailed understanding of the diversity, origins, and evolution of the DNA viruses of eukaryotes has been obtained through the lens of paleovirology in recent years. Members of multiple viral families have been found integrated in the genomes of eukaryotes, providing a rich fossil record to study. These elements have extended our knowledge of exogenous viral diversity, host ranges, and the timing of viral evolution, and are revealing the existence of entire new families of eukaryotic integrating dsDNA viruses and transposons. Future work in paleovirology will continue to provide insights into antiviral immunity, viral diversity, and potential applications, and reveal other secrets of the viral world.
... On the other hand, Polintons (aka Mavericks; Pritham et al., 2007) share several structural characteristics and an evolutionary ancestry with virophages, which are small (15-25 kb) dsDNA viruses that infect giant virus replication factories and may limit the replication of their viral host (La Scola et al., 2008; Mougari et al., 2019). Several analyses suggest that virophages are the progenitors of Polintons (Fischer and Suttle, 2011; Krupovic et al., 2014; Campbell et al., 2017). Polintons would then constitute a case of endogenization of viral elements. ...
Article
Full-text available
Giant viruses of amoebas, recently classified in the class Megaviricetes, are a group of viruses that can infect major eukaryotic lineages. We previously identified a set of giant virus sequences in the genome of Phytophthora parasitica, an oomycete and a devastating major plant pathogen. How viral insertions shape the structure and evolution of the invaded genomes is unclear, but it is known that the unprecedented functional potential of giant viruses is the result of an intense genetic interplay with their hosts. Here, we show that viral pieces are found in a 550-kb locus and are organized in three main clusters. Viral sequences, namely RNA polymerases I and II and a major capsid protein, were identified, along with orphan sequences, as a hallmark of giant virus insertions. Mining of public databases and phylogenetic reconstructions suggest an ancient association of oomycetes and giant viruses of amoeba, including faustoviruses, African swine fever virus (ASFV) and pandoraviruses, and that a single viral insertion occurred early in the evolutionary history of oomycetes prior to the Phytophthora–Pythium radiation, estimated at ∼80 million years ago. Functional annotation reveals that the viral insertions are located in a gene-sparse region of the Phytophthora genome, characterized by a plethora of transposable elements (TEs), effectors and other genes potentially involved in virulence. Transcription of viral genes was investigated through analysis of RNA-Seq data and qPCR experiments. We show that most viral genes are not expressed, and that a variety of mechanisms, including deletions, TE insertions and RNA interference, may contribute to transcriptional repression. However, a gene encoding a truncated copy of RNA polymerase II, along with a set of neighboring sequences, has been shown to be expressed in a wide range of physiological conditions, including responses to stress. These results, which describe for the first time the endogenization of a giant virus in an oomycete, help challenge our view of Phytophthora evolution.
... Eukaryotic TEs have immense diversity, but can be classified into ∼7 major orders: LTR, LINE, and tyrosine-recombinase (YR) retrotransposons, and DDE transposons, cryptons, helitrons, and polintons [36]. Three of them (YR retrotransposons, cryptons, polintons) have not been found in mammals [24,3]. ...
Preprint
Full-text available
Protein fossils, i.e. noncoding DNA descended from coding DNA, arise frequently from transposable elements (TEs), decayed genes, and viral integrations. They can reveal, and mislead about, evolutionary history and relationships. They have been detected by comparing DNA to protein sequences, but current methods are not optimized for this task. We describe a powerful DNA-protein homology search method. We use a 64x21 substitution matrix, which is fitted to sequence data, automatically learning the genetic code. We detect subtly homologous regions by considering alternative possible alignments between them, and calculate significance (probability of occurring by chance between random sequences). Our method detects TE protein fossils much more sensitively than blastx, and > 10x faster. Of the ~7 major categories of eukaryotic TE, three have not been found in mammals: we find two of them in the human genome, polinton and DIRS/Ngaro. This method increases our power to find ancient fossils, and perhaps to detect non-standard genetic codes. The alternative-alignments and significance paradigm is not specific to DNA-protein comparison, and could benefit homology search generally.
... Therefore, it has been proposed that the genome of these virophages evolved through a recombination between a viral form of Polintons and a virophage ancestor that co-infected an amoeba infected with a mimivirus. Another scenario is that Polintons evolved from viruses (and not vice versa) and that virophages are not descendants of Polintons but share the same ancestor virus, whose nature and identity are undetermined [111]. ...
Article
Full-text available
The last decade has been marked by two eminent discoveries that have changed our perception of the virology field: the discovery of giant viruses and of a distinct new class of viral agents that parasitize their viral factories, the virophages. Coculture and metagenomics have actively contributed to the expansion of the virophage family by isolating dozens of new members. This increase in the body of data on virophages revealed not only the diversity of the virophage group, but also the relevant ecological impact of these small viruses and their potential role in the dynamics of the microbial network. In addition, the isolation of virophages has led us to discover previously unknown features displayed by their host viruses and cells. In this review, we present an update of all the knowledge on the isolation, biology, genomics, and morphological features of the virophages, a decade after the discovery of their first member, the Sputnik virophage. We discuss their parasitic lifestyle as bona fide viruses of the giant virus factories, genetic parasites of their genomes, and then their role as a key component or target for some host defense mechanisms during the tripartite virophage–giant virus–host cell interaction. We also present the latest advances regarding their origin, classification, and definition that have been widely discussed.
... Altogether, a rich network of mobile genetic elements contributes to the host-virus coevolution and interviral gene transfer [78]. Virophages and other mobile elements could facilitate gene transfer, thereby having the potential to shape the genomes of giant viruses and impact their diversity [21,79,80]. ...
Article
Full-text available
Viruses are the most prevalent infectious agents, populating almost every ecosystem on earth. Most viruses carry only a handful of genes supporting their replication and the production of capsids. It came as a great surprise in 2003 when the first giant virus was discovered and found to have a >1 Mbp genome encoding almost a thousand proteins. Following this first discovery, dozens of giant virus strains across several viral families have been reported. Here, we provide an updated quantitative and qualitative view on giant viruses and elaborate on their shared and variable features. We review the complexity of giant viral proteomes, which include functions traditionally associated only with cellular organisms. These unprecedented functions include components of the translation machinery, DNA maintenance, and metabolic enzymes. We discuss the possible underlying evolutionary processes and mechanisms that might have shaped the diversity of giant viruses and their genomes, highlighting their remarkable capacity to hijack genes and genomic sequences from their hosts and environments. This leads us to examine prominent theories regarding the origin of giant viruses. Finally, we present the emerging ecological view of giant viruses, found across widespread habitats and ecological systems, with respect to the environment and human health.
Article
Full-text available
High diversity and differential evolution profiles have been observed for DD34E/Tc1 transposons; several families originating from these groups, such as DD34E/ZB, DD34E/SB, DD35E/TR, DD36E/IC, and DD38E/IT, have been well defined. Even though Frisky, Tiang, Tsessebe, and Topi transposons have been identified in Anopheles gambiae, their taxonomic distribution and phylogenetic relationship in nature remain largely unknown. The evolutionary profiles of transposons homologous to Frisky, Tiang, Tsessebe, and Topi were investigated in the current study. In total, 254 transposons homologous to Frisky, Tiang, Hob, Tsessebe, and Topi were obtained from 200 species by data mining. The phylogenetic tree revealed that these transposons were classified into five main clades (Frisky, Tiang, Hob, Tsessebe, and Topi) forming a monophyletic clade with 98% bootstrap support, belonging to the DD34E/Tc1 group, and named Skipper (SK). SK transposons show a wide distribution in animals; however, differential taxonomic distribution patterns were observed for the subfamilies of Frisky, Tiang, Hob, Tsessebe, and Topi; extensive invasion of Frisky in animals was found, whereas Tiang, Hob, Tsessebe, and Topi were mainly detected in Diptera. SK elements share a similar structural organization and display high sequence identities across subfamilies. Evolutionary dynamics and structural analysis revealed that SKs in some species, such as Bombyx mori, Lordiphosa magnipectinata, Carassius gibelio, Triplophysa dalaica, and Silurus glanis, have recently evolved and are present as intact copies, indicating that SKs in these genomes may be active. Together, these observations improve our understanding of the diversity of DD34E/Tc1 transposons and their impacts on genome evolution in animals.
Article
Bamfordviruses are arguably the most diverse group of viruses infecting eukaryotes. They include the Nucleocytoplasmic Large DNA viruses (NCLDVs), virophages, adenoviruses, Mavericks and Polinton-like viruses. Two main hypotheses for their origins have been proposed: the ‘nuclear-escape’ and ‘virophage-first’ hypotheses. The nuclear-escape hypothesis proposes an endogenous, Maverick-like ancestor which escaped from the nucleus and gave rise to adenoviruses and NCLDVs. In contrast, the virophage-first hypothesis proposes that NCLDVs coevolved with protovirophages; Mavericks then evolved from virophages that became endogenous, with adenoviruses escaping from the nucleus at a later stage. Here, we test the predictions made by both models and consider alternative evolutionary scenarios. We use a data set of the four core virion proteins sampled across the diversity of the lineage, together with Bayesian and maximum-likelihood hypothesis-testing methods, and estimate rooted phylogenies. We find strong evidence that adenoviruses and NCLDVs are not sister groups, and that Mavericks and Mavirus acquired the rve-integrase independently. We also found strong support for a monophyletic group of virophages (family Lavidaviridae) and a most likely root placed between virophages and the other lineages. Our observations support alternatives to the nuclear-escape scenario and a billion years evolutionary arms-race between virophages and NCLDVs.
Article
Transposable elements (TEs) are ubiquitous genetic elements, able to jump from one location of the genome to another, in all organisms. For this reason, on the one hand, TEs can induce deleterious mutations, causing dysfunction, disease and even lethality in individuals. On the other hand, TEs can increase genetic variability, making populations better equipped to respond adaptively to environmental change. To counteract the deleterious effects of TEs, organisms have evolved strategies to avoid their activation. However, their mobilization does occur. Usually, TEs are maintained silent through several mechanisms, but they can be reactivated during certain developmental windows. Moreover, TEs can become de-repressed because of drastic changes in the external environment. Here, we describe the ‘double life’ of TEs, being both ‘parasites’ and ‘symbionts’ of the genome. We also argue that the transposition of TEs contributes to two important evolutionary processes: the temporal dynamic of evolution and the induction of genetic variability. Finally, we discuss how the interplay between two TE-dependent phenomena, insertional mutagenesis and epigenetic plasticity, plays a role in the process of evolution.
Article
The discovery of virophages provides evidence for a new biological control agent in the biosphere. A virophage is a parasite of a giant virus: it hijacks the giant virus's viral factory, the machinery essential for giant virus replication, which leads to a sharp increase in the virophage load inside the host cell. Virophage co-infection improves host cell survival against the invading giant virus, as shown by the reduced number of cells destroyed during the lytic stage. A virophage plays a role similar to that of a bacteriophage, but instead of targeting a bacterium it specifically targets a virus. The existence of human-borne virophages and the interactions of virophages with the human microbiome remain elusive, so further studies are required. This short review highlights the discovery, types and recently described replication methods of virophages. We also add some biological perspectives on the connections and interactions between the virophage and its host, to explore the virophage's main role as a biocontrol agent against pathogenic viruses, a role that is potentially beneficial for human life.
Chapter
Protein fossils, i.e. noncoding DNA descended from coding DNA, arise frequently from transposable elements (TEs), decayed genes, and viral integrations. They can reveal, and mislead about, evolutionary history and relationships. They have been detected by comparing DNA to protein sequences, but current methods are not optimized for this task. We describe a powerful DNA-protein homology search method. We use a 64 × 21 substitution matrix, which is fitted to sequence data, automatically learning the genetic code. We detect subtly homologous regions by considering alternative possible alignments between them, and calculate significance (probability of occurring by chance between random sequences). Our method detects TE protein fossils much more sensitively than blastx, and >10× faster. Of the ~7 major categories of eukaryotic TE, three were long thought absent in mammals: we find two of them in the human genome, polinton and DIRS/Ngaro. This method increases our power to find ancient fossils, and perhaps to detect non-standard genetic codes. The alternative-alignments and significance paradigm is not specific to DNA-protein comparison, and could benefit homology search generally.
Article
eLife digest: Viruses have been with us for billions of years, and exist everywhere in nature that life is found. Viruses therefore have had a significant impact on the evolution of all organisms, from bacteria to humans. Unfortunately, viruses do not leave fossils, and so we know very little about how viruses originate and evolve over time. Fortunately, over the course of millions of years, genetic sequences from the viruses accumulate in the DNA genomes of living organisms (including humans). These sequences can serve as molecular “fossils” for exploring the natural history of viruses and their hosts. Diehl et al. have now searched the genomes of 50 modern mammals for “fossil” viral remnants of an ancient group of viruses known as ERV-Fc. This revealed that ERV-Fc viruses infected the ancestors of at least 28 of these mammal species between 15 million and 30 million years ago. The viruses affected a diverse range of hosts, including carnivores, rodents and primates. The distribution of ERV-Fc among different mammals indicates that the viruses spread to every continent except Antarctica and Australia, and that they jumped between species more than 20 times. Diehl et al. also pinpointed patterns of evolutionary change in the genes of the ERV-Fc viruses that reflect how the viruses adapted to different host mammals. As part of this process, the viruses often exchanged genes with each other and with other types of viruses. Such genetic recombination is likely to have played a significant role in the evolutionary success of the ERV-Fc viruses. Mammalian genomes contain hundreds of thousands of ancient viral fossils similar to ERV-Fc. Future work could study these to improve our understanding of when and why new viruses emerge and how long-term contact with viruses affects the evolution of their host organisms. DOI: http://dx.doi.org/10.7554/eLife.12704.002
Article
Very little is known about the ancient origin of retroviruses, but owing to the discovery of their ancient endogenous viral counterparts, their early history is beginning to unfold. Here we report 36 lineages of basal amphibian and fish foamy-like endogenous retroviruses (FLERVs). Phylogenetic analyses reveal that ray-finned fish FLERVs exhibit an overall co-speciation pattern with their hosts, while amphibian FLERVs might not. We also observe several possible ancient viral cross-class transmissions, involving lobe-finned fish, shark and frog FLERVs. Sequence examination and analyses reveal two major lineages of ray-finned fish FLERVs, one of which had gained two novel accessory genes within their extraordinarily large genomes. Our phylogenetic analyses suggest that this major retroviral lineage, and therefore retroviruses as a whole, have an ancient marine origin and originated together with, if not before, their jawed vertebrate hosts >450 million years ago in the Ordovician period, early Palaeozoic Era.
Article
Virophages are a unique group of circular double-stranded DNA viruses that are considered parasites of giant DNA viruses, which in turn are known to infect eukaryotic hosts. In this study, the genomes of three novel Yellowstone Lake virophages (YSLVs), YSLV5, YSLV6, and YSLV7, were identified from Yellowstone Lake through metagenomic analyses. The relative abundance of these three novel virophages and previously identified Yellowstone Lake virophages YSLV1 to -4 were determined in different locations of the lake, revealing that most of the sampled locations in the lake, including both mesophilic and thermophilic habitats, had multiple virophage genotypes. This likely reflects the diverse habitats or diversity of the eukaryotic hosts and their associated giant viruses that serve as putative hosts for these virophages. YSLV5 has a 29,767-bp genome with 32 predicted open reading frames (ORFs), YSLV6 has a 24,837-bp genome with 29 predicted ORFs, and YSLV7 has a 23,193-bp genome with 26 predicted ORFs. Based on multilocus phylogenetic analysis, YSLV6 shows a close evolutionary relationship with YSLV1 to -4, whereas YSLV5 and YSLV7 are distantly related to the others, and YSLV7 represents the fourth novel virophage lineage. In addition, the genome of YSLV5 has a G+C content of 51.1% that is much higher than all other known virophages, indicating a unique host range for YSLV5. These results suggest that virophages are abundant and have diverse genotypes that likely mirror diverse giant viral and eukaryotic hosts within the Yellowstone Lake ecosystem.
Importance: This study discovered novel virophages present within the Yellowstone Lake ecosystem using a conserved major capsid protein as a phylogenetic anchor for assembly of sequence reads from Yellowstone Lake metagenomic samples. The three novel virophage genomes (YSLV5 to -7) were completed by identifying specific environmental samples containing these respective virophages, and closing gaps by targeted PCR and sequencing. Most of the YSLV genotypes were associated primarily with photic-zone and nonhydrothermal samples; however, YSLV5 had a unique distribution with an occurrence in vent samples similar to that in photic-zone samples and with a higher GC content that suggests a distinct host and habitat compared to other YSLVs. In addition, genome content and phylogenetic analyses indicate that YSLV5 and YSLV7 are distinct from known virophages and that additional as-yet-uncharacterized virophages are likely present within the Yellowstone Lake ecosystem.
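The G+C content quoted in this abstract is the percentage of guanine and cytosine bases in a genome. As a minimal sketch, it could be computed as follows; the function name `gc_content` and the toy sequence are illustrative assumptions, not taken from the study, and ambiguous bases such as N are simply skipped here.

```python
def gc_content(sequence: str) -> float:
    """Return the G+C percentage of a DNA sequence.

    Only unambiguous bases (A, C, G, T) are counted; anything else
    (for example N) is ignored. This is an illustrative choice.
    """
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for base in bases if base in "GC")
    return 100.0 * gc / len(bases)


if __name__ == "__main__":
    toy_sequence = "ATGCGCGCTTAACCGG"  # hypothetical 16-base example
    print(f"G+C content: {gc_content(toy_sequence):.1f}%")
```

Applied to a complete genome sequence, the same calculation gives the kind of figure reported for YSLV5.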
Article
Background: The known plant viruses mostly infect angiosperm hosts and have RNA or small DNA genomes. The only other lineage of green plants with a relatively well-studied virome, unicellular chlorophyte algae, is mostly infected by viruses with large DNA genomes. Thus RNA viruses and small DNA viruses seem to completely displace large DNA virus genomes in late branching angiosperms. To understand better the expansion of RNA viruses in the taxonomic span between algae and angiosperms, we analyzed the transcriptomes of 66 non-angiosperm plants characterized by the 1000 Plants Genomes Project. Results: We found homologs of virus RNA-dependent RNA polymerases in 28 non-angiosperm plant species, including algae, mosses, liverworts (Marchantiophyta), hornworts (Anthocerotophyta), lycophytes, a horsetail Equisetum, and gymnosperms. Polymerase genes in algae were most closely related to homologs from double-stranded RNA viruses leading latent or persistent lifestyles. Land plants, in addition, contained polymerases close to the homologs from single-stranded RNA viruses of angiosperms, capable of productive infection and systemic spread. For several polymerases, a cognate capsid protein was found in the same library. Another virus hallmark gene family, encoding the 30 K movement proteins, was found in lycophytes and monilophytes but not in mosses or algae. Conclusions: The broadened repertoire of RNA viruses suggests that colonization of land and growth in anatomical complexity in land plants coincided with the acquisition of novel sets of viruses with different strategies of infection and reproduction.
Article
Virus genomes are prone to extensive gene loss, gain, and exchange and share no universal genes. Therefore, in a broad-scale study of virus evolution, gene and genome network analyses can complement traditional phylogenetics. We performed an exhaustive comparative analysis of the genomes of double-stranded DNA (dsDNA) viruses by using the bipartite network approach and found a robust hierarchical modularity in the dsDNA virosphere. Bipartite networks consist of two classes of nodes, with nodes in one class, in this case genomes, being connected via nodes of the second class, in this case genes. Such a network can be partitioned into modules that combine nodes from both classes. The bipartite network of dsDNA viruses includes 19 modules that form 5 major and 3 minor supermodules. Of these modules, 11 include tailed bacteriophages, reflecting the diversity of this largest group of viruses. The module analysis quantitatively validates and refines previously proposed nontrivial evolutionary relationships. An expansive supermodule combines the large and giant viruses of the putative order "Megavirales" with diverse moderate-sized viruses and related mobile elements. All viruses in this supermodule share a distinct morphogenetic tool kit with a double jelly roll major capsid protein. Herpesviruses and tailed bacteriophages comprise another supermodule, held together by a distinct set of morphogenetic proteins centered on the HK97-like major capsid protein. Together, these two supermodules cover the great majority of currently known dsDNA viruses. We formally identify a set of 14 viral hallmark genes that comprise the hubs of the network and account for most of the intermodule connections.
Importance: Viruses and related mobile genetic elements are the dominant biological entities on earth, but their evolution is not sufficiently understood and their classification is not adequately developed. The key reason is the characteristic high rate of virus evolution that involves not only sequence change but also extensive gene loss, gain, and exchange. Therefore, in the study of virus evolution on a large scale, traditional phylogenetic approaches have limited applicability and have to be complemented by gene and genome network analyses. We applied state-of-the-art methods of such analysis to reveal robust hierarchical modularity in the genomes of double-stranded DNA viruses. Some of the identified modules combine highly diverse viruses infecting bacteria, archaea, and eukaryotes, in support of previous hypotheses on direct evolutionary relationships between viruses from the three domains of cellular life. We formally identify a set of 14 viral hallmark genes that hold together the genomic network.
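The bipartite structure described in this abstract, with genomes linked only through the genes they carry, can be illustrated with a small sketch. Everything below is a toy under stated assumptions: the genome and gene names are hypothetical, and the "hub" rule (a gene present in at least `min_genomes` genomes) is a simple stand-in, not the module-detection method used in the study.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical genome -> gene-family assignments (illustrative only).
GENOME_GENES = {
    "genome_A": {"capsid", "atpase", "polymerase"},
    "genome_B": {"capsid", "atpase"},
    "genome_C": {"capsid", "integrase"},
    "genome_D": {"terminase"},
}


def gene_to_genomes(genome_genes):
    """Invert the mapping: for each gene node, list the genome nodes it connects."""
    index = defaultdict(set)
    for genome, genes in genome_genes.items():
        for gene in genes:
            index[gene].add(genome)
    return dict(index)


def shared_gene_counts(genome_genes):
    """Project the bipartite graph onto genomes: count genes shared by each pair."""
    counts = {}
    for first, second in combinations(sorted(genome_genes), 2):
        shared = genome_genes[first] & genome_genes[second]
        if shared:
            counts[(first, second)] = len(shared)
    return counts


def hub_genes(genome_genes, min_genomes=3):
    """Return genes present in at least `min_genomes` genomes (toy hub criterion)."""
    index = gene_to_genomes(genome_genes)
    return sorted(gene for gene, members in index.items() if len(members) >= min_genomes)


if __name__ == "__main__":
    print(shared_gene_counts(GENOME_GENES))
    print(hub_genes(GENOME_GENES))
```

In this toy data only the capsid gene links three genomes, which loosely mirrors how a small set of hallmark genes can account for most connections between otherwise distinct groups.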
Article
Virophages are parasites of giant viruses that infect eukaryotic organisms and may affect the ecology of inland water ecosystems. Despite the potential ecological impact, limited information is available on the distribution, diversity, and hosts of virophages in ecosystems. Metagenomics revealed that virophages were widely distributed in inland waters with various environmental characteristics including salinity and nutrient availability. A novel virophage population was overrepresented in a planktonic microbial community of the Tibetan mountain lake, Lake Qinghai. Our study identified coccolithophores and coccolithovirus-like phycodnaviruses in the same community, which may serve as eukaryotic and viral hosts of the virophage population, respectively.
Article
Virophages are small double-stranded DNA viruses that are parasites of giant DNA viruses that infect unicellular eukaryotes. Here we identify a novel group of virophages, named Dishui Lake virophages (DSLVs), that were discovered in Dishui Lake (DSL), an artificial freshwater lake in Shanghai, China. Based on PCR and metagenomic analysis, the complete genome of DSLV1 was found to be circular and 28,788 base pairs in length, with a G+C content of 43.2% and 28 predicted open reading frames (ORFs). Fifteen of the DSLV1 ORFs have sequence similarity to known virophages. Two DSLV1 ORFs exhibited sequence similarity to those of prasinoviruses (Phycodnaviridae) and chloroviruses (Phycodnaviridae), respectively, suggesting that horizontal gene transfer occurred between these large algal DNA viruses and DSLV1. Forty-six other virophage-related contigs were also obtained, including six homologous major capsid protein (MCP) genes. Phylogenetic analysis of these MCPs showed that DSLVs are closely related to OLV (Organic Lake virophage) and YSLVs (Yellowstone Lake virophages), especially to YSLV3, except for YSLV7. These results indicate that freshwater ecotopes are a hotbed for discovering novel virophages as well as for understanding their diversity and properties.
Article
Endogenous retroviruses (ERVs) represent past retroviral infections and accordingly can provide an ideal framework to infer virus-host interaction over their evolutionary history. In this study, we target high quality Pol sequences from 7,994 Class I and 8,119 Class II ERVs from 69 mammalian genomes and surprisingly find that retroviruses harbored by bats and rodents combined occupy the major phylogenetic diversity of both classes. By analyzing transmission patterns of 30 well-defined ERV clades, we corroborate the previously published observation that rodents are more competent as originators of mammalian retroviruses and reveal that bats are more capable of receiving retroviruses from non-bat mammalian origins. The powerful retroviral hosting ability of bats is further supported by a detailed analysis revealing that the novel bat gammaretrovirus, Rhinolophus ferrumequinum retrovirus, likely originated from tree shrews. Taken together, this study advances our understanding of host-shaped mammalian retroviral evolution in general.
Article
Background: The rapidly growing metagenomic databases provide increasing opportunities for computational discovery of new groups of organisms. Identification of new viruses is particularly straightforward given the comparatively small size of viral genomes, although fast evolution of viruses complicates the analysis of novel sequences. Here we report the metagenomic discovery of a distinct group of diverse viruses that are distantly related to the eukaryotic virus-like transposons of the Polinton superfamily. Results: The sequence of the putative major capsid protein (MCP) of the unusual linear virophage associated with Phaeocystis globosa virus (PgVV) was used as a bait to identify potential related viruses in metagenomic databases. Assembly of the contigs encoding the PgVV MCP homologs followed by comprehensive sequence analysis of the proteins encoded in these contigs resulted in the identification of a large group of Polinton-like viruses (PLV) that resemble Polintons (polintoviruses) and virophages in genome size, and share with them a conserved minimal morphogenetic module that consists of major and minor capsid proteins and the packaging ATPase. With a single exception, the PLV lack the retrovirus-type integrase that is encoded in the genomes of all Polintons and the Mavirus group of virophages. However, some PLV encode a newly identified tyrosine recombinase-integrase that is common in bacteria and bacteriophages and is also found in the Organic Lake virophage group. Although several PLV genomes and individual genes are integrated into algal genomes, it appears likely that most of the PLV are viruses. Given the absence of protease and retrovirus-type integrase, the PLV could resemble the ancestral polintoviruses that evolved from bacterial tectiviruses. Apart from the conserved minimal morphogenetic module, the PLV widely differ in their genome complements but share a gene network with Polintons and virophages, suggestive of multiple gene exchanges within a shared gene pool. Conclusions: The discovery of PLV substantially expands the emerging class of eukaryotic viruses and transposons that also includes Polintons and virophages. This class of selfish elements is extremely widespread and might have been a hotbed of eukaryotic virus, transposon and plasmid evolution. New families of these elements are expected to be discovered.
Article
Satellite viruses encode structural proteins required for the formation of infectious particles but depend on helper viruses for completing their replication cycles. Because of this unique property, satellite viruses that infect plants, arthropods, or mammals, as well as the more recently discovered satellite-like viruses that infect protists (virophages), have been grouped with other, so-called “sub-viral agents.” For the most part, satellite viruses are therefore not classified. We argue that possession of a coat-protein-encoding gene and the ability to form virions are the defining features of a bona fide virus. Accordingly, all satellite viruses and virophages should be consistently classified within appropriate taxa. We propose to create four new genera (Albetovirus, Aumaivirus, Papanivirus, and Virtovirus) for positive-sense single-stranded (+) RNA satellite viruses that infect plants and the family Sarthroviridae, including the genus Macronovirus, for (+)RNA satellite viruses that infect arthropods. For double-stranded DNA virophages, we propose to establish the family Lavidaviridae, including two genera, Sputnikvirus and Mavirus.
Article
Horizontal gene transfer from retroviruses to mammals is well documented and extensive, but is rare between unrelated viruses with distinct genome types. Three herpesviruses encode a gene with similarity to a retroviral superantigen gene (sag) of the unrelated mouse mammary tumour virus (MMTV). We uncover ancient retroviral sags in over 20 mammals to reconstruct their shared history with herpesviral sags, revealing that the acquisition is a convergent evolutionary event. A retrovirus circulating in South American primates over 10 million years ago was the source of sag in two monkey herpesviruses, and a different retrovirus was the source of sag in a Peruvian rodent herpesvirus. We further show through a timescaled phylogenetic analysis that a cross-species transmission of monkey herpesviruses occurred after the acquisition of sag. These results reveal that a diverse range of ancient sag-containing retroviruses independently donated sag twice from two separate lineages that are distinct from MMTV.
Article
Significance: Virophages are viruses that hijack the replication machinery of giant viruses for their own replication. Virophages negatively impact giant virus replication and improve the survival chances of eukaryotic cells infected by giant viruses. In this study, we identified segments of the Bigelowiella natans genome that originate from virophages and giant viruses, revealing genomic footprints of battles between these viral entities that occurred in this unicellular alga. Interestingly, genes of virophage origin are transcribed, suggesting that they are functional. We hypothesize that virophage integration may be beneficial to both the virophage and B. natans by increasing the chances for the virophage to coinfect the cell with a giant virus prey and by defending the host cell against fatal giant virus infections.
Article
Microbial communities in glacial ecosystems are diverse, active, and subjected to strong viral pressures and infection rates. In this study we analyse putative virus genomes assembled from three dsDNA viromes from cryoconite hole ecosystems of Svalbard and the Greenland Ice Sheet to assess the potential hosts and functional role viruses play in these habitats. We assembled 208 million reads from the virus-size fraction and developed a procedure to select genuine virus scaffolds from cellular contamination. Our curated virus library contained 546 scaffolds up to 230 Kb in length, 54 of which were circular virus consensus genomes. Analysis of virus marker genes revealed a wide range of viruses had been assembled, including bacteriophages, cyanophages, nucleocytoplasmic large DNA viruses and a virophage, with putative hosts identified as Cyanobacteria, Alphaproteobacteria, Gammaproteobacteria, Actinobacteria, Firmicutes, eukaryotic algae and amoebae. Whole genome comparisons revealed the majority of circular genome scaffolds (CGS) formed 12 novel groups, two of which contained multiple phage members with plasmid-like properties, including a group of phage-plasmids possessing plasmid-like partition genes and toxin-antitoxin addiction modules to ensure their replication and a satellite phage-plasmid group. Surprisingly we also assembled a phage that not only encoded plasmid partition genes, but a clustered regularly interspaced short palindromic repeat (CRISPR)/Cas adaptive bacterial immune system. One of the spacers was an exact match for another phage in our virome, indicating that in a novel use of the system, the lysogen was potentially capable of conferring immunity on its bacterial host against other phage. Together these results suggest that highly novel and diverse groups of viruses are present in glacial environments, some of which utilize very unusual life strategies and genes to control their replication and maintain a long-term relationship with their hosts.
Article
Our discovery highlights the remarkable ability of DNA transposons to colonize and shape genomes from all domains of life, as well as giant viruses. Our findings continue to blur the division between viral and cellular genomes, adhering to the emerging view that the content, dynamics, and evolution of the genomes of giant viruses do not substantially differ from those of cellular organisms.
Article
Repbase Update (RU) is a database of representative repeat sequences in eukaryotic genomes. Since its first development as a database of human repetitive sequences in 1992, RU has been serving as a well-curated reference database fundamental for almost all eukaryotic genome sequence analyses. Here, we introduce recent updates of RU, focusing on technical issues concerning the submission and updating of Repbase entries, and give short examples of using RU data. RU sincerely invites a broader submission of repeat sequences from the research community.
Article
A search of metagenomic sequence databases for homologs of virophage capsid proteins resulted in the discovery of a new family of virophages in the sheep rumen metagenome. The genomes of the rumen virophages (RVP) encode a typical virophage major capsid protein, ATPase and protease combined with a Polinton-type, protein-primed family B DNA polymerase. The RVP genomes appear to be linear molecules, with terminal inverted repeats. Thus, the RVP seem to represent virophage-Polinton hybrids that are likely capable of forming infectious virions. Virion proteins of mimiviruses were detected in the same metagenomes as the RVP, suggesting that the virophages of the new family parasitize giant viruses that infect protist inhabitants of the rumen. This article was reviewed by Mart Krupovic and Kenneth Stedman; for complete reviews, see the Reviewers’ Reports section.
Article
Mpemba paradox results from hydrogen-bond anomalous relaxation. Heating stretches the O:H nonbond and shortens the H‒O bond via Coulomb coupling; cooling reverses this process to emit heat at a rate depending on its initial storage. Skin ultra-low mass density raises the thermal diffusivity and favors outward heat flow from the liquid.
Article
Herpesviruses are ubiquitous double-stranded DNA viruses infecting many animals, with the capacity to cause disease in both immunocompetent and immunocompromised hosts. Different herpesviruses have different cell tropisms, and have been detected in a diverse range of tissues and sample types. Metagenomics - encompassing viromics - analyses the nucleic acid of a tissue or other sample in an unbiased manner, making few or no prior assumptions about which viruses may be present in a sample. This approach has successfully discovered a number of novel herpesviruses. Furthermore, metagenomic analysis can identify herpesviruses with high degrees of sequence divergence from known herpesviruses and does not rely upon culturing large quantities of viral material. Metagenomics has had success in two areas of herpesvirus sequencing: firstly, the discovery of novel exogenous and endogenous herpesviruses in primates, bats and cnidarians; and secondly, in characterising large areas of the genomes of herpesviruses previously only known from small fragments, revealing unexpected diversity. This review will discuss the successes and challenges of using metagenomics to identify novel herpesviruses, and future directions within the field.
Article
Diverse eukaryotes including animals and protists are hosts to a broad variety of viruses with double-stranded (ds) DNA genomes, from the largest known viruses, such as pandoraviruses and mimiviruses, to tiny polyomaviruses. Recent comparative genomic analyses have revealed many evolutionary connections between dsDNA viruses of eukaryotes, bacteriophages, transposable elements, and linear DNA plasmids. These findings provide an evolutionary scenario that derives several major groups of eukaryotic dsDNA viruses, including the proposed order “Megavirales,” adenoviruses, and virophages from a group of large virus-like transposons known as Polintons (Mavericks). The Polintons have been recently shown to encode two capsid proteins, suggesting that these elements lead a dual lifestyle with both a transposon and a viral phase and should perhaps more appropriately be named polintoviruses. Here, we describe the recently identified evolutionary relationships between bacteriophages of the family Tectiviridae, polintoviruses, adenoviruses, virophages, large and giant DNA viruses of eukaryotes of the proposed order “Megavirales,” and linear mitochondrial and cytoplasmic plasmids. We outline an evolutionary scenario under which the polintoviruses were the first group of eukaryotic dsDNA viruses that evolved from bacteriophages and became the ancestors of most large DNA viruses of eukaryotes and a variety of other selfish elements. Distinct lines of origin are detectable only for herpesviruses (from a different bacteriophage root) and polyoma/papillomaviruses (from single-stranded DNA viruses and ultimately from plasmids). Phylogenomic analysis of giant viruses provides compelling evidence of their independent origins from smaller members of the putative order “Megavirales,” refuting the speculations on the evolution of these viruses from an extinct fourth domain of cellular life.
Article
Significance: For millions of years retroviruses, such as HIV in humans, have attacked vertebrates. Occasionally retroviruses infiltrate germ cells, incorporate themselves into the host’s genome, and transmit vertically to the host’s offspring as endogenous retroviruses (ERVs). Consequently, ERVs make up large portions of vertebrate genomes and represent a record of past host–retrovirus interactions. We developed pan-vertebrate ERV analyses to provide an overview of host–retrovirus interactions, generating insights into retroviral evolution, diversity, host-switching, and the factors influencing retroviral transmission. Astoundingly, we found over 36,000 ERV lineages across our sample of vertebrate diversity. The results provide knowledge about host–retrovirus coevolution, suggesting an unprecedented ability of retroviruses to switch between distantly related vertebrates and implying the existence of additional, as yet unidentified retroviruses.
Article
Polintons (also known as Mavericks) are large DNA transposons that are widespread in the genomes of eukaryotes. We have recently shown that Polintons encode virus capsid proteins, which suggests that these transposons might form virions, at least under some conditions. In this Opinion article, we delineate the evolutionary relationships among bacterial tectiviruses, Polintons, adenoviruses, virophages, large and giant DNA viruses of eukaryotes of the proposed order 'Megavirales', and linear mitochondrial and cytoplasmic plasmids. We hypothesize that Polintons were the first group of eukaryotic double-stranded DNA viruses to evolve from bacteriophages and that they gave rise to most large DNA viruses of eukaryotes and various other selfish genetic elements.
Article
Giant viruses have revealed a number of surprises that challenge conventions on what constitutes a virus. The Samba virus newly isolated in Brazil expands the known distribution of giant mimiviruses to a near-global scale. These viruses, together with the transposon-related virophages that infect them, pose a number of questions about their evolutionary origins that need to be considered in the light of the complex entanglement between host, virus and virophage genomes. See research article: http://www.virologyj.com/content/11/1/95.
Article
Nucleocytoplasmic large DNA viruses (NCLDVs) are eukaryotic viruses with large genomes (100 kb-2.5 Mb), which include giant Mimivirus, Megavirus and Pandoravirus. NCLDVs are known to infect animals, protists and phytoplankton but were never described as pathogens of land plants. Here, we show that the bryophyte Physcomitrella patens and the lycophyte Selaginella moellendorffii have open reading frames (ORFs) with high phylogenetic affinities to NCLDV homologues. The P. patens genes are clustered in DNA stretches (up to 13 kb) containing up to 16 NCLDV-like ORFs. Molecular evolution analysis suggests that the NCLDV-like regions were acquired by horizontal gene transfer from distinct but closely related viruses that possibly define a new family of NCLDVs. Transcriptomics and DNA methylation data indicate that the NCLDV-like regions are transcriptionally inactive and are highly cytosine methylated through a mechanism not relying on small RNAs. Altogether, our data show that members of NCLDV have infected land plants.
Article
Herpesviridae is a diverse family of large and complex pathogens whose genomes are extremely difficult to sequence. This is particularly true for clinical samples, and if the virus, host, or both genomes are being sequenced for the first time. Although herpesviruses are known to occasionally integrate in host genomes, and can also be inherited in a Mendelian fashion, they are notably absent from the genomic fossil record comprised of endogenous viral elements (EVEs). Here, we combine paleovirological and metagenomic approaches to both explore the constituent viral diversity of mammalian genomes and search for endogenous herpesviruses. We describe the first endogenous herpesvirus from the genome of the Philippine tarsier, belonging to the Roseolovirus genus, and characterize its highly defective genome that is integrated and flanked by unambiguous host DNA. From a draft assembly of the aye-aye genome, we use bioinformatic tools to reveal over 100,000 bp of a novel rhadinovirus that is the first lemur gammaherpesvirus, closely related to Kaposi's sarcoma-associated virus. We also identify 58 genes of Pan paniscus lymphocryptovirus 1, the bonobo equivalent of human Epstein-Barr virus. For each of the viruses, we postulate gene function via comparative analysis to known viral relatives. Most notably, the evidence from gene content and phylogenetics suggests that the aye-aye sequences represent the most basal known rhadinovirus, and indicates that tumorigenic herpesviruses have been infecting primates since their emergence in the late Cretaceous. Overall, these data show that a genomic fossil record of herpesviruses exists despite their extremely large genomes, and expands the known diversity of Herpesviridae, which will aid the characterization of pathogenesis. Our analytical approach illustrates the benefit of intersecting evolutionary approaches with metagenomics, genetics and paleovirology.
Article
Reviewers: This article was reviewed by Lakshminarayan M. Iyer and I. King Jordan. For complete reviews, see the Reviewers’ Reports section. Polintons (also known as Mavericks) and Tlr elements of Tetrahymena thermophila represent two families of large DNA transposons widespread in eukaryotes. Here, we show that both Polintons and Tlr elements encode two key virion proteins, the major capsid protein with the double jelly-roll fold and the minor capsid protein, known as the penton, with the single jelly-roll topology. This observation, along with the previously noted conservation of the genes for viral genome packaging ATPase and adenovirus-like protease, strongly suggests that Polintons and Tlr elements combine features of bona fide viruses and transposons. We propose the name ‘Polintoviruses’ to denote these putative viruses that could have played a central role in the evolution of several groups of DNA viruses of eukaryotes.
Article
Virophages, which are potentially important ecological regulators, have been discovered in association with members of the order Megavirales. Sputnik virophages target the Mimiviridae, Mavirus was identified with the Cafeteria roenbergensis virus, and virophage genomes reconstructed by metagenomic analyses may be associated with the Phycodnaviridae. Although the Sputnik virophages were isolated with viruses belonging to group A of the Mimiviridae, they can grow in amoebae infected by Mimiviridae from groups A, B or C. In this study we describe Zamilon, the first virophage isolated with a member of group C of the Mimiviridae family. By co-culturing amoebae with purified Zamilon, we found that the virophage is able to multiply with members of groups B and C of the Mimiviridae family but not with viruses from group A. Zamilon has a 17,276 bp DNA genome that potentially encodes 20 genes. Most of these genes are closely related to genes from the Sputnik virophage, yet two are more closely related to genes of Megavirus chilensis, a group B member of the Mimiviridae, and one to the Moumouvirus monve transpoviron.
Article
Significance: Retroviruses, such as HIV, are important pathogens of vertebrates, including humans. They are capable of crossing species barriers to infect new hosts, but knowledge about the evolutionary history of retroviruses is limited. However, genomic traces of past retrovirus activities known as “endogenous retroviruses” can be screened from sequenced genomes and analyzed to improve understanding of retrovirus evolution. Here we use a unique approach to address the evolution of one group of retroviruses in a screen of 60 diverse vertebrate host genomes. We find evidence of rampant host-switching across mammalian orders by members of this group throughout their evolutionary history. We also find evidence that the spread of infective retroviruses from this group may be facilitated by rats.
Article
The recently discovered Pandoraviruses are by far the largest viruses known, with their 2 megabase genomes exceeding in size the genomes of numerous bacteria and archaea. Pandoraviruses show a distant relationship with other nucleocytoplasmic large DNA viruses (NCLDV) of eukaryotes, lack some of the NCLDV core genes and in particular do not appear to be specifically related to the other, better characterized family of giant viruses, the Mimiviridae. Here we report phylogenetic analysis of 6 core NCLDV genes that confidently places Pandoraviruses within the family Phycodnaviridae, with an apparent specific affinity with Coccolithoviruses. We conclude that, despite their many unusual characteristics, Pandoraviruses are highly derived phycodnaviruses. These findings imply that giant viruses have independently evolved from smaller NCLDV on at least two occasions. This article was reviewed by Patrick Forterre and Lakshminarayan Iyer. For the full reviews, see the Reviewers’ reports section.
Article
The reticuloendotheliosis viruses (REVs) comprise several closely related amphotropic retroviruses isolated from birds. These viruses exhibit several highly unusual characteristics that have not so far been adequately explained, including their extremely close relationship to mammalian retroviruses, and their presence as endogenous sequences within the genomes of certain large DNA viruses. We present evidence for an iatrogenic origin of REVs that accounts for these phenomena. Firstly, we identify endogenous retroviral fossils in mammalian genomes that share a unique recombinant structure with REVs-unequivocally demonstrating that REVs derive directly from mammalian retroviruses. Secondly, through sequencing of archived REV isolates, we confirm that contaminated Plasmodium lophurae stocks have been the source of multiple REV outbreaks in experimentally infected birds. Finally, we show that both phylogenetic and historical evidence support a scenario wherein REVs originated as mammalian retroviruses that were accidentally introduced into avian hosts in the late 1930s, during experimental studies of P. lophurae, and subsequently integrated into the fowlpox virus (FWPV) and gallid herpesvirus type 2 (GHV-2) genomes, generating recombinant DNA viruses that now circulate in wild birds and poultry. Our findings provide a novel perspective on the origin and evolution of REV, and indicate that horizontal gene transfer between virus families can expand the impact of iatrogenic transmission events.
Article
Paleovirology is the study of ancient viruses, typically over prehistoric or geological timescales. There is no physical ‘fossil record’ of viruses; virions persist for short time periods, and rapidly degrade leaving no direct trace of their existence. Many viruses can enter the genomes of their
Article
Coccolithophores have influenced the global climate for over 200 million years. These marine phytoplankton can account for 20 per cent of total carbon fixation in some systems. They form blooms that can occupy hundreds of thousands of square kilometres and are distinguished by their elegantly sculpted calcium carbonate exoskeletons (coccoliths), rendering them visible from space. Although coccolithophores export carbon in the form of organic matter and calcite to the sea floor, they also release CO2 in the calcification process. Hence, they have a complex influence on the carbon cycle, driving either CO2 production or uptake, sequestration and export to the deep ocean. Here we report the first haptophyte reference genome, from the coccolithophore Emiliania huxleyi strain CCMP1516, and sequences from 13 additional isolates. Our analyses reveal a pan genome (core genes plus genes distributed variably between strains) probably supported by an atypical complement of repetitive sequence in the genome. Comparisons across strains demonstrate that E. huxleyi, which has long been considered a single species, harbours extensive genome variability reflected in different metabolic repertoires. Genome variability within this species complex seems to underpin its capacity both to thrive in habitats ranging from the equator to the subarctic and to form large-scale episodic blooms under a wide variety of environmental conditions.
Article
Background: Recent advances in genomics and metagenomics reveal a remarkable diversity of viruses and other selfish genetic elements. In particular, giant viruses have been shown to possess their own mobilomes that include virophages, small viruses that parasitize giant viruses of the Mimiviridae family, and transpovirons, distinct linear plasmids. One of the virophages, known as the Mavirus, a parasite of the giant Cafeteria roenbergensis virus, shares several genes with large eukaryotic self-replicating transposons of the Polinton (Maverick) family, and it has been proposed that the polintons evolved from a Mavirus-like ancestor. Results: We performed a comprehensive phylogenomic analysis of the available genomes of virophages and traced the evolutionary connections between the virophages and other selfish genetic elements. The comparison of the gene composition and genome organization of the virophages reveals 6 conserved, core genes that are organized in partially conserved arrays. Phylogenetic analysis of those core virophage genes, for which a sufficient diversity of homologs outside the virophages was detected, including the maturation protease and the packaging ATPase, supports the monophyly of the virophages. The results of this analysis appear incompatible with the origin of polintons from a Mavirus-like agent but rather suggest that Mavirus evolved through recombination between a polinton and an unknown virus. Altogether, virophages, polintons, a distinct Tetrahymena transposable element Tlr1, transpovirons, adenoviruses, and some bacteriophages form a network of evolutionary relationships that is held together by overlapping sets of shared genes and appears to represent a distinct module in the vast total network of viruses and mobile elements.
Conclusions: The results of the phylogenomic analysis of the virophages and related genetic elements are compatible with the concept of network-like evolution of the virus world and emphasize multiple evolutionary connections between bona fide viruses and other classes of capsid-less mobile elements.
Article
The giant virus Mimiviridae family includes 3 groups of viruses: group A (includes Acanthamoeba polyphaga Mimivirus), group B (includes Moumouvirus) and group C (includes Megavirus chilensis). Virophages have been isolated with both group A Mimiviridae (the Mamavirus strain) and the related Cafeteria roenbergensis virus, and they have also been described by bioinformatic analysis of the Phycodnavirus. Here, we found that the first two strains of virophages isolated with group A Mimiviridae can multiply easily in groups B and C and play a role in gene transfer among these virus subgroups. To isolate new virophages and their Mimiviridae host in the environment, we used PCR to identify a sample with a virophage and a group C Mimiviridae that failed to grow on amoeba. Moreover, we showed that virophages reduce the pathogenic effect of Mimivirus (plaque formation), establishing its parasitic role on Mimivirus. We therefore developed a co-culture procedure using Acanthamoeba polyphaga and Mimivirus to recover the detected virophage and then sequenced the virophage's genome. We present this technique as a novel approach to isolating virophages. We demonstrated that the newly identified virophages replicate in the viral factories of all three groups of Mimiviridae, suggesting that the spectrum of virophages is not limited to their initial host.
Article
Virophages, e.g., Sputnik, Mavirus and Organic Lake virophage (OLV), are unusual parasites of giant dsDNA viruses, yet little is known about their diversity. Here we describe the global distribution, abundance, and genetic diversity of virophages based on analyzing and mapping comprehensive metagenomic databases. The results revealed distinct abundance and world-wide distribution of virophages involving almost all geographical zones and a variety of unique environments. These environments ranged from deep ocean to inland, iced to hydrothermal lakes, and human gut to animal-associated habitats. Four complete (Yellowstone Lake virophages, YSLVs) virophage genomic sequences were obtained, as was one near-complete (Ace Lake Mavirus, ALM) sequence. The genomes obtained were 27,849 bp long with 26 predicted open reading frames (ORFs) (YSLV 1), 23,184 bp with 21 ORFs (YSLV 2), 27,050 bp with 23 ORFs (YSLV 3), 28,306 bp with 34 ORFs (YSLV 4), and 17,767 bp with 22 ORFs (ALM). The homologous counterparts of five genes of putative FtsK-HerA family DNA packaging ATPase, DNA helicase/primase, cysteine protease, major capsid protein (MCP), and minor capsid protein (mCP) were present in all virophages studied thus far. They also shared a conserved gene cluster comprising the two core genes of MCP and mCP. Comparative genomic and phylogenetic analyses showed that YSLVs, having a closer relationship between each other than to the other virophages, were more closely related to OLV than to Sputnik, but distantly related to Mavirus and ALM. These findings indicated that virophages appear to be widespread and genetically diverse, with at least 3 major lineages.
Article
A distinct class of infectious agents, the virophages that infect giant viruses of the Mimiviridae family, has been recently described. Here we report the simultaneous discovery of a giant virus of Acanthamoeba polyphaga (Lentille virus) that contains an integrated genome of a virophage (Sputnik 2), and a member of a previously unknown class of mobile genetic elements, the transpovirons. The transpovirons are linear DNA elements of ∼7 kb that encompass six to eight protein-coding genes, two of which are homologous to virophage genes. Fluorescence in situ hybridization showed that the free form of the transpoviron replicates within the giant virus factory and accumulates in high copy numbers inside giant virus particles, Sputnik 2 particles, and amoeba cytoplasm. Analysis of deep-sequencing data showed that the virophage and the transpoviron can integrate in nearly any place in the chromosome of the giant virus host and that, although less frequently, the transpoviron can also be linked to the virophage chromosome. In addition, integrated fragments of transpoviron DNA were detected in several giant virus and Sputnik genomes. Analysis of 19 Mimivirus strains revealed three distinct transpovirons associated with three subgroups of Mimiviruses. The virophage, the transpoviron, and the previously identified self-splicing introns and inteins constitute the complex, interconnected mobilome of the giant viruses and are likely to substantially contribute to interviral gene transfer.
Article
Viruses are abundant ubiquitous members of microbial communities and in the marine environment affect population structure and nutrient cycling by infecting and lysing primary producers. Antarctic lakes are microbially dominated ecosystems supporting truncated food webs in which viruses exert a major influence on the microbial loop. Here we report the discovery of a virophage (relative of the recently described Sputnik virophage) that preys on phycodnaviruses that infect prasinophytes (phototrophic algae). By performing metaproteogenomic analysis on samples from Organic Lake, a hypersaline meromictic lake in Antarctica, complete virophage and near-complete phycodnavirus genomes were obtained. By introducing the virophage as an additional predator of a predator-prey dynamic model we determined that the virophage stimulates secondary production through the microbial loop by reducing overall mortality of the host and increasing the frequency of blooms during polar summer light periods. Virophages remained abundant in the lake 2 y later and were represented by populations with a high level of major capsid protein sequence variation (25-100% identity). Virophage signatures were also found in neighboring Ace Lake (in abundance) and in two tropical lakes (hypersaline and fresh), an estuary, and an ocean upwelling site. These findings indicate that virophages regulate host-virus interactions, influence overall carbon flux in Organic Lake, and play previously unrecognized roles in diverse aquatic ecosystems.
Article
DNA transposons are mobile genetic elements that have shaped the genomes of eukaryotes for millions of years, yet their origins remain obscure. We discovered a virophage that, on the basis of genetic homology, likely represents an evolutionary link between double-stranded DNA viruses and Maverick/Polinton eukaryotic DNA transposons. The Mavirus virophage parasitizes the giant Cafeteria roenbergensis virus and encodes 20 predicted proteins, including a retroviral integrase and a protein-primed DNA polymerase B. On the basis of our data, we conclude that Maverick/Polinton transposons may have originated from ancient relatives of Mavirus, and thereby influenced the evolution of eukaryotic genomes, although we cannot rule out alternative evolutionary scenarios.
Article
Integration into the nuclear genome of germ line cells can lead to vertical inheritance of retroviral genes as host alleles. For other viruses, germ line integration has only rarely been documented. Nonetheless, we identified endogenous viral elements (EVEs) derived from ten non-retroviral families by systematic in silico screening of animal genomes, including the first endogenous representatives of double-stranded RNA, reverse-transcribing DNA, and segmented RNA viruses, and the first endogenous DNA viruses in mammalian genomes. Phylogenetic and genomic analysis of EVEs across multiple host species revealed novel information about the origin and evolution of diverse virus groups. Furthermore, several of the elements identified here encode intact open reading frames or are expressed as mRNA. For one element in the primate lineage, we provide statistically robust evidence for exaptation. Our findings establish that genetic material derived from all known viral genome types and replication strategies can enter the animal germ line, greatly broadening the scope of paleovirological studies and indicating a more significant evolutionary role for gene flow from virus to animal genomes than has previously been recognized.
Article
Chlorella variabilis NC64A, a unicellular photosynthetic green alga (Trebouxiophyceae), is an intracellular photobiont of Paramecium bursaria and a model system for studying virus/algal interactions. We sequenced its 46-Mb nuclear genome, revealing an expansion of protein families that could have participated in adaptation to symbiosis. NC64A exhibits variations in GC content across its genome that correlate with global expression level, average intron size, and codon usage bias. Although Chlorella species have been assumed to be asexual and nonmotile, the NC64A genome encodes all the known meiosis-specific proteins and a subset of proteins found in flagella. We hypothesize that Chlorella might have retained a flagella-derived structure that could be involved in sexual reproduction. Furthermore, a survey of phytohormone pathways in chlorophyte algae identified algal orthologs of Arabidopsis thaliana genes involved in hormone biosynthesis and signaling, suggesting that these functions were established prior to the evolution of land plants. We show that the ability of Chlorella to produce chitinous cell walls likely resulted from the capture of metabolic genes by horizontal gene transfer from algal viruses, prokaryotes, or fungi. Analysis of the NC64A genome substantially advances our understanding of the green lineage evolution, including the genomic interplay with viruses and symbiosis between eukaryotes.
Article
Herpesviruses are members of a diverse family of viruses that colonize all vertebrates from fish to mammals. Although more than one hundred herpesviruses exist, all are nearly identical architecturally, with a genome consisting of a linear double-stranded DNA molecule (100 to 225 kbp) protected by an icosahedral capsid made up of 162 hollow-centered capsomeres, a tegument surrounding the nucleocapsid, and a viral envelope derived from host membranes. Upon infection, the linear viral DNA is delivered to the nucleus, where it circularizes to form the viral episome. Depending on several factors, the viral cycle can proceed either to a productive infection or to a state of latency. In either case, the viral genetic information is maintained as extrachromosomal circular DNA. Interestingly, however, certain oncogenic herpesviruses such as Marek's disease virus and Epstein-Barr virus can be found integrated at low frequencies in the host's chromosomes. These findings have mostly been viewed as anecdotal and considered exceptions rather than properties of herpesviruses. In recent years, the consistent and rather frequent detection (in approximately 1% of the human population) of human herpesvirus 6 (HHV-6) viral DNA integrated into human chromosomes has spurred renewed interest in our understanding of how these viruses infect, replicate, and propagate themselves. In this review, we provide a historical perspective on chromosomal integration by herpesviruses and present the current state of knowledge on integration by HHV-6 with the possible clinical implications associated with viral integration.
Article
The Nucleo-Cytoplasmic Large DNA Viruses (NCLDV) comprise an apparently monophyletic class of viruses that infect a broad variety of eukaryotic hosts. Recent progress in isolation of new viruses and genome sequencing resulted in a substantial expansion of the NCLDV diversity, resulting in additional opportunities for comparative genomic analysis, and a demand for a comprehensive classification of viral genes. A comprehensive comparison of the protein sequences encoded in the genomes of 45 NCLDV belonging to 6 families was performed in order to delineate cluster of orthologous viral genes. Using previously developed computational methods for orthology identification, 1445 Nucleo-Cytoplasmic Virus Orthologous Groups (NCVOGs) were identified of which 177 are represented in more than one NCLDV family. The NCVOGs were manually curated and annotated and can be used as a computational platform for functional annotation and evolutionary analysis of new NCLDV genomes. A maximum-likelihood reconstruction of the NCLDV evolution yielded a set of 47 conserved genes that were probably present in the genome of the common ancestor of this class of eukaryotic viruses. This reconstructed ancestral gene set is robust to the parameters of the reconstruction procedure and so is likely to accurately reflect the gene core of the ancestral NCLDV, indicating that this virus encoded a complex machinery of replication, expression and morphogenesis that made it relatively independent from host cell functions. The NCVOGs are a flexible and expandable platform for genome analysis and functional annotation of newly characterized NCLDV. Evolutionary reconstructions employing NCVOGs point to complex ancestral viruses.
Article
The virophage Sputnik is a satellite virus of the giant mimivirus and is the only satellite virus reported to date whose propagation adversely affects its host virus' production. Genome sequence analysis showed that Sputnik has genes related to viruses infecting all three domains of life. Here, we report structural studies of Sputnik, which show that it is about 740 Å in diameter, has a T=27 icosahedral capsid, and has a lipid membrane inside the protein shell. Structural analyses suggest that the major capsid protein of Sputnik is likely to have a double jelly-roll fold, although sequence alignments do not show any detectable similarity with other viral double jelly-roll capsid proteins. Hence, the origin of Sputnik's capsid might have been derived from other viruses prior to its association with mimivirus.
Article
Retroviruses can leave a “fossil record” in their hosts’ genomes in the form of endogenous retroviruses. Foamy viruses, complex retroviruses that infect mammals, have been notably absent from this record. We have found an endogenous foamy virus within the genomes of sloths and show that foamy viruses were infecting mammals more than 100 million years ago and codiverged with their hosts across an entire geological era. Our analysis highlights the role of evolutionary constraint in maintaining viral genome structure and indicates that accessory genes and mammalian mechanisms of innate immunity are the products of macroevolutionary conflict played out over a geological time scale.
Article
Some giant viruses encode a genome larger than that of some bacteria, but their evolutionary history is a mystery. Examining the genomes within a sample from a wastewater treatment plant in Austria, Schulz et al. assembled a previously undiscovered giant virus genome, which they used to mine genetic databases for related viruses. The authors thus identified a group of giant viruses with more genes encoding components of the protein translation machinery, including aminoacyl transfer RNA synthetases, than in other giant viruses. Phylogenetic analyses suggest that the genes were acquired in an evolutionarily recent time frame, likely from, and as an adaptation to, their hosts. Science, this issue p. 82. The discovery of giant viruses blurred the sharp division between viruses and cellular life. Giant virus genomes encode proteins considered as signatures of cellular organisms, particularly translation system components, prompting hypotheses that these viruses derived from a fourth domain of cellular life. H
Article
Endogenous viral elements are increasingly found in eukaryotic genomes, yet little is known about their origins, dynamics, or function. Here we provide a compelling example of a DNA virus that readily integrates into a eukaryotic genome where it acts as an inducible antiviral defence system. We found that the virophage mavirus, a parasite of the giant Cafeteria roenbergensis virus (CroV), integrates at multiple sites within the nuclear genome of the marine protozoan Cafeteria roenbergensis. The endogenous mavirus is structurally and genetically similar to eukaryotic DNA transposons and endogenous viruses of the Maverick/Polinton family. Provirophage genes are not constitutively expressed, but are specifically activated by superinfection with CroV, which induces the production of infectious mavirus particles. Virophages can inhibit the replication of mimivirus-like giant viruses and an anti-viral protective effect of provirophages on their hosts has been hypothesized. We find that provirophage-carrying cells are not directly protected from CroV; however, lysis of these cells releases infectious mavirus particles that are then able to suppress CroV replication and enhance host survival during subsequent rounds of infection. The microbial host–parasite interaction described here involves an altruistic aspect and suggests that giant-virus-induced activation of provirophages might be ecologically relevant in natural protist populations.
Article
Regulatory use of endogenous retroviruses. Mammalian genomes contain many endogenous retroviruses (ERVs), which have a range of evolutionary ages. The propagation and maintenance of these genetic elements have been attributed to their ability to contribute to gene regulation. Chuong et al. demonstrate that some ERV families are enriched in regulatory elements, so that they act as independently evolved enhancers for immune genes in both humans and mice (see the Perspective by Lynch). The analysis revealed a primate-specific element that orchestrates the transcriptional response to interferons. Selection can therefore act on selfish genetic elements to generate novel gene networks. Science, this issue p. 1083; see also p. 1029.
Article
Large dsDNA viruses are involved in the population control of many globally distributed species of eukaryotic phytoplankton and have a prominent role in bloom termination. The genus Phaeocystis (Haptophyta, Prymnesiophyceae) includes several high-biomass-forming phytoplankton species, such as Phaeocystis globosa, the blooms of which occur mostly in the coastal zone of the North Atlantic and the North Sea. Here, we report the 459,984-bp-long genome sequence of P. globosa virus strain PgV-16T, encoding 434 proteins and eight tRNAs and, thus, the largest fully sequenced genome to date among viruses infecting algae. Surprisingly, PgV-16T exhibits no phylogenetic affinity with other viruses infecting microalgae (e.g., phycodnaviruses), including those infecting Emiliania huxleyi, another ubiquitous bloom-forming haptophyte. Rather, PgV-16T belongs to an emerging clade (the Megaviridae) clustering the viruses endowed with the largest known genomes, including Megavirus, Mimivirus (both infecting acanthamoeba), and a virus infecting the marine microflagellate grazer Cafeteria roenbergensis. Seventy-five percent of the best matches of PgV-16T-predicted proteins correspond to two viruses [Organic Lake phycodnavirus (OLPV)1 and OLPV2] from a hypersaline lake in Antarctica (Organic Lake), the hosts of which are unknown. As for OLPVs and other Megaviridae, the PgV-16T sequence data revealed the presence of a virophage-like genome. However, no virophage particle was detected in infected P. globosa cultures. The presence of many genes found only in Megaviridae in its genome and the presence of an associated virophage strongly suggest that PgV-16T shares a common ancestry with the largest known dsDNA viruses, the host range of which already encompasses the earliest diverging branches of domain Eukarya.
Article
Paleovirology, the study of viruses on evolutionary timescales, can exploit information from endogenous viral elements (EVEs), which are the result of heritable horizontal gene transfer (HGT) from viruses to hosts. The availability of genomic data has increased opportunities to study EVEs, and bioinformatics techniques have been crucial in cataloguing EVE diversity and taxonomic coverage. Recent advances show that some EVEs have been co-opted as cellular genes, often as inhibitors of viral infection. These genes are an intriguing strategy in virus-host evolutionary battles in that genetic material is transferred from virus to host, and then used by the host against the virus. In this review, we consider the genes and processes involved in EVE-derived immunity (EDI), assess factors leading to its emergence, and outline how future work will benefit from incorporating evolutionary approaches.
Article
In contrast to RNA viruses, double-stranded DNA viruses have low mutation rates yet must still adapt rapidly in response to changing host defenses. To determine mechanisms of adaptation, we subjected the model poxvirus vaccinia to serial propagation in human cells, where its antihost factor K3L is maladapted against the antiviral protein kinase R (PKR). Viruses rapidly acquired higher fitness via recurrent K3L gene amplifications, incurring up to 7%-10% increases in genome size. These transient gene expansions were necessary and sufficient to counteract human PKR and facilitated the gain of an adaptive amino acid substitution in K3L that also defeats PKR. Subsequent reductions in gene amplifications offset the costs associated with larger genome size while retaining adaptive substitutions. Our discovery of viral "gene-accordions" explains how poxviruses can rapidly adapt to defeat different host defenses despite low mutation rates and reveals how classical Red Queen conflicts can progress through unrecognized intermediates.