
Hemimastigophora is a novel supra-kingdom-level lineage of eukaryotes


Abstract and Figures

Almost all eukaryote life forms have now been placed within one of five to eight supra-kingdom-level groups using molecular phylogenetics [1–4]. The ‘phylum’ Hemimastigophora is probably the most distinctive morphologically defined lineage that still awaits such a phylogenetic assignment. First observed in the nineteenth century, hemimastigotes are free-living predatory protists with two rows of flagella and a unique cell architecture [5–7]; to our knowledge, no molecular sequence data or cultures are currently available for this group. Here we report phylogenomic analyses based on high-coverage, cultivation-independent transcriptomics that place Hemimastigophora outside of all established eukaryote supergroups. They instead comprise an independent supra-kingdom-level lineage that most likely forms a sister clade to the ‘Diaphoretickes’ half of eukaryote diversity (that is, the ‘stramenopiles, alveolates and Rhizaria’ supergroup (Sar), Archaeplastida and Cryptista, as well as other major groups). The previous ranking of Hemimastigophora as a phylum understates the evolutionary distinctiveness of this group, which has considerable importance for investigations into the deep-level evolutionary history of eukaryotic life, ranging from understanding the origins of fundamental cell systems to placing the root of the tree. We have also established the first culture of a hemimastigote (Hemimastix kukwesjijk sp. nov.), which will facilitate future genomic and cell-biological investigations into eukaryote evolution and the last eukaryotic common ancestor.
Phylogenetic analyses based on single-cell transcriptomic data from two hemimastigotes, a Spironema species and the newly described Hemimastix kukwesjijk, indicate that Hemimastigophora is a supra-kingdom-level lineage of eukaryotes.
LETTER https://doi.org/10.1038/s41586-018-0708-8
Gordon Lax (1,4), Yana Eglit (1,4), Laura Eme (2,3,4), Erin M. Bertrand (1), Andrew J. Roger (2) & Alastair G. B. Simpson (1,*)
We identified two previously undescribed species of the rarely
observed protist group Hemimastigophora (one Spironema and one
Hemimastix) in enrichments from soil. Here we formally describe the
newly identified Hemimastix species.
Hemimastix Foissner, Blatterer & Foissner 1988
Hemimastix kukwesjijk Eglit and Simpson, sp. nov.
Etymology. Kukwesjijk (approximate pronunciation, ‘ku–ga–wes–jij–k’).
‘Kukwes-’ (Mi’kmaq), a rapacious, hairy ogre from the traditions
of the Mi’kmaq First Nation of Nova Scotia; ‘-jijk’, a diminutive
plural suffix. ‘Little ogres’ reflects the predatory and hairy nature of
this microorganism, and the use of Mi’kmaq language and tradition
acknowledges the region in which the species was isolated.
Type material. The name-bearing hapantotype consists of trophic cells
and dividing cells of strain BW2H that are osmium-fixed, sputter-coated
and mounted for scanning electron microscopy. This material is deposited
with the American Museum of Natural History (New York) with accession
code AMNH_IZC 00267132. This material also contains prey Spumella
sp. (Stramenopiles) and uncharacterized prokaryotes, both of which are
explicitly excluded from the hapantotype.
Description. Hemimastix species, 16.5–20.5-μm long with 17–19
flagella per row.
Type locality. Bluff Wilderness Trail, Nova Scotia, Canada
(44.6610154°N, 63.7674669°W); soil from mixed-species woodland.
Gene sequence. The partial small subunit ribosomal RNA (SSU
rRNA) gene sequence of strain BW2H has been deposited in GenBank,
accession code MF682191.
Comments. Cells are larger and have several more flagella than
Hemimastix amphikineta, the only previously described species
(14-μm by 7-μm cell body, 12 flagella per row [6]).
Cells of H. kukwesjijk are oval in profile with a blunt anterior
projection (the capitulum) and two rows of flagella along their
whole length (Fig.1b, Extended Data Fig.1). In cultivation as
strain BW2H, live cells were 16.5–20.5-μm long by 7–12.5-μm wide
(18.3 ± 1μm × 9.9 ± 1.2μm; n = 61), with a sub-central, rounded
nucleus and posterior contractile vacuole (Fig.1c). Each row of 17–19
flagella (mean 18.4; n = 25) lay in a channel between the two thick
thecal plates. The anteriormost 9 or 10 flagella were closely spaced,
and the rest emerged from separate notches in the underlying plate
(Fig.1b, e). The capitulum was bordered by the overlapping anterior
1. Department of Biology, Centre for Comparative Genomics and Evolutionary Bioinformatics, Dalhousie University, Halifax, Nova Scotia, Canada.
2. Centre for Comparative Genomics and Evolutionary Bioinformatics, Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada.
3. Present address: Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden.
4. These authors contributed equally: Gordon Lax, Yana Eglit, Laura Eme.
* e-mail: alastair.simpson@dal.ca
Fig. 1 | Micrographs of studied hemimastigotes. a, Spironema cf.
multiciliatum, cell 1 (of 4) isolated for transcriptomics. b–f, H. kukwesjijk,
cell 1 (of 2) isolated for transcriptomics (b); note the presence of the
capitulum (cap.). c, d, Cells from culture (strain BW2H); note the nucleus
and the contractile vacuole at the posterior (c), and feeding on prey
with the capitulum (d). e, General view of cell (strain BW2H), anterior
with the capitulum to right. f, Detail of the capitulum, showing caps
of undischarged extrusomes (arrowheads) and close-spaced flagella in
anterior part of flagellar rows. a–d, Differential interference contrast light
microscopy. e, f, Scanning electron microscopy. Scale bars, 10 μm (a), 5 μm
(b–e; scale bar in b applies to images b–d), 1 μm (f).
410 | NATURE | VOL 564 | 20/27 DECEMBER 2018
... Additionally, single-cell genomics (or transcriptomics) can provide insights into the functional potential of these organisms by identifying key genes and metabolic pathways. Recent examples of protist sequence data obtained through single-cell (or "few-cell") approaches include the discovery of novel high-ranking lineages (Lax et al. 2018; Wideman et al. 2020; Schön et al. 2021). Moreover, single-cell genomics has enabled the study of symbiotic and parasitic protists, shedding light on their genetic adaptations and evolutionary relationships (Dia and Cheeseman 2021; Boscaro et al. 2023). ...
... Understanding the deepest part of the eukaryotic ToL (eToL) thus equates to resolving the relationships among major microbial lineages. Over the past decade, the deep structure of the eukaryotic tree has undergone substantial revisions, propelled by breakthroughs in phylogenomics and the integration of numerous evolutionarily pivotal protist lineages into molecular studies (Brown et al. 2018; Lax et al. 2018; Burki et al. 2020; Tikhonenkov et al. 2022). However, clarifying the earliest divergences in the eToL, including the location of its root, remains a formidable challenge. ...
... However, culturing novel protists can be difficult as many require often-unknown specific environmental conditions, nutrients, symbiotic partners, or microbial prey for growth. Nevertheless, advances in culturing and single-cell isolation techniques can lead to a deeper (Lax et al. 2018; Galindo et al. 2019; Gawryluk et al. 2019; Tikhonenkov et al. 2022). Similarly, painstakingly long cultivation efforts for prokaryotes from various environments for which we knew very little have proven to be invaluable, as shown by the recent cultures of Asgard archaea (Imachi et al. 2020; Rodrigues-Oliveira et al. 2022). ...
Article
In this perspective, we explore the transformative impact and inherent limitations of metagenomics and single-cell genomics on our understanding of microbial diversity and their integration into the Tree of Life. We delve into the key challenges associated with incorporating new microbial lineages into the Tree of Life through advanced phylogenomic approaches. Additionally, we shed light on enduring debates surrounding various aspects of the microbial Tree of Life, focusing on recent advances in some of its deepest nodes, such as the roots of bacteria, archaea, and eukaryotes. We also bring forth current limitations in genome recovery and phylogenomic methodology, as well as new avenues of research to uncover additional key microbial lineages and resolve the shape of the Tree of Life.
... Diaphoretickes is a designation of a large-scale taxonomic assemblage ("megagroup") comprising Archaeplastida (including Rhodelphidia and Picozoa), Pancryptista, Haptista, Telonemea, and SAR (Adl et al. 2019), recently shown to embrace also Hemimastigophora and Provora (Lax et al. 2018; Tice et al. 2021; Tikhonenkov et al. 2022). We detected POP proteins with predicted mitochondrial localization in representatives of several deeply diverged Diaphoretickes taxa where the nature of the mitochondrial DNAP had not been defined before, including Telonemea, Centrohelea, Microhelida, Rhodelphidia, Picozoa, and Provora (Fig. 6; supplementary table S1, Supplementary Material online). ...
... Unfortunately, most studies so far carried out to explore the eukaryotic root by using a noneukaryotic outgroup are by themselves not sufficient to infer the early evolution of mitochondrion-localized DNAP, as they included only Discoba of all the 3 rdxPolA-containing lineages (Derelle and Lang 2012; He et al. 2014; Baldauf 2023a, 2023b). However, when the root positions suggested by most of these analyses are mapped onto taxonomically comprehensive unrooted eukaryote phylogenies as recently inferred from multigene datasets (Lax et al. 2018; Tice et al. 2021; Tikhonenkov et al. 2022), the rdxPolA-containing taxa are found in both principal clades defined by the root, with Discoba in one of them and Malawimonadida plus Ancyromonadida in the other. This is consistent with the outgroup-based taxon-rich rooting analysis conducted by Derelle et al. (2015), which splits Discoba and Malawimonadida into the 2 major clades separated by the inferred root (ancyromonads were missing from the analysis). ...
Article
DNA polymerases (DNAPs) synthesize DNA from deoxyribonucleotides in a semi-conservative manner and serve as the core of DNA replication and repair machineries. In eukaryotic cells, there are two genome-containing organelles, mitochondria and plastids, that were derived from an alphaproteobacterium and a cyanobacterium, respectively. Except for rare cases of genome-lacking mitochondria and plastids, both organelles must be served by nucleus-encoded DNAPs that localize and work in them to maintain their genomes. The evolution of organellar DNAPs has yet to be fully understood because of two unsettled issues. First, the diversity of organellar DNAPs has not been elucidated in the full spectrum of eukaryotes. Second, it is unclear when the DNAPs that were used originally in the endosymbiotic bacteria giving rise to mitochondria and plastids were discarded, as the organellar DNAPs known to date show no phylogenetic affinity to those of the extant alphaproteobacteria or cyanobacteria. In this study, we identified from diverse eukaryotes 134 family A DNAP sequences, which were classified into 10 novel types, and explored their evolutionary origins. The subcellular localizations of selected DNAPs were further examined experimentally. The results presented here suggest that the diversity of organellar DNAPs has been shaped by multiple transfers of the PolI gene from phylogenetically broad bacteria, and their occurrence in eukaryotes was additionally impacted by secondary plastid endosymbioses. Finally, we propose that the last eukaryotic common ancestor may have possessed two mitochondrial DNAPs, POP and a candidate of the direct descendant of the proto-mitochondrial DNAP, rdxPolA, identified in this study.
... Until recently, the evolutionary relationships within eukaryotic diversity might have seemed rather messy. However, major parts of the eukaryotic tree have recently stabilized [32], even though new kingdom-level lineages are continuing to be discovered [5, 33–36]. Essentially, animals, fungi, and plants make up branches in the Opisthokonts and Archaeplastids, and the rest of the tree is occupied by diverse 'kingdoms' of protists. ...
Article
The mitochondria contain their own genome derived from an alphaproteobacterial endosymbiont. From thousands of protein-coding genes originally encoded by their ancestor, only between 1 and about 70 are encoded on extant mitochondrial genomes (mitogenomes). Thanks to a dramatically increasing number of sequenced and annotated mitogenomes, a coherent picture of why some genes were lost, or relocated to the nucleus, is emerging. In this review, we describe the characteristics of mitochondria-to-nucleus gene transfer and the resulting varied content of mitogenomes across eukaryotes. We introduce a ‘burst-upon-drift’ model to best explain nuclear-mitochondrial population genetics with flares of transfer due to genetic drift. Supplementary information: the online version contains supplementary material available at 10.1186/s12915-024-01824-1.
Preprint
Eukaryotic cell biology is largely understood from paradigms established on a few model organisms, largely from animals and fungi (opisthokonts) and to a lesser extent plants. These organisms, however, constitute only a small proportion of eukaryotic diversity, and the principles of their cell biology may not be universal to other, understudied but globally impactful, organisms. Intriguingly, there are cellular components that are present in diverse eukaryotes, but are not in the animals and fungi from which the best-developed models of cell biology are derived. Consequently, these components are not included in the generally adopted frameworks of cellular function that are meant to explain eukaryotic biology. The membrane complex TSET is the best-studied such example, well established to play a role in cell division and endocytosis in plants. It is found across eukaryotes, but is highly reduced in opisthokonts. Its general prevalence, abundance, and relevance in eukaryotic cellular activity are unclear. Here we show that TSET is encoded in genomes of five cosmopolitan and critical groups of primarily photosynthetic eukaryotes (green algae, red algae, stramenopiles, haptophytes and cryptophytes), with particular prevalence in the green algae and some stramenopile groups. A meta-analysis of published gene expression data from the model diatom Phaeodactylum tricornutum shows that this complex is coregulated with components of the endomembrane trafficking machinery. Moreover, meta-transcriptomic data from Tara Oceans reveal that TSET genes are both present and expressed by diatoms in the wild. These data suggest that TSET may be playing an important and underrecognized role in cellular activities within marine ecosystems. More broadly, the results support the idea that use of systems-level data for non-model organisms can illuminate our understanding of core principles of eukaryotic cell function, and may reveal important and under-appreciated players that deserve to be integrated into the pervasive models of cellular capacity.
Preprint
Spatial arrangement of the cytoskeleton in the cells of protists has been used for decades for taxonomy and phylogenetic inference at various levels. In contrast, the protein composition of non-microtubular structures is mostly unknown. Exceptions are system I fibres in algae, which are built of striated fiber assemblins (SFA). Interestingly, SFAs are also components of a range of other, dissimilar structures, playing a role in the cortex of ciliates, cell division in apicomplexans, and adhesion of the parasite Giardia to the intestine. In a broad bioinformatic survey, we show the presence of three ancestral eukaryotic paralogues of SFA, and note that they are present in all ‘typical excavates’, simple flagellates bearing a ventral feeding groove. In one representative, Paratrimastix pyriformis, we localised one of the SFA paralogues using specific antibodies and expansion microscopy. We show that it co-localises specifically with structures attached to the basal body of the posterior flagellum, namely the right microtubular root, composite fibre, and B-fibre. As the morphology of the ‘typical excavates’ may be ancestral to eukaryotes, we speculate that the role in the development or function of this feeding apparatus may be ancestral to the protein.
Article
Eupelagonemids, formerly known as Deep Sea Pelagic Diplonemids I (DSPD I), are among the most abundant and diverse heterotrophic protists in the deep ocean, but little else is known about their ecology, evolution, or biology in general. Originally recognized solely as a large clade of environmental small subunit ribosomal RNA (SSU rRNA) gene sequences, branching with a smaller sister group DSPD II, they were postulated to be diplonemids, a poorly studied branch of Euglenozoa. Although new diplonemids have been cultivated and studied in depth in recent years, the lack of cultured eupelagonemids has limited data to a handful of light micrographs, partial SSU rRNA gene sequences, a small number of genes from single amplified genomes, and only a single formally described species, Eupelagonema oceanica. To determine exactly where this clade goes in the tree of eukaryotes and begin to address the overall absence of biological information about this apparently ecologically important group, we conducted single-cell transcriptomics from two eupelagonemid cells. A SSU rRNA gene phylogeny shows that these two cells represent distinct subclades within eupelagonemids, each different from E. oceanica. Phylogenomic analysis based on a 125-gene matrix contrasts with the findings based on ecological survey data and shows eupelagonemids branch sister to the diplonemid subgroup Hemistasiidae.
Article
The genus Naegleria consists of 47 free-living amoebae. The genus can be found in warm aqueous or soil habitats worldwide. Naegleria fowleri is probably the best-known species of this genus. As a facultative parasite, the protist is not dependent on hosts to complete its life cycle. However, it can infect humans by entering the nose during water contact, such as swimming, and travel along the olfactory nerve to the brain. There it causes a purulent meningitis (primary amoebic meningoencephalitis or PAME). Symptoms are severe and death usually occurs within the first week. PAME is a frightening infectious disease for which there is neither a proven cure nor a vaccine. In order to contain the disease and give patients any chance of survival, action must be taken quickly. A rapid diagnosis is therefore crucial. PAME is diagnosed by the detection of amoebae in the liquor and later in the cerebrospinal fluid. For this purpose, CSF samples are cultured and stained and finally examined microscopically. Molecular techniques such as PCR or ELISA support the microscopic analysis and secure the diagnosis.
Article
Eukaryotrophic protists are ecologically significant and possess characteristics key to understanding the evolution of eukaryotes; however, they remain poorly studied, due partly to the complexities of maintaining predator–prey cultures. Kaonashia insperata, gen. nov., et sp. nov., is a free-swimming biflagellated eukaryotroph with a conspicuous ventral groove, a trait observed in distantly related lineages across eukaryote diversity. Di-eukaryotic (predator–prey) cultures of K. insperata with three marine algae (Isochrysis galbana, Guillardia theta, and Phaeodactylum tricornutum) were established by single-cell isolation. Growth trials showed that the studied K. insperata clone grew particularly well on G. theta, reaching a peak abundance of 1.0 × 10⁵ ± 4.0 × 10⁴ cells ml⁻¹. Small-subunit ribosomal DNA phylogenies infer that K. insperata is a stramenopile with moderate support; however, it does not fall within any well-defined phylogenetic group, including environmental sequence clades (e.g. MASTs), and its specific placement remains unresolved. Electron microscopy shows traits consistent with stramenopile affinity, including mastigonemes on the anterior flagellum and tubular mitochondrial cristae. Kaonashia insperata may represent a novel major lineage within stramenopiles, and be important for understanding the evolutionary history of the group. While heterotrophic stramenopile flagellates are considered to be predominantly bacterivorous, eukaryotrophy may be relatively widespread amongst this assemblage.
Article
Background: The Golgi apparatus is a central meeting point for the endocytic and exocytic systems in eukaryotic cells, and the organelle's dysfunction results in human disease. Its characteristic morphology of multiple differentiated compartments organized into stacked flattened cisternae is one of the most recognizable features of modern eukaryotic cells, and yet how this is maintained is not well understood. The Golgi is also an ancient aspect of eukaryotes, but the extent and nature of its complexity in the ancestor of eukaryotes is unclear. Various proteins have roles in organizing the Golgi, chief among them being the golgins. Results: We address Golgi evolution by analyzing genome sequences from organisms which have lost stacked cisternae as a feature of their Golgi and those that have not. Using genomics and immunomicroscopy, we first identify Golgi in the anaerobic amoeba Mastigamoeba balamuthi. We then searched 87 genomes spanning eukaryotic diversity for presence of the most prominent proteins implicated in Golgi structure, focusing on golgins. We show some candidates as animal specific and others as ancestral to eukaryotes. Conclusions: None of the proteins examined show a phyletic distribution that correlates with the morphology of stacked cisternae, suggesting the possibility of stacking as an emergent property. Strikingly, however, the combination of golgins conserved among diverse eukaryotes allows for the most detailed reconstruction of the organelle to date, showing a sophisticated Golgi with differentiated compartments and trafficking pathways in the common eukaryotic ancestor.
Article
Recent phylogenetic analyses position certain ‘orphan’ protist lineages deep in the tree of eukaryotic life, but their exact placements are poorly resolved. We conducted phylogenomic analyses that incorporate deeply sequenced transcriptomes from representatives of collodictyonids (diphylleids), rigifilids, Mantamonas and ancyromonads (planomonads). Analyses of 351 genes, using site-heterogeneous mixture models, strongly support a novel supergroup-level clade that includes collodictyonids, rigifilids and Mantamonas, which we name ‘CRuMs’. Further, they robustly place CRuMs as the closest branch to Amorphea (including animals and fungi). Ancyromonads are strongly inferred to be more distantly related to Amorphea than are CRuMs. They emerge either as sister to malawimonads, or as a separate deeper branch. CRuMs and ancyromonads represent two distinct major groups that branch deeply on the lineage that includes animals, near the most commonly inferred root of the eukaryote tree. This makes both groups crucial in examinations of the deepest-level history of extant eukaryotes.
Article
Protein transport systems are fundamentally important for maintaining mitochondrial function. Nevertheless, mitochondrial protein translocases such as the kinetoplastid ATOM complex have recently been shown to vary in eukaryotic lineages. Various evolutionary hypotheses have been formulated to explain this diversity. To resolve any contradiction, estimating the primitive state and clarifying changes from that state are necessary. Here, we present more likely primitive models of mitochondrial translocases, specifically the translocase of the outer membrane (TOM) and translocase of the inner membrane (TIM) complexes, using scrutinized phylogenetic profiles. We then analyzed the translocases' evolution in eukaryotic lineages. Based on those results, we propose a novel evolutionary scenario for diversification of the mitochondrial transport system. Our results indicate that presequence transport machinery was mostly established in the last eukaryotic common ancestor, and that primitive translocases already had a pathway for transporting presequence-containing proteins. Moreover, secondary changes including convergent and migrational gains of a presequence receptor in TOM and TIM complexes, respectively, likely resulted from constrained evolution. The nature of a targeting signal can constrain alteration to the protein transport complex.
Article
High animal and plant richness in tropical rainforest communities has long intrigued naturalists. It is unknown if similar hyperdiversity patterns are reflected at the microbial scale with unicellular eukaryotes (protists). Here we show, using environmental metabarcoding of soil samples and a phylogeny-aware cleaning step, that protist communities in Neotropical rainforests are hyperdiverse and dominated by the parasitic Apicomplexa, which infect arthropods and other animals. These host-specific parasites potentially contribute to the high animal diversity in the forests by reducing population growth in a density-dependent manner. By contrast, too few operational taxonomic units (OTUs) of Oomycota were found to broadly drive high tropical tree diversity in a host-specific manner under the Janzen-Connell model. Extremely high OTU diversity and high heterogeneity between samples within the same forests suggest that protists, not arthropods, are the most diverse eukaryotes in tropical rainforests. Our data show that protists play a large role in tropical terrestrial ecosystems long viewed as being dominated by macroorganisms.
Article
The innovation of the eukaryote cytoskeleton enabled phagocytosis, intracellular transport and cytokinesis, and is responsible for the diversity of eukaryotic morphologies. Still, the relationship between phenotypic innovations in the cytoskeleton and their underlying genotype is poorly understood. To explore the genetic mechanism of morphological evolution of the eukaryotic cytoskeleton we provide the first single cell transcriptomes from uncultured, free-living unicellular eukaryotes: the polycystine radiolarian Lithomelissa setosa (Nassellaria) and Sticholonche zanclea (Taxopodida). A phylogenomic approach using 255 genes finds Radiolaria and Foraminifera as separate monophyletic groups (together as Retaria), while Cercozoa is shown to be paraphyletic. Analysis of the genetic components of the cytoskeleton and mapping of the evolution of these to the revised phylogeny of Rhizaria reveal lineage-specific gene duplications and neofunctionalization of α and β tubulin in Retaria, actin in Retaria and Endomyxa, and Arp2/3 complex genes in Chlorarachniophyta. We show how genetic innovations have shaped cytoskeletal structures in Rhizaria, and how single cell transcriptomics can be applied for resolving deep phylogenies and studying gene evolution in uncultured protist species.
Article
The ease with which phylogenomic data can be generated has drastically escalated the computational burden for even routine phylogenetic investigations. To address this, we present phyx: a collection of programs written in C++ to explore, manipulate, analyze and simulate phylogenetic objects (alignments, trees and MCMC logs). Modelled after Unix/GNU/Linux command line tools, individual programs perform a single task and operate on standard I/O streams that can be piped to quickly and easily form complex analytical pipelines. Because of the stream-centric paradigm, memory requirements are minimized (often only a single tree or sequence in memory at any instance), and hence phyx is capable of efficiently processing very large datasets. Availability and implementation: phyx runs on POSIX-compliant operating systems. Source code, installation instructions, documentation and example files are freely available under the GNU General Public License at https://github.com/FePhyFoFum/phyx. Contact: eebsmith@umich.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
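The stream-centric idea described in this abstract (one sequence in memory at a time, with each step consuming and producing a stream) can be illustrated with a minimal Python sketch. This is not phyx code or its interface: the function names `read_fasta` and `sequence_lengths` are hypothetical, and FASTA input is assumed purely for illustration.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(stream: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) records one at a time from a FASTA text stream.

    Only the record currently being assembled is held in memory.
    """
    name = None
    chunks = []
    for line in stream:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            name = line[1:]
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def sequence_lengths(stream: TextIO) -> Iterator[Tuple[str, int]]:
    """A single-task step: map each record to (name, length), lazily."""
    for name, seq in read_fasta(stream):
        yield name, len(seq)


# Toy input held in a string buffer; any text stream works the same way.
example = io.StringIO(">seq1\nACGT\nAC\n>seq2\nGG\n")
for name, length in sequence_lengths(example):
    print(f"{name}\t{length}")
```

Passing `sys.stdin` instead of the string buffer would let such a step sit in a shell pipeline, which is the usage pattern the abstract describes for the actual C++ tools.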
Article
Protists, which are single-celled eukaryotes, critically influence the ecology and chemistry of marine ecosystems, but genome-based studies of these organisms have lagged behind those of other microorganisms. However, recent transcriptomic studies of cultured species, complemented by meta-omics analyses of natural communities, have increased the amount of genetic information available for poorly represented branches on the tree of eukaryotic life. This information is providing insights into the adaptations and interactions between protists and other microorganisms and macroorganisms, but many of the genes sequenced show no similarity to sequences currently available in public databases. A better understanding of these newly discovered genes will lead to a deeper appreciation of the functional diversity and metabolic processes in the ocean. In this Review, we summarize recent developments in our understanding of the ecology, physiology and evolution of protists, derived from transcriptomic studies of cultured strains and natural communities, and discuss how these novel large-scale genetic datasets will be used in the future.
Article
Proteins have distinct structural and functional constraints at different sites that lead to site-specific preferences for particular amino acid residues as the sequences evolve. Heterogeneity in the amino acid substitution process between sites is not modeled by commonly used empirical amino acid exchange matrices. Such model misspecification can lead to artefacts in phylogenetic estimation such as long-branch attraction. Although sophisticated site-heterogeneous mixture models have been developed to address this problem in both Bayesian and maximum likelihood (ML) frameworks, their formidable computational time and memory usage severely limit their use in large phylogenomic analyses. Here we propose a posterior mean site frequency (PMSF) method as a rapid and efficient approximation to full empirical profile mixture models for ML analysis. The PMSF approach assigns a conditional mean amino acid frequency profile to each site calculated based on a mixture model fitted to the data using a preliminary guide tree. These PMSF profiles can then be used for in-depth tree-searching in place of the full mixture model. Compared with widely used empirical mixture models with k classes, our implementation of PMSF in IQ-TREE (http://www.iqtree.org) speeds up the computation by approximately k/1.5-fold and requires a small fraction of the RAM. Furthermore, this speedup allows, for the first time, full nonparametric bootstrap analyses to be conducted under complex site-heterogeneous models on large concatenated data matrices. Our simulations and empirical data analyses demonstrate that PMSF can effectively ameliorate long-branch attraction artefacts. In some empirical and simulation settings PMSF provided more accurate estimates of phylogenies than the mixture models from which they derive.
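The "conditional mean amino acid frequency profile" in this abstract can be read as a posterior-weighted average of the mixture classes' frequency profiles. The sketch below illustrates that arithmetic for a single site under stated assumptions: the per-class site likelihoods are taken as given (in practice they would come from the mixture model evaluated on a guide tree), a three-state toy alphabet stands in for the 20 amino acids, and the function name `pmsf_profile` is hypothetical rather than part of IQ-TREE.

```python
from typing import List, Sequence


def pmsf_profile(
    class_weights: Sequence[float],
    class_freqs: Sequence[Sequence[float]],
    site_likelihoods: Sequence[float],
) -> List[float]:
    """Posterior mean frequency profile for one alignment site.

    class_weights    -- mixture weights w_k (assumed to sum to 1)
    class_freqs      -- frequency profile pi_k of each class (each sums to 1)
    site_likelihoods -- likelihood L_k of this site under each class
                        (assumed precomputed on a guide tree)
    """
    # Posterior probability of class k at this site:
    # p_k = w_k * L_k / sum_j (w_j * L_j)
    joint = [w * lik for w, lik in zip(class_weights, site_likelihoods)]
    total = sum(joint)
    if total <= 0:
        raise ValueError("site likelihoods must give a positive total")
    posterior = [j / total for j in joint]

    # Posterior mean profile: sum_k p_k * pi_k, computed state by state.
    n_states = len(class_freqs[0])
    return [
        sum(p * freqs[s] for p, freqs in zip(posterior, class_freqs))
        for s in range(n_states)
    ]


# Toy example: two classes over a three-state alphabet (illustrative values).
weights = [0.5, 0.5]
freqs = [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]
likelihoods = [0.03, 0.01]  # the site fits class 0 three times better
profile = pmsf_profile(weights, freqs, likelihoods)
print(profile)
```

With these toy numbers the class posteriors are 0.75 and 0.25, so the site's profile is pulled towards the first class. Once each site has one fixed profile, tree search no longer needs to sum over all k classes at every site, which is consistent with the speedup the abstract reports.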
Article
Recent global surveys of marine biodiversity have revealed that a group of organisms known as “marine diplonemids” constitutes one of the most abundant and diverse planktonic lineages. Though discovered over a decade ago, their potential importance was unrecognized, and our knowledge remains restricted to a single gene amplified from environmental DNA, the 18S rRNA gene (small subunit [SSU]). Here, we use single-cell genomics (SCG) and microscopy to characterize ten marine diplonemids, isolated from a range of depths in the eastern North Pacific Ocean. Phylogenetic analysis confirms that the isolates reflect the entire range of marine diplonemid diversity, and comparisons to environmental SSU surveys show that sequences from the isolates range from rare to superabundant, including the single most common marine diplonemid known. SCG generated a total of ∼915 Mbp of assembled sequence across all ten cells and ∼4,000 protein-coding genes with homologs in the Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology database, distributed across categories expected for heterotrophic protists. Models of highly conserved genes indicate a high density of non-canonical introns, lacking conventional GT-AG splice sites. Mapping metagenomic datasets to SCG assemblies reveals virtually no overlap, suggesting that nuclear genomic diversity is too great for representative SCG data to provide meaningful phylogenetic context to metagenomic datasets. This work provides an entry point to the future identification, isolation, and cultivation of these elusive yet ecologically important cells. The high density of nonconventional introns, however, also portends difficulty in generating accurate gene models and highlights the need for the establishment of stable cultures and transcriptomic analyses.
Article
Living eukaryotes descended from a heterotrophic common ancestor with a complex cytoskeleton, and mitochondria acquired through endosymbiosis. Most belong to one of a few ‘supergroups.’ Archaeplastida (which includes diverse algae plus land plants) have ‘primary’ plastids directly descended from cyanobacteria. ‘Sar’ contains many algae with ‘complex’ plastids (e.g., diatoms and dinoflagellates), and several well-known protozoan groups (e.g., ciliates, apicomplexan parasites, and foraminifera). Excavata mostly consists of flagellated protozoa, many of which are anaerobic and/or parasitic. Amoebozoa are mostly amoebae, with some being ‘slime molds,’ while Opisthokonta (within Obazoa) includes various protozoa (e.g., choanoflagellates) in addition to animals and fungi. Protists of unresolved phylogenetic position include the haptophyte and cryptophyte algae, and several protozoan groups (e.g., centrohelids).