Article · Literature Review

Selection of Conserved Blocks from Multiple Alignments for Their Use in Phylogenetic Analysis

Abstract

The use of some multiple-sequence alignments in phylogenetic analysis, particularly those that are not very well conserved, requires the elimination of poorly aligned positions and divergent regions, since they may not be homologous or may have been saturated by multiple substitutions. A computerized method that eliminates such positions and at the same time tries to minimize the loss of informative sites is presented here. The method is based on the selection of blocks of positions that fulfill a simple set of requirements with respect to the number of contiguous conserved positions, lack of gaps, and high conservation of flanking positions, making the final alignment more suitable for phylogenetic analysis. To illustrate the efficiency of this method, alignments of 10 mitochondrial proteins from several completely sequenced mitochondrial genomes belonging to diverse eukaryotes were used as examples. The percentages of removed positions were higher in the most divergent alignments. After removing divergent segments, the amino acid composition of the different sequences was more uniform, and pairwise distances became much smaller. Phylogenetic trees show that topologies can be different after selecting conserved blocks, particularly when there are several poorly resolved nodes. Strong support was found for the grouping of animals and fungi but not for the position of more basal eukaryotes. The use of a computerized method such as the one presented here reduces to a certain extent the necessity of manually editing multiple alignments, makes the automation of phylogenetic analysis of large data sets feasible, and facilitates the reproduction of the final alignment by other researchers.
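The block-selection idea in the abstract can be illustrated with a small sketch. This is a simplified approximation, not the published Gblocks algorithm: the function names, the fractional thresholds, and the rule that any gap makes a column nonconserved are all assumptions chosen for illustration.

```python
from collections import Counter

GAP = "-"

def classify_columns(alignment, min_conserved=0.5, min_highly=0.85):
    """Label each column 'H' (highly conserved), 'C' (conserved) or 'N' (nonconserved).

    Thresholds are fractions of sequences sharing the most common residue
    (hypothetical values, chosen only for this sketch).
    """
    n_seqs = len(alignment)
    labels = []
    for column in zip(*alignment):
        if GAP in column:
            labels.append("N")  # assumption: any gap makes the column nonconserved
            continue
        top_count = Counter(column).most_common(1)[0][1]
        frac = top_count / n_seqs
        if frac >= min_highly:
            labels.append("H")
        elif frac >= min_conserved:
            labels.append("C")
        else:
            labels.append("N")
    return labels

def select_blocks(labels, max_nonconserved_run=2, min_block_len=4):
    """Return half-open (start, end) intervals of retained blocks."""
    keep = [True] * len(labels)
    # Step 1: reject runs of contiguous nonconserved columns that are too long.
    i = 0
    while i < len(labels):
        if labels[i] == "N":
            j = i
            while j < len(labels) and labels[j] == "N":
                j += 1
            if j - i > max_nonconserved_run:
                for k in range(i, j):
                    keep[k] = False
            i = j
        else:
            i += 1
    # Step 2: trim each remaining run so it is flanked by highly conserved
    # columns, then discard blocks that are too short.
    blocks = []
    i = 0
    while i < len(keep):
        if keep[i]:
            j = i
            while j < len(keep) and keep[j]:
                j += 1
            start, end = i, j
            while start < end and labels[start] != "H":
                start += 1
            while end > start and labels[end - 1] != "H":
                end -= 1
            if end - start >= min_block_len:
                blocks.append((start, end))
            i = j
        else:
            i += 1
    return blocks

def trim(alignment, blocks):
    """Concatenate the selected blocks for every sequence."""
    return ["".join(seq[s:e] for s, e in blocks) for seq in alignment]

aln = ["MKVLAA-TGHW", "MKVLSC-TGHW", "MKVLTD-TGHW", "MKVLGEKTGHW"]
labels = classify_columns(aln)
blocks = select_blocks(labels)
trimmed = trim(aln, blocks)
```

In this toy alignment the three-column divergent stretch in the middle is dropped and the two flanking conserved blocks are kept, which mirrors the three requirements listed in the abstract: contiguous conserved positions, absence of gaps, and conserved flanks.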

... Protein sequences of RBOH, SOD, CAT, PrxR, APX, GPX, MDAR, DHAR, GR, Trx, and GLR from C. sinensis were retrieved from CPBD [23]. Multiple sequence alignment was performed utilizing MAFFT [38], followed by alignment refinement with Gblocks [39], model selection using PartitionFinder 2 [40], and phylogenetic tree reconstruction via IQ-tree [41]. ...
Article
Reactive oxygen species (ROS) are pivotal in signal transduction processes in plant–pathogen interactions. The ROS signaling pathways involved in Candidatus Liberibacter asiaticus (CLas) and Xanthomonas citri subspecies citri (Xcc) infections in Citrus sinensis (sweet orange) are unclear. In this study, we comprehensively identified ROS metabolism-associated genes, including 9 NADPH oxidase (RBOH), 14 superoxide dismutase (SOD), 1 catalase (CAT), 9 peroxiredoxin (PrxR), 5 ascorbate peroxidase (APX), 4 glutathione peroxidase (GPX), 3 monodehydroascorbate reductase (MDAR), 2 dehydroascorbate reductase (DHAR), 2 glutathione reductase (GR), 24 thioredoxin (Trx), and 18 glutaredoxin (GLR) genes in C. sinensis. An analysis revealed variable gene structures but conserved motifs and domains in ROS subfamilies. A comparative synteny analysis with Arabidopsis thaliana and Vitis vinifera indicated evolutionary conservation of most ROS metabolism-associated genes, with some originating from gene duplication events post-species divergence in C. sinensis. Expression profiling revealed five up-regulated genes and four down-regulated genes during both CLas and Xcc infections. Promoter analysis revealed numerous stress-responsive elements in the promoter of ROS metabolism-associated genes. Protein–protein interaction network analysis highlighted the involvement of ROS metabolism in various biological processes. A comparison of ROS metabolism-associated genes between C. sinensis and Poncirus trifoliata indicated multiple gene gain and loss events within ROS subfamilies of C. sinensis. This study enhances our understanding of ROS metabolism in C. sinensis and sheds light on citrus–pathogen interactions.
... uni-bielefeld.de/reputer) [67] was employed to locate larger repetitive sequences, with the following parameters: Hamming distance of 3, a minimum repeat size of 30 bp, and a maximum computed repeat of 5,000 bp [68]. This search aimed to identify forward (F), reverse (R), palindromic (P), and complementary (C) repeats within the LSC, IRb, IRa, and SSC regions. ...
... For chloroplast genome data, we selected 33 chloroplast genomes from 18 species for analysis (Additional file 2: Table S5). We employed MAFFT v7 to align all the complete chloroplast genomes, and gaps were deleted with Gblocks v.0.91b [68]. Subsequently, the best model GTR-GAMMA was selected in jModelTest2 on XSEDE (2.1.6) ...
Article
Background Baolia H.W.Kung & G.L.Chu is a monotypic genus only known in Diebu County, Gansu Province, China. Its systematic position is contradictory, and its morphoanatomical characters deviate from all other Chenopodiaceae. A recent study has regarded Baolia as a sister group to Corispermoideae. We therefore sequenced and compared the chloroplast genomes of this species, and resolved its phylogenetic position based on both chloroplast genomes and marker sequences. Results We sequenced 18 chloroplast genomes of 16 samples from two populations of Baolia bracteata and two Corispermum species. These genomes of Baolia ranged in size from 152,499 to 152,508 bp. Simple sequence repeats (SSRs) were primarily located in the LSC region of Baolia chloroplast genomes, and most of them consisted of single nucleotide A/T repeat sequences. Notably, there were differences in the types and numbers of SSRs between the two populations of B. bracteata. Our phylogenetic analysis, based on both complete chloroplast genomes from 33 species and a combination of three markers (ITS, rbcL, and matK) from 91 species, revealed that Baolia and Corispermoideae (Agriophyllum, Anthochlamys, and Corispermum) form a well-supported clade that is sister to Acroglochin. According to our molecular dating results, a major divergence event between Acroglochin, Baolia, and Corispermeae occurred during the Middle Eocene, approximately 44.49 mya. Ancestral state reconstruction analysis showed that Baolia shares symplesiomorphies with core Corispermoideae, including pericarp and seed coat characters. Conclusions Comparing the chloroplast genomes of B. bracteata with those of eleven typical Chenopodioideae and Corispermoideae species, we observed high overall similarity and one notable case: an inversion of a DNA segment of approximately 3,100 bp found only in two Atriplex and four Chenopodium species.
We suggest that Corispermoideae should be considered in a broader sense that includes Corispermeae (core Corispermoideae: Agriophyllum, Anthochlamys, and Corispermum), as well as two new monotypic tribes, Acroglochineae (Acroglochin) and Baolieae (Baolia).
... Multiple sequence alignments were generated using MUSCLE as implemented in MEGA X [24]. Gaps in alignments were trimmed using Gblocks [25], and alignments were further curated by manual inspection. Species trees were inferred using TimeTree (rodents, prosimians) [26] or obtained from Zoonomia (bats). ...
Article
Zika and dengue virus nonstructural protein 5 antagonism of STAT2, a critical interferon signaling transcription factor, to suppress the host interferon response is required for viremia and pathogenesis in a vertebrate host. This affects viral species tropism, as mouse STAT2 resistance renders only immunocompromised or humanized STAT2 mice infectable. Here, we explore how STAT2 evolution impacts antagonism. By measuring the susceptibility of 38 diverse STAT2 proteins, we demonstrate that resistance arose numerous times in mammalian evolution. In four species, resistance requires distinct sets of multiple amino acid changes that often individually disrupt STAT2 signaling. This reflects an evolutionary ridge where progressive resistance is balanced by the need to maintain STAT2 function. Furthermore, resistance may come with a fitness cost, as resistance that arose early in lemur evolution was subsequently lost in some lemur lineages. These findings underscore that while it is possible to evolve resistance to antagonism, complex evolutionary trajectories are required to avoid detrimental host fitness consequences.
... The suborder Striatina contained the single family Striamoebidae with the type species Striamoeba (=Thecamoeba) striata. Bovee (1985) listed this family as a member of the suborder Thecina and included the species Striamoeba munda (Schaeffer, 1926) in the genus Striamoeba. However, Page (1971, 1977) and other authors (e.g. ...
Article
Until recently, it was believed that amoebae of the genus Thecamoeba Fromentel, 1874 could be relatively easily distinguished from each other at the light-microscopic level. The main characteristics were the shape and size of the locomotive form and the morphology of the nucleus. However, recent studies with molecular methods have shown that several sibling species may be hidden behind every "classical" morphological species of Thecamoeba. Therefore, re-description and obtaining molecular data on "classic" Thecamoeba species became necessary tasks. However, during recent decades, almost all type cultures have been lost from international culture collections. During our study of the fauna of Moscow ponds, we isolated a strain identical to the type culture of T. striata established by F.C. Page both at the morphological level and by the sequence of the 18S rRNA gene. We obtained new images that clearly illustrated the diversity of locomotive forms and the morphology of the nucleus of the species T. striata. An analysis of faunistic studies showed that amoebae of "T. striata species group" are distributed almost worldwide and are a common component of freshwater and terrestrial ecosystems.
... We aligned each protein-coding gene (PCG) using RevTrans 2.0 (Wernersson and Pedersen 2003) and each rRNA using MAFFT 7.245 (Katoh and Standley 2013). Poorly aligned and highly divergent sequences were eliminated using Gblocks 0.91b (Castresana 2000) and then all sequences were concatenated. Phylogenetic analysis was performed via the maximum likelihood (ML) method with 1,000 bootstrap replications (Felsenstein 1985) using CIPRES Portal 3.1 (Miller et al. 2010), based on the TVM + I + G model, which was determined using the jModelTest (Posada 2008). ...
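The concatenation step mentioned in this excerpt can be sketched as follows. This is a minimal illustration, assuming each trimmed gene alignment is held as a dictionary from taxon name to aligned sequence; the function name `concatenate` and the gap-padding of taxa missing from a gene are assumptions, not details taken from the cited study.

```python
def concatenate(gene_alignments, missing="-"):
    """Join per-gene alignments into one supermatrix.

    gene_alignments: list of dicts mapping taxon name -> aligned sequence.
    Returns (supermatrix, partitions), where partitions holds 1-based
    inclusive (start, end) coordinates of each gene in the supermatrix.
    """
    taxa = sorted(set().union(*gene_alignments))
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    offset = 0
    for aln in gene_alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one gene alignment differ in length")
        length = lengths.pop()
        for taxon in taxa:
            # Taxa lacking this gene are padded with the missing-data symbol.
            pieces[taxon].append(aln.get(taxon, missing * length))
        partitions.append((offset + 1, offset + length))
        offset += length
    supermatrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return supermatrix, partitions

genes = [{"a": "ACGT", "b": "AC-T"}, {"a": "GG", "c": "GA"}]
matrix, parts = concatenate(genes)
```

Returning the partition coordinates alongside the supermatrix keeps gene boundaries available for later per-partition model selection.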
Article
This study is the first to sequence the complete mitochondrial genome (mitogenome) of Perforatus perforatus Bruguière, 1789 (Balanomorpha: Balanidae). The 15,536-bp long P. perforatus mitogenome contained a typical set of animal mitochondrial genes, along with one control region. The P. perforatus mitogenome had an inverted gene block (trnP-ND4L-ND4-trnH-ND5-trnF) between trnS(gct) and trnT. This inverted gene block had previously been detected in six species in three subfamilies of the Balanidae family (Balaninae, Acastinae and Megabalaninae), but our results show that it is also present in Concavinae, in which P. perforatus is included. The phylogenetic tree based on the concatenated sequences of the 13 protein-coding genes and two rRNA genes showed that P. perforatus is closely associated with Acasta sulcata and Balanus trigonus within Balanidae.
... The resulting alignment was also subjected to Gblocks version 0.91b, available on the web server of phylogeny.fr [29], to remove ambiguously aligned positions and reduce the uncertainty of phylogenetic reconstruction [30]. ...
Article
Leucobryum is a moss genus that exhibits various bioactivities. However, the identification of Leucobryum species with morphology alone remains difficult. Chemical profile analysis provides additional tools for plant classification. This method uses chemical similarities to identify the differences among some plants, especially the different varieties of plant species. The objective of this study was to obtain the chemical profiles and use chemometrics to identify selected Leucobryum species found in Thailand. Lipophilic extracts from 18 samples of five taxa were analysed with thin-layer chromatography (TLC) and high-performance liquid chromatography (HPLC). Principal component analysis (PCA) and hierarchical cluster analysis were used to investigate the similarities of the chemical profiles. Permutational multivariate analysis of variance (PERMANOVA) was performed to determine the difference among taxa. The taxonomic identities were also verified with a molecular marker. Morphological and molecular identification were consistent for five taxa. The TLC profile could separate only the genus of the sample, while HPLC chromatograms showed separation among the Leucobryum taxa and consistent patterns within the same species. In addition, the chemical profiles of Leucobryum species were separated at the species level in PCA. The PERMANOVA showed significantly different profiles among the species (P value = 0.002). Therefore, this work illustrates the potential of chemometrics as supporting evidence for species delimitation in the moss genus Leucobryum.
... Newly generated sequences were combined with sequences available in GenBank (table 1), and aligned by MAFFT v. 7, employing the auto-selected strategy in Geneious Prime v. 2023.2.1 (http://www.geneious.com/). Ambiguous sites were removed using Gblocks v. 0.91b [22], which resulted in alignments of COI (596 bp), 16S (418 bp), 18S (1758 bp) and ITS1 (667 bp). To infer the phylogenetic positions of the species examined in this study, maximum-likelihood (ML) analyses were performed using the IQ-TREE software [23]. ...
Article
Polychaetes are typically found in marine environments, with few species adapted to semi-terrestrial habitats. The genus Stygocapitella comprises interstitial polychaetes dwelling in sandy beach areas around or above the high-water line. Based on molecular data, previous studies suggested that populations from different localities around the world, lumped together as Stygocapitella subterranea, comprise multiple cryptic species. In Japan, reports on Stygocapitella were scarce, with only one species having been documented 40 years ago at Ishikari Beach in Hokkaido by the name of S. subterranea. We revisited these earlier findings and uncovered the presence of two distinct species in Stygocapitella. One of these species is herein named Stygocapitella itoi sp. nov., while the other corresponds to S. budaevae, originally described from the Russian Far East. Stygocapitella itoi sp. nov. possesses a chaetal pattern similar to that of S. australis, S. furcata and S. pacifica but can be distinguished from the congeners by two characters: a slightly forked pygidium and forked chaetae consisting of two teeth and two outer prongs. Our multi-locus phylogenetic analysis showed close relationships across the Pacific Ocean in two separated lineages in the genus, suggesting ancient dispersal or allopatric speciation after vicariance events.
... (v. 18) as reference and then aligned using MUSCLE (Edgar, 2004). Gaps at the end of the aligned sequences were removed using Gblocks (Castresana, 2000). Sequences were clustered into sequence variant clusters of 100% identity using the USEARCH cluster_fast command (Edgar, 2010) ... to infer a maximum likelihood tree. ...
Preprint
The rhizosphere microbiome contributes to crop health in the face of disease pressures. Increased diversity and production of antimicrobial metabolites are characteristics of the microbiome that underpin microbial-mediated pathogen resistance. A goal of sustainable agriculture is to unravel the mechanisms by which crops assemble beneficial microbiomes, but precise understanding of the ability of the plant to manipulate intragenus microdiversity is unclear. Through an integrative approach combining culture-dependent methods and long-read amplicon sequencing, we demonstrate cultivar-dependent taxonomic and functional microdiversity of the rhizocompetent and bioactive Pseudomonas genus associated with Fusarium-resistant versus susceptible winter wheat cultivars. The resistant cultivar demonstrated increased Pseudomonas taxonomic but not biosynthetic diversity when compared to the susceptible cultivar, correlating with a thinner root diameter of the resistant cultivar. We found enrichment of antifungal Pseudomonas isolates, genes (chitinase), and biosynthetic gene clusters (pyoverdine) in the resistant cultivar. Overall, we highlight cultivar-dependent microdiversity of Pseudomonas taxonomy and functional potential in the rhizosphere, which may link to root morphology and play a role in crop susceptibility to disease.
... The DUF247 genes were aligned by codon using Muscle (fast) in FasParser v2.10.0 [28]. Poorly aligned regions were trimmed using GBLOCKS v.0.91b with relaxed settings [29]. The unrooted phylogenetic tree was inferred by the neighbor-joining (NJ) method using MEGAX with the following parameters: Poisson model, partial deletion, and 1000 bootstrap replications [30]. ...
Article
Background The domain of unknown function 247 (DUF247) proteins are involved in plant development and stress response. Rice is an important cereal crop worldwide. Although an increasing number of DUF proteins have been identified, the understanding of DUF proteins in rice is still very limited. Results In this study, we identified 69 genes that encode DUF247 proteins in the rice (Oryza sativa) genome by homology searches and domain prediction. All the OsDUF247 proteins were classified into four major groups (I, II, III and IV) by phylogenetic analysis. Remarkably, OsDUF247 genes clustered on the chromosomes solely show close phylogenetic relationships, suggesting that gene duplications have driven the expansion of the DUF247 gene family in the rice genome. Tissue profile analysis showed that most DUF247 genes were expressed at constitutive levels in seedlings, roots, stems, and leaves, except for seven genes (LOC_Os01g21670, LOC_Os03g19700, LOC_Os05g04060, LOC_Os08g26820, LOC_Os08g26840, LOC_Os08g26850 and LOC_Os09g13410) in panicles. These seven genes were induced by various abiotic stresses, including cold, drought, heat, hormone treatment, and especially salt, as demonstrated by further experimental analysis. DUF247 proteins contain transmembrane domains located on the membrane, suggesting their significant roles in rice development and adaptation to the environment. Conclusions These findings lay the foundation for functional characterizations of DUF247 genes to unravel their exact role in rice cultivars.
... To investigate the phylogenetic placement of T. adamsii based on 13 mitochondrial PCGs, we examined the phylogenetic relationships among 29 caenogastropod species and one vetigastropod species. Each PCG was aligned using ClustalW (Thompson et al. 1994), and poorly aligned sites were removed by using Gblocks 0.91b (Castresana 2000). The vetigastropod species Haliotis rubra was used as the outgroup. ...
Article
The worm snail Thylacodes adamsii (Mörch, 1859) (Littorinimorpha: Vermetidae) is a sessile gastropod that mainly inhabits rocky shores along the warm temperate to tropical ocean. Herein, the complete mitochondrial genome (mitogenome) of T. adamsii from South Korea was characterized. The genome is 14,913 bp in length and contains 13 protein-coding genes (PCGs), 22 tRNA genes, and 2 rRNA genes. The genome organization and base composition of T. adamsii are similar to those of other vermetids. A phylogenetic tree was reconstructed using maximum likelihood based on the nucleotide sequences of the 13 PCGs; this tree supported the monophyly of Vermetidae. The complete mitogenome of T. adamsii can assist with molecular species identification and vermetid phylogenetic research in the future.
... Multiple sequence alignments for each protein were performed separately using MAFFT with iterative alignment (FFT-NS-i) and the BLOSUM62 amino acid scoring matrix (Katoh and Standley, 2013). All sites with gapped positions in more than 50% of the sequences were removed using GBLOCKS (Castresana, 2000). Maximum likelihood analysis was performed in IQ-Tree2 v 2.2.0 (Minh et al., 2020). ...
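The gap-based filter described in this excerpt (dropping every site that is gapped in more than 50% of the sequences) can be sketched in a few lines. The function name and the in-memory list-of-strings representation are assumptions made for illustration; the cited study applied the filter with Gblocks itself.

```python
def drop_gappy_columns(alignment, max_gap_fraction=0.5, gap_chars="-"):
    """Remove columns whose gap fraction exceeds max_gap_fraction.

    alignment: list of equal-length aligned sequences (strings).
    A column with exactly max_gap_fraction gaps is kept, matching the
    wording "more than 50%".
    """
    n_seqs = len(alignment)
    kept = [
        column
        for column in zip(*alignment)
        if sum(ch in gap_chars for ch in column) / n_seqs <= max_gap_fraction
    ]
    if not kept:
        return [""] * n_seqs
    # Transpose the kept columns back into rows.
    return ["".join(row) for row in zip(*kept)]

filtered = drop_gappy_columns(["AC-G", "A--G", "AT-G", "A-TG"])
```

Here the third column (gapped in three of four sequences) is removed, while the second column (gapped in exactly half) survives.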
... modules/blob/master/sloan.pm) to convert the CDS to amino acid sequences, align with MAFFT, and then convert the sequences back to nucleotides, as in Sharbrough et al. (2022). We used two distinct alignment trimming strategies: Gblocks v0.91b with the '-n' parameter set (Castresana, 2000), and ClipKIT v1.2.0 (Steenwyk et al., 2020) with the -l parameter set and using a custom Python wrapper to convert ClipKIT-trimmed amino acid alignments back to CDS alignments (https://github.com/jsharbrough/protTrim2CDS). ...
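Converting a trimmed amino acid alignment back to a CDS alignment amounts to threading codons onto the aligned protein and keeping only the retained columns. The sketch below shows that idea only; it is not the authors' wrapper, and the function names, the dictionary inputs, and the assumption that each CDS is in frame with no stop codon are all hypothetical.

```python
def thread_codons(aligned_protein, cds):
    """Return one codon (or '---') per column of the aligned protein.

    Assumes the CDS is in frame, has no stop codon, and encodes exactly
    the ungapped residues of aligned_protein.
    """
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    n_residues = sum(ch != "-" for ch in aligned_protein)
    if len(cds) % 3 != 0 or len(codons) != n_residues:
        raise ValueError("CDS length does not match the ungapped protein")
    codon_iter = iter(codons)
    return [next(codon_iter) if ch != "-" else "---" for ch in aligned_protein]

def trimmed_codon_alignment(protein_aln, cds_by_name, kept_columns):
    """Build a codon alignment restricted to the retained protein columns.

    kept_columns: 0-based indices of protein alignment columns that
    survived trimming.
    """
    result = {}
    for name, aligned_protein in protein_aln.items():
        codon_columns = thread_codons(aligned_protein, cds_by_name[name])
        result[name] = "".join(codon_columns[i] for i in kept_columns)
    return result

protein_aln = {"s1": "MK-L", "s2": "MKAL"}
cds = {"s1": "ATGAAACTG", "s2": "ATGAAGGCTCTT"}
codon_aln = trimmed_codon_alignment(protein_aln, cds, kept_columns=[0, 1, 3])
```

Working column by column keeps codons intact, so the trimmed nucleotide alignment stays in frame.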
Article
Premise Allopolyploidy—a hybridization‐induced whole‐genome duplication event—has been a major driver of plant diversification. The extent to which chromosomes pair with their proper homolog vs. with their homoeolog in allopolyploids varies across taxa, and methods to detect homoeologous gene flow (HGF) are needed to understand how HGF has shaped polyploid lineages. Methods The ABBA‐BABA test represents a classic method for detecting introgression between closely related species, but here we developed a modified use of the ABBA‐BABA test to characterize the extent and direction of HGF in allotetraploid Coffea arabica. Results We found that HGF is abundant in the C. arabica genome, with both subgenomes serving as donors and recipients of variation. We also found that HGF is highly maternally biased in plastid‐targeted—but not mitochondrial‐targeted—genes, as would be expected if plastid–nuclear incompatibilities exist between the two parent species. Discussion Together, our analyses provide a simple framework for detecting HGF and new evidence consistent with selection favoring overwriting of paternally derived alleles by maternally derived alleles to ameliorate plastid–nuclear incompatibilities. Natural selection therefore appears to shape the direction and intensity of HGF in allopolyploid coffee, indicating that cytoplasmic inheritance has long‐term consequences for polyploid lineages.
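For orientation, the classic four-taxon ABBA-BABA statistic that this abstract builds on can be computed as D = (nABBA - nBABA) / (nABBA + nBABA). The sketch below implements only that classic form on aligned sequences; it is not the modified procedure developed in the study, the function name is an assumption, and significance testing (for example a block jackknife) is omitted.

```python
def d_statistic(p1, p2, p3, outgroup, skip="-N?"):
    """Count ABBA and BABA site patterns and return (abba, baba, D).

    The outgroup allele is treated as ancestral (A); P3 must carry a
    different, derived allele (B) for a site to be informative.
    D is None when no informative sites are found.
    """
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        if any(ch in skip for ch in (a, b, c, o)):
            continue  # ignore gaps and ambiguous bases
        if c == o:
            continue  # P3 shares the ancestral allele: uninformative
        if a == o and b == c:
            abba += 1  # pattern A B B A
        elif b == o and a == c:
            baba += 1  # pattern B A B A
    total = abba + baba
    d = (abba - baba) / total if total else None
    return abba, baba, d

abba, baba, d = d_statistic("AAGTCT", "AGGTAC", "AGATAT", "AAATCC")
```

A positive D indicates an excess of allele sharing between P2 and P3, and a negative D an excess between P1 and P3.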
... Ribosomal genes (16S rRNA and 28S rRNA) were aligned using MAFFT v7.2 (Katoh and Standley 2013) with the L-INS-i algorithm. Ambiguously aligned regions were trimmed with Gblocks (Castresana 2000): for the ribosomal genes, the minimum block length was set to 2 base pairs (bp) and the allowed gap positions option was set to "with half"; for the protein-coding gene, the minimum block length was set to 3 bp, with the allowed gap positions option also set to "with half". ...
Article
In this study, we present a new species of freshwater mussel in the genus Postolata Dai et al., 2023, from Guangxi Province, China, by integrating morphological, anatomical, and molecular data. Postolata longjiangensis Liu & Wu, sp. nov. is distinguished from its congener (i.e., Postolata guangxiensis) by its shell shape, beak position, surface sculpture, nacre color, and hinge structure. Molecular species delimitation results based on the mitochondrial COI gene support the separation of Postolata longjiangensis Liu & Wu, sp. nov. from its congener. The multi-locus (COI + 16S rRNA + 28S rRNA) phylogeny reveals that this species forms the sister lineage to Postolata guangxiensis in the tribe Gonideini.
... The 16S rRNA gene sequence of the BU72 strain was predicted from the assembled genome using RNAmmer and used to construct phylogenetic trees together with the corresponding 16S rRNA gene sequences of 21 sequenced Brucella strains. The sequences were aligned using MAFFT, and poorly aligned positions were removed using Gblocks v.0.91b (Castresana 2000) with default settings. A phylogeny was constructed by the neighbour-joining method, and the tree topology was evaluated by performing a bootstrap analysis of 1000 data sets using MEGA6.0 ...
Article
Hydrocarbon and heavy metal pollution are amongst the most severe and prevalent environmental problems due to their toxicity and persistence. Bioremediation using microorganisms is considered one of the most effective ways to treat polluted sites. In the present study, we unveil the bioremediation potential of Brucella pituitosa strain BU72. Besides its ability to grow on multiple hydrocarbons as the sole carbon source and its high tolerance to several heavy metals, BU72 produces different exopolysaccharide-based surfactants (EBS) when grown with glucose or with crude oil as sole carbon source. These EBS demonstrated particular and specific functional groups as determined by Fourier transform infrared (FTIR) spectral analysis that showed a strong absorption peak at 3250 cm⁻¹ generated by the -OH group for both EBS. The FTIR spectra of the produced EBS revealed major differences in functional groups and protein content. To better understand the EBS production coupled with the degradation of hydrocarbons and heavy metal resistance, the genome of strain BU72 was sequenced. Annotation of the genome revealed multiple genes putatively involved in EBS production pathways coupled with resistance to heavy metals genes such as arsenic tolerance and cobalt-zinc-cadmium resistance. The genome sequence analysis showed the potential of BU72 to synthesise secondary metabolites and the presence of genes involved in plant growth promotion. Here, we describe the physiological, metabolic, and genomic characteristics of Brucella pituitosa strain BU72, indicating its potential as a bioremediation agent.
... Ulbr. Sequences were aligned in MAFFT v.7.505 (Katoh, 2013) and filtered in Gblocks v0.91b (Castresana, 2000). Selection of the nucleotide substitution model and construction of the phylogenetic tree were carried out in IQ-TREE v.2.2.2.7 (Nguyen, 2016) with the following parameters: Substitution model: Auto; Bootstrap analysis: Ultrafast (number of bootstrap iterations: 10000). ...
Article
The study of water crowfoots in the Yenisei River confirmed the wide distribution of the recently restored rigid-leaved North American-North Asian species Ranunculus subrigidus instead of the previously mentioned European R. circinatus. The species identification as R. subrigidus was confirmed by the analysis of the ITS region nucleotide sequence and morphological data. In contrast to literature data, samples of this species collected in the Yenisei did not have sessile leaves; in half of the samples, the length between the first and second bifurcation was greater than the maximum of 0.2 cm previously described for the species; in some of the samples, internodes were shorter than the leaves; and different types of pubescence were observed on plant shoot parts. The conditions under which this species grows in the Yenisei River differ from the previously reported ones, primarily in such parameters as the water hardness and the flow regime. At the city of Krasnoyarsk, seasonal development of R. subrigidus is affected by the Krasnoyarsk Hydroelectric Power Plant. The paper presents a geobotanical description of the Yenisei River plant communities dominated by and containing R. subrigidus.
... The cp genomes of Iris dichotoma (Iridaceae) and Lycoris sanguinea (Amaryllidaceae) were used as the outgroup [21]. The two sets of sequences were aligned with MAFFT [46], and the alignments were then adjusted with the Gblocks program [50]. The maximum likelihood (ML) method was employed to construct phylogenetic trees with RAxML version 8.0 using the GTRGAMMA model [51]. ...
Preprint
Habenaria, a member of the family Orchidaceae with a cosmopolitan distribution, has significant medicinal and ornamental value. Although both morphological and molecular data have been studied in recent years, its phylogenetic relationships are still under debate. Here, we sequenced, assembled, and annotated the whole chloroplast (cp) genomes of two Habenaria species (Habenaria aitchisonii Rchb.f. and Habenaria tibetica Schltr. ex Limpricht) growing on the Qinghai-Tibetan Plateau (QTP), and combined them with seven already published cp genomes to help characterize their genomic profiles. The two genomes ranged from 155,259 to 155,269 bp in length and both encoded 132 genes, including 86 protein-coding, 38 tRNA and 8 rRNA genes. In the cp genomes, tandem repeats (797), SSRs (2195) and diverse loci (3214) were identified. Comparative analyses of codon usage, amino acid frequency, microsatellites, oligo repeats, and transition and transversion substitutions showed similarities among the species. Moreover, we identified 16 highly polymorphic regions with nucleotide diversity above 0.02, which may be suitable for robust, authentic barcoding and for inferring the phylogeny of Habenaria species. Among the polymorphic regions, significant positive selection was detected on several genes, such as cemA, petA, and ycf1. This may reflect an important adaptation strategy of the two Habenaria species on the QTP. The phylogenetic analysis showed that H. aitchisonii and H. tibetica are more closely related to each other than to the other species, and the remaining seven species clustered in three other groups. Our findings also support the idea that Habenaria could be divided into different sections. This study enriches the genomic resources of Habenaria, which may be helpful for the conservation of these endangered species.
... Each gene was aligned separately using MUSCLE (Edgar 2004) within Geneious. To improve alignment quality and accuracy, ambiguous regions were trimmed using Gblocks v0.91b (Castresana 2000). Individual alignments were then concatenated to create a two-gene alignment for all 81 samples. ...
Article
Here, we describe a new species of Crotalaria L. discovered in Mengla County, Xishuangbanna Dai Autonomous Prefecture, Yunnan, China. The new species, Crotalaria menglaensis S.A.Rather, was confirmed by identifying diagnostic morphological characteristics, performing principal component analyses of phenotypic traits, and phylogenetic analyses based on nuclear ITS and plastid matK sequences. Phylogenetic analyses recovered the two accessions of the new species to be sister to C. bracteata Roxb. ex DC. In turn, these two species formed the sister clade to the two accessions of C. incana L. The morphometric analyses revealed that all three species were distinct, while the analyses of distinctive characters enabled unambiguous distinction of the new species by its growth habit, leaflets, flower structure and pod morphology. In contrast to the two related species, the new species is currently known only from ca. 100 mature individuals. Thus, this species is considered to be critically endangered.
... The resulting single copy orthologs were aligned using MUSCLE v.5.1 (Edgar 2004) with the option "-maxiters 16". The alignments were concatenated and parsed with Gblocks v.0.91b (Castresana 2000) with default parameters. ModelTest-NG v.0.1.7 (Darriba et al. 2020) was used to optimize the evolutionary model. ...
Preprint
Some Basidiomycete fungi are important plant pathogens, and certain species have been associated with the grapevine trunk disease esca. We present the genomes of four species associated with esca: Fomitiporia mediterranea, Fomitiporia polymorpha, Tropicoporus texanus, and Inonotus vitis. We generated high-quality phased genome assemblies using long-read sequencing. The genomic and functional comparisons identified potential virulence factors, suggesting their roles in disease development. Similar to other white-rot fungi known for their ability to degrade lignocellulosic substrates, these four genomes encoded a variety of lignin peroxidases and carbohydrate-active enzymes (CAZymes) such as CBM1, AA9, and AA2. The analysis of gene family expansion and contraction revealed dynamic evolutionary patterns, particularly in genes related to secondary metabolite production, plant cell wall decomposition, and xenobiotic degradation. The availability of these genomes will serve as a reference for further studies of diversity and evolution of virulence factors and their roles in Esca symptoms and host resistance.
... Ribosomal genes 16S, 18S, and 28S were aligned using MAFFT v7.2 (Katoh & Standley, 2013) with the L-INS-i algorithm. Ambiguous alignment regions were trimmed using Gblocks v0.91b (Castresana, 2000). The ribosomal gene block parameter was set to a minimum length of 2 bp, and the allowed gap positions were set to none. ...
Article
Freshwater bivalves (Bivalvia, Unionida) are one of the most threatened groups of animals in the world. Defining species boundaries and understanding the phylogeny and genetic diversity of these species is key to guiding their conservation and management. However, the presence of significant phenotypic plasticity and convergence within this group complicates species delimitation. This includes the freshwater mussel genus Nodularia, endemic to East Asia, for which a comprehensive understanding of species diversity and phylogenetic relationships remains elusive due to inadequate sampling in previous studies, particularly in China, a widely recognized biodiversity hotspot for freshwater mussels. Here, we conduct comprehensive taxonomic and phylogenetic analyses of Nodularia species based on extensive sampling across 23 provinces in China and multiple data sources, including shell morphology, soft body anatomy, six-gene (COI + ND1 + 16S + 18S + 28S + histone H3) and mitogenome datasets. The integrative systematics approach used here reveals 10 distinct species in this genus, four of which are new to science, i.e. Nodularia hanensis sp. nov., Nodularia huana sp. nov., Nodularia fusiformans sp. nov., Nodularia dualobtusus sp. nov. and two of which are new records for China, i.e. Nodularia dorri (Wattebled [Journal de Conchyliologie, 34, 1886, 54]) and Nodularia micheloti (Morlet [Journal de Conchyliologie, 34, 1886, 75]). We also propose that the nominal species Nodularia jourdyi (Morlet [Journal de Conchyliologie, 34, 1886, 75]) syn. nov. is a new synonym for Nodularia douglasiae (Griffith & Pidgeon, 1833) based on molecular data. BI, ML, and BEAST analyses based on the six-gene dataset and mitochondrial phylogenomics consistently support the following phylogenetic relationships: (N. dorri + (N. hanensis sp. nov. + N. micheloti)) + (N. breviconcha + (N. huana sp. nov. + (N. fusiformans sp. nov. + ((N. nuxpersicae + N. nipponensis) + (N. dualobtusus sp. nov. + N. douglasiae))))). The molecular clock with fossil calibration indicates that Nodularia originated in the Late Cretaceous period (ca. 73.78 Mya). It then diverged into two independent clades during the Middle Paleogene (ca. 45.01 Mya), followed by a rapid burst of extant speciation during the Neogene (mean age 28.28 to 4.79 Mya). Nodularia breviconcha is the earliest differentiated taxon among the 10 Nodularia taxa, appearing during the Paleogene-Neogene transition (28.28 Mya; 95% HPD = 14.35–48.44 Mya). Taken together, we provide a robust systematic framework for Nodularia species, addressing phylogenetic relationships, taxonomy, and evolutionary history of this group.
... For phylogenetic inference, single-copy orthogroups were extracted from the OrthoFinder results, and multiple sequence alignment was performed using MAFFT 7.205 (Katoh & Standley, 2013), with the parameters "--localpair --maxiterate 1000". The resulting alignments were trimmed with Gblocks 0.91b (Castresana, 2000) using the parameter "-b5=h" and then concatenated. Maximum-likelihood trees were constructed using IQ-TREE 2.1.3 ...
... Subsequently, the aligned sequences in each dataset were manually trimmed to the same length. Ambiguously aligned sites in the ribosomal gene were removed using GBLOCKS v. 0.91b (Castresana 2000) with the least stringent settings. Because of the limited molecular data of Protobranchia, different combined gene datasets were used for phylogenetic analyses of the family Yoldiidae and superfamily Nuculanoidea. ...
Article
In the present study, a previously unidentified but frequently encountered species of deep-sea protobranch, Yoldiella haimaensis sp. nov., is described as new to science from the Haima Cold Seep on the northwestern slope of the South China Sea. A morphological analysis confirmed that this species belongs to a previously undescribed species of the genus Yoldiella A.E. Verrill & K.J. Bush, 1897. It differs morphologically from other known species within the genus in its shell shape, degree of inflation, beaks, and number of hinge teeth. Furthermore, we sequenced three gene segments of Y. haimaensis sp. nov., comprising a nuclear ribosomal gene (18S rRNA), a nuclear protein-coding gene (histone H3), and a mitochondrial gene (cytochrome c oxidase subunit I, COI). Our phylogenetic analysis performed on the superfamily Nuculanoidea and family Yoldiidae indicates that the genus Yoldiella is non-monophyletic, and the widely recognized families within the superfamily Nuculanoidea are also not monophyletic. Our results provide molecular insights into the Protobranchia and highlight the necessity for further samples and data to revise the classification of families and genera within the superfamily using an integrative approach that combines morphological analysis and molecular data.
... Each protein-coding gene was aligned with the ClustalW alignment method (Thompson et al. 1994). Poorly aligned sites were removed and the sequences of each PCG were concatenated using the Gblock 0.91b (Castresana 2000). The sequences of PCGs were edge-linked, the best substitution model was selected as GTR + F + I + G4 by ModelFinder (Kalyaanamoorthy et al. 2017), and the maximum likelihood tree was inferred using the IQ-TREE online webserver (Trifinopoulos et al. 2016). ...
Article
The ground beetle Synuchus nitidus (Motschulsky, 1861) (Carabidae: Harpalinae: Sphodrini) is one of the most common species in the forests of South Korea and has the potential to be used as an environmental indicator. Here, we characterized the complete mitochondrial genome (mitogenome) of S. nitidus, which is the first in the harpaline tribe Sphodrini. Its genome is 16,392 bp in length and composed of 13 protein-coding genes (PCGs), 22 tRNA genes, two rRNA genes, and an A + T rich region. In addition, we reconstructed a maximum likelihood tree to elucidate the phylogenetic position of Sphodrini among the seven harpaline tribes using nucleotide sequences of the 13 PCGs. The ML tree supported a monophyletic clade of the subfamily Harpalinae and showed a close relationship between Sphodrini and Lebiini with a low bootstrap value. The complete mitogenome of S. nitidus could be helpful for molecular species identification and exploring phylogenetic relationships among carabids.
... Alignment of chloroplast genes across all species was performed using PRANK version 130410 [53] software using the codon model. Poorly aligned regions were trimmed using Gblocks version 0.91b [54] with the option "-t=c" selected to set the type of sequence to codons. Genes that were absent in at least one species were excluded, and the aligned sequences were combined into a super matrix. ...
Article
Background Lycium is an economically and ecologically important genus of shrubs, consisting of approximately 70 species distributed worldwide, 15 of which are located in China. Despite the economic and ecological importance of Lycium, its phylogeny, interspecific relationships, and evolutionary history remain relatively unknown. In this study, we constructed a phylogeny and estimated divergence time based on the chloroplast genomes (CPGs) of 15 species, including subspecies, of the genus Lycium from China. Results We sequenced and annotated 15 CPGs in this study. Comparative analysis of the genomes of these Lycium species revealed a typical quadripartite structure, with a total sequence length ranging from 154,890 to 155,677 base pairs (bp). The CPGs were highly conserved and moderately differentiated. Through annotation, we identified a total of 128–132 genes. Analysis of the boundaries of inverted repeat (IR) regions showed consistent positioning: the junctions of the IRb/LSC region were located in rps19 in all Lycium species, IRb/SSC between the ycf1 and ndhF genes, and SSC/IRa within the ycf1 gene. Sequence variation in the SSC region exceeded that in the IR region. We did not detect major expansions or contractions in the IR region or rearrangements or insertions in the CPGs of the 15 Lycium species. Comparative analyses revealed five hotspot regions in the CPG: trnR(UCU), atpF-atpH, ycf3-trnS(GGA), trnS(GGA), and trnL-UAG, which could potentially serve as molecular markers. In addition, phylogenetic tree construction based on the CPG indicated that the 15 Lycium species formed a monophyletic group and were divided into two typical subbranches and three minor branches. Molecular dating suggested that Lycium diverged from its sister genus approximately 17.7 million years ago (Mya) and species diversification within the Lycium species of China primarily occurred during the recent Pliocene epoch. 
Conclusion The divergence time estimation presented in this study will facilitate future research on Lycium, aid in species differentiation, and facilitate diverse investigations into this economically and ecologically important genus.
... The sequences of selected taxa along with that of the outgroup were aligned in MAFFT (Katoh et al., 2019). The ambiguously aligned regions were removed using the online version of Gblocks 0.91b (Castresana, 2000). For the analyses, the best model was selected in MEGAX (Kumar et al., 2018) using 1577 characters in the final data set. ...
Article
In the exploration of nematode diversity within the coal mine spoil of Sonebhadra district, an isolate of Ironus dentifurcatus Argo & Heyns, 1972 was collected from the soil surrounding the roots of Prosopis juliflora. This study aims to unravel the taxonomic intricacies of I. dentifurcatus through analysis of morphometric data, morphological characteristics using both light and scanning electron microscopy, and genetic scrutiny employing the SSU 18S rDNA gene marker. The study adds valuable information to the evolutionary history of I. dentifurcatus by constructing a comprehensive phylogenetic tree. This analysis is further augmented by an exploration of phylogeography and genetic divergence within the genus Ironus Bastian, 1865. The results reveal the genetic variability within the species of Ironus and the possible adaptive radiations in the group.
... For the 16S rRNA, we identified and excluded the ambiguously aligned regions using Gblocks v0.91b (Castresana, 2000; Talavera & Castresana, 2007) on a web server (available at https://molevol.cmima.csic.es/castresana/Gblocks_server.html, ...
Article
Despite considerable research efforts in recent years, the deeper phylogenetic relationships among skipper butterflies (Hesperiidae) remain unresolved. This is primarily because of limited sampling, especially within Asian and African lineages. In this study, we consolidated previous data and extensively sampled Asian and African taxa to elucidate the phylogenetic relationships within Hesperiidae. The molecular dataset comprised sequences from two mitochondrial and two nuclear gene regions from 563 species that represented 353 genera. Our analyses revealed seven subfamilies within Hesperiidae: Coeliadinae, Euschemoninae, Eudaminae, Pyrginae, Heteropterinae, Trapezitinae, and Hesperiinae. The systematics of most tribes and genera aligned with those of prior studies. However, notable differences were observed in several tribes and genera. Overall, the position of taxa assigned to incertae sedis in Hesperiinae is largely clarified in this study. Our results strongly support the monophyly of the tribe Tagiadini (Pyrginae), and the systematics of some genera are clarified with comprehensive discussion. We recognize 15 tribes within the subfamily Hesperiinae. Of these, nine tribes are discussed in detail: Aeromachini, Astictopterini, Erionotini, Unkanini (new status), Ancistroidini, Ismini (confirmed status), Plastingini (new status), Gretnini (confirmed status), and Eetionini (confirmed status). We propose four subtribes within Astictopterini: Hypoleucina subtrib.n., Aclerosina, Cupithina, and Astictopterina. Furthermore, we describe a new genus (Hyarotoides gen.n.) and reinstate two genera (Zea reinst.stat. and Sepa reinst.stat.) as valid. Additionally, we propose several new combinations: Zea mytheca comb.n., Sepa bononia comb.n. & reinst.stat., and Sepa umbrosa comb.n. This study, with extensive sampling of Asian and African taxa, greatly enhances the understanding of the skipper tree of life.
... We used MAFFT software to align the protein-coding genes [23]. We used Gblocks to remove the unaligned areas [24,25]. Finally, the results of sequence alignment are used for selection pressure analysis. ...
Article
Locomotor preferences and habitat types may drive animal evolution. In this study, we speculated that locomotor preference and habitat type may have diverse influences on Bovidae mitochondrial genes. We used selection pressure and statistical analysis to explore the evolution of mitochondrial DNA (mtDNA) protein-coding genes (PCGs) from diverse locomotor preferences and habitat types. Our study demonstrates that locomotor preference (energy demand) drives the evolution of Bovidae in mtDNA PCGs. The habitat types had no significant effect on the rate of evolution in Bovidae mitochondrial genes. Our study provides deep insight into the adaptation of Bovidae.
... The amino acid sequences of 13 PCGs for ten species were aligned individually by Clustal W (Thompson et al. 1994) with default parameters. The alignments of each PCG were refined by eliminating gaps or low conserved positions through Gblocks 0.91b web server (Castresana 2000; Dereeper et al. 2008) with default settings. The refined alignments were concatenated into a single multiple-sequence alignment. ...
Article
The striped notothen Trematomus hansoni is an Antarctic fish species belonging to the family Nototheniidae (cod icefishes) that is distributed throughout the Southern Ocean. In this study, the complete mitochondrial genome of T. hansoni was sequenced using an Illumina MiSeq platform. The circular mitochondrial genome is 19,218 bp long and contains 13 protein-coding genes, 23 tRNA genes, two rRNA genes, and one control region. Notably, there are two trnG-UCC genes and the second gene, located between trnE-UUC and trnI-GAU, has no D-arm structure. The base composition is 56.18% of A + T and 43.82% of G + C. The phylogenetic analysis supports that T. hansoni is grouped into a single clade with T. bernacchii. This study will be a valuable resource for further research on the phylogeny and evolution of the genus Trematomus.
... Table S8 shows GenBank numbers for the 68 species that were used to construct phylogenetic trees. The 13 protein-coding genes were extracted in PhyloSuite [96] for MAFFT [99] comparison, and conserved regions were selected using Gblock [100] and then linked together using concatenate sequence in PhyloSuite [96]. DAMBE [101] was used to analyze the saturation of the third-codon positions. ...
Article
Extreme weather poses huge challenges for animals that must adapt to wide variations in environmental temperature and, in many cases, it can lead to the local extirpation of populations or even the extinction of an entire species. Previous studies have found that one element of amphibian adaptation to environmental stress involves changes in mitochondrial gene expression at low temperatures. However, to date, comparative studies of gene expression in organisms living at extreme temperatures have focused mainly on nuclear genes. This study sequenced the complete mitochondrial genomes of five Asian hylid frog species: Dryophytes japonicus, D. immaculata, Hyla annectans, H. chinensis and H. zhaopingensis. It compared the phylogenetic relationships within the Hylidae family and explored the association between mitochondrial gene expression and evolutionary adaptations to cold stress. The present results showed that in D. immaculata, transcript levels of 12 out of 13 mitochondria genes were significantly reduced under cold exposure (p < 0.05); hence, we put forward the conjecture that D. immaculata adapts by entering a hibernation state at low temperature. In H. annectans, the transcripts of 10 genes (ND1, ND2, ND3, ND4, ND4L, ND5, ND6, COX1, COX2 and ATP8) were significantly reduced in response to cold exposure, and five mitochondrial genes in H. chinensis (ND1, ND2, ND3, ND4L and ATP6) also showed significantly reduced expression and transcript levels under cold conditions. By contrast, transcript levels of ND2 and ATP6 in H. zhaopingensis were significantly increased at low temperatures, possibly related to the narrow distribution of this species primarily at low latitudes. Indeed, H. zhaopingensis has little ability to adapt to low temperature (4 °C), or maybe to enter into hibernation, and it shows metabolic disorder in the cold. 
The present study demonstrates that the regulatory trend of mitochondrial gene expression in amphibians is correlated with their ability to adapt to variable climates in extreme environments. These results can predict which species are more likely to undergo extirpation or extinction with climate change and, thereby, provide new ideas for the study of species extinction in highly variable winter climates.
... Sequences were aligned with MAFFT v. 7 online (Katoh et al. 2019) and the alignment was checked manually using Mesquite v. 2023.3.81 (Maddison & Maddison 2023). Ambiguous regions were delimited using the online version of Gblocks v. 0.91b (Castresana 2000) at http://phylogeny.lirmm.fr/, allowing for gap positions within the final blocks, and carefully checked manually (Supplementary Material Files 2 & 3, available online). ...
Article
The foliicolous lichen Gallaicolichen pacificus exhibits unique goniocystangia-like structures named peltidiangia and peltidia. Its taxonomic classification within the Ascomycota has been unclear due to the absence of ascomata and lack of molecular data. Here we clarify the phylogenetic affinities of Gallaicolichen pacificus by analyzing mitochondrial small subunit ribosomal RNA (mtSSU) sequences obtained from specimens collected in New Caledonia. Ascomata and ascospores of G. pacificus , previously unknown, are described and illustrated for the first time. The results from the molecular and morphological analyses clearly indicate that Gallaicolichen pacificus belongs to the Porinaceae and is closely related to Porina guianensis. This is a remarkable extension of the already known, wide morphological diversity of thalli and diaspores produced within this family.
... Additional genomes were downloaded from NCBI for outgroups (Supplementary Table 3). Contigs matching UCE probes were aligned using MAFFT (Katoh and Standley, 2013) and trimmed with Gblocks (Castresana, 2000; Talavera and Castresana, 2007) using the following settings (-b1 0.5, -b2 0.5, -b3 6, -b4 4). Both steps were run in PHYLUCE v1.7.1. ...
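To make settings such as the b1 and b2 values quoted above more concrete, the following is a minimal Python sketch of the first step of block selection as the abstract describes it: each alignment column is labelled by how many sequences share its most common residue. It is an illustration under stated assumptions, not the Gblocks implementation. The function name `classify_columns`, the conversion of fractional thresholds into counts (fraction of sequences, rounded down, plus one), and the rule that any column containing a gap counts as nonconserved are all assumptions made for this example.

```python
from collections import Counter


def classify_columns(seqs, b1=0.5, b2=0.85):
    """Label each alignment column by its degree of conservation.

    seqs: equal-length aligned sequences, with '-' marking gaps.
    b1:   fraction of sequences needed for a "conserved" column (assumed form).
    b2:   fraction of sequences needed for a "highly conserved" column,
          the kind that may flank a block (assumed form).
    """
    n = len(seqs)
    # Assumption: thresholds are "fraction of sequences, rounded down, plus one".
    min_conserved = int(b1 * n) + 1
    min_flank = int(b2 * n) + 1

    labels = []
    for column in zip(*seqs):
        # Assumption: a column with any gap is treated as nonconserved.
        if "-" in column:
            labels.append("nonconserved")
            continue
        # Count how many sequences carry the most frequent residue.
        top_count = Counter(column).most_common(1)[0][1]
        if top_count >= min_flank:
            labels.append("highly conserved")
        elif top_count >= min_conserved:
            labels.append("conserved")
        else:
            labels.append("nonconserved")
    return labels


if __name__ == "__main__":
    alignment = ["ACGT-", "ACGA-", "ACTCA", "AGTGA"]
    for index, label in enumerate(classify_columns(alignment)):
        print(index, label)
```

A full block-selection pass would go on to join these labels into blocks, limiting runs of nonconserved columns and discarding blocks below a minimum length; that part is left out here to keep the sketch short.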
... We aligned the tRNA and rRNA genes with the MAFFT (version 7) online service (https://mafft.cbrc.jp/alignment/server/, accessed on 2 February 2024) [37,38] and removed the unreliable alignment regions using Gblocks 0.91b [39,40]. PartitionFinder2 [41] was used to evaluate the optimal partitioning scheme and substitution models on PhyloSuite v1.2.3pre3 [26] with linked branch lengths, BIC, and searching by the greedy algorithm. ...
Article
Using Illumina sequencing technology, we generated complete mitochondrial genomes (mitogenomes) of three constituent species of the aphid genus Hyalopterus Koch, Hyalopterus amygdali (Blanchard), Hyalopterus arundiniformis Ghulamullah, and Hyalopterus pruni (Geoffroy). The sizes of the Hyalopterus mitogenomes range from 15,306 to 15,410 bp, primarily due to variations in the length of non-coding regions. The Hyalopterus mitogenomes consist of 37 coding genes arranged in the order of the ancestral insect mitogenome, a control region, and a repeat region between trnE and trnF. According to the COI-based analysis, one previously reported mitogenome of H. pruni should be assigned to H. arundiniformis. The gene order, nucleotide composition, and codon usage in the Hyalopterus mitogenomes are highly conserved and similar to those of other species of Aphidinae. The tandem repeat units differ in nucleotide composition, length, and copy number across three Hyalopterus species. Within the widespread Eurasian species H. arundiniformis, variation in repeat units among different geographic populations is observed, indicating that the repeat region may provide valuable insights for studying the intraspecific diversification of aphids. Phylogenetic analyses based on 28 complete mitogenomes of Aphidinae supported the monophyly of Aphidinae, Aphidini, Macrosiphini, and two subtribes of Aphidini. Hyalopterus was monophyletic. H. amygdali and H. pruni formed a sister group, while H. arundiniformis was placed basally. Characterization of the mitogenomes of Hyalopterus provides valuable resources for further comparative studies and for advancing our understanding of the aphid mitogenome architecture.
... Maximum-likelihood trees were constructed in MEGA X, with standard settings. Trimmed alignments were then generated (29), and when required, ALTER (30) was used to convert file formats. MEME (31) and FUBAR (14) were performed with HyPhy (32) using the Datamonkey web application (33). ...
Article
Mammalian mRNAs possess an N7-methylguanosine (m7G) cap and 2′O methylation of the initiating nucleotide at their 5’ end, whereas certain viral RNAs lack these characteristic features. The human antiviral restriction factor IFIT1 recognizes and binds to specific viral RNAs that lack the 5’ features of host mRNAs, resulting in targeted suppression of viral RNA translation. This interaction imposes a significant host-driven evolutionary pressure on viruses, and many viruses have evolved mechanisms to evade the antiviral action of human IFIT1. However, less is known about the virus-driven pressures that may have shaped the antiviral activity of IFIT1 genes across mammals. Here, we take an evolution-guided approach to show that the IFIT1 gene is rapidly evolving in multiple mammalian clades, with positive selection acting upon several residues in distinct regions of the protein. In functional assays with 39 IFIT1s spanning diverse mammals, we demonstrate that IFIT1 exhibits a range of antiviral phenotypes, with many orthologs lacking antiviral activity against viruses that are strongly suppressed by other IFIT1s. We further show that IFIT1s from human and a bat, the black flying fox, inhibit Venezuelan equine encephalitis virus (VEEV) and strongly bind to Cap0 RNAs. Unexpectedly, chimpanzee IFIT1, which differs from human IFIT1 by only 8 amino acids, does not inhibit VEEV infection and exhibits minimal Cap0 RNA-binding. In mutagenesis studies, we determine that amino acids 364 and 366, the latter of which is undergoing positive selection, are sufficient to confer the differential anti-VEEV activity between human and chimpanzee IFIT1. These data suggest that virus-host genetic conflicts have influenced the antiviral specificity of IFIT1 across diverse mammalian orders.
Article
Effective species conservation necessitates the ability to accurately differentiate among species, a challenge compounded by taxonomic uncertainties in freshwater mussels due to substantial intraspecific variation and pronounced phenotypic plasticity in shell morphology. The taxonomic status and species validity of Scabies longata and S. chinensis, two species endemic in China, have been under continuous debate since establishment. The lack of essential molecular data required for a comprehensive systematic study has resulted in the unresolved taxonomic status of these two species. This study presents molecular data, including COI barcoding, COI + 28S rRNA, and mitogenomic data combined with morphological characteristics to assess the validity of S. longata and S. chinensis. Both morphological and COI barcoding data support the conclusion that S. longata and S. chinensis are junior synonyms of Nodularia douglasiae and N. nuxpersicae respectively. Our findings suggest the absence of Scabies species in China. Mitochondrial phylogenetic analyses were used to further elucidate intrageneric relationships within the genus Nodularia, revealing the following relationships: (N. breviconcha (Nodularia sp. 1 (N. douglasiae (N. nuxpersicae, N. nipponensis)))). We underscore the significance of employing an integrated taxonomic approach for species identification, especially given the considerable morphological disparities between larvae and adult freshwater mussels. Proper morphological identification of adult specimens is essential for extracting meaningful taxonomic characters. Furthermore, our findings suggest a notable resemblance between the freshwater bivalve fauna in southern China and those east of the Mekong River.
Article
Tea, one of the most widely consumed beverages globally, exhibits remarkable genomic diversity in its underlying flavour and health‐related compounds. In this study, we present the construction and analysis of a tea pangenome comprising a total of 11 genomes, with a focus on three newly sequenced genomes comprising the purple‐leaved assamica cultivar “Zijuan”, the temperature‐sensitive sinensis cultivar “Anjibaicha” and the wild accession “L618” whose assemblies exhibited excellent quality scores as they profited from latest sequencing technologies. Our analysis incorporates a detailed investigation of transposon complement across the tea pangenome, revealing shared patterns of transposon distribution among the studied genomes and improved transposon resolution with long read technologies, as shown by long terminal repeat (LTR) Assembly Index analysis. Furthermore, our study encompasses a gene‐centric exploration of the pangenome, exploring the genomic landscape of the catechin pathway with our study, providing insights on copy number alterations and gene‐centric variants, especially for Anthocyanidin synthases. We constructed a gene‐centric pangenome by structurally and functionally annotating all available genomes using an identical pipeline, which both increased gene completeness and allowed for a high functional annotation rate. This improved and consistently annotated gene set will allow for a better comparison between tea genomes. We used this improved pangenome to capture the core and dispensable gene repertoire, elucidating the functional diversity present within the tea species. This pangenome resource might serve as a valuable resource for understanding the fundamental genetic basis of traits such as flavour, stress tolerance, and disease resistance, with implications for tea breeding programmes.
Article
Coltiviruses, belonging to the genus Coltivirus within the family Spinareoviridae, are predominantly tick-borne viruses. Some of these species have been implicated in human diseases; however, their diversity, geographical distribution, and evolutionary dynamics remain inadequately understood. Therefore, this study was undertaken to explore the phylogenetic evolution of coltiviruses and related viruses. We detected novel coltivirus-related sequences in adult female Haemaphysalis megaspinosa ticks collected from Kanagawa Prefecture, Japan. Molecular phylogenetic analysis revealed a close association between the sequences and the genome sequences of known coltivirus-related viruses, namely Qinghe tick reovirus and Fennes virus. The putative coltivirus-related virus was tentatively designated the Nakatsu tick virus. This study provides insights into the phylogenetic evolution of coltiviruses and related viruses.
Article
Bivalves constitute an important resource for fisheries and as cultural objects. Bivalve phylogenetics has had a long tradition using both morphological and molecular characters, and genomic resources are available for a good number of commercially important species. However, relationships among bivalve families have been unstable and major conflicting results exist between mitogenomics and results based on Sanger-based amplicon sequencing or phylotranscriptomics. Here we design and test an ultraconserved elements probe set for the class Bivalvia with the aim to use hundreds of loci without the need to sequence full genomes or transcriptomes, which are expensive and complex to analyze, and to open bivalve phylogenetics to museum specimens. Our probe set successfully captured 1,513 UCEs for a total of 263,800 bp with an average length of 174.59 ±3.44 per UCE (ranging from 28 to 842 bp). Phylogenetic testing of this UCE probe set across Bivalvia and within the family Donacidae using different data matrices and methods for phylogenetic inference shows promising results at multiple taxonomic levels. In addition, our probe set was able to capture large numbers of UCEs for museum specimens collected before 1900 and from DNAs properly stored, of which many museums and laboratories are well stocked. Overall, this constitutes a novel and useful resource for bivalve phylogenetics.
Article
Strain MP-1014T, an obligate halophilic actinobacterium, was isolated from the mangrove soil of Thandavarayancholanganpettai, Tamil Nadu, India. A polyphasic approach was used to determine its phylogenetic position. The isolate was Gram-positive, filamentous, non-motile, and coccoid in older cultures. Optimal growth occurred at 30 °C and pH 7.0, with 5% NaCl (w/v), and the DNA G + C content was 73.3%. Phylogenetic analysis of this strain based on the 16S rRNA gene sequence revealed 97–99.8% similarity to the recognized species of the genus Isoptericola. Strain MP-1014T exhibits the highest similarity to I. sediminis JC619T (99.7%), followed by I. chiayiensis KCTC19740T (98.9%) and I. halotolerans KCTC19646T (98.6%); similarity to other members of the genus Isoptericola is below 98%. ANI scores of strain MP-1014T are 86.4%, 84.2%, and 81.5% and dDDH values are 59.7%, 53.6%, and 34.8% with I. sediminis JC619T, I. chiayiensis KCTC19740T and I. halotolerans KCTC19646T respectively. The major polar lipids of strain MP-1014T were phosphatidylinositol, phosphatidylglycerol, diphosphatidylglycerol, two unknown phospholipids, and glycolipids. The predominant respiratory menaquinones were MK9 (H4) and MK9 (H2). The major fatty acids were anteiso-C15:0, anteiso-C17:0, iso-C14:0, C15:0, and C16:0. Initial genome analysis also suggests that the organism could serve as a biostimulant for enhancing agriculture in saline environments. Based on its phenotypic, phylogenetic, and genetic distinctiveness, strain MP-1014T represents a novel species of the genus Isoptericola, for which the name Isoptericola haloaureus sp. nov. is proposed. The type strain is MP-1014T [(NCBI = OP672482.1 = GCA_036689775.1) ATCC = BAA 2646T; DSMZ = 29325T; MTCC = 13246T].
Article
The shrimp genera Leptalpheus Williams, 1965 and Fenneralpheus Felder & Manning, 1986 are composed entirely of symbiotic species that co-inhabit burrows of infaunal macrocrustaceans. We report extensive collections of these genera from western Atlantic, eastern Pacific and Indo-West Pacific regions. Integrative taxonomy methods, including morphological comparisons and analysis of three mitochondrial genetic markers, are used to test species hypotheses and evolutionary relationships among members of these genera. Our molecular analysis failed to recover Leptalpheus or Fenneralpheus as monophyletic groups. Our results strongly supported the monophyly of three clades composed of species of Leptalpheus, loosely corresponding to previously proposed species groups. Three new species closely related to Leptalpheus forceps Williams, 1965, L. marginalis Anker, 2011, and L. mexicanus Ríos & Carvacho, 1983 are described. Leptalpheus ankeri n. sp., from the Caribbean Sea, Atlantic coast of Florida, and Gulf of Mexico, is a polymorphic species that exhibits two major cheliped morphotypes. Leptalpheus sibo n. sp., from the Pacific coast of Nicaragua, is morphologically very similar to L. ankeri n. sp., likely its transisthmian sister species, and shares its cheliped polymorphism. A reassessment of L. forceps concluded that records of this species from the Caribbean Sea and Brazil are not conspecific with L. forceps sensu stricto from the Atlantic coast of the USA and the Gulf of Mexico, and they are herein described as Leptalpheus degravei n. sp. Based on both molecular and morphological evidence, we found Leptalpheus bicristatus Anker, 2011 to be a junior synonym of L. mexicanus and Leptalpheus canterakintzi Anker & Lazarus, 2015 to be a junior synonym of Leptalpheus azuero Anker, 2011. First reports of Leptalpheus axianassae Dworschak & Coelho, 1999 in Texas and Mexico, Leptalpheus denticulatus Anker & Marin, 2009 in the Mariana Islands, Leptalpheus felderi Anker, Vera Caripe & Lira, 2006 and Leptalpheus lirai Vera Caripe, Pereda & Anker, 2021 in the USA, and Leptalpheus pereirai Anker & Vera Caripe, 2016 in Cuba are included.
Article
A Gram-staining-positive actinomycete named YZH12T was isolated from the sediment of the Yangtze River in Nanjing, Jiangsu province, China. Cells were aerobic, non-spore forming, non-motile, short rods (0.4–0.6 × 0.5–1.0 µm) or cocci (0.4–0.6 µm in diameter). Colonies were circular, smooth, and beige to yellowish. Growth occurred at 15–42 °C (optimal 28 °C), pH 5.0–9.0 (optimal 7.0), and 0–10% (w/v) NaCl (optimal 2%). The strain could tolerate 1500 mg/L of imazamox. Strain YZH12T showed 98.7% 16S rRNA gene sequence similarity to Nocardioides zeae JM-1068T and less than 97% similarity to other type strains in the genus Nocardioides. Phylogenetic analysis based on genome and 16S rRNA gene sequences indicated that strain YZH12T was phylogenetically affiliated to the genus Nocardioides and formed a subclade with N. zeae JM-1068T and N. alkalitolerans DSM 16699T. The average nucleotide identity (ANI) and digital DNA–DNA hybridization (dDDH) values between YZH12T and the closely related type strain N. zeae JM-1068T were 79.9% and 35.2%, respectively. The major fatty acids (> 5%) were C18:1ω9c, iso-C16:0, C16:0, C17:1ω8c and C18:0; the major respiratory quinone was MK-8(H4); and the polar lipid profile comprised diphosphatidylglycerol (DPG), phosphatidylglycerol (PG), glycolipid (GL), two aminophospholipids (APL1, APL2), and an unknown polar lipid (L). The genomic DNA G + C content is 73.5%. Based on the phenotypic, chemotaxonomic, and phylogenetic analyses and genomic data, strain YZH12T represents a novel species of the genus Nocardioides, for which the name Nocardioides imazamoxiresistens YZH12T is proposed, with strain YZH12T (= KCTC 49964T = MCCC 1K0892T) as the type strain.
Article
Full-text available
Most vertebrates develop distinct females and males, where sex is determined by repeatedly evolved environmental or genetic triggers. Undifferentiated sex chromosomes and large genomes have caused major knowledge gaps in amphibians. Only a single master sex-determining gene, the dmrt1-paralogue (dm-w) of female-heterogametic clawed frogs (Xenopus; ZW♀/ZZ♂), is known across >8740 species of amphibians. In this study, by combining chromosome-scale female and male genomes of a non-model amphibian, the European green toad, Bufo(tes) viridis, with ddRAD- and whole genome pool-sequencing, we reveal a candidate master locus, governing a male-heterogametic system (XX♀/XY♂). Targeted sequencing across multiple taxa uncovered structural X/Y-variation in the 5′-regulatory region of the gene bod1l, where a Y-specific non-coding RNA (ncRNA-Y), only expressed in males, suggests that this locus initiates sex-specific differentiation. Developmental transcriptomes and RNA in-situ hybridization show timely and spatially relevant sex-specific ncRNA-Y and bod1l-gene expression in primordial gonads. This coincided with differential H3K4me-methylation in pre-granulosa/pre-Sertoli cells, pointing to a specific mechanism of amphibian sex determination.
Article
Background: Traditional Chinese medicine has used Peucedanum praeruptorum Dunn (Apiaceae) for a long time. Various coumarins, including the significant constituents praeruptorins A–E, are the active constituents in the dried roots of P. praeruptorum. Previous transcriptomic and metabolomic studies have attempted to elucidate the distribution and biosynthetic network of these medicinally valuable compounds. However, the lack of a high-quality reference genome impedes an in-depth understanding of genetic traits and thus the development of better breeding strategies. Results: A telomere-to-telomere (T2T) genome was assembled for P. praeruptorum by combining PacBio HiFi, ONT ultra-long, and Hi-C data. The final genome assembly was approximately 1.798 Gb, assigned to 11 chromosomes with genome completeness >98%. Comparative genomic analysis suggested that P. praeruptorum experienced 2 whole-genome duplication events. Through transcriptomic and metabolomic analysis of the coumarin metabolic pathway, we characterized the spatial and temporal distribution of coumarins and the expression patterns of critical genes for their biosynthesis. Notably, the COSY and cytochrome P450 genes showed tandem duplications on several chromosomes, which may be responsible for the high accumulation of coumarins. Conclusions: A T2T genome for P. praeruptorum was obtained, providing molecular insights into the chromosomal distribution of the coumarin biosynthetic genes. This high-quality genome is an essential resource for designing engineering strategies for improving the production of these valuable compounds.
Article
Full-text available
Based on host specificity and distribution data, it has been hypothesized that Ganaspis brasiliensis (Ihering, 1905), a natural enemy of the horticultural pest spotted-wing drosophila, Drosophila suzukii Matsumura, 1931 (SWD), was composed of multiple, cryptic species. Parasitoid wasps assigned to the species name Ganaspis brasiliensis and Ganaspis cf. brasiliensis were investigated using a molecular dataset of ultra-conserved elements (UCEs) and morphology. We report strong evidence for the presence of cryptic species based on the combination of UCE data (1,379 UCE loci), host specificity, ovipositor morphology, and distribution data. We describe these new cryptic species as Ganaspis lupini sp. nov. and Ganaspis kimorum sp. nov. Ganaspis lupini was formerly recognized as Ganaspis brasiliensis G3, and Ganaspis kimorum as Ganaspis brasiliensis G1. These two new species appear to be restricted to the temperate climates, whereas Ganaspis brasiliensis (formerly recognized as Ganaspis brasiliensis G5) has a more pan-tropical distribution. We investigated the characterization of the ovipositor clip of these species, and hypothesize that G. kimorum, which has a reduced ovipositor clip, has an advantage in ovipositing into fresh fruit, still on the host plant, while attacking SWD; as a corollary, G. brasiliensis and G. lupini, which both have a larger ovipositor clip, are better adapted to attacking hosts in softer, rotting fruit on the ground. As Ganaspis kimorum was authorized for release as a biological control agent against SWD under the name Ganaspis brasiliensis G1, the results here have direct impact on the field of biological control.
Preprint
The superfamily Stromboidea is a clade of morphologically distinctive gastropods which include the iconic Strombidae, or ‘true conchs’. In this study, we present the most taxonomically extensive phylogeny of the superfamily to date, using fossil calibrations to produce a chronogram and extant geographical distributions to reconstruct ancestral ranges. From these results, we confirm the monophyly of all stromboidean families; however, six genera are not monophyletic using current generic assignments (Strombidae: Lentigo, Canarium, Dolomena, Doxander; Xenophoridae: Onustus, Xenophora). Within Strombidae, analyses resolve an Indo-West Pacific (IWP) clade sister to an East Pacific/Atlantic clade, together sister to a second, larger IWP clade. Our results also indicate two pulses of strombid diversification within the Miocene, and a Tethyan/IWP origin for Strombidae – both supported by the fossil record. However, conflicts between divergence time estimates and the fossil record warrant further exploration. Species delimitation analyses using the COI barcoding gene support several taxonomic changes. We synonymise Euprotomus aurora with Euprotomus bulla, Strombus alatus with Strombus pugilis, Dolomena abbotti with Dolomena labiosa, and Dolomena operosa with Dolomena vittata. We identified cryptic species complexes within Terebellum terebellum, Lambis lambis, “Canarium” wilsonorum, Dolomena turturella and Maculastrombus mutabilis. We reinstate Rimellopsis laurenti as a species (previously synonymised with R. laurenti) and recognise Harpago chiragra rugosus and Lambis truncata sowerbyi as valid at the rank of species. Finally, we establish several new combinations, rendering Lentigo, Dolomena, and Canarium monophyletic: Lentigo thersites, Dolomena robusta, Dolomena epidromis, Dolomena turturella, Dolomena taeniata, Dolomena vanikorensis, D. vittata, “Canarium” wilsonorum, Hawaiistrombus scalariformis, Maculastrombus mutabilis, Maculastrombus microurceus.
Article
We present a draft genome of the little bush moa (Anomalopteryx didiformis)—one of approximately nine species of extinct flightless birds from Aotearoa, New Zealand—using ancient DNA recovered from a fossil bone from the South Island. We recover a complete mitochondrial genome at 249.9× depth of coverage and almost 900 megabases of a male moa nuclear genome at ~4 to 5× coverage, with sequence contiguity sufficient to identify more than 85% of avian universal single-copy orthologs. We describe a diverse landscape of transposable elements and satellite repeats, estimate a long-term effective population size of ~240,000, identify a diverse suite of olfactory receptor genes and an opsin repertoire with sensitivity in the ultraviolet range, show that the wingless moa phenotype is likely not attributable to gene loss or pseudogenization, and identify potential function-altering coding sequence variants in moa that could be synthesized for future functional assays. This genomic resource should support further studies of avian evolution and morphological divergence.
Article
Full-text available
Green plants appear to comprise two sister lineages, Chlorophyta (classes Chlorophyceae, Ulvophyceae, Trebouxiophyceae, and Prasinophyceae) and Streptophyta (Charophyceae and Embryophyta, or land plants). To gain insight into the nature of the ancestral green plant mitochondrial genome, we have sequenced the mitochondrial DNAs (mtDNAs) of Nephroselmis olivacea and Pedinomonas minor. These two green algae are presumptive members of the Prasinophyceae. This class is thought to include descendants of the earliest diverging green algae. We find that Nephroselmis and Pedinomonas mtDNAs differ markedly in size, gene content, and gene organization. Of the green algal mtDNAs sequenced so far, that of Nephroselmis (45,223 bp) is the most ancestral (minimally diverged) and occupies the phylogenetically most basal position within the Chlorophyta. Its repertoire of 69 genes closely resembles that in the mtDNA of Prototheca wickerhamii, a later diverging trebouxiophycean green alga. Three of the Nephroselmis genes (nad10, rpl14, and rnpB) have not been identified in previously sequenced mtDNAs of green algae and land plants. In contrast, the 25,137-bp Pedinomonas mtDNA contains only 22 genes and retains few recognizably ancestral features. In several respects, including gene content and rate of sequence divergence, Pedinomonas mtDNA resembles the reduced mtDNAs of chlamydomonad algae, with which it is robustly affiliated in phylogenetic analyses. Our results confirm the existence of two radically different patterns of mitochondrial genome evolution within the green algae.
Article
Full-text available
The mitochondrial DNA (mtDNA) of Porphyra purpurea, a circular-mapping genome of 36,753 bp, has been completely sequenced. A total of 57 densely packed genes has been identified, including the basic set typically found in animals and fungi, as well as seven genes characteristic of protist and plant mtDNAs and specifying ribosomal proteins and subunits of succinate:ubiquinone oxidoreductase. The mitochondrial large subunit rRNA gene contains two group II introns that are extraordinarily similar to those found in the cyanobacterium Calothrix sp., suggesting a recent lateral intron transfer between a bacterial and a mitochondrial genome. Notable features of P. purpurea mtDNA include the presence of two 291-bp inverted repeats that likely mediate homologous recombination, resulting in genome rearrangement, and of numerous sequence polymorphisms in the coding and intergenic regions. Comparative analysis of red algal mitochondrial genomes from five different, evolutionarily distant orders reveals that rhodophyte mtDNAs are unusually uniform in size and gene order. Finally, phylogenetic analyses provide strong evidence that red algae share a common ancestry with green algae and plants.
Article
Full-text available
We present ADE-4, a multivariate analysis and graphical display software. Multivariate analysis methods available in ADE-4 include usual one-table methods like principal component analysis and correspondence analysis, spatial data analysis methods (using a total variance decomposition into local and global components, analogous to Moran and Geary indices), discriminant analysis and within/between groups analyses, many linear regression methods including lowess and polynomial regression, multiple and PLS (partial least squares) regression and orthogonal regression (principal component regression), projection methods like principal component analysis on instrumental variables, canonical correspondence analysis and many other variants, coinertia analysis and the RLQ method, and several three-way table (k-table) analysis methods. Graphical display techniques include an automatic collection of elementary graphics corresponding to groups of rows or to columns in the data table, thus providing a very efficient way for automatic k-table graphics and geographical mapping options. A dynamic graphic module allows interactive operations like searching, zooming, selection of points, and display of data values on factor maps. The user interface is simple and homogeneous among all the programs; this contributes to making the use of ADE-4 very easy for nonspecialists in statistics, data analysis or computer science.
Article
Full-text available
We describe here the complete genome sequence (1,111,523 base pairs) of the obligate intracellular parasite Rickettsia prowazekii, the causative agent of epidemic typhus. This genome contains 834 protein-coding genes. The functional profiles of these genes show similarities to those of mitochondrial genes: no genes required for anaerobic glycolysis are found in either R. prowazekii or mitochondrial genomes, but a complete set of genes encoding components of the tricarboxylic acid cycle and the respiratory-chain complex is found in R. prowazekii. In effect, ATP production in Rickettsia is the same as that in mitochondria. Many genes involved in the biosynthesis and regulation of biosynthesis of amino acids and nucleosides in free-living bacteria are absent from R. prowazekii and mitochondria. Such genes seem to have been replaced by homologues in the nuclear (host) genome. The R. prowazekii genome contains the highest proportion of non-coding DNA (24%) detected so far in a microbial genome. Such non-coding sequences may be degraded remnants of 'neutralized' genes that await elimination from the genome. Phylogenetic analyses indicate that R. prowazekii is more closely related to mitochondria than is any other microbe studied so far.
Article
Full-text available
Mitochondrial DNA (mtDNA) sequences are widely used for inferring the phylogenetic relationships among species. Clearly, the assumed model of nucleotide or amino acid substitution used should be as realistic as possible. Dependence among neighboring nucleotides in a codon complicates modeling of nucleotide substitutions in protein-encoding genes. It seems preferable to model amino acid substitution rather than nucleotide substitution. Therefore, we present a transition probability matrix of the general reversible Markov model of amino acid substitution for mtDNA-encoded proteins. The matrix is estimated by the maximum likelihood (ML) method from the complete sequence data of mtDNA from 20 vertebrate species. This matrix represents the substitution pattern of the mtDNA-encoded proteins and shows some differences from the matrix estimated from the nuclear-encoded proteins. The use of this matrix would be recommended in inferring trees from mtDNA-encoded protein sequences by the ML method.
Article
Full-text available
A maximum likelihood method for inferring evolutionary trees from DNA sequence data was developed by Felsenstein (1981). In evaluating the extent to which the maximum likelihood tree is a significantly better representation of the true tree, it is important to estimate the variance of the difference between the log likelihoods of different tree topologies. Bootstrap resampling can be used for this purpose (Hasegawa et al. 1988; Hasegawa and Kishino 1989), but it imposes a great computational burden. To overcome this difficulty, we developed a new method for estimating the variance by expressing it explicitly. The method was applied to DNA sequence data from primates in order to evaluate the maximum likelihood branching order among Hominoidea. It was shown that, although the orangutan is convincingly placed as an outgroup of a human and African apes clade, the branching order among human, chimpanzee, and gorilla cannot be determined confidently from the DNA sequence data presently available when the evolutionary rate constancy is not assumed.
Article
Full-text available
A maximum likelihood method for inferring protein phylogeny was developed. It is based on a Markov model that takes into account the unequal transition probabilities among pairs of amino acids and does not assume constancy of rate among different lineages. Therefore, this method is expected to be powerful in inferring phylogeny among distantly related proteins, either orthologous or paralogous, where the evolutionary rate may deviate from constancy. Not only amino acid substitutions but also insertion/deletion events during evolution were incorporated into the Markov model. A simple method for estimating a bootstrap probability for the maximum likelihood tree among alternatives without performing a maximum likelihood estimation for each resampled data set was developed. These methods were applied to amino acid sequence data of a photosynthetic membrane protein, psbA, from photosystem II, and the phylogeny of this protein was discussed in relation to the origin of chloroplasts.
Article
Full-text available
The DNA sequence of the 15,532-base pair (bp) mitochondrial DNA (mtDNA) of the chiton Katharina tunicata has been determined. The 37 genes typical of metazoan mtDNA are present: 13 for protein subunits involved in oxidative phosphorylation, 2 for rRNAs and 22 for tRNAs. The gene arrangement resembles those of arthropods much more than that of another mollusc, the bivalve Mytilus edulis. Most genes abut directly or overlap, and abbreviated stop codons are inferred for four genes. Four junctions between adjacent pairs of protein genes lack intervening tRNA genes; however, at each of these junctions there is a sequence immediately adjacent to the start codon of the downstream gene that is capable of forming a stem-and-loop structure. Analysis of the tRNA gene sequences suggests that the D arm is unpaired in tRNA(ser)(AGN), which is typical of metazoan mtDNAs, and also in tRNA(ser)(UCN), a condition found previously only in nematode mtDNAs. There are two additional sequences in Katharina mtDNA that can be folded into structures resembling tRNAs; whether these are functional genes is unknown. All possible codons except the stop codons TAA and TAG are used in the protein-encoding genes, and Katharina mtDNA appears to use the same variation of the mitochondrial genetic code that is used in Drosophila and Mytilus. Translation initiates at the codons ATG, ATA and GTG. A + T richness appears to have affected codon usage patterns and, perhaps, the amino acid composition of the encoded proteins. A 142-bp non-coding region between tRNA(glu) and CO3 contains a 72-bp tract of alternating A and T.
Article
Full-text available
SWISS-PROT (http://www.expasy.ch/) is a curated protein sequence database which strives to provide a high level of annotations (such as the description of the function of a protein, its domains structure, post-translational modifications, variants, etc.), a minimal level of redundancy and high level of integration with other databases. Recent developments of the database include: an increase in the number and scope of model organisms; cross-references to two additional databases; a variety of new documentation files and improvements to TrEMBL, a computer annotated supplement to SWISS-PROT. TrEMBL consists of entries in SWISS-PROT-like format derived from the translation of all coding sequences (CDS) in the EMBL nucleotide sequence database, except the CDS already included in SWISS-PROT.
Article
Full-text available
A bacterial spore was revived, cultured, and identified from the abdominal contents of extinct bees preserved for 25 to 40 million years in buried Dominican amber. Rigorous surface decontamination of the amber and aseptic procedures were used during the recovery of the bacterium. Several lines of evidence indicated that the isolated bacterium was of ancient origin and not an extant contaminant. The characteristic enzymatic, biochemical, and 16S ribosomal DNA profiles indicated that the ancient bacterium is most closely related to extant Bacillus sphaericus.
Article
Full-text available
An obligately aerobic chemoheterotrophic bacterium (strain F199) previously isolated from Southeast Coastal Plain subsurface sediments and shown to degrade toluene, naphthalene, and other aromatic compounds (J. K. Fredrickson, F. J. Brockman, D. J. Workman, S. W. Li, and T. O. Stevens, Appl. Environ. Microbiol. 57:796-803, 1991) was characterized by analysis of its 16S rRNA nucleotide base sequence and cellular lipid composition. Strain F199 contained 2-OH14:0 and 18:1 omega 7c as the predominant cellular fatty acids and sphingolipids that are characteristic of the genus Sphingomonas. Phylogenetic analysis of its 16S rRNA sequence indicated that F199 was most closely related to Sphingomonas capsulata among the bacteria currently in the Ribosomal Database. Five additional isolates from deep Southeast Coastal Plain sediments were determined by 16S rRNA sequence analysis to be closely related to F199. These strains also contained characteristic sphingolipids. Four of these five strains could also grow on a broad range of aromatic compounds and could mineralize [14C]toluene and [14C]naphthalene. S. capsulata (ATCC 14666), Sphingomonas paucimobilis (ATCC 29837), and one of the subsurface isolates were unable to grow on any of the aromatic compounds or mineralize toluene or naphthalene. These results indicate that bacteria within the genus Sphingomonas are present in Southeast Coastal Plain subsurface sediments and that the capacity for degrading a broad range of substituted aromatic compounds appears to be common among Sphingomonas species from this environment.
Article
Full-text available
The 16,260-bp mitochondrial DNA (mtDNA) from the starfish Asterina pectinifera has been sequenced. The genes for 13 proteins, two rRNAs and 22 tRNAs are organized in an extremely economical fashion, similar to those of other animal mtDNAs, with some of the genes overlapping each other. The gene organization is the same as that for another echinoderm, sea urchin, except for the inversion of a 4.6-kb segment that contains genes for two proteins, 13 tRNAs and the 16S rRNA. Judging from the organization of the protein coding genes, mammalian mtDNAs resemble the sea urchin mtDNA more than that of the starfish. The region around the 3' end of the 12S rRNA gene of the starfish shows a high similarity with those for vertebrates. This region encodes a possible stem and loop structure; similar potential structures occur in this region of vertebrate mtDNAs and also in nonmitochondrial small subunit rRNA. A similar stem and loop structure is also found at the 3' end of the 16S rRNA genes in A. pectinifera, in another starfish Pisaster ochraceus, in vertebrates and in Drosophila, but not in sea urchins. The full sequence data confirm the presumption that AGA/AGG, AUA and AAA codons, respectively, code for serine, isoleucine, and asparagine in the starfish mitochondria, and that AGA/AGG codons are read by tRNA(GCUSer), which possesses a truncated dihydrouridine arm, that was previously suggested from a partial mtDNA sequence. The structural characteristics of tRNAs and possible mechanisms for the change in the mitochondrial genetic code are also discussed.
Article
Full-text available
Molecular systematists generally rely on computer algorithms to establish the alignment of DNA sequences. However, when alignment regions are characterized by multiple insertions and deletions, these gap-filled stretches of DNA are often excised before phylogenetic reconstruction. This exclusion of systematic data is generally determined by subjective criteria. We explore a replicable methodology in which the comparison of several multiple sequence alignments can be used to eliminate regions of unstable sequence alignment. Using crocodilian and insect mitochondrial (mt) ribosomal (r) DNA as examples, we caution against the removal of sequence data prior to phylogenetic reconstruction.
Article
Full-text available
Molecular evolutionary relationships within the protozoan order Kinetoplastida were deduced from comparisons of the nuclear small and large subunit ribosomal RNA (rRNA) gene sequences. These studies show that relationships among the trypanosomatid protozoans differ from those previously proposed from studies of organismal characteristics or mitochondrial rRNAs. The genera Leishmania, Endotrypanum, Leptomonas, and Crithidia form a closely related group, which shows progressively more distant relationships to Phytomonas and Blastocrithidia, Trypanosoma cruzi, and lastly Trypanosoma brucei. The rooting of the trypanosomatid tree was accomplished by using Bodo caudatus (family Bodonidae) as an outgroup, a status confirmed by molecular comparisons with other eukaryotes. The nuclear rRNA tree agrees well with data obtained from comparisons of other nuclear genes. Differences with the proposed mitochondrial rRNA tree probably reflect the lack of a suitable outgroup for this tree, as the topologies are otherwise similar. Small subunit rRNA divergences within the trypanosomatids are large, approaching those among plants and animals, which underscores the evolutionary antiquity of the group. Analysis of the distribution of different parasitic life-styles of these species in conjunction with a probable timing of evolutionary divergences suggests that vertebrate parasitism arose multiple times in the trypanosomatids.
Article
Full-text available
Phylogenetic relationships among plants, animals, and fungi were examined by using sequences from 25 proteins. Four insertions/deletions were found that are shared by two of the three taxonomic groups in question, and all four are uniquely shared by animals and fungi relative to plants, protists, and bacteria. These include a 12-amino acid insertion in translation elongation factor 1 alpha and three small gaps in enolase. Maximum-parsimony trees were constructed from published data for four of the most broadly sequenced of the 25 proteins, actin, alpha-tubulin, beta-tubulin, and elongation factor 1 alpha, with the latter supplemented by three new outgroup sequences. All four proteins place animals and fungi together as a monophyletic group to the exclusion of plants and a broad diversity of protists. In all cases, bootstrap analyses show no support for either an animal-plant or fungal-plant clade. This congruence among multiple lines of evidence strongly suggests, in contrast to traditional and current classification, that animals and fungi are sister groups while plants constitute an independent evolutionary lineage.
Article
Full-text available
Advanced sequencing techniques allow rapid deduction of individual amino acid sequences of highly related proteins. Due to their quasi-species nature, viral genomes (e.g. HIV-1) represent one of the most common sources of related proteins. Immunoglobulins are another example of related proteins. Local differences in amino acid conservation are useful indicators of potential domain structures and immunological or functional epitopes prior to structural analysis of proteins. Although variability indices can be calculated by several methods, delineation of boundaries between sequence stretches with similar variability indices is left to the user. We use scale-space filtering to delineate conserved and variable sequence stretches within a protein on an algorithmic basis, avoiding arbitrary assignments. Our method correctly identified variable regions for the human immunoglobulin λ-chain V-regions (subgroup I). Prediction of the variable regions of the HIV-1 gp120 env protein was in agreement with empirically derived definitions. These examples indicate that our method is useful for the regional assignment of protein variability solely on the basis of amino acid sequences.
Article
We have determined the complete nucleotide (nt) sequence of the mitochondrial genome of an oligochaete annelid, the earthworm Lumbricus terrestris. This genome contains the 37 genes typical of metazoan mitochondrial DNA (mtDNA), including ATPase8, which is missing from some invertebrate mtDNAs. ATPase8 is not immediately upstream of ATPase6, a condition found previously only in the mtDNA of snails. All genes are transcribed from the same DNA strand. The largest noncoding region is 384 nt and is characterized by several homopolymer runs, a tract of alternating TA pairs, and potential secondary structures. All protein-encoding genes either overlap the adjacent downstream gene or end at an abbreviated stop codon. In Lumbricus mitochondria, the variation of the genetic code that is typical of most invertebrate mitochondrial genomes is used. Only the codon ATG is used for translation initiation. Lumbricus mtDNA is A + T rich, which appears to affect the codon usage pattern. The DHU arm appears to be unpaired not only in tRNAser(AGN), as is typical for metazoans, but perhaps also in tRNAser(UCN), a condition found previously only in a chiton and among nematodes. Relating the Lumbricus gene organization to those of other major protostome groups requires numerous rearrangements.
Article
The EMBL Nucleotide Sequence Database (aka EMBL-Bank; http://www.ebi.ac.uk/embl/) incorporates, organises and distributes nucleotide sequences from all available public sources. EMBL-Bank is located and maintained at the European Bioinformatics Institute (EBI) near Cambridge, UK. In an international collaboration with DDBJ (Japan) and GenBank (USA), data are exchanged amongst the collaborating databases on a daily basis. Major contributors to the EMBL database are individual scientists and genome project groups. Webin is the preferred web-based submission system for individual submitters, whilst automatic procedures allow incorporation of sequence data from large-scale genome sequencing centres and from the European Patent Office (EPO). Database releases are produced quarterly. Network services allow free access to the most up-to-date data collection via FTP, email and World Wide Web interfaces. EBI’s Sequence Retrieval System (SRS), a network browser for databanks in molecular biology, integrates and links the main nucleotide and protein databases plus many other specialized databases. For sequence similarity searching, a variety of tools (e.g. Blitz, Fasta, BLAST) are available which allow external users to compare their own sequences against the latest data in the EMBL Nucleotide Sequence Database and SWISS-PROT. All resources can be accessed via the EBI home page at http://www.ebi.ac.uk.
Article
Analysis of the mitochondrial DNA of a liverwort Marchantia polymorpha by electron microscopy and restriction endonuclease mapping indicated that the liverwort mitochondrial genome was a single circular molecule of about 184,400 base-pairs. We have determined the complete sequence of the liverwort mitochondrial DNA and detected 94 possible genes in the sequence of 186,608 base-pairs. These included genes for three species of ribosomal RNA, 29 genes for 27 species of transfer RNA and 30 open reading frames (ORFs) for functionally known proteins (16 ribosomal proteins, 3 subunits of H+-ATPase, 3 subunits of cytochrome c oxidase, apocytochrome b protein and 7 subunits of NADH ubiquinone oxidoreductase). Three ORFs showed similarity to ORFs of unknown function in the mitochondrial genomes of other organisms. Furthermore, 29 ORFs were predicted as possible genes by using the index of G + C content in first, second and third letters of codons (42.0 ± 10.9%, 37.0 ± 13.2% and 26.4 ± 9.4%, respectively) obtained from the codon usages of identified liverwort genes. To date, 32 introns belonging to either group I or group II intron have been found in the coding regions of 17 genes including ribosomal RNA genes (rrn18 and rrn26), a transfer RNA gene (trnS) and a pseudogene (ψnad7). RNA editing was apparently lacking in liverwort mitochondria since the nucleotide sequences of the liverwort mitochondrial DNA were well-conserved at the DNA level.
Article
The circular, 17,443 nucleotide-pair mitochondrial (mt) DNA molecule of the sea anemone, Metridium senile (class Anthozoa, phylum Cnidaria) is presented. This molecule contains genes for 13 energy pathway proteins and two ribosomal (r) RNAs but, relative to other metazoan mtDNAs, has two unique features: only two transfer RNAs (tRNA(f-Met) and tRNA(Trp)) are encoded, and the cytochrome c oxidase subunit I (COI) and NADH dehydrogenase subunit 5 (ND5) genes each include a group I intron. The COI intron encodes a putative homing endonuclease, and the ND5 intron contains the molecule's ND1 and ND3 genes. Most of the unusual characteristics of other metazoan mtDNAs are not found in M. senile mtDNA: unorthodox translation initiation codons and partial translation termination codons are absent, the use of TGA to specify tryptophan is the only genetic code modification, and both encoded tRNAs have primary and secondary structures closely resembling those of standard tRNAs. Also, with regard to size and secondary structure potential, the mt-s-rRNA and mt-l-rRNA have the least deviation from Escherichia coli 16S and 23S rRNAs of all known metazoan mt-rRNAs. These observations indicate that most of the genetic variations previously reported in metazoan mtDNAs developed after Cnidaria diverged from the common ancestral line of all other Metazoa.
Article
In the mitochondrial genome of the hemichordate Balanoglossus carnosus, the codon AAA, which is assigned to lysine in most metazoans but to asparagine in echinoderms, is absent. Furthermore, the lysine tRNA gene carries an anticodon substitution that renders its gene product unable to decode AAA codons, whereas the asparagine tRNA gene has not changed to encode a tRNA with the ability to recognize AAA codons. Thus, the hemichordate mitochondrial genome can be regarded as an intermediate in the process of reassignment of mitochondrial AAA codons, where most metazoans represent the ancestral situation and the echinoderms the derived situation. This lends support to the codon capture hypothesis. We also show that the reassignment of the AAA codon is associated with a reduction in the relative abundance of lysine residues in mitochondrial proteins.
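The effect of the AAA reassignment described above can be illustrated with a minimal sketch. The two codon tables below are deliberately partial and illustrative (only the codons used in the toy sequence), and the names `ANCESTRAL_CODE`, `ECHINODERM_CODE`, and `translate` are assumptions introduced here, not taken from the cited work. The toy sequence is invented for the example.

```python
# Partial, illustrative codon tables (assumption: only the codons needed below).
# In most metazoan mitochondria AAA encodes lysine (K); in echinoderms it
# encodes asparagine (N). AAG stays lysine and AAT/AAC stay asparagine in both.
ANCESTRAL_CODE = {"ATG": "M", "AAA": "K", "AAG": "K",
                  "AAT": "N", "AAC": "N", "TTT": "F"}
ECHINODERM_CODE = {**ANCESTRAL_CODE, "AAA": "N"}


def translate(seq, code):
    """Translate a coding sequence codon by codon; unknown codons become 'X'."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(code.get(seq[i:i + 3], "X") for i in range(0, len(seq), 3))


# Invented toy sequence: ATG AAA AAG AAT TTT
toy_seq = "ATGAAAAAGAATTTT"
ancestral_protein = translate(toy_seq, ANCESTRAL_CODE)
echinoderm_protein = translate(toy_seq, ECHINODERM_CODE)

print(ancestral_protein, ancestral_protein.count("K"))
print(echinoderm_protein, echinoderm_protein.count("K"))
```

Reading the same nucleotide sequence under the two tables changes only the residue at the AAA codon, which is why a reassignment of AAA lowers the lysine count of the translated protein in this toy case.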
Article
We describe a stochastic method for tracing the evolutionary pattern of multialigned sequences. This method allows us to detect gene regions with distinct evolutionary dynamics, e.g., regions that significantly deviate from the expected behavior. Accurate detection of hypervariable or hyperconstrained regions may provide useful information on the structure/function relationship of biosequences. This information can help localize functional constraints. In addition, the selection of distinct evolutionary dynamics may assist in the correct use of biosequences as reliable molecular clocks.
Article
The complete 94,192 bp sequence of the mitochondrial genome from race s of Podospora anserina is presented (1 kb = 10^3 base pairs). Three regions unique to race A are also presented bringing the size of this genome to 100,314 bp. Race s contains 31 group I introns (33 in race A) and 2 group II introns (3 in race A). Analysis shows that the group I introns can be categorized according to families both with regard to secondary structure and their open reading frames. All identified genes are transcribed from the same strand. Except for the lack of ATPase 9, the Podospora genome contains the same genes as its fungal counterparts, N. crassa and A. nidulans. About 20% of the genome has not yet been identified. DNA sequence studies of several excision-amplification plasmids demonstrate a common feature to be the presence of short repeated sequences at both termini with a prevalence of GGCGCAAGCTC.
Article
The branching order of the kingdoms Animalia, Plantae, and Fungi has been a controversial issue. Using the transformed distance method and the maximum parsimony method, we investigated this problem by comparing the sequences of several kinds of macromolecules in organisms spanning all three kingdoms. The analysis was based on the large-subunit and small-subunit ribosomal RNAs, 10 isoacceptor transfer RNA families, and six highly conserved proteins. All three sets of sequences support the same phylogenetic tree: plants and animals are sibling kingdoms that have diverged more recently than the fungi. The ribosomal RNA and protein data sets are large enough so that in both cases the inferred phylogeny is statistically significant. The present report appears to be the first to provide statistically conclusive molecular evidence for the phylogeny of the three kingdoms. The determination of this phylogeny will help us to understand the evolution of various molecular, cellular, and developmental characters shared by any two of the three kingdoms. Noting that the large-subunit rRNA sequences have evolved at similar rates in the three kingdoms, we estimated the ratio of the time since the animal-plant split to the time since the fungal divergence to be 0.90.
Article
The 16S ribosomal RNA sequences from Agrobacterium tumefaciens and Pseudomonas testosteroni have been determined to further delimit the origin of the endosymbiont that gave rise to the mitochondrion. These two prokaryotes represent the alpha and beta subdivisions, respectively, of the so-called purple bacteria. The endosymbiont that gave rise to the mitochondrion belonged to the alpha subdivision, a group that also contains the rhizobacteria, the agrobacteria, and the rickettsias, all prokaryotes that have developed intracellular or other close relationships with eukaryotic cells.
Article
The complete nucleotide sequence of the circular mitochondrial (mt) DNA from the red alga Chondrus crispus was determined (25,836 nucleotides, A+T content 72.1%). Fifty one genes were identified. They include genes encoding three subunits of the cytochrome oxidase (cox1 to 3), apocytochrome b (cob), seven subunits of the NADH dehydrogenase complex (nad1 to 6, nad4L), two ATPase subunits (atp6 and atp9), three ribosomal RNAs (rrn5, srn and lrn), 23 tRNAs and four ribosomal proteins (rps3, rps11, rps12 and rpl16). Two subunits of the succinate dehydrogenase complex (sdhB and sdhC), usually found on nuclear genomes, are also located on the mtDNA of C. crispus. One group IIb intron is inserted in the tRNAIle gene. Six potentially functional open reading frames were identified, four of them having counterparts among green plant mtDNAs. The use of a modified genetic code and the absence of RNA editing, previously reported for the cox3 gene, appear to be general characteristics of this molecule. Mitochondrial genes are encoded on both DNA strands, in two opposite major transcriptional directions, suggesting the existence of two main transcriptional units. Two long and stable stem-loops were identified in intergenic regions, which are believed to be involved with transcription and replication. The main structural features of this genome are compared with the overall organization of mtDNAs and are discussed in view of the evolution of mitochondria.
Article
The process of multiple sequence alignment provides homology statements for the phylogenetic analysis of molecular data. Unfortunately, multiple alignments are frequently nonunique. Two sources of these multiple alignments are analysis based on different sets of alignment parameter values (gap:change cost ratios) and nonunique equally costly alignments based on a single set of alignment parameters. By "eliding" these individual alignments into a single grand alignment, phylogeny that is weighted toward those positions that align more consistently can be reconstructed. Positions that show greater variation among alignments will be relatively downweighted. The technique results in a weighting procedure that is a posteriori and based on the evidence established from the original sequence alignments.
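The core idea of elision, joining alternative alignments of the same sequences into one grand alignment, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: alignments are represented as plain dictionaries mapping a taxon name to its gapped row, and the function name `elide` and the toy data are hypothetical.

```python
def elide(alignments):
    """Concatenate alternative alignments of the same taxa end to end.

    Each alignment is a dict {taxon_name: gapped_row}. Columns that are
    aligned the same way in every input appear repeatedly in the result,
    so they carry more weight in a downstream analysis than columns whose
    alignment varies between inputs.
    """
    if not alignments:
        raise ValueError("need at least one alignment")
    taxa = list(alignments[0])
    for aln in alignments:
        if set(aln) != set(taxa):
            raise ValueError("all alignments must contain the same taxa")
        if len({len(row) for row in aln.values()}) != 1:
            raise ValueError("rows within an alignment must have equal length")
    return {name: "".join(aln[name] for aln in alignments) for name in taxa}


# Invented toy data: two alternative alignments of the same three sequences,
# e.g. obtained with different gap:change cost ratios.
aln_a = {"t1": "AC-GT", "t2": "ACGGT", "t3": "AC-GA"}
aln_b = {"t1": "ACG-T", "t2": "ACGGT", "t3": "ACG-A"}

grand = elide([aln_a, aln_b])
for name, row in grand.items():
    print(name, row)
```

In this toy case the first two and the last column agree between the two inputs and are therefore represented twice with identical content, while the two middle columns differ in gap placement and contribute conflicting signal.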
Article
In phylogenetic trees based on comparison of nuclear small subunit rRNA sequences, Acanthamoeba castellanii (an amoeboid protozoon) is positioned near the base of the radiation leading to the animals, fungi and plants. However, the specific affiliation of this protist with the major multicellular lineages of eukaryotes is currently uncertain. To further explore the evolutionary position of A. castellanii, we have determined the complete primary sequence of its mitochondrial genome. We find that the circular mtDNA (41,591 bp; 70.6% A+T) encodes two rRNAs (small subunit and large subunit), 16 tRNAs and 33 proteins (17 subunits of the respiratory chain and 16 ribosomal proteins). As well, this genome contains eight open reading frames (ORFs) larger than 60 codons and of undefined function. Two of these ORFs (orf124 and orf142) have homologs in other mtDNAs ("orf25" and "orfB", respectively), three are unique to A. castellanii mtDNA (orf83, orf115 and orf349), and three are intronic ORFs. Among notable features of A. castellanii mtDNA are the following: (1) Genes and ORFs are all encoded on the same strand and are tightly packed, with only 6.8% of the total sequence not having an evident coding function and intergenic spacer sequences ranging from only 1 to 616 bp (average 64 bp). Ten pairs of protein-coding genes overlap by up to 38 bp and two subunits of cytochrome oxidase (COX1 and COX2) are specified by a single continuous ORF. (2) Only three introns, all group I and each containing a free-standing ORF, are present; these are localized in the 3'-half of the large subunit rRNA gene. (3) The genome encodes fewer than the minimal number of tRNA species required to support mitochondrial protein synthesis, suggesting that additional tRNAs are imported from the cytosol into A. castellanii mitochondria. Of the 16 tRNAs specified by A. castellanii mtDNA (one with an 8-nucleotide anticodon loop), 13 have been shown or are predicted to undergo a novel form of RNA editing within the acceptor stem. (4) A modified genetic code is used in which UGA specifies tryptophan. (5) Repeated sequences and obvious small sequence motifs that might represent regulatory elements are absent. In overall size, gene content and organizational pattern, A. castellanii mtDNA most closely resembles the mtDNA of the chlorophycean alga Prototheca wickerhamii (55,326 bp; 74.2% A+T), but is quite different in these respects from the mtDNA of Chlamydomonas reinhardtii (15,758 bp; 54.8% A+T), another chlorophycean alga, as well as from characterized animal and fungal mitochondrial genomes. (ABSTRACT TRUNCATED AT 250 WORDS)
Article
The sensitivity of the commonly used progressive multiple sequence alignment method has been greatly improved for the alignment of divergent protein sequences. Firstly, individual weights are assigned to each sequence in a partial alignment in order to downweight near-duplicate sequences and up-weight the most divergent ones. Secondly, amino acid substitution matrices are varied at different alignment stages according to the divergence of the sequences to be aligned. Thirdly, residue-specific gap penalties and locally reduced gap penalties in hydrophilic regions encourage new gaps in potential loop regions rather than regular secondary structure. Fourthly, positions in early alignments where gaps have been opened receive locally reduced gap penalties to encourage the opening up of new gaps at these positions. These modifications are incorporated into a new program, CLUSTAL W which is freely available.
Article
The complete nucleotide sequence of the circular mitochondrial (mt) DNA of the chlorophyte alga Prototheca wickerhamii has been determined (55,328 base-pairs, A+T content 74.2%). The genes identified encode three subunits of the cytochrome oxidase, apocytochrome b, nine subunits of the NADH dehydrogenase complex (nad1 to 7, nad4L and nad9), three ATPase subunits (atp6, atp9, atp1 (also referred to as atpA)), three ribosomal RNAs (5 S (rrn5), small subunit (srn) and large subunit (lrn) RNA), 26 tRNAs, and 13 ribosomal proteins. A total of five group I introns reside in lrn and cox1, two of which include intronic open reading frames (ORFs). Five free-standing ORFs longer than 60 codons are present. Three of these ORFs are counterparts to genes encoding proteins of unknown function in plant mitochondria (orf25 and orfB of angiosperms and orf244 of liverwort), whereas two of them are unique. Mitochondrial genes are encoded on both DNA strands in a way that suggests the existence of two transcription units, each including approximately one half of the mitochondrial genome. The two intergenic regions in which transcription is believed to initiate and terminate are about ten times longer than the other intergenic regions (1118 and 1993 nt versus 100 to 150 nt). A total of 29 recurring sequence motifs (30 to 200 nt long) have been found in intergenic regions. Nine different types of motifs are present, most of them arranged as tandem repeats. These motifs may be implicated in transcription, e.g. as signals for initiation, termination and/or processing. Phylogenetic analysis on the basis of the cox1 gene strongly suggested that P. wickerhamii and plant mitochondrial genomes are monophyletic. The finding of plant-specific mitochondrial genes such as orf25, orf244, orfB and rrn5 in P. wickerhamii mitochondria corroborates this idea.
Article
The MUST package is a phylogenetically oriented set of programs for data management and display, allowing one to handle both raw data (sequences) and results (trees, number of steps, bootstrap proportions). It is complementary to the main available software for phylogenetic analysis (PHYLIP, PAUP, HENNIG86, CLUSTAL) with which it is fully compatible. The first part of MUST consists of the acquisition of new sequences, their storage, modification, and checking of sequence integrity in files of aligned sequences. In order to improve alignment, an editor function for aligned sequences offers numerous options, such as selection of subsets of sequences, display of consensus sequences, and search for similarities over small sequence fragments. For phylogenetic reconstruction, the choice of species and portions of sequences to be analyzed is easy and very rapid, permitting fast testing of numerous combinations of sequences and taxa. The resulting files can be formatted for most programs of tree construction. An interactive tree-display program recovers the output of all these programs. Finally, various modules allow an in-depth analysis of results, such as comparison of distance matrices, variation of bootstrap proportions with respect to various parameters or comparison of the number of steps per position. All presently available complete sequences of 28S rRNA are furnished aligned in the package. MUST therefore allows the management of all the operations required for phylogenetic reconstructions.
Article
The demarcation of protist kingdoms is reviewed, a complete revised classification down to the level of subclass is provided for the kingdoms Protozoa, Archezoa, and Chromista, and the phylogenetic basis of the revised classification is outlined. Removal of Archezoa because of their ancestral absence of mitochondria, peroxisomes, and Golgi dictyosomes makes the kingdom Protozoa much more homogeneous: they all either have mitochondria and peroxisomes or have secondarily lost them. Predominantly phagotrophic, Protozoa are distinguished from the mainly photosynthetic kingdom Chromista (Chlorarachniophyta, Cryptista, Heterokonta, and Haptophyta) by the absence of epiciliary retronemes (rigid thrust-reversing tubular ciliary hairs) and by the lack of two additional membranes outside their chloroplast envelopes. The kingdom Protozoa has two subkingdoms: Adictyozoa, without Golgi dictyosomes, containing only the phylum Percolozoa (flagellates and amoeboflagellates); and Dictyozoa, made up of 17 phyla with Golgi dictyosomes. Dictyozoa are divided into two branches: (i) Parabasalia, a single phylum with hydrogenosomes and 70S ribosomes but no mitochondria, Golgi dictyosomes associated with striated roots, and a kinetid of four or five cilia; and (ii) Bikonta (16 unicellular or plasmodial phyla with mitochondria and bikinetids and in which Golgi dictyosomes are not associated with striated ciliary roots), which are divided into two infrakingdoms: Euglenozoa (flagellates with discoid mitochondrial cristae and trans-splicing of miniexons for all nuclear genes) and Neozoa (15 phyla of more advanced protozoa with tubular or flat [usually nondiscoid] mitochondrial cristae and cis-spliced spliceosomal introns). 
Neozoa are divided into seven parvkingdoms: (i) Ciliomyxa (three predominantly ciliated phyla with tubular mitochondrial cristae but no cortical alveoli, i.e., Opalozoa [flagellates with tubular cristae], Mycetozoa [slime molds], and Choanozoa [choanoflagellates, with flattened cristae]); (ii) Alveolata (three phyla with cortical alveoli and tubular mitochondrial cristae, i.e., Dinozoa [Dinoflagellata and Protalveolata], Apicomplexa, and Ciliophora); (iii) Neosarcodina (phyla Rhizopoda [lobose and filose amoebae] and Reticulosa [foraminifera; reticulopodial amoebae], usually with tubular cristae); (iv) Actinopoda (two phyla with axopodia: Heliozoa and Radiozoa [Radiolaria, Acantharia]); (v) Entamoebia (a single phylum of amoebae with no mitochondria, peroxisomes, hydrogenosomes, or cilia and with transient intranuclear centrosomes); (vi) Myxozoa (three endoparasitic phyla with multicellular spores, mitochondria, and no cilia: Myxosporidia, Haplosporidia, and Paramyxia); and (vii) Mesozoa (multicells with tubular mitochondrial cristae, included in Protozoa because, unlike animals, they lack collagenous connective tissue).
Article
As molecular phylogeny increasingly shapes our understanding of organismal relationships, no molecule has been applied to more questions than have ribosomal RNAs. We review this role of the rRNAs and some of the insights that have been gained from them. We also offer some of the practical considerations in extracting the phylogenetic information from the sequences. Finally, we stress the importance of comparing results from multiple molecules, both as a method for testing the overall reliability of the organismal phylogeny and as a method for more broadly exploring the history of the genome.
Article
Although DNA is the carrier of genetic information, it has limited chemical stability. Hydrolysis, oxidation and nonenzymatic methylation of DNA occur at significant rates in vivo, and are counteracted by specific DNA repair processes. The spontaneous decay of DNA is likely to be a major factor in mutagenesis, carcinogenesis and ageing, and also sets limits for the recovery of DNA fragments from fossils.
Article
A phylogenetic framework inferred from comparisons of small subunit ribosomal RNA sequences describes the evolutionary origin and early branching patterns of the kingdom Animalia. Maximum likelihood analyses show the animal lineage is monophyletic and includes choanoflagellates. Within the metazoan assemblage, the divergence of sponges is followed by the Ctenophora, the Cnidaria plus the placozoan Trichoplax adhaerens, and finally by an unresolved polychotomy of bilateral animal phyla. From these data, it was inferred that animals and fungi share a unique evolutionary history and that their last common ancestor was a flagellated protist similar to extant choanoflagellates.
Article
The complete 27,694-bp mitochondrial (mt) DNA sequence of Hansenula wingei, which is a typical budding yeast and contains circular mitochondrial DNA, has been determined. The mt sequence contains genes encoding large and small ribosomal RNAs, 25 tRNAs, three subunits of cytochrome c oxidase (subunits 1, 2 and 3), three subunits of ATPase (subunits 6, 8 and 9), apocytochrome b, seven subunits of NADH dehydrogenase (subunits 1, 2, 3, 4, 4L, 5 and 6), and a ribosomal protein, VAR1. The VAR1 gene is considered to be a typical yeast type. This is consistent with data on DNA and the deduced amino-acid sequence homology comparisons of genes ubiquitous in yeast and fungi. However, we have identified seven genes encoding NADH dehydrogenase subunits, which are not found in other yeast mitochondrial genomes, thus placing the H. wingei mitochondrial genome in a unique position. In addition the H. wingei mitochondrial genome also encodes one tRNA pseudogene and one short unidentified ORF. The genome is compact with only two introns both of which contain an ORF. One intron lies in the large rRNA gene while the other is situated in the cytochrome c oxidase subunit-1 gene. The conserved nonanucleotide motif (A/T)TATAAG (T/A)(A/T), which is a transcription start signal in Saccharomyces cerevisiae mitochondria, has also been found in the H. wingei mitochondrial genome. The codon assignments for ATA and CTN in H. wingei mitochondria are different from those in S. cerevisiae mitochondria. These results indicate a unique and novel structure for the H. wingei mitochondrial genome in terms of characteristics which are typical for both yeast and for filamentous fungi. This is the first complete mt DNA sequence report in yeast.
Article
We have determined the complete nucleotide sequence of the circular mitochondrial DNA (mtDNA) of the chytridiomycete fungus, Allomyces macrogynus (57,473 bp; A + T content 60.5%). The identified genes that are typical for most fungal mitochondria include those for the large (rnl) and small subunit (rns) ribosomal RNAs, a complete set of 25 tRNAs, three ATPase subunits (atp6, atp8 and atp9), apocytochrome b(cob), three subunits of the cytochrome oxidase complex (cox1, cox2 and cox3), and seven subunits of the NADH dehydrogenase complex (nad1, nad2, nad3, nad4, nad4L, nad5 and nad6). A total of 28 introns of both groups are found, some of which contain open reading frames (ORFs) coding for potential endonucleases (group I) or reverse-transcriptases (group II). All mitochondrial genes are transcribed from the same DNA strand, as is the case in many other eufungi. Particular features of the A. macrogynus mtDNA include: (1) the first documented case of a fungal mitochondrial ribosomal protein gene (rps3) that is clearly identified by similarity with bacterial homologues; (2) four unique ORFs; (3) the presence of an insert in the atp6 gene that may have been acquired by interspecific transfer; (4) more than 67 short, highly structured and conserved DNA elements inserted in intergenic spacers, introns, and variable regions of the rnl and rns genes: these elements are unusually G + C rich; (5) rRNA structures that resemble more closely those of eubacteria than their counterparts in other fungal mitochondria. The high degree of conservation of the A. macrogynus mitochondrial rRNA secondary structures, the existence of a mitochondrial rps3 gene (common to protist but unique in fungal mtDNAs), and phylogenetic relationships inferred from highly conserved protein genes, demonstrate consistently the ancestral character of this fungal mitochondrial genome.
Article
The nucleotide sequence of the regions flanking the A+T region of Drosophila melanogaster mitochondrial DNA (mtDNA) has been determined. Included are the genes encoding the transfer RNAs for valine, isoleucine, glutamine and methionine, the small ribosomal RNA and the 5'-coding sequences of the large ribosomal RNA and NADH dehydrogenase subunit II. This completes the nucleotide sequence of the D. melanogaster mitochondrial genome. The circular mtDNA of D. melanogaster varies in size among different populations largely due to length differences in the control region (Fauron & Wolstenholme, 1976; Fauron & Wolstenholme, 1980a, b); the mtDNA region we have sequenced, combined with those sequenced by others, yields a composite genome that is 19,517 bp in length as compared to 16,019 bp for the mtDNA of D. yakuba. D. melanogaster mtDNA exhibits an extreme bias in base composition; it comprises 82.2% deoxyadenylate and thymidylate residues as compared to 78.6% in D. yakuba mtDNA. All genes encoded in the mtDNA of both species are in identical locations and orientations. Nucleotide substitution analysis reveals that tRNA and rRNA genes evolve at less than half the rate of protein coding genes.