Fig 2 - uploaded by Bailin Hao
A prokaryote phylogenetic tree based on collections of aminoacyl tRNA synthetases and calculated at string length K = 5. See caption to fig. 1 for explanation of labelings. 


Source publication
Article
In order to show that the newly developed K-string composition distance method, based on counting oligopeptide frequencies, for inferring phylogenetic relations of prokaryotes works equally well without requiring the whole proteome data, we used all ribosomal proteins and the set of aminoacyl tRNA synthetases for each species. The latter group has...
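The abstract above describes the method only as counting oligopeptide frequencies, so the following is a minimal sketch of that idea rather than the authors' implementation. It counts overlapping K-strings (K = 5, as in the figure) over a collection of protein sequences and compares two collections with a cosine-based dissimilarity. The function names, the use of raw relative frequencies, and the choice of a cosine-based distance are assumptions made for illustration; any background correction used in the published method is not reproduced here.

```python
from collections import Counter
from math import sqrt


def kstring_frequencies(proteins, k=5):
    """Relative frequency of every overlapping K-string in a protein collection.

    `proteins` is an iterable of amino-acid strings, e.g. all aminoacyl tRNA
    synthetases of one species. (Hypothetical helper, not from the paper.)
    """
    counts = Counter()
    for seq in proteins:
        # Slide a window of width k along the sequence, one residue at a time.
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()} if total else {}


def composition_distance(freq_a, freq_b):
    """Cosine-based dissimilarity between two K-string frequency vectors.

    Assumed form: (1 - cosine) / 2. With non-negative raw frequencies this is
    0 for identical compositions and 0.5 when no K-string is shared.
    """
    dot = sum(v * freq_b.get(s, 0.0) for s, v in freq_a.items())
    norm_a = sqrt(sum(v * v for v in freq_a.values()))
    norm_b = sqrt(sum(v * v for v in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("empty composition vector")
    return (1.0 - dot / (norm_a * norm_b)) / 2.0
```

Computing this distance for every pair of species would give a distance matrix that a standard tree-building method could take as input; that step is outside the scope of this sketch.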

Contexts in source publication

Context 1
... tree based on ribosomal proteins is given in fig. 1 and that based on aminoacyl tRNA synthetases in fig. 2. The calculation included all 123 organisms. Since different strains of the same species as well as different species within the same genus always grouped together, in the final drawing we kept only one representative species from each genus. Therefore, these trees are effectively genus ...
Context 2
... appeared within the gamma group. This is so on all our trees in this paper and in ref. [13]. We could add a further observation that the separated deeper gamma subgroup comprises two genera with small genome size (Buchnera and Wigglesworthia). The latter may even get quite far from the main Proteobacteria groups on the tRNA synthetase tree (fig. 2). The fact that the species with significantly smaller genome forms a separate deeper subgroup on all these trees might be a manifestation of real evolutionary history as small genomes should naturally evolve earlier. Anyway, the effect of genome size poses a problem which could not be seen clearly on trees based on any single ...
Context 3
... the three Spirochetes (Burbu, Trepa and Lepin) appear together in fig. 1 as they were grouped in the Bergey's Manual. However, Lepin stands out in fig. 2 and on the proteome trees in ref. [13]. We could not tell whether this was a consequence of significant difference other ...

Similar publications

Chapter
Eukaryotic plants contain three different protein-synthesizing systems, localized in the cell cytoplasm, in the mitochondrion, and in the chloroplast (1,2). Indeed it is possible to isolate from plant cells cytoplasmic, mitochondrial and chloroplastic ribosomes that differ in the sedimentation coefficient and in the rRNA composition (3) as well as...

Citations

... Later, researchers tried to construct evolution relationships by using a set of functional sequences. For instance, Snel used a common gene set [26,27], Wolf used a conserved cluster of homologous genes [28], and Qi used all protein-coding sequences or all protein sequences [29][30][31]. Although this method improves the reliability of evolution relationships, it still cannot resolve the inconsistency problem because there is no uniform standard to select a consistent gene/protein set and to determine the number of selected sets. ...
Article
Background: Exploring evolution regularities of genome sequences and constructing more objective species evolution relationships at the genomic level are high-profile topics. Based on the evolution mechanism of genome sequences proposed in our previous research, we found that only the 8-mers containing CG or TA dinucleotides correlate directly with the evolution of genome sequences, and the relative frequency rather than the actual frequency of these 8-mers is more suitable to characterize the evolution of genome sequences. Result: Two types of feature sets were therefore obtained: the relative frequency sets of CG1 + CG2 8-mers and of TA1 + TA2 8-mers. The evolution relationships of mammals and reptiles were constructed from the relative frequency set of CG1 + CG2 8-mers, and two types of evolution relationships of insects were constructed from the relative frequency sets of CG1 + CG2 8-mers and TA1 + TA2 8-mers respectively. Through comparison and analysis, we found that the evolution relationships are consistent with the known conclusions. According to the evolution mechanism, we considered that the evolution relationship constructed by CG1 + CG2 8-mers reflects the evolution state of genome sequences at the current time, and the evolution relationship constructed by TA1 + TA2 8-mers reflects the evolution state at an early stage. Conclusion: Our study provides objective feature sets for constructing evolution relationships at the genomic level.
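As a rough illustration of the feature sets this abstract refers to, the sketch below collects the 8-mers of a genome that contain a chosen dinucleotide (CG or TA) and normalizes their counts. The abstract does not define "relative frequency" or the CG1/CG2 and TA1/TA2 classes, so this code assumes the simplest reading: each selected 8-mer's count divided by the total count of all selected 8-mers, with no split into subclasses. The function name and the handling of ambiguous bases are likewise hypothetical.

```python
from collections import Counter


def dinucleotide_kmer_frequencies(genome, dinucleotide, k=8):
    """Relative frequencies of k-mers that contain `dinucleotide` at least once.

    Assumption: "relative" means the share within the selected subset of
    k-mers, which may differ from the definition used in the cited article.
    """
    genome = genome.upper()
    valid = set("ACGT")
    counts = Counter()
    for i in range(len(genome) - k + 1):
        kmer = genome[i:i + k]
        # Skip windows with ambiguous bases (e.g. N) and windows that lack
        # the dinucleotide of interest.
        if set(kmer) <= valid and dinucleotide in kmer:
            counts[kmer] += 1
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}
```

Calling the function once with "CG" and once with "TA" on the same genome yields two frequency vectors, which could then be compared across species with any vector distance.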
... Later, people tried to construct phylogenetic relationships by using a set of functional sequences, for instance a common gene set (Snel et al. 1999; Huynen et al. 1999), a conserved cluster of homologous genes (Wolf et al. 2001), or all protein-coding sequences or all protein sequences (Wei et al. 2004). Although this method improves the reliability of the phylogenetic relationship, it still cannot resolve the inconsistency of the phylogenetic trees, because there is no uniform standard to select a consistent sequence set and to determine the number of these sequences, and some functional sequences are not possessed by all species. ...
Preprint
Exploring the composition and evolution regularity of genome sequences and constructing phylogenetic relationships by alignment-free methods at the genome level are high-profile topics. Our previous research discovered that CG and TA independent selection laws exist in genome sequences, through analysis of the spectral features of 8-mer subsets of 920 eukaryote and prokaryote genomes. We found that the evolution state of genomes is determined by the intensity of the two independent selections and the degree of mutual inhibition between them. In this study, the two independent selection patterns of 22 primate and 28 insect genome sequences were analyzed further. Two complete 8-mer motif sets containing the CG or TA dinucleotide, together with their relative frequency feature, are proposed. We found that the two 8-mer sets and their feature are related directly to the sequence evolution of genomes. According to the relative frequency of the two 8-mer sets, phylogenetic trees were constructed for the given primate and insect genomes respectively. Through analysis and comparison, we found that our phylogenetic trees are more consistent with the known conclusions. The two kinds of phylogenetic relationships constructed by the CG 8-mer set and the TA 8-mer set are similar in insect genomes, but the phylogenetic relationship constructed by the CG 8-mer set reflects the evolution state of genomes in the current age, and the phylogenetic relationship constructed by the TA 8-mer set reflects the evolution state of genomes in a slightly earlier period. We think this results from the TA independent selection being repressed by the CG independent selection in the process of genome evolution. Our study provides a theoretical approach to constructing more objective evolution relationships at the genome level.
... With wide taxonomic coverage of selected sequences it may contribute to the establishment of a whole-genome backbone for the prokaryotic branch of the Tree of Life. We mention, in addition, that CVTree method has been tested for protein families and could yield meaningful results (Wei et al., 2004). ...
Article
The Composition Vector Tree (CVTree) is a parameter-free and alignment-free method to infer prokaryotic phylogeny from their complete genomes. It is distinct from the traditional 16S rRNA analysis in both the input data and the methodology. The prokaryotic phylogenetic trees constructed by using the CVTree method agree well with the Bergey's taxonomy in all major groupings and fine branching patterns. Thus, combined use of the CVTree approach and the 16S rRNA analysis may provide an objective and reliable reconstruction of the prokaryotic branch of the Tree of Life.
... Singular Value Decomposition (SVD) method to analyze character string frequencies, phylogenies from unaligned whole genome data has been successfully applied to vertebrate phylogeny using vector representations [23], and applied to bacterial phylogeny as well [24]. A similar approach, called the K-string approach, was introduced by Qi et al. in a series of publications [25][26][27][28]. Their method is based on the appearance frequency of oligopeptides of fixed length in the whole genome. ...
Book
Recent advances in the development of sequencing technology have resulted in a deluge of genomic data. In order to make sense of this data, there is an urgent need for algorithms for data processing and quantitative reasoning. An emerging in silico approach, called computational genomic signatures, addresses this need by representing global species-specific features of genomes using simple mathematical models. This text introduces the general concept of computational genomic signatures, and it reviews some of the DNA sequence models which can be used as computational genomic signatures. The text takes the position that a practical computational genomic signature consists of both a model and a measure for computing the distance or similarity between models. Therefore, a discussion of sequence similarity/distance measurement in the context of computational genomic signatures is presented. The remainder of the text covers various applications of computational genomic signatures in the areas of metagenomics, phylogenetics and the detection of horizontal gene transfer.
Article
Composition Vector Tree (CVTree) implements a systematic method of inferring evolutionary relatedness of microbial organisms from the oligopeptide content of their complete proteomes (http://cvtree.cbi.pku.edu.cn). Since the first bacterial genomes were sequenced in 1995 there have been several attempts to infer prokaryote phylogeny from complete genomes. Most of them depend on sequence alignment directly or indirectly and, in some cases, need fine-tuning and adjustment. The composition vector method circumvents the ambiguity of choosing the genes for phylogenetic reconstruction and avoids the necessity of aligning sequences of essentially different length and gene content. This new method contains no ‘free’ parameters and requires no ‘fine-tuning’. A bootstrap test for a phylogenetic tree of 139 organisms has shown the stability of the branchings, which support the small subunit ribosomal RNA (SSU rRNA) tree of life in its overall structure and in many details. It may provide a quick reference in prokaryote phylogenetics whenever the proteome of an organism is available, a situation that will become commonplace in the near future.