FIG 5 - uploaded by Eugeni Belda
Phylogenetic relationships between 30 γ-proteobacterial genomes inferred from a breakpoint distance matrix (A) and an inversion distance matrix (B). Values at nodes reflect the percentage of times the clade defined by that node appears in the 100 jackknife trees. Distance-based phylogenetic methods were Fitch-Margoliash (upper values) and neighbor joining (lower values). See species abbreviations in table 1. The bar represents 20 breakpoints (A) or 20 inversions (B).


Source publication
Article
Genome rearrangements have been studied in 30 gamma-proteobacterial complete genomes by comparing the order of a reduced set of genes on the chromosome. This set included those genes fulfilling several characteristics, the main ones being that an ortholog was present in every genome and that none of them had been acquired by horizontal gene transfe...

Contexts in source publication

Context 1
... NJ and the FM methods were used with BP distances to reconstruct the phylogeny (fig. 5A). Both methods inferred the same topology. In order to obtain supporting values for each node, BP distances were estimated after obtaining 100 random samples with genomes containing half the number of genes. The inferred topology was similar to that obtained based on amino acid sequences (fig. 2), but with several important ...
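The breakpoint (BP) distance and the half-genome resampling mentioned in this context can be illustrated with a minimal sketch. It assumes that each genome is a signed circular list of shared gene identifiers (the sign encodes the strand). The function names, the canonical form used for adjacencies, and the fixed random seed are illustrative assumptions, not the procedure of the source publication.

```python
import random


def adjacencies(order):
    """Return the set of adjacencies of a signed circular gene order.

    An adjacency (a, b) read on one strand is the same as (-b, -a) read on
    the other strand, so each pair is stored in a strand-independent form.
    """
    n = len(order)
    adj = set()
    for i in range(n):
        a, b = order[i], order[(i + 1) % n]  # wrap around: circular genome
        adj.add(max((a, b), (-b, -a)))
    return adj


def breakpoint_distance(g1, g2):
    """Count adjacencies of g1 that are absent from g2."""
    return len(adjacencies(g1) - adjacencies(g2))


def jackknife_distances(g1, g2, replicates=100, fraction=0.5, seed=0):
    """Recompute the BP distance on random gene subsets.

    Each replicate keeps a random `fraction` of the shared genes in both
    genomes, preserving their relative order and orientation.
    """
    rng = random.Random(seed)
    genes = sorted(abs(g) for g in g1)
    k = int(len(genes) * fraction)
    distances = []
    for _ in range(replicates):
        keep = set(rng.sample(genes, k))
        s1 = [g for g in g1 if abs(g) in keep]
        s2 = [g for g in g2 if abs(g) in keep]
        distances.append(breakpoint_distance(s1, s2))
    return distances
```

In a full analysis, one distance matrix per replicate would be passed to a tree-building method, and node support would be the fraction of replicate trees containing each clade.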
Context 2
... same approach was carried out with the INV distances (fig. 5B), obtaining an even closer topology to the sequence-based one (fig. 2). The Shi. flexneri strains move closer to E. coli, and She. oneidensis slightly changed its ...
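An inversion (INV) distance is commonly defined as the minimal number of inversions needed to turn one gene order into another. The sketch below computes that quantity by brute-force breadth-first search, which is practical only for a handful of genes. It assumes signed linear gene orders, and its function names are hypothetical. Dedicated tools such as GRIMM, mentioned in one of the citing articles, rely on far more efficient algorithms, so this is not the method of the source publication.

```python
from collections import deque


def inversions(perm):
    """Yield every gene order reachable from `perm` by one inversion.

    An inversion reverses a contiguous segment and flips the sign
    (strand) of every gene inside it.
    """
    n = len(perm)
    for i in range(n):
        for j in range(i, n):
            flipped = tuple(-g for g in reversed(perm[i:j + 1]))
            yield perm[:i] + flipped + perm[j + 1:]


def inversion_distance(src, dst):
    """Minimal number of inversions transforming `src` into `dst` (BFS)."""
    src, dst = tuple(src), tuple(dst)
    if sorted(map(abs, src)) != sorted(map(abs, dst)):
        raise ValueError("both genomes must contain the same genes")
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        perm, dist = queue.popleft()
        if perm == dst:
            return dist
        for nxt in inversions(perm):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    raise ValueError("target not reachable")
```

Because the search space grows as n! · 2^n, this exhaustive approach is only useful for checking small examples by hand.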
Context 3
... comparison of the rearrangement and sequence distances (fig. 3) and the observation of the branch lengths in the phylogenetic tree (fig. 5) show that although rearrangement distances increase with time, they occur at a heterogeneous rate with strong variations between and throughout the evolution of lineages. A great acceleration of the three Pasteurellaceae lineages was detected. We have estimated that, on average, they evolve at a relative rate of at least twice that ...
Context 4
... gene order phylogeny did not show a cluster of the three endosymbiotic species. The effect of the long-branch attraction is expected to be smaller because the acceleration of the branches leading to the three endosymbionts are, relative to free-living enteric bacteria, much smaller in the gene order than in the sequence phylogeny (see figs. 2 and 5). Several incorrect estimations may affect our gene order phylogeny. ...

Similar publications

Article
The bacterial family Enterobacteriaceae gave rise to a variety of symbiotic forms, from the loosely associated commensals, often designated as secondary (S) symbionts, to obligate mutualists, called primary (P) symbionts. Determination of the evolutionary processes behind this phenomenon has long been hampered by the unreliability of phylogenetic r...

Citations

... The increase in the frequency of cassette excision upon inversion of the SI was such that we reasoned that there must be a limiting factor preventing such an event to occur in natural settings. Indeed, genome rearrangements are not very rare [25], and if a single inversion was sufficient to empty a SCI of all its cassettes in a few generations, then the massive structures of SCIs would not exist. Hence, we thought that there might be a high selective pressure on the orientation of the SI, preventing any spontaneous inversion. ...
Preprint
Integrons are adaptive bacterial devices that rearrange promoterless gene cassettes into variable ordered arrays under stress conditions, to sample combinatorial phenotypic diversity. Chromosomal integrons often carry hundreds of silent gene cassettes, with integrase-mediated recombination leading to rampant DNA excision and integration, posing a potential threat to genome integrity. How this activity is regulated and controlled, particularly through selective pressures, to maintain such large cassette arrays is unknown. Here we show a key role of promoter-containing toxin–antitoxin (TA) cassettes as abortive systems that kill the cell when the overall cassette excision rate is too high. These results highlight the importance of TA cassettes regulating the cassette recombination dynamics and provide insight into the evolution and success of integrons in bacterial genomes. Teaser: The accumulation of cassette functions in integrons is ensured by toxin–antitoxin systems which kill the cell when the cassette excision rate is too high.
... These latter two may be considered simplifying assumptions, since they reduce the size of the state space. However, as will be shown later, the framework just as easily accommodates the alternate cases. 2 Here, a region is a contiguous section of the genome such as a sequence of genes (for an example of how such genomic simplification may be enacted in practice, see Belda et al. 2005). ...
Article
We present a unified framework for modelling genomes and their rearrangements in a genome algebra, as elements that simultaneously incorporate all physical symmetries. Building on previous work utilising the group algebra of the symmetric group, we explicitly construct the genome algebra for the case of unsigned circular genomes with dihedral symmetry and show that the maximum likelihood estimate (MLE) of genome rearrangement distance can be validly and more efficiently performed in this setting. We then construct the genome algebra for a more general case, that is, for genomes that may be represented by elements of an arbitrary group and symmetry group, and show that the MLE computations can be performed entirely within this framework. There is no prescribed model in this framework; that is, it allows any choice of rearrangements that preserve the set of regions, along with arbitrary weights. Further, since the likelihood function is built from path probabilities—a generalisation of path counts—the framework may be utilised for any distance measure that is based on path probabilities.
... The permutation file was converted with the program GRIMM into a distance matrix, which contained the minimal number of inversions required between a pair of genomes to explain differences in gene order (26). The distance matrix was loaded into MEGA7 (76) and a neighbor joining algorithm was used to infer the rearrangement phylogeny (80). Orthology between CDS and pseudogenes from M. leprae Br4923 and M. lepromatosis FJ924 was obtained with a reciprocal BLASTN best hit strategy (E value = 1.0E-05). ...
Article
Leprosy is a dreaded infection that still affects millions of people worldwide. Mycobacterium lepromatosis is a recently recognized cause in addition to the well-known Mycobacterium leprae . M. lepromatosis is likely specific for diffuse lepromatous leprosy, a severe form of the infection and endemic in Mexico. This study constructed and annotated the complete genome sequence of M. lepromatosis FJ924 and performed comparative genomic analyses with related mycobacteria.
... A low level of gene order conservation in bacterial genomes is well established (Koonin et al. 1996, 2021; Puigbò et al. 2010; Darmon and Leach 2014). Successive chromosomal inversions are one mechanism by which gene order could be rearranged and indeed inversions have been noted as a type of organizational variant in many bacterial species (Belda et al. 2005; Darling et al. 2008; Matthews et al. 2011; Scott and Ely 2016; Xu et al. 2016; Mao and Grogan 2017; Repar and Warnecke 2017; Ely et al. 2019; Shelyakin et al. 2019), sometimes shown to be associated with recombination between inverted repeats such as ribosomal RNA operons or mobile genetic elements including prophage (Matthews et al. 2011; Wang et al. 2017; Fitzgerald et al. 2021). However, it remains an open question to which degree the observed long-term evolutionary lack of gene order on bacterial chromosomes is due to the successive effects of overlapping inversions which in many cases are not expected to confer any immediate selective advantage on the affected bacterial strain. ...
Article
Analysis of bacterial genomes shows that while diverse species share many genes in common their linear order on the chromosome is often not conserved. While rearrangements in gene order could occur by genetic drift, an alternative hypothesis is rearrangement driven by positive Selection during Niche Adaptation (SNAP). Here, we provide the first experimental support for the SNAP hypothesis. We evolved Salmonella to adapt to growth on malate as sole carbon source and followed the evolutionary trajectories. The initial adaptation to growth in the new environment involved the duplication of 1.66 Mb, corresponding to one third of the Salmonella chromosome. This duplication is selected to increase the copy number of a single gene, dctA, involved in the uptake of malate. Continuing selection led to the rapid loss or mutation of duplicate genes from either copy of the duplicated region. After 2000 generations only 31% of the originally duplicated genes remained intact and the gene order within the Salmonella chromosome has been significantly and irreversibly altered. These results experientially validate predictions made by the SNAP hypothesis and show that selection during niche adaptation can be a strong driving force for rearrangements in chromosomal gene order.
... The rearrangement rate in Bu. aphidicola was found to be close to zero during the last 100-150 Myr of evolution (Belda et al., 2005). It should probably be taken into account that the evolutionary rates of spore-forming bacteria might be significantly slower. ...
... For the specific case study in this and the two subsequent sections, we model the evolution of single-strand, circular genomes with unoriented regions. (The general case is presented in Section 5.) Genomes that are to be compared share N identified regions of interest, where a region is a contiguous section of the genome such as a sequence of genes (for an example of how such genomic simplification may be enacted in practice, see [4]). Accordingly, we use unsigned permutations, that is, elements of the symmetric group, S_N, to represent both genomes and rearrangements. ...
... We write zs to emphasise the structure of the model element as a member of A, and choose to sum over rearrangements of the form za_i rather than za_i z for simplicity (from (32), these have the same action, so either form may be used). Note that, in contrast to the formulation in Sections 3 and 4, we no longer have a concept of 'lone' s ∈ C[G] (that is, a model element without the symmetry element), since we have not defined the model in G [footnote 4]. It remains to connect the path probabilities to the regular character of powers of the model element zs ∈ A (cf. (3) in Section 2). Recall from Section 3.1 that za · zg gives a convex combination of genomes, that is, ...
... Considering the matrix ρ A reg (zs) as a whole, we see, just as in Section 3.1, that the regular representation of the model element in A is the transition matrix for a Markov chain with ... [Footnote 4: One could of course conceive of what such an element would look like, however it would not have its formerly useful properties, such as commuting with z, and is simply not relevant in this formulation.] ...
Preprint
We present a unified framework for modelling genomes and their rearrangements in a genome algebra, as elements that simultaneously incorporate all physical symmetries. Building on previous work utilising the group algebra of the symmetric group, we explicitly construct the genome algebra for the case of unsigned circular genomes with dihedral symmetry and show that the maximum likelihood estimate (MLE) of genome rearrangement distance can be validly and more efficiently performed in this setting. We then construct the genome algebra for the general case, that is, for genomes represented by elements of an arbitrary group and symmetry group, and show that the MLE computations can be performed entirely within this framework. There is no prescribed model in this framework; that is, it allows any choice of rearrangements with arbitrary weights. Further, since the likelihood function is built from path probabilities -- a generalisation of path counts -- the framework may be utilised for any distance measure that is based on path probabilities.
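As a small illustration of what dihedral symmetry means for unsigned circular genomes, the sketch below treats two arrangements of the same regions as one genome when they differ only by a rotation or a reflection of the circle. This reading of the symmetry, the tuple representation, and the function names are assumptions made for illustration. The code does not reproduce the genome algebra or the MLE computation of the abstract above.

```python
from itertools import permutations


def dihedral_images(perm):
    """Yield all 2N rotations and reflections of a circular arrangement."""
    perm = tuple(perm)
    n = len(perm)
    for seq in (perm, perm[::-1]):   # original and mirror image
        for r in range(n):           # every rotation of each
            yield seq[r:] + seq[:r]


def canonical_form(perm):
    """Pick one representative per genome: the smallest image."""
    return min(dihedral_images(perm))


def same_genome(p, q):
    """True if p and q describe the same unsigned circular genome."""
    return canonical_form(p) == canonical_form(q)


def count_distinct_genomes(n):
    """Count unsigned circular genomes on n regions up to dihedral symmetry."""
    return len({canonical_form(p) for p in permutations(range(1, n + 1))})
```

Collapsing each set of equivalent arrangements to one representative shrinks the state space from N! permutations to N!/(2N) genomes for N ≥ 3, which fits the abstract's remark that computations can be performed more efficiently once the symmetry is taken into account.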
... 8. Phylogenetic tree of Dickeya based on 92 concatenated housekeeping genes (RAxML, 1000 bootstrap replicates, outgroup: Pectobacterium fontis M022). ... methods based mainly on the concatenation of a large number of genes (Lerat et al., 2003), the construction of "supertrees" and a "consensus" approach that excludes from the comparison the amino acids (FYMINK: phenylalanine, tyrosine, methionine, isoleucine, asparagine and lysine) most affected by nucleotide composition bias (Comas et al., 2007), modifications of sequence evolution models (Herbeck et al., 2005), and the use of genome structure as the basis for phylogenetic reconstruction (Belda et al., 2005). ...
Book
This book summarizes five years of work by the Laboratory of Molecular Bioengineering of the Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences on a project, supported by the Russian Science Foundation, on the use of bacterial viruses (bacteriophages, phages) for the biological control of bacterial diseases of potato. The concept of creating a collection of characterized bacteriophages and then applying them against a limited range of bacterial pathogens in agriculture initially looked fairly straightforward, but it turned out to be considerably more complex and multifaceted. A comprehensive understanding of the problem and of the principles governing phage-based biocontrol required studies of the biological and genetic aspects of the pathogenesis and evolution of bacteria that destroy plant tissue, the molecular details of the interaction between phage and bacterial host, the emergence of bacterial resistance to phages and ways to overcome it, the selection of conditions for optimal cultivation, strategies for using the engineered phage preparations, and preliminary differential diagnosis of soft rot pathogens. Some of the results obtained had not previously been described in the scientific literature, not only for bacteriophages of phytopathogens but for bacteriophages in general. In addition, during the project there were significant changes in the taxonomy of both the target bacteria and the phages, based on accumulated genetic information. These changes are not always reflected promptly in the review literature, and even relatively recently published information turns out to be unreliable and incomplete when current taxonomy is not taken into account. For this reason, the relevant chapters devote considerable attention to the latest trends in the formation of new taxa (genera, species and subspecies) among bacteria and phages.
For the first time in the Russian scientific literature, the reader is offered a modern formal description of the currently existing taxa of bacteria that cause soft rot of plants (Part I, with a descriptive appendix). It should be borne in mind that genetic material is accumulating in the world's databases at a rapid pace. Although the systematization of at least the bacteria and phages directly relevant to the subject of this book currently appears complete, new species of bacteria capable of causing plant diseases, and other, previously undescribed varieties of bacteriophages, may well be discovered in the very near future. The research algorithm presented in this work appears to be sufficiently universal and can be adapted to take additional parameters into account, for example conditions of application or a changing picture of plant diseases, and can also be applied to analogous problems involving other pathogens.
... Large scale changes in the arrangement of genes within a chromosome abound in biology and are key agents of sequence evolution (Belda et al., 2005; Beckmann et al., 2007). The differences in the order of genes along a chromosome were used as a phylogenetic marker as early as 1938 (Dobzhansky and Sturtevant, 1938) when Dobzhansky used them to determine different strains of Drosophila melanogaster. ...
... The differences in the order of genes along a chromosome were used as a phylogenetic marker as early as 1938 (Dobzhansky and Sturtevant, 1938) when Dobzhansky used them to determine different strains of Drosophila melanogaster. Inversions of chromosomal fragments are believed to be the main type of rearrangement event in bacterial genomes (Belda et al., 2005). ...
Article
Measuring the distance between two bacterial genomes under the inversion process is usually done by assuming all inversions to occur with equal probability. Recently, an approach to calculating inversion distance using group theory was introduced, and is effective for the model in which only very short inversions occur. In this paper, we show how to use the group-theoretic framework to establish minimal distance for any weighting on the set of inversions, generalizing previous approaches. To do this we use the theory of rewriting systems for groups, and exploit the Knuth–Bendix algorithm, the first time this theory has been introduced into genome rearrangement problems. The central idea of the approach is to use existing group theoretic methods to find an initial path between two genomes in genome space (for instance using only short inversions), and then to deform this path to optimality using a confluent system of rewriting rules generated by the Knuth–Bendix algorithm.
... These results indicate that Vesicomyid clams are more closely related to the family Cyrenidae than to other families. Also, V. cyprinoides and Corbicula fluminea may have evolved recently and are closely related, since gene rearrangement is less extensive in closely related species than in distant ones [55]. The gene orders conserved in Cyrenid clams were identified as R-cytb-rrnL-ATP8-nad4, ATP6-nad3, and L-nad1-L-V. ...
Article
The Indian black clam Villorita cyprinoides (Family: Cyrenidae), an extractive commercially exploited species with aquaculture importance contributing more than 70% of clam fishery in India, is endemic to the Indian peninsula. Currently, there is very sparse information, especially on the molecular data of Villorita. The present study aims to provide a comprehensive knowledge of mitogenome architecture and assess the phylogenetic status of Cyrenidae. This has resulted in reporting the first complete mitogenome of V. cyprinoides using next-generation sequencing technology. The A+T circular mitogenome was 15,880 bp long, exhibiting 13 protein-coding genes (PCGs) including ATP8 (absent in several bivalves), 22 transfer RNA, and two ribosomal RNA genes residing in the heavy strand in a clockwise orientation and a gene order akin to Corbicula fluminea. The molecular phylogeny inferred from a concatenated multi-gene sequence [14 mitochondrial (12 PCGs, rrnS and rrnL) and two nuclear genes (Histone H3, 18S rRNA)] from 47 representative species of superorder Imparidentia, clustered V. cyprinoides and Cyrenid clams to a single clade supporting the monophyly of Cyrenidae. The subsequent mitochondrial gene order analysis substantiates the close relationship of V. cyprinoides and C. fluminea, analogous to phylogenetic output. The multilocus tree topology calibrated with verified fossil data deciphered the origin and diversification of Cyrenid clams during late Triassic-early Jurassic. The data derived from this study shall contribute remarkably for further insights on cryptic species identification, molecular characterization of bivalve mitogenomes and mitochondrial evolutionary history of genus Villorita. Moreover, complete mitogenome can aid in potential marker development for assessing the genetic health of black clam populations.
... Large scale changes in the arrangement of genes within a chromosome abound in biology and are key agents of sequence evolution [Beckmann et al., 2007, Belda et al., 2005b]. The differences in the order of genes along a chromosome were used as a phylogenetic marker as early as 1938 [Dobzhansky and Sturtevant, 1938] when Dobzhansky used them to determine different strains of Drosophila melanogaster. ...
... Inversions of chromosomal fragments are believed to be the main type of rearrangement event in bacterial genomes [Belda et al., 2005b]. ...
Preprint
Measuring the distance between two bacterial genomes under the inversion process is usually done by assuming all inversions to occur with equal probability. Recently, an approach to calculating inversion distance using group theory was introduced, and is effective for the model in which only very short inversions occur. In this paper, we show how to use the group-theoretic framework to establish minimal distance for any weighting on the set of inversions, generalizing previous approaches. To do this we use the theory of rewriting systems for groups, and exploit the Knuth–Bendix algorithm, the first time this theory has been introduced into genome rearrangement problems. The central idea of the approach is to use existing group theoretic methods to find an initial path between two genomes in genome space (for instance using only short inversions), and then to deform this path to optimality using a confluent system of rewriting rules generated by the Knuth–Bendix algorithm.