The Flag-2 Locus, an Ancestral Gene Cluster, Is Potentially Associated with a Novel Flagellar System from Escherichia coli

American Society for Microbiology
Journal of Bacteriology
Escherichia coli K-12 possesses two adjacent, divergent, promoterless flagellar genes, fhiA-mbhA, that are absent from Salmonella enterica. Through bioinformatics analysis, we found that these genes are remnants of an ancestral 44-gene cluster and are capable of encoding a novel flagellar system, Flag-2. In enteroaggregative E. coli strain 042, there is a frameshift in lfgC that is likely to have inactivated the system in this strain. Tiling path PCR studies showed that the Flag-2 cluster is present in 15 of 72 of the well-characterized ECOR strains. The Flag-2 system resembles the lateral flagellar systems of Aeromonas and Vibrio, particularly in its apparent dependence on RpoN. Unlike the conventional Flag-1 flagellin, the Flag-2 flagellin shows a remarkable lack of sequence polymorphism. The Flag-2 gene cluster encodes a flagellar type III secretion system (including a dedicated flagellar sigma-antisigma combination), thus raising the number of distinct type III secretion systems in Escherichia/Shigella to five. The presence of the Flag-2 cluster at identical sites in E. coli and its close relative Citrobacter rodentium, combined with its absence from S. enterica, suggests that it was acquired by horizontal gene transfer after the former two species diverged from Salmonella. The presence of Flag-2-like gene clusters in Yersinia pestis, Yersinia pseudotuberculosis, and Chromobacterium violaceum suggests that coexistence of two flagellar systems within the same species is more common than previously suspected. The fact that the Flag-2 gene cluster was not discovered in the first 10 Escherichia/Shigella genome sequences studied emphasizes the importance of maintaining an energetic program of genome sequencing for this important taxonomic group.
FIG. 1. (A) Schematic representation of the Flag-2 gene cluster in E. coli 042. The 48.8-kb region between nucleotide positions 281000 and 329800 from the finished genome of E. coli 042 is indicated by a solid black line. GLIMMER-predicted CDSs are indicated by arrows approximately to scale and colored according to operon and function. Flag-2 genes are designated according to the scheme proposed in this paper. Nt, nucleotides. (B) Schematic representation of the Flag-2 gene clusters of other bacteria. The solid black lines indicate the genome fragments in E. coli K-12 strain MG1655 (11.4 kb), S. enterica serovar Typhi Ty2 (4.9 kb), V. parahaemolyticus (11.7-kb region 1 and 23.2-kb region 2 from chromosome 2), Y. pestis CO92 (44.4 kb), and Y. pestis KIM (28.5 kb) that are equivalent to the lateral flagellar cluster of E. coli 042. E. coli K-12 and S. enterica serovar Typhi Ty2 are representative of other sequenced E. coli strains and S. enterica strains, respectively (data not shown). The dotted lines indicate deletions or missing genes relative to the E. coli 042 Flag-2 gene cluster. CDSs are indicated by arrows colored according to the E. coli 042 scheme shown in panel A. CDSs that are insertions (IS) relative to the E. coli 042 genome are indicated by arrows below the backbone. (C) Tiling path PCR used to map the Flag-2 gene cluster of E. coli strains. Primers and their orientations are indicated by small triangles. The red triangles indicate the lfhA-lafU primers used to confirm the absence of the lateral flagellar cluster in E. coli strains. The orange and blue triangles indicate the lfhAB and lafTU primers, respectively, that were used to confirm the presence of genes between lfhA and lafU. E. coli strains suspected of harboring a Flag-2 gene cluster were further analyzed with primer sets 1 to 8 (lines with triangles at each end) by using long-range PCR to determine the tiling path. ECOR strains that were found to harbor the Flag-2 gene cluster could be divided into four types, type 1 (ECOR4, -49, and -50), type 2 (ECOR-1, -3, -5, -12, -17, -24, -64, -65, -67), type 3 (ECOR-35 and -36), and type 4 (ECOR48), according to the tiling path. Successful long-range PCRs of fragments 1 to 8 (approximately 5 kb each) are indicated by solid black lines. Unsuccessful PCRs are indicated by dotted lines. The red lines indicate cases in which long-range PCR with complementary primers from adjacent successful amplifications was used to span regions that could not be amplified.
JOURNAL OF BACTERIOLOGY, Feb. 2005, p. 1430–1440 Vol. 187, No. 4
0021-9193/05/$08.000 doi:10.1128/JB.187.4.1430–1440.2005
Copyright © 2005, American Society for Microbiology. All Rights Reserved.
Chuan-Peng Ren,1 Scott A. Beatson,1 Julian Parkhill,2 and Mark J. Pallen1*
Bacterial Pathogenesis and Genomics Unit, Division of Immunity and Infection, Medical School, University of Birmingham, Birmingham,1 and The Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge,2 United Kingdom
Received 2 September 2004/Accepted 15 October 2004
The motile gamma-proteobacterium Escherichia coli has been widely accepted in biology as a model organism, an opinion typified by quotations such as “all cell biologists have two cells of interest: the one they are studying and Escherichia coli” (34) or Jacques Monod’s famous dictum “Tout ce qui est vrai pour le Colibacille est vrai pour l’éléphant” (“What is true for E. coli is also true of the elephant”) (26). However, within this single model species, which now encompasses the shigellas, there are remarkable variations in genome size, and the largest E. coli genomes possess more than 1 Mb more DNA than the smallest E. coli genomes (42). Comfortingly, the most commonly used laboratory strain, K-12, has one of the smallest E. coli genomes, leading to the often unwitting assumption that this model strain represents the ancestral or archetypical state of the species.
Curiously, one area of bacteriology in which E. coli K-12 has been eclipsed as a model organism is the study of flagellar biosynthesis, assembly, and regulation. In this area, Salmonella enterica serovar Typhimurium strain LT2 has been the most commonly used model organism (34, 35). Nonetheless, it has been assumed that the genetics and physiology of flagellar systems are essentially the same in E. coli and S. enterica; minor differences include a tap receptor gene in E. coli but not in S. enterica and an fliB flagellar methylase gene and a phase 2 locus in S. enterica but not in E. coli (12, 34). In a similar vein, we have recently concluded from comparative sequence analysis that the S. enterica-E. coli model of flagellar function holds up surprisingly well even when it is generalized to bacteria that are only distantly related to E. coli (43). However, there are at least two challenges to this paradigm. First, unlike the E. coli-S. enterica archetype, many flagellar systems rely on the alternative sigma factor RpoN as a key facet of gene regulation (11, 27, 28, 30, 55). Second, some gamma-proteobacterial species (Aeromonas hydrophila and Vibrio parahaemolyticus) have been shown to utilize two distinct flagellar systems for motility, a polar system for swimming in the liquid phase and a lateral system for swarming over solid surfaces (31, 32, 37).
Studies initially with uropathogenic E. coli (UPEC) and later with other pathotypes suggested that E. coli strains often acquire new complex pathogenic phenotypes in a single step by the acquisition of pathogenicity islands, which contain virulence genes clustered on the chromosome and which are acquired en bloc by horizontal gene transfer (21, 22). More recently, the island concept has been generalized to encompass almost any horizontally acquired gene cluster or even any region in which there is a difference between two genomes. The most striking example of the latter expansion of use occurred when the first genome sequence of a pathogenic strain, enterohemorrhagic E. coli O157:H7, was compared to the K-12 genome sequence (47).
Recently, uncritical adoption of the island concept in genome annotation has faced several challenges. It is now clear that in the K-12–O157 comparison, neither the presumed polarity of change (i.e., the presumption that an O-island is an insertion in O157 relative to the K-12 backbone) nor the ancestral status of K-12 can be justified in all cases. For example, we described a striking case in which O-island 115 is part of a much larger gene cluster, ETT2, associated with type III secretion, and in which the essential difference between O157 and K-12 at this locus is a deletion in K-12 rather than an insertion into the O157 genome (i.e., O157 reflects the ancestral state better than K-12) (48). Furthermore, some so-called pathogenicity islands, even in UPEC, are more fluid than first thought (54). Thus, rather than a fixed core of housekeeping genes supplemented by a limited set of optional islands, the E. coli genomes are perhaps better viewed as frequently redrafted palimpsests, subject to repeated rounds of insertion, deletion, and rearrangement.
* Corresponding author. Mailing address: Bacterial Pathogenesis and Genomics Unit, Division of Immunity and Infection, Medical School, University of Birmingham, Birmingham, England B15 2TT. Phone: (44) 121 414 7163. Fax: (44) 121 414 3454. E-mail: m.pallen@bham.ac.uk.
C.-P.R. and S.A.B. contributed equally to this study.
In addition to within-species alignments, comparison of the E. coli K-12 genome with the genome of S. enterica LT2 might be seen as a way of defining the E. coli backbone and the E. coli genomic islands (39). Such a comparison reveals a small but puzzling difference in the flagellar gene repertoires. K-12 possesses an additional pair of divergent, promoterless genes, fhiA-mbhA, an apparent flagellar islet that is absent from S. enterica (39). These genes appear to encode incomplete homologs of FlhA and MotB. We therefore examined the genomic context of the mbhA-fhiA genes in 11 different genome sequences from Escherichia/Shigella strains. We were surprised to discover that these genes represent a remnant of an ancestral gene cluster present in around one-fifth of E. coli strains that is potentially capable of encoding a novel flagellar system previously overlooked in this intensively studied model organism.
MATERIALS AND METHODS
Genome sequencing. Enteroaggregative E. coli (EAEC) strain 042 (O44:H18) used in this study was originally isolated from a child with diarrhea in Lima, Peru. The initial shotgun sequences were generated from 70,000 paired-end sequences by using dye terminator chemistry with ABI3700 automated sequencers. The initial shotgun sequences were assembled by using PHRAP (www.phrap.org), and the sequence of the Flag-2 locus was checked by using GAP4 (http://staden.sourceforge.net).
Sequence analysis. The Flag-2 gene cluster was initially identified in the unfinished genome of EAEC strain 042 by using BLASTP searches with E. coli K-12 flagellar protein sequences against E. coli 042 GLIMMER (14)-predicted coding sequences (CDSs) (available at http://vge.ac.uk/; genome sequence data downloaded from http://www.sanger.ac.uk/ on 10 December 2003). Systematic gene names are those provided by the Sanger Institute for the complete genome of E. coli 042 (http://www.sanger.ac.uk/Projects/Escherichia_Shigella/). Subsequent BLASTP and PSI-BLAST searches of the nonredundant protein and nucleotide databases (http://www.ncbi.nlm.nih.gov/) and unfinished microbial genomes (http://vge.ac.uk/) resulted in identification of equivalent gene clusters in the complete genomes of Vibrio parahaemolyticus, Yersinia pestis strains KIM and CO92, and Chromobacterium violaceum (10, 15, 36, 44), in the complete but unannotated genome of Yersinia pseudotuberculosis (ftp://bbrp.llnl.gov/pub/cbnp/y.pseudotuberculosis/), in the incomplete genome of Citrobacter rodentium (http://www.sanger.ac.uk/Projects/C_rodentium/), and in previously annotated clusters from Aeromonas species (19, 41).
When possible, comparative analyses of the regions surrounding and containing Flag-2 gene clusters were performed and visualized by using the coliBASE server (http://colibase.bham.ac.uk) (13), and these analyses covered the complete or nearly complete genome sequences of 12 Escherichia/Shigella strains, 11 Salmonella strains, and selected other bacterial pathogens; the organisms used included the laboratory strains E. coli K-12 strain MG1655, E. coli K-12 strain W3110, and E. coli strain DH10B; UPEC strain CFT073; enterohemorrhagic E. coli O157:H7 strains EDL933 and RIMD 0509952 (Sakai); EAEC strain 042; enteropathogenic E. coli strain E2348/69; Shigella flexneri 2a strains 2457T and 301; Shigella dysenteriae strain M131649 (M131); Shigella sonnei strain 53G; Salmonella enterica serovar Typhi strains CT18 and Ty2; S. enterica serovar Typhimurium strains LT2, DT104, and SL1344; Salmonella enteritidis strains LK5 and PT4; the lesser known salmonellae Salmonella bongori, Salmonella dublin, Salmonella gallinarum strain 287/91, and Salmonella pullorum; Yersinia pestis strains CO92 and KIM; Yersinia enterocolitica strain 8081; and C. rodentium (9, 15, 16, 24, 29, 40, 44, 46, 47, 53, 54).
Detailed analyses of the Flag-2 clusters of E. coli 042, V. parahaemolyticus, Y. pestis strains KIM, CO92, and 91001, Y. pseudotuberculosis, C. violaceum, and C. rodentium were carried out by using stand-alone BLAST (4) to confirm the presence of positional orthologs and CLUSTALW (51) to align orthologous protein sequences. When appropriate, E. coli 042 GLIMMER-predicted CDSs were shortened to more closely match the gene lengths of corresponding orthologs from V. parahaemolyticus and Y. pestis genomes and to minimize the overlap between adjacent genes. SEAVIEW (18) was used to visualize multiple alignments, and ARTEMIS (49) was used to annotate the E. coli 042 Flag-2 region. Promoter and sigma factor binding sequences were predicted by using promscan (http://www.promscan.uklinux.net/). All sequence analyses were carried out with a Macintosh G5 computer.
Strains. The ECOR strain collection was kindly provided by Thomas Whittam
and has been described elsewhere (http://foodsafe.msu.edu/whittam/ECOR).
Representatives of other pathotypes, including NMEC (E. coli associated with
neonatal meningitis) strain RS218, EAEC strain 042, enterotoxigenic E. coli
strain H10407, EAEC strain EAEC25, UPEC strain CFT073, and E. coli strain
K-12 were kindly provided by Ian Henderson (University of Birmingham), while
an isogenic nontoxigenic derivative of the E. coli O157:H7 Sakai strain was a kind
gift from Chihiro Sasakawa (University of Tokyo).
PCR. Genomic DNA from each strain was extracted with a Puregene isolation kit (Flowgen, Ashby-de-la-Zouch, United Kingdom) and was stored at 4°C. Primers were designed by using the Primer3 software on the coliBASE server (http://colibase.bham.ac.uk). Primer sequences are listed in Table 1. For short PCRs, each 20-µl reaction mixture contained 1 U of Taq polymerase (Invitrogen, Renfrew, United Kingdom) in the buffer supplied by the manufacturer, 20 ng of genomic DNA, and each deoxynucleoside triphosphate at a concentration of 250 µM. The short PCR conditions were 30 cycles of 30 s at 94°C, 30 s at 62°C, and 30 s at 72°C, followed by a 7-min extension at 72°C. Long PCRs were performed by using TaKaRa LA Taq (Cambrex Bio Science, Wokingham, United Kingdom) in the buffer supplied by the manufacturer. Each 25-µl long PCR mixture contained 60 ng of genomic DNA, 5 pmol of each primer, each deoxynucleoside triphosphate at a concentration of 250 µM, and 1 U of TaKaRa LA Taq; the reaction conditions were 30 cycles of a two-step program consisting of 20 s at 96°C and 10 min at 69°C, with a 10-min extension at 72°C. The long PCR fragments were analyzed by electrophoresis by using a 0.5% agarose gel, while the short PCR products were analyzed on a 1.0% agarose gel.
TABLE 1. Primers used in this study
Primer          Sequence (5′–3′) (forward)        Sequence (5′–3′) (reverse)
fhiA-mbhA       GTTGATCGCCAGAATCATCATC (FhiA-F)   ATATTGCGGTTCTGGTCGTCTT (MbhA-R)
fhiA-flanking   TGAAAGTCAGGTGGAAGTGGTC (LfhB-F)   CCGATGGTCATCAGCACATACT (LfhA-R)
mbhA-flanking   GGATGAGACGGGCTGATTTTAT (LafU-F)   GGGACGTTTTTAGGCGTCTTTA (LafU-R)
Flg1            TTCAAACATATTGCGGTTCTGG            GGCGTCACCATGACTTTTACC
Flg2            CGTCCTGAATTTTGCTCATCTG            AGCAATGGAAGCTACCCTCAAG
Flg3            GTCATCCTCAGACAGCATCACC            CGACAACCTGTATCTGGAAACC
Flg4            GAGGAAACCGAGTTGTCGTTCT            GGTCGCAATCTGTGAGGAAATA
Flg5            GCTGATTTGCAGATTCAGAAAGG           TGACAGCAAATAACGCAGTTCC
Flg6            TCTGCCGGAAAATATTCAATCC            TTATTCCGCTGTGGAAAGATGA
Flg7            GTTGAGGATCCCCTGCAACAT             TCATCAGAATCAGCACCTGGAT
Flg8            CCGGGGATATTTTACCCATCTC            CCCCCATAATCTTCAACTCCAG
Second fliC     CAAAAGGCGTGCCAATATTTTT            AAACGATTAATCCCAAGGAGCA
Nested fliC     CCGTTAATGCAATCAGCAAAAG            GGACATGCTGTTGGACTGTTTC
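
As a rough illustration of how the primers in Table 1 can be sanity-checked, the sketch below computes length, GC fraction, and a rule-of-thumb melting temperature for the FhiA-F and MbhA-R primers. The Wallace rule used here is our assumption for illustration only; it is a crude estimate best suited to short oligonucleotides and is not the thermodynamic model that Primer3 applies. The function names are hypothetical.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer: str) -> int:
    """Wallace rule of thumb: Tm ~ 2*(A+T) + 4*(G+C), in degrees Celsius.

    Assumption for illustration; not the method used in the study.
    """
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Primer sequences copied from Table 1.
FHIA_F = "GTTGATCGCCAGAATCATCATC"
MBHA_R = "ATATTGCGGTTCTGGTCGTCTT"

for name, primer in (("FhiA-F", FHIA_F), ("MbhA-R", MBHA_R)):
    print(name, len(primer), f"{gc_fraction(primer):.2f}", wallace_tm(primer))
```

Both primers are 22-mers with 10 G or C bases, so the two estimates coincide; a nearest-neighbor model would be a better choice for any real design work.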
We employed a three-stage PCR strategy to scan isolates for the Flag-2 cluster. Initially, primers FhiA-F and MbhA-R were applied to our strain collection (the ECOR collection supplemented with selected pathogenic strains) in a conventional short PCR to detect the K-12-like fhiA-mbhA genotype. Next, primer pairs LfhB-F plus LfhA-R and LafU-F plus LafT-R were applied to all strains to detect the pairs of genes at the ends of the 042 Flag-2 cluster. Finally, tiling path PCR, which was described in a previous paper (48), was used to obtain a complete tiling path through the gene cluster in all 15 Flag-2-positive ECOR strains. In this study, long PCR primers were designed to amplify eight 5-kb fragments spanning the whole 35-kb cluster, with each fragment overlapping its neighbors by a few hundred base pairs (Table 1 and Fig. 1C). Any negative results obtained by the long PCR were followed up by deletion scanning long PCRs as described previously (48).
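
The tiling arithmetic can be checked with a short sketch. It assumes, purely for illustration, eight fragments of exactly 5 kb spaced evenly across exactly 35 kb; the actual amplicon boundaries were set by the primer positions (Table 1) and the fragment sizes are only approximate. The function name is hypothetical.

```python
def tiling_path(cluster_len: int, n_fragments: int, fragment_len: int):
    """Return evenly spaced (start, end) tiles and the mean overlap between neighbors."""
    if n_fragments < 2 or n_fragments * fragment_len < cluster_len:
        raise ValueError("fragments cannot cover the cluster")
    # Distance between consecutive fragment starts so that the last tile ends at cluster_len.
    step = (cluster_len - fragment_len) / (n_fragments - 1)
    tiles = []
    for i in range(n_fragments):
        start = round(i * step)
        tiles.append((start, start + fragment_len))
    return tiles, fragment_len - step


# Idealized version of the design described above: 8 x 5 kb over 35 kb.
tiles, overlap = tiling_path(35_000, 8, 5_000)
for number, (start, end) in enumerate(tiles, start=1):
    print(f"fragment {number}: {start}-{end}")
print(f"mean overlap between neighbors: {overlap:.0f} bp")
```

Under these idealized assumptions the eight fragments total 40 kb, so the 5 kb of excess is shared across seven junctions, which is consistent with the overlap of a few hundred base pairs stated in the text.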
Amplification and sequencing of Flag-2 flagellin genes. We attempted to PCR
amplify the Flag-2 flagellin gene, lafA, from each Flag-2-positive strain. Primers
patterned on the sequences flanking the EAEC 042 lafA gene were used to
amplify 1-kb PCR products (Table 1). The amplicons were purified by using a
PCR purification kit (QIAGEN, Crawley, United Kingdom) and were sequenced
directly with nested primers.
Nucleotide sequence accession number. The sequence and annotation of the
Flag-2 cluster from E. coli 042 have been deposited in the EMBL database under
accession number CR753847.
RESULTS
The E. coli 042 genome contains a cluster of 44 genes that are predicted to encode a novel, second flagellar system. Initially, we confirmed that strain 042 possessed the same four conventional flagellar gene clusters as strain K-12 (data not shown). We then compared the fhiA-mbhA pseudogene cluster from the K-12 genome to the homologous region of the genome of strain 042. We were surprised to discover a large region in which there were differences between the two strains; in 042, intact versions of these genes represent the boundaries of a cluster of 44 genes that has apparently been deleted in the lineage leading to K-12 (i.e., 042, not K-12, represents the ancestral state). More surprising is the realization that this cluster apparently encodes an entire second flagellar system (Fig. 1A), which is referred to here as Flag-2 to distinguish it from the conventional peritrichous system, which we designated Flag-1.
BLAST searches of the nonredundant GenBank databases showed that E. coli 042 Flag-2 genes are more similar to lateral flagellar genes found in V. parahaemolyticus and Y. pestis than to the conventional Flag-1 genes found in other E. coli strains and E. coli 042 itself. V. parahaemolyticus has been shown to have a lateral flagellar system encoded by (at least) 37 genes in five operons arranged in two clusters, in addition to a polar flagellar system (50). E. coli 042 Flag-2 contains positional orthologs of all V. parahaemolyticus lateral flagellar genes except motY. In addition, there is a conserved operon structure, but there is one salient difference: the genes form a single cluster in E. coli 042 but form two well-separated clusters in V. parahaemolyticus (Fig. 1B).
Although it might be premature to assume that the Flag-2 flagellar system produces lateral flagella, we adopted a lateral flagellar nomenclature for Flag-2 that is easily comparable to that used for the well-established Flag-1 flagellar system and is compatible with the requirements of genome annotation (Table 2). Genes with fli, flg, and flh prefixes in the Flag-1 system have prefixes of lfi, lfg, and lfh, respectively, in Flag-2-like and lateral flagellar systems (e.g., LfiM is the Flag-2 or lateral flagellar homolog of FliM). This nomenclature is consistent with that used for a previous V. parahaemolyticus GenBank submission (accession no. U51896), as well as a recent submission by Stewart et al. (accession no. AY225128). In the latter case a different nomenclature was used in the accompanying publication (genes were designated with reference to the polar flagellar nomenclature, so that fliM_L was equivalent to the polar flagellar gene fliM) (50), but this system is not easily adapted for use in standard sequence file formats. Importantly, the homologs of Flag-1 genes fliCDSTKLA and motAB are designated lafABCDEFSTU, which is consistent with the first description of lateral flagellar genes (38) (accession no. gb:U52957) and subsequent descriptions of Aeromonas lateral flagellar clusters (32). Additional genes that were found to be associated with Flag-2 loci but not with Flag-1 loci are referred to as laf genes, as with the regulatory protein LafK encoded by the V. parahaemolyticus Flag-2 system (50).
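
The naming rules in this paragraph can be expressed as a small lookup. The sketch below is only an illustration of the stated convention: the function name is hypothetical, and the mapping converts names without implying that every Flag-1 gene has a Flag-2 counterpart.

```python
# Special cases stated in the text: the homologs of fliCDSTKLA and motAB
# are designated lafABCDEFSTU, in that order.
SPECIAL_CASES = dict(zip(
    ["fliC", "fliD", "fliS", "fliT", "fliK", "fliL", "fliA", "motA", "motB"],
    ["lafA", "lafB", "lafC", "lafD", "lafE", "lafF", "lafS", "lafT", "lafU"],
))

# General rule: fli -> lfi, flg -> lfg, flh -> lfh.
PREFIX_MAP = {"fli": "lfi", "flg": "lfg", "flh": "lfh"}


def flag1_to_flag2(gene: str) -> str:
    """Convert a Flag-1 gene name to the proposed Flag-2 (lateral) name."""
    if gene in SPECIAL_CASES:
        return SPECIAL_CASES[gene]
    prefix, suffix = gene[:3], gene[3:]
    if prefix in PREFIX_MAP and suffix:
        return PREFIX_MAP[prefix] + suffix
    raise ValueError(f"no Flag-2 naming rule for {gene!r}")


for name in ("fliM", "flgC", "flhA", "fliC", "motB"):
    print(name, "->", flag1_to_flag2(name))
```

The special cases are checked first because fliC, fliD, and the other genes in that list would otherwise be caught by the general prefix rule.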
Flag-2 is not present in other Escherichia/Shigella and Salmonella strains whose genomes have been sequenced. We were interested to see if there was any evidence of Flag-2 in other strains of Escherichia/Shigella and S. enterica. In E. coli K-12 strain MG1655 fhiA and mbhA are positioned between yafM and dinP, and they exhibit 95% nucleotide identity with lfhA and lafU of E. coli 042 Flag-2, respectively, suggesting that they are in fact remnants of an ancestral Flag-2 locus (Fig. 1B). We propose that fhiA and mbhA should be renamed lfhA and lafU and reannotated as pseudogenes in recognition of this relationship.
Both K-12 strain MG1655 lfhA and lafU appear to have been truncated such that the first 391 and 189 nucleotides of E. coli 042 lfhA and lafU, respectively, do not match sequences anywhere in the K-12 genome. This places the predicted site of deletion of the Flag-2 cluster at nucleotide 250060 of the K-12 strain MG1655 genome. This point of deletion is also found in all other available Escherichia/Shigella genome sequences; more specifically, it occurs in E. coli K-12 strain W3110, enteropathogenic E. coli strain E2348/69, strain DH10B, O157:H7 strains EDL933 and RIMD 0509952 (Sakai), UPEC strain CFT073, S. flexneri 2a strains 301 and 2457T, and S. sonnei 53G. Strain DH10B also has a frameshift mutation within lfhA. These results suggest that the entire Flag-2 locus was originally present in the last common ancestor of the species. It is not clear why the deletion occurred. However, the ends of the deletion correspond exactly to the 17- or 25-bp direct repeat GTNGATNNTCANCAGNNTNA(N)TAAA, which does not appear anywhere else in the E. coli 042 genome. It is possible that recombination between the repeats may have been responsible for the deletion.
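
A degenerate repeat of this kind can be located with a regular expression. The sketch below is a minimal illustration, not the procedure used in the study: it assumes that N stands for any base and that the parenthesized N marks an optional position (our reading of the notation), and it runs on an invented toy sequence rather than on genome data. The function names are hypothetical.

```python
import re

# Only the symbols that occur in the published motif are handled here.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def motif_to_regex(motif: str) -> re.Pattern:
    """Compile a degenerate motif; a parenthesized stretch is treated as optional."""
    parts = []
    i = 0
    while i < len(motif):
        if motif[i] == "(":
            close = motif.index(")", i)
            inner = "".join(IUPAC[base] for base in motif[i + 1:close])
            parts.append(f"(?:{inner})?")
            i = close + 1
        else:
            parts.append(IUPAC[motif[i]])
            i += 1
    return re.compile("".join(parts))


def find_motif(seq: str, motif: str):
    """Return (start, matched_text) for each non-overlapping match on the given strand."""
    return [(m.start(), m.group()) for m in motif_to_regex(motif).finditer(seq.upper())]


MOTIF = "GTNGATNNTCANCAGNNTNA(N)TAAA"

# Invented copies of the repeat, one with and one without the optional base.
COPY_WITH_N = "GTAGATCGTCATCAGAATCAGTAAA"
COPY_WITHOUT_N = "GTCGATAATCAGCAGTTTGATAAA"
TOY = "CCCC" + COPY_WITH_N + "ACGTACGTAC" + COPY_WITHOUT_N + "GGGG"

for start, text in find_motif(TOY, MOTIF):
    print(start, text)
```

A genome-wide search would also need to scan the reverse complement strand, which this sketch omits for brevity.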
All 11 S. enterica genomes surveyed lack counterparts of the lfhA and lafU genes (Fig. 1B). The S. enterica genomes also lack yafL and yafM, but they do contain full-length divergently transcribed dinP and yafK genes (Fig. 1B). In contrast to the situation in sequenced Escherichia/Shigella strains, this genetic arrangement suggests that S. enterica probably never possessed a Flag-2 locus. The hypothesis that Escherichia/Shigella acquired Flag-2 by lateral gene transfer after divergence from S. enterica is supported by the observation that yafM encodes a protein with significant similarity to transposases and inactivated derivatives (data not shown).
The E. coli 042 Flag-2 cluster is very similar to the V. parahaemolyticus lateral flagellar system. The majority of E. coli 042 Flag-2 protein sequences exhibit 25 to 58% amino acid identity with the orthologous proteins in the V. parahaemolyticus lateral flagellar system (36, 50), and the average level of identity is around 40%. In contrast, LfgA, LfgM, and LfgN (the P-ring addition, anti-σ28, and chaperone proteins, respectively) are more divergent, exhibiting <25% identity. FlgM and FlgN show similar high variability and compositional bias across the full range of flagellar diversity. LfiM and LfiJ (export and assembly proteins) and LafD and LafF (a chaperone protein and a protein with an unknown function, respectively) were not detected in TBLASTN searches with V. parahaemolyticus proteins against the E. coli 042 Flag-2 region, but positional orthologs were identified by using PSI-BLAST.
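
For readers unfamiliar with how identity values of this kind are derived, the sketch below computes percent identity from a pair of pre-aligned sequences. The text does not state which denominator was used, so counting only columns without gaps is an assumption here, and the two short sequences are invented for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# Invented toy alignment (not Flag-2 proteins): 7 ungapped columns, 6 identical.
SEQ_A = "MKT-LLVAG"
SEQ_B = "MRTALL-AG"
print(f"{percent_identity(SEQ_A, SEQ_B):.1f}% identity")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which gives lower values when gaps are present.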
lfgC is a pseudogene in E. coli 042. In most cases, the GLIMMER-predicted CDSs in the E. coli 042 sequence closely matched the lengths of their counterparts in V. parahaemolyticus, although there are minor discrepancies in the predicted start codons for lfiG, lfgA, lfgB, lfgG, lfgL, and lafU (data not shown). However, the lfgC gene from 042 is over 40 codons shorter than its homologs in other systems. A TBLASTN search of the E. coli 042 Flag-2 region with V. parahaemolyticus FlgC indicated that there is a frameshift mutation in E. coli 042 lfgC. A run of three GC dinucleotide repeats is present upstream of the 042 lfgC open reading frame, but if one repeat is removed, a full-length lfgC gene is discernible, and it exhibits a high level of identity with V. parahaemolyticus lfgC over its entire length. The lfgC gene encodes a FlgC-like proximal rod protein and so is likely to be essential for the production of Flag-2 flagella by E. coli 042. In other words, and consistent with our inability to elicit swarming motility in E. coli 042 (data not shown), this frameshift probably inactivated the Flag-2 system in this strain.
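
The effect of gaining or losing one dinucleotide repeat on a reading frame can be shown with a toy example. The sequence below is invented for illustration and is not the lfgC sequence; the helper names are hypothetical. The standard genetic code is used, which is shared with the bacterial translation table apart from alternative start codons.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate from the first base until a stop codon or the end of the sequence."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def drop_one_repeat(dna: str, unit: str = "GC", copies: int = 3) -> str:
    """Replace the first run of `copies` repeat units with a run of `copies - 1`."""
    run = unit * copies
    if run not in dna:
        raise ValueError("repeat run not found")
    return dna.replace(run, unit * (copies - 1), 1)


# Invented toy gene: three GC repeats put the downstream codons out of frame.
TOY_FRAMESHIFTED = "ATGAAA" + "GCGCGC" + "TGAACGATTTTTAA"
TOY_REPAIRED = drop_one_repeat(TOY_FRAMESHIFTED)

print("three repeats:", translate(TOY_FRAMESHIFTED))
print("two repeats:  ", translate(TOY_REPAIRED))
```

In the toy, the version with three repeats reaches a stop codon early, whereas removing one repeat shifts the downstream bases back into frame and yields a longer product, which mirrors the reasoning applied to lfgC above.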
TABLE 2. Flag-2 nomenclature

| Gene | E. coli 042^a | E. coli K-12^b | V. parahaemolyticus^c | Y. pestis^d | A. hydrophila^e | Predicted function |
|---|---|---|---|---|---|---|
| lfhA | Ec042-0245 | fhiA | flhA_L | YPO0703 | | Export, assembly |
| lfhB | Ec042-0246 | | flhB_L | flhB | | Export, assembly |
| lfiR | Ec042-0247 | | fliR_L | fliR | | Export, assembly |
| lfiQ | Ec042-0248 | | fliQ_L | fliQ | | Export, assembly |
| lfiP | Ec042-0249 | | fliP_L | fliP | | Export, assembly |
| lfiN | Ec042-0250 | | fliN_L | fliN | | Switch (C ring) |
| lfiM | Ec042-0251 | | fliM_L | fliM | | Switch (C ring) |
| lafK | Ec042-0252 | | lafK | YPO0712 | | Regulatory |
| lfiE | Ec042-0253 | | fliE_L | fliE | | Basal body component |
| lfiF | Ec042-0254 | | fliF_L | fliF | | M ring |
| lfiG | Ec042-0255 | | fliG_L | fliG | | Switch (C ring) |
| lfiH | Ec042-0256 | | fliH_L | fliH | | Export, assembly |
| lfiI | Ec042-0257 | | fliI_L | fliI | | Export, assembly |
| lfiJ | Ec042-0258 | | fliJ_L | YPO0718 | | Export, assembly |
| | Ec042-0259 | | | | | Cytidylyltransferase |
| | Ec042-0260 | | | | | Glycosyl transferase |
| lafV | Ec042-0261 | | | | fliU-like | Lysine-N-methylase |
| lfgN | Ec042-0262 | | flgN_L | flgN | flgN | Chaperone |
| lfgM | Ec042-0263 | | flgM_L | flgM | flgM | Anti-σ28 |
| lfgA | Ec042-0264 | | flgA_L | flgA | flgA | P-ring addition |
| lfgB | Ec042-0265 | | flgB_L | flgB | flgB | Rod |
| lfgC | Ec042-0266 | | flgC_L | flgC | flgC | Rod |
| lfgD | Ec042-0267 | | flgD_L | flgD | flgD | Rod |
| lfgE | Ec042-0268 | | flgE_L | flgE | flgE | Hook |
| lfgF | Ec042-0269 | | flgF_L | flgF | flgF | Rod |
| lfgG | Ec042-0270 | | flgG_L | flgG | flgG | Rod |
| lfgH | Ec042-0271 | | flgH_L | flgH | flgH | L ring |
| lfgI | Ec042-0272 | | flgI_L | flgI | flgI | P ring |
| lfgJ | Ec042-0273 | | flgJ_L | flgJ | flgJ | Peptidoglycan hydrolase |
| lfgK | Ec042-0274 | | flgK_L | flgK | flgK | Hook-associated protein 1 |
| lfgL | Ec042-0275 | | flgL_L | flgL | flgL | Hook-associated protein 3 |
| lafW | Ec042-0276 | | VPA0275 | VPO0734 | | Possible hook-associated protein |
| | Ec042-0277 | | | | | Unknown (COG4683) |
| | Ec042-0278 | | | | | Regulator |
| lafZ | Ec042-0279 | | | VPO0736 | | Transmembrane regulator |
| lafA | Ec042-0280 | | lafA | flaA1 | lafA | Flagellin |
| lafB | Ec042-0281 | | lafB | fliD | lafB | Hook-associated protein 2 |
| lafC | Ec042-0282 | | lafC | fliS | lafC | Chaperone |
| lafD | Ec042-0283 | | lafD | YPO0742 | lafX | Chaperone |
| lafE | Ec042-0284 | | lafE | YPO0743 | lafE | Hook length control |
| lafF | Ec042-0285 | | lafF | YPO0744 | lafF | Unknown |
| lafS | Ec042-0286 | | lafS | fliA | lafS | σ28 |
| lafT | Ec042-0287 | | lafT | motA | lafT | H+ motor protein A |
| lafU | Ec042-0288 | mbhA | lafU | motB | lafU | H+ motor protein B |

^a E. coli 042 systematic gene names according to genome annotation (http://www.sanger.ac.uk/Projects/Escherichia_Shigella/).
^b E. coli K-12 strain MG1655 gene designations according to the completed genome sequence (2).
^c V. parahaemolyticus gene designations according to the completed genome sequence (2) and the nomenclature suggested by Stewart and McCarter (50).
^d Y. pestis CO92 gene designations according to the completed genome sequence (46).
^e A. hydrophila gene designations according to accession numbers AY028400 (41) and AY129558 (3). Note that the flg genes are also required for polar flagellar expression and are more closely related to E. coli Flag-1 flagellar genes than to the lfg genes found in E. coli 042, Y. pestis, and V. parahaemolyticus.

1434 REN ET AL. J. BACTERIOL.

Evidence that the Flag-2 system might be RpoN regulated.
V. parahaemolyticus contains an RpoN-dependent regulator,
LafK, that is required for the expression of lateral flagellar
early genes (50). RpoN has not previously been shown to be
required for E. coli flagellar systems, so we were interested to
see if there was any evidence that the Flag-2 system might be
regulated in this manner. As expected, the LafK homolog in
Flag-2 contains a full-length Pfam:Sigma54_activat domain
(9.2e-113). Furthermore, we identified consensus σ54 sites
(TGGCAC-N5-TTGC) upstream of both lfgB and lafB translation
start codons, as found in V. parahaemolyticus (50). Together,
these findings suggest that the Flag-2 system is RpoN
dependent.
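A search for the consensus quoted above (TGGCAC, any five nucleotides, then TTGC) can be sketched as a regular expression. This is a minimal illustration, not the authors' method: it scans only the strand supplied, requires an exact match to the consensus, and the function name and test sequence are invented for the example.

```python
import re

# Consensus as written in the text: TGGCAC-N5-TTGC (N = any nucleotide).
SIGMA54_SITE = re.compile(r"TGGCAC[ACGT]{5}TTGC")


def find_sigma54_sites(upstream_seq):
    """Return (0-based offset, matched sequence) for each exact consensus match."""
    seq = upstream_seq.upper()
    return [(m.start(), m.group()) for m in SIGMA54_SITE.finditer(seq)]


# Invented upstream fragment, used only to exercise the pattern.
print(find_sigma54_sites("aattTGGCACgatatTTGCatcg"))
```

A real promoter scan would normally tolerate mismatches and check both strands; exact matching is used here only because it maps directly onto the consensus string given in the text.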
In V. parahaemolyticus the central chemosensory elements
are shared by both polar and lateral flagellar systems (50).
Similarly, E. coli 042 Flag-2 does not encode the complement
of chemotaxis proteins normally found associated with
polar flagellar systems, so it is likely that it too shares
chemosensory functions with the Flag-1 flagellar system. Despite
these similarities, it appears that the regulation of Flag-2-like
flagellar gene expression may differ substantially in V.
parahaemolyticus and E. coli 042. In V. parahaemolyticus there is a
consensus σ54 site upstream of the lateral flagellar motY gene;
however, this gene is not present in E. coli 042 Flag-2, and no
consensus σ54 site was identified in this region. Interestingly, V.
parahaemolyticus LafK contains a CheY-like receiver domain
at its N terminus, whereas E. coli 042 LafK (and Y. pestis LafK
homologs) has no such domain and consequently is 100 amino
acids shorter.
The E. coli 042 Flag-2 locus contains nonflagellar genes.
Two additional CDSs are found between lfiJ and lfgN in the E.
coli 042 Flag-2 cluster that are not found in the V. parahaemo-
lyticus or Y. pestis genomes (Fig. 1B). The predicted coding
sequence immediately adjacent to lfiJ (Ec042-0259) encodes a
135-residue homolog of cytidylyltransferase (residues 5 to 130
match Pfam:CTP_transf_2; 1.00e-6). In BLASTP searches of
the nonredundant protein database of the National Center for
Biotechnology Information (nr) with Ec042-0259, known and
putative glycerol-3-phosphate cytidylyltransferases were found
with high significance (gi 46914301; 59% identity; 1e-39).
Ec042-0260 encodes an 823-residue protein with an N-terminal
glycosyl transferase domain (residues 44 to 217 match
Pfam:Glycos_transf_2; 1.40e-13) and a C-terminal
glycerolphosphotransferase domain (residues 647 to 835 match
Pfam:glyphos_transf; 4.60e-3). This domain organization is shared by more
than 20 other proteins, including the minor teichoic acid bio-
synthesis protein encoded by ggaB of Bacillus subtilis and the
teichoic acid biosynthesis protein encoded by tagF of Staphy-
lococcus epidermidis. Genes that are homologs of Ec042-0259
and Ec042-0260 are often found together as part of capsular
polysaccharide biosynthesis gene clusters. It is possible that
these genes may be responsible for posttranslational modifica-
tion of the flagellar proteins, such as the glycosylation demon-
strated to occur on the Aeromonas lateral flagella (20).
Two further predicted coding sequences with no counterparts
in the V. parahaemolyticus lateral flagellar gene system
are found between lfgL and lafA in the E. coli 042 Flag-2
cluster. Ec042-0277 encodes a 115-residue protein with simi-
larity to other bacterial proteins with an unknown function
(COG4683). Ec042-0278 encodes a 100-residue protein that
contains a helix-turn-helix domain (residues 32 to 86 match
Pfam:HTH_3; 5.20e-12) and exhibits high amino acid identity
(50%) with several other putative transcriptional regulators.
E. coli 042 Flag-2 contains three genes that are conserved in
other similar flagellar gene clusters. Situated between Ec042-
0260 and lfgN is a gene predicted to encode a 323-residue
protein with significant similarity as determined by BLASTP
analysis against nr to a hypothetical protein from Y. pestis KIM
(E value, 1e-4) and two S. enterica FliB proteins (E value,
2e-4). FliB is a lysine-N-methylase that is required for post-
translational methylation of lysine residues in the flagellin of S.
enterica but is not found in E. coli (12). The fliB gene is
normally found adjacent to fliA, and in some S. enterica strains
FliB has been found to be encoded by two adjacent genes
(fliUV) (17). Interestingly, there has been a report that
Aeromonas punctata has a fliU-like gene (gb AAK57643) in a
cluster with lafA1, lafA2, and lafB, although the lack of this gene
did not noticeably affect swarming or swimming motility (32).
No such homolog could be identified in the lateral flagellar
clusters of V. parahaemolyticus or Y. pestis, although a homolog
was found in the C. rodentium Flag-2 cluster (see below). We
predict that a FliB homolog is a novel component of some
Flag-2-like flagellar systems and propose the designation lafV
for the gene that encodes it.
Immediately downstream and in the same orientation as lfgL
is a gene predicted to encode a 325-residue protein with
significant similarity to hypothetical proteins encoded in
Flag-2-like flagellar gene clusters in Y. pestis, Y. pseudotuberculosis, C.
violaceum, and V. parahaemolyticus. Interestingly, the homolog
from V. parahaemolyticus is annotated as a putative flagellin. A
PSI-BLAST search with this sequence did indeed find flagellin
proteins, albeit with low significance. Therefore, this protein is
conserved in several Flag-2-like flagellar systems, and we pro-
pose the designation lafW for the gene that encodes it. Given
that FlgL and FliC are paralogous, the low-significance
matches to flagellin suggest that LafW may represent a novel
hook-associated protein, like that encoded by the adjacent lfgL
gene.
Upstream and divergent from lafA is a predicted coding
sequence for a 298-residue protein with an N-terminal
transcriptional regulator domain (residues 40 to 115;
Pfam:trans_reg_C; expect value, 8.10e-10) and a predicted
membrane-spanning region (residues 160 to 182). In BLASTP searches
against nr several putative transcriptional regulators were
found, along with several other regulators with known func-
tions (notably, Vibrio cholerae ToxR). Intriguingly, homologs
were found in syntenic locations in Y. pestis KIM and CO92, Y.
pseudotuberculosis, and C. rodentium. We propose that this
putative transmembrane transcriptional regulator could be in-
volved in Flag-2 gene expression, and we designated the cor-
responding gene lafZ.
Sequenced strains of Y. pestis each have a single Flag-2-like
flagellar gene cluster. Both the annotated Y. pestis genomes
(CO92 [46] and KIM [15]) and the completed but unpublished Y.
pestis biovar Mediaevalis strain 91001 genome (gb:NC_005810)
have predicted Flag-2-like flagellar gene clusters (Fig. 1B and
Table 2). In common with E. coli 042 and in contrast to V.
parahaemolyticus, all three Y. pestis genomes encode the Flag-2
system in a single locus. However, this locus is present at a
different chromosomal location than its equivalent in E. coli
042. In nearly all cases, the amino acid identity between Flag-2
VOL. 187, 2005 Flag-2 FLAGELLAR CLUSTER FROM E. COLI 1435
orthologs is 10% higher between Y. pestis and E. coli 042 than
between V. parahaemolyticus and E. coli 042, suggesting that
there is less evolutionary distance between the former Flag-2
clusters than between the latter clusters. Ec042-0259, Ec042-
0260, Ec042-0277, Ec042-0278, and lafV are not found in either
strain, but both KIM and CO92 have copies of lafW and lafZ in
the appropriate locations. Interestingly, Y. pestis CO92 has
three sequential copies of lafA that appear to have arisen via
gene duplication. None of the Y. pestis genomes appears to
have a functional Flag-2 cluster (Fig. 1B); all three contain a
frameshift mutation in lfhA that should truncate LfhA to 432
residues. KIM also has a 14.9-kb deletion that includes the
entire lafBCDEFSTU locus and several non-Flag-2 genes be-
tween lafA and a pair of transposase genes, y3440 and y3439;
CO92 has a small insertion of a few hundred nucleotides in the
middle of lfgF; and 91001 has a frameshift mutation in lfgL that
should lead to a severe truncation of LfgL.
Flag-2-like flagellar loci in C. violaceum, C. rodentium, and Y.
pseudotuberculosis. We were interested to see if Flag-2-like
flagellar genes could be identified in any additional bacterial
genome sequences. We identified Flag-2-like flagellar loci us-
ing two broad criteria: (i) higher sequence identity with the E.
coli 042 Flag-2 genes than with the E. coli K-12 Flag-1 genes
and (ii) a conserved genetic organization, with five operons
showing a conserved gene order arranged in one or two gene
clusters (in contrast to the typical four or five clusters encoding
Flag-1). In addition, we used two more focused criteria:
absence of an fliO homolog and presence of lafV, lafW, and lafZ
homologs.
We identified Flag-2-like flagellar loci in the genomes of C.
violaceum, C. rodentium, and Y. pseudotuberculosis. Each
genome also encoded a complete Flag-1-like system, and there
were some minor differences between the Flag-2 systems; the
most extreme of these differences was that the C. violaceum
system did not appear to be RpoN dependent. C. rodentium is
a close relative of E. coli and has been used as a model to study
type III secretion (reviewed in reference 33). Although the
genome sequence is not yet complete, the content and orga-
nization of the C. rodentium Flag-2 cluster are indistinguish-
able from those of the E. coli 042 cluster, and there are high
levels of nucleotide identity (80 to 90%) across the entire
cluster. Furthermore, the cluster is located in the same position
relative to the genomic backbone, suggesting that the Flag-2
cluster was acquired by a common ancestor prior to the diver-
gence of the Escherichia and Citrobacter clades. Interestingly,
the C. rodentium Flag-2 cluster lacks the inactivating frameshift
mutation in lfgC found in the 042 Flag-2 cluster.
Similarly, the Y. pseudotuberculosis Flag-2 cluster is very
similar to the Y. pestis cluster and occurs at the same chromo-
somal location, although this is not surprising as Y. pestis is a
recently derived clone of Y. pseudotuberculosis (1). Intriguingly,
Y. pseudotuberculosis possesses full-length copies of lfhA, lfgF,
and lfgL, suggesting that the Flag-2 system may be functional in
this organism. Also, curiously, the Y. enterocolitica 8081 ge-
nome does not appear to encode a Flag-2 system; this system
appears to have been lost due to a 100-kb deletion (data not
shown).
The Flag-2 cluster is present in around 20% of E. coli
strains. Next, we wished to determine the distribution of the
Flag-2 gene cluster among a larger collection of Escherichia
strains. PCR across the fhiA-mbhA boundary was positive for
58 strains (80%) from the well-characterized ECOR collec-
tion, showing that they all possessed the same two-gene scar
seen in K-12. Fifteen ECOR strains (20%) were negative in
this PCR (ECOR-1, -3, -4, -5, -12, -17, -24, -35, -36, -48, -49,
-50, -64, -65, and -67) (Table 3), suggesting that they might
harbor the full Flag-2 cluster at this site (Fig. 2A). A second
round of PCRs targeting pairs of genes at either end of the full
Flag-2 cluster provided complementary results (i.e., negative
for the 58 strains with the K-12 Flag-2 genotype and positive
for the 15 strains with the 042 Flag-2 genotype) (Fig. 2B).
These results were consistent with the hypothesis that the full
Flag-2 cluster was present in the last common ancestor of all E.
coli strains and, although it has been lost from most strains, it
has been retained in a sizable minority (around one-fifth) of E.
coli isolates. Curiously, similar short PCR surveys applied to
four Escherichia spp. other than E. coli (Escherichia blattae,
Escherichia fergusonii, Escherichia hermannii, and Escherichia
vulneris) showed that all four of them possessed the K-12-like
genotype (data not shown).

TABLE 3. E. coli strains from the ECOR collection that possess an apparently intact Flag-2 gene cluster^a

| Isolate | Group | O^b | H^b | Host | Locale | Clinical status^c | ETT2 genotype |
|---|---|---|---|---|---|---|---|
| ECOR-1 | A | ON | HN | Human (female, 19 yr) | Iowa | Healthy | Partial |
| ECOR-3 | A | O1 | NM | Dog | Massachusetts | Healthy | Partial |
| ECOR-4 | A | ON | HN | Human (female, 5 yr) | Iowa | Healthy | Absent |
| ECOR-5 | A | O79 | NM | Human (female, 56 yr) | Iowa | Healthy | Partial |
| ECOR-12 | A | O7 | H32 | Human (female) | Sweden | Healthy | Partial |
| ECOR-17 | A | O106 | NM | Pig | Indonesia | Healthy | Partial |
| ECOR-24 | A | O15 | NM | Human (female) | Sweden | Healthy | Partial |
| ECOR-35 | D | O1 | NM | Human (female, 36 yr) | Iowa | Healthy | Partial |
| ECOR-36 | D | O79 | H25 | Human (female, 20 yr) | Iowa | Healthy | Partial |
| ECOR-48 | D | ON | HM | Human (female) | Sweden | UTI (C) | Complete |
| ECOR-49 | D | O2 | NM | Human (female) | Sweden | Healthy | Complete |
| ECOR-50 | D | O2 | HN | Human (female) | Sweden | UTI (P) | Complete |
| ECOR-64 | B2 | O75 | NM | Human (female) | Sweden | UTI (C) | Absent |
| ECOR-65^d | B2 | ON | H10 | Celebese ape | Washington | Healthy | Absent |
| ECOR-67 | B1 | O4 | H43 | Goat | Indonesia | Healthy | Partial |

^a The information is adapted from information on the ECOR website (http://foodsafe.msu.edu/Whittam/ecor/).
^b ON and HN, nontypeable with standard antisera; NM, nonmotile strain. HM indicates a form of nontypeable motile strain in which multiple H antisera reacted.
^c UTI, symptomatic urinary tract infection (C, acute cystitis; P, acute pyelonephritis).
^d Isolate from a zoo animal.
Next, we used tiling path PCR to obtain a complete tiling
path through the entire Flag-2 locus for the 15 ECOR strains
of interest plus strain 042. Eight pairs of long PCR primers
were used to survey the 35-kb cluster. Most PCRs were
positive for all 15 strains (Fig. 1C). Any negative results were
followed up by deletion-scanning PCRs with the primers flank-
ing the negative regions (Fig. 1C and Fig. 3). Surprisingly, in
contrast to our experience with the ETT2 gene cluster, we
could not detect any large-scale insertions, deletions, or rear-
rangements in any of the Flag-2 clusters from the 15 ECOR
strains compared to the 042 genotype (Fig. 1C).
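The layout of a tiling path can be sketched with simple arithmetic. The example below assumes a 35,000-bp region, eight equally sized amplicons, and a 1,000-bp overlap between neighbors; the overlap and the equal sizing are assumptions made for illustration, since the actual primer positions are not given in this section. The function name is hypothetical.

```python
def tiling_path(region_len, n_amplicons, overlap):
    """Return (start, end) coordinates of equally sized, overlapping amplicons.

    Coordinates are 0-based, end-exclusive, and together span [0, region_len).
    """
    total = region_len + (n_amplicons - 1) * overlap
    amplicon_len = -(-total // n_amplicons)  # ceiling division
    step = amplicon_len - overlap
    tiles = []
    for i in range(n_amplicons):
        start = i * step
        end = min(start + amplicon_len, region_len)  # clamp the last tile
        tiles.append((start, end))
    return tiles


for start, end in tiling_path(35000, 8, 1000):
    print(start, end, end - start)
```

Under these assumptions each amplicon is 5,250 bp long, which is why long-range PCR is needed. A deletion inside one tile would show up as a shorter or missing product for that primer pair, and this is the situation the follow-up deletion-scanning PCRs with flanking primers are designed to resolve.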
Lack of sequence polymorphism in the Flag-2 flagellin. As it
is known that there is considerable sequence polymorphism in
the flagellin genes associated with the Flag-1 system, we were
interested in determining whether similar variability occurred
in the Flag-2 lafA flagellin genes (52). Using primers patterned
on the genes flanking the E. coli 042 lafA sequence (Table 1),
we successfully amplified and sequenced the central two-thirds
of the lafA genes from 11 of the 15 Flag-2-positive ECOR
strains (ECOR-4, -17, -24, -35, -36, -48, -49, -64, -65, and -67).
The ECOR lafA sequences all exhibited 95% nucleotide
identity. Three sequences each had a single nonsynonymous
substitution, although each of the substitutions was a conser-
vative amino acid change, whereas the remaining nucleotide
differences were silent.
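Nucleotide identity between two aligned lafA fragments can be computed as the fraction of matching columns. The sketch below assumes the sequences have already been aligned to equal length, with "-" marking gaps, and it skips gapped columns; that gap treatment is one common convention, not necessarily the one used in the paper. The sequences in the example are invented.

```python
def percent_identity(seq_a, seq_b):
    """Percent identical positions between two pre-aligned, equal-length sequences.

    Columns with a gap ("-") in either sequence are excluded from the count.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Invented 10-nucleotide fragments differing at one position.
print(percent_identity("ATGGCTCAAG", "ATGGCACAAG"))
```

Deciding whether a given difference is silent or nonsynonymous additionally requires translating both sequences in the correct reading frame, which is not shown here.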
DISCUSSION
The discovery of the Flag-2 gene cluster in E. coli resulted in
several surprises. Given the status of E. coli as a model organ-
ism, it is remarkable to discover a new flagellar gene cluster in
this species, especially one associated with a novel, self-con-
tained, but previously unsuspected flagellar system that is
probably RpoN dependent. Also surprising is the realization
that this system appears to have been present in the ancestor of
all E. coli cells.
These observations are relevant to the study of model or-
ganisms; they cast doubt on the wisdom of ever considering a
single strain, such as K-12, as the archetype for a whole species.
Furthermore, they emphasize the need to adopt a historical
and comparative viewpoint when genomes are annotated, so
that genes such as fhiA and mbhA that represent remnants of
larger gene clusters can be recognized as such and appropri-
ately annotated as pseudogenes.
Several points spring to mind about the evolution of this
gene cluster. The locus occurs at the same location in the E.
coli and C. rodentium genomes, suggesting that it was present
in the ancestor of both species. However, its absence from
Salmonella suggests that it was acquired after these two species
diverged from Salmonella. This, combined with the presence of
a Flag-2-like locus in Yersinia at an entirely different site in the
genome, suggests that the cluster was acquired independently
at least twice by lateral gene transfer. This suggestion is sup-
ported by a striking property of the cluster: unlike the Flag-1
system, all the components for the Flag-2 system appear to be
encoded in a single large gene cluster, so that the entire self-
contained Flag-2 flagellar system could be acquired in a single
step. However, the mechanism of lateral gene transfer remains
unclear, although the similarities between YafM, encoded by a
gene at one end of the cluster, and transposases might provide
a clue.

FIG. 2. PCR scanning of all 72 ECOR strains plus E. coli RS218,
EAEC strain 042, E. coli CFT073, ETEC strain H10407, EAEC strain
25, E. coli K-12, and E. coli O157:H7 strain Sakai. PCR mixtures were
loaded on a 1.0% agarose gel with HyperLadder I MW markers (Bioline,
London, United Kingdom). Lanes corresponding to Flag-2-positive
strains are labeled according to ECOR strain number or pathotype.
(A) PCR scanning with primer fhiA-mbhA. A 600-bp PCR
product indicates that fhiA and mbhA are fused, as in E. coli K-12.
Negative results suggest the presence of intervening sequence between
fhiA and mbhA. (B) PCR scanning with primer fhiA-flanking. A
1,000-bp PCR product indicates the presence of a full-length lfhA
gene, as in E. coli 042.

FIG. 3. Deletion-scanning PCR of all 15 ECOR strains for the
presence of the cluster. Lanes M1 and M2 contained a high-molecular-weight
DNA marker (Invitrogen); the positions of 15-, 20-, and 25-kb
markers are indicated on the right. The ECOR strain numbers (see
text and Table 3) are indicated at the top.
Curiously, the majority of E. coli strains lack almost all
Flag-2 genes and possess an identical fusion (fhiA-mbhA) be-
tween the remnants of genes (lfhA and lafU) from either end of
the cluster. This might suggest that the cluster was deleted
once, in the ancestor of all such strains. However, a finding that
does not support this idea is the lack of congruence between
the distribution of intact or deleted Flag-2 clusters and the
accepted phylogenetic structure of E. coli, as defined by mul-
tilocus enzyme electrophoresis (25) (i.e., division into the A,
B1, B2, and D clades). Intact Flag-2 clusters are more common
in the A clade but are nonetheless scattered throughout all
four subdivisions, scuppering any attempt to link the Flag-2
genotype with the lines of phylogenetic descent. One possible
explanation is that recombination has occurred between strains
at this locus, purging most of them of the Flag-2 cluster and
overwriting any phylogenetic signal.
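The scatter across phylogenetic groups can be checked directly against Table 3. The short tally below transcribes the group assignment of each Flag-2-positive ECOR strain from that table; the variable and function names are hypothetical.

```python
from collections import Counter

# Phylogenetic group of each Flag-2-positive ECOR strain, transcribed from Table 3.
FLAG2_POSITIVE_GROUPS = {
    1: "A", 3: "A", 4: "A", 5: "A", 12: "A", 17: "A", 24: "A",
    35: "D", 36: "D", 48: "D", 49: "D", 50: "D",
    64: "B2", 65: "B2",
    67: "B1",
}


def group_counts(strain_groups):
    """Count Flag-2-positive strains per phylogenetic group."""
    return Counter(strain_groups.values())


counts = group_counts(FLAG2_POSITIVE_GROUPS)
for group, n in counts.most_common():
    print(group, n)
```

The tally gives raw counts only. Judging whether any group is enriched would also need the number of ECOR strains in each group, which is not listed in this section.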
So far, we have not detected a phenotype in enteroaggrega-
tive E. coli strain 042, whose genome has been sequenced, that
could be ascribed to the Flag-2 cluster; for example, this strain
did not show swarming behavior in our hands (data not
shown). This is not too surprising, as we discovered a frame-
shift in one important gene, lfgC, that was almost certain to
have inactivated the system. However, we did not detect any
other inactivating mutations in the Flag-2 cluster in this strain,
suggesting that the Flag-2 system was active in the recent past.
Indeed, we cannot rule out adaptation to the laboratory envi-
ronment as the cause of loss of Flag-2 function in this strain.
The fact that the Flag-2 system has been inactivated in all
Escherichia/Shigella strains whose genomes have been se-
quenced, whether through a large deletion or, in the case of
042, through a frameshift mutation, provoked comparison with
other pathogens that have lost motility functions (e.g., loss of
Flag-1 function in Shigella, loss of both Flag-1 and Flag-2-like
systems in Y. pestis, and loss of flagellar motility in Bordetella
pertussis) (2, 45, 46). It is tempting to speculate that similar
immune selective pressures provide a common explanation,
given that flagellin is so highly visible to the innate immune
system through its interactions with Toll-like receptor 5 (23).
However, the frequent loss of the Flag-2 system may simply
reflect the energy costs of producing a large multiprotein or-
ganelle in niches where it provides no selective advantage.
Although we have yet to find a strain with an active Flag-2
system, a number of pertinent structural and functional pre-
dictions can be made about the system upon scrutiny of the
gene cluster. By analogy with related systems in Vibrio and
Aeromonas, one could anticipate some distinctive features that
distinguish Flag-2 from the conventional Flag-1 system; for
example, its smaller flagellin could assemble into filaments
thinner than conventional flagella (7, 32). One could also ex-
pect the Flag-2 system to mediate swarming motility and to be
activated under high-viscosity conditions (5, 32). Other roles
might include biofilm formation, cell-cell linkage, surface col-
onization, and adhesion to and invasion of eukaryotic cells.
One might even anticipate a role in gut colonization and/or
virulence.
The distribution of the Flag-2 filaments is likely to differ
from the peritrichous distribution seen with the Flag-1 system.
All previously characterized Flag-2-like systems are lateral
flagellar systems (7, 32). However, all known lateral systems
occur in association with polar flagella rather than peritrichous
flagella, which are characteristic of Flag-1; thus, given this
discrepancy, it is perhaps premature to assume that the Flag-2
system in E. coli necessarily produces flagella with a lateral
distribution. For this reason, we adopted the Flag-2 designa-
tion rather than simply calling this system the E. coli lateral
flagellar system (even though that is what it might turn out to
be). By analogy with the Vibrio lateral flagellar system, Flag-2
is likely to be proton driven (6). This suggests an intriguing
difference between the Flag-1–Flag-2 combination in E. coli
and the lateral-polar combination in Vibrio; in E. coli strains
that possess an intact Flag-2 system, two proton-driven flagel-
lar systems might coexist in the same cell, whereas in the
lateral-polar arrangement, one system is proton driven and the
other is driven by sodium ions (6).
Another striking feature of the Flag-2 system is the lack of
variability in the sequence of its flagellin, LafA; this distin-
guishes it from Flag-1, in which there are numerous antigeni-
cally distinct H types associated with sequence polymorphisms
in the surface-exposed D2 and D3 domains of the Flag-1 flagel-
lin, FliC (52). Indeed, it is interesting that identical Flag-2 lafA
sequences are distributed among ECOR strains with various H
types (Table 3). This hints at differences in the selective pres-
sures exerted on the two systems by the acquired immune
system or by other pressures driving flagellar diversity.
Analysis of the Flag-2 gene cluster allows several conclusions
to be drawn about the regulation of Flag-2 gene expression and
biosynthesis of the second flagellar system in E. coli. The most
remarkable inference, based on the presence of LafK and
some RpoN consensus-binding sites, is that this system is likely
to be RpoN dependent. The Flag-1 system appears to be in-
creasingly unusual among bacterial flagellar systems in its lack
of dependence on RpoN. If proven, the RpoN dependence of
Flag-2 would therefore provide a fundamental link between
flagellar biosynthesis in E. coli and many other RpoN-depen-
dent flagellar systems. Furthermore, it might enable molecular
dissection of the role of RpoN in flagellar gene expression in a
tractable host.
Another prediction is that like the lateral flagellar systems of
Vibrio and Aeromonas, the E. coli Flag-2 system utilizes its own
flagellar sigma-antisigma combination, encoded by homo-
logues of FliA (LafS) and FlgM (LfgM). How regulation of
Flag-2 gene expression is coupled to global gene regulation is
less clear. It is likely to be coordinately regulated with Flag-1
and may exploit the same chemotaxis apparatus (it appears to
have none of its own). However, it may well be independent of
FlhCD, which are high-level regulators of the Flag-1 system, as
these regulators are absent from species that contain func-
tional Flag-2-like lateral flagellar systems (data not shown).
The Flag-2 gene cluster encodes all the components of a
flagellar type III secretion system (Table 2). If the Flag-2 gene
cluster does indeed encode a functioning flagellar system in
some strains, this would increase the number of distinct type
III secretion systems in Escherichia/Shigella to five. It is now
well established that there is regulatory cross talk between
some of these systems (48). The discovery of the Flag-2 cluster
in 20% of E. coli strains increases the potential for this phe-
nomenon, particularly as some ECOR strains possess ETT2,
Flag-2, and Flag-1 genes (Table 3). Furthermore, given our
recent discovery that regulatory influences can outlive the de-
cay of structural genes in the ETT2 gene cluster (the “Cheshire
cat” effect), it is possible that the Flag-2 locus might exert
regulatory effects even in strains in which it is no longer capa-
ble of encoding a fully functional flagellar system (48, 56).
The fact that the Flag-2 gene cluster was not discovered in
the first 10 Escherichia/Shigella genome sequences obtained
emphasizes the importance of maintaining an energetic pro-
gram of genome sequencing in this taxonomic group. The hunt
is now on for a functional Flag-2 system and for any pheno-
types associated with it in E. coli strains. However, the pres-
ence of similar, potentially functional gene clusters in C. ro-
dentium and Y. pseudotuberculosis should also focus the
spotlight on motility in these bacteria. In addition, one might
anticipate fresh insights into Flag-2-like systems to emerge
from the soon-to-be-completed Aeromonas hydrophila ATCC
7966 genome sequence.
In conclusion, the obvious similarities between Flag-2 genes
and related genes in C. rodentium and Y. pseudotuberculosis
and with lateral flagellar systems in Aeromonas and Vibrio
illustrate a recent dictum (8), “...if we are willing to think in
terms of an idealized E. coli, we can include a great deal of
well-studied biology of closely related enteric bacteria,” while
the discovery of the Flag-2 locus in E. coli demonstrates how
much there is still to learn about motility in this intensively
studied model organism.
ACKNOWLEDGMENTS
M.J.P. thanks the BBSRC for funding tiling path PCR work through
project grant D13414 and for funding the coliBASE site through grant
EGA16107. S.A.B. thanks the MRC for supporting him with a Bioin-
formatics Fellowship. We thank the Wellcome Trust for funding the
sequencing of Escherichia/Shigella genome sequences.
M.J.P. thanks Arshad Khan for systems administration. We thank
the BBRP Sequencing Group at LLNL for making the unfinished Y.
pseudotuberculosis genome sequence data publicly available.
REFERENCES
1. Achtman, M., K. Zurth, G. Morelli, G. Torrea, A. Guiyoule, and E. Carniel.
1999. Yersinia pestis, the cause of plague, is a recently emerged clone of
Yersinia pseudotuberculosis. Proc. Natl. Acad. Sci. USA 96:14043–14048.
2. Al Mamun, A. A., A. Tominaga, and M. Enomoto. 1997. Cloning and char-
acterization of the region III flagellar operons of the four Shigella subgroups:
genetic defects that cause loss of flagella of Shigella boydii and Shigella
sonnei. J. Bacteriol. 179:4493–4500.
3. Altarriba, M., S. Merino, R. Gavin, R. Canals, A. Rabaan, J. G. Shaw, and
J. M. Tomas. 2003. A polar flagella operon (flg) of Aeromonas hydrophila
contains genes required for lateral flagella expression. Microb. Pathog. 34:
249–259.
4. Altschul, S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller,
and D. J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation
of protein database search programs. Nucleic Acids Res. 25:3389–3402.
5. Atsumi, T., Y. Maekawa, T. Yamada, I. Kawagishi, Y. Imae, and M. Homma.
1996. Effect of viscosity on swimming by the lateral and polar flagella of
Vibrio alginolyticus. J. Bacteriol. 178:5024–5026.
6. Atsumi, T., L. McCarter, and Y. Imae. 1992. Polar and lateral flagellar
motors of marine Vibrio are driven by different ion-motive forces. Nature
355:182–184.
7. Belas, M. R., and R. R. Colwell. 1982. Scanning electron microscope obser-
vation of the swarming phenomenon of Vibrio parahaemolyticus. J. Bacteriol.
150:956–959.
8. Bender, R. A. 1996. Variations on a theme by Escherichia, p. 4–9. In F. C.
Neidhardt, R. Curtiss III, J. L. Ingraham, E. C. C. Lin, K. B. Low, B.
Magasanik, W. S. Reznikoff, M. Riley, M. Schaechter, and H. E. Umbarger
(ed.), Escherichia coli and Salmonella: cellular and molecular biology, 2nd
ed., vol. 1. ASM Press, Washington, D.C.
9. Blattner, F. R., G. Plunkett III, C. A. Bloch, N. T. Perna, V. Burland, M.
Riley, J. Collado-Vides, J. D. Glasner, C. K. Rode, G. F. Mayhew, J. Gregor,
N. W. Davis, H. A. Kirkpatrick, M. A. Goeden, D. J. Rose, B. Mau, and Y.
Shao. 1997. The complete genome sequence of Escherichia coli K-12. Science
277:1453–1474.
10. Brazilian National Genome Project Consortium. 2003. The complete ge-
nome sequence of Chromobacterium violaceum reveals remarkable and ex-
ploitable bacterial adaptability. Proc. Natl. Acad. Sci. USA 100:11660–11665.
11. Brun, Y. V., and L. Shapiro. 1992. A temporally controlled sigma-factor is
required for polar morphogenesis and normal cell division in Caulobacter.
Genes Dev. 6:2395–2408.
12. Burnens, A. P., J. Stanley, R. Sack, P. Hunziker, I. Brodard, and J. Nicolet.
1997. The flagellin N-methylase gene fliB and an adjacent serovar-specific
IS200 element in Salmonella typhimurium. Microbiology 143:1539–1547.
13. Chaudhuri, R. R., A. M. Khan, and M. J. Pallen. 2004. coliBASE: an online
database for Escherichia coli, Shigella and Salmonella comparative genomics.
Nucleic Acids Res. 32(Database issue):D296-D299.
14. Delcher, A. L., D. Harmon, S. Kasif, O. White, and S. L. Salzberg. 1999.
Improved microbial gene identification with GLIMMER. Nucleic Acids Res.
27:4636–4641.
15. Deng, W., V. Burland, G. Plunkett III, A. Boutin, G. F. Mayhew, P. Liss, N. T.
Perna, D. J. Rose, B. Mau, S. Zhou, D. C. Schwartz, J. D. Fetherston, L. E.
Lindler, R. R. Brubaker, G. V. Plano, S. C. Straley, K. A. McDonough, M. L.
Nilles, J. S. Matson, F. R. Blattner, and R. D. Perry. 2002. Genome sequence
of Yersinia pestis KIM. J. Bacteriol. 184:4601–4611.
16. Deng, W., S. R. Liou, G. Plunkett III, G. F. Mayhew, D. J. Rose, V. Burland,
V. Kodoyianni, D. C. Schwartz, and F. R. Blattner. 2003. Comparative
genomics of Salmonella enterica serovar Typhi strains Ty2 and CT18. J.
Bacteriol. 185:2330–2337.
17. Doll, L., and G. Frankel. 1993. fliU and fliV: two flagellar genes essential for
biosynthesis of Salmonella and Escherichia coli flagella. J. Gen. Microbiol.
139:2415–2422.
18. Galtier, N., M. Gouy, and C. Gautier. 1996. SEAVIEW and PHYLO_WIN:
two graphic tools for sequence alignment and molecular phylogeny. Comput.
Appl. Biosci. 12:543–548.
19. Gavin, R., S. Merino, M. Altarriba, R. Canals, J. G. Shaw, and J. M. Tomas.
2003. Lateral flagella are required for increased cell adherence, invasion and
biofilm formation by Aeromonas spp. FEMS Microbiol. Lett. 224:77–83.
20. Gavin, R., A. A. Rabaan, S. Merino, J. M. Tomas, I. Gryllos, and J. G. Shaw.
2002. Lateral flagella of Aeromonas species are essential for epithelial cell
adherence and biofilm formation. Mol. Microbiol. 43:383–397.
21. Hacker, J., G. Blum-Oehler, B. Hochhut, and U. Dobrindt. 2003. The mo-
lecular basis of infectious diseases: pathogenicity islands and other mobile
genetic elements. A review. Acta Microbiol. Immunol. Hung. 50:321–330.
22. Hacker, J., and J. B. Kaper. 2000. Pathogenicity islands and the evolution of
microbes. Annu. Rev. Microbiol. 54:641–679.
23. Hayashi, F., K. D. Smith, A. Ozinsky, T. R. Hawn, E. C. Yi, D. R. Goodlett,
J. K. Eng, S. Akira, D. M. Underhill, and A. Aderem. 2001. The innate
immune response to bacterial flagellin is mediated by Toll-like receptor 5.
Nature 410:1099–1103.
24. Hayashi, T., K. Makino, M. Ohnishi, K. Kurokawa, K. Ishii, K. Yokoyama,
C. G. Han, E. Ohtsubo, K. Nakayama, T. Murata, M. Tanaka, T. Tobe, T.
Iida, H. Takami, T. Honda, C. Sasakawa, N. Ogasawara, T. Yasunaga, S.
Kuhara, T. Shiba, M. Hattori, and H. Shinagawa. 2001. Complete genome
sequence of enterohemorrhagic Escherichia coli O157:H7 and genomic com-
parison with a laboratory strain K-12. DNA Res. 8:11–22.
25. Herzer, P. J., S. Inouye, M. Inouye, and T. S. Whittam. 1990. Phylogenetic
distribution of branched RNA-linked multicopy single-stranded DNA
among natural isolates of Escherichia coli. J. Bacteriol. 172:6175–6181.
26. Jacob, F. 1988. The statue within: an autobiography. Basic Books, New York,
N.Y.
27. Jacobi, S., R. Schade, and K. Heuner. 2004. Characterization of the alternative
sigma factor σ54 and the transcriptional regulator FleQ of Legionella
pneumophila, which are both involved in the regulation cascade of flagellar
gene expression. J. Bacteriol. 186:2540–2547.
28. Jagannathan, A., C. Constantinidou, and C. W. Penn. 2001. Roles of rpoN,
fliA, and flgR in expression of flagella in Campylobacter jejuni. J. Bacteriol.
183:2937–2942.
29. Jin, Q., Z. Yuan, J. Xu, Y. Wang, Y. Shen, W. Lu, J. Wang, H. Liu, J. Yang,
F. Yang, X. Zhang, J. Zhang, G. Yang, H. Wu, D. Qu, J. Dong, L. Sun, Y.
Xue, A. Zhao, Y. Gao, J. Zhu, B. Kan, K. Ding, S. Chen, H. Cheng, Z. Yao,
B. He, R. Chen, D. Ma, B. Qiang, Y. Wen, Y. Hou, and J. Yu. 2002. Genome
sequence of Shigella flexneri 2a: insights into pathogenicity through compar-
ison with genomes of Escherichia coli K12 and O157. Nucleic Acids Res.
30:4432–4441.
30. Kawagishi, I., M. Nakada, N. Nishioka, and M. Homma. 1997. Cloning of a
Vibrio alginolyticus rpoN gene that is required for polar flagellar formation.
J. Bacteriol. 179:6851–6854.
31. Kirov, S. M. 2003. Bacteria that express lateral flagella enable dissection of
the multifunctional roles of flagella in pathogenesis. FEMS Microbiol. Lett.
224:151–159.
32. Kirov, S. M., B. C. Tassell, A. B. Semmler, L. A. O’Donovan, A. A. Rabaan,
and J. G. Shaw. 2002. Lateral flagella and swarming motility in Aeromonas
species. J. Bacteriol. 184:547–555.
33. Luperchio, S. A., and D. B. Schauer. 2001. Molecular pathogenesis of
Citrobacter rodentium and transmissible murine colonic hyperplasia. Mi-
crobes Infect. 3:333–340.
34. Macnab, R. M. 1996. Flagella and motility, p. 123–145. In F. C. Neidhardt,
R. Curtiss III, J. L. Ingraham, E. C. C. Lin, K. B. Low, B. Magasanik, W. S.
Reznikoff, M. Riley, M. Schaechter, and H. E. Umbarger (ed.), Escherichia
coli and Salmonella: cellular and molecular biology, 2nd ed., vol. 1. ASM
Press, Washington, D.C.
35. Macnab, R. M. 2003. How bacteria assemble flagella. Annu. Rev. Microbiol.
57:77–100.
36. Makino, K., K. Oshima, K. Kurokawa, K. Yokoyama, T. Uda, K. Tagomori,
Y. Iijima, M. Najima, M. Nakano, A. Yamashita, Y. Kubota, S. Kimura, T.
Yasunaga, T. Honda, H. Shinagawa, M. Hattori, and T. Iida. 2003. Genome
sequence of Vibrio parahaemolyticus: a pathogenic mechanism distinct from
that of V. cholerae. Lancet 361:743–749.
37. McCarter, L. L. 2004. Dual flagellar systems enable motility under different
circumstances. J. Mol. Microbiol. Biotechnol. 7:18–29.
38. McCarter, L. L., and M. E. Wright. 1993. Identification of genes encoding
components of the swarmer cell flagellar motor and propeller and a sigma
factor controlling differentiation of Vibrio parahaemolyticus. J. Bacteriol.
175:3361–3371.
39. McClelland, M., L. Florea, K. Sanderson, S. W. Clifton, J. Parkhill, C.
Churcher, G. Dougan, R. K. Wilson, and W. Miller. 2000. Comparison of the
Escherichia coli K-12 genome with sampled genomes of a Klebsiella pneu-
moniae and three Salmonella enterica serovars, Typhimurium, Typhi and
Paratyphi. Nucleic Acids Res. 28:4974–4986.
40. McClelland, M., K. E. Sanderson, J. Spieth, S. W. Clifton, P. Latreille, L.
Courtney, S. Porwollik, J. Ali, M. Dante, F. Du, S. Hou, D. Layman, S.
Leonard, C. Nguyen, K. Scott, A. Holmes, N. Grewal, E. Mulvaney, E. Ryan,
H. Sun, L. Florea, W. Miller, T. Stoneking, M. Nhan, R. Waterston, and
R. K. Wilson. 2001. Complete genome sequence of Salmonella enterica se-
rovar Typhimurium LT2. Nature 413:852–856.
41. Merino, S., R. Gavin, S. Vilches, J. G. Shaw, and J. M. Tomas. 2003. A
colonization factor (production of lateral flagella) of mesophilic Aeromonas
spp. is inactive in Aeromonas salmonicida strains. Appl. Environ. Microbiol.
69:663–667.
42. Ochman, H., and I. B. Jones. 2000. Evolutionary dynamics of full genome
content in Escherichia coli. EMBO J. 19:6637–6643.
43. Pallen, M. J., C. W. Penn, and R. R. Chaudhuri. Bacterial flagellar diversity
in the post-genomic era. Trends Microbiol, in press.
44. Parkhill, J., G. Dougan, K. D. James, N. R. Thomson, D. Pickard, J. Wain,
C. Churcher, K. L. Mungall, S. D. Bentley, M. T. Holden, M. Sebaihia, S.
Baker, D. Basham, K. Brooks, T. Chillingworth, P. Connerton, A. Cronin, P.
Davis, R. M. Davies, L. Dowd, N. White, J. Farrar, T. Feltwell, N. Hamlin,
A. Haque, T. T. Hien, S. Holroyd, K. Jagels, A. Krogh, T. S. Larsen, S.
Leather, S. Moule, P. O’Gaora, C. Parry, M. Quail, K. Rutherford, M.
Simmonds, J. Skelton, K. Stevens, S. Whitehead, and B. G. Barrell. 2001.
Complete genome sequence of a multiple drug resistant Salmonella enterica
serovar Typhi CT18. Nature 413:848–852.
45. Parkhill, J., M. Sebaihia, A. Preston, L. D. Murphy, N. Thomson, D. E.
Harris, M. T. Holden, C. M. Churcher, S. D. Bentley, K. L. Mungall, A. M.
Cerdeno-Tarraga, L. Temple, K. James, B. Harris, M. A. Quail, M. Acht-
man, R. Atkin, S. Baker, D. Basham, N. Bason, I. Cherevach, T. Chilling-
worth, M. Collins, A. Cronin, P. Davis, J. Doggett, T. Feltwell, A. Goble, N.
Hamlin, H. Hauser, S. Holroyd, K. Jagels, S. Leather, S. Moule, H. Norb-
erczak, S. O’Neil, D. Ormond, C. Price, E. Rabbinowitsch, S. Rutter, M.
Sanders, D. Saunders, K. Seeger, S. Sharp, M. Simmonds, J. Skelton, R.
Squares, S. Squares, K. Stevens, L. Unwin, S. Whitehead, B. G. Barrell, and
D. J. Maskell. 2003. Comparative analysis of the genome sequences of
Bordetella pertussis, Bordetella parapertussis and Bordetella bronchiseptica.
Nat. Genet. 35:32–40.
46. Parkhill, J., B. W. Wren, N. R. Thomson, R. W. Titball, M. T. Holden, M. B.
Prentice, M. Sebaihia, K. D. James, C. Churcher, K. L. Mungall, S. Baker,
D. Basham, S. D. Bentley, K. Brooks, A. M. Cerdeno-Tarraga, T. Chilling-
worth, A. Cronin, R. M. Davies, P. Davis, G. Dougan, T. Feltwell, N. Hamlin,
S. Holroyd, K. Jagels, A. V. Karlyshev, S. Leather, S. Moule, P. C. Oyston,
M. Quail, K. Rutherford, M. Simmonds, J. Skelton, K. Stevens, S. White-
head, and B. G. Barrell. 2001. Genome sequence of Yersinia pestis, the
causative agent of plague. Nature 413:523–527.
47. Perna, N. T., G. Plunkett III, V. Burland, B. Mau, J. D. Glasner, D. J. Rose,
G. F. Mayhew, P. S. Evans, J. Gregor, H. A. Kirkpatrick, G. Posfai, J.
Hackett, S. Klink, A. Boutin, Y. Shao, L. Miller, E. J. Grotbeck, N. W. Davis,
A. Lim, E. T. Dimalanta, K. D. Potamousis, J. Apodaca, T. S. Ananthara-
man, J. Lin, G. Yen, D. C. Schwartz, R. A. Welch, and F. R. Blattner. 2001.
Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature
409:529–533.
48. Ren, C. P., R. R. Chaudhuri, A. Fivian, C. M. Bailey, M. Antonio, W. M.
Barnes, and M. J. Pallen. 2004. The ETT2 gene cluster, encoding a second
type III secretion system from Escherichia coli, is present in the majority of
strains but has undergone widespread mutational attrition. J. Bacteriol.
186:3547–3560.
49. Rutherford, K., J. Parkhill, J. Crook, T. Horsnell, P. Rice, M. A. Rajan-
dream, and B. Barrell. 2000. Artemis: sequence visualization and annota-
tion. Bioinformatics 16:944–945.
50. Stewart, B. J., and L. L. McCarter. 2003. Lateral flagellar gene system of
Vibrio parahaemolyticus. J. Bacteriol. 185:4508–4518.
51. Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. CLUSTAL W:
improving the sensitivity of progressive multiple sequence alignment through
sequence weighting, position-specific gap penalties and weight matrix choice.
Nucleic Acids Res. 22:4673–4680.
52. Wang, L., D. Rothemund, H. Curd, and P. R. Reeves. 2003. Species-wide
variation in the Escherichia coli flagellin (H-antigen) gene. J. Bacteriol.
185:2936–2943.
53. Wei, J., M. B. Goldberg, V. Burland, M. M. Venkatesan, W. Deng, G.
Fournier, G. F. Mayhew, G. Plunkett III, D. J. Rose, A. Darling, B. Mau,
N. T. Perna, S. M. Payne, L. J. Runyen-Janecky, S. Zhou, D. C. Schwartz,
and F. R. Blattner. 2003. Complete genome sequence and comparative
genomics of Shigella flexneri serotype 2a strain 2457T. Infect. Immun. 71:
2775–2786.
54. Welch, R. A., V. Burland, G. Plunkett III, P. Redford, P. Roesch, D. Rasko,
E. L. Buckles, S. R. Liou, A. Boutin, J. Hackett, D. Stroud, G. F. Mayhew,
D. J. Rose, S. Zhou, D. C. Schwartz, N. T. Perna, H. L. Mobley, M. S.
Donnenberg, and F. R. Blattner. 2002. Extensive mosaic structure revealed
by the complete genome sequence of uropathogenic Escherichia coli. Proc.
Natl. Acad. Sci. USA 99:17020–17024.
55. Wolfe, A. J., D. S. Millikan, J. M. Campbell, and K. L. Visick. 2004. Vibrio
fischeri σ54 controls motility, biofilm formation, luminescence, and colonization.
Appl. Environ. Microbiol. 70:2520–2524.
56. Zhang, L., R. R. Chaudhuri, C. Constantinidou, J. L. Hobman, M. D. Patel,
A. C. Jones, D. Sarti, A. J. Roe, I. Vlisidou, R. K. Shaw, F. Falciani, M. P.
Stevens, D. L. Gally, S. Knutton, G. Frankel, C. W. Penn, and M. J. Pallen.
2004. Regulators encoded in the Escherichia coli type III secretion system 2
gene cluster influence expression of genes within the locus for enterocyte
effacement in enterohemorrhagic Escherichia coli O157:H7. Infect. Immun.
72:7282–7293.
... No comparable pattern was found in relation to the pathotypes. Almost all genomes lack a fragment present in the putatively intact ETT2 region of phylogroup D1 EAEC strain 042 (eivJICAEGF), which is located between two small direct repeats and thus often deleted [43]. ...
... The eprI gene was determined as the only significantly mastitis-associated and -enriched virulence-associated gene ( Table 2). This gene, which was also shown to be significantly associated with phylogroup A strains, belongs to the ETT2 determinant, a large gene cluster with frequent deletion isoforms in E. coli [43]. The ETT2 type III secretion system is contained on GI8 of E. coli 1303 and on GI9 of strain ECC-1470 (Additional file 10: Dataset S5 and Additional file 11: Dataset S6). ...
... An alternative flagellar system (Flag-2) is encoded on 1303 GI1 [43]. The Flag-2 locus encodes also for a type III secretion system in addition to the alternative flagellar system, which might be in cross-talk with ETT2. ...
Preprint
Background: Escherichia coli bovine mastitis is a disease of significant economic importance in the dairy industry. Molecular characterization of mastitis-associated E. coli (MAEC) did not result in the identification of common traits. Nevertheless, a mammary pathogenic E. coli (MPEC) pathotype has been proposed suggesting virulence traits that differentiate MAEC from commensal E. coli. The present study was designed to investigate the MPEC pathotype hypothesis by comparing the genomes of MAEC and commensal bovine E. coli. Results: We sequenced the genomes of eight E. coli isolated from bovine mastitis cases and six fecal commensal isolates from udder-healthy cows. We analyzed the phylogenetic history of bovine E. coli genomes by supplementing this strain panel with eleven bovine-associated E. coli from public databases. The majority of the isolates originate from phylogroups A and B1, but neither MAEC nor commensal strains could be unambiguously distinguished by phylogenetic lineage. The gene content of both MAEC and commensal strains is highly diverse and dominated by their phylogenetic background. Although individual strains carry some typical E. coli virulence-associated genes, no traits important for pathogenicity could be specifically attributed to MAEC. Instead, both commensal strains and MAEC have very few gene families enriched in either pathotype. Only the aerobactin siderophore gene cluster was enriched in commensal E. coli within our strain panel. Conclusions: This is the first characterization of a phylogenetically diverse strain panel including several MAEC and commensal isolates. With our comparative genomics approach we could not confirm previous studies that argue for a positive selection of specific traits enabling MAEC to elicit bovine mastitis. Instead, MAEC are facultative and opportunistic pathogens recruited from the highly diverse bovine gastrointestinal microbiota. Virulence-associated genes implicated in mastitis are a by-product of commensalism with the primary function to enhance fitness in the bovine gastrointestinal tract. Therefore, we put the definition of the MPEC pathotype into question and suggest to designate corresponding isolates as MAEC.
... These genes are organized in three genomic clusters, which are collectively termed the primary flagellar (flag-1) locus [3,12,13]. In addition to the flag-1 system, a second, evolutionary distinct, flagellar system, encoded by the flag-2 locus, was observed and shown to be relatively prevalent among members of the order Enterobacterales [14,15]. The flag-2 locus encodes a lateral flagellar system and has been postulated to facilitate swarming motility among the Enterobacterales, although it has been inactivated through gene deletions and transposon integration in a substantial proportion of enterobacterial taxa [14][15][16]. ...
... In addition to the flag-1 system, a second, evolutionary distinct, flagellar system, encoded by the flag-2 locus, was observed and shown to be relatively prevalent among members of the order Enterobacterales [14,15]. The flag-2 locus encodes a lateral flagellar system and has been postulated to facilitate swarming motility among the Enterobacterales, although it has been inactivated through gene deletions and transposon integration in a substantial proportion of enterobacterial taxa [14][15][16]. Here, by means of a comprehensive comparative genomic analysis, we identified three additional distinct flagellar loci flag-3, flag-4 and flag-5 which are distributed among the Enterobacterales. ...
... Flagellar motility, a common feature among most members of the Enterobacterales, was long considered to derive from a single chromosomally encoded peritrichous flagellar system (flag-1). However, a second, distinct lateral flagellar system (flag-2) was recently identified and shown to be fairly common across most family lineages within the order [14,15]. Here we have identified three additional distinct flagellar loci, flag-3 to flag-5, with discrete taxonomic distributions. ...
Article
Background: Flagellar motility is an efficient means of movement that allows bacteria to successfully colonize and compete with other microorganisms within their respective environments. The production and functioning of flagella is highly energy intensive and therefore flagellar motility is a tightly regulated process. Despite this, some bacteria have been observed to possess multiple flagellar systems which allow distinct forms of motility. Results: Comparative genomic analyses showed that, in addition to the previously identified primary peritrichous (flag-1) and secondary, lateral (flag-2) flagellar loci, three novel types of flagellar loci, varying in both gene content and gene order, are encoded on the genomes of members of the order Enterobacterales. The flag-3 and flag-4 loci encode predicted peritrichous flagellar systems while the flag-5 locus encodes a polar flagellum. In total, 798/4028 (~ 20%) of the studied taxa incorporate dual flagellar systems, while nineteen taxa incorporate three distinct flagellar loci. Phylogenetic analyses indicate the complex evolutionary histories of the flagellar systems among the Enterobacterales. Conclusions: Supernumerary flagellar loci are relatively common features across a broad taxonomic spectrum in the order Enterobacterales. Here, we report the occurrence of five (flag-1 to flag-5) flagellar loci on the genomes of enterobacterial taxa, as well as the occurrence of three flagellar systems in select members of the Enterobacterales. Considering the energetic burden of maintaining and operating multiple flagellar systems, they are likely to play a role in the ecological success of members of this family and we postulate on their potential biological functions.
... The γ-proteobacterial genus Plesiomonas phylogenetically branches from the base of the enterics yet retains native γ-proteobacterial and polar and lateral motors. Their presence in Plesiomonas suggests that an ancestral enteric γ-proteobacterium lost polar flagella and acquired the β-proteobacterial flagellum, although the order of these events cannot be inferred; some enterics retain lateral flagella (Ren et al., 2005), suggesting either selective loss, or reacquisition after species radiation. Plesiomonas is primarily aquatic yet also causes gastroenteritis, like many enterics, and whether Plesiomonas should be classified as an enteric (Enterobacteriaceae) remains controversial (Janda et al., 2016). ...
Article
The γ-proteobacteria are a group of diverse bacteria including pathogenic Escherichia, Salmonella, Vibrio, and Pseudomonas species. The majority swim in liquids with polar, sodium-driven flagella and swarm on surfaces with lateral, non-chemotactic flagella. Notable exceptions are the enteric Enterobacteriaceae such as Salmonella and E. coli. Many of the well-studied Enterobacteriaceae are gut bacteria that both swim and swarm with the same proton-driven peritrichous flagella. How different flagella evolved in closely related lineages, however, has remained unclear. Here, we describe our phylogenetic finding that Enterobacteriaceae flagella are not native polar or lateral γ-proteobacterial flagella but were horizontally acquired from an ancestral β-proteobacterium. Using electron cryo-tomography and subtomogram averaging, we confirmed that Enterobacteriaceae flagellar motors resemble contemporary β-proteobacterial motors and are distinct to the polar and lateral motors of other γ-proteobacteria. Structural comparisons support a model in which γ-proteobacterial motors have specialized, suggesting that acquisition of a β-proteobacterial flagellum may have been beneficial as a general-purpose motor suitable for adjusting to diverse conditions. This acquisition may have played a role in the development of the enteric lifestyle.
... The transport genes MexCD-OprJ and MexAB-OprM have also been implicated in biofilm formation (Gillis et al., 2005). Several other genes that were implicated in biofilm formation in non-ocular bacteria include genes involved in: (a) motility (cyaA and crp), (b) biosynthesis of the flagella (Chilcott and Hughes, 2000), (c) type 1 fimbriae (coded by the genes fimA to fimI), (d) the fim-like genes fhiA and ECs1554, (e) curli formation (csgA to csgG, cya, ihfB, lon, mlrA, nlpD, ompR, rpoS, and yieO) (Hancock and Klemm, 2007; Niba et al., 2007; Prigent-Combaret et al., 2000; Ren et al., 2005), (f) LPS synthesis (ipcA, gmhB, rfaD, rfaE, rfaF, rfaG, rfaH, and rfaP) (Hancock and Klemm, 2007; Niba et al., 2007), (g) biofilm architecture such as cellulose synthesis encoding genes (bcsABZC and bcsEFG genes) (Hancock and Klemm, 2007), (h) colanic acid synthesis (wcaL and wcaM) (Hancock and Klemm, 2007), (i) pathogenicity and virulence genes (rfaH, c2418 to c2440) (Hancock and Klemm, 2007) and (j) stress genes (cspG, cspH, pphA, ibpA, ibp, soxS, hha and yfiD) (Beloin et al., 2004; Hancock and Klemm, 2007; Ren et al., 2004; Schembri et al., 2003). Pathogenicity and virulence genes ECs3276 encoding a virulence protein populating the cluster O80301 (Baisa et al., 2013). ...
Article
Background: The review focuses on the bacteria associated with the human eye using the dual approach of detecting cultivable bacteria and the total microbiome using next generation sequencing. The purpose of this review was to highlight the connection between antimicrobial resistance and biofilm formation in ocular bacteria. Methods: PubMed was used as the source to catalogue culturable bacteria and ocular microbiomes associated with the normal eyes and those with ocular diseases, to ascertain the emergence of anti-microbial resistance with special reference to biofilm formation. Results: This review highlights the genetic strategies used by microorganisms to evade the lethal effects of anti-microbial agents by tracing the connections between candidate genes and biofilm formation. Conclusion: The eye has its own microbiome which needs to be extensively studied under different physiological conditions; data on eye microbiomes of people from different ethnicities, geographical regions etc. are also needed to understand how these microbiomes affect ocular health.
Article
Escherichia coli sequence type 131 (ST131) is a globally dominant multidrug-resistant clone, although its clinical impact on patients with bloodstream infection (BSI) is incompletely understood. This study aims to further define the risk factors, clinical outcomes, and bacterial genetics associated with ST131 BSI. A prospectively enrolled cohort study of adult inpatients with E. coli BSI was conducted from 2002 to 2015. Whole-genome sequencing was performed with the E. coli isolates. Of the 227 patients with E. coli BSI in this study, 88 (39%) were infected with ST131. Patients with E. coli ST131 BSI and those with non-ST131 BSI did not differ with respect to in-hospital mortality (17/82 [20%] versus 26/145 [18%]; P = 0.73). However, in patients with BSI from a urinary tract source, ST131 was associated with a numerically higher in-hospital mortality than patients with non-ST131 BSI (8/42 [19%] versus 4/63 [6%]; P = 0.06) and increased mortality in an adjusted analysis (odds ratio of 5.85; 95% confidence interval of 1.44 to 29.49; P = 0.02). Genomic analyses showed that ST131 isolates primarily had an H4:O25 serotype, had a higher number of prophages, and were associated with 11 flexible genomic islands as well as virulence genes involved in adhesion (papA, kpsM, yfcV, and iha), iron acquisition (iucC and iutA), and toxin production (usp and sat). In patients with E. coli BSI from a urinary tract source, ST131 was associated with increased mortality in an adjusted analysis and contained a distinct repertoire of genes influencing pathogenesis. These genes could contribute to the higher mortality observed in patients with ST131 BSI.
Chapter
Every year a significant number of people die due to drug-resistant infectious diseases. Several countries have put together efforts to overcome antimicrobial resistance so as to overcome the loss of human life and the accompanying economic burden. The human ocular surface has a paucibacterial microbiome, and several ocular bacteria have acquired resistance to different classes of antibiotics. This chapter is to understand antimicrobial resistance in ocular bacteria with reference to their identity, the drugs to which they are resistant to and strategies to overcome resistance.
Article
Ore mineral and host lithologies have been sampled with 89 oriented samples from 14 sites in the Naica District, northern Mexico. Magnetic parameters permit characterisation of the samples: saturation magnetization, density, low- and high-temperature magnetic susceptibility, remanence intensity, Koenigsberger ratio, Curie temperature and hysteresis parameters. Rock magnetic properties are controlled by variations in titanomagnetite content and hydrothermal alteration. Post-mineralization hydrothermal alteration seems to be the major event that affected the minerals and magnetic properties. Curie temperatures are characteristic of titanomagnetites or titanomaghemites. Hysteresis parameters indicate that most samples have pseudo-single domain (PSD) magnetic grains. Alternating field (AF) demagnetization and isothermal remanence (IRM) acquisition both indicate that natural and laboratory remanences are carried by MD-PSD spinels in the host rocks. The trend of NRM intensity vs susceptibility suggests that the carrier of remanent and induced magnetization is the same in all cases (spinels). The Koenigsberger ratios range from 0.05 to 34.04, indicating the presence of MD and PSD magnetic grains. Constraints on the geometry of the intrusive source body developed in the model of the magnetic anomaly are obtained by quantifying the relative contributions of induced and remanent magnetization components.
Article
The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic, and statistical refinements permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is described for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities.
Article
Chromobacterium violaceum is one of millions of species of free-living microorganisms that populate the soil and water in the extant areas of tropical biodiversity around the world. Its complete genome sequence reveals (i) extensive alternative pathways for energy generation, (ii) 500 ORFs for transport-related proteins, (iii) complex and extensive systems for stress adaptation and motility, and (iv) widespread utilization of quorum sensing for control of inducible systems, all of which underpin the versatility and adaptability of the organism. The genome also contains extensive but incomplete arrays of ORFs coding for proteins associated with mammalian pathogenicity, possibly involved in the occasional but often fatal cases of human C. violaceum infection. There is, in addition, a series of previously unknown but important enzymes and secondary metabolites including paraquat-inducible proteins, drug and heavy-metal-resistance proteins, multiple chitinases, and proteins for the detoxification of xenobiotics that may have biotechnological applications.
Article
Escherichia coli O157:H7 is a major food-borne infectious pathogen that causes diarrhea, hemorrhagic colitis, and hemolytic uremic syndrome. Here we report the complete chromosome sequence of an O157:H7 strain isolated from the Sakai outbreak, and the results of genomic comparison with a benign laboratory strain, K-12 MG1655. The chromosome is 5.5 Mb in size, 859 kb larger than that of K-12. We identified a 4.1-Mb sequence highly conserved between the two strains, which may represent the fundamental backbone of the E. coli chromosome. The remaining 1.4-Mb sequence comprises O157:H7-specific sequences, most of which are horizontally transferred foreign DNAs. The predominant role of bacteriophages in the emergence of O157:H7 is evident from the presence of 24 prophages and prophage-like elements that occupy more than half of the O157:H7-specific sequences. The O157:H7 chromosome encodes 1632 proteins and 20 tRNAs that are not present in K-12. Among these, at least 131 proteins are assumed to have virulence-related functions. Genome-wide codon usage analysis suggested that the O157:H7-specific tRNAs are involved in the efficient expression of the strain-specific genes. The complete set of genes specific to O157:H7 presented here sheds new light on the pathogenicity and physiology of O157:H7, and will open a way to fully understand the molecular mechanisms underlying O157:H7 infection.
Article
We have sequenced the genome of Shigella flexneri serotype 2a, the most prevalent species and serotype that causes bacillary dysentery or shigellosis in man. The whole genome is composed of a 4 607 203 bp chromosome and a 221 618 bp virulence plasmid, designated pCP301. While the plasmid shows minor divergence from that sequenced in serotype 5a, striking characteristics of the chromosome have been revealed. The S. flexneri chromosome has, astonishingly, 314 IS elements, more than 7‐fold the number possessed by its close relatives, the non‐pathogenic K-12 strain and the enterohemorrhagic O157:H7 strain of Escherichia coli. There are 13 translocations and inversions compared with the E. coli sequences, all of which involve a segment larger than 5 kb, and most are associated with deletions or acquired DNA sequences, of which several are likely to be bacteriophage‐transmitted pathogenicity islands. Furthermore, S. flexneri, resembling another human‐restricted enteric pathogen, Salmonella typhi, also has hundreds of pseudogenes compared with the E. coli strains. All of these could be subjected to investigations towards novel preventative and treatment strategies against shigellosis.