International Microbiology
https://doi.org/10.1007/s10123-023-00416-3
RESEARCH
Uropathogenic bacteria and deductive genomics towards antimicrobial resistance, virulence, and potential drug targets
Aaima Amin1,2 · Ramisha Noureen3 · Ayesha Iftikhar4 · Annam Hussain2 · Wadi B. Alonazi5 · Hafiz Muhammad Zeeshan Raza2 · Ifra Ferheen6 · Muhammad Ibrahim2
Received: 22 April 2023 / Revised: 15 July 2023 / Accepted: 3 August 2023
© The Author(s), under exclusive licence to Springer Nature Switzerland AG 2023
Abstract
Urinary tract infections (UTIs) are among the most prevalent bacterial infections affecting people in inpatient and outpatient
settings. The current study aimed to sequence the genome of uropathogenic Escherichia coli strain CUI-B1, isolated from
a woman with uncomplicated cystitis and pyelonephritis, followed by deductive genomics towards potential drug targets
using the uropathogenic strains E. coli CUI-B1, E. coli O25b:H4-ST131, Proteus mirabilis HI4320, Klebsiella pneumoniae 1721,
and Staphylococcus saprophyticus ATCC 15305. Comparative genome analysis revealed that
genes related to the survival of E. coli, P. mirabilis, K. pneumoniae, and S. saprophyticus, such as genes of metal-requiring
proteins, defense-associated genes, and genes associated with general physiology, were found to be highly conserved in the
genomes including strain CUI-B1. However, the genes responsible for virulence and drug resistance, mainly those that are
involved in bacterial secretion, fimbriae, adherence, and colonization, were found in various genomic regions and varied
from one species to another or within the same species. Based on the genome sequence, virulence, and antimicrobial resistance
gene dataset, the subtractive proteomics approach revealed 22 proteins mapped to the pathogens' unique pathways; among
them, entB, clbH, chuV, and ybtS were predicted to be potential drug targets, suggesting that a single drug could be used against all
the above-mentioned strains. These results may provide a foundation for selecting optimal targets in future drug discovery for
E. coli-, P. mirabilis-, K. pneumoniae-, and S. saprophyticus-based infections and could be investigated further for use
in personalized drug development.
Keywords Whole-genome sequencing · Virulence genes · Antibiotic resistance genes · Potential drug targets
Introduction
Urinary tract infections (UTIs) are among the most prevalent bacterial infections, affecting 150 million people annually around the globe, and mainly involve the bladder, urethra, ureters, and kidneys. UTIs can also be recurrent and lead to serious complications if left untreated (Klein and Hultgren 2020). UTIs are generally more frequent in women than in men, and risk factors include sexual activity, pregnancy, menopause, urinary tract abnormalities, catheterization, and certain immune-compromising medical conditions (Shaeriya et al. 2021). The improper use of antibiotics has significantly contributed to the rise of multi-drug-resistant (MDR) bacteria, making it increasingly difficult to treat UTIs and other bacterial infections (Hatt and Rather 2008).
UTIs can be caused by various types of bacteria. Some of the most studied include E. coli, accounting for about 80–85% of cases; E. faecalis, mostly in people with underlying health conditions; P. mirabilis, particularly in people with urinary tract abnormalities; K. pneumoniae, particularly in people with weakened immune systems; and S. saprophyticus, a common cause of UTIs in young women. Other less common bacteria that can cause UTIs include P. aeruginosa, Enterobacter spp., and S. marcescens (Flores-Mireles et al. 2015; Hatt and Rather 2008). In UTIs, bacteria can form biofilms on the walls of the bladder or on urinary catheters, which can lead to recurrent or persistent infections. Biofilms also provide a protective environment for bacteria, allowing them to evade the body's immune response and antibiotics. As a result, treating biofilm-associated UTIs can be challenging, and multiple rounds of antibiotics may be necessary to fully eliminate the infection (Stickler 2008; Pelling et al. 2019; Kahlmeter 2003).
* Muhammad Ibrahim
Ibrahim@cuisahiwal.edu.pk
1 Medical Department, Quaid e Azam Medical College, Bahawalpur, Pakistan
2 Department of Biosciences, COMSATS University Islamabad, Sahiwal Campus, Sahiwal, Pakistan
3 Medical Department, Fatima Jinnah Medical University, Lahore, Pakistan
4 Medical Department, Govt Khawaja Muhammad Safdar Medical College, Sialkot, Pakistan
5 Health Administration Department, College of Business Administration, King Saud University, Riyadh 11587, Saudi Arabia
6 Laboratory of Genetics of Microorganisms, School of Biosciences and Veterinary Medicine, University of Camerino, Camerino, Italy
Uropathogenic strains are distinguished by the expression of unique bacterial structures, properties, and products referred to as virulence factors, which facilitate the bacteria's invasion, colonization, and evasion of host defenses (Flores-Mireles et al. 2015; Stamm and Norrby 2001). These virulence factors include adhesins, such as type 1 fimbriae and P fimbriae, which allow bacteria to adhere to and invade bladder epithelial cells. Various environmental cues, such as the availability of nutrients and the presence of host signals, modulate the expression of these virulence factors, allowing bacteria to adapt to the urinary tract environment and cause infection (Subashchandrabose et al. 2014; Hannan et al. 2012; Kostakioti et al. 2012). Moreover, a significant association between virulence factors of pathogens and antimicrobial resistance has been reported (Cepas and Soto 2020).
The development of antibiotic-resistant bacteria among uropathogenic strains increases an already significant risk to global health. Therefore, knowledge of the genome sequences of uropathogenic strains, the presence of different virulence and drug resistance genes, potential drug targets, and niche adaptations can help in understanding the pathogenicity of UTIs and in treating patients properly. In addition to reducing the inappropriate use of antibiotics, applying subtractive proteomics techniques to shared proteins can suggest prospective therapeutic targets that may be further researched in the direction of personalized medicine.
Materials and methods
Sample collection
Midstream urine samples were collected from the women included in the study at Victoria Hospital Bahawalpur and Fatima Jinnah Medical University Lahore, Pakistan, based on the criteria described by Saraf et al. (2022) and with ethical approval. The samples were sent immediately to the microbiology laboratory, where they were cultured and identified within 2 h to prevent possible contamination or a decrease in the number of microbes.
Inclusion criteria Premenopausal, non-pregnant women (aged 41) with a prior diagnosis of acute uncomplicated cystitis and pyelonephritis were approached. The diagnosis was also verified using the European Association of Urology Section of Infection in Urology classification of UTIs, which is based on clinical presentation, risk factors, and a severity scale. Regardless of their urban or rural background, samples were collected exclusively from women belonging to a low socioeconomic group.
Exclusion criteria Women who were pregnant, menstruating, or menopausal, or who had complicated UTIs, underlying diseases, prior antibiotic therapy, or middle or upper socioeconomic status, were excluded from the study. The inclusion and exclusion criteria followed Smelov et al. (2016).
Isolation and identification of E. coli
Twenty samples were cultured on MacConkey agar (Oxoid, UK). Isolated E. coli colonies were obtained by selecting a single colony from the surface of the medium, re-streaking it on MacConkey agar plates, and incubating the plates at 37 °C overnight. Purified bacteria were examined by biochemical tests, including MR-VP, and by Gram staining (Kurbasic et al. 2010); lactose and indole tests were used to confirm the presence of E. coli.
DNA extraction
Genomic DNA (gDNA) was isolated from the strain selected on the basis of the biochemical tests using the DNeasy Blood & Tissue Kit (Qiagen), following the manufacturer's instructions. The Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific, USA) was used to measure the DNA concentration fluorometrically. Agarose gel electrophoresis and spectrophotometry were used to confirm the integrity and purity of the material, respectively. To determine the genetic diversity and evolutionary relationship between E. coli strain CUI-B1 and strains in publicly available databases (n = 1463), we conducted phylogenetic analyses using 16S rRNA with the twenty most similar strains based on BLAST search similarity (Feng et al. 2023). The tree derived from 16S rRNA has traditionally been used to define prokaryotic taxonomy. Genome sequencing was conducted as described below.
Library preparation
Following the manufacturer's instructions, the Nextera XT Kit (Illumina) was used to prepare the library from the gDNA isolated from the verified sample for genome sequencing. Briefly, 1 ng of DNA was fragmented and amplified by limited-cycle PCR (12 cycles) using Nextera XT barcodes. The PCR product was purified using Agencourt AMPure XP magnetic beads (Beckman Coulter). Each library was diluted to 4.0 nM, and the libraries were then pooled. Library quality was assessed by DNA quantification with Qubit and the Agilent Bioanalyzer. The pooled barcoded libraries were diluted to 12 pM. Whole-genome sequencing was carried out on an Illumina MiSeq (Macrogen, South Korea).
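The normalization step above can be illustrated with a short calculation. The sketch below assumes the common approximation of 660 g/mol per base pair of double-stranded DNA and uses hypothetical Qubit and Bioanalyzer readings; the function names are illustrative and are not part of the kit protocol.

```python
def library_molarity_nm(conc_ng_per_ul, mean_fragment_bp):
    """Convert a dsDNA library concentration (ng/uL) to nanomolar.

    Assumes an average mass of 660 g/mol per base pair of dsDNA.
    """
    return conc_ng_per_ul * 1e6 / (660.0 * mean_fragment_bp)


def dilution_volumes(stock_nm, target_nm, final_volume_ul):
    """Return (library_ul, diluent_ul) from C1 * V1 = C2 * V2."""
    if target_nm > stock_nm:
        raise ValueError("target concentration exceeds stock concentration")
    library_ul = target_nm * final_volume_ul / stock_nm
    return library_ul, final_volume_ul - library_ul


# Hypothetical readings: 2.0 ng/uL (Qubit) and 600 bp mean fragment size (Bioanalyzer).
stock = library_molarity_nm(2.0, 600.0)
lib_ul, diluent_ul = dilution_volumes(stock, target_nm=4.0, final_volume_ul=20.0)
print(f"stock = {stock:.2f} nM; mix {lib_ul:.2f} uL library with {diluent_ul:.2f} uL diluent")
```

The same C1V1 = C2V2 relation applies to the later dilution of the pooled libraries to 12 pM.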
Quality filtering, alignment, and de novo assembly
Illumina adaptors, final sequences, bases below Q20, and unnamed bases were removed from the FastQ files during the cleaning process using Cutadapt 1.4. The Q score and the A, T, G, and C fractions were determined using the Rqc v1.10.2 R package. Bowtie v2.3.3.1 was used to align the quality-filtered reads against E. coli O25b:H4-ST131 as the reference, and Picard Tools v2.15.0 was used to sort the reads by chromosome coordinates. Overlapping reads were merged into a single read using FLASH2 v2.2.00.21. De novo assembly was performed with Geneious Prime v3 to verify the detection of all genes absent from the previously identified E. coli strain.
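To make the Q20 criterion concrete, the following is a minimal sketch of 3'-end trimming on a single read, assuming Phred+33 quality encoding. It is a simplified trailing trim written for illustration; it is not the algorithm implemented in the trimming software used in the study, and the read shown is a toy example.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into integer Phred scores."""
    return [ord(ch) - offset for ch in quality]


def trim_3prime(seq, quality, cutoff=20):
    """Remove bases from the 3' end while they are below the cutoff or are N."""
    scores = phred_scores(quality)
    end = len(seq)
    while end > 0 and (scores[end - 1] < cutoff or seq[end - 1] in "Nn"):
        end -= 1
    return seq[:end], quality[:end]


# Toy read: in Phred+33, 'I' encodes Q40 and '#' encodes Q2.
read = "ACGTACGTNN"
qual = "IIIIIIII##"
trimmed_seq, trimmed_qual = trim_3prime(read, qual)
print(trimmed_seq)
```

A read whose bases all fall below the cutoff is reduced to an empty string and would normally be discarded.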
Comparative genomic analysis
The comparative genome analysis of E. coli strain CUI-B1 with representative UTI-associated E. coli, P. mirabilis, K. pneumoniae, and S. saprophyticus strains was conducted using the BRIG software package (Alikhan et al. 2011). The tool aligns several genomes to examine highly similar subsequences and evolutionary events such as inversions and rearrangements. Moreover, GC skew and GC content maps were generated using CGView Server V1.0 (Grant and Stothard 2008).
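The two quantities plotted in such maps can be computed per window as GC content = (G + C) / window length and GC skew = (G − C) / (G + C). The sketch below is an illustrative re-implementation on a toy sequence, not the code used by the visualization server; window and step sizes are arbitrary.

```python
def gc_stats(window):
    """Return (gc_content, gc_skew) for one sequence window."""
    window = window.upper()
    g, c = window.count("G"), window.count("C")
    gc_content = (g + c) / len(window) if window else 0.0
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    return gc_content, gc_skew


def sliding_gc(sequence, window, step):
    """Yield (start, gc_content, gc_skew) for each full window."""
    for start in range(0, len(sequence) - window + 1, step):
        content, skew = gc_stats(sequence[start:start + window])
        yield start, content, skew


toy = "GGGCATATGCCC"  # toy sequence, not genome data from the study
for start, content, skew in sliding_gc(toy, window=6, step=6):
    print(start, round(content, 2), round(skew, 2))
```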
Genome-wide gene prediction and analysis
To determine the virulence genes of the E. coli, P. mirabilis, K. pneumoniae, and S. saprophyticus genomes, we retrieved the FASTA gene sequences present in the Virulence Factor Database (VFDB). This VFDB version included more than 30,178 genes, encoding over 1800 unique virulence factors (VFs), across over 30 distinct genera of bacteria. Nucleotide-Nucleotide BLAST v2.7.1+ was used to match the whole VFDB gene set against the genomes of E. coli strain O25b:H4-ST131, E. coli strain CUI-B1, P. mirabilis strain HI4320, K. pneumoniae strain 1721, and S. saprophyticus strain ATCC 15305. We kept only the alignments that spanned 80% or more of the VFDB query gene. When multiple VFDB genes displayed similar alignments to an assembly, only the best-matching VFDB gene was taken into consideration, as described by Paniagua-Contreras et al. (2019). The 349 VFDB genes for which orthologous genes were found were manually grouped into functional categories (Table S1).
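The coverage filter and best-hit rule described above can be sketched as follows. The `Hit` record, the `locus` label for a genomic region, the query lengths, and all numbers are hypothetical and serve only to illustrate the logic; they are not output from the study.

```python
from collections import namedtuple

# Minimal stand-in for one BLAST alignment; "locus" is a hypothetical label
# for the genomic region that the query gene aligned to.
Hit = namedtuple("Hit", "query locus pident qstart qend bitscore")


def query_coverage(hit, query_lengths):
    """Fraction of the VFDB query gene spanned by the alignment."""
    return (abs(hit.qend - hit.qstart) + 1) / query_lengths[hit.query]


def filter_hits(hits, query_lengths, min_coverage=0.80):
    """Keep hits covering >= min_coverage of the query; where several queries
    align to the same locus, keep only the highest-scoring one."""
    best = {}
    for hit in hits:
        if query_coverage(hit, query_lengths) < min_coverage:
            continue
        current = best.get(hit.locus)
        if current is None or hit.bitscore > current.bitscore:
            best[hit.locus] = hit
    return best


# Toy data (hypothetical values).
lengths = {"fepE": 1000, "fepE_like": 1000, "sitC": 900}
hits = [
    Hit("fepE", "locus_1", 98.0, 1, 900, 1500.0),      # 90% coverage, best score
    Hit("fepE_like", "locus_1", 85.0, 1, 880, 900.0),  # 88% coverage, lower score
    Hit("sitC", "locus_2", 99.0, 1, 300, 500.0),       # ~33% coverage, dropped
]
kept = filter_hits(hits, lengths)
print({locus: hit.query for locus, hit in kept.items()})
```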
Subtractive proteomic approach for prediction of potential drug targets
To predict potential drug targets by subtractive proteomics, the common virulence proteins encoding key virulence features, such as iron uptake, adherence, toxins, and regulatory proteins, were processed further. The flow chart of the methodology is shown in Fig. 1. First, proteins homologous to human proteins were eliminated from the shared virulence proteins by BLASTp against the human proteome, as described by Ehsan et al. (2018). Only proteins with no hits against the human proteome were considered non-homologous and kept for further analysis. To assess their suitability as drug targets, the subcellular localization of the final set of non-homologous proteins was predicted using PSORTb (Yu et al. 2010). Cytoplasmic proteins were then screened for unique metabolic pathways, that is, pathways encoded by the bacterial species but not by humans, using the KAAS server at KEGG. KAAS (KEGG Automatic Annotation Server) provides functional annotation of genes by BLAST comparisons against the manually curated KEGG GENES database. The result contains KO (KEGG Orthology) assignments and automatically generated KEGG pathways (Moriya et al. 2007). Proteins belonging to pathways shared with the host were discarded. In the final screening step, the druggability of the proteins, that is, the probability of a protein binding a drug-like compound, was assessed. For this purpose, each candidate protein was aligned against DrugBank, and proteins matching a drug target of an FDA-approved, experimental, nutraceutical, or biotech drug were selected.
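The screening cascade described in this section amounts to a sequence of filters applied to the shared virulence proteins. The sketch below shows that logic on hypothetical proteins (`protA` to `protE`); the field names and the bit-score threshold parameter are assumptions for illustration, and the per-protein annotations would in practice come from the BLASTp, localization, pathway, and DrugBank searches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protein:
    name: str
    human_hit: bool            # any BLASTp hit against the human proteome
    localization: str          # predicted subcellular localization label
    unique_pathway: bool       # mapped to a pathway absent from the host
    drugbank_bitscore: float   # best bit score against DrugBank targets


def subtractive_filter(proteins, min_bitscore=100.0):
    """Apply the four screening steps in order; return survivors and per-step counts."""
    steps = [
        ("non-homologous", lambda p: not p.human_hit),
        ("cytoplasmic", lambda p: p.localization == "Cytoplasmic"),
        ("unique pathway", lambda p: p.unique_pathway),
        ("druggable", lambda p: p.drugbank_bitscore > min_bitscore),
    ]
    remaining = list(proteins)
    counts = {}
    for label, keep in steps:
        remaining = [p for p in remaining if keep(p)]
        counts[label] = len(remaining)
    return remaining, counts


# Hypothetical candidates, not proteins or results from the study.
candidates = [
    Protein("protA", False, "Cytoplasmic", True, 250.0),
    Protein("protB", True, "Cytoplasmic", True, 300.0),
    Protein("protC", False, "OuterMembrane", True, 180.0),
    Protein("protD", False, "Cytoplasmic", False, 220.0),
    Protein("protE", False, "Cytoplasmic", True, 60.0),
]
targets, counts = subtractive_filter(candidates)
print([p.name for p in targets], counts)
```

Recording the count after each step gives the kind of funnel summarized in the flow chart.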
Results and discussion
Isolation and identification of Escherichia coli
Given that urinary tract infections affect 150 million individuals throughout the globe every year (Flores-Mireles et al. 2015), all the strains employed in this study have clinical relevance for conditions like UTIs. Table 1 presents the 20 potential E. coli isolates causing UTIs; 17 (85%) tested positive for indole production and the methyl red test but negative for urease production and the Voges-Proskauer reaction. Following Holt et al. (1994), these isolates were confirmed to be E. coli, and a single randomly selected isolate was further processed for 16S rRNA sequencing, genome sequencing, and characterization.
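The 17 of 20 (85%) figure can be reproduced directly from the indole, methyl red, Voges-Proskauer, and urease columns of Table 1. The sketch below transcribes those four columns (the sugar fermentation columns are omitted) and applies the profile stated above; the dictionary layout is an illustrative choice.

```python
# Isolates that were negative for both indole and methyl red in Table 1.
INDOLE_MR_NEGATIVE = {"CUI-B6", "CUI-B12", "CUI-B18"}

isolates = {}
for i in range(1, 21):
    name = f"CUI-B{i}"
    positive = name not in INDOLE_MR_NEGATIVE
    isolates[name] = {
        "indole": positive,
        "methyl_red": positive,
        "voges_proskauer": False,  # negative for all 20 isolates
        "urease": False,           # negative for all 20 isolates
    }


def matches_ecoli_profile(tests):
    """Indole+, methyl red+, Voges-Proskauer-, urease-."""
    return (tests["indole"] and tests["methyl_red"]
            and not tests["voges_proskauer"] and not tests["urease"])


confirmed = [name for name, tests in isolates.items() if matches_ecoli_profile(tests)]
percent = 100 * len(confirmed) / len(isolates)
print(len(confirmed), percent)
```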
Evolutionary analysis of E. coli strain CUI-B1
The 16S rRNA tree revealed that E. coli isolate CUI-B1 appeared genetically closer to E. coli strain SE, E. coli strain E1, and E. coli strain E10 than to the other strains from the public database (Fig. 2). E. coli strain E1 has been identified as the causative agent of various diseases in poultry, such as yolk sac infection; E. coli strain E10 caused multi-bacterial airsacculitis (Feng et al. 2023); and E. coli strain SE was reported in a public database in association with udder health and neonatal disease.
Several genotypes derived from E. coli strain UTI89-2 form another cluster, also depicted in Fig. 2. Infections caused by E. coli transmitted through contaminated meat products are frequently associated with intestinal disorders, such as diarrhea. However, researchers have noted E. coli-related UTIs that seem to be connected to exposure to animals used in food production (Feng et al. 2023). The 16S rRNA gene is a highly conserved component of the translational apparatus in all DNA-based life forms, and horizontal gene transfer has minimal effect on it (Daubin et al. 2003). Nonetheless, several variations in certain variable regions of the 16S rRNA gene have been reported in the literature. On the basis of these variations and the phylogenetic analyses, strain CUI-B1 clustered in the same group as E. coli E1 and E. coli E10, while E. coli strain UTI89-2 belonged to a different cluster. Experimental studies have demonstrated the shared pathogenic potential of human uropathogenic and avian pathogenic E. coli, suggesting that these extraintestinal E. coli may be descended from the same bacterial lineages or share common evolutionary origins (Kathayat et al. 2021). Consistent detection of specific human UTI lineages in poultry or poultry products, but rarely in other meat products, supports the notion that poultry may serve as a reservoir for human UTIs.
Genomic features of UTI-associated strains
The overall genomic features of these strains are as follows. The draft genome sequence of E. coli strain CUI-B1, isolated in our study, has a genome size of 5.45 Mbp and comprises 12 contiguous DNA sequences with 6029 coding genes, 83 tRNAs, 23 rRNAs, and a GC content of 50.5%. Similarly, E. coli strain O25b:H4-ST131, used as the reference strain in this study, has a single chromosome with a total genomic length of 5.49 Mbp and encodes 5329 coding genes, 103 tRNAs, and 22 rRNAs, with a GC content of 50.5%. The genome of P. mirabilis strain HI4320, which is associated with the human intestinal tract, consists of a single chromosome of 4.06 Mbp along with one plasmid. Its genome encodes 3752 genes, 83 tRNAs, and 22 rRNAs, with a GC content of 38.9% (Table 2). The genome sequence of K. pneumoniae strain 1721 comprises three contiguous DNA sequences and a plasmid, collectively 5.46 Mbp. The genome encodes 5500 genes, 86 tRNAs, and 9 rRNAs, with a GC content of 57.29%. The draft genome sequence of S. saprophyticus strain ATCC 15305 consists of four contiguous DNA sequences and three plasmids, collectively 2.59 Mbp, encoding 2522 genes, 60 tRNAs, and 13 rRNAs, with a GC content of 33.1% (Table 2).
Fig. 1 Graphical presentation of the subtractive proteomics approach for potential drug targets
Table 1 Biochemical characterization of E. coli
Isolate No.  Indole test  Methyl red test  Voges-Proskauer test  Urease test  Acid from glucose  Acid from lactose
CUI-B1       +            +                −                     −            +                  +
CUI-B2       +            +                −                     −            +                  +
CUI-B3       +            +                −                     −            +                  +
CUI-B4       +            +                −                     −            +                  +
CUI-B5       +            +                −                     −            +                  +
CUI-B6       −            −                −                     −            +                  +
CUI-B7       +            +                −                     −            −                  −
CUI-B8       +            +                −                     −            +                  +
CUI-B9       +            +                −                     −            +                  +
CUI-B10      +            +                −                     −            +                  +
CUI-B11      +            +                −                     −            +                  +
CUI-B12      −            −                −                     −            +                  +
CUI-B13      +            +                −                     −            +                  +
CUI-B14      +            +                −                     −            +                  +
CUI-B15      +            +                −                     −            +                  +
CUI-B16      +            +                −                     −            +                  −
CUI-B17      +            +                −                     −            +                  +
CUI-B18      −            −                −                     −            +                  +
CUI-B19      +            +                −                     −            +                  +
CUI-B20      +            +                −                     −            +                  +
Fig. 2 16S rRNA gene sequence-based phylogenetic analysis of E. coli
Overall, the genomic features vary not only in genome size and GC content but also in the number of coding sequences. Most importantly, amino acid composition, which is shaped by GC content, is correlated with protein structural class, metabolic efficiency, and translation efficiency, so that protein function varies between species (Meng-Ze et al. 2018). These findings could help in understanding the epidemiology, diversity, pathogenicity, and drug resistance characteristics of these strains and in identifying potential anti-virulence drug targets (Table 3).
Comparative genome analysis
For a comprehensive genome-wide comparative analysis, sequence-based comparisons of the E. coli, K. pneumoniae, P. mirabilis, and S. saprophyticus strains used in this study were obtained using the RAST server and the BRIG genome-wide analysis tool. E. coli strain O25b:H4-ST131 was used as the reference for visualization of the genomic regions, which are shown in Fig. 3. The circular genomes of E. coli strain CUI-B1, K. pneumoniae strain 1721, P. mirabilis strain HI4320, and S. saprophyticus strain ATCC 15305 show the percentage GC content, coding sequences (CDS), open reading frames, and total number of RNAs (Fig. 3). The visualization of the entire genomes showed high similarity (>90%) between E. coli strain CUI-B1 and E. coli strain O25b:H4-ST131, while similarity with the rest of the genomes ranged from 40 to 70%, and numerous potential insertion/deletion sites were observed in the genomes of the E. coli, K. pneumoniae, P. mirabilis, and S. saprophyticus strains. Moreover, these strains encoded various hypothetical genes, consistent with previous genome-wide studies. In genomics, the comparison and characterization of functional elements depend on the identification of footprints of natural selection (Kim et al. 2012). Such studies add a new dimension to comparative genomics: a comparison of five UTI-associated genomes comprehensively highlights the differences in overall genome organization.
Identification of antimicrobial resistance genes
In the five genome assemblies, the Resistance Gene Identifier (RGI) predicted a wide distribution of antibiotic resistance genes among all strains. For instance, E. coli strain CUI-B1 was found to have 56 antibiotic resistance genes. These genes confer resistance to several classes of antibiotics, such as β-lactams, fluoroquinolones, aminoglycosides, sulfonamides, chloramphenicol, trimethoprim, and tetracycline. By comparison, the reference E. coli strain O25b:H4-ST131 has 57 different antibiotic resistance genes. Similarly, K. pneumoniae strain 1721 carries a certain number of such genes, while P. mirabilis strain HI4320 carries 14 and S. saprophyticus strain ATCC 15305 carries 10 antibiotic resistance genes (Table S1). Overall, each strain encodes at least six antibiotic resistance genes.
Table 2 Genomic features of strains associated with urinary tract infection
Feature            E. coli strain CUI-B1  E. coli O25b:H4-ST131  P. mirabilis strain HI4320  K. pneumoniae strain 1721  S. saprophyticus strain ATCC 15305
Origin             Human                  Human                  Human                       Human                      Human
Contigs No.        12                     1                      1                           3                          1
Genome size (Mb)   5.38                   5.40                   4.06                        5.46                       2.59
Coding genes       6029                   5329                   3752                        5500                       2522
G+C content (%)    50.4                   50.5                   38.9                        57.29                      33.1
tRNA No.           83                     103                    83                          86                         60
rRNA No.           23                     22                     22                          9                          13
Table 3 Druggability potential of the five druggable targets
Gene   DrugBank ID      DrugBank target                                                              Drug group                      Molecular weight (kDa)  Drug resistance
entB   DB01942/DB02793  Putative hydroxypyruvate isomerase YgbM/Phenazine biosynthesis protein PhzD  Experimental, Investigational   32.0254                 Ethambutol
clbH   DB01672          2,3-Dihydroxy-benzoic acid                                                   Experimental                    83.12                   Chloramphenicol
chuV   DB00171          ATP                                                                          Investigational, Nutraceutical  45.62                   Colistin
ybtS   DB01942          Putative hydroxypyruvate isomerase YgbM                                      Experimental, Investigational   32.02                   Metallo-β-lactamase
The genes with the highest prevalence among the strains were rsmA, kpnF, and CRP, the only three genes found in all strains from different genera. rsmA is a gene that regulates the virulence of P. aeruginosa; however, its negative effect on MexEF-OprN overexpression has been noted to confer resistance to various antibiotics (Pessi et al. 2001), and its homolog in E. coli is csrA. The kpnF gene, found in all strains except S. saprophyticus strain ATCC 15305 and previously described in K. pneumoniae, also confers high resistance to different antibiotics, antiseptics, and disinfectants (Srinivasan and Rajamohan 2013). C-reactive protein (CRP) is an acute-phase protein named for its ability to precipitate the S. pneumoniae C-polysaccharide. CRP is potentially a useful tool to enhance antibiotic stewardship (Cals and Ebell 2018). rsmA, kpnF, and CRP exhibited a stronger genetic relationship among strains that had profiles of genes encoding heavy metal efflux pumps, regardless of their origin.
The high incidence of genes that code for drug resistance alerts us to the necessity of rigorous surveillance programs to track resistance and the spread of these pathogens among various sources. Moreover, the antibiotic resistance gene profiles of the strains studied here illustrate the complexity of antimicrobial resistance (AMR) transmission and mobility. On the other hand, when resistance determinants were examined by species, E. coli had much higher odds of carrying them than the other strains, which offers further hints regarding possible initial sources and the trafficking of specific resistance genes. In short, the composition and distribution of antibiotic resistance genes is one of the biggest obstacles to effectively treating UTIs in the clinic.
Fig. 3 Comparative genome visualization of uropathogenic bacteria
Detection of virulence genes and analysis
Using the complete dataset of virulence genes retrieved from the VFDB with various keywords (Table S2), BLAST analysis was performed against each genome assembly and identified 85 to 208 virulence genes in the five genome assemblies. E. coli strain O25b:H4-ST131 encodes 208 virulence genes, E. coli strain CUI-B1 encodes 205, P. mirabilis strain HI4320 202, K. pneumoniae strain 1721 152, and S. saprophyticus strain ATCC 15305 85, distributed across various genomic regions, while 44 of them were shared by all the UTI-associated strains (Table S3). The most abundant among these shared genes were those linked to iron uptake, such as the ferric enterobactin transport protein FepE, the iron/manganese transport gene SitC, the enterobactin exporter, and the salicylate synthase Irp9. Iron is a vital micronutrient for almost all living cells. Both invasive pathogens and human cells, including those of the immune system, need iron to maintain their function, proliferation, and metabolism. On the one hand, the pathogenicity of most human pathogens is associated with microbial iron uptake (Nairz and Weiss 2020). On the other hand, the sequestration of iron from bacteria and other pathogens is a successful host defense tactic that follows the principles of nutritional immunity (Nairz and Weiss 2020).
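Identifying the genes shared by all strains is a set intersection over the per-strain gene lists. The sketch below uses placeholder strain labels and a toy presence/absence pattern; the gene names are taken from the text, but their distribution here is invented for illustration and does not reproduce Table S3.

```python
def shared_genes(genes_by_strain):
    """Genes present in every strain (set intersection)."""
    gene_sets = [set(genes) for genes in genes_by_strain.values()]
    return set.intersection(*gene_sets)


def strain_specific(genes_by_strain, strain):
    """Genes found in one strain and in none of the others."""
    others = set().union(
        *(set(genes) for name, genes in genes_by_strain.items() if name != strain)
    )
    return set(genes_by_strain[strain]) - others


# Toy presence/absence data (hypothetical, not the study's results).
toy = {
    "strain_A": ["fepE", "sitC", "irp9", "papC"],
    "strain_B": ["fepE", "sitC", "irp9"],
    "strain_C": ["fepE", "sitC", "usp"],
}
core = shared_genes(toy)
unique_a = strain_specific(toy, "strain_A")
print(sorted(core), sorted(unique_a))
```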
Because free iron is bound to heme or to circulating proteins such as transferrin or lactoferrin, bacterial pathogens cannot access it directly. Pathogens obtain iron from the host by a variety of processes, including the uptake of Fe2+ (Feo system), the absorption of heme, and the production of extracellular Fe3+-chelating molecules known as siderophores, which are produced either by the pathogens themselves or by other bacteria (Caza and Kronstad 2013; Cornelis and Dingemans 2013). The presence of virulence genes associated with iron uptake is therefore a cause for concern, as it indicates a potential mechanism by which E. coli strain CUI-B1 can obtain nutrients from the host and potentially cause disease.
Following the iron uptake-associated genes, various types of toxin genes were encoded by all selected UTI strains. Among them were key toxin genes such as uropathogenic-specific protein (usp), acyl-CoA dehydrogenase (clbF), malonyl-CoA transacylase (clbG), thioesterase (clbQ), and putative amidase (clbL). Gram-negative enteric pathogens, such as E. coli, are particularly adept at acquiring virulence factors through horizontal gene transfer (HGT), which has contributed to their ability to cause a wide range of infections in humans (Ochman et al. 2000). The uropathogenic-specific protein (Usp) of E. coli is a genotoxin that is toxic to mammalian cells and is linked to strains that cause prostatitis, pyelonephritis, and bacteremia. The prevalence of usp in E. coli strains from patients with colorectal cancer has also been reported (Rihtar et al. 2020). The identification of usp in both Gram-negative and Gram-positive bacteria in this study seems to be linked with HGT.
Another group of proteins distributed among the UTI genomes was proteinaceous toxins, such as putative thioesterase, putative malonyl-CoA transacylase, and putative acyl-CoA dehydrogenase. Adhesins are found on the surface of bacterial cells and aid adherence to other cells or inanimate surfaces (Shruthi et al. 2012). To successfully colonize a new host, bacteria must first attach to its surface. The VFs of pathogenic bacteria can be used as indicators and antibacterial targets. However, different uropathogenic E. coli (UPEC) strains may rely on multiple VFs, and numerous factors are found in only a portion of UPEC strains (Shah et al. 2019). Hence, it appears that the development of efficient and generally applicable strategies to prevent or cure UTIs caused by UPEC requires targeting a combination of several VFs found in the strains studied here. The creation of such approaches to managing illness would be made easier by identifying the pattern in which VFs cause UTIs and understanding their epidemiological spread.
In short, the virulence factors identified in our research included features from both databases and had at least 90% sequence similarity with reference genes. UTI virulence genes of E. coli strains include temperature-sensitive haemagglutinin (tsh), ferric aerobactin receptor (iutA), increased serum survival (iss), heat-resistant agglutinin (hra), P fimbrial adhesin (papC), capsular polysialic acid virulence factor group 2 (kpsII), colicin V (cvaC), and invasive factor of brain endothelial cells locus A (ibeA). The emergence of drug-resistant microorganisms among UPEC strains, together with the distribution of virulence factors across entire genomes, increases the global health risk (Tabasi et al. 2015). Overall, the distribution of virulence genes across the E. coli genome is complex and can vary depending on the strain and the specific virulence factors involved. Understanding the distribution and function of these virulence genes is an active area of research and is important for developing effective strategies for the prevention and treatment of E. coli infections.
Subtractive proteomics approach forprediction
ofpotential drug targets
By using various computational analyses, 44 common viru-
lent proteins were further analyzed and considered as suitable
targets for the formulation of novel anti-virulent drugs. Ant-
virulence intends to selectively demobilize pathogens that slow
infection rates. Anti-virulence drugs will maintain the endog-
enous microbiome while exposing pathogens susceptible to
elimination through the host immune system, in contrast to
International Microbiology
1 3
the fact that conventional antibiotics do not generally differ-
entiate between commensal intestinal bacteria and pathogenic
microbes (Knowles and Gromo 2003). We ran BLASTp on the
human proteome and identified 38 proteins as non-homologous
out of 44 shared proteins to exclude host homolog from the
non-redundant proteome. Host non-homologous proteins are
proteins that are not similar in structure or sequence to proteins
produced by the pathogen or other foreign antigen. Because
these proteins are not like foreign antigens, they are less likely
to trigger an immune response against the host’s tissues (Naz
etal. 2015; Sanober etal. 2017). Thirty-eight non-homologous
proteins (Table S4) were analyzed for localizations and results
revealed that 22 proteins were the proteins belonging to cyto-
plasmic locations (Table S5). Predicting the cellular localiza-
tion of unknown proteins also allows researchers to investigate
their activity, participation in disease mechanisms, and poten-
tial therapeutic development pathways. Drugs have an easier
time accessing cytoplasmic proteins than membrane proteins;
hence, they are favored (Reygaert 2018; Sanober etal. 2017).
Virulent, cytoplasmic proteins were examined in KAAS for
their corresponding metabolic pathways (Moriya et al. 2007),
which can help identify the set of pathogen-specific
proteins required for the pathogen's survival (Solanki and Tiwari
2018; Uddin et al. 2015). Out of the 21 protein sets, 14 proteins
are members of unique metabolic pathways, and the bulk (80%) of
these proteins are members of several pathogen-distinct metabolic
pathways. Details of the distinctive metabolic pathways are presented
in Table S6. Proteins that belong to pathogen-specific pathways are
the most likely drug targets because the host has no corresponding
pathway, which lowers the likelihood of side effects.
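The pathway subtraction step amounts to a set difference between the pathways assigned to each pathogen protein and the pathways present in the host. The sketch below illustrates that idea only; the function name and all pathway labels are hypothetical placeholders rather than real KEGG identifiers, and it assumes the KAAS assignments have already been collected into a dictionary.

```python
def pathogen_unique(protein_pathways, host_pathways):
    """Keep proteins that map to at least one pathway absent from the host.

    protein_pathways: dict mapping protein name -> list of pathway labels.
    host_pathways: iterable of pathway labels found in the host.
    Returns a dict mapping protein name -> sorted list of pathogen-only pathways.
    """
    host = set(host_pathways)
    unique = {}
    for protein, pathways in protein_pathways.items():
        specific = sorted(set(pathways) - host)  # pathways the host lacks
        if specific:
            unique[protein] = specific
    return unique


# Toy assignments (placeholder labels, not real pathway IDs).
toy_assignments = {
    "protA": ["pathway_shared_1", "pathway_bacterial_1"],
    "protB": ["pathway_shared_1"],
    "protC": ["pathway_bacterial_2"],
}
toy_host = ["pathway_shared_1", "pathway_shared_2"]
unique_targets = pathogen_unique(toy_assignments, toy_host)
print(unique_targets)
```

A protein such as protA, which participates in both shared and pathogen-only pathways, is retained here; a stricter variant could require that every pathway of a protein be absent from the host.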
Druggability refers to the ability of a protein to bind small
drug-like molecules with high affinity. To assess it, BLASTp was
performed for each candidate target against the DrugBank
database, covering experimental drugs, FDA-approved drugs,
nutraceuticals, small molecules, and biotech drugs.
Based on a bit score > 100, 5 of the initial 14 proteins were
selected as possible pharmaceutical targets; their DrugBank matches
belonged to the experimental or FDA-approved small-molecule
drug groups. For further characterization,
the molecular weight of each potential pharmacological target
was predicted with the ExPASy ProtParam tool (Gasteiger et al.
2005). Proteins with a molecular weight of 110 kDa or less are
preferred in the current study because they are easier to
purify (Naz et al. 2015). The selected proteins have
molecular weights of 32 to 84 kDa, all within this limit,
which supports their suitability as drug targets.
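The two numerical filters in this step (bit score > 100 against DrugBank and molecular weight of 110 kDa or less) can be illustrated as follows. This is a rough sketch, not a reimplementation of ProtParam: it estimates molecular weight by summing standard average residue masses and adding one water molecule, and the function names, candidate names, and toy sequences are hypothetical.

```python
# Average residue masses in daltons (amino acid minus water), standard values.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added for the free N- and C-termini


def molecular_weight_kda(sequence):
    """Estimate the average molecular weight of a protein sequence in kDa."""
    total = sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER_MASS
    return total / 1000.0


def shortlist(candidates, min_bit_score=100.0, max_kda=110.0):
    """Keep candidates with best DrugBank bit score > 100 and weight <= 110 kDa.

    candidates: iterable of (name, best_bit_score, sequence) tuples.
    """
    kept = []
    for name, best_bit_score, sequence in candidates:
        if best_bit_score > min_bit_score and molecular_weight_kda(sequence) <= max_kda:
            kept.append(name)
    return kept


# Toy candidates (hypothetical names, scores, and short sequences).
toy_candidates = [
    ("candidate_1", 150.0, "MKVLA"),
    ("candidate_2", 80.0, "MKVLA"),
]
selected = shortlist(toy_candidates)
print(selected)
```

Sequences containing non-standard residue codes would raise a `KeyError` in this sketch; a production version would need to decide how to handle them.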
Resistance was also considered as a criterion for prioritizing
druggable proteins, as described previously
(Adamus-Bialek et al. 2013; Ehsan et al. 2018). The literature
indicates that Escherichia coli O25b:H4-ST131, P. mirabilis
strain HI4320, K. pneumoniae strain 1721, and S. saprophyticus
strain ATCC 15305 exhibit resistance to multiple antibiotics
such as fosfomycin, aminoglycosides, fluoroquinolones,
tetracycline, and sulfonamides. Putative amidase clbL was the only
protein not associated with drug resistance and was discarded.
Moreover, a suitable template was available for predicting the
structure of each remaining protein (Fig. 4); these proteins can
therefore be considered potential drug targets for E. coli strain CUI-B1,
E. coli strain O25b:H4-ST131, P. mirabilis strain HI4320, K.
pneumoniae, and S. saprophyticus strain ATCC 15305.
Fig. 4 The 3D structure of potential drug targets
Conclusions
The evolutionary analysis demonstrated a shared evolutionary
cluster of human urinary tract infection isolates and avian
pathogenic E. coli, suggesting that these extraintestinal E.
coli may be descended from the same bacterial lineages or
share common evolutionary origins. Consistent detection of
specific human UTI lineages in poultry or poultry products,
but rarely in other meat products, supports the notion that
poultry may serve as a reservoir for human UTI. Moreover,
the emergence of drug resistance in UTI-associated
strains poses a serious and growing threat to global health. Genome
sequencing and comparative analysis indicated that E. coli strain
CUI-B1 appears to be among the strains that pose a serious threat
in healthcare settings and need to be addressed in a timely manner.
Furthermore, based on the genome sequence and the virulence and
antimicrobial-resistance gene dataset, the reverse engineering
approach revealed 22 proteins mapped to pathogen-unique
pathways and characterized four proteins, entB, clbH, chuV,
and ybtS, as potential druggable targets that could be
utilized against these strains. Identifying potential drug targets
is an important aspect of drug discovery and development.
The findings may provide a foundation for prospective drug
discovery against E. coli, P. mirabilis, K. pneumoniae, and
S. saprophyticus infections and could also be employed in
personalized drug development.
Supplementary Information The online version contains supplementary
material available at https://doi.org/10.1007/s10123-023-00416-3.
Author contributions AA, RN, and AI collected samples, performed
basic analysis, and wrote the first draft. AH, WBA, and
IF performed and guided the bioinformatics analysis. ZMZR wrote the
“Discussion” section and proofread the manuscript. MI supervised the
whole work and drafted the manuscript for final submission.
Funding This research was funded by the Researchers Supporting
Project number (RSP2023R332), King Saud University, Saudi Arabia.
Data availability The 16S rRNA sequence data were submitted to
GenBank/DDBJ/ENA under accession no. OQ858381, while the Whole
Genome Shotgun project has been deposited at DDBJ/ENA/GenBank
under the accession JARWMK000000000. The version described in
this paper is version JARWMK010000000.
Declarations
Conflict of interest The authors declare no competing interests.
References
Adamus-Bialek A, Zajac E, Parniewski P, Kaca W (2013) Compari-
son of antibiotic resistance patterns in collections of Escheri-
chia coli and Proteus mirabilis uropathogenic strains. Mol Biol
Rep 40:3429–3435
Alikhan NF, Petty NK, Ben Zakour NL et al (2011) BLAST Ring
Image Generator (BRIG): simple prokaryote genome compari-
sons. BMC Genomics 12:402
Cals JW, Ebell MH (2018) C-reactive protein: guiding antibiotic pre-
scribing decisions at the point of care. Br J Gen Pract 668:112–113
Caza M, Kronstad JW (2013) Shared and distinct mechanisms of iron
acquisition by bacterial and fungal pathogens of humans. Front
Cell Infect Microbiol 19(3):80
Cepas V, Soto SM (2020) Relationship between virulence and resist-
ance among Gram-negative bacteria. Antibiotics 9:719
Cornelis P, Dingemans J (2013) Pseudomonas aeruginosa adapts its
iron uptake strategies in function of the type of infections. Front
Cell Infect Microbiol 14(3):75
Daubin V, Moran NA, Ochman H (2003) Phylogenetics and the cohe-
sion of bacterial genomes. Science 301:829–832
Ehsan N, Ahmad S, Navid A, Azam SS (2018) Identification of poten-
tial antibiotic targets in the proteome of multi-drug resistant Pro-
teus mirabilis. Meta Gene 18:167–173
Feng A, Akter S, Leigh SA et al (2023) Genomic diversity,
pathogenicity and antimicrobial resistance of Escherichia coli isolated from
poultry in the southern United States. BMC Microbiol 23:15
Flores-Mireles A, Walker J, Caparon M et al (2015) Urinary tract
infections: epidemiology, mechanisms of infection and treatment
options. Nat Rev Microbiol 13:269–284
Gasteiger E et al (2005) Protein identification and analysis tools on
the ExPASy server. In: Walker JM (ed) The proteomics protocols
handbook. Springer Protocols Handbooks, Humana Press
Grant JR, Stothard P (2008) The CGView Server: a compara-
tive genomics tool for circular genomes. Nucleic Acids Res
36:W181–W184
Hannan TJ, Totsika M, Mansfield KJ, Moore KH, Schembri MA, Hult-
gren SJ (2012) Host–pathogen checkpoints and population bot-
tlenecks in persistent and intracellular uropathogenic Escherichia
coli bladder infection. FEMS Microbiol Rev 3:616–648
Hatt J, Rather P (2008) Role of bacterial biofilms in urinary tract infec-
tions. Curr Top Microbiol Immunol 322:163–192
Holt JG, Krieg NR, Sneath PH, Staley JT, Williams ST (1994)
Bergey’s manual of determinative bacteriology. Lippincott Williams &
Wilkins, Baltimore, Maryland, USA
Kahlmeter G (2003) An international survey of the antimicrobial sus-
ceptibility of pathogens from uncomplicated urinary tract infec-
tions: the ECO· SENS Project. J Antimicrob Chemother 51:69–76
Kathayat D, Lokesh D, Ranjit S, Rajashekara G (2021) Avian patho-
genic Escherichia coli (APEC): an overview of virulence and
pathogenesis factors, zoonotic potential, and control strategies.
Pathogens 10:467
Kim J, Lee T, Kim TH (2012) An integrated approach of comparative
genomics and heritability analysis of pig and human on obesity
trait: evidence for candidate genes on human chromosome 2.
BMC Genomics 13:711
Klein RD, Hultgren SJ (2020) Urinary tract infections: microbial
pathogenesis, host–pathogen interactions and new treatment strat-
egies. Nat Rev Microbiol 18:211–226
Knowles J, Gromo G (2003) Target selection in drug discovery. Nat
Rev Drug Discov 2:63–69
Kostakioti M, Hultgren SJ, Hadjifrangiskou M (2012) Molecular blue-
print of uropathogenic Escherichia coli virulence provides clues
toward the development of anti-virulence therapeutics. Virulence
3:592–594
Kurbasic A, Jakobsen L, Skjøt-Rasmussen L (2010) Escherichia coli
isolates from broiler chicken meat, broiler chickens, pork, and pigs
share phylogroups and antimicrobial resistance with
community-dwelling humans and patients with urinary tract infection.
Foodborne Pathog Dis 7:537–547
Meng-Ze D, Changjiang Z, Huan W, Shuo L, Wen W, Feng-Bia F
(2018) The GC content as a main factor shaping the amino acid
usage during bacterial evolution process. Front Microbiol 9:2948
Moriya Y, Itoh M, Okuda M, Yoshizawa AC, Kanehisa M (2007)
KAAS: an automatic genome annotation and pathway reconstruc-
tion server. Nucleic Acids Res 35:182–185
Nairz M, Weiss G (2020) Iron in infection and immunity. Mol Asp
Med 75:100864
Naz A, Awan FM, Obaid A, Muhammad SA, Paracha RZ, Ahmad J, Ali
A (2015) Identification of putative vaccine candidates against Hel-
icobacter pylori exploiting exoproteome and secretome: a reverse
vaccinology-based approach. Infect Genet Evol 32:280–291
Ochman H, Lawrence JG, Groisman EA (2000) Lateral gene transfer
and the nature of bacterial innovation. Nature 405:299–304
Paniagua-Contreras GL, Monroy-Pérez E, Díaz-Velásquez CE,
Uribe-García A, Labastida A, Peñaloza-Figueroa F et al (2019)
Whole-genome sequence analysis of multidrug-resistant uropathogenic
strains of Escherichia coli from Mexico. Infect Drug Resist
12:2363–2377
Pelling H, Nzakizwanayo J, Milo S, Denham EL, MacFarlane WM,
Bock LJ et al (2019) Bacterial biofilm formation on indwelling
urethral catheters. Lett Appl Microbiol 68:277–293
Pessi G, Williams F, Hindle Z, Heurlier K, Holden MT, Cámara
M, Haas D et al (2001) The global posttranscriptional regulator
RsmA modulates production of virulence determinants and
N-acylhomoserine lactones in Pseudomonas aeruginosa. J Bac-
teriol 184(335):6676–6683
Reygaert WC (2018) An overview of the antimicrobial resistance
mechanisms of bacteria. AIMS Microbiol 4:482–501
Rihtar E, Žgur Bertok D, Podlesek Z (2020) The uropathogenic specific
protein gene usp from Escherichia coli and Salmonella bongori
is a novel member of the TyrR and H-NS regulons. Microorgan-
isms 8:330
Sanober G, Ahmad S, Azam SS (2017) Identification of plausible drug
targets by investigating the druggable genome of MDR Staphylo-
coccus epidermidis. Gene Report 7:147–153
Saraf VS, Bhatti T, Javed S, Bokhari H (2022) Antimicrobial resist-
ance pattern in E. coli isolated from placental tissues of pregnant
women in low-socioeconomic setting of Pakistan. Curr Microbiol
79:83
Shaeriya F, Al Remawy R, Makhdoom A, Alghamdi A, Shaheen
M, FA. (2021) Purple urine bag syndrome. Saudi J Kidney Dis
Transpl 32:530–531
Shah C, Baral R, Bartaula B et al (2019) Virulence factors of
uropathogenic Escherichia coli (UPEC) and correlation with antimicrobial
resistance. BMC Microbiol 19:204
Shruthi N, Kumar R, Kumar R (2012) Phenotypic study of virulence
factors in Escherichia coli isolated from antenatal cases, catheter-
ized patients, and faecal flora. J Clin Diagn Res 6:1699–1703
Smelov V, Naber K, Johansen TEB (2016) Improved classification
of urinary tract infection: future considerations. Eur Urol Suppl
15:71–80
Solanki V, Tiwari V (2018) Subtractive proteomics to identify novel
drug targets and reverse vaccinology for the development of chi-
meric vaccine against Acinetobacter baumannii. Sci Rep 8:9044
Srinivasan VB, Rajamohan G (2013) KpnEF, a new member of the
Klebsiella pneumoniae cell envelope stress response regulon, is
an SMR-type efflux pump involved in broad-spectrum antimicro-
bial resistance. Antimicrob Agents Chemother 57(9):4449–4462
Stamm WE, Norrby SR (2001) Urinary tract infections: disease pano-
rama and challenges. J Infect Dis 183:S1–S4
Stickler DJ (2008) Bacterial biofilms in patients with indwelling uri-
nary catheters. Nat Clin Pract Urol 5:598–608
Subashchandrabose S, Hazen TH, Brumbaugh AR, Himpsl SD, Smith
SN, Ernst RD, Rasko DA et al (2014) Host-specific induction of
Escherichia coli fitness genes during human urinary tract infec-
tion. Proc Natl Acad Sci 111:18327–18332
Tabasi M, Asadi Karam MR, Habibi M, Yekaninejad MS, Bouzari
S (2015) Phenotypic assays to determine virulence factors of
uropathogenic Escherichia coli isolates and their correlation with
antibiotic resistance pattern. Osong Public Health Res Perspect
6:261–268
Uddin R, Saeed K, Khan W, Azam SS, Wadood A (2015) Metabolic
pathway analysis approach: identification of novel therapeutic
target against methicillin resistant Staphylococcus aureus. Gene
556:213–226
Yu NY, Wagner JR, Laird MR, Melli G et al (2010) PSORTb 3.0:
improved protein subcellular localization prediction with refined
localization subcategories and predictive capabilities for all
prokaryotes. Bioinformatics 26:1608–1615
Publisher’s note Springer Nature remains neutral with regard to
jurisdictional claims in published maps and institutional affiliations.
Springer Nature or its licensor (e.g. a society or other partner) holds
exclusive rights to this article under a publishing agreement with the
author(s) or other rightsholder(s); author self-archiving of the accepted
manuscript version of this article is solely governed by the terms of
such publishing agreement and applicable law.