A Domain-Centric Analysis of Oomycete Plant Pathogen Genomes Reveals Unique Protein Organization

Michael F. Seidl*, Guido Van den Ackerveken, Francine Govers, and Berend Snel
Theoretical Biology and Bioinformatics (M.F.S., B.S.) and Plant-Microbe Interactions (G.V.d.A.), Department of
Biology, Utrecht University, 3584 CH Utrecht, The Netherlands; Centre for BioSystems Genomics, 6700 AB
Wageningen, The Netherlands (M.F.S., G.V.d.A., F.G., B.S.); Graduate School Experimental Plant Sciences,
6708 PB Wageningen, The Netherlands (M.F.S.); and Laboratory of Phytopathology, Plant Sciences Group,
Wageningen University, 6708 PB Wageningen, The Netherlands (F.G.)
Oomycetes comprise a diverse group of organisms that morphologically resemble fungi but belong to the stramenopile lineage
within the supergroup of chromalveolates. Recent studies have shown that plant pathogenic oomycetes have expanded gene
families that are possibly linked to their pathogenic lifestyle. We analyzed the protein domain organization of 67 eukaryotic
species including four oomycete and five fungal plant pathogens. We detected 246 expanded domains in fungal and oomycete
plant pathogens. The analysis of genes differentially expressed during infection revealed a significant enrichment of genes
encoding expanded domains as well as signal peptides linking a substantial part of these genes to pathogenicity. Over-
representation and clustering of domain abundance profiles revealed domains that might have important roles in host-
pathogen interactions but, as yet, have not been linked to pathogenicity. The number of distinct domain combinations
(bigrams) in oomycetes was significantly higher than in fungi. We identified 773 oomycete-specific bigrams, with the majority
composed of domains common to eukaryotes. The analyses enabled us to link domain content to biological processes such as
host-pathogen interaction, nutrient uptake, or suppression and elicitation of plant immune responses. Taken together, this
study represents a comprehensive overview of the domain repertoire of fungal and oomycete plant pathogens and points to
novel features like domain expansion and species-specific bigram types that could, at least partially, explain why oomycetes
are such remarkable plant pathogens.
Oomycetes are a diverse group of organisms that
live as saprophytes or as pathogens of plants, insects,
fish, vertebrates, and microbes (Govers and Gijzen,
2006). The numerous plant pathogenic oomycete spe-
cies cause devastating diseases on many different host
plants and have a huge impact on agriculture. A prominent
example is Phytophthora infestans, the causal
agent of late blight of potato (Solanum tuberosum) and
tomato (Solanum lycopersicum) and responsible for the
Irish potato famine in the 19th century. Plant patho-
genic oomycetes include a large number of different
species that vary in their lifestyle, from obligate
biotrophic and hemibiotrophic to necrotrophic. In
addition, they show great differences in host selectiv-
ity, ranging from broad to very narrow (Erwin and
Ribeiro, 1996; Agrios, 2005). Oomycetes have morpho-
logical features similar to filamentous fungi, and the
two groups exploit common infection structures and
mechanisms (Latijnhouwers et al., 2003). Together
with diatoms, brown algae, and golden-brown algae,
oomycetes are classified as stramenopiles, a lineage
that is united with alveolates in the supergroup of
chromalveolates (Baldauf et al., 2000; Yoon et al.,
2002). The monophyly of this supergroup, however, is
under debate (Baurain et al., 2010). The genomes of
oomycetes sequenced so far are variable in size and
content, ranging from 65 Mb in Phytophthora ramorum
to 240 Mb in P. infestans (Haas et al., 2009), and only
include plant pathogenic species. Analysis of these
genomes revealed that several gene families facilitat-
ing the infection process are expanded (Martens et al.,
2008). Extreme examples are gene families encoding
cytoplasmic effector proteins such as RXLR effectors,
which share the host cell-targeting motif RXLR and
suppress defense responses in the host, and the ne-
crosis-inducing proteins classified as Crinklers (Crn;
Haas et al., 2009). To date, a few oomycete genomes
have been sequenced, and this enables a compre-
hensive comparison of genomic features present in
oomycetes, fungi, and other eukaryotic species such
as gene families and protein domains. Experimentally
derived functional knowledge of the majority of gene
products in oomycetes in a comparable depth as for
model species like Saccharomyces cerevisiae and Arabidopsis
(Arabidopsis thaliana) will likely not be accessible
in the near future. Hence, comparative genomics
provides an important framework to functionally
characterize oomycete gene products and generate
hypotheses on the basic cellular functions as well as the
complex interactions of these plant pathogens with
their hosts and environment.
1 This work was supported by the Center for BioSystems Genomics, which is part of The Netherlands Genomics Initiative/Netherlands Organization for Scientific Research.
* Corresponding author; e-mail m.f.seidl@uu.nl.
The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantphysiol.org) is: Michael F. Seidl (m.f.seidl@uu.nl).
[W] The online version of this article contains Web-only data.
[OA] Open Access articles can be viewed online without a subscription.
www.plantphysiol.org/cgi/doi/10.1104/pp.110.167841
Plant Physiology®, February 2011, Vol. 155, pp. 628–644, www.plantphysiol.org © 2010 American Society of Plant Biologists
In this study, we focus on protein domains because
these are the basic functional, evolutionary, and struc-
tural units that shape proteins (Rossmann et al., 1974;
Orengo et al., 1997; Vogel et al., 2004). Domains func-
tion independently in single-domain proteins or syn-
ergistically in multidomain proteins (Doolittle, 1995;
Vogel et al., 2004; Bashton and Chothia, 2007). Accord-
ingly, some domains always occur with a defined set of
functional partners, whereas others are highly versa-
tile and form combinations of two consecutively oc-
curring domains (also called bigrams) with different
N- or C-terminal partners (Marcotte et al., 1999; Basu
et al., 2008). Here, we analyzed the domain repertoire
predicted from the genome sequences of 67 eukaryotic
species and compared filamentous plant pathogens
with other eukaryotes with a special emphasis on
oomycetes. We show how differences in the domain
repertoire of oomycetes, especially in the expansion of
certain domain families and the formation of species-
specific bigram types, can be linked to the biology of
this group of organisms. This allowed the generation
of candidate sets of proteins and domains that are
likely to play roles in the lifestyle of oomycetes or
their interaction with plants.
RESULTS
The Domain Repertoire of Oomycete Plant Pathogens
and Its Comparison with Other Eukaryotes
We analyzed the domain architecture of the predic-
ted proteomes in 67 eukaryotes covering all major
groups of the eukaryotic tree of life with the exception
of the supergroup Rhizaria (Fig. 1A; Supplemental
Table S1). We included seven stramenopiles, four of
which are plant pathogenic oomycetes, namely the
obligate biotrophic downy mildew Hyaloperonospora
arabidopsidis and three hemibiotrophic Phytophthora
species. The selection also contained five fungal plant
pathogens, including rice (Oryza sativa) blast fungus
(Magnaporthe grisea) and corn (Zea mays) smut (Ustilago
maydis), both species with a (hemi)biotrophic
lifestyle comparable to the oomycete plant pathogens
used in the analysis (Fig. 1B).
The domain architecture of all 1,250,996 predicted
proteins in the 67 eukaryotic genomes was analyzed
using HMMER (Eddy, 1998) and a local Pfam-A data-
base (Finn et al., 2008). Overall, 59% (737,851) of all
proteins have one or more predicted domain. We
detected a total of 1,464,807 domains in all species,
80,180 within the stramenopiles and 51,030 in oomy-
cetes.
In order to characterize the domain repertoire of
eukaryotes, we used two metrics: the number of do-
main types and the number of different combinations
of adjacent domains, also called bigrams (Fig. 2). In
total, 13,994 bigram types were identified in the 67
eukaryotic genomes, consisting of 6,356 different do-
main types. As described by Basu et al. (2008), the
number of bigram types increases superlinearly rela-
tive to the number of domain types, with the highest
numbers in multicellular organisms (Fig. 3). We observed
separate clusters for metazoans, fungi, and
plants (including land plants and mosses). Oomycetes
and fungi have similar numbers of domain types,
ranging from 2,000 to 2,500; however, oomycetes, in
particular Phytophthora species, contain significantly
more bigram types. The three analyzed Phytophthora
species appeared to have approximately 50% more
bigram types compared with other organisms that
have similar numbers of domain types (Fig. 3; P =
0.00019, by one-sided Wilcoxon rank-sum test). This
even holds when we apply a more conservative ap-
proach by discarding all domain and bigram types
that occur once in each predicted proteome (Supple-
mental Fig. S1A). We observed that the number of
domain types as well as the number of bigram types
increases with proteome size and reaches saturation
for larger proteomes (Supplemental Fig. S1, B and C;
Cosentino Lagomarsino et al., 2009). Although oomy-
cetes and in particular Phytophthora species contain a
similar number of domain types as fungi, they have a
larger predicted proteome (Supplemental Fig. S1B).
However, they contain more bigram types than fungi
but fewer than other species with predicted proteomes
of similar size (e.g. Drosophila melanogaster; Supplemental
Fig. S1C).
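The metrics used in this section (domain types, bigram types, abundance, and versatility) can be illustrated with a minimal sketch. It assumes each protein is already available as an N-to-C ordered list of domain names; the toy proteome, the function name `domain_metrics`, and the choice to treat a bigram as an ordered pair are illustrative assumptions and are not taken from the study.

```python
from collections import Counter, defaultdict

def domain_metrics(proteome):
    """proteome: list of proteins, each an N-to-C ordered list of domain names."""
    abundance = Counter()            # occurrences of each domain type
    bigrams = Counter()              # occurrences of each adjacent (ordered) pair
    n_partners = defaultdict(set)    # distinct N-terminal neighbors per domain
    c_partners = defaultdict(set)    # distinct C-terminal neighbors per domain
    for domains in proteome:
        abundance.update(domains)
        for left, right in zip(domains, domains[1:]):
            bigrams[(left, right)] += 1
            c_partners[left].add(right)
            n_partners[right].add(left)
    # Versatility counts N- and C-terminal partners separately (assumption).
    versatility = {d: len(n_partners[d]) + len(c_partners[d]) for d in abundance}
    return abundance, bigrams, versatility

# Hypothetical toy proteome with four proteins.
toy = [["A", "B"], ["A", "B", "C"], ["C", "C"], ["D", "C", "A"]]
abundance, bigrams, versatility = domain_metrics(toy)
print(len(abundance), "domain types;", len(bigrams), "bigram types")
```

In this toy proteome, two copies of the same domain in a row (C, C) count as a bigram, in line with the definition given in the Figure 2 legend.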
Domain Overrepresentation Provides a Snapshot of
Pathogen-Host Interaction
Apart from a wide and abundant repertoire of do-
mains related to transposable elements (Haas et al.,
2009), the most abundant domain types in oomycetes
are similar to those in other eukaryotes (Supplemental
Table S2). Hence, absolute domain abundance alone
is not indicative enough to correlate domains to the
lifestyle of both fungal and oomycete plant pathogens.
Instead, we identified domains that are overrepre-
sented in plant pathogens relative to other eukaryotes
(Fig. 1B).
Our analysis inferred 246 overrepresented domains
in plant pathogens that are observed in 24,970 proteins
(P < 0.001, by Fisher’s exact test; a selection of well-described
overrepresented domains is depicted in Fig.
4A; Supplemental Table S3). Since we analyzed the
expansion in plant pathogens at the level of a group
rather than an individual species, domains that are
reported as being expanded in the group are not ne-
cessarily expanded in all species of the group or may
even be absent (Supplemental Table S3). For example,
secreted proteins encoding carbohydrate-binding fam-
ily 25 domains (IPR005085) are only found in Phyto-
phthora species and not in fungal plant pathogens,
whereas secreted proteins containing the Cys-rich
domain (CFEM; IPR008427) are only observed in fun-
gal pathogens (Kulkarni et al., 2003).
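A one-sided Fisher's exact test of the kind used for the overrepresentation analysis can be sketched as follows. This is a minimal illustration under assumptions: the 2x2 table is built from domain occurrence counts (the domain of interest versus all other domains, in plant pathogens versus all other species), the counts below are hypothetical, and the function names are made up for this sketch. The source does not describe its exact table layout or any multiple-testing correction.

```python
from math import lgamma, exp

def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def fisher_one_sided(a, b, c, d):
    """Upper-tail P value for the table [[a, b], [c, d]] with fixed margins.

    a: occurrences of the domain in plant pathogens
    b: occurrences of all other domains in plant pathogens
    c: occurrences of the domain in the remaining species
    d: occurrences of all other domains in the remaining species
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    log_denom = log_comb(total, row1)
    p = 0.0
    # Sum hypergeometric probabilities for tables at least as extreme as observed.
    for x in range(a, min(row1, col1) + 1):
        p += exp(log_comb(col1, x) + log_comb(total - col1, row1 - x) - log_denom)
    return min(p, 1.0)

# Hypothetical counts, chosen only to show the call.
p_value = fisher_one_sided(30, 9_970, 10, 99_990)
print(p_value < 0.001)
```

Working in log space with `lgamma` avoids building very large binomial coefficients when the totals run into the millions, as they do for genome-wide domain counts.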
Many proteins involved in host-pathogen interac-
tion are secreted in the apoplast or, like the RXLR
effector proteins, translocated into host cells following
their secretion from the pathogen (Haas et al., 2009).
Hence, we also predicted the presence of potential
N-terminal signal peptide sequences in the whole
proteomes of the analyzed species. The combined
secretome encompasses 100,521 potentially secreted
proteins, of which 11,352 are predicted in plant path-
ogens (Supplemental Fig. S2). Approximately 20%
(2,478) of these proteins contain overrepresented do-
mains; hence, proteins containing overrepresented
domains are 1.85-fold enriched in the predicted secretome
of the analyzed plant pathogens (P = 2.57 ×
10⁻²³¹, by Fisher’s exact test).
Oomycete proteins with significantly expanded do-
mains are prime candidates for being pathogenicity
associated. To assess this hypothesis, we tested if
P. infestans genes that are differentially expressed dur-
ing infection of the potato host are enriched for the
aforementioned expanded domains. For this, we uti-
lized NimbleGen microarray data that include genome-
wide expression levels of P. infestans genes at different
days post inoculation (dpi) of potato leaves as well as
from mycelium grown in vitro on different media
(Haas et al., 2009). We identified in total 1,584 genes
that are significantly induced or repressed in P. infes-
tans during infection (differentially expressed for at
least one of the time points 2–5 dpi) compared with
those grown in vitro (three different growth media;
P < 0.05, q < 0.05, by t test; Supplemental Table S4A;
Supplemental File S1). Of the 1,584 differentially ex-
pressed genes, 259 encode proteins containing signif-
icantly expanded domains (Supplemental Table S4B),
which is 1.2-fold more than expected (P = 8.8 × 10⁻⁵,
by Fisher’s exact test). Moreover, 44 of these 259 genes
also encode proteins with a predicted signal peptide,
which is a significant enrichment (1.8-fold; P = 4.38 ×
10⁻⁵, by Fisher’s exact test). The majority (41) of these
44 genes are differentially expressed early in infection
(2 dpi; Fig. 5A). All genes differentially expressed at 3
dpi are also differentially expressed at 2 dpi (Fig. 5, A
and B). Consequently, the 44 differentially expressed
genes coding for proteins with both predicted signal
peptides as well as overrepresented domains are prom-
ising candidates for pathogenicity-associated proteins,
of which several will be discussed in detail below.
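The selection of differentially expressed genes relies on both a P value and a q value threshold. The text does not state which q value procedure was used, so the sketch below substitutes the Benjamini-Hochberg adjustment as a common stand-in; the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-gene P values from a t test (infection versus in vitro growth).
p_vals = [0.001, 0.008, 0.039, 0.041, 0.60]
q_vals = benjamini_hochberg(p_vals)
selected = [i for i, (p, q) in enumerate(zip(p_vals, q_vals)) if p < 0.05 and q < 0.05]
print(selected)
```

With these toy values, the third and fourth genes pass the P value threshold but not the adjusted threshold, which shows why the two criteria are applied together.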
For several groups of overrepresented domains,
a direct or indirect role in host-pathogen interaction
and/or plant pathogen lifestyle has already been
hypothesized or demonstrated (Dean et al., 2005; Tyler
et al., 2006; Haas et al., 2009). Nearly 18% of the 246
overrepresented domains belong to three groups of
domains: (1) hydrolase domains; (2) domains involved
in substrate transport over membranes, such as the
general ATP-binding cassette (ABC) transporter-like
domain (IPR003439) but also more specialized trans-
porters of sulfate (IPR011547) and amino acids
(IPR004841/IPR013057); and (3) domains present in
peptidases, such as the metalloprotease-type M28
domain (IPR007484) found in many secreted proteins.
Figure 1. A, The major eukaryotic groups considered in the analysis and
the number of species represented in every group. For the exact species used
in the analysis, see Supplemental Table S1. The tree is adapted from Simpson
and Roger (2004) and incorporates the phylogeny for the stramenopiles based
on Blair et al. (2008). B, Fungal and oomycete plant-pathogenic species
used in this analysis. The plant pathogens include species with different
lifestyles, indicated by the symbol following the species name. The phylogeny
for the fungi is based on James et al. (2006).
Of the hydrolases, which encompass 9% of the overrepresented
domains, the majority are present in enzymes
that hydrolyze glycosidic bonds. An example
is the glycoside hydrolase (GH) family 12 domain
(IPR002594). This domain is observed 34 times in plant
pathogens, which overall contain 91,747 domains,
and 43 times in all eukaryotes, which have a total of
1,464,807 domains, and hence is 12.62-fold (3.66 log2-
fold) enriched in the plant pathogens. This domain is
mainly observed in secreted proteins (27 out of 34;
SignalP prediction). The majority (79%) of the GH-12
domains are found in oomycete plant pathogens, and
the expression of two of these hydrolase genes in
P. infestans (PITG_08944 and PITG_16991) is signifi-
cantly induced during infection of potato (Fig. 5;
Supplemental Table S4). In total, 33 differentially ex-
pressed genes during plant infection in P. infestans
encode proteins that contain GH domains, including
GH-17 (IPR000490) in endo-1,3-β-glucosidase and
GH-81 (IPR005200) in β-1,3-glucanases as well as several
members of GH-28 (IPR000743), a domain involved
in soft rotting of host tissues and described in
both fungal and bacterial plant pathogens (He and
Collmer, 1990; Ruttkowski et al., 1990). Twenty-eight
P. infestans genes coding for domains involved in
transmembrane transport are differentially expressed
during plant infection (Supplemental Table S4).
Figure 2. Description of different metrics used in this study. In the example
shown, we observe five different domain types. The abundance of a domain
type is defined as the number of occurrences of the individual entity
within the species (e.g. domain type B has an abundance of two). The
versatility is defined as the number of different direct adjacent N- or
C-terminal neighbors. We distinguish between N- and C-terminal partners
(e.g. the versatility of domain type C is three). A bigram is a set of two
directly adjacent domains, and we also consider two entities of the same
domain a bigram (e.g. we observe nine different bigram types in the
proteome, of which two have an abundance of three [right panel]).
Figure 3. Dependence of the number of domain and bigram types observed
in the analyzed species. The average number of different bigrams of species
that have between 2,000 and 2,500 different domain types is indicated
with the bottom horizontal red bar. The top horizontal red bar indicates the
average number of different bigrams for Phytophthora species. The full
species names corresponding to the abbreviations can be found in
Supplemental Table S1. A magnification of the area encompassing the
oomycete and fungal plant pathogens is shown; the species of interest are
highlighted. The dots are colored according to the major eukaryotic groups
as indicated in the text box.
Examples of genes encoding domains involved in substrate
transport over the membrane are PITG_04307, which
encodes an ABC-2-type transporter (IPR013525),
PITG_12808, which encodes an amino acid transporter
(IPR013057), as well as PITG_22087, a gene encoding
both ABC-like (IPR003439) and ABC-2-type domains
(Supplemental Table S4). Extracellular degrading enzymes
like cutinases contain an overrepresented domain
(IPR000675; P = 3.72 × 10⁻⁶¹). This domain is
observed 65 times in plant pathogenic species, corre-
sponding to a 13.3-fold (3.73 log2-fold) enrichment
(Fig. 4A). In total, 61 proteins in plant pathogens pre-
dicted to possess this domain are potentially secreted
(Supplemental File S1). Another overrepresented
domain that is present in secreted proteins and in-
volved in maceration and soft rotting of plant tissue is
the pectate lyase (IPR004898). This domain is 15.34-
fold (3.94 log2-fold) enriched in plant pathogens and
mainly found in oomycetes. Five genes in P. infestans
encode this domain as well as a predicted N-terminal
signal peptide and are differentially expressed (Fig. 5).
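The fold-enrichment values quoted above follow from a ratio of relative domain frequencies. As a worked check, the sketch below recomputes the value for the GH family 12 domain from the counts given in the text (34 of 91,747 domains in plant pathogens; 43 of 1,464,807 domains in all eukaryotes). The function name is an illustrative assumption.

```python
from math import log2

def fold_enrichment(count_in_group, total_in_group, count_overall, total_overall):
    """Ratio of the domain's relative frequency in the group to its overall frequency."""
    return (count_in_group / total_in_group) / (count_overall / total_overall)

# GH family 12 (IPR002594), using the counts reported in the text.
fold = fold_enrichment(34, 91_747, 43, 1_464_807)
print(f"{fold:.2f}-fold, {log2(fold):.2f} log2-fold")  # prints "12.62-fold, 3.66 log2-fold"
```

Note that the background here is the full set of eukaryotes, which includes the plant pathogens themselves, as in the text.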
Novel Candidate Domains Significantly Expanded in
Plant Pathogens
In addition to domains that were already directly or indirectly
implicated in host-pathogen interaction, we identified
novel candidates that are also expanded in plant
pathogens, several of which are encoded in P. infestans
genes differentially expressed during infection of the
host. Genes encoding the significantly expanded alcohol
dehydrogenase (zinc binding; IPR013149) and
GroES-like alcohol dehydrogenase (IPR013154)
domains are ubiquitous in all analyzed eukaryotes,
and the combination of these two domains is also present
in all species with only a few exceptions. Nine of
these genes in P. infestans are induced during infection
(Supplemental Table S4).
Figure 4. Overrepresentation of selected, well-described domains involved
in plant-pathogen interaction and establishing or maintaining infection.
A, The log2-fold overrepresentation of the domains in plant pathogens
is shown in the bar chart. The absolute number of occurrences in plant
pathogens and the percentage of all predicted domains in plant pathogens are
displayed in the bars, and the corrected P values are shown at the tip of the
bars. The fold overrepresentation and the P value for the Kazal protease
inhibitor domain were based on the overrepresentation in oomycetes
compared with plant pathogens (indicated by the white bar and asterisks).
B, The overrepresented domains described in A are depicted in their possible
cellular role during infection of the plant host.
Sixty-five genes in plant
pathogens encode proteins with FAD-linked oxidase
(IPR006094) and berberine/berberine-like (BBE)
domains (IPR012951), of which three out of six in
P. infestans are induced during infection (PITG_02928,
PITG_02930, and PITG_20764). The BBE domain is
involved in the biosynthesis of the alkaloid berberine
(Facchini et al., 1996). The genes encode a predicted
N-terminal signal peptide, although molecular analy-
sis of proteins containing these domains in plants in-
dicated that at least some of these are not secreted but
instead are targeted to specialized vesicles (Amann
et al., 1986; Kutchan and Dittrich, 1995; Facchini et al.,
1996). Moreover, Moy et al. (2004) observed induced
expression of a soybean (Glycine max) gene (BE584185)
shortly after infection with Phytophthora sojae con-
taining these two domains. A recent analysis from
Raffaele et al. (2010) focusing solely on the secretome
in P. infestans corroborates our results and also con-
cludes that proteins with BBE and FAD-linked oxi-
dase domains are candidate virulence factors. Three
genes encoding secreted metallophosphoeste-
rases (IPR004843; PITG_20454, PITG_07720, and
PITG_10322) show induced gene expression. These
metallophosphoesterase domains are found in phos-
phatases and hence are involved in the regulation of
protein activity, since they work as antagonists of
kinase activity.
For approximately 6% of all overrepresented do-
mains, no or limited functional information is avail-
able in Pfam. These are the so-called DUFs: domains of
unknown function. Given their expansion in plant
pathogens and the fact that other overrepresented
domains are known to function in diverse aspects
of plant-pathogen interactions, these DUFs are also
likely to play a role in the lifestyle of plant pathogens
and hence are promising targets for further experi-
mental validation (Supplemental Table S3). Secreted
proteins containing a combination of two overrepre-
sented DUFs, DUF2403 (IPR018807) and DUF2401
(IPR018805), are exclusively found in fungi and in
oomycetes, with the majority (approximately 75%) in
oomycetes. The N-terminal DUF2403 contains a Gly-
rich region without further functional annotation,
whereas five highly conserved Cys residues charac-
terize the C-terminal DUF2401. Proteins containing
both DUFs have been characterized in S. cerevisiae and
in Candida albicans as being covalently linked to the
cell wall (Terashima et al., 2002; Yin et al., 2005; Klis
et al., 2009). Another overrepresented DUF within
plant pathogens and mainly found in oomycetes
is DUF953 (IPR010357). This domain is present in
several eukaryotic proteins with thioredoxin-like func-
tion, and two genes in P. infestans containing this
domain are differentially expressed during infection
(PITG_07008 and PITG_07010). DUF590 (IPR007632),
which is ubiquitous in nearly all eukaryotes, is ob-
served in proteins containing eight putative transmem-
brane helices. These proteins exhibit calcium-activated
ion channel activity and are involved in diverse bio-
logical processes (Yang et al., 2008). The P. infestans
gene PITG_06653 that contains the DUF590 domain is
differentially expressed during infection, and this pro-
vides further support for a role in host-pathogen
interaction. The exemplified DUFs as well as other
overrepresented domains with little or no functional
annotation are interesting candidates for further func-
tional studies to decipher their precise role in plant
pathogens.
Domain Overrepresentation in Oomycete
Plant Pathogens
Since the previous analysis grouped both fungal and
oomycete plant pathogens, domains specifically en-
riched in oomycetes were not directly discernible.
Hence, we compared the relative domain abundance
predicted in plant pathogens (Fig. 1B) with the aim to
identify domains specifically enriched in oomycetes.
Of the 75 domains that are overrepresented in oomy-
cetes, 20 are not observed in any fungal plant pathogen
and therefore can be considered oomycete specific
within plant pathogens (Supplemental Table S5). In
general, the abundance of expanded domains in
Phytophthora species is higher than in H. arabidopsidis.
A well-described example is the NPP1 domain
(IPR008701) that is present in secreted (SignalP: 122)
necrosis-inducing proteins. It shows a significant over-
representation in oomycetes (1.68-fold [0.75 log2-fold]
enriched), in particular in Phytophthora species, but
is also observed 10 times in fungal plant pathogens
as well as in a few cases in nonpathogenic fungi as
noted before (Gijzen and Nürnberger, 2006). Four P.
infestans genes encoding this domain are induced early
during infection (2–3 dpi), whereas a single gene
(PITG_18453) is induced late (5 dpi). Several pepti-
dases (e.g. containing the peptidase S1/S6 and C1A
domains) are overrepresented compared with other
plant pathogens. S1/S6 (IPR001254; 1.6-fold [0.74 log2-
fold]) is predicted in 91 proteins, of which 67 have a
predicted secretion signal, while C1A (IPR000668;
1.79-fold [0.85 log2-fold]) is predicted in 78 proteins,
of which 31 are potentially secreted. C1A is present in
several eukaryotic species, but within the plant path-
ogenic group it is exclusively found in oomycetes.
Figure 5. Gene expression analysis of P. infestans genes encoding overrepresented domains and a predicted N-terminal signal
peptide. Genes with significant gene expression changes at different time points after infection (2–5 dpi) relative to the expression
intensities of different growth media are displayed (P < 0.05, q < 0.05, by t test). Heat maps show the significantly differentially
expressed genes at different time points relative to growth media. Genes were clustered using Spearman rank correlation and
average linkage clustering. Gene identifiers as well as domain descriptions are displayed. Gene expression profiles are displayed
for the expression intensities relative to the average intensities of the growth media for each time point after infection. Heat maps
and expression profiles of the significantly differentially expressed genes relative to the growth media are shown for individual
time points as follows: 2 dpi (A), 3 dpi (B), 4 dpi (C), and 5 dpi (D).
Several secreted protease inhibitors of the Kazal family
containing the Kazal I1 (IPR002350) and Kazal-type
(IPR011497) domains are significantly expanded in
oomycetes and are within the group of analyzed plant
pathogens specific to oomycetes. This suggests that
they provide an increased level of protection of the
pathogen against host-encoded defense-related pro-
teases (Tyler et al., 2006). Another domain that is
oomycete specific within the plant pathogens is the
Na/Pi cotransporter (IPR003841) involved in the up-
take of phosphate. Several other transporters that have
already been described as being overrepresented in
plant pathogens (e.g. the ABC-2-type transporters) are
significantly expanded within oomycete plant patho-
gens, since these species are the major contributors to
the overall abundance of this domain in plant patho-
gens. The abundance of predicted Ser/Thr-like kinase
domains (IPR017442) compared with other plant path-
ogenic species is surprisingly high, and this domain is
specifically expanded in the Phytophthora species. Even
if several expanded domains are observed in both
oomycete as well as fungal plant pathogens, the explo-
ration of domains primarily expanded in oomycetes
(e.g. certain transporter families and defense- and
signaling-related domains) highlights functional en-
tities that discriminate between these groups of plant
pathogens.
Clustering of Abundance Profiles Reveals Additional
Potential Pathogenicity Factors
We extended the set of candidate domains that might
be important for host-pathogen interaction beyond
overrepresented domains by searching for additional
domains that show presence, absence, and expansion
profiles similar to overrepresented domains, since
these domains are likely to be functionally linked or
involved in similar biological processes (Pellegrini
et al., 1999). We calculated a normalized profile of
domain abundance and clustered similar abundance
profiles using hierarchical clustering (Supplemental
File S1). Several clusters contained a mix of signifi-
cantly overrepresented domains and domains whose
expansion in plant pathogens is not significant. We
exemplify this with three clusters that contain 20%
of all overrepresented domains in plant pathogens
(Fig. 6).
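The clustering of abundance profiles can be sketched in a few lines. The text only states that a normalized profile was computed and clustered hierarchically, so every specific choice below is an assumption made for illustration: scaling each profile to its maximum, using one minus the Pearson correlation as the distance, average linkage, and a fixed distance cutoff. The domain names and counts are hypothetical.

```python
from math import sqrt

def normalize(profile):
    """Scale a domain's per-species counts so that the largest value is 1."""
    peak = max(profile)
    return [x / peak for x in profile] if peak else list(profile)

def pearson_distance(u, v):
    """1 - Pearson correlation; flat profiles are treated as uncorrelated."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    if su == 0 or sv == 0:
        return 1.0
    return 1.0 - cov / (su * sv)

def average_linkage(profiles, threshold):
    """Merge the closest clusters until the closest pair exceeds the threshold."""
    vectors = {name: normalize(counts) for name, counts in profiles.items()}
    clusters = [[name] for name in profiles]

    def dist(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(pearson_distance(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > 1:
        d, i, j = min(
            ((dist(clusters[i], clusters[j]), i, j)
             for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda t: t[0],
        )
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical counts per species: two oomycetes, two fungi, one animal.
toy_profiles = {
    "domX": [30, 28, 2, 1, 0],   # expanded in oomycetes, significantly overrepresented
    "domY": [6, 5, 0, 1, 0],     # same pattern, but too rare to reach significance
    "domZ": [3, 2, 20, 25, 4],   # expanded in fungi
}
result = average_linkage(toy_profiles, threshold=0.2)
print(result)
```

In this toy case the rare domain domY groups with the overrepresented domX, which is the kind of association the analysis uses to propose additional candidates.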
In the first cluster (Fig. 6), domains are mainly
expanded in oomycete plant pathogens. The abun-
dance of some domains in plant pathogens is too low
to be identified as being overrepresented. For example,
the PcF domain (IPR018570), which is present in a
small, approximately 50-amino acid necrosis-induc-
ing protein found in various Phytophthora species
(Orsomando et al., 2001; Liu et al., 2005), was not
identified in the initial overrepresentation analysis.
Also in this cluster is the sugar fermentation stimula-
tion domain (IPR005224), which is mainly found in
bacteria and involved in the regulation of maltose
metabolism (Kawamukai et al., 1991). In this first
cluster, we observed a high number (approximately
40%) of domains without functional characterization
that are mainly present in bacteria. An example is
DUF1949 (IPR015269), a domain that is only found in
the three analyzed Phytophthora species. This domain
is observed in functionally uncharacterized bacterial
proteins like YIGZ in Escherichia coli K12 and adopts
a ferredoxin-like fold (Park et al., 2004). The Phytoph-
thora and bacterial proteins containing DUF1949 also
contain a second, N-terminal uncharacterized protein
family, UPF (UPF00029, IPR001498). This domain is
also found in the human protein Impact and is con-
served from bacteria to eukaryotes (Okamura et al.,
2000). The P. infestans gene (PITG_00027) containing
both domains is induced early in infection (Supple-
mental Table S4B). Since these DUFs cluster with
overrepresented domains, they are promising candi-
dates for further study.
The domains in the second cluster are mainly
expanded in both fungal and oomycete plant pathogens.
This cluster contains, for example, cell wall-degrading
domains like cutinases, pectate lyases, and other
hydrolases and also the NPP1 domain that is found in
necrosis-inducing proteins. The
glycosyl hydrolase family 88 comprises unsaturated
glucuronyl hydrolases thought to be involved in biofilm
degradation and is mainly found in bacteria and
fungi (Itoh et al., 2006). Interestingly, homologs are also
observed in plant pathogenic bacteria (e.g. Pectobacte-
rium atrosepticum), in fungi (e.g. M. grisea), and in all
three Phytophthora species.
The third cluster contains domains that are not ex-
clusively found in plant pathogens but have a broader
abundance profile. This cluster includes a variety of
overrepresented hydrolases, epimerases, and the ABC-
2-type transporter domain (IPR013525) that is ob-
served nearly 500 times in plant pathogenic species.
Another domain that is found in this cluster is the
dienelactone hydrolase domain (IPR002925), observed
in all plant pathogens and also in other eukaryotic
species, with a high abundance in plants as well as in
fungi. This domain hydrolyzes dienelactone to mal-
eylacetate in bacteria (Pathak et al., 1991) and is also
detected in a putative 1,3:1,4-β-glucanase from P.
infestans that is proposed to be involved in cell wall
metabolism (McLeod et al., 2003).
Quantification of Oomycete-Specific Bigrams
Domains generally do not act as single entities in
proteins but rather synergistically with other domains
in the same protein or with domains in interacting
proteins (Park et al., 2001; Vogel et al., 2004). Domains
involved in signaling, sensing, and generic interac-
tions are versatile and form combinations with several
different partner domains (Supplemental Table S6). As
described by others (Vogel et al., 2005), we observed
that the versatility of domains is proportional to their
abundance (Supplemental Fig. S3). Hence, we applied
a weighted bigram frequency that corrects for abun-
dance to detect domains that are promiscuous or
prone to form combinations with different partners
(Basu et al., 2008). The average number of promiscu-
ous domains in oomycetes is 424 and in Phytophthora is
464. This is higher than the average number of pro-
miscuous domains (357) over all other species (Sup-
plemental Table S7).
We observed that oomycetes have a higher number
of bigram types than species with a comparable num-
ber of domain types (Fig. 3). We identified in total
13,994 different bigram types throughout the 67 ana-
lyzed species. The majority of these bigram types (i.e.
7,724, or 55.2%) are predicted in only a single species.
In oomycetes, bigram types formed by domains that
are associated with transposable elements showed a
high abundance (Supplemental Tables S8 and S9). We
identified 1,107 bigram types occurring exclusively in
plant pathogens, the majority of which (773) are only
observed in the analyzed oomycetes (Supplemental
Table S10). These oomycete-specific bigram types are
Figure 6. Average linkage clustering of normalized domain profiles using Spearman rank correlation as a distance measurement.
The species tree for all eukaryotic species is depicted on top, with the color code of their supergroup as introduced in Figure 1.
Plant pathogens are marked with stars, and the arrowheads highlight domains identified as overrepresented in plant pathogens.
identified in total 1,511 times in 1,375 predicted pro-
teins. Of the 773 oomycete-specific bigram types, 53
are present in all oomycetes (Fig. 7A). The biggest
overlap in oomycete-specific bigram types is observed
between the Phytophthora species, especially between
P. ramorum and P. sojae. A recent analysis of domain
combinations in P. ramorum and P. sojae already revealed
several proteins involved in metabolism and
regulatory networks containing novel bigrams (Morris
et al., 2009). We additionally observed in total 43
bigram types that are shared either between P. infestans
and P. sojae or between P. infestans and P. ramorum.
However, the majority of oomycete-specific bigrams
(467) are specific for a single species. The number of
oomycete-specific bigram types highly exceeds the
number of oomycete-specific domain types (41). Inter-
estingly, only six of the oomycete-specific domains
participate in forming the specific bigrams. Therefore,
common domain types form the majority of the ob-
served species-specific domain combinations, empha-
sizing the importance of novel domain combinations
rather than novel domain types as a source for species-
specific functionality. Even when we selectively look at
the bigrams that occur at least twice in the same pro-
teome or once in at least two different proteomes, we
still observe 320 bigram types that are specific to
oomycetes and occur in 982 predicted proteins.
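The selection of group-specific bigram types and the stricter support filter described above can be sketched as follows. The sketch assumes that bigram occurrences have already been counted per species in a dictionary of `Counter` objects; the data layout and the function names are hypothetical and only illustrate the set logic.

```python
from collections import Counter


def group_specific_bigrams(bigrams_by_species, group):
    """Bigram types seen in at least one group member and in no other species."""
    inside, outside = set(), set()
    for species, counts in bigrams_by_species.items():
        target = inside if species in group else outside
        target.update(counts)
    return inside - outside


def well_supported(bigrams_by_species, group, candidates):
    """Keep candidates found twice in one proteome or in two or more proteomes."""
    kept = set()
    for bigram in candidates:
        counts = [bigrams_by_species.get(s, Counter())[bigram] for s in group]
        n_proteomes = sum(1 for c in counts if c > 0)
        if n_proteomes >= 2 or max(counts, default=0) >= 2:
            kept.add(bigram)
    return kept


# Toy data; species labels and counts are invented for illustration only.
toy = {
    "oomycete_1": Counter({("FYVE", "GAF"): 3, ("A", "B"): 1}),
    "oomycete_2": Counter({("FYVE", "GAF"): 1, ("C", "D"): 1}),
    "fungus_1": Counter({("C", "D"): 2}),
}
toy_group = {"oomycete_1", "oomycete_2"}
specific = group_specific_bigrams(toy, toy_group)
supported = well_supported(toy, toy_group, specific)
```

In the toy data, the bigram seen once in a single proteome passes the first step but is dropped by the support filter, which mirrors the reduction from all oomycete-specific bigram types to the better supported subset.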
Approximately 8% of the proteins containing an
oomycete-specific bigram have a predicted secretion
signal (9.2% of all oomycete proteins contain a predic-
ted secretion signal). An example that is observed in a
secreted putative Cys protease present in all analyzed
oomycetes is the combination of the peptidase C1A
domain (IPR000668) and the ML domain (IPR003172).
The ML domain is known to be involved in lipid
binding and innate immunity and has been observed
in plants, fungi, and animals (Inohara and Nuñez,
2002). The proteins containing this bigram also have an
N-terminal cathepsin inhibitory domain (IPR013201)
that is often found next to the peptidase C1A domain
and prevents access of the substrate to the binding cleft
(Groves et al., 1996). Another bigram that is found in
secreted proteins predicted in the analyzed Phytoph-
thora species is the combination of the carbohydrate-
binding domain family 25 (IPR005085; CBM25) with a
GH-31 domain (IPR000322) as well as the tandem
combination of CBM25 domains N terminal to the
glycosyl hydrolase domain. The presence of the se-
creted CBM25 and GH-31 combination has recently
been noted in Pythium ultimum (Lévesque et al., 2010).
We further tried to elucidate the presence of RXLR or
Crn motifs in proteins containing oomycete-specific
bigrams. We predicted the presence of one of these
motifs using individual HMMER models for both the
RXLR and the Crn motif (see “Materials and Me-
thods”). We overall predicted 746 proteins containing
an RXLR and 99 proteins with a Crn motif. None of
these proteins is predicted to contain an oomycete-
specific bigram type.
The most abundant oomycete-specific bigram type
that occurs in 64 proteins is a combination of the
phosphatidylinositol 3-phosphate-binding zinc finger
(FYVE type) and the GAF domain. The presence of this
oomycete-specific bigram in P. ramorum and P. sojae has
been noted before (Morris et al., 2009). The GAF
domain is described as one of the most abundant
domains in small-molecule-binding regulatory pro-
teins (Zoraghi et al., 2004). It is present in a large
number of different proteins with a wide range of
cellular functions, such as gene regulation (Aravind
and Ponting, 1997) and light detection and signaling
(Sharrock and Quail, 1989; Montgomery and Lagarias,
2002). In a typical eukaryotic domain composition,
the GAF domain is N terminal to the 3′,5′-cyclic
phosphodiesterase domain found in phosphodiesterases
that regulate pathways with cyclic
nucleotide-monophosphate as second messengers
(Sharrock and Quail, 1989; Martinez et al., 2002). This
organization is observed in total 111 times, and five
times in oomycetes (Fig. 7B). The GAF-FYVE bigram is
either observed as a single bigram (in 53 proteins) or in
combination with other domains (in 11 proteins), for
example with myosin (Richards and Cavalier-Smith,
2005). In P. infestans, two genes (PITG_07627 and
PITG_09293) encoding proteins with this combination
are induced early during infection of the plant (Sup-
plemental Table S4B). A phylogenetic analysis of the
GAF domain in eukaryotes and prokaryotes showed
that all GAF domains in oomycetes that are involved
in the fusion with FYVE exclusively cluster with pro-
karyotic GAF domains, whereas other GAFs also clu-
ster with eukaryotes. Hence, this suggests a horizontal
gene transfer from bacteria to oomycetes of those GAF
domains that are involved in the fusion with FYVE
(Fig. 7C; see “Materials and Methods”). The FYVE-
type zinc finger is not identified in prokaryotic species;
hence, we suggest two independent events, namely a
horizontal gene transfer of the GAF domain from
bacteria to oomycetes and subsequently a fusion to the
zinc finger domain. Horizontal gene transfer seems to
play an important role in the evolution of eukaryotes
(Keeling and Palmer, 2008), and recent evidence in-
dicates that these events also have a significant con-
tribution to the genome content of protists and
oomycetes, as they received genetic material from dif-
ferent sources (Richards and Talbot, 2007; Martens
et al., 2008; Morris et al., 2009). Because GAF domains
are known to be involved in many different cellular
processes, we can only speculate about the biological
function of proteins harboring the GAF-FYVE bigram.
A possible function is the targeting of proteins to lipid
layers by the zinc finger domain in response to second
messengers sensed by the GAF domain.
Several domains involved in phospholipid signaling
were found to be overrepresented in
the filamentous plant pathogens and in particular in
oomycetes. These included the phosphatidylinositol
3-/4-kinase, PIK (IPR000403), the phosphatidylinositol
4-phosphate 5-kinase domain, PIPK (IPR002498), as
Figure 7. A, Venn diagram depicting
the presence of oomycete-specific bi-
gram types in the analyzed oomycete
proteomes and indicating the number
of shared bigram types between dif-
ferent proteomes. The total number
of oomycete-specific bigram types in
each proteome is shown in parenthe-
ses. The Venn diagram was produced
using Venny (Oliveros, 2007). B, Do-
main architecture of example proteins
containing a GAF domain. The top two
architectures resemble common protein
architectures: the cGMP-dependent
3′,5′-cyclic phosphodiesterase
(observed 111 times in eukaryotes
and five times in oomycetes) and phy-
tochrome A (observed 21 times in
eukaryotes). The bottom two archi-
tectures depict oomycete-specific ar-
chitectures: the FYVE-GAF fusion is
observed 53 times independent of
other domains, and the myosin motor
head in combination with the FYVE-
GAF fusion is observed four times, a
single copy in each of the oomycetes
included in this study. aa, Amino acids.
C, Simplified evolutionary tree based
on the phylogenetic analysis of the
GAF domain in prokaryotes and eukar-
yotes. GAF domains from proteins with
a FYVE-GAF fusion are exclusively
found to be close to bacterial GAF
domains. Other oomycete proteins
that only contain the GAF domain
without the FYVE domain also cluster
with other eukaryotic sequences.
well as the phosphatidylinositol 3-phosphate-binding
FYVE. Novel domain compositions in proteins in-
volved in phospholipid signaling and metabolism in
Phytophthora species have been reported previously
(Meijer and Govers, 2006). Signaling domains like the
FYVE and the PIK, as well as domains like the IQ-
calmodulin-binding domain (IPR000048) and the
phox-like domain (IPR001683), form highly abundant
oomycete-specific bigram types (Supplemental Table
S10). Moreover, other domains, like the Ser/Thr pro-
tein kinase-like (IPR017442), pleckstrin homology
(IPR001849), and DEP (IPR000591) domains, are in-
volved in several oomycete-specific bigram types (e.g.
the DEP-Ser/Thr protein kinase-like domain fusion is
predicted in the proteomes of all analyzed oomycetes).
Additionally, domains that are components of the
histone acetylation-based regulatory system form oo-
mycete-specific bigrams, such as the AP2 (IPR001471)
and the histone deacetylase (IPR000286) domain com-
bination (Iyer et al., 2008), which is observed in P.
ramorum as well as in P. sojae.
DISCUSSION
We predicted the domain repertoire encoded in the
genomes of four oomycete plant pathogens and com-
pared it with a broad variety of eukaryotes spanning
all major groups, including several fungal plant path-
ogens that have a similar morphology, lifestyle, and
ecological niche as oomycete plant pathogens. We
quantified and examined domain properties observed
in oomycetes and especially emphasized differences
and common themes within fungal and oomycete
plant pathogens and their probable contribution to a
pathogenic lifestyle.
We observed that oomycete plant pathogens,
in particular Phytophthora species, have significantly
higher numbers of unique bigram types compared
with species with a similar number of domain types
(Fig. 2A). However, oomycetes also have on average
50% more predicted genes than most of the analyzed
fungi, but at the same time they encode a comparable
number of domain types and hence exhibit similar
domain diversity (Supplemental Fig. S1B). The high
number of genes observed in oomycetes suggests en-
larged complexity compared with fungi, which is not
directly obvious from the domain diversity but instead
from the number of unique bigram types (Supplemen-
tal Fig. S1C). This observation has two possible expla-
nations: (1) the larger number of genes predicted from
oomycete genomes provides the flexibility to form
new domain combinations based on a limited set of
already existing domains that are in quantities similar
to fungi; (2) the domain models that cover specific do-
mains are incomplete and therefore do not provide the
required sensitivity for oomycete genomes. Hence, we
would underestimate the number of observable do-
main types (and to a certain extent the number of
predicted bigram types). Additionally, oomycetes, es-
pecially Phytophthora species, are no longer following
the observed trend that organisms with a higher num-
ber of genes (proteins) contain a larger number of do-
main types. Consequently, they are shifted when
comparing the number of predicted domain and bi-
gram types. Nevertheless, both possible explanations
and the observed numbers allow us to conclude that
oomycete genomes, especially Phytophthora species,
harbor a large repertoire of genes encoding different
bigram types compared with species of comparable
complexity and, in the case of filamentous fungi, even
similar morphology.
Oomycetes and fungal plant pathogens seem to be
very similar to other eukaryotes with respect to abso-
lute domain abundance (Supplemental Table S2),
and this metric is hence not sufficiently indicative to
correlate domains directly or indirectly with the path-
ogenic lifestyle. Therefore, we predicted overrepre-
sented domains in plant pathogens and identified 246
domains that are significantly expanded (Supplemen-
tal Table S3). Proteins containing overrepresented
domains are significantly enriched in the predicted
secretome of the analyzed plant pathogens, corrobo-
rating the idea that expanded domain families are
involved in host-pathogen interaction and that these
proteins are mainly acting in the extracellular space. It
has to be noted that the presence of a predicted signal
peptide does not necessarily mean that these proteins
are found extracellularly, since some proteins are re-
tained in the endoplasmic reticulum/Golgi and hence
are not secreted (Bendtsen et al., 2004).
Since we anticipate that proteins that are directly
involved in host-pathogen interaction are differentially
regulated upon infection, we utilized the NimbleGen
microarray data of P. infestans (Haas et al., 2009) and
identified 259 induced/repressed genes encoding
proteins containing overrepresented domains. Genes
containing overrepresented domains are significantly
enriched within the set of differentially expressed
genes containing a predicted domain. Moreover, this
subset contains a significantly higher abundance of
genes with a predicted N-terminal signal peptide than
expected. These observations highlight and corrobo-
rate the initially emerging link between domain ex-
pansion and host-pathogen interaction.
The majority of the 246 expanded domains are
present in proteins that are involved in general carbo-
hydrate metabolism, nutrient uptake, signaling net-
works, and suppression of host responses and hence
might contribute to establishing and maintaining
pathogenesis (Fig. 4). The variety of overrepresented
domains involved in substrate transport over mem-
branes is of special interest. Filamentous plant patho-
gens and especially oomycetes exhibit a complex and
expanded repertoire of these domains, enabling them
to absorb nutrients from their environment and host.
The expression of P. infestans genes encoding ABC-2-
like transporters, amino acid transporters, and Na/Pi
cotransporter is induced early in infection of the plant,
suggesting that these proteins act during the biotro-
phic phase of infection. Several other genes encoding
proteins with a predicted extracellular localization are
induced during infection and contain overrepresented
domains. For example, three P. infestans genes encod-
ing the predicted N-terminal signal peptide as well as
FAD-linked oxidase and BBE domains are induced
during infection. The BBE domain is involved in the
biosynthesis of the alkaloid berberine (Facchini et al.,
1996). Moy et al. (2004) showed that a soybean homolog
of this gene is induced after infection with P. sojae.
Molecular studies of proteins containing BBE domains
in plants have indicated that several proteins contain-
ing these domains are in fact not secreted but instead
targeted to specific alkaloid biosynthetic vesicles
where the proteins accumulate (Amann et al., 1986;
Kutchan and Dittrich, 1995; Facchini et al., 1996). The
expansion of domain families with potential direct or
indirect roles in host-pathogen interaction in filamen-
tous plant pathogens strongly suggests adaptation to
their lifestyle at the genomic level.
In addition to known domains, the set of overrepre-
sented domains also revealed domains that, as yet,
have neither been implicated in pathogenicity nor been
functionally characterized. An example is the DUF953
domain, which, within plant pathogens, is mainly
found in oomycetes. This domain is observed in eukar-
yotic proteins with a thioredoxin-like function, and
P. infestans genes encoding these domains are differen-
tially expressed during infection. The significant ex-
pansion of these domains in plant pathogens, and the
fact that other well-described domains with a function
in plant pathogenicity are also overrepresented, make
proteins encoding poorly described but expanded do-
mains interesting candidates to decipher their role in
filamentous plant pathogens in general and oomycetes
in particular.
We determined domain overrepresentation on the
basis of species groups (plant pathogens and oomy-
cetes) rather than on the level of individual species. We
are aware that, as a consequence of this approach, we
might have identified domains as being overrepre-
sented in one group even if they do not need to be
present or expanded in all the members (Supplemental
Tables S3 and S5). Hence, we might falsely extrapolate
the functional role of a domain in a subset of species to
the whole group (e.g. a domain that is exclusively
found in plant pathogenic fungi and not in oomycetes
would still be overrepresented in the plant pathogenic
group). Especially when comparing oomycete with
fungal plant pathogens, the dominant expansion of
domain families within Phytophthora species over fam-
ilies in H. arabidopsidis might bias the inferred overrep-
resented domain (Supplemental Table S5). Since we in
general want to identify candidate domains that might
be directly or indirectly involved in host-pathogen
interaction, either at the level of filamentous plant
pathogens or oomycetes, we think our group-based
approach is appropriate to establish a set of candidate
proteins and domains.
Moreover, the clustering of presence, absence, and
expansion patterns of domains known or implicated to
be involved in a plant pathogenic lifestyle with do-
mains that have no known or direct connection to host-
pathogen interactions aids in expanding this set
of novel candidate domains (Fig. 5). For example,
DUF1949 is within our species selection exclusively
found in Phytophthora species and adopts a ferredoxin-
like fold. The N-terminal region of proteins containing
this domain shows similarity to another domain
(UPF00029) that has been found in the human Impact
protein. The P. infestans gene containing both domains
is induced early during infection of the plant, providing
additional, independent evidence for the possible role
of genes encoding this uncharacterized domain in host-
pathogen interaction. However, domains that are also
abundant in nonpathogenic species (e.g. other strame-
nopiles) might not be related to or only indirectly
involved in pathogenicity. Hence, the exact nature of
the contribution of these domains to pathogenesis or to
general lifestyle requires more in-depth experimental
studies of the candidate domains and genes predicted
to contain these functional entities.
Protein domains generally do not act as single enti-
ties but in synergy with other domains in the same
protein or with other domains in interacting proteins.
We identified 773 oomycete-specific bigrams, of which
53 are observed in all analyzed oomycetes (Fig. 7A;
Supplemental Table S10). Based on our species selec-
tion, we cannot conclude that the oomycete-specific
bigrams are common to all oomycetes, since they might
only be specific for plant pathogenic oomycetes or even
for the selected oomycetes analyzed in this study. The
majority of the 773 bigrams, however, are specific for a
subset of the tested oomycete species or even a single
species. The 320 bigram types that are observed in more
than a single species or twice in the same proteome are
observed in 982 predicted proteins. These bigrams are
less likely to be the result of a wrong gene annotation
and include already well-described examples of oo-
mycete-specific domain combinations, such as the
FYVE-PIK bigram observed in Phytophthora phospha-
tidylinositol kinases (Meijer and Govers, 2006), the
AP2-histone deacetylase bigram that is specifically
found in P. ramorum and P. infestans (Iyer et al., 2008),
and the myosin head domain-FYVE bigram as well as
the FYVE-GAF bigram found in myosin proteins in all
analyzed oomycetes (Richards and Cavalier-Smith,
2005). Still, some of the bigrams could be artificial due
to false negatives or false positives in the domain pre-
dictions. The remaining, species-specific bigrams could
be the result of artificial fusion of genes due to wrong
gene annotation or an actual biological signal in one of
the analyzed oomycete species. The derived results are
not only dependent on the quality of the genome
sequences of the analyzed oomycetes but also on that
of the other eukaryotes. Wrong predictions of bigrams
in these species would lead to false negatives in
oomycetes. Hence, the number of derived oomycete-
specific bigrams is only an approximation, and the true
set of oomycete-specific bigrams needs to be further
analyzed. Recent analyses of the underlying molecular
mechanisms of domain gain in animals have shown
that in fact gene fusion, tightly linked with gene dupli-
cation, is the major mechanism that shaped novel
protein architecture (Buljan et al., 2010; Marsh and
Teichmann, 2010). The contributions of this mechanism
in forming lineage- or even species-specific bigrams in
oomycetes and the probable role of the flexible ge-
nomes have to be further analyzed. The bigrams pre-
sented here form a comprehensive starting point for an
in-depth bioinformatic and experimental analysis of
promising gene families encoding novel domain
combinations.
Common domain types form the majority of the
observed oomycete-specific bigrams, emphasizing the
importance of novel combinations rather than novel
domain types as a source for species-specific function-
ality. Only a minority of proteins containing oomycete-
specific bigrams are secreted, and none of these proteins
is predicted to contain an RXLR or Crn motif. We
are aware that the total number of predicted proteins
containing the RXLR or Crn motif is lower than re-
ported in other studies where those were predicted
using multiple complementary methods (Haas et al.,
2009). However, when directly comparing the number
of proteins predicted to contain the RXLR motif by
HMMER alone, the reported numbers are similar to our
predictions. Together with the observation that RXLR
proteins do not contain known Pfam-A domains in the
C-terminal domain (Haas et al., 2009), our data are not
in conflict with RXLR protein predictions from previ-
ous studies. Of the known Crn genes in P. infestans, 40%
do not encode a secretion signal (Haas et al., 2009);
hence, these sequences are not considered in the prediction
of Crn motifs in our analysis, which explains the
discrepancy between the previously reported numbers
and our predictions. Haas et al. (2009) have reported
a huge number of different C-terminal structures in
P. infestans Crns that contained up to 36 different
domains, of which 33 are not described in Pfam. Several
of these domains induce necrosis in plants. Since we
focused in our analysis exclusively on Pfam domains,
we did not expect to find these proteins containing
specific bigrams.
The majority of proteins containing oomycete-
specific bigrams seem to be functional in the pathogen
cytoplasm. Moreover, domains involved in mediation
between macromolecules or lipids (e.g. the FYVE or
the phox-like domain) as well as signaling domains
(e.g. Ser/Thr kinase-like or the DEP domain) are highly
abundant in oomycete-specific bigrams. The Ser/Thr
kinase-like domain is overrepresented in oomycetes
compared with fungal plant pathogens and is particu-
larly expanded within the Phytophthora species (Sup-
plemental Table S5). This expanded repertoire together
with the high abundance of this domain in oomycete-
specific bigrams strongly suggests that oomycetes have
the capacity to recombine existing signaling pathways
in a novel and complicated network that is distinct from
other eukaryotes. This might also be true for other
interaction networks, since several domains mediating
interactions between macromolecules (e.g. DNA-bind-
ing zinc finger [IPR007087] or protein-protein interac-
tion like WW/Rsp5/WWP [IPR001202]) are also highly
abundant in oomycete-specific bigrams. Whether this
reflects a general phenomenon in all oomycetes, is
specific for the plant pathogenic species analyzed in this
study, or holds only for Phytophthora species can only be
answered when more oomycetes, including sapro-
phytes and pathogens with different hosts, have been
sequenced.
We outlined a complex but comprehensive picture of
the domain repertoire of filamentous plant pathogens
focusing on oomycetes and showed how differences
compared with other eukaryotes are reflecting the
biology of these groups of organisms. Especially the
expansion of certain domain families is directly linked
with the lifestyle of oomycete plant pathogens and
allowed the generation of a set of candidate domains
likely to play important roles in the interaction with the
plant host. Proteins containing overrepresented do-
mains are enriched in the predicted secretome of the
analyzed species. Moreover, expression analysis during
infection of the plant revealed a significant enrichment of
genes encoding overrepresented domains within the
differentially expressed genes. Furthermore, we ob-
served a significantly higher than expected abundance
of genes encoding a signal peptide within the set of
differentially expressed genes containing expanded
domains. This provides additional, independent evidence
for the biological significance of our observations.
Furthermore, oomycete genomes encode a set of
proteins containing oomycete-specific domain combi-
nations that are formed by common domain types and
include several domains involved in signaling and/or
mediation of interactions between macromolecules.
Oomycetes, therefore, might possess altered regulatory
and signaling networks that differ from other eukaryotes.
Whether the described and discussed differences in the
domain repertoire of oomycetes have a direct influence
on plant pathogenicity or are generally useful in these
organisms needs to be analyzed further. Nevertheless,
they provide promising starting points that will aid our
understanding of the biology of oomycetes in general
and plant pathogens in particular.
MATERIALS AND METHODS
Species Used in the Analysis
In the performed analysis, 67 eukaryotic species representing four of the
five eukaryotic supergroups (excluding Rhizaria) were considered (Fig. 1A;
for species abbreviations, see Supplemental Table S1). We used the predicted
best model proteomes for all subsequent analyses.
Identification of Domain Composition
We predicted the domain repertoire of all proteins encoded in the diverse
genomes using hmmpfam (HMMER package version 2.3.2) and a local Pfam-
A database (version 23). We applied a domain model-specific gathering cutoff
and used HMM models that are optimized to search for full-length entities in
the query sequence.
In order to obtain the nonoverlapping domain architecture of multido-
main proteins, we resolved overlapping domains according to certain rules.
We defined two domains as overlapping if more than 10% of the predicted
domain locations were overlapping (based on the relative length of the
domains). If, in the case of overlapping domains, the e-value difference was
larger than 5 (on a -log10 scale), we kept the domain with the highest
e-value. In cases where the difference was smaller, we kept the longest
model. If both overlapping models had the same length, we considered
differences in e-value and bit score. In the case of the Pfam-based predictions
for 15 proteins, the applied rules did not resolve overlapping entities.
Therefore, we considered the Conserved Domain Database (version 2.16)
superfamily annotation, which automatically clusters domain entities that
resemble evolutionarily related domains. If both domains corresponded to
the same family, we chose one entity.
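The overlap rules can be sketched in a few lines. Several details are assumptions made only for illustration: the overlap fraction is taken relative to the shorter of the two hits, "highest e-value" is read as the highest value on the -log10 scale (the most significant hit), the model length is supplied as a separate field, and overlapping pairs are resolved one at a time until none remain. The `Hit` class and all function names are hypothetical; the final Conserved Domain Database step is not modeled.

```python
from dataclasses import dataclass
from math import log10


@dataclass(frozen=True)
class Hit:
    name: str
    start: int          # 1-based, inclusive
    end: int            # inclusive
    evalue: float
    bitscore: float
    model_length: int   # length of the domain model (assumed to be known)


def overlaps(a, b, threshold=0.10):
    """True if the shared region exceeds 10% of the shorter hit (assumption)."""
    shared = min(a.end, b.end) - max(a.start, b.start) + 1
    if shared <= 0:
        return False
    shorter = min(a.end - a.start + 1, b.end - b.start + 1)
    return shared / shorter > threshold


def neg_log10(evalue):
    """-log10 of an e-value, guarded against an e-value of exactly zero."""
    return -log10(max(evalue, 1e-300))


def pick(a, b):
    """Choose the hit to keep from an overlapping pair."""
    score_a, score_b = neg_log10(a.evalue), neg_log10(b.evalue)
    if abs(score_a - score_b) > 5:
        return a if score_a > score_b else b
    if a.model_length != b.model_length:
        return a if a.model_length > b.model_length else b
    return max(a, b, key=lambda h: (neg_log10(h.evalue), h.bitscore))


def resolve(hits):
    """Drop the losing hit of each overlapping pair until no overlaps remain."""
    remaining = list(hits)
    changed = True
    while changed:
        changed = False
        for i in range(len(remaining)):
            for j in range(i + 1, len(remaining)):
                if overlaps(remaining[i], remaining[j]):
                    winner = pick(remaining[i], remaining[j])
                    del remaining[j if winner is remaining[i] else i]
                    changed = True
                    break
            if changed:
                break
    return sorted(remaining, key=lambda h: h.start)
```

The pairwise loop is quadratic and its outcome can depend on the order in which pairs are visited, which is acceptable for a sketch but would need care in a production pipeline.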
Based on the nonoverlapping domain architecture, we derived different
metrics for each proteome. We counted the abundance for every domain and
the resulting number of different domain types per analyzed proteome. We
defined domain bigrams as two consecutively located domains in a single
protein. We discriminated between reciprocal domain pairs, so that the bigram
(A|B) is not identical to (B|A), and took repeating domains into account, such
as (A|A). Based on the set of bigrams, we also determined the versatility of all
individual domains in a given proteome, which is defined by the number of
different direct N- and C-terminal partners, also including reciprocal and self-
repeated pairs.
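As a minimal sketch of these metrics, the Python function below derives domain abundance, ordered bigram counts, and versatility from nonoverlapping architectures. The input layout (a dictionary mapping hypothetical protein identifiers to N-to-C ordered domain lists) is an assumption, as is the reading of versatility as the number of distinct bigram types a domain participates in, so that (A|B), (B|A), and (A|A) each count once for A.

```python
from collections import Counter
from typing import Dict, List, Tuple


def proteome_metrics(
    proteome: Dict[str, List[str]],
) -> Tuple[Counter, Counter, Counter]:
    """Return (domain abundance, bigram counts, versatility) for one proteome."""
    abundance: Counter = Counter()
    bigram_counts: Counter = Counter()
    for architecture in proteome.values():
        abundance.update(architecture)
        # Consecutive domains form ordered bigrams: (A, B) differs from (B, A).
        bigram_counts.update(zip(architecture, architecture[1:]))

    versatility: Counter = Counter()
    for left, right in bigram_counts:        # iterate over distinct bigram types
        versatility[left] += 1
        if right != left:                    # a self-repeat (A, A) counts once
            versatility[right] += 1
    return abundance, bigram_counts, versatility


example = {"p1": ["A", "B"], "p2": ["B", "A", "A"], "p3": ["C"]}
abundance, bigram_counts, versatility = proteome_metrics(example)
n_domain_types = len(abundance)
n_bigram_types = len(bigram_counts)
```

Single-domain proteins such as `p3` contribute to abundance but form no bigram, so their domain has a versatility of zero.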
Prediction of Secreted Proteins
Secreted proteins were predicted using SignalP (version 3.0; Bendtsen
et al., 2004) in combination with TMHMM (version 2.0; Krogh et al., 2001). We
restricted the analysis to the first 70 amino acids of the protein and accepted
signal peptide predictions if both the neural network and the HMM implemented
in SignalP predicted the presence of a signal peptide under default
parameters. Moreover, we rejected predicted signal peptides if TMHMM
predicted more than one transmembrane region in the protein. If only a single
transmembrane helix was predicted and the predicted region overlapped
with the SignalP prediction for more than 10 amino acids and was
positioned within the first 35 amino acids from the start, we included the
protein in the set of secreted proteins.
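The decision rule can be written as a small predicate. The sketch below assumes that the SignalP and TMHMM outputs have already been parsed into plain values; the function and parameter names are hypothetical. It also assumes that a single helix failing the overlap or position test excludes the protein, and that "positioned within the first 35 amino acids" means the whole helix ends at or before residue 35.

```python
from typing import List, Tuple


def is_secreted(
    nn_positive: bool,
    hmm_positive: bool,
    signal_peptide_end: int,
    tm_helices: List[Tuple[int, int]],
    min_overlap: int = 10,
    max_helix_end: int = 35,
) -> bool:
    """Combine parsed SignalP and TMHMM calls into one secretion call.

    signal_peptide_end: last residue of the predicted signal peptide (1-based).
    tm_helices: (start, end) of each predicted transmembrane helix (1-based).
    """
    # Both SignalP methods must agree on the presence of a signal peptide.
    if not (nn_positive and hmm_positive):
        return False
    # More than one transmembrane region: treated as a membrane protein.
    if len(tm_helices) > 1:
        return False
    if len(tm_helices) == 1:
        start, end = tm_helices[0]
        # Residues shared by the helix and the signal peptide (1..signal_peptide_end).
        overlap = min(end, signal_peptide_end) - max(start, 1) + 1
        # A lone N-terminal helix is likely the signal peptide itself.
        return overlap > min_overlap and end <= max_helix_end
    return True
```

For example, a protein with a signal peptide ending at residue 22 and one helix at residues 5 to 27 is kept, whereas the same protein with a helix at residues 150 to 172 is not.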
Domain Overrepresentation
Domain overrepresentation was calculated using a one-sided Fisher’s exact
test. The derived P values were Bonferroni corrected for multiple testing by
multiplying each P value by the number of conducted tests. The corrected
P values were compared with α = 0.001 to infer domain overrepresentation.
For the overrepresented domains in oomycete plant pathogens compared with
fungal plant pathogens, we considered domains that occur at least once in a
single plant pathogen but nevertheless could also occur in other eukaryotic
species.
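To make the test concrete, the following standard-library sketch computes a one-sided (enrichment) Fisher's exact test from the hypergeometric distribution and applies the Bonferroni correction at α = 0.001. The layout of the 2 × 2 table (domain versus all other domains, foreground versus background proteomes) and the function names are assumptions for illustration, and capping the corrected P value at 1 is a convenience that the text does not mention.

```python
from math import comb
from typing import Dict, Tuple


def fisher_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for enrichment in the foreground.

    a: occurrences of the domain in the foreground
    b: occurrences of all other domains in the foreground
    c: occurrences of the domain in the background
    d: occurrences of all other domains in the background
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    # Sum hypergeometric probabilities of tables at least as extreme as observed.
    tail = sum(
        comb(col1, x) * comb(n - col1, row1 - x)
        for x in range(a, min(row1, col1) + 1)
    )
    return tail / comb(n, row1)


def overrepresented(
    tables: Dict[str, Tuple[int, int, int, int]], alpha: float = 0.001
) -> Dict[str, float]:
    """Return Bonferroni-corrected P values of domains passing the cutoff."""
    n_tests = len(tables)
    significant = {}
    for domain, (a, b, c, d) in tables.items():
        corrected = min(1.0, fisher_greater(a, b, c, d) * n_tests)
        if corrected < alpha:
            significant[domain] = corrected
    return significant
```

Exact binomial coefficients keep the sketch dependency-free but become slow for genome-scale counts; a statistics library would be the practical choice there.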
Gene Expression Analysis of Phytophthora infestans
We extracted NimbleGen expression data of P. infestans during infection of
potato (Solanum tuberosum) 2 to 5 dpi from the Gene Expression Omnibus
(http://www.ncbi.nlm.nih.gov/geo/). The setup and initial analysis of the
NimbleGen data are described by Haas et al. (2009). The log2-transformed and
mean-centered array intensities were analyzed for differential expression
using Multiexperiment Viewer (Saeed et al., 2006). The t tests were conducted
between two groups (group A, different media types; group B, replicates for
1 dpi). The test was applied for each day after inoculation, and significantly
up-/down-regulated genes were reported applying a P value cutoff of 0.05. False
discovery rates were addressed using R and the qvalue package by computing
q values for each of the comparisons and subsequently applying a q value
cutoff of 0.05 (Storey and Tibshirani, 2003; R Development Core Team, 2010).
Visualization of the heat maps was done using R and the Bioconductor
package utilizing Spearman correlation as a distance measurement and
hierarchical clustering (average linkage; Gentleman et al., 2004). Gene ex-
pression intensities relative to the average expression intensities in media
types (V8, RS, Pea) were computed in R.
Clustering of Domain Profiles
We created abundance profiles for each domain based on the abundance in
each individual proteome. We excluded domains that were only identified in a
single species. The rows (domains) were multiplied by a scaling factor so that
the sum of squares was 1, and subsequently the columns (species) were
normalized in the same way. We performed a hierarchical clustering (average
linkage) of the profiles using the Spearman correlation matrix as a distance
measurement. The normalization and clustering were performed using Cluster
(Eisen et al., 1998), and the visualization was done using TreeView
(http://rana.lbl.gov/EisenSoftware.htm).
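The two numerical steps before clustering, normalization of the profiles and the Spearman-based distance, can be sketched in plain Python as below. This is not the Cluster implementation; the helper names are hypothetical, and taking the distance as 1 minus the Spearman correlation is an assumption. The average-linkage clustering itself is not shown.

```python
from math import sqrt
from typing import List, Sequence

Matrix = List[List[float]]


def _unit_rows(matrix: Matrix) -> Matrix:
    # Scale every row so that its sum of squares is 1 (all-zero rows stay zero).
    scaled = []
    for row in matrix:
        norm = sqrt(sum(v * v for v in row))
        scaled.append([v / norm if norm else 0.0 for v in row])
    return scaled


def normalize(matrix: Matrix) -> Matrix:
    """Normalize rows (domains) first, then columns (species)."""
    rows_done = _unit_rows(matrix)
    columns_done = _unit_rows([list(col) for col in zip(*rows_done)])
    return [list(row) for row in zip(*columns_done)]


def average_ranks(values: Sequence[float]) -> List[float]:
    # Rank from 1; tied values share the mean of their rank positions.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """1 - Spearman correlation, computed as Pearson correlation of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var = sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)
    correlation = cov / sqrt(var) if var else 0.0
    return 1.0 - correlation
```

Because the columns are rescaled after the rows, the rows of the final matrix are in general no longer exactly of unit length; only the column sums of squares equal 1.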
Domain Promiscuity
We calculated the domain promiscuity for every domain in the analyzed
species based on weighted bigram frequency (Basu et al., 2008). We took a
relatively moderate cutoff for determining promiscuous domains; every domain
with a higher promiscuity score than a domain that is only present once in
the genome and is participating in one bigram type is called promiscuous.
Prediction of the RXLR and Crn Motifs in Oomycetes
We identified the presence of the RXLR motif in all predicted proteins in the
analyzed oomycetes using three different HMMER models (R.H.Y. Jiang,
personal communication). The first model was created using Phytophthora
ramorum and Phytophthora sojae RXLRs and included the RXLR motif itself and
10 amino acids downstream and upstream of the motif. The two other models
were based separately on RXLRs from P. infestans and Hyaloperonospora
arabidopsidis and included 10 amino acids upstream from the RXLR motif and five
amino acids downstream of the DEER motif. We used HMMER (hmmsearch)
with an e-value cutoff of 10 and subsequently combined all predictions.
Furthermore, we demanded the presence of a predicted signal peptide (SignalP)
cleavage site within the first 30 amino acids of the protein, the gap between the
cleavage site and the start of the motif to be 30 or less, the start of the motif to be
within the first 100 amino acids of the protein, and the starting position of the
RXLR motif to be downstream of the cleavage site. For the identification of the
Crn LFLAK motif, we used a HMMER model of that region (B.J. Haas, personal
communication) and the same sequence demands as for the RXLRs.
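The positional requirements placed on each HMMER match can be expressed as a single predicate. The sketch below assumes that the cleavage site is given as the 1-based position of the last signal peptide residue and that the motif start comes from the parsed hmmsearch hit; both names are hypothetical, and the same check would apply to the Crn LFLAK motif.

```python
from typing import Optional


def passes_motif_filters(
    cleavage_site: Optional[int],
    motif_start: Optional[int],
    max_cleavage_site: int = 30,
    max_gap: int = 30,
    max_motif_start: int = 100,
) -> bool:
    """Apply the positional demands on a candidate RXLR (or LFLAK) match."""
    # A predicted signal peptide and a motif hit are both required.
    if cleavage_site is None or motif_start is None:
        return False
    gap = motif_start - cleavage_site
    return (
        cleavage_site <= max_cleavage_site      # cleavage within first 30 residues
        and motif_start > cleavage_site         # motif downstream of cleavage site
        and gap <= max_gap                      # at most 30 residues in between
        and motif_start <= max_motif_start      # motif within first 100 residues
    )
```

A protein with a cleavage site at residue 22 and a motif starting at residue 48 passes, whereas a motif starting at residue 60 fails the gap criterion.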
Phylogenetic Analysis of the GAF Domain
We derived all sequences containing a GAF domain from the selected
proteomes and extracted the amino acid sequence of the domain based on the
start and end points of the domain model. We conducted a similarity search
with the extracted domains using BLASTP (version 2.2.20) with an e-value
cutoff of 1 × 10⁻⁵ and a low-complexity filter against a set of 295 bacterial
predicted proteomes (downloaded from the National Center for Biotechnology
Information ftp server on January 27, 2009). In the homologs that were
obtained, domains were predicted using hmmpfam as described above.
Subsequently, prokaryotic GAF domains were extracted and aligned together
with the eukaryotic domains using mafft (version 6.713b) with the local
alignment strategy (Katoh et al., 2002). A phylogenetic tree was constructed
with RAxML (version 7.0.4) using the GAMMA model of rate heterogeneity
and the WAG amino acid substitution matrix (Stamatakis, 2006).
Supplemental Data
The following materials are available in the online version of this article.
Supplemental Figure S1. Dependence of the number of domain types,
bigram types, and proteome sizes.
Supplemental Figure S2. Size of the predicted secretome of the 67
analyzed eukaryotes.
Supplemental Figure S3. Dependence of versatility and abundance of the
analyzed domains.
Supplemental Table S1. Summary of the eukaryotic species analyzed in
this study.
Supplemental Table S2. Domain abundance reported for all predicted
Pfam domains.
Supplemental Table S3. Overrepresented domains in plant pathogens.
Supplemental Table S4. Differentially expressed genes in P. infestans.
Supplemental Table S5. Overrepresented domains in oomycetes.
Supplemental Table S6. Domain versatility reported for all predicted
Pfam domains.
Supplemental Table S7. Domain promiscuity reported for all predicted
Pfam domains.
Supplemental Table S8. Bigram abundance reported for all species, plant
pathogens, and plant pathogenic oomycetes.
Supplemental Table S9. Bigram abundance (excluding self-repeated
domains) reported for all species, plant pathogens, and plant pathogenic
oomycetes.
Supplemental Table S10. Summary of the oomycete-specific bigrams.
Supplemental File S1. TreeView file containing the clustered domain
abundance profiles.
ACKNOWLEDGMENTS
We thank Lidija Berke, John van Dam, and Jos Boekhorst for fruitful
discussion and comments on the manuscript as well as Rui Peng Wang for
support with the P. infestans gene expression data. We also thank Harold J.G.
Meijer for discussion of fusion proteins in P. infestans, Rays H.Y. Jiang for
providing the RXLR-HMMER model, and Brian J. Haas for the Crn LFLAK-
HMMER model. Some of the sequence data and annotation were produced
by the U.S. Department of Energy Joint Genome Institute
(http://www.jgi.doe.gov), the Broad Institute of Harvard and the
Massachusetts Institute of Technology (http://www.broadinstitute.org), or the
Stanford Genome Technology Center (http://med.stanford.edu/sgtc/) in
collaboration with the
user community (for detailed information, see Supplemental Table S1).
Received October 19, 2010; accepted November 24, 2010; published November
30, 2010.
LITERATURE CITED
Agrios GN (2005) Plant Pathology, Ed 5. Academic Press, New York
Amann M, Wanner G, Zenk MH (1986) Intracellular compartmentation of
two enzymes of berberine biosynthesis in plant cell cultures. Planta 167:
310–320
Aravind L, Ponting CP (1997) The GAF domain: an evolutionary link
between diverse phototransducing proteins. Trends Biochem Sci 22:
458–459
Baldauf SL, Roger AJ, Wenk-Siefert I, Doolittle WF (2000) A kingdom-
level phylogeny of eukaryotes based on combined protein data. Science
290: 972–977
Bashton M, Chothia C (2007) The generation of new protein functions by
the combination of domains. Structure 15: 85–99
Basu MK, Carmel L, Rogozin IB, Koonin EV (2008) Evolution of protein
domain promiscuity in eukaryotes. Genome Res 18: 449–461
Baurain D, Brinkmann H, Petersen J, Rodríguez-Ezpeleta N, Stechmann
A, Demoulin V, Roger AJ, Burger G, Lang BF, Philippe H (2010)
Phylogenomic evidence for separate acquisition of plastids in
cryptophytes, haptophytes, and stramenopiles. Mol Biol Evol 27: 1698–1709
Bendtsen JD, Nielsen H, von Heijne G, Brunak S (2004) Improved
prediction of signal peptides: SignalP 3.0. J Mol Biol 340: 783–795
Blair JE, Coffey MD, Park SY, Geiser DM, Kang S (2008) A multi-locus
phylogeny for Phytophthora utilizing markers derived from complete
genome sequences. Fungal Genet Biol 45: 266–277
Buljan M, Frankish A, Bateman A (2010) Quantifying the mechanisms of
domain gain in animal proteins. Genome Biol 11: R74
Cosentino Lagomarsino M, Sellerio AL, Heijning PD, Bassetti B (2009)
Universal features in the genome-level evolution of protein domains.
Genome Biol 10: R12
Dean RA, Talbot NJ, Ebbole DJ, Farman ML, Mitchell TK, Orbach MJ,
Thon M, Kulkarni R, Xu JR, Pan H, et al (2005) The genome sequence of
the rice blast fungus Magnaporthe grisea. Nature 434: 980–986
Doolittle RF (1995) The multiplicity of domains in proteins. Annu Rev
Biochem 64: 287–314
Eddy SR (1998) Profile hidden Markov models. Bioinformatics 14: 755–763
Eisen MB, Spellman PT, Brown PO, Botstein D (1998) Cluster analysis and
display of genome-wide expression patterns. Proc Natl Acad Sci USA
95: 14863–14868
Erwin DC, Ribeiro OK (1996) Phytophthora Diseases Worldwide. American
Phytopathological Society, St. Paul
Facchini PJ, Penzes C, Johnson AG, Bull D (1996) Molecular characteri-
zation of berberine bridge enzyme genes from opium poppy. Plant
Physiol 112: 1669–1677
Finn RD, Tate J, Mistry J, Coggill PC, Sammut SJ, Hotz HR, Ceric G,
Forslund K, Eddy SR, Sonnhammer ELL, et al (2008) The Pfam protein
families database. Nucleic Acids Res 36: D281–D288
Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S,
Ellis B, Gautier L, Ge Y, Gentry J, et al (2004) Bioconductor: open
software development for computational biology and bioinformatics.
Genome Biol 5: R80
Gijzen M, Nürnberger T (2006) Nep1-like proteins from plant pathogens:
recruitment and diversification of the NPP1 domain across taxa.
Phytochemistry 67: 1800–1807
Govers F, Gijzen M (2006) Phytophthora genomics: the plant destroyers’
genome decoded. Mol Plant Microbe Interact 19: 1295–1301
Groves MR, Taylor MA, Scott M, Cummings NJ, Pickersgill RW, Jenkins
JA (1996) The prosequence of procaricain forms an alpha-helical domain
that prevents access to the substrate-binding cleft. Structure 4: 1193–
1203
Haas BJ, Kamoun S, Zody MC, Jiang RHY, Handsaker RE, Cano LM,
Grabherr M, Kodira CD, Raffaele S, Torto-Alalibo T, et al (2009)
Genome sequence and analysis of the Irish potato famine pathogen
Phytophthora infestans. Nature 461: 393–398
He SY, Collmer A (1990) Molecular cloning, nucleotide sequence, and
marker exchange mutagenesis of the exo-poly-alpha-D-galacturonosi-
dase-encoding pehX gene of Erwinia chrysanthemi EC16. J Bacteriol
172: 4988–4995
Inohara N, Nuñez G (2002) ML: a conserved domain involved in innate
immunity and lipid metabolism. Trends Biochem Sci 27: 219–221
Itoh T, Hashimoto W, Mikami B, Murata K (2006) Substrate recognition by
unsaturated glucuronyl hydrolase from Bacillus sp. GL1. Biochem
Biophys Res Commun 344: 253–262
Iyer LM, Anantharaman V, Wolf MY, Aravind L (2008) Comparative
genomics of transcription factors and chromatin proteins in parasitic
protists and other eukaryotes. Int J Parasitol 38: 1–31
James TY, Kauff F, Schoch CL, Matheny PB, Hofstetter V, Cox CJ, Celio G,
Gueidan C, Fraker E, Miadlikowska J, et al (2006) Reconstructing the
early evolution of fungi using a six-gene phylogeny. Nature 443: 818–822
Katoh K, Misawa K, Kuma K, Miyata T (2002) MAFFT: a novel method for
rapid multiple sequence alignment based on fast Fourier transform.
Nucleic Acids Res 30: 3059–3066
Kawamukai M, Utsumi R, Takeda K, Higashi A, Matsuda H, Choi YL,
Komano T (1991) Nucleotide sequence and characterization of the sfs1
gene: sfs1 is involved in CRP*-dependent mal gene expression in
Escherichia coli. J Bacteriol 173: 2644–2648
Keeling PJ, Palmer JD (2008) Horizontal gene transfer in eukaryotic
evolution. Nat Rev Genet 9: 605–618
Klis FM, Sosinska GJ, de Groot PW, Brul S (2009) Covalently linked cell
wall proteins of Candida albicans and their role in fitness and virulence.
FEMS Yeast Res 9: 1013–1028
Krogh A, Larsson B, von Heijne G, Sonnhammer EL (2001) Predicting
transmembrane protein topology with a hidden Markov model: appli-
cation to complete genomes. J Mol Biol 305: 567–580
Kulkarni RD, Kelkar HS, Dean RA (2003) An eight-cysteine-containing
CFEM domain unique to a group of fungal membrane proteins. Trends
Biochem Sci 28: 118–121
Kutchan TM, Dittrich H (1995) Characterization and mechanism of the
berberine bridge enzyme, a covalently flavinylated oxidase of benzo-
phenanthridine alkaloid biosynthesis in plants. J Biol Chem 270: 24475–
24481
Latijnhouwers M, de Wit PJGM, Govers F (2003) Oomycetes and fungi:
similar weaponry to attack plants. Trends Microbiol 11: 462–469
Lévesque CA, Brouwer H, Cano L, Hamilton JP, Holt C, Huitema E,
Raffaele S, Robideau GP, Thines M, Win J, et al (2010) Genome
sequence of the necrotrophic plant pathogen Pythium ultimum reveals
original pathogenicity mechanisms and effector repertoire. Genome Biol
11: R73
Liu Z, Bos JI, Armstrong M, Whisson SC, da Cunha L, Torto-Alalibo T,
Win J, Avrova AO, Wright F, Birch PR, et al (2005) Patterns of
diversifying selection in the phytotoxin-like scr74 gene family of
Phytophthora infestans. Mol Biol Evol 22: 659–672
Marcotte EM, Pellegrini M, Ng HL, Rice DW, Yeates TO, Eisenberg D
(1999) Detecting protein function and protein-protein interactions from
genome sequences. Science 285: 751–753
Marsh JA, Teichmann SA (2010) How do proteins gain new domains?
Genome Biol 11: 126
Martens C, Vandepoele K, Van de Peer Y (2008) Whole-genome analysis
reveals molecular innovations and evolutionary transitions in chromal-
veolate species. Proc Natl Acad Sci USA 105: 3427–3432
Martinez SE, Beavo JA, Hol WGJ (2002) GAF domains: two-billion-
year-old molecular switches that bind cyclic nucleotides. Mol Interv 2:
317–323
McLeod A, Smart CD, Fry WE (2003) Characterization of 1,3-beta-glucanase
and 1,3;1,4-beta-glucanase genes from Phytophthora infestans. Fungal
Genet Biol 38: 250–263
Meijer HJG, Govers F (2006) Genomewide analysis of phospholipid
signaling genes in Phytophthora spp.: novelties and a missing link. Mol
Plant Microbe Interact 19: 1337–1347
Montgomery BL, Lagarias JC (2002) Phytochrome ancestry: sensors of
bilins and light. Trends Plant Sci 7: 357–366
Morris PF, Schlosser LR, Onasch KD, Wittenschlaeger T, Austin R,
Provart N (2009) Multiple horizontal gene transfer events and domain
fusions have created novel regulatory and metabolic networks in the
oomycete genome. PLoS ONE 4: e6133
Moy P, Qutob D, Chapman BP, Atkinson I, Gijzen M (2004) Patterns of
gene expression upon infection of soybean plants by Phytophthora sojae.
Mol Plant Microbe Interact 17: 1051–1062
Okamura K, Hagiwara-Takeuchi Y, Li T, Vu TH, Hirai M, Hattori M,
Sakaki Y, Hoffman AR, Ito T (2000) Comparative genome analysis of
the mouse imprinted gene impact and its nonimprinted human homo-
log IMPACT: toward the structural basis for species-specific imprinting.
Genome Res 10: 1878–1889
Oliveros JC (2007) VENNY: An Interactive Tool for Comparing Lists with
Venn Diagrams. http://bioinfogp.cnb.csic.es/tools/venny/index.html
(October 7, 2010)
Orengo CA, Michie AD, Jones S, Jones DT, Swindells MB, Thornton JM
(1997) CATH: a hierarchic classification of protein domain structures.
Structure 5: 1093–1108
Orsomando G, Lorenzi M, Raffaelli N, Dalla Rizza M, Mezzetti B,
Ruggieri S (2001) Phytotoxic protein PcF, purification, characterization,
and cDNA sequencing of a novel hydroxyproline-containing factor
secreted by the strawberry pathogen Phytophthora cactorum. J Biol Chem
276: 21578–21584
Park F, Gajiwala K, Eroshkina G, Furlong E, He D, Batiyenko Y, Romero
R, Christopher J, Badger J, Hendle J, et al (2004) Crystal structure of
YIGZ, a conserved hypothetical protein from Escherichia coli K12 with a
novel fold. Proteins 55: 775–777
Park J, Lappe M, Teichmann SA (2001) Mapping protein family interac-
tions: intramolecular and intermolecular protein family interaction
repertoires in the PDB and yeast. J Mol Biol 307: 929–938
Pathak D, Ashley G, Ollis D (1991) Thiol protease-like active site found in
the enzyme dienelactone hydrolase: localization using biochemical,
genetic, and structural tools. Proteins 9: 267–279
Pellegrini M, Marcotte EM, Thompson MJ, Eisenberg D, Yeates TO (1999)
Assigning protein functions by comparative genome analysis: protein
phylogenetic profiles. Proc Natl Acad Sci USA 96: 4285–4288
Raffaele S, Win J, Cano L, Kamoun S (2010) Analyses of genome archi-
tecture and gene expression reveal novel candidate virulence factors in
the secretome of Phytophthora infestans. BMC Genomics 11: 637
R Development Core Team (2010) R: A Language and Environment for
Statistical Computing. R Foundation for Statistical Computing, Vienna
Richards TA, Cavalier-Smith T (2005) Myosin domain evolution and the
primary divergence of eukaryotes. Nature 436: 1113–1118
Richards TA, Talbot NJ (2007) Plant parasitic oomycetes such as Phytoph-
thora species contain genes derived from three eukaryotic lineages. Plant
Signal Behav 2: 112–114
Rossmann MG, Moras D, Olsen KW (1974) Chemical and biological
evolution of nucleotide-binding protein. Nature 250: 194–199
Ruttkowski E, Labitzke R, Khanh NQ, Löffler F, Gottschalk M, Jany KD (1990)
Cloning and DNA sequence analysis of a polygalacturonase cDNA from
Aspergillus niger RH5344. Biochim Biophys Acta 1087: 104–106
Saeed AI, Bhagabati NK, Braisted JC, Liang W, Sharov V, Howe EA, Li J,
Thiagarajan M, White JA, Quackenbush J (2006) TM4 microarray
software suite. Methods Enzymol 411: 134–193
Sharrock RA, Quail PH (1989) Novel phytochrome sequences in Arabi-
dopsis thaliana: structure, evolution, and differential expression of a
plant regulatory photoreceptor family. Genes Dev 3: 1745–1757
Simpson AGB, Roger AJ (2004) The real ‘kingdoms’ of eukaryotes. Curr
Biol 14: R693–R696
Stamatakis A (2006) RAxML-VI-HPC: maximum likelihood-based phylo-
genetic analyses with thousands of taxa and mixed models. Bioinfor-
matics 22: 2688–2690
Storey JD, Tibshirani R (2003) Statistical significance for genomewide
studies. Proc Natl Acad Sci USA 100: 9440–9445
Terashima H, Fukuchi S, Nakai K, Arisawa M, Hamada K, Yabuki N,
Kitada K (2002) Sequence-based approach for identification of cell wall
proteins in Saccharomyces cerevisiae. Curr Genet 40: 311–316
Tyler BM, Tripathy S, Zhang X, Dehal P, Jiang RHY, Aerts A, Arredondo
FD, Baxter L, Bensasson D, Beynon JL, et al (2006) Phytophthora genome
sequences uncover evolutionary origins and mechanisms of pathogen-
esis. Science 313: 1261–1266
Vogel C, Bashton M, Kerrison ND, Chothia C, Teichmann SA (2004)
Structure, function and evolution of multidomain proteins. Curr Opin
Struct Biol 14: 208–216
Vogel C, Teichmann SA, Pereira-Leal J (2005) The relationship between
domain duplication and recombination. J Mol Biol 346: 355–365
Yang YD, Cho H, Koo JY, Tak MH, Cho Y, Shim WS, Park SP, Lee J, Lee B,
Kim BM, et al (2008) TMEM16A confers receptor-activated calcium-
dependent chloride conductance. Nature 455: 1210–1215
Yin QY, de Groot PW, Dekker HL, de Jong L, Klis FM, de Koster CG (2005)
Comprehensive proteomic analysis of Saccharomyces cerevisiae cell walls:
identification of proteins covalently attached via glycosylphosphatidy-
linositol remnants or mild alkali-sensitive linkages. J Biol Chem 280:
20894–20901
Yoon HS, Hackett JD, Pinto G, Bhattacharya D (2002) The single, ancient
origin of chromist plastids. Proc Natl Acad Sci USA 99: 15507–15512
Zoraghi R, Corbin JD, Francis SH (2004) Properties and functions of GAF
domains in cyclic nucleotide phosphodiesterases and other proteins.
Mol Pharmacol 65: 267–278
Seidl et al.
644 Plant Physiol. Vol. 155, 2011
... The biochemical distinction of oomycetes from fungi is evident in its cell wall composition, which contains minimal chitin but is rich in cellulose and β-glucans [45][46][47]. Other differences include its mitochondrial structure, actin cytoskeleton, and protein repertoire [48][49][50]. Notably, P. insidiosum has an incomplete sterol biosynthesis pathway, relying on external sterol sources for physiological functions, which contributes to its resistance or reduced susceptibility to sterol biosynthesis inhibitors and sterol-binding drugs [21,[51][52][53]. ...
Article
Full-text available
This review article explores the effectiveness of antibacterial drugs that inhibit protein synthesis in treating pythiosis, a difficult-to-treat infection caused by Pythium insidiosum. The article highlights the susceptibility of P. insidiosum to antibacterial drugs, such as macrolides, oxazolidinones, and tetracyclines. We examine various studies, including in vitro tests, experimental infection models, and clinical case reports. Based on our synthesis of these findings, we highlight the potential of these drugs in managing pythiosis, primarily when combined with surgical interventions. The review emphasizes the need for personalized treatment strategies and further research to establish standardized testing protocols and optimize therapeutic approaches.
... It is worth noticing in our analysis the expression of some of the apoplastic (elicitins, CWDE, and proteases) and cytoplasmatic effectors (RxLR) upregulated at biotrophic stages by P. palmivora during oil palm infection, co-expressed with genes related to nutrients metabolism such as sulfur, sugar, or amino acid transporters. The role of sulfur during plant pathogen infection is still poorly understood in oomycetes; however, in the genome of P. infestans, an overrepresentation of sulfate permease genes compared to fungi has been found [58]. Also, it has been shown that P. infestans upregulated sulfate permeases during the infection of tomatoes, suggesting an important role in pathogenesis [59]. ...
Article
Full-text available
Bud Rot, caused by Phytophthora palmivora, is considered one of the main diseases affecting African oil palm (Elaeis guineensis). In this study, we investigated the in vitro molecular dynamics of the pathogen–host interaction by analyzing gene expression profiles from oil palm genotypes that were either susceptible or resistant to the disease. We observed distinct interactions of P. palmivora with resistant and susceptible oil palms through co-expression network analysis. When interacting with susceptible genotypes, P. palmivora exhibited upregulation of carbohydrate and sulfate transport genes. These genes demonstrated co-expression with apoplastic and cytoplasmic effectors, including cell wall degrading enzymes, elicitins, and RxLR motif effectors. The pathogen manipulated susceptible oil palm materials, exacerbating the response and compromising the phenylpropanoid pathway, ultimately leading to susceptibility. In contrast, resistant materials exhibited control over their response through putative Heat Shock Proteins (HSP) that maintained homeostasis between primary metabolism and biotic defense. Co-expressed genes related to flavonoids, WRKY transcripts, lectin-type receptors, and LRR receptors may play important roles in pathogen control. Overall, the study provides new knowledge of the molecular mechanisms underlying the interaction between E. guineensis and P. palmivora, which can contribute to controlling Bud Rot in oil palms and gives new insights into the interactions of P. palmivora with their hosts.
... In additional to BAGs, several other multidomain proteins also have atypical domain combinations exclusively found in oomycetes. 38,39 Domain recombination may lead to the evolution of novel multidomain proteins with different or new functions. Since no phylogenetic evidence supports the involvement of the rarely occurred horizontal gene transfer (HGT), 40 the BAG-sHSP combination from taxonomically unrelated groups could be best explained by convergent evolution. ...
Article
Full-text available
Protein homeostasis is vital for organisms and requires chaperones like the conserved Bcl-2-associated athanogene (BAG) co-chaperones that bind to the heat shock protein 70 (HSP70) through their C-terminal BAG domain (BD). Here, we show an unconventional BAG subfamily exclusively found in oomycetes. Oomycete BAGs feature an atypical N-terminal BD with a short and oomycete-specific α1 helix (α1′), plus a C-terminal small heat shock protein (sHSP) domain. In oomycete pathogen Phytophthora sojae, both BD-α1′ and sHSP domains are required for P. sojae BAG (PsBAG) function in cyst germination, pathogenicity, and unfolded protein response assisting in 26S proteasome-mediated degradation of misfolded proteins. PsBAGs form homo- and heterodimers through their unique BD-α1′ to function properly, with no recruitment of HSP70s to form the common BAG-HSP70 complex found in other eukaryotes. Our study highlights an oomycete-exclusive protein homeostasis mechanism mediated by atypical BAGs, which provides a potential target for oomycete disease control.
... Oxidoreductases, important virulence factors induced during plant infection [80,81], are over-represented in the secretome of the N. parvum supplemented with the Eucalyptus stem (5-fold), where they may contribute to the alkaloid biosynthesis and production of hydrogen peroxide through the oxidation of metabolites [82]. ...
Article
Full-text available
Neofusicoccum parvum is a fungal plant pathogen of a wide range of hosts but knowledge about the virulence factors of N. parvum and host–pathogen interactions is rather limited. The molecules involved in the interaction between N. parvum and Eucalyptus are mostly unknown, so we used a multi-omics approach to understand pathogen–host interactions. We present the first comprehensive characterization of the in vitro secretome of N. parvum and a prediction of protein–protein interactions using a dry-lab non-targeted interactomics strategy. We used LC-MS to identify N. parvum protein profiles, resulting in the identification of over 400 proteins, from which 117 had a different abundance in the presence of the Eucalyptus stem. Most of the more abundant proteins under host mimicry are involved in plant cell wall degradation (targeting pectin and hemicellulose) consistent with pathogen growth on a plant host. Other proteins identified are involved in adhesion to host tissues, penetration, pathogenesis, or reactive oxygen species generation, involving ribonuclease/ribotoxin domains, putative ricin B lectins, and necrosis elicitors. The overexpression of chitosan synthesis proteins during interaction with the Eucalyptus stem reinforces the hypothesis of an infection strategy involving pathogen masking to avoid host defenses. Neofusicoccum parvum has the molecular apparatus to colonize the host but also actively feed on its living cells and induce necrosis suggesting that this species has a hemibiotrophic lifestyle.
... At the genomic level, sulfate permease domains (SuIP, IPR011547) are over-represented in the genomes of oomycete pathogens, with an average abundance of 13.5 domains compared with 8.58 for all other eukaryote species [35]. A total of 16 SuIPs were classified in P. infestans and ten in Pythium ultimum, compared with only six in the fungus M. oryzae [36]. ...
Article
The biochemical versatility of sulfur (S) lends itself to myriad roles in plant–pathogen interactions. This review evaluates the current understanding of mechanisms by which pathogens acquire S from their plant hosts and highlights new evidence that plants can limit S availability during the immune responses. We discuss the discovery of host disease-susceptibility genes related to S that can be genetically manipulated to create new crop resistance. Finally, we summarize future research challenges and propose a research agenda that leverages systems biology approaches for a holistic understanding of this important element’s diverse roles in plant disease resistance and susceptibility.
... Cohen-Gihon and coworkers in 2011 studied the evolution of domain promiscuity through the inference of domain architectures of ancestral genomes (Cohen-Gihon et al. 2011). Other promiscuity studies have focused on plant pathogenic oomycetes, showing a repertoire of genes encoding several promiscuous domains to contribute to the interaction with the plant host (Seidl et al. 2011). Yu et al. have recently conducted an extensive analysis of protein domain architectures on 4159 bacterial, 187 archaeal, and 448 eukaryotic genomes showing that the information gain measures on bigram models allow accurate taxonomic reconstructions. ...
Article
Full-text available
Diverse studies have shown that the content of genes present in sequenced genomes does not seem to correlate with the complexity of the organisms. However, various studies have shown that organism complexity and the size of the proteome has, indeed, a significant correlation. This characteristic allows us to postulate that some molecular mechanisms have permitted a greater functional diversity to some proteins to increase their participation in developing organisms with higher complexity. Among those mechanisms, the domain promiscuity, defined as the ability of the domains to organize in combination with other distinct domains, is of great importance for the evolution of organisms. Previous works have analyzed the degree of domain promiscuity of the proteomes showing how it seems to have paralleled the evolution of eukaryotic organisms. The latter has motivated the present study, where we analyzed the domain promiscuity in a collection of 84 eukaryotic proteomes representative of all the taxonomy groups of the tree of life. Using a grammar definition approach, we determined the architecture of 1,223,227 proteins, conformed by 2,296,371 domains, which established 839,184 bigram types. The phylogenetic reconstructions based on differences in the content of information from measures of proteome promiscuity confirm that the evolution of the promiscuity of domains in eukaryotic organisms resembles the evolutionary history of the species. However, a close analysis of the PHD and RING domains, the most promiscuous domains found in fungi and functional components of chromatin remodeling enzymes and important expression regulators, suggests an evolution according to their function.
... This adaptive capacity facilitates the evolutionary "arms race" between oomycete effectors and host resistance genes (Wang et al., 2019b), but also allows adaptation of the core cellular machinery of pathogens, including metabolism and signal transduction, leading to various unique properties (Judelson, 2017). For instance, oomycetes have several genes encoding unique proteins with novel domain combinations (Seidl et al., 2010;van den Hoogen and Govers, 2018a), as well as a number of horizontally transferred genes coding for proteins with functions in metabolism (Richards et al., 2011). Oomycetes are osmotrophs, which means they secrete enzymes to digest large molecules (polymers) extracellularly and import the resulting small molecules as nutrients (Richards and Talbot, 2013). ...
Metabolism is the set of biochemical reactions of an organism that enables it to assimilate nutrients from its environment and to generate building blocks for growth and proliferation. It forms a complex network that is intertwined with the many molecular and cellular processes that take place within cells. Systems biology aims to capture the complexity of cells, organisms, or communities by reconstructing models based on information gathered by high-throughput analyses (omics data) and prior knowledge. One type of model is a genome-scale metabolic model (GEM) that allows studying the distributions of metabolic fluxes, i.e., the “mass-flow” through the network of biochemical reactions. GEMs are nowadays widely applied and have been reconstructed for various microbial pathogens, either in a free-living state or in interaction with their hosts, with the aim to gain insight into mechanisms of pathogenicity. In this review, we first introduce the principles of systems biology and GEMs. We then describe how metabolic modeling can contribute to unraveling microbial pathogenesis and host–pathogen interactions, with a specific focus on oomycete plant pathogens and in particular Phytophthora infestans. Subsequently, we review achievements obtained so far and identify and discuss potential pitfalls of current models. Finally, we propose a workflow for reconstructing high-quality GEMs and elaborate on the resources needed to advance a system biology approach aimed at untangling the intimate interactions between plants and pathogens.
... fgenesh1_kg.79_#_14_#_15_V2.0 combined two whole genes from the V1.0 annotation, the V2.0.0 functional assignment included an additional PFAM domain, PF12698, an ABC-2 family transporter protein, which are often highly expressed in plant pathogens such as the oomycetes as they play roles in the biotrophic phase of infection and pathogenicity (Seidl et al., 2011; Ah-Fong et al., 2017). The second, estExt_fgenesh1_pm.C_90019, fgenesh1_pm.9_#_20, ...
Phytophthora cinnamomi is a pathogenic oomycete that causes plant dieback disease across a range of natural ecosystems and in many agriculturally important crops on a global scale. An annotated draft genome sequence is publicly available (JGI Mycocosm) and suggests 26,131 gene models. In this study, soluble mycelial, extracellular (secretome), and zoospore proteins of P. cinnamomi were exploited to refine the genome by correcting gene annotations and discovering novel genes. By implementing the diverse set of sub-proteomes into a generated proteogenomics pipeline, we were able to improve the P. cinnamomi genome annotation. Liquid chromatography mass spectrometry was used to obtain high confidence peptides with spectral matching to both the annotated genome and a generated 6-frame translation. Two thousand seven hundred sixty-four annotations from the draft genome were confirmed by spectral matching. Using a proteogenomic pipeline, mass spectra were used to edit the P. cinnamomi genome and allowed identification of 23 new gene models and 60 edited gene features using high confidence peptides obtained by mass spectrometry, suggesting a rate of incorrect annotations of 3% of the detectable proteome. The novel features were further validated by total peptide support, alongside functional analysis including the use of Gene Ontology and functional domain identification. We demonstrated that spectral data, in combination with our proteogenomics pipeline, can be used to improve the genome annotation of important plant pathogens and identify missed genes. This study presents the first use of spectral data to edit and manually annotate an oomycete pathogen.
The phylum Oomycota contains economically important pathogens of animals and plants, including Saprolegnia parasitica, the causal agent of the fish disease saprolegniasis. Due to intense fish farming and banning of the most effective control measures, saprolegniasis has re-emerged as a major challenge for the aquaculture industry. Oomycete cells are surrounded by a polysaccharide-rich cell wall matrix that, in addition to being essential for cell growth, also functions as a protective “armor.” Consequently, the enzymes responsible for cell wall synthesis provide potential targets for disease control. Oomycete cell wall biosynthetic enzymes are predicted to be plasma membrane proteins. To identify these proteins, we applied a quantitative (iTRAQ) mass spectrometry-based proteomics approach to the plasma membrane of the hyphal cells of S. parasitica, providing the first complete plasma membrane proteome of an oomycete species. Of significance is the identification of 65 proteins enriched in detergent-resistant microdomains (DRMs). In silico analysis showed that DRM-enriched proteins are mainly involved in molecular transport and β-1,3-glucan synthesis, potentially contributing to pathogenesis. Moreover, biochemical characterization of the glycosyltransferase activity in these microdomains further supported their role in β-1,3-glucan synthesis. Altogether, the knowledge gained in this study provides a basis for developing disease control measures targeting specific plasma membrane proteins in S. parasitica. IMPORTANCE The significance of this research lies in its potential to combat saprolegniasis, a detrimental fish disease, which has resurged due to intensive fish farming and regulatory restrictions. By targeting enzymes responsible for cell wall synthesis in Saprolegnia parasitica, this study uncovers potential avenues for disease control.
Particularly noteworthy is the identification of several proteins enriched in membrane microdomains, offering insights into molecular mechanisms potentially involved in pathogenesis. Understanding the role of these proteins provides a foundation for developing targeted disease control measures. Overall, this research holds promise for safeguarding the aquaculture industry against the challenges posed by saprolegniasis.
Background: Pythium ultimum is a ubiquitous oomycete plant pathogen responsible for a variety of diseases on a broad range of crop and ornamental species. Results: The P. ultimum genome (42.8 Mb) encodes 15,290 genes and has extensive sequence similarity and synteny with related Phytophthora species, including the potato blight pathogen Phytophthora infestans. Whole transcriptome sequencing revealed expression of 86% of genes, with detectable differential expression of suites of genes under abiotic stress and in the presence of a host. The predicted proteome includes a large repertoire of proteins involved in plant pathogen interactions, although, surprisingly, the P. ultimum genome does not encode any classical RXLR effectors and relatively few Crinkler genes in comparison to related phytopathogenic oomycetes. A lower number of enzymes involved in carbohydrate metabolism were present compared to Phytophthora species, with the notable absence of cutinases, suggesting a significant difference in virulence mechanisms between P. ultimum and more host-specific oomycete species. Although we observed a high degree of orthology with Phytophthora genomes, there were novel features of the P. ultimum proteome, including an expansion of genes involved in proteolysis and genes unique to Pythium. We identified a small gene family of cadherins, proteins involved in cell adhesion, the first report of these in a genome outside the metazoans. Conclusions: Access to the P. ultimum genome has revealed not only core pathogenic mechanisms within the oomycetes but also lineage-specific genes associated with the alternative virulence and lifestyles found within the pythiaceous lineages compared to the Peronosporaceae.
In Papaver somniferum (opium poppy) and related species, (S)-reticuline serves as a branch-point intermediate in the biosynthesis of numerous isoquinoline alkaloids. The berberine bridge enzyme (BBE) ([S]-reticuline:oxygen oxidoreductase [methylene bridge forming], EC 1.5.3.9) catalyzes the stereospecific conversion of the N-methyl moiety of (S)-reticuline into the berberine bridge carbon of (S)-scoulerine and represents the first committed step in the pathway leading to the antimicrobial alkaloid sanguinarine. Three unique genomic clones (bbe1, bbe2, and bbe3) similar to a BBE cDNA from Eschscholtzia californica (California poppy) were isolated from opium poppy. Two clones (bbe2 and bbe3) contained frame-shift mutations of which bbe2 was identified as a putative, nonexpressed pseudogene by RNA blot hybridization using a gene-specific probe and by the lack of transient expression of a chimeric gene fusion between the bbe2 5′ flanking region and a β-glucuronidase reporter gene. Similarly, bbe1 was shown to be expressed in opium poppy plants and cultured cells. Genomic DNA blot-hybridization data were consistent with a limited number of bbe homologs. RNA blot hybridization showed that bbe genes are expressed in roots and stems of mature plants and in seedlings within 3 d after germination. Rapid and transient BBE mRNA accumulation also occurred after treatment with a fungal elicitor or with methyl jasmonate. However, sanguinarine was found only in roots, seedlings, and fungal elicitor-treated cell cultures.
A multiple sequence alignment program, MAFFT, has been developed. The CPU time is drastically reduced as compared with existing methods. MAFFT includes two novel techniques. (i) Homologous regions are rapidly identified by the fast Fourier transform (FFT), in which an amino acid sequence is converted to a sequence composed of volume and polarity values of each amino acid residue. (ii) We propose a simplified scoring system that performs well for reducing CPU time and increasing the accuracy of alignments even for sequences having large insertions or extensions as well as distantly related sequences of similar length. Two different heuristics, the progressive method (FFT‐NS‐2) and the iterative refinement method (FFT‐NS‐i), are implemented in MAFFT. The performances of FFT‐NS‐2 and FFT‐NS‐i were compared with other methods by computer simulations and benchmark tests; the CPU time of FFT‐NS‐2 is drastically reduced as compared with CLUSTALW with comparable accuracy. FFT‐NS‐i is over 100 times faster than T‐COFFEE, when the number of input sequences exceeds 60, without sacrificing the accuracy.
Out of the eight enzymes involved in the biosynthesis of the isoquinoline alkaloid berberine, at least two enzymes, berberine bridge enzyme and (S)-tetrahydroprotoberberine oxidase, are exclusively located in a vesicle with a specific gravity of ρ = 1.14 g·cm⁻³, as shown by direct enzymatic assay as well as immunoelectrophoresis. Electron-microscopic examination of the enzyme-containing particulate preparation from Berberis wilsoniae var. subcaulialata cultured cells demonstrated that it is composed mainly of membranous vesicles. The protein composition of this preparation reveals the presence of only about 20 separable proteins, of which two major ones are berberine bridge enzyme and (S)-tetrahydroprotoberberine oxidase. Incubation of these vesicles with the substrate (S)-reticuline in the presence and absence of S-adenosyl-L-methionine leads to the formation of a red product which was identified as dehydroscoulerine. If the cytoplasmic enzyme S-adenosyl-L-methionine:(S)-scoulerine-9-O-methyltransferase is added to the vesicle preparation in the presence of (S)-reticuline and S-adenosyl-L-methionine, not dehydroscoulerine but columbamine, the immediate precursor of berberine, is formed. Some of the quaternary alkaloids are located inside the vesicles; fusion of these vesicles leads to vacuoles containing the quaternary alkaloids. These vesicles are the first highly specific and unique compartment serving only alkaloid biosynthesis; they are found in members of four different plant families and in cell cultures as well as in differentiated tissue.