1234 / Molecular Plant-Microbe Interactions
MPMI Vol. 17, No. 11, 2004, pp. 1234–1241. Publication no. M-2004-0909-02R. © 2004 The American Phytopathological Society
Cloning, Characterization, and Evolution
of the NBS-LRR-Encoding
Resistance Gene Analogue Family
in Polyploid Cotton (Gossypium hirsutum L.)
Limei He,1 Chunguang Du,2 Lina Covaleda,1 Zhanyou Xu,1 A. Forest Robinson,3 John Z. Yu,3
Russell J. Kohel,3 and Hong-Bin Zhang1
1Department of Soil and Crop Sciences and Institute for Plant Genomics and Biotechnology, 2123 TAMU, Texas A&M
University, College Station, U.S.A.; 2Department of Biology and Molecular Biology, Montclair State University, Upper
Montclair, NJ, U.S.A.; and 3USDA-ARS, SPARC, Cotton Pathology and Crop Germplasm Research Units, 2765 F&B Road,
College Station, TX, U.S.A.
Submitted 22 April 2004. Accepted 29 July 2004.
The nucleotide-binding site-leucine-rich repeat (NBS-
LRR)-encoding gene family has attracted much research
interest because approximately 75% of the plant disease
resistance genes that have been cloned to date are from
this gene family. We cloned the NBS-LRR-encoding genes
from polyploid cotton by a polymerase chain reaction-
based approach. A sample of 150 clones was selected from
the NBS-LRR gene sequence library and was sequenced,
and 61 resistance gene analogs (RGA) were identified. Se-
quence analysis revealed that RGA are abundant and
highly diverged in the cotton genome and could be cate-
gorized into 10 distinct subfamilies based on the similari-
ties of their nucleotide sequences. The numbers of mem-
bers vary many fold among different subfamilies, and
gene index analysis showed that each of the subfamilies is
at a different stage of RGA family evolution. Genetic map-
ping of a selection of RGA indicates that the RGA reside
on a limited number of the cotton chromosomes, with
those from a single subfamily tending to cluster and two
of the RGA loci being colocalized with the cotton bacte-
rial blight resistance genes. The distribution of RGA be-
tween the two subgenomes A and D of cotton is uneven,
with RGA being more abundant in the A subgenome than
in the D subgenome. The data provide new insights into
the organization and evolution of the NBS-LRR-encoding
RGA family in polyploid plants.
The genes encoding the nucleotide-binding site (NBS) and
leucine-rich repeat (LRR) motifs constitute a large multigene
family (hereafter referred to as the NBS-LRR gene family) in
plants. In the past several years, more than 40 genes confer-
ring resistance to different pathogens, including bacteria,
fungi, nematodes, and viruses, have been cloned in plants. Of
the cloned plant disease resistance (R) genes, approximately
75% were from the NBS-LRR gene family (Hulbert et al.
2001). Therefore, isolation and characterization of the NBS-
LRR-encoding genes and determination of their organization
and evolution in plant genomes are of significance for under-
standing of plant-pathogen interactions and development of
novel approaches to effective control of plant pathogens in
agriculture.
The NBS-LRR genes are abundant in plants. Whole-ge-
nome sequence analysis revealed that there are 150 to 175
NBS-LRR genes in the Arabidopsis genome (Dangl and
Jones 2001; Meyers et al. 2003; Richly et al. 2002), consti-
tuting about 0.6% of its 25,000 genes (The Arabidopsis Ge-
nome Initiative 2000), and there are approximately 600 NBS-
LRR genes in the rice genome (Goff et al. 2002; Meyers et
al. 1999), constituting about 1.5% of its 40,000 genes (Goff
et al. 2002). According to the presence or absence of a TIR
domain (Drosophila Toll and mammalian Interleukin-1 re-
ceptor homology region) at the N terminus of the protein, the
NBS-LRR gene family is classified into two classes, the TIR-
NBS-LRR and non-TIR-NBS-LRR classes (Hulbert et al.
2001). Since similar genes were found in Rhizobia spp.,
yeasts, Drosophila spp., and vertebrates, in which they are
involved in signal transduction pathways, the plant defense
system is considered to be ancient and to predate the evolu-
tion of the vertebrate immune system (Hammond-Kosack
and Jones 1997; Suominen et al. 2001; Van der Biezen and
Jones 1998). The copy number of NBS-LRR genes might
vary widely within a species, and the loci of the genes might
rapidly rearrange (Leister et al. 1998).
Although the overall sequence homology of the NBS-LRR
genes may vary significantly, several short motifs of their en-
coded proteins, such as NBS and LRR, are highly con-
served. These conserved motifs have enabled rapid isolation
of the NBS-LRR genes or resistance gene analogs (RGA)
from different plant species by using a polymerase chain
reaction (PCR)-based approach with degenerate oligonucleo-
tide primers designed from these domains. RGA were iso-
lated from several plant species, such as potato (Leister et al.
1996), soybean (Kanazin et al. 1996; Yu et al. 1996), lettuce
(Shen et al. 1998), tomato (Ohmori et al. 1998; Pan et al.
2000), rice (Leister et al. 1998; Mago et al. 1999), barley
(Leister et al. 1998; Seah et al. 1998), wheat (Seah et al.
1998; 2000), chickpea (Huettel et al. 2002), and Medicago
truncatula (Zhu et al. 2002). Genetic mapping revealed that
many of the RGA either cosegregate with or are closely
linked to known disease resistance loci (Kanazin et al. 1996;
Leister et al. 1996, 1998; Mago et al. 1999; Pan et al. 2000;
Shen et al. 1998; Yu et al. 1996).
Corresponding author: H.-B. Zhang; Telephone: +979-862-2244; Fax:
+979-862-4790; E-mail: hbz7049@tamu.edu
Ramalingam and associates
(2003) showed that, in rice, RGA are associated not only
with qualitative resistance but also with quantitative re-
sponse. These isolated RGA, thus, have provided useful tools
to dissect, tag, and isolate genes conferring both qualitative
and quantitative resistance to different pathogens. Neverthe-
less, little is known about how the NBS-LRR gene family as
an entity is organized, functions, and evolves in plant ge-
nomes, especially in polyploid plant genomes.
Polyploid plants are widely distributed, constituting approxi-
mately 60% of flowering plants. Cotton is the leading textile
fiber crop and the second most important oilseed crop in the world
and has long been used as a model for studies of speciation, poly-
ploidization, and evolution of polyploid plants. The
genus Gossypium, to which cotton belongs, contains about 50
species, and the phylogeny among the species has been estab-
lished (Seelanan et al. 1997; Small et al. 1998; Wendel 1989;
Wendel and Albert 1992). Of these species, the two cultivated
tetraploid species, G. hirsutum and G. barbadense, are dip-
loidized allopolyploids containing A and D subgenomes, with
each subgenome consisting of 13 chromosomes. The A and D
subgenomes of the tetraploid cottons split from a common an-
cestor 6 to 11 million years ago (MYA) (Wendel 1989). The A
and D genomes hybridized to form a tetraploid some 1 to 2
MYA, from which several tetraploid species, including G. hir-
sutum and G. barbadense, have evolved. While no R genes or
RGA have been reported in these species, they provide a desir-
able system for studies of organization and evolution of the
NBS-LRR gene family in polyploid plants.
For this study, we cloned and sequenced a number of NBS-
LRR genes from the cultivated tetraploid cotton G. hirsutum
and identified a number of the NBS-LRR RGA. Phylogenetic
and gene index analyses were conducted to determine the rela-
tionships among the cotton RGA and cloned plant NBS-LRR-
encoding R genes and to elucidate the evolution of the RGA
family in the cotton genome. A selection of the RGA was
mapped to an existing cotton genetic linkage map (Yu et al.
1998) to estimate their distribution in the two cotton subge-
nomes. These RGA clones provide not only useful markers for
genetically mapping the disease resistance genes but also essen-
tial materials for studying the organization and evolution of the
NBS-LRR gene family in this plant species.
RESULTS
Cloning and analysis
of NBS-LRR-encoding gene sequences.
We produced PCR products from cotton genomic DNA
templates, using the degenerate primer pairs designed accord-
ing to the NBS and membrane-spanning motifs of several
cloned plant R genes conferring resistance to bacteria, fungi,
viruses, and nematodes that represented both TIR-NBS-LRR
and non-TIR-NBS-LRR classes. The PCR products were
analyzed on an agarose gel, and two bands were observed,
one being about 560 bp and the other about 700 bp in size
(Fig. 1). To further confirm the PCR amplification, different
PCR conditions, including the concentrations of template
DNA, Mg++, and Taq DNA polymerase, were tested. Al-
though the 560-bp band could be reproduced under all condi-
tions tested, the 700-bp band could not. Therefore, the 560-
bp band was excised from the gel and was cloned into the
pGEM-T vector. More than 50 white clones were selected
randomly from the library and were analyzed by PCR. The
result showed that all of the clones had the expected insert
sizes of about 560 bp (data not shown), suggesting that the
560-bp band was cloned properly. To facilitate further analy-
sis, 768 recombinant clones of the library were arrayed as indi-
vidual clones in two 384-well microplates.
To estimate the abundance and divergence of the NBS-
LRR-encoding RGA and their evolution in the cotton ge-
nome, 229 clones were randomly selected from the library
and were sequenced. After the primers and vectors of the
clones were removed, 150 of them had sequence reads of 500
bp or longer and, thus, were further analyzed against the
GenBank database by Blastx search. Of the 150 clones, 62
were shown to have significantly high similarities (e-value <
0.001; Fig. 2) at the amino acid level to the cloned plant
NBS-LRR-encoding R genes, RGA, or both in the GenBank
database. These clones were defined in this study as RGA
and were further analyzed. To predict the abundance of the
RGA in the cotton genome, the nucleotide sequences of the
62 RGA clones were analyzed using DNA Strider software
(Marck 1988). Only one pair (2D13 and 2E15) of the 62
RGA clones (1.6%) had identical sequences, which sug-
gested that the NBS-LRR-encoding RGA are abundant in the
cotton genome, although further study is needed to estimate
the exact number of RGA.
To predict whether they are potentially functional, the se-
quences of the 61 different RGA clones were further analyzed
by the GENSCAN program to search for open reading frames (ORF). Of the 61
RGA clones, 32 (52.5%) had ORF of 100 or more amino acids,
14 (23.0%) had ORF of 39 to 99 amino acids, and the remain-
ing 15 (24.6%) had no ORF, due to premature stop codons,
frame-shift mutations, or both (Fig. 2). If the clones that had
no ORF are considered to be pseudogenes (Deloukas et al.
2001; Kanazin et al. 1996; Mungall et al. 2003; Pan et al.
2000), only approximately 75% of the NBS-LRR-encoding
RGA are potentially functional in the cotton genome.
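The percentages in this paragraph follow directly from the clone counts. A minimal Python sketch of that arithmetic is given here; it uses only the counts reported above, and the variable names are illustrative rather than part of the original analysis.

```python
# ORF classes among the 61 distinct RGA clones (counts taken from the text).
orf_counts = {
    "ORF >= 100 aa": 32,
    "ORF 39-99 aa": 14,
    "no ORF": 15,
}
total = sum(orf_counts.values())

# Share of each class, rounded to one decimal place as in the text.
percent = {label: round(100 * n / total, 1) for label, n in orf_counts.items()}

# Clones with any ORF are the ones treated as potentially functional.
potentially_functional = round(100 * (total - orf_counts["no ORF"]) / total, 1)

for label, value in percent.items():
    print(f"{label}: {value}%")
print(f"potentially functional: {potentially_functional}%")
```

The last figure is the complement of the no-ORF class, which is why the text rounds it to approximately 75%.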
Phylogenetic analysis of the NBS-LRR-encoding RGA.
To determine the relationships among the cotton RGA and
cloned plant NBS-LRR-encoding R genes, seven of the R gene
sequences, representing both TIR-NBS-LRR and non-TIR-
NBS-LRR classes and the non-NBS-LRR gene Cf-5 (Hulbert
et al. 2001), were downloaded from GenBank. The seven
NBS-LRR-encoding R genes were L6, Bs2, Gpa2, I2, Mi1-2,
RPM1, and RPS2.
Fig. 1. Polymerase chain reaction products of genomic DNA of the cotton
root-knot nematode resistance line Auburn 634 and the susceptible line
Deltapine 16 using the nucleotide-binding site motif degenerate primers.
The 560-bp band of Auburn 634 was cloned and analyzed in this study.
The Cf-5 gene was used as an outgroup,
although it does not encode the NBS motif. Then, multiple
alignments were conducted among the 61 cotton RGA se-
quences, seven known NBS-LRR-encoding R gene sequences,
and the outgroup gene Cf-5, using the ClustalX software. Ac-
cording to the degree of similarity, 500-bp segments of the se-
quences were selected for phylogenetic analysis.
Fig. 2. Consensus tree of the cotton resistance gene analogs (RGA) constructed by phylogenetic analysis using the PAUP package (Swofford 2001). The cotton
nucleotide-binding site-leucine-rich repeat (NBS-LRR) RGA family is grouped into 10 subfamilies, each being indicated by bold-faced Arabic numbers 1
through 10. The genetic distances between clades (designated subfamilies) were ≥0.5512, whereas the genetic distances within a clade were ≤0.5268. The
sequences in boldface could be translated into open reading frames (ORF) of 100 or more amino acids, the sequences underlined could be translated into
ORF of 39 to 99 amino acids, and the sequences neither in boldface nor underlined could not be translated into ORF and were assumed to be pseudogenes.
The numbers above the horizontal branches are the branch confidence in percentage, estimated by using Felsenstein’s bootstrap approach. The plant R
genes representing the NBS-LRR class selected from GenBank are italicized, and the LRR-transmembrane domain gene Cf-5, cloned from tomato, was used
as an outgroup (Hulbert et al. 2001).
The phylogenetic analysis was carried out by use of the
PAUP software. A total of 100 bootstrap runs were performed.
The consensus tree is shown in Figure 2, with the 50% major-
ity rule. The RGA were classified into 10 clades, designated
subfamilies 1 through 10, in 74 to 100 of the 100 bootstrap
replicates. The genetic distances between pairs of the RGA
ranged from 0.0096 between 2B07 and 2J13 to 0.7569 be-
tween 2D07 and 2A22, with ≥0.5512 between clades and
≤0.5268 within a clade. Clades 1, 2, 3, and 4 each consisted of
five or more members of the 61 RGA, whereas clades 5
through 10 each consisted of only one or two members of the
61 RGA, suggesting that the abundance of each RGA subfam-
ily in the cotton genome was significantly different. Among
the four largest subfamilies, clade 2 had the greatest within-clade
genetic divergence, with the genetic distances between its sub-
clades being around 0.5000.
To estimate the evolutionary status of each subfamily con-
sisting of five or more RGA, the percentages of the RGA that
had no ORF were calculated. Only two (12.5%) of the 16 RGA
in clade 2 had no ORF, whereas 20.0% of the RGA in clade 4
had no ORF, 23.2% in clade 1, and 46.2% in clade 3. The
percentage of the clones in clade 2 having no ORF was lower
by twofold than the mean of the entire RGA family (24.6%),
whereas that in clade 3 was higher than the mean of the entire
RGA family by about twofold.
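The "about twofold" comparisons can be checked against the reported figures. The sketch below is illustrative only: it takes the family-wide fraction (15 of 61), the clade 2 count (2 of 16), and the clade 3 percentage (46.2%) as stated in the text, and the names are assumptions made for this example.

```python
# Fraction of clones without an ORF across the whole RGA family.
family_no_orf = 15 / 61

# Per-clade fractions as reported in the text.
clade_no_orf = {
    "clade 2": 2 / 16,   # 12.5%
    "clade 3": 0.462,    # 46.2%
}

# Ratio of each clade's no-ORF fraction to the family mean.
fold_vs_mean = {clade: frac / family_no_orf for clade, frac in clade_no_orf.items()}

for clade, fold in fold_vs_mean.items():
    print(f"{clade}: {fold:.2f} x family mean")
```

A ratio near 0.5 corresponds to "lower by twofold" and a ratio near 2 to "higher by about twofold".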
Of the seven cloned NBS-LRR-encoding R genes included
in the phylogenetic analysis, four, I2, Mi, RPS2, and L, each
were claded with one or more cotton NBS-LRR-encoding
RGA. The L gene representing the TIR-NBS-LRR-encoding R
genes was claded in clade 3, and the RPS2, Mi, and I2
genes representing the non-TIR-NBS-LRR-encoding R genes
(Hulbert et al. 2001) were claded in clades 4, 5, and 6, respec-
tively. To further explore the similarity of the cotton RGA to
the cloned NBS-LRR-encoding R genes in the clades, multiple
alignments among them were conducted at the amino acid se-
quence level, using the ClustalX program (Fig. 3). The result
showed that the RGA and NBS-LRR-encoding R genes also
had high similarities at the amino acid sequence level, though
variations were observed.
Genetic mapping of RGA.
To estimate the distribution of NBS-LRR-encoding RGA in
the two subgenomes of the cotton genome, 22 of the 62 RGA
were selected and surveyed for polymorphism between the two
parents, G. hirsutum TM-1 and G. barbadense 3-79, of the map-
ping population of an existing cotton genetic map (Yu et al.
1998).
Fig. 3. Multiple sequence alignment of the cotton nucleotide-binding site-leucine-rich repeat (NBS-LRR) resistance gene analog (RGA) subfamilies claded
with known plant NBS-LRR-encoding R genes. The amino acid sequences corresponding to the 500-bp nucleotides of the RGA and known NBS-LRR-
encoding R genes starting from the NBS motif were used in the alignment by ClustalX. The motifs of the amino acid sequences are highlighted by the
ClustalX program. The known NBS-LRR-encoding R genes are L for subfamily 3, RPS2 for subfamily 4, I2 for subfamily 5, and Mi for subfamily 6.
A total of 15 from subfamilies (clades) 1 (3 RGA), 3 (9
RGA), 4 (2 RGA), and 6 (1 RGA), respectively, were found to
be polymorphic and thus mapped to the cotton genetic map. The
genetic distances and relative positions of the clones in the cot-
ton genetic map (Yu et al. 1998) are shown in Figure 4. In total,
16 polymorphic fragments derived from the 15 RGA were
mapped to seven of the 26 chromosomes or linkage groups of
the cotton A and D subgenomes, with four, named A1, A3, A4,
and chromosome 6, belonging to the A subgenome and three, named
chromosomes 17, 20b, and 23, belonging to the D subgenome.
Of the 16 RGA polymorphic fragments, nine were mapped to a
single linkage group, A4 of the subgenome A, two to chromo-
some 23, and one to each of chromosomes or linkage groups
A1, A3, 6, 17, and 20b. Two polymorphic fragments (2A22a and
2A22b) identified by probe 2A22 were mapped to different sub-
genomes, one (2A22a) to a chromosome (chromosome 6) of the
A subgenome and the other (2A22b) to a chromosome (chromo-
some 23) of the D subgenome. Since no disease resistance genes
were mapped to the genetic map (Yu et al. 1998) that was used
in this experiment, we conducted a literature search to infer the
relationships between the mapped loci of the RGA and the
known disease resistance loci. As a result, although few disease
resistance genes were mapped in cotton (Wright et al. 1998), the
RGA of linkage group A4 were colocalized with a quantitative
trait locus (QTL) conferring resistance to the cotton bacterial
blight pathogen Xanthomonas campestris pv. malvacearum
(Smith) Dye, and that of chromosome 20b was likely colocal-
ized with one of the two bacterial blight resistance gene loci
mapped to the chromosome.
DISCUSSION
We successfully cloned the NBS-LRR-encoding gene se-
quences from the cotton genome with PCR, by use of the degen-
erate oligonucleotide primers designed from the NBS regions
of several NBS-LRR-encoding R genes cloned in several di-
verged plant species (Hulbert et al. 2001). Sequence analysis
of 150 clones randomly selected from the DNA library re-
vealed that 62 (41.3%) have significant similarities to cloned
plant NBS-LRR-encoding R genes and RGA in the GenBank.
The fact that only two of the 62 RGA were identical in
sequence suggests that RGA are abundant in the cotton
genome. These results agree with the findings from
whole-genome sequencing of Arabidopsis
(Richly et al. 2002; Meyers et al. 2003) and rice (Goff et al.
2002), which contain about 150 and 600 NBS-LRR RGA, re-
spectively. Genetic mapping of 16 RGA showed that these
NBS-LRR-encoding RGA reside at a limited number of the 26
cotton chromosomes, which is consistent with findings in
other plant species (Kanazin et al. 1996; Mago et al. 1999; Yu
et al. 1996). Of these RGA, 56% mapped to a single linkage
group of the A subgenome (A4). The colocalization of the cot-
ton RGA mapped to this linkage group (A4) and chromosome
20b with the previously mapped cotton bacterial blight (X.
campestris pv. malvacearum) resistance loci (Wright et al.
1998) suggests that these RGA may be involved in the bacte-
rial blight resistance.
G. hirsutum and G. barbadense are diploidized allopoly-
ploid species containing A and D subgenomes. Distribution of
RGA between the two subgenomes seems biased. This study
showed that the RGA are more abundant in the A subgenome
than in the D subgenome, since 12 of the 16 RGA sequences
mapped to the A subgenome, whereas only four of them
mapped to the D subgenome. Further investigation is needed
to test whether the difference in abundance of RGA between
the two subgenomes of the polyploid species is due to sam-
pling, different genome sizes (the A subgenome is about two-
fold larger than the D subgenome), evolutionary drive, or any
or all of these in combination. The genetic mapping of RGA
also suggests that RGA that share higher similarity or
belong to a single subfamily tend to cluster in the genome, although
some of them may reside at different loci of the genome. This
is supported in this study by genetic mapping of the members
selected from subfamilies 1, 3, and 4.
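The sampling question raised above can be framed as a binomial tail probability. The following Python sketch is an illustration only and is not an analysis performed in this study. It assumes that the 16 mapped fragments are independent draws, which the clustering on linkage group A4 clearly violates, and it compares two hypothetical expectations: an equal share for each subgenome, and a share proportional to the roughly twofold larger A subgenome.

```python
from math import comb

def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

n_mapped = 16   # mapped RGA fragments (from the text)
n_in_a = 12     # fragments mapped to the A subgenome (from the text)

# Assumption 1: each fragment is equally likely to fall in A or D.
p_equal_share = upper_tail(n_in_a, n_mapped, 0.5)

# Assumption 2: the chance of falling in A scales with genome size (A is about 2x D).
p_size_scaled = upper_tail(n_in_a, n_mapped, 2 / 3)

print(f"equal share:  P(X >= 12) = {p_equal_share:.4f}")
print(f"size-scaled:  P(X >= 12) = {p_size_scaled:.4f}")
```

Because nine of the fragments lie on a single linkage group, the effective number of independent loci is smaller than 16, so neither value should be read as a formal test; the sketch only shows how strongly the conclusion depends on the assumed expectation.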
Fig. 4. Distribution of cotton nucleotide-binding site-leucine-rich repeat resistance gene analogs (RGA) in the cotton genetic map (Yu et al. 1998). The
names in boldface indicate the cotton RGA cloned in this study, of which two polymorphic fragments of 2A22, indicated by “a” and “b”, respectively, were
mapped. The linkage groups of the cotton genetic map are named by their corresponding chromosomes, whenever relevant data are available, or subgenomic
origins (A or D) of polyploid cotton. The numbers in parentheses behind each RGA indicate the subfamily from which the RGA is derived.
The RGA gene family is widely divergent in the cotton ge-
nome. High genetic divergence was found between the 61
RGA in this study at the nucleotide sequence level. This may
suggest that the cotton RGA are ancient and have evolved rap-
idly. The former hypothesis seems supported by the fact that
four of the 10 cotton NBS-LRR-encoding RGA subfamilies (3,
4, 5, and 6) were claded with four NBS-LRR-encoding R
genes, L, RPS2, Mi, and I2, cloned from three diverged dicot
species, flax, Arabidopsis, and tomato,
indicating that the cotton RGA and NBS-LRR-encoding R
genes may share ancestors. Nevertheless, it is also possible
that the grouping of the cotton RGA with the L, RPS2, Mi, and I2
genes results from convergent evolution that mimics
orthology of the cotton RGA. The rapid evolution of the cotton
NBS-LRR RGA seems to be supported by the forma-
tion of the diverged subfamilies and the fact that the two larg-
est subfamilies (1 and 2) were not claded with any of the seven
NBS-LRR-encoding R genes included in the phylogenetic
analysis.
The cotton NBS-LRR RGA family consists of at least 10
distinct subfamilies. This classification is supported by high
bootstrap resampling (>90% for all subfamilies but subfamily
2). The number of subfamilies is similar to that observed in
other species such as soybean (Kanazin et al. 1996; Yu et al.
1996), but it is possible that additional subfamilies may be
found when additional RGA clones are sequenced. Neverthe-
less, the abundance of each subfamily in the cotton genome is
significantly different, varying from 1 to 21 of the 61 RGA
analyzed, and more than one third of the RGA are contributed
by a single subfamily (1). Another significant feature of the
classification is that six of the 10 subfamilies each consist of
only one or two of the RGA analyzed. This difference may re-
flect, at least in part, the status of each subfamily in the course
of RGA family evolution. For instance, subfamilies 1 and 4
may be at the plateau of their evolution, while subfamilies 2,
3, and 5 through 10 may be at incipient, degenerative, or ves-
tigial stages (discussed below). Alternatively, a unique abun-
dance for each RGA subfamily in the cotton genome may be
associated with the functions of the genes represented by the
RGA, host-pathogen interactions, or both.
Gene index analysis of the RGA further supports the above
hypothesis that each subfamily of the NBS-LRR RGA family
is at an independently determined stage of RGA family evolu-
tion in the cotton genome. Because 15 of the 61 different RGA
analyzed do not have ORF due to stop codons, frame-shift
mutations, or both, approximately one fourth of the RGA are
likely to be pseudogenes. This number is in agreement with
those identified in the human genome in general (Deloukas et
al. 2001; Mungall et al. 2003). Although RGA pseudogenes
were previously observed in soybean (1 of 9 = 11.1%; Kanazin
et al. 1996), tomato (7 of 75 = 9.3%; Pan et al. 2000), and
Arabidopsis (approximately 10%; Meyers et al. 2003), the ra-
tio of pseudogenes (24.6%) observed in the polyploid cotton in
this study is much higher. In comparison, Arabidopsis and
tomato are both diploid and soybean is a diploidized ancient
tetraploid, while cotton is a diploidized allopolyploid. Thus,
the level of the ploidy might play a role in the accumulation of
pseudogenes in the genomes. From this point of view, the sub-
families of the RGA family that have a lower ratio of pseu-
dogenes than the mean for the entire family may be at evolving
stages, while those that have a higher ratio of pseudogenes
may be at degenerative stages. Based on this criterion, two of
the 10 cotton RGA subfamilies (3 and 5) are likely to be at de-
generative stages, six (2, 6, 7, 8, 9, and 10) are at incipient or
evolving stages, and two (1 and 4) are at stable stages.
NBS-encoding RGA have been cloned from several plant
species and colocalized with many known resistance gene
loci, including those for qualitative and quantitative resistance.
In this study, the RGA mapped to linkage group A4 were colo-
calized with a QTL, and that of chromosome 20b was colocal-
ized with a gene locus of the cotton bacterial blight resistance
previously mapped (Wright et al. 1998), suggesting that the
NBS-LRR-encoding RGA may be involved in both qualitative
and quantitative resistance in cotton. Therefore, the cotton
RGA isolated in this study will provide useful tools for devel-
oping DNA markers and cloning the genes for resistance to
different pathogens in cotton, in which few studies have previ-
ously been conducted. The marker development can be accom-
plished by genetic mapping of the RGA against the known
resistance genes, and the NBS-LRR-encoding R genes can be
isolated by positional cloning, using a whole-genome
BAC/BIBAC-based integrated physical and genetic map of the
cotton genome, under development in our laboratories. More-
over, traditionally, cotton is a model species for studies of speci-
ation, polyploidization, and evolution in polyploid plants. The
cloned cotton RGA will promote studies of organization, func-
tion, and evolution of the NBS-LRR-encoding gene family in
polyploid plants representing about 60% of flowering plants.
MATERIALS AND METHODS
Plant materials and genomic DNA extraction.
The cotton root-knot nematode resistance line Auburn 634,
G. hirsutum, was used as DNA source for NBS-LRR gene
sequence cloning. The G. hirsutum genetic standard line TM-
1, G. barbadense genetic standard line 3-79, and the popula-
tion of the TM-1 × 3-79 cross containing 171 F2 plants were
used to genetically map the cloned RGA, to estimate their dis-
tribution between the cotton A and D subgenomes (Yu et al.
1998). Genomic DNA was isolated from fresh or frozen leaf
tissues, using a cetyltrimethylammonium bromide method
(Doyle and Doyle 1990) with minor modifications.
NBS-LRR gene sequence cloning.
The NBS-LRR gene sequences were cloned by a PCR-based
approach, using the genomic DNA of cotton root-knot nema-
tode resistance line Auburn 634 and susceptible line Deltapine
16 as templates and a pair of degenerate sequences of several
cloned plant NBS-LRR-encoding R genes as primers. The de-
sign of the degenerate primers was based on the NBS and
membrane-spanning motif sequences of two cloned R genes, N
and L6, from the TIR-NBS-LRR class and three cloned R
genes, RPS2, RPM1, and Cre, from the non-TIR-NBS-LRR
class (Hulbert et al. 2001). RPS2 and RPM1 are the bacterial
resistance genes in Arabidopsis, N is a viral resistance gene in
tobacco, L6 is a fungal resistance gene of flax, and Cre is the
cyst nematode resistance gene candidate of Aegilops tauschii
(Lagudah et al. 1997). The forward primer F1 was designed in
sense direction, corresponding to the amino acid sequence
GMGGVGKT of the NBS motif: 5′-GGNATGGGNGGNGTNGGNAA(A/G)AC-3′,
and the reverse primer R1 was based on
the amino acid sequence GLPLALKV of the membrane-span-
ning motif in anti-sense direction:
5′-AC(T/C)TTNA(A/G)NGCNA(A/G)NGGNA(A/G)NCC-3′.
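The relationship between the two amino acid motifs and the primer sequences can be reproduced by back-translation. The Python sketch below is an illustration, not the procedure used by the authors. It assumes a simplified degenerate codon for each residue (in particular, leucine is written as (T/C)TN, which also admits the phenylalanine codons), drops the final wobble position of each motif, and takes the reverse complement for the anti-sense primer. All function and table names are hypothetical.

```python
from math import prod

N = frozenset("ACGT")   # any base
R = frozenset("AG")     # purine, written (A/G) in the text
Y = frozenset("CT")     # pyrimidine, written (T/C) in the text

def base(x):
    return frozenset(x)

# Simplified degenerate codons for the residues that occur in the two motifs.
CODON = {
    "G": (base("G"), base("G"), N),
    "M": (base("A"), base("T"), base("G")),
    "V": (base("G"), base("T"), N),
    "K": (base("A"), base("A"), R),
    "T": (base("A"), base("C"), N),
    "L": (Y, base("T"), N),  # assumption: one pattern covering CTN and TT(A/G)
    "P": (base("C"), base("C"), N),
    "A": (base("G"), base("C"), N),
}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
LABEL = {N: "N", R: "(A/G)", Y: "(T/C)"}

def back_translate(motif):
    """Degenerate sense-strand positions for a motif, final wobble base dropped."""
    positions = [p for residue in motif for p in CODON[residue]]
    return positions[:-1]

def reverse_complement(positions):
    return [frozenset(COMPLEMENT[b] for b in p) for p in reversed(positions)]

def show(positions):
    """Render positions in the bracket notation used in the text."""
    return "".join(LABEL.get(p) or next(iter(p)) for p in positions)

def degeneracy(positions):
    """Number of distinct oligonucleotides in the primer mixture."""
    return prod(len(p) for p in positions)

f1 = back_translate("GMGGVGKT")
r1 = reverse_complement(back_translate("GLPLALKV"))
print("F1:", show(f1), "degeneracy", degeneracy(f1))
print("R1:", show(r1), "degeneracy", degeneracy(r1))
```

Under these assumptions the rendered strings match the F1 and R1 sequences given above, and the degeneracy values indicate how many distinct oligonucleotides each primer mixture contains.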
PCR reaction was carried out in a volume of 50 µl contain-
ing 25 ng of genomic DNA, 130 µM dNTPs, 15 µM each
primer, 2.5 units Taq polymerase, and 1× PCR reaction buffer
(Life Technologies, Rockville, MD, U.S.A.) with 1.5 mM
MgCl2. The reaction conditions were 3 min at 94°C, followed
by 30 cycles of denaturing at 94°C for 1 min, annealing at
45°C for 1 min, and elongating at 72°C for 2 min. Finally, the
reaction was incubated at 72°C for an additional 7 min. The
PCR product was separated by electrophoresis on a 1.2%
(wt/vol) agarose gel. Desired bands were excised from the gel,
and the DNA was purified, using the Prep-A-Gene kit (Bio-
Rad, Hercules, CA, U.S.A.) and cloned in the pGEM-T vector
(Promega, Madison, WI, U.S.A.). Recombinant DNA was
transferred into Escherichia coli DH10B strain cells by elec-
troporation and was plated on the Luria Broth (LB) agar blue
and white selective medium. The white colonies having inserts
were selected, were arrayed as individual clones in 384-well
microplates containing freezing medium (Zhang et al. 1996)
plus 50 mg of ampicillin per liter, and were maintained in –80°C
freezers.
DNA sequencing and analysis.
Clones were randomly selected from the NBS-LRR gene
sequence library and were grown overnight in LB medium
containing 50 mg of ampicillin per liter. Plasmid DNA was
purified according to the alkaline lysis method (Sambrook et
al. 1989) and was sequenced from one or both strands, using
the ABI PRISM BigDye terminator cycle sequencing ready
reaction kit (Applied Biosystems, Foster City, CA, U.S.A.)
with M13 forward or reverse primer. Sequences of the PCR
product were determined on the ABI PRISM 377 DNA sequencer
(Applied Biosystems). Sequences were edited manually to fur-
ther verify the sequence and, using GeneDoc software, to re-
move the primer and vector sequences. Database searches
were performed using the National Center for Biotechnology
Information’s Blastx to search for similarity of the
RGA to the NBS-LRR-encoding R genes and RGA cloned in
plants, with an e-value < 0.001 considered as hits. RGA se-
quences of 500 bp or longer were analyzed in gene index,
according to Deloukas and associates (2001) and Mungall and
associates (2003), using the GENSCAN program (Burge and
Karlin 1997). The RGA sequences that did not give ORF were
defined as “pseudogenes,” according to Deloukas and associ-
ates (2001) and Mungall and associates (2003).
Phylogenetic analysis.
The alignment of the RGA clones was based on 500-bp
nucleotide sequences, starting from the NBS motif between
the degenerate sequence primer pair. The sequences were used
for phylogenetic analysis to construct the phylogenetic tree of
the RGA with the PAUP package version 4.0b10 (Swofford
2001). Felsenstein’s bootstrap method was employed to
evaluate the reliability of each branch of the tree. Also included
in the phylogenetic analysis were the sequences of correspond-
ing regions of the following cloned R genes: L6 (U27081),
RPS2 (U12860), Bs2 (AF202179), GPA2 (AF195939), Mi1-2
(AF039682), I2C-1 (AF004878), RPM1 (X87851), and Cf-5
(AF053993). The L gene belongs to the TIR-NBS-LRR class,
whereas RPS2, Bs2, GPA2, Mi1-2, I2C-1, and RPM1 represent
the non-TIR-NBS-LRR gene class. The Cf-5 gene does not be-
long to the NBS-LRR genes, but it does encode the LRR mo-
tif, and it was used as an outgroup in the experiment. The
nucleotide sequences of the cotton NBS-LRR RGA have been
deposited in GenBank under accession numbers AY600372 to
AY600433. The clades (designating the subfamilies of the
NBS-LRR RGA gene family) that showed significant
similarities to cloned plant NBS-LRR-encoding R genes
were further analyzed at the amino acid sequence level using
ClustalX (Thompson et al. 1997). This computer program pro-
vides an integrated environment for performing multiple se-
quence and profile alignments.
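The resampling step at the core of Felsenstein's bootstrap can be sketched in a few lines of Python. This is only an illustration of the principle (alignment columns are drawn with replacement to build a pseudo-replicate of the same length); the function name and the toy alignment are hypothetical, and tree building for each replicate, which PAUP performed in this study, is not shown.

```python
import random
from typing import Dict


def bootstrap_alignment(alignment: Dict[str, str], rng: random.Random) -> Dict[str, str]:
    """Build one bootstrap pseudo-replicate by sampling columns with replacement."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_sites = lengths.pop()
    # Draw n_sites column indices with replacement; the same indices are
    # applied to every sequence so that columns stay intact.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


# Toy alignment for illustration only.
toy = {"seq1": "ACGT", "seq2": "ACGA", "seq3": "TCGA"}
replicate = bootstrap_alignment(toy, random.Random(1))
```

A tree would then be inferred from each replicate, and the support for a branch would be the fraction of replicate trees in which that branch appears.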
Genetic mapping.
The standard restriction fragment length polymorphism
mapping procedure was used to map RGA to an existing cot-
ton genetic map, using the G. hirsutum TM-1 × G. barbadense
3-79 mapping population containing 171 F2 plants (Yu et al.
1998). Four restriction enzymes, EcoRI, EcoRV, HindIII, and
XbaI, were used to digest genomic DNA of both parents to
prepare Southern blots for polymorphism survey and DNA of
the F2 plants to prepare the Southern blots for mapping. South-
ern blot hybridization was carried out at 65°C, using the puri-
fied insert DNA of the RGA clones as probes. After hybridiza-
tion, the filters were washed three times in 1× SSC (1× SSC is
0.15 M NaCl plus 0.015 M sodium citrate) (Sambrook et al.
1989), 0.1% (wt/vol) sodium dodecyl sulfate at 65°C, 30 min
each wash. The polymorphic bands of each clone were
mapped on the existing cotton genetic map (Yu et al. 1998)
with MAPMAKER 3.0b (Lander et al. 1987), using a log of
the likelihood ratio threshold of 4.0 and the Kosambi mapping
function.
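For reference, the Kosambi mapping function converts a recombination fraction r into a map distance of 25 × ln[(1 + 2r)/(1 − 2r)] centimorgans. The Python sketch below shows this conversion and its inverse; the function names are hypothetical, and the sketch does not reproduce the MAPMAKER calculations used in this study.

```python
import math


def kosambi_cm(r: float) -> float:
    """Convert a recombination fraction (0 <= r < 0.5) to Kosambi centimorgans."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must satisfy 0 <= r < 0.5")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_r(d_cm: float) -> float:
    """Convert a Kosambi map distance in centimorgans back to a recombination fraction."""
    return 0.5 * math.tanh(2.0 * d_cm / 100.0)


distance = kosambi_cm(0.10)  # about 10.1 cM
```

Under this function, a recombination fraction of 0.10 corresponds to about 10.1 cM. A threshold of 4.0 on the log (base 10) of the likelihood ratio means that linkage is accepted only when it is at least 10,000 times more likely than no linkage.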
ACKNOWLEDGMENTS
This study was supported in part by a grant from the Texas Cotton
Biotechnology Initiative and Texas Agricultural Experiment Station
(8536-203232).
LITERATURE CITED
The Arabidopsis Genome Initiative. 2000. Analysis of the genome sequence
of the flowering plant Arabidopsis thaliana. Nature 408:796-815.
Burge, C., and Karlin, S. 1997. Prediction of complete gene structures in
human genomic DNA. J. Mol. Biol. 268:78-94.
Dangl, J. L., and Jones, J. D. 2001. Plant pathogens and integrated defense
responses to infection. Nature 411:826-833.
Deloukas, P., Matthews, L. H., Ashurst, J., Burton, J., Gilbert, J. G. R., Jones,
M., Stavrides, G., Almeida, J. P., Babbage, A. K., Bagguley, C. L., Bailey,
J., Barlow, K. F., Bates, K. N., Beard, L. M., Beare, D. M., Beasley, O. P.,
Bird, C. P., Blakey, S. E., Bridgeman, A. M., Brown, A. J., Buck, D.,
Burrill, W., Butler, A. P., Carder, C., Carter, N. P., Chapman, J. C., Clamp,
M., Clark, G., Clark, L. N., Clark, S. Y., Clee, C. M., Clegg, S., Cobley, V.
E., Collier, R. E., Connor, R., Corby, N. R., Coulson, A., Coville, G. J.,
Deadman, R., Dhami, P., Dunn, M., Ellington, A. G., Frankland, J. A.,
Fraser, A., French, L., Garner, P., Grafham, D. V., Griffiths, C., Griffiths,
M. N. D., Gwilliam, R., Hall, R. E., Hammond, S., Harley, J. L., Heath, P.
D., Ho, S., Holden, J. L., Howden, P. J., Huckle, E., Hunt, A. R., Hunt, S.
E., Jekosch, K., Johnson, C. M., Johnson, D., Kay, M. P., Kimberley, A.
M., King, A., Knights, A., Laird, G. K., Lawlor, S., Lehvaslaiho, M. H.,
Leversha, M., Lloyd, C., Lloyd, D. M., Lovell, J. D., Marsh, V. L., Martin,
S. L., Mcconnachie, L. J., Mclay, K., Mcmurray, A. A., Milne, S., Mistry,
D., Moore, M. J. F., Mullikin, J. C., Nickerson, T., Oliver, K., Parker, A.,
Patel, R., Pearce, T. A. V., Peck, A. I., Phillimore, B. J. C. T.,
Prathalingam, S. R., Plumb, R. W., Ramsay, H., Rice, C. M., Ross, M. T.,
Scott, C. E., Sehra, H. K., Shownkeen, R., Sims, S., Skuce, C. D., Smith,
M. L., Soderlund, C., Steward, C. A., Sulston, J. E., Swann, M.,
Sycamore, N., Taylor, R., Tee, L., Thomas, D. W., Thorpe, A., Tracey, A.,
Tromans, A. C., Vaudin, M., Wall, M., Wallis, J. M., Whitehead, S. L.,
Whittaker, P., Willey, D. L., Williams, L., Williams, S. A., Wilming, L.,
Wray, P. W., Hubbard, T., Durbin, R. M., Bentley, D. R., Beck, S., and
Rogers, J. 2001. The DNA sequence and comparative analysis of human
chromosome 20. Nature 414:865-871.
Doyle, J. J., and Doyle, J. L. 1990. Isolation of plant DNA from fresh tis-
sue. Focus 12:13-15.
Goff, S. A., Ricke, D., Lan, T. H., Presting, G., Wang, R., Dunn, M.,
Glazebrook, J., Sessions, A., Oeller, P., Varma, H., Hadley, D., Hutchison,
D., Martin, C., Katagiri, F., Lange, B. M., Moughamer, T., Xia, Y.,
Budworth, P., Zhong, J., Miguel, T., Paszkowski, U., Zhang, S., Colbert,
M., Sun, W. L., Chen, L., Cooper, B., Park, S., Wood, T. C., Mao, L.,
Quail, P., Wing, R., Dean, R., Yu, Y., Zharkikh, A., Shen, R.,
Sahasrabudhe, S., Thomas, A., Cannings, R., Gutin, A., Pruss, D., Reid,
J., Tavtigian, S., Mitchell, J., Eldredge, G., Scholl, T., Miller, R. M.,
Bhatnagar, S., Adey, N., Rubano, T., Tusneem, N., Robinson, R.,
Feldhaus, J., Macalma, T., Oliphant, A., and Briggs, S. 2002. A draft
sequence of the rice genome (Oryza sativa L. ssp. japonica). Science
296:79-92.
Hammond-Kosack, K. E., and Jones, J. D. G. 1997. Plant disease resis-
tance genes. Ann. Rev. Plant Physiol. Plant Mol. Biol. 48:575-607.
Huettel, B., Santra, D., Muehlbauer, F. J., and Kahl, G. 2002. Resistance
gene analogues of chickpea (Cicer arietinum L.): Isolation, genetic
mapping and association with a Fusarium resistance gene cluster.
Theor. Appl. Genet. 105:479-490.
Hulbert, S. H., Webb, C. A., Smith, S. M., and Sun, Q. 2001. Resistance
gene complexes: Evolution and utilization. Ann. Rev. Phytopathol.
39:285-312.
Kanazin, V., Marek, L. F., and Shoemaker, R. C. 1996. Resistance gene
analogs are conserved and clustered in soybean. Proc. Natl. Acad. Sci.
U.S.A. 93:11746-11750.
Lagudah, E. S., Moullet, O., and Appels, R. 1997. Map-based cloning of a
gene sequence encoding a nucleotide-binding domain and a leucine-
rich region at the Cre3 nematode resistance locus of wheat. Genome
40:659-665.
Lander, E. S., Green, P., Abrahamson, J., Barlow, A., Daly, M. J., Lincoln,
S. E., and Newburg, L. 1987. MAPMAKER: An interactive computer
package for constructing primary genetic linkage maps of experimental
and natural populations. Genomics 1:174-181.
Leister, D., Ballvora, A., Salamini, F., and Gebhardt, C. 1996. A PCR-
based approach for isolating pathogen resistance genes from potato
with potential for wide application in plants. Nat. Genet. 14:421-429.
Leister, D., Kurth, J., Laurie, D. A., Yano, M., Sasaki, T., Devos, K., Graner,
A., and Schulze-Lefert, P. 1998. Rapid reorganization of resistance gene
homologues in cereal genomes. Proc. Natl. Acad. Sci. U.S.A. 95:370-375.
Mago, R., Nair, S., and Mohan, M. 1999. Resistance gene analogues from
rice: Cloning, sequencing and mapping. Theor. Appl. Genet. 99:50-57.
Marck, C. 1988. “DNA Strider”: A “C” program for the fast analysis of
DNA and protein sequences on the Apple Macintosh family of com-
puters. Nucleic Acids Res. 16:1829-1836.
Meyers, B. C., Dickerman, A. W., Michelmore, R. W., Sivaramakrishnan,
S., Sobral, B. W., and Young, N. D. 1999. Plant disease resistance genes
encode members of an ancient and diverse protein family within the
nucleotide-binding superfamily. Plant J. 20:317-332.
Meyers, B. C., Kozik, A., Griego, A., Kuang, H., and Michelmore, R. W.
2003. Genome-wide analysis of NBS-LRR–encoding genes in Arabi-
dopsis. Plant Cell 15:809-834.
Mungall, A. J., Palmer, S. A., Sims, S. K., Edwards, C. A., Ashurst, J. L.,
Wilming, L., Jones, M. C., Horton, R., Hunt, S. E., Scott, C. E., Gilbert,
J. G. R., Clamp, M. E., Bethel, G., Milne, S., Ainscough, R., Almeida,
J. P., Ambrose, K. D., Andrews, T. D., Ashwell, R. I. S., Babbage, A.
K., Bagguley, C. L., Bailey, J., Banerjee, R., Barker, D. J., Barlow, K.
F., Bates, K., Beare, D. M., Beasley, H., Beasley, O., Bird, C. P., Blakey,
S., Bray-Allen, S., Brook, J., Brown, A. J., Brown, J. Y., Burford, D. C.,
Burrill, W., Burton, J., Carder, C., Carter, N. P., Chapman, J. C., Clark,
S. Y., Clark, G., Clee, C. M., Clegg, S., Cobley, V., Collier, R. E.,
Collins, J. E., Colman, L. K., Corby, N. R., Coville, G. J., Culley, K.
M., Dhami, P., Davies, J., Dunn, M., Earthrowl, M. E., Ellington, A. E.,
Evans, K. A., Faulkner, L., Francis, M. D., Frankish, A., Frankland, J.,
French, L., Garner, P., Garnett, J., Ghori, M. J. R., Gilby, L. M., Gillson,
C. J., Glithero, R. J., Grafham, D. V., Grant, M., Gribble, S., Griffiths,
C., Griffiths, M., Hall, R., Halls, K. S., Hammond, S., Harley, J. L.,
Hart, E. A., Heath, P. D., Heathcott, R., Holmes, S. J., Howden, P. J.,
Howe, K. L., Howell, G. R., Huckle, E., Humphray, S. J., Humphries,
M. D., Hunt, A. R., Johnson, C. M., Joy, A. A., Kay, M., Keenan, S. J.,
Kimberley, A. M., King, A., Laird, G. K., Langford, C., Lawlor, S.,
Leongamornlert, D. A., Leversha, M., Lloyd, C. R., Lloyd, D. M.,
Loveland, J. E., Lovell, L., Martin, S., Mashreghi-Mohammadi, M.,
Maslen, G. L., Matthews, L., Mccann, O. T., Mclaren, S. J., Mclay, K.,
Mcmurray, A., Moore, M. J. F., Mullikin, J. C., Niblett, D., Nickerson,
T., Novik, K. L., Oliver, K., Overton-Larty, E. K., Parker, A., Patel, R.,
Pearce, A. V., Peck, A. I., Phillimore, B., Phillips, S., Plumb, R. W.,
Porter, K. M., Ramsey, Y., Ranby, S. A., Rice, C. M., Ross, M. T.,
Searle, S. M., Sehra, H. K., Sheridan, E., Skuce, C. D., Smith, S.,
Smith, M., Spraggon, L., Squares, S. L., Steward, C. A., Sycamore, N.,
Tamlyn-Hall, G., Tester, J., Theaker, A. J., Thomas, D. W., Thorpe, A.,
Tracey, A., Tromans, A., Tubby, B., Wall, M., Wallis, J. M., West, A. P.,
White, S. S., Whitehead, S. L., Whittaker, H., Wild, A., Willey, D. J.,
Wilmer, T. E., Wood, J. M., Wray, P. W., Wyatt, J. C., Young, L.,
Younger, R. M., Bentley, D. R., Coulson, A., Durbin, R., Hubbard, T.,
Sulston, J. E., Dunham, I., Rogers, J., and Beck, S. 2003. The DNA
sequence and analysis of human chromosome 6. Nature 425:805-811.
Ohmori, T., Murata, M., and Motoyoshi, F. 1998. Characterization of dis-
ease resistance gene-like sequences in near-isogenic lines of tomato.
Theor. Appl. Genet. 96:331-338.
Pan, Q., Wendel, J., and Fluhr, R. 2000. Divergent evolution of plant
NBS-LRR resistance gene homologues in dicot and cereal genomes. J.
Mol. Evol. 50:203-213.
Ramalingam, J., Vera Cruz, C. M., Kukreja, K., Chittoor, J. M., Wu, J.-L.,
Lee, S. W., Baraoidan, M., George, M. L., Cohen, M. B., Hulbert, S. H.,
Leach, J. E., and Leung, H. 2003. Candidate defense genes from rice,
barley, and maize and their association with qualitative and quantitative
resistance in rice. Mol. Plant-Microbe Interact. 16:14-24.
Richly, E., Kurth, J., and Leister, D. 2002. Mode of amplification and reor-
ganization of resistance genes during recent Arabidopsis thaliana evo-
lution. Mol. Biol. Evol. 19:76-84.
Sambrook, J., Fritsch, E. F., and Maniatis, T. 1989. Molecular cloning: A
Laboratory Manual, 2nd ed. Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY, U.S.A.
Seah, S., Sivasithamparam, K., Karakousis, A., and Lagudah, E. S. 1998.
Cloning and characterisation of a family of disease resistance gene ana-
logs from wheat and barley. Theor. Appl. Genet. 97:937-945.
Seah, S., Spielmeyer, W., Jahier, J., Sivasithamparam, K., and Lagudah,
E. S. 2000. Resistance gene analogs within an introgressed chromoso-
mal segment derived from Triticum ventricosum that confers resistance
to nematode and rust pathogens in wheat. Mol. Plant-Microbe Interact.
13:334-341.
Seelanan, T., Schnabel, A., and Wendel, J. F. 1997. Congruence and con-
sensus in the cotton tribe. Syst. Bot. 22:259-290.
Shen, K. A., Meyers, B. C., Islam-Faridi, M. N., Chin, D. B., Stelly, D.
M., and Michelmore, R. W. 1998. Resistance gene candidates identi-
fied by PCR with degenerate oligonucleotide primers map to clusters
of resistance genes in lettuce. Mol. Plant-Microbe Interact. 11:815-
823.
Small, R. L., Ryburn, J. A., Cronn, R. C., Seelanan, T., and Wendel, J. F.
1998. The tortoise and the hare: Choosing between noncoding plastome
and nuclear Adh sequences for phylogeny reconstruction in a recently
diverged plant group. Amer. J. Bot. 85:1301-1315.
Suominen, L., Roos, C., Lortet, G., Paulin, L., and Lindstrom, K. 2001.
Identification and structure of the Rhizobium galegae common nodula-
tion genes: Evidence for horizontal gene transfer. Mol. Biol. Evol.
18:907-916.
Swofford, D. L. 2001. PAUP: Phylogenetic Analysis Using Parsimony.
Version 4. Sinauer Associates, Sunderland, MA, U.S.A.
Thompson, J. D., Gibson, T. J., Plewniak, F., Jeanmougin, F., and Higgins,
D. G. 1997. The ClustalX windows interface: Flexible strategies for
multiple sequence alignment aided by quality analysis tools. Nucleic
Acids Res. 25:4876-4882.
Van der Biezen, E. A., and Jones, J. D. G. 1998. The NB-ARC domain: A
novel signaling motif shared by plant resistance gene products and
regulators of cell death in animals. Curr. Biol. 8:226-227.
Wendel, J. F. 1989. New World tetraploid cottons contain Old World cyto-
plasm. Proc. Natl. Acad. Sci. U.S.A. 86:4132-4136.
Wendel, J. F., and Albert, V. A. 1992. Phylogenetics of the cotton genus
(Gossypium L.): Character-state weighted parsimony analysis of chloro-
plast DNA restriction site data and its systematic and biogeographic im-
plications. Syst. Bot. 17:115-143.
Wright, R. J., Thaxton, P. M., El-Zik, K. M., and Paterson, A. H. 1998.
D-subgenome bias of Xcm resistance genes in tetraploid Gossypium
(cotton) suggests that polyploid formation has created novel avenues
for evolution. Genetics 149:1987-1996.
Yu, Y. G., Buss, G. R., and Maroof, M. A. S. 1996. Isolation of a super-
family of candidate disease resistance genes in soybean based on a con-
served nucleotide-binding site. Proc. Natl. Acad. Sci. U.S.A. 93:11751-
11756.
Yu, Z. H., Park, Y. H., Lazo, G. R., and Kohel, R. J. 1998. Molecular map-
ping of the cotton genome: QTL analysis of fiber quality characteristics.
Proc. Plant Animal Genome VI P352.
Zhang, H.-B., Choi, S., Woo, S.-S., Li, Z., and Wing, R. A. 1996. Con-
struction and characterization of two rice bacterial artificial chromo-
some libraries from the parents of a permanent recombinant inbred
mapping population. Mol. Breed. 2:11-24.
Zhu, H., Cannon, S. B., Young, N. D., and Cook, D. R. 2002. Phylogeny
and genomic organization of the TIR and non-TIR-NBS-LRR resistance
gene family in Medicago truncatula. Mol. Plant-Microbe
Interact. 15:529-539.
AUTHOR-RECOMMENDED INTERNET RESOURCES
ClustalX webpage: www-igbmc.u-strasbg.fr/BioInfo/ClustalX/Top.html
GeneDoc software: www.psc.edu/biomed/genedoc/
The GENSCAN program: genes.mit.edu/GENSCAN.html