
The map-based sequence of the rice genome

August 2005
Nature 436(7052)


DOI:10.1038/nature03895

Authors:

Jia Liu

The Hong Kong Polytechnic University

Kristine M Jones

U.S. Department of Health and Human Services


Rice, one of the world's most important food plants, has important syntenic relationships with the other cereal species and is a model plant for the grasses. Here we present a map-based, finished quality sequence that covers 95% of the 389Mb genome, including virtually all of the euchromatin and two complete centromeres. A total of 37,544 nontransposable- element-related protein-coding genes were identified, of which 71% had a putative homologue in Arabidopsis. In a reciprocal analysis, 90% of the Arabidopsis proteins had a putative homologue in the predicted rice proteome. Twenty-nine per cent of the 37,544 predicted genes appear in clustered gene families. The number and classes of transposable elements found in the rice genome are consistent with the expansion of syntenic regions in the maize and sorghum genomes. We find evidence for widespread and recurrent gene transfer from the organelles to the nuclear chromosomes. The map-based sequence has proven useful for the identification of genes underlying agronomic traits. The additional single-nucleotide polymorphisms and simple sequence repeats identified in our study should accelerate improvements in rice production.

Maps of the twelve rice chromosomes. For each chromosome (Chr 1−12), the genetic map is shown on the left and the PAC/BAC contigs on the right. The position of markers flanking the PAC/BAC contigs (green) is indicated on the genetic map. Physical gaps are shown in white and the nucleolar organizer on chromosome 9 is represented with a dotted green line. Constrictions in the genetic maps and arrowheads to the right of physical maps represent the chromosomal positions of centromeres for which rice CentO satellites are sequenced. The maps are scaled to genetic distances in centimorgans (cM) and the physical maps are depicted in relative physical lengths. Please refer to Table 2 for estimated lengths of the chromosomes.



International Rice Genome Sequencing Project*

Rice (Oryza sativa L.) is the most important food crop in the world

and feeds over half of the global population. As the ﬁrst step in a

systematic and complete functional characterization of the rice

genome, the International Rice Genome Sequencing Project

(IRGSP) has generated and analysed a highly accurate ﬁnished

sequence of the rice genome that is anchored to the genetic map.

Our analysis has revealed several salient features of the rice

genome:

• We provide evidence for a genome size of 389 Mb. This estimate is ~260 Mb larger than the genome of the fully sequenced dicot plant model Arabidopsis thaliana. We generated 370 Mb of finished sequence, representing 95% coverage of the genome and virtually all of the euchromatic regions.

• A total of 37,544 non-transposable-element-related protein-coding sequences were detected, compared with ~28,000–29,000 in Arabidopsis, with a lower gene density of one gene per 9.9 kb in rice. A total of 2,859 genes seem to be unique to rice and the other cereals, some of which might differentiate monocot and dicot lineages.

• Gene knockouts are useful tools for determining gene function and relating genes to phenotypes. We identified 11,487 Tos17 retrotransposon insertion sites, of which 3,243 are in genes.

• Between 0.38 and 0.43% of the nuclear genome contains organellar DNA fragments, representing repeated and ongoing transfer of organellar DNA to the nuclear genome.

• The transposon content of rice is at least 35%, and the genome is populated by representatives from all known transposon superfamilies.

• We have identified 80,127 polymorphic sites that distinguish between two cultivated rice subspecies, japonica and indica, resulting in a high-resolution genetic map for rice. Single-nucleotide polymorphism (SNP) frequency varies from 0.53 to 0.78%, which is 20 times the frequency observed between the Columbia and Landsberg erecta ecotypes of Arabidopsis.

• A comparison between the IRGSP genome sequence and the 6.3× indica and 6× japonica whole-genome shotgun sequence assemblies revealed that the draft sequences provided coverage of 69% by indica and 78% by japonica relative to the map-based sequence.

Rice has played a central role in human nutrition and culture for the past 10,000 years. It has been estimated that world rice production must increase by 30% over the next 20 years to meet projected demands from population increase and economic development. Rice grown on the most productive irrigated land has achieved nearly maximum production with current strains. Environmental degradation, including pollution and an increase in night-time temperature due to global warming, together with reductions in suitable arable land, water, labour and energy-dependent fertilizer, provide additional constraints. These factors make steps to maximize rice productivity particularly important. Increasing yield potential and yield stability will come from a combination of biotechnology and improved conventional breeding. Both will be dependent on a high-quality rice genome sequence.

Rice benefits from having the smallest genome of the major cereals, dense genetic maps and relative ease of genetic transformation. The discovery of extensive genome colinearity among the Poaceae has established rice as the model organism for the cereal grasses. These properties, along with the finished sequence and other tools under development, set the stage for a complete functional characterization of the rice genome.

The International Rice Genome Sequencing Project

The IRGSP, formally established in 1998, pooled the resources of sequencing groups in ten nations to obtain a complete finished quality sequence of the rice genome (Oryza sativa L. ssp. japonica cv. Nipponbare). Finished quality sequence is defined as containing less than one error in 10,000 nucleotides, having resolved ambiguities, and having made all state-of-the-art attempts to close gaps. The IRGSP released a high-quality map-based draft sequence in December 2002. Three completely sequenced chromosomes have been published (refs 5–7), as well as two completely sequenced centromeres (refs 8–10). As the IRGSP subscribed to an immediate-release policy, high-quality map-based sequence has been public for some time. This has permitted rice geneticists to identify several genes underlying traits, and revealed very large and previously unknown segmental duplications that comprise 60% of the genome (refs 11–13). The public sequence has also revealed new details about the syntenic relationships and gene mobility between rice, maize and sorghum (refs 13–15).

*Lists of participants and affiliations appear at the end of the paper.

Vol 436|11 August 2005|doi:10.1038/nature03895

Physical maps, sequencing and coverage

The IRGSP sequenced the genome of a single inbred cultivar, Oryza sativa ssp. japonica cv. Nipponbare, and adopted a hierarchical clone-by-clone method using bacterial and P1 artificial chromosome clones (BACs and PACs, respectively). This strategy used a high-density genetic map, expressed-sequence tags (ESTs), yeast artificial chromosome (YAC)- and BAC-based physical maps (refs 18–20), BAC-end sequences and two draft sequences (refs 22, 23). A total of 3,401 BAC/PAC clones (Table 1) were sequenced to approximately tenfold sequence coverage, assembled, ordered and finished to a sequence quality of less than one error per 10,000 bases. A majority of physical gaps in the BAC/PAC tiling path were bridged using a variety of substrates, including PCR fragments, 10-kb plasmids and 40-kb fosmid clones. A total of 62 unsequenced physical gaps, including nine centromere and 17 telomere gaps, remain on the 12 chromosomes (Table 2). Chromosome arm and telomere gaps were measured, and the nine centromere gaps were estimated on the basis of CentO satellite DNA content. The remaining gaps are estimated to total 18.1 Mb.

Ninety-seven percent of the BAC/PACs and gap sequences (3,360)

have been submitted as ﬁnished quality in the PLN division of

GenBank/DDBJ/EMBL. These and the remaining draft-sequenced

clones were used to construct pseudomolecules representing the 12

chromosomes of rice (Fig. 1). The total nucleotide sequence of the 12

pseudomolecules is 370,733,456 bp, with an N-average continuous

sequence length of 6.9 Mb (see Table 1 for a deﬁnition of N-average

length). Sequence quality was assessed by comparing 1.2 Mb of

overlapping sequence produced by different laboratories. The overall

accuracy was calculated as 99.99% (Supplementary Table 2). The

statistics of sequenced PAC/BAC clones and pseudomolecules for

each chromosome are shown in Table 1.
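The N-average length quoted above is defined in the Table 1 footnote as the average length of a contiguous segment containing a randomly chosen nucleotide. A minimal Python sketch of that definition follows; the function name and the example segment lengths are illustrative assumptions and are not taken from the paper.

```python
def n_average_length(segment_lengths):
    """Expected length of the contiguous segment that contains a
    randomly chosen nucleotide (the footnote definition of N-average)."""
    total = sum(segment_lengths)
    if total == 0:
        raise ValueError("need at least one non-empty segment")
    # A nucleotide lands in a given segment with probability length / total,
    # so the expectation is the sum of length * (length / total).
    return sum(length * length for length in segment_lengths) / total


# Hypothetical contiguous segments in bp (illustrative only).
example_segments = [4_000_000, 1_000_000]
print(n_average_length(example_segments))  # 3400000.0
```

Because long segments contain more nucleotides, they are weighted more heavily than in a plain mean (2.5 Mb for the example above), which is why this statistic is less sensitive to many small contigs.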

Rice (O. sativa ssp. japonica cv. Nipponbare) was reported to have a haploid nuclear DNA content of 394 Mb on the basis of flow cytometry, and 403 Mb on the basis of lengths of anchored BAC contigs and estimates of gap sizes. Table 2 shows the calculated size for each chromosome and the estimated coverage. Adding the estimated length of the gaps to the sum of the non-overlapping sequence, the total length of the rice nuclear genome was calculated to be 388.8 Mb. Therefore, the pseudomolecules are expected to cover 95.3% of the entire genome and an estimated 98.9% of the euchromatin. An independent measure of genome coverage represented by the pseudomolecules was obtained by searching for unique EST markers; of 8,440 ESTs, 8,391 (99.4%) were identified in the pseudomolecules.
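The genome-size arithmetic in this paragraph can be retraced from the totals row of Table 2. The sketch below is only a worked check under that assumption; the function and variable names are hypothetical, and the small difference from the tabulated 388.82 Mb presumably reflects rounding of the per-chromosome values.

```python
def genome_estimate(sequenced_bp, gap_lengths_mb):
    """Return (total gap length, genome size, coverage %) in Mb and per cent."""
    sequenced_mb = sequenced_bp / 1_000_000
    gap_total_mb = sum(gap_lengths_mb.values())
    genome_mb = sequenced_mb + gap_total_mb
    coverage_pct = 100 * sequenced_mb / genome_mb
    return gap_total_mb, genome_mb, coverage_pct


# Totals ("All" row) of Table 2, in Mb.
TABLE2_GAP_TOTALS_MB = {
    "arm_gaps": 3.51,
    "telomeric_gaps": 0.81,
    "centromeric_gaps": 6.59,
    "rdna": 7.20,
}

gap_total, genome_mb, coverage = genome_estimate(370_733_456, TABLE2_GAP_TOTALS_MB)
print(f"gaps: {gap_total:.1f} Mb, genome: {genome_mb:.1f} Mb, coverage: {coverage:.1f}%")
# gaps: 18.1 Mb, genome: 388.8 Mb, coverage: 95.3%
```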

Centromere location

Typical eukaryotic centromeres contain repetitive sequences, including satellite DNA at the centre and retrotransposons and transposons in the flanking regions. All rice centromeres contain the highly repetitive 155–165 bp CentO satellite DNA, together with centromere-specific retrotransposons (refs 25, 26). The CentO satellites are located within the functional domain of the rice centromere (refs 10, 26). Complete sequencing of the centromeres of rice chromosomes 4 and 8 revealed that they consist of 59 kb and 69 kb of clustered CentO repeats, respectively (refs 8–10), tandemly arrayed head-to-tail within the clusters. Numerous retrotransposons, including the centromere-specific RIRE7, are found between and around the CentO repeats. CentO clusters show differences in length and orientation for the two centromeres.

BLASTN analysis of the pseudomolecules indicated that about

0.9 Mb of CentO repeats (corresponding to more than 5,800 copies of

the satellite) were sequenced and found to be associated with

centromere-speciﬁc retroelements. Locations of all CentO sequences

correspond to genetically identiﬁed centromere regions (Supplemen-

tary Table 3). Our pseudomolecules cover the centromere regions on

chromosomes 4, 5 and 8, and portions of the centromeres on the

remaining chromosomes (Fig. 1).

Gene content, expression and distribution

We masked the pseudomolecules for repetitive sequences and used

the ab initio gene ﬁnder FGENESH to identify only non-transpo-

sable-element-related genes. A total of 37,544 non-transposable-

element protein-coding sequences were predicted, resulting in a

density of one gene per 9.9 kb (Supplementary Tables 4 and 5). As

the ability to identify unannotated and transposable-element-related

genes improves, the true protein-coding gene number in rice will

doubtless be revised.
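The reported density of one gene per 9.9 kb is consistent with dividing the total pseudomolecule length by the number of predicted genes. The short check below assumes that this is how the figure was derived (the text does not spell out the denominator); the names are illustrative.

```python
PSEUDOMOLECULE_BP = 370_733_456   # total length of the 12 pseudomolecules
PREDICTED_GENES = 37_544          # non-transposable-element-related models


def kb_per_gene(total_bp, gene_count):
    """Average spacing between genes in kilobases."""
    return total_bp / gene_count / 1000


print(round(kb_per_gene(PSEUDOMOLECULE_BP, PREDICTED_GENES), 1))  # 9.9
```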

Full-length complementary DNA sequences are available for rice and provide a powerful resource for improving gene model structure derived from ab initio gene finders. Of the 37,544 non-transposable-element-related FGENESH models, 17,016 could be supported by a total of 25,636 full-length cDNAs (Supplementary Table 6). A total of 22,840 (61%) genes had a high identity match with a rice EST or full-length cDNA. On average, about 10.7 EST sequences were present for each expressed rice gene. A total of 2,927 genes aligned well with ESTs from other cereal species, and 330 of these genes matched only with a non-rice cereal EST (Supplementary Fig. 1). Except for the short arms of chromosomes 4, 9 and 10, which are known to be highly heterochromatic, the density of expressed genes is greater on the distal portions of the chromosome arms compared with the regions around the centromeres (Supplementary Fig. 2).

A total of 19,675 proteins had matches with entries in the Swiss-

Prot database; of these, 4,500 had no expression support. Domain

searches revealed a minimum of one motif or domain present in 63%

of the predicted proteins, with a total of 3,328 different domains

present in the predicted rice proteome. The ﬁve most abundant

domains were associated with protein kinases (Supplementary

Table 7). Fifty-one per cent of the predicted proteins could be

associated with a biological process (Supplementary Fig. 3a), with

metabolism (29.1%) and cellular physiological processes (11.9%)

representing the two most abundant classes.

Approximately 71% (26,837) of the predicted rice proteins have a

homologue in the Arabidopsis proteome (Supplementary Fig. 4). In a

reciprocal search, 89.8% (26,004) of the proteins from the Arabi-

dopsis genome have a homologue in the rice proteome. Of the 23,170

rice genes with rice EST, cereal EST, or full-length cDNA support,

20,311 (88%) have a homologue in Arabidopsis. Fewer putative

homologues were found in other model species: 38.1% in Drosophila,

40.8% in human, 36.5% in Caenorhabditis elegans, 30.2% in yeast,

17.6% in Synechocystis and 10.2% in Escherichia coli.

There are profound differences in plant architecture and biochem-

istry between monocotyledonous and dicotyledonous angiosperms.

Only 2,859 rice genes with evidence of transcription lack homologues

in the Arabidopsis genome. We investigated these to learn what

functions they encoded. The vast majority had no matches, or

most closely matched unknown or hypothetical proteins. The grasses

have a class of seed storage proteins called prolamins that is not found

in dicots. There are also families of hormone response proteins and

defence proteins, such as proteinase inhibitors, chitinases, patho-

genesis-related proteins and seed allergens, many of which are

tandemly repeated (Supplementary Table 8). Nevertheless, with a

large number of proteins of unknown function, the most interesting

differences between the genome content of these two groups of

angiosperms remain to be discovered.

Tos17 is an endogenous copia-like retrotransposon in rice that is inactive under normal growth conditions. In tissue culture, it becomes activated, transposes and is stably inherited when the plant is regenerated. There are only two copies of Tos17 in the rice cultivar Nipponbare. These features, together with its preferential insertion into gene-rich regions, make Tos17 uniquely suitable for the functional analysis of rice genes by gene disruption. About 50,000 Tos17-insertion lines carrying 500,000 insertions have been produced. A total of 11,487 target loci were mapped on the 12 pseudomolecules (Supplementary Fig. 5), with at least one insertion detected in 3,243 genes. The density of Tos17 insertions is higher in euchromatic regions of the genome, in contrast to the distribution of high-copy retrotransposons, which are more frequently found in pericentromeric regions. A similar target site preference has been reported for T-DNA insertions in Arabidopsis.

Tandem gene families

One surprising outcome of the Arabidopsis genome analysis was the large percentage (17%) of genes arranged in tandem repeats. A similar analysis with rice gave a comparable percentage (14%). However, manual curation on rice chromosome 10 showed one gene family encoding a glycine-rich protein with 27 copies and one encoding a TRAF/BTB domain protein with 48 copies. These tandemly repeated families are interrupted with other genes and are not included in strictly defined tandem repeats. We therefore screened for all tandemly arranged genes in 5-Mb intervals. Using these criteria, 29% of the genes (10,837) are amplified at least once in tandem, and 153 rice gene arrays contained 10–134 members (Supplementary Fig. 6). Sixty-five per cent of the tandem arrays with over 27 members, and 33% of all the arrays with over 10 members, contain protein kinase domains (Supplementary Table 9).
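The relaxed screen described above, which tolerates unrelated genes between family members, could be sketched as follows. This is an illustration only: the text does not give the exact grouping rule, so the sketch assumes that genes of the same family on the same chromosome belong to one array when consecutive members start within 5 Mb of each other. The function name, input format and family labels are hypothetical.

```python
from collections import defaultdict


def tandem_arrays(genes, window_bp=5_000_000):
    """Group same-family genes on a chromosome into arrays.

    genes: iterable of (chromosome, start_bp, family) tuples.
    Returns a list of (chromosome, family, [start positions]) with
    at least two members per array.
    """
    starts_by_group = defaultdict(list)
    for chromosome, start, family in genes:
        starts_by_group[(chromosome, family)].append(start)

    arrays = []
    for (chromosome, family), starts in starts_by_group.items():
        starts.sort()
        current = [starts[0]]
        for position in starts[1:]:
            if position - current[-1] <= window_bp:
                current.append(position)      # still inside the window
            else:
                if len(current) > 1:
                    arrays.append((chromosome, family, current))
                current = [position]          # start a new candidate array
        if len(current) > 1:
            arrays.append((chromosome, family, current))
    return arrays


# Toy input (positions and family names are made up for illustration).
toy_genes = [
    ("chr10", 1_000_000, "kinase"),
    ("chr10", 1_200_000, "other"),     # unrelated gene interrupting the array
    ("chr10", 2_500_000, "kinase"),
    ("chr10", 20_000_000, "kinase"),   # too far away to join the array
    ("chr1", 1_100_000, "kinase"),     # different chromosome
]
print(tandem_arrays(toy_genes))
```

Interrupting genes are ignored here because grouping is done per family, which mirrors the motivation given in the text for moving beyond strictly defined tandem repeats.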

Non-coding RNA genes

The nucleolar organizer, consisting of 17S–5.8S–25S ribosomal DNA coding units, is found at the telomeric end of the short arm of chromosome 9 (ref. 34) in O. sativa ssp. japonica, and is estimated to comprise 7 Mb (ref. 35). A second 17S–5.8S–25S rDNA locus is found at the end of the short arm of chromosome 10 in O. sativa ssp. indica. A single 5S cluster is present on the short arm of chromosome 11 in the vicinity of the centromere, and encompasses 0.25 Mb.

A total of 763 transfer RNA genes, including 14 tRNA pseudogenes, were detected in the 12 pseudomolecules. In comparison, a total of 611 tRNA genes were detected in Arabidopsis. Supplementary Fig. 7 shows the distribution of these tRNA genes in each chromosome. Chromosome 4 has a single tRNA cluster, and chromosome 10 has two large clusters derived from inserted chloroplast DNA. Except for regions of intermediate density on chromosomes 1, 2, 8 and 12, there seem to be no other large clusters.

MicroRNAs (miRNAs), a class of eukaryotic non-coding RNAs, are believed to regulate gene expression by interacting with the target messenger RNA. miRNAs have been predicted from Arabidopsis and rice, and we mapped 158 miRNAs onto the rice pseudomolecules (Supplementary Table 10). Among other non-coding RNAs, we identified 215 small nucleolar RNA (snoRNA) and 93 spliceosomal RNA genes, both showing biased chromosomal distributions, in the rice genome (Supplementary Table 11).

Organellar insertions in the nuclear genome

Mitochondria and chloroplasts originated from alpha-proteobac-

teria and cyanobacteria endosymbionts. A continuous transfer of

organellar DNA to the nucleus has resulted in the presence of

chloroplast and mitochondrial DNA inserted in the nuclear chromo-

somes. Although the endosymbionts probably contained genomes of

several Mb at the time they were internalized, the organellar genomes

diminished so that the present size of the mitochondrial genome is

less than 600 kb, and that of the chloroplast is only 150 kb. Homology

searches detected 421–453 chloroplast insertions and 909–1,191

mitochondrial insertions, depending upon the stringency adopted

(Supplementary Fig. 8 and Supplementary Table 12). Thus, chlor-

oplast and mitochondrial insertions contribute 0.20–0.24% and

0.18–0.19% of the nuclear genome of rice, respectively, and corre-

spond to 5.3 chloroplast and 1.3 mitochondrial genome equivalents.

The distribution of chloroplast and mitochondrial insertions over the 12 chromosomes indicates that mitochondrial and chloroplast transfers occurred independently. Two chromosomes harbour more insertions than the others (Supplementary Fig. 8 and Supplementary Table 12), with chromosome 12 containing nearly 1% mitochondrial DNA and chromosome 10 containing approximately 0.8% chloroplast DNA. It is clear that several successive transfer events have occurred, as insertions of less than 10 kb have heterogeneous identities. The longest insertions, however, systematically show >98.5% identity to organellar DNA (Supplementary Table 13), indicating recent insertions for both chloroplast and mitochondrial genomes.

Table 1 | Classification and distribution of sequenced PAC and BAC clones* on the 12 rice chromosomes

| Chr | Sequencing laboratory† | PAC | BAC | OSJNBa/b | OJ | OSJNO | Others‡ | Total§ | Pseudomolecule (bp) | N-average length∥ (bp) | Accession no. |
| 1 | RGP, KRGRP | 251 | 77 | 42 | 23 | 4 | 0 | 397 | 43,260,640 | 9,688,259 | AP008207 |
| 2 | RGP, JIC | 117 | 16 | 80 | 142 | 4 | 0 | 359 | 35,954,074 | 7,793,366 | AP008208 |
| 3 | ACWW, TIGR | 1 | 8 | 263 | 47 | 1 | 10 | 330 | 36,189,985 | 5,196,992 | AP008209 |
| 4 | NCGR | 2 | 7 | 275 | 7 | 0 | 0 | 291 | 35,489,479 | 1,427,419 | AP008210 |
| 5 | ASPGC | 67 | 11 | 113 | 87 | 0 | 0 | 278 | 29,733,216 | 3,086,418 | AP008211 |
| 6 | RGP | 169 | 20 | 78 | 14 | 0 | 0 | 281 | 30,731,386 | 8,669,608 | AP008212 |
| 7 | RGP | 102 | 19 | 68 | 97 | 0 | 0 | 286 | 29,643,843 | 14,923,781 | AP008213 |
| 8 | RGP | 113 | 23 | 56 | 83 | 2 | 0 | 277 | 28,434,680 | 14,872,702 | AP008214 |
| 9 | RGP, KRGRP, BIOTEC, BRIGI | 72 | 24 | 72 | 50 | 5 | 0 | 223 | 22,692,709 | 5,219,517 | AP008215 |
| 10 | ACWW, TIGR, PGIR | 1 | 5 | 172 | 6 | 0 | 21 | 205 | 22,683,701 | 2,124,647 | AP008216 |
| 11 | ACWW, TIGR, IIRGS, PGIR, Genoscope | 10 | 6 | 236 | 3 | 2 | 1 | 258 | 28,357,783 | 1,087,274 | AP008217 |
| 12 | Genoscope | 2 | 6 | 179 | 79 | 0 | 2 | 268 | 27,561,960 | 7,600,514 | AP008218 |
| Total | | 907 | 222 | 1,634 | 638 | 18 | 34 | 3,453 | 370,733,456 | 6,928,182 | |

Chr, chromosome.

*PAC, Rice Genome Research Program PAC; BAC, Rice Genome Research Program BAC; OSJNBa/b, Clemson University Genomics Institute BAC; OJ, Monsanto BAC; OSJNO, Arizona Genomics Institute fosmid (http://www.genome.arizona.edu/orders/direct.html?library=OSJNOa); Others, artificial gap-filling clones designated as OSJNA and OJA.

†ACWW (Arizona Genomics Institute, Cold Spring Harbor Laboratory, Washington University Genome Sequencing Center, University of Wisconsin) Rice Genome Sequencing Consortium; ASPGC, Academia Sinica Plant Genome Center; BIOTEC, National Center for Genetic Engineering and Biotechnology; BRIGI, Brazilian Rice Genome Initiative; IIRGS, Indian Initiative for Rice Genome Sequencing; JIC, John Innes Centre; KRGRP, Korea Rice Genome Research Program; NCGR, National Center for Gene Research; PGIR, Plant Genome Initiative at Rutgers; RGP, Rice Genome Research Program; TIGR, The Institute for Genomic Research.

‡Constructs derived by joining (mostly from the clone gap regions) sequence from PCR fragments, Monsanto or Syngenta sequences and the neighbouring clone sequences.

§A total of 2,494 BAC and 907 PAC clones were used for draft and finished sequencing. Monsanto draft-sequenced BACs underlie 638 finished clones. The Syngenta draft sequence contributed to the assemblies of 140 IRGSP clone sequences. Thirty-four sequence submissions are artificial constructs derived by joining a regional sequence (mostly from the clone gap regions) from PCR fragments, Monsanto or Syngenta sequences with the neighbouring clone sequences. This also includes 93 clones submitted as phase 1 or phase 2 to the HTG section of GenBank.

∥N-average length: the average length of a contiguous segment (without sequence or physical gaps) containing a randomly chosen nucleotide.

Transposable elements

The rice genome is populated by representatives from all known

transposon superfamilies, including elements that cannot be easily

classiﬁed into either class I or II (ref. 40). Previous estimates of the

transposon content in the rice genome range from 10 to 25% (refs 21,

40). However, the increased availability of transposon query

sequences and the use of proﬁle hidden Markov models allow the

identiﬁcation of more divergent elements

and indicate that the

transposon content of the O. sativa ssp. japonica genome is at least

35% (Table 3). Chromosomes 8 and 12 have the highest transposon

content (38.0% and 38.3%, respectively), and chromosomes 1

(31.0%), 2 (29.8%) and 3 (29.0%) have the lowest proportion of

transposons. Conversely, elements belonging to the IS5/Tourist and

IS630/Tc1/mariner superfamilies, which are generally correlated with

gene density, are prevalent on the ﬁrst three chromosomes and least

frequent on chromosomes 4 and 12.

Class II elements, characterized by terminal inverted-repeats and including the hAT, CACTA, IS256/Mutator, IS5/Tourist, and IS630/Tc1/mariner superfamilies, outnumber class I elements, which include long terminal-repeat (LTR) retrotransposons (Ty1/copia, Ty3/gypsy and TRIM) and non-LTR retrotransposons (LINEs and SINEs, or long- and short-interspersed nucleotide elements, respectively), by more than twofold (Table 3). However, the nucleotide contribution of class I is greater than that of class II, due mostly to the large size of LTR retrotransposons and the small size of IS5/Tourist and IS630/Tc1/mariner elements. The inverse is the case for maize, for which class I elements outnumber class II elements. Given their larger sizes, differential amplification of LTR elements in maize compared with rice is consistent with the genomic expansion found between orthologous regions of rice and maize (refs 15, 33).

Most class I elements are concentrated in gene-poor, heterochro-

matic regions such as the centromeric and pericentromeric regions

(Supplementary Table 14). In contrast, members of some transposon

superfamilies, including IS5/Tourist, IS630/Tc1/mariner and LINEs,

have a signiﬁcant positive correlation with both recombination rate

and gene density. There is an effect of average element length

associated with these patterns: short elements generally show a

positive correlation with recombination rate and gene density, and

are under-represented in the centromere regions, whereas larger

elements have higher centromeric and pericentromeric abundance.

Intraspeciﬁc sequence polymorphism

Map-based cloning to identify genes that are associated with agronomic traits is dependent on having a high frequency of polymorphic markers to order recombination events. In rice, most of the segregating populations are generated from crosses between the two major subspecies of cultivated rice, Oryza sativa ssp. japonica and O. sativa ssp. indica. Although several studies on the polymorphisms detected between japonica and indica subspecies have been reported (refs 6, 43, 44), the analysis reported here uses an approach that ensures comparison of orthologous sequences. O. sativa ssp. indica cv. Kasalath and O. sativa ssp. japonica cv. Nipponbare are the parents of the most densely mapped rice population. BAC-end sequences were obtained from a Kasalath BAC library of 47,194 clones. Only high quality, single-copy sequences were mapped to the Nipponbare pseudomolecules, and only paired inverted sequences that mapped within 200 kb were considered. A total of 26,632 paired Kasalath BAC-end sequences were mapped to the 12 rice pseudomolecules (Supplementary Table 15). Kasalath BAC clones spanned 308 Mb or 79% of the Nipponbare genome. Sequence alignments with a PHRED quality value of 30 covered 12,319,100 bp (3%) of the total rice genome. A total of 80,127 sites differed in the corresponding regions in Nipponbare and Kasalath. The frequency of SNPs varied between chromosomes (0.53–0.78%). Insertions and deletions were also detected. The ratio of small insertion/deletion site nucleotides (1–14 bases) against the alignment length (0.20–0.27%) was similar among the different chromosomes, and there was no preference for the direction of insertions or deletions. The main patterns of base substitutions observed between Nipponbare and Kasalath are shown in Supplementary Table 16. Transitions (70%) were the most prominent substitutions; this is a substantially higher fraction than found between Arabidopsis ecotypes Columbia and Landsberg erecta.
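Two quantities in this paragraph lend themselves to a small worked example: the overall fraction of differing sites, and the distinction between transitions and transversions. The sketch below simply divides the two totals stated in the text, which gives a genome-wide value inside the reported per-chromosome range, and classifies a substitution by the standard purine/pyrimidine rule. The function names are illustrative and do not reproduce the authors' pipeline.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def substitution_class(ref_base, alt_base):
    """Classify a single-base substitution as a transition or transversion."""
    ref_base, alt_base = ref_base.upper(), alt_base.upper()
    bases = PURINES | PYRIMIDINES
    if ref_base not in bases or alt_base not in bases or ref_base == alt_base:
        raise ValueError("expected two different bases from A, C, G, T")
    # Transition: purine <-> purine or pyrimidine <-> pyrimidine.
    same_class = (ref_base in PURINES) == (alt_base in PURINES)
    return "transition" if same_class else "transversion"


def percent_differing(differing_sites, aligned_bp):
    """Differing sites as a percentage of the aligned length."""
    return 100 * differing_sites / aligned_bp


print(round(percent_differing(80_127, 12_319_100), 2))  # 0.65
print(substitution_class("A", "G"))  # transition
print(substitution_class("A", "C"))  # transversion
```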

Class 1 simple sequence repeats in the rice genome

Class 1 simple sequence repeats (SSRs) are perfect repeats >20 nucleotides in length that behave as hypervariable loci, providing a rich source of markers for use in genetics and breeding. A total of 18,828 Class 1 di-, tri- and tetra-nucleotide SSRs, representing 47 distinctive motif families, were identified and annotated on the rice genome (Supplementary Fig. 9). Supplementary Table 17 provides information about the physical positions of all Class 1 SSRs in relation to widely used restriction-fragment length polymorphisms (RFLPs) (refs 16, 46) and previously published SSRs. There was an average of 51 hypervariable SSRs per Mb, with the highest density of markers occurring on chromosome 3 (55.8 SSRs per Mb) and the lowest occurring on chromosome 4 (41.0 SSRs per Mb). A summary of information about the Class 1 SSRs identified in the rice pseudomolecules appears in Supplementary Table 18. Several thousand of these SSRs have already been shown to amplify well and be polymorphic in a panel of diverse cultivars, and thus are of immediate use for genetic analysis.

Table 2 | Size of each chromosome based on sequence data and estimated gaps

| Chr | Sequenced bases (bp) | Gaps on arm regions (no.) | Gaps on arm regions, length (Mb) | Telomeric gaps* (Mb) | Centromeric gap† (Mb) | rDNA‡ (Mb) | Total (Mb) | Coverage§ (%) | Coverage∥ (%) |
| 1 | 43,260,640 | 5 | 0.33 | 0.06 | 1.40 | | 45.05 | 99.1 | 96.0 |
| 2 | 35,954,074 | 3 | 0.10 | 0.01 | 0.72 | | 36.78 | 99.7 | 97.7 |
| 3 | 36,189,985 | 4 | 0.96 | 0.04 | 0.18 | | 37.37 | 97.3 | 96.8 |
| 4 | 35,489,479 | 3 | 0.46 | 0.20 | | | 36.15 | 98.7 | 98.2 |
| 5 | 29,733,216 | 6 | 0.22 | 0.05 | | | 30.00 | 99.3 | 99.1 |
| 6 | 30,731,386 | 1 | 0.02 | 0.03 | 0.82 | | 31.60 | 99.8 | 97.2 |
| 7 | 29,643,843 | 1 | 0.31 | 0.01 | 0.32 | | 30.28 | 98.9 | 97.9 |
| 8 | 28,434,680 | 1 | 0.09 | 0.05 | | | 28.57 | 99.7 | 99.5 |
| 9 | 22,692,709 | 4 | 0.13 | 0.14 | 0.62 | 6.95 | 30.53 | 98.8 | 74.3 |
| 10 | 22,683,701 | 4 | 0.68 | 0.13 | 0.47 | | 23.96 | 96.6 | 94.7 |
| 11 | 28,357,783 | 4 | 0.21 | 0.04 | 1.90 | 0.25 | 30.76 | 99.1 | 92.2 |
| 12 | 27,561,960 | 0 | 0.00 | 0.05 | 0.16 | | 27.77 | 99.8 | 99.2 |
| All | 370,733,456 | 36 | 3.51 | 0.81 | 6.59 | 7.20 | 388.82 | 98.9 | 95.3 |

*Estimated length including the telomeres, calculated with the average value of 3.2 kb for each chromosome.

†Estimated length of centromere-specific CentO repeats on each chromosome.

‡Represents the estimated length of the 17S–5.8S–25S rDNA cluster on Chr 9 (ref. 35) and the 5S cluster on Chr 11 (ref. 24).

§Coverage of the pseudomolecules for the euchromatic regions in each chromosome.

∥Coverage of the pseudomolecules over the full length of each chromosome.
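To make the Class 1 SSR definition used in the text concrete (perfect di-, tri- or tetra-nucleotide repeats longer than 20 nucleotides), a minimal scanner might look like the following. It is a sketch under stated assumptions rather than the annotation method used for the paper: only whole motif copies are counted towards the length, motifs that are themselves repeats of a shorter unit (such as AA or ATAT) are skipped, and all names are hypothetical.

```python
def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit."""
    size = len(motif)
    return not any(
        size % unit == 0 and motif == motif[:unit] * (size // unit)
        for unit in range(1, size)
    )


def find_class1_ssrs(sequence, min_length=21, motif_sizes=(2, 3, 4)):
    """Return sorted (start, motif, length) tuples for perfect repeats.

    min_length=21 encodes "longer than 20 nucleotides"; the length
    reported counts whole motif copies only.
    """
    sequence = sequence.upper()
    hits = []
    for size in motif_sizes:
        i = 0
        while i + size <= len(sequence):
            motif = sequence[i:i + size]
            if set(motif) - set("ACGT") or not is_primitive(motif):
                i += 1
                continue
            # Extend while the sequence keeps repeating with period `size`.
            j = i + size
            while j < len(sequence) and sequence[j] == sequence[j - size]:
                j += 1
            length = ((j - i) // size) * size
            if length >= min_length:
                hits.append((i, motif, length))
                i = j  # jump past the repeat just recorded
            else:
                i += 1
    return sorted(hits)


# Toy sequence (made up): a (GA)11 repeat and a (CAG)7 repeat qualify,
# the short (AT)5 run and the AAAA homopolymer do not.
toy = "CCC" + "GA" * 11 + "TTT" + "CAG" * 7 + "AAAA" + "AT" * 5
print(find_class1_ssrs(toy))  # [(3, 'GA', 22), (28, 'CAG', 21)]
```

Dividing the number of hits by the sequence length in Mb would give a density comparable in kind to the per-chromosome values quoted in the text.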

Genome-wide comparison of draft versus ﬁnished sequences

Two whole-genome shotgun assemblies of draft-quality rice sequence have been published (refs 23, 47), and reassemblies of both have just appeared. One of these is an assembly of 6.28× coverage of O. sativa ssp. indica cv. 93-11. The second sequence is a ~6× coverage of O. sativa ssp. japonica cv. Nipponbare (refs 23, 48). These assemblies predict genome sizes of 433 Mb for japonica and 466 Mb for indica, which differ from our estimation of a 389 Mb japonica genome.

Contigs from the whole-genome shotgun assembly of 93-11 and

Nipponbare

were aligned with the IRGSP pseudomolecules. Non-

redundant coverage of the pseudomolecules by the indica assembly

varied from 78% for chromosome 3 to 59% for chromosome 12, with

an overall coverage of 69% (Supplementary Table 19). When genes

supported by full-length cDNA coverage were aligned to the covered

regions, we found that 68.3% were completely covered by the indica

sequences. The average size of the indica contigs is 8.2 kb, so it is not

surprising that many did not completely cover the gene models

deﬁned here. The coverage of the Nipponbare whole-genome shot-

gun assembly varied from 68–82%, with an overall coverage of 78%

of the genome, and 75.3% of the full-length cDNAs supported gene

models.

We undertook a detailed comparison of the first Mb of these assemblies on 1S (the short arm of chromosome 1) with the IRGSP chromosome 1 (Supplementary Fig. 10 and Supplementary Table 20). The numbers from this comparison agree with the whole-genome comparison described above. In addition, we observed that a substantial portion of the contigs from each assembly were non-homologous, misaligned or provided duplicate coverage. Indeed, the whole-genome shotgun assembly differed by 0.05% base-pair mismatches for the two aligned regions from the same Nipponbare cultivar. The two assemblies were further examined for the presence of the CentO sequence (Supplementary Table 21). Sixty-eight per cent of the copies observed in the 93-11 assembly and 32% of the CentO-containing contigs in the whole-genome shotgun Nipponbare assembly were found outside the centromeric regions. In contrast, the CentO repeats were restricted to the centromeric regions in the IRGSP pseudomolecules. It is unlikely that there are dispersed centromeres in indica rice; misassembly of the whole-genome shotgun sequences is a more likely explanation for dispersed CentO repeats. These observations indicate that the draft sequences, although providing a useful preliminary survey of the genome, might not be adequate for gene annotation, functional genomics or the identification of genes underlying agronomic traits.

Concluding remarks

The attainment of a complete and accurate map-based sequence for rice is compelling. We now have a blueprint for all of the rice chromosomes. We know, with a high level of confidence, the distribution and location of all the main components: the genes, repetitive sequences and centromeres. Substantial portions of the map-based sequence have been in public databases for some time, and the availability of provisional rice pseudomolecules based on this sequence has provided the scientific community with numerous opportunities to evaluate the genome, as indicated by the number of publications in rice biology and genetics over the past few years. Furthermore, the wealth of SNP and SSR information provided here and elsewhere will accelerate marker-assisted breeding and positional cloning, facilitating advances in rice improvement.

The syntenic relationships between rice and the cereal grasses have long been recognized. Comparing genome organization, genes and intergenic regions between cereal species will permit identification of regions that are highly conserved or rapidly evolving. Such regions are expected to yield crucial insights into genome evolution, speciation and domestication.

Figure 1 | Maps of the twelve rice chromosomes. For each chromosome (Chr 1–12), the genetic map is shown on the left and the PAC/BAC contigs on the right. The position of markers flanking the PAC/BAC contigs (green) is indicated on the genetic map. Physical gaps are shown in white and the nucleolar organizer on chromosome 9 is represented with a dotted green line. Constrictions in the genetic maps and arrowheads to the right of physical maps represent the chromosomal positions of centromeres for which rice CentO satellites are sequenced. The maps are scaled to genetic distances in centimorgans (cM) and the physical maps are depicted in relative physical lengths. Please refer to Table 2 for estimated lengths of the chromosomes.

METHODS

Physical map and sequencing. Nine genomic libraries from Oryza sativa ssp. japonica cultivar Nipponbare were used to establish the physical map of rice chromosomes by polymerase chain reaction (PCR) screening, fingerprinting and end-sequencing. The PAC, BAC and fosmid clones on the physical map were subjected to random shearing and shotgun sequencing to tenfold redundancy, using both universal primers and the dye-terminator or dye-primer methods. The sequences were assembled using PHRED (http://www.genome.washington.edu/UWGC/analysistools/Phred.cfm) and PHRAP (http://www.genome.washington.edu/UWGC/analysistools/Phrap.cfm) software packages or using the TIGR Assembler (http://www.tigr.org/software/assembler/).

Sequence gaps were resolved by full sequencing of gap-bridge clones, PCR fragments or direct sequencing of BACs. Sequence ambiguities (indicated by PHRAP scores less than 30) were resolved by confirming the sequence data using alternative chemistries or different polymerases. We empirically determined that a PHRAP score of 30 or above exceeds the standard of less than one error in 10,000 bp. BAC and PAC assemblies were tested for accuracy by comparing computationally derived fingerprint patterns with experimentally determined patterns of restriction enzyme digests. Sequence quality was also evaluated by comparing independently obtained overlapping sequences.
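For orientation, the nominal Phred-scale relation between a quality score Q and an error probability P is Q = -10 log10(P). The sketch below shows only that standard conversion, on the assumption that PHRAP scores are read on this scale. It is not the authors' calibration: nominally a score of 30 corresponds to one error in 1,000 bases and a score of 40 to one in 10,000, so the statement above about scores of 30 or more is an empirical finding for their consensus sequences.

```python
import math


def phred_to_error_prob(q):
    """Nominal error probability for a Phred-scaled quality score."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred-scaled quality score for an error probability (0 < p <= 1)."""
    return -10 * math.log10(p)


for q in (20, 30, 40):
    p = phred_to_error_prob(q)
    print(f"Q{q}: nominally one error in {round(1 / p):,} bases")
```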

Small physical gaps were filled by long-range PCR. Remaining physical gaps were measured using fluorescence in situ hybridization analysis. We used the length of CentO arrays to estimate the size of each of the remaining centromere gaps.

Annotation and bioinformatics. Gene models were predicted using FGENESH (http://www.softberry.com/berry.phtml?topic=fgenesh) using the monocot trained matrix on the native and repeat-masked pseudomolecules. Gene models with incomplete open reading frames, those encoding proteins of less than 50 amino acids, or those corresponding to organellar DNA were omitted from the final set. The coordinates of transposable elements, excluding MITEs (miniature inverted-repeat transposable elements), were used to mask the pseudomolecules. Conserved domain/motif searches and association with gene ontologies were performed using InterproScan (http://www.ebi.ac.uk/InterProScan/) in combination with the Interpro2Go program. For biological processes, the number of detected domains was re-calculated as number of non-redundant proteins.
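The three exclusion criteria applied to the predicted gene models can be sketched as a simple filter. This is a minimal illustration, not the authors' pipeline: the `GeneModel` record, its field names and the example entries are hypothetical, and only the 50-amino-acid threshold comes from the text.

```python
from dataclasses import dataclass

MIN_PROTEIN_LENGTH = 50  # models encoding fewer residues are omitted


@dataclass
class GeneModel:
    """Hypothetical record for one predicted gene model."""
    name: str
    protein: str          # predicted amino-acid sequence; "*" marks the stop
    complete_orf: bool    # True if the open reading frame is complete
    organellar: bool      # True if the model corresponds to organellar DNA


def keep_model(model):
    """Apply the three exclusion criteria described in the text."""
    length = len(model.protein.rstrip("*"))
    return (model.complete_orf
            and not model.organellar
            and length >= MIN_PROTEIN_LENGTH)


def filter_models(models):
    return [m for m in models if keep_model(m)]


# Illustrative input only; names and sequences are made up.
models = [
    GeneModel("model_a", "M" + "A" * 120 + "*", True, False),   # kept
    GeneModel("model_b", "M" + "A" * 30 + "*", True, False),    # too short
    GeneModel("model_c", "M" + "A" * 120, False, False),        # incomplete ORF
    GeneModel("model_d", "M" + "A" * 120 + "*", True, True),    # organellar
]
print([m.name for m in filter_models(models)])
```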

The predicted rice proteome was searched using BLASTP against the proteomes of several model species for which a complete genome sequence and deduced protein set was available. Each rice chromosome was searched against the TIGR rice gene index (http://www.tigr.org/tdb/tgi/ogi/) and against gene index entries that aligned to gene models corresponding to expressed genes. In addition, five cereal gene indices (http://www.tigr.org/tdb/tgi/) were searched against the rice chromosomes, and gene index matches were recorded. We searched the Oryza sativa ssp. japonica cv. Nipponbare collection of full-length cDNAs (ftp://cdna01.dna.affrc.go.jp/pub/data/), after first removing the transposable-element-related sequences, against the FGENESH models.

Gene models with rice full-length cDNA, EST or cereal EST matches but without identifiable homologues in the Arabidopsis genome were searched for conserved domains/motifs using InterproScan, and for homologues in the Swiss-Prot database (http://us.expasy.org/sprot/) using BLASTP. All proteins with positive blast matches were further compared with the nr database (http://www.ncbi.nlm.nih.gov/blast/html/blastcgihelp.html#protein_databases), using BLASTP to eliminate truncated proteins and those with matches to other dicots.

Tandem gene families. The rice genome was subjected to a BLASTP search as previously described. The search was also performed by permitting more than one unrelated gene within the arrays, and the limit of the search was set to 5-Mb intervals to exclude large chromosomal duplications.

Non-coding RNAs. Transfer-RNA genes were detected by the program tRNAscan-SE (http://www.genetics.wustl.edu/eddy/tRNAscan-SE/). The miRNA registry in the Rfam database (http://www.sanger.ac.uk/Software/Rfam/) was used as a reference database for miRNAs. In addition, experimentally validated miRNAs of other species, excluding Arabidopsis miRNAs, were used for BLASTN queries against the pseudomolecules. Spliceosomal and snoRNAs were retrieved from the Rfam database and used for queries. BLASTN was used to find the location of snoRNAs and spliceosomal RNAs in the pseudomolecules.

Organellar insertions. Oryza sativa ssp. japonica Nipponbare chloroplast (GenBank NC_001320) and mitochondrial (GenBank BA000029) sequences were aligned with the pseudomolecules using BLASTN and MUMmer.

Transposable elements. The TIGR Oryza Repeat Database, together with other published and unpublished rice transposable element sequences, was used to create RTEdb (a rice transposable element database) and determine transposable element coordinates on the rice pseudomolecules. In the case of hAT, IS256/Mutator, IS5/Tourist and IS630/Tc1/mariner elements, family-specific profile hidden Markov models were applied using HMMER (http://hmmer.wustl.edu/). The remaining superfamilies were annotated using RepeatMasker (http://www.repeatmasker.org/).

Tos17 insertions. Flanking sequences of transposed copies of 6,278 Tos17 insertion lines were isolated by modified thermal asymmetric interlaced (TAIL)-PCR and suppression PCR, and screened against the pseudomolecule sequences.

SNP discovery. BAC clones from an O. sativa ssp. indica var. Kasalath BAC library were end-sequenced. Sequence reads were omitted if they contained more than 50% nucleotides of low quality or high similarity to known repeats. The remaining sequences were subjected to BLASTN analysis against the pseudomolecules. Gaps within the alignments were classified as small insertions/deletions.
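The read-quality part of this filter can be sketched as below. It is a minimal illustration covering only the low-quality criterion; the repeat-similarity screen is not shown. The per-base cutoff of 20 used to call a nucleotide "low quality" is an assumption, since the text does not give one, and the function names are hypothetical.

```python
LOW_QUALITY_CUTOFF = 20         # assumption: the text does not state a cutoff
MAX_LOW_QUALITY_FRACTION = 0.5  # reads above this fraction are omitted


def low_quality_fraction(qualities):
    """Fraction of bases whose quality score falls below the cutoff."""
    if not qualities:
        return 1.0  # treat an empty read as unusable
    low = sum(1 for q in qualities if q < LOW_QUALITY_CUTOFF)
    return low / len(qualities)


def passes_quality_filter(qualities):
    """Keep a read unless more than 50% of its bases are low quality."""
    return low_quality_fraction(qualities) <= MAX_LOW_QUALITY_FRACTION


good_read = [35, 40, 38, 12, 30, 33]   # 1 of 6 bases below the cutoff
poor_read = [10, 12, 8, 35, 15, 9]     # 5 of 6 bases below the cutoff
print(passes_quality_filter(good_read), passes_quality_filter(poor_read))
```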

SSR loci. The Simple Sequence Repeat Identification Tool (http://www.gramene.org/) was used to identify simple sequence repeat motifs, and the physical position of all Class 1 SSRs was recorded. The copy number of SSR markers was estimated using electronic (e)-PCR to determine the number of independent hits of primer pairs on the pseudomolecules.
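A simple way to locate perfect SSR motifs in a sequence is a regular expression with a back-reference, sketched below. This is not the Simple Sequence Repeat Identification Tool and makes no claim to match its output. The 20-bp minimum used here for Class 1 SSRs is an assumption that is not stated in the text, and the motif-length range of 2 to 6 bp is likewise illustrative.

```python
import re

MIN_CLASS1_LENGTH = 20  # assumption: minimum repeat-tract length in bp

# A motif of 2-6 bases followed by at least one further copy of itself.
# The lazy quantifier makes the shortest repeating unit win, so ATATAT
# is reported with motif AT rather than ATAT.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1+")


def find_ssrs(sequence, min_length=MIN_CLASS1_LENGTH):
    """Return (0-based start, motif, repeat count) for perfect SSRs."""
    hits = []
    for match in SSR_PATTERN.finditer(sequence.upper()):
        motif = match.group(1)
        tract_length = match.end() - match.start()
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAA
        if tract_length >= min_length:
            hits.append((match.start(), motif, tract_length // len(motif)))
    return hits


example = "GG" + "AT" * 10 + "CC"
print(find_ssrs(example))
```

Recording the start coordinate of each hit on a pseudomolecule gives the physical position referred to above; imperfect or compound repeats would need a more permissive search.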

Whole-genome shotgun assembly analysis. Contigs from the BGI 6.28× whole genome assembly of O. sativa ssp. indica 93-11 (GenBank/DDBJ/EMBL accession number AAAA02000001–AAAA02050231) and the Syngenta 6× whole genome assembly of O. sativa ssp. japonica cv. Nipponbare (AACV01000001–AACV01035047; ref. 48) were aligned with the pseudomolecules using MUMmer. The number of IRGSP Nipponbare full-length cDNA-supported gene models completely covered by the aligned contigs was tabulated. The 155-bp CentO consensus sequence was used for BLAST analysis against the 93-11 and Nipponbare whole-genome shotgun contigs, and the coordinates of the positive hits recorded. Locations of centromeres for each indica chromosome were obtained with the CentO sequence positions on the IRGSP pseudomolecule of the corresponding chromosome. A detailed comparison of the BGI-assembled and -mapped Syngenta contigs (AACV01000001–AACV01000070) and the 93-11 contigs (AAAA02000001–AAAA02000093) was obtained by BLAST analysis against the IRGSP chromosome 1 pseudomolecule.
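Non-redundant coverage of a pseudomolecule by aligned contigs can be computed by merging overlapping alignment intervals before summing their lengths, as sketched below. This is a minimal illustration, not the authors' procedure. The coordinates are made up and half-open, and the function names are hypothetical. Treating a gene model as "completely covered" when it lies within one merged block is an assumption; a stricter reading would require a single contig to span it.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open intervals [start, end)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(block) for block in merged]


def covered_fraction(intervals, sequence_length):
    """Fraction of the sequence covered at least once by any interval."""
    covered = sum(end - start for start, end in merge_intervals(intervals))
    return covered / sequence_length


def fully_covered(gene, intervals):
    """True if the gene lies entirely within one merged aligned block."""
    gene_start, gene_end = gene
    return any(start <= gene_start and gene_end <= end
               for start, end in merge_intervals(intervals))


# Illustrative alignment coordinates on a 1,000-bp sequence.
alignments = [(0, 100), (50, 150), (300, 400)]
print(merge_intervals(alignments))
print(covered_fraction(alignments, 1000))
print(fully_covered((60, 140), alignments), fully_covered((140, 310), alignments))
```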

Detailed procedures for the analyses described above can be found in the Supplementary Information.

Received 29 December 2004; accepted 25 May 2005.

1. Peng, S., Cassman, K. G., Virmani, S. S., Sheehy, J. & Khush, G. S. Yield potential trends of tropical rice since the release of IR8 and the challenge of increasing rice yield potential. Crop Sci. 39, 1552–1559 (1999).
2. Peng, S. et al. Rice yields decline with higher night temperature from global warming. Proc. Natl Acad. Sci. USA 101, 9971–9975 (2004).

Table 3 | Transposons in the rice genome

                       Copy no. (× 10^3)   Coverage (kb)   Fraction of genome (%)
Class I
  LINEs                9.6                 4161.3          1.12
  SINEs                1.8                 209.9           0.06
  Ty1/copia            11.6                14266.7         3.85
  Ty3/gypsy            23.5                40363.3         10.90
  Other class I        15.4                12733.3         3.43
  Total class I        61.9                71734.4         19.35
Class II
  hAT                  1.1                 1405.9          0.38
  CACTA                10.8                9987.3          2.69
  IS630/Tc1/mariner    67.0                8388.3          2.26
  IS256/Mutator        8.8                 13485.7         3.64
  IS5/Tourist          57.9                12095.8         3.26
  Other class II       18.2                2703.6          0.73
  Total class II       163.8               48066.6         12.96
Other TEs              23.6                6797.7          1.80
Total TEs              249.3               129019.3*       34.79

TE, transposable element.
*Total length; corrected for 2420.7 kb in overlaps of multiple, non-nested elements.


3. Sasaki, T. & Burr, B. International Rice Genome Sequencing Project: the effort to completely sequence the rice genome. Curr. Opin. Plant Biol. 3, 138–141 (2000).
4. Moore, G., Devos, K. M., Wang, Z. & Gale, M. D. Cereal genome evolution: Grasses, line up and form a circle. Curr. Biol. 5, 737–739 (1995).
5. Sasaki, T. et al. The genome sequence and structure of rice chromosome 1. Nature 420, 312–316 (2002).
6. Feng, Q. et al. Sequence and analysis of rice chromosome 4. Nature 420, 316–320 (2002).
7. Rice Chromosome 10 Sequencing Consortium. In-depth view of structure, activity, and evolution of rice chromosome 10. Science 300, 1566–1569 (2003).
8. Wu, J. et al. Composition and structure of the centromeric region of rice chromosome 8. Plant Cell 16, 967–976 (2004).
9. Zhang, Y. et al. Structural features of the rice chromosome 4 centromere. Nucleic Acids Res. 32, 2023–2030 (2004).
10. Nagaki, K. et al. Sequencing of a rice centromere uncovers active genes. Nature Genet. 36, 138–145 (2004).
11. Guyot, R. & Keller, B. Ancestral genome duplication in rice. Genome 47, 610–614 (2004).
12. Simillion, C., Vandepoele, K., Saeys, Y. & Van de Peer, Y. Building genomic profiles for uncovering segmental homology in the twilight zone. Genome Res. 14, 1095–1106 (2004).
13. Paterson, A. H., Bowers, J. E. & Chapman, B. A. Ancient polyploidization predating divergence of the cereals, and its consequences for comparative genomics. Proc. Natl Acad. Sci. USA 101, 9903–9908 (2004).
14. Salse, J., Piegu, B., Cooke, R. & Delseny, M. New in silico insight into the synteny between rice (Oryza sativa L.) and maize (Zea mays L.) highlights reshuffling and identifies new duplications in the rice genome. Plant J. 38, 396–409 (2004).
15. Lai, J. et al. Gene loss and movement in the maize genome. Genome Res. 14, 1924–1931 (2004).
16. Harushima, Y. et al. A high-density rice genetic linkage map with 2275 markers using a single F2 population. Genetics 148, 479–494 (1998).
17. Yamamoto, K. & Sasaki, T. Large-scale EST sequencing in rice. Plant Mol. Biol. 35, 135–144 (1997).
18. Saji, S. et al. A physical map with yeast artificial chromosome (YAC) clones covering 63% of the 12 rice chromosomes. Genome 44, 32–37 (2001).
19. Wu, J. et al. A comprehensive rice transcript map containing 6591 expressed sequence tag sites. Plant Cell 14, 525–535 (2002).
20. Chen, M. et al. An integrated physical and genetic map of the rice genome. Plant Cell 14, 537–545 (2002).
21. Mao, L. et al. Rice transposable elements: a survey of 73,000 sequence-tagged-connectors. Genome Res. 10, 982–990 (2000).
22. Barry, G. F. The use of the Monsanto draft rice genome sequence in research. Plant Physiol. 125, 1164–1165 (2001).
23. Goff, S. A. et al. A draft sequence of the rice genome (Oryza sativa L. ssp. japonica). Science 296, 92–100 (2002).
24. Ohmido, N., Kijima, K., Akiyama, Y., de Jong, J. H. & Fukui, K. Quantification of total genomic DNA and selected repetitive sequences reveals concurrent changes in different DNA families in indica and japonica rice. Mol. Gen. Genet. 263, 388–394 (2000).
25. Dong, F. et al. Rice (Oryza sativa) centromeric regions consist of complex DNA. Proc. Natl Acad. Sci. USA 95, 8135–8140 (1998).
26. Cheng, Z. et al. Functional rice centromeres are marked by a satellite repeat and a centromere-specific retrotransposon. Plant Cell 14, 1691–1704 (2002).
27. Kikuchi, S. et al. Collection, mapping, and annotation of over 28,000 cDNA clones from japonica rice. Science 301, 376–379 (2003).
28. Castelli, V. et al. Whole genome sequence comparisons and “full-length” cDNA sequences: a combined approach to evaluate and improve Arabidopsis genome annotation. Genome Res. 14, 406–413 (2004).
29. Hirochika, H., Sugimoto, K., Otsuki, Y., Tsugawa, H. & Kanda, M. Retrotransposons of rice involved in mutations induced by tissue culture. Proc. Natl Acad. Sci. USA 93, 7783–7788 (1996).
30. Miyao, A. et al. Target site specificity of the Tos17 retrotransposon shows a preference for insertion within genes and against insertion in retrotransposon-rich regions of the genome. Plant Cell 15, 1771–1780 (2003).
31. Alonso, J. M. et al. Genome-wide insertional mutagenesis of Arabidopsis thaliana. Science 301, 653–657 (2003).
32. Arabidopsis Genome Initiative. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796–815 (2000).
33. Song, R., Llaca, V. & Messing, J. Mosaic organization of orthologous sequences in grass genomes. Genome Res. 12, 1549–1555 (2002).
34. Shishido, R., Sano, Y. & Fukui, K. Ribosomal DNAs: an exception to the conservation of gene order in rice genomes. Mol. Gen. Genet. 263, 586–591 (2000).
35. Oono, K. & Sugiura, M. Heterogeneity of the ribosomal RNA gene clusters in rice. Chromosoma 76, 85–89 (1980).
36. Kamisugi, Y. et al. Physical mapping of the 5S ribosomal RNA genes on rice chromosome 11. Mol. Gen. Genet. 245, 133–138 (1994).
37. Bartel, D. P. MicroRNAs: Genomics, biogenesis, mechanism, and function. Cell 116, 281–297 (2004).
38. Wang, X. J., Reyes, J. L., Chua, N. H. & Gaasterland, T. Prediction and identification of Arabidopsis thaliana microRNAs and their mRNA targets. Genome Biol. 5, R65 (2004).
39. Wang, J. F., Zhou, H., Chen, Y. Q., Luo, Q. J. & Qu, L. H. Identification of 20 microRNAs from Oryza sativa. Nucleic Acids Res. 32, 1688–1695 (2004).
40. Turcotte, K., Srinivasan, S. & Bureau, T. Survey of transposable elements from rice genomic sequences. Plant J. 25, 169–179 (2001).
41. Eddy, S. R. Profile hidden Markov models. Bioinformatics 14, 755–763 (1998).
42. Messing, J. et al. Sequence composition and genome organization of maize. Proc. Natl Acad. Sci. USA 101, 14349–14354 (2004).
43. Shen, Y. J. et al. Development of genome-wide DNA polymorphism database for map-based cloning of rice genes. Plant Physiol. 135, 1198–1205 (2004).
44. Feltus, F. A. et al. An SNP resource for rice genetics and breeding based on subspecies indica and japonica genome alignments. Genome Res. 14, 1812–1819 (2004).
45. McCouch, S. R. et al. Development and mapping of 2240 new SSR markers for rice (Oryza sativa L.). DNA Res. 9, 257–279 (2002).
46. Causse, M. A. et al. Saturated molecular map of the rice genome based on an interspecific backcross population. Genetics 138, 1251–1274 (1994).
47. Yu, J. et al. A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science 296, 79–92 (2002).
48. Yu, J. et al. The genomes of Oryza sativa: A history of duplications. PLoS Biol. 3, e38 (2005).
49. Delcher, A. L. et al. Alignment of whole genomes. Nucleic Acids Res. 27, 2369–2376 (1999).
50. Juretic, N., Bureau, T. E. & Bruskiewich, R. M. Transposable element annotation of the rice genome. Bioinformatics 20, 155–160 (2004).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements Work at the RGP was supported by the Ministry of Agriculture, Forestry and Fisheries of Japan. Work at TIGR was supported by grants to C.R.B. from the USDA Cooperative State Research, Education and Extension Service–National Research Initiative, the National Science Foundation and the US Department of Energy. Work at the NCGR was supported by the Chinese Ministry of Science and Technology, the Chinese Academy of Sciences, the Shanghai Municipal Commission of Science and Technology, and the National Natural Science Foundation of China. Work at Genoscope was supported by le Ministère de la Recherche, France. Funding for the work at the AGI and AGCoL was provided by grants to R.A.W. and C.S. from the USDA Cooperative State Research, Education and Extension Service–National Research Initiative, the National Science Foundation, the US Department of Energy and the Rockefeller Foundation. Work at CSHL was supported by grants from the USDA Cooperative State Research, Education and Extension Service–National Research Initiative and from the National Science Foundation. Work at the ASPGC was supported by Academia Sinica, National Science Council, Council of Agriculture, and Institute of Botany, Academia Sinica. The IIRGS acknowledges the Department of Biotechnology, Government of India, for financial assistance and the Indian Council of Agricultural Research, New Delhi, for support. Work at Rice Gene Discovery was supported by BIOTECH and the Princess Sirindhorn’s Plant Germplasm Conservation Initiative Program. Work at PGIR was supported by Rutgers University. The BRIGI was supported by Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Financiadora de Estudos e Projetos - Ministério de Ciência e Tecnologia (FINEP-MCT), Fundação de Amparo à Pesquisa do Rio Grande do Sul (FAPERGS) and Universidade Federal de Pelotas (UFPel). Work at McGill and York Universities was supported by the National Science and Engineering Research Council of Canada and the Canadian International Development Agency. Funding for H.H. at the National Institute of Agrobiological Sciences was from the Ministry of Agriculture, Forestry, and Fisheries of Japan, and the Program for Promotion of Basic Research Activities for Innovative Biosciences. Funding at Brookhaven National Laboratory was from The Rockefeller Foundation and the Office of Basic Energy Science of the United States Department of Energy. We would like to thank G. Barry and S. Goff for their help in negotiating agreements that permitted the sharing of materials and sequence with the IRGSP. We also acknowledge the work of G. Barry, S. Goff and their colleagues in facilitating the transfer of sequence information and supporting data.

Author Information The genomic sequence is available under accession numbers AP008207–AP008218 in international databases (DDBJ, GenBank and EMBL). Reprints and permissions information is available at npg.nature.com/reprintsandpermissions. The authors declare no competing financial interests. Correspondence and requests for materials should be addressed to Takuji Sasaki (tsasaki@nias.affrc.go.jp).


International Rice Genome Sequencing Project (Participants are arranged by area of contribution and then by institution.)

Physical Maps and Sequencing: Rice Genome Research Program (RGP) Takashi Matsumoto, Jianzhong Wu, Hiroyuki Kanamori, Yuichi Katayose, Masaki Fujisawa, Nobukazu Namiki, Hiroshi Mizuno, Kimiko Yamamoto, Baltazar A. Antonio, Tomoya Baba, Katsumi Sakata, Yoshiaki Nagamura, Hiroyoshi Aoki, Koji Arikawa, Kohei Arita, Takahito Bito, Yoshino Chiden, Nahoko Fujitsuka, Rie Fukunaka, Masao Hamada, Chizuko Harada, Akiko Hayashi, Saori Hijishita, Mikiko Honda, Satomi Hosokawa, Yoko Ichikawa, Atsuko Idonuma, Masumi Iijima, Michiko Ikeda, Maiko Ikeno, Kazue Ito, Sachie Ito, Tomoko Ito, Yuichi Ito, Yukiyo Ito, Aki Iwabuchi, Kozue Kamiya, Wataru Karasawa, Kanako Kurita, Satoshi Katagiri, Ari Kikuta, Harumi Kobayashi, Noriko Kobayashi, Kayo Machita, Tomoko Maehara, Masatoshi Masukawa, Tatsumi Mizubayashi, Yoshiyuki Mukai, Hideki Nagasaki, Yuko Nagata, Shinji Naito, Marina Nakashima, Yuko Nakama, Yumi Nakamichi, Mari Nakamura, Ayano Meguro, Manami Negishi, Isamu Ohta, Tomoya Ohta, Masako Okamoto, Nozomi Ono, Shoko Saji, Miyuki Sakaguchi, Kumiko Sakai, Michie Shibata, Takanori Shimokawa, Jianyu Song, Yuka Takazaki, Kimihiro Terasawa, Mika Tsugane, Kumiko Tsuji, Shigenori Ueda, Kazunori Waki, Harumi Yamagata, Mayu Yamamoto, Shinichi Yamamoto, Hiroko Yamane, Shoji Yoshiki, Rie Yoshihara, Kazuko Yukawa, Huisun Zhong, Masahiro Yano, Takuji Sasaki (Principal Investigator);

The Institute for Genomic Research (TIGR) Qiaoping Yuan, Shu Ouyang, Jia Liu, Kristine M. Jones, Kristen Gansberger, Kelly Moffat, Jessica Hill, Jayati Bera, Douglas Fadrosh, Shaohua Jin, Shivani Johri, Mary Kim, Larry Overton, Matthew Reardon, Tamara Tsitrin, Hue Vuong, Bruce Weaver, Anne Ciecko, Luke Tallon, Jacqueline Jackson, Grace Pai, Susan Van Aken, Terry Utterback, Steve Reidmuller, Tamara Feldblyum, Joseph Hsiao, Victoria Zismann, Stacey Iobst, Aymeric R. de Vazeille, C. Robin Buell (Principal Investigator);

National Center for Gene Research Chinese Academy of Sciences (NCGR) Kai Ying, Ying Li, Tingting Lu, Yuchen Huang, Qiang Zhao, Qi Feng, Lei Zhang, Jingjie Zhu, Qijun Weng, Jie Mu, Yiqi Lu, Danlin Fan, Yilei Liu, Jianping Guan, Yujun Zhang, Shuliang Yu, Xiaohui Liu, Yu Zhang, Guofan Hong, Bin Han (Principal Investigator);

Genoscope Nathalie Choisne, Nadia Demange, Gisela Orjeda, Sylvie Samain, Laurence Cattolico, Eric Pelletier, Arnaud Couloux, Beatrice Segurens, Patrick Wincker, Angelique D’Hont, Claude Scarpelli, Jean Weissenbach, Marcel Salanoubat, Francis Quetier (Principal Investigator);

Arizona Genomics Institute (AGI) and Arizona Genomics Computational Laboratory (AGCol) Yeisoo Yu, Hye Ran Kim, Teri Rambo, Jennifer Currie, Kristi Collura, Meizhong Luo, Tae-Jin Yang, Jetty S. S. Ammiraju, Friedrich Engler, Carol Soderlund, Rod A. Wing (Principal Investigator);

Cold Spring Harbor Laboratory (CSHL) Lance E. Palmer, Melissa de la Bastide, Lori Spiegel, Lidia Nascimento, Theresa Zutavern, Andrew O’Shaughnessy, Sujit Dike, Neilay Dedhia, Raymond Preston, Vivekanand Balija, W. Richard McCombie (Principal Investigator);

Academia Sinica Plant Genome Center (ASPGC) Teh-Yuan Chow, Hong-Hwa Chen, Mei-Chu Chung, Ching-San Chen, Jei-Fu Shaw, Hong-Pang Wu, Kwang-Jen Hsiao, Ya-Ting Chao, Mu-kuei Chu, Chia-Hsiung Cheng, Ai-Ling Hour, Pei-Fang Lee, Shu-Jen Lin, Yao-Cheng Lin, John-Yu Liou, Shu-Mei Liu, Yue-Ie Hsing (Principal Investigator);

Indian Initiative for Rice Genome Sequencing (IIRGS), University of Delhi South Campus (UDSC) S. Raghuvanshi, A. Mohanty, A. K. Bharti (11,13), A. Gaur, V. Gupta, D. Kumar, V. Ravi, S. Vij, A. Kapur, Parul Khurana, Paramjit Khurana, J. P. Khurana, A. K. Tyagi (Principal Investigator);

Indian Initiative for Rice Genome Sequencing (IIRGS), Indian Agricultural Research Institute (IARI) K. Gaikwad, A. Singh, V. Dalal, S. Srivastava, A. Dixit, A. K. Pal, I. A. Ghazi, M. Yadav, A. Pandit, A. Bhargava, K. Sureshbabu, K. Batra, T. R. Sharma, T. Mohapatra, N. K. Singh (Principal Investigator);

Plant Genome Initiative at Rutgers (PGIR) Joachim Messing (Principal Investigator), Amy Bronzino Nelson, Galina Fuks, Steve Kavchok, Gladys Keizer, Eric Linton, Victor Llaca, Rentao Song, Bahattin Tanyolac, Steve Young;

Korea Rice Genome Research Program (KRGRP) Kim Ho-Il, Jang Ho Hahn (Principal Investigator);

National Center for Genetic Engineering and Biotechnology (BIOTEC) G. Sangsakoo, A. Vanavichit (Principal Investigator);

Brazilian Rice Genome Initiative (BRIGI) Luiz Anderson Teixeira de Mattos, Paulo Dejalma Zimmer, Gaspar Malone, Odir Dellagostin, Antonio Costa de Oliveira (Principal Investigator);

John Innes Centre (JIC) Michael Bevan, Ian Bancroft;

Washington University School of Medicine Genome Sequencing Center Pat Minx, Holly Cordum, Richard Wilson;

University of Wisconsin–Madison Zhukuan Cheng, Weiwei Jin, Jiming Jiang, Sally Ann Leong

Annotation and Analysis: Hisakazu Iwama, Takashi Gojobori (21,22), Takeshi Itoh (22,23), Yoshihito Niimura, Yasuyuki Fujii, Takuya Habara, Hiroaki Sakai (23,25), Yoshiharu Sato, Greg Wilson, Kiran Kumar, Susan McCouch, Nikoleta Juretic, Douglas Hoen, Stephen Wright, Richard Bruskiewich, Thomas Bureau, Akio Miyao, Hirohiko Hirochika, Tomotaro Nishikawa, Koh-ichi Kadowaki & Masahiro Sugiura

Coordination: Benjamin Burr

Afﬁliations for participants:

National Institute of Agrobiological Sciences/Institute of the Society for Techno-innovation of Agriculture, Forestry and Fisheries, 2-1-2 Kannondai,

Tsukuba, Ibaraki 305-8602, Japan.

The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, Maryland 20850, USA.

Shanghai Institutes for Biological

Sciences, Chinese Academy of Sciences (CAS), 500 Caobao Road, Shanghai 200233, China.

Centre National de Se

quenc¸age, INRA-URGV, and CNRS UMR-8030, 2, rue Gaston

Cre

mieux, CP 5706, 91057 EVRY Cedex, France.

UMR PIA, Cirad-Amis, TA40-03 avenue Agropolis, 34398 Montpellier Cedex 05, France.

Department of Plant Sciences, BIO5

Institute, The University of Arizona, Tucson, Arizona 85721, USA.

Cold Spring Harbor Labora tory, Cold Spring Harbor, New York 11723, USA.

Institute of Botany, Academia

Sinica, 128, Sec. 2, Yen-Chiu-Yuan Rd, Nankang, Taipei 11529, Taiwan.

National Cheng Kung University, No. 1, Ta-Hsueh Road, Tainan 701, Taiwan.

National Yang-Ming

University, 155, Sec. 2, Li-Nong St, Peitou, Taipei 112, Taiwan.

Department of Plant Molecular Biology, University of Delhi South Campus, New Delhi 110021, India.

National

Research Centre on Plant Biotechnology, Indian Agricultural Research Institute, New Delhi 110012, India.

Waksman Institute, Rutgers University, Piscataway, New Jersey

08854, USA.

National Institute of Agricultural Science and Technology, RDA, Suwon, 441-707 Republic of Korea.

Rice Gene Discovery Unit, Kasetsart University, Nakron

Pathom 73140, Thailand.

Centro de Genomica e Fitomelhoramento, UFPel, Pelotas, RS, l 96001-970, Brazil.

John Innes Centre, Norwich Research Park, Colney, Norwich NR4

7UH, UK.

Washington University Genome Sequencing Center, 33 33 For est Park Boulevard, St. Louis, Missouri 63108, USA.

University of Wisconsin, Department of

Horticulture, Madison, Wisconsin 53706, USA.

University of Wisconsin, Department of Plant Pathology, Madison, Wisconsin 53706, USA.

Center for Information Biology and

DNA Data Bank of Japan, National Institute of Genetics, Mishima 411-8540, Japan.

Biological Information Research Center, N ational Institute of Advanced Industrial Science

and Technology, Koto-ku, Tokyo 135-0064, Japan.

National Institute of Agrobiological Sciences, Tsukuba, Ibaraki 305-8602, Japan.

Medical Research Institute, Tokyo

Medical and Dental University, Bunkyo-ku, Tokyo 113-8510, Japan.

Japan Biological Information Research Center, Japan Biological Informatics Consortium, Koto-ku, Tokyo 135-

0064, Japan.

Plant Breeding Dept, Cornell University, Ithaca, New York 14850-1901, USA.

Cold Spring Harbor Laborato ry, PO Box 100, 1 Bungtown Road, Cold Spring Harbor,

New York 11724, USA.

Department of Biology, McGill University, 1205 Dr Penﬁeld Avenue, Montreal, Quebec H3A 1B1, Canada.

Department of Biology, York University, 4700 Keele Street, Toronto, Ontario M3J 1P3, Canada.

Biometrics and Bioinformatics Unit, International Rice Research Institute, DAPO Box 7777, Metro Manila, Philippines.

Graduate School of Natural Sciences, Nagoya City University, Nagoya 467-8501, Japan.

Biology Department, Brookhaven National Laboratory, Upton, New York 11973, USA.

ARTICLES NATURE|Vol 436|11 August 2005

Genomic prediction and QTL analysis for grain Zn content and yield in Aus-derived rice populations

Article

May 2024

Zinc (Zn) biofortification of rice can address Zn malnutrition in Asia. Identification and introgression of QTLs for grain Zn content and yield (YLD) can improve the efficiency of rice Zn biofortification. In four rice populations we detected 56 QTLs for seven traits by inclusive composite interval mapping (ICIM), and 16 QTLs for two traits (YLD and Zn) by association mapping. The phenotypic variance (PV) varied from 4.5% (qPN4.1) to 31.7% (qPH1.1). qDF1.1, qDF7.2, qDF8.1, qPH1.1, qPH7.1, qPL1.2, qPL9.1, qZn5.1, qZn5.2, qZn6.1 and qZn7.1 were identified in both dry and wet seasons; qZn5.1, qZn5.2, qZn5.3, qZn6.2, qZn7.1 and qYLD1.2 were detected by both ICIM and association mapping. qZn7.1 had the highest PV (17.8%) and additive effect (2.5 ppm). Epistasis and QTL co-locations were also observed for different traits. The multi-trait genomic prediction values were 0.24 and 0.16 for YLD and Zn respectively. qZn6.2 was co-located with a gene (OsHMA2) involved in Zn transport. These results are useful for Zn biofortification of rice.

Convergent evolution of desiccation tolerance in grasses

Article

Jun 2024

Desiccation tolerance has evolved repeatedly in plants as an adaptation to survive extreme environments. Plants use similar biophysical and cellular mechanisms to survive life without water, but convergence at the molecular, gene and regulatory levels remains to be tested. Here we explore the evolutionary mechanisms underlying the recurrent evolution of desiccation tolerance across grasses. We observed substantial convergence in gene duplication and expression patterns associated with desiccation. Syntenic genes of shared origin are activated across species, indicative of parallel evolution. In other cases, similar metabolic pathways are induced but using different gene sets, pointing towards phenotypic convergence. Species-specific mechanisms supplement these shared core mechanisms, underlining the complexity and diversity of evolutionary adaptations to drought. Our findings provide insight into the evolutionary processes driving desiccation tolerance and highlight the roles of parallel and convergent evolution in response to environmental challenges.

Bulked Segregant RNA-Seq (BSR-Seq) Analysis of Pollinated Pistils Reveals Genes Influencing Spikelet Fertility in Rice

Article

Jun 2024
Rice Sci

Genomics

Chapter

Jun 2024

Plant breeding research has historically progressed due to unintentional selection following crop development and the desire for increased food availability. The progress in achieving this objective unveiled the constituents of plant genomes that control the whole plant life. Plant genomics seeks to create high-throughput genome-wide techniques, tools, and approaches to unravel the fundamentals of genetic traits, genetic diversity, and by-product output; to comprehend phenotypic development across the plant developmental stages along with its genetics by ecological factors; to map significant loci in the genome; and subsequently to hasten crop improvement. The accessibility of cost-effective, high-throughput DNA sequencing systems has directed the complete sequencing of hundreds of plant genomes, which has broad implications for all aspects of plant biology research and its applications. Constantly increased efforts have been put into plant genomics studies over the last 30 years. However, these technologies have offered several unanticipated challenges. This chapter briefly reviews developments in plant genomics research over the previous decades, plant genome sequencing initiatives, and major challenges in the genomics era.

Plant thaumatin-like protein family: Genome-wide diversification, evolution, and functional adaptation

Chapter

Jan 2024

Proteomic signatures uncover phenotypic plasticity of susceptible and resistant genotypes by wall remodelers in rice blast

Article

Jun 2024
PLANT CELL ENVIRON

Molecular communication between macromolecules dictates extracellular matrix (ECM) dynamics during pathogen recognition and disease development. Extensive research has shed light on how plant immune components are activated, regulated and function in response to pathogen attack. However, two key questions remain largely unresolved: (i) how do ECM dynamics govern susceptibility and disease resistance, and (ii) what are the components that underpin these phenomena? Rice blast, caused by Magnaporthe oryzae, adversely affects rice productivity. To understand ECM‐regulated genotype‐phenotype plasticity in blast disease, we temporally profiled two contrasting rice genotypes in disease and immune states. Morpho‐histological, biochemical and electron microscopy analyses revealed that increased necrotic lesions accompanied by electrolyte leakage govern the disease state. Wall carbohydrate quantification showed that changes in pectin level were more significant in the blast‐susceptible than in the blast‐resistant cultivar. Temporally resolved quantitative disease‐ and immune‐responsive ECM proteomes identified 308 and 334 proteins, respectively, involved in wall remodelling and integrity, signalling and disease/immune response. Pairwise comparisons between time and treatment, messenger ribonucleic acid expression, and diseasome and immunome networks revealed novel blast‐related functional modules. The data demonstrated that accumulation of α‐galactosidase and phosphatase was associated with the disease state, while reactive oxygen species, induction of Lysin motif proteins, CAZymes and extracellular Ca‐receptor protein govern the immune state.

Meta-Analysis of Rice Phosphoproteomics Data to Understand Variation in Cell Signaling Across the Rice Pan-Genome

Article

May 2024
J PROTEOME RES

Genetic characteristics of a sake rice variety Hakutsurunishiki

Article

Jan 2021

Hakutsurunishiki is a sake rice variety developed from a cross with Yamadabo as the seed parent and Wataribune 2 as the pollen parent, which makes it equivalent to a sibling of Yamadanishiki. Although Hakutsurunishiki has different characteristics from Yamadanishiki, its genetic characteristics have yet to be determined. We therefore attempted to clarify the genetic characteristics of Hakutsurunishiki by comparing whole-genome information from both varieties and their parents. Hakutsurunishiki and Yamadanishiki inherited different genomic regions from the parents. In addition, there were DNA polymorphisms at several starch-related genes and at the quantitative trait loci related to white-core expression in the rice grain, suggesting phenotypic differences between the two varieties.

Plant Genetics Redefined-Edited Perspectives on Transformative Breeding: Volume 2

Book

Sep 2023

The journey of genetic improvement has been nothing short of extraordinary in the constantly changing world of agriculture and plant biology. The study of plant genetics has continuously redefined its limits and potential since Mendel's foundational experiments on pea plants and more recent advances in genomics and biotechnology. This book, "Plant Genetics Redefined: Edited Perspectives on Transformative Breeding-(Volume-2)" provides proof of the dynamic nature of plant genetics and the unrelenting search for novel breeding techniques. This volume's aim is to examine the cutting edge of plant genetics through a variety of innovative viewpoints and ground-breaking studies. This book represents a collaborative effort of passionate scientists, research scholars, and educators who share a common goal: to unlock the full potential of plant genetic resources and to develop crop varieties that can thrive in diverse environments while meeting the nutritional needs of a growing global population. The chapters within this volume are a testament to their dedication and expertise. Our journey begins by revisiting the fundamental principles of genetics and breeding, examining how the legacy of classical genetics still informs our modern approaches. We then venture into the realm of molecular genetics and genomics, where the sequencing of plant genomes and the discovery of genes with untapped potential are reshaping our understanding of plant biology. As such, this book is designed to provide a comprehensive overview of the cutting-edge techniques and methodologies that are reshaping the field of plant breeding along with the traditional plant breeding techniques. The book chapter ‘Revolutionizing Crop Improvement through Genomic Selection’ delves into the transformative impact of genomic selection on the field of crop improvement. 
Genomic selection's ability to expedite breeding programs is a central theme, elucidating how it drastically reduces the time and resources traditionally required for developing improved crop varieties. The pivotal role of transcriptomics and metabolomics in advancing the field of plant breeding is detailed in the chapter ‘Application of Transcriptomics and Metabolomics in Refining Plant Breeding’. Statistical methodologies are instrumental in unraveling complex genetic patterns, estimating genetic parameters, and predicting trait outcomes. The book chapter ‘Application of Statistical Methods in Plant Breeding’ explores the pivotal role of statistical methods in modern plant breeding, emphasizing their application to enhance the efficiency, precision, and success of breeding programs. Precision tools such as the CRISPR-Cas9 system and its variants enable researchers to precisely modify specific genes within plant genomes, offering the potential to revolutionize crop breeding. The book chapters, ‘Precision Tools of Genome Editing for Enhancing Crop Genetics’, ‘Application of Gene Editing in Refining Plant Breeding Processes’ and ‘Modification of Crop Plant Genomes using Genome Editing’ explore the revolutionary advances in genome editing technologies and their transformative impact on crop genetics enhancement. By harnessing scientific innovation, promoting agricultural sustainability, and fostering collaboration among stakeholders, QPM initiatives hold promise in addressing malnutrition and promoting healthier societies. The book chapter ‘Strategic Implementation of Quality Protein Maize (QPM) to Address Protein Deficiency’ presents an in-depth exploration of Quality Protein Maize (QPM) and its strategic implementation as a solution to address protein deficiency, particularly in regions where malnutrition remains a critical concern. 
The book chapter ‘Bridging the Gap between Phenotype and Genotype through High-Throughput Phenotyping’ delves into the innovative realm of high-throughput phenotyping and its pivotal role in narrowing the divide between phenotype and genotype in plant biology. Gene pyramiding is a genetic breeding approach that combines multiple desirable genes or alleles to enhance and stack valuable traits in crops. The chapter ‘Utilizing Gene Pyramiding for Improved Traits’ explores the significance, methodologies, and practical applications of gene pyramiding, emphasizing its potential to revolutionize crop improvement and address the multifaceted challenges faced by modern agriculture. As the world confronts the increasing frequency of extreme weather events, shifting climate patterns, and resource constraints, the development of climate-resilient crops has become imperative for global food security. The book chapter ‘Strategies for Breeding Crops Resilient to Climate Challenges’ underscores the critical importance of breeding climate-resilient crops as a key component of climate change adaptation in agriculture and explores innovative strategies for breeding crops that exhibit resilience in the face of climate challenges. The book chapter ‘The Role of Genetic Diversity in Preserving and Utilizing Variability for Plant Breeding’ details the pivotal role of genetic diversity in plant breeding, emphasizing its significance in preserving and harnessing the rich variability found within plant populations. Heterosis, or hybrid vigor, is a phenomenon in which the offspring of two genetically distinct parents exhibit superior traits compared to either parent. It represents a powerful approach to meet the increasing demands for food production and quality in a rapidly changing agricultural landscape. 
The chapter ‘Exploiting Heterosis for Improved Yield and Quality through Hybrid Breeding’ unravels the mystery of heterosis and its transformation into a practical tool for crop improvement. We witness the emergence of bioinformatics as a guiding light, illuminating the paths through vast datasets and unveiling patterns that guide breeding decisions. From deciphering the intricacies of plant genomes to harnessing the power of big data, the journey is both humbling and exhilarating, as detailed in the chapter ‘Application of Bioinformatics and Computational Tools in Crop Genetics’. In the chapter ‘Ethical Considerations and Intellectual Property Rights in Contemporary Plant Breeding’, we delve into the fine distinctions among patents, plant variety protection, and proprietary technologies that shape the course of crop improvement. These legal mechanisms, while fostering innovation, also raise essential questions about equitable access, biodiversity conservation, and the rights of traditional knowledge holders. We confront the ethical contours of access to essential resources, the responsible stewardship of genetic diversity, and the imperative of ensuring that innovation serves the greater good. As editors, we hope that this volume not only serves as a guide for researchers and practitioners but also sparks a renewed appreciation for the marvels of plant genetics. The amalgamation of these time-tested practices with modern tools forms the essence of “Plant Genetics Redefined: Edited Perspectives on Transformative Breeding-(Volume-2)”. We extend our gratitude to the contributors who have shared their expertise, experiences, and insights within these pages. Their collective wisdom enriches this work and serves as a beacon for future explorations in the realm of plant breeding. 
The book “Plant Genetics Redefined: Edited Perspectives on Transformative Breeding-(Volume-2)” stands as a tribute to the perseverance of those who seek to enhance agricultural productivity, sustainability, and resilience. May this book inspire readers to continue pushing the boundaries of knowledge and innovation, ensuring a nourished and sustainable future for all.

Progress in Rice Breeding Based on Genomic Research

Article

Apr 2024

The role of rice genomics in breeding progress is becoming increasingly important. Deeper research into the rice genome will contribute to the identification and utilization of outstanding functional genes, enriching the diversity and genetic basis of breeding materials and meeting the diverse demands for various improvements. Here, we review the significant contributions of rice genomics research to breeding progress over the last 25 years, discussing the profound impact of genomics on rice-genome sequencing, functional-gene exploration, and novel breeding methods, and we provide valuable insights for future research and breeding practices.

Analysis of the genome sequence of the flowering plant Arabidopsis thaliana

Article

Oct 2000

Yield Potential Trends of Tropical Rice since the Release of IR8 and the Challenge of Increasing Rice Yield Potential

Article

Nov 1999

… harvest index. New plant type breeding has not yet improved yield potential due to poor grain filling and low biomass production. Factors that cause poor grain filling and low biomass production of the NPT lines have been identified. Selecting parents with good grain filling traits, introduction of indica genes into NPT's tropical japonica background, and a refinement of the original NPT design are expected to improve the performance of the NPT lines. Further enhancement in yield potential may be possible from use of intersubspecific heterosis between indica and NPT lines.

… Therefore, rice yield potential has remained almost constant in the tropical environments. The theoretical potential yield has been estimated at 15.9 Mg ha⁻¹ in these environments based on the total amount of incident solar radiation during the growing season (Yoshida, 1981). On the basis of this estimate, there appears to be a large gap between the yield potential of the best available rice cultivars and the maximum theoretical yield. At …

A High-Density Rice Genetic Linkage Map with 2275 Markers Using a Single F2 Population

Article

Jan 1998

A 2275-marker genetic map of rice (Oryza sativa L.) covering 1521.6 cM in the Kosambi function has been constructed using 186 F2 plants from a single cross between the japonica variety Nipponbare and the indica variety Kasalath. The map provides the most detailed and informative genetic map of any plant. Centromere locations on 12 linkage groups were determined by dosage analysis of secondary and telotrisomics using > 130 DNA markers located on respective chromosome arms. A limited influence on meiotic recombination inhibition by the centromere in the genetic map was discussed. The main sources of the markers in this map were expressed sequence tag (EST) clones from Nipponbare callus, root, and shoot libraries. We mapped 1455 loci using ESTs; 615 of these loci showed significant similarities to known genes, including single-copy genes, family genes, and isozyme genes. The high-resolution genetic map permitted us to characterize meiotic recombinations in the whole genome. Positive interference of meiotic recombination was detected both by the distribution of recombination number per each chromosome and by the distribution of double crossover interval lengths.

Development and Mapping of 2240 New SSR Markers for Rice (Oryza sativa L.)

Article

Jan 2002

Susan Mccouch

A total of 2414 new di-, tri- and tetra-nucleotide non-redundant SSR primer pairs, representing 2240 unique marker loci, have been developed and experimentally validated for rice ( Oryza sativa L.). Duplicate primer pairs are reported for 7% (174) of the loci. The majority (92%) of primer pairs were developed in regions flanking perfect repeats ≥ 24 bp in length. Using electronic PCR (e-PCR) to align primer pairs against 3284 publicly sequenced rice BAC and PAC clones (representing about 83% of the total rice genome), 65% of the SSR markers hit a BAC or PAC clone containing at least one genetically mapped marker and could be mapped by proxy. Additional information based on genetic mapping and “nearest marker” information provided the basis for locating a total of 1825 (81%) of the newly designed markers along rice chromosomes. Fifty-six SSR markers (2.8%) hit BAC clones on two or more different chromosomes and appeared to be multiple copy. The largest proportion of SSRs in this data set correspond to poly(GA) motifs (36%), followed by poly(AT) (15%) and poly(CCG) (8%) motifs. AT-rich microsatellites had the longest average repeat tracts, while GC-rich motifs were the shortest. In combination with the pool of 500 previously mapped SSR markers, this release makes available a total of 2740 experimentally confirmed SSR markers for rice, or approximately one SSR every 157 kb.
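The marker counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are ours, and the genome size it prints is merely what the quoted "one SSR every 157 kb" spacing implies, not a figure stated in the abstract.

```python
# Figures quoted in the abstract above; variable names are illustrative.
new_marker_loci = 2240        # unique loci from the new primer pairs
previous_markers = 500        # previously mapped SSR markers
located_markers = 1825        # new markers placed along chromosomes
spacing_kb = 157              # "approximately one SSR every 157 kb"

total_markers = new_marker_loci + previous_markers
located_percent = round(100 * located_markers / new_marker_loci)

# Assumption: markers are treated as evenly spaced, so the genome size
# implied by the quoted spacing is simply count x spacing.
implied_genome_mb = total_markers * spacing_kb / 1000

print(f"total SSR markers: {total_markers}")
print(f"located markers: {located_percent}% of new loci")
print(f"implied genome size: {implied_genome_mb:.2f} Mb")
```

The implied size is close to the 430 Mb genome estimate used in the YAC physical-map abstract in this list, which suggests the spacing was computed against a genome of roughly that size.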

A physical map with yeast artificial chromosome (YAC) clones covering 63% of the 12 rice chromosomes

Article

Feb 2001
GENOME

A new YAC (yeast artificial chromosome) physical map of the 12 rice chromosomes was constructed utilizing the latest molecular linkage map. The 1439 DNA markers on the rice genetic map selected a total of 1892 YACs from a YAC library. A total of 675 distinct YACs were assigned to specific chromosomal locations. In all chromosomes, 297 YAC contigs and 142 YAC islands were formed. The total physical length of these contigs and islands was estimated at 270 Mb, which corresponds to approximately 63% of the entire rice genome (430 Mb). Because the physical length of each YAC contig has been measured, we could then estimate the physical distance between genetic markers more precisely than previously. In the course of constructing the new physical map, the DNA markers mapped at 0.0-cM intervals were ordered accurately and the presence of potentially duplicated regions among the chromosomes was detected. The physical map combined with the genetic map will form the basis for elucidation of the rice genome structure, map-based cloning of agronomically important genes, and genome sequencing. Key words: physical mapping, YAC contig, rice genome, rice chromosomes.
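The coverage figure in this abstract follows directly from the two lengths it quotes. A minimal check, with variable names of our own choosing:

```python
# Lengths quoted in the abstract above; variable names are illustrative.
mapped_length_mb = 270   # total length of YAC contigs and islands
genome_size_mb = 430     # genome size assumed by the abstract

coverage = mapped_length_mb / genome_size_mb
print(f"physical map coverage: {coverage:.1%}")
```

The ratio rounds to the "approximately 63%" given in the abstract and in the article title.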

Analysis of the genome sequence of the flowering plant

Article

Jan 2000

The Arabidopsis Genome Initiative

Erratum: Genome-wide insertional mutagenesis of Arabidopsis thaliana (Science (August 1) (653))

Article

Sep 2003

Jose Alonso

A draft sequence of the rice genome (Oryza sativa L. ssp. japonica) (April, pg 92, 2002)

Article

Aug 2005

S.A. Goff

Rice Transposable Elements: A Survey of 73,000 Sequence-Tagged-Connectors

Article

Jul 2000
GENOME RES

Long Mao

As part of an international effort to sequence the rice genome, the Clemson University Genomics Institute is developing a sequence-tagged-connector (STC) framework. This framework includes the generation of deep-coverage BAC libraries from O. sativa ssp. japonica cv. Nipponbare and the sequencing of both ends of the genomic DNA insert of the BAC clones. Here, we report a survey of the transposable elements (TE) in >73,000 STCs. A total of 6848 STCs were found homologous to regions of known TE sequences (E<10⁻⁵) by FASTX search of STCs against a set of 1358 TE protein sequences obtained from GenBank. Of these TE-containing STCs (TE–STCs), 88% (6027) are related to retroelements and the remaining are transposase homologs. Nearly all DNA transposons known previously in plants were present in the STCs, including maize Ac/Ds, En/Spm, Mutator, and mariner-like elements. In addition, 2746 STCs were found to contain regions homologous to known miniature inverted-repeat transposable elements (MITEs). The distribution of these MITEs in regions near genes was confirmed by EST comparisons to MITE-containing STCs, and our results showed that the association of MITEs with known EST transcripts varies by MITE type. Unlike the biased distribution of retroelements in maize, we found no evidence for the presence of gene islands when we correlated TE–STCs with a physical map of the CUGI BAC library. These analyses of TEs in nearly 50 Mb of rice genomic DNA provide an interesting and informative preview of the rice genome.

A Comprehensive Rice Transcript Map Containing 6591 Expressed Sequence Tag Sites

Article

Mar 2002

Jianzhong Wu

To determine the chromosomal positions of expressed rice genes, we have performed an expressed sequence tag (EST) mapping project by polymerase chain reaction–based yeast artificial chromosome (YAC) screening. Specific primers designed from 6713 unique EST sequences derived from 19 cDNA libraries were screened on 4387 YAC clones and used for map construction in combination with genetic analysis. Here, we describe the establishment of a comprehensive YAC-based rice transcript map that contains 6591 EST sites and covers 80.8% of the rice genome. Chromosomes 1, 2, and 3 have relatively high EST densities, approximately twice those of chromosomes 11 and 12, and contain 41% of the total EST sites on the map. Most of the EST-dense regions are distributed on the distal regions of each chromosome arm. Genomic regions flanking the centromeres for most of the chromosomes have lower EST density. Recombination frequency in these regions is suppressed significantly. Our EST mapping also shows that 40% of the assigned ESTs occupy only ∼21% of the entire genome. The rice transcript map has been a valuable resource for genetic study, gene isolation, and genome sequencing at the Rice Genome Research Program and should become an important tool for comparative analysis of chromosome structure and evolution among the cereals.
