
The GP problem: Quantifying gene-to-phenotype relationships

February 2002
In Silico Biology 2(2):151-64


Authors:

Mark Cooper

The University of Queensland

Scott C. Chapman

The Commonwealth Scientific and Industrial Research Organisation

Dean Podlich

DuPont Pioneer

G. L. Hammer

The University of Queensland

In this paper we refer to the gene-to-phenotype modeling challenge as the GP problem. Integrating information across levels of organization within a genotype-environment system is a major challenge in computational biology. However, resolving the GP problem is a fundamental requirement if we are to understand and predict phenotypes given knowledge of the genome and model dynamic properties of biological systems. Organisms are consequences of this integration, and it is a major property of biological systems that underlies the responses we observe. We discuss the E(NK) model as a framework for investigation of the GP problem and the prediction of system properties at different levels of organization. We apply this quantitative framework to an investigation of the processes involved in genetic improvement of plants for agriculture. In our analysis, N genes determine the genetic variation for a set of traits that are responsible for plant adaptation to E environment-types within a target population of environments. The N genes can interact in epistatic NK gene-networks through the way that they influence plant growth and development processes within a dynamic crop growth model. We use a sorghum crop growth model, available within the APSIM agricultural production systems simulation model, to integrate the gene-environment interactions that occur during growth and development and to predict genotype-to-phenotype relationships for a given E(NK) model. Directional selection is then applied to the population of genotypes, based on their predicted phenotypes, to simulate the dynamic aspects of genetic improvement by a plant-breeding program. The outcomes of the simulated breeding are evaluated across cycles of selection in terms of the changes in allele frequencies for the N genes and the genotypic and phenotypic values of the populations of genotypes.


Grain yield values (t/ha) for the 4235-genotype classes in the Mild Terminal Stress and Severe Terminal Stress environment-types for color coded representations of each of the four traits; (a) Transpiration Efficiency, (b) Osmotic Adjustment, (c) Phenology and (d) Stay-green. Genotype classes are color coded according to their genetic distance from the target genotype in the target population of environments (Hamming distance), extending from yellow (all alleles different from the target genotype) to blue (no alleles different from the target genotype).


Change in gene frequency of the + alleles for increasing level of the four traits (TE=Transpiration Efficiency, OA=Osmotic Adjustment, Ph=Phenology, SG=Stay-green) over cycles of selection, when selection is conducted in the Severe Terminal Stress (a) and Mild Terminal Stress (b) environment-types.


Grain yield values (t/ha) for the 4235-genotype classes and the average trajectory of a population of genotypes (red line) over cycles of selection, when selection is conducted in the Severe Terminal Stress (a) and Mild Terminal Stress (b) environment-types. Genotype classes are color coded according to their genetic distance from the target genotype in either the Severe Terminal Stress (a) or the Mild Terminal Stress (b) environment-types, extending from yellow (all alleles different from the target genotype) to blue (no alleles different from the target genotype).


In Silico Biology 2 (2002) 151–164

IOS Press

Electronic publication can be found in In Silico Biol. 2, 0013 <http://www.bioinfo.de/isb/2002/02/0013/>, 24 January 2002.


The GP Problem: Quantifying Gene-to-Phenotype Relationships

Mark Cooper (1,2,*), Scott C. Chapman (3), Dean W. Podlich (1,2) and Graeme L. Hammer (1,4)

(1) School of Land and Food Sciences, The University of Queensland, Brisbane, Queensland 4072, Australia

(2) Current Address: Pioneer Hi-Bred International Inc., 7300 N.W. 62nd Avenue, P.O. Box 1004, Johnston, Iowa 50131, USA

(3) CSIRO Plant Industry, 120 Meiers Road, Indooroopilly, Queensland 4068, Australia

(4) Agricultural and Production Systems Research Unit (APSRU), Queensland Department of Primary Industries, Tor Street, Toowoomba, Queensland, Australia

Edited by E. Wingender; received 26 September 2001; revised and accepted 21 December 2001; published 24 January 2002


Links:

http://pig.ag.uq.edu.au/qu-gene/

http://www.apsru.gov.au/Products/apsim.htm

____________________________

*Corresponding author: Email: mark.cooper@pioneer.com.

M. Cooper et al. / The GP Problem


KEYWORDS: E(NK) model, epistasis, genotype-by-environment interactions, plant, crop, target population of environments, genetic space

INTRODUCTION

Today, a major research focus in the fields of genetics and computational biology is developing methods to predict properties of organisms and populations of organisms at the phenotypic level from knowledge of the structure, function and diversity of genomes. We refer to the problem of determining gene-to-phenotype relationships as the GP problem. Formulating a solution to this problem for a defined natural system requires integration of information across many levels of organization within biophysical systems. An iterative modeling approach combined with strategic experimentation provides a powerful framework for tackling the GP problem. The objective of this paper is to define what we mean by an iterative modeling approach. We commence by describing a general approach to modeling natural systems and then illustrate its application in the modeling of plant breeding programs in an agricultural context.

It has often been stated that a model is a simplification of the natural system under investigation and that the level of simplification must be balanced against the complexity of the properties of the system to be studied. Therefore, what is an appropriate modeling approach to tackle the GP problem? In describing modeling strategies in general, Rosen [1985] and Casti [1989, 1997] distinguished between the "Natural System" that we are attempting to understand and the "Formal System" that is a mathematical construction of how we understand the properties of the natural system (Fig. 1). This is a useful starting point for thinking about how we might model a genotype-environment system.

Fig. 1. Concept map for modeling biophysical systems. The "Natural System" is the biophysical structure that is under investigation and the "Formal System" is the model-based investigative strategy that is in use to construct "Knowledge Structures" that represent properties of the "Natural System".

Approaches to constructing a formal system that captures the important properties of a natural system can take many forms. Here we are interested in a mathematical framework that allows us to represent the key components of the natural system and define the key relationships between these components. We intend that the mathematical relationships that we construct within the formal system will ultimately be representative of the causal relationships that are properties of the natural system, so that we can investigate the implications of these causal relationships within the formal system. More detailed formalizations might be appropriate if the objective is to investigate relationships at lower levels within the system, rather than to understand the relationships among its components. An example of this is in the modeling of processes related to the productivity of agricultural plants (Fig. 2). The genetics of different species and varieties within species determine how plants interact with the soil and aerial (radiation, temperature, rainfall) environments to develop structures to 'capture' radiation, CO2 and water. While the enzymes and biochemistry of the primary processes involved in photosynthesis are extensively studied, it is difficult to integrate the results of these processes (assimilation of CO2 into biomass) over various time scales, and to model the effect of this biomass as it is diverted to new tissues and organs to either store biomass or capture more resources. It has been considered that models can only simulate one or two levels of scale away from the level of their primary function. Further, at molecular and atomic scales, the requirement for information input to the system rapidly increases. By studying and experimenting within the natural system we attempt to gain knowledge about the biophysical structures and their causal relationships at appropriate scales. Some of these causal relationships may be functional, others pre-functional and in many cases non-functional but consequential of the ways in which the biophysical structures interact [Kauffman, 2000].

Fig. 2. Hierarchy and scale in modeling processes within plant systems.

In many cases the results of experiments in biology are summarized descriptively. Alternatively we can attempt to encode, within a formal mathematical framework, our understanding of the results of the experiments (Fig. 1). The collection of these formal mathematical structures that we create is a model of the system. In many cases we may commence an experiment with a prior model or hypothesis and use the results of our experimental program to update and improve our model of the natural system [e.g. Ideker et al., 2001]. As we refine and improve our model through iterative cycles of experimentation and modeling we will be able to study properties of the natural system within the properties of the formal system. This will give us a basis for determining the level of confidence we have in decoding the structures observed within the model and making predictions about the properties that we expect to see within the natural system. Additionally, model building through iteration will enable us to acquire and interpret data structures from experimental programs as a foundation for constructing knowledge structures and queries that apply to the properties of the natural system. As we improve the quality of our model we will increasingly improve our power to predict properties of the natural system across its levels of organization.


INTEGRATION BY ASSUMPTION OR BY FOLDING OUT THE DETAIL?

If we attempt to construct an integrated model of a natural system without adequate attention to the ways that the components of the system interact across levels of organization (Fig. 2) then we are either confining ourselves to working within a level of organization or we will construct a model that has limited power to provide insight into many of the properties of the natural system. In the absence of experimental evidence, attempting to integrate across levels of organization by assuming that interactions are unlikely to be important will leave the resulting model liable to deviate from the natural system whenever these interactions become important.

In classical quantitative genetics many of the complicating interactions that can impact on gene-to-phenotype relationships have been assumed to be unimportant, based on the expectation that their effects are small, and/or that their estimation is impractical. Two properties of genotype-environment systems that are often ignored are those of gene-to-gene interactions (epistasis) and gene-by-environment interactions [e.g. Clark, 2000]. For example, in defining the value of a genotype for a quantitative trait that is determined by multiple genes, the assumption that epistasis is zero implies that the effects of the alleles for the segregating genes are independent of the effects of the alleles at the other genes. In this case, for each gene, additive and dominance intra-gene effects can be defined in terms of contrasts between the homozygous and heterozygous genotypes. Hence, the value of the multi-gene genotype for an individual is then simply determined as the cumulative effects of the genes by summing the allele effects across the segregating genes. Similarly, gene-by-environment (GE) interactions have been assumed to be unimportant or a source of error that can be summed to zero by evaluating genotypes in adequately large samples of experimental environments representing the target population of environments.
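The no-epistasis assumption described above can be illustrated with a minimal sketch. It assumes the common convention in which the two homozygotes at a gene take values +a and -a and the heterozygote takes value d, measured from a midpoint; the function name, the encoding of genotypes as counts of + alleles and all numeric values are hypothetical and are not taken from the paper.

```python
def genotypic_value(plus_allele_counts, additive, dominance, midpoint=0.0):
    """Sum independent per-gene effects (no epistasis, no GE interaction).

    plus_allele_counts: number of '+' alleles carried at each gene (0, 1 or 2).
    additive, dominance: per-gene effects a and d (illustrative values only).
    """
    value = midpoint
    for n_plus, a, d in zip(plus_allele_counts, additive, dominance):
        if n_plus == 2:      # '+' homozygote
            value += a
        elif n_plus == 1:    # heterozygote
            value += d
        else:                # '-' homozygote
            value -= a
    return value


# Three genes: '+' homozygote, heterozygote, '-' homozygote.
example = genotypic_value([2, 1, 0], additive=[1.0, 2.0, 0.5], dominance=[0.0, 0.5, 0.0])
```

Because each gene contributes independently of the others, changing the genotype at one gene shifts the total by the same amount regardless of the alleles present at the remaining genes. Epistasis and GE interactions are exactly the cases in which this property fails.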

Where experimental evidence demonstrates that the interactions are important it is necessary to directly evaluate their implications within the formal system. Analyses of the genetic architecture of quantitative traits in model systems indicate important sources of genetic variation attributed to epistasis and GE interactions [Mackay, 2001]. The same can be expected of economically important traits in agricultural plant species. Therefore, in tackling the GP problem for quantitative traits we seek a modeling framework that enables investigation of the impact of gene-to-gene and gene-by-environment interactions.

MODELING A GENOTYPE-ENVIRONMENT SYSTEM

To progress from a general discussion of strategies for modeling natural systems to the specifics required to model genotype-environment systems it is necessary to define both the key properties and relationships that are important in the target natural system and the methods that are to be used in constructing the formal system. Figure 3 is a concept map, based on the modeling framework described in Figure 1, which focuses on the GP problem for a genotype-environment system. Our objective is to establish a formal representation of a genotype-environment system to enable modeling gene-to-phenotype relationships as a basis for evaluating the efficiency of plant breeding strategies [Cooper et al., 1999]. Therefore, here we emphasize the quantification of allelic variation at N genes and their potential interactions within NK gene networks [Kauffman, 1993] and with E environmental conditions [Podlich and Cooper, 1998] in determining the gene-to-phenotype relationships for the traits to be improved by plant breeding.

The scope for modeling plant and animal breeding strategies has been a long-term focus of applied quantitative genetics [e.g. Falconer and Mackay, 1996; Comstock, 1996]. The use of computer simulation approaches has increased as hardware and software capability and flexibility have improved. Adopting a simulation approach to study gene-to-phenotype relationships provides greater flexibility for investigating the influences of epistasis and GE interactions than is possible within the classical statistical modeling approach [Kempthorne, 1988; Podlich and Cooper, 1998]. Kauffman [1993] gave a comprehensive discussion of the NK model and its suitability for investigating the impact of epistasis in evolutionary processes. Podlich and Cooper [1998] defined the E(NK) model as an extension of Kauffman's NK model in order to accommodate the effects of gene-by-environment interactions. In the E(NK) model gene-by-environment interactions are possible where different forms of NK gene network models can be expressed in the different environmental conditions that are possible within a target population of environments.
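The structure of an E(NK) model can be sketched in a few lines. The sketch below is not the QU-GENE implementation; it assumes a haploid, two-allele simplification in the style of Kauffman's random NK landscapes, in which each gene's contribution depends on its own allele and on the alleles at K other genes, and a separate, randomly drawn NK network is created for each of the E environment-types. All function names and the random contribution tables are illustrative assumptions.

```python
import itertools
import random


def make_enk_model(n_env, n_genes, k, seed=0):
    """Build one random NK network per environment-type (illustrative only)."""
    rng = random.Random(seed)
    model = []
    for _ in range(n_env):
        # For each gene, choose the K other genes whose alleles modify its effect.
        neighbours = [
            rng.sample([j for j in range(n_genes) if j != i], k)
            for i in range(n_genes)
        ]
        # For each gene, a lookup table over the 2**(K+1) allele combinations.
        tables = [
            {states: rng.random() for states in itertools.product((0, 1), repeat=k + 1)}
            for _ in range(n_genes)
        ]
        model.append((neighbours, tables))
    return model


def genotype_value(model, env, genotype):
    """Mean gene contribution of a genotype in environment-type `env`."""
    neighbours, tables = model[env]
    contributions = []
    for i, allele in enumerate(genotype):
        key = (allele,) + tuple(genotype[j] for j in neighbours[i])
        contributions.append(tables[i][key])
    return sum(contributions) / len(contributions)


model = make_enk_model(n_env=3, n_genes=5, k=2, seed=42)
values = [genotype_value(model, env, (1, 0, 1, 1, 0)) for env in range(3)]
```

With K = 0 the tables reduce to independent per-gene effects, so the model is additive within an environment-type; with K > 0 the effect of an allele depends on the genetic background (epistasis), and because each environment-type has its own network the same genotype can rank differently across environment-types (GE interaction).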

Fig. 3. Concept map for modeling the key components of a genotype-environment system and the relationships to the components of the E(NK) model and the investigative strategies applied to quantify the value of alleles of genes within the genotype-environment system [Adapted from Cooper et al., 1999].

The relationships between the components of the E(NK) model and the biophysical components of a genotype-environment system are indicated in Figure 3. Some of the investigation strategies that can be used to provide the information necessary to build formal models of gene-to-phenotype relationships and quantify the value of allelic variation in terms of the components of the E(NK) framework are indicated. Key activities that are emphasized include: (i) environmental characterization as a basis for defining the target population of environments and causes of GE interactions, (ii) genetic analysis to study genetic variation for biochemical pathways, physiological processes and adaptive traits, (iii) genetic (recombination) and physical mapping of genes, (iv) functional genomics to study the regulation and expression of genes, and (v) crop growth models that define the relationships between genetic variation for traits, plant growth and development processes and variation in environmental resources within a target population of environments [e.g. Bidinger et al., 1996].

SORGHUM BREEDING EXAMPLE: PROBLEM AND MODEL DEFINITION

To examine the effectiveness of a breeding strategy we need to define two properties of a genotype-environment system: (1) the target population of environments, and (2) the target genotype for the gene-to-phenotype model. Within the target geographical area that a breeding program operates, new genotypes are developed over sequences of cycles of intermating parents, evaluation and selection of progeny to identify new genotypes that have high and stable yield performance across a wide range of environmental conditions. The occurrence of environmental conditions within the geographical area has both spatial and temporal dimensions and the different conditions can occur with different frequencies in both dimensions. This results in a complex mixture of different environmental conditions that is referred to here as the target population of environments. In the presence of GE interactions, understanding the environmental factors that influence genotype performance and cause these interactions is an important step in designing an effective testing strategy for measurement of trait phenotypes as part of a breeding program. The target genotype is then defined as the genotype that results in the best trait performance across the target population of environments for the specified gene-to-phenotype model. For complex genotype-environment systems there can be multiple genotype targets. As E(NK) models become more complex, with increasing levels of N and K, it becomes increasingly difficult to compute and identify a single target genotype. In these situations, where it is not possible to create and evaluate all potential genotypes for a gene-to-phenotype model, alternative evaluation strategies are used. In the example we consider here the genotype-environment system is of a size that definition of a single target genotype is possible.
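For a small system, the definition of the target genotype given above can be made concrete by exhaustive enumeration. The sketch below assumes that "best trait performance across the target population of environments" is read as the frequency-weighted mean over environment-types, which is one plausible interpretation rather than the authors' stated procedure. The haploid two-allele genotypes, the toy value function and the frequencies are all hypothetical.

```python
from itertools import product


def target_genotype(value_fn, env_freqs, n_genes):
    """Return the genotype with the highest frequency-weighted mean value.

    value_fn(genotype, env) gives the trait value in one environment-type;
    env_freqs[env] is that environment-type's frequency in the target population.
    """
    best, best_mean = None, float("-inf")
    for genotype in product((0, 1), repeat=n_genes):
        mean = sum(f * value_fn(genotype, env) for env, f in enumerate(env_freqs))
        if mean > best_mean:
            best, best_mean = genotype, mean
    return best, best_mean


def toy_value(genotype, env):
    """Toy model with a crossover GE interaction at gene 0 (illustrative)."""
    if env == 0:
        return sum(genotype)                      # every '+' allele helps
    return sum(genotype[1:]) - 2 * genotype[0]    # '+' at gene 0 is harmful


common_env0, _ = target_genotype(toy_value, [0.7, 0.3], n_genes=3)
common_env1, _ = target_genotype(toy_value, [0.3, 0.7], n_genes=3)
```

In this toy case the target genotype carries the + allele at gene 0 when environment-type 0 is frequent and the alternative allele when environment-type 1 is frequent, which shows why the frequencies of environment-types must be known before a target can be named. The enumeration visits 2^N genotypes and therefore becomes impractical as N grows, which is the difficulty noted in the text.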

In this example we discuss some key results from a larger long-term study. This larger study is investigating the requirements (Fig. 3) for model development and simulation of sorghum (Sorghum bicolor (L.) Moench) adaptation and grain yield for the heterogeneous dryland agricultural system in northeastern Australia [Chapman et al., 2000a,b,c, 2002a,b].

First we provide some background and context to the complexity of this genotype-environment system. Sorghum is the major summer crop grown in the northeastern cropping region of Australia. Grain yield is the major economic product and is used mainly as animal feed. Sorghum grain yield is a complex quantitative trait and is the result of interactions and integration of many component traits that can themselves interact with variation in environmental conditions (rainfall, temperature and solar radiation) during a crop growth and developmental cycle of around 100 days. The major environmental variable that has a dominant influence on grain yield variation is water availability to the crop. Variation in water availability is a consequence of complex spatial and temporal variation in rainfall prior to and during the growth of the crop and also the spatial variation in the water holding capacity of the soil types across the geographical area. We have found that the environmental variation in incidence of drought can explain a significant component of the GE interactions for grain yield [Chapman et al., 2000a,b,c]. Research into the genetic and physiological bases of drought tolerance of sorghum has identified and examined the importance of the following four traits: (1) phenology, in particular the timing of flowering (PH) [Hammer et al., 1989], (2) stay-green (SG) [Borrell and Hammer, 2000], (3) transpiration efficiency (TE) [Hammer et al., 1997; Mortlock and Hammer, 1999], and (4) osmotic adjustment (OA) [Hammer et al., 1999]. In parallel research, genetic analysis and the construction of a molecular marker map for grain sorghum [Tao et al., 1998, 2000] has enabled trait dissection. This body of work provides working hypotheses of the number of genes or Quantitative Trait Loci (QTL) that may contribute to the genetic variation for these four traits [Chapman et al., 2002a,b].

With access to this experimental database we have used a simulation approach to investigate the efficiencies of plant breeding strategies used for genetic improvement of grain yield of sorghum under the dryland conditions in Australia. This required us to develop an interface between a genetic modeling platform (QU-GENE) [Podlich and Cooper, 1998; http://pig.ag.uq.edu.au/qu-gene/] and a cropping system model (APSIM) [McCown et al., 1996; http://www.apsru.gov.au/products/apsim.htm], which has a module for sorghum [Hammer and Muchow, 1994; Hammer et al., 2001]. This interface was constructed in a way that used information generated from our ability to characterize environments for their occurrence of drought, our understanding of the spatial and temporal distributions of drought in the target population of environments, and the data available from genetic and physiological analyses of traits considered to contribute to drought tolerance (Fig. 3). This provides a model architecture that links the alleles of genes and the plant growth and development processes that respond to variation in the environmental conditions to determine grain yield (Fig. 4). Thus, by developing an interface between the QU-GENE genetic model and the APSIM-Sorg model for sorghum there is a relationship between genes and phenotypes that enables investigation of the GP problem within a genotype-environment system context. These gene-to-phenotype relationships can be used to assess the value of genes in terms of an E(NK) model for grain yield in a target population of environments. Further, as additional experimental information becomes available it is possible to continually update the genetic and physiological models for the genotype-environment system, our assessment of the allelic variation we have identified, and any impact that this may have on the efficiency of the breeding strategies we are using for genetic improvement of sorghum.

Fig. 4. Schematic of the modular structures and linkages between QU-GENE and APSIM. In this example S1 recurrent selection was used as the breeding strategy to improve grain yield of the sorghum population of genotypes. Other plant breeding strategies are indicated (e.g. pedigree selection). Genotypes are categorized into expression-states in QU-GENE and these expression-states map to trait values modeled in APSIM-Sorg for different combinations of soil and weather data. Output from APSIM is processed to define both the yield of all possible genotypes (expression-state combinations) and the frequency of drought environment types (ETs) encountered in the target population of environments (TPE).

The E(NK) model can be parameterized in a number of ways, including: (1) Constructing Boolean gene networks and sampling genotype values for the components of the networks from underlying distributions of gene effects, a procedure pioneered by Kauffman [1993]; (2) Defining inheritance models using empirical estimates for classical quantitative genetic parameters [Podlich and Cooper, 1998]; and (3) Specifying gene networks to represent the properties of biochemical pathways. For the sorghum genotype-environment system in our example the resulting E(NK) model is a consequence of the number of genes specified to control variation for traits, the number of environment-types identified for the target population of environments and the physiological relationships that determine crop growth and development with the APSIM-Sorg sorghum model. This is a novel approach for determining the parameters for an E(NK) model and it is made feasible by developing the interface between QU-GENE and APSIM (Fig. 4).


Here we consider an E(NK) model where the number of environment-types E=3 and the total number of genes N=15. Each of the 15 genes has two alleles segregating within a base population of genotypes. The level of epistasis for grain yield, as defined by the K parameter, is not explicitly defined here and is an emergent property of the extent of trait interconnectedness within the APSIM-Sorg crop growth model.

The three environment-types represent different levels of severity of drought: (1) mild terminal stress, (2) moderate terminal stress, and (3) severe terminal stress. These drought environment-types, together with their frequencies of occurrence in the target population of environments, were determined from an analysis of the timing and severity of water deficits during crop growth and development by running the APSIM-Sorg model for a standard genotype with approximately 100 years of weather data across a number of locations in northeastern Australia. The locations represented different soil types from the target geographical area. The APSIM-Sorg simulations were then summarized by cluster analysis to identify the three key drought environment-types (Fig. 4) [Chapman et al., 2000b,c]. While there are three environment-types in the target population of environments, to be concise we will mostly concentrate on only two of these in this paper: (1) the mild terminal stress environment-type, and (2) the severe terminal stress environment-type. The 15 genes determine the genetic variation for grain yield in the environment-types by specifying the extent of genetic variation for the four traits PH (3 genes), SG (5 genes), TE (5 genes) and OA (2 genes). Thus, the genetic variation for grain yield is an emergent property of the variation for the physiologically defined growth and development processes in the APSIM-Sorg model impacted by the four traits. The process we have used here to specify the genetic variation for grain yield differs from the classical quantitative genetics approach where effects of "yield-genes" are specified in ways that are unrelated to or unconstrained by the biophysical properties of plant growth and development processes.

The resultant genetic variation for grain yield in the base population of genotypes is then subjected to a series of recurrent cycles of directional selection for increased levels of grain yield. The breeding strategy we evaluate in this example is S1 recurrent selection [Hallauer and Miranda, 1988] and selection is based on the yield phenotypes of genotypes when they are evaluated in samples of environments taken from the target population of environments.
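The general idea of recurrent cycles of directional selection can be sketched with a deliberately simplified simulation. The sketch below is not S1 recurrent selection and does not use QU-GENE or APSIM: it assumes simple truncation selection on a noisy phenotype, random mating among the selected parents, unlinked genes, and a purely additive toy phenotype (the count of + alleles). Every parameter value is hypothetical.

```python
import random


def plus_allele_frequency(population):
    """Frequency of '+' alleles (coded 1) across all genes and individuals."""
    n_alleles = sum(2 * len(individual) for individual in population)
    return sum(a + b for individual in population for a, b in individual) / n_alleles


def simulate_selection(n_genes=15, pop_size=200, n_cycles=10,
                       selected_fraction=0.2, noise_sd=1.0, seed=1):
    """Track '+' allele frequency over cycles of truncation selection."""
    rng = random.Random(seed)
    # Each individual is a list of diploid genes, each gene a pair of alleles.
    population = [
        [(rng.randint(0, 1), rng.randint(0, 1)) for _ in range(n_genes)]
        for _ in range(pop_size)
    ]
    frequencies = [plus_allele_frequency(population)]
    n_selected = max(2, int(selected_fraction * pop_size))
    for _ in range(n_cycles):
        # Phenotype = toy genotypic value plus environmental noise.
        scored = [
            (sum(a + b for a, b in individual) + rng.gauss(0.0, noise_sd), index)
            for index, individual in enumerate(population)
        ]
        scored.sort(reverse=True)
        parents = [population[index] for _, index in scored[:n_selected]]
        # Random mating: each parent passes one randomly chosen allele per gene.
        next_population = []
        for _ in range(pop_size):
            mother, father = rng.choice(parents), rng.choice(parents)
            next_population.append(
                [(rng.choice(mother[g]), rng.choice(father[g])) for g in range(n_genes)]
            )
        population = next_population
        frequencies.append(plus_allele_frequency(population))
    return frequencies


trajectory = simulate_selection()
```

The returned list holds the + allele frequency before selection and after each cycle, which is the kind of summary the text describes as "changes in gene frequencies". In the paper's framework the toy phenotype would be replaced by crop-model yield predictions sampled from the target population of environments.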

The genetic changes in the population of genotypes in response to the selection imposed by the breeding strategy are examined in terms of: (1) the changes in frequencies of the alternative alleles for the 15 genes (referred to as changes in gene frequencies) on a trait basis, and (2) the changes in grain yield performance of the genotypes created and selected during the course of the simulation experiment. We examine these changes due to selection at both genetic and phenotypic levels by constructing response surfaces that relate genetic distances between genotypes to the phenotypic values for the four traits PH, SG, TE, OA and also grain yield. Genetic distances are calculated as Hamming Distances, which give a measure of the number of alleles that differ between any pair of genotypes.
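A minimal sketch of such a distance is given below. It assumes each diploid genotype is encoded as the number of + alleles carried at each gene (0, 1 or 2), so that the distance is the total number of allele substitutions separating two genotypes; this encoding is an assumption made for illustration, and the paper does not specify how the distance was implemented.

```python
def hamming_distance(genotype_a, genotype_b):
    """Number of alleles that differ between two diploid genotypes.

    Each genotype is a sequence giving the count of '+' alleles (0, 1 or 2)
    at each gene; both sequences must cover the same genes in the same order.
    """
    if len(genotype_a) != len(genotype_b):
        raise ValueError("genotypes must have the same number of genes")
    return sum(abs(a - b) for a, b in zip(genotype_a, genotype_b))


# Two-gene example: aabb = (0, 0), Aabb = (1, 0), aaBb = (0, 1), AABB = (2, 2).
distance_to_target = hamming_distance((0, 0), (2, 2))
```

Measured from a target genotype, the distance is zero for the target itself and reaches 2N when every allele at all N genes differs from it.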

For 15 genes, each segregating for two alleles, there are $3^{15}$ = 14,348,907 possible genotypes from all combinations of alleles. The frequency of occurrence of these genotypes in the reference population is dependent on the gene frequencies for the 15 genes. Running the APSIM-Sorg crop growth model 14,348,907 times for each environmental condition was not feasible. Therefore, in this example we reduced the number of simulations necessary by allocating genotypes to classes based on defining "expression states" for each trait. An expression state was defined for a trait by the total number of + or - alleles summed across the genes influencing the trait, where the + allele increased trait value and the - allele decreased trait value. Adopting this approach, for $n$ genes determining genetic variation for a trait, with two alleles per gene, there are $2n+1$ expression states for the trait. For example, for the trait OA with $n$ = 2, individuals can have 0, 1, 2, 3 or 4 + alleles, representing the 5 states of expression for OA. There are different numbers of genotypes in each of the expression state classes. If we label the two genes A (A,a) and B (B,b) such that the alternative alleles are A (+), a (-) and B (+), b (-), then the genotype membership of the expression state classes is: 0 = aabb; 1 = Aabb, aaBb; 2 = AAbb, AaBb, aaBB; 3 = AABb, AaBB; 4 = AABB. We then divided the range of phenotypic values for the traits into equal increments on a linear scale, with genotype aabb defined as the lowest expression state and AABB the highest expression state for OA. The same process was applied to the other three traits. Following this procedure, we have 5 expression states for OA, 7 expression states for PH, and 11 expression states for both SG and TE. With the four traits we have 5×7×11×11 = 4,235 combinations of expression states. Thus, the 14,348,907-dimension genotype space is condensed and mapped onto a 4,235-dimension expression state space. Running 4,235 APSIM-Sorg simulations for the 600 environments used to represent the target population of environments was manageable with our computer cluster resources [Micallef et al., 2001; http://pig.ag.uq.edu.au/qu-gene]. The deterministic relationship between genotypes and trait expression states used in this example is only one of many ways in which a gene-to-phenotype relationship can be constructed within our modeling framework (Figs. 3 and 4).
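The counting argument above can be checked with a short sketch. The gene numbers per trait used here (OA 2, PH 3, SG 5, TE 5) are inferred from the stated 5, 7, 11 and 11 expression states via the relationship of 2n+1 states for n genes; the encoding of a genotype as + allele counts per gene and all names are illustrative assumptions, not part of APSIM or QU-GENE.

```python
from itertools import product
from math import prod

# Genes per trait, inferred from the reported numbers of expression states.
GENES_PER_TRAIT = {"OA": 2, "PH": 3, "SG": 5, "TE": 5}


def n_expression_states(n_genes):
    """Two alleles per gene give 0..2n '+' alleles, i.e. 2n + 1 states."""
    return 2 * n_genes + 1


def genotypes_by_state(n_genes):
    """Group single-trait genotypes by their total number of '+' alleles.

    Each genotype is a tuple of '+' allele counts (0, 1 or 2) per gene.
    """
    classes = {state: [] for state in range(n_expression_states(n_genes))}
    for genotype in product((0, 1, 2), repeat=n_genes):
        classes[sum(genotype)].append(genotype)
    return classes


def trait_value(state, n_genes, lowest, highest):
    """Place a state on a linear scale of equal increments."""
    return lowest + (highest - lowest) * state / (2 * n_genes)


# Three genotypes per gene across all 15 genes.
total_genotypes = 3 ** sum(GENES_PER_TRAIT.values())
# Product of expression states across the four traits.
total_classes = prod(n_expression_states(n) for n in GENES_PER_TRAIT.values())
# Number of genotypes in each OA expression state (states 0 to 4).
oa_class_sizes = [len(members) for members in genotypes_by_state(2).values()]
```

For OA the class sizes are 1, 2, 3, 2 and 1, matching the genotype memberships listed in the text, and the two totals reproduce 14,348,907 genotypes and 4,235 expression state combinations.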

SORGHUM BREEDING EXAMPLE: RESULTS

For the three environment-types the APSIM-Sorg model was used to estimate a grain yield value for each of the 4,235 trait expression states, referred to hereafter as genotype classes. These estimates were averages from ca. 200 runs of the model, using as inputs daily weather data and soils data from location-year combinations chosen to represent the target population of environments. Some appreciation of the genetic variation for yield that exists among the genotype classes for each of the four traits in the mild terminal stress and severe terminal stress environment-types is given in Figure 5. For both environment-types a series of grain yield frequency distributions is shown for each trait. The genotype classes are ordered on their genetic distance (measured as a Hamming distance) from the allele combination of the target genotype in the target population of environments. As expected, lower grain yields are achieved under severe terminal stress (colored red) than in the mild terminal stress (colored blue) environment-type. For any genotype class of a given trait there is considerable genetic variation for grain yield, which results from genotypic variation for the other three traits.

Fig. 5. Grain yield distribution of the genotype classes for the Mild Terminal Stress (colored blue) and Severe Terminal Stress (colored red) environment-types, for representations where the genotype classes are distributed according to their genetic distance from the target genotype (based on grain yield) for each of the four traits: (a) Transpiration Efficiency, (b) Osmotic Adjustment, (c) Phenology and (d) Stay-green. The vertical axis indicates the percentage of the 4,235 genotype classes present at each yield/Hamming distance combination. The horizontal left axis indicates the level of grain yield (t/ha). The horizontal right axis indicates the number of alleles different from the target genotype in the target population of environments (referred to as Hamming distance).

To evaluate the consequences of GE interactions between the mild terminal stress and severe terminal stress environment-types at the level of grain yield, we need to examine the relationship between grain yield performance in the two environment-types. To do this we construct a scatter plot of the yield values in both environment-types (Fig. 6). If there were no GE interactions there would be a perfect correlation of the grain yield values between the two environment-types. From the shape of the distribution of the yield values it can be seen that there are GE interactions and that the genotypes with highest grain yield differ between the two environment-types.
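The reasoning behind the scatter plot can be sketched numerically. The yield values below are hypothetical and chosen only to illustrate the logic; they are not APSIM-Sorg outputs. Without GE interaction the yields in one environment-type are a shifted copy of those in the other and the correlation is 1; with interaction the correlation falls below 1 and the best genotype class can change.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of yields."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def best_class(yields):
    """Index of the genotype class with the highest yield."""
    return max(range(len(yields)), key=lambda i: yields[i])


# Hypothetical yields (t/ha) for four genotype classes.
mild = [4.0, 5.0, 6.0, 7.0]
# No GE interaction: every class loses the same 2 t/ha under severe stress.
severe_no_ge = [y - 2.0 for y in mild]
# GE interaction: the ranking of the classes changes under severe stress.
severe_ge = [2.5, 3.0, 2.8, 2.0]

r_no_ge = pearson(mild, severe_no_ge)
r_ge = pearson(mild, severe_ge)
rank_change = best_class(mild) != best_class(severe_ge)
```

A correlation below 1 by itself only shows that some interaction is present; the change in the best class is what matters for selection, because it means the two environment-types favor different genotypes.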

Fig. 6. Grain yield values (t/ha) for the 4,235 genotype classes in the Mild Terminal Stress and Severe Terminal Stress environment-types for color coded representations of each of the four traits: (a) Transpiration Efficiency, (b) Osmotic Adjustment, (c) Phenology and (d) Stay-green. Genotype classes are color coded according to their genetic distance from the target genotype in the target population of environments (Hamming distance), extending from yellow (all alleles different from the target genotype) to blue (no alleles different from the target genotype).

In Figure 6, each of the 4,235 genotype classes is color coded by trait, extending from light (yellow) to dark (blue), to depict for each trait the genetic distance between the genotype class and the target genotype. As the colors get darker the genotypes in the classes have more alleles in common (giving a lower Hamming distance) with the target genotype. For both TE (Fig. 6a) and OA (Fig. 6b), genotypes with high yield in the severe terminal and mild terminal stress environment-types generally have a large proportion of genes in common with genotypes that yield well in the target population of environments. The situation is different for PH (Fig. 6c). For the PH trait, genotypes that have a high yield in the mild terminal stress environment-type have many genes in common with the target genotype, whereas genotypes that have high yield in the severe terminal stress environment-type are genetically distant from the target genotype. Thus, we have strong GE interactions for grain yield that can impact on selection outcomes for the PH trait and yield in the different environment-types and in the target population of environments. For SG (Fig. 6d) there is a strong association between high yield in the mild terminal stress environment-type and having genes in common with the target genotype. However, this relationship is much weaker in the severe terminal stress environment-type, in part because the other traits have a stronger influence on yield in this environment-type.

Since there are strong epistatic and GE interactions for the four traits in determining grain yield in the genotype-environment system represented in this example, it is important to consider the influence of selection environment on the expected changes in the genetic structure of the population. Here we examine genetic responses over recurrent cycles of selection on yield phenotypes in either the severe terminal stress or mild terminal stress environment-types. These responses to selection are examined in terms of changes in the gene frequencies of alleles for increasing levels of trait expression for each trait (Fig. 7) and finally in terms of trajectories through genetic space for yield (Fig. 8).

Fig. 7. Change in gene frequency of the + alleles for increasing level of the four traits (TE=Transpiration Efficiency, OA=Osmotic Adjustment, PH=Phenology, SG=Stay-green) over cycles of selection, when selection is conducted in the Severe Terminal Stress (a) and Mild Terminal Stress (b) environment-types.

Selection for increased grain yield within the severe terminal stress environment-type (Fig. 7a) had the effect of rapidly increasing the frequencies of alleles that enhanced expression of the two traits OA and TE, gradually increasing the frequencies of alleles for enhanced SG, and decreasing the frequencies of alleles for later flowering, thus selecting early flowering genotypes that could developmentally escape from the severe terminal stress conditions. After selection cycles 5 and 6, once the alleles for greater expression of OA and TE were fixed, the rate of increase in frequency of alleles for enhanced levels of SG was greater than in the previous selection cycles. Selection for higher grain yield under the mild terminal stress environment-type (Fig. 7b) resulted in a different pattern of changes in frequencies of alleles from that observed for the severe terminal stress environment-type (Fig. 7a). Under the mild terminal stress environment-type, selection for greater yield favored an increase in the frequencies of alleles for higher expression levels of all four traits (Fig. 7b). Thus, in contrast to the severe terminal stress environment-type, where early flowering genotypes were favored, selection in the mild terminal stress environment-type favored late flowering genotypes. Therefore, as we expect in the presence of these interactions, if we plot the trajectories through genetic space followed by the populations over cycles of selection for yield, these trajectories contrast depending on whether we select under a severe terminal stress environment-type (Fig. 8a) or a mild terminal stress environment-type (Fig. 8b).
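How directional selection shifts the frequency of + alleles can be illustrated with a deliberately simplified sketch. It shows a single truncation selection step only; intermating, recombination and repeated cycles, which the simulated breeding program includes, are omitted. The phenotype function `toy_yield` is a hypothetical additive stand-in for a yield value looked up from the crop model, and the genotype encoding (number of + alleles per gene) is likewise an assumption.

```python
def plus_allele_frequency(population):
    """Frequency of '+' alleles across all genes and individuals.

    Each individual is a tuple of '+' allele counts (0, 1 or 2) per gene.
    """
    n_alleles = sum(2 * len(individual) for individual in population)
    return sum(sum(individual) for individual in population) / n_alleles


def truncation_select(population, phenotype, fraction):
    """Keep the top fraction of individuals ranked on phenotype."""
    n_keep = max(1, int(len(population) * fraction))
    ranked = sorted(population, key=phenotype, reverse=True)
    return ranked[:n_keep]


def toy_yield(individual):
    """Hypothetical phenotype: yield rises with each '+' allele."""
    return float(sum(individual))


# Six individuals scored at two genes.
population = [(0, 0), (1, 0), (1, 1), (2, 1), (2, 2), (0, 1)]

freq_before = plus_allele_frequency(population)
selected = truncation_select(population, toy_yield, fraction=0.5)
freq_after = plus_allele_frequency(selected)
```

Replacing `toy_yield` with a phenotype that penalizes + alleles for a trait (as late flowering is penalized under severe terminal stress) would drive the frequency of those alleles down instead, which is the contrast between the two panels of Fig. 7.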

Fig. 8. Grain yield values (t/ha) for the 4,235 genotype classes and the average trajectory of a population of genotypes (red line) over cycles of selection, when selection is conducted in the Severe Terminal Stress (a) and Mild Terminal Stress (b) environment-types. Genotype classes are color coded according to their genetic distance from the target genotype in either the Severe Terminal Stress (a) or the Mild Terminal Stress (b) environment-types, extending from yellow (all alleles different from the target genotype) to blue (no alleles different from the target genotype).

SORGHUM BREEDING EXAMPLE: DISCUSSION

The purposes of considering the sorghum breeding example we have described in this paper were threefold: (1) to demonstrate some aspects of the approaches we are developing and using to investigate and deal with the GP problem for complex traits in plant breeding applications (Fig. 3), (2) to emphasize the importance that both epistatic and GE interactions can have in gene-to-phenotype relationships, and (3) to show how the E(NK) model can be used as a framework for many approaches to investigating the GP problem. An equally valid case study, with availability of a suitable experimental information base, could be the study of human health issues such as heart disease with influences from the genetics of individuals and the lifestyle environment they choose.

To date our investigation of sorghum genetic improvement in Australia has synthesized a large body of information that previously existed as a series of less well connected studies. The modeling framework we now have has highlighted many previously unappreciated implications of interactions between breeding strategies, the genetic architecture of traits and the environments in which we select for higher grain yield. Also, and perhaps most importantly, the results of these studies have provided testable hypotheses and focal points for further experimentation to test our current understanding of the ways in which these traits interact with each other and environmental conditions to determine grain yield. Thus, we are entering another cycle of the iterative modeling approach described in Figure 3.

The GP problem has always been, and will continue to be, a major challenge in biology. With the increasing availability of the complete genome sequences of a number of prokaryotic and eukaryotic organisms, our improving ability to define the locations of genes in these sequences, and our growing knowledge of the functional relationships between these genes and the biochemical and metabolic pathways they influence [Karp, 2001], we are beginning to understand the dynamical nature of the GP problem. We see that an iterative modeling approach, as described in this paper, is a logical quantitative framework for exploring the growing experimental databases and creating knowledge structures for genotype-environment systems (Fig. 1). This provides a foundation for defining priorities in the model development process and in deciding when development of practical applications is feasible. In our case the practical applications we seek are efficient plant breeding strategies that contribute to sustainable agricultural systems.

ACKNOWLEDGMENTS

We thank Professor John Casti for his permission to create a modification of his original modeling concept map in Figure 1 and also Research Trends, Trivandrum, India, for permission to reproduce components of Figure 3.

REFERENCES

[1] Bidinger, F. R., Hammer, G. L. and Muchow, R. C. (1996). The physiological basis of genotype by environment interaction in crop adaptation. In: Plant Adaptation and Crop Improvement, Cooper, M. and Hammer, G. L. (eds). CAB International, Wallingford, pp. 329-347.

[2] Borrell, A. K. and Hammer, G. L. (2000). Nitrogen dynamics and the physiological basis of stay-green in sorghum. Crop Sci. 40, 1295-1307.

[3] Casti, J. L. (1989). Paradigms Lost: Images of Man in the Mirror of Science. Cardinal, London.

[4] Casti, J. L. (1997). Would-be-Worlds: How Simulation is Changing the Frontiers of Science. John Wiley & Sons, Inc., New York.

[5] Clark, A. G. (2000). Limits to prediction of phenotypes from knowledge of genotypes. Evol. Biol. 32, 205-224.

[6] Chapman, S. C., Cooper, M., Butler, D. G. and Henzell, R. G. (2000a). Genotype by environment interactions affecting grain sorghum. I. Characteristics that confound interpretation of hybrid yield. Aust. J. Agric. Sci. 51, 197-207.

[7] Chapman, S. C., Cooper, M., Hammer, G. L. and Butler, D. G. (2000b). Genotype by environment interactions affecting grain sorghum. II. Frequencies of different seasonal patterns of drought stress are related to location effects on hybrid yields. Aust. J. Agric. Sci. 51, 209-221.

[8] Chapman, S. C., Hammer, G. L., Butler, D. G. and Cooper, M. (2000c). Genotype by environment interactions affecting grain sorghum. III. Temporal sequences and spatial patterns in the target population of environments. Aust. J. Agric. Sci. 51, 223-233.

[9] Chapman, S. C., Cooper, M. and Hammer, G. L. (2002a). Using crop simulation to generate genotype by environment interaction effects for sorghum in water-limited environments. Aust. J. Agric. Sci., in press.

[10] Chapman, S. C., Cooper, M., Podlich, D. W. and Hammer, G. L. (2002b). Evaluating plant breeding strategies by simulating gene action and environmental effects to predict phenotypes for dryland adaptation. Agron. J., submitted.

[11] Comstock, R. E. (1996). Quantitative Genetics with Special Reference to Plant and Animal Breeding. Iowa State University Press, Ames.

[12] Cooper, M., Podlich, D. W., Jensen, N. M., Chapman, S. C. and Hammer, G. L. (1999). Modelling plant breeding programs. Trends Agron. 2, 33-64.

[13] Falconer, D. S. and Mackay, T. F. C. (1996). Introduction to Quantitative Genetics. 4th edn. Longman, Essex.

[14] Hallauer, A. R. and Miranda, J. B. F. (1988). Quantitative Genetics in Maize Breeding. 2nd edn. Iowa State University Press, Ames.

[15] Hammer, G. L., Chapman, S. C. and Snell, P. (1999). Crop simulation modelling to improve selection efficiency in plant breeding programs. Proc. Ninth Assembly Wheat Breeding Society of Australia, Toowoomba, pp. 79-85.

[16] Hammer, G. L., Farquhar, G. D. and Broad, I. J. (1997). On the extent of genetic variation for transpiration efficiency in sorghum. Aust. J. Agric. Res. 48, 649-655.

[17] Hammer, G. L. and Muchow, R. C. (1994). Assessing climatic risk to sorghum production in water-limited subtropical environments. I. Development and testing of a simulation model. Field Crops Res. 36, 221-234.

[18] Hammer, G. L., Vanderlip, R. L., Gibson, G., Wade, L. J., Henzell, R. G., Younger, D. R., Warren, J. and Dale, A. B. (1989). Genotype by environment interaction in grain sorghum. II. Effects of temperature and photoperiod on ontogeny. Crop Sci. 29, 376-384.

[19] Hammer, G. L., van Oosterom, E. J., Chapman, S. C. and McLean, G. (2001). The economic theory of water and nitrogen dynamics in field crops. In: Proceedings of the Fourth Australian Sorghum Conference, Kooralbyn, Queensland, 5-8 Feb 2001, A. K. Borrell and R. G. Henzell (eds). CD-Rom Format. Range Media Pty Ltd. (ISBN: 0-7242-2163-8).

[20] Ideker, T., Thorsson, V., Ranish, J. A., Christmas, R., Buhler, J., Eng, J. K., Bumgarner, R., Goodlett, D. R., Aebersold, R. and Hood, L. (2001). Integrated genomic and proteomic analyses of a systematically perturbed metabolic network. Science 292, 929-934.

[21] Karp, P. D. (2001). Pathway databases: A case study in computational symbolic theories. Science 293, 2040-2044.

[22] Kauffman, S. A. (1993). The Origins of Order: Self-Organization and Selection in Evolution. Oxford University Press, New York.

[23] Kauffman, S. A. (2000). Investigations. Oxford University Press, Oxford.

[24] Kempthorne, O. (1988). An overview of the field of quantitative genetics. In: Proceedings of the Second International Conference on Quantitative Genetics, Weir, B. S., Eisen, E. J., Goodman, M. M. and Namkoong, G. (eds). Sinauer Associates, Inc., Sunderland, pp. 47-56.

[25] Mackay, T. F. C. (2001). Quantitative trait loci in Drosophila. Nat. Rev. Genet. 2, 11-20.

[26] McCown, R. L., Hammer, G. L., Hargreaves, J. N. G., Holzworth, D. P. and Freebairn, D. M. (1996). APSIM: A novel software system for model development, model testing, and simulation in agricultural systems research. Agric. Syst. 50, 255-271.

[27] Micallef, K. P., Cooper, M. and Podlich, D. W. (2001). Using clusters of computers for large QU-GENE simulation experiments. Bioinformatics 17, 194-195.

[28] Mortlock, M. Y. and Hammer, G. L. (1999). Genotype and water limitation effects on transpiration efficiency in sorghum. J. Crop Prod. 2, 265-286.

[29] Podlich, D. W. and Cooper, M. (1998). QU-GENE: a platform for quantitative analysis of genetic models. Bioinformatics 14, 632-653.

[30] Rosen, R. (1985). Anticipatory Systems: Philosophical, Mathematical and Methodological Foundations. Pergamon Press, Oxford.

[31] Tao, Y. Z., Jordan, D. R., Henzell, R. G. and McIntyre, C. L. (1998). Construction of a genetic map in a sorghum RIL population using probes from different sources and its alignment with other sorghum maps. Aust. J. Agric. Res. 49, 729-736.

[32] Tao, Y. Z., Henzell, R. G., Jordan, D. R., Butler, D. G., Kelly, A. M. and McIntyre, C. L. (2000). Identification of genomic regions associated with stay green in sorghum by testing RILs in multiple environments. Theo. Appl. Genet. 100, 1225-1232.

Crop modeling suggests limited transpiration would increase yield of sorghum across drought-prone regions of the United States

Article

Full-text available

Jan 2024

Breeding sorghum to withstand droughts is pivotal to secure crop production in regions vulnerable to water scarcity. Limited transpiration (LT) restricts water demand at high vapor pressure deficit, saving water for use in critical periods later in the growing season. Here we evaluated the hypothesis that LT would increase sorghum grain yield in the United States. We used a process-based crop model, APSIM, which simulates interactions of genotype, environment, and management (G × E × M). In this study, the G component includes the LT trait (GT) and maturity group (GM), the EW component entails water deficit patterns, and the MP component represents different planting dates. Simulations were conducted over 33 years (1986-2018) for representative locations across the US sorghum belt (Kansas, Texas, and Colorado) for three planting dates and maturity groups. The interaction of GT x EW indicated a higher impact of LT sorghum on grain for late drought (LD), mid-season drought (MD), and early drought (ED, 8%), than on well-watered (WW) environments (4%). Thus, significant impacts of LT can be achieved in western regions of the sorghum belt. The lack of interaction of GT × GM × MP suggested that an LT sorghum would increase yield by around 8% across maturity groups and planting dates. Otherwise, the interaction GM × MP revealed that specific combinations are better suited across geographical regions. Overall, the findings suggest that breeding for LT would increase sorghum yield in the drought-prone areas of the US without tradeoffs.

Modelling selection response in plant breeding programs using crop models as mechanistic gene-to-phenotype (CGM-G2P) multi-trait link functions

Article

Full-text available

Dec 2020

Plant breeding programs are designed and operated over multiple cycles to systematically change the genetic makeup of plants to achieve improved trait performance for a Target Population of Environments (TPE). Within each cycle, selection applied to the standing genetic variation within a structured reference population of genotypes (RPG) is the primary mechanism by which breeding programs make the desired genetic changes. Selection operates to change the frequencies of the alleles of the genes controlling trait variation within the RPG. The structure of the RPG and the TPE has important implications for the design of optimal breeding strategies. The breeder’s equation, together with the quantitative genetic theory behind the equation, informs many of the principles for design of breeding programs. The breeder’s equation can take many forms depending on the details of the breeding strategy. Through the genetic changes achieved by selection, the cultivated varieties of crops (cultivars) are improved for use in agriculture. From a breeding perspective, selection for specific trait combinations requires a quantitative link between the effects of the alleles of the genes impacted by selection and the trait phenotypes of plants and their breeding value. This gene-to-phenotype link function provides the G2P map for one to many traits. For complex traits controlled by many genes, the infinitesimal model for trait genetic variation is the dominant G2P model of quantitative genetics. Here we consider motivations and potential benefits of using the hierarchical structure of crop models as CGM-G2P trait link functions in combination with the infinitesimal model for the design and optimisation of selection in breeding programs.

Integrating biophysical crop growth models and whole genome prediction for their mutual benefit: A case study in wheat phenology

Article

May 2023
J EXP BOT

Running crop growth models (CGM) coupled with whole genome prediction (WGP), as a CGM-WGP model, introduces environmental information to WGP and genomic relatedness information to the genotype-specific parameters (GSPs) modelled through CGMs. Previous studies have primarily used CGM-WGP to infer prediction accuracy without exploring its potential to enhance CGM and WGP. Here, we implemented a heading date and a heading and maturity date wheat phenology model within a CGM-WGP framework and compared it to CGM and WGP. The CGM-WGP resulted in more heritable GSPs with more biologically realistic correlation structures between GSPs and phenology traits compared to CGM-modelled GSPs that reflected the correlation of measured phenotypes. Another advantage of CGM-WGP is the ability to infer accurate prediction with much smaller and less diverse reference data compared to that required for CGM. A genome-wide association analysis linked the GSPs from the CGM-WGP model to nine significant phenology loci including Vrn-A1 and the three PPD1 genes, which were not detected for CGM-modelled GSPs. Selection on GSPs could be simpler than on observed phenotypes. For example, thermal time traits are theoretically more independent candidates, compared to the highly correlated heading and maturity dates, which could be used to achieve an environment-specific optimal flowering period. CGM-WGP combines the advantages of CGM and WGP to predict more accurate phenotypes for new genotypes under alternative or future environmental conditions.

A Review on Resistance to Biotic Stress in Leaf-Colored Plant

Article

Full-text available

Jan 2022

Crop modeling defines opportunities and challenges for drought escape, water capture, and yield increase using chilling‐tolerant sorghum

Article

Full-text available

Sep 2021

Many crop species, particularly those of tropical origin, are chilling sensitive, so improved chilling tolerance can enhance production of these crops in temperate regions. For the cereal crop sorghum (Sorghum bicolor L.), early planting and chilling tolerance have been investigated for >50 years, but the potential value or tradeoffs of this genotype × management change have not been formally evaluated with modeling. To assess the potential of early planted chilling-tolerant grain sorghum in the central US sorghum belt, we conducted CERES-Sorghum simulations and characterized scenarios under which this change would be expected to enhance (or diminish) drought escape, water capture, and yield. We conducted crop growth modeling for full- and short-season hybrids under rainfed systems that were simulated to be planted in very early (April), early (May 15), and normal (June 15) planting dates over 1986–2015 in four locations in Kansas representative of the central US sorghum belt. Simulations indicated that very early planting will generally lead to lower initial soil moisture, longer growing periods, and higher evapotranspiration. Very early planting is expected to extend the growing period by 20% for short- or full-season hybrids, reduce evaporation during fallow periods, and increase plant transpiration in the two-thirds of years with the highest precipitation (mean > 428 mm), leading to 11% and 7% increase grain yield for short- and full-season hybrids, respectively. Thus, in this major sorghum growing region, very early and early planting could reduce risks of terminal droughts, extend seasons, and increase rotation options, suggesting that further development of chilling-tolerant hybrids is warranted.

Multiomics for Crop Improvement

Chapter

Jan 2024

The growing food demand in the world due to the increasing population and decreasing availability of agricultural land requires new crops that are more productive and resistant to harsher environmental conditions. Thus, rapid and effective exploration, identification, and validation of an important trait, gene, molecular mediator, and protein interaction are important for improving crop yield and quality in the near future. Integrating genomics, transcriptomics, proteomics, metabolomics and phenomics enables a deeper understanding of the mechanisms underlying the complex architecture of many phenotypic traits of agricultural relevance. Here, we cite several relevant examples that can appraise our understanding of the recent developments in omics technologies and how they drive our quest to breed climate-resilient crops. Large-scale genome resequencing, pangenomes, and genome-wide association studies aid in identifying and analysing species-level genome variations. RNA-sequencing-driven transcriptomics approach has provided unprecedented opportunities for performing crop abiotic and biotic stress response studies. Additionally, the high-resolution proteomics technologies necessitated a gradual shift from the general descriptive studies of plant protein abundances to large-scale analysis of protein-metabolite interactions. Especially, advent in metabolomics is currently receiving special attention, owing to the role metabolites play as metabolic intermediates and close links to the phenotypic expression. Further, the high-throughput phenomics approach opened new research domains such as root system architecture analysis and plant root-associated microbes for improved crop health and climate resilience. Overall, integrating the PANOMICS approach to modern plant breeding and genetic engineering methods ensures the development of climate-smart crops with higher nutrition quality that can sustainably meet the current and future global food demands.

Evaluating Plant Breeding Strategies by Simulating Gene Action and Dryland Environment Effects

Article

Full-text available

Jan 2003

Functional genomics is the systematic study of genome‐wide effects of gene expression on organism growth and development with the ultimate aim of understanding how networks of genes influence traits. Here, we use a dynamic biophysical cropping systems model (APSIM‐Sorg) to generate a state space of genotype performance based on 15 genes controlling four adaptive traits and then search this space using a quantitative genetics model of a plant breeding program (QU‐GENE) to simulate recurrent selection. Complex epistatic and gene × environment effects were generated for yield even though gene action at the trait level had been defined as simple additive effects. Given alternative breeding strategies that restricted either the cultivar maturity type or the drought environment type, the positive (+) alleles for 15 genes associated with the four adaptive traits were accumulated at different rates over cycles of selection. While early maturing genotypes were favored in the Severe‐Terminal drought environment type, late genotypes were favored in the Mild‐Terminal and Midseason drought environment types. In the Severe‐Terminal environment, there was an interaction of the stay‐green (SG) trait with other traits: Selection for + alleles of the SG genes was delayed until + alleles for genes associated with the transpiration efficiency and osmotic adjustment traits had been fixed. Given limitations in our current understanding of trait interaction and genetic control, the results are not conclusive. However, they demonstrate how the per se complexity of gene × gene × environment interactions will challenge the application of genomics and marker‐assisted selection in crop improvement for dryland adaptation.

Experiences of Applying Field-Based High-Throughput Phenotyping for Wheat Breeding

Chapter

Jul 2021

High-throughput phenotyping (HTP) is poised to fundamentally transform plant breeding through increased accuracy, spatial, and temporal resolution in measuring breeding trials. In this chapter, we examine different types of phenotyping platforms, data management, and data utilization for decision making using HTP in plant breeding, with case studies from wheat breeding programs. Development of HTP platforms, both ground-based and aerial vehicles requires evaluating the traits to be measured as well as the resources available. Data management is a critical part of the overall research process, and an example data management program is provided. Finally, examples of HTP use within crop breeding and plant science are presented. This chapter provides an overview of the entire HTP process from system conception to decision making within research programs based on HTP data.

Tackling G × E × M interactions to close on-farm yield-gaps: creating novel pathways for crop improvement by predicting contributions of genetics and management to crop productivity

Article

Full-text available

Jun 2021
THEOR APPL GENET

Key message: Climate change and Genotype-by-Environment-by-Management interactions together challenge our strategies for crop improvement. Research to advance prediction methods for breeding and agronomy is opening new opportunities to tackle these challenges and overcome on-farm crop productivity yield-gaps through design of responsive crop improvement strategies.

Abstract: Genotype-by-Environment-by-Management (G × E × M) interactions underpin many aspects of crop productivity. An important question for crop improvement is “How can breeders and agronomists effectively explore the diverse opportunities within the high dimensionality of the complex G × E × M factorial to achieve sustainable improvements in crop productivity?” Whenever G × E × M interactions make important contributions to attainment of crop productivity, we should consider how to design crop improvement strategies that can explore the potential space of G × E × M possibilities, reveal the interesting Genotype–Management (G–M) technology opportunities for the Target Population of Environments (TPE), and enable the practical exploitation of the associated improved levels of crop productivity under on-farm conditions. Climate change adds additional layers of complexity and uncertainty to this challenge, by introducing directional changes in the environmental dimension of the G × E × M factorial. These directional changes have the potential to create further conditional changes in the contributions of the genetic and management dimensions to future crop productivity. Therefore, in the presence of G × E × M interactions and climate change, the challenge for both breeders and agronomists is to co-design new G–M technologies for a non-stationary TPE. Understanding these conditional changes in crop productivity through the relevant sciences for each dimension, Genotype, Environment, and Management, creates opportunities to predict novel G–M technology combinations suitable to achieve sustainable crop productivity and global food security targets for the likely climate change scenarios. Here we consider critical foundations required for any prediction framework that aims to move us from the current unprepared state of describing G × E × M outcomes to a future responsive state equipped to predict the crop productivity consequences of G–M technology combinations for the range of environmental conditions expected for a complex, non-stationary TPE under the influences of climate change.

Crop modeling defines opportunities and challenges for drought escape, water capture, and yield increase using chilling-tolerant sorghum

Preprint

Jan 2021

Many crop species, particularly those of tropical origin, are chilling sensitive, so improved chilling tolerance can enhance production of these crops in temperate regions. For the cereal crop sorghum (Sorghum bicolor L.), early planting and chilling tolerance have been investigated for >50 years, but the potential value or tradeoffs of this genotype × management change have not been formally evaluated with modeling. To assess the potential of early-planted chilling-tolerant grain sorghum in the central US sorghum belt, we conducted CERES-Sorghum simulations and characterized scenarios under which this change would be expected to enhance (or diminish) drought escape, water capture, or yield. We conducted crop growth modeling for full- and short-season hybrids under rainfed systems that were simulated to be planted on early (mid-April), normal (mid-May), and late (mid-June) planting dates from 1986 to 2015 in four locations in Kansas representative of the central US sorghum belt. Simulations indicated that early planting will generally lead to lower initial soil moisture, longer growing periods, and higher evapotranspiration. Early planting is expected to extend the growing period by 20% for short- or full-season hybrids, reduce evaporation during fallow periods, and increase plant transpiration in the two-thirds of years with the highest precipitation (mean > 428 mm), leading to 11% and 7% increases in grain yield for short- and full-season hybrids, respectively. Thus, in this major sorghum growing region, early planting could reduce risks of terminal droughts, extend seasons, and increase rotation options, suggesting that further development of chilling-tolerant hybrids is warranted.

Integrated Genomic and Proteomic Analyses of a Systematically Perturbed Metabolic Network

Article

May 2001
SCIENCE

We demonstrate an integrated approach to build, test, and refine a model of a cellular pathway, in which perturbations to critical pathway components are analyzed using DNA microarrays, quantitative proteomics, and databases of known physical interactions. Using this approach, we identify 997 messenger RNAs responding to 20 systematic perturbations of the yeast galactose-utilization pathway, provide evidence that approximately 15 of 289 detected proteins are regulated posttranscriptionally, and identify explicit physical interactions governing the cellular response to each perturbation. We refine the model through further iterations of perturbation and global measurements, suggesting hypotheses about the regulation of galactose utilization and physical interactions between this and a variety of other metabolic pathways.

QU-GENE: a simulation platform for quantitative analysis of genetic models

Article

Aug 1998

Classical quantitative genetics theory makes a number of simplifying assumptions in order to develop mathematical expressions that describe the mean and variation (genetic and phenotypic) within and among populations, and to predict how these are expected to change under the influence of external forces. These assumptions are often necessary to render the development of many aspects of the theory mathematically tractable. The availability of high-speed computers today provides opportunity for the use of computer simulation methodology to investigate the implications of relaxing many of the assumptions that are commonly made. QU-GENE (QUantitative-GENEtics) was developed as a flexible computer simulation platform for the quantitative analysis of genetic models. Three features of the QU-GENE software that contribute to its flexibility are (i) the core E(N:K) genetic model, where E is the number of types of environment, N is the number of genes, K indicates the level of epistasis and the parentheses indicate that different N:K genetic models can be nested within types of environments, (ii) the use of a two-stage architecture that separates the definition of the genetic model and genotype-environment system from the detail of the individual simulation experiments and (iii) the use of a series of interactive graphical windows that monitor the progress of the simulation experiments. The E(N:K) framework enables the generation of families of genetic models that incorporate the effects of genotype-by-environment (G x E) interactions and epistasis. By the design of appropriate application modules, many different simulation experiments can be conducted for any genotype-environment system. The structure of the QU-GENE simulation software is explained and demonstrated by way of two examples. 
The first concentrates on some aspects of the influence of G x E interactions on response to selection in plant breeding, and the second considers the influence of multiple-peak epistasis on the evolution of a four-gene epistatic network. QU-GENE is available over the Internet at http://pig.ag.uq.edu.au/qu-gene/. Contact: m.cooper@mailbox.uq.edu.au
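
The nesting of N:K genetic models within environment types, as described in the abstract above, can be illustrated with a minimal sketch. This is not the QU-GENE implementation: the function names (`make_nk_model`, `make_enk_model`, `genotypic_value`), the haploid two-allele genotypes, and the uniformly random contribution tables are all assumptions made only for illustration.

```python
import itertools
import random


def make_nk_model(n_genes, k, rng):
    """Build one random N:K model (hypothetical helper).

    Each gene's contribution depends on its own allele and on the
    alleles of K other, randomly chosen genes (its epistatic partners).
    """
    partners, tables = [], []
    for gene in range(n_genes):
        others = [g for g in range(n_genes) if g != gene]
        linked = rng.sample(others, k)
        partners.append([gene] + linked)
        # One random contribution per combination of the K + 1 alleles.
        tables.append({
            states: rng.random()
            for states in itertools.product((0, 1), repeat=k + 1)
        })
    return partners, tables


def make_enk_model(n_env, n_genes, k, seed=0):
    """Nest an independent N:K model within each of E environment types."""
    rng = random.Random(seed)
    return [make_nk_model(n_genes, k, rng) for _ in range(n_env)]


def genotypic_value(model, genotype, env):
    """Mean gene contribution of a genotype in one environment type."""
    partners, tables = model[env]
    total = 0.0
    for gene_partners, table in zip(partners, tables):
        key = tuple(genotype[g] for g in gene_partners)
        total += table[key]
    return total / len(genotype)
```

Under these assumptions, `make_enk_model(n_env=3, n_genes=4, k=1)` gives three environment types that each rank the same genotype differently, which is one simple way to represent G x E interaction. Setting `k=0` removes epistasis, so gene effects become purely additive within each environment type.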

Limits to Prediction of Phenotypes from Knowledge of Genotypes

Article

Jan 2000

Andrew G. Clark

The fact that natural selection acts on phenotypes but the transmission of traits to the next generation is indirectly accomplished through genes gives rise to a challenging set of problems in evolutionary biology. In order to understand adaptive evolution, it appears to be essential to first understand how genotypes give rise to observed phenotypes, or more precisely, how variation in phenotypes is mediated by underlying variation in genotypes. As the tools of molecular genetics give an increasingly detailed view of the underlying genetic variation, one would hope that this problem would be solved by the sheer volume of genetic data. Human molecular genetics has produced many significant successes recently, particularly in identifying genes that cause Mendelian disorders. In stark contrast, chronic diseases that exhibit familial clustering but do not segregate like a Mendelian gene have been remarkably difficult to analyze genetically. The focus of this chapter is on the question, “What are the barriers to our understanding of the genetic basis for familial clustering of chronic diseases?” We will focus on medical genetics rather than the more general problem of genotype-phenotype associations in evolutionary biology, because knowledge of phenotypic variation is so extensive for humans and the quantity of data on genetic variation is soon going to eclipse that of all other species, if it has not already.

Anticipatory systems. Philosophical, mathematical, and methodological foundations

Article

Jan 1985

Robert Rosen

Construction of a genetic map in a sorghum RIL population using probes from different sources and its comparison with other sorghum maps

Article

Jan 1998
AUST J AGR RES

A genetic map was established using 120 F5 sorghum recombinant inbred lines (RILs) developed from a cross between 2 Australian elite sorghum inbred lines, QL39 and QL41. A variety of DNA probes, including sorghum genomic DNA, maize genomic DNA and cDNA, sugarcane genomic DNA and cDNA, and cereal anchor probes, were screened to identify DNA polymorphism between the parental lines. Using 5 restriction enzymes, probe polymorphism levels were low (26.5%). A total of 155 restriction fragment length polymorphism (RFLP) loci and 8 simple sequence repeat (SSR) loci were mapped onto 21 linkage groups, covering a map distance of approximately 1400 cM. Genes for 3 simply inherited traits, awns (AW), mesocarp thickness (Z), and organophosphate insecticide (OPR) reaction, were also mapped. The relationships between this map and other published sorghum maps were reviewed and a comparison of major sorghum RFLP maps attempted. This comparison is expected to enhance the effectiveness of existing mapping information and will facilitate efforts to map agronomically important traits in sorghum.

Genotype-by-Environment Interaction in Grain Sorghum. II. Effects of Temperature and Photoperiod on Ontogeny

Article

Mar 1989

(…) Development rate of all hybrids exhibited a curvilinear response to temperature in both phases. Old and new hybrids differed in their temperature responses in GS1 but were similar in GS2. New hybrids had slower rates of development at all temperatures, but the difference was greater at higher temperature (>25 °C). All hybrids had similar short-day photoperiodic response in GS1, with a critical photoperiod of 13.2 h. The models were tested on a separate data set covering a similar broad range of environments and performed well.

Identification of genomic regions associated with stay green in sorghum by testing RILs in multiple environments

Article

Jan 2000

Stay green is an important drought resistance trait for sorghum production. QTLs for this trait with consistent effects across a set of environments would increase the efficiency of selection because of its relatively low heritability. One hundred and sixty recombinant inbreds, derived from a cross between QL39 and QL41, were used as a segregating population for genome mapping and stay green evaluation. Phenotypic data were collected in replicated field trials from five sites and in three growing seasons, and analysed by fitting appropriate models to account for spatial variability and to describe the genotype by environment interaction. Interval mapping and non-parametric mapping identified three regions, each in a separate linkage group, associated with stay green in more than one trial, and two regions in a single trial. The regions on linkage groups B and I were both consistently identified from three trials. The multiple environment testing was very helpful for correctly identifying QTLs associated with the trait. The utilisation of molecular markers for stay green in sorghum breeding is also discussed.

APSIM: a Novel Software System for Model Development, Model Testing and Simulation in Agricultural Systems Research

Article

Feb 1996

APSIM (Agricultural Production Systems Simulator) is a software system which allows (a) models of crop and pasture production, residue decomposition, soil water and nutrient flow, and erosion to be readily re-configured to simulate various production systems and (b) soil and crop management to be dynamically simulated using conditional rules. A key innovation is the change from a core concept of a crop responding to resource supplies to that of a soil responding to weather, management and crops. While this achieves a sound logical structure for improved simulation of soil management and long-term change in the soil resource, it does so without loss of sensitivity in simulating crop yields. This concept is implemented using a program structure in which all modules (e.g. growth of specific crops, soil water, soil N, erosion) communicate with each other only by messages passed via a central ‘engine’. Through a standard interface, this design enables easy removal, replacement, or exchange of modules without disruption to the operation of the system. Simulation of crop sequences and multiple crops is achieved by managing connection of crop growth modules to the engine. A shell of software tools has been developed within a WINDOWS environment which includes user-installed editor, linker, compiler, testbed generator, graphics, database and version control software. While the engine and modules are coded in FORTRAN, the Shell is in C++. The resulting product is one in which the functions are coded in the language most familiar to the developers of scientific modules but provides many of the features of object oriented programming. The Shell is written to be aware of UNIX operating systems and be capable of using the processor on UNIX workstations.
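
The program structure described above, in which modules exchange messages only through a central engine, can be sketched as follows. This is not APSIM code or its API (the abstract states that the engine and modules are written in FORTRAN): the classes `Engine`, `SoilWater` and `Crop`, the message names, and the water quantities are hypothetical and serve only to show why such a design makes modules easy to swap.

```python
class Engine:
    """Central message router: modules never call each other directly."""

    def __init__(self):
        self._modules = {}

    def register(self, name, module):
        # Registering under an existing name replaces that module.
        self._modules[name] = module
        module.engine = self

    def request(self, target, message, **payload):
        module = self._modules.get(target)
        if module is None:
            return None
        return module.handle(message, **payload)


class SoilWater:
    """Hypothetical soil water module holding plant-available water (mm)."""

    def __init__(self, available_mm):
        self.available_mm = available_mm
        self.engine = None

    def handle(self, message, **payload):
        if message == "extract_water":
            taken = min(payload["demand_mm"], self.available_mm)
            self.available_mm -= taken
            return taken
        return None


class Crop:
    """Hypothetical crop module that asks the engine for water each day."""

    def __init__(self, daily_demand_mm):
        self.daily_demand_mm = daily_demand_mm
        self.water_uptake_mm = 0.0
        self.engine = None

    def handle(self, message, **payload):
        if message == "daily_step":
            supplied = self.engine.request(
                "soilwater", "extract_water", demand_mm=self.daily_demand_mm
            )
            self.water_uptake_mm += supplied or 0.0
            return supplied
        return None
```

In this sketch the crop module knows only the name `"soilwater"` and the message it sends, so registering a different soil water module under the same name changes the simulation without any edit to the crop module.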

Assessing climatic risk to sorghum production in water-limited subtropical environments I. Development and testing of a simulation model

Article

Mar 1994
FIELD CROP RES

Sorghum (Sorghum bicolor (L.) Moench.) is one of the major summer crops grown in the subtropics. The high rainfall variability and limited planting opportunities in these regions make crop production risky. A robust crop simulation model can assist farmer decision-making via simulation analyses to quantify production risks. Accordingly, we developed a simple, yet mechanistic crop simulation model for sorghum for use in assessing climatic risk to production in water-limited environments. The model simulates grain yield, biomass accumulation, crop leaf area, phenology and soil water balance. The model uses a daily time-step and readily available weather and soil information and assumes no nutrient limitation. The model was tested on numerous data (n=38) from experiments spanning a broad range of environments in the semi-arid tropics and subtropics. Potential limitations in the model were identified and examined in a novel testing procedure by using combinations of predicted and observed data in various modules of the model. The model performed satisfactorily, accounting for 94% and 64% of the variation in total biomass and grain yield, respectively. The difference in outcome for biomass and yield was caused by limitations in predicting harvest index. The concepts involved, and the limitations encountered, in developing a crop model to be simple but consistent with the biophysical rigour required for application to such a diverse range of environments, are discussed.

Introduction To Quantitative Genetics

Article

Jun 1962
