INVESTIGATION
Comparison Between Linear and Non-parametric
Regression Models for Genome-Enabled Prediction
in Wheat
Paulino Pérez-Rodríguez,*,1 Daniel Gianola, Juan Manuel González-Camacho,* José Crossa, Yann Manès, and Susanne Dreisigacker
*Colegio de Postgraduados, Montecillo, Texcoco 56230, México; Departments of Animal Sciences, Dairy Science, and Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin 53706; and Biometrics and Statistics Unit and Global Wheat Program, International Maize and Wheat Improvement Center (CIMMYT), 06600 Mexico, D.F., México
ABSTRACT In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.
KEYWORDS GenPred; shared data resources
Genome-enabled prediction of complex traits based on marker data is becoming important in plant and animal breeding, personalized medicine, and evolutionary biology (Meuwissen et al. 2001; Bernardo and Yu 2007; de los Campos et al. 2009, 2010; Crossa et al. 2010, 2011; Ober et al. 2012). In the standard, infinitesimal, pedigree-based model of quantitative genetics, the family structure of a population is reflected in some expected resemblance between relatives. The latter is measured as an expected covariance matrix among individuals and is used to predict genetic values (e.g. Crossa et al. 2006; Burgueño et al. 2007, 2011). Whereas pedigree-based models do not account for Mendelian segregation and the expected covariance matrix is constructed using assumptions that do not hold (e.g. absence of selection and mutation, and random mating), marker-based models allow tracing Mendelian segregation at several positions of the genome and observing realized (as opposed to expected) covariances. This enhances the potential for improving the accuracy of estimates of genetic values, thus increasing the genetic progress attainable when these predictions are used for selection purposes in lieu of pedigree-based predictions. Recently, de los Campos et al. (2009, 2010) and Crossa et al. (2010, 2011) used Bayesian estimates from genomic parametric and semi-parametric regressions, and they found that models that incorporate pedigree and markers simultaneously had better prediction accuracy for several traits in wheat and maize than models based only on pedigree or only on markers.
Copyright © 2012 Pérez-Rodríguez et al.
doi: 10.1534/g3.112.003665
Manuscript received July 9, 2012; accepted for publication October 5, 2012
This is an open-access article distributed under the terms of the Creative Commons Attribution Unported License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Supporting information is available online at http://www.g3journal.org/lookup/suppl/doi:10.1534/g3.112.003665/-/DC1
1Corresponding author: Colegio de Postgraduados, Montecillo, Texcoco 56230, México. E-mail: perpdgo@gmail.com
Volume 2 | December 2012

The standard linear genetic model represents the phenotypic response of the i-th individual (y_i) as the sum of a genetic value, g_i, and a model residual, e_i, such that the linear model for n individuals (i = 1, ..., n) is represented as y_i = g_i + e_i. However, building predictive models for complex traits using a large number of molecular markers (p) with a set of lines comprising n individuals, where p >> n, is challenging because individual marker effects are not likelihood-identified. In this case, marker effects can be estimated via penalized parametric or semi-parametric methods or their Bayesian counterparts, rather than via ordinary least squares. This reduces the mean-squared error of estimates; it also increases prediction accuracy of out-of-sample cases and prevents over-fitting (de los Campos et al. 2010). In addition to the well-known Bayes A and B
linear regression models originally proposed by Meuwissen et al. (2001) for incorporating marker effects into g_i, there are several penalized parametric regression methods for estimating marker effects, such as ridge regression, the least absolute shrinkage and selection operator (LASSO), and the elastic net (Hastie et al. 2009). The Bayesian counterparts of these models have proved useful because appropriate priors can be assigned to the regularization parameter(s), and uncertainty in the estimations and predictions can be measured directly by applying the Bayesian paradigm.
Regression methods assume a linear relationship between phenotype and genotype, and they typically account for additive allelic effects only; however, evidence of epistatic effects on plant traits is vast and well documented (e.g. Holland 2001, 2008). In wheat, for instance, detailed analyses have revealed a complex circuitry of epistatic interactions in the regulation of heading time involving different vernalization genes, day-length sensitivity genes, and earliness per se genes, as
well as the environment (Laurie et al. 1995; Cockram et al. 2007).
Epistatic effects have also been found to be an important component
of the genetic basis of plant height and bread-making quality traits
(Zhang et al. 2008; Conti et al. 2011). It is becoming common to study gene × gene interactions using a network paradigm that aggregates gene × gene interactions that exist even in the absence of main effects (McKinney and Pajewski 2012). Interactions between alleles at two or more loci could theoretically be represented in a linear model via use of appropriate contrasts. However, this does not scale when the number of markers (p) is large, as the number of 2-locus, 3-locus, etc., interactions grows combinatorially.
An alternative approach to the standard parametric modeling of complex interactions is provided by non-linear, semi-parametric methods, such as kernel-based models (e.g. Gianola et al. 2006; Gianola and van Kaam 2008) or artificial neural networks (NN) (Okut et al. 2011; Gianola et al. 2011), under the assumption that such procedures can capture signals from high-order interactions. The potential of these methods, however, depends on the kernel chosen and on the neural network architecture. In a recent study, Heslot et al. (2012) compared the predictive accuracy of several genome-enabled prediction models, including reproducing kernel Hilbert space (RKHS) regression and NN, using barley and wheat data; the authors found that the non-linear models gave a modest but consistent predictive superiority (as measured by correlations between predictions and realizations) over the linear models. In particular, the RKHS model had a better predictive ability than that obtained using the parametric regressions.
The use of RKHS for predicting complex traits was first proposed by Gianola et al. (2006) and Gianola and van Kaam (2008). de los Campos et al. (2010) further developed the theoretical basis of RKHS with "kernel averaging" (simultaneous use of various kernels in the model) and showed its good prediction accuracy. Other empirical studies in plants have corroborated the increase in prediction accuracy of kernel methods (e.g. Crossa et al. 2010, 2011; de los Campos et al. 2010; Heslot et al. 2012). Recently, Long et al. (2010), using chicken data, and González-Camacho et al. (2012), using maize data, showed that NN methods provided prediction accuracy comparable to that obtained using the RKHS method. In NN, the basis functions (adaptive covariates) are inferred from the data, which gives NN great potential and flexibility for capturing complex interactions between input variables (Hastie et al. 2009). In particular, Bayesian regularized neural networks (BRNN) and radial basis function neural networks (RBFNN) have features that make them attractive for use in genomic selection (GS).
In this study, we examined the predictive ability of various linear and non-linear models, including the Bayes A and Bayes B linear regression models of Meuwissen et al. (2001); the Bayesian LASSO, as in Park and Casella (2008) and de los Campos et al. (2009); RKHS, using the "kernel averaging" strategy proposed by de los Campos et al. (2010); the RBFNN, proposed and used by González-Camacho et al. (2012); and the BRNN, as described by Neal (1996) and used in the context of GS by Gianola et al. (2011). The predictive ability of these models was compared using a cross-validation scheme applied to a wheat data set from CIMMYT's Global Wheat Program.
MATERIALS AND METHODS
Experimental data
The data set included 306 elite wheat lines: 263 candidates for the 29th Semi-Arid Wheat Screening Nursery (SAWSN) and 43 lines from the 18th Semi-Arid Wheat Yield Trial (SAWYT) of CIMMYT's Global Wheat Program. These lines were genotyped with 1717 diversity array technology (DArT) markers generated by Triticarte Pty. Ltd. (Canberra, Australia; http://www.triticarte.com.au). Two traits were analyzed: grain yield (GY) and days to heading (DTH) (see Supporting Information, File S1).
The traits were measured in a total of 12 different environments (1-12) (Table 1): GY in environments 1-7 and DTH in environments 1-5 and 8-12 (10 in all). Different agronomic practices were used. Yield trials were planted in 2009 and 2010 using prepared beds and flat plots under controlled drought or irrigated conditions. Yield data from experiments in 2010 were replicated, whereas data from trials in 2009 were adjusted means from an alpha lattice incomplete block design with adjustment for spatial variability in the direction of rows and columns using the autoregressive model fitted in both directions.
Data used to train the models for GY and DTH in 2009 were the best linear unbiased estimator (BLUE) after spatial analysis, whereas the BLUE data for 2010 were obtained after performing analyses in each of the 12 environments and then combining them. The experimental designs in each location consisted of alpha lattice incomplete block designs of different sizes, with two replicates each.
Broad-sense heritability at individual environments was calculated as h² = σ²_g / (σ²_g + σ²_e/n_reps), where σ²_g and σ²_e are the genotype and error variance components, respectively, and n_reps is the number of replicates. For the combined analyses across environments, broad-sense heritability was calculated as h² = σ²_g / (σ²_g + σ²_ge/n_env + σ²_e/(n_env × n_reps)), where σ²_ge is the genotype × environment interaction variance component and n_env is the number of environments included in the analysis.
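As a concrete illustration, the two heritability formulas can be evaluated directly from estimated variance components; the numbers below are made up for illustration and are not estimates from this study:

```python
# Broad-sense heritability from variance components, following the two
# formulas above. All variance values below are illustrative only.

def h2_single_env(var_g, var_e, n_reps):
    """h2 = s2_g / (s2_g + s2_e / n_reps) at a single environment."""
    return var_g / (var_g + var_e / n_reps)

def h2_combined(var_g, var_ge, var_e, n_env, n_reps):
    """h2 = s2_g / (s2_g + s2_ge/n_env + s2_e/(n_env * n_reps)) across environments."""
    return var_g / (var_g + var_ge / n_env + var_e / (n_env * n_reps))

# Toy example: genotype variance 2.0, error variance 1.0, 2 replicates -> 0.8
h2_env = h2_single_env(2.0, 1.0, 2)
# Toy example across 7 environments with g-by-e variance 0.5
h2_all = h2_combined(2.0, 0.5, 1.0, 7, 2)
```
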
Statistical models
One method for incorporating markers is to define g_i as a parametric linear regression on marker covariates x_ij with form g_i = Σ_{j=1}^p x_ij β_j, such that y_i = Σ_{j=1}^p x_ij β_j + e_i (j = 1, 2, ..., p markers); here, β_j is the partial regression of y_i on the j-th marker covariate (Meuwissen et al. 2001). Extending the model to allow for an intercept gives

y_i = μ + Σ_{j=1}^p x_ij β_j + e_i    (1)
We adopted Gaussian assumptions for model residuals; specifically, the joint distribution of model residuals in Equation 1 was assumed normal with mean zero and variance σ²_e. The likelihood function is

p(y | μ, β, σ²_e) = Π_{i=1}^n N(y_i | μ + Σ_{j=1}^p x_ij β_j, σ²_e)    (2)

where N(y_i | μ + Σ_{j=1}^p x_ij β_j, σ²_e) is a normal density for random variable y_i centered at μ + Σ_{j=1}^p x_ij β_j and with variance σ²_e. Depending on how priors on the marker effects are assigned, different Bayesian linear regression models result.
Linear models: Bayesian ridge regression, Bayesian LASSO, Bayes A, and Bayes B
A standard penalized regression method is ridge regression (Hoerl and Kennard 1970); its Bayesian counterpart, Bayesian ridge regression (BRR), uses a prior density of marker effects, p(β_j | σ²_β), that is Gaussian, centered at zero and with variance common to all markers, that is, p(β_j | σ²_β) = N(β_j | 0, σ²_β), where σ²_β is a prior variance of marker effects. Marker effects are assumed independent and identically distributed a priori. We assigned scaled inverse chi-squared distributions χ⁻²(df, S) to the variance parameters σ²_e and σ²_β. The prior degrees of freedom and scale parameters were set to df = 4 and S = 1. It can be shown that the posterior mean of marker effects is the best linear unbiased predictor (BLUP) of marker effects, so Bayesian ridge regression is often referred to as RR-BLUP (de los Campos et al. 2012).
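With known variance components, the BRR posterior mean of marker effects has the closed ridge-regression form, which is why it is often called RR-BLUP. A minimal numpy sketch on simulated marker data (not the wheat data set; the dimensions and variances are made up):

```python
import numpy as np

# Posterior mean of marker effects under Bayesian ridge regression with
# known variances: beta_hat = (X'X + lam*I)^{-1} X'y, lam = s2_e / s2_beta.
# Simulated toy data; the study itself used 1717 DArT markers on 306 lines.
rng = np.random.default_rng(0)
n, p = 306, 500                                      # p may exceed n; ridge still works
X = rng.integers(0, 2, size=(n, p)).astype(float)    # biallelic marker codes
beta_true = rng.normal(0, 0.1, size=p)
y = X @ beta_true + rng.normal(0, 1.0, size=n)
y = y - y.mean()                                     # absorb the intercept mu

s2_e, s2_b = 1.0, 0.01
lam = s2_e / s2_b
beta_hat = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
```

The shrinkage factor lam is the ratio of residual to marker-effect variance, so stronger shrinkage corresponds to a smaller prior variance of marker effects.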
The Bayesian LASSO, Bayes A, and Bayes B relax the assumption of a common prior variance for all marker effects. The relationship among these three models is as follows: Bayes B can be considered the most general of the three, in the sense that Bayes A and Bayesian ridge regression can be viewed as special cases of Bayes B. This is because Bayes A is obtained from Bayes B by setting π = 0 (π is the proportion of markers with null effects), and Bayesian ridge regression is obtained from Bayes B by setting π = 0 and assuming that all markers have the same variance.
Bayes B uses a mixture distribution with a point mass at zero, such that the (conditional) prior distribution of marker effects is given by

β_j | σ²_j, π = 0 with probability π, and β_j | σ²_j, π ~ N(0, σ²_j) with probability 1 − π    (3)

The prior assigned to σ²_j, j = 1, ..., p, is the same for all markers, i.e. a scaled inverted chi-squared distribution χ⁻²(df_β, S_β), where df_β are the degrees of freedom and S_β is a scale parameter. Bayes B becomes Bayes A by setting π = 0.
In the case of Bayes B, we took π = 0.95, df_β = 4, and S_β = σ̃²_a (df_β − 2)/df_β, with σ̃²_a = σ̃²_S / [(1 − π) Σ_{j=1}^p 2 q_j (1 − q_j)], where q_j is the allele frequency for marker j and σ̃²_S is the additive genetic variance explained by markers [see Habier et al. (2011) and Resende et al. (2012) for more details]. In the case of σ²_e, we assigned a flat prior, as in Wang et al. (1994).
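The Bayes B scale parameter above is a simple function of the allele frequencies and the assumed additive variance; a minimal sketch with made-up allele frequencies (the true frequencies would come from the DArT marker data):

```python
import numpy as np

# Scale parameter for Bayes B as described above:
#   S_b  = s2_a * (df_b - 2) / df_b
#   s2_a = s2_S / ((1 - pi) * sum_j 2 q_j (1 - q_j))
# with pi = 0.95 and df_b = 4 as used in the study.
def bayesb_scale(q, s2_S, pi=0.95, df_b=4):
    s2_a = s2_S / ((1.0 - pi) * np.sum(2.0 * q * (1.0 - q)))
    return s2_a * (df_b - 2.0) / df_b

q = np.full(1717, 0.3)            # illustrative allele frequencies, 1717 markers
S_b = bayesb_scale(q, s2_S=1.0)   # assumed additive genetic variance of 1.0
```
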
The Bayesian LASSO assigns a double exponential (DE) distribution to all marker effects (conditionally on a regularization parameter λ), centered at zero and with marker-specific variance, that is, p(β_j | λ, σ_e) = DE(β_j | 0, λ/σ²_e). The DE distribution is not conjugate with the Gaussian likelihood, but it can be represented as a mixture of scaled normal densities, which allows easy implementation of the model (Park and Casella 2008; de los Campos et al. 2009). The priors used were exactly the same as those in González-Camacho et al. (2012).
The models used in this study, Bayesian ridge regression, the Bayesian LASSO (BL), Bayes A, and Bayes B, are explained in detail in several articles; for example, Bayes A and Bayes B are described in Meuwissen et al. (2001), Habier et al. (2011), and Resende et al. (2012), and an account of BL is given in de los Campos et al. (2009, 2012), Crossa et al. (2010, 2011), Pérez et al. (2010), and González-Camacho et al. (2012).
Non-linear models: RBFNN, BRNN, and RKHS
In this section, we describe the basic structure of the non-linear single hidden layer feed-forward neural network (SLNN) and two of its variants, the radial basis function neural network and the Bayesian regularized neural network. We also give a brief explanation of RKHS with the kernel averaging method at the end of this section.
Table 1 Twelve environments representing combinations of diverse agronomic management (drought or full irrigation; sowing in standard, bed, or flat systems), sites in Mexico, and years for two traits, grain yield (GY) and days to heading (DTH), with their broad-sense heritability (h²) measured in 2010

Environment Code | Agronomic Management | Site in Mexico | Year | Trait Measured | h² (GY) | h² (DTH)
1 | Drought-bed | Cd. Obregon | 2009 | GY, DTH | — | —
2 | Drought-bed | Cd. Obregon | 2010 | GY, DTH | 0.833 | 0.991
3 | Drought-flat | Cd. Obregon | 2010 | GY, DTH | 0.465 | 0.984
4 | Full irrigation-bed | Cd. Obregon | 2009 | GY, DTH | — | —
5 | Full irrigation-bed | Cd. Obregon | 2010 | GY, DTH | 0.832 | 0.086
6 | Heat-bed | Cd. Obregon | 2010 | GY | 0.876 | —
7 | Full irrigation-flat melga | Cd. Obregon | 2010 | GY | 0.876 | —
8 | Standard | Toluca | 2009 | DTH | — | —
9 | Standard | El Batan | 2009 | DTH | — | —
10 | Small observation plot | Cd. Obregon | 2009 | DTH | — | —
11 | Small observation plot | Cd. Obregon | 2010 | DTH | — | 0.950
12 | Standard | Agua Fria | 2010 | DTH | — | 0.990

Single hidden layer feed-forward neural network: In a single hidden layer feed-forward neural network (SLNN), the non-linear activation functions in the hidden layer enable a NN to have universal approximation ability, giving it great potential and flexibility in terms of capturing complex patterns. The structure of the SLNN is depicted in Figure 1, which illustrates the structure of the method for a continuous phenotypic response. This NN can be thought of as a two-step regression (e.g. Hastie et al. 2009). In the first step, in the non-linear hidden layer, S data-derived basis functions (k = 1, 2, ..., S neurons), {z_i^[k]}, are inferred, and in the second step, in the linear output layer, the response is regressed on the basis functions inferred in the hidden layer. The inner product between the input vector and the weight vector (b^[k]) of each neuron of the hidden layer, plus a bias (intercept b_k), is computed, that is, u_i^[k] = b_k + Σ_{j=1}^p x_ij b_j^[k] (j = 1, ..., p markers); this is then transformed using a non-linear activation function g_k(u_i^[k]). One obtains z_i^[k] = g_k(b_k + Σ_{j=1}^p x_ij b_j^[k]), where b_k is an intercept and (b_1^[1], ..., b_p^[1]; ...; b_1^[S], ..., b_p^[S])′ is a vector of regression coefficients or "weights" of each neuron k in the hidden layer. Here g_k(·) is the activation function, which maps the inputs into the real line in the closed interval [−1, 1]; for example, g_k(x) = (exp(2x) − 1)/(exp(2x) + 1) is known as the hyperbolic tangent function. Finally, in the linear output layer, phenotypes are regressed on the data-derived features, {z_i^[k]}, according to

y_i = μ + Σ_{k=1}^S w_k z_i^[k] + e_i = μ + Σ_{k=1}^S w_k g_k(b_k + Σ_{j=1}^p x_ij b_j^[k]) + e_i    (4)
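Equation 4 is simply a forward pass through one hidden layer of hyperbolic tangent units; the following minimal numpy sketch (with made-up, unfitted weights, purely for illustration) makes the computation concrete:

```python
import numpy as np

def slnn_predict(X, mu, b, B, w):
    """Forward pass of Equation 4: y_hat = mu + sum_k w_k * g(b_k + x'b^[k]),
    where g(x) = (exp(2x) - 1)/(exp(2x) + 1), which equals tanh(x).
    X: n x p marker matrix; b: S hidden biases; B: p x S hidden weights;
    w: S output weights."""
    U = b + X @ B        # n x S pre-activations u_i^[k]
    Z = np.tanh(U)       # hidden-layer scores z_i^[k] in [-1, 1]
    return mu + Z @ w    # linear output layer

# Toy example with random (not fitted) parameters
rng = np.random.default_rng(1)
n, p, S = 10, 20, 3
X = rng.integers(0, 2, size=(n, p)).astype(float)
y_hat = slnn_predict(X, mu=0.5, b=rng.normal(size=S),
                     B=rng.normal(scale=0.1, size=(p, S)),
                     w=rng.normal(size=S))
```
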
Radial basis function neural network: The RBFNN was first proposed by Broomhead and Lowe (1988) and Poggio and Girosi (1990). Figure 2 shows the architecture of a single hidden layer RBFNN with S non-linear neurons. Each non-linear neuron in the hidden layer has a Gaussian radial basis function (RBF) defined as z_i^[k] = exp(−h_k ‖x_i − c_k‖²), where ‖x_i − c_k‖ is the Euclidean norm between the input vector x_i and the center vector c_k, and h_k is the bandwidth of the Gaussian RBF. Subsequently, in the linear output layer, phenotypes are regressed on the data-derived features, {z_i^[k]}, according to y_i = μ + Σ_{k=1}^S w_k z_i^[k] + e_i, where e_i is a model residual.
Estimating the parameters of the RBFNN: The vector of weights w = {w_1, ..., w_S} of the linear output layer is obtained using the ordinary least-squares fit that minimizes the mean squared differences between the ŷ_i (from the RBFNN) and the observed responses y_i in the training set, provided that the centers c_k and bandwidths h_k of the Gaussian RBFs in the hidden layer are defined. The centers are selected using an orthogonal least-squares learning algorithm, as described by Chen et al. (1991) and implemented in Matlab 2010b. The centers are added iteratively such that each newly selected center is orthogonal to the others. The selected centers maximize the decrease in the mean-squared error of the RBFNN, and the algorithm stops when the number of centers (neurons) added to the RBFNN attains a desired precision (goal error) or when the number of centers equals the number of input vectors, that is, when S = n. The bandwidth h_k of the Gaussian RBF is defined in terms of a design parameter of the net, the spread, that is, h_k = 0.8326/spread² for each Gaussian RBF of the hidden layer. To select the best RBFNN, a grid for training the net was generated, containing different values of the spread and different precision values (goal error). The initial value of the spread was the median of the Euclidean distances between each pair of input vectors (x_i), and an initial value of 0.02 was used for the goal error. The spread parameter adjusts the shape of the Gaussian RBF such that it is sufficiently large to respond to overlapping regions of the input space, but not so large that it induces all Gaussian RBFs to have a similar response.
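The hidden-layer computation and the least-squares output layer described above can be sketched as follows. This is a simplified illustration on simulated data: centers are taken as a random subset of the inputs rather than selected by the orthogonal least-squares algorithm of Chen et al. (1991), and the spread-based bandwidth follows the definition given in the text:

```python
import numpy as np

# Minimal RBFNN sketch. Centers here are a random subset of training
# inputs (the paper uses orthogonal least-squares selection instead).
rng = np.random.default_rng(2)
n, p, S = 50, 30, 10
X = rng.integers(0, 2, size=(n, p)).astype(float)
y = rng.normal(size=n)

centers = X[rng.choice(n, size=S, replace=False)]          # c_k
spread = np.median([np.linalg.norm(a - b2) for i, a in enumerate(X)
                    for b2 in X[i + 1:]])                  # initial spread value
h = 0.8326 / spread**2                                     # bandwidth h_k

# Hidden layer: z_i^[k] = exp(-h * ||x_i - c_k||^2)
D2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
Z = np.exp(-h * D2)

# Linear output layer: ordinary least squares for mu and the weights w_k
A = np.column_stack([np.ones(n), Z])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ coef
```
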
Bayesian regularized neural networks: The difference between the SLNN and the BRNN is in the function to be minimized (see the penalized function below); therefore, the basic structure of a BRNN can also be represented by Figure 1. The SLNN described above is flexible enough to approximate any non-linear function; this great flexibility allows the NN to capture complex interactions among predictor
Figure 1 Structure of a single-layer feed-forward neural network (SLNN) adapted from González-Camacho et al. (2012). In the hidden layer, input variables x_i = (x_i1, ..., x_ip) (j = 1, ..., p markers) are combined for each neuron (k = 1, ..., S neurons) using a linear function, u_i^[k] = b_k + Σ_{j=1}^p x_ij b_j^[k], and subsequently transformed using a non-linear activation function, yielding a set of inferred scores, z_i^[k] = g_k(u_i^[k]). These scores are used in the output layer as basis functions to regress the response using the linear activation function on the data-derived predictors: y_i = μ + Σ_{k=1}^S w_k z_i^[k] + e_i.
variables (Hastie et al. 2009). However, this flexibility also leads to two important issues: (1) as the number of neurons increases, the number of parameters to be estimated also increases; and (2) as the number of parameters rises, the risk of over-fitting also increases. It is common practice to use penalized methods, via Bayesian approaches, to prevent or mitigate over-fitting.
MacKay (1992, 1994) developed a framework for obtaining estimates of all the parameters in a feed-forward single hidden layer neural network by using an empirical Bayes approach. Let θ = (w_1, ..., w_S; b_1, ..., b_S; b_1^[1], ..., b_p^[1]; ...; b_1^[S], ..., b_p^[S]; μ)′ be the vector containing all the weights, biases, and connection strengths. The author showed that the estimation problem can be solved in two steps, followed by iteration:

(1) Obtain the conditional posterior modes of the elements in θ, assuming that the variance components σ²_e and σ²_θ are known and that the prior distribution for all the elements in θ is given by p(θ | σ²_θ) = MN(0, σ²_θ I). It is important to note that this approach assigns the same prior to all elements of θ, even though this may not always be the best thing to do. The density of the conditional (given the variance parameters) posterior distribution of the elements of θ, according to Bayes' theorem, is

p(θ | y, σ²_e, σ²_θ) = p(y | θ, σ²_e) p(θ | σ²_θ) / p(y | σ²_e, σ²_θ)    (5)

The conditional modes can be obtained by maximizing Equation 5 over θ. However, the problem is equivalent to minimizing the following penalized sum of squares [see Gianola et al. (2011) for more details]:

F(θ) = β Σ_{i=1}^n e_i² + α Σ_{j=1}^m θ_j²

where β = 1/(2σ²_e), α = 1/(2σ²_θ), e_i is the difference between observed and predicted phenotypes for the fitted model, and θ_j (j = 1, ..., m) is the j-th element of vector θ.

(2) Update σ²_e and σ²_θ. The updating formulas are obtained by maximizing an approximation to the marginal likelihood of the data, p(y | σ²_e, σ²_θ) (the "evidence"), given by the denominator of Equation 5.

(3) Iterate between (1) and (2) until convergence.
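To make the two-step scheme concrete, the following sketch applies it to a model that is linear in θ, a deliberate simplification of the neural-network case, since the posterior mode then has a closed ridge-type form. The α and β updates are the standard evidence-framework expressions with γ as the effective number of parameters; they are assumptions of this sketch, not formulas quoted from the paper:

```python
import numpy as np

# Sketch of the iterative scheme for a linear-in-theta model.
# Step (1): the minimizer of F(theta) = beta*sum(e_i^2) + alpha*sum(theta_j^2)
# is the ridge solution theta = (X'X + (alpha/beta) I)^{-1} X'y.
# Step (2): standard evidence-framework updates (assumed, not from the paper):
#   gamma = sum_i lam_i / (lam_i + alpha/beta), lam_i eigenvalues of X'X,
#   alpha = gamma / (2*sum(theta^2)),  beta = (n - gamma) / (2*sum(e^2)).
rng = np.random.default_rng(3)
n, m = 200, 30
X = rng.normal(size=(n, m))
theta_true = rng.normal(scale=0.5, size=m)
y = X @ theta_true + rng.normal(scale=1.0, size=n)

evals = np.linalg.eigvalsh(X.T @ X)        # fixed across iterations
alpha, beta = 1.0, 1.0
for _ in range(50):
    theta = np.linalg.solve(X.T @ X + (alpha / beta) * np.eye(m), X.T @ y)
    e = y - X @ theta
    gamma = np.sum(evals / (evals + alpha / beta))
    alpha = gamma / (2.0 * np.sum(theta**2))
    beta = (n - gamma) / (2.0 * np.sum(e**2))

s2_e_hat = 1.0 / (2.0 * beta)              # implied residual variance (beta = 1/(2 s2_e))
```
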
The original algorithm developed by MacKay was further improved by Foresee and Hagan (1997) and adopted by Gianola et al. (2011) in the context of genome- and pedigree-enabled prediction. The algorithm is equivalent to estimation via maximum penalized likelihood when "weight decay" is used, but it has the advantage of providing a way of setting the extent of weight decay through the variance component σ²_θ. Neal (1996) pointed out that the procedure of MacKay (1992, 1994) can be further generalized. For example, there is no need to approximate probabilities via Gaussian assumptions; furthermore, it is possible to estimate the entire posterior distributions of all the elements in θ, not only their (conditional) posterior modes. Next, we briefly review Neal's approach to solving the problem; a comprehensive review can be found in Lampinen and Vehtari (2001).
Prior distributions:
a) Variance component of the residuals: Neal (1996) used a conjugate inverse gamma distribution as the prior for the variance associated with the residual e_i in Equation 4, that is, σ²_e ~ Inv-Gamma(s_e, df_e), where s_e and df_e are the scale and degrees of freedom parameters, respectively. These parameters can be set to the default values given by Neal (1996), s_e = 0.05 and df_e = 0.5; these values were also used by Lampinen and Vehtari (2001).
b) Connection strengths, weights, and biases: Neal (1996) suggested dividing the network parameters in θ into groups and then using hierarchical models for each group of parameters; for example, the connection strengths (b_1^[1], ..., b_p^[1]; ...; b_1^[S], ..., b_p^[S]), the biases (b_1, ..., b_S) of the hidden layer, and the output weights (w_1, ..., w_S) and general mean or bias (μ) of the linear output layer. Suppose that θ_1, ..., θ_k are the parameters of a given group; then assume
Figure 2 Structure of a radial basis function neural network adapted from González-Camacho et al. (2012). In the hidden layer, information from the input variables (x_i1, ..., x_ip) (j = 1, ..., p markers) is first summarized by means of the Euclidean distance between each input vector {x_i} and each of the S (data-inferred) centers {c_k} (k = 1, ..., S neurons), that is, u_i^[k] = h_k ‖x_i − c_k‖². These distances are then transformed using the Gaussian function z_i^[k] = exp(−u_i^[k]). These scores are used in the output layer as basis functions for the linear regression y_i = μ + Σ_{k=1}^S w_k z_i^[k] + e_i.
p(θ_1, ..., θ_k | σ²_θ) = (2π)^(−k/2) σ_θ^(−k) exp[−(1/(2σ²_θ)) Σ_{l=1}^k θ_l²]

and, at the last stage of the model, assign the prior σ²_θ ~ Inv-Gamma(s_θ, df_θ). The scale parameter of the distribution associated with the group of parameters containing the connection strengths (b_1^[1], ..., b_p^[1]; ...; b_1^[S], ..., b_p^[S]) changes according to the number of inputs; in this case, s_θ = (0.05/p^(1/df_θ))², with df_θ = 0.5 and p the number of markers in the data set.
By using Markov chain Monte Carlo (MCMC) techniques through an algorithm called hybrid Monte Carlo, Neal (1996) developed software termed flexible Bayesian modeling (FBM) capable of obtaining samples from the posterior distributions of all unknowns in a neural network (as in Figure 1).
Reproducing kernel Hilbert spaces regression: RKHS models have been suggested as an alternative to multiple linear regression for capturing complex interaction patterns that may be difficult to account for in a linear model framework (Gianola et al. 2006). In the RKHS model, the regression function takes the form

f(x_i) = μ + Σ_{i′=1}^n α_{i′} K(x_i, x_{i′})    (6)

where x_i = (x_i1, ..., x_ip)′ and x_{i′} = (x_{i′1}, ..., x_{i′p})′ are input vectors of marker genotypes of individuals i and i′; the α_{i′} are regression coefficients; and K(x_i, x_{i′}) = exp(−h ‖x_i − x_{i′}‖²) is the reproducing kernel, defined (here) with a Gaussian RBF, where h is a bandwidth parameter and ‖x_i − x_{i′}‖ is the Euclidean norm between each pair of input vectors. The strategy termed "kernel averaging" for selecting optimal values of h within a set of candidate values was implemented using the Bayesian approach described in de los Campos et al. (2010). Similarities and connections between RKHS and the RBFNN are given in González-Camacho et al. (2012).
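With a single fixed kernel and a known variance ratio, the coefficients in Equation 6 can be obtained in closed form, as in kernel ridge regression. This single-kernel sketch on simulated data is a simplification: the Bayesian "kernel averaging" over several bandwidths used in the study is not reproduced here:

```python
import numpy as np

# Single-kernel RKHS sketch: K(x_i, x_i') = exp(-h * ||x_i - x_i'||^2),
# with coefficients a = (K + lam*I)^{-1} (y - mu), a kernel-ridge
# simplification of the Bayesian treatment used in the study.
rng = np.random.default_rng(4)
n, p = 60, 40
X = rng.integers(0, 2, size=(n, p)).astype(float)
y = rng.normal(size=n)

h = 0.1                                    # one candidate bandwidth
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
K = np.exp(-h * D2)                        # n x n Gaussian kernel matrix

mu = y.mean()
lam = 1.0                                  # assumed residual-to-genetic variance ratio
a = np.linalg.solve(K + lam * np.eye(n), y - mu)
f_hat = mu + K @ a                         # fitted values f(x_i) as in Equation 6
```
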
Assessment of the models' predictive ability
The predictive ability of the models described above was compared using Pearson's correlation and the predictive mean-squared error (PMSE) between predicted and realized values. A total of 50 random partitions were generated for each of the data sets, and each partition randomly assigned 90% of the lines to the training set and the remaining 10% to the validation set. The partition scheme was similar to that in Gianola et al. (2011) and González-Camacho et al. (2012).
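The partitioning scheme can be sketched as follows; `fit_predict` is a hypothetical stand-in (here a plain ridge regression) for any of the seven models compared, and the data are simulated:

```python
import numpy as np

# 50 random 90%/10% training/validation partitions; for each partition,
# record Pearson's correlation and PMSE between predicted and realized
# values. fit_predict is a placeholder model, not one from the study.
def fit_predict(X_trn, y_trn, X_val, lam=10.0):
    p = X_trn.shape[1]
    b = np.linalg.solve(X_trn.T @ X_trn + lam * np.eye(p), X_trn.T @ y_trn)
    return X_val @ b

rng = np.random.default_rng(5)
n, p = 306, 100                            # 306 lines, toy marker count
X = rng.integers(0, 2, size=(n, p)).astype(float)
y = X @ rng.normal(scale=0.1, size=p) + rng.normal(size=n)
y = y - y.mean()

cors, pmses = [], []
for _ in range(50):
    idx = rng.permutation(n)
    n_val = n // 10
    val, trn = idx[:n_val], idx[n_val:]    # 10% validation, 90% training
    y_hat = fit_predict(X[trn], y[trn], X[val])
    cors.append(np.corrcoef(y_hat, y[val])[0, 1])
    pmses.append(np.mean((y_hat - y[val]) ** 2))

mean_cor, mean_pmse = np.mean(cors), np.mean(pmses)
```
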
All scripts were run on a Linux workstation; for Bayesian ridge regression and the Bayesian LASSO, we used the R package BLR (de los Campos and Pérez 2010), whereas for RKHS we used the R implementation described in de los Campos et al. (2010), which was kindly provided by the authors. In the case of Bayes A and Bayes B, we used a program described by Hickey and Tier (2009), which is freely available at http://sites.google.com/site/hickeyjohn/alphabayes. For the BRNN, we used the FBM software available at http://www.cs.toronto.edu/~radford/fbm.software.html. Because the computational time required to evaluate the predictive ability of the BRNN was great, we used the Condor high throughput computing system at the University of Wisconsin-Madison (http://research.cs.wisc.edu/condor). The RBFNN model was run using Matlab 2010b for Linux.
The differences in computing times between the models were large. The computing times for evaluating the prediction ability of the 50 partitions for each trait were as follows: 10 min for RBFNN, 1.5 hr for RKHS, 3 hr for BRR, 3.5 hr for BL, 4.5 hr for Bayes B, 5.5 hr for Bayes A, and 30 days for BRNN. In the case of RKHS, BRR, BL, Bayes A, and Bayes B, inferences were based on 35,000 MCMC samples, and on 10,000 samples for BRNN. The estimated computing times were obtained using, as a reference, a single Intel Xeon 5330 2.4 GHz CPU and 8 GB of RAM. A significant reduction in computing time was achieved by parallelizing the tasks.
RESULTS
Data from replicated experiments in 2010 were used to calculate the broad-sense heritability for each trait in each environment (Table 1). Broad-sense heritability across locations for the 2010 data was 0.67 for GY and 0.92 for DTH. These high estimates can be explained, at least in part, by the strict environmental control of trials conducted at CIMMYT's experiment station at Ciudad Obregon. The heritability of the two traits for 2009 was not estimated because the only available phenotypic data were adjusted means for each environment.
Predictive assessment of the models
The predictive ability of the different models for GY and DTH varied
among the 12 environments. The model deemed best using correla-
tions (Table 2) tended to be the one with the smallest average PMSE
(Table 3). The three non-parametric models had higher predictive
correlations and smaller PMSE than the linear models for both GY
and DTH. Within the linear models, the results are mixed, and all
models gave similar predictions. Within the non-parametric models,
RBFNN and RKHS always gave higher correlations between predicted
values and realized phenotypes, and a smaller average PMSE than the
BRNN. The mean of the correlations and the associated standard
errors can be used to test for statistically significant improvements
in the predictability of the non-linear models vs. the linear models.
The t-test (with α = 0.05) showed that RKHS gave significant
improvements in prediction in 13/19 cases (Table 3) compared with
the BL, whereas RBFNN was significantly better than the BL in 10/19
cases. Similar results were obtained when comparing RKHS and
RBFNN with Bayes A and Bayes B.
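The two evaluation criteria and the model comparison above can be sketched as follows. PMSE is the mean-squared error in the validation set, and the t statistic compares the mean predictive correlation of two models across the random partitions using their means and standard errors; the sample correlations shown are illustrative, not the study's values.

```python
import numpy as np

def pmse(y_obs, y_pred):
    """Predictive mean-squared error in the validation set."""
    y_obs, y_pred = np.asarray(y_obs, float), np.asarray(y_pred, float)
    return float(np.mean((y_obs - y_pred) ** 2))

def t_statistic(corrs_a, corrs_b):
    """Two-sample t statistic for the mean predictive correlation of
    model A vs. model B across cross-validation partitions."""
    a, b = np.asarray(corrs_a, float), np.asarray(corrs_b, float)
    se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    return float((a.mean() - b.mean()) / se)

# Illustrative per-partition correlations for two models:
corrs_rkhs = [0.66, 0.63, 0.68]
corrs_bl = [0.59, 0.58, 0.60]
print(round(t_statistic(corrs_rkhs, corrs_bl), 2))
```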
Correlations between observed and predicted values for DTH were
lowest overall in environments 4 and 8, in Cd. Obregon, 2009, and in
Toluca, 2009. Average PMSE was in agreement with the findings
based on correlations. Although accuracies in environment 4 were
much lower than in other environments, the higher accuracy of the
non-parametric models (RKHS, RBFNN, and BRNN) over that of the
linear models (BL, BRR, Bayes A, and Bayes B) was consistent with
what was observed in the other environments. Figures 3 and 4 give
scatter plots of the correlations obtained with the three non-parametric
models vs. the BL for DTH and GY, respectively; each circle repre-
sents the estimated correlations for each of the two models included
in the plot. In Figure 3, A–C, DTH had a total of 500 points (10
environments and 50 random training-testing partitions). In Figure 4,
A–C, GY had a total of 350 points (7 environments and 50 random
partitions in each environment). A point above the 45-degree line
represents an analysis where the method whose predictive correlation
is given on the vertical axis (RKHS, RBFNN, BRNN) outperformed
the one whose correlation is given on the horizontal axis (BL). Both
figures show that, although there is a great deal of variability across
partitions, for both DTH and GY the overall superiority of RKHS and
RBFNN over the linear model BL is clear. For both traits, BL had
slightly better prediction accuracy than the BRNN in terms of the
number of individual correlation points. It is interesting to note that
some cross-validation partitions picked subsets of training data that
had negative, zero, or very low correlations with the observed values in
the validation set. These results indicate that lines in the training set
are not necessarily related to those in the validation set.
1600 | P. Pérez-Rodríguez et al.
DISCUSSION AND CONCLUSIONS
Understanding the impact of epistasis on quantitative traits remains
a major challenge. In wheat, several studies have reported significant
epistasis for grain yield and heading or flowering time (Goldringer
et al. 1997). Detailed analyses have shown that vernalization, day-
length sensitivity, and earliness per se genes are mainly responsible
for regulating heading time. The vernalization requirement relates to
the sensitivity of the plant to cold temperatures, which causes it to
accelerate spike primordia formation. Transgenic and mutant analyses,
for example, have suggested a pathway involving epistatic interactions
that combines environment-induced suppression and upregulation of
several genes, leading to the final floral transition (Shimada et al. 2009).
There is evidence that the aggregation of multiple gene × gene
interactions (epistasis) with small effects into small epistatic networks
Table 2 Average correlation (SE in parentheses) between observed and predicted values for grain yield (GY) and days to heading (DTH)
in 12 environments for seven models

Trait  Environment  BL           BRR          Bayes A      Bayes B      RKHS         RBFNN        BRNN
DTH    1            0.59 (0.11)  0.59 (0.11)  0.59 (0.11)  0.56 (0.11)  0.66 (0.09)  0.66 (0.10)  0.64 (0.11)
       2            0.58 (0.14)  0.57 (0.14)  0.61 (0.12)  0.57 (0.13)  0.63 (0.13)  0.61 (0.13)  0.62 (0.13)
       3            0.60 (0.13)  0.60 (0.12)  0.62 (0.11)  0.60 (0.12)  0.68 (0.10)  0.69 (0.10)  0.67 (0.11)
       4            0.02 (0.18)  0.07 (0.17)  0.06 (0.17)  0.06 (0.17)  0.12 (0.18)  0.16 (0.18)  0.02 (0.19)
       5            0.65 (0.09)  0.64 (0.10)  0.66 (0.09)  0.66 (0.09)  0.69 (0.08)  0.68 (0.08)  0.68 (0.08)
       8            0.36 (0.15)  0.37 (0.15)  0.36 (0.15)  0.35 (0.14)  0.46 (0.13)  0.46 (0.14)  0.39 (0.15)
       9            0.59 (0.12)  0.59 (0.11)  0.53 (0.12)  0.52 (0.11)  0.62 (0.11)  0.63 (0.11)  0.61 (0.12)
       10           0.54 (0.14)  0.52 (0.14)  0.56 (0.13)  0.54 (0.14)  0.61 (0.13)  0.62 (0.12)  0.57 (0.13)
       11           0.52 (0.15)  0.52 (0.16)  0.53 (0.13)  0.51 (0.13)  0.58 (0.14)  0.59 (0.13)  0.55 (0.14)
       12           0.45 (0.19)  0.42 (0.18)  0.45 (0.18)  0.45 (0.18)  0.47 (0.18)  0.39 (0.19)  0.35 (0.19)
       Average      0.59 (0.12)  0.58 (0.12)  0.60 (0.12)  0.57 (0.12)  0.65 (0.10)  0.48 (0.14)  0.48 (0.14)
GY     1            0.48 (0.13)  0.43 (0.14)  0.48 (0.13)  0.46 (0.13)  0.51 (0.12)  0.51 (0.12)  0.50 (0.13)
       2            0.48 (0.14)  0.41 (0.17)  0.48 (0.14)  0.48 (0.14)  0.50 (0.14)  0.43 (0.16)  0.43 (0.16)
       3            0.20 (0.21)  0.29 (0.22)  0.20 (0.22)  0.18 (0.22)  0.37 (0.20)  0.42 (0.21)  0.32 (0.24)
       4            0.45 (0.15)  0.46 (0.13)  0.43 (0.15)  0.42 (0.15)  0.53 (0.12)  0.55 (0.11)  0.49 (0.14)
       5            0.59 (0.14)  0.56 (0.16)  0.75 (0.11)  0.74 (0.12)  0.64 (0.13)  0.66 (0.13)  0.63 (0.13)
       6            0.70 (0.10)  0.67 (0.11)  0.73 (0.08)  0.71 (0.08)  0.73 (0.08)  0.71 (0.08)  0.69 (0.10)
       7            0.46 (0.14)  0.50 (0.14)  0.42 (0.14)  0.40 (0.15)  0.53 (0.13)  0.54 (0.14)  0.50 (0.14)
       Average      0.62 (0.10)  0.57 (0.14)  0.69 (0.10)  0.70 (0.09)  0.67 (0.09)  0.56 (0.12)  0.65 (0.10)

Fitted models were Bayesian LASSO (BL), Bayesian ridge regression (BRR), Bayes A, Bayes B, reproducing kernel Hilbert spaces regression (RKHS),
radial basis function neural networks (RBFNN), and Bayesian regularized neural networks (BRNN), across 50 random partitions of the data with
90% in the training set and 10% in the validation set. The models with highest correlations are underlined.
Table 3 Predictive mean-squared error (PMSE) between observed and predicted values for grain yield (GY) and days to heading (DTH)
in 12 environments for seven models

Trait  Environment  BL     BRR    Bayes A  Bayes B  RKHS   RBFNN  BRNN
DTH    1            13.02  13.18  12.72    13.23    11.02  10.85  11.52
       2            11.89  12.37  10.65    11.28    10.19  10.72  10.44
       3             8.18   8.44   7.31     7.59     6.29   6.25   6.63
       4            21.59  22.27  21.79    21.67    21.14  22.64  21.49
       5             8.86   9.23   8.48     8.37     7.95   8.02   8.21
       8            14.72  15.22  14.54    14.58    13.12  13.19  14.81
       9            21.38  21.44  23.71    23.93    20.50  19.84  20.62
       10            7.72   8.51   7.27     7.57     6.66   6.51   7.36
       11            6.83   7.12   6.59     6.74     6.03   5.96   6.51
       12           13.60  14.42  13.56    13.46    13.25  14.86  15.75
       Average       6.09   6.47   5.99     6.28     5.31   9.12   9.25
GY     1             0.07   0.09   0.07     0.07     0.07   0.07   0.07
       2             0.06   0.08   0.06     0.06     0.06   0.07   0.07
       3             0.06   0.07   0.06     0.06     0.05   0.05   0.05
       4             0.22   0.24   0.23     0.23     0.20   0.19   0.21
       5             0.39   0.44   0.26     0.27     0.35   0.33   0.36
       6             0.13   0.15   0.12     0.13     0.12   0.13   0.13
       7             0.40   0.41   0.43     0.44     0.38   0.37   0.39
       Average       0.06   0.07   0.05     0.05     0.05   0.07   0.06

Fitted models were Bayesian LASSO (BL), Bayesian ridge regression (BRR), Bayes A, Bayes B, reproducing kernel Hilbert space regression (RKHS),
radial basis function neural networks (RBFNN), and Bayesian regularized neural networks (BRNN), across 50 random partitions of the data with
90% in the training set and 10% in the validation set. The models with lowest PMSE are underlined.
Volume 2 December 2012 | Linear and Non-parametric Regression Models for GS | 1601
is important for explaining the heritability of complex traits in
genome-wide association studies (McKinney and Pajewski 2012).
Epistatic networks and gene × gene interactions can also be exploited for
GS via suitable statistical-genetic models that incorporate network
complexities. Evidence from this study, as well as from other research
involving other plant and animal species, suggests that models that are
non-linear in input variables (e.g. SNPs) predict outcomes in testing
sets better than standard linear regression models for genome-enabled
prediction. However, it should be pointed out that better predictive
ability can have several causes, one of them the ability of some non-
linear models to capture epistatic effects. Furthermore, the random
cross-validation scheme used in this study was not designed to
specifically assess epistasis but rather to compare the models' predictive
ability.

Figure 3 Plots of the predictive correlation for each of 50 cross-validation partitions and 10 environments for days to heading (DTH) in different
combinations of models. (A) When the best non-parametric model is RKHS, this is represented by an open circle; when the best linear model is BL,
this is represented by a filled circle. (B) When the best non-parametric model is RBFNN, this is represented by an open circle; when the best linear
model is BL, this is represented by a filled circle. (C) When the best non-parametric model is BRNN, this is represented by an open circle; when the
best linear model is BL, this is represented by a filled circle. The histograms depict the distribution of the correlations in the testing set obtained
from the 50 partitions for different models. The horizontal (vertical) dashed line represents the average of the correlations for the testing set in the
50 partitions for the model shown on the Y (X) axis. The solid line represents Y = X; i.e., both models have the same prediction ability.
It is interesting to compare results from different predictive
machineries when applied to either maize or wheat. Differences in
the prediction accuracy of non-parametric and linear models (at least
for the data sets included in this and other studies) seem to be more
pronounced in wheat than in maize. Although differences depend,
among other factors, on the trait-environment combination and the
number of markers, it is clear from González-Camacho et al. (2012)
that for flowering traits (highly additive) and traits such as grain yield
(additive and epistatic) in maize, the BL model performed very sim-
ilarly to the RKHS and RBFNN. On the other hand, in the present
study, which involves wheat, the RKHS, RBFNN, and BRNN models
clearly had a markedly better predictive accuracy than BL, BRR, Bayes
A, or Bayes B. This may be due to the fact that, in wheat, additive ×
additive epistasis plays an important role in grain yield, as found by
Crossa et al. (2006) and Burgueño et al. (2007, 2011) when assessing
additive, additive × additive, additive × environment, and additive ×
additive × environment interactions using a pedigree-based model
with the relationship matrix A.

Figure 4 Plot of the correlation for each of 50 cross-validation partitions and seven environments for grain yield (GY) in different combinations of
models. (A) When the best model is RKHS, this is represented by an open circle; when the best model is BL, this is represented by a filled circle. (B)
When the best model is RBFNN, this is represented by an open circle; when the best model is BL, this is represented by a filled circle. (C) When the
best model is BRNN, this is represented by an open circle; when the best model is BL, this is represented by a filled circle. The histograms depict
the distribution of the correlations in the testing set obtained from the 50 partitions for different models. The horizontal (vertical) dashed line
represents the average of the correlations for the testing set in the 50 partitions for the model shown on the Y (X) axis. The solid line represents
Y = X; i.e., both models have the same prediction ability.
As pointed out first by Gianola et al. (2006) and subsequently by
Long et al. (2010), non-parametric models do not impose strong
assumptions on the phenotype-genotype relationship, and they have
the potential of capturing interactions among loci. Our results with
real wheat data sets agreed with previous findings in animal and plant
breeding and with simulated experiments, in that a non-parametric
treatment of markers may account for epistatic effects that are not
captured by linear additive regression models. Using extensive maize
data sets, González-Camacho et al. (2012) found that RBFNN and
RKHS had some similarities and seemed to be useful for predicting
quantitative traits with different complex underlying gene action un-
der varying types of interaction in different environmental conditions.
These authors suggested that it is possible to make further improve-
ments in the accuracy of the RKHS and RBFNN models by introduc-
ing differential weights in SNPs, as shown by Long et al. (2010) for
RBFs.
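The RKHS regression discussed above replaces the linear marker regression with a kernel built on genetic distances between lines. A minimal, non-Bayesian sketch using a Gaussian kernel and kernel ridge regression follows; the bandwidth h and regularization lam are illustrative assumptions (in the paper these are handled within a Bayesian model), and the marker matrix is simulated:

```python
import numpy as np

def gaussian_kernel(A, B, h):
    """K(x, z) = exp(-||x - z||^2 / h), computed between all rows of the
    marker matrices A (n x p) and B (m x p)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / h)

def rkhs_predict(X_train, y_train, X_test, h=15.0, lam=0.5):
    """Kernel ridge / RKHS-style prediction:
    alpha = (K + lam*I)^(-1) y;  y_hat = K(test, train) @ alpha.
    A frequentist sketch of the idea; the paper's RKHS model is Bayesian."""
    K = gaussian_kernel(X_train, X_train, h)
    alpha = np.linalg.solve(K + lam * np.eye(len(y_train)), y_train)
    return gaussian_kernel(X_test, X_train, h) @ alpha

if __name__ == "__main__":
    # Simulated binary markers for 30 lines (illustrative).
    rng = np.random.default_rng(3)
    X = rng.integers(0, 2, size=(30, 15)).astype(float)
    y = rng.normal(size=30)
    y_hat = rkhs_predict(X[:25], y[:25], X[25:])
    print(y_hat.shape)  # predictions for the 5 held-out lines
```

Because the kernel depends on whole-genotype distances rather than on a sum of per-marker effects, interactions among loci enter the predictor implicitly, which is the property credited here with capturing epistatic signal.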
The training population used here was not developed specically
for this study; it was made up of a set of elite lines from the CIMMYT
rain-fed spring wheat breeding program. Our results show that it
is possible to achieve good predictions of line performance by com-
bining phenotypic and genotypic data generated on elite lines. As
genotyping costs decrease, breeding programs could make use of
genome-enabled prediction models to predict the values of new
breeding lines generated from crosses between elite lines in the
training set before they reach the yield testing stage. Lines with the
highest estimated breeding values could be intercrossed before being
phenotyped. Such a "rapid cycling" scheme would accelerate the
fixation rate of favorable alleles in elite materials and should increase the
genetic gain per unit of time, as described by Heffner et al. (2009).
It is important to point out that proof-of-concept experiments are
required before genome-enabled selection can be implemented
successfully in plant breeding programs. It is necessary to test
genomic predictions on breeding materials derived from crosses
between lines of the training population. If predictions are reliable
enough, an experiment using the same set of parental materials
could be carried out to compare the eld performance of lines coming
from a genomic-assisted recurrent selection program scheme vs. lines
coming from a conventional breeding scheme. The accuracies
reported in this study represent prediction of wheat lines using a train-
ing set comprising lines with some degree of relatedness to lines in the
validation set. When the validation and the training sets are not
genetically related (unrelated families) or represent populations with
different genetic structures and different linkage disequilibrium pat-
terns, then negligible accuracies are to be expected. It seems that
successful application of genomic selection in plant breeding requires
some genetic relatedness between individuals in the training and val-
idation sets, and that linkage disequilibrium information per se does
not suffice (e.g. Makowsky et al. 2011).
ACKNOWLEDGMENTS
Financial support by the Wisconsin Agriculture Experiment Station
and the AVIAGEN, Ltd. (Newbridge, Scotland) to Paulino Pérez and
Daniel Gianola is acknowledged. We thank the Centro Internacional
de Mejoramiento de Maíz y Trigo (CIMMYT) researchers who carried
out the wheat trials and provided the phenotypic data analyzed in this
article.
LITERATURE CITED
Bernardo, R., and J. M. Yu, 2007 Prospects for genome-wide selection for
quantitative traits in maize. Crop Sci. 47(3): 1082–1090.
Broomhead, D. S., and D. Lowe, 1988 Multivariable functional interpola-
tion and adaptive networks. Complex Systems 2: 321–355.
Burgueño, J., J. Crossa, P. L. Cornelius, R. Trethowan, G. McLaren et al.,
2007 Modeling additive × environment and additive × additive × en-
vironment using genetic covariances of relatives of wheat genotypes. Crop
Sci. 47(1): 311–320.
Burgueño, J., J. Crossa, J. M. Cotes, F. San Vicente, and B. Das, 2011 Pre-
diction assessment of linear mixed models for multienvironment trials.
Crop Sci. 51(3): 944–954.
Chen, S., C. F. N. Cowan, and P. M. Grant, 1991 Orthogonal least squares
learning algorithm for radial basis function networks. IEEE Transactions
on Neural Networks 2(2): 302–309.
Cockram, J., H. Jones, F. J. Leigh, D. O'Sullivan, W. Powell et al.,
2007 Control of flowering time in temperate cereals: genes, domesti-
cation, and sustainable productivity. J. Exp. Bot. 58(6): 1231–1244.
Conti, V., P. F. Roncallo, V. Beaufort, G. L. Cervigni, R. Miranda et al.,
2011 Mapping of main and epistatic effect QTLs associated to grain
protein and gluten strength using a RIL population of durum wheat.
J. Appl. Genet. 52(3): 287–298.
Crossa, J., J. Burgueño, P. L. Cornelius, G. McLaren, R. Trethowan et al.,
2006 Modeling genotype × environment interaction using additive ge-
netic covariances of relatives for predicting breeding values of wheat
genotypes. Crop Sci. 46(4): 1722–1733.
Crossa, J., G. de los Campos, P. Perez, D. Gianola, J. Burgueño et al.,
2010 Prediction of genetic values of quantitative traits in plant breeding
using pedigree and molecular markers. Genetics 186(2): 713–724.
Crossa, J., P. Perez, G. de los Campos, G. Mahuku, S. Dreisigacker et al.,
2011 Genomic selection and prediction in plant breeding. J. Crop Im-
prov. 25(3): 239–261.
de los Campos, G., and P. Perez, 2010 BLR: Bayesian Linear Regression
R package, version 1.2.
de los Campos, G., H. Naya, D. Gianola, J. Crossa, A. Legarra et al.,
2009 Predicting quantitative traits with regression models for dense
molecular markers and pedigree. Genetics 182(1): 375–385.
de los Campos, G., D. Gianola, G. J. M. Rosa, K. A. Weigel, and J. Crossa,
2010 Semi-parametric genomic-enabled prediction of genetic values
using reproducing kernel Hilbert spaces methods. Genet. Res. 92(4):
295–308.
de los Campos, G., J. M. Hickey, R. Pong-Wong, H. D. Daetwyler, and M. P. L.
Calus, 2012 Whole genome regression and prediction methods applied to
plant and animal breeding. Genetics DOI: 10.1534/genetics.112.14331.
Foresee, D., and M. T. Hagan, 1997 Gauss-Newton approximation to
Bayesian learning. International Conference on Neural Networks, June
9–12, Houston, TX.
Gianola, D., and J. B. C. H. M. van Kaam, 2008 Reproducing kernel Hilbert
spaces regression methods for genomic assisted prediction of quantitative
traits. Genetics 178(4): 2289–2303.
Gianola, D., R. L. Fernando, and A. Stella, 2006 Genomic-assisted predic-
tion of genetic value with semiparametric procedures. Genetics 173(3):
1761–1776.
Gianola, D., H. Okut, K. A. Weigel, and G. J. M. Rosa, 2011 Predicting
complex quantitative traits with Bayesian neural networks: a case study
with Jersey cows and wheat. BMC Genet. 12: 87.
Goldringer, I., P. Brabant, and A. Gallais, 1997 Estimation of additive
and epistatic genetic variances for agronomic traits in a population of
doubled-haploid lines of wheat. Heredity 79: 60–71.
González-Camacho, J. M., G. de los Campos, P. Perez, D. Gianola, J. Cairns
et al., 2012 Genome-enabled prediction of genetic values using radial
basis function. Theor. Appl. Genet. 125: 759–771.
Habier, D., R. L. Fernando, K. Kizilkaya, and D. J. Garrick, 2011 Extension
of the Bayesian alphabet for genomic selection. BMC Bioinformatics 12:
186.
Hastie, T., R. Tibshirani, and J. Friedman, 2009 The Elements of Statistical
Learning: Data Mining, Inference, and Prediction, Ed. 2. Springer, New York.
Heffner, E. L., M. E. Sorrells, and J. L. Jannink, 2009 Genomic selection for
crop improvement. Crop Sci. 49(1): 1–12.
Heslot, N., H. P. Yang, M. E. Sorrells, and J. L. Jannink, 2012 Genomic
selection in plant breeding: a comparison of models. Crop Sci. 52(1):
146–160.
Hickey, J. M., and B. Tier, 2009 AlphaBayes (Beta): Software for Polygenic
and Whole Genome Analysis. User Manual. University of New England,
Armidale, Australia.
Hoerl, A. E., and R. W. Kennard, 1970 Ridge regression: biased estimation
for nonorthogonal problems. Technometrics 12(1): 55–67.
Holland, J. B., 2001 Epistasis and plant breeding. Plant Breeding Reviews
21: 27–92.
Holland, J. B., 2008 Theoretical and biological foundations of plant
breeding, pp. 127–140 in Plant Breeding: The Arnel R. Hallauer Inter-
national Symposium, edited by K. R. Lamkey and M. Lee. Blackwell
Publishing, Ames, IA.
Lampinen, J., and A. Vehtari, 2001 Bayesian approach for neural networks -
review and case studies. Neural Netw. 14(3): 257–274.
Laurie, D. A., N. Pratchett, J. W. Snape, and J. H. Bezant, 1995 RFLP
mapping of five major genes and eight quantitative trait loci controlling
flowering time in a winter × spring barley (Hordeum vulgare L.) cross.
Genome 38(3): 575–585.
Long, N. Y., D. Gianola, G. J. M. Rosa, K. A. Weigel, A. Kranis et al.,
2010 Radial basis function regression methods for predicting quanti-
tative traits using SNP markers. Genet. Res. 92(3): 209–225.
MacKay, D. J. C., 1992 A practical Bayesian framework for backpropaga-
tion networks. Neural Comput. 4(3): 448–472.
MacKay, D. J. C., 1994 Bayesian non-linear modelling for the prediction
competition. ASHRAE Transactions 100(Pt. 2): 1053–1062.
Makowsky, R., N. M. Pajewski, Y. C. Klimentidis, A. I. Vazquez, C. W. Duarte
et al., 2011 Beyond missing heritability: prediction of complex traits.
PLoS Genet. 7(4): e1002051.
McKinney, B. A., and N. M. Pajewski, 2012 Six degrees of epistasis: sta-
tistical network models for GWAS. Front. Genet. 2: 109.
Meuwissen, T. H. E., B. J. Hayes, and M. E. Goddard, 2001 Prediction of
total genetic value using genome-wide dense marker maps. Genetics 157
(4): 1819–1829.
Neal, R. M., 1996 Bayesian Learning for Neural Networks (Lecture Notes in
Statistics), Vol. 118. Springer-Verlag, New York.
Ober, U., J. F. Ayroles, E. A. Stone, S. Richards, D. Zhu et al., 2012 Using
whole-genome sequence data to predict quantitative trait phenotypes in
Drosophila melanogaster. PLoS Genet. 8(5): e1002685.
Okut, H., D. Gianola, G. J. Rosa, and K. A. Weigel, 2011 Prediction of body
mass index in mice using dense molecular markers and a regularized
neural network. Genet. Res. Camb. 93: 189–201.
Park, T., and G. Casella, 2008 The Bayesian LASSO. J. Am. Stat. Assoc. 103:
681–686.
Perez, P., G. de los Campos, J. Crossa, and D. Gianola, 2010 Genomic-
enabled prediction based on molecular markers and pedigree using the
Bayesian linear regression package in R. Plant Genome 3(2): 106–116.
Poggio, T., and F. Girosi, 1990 Networks for approximation and learning.
Proc. IEEE 78(9): 1481–1497.
Resende, M. F. R., P. Muñoz, M. D. V. Resende, D. J. Garrick, R. L. Fernando
et al., 2012 Accuracy of genomic selection methods in a standard data
set of loblolly pine (Pinus taeda L.). Genetics 4: 1503–1510.
Shimada, S., T. Ogawa, and S. Kitagawa, 2009 A genetic network of
flowering-time genes in wheat leaves, in which an APETALA1/FRUITFULL-
like gene, VRN-1, is upstream of FLOWERING LOCUS T. Plant J. 58:
668–681.
Wang, C. S., J. J. Rutledge, and D. Gianola, 1994 Bayesian analysis of mixed
linear models via Gibbs sampling with an application to litter size in
Iberian pigs. Genet. Sel. Evol. 26: 91–115.
Zhang, K., J. Tian, L. Zhao, and S. Wang, 2008 Mapping QTLs with epi-
static effects and QTL × environment interactions for plant height using
a doubled haploid population in cultivated wheat. J. Genet. Genomics 35
(2): 119–127.
Communicating editor: J. B. Holland
... Several GS models have been developed to estimate GEBVs for traits of interest based on different assumptions (Zhong et al., 2009;Daetwyler et al., 2010;Desta and Ortiz, 2014). Parametric models are commonly used to estimate additive genetic effects, while nonparametric models are appropriate for non-additive genetic effects and multivariates (De Los Campos et al., 2010;Holliday et al., 2012;Peŕez-Rodrıǵuez et al., 2012;Krishnappa et al., 2021). In this study, the GEBVs of BW resistance were estimated using both parametric (RR-BLUP, BayesA, and Bayesian LASSO) and non-parametric (RKHS, SVM, and random forest) models. ...
Article
Full-text available
Bacterial wilt (BW) is a soil-borne disease that leads to severe damage in tomato. Host resistance against BW is considered polygenic and effective in controlling this destructive disease. In this study, genomic selection (GS), which is a promising breeding strategy to improve quantitative traits, was investigated for BW resistance. Two tomato collections, TGC1 (n = 162) and TGC2 (n = 191), were used as training populations. Disease severity was assessed using three seedling assays in each population, and the best linear unbiased prediction (BLUP) values were obtained. The 31,142 SNP data were generated using the 51K Axiom array™ in the training populations. With these data, six GS models were trained to predict genomic estimated breeding values (GEBVs) in three populations (TGC1, TGC2, and combined). The parametric models Bayesian LASSO and RR-BLUP resulted in higher levels of prediction accuracy compared with all the non-parametric models (RKHS, SVM, and random forest) in two training populations. To identify low-density markers, two subsets of 1,557 SNPs were filtered based on marker effects (Bayesian LASSO) and variable importance values (random forest) in the combined population. An additional subset was generated using 1,357 SNPs from a genome-wide association study. These subsets showed prediction accuracies of 0.699 to 0.756 in Bayesian LASSO and 0.670 to 0.682 in random forest, which were higher relative to the 31,142 SNPs (0.625 and 0.614). Moreover, high prediction accuracies (0.743 and 0.702) were found with a common set of 135 SNPs derived from the three subsets. The resulting low-density SNPs will be useful to develop a cost-effective GS strategy for BW resistance in tomato breeding programs.
... It has been proven in wheat (Crossa et al., 2007;Pérez-Rodríguez et al., 2012) that large linkage blocks allow for relatively stable epistatic effects within and between the subgenomes A, B, and D. The use of nonlinear kernels such as the GK enables the capturing of cryptic small epistatic effects. In this study of SHW, the better predictive models were mostly those using the nonlinear GK that is supposed to capture gene × gene interaction, especially those existing between sub-genomes AB and D. ...
Article
Full-text available
Bread wheat (Triticum aestivum L.) is a globally important food crop, which was domesticated about 8–10,000 years ago. Bread wheat is an allopolyploid, and it evolved from two hybridization events of three species. To widen the genetic base in breeding, bread wheat has been re‐synthesized by crossing durum wheat (Triticum turgidum ssp. durum) and goat grass (Aegilops tauschii Coss), leading to so‐called synthetic hexaploid wheat (SHW). We applied the quantitative genetics tools of “hybrid prediction”—originally developed for the prediction of wheat hybrids generated from different heterotic groups — to a situation of allopolyploidization. Our use‐case predicts the phenotypes of SHW for three quantitatively inherited global wheat diseases, namely tan spot (TS), septoria nodorum blotch (SNB), and spot blotch (SB). Our results revealed prediction abilities comparable to studies in ‘traditional’ elite or hybrid wheat. Prediction abilities were highest using a marker model and performing random cross‐validation, predicting the performance of untested SHW (0.483 for SB to 0.730 for TS). When testing parents not necessarily used in SHW, combination prediction abilities were slightly lower (0.378 for SB to 0.718 for TS), yet still promising. Despite the limited phenotypic data, our results provide a general example for predictive models targeting an allopolyploidization event and a method that can guide the use of genetic resources available in gene banks.
... In GS, rrBLUP algorithm is a linear mixed modelbased prediction method that assumes all markers provide genetic effects and their values following a normal distribution [30]. In contrast, the BGLR model is a linear mixed model, which assumes that gene effects are randomly drawn from a multivariate normal distribution and genotype effects are randomly drawn from a multivariate Gaussian process, which takes into account potential pleiotropy and polygenic effects and allows inferring the effects of single gene while estimating genomic values [31]. ...
Article
Full-text available
Background The growth and development of organism were dependent on the effect of genetic, environment, and their interaction. In recent decades, lots of candidate additive genetic markers and genes had been detected by using genome-widely association study (GWAS). However, restricted to computing power and practical tool, the interactive effect of markers and genes were not revealed clearly. And utilization of these interactive markers is difficult in the breeding and prediction, such as genome selection (GS). Results Through the Power-FDR curve, the GbyE algorithm can detect more significant genetic loci at different levels of genetic correlation and heritability, especially at low heritability levels. The additive effect of GbyE exhibits high significance on certain chromosomes, while the interactive effect detects more significant sites on other chromosomes, which were not detected in the first two parts. In prediction accuracy testing, in most cases of heritability and genetic correlation, the majority of prediction accuracy of GbyE is significantly higher than that of the mean method, regardless of whether the rrBLUP model or BGLR model is used for statistics. The GbyE algorithm improves the prediction accuracy of the three Bayesian models BRR, BayesA, and BayesLASSO using information from genetic by environmental interaction (G × E) and increases the prediction accuracy by 9.4%, 9.1%, and 11%, respectively, relative to the Mean value method. The GbyE algorithm is significantly superior to the mean method in the absence of a single environment, regardless of the combination of heritability and genetic correlation, especially in the case of high genetic correlation and heritability. Conclusions Therefore, this study constructed a new genotype design model program (GbyE) for GWAS and GS using Kronecker product. which was able to clearly estimate the additive and interactive effects separately. 
The results showed that GbyE can provide higher statistical power for the GWAS and more prediction accuracy of the GS models. In addition, GbyE gives varying degrees of improvement of prediction accuracy in three Bayesian models (BRR, BayesA, and BayesCpi). Whatever the phenotype were missed in the single environment or multiple environments, the GbyE also makes better prediction for inference population set. This study helps us understand the interactive relationship between genomic and environment in the complex traits. The GbyE source code is available at the GitHub website (https://github.com/liu-xinrui/GbyE).
... The RR-BLUP model assumes that all markers have common variances with small effects, while the Bayesian models allows different effects and variances of markers with various degrees of shrinkage [10,27,28]. The non-parametric models such as reproducing kernel Hilbert space (RKHS), support vector machine (SVM), and random forest (RF) have been known to be better for capturing non-additive genetic effects and multi-variates relative to parametric models [29][30][31]. For RKHS, the Euclidian genetic distance based Gaussian kernel is used to predict GEBVs with a smoothing parameter to regulate the distribution of marker effects [29,32]. ...
Article
Full-text available
Background Genomic selection (GS) is an efficient breeding strategy to improve quantitative traits. It is necessary to calculate genomic estimated breeding values (GEBVs) for GS. This study investigated the prediction accuracy of GEBVs for five fruit traits including fruit weight, fruit width, fruit height, pericarp thickness, and Brix. Two tomato germplasm collections (TGC1 and TGC2) were used as training populations, consisting of 162 and 191 accessions, respectively. Results Large phenotypic variations for the fruit traits were found in these collections and the 51K Axiom™ SNP array generated confident 31,142 SNPs. Prediction accuracy was evaluated using different cross-validation methods, GS models, and marker sets in three training populations (TGC1, TGC2, and combined). For cross-validation, LOOCV was effective as k-fold across traits and training populations. The parametric (RR-BLUP, Bayes A, and Bayesian LASSO) and non-parametric (RKHS, SVM, and random forest) models showed different prediction accuracies (0.594–0.870) between traits and training populations. Of these, random forest was the best model for fruit weight (0.780–0.835), fruit width (0.791–0.865), and pericarp thickness (0.643–0.866). The effect of marker density was trait-dependent and reached a plateau for each trait with 768−12,288 SNPs. Two additional sets of 192 and 96 SNPs from GWAS revealed higher prediction accuracies for the fruit traits compared to the 31,142 SNPs and eight subsets. Conclusion Our study explored several factors to increase the prediction accuracy of GEBVs for fruit traits in tomato. The results can facilitate development of advanced GS strategies with cost-effective marker sets for improving fruit traits as well as other traits. Consequently, GS will be successfully applied to accelerate the tomato breeding process for developing elite cultivars.
... While traditional ML techniques and mixed linear models remain accurate for smaller datasets, DL excels at feature extraction from extensive datasets, accounting for feature interaction effects. Expanding beyond genomic best linear unbiased prediction (GBLUP) methods holds promise for more accurate phenotype predictions in GS problems, ushering in a new era of precision breeding (Pérez-Rodríguez et al., 2012;Montesinos-López et al., 2018;Jubair et al., 2021;Danilevicz et al., 2022). Fig. 7 illustrates the synergy between machine learning and plant biology, showcasing how the computational power of machine learning enhances the analysis of complex plant data and contributes to advancements in agriculture, crop improvement, ecological understanding, and sustainable resource management. ...
Article
Advances in gene editing and natural genetic variability present significant opportunities to generate novel alleles and select natural sources of genetic variation for horticulture crop improvement. The genetic improvement of crops to enhance their resilience to abiotic stresses and new pests due to climate change is essential for future food security. The field of genomics has made significant strides over the past few decades, enabling us to sequence and analyze entire genomes. However, understanding the complex relationship between genes and their expression in phenotypes-the observable characteristics of an organism-requires a deeper understanding of phenomics. Phenomics seeks to link genetic information with biological processes and environmental factors to better understand complex traits and diseases. Recent breakthroughs in this field include the development of advanced imaging technologies, artificial intelligence algorithms, and large-scale data analysis techniques. These tools have enabled us to explore the relationships between genotype, phenotype, and environment in unprecedented detail. This review explores the importance of understanding the complex relationship between genes and their expression in phenotypes. Integration of genomics with efficient high throughput plant phenotyping as well as the potential of machine learning approaches for genomic and phenomics trait discovery.
Article
Full-text available
Genomic selection, the application of genomic prediction (GP) models to select candidate individuals, has significantly advanced in the past two decades, effectively accelerating genetic gains in plant breeding. This article provides a holistic overview of the key factors that have influenced GP in plant breeding during this period. We delved into the pivotal roles of training population size and genetic diversity, and their relationship with the breeding population, in determining genomic prediction accuracy. Special emphasis was placed on optimizing training population size. We explored its benefits and the associated diminishing returns beyond an optimum size. This was done while considering the balance between resource allocation and maximizing prediction accuracy through current optimization algorithms. The density and distribution of single nucleotide polymorphisms (SNPs), level of linkage disequilibrium, genetic complexity, trait heritability, statistical machine learning methods and non-additive effects are the other vital factors. Using wheat, maize and potato as examples, we summarize the effect of these factors on the accuracy of GP for various traits. The search for high accuracy in GP – theoretically reaching one when using the Pearson’s correlation as a metric – is an active research area yet far from optimal for various traits. We hypothesize that with ultra-high sizes of genotypic and phenotypic datasets, effective training population optimization methods, and support from other omics approaches (transcriptomics, metabolomics and proteomics) coupled with deep learning algorithms could overcome the boundaries of the current limitations to achieve the highest possible prediction accuracy, making GS an effective tool in plant breeding.
Article
Full-text available
Summary: Gibbs sampling is a Monte Carlo procedure for generating random samples from joint distributions by sampling from and updating conditional distributions. Inferences about unknown parameters are made by: 1) computing summary statistics directly from the samples; or 2) estimating the marginal density of an unknown and then obtaining summary statistics from that density. All conditional distributions needed to implement Gibbs sampling in a univariate Gaussian mixed linear model are presented in scalar algebra, so no matrix inversion is needed in the computations. For location parameters, all conditional distributions are univariate normal, whereas those for variance components are scaled inverted chi-squares. The procedure was applied to solve a Gaussian animal model for litter size in the Gamito strain of Iberian pigs. Data were 1,213 records from 426 dams. The model had farrowing season (72 levels) and parity (4) as fixed effects; breeding values (597), permanent environmental effects (426), and residuals were random. In CASE I, variances were assumed known, with REML (restricted maximum likelihood) estimates used as true parameter values. Here, means and variances of the posterior distributions of all effects were obtained, by inversion, from the mixed model equations. These exact solutions were used to check the Monte Carlo estimates given by Gibbs, using 120,000 samples. Linear regression slopes of true posterior means on Gibbs means were almost exactly 1 for fixed, additive genetic, and permanent environmental effects. Regression slopes of true posterior variances on Gibbs variances were 1.00, 1.01, and 0.96, respectively. In CASE II, variances were treated as unknown, with a flat prior assigned to them. Posterior densities of selected location parameters, variance components, heritability, and repeatability were estimated. Marginal posterior distributions of dispersion parameters were skewed, save the residual variance; the means, modes, and medians of these distributions differed from the REML estimates, as expected from theory. The conclusions are: 1) the Gibbs sampler converged to the true posterior distributions, as suggested by CASE I; 2) it provides a richer description of uncertainty about genetic parameters.
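The alternating scalar updates described in that abstract (normal draws for location parameters, scaled inverted chi-square draws for variances) can be sketched for the simplest possible model, a single Gaussian mean and variance. The function name `gibbs_mean_variance` and the Jeffreys-type prior on the variance are assumptions made for this sketch, not the cited animal model.

```python
import numpy as np

def gibbs_mean_variance(y, n_iter=2000, burn=500, seed=3):
    """Minimal Gibbs sampler for y_i ~ N(mu, s2): alternately draw
    mu | s2 (univariate normal) and s2 | mu (scaled inverted
    chi-square), returning posterior means after burn-in."""
    rng = np.random.default_rng(seed)
    n = len(y)
    mu, s2 = y.mean(), y.var()
    mus, s2s = [], []
    for it in range(n_iter):
        # mu | s2, y  ~  N(ybar, s2 / n)
        mu = rng.normal(y.mean(), np.sqrt(s2 / n))
        # s2 | mu, y  ~  n * scale / chi2(n)  (scaled inverted chi-square,
        # Jeffreys-type prior assumed for simplicity)
        scale = np.sum((y - mu) ** 2) / n
        s2 = scale * n / rng.chisquare(n)
        if it >= burn:
            mus.append(mu)
            s2s.append(s2)
    return np.mean(mus), np.mean(s2s)

y = np.random.default_rng(4).normal(5.0, 2.0, size=500)
mu_hat, s2_hat = gibbs_mean_variance(y)
```

In the full mixed model each breeding value and fixed effect gets its own scalar normal update of this form, which is why no matrix inversion is required.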
Article
Full-text available
Fixed linear models have been used for describing genotype × environment interaction (GE). Previous attempts have been made to assess the predictive ability of some linear mixed models when GE components are treated as random effects and modeled by the factor analytic (FA) model. This study compares the predictive ability of linear mixed models when the GE is modeled by the FA model with that of simple linear mixed models when the GE is not modeled. A cross‐validation scheme is used that randomly deletes some genotypes from sites; the values for these genotypes are then predicted by the different models and correlated with their observed values to assess model accuracy. A total of six multienvironment trials (one potato [Solanum tuberosum L.] trial, three maize [Zea mays L.] trials, and two wheat [Triticum aestivum L.] trials) with GE of varying complexity were used in the evaluation. Results show that for data sets with complex GE, modeling GE using the FA model improved the predictability of the model up to 6%. When GE is not complex, most models (with and without FA) gave high predictability, and models with FA did not seem to lose much predictive ability. Therefore, we concluded that modeling GE with the FA model is a good thing.
Article
Full-text available
Simulation and empirical studies of genomic selection (GS) show accuracies sufficient to generate rapid genetic gains. However, with the increased popularity of GS approaches, numerous models have been proposed and no comparative analysis is available to identify the most promising ones. Using eight wheat (Triticum aestivum L.), barley (Hordeum vulgare L.), Arabidopsis thaliana (L.) Heynh., and maize (Zea mays L.) datasets, the predictive ability of currently available GS models along with several machine learning methods was evaluated by comparing accuracies, the genomic estimated breeding values (GEBVs), and the marker effects for each model. While a similar level of accuracy was observed for many models, the level of overfitting varied widely as did the computation time and the distribution of marker effect estimates. Our comparisons suggested that GS in plant breeding programs could be based on a reduced set of models such as the Bayesian Lasso, weighted Bayesian shrinkage regression (wBSR, a fast version of BayesB), and random forest (RF) (a machine learning method that could capture nonadditive effects). Linear combinations of different models were tested as well as bagging and boosting methods, but they did not improve accuracy. This study also showed large differences in accuracy between subpopulations within a dataset that could not always be explained by differences in phenotypic variance and size. The broad diversity of empirical datasets tested here adds evidence that GS could increase genetic gain per unit of time and cost.
Article
Contents: Introduction; Gene Action and Statistical Effects; Epistasis and Molecular Interactions; Complex Molecular Interactions Underlie Quantitative Phenotypes (Sometimes); Biometrical Evidence for Epistasis; Evidence for Epistasis from Plant Evolution Studies; Molecular Marker Investigations of Epistasis; Why Is There More Evidence for Epistasis from QTL Experiments than from Biometrical Studies?; Implications of Epistasis for Plant Breeding; Literature Cited
Article
Two features distinguish the Bayesian approach to learning models from data. First, beliefs derived from background knowledge are used to select a prior probability distribution for the model parameters. Second, predictions of future observations are made by integrating the model's predictions with respect to the posterior parameter distribution obtained by updating this prior to take account of the data. For neural network models, both these aspects present difficulties: the prior over network parameters has no obvious relation to our prior knowledge, and integration over the posterior is computationally very demanding. I address the problem by defining classes of prior distributions for network parameters that reach sensible limits as the size of the network goes to infinity. In this limit, the properties of these priors can be elucidated. Some priors converge to Gaussian processes, in which functions computed by the network may be smooth, Brownian, or fractionally Brownian. Other priors converge to non-Gaussian stable processes. Interesting effects are obtained by combining priors of both sorts in networks with more than one hidden layer.
Article
A quantitative and practical Bayesian framework is described for learning of mappings in feedforward networks. The framework makes possible (1) objective comparisons between solutions using alternative network architectures, (2) objective stopping rules for network pruning or growing procedures, (3) objective choice of magnitude and type of weight decay terms or additive regularizers (for penalizing large weights, etc.), (4) a measure of the effective number of well-determined parameters in a model, (5) quantified estimates of the error bars on network parameters and on network output, and (6) objective comparisons with alternative learning and interpolation models such as splines and radial basis functions. The Bayesian "evidence" automatically embodies "Occam's razor," penalizing overflexible and overcomplex models. The Bayesian approach helps detect poor underlying assumptions in learning models. For learning models well matched to a problem, a good correlation between generalization ability and the Bayesian evidence is obtained.
Article
In plant breeding, multienvironment trials (MET) may include sets of related genetic strains. In self-pollinated species the covariance matrix of the breeding values of these genetic strains is equal to the additive genetic covariance among them. This can be expressed as an additive relationship matrix, A, multiplied by the additive genetic variance. Using mixed model methodology, the genetic covariance matrix can be estimated and best linear unbiased predictors (BLUPs) of the breeding values obtained. The effectiveness of exploiting relationships among strains tested in METs and the usefulness of these BLUPs of breeding values for simultaneously modeling the main effects of genotypes and genotype × environment interaction (GE) have not been thoroughly studied. In this study, we obtained BLUPs of breeding values using genetic variance-covariance structures constructed as the Kronecker product (direct product) of a structured matrix of genetic variances and covariances for sites and a matrix of genetic relationships between strains, A. Results are compared with those from traditional fixed effects and random effects models for studying GE ignoring genetic relationships. A CIMMYT international wheat trial was used for illustration. Results showed that direct products of factor analytic structures with matrix A efficiently model the main effects of genotypes and GE. These models showed the lowest standard error of the BLUPs (SE(BLUP)) of breeding values. Genotypes that were related to other genotypes had small SE(BLUP). Related genotypes can clearly be visualized in biplots.
Article
In self-pollinated species, the variance-covariance matrix of breeding values of the genetic strains evaluated in multienvironment trials (MET) can be partitioned into additive effects, additive × additive effects, and their interaction with environments. The additive relationship matrix A can be used to derive the additive × additive genetic variance-covariance relationships among strains, Ã. This study shows how to separate total genetic effects into additive and additive × additive and how to model the additive × environment interaction and additive × additive × environment interaction by incorporating variance-covariance structures constructed as the Kronecker product of a factor-analytic model across sites and the additive (A) and additive × additive (Ã) relationships between strains. Two CIMMYT international trials were used for illustration. Results show that partitioning the total genotypic effects into additive and additive × additive and their interactions with environments is useful for identifying wheat (Triticum aestivum L.) lines with high additive effects (to be used in crossing programs) as well as high overall production. Some lines and environments had high positive additive × environment interaction patterns, whereas other lines and environments showed a different additive × additive × environment interaction pattern.
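The Kronecker-product construction described in these two abstracts can be sketched with NumPy. The FA(1) site structure, the toy relationship matrix A, and the use of the Hadamard square of A as the additive × additive relationship matrix Ã are illustrative assumptions, not the cited studies' data.

```python
import numpy as np

# Factor-analytic FA(1) structure for s sites: Sigma = L L' + diag(psi)
s, n = 3, 4                              # sites, lines
rng = np.random.default_rng(5)
L = rng.normal(size=(s, 1))              # one-factor loadings
psi = np.abs(rng.normal(size=s)) + 0.1   # site-specific variances
Sigma = L @ L.T + np.diag(psi)

A = np.eye(n)                            # stand-in additive relationship matrix
A[0, 1] = A[1, 0] = 0.5                  # two related lines
A_aa = A * A                             # additive x additive (Hadamard square)

# Genetic covariance across all site-by-line combinations:
G_add = np.kron(Sigma, A)                # additive x environment structure
G_aa = np.kron(Sigma, A_aa)              # additive x additive x environment
```

Each Kronecker product yields an (s·n) × (s·n) covariance matrix, so the site structure and the relationship matrix jointly determine how information is shared across related lines and correlated environments.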
Book
From the Publisher: Artificial "neural networks" are now widely used as flexible models for regression and classification applications, but questions remain regarding what these models mean, and how they can safely be used when training data is limited. Bayesian Learning for Neural Networks shows that Bayesian methods allow complex neural network models to be used without fear of the "overfitting" that can occur with traditional neural network learning methods. Insight into the nature of these complex Bayesian models is provided by a theoretical investigation of the priors over functions that underlie them. Use of these models in practice is made possible using Markov chain Monte Carlo techniques. Both the theoretical and computational aspects of this work are of wider statistical interest, as they contribute to a better understanding of how Bayesian methods can be applied to complex problems. Presupposing only basic knowledge of probability and statistics, this book should be of interest to many researchers in statistics, engineering, and artificial intelligence. Software for Unix systems that implements the methods described is freely available over the Internet.