Accuracies of genomic prediction of feed efficiency traits using different prediction and validation methods in an experimental Nelore cattle population

Abstract

Animal feeding is the most important economic component of beef production systems. Selection for feed efficiency has not been effective, mainly because the phenotypes are difficult and costly to obtain. The application of genomic selection using SNP can decrease the cost of animal evaluation as well as the generation interval. The objective of this study was to compare methods for genomic evaluation of feed efficiency traits using different cross-validation layouts in an experimental beef cattle population genotyped with a high-density SNP panel (BovineHD BeadChip assay 700k, Illumina Inc., San Diego, CA). After quality control, a total of 437,197 SNP genotypes were available for 761 Nelore animals from the Institute of Animal Science, Sertãozinho, São Paulo, Brazil. The studied traits were residual feed intake, feed conversion ratio, ADG, and DMI. Methods of analysis were traditional BLUP, single-step genomic BLUP (ssGBLUP), genomic BLUP (GBLUP), and a Bayesian regression method (BayesCπ). Direct genomic values (DGV) from the last 2 methods were compared directly or in an index that combines DGV with parent average. Three cross-validation approaches were used to validate the models: 1) YOUNG, in which the partition into training and testing sets was based on year of birth and testing animals were born after 2010; 2) UNREL, in which the data set was split into 3 less related subsets and the validation was done in 1 subset at a time; and 3) RANDOM, in which the data set was randomly divided into 4 subsets (considering the contemporary groups) and the validation was done in 1 subset at a time. On average, the RANDOM design provided the most accurate predictions. Average accuracies ranged from 0.10 to 0.58 using BLUP, from 0.09 to 0.48 using GBLUP, from 0.06 to 0.49 using BayesCπ, and from 0.22 to 0.49 using ssGBLUP. The most accurate and consistent predictions were obtained using ssGBLUP for all analyzed traits. The ssGBLUP seems to be more suitable to obtain genomic predictions for feed efficiency traits in an experimental population of genotyped animals.
© 2016 American Society of Animal Science. All rights reserved.
R. M. O. Silva,*2 B. O. Fragomeni,† D. A. L. Lourenco,† A. F. B. Magalhães,* N. Irano,* R. Carvalheiro,* R. C. Canesin,‡ M. E. Z. Mercadante,‡ A. A. Boligon,§ F. S. Baldi,* I. Misztal,† and L. G. Albuquerque*
*Faculdade de Ciências Agrárias e Veterinárias, UNESP – Univ Estadual Paulista, Department of Animal Science, Jaboticabal, São Paulo, Brazil, 14884-900; †University of Georgia, Department of Animal and Dairy Science, Athens 30602-2771; ‡Centro APTA Bovinos de Corte, Animal Science Institute, Sertãozinho, São Paulo, Brazil, 13460-000; and §Department of Animal Science, Federal University of Pelotas, Pelotas, RS, Brazil, CEP 96160-000
Key words: Bos indicus, cross-validation, genomic selection, residual feed intake, single nucleotide polymorphisms
J. Anim. Sci. 2016.94
doi:10.2527/jas2016-0401
1We would like to thank the São Paulo State Foundation (FAPESP) for the grants provided (numbers 2013/01228-5 and 2009/16118-5) and APTA Beef Cattle Center – Institute of Animal Science (IZ) for the data provided.
2Corresponding author: lgalb@fcav.unesp.br
Received February 22, 2016.
Accepted May 26, 2016.
Published August 18, 2016
INTRODUCTION
The costs associated with feeding represent around 48% of the total cost of the beef cattle industry and are even greater in feedlot systems (Pendel and Herbel, 2015). Nonetheless, selection for feed efficiency traits using traditional BLUP is limited by the difficulty and cost of obtaining the phenotypes of interest (Crowley et al., 2011). Genomic selection using SNP has been especially helpful for improving quantitative traits that are hard or expensive to measure and, because of that, are not routinely recorded (e.g., feed efficiency).
The accuracy of genomic prediction is the key to the successful application of genomic selection. Accuracy is strongly dependent on many factors such as linkage disequilibrium (Meuwissen et al., 2001), allele frequency distribution (Lettre, 2011), effective population size (Goddard, 2009), heritability of the traits (Goddard, 2009), number of genotyped animals (VanRaden et al., 2009; Calus, 2010; Daetwyler et al., 2010), marker density (Moser et al., 2010), and the method used to estimate marker effects (Lourenco et al., 2014). According to Saatchi et al. (2010) and Habier et al. (2010), the number of generations separating training and validation subsets may influence the accuracy of prediction. Likewise, many authors have raised concerns about validating the model in a less related population (Pérez-Cabal et al., 2012; Saatchi et al., 2013), especially for traits that are difficult and expensive to measure.
Given the economic importance of feed efficiency traits for the livestock industry, there is a need to use the most suitable method for genomic evaluation, with a focus on increasing accuracy. Also, because these traits are costly to measure and phenotypes are therefore not always available, it is important to measure how accurate the genomic evaluation would be when applied to a less related population. The objective of this study was to compare cross-validation designs and methodologies to predict genomic breeding values for feed efficiency traits in an experimental Nelore cattle population.
MATERIAL AND METHODS
Data
The analyzed Nelore cattle data set was provided by
the Agência Paulista de Tecnologia dos Agronegócios
(APTA), Sertãozinho, São Paulo, Brazil. This herd
has 3 experimental lines: a selection line (NeS), which
has been selected for yearling weight since 1978; the
traditional line (NeT), which has been submitted to
the same selection criterion as NeS but, eventually,
receives animals from other herds; and a control line
(NeC) selected for average yearling weight.
The data set contained pedigree information on 9,551 animals (Table 1), of which 896 had phenotypes for all studied traits and 788 (born from 2004 to 2012) of those were genotyped with a high-density SNP chip (Illumina High-Density Bovine BeadChip with 777,000 SNP; BovineHD BeadChip assay 700k, Illumina Inc., San Diego, CA). Because of the breeding season adopted on this farm, all births were concentrated from October to January. Table 1 describes the pedigree, in which more than 99% of nonfounder animals have known sire and dam. The SNP markers with a minor allele frequency less than 5% or a call rate less than 98% were deleted. Also, samples with a call rate less than 90% were not considered in the analyses. After genomic data quality control, there were 437,197 SNP and 761 genotyped animals available.
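The quality-control thresholds above can be sketched as follows. This is only an illustration under stated assumptions, not the pipeline used in the study: genotypes are assumed to be coded 0/1/2 with `None` for a missing call, the function name `quality_control` is hypothetical, and filtering samples before SNP is an assumed order that the text does not specify.

```python
def quality_control(genotypes, min_maf=0.05, min_snp_call=0.98, min_sample_call=0.90):
    """Return (kept_sample_indices, kept_snp_indices).

    genotypes: list of rows (one per animal); each entry is 0, 1, 2, or None (missing).
    Assumption: samples are filtered first, then SNP are filtered on the kept samples.
    """
    n_snp = len(genotypes[0])
    # Sample filter: drop animals whose call rate is below the threshold.
    kept_samples = [
        i for i, row in enumerate(genotypes)
        if sum(g is not None for g in row) / n_snp >= min_sample_call
    ]
    if not kept_samples:
        return [], []
    kept_snps = []
    for j in range(n_snp):
        calls = [genotypes[i][j] for i in kept_samples if genotypes[i][j] is not None]
        call_rate = len(calls) / len(kept_samples)
        if not calls or call_rate < min_snp_call:
            continue
        freq = sum(calls) / (2 * len(calls))  # frequency of the counted allele
        maf = min(freq, 1 - freq)             # minor allele frequency
        if maf >= min_maf:
            kept_snps.append(j)
    return kept_samples, kept_snps
```

With the default arguments the thresholds match those stated in the text (5% minor allele frequency, 98% SNP call rate, 90% sample call rate).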
Besides the weight gain test, which has been running for more than 30 yr, the Institute of Animal Science has also been conducting a performance test for feed efficiency since 2005, which made it possible to measure many other efficiency traits. In addition to 80 individual troughs, there are 10 paddocks equipped with a GrowSafe feed system (GrowSafe Systems Ltd., Airdrie, Alberta, Canada). The GrowSafe paddocks allow measurement of individual feed intake and feeding behavior even when the animals are kept in groups. In the performance test, the animals were evaluated for individual feed efficiency for at least 56 d (average of 83.14 ± 14.66 d), preceded by an adaptation period of 28 d, in individual pens (n = 683) or collective pens equipped with a GrowSafe system (n = 213). According to Archer and Bergh (2000), feed intake requires approximately 56 to 70 d to be measured accurately, whereas feed conversion ratio (FCR) and residual feed intake (RFI) both require around 70 to 84 d. The groups of animals entering the test were separated by sex, with an average of 286.48 ± 38.89 d of age (just after weaning), initial weight of 233.56 ± 48.71 kg, and final weight of 314.16 ± 58.34 kg. The animals were weighed every 14 d after fasting for 12 h (tests in 2005 and 2006) and every 28 d after fasting for males (2007 and 2008) and females (2009 to 2011). From 2009 to 2012, males were weighed weekly without fasting, with 3 weekly weight recordings on consecutive days in 2009 and 2010, 2 weekly weight recordings on consecutive days in 2011, and 1 weight recording per week in 2012. In 2013, males were weighed without fasting every 14 d. In 2012, females were weighed on 2 consecutive days every 15 d. Therefore, each animal was weighed at least 4 times with prior fasting or at least 7 times without prior fasting. The diet was based on corn silage, Brachiaria hay, soy bran, corn bran, salt, and urea, with 66.8% total digestible nutrients (TDN) and 13.2% CP, which allows ADG of 1.1 kg/d. The analyzed traits were ADG, DMI, RFI, and FCR.
Table 1. Structure of pedigree information
Category                           Number of animals
Animals in total                   9,551
Sires in total                     320
Dams in total                      2,163
Founders                           407
Nonfounders                        9,144
Animals with only known sire       16
Animals with only known dam        0
Animals with known sire and dam    9,128
NeC1                               1,536
NeS1                               2,946
NeT1                               3,925
1NeC = control line; NeS = selection line; NeT = traditional line.
After the performance test, the ADG was obtained by the linear regression of weight on days in test (DIT):
$$y_i = \alpha + \beta \times \mathrm{DIT}_i + \varepsilon_i,$$
in which $y_i$ is the weight of the animal at the $i$th observation, $\alpha$ is the intercept of the regression equation that represents the initial weight, $\beta$ is the linear regression coefficient that represents the ADG, $\mathrm{DIT}_i$ is the day in the performance test of the $i$th observation, and $\varepsilon_i$ is the error associated with each observation. The average metabolic weight ($\mathrm{MW}^{0.75}$) was given by
$$\mathrm{MW}^{0.75} = [\alpha + \beta \times (\mathrm{DIT}/2)]^{0.75}.$$
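The ADG regression and the metabolic weight formula above can be illustrated for a single animal with ordinary least squares. This is a minimal sketch: the function name is hypothetical, and DIT in the metabolic weight formula is assumed to be the total number of days in test (the last weighing day), so that the bracketed term is the predicted mid-test weight.

```python
def adg_and_metabolic_weight(days, weights):
    """Fit weight = alpha + beta * day by least squares for one animal.

    Returns (alpha, beta, mw075): intercept (initial weight, kg),
    slope (ADG, kg/d), and average metabolic weight.
    """
    n = len(days)
    mean_d = sum(days) / n
    mean_w = sum(weights) / n
    sxx = sum((d - mean_d) ** 2 for d in days)
    sxy = sum((d - mean_d) * (w - mean_w) for d, w in zip(days, weights))
    beta = sxy / sxx                # ADG
    alpha = mean_w - beta * mean_d  # estimated initial weight
    dit = max(days)                 # assumption: DIT = total days in test
    mw075 = (alpha + beta * dit / 2) ** 0.75
    return alpha, beta, mw075
```

For example, an animal weighed every 14 d that gains exactly 1 kg/d from 230 kg over 56 d would have an ADG of 1.0 and a metabolic weight of 258 raised to the power 0.75.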
The model used for the estimation of RFI was derived from adjustments suggested by Koch et al. (1963) for DMI. The RFI was taken as the residual of the linear regression of DMI on ADG and metabolic weight within each contemporary group (CG; sex, year of birth, and pen), as described by Grion et al. (2014):
$$\mathrm{DMI} = \beta_0 + \mathrm{CG} \times \beta_{\mathrm{CG}} + (\mathrm{ADG} \times \mathrm{CG}) \times \beta_{\mathrm{CG} \times \mathrm{ADG}} + (\mathrm{CG} \times \mathrm{MW}^{0.75}) \times \beta_{\mathrm{CG} \times \mathrm{MW}} + \varepsilon,$$
in which $\beta_0$ is the intercept; $\beta_{\mathrm{CG}}$, $\beta_{\mathrm{CG} \times \mathrm{ADG}}$, and $\beta_{\mathrm{CG} \times \mathrm{MW}}$ are regression coefficients of the CG and of the interactions between CG and the covariates ADG and $\mathrm{MW}^{0.75}$, respectively; and $\varepsilon$ is the residual of the equation (i.e., RFI). The FCR was expressed as the ratio of DMI:ADG as described by Fairfull and Chambers (1984).
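As an illustration of the RFI definition above, the sketch below fits a separate regression of DMI on ADG and metabolic weight within each contemporary group and returns the residuals. Fitting group by group gives the same residuals as a single model with group-specific intercepts and slopes. The function names and the record layout (dictionaries with keys `cg`, `dmi`, `adg`, `mw`) are assumptions made for the example, and each group is assumed to have at least 3 records that are not collinear.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def rfi_within_groups(records):
    """Return RFI (regression residual) for each record, in input order."""
    rfi = [None] * len(records)
    groups = {}
    for i, rec in enumerate(records):
        groups.setdefault(rec["cg"], []).append(i)
    for idx in groups.values():
        # Design matrix for this group: intercept, ADG, metabolic weight.
        x = [[1.0, records[i]["adg"], records[i]["mw"]] for i in idx]
        y = [records[i]["dmi"] for i in idx]
        xtx = [[sum(row[a] * row[b] for row in x) for b in range(3)] for a in range(3)]
        xty = [sum(row[a] * yi for row, yi in zip(x, y)) for a in range(3)]
        coef = solve(xtx, xty)
        for i, row, yi in zip(idx, x, y):
            rfi[i] = yi - sum(c * v for c, v in zip(coef, row))
    return rfi


def fcr(dmi, adg):
    """Feed conversion ratio as the ratio DMI:ADG."""
    return dmi / adg
```

Because each within-group regression includes an intercept, the RFI values sum to zero within every contemporary group, which agrees with the mean RFI of 0.00 reported in Table 3.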
Estimation of Heritability
Variance components were estimated for the feed efficiency traits using an animal model under Bayesian inference. The model for RFI and FCR included fixed effects of CG and month of birth; age of animal (linear effect) and age of dam (linear and quadratic effects) as covariables; and a random additive animal effect. Also, the linear effects of 2 principal components calculated from the genomic relationship matrix (G) were included as covariables to correct for population substructure, as suggested by Price et al. (2006). Figure 1 shows the principal components analysis with the substructure of the analyzed population. The animals shown in blue are from the NeC, the animals shown in red are from the NeS, and the animals shown in green are from the NeT. The model used for ADG and DMI was the same as that used for RFI and FCR, plus the quadratic effect of age of animal as a covariable.
Phenotypes, pedigree, and genotypes were used for variance component estimation under single-step genomic BLUP (ssGBLUP). Therefore, in the animal model, the inverse of the numerator relationship matrix ($A^{-1}$) was replaced by $H^{-1}$, which combines pedigree and genomic information. Matrix $H^{-1}$ can be obtained as follows (Aguilar et al., 2010):
$$H^{-1} = A^{-1} + \begin{bmatrix} 0 & 0 \\ 0 & G^{-1} - A_{22}^{-1} \end{bmatrix},$$
in which $G^{-1}$ is the inverse of the genomic relationship matrix and $A_{22}^{-1}$ is the inverse of the pedigree-based numerator relationship matrix for genotyped animals.
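The construction of the inverse of H described above can be sketched as a simple matrix assembly. The example assumes that the three inverses have already been computed elsewhere and are passed in as nested lists; the function name `h_inverse` and the `genotyped` index list are hypothetical. Using an index list means the genotyped animals do not have to be ordered last, as the block form of the equation implies.

```python
def h_inverse(a_inv, g_inv, a22_inv, genotyped):
    """Assemble H^-1 = A^-1 + [[0, 0], [0, G^-1 - A22^-1]].

    a_inv: inverse numerator relationship matrix for all animals.
    g_inv, a22_inv: inverses for genotyped animals, ordered as in `genotyped`.
    genotyped: positions of the genotyped animals within a_inv.
    """
    h_inv = [row[:] for row in a_inv]  # copy so that a_inv is not modified
    for r, i in enumerate(genotyped):
        for c, j in enumerate(genotyped):
            h_inv[i][j] += g_inv[r][c] - a22_inv[r][c]
    return h_inv
```

Only the block for genotyped animals changes; when the genomic and pedigree inverses for those animals coincide, the result reduces to the pedigree inverse.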
The general model can be represented as follows:
$$y = Xb + Za + e,$$
in which $y$ is the vector of phenotypic observations, $X$ is an incidence matrix relating phenotypes to fixed effects, $b$ is the vector of fixed effects, $Z$ is an incidence matrix that relates animals to phenotypes, $a$ is the vector of direct additive genetic effects, and $e$ is a vector of residual effects. Assumptions were $E[y] = Xb$ and $\mathrm{var}[y] = Z \Sigma Z' + R$, with $\Sigma = \mathrm{var}(a) = H\sigma^2_a$ and $R = I\sigma^2_r$ in the single-trait model, in which $\sigma^2_a$ is the additive genetic variance, $\sigma^2_r$ is the residual variance, $H$ is described above, and $I$ is the appropriate identity matrix. An inverted $\chi^2$ distribution was used as the prior for the direct additive genetic and residual variances. The conditional posterior distributions of the $b$, $a$, and $e$ effects were sampled from a multivariate normal distribution.
The analysis consisted of a single chain of 500,000 cycles with a “burn-in” of 100,000 cycles, taking a sample every 10 iterations. Therefore, 40,000 samples were used to obtain the parameters. Chain convergence was assessed by visual examination. Analyses were performed using GIBBS2F90 (Misztal et al., 2002; Aguilar et al., 2010). The a posteriori estimates were obtained using the application POSTGIBBSF90 (Misztal et al., 2002).
Figure 1. Distribution of animals by selection line, provided by principal component analysis using a genomic relationship matrix. NeC = control line; NeS = selection line; NeT = traditional line; PC = principal component.
Methods of Genomic Analysis
The studied methods for genomic analysis were
genomic BLUP (GBLUP), ssGBLUP, and BayesCπ,
as described below.
Genomic BLUP
For this multistep analysis, first (step a) a traditional genetic evaluation was run using a single-trait animal model (with the same fixed effects used to estimate variance components) to obtain EBV and fixed effect solutions, which were used to compute adjusted phenotypes. The model can be represented as follows:
$$y = X\beta + Zu + e,$$
in which $y$ is the vector of phenotypes, $\beta$ is the vector of fixed effects, and $u$ is the vector of direct additive genetic effects. Considering an infinitesimal model, $\mathrm{var}(u) = A\sigma^2_u$, in which $A$ is the numerator relationship matrix obtained from pedigree information; $\mathrm{var}(e) = I\sigma^2_e$, in which $I$ is an identity matrix; and $X$ and $Z$ are incidence matrices for effects contained in $\beta$ and $u$, respectively. Although the GBLUP and BayesCπ methods allow fixed effects to be incorporated in the model, the adjusted phenotype was used as a pseudophenotype in both cases to simplify the process and reduce the time of genomic analysis. In addition, the EBV was previously tested in this study as a pseudophenotype in the model to obtain the direct genomic values (DGV); however, it provided an evaluation at least 10% less accurate than when the adjusted phenotype was used.
The next step (b) consisted of obtaining DGV by the model shown below:
$$y^* = 1\mu + Zg + e,$$
in which $y^*$ is the vector of phenotypes adjusted for fixed effects, $\mu$ is the overall mean, $1$ is a vector of ones, $Z$ is a matrix linking phenotypes to individuals, $g$ is a vector of DGV, and $e$ is a vector of residual effects. It was assumed that $g \sim N(0, G\sigma^2_g)$, in which $\sigma^2_g$ is the variance of DGV and $G$ is the genomic relationship matrix. Random residuals were assumed to be $e \sim N(0, I\sigma^2_e)$, in which $I$ and $\sigma^2_e$ were defined as before.
The G matrix can be obtained as described by VanRaden (2008):
$$G = \frac{(M - P)(M - P)'}{2 \sum_{j=1}^{m} p_j (1 - p_j)},$$
in which $M$ is a matrix of marker alleles with $n$ rows ($n$ = total number of genotyped animals) and $m$ columns ($m$ = total number of markers) and $P$ is a matrix containing 2 times the observed frequency of the second allele ($p_j$). Elements of $M$ are set to 0 or 2 for the 2 homozygotes and to 1 for the heterozygote.
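The genomic relationship matrix defined above can be computed directly from 0/1/2 genotype codes. The sketch below uses plain Python lists for clarity rather than speed; the function name is hypothetical, and markers are assumed to be polymorphic (as they would be after quality control), so the scaling term is not zero.

```python
def genomic_relationship_matrix(m):
    """VanRaden (2008) G from genotypes coded 0/1/2 (rows = animals, columns = markers)."""
    n, n_markers = len(m), len(m[0])
    # p_j: observed frequency of the counted (second) allele at marker j.
    p = [sum(row[j] for row in m) / (2 * n) for j in range(n_markers)]
    # Center each genotype by 2 * p_j, i.e. subtract the matching column of P.
    z = [[row[j] - 2 * p[j] for j in range(n_markers)] for row in m]
    # Scaling term: 2 * sum of p_j * (1 - p_j) over markers.
    scale = 2 * sum(pj * (1 - pj) for pj in p)
    return [
        [sum(a * b for a, b in zip(z[i], z[k])) / scale for k in range(n)]
        for i in range(n)
    ]
```

Because allele frequencies are estimated from the same animals, every column of the centered matrix sums to zero, and therefore every row of G sums to zero as well.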
BayesCπ
All steps used to predict the genomic value using GBLUP (described above) were also applied for BayesCπ. The main difference was in step b, in which the SNP effects were estimated according to the assumptions presented by Habier et al. (2011). These authors presented a methodology called BayesCπ, which assumes that a SNP effect is 0 with probability π and that this probability can be estimated from the analyzed data.
The DGV was obtained from the SNP effects estimated by the model shown below:
$$y^* = 1\mu + Z^* g^* + e,$$
in which $y^*$ is the vector of phenotypes adjusted for fixed effects, $\mu$ is the overall mean, $1$ is a vector of ones, $Z^*$ is a matrix linking phenotypes to the marker genotypes of individuals, $g^*$ is a vector of marker effects, and $e$ is a vector of residual effects.
BayesCπ assumes a mixture distribution for marker effects and specifies a common variance for all loci, using the same model equation as used in GBLUP but considering the elements of $u$ as $\sum_{i=1}^{N} z_i g_i^* I_i$, in which $z_i$ is the genotype of the $i$th marker, coded as the number of copies of the reference allele; $g_i^*$ is the effect of marker $i$; and $I_i$ is an indicator variable that is equal to 1 if the $i$th marker has a nonzero effect on the trait and 0 otherwise. In this study, a binomial distribution with probability π was assumed for $I_i$ and an informative beta distribution was assigned to π (implying that this parameter was estimated from the analyzed data set, with α ranging from $0.10 \times 10^{-4}$ to 0.882 and β = 0.50).
The DGV was calculated for each animal using the following formula:
$$\mathrm{DGV}_i = \sum_{j=1}^{m} Z_{ij} g_j^*,$$
in which $g_j^*$ is the estimated effect of marker $j$.
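The DGV formula above is a sum of genotype codes weighted by estimated marker effects. A minimal sketch, assuming that each row of `z` holds the genotype codes of one animal in the same marker order as `marker_effects` (both names are hypothetical):

```python
def direct_genomic_values(z, marker_effects):
    """DGV_i = sum over markers j of Z_ij * g_j for every animal i."""
    return [
        sum(zij * gj for zij, gj in zip(row, marker_effects))
        for row in z
    ]
```

In BayesCπ, a marker whose indicator is 0 contributes nothing to this sum, which is the same as giving it an effect of 0.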
The prediction equations obtained using the GBLUP and BayesCπ methods were implemented in the GS3 software developed by Legarra et al. (2010), which is available at https://github.com/alegarra/gs3 (accessed 10 December 2014). Predictions using multiple steps (BayesCπ and GBLUP) were calculated either with (genomic EBV [GEBV]) or without (DGV) an index combining DGV with parent average, using the vector of phenotypes adjusted for fixed effects as the response variable.
The analysis consisted of a single chain of 500,000 cycles with a “burn-in” of 50,000 cycles, taking a sample every 10 iterations. Therefore, 45,000 samples were used to obtain the parameters. Chain convergence was assessed by visual examination.
Single-Step Genomic BLUP
The model used in ssGBLUP was the same as used
in the BLUP analysis, except for using the H matrix in-
stead of the A matrix. The single-step procedure con-
sists of combining A and G into a single matrix (H) as
already described above. The analyses with ssGBLUP
were performed using BLUPF90 software, available
at http://nce.ads.uga.edu/wiki/doku.php (accessed 10
December 2014).
Genomic EBV
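The GEBV of all validation animals were calculated by an index combining parent average and DGV (VanRaden et al., 2009):
$$\mathrm{GEBV}_i = b_{\mathrm{DGV}} \mathrm{DGV}_i + b_{\mathrm{PA}} \mathrm{PA}_i.$$
The weights ($b$) for DGV and parent average (PA) were obtained as shown by Guo et al. (2010):
$$I = b_{\mathrm{DGV}} \mathrm{DGV} + b_{\mathrm{PA}} \mathrm{PA},$$
in which $\mathrm{PA} = (1/2)(\mathrm{EBV}_{\mathrm{sire}} + \mathrm{EBV}_{\mathrm{dam}})$, using standard selection index methodology (Hazel, 1943):
$$\begin{bmatrix} b_{\mathrm{DGV}} \\ b_{\mathrm{PA}} \end{bmatrix} = \begin{bmatrix} V_{\mathrm{DGV}} & \mathrm{COV}_{\mathrm{DGV,PA}} \\ \mathrm{COV}_{\mathrm{DGV,PA}} & V_{\mathrm{PA}} \end{bmatrix}^{-1} \begin{bmatrix} \mathrm{COV}_{\mathrm{DGV}} \\ \mathrm{COV}_{\mathrm{PA}} \end{bmatrix} = \begin{bmatrix} 1 & r R_{\mathrm{PA}} / R_{\mathrm{DGV}} \\ r R_{\mathrm{DGV}} / R_{\mathrm{PA}} & 1 \end{bmatrix}^{-1} \begin{bmatrix} 1 \\ 1 \end{bmatrix},$$
in which $r$ is the correlation between DGV and PA, $R_{\mathrm{DGV}}$ is the accuracy of DGV, and $R_{\mathrm{PA}}$ is the accuracy of PA.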
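The index weights described above follow the usual selection index form, in which the weights solve a 2 × 2 linear system. The sketch below is a generic illustration, not the authors' code: the function name and argument names are hypothetical, and `cov_dgv` and `cov_pa` are assumed to be the covariances of DGV and PA with the quantity being predicted, which is the standard selection index reading.

```python
def index_weights(v_dgv, v_pa, cov_dgv_pa, cov_dgv, cov_pa):
    """Solve P b = g for b = (b_dgv, b_pa).

    P = [[v_dgv, cov_dgv_pa], [cov_dgv_pa, v_pa]] is the (co)variance matrix of
    the two information sources; g = [cov_dgv, cov_pa] holds their covariances
    with the target (assumed interpretation).
    """
    det = v_dgv * v_pa - cov_dgv_pa ** 2
    b_dgv = (v_pa * cov_dgv - cov_dgv_pa * cov_pa) / det
    b_pa = (v_dgv * cov_pa - cov_dgv_pa * cov_dgv) / det
    return b_dgv, b_pa


def gebv(b_dgv, b_pa, dgv, pa):
    """Combine DGV and parent average with the index weights."""
    return b_dgv * dgv + b_pa * pa
```

When DGV and PA are uncorrelated and each predictor's covariance with the target equals its own variance, both weights equal 1; a positive covariance between the two sources shrinks the weights because part of their information overlaps.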
Cross-Validation
Three cross-validation approaches were used to validate the models: 1) RANDOM, in which the data set was randomly divided into 4 subsets (considering the CG) and the validation was done in 1 subset at a time; 2) YOUNG, in which the partition into training and testing sets was based on year of birth and testing animals were born after 2010, an approach designed mainly to show how accurate the prediction of the next generation would be; and 3) UNREL, in which the data set was split into 3 less related subsets and the validation was done in 1 subset at a time. For the UNREL design, the training and validation subsets were split based on a K-means approach (Ding and He, 2004), which divides the data into less related groups. In this case, the principal component analysis of G was used to determine how the folds would be divided. Figure 2 shows which animals were used for training and testing in all folds of cross-validation. The animals in black were in the training subset and the animals in gray were in the testing subset.
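The YOUNG and RANDOM partitions described above can be sketched as follows. The function names and input dictionaries are hypothetical. For RANDOM, the phrase "considering the CG" is read here as spreading the animals of each contemporary group across the 4 folds, which is only one possible interpretation. The UNREL design (K-means on principal components of G) is not shown.

```python
import random


def young_split(birth_year, cutoff=2010):
    """Training = animals born up to the cutoff year; testing = animals born after it."""
    train = [animal for animal, year in birth_year.items() if year <= cutoff]
    test = [animal for animal, year in birth_year.items() if year > cutoff]
    return train, test


def random_folds(cg_of, k=4, seed=1):
    """Randomly assign animals to k folds, spreading each contemporary group over folds.

    cg_of: dict mapping animal identifier to contemporary group (assumed layout).
    """
    rng = random.Random(seed)
    by_cg = {}
    for animal, cg in cg_of.items():
        by_cg.setdefault(cg, []).append(animal)
    folds = [[] for _ in range(k)]
    pos = 0
    for cg in sorted(by_cg):
        members = sorted(by_cg[cg])
        rng.shuffle(members)          # random order within the group
        for animal in members:
            folds[pos % k].append(animal)  # deal animals round-robin into folds
            pos += 1
    return folds
```

Each fold then serves once as the testing subset while the remaining folds form the training subset.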
Figure 2. Distribution of train and test groups in each cross-validation design made by principal components analysis based on a genomic matrix. RANDOM = the data set was randomly divided into 4 subsets (considering the contemporary groups) and the validation was done in 1 subset at a time; YOUNG = the partition into training and testing sets was based on year of birth and testing animals were born after 2010; UNREL = the data set was split into 3 less related subsets and the validation was done in 1 subset at a time; PC = principal component.
As expected, the average relationships between the test and training subsets were smallest in UNREL, followed by RANDOM and YOUNG (Table 2). Table 2 shows the number of animals in each cross-validation layout and the proportion of animals in each class of relationship coefficients (f) between training and test folds. Even though this study used animals from only 1 experimental farm, the average of all relationship coefficients between the training and the testing population was not high (around 0.06 for RANDOM and YOUNG).
The relationship coefcients between animals
were calculated by CFC software (Sargolzaei et al.,
2006), which uses the A matrix.
The accuracy of DGV/GEBV (or EBV for BLUP) was calculated as the Pearson correlation between the phenotype adjusted for fixed effects (aY) and the genomic breeding value, divided by the square root of heritability ($h$):
$$\mathrm{acc} = \frac{\mathrm{corr}(aY, \mathrm{GEBV\ or\ DGV})}{h}.$$
This adjustment was made to account for the fact that adjusted phenotypes were used instead of the true breeding value (Pryce et al., 2012).
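The accuracy measure above can be written in a few lines. This is a minimal sketch with hypothetical function names; `h2` is the heritability estimate, so its square root is the divisor $h$ in the formula.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def prediction_accuracy(adjusted_phenotype, predicted, h2):
    """Correlation between adjusted phenotype and prediction, divided by sqrt(h2)."""
    return pearson(adjusted_phenotype, predicted) / sqrt(h2)
```

Because the correlation is divided by a number smaller than 1, the resulting accuracy is always larger in magnitude than the raw correlation, and the adjustment is strongest for low heritability traits.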
Regression of Phenotype on Breeding Value
(EBV, Genomic EBV, or Direct Genomic Values)
An alternative way to evaluate the extent of prediction bias is to compare the regression coefficient of aY on the predicted breeding value (EBV, GEBV, or DGV) with its expected value of 1 for each trait (Saatchi et al., 2011). Hence, the regression coefficients were calculated for each trait using simple linear regression of the adjusted phenotype on DGV/GEBV/EBV.
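The bias check described above is the slope of a simple linear regression of the adjusted phenotype on the prediction. A minimal sketch with a hypothetical function name:

```python
def regression_slope(adjusted_phenotype, predicted):
    """Slope of the simple linear regression of adjusted phenotype on predicted value."""
    n = len(predicted)
    mean_x = sum(predicted) / n
    mean_y = sum(adjusted_phenotype) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predicted, adjusted_phenotype))
    sxx = sum((x - mean_x) ** 2 for x in predicted)
    return sxy / sxx
```

A slope of 1 indicates that differences among predictions match differences among adjusted phenotypes on average; a slope above 1 indicates that predictions vary too little, and a slope below 1 that they vary too much.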
RESULTS AND DISCUSSION
Table 3 shows the additive variances and heritability estimates of the analyzed traits. The estimated variance components indicate that the studied traits are moderately to highly heritable. The heritabilities estimated for RFI and FCR were moderate, whereas those estimated for DMI and ADG were high, which is similar to what was reported by Herd and Bishop (2000), Bolormaa et al. (2013), and Nkrumah et al. (2014). Therefore, these results indicate that a substantial part of the total phenotypic variance is due to additive genetic effects, which means that these traits may respond quickly to selection.
Among the studied methods, ssGBLUP provided more accurate predictions than multistep procedures for all studied traits in the RANDOM design (Table 4). The improvements in accuracy of predictions provided by ssGBLUP were greater for low heritability traits. This probably means that the inclusion of more than 15% of phenotypic information from ungenotyped animals, added to genomic and phenotypic information from genotyped animals, is more effective for those traits. For low heritability traits, genetic evaluation relies more on information from relatives than on individual records. This could explain the higher accuracy gain for those traits with the inclusion of 15% of phenotypic information from ungenotyped animals. According to Lourenco et al. (2014), ssGBLUP has an advantage over multistep methods mainly because it uses phenotypes rather than pseudophenotypes and accounts for the entire population structure to estimate GEBV. Onogi et al. (2014) also concluded that the implementation of genomic selection by ssGBLUP provided more accurate predictions than traditional BLUP for carcass traits even using only genotyped sires of the Japanese Black cattle breed. Comparing GBLUP and ssGBLUP in a Holstein population, Aguilar et al. (2010) concluded that genomic evaluations using ssGBLUP were as accurate as those using a multistep procedure and that its advantage over other methods should increase in the future when the animals are preselected by genotype information. It is important to highlight that if the SE of prediction accuracy were considered, the accuracies would not be significantly different. However, the discussion is based on the average accuracy.
Table 2. Descriptive statistics of data set used for training and validation, and proportion of animals in each class of relationship coefficients (f) between training and testing fold of each cross-validation layout
                        Relationship coefficients, %
Layout1    Nt2   Nv3    f < 0.10   0.10 < f < 0.25   0.25 < f < 0.50   f > 0.50   Within4
RANDOM_1   617   144    86.02      11.39             2.50              0.09       0.09
RANDOM_2   562   199    85.30      12.59             2.01              0.10       0.07
RANDOM_3   592   169    87.37      10.65             1.89              0.09       0.07
RANDOM_4   512   249    85.12      12.63             2.16              0.09       0.07
YOUNG      500   261    85.83      12.85             1.17              0.15       0.07
UNREL_1    670   91     99.58      0.35              0.07              -          0.18
UNREL_2    424   337    95.74      3.47              0.77              0.03       0.10
UNREL_3    428   333    95.75      3.45              0.77              0.03       0.11
1Cross-validation approaches: RANDOM, in which the data set was randomly divided into 4 subsets (considering the contemporary groups) and the validation was done in 1 subset at a time; YOUNG, in which the partition into training and testing sets was based on year of birth and testing animals were born after 2010; and UNREL, in which the data set was split into 3 less related subsets and the validation was done in 1 subset at a time.
2Nt = number of animals in the training set.
3Nv = number of animals in the validation subset.
4The average relationship coefficient within each fold of the validation subset.
Table 3. Additive genetic variance and heritability estimates (SE) for residual feed intake (RFI; kg DM/d), feed conversion ratio (FCR; kg DM), ADG (kg/d), and DMI (kg)
Trait   Mean1   SD     Additive genetic variance   Heritability (SE)
RFI     0.00    0.58   0.29                        0.17 (0.07)
FCR     7.04    1.77   0.14                        0.11 (0.06)
ADG     1.00    0.26   0.01                        0.39 (0.08)
DMI     6.69    1.24   0.31                        0.43 (0.08)
1The average of each trait.
The results also showed that the inclusion of marker information can increase the accuracy of predictions, especially for RFI, which had the highest increase in accuracy over traditional BLUP. Higher prediction accuracies were observed for ADG and DMI, which have the highest heritabilities among the studied traits ($h^2$ = 0.39 and $h^2$ = 0.43, respectively), with accuracies ranging from 0.45 to 0.47 and from 0.45 to 0.49, respectively. Similar results were reported by Bolormaa et al. (2013), with the most accurate predictions obtained for the most heritable traits. Also, studying traits with similar heritabilities, Lourenco et al. (2015) reported lower accuracy for the trait that was under strong selection. An alternative to improve accuracy of genomic prediction is to calculate the GEBV using an index composed of DGV and PA (VanRaden et al., 2012). Therefore, predictions using multiple steps (BayesCπ and GBLUP) were calculated either with (GEBV) or without (DGV) such an index. Table 4 shows the accuracy and bias of DGV/GEBV for the studied traits and methodologies.
Using GBLUP, the predictions of GEBV were less accurate than those of DGV for all analyzed traits, except for FCR. This probably means that the contribution of parent average is more effective for the prediction accuracy of less heritable traits (FCR, $h^2$ = 0.11). Nonetheless, the regression coefficients for the GEBV predictions were greater than 1.0, suggesting that all predictions were underestimated (Neves et al., 2014).
The accuracies of GEBV obtained using BayesCπ were higher than those for DGV, mostly for the low heritability traits (RFI, $h^2$ = 0.17, and FCR, $h^2$ = 0.11). Using BayesCπ, predictions of GEBV for ADG and DMI were as accurate as those using the single-step methodology. However, BayesCπ predictions of low heritability traits were biased. On the other hand, the estimates of GEBV for traits with high heritability (ADG, $h^2$ = 0.39, and DMI, $h^2$ = 0.43) were equally or only slightly more accurate than predictions of DGV. These results differ from those found by Lourenco et al. (2014), who reported greater accuracies for PA in a study using a small genotyped dairy population. However, according to Bijma (2012), accuracy of PA is strongly reduced by selection. So, because 88% of the studied population has undergone selection, the accuracy and bias of prediction using an index with PA could probably be affected by selection.
In general, the regression coefficients were close to
1, except for the low heritability traits, especially
with BayesCπ, for which most analyses gave coefficients
over 1, meaning that predictions were underestimated.
Similar results were reported by Neves et al. (2014),
where BayesC and Bayesian Lasso provided the most
underestimated predictions compared with GBLUP. A
decrease in bias of prediction with a larger number of
genotyped and recorded animals is expected. Previous
results with data from this same population but with a
smaller number of genotyped animals showed higher bias
of prediction, especially for the low heritability
traits (Silva et al., 2013).
Table 4. Accuracies of direct genomic values (DGV)/genomic EBV (GEBV) by studied traits and methodologies
by RANDOM (model in which the data set was randomly divided into 4 subsets [considering the contemporary
groups] and the validation was done in each subset at a time) cross-validation layout and regression coefficient
of adjusted phenotype on DGV/GEBV (in parentheses)
         GBLUP2                    BayesCπ3                  ssGBLUP4
Traits1  GEBV         DGV          GEBV         DGV          GEBV         BLUP
RFI      0.29 (1.62)  0.36 (0.90)  0.40 (2.13)  0.35 (1.60)  0.45 (1.16)  0.23 (0.07)
FCR      0.32 (2.92)  0.23 (0.78)  0.43 (3.82)  0.23 (3.10)  0.30 (0.99)  0.29 (0.08)
ADG      0.44 (1.12)  0.46 (0.83)  0.46 (1.13)  0.46 (1.09)  0.47 (0.68)  0.45 (0.09)
DMI      0.45 (1.04)  0.48 (0.83)  0.49 (1.11)  0.48 (1.05)  0.49 (0.75)  0.45 (0.08)
1RFI = residual feed intake; FCR = feed conversion ratio.
2GBLUP = genomic BLUP.
3BayesCπ is the Bayesian Cπ methodology.
4ssGBLUP = single-step GBLUP.
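As an illustration of the two statistics discussed above, the sketch below computes a prediction accuracy and a regression coefficient for one validation fold. It assumes, as is common in this literature, that accuracy is the correlation between adjusted phenotype and prediction divided by the square root of the heritability. The function name, the simulated data, and that accuracy convention are assumptions for illustration and are not taken from the analysis pipeline of this study.

```python
import numpy as np

def validation_stats(adj_phenotype, prediction, h2):
    """Return (accuracy, slope) for one validation fold.

    accuracy: corr(adjusted phenotype, prediction) / sqrt(h2)  (assumed convention)
    slope:    regression coefficient of adjusted phenotype on prediction;
              values above 1 indicate underestimated (deflated) predictions,
              values below 1 indicate overestimated (inflated) predictions.
    """
    y = np.asarray(adj_phenotype, dtype=float)
    g = np.asarray(prediction, dtype=float)
    r = np.corrcoef(y, g)[0, 1]
    accuracy = r / np.sqrt(h2)
    slope = np.cov(y, g)[0, 1] / np.var(g, ddof=1)
    return accuracy, slope

# Hypothetical data: predictions shrunk too strongly, so the slope exceeds 1.
rng = np.random.default_rng(1)
dgv = rng.normal(size=200)
adj_y = 1.6 * dgv + rng.normal(scale=4.0, size=200)
acc, b = validation_stats(adj_y, dgv, h2=0.17)
print(f"accuracy = {acc:.2f}, regression coefficient = {b:.2f}")
```

A slope well above 1, as for the GEBV of RFI and FCR in Table 4, means that the spread of the predictions is too narrow relative to the adjusted phenotypes.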
Among the studied cross-validation designs, RANDOM
provided the most accurate genomic predictions, ranging
from 0.23 to 0.49 (Table 5). This probably happened
because the RANDOM design had the highest proportion of
additive relationships between training and testing
over 0.25 (Table 2). Also, in the RANDOM design, about
2.14% of relationship coefficients between animals in
the training and testing subsets are between 0.25 and
0.50 (Table 2). The relationship within each fold in
the RANDOM design was weak (Table 2). According to
Pszczola et al. (2012), higher accuracies are obtained
when relationships between animals in the training
population are weak and the relationship between the
training and validation populations is high. In both
subsets (training and testing), animals from different
generations were used, which allows validating the
model on close relatives and/or on animals from the
same generation and the same herd. Comparing different
cross-validation layouts in a dairy cattle population,
Pérez-Cabal et al. (2012) also found the highest
accuracies in the RANDOM design and concluded that the
number of close relatives in the training and testing
subsets of cross-validation influences accuracy for
both high and low heritability traits. According to
Pryce et al. (2012) and Chen et al. (2013), the ability
to predict genomic breeding values within and between
populations/breeds depends on the strength of
relationships between all pairwise combinations of
individuals. More accurate predictions can be obtained
when the level of genomic relatedness between
individuals is high.
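The relatedness argument above can be made concrete with a short sketch. Assuming a pedigree-based additive relationship matrix A and one contemporary-group label per animal are available, it assigns animals to 4 random folds and reports the share of training-testing relationship coefficients above 0.25. Spreading each contemporary group across folds is only one possible reading of "considering the contemporary groups"; the function names and the toy matrix are hypothetical.

```python
import numpy as np

def assign_random_folds(contemporary_group, n_folds=4, seed=0):
    """Randomly assign animals to folds, spreading each contemporary
    group as evenly as possible across folds (an assumed scheme)."""
    rng = np.random.default_rng(seed)
    cg = np.asarray(contemporary_group)
    folds = np.empty(cg.shape[0], dtype=int)
    for group in np.unique(cg):
        idx = np.flatnonzero(cg == group)
        rng.shuffle(idx)
        # Deal the shuffled animals to folds in turn, from a random start.
        start = rng.integers(n_folds)
        folds[idx] = (start + np.arange(idx.size)) % n_folds
    return folds

def share_of_close_relationships(A, folds, test_fold, threshold=0.25):
    """Proportion of training-by-testing relationship coefficients
    that exceed the threshold when test_fold is held out."""
    A = np.asarray(A, dtype=float)
    test = np.asarray(folds) == test_fold
    between = A[np.ix_(~test, test)]
    return float(np.mean(between > threshold))

# Toy example: 24 animals in 3 hypothetical contemporary groups.
cg = np.repeat(["cg1", "cg2", "cg3"], 8)
folds = assign_random_folds(cg, n_folds=4, seed=42)
A = np.eye(24)
A[0, 1] = A[1, 0] = 0.5  # one hypothetical parent-offspring pair
for k in range(4):
    print(k, share_of_close_relationships(A, folds, k))
```

Applied to each layout, a summary of this kind would give the quantities referred to in Table 2: a larger share of close training-testing relationships is expected to go with higher accuracy.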
The general mean accuracy of genomic predictions for
young animals (YOUNG design) was intermediate between
those for UNREL and RANDOM. Saatchi et al. (2011) also
found that accuracies of genomic prediction on young
animals were intermediate between the accuracies
obtained from unrelated populations and random
clustering for most traits.
For ADG and DMI, the accuracies obtained for young
animals (YOUNG design) were higher than or the same as
those obtained by the RANDOM design. Compared with
RANDOM, the model apparently loses power to predict
GEBV of low heritability traits (RFI and FCR) for young
animals. This happened mainly because, in RANDOM,
information on animals from subsequent generations was
available in both the training and testing subsets,
which accounts for more accurate predictions. This
result agrees with those obtained by Saatchi et al.
(2010) and Habier et al. (2010), who concluded that the
number of generations separating training and
validation subsets also influences accuracy, with lower
accuracies occurring when the relationship is more
distant. Also, the RANDOM and YOUNG designs had a very
similar number of animals and a similar relationship
between training and testing subsets (Table 2).
Considering that the RANDOM result is an average of 4
repetitions with a high SD, the accuracy for the YOUNG
design (which had no repetition) could, in this case,
probably be considered another repetition of RANDOM.
The difference in prediction accuracy between the
RANDOM and the YOUNG designs is probably due to the
sampling error of the YOUNG design. Indeed, the main
reason to study the YOUNG cross-validation design is
the industry interest in predicting the performance of
future generations. So even for a small population,
accurate genomic prediction can be achieved for younger
animals, especially for high heritability traits
(Table 5). Still, even for low heritability traits,
accuracies as high as 0.31 were obtained.
It is reasonable to assume that the number of animals
in the testing population can affect the accuracy of
prediction (VanRaden et al., 2009; Calus, 2010;
Daetwyler et al., 2010). Usually, for traits with a
large amount of phenotypic information available, such
as milk yield and growth traits, accuracies of genomic
prediction of 0.8 are currently achievable. The
accuracy of genomic prediction of feed efficiency was
around 0.30 in beef and dairy cattle studies (Bolormaa
et al., 2013). Much larger reference populations need
to be assembled to improve this accuracy. Comparing
multistep procedures for feed efficiency traits,
Bolormaa et al. (2013) reported that traits with a
large number of recorded and genotyped animals and with
high heritability provided the greatest accuracy of
GEBV.
Table 5. Heritability (SE), average accuracy (SE) on
BLUP (EBV), single-step genomic BLUP (ssGBLUP), genomic
BLUP (GBLUP; direct genomic values [DGV]), and BayesCπ
(DGV) for all studied traits using different
cross-validation layouts (RANDOM, in which the data set
was randomly divided into 4 subsets [considering the
contemporary groups] and the validation was done in
each subset at a time; UNREL, in which the data set was
split into 3 less related subsets and the validation
was done in each subset at a time; and YOUNG, in which
the partition into training and testing sets was based
on year of birth and testing animals were born after
2010)
Traits1  h2           Method   RANDOM       UNREL        YOUNG
RFI      0.17 ± 0.07  BLUP     0.23 (0.07)  0.10 (0.08)  0.24
                      ssGBLUP  0.45 (0.06)  0.29 (0.10)  0.22
                      GBLUP    0.36 (0.11)  0.22 (0.08)  0.09
                      BayesCπ  0.35 (0.10)  0.22 (0.08)  0.06
FCR      0.11 ± 0.06  BLUP     0.29 (0.08)  0.32 (0.06)  0.30
                      ssGBLUP  0.30 (0.05)  0.29 (0.02)  0.31
                      GBLUP    0.23 (0.05)  0.10 (0.04)  0.14
                      BayesCπ  0.23 (0.04)  0.08 (0.05)  0.17
ADG      0.39 ± 0.08  BLUP     0.45 (0.09)  0.24 (0.01)  0.58
                      ssGBLUP  0.47 (0.09)  0.23 (0.03)  0.47
                      GBLUP    0.46 (0.10)  0.18 (0.03)  0.54
                      BayesCπ  0.46 (0.10)  0.17 (0.02)  0.49
DMI      0.43 ± 0.08  BLUP     0.45 (0.08)  0.27 (0.04)  0.51
                      ssGBLUP  0.49 (0.06)  0.35 (0.02)  0.48
                      GBLUP    0.48 (0.07)  0.32 (0.02)  0.45
                      BayesCπ  0.48 (0.07)  0.31 (0.01)  0.47
1RFI = residual feed intake; FCR = feed conversion ratio.
The UNREL layout was designed to have the highest
relationship within subsets and a small relationship
between them (Table 2). Over 95% of all relationship
coefficients between animals in the training and
testing subsets were less than 0.10, which means that a
large proportion of animals in the training subset were
only weakly related to those in the testing subset. On
average, genomic predictions obtained in this design
were the least accurate, ranging from 0.08 to 0.35
(Table 5). According to Pérez-Cabal et al. (2012), the
number of close relatives in training and testing
populations can also affect the accuracy of prediction.
In our study, using ssGBLUP, the accuracies of
predictions for UNREL ranged from 0.23 to 0.35 across
traits (0.29 for RFI), which was not extremely low.
This is an example of how accurate the prediction would
be for a population less related to that where the
prediction equation was obtained.
In this study, about 430,000 SNP effects were predicted
from 761 records (DGV). With so few data points, it is
reasonable to say that a limited number of SNP could
provide good prediction, as shown by cross-validation,
just by chance. P. M. VanRaden (USDA, Beltsville, MD,
personal communication, 2015) reported that accuracy of
GEBV substantially increased with the “nonlinear”
method compared with regular BLUP when the number of
genotyped Holsteins was small, but the increase was
almost nonexistent when the number of genotyped animals
increased. This indicates high estimation noise with
few genotyped animals. In studies at the University of
Georgia (Athens, GA) in various species, SNP
selection/weighting seems to improve the accuracy of
GEBV when the number of genotyped animals is small, but
there is little or no improvement with >15,000
genotyped animals (I. Misztal, personal communication,
2015). Stam (1980) and, subsequently, Daetwyler et al.
(2010) pointed out that the number of independent
chromosome segments is small when the effective
population size is small.
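The effect of a small reference population can be illustrated with the deterministic approximation r = sqrt(N h2 / (N h2 + Me)), in the form commonly attributed to Daetwyler et al. (2010), where N is the number of records and Me the number of independent chromosome segments. The sketch below is not part of the analyses in this study; the Me values are hypothetical placeholders, and only N = 761 and the heritabilities come from Table 5.

```python
import math

def expected_accuracy(n_records, h2, m_e):
    """Deterministic expected accuracy of genomic prediction:
    r = sqrt(N * h2 / (N * h2 + Me))."""
    return math.sqrt(n_records * h2 / (n_records * h2 + m_e))

N_RECORDS = 761                              # genotyped animals with records
HERITABILITIES = {"RFI": 0.17, "FCR": 0.11,  # estimates from Table 5
                  "ADG": 0.39, "DMI": 0.43}
ME_VALUES = (500, 1000, 2000)                # hypothetical numbers of segments

for trait, h2 in HERITABILITIES.items():
    row = ", ".join(
        f"Me={m_e}: {expected_accuracy(N_RECORDS, h2, m_e):.2f}"
        for m_e in ME_VALUES
    )
    print(f"{trait} (h2={h2}): {row}")
```

Under this approximation, expected accuracy rises with heritability and with the number of records, and falls as the number of independent segments grows, which is consistent with the ranking of traits discussed above.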
Using ssGBLUP for evaluation of experimental gen-
otyped populations provided the most accurate predic-
tions and should be considered as an option to simplify
genomic evaluations, especially for low heritability traits.
Conclusions
The ssGBLUP seems to be more suitable for obtaining
genomic predictions for feed efficiency traits in an
experimental population of genotyped animals.
The more the cross-validation subsets are related,
the more accurately genomic breeding values can be
predicted.
The prediction of DGV or GEBV obtained using
Bayesian methodology can be biased, especially for
low heritability traits.
LITERATURE CITED
Archer, J. A., and L. Bergh. 2000. Duration of performance tests
for growth rate, feed intake and feed efficiency in four
biological types of beef cattle. Livest. Prod. Sci. 65:47–55.
doi:10.1016/S0301-6226(99)00181-5
Aguilar, I., I. Misztal, D. L. Johnson, A. Legarra, S. Tsuruta, and T.
J. Lawlor. 2010. Hot topic: A unified approach to utilize
phenotypic, full pedigree, and genomic information for genetic
evaluation of Holstein final score. J. Dairy Sci. 93:743–752.
doi:10.3168/jds.2009-2730
Bijma, P. 2012. Accuracies of estimated breeding values from
ordinary genetic evaluations do not reflect the correlation
between true and estimated breeding values in selected
populations. J. Anim. Breed. Genet. 129:345–358.
doi:10.1111/j.1439-0388.2012.00991.x
Bolormaa, S., J. E. Pryce, K. Kemper, K. Savin, B. J. Hayes, W.
Barendse, Y. Zhang, C. M. Reich, B. A. Mason, R. J. Bunch,
B. E. Harrison, A. Reverter, R. M. Herd, B. Tier, H. U. Graser,
and M. E. Goddard. 2013. Accuracy of prediction of genomic
breeding values for residual feed intake and carcass and meat
quality traits in Bos taurus, Bos indicus, and composite beef
cattle. J. Anim. Sci. 91:3088–3104. doi:10.2527/jas.2012-
5827
Calus, M. P. L. 2010. Genomic breeding value prediction:
Methods and procedures. Animal 4:157–164. doi:10.1017/
S1751731109991352
Chen, L., F. Schenkel, M. Vinsky, D. H. Crews Jr., and C. Li. 2013.
Accuracy of predicting values for residual feed intake in
Angus and Charolais beef cattle. J. Anim. Sci. 91:4669–4678.
doi:10.2527/jas.2013-5715
Crowley, J. J., R. D. Evans, N. Mc Hugh, T. Pabiou, D. A. Kenny,
M. McGee, D. H. Crews Jr., and D. P. Berry. 2011. Genetic
associations between feed efficiency measured in a
performance test station and performance of growing cattle in
commercial beef herds. J. Anim. Sci. 89:3382–3393.
doi:10.2527/jas.2011-3836
Daetwyler, H. D., R. Pong-Wong, B. Villanueva, and J. A.
Woolliams. 2010. The impact of genetic architecture on ge-
nome-wide evaluation methods. Genetics 185:1021–1031.
doi:10.1534/genetics.110.116855
Ding, C., and X. He. 2004. K-means clustering via principal com-
ponent analysis. In: Proc. of Int. Conf. Machine Learning,
Banff, Canada, 2004. p. 225–232.
Fairfull, R. W., and J. R. Chambers. 1984. Breeding for feed
efficiency: Poultry. Can. J. Anim. Sci. 64:513–527.
Goddard, M. 2009. Genomic selection: Prediction of accuracy and
maximisation of long term response. Genetica (The Hague)
136:245–257.
Grion, A. L., M. E. Z. Mercadante, J. N. S. G. Cyrillo, S. F. M.
Bonilha, E. Magnani, and R. H. Branco. 2014. Selection
for feed efficiency traits and correlated genetic responses in
feed intake and weight gain of Nellore cattle. J. Anim. Sci.
92(3):955–965. doi:10.2527/jas.2013-6682
Guo, G., M. S. Lund, Y. Zhang, and G. Su. 2010. Comparison
between genomic predictions using daughter yield devia-
tion and conventional estimated breeding value as response
variables. J. Anim. Breed. Genet. 127:423–432. doi:10.1111/
j.1439-0388.2010.00878.x
Habier, D., R. L. Fernando, K. Kizilkaya, and D. J. Garrick. 2011.
Extension of the Bayesian alphabet for genomic selection.
BMC Bioinf. 12:186. doi:10.1186/1471-2105-12-186
Habier, D., J. Tetens, F. Seefried, P. Lichtner, and G. Thaller. 2010.
The impact of genetic relationship information on genomic
breeding values in German Holstein cattle. Genet. Sel. Evol.
42:5. doi:10.1186/1297-9686-42-5
Hazel, L. N. 1943. The genetic basis for constructing selection
indices. Genetics 38:476–490.
Herd, R. M., and S. C. Bishop. 2000. Genetic variation in residual
feed intake and its association with other production traits
in British Hereford cattle. Livest. Prod. Sci. 63:111–119.
doi:10.1016/S0301-6226(99)00122-0
Koch, R. M., L. A. Swiger, D. Chambers, and K. E. Gregory. 1963.
Efficiency of feed use in beef cattle. J. Anim. Sci. 22:486–494.
Legarra, A., A. Ricard, and O. Filangi. 2010. GS3–Genomic se-
lection, Gibbs sampling, Gauss Seidel and BayesCπ. https://
github.com/alegarra/gs3 (Accessed 4 August 2015.)
Lettre, G. 2011. Recent progress in the study of the genetics of
height. Hum. Genet. 129:465–472. doi:10.1007/s00439-011-
0969-x
Lourenco, D. A., I. Misztal, S. Tsuruta, I. Aguilar, E. Ezra, M. Ron,
A. Shirak, and J. I. Weller. 2014. Methods for genomic evalu-
ation of a relatively small genotyped dairy population and
effect of genotyped cow information in multiparity analyses.
J. Dairy Sci. 97:1742–1752. doi:10.3168/jds.2013-6916
Lourenco, D. A., S. Tsuruta, B. O. Fragomeni, Y. Masuda, I.
Aguilar, A. Legarra, J. K. Bertrand, T. S. Amen, L. Wang,
D. W. Moser, and I. Misztal. 2015. Genetic evaluation us-
ing single-step genomic best linear unbiased predictor in
American Angus. J. Anim. Sci. 93:2653–2662. doi:10.2527/
jas.2014-8836
Meuwissen, T. H., B. J. Hayes, and M. E. Goddard. 2001.
Prediction of total genetic value using genome-wide dense
marker map. Genetics 157:1819–1829.
Misztal, I., S. Tsuruta, T. Strabel, B. Auvray, T. Druet, and D. H.
Lee. 2002. BLUPF90 and related programs (BGF90). In: Proc.
7th World Congr. Genet. Appl. Livest. Prod., Montpellier,
France. Communication No. 28-07. p. 21-22.
Moser, G., M. S. Khatkar, B. J. Hayes, and H. W. Raadsma. 2010.
Accuracy of direct genomic values in Holstein bulls and
cows using subsets of SNP markers. Genet. Sel. Evol. 42:37.
doi:10.1186/1297-9686-42-37
Neves, H. H. R., R. Carvalheiro, A. M. P. O’Brien, Y. T.
Utsunomiya, A. S. Carmo, F. S. Schenkel, J. Sölkner, J. C.
McEwan, C. P. Van Tassell, J. B. Cole, M. V. G. B. Silva, S. A.
Queiroz, T. S. Sonstegard, and J. F. Garcia. 2014. Accuracy
of genomic predictions in Bos indicus (Nellore) cattle. Genet.
Sel. Evol. 46:17. doi:10.1186/1297-9686-46-17
Nkrumah, J. D., J. A. Basarab, M. A. Price, E. K. Okine, A.
Ammoura, S. Guercio, C. Hansen, C. Li, B. Benkel, B.
Murdoch, and S. S. Moore. 2004. Different measures of
energetic efficiency and their phenotypic relationships with
growth, feed intake, and ultrasound and carcass merit in
hybrid cattle. J. Anim. Sci. 82:2451–2459.
Onogi, A., T. Komatsu, N. Shoji, K. Simizu, K. Kurogi, T.
Yasumori, K. Togashi, and H. Iwata. 2014. Genomic pre-
diction in Japanese Black cattle: Application of a single-
step approach to beef cattle. J. Anim. Sci. 92:1931–1938.
doi:10.2527/jas.2014-7168
Pendel, D. L. and Herbel, K. 2015. Feed Costs: Pasture vs Non
Pasture Costs: An Analysis of 2010-2014 Kansas Farm
Management Association Cow Calf Enterprise. http://
www.agmanager.info/livestock/budgets/production/beef/
FeedCosts_2015.pdf.
Pérez-Cabal, M. A., A. I. Vazquez, D. Gianola, G. J. M. Rosa, and
K. A. Wiegel. 2012. Accuracy of genome-enabled prediction
in a dairy cattle population using different cross-validation
layouts. Front. Genet. 3:27. doi:10.3389/fgene.2012.00027
Price, A. L., N. J. Patterson, R. M. Plenge, M. E. Weinblatt, N. A.
Shadick, and D. Reich. 2006. Principal component analysis
corrects for stratification in genome-wide association studies.
Nat. Genet. 38:904–909. doi:10.1038/ng1847
Pryce, J. E., J. Arias, P. J. Bowman, S. R. Davis, K. A. Macdonald,
G. C. Waghorn, W. J. Wales, Y. J. Williams, R. J. Spelman,
and B. J. Hayes. 2012. Accuracy of genomic predictions of
residual feed intake and 250-day body weight in growing
heifers using 625,000 single nucleotide polymorphism mark-
ers. J. Dairy Sci. 95:2108–2119. doi:10.3168/jds.2011-4628
Pszczola, M., T. Strabel, H. A. Mulder, and M. P. L. Calus. 2012.
Reliability of direct genomic values for animals with differ-
ent relationships within and to the reference population. J.
Dairy Sci. 95:389–400. doi:10.3168/jds.2011-4338
Saatchi, M., M. McClure, S. D. McKay, M. M. Rolf, J. W. Kim,
J. E. Decker, T. M. Taxis, R. H. Chapple, H. R. Ramey, S.
L. Northcutt, S. Bauck, B. Woodward, J. C. M. Dekkers, R.
L. Fernando, R. D. Schnabel, D. J. Garrick, and J. F. Taylor.
2011. Accuracies of genomic breeding values in American
Angus beef cattle using K-means clustering for cross-valida-
tion. Genet. Sel. Evol. 43:40. doi:10.1186/1297-9686-43-40
Saatchi, M., S. R. Miraei-Ashtiani, A. Nejati-Javaremi, M. Moradi-
Shahrebabak, and H. Mehrabani-Yeganeh. 2010. The impact
of information quantity and strength of relationship between
training set and validation set on accuracy of genomic esti-
mated breeding values. Afr. J. Biotechnol. 9:438–442.
Saatchi, M., J. Ward, and D. J. Garrick. 2013. Accuracies of di-
rect genomic breeding values in Hereford beef cattle using
national or international training populations. J. Anim. Sci.
91:1538–1551. doi:10.2527/jas.2012-5593
Sargolzaei, M., H. Iwaisaki, and J. J. Colleau. 2006. CFC: A tool
for monitoring genetic diversity. In: Proc. 8th World Congr.
Genet. Appl. Livest. Prod., Belo Horizonte, Brazil. p. 27–28.
Silva, R. M. O., L. Takada, R. H. Branco, M. E. Mercadante, R.
Carvalheiro, and L. G. Albuquerque. 2013. Habilidade de
predição genômica para características de consumo e
eficiência alimentar em bovinos Nelore. (In Portuguese.) In: Proc.
X Simpósio Brasileiro de Melhoramento Animal, Uberaba,
Brasil. p. 1–3.
Stam, P. 1980. The distribution of the fraction of the genome
identical by descent in finite random mating populations. Genet.
Res. 35:131–155. doi:10.1017/S0016672300014002
VanRaden, P. M. 2008. Efficient methods to compute genomic
predictions. J. Dairy Sci. 91:4414–4423.
doi:10.3168/jds.2007-0980
VanRaden, P. M., C. P. Van Tassell, G. R. Wiggans, T. S.
Sonstegard, R. D. Schnabel, J. F. Taylor, and F. S. Schenkel.
2009. Invited review: Reliability of genomic predictions
for North American Holstein bulls. J. Dairy Sci. 92:16–24.
doi:10.3168/jds.2008-1514
VanRaden, P. M., J. R. Wright, and T. A. Cooper. 2012. Adjustment
of selection index coefficients and polygenic variance to
improve regressions and reliability of genomic evaluations. J.
Dairy Sci. 95:520. (Abstr.)
... The genomic selection was proposed to improve the FE-related traits, since these traits are difficult and expensive to measure, limiting their full-scale use in beef cattle breeding programs (Pryce et al., 2012;Silva et al., 2016). Several methods are available for genomic prediction, which differ statistically and also for use under commercial conditions (Aguilar et al., 2010;Habier et al., 2011;Erbe et al., 2012;Daetwyler et al., 2013;De Los Campos et al., 2013;Chiaia et al., 2018). ...
... To the best of our knowledge, there are no reports of pseudophenotypes evaluation in the genomic prediction of FE traits in Nelore cattle (Pryce et al., 2012;Bolormaa et al., 2013;Silva et al., 2016). As pseudo-phenotype or response variable, the true breeding value would be the ideal parameter. ...
... Under commercial situations, it is important to enable the prediction of the next-generation genetic merit, using genomic information. Thus, older animals or with more reliable EBV can be used as a reference population to define the prediction equations that will be validated in younger animals or with less accurate EBV, that is indicated when a structured data set with older animals, phenotypic, pedigree, and genomic information is available (VanRaden et al., 2009;Habier et al., 2011;Silva et al., 2016). However, in beef cattle, records from FE-related traits are normally available in young unproven animals, since these traits have recently been included as a selection criterion, and proven sires only have genomic information. ...
Article
Full-text available
There is a growing interest to improve feed efficiency (FE) traits in cattle. The genomic selection was proposed to improve these traits since they are difficult and expensive to measure. Up to date, there are scarce studies about the implementation of genomic selection for FE traits in indicine cattle under different scenarios of pseudo-phenotypes, models, and validation strategies on a commercial large scale. Thus, the aim was to evaluate the feasibility of genomic selection implementation for FE traits in Nelore cattle applying different models and pseudo-phenotypes under validation strategies. Phenotypic and genotypic information from 4 329 and 3 467 animals were used, respectively, which were tested for residual feed intake, DM intake, feed efficiency, feed conversion ratio, residual BW gain, and residual intake and BW gain. Six prediction methods were used: single-step genomic best linear unbiased prediction, Bayes A, Bayes B, Bayes Cπ, Bayesian least absolute shrinkage and selection operator (BLASSO), and Bayes R. Phenotypes adjusted for fixed effects (Y*), estimated breeding value (EBV), and EBV deregressed (DEBV) were used as pseudo-phenotypes. The validation approaches used were: (1) random: the data was randomly divided into ten subsets and the validation was done in each subset at a time; (2) age: the partition into training and testing sets was based on year of birth and testing animals were born after 2016; and (3) EBV accuracy: the data was split into two groups, being animals with accuracy above 0.45 the training set; and below 0.45 the validation set. In the analyses that used the Y* as pseudo-phenotype, prediction ability (PA) was obtained by dividing the correlation between pseudo-phenotype and genomic EBV (GEBV) by the square root of the heritability of the trait. When EBV and DEBV were used as the pseudo-phenotype, the simple correlation of this quantity with the GEBV was considered as PA. 
The prediction methods show similar results for PA and bias. The random cross-validation presented higher PA (0.17) than EBV accuracy (0.14) and age (0.13). The PA was higher for Y* than for EBV and DEBV (30.0 and 34.3%, respectively). Random validation presented the highest PA, being indicated for use in populations composed mainly of young animals and traits with few generations of data recording. For high heritability traits, the validation can be done by age, enabling the prediction of the next-generation genetic merit. These results would support breeders to identify genomic approaches that are more viable for genomic prediction for FE-related traits.
... Genome-wide selection (GWS) has exhibited high genetic gain per unit time for certain traits and has been thoroughly evaluated for its use in genetic breeding of animals Hulsman Hanna et al. 2015;Silva et al. 2016), crops (Lehermeier et al. 2015), and forestry species (Resende et al. 2012(Resende et al. , 2017a. GWS allows the estimation of breeding values (GEBVs, hereafter), which express the genetic potential of individuals considering their markers. ...
... Many strategies for composing training sets have been proposed to produce more accurate r yb y . These strategies split the training and validation sets according to family (Pszczola et al. 2012;Hulsman Hanna et al. 2015;Resende et al. 2017a), population structure (Guo et al. 2014), generation (Pérez-Cabal et al. 2012;Pszczola et al. 2012;Saatchi et al. 2013;Silva et al. 2016), maximum kinship coefficient (Habier et al. 2010), Wright kinship coefficients (Saatchi et al. 2011(Saatchi et al. , 2013Clark et al. 2012;Pérez-Cabal et al. 2012;Boddhireddy et al. 2014), identity by state clustering methods (Boddhireddy et al. 2014), and unrelated individuals (Pérez-Cabal et al. 2012;Silva et al. 2016). Tree breeding populations are normally composed of a highly genetically diverse set of individuals. ...
... Many strategies for composing training sets have been proposed to produce more accurate r yb y . These strategies split the training and validation sets according to family (Pszczola et al. 2012;Hulsman Hanna et al. 2015;Resende et al. 2017a), population structure (Guo et al. 2014), generation (Pérez-Cabal et al. 2012;Pszczola et al. 2012;Saatchi et al. 2013;Silva et al. 2016), maximum kinship coefficient (Habier et al. 2010), Wright kinship coefficients (Saatchi et al. 2011(Saatchi et al. , 2013Clark et al. 2012;Pérez-Cabal et al. 2012;Boddhireddy et al. 2014), identity by state clustering methods (Boddhireddy et al. 2014), and unrelated individuals (Pérez-Cabal et al. 2012;Silva et al. 2016). Tree breeding populations are normally composed of a highly genetically diverse set of individuals. ...
Article
Full-text available
Random k-fold cross-validation in genome-wide selection (GWS) can help to estimate predictive ability (ryŷ). Predictive ability tends to be higher when training, and validation sets present a high degree of kinship. However, many tree breeding populations are less genetically related to the training sets and have different levels of phenotypic diversity. Therefore, this study proposes methods of splitting k-fold cross-validation sets to optimize ryŷ estimates that are consistent with the breeding population and verify the impact of phenotypic and genotypic distribution on GWS. Using a simulated Eucalyptus trait (h2=0.5) and Pinus taeda L. data for diameter at breast height (h2=0.31), six methods were developed based on mutual information (I) and entropy (H) for measuring genetic similarity and phenotypic dissimilarity, respectively. All methods were evaluated for ryŷ, bias, minimum squared error of prediction, and genomic heritability. The Pearson correlations of these parameters with the kinship coefficient, and I and H between and within training and validation sets were also estimated. Our results show that closer genetic similarity did not significantly increase ryŷ and that a lower H reduced ryŷ and overestimated genomic breeding values. Consequently, phenotypic diversity (high H) should be added to tree breeding populations to increase genetic gain and reduce bias. The new methods accurately fitted models according to the entropy of tree breeding populations and their genetic relationship to the training sets. Therefore, these methods provided usable estimates of genetic gain to produce consistent success of long-term tree breeding programs.
... We used a subsample of 782 animals born between 2004 and 2012 (92 NeC, 192 NeS and 498 NeT) that were previously genotyped because of phenotypes related to feed efficiency [16,17]. The average GC for this subsample was equal to 6.1 (ranging from 4.4 to 7.7) and generation 6 had the broadest representation of animals from each line [see Additional file 1: Table S1]. ...
... The average estimated inbreeding coefficients in the genotyped subsample were equal to 0.05 ± 0.02 for NeC (ranging from 0.02 to 0.18), 0.03 ± 0.01 for NeS (ranging from 0.01 to 0.06) and 0.03 ± 0.01 for NeT (ranging from 0.01 to 0.08). Table 1 presents the average additive relationships between the genotyped animals of the three lines, which were estimated based on the full pedigree (see Silva et al. [17] for a complete pedigree description). ...
Article
Full-text available
Background This study aimed at (1) assessing the genomic stratification of experimental lines of Nelore cattle that have experienced different selection regimes for growth traits, and (2) identifying genomic regions that have undergone recent selection. We used a sample of 763 animals genotyped with the Illumina BovineHD BeadChip, among which 674 animals originated from two lines that are maintained under directional selection for increased yearling body weight and 89 animals from a control line that is maintained under stabilizing selection. ResultsMultidimensional analysis of the genomic dissimilarity matrix and admixture analysis revealed a substantial level of population stratification between the directional selection lines and the stabilizing selection control line. Two of the three tests used to detect selection signatures (FST, XP-EHH and iHS) revealed six candidate regions with indications of selection, which strongly indicates truly positive signals. The set of identified candidate genes included several genes with roles that are functionally related to growth metabolism, such as COL14A1, CPT1C, CRH, TBC1D1, and XKR4. Conclusions The current study identified genetic stratification that resulted from almost four decades of divergent selection in an experimental Nelore population, and highlighted autosomal genomic regions that present patterns of recent selection. Our findings provide a basis for a better understanding of the metabolic mechanism that underlies the growth traits, which are modified by selection for yearling body weight.
... In this study, the heritability estimated for RFI (h 2 = 0.3) is in agreement with other estimates reported previously for other Angus populations [10,20,51], an Angus-Brahman herd (0.30) [52], and Nellore (0.17) [53]. However, in some other studies in Angus and Charolais populations, the heritability has been reported as high as 0.47 and 0.68, respectively [54]. ...
Article
Full-text available
Background: Genome-wide association studies (GWAS) are extensively used to identify single nucleotide polymorphisms (SNP) underlying the genetic variation of complex traits. However, much uncertainly often still exists about the causal variants and genes at quantitative trait loci (QTL). The aim of this study was to identify QTL associated with residual feed intake (RFI) and genes in these regions whose expression is also associated with this trait. Angus cattle (2190 steers) with RFI records were genotyped and imputed to high density arrays (770 K) and used for a GWAS approach to identify QTL associated with RFI. RNA sequences from 126 Angus divergently selected for RFI were analyzed to identify the genes whose expression was significantly associated this trait with special attention to those genes residing in the QTL regions. Results: The heritability for RFI estimated for this Angus population was 0.3. In a GWAS, we identified 78 SNPs associated with RFI on six QTL (on BTA1, BTA6, BTA14, BTA17, BTA20 and BTA26). The most significant SNP was found on chromosome BTA20 (rs42662073) and explained 4% of the genetic variance. The minor allele frequencies of significant SNPs ranged from 0.05 to 0.49. All regions, except on BTA17, showed a significant dominance effect. In 1 Mb windows surrounding the six significant QTL, we found 149 genes from which OAS2, STC2, SHOX, XKR4, and SGMS1 were the closest to the most significant QTL on BTA17, BTA20, BTA1, BTA14, and BTA26, respectively. In a 2 Mb windows around the six significant QTL, we identified 15 genes whose expression was significantly associated with RFI: BTA20) NEURL1B and CPEB4; BTA17) RITA1, CCDC42B, OAS2, RPL6, and ERP29; BTA26) A1CF, SGMS1, PAPSS2, and PTEN; BTA1) MFSD1 and RARRES1; BTA14) ATP6V1H and MRPL15. Conclusions: Our results showed six QTL regions associated with RFI in a beef Angus population where five of these QTL contained genes that have expression associated with this trait. 
Therefore, here we show that integrating information from gene expression and GWAS studies can help to better understand the genetic mechanisms that determine variation in complex traits.
... Some genomic selection studies in beef cattle working with novel traits, such as feed efficiency and beef traits, reported higher genomic prediction ability applying the Yc records as pseudo-phenotypes to train the marker effects compared to the EBV from BLUP (Silva et al. 2016; Júnior et al. 2016). ...
... The differences regarding the pseudophenotype are in agreement with the literature (Morota, Boddhireddy, Vukasinovic, Gianola, & Denise, 2014). Genomic selection studies for novel traits in beef cattle (Silva et al., 2016; Fernandes Júnior et al., 2016), such as feed efficiency and beef quality traits, reported higher genomic prediction ability for adjusted phenotypes used as pseudo-phenotypes than for EBV obtained from traditional BLUP. In beef cattle, the EBV obtained from a BLUP model is in general less appropriate because of poor pedigree structure and small training populations. ...
... The estimates of the regression coefficients (b) showed that the predictor can be deflated for tenderness, lipid percentage and marbling, i.e. the DGVs can be underestimated, while for other traits b can be inflated, i.e. the DGVs can be overestimated (Table 3). When the estimates of the regression coefficients were deflated, the Bayesian methods gave b values farther from 1 than the GBLUP method, which agrees with results from Neves et al. (2014) and Silva et al. (2016). In contrast, when regression slopes were inflated, the Bayesian methods were close to 1. ...
Article
The objective of this study was to present heritability estimates and accuracy of genomic prediction using different methods for meat quality traits in Nelore cattle. Approximately 5000 animals with phenotypes and genotypes for 412,000 SNPs were divided into two groups: (1) training population: animals born from 2008 to 2013 and (2) validation population: animals born in 2014. A single-trait animal model was used to estimate heritability and to adjust the phenotype. The GBLUP, Improved Bayesian Lasso and Bayes Cπ methods were used to estimate the SNP effects. Accuracy of genomic prediction was calculated as Pearson's correlation between direct genomic values and adjusted phenotypes, divided by the square root of the heritability of each trait (0.03–0.19). The accuracies varied from 0.23 to 0.73, with the lowest accuracies estimated for traits associated with fat content and the greatest accuracies observed for traits of meat color and tenderness. There were small differences in genomic prediction accuracy between methods.
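The accuracy measure described in this abstract (Pearson's correlation between direct genomic values and adjusted phenotypes, divided by the square root of the heritability) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the function name `prediction_accuracy` and the simulated values are hypothetical.

```python
import numpy as np


def prediction_accuracy(dgv, adjusted_phenotype, heritability):
    """Correlation between DGV and adjusted phenotypes, scaled by sqrt(h2).

    Hypothetical helper for illustration; inputs are values for the
    validation animals only.
    """
    dgv = np.asarray(dgv, dtype=float)
    y = np.asarray(adjusted_phenotype, dtype=float)
    if dgv.shape != y.shape:
        raise ValueError("dgv and adjusted_phenotype must have the same shape")
    if not 0.0 < heritability <= 1.0:
        raise ValueError("heritability must be in (0, 1]")
    r = np.corrcoef(dgv, y)[0, 1]  # Pearson correlation
    return r / np.sqrt(heritability)


if __name__ == "__main__":
    # Simulated validation set (assumed values, not data from the study).
    rng = np.random.default_rng(1)
    true_bv = rng.normal(size=500)
    dgv = true_bv + rng.normal(scale=1.5, size=500)
    adjusted = true_bv + rng.normal(scale=2.0, size=500)
    print(prediction_accuracy(dgv, adjusted, heritability=0.19))
```

Because the divisor is smaller than 1, the scaled value is always larger in magnitude than the raw correlation, and for low heritabilities such as 0.03 a small sampling error in the correlation is amplified considerably.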
... This phenotypic variability could indicate that RFI can be used in genetic selection, a fact that is corroborated by its heritability averaged over all two-trait analyses (Table 1). Other studies in Nellore cattle have found similar heritability results for ADG, DMI, FCR and RFI (Grigoletto et al., 2017; Oliveira et al., 2014; Santana et al., 2014); however, some reported lower values, especially for RFI (Grion et al., 2014; Olivieri et al., 2016; Silva et al., 2016). Unlike the other phenotypes, FCR showed lower heritability. ...
Article
The objective of this study was to estimate genetic parameters and correlations for intake, feed efficiency and weight gain in beef cattle. Phenotypic data on average daily gain (ADG), dry matter intake (DMI), feed conversion ratio (FCR) and residual feed intake (RFI) from 2,058 male and female Nellore cattle were used. Genetic parameters and heritabilities were estimated from these data and the pedigree with the AIREMLF90 program. The heritability estimates (standard error) found for ADG, DMI, FCR and RFI were 0.35 (0.09), 0.46 (0.09), 0.19 (0.06) and 0.28 (0.07), respectively. We highlight the genetic correlations between ADG and FCR (-0.40) and between DMI and RFI (0.61). The heritability and genetic correlations presented here show that it is possible to include feed efficiency traits in Nellore breeding programmes, and that the selection of efficient animals could reduce feed consumption without performance loss.
... A study using the same dataset [44] revealed genetic stratification among the samples. Population stratification could potentially affect the resulting genomic prediction accuracies; however, a random cross-validation approach was adopted in this study so that the impact of stratification was minimized [45]. In general, the prediction accuracy of the DGV for most traits using only SNP was in concordance with the results reported in the literature for the Nellore cattle breed [37]. ...
Article
Background: Due to the advancement in high throughput technology, single nucleotide polymorphism (SNP) is routinely being incorporated along with phenotypic information into genetic evaluation. However, this approach often cannot achieve high accuracy for some complex traits. It is possible that SNP markers are not sufficient to predict these traits due to the missing heritability caused by other genetic variations such as microsatellite and copy number variation (CNV), which have been shown to affect disease and complex traits in humans and other species. Results: In this study, CNVs were included in a SNP based genomic selection framework. A Nellore cattle dataset consisting of 2230 animals genotyped on BovineHD SNP array was used, and 9 weight and carcass traits were analyzed. A total of six models were implemented and compared based on their prediction accuracy. For comparison, three models including only SNPs were implemented: 1) BayesA model, 2) Bayesian mixture model (BayesB), and 3) a GBLUP model without polygenic effects. The other three models incorporating both SNP and CNV included 4) a Bayesian model similar to BayesA (BayesA+CNV), 5) a Bayesian mixture model (BayesB+CNV), and 6) GBLUP with CNVs modeled as a covariable (GBLUP+CNV). Prediction accuracies were assessed based on Pearson's correlation between de-regressed EBVs (dEBVs) and direct genomic values (DGVs) in the validation dataset. For BayesA, BayesB and GBLUP, accuracy ranged from 0.12 to 0.62 across the nine traits. A minimal increase in prediction accuracy for some traits was noticed when including CNVs in the model (BayesA+CNV, BayesB+CNV, GBLUP+CNV). Conclusions: This study presents the first genomic prediction study integrating CNVs and SNPs in livestock. Combining CNV and SNP marker information proved to be beneficial for genomic prediction of some traits in Nellore cattle.
Article
The aim of the present study was to compare the predictive ability of the SNP-BLUP model using different pseudo-phenotypes, namely phenotype adjusted for fixed effects, estimated breeding value, and genomic estimated breeding value, using simulated and real data for beef FA profile of Nelore cattle finished in feedlot. A pedigree with phenotypes and genotypes of 10,000 animals was simulated, considering 50% of multiple sires in the pedigree. Regarding phenotypes, two traits were simulated, one with high heritability (0.58) and another with low heritability (0.13). Ten replicates were performed for each trait and results were averaged among replicates. A historical population was created from generation zero to 2020, with a constant size of 2000 animals (from generation zero to 1000) to produce different levels of linkage disequilibrium (LD). Thereafter, there was a gradual reduction in the number of animals (from 2000 to 600), producing a “bottleneck effect” and, consequently, genetic drift and LD from generation 1001 to 2020. A total of 335,000 markers (with MAF greater than or equal to 0.02) and 1000 QTL were randomly selected from the last generation of the historical population to generate genotypic data for the test population. The phenotypes were computed as the sum of the QTL effects and an error term sampled from a normal distribution with zero mean and variance equal to 0.88. For simulated data, 4000 animals of generations 7, 8, and 9 (with genotype and phenotype) were used as the training population, and 1000 animals of the last generation (10) were used as the validation population. A total of 937 Nelore bulls with phenotypes for fatty acid profiles (sum of saturated, monounsaturated, omega 3, omega 6, ratio of polyunsaturated and saturated, and polyunsaturated fatty acid profile) were genotyped using the Illumina BovineHD BeadChip (Illumina, San Diego, CA) with 777,962 SNP. 
To compare the accuracy and bias of direct genomic value (DGV) for different pseudo-phenotypes, the correlation between true breeding value (TBV) or DGV with pseudo-phenotypes and the linear regression coefficient of the pseudo-phenotypes on TBV for simulated data or DGV for real data, respectively, were used. For simulated data, the correlations between DGV and TBV for high heritability traits were higher than those obtained with low heritability traits. For simulated and real data, the prediction ability was higher for GEBV than for Yc and EBV. For simulated data, the regression coefficient estimates (b(Yc,DGV)) were on average lower than 1 for high and low heritability traits, being inflated. The results were more biased for Yc and EBV than for GEBV. For real data, the GEBV displayed less biased results compared to Yc and EBV for SFA, MUFA, n-3, n-6, and PUFA/SFA. Despite the less biased results for PUFA using the EBV as pseudo-phenotype, the b(Yi,DGV) estimates obtained for the different pseudo-phenotypes (Yc, EBV and GEBV) were very close. Genomic information can assist in improving beef fatty acid profile in Zebu cattle, since the use of genomic information yielded genomic values for fatty acid profile with accuracies ranging from low to moderate. Considering both simulated and real data, the ssGBLUP model is an appropriate alternative to obtain more reliable and less biased GEBVs as pseudo-phenotype in situations of missing pedigree, due to a high proportion of multiple sires, being more adequate than EBV and Yc to predict direct genomic value for beef fatty acid profile.
Article
Predictive ability of genomic EBV when using single-step genomic BLUP (ssGBLUP) in Angus cattle was investigated. Over 6 million records were available on birth weight (BiW) and weaning weight (WW), almost 3.4 million on postweaning gain (PWG), and over 1.3 million on calving ease (CE). Genomic information was available on, at most, 51,883 animals, which included high and low EBV accuracy animals. Traditional EBV was computed by BLUP and genomic EBV by ssGBLUP and indirect prediction based on SNP effects was derived from ssGBLUP; SNP effects were calculated based on the following reference populations: ref_2k (contains top bulls and top cows that had an EBV accuracy for BiW ≥0.85), ref_8k (contains all parents that were genotyped), and ref_33k (contains all genotyped animals born up to 2012). Indirect prediction was obtained as direct genomic value (DGV) or as an index of DGV and parent average (PA). Additionally, runs with ssGBLUP used the inverse of the genomic relationship matrix calculated by an algorithm for proven and young animals (APY) that uses recursions on a small subset of reference animals. An extra reference subset included 3,872 genotyped parents of genotyped animals (ref_4k). Cross-validation was used to assess predictive ability on a validation population of 18,721 animals born in 2013. Computations for growth traits used multiple-trait linear model and, for CE, a bivariate CE-BiW threshold-linear model. With BLUP, predictivities were 0.29, 0.34, 0.23, and 0.12 for BiW, WW, PWG, and CE, respectively. With ssGBLUP and ref_2k, predictivities were 0.34, 0.35, 0.27, and 0.13 for BiW, WW, PWG, and CE, respectively, and with ssGBLUP and ref_33k, predictivities were 0.39, 0.38, 0.29, and 0.13 for BiW, WW, PWG, and CE, respectively. Low predictivity for CE was due to low incidence rate of difficult calving. Indirect predictions with ref_33k were as accurate as with full ssGBLUP. 
Using the APY and recursions on ref_4k gave 88% gains of full ssGBLUP and using the APY and recursions on ref_8k gave 97% gains of full ssGBLUP. Genomic evaluation in beef cattle with ssGBLUP is feasible while keeping the models (maternal, multiple trait, and threshold) already used in regular BLUP. Gains in predictivity are dependent on the composition of the reference population. Indirect predictions via SNP effects derived from ssGBLUP allow for accurate genomic predictions on young animals, with no advantage of including PA in the index if the reference population is large. With the APY conditioning on about 10,000 reference animals, ssGBLUP is potentially applicable to a large number of genotyped animals without compromising predictive ability. © 2015 American Society of Animal Science. All rights reserved.
Article
Nellore cattle play an important role in beef production in tropical systems and there is great interest in determining if genomic selection can contribute to accelerate genetic improvement of production and fertility in this breed. We present the first results of the implementation of genomic prediction in a Bos indicus (Nellore) population. Influential bulls were genotyped with the Illumina Bovine HD chip in order to assess genomic predictive ability for weight and carcass traits, gestation length, scrotal circumference and two selection indices. 685 samples and 320 238 single nucleotide polymorphisms (SNPs) were used in the analyses. A forward-prediction scheme was adopted to predict the genomic breeding values (DGV). In the training step, the estimated breeding values (EBV) of bulls were deregressed (dEBV) and used as pseudo-phenotypes to estimate marker effects using four methods: genomic BLUP with or without a residual polygenic effect (GBLUP20 and GBLUP0, respectively), a mixture model (Bayes C) and Bayesian LASSO (BLASSO). Empirical accuracies of the resulting genomic predictions were assessed based on the correlation between DGV and dEBV for the testing group. Accuracies of genomic predictions ranged from 0.17 (navel at weaning) to 0.74 (finishing precocity). Across traits, Bayesian regression models (Bayes C and BLASSO) were more accurate than GBLUP. The average empirical accuracies were 0.39 (GBLUP0), 0.40 (GBLUP20) and 0.44 (Bayes C and BLASSO). Bayes C and BLASSO tended to produce deflated predictions (i.e. slope of the regression of dEBV on DGV greater than 1). Further analyses suggested that higher-than-expected accuracies were observed for traits for which EBV means differed significantly between two breeding subgroups that were identified in a principal component analysis based on genomic relationships. 
Bayesian regression models are of interest for future applications of genomic selection in this population, but further improvements are needed to reduce deflation of their predictions. Recurrent updates of the training population would be required to enable accurate prediction of the genetic merit of young animals. The technical feasibility of applying genomic prediction in a Bos indicus (Nellore) population was demonstrated. Further research is needed to permit cost-effective selection decisions using genomic information.
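The two validation statistics mentioned in this abstract, the empirical accuracy (correlation between DGV and dEBV in the testing group) and the slope of the regression of dEBV on DGV, can be sketched as below. The function names and the toy numbers are assumptions made for illustration and are not taken from the study.

```python
import numpy as np


def empirical_accuracy(debv, dgv):
    """Pearson correlation between DGV and dEBV for the testing animals."""
    return np.corrcoef(np.asarray(dgv, float), np.asarray(debv, float))[0, 1]


def regression_slope(debv, dgv):
    """Slope b of the simple regression of dEBV on DGV: cov(dEBV, DGV) / var(DGV)."""
    debv = np.asarray(debv, dtype=float)
    dgv = np.asarray(dgv, dtype=float)
    cov = np.cov(dgv, debv, ddof=1)  # 2 x 2 covariance matrix, DGV first
    return cov[0, 1] / cov[0, 0]


if __name__ == "__main__":
    # Toy values (assumed): DGV with half the spread of dEBV.
    dgv = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
    debv = 2.0 * dgv + np.array([0.1, -0.1, 0.0, 0.1, -0.1])
    print(empirical_accuracy(debv, dgv), regression_slope(debv, dgv))
```

In the abstract's terminology, a slope greater than 1 corresponds to deflated predictions: the DGV vary less than the dEBV they are meant to predict.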
Article
The implementation of genomic selection for Japanese Black cattle, known for rich marbling of their meat, is now being explored. Although multiple-step methods are often adopted for dairy cattle, they present shortcomings such as bias and loss of information in addition to operational complexity. These can be avoided using single-step genomic BLUP (ssGBLUP) based on the relationship matrix H, which is constructed from the numerator relationship matrix (A) augmented by the genomic relationship matrix (G). This study assessed the use of ssGBLUP for 3 economically important traits in Japanese Black cattle. Three aspects of ssGBLUP that are important for practical use were examined specifically: the mixing proportions of blending G with A, selection of subsets of genotyped animals used for constructing H, and prediction ability for ungenotyped animals. Different mixing proportions were tested to assess the influence of these proportions on variance component estimation and prediction accuracy. For all traits, the highest or nearly highest accuracy was obtained when the adopted mixing proportion provided heritability closest to that inferred based on A. However, the accuracy did not increase greatly under adjustment of the mixing proportion, thereby suggesting that the influence of the mixing proportion on the accuracy was limited. Genotype data of influential bulls showed a greater contribution to accuracy than that of bulls that were less influential. Genotyping animals with phenotypic records increased the accuracy. It can be prioritized over genotyping bulls that are not influential on the population. These results are expected to present good guides to the future expansion of genotyped populations. Even for animals without genotype data but with genotyped sires, ssGBLUP provided more accurate prediction than BLUP did. 
For both phenotype and breeding value prediction, ssGBLUP provides more accurate prediction than BLUP, suggesting its usefulness in genomic selection in Japanese Black cattle.
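The blending of G with A mentioned in this abstract can be illustrated with the form of the inverse of H that is commonly used in ssGBLUP, in which the block of the inverse of A for genotyped animals is augmented by the difference between the inverse of the blended G and the inverse of A22 (the pedigree relationships among genotyped animals). The abstract does not state which formulation or mixing proportions the study used, so the sketch below is an assumption; the function name `blended_h_inverse` and the parameter `w` (weight given to A22) are hypothetical.

```python
import numpy as np


def blended_h_inverse(A, G, genotyped_idx, w=0.05):
    """Inverse of H using G_w = (1 - w) * G + w * A22 (illustrative sketch).

    A             : pedigree relationship matrix for all animals
    G             : genomic relationship matrix for genotyped animals,
                    ordered as in genotyped_idx
    genotyped_idx : positions of the genotyped animals in A
    w             : assumed mixing proportion placed on A22
    """
    A = np.asarray(A, dtype=float)
    G = np.asarray(G, dtype=float)
    idx = np.asarray(genotyped_idx)
    A22 = A[np.ix_(idx, idx)]
    G_w = (1.0 - w) * G + w * A22  # blending keeps G_w invertible
    H_inv = np.linalg.inv(A)
    # Only the genotyped block of the inverse of A is modified.
    H_inv[np.ix_(idx, idx)] += np.linalg.inv(G_w) - np.linalg.inv(A22)
    return H_inv


if __name__ == "__main__":
    # Toy pedigree (assumed): unrelated sire (0) and dam (1), offspring (2).
    A = np.array([[1.0, 0.0, 0.5],
                  [0.0, 1.0, 0.5],
                  [0.5, 0.5, 1.0]])
    # Hypothetical genomic relationships for the genotyped sire and offspring.
    G = np.array([[1.02, 0.55],
                  [0.55, 0.98]])
    print(blended_h_inverse(A, G, genotyped_idx=[0, 2], w=0.1))
```

A quick consistency check on this form: when G equals A22, the correction term vanishes for any `w` and the result reduces to the inverse of A, i.e. the model falls back to pedigree BLUP.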