Figure - available from: Theoretical and Applied Genetics
Depiction of the Nelder wheel plot design

Source publication
Article
Full-text available
Key message: Established spatial models improve the analysis of agricultural field trials with or without genomic data and can be fitted with the open-source R package INLA. Abstract: The objective of this paper was to fit different established spatial models for analysing agricultural field trials using the open-source R package INLA. Spatial varia...

Citations

... A previous application from plant breeding indicates that INLA is a suitable and promising candidate for our purpose (Selle et al., 2019). Here we expand the considerations from Selle et al. (2019) by discussing how to find suitable priors that impose an appropriate level of shrinkage. To this end, we employ an approach where the eigenvalues are used as scaling factors for the variances of PC-effects (Macciotta et al., 2010), which is equivalent to letting the columns of $Z^\star$ have a variance proportional to the corresponding eigenvalue of the SNP covariance matrix. ...
... To underline this point, we will do a sensitivity analysis by carrying out all our analyses for both fixed $\sigma^2_{u^\star}$ priors suggested in (4), using a "good prior guess" from earlier analyses, as well as using the Gamma prior $\sigma^2_{u^\star} \sim \Gamma(1, 5 \cdot 10^{-5})$, parametrised with shape and rate, which corresponds to the very naive default prior in the R-INLA framework. Even though the respective Gamma prior has been criticized in other contexts (Gelman, 2006; Hodges, 2013), its use led to quite accurate results in a related study (Selle et al., 2019). Users may of course choose other priors, for example a distribution around the value in (4) with varying degree of uncertainty. ...
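To give a feel for how diffuse the Gamma prior quoted in the excerpt above is, the following minimal sketch computes a few summaries of a Gamma distribution with shape 1 and rate 5 · 10^-5. It only uses the shape/rate parametrisation stated in the excerpt; the function names are hypothetical, and nothing here reproduces R-INLA's internal handling of the prior.

```python
import math

# Shape and rate as quoted in the excerpt (shape/rate parametrisation).
SHAPE = 1.0
RATE = 5e-5


def gamma_mean(shape: float, rate: float) -> float:
    """Mean of a Gamma(shape, rate) distribution."""
    return shape / rate


def gamma_variance(shape: float, rate: float) -> float:
    """Variance of a Gamma(shape, rate) distribution."""
    return shape / rate ** 2


def exponential_cdf(x: float, rate: float) -> float:
    """CDF of Gamma(1, rate), which is an exponential distribution.

    Only valid for shape == 1; expm1 keeps precision for tiny rate * x.
    """
    return -math.expm1(-rate * x)


if __name__ == "__main__":
    print("prior mean:", gamma_mean(SHAPE, RATE))
    print("prior sd:", math.sqrt(gamma_variance(SHAPE, RATE)))
    print("prior mass below 1:", exponential_cdf(1.0, RATE))
```

With shape 1 the distribution is exponential, so its mean and standard deviation coincide and almost none of the prior mass lies below 1, which is one way to read the description of the prior as very naive.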
Preprint
Full-text available
As larger genomic data sets become available for wild study populations, the need for flexible and efficient methods to estimate and predict quantitative genetic parameters, such as the adaptive potential and measures for genetic change, increases. Animal breeders have produced a wealth of methods, but wild study systems often face challenges due to larger effective population sizes, environmental heterogeneity and higher spatio-temporal variation. Here we adapt methods previously used for genomic prediction in animal breeding to the needs of wild study systems. The core idea is to approximate the breeding values as a linear combination of principal components (PCs), where the PC effects are shrunk with Bayesian ridge regression. Thanks to efficient implementation in a Bayesian framework using integrated nested Laplace approximations (INLA), it is possible to handle models that include several fixed and random effects in addition to the breeding values. Applications to a Norwegian house sparrow meta-population, as well as simulations, show that this method efficiently estimates the additive genetic variance and accurately predicts the breeding values. A major benefit of this modeling framework is computational efficiency at large sample sizes. The method therefore suits both current and future needs to analyze genomic data from wild study systems.
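The shrinkage idea in the abstract above can be illustrated with a minimal sketch, not the authors' INLA implementation. It assumes the PC score columns are already computed and mutually orthogonal, and that the ridge penalty `lam` (a hypothetical name, interpretable as the ratio of residual variance to PC-effect variance) is given. Under those assumptions the ridge estimate decouples into one formula per component, and components with larger variance (larger eigenvalue) are shrunk less.

```python
from typing import List, Sequence


def ridge_pc_effects(
    scores: Sequence[Sequence[float]],
    y: Sequence[float],
    lam: float,
) -> List[float]:
    """Ridge estimates of PC effects for mutually orthogonal score columns.

    scores: list of PC score columns, each of the same length as y.
    lam:    ridge penalty (assumed known here, not estimated).
    For orthogonal columns the estimate for column z is z'y / (z'z + lam).
    """
    effects = []
    for column in scores:
        sum_of_squares = sum(v * v for v in column)
        cross_product = sum(v * t for v, t in zip(column, y))
        effects.append(cross_product / (sum_of_squares + lam))
    return effects


if __name__ == "__main__":
    # Two orthogonal toy columns; the first has four times the sum of squares.
    pc1 = [2.0, -2.0, 2.0, -2.0]
    pc2 = [1.0, 1.0, -1.0, -1.0]
    # Response built with a true effect of 1.0 on each component, no noise.
    y = [a + b for a, b in zip(pc1, pc2)]
    print(ridge_pc_effects([pc1, pc2], y, lam=0.0))  # unpenalised estimates
    print(ridge_pc_effects([pc1, pc2], y, lam=4.0))  # shrunk estimates
```

In this toy example both true effects are equal, yet the component with the larger sum of squares keeps more of its unpenalised estimate. A full Bayesian treatment would also estimate the variance components rather than fixing `lam`.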
... Because of its higher computational cost, the MCMC method struggles to fit models in the presence of strongly correlated parameters, a situation that arises in many applications and that the INLA method is intended to address. The INLA algorithm has demonstrated its benefits in several research areas, e.g., disease mapping [19,20], genetics [21], public health [22], ecology [23], and forecasting seismicity [24]. ...
Article
Full-text available
In this paper, the Integrated Nested Laplace Approximation (INLA) is applied to the Epidemic Type Aftershock Sequence (ETAS) model, and the parameters of the ETAS model are obtained for the earthquake sequences active in different regions of Xinjiang. By analyzing the characteristics of the model parameters over time, the changes in each earthquake sequence are studied in more detail. The estimated values of the ETAS model parameters are used as inputs to forecast strong aftershocks in the next period. We find that there are significant differences in the aftershock triggering capacity and aftershock attenuation capacity of earthquake sequences in different seismic regions of Xinjiang. With different cutoff dates set, we observe the characteristics of the earthquake sequence parameters changing with time after the mainshock occurs, and the model parameters of the Ms7.3 earthquake sequence in Hotan region change significantly with time within 15 days after the earthquake. Compared with the MCMC algorithm, the ETAS model fitted with the INLA algorithm can forecast the number of earthquakes in the early period after the occurrence of strong aftershocks more effectively and can forecast the sudden occurrence time of earthquakes more accurately.
... At their essence, SPDEs elegantly merge the unpredictability of stochastic processes with the foundational principles of partial differential equations (PDEs), providing a robust framework for the mathematical modeling of physical phenomena with randomness, alongside the spatial and temporal dynamics inherent in them. The versatility and depth of SPDEs have led to their application in a diverse range of fields, from the intricate patterns of physics to the innovative designs of engineering [1][2][3][4][5], the fluctuating dynamics of financial markets [6][7][8][9][10], and the complex processes within biological sciences [11][12][13][14][15][16]. This cross-disciplinary enthusiasm highlights the crucial importance of SPDEs in deepening our understanding of the world, connecting theoretical exploration with pragmatic resolutions to complex problems. ...
Article
Full-text available
This article aims to provide a comprehensive review of the latest advancements in numerical methods and practical implementations in the field of fractional stochastic partial differential equations (FSPDEs). This type of equation integrates fractional calculus, stochastic processes, and differential equations to model complex dynamical systems characterized by memory and randomness. It introduces the foundational concepts and definitions essential for understanding FSPDEs, followed by a comprehensive review of the diverse numerical methods and analytical techniques developed to tackle these equations. Then, this article highlights the significant expansion in numerical methods, such as spectral and finite element methods, aimed at solving FSPDEs, underscoring their potential for innovative applications across various disciplines.
... In selection trials, the experimental design of the early-generation material [120][121][122] is essential and often uses augmented, partially-replicated, or alpha-lattice designs. Spatial variation in field trials must be removed or accounted for to estimate the value of a line accurately [123,124]. ...
... Randomized complete block designs require extensive replication, meaning they are rarely used in early-generation selection trials due to the large number of lines that must be evaluated with limited seed supplies. Due to the likelihood of spatial variation within the block, ready access to improved statistical software, and greater computational capabilities, selection trials can account for and remove spatial variation [120][121][122]. Many of these same experimental designs and analysis perspectives are important for evaluation trials (discussed below), but generally, the seed is not limiting, and the number of lines is fewer in advanced evaluation trials; so replication is often greater. ...
Article
Full-text available
Wheat (Triticum spp. and, particularly, T. aestivum L.) is an essential cereal with increased human and animal nutritional demand. Therefore, there is a need to enhance wheat yield and genetic gain using modern breeding technologies alongside proven methods to achieve the necessary increases in productivity. These modern technologies will allow breeders to develop improved wheat cultivars more quickly and efficiently. This review aims to highlight the emerging technological trends used worldwide in wheat breeding, with a focus on enhancing wheat yield. The key technologies for introducing variation (hybridization among the species, synthetic wheat, and hybridization; genetically modified wheat; transgenic and gene-edited), inbreeding (double haploid (DH) and speed breeding (SB)), selection and evaluation (marker-assisted selection (MAS), genomic selection (GS), and machine learning (ML)) and hybrid wheat are discussed to highlight the current opportunities in wheat breeding and for the development of future wheat cultivars.
... Modelling genetic relationship as an additional source of information in phenotypic analyses has shown promising results for determining the performance of genotypes both in simulation (Bauer et al. 2006; Möhring et al. 2014; Selle et al. 2019; Terraillon et al. 2022) and empirical studies (Moreau et al. 1999; Oakey et al. 2007; Endelman et al. 2014). Although the estimated experimental efficiency and accuracy was highest with complete genomic relationship information in the study at hand, a marked advantage was likewise observed when utilizing pedigree records for modelling relationship. ...
Article
Full-text available
The increasingly cost-efficient availability of ‘omics’ data has led to the development of a rich framework for predicting the performance of non-phenotyped selection candidates in recent years. The improvement of phenotypic analyses by using pedigree and/or genomic relationship data has however received much less attention, although it has shown large potential for increasing the efficiency of early generation yield trials in some breeding programs. The aim of this study was accordingly to assess the possibility to enhance phenotypic analyses of multi-location field trials with complete relationship information as well as when merely incomplete pedigree and/or genomic relationship information is available for a set of selection candidates. For this purpose, four winter bread wheat trial series conducted in Eastern and Western Europe were used to determine the experimental efficiency and accuracy of different resource allocations with a varying degree of relationship information. The results showed that modelling relationship between the selection candidates in the analyses of multi-location trial series was up to 20% more efficient than employing routine analyses, where genotypes are assumed to be unrelated. The observed decrease in efficiency and accuracy when reducing the testing capacities was furthermore less pronounced when modelling relationship information, even in cases when merely partial pedigree and/or genomic information was available for the phenotypic analyses. Exploiting complete and incomplete relationship information in both preliminary yield trials and multi-location trial series has thus large potential to optimize resource allocations and increase the selection gain in programs that make use of various predictive breeding methods.
... is also used in spatial analysis (Cressie and Huang 1999) and in capturing spatial variation in OFE (Selle et al. 2019). ...
Preprint
Agronomists and biometricians do not doubt the importance of randomisation in agricultural experiments. Even when agronomists extend the experimentation from small trials to large on-farm trials, randomised designs predominate over systematic designs. However, the situation may change depending on the objective of the on-farm experiments (OFE). If the goal of OFE is obtaining a smooth map showing the optimal level of a controllable input across a grid made by rows and columns covering the whole field, a systematic design should be preferred over a randomised design in terms of robustness and reliability. With the novel geographically weighted regression (GWR) for OFE and simulation studies, we conclude that, for large OFE strip trials, the difference between randomised designs and systematic designs is not significant if a linear model of treatments is fitted or if the spatial variation is not taken into account. But for a quadratic model, systematic designs are superior to randomised designs.
... Raw data were first corrected for spatial variation within each trial, obtaining BLUE values for each trial that were used in further analysis. Spatial variation is common in field trials, and accounting for it increases the accuracy of estimated genetic effects (Selle et al., 2019). After the correction, the variability observed was decomposed into genetic, spatial, and residual variance, and heritability values were estimated. ...
Article
Full-text available
The introduction of Lupinus mutabilis (Andean lupin) in Europe will provide a new source of protein and oil for plant-based diets and biomass for bio-based products, while contributing to the improvement of marginal soils. This study evaluates for the first time the phenotypic variability of a large panel of L. mutabilis accessions both in their native environment and over two cropping conditions in Europe (winter crop in the Mediterranean region and summer crop in North-Central Europe), paving the way for the selection of accessions adapted to specific environments. The panel of 225 accessions included both germplasm pools from the Andean region and breeding lines from Europe. Notably, we reported higher grain yield in Mediterranean winter-cropping conditions (18 g/plant) than in the native region (9 g/plant). Instead, North European summer-cropping conditions appear more suitable for biomass production (up to 2 kg/plant). The phenotypic evaluation of 16 agronomical traits revealed significant variation in the panel. Principal component analyses pointed out flowering time, yield, and architecture-related traits as the main factors explaining variation between accessions. The Peruvian material stands out among the top-yielding accessions in Europe, characterized by early lines with high grain yield (e.g., LIB065, LIB072, and LIB155). Bolivian and Ecuadorian materials appear more valuable for the selection of genotypes for Andean conditions and for biomass production in Europe. We also observed that flowering time in the different environments is influenced by temperature accumulation. Within the panel, it is possible to identify both early and late genotypes, characterized by different thermal thresholds (600°C–700°C and 1,000–1,200°C GDD, respectively). 
Indications on top-yielding and early/late accessions, heritability of morpho-physiological traits, and their associations with grain yield are reported and remain largely environment-specific, underlining the importance of selecting useful genetic resources for specific environments. Altogether, these results suggest that the studied panel holds the genetic potential for the adaptation of L. mutabilis to Europe and provide the basis for initiating a breeding program based on exploiting the variation described herein.
... Lastly, we will compare the predictive performance of our models using the Continuous Ranked Probability Score (CRPS). It makes it possible to compare the estimated posterior mean and our observed values while accounting for the uncertainty of the estimation (Gneiting and Raftery 2007; Selle et al. 2019). It is defined as: ...
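The formula in the excerpt above is elided, so the following is only a minimal sketch of one standard form of the score, not necessarily the parametrisation the citing article uses. For a predictive CDF F and an observation y, the CRPS is the integral over x of (F(x) - 1{x >= y})^2. When the predictive distribution is assumed to be Gaussian, a well-known closed form exists, and lower values indicate better forecasts. The function names are hypothetical.

```python
import math


def normal_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def crps_gaussian(mu: float, sigma: float, y: float) -> float:
    """CRPS of a N(mu, sigma^2) predictive distribution at observation y.

    Closed form: sigma * (z * (2 * Phi(z) - 1) + 2 * phi(z) - 1 / sqrt(pi)),
    with z = (y - mu) / sigma. Smaller is better.
    """
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    z = (y - mu) / sigma
    return sigma * (
        z * (2.0 * normal_cdf(z) - 1.0)
        + 2.0 * normal_pdf(z)
        - 1.0 / math.sqrt(math.pi)
    )


if __name__ == "__main__":
    # The score grows as the observation moves away from the predictive mean.
    for observed in (0.0, 1.0, 3.0):
        print(observed, crps_gaussian(mu=0.0, sigma=1.0, y=observed))
```

Averaging this score over held-out observations gives a single number per model that rewards both accurate posterior means and well-calibrated uncertainty.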
Article
Full-text available
Quantifying the total number of individuals (abundance) of species is the basis for spatial ecology and biodiversity conservation. Abundance data are mostly collected through professional surveys as part of monitoring programs, often at a national level. These surveys rarely follow exactly the same sampling protocol in different countries, which represents a challenge for producing biogeographical abundance maps based on the transboundary information available covering more than one country. Moreover, not all species are properly covered by a single monitoring scheme, and countries typically collect abundance data for target species through different monitoring schemes. We present a new methodology to model total abundance by merging count data information from surveys with different sampling protocols. The proposed methods are used for data from national breeding bird monitoring programs in Norway and Sweden. Each census collects abundance data following two different sampling protocols in each country, i.e., these protocols provide data from four different sampling processes. The modeling framework assumes a common Gaussian Random Field shared by both the observed and true abundance with either a linear or a relaxed linear association between them. The models account for particularities of each sampling protocol by including terms that affect each observation process, i.e., accounting for differences in observation units and detectability. Bayesian inference is performed using the Integrated Nested Laplace Approximation (INLA) and the Stochastic Partial Differential Equation (SPDE) approach for spatial modeling. We also present the results of a simulation study based on the empirical census data from mid-Scandinavia to assess the performance of the models under model misspecification. Finally, maps of the expected abundance of birds in our study region in mid-Scandinavia are presented with uncertainty estimates. 
We found that the framework allows for consistent integration of data from surveys with different sampling protocols. Further, the simulation study showed that models with a relaxed linear specification are less sensitive to misspecification, compared to the model that assumes linear association between counts. Relaxed linear specifications of total bird abundance in mid-Scandinavia improved both goodness of fit and the predictive performance of the models.
... Fisher (1925) introduced the blocking concept, a traditional method dealing with field-spatial variation. However, it is difficult to control small-scale variations by blocking (Burgueño, 2018) because it assumes that the experimental units in a block are homogeneous and plots are independent (Selle et al., 2019). Gleeson and Cullis (1987) suggested modeling the correlation between plot errors in one direction (rows or columns) using an autoregressive integrated moving average model to improve the estimations of genetic parameters by accounting for spatial heterogeneity. ...
... Traditional procedures to minimize spatial variation, such as the use of blocks (complete or incomplete), do not account for the dependency between neighboring block plots (Selle et al., 2019). Considering spatial dependence is fundamental when there is heterogeneity in a trial (Resende et al., 2014). ...
Article
The burning of fossil fuels contributes to global warming. Using renewable energy sources such as elephant grass biomass mitigates anthropogenic impact on nature. The genetic selection of high-yield elephant grass genotypes is important to increase the use of this forage for energy generation. Unmanned aerial vehicles have been used for data collection and optimization of the selection of genotypes. However, statistical tests should be conducted to study the suitability of vegetation indices for predicting morphological traits. In addition, spatial sources of variation, such as soil structure heterogeneity, can disturb the selection process. This study compared the correlation between morphological traits and vegetation indices of elephant grass clones using basic linear mixed and spatial linear mixed models. In addition, we evaluated the magnitude and contribution of each index to explain the variations in traits and identify the best index for this forage. There was significant genetic variability in some morphological traits that enabled selection. Spatial models (autoregressive correlation among rows and columns) were more suitable for modeling some of the evaluated traits. There were changes in the magnitude of the correlation between traits when we considered the best-fit model instead of the non-spatial model. The increase in efficiency using the best-fitted model instead of the non-spatial model was 15.39% for heritability and 9.54% for accuracy. The total dry biomass was the only morphological trait significantly correlated with some vegetation indices, allowing for indirect selection. The coincidence index, heritability, and gains from indirect selection indicated that the normalized difference red-edge index was the best for selecting superior elephant grass high-yielding genotypes. The spatial modeling leveraged the genetic selection of high yield elephant grass genotypes for bioenergetic purposes.
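The "autoregressive correlation among rows and columns" mentioned in the abstract above is commonly expressed as a separable AR1 × AR1 structure. The sketch below builds such a correlation matrix under that assumption; it is an illustration of the general technique, not the model fitted in the article, and the function names, the row-major plot ordering and the example correlation values are all hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def ar1_correlation(n: int, rho: float) -> Matrix:
    """AR1 correlation matrix: corr(i, j) = rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def kronecker(a: Matrix, b: Matrix) -> Matrix:
    """Kronecker product of two dense matrices stored as nested lists."""
    return [
        [a_val * b_val for a_val in row_a for b_val in row_b]
        for row_a in a
        for row_b in b
    ]


def separable_field_correlation(
    n_rows: int, n_cols: int, rho_row: float, rho_col: float
) -> Matrix:
    """Plot-by-plot correlation for a field laid out in rows and columns.

    Plots are ordered row-major (index = row * n_cols + col), so the
    correlation between two plots is rho_row ** |row distance| multiplied
    by rho_col ** |column distance|.
    """
    return kronecker(
        ar1_correlation(n_rows, rho_row), ar1_correlation(n_cols, rho_col)
    )


if __name__ == "__main__":
    corr = separable_field_correlation(n_rows=2, n_cols=3, rho_row=0.5, rho_col=0.2)
    for line in corr:
        print([round(value, 3) for value in line])
```

Setting both correlation parameters to zero recovers the identity matrix, which corresponds to the non-spatial model with independent plot errors.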
... Spatial variation in OFE may introduce bias while estimating treatment effects and inflate associated standard errors if not accounted for in fitted models. Spatial variation may be caused by environmental factors such as soil fertility, moisture trends, and light exposure (Selle et al., 2019), or it could also arise due to management practices with reoccurring patterns (Gilmour et al., 1997; Hinkelmann, 2012). Two common approaches to tackling spatial variation are the modelling of a nonstationary mean structure and the modelling of a spatially autocorrelated error structure (Fotheringham, 2009; Harris, 2019). ...
... In particular, localised p-values are required to be adjusted to avoid a large number of false positives in the spatial map of treatment effects; see Rakshit et al. (2020) for the details of computing adjusted p-values in GWR. Due to the availability of adequate computing resources and due to the fact that both model fitting and statistical inference under Bayesian framework are extremely intuitive, Bayesian modelling has become popular for analysing agricultural field trials in the last few years (Besag and Higdon, 1999; Theobald et al., 2002; Che and Xu, 2010; Donald et al., 2011; Montesinos-López et al., 2018; Selle et al., 2019; Shirley et al., 2020). Montesinos-López et al. (2018) proposed a multivariate Bayesian analysis to estimate multiple-trait and multiple-environment on-farm data. ...
... Montesinos-López et al. (2018) proposed a multivariate Bayesian analysis to estimate multiple-trait and multiple-environment on-farm data. Selle et al. (2019) compared popular spatial models and proposed a Bayesian modelling framework for variety selection in plant breeding experiments. Jiang et al. (2009) used Bayesian conditional auto-regressive models to account for spatial autocorrelation in OFE data. ...
Article
Accounting for spatial variability is crucial while estimating treatment effects in large on-farm trials. It allows determining the optimal treatment for every part of a paddock, resulting in a management strategy that improves sustainability and profitability of the farm. We specify a model with spatially correlated random parameters to account for the spatial variability in large on-farm trials. A Bayesian framework has been adopted to estimate the posterior distribution of these parameters. By accounting for spatial variability, this framework allows the estimation of spatially-varying treatment effects in large on-farm trials. Several approaches have been proposed in the past for assessing spatial variability. However, these approaches lack an adequate discussion of the potential problem of model misspecification. Often the Gaussian distribution is assumed for the response variable, and this assumption is rarely investigated. Using Bayesian post-sampling tools, we show how to diagnose the problem of model misspecification. To illustrate the applicability of our proposed method, we analysed a real on-farm strip trial from Las Rosas, Argentina, with the main aim of obtaining a spatial map of locally-varying optimal nitrogen rates for the entire paddock. The analysis of these data revealed that the assumption of Gaussian distribution for the response variable is unsatisfactory; the Student-t distribution provides a more robust inference. We finish the paper by discussing the difference between the proposed Bayesian approach and geographically weighted regression, and comparing the results of these two approaches.