Article

Oracle-based optimization applied to climate model calibration

Abstract

In this paper, we show how oracle-based optimization can be effectively used for the calibration of an intermediate complexity climate model. In a fully developed example, we estimate the 12 principal parameters of the C-GOLDSTEIN climate model by using an oracle-based optimization tool, Proximal-ACCPM. The oracle is a procedure that finds, for each query point, a value for the goodness-of-fit function and an evaluation of its gradient. The difficulty in the model calibration problem stems from the need to undertake costly calculations for each simulation and also from the fact that the error function used to assess the goodness-of-fit is not convex. The method converges to a "best fit" estimate over 10 times faster than a comparable test using the ensemble Kalman filter. The approach is simple to implement and potentially useful in calibrating computationally demanding models based on temporal integration (simulation), for which functional derivative information is not readily available.
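To make the oracle idea concrete, here is a minimal sketch (not the authors' implementation) of the kind of first-order oracle Proximal-ACCPM queries: given a parameter vector, it returns the goodness-of-fit value and a gradient estimate obtained by finite differences. The `run_model` function and the `obs` vector are toy placeholders standing in for an expensive simulator such as C-GOLDSTEIN and for observational data.

```python
# Hedged sketch of a first-order oracle: value and gradient of a
# least-squares goodness-of-fit at a query point.
import numpy as np

obs = np.array([1.0, -0.5, 0.3])           # synthetic "observations" (toy data)

def run_model(theta):
    """Toy stand-in for an expensive simulation such as C-GOLDSTEIN."""
    A = np.array([[1.0, 0.2], [0.0, 1.5], [0.3, -0.4]])
    return A @ theta + 0.1 * np.sin(theta).sum()

def oracle(theta, h=1e-5):
    """Return (misfit, gradient); the gradient is a forward-difference
    estimate costing one extra simulation per parameter."""
    resid = run_model(theta) - obs
    f = 0.5 * float(resid @ resid)          # least-squares goodness-of-fit
    g = np.zeros_like(theta)
    for i in range(theta.size):
        tp = theta.copy()
        tp[i] += h
        rp = run_model(tp) - obs
        g[i] = (0.5 * float(rp @ rp) - f) / h
    return f, g

f0, g0 = oracle(np.zeros(2))
print(f0, g0)
```

Each oracle call here costs one simulation plus one per parameter for the gradient, which is why the number of query points needed by the cutting-plane method matters for an expensive model.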
... Objective calibration methods have therefore mainly been applied to Earth system Models of Intermediate Complexity (EMIC), since these models are parametrized on a highly aggregated level, such that theoretical arguments are insufficient for parameter estimation. Studies on objective calibration of these models have improved our understanding of the parameter uncertainty and the structural errors of climate models, using efficient methods to deal with the parameter estimation problem (Jackson et al., 2004; Jones et al., 2005; Severijns and Hazeleger, 2005; Beltran et al., 2006; Price et al., 2009; Rougier et al., 2009b). Most of these methods still require several thousand model years of simulations and are therefore impractical for computationally demanding climate models. ...
... Most of these methods still require several thousand model years of simulations and are therefore impractical for computationally demanding climate models. In addition, the problem of finding the global parameter optimum is often poorly addressed, as raised by Beltran et al. (2006). Although some advances have been made to objectively calibrate comprehensive climate models (Jackson et al., 2008; Neelin et al., 2010; Gregoire et al., 2011; Tett et al., 2013), the employed model resolutions are low and it remains unclear if such methods are applicable to high-resolution climate models. ...
... In particular climate models of intermediate complexity have been subject to a wide range of objective calibration methods, as shown by e.g. Price et al. (2009) using genetic algorithms, or by Beltran et al. (2006) using an oracle-based optimization. Also physical surrogates of general circulation models with reduced complexity or resolution have been optimized with very different approaches ranging from ensemble Kalman filters, Latin hypercubes, and Markov chain Monte Carlo integrations (Jackson et al., 2004; Jones et al., 2005; Annan et al., 2005; Medvigy et al., 2010; Jarvinen et al., 2010; Gregoire et al., 2011), with an overview presented in Annan and Hargreaves (2007). ...
Thesis
Full-text available
The study of the Earth climate system requires, due to its high complexity, the use of computer-based models which integrate the current knowledge about the underlying processes in the climate system. Such climate models have proven to reproduce many aspects of the observed climate. Yet inherent uncertainties obscure our current understanding of the climate system and in particular of climatic changes as a response to increasing greenhouse gas and aerosol concentrations. One source of this model uncertainty originates from unconfined semi-empirical model parameters employed in model parametrizations. Recent climate research revealed a main contribution of this parameter uncertainty to the overall uncertainty of climate change projections, and highlighted the need for objective approaches to constrain these model parameters using past observations in order to sharpen our understanding of the current and future climate. While the parameter uncertainty of low-resolution and/or idealized global climate models has been assessed intensively in recent research, so far no consensus on how to estimate these parameters in computationally expensive climate models has emerged. The risks and implications of presently adopted subjective approaches are largely unknown, and therefore a matter of constant debate in the climate modeling community. Furthermore, at the present stage little effort has been devoted to systematically assess the parameter uncertainty of regional climate models (RCMs), which are increasingly employed to provide high-resolution climate projections. The goal of this thesis is therefore to assess the importance of the parameter uncertainty of RCMs and to constrain the determined parameter uncertainty based on an objective calibration approach applicable to computationally expensive models. The objective calibration approach is finally used to systematically assess the implications of parameter estimation for the projection of climate change signals using RCMs, and to identify structural deficiencies of the models. The parameter uncertainty of the RCM COSMO-CLM is assessed by generating perturbed physics ensembles of a large set of uncertain model parameters. The evaluation addresses the mean, the interannual variability and selected extremes over regional sub-domains of the European climate. The ensembles reveal a dominant component of the model uncertainty related to uncertain model parameters, which exceeds the uncertainty associated with internal variability and observational errors. In order to quantify the agreement between the simulated and observed climates, an extension of an existing performance metric is proposed that allows for consideration of observational uncertainty and internal variability, and that integrates multivariate model errors based on a variety of observational datasets. The consideration of observational uncertainty, which is presently neglected in most metrics, has proven to be crucial at regional scales for total cloud cover and high precipitation intensities. The ensembles further demonstrate that parameters in the land surface scheme, the microphysics and convection parametrization dominate the parameter uncertainty in the model under consideration.
Based on the knowledge gained from exploring perturbed physics ensembles, we demonstrate the feasibility of an objective calibration approach that takes into account the computational constraints of state-of-the-art RCMs and efficiently estimates model parameters based on a limited number of model simulations. This efficiency has been achieved by approximating the model results using a quadratic metamodel presented in Neelin et al. (2010). Such an approximation has previously only been applied to low-resolution climate models, but is here demonstrated to be applicable to state-of-the-art high-resolution models, as the model sensitivity with respect to parameter perturbations has proven to be surprisingly smooth. The calibration yields an improvement of the model performance, even though the reference version had previously been tuned by expert knowledge. The objective calibration further proved advantageous compared to expert tuning, thanks to the objectivity of the method. Despite the calibration, some model biases remain due to structural deficiencies or parameters not included in the calibration. A prominent model bias is an overestimation of summer temperatures over semi-arid regions, which is present in a majority of climate models. This model bias has recently been shown to be non-stationary, which leads to an overestimation of projected climate change over biased regions. Complementing this field of research and relying on an ensemble of simulations using calibrated and uncalibrated parameter settings, we find that the previously proposed linear bias correction (which entails a linear increase of summer biases with model temperatures) no longer holds at high model temperatures. This breakdown can be explained by the limits of soil moisture depletion, suggesting that model biases should remain constant once drying of the soils is complete. Based on these findings we show that the consideration of the soil state is fundamental for a physically consistent bias correction of climate models and argue that incorporating such a framework could reduce the uncertainty of 21st century summer warming. Finally, we show first results on the implications of model calibration for projected regional climate change. The parameter configurations systematically affect the simulated mean climate change signals, and strongly modulate the projected changes in interannual variability as well as temperature and precipitation extremes. Thus past observations prove effective in reducing uncertainties of high-resolution European climate change projections. In summary, the results presented in this thesis demonstrate that challenges associated with uncertain model parameters in computationally expensive climate models can be dealt with by objective means. Such methodologies should become a standard in the climate modeling community, in order to foster our understanding of the climate system.
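As a rough illustration of the quadratic-metamodel idea attributed to Neelin et al. (2010), the sketch below fits a full quadratic in the parameters to a small set of simulations by least squares and then evaluates the cheap fit in place of the model. The ensemble (`runs`, `outputs`) is synthetic; a real application would use RCM simulations and an aggregated performance metric.

```python
# Hedged sketch: fit output ~ c0 + b.theta + theta'A theta to a small
# ensemble of (parameters, output) pairs, then query the cheap surrogate.
import numpy as np
from itertools import combinations_with_replacement

rng = np.random.default_rng(0)
n_param, n_runs = 3, 25
runs = rng.uniform(-1.0, 1.0, size=(n_runs, n_param))      # sampled parameter settings
outputs = np.array([1.0 + r @ [0.5, -0.2, 0.1]
                    + 0.3 * r[0] * r[1] + 0.2 * r[2] ** 2
                    for r in runs])                          # toy "model" output

def quad_features(theta):
    """Feature vector [1, theta_i, theta_i * theta_j] of a full quadratic."""
    feats = [1.0, *theta]
    feats += [theta[i] * theta[j]
              for i, j in combinations_with_replacement(range(len(theta)), 2)]
    return np.array(feats)

X = np.vstack([quad_features(r) for r in runs])
coef, *_ = np.linalg.lstsq(X, outputs, rcond=None)           # least-squares fit

theta_test = np.array([0.2, -0.4, 0.7])
print("metamodel prediction:", quad_features(theta_test) @ coef)
```

The number of coefficients grows only quadratically with the number of parameters, which is consistent with the abstract's point that a few tens of simulations can suffice when the response is smooth.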
... In recent years, the main efforts of the climate modelling community have been channelled towards developing transparent, reproducible and objective calibration methods, using well-founded mathematical and statistical frameworks (Bellprat et al., 2012b, 2016; Hourdin et al., 2017). Among others, methods based on oracle-based optimization, ensemble Kalman filters, Markov chain Monte Carlo integrations, Latin hypercubes and Bayesian stochastic inversion algorithms have been proposed and used for climate model calibration (Price et al., 2009; Beltran et al., 2006; Jackson et al., 2004; Jones et al., 2005; Annan et al., 2005; Medvigy et al., 2010; Järvinen et al., 2010; Gregoire et al., 2011; Tett et al., 2013; Schirber et al., 2013; Ollinaho et al., 2013; Williamson et al., 2013; Annan and Hargreaves, 2007). However, most of these methods cannot be directly applied to computationally costly high-resolution climate models to exhaustively explore their parameter space, since typically hundreds of simulations have to be performed (Bellprat et al., 2012b; Hourdin et al., 2017). ...
Article
Full-text available
The parameter uncertainty of a climate model represents the spectrum of the results obtained by perturbing its empirical and unconfined parameters used to represent subgrid-scale processes. In order to assess a model's reliability and to better understand its limitations and sensitivity to different physical processes, the spread of model parameters needs to be carefully investigated. This is particularly true for regional climate models (RCMs), whose performance is domain dependent. In this study, the parameter space of the Consortium for Small-scale Modeling CLimate Mode (COSMO-CLM) RCM is investigated for the Central Asia Coordinated Regional Climate Downscaling Experiment (CORDEX) domain, using a perturbed physics ensemble (PPE) obtained by performing 1-year simulations with different parameter values. The main goal is to characterize the parameter uncertainty of the model and to determine the most sensitive parameters for the region. Moreover, the presented experiments are used to study the effect of several parameters on the simulation of selected variables for subregions characterized by different climate conditions, assessing to what degree it is possible to improve model performance by properly selecting parameter values in each case. Finally, the paper explores the model parameter sensitivity over different domains, tackling the question of transferability of an RCM model setup to different regions of study. Results show that only a subset of model parameters produces relevant changes in model performance for different parameter values. Importantly, for almost all parameters, the model shows opposite behaviour across different clusters and regions. This indicates that conducting a calibration of the model against observations to determine optimal parameter values for the Central Asia domain is particularly challenging: in this case, the use of objective calibration methods is highly necessary. Finally, the sensitivity of the model to parameter perturbation for Central Asia is different from that observed for Europe, suggesting that an RCM should be retuned, and its parameter uncertainty properly investigated, when setting up model experiments for different domains of study.
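The study's actual experimental design is not reproduced here; the sketch below merely illustrates one common way to build a perturbed physics ensemble, drawing parameter combinations with a Latin hypercube so that a small number of affordable runs covers the parameter space evenly. The parameter names and ranges are invented placeholders.

```python
# Illustrative PPE sampling with a Latin hypercube (not the study's design).
from scipy.stats import qmc

names = ["tur_len", "rlam_heat", "crsmin", "entr_sc"]   # hypothetical parameters
lower = [100.0, 0.1, 50.0, 0.0001]                       # hypothetical lower bounds
upper = [1000.0, 2.0, 200.0, 0.002]                      # hypothetical upper bounds

sampler = qmc.LatinHypercube(d=len(names), seed=1)
unit = sampler.random(n=10)                  # 10 ensemble members in [0, 1]^d
members = qmc.scale(unit, lower, upper)      # rescale to physical ranges

for i, m in enumerate(members):
    print(f"member {i:02d}:", dict(zip(names, m.round(4))))
```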
... Nevertheless, the method seems to work well in the presence of significant nonlinearity. Finally, a proximal analytic centre cutting plane method (proximal ACCPM), which is highly efficient for scalar processing, is described by Beltran et al. (2005). Using any or all of these methods, it is possible to both tune the model to a best fit and also explore the range of possible outcomes (Annan et al., 2005a, Hargreaves and Annan, 2006). ...
... The development and application of objective calibration methods has therefore recently gained much attention in climate science. In particular climate models of intermediate complexity have been subject to a wide range of objective calibration methods, as shown by, e.g., Price et al. [2009] using genetic algorithms, or by Beltran et al. [2006] using an oracle-based optimization. Also physical surrogates of general circulation models with reduced complexity or resolution have been optimized with very different approaches ranging from ensemble Kalman filters, Latin hypercubes, and Markov chain Monte Carlo integrations [Jackson et al., 2004; Jones et al., 2005; Annan et al., 2005; Medvigy et al., 2010; Jarvinen et al., 2010; Gregoire et al., 2011], with an overview presented in Annan and Hargreaves [2007]. ...
Article
Full-text available
Climate models are subject to high parametric uncertainty induced by poorly confined model parameters of parameterized physical processes. Uncertain model parameters are typically calibrated in order to increase the agreement of the model with available observations. The common practice is to adjust uncertain model parameters manually, often referred to as expert tuning, which lacks objectivity and transparency in the use of observations. These shortcomings often haze model inter-comparisons and hinder the implementation of new model parameterizations. Methods that would allow model parameters to be calibrated systematically are unfortunately often not applicable to state-of-the-art climate models, due to computational constraints arising from the high dimensionality and non-linearity of the problem. Here we present an approach to objectively calibrate a regional climate model, using reanalysis-driven simulations and building upon a quadratic metamodel presented by Neelin et al. (2010) that serves as a computationally cheap surrogate of the model. Five model parameters originating from different parameterizations are selected for the optimization according to their influence on the model performance. The metamodel accurately estimates spatial averages of 2 m temperature, precipitation and total cloud cover, with an uncertainty of similar magnitude as the internal variability of the regional climate model. The non-linearities of the parameter perturbations are well captured, such that only 20-50 simulations are needed to estimate optimal parameter settings. Parameter interactions are small, which allows the number of simulations to be reduced further. In comparison to an ensemble of the same model which has undergone expert tuning, the calibration yields similar optimal model configurations, while leading to an additional reduction of the model error. The performance range captured is much wider than that sampled with the expert-tuned ensemble, and the presented methodology is effective and objective. It is argued that objective calibration is an attractive tool and could become standard procedure after introducing new model implementations, or after a spatial transfer of a regional climate model. Objective calibration of parameterizations with regional models could also serve as a strategy toward improving parameterization packages of global climate models.
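Once a cheap surrogate of the model error is available, the calibration step reduces to minimizing it over the admissible parameter box. A minimal sketch under that assumption, with a made-up quadratic surrogate standing in for the fitted metamodel:

```python
# Hedged sketch: bound-constrained minimization of a cheap surrogate error.
import numpy as np
from scipy.optimize import minimize

b = np.array([0.5, -0.2, 0.1, 0.05, -0.3])   # linear terms (toy values)
A = np.diag([0.8, 1.2, 0.6, 1.0, 0.9])       # curvature (toy, positive definite)

def surrogate_error(theta):
    """Quadratic stand-in for the metamodelled model-observation error."""
    return float(b @ theta + 0.5 * theta @ A @ theta)

bounds = [(-1.0, 1.0)] * 5                    # normalized parameter ranges
res = minimize(surrogate_error, x0=np.zeros(5), bounds=bounds, method="L-BFGS-B")
print("calibrated parameters:", res.x.round(3))
```

Because the surrogate is essentially free to evaluate, this optimization costs nothing compared with the 20-50 simulations needed to build it.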
... In computationally cheap climate models, the calibration of parameters can be done by minimizing some cost function using search algorithms (e.g., Andronova and Schlesinger 2001; Forest et al. 2002; Knutti et al. 2002, 2003; Annan et al. 2005; Beltran et al. 2005; Frame et al. 2006; Hegerl et al. 2006; Meinshausen et al. 2008). Because of the complexity of AOGCMs and the associated computational cost, model tuning (defined as the adjustment of a model parameter within some known observational range) or calibration by automated procedures (e.g., finding optimal parameter values by minimizing some error metric) is usually unfeasible. ...
Article
Full-text available
Recent coordinated efforts, in which numerous general circulation climate models have been run for a common set of experiments, have produced large datasets of projections of future climate for various scenarios. Those multimodel ensembles sample initial conditions, parameters, and structural uncertainties in the model design, and they have prompted a variety of approaches to quantifying uncertainty in future climate change. International climate change assessments also rely heavily on these models. These assessments often provide equal-weighted averages as best-guess results, assuming that individual model biases will at least partly cancel and that a model average prediction is more likely to be correct than a prediction from a single model based on the result that a multimodel average of present-day climate generally outperforms any individual model. This study outlines the motivation for using multimodel ensembles and discusses various challenges in interpreting them. Among these challenges are that the number of models in these ensembles is usually small, their distribution in the model or parameter space is unclear, and that extreme behavior is often not sampled. Model skill in simulating present-day climate conditions is shown to relate only weakly to the magnitude of predicted change. It is thus unclear by how much the confidence in future projections should increase based on improvements in simulating present-day conditions, a reduction of intermodel spread, or a larger number of models. Averaging model output may further lead to a loss of signal, for example for precipitation change, where the predicted changes are spatially heterogeneous, such that the true expected change is very likely to be larger than suggested by a model average. Last, there is little agreement on metrics to separate "good" and "bad" models, and there is concern that model development, evaluation, and posterior weighting or ranking are all using the same datasets. While the multimodel average appears to still be useful in some situations, these results show that more quantitative methods to evaluate model performance are critical to maximize the value of climate change projections from global models.
... Nevertheless, the method seems to work well in the presence of significant nonlinearity. Finally, a proximal analytic centre cutting plane method (proximal ACCPM), which is highly efficient for scalar processing, is described by Beltran et al. (2005). Using any or all of these methods, it is possible to both tune the model to a best fit and also explore the range of possible outcomes (Annan et al., 2005a, Hargreaves and Annan, 2006). ...
Article
Full-text available
The Grid ENabled Integrated Earth system modelling (GENIE) framework supports modularity (i.e. interchangeable components) and scalability (i.e. variable resolution of the components), which aids traceability, meaning the ability to relate the process representation and results for different module choices and/or resolutions to one another. Here for each of the main components of the Earth system, we discuss the appropriate modelling approaches for our goals and introduce the component models adopted thus far. We describe their coupling to produce a range of computationally efficient Earth system models (ESMs) that span a spectrum from intermediate toward full complexity, and summarise the experiments undertaken with them thus far.
... The ACCPM was first implemented for solving two-stage stochastic programming with recourse in Bahn et al. (1995). This specialized implementation has been used successfully in practical applications such as the telecommunication network problem in Andrade et al. (2004) and climate model calibration in Beltran et al. (2006). A two-level decomposition algorithm via ACCPM is also addressed in Elhedhli and Goffin (2005) to solve a production-distribution problem. ...
Article
Full-text available
In-house production and outsourcing are important strategic decisions for planning production and capacity in business organisations. Outsourcing to overseas suppliers is often associated with risk with respect to the quality of the products. Hence, we developed a multi-stage stochastic programming model that takes into account the uncertainty in the quality of outsourced production in the face of stochastic demand. The goal is to find an optimal way to choose among in-house capacity expansion, buying from local suppliers with assured quality, and buying from overseas suppliers. Moreover, we propose three alternative algorithms for solving the problem. These three approaches are: a two-level column generation by using the analytic centre cutting plane method (ACCPM), a two-level Benders' decomposition by using the ACCPM, and a two-level decomposition where the first level is solved by using the classical Dantzig-Wolfe decomposition approach and the second level is solved by using the ACCPM.
Chapter
We propose an optimization model to minimize the gap between predicted conveyance tension and measured field data for the calibration of friction coefficients. This is an oracle optimization problem, since the objective function relies on a complex numerical computing engine to simulate the downhole context. To tackle the computational complexity (the problem is NP-hard), we introduce an improved formulation that represents the residual as a function of the friction coefficients, reducing the number of parameters, and a new stochastic direction descent (SDD) method to avoid locally optimal solutions. Numerical experiments on various field cases are presented to validate this optimization framework. They show that the SDD method is efficient compared to grid, bisection, and simplex algorithms. Our study is useful for optimization tasks in complex industrial systems architecture and engineering, where objective functions are very costly or slow to evaluate.
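The chapter does not spell out the SDD algorithm, so the sketch below only shows a generic randomized descent of the kind it alludes to: at each iteration a random direction is tried and kept if the (expensive) misfit improves, which gives some ability to step out of shallow local minima. The `residual` function is a toy non-convex stand-in for the tension-misfit oracle.

```python
# Hedged sketch of a random-direction descent (not the chapter's SDD method).
import numpy as np

rng = np.random.default_rng(42)

def residual(mu):
    """Toy non-convex misfit in two friction coefficients."""
    return float((mu[0] - 0.3) ** 2 + (mu[1] - 0.25) ** 2
                 + 0.05 * np.sin(20 * mu[0]) * np.sin(20 * mu[1]))

mu = np.array([0.1, 0.1])            # initial friction-coefficient guess
best = residual(mu)
step = 0.05
for _ in range(500):
    trial = mu + step * rng.normal(size=2)   # random search direction
    trial = np.clip(trial, 0.0, 1.0)         # keep coefficients physical
    val = residual(trial)
    if val < best:                           # accept only improvements
        mu, best = trial, val

print("estimated friction coefficients:", mu.round(3), "misfit:", round(best, 5))
```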
Article
Full-text available
A computationally efficient, intermediate complexity ocean-atmosphere-sea ice model (C-GOLDSTEIN) is incorporated into the Grid ENabled Integrated Earth system modeling (GENIE) framework. This involved decoupling the three component modules and re-coupling them in a modular way, to allow replacement with alternatives and coupling of further components within the framework. The climate model described here (genie_eb_go_gs) is the most basic version of GENIE in which atmosphere, ocean and sea ice all play an active role. Compared to the original model, latitudinal grid resolution has also been generalized to allow a wider range of surface grids to be used and an altered convection scheme has been added. Some other minor modifications and corrections have been applied. For four default meshes, and using the same default parameters as far as possible, we present the results from spin-up experiments. Evaluation of equilibrium states in terms of composite model-observation errors is demonstrated, with caveats regarding the use of un-tuned key parameters. For each mesh, we also carry out four standard climate experiments, based on international protocols: (i) equilibrium climate response (sensitivity) to doubled atmospheric CO2 concentration; (ii) transient climate response to CO2 concentration, increasing at 1% per annum, until doubling; (iii) response of the Atlantic meridional overturning circulation to freshwater hosing over 100 years; and (iv) hysteresis of the overturning circulation under slowly-varied freshwater forcing. Climate sensitivity and transient climate response lie in the ranges 2.85-3.13°C and 1.67-1.97°C respectively. The Atlantic overturning collapses under 0.1 Sv hosing, and subsequently recovers, for one of the meshes. Hosing at 1.0 Sv, the overturning collapses, and remains collapsed, on all four meshes. The hysteresis experiments reveal a wide range in stability of the initial state, from strongly monostable to strongly bistable. The dependencies of experimental results on choice of mesh are thus highlighted and discussed.
Article
In this paper, we present the Grid enabled data management system that has been deployed for the Grid ENabled Integrated Earth system model (GENIE) project. The database system is an augmented version of the Geodise Database Toolbox and provides a repository for scripts, binaries and output data in the GENIE framework. By exploiting the functionality available in the Geodise toolboxes we demonstrate how the database can be employed to tune parameters of coupled GENIE Earth System Model components to improve their match with observational data. A Matlab client provides a common environment for the project Virtual Organization and allows the scripting of bespoke tuning studies that can exploit multiple heterogeneous computational resources. We present the results of a number of tuning exercises performed on GENIE model components using multi-dimensional optimization methods. In particular, we find that it is possible to successfully tune models with up to 30 free parameters using Kriging and Genetic Algorithm methods.
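This is not the paper's Kriging or genetic-algorithm machinery, just an illustration of the same kind of derivative-free, population-based tuning using an off-the-shelf evolutionary optimizer; `model_error` is a synthetic stand-in for a GENIE model-observation misfit.

```python
# Hedged sketch: derivative-free tuning of a few free parameters.
import numpy as np
from scipy.optimize import differential_evolution

target = np.linspace(0.0, 1.0, 8)                      # synthetic "observations"

def model_error(theta):
    """Toy misfit: how far a parameter-controlled curve is from the target."""
    curve = theta[0] + theta[1] * np.linspace(0.0, 1.0, 8) ** theta[2]
    return float(np.sum((curve - target) ** 2))

bounds = [(-1.0, 1.0), (0.0, 2.0), (0.5, 3.0)]         # three free parameters
result = differential_evolution(model_error, bounds, maxiter=50, seed=0)
print("tuned parameters:", result.x.round(3), "misfit:", round(result.fun, 6))
```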
Article
We show how the response of a chaotic model to temporally varying external forcing can be efficiently tuned via parameter estimation using time series data, extending previous work in which an unforced climatologically steady state was used as the tuning target. Although directly fitting a long trajectory of a chaotic deterministic model to a time series of data is generally not possible even in principle, this is not actually necessary for useful prediction on climatological time-scales. If the model and data outputs are averaged over suitable time-scales, the effect of chaotic variability is effectively converted into nothing more troublesome than some statistical noise. We show how tuning of models to unsteady time series data can be efficiently achieved with an augmented ensemble Kalman filter, and we demonstrate the procedure with application to a forced version of the Lorenz model. The computational cost is of the order of 100 model integrations, and so the method should be directly applicable to more sophisticated climate models of at least moderate resolution.
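A minimal sketch of the state-augmentation idea behind ensemble Kalman filter parameter estimation: each ensemble member carries a parameter value, the model maps it to a time-averaged observable, and the usual Kalman update is applied to the augmented (parameter, observable) statistics. The toy model and noise levels below are invented for illustration.

```python
# Hedged sketch of one perturbed-observation EnKF analysis step on a parameter.
import numpy as np

rng = np.random.default_rng(3)
n_ens = 50
theta_true = 1.7
obs_err = 0.05

def time_mean_observable(theta):
    """Toy stand-in for a time-averaged model diagnostic."""
    return np.tanh(theta) + 0.2 * theta

y_obs = time_mean_observable(theta_true) + rng.normal(0.0, obs_err)

theta_ens = rng.normal(1.0, 0.5, size=n_ens)   # prior parameter ensemble
y_ens = time_mean_observable(theta_ens)        # predicted observables per member

# Ensemble statistics of the augmented vector (theta, y).
cov_ty = np.cov(theta_ens, y_ens)[0, 1]
var_y = np.var(y_ens, ddof=1)
gain = cov_ty / (var_y + obs_err ** 2)          # Kalman gain for the parameter

# Perturbed-observation update of each member.
y_perturbed = y_obs + rng.normal(0.0, obs_err, size=n_ens)
theta_post = theta_ens + gain * (y_perturbed - y_ens)

print("prior mean:", theta_ens.mean().round(3),
      "posterior mean:", theta_post.mean().round(3))
```

In a real application the observable would be a climatological time average of the model output, which, as the abstract notes, converts chaotic variability into effectively statistical noise.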
Article
A theory for estimating the probability distribution of the state of a model given a set of observations exists. This nonlinear filtering theory unifies the data assimilation and ensemble generation problem that have been key foci of prediction and predictability research for numerical weather and ocean prediction applications. A new algorithm, referred to as an ensemble adjustment Kalman filter, and the more traditional implementation of the ensemble Kalman filter in which "perturbed observations" are used, are derived as Monte Carlo approximations to the nonlinear filter. Both ensemble Kalman filter methods produce assimilations with small ensemble mean errors while providing reasonable measures of uncertainty in the assimilated variables. The ensemble methods can assimilate observations with a nonlinear relation to model state variables and can also use observations to estimate the value of imprecisely known model parameters. These ensemble filter methods are shown to have significant advantages over four-dimensional variational assimilation in low-order models and scale easily to much larger applications. Heuristic modifications to the filtering algorithms allow them to be applied efficiently to very large models by sequentially processing observations and computing the impact of each observation on each state variable in an independent calculation. The ensemble adjustment Kalman filter is applied to a nondivergent barotropic model on the sphere to demonstrate the capabilities of the filters in models with state spaces that are much larger than the ensemble size. When observations are assimilated in the traditional ensemble Kalman filter, the resulting updated ensemble has a mean that is consistent with the value given by filtering theory, but only the expected value of the covariance of the updated ensemble is consistent with the theory. The ensemble adjustment Kalman filter computes a linear operator that is applied to the prior ensemble estimate of the state, resulting in an updated ensemble whose mean and also covariance are consistent with the theory. In the cases compared here, the ensemble adjustment Kalman filter performs significantly better than the traditional ensemble Kalman filter, apparently because noise introduced into the assimilated ensemble through perturbed observations in the traditional filter limits its relative performance. This superior performance may not occur for all problems and is expected to be most notable for small ensembles. Still, the results suggest that careful study of the capabilities of different varieties of ensemble Kalman filters is appropriate when exploring new applications.
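A scalar illustration of the deterministic "adjustment" update that distinguishes the ensemble adjustment Kalman filter from the perturbed-observation filter: the ensemble is shifted to the posterior mean and linearly contracted to the posterior variance, without adding observation noise to individual members. This is a textbook-style sketch, not code from the paper.

```python
# Hedged sketch of a scalar ensemble adjustment (deterministic) update.
import numpy as np

rng = np.random.default_rng(7)
prior = rng.normal(2.0, 1.0, size=40)      # prior ensemble of one observed variable
y_obs, obs_var = 2.8, 0.25                 # observation and its error variance

m, v = prior.mean(), prior.var(ddof=1)
post_var = 1.0 / (1.0 / v + 1.0 / obs_var)         # Gaussian product variance
post_mean = post_var * (m / v + y_obs / obs_var)    # Gaussian product mean

# Deterministic adjustment: recentre and rescale the existing members.
posterior = post_mean + np.sqrt(post_var / v) * (prior - m)

print("posterior mean:", posterior.mean().round(3),
      "posterior var:", posterior.var(ddof=1).round(3))
```

Because no random observation perturbations enter the update, the posterior ensemble mean and variance match the filtering theory exactly, which is the advantage the abstract describes.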
Article
We present a practical, efficient and powerful solution to the problem of parameter estimation in highly non-linear models. The method is based on the ensemble Kalman filter, and has previously been successfully applied to a simple climate model with steady-state dynamics. We demonstrate, via application to the well-known Lorenz model, that the method can successfully perform multivariate parameter estimation even in the presence of chaotic dynamics. Traditional variational methods using an adjoint model have limited applicability to problems of this nature, and the alternative of a brute force (or randomized) search in parameter space is prohibitively expensive for high-dimensional applications. The cost of our method is comparable to that of integrating an ensemble to statistical convergence, and therefore this technique appears to be ideally suited for probabilistic climate prediction.
Article
When the two components are coupled after being spun up individually, the system remains steady provided that no intermittent convection is present in the ocean model. If intermittent convection is operating, the coupled model shows systematic deviations of the surface salinity, which may result in reversals of the thermohaline circulation. This climate drift can be inhibited by removing intermittent convection prior to coupling. The climate model is applied to investigate the effect of excess freshwater discharge into the North Atlantic, and the influence of the parameterization of precipitation is tested. The Atlantic thermohaline flow is sensitive to anomalous freshwater input. Reversals of the deep circulation can occur in the Atlantic, leading to a state where deep water is formed only in the Southern Ocean.
Article
Proximal ACCPM is a variant of the analytic center cutting plane method, in which a proximal term is added to the barrier function that defines the center. The present paper gives a detailed presentation of the method and of its implementation. Proximal ACCPM is used to solve the Lagrangian relaxation of the p-median problem on two sets of problem instances. Problems of the same collection are tentatively solved with the classical column generation scheme. Keywords: Lagrangian relaxation, column generation, cutting plane method, analytic center cutting plane method.
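In generic notation (accumulated cutting planes a_i^T z <= c_i for i = 1,...,m, a proximal reference point z-bar, and a weight rho > 0, all of which are assumptions of this sketch rather than the paper's exact formulation), the proximal analytic center can be written as:

```latex
% Proximal analytic center: the log-barrier of the current localization set,
% augmented with a quadratic proximal term around the reference point \bar z.
F_\rho(z) \;=\; \frac{\rho}{2}\,\lVert z - \bar z \rVert^2
\;-\; \sum_{i=1}^{m} \log\bigl(c_i - a_i^\top z\bigr),
\qquad
z_c \;=\; \arg\min_{z} F_\rho(z).
```

Each new oracle answer adds a cutting plane to the sum, and the proximal term keeps successive query points from drifting far from the best point found so far.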