
The geostatistical approach to the inverse problem


Abstract

The geostatistical approach to the inverse problem is discussed with emphasis on the importance of structural analysis. Although the geostatistical approach is occasionally misconstrued as mere cokriging, in fact it consists of two steps: estimation of statistical parameters (“structural analysis”) followed by estimation of the distributed parameter conditional on the observations (“cokriging” or “weighted least squares”). It is argued that in inverse problems, which are algebraically undetermined, the challenge is not so much to reproduce the data as to select an algorithm with the prospect of giving good estimates where there are no observations. The essence of the geostatistical approach is that instead of adjusting a grid-dependent and potentially large number of block conductivities (or other distributed parameters), a small number of structural parameters are fitted to the data. Once this fitting is accomplished, the estimation of block conductivities ensues in a predetermined fashion without fitting of additional parameters. Also, the methodology is compared with a straightforward maximum a posteriori probability estimation method. It is shown that the fundamental differences between the two approaches are: (a) they use different principles to separate the estimation of covariance parameters from the estimation of the spatial variable; (b) the method for covariance parameter estimation in the geostatistical approach produces statistically unbiased estimates of the parameters that are not strongly dependent on the discretization, while the other method is biased and its bias becomes worse by refining the discretization into zones with different conductivity.
... To overcome the ill-posedness originating in the limited number of observations, the geostatistical inversion approach [8,9] is widely used for various subsurface applications [10,11,12,13]. The approach uses spatial correlation of the underlying unknown field as prior information within the Hierarchical Bayesian framework [14,15]. ...
... After computing the best estimation, the posterior covariance can be used for the estimation of uncertainty. The posterior covariance of s, V, is the inverse of the Hessian of the objective function (Equation 15) and can be simplified by using matrix identities: ...
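For orientation, the matrix identity alluded to in the excerpt above can be sketched for the linear Gaussian case with a known prior mean. The symbols below ($H$ for the linear forward operator, $R$ for the measurement-error covariance, $Q$ for the prior covariance of $s$, $\mu$ for the prior mean, $y$ for the data) are assumptions chosen for illustration; the citing paper's own Equation 15 is not reproduced here and may differ in form.

```latex
% Assumed objective (negative log-posterior up to a constant), linear forward model y = H s + noise:
\[
  L(s) = \tfrac{1}{2}\,(y - Hs)^{\top} R^{-1} (y - Hs)
       + \tfrac{1}{2}\,(s - \mu)^{\top} Q^{-1} (s - \mu)
\]
% Its Hessian is constant in s, and the posterior covariance is the inverse of that Hessian:
\[
  V = \left( H^{\top} R^{-1} H + Q^{-1} \right)^{-1}
\]
% The Woodbury identity gives an equivalent form that only inverts a matrix
% whose size is the number of observations:
\[
  V = Q - Q H^{\top} \left( H Q H^{\top} + R \right)^{-1} H Q
\]
```

The second form is usually the cheaper one when there are far fewer observations than unknowns, which is the typical situation in the underdetermined problems discussed here.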
... The MT model $h_{MT}$ uses electrical conductivity σ or resistivity ρ distributions to produce the electromagnetic signals. The self-potential model is adopted here to link these two models and eliminate the need for an often unknown or uncertain petrophysical relationship, by simply satisfying the continuity equation of the self-potential in Equation 11 with the currently estimated groundwater velocity and resistivity fields through self-potential observations ϕ within the Bayesian framework as in Equation 15. ...
Preprint
Estimating subsurface properties like hydraulic conductivity using hydrogeological data alone is challenging in field sites with sparse wells. Geophysical data, including Self-potential (SP) and Magnetotelluric (MT), can improve understanding of hydrogeological structures and interpolate data between wells. However, determining hydraulic conductivity requires a proper petrophysical relationship between hydraulic conductivity and inferred geophysical properties, which may not exist or be unique. We propose a joint-inversion approach without assuming petrophysical relationships, using self-potential data to connect groundwater flow velocity to electrical potential differences, and magnetotelluric data to estimate hydraulic conductivity and electrical conductivity. A spectral method is employed for the self-potential forward problem. To accelerate joint data inversion, a dimension reduction technique through the Principal Component Geostatistical Approach is used. The applicability and robustness of the joint-inversion method are demonstrated through inversion tests using hydrogeophysical data sets generated from subsurface models with and without petrophysical relationships. The joint hydraulic head-SP-MT data inversion can reasonably estimate hydraulic conductivity and electrical resistivity, even without knowledge of a one-to-one petrophysical relationship. On average, joint inversion yields a 25% improvement in hydraulic conductivity estimates compared to single data-type inversion. Our proposed joint inversion approach, with SP-data compensating for the absence of a known petrophysical relationship, provided close agreement with joint inversion of head and MT data using a known petrophysical relationship. Successful inversion tests demonstrate the usefulness of SP data in connecting hydrogeological properties and geophysical data without requiring a petrophysical relationship.
... The MAP estimate is computed by solving a PDE-constrained optimization problem consisting of minimizing a certain norm of the difference between predicted and measured observables (data misfit term) plus a regularizing penalty. Assuming that the solution is obtained at a global minimum, the MAP estimate is equivalent to the largest mode of the Bayesian posterior with the data misfit term corresponding to the (negative) Bayesian log-likelihood and the regularizing penalty corresponding to the (negative) Bayesian log-prior [1,2,3]. One can drop the PDE constraint by modeling the predicted observables via a "surrogate" model, at the cost of constructing said model either on the fly (e.g., [4]) or ahead of tackling the inverse problem (e.g., [5,6,7]). ...
... The MAP estimator [1] of y is computed by minimizing the sum of the $\ell_2$-norm of the discrepancy between measurements and model predictions, plus a regularization penalty on y, that is, by solving the PDE-constrained minimization problem $\min_{u,y}$ ...
... Mathematically, Eqs. (1) and (20) are identical, although the field u(x) computed using these two equations will be different. Therefore, solving the inverse problem for Eq. ...
Preprint
We present a model inversion algorithm, CKLEMAP, for data assimilation and parameter estimation in partial differential equation models of physical systems with spatially heterogeneous parameter fields. These fields are approximated using low-dimensional conditional Karhunen-Loève expansions, which are constructed using Gaussian process regression models of these fields trained on the parameters' measurements. We then assimilate measurements of the state of the system and compute the maximum a posteriori estimate of the CKLE coefficients by solving a nonlinear least-squares problem. When solving this optimization problem, we efficiently compute the Jacobian of the vector objective by exploiting the sparsity structure of the linear system of equations associated with the forward solution of the physics problem. The CKLEMAP method provides better scalability compared to the standard MAP method. In the MAP method, the number of unknowns to be estimated is equal to the number of elements in the numerical forward model. On the other hand, in CKLEMAP, the number of unknowns (CKLE coefficients) is controlled by the smoothness of the parameter field and the number of measurements, and is in general much smaller than the number of discretization nodes, which leads to a significant reduction of computational cost with respect to the standard MAP method. To show its advantage in scalability, we apply CKLEMAP to estimate the transmissivity field in a two-dimensional steady-state subsurface flow model of the Hanford Site by assimilating synthetic measurements of transmissivity and hydraulic head. We find that the execution time of CKLEMAP scales nearly linearly as $N^{1.33}$, where $N$ is the number of discretization nodes, while the execution time of standard MAP scales as $N^{2.91}$. The CKLEMAP method improved execution time without sacrificing accuracy when compared to the standard MAP.
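To make the two reported scaling exponents concrete, the following minimal sketch computes how much the execution time would grow when the number of discretization nodes is multiplied by a given factor. It assumes a pure power law $t \propto N^{p}$ with no lower-order terms; the function name `growth_factor` is hypothetical, and because the proportionality constants are not given in the abstract, the numbers compare growth within each method rather than absolute run times.

```python
def growth_factor(exponent: float, refinement: float) -> float:
    """Relative increase in run time when N is multiplied by `refinement`,
    assuming run time follows a pure power law t = c * N**exponent."""
    return refinement ** exponent


# Exponents quoted in the abstract above.
CKLEMAP_EXPONENT = 1.33
MAP_EXPONENT = 2.91

if __name__ == "__main__":
    for refinement in (2.0, 10.0):
        ckle = growth_factor(CKLEMAP_EXPONENT, refinement)
        std = growth_factor(MAP_EXPONENT, refinement)
        print(f"N x {refinement:g}: CKLEMAP time x {ckle:.2f}, standard MAP time x {std:.2f}")
```

Under this assumption, doubling the mesh size multiplies the CKLEMAP run time by roughly 2.5 and the standard MAP run time by roughly 7.5.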
... At a regional scale, inversions that assimilate all observations simultaneously by utilizing a precomputed forward operator (Lin et al., 2003) that describes the relationship between observations and fluxes are commonly used (for details, see Enting, 2002). This work focuses on the use of precomputed forward operators for atmospheric inverse modeling and addresses the sensitivity analysis and correlation in the forward operator in the context of Bayesian (e.g., Lauvaux et al., 2016) and geostatistical inverse methods (e.g., Kitanidis, 1996). ...
... For linear Bayesian and geostatistical inverse problems, the solutions (see Tarantola, 2005, for the batch Bayesian and Kitanidis, 1996, for the geostatistical case) can be obtained by minimizing their respective objective functions. These objective functions can be given as follows: ...
Article
Several metrics have been proposed and utilized to diagnose the performance of linear Bayesian and geostatistical atmospheric inverse problems. These metrics primarily assess the reductions in the prior uncertainties, compare modeled observations to true observations, and check distributional assumptions. Although important, these metrics should be augmented with a sensitivity analysis to obtain a comprehensive understanding of the atmospheric inversion performance and improve the quality and confidence in the inverse estimates. In this study, we derive closed-form expressions of local sensitivities for various input parameters, including measurements, covariance parameters, covariates, and a forward operator. To further enhance our understanding, we complement the local sensitivity analysis with a framework for a global sensitivity analysis that can apportion the uncertainty in input parameters to the uncertainty associated with inverse estimates. Additionally, we propose a mathematical framework to construct nonstationary correlation matrices from a precomputed forward operator, which is closely tied to the overall quality of inverse estimates. We demonstrate the application of our methodology in the context of an atmospheric inverse problem for estimating methane fluxes in Los Angeles, California.
... Yang [7] used Modflow-2005 numerical simulation software to carry out inverse analysis on pumping test data. P. K. Kitanidis [8,9] proposed the quasi-linear theory of geostatistical solutions for inverse problems, extending the geostatistical method to a method for solving inverse problems involving spatial distribution parameters and process equations. Zhang [10] studied the influence of temperature data on the characterization of reservoir permeability by calculating the joint inversion of flow and temperature observations. ...
Article
During the construction of underground engineering, the prediction of groundwater distribution and rock body permeability is essential for evaluating the safety of the project and guiding subsequent design and construction. This article proposes an objective function for solving an underdetermined inverse analysis problem based on least-squares theory and regularization, and uses geostatistical theory and the variogram function to describe the spatial characteristics of the actual engineering system. It also establishes an optimization model of the stratum seepage field and puts forward a method that uses on-site test observation data to solve for the stratum permeability coefficient. For the foundation pit project of the Lingshanwei Station of the Qingdao Metro, on-site pumping and packer permeability tests were conducted in different strata of the foundation pit, and on-site water-head observations were obtained. Physical detection of the area influenced by the foundation pit excavation confirms the spatial pattern given by the model, and the on-site pumping test verifies the accuracy of its values. Results show that the accuracy of using this objective function to solve the underdetermined inverse problem is above 85%, which demonstrates the effectiveness of the method. The stratigraphic geological information obtained by the inverse analysis model provides an important basis for engineering design and safe construction.
... (12) and (13) the trend is considered only in the secondary variable (i.e., water head) as there is no trend in the primary variable (i.e., log-transmissivity). Furthermore, the universal co-kriging system in which the cross-covariances between log-transmissivity and water head take into account the groundwater flow equation (i.e., the physical link between log-transmissivity and water head) is the geostatistical solution to the inverse problem in hydrogeology (Kitanidis 1996; Zimmerman et al. 1998; Rubin 2003) and hence the descriptive name of "inverse problem universal co-kriging" given to the estimator. ...
Article
Transmissivity is a significant hydrogeological parameter that affects the reliability of groundwater flow and transport models. This study demonstrates the improvement in the estimated transmissivity field of an unconfined detritic aquifer that can be obtained by using geostatistical methods to combine three types of data: hard transmissivity data obtained from pumping tests, soft transmissivity data obtained from lithological information from boreholes, and water head data. The piezometric data can be related to transmissivity by solving the hydrogeology inverse problem, i.e., including the observed water head to determine the unknown model parameters (log transmissivities). The geostatistical combination of all the available information is achieved by using three different geostatistical methodologies: ordinary kriging, ordinary co-kriging and inverse problem universal co-kriging. In addition, there are eight methodological cases to be compared according to which log-transmissivity data are considered as the primary variable in co-kriging and whether two or three variables are used in inverse-problem universal co-kriging. The results are validated by using the performance statistics of the direct modelling of the unconfined groundwater flow and comparing observed water heads with the modelled ones. Although the results show that the two sets of log-transmissivity data are incompatible, the set of log-transmissivity data from the lithofacies provides a good log-transmissivity image that can be improved by inverse modelling. The map provided by inverse-problem universal co-kriging provides the best results. Using three variables, rather than two in the inverse problem, gives worse results because of the incompatibility of the log-transmissivity data sets.
... For the current study, ordinary kriging was adopted to interpolate the spatial distribution of K from 11 HPT K profiles. Detrending was first performed to remove the redundant variability in the variogram (Kitanidis, 1996). Spatial correlation of the high-resolution K values was characterized through variogram analysis using detrended lnK data. ...
Article
Hydraulic profiling tool (HPT) and hydraulic tomography (HT) have been developed as promising techniques for the high-resolution characterization of surficial aquifer systems. HPT surveys can be rapidly conducted at a resolution of 1.5 cm, but provide only one-dimensional vertical profiles and require site-specific formulae to relate HPT measurements to K. Geostatistics-based HT can estimate three-dimensional distributions of hydraulic parameters, but its estimates can be smooth and lacking in detail when pumping/observation data are sparse, and they are usually constrained to the area enclosed by the pumping and observation network. In this study, HPT and HT surveys were conducted at the North Campus Research Site (NCRS) in Waterloo, Ontario, Canada to characterize a glaciofluvial multi-aquifer-aquitard system with permeameter K values spanning nearly seven orders of magnitude. We first performed the geostatistical analysis of K values derived from 11 HPT surveys using the power-law model developed for the NCRS by Zhao and Illman (2022b). Then, the benefits of incorporating HPT K profiles into HT were evaluated for the reconstruction of hydraulic parameter fields. Results showed that the arithmetic mean of 11 HPT K profiles was nearly one order of magnitude higher than that of the 544 permeameter test K measurements. The K fields interpolated by ordinary kriging captured the vertically alternating pattern of aquifer and aquitard layers, but over-predicted K by nearly one order of magnitude relative to permeameter K for the top part of the aquifer system. By incorporating the kriged K values as prior means for the geostatistics-based HT, improvements over the inversion case using only pressure head data were found in capturing the spatial heterogeneity of K fields both inside and outside the well cluster and in drawdown predictions.
The inclusion of vertical variation information of K derived from HPT into HT was also helpful in refining the vertical locations of estimated layer boundaries and introducing intralayer variation patterns of K for both aquifer and aquitard layers. This study demonstrates the potential ability of HT and HPT and advocates the joint use of both techniques for the characterization of subsurface heterogeneity at highly heterogeneous sites such as the NCRS.
Chapter
Many contemporary problems within the Earth sciences are complex, and require an interdisciplinary approach. This book provides a comprehensive reference on data assimilation and inverse problems, as well as their applications across a broad range of geophysical disciplines. With contributions from world leading researchers, it covers basic knowledge about geophysical inversions and data assimilation and discusses a range of important research issues and applications in atmospheric and cryospheric sciences, hydrology, geochronology, geodesy, geodynamics, geomagnetism, gravity, near-Earth electron radiation, seismology, and volcanology. Highlighting the importance of research in data assimilation for understanding dynamical processes of the Earth and its space environment and for predictability, it summarizes relevant new advances in data assimilation and inverse problems related to different geophysical fields. Covering both theory and practical applications, it is an ideal reference for researchers and graduate students within the geosciences who are interested in inverse problems, data assimilation, predictability, and numerical methods.
Article
A geostatistical approach is developed for the prediction of log-transmissivity, hydraulic head, and ultimately seepage velocities in a two-dimensional model of a confined aquifer under steady state conditions. The primary goal is to assess the uncertainty in model predictions associated with the uncertainty and scarcity of input data. The method uses cokriging to predict the most likely values for these functions and uses conditional simulations to generate equally probable realizations of these functions. The method allows for model uncertainty in the prescription of the boundary heads. The method is successfully applied to an artificial aquifer.
Article
Groundwater flow models require specification of several input parameter fields which are inferred from limited data. In this paper the hydraulic conductivity and recharge rate are estimated for an unconfined aquifer under steady flow conditions. Usually point observations of conductivity and head are available for the estimation of the distributed conductivity field and the recharge rate. Use of numerical flow models requires that these fields be prescribed as average values over finite elements. The geostatistical solution to this problem uses linear estimation to obtain the distributed conductivity field and the recharge rate. Monte Carlo simulations are used here to compute the covariances associated with the head data. Results obtained for both artificial and real data show that the head data are effective in improving the estimates that would result using conductivity data alone. The use of Monte Carlo simulations results in a method which can be used under a wider variety of modeling conditions than previous applications of the geostatistical approach. 11 refs., 2 figs.
Article
Prior information on the parameters of a groundwater flow model can be used to improve parameter estimates obtained from nonlinear regression solution of a modeling problem. Two scales of prior information can be available: (1) prior information having known reliability (that is, bias and random error structure) and (2) prior information consisting of best available estimates of unknown reliability. A regression method that incorporates the second scale of prior information assumes the prior information to be fixed for any particular analysis to produce improved, although biased, parameter estimates. Approximate optimization of two auxiliary parameters of the formulation is used to help minimize the bias, which is almost always much smaller than that resulting from standard ridge regression. It is shown that if both scales of prior information are available, then a combined regression analysis may be made.
Article
The problem of estimating hydrogeologic parameters, in particular, permeability, from input-output measurements is reexamined in a geostatistical framework. The field of the unknown parameters is represented as a "random field" and the estimation procedure consists of two main steps. First, the structure of the parameter field is identified, i.e., mathematical representations of the variogram and the trend are selected and their parameters are established by using all available information, including measurements of hydraulic head and permeability. Second, linear estimation theory is applied to provide minimum variance and unbiased point estimates of hydrogeologic parameters ("kriging"). Structure identification is achieved iteratively in three substeps: structure selection, maximum likelihood estimation, and model validation and diagnostic checking. The methodology was extensively tested through simulations on a simple one-dimensional case. The results are remarkably stable and well behaved. The estimated field is smooth, while small-scale variability is statistically described. As the quality of measurements improves, the procedure reproduces more features of the original field. The results are also shown to be rather insensitive to deviations from assumptions about the geostatistical structure of the field.
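The second step described above (minimum-variance unbiased linear estimation) can be illustrated with a minimal ordinary-kriging sketch in one dimension. This is not the authors' implementation: it assumes the structural parameters are already known, uses a hypothetical exponential covariance with made-up variance and correlation length, treats only direct measurements of the unknown (no head data), and assumes a constant unknown mean. All function names are illustrative.

```python
import math


def exp_cov(h, variance=1.0, length=2.0):
    """Assumed exponential covariance model (hypothetical parameter values)."""
    return variance * math.exp(-abs(h) / length)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ordinary_kriging(xs, ys, x0, cov=exp_cov):
    """Return (estimate, estimation variance, weights) at location x0."""
    n = len(xs)
    # Kriging system: [Q 1; 1^T 0] [lambda; nu] = [q0; 1].
    # The last row enforces sum(lambda) = 1, i.e. unbiasedness for an unknown constant mean.
    a = [[cov(xs[i] - xs[j]) for j in range(n)] + [1.0] for i in range(n)]
    a.append([1.0] * n + [0.0])
    q0 = [cov(xi - x0) for xi in xs]
    sol = solve(a, q0 + [1.0])
    lam, nu = sol[:n], sol[n]
    estimate = sum(w * y for w, y in zip(lam, ys))
    variance = cov(0.0) - sum(w * q for w, q in zip(lam, q0)) - nu
    return estimate, variance, lam


if __name__ == "__main__":
    xs, ys = [0.0, 1.0, 3.0], [1.0, 2.0, 0.5]
    est, var, lam = ordinary_kriging(xs, ys, 2.0)
    print(f"estimate={est:.3f}, variance={var:.3f}, weights={lam}")
```

Once the covariance parameters are fixed, the estimate follows from one linear solve with no further fitting, which mirrors the "predetermined fashion" of the estimation step in the geostatistical approach.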
Article
The inverse problem is defined here as follows: determine the transmissivity at various points, given the shape and boundary of the aquifer and recharge intensity and given a set of measured log-transmissivity Y and head H values at a few points. The log-transmissivity distribution is regarded as a realization of a random function of normal and stationary unconditional probability density function (pdf). The solution of the inverse problem is the conditional normal pdf of Y, conditioned on measured H and Y, which is expressed in terms of the unconditional joint pdf of Y and H. The problem is reduced to determining the unconditional head-log-transmissivity covariance and head variogram for a selected Y covariance which depends on a few unknown parameters. This is achieved by solving a first-order approximation of the flow equations. The method is illustrated for an exponential Y covariance, and the effect of head and transmissivity measurements upon the reduction of uncertainty of Y is investigated systematically. It is shown that measurement of H has a lesser impact than those of Y, but a judicious combination may lead to significant reduction of the predicted variance of Y. Possible applications to real aquifers are outlined.
Article
Two separate applications of the geostatistical solution to the inverse problem in groundwater modeling are presented. Both applications estimate the transmissivity field for a two-dimensional model of a confined aquifer under steady flow conditions. The estimates are based on point observations of transmissivity and hydraulic head and also on a model of the aquifer which includes prescribed head boundaries, leakage, and steady state pumping. The model used to describe the spatial variability of the log-transmissivity describes large-scale fluctuations through a linear mean or drift intermediate and small-scale fluctuations through a two-parameter covariance function. The first application presented estimates the log-transmissivities using Gaussian conditional mean estimation. The second application uses an extended form of cokriging. The two methods are compared and their relative merits discussed. The extended cokriging application is applied to the Jordan Aquifer of Iowa. A comparison is also made between the conditional mean application and an analytical approach.
Article
A first-order analytical solution of the inverse problem for aquifer steady flow, presented in paper 1 (Rubin and Dagan, this issue), is applied to the Avra Valley aquifer (Clifton and Neuman, 1982). First, the parameters characterizing the statistical structure of the log-transmissivity Y and water head H fields are estimated by a maximum likelihood procedure. The results for Y are in good agreement with those of Clifton and Neuman (1982), in spite of the different methodologies. The incorporation of head measurements is shown to have definite advantages in reducing the estimation variances of Y parameters. Next, the best estimates of Y at various points are obtained by simultaneous conditioning on the measurements of Y and H. It is shown that a substantial reduction in the variance of the conditioned Y is achieved by accounting for H measurements, justifying a posteriori the solution of the inverse problem. Finally, the effective recharge, which is assumed to be uniform, but random, is estimated as part of the process. Although the latter is relatively small for Avra Valley, it might be a parameter of considerable interest in other cases. Further applications of the methodology are suggested.
Article
Linear estimation has found many applications in the inference of spatial functions in surface and subsurface hydrology. The effect of parameter uncertainty is examined in a Bayesian framework with emphasis on the derivation of the Bayesian distribution (and its first two moments) of unknown quantities given some measurements. This distribution accounts not only for natural variability but also for parameter uncertainty. For known covariance parameters the Bayesian distribution is Gaussian (for Gaussian processes) with the mean being a given linear function of the data. This linear estimator is equivalent to the conventional Gaussian conditional mean estimator for a priori known drift coefficients and is the same as kriging for diffuse prior distribution of the drift coefficients; however, the developed procedure is more general. When both drift and covariance function parameters are uncertain, the Bayesian distribution is generally not Gaussian, and the Bayesian conditional mean is a nonlinear estimator. The case of diffuse priors is examined in some detail; it is shown that the posterior distribution of the covariance function parameters is given by the restricted likelihood function, i.e., the likelihood function of generalized increments. The results provide insight into the applicability of maximum likelihood versus restricted maximum likelihood parameter estimation, and conventional linear versus kriging estimation. A more general procedure which includes these methods as special cases is presented.
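As a reminder of what the restricted likelihood mentioned above looks like, the following is a sketch of its standard form for a Gaussian model with linear drift. The notation ($y$ for the data vector, $X$ for the drift matrix with full column rank, $\beta$ for the drift coefficients, $Q(\theta)$ for the data covariance with parameters $\theta$) is assumed here for illustration and is not taken from the abstract.

```latex
% Assumed model: y = X beta + epsilon, with epsilon ~ N(0, Q(theta)).
% Negative restricted log-likelihood of the covariance parameters theta (up to an additive constant):
\[
  L_R(\theta) = \tfrac{1}{2} \ln \lvert Q \rvert
              + \tfrac{1}{2} \ln \lvert X^{\top} Q^{-1} X \rvert
              + \tfrac{1}{2}\, y^{\top} \Xi\, y ,
\]
\[
  \Xi = Q^{-1} - Q^{-1} X \left( X^{\top} Q^{-1} X \right)^{-1} X^{\top} Q^{-1} .
\]
% Since \Xi X = 0, the drift coefficients beta drop out of L_R:
% only generalized increments of the data (combinations that filter out the drift) contribute.
```

Minimizing $L_R$ over $\theta$ therefore estimates the covariance parameters without first estimating $\beta$, which is the separation that distinguishes restricted maximum likelihood from ordinary maximum likelihood.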
Article
A quasi-linear theory is presented for the geostatistical solution to the inverse problem. The archetypal problem is to estimate the log transmissivity function from observations of head and log transmissivity at selected locations. The unknown is parameterized as a realization of a random field, and the estimation problem is solved in two phases: structural analysis, where the random field is characterized, followed by estimation of the log transmissivity conditional on all observations. The proposed method generalizes the linear approach of Kitanidis and Vomvoris (1983). The generalized method is superior to the linear method in cases of large contrast in formation properties but informative measurements, i.e., there are enough observations that the variance of estimation error of the log transmissivity is small. The methodology deals rigorously with unknown drift coefficients and yields estimates of covariance parameters that are unbiased and grid independent. The applicability of the methodology is demonstrated through an example that includes structural analysis, determination of best estimates, and conditional simulations.
Article
The purpose of this survey is to review parameter identification procedures in groundwater hydrology and to examine computational techniques which have been developed to solve the inverse problem. Parameter identification methods are classified under the error criterion used in the formulation of the inverse problem. The problem of ill-posedness in connection with the inverse problem is addressed. Typical inverse solution techniques are highlighted. The review also includes the evaluation of methods used for computing the sensitivity matrix. Statistics which can be used to estimate the parameter uncertainty are outlined. Attempts have been made to compare and contrast representative inverse procedures, and direction for future research is suggested.