An example of Maximum Mean Discrepancy (MMD) between two Uniform distributions which have the same mean value but different widths: one has a fixed width 1, the other one has various widths from 0.1 to 2.1 (horizontal axis in the figure). MMD reaches zero when the two distributions are identical.
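As a rough illustration of the behaviour the caption describes, the sketch below estimates the MMD between a uniform distribution of fixed width 1 and uniform distributions with widths from 0.1 to 2.1, all centred on the same mean. The Gaussian kernel, its bandwidth `sigma`, the deterministic midpoint grids that stand in for random samples, and all function names are assumptions made for this example; the kernel used for the figure is not stated here.

```python
import numpy as np


def gaussian_kernel(x, y, sigma=0.5):
    """Gaussian (RBF) kernel matrix between two 1-D point sets (assumed kernel)."""
    d2 = (x[:, None] - y[None, :]) ** 2
    return np.exp(-d2 / (2.0 * sigma**2))


def mmd2_biased(x, y, sigma=0.5):
    """Biased (V-statistic) estimate of the squared MMD between x and y."""
    kxx = gaussian_kernel(x, x, sigma).mean()
    kyy = gaussian_kernel(y, y, sigma).mean()
    kxy = gaussian_kernel(x, y, sigma).mean()
    return kxx + kyy - 2.0 * kxy


def uniform_grid(width, n=1000, centre=0.0):
    """Midpoint grid on [centre - width/2, centre + width/2].

    A deterministic stand-in for n samples from a uniform distribution,
    used here so that the result has no sampling noise.
    """
    return centre + width * ((np.arange(n) + 0.5) / n - 0.5)


# Fixed reference distribution (width 1) and the varying widths from the caption.
reference = uniform_grid(1.0)
widths = np.linspace(0.1, 2.1, 21)

# Clip tiny negative rounding errors before taking the square root.
mmd = np.array(
    [np.sqrt(max(mmd2_biased(uniform_grid(w), reference), 0.0)) for w in widths]
)

for w, value in zip(widths, mmd):
    print(f"width = {w:.1f}  MMD = {value:.4f}")
```

Under these assumptions the curve should fall to numerically zero at width 1 and grow as the two widths diverge in either direction, which is the shape the caption describes.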

Source publication
Article
Constraining geophysical models with observed data usually involves solving nonlinear and nonunique inverse problems. Neural mixture density networks (MDNs) provide an efficient way to estimate Bayesian posterior marginal probability density functions (pdf's) that represent the nonunique solution. However, it is difficult to infer correlations betw...

Citations

... Most studies mentioned above, whether using random sampling or variational methods, focus on performing Bayesian inference efficiently and accurately given a specific set of observed data and fixed prior knowledge. Over recent years, researchers have used neural networks and other machine learning architectures to implement efficient Bayesian inference in which the posterior pdf can be obtained rapidly for any newly observed dataset (Devilee et al. 1999; Meier et al. 2007a,b; Shahraeeni & Curtis 2011; Shahraeeni et al. 2012; de Wit et al. 2013; Käufl et al. 2014, 2016; Zhang & Curtis 2021a; Mosher et al. 2021; Wang et al. 2022; Hansen & Finlay 2022; Alyaev & Elsheikh 2022; Grana et al. 2022; Guan et al. 2023). ...
... PSVI balances mean-field ADVI and full rank ADVI by modelling only the most important (dominant) correlation information in model vector m, guided by physical properties (prior knowledge) of imaging problems. Specifically, in spatial inverse (imaging) problems, model correlations are shown to be strong mainly between pairs of locations that are in spatial proximity to each other, and the magnitude of correlations decreases rapidly as the distance between two locations increases (Gebraad et al. 2020;Zhang & Curtis 2021a;Biswas & Sen 2022). This suggests that it might be sufficient to model correlations only between parameters that are close (e.g., for FWI, parameters of cells that lie within a dominant wavelength of one another), and ignore correlations between those that are further apart. ...
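The idea of keeping only short-range correlations can be sketched with a Gaussian variational family whose Cholesky factor is banded. The sketch below is an illustration only: it assumes a 1-D chain of cells (on a 2-D or 3-D grid, "nearby" cells do not form a simple band), and the function name, the bandwidth, and the random factor entries are hypothetical rather than taken from the PSVI paper.

```python
import numpy as np


def banded_cholesky_factor(n, bandwidth, rng):
    """Lower-triangular factor whose non-zeros lie within `bandwidth` of the diagonal.

    The off-diagonal values are random placeholders; in a variational method
    they would be the trainable parameters.
    """
    factor = np.zeros((n, n))
    for i in range(n):
        lo = max(0, i - bandwidth)
        factor[i, lo:i] = 0.3 * rng.standard_normal(i - lo)
        factor[i, i] = 1.0  # positive diagonal keeps the covariance valid
    return factor


rng = np.random.default_rng(0)
n, bandwidth = 50, 3  # 50 cells; correlate each cell with its 3 nearest predecessors

mu = np.zeros(n)
L = banded_cholesky_factor(n, bandwidth, rng)
cov = L @ L.T  # covariance is zero for cells more than `bandwidth` apart

# Reparameterised samples m = mu + L @ eps, with eps ~ N(0, I).
samples = mu + rng.standard_normal((1000, n)) @ L.T

# Number of free parameters (mean plus factor entries) for each family.
n_mean_field = 2 * n
n_banded = n + sum(min(i, bandwidth) + 1 for i in range(n))
n_full = n + n * (n + 1) // 2
print(n_mean_field, n_banded, n_full)
```

The parameter counts show the intended trade-off: the banded family sits between the mean-field and full-rank families in cost while still representing correlations between neighbouring cells.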
... Walker & Curtis (2014b) used mixture density networks (MDN) to approximate the old posterior distribution, and it has been shown to be difficult to capture posterior correlations between different parameters using this method (Zhang & Curtis 2021a). Nevertheless, as shown in numerous studies (Devilee et al. 1999; Meier et al. 2007a,b; Shahraeeni & Curtis 2011; Shahraeeni et al. 2012; Käufl et al. 2014, 2016; Cao et al. 2020; Lubo-Robles et al. 2021; Hansen & Finlay 2022), an advantage of using MDNs is that they can determine the posterior pdf corresponding to any dataset extremely rapidly once the networks have been trained. ...
Preprint
Many scientific investigations require that the values of a set of model parameters are estimated using recorded data. In Bayesian inference, information from both observed data and prior knowledge is combined to update model parameters probabilistically. Prior information represents our belief about the range of values that the variables can take, and their relative probabilities when considered independently of recorded data. Situations arise in which we wish to change prior information: (i) the subjective nature of prior information, (ii) cases in which we wish to test different states of prior information as hypothesis tests, and (iii) information from new studies may emerge so prior information may evolve over time. Estimating the solution to any single inference problem is usually computationally costly, as it typically requires thousands of model samples and their forward simulations. Therefore, recalculating the Bayesian solution every time prior information changes can be extremely expensive. We develop a mathematical formulation that allows prior information to be changed in a solution using variational methods, without performing Bayesian inference on each occasion. In this method, existing prior information is removed from a previously obtained posterior distribution and is replaced by new prior information. We therefore call the methodology variational prior replacement (VPR). We demonstrate VPR using a 2D seismic full waveform inversion example, where VPR provides almost identical posterior solutions compared to those obtained by solving independent inference problems using different priors. The former can be completed within minutes even on a laptop whereas the latter requires days of computations using high-performance computing resources. We demonstrate the value of the method by comparing the posterior solutions obtained using three different types of prior information.
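The prior-replacement idea in this abstract rests on a standard identity from Bayes' rule, sketched below. The notation (model vector m, data d, subscripts "old" and "new") is chosen here for illustration and is not necessarily that of the preprint, and the sketch assumes the likelihood is unchanged between the two problems.

```latex
\begin{align}
p_{\mathrm{old}}(\mathbf{m}\mid\mathbf{d}) &\propto p(\mathbf{d}\mid\mathbf{m})\,p_{\mathrm{old}}(\mathbf{m}),\\
p_{\mathrm{new}}(\mathbf{m}\mid\mathbf{d}) &\propto p(\mathbf{d}\mid\mathbf{m})\,p_{\mathrm{new}}(\mathbf{m})
  \;\propto\; p_{\mathrm{old}}(\mathbf{m}\mid\mathbf{d})\,
  \frac{p_{\mathrm{new}}(\mathbf{m})}{p_{\mathrm{old}}(\mathbf{m})}.
\end{align}
```

The second line holds wherever the old prior is non-zero: dividing the old posterior by the old prior leaves a quantity proportional to the likelihood, which can then be multiplied by the new prior without rerunning any forward simulations.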
... In the field of TEM inversion, one of the main practical limitations of traditional deterministic inversion is the high computational cost caused by the large number of forward modeling operations, and probabilistic methods, for all their advantages, are even more computationally demanding. In recent years, ML methods, especially NNs, have been widely used in geophysical inversion (Ji et al. 2022; Yang et al. 2022; Zhang and Curtis 2021). However, the loss function of an NN is typically non-convex. ...
Article
Inverse problems are typically tackled using deterministic optimization methods that may become trapped in a local minimum or probabilistic methods that can be computationally demanding. In this study, we explore the potential of the back propagation neural network (BPNN) optimized by the genetic algorithm (GA) for onshore transient electromagnetic (TEM) inversion. The GA is employed to optimize the initial parameters of the BPNN, enhancing its global optimization ability. Once the BPNN optimized by GA (GA-BPNN) is properly trained, it can provide the distribution of subsurface electrical conductivity (σ) in 0.1 s. We train the GA-BPNN using synthetic datasets generated by TEM forward modeling and assess its reliability using both synthetic and field data. Theoretical simulations demonstrate that compared with BPNN, the error of GA-BPNN on the inversion results of six samples is reduced by 23.2%. Furthermore, this method can provide reliable results even in the presence of noise in the TEM response. Finally, applying this inversion method to karst exploration with measured data proves the reliability and robustness of the proposed method. The proposed method can support quasi-real-time imaging of subsurface structures and provides a powerful tool for the interpretation of field TEM data.
... rface imaging problems for more than two decades (Bloem et al., 2023; Cao et al., 2020; Devilee et al., 1999; de Wit et al., 2013; Hansen & Finlay, 2022; Käufl et al., 2014, 2016; Lubo-Robles et al., 2021; Meier et al., 2007a, 2007b; A. K. Ray & Biswal, 2010; Shahraeeni & Curtis, 2011; Shahraeeni et al., 2012; Siahkoohi et al., 2022; X. Zhang & Curtis, 2021b; X. Zhao et al., 2021). Interestingly, X. Zhang et al. (2023) explained how certain variational methods are related to novel Monte Carlo type algorithms, showing that a spectrum of techniques might be constructed that combine the strengths of both approaches. ...
Article
Geoscientists use observed data to estimate properties of the Earth's interior. This often requires non‐linear inverse problems to be solved and uncertainties to be estimated. Bayesian inference solves inverse problems under a probabilistic framework, in which uncertainty is represented by a so‐called posterior probability distribution. Recently, variational inference has emerged as an efficient method to estimate Bayesian solutions. By seeking the closest approximation to the posterior distribution within any chosen family of distributions, variational inference yields a fully probabilistic solution. It is important to define expressive variational families so that the posterior distribution can be represented accurately. We introduce boosting variational inference (BVI) as a computationally efficient means to construct a flexible approximating family comprising all possible finite mixtures of simpler component distributions. We use Gaussian mixture components due to their fully parametric nature and the ease with which they can be optimized. We apply BVI to seismic travel time tomography and full waveform inversion, comparing its performance with other methods of solution. The results demonstrate that BVI achieves reasonable efficiency and accuracy while enabling the construction of a fully analytic expression for the posterior distribution. Samples that represent major components of uncertainty in the solution can be obtained analytically from each mixture component. We demonstrate that these samples can be used to solve an interrogation problem: to assess the size of a subsurface target structure. To the best of our knowledge, this is the first method in geophysics that provides both analytic and reasonably accurate probabilistic solutions to fully non‐linear, high‐dimensional Bayesian full waveform inversion problems.
... In recent years, numerous fields such as medical imaging, geophysics, and engineering have extensively employed inverse problem techniques to resolve critical issues, like image reconstruction and parameter estimation [59-61]. Inverse problem solving tends to have complex implementations and to incur high computational cost [62]. Prior information, such as the mutual dependence of rheological parameters to perfusion, can be embedded as boundary conditions for simplicity. ...
... Relevantly, concurrent literature has delved into the intrinsic dimensionality of NFs, indicating the potential of using NFs to generate models with inherently lower dimensions [31].
• NFs' inherent invertibility negates the need to store state variables during gradient calculations, enabling memory-efficient training and inversion in large-scale 3D applications, such as in geophysics [43,83,84,86,103-105] and ultrasound imaging [66-70,93].
• Because of their invertibility, NFs guarantee unique latent codes for all model space samples, including out-of-distribution ones. Therefore, they can still be used to invert for out-of-distribution model parameters, while other methods like GANs may introduce bias [7]. ...
Article
Solving multiphysics-based inverse problems for geological carbon storage monitoring can be challenging when multimodal time-lapse data are expensive to collect and costly to simulate numerically. We overcome these challenges by combining computationally cheap learned surrogates with learned constraints. Not only does this combination lead to vastly improved inversions for the important fluid-flow property, permeability, it also provides a natural platform for inverting multimodal data including well measurements and active-source time-lapse seismic data. By adding a learned constraint, we arrive at a computationally feasible inversion approach that remains accurate. This is accomplished by including a trained deep neural network, known as a normalizing flow, which forces the model iterates to remain in-distribution, thereby safeguarding the accuracy of trained Fourier neural operators that act as surrogates for the computationally expensive multiphase flow simulations involving partial differential equation solves. By means of carefully selected experiments, centered around the problem of geological carbon storage, we demonstrate the efficacy of the proposed constrained optimization method on two different data modalities, namely time-lapse well and time-lapse seismic data. While permeability inversions from both these two modalities have their pluses and minuses, their joint inversion benefits from either, yielding valuable superior permeability inversions and CO2 plume predictions near, and far away, from the monitoring wells.
... For example, Bishop (1994) combines Gaussian mixture densities with NNs and denotes the resulting method as mixture density network (MDN). However, an MDN (Alyaev & Elsheikh 2022; Zhang & Curtis 2021; Earp et al. 2020; Meier et al. 2007) estimates the inverse problem without considering the physical laws that govern the forward problem. Alternatively, VAEs and GANs can consider the inverse and forward problems in the model scheme, and they have been applied in combination with MDNs in Śmieja et al. (2020) and Oikarinen et al. (2021). ...
... To overcome this problem, authors like Alyaev & Elsheikh (2022) and Meier et al. (2007) employ an MDN introduced by Bishop (1994) as a multiple prediction model, which shows an improvement in the classical architecture of MDNs, and adopt a multimodal trajectory predictions (MTP) loss function (Cui et al. 2019) to solve the mode collapse present in some applications when MDNs are used. Other investigations, such as Zhang & Curtis (2021) and Earp et al. (2020), combine MDNs with Bayesian statistics to improve the estimations. However, Earp et al. (2020) conclude that the prior distribution considerably affects these estimations: an incorrect selection of the prior distribution deteriorates the multimodality of the MDN. ...
Article
Estimating subsurface properties from geophysical measurements is a common inverse problem. Several Bayesian methods currently aim to find the solution to a geophysical inverse problem and quantify its uncertainty. However, most geophysical applications exhibit more than one plausible solution. Here, we propose a multimodal variational autoencoder model that employs a mixture of truncated Gaussian densities to provide multiple solutions, along with their probability of occurrence and a quantification of their uncertainty. This autoencoder is assembled with an encoder and a decoder, where the first one provides a mixture of truncated Gaussian densities from a neural network, and the second is the numerical solution of the forward problem given by the geophysical approach. The proposed method is illustrated with a one-dimensional Magnetotelluric inverse problem and recovers multiple plausible solutions with different uncertainty quantification maps and probabilities that are in agreement with known physical observations.
... space to the desired model parameter space, train these machines using paired models and their corresponding synthetic data, and apply the trained machines to field datasets. Applications of such a framework range across the whole spectrum of geophysical inverse problems, including surface wave dispersion inversion and tomography (Cai et al., 2022;X. Zhang & Curtis, 2021), seismic-to-petrophysics inversion (Xiong et al., 2021;C. Zou et al., 2021), crustal thickness and Vp/Vs estimation from receiver functions (F. Wang et al., 2022), earthquake and microseismicity moment tensor inversion Steinberg et al., 2021), magnetic, gravity, and ground-penetrating radar (GPR) data inversion (R. Leong & Zhu, 2021;Nur ...
Article
Plain Language Summary: Machine learning has attracted massive academic attention in Solid‐Earth Geosciences over the past decade. Applications of machine learning (ML), including the more conventional signal processing‐based methods and the trending deep neural network‐based methods, have dominated many scientific conversations. We introduce the special collection of ML applications in Solid‐Earth geosciences that range from earthquake signal processing and automatic image interpretation to joint understanding of multiple geoscience datasets. With the extraordinary efforts in ML studies, we now have a better outline of the areas where ML has contributed most significantly through efficiency and automation, and where ML has the potential to revolutionize the workflow and advance the integrated scientific understanding of the Solid‐Earth processes.
... The bijective mappings have a tractable Jacobian, which allows explicit computation of the posterior probability over the parameter space by backward calculation based on the observed data and the distribution of the latent variable, thus offering the potential for the fast Bayesian inversion. The INN has been successfully used to solve the inverse problems for seismic velocity [37], [38]. ...
Article
The inversion of airborne electromagnetic (AEM) data suffers from severe non-uniqueness of the solution. Bayesian inference provides the means to estimate structural uncertainty with a rich suite of statistical information. However, conventional Bayesian methods are computationally demanding in nonlinear inversions, especially considering the huge volumes of observational data, and thus are not feasible in practice. In this study, we develop a fast Bayesian inversion operator based on the invertible neural network (INN) to fully explore the posterior distribution and quantitatively evaluate the model uncertainty. The INN uses a latent variable to capture the information loss during measurement and constructs bijective mappings between AEM data and the resistivity model. We also introduce another noise variable into the INN to account for data uncertainties. Synthetic tests demonstrate that the INN can effectively recover the posterior distribution by a relatively small ensemble of predicted resistivity models whose AEM responses show a significant agreement with the true signal. We also apply the INN inversion operator to a field data set and obtain results consistent with previous studies. The INN shows considerable adaptability to field observations and strong noise robustness. Meanwhile, the INN delivers the inversion result with posterior model distribution for 23366 AEM time series in 20 seconds on a common PC. The inversion efficiency can be further improved for large data set due to its natural parallelizability. The proposed INN method can support fast Bayesian inversion of AEM data and offer tremendous potential for near real-time uncertainty evaluation of underground structures.
... More recently, other approaches, using the standard deviation from predictions of random forests (Zou et al., 2021) or adapted neural networks, have been proposed (X. Zhang and Curtis, 2021; U. Meier et al., 2007). ...
... • The algorithm needs to be trained along with noisy data to be able to take it into account (X. Zhang & Curtis, 2021). ...
... In conclusion, geophysicists greatly benefit from ML in their workflows. Zhang and Curtis, 2021)
• They need to be trained either on known posterior distributions given the dataset and the prior (e.g., Devilee et al., 1999) or with a form of likelihood for a model and the dataset with noise (e.g., X. Zhang and Curtis, 2021).
The algorithm discussed in this manuscript (BEL1D) can be seen as a hybrid solution that overcomes those limitations at the cost of a lower versatility. It can be seen as a supervised regression approach. ...
Thesis
Providing images of the subsurface from terrestrial data sets is at the heart of the geophysicist's work. Multiple approaches have been applied to tackle this task. Most of the time, it is carried out in a deterministic framework, meaning that for a given data set a single model is provided to explain the data. However, these deterministic approaches cannot provide reasonable uncertainty estimates that account for the non-uniqueness of the solution, noise in the data, and modelling errors. To provide precise and accurate models of the subsurface while accounting for uncertainty, geophysicists use probabilistic approaches. These approaches sample the set of possible a priori models (the prior) in order to extract the models that can reasonably explain the data (the posterior). Such approaches, although superior in terms of the reliability of their results, are rarely applied in practice because of their heavy computational requirements. The aim of this manuscript is to propose a new Bayesian process for interpreting such geophysical data. This new framework, called Bayesian Evidential Learning, promises fast, precise and accurate uncertainty estimation. The process is applied and adapted to 1D geophysical data sets (BEL1D). It has several advantages over classical probabilistic approaches: it allows fast computation thanks to the limited number of runs required, and it gives insight into the sensitivity of the experiment and the validity of the prior. Moreover, it benefits from its construction as a Machine Learning algorithm, leading to the near-instantaneous construction of uncertainty models.
... The benefit of using machine learning for simulation-based inference (Cranmer et al. 2020), as opposed to a data-driven approach, is that one can generate as much synthetic data as is computationally tractable in order to train the machine learning model. Other geophysical inversion studies using this synthetic data generation approach have used deep neural networks for seismic reflectivity inversion (Kim & Nakata 2018), invertible neural networks for seismic 1-D surface wave dispersion inversion and 2-D traveltime tomography (Zhang & Curtis 2021), and convolutional neural networks for 3-D density imaging through the inversion of gravity data (Zhang et al. 2022). Our study departs from these other machine learning studies in two main ways: (1) the application of a broadly accessible, user-friendly machine learning model based on a well-established machine learning algorithm from scikit-learn and (2) a joint inversion of more than one type of data set using a simulation-based inference approach with supervised machine learning. ...
Article
The ability to accurately and reliably obtain images of shallow subsurface anomalies within the Earth is important for hazard monitoring and a fundamental understanding of many geologic structures, such as volcanic edifices. In recent years, machine learning (ML) has gained increasing attention as a novel approach for addressing complex problems in the geosciences. Here we present an ML-based inversion method to integrate cosmic-ray muon and gravity datasets for shallow subsurface density imaging at a volcano. Starting with an ensemble of random density anomalies, we use physics-based forward calculations to find the corresponding set of expected gravity and muon attenuation observations. Given a large enough ensemble of synthetic density patterns and observations, the ML algorithm is trained to recognize the expected spatial relations within the synthetic input-output pairs, learning the inherent physical relationships between them. Once trained, the ML algorithm can then interpolate the best-fit anomalous pattern given data that was not used in training, such as those obtained from field measurements. We test the validity of our ML algorithm using field data from the Showa-Shinzan lava dome (Mount Usu, Japan) and show that our model produces results consistent with those obtained using a more traditional Bayesian joint inversion. Our results are similar to the previously published inversion, and suggest that the Showa-Shinzan lava dome consists of a relatively high-density (2200–2400 kg/m³) cylindrical anomaly, about 300 m in diameter. Adding noise to synthetic training and testing datasets shows that, as expected, the ML algorithm is most robust in areas of high sensitivity, as determined by the forward kernels. Overall, we find that ML offers a viable alternative to a Bayesian joint inversion when used with gravity and muon datasets for subsurface density imaging.