Figure 2 - uploaded by Wenjun Lv
Illustration of the domain discrepancy. The data in the source domain are labeled, and the data in the target domain are unlabeled or partially labeled. The data distributions of the source domain and the target domain differ, so a classifier trained on the source domain does not perform well on the target domain.

Source publication
Article
Full-text available
Lithology identification plays an essential role in geological exploration and reservoir evaluation. In recent years, machine learning-based logging lithology identification has received considerable attention due to its ability to fit complex models. Existing work develops machine learning models under the assumption that the data gathered from di...

Contexts in source publication

Context 1
... is a particular case of transfer learning (Pan and Yang, 2009) when two related domains have the same task but different data probability distribution (i.e., they have a domain discrepancy). Figure 2 shows a simple example of domain discrepancy. The data in the source domain are labeled, and the data in the target domain have two cases: unlabeled or partially labeled. ...
Context 2
... two cases belong to the unsupervised DA setting and the semisupervised DA setting, respectively. As shown in Figure 2, the data distributions of the source domain and the target domain are different. Therefore, training a classifier on the source domain (the green line) to predict the labels of the target domain data will cause many misclassifications. ...
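The effect described in these contexts can be reproduced with a toy example. The following is a minimal sketch, not the method of the source publication: it assumes one-dimensional Gaussian data, a hypothetical shift of 2.0 between the domains, and a simple midpoint-threshold classifier standing in for the green line in Figure 2. All function names are illustrative.

```python
import random


def make_domain(n_per_class, shift, rng):
    """Draw 1-D samples for two classes; `shift` displaces the whole domain."""
    xs, ys = [], []
    for label, center in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            xs.append(rng.gauss(center + shift, 1.0))
            ys.append(label)
    return xs, ys


def fit_threshold(xs, ys):
    """Place the decision boundary midway between the two class means."""
    class0 = [x for x, y in zip(xs, ys) if y == 0]
    class1 = [x for x, y in zip(xs, ys) if y == 1]
    return 0.5 * (sum(class0) / len(class0) + sum(class1) / len(class1))


def accuracy(xs, ys, threshold):
    """Fraction of samples whose side of the threshold matches the label."""
    hits = sum(int(x > threshold) == y for x, y in zip(xs, ys))
    return hits / len(ys)


rng = random.Random(0)
source_x, source_y = make_domain(500, 0.0, rng)  # labeled source domain
target_x, target_y = make_domain(500, 2.0, rng)  # shifted target domain

# Train on the source only, then evaluate on both domains.
threshold = fit_threshold(source_x, source_y)
acc_source = accuracy(source_x, source_y, threshold)
acc_target = accuracy(target_x, target_y, threshold)
print(f"source accuracy: {acc_source:.3f}, target accuracy: {acc_target:.3f}")
```

The target labels are used here only to measure the accuracy drop; in the unsupervised DA setting they would not be available for training.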

Similar publications

Article
Full-text available
The objective of this research work is to examine two prominent dom gas reservoirs H1000 and H4000 in onshore Niger Delta region of Nigeria for potential production activities using seismic wavelet and well-to-seismic tie process to facilitate interpretation and evaluate dom gas (hydrocarbon) bearing formation. Well log and sidewall samples analyse...
Conference Paper
Full-text available
The Hellisheiði high temperature geothermal field is located in SW-Iceland, about 30 km east of Reykjavik. It is related to the Hengill volcanic system situated at a tectonic triple junction of the Western Volcanic Zone (WVZ), the Reykjanes Peninsula Ridge (RPR) and the South Iceland Seismic Zone (SISZ). Orkuveita Reykjavikur (OR) started the explo...
Article
Full-text available
Actuality. The hydrocarbon generation sources in the South Caspian Basin (SCB) are located at the depth of nine or more kilometers, where the organic-rich Maikop sediments occur. The main oil-and-gas fields here are found in reservoirs of the Productive Strata (Lower Pliocene), bedding at a depth of one or more kilometers. At the same time, the sig...
Article
Full-text available
The presence of pyrite in sandstone reservoirs causes a problem known as the low-resistivity reservoir case. It is important to determine the impact of pyrite volume in sandstone reservoirs, especially in order to determine its resistivity correction factor (Rcf). This research was done in the laboratory using nine sandstone pseudo-cores with various py...
Article
Full-text available
Evaluation of Z-Field prospect located in the western Niger Delta at a shallow water depth of about 100m was carried out using 3-D seismic and well data. The aim of this study is to evaluate the hydrocarbon potential of the target bed (R01A reservoir) using checkshot, seismic and well log data. Results of the structural interpretation revealed exte...

Citations

... Most existing models assume that well-log data collected from different wells are independent and identically distributed (Wu et al. 2022). However, in practice, this assumption may not hold, even for geographically close wells, leading to degraded performance when applying a model trained on one well to predict the lithofacies of another (Chang et al. 2021). This is due to the complex distribution of subsurface lithofacies resulting from changes in reservoir characteristics caused by geological activities related to sedimentary tectonic evolution. ...
... Chang et al. developed a two-stream multilayer neural network. Under the assumption that new wells do not have lithology labels, integrating unsupervised domain adaptation into lithology identification achieves good performance (Chang et al. 2021). Zhou et al. proposed a cross-domain lithology classification model that the data distributions of the training data set and test data set can be shared. ...
Article
Reservoir lithofacies type is an important indicator of reservoir quality and oiliness, and understanding lithofacies type can help geologists and engineers make informed decisions about exploration and development activities. The use of well-log data to establish machine learning models for lithofacies identification has gained popularity; however, the assumption made by these models that the data are independent and identically distributed is often unrealistic. Additionally, there is a possible incompatibility between the training and test data in terms of feature space dimensions. We propose the heterogeneous domain adaptation framework for logging lithofacies identification (HDAFLI) to address these problems. The framework comprises three main contributions: (i) The denoising autoencoder feature mapping (DAFM) module is adopted to resolve the incompatibility issue in feature space between training and test data. The connection between training and test data can be effectively established to improve the performance and generalization ability. (ii) The transferability and discriminative joint probability distribution adaptive (TDJPDA) module addresses the issue of data distribution differences. It improves the transferability of training and test data by minimizing the maximum mean discrepancy (MMD) of the joint probabilities of the source and target domains and enhances their discriminative ability by maximizing the joint probability MMD of different lithofacies categories. (iii) Bayesian optimization is used to optimize hyperparameters in the light gradient boosting machine (LightGBM) model for high computational efficiency in determining the best accuracy. We selected well-logging data from eight wells in the Pearl River Mouth Basin of the South China Sea to design four tasks and compared HDAFLI with various baseline machine learning algorithms and baseline domain adaptive algorithms.
The results show that HDAFLI has the highest average accuracy among the four tasks. It is 19.76% and 8.94% higher than the best-performing baseline machine learning algorithm and baseline domain adaptive method among the comparison algorithms, respectively. For HDAFLI, we also conducted ablation experiments, time cost and convergence performance analysis, parameter sensitivity experiments, and feature visualization experiments. The results of ablation experiments show that the three modules of HDAFLI all play an active role, working together to achieve the best results. In addition, HDAFLI has a reasonable time cost, can become stable after several iterations, and has good convergence performance. The results of parameter sensitivity experiments confirm that the accuracy of HDAFLI does not change significantly with changes in hyperparameters, which indicates that the framework is robust. The results of feature visualization experiments show that the data of the training set and the test set are concentrated together to a certain extent, which indicates that HDAFLI has completed the task of data distribution alignment very well. The findings of this study can help to better understand how to address the challenge of reservoir lithofacies identification through a heterogeneous domain adaptation framework. By solving the problem of feature space incompatibility and data distribution difference between training data and test data, the application of HDAFLI provides geologists and engineers with more accurate lithofacies classification tools. This study has practical application value for reservoir quality assessment, oiliness prediction, and exploration and development decision-making.
... However, they differ from the PINNs discussed below in that they do not utilize automatic differentiation (98) to calculate the derivatives involved in the loss. Unsupervised learning can discover hidden patterns in unlabeled data and has also been used for various tasks including seismic facies classification (50, 99, 100), seismic signal or waveform classification (11, 101, 102, 103), lithology classification (104, 105, 106), seismic migration (37, 39), and inversion (107, 108). ...
Article
Full-text available
One of the key objectives in geophysics is to characterize the subsurface through the process of analyzing and interpreting geophysical field data that are typically acquired at the surface. Data-driven deep learning methods have enormous potential for accelerating and simplifying the process but also face many challenges, including poor generalizability, weak interpretability, and physical inconsistency. We present three strategies for imposing domain knowledge constraints on deep neural networks (DNNs) to help address these challenges. The first strategy is to integrate constraints into data by generating synthetic training datasets through geological and geophysical forward modeling and properly encoding prior knowledge as part of the input fed into the DNNs. The second strategy is to design nontrainable custom layers of physical operators and preconditioners in the DNN architecture to modify or shape feature maps calculated within the network to make them consistent with the prior knowledge. The final strategy is to implement prior geological information and geophysical laws as regularization terms in loss functions for training the DNNs. We discuss the implementation of these strategies in detail and demonstrate their effectiveness by applying them to geophysical data processing, imaging, interpretation, and subsurface model building.
... Chang et al. proposed a two-stream multilayer neural network to solve the data drift problem. The training process of this network is performed by maximum mean discrepancy optimization [35]. Zhang et al. proposed a method for transferring the logs from one well to another without affecting their physical significance [36]. ...
Article
Full-text available
For geological analysis tasks such as reservoir characterization and petroleum exploration, lithology identification is a crucial and foundational task. The logging lithology identification tasks at this stage generally build a lithology identification model, assuming that the logging data share an independent and identical distribution. This assumption, however, does not hold among various wells due to the variations in depositional conditions, logging apparatus, etc. In addition, the current lithology identification model does not fully integrate the geological knowledge, meaning that the model is not geologically reliable or easy to interpret. Therefore, we propose a cross-domain lithology identification method that incorporates geological information and domain adaptation. This method consists of designing a structure named UAFN to better extract the semantic (depth) features of logging curves, introducing geological information via wavelet transform to improve the model’s interpretability, and using dynamic adversarial domain adaptation to solve the data-drift issue across wells. The experimental results show that, by combining the geological information in wavelet coefficients with semantic information, more lithological features can be extracted from the logging curve. Moreover, the model performance is further improved by dynamic domain adaptation and wavelet transform. The addition of wavelet transform improved the model performance by an average of 6.25%, indicating the value of the stratigraphic information contained in the wavelet coefficients for lithology prediction.
... However, due to differences in, for example, logging instruments and drilling fluids, the probability distributions of the labeled training dataset (wells with complete logs) and the testing dataset (wells with missing logs) are quite different; we should therefore consider building the machine learning model under the non-iid setting, as illustrated in Fig. 1. This issue has been considered in the task of logging interpretation, where it is studied with domain adaptation (DA) technologies [9], [24], [25], [26]. For example, Chang et al. [24] studied the unsupervised domain adaptation for lithofacies classification, that is, there is no label in the target wells. ...
... This issue has been considered in the task of logging interpretation, where it is studied with domain adaptation (DA) technologies [9], [24], [25], [26]. For example, Chang et al. [24] studied the unsupervised domain adaptation for lithofacies classification, that is, there is no label in the target wells. By introducing a two-flow multilayer neural network and training with a maximum mean discrepancy optimization, domain-invariant and discriminative feature representations are learned so that cross-domain discrepancy is restrained to a large extent. ...
... The representative work for domain adaptation is developed by means of statistical feature transform (SFT) [27]. Since the deep neural network has a powerful ability to fit nonlinear relationships, these years have witnessed the combination of SFT and deep representation learning [24], [28]. The existing studies have the following issues applying to the missing logs generation task. ...
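The maximum mean discrepancy (MMD) mentioned in the excerpts above can be illustrated with a small estimator. This is a minimal sketch, not the implementation of any cited work: it assumes a Gaussian (RBF) kernel with a hypothetical bandwidth of 1.0, uses the biased (V-statistic) estimate of the squared MMD, and draws synthetic two-dimensional samples in place of real well-log features. All names are illustrative.

```python
import math
import random


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel exp(-||a - b||^2 / (2 * bandwidth^2))."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth):
    """Average kernel value over all pairs (x, y)."""
    total = sum(rbf_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared MMD between two samples."""
    return (mean_kernel(xs, xs, bandwidth)
            + mean_kernel(ys, ys, bandwidth)
            - 2.0 * mean_kernel(xs, ys, bandwidth))


def sample(n, dim, shift, rng):
    """Synthetic stand-in for features: unit-variance Gaussians around `shift`."""
    return [[rng.gauss(shift, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
source = sample(100, 2, 0.0, rng)
same = sample(100, 2, 0.0, rng)      # same distribution as the source
shifted = sample(100, 2, 1.5, rng)   # shifted "target" distribution

mmd_same = mmd_squared(source, same)
mmd_shifted = mmd_squared(source, shifted)
print(f"MMD^2 same: {mmd_same:.4f}, MMD^2 shifted: {mmd_shifted:.4f}")
```

In an MMD-based adaptation scheme, a quantity of this kind would be computed on learned features and added to the training loss; the pairwise computation shown here costs O(n^2) kernel evaluations.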
Article
Geophysical logging instruments continuously measure multiple geophysical properties of borehole rocks, thus providing a feasible way to achieve fine borehole geology modelling. Since missing well logs are inevitable, it is essential to generate the missing logs from the available ones. Recently, a large body of interdisciplinary studies has demonstrated the effectiveness of applying machine learning to solve the missing logs generation problem, under which the training and testing datasets obey the independent and identical distribution (iid) assumption. This assumption, however, is not satisfied in the case of the cross-well missing logs generation task. A standard method to solve the non-iid issue is to map source and target data to a common feature space and then employ Maximum Mean Discrepancy (MMD) to measure domain differences. However, this method suffers from high computational complexity and poor feature explainability when dealing with logs generation tasks. To solve the above problems, we propose an explainable regression network for cross-well geophysical logs generation named LogRegX. LogRegX integrates single-well feature extraction, cross-well feature alignment, and missing logs prediction while maintaining the explainability of logging features. Specifically, LogRegX leverages the gating mechanism to fuse multi-scale logging features to capture the response characteristics of well logs. The learned source and target feature representations are subject to domain discrepancy constraints, measured by Random Fourier Feature transform induced MMD. Additionally, a target-domain information retaining mechanism is introduced to maintain the structure of target data so that the transferred features are explainable. Experiments on real-world field data demonstrate the superiority and the explainability of LogRegX over the existing methods.
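The idea of approximating MMD with random Fourier features can be sketched as follows. This is an illustration under stated assumptions, not the LogRegX implementation: it assumes a Gaussian kernel with a hypothetical bandwidth of 1.0, 200 random features, and synthetic two-dimensional samples. Each sample is mapped to a finite feature vector, so the squared MMD reduces to the squared distance between two feature means and the cost grows linearly with the number of samples.

```python
import math
import random


def make_rff(dim, n_features, bandwidth, rng):
    """Draw random frequencies and phases for a Gaussian-kernel feature map."""
    weights = [[rng.gauss(0.0, 1.0 / bandwidth) for _ in range(dim)]
               for _ in range(n_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    return weights, offsets


def mean_embedding(samples, weights, offsets):
    """Average of the feature map sqrt(2/D) * cos(w . x + b) over the samples."""
    scale = math.sqrt(2.0 / len(weights))
    sums = [0.0] * len(weights)
    for x in samples:
        for j, (w, b) in enumerate(zip(weights, offsets)):
            proj = sum(wi * xi for wi, xi in zip(w, x)) + b
            sums[j] += scale * math.cos(proj)
    return [s / len(samples) for s in sums]


def rff_mmd_squared(xs, ys, weights, offsets):
    """Squared distance between the two mean embeddings (approximate MMD^2)."""
    mean_x = mean_embedding(xs, weights, offsets)
    mean_y = mean_embedding(ys, weights, offsets)
    return sum((a - b) ** 2 for a, b in zip(mean_x, mean_y))


def sample(n, dim, shift, rng):
    """Synthetic stand-in for features: unit-variance Gaussians around `shift`."""
    return [[rng.gauss(shift, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
weights, offsets = make_rff(dim=2, n_features=200, bandwidth=1.0, rng=rng)
source = sample(200, 2, 0.0, rng)
same = sample(200, 2, 0.0, rng)
shifted = sample(200, 2, 1.5, rng)

approx_same = rff_mmd_squared(source, same, weights, offsets)
approx_shifted = rff_mmd_squared(source, shifted, weights, offsets)
print(f"approx MMD^2 same: {approx_same:.4f}, shifted: {approx_shifted:.4f}")
```

The same random frequencies and phases must be shared by both samples; drawing a fresh feature map for each domain would make the two mean embeddings incomparable.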
... Divergence measures, such as the Kullback-Leibler (KL) divergence (Kullback and Leibler, 1951) and Jensen-Shannon (JS) divergence (Manning and Schutze, 1999), can be used to quantify the statistical dissimilarity between two joint distributions. Chang et al. (2021) use the maximum mean discrepancy domain transfer (MMDDT) learning model to predict rock facies, which implicitly imposes the spatial stationarity assumption and minimizes the divergence for the outputs of intermediate layers in a neural network, i.e., the intermediate features, during training. The intermediate features are non-linear transformations of the input well logs, thus can also be regarded as non-linearly normalized well logs for corrected joint distributions. ...
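The two divergence measures named in this excerpt can be written down compactly for discrete distributions. The sketch below is illustrative only: it assumes the distributions are given as histograms over the same bins (for instance, binned well-log values), and the two example histograms are made up.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats; requires q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their mixture."""
    mixture = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)


# Hypothetical histograms over three bins.
p = [0.7, 0.2, 0.1]
q = [0.1, 0.3, 0.6]

print(f"KL(p||q) = {kl_divergence(p, q):.4f}")
print(f"KL(q||p) = {kl_divergence(q, p):.4f}")
print(f"JS(p, q) = {js_divergence(p, q):.4f}")
```

KL divergence is asymmetric and unbounded, whereas JS divergence is symmetric and never exceeds ln 2 when natural logarithms are used, which makes it easier to compare across pairs of wells.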
Preprint
Full-text available
Well-log interpretation provides in situ estimates of formation properties such as porosity, hydrocarbon pore volume, and permeability. Reservoir models based on well-log-derived formation properties deliver reserve-volume estimates, production forecasts, and help with decision making in reservoir development. However, due to measurement errors, variability of well logs due to multiple measurement vendors, different borehole tools, and non-uniform drilling/borehole conditions, conventional well-log interpretation methods may not yield accurate estimates of formation properties, especially in the context of multi-well interpretation. To improve the robustness of multi-well petrophysical interpretation, well-log normalization techniques such as two-point scaling and mean-variance normalization are commonly used to impose stationarity constraints for well logs requiring correction. However, these techniques are mostly based on the marginal distribution of well logs and require expert knowledge to be effectively implemented. To reduce the uncertainties and time associated with multi-well petrophysical interpretation, we develop the discriminative adversarial (DA) model and the linear constraint model for well-log normalization and interpretation. We also develop a new divergence-based type well identification method for improved test-well and training-well adaptation. The DA neural network model developed for well-log normalization and interpretation can perform both linear and nonlinear well-log normalization by considering the joint distribution of all types of well logs and formation properties.
To train the DA model, classical machine-learning models or classical petrophysical models are first trained to minimize the prediction error of formation properties in the training data set; then the adversarial model is trained to normalize well logs in the test set, such that the joint distribution of normalized well logs and formation property estimates of the test data set reproduce those of the training data set. The linear constraint model uses an ensemble of predictions from linear models to constrain both well-log normalization and interpretation. To identify wells with stationary formation properties as well as well logs, the divergence-based type well identification method is developed to choose type wells for wells requiring correction based on well-log statistical similarity instead of closeness of wells. We apply the developed methods to improve the accuracy of well-log normalization and the estimation of permeability in a carbonate reservoir. Six types of well logs and over 9000 feet of core measurements from 30 wells drilled between the 1980s and 2010s in the Seminole San Andres Unit are available to validate the new multi-well interpretation workflow. Our interpretation models are flexible enough to integrate any type of classical machine-learning method and petrophysical assumption for robust petrophysical estimations. In comparison to classical machine-learning models with no normalization, with two-point scaling normalization and with linear constraints, the DA method yields better performance, e.g., the mean-squared error of permeability estimation decreases by approximately 20-50%. Our interpretation workflow can be applied to other stationary signal and image processing problems to mitigate errors introduced by biased measurements, and to better adapt models calibrated with data from one field to other neighboring fields.
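The two classical normalization techniques that this abstract uses as baselines can be sketched in a few lines. This is a minimal illustration under assumptions not stated in the abstract: two-point scaling is taken to match the 5th and 95th percentiles of a log to those of a reference (type) well, mean-variance normalization is taken to match the mean and standard deviation, and the sample values are invented. Both operate on one log at a time, which is why the abstract describes them as based on marginal distributions.

```python
import statistics


def percentile(values, q):
    """Percentile with linear interpolation, q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def two_point_scale(log, reference, q_low=5.0, q_high=95.0):
    """Linearly map two percentiles of `log` onto those of `reference`."""
    low, high = percentile(log, q_low), percentile(log, q_high)
    ref_low, ref_high = percentile(reference, q_low), percentile(reference, q_high)
    gain = (ref_high - ref_low) / (high - low)
    return [ref_low + (value - low) * gain for value in log]


def mean_variance_normalize(log, reference):
    """Shift and scale `log` to the mean and standard deviation of `reference`."""
    mean, std = statistics.fmean(log), statistics.pstdev(log)
    ref_mean, ref_std = statistics.fmean(reference), statistics.pstdev(reference)
    return [ref_mean + (value - mean) * ref_std / std for value in log]


# Hypothetical reference log and a copy distorted by a gain and an offset,
# mimicking a tool calibration difference between wells.
reference = [60.0, 75.0, 90.0, 105.0, 120.0, 80.0, 95.0, 110.0]
biased = [1.2 * value + 15.0 for value in reference]

scaled = two_point_scale(biased, reference)
normalized = mean_variance_normalize(biased, reference)
print([round(value, 2) for value in scaled])
print([round(value, 2) for value in normalized])
```

Because the distortion in this toy case is purely linear, both techniques recover the reference values; they cannot correct nonlinear or multi-log distortions, which is the gap the abstract's adversarial model targets.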
Article
Interpretation of seismic data is an essential task in diverse fields of geosciences, and it is a widely used method in hydrocarbon exploration. However, its interpretation requires a significant investment of resources, and obtaining a satisfactory result is not always possible. The literature shows an increasing number of Deep Learning (DL) methods to detect horizons, faults, and potential hydrocarbon reservoirs. However, the models to detect gas reservoirs present generalization difficulties, i.e., the performance of the methods developed based on DL is compromised when used on seismic data coming from new exploration campaigns. This implies that the new seismic data has features that differ from those that the DL model learned to identify based on the training data. The generalization problem is especially pronounced for 2D land surveys, where the acquisition process varies and the data is very noisy. This work proposes a Domain Adaptation method for natural gas detection in 2D seismic data based on comparing seismic features. Its aim is to improve the performance of the DL model when it is used on data that comes from several exploration campaigns carried out by several teams, in different areas and on different dates. The proposed method does not require the modification of the training seismic data or the DL model architecture; it focuses on the analysis of the training data to recognize patterns that allow comparing and clustering the seismic data that are similar. Consequently, the resulting clusters contain representative seismic data for specific domains. This work uses seismic data from nine exploration fields of the Paleozoic Basin of Parnaíba in Brazil as the object of study. The proposed method has two variants, both of which use an LSTM-based DL model to perform gas inference on 2D seismic data. The results of the proposed method are compared with those obtained using the same DL model trained with all available seismic data.
The first variant, called “Cluster Training per Field”, requires only 23% of the original training data and achieves improvements of 3% in Precision, 4% in Recall, 2% in F1, and 2% in Intersection Over Union (IoU). The second variant, called the “Sample Training Cluster Recommendation Method”, achieved a 4% improvement in Precision, 10% in Recall, 7% in F1, and 8% in IoU.