Common kernel functions and their associated mathematical formulas

Source publication
Article
Lithology prediction is one of the most important issues in petroleum geology and in the geological studies of petroleum engineering. Because well logging responses are very similar for heterogeneous carbonate and evaporite sequences, precise lithology prediction at predetermined depths becomes critical. In this work, a combination of...

Context in source publication

Context 1
... functions evaluate a dot product in feature space, and the defining characteristic of a kernel is that the value of this dot product is actually computed in the input space (Hamel 2009). Some common kernel functions and their associated mathematical formulas are listed below (Table 1). Following the linear support vector classifier, the formulation for the nonlinear support vector classifier is a generalization of the linear SVM and is given by (Kecman 2005 ...
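Table 1 itself is not reproduced on this page, so the following Python sketch is only an illustration of the kernels that are usually listed in such tables (linear, polynomial, radial basis function, sigmoid). The function names, the default parameter values (`gamma`, `coef0`, `degree`), and the helper `phi_quadratic` are assumptions made for this example and are not taken from the source publication. The last lines check the property described above: for a degree-2 homogeneous polynomial kernel, the value computed in the input space equals the dot product of explicitly mapped feature vectors.

```python
import math


def dot(x, y):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    # K(x, y) = x . y
    return dot(x, y)


def polynomial_kernel(x, y, degree=2, gamma=1.0, coef0=1.0):
    # K(x, y) = (gamma * x . y + coef0) ** degree
    return (gamma * dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    # K(x, y) = exp(-gamma * ||x - y||^2)
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x, y, gamma=0.1, coef0=0.0):
    # K(x, y) = tanh(gamma * x . y + coef0)
    return math.tanh(gamma * dot(x, y) + coef0)


def phi_quadratic(x):
    """Explicit feature map for K(x, y) = (x . y)^2 with 2-D inputs (hypothetical helper)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


# Two made-up 2-D input vectors.
x, y = (1.0, 2.0), (3.0, 0.5)

# Kernel evaluated directly in the input space.
k_input = polynomial_kernel(x, y, degree=2, gamma=1.0, coef0=0.0)

# Same quantity as a dot product in the 3-D feature space.
k_feature = dot(phi_quadratic(x), phi_quadratic(y))

print(k_input, k_feature)
```

Both values agree up to floating-point rounding, which is the sense in which a kernel evaluates a feature-space dot product without ever constructing the feature vectors.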

Similar publications

Article
Relevance of the research. Permanently identified signs of plastic flow in ultramafic rocks predetermined an approach to their study as metamorphic rocks. This approach uses a non-traditional method of petrofabric analysis. This method allows the chronological sequence of formation and plastic deformation of ultramafic rocks to be reconstructed in the...
Article
The synonymous use of the general term “landslide”, with a built-in reference to a sliding motion, for all varieties of mass-transport deposits (MTD), which include slides, slumps, debrites, topples, creeps, debris avalanches etc. in subaerial, sublacustrine, submarine, and extraterrestrial environments has created a multitude of conceptual and nom...
Article
Until recently, the existing data prevented geophysicists from accurately dating the Bysy-Yuryakh stratum, which outcrops in the middle reach of the Kotuy River, constraining the time of its formation to a wide interval from the end of the Late Cambrian to the beginning of the Silurian. The obtained paleomagnetic data unambiguously correlate th...
Article
Upper Ordovician Wufeng Formation-Lower Silurian Longmaxi Formation shale in the Sichuan Basin and its periphery is an important horizon for shale gas exploration and development in China. The rapid on-site desorption instrument independently developed by Sinopec Wuxi Institute of Petroleum Geology was used to test the desorption gas volume, l...

Citations

... This holistic approach can integrate diverse data sources, including seismic, geological, and production data, offering a more nuanced understanding of reservoir characteristics. Studies have utilized a range of supervised machine learning algorithms, including Random Forest (RF) [7], Support Vector Machines (SVM) [8], Artificial Neural Networks (ANN) [9], adaptive network fuzzy inference system (ANFIS) [10], and Extreme Gradient Boosting (XGB) [6] values. ...
Article
Reservoir characterization, essential for understanding subsurface heterogeneity, often faces challenges due to scale-dependent variations. This study addresses this issue by utilizing hydraulic flow unit (HFU) zonation to group rocks with similar petrophysical and flow characteristics. Flow Zone Indicator (FZI), a crucial measure derived from pore throat size, permeability, and porosity, serves as a key parameter, but its determination is time-consuming and expensive. The objective is to employ supervised and unsupervised machine learning to predict FZI and classify the reservoir into distinct HFUs. Unsupervised learning using K-means clustering and supervised algorithms including Random Forest (RF), Extreme Gradient Boosting (XGB), Support Vector Machines (SVM), and Artificial Neural Networks (ANN) were employed. FZI values from RCAL data formed the basis for model training and testing, then the developed models were used to predict FZI in unsampled locations. The methodology applied 3-fold cross-validation and hyper-parameter tuning, using random search cross-validation over 50 iterations to optimize each model. The four applied algorithms indicate high performance with coefficients of determination (R²) of 0.89 and 0.91 in training and testing datasets, respectively. RF showed the highest performance with training and testing R² values of 0.957 and 0.908, respectively. Elbow analysis guided the successful clustering of 212 data points into 10 HFUs using k-means clustering and Gaussian mixture techniques. The high-quality reservoir zone was successfully identified using the unsupervised technique. The intervals at 2370–2380 feet and 2463–2466 feet are predicted to be high-quality reservoir zones, with average FZI values of 500 and 800, respectively.
The application of machine learning in reservoir characterization is deemed highly valuable, offering rapid, cost-effective, and precise results, revolutionizing decision-making in field development compared to conventional methods.
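The abstract above does not state how FZI is computed. As a hedged illustration, the sketch below uses the commonly cited definition in which the reservoir quality index is RQI = 0.0314 · sqrt(k / φ), the normalized porosity is φz = φ / (1 − φ), and FZI = RQI / φz, with permeability k in millidarcies and porosity φ as a fraction. Whether the cited study uses exactly this form and these units is an assumption; the function name and the sample values are made up for the example.

```python
import math


def flow_zone_indicator(permeability_md, porosity):
    """Return (RQI, phi_z, FZI) for one core sample.

    Assumes the commonly cited definition:
        RQI   = 0.0314 * sqrt(k / phi)   (k in mD, phi as a fraction)
        phi_z = phi / (1 - phi)
        FZI   = RQI / phi_z
    """
    if permeability_md <= 0.0:
        raise ValueError("permeability must be positive")
    if not 0.0 < porosity < 1.0:
        raise ValueError("porosity must be a fraction between 0 and 1")
    rqi = 0.0314 * math.sqrt(permeability_md / porosity)
    phi_z = porosity / (1.0 - porosity)
    return rqi, phi_z, rqi / phi_z


# Hypothetical sample: 100 mD permeability, 20 % porosity.
rqi, phi_z, fzi = flow_zone_indicator(100.0, 0.20)
print(f"RQI={rqi:.4f}  phi_z={phi_z:.4f}  FZI={fzi:.4f}")
```

Samples with similar FZI values would then be grouped into the same hydraulic flow unit, which is the quantity the supervised models in the abstract are trained to predict.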
... Combining the advantages of logging information and logging data, Yang Sitong et al. [14] used a BP neural network for comprehensive processing and achieved oil and gas identification in low-porosity, low-permeability reservoirs. Sebtosheikh et al. [15] applied a support vector machine algorithm to successfully predict the lithology of heterogeneous carbonate reservoirs. Dong et al. [16] applied the improved Linear Discriminant Analysis (LDA) method to identify lithology and achieved good recognition results in experiments with on-site data sets. ...
Article
Stratum identification is the division of the stratum lithology of a region and is an important part of petroleum geology research. How to effectively improve the accuracy and efficiency of stratum recognition is an important issue in oil exploration and development. In traditional oil and gas drilling, logging data are commonly used as the main basis for manual stratum division. The challenges encountered are high labor intensity and an excessive dependence of identification accuracy on human experience. By comprehensively considering the synergy of multiple parameters in oil and gas drilling, we propose an intelligent sub-layer division model based on the LightGBM algorithm. First, the data set was formed by normalizing, de-noising, and smoothing the drilling engineering parameters and combining them with the element logging parameters. Then, the LightGBM algorithm was applied to build the sub-layer division model, and a deep neural network and a support vector machine were introduced for comparative analysis. Finally, the input parameters of the model were optimized by principal component analysis to realize the intelligent identification of the stratum sub-layer. The application results for a block in the central Bohai Sea oil field showed that intelligent identification of the stratum sub-layer while drilling could be realized. The model, combined with logging-while-drilling data, achieved high recognition accuracy and provides a theoretical model for the transformation of stratum sub-layer identification technology.
... In lithology detection, a well-organized supervised machine learning classifier trained on feature samples achieves outstanding results (Sebtosheikh et al., 2015). SVM explains the linear data as ( ...
Article
The present study aims to better understand the mineralogy and thermal structure of the Yingxiu-Beichuan fault zone (YBFZ), Sichuan basin, China, which have not previously been well characterized. Previous research on the Wenchuan earthquake Fault Scientific Drilling (WFSD) project focused on mineral classification, fault analysis, and geochemical research using original logs from the WFSD-1 well at a shallow depth of 700 meters. No investigations have been conducted at a depth of around 1550 meters using multiple dimensionality-reduced well logs. We therefore sought to categorize the minerals along the YBFZ using Machine Learning (ML) technologies and concentration-number (C-N) modeling with multiple WFSD-1 and WFSD-2 wells. In the categorization of rocks, three classifiers are discussed: Support Vector Machines (SVM), Feed Forward Back Propagation Neural Networks (BPNN), and Radial Basis Function Neural Networks (RBFN). Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) are used to normalize the log data in a geologically complicated region. By applying the BPNN classifier to the PCA, LDA, and all 17 logs, we achieved respective accuracy rates of 75.54%, 86.32%, and 72.54%, which are higher than the accuracy rates of RBFN (76.03%, 77.01%, and 65.3%) and SVM (74.45%, 78.03%, and 70.06%). These results suggest that BPNN shows improved accuracy rates for mineral classification in a complex tectonic regime. In addition, the concentration-number (C-N) fractal model technique and log-log plots are used to characterize geothermal features, and the results of C-N modeling support the results of the ML models. High gamma ray (GR) ranges of 349.7 API, heat production (HP) ranges of 5.5 W/m3, and low thermal conductivity (TC) ranges of 0.28 W/km show that the fault region is associated with comparatively strong radiogenic activity compared to its surroundings.
Keywords: Yingxiu-Beichuan fault; dimensionality reduction; mineral classification; Concentration-Number modeling; Wenchuan Earthquake zone.
... ML is applied with great success in various industries, such as engineering [18][19][20][21][22][23], medicine [24][25][26][27][28][29], economy [30][31][32], and environmental and geospatial modeling [33][34][35]. ML has been applied to macroscopic features of target rocks, such as seismic facies classification [36][37][38][39][40] and logging lithofacies classification [41][42][43]. ML has demonstrated outstanding performance in these areas. An important aspect observed in recent research is that ML can learn and adapt to the dynamics of reservoir conditions, such as formation and depositional environment [44], while making use of geophysical data for lithology identification [45,46], porosity, and permeability [47][48][49][50][51][52][53]. ...
Article
Total organic carbon (TOC) is an important geochemical parameter for evaluating the hydrocarbon generation potential of source rocks. TOC is commonly measured experimentally using cutting and core samples. The coring process and experimentation are always expensive and time-consuming. In this study, we evaluated the use of three machine learning (ML) models and two multiple regression models to predict TOC based on well logs. The well logs involved gamma rays (GR), deep resistivity (RT), density (DEN), acoustic waves (AC), and neutrons (CN). The ML models were developed based on random forest (RF), extreme learning machine (ELM), and back propagation neural network (BPNN). The source rock of the Paleocene Yueguifeng Formation in the Lishui–Jiaojiang Sag was taken as a case study. The numbers of TOC measurements used for training and testing were 50 and 27, respectively. All well logs and selected well logs (including AC, CN, and DEN) were used as inputs, respectively, for comparison. The performance of each model was evaluated using different metrics, including R², MAE, MSE, and RMSE. The results suggest that using all well logs as input improved the TOC prediction accuracy, and the error was reduced by more than 30%. The accuracy comparison of ML and multiple regression models indicated that the BPNN was the best, followed by RF and then multiple regression. The worst performance was observed in the ELM models. Considering the running time, the BPNN model has higher prediction accuracy but a longer running time in small-sample regression prediction. The RF model can run faster while ensuring a certain prediction accuracy. This study confirmed the ability of ML models to estimate TOC using well log data in the study area.
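The four evaluation metrics named in the abstract above (R², MAE, MSE, RMSE) can be sketched in plain Python as follows. The function name and the TOC values are invented for illustration and do not come from the cited study; R² is computed here as 1 − SS_res / SS_tot, which is the usual definition but is assumed rather than confirmed by the source.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, MAE, MSE and RMSE for paired measured and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n        # mean absolute error
    mse = sum(e * e for e in errors) / n         # mean squared error
    rmse = math.sqrt(mse)                        # root mean squared error
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                   # undefined if y_true is constant
    return {"R2": r2, "MAE": mae, "MSE": mse, "RMSE": rmse}


# Made-up measured and predicted TOC values (wt%), for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
metrics = regression_metrics(measured, predicted)
print(metrics)
```

MAE and RMSE are in the same units as TOC, so they are easier to read against the measurement range than MSE, while R² gives a unit-free measure of explained variance.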
... The results show that the SVM classifier has the optimum performance. Sebtosheikh et al. [8] used lithology log data based on core analysis data in Iranian heterogeneous carbonate reservoirs to build various SVM classifiers with different kernel functions for lithology categorization. The performance of these models is compared, and the results show that the SVM classifier based on radial basis functions performs optimally. ...
... This indicates that there is no need to write an explicit program to establish the relationship between shear velocity and minerals, pores and fluids, but only to automatically import logging data into the computer to learn the relationship between shear velocity and other logging curves. At present, machine learning has been widely used in rock physics analysis, such as identification of outliers in density and P-sonic curve, reconstruction of curves using a data-driven method [29], lithofacies identification using an artificial neural network (ANN) [30], facies prediction using ANN [31][32][33] and support vector machine (SVM) [34,35] and logging interpretation by data mining [36]. ...
Article
Shear velocity is an important parameter in pre-stack seismic reservoir description. However, in practice, the high cost of array acoustic logging often means that no shear velocity curve is available. Thus, it is crucial to use conventional well-logging data to predict shear velocity. The shear velocity prediction methods mainly include empirical formulas and theoretical rock physics models. When using the empirical formula method, calibration should be performed to fit the local data, and its accuracy is low. When using rock physics modeling, many parameters of the pure minerals must be optimized simultaneously. We present a deep learning method to predict shear velocity from several conventional logging curves in tight sandstone of the Sichuan Basin. The XGBoost algorithm has been used to automatically select the feature curves as the model's input after quality control and cleaning of the input data. Then, we construct a deep feed-forward neural network model (DFNN) and decompose the whole model training process into detailed steps. During the training process, parallel training and testing methods were used to control the reliability of the trained model. Well validation showed that the prediction accuracy is higher than that of the empirical formula and the rock physics modeling method.
... The results show that the SVM classifier has the optimum performance. Sebtosheikh et al. [8] used lithology log data based on core analysis data in Iranian heterogeneous carbonate reservoirs to build various SVM classifiers with different kernel functions for lithology categorization. The performance of these models is compared, and the results show that the SVM classifier based on radial basis functions performs optimally. ...
Article
The lithology of underground formations can be determined using logging data, which is important for a variety of subsurface geoscience and industrial applications. Deep learning technology offers the advantage of discovering a potential relationship between input and output variables, making it a great choice for generating fast and cost-effective lithology classification models. To automatically characterize lithologies, the task is treated as a multiclass image segmentation problem and an improved Unet is adopted as a solution. The model’s input data are two-dimensional images composed of rock feature data at different depths, and the output is a one-dimensional rock lithology classification. The algorithm’s practicality was tested using the logging data set from the Xinjing mining area in Shanxi Province, in north-central China, and an open-source data set of Canadian strata. Our model is tested against the 1D-convolutional neural network (CNN) and XGBoost algorithms using well logging data sets from the same depth and from different depths. The results show that the improved Unet method outperforms the 1D-CNN and XGBoost algorithms in the classification of rock lithologies. This algorithm has high application potential in the automatic interpretation of rock lithologies.
... For lithology classification and interpretation, various methods have been used, including crossplot interpretation and statistical analysis based on histogram plotting (Busch et al., 1987), support vector machine (SVM) using wireline well logs (Sebtosheikh et al., 2015), fuzzy logic (FL) for association analysis, neural networks and multivariate statistical methodologies, and artificial intelligence approaches. Automating the process of discovering and classifying electrofacies has recently been made possible by a slew of new mathematical approaches. ...
Article
Understanding geological variance in a proved reservoir requires accurate and exact characterization of lithological facies. In the Kadanwari gas field, machine learning (ML) classification algorithms have been used to forecast facies on an accessible dataset. The goal is to increase the reliability of facies categorization through a rigorous application of machine learning. In the current study, we have used the self-organizing map (SOM) and crossplot techniques to identify lithofacies. In the classification of the reservoir, recognition of lithofacies is the main task. It is expensive to identify lithofacies with conventional methods from core data, and it is challenging to extend this application to non-cored wells. This research provides a less expensive method for the systematic and objective recognition of lithofacies from well-log data using a Kohonen SOM. SOMs are artificial neural networks that do not need supervision and that map the input space into groups whose topology is arranged according to changes in the input data. The results of SOM and crossplot indicate that the zone of interest is mainly composed of sandstone, shaly sandstone, and shale, with a minor amount of carbonates. The cluster analysis approach has been utilized to categorize the reservoir rock groups in the Cretaceous reservoir of the Kadanwari gas field by analyzing the variance of reservoir properties data that are forecast by examining well log dimensions. Four groups of reservoirs were identified, each of which was internally identical in petrophysical properties but distinct from the others. The reservoir mainly composed of sandstone is graded as an excellent reservoir, while shale is graded as a poor reservoir.
... Then, given that only support vectors are retained for modeling, SVM should in theory be more powerful in both computing speed and prediction accuracy. Many studies have verified the potential of SVM for prediction, such as Al-Anazi and Gates (2010a, b), who demonstrated the capability of SVM in sandy-mud reservoirs, and Sebtosheikh et al. (2015), who confirmed the effectiveness of SVM in predicting carbonate lithologies. Although previous findings show that these ML-based models are functional and appear preferable for lithological prediction, their capabilities still do not satisfy geologists and geophysicists, owing to their relatively poor generalization and robustness. ...
... Hence, the real constructions of contrastive predictors can be written as CRBM-AFSA-PNN and CRBM-AFSA-SVM. As a routine, before modeling, the empirical initial settings of PNN and SVM should be determined, which are given by Table 4 (Al-Anazi and Gates 2010a and b; Paass et al. 2010; Sebtosheikh et al. 2015; Gu et al. 2018; Ao et al. 2019). [Table 4: Initial setting of predictors, and optimized results of hyper-parameters for each predictor. Table note 1: Radial basis function is selected as kernel function in SVM process.] The bottom row of the table records the optimized results of hyper-parameters of each predictor. ...
Article
Due to limitations imposed by cored wells, lithological data are often incomplete, and correct identification of lithofacies is problematic. Identification is actually an issue of pattern recognition, and based on recent findings, LightGBM (light gradient boosting machine) is considered to be an excellent pattern recognizer and, therefore, well suited for recognizing lithofacies. To remove remaining disadvantageous features and to further enhance the prediction performance of LightGBM, CRBM (continuous restricted Boltzmann machine) and AFSA (artificial fish swarm algorithm) are adopted as assistants to provide high-quality learning data and to create optimal hyper-parameter settings, respectively, during data processing. Subsequently, a predictor characterized by new ensemble learning is proposed, named CRBM-AFSA-LightGBM. To establish comprehensive verification, several validations are designed based on logging data derived from pre-salt carbonate reservoirs of the Santos Basin. Validations demonstrate the effectiveness and significance of integrating CRBM and AFSA; a further two validations are aimed at revealing whether a change in the learning data has an impact on prediction. To highlight the validation effect, PNN (probabilistic neural network) and SVM (support vector machine) are introduced as contrasting predictors. The test results demonstrate three important points: (1) CRBM and AFSA are well suited to enhancing the capability of LightGBM; (2) the LightGBM-cored predictor performs better than the PNN-cored and SVM-cored predictors, especially when dealing with larger-scale learning data; (3) the new predictor is more robust, because reliable identification can still be achieved even when learning samples are sparse. Because all the validation evidence is favorable, CRBM-AFSA-LightGBM is verified as a highly efficient and robust prediction tool for lithofacies identification in carbonate reservoirs.
... There are close relationships between well log data and the formation. Conventionally, a wide range of soft computing methods has been proposed for lithology prediction from recorded well log data, combining various measurements in both qualitative and quantitative evaluation; hence, automated lithology prediction using well logs has made an outstanding contribution in the field of oil and gas. For solving lithology prediction and classification problems, several soft computing methods such as support vector machines using conventional wire-line well logs [20], cross plot interpretation and statistical analysis based on histogram plotting [21], FL for association analysis, NN and multivariable statistical methodologies [22], artificial intelligence approaches and multivariate statistical analysis [23], NIS apriori algorithm [19], hybrid NN methods [24], self-organizing maps (SOM) [25], FL methods [26], artificial neural network (ANN) methods [27], fuzzy curves and ensemble neural networks [28], determination of the total organic carbon (TOC) using ANN [29], multi-agent collaborative learning architecture approaches [30], random forest [31,32], generative adversarial network [33], multivariate statistical methods [34], aggregation of principal component, clustering and discriminant analysis [35], statistical characterization, and discrimination and stratigraphic correction methodologies [36] have been suggested by researchers. However, the majority of these methods are black boxes, which makes it difficult to understand the models for further analysis [37]. ...
... However, the majority of these methods are black boxes, which makes it difficult to understand the models for further analysis [37]. The performance of ANN and FL methods is superior to that of statistical methods [20,27,38,39]. SOM methods provide better results in lithology classification compared to other machine learning techniques [40]. ...
Article
With the advancement of machine learning and artificial intelligence, the automated estimation of a bed's complex lithology has become one of the most crucial requirements in petroleum engineering because of its important role in reservoir characterization. In the past, geophysical modelling, petrophysical analysis, artificial intelligence and several statistical approaches have been implemented to estimate lithology, since prediction of lithology from recorded continuous cores is very expensive and unprofitable. Geoscience researchers often encounter uncertain, inexact, and vague data in the process of lithology identification, which results in inefficient classification. Additionally, the complexities that are coupled to the lithology trends and their equivalent fluid responses produce ambiguity and confuse the models. The goal of this work is to develop a lithology prediction technique by applying rough set theory (RST) as a granular computing approach to construct logical rules from an inconsistent information system that includes data from several well log attributes, including the lithology indicator, SQp, and the fluid indicator, SQs, which make a noticeable contribution to lithology classification. In addition, the rules will be established as a baseline for application in practice and future developments for multivariate well-log analysis. The results were validated with cutting data, and it was shown that the proposed approach classified the lithology effectively with a misclassification rate of less than 18%, which is lower than that of the other methods compared. Moreover, the result has confirmed that the method has a promising prospect as a lithology prediction tool, especially in real-time operation, because of the white-box nature of the module, that is, its ability to describe the model's calculation steps and results in an easily understandable form.
1. Introduction
Lithology refers to the composition or type of rock in the Earth's sub-surface.
The term lithology is used as a gross description of a rock layer in the subsurface and uses familiar names, including sandstone, siltstone, mudstone, etc. The lithology of a layer can be identified by drilling holes, although this method often does not provide exact information. We can also obtain the classification results of lithology from recorded continuous cores, which are very expensive and might be unprofitable. The lithology can also be estimated by geophysical inversion and geophysical modelling methods. Lithology prediction can be performed using petrophysical well logs. The estimation of lithology from multi-attribute well log data has become one of the most prominent techniques used by several sectors of petroleum engineering, including geological studies for reservoir characterization, reservoir modelling and formation evaluation, well planning including drilling and enhanced oil recovery processes, well completion management, etc. This study shows that granular computation based on rough sets (RS) works effectively to recognize the pattern of a few well-log attributes and predict lithology. In the era of smart data mining and analysis, granular computation involves the partitioning of an object into granules, with a granule being a clump of elements defined by similarity, indistinguishability, functionality or proximity [1, 2, 3]. Rough set theory (RST) [4, 5, 6] is a rising granular computing technique with a wide range of applications in many sectors, particularly in decision studies, inductive inference, conflict resolution, machine learning, knowledge discovery and acquisition, pattern recognition and inductive reasoning [7, 8]. It is very difficult for geoscientists to reduce the size of the dataset and to obtain the associated data simultaneously.
RST addresses this challenge by acting as an efficient and effective module that reduces the data size in its computational process and discovers the hidden data patterns in database(s), an approach called knowledge discovery (KD) in the field of granular computing. KD has been used for the development of information systems that assist in extracting concealed data patterns and other important information in the datasets. RST performs granular computation on a vague concept (set) using two precise concepts, the lower and upper approximations (discussed in Methodology section). To perform granular computation, RST requires only the provided data [9]. RST works by employing a granular understanding of the provided dataset. The most significant benefit of white-box models like RST over black-box approaches is that detailed knowledge of the classification process is available for better understanding the problem under study. RST has numerous other advantages [10, 11]. Some are given here.
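The lower and upper approximations mentioned above can be illustrated with a minimal Python sketch. It follows the standard rough-set definitions: objects that share the same condition-attribute values form an indiscernibility class; the lower approximation of a target set is the union of classes fully contained in it, and the upper approximation is the union of classes that overlap it. The toy decision table, the attribute names `GR` and `DEN`, the depth labels, and the function name are hypothetical and are not taken from the cited study.

```python
from collections import defaultdict


def rough_approximations(objects, condition_attrs, target):
    """Return (lower, upper) approximations of `target` under `condition_attrs`."""
    # Group objects that are indiscernible on the chosen condition attributes.
    classes = defaultdict(set)
    for name, attrs in objects.items():
        key = tuple(attrs[a] for a in condition_attrs)
        classes[key].add(name)

    lower, upper = set(), set()
    for members in classes.values():
        if members <= target:   # class lies entirely inside the target set
            lower |= members
        if members & target:    # class overlaps the target set
            upper |= members
    return lower, upper


# Hypothetical, discretized well-log samples (inconsistent on purpose: d4 vs d5).
samples = {
    "d1": {"GR": "high", "DEN": "low", "lithology": "shale"},
    "d2": {"GR": "high", "DEN": "low", "lithology": "shale"},
    "d3": {"GR": "low", "DEN": "high", "lithology": "sand"},
    "d4": {"GR": "low", "DEN": "low", "lithology": "sand"},
    "d5": {"GR": "low", "DEN": "low", "lithology": "shale"},
}

shale = {name for name, attrs in samples.items() if attrs["lithology"] == "shale"}
lower, upper = rough_approximations(samples, ["GR", "DEN"], shale)
boundary = upper - lower

print(sorted(lower), sorted(upper), sorted(boundary))
```

In this toy table, samples `d4` and `d5` have identical log responses but different lithologies, so they fall in the boundary region: they possibly, but not certainly, belong to the shale class. Rules drawn from the lower approximation are certain, while rules drawn from the boundary are only possible, which is how an inconsistent information system can still yield readable decision rules.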