What can machine learning do for seismic data processing?
An interpolation application
Yongna Jia1 and Jianwei Ma1
ABSTRACT
Machine learning (ML) systems can automatically mine data
sets for hidden features or relationships. Recently, ML methods
have become increasingly used within many scientific fields.
We have evaluated common applications of ML, and then we
developed a novel method based on the classic ML method of
support vector regression (SVR) for reconstructing seismic data
from under-sampled or missing traces. First, the SVR method
mines a continuous regression hyperplane from training data
that indicates the hidden relationship between input data with
missing traces and output completed data, and then it interpo-
lates missing seismic traces for other input data by using the
learned hyperplane. The key idea of our new ML method is sig-
nificantly different from that of many previous interpolation
methods. Our method depends on the characteristics of the
training data, rather than the assumptions of linear events, spar-
sity, or low rank. Therefore, it can break free of these assumptions or constraints and generalize across different data sets. In
addition, our method dramatically reduces the manual workload;
for example, it allows users to avoid selecting the window size
parameters, as is required for methods based on the assumption of
linear events. The ML method facilitates intelligent interpolation
between data sets with similar geomorphological structures,
which can significantly reduce costs in engineering applications.
Furthermore, we combine a sparse transform called the data-
driven tight frame (so-called compressed learning) with the SVR
method to improve the training performance, in which the train-
ing is implemented in a sparse coefficient domain rather than in
the data domain. Numerical experiments show the competitive
performance of our method in comparison with the traditional
f-x interpolation method.
INTRODUCTION
With the large amounts of data preserved by the Internet, machine
learning (ML) is emerging as a new kind of algorithm designed to
automatically learn the features and relationships hidden in large
data sets. This is often a very attractive alternative to performing the
same work manually. ML methods have emerged as workhorses for many
applications, including (but not limited to) spam filters (Androut-
sopoulos et al., 2000;Guzella and Caminhas, 2009), recommender
systems (Bobadilla et al., 2013), credit scoring (Huang et al., 2007),
fraud detection (Ravisankar et al., 2011), and stock trading (Huang
et al., 2005). A vast array of previous research has established that
the primary tools of ML include linear/logistic regression, artificial
neural networks (Haykin, 2004), support vector machines (SVMs)
(Burges, 1998), decision trees (Murthy, 1998), and instance-based
learning (Dutton and Conroy, 1996). The primary functions
performed by ML include classification (for discrete outputs),
regression (for continuous outputs), clustering, association analysis
(Brijs et al., 1999), and anomaly detection (Hassan et al., 2015).
These techniques have a variety of applications. For example, ML
regression (Kwiatkowska and Fargion, 2003) is a promising tool for
improving data mergers, and SVM can facilitate satellite ocean
color sensor cross-calibrations. In addition, clustering can be used
to increase the efficiency and accuracy of regression. As another
example, detecting abnormal wafers can help workers to find faults
in the semiconductor domain (Hassan et al., 2015), and using SVM
to classify random-input sensor data can allow workers to recognize
motor faults (Banerjee and Das, 2012). Overall, due to its good performance, ML has spread rapidly throughout multiple fields, suggesting that it is likely to drive the next big wave of innovation.
What can ML do for seismic data processing? Despite the above-
listed advances, it is still unclear how ML can be used to improve
seismic exploration. ML-related techniques have preliminarily been
applied in the field of reservoir characterization to determine param-
Manuscript received by the Editor 7 June 2016; revised manuscript received 30 December 2016; published online 15 March 2017.
1Harbin Institute of Technology, Department of Mathematics, Harbin, China. E-mail: jiayongna123@163.com; jma@hit.edu.cn.
© 2017 Society of Exploration Geophysicists. All rights reserved.
GEOPHYSICS, VOL. 82, NO. 3 (MAY-JUNE 2017); P. V163–V177, 19 FIGS., 6 TABLES. DOI: 10.1190/GEO2016-0300.1
Downloaded 10/05/17 to 202.118.249.70. Redistribution subject to SEG license or copyright; see Terms of Use at http://library.seg.org/
eters such as sand fraction, shale fraction, porosity, and permeabil-
ity. In previously published research, Lim (2005) characterizes these reservoir properties from well data using fuzzy logic and neural networks, and Helmy et al. (2010) propose the use
of hybrid computational models to characterize oil and gas reser-
voirs. Zhang et al. (2014) propose ML-based automated fault de-
tection in premigrated seismic data. This method generates a set of
seismic traces from velocity models containing faults with varying
locality and other properties, which then uses these known exam-
ples to train a ML model to identify the presence or locality of faults
in previously unseen traces.
Inspired by ML, seismic data interpolation can be regarded as
a regression problem for continuous output. In other words, ML
methods can be used to generate an approximate function (a con-
tinuous hyperplane for an interpolation projection). In this paper,
we attempt to interpolate seismic data using ML. The subsequent
seismic processing steps, including multiple suppression, migra-
tion, and imaging, among others, generally require a dense seismic
record. However, due to economic and physical constraints, seismic
records are often distributed sparsely, or they are often missing
traces. These missing traces can be reconstructed using interpola-
tion, which can greatly decrease this economic cost.
Several seismic interpolation methods currently exist. Some re-
searchers have proposed methods assuming that the seismic record
of an original section comprises a limited number of linear events.
For example, Spitz (1991) assumes the existence of a series of linear
events and proposes a classical first-order f-x
interpolation method that interpolates missing
traces by using a set of linear equations. In this
case, one-step predictability may be seen as a
necessary but insufficient condition for signals
formed of linear events. Therefore, a suboptimal
solution can instead be generated for the cases of
curved events. Data with curved events are proc-
essed by using window techniques. However, the
quality of these reconstructions is significantly
influenced by window parameters. The method
proposed by Spitz has been extended to other do-
mains, including the time-spatial domain (Claerb-
out, 1992), the frequency-wavenumber domain
(Gülünay, 2003), and the curvelet domain (Naghizadeh and Sacchi, 2010).

Figure 1. Example of transforming to a linear classification. (Left) The separating boundary in R2 is an ellipse (nonlinear). (Right) The separating hyperplane becomes linear in R3 via a mapping Φ.

Figure 2. Explanation of the relationship between classification and regression. (a) The ε-insensitive loss function. (b) The regression hyperplane f(x). The separating hyperplane of the convex hull keeps the training data within the threshold ε.

Figure 3. Schematic diagram depicting the process of ML. In the training stage, ML methods mine an approximate function y = f(x) to fit the training set (x, y). In the prediction stage, one inputs the feature vector x into the trained function f(x) to obtain the unknown label y.

Sparsity-promoting
methods, e.g., the Fourier transform (Sacchi et al.,
1998;Liu and Sacchi, 2004), curvelet transform
(Herrmann and Hennenfent, 2008), and dictionary
learning (Liang et al., 2014; Yu et al., 2015), are
also popular in the field of seismic data interpo-
lation. These sparsity-promoting methods assume
that reconstructed data should be sparser than ob-
served data with missing traces in these transform
domains. Low-rank methods (which attempt to
achieve sparsity for singular values of seismic
data) have also attracted attention in recent years.
These methods assume that seismic data have a low-rank structure after certain pretransformations, such as texture-patch mapping (Ma, 2013;
Yang et al., 2013), Hankel re-embedding (Trickett
et al., 2010;Oropeza and Sacchi, 2011;Naghiza-
deh and Sacchi, 2012;Jia et al., 2016), and
coordinate transformation (Kumar et al., 2013).
Under the low-rank assumption, the interpolation problem can thus be recast as a rank-reduction matrix-completion problem.
In this paper, we use the support vector regres-
sion (SVR) method (Drucker et al., 1997), a
state-of-the-art ML regression tool, for learning
interpolation of seismic data. The use of SVR is
motivated by three factors: (1) SVR has a solid
theoretical foundation, and it can transform a
nonlinear classification/regression problem in
a low-dimensional space to a linear problem in a
high-dimensional space. Linear regression can be
done more easily. (2) SVR has good generalization ability in predicting output labels for input data.
(3) SVR is effective at function approximation,
especially in cases with a high-dimensional input
space. It has been successfully used in other
fields, in applications such as traveltime
prediction in intelligent transportation systems
(Wu et al., 2004), wind-speed forecasting
(Mohandes et al., 2004;Santamaría-Bonfil et al.,
2016), and image superresolution (Ni and
Nguyen, 2007). In this paper, the interpolation operator is learned as a multivariate function (a hyperplane) from the given example data sets.
Then, the function can be used for prediction and
interpolation of input data with missing traces,
without making any preassumptions about linear
events, sparsity, or low rank. Furthermore, a new
adaptive sparse transform, referred to as the data-
driven tight frame (DDTF) (Cai et al., 2014;Yu
et al., 2015), is combined with SVR to improve
its performance. The learning is implemented in
a sparse coefficient domain rather than in the
original data domain. The sparse transform is
helpful in increasing learning efficiency. Numeri-
cal experiments performed on varying data
demonstrate the applicability and competitive
performance of our method.
Figure 4. Schematic diagram of Gauss SVR. I: a local image patch centered at the pixel (i, j). II: the feature vector generated after the Gauss-weighted process.

Figure 5. Flow chart depicting the interpolation method based on SVR.

Algorithm 1. Interpolation method based on SVR.
Input: exemplified seismic data M = {M_t}, t = 1, 2, ..., missing seismic data Y, sample matrix F, patch size m.
1. Training stage
   1) Down-sample the exemplified seismic data M = {M_t} by F, t = 1, 2, ....
   2) Calculate the initial preinterpolation data using the bicubic method.
   3) Obtain the m^2-dimensional feature vectors x_ij and their corresponding labels y_ij.
   4) Input all pairs (x_ij, y_ij) to the SVR system and build a continuous regression function (hyperplane, f(x)).
2. Prediction stage
   1) Apply bicubic interpolation to the missing seismic data Y.
   2) Extract feature vectors in the same manner as in the training stage.
   3) Input all feature vectors into the regression function (hyperplane, f(x)) and obtain the missing labels.
Output: reconstructed seismic data Ỹ.

Figure 6. Schematic diagram of DDTF SVR. I: a local image patch centered at the pixel (i, j). II: the coefficients in the DDTF domain of each patch equal W^T P_ij M_BI. III: convert the coefficient matrix to the feature vector.
THEORY
A review of SVR
In ML, SVMs are supervised learning models used for classifi-
cation, regression, and other learning tasks. Support vector classi-
fication is used for data classification, which produces discrete
outputs, and SVR is used for data fitting and regression, which pro-
duces continuous outputs. Here, first, we briefly explain the termi-
nology of regression. Suppose that we are given a training set with n point pairs:

Ω = {(x_i, y_i), i = 1, 2, ..., n},    (1)

where x_i ∈ R^d is the feature vector (e.g., the local neighbor information of a sampling point) and y_i ∈ R is the corresponding label of x_i. Solving regression problems requires the construction of an approximate function f(x) mapping x to y (y = f(x)) to fit these point pairs. This function f(x) is also used to predict other labels y in point pairs (x, y) in which the feature vector is known and the label is unknown. These unknown labels y are obtained by inputting the feature vectors x into the function f(x).
The function form fðxÞof SVR (regression) is generated from
support vector classification, a technique that assumes the nonli-
nearly distributed sample points can be separated linearly if they are
projected to a high-dimensional space through a mapping. Figure 1
presents a simple example (Vapnik, 1995) to explain this idea.
Considering the mapping Φ: R2 → R3 with Φ(x1, x2) = (x1^2, x2^2, √2·x1x2), support vector classification can separate the elliptically distributed points in R2 into two categories with a linear hyperplane in R3. This example reveals that support vector
classification can use a mapping to transform a nonlinear classifi-
Trace number
Time sample number
20 40 60 80 100 120
20
40
60
80
100
120
Trace number
Time sample number
20 40 60 80 100 120
20
40
60
80
100
120
a)
b)
Figure 7. (a) Original synthetic data with linear events. (b) Deci-
mated data with 75% regular missing traces (1∕a, where a¼4).
Trace number
Time sample number
20 40 60 80 100 120
20
40
60
80
100
120
Trace number
Time sample number
20 40 60 80 100 120
20
40
60
80
100
120
Trace number
Time sample number
20 40 60 80 100 120
20
40
60
80
100
120
Trace number
Time sample number
20 40 60 80 100 120
20
40
60
80
100
120
a)
b)
c)
d)
Figure 8. Four exemplified seismic data used to
support training point pairs.
V166 Jia and Ma
Downloaded 10/05/17 to 202.118.249.70. Redistribution subject to SEG license or copyright; see Terms of Use at http://library.seg.org/
cation problem in low-dimensional space to a linear problem in a
high-dimensional space. This linear classification can therefore be
made fairly easily.
SVR performs data regression using an ε-insensitive loss function L_ε (Drucker et al., 1997), also shown in Figure 2a:

L_ε = 0 if |y − f(x)| ≤ ε;  otherwise L_ε = |y − f(x)| − ε,    (2)

in which ε represents the ε-insensitive loss parameter, and the loss function ignores errors within the threshold ε. In other words, if the error requirement |f(x_i) − y_i| ≤ ε is fulfilled, the prediction value f(x_i) is treated as equal to the label y_i. Under this loss function, the approximate function should keep as many data pairs as possible within the threshold. Therefore, the regression hyperplane f(x) should exist as the separating hyperplane of the convex hulls (Figure 2b):

D+ = {(x_i^T, y_i + ε)^T, i = 1, ..., n},    (3)
D− = {(x_i^T, y_i − ε)^T, i = 1, ..., n}.    (4)

Taking the example provided in Figure 1 into account, the regression hyperplane f(x) can take the linear form in a high-dimensional space with a mapping φ:

f(x) = ⟨w, φ(x)⟩ + b.    (5)
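As a small sketch of our own (illustrative values, not the paper's), the ε-insensitive loss of equation 2 can be written directly:

```python
import numpy as np

def eps_insensitive_loss(y, f_x, eps=0.1):
    # Equation 2: errors inside the epsilon tube cost nothing;
    # larger errors are penalized by their excess over epsilon.
    return np.maximum(np.abs(np.asarray(y) - np.asarray(f_x)) - eps, 0.0)

# Residuals 0.05 and -0.08 fall inside the tube (zero loss);
# the residual 0.3 exceeds the tube by ~0.2.
losses = eps_insensitive_loss([1.05, 0.92, 1.3], [1.0, 1.0, 1.0])
print(losses)
```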
Figure 9. Reconstruction results (1/a, where a = 4) by (a) bicubic (S/N = 27.80 dB), (b) f-x (S/N = 38.97 dB), (c) Gauss SVR (S/N = 41.05 dB), and (d) DDTF SVR (S/N = 42.10 dB).

Figure 10. (a) Original synthetic data with curved events. (b) Decimated data with sampling ratio 1/a, where a = 3.

The optimization problem to calculate the regression function f(x) is to minimize ‖w‖^2 while guaranteeing the absolute error |f(x_i) − y_i| stays within ε:
min over w, b, ξ_i, ξ_i*:   (1/2)‖w‖^2 + C Σ_{i=1}^{n} (ξ_i + ξ_i*)
s.t.   y_i − (⟨w, φ(x_i)⟩ + b) ≤ ε + ξ_i,
       (⟨w, φ(x_i)⟩ + b) − y_i ≤ ε + ξ_i*,
       ξ_i, ξ_i* ≥ 0,    (6)

where ξ_i and ξ_i* represent the slack variables, which allow the search for the optimal solution over a larger feasible region.

Equation 6 can be solved by using its convex dual problem, in which the solution is expressed as

f(x) = Σ_{i=1}^{n} (α_i − α_i*) ⟨φ(x_i), φ(x)⟩ + b
     = Σ_{i=1}^{n} (α_i − α_i*) K(x_i, x) + b,    (7)
where α_i and α_i* represent the dual variables. Equation 7 shows that the mapping φ can remain implicit via the kernel function K(x_i, x). This kernel function can be selected as a linear kernel, a polynomial kernel, or the Gauss radial basis function, among others. The inputs to SVR are the known point pairs in equation 1, and the output is a regression function (the hyperplane f(x)). More details about SVR can
be found in Smola and Schölkopf (2004).
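As a usage sketch, the train-then-predict workflow above can be run with scikit-learn's SVR, which wraps LIBSVM (the same library used in our experiments); the toy 1D sine data here are our own assumption, not the paper's data.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)

# Training pairs (x_i, y_i) as in equation 1: feature vectors and labels
X_train = rng.uniform(0.0, 2.0 * np.pi, size=(300, 1))
y_train = np.sin(X_train[:, 0])

# Gauss radial basis kernel; epsilon sets the width of the insensitive-loss tube
model = SVR(kernel="rbf", C=10.0, epsilon=0.01, gamma=1.0)
model.fit(X_train, y_train)

# Prediction stage: feed known feature vectors, read off predicted labels
X_new = np.array([[0.5], [1.5], [3.0]])
pred = model.predict(X_new)
print(pred)  # close to sin(0.5), sin(1.5), sin(3.0)
```

The hyperparameters (C, ε, γ) here are illustrative; in practice they are tuned to the data, as with any kernel method.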
Interpolation method based on SVR
In this section, we first briefly describe the interpolation method using ML-based regression, as shown in Figure 3.
Some point pairs (x, the feature vector, and y, the label), such as the training set (hollow blue circles), are known, and ML methods are used to mine a regression function y = f(x) hidden in these training point pairs. Other unknown labels y in (x, y) can be output after the feature vectors x are input to the trained function y = f(x). For
use in seismic data interpolation, it is necessary
to include a rule to transform the seismic data
into the form of point pairs (feature vector, label)
and to assign missing pixel values as the un-
known labels. Missing data should be first pre-
interpolated by using the bicubic method or
another method. A local patch in the preinterpo-
lated data is set as a feature vector, with the cor-
responding true pixel value set as a label. The
details of our SVR-based interpolation method
are described as follows.
Our method comprises two stages: the training
stage and the prediction stage. In the training
stage, we choose several examples without miss-
ing traces, whose geomorphological structures
are similar to those of the interpolated seismic
data. To distinguish between the two kinds of
seismic data, we categorize these examples as
either exemplified seismic data or missing seis-
mic data. As mentioned above, the use of SVR
requires that seismic data are transformed to a
new form (feature vector, label). To do this, we
first down-sample the exemplified data by using
the same sample matrix as the missing seismic
data. Second, we use bicubic interpolation to
carry out an initial preinterpolation of the down-sampled exemplified data. It is also possible to use other methods, such as the f-x method, to preinterpolate these down-sampled exemplified data (see below).

Figure 11. Reconstruction results (1/a, where a = 3) by (a) bicubic (S/N = 28.64 dB), (c) f-x (S/N = 35.63 dB), (e) Gauss SVR (S/N = 32.27 dB), and (g) DDTF SVR (S/N = 33.02 dB). Panels (b), (d), (f), and (h) show the corresponding trace comparisons, respectively. The dotted line represents the original trace, and the solid line represents the reconstructed trace.

Next, for each pixel value at location (i, j) in the pre-
interpolation matrix, we take a local image patch of size m × m centered at (i, j). This patch is then weighted by a matrix constructed from a 2D Gaussian distribution. Finally, this weighted patch is converted to a row vector (Figure 4):

x_ij = vec(W_G(P_ij M_BI)),    (8)

where M_BI is the preinterpolation data obtained using bicubic interpolation and P_ij is the patch-extraction operator. Note that the patch notation here represents the local information of a pixel, which is not the same as that used in patch-based seismic data processing (Bonar and Sacchi, 2012). Here, W_G represents the weighting by the Gaussian matrix. The function vec reshapes a matrix of size m × m into a row vector of size 1 × m^2. Thus, an m^2-dimensional feature vector x_ij and its corresponding label y_ij, the pixel value at position (i, j) in the exemplified seismic data, have been obtained.
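Equation 8 can be sketched as follows. This is a minimal illustration with our own helper names; the patch size m = 3 and σ = 1 for the Gaussian weights are assumptions, and row-major vectorization is one possible vec convention.

```python
import numpy as np

def gauss_weight(m, sigma=1.0):
    # 2D Gaussian weighting matrix W_G for an m x m patch
    ax = np.arange(m) - (m - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    return np.exp(-(xx**2 + yy**2) / (2 * sigma**2))

def feature_vector(M_BI, i, j, m=3, sigma=1.0):
    """Equation 8: x_ij = vec(W_G(P_ij M_BI)).
    P_ij extracts the m x m patch centered at (i, j) of the
    preinterpolated data M_BI; W_G applies the Gaussian weights."""
    h = m // 2
    patch = M_BI[i - h:i + h + 1, j - h:j + h + 1]
    return (gauss_weight(m, sigma) * patch).ravel()  # length m^2 row vector

M_BI = np.arange(25, dtype=float).reshape(5, 5)
x = feature_vector(M_BI, 2, 2)
print(x.shape)  # (9,)
```

The Gaussian weight at the patch center is 1, so the center pixel of the preinterpolated patch enters the feature vector unchanged.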
Following this initial computation, we input all point pairs from
the exemplified data into the SVR system. Then, a continuous re-
gression function (hyperplane, f(x)) can be generated, and it is
saved for future use in the prediction stage.
In the prediction stage, the labels/pixel values of missing seismic
data are unknown, but they can be generated using their feature vec-
tors. Constructing feature vectors is done here in the same way as it
is done in the training stage. Note that the missing seismic data are
preinterpolated first with the bicubic method. All feature vectors are
input simultaneously into the regression function (hyperplane, f(x))
trained in the previous stage, thus allowing us to obtain the labels
(missing pixel values). We call this interpolation method Gauss
SVR. The main steps in our interpolation algorithm are given in
Algorithm 1 and further explained in Figure 5.
Interpolation method combining the SVR and sparse
transforms
The feature vectors in classical SVR contain only their pixel val-
ues and Gauss-weighted local information. Including more informa-
tion in these feature vectors will likely improve the performance of
our method. For example, Chaplot et al. (2006) propose an idea to
combine wavelets and ML methods to classify magnetic resonance
images of the human brain. After using wavelets as the input to
SVM, they were able to achieve a classification accuracy of more than 94%.
Similar to wavelet transforms, which have fixed and known basis functions, the DDTF method (Cai et al., 2014; Liang et al., 2014; Yu
et al., 2015) is an adaptive sparse transform used for learning the
filters from the given data (see Appendix A for more details). Inspired by the work of Chaplot et al. (2006), we propose an interpolation
method by combining SVR and DDTF, referred to as DDTF SVR.
This new method differs from Gauss SVR only in how the feature vectors are extracted. DDTF SVR is implemented in a sparse
domain under a DDTF tight frame, rather than in the data domain
(Figure 6). After the adaptive DDTF filter W is generated, the coefficients of each patch equal W^T P_ij M_BI. Feature vectors are then obtained by converting W^T P_ij M_BI into row vectors. Note that the corresponding labels in the training sets are the true pixel values in the exemplified data, the same as in Gauss SVR.
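Because the DDTF filters W are learned from the data (Appendix A), the sketch below substitutes a fixed orthonormal filter bank (a 2D DCT basis) for the learned W purely for illustration; the helper names are ours, and everything else follows the transform-domain feature extraction described above.

```python
import numpy as np

def dct_filters(m):
    """Stand-in orthonormal filter bank (2D DCT basis). The actual DDTF
    filters W are learned adaptively from the data; any orthonormal W
    gives the same code structure."""
    C = np.zeros((m, m))
    for k in range(m):
        scale = np.sqrt((1.0 if k == 0 else 2.0) / m)
        for n in range(m):
            C[k, n] = scale * np.cos(np.pi * k * (2 * n + 1) / (2 * m))
    # rows of kron(C, C) analyze a row-major vectorized m x m patch;
    # store the filters as columns of W so coefficients = W^T vec(patch)
    return np.kron(C, C).T

def ddtf_feature(M_BI, i, j, W, m=3):
    """Transform-domain feature vector: W^T P_ij M_BI (Figure 6)."""
    h = m // 2
    patch = M_BI[i - h:i + h + 1, j - h:j + h + 1].ravel()
    return W.T @ patch  # sparse coefficients used as the feature vector

W = dct_filters(3)
M_BI = np.arange(25, dtype=float).reshape(5, 5)
f = ddtf_feature(M_BI, 2, 2, W)
print(f.shape)  # (9,)
```

Because W is orthonormal, the coefficient vector preserves the patch energy; the learned DDTF frame additionally makes these coefficients sparse for seismic patches.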
RESULTS AND DISCUSSION
We tested the Gauss SVR and DDTF SVR methods on synthetic
and field seismic data, and we compared the results with those of
bicubic interpolation (Keys, 1981) and Spitz's (1991) f-x method. To implement SVR, we used LibSVM (Chang and Lin, 2001) and the Gauss radial basis function as the kernel function. We used a set
of 2D seismic data that includes one spatial dimension and one
temporal dimension. The interpolation addresses a regular sampling
problem, and the sampling ratio is 1/a, which means that one trace is kept out of every a traces. The local patch size is set as 3 × 3; therefore, the length of the feature vector is nine. The reconstruction quality is
measured by using the signal-to-noise ratio (S/N), which is expressed
as

S/N (dB) = 10 log10( ‖I‖²_F / ‖I_n − I‖²_F ),    (9)
where I_n and I represent the reconstructed data and the original data,
respectively. This numerical analysis was performed using MATLAB
on a PC with Windows 7, an Intel Core i5 3.2 GHz CPU, and 8 GB RAM.
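Equation 9 and the regular decimation scheme can be sketched with two small helpers (our own illustration, not the paper's MATLAB code):

```python
import numpy as np

def snr_db(I_n, I):
    """Equation 9: S/N = 10 log10( ||I||_F^2 / ||I_n - I||_F^2 ),
    where I_n is the reconstruction and I is the original data."""
    num = np.linalg.norm(I, "fro") ** 2
    den = np.linalg.norm(I_n - I, "fro") ** 2
    return 10.0 * np.log10(num / den)

def decimate(data, a):
    """Regular sampling ratio 1/a: keep one trace (column) out of
    every a traces and zero the rest."""
    out = np.zeros_like(data)
    out[:, ::a] = data[:, ::a]
    return out

I = np.ones((100, 100))
s = snr_db(I + 0.1, I)   # a uniform error of 0.1 gives ~20 dB
D = decimate(np.ones((4, 8)), 4)
print(s)
```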
Figure 12. (a) Original field data. (b) Decimated data with 50% regular missing traces (1/a, where a = 2).
Synthetic example
Data with linear events
Synthetic seismic data with linear events are shown in Figure 7a.
In this section, we present a simple comparison of our results with those of the bicubic method and the f-x method for a 25% regular sampling ratio (1/a, where a = 4) (Figure 7b).
Figure 8 shows the four exemplified seismic data sets without missing data, from which 63,504 training point pairs of feature vectors and labels can be extracted. The interpolation results are shown in
Figure 9. The S/N values of the four different methods (bicubic,
f-x, Gauss SVR, and DDTF SVR) are 27.80, 38.97, 41.05, and
42.10 dB, respectively. The fact that the S/N values obtained using
Gauss SVR and DDTF SVR are higher than those obtained by the
bicubic and f-xmethods indicates that our methods perform suc-
cessfully.
Data with curved events
Synthetic seismic data with curved events are depicted in Fig-
ure 10a. The size of these data is 1000 × 128, corresponding to the
total number of samples along the temporal and spatial directions,
respectively. As shown in Figure 10b, the sampling ratio is 1/a, where a = 3. Here, we take four seismic data sets with three
different curved events as exemplified data. Because the only differ-
ence in the four exemplified data sets is the curvature, we do not
show the images here. Figure 11 shows reconstructed results and
their trace comparisons obtained using the bicubic, f-x, Gauss
SVR, and DDTF SVR methods. In this scenario, based on their
S/N values, f-xperforms better than SVR. Because optimal window
parameters must often be adjusted in the f-xmethod, its S/N values
are not always higher than those of SVR. As is common in ML tech-
niques, the effectiveness of SVR depends on the training set. Future
work should focus on the improvement of SVR in this respect.
Figure 13. Eight exemplified field seismic data sets used to support the training point pairs.
Field data example
To better compare our new methods with the f-xmethod, we
tested a series of experiments on a field data set of size 128 × 128 (Figure 12a). The seismic data have been down-sampled by a regular 50% (1/a, where a = 2) sample matrix, which is
shown in Figure 12b. To construct the regression function, we select
eight field seismic data sets (Figure 13). These are actually eight
patches from a very large field data set, used as a simulation. These
exemplified data can provide 127,008 training point pairs. Figure 14
records the interpolation results and their trace comparisons ob-
tained using the f-x, Gauss SVR, and DDTF SVR methods. Our
proposed SVR-based methods yield slightly lower S/N values than
those of the f-x method. In this example, we ran many experiments to determine the best parameters (e.g., window size) that allow the f-x method to achieve high S/N values.
To make the reconstruction-quality results more convincing, the S/N values for different sampling ratios (1/a, where a = 2, 3, 4) obtained using the bicubic, f-x, Gauss SVR, and DDTF SVR methods are recorded in Table 1. Of these four methods, DDTF SVR achieves the best quality at the sampling ratios 1/a with a = 3, 4, and it yields a slightly smaller S/N value than f-x at a = 2. In addition, our method can handle interpolation problems on real field data more intelligently, without the need to set complex parameters.
Figure 14. First row: reconstruction results. Second row: trace comparisons. Third row: magnified views of the rectangular area. The dotted line represents the original trace, and the solid line represents the reconstructed trace. Methods used, from left to right: f-x (S/N = 34.21 dB), Gauss SVR (S/N = 32.20 dB), and DDTF SVR (S/N = 32.54 dB).
Table 1. Comparison of S/N values (dB) obtained using different methods.

Method     | a = 2 | a = 3 | a = 4
Bicubic    | 26.42 | 21.94 | 17.52
f-x        | 34.21 | 21.58 | 18.04
Gauss SVR  | 32.20 | 24.47 | 20.83
DDTF SVR   | 32.54 | 25.24 | 22.12
Table 2. Comparison of S/N values (dB) obtained using different percentages of the training set (mean ± standard deviation over 10 trials).

Percentage         | 5%           | 15%          | 25%          | 35%          | 100%
Gauss SVR, a = 2   | 30.62 ± 0.02 | 31.79 ± 0.01 | 31.98 ± 0.01 | 32.05 ± 0.01 | 32.20
Gauss SVR, a = 3   | 23.05 ± 0.01 | 23.98 ± 0.01 | 24.14 ± 0.01 | 24.23 ± 0.01 | 24.47
Gauss SVR, a = 4   | 19.47 ± 0.02 | 20.24 ± 0.01 | 20.43 ± 0.01 | 20.54 ± 0.01 | 20.83
DDTF SVR, a = 2    | 32.65 ± 0.11 | 32.61 ± 0.11 | 32.61 ± 0.10 | 32.61 ± 0.09 | 32.54
DDTF SVR, a = 3    | 24.81 ± 0.14 | 24.99 ± 0.12 | 25.07 ± 0.10 | 25.12 ± 0.08 | 25.24
DDTF SVR, a = 4    | 21.19 ± 0.18 | 21.66 ± 0.14 | 21.85 ± 0.12 | 21.95 ± 0.11 | 22.12

Table 3. Comparison of computational time (s) obtained using different percentages of the training set (mean ± standard deviation over 10 trials).

Percentage         | 5%          | 15%          | 25%          | 35%            | 100%
Gauss SVR, a = 2   | 2.91 ± 0.02 | 23.77 ± 0.24 | 63.75 ± 0.24 | 245 ± 74.02    | 2668.23
Gauss SVR, a = 3   | 2.90 ± 0.01 | 23.82 ± 0.19 | 63.62 ± 0.28 | 207.44 ± 9.58  | 3022.34
Gauss SVR, a = 4   | 2.92 ± 0.01 | 24.17 ± 0.10 | 65.20 ± 0.34 | 252.41 ± 53.38 | 3045.23
DDTF SVR, a = 2    | 3.41 ± 0.09 | 27.72 ± 0.62 | 75.39 ± 1.15 | 209.27 ± 9.89  | 6120.54
DDTF SVR, a = 3    | 3.08 ± 0.06 | 25.48 ± 0.28 | 68.97 ± 0.64 | 193.97 ± 52.46 | 4478.94
DDTF SVR, a = 4    | 3.03 ± 0.03 | 24.95 ± 0.19 | 67.01 ± 0.49 | 138.77 ± 26.89 | 3343.21
Figure 15. Six exemplified field seismic data used to support the training point pairs. [Panels a-f; axes: trace number vs. time sample number.]
However, the ML training step is extremely time consuming: if all of the 127,008 training pairs are used in the training stage, the interpolation experiment takes more than 2500 s. To accelerate our method, we randomly select a subset of the training pairs (Reinartz, 2002) as the new training set for seismic data interpolation. For each given sampling ratio, we repeat the test 10 times, and the resulting mean values, along with their standard deviations in terms of S/N and computational time, are listed in Tables 2 and 3. The percentage represents the fraction of all training pairs used (out of a total of 127,008). Table 2 demonstrates that using 15% of all training pairs in the training stage produces an S/N very similar to that obtained with the full set. However, the computational time (Table 3) decreases significantly, from more than 2500 s to approximately 25 s. For instance, as shown in Table 2, when a = 3 and 15% of the training set is used, the S/N of DDTF SVR is 24.99 ± 0.12 dB. This value is slightly less than that obtained when all training pairs are used (S/N = 25.24 dB), whereas the computational time drops from 4478.94 s to 25.48 ± 0.28 s.
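The subset-selection step itself is simple. The following sketch (names and sizes are illustrative; random arrays stand in for the real feature vectors and labels) keeps a random 15% of 127,008 training pairs:

```python
import numpy as np

def random_training_subset(features, labels, fraction, seed=0):
    """Randomly keep a fraction of all training pairs (Reinartz, 2002).

    features: (n_pairs, n_dims) feature vectors built from the
    preinterpolated data; labels: (n_pairs,) target amplitudes.
    """
    rng = np.random.default_rng(seed)
    n_keep = max(1, int(round(fraction * len(features))))
    idx = rng.choice(len(features), size=n_keep, replace=False)
    return features[idx], labels[idx]

# Hypothetical stand-ins for the 127,008 training pairs used in the paper.
X = np.random.default_rng(1).standard_normal((127008, 9))
y = X.sum(axis=1)

X_sub, y_sub = random_training_subset(X, y, fraction=0.15)
print(X_sub.shape)  # (19051, 9): 15% of 127,008 pairs
```

Training the SVR on `(X_sub, y_sub)` instead of the full set is what yields the order-of-magnitude speedup reported in Table 3.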
Model training for a type of data
The biggest advantage of the ML method is that it can intelligently accomplish tasks assigned by a human, thereby dramatically reducing manual workloads. Our method decreases the complexity of selecting parameters, such as window parameters. In addition, a trained regression function can be saved for future use in analyzing a type of seismic data, not only one particular seismic data set. The details of this procedure are as follows. The six seismic data sets in Figure 15 are used as exemplified seismic data to construct a regression function with a 50% sampling ratio (1/a, where a = 2). The trained regression function is then saved and used to guide the interpolation of three different seismic data sets (Figure 16). The reconstructed results of the first seismic data set (shown in Figure 16), as obtained by the f-x, Gauss SVR, and DDTF SVR methods, are shown in Figure 17a, 17d, and 17g, respectively. The second and third rows represent the interpolated results of the other two seismic data sets. These comparisons reveal that the DDTF SVR method yields higher S/N values than the Gauss SVR method, and it is competitive with the f-x method.
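The save-and-reuse workflow can be sketched with Python's pickle. The least-squares `ToyRegressor` below is a hypothetical stand-in for the trained SVR regression function, since the point here is only the serialization step:

```python
import pickle

import numpy as np

class ToyRegressor:
    """Minimal stand-in for the trained SVR hyperplane."""

    def fit(self, X, y):
        # Least-squares fit as a placeholder for SVR training.
        self.w, *_ = np.linalg.lstsq(X, y, rcond=None)
        return self

    def predict(self, X):
        return X @ self.w

# Train once on exemplified data (cf. the six data sets of Figure 15)...
X_train = np.arange(20.0).reshape(10, 2)
y_train = X_train.sum(axis=1)
model = ToyRegressor().fit(X_train, y_train)

# ...serialize the trained regression function to disk or a byte string...
blob = pickle.dumps(model)

# ...and reuse it later on other data of the same type (cf. Figure 16).
reused = pickle.loads(blob)
prediction = reused.predict(X_train)
```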
Discussion and extension
There are several factors that affect the performance of ML methods, including the choice of preinterpolation method and the training data set. In our ML methods, we typically use the bicubic method to preinterpolate missing traces before extracting the feature vectors. In some cases, the reconstructed quality of the f-x method is slightly higher than that of our ML-based method; therefore, we also try f-x as the preinterpolation method. To distinguish between the two preinterpolation choices, we refer to our ML methods as bicubic Gauss SVR, bicubic DDTF SVR, f-x Gauss SVR, and f-x DDTF SVR. The original real field data used in these tests are presented in Figure 12a, and the S/N values obtained using different sampling ratios are recorded in Table 4. When a = 2, the S/N value of the bicubic DDTF SVR (S/N = 32.54 dB) is slightly less than that of f-x (S/N = 34.21 dB); however, f-x DDTF SVR (S/N = 34.31 dB) improves the reconstructed quality. When a = 3, the S/N value of f-x (S/N = 21.58 dB) is slightly less than that of the bicubic DDTF SVR (S/N = 25.24 dB). Although the f-x DDTF SVR (S/N = 23.85 dB) achieves a higher S/N value than the f-x method (S/N = 21.58 dB), its S/N value is still lower than that of the bicubic DDTF SVR. As mentioned above, the quality of the reconstructed data is influenced by the window parameters in the f-x method, whereas the bicubic method is generally easy to use. Therefore, if one can tolerate a slight change in S/N, the bicubic ML method is likely the better choice.
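For a regular a = 2 decimation, the bicubic preinterpolation reduces, along the trace axis, to Keys's (1981) cubic-convolution kernel evaluated midway between recorded traces. The sketch below is a simplified 1-D version of that step (edge traces use replicate padding and are only approximate there):

```python
import numpy as np

# Keys (1981) cubic-convolution weights at the midpoint between samples.
KEYS_WEIGHTS = np.array([-1.0, 9.0, 9.0, -1.0]) / 16.0

def cubic_fill_alternate(section):
    """Fill the odd-indexed (missing) traces of a (n_time, n_traces)
    section whose even-indexed traces were recorded."""
    out = section.copy()
    known = section[:, 0::2]                      # recorded traces
    padded = np.pad(known, ((0, 0), (1, 1)), mode="edge")
    for g in range(known.shape[1] - 1):
        neighbors = padded[:, g:g + 4]            # four nearest recorded traces
        out[:, 2 * g + 1] = neighbors @ KEYS_WEIGHTS
    return out

# Linear moveout is recovered exactly away from the section edges.
x = np.arange(21, dtype=float)
data = np.tile(2.0 * x + 1.0, (4, 1))             # 4 time samples, 21 traces
decimated = data.copy()
decimated[:, 1::2] = 0.0                          # a = 2: every second trace removed
filled = cubic_fill_alternate(decimated)
```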
Figure 16. Three original seismic data sets that need to be interpolated under the same regression function. [Panels a-c; axes: trace number vs. time sample number.]
ML methods work universally with different types of seismic data and are able to greatly reduce manual workloads. However, the size of today's seismic databases often exceeds the size of data sets that SVR can handle, and such large quantities of data are not needed
to mine the regression hyperplane. Therefore, the processes of data preparation, such as data selection and data cleaning, exert strong control on the efficiency of our SVR-based ML methods. Future work should focus on determining how to select effective training data sets (e.g., by assessing variance and deviation) and on how many training pairs should be used to avoid the over-fitting problem.
Figure 17. The reconstructed results (sampling ratio: 1/a, where a = 2) of three different seismic data sets under the same regression function. Methods used from left to right: f-x, Gauss SVR, and DDTF SVR. (Top to bottom) The three seismic data sets shown in Figure 16. The S/N values (dB) are (a) 59.51, (b) 62.09, (c) 66.79, (d) 59.15, (e) 61.73, (f) 64.77, (g) 59.40, (h) 62.53, and (i) 66.14. [Panels a-i; axes: trace number vs. time sample number.]
Table 4. Comparison of S/N values (dB) obtained using different preinterpolation methods.

                       a = 2   a = 3   a = 4
  Bicubic Gauss SVR    32.20   24.47   20.83
  Bicubic DDTF SVR     32.54   25.24   22.12
  f-x                  34.21   21.58   18.04
  f-x Gauss SVR        34.42   23.54   19.76
  f-x DDTF SVR         34.31   23.85   20.05
Table 5. Comparison of S/N values (dB) obtained using different ratios with random sampling.

  Sampling ratio   10%     30%     50%
  Gauss SVR        13.47   19.52   26.61
  DDTF SVR         13.77   19.93   26.82
In this section, we discuss some possible extensions of our method, such as varying the sampling strategy, simultaneous denoising, and 5D interpolation. Regular sampling in the seismic data interpolation problem has already been discussed in this paper; but what about random sampling? Table 5 records the S/N values obtained by interpolating the original real field data (Figure 12a) at different random sampling ratios. The data in this table demonstrate that our ML methods can adequately solve interpolation problems with random sampling. In this group of experiments, the bicubic method is used for preinterpolation, and it is likely that the S/N values could be improved with a preinterpolation method better suited to random sampling.
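The only difference between the two experiments is the sampling mask. A small utility (illustrative names, not from the paper) that builds either scheme:

```python
import numpy as np

def sampling_mask(n_traces, ratio, mode="regular", seed=0):
    """Build a boolean trace-sampling mask.

    'regular' keeps every a-th trace (ratio = 1/a); 'random' keeps the
    same number of traces at uniformly random positions.
    """
    n_keep = int(round(ratio * n_traces))
    mask = np.zeros(n_traces, dtype=bool)
    if mode == "regular":
        mask[::int(round(1.0 / ratio))] = True
    else:
        rng = np.random.default_rng(seed)
        mask[rng.choice(n_traces, size=n_keep, replace=False)] = True
    return mask

regular = sampling_mask(120, 0.5, mode="regular")   # a = 2
random_ = sampling_mask(120, 0.5, mode="random")
```

Both masks keep 50% of 120 traces; only the positions of the kept traces differ.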
Simultaneous denoising during interpolation is also an issue in this field. The synthetic seismic data (Figure 7a) with a 50% regular sampling ratio (1/a, where a = 2) are taken as an example, with a partial close-up of the results shown in Figure 18. The interpolation method based on SVR is limited in its ability to simultaneously suppress noise, probably because our regression function is continuous: in the prediction stage, if the feature vector contains significant noise, the predicted label/pixel value deviates accordingly. Therefore, it is best to suppress noise before constructing the feature vectors.
In recent years, the seismic industry has also been interested in 5D seismic data interpolation. The 5D data can be viewed as a fifth-order tensor consisting of one time dimension and four spatial dimensions describing all locations of the sources and receivers on the surface. Different methods have been proposed to solve this problem, including methods based on the Fourier transform (Trad, 2009; Xu et al., 2010; Chiu, 2014), dictionary learning (Yu et al., 2015), Hankel- or Toeplitz-matrix rearrangement (Gao et al., 2013), and tensor completion (Kreimer and Sacchi, 2012; Kreimer et al., 2013). Here, we also test a 5D synthetic seismic data set with ML. Figure 19a depicts a data set of size 32 × 16 × 16 × 16 × 16 that has been modeled using the public MATLAB toolbox SeismicLab. The decimated data with a regular sampling ratio (1/a, where a = 3) are shown in Figure 19b. Figure 19c depicts the result reconstructed by the Gauss SVR, and Figure 19d displays the difference between the original data (Figure 19a) and the result (Figure 19c). These results show that our ML method can solve a 5D interpolation problem with satisfactory reconstruction quality.
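As a small illustration of the data layout (a toy tensor matching the paper's 32 × 16 × 16 × 16 × 16 example; the decimation axis and values here are illustrative), a fifth-order tensor can be regularly decimated along one spatial axis as follows:

```python
import numpy as np

# (time, source_x, source_y, receiver_x, receiver_y): a fifth-order tensor.
rng = np.random.default_rng(0)
data = rng.standard_normal((32, 16, 16, 16, 16))

# Regular decimation with ratio 1/a (a = 3) along the last spatial axis:
# keep every third receiver_y position and zero the rest.
a = 3
mask = np.zeros(16, dtype=bool)
mask[::a] = True
decimated = data * mask          # mask broadcasts over the last axis
n_kept = int(mask.sum())         # 6 of 16 positions survive
```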
Currently, in most industries, the interpolation of down-sampled regular data can be efficiently solved by f-x-y methods and similar algorithms. In fact, Fourier techniques are not even necessary for that particular problem because f-x-y is more efficient and simpler. For irregular data, f-x-y techniques are not suitable, but Fourier techniques are. However, it is difficult to apply standard techniques to very complex topographic scenarios with large gaps or large separations between acquisition lines, on the order of hundreds of meters. The objective of our ML method is to design a database for interpolation that is suitable for multiple types of data while simultaneously reducing manual labor. Our ML method can still work with data sets featuring this kind of sparse sampling, resulting from factors such as large gaps, because it is able to borrow information from the training sets. For example, a typical 5D interpolation of a wide-azimuth land data set implies infilling a 4D grid (for each frequency) with a typical coverage of 3% (97% empty cells). It is only through the use of very large multidimensional windows, on the order of 500–1000 m per side, that such gaps can be addressed. However, the application of ML in seismic data processing is still in its infancy. Much work remains to be done to make this technique efficient under those conditions and then apply it to complex structures, complex topography, and noise. Finally, future work should also focus on using ML methods to accomplish reverse time migration and full-waveform inversion.
Figure 18. Simultaneous interpolation and denoising for 2D synthetic data by Gauss SVR. (a) Original clean data (Figure 7a); (b) decimated noisy data with sampling ratio 1/a, where a = 2; (c) the reconstructed result; and (d) the difference between the reconstructed result and the original clean data. [Panels a-d; axes: trace number vs. time sample number.]
CONCLUSION
In this paper, we propose an ML method for seismic data interpolation. A hidden relationship (a continuous hyperplane f(x)) can be mined from large amounts of exemplified training data and used to recover missing data. We present the DDTF SVR method to improve on the performance of the Gauss SVR method. Our new ML-based method allows us to break away from the assumptions made in existing interpolation methods, and it is universally applicable to varying data sets. Furthermore, the trained regression function can be saved for future use to interpolate seismic data of a type with similar geomorphological structure, which is useful in production seismic processing. In future work, we will focus on improving SVR and investigate the use of deep learning methods for seismic data processing. Deep learning is a branch of ML based on a set of algorithms that attempt to model high-level abstractions in data.
ACKNOWLEDGMENTS
The authors would like to thank the editors and reviewers for
their helpful comments and suggestions that improved this work,
as well as M. Sacchi for providing the SeismicLab toolbox. This
work is supported by the National Natural Science Foundation of China (grant numbers NSFC 91330108, 41374121, 61327013, and 41625017) and the Fundamental Research Funds for the Central Universities (grant number HIT.PIRS.A201501).
APPENDIX A
DDTF
The DDTF can be briefly described as follows. The objective function for training the filters in DDTF is

$$
\mathop{\mathrm{argmin}}_{V,\,W}\ \frac{1}{2}\,\lVert V - W P M_{\mathrm{BI}} \rVert_F^2 + \lambda \lVert V \rVert_0 \quad \text{s.t.}\quad W^{T} W = I, \tag{A-1}
$$

where M_BI is the bicubic preinterpolated data, P denotes the patch transform, W is the matrix dictionary, V is the coefficient matrix obtained by expanding P M_BI on W, and I is the identity matrix. Here, ||·||_F^2 and ||·||_0 indicate the squared Frobenius norm and the L0 norm (which counts the number of nonzeros in a vector), respectively. The constraint W^T W = I indicates that W is a tight frame. W and V are computed alternately. More details about DDTF can be found in Cai et al. (2014) and Liang et al. (2014).
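A schematic alternating minimization for equation A-1 can be sketched as follows (a simplified square-dictionary version, not the full DDTF code of Cai et al., 2014): with W fixed, the L0-penalized V-step is solved by hard thresholding; with V fixed, the orthogonality-constrained W-step is an orthogonal Procrustes problem solved by an SVD.

```python
import numpy as np

def ddtf_train(patches, lam, n_iters=10, seed=0):
    """Alternating minimization sketch for equation A-1.

    patches: (d, n) matrix of patch columns (i.e., P applied to the
    bicubic-preinterpolated data M_BI). Returns a square orthonormal
    dictionary W (so W.T @ W = I) and sparse coefficients V.
    """
    d = patches.shape[0]
    rng = np.random.default_rng(seed)
    W, _ = np.linalg.qr(rng.standard_normal((d, d)))  # random orthonormal start
    thresh = np.sqrt(2.0 * lam)
    for _ in range(n_iters):
        # V-step: hard-threshold the analysis coefficients.
        C = W @ patches
        V = np.where(np.abs(C) > thresh, C, 0.0)
        # W-step: orthogonal Procrustes; the SVD of V P^T gives the
        # orthonormal W minimizing ||V - W P||_F.
        U, _, Vt = np.linalg.svd(V @ patches.T)
        W = U @ Vt
    C = W @ patches
    V = np.where(np.abs(C) > thresh, C, 0.0)          # final coefficients
    return W, V

P = np.random.default_rng(1).standard_normal((16, 200))
W, V = ddtf_train(P, lam=0.05)
```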
REFERENCES
Androutsopoulos, I., G. Paliouras, V. Karkaletsis, G. Sakkis, C. D. Spyro-
poulos, and P. Stamatopoulos, 2000, Learning to filter spam e-mail: A
comparison of a naive Bayesian and a memory-based approach: Proceed-
ings of the Workshop on Machine Learning and Textual Information Ac-
cess, 4th European Conference on Principles and Practice of Knowledge
Discovery in Databases, 1–13.
Banerjee, T. P., and S. Das, 2012, Multi-sensor data fusion using support
vector machine for motor fault detection: Information Sciences, 217,
96–107, doi: 10.1016/j.ins.2012.06.016.
Bobadilla, J., F. Ortega, A. Hernando, and A. Gutierrez, 2013, Recom-
mender systems survey: Knowledge-Based Systems, 46, 109–132, doi:
10.1016/j.knosys.2013.03.012.
Bonar, D., and M. Sacchi, 2012, Denoising seismic data using the nonlocal
means algorithm: Geophysics, 77, no. 1, A5–A8, doi: 10.1190/geo2011-
0235.1.
Brijs, T., G. Swinnen, K. Vanhoof, and G. Wets, 1999, Using association
rules for product assortment decisions: A case study: Proceedings of the
5th ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining, ACM, 254–260.
Burges, C. J., 1998, A tutorial on support vector machines for pattern rec-
ognition: Data Mining and Knowledge Discovery, 2, 121–167, doi: 10
.1023/A:1009715923555.
Cai, J., H. Ji, Z. Shen, and G. Ye, 2014, Data-driven tight frame construction
and image denoising: Applied and Computational Harmonic Analysis, 37,
89–105, doi: 10.1016/j.acha.2013.10.001.
Chang, C.-C., and C.-J. Lin, 2001, Libsvm: A library for support vector
machines, http://www.csie.ntu.edu.tw/~cjlin/libsvm, accessed 28 February
2017.
Chaplot, S., L. Patnaik, and N. Jagannathan, 2006, Classification of mag-
netic resonance brain images using wavelets as input to support vector
machine and neural network: Biomedical Signal Processing and Control,
1, 86–92, doi: 10.1016/j.bspc.2006.05.002.
Chiu, S. K., 2014, Multidimensional interpolation using a model-con-
strained minimum weighted norm interpolation: Geophysics, 79, no. 5,
V191–V199, doi: 10.1190/geo2014-0086.1.
Figure 19. The 5D seismic data interpolation by Gauss SVR. (a) Original data; (b) decimated data with sampling ratio 1/a, where a = 3; (c) the reconstructed result (S/N = 12.19 dB); and (d) the difference between the reconstructed result and the original data. [Panels a-d; axes: trace number vs. time sample number.]
Claerbout, J. F., 1992, Earth soundings analysis: Processing versus inver-
sion: Blackwell Scientific Publications, Cambridge.
Drucker, H., C. Burges, L. Kaufman, A. Smola, and V. Vapnik, 1997, Support
vector regression machines: Advances in Neural Information Processing
Systems, 9, 155–161.
Dutton, D. M., and G. V. Conroy, 1996, A review of machine learning:
The Knowledge Engineering Review, 12, 341–367, doi: 10.1017/
S026988899700101X.
Gao, J., M. D. Sacchi, and X. Chen, 2013, A fast reduced-rank interpolation
method for prestack seismic volumes that depend on four spatial dimen-
sions: Geophysics, 78, no. 1, V21–V30, doi: 10.1190/geo2012-0038.1.
Gülünay, N., 2003, Seismic trace interpolation in the Fourier transform do-
main: Geophysics, 68, 355–369, doi: 10.1190/1.1543221.
Guzella, T. S., and W. M. Caminhas, 2009, A review of machine learning
approaches to spam filtering: Expert Systems with Applications, 36,
10206–10222, doi: 10.1016/j.eswa.2009.02.037.
Hassan, A. H., S. Lambert-Lacroix, and F. Pasqualini, 2015, Real-time fault
detection in semiconductor using one-class support vector machines:
International Journal of Computer Theory and Engineering, 7, 191–196,
doi: 10.7763/IJCTE.2015.V7.955.
Haykin, S., 2004, Neural networks: A comprehensive foundation, 2nd ed.:
Prentice Hall.
Helmy, T., A. Fatai, and K. Faisal, 2010, Hybrid computational models for
the characterization of oil and gas reservoirs: Expert Systems with Ap-
plications, 37, 5353–5363, doi: 10.1016/j.eswa.2010.01.021.
Herrmann, F. J., and G. Hennenfent, 2008, Non-parametric seismic data re-
covery with curvelet frames: Geophysical Journal International, 173, 233–
248, doi: 10.1111/gji.2008.173.issue-1.
Huang, C.-L., M.-C. Chen, and C.-J. Wang, 2007, Credit scoring with a data
mining approach based on support vector machines: Expert Systems with
Applications, 33, 847–856, doi: 10.1016/j.eswa.2006.07.007.
Huang, W., Y. Nakamori, and S.-Y. Wang, 2005, Forecasting stock market
movement direction with support vector machine: Computers & Opera-
tions Research, 32, 2513–2522, doi: 10.1016/j.cor.2004.03.016.
Jia, Y., S. Yu, L. Liu, and J. Ma, 2016, A fast rank-reduction algorithm for
three-dimensional seismic data interpolation: Journal of Applied Geo-
physics, 132, 137–145, doi: 10.1016/j.jappgeo.2016.06.010.
Keys, R. G., 1981, Cubic convolution interpolation for digital image process-
ing: IEEE Transactions on Acoustics, Speech and Signal Processing, 29,
1153–1160, doi: 10.1109/TASSP.1981.1163711.
Kreimer, N., and M. D. Sacchi, 2012, A tensor higher-order singular value
decomposition for prestack seismic data noise reduction and interpola-
tion: Geophysics, 77, no. 3, V113–V122, doi: 10.1190/geo2011-0399.1.
Kreimer, N., A. Stanton, and M. D. Sacchi, 2013, Tensor completion based
on nuclear norm minimization for 5D seismic data reconstruction: Geo-
physics, 78, no. 6, V273–V284, doi: 10.1190/geo2013-0022.1.
Kumar, R., H. Mansour, A. Y. Aravkin, and F. J. Herrmann, 2013, Recon-
struction of seismic wavefields via low-rank matrix factorization in the
hierarchical-separable matrix representation: 83rd Annual International
Meeting, SEG, Expanded Abstracts, 3628–3633.
Kwiatkowska, E. J., and G. S. Fargion, 2003, Application of machine-learn-
ing techniques toward the creation of a consistent and calibrated global
chlorophyll concentration baseline dataset using remotely sensed ocean
color data: IEEE Transactions on Geoscience and Remote Sensing, 41,
2844–2860, doi: 10.1109/TGRS.2003.818016.
Liang, J., J. Ma, and X. Zhang, 2014, Seismic data restoration via data-driven
tight frame: Geophysics, 79, no. 3, V65–V74, doi: 10.1190/geo2013-0252.1.
Lim, J.-S., 2005, Reservoir properties determination using fuzzy logic and
neural networks from well data in offshore Korea: Journal of Petroleum
Science and Engineering, 49, 182–192, doi: 10.1016/j.petrol.2005.05.005.
Liu, B., and M. D. Sacchi, 2004, Minimum weighted norm interpolation of
seismic records: Geophysics, 69, 1560–1568, doi: 10.1190/1.1836829.
Ma, J., 2013, Three-dimensional irregular seismic data reconstruction via
low-rank matrix completion: Geophysics, 78, no. 5, V181–V192, doi:
10.1190/geo2012-0465.1.
Mohandes, M., T. Halawani, S. Rehman, and A. A. Hussain, 2004, Support
vector machines for wind speed prediction: Renewable Energy, 29, 939–
947, doi: 10.1016/j.renene.2003.11.009.
Murthy, S. K., 1998, Automatic construction of decision trees from data: A
multidisciplinary survey: Data Mining and Knowledge Discovery, 2, 345–
389, doi: 10.1023/A:1009744630224.
Naghizadeh, M., and M. D. Sacchi, 2010, Beyond alias hierarchical scale
curvelet interpolation of regularly and irregularly sampled seismic data:
Geophysics, 75, no. 6, WB189–WB202, doi: 10.1190/1.3509468.
Naghizadeh, M., and M. D. Sacchi, 2012, Multidimensional de-aliased Cad-
zow reconstruction of seismic records: Geophysics, 78, no. 1, A1–A5,
doi: 10.1190/geo2012-0200.1.
Ni, K. S., and T. Q. Nguyen, 2007, Image super-resolution using support
vector regression: IEEE Transactions on Image Processing, 16, 1596–
1610, doi: 10.1109/TIP.2007.896644.
Oropeza, V., and M. Sacchi, 2011, Simultaneous seismic data denoising and
reconstruction via multichannel singular spectrum analysis: Geophysics,
76, no. 3, V25–V32, doi: 10.1190/1.3552706.
Ravisankar, P., V. Ravi, G. R. Rao, and I. Bose, 2011, Detection of financial
statement fraud and feature selection using data mining techniques: De-
cision Support Systems, 50, 491–500, doi: 10.1016/j.dss.2010.11.006.
Reinartz, T., 2002, A unifying view on instance selection: Data Mining and
Knowledge Discovery, 6, 191–210, doi: 10.1023/A:1014047731786.
Sacchi, M. D., T. J. Ulrych, and C. J. Walker, 1998, Interpolation and
extrapolation using a high-resolution discrete Fourier transform: IEEE
Transactions on Signal Processing, 46, 31–38, doi: 10.1109/78.651165.
Santamaría-Bonfil, G., A. Reyes-Ballesteros, and C. Gershenson, 2016,
Wind speed forecasting for wind farms: A method based on support vector
regression: Renewable Energy, 85, 790–809, doi: 10.1016/j.renene.2015.07
.004.
Smola, A. J., and B. Schölkopf, 2004, A tutorial on support vector regres-
sion: Statistics and Computing, 14, 199–222, doi: 10.1023/B:STCO
.0000035301.49549.88.
Spitz, S., 1991, Seismic trace interpolation in the F-X domain: Geophysics,
56, 785–794, doi: 10.1190/1.1443096.
Trad, D., 2009, Five-dimensional interpolation: Recovering from acquisition
constraints: Geophysics, 74, no. 6, V123–V132, doi: 10.1190/1.3245216.
Trickett, S., L. Burroughs, A. Milton, L. Walton, and R. Dack, 2010, Rank-
reduction-based trace interpolation: 80th Annual International Meeting,
SEG, Expanded Abstracts, 3829–3833.
Vapnik, V., 1995, The nature of statistical learning theory: Springer.
Wu, C.-H., J.-M. Ho, and D.-T. Lee, 2004, Travel-time prediction with sup-
port vector regression: IEEE Transactions on Intelligent Transportation
Systems, 5, 276–281, doi: 10.1109/TITS.2004.837813.
Xu, S., Y. Zhang, and G. Lambaré, 2010, Antileakage Fourier transform for
seismic data regularization in higher dimensions: Geophysics, 75, no. 6,
WB113–WB120, doi: 10.1190/1.3507248.
Yang, Y., J. Ma, and S. Osher, 2013, Seismic data reconstruction via matrix
completion: Inverse Problems and Imaging, 7, 1379–1392, doi: 10.3934/ipi.
Yu, S., J. Ma, X. Zhang, and M. D. Sacchi, 2015, Denoising and interpo-
lation of high-dimensional seismic data by learning tight frame: Geophys-
ics, 80, no. 5, V119–V132, doi: 10.1190/geo2014-0396.1.
Zhang, C., C. Frogner, M. Araya-Polo, and D. Hohl, 2014, Machine-learning
based automated fault detection in seismic traces: 76th Annual International
Conference and Exhibition, EAGE, Extended Abstracts, 807–811.