ClimateLearn: A machine-learning approach for climate
prediction using network measures
Qing Yi Feng1, Ruggero Vasile2,3, Marc Segond4, Avi Gozolchiani5, Yang Wang5, Markus Abel3, Shlomo Havlin5, Armin Bunde6, and Henk A. Dijkstra1
1Institute for Marine and Atmospheric research Utrecht, Utrecht University, The Netherlands
2UP Transfer, Potsdam, Germany
3Ambrosys, Potsdam, Germany
4European Centre for Soft Computing, Mieres, Spain
5Bar-Ilan University, Israel
6University of Giessen, Germany
Correspondence to: Q. Y. Feng (Q.Feng@uu.nl) and R. Vasile (ruggero.vasile@ambrosys.de)
Abstract. We present the toolbox ClimateLearn to tackle problems in climate prediction using machine learning techniques and climate network analysis. The package allows basic operations of data mining, i.e. reading, merging, and cleaning data, and running machine learning algorithms such as multilayer artificial neural networks and symbolic regression with genetic programming. Because spatial-temporal information on climate variability can be efficiently represented by complex network measures, such data are considered here as input to the machine-learning algorithms. As an example, the toolbox is applied to the prediction of the occurrence and the development of El Niño in the equatorial Pacific, first concentrating on the occurrence of El Niño events one year ahead and second on the evolution of sea surface temperature anomalies with a lead time of three months.
1 Introduction
Machine learning is a branch of computer science concerned with the automated recognition of (spatio-temporal) patterns from data (Mitchell, 1997). It has been increasingly employed in the study of "big data" with the aim to investigate data syntactically and semantically. In essence, this means an automated search for a best model, given a certain task and corresponding data. A large number of algorithms have been designed for different tasks, with approaches borrowed from bio-inspired investigations on artificial intelligence (in older times a synonym of machine learning). Given a task, a human learns what to do and, hopefully, optimizes the working schedule according to the given side conditions. Machines learn from data in a similar way: a task is formulated and a learning process starts, which consists in building statistical models (in terms of probability distributions) or functional models. Eventually, optimality criteria and discriminant functions are used to evaluate the performance of such a model given new data.
The algorithms are divided roughly into three different categories: supervised learning, unsupervised learning
Geosci. Model Dev. Discuss., doi:10.5194/gmd-2015-273, 2016
Manuscript under review for journal Geosci. Model Dev.
Published: 11 February 2016
© Author(s) 2016. CC-BY 3.0 License.
and reinforcement learning (Bishop, 2006). Supervised learning comprises techniques that predict the value of a target variable y given an input variable x, where x and y might be vectors. A training set of many (x, y) pairs is used to supervise the learning process and to build a model, which is subsequently used to find the target values y_new corresponding to new data points x_new. In unsupervised learning the dataset is not labelled, i.e. there is no target variable y, and the aim is to find patterns in the data such that target variables are identified, e.g. using clustering methods. Finally, in reinforcement learning, a certain goal is pursued in a dynamic environment without knowing explicitly whether the approach converges to the goal or not, and the learning process is driven by feedback from the environment.
Machine learning has been shown to be very efficient in prediction, for example in solar energy prediction for solar power plants (Sharma et al., 2011). This forecasting task can be reduced to learning how the solar plant reacts to the environmental conditions, and forecasting the future response of the plant using reliable weather data. As such, the methodology can in principle also be directly applied to climate prediction problems (Slingo and Palmer, 2011), such as the prediction of El Niño events (Chen et al., 2004) and of interannual variations of the path of the Kuroshio Current in the North Pacific Ocean (Qiu and Chen, 2005). In particular, the occurrence of an El Niño event has large impacts on the weather around the Pacific (Reilly, 2009). It is therefore crucial to develop precise and reliable predictions of such events with considerable lead time and, if an event is predicted, to provide information on how it could develop in time.
Since the 1990s, both dynamical models and statistical models have been used to predict El Niño events (Latif and Barnett, 1994; Fedorov et al., 2003; Chen et al., 2004; Yeh et al., 2009). Although about 20 models currently provide El Niño forecasts routinely, all reliable forecasts are generally limited to a horizon of 6 months ahead. The reason is the so-called Spring Predictability Barrier: during spring, errors are greatly amplified due to the coupled feedbacks in the equatorial ocean-atmosphere system (Goddard et al., 2001; Duan and Wei, 2013). Moreover, the prediction skill for the development of El Niño events is still disappointing for the current models, as can be seen by following the 2015 El Niño development at http://www.cpc.ncep.noaa.gov/products/analysis_monitoring/enso_advisory/ensodisc.html.
Recently, approaches from complex network theory have been applied to problems in climate dynamics, and it has been shown that spatial-temporal information on climate variability can be efficiently represented by network measures (Tsonis and Roebber, 2004; Steinhaeuser et al., 2011; Tantet and Dijkstra, 2014; Fountalis et al., 2015). The two central elements of this approach are Climate Network (CN) reconstruction and subsequent network analysis (Tsonis and Swanson, 2006; Yamasaki et al., 2008; Donges et al., 2009). A notion of connectedness (defining a 'link' in the network) between time series at different locations (the 'nodes' in the network) can be obtained by considering their Pearson correlation. Software packages, such as pyunicorn (Donges et al., 2015) and Par@graph (Ihshaish et al., 2015), are now available for efficient climate network reconstruction and analysis.
Complex-network based indicators of El Niño occurrences have been developed using climate networks reconstructed, for example, from atmosphere surface temperature observations (Yamasaki et al., 2008; Gozolchiani et al., 2011). These studies have shown that links based on the spatial correlations of the temperature anomalies tend to weaken significantly during El Niño events. A large-scale cooperative mode, linking the El Niño basin and the rest of the Pacific climate system, builds up one calendar year before the warming event (Ludescher et al.,
2013). Based on such findings on the temporal evolution of the CN, Ludescher et al. (2014) developed a forecasting scheme for El Niño events. They suggest that a threshold on the average link weight in the reconstructed CN can reliably forecast an El Niño event one year ahead.
When machine-learning techniques are applied to the prediction of climate variability using data from CNs, one typical task is to infer or 'learn' the dynamics of the climate system from past states and predict its future states. In this paper, we present a machine-learning approach for climate forecasting using the measures of CNs. The originality and advantage of this approach is that temporal information is already contained in the measures of the CNs, so the machine-learning techniques take it into account when predicting the future states of the system; such an advantage is uncommon in most applications where machine learning is used for prediction. In section 2, we start with an explanation of how the data for the machine-learning approach are obtained from complex network analysis. The machine-learning methodology itself is described in section 3 and subsequently applied in section 4 to the prediction of El Niño events. A summary and discussion are given in section 5.
2 Climate Networks
Climate scientists have long been interested in studying the statistical correlations between observables to gain a good understanding of the large-scale development of the climate system. By investigating the correlation structures of global or regional fields, such as surface air temperature and geopotential height, much insight is gained into the patterns of climate variability. For example, through such analyses, the Southern Oscillation was discovered by Sir Gilbert Walker, and its relation with the equatorial tropical sea surface variability, i.e. El Niño, was clarified (Katz, 2002).
Suppose that a certain climate system observable, indicated by O below, such as sea surface temperature (SST) or surface atmospheric temperature (SAT), is available at fixed measurement stations, certain predefined regions, or at grid cells (e.g. from observations, proxy reconstructions, reanalysis, or model simulations). The corresponding data can be represented by an n × N matrix F, ordered in such a way that each column vector O_i = (O_i(t_1), ..., O_i(t_n))^T at each grid point i (i = 1, ..., N) contains a time series of length n.
As mentioned above, one way to define the links in the climate network is to use the Pearson correlation, defining a PCCN, or the mutual information, defining a MICN (Feng and Dijkstra, 2014). To reconstruct a PCCN, first the linear Pearson correlation coefficient between the time series at two grid points i and j is determined. The elements R^P_ij of the correlation matrix R^P are given by

R^P_{ij} = \frac{\sum_{k=1}^{n} O_i(t_k)\,O_j(t_k)}{\sqrt{\left(\sum_{k=1}^{n} O_i^2(t_k)\right)\left(\sum_{k=1}^{n} O_j^2(t_k)\right)}}.   (1)
To reconstruct a MICN, the correlation between the time series of two grid points i and j is determined by the nonlinear mutual information coefficient, giving

R^M_{ij} = \sum_{x \in O_i} \sum_{y \in O_j} p(x,y) \log\left(\frac{p(x,y)}{p(x)\,p(y)}\right),   (2)
where p(x, y) is the joint probability density function of events x and y, and p(x) and p(y) are the marginal probability density functions.
We consider that two nodes i and j have an unweighted link if the absolute value of their correlation coefficient R^X_ij (either X = P or X = M) is larger than a certain threshold value τ. All links are then represented by an N × N adjacency matrix A, which can be determined from the correlation matrix R^X according to

A_{ij} = H(|R^X_{ij}| - \tau),   (3)

where H is the Heaviside function. The threshold τ is in most cases based on statistical significance (say above the 95% level) of the correlations between the time series (Donges et al., 2015).
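As a concrete sketch of Eqs. (1) and (3), the following Python fragment reconstructs a PCCN adjacency matrix from a matrix of anomaly time series. The function name `pccn_adjacency` and the toy data are illustrative assumptions, not part of the ClimateLearn API.

```python
import numpy as np

def pccn_adjacency(F, tau):
    """Reconstruct a PCCN adjacency matrix from an n x N data matrix F
    of time series (Eq. 1 and Eq. 3)."""
    # Remove the mean so the normalized product is the Pearson correlation
    O = F - F.mean(axis=0)
    norms = np.sqrt((O ** 2).sum(axis=0))
    R = (O.T @ O) / np.outer(norms, norms)      # Eq. (1)
    A = (np.abs(R) > tau).astype(int)           # Eq. (3), Heaviside step
    np.fill_diagonal(A, 0)                      # no self-links
    return R, A

# Toy example: three nodes, two of them strongly correlated
rng = np.random.default_rng(0)
t = np.linspace(0, 10, 200)
F = np.column_stack([np.sin(t),
                     np.sin(t) + 0.1 * rng.standard_normal(200),
                     rng.standard_normal(200)])
R, A = pccn_adjacency(F, tau=0.5)
```

Here the first two nodes end up linked while the noise node stays isolated.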
Another way to define a link between nodes i and j was presented in Gozolchiani et al. (2011) and also used in Ludescher et al. (2014). First, the cross-correlation function C_ij(Δt) between the time series at locations i and j is calculated, where Δt is a positive time lag and C_ij(Δt) = C_ji(−Δt). Next, the time lag Δt at which C_ij(Δt) is maximal (or minimal) is determined. Finally, weights (W^±_ij) for positive and negative links are defined as:

W^+_{ij} = \frac{MAX(C_{ij}) - MEAN(C_{ij})}{STD(C_{ij})},   (4)

and

W^-_{ij} = \frac{MIN(C_{ij}) - MEAN(C_{ij})}{STD(C_{ij})},   (5)

where MAX and MIN are the maximum and minimum values and MEAN, STD are the mean and standard deviation, respectively. In this way, a weighted and directed link between nodes i and j is obtained (Gozolchiani et al., 2011; Wang et al., 2013).
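A minimal sketch of Eqs. (4)-(5) for a single pair of nodes is given below; the lag convention and the function name `link_weights` are assumptions for illustration, not the ClimateLearn implementation.

```python
import numpy as np

def link_weights(x, y, max_lag):
    """Weights W+ and W- (Eqs. 4-5) from the cross-correlation function
    C_ij(dt) of two time series, evaluated at lags dt = 0..max_lag."""
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    n = len(x)
    # C_ij(dt): correlation of x(t) with y(t + dt)
    C = np.array([np.mean(x[: n - dt] * y[dt:]) for dt in range(max_lag + 1)])
    w_plus = (C.max() - C.mean()) / C.std()    # Eq. (4)
    w_minus = (C.min() - C.mean()) / C.std()   # Eq. (5)
    return w_plus, w_minus, C

# Toy example: y lags x by 5 samples, so C peaks at dt = 5
rng = np.random.default_rng(1)
x = rng.standard_normal(500)
y = np.roll(x, 5) + 0.1 * rng.standard_normal(500)
wp, wm, C = link_weights(x, y, max_lag=20)
```

A pronounced peak in C relative to its spread yields a large positive weight, which is what makes W^+ a useful link strength.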
There are many other ways to reconstruct CNs, and an overview is given in Donges et al. (2015). By reconstructing CNs, the correlations in time series of observables at different locations are represented by a graph, defined by its adjacency matrix A. Subsequently, many topological properties of this graph can be analysed, such as the degree d_i of each node i, given by

d_i = \sum_{j=1}^{N} A_{ij},   (6)

which is the total number of links that a node possesses. The next step is to use the properties of such a CN as the input of machine-learning techniques. Besides the statistical properties of the CN, such as those of the correlation matrices, also the topological properties of the graph can be used.
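The degree of Eq. (6) is simply a row sum of the adjacency matrix; a small illustrative example:

```python
import numpy as np

def node_degrees(A):
    """Degree d_i of each node: the row sums of the adjacency matrix (Eq. 6)."""
    return A.sum(axis=1)

# A small undirected network of four nodes; node 3 is isolated
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 0],
              [0, 0, 0, 0]])
print(node_degrees(A))   # → [2 2 2 0]
```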
3 Machine learning approach
In ClimateLearn, supervised learning approaches are implemented for the prediction of climate variability. Specifically, we focus on multilayer artificial neural networks (ANN) and symbolic regression with genetic programming (GP), both explained in this section. The approaches follow the typical outline of machine learning: the algorithms are trained, i.e. 'learn' a certain behavior from the data. This results in a model that is evaluated using test data, which are different from the data used for training.
3.1 Artificial neural networks
Artificial neural networks are a class of statistical learning models inspired by the physiology of biological neural networks. They consist of a network of computing units, the neurons, which process input information, transforming it into an output signal whose form depends on the network's internal state. Their importance has increased due to the recent availability of software capable of efficiently training the network, allowing these methods to be used for a large variety of problems, from speech and image recognition up to the forecasting of time series and high-dimensional clustering. Many neural network topologies have been proposed in the literature, as specific problems require specific topologies to be solved efficiently. Here, we concentrate on a specific configuration known as the multilayer perceptron or multilayer neural network. In Fig. 1a the typical structure of a multilayer perceptron is shown: the inputs enter the network, are processed by one or more hidden layers, and exit at an output layer. The computation can therefore be seen as a mapping operation from an n-dimensional input vector to an m-dimensional output vector. In a multilayer perceptron, information travels from the input to the output layer because the neuron connections are chosen to be unidirectional. When all neurons of one layer are connected to all neurons of the following one, we speak of a fully connected multilayer perceptron.
Each neuron performs a specific kind of computation, and Fig. 1b shows the functionality of the Rosenblatt perceptron. First, a weighted sum of the input variables and the bias term b is built, with the result then being processed by an activation function f(t). Once the single-neuron operation is specified, one can easily calculate the network outputs given an input vector by evaluating the output of each layer by forward input propagation. The result is a function of the network configuration, i.e. its topology and the values of the connection weights. It is the job of the training phase to learn the weights in order to induce the desired computation; training and learning are used here as synonyms.
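Forward propagation through such a network can be sketched in a few lines; the logistic activation and the random placeholder weights below are assumptions for illustration, not ClimateLearn defaults.

```python
import numpy as np

def forward(x, layers):
    """Forward propagation through a fully connected multilayer perceptron.
    `layers` is a list of (W, b) pairs; each neuron computes
    f(b + sum_i w_i x_i), as in the Rosenblatt perceptron of Fig. 1b."""
    def f(t):
        return 1.0 / (1.0 + np.exp(-t))   # logistic activation (an assumption)
    a = x
    for W, b in layers:
        a = f(W @ a + b)
    return a

# A 3-input, 2-output network with two hidden layers of 4 neurons each,
# mirroring the schematic of Fig. 1a (weights are random placeholders)
rng = np.random.default_rng(2)
sizes = [3, 4, 4, 2]
layers = [(rng.standard_normal((m, n)), rng.standard_normal(m))
          for n, m in zip(sizes[:-1], sizes[1:])]
y = forward(np.array([0.1, -0.5, 0.3]), layers)
```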
Since the advent of neural networks, the training phase has been considered a computationally demanding problem, mainly because of the absence of efficient algorithms relative to the available computing power. This has been overcome by the back-propagation algorithm (Bishop, 2006), nowadays widely applied in training multilayer perceptrons. Given a supervised training set {x_i, t_i : i = 1, ..., N} with x_i input variables and t_i target variables, we denote by y_i the corresponding output computed by the network when x_i is fed forward. In general we have t_i ≠ y_i. A global error on the training set can then be defined as a quadratic function of the form

E(w) = \frac{1}{2N} \sum_i ||t_i - y_i||^2   (7)

and can be seen as a function of the network weights w. Other error definitions are possible, for example by choosing a different norm. The idea behind back-propagation is to minimize this error by updating the weights using the gradient descent method (with k as the iteration index), i.e.

w_{ij}^{(k)} \leftarrow w_{ij}^{(k)} - \alpha \frac{\partial E(w)}{\partial w_{ij}^{(k)}}.   (8)

The calculation of the partial derivatives is thus crucial for the algorithm. It is done by using directly the dependence of the error function on the training set instances. When all the instances have been used, one 'epoch' of training is completed. Usually many epochs of training are needed for the error function to converge to a local or global minimum, resulting in longer training periods.
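The training loop of Eqs. (7)-(8) can be sketched for a one-hidden-layer perceptron as follows; the logistic activation, the toy AND task, and all names are illustrative assumptions, not the ClimateLearn code.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def train(X, T, hidden, alpha, epochs, seed=0):
    """Minimal back-propagation: gradient descent (Eq. 8) on the quadratic
    error E(w) of Eq. (7) for a perceptron with one hidden layer."""
    rng = np.random.default_rng(seed)
    N, n_in = X.shape
    n_out = T.shape[1]
    W1, b1 = rng.standard_normal((hidden, n_in)), np.zeros(hidden)
    W2, b2 = rng.standard_normal((n_out, hidden)), np.zeros(n_out)
    for _ in range(epochs):               # one pass over all instances = one epoch
        H = sigmoid(X @ W1.T + b1)        # hidden-layer outputs
        Y = sigmoid(H @ W2.T + b2)        # network outputs y_i
        d2 = (Y - T) * Y * (1 - Y)        # output-layer error signal
        d1 = (d2 @ W2) * H * (1 - H)      # error signal propagated backwards
        W2 -= alpha * d2.T @ H / N        # Eq. (8) for each weight
        b2 -= alpha * d2.mean(axis=0)
        W1 -= alpha * d1.T @ X / N
        b1 -= alpha * d1.mean(axis=0)
    H = sigmoid(X @ W1.T + b1)
    Y = sigmoid(H @ W2.T + b2)
    E = 0.5 * np.mean(np.sum((T - Y) ** 2, axis=1))   # Eq. (7)
    return (W1, b1, W2, b2), E

# Toy supervised task: learn the logical AND of two inputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [0], [0], [1]], dtype=float)
_, E_before = train(X, T, hidden=4, alpha=2.0, epochs=0)
_, E_after = train(X, T, hidden=4, alpha=2.0, epochs=5000)
```

After many epochs the error E(w) should have dropped well below its value at the random initialization.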
Fig. 1: (a) Schematics of a fully connected multilayer perceptron with three input variables x1, x2 and x3 and two output variables y1 and y2, with two hidden layers. (b) The Rosenblatt perceptron, with three inputs and a bias unit. The weighted input sum is added to the bias term and then enters as argument of the activation function f, which generates the neuron output.
Once training is completed, the model is tested by checking the error on a test set. One main concern in this procedure is to avoid overfitting the data, i.e. a model that adapts too closely to the training data and may not generalize well when new data are used. In order to minimize the risk of overfitting, one can employ cross-validation methods, which consist in providing several bi-partitions of the training set: a training partition and a validation partition. The network is then trained on the training partition and tested on the validation partition. Once this process has been performed on several cross-validation partitions and the statistics of the training and validation errors are examined, the quality of the model can be established. If no signs of overfitting are detected, the training is considered successful and the network can be employed for generalization.
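A k-fold variant of such bi-partitions can be sketched as follows; the callables `fit` and `error` and the toy mean-predictor "model" are hypothetical placeholders, not ClimateLearn functions.

```python
import numpy as np

def cross_validation_errors(X, T, k, fit, error):
    """k-fold cross-validation: split the training set into k bi-partitions,
    train on each training partition and evaluate on the held-out
    validation partition. `fit` and `error` are user-supplied callables."""
    folds = np.array_split(np.arange(len(X)), k)
    val_errors = []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train], T[train])
        val_errors.append(error(model, X[val], T[val]))
    return val_errors

# Toy usage with a mean-predictor "model"; a large spread of the validation
# errors across folds would hint at overfitting or inhomogeneous partitions
X = np.arange(20.0).reshape(-1, 1)
T = 2 * X[:, 0] + 1
errs = cross_validation_errors(
    X, T, k=5,
    fit=lambda X, T: T.mean(),
    error=lambda m, X, T: np.mean((T - m) ** 2))
```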
3.2 Genetic programming
Genetic programming and genetic algorithms are a class of evolutionary algorithms whose principles are based on Charles Darwin's theory of evolution (Darwin, 1859). In this masterpiece, Darwin explains that, given a
population of individuals living within an environment, only a subset of them are properly fitted and therefore have higher chances of survival and reproduction. New generations may inherit these favourable genetic characteristics, which will end up being dominant inside the population. Variations in the individual characteristics can be classified in three categories: in the first category are variations that are harmful for the individual; to the second belong the beneficial ones; and in the third category the variations have no effective influence. Natural selection consists in the preservation of beneficial characteristics and their transmission to the next generation, since the fitted individuals live longer and are most of the time better able to beat the competition for reproduction.
Given a problem P to be solved, we imagine having an ensemble S of solutions to this problem. According to its efficiency in solving P, each solution can be considered as an individual for which the degree of adaptation to its environment can be measured in terms of a fitness value. Genetic programming (GP) is a particular case where the evolved individuals are computer programs (Koza, 1993). The aim is to appropriately evolve computer programs by creating new generations, evaluating their fitness values and finally selecting the best program that solves the problem at hand. Here, we restrict such programs to functions f(x_1, ..., x_n) of a given number of variables x_i, i = 1, ..., n, and we aim at finding a function that approximates the solution to our problem accurately enough. Our application is therefore nothing more than symbolic regression achieved through genetic programming algorithms. The fitness values can be represented mathematically by a real-valued functional F[f(x_1, ..., x_n)], mapping the space of possible solutions onto the real axis. In GP, the programs are typically represented as trees, where each tree represents an expression of a potential solution to a problem (cf. examples in Fig. 2).
To implement the variations in genetic programming, two operators are commonly used: mutation and crossover. Their behaviour is very similar to the biological mutation and crossover concepts. In a mutation step, a random node in the tree that represents the individual is selected and the corresponding subtree is replaced randomly by another one (Fig. 2a). Mutation is very important to keep diversity inside the population, and diversity helps the algorithm to explore the whole search space, preventing it from getting trapped in local maxima of the fitness functional. Crossover is based on the exchange of characteristics between two individuals. In GP, this is implemented by randomly selecting a node in each of the two trees that represent the individuals and exchanging the two subtrees attached to those nodes (Fig. 2b). In this way, the two individuals inherit characteristics of both parent trees.
In a so-called symbolic regression scheme, an initial population of individuals is generated randomly using elementary functions taken from a function set (e.g. sin x, exp x, max[x_1, x_2]) combined using certain algebraic or functional operations (e.g. +, −, ·) from an operation set, whose choice depends on the problem at hand. The initial population is then evaluated according to a fitness function. The genetic operations mutation and crossover are then applied to the population in order to create a second generation, which is evaluated as well. This process continues until a prescribed termination condition is met, for example when the maximum number of generations has been reached or an individual of high enough fitness is found (e.g. when the absolute value of the functional F is smaller than a certain tolerance).
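The scheme above can be sketched in a few lines of Python. This is a deliberately minimal illustration with mutation and elitist selection only (crossover omitted for brevity), not the GP engine used in ClimateLearn; all names, sets and parameters are assumptions.

```python
import random, math

# Function and operation sets (cf. Fig. 2) for a one-variable problem
OPS = {'+': lambda a, b: a + b, '*': lambda a, b: a * b}
FUNCS = {'sin': math.sin, 'cos': math.cos}

def random_tree(depth):
    """Grow a random expression tree over the single variable 'x'."""
    if depth == 0 or random.random() < 0.3:
        return 'x' if random.random() < 0.5 else round(random.uniform(-2, 2), 2)
    if random.random() < 0.5:
        return (random.choice(sorted(FUNCS)), random_tree(depth - 1))
    return (random.choice(sorted(OPS)), random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, x):
    if tree == 'x':
        return x
    if not isinstance(tree, tuple):
        return tree                      # numeric terminal
    if tree[0] in FUNCS:
        return FUNCS[tree[0]](evaluate(tree[1], x))
    return OPS[tree[0]](evaluate(tree[1], x), evaluate(tree[2], x))

def mutate(tree):
    """Mutation step (Fig. 2a): replace a random subtree by a new random one."""
    if not isinstance(tree, tuple) or random.random() < 0.3:
        return random_tree(2)
    i = random.randrange(1, len(tree))
    return tree[:i] + (mutate(tree[i]),) + tree[i + 1:]

def fitness(tree, xs, target):
    """Negative squared error of the candidate program on the sample points."""
    return -sum((evaluate(tree, x) - target(x)) ** 2 for x in xs)

# Evolve a population towards the target function sin(x)
random.seed(4)
xs = [i / 10 for i in range(-20, 21)]
pop = [random_tree(3) for _ in range(60)]
history = []
for gen in range(30):
    pop.sort(key=lambda t: fitness(t, xs, math.sin), reverse=True)
    history.append(fitness(pop[0], xs, math.sin))   # best fitness so far
    pop = pop[:20] + [mutate(random.choice(pop[:20])) for _ in range(40)]
best = max(pop, key=lambda t: fitness(t, xs, math.sin))
```

Because the top individuals always survive to the next generation, the best fitness in `history` never decreases, which is the termination-by-fitness behaviour described above.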
Fig. 2: (a) Example of mutation operation: a branch of the tree on the left is changed (mutated) by substitution
with another compatible branch determined randomly. (b) Example of crossover: two individuals are selected
from the population and a new individual is created by mixing the highlighted branches.
4 Application: El Niño variability
As an example of the application of ClimateLearn, we consider the forecasting of El Niño events using two different approaches. First, we focus on the forecasting of the occurrence of events, i.e. the presence (or not) of an El Niño in a given interval of time, regardless of the intensity of the phenomenon. This problem can be considered as a classification problem, where a set of discrete classes is the output of the model (section 4.1). The second approach is the forecasting of the time evolution of a scalar characteristic of El Niño, where we aim at the prediction of a real-valued time series by regression. The results in this case give information on both the presence and the intensity of the event (section 4.2). In section 4.3, we provide specific results for the occurrence and development of the El Niño conditions in the year 2014.
4.1 Results: Occurrence of El Niño events
Just as in Ludescher et al. (2014), the data consist of atmospheric surface temperature anomalies over the period May 1949 – March 2014 from the NCEP Reanalysis project (Kalnay et al., 1996). From this dataset, a directed, weighted network was reconstructed (Gozolchiani et al., 2011; Ludescher et al., 2014) using the methodology presented in section 2. Several measures x_i, i = 1, ..., N of this network are used as the attributes in the machine-learning approach, and for each quantity a time series (x_i^1, ..., x_i^T) is available. We use a time interval of 10 days (which gives T = 2365) and choose eight measures (with time included, N = 9). These eight measures are the maximum correlation MAX(C_ij), the minimum correlation MIN(C_ij), the maximum delay MAX(Δt), the minimum delay MIN(Δt), the maximum link weight MAX(W_ij), the minimum link weight MIN(W_ij), the standard deviation of the correlation STD(C_ij), and the mean correlation MEAN(C_ij) (see section 2).
The target variable is discrete valued and distinguishes the presence or absence of an event. Operationally (http://www.cpc.ncep.noaa.gov/), an El Niño event is said to occur when the sea surface temperature anomaly over the region 120°W–170°W × 5°S–5°N, the so-called NINO3.4 index, is above the threshold of +0.5°C for at least 5 consecutive months. Hence, we put y = 0 (no event) when an instance belongs to an interval of time where El Niño is not present, and y = 1 when it is present. Here we do not want to smooth the data, and hence we flag an El Niño event when NINO3.4 values are continuously above the threshold of +0.4°C for five months. Regarding the construction of the training and test sets, the condition t_1^test > t_T^train has to be satisfied. This means that the instances in the test set occur after those in the training set, since we are only interested in a chronological prediction of El Niño events.
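The nominal target variable can be constructed from a monthly NINO3.4 series as sketched below; the function name and the toy series are illustrative assumptions.

```python
import numpy as np

def flag_el_nino(nino34, threshold=0.4, min_months=5):
    """Nominal target variable: y=1 for months inside a period where the
    monthly NINO3.4 anomaly stays above `threshold` for at least
    `min_months` consecutive months, y=0 otherwise."""
    above = np.asarray(nino34) > threshold
    y = np.zeros(len(above), dtype=int)
    start = None
    for k, a in enumerate(list(above) + [False]):   # sentinel closes a final run
        if a and start is None:
            start = k                               # a warm run begins
        elif not a and start is not None:
            if k - start >= min_months:             # long enough: flag the run
                y[start:k] = 1
            start = None
    return y

# Toy anomaly series: a 6-month warm period qualifies, a 3-month one does not
series = [0.1, 0.5, 0.6, 0.6, 0.7, 0.8, 0.6, 0.0, 0.5, 0.6, 0.5, 0.1]
print(flag_el_nino(series))   # → [0 1 1 1 1 1 1 0 0 0 0 0]
```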
The method we choose for the supervised learning is an artificial neural network (ANN) with a 3×3 layer structure (3 neurons per layer). The training set is from May 1949 to June 2001 (80% of T); the test set is from June 2001 to March 2014 (20% of T). Similar to Ludescher et al. (2014), the prediction lead time τ is 12 months. Fig. 3a shows the classification results on the test set, where 1 stands for the occurrence of an El Niño event and 0 means absence. The result is then filtered by eliminating the isolated and transient events, and by batching the adjacent events together. Fig. 3b then shows that our forecasting scheme gives accurate alarms 12 months ahead for the El Niño events in 2002, 2006 and 2009, and no alarm in 2004. Compared with the results in Ludescher et al. (2014), the machine-learning toolbox enables us to give a better prediction of the occurrence of El Niño events when using more measures of the same CN.
One advantage of using supervised learning for prediction is that the predictor model is constructed automatically from the training set, without subjective decisions like the choice of thresholds. However, because the available data for prediction as well as the number of instances is limited (for example, only a few El Niño events occurred between May 1949 and March 2014), the accuracy of the prediction will mostly depend on the length of the training set. Consequently, we need to choose proper proportions of the available data for the training/test sets to avoid 'under-training'. To demonstrate that the current proportion for the test set (20% of T) gives the best performance, we conduct a Receiver Operating Characteristic (ROC)-type analysis by varying the proportion of T used as the test set from 16% to 30%. With a proportion between 16% and 20%, the averaged hit rate is D = 0.90 and the averaged false-alarm rate is α = 0.10. For 21% to 25%, we find D = 0.71 and α = 0.29, and for 26% to 30%, D = 0.21 and α = 0.79. Thus, to have a higher hit rate and a lower false-alarm rate, the best proportion for
Fig. 3: Prediction results on the test set from June 2001 to March 2014 (a) without filtering and (b) with filtering, using an artificial neural network (ANN) with a 3×3 layer structure (3 neurons per layer) for a 12-month lead-time prediction of the occurrence of El Niño events. The red dashed lines are the actual nominal quantity of the NINO3.4 index (1 stands for the occurrence of an El Niño event, where NINO3.4 values are continuously above the threshold of +0.4°C for five months, and 0 for the absence of such an event), and the blue solid lines indicate the predicted ones.
the test set is 20%. Of course, we should also maximize the length of the test set to incorporate more El Niño events for testing, and this motivated our choice of 20% (Fig. 3).
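For illustration, hit and false-alarm rates over nominal event series can be computed per instance as sketched below; this simplified per-instance counting is an assumption and need not match the event-based bookkeeping behind the numbers quoted above.

```python
def hit_and_false_alarm_rates(actual, predicted):
    """ROC-type rates for nominal (0/1) series: the hit rate D is the fraction
    of actual events that were predicted, and the false-alarm rate alpha is
    the fraction of non-events for which an alarm was raised."""
    hits = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    misses = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    false_alarms = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    correct_neg = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    D = hits / (hits + misses)
    alpha = false_alarms / (false_alarms + correct_neg)
    return D, alpha

# Toy series: 3 of 4 events hit, 1 false alarm among 6 non-events
actual    = [1, 1, 0, 0, 1, 0, 0, 1, 0, 0]
predicted = [1, 0, 0, 1, 1, 0, 0, 1, 0, 0]
D, alpha = hit_and_false_alarm_rates(actual, predicted)
```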
4.2 Results: NINO3.4 index development
Predictions for the development of the NINO3.4 index are more difficult than those for the occurrence of El Ni˜
no255
events. For example, consider the results of the CFS version 2 (CFSv2) model developed by the Environmental
Modeling Center at National Centers for Environmental Prediction (NCEP). This is a fully coupled model repre-
senting the interaction between the Earth’s atmosphere, oceans, land and sea ice (Saha et al., 2014). In August
2014 this model predicted that the NINO3.4 index would go over +1.0C in October 2014 but the actual value in
October 2014 was just around +0.5C. Hence even for short term predictions (up to few months) a good skill of260
the NINO3.4 index development is still hard to achieve by this model.
Short-term development of the NINO3.4 index is strongly related to the stability of the Pacific background state and the occurrence of westerly wind bursts (WWBs) near the dateline. In Feng and Dijkstra (2015), PCCNs were reconstructed from sea surface temperature data in the HadISST dataset (Rayner, 2003) using the methodology presented in section 2. As a measure of the coherence in the PCCN, they determined the number of links of each node, i.e. the degree of the node. As a measure of the stability of the Pacific climate, they used the skewness S_d of the degree distribution of the PCCN. In addition to S_d, the time series of the second principal component (PC2) of the wind stress residual (the signal due to SST variability is filtered out) is used as a measure of the WWB strength (Feng and Dijkstra, 2015).
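The degree-based stability measure can be sketched as follows (a minimal illustration assuming the undirected, unweighted network is given as a binary adjacency matrix; this is our own sketch, not the code of Feng and Dijkstra (2015)):

```python
import numpy as np
from scipy.stats import skew

def degree_skewness(A):
    """Given the symmetric, binary adjacency matrix A of an undirected,
    unweighted climate network, return the skewness S_d of its degree
    distribution (the number of links attached to each node)."""
    degrees = A.sum(axis=0)  # degree of each node
    return skew(degrees)

# Toy 4-node network: one hub connected to all other nodes, so the
# degree distribution [3, 1, 1, 1] is strongly right-skewed.
A = np.array([[0, 1, 1, 1],
              [1, 0, 0, 0],
              [1, 0, 0, 0],
              [1, 0, 0, 0]])
```

A strongly skewed degree distribution signals that a few highly connected nodes dominate the network's coherence.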
Geosci. Model Dev. Discuss., doi:10.5194/gmd-2015-273, 2016
Manuscript under review for journal Geosci. Model Dev.
Published: 11 February 2016
© Author(s) 2016. CC-BY 3.0 License.
Next, we use the machine-learning toolbox to investigate the importance of S_d and PC2 for the NINO3.4 index development by supervised learning regression. The attributes are therefore the background stability index x_1 = S_d, the westerly wind burst measure x_2 = PC2, and the time x_3 = t from November 1961 to October 2014 with a time interval of one month (i.e., T = 636 and N = 3). Given the data set, we again have to choose a training and a test set. In the case of regression we can randomly choose a given percentage of the instances to belong to the training set and the rest to the test set. Since we do not possess a large amount of data, it is, however, important that these two datasets are as homogeneous as possible in order to avoid overfitting issues.
The training set chosen is from November 1961 to April 2004 (80% of T) and the test set is from May 2004 to October 2014 (20% of T). The quality of the predicted results in the test set is measured by the normalized root mean squared error (NRMSE), defined as
NRMSE(y_A, y_P) = \frac{1}{\max(y_A) - \min(y_A)} \sqrt{\frac{\sum_{t_1^{test} \leq t_k \leq t_n^{test}} \left(y_A^k - y_P^k\right)^2}{n}},   (9)
where y_A^k is the actual time series of the NINO3.4 index, y_P^k is the predicted one, and n is the number of points in the test set.
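Eq. (9) translates directly into code; the following sketch (ours, not the toolbox's API) assumes the actual and predicted series have already been restricted to the test points:

```python
import numpy as np

def nrmse(y_actual, y_pred):
    """Normalized root mean squared error of Eq. (9): the RMSE over the
    test points, normalized by the range of the actual series."""
    y_actual = np.asarray(y_actual, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_actual - y_pred) ** 2))
    return rmse / (y_actual.max() - y_actual.min())

# Example: a constant 0.5 degree offset on a series spanning 2 degrees
# gives NRMSE = 0.5 / 2 = 0.25.
y_a = np.array([-1.0, 0.0, 1.0])
y_p = y_a + 0.5
```

Normalizing by the range of the actual series makes scores comparable across test windows with different ENSO amplitudes.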
We first employ an ANN with a 2×1 layer structure (2 neurons in the first layer and 1 neuron in the second one) to do the regression. Since we do not know the optimal prediction time τ which would give a reasonable prediction y(t + τ) at time t + τ, we vary τ from 1 month up to 12 months. Fig. 4 shows the regression results on the test set for the 2-4 months lead time NINO3.4 forecasts. The best prediction, with the smallest value of NRMSE=0.18, is obtained at τ = 3 months (Fig. 4b).
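The lead-time sweep can be sketched as follows. This is an illustration only: the paper trains its ANNs through Weka, whereas here we assume scikit-learn's MLPRegressor as a stand-in, and the function name and the synthetic data in the usage example are ours:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def best_lead_time(X, y, max_tau=12, train_frac=0.8):
    """For each lead time tau (in months), train a small ANN mapping the
    attributes at time t to the target at time t + tau, and return a
    list of (tau, NRMSE) pairs evaluated on the chronological test set."""
    scores = []
    for tau in range(1, max_tau + 1):
        X_tau, y_tau = X[:-tau], y[tau:]  # target shifted tau steps ahead
        cut = int(train_frac * len(y_tau))
        model = MLPRegressor(hidden_layer_sizes=(2, 1), max_iter=5000,
                             random_state=0)
        model.fit(X_tau[:cut], y_tau[:cut])
        y_hat = model.predict(X_tau[cut:])
        rmse = np.sqrt(np.mean((y_tau[cut:] - y_hat) ** 2))
        scores.append((tau, rmse / (y_tau.max() - y_tau.min())))
    return scores
```

The lead time with the smallest NRMSE (τ = 3 months in the paper) would then be selected for the operational forecast.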
To test the robustness of the regression result for the three months lead time NINO3.4 index forecast (Fig. 4b), we perform a series of cross-validations by keeping specific percentage splits between training set and test set (70-30, 75-25, 80-20, and 85-15), but randomly choosing 200 initial times t_1^test of the test set from November 1961 to October 2014 for each percentage split. From Fig. 5, one can see that the peak values of the NRMSE remain near 0.17, independent of the choices of the percentage splits and t_1^test. Therefore the regression result in Fig. 4b is considered robust.
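This randomized cross-validation design can be sketched as follows (our own helper, assuming the test set is a contiguous window whose start time is drawn uniformly at random and the remaining instances form the training set):

```python
import numpy as np

def random_window_splits(T, test_frac, n_trials=200, seed=0):
    """Generate n_trials (train_idx, test_idx) pairs where the test set
    is a contiguous window of length round(test_frac * T) with a
    randomly drawn start; the rest of the record is the training set."""
    rng = np.random.RandomState(seed)
    n_test = int(round(test_frac * T))
    splits = []
    for _ in range(n_trials):
        start = rng.randint(0, T - n_test + 1)  # random window start
        test_idx = np.arange(start, start + n_test)
        train_idx = np.setdiff1d(np.arange(T), test_idx)
        splits.append((train_idx, test_idx))
    return splits
```

Repeating the regression over such splits yields the NRMSE distributions of Fig. 5.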
Due to the irregular behavior of the PC2 representing the WWBs (cf. Figure 3 in Feng and Dijkstra (2015)), the predicted NINO3.4 indices in Fig. 4 show more fluctuations than the actual one. When a 3-month running mean is applied to the predicted NINO3.4 index (three months lead time, Fig. 4b) as well as to the actual one, the forecast has better skill (NRMSE=0.14), as shown in Fig. 6a. To further demonstrate that the result in Fig. 6a is robust and independent of the choice of ANN layer structure and of the supervised learning method, we perform the same regression task with an ensemble of 49 ANNs with different binary layer structures and up to 7 neurons per layer, and with an ensemble of 50 GP runs. The averaged result of the best 10 ANNs (with the smallest NRMSE values) is shown in Fig. 6b with NRMSE=0.15. The averaged result of the best 10 GP runs (having the smallest regression error) is shown in Fig. 6c with NRMSE=0.17. Both are similar to the result obtained by the ANN with a 2×1 layer structure in Fig. 6a.
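The post-processing steps above, a 3-month running mean and the averaging of the best ensemble members, can be sketched as follows (function names and toy data are ours):

```python
import numpy as np

def running_mean(x, window=3):
    """Running mean over a sliding window (valid part only), e.g. the
    3-month smoothing applied to the NINO3.4 series."""
    return np.convolve(x, np.ones(window) / window, mode="valid")

def ensemble_best(predictions, errors, n_best=10):
    """Average the predictions of the n_best ensemble members with the
    smallest regression errors (as done for the ANN and GP ensembles)."""
    best = np.argsort(errors)[:n_best]
    return np.mean(np.asarray(predictions)[best], axis=0)
```

Selecting only the best members before averaging keeps poorly trained networks or degenerate GP runs from degrading the ensemble mean.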
Fig. 4: Regression results on the test set from May 2004 to October 2014 using an ANN with a 2×1 layer structure (2 neurons in the first layer and 1 neuron in the second one) for the prediction of the NINO3.4 index with a lead time of (a) 2 months (NRMSE=0.23), (b) 3 months (NRMSE=0.18), and (c) 4 months (NRMSE=0.22). The red dashed lines are the actual values of the NINO3.4 index, and the blue solid lines indicate the predicted ones. [Panels (a)-(c): NINO3.4 (°C) versus time (years).]
4.3 Results: El Niño development in 2014
In the previous sections we have seen that, by using the measures of the CNs from Ludescher et al. (2014) and Feng and Dijkstra (2015), the machine-learning toolbox ClimateLearn can give robust predictions of the occurrence of El Niño events one year ahead and of the development of the NINO3.4 index with a lead time of three months, respectively. We now apply these techniques to the occurrence and development of the situation in 2014.
First, we consider the occurrence of an El Niño event up to March 2015, using the same data as in section 4.1 up to March 2014. The prediction results on El Niño occurrence in 2014 are shown in Fig. 7a, employing an ensemble of 36 ANNs with different binary layer structures and up to 6 neurons per layer. Like the event in 2012 in Fig. 3b, our forecast scheme tends to ignore ENSO-neutral favored events or weak El Niño events. Hence, no El Niño event between January 2014 and March 2015 is predicted one year ahead (Fig. 7a).
Fig. 5: Cross-validation results of the NINO3.4 index forecast on the test set, keeping fixed percentage splits between training set and test set (70-30, 75-25, 80-20, and 85-15), but randomly choosing 200 initial times t_i^test of the test set from November 1961 to October 2014 for each percentage split. The blue dashed curve is the NRMSE distribution of the 70-30 split (70% of T as training sets and 30% of T as test sets), the green solid curve that of the 75-25 split, the red solid curve that of the 80-20 split, and the cyan solid curve that of the 85-15 split. [Axes: frequency (%) versus NRMSE.]
Second, we consider the development of the NINO3.4 index from January 2014 till January 2015, using the same data as in section 4.2 up to October 2014. The accuracy of the predicted NINO3.4 index over 2014 with a lead time of three months (Fig. 7b) is quite good (NRMSE=0.19), for example compared with that of the CFSv2 model over the same period (NRMSE=0.34).
5 Summary and discussion
In this paper, we have presented the machine-learning toolbox ClimateLearn for climate prediction problems, based on climate data obtained from complex network reconstruction and analysis. Besides handling multivariate data from these networks and other sources, another advantage of using this machine-learning toolbox for climate variability prediction is that the development of predictor models is dynamic and data-driven (Bishop, 2006). Using machine-learning techniques with the measures from reconstructed Climate Networks (CNs), we have provided novel prediction schemes for the occurrence of an El Niño event (with a lead time of one year) and for the development of the NINO3.4 index (with a lead time of three months).
By using measures of a directed and weighted CN (Ludescher et al., 2014) and supervised learning classification, we developed a forecast scheme for predicting the occurrence of an El Niño event one year ahead. This scheme apparently does not suffer from the 'spring predictability barrier' (Goddard et al., 2001). This is probably due to the fact that the network measures adequately capture the changes in spatial patterns one calendar year before the warming event (Ludescher et al., 2013). The prediction schemes can well represent the nonlinear relationships among the attributes and give an objective prediction. For comparison, in the forecast scheme proposed by Ludescher et al. (2014), the prediction may be sensitive to the choice of the decision threshold θ. Moreover, the false alarms and the misses (the El Niño events in 2006 and 2009 were not detected) show the limitations of their scheme. These deficiencies may be caused by the fact that their forecast scheme is based on only one single measure of the CN. The supervised learning method in our forecast scheme does not have these problems.
Fig. 6: Results for a 3-month running mean regression on the test set from May 2004 to October 2014 using (a) an ANN with a 2×1 layer structure (2 neurons in the first layer and 1 neuron in the second one, NRMSE=0.14), (b) an ensemble of 49 ANNs with different binary layer structures and up to 7 neurons per layer (only the ensemble mean of the best 10 is shown, NRMSE=0.15), and (c) an ensemble of genetic programming runs (only the ensemble mean of the best 10 is shown, NRMSE=0.17) for the three months ahead prediction of the development of the NINO3.4 index. The red dashed curves are the actual values of the NINO3.4 index, and the blue solid curves indicate the predicted ones. [Panels (a)-(c): NINO3.4 (°C) versus time (years).]
Fig. 7: Prediction results on ENSO variability in 2014 using an ensemble of 36 ANNs with different binary layer structures and up to 6 neurons per layer. (a) The occurrence of the El Niño event given one year ahead, and (b) the development of the NINO3.4 index with a three months lead time (only the ensemble mean is shown, NRMSE=0.19). The red dashed lines are the actual nominal quantity/actual values of the NINO3.4 index, the blue solid lines indicate the predicted ones, and the black solid line indicates the prediction by the CFSv2 model (only the ensemble mean is shown, estimated from http://www.cpc.ncep.noaa.gov/products/people/wwang/cfsv2_fcst_history/).
In addition, by using measures of an undirected and unweighted CN (Feng and Dijkstra, 2015) that monitor the stability of the Pacific climate state and a measure of the atmospheric wind-stress noise, in combination with supervised learning regression, we provided reasonable forecasts of the development of the NINO3.4 index three months ahead. A lead time of three months is of course too short to make this forecast scheme outcompete existing ones. However, comparing these forecast results with those from much more sophisticated models, such as the CFSv2 model, indeed confirms that the quantities S_d and PC2 are important factors in the development of El Niño events.
The software package ClimateLearn is written in Python 2.7, and it makes full use of the open source packages Weka (available at http://www.cs.waikato.ac.nz/ml/weka/) and ECJ (available at https://cs.gmu.edu/~eclab/projects/ecj/). The package ClimateLearn allows basic operations of data mining, i.e. reading, merging, and cleaning data, and running machine learning algorithms. Building on the success of complex network approaches to investigate aspects of climate variability, ClimateLearn provides an innovative and convenient way to predict the occurrence and development of El Niño events. It can also be directly applied to the prediction of other climate variability phenomena.
Code availability
ClimateLearn is available through GitHub at https://github.com/Ambrosys/climatelearn. The package is still in a raw version; we plan, however, to refine it into a full Python implementation using other open source third-party libraries (e.g. DEAP and PyBrain) in the near future.
Acknowledgements. The authors would like to acknowledge the support of the LINC project (no. 289447) funded by the EC's Marie-Curie ITN program (FP7-PEOPLE-2011-ITN). The computations on the Cartesius machine were funded by the Exact Sciences division of the Netherlands Organisation for Scientific Research under grant SH-286.
References
Bishop, C. M.: Pattern Recognition and Machine Learning, Springer, New York, 2006.
Chen, D., Cane, M. A., Kaplan, A., Zebiak, S. E., and Huang, D.: Predictability of El Niño over the past 148 years, Nature, 428, 733–736, 2004.
Darwin, C.: On the Origin of Species by Means of Natural Selection, D. Appleton and Co., New York, 1859.
Donges, J. F., Zou, Y., Marwan, N., and Kurths, J.: Complex networks in climate dynamics, The European Physical Journal Special Topics, 174, 157–179, 2009.
Donges, J. F., Heitzig, J., Beronov, B., Wiedermann, M., Runge, J., Feng, Q. Y., Tupikina, L., Stolbova, V., Donner, R. V., Marwan, N., Dijkstra, H. A., and Kurths, J.: Unified functional network and nonlinear time series analysis for complex systems science: The pyunicorn package, Chaos, pp. 1–26, 2015.
Duan, W. and Wei, C.: The spring predictability barrier for ENSO predictions and its possible mechanism: results from a fully coupled model, International Journal of Climatology, 33, 1280–1292, 2013.
Fedorov, A., Harper, S., Philander, S., Winter, B., and Wittenberg, A.: How predictable is El Niño?, Bulletin of the American Meteorological Society, 84, 911–919, 2003.
Feng, Q. Y. and Dijkstra, H.: Are North Atlantic multidecadal SST anomalies westward propagating?, Geophysical Research Letters, 2014.
Feng, Q. Y. and Dijkstra, H.: Climate network based stability index for El Niño variability, arXiv:1503.05449, 2015.
Fountalis, I., Bracco, A., and Dovrolis, C.: ENSO in CMIP5 simulations: network connectivity from the recent past to the twenty-third century, Climate Dynamics, pp. 1–28, 2015.
Goddard, L., Mason, S. J., Zebiak, S. E., Ropelewski, C. F., Basher, R., and Cane, M. A.: Current approaches to seasonal to interannual climate predictions, International Journal of Climatology, 21, 1111–1152, 2001.
Gozolchiani, A., Havlin, S., and Yamasaki, K.: Emergence of El Niño as an autonomous component in the climate network, Physical Review Letters, 107, 148501, 2011.
Ihshaish, H., Tantet, A., Dijkzeul, J. C. M., and Dijkstra, H. A.: Par@Graph: a parallel toolbox for the construction and analysis of large complex climate networks, Geoscientific Model Development, 8, 3321–3331, 2015.
Kalnay, E., Kanamitsu, M., Kistler, R., Collins, W., Deaven, D., Gandin, L., Iredell, M., Saha, S., White, G., Woollen, J., et al.: The NCEP/NCAR 40-year reanalysis project, Bulletin of the American Meteorological Society, 77, 437–471, 1996.
Katz, R. W.: Sir Gilbert Walker and a connection between El Niño and statistics, Statistical Science, pp. 97–112, 2002.
Koza, J. R.: Genetic Programming: On the Programming of Computers by Means of Natural Selection, Complex Adaptive Systems, MIT Press, 1993.
Latif, M. and Barnett, T. P.: Causes of decadal climate variability over the North Pacific and North America, Science, 266, 634–637, 1994.
Ludescher, J., Gozolchiani, A., Bogachev, M. I., Bunde, A., Havlin, S., and Schellnhuber, H. J.: Improved El Niño forecasting by cooperativity detection, Proceedings of the National Academy of Sciences, 110, 11742–11745, 2013.
Ludescher, J., Gozolchiani, A., Bogachev, M. I., Bunde, A., Havlin, S., and Schellnhuber, H. J.: Very early warning of next El Niño, Proceedings of the National Academy of Sciences, 111, 2064–2066, 2014.
Mitchell, T.: Machine Learning, McGraw-Hill, New York, 1997.
Qiu, B. and Chen, S.: Variability of the Kuroshio Extension jet, recirculation gyre and mesoscale eddies on decadal time scales, Journal of Physical Oceanography, 35, 2090–2103, 2005.
Rayner, N. A.: Global analyses of sea surface temperature, sea ice, and night marine air temperature since the late nineteenth century, Journal of Geophysical Research, 108, 4407, doi:10.1029/2002JD002670, 2003.
Reilly, B.: Disaster and Human History: Case Studies in Nature, Society and Catastrophe, McFarland, Jefferson, 2009.
Saha, S., Moorthi, S., Wu, X., Wang, J., Nadiga, S., Tripp, P., Behringer, D., Hou, Y.-T., Chuang, H.-y., Iredell, M., et al.: The NCEP Climate Forecast System version 2, Journal of Climate, 27, 2185–2208, 2014.
Sharma, N., Sharma, P., Irwin, D., and Shenoy, P.: Predicting solar generation from weather forecasts using machine learning, in: Smart Grid Communications (SmartGridComm), 2011 IEEE International Conference on, pp. 528–533, IEEE, 2011.
Slingo, J. and Palmer, T.: Uncertainty in weather and climate prediction, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 369, 4751–4767, 2011.
Steinhaeuser, K., Ganguly, A. R., and Chawla, N. V.: Multivariate and multiscale dependence in the global climate system revealed through complex networks, Climate Dynamics, 39, 889–895, 2011.
Tantet, A. and Dijkstra, H. A.: An interaction network perspective on the relation between patterns of sea surface temperature variability and global mean surface temperature, Earth System Dynamics, 5, 1–14, 2014.
Tsonis, A. A. and Roebber, P.: The architecture of the climate network, Physica A: Statistical Mechanics and its Applications, 333, 497–504, 2004.
Tsonis, A. A. and Swanson, K. L.: What do networks have to do with climate?, Bulletin of the American Meteorological Society, 87, 585–595, 2006.
Wang, Y., Gozolchiani, A., Ashkenazy, Y., Berezin, Y., Guez, O., and Havlin, S.: Dominant imprint of Rossby waves in the climate network, Physical Review Letters, 111, 138501, 2013.
Yamasaki, K., Gazit, O., and Havlin, S.: Pattern of climate network blinking links follows El Niño events, EPL (Europhysics Letters), 83, 28005, 2008.
Yeh, S.-W., Kug, J.-S., Dewitte, B., Kwon, M.-H., Kirtman, B. P., and Jin, F.-F.: El Niño in a changing climate, Nature, 461, 511–514, 2009.
... In the following, we focus on forecasting, and highlight several cases where the climate network (24,(29)(30)(31) approach substantially improved the prediction of high-impact climate phenomena: 1) El Niño events (32)(33)(34)(35)(36)(37)(38), 2) droughts in the central Amazon (39), 3) extreme rainfall in the eastern Central Andes (40,41), 4) the Indian summer monsoon (42)(43)(44), and 5) extreme stratospheric polar vortex (SPV) states (45,46). ...
... Climate network-derived quantities have also shown predictive skill for El Niño/ENSO in other studies (34)(35)(36)(37)(38)57) and show that an upcoming El Niño provides early warning signals, which can be picked up by suitable climate networks. ...
Article
Network theory, as emerging from complex systems science, can provide critical predictive power for mitigating the global warming crisis and other societal challenges. Here we discuss the main differences of this approach to classical numerical modeling and highlight several cases where the network approach substantially improved the prediction of high-impact phenomena: 1) El Niño events, 2) droughts in the central Amazon, 3) extreme rainfall in the eastern Central Andes, 4) the Indian summer monsoon, and 5) extreme stratospheric polar vortex states that influence the occurrence of wintertime cold spells in northern Eurasia. In this perspective, we argue that network-based approaches can gainfully complement numerical modeling.
... To forecast El Niño events, many state-of-the-art coupled climate models, as well as a variety of statistical approaches B Josef Ludescher josef.ludescher@pik-potsdam.de 1 (Cane et al. 1986;Penland and Sardeshmukh 1995;Tziperman et al. 1997;Fedorov et al. 2003;Galanti et al. 2003;Kirtman 2003;Chen et al. 2004;Palmer et al. 2004;Luo et al. 2008;Chen and Cane 2008;Chekroun et al. 2011;Saha et al. 2014;Chapman et al. 2015;Feng et al. 2016;Lu et al. 2016;Rodriguez-Mendez et al. 2016;Meng et al. 2018;Noteboom et al. 2018;Ham et al. 2019;DeCastro et al. 2020;Petersik and Dijkstra 2020;Hassanibesheli et al. 2022), have been suggested, and monthly updated overviews of the latest operational forecasts (consisting of 17 dynamical and 9 statistical methods) are available from the International Research Institute for Climate and Society (IRI 2023a). While these forecasts are quite successful at shorter lead times, they have limited anticipation power at larger lead times. ...
Article
Full-text available
El Niño episodes are part of the El Niño-Southern Oscillation (ENSO), which is the strongest driver of interannual climate variability, and can trigger extreme weather events and disasters in various parts of the globe. Previously we have described a network approach that allows to forecast El Niño events about 1 year ahead. Here we evaluate the real-time forecasts of this approach between 2011 and 2022. We find that the approach correctly predicted (in 2013 and 2017) the onset of both El Niño periods (2014-2016 and 2018-2019) and generated only 1 false alarm in 2019. In June 2022, the approach correctly forecasted the onset of an El Niño event in 2023. For determining the p-value of the 12 real-time forecasts, we consider 2 null hypotheses: (a) random guessing where we assume that El Niño onsets occur randomly, and (b) correlated guessing where we assume that in the year an El Niño ends, no new El Niño will start. We find pa≅0.005\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p_a\cong 0.005$$\end{document} and pb≅0.015\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$p_b\cong 0.015$$\end{document}, this way rejecting both the null hypotheses that the same quality of the forecast can be obtained by chance. We also discuss how the network algorithm can be further improved by systematically reducing the number of false alarms. For 2024, the method indicates the absence of a new El Niño event.
... The type precursor does not forecast the onset of an El Niño event by itself, thus it is most useful when combined with other methods that forecast the onset of an El Niño event, irrespective of its type (see, e.g., refs. [55][56][57][58][59][60][61][62][63][64][65][66][67][68][69][70] ). Some of these methods are available even before the spring predictability barrier, see, e.g., refs. ...
Article
Full-text available
El Niño events represent anomalous episodic warmings, which can peak in the equatorial Central Pacific (CP events) or Eastern Pacific (EP events). The type of an El Niño (CP or EP) has a major influence on its impact and can even lead to either dry or wet conditions in the same areas on the globe. Here we show that the difference of the sea surface temperature anomalies between the equatorial western and central Pacific in December enables an early forecast of the type of an upcoming El Niño ( p -value < 10 ⁻³ ). Combined with a previously introduced climate network-based approach that allows to forecast the onset of an El Niño event, both the onset and type of an upcoming El Niño can be efficiently forecasted. The lead time is about 1 year and should allow early mitigation measures. In December 2022, the combined approach forecasted the onset of an EP event in 2023.
... As deep learning techniques have developed, researchers have began to design neural networks for predicting weather elements (e.g., rainfall), which can well mine complex and intrinsic correlations, such as artificial neural networks (ANN) Feng et al. Feng et al. Feng et al. (2016) propose two methods to predict the existence of ENSO, and the time evolution of ENSO scalar features, which provided a new prediction direction for predicting the occurrence for ENSO events. Broni-Bedaiko et al. Broni-Bedaiko et al. (2019) used the LSTM networks for multi-step advance prediction of ENSO events, which complemented the pr ...
Article
Full-text available
The El Niño-Southern Oscillation (ENSO) is a quasi-periodic climate type that occurs near the equatorial Pacific Ocean. Extreme periods of this climate type can cause terrible weather and climate anomalies on a global scale. Therefore, it is critical to accurately, quickly, and effectively predict the occurrence of ENSO events. Most existing research methods rely on the powerful data-fitting capability of deep learning which does not fully consider the spatio-temporal evolution of ENSO and its quasi-periodic character, resulting in neural networks with complex structures but a poor prediction. Moreover, due to the large magnitude of ocean climate variability over long intervals, they also ignored nearby prediction results when predicting the Niño 3.4 index for the next month, which led to large errors. To solve these problem, we propose a spatio-temporal transformer network to model the inherent characteristics of the sea surface temperature anomaly map and heat content anomaly map along with the changes in space and time by designing an effective attention mechanism, and innovatively incorporate temporal index into the feature learning procedure to model the influence of seasonal variation on the prediction of the ENSO phenomenon. More importantly, to better conduct long-term prediction, we propose an effective recurrent prediction strategy using previous prediction as prior knowledge to enhance the reliability of long-term prediction. Extensive experimental results show that our model can provide an 18-month valid ENSO prediction, which validates the effectiveness of our method.
... Liu Kefeng et al. [39] also found that the multi-step hierarchical prediction method based on the combination of support vector machine and wavelet decomposition method can effectively predict the time series of sea temperature anomalies. Feng et al. [40] proposed a toolbox "climatelearn", combined with some machine learning methods, to predict the occurrence of El Niño and Niño3.4 indices. In 2016, in terms of ENSO forecasting, the zero-mean random error model of ICM was proposed [41], called the ensemble-mean model, which showed better results than the deterministic ICM on ENSO forecasting. ...
Article
Full-text available
Climate disasters such as floods and droughts often bring heavy losses to human life, national economy, and public safety. El Niño/Southern Oscillation (ENSO) is one of the most important inter-annual climate signals in the tropics and has a global impact on atmospheric circulation and precipitation. To address the impact of climate change, accurate ENSO forecasts can help prevent related climate disasters. Traditional prediction methods mainly include statistical methods and dynamic methods. However, due to the variability and diversity of the temporal and spatial evolution of ENSO, traditional methods still have great uncertainty in predicting ENSO. In recent years, with the rapid development of artificial intelligence technology, it has gradually penetrated into all aspects of people’s lives, and the climate field has also benefited. For example, deep learning methods in artificial intelligence can automatically learn and train from a large amount of sample data, obtain excellent feature representation, and effectively improve the performance of various learning tasks. It is widely used in computer vision, natural language processing, and other fields. In 2019, Ham et al. used a convolutional neural network (CNN) model in ENSO forecasting 18 months in advance, and the winter ENSO forecasting skill could reach 0.64, far exceeding the dynamic model with a forecasting skill of 0.5. The research results were regarded as the pioneering work of deep learning in the field of weather forecasting. This paper introduces the traditional ENSO forecasting methods and focuses on summarizing the various latest artificial intelligence methods and their forecasting effects for ENSO forecasting, so as to provide useful reference for future research by researchers.
... ;Feng et al. (2016);Meng et al. (2018)). Indeed, when we analyze our network structure for each year, we find that the SWAS-index is locally minimized in the previous calendar year of theEl Niño onset in 17 out of the 20 El Niño events, which occurred during the period of 1950-2018, with only 3 exceptions, including, 1997, 2006 and 2009 [See Fig. S14]. ...
Article
Full-text available
Despite the development of sophisticated statistical and dynamical climate models, a relative long-term and reliable prediction of the Indian summer monsoon rainfall (ISMR) has remained a challenging problem. Towards achieving this goal, here we construct a series of dynamical and physical climate networks based on the global near surface air temperature field. We uncover that some characteristics of the directed and weighted climate networks can serve as efficient long-term predictors for ISMR forecasting. The developed prediction method produces a forecasting skill of 0.54 (Pearson correlation) with a 5-month lead-time by using the previous calendar year’s data. The skill of our ISMR forecast is better than that of operational forecasts models, which have, however, quite a short lead-time. We discuss the underlying mechanism of our predictor and associate it with network-ENSO and ENSO-monsoon connections. Moreover, our approach allows predicting the all India rainfall, as well as the different Indian homogeneous regions’ rainfall, which is crucial for agriculture in India. We reveal that global warming affects the climate network by enhancing cross-equatorial teleconnections between the Southwest Atlantic, the Western part of the Indian Ocean, and the North Asia-Pacific region, with significant impacts on the precipitation in India. A stronger connection through the chain of the main atmospheric circulations patterns benefits the prediction of the amount of rainfall. We uncover a hotspot area in the mid-latitude South Atlantic, which is the basis for our predictor, the South-West Atlantic Subtropical Index ( SWAS-index ). Remarkably, the significant warming trend in this area yields an improvement of the prediction skill.
Chapter
This chapter provides a review of data-driven problems in geoscience, with a special focus on the sub-field of seismology. Geoscience phenomena are often studied using data-driven models based on various types of data that are monitored and sensed with sophisticated equipment. The large amounts of gathered data make the field very attractive for incorporating machine learning and deep learning techniques, both for advancing research challenges and for promoting societal benefits such as hazard prediction and the preservation of natural resources. In the field of seismology, the tasks include seismic event detection, localization, and classification. These have been approached with machine learning, mainly supervised methods, since the 1990s. Nowadays, deep learning architectures are applied to model larger amounts of seismic data in order to provide fast and accurate event detection and classification solutions that are incorporated into real-time analysis systems. We review the notable research trends and developments in the field and conclude by discussing advantages, drawbacks, and possible future research directions.
Conference Paper
In recent years, machine learning has played an important role in analyzing various weather conditions around the world, especially over the Indian subcontinent. Data covering many years are available on government websites for analysis. These data are matched with a UCI technique for machine learning to obtain conditions at different levels of the data repository. Temperature and humidity conditions under different parameters were analyzed as an example of weather monitoring. We design a model to fit the different conditions, extrapolate the required information, optimize the technique using an algorithm, and analyze the targeted variations and values.
Article
Full-text available
To improve El Niño–Southern Oscillation (ENSO) amplitude and type forecasts, we propose a model based on a deep residual convolutional neural network with few parameters. We leverage dropout and transfer learning to overcome the challenge of insufficient data in the model training process. By applying the dropout technique, the model effectively predicts the Niño3.4 index at a lead time of 20 months during the 1984–2017 evaluation period, three months more than the existing optimal model. Moreover, with homogeneous transfer learning this model precisely predicts the Oceanic Niño Index up to 18 months in advance. Using heterogeneous transfer learning, this model achieved 83.3% accuracy in forecasting the 12-month-lead El Niño type. These results suggest that our proposed model can enhance ENSO prediction performance.
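For reference, the Niño3.4 index that this model predicts is the area-averaged SST anomaly over 5°S–5°N, 170°W–120°W, and the Oceanic Niño Index is its 3-month running mean. A minimal sketch (grid layout, 0–360° longitude convention, and function names are assumptions):

```python
import numpy as np

def nino34_index(sst_anom, lats, lons):
    """Area-weighted mean SST anomaly over the Nino3.4 box
    (5S-5N, 170W-120W, i.e. 190E-240E on a 0-360 grid).
    sst_anom has shape (len(lats), len(lons))."""
    lat_mask = (lats >= -5.0) & (lats <= 5.0)
    lon_mask = (lons >= 190.0) & (lons <= 240.0)
    box = sst_anom[np.ix_(lat_mask, lon_mask)]
    # cosine-of-latitude weights, broadcast over longitudes
    w = np.cos(np.deg2rad(lats[lat_mask]))[:, None] * np.ones(lon_mask.sum())
    return float((box * w).sum() / w.sum())

def oceanic_nino_index(nino34_monthly):
    """3-month running mean of the monthly Nino3.4 series (the ONI)."""
    return np.convolve(nino34_monthly, np.ones(3) / 3.0, mode="valid")
```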
Article
Full-text available
El Niño events, characterized by anomalous warming in the eastern equatorial Pacific Ocean, have global climatic teleconnections and are the most dominant feature of cyclic climate variability on subdecadal timescales. Understanding changes in the frequency or characteristics of El Niño events in a changing climate is therefore of broad scientific and socioeconomic interest. Recent studies (1-5) show that the canonical El Niño has become less frequent and that a different kind of El Niño has become more common during the late twentieth century, in which warm sea surface temperatures (SSTs) in the central Pacific are flanked on the east and west by cooler SSTs. This type of El Niño, termed the central Pacific El Niño (CP-El Niño; also termed the dateline El Niño (2), El Niño Modoki (3) or warm pool El Niño (5)), differs from the canonical eastern Pacific El Niño (EP-El Niño) in both the location of maximum SST anomalies and tropical-midlatitude teleconnections. Here we show changes in the ratio of CP-El Niño to EP-El Niño under projected global warming scenarios from the Coupled Model Intercomparison Project phase 3 multi-model data set (6). Using calculations based on historical El Niño indices, we find that projections of anthropogenic climate change are associated with an increased frequency of the CP-El Niño compared to the EP-El Niño. When restricted to the six climate models with the best representation of the twentieth-century ratio of CP-El Niño to EP-El Niño, the occurrence ratio of CP-El Niño/EP-El Niño is projected to increase as much as five times under global warming. The change is related to a flattening of the thermocline in the equatorial Pacific.
Article
Full-text available
In this paper, we present Par@Graph, a software toolbox to reconstruct and analyze complex climate networks having a large number of nodes (up to at least 10^6) and edges (up to at least 10^12). The key innovation is an efficient set of parallel software tools designed to leverage the inherited hybrid parallelism in distributed-memory clusters of multi-core machines. The performance of the toolbox is illustrated through networks derived from sea surface height (SSH) data of a global high-resolution ocean model. Less than 8 min are needed on 90 Intel Xeon E5-4650 processors to reconstruct a climate network, including the preprocessing and the correlation of 3 × 10^5 SSH time series, resulting in a weighted graph with the same number of vertices and about 3.2 × 10^8 edges. In less than 14 min on 30 processors, the resulting graph's degree centrality, strength, connected components, eigenvector centrality, entropy and clustering coefficient metrics were obtained. These results indicate that a complete cycle to construct and analyze a large-scale climate network is available under 22 min. Par@Graph therefore facilitates the application of climate network analysis on high-resolution observations and model results, by enabling fast network reconstruction from the calculation of statistical similarities between climate time series. It also enables network analysis at unprecedented scales on a variety of different sizes of input data sets.
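The construction/analysis cycle that Par@Graph parallelizes reduces, at its core, to correlating all pairs of time series, thresholding, and computing graph metrics. A serial toy version for a handful of series (threshold choice and names are illustrative, not Par@Graph's API):

```python
import numpy as np

def build_climate_network(series, threshold=0.5):
    """Correlate all pairs of grid-point time series (rows of
    `series`), keep a link wherever |Pearson r| exceeds the
    threshold, and return the adjacency matrix plus per-node
    degree centrality."""
    corr = np.corrcoef(series)       # n x n correlation matrix
    adj = np.abs(corr) > threshold   # boolean adjacency
    np.fill_diagonal(adj, False)     # drop self-loops
    degree = adj.sum(axis=1)
    return adj, degree
```

Par@Graph's contribution is making exactly this cycle feasible when n is 10^5-10^6 grid points rather than a handful.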
Article
Full-text available
We introduce the pyunicorn (Pythonic unified complex network and recurrence analysis toolbox) open source software package for applying and combining modern methods of data analysis and modeling from complex network theory and nonlinear time series analysis. pyunicorn is a fully object-oriented and easily parallelizable package written in the language Python. It allows for the construction of functional networks such as climate networks in climatology or functional brain networks in neuroscience, representing the structure of statistical interrelationships in large data sets of time series, and, subsequently, for investigating this structure using advanced methods of complex network theory such as measures and models for spatial networks, networks of interacting networks, node-weighted statistics or network surrogates. Additionally, pyunicorn provides insights into the nonlinear dynamics of complex systems as recorded in uni- and multivariate time series from a non-traditional perspective by means of recurrence quantification analysis (RQA), recurrence networks, visibility graphs and construction of surrogate time series. The range of possible applications of the library is outlined, drawing on several examples mainly from the field of climatology.
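As one concrete example of the recurrence-based methods pyunicorn bundles, the basic object of recurrence quantification analysis is the binary recurrence matrix. A minimal scalar-series sketch (pyunicorn's own classes handle embeddings and the full metric set; this is only an illustration):

```python
import numpy as np

def recurrence_matrix(x, eps):
    """R[i, j] = 1 if |x_i - x_j| < eps, else 0: the state at time i
    'recurs' at time j whenever the two values are eps-close."""
    d = np.abs(x[:, None] - x[None, :])
    return (d < eps).astype(int)

def recurrence_rate(R):
    """Fraction of recurrent pairs, excluding the trivial diagonal."""
    n = R.shape[0]
    return (R.sum() - n) / (n * (n - 1))
```

RQA measures such as determinism and laminarity are then read off from the diagonal and vertical line structures of this matrix.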
Article
Full-text available
A new methodology based on complex network analysis is applied to state-of-the-art climate model simulations to assess their performance, quantify uncertainties, and uncover changes in global linkages between past and future projections. Network properties of modeled sea surface temperature and precipitation over 1956–2005 are constrained towards observations or reanalyses, and their differences quantified using two metrics. Projected changes from 2051 to 2300 under the scenario with the highest representative and extended concentration pathways (RCP8.5 and ECP8.5) are then determined. The networks of the models that reproduce the major climate modes well in the recent past change little during this century. In contrast, among those models the uncertainties in the projections after 2100 are substantial, and are primarily associated with divergences in the representation of the modes of variability, particularly of the El Niño Southern Oscillation, and their connectivity, and therefore with their intrinsic predictability, more so than with differences in the mean-state evolution.
Article
Full-text available
Most of the existing prediction methods gave a false alarm regarding the El Niño event in 2014. A crucial aspect is currently limiting the success of such predictions, namely the stability of the slowly varying Pacific climate. This property determines whether sea surface temperature perturbations will be amplified by coupled ocean-atmosphere feedbacks or not. The so-called Bjerknes stability index has been developed for this purpose, but its evaluation is severely constrained by data availability. Here we present a new promising background stability index based on complex network theory. This index efficiently monitors the changes in spatial correlations in the Pacific climate and can be evaluated using only sea surface temperature data.
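A drastically simplified stand-in for such a correlation-monitoring index is the mean absolute pairwise correlation of the SST records in a region (the published index uses a more elaborate network statistic; this sketch only illustrates the idea of tracking spatial-correlation strength from SST data alone):

```python
import numpy as np

def spatial_correlation_index(sst):
    """Mean absolute pairwise Pearson correlation of the SST time
    series (rows of `sst`), excluding each series' trivial
    self-correlation. Values near 1 indicate a tightly coupled,
    spatially coherent state of the region."""
    c = np.abs(np.corrcoef(sst))
    n = c.shape[0]
    return (c.sum() - n) / (n * (n - 1))
```

Evaluated in a sliding window, such an index rises and falls as the spatial coherence of the Pacific SST field changes.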
Article
Full-text available
The second version of the NCEP Climate Forecast System (CFSv2) was made operational at NCEP in March 2011. This version has upgrades to nearly all aspects of the data assimilation and forecast model components of the system. A coupled reanalysis was made over a 32-yr period (1979-2010), which provided the initial conditions to carry out a comprehensive reforecast over 29 years (1982-2010). This was done to obtain consistent and stable calibrations, as well as skill estimates for the operational subseasonal and seasonal predictions at NCEP with CFSv2. The operational implementation of the full system ensures a continuity of the climate record and provides a valuable up-to-date dataset to study many aspects of predictability on the seasonal and subseasonal scales. Evaluation of the reforecasts shows that the CFSv2 increases the length of skillful MJO forecasts from 6 to 17 days (dramatically improving subseasonal forecasts), nearly doubles the skill of seasonal forecasts of 2-m temperatures over the United States, and significantly improves global SST forecasts over its predecessor. The CFSv2 not only provides greatly improved guidance at these time scales but also creates many more products for subseasonal and seasonal forecasting, with an extensive set of retrospective forecasts for users to calibrate their forecast products. These retrospective and real-time operational forecasts will be used by a wide community of users in their decision-making processes in areas such as water management for rivers and agriculture, transportation, energy use by utilities, wind and other sustainable energy, and seasonal prediction of the hurricane season.