Modelling Financial Time Series


Financial time series, in general, exhibit average behaviour at “long” time scales and stochastic behaviour at “short” time scales. As in statistical physics, the two have to be modelled using different approaches: deterministic for trends and probabilistic for fluctuations about the trend. In this talk, we will describe a new wavelet-based approach to separate the trend from the fluctuations in a time series. A deterministic (non-linear regression) model is then constructed for the trend using a genetic algorithm. We thereby obtain an explicit analytic model describing the dynamics of the trend. The model is further used to make predictions of the trend. We also study statistical and scaling properties of the fluctuations. The fluctuations have a non-Gaussian probability distribution function and show multiscaling behaviour. Thus, our work results in a comprehensive model of the trends and fluctuations of a financial time series.
P. Manimaran1, J.C. Parikh1, P.K. Panigrahi1, S. Basu2, C.M. Kishtawal
and M.B. Porecha1
1Physical Research Laboratory, Ahmedabad 380 009, India.
2Space Applications Centre, Ahmedabad 380 015, India.
1 Introduction
In this presentation, I shall report on the work we have done to study financial
time series.
If one observes time series of stock prices, currency exchange rates or
commodity prices, one finds that they share some characteristic
features. They exhibit smooth mean behaviour (cyclic trends) at long time
scales and fluctuations about the smooth trend at much shorter time scales.
As illustrative examples, values of the S&P CNX Nifty index of the NSE at
30-second and daily intervals are shown in Figs. 1 and 2, respectively.
Note also that the time series are not stationary.
In order to model [2] series having these features, we follow the standard
approach in statistical physics. More precisely, we consider mean (trend) to
have deterministic behaviour and fluctuations to have stochastic behaviour.
Accordingly, we model them by quite different methods – smooth trend by
deterministic dynamics and fluctuations by probabilistic methods.
Fig. 1. Nifty high frequency data sampled every 30 seconds for 10 days in January,
1999. (Dates: January 1, 4, 5, 6, 7, 8, 11, 12, 13, 14)
Fig. 2. The daily closing price of S&P CNX Nifty index for the period November
3, 1995-April 11, 2000
In view of this, we first separate mean behaviour from fluctuations in the
data. Once this is done, we can decompose a time series {X(i), i = 1, ..., N} as
$$X(i) = X_m(i) + X_f(i), \qquad i = 1, 2, \ldots, N \tag{1}$$
where Xm(i) is the mean component and Xf(i) the fluctuating component.
To carry out this separation we have proposed [3] and used discrete wavelet
transforms (DWT) [4], a method well suited to non-stationary data. Section 2
describes in brief some basic ideas of DWT and how it is used to determine the
smooth component {Xm(i)}. The fluctuating component {Xf(i)} is obtained
by subtracting {Xm(i)} from {X(i)} (see Eq. (1)). We have also compared [3]
our procedure with other methods [5] of extracting fluctuations, by studying
scaling properties of a computer-generated non-stationary series.
In Section 3, we model the smooth series {Xm(i); i = 1, 2, ..., N} in the
form of a general non-linear regression equation. The form of the “best”
non-linear function is determined by fitting the data with the help of a
genetic algorithm (GA) [6]. Out-of-sample predictions are made using the model.
Section 4 contains a discussion of statistical properties of the fluctuations
{Xf(i); i=1,2, ...N }. Finally, a summary of our work and some concluding
remarks are given in Section 5.
2 Discrete wavelets – separation of trend from fluctuations
Wavelet transforms [4] decompose a signal S(t) into components that
simultaneously provide information about its local time and frequency
characteristics. Therefore, they are very useful for studying non-stationary
signals. We use discrete wavelets in our work because one can obtain a complete
orthonormal basis set of strictly finite size. This gives significant
mathematical and computational advantages. In the present work, we have used
Coiflet-2 (Cf-2) wavelets [4].
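In wavelets one starts from basis functions, the father wavelet φ(t) (scaling
function) and the mother wavelet ψ(t) [4]. These functions obey
$\int \phi(t)\,dt = A$ and $\int \psi(t)\,dt = 0$, where A is a constant.
Scaling and translation of wavelets lead to $\phi_k = \phi(t-k)$ and
$\psi_{j,k} = 2^{j/2}\,\psi(2^j t - k)$, which satisfy the orthogonality
conditions $\langle\phi_k|\phi_{k'}\rangle = \delta_{k,k'}$,
$\langle\phi_k|\psi_{j,k'}\rangle = 0$ and
$\langle\psi_{j,k}|\psi_{j',k'}\rangle = \delta_{j,j'}\,\delta_{k,k'}$.
Any signal belonging to $L^2(R)$ can be written in the form
$$S(t) = \sum_{k} c_k\,\phi_k(t) + \sum_{k}\sum_{j \ge 0} d_{j,k}\,\psi_{j,k}(t) \tag{2}$$
where the $c_k$'s are the low-pass coefficients and the $d_{j,k}$'s are the
high-pass coefficients.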
We next describe the way we get the smooth trend series starting from the
data series.
A forward Cf-2 transformation is first applied to the time series
{X(i); i = 1, 2, ..., N}. This gives N wavelet coefficients. Of these, N/2 are
low-pass coefficients that describe the average behaviour locally and the other
N/2 are high-pass coefficients corresponding to local fluctuations. In order to
obtain the mean (smooth) series, we set the high-pass coefficients to zero and
then carry out an inverse Cf-2 transformation. This results in a “level one”
trend series in which fluctuations at the smallest time scale have been
filtered out. One can repeat the entire process on the “level one” series to
filter out fluctuations at the next higher time scale to get a “level two”
trend series, and so on. Fig. 3 shows the actual series {X(i)} and the
(Cf-2, level 4) trend series {Xm(i)} of the NASDAQ composite index. Having
determined {Xm(i)}, the fluctuation series {Xf(i)} is obtained using Eq. (1).
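The separation procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses the Haar wavelet as a simple stand-in for the Cf-2 wavelet used in this work, it requires the series length to be divisible by 2^level, and the function names (`haar_forward`, `haar_inverse`, `wavelet_trend`, `split_trend_fluctuation`) are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_forward(x):
    """One level of the orthonormal Haar DWT: returns (low_pass, high_pass)."""
    half = len(x) // 2
    low = [(x[2 * k] + x[2 * k + 1]) / SQRT2 for k in range(half)]
    high = [(x[2 * k] - x[2 * k + 1]) / SQRT2 for k in range(half)]
    return low, high

def haar_inverse(low, high):
    """Invert one Haar level from its low-pass and high-pass coefficients."""
    x = []
    for a, d in zip(low, high):
        x.append((a + d) / SQRT2)
        x.append((a - d) / SQRT2)
    return x

def wavelet_trend(x, level):
    """Trend series with the high-pass coefficients of the `level` finest
    scales set to zero (Haar stand-in, not the Cf-2 wavelet of the text)."""
    if level < 1 or len(x) % (2 ** level) != 0:
        raise ValueError("len(x) must be divisible by 2**level")
    approx = [float(v) for v in x]
    for _ in range(level):
        approx, _detail = haar_forward(approx)  # discard high-pass part
    for _ in range(level):
        approx = haar_inverse(approx, [0.0] * len(approx))
    return approx

def split_trend_fluctuation(x, level):
    """Decompose X(i) = Xm(i) + Xf(i) as in Eq. (1)."""
    trend = wavelet_trend(x, level)
    fluctuation = [xi - ti for xi, ti in zip(x, trend)]
    return trend, fluctuation
```

With the Haar wavelet the level-L trend is simply the block average over 2^L points; a smoother wavelet such as Cf-2 gives a smoother trend but follows the same forward, zero, inverse pattern.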
Fig. 3.
It is important to test the validity of our method. For this purpose, we have
studied [3] scaling behaviour of fluctuations of computer generated Binomial
Multi-Fractal (BMF) series [7].
The qth-order fluctuation function Fq(s) [5] is obtained by squaring and
averaging fluctuations over Ms forward and Ms backward segments:
$$F_q(s) \equiv \left\{ \frac{1}{2M_s} \sum_{b=1}^{2M_s} \left[ F^2(b,s) \right]^{q/2} \right\}^{1/q} \tag{3}$$
Here, q is the order of the moment, which takes real values, and s is the
scale. The above procedure is repeated for variable window sizes and for
different values of q (except q = 0). The scaling behaviour is obtained by
analyzing the fluctuation function,
$$F_q(s) \sim s^{h(q)} \tag{4}$$
on a logarithmic scale for each value of q. If the order q = 0, direct
evaluation of Eq. (3) leads to divergence of the scaling exponent. In that
case, logarithmic averaging has to be employed to find the fluctuation function:
$$F_0(s) \equiv \exp\left\{ \frac{1}{4M_s} \sum_{b=1}^{2M_s} \ln\left[ F^2(b,s) \right] \right\} \sim s^{h(0)} \tag{5}$$
As is well known, if the time series is mono-fractal, the h(q) values are
independent of q. For multi-fractal time series, h(q) values depend on q.
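As a minimal sketch of how Eqs. (3) and (5) might be evaluated, the code below takes the list of squared segment fluctuations F2(b, s) at one scale s (all 2Ms forward and backward segments) and returns Fq(s). The scaling exponent h(q) is then estimated as the least-squares slope of log Fq(s) against log s. The function names are hypothetical, and the computation of F2(b, s) itself is assumed to have been done beforehand.

```python
import math

def fluctuation_function(f2_segments, q):
    """q-th order fluctuation function from the squared segment
    fluctuations F2(b, s), b = 1..2*Ms, at a single scale s."""
    n = len(f2_segments)  # n = 2*Ms segments
    if q == 0:
        # Logarithmic averaging, Eq. (5): the prefactor 1/(4*Ms) equals 1/(2*n).
        return math.exp(sum(math.log(v) for v in f2_segments) / (2.0 * n))
    # Eq. (3): average of [F2]^(q/2), then the 1/q power.
    return (sum(v ** (q / 2.0) for v in f2_segments) / n) ** (1.0 / q)

def scaling_exponent(scales, fq_values):
    """Least-squares slope of log Fq(s) versus log s, an estimate of h(q)."""
    xs = [math.log(s) for s in scales]
    ys = [math.log(f) for f in fq_values]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx
```

The q = 0 branch is the limit of the general expression as q tends to zero, which is why the two branches agree for very small |q|.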
In Table 1, we compare exactly known analytical values of the scaling
exponent hex(q) of BMF with the ones obtained by our wavelet-based method
(hw(q)) and those of the earlier [5] method (hp(q)) using local polynomial fit.
Note that the wavelet-based approach gives excellent results: hw(q) differs
from hex(q) by less than 0.01 for q ≥ 1 and for q ≤ -4, and the largest
deviation (about 0.05) occurs at q = -2.
Table 1. The h(q) values of the binomial multi-fractal (BMF) series computed
analytically (hex(q)), through MF-DFA (hp(q)) and through the wavelet approach
(hw(q)). The Db-8 wavelet has been used.

  q     hex(q)    hp(q)     hw(q)
-10     1.9000    1.9304    1.8991
 -9     1.8889    1.9184    1.8879
 -8     1.8750    1.9032    1.8740
 -7     1.8572    1.8837    1.8560
 -6     1.8337    1.8576    1.8319
 -5     1.8012    1.8210    1.7981
 -4     1.7544    1.7663    1.7473
 -3     1.6842    1.6783    1.6641
 -2     1.5760    1.5397    1.5218
 -1     1.4150    1.3939    1.3828
  0     0         1.2030    1.2163
  1     1.0000    0.9934    1.0091
  2     0.8390    0.8312    0.8453
  3     0.7309    0.7234    0.7359
  4     0.6606    0.6538    0.6649
  5     0.6139    0.6075    0.6177
  6     0.5814    0.5753    0.5848
  7     0.5578    0.5519    0.5610
  8     0.5400    0.5343    0.5430
  9     0.5261    0.5205    0.5290
 10     0.5150    0.5095    0.5178
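For readers who wish to check the analytical column, the sketch below evaluates the standard closed-form expression for the binomial cascade, h(q) = 1/q − ln[a^q + (1 − a)^q] / (q ln 2). Both this formula and the parameter value a = 0.75 are assumptions that are not stated in the text; they are chosen because they reproduce the hex(q) entries for q ≠ 0 to four decimal places. The function name `bmf_exact_h` is hypothetical.

```python
import math

def bmf_exact_h(q, a=0.75):
    """Analytical h(q) of a binomial multi-fractal series with
    parameter a (assumed value 0.75; the expression is singular at q = 0)."""
    if q == 0:
        raise ValueError("the closed-form expression is singular at q = 0")
    return 1.0 / q - math.log(a ** q + (1.0 - a) ** q) / (q * math.log(2.0))

# Print a few values for comparison with the hex(q) column of Table 1.
for q in (-10, -1, 1, 2, 10):
    print(q, round(bmf_exact_h(q), 4))
```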
We have also evaluated the scaling exponent h(q) for the NASDAQ data.
This is shown in Fig. 4, where h(q) is a non-linear function of q, indicating
that the fluctuations have a multi-fractal nature.
Fig. 4. Scaling exponent for the NASDAQ fluctuations
3 Model of the trend series
We now construct a deterministic model for the smooth series {Xm(i)} shown
in Fig. 3. For convenience of notation, we define
$$y(i) \equiv X_m(i), \qquad i = 1, 2, \ldots, N \tag{6}$$
Further, we assume that y(i) is a non-linear function of d previous values of
the variable y. More precisely, we have
$$y(i) = F\big(y(i-1), y(i-2), \ldots, y(i-d)\big), \qquad i = d+1, d+2, \ldots, N \tag{7}$$
where F is an unknown non-linear function and the number of lags d is
also not known. It is worth pointing out that, in the study of chaotic time
series [8], using the method of time delays to reconstruct the state space, d
is actually the embedding dimension of the attractor. We determine its value
from false nearest neighbour analysis [8]. For the smoothened NASDAQ series
of Fig. 3, the embedding dimension d = 4. Therefore,
$$y(i) = F\big(y(i-1), y(i-2), \ldots, y(i-4)\big), \qquad i = 5, \ldots, N \tag{8}$$
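To make the structure of Eqs. (7) and (8) concrete, the following sketch builds the lagged input vectors and the corresponding targets from a series. It is only an illustration; the function name `lagged_samples` is hypothetical, and the list index is 0-based whereas the equations count from 1.

```python
def lagged_samples(y, d):
    """Return (inputs, targets) for a map y(i) = F(y(i-1), ..., y(i-d)).

    inputs[k] holds the d previous values, most recent first, and
    targets[k] is the value the map is expected to reproduce.
    """
    inputs, targets = [], []
    for i in range(d, len(y)):
        inputs.append(tuple(y[i - k] for k in range(1, d + 1)))
        targets.append(y[i])
    return inputs, targets

# With d = 4, the first usable target is the fifth point of the series.
inputs, targets = lagged_samples([10, 11, 12, 13, 14, 15], 4)
print(inputs[0], targets[0])
```

Any candidate function F can then be scored by comparing F(*inputs[k]) with targets[k] over the training range.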
We now use a genetic algorithm (GA) [6] to obtain the function F that best
represents the deterministic map of the trend series. Following the GA
approach of ref. [6], we begin by constructing 200 random equation strings.
These strings contain random sequences of the four basic arithmetic symbols
(+, −, ×, ÷), the values of the variable y at earlier times and real-number
constants. This choice implies that, in the GA framework, the search for the
function F is restricted to ratios of polynomials, in a way similar
to the Padé approximation. The equation strings are tested on a training set
consisting of the first 700 data points of the trend series. A measure of
fitness is defined to evaluate the performance of each string. The strings are
then ordered in decreasing value of fitness and combined pairwise, beginning
with the fittest. We retain 50 such pairs, which in turn reproduce so that each
pair has four offspring. The first two are identical to the two parents and the
remaining two are formed by interchanging parts of the parents. A small
percentage of the elements in the strings are mutated. At this stage fitness is
again computed and the entire cycle of ordering, combining, reproducing and
mutating strings is repeated. This is continued for 5000 iterations.
The map that results for our data after carrying out this procedure has
the form
$$y(i) = \frac{y(i-4)\,\big(y(i-1)\big)^2}{\big(y(i-3)\big)^3}, \qquad i = 5, \ldots, 700 \tag{9}$$
The fitness was 0.9.
Clearly, if the map is a good representation of the dynamics, we ought to
have good out of sample predictions. These are shown in Fig. 5.
Fig. 5.
We get very promising results: the mean error is below 0.1% and sign
mismatches occur in only about 5% of the predictions. Similar results were
also obtained for the BSE Sensex.
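The two quality measures quoted above can be computed along the following lines. The precise definitions are not given in the text, so this sketch assumes that the mean error is the average absolute relative error and that a sign mismatch means the predicted change from the previous value has the opposite sign to the actual change. Both function names are hypothetical.

```python
def mean_relative_error(actual, predicted):
    """Average of |predicted - actual| / |actual| (assumed definition)."""
    return sum(abs(p - a) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

def sign_mismatch_fraction(previous, actual, predicted):
    """Fraction of steps where the predicted move from the previous value
    points in the opposite direction to the actual move (assumed definition)."""
    mismatches = 0
    for prev, a, p in zip(previous, actual, predicted):
        if (a - prev) * (p - prev) < 0:
            mismatches += 1
    return mismatches / len(actual)
```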
It is worth stressing that these are one-step-ahead predictions. If
the map (Eq. (9)) is used iteratively to make dynamic predictions, the
error grows very quickly. This suggests that the long-term dynamics is not
captured by the map.
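The distinction between the two prediction modes can be illustrated with a toy example. The map used here is plain linear extrapolation, a hypothetical stand-in chosen for simplicity and not the map of Eq. (9); the function names are likewise illustrative. One-step predictions always start from observed values, whereas iterated predictions feed earlier predictions back into the map.

```python
def one_step_predictions(series, d, model):
    """Predict each y(i) from the d observed values that precede it."""
    return [model(series[i - d:i][::-1]) for i in range(d, len(series))]

def iterated_predictions(history, d, model, steps):
    """Predict `steps` values ahead, feeding predictions back as inputs."""
    window = list(history[-d:])
    out = []
    for _ in range(steps):
        nxt = model(window[::-1])  # lags ordered most recent first
        out.append(nxt)
        window = window[1:] + [nxt]
    return out

def linear_extrapolation(lags):
    """Hypothetical stand-in map: y(i) = 2*y(i-1) - y(i-2)."""
    return 2 * lags[0] - lags[1]

series = [1, 2, 4, 7, 11, 16]
print(one_step_predictions(series, 2, linear_extrapolation))
print(iterated_predictions(series[:2], 2, linear_extrapolation, 4))
```

In this toy case the one-step error stays at 1 for every point, while the iterated error grows as 1, 3, 6, 10, which mirrors qualitatively the behaviour reported above.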
4 Statistical properties of fluctuations
These properties have been extensively studied and reported in literature (e.g.,
see refs. [1]-[2]). All the same, for the sake of completeness, we show in the
figures below:
(i) Probability distribution function (PDF) of returns (Fig. 6)
(ii) Auto-correlation function of returns (Fig. 7)
(iii) Auto-correlation function of absolute value of returns (Fig. 8)
Fig. 6 shows that the PDF is not Gaussian – it has skewness = -0.07
and kurtosis = 8.27. Note that for a Gaussian PDF these reduced cumulants
are zero. Figs. 7 and 8 together show that the binary correlations go to zero
quickly but the “volatility” correlations persist for a long time. This means
that fluctuations are not independent.
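The quantities discussed above can be estimated with a few lines of code. This is a sketch under assumptions: returns are taken to be logarithmic returns (the text does not define them), kurtosis is reported as excess kurtosis so that it vanishes for a Gaussian, and the function names are hypothetical.

```python
import math

def log_returns(prices):
    """Logarithmic returns ln(p[t+1] / p[t]) (assumed definition of returns)."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def skewness_and_excess_kurtosis(x):
    """Reduced third and fourth cumulants; both are zero for a Gaussian."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

def autocorrelation(x, lag):
    """Sample auto-correlation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag))
    return cov / var
```

Applying `autocorrelation` to the returns and to their absolute values, for a range of lags, gives curves of the kind shown in Figs. 7 and 8.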
As mentioned earlier, these fluctuation properties are well-known to occur
in financial markets [1,2].
Fig. 6. Probability distribution function (PDF) of returns
Fig. 7. Auto-correlation function of returns
Fig. 8. Auto-correlation function of absolute value of returns
5 Summary and future work
Observation of financial market data suggests that the dynamics of the market
may be viewed as a superposition of long term deterministic trend and short
scale random fluctuations.
We have used discrete wavelet transforms to separate the two parts and
developed suitable models.
The trend has been modelled in the form of a non-linear auto-regressive
map. An explicit analytic functional form for the map is found using GA. It
fits the training data with a fitness of 0.9 and gives single-step
out-of-sample predictions with a mean error below 0.1%. Multi-step (iterated)
predictions, however, deteriorate very quickly.
Statistical and scaling properties of fluctuations were determined and as
expected were consistent with earlier findings [1,2].
Regarding further work in this direction, a major challenge is to apply
GA methodology to arrive at a model with capability of making long term
predictions of the trend series.
In conclusion, we have made a beginning towards simultaneously
modelling the trend and the fluctuations of a time series.
