Abstract
The forecasting of variance-covariance matrices is an important issue. In recent years an increasing body of literature has focused on multivariate models to forecast this quantity. This paper develops a nonparametric technique for generating multivariate volatility forecasts from a weighted average of historical volatility and a broader set of macroeconomic variables. As opposed to traditional techniques, where the weights decay solely as a function of time, this approach employs a kernel weighting scheme in which historical periods exhibiting conditions most similar to those at the time the forecast is formed attract the greatest weight. It is found that the proposed method leads to superior forecasts, with macroeconomic information playing an important role.
NCER Working Paper Series

A Kernel Technique for Forecasting the Variance-Covariance Matrix

Ralf Becker
Adam Clements
Robert O'Neill

Working Paper #66
October 2010
A Kernel Technique for Forecasting the Variance-Covariance Matrix

Ralf Becker†, Adam Clements‡ and Robert O'Neill†

† Economics, School of Social Sciences, University of Manchester
‡ School of Economics and Finance, Queensland University of Technology

October 28, 2010

Corresponding author:
Ralf Becker
Economics, School of Social Sciences
University of Manchester
email: ralf.becker@manchester.ac.uk
Ph +44 (0)161 275 4807
1 Introduction
The forecasting of variance-covariance matrices (VCMs) is an important issue in finance, having applications in portfolio selection and risk management as well as being directly used in the pricing of several financial assets. In recent years an increasing body of literature has developed multivariate models to forecast this matrix; these include the DCC of Engle and Sheppard (2001), the VARFIMA model of Chiriac and Voev (2009) and Riskmetrics of J.P. Morgan (1996). All of these models can be used to forecast the VCM of a portfolio and all do so using data relating only to the performance of the stocks under consideration.

Previous studies, focusing on modelling the volatility of single assets, have identified economic variables that may influence the variance of returns and attempted to utilise such variables in forecasting. For example, Aït-Sahalia and Brandt (2001) investigate which of a range of factors influence stock volatility. In this paper we bring this approach to a multivariate setting, as we introduce a technique for utilising macroeconomic variables when forecasting the VCM of a number of stocks.
Advances in multivariate volatility modelling are complicated by the requirement that all forecasts of VCMs must be positive-definite and symmetric, restrictions which have also made it difficult for models to incorporate macroeconomic information. Multivariate models have also encountered estimation problems as the dimensions of the VCM are allowed to grow. The technique presented here satisfies all of the required constraints and can be applied to a relatively large portfolio of stocks.
In this paper we introduce a non-parametric approach to forecasting that is similar in technique to the Riskmetrics approach. However, while the latter delivers a weighted average of past realized VCMs with the weights determined by the distance of observations to the time at which the forecast is made, we introduce a weighting approach that allows a wider range of variables to determine the weights. We allow statistics that measure how similar matrices are, and macroeconomic information (such as interest rates), to be used in addition to the time information used in the Riskmetrics approach. Technically, we use a multivariate kernel to obtain such weights. This approach builds on the work of Clements and Becker (2010), who show in a univariate setting that employing kernels to determine weight structures dependent on the similarity of volatility observations through time can improve forecast accuracy when compared to more established methods. As the method essentially calculates VCM forecasts as weighted averages of past VCMs it guarantees symmetry and positive-definiteness by construction.
We apply our proposed methods to a large real-life dataset. As the method makes use of a potentially large set of exogenous information it is impossible to devise a representative simulation setup that could serve to establish the usefulness of the proposed forecasting tool. We therefore provide a careful forecasting experiment in which we compare our method with other models. The results of this forecasting experiment are promising in that they establish that our nonparametric approach is able to produce forecasts of the VCM that can be statistically superior. Interestingly, we can demonstrate that macroeconomic data are critical to this improvement, providing evidence that these variables contain important information for predicting the behaviour of the VCM.
The rest of this paper is organised as follows. Section 2 introduces some notation and assumptions. Section 3 reviews the literature surrounding multivariate modelling, nonparametric econometrics and the relationship between macroeconomic variables and stock return volatility. Section 4 describes how Riskmetrics, a popular volatility forecasting tool, can be viewed as a kernel approach based on time. Section 5 outlines how our model uses kernel techniques to obtain forecasts of the VCM using a wider range of data, while Section 6 introduces the variables we include in our model. Section 7 outlines our forecasting experiment, Section 8 presents and discusses the results, and Section 9 concludes and notes further areas of interest.
2 Notation and Assumptions

The model we present is used to forecast the volatility of stock returns in an $n$ stock portfolio. For any given day $t$, the $(n \times 1)$ vector of returns is denoted by $r_t = (r_{1t}, \ldots, r_{nt})'$, where $r_{it}$ is the return on stock $i$ on day $t$, and we assume that given all information available at time $t-1$, $\mathcal{F}_{t-1}$, the mean is unforecastable, i.e. $E(r_t \mid \mathcal{F}_{t-1}) = 0$. The object of interest is the $(n \times n)$ positive-definite variance-covariance matrix of returns, $\mathrm{Var}(r_t \mid \mathcal{F}_{t-1}) = \Sigma_t$, which we assume to be time-varying, predictable and, although unobserved, consistently estimated by a realized variance-covariance matrix $V_t$. Generally in this paper $\Sigma$ represents the actual VCM, $V$ is an observed realized value of the VCM, calculated from intraday data, and $H$ is used to denote a forecast of the matrix.
3 Literature Review
In this paper we use nonparametric econometrics to produce forecasts of a variance-covariance matrix. This approach has previously been used when forecasting univariate volatility in Clements et al. (2010), in which forecasts are a weighted average of historical values of realized volatility. The weights are increased when the pattern of historical volatility behaviour is similar to that around the time at which the forecast is made. Clements et al. (2010) show that at a 1 day forecast horizon such an approach performs well against competing volatility forecasting techniques.
What distinguishes this approach from many other VCM forecasting models is the role played by the time between a past VCM observation and the period in which a forecast is made. In general, the weight given to past observations decreases for observations further in the past. An important example of this is RiskMetrics, described in J.P. Morgan (1996), a popular method of forecasting the VCM using an exponentially weighted moving average (EWMA) approach. Gijbels, Pope and Wand (1999) show that this approach can be interpreted as a kernel approach in which weights on historical observations are determined by the lag at which a realization was observed. Fleming, Kirby and Ostdiek (2003) also used an EWMA weighting scheme, similar to that used in Riskmetrics, to show that, judged on economic performance, weighted averages of realized covariance matrices can improve forecasts of volatility compared to methods using daily returns.
The performance of the model proposed in this paper is compared to currently available multivariate forecasting models, which have proliferated in number in recent years. The most popular multivariate model is the dynamic conditional correlation model (DCC) of Engle (2002) and Engle and Sheppard (2001), which allows correlations between stocks to vary according to a GARCH type process. Other interesting recent models include the regime switching dynamic correlation model (RSDC) of Pelletier (2006), which assumes that correlations/covariances change depending on the state of the world. Colacito, Engle and Ghysels (2007) introduce a model which captures long run correlation behaviour using a MIDAS approach and short term behaviour via the DCC; however, such an approach requires a complex system of restrictions in order to ensure positive definiteness of the VCM. Engle and Kelly (2008) introduce the dynamic equicorrelation model, which assumes that correlations between all stocks have the same value, but that this value changes through time. All of these models share the fact that they must impose restrictions, either on parameter values or in their setup, to ensure that estimates and forecasts obtained from them are positive-definite and symmetric.
An interesting new approach to VCM modelling is that of Chiriac and Voev (2009), who introduce a VARFIMA model for the behaviour of the elements of the Cholesky decomposition of the VCM; however, this suffers from the problem that interpretation of the elements of the decomposition is difficult and incorporating additional variables is far from straightforward.
In this paper we introduce a nonparametric technique for obtaining VCM forecasts using a multivariate kernel approach that encompasses the Riskmetrics method as a special case. This extends the univariate approach of Clements et al. (2010) to the multivariate context. Importantly, we show how variables other than time delay can be used in a multiplicative kernel weighting.

Our contribution depends heavily on contributions made in the nonparametric, kernel estimation literature. It is well known that when estimating densities or conditional expectations by means of kernel methods the properties of the resulting estimators will depend on the choice of kernels and, more importantly, the choice of bandwidth used in the kernel estimators. Plug-in bandwidth rules have been proposed, but it has also been recognised that these may be inadequate when data do not meet strict assumptions (Silverman, 1986, Bowman, 1997). Here we will use cross-validation methods (Bowman, 1984, and Rudemo, 1982) to find optimal bandwidths.
Application of cross-validation will also facilitate the selection of relevant variables to be used in the kernel weighting algorithm. One class of variables considered for the kernel weighting algorithm are scalar transformations of matrices, as they can be used to establish the closeness of matrices. The idea is to give higher weight to past observations that relate to times when the VCM was similar to the current VCM (regardless of how distant that observation is). Moskowitz (2003) proposes three statistics to evaluate the closeness of VCMs. The first metric compares the matrices' eigenvalues, the second looks at the relative differences between the individual matrix elements and the third considers how many of the correlations have the same sign in the matrices. Taken together these three metrics can be used to determine the level of similarity between two VCMs. Other functions used to compare matrices, often called loss functions, have been discussed in the literature (Laurent, Rombouts and Violante, 2009). One such loss function is the Stein distance, also known as the MVQLIKE function. This loss function is shown to perform well in discriminating between VCM forecasts in Clements, Doolan, Hurn and Becker (2009) and Laurent, Rombouts and Violante (2010) and represents another useful tool for comparing VCMs.
The second class of variables used in the kernel weighting algorithm are variables carrying information on the state of the economy. The basic idea is to give past VCMs larger weights in the forecast if the macroeconomic conditions are similar to those prevalent at the time of the forecast formation. Aït-Sahalia and Brandt (2001) investigate factors influencing stock volatility and propose dividend yield, default spreads and term spreads as factors. This builds on existing work which identifies term spreads and default spreads as potential drivers of stock volatility processes. Campbell (1987), Fama and French (1989) and Harvey (1991) investigate the relationship between term spreads and volatility, while Fama and French (1989), Whitelaw (1994) and Schwert (1989) consider a volatility-default spread relationship. In addition, Harvey (1989) considers the impact of default spreads on covariances. Hence there is an established literature relating these variables to the behaviour of elements of a VCM.
Empirical evidence in Schwert (1989), Hamilton and Lin (1996) and Campbell, Lettau, Malkiel and Xu (2001) suggests that during market downturns/recessions stock return volatility can be expected to increase. Based on these findings we propose to use an algorithm, such as that detailed in Pagan and Sossounov (2003), to identify periods in which the stock market is upbeat, as VCMs in such periods may have common characteristics. Commodity prices, such as gold (Sjaastad and Scacciavillani, 1996) and oil (Sadorsky, 1999, and Hamilton, 1996) prices, have also been linked to stock market volatility and are therefore considered here as potential variables to contribute to the kernel weighting functions.
The final variable used in this paper is implied volatility, namely the VIX index of the Chicago Board Options Exchange. This is often interpreted as the market's view on future stock market volatility. This measure has been used in the context of univariate volatility forecasting (Poon and Granger, 2003, Blair, Poon and Taylor, 2001) and is here considered as another variable in the multivariate kernel weighting scheme.
4 Riskmetrics as a Kernel Approach
In this section we restate a result by Gijbels, Pope and Wand (1999) establishing that a Riskmetrics-type exponential smoothing forecast can be represented as a univariate kernel forecast in which weights vary with time. This provides a special case of the more general methodology introduced in Section 5, in which we introduce a multivariate kernel that potentially utilises the variables listed in the previous section.
In a multivariate setting, the variance-covariance matrix forecast $H_{T+1}$ at time $T$, given by the standard Riskmetrics equation, is
$$H_{T+1} = \lambda H_T + (1 - \lambda)\, r_T r_T' \qquad (1)$$
when observations are equally spaced in time and $\lambda$ is a smoothing parameter, $0 < \lambda < 1$, commonly set at a value recommended in J.P. Morgan (1996). From recursive substitution, and with $H_1 = r_1 r_1'$, the forecast of the VCM can be expressed as
$$H_{T+1} = (1 - \lambda) \sum_{j=0}^{T-1} \lambda^j \, r_{T-j} r_{T-j}' \qquad (2)$$
The sum of the weights is thus equal to $1 - \lambda^T$ and, as noted in Gijbels, Pope and Wand (1999), this approaches 1 as we allow $T$ to approach infinity. However, in order to normalise the sum of the weights to be exactly 1 we restate the Riskmetrics model as
$$H_{T+1} = \frac{\sum_{j=0}^{T-1} \lambda^j \, r_{T-j} r_{T-j}'}{\sum_{j=0}^{T-1} \lambda^j} \qquad (3)$$
We can now reformulate (3) as a kernel (more accurately a half kernel, as it is zero for $T+1, T+2, \ldots$). Defining $h = -1/\log(\lambda)$ and $K(u) = \exp(u)\, 1_{u \leq 0}$, we can restate (3) as
$$H_{T+1} = \frac{\sum_{t=1}^{T} K\!\left(\frac{t-T}{h}\right) r_t r_t'}{\sum_{t=1}^{T} K\!\left(\frac{t-T}{h}\right)} = \sum_{t=1}^{T} W_{rm,t} V_{rm,t} \qquad (4)$$
From this we replicate the conclusion of Gijbels, Pope and Wand (1999) that Riskmetrics is a zero degree local polynomial kernel estimate with bandwidth $h$. From a practical point of view the Riskmetrics kernel determines weights, $W_{rm,t} = K\!\left(\frac{t-T}{h}\right) / \sum_{t=1}^{T} K\!\left(\frac{t-T}{h}\right)$, based on how close observations of $V_{rm,t} = r_t r_t'$ are to time $T$, the period at which a forecast is being made. The largest weight is attached to the observation at time $T$ and the weights decrease according to an exponentially weighted smoothing pattern. In the remainder of this paper we aim to expand such an approach by including factors other than time in our estimation of kernel weights.
5 Multivariate Kernel Methodology
In this section we present the method by which we obtain the kernel and subsequent forecasts of the VCM. The inputs to our model are a set of $p$ variables, which we believe to contain information relevant to forecasting the VCM, and a time series of realized variance-covariance matrices. Calculation of the $n \times n$ realized variance-covariance matrix, $V_t$, is a non-trivial issue. Here we compute it using standard methods from the realized (co)variance literature and we assume that $V_t$ is positive definite. The method used to calculate the matrices used in the rest of this paper is described in Section 7.2.

At time $T$ we wish to obtain a forecast of the $d$-step ahead VCM, which is the matrix describing variances and covariances over the time period $T+1$ to $T+d$, denoted by $H^{(d)}_{T+d}$. We obtain our forecast by taking a weighted combination of historical VCMs, hence
$$H^{(d)}_{T+d} = \sum_{t=1}^{T-d} W_t V^{(d)}_{t+d}. \qquad (5)$$
As our forecast is a weighted combination of symmetric, positive definite matrices, $H^{(d)}_{T+d}$ also has these properties and so is a valid covariance matrix. Ensuring that forecasts of the variance-covariance matrix are positive definite is rarely so straightforward, and models usually have to employ parameter restrictions or decompositions of $V_t$ in order to ensure this.

The focus of much of the remainder of this section is the method by which we determine the optimal weights $W_t$ to use in (5). In order to ensure that the weights sum to one we impose the following normalisation:
$$W_t = \frac{\omega_t}{\sum_{i=1}^{T-d} \omega_i}. \qquad (6)$$
This allows Equation (5) to be interpreted as a weighted average, ensuring an appropriate scaling for $H^{(d)}_{T+d}$.
We now explain how to determine $\omega_t$ using kernel estimation techniques. The idea underpinning the approach is to determine which of the past time periods had conditions most similar to those at the time we make the forecast, $T$. We then place more weight on the VCMs that occurred over the $d$ periods following the dates that were most similar to time $T$.

We determine the similarity of other time periods to time $T$ using $p$ variables, collected in a $T \times p$ data matrix $x$, and employ a multivariate kernel to calculate the raw weight applicable to day $t$, hence
$$\omega_t = \prod_{j=1}^{p} K_j(x_{t,j}, x_{T,j}; h_j) \qquad (7)$$
where $x_{T,j}$ is the element from the $T$th row and $j$th column of the data matrix and $h_j$ is the bandwidth for the $j$th variable.
For continuous variables $K_j(x_{t,j}, x_{T,j}; h_j)$ is the standard normal density kernel (Silverman, 1986, and Bowman, 1997), defined as
$$K_j(x_{t,j}, x_{T,j}; h_j) = (2\pi)^{-0.5} \exp\!\left[ -\frac{1}{2} \left( \frac{x_{T,j} - x_{t,j}}{h_j} \right)^2 \right].$$
(We normalise continuous variables before applying the kernel function.)
In the case of a discrete dummy variable, such as a bull/bear market dummy, we use the discrete univariate kernel proposed in Aitchison and Aitken (1976). The form of the kernel is
$$K_j(x_{t,j}, x_{T,j}; h_j) = \begin{cases} 1 - h_j & \text{if } x_{t,j} = x_{T,j} \\ h_j/(s_j - 1) & \text{if } x_{t,j} \neq x_{T,j} \end{cases} \qquad (8)$$
where $s_j$ is the number of possible values the discrete variable can take ($s_j = 2$ in the case of the bull/bear market variable). In the two-state discrete case $h_j \in [0, 0.5]$. If $h_j = 0.5$ the value of the discrete variable has no impact on the forecast, while if $h_j = 0$ we disregard data points which do not share the same discrete variable value as $x_{T,j}$.
In addition to the discrete and continuous kernels we use a third approach when we include time as one of the $p$ variables. In that case
$$K_j(x_{t,j}, x_{T,j}; h_j) = \frac{h_j^{T-t}}{\sum_{q=1}^{T} h_j^{T-q}} \qquad (9)$$
which has the same structure as the Riskmetrics approach in Equation (3). However, here we allow a flexible bandwidth, $h_j \in [0, 1]$, as opposed to a prespecified value as in J.P. Morgan (1996).
As we are using a multiplicative kernel, the time kernel suggested in Equation (9) is problematic: $K_j(\cdot)$, with increasing $(T-t)$, will quickly decline towards zero, which will make the value of $\omega_t$ in (7) approach zero. This implies that, in effect, observations with sufficiently large $(T-t)$ will be ignored regardless of how similar the macroeconomic and VCM characteristics are to the point of forecast. We therefore propose an alternative scheme where
$$K_j(x_{t,j}, x_{T,j}; h_j) = \frac{h_j^{T-t}}{\sum_{q=1}^{T} h_j^{T-q}} + 1. \qquad (10)$$
This minor alteration ensures that the weights decline to a value of one. There is still an increased weight on more recent observations; however, less recent time periods are not ignored because of the effects of the time kernel. We present results using both versions of the time kernel in order to demonstrate the impact of such an approach.
While the general approach presented through Equations (5), (6) and (7) has the Riskmetrics approach as a special case (using $V_t = r_t r_t'$ rather than a realized VCM), it introduces a significant amount of additional flexibility by allowing the weights $W_t$ to be determined from a set of $p$ variables.
5.1 Choice of Bandwidth

The choice of bandwidth is a non-trivial issue in nonparametric econometrics; however, a common rule of thumb quoted for multivariate density estimation is
$$h_j = \left( \frac{4}{(p+2)\,T} \right)^{\frac{1}{p+4}} \sigma_j$$
where $\sigma_j$ is the standard deviation of the $j$th variable. Although this rule of thumb provides a simple method for choosing bandwidths, as noted in Wand and Jones (1995) these bandwidths may be sub-optimal.
Importantly, if one were to optimise (using cross-validation) the bandwidth parameters, the optimised values, $h_j$, would contain information on whether the $j$th weighting variable contributes significant information to the optimal weights $W_t$. As noted in Li and Racine (2007, pp. 140-141), irrelevant (continuous) variables are associated with $h_j = \infty$. For binary variables (and a kernel as in Equation (8)) and a time variable (and a kernel as in Equation (10)) the bandwidths $h_j = 0.5$ and $h_j = 1$ respectively represent irrelevant variables.

Cross-validation is a bandwidth optimisation strategy introduced in Rudemo (1982) and Bowman (1984). It selects bandwidths to minimise the mean integrated squared error (MISE) of density estimates and is generally recommended as the method of choice in the context of nonparametric density and regression analysis (Wand and Jones, 1995, Li and Racine, 2007). As we are interested in forecast performance rather than density estimation, we obtain bandwidths which minimise the MVQLIKE of our forecasts rather than the MISE.
MVQLIKE is a robust loss function for the comparison of matrices. Let $H^{(1)}_t = H_t$ denote the $(n \times n)$-dimensional 1 period ahead forecast of the VCM at time $t$ and $V^{(1)}_t = V_t$ the realized VCM at time $t$. (The following argument is, for notational ease, made for 1 period ahead forecasts, but the extension to $d$ period forecasts is straightforward. Initially we also suppress the dependence of $H_t$ on $h$, the $(p \times 1)$ vector of bandwidths.) The loss function is calculated as
$$MVQLIKE(H_t) = \mathrm{tr}(H_t^{-1} V_t) - \log \left| H_t^{-1} V_t \right| - n. \qquad (11)$$
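Translated directly into code, Equation (11) could read as follows (a minimal Python sketch, not the authors' implementation):

import numpy as np

def mvqlike(H, V):
    """MVQLIKE (Stein distance) of Equation (11): zero when the forecast H equals
    the realized VCM V, positive otherwise."""
    n = V.shape[0]
    A = np.linalg.solve(H, V)                 # computes H^{-1} V without inverting H
    return np.trace(A) - np.linalg.slogdet(A)[1] - n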
This is the criterion function to be minimised in our cross-validation approach. Consider that we have data available up to and including time period $T$ and we aim to forecast the VCM for $T+1$. The available data over time periods 1 to $T$ can be used to identify the optimal bandwidths for use in forecasting. This is done by evaluating $K$ ($< T$) forecasts for periods $T-K+1$ to $T$. The initial $T-K$ observations (we set $T-K = 300$, which means that every forecast is based on a minimum of 300 observations) are used to produce the first forecast $H_{T-K+1}$. For any period $\tau$, $T-K+1 \leq \tau \leq T$, the forecast $H_\tau$ is based on observations of the weighting variables available at time $\tau - 1$.

Having obtained these forecasts we select bandwidths to minimise the mean of MVQLIKE over these in-sample forecasts:
$$CVMVQ(h) = \frac{1}{K} \sum_{\tau = T-K+1}^{T} MVQLIKE(H_\tau(h)) \qquad (12)$$
where dependence on $h$, the $(p \times 1)$ vector of bandwidths, is now made explicit. The bandwidths that minimise (12) are then used in Equations (5), (6) and (7) in order to forecast $H_{T+1}$.

The optimised bandwidth values should carry information on which of the $p$ variables contribute significantly to the determination of the weights in Equation (5). Li and Racine (2007) suggest that a cross-validation approach in the context of a multivariate kernel regression should, asymptotically, deliver bandwidth estimates that approach their "irrelevant" values discussed above ($h_j = \infty$, $h_j = 0.5$ and $h_j = 1$ respectively for continuous, binary and time variables). They suggest that, therefore, there is no need to eliminate irrelevant variables.
When following this strategy we encountered significant difficulties in the optimisation process; in particular, our nonlinear bandwidth optimisation was unable to identify an optimum. We therefore recommend an alternative strategy which eliminates irrelevant variables and identifies optimal bandwidths only for the remaining variables. The elimination of variables is achieved as follows. Each variable is used as the only variable determining the kernel weights. We find the optimal bandwidth, $\tilde{h}_j$, for each variable by minimising the criterion in (12). The optimal $CVMVQ(\tilde{h}_j)$ is then compared to $CVMVQ_R$, which is obtained by forming VCM forecasts from averaging all available past VCMs. The rationale is that a relevant variable should deliver improvements compared to a simple average. This is illustrated in Figure 1 for a forecast horizon of $d = 1$ (similar illustrations for 5 and 22 day forecast horizons offer no additional insight and so are not presented here). The dashed lines indicate $CVMVQ_R$ and the solid lines represent $CVMVQ(h_j)$. The minima of the latter identify the bandwidth $\tilde{h}_j$ that minimises $CVMVQ(h_j)$. Weighting variables that do not improve on $CVMVQ_R$ by at least 1% are then eliminated.
In order to obtain a handle on the size of this threshold we simulated 1000 random variables which were subsequently considered as potential weighting variables (and their $CVMVQ(\tilde{h}_{rv})$ calculated). As it turns out, a threshold of 1% would eliminate virtually all of these irrelevant random variables (a more conservative threshold of 2% left the results virtually unchanged and they are therefore not reported). Despite this, the threshold is essentially ad hoc and it is envisaged that future research may improve on this aspect of the proposed methodology.
Figure 1: Graphs of 1 day ahead CVMVQ against bandwidth values for 12 variables. The dashed line represents the CVMVQ from a rolling average forecaster. Descriptions of the variables used are provided in Section 6.
In short, the process of variable elimination and bandwidth optimisation can be summarised in the following three step procedure (a code sketch follows the list):

1. For each of the $p$ variables considered for inclusion in the multivariate kernel, apply cross validation to obtain the optimal bandwidth when only that variable is included in the kernel estimator. We refer to these as univariate optimised bandwidths $\tilde{h}_j$, $j = 1, \ldots, p$.

2. Compare the forecasting performance of the univariate optimised bandwidths from Step 1, $CVMVQ(\tilde{h}_j)$, against $CVMVQ_R$ from a simple average forecasting model. Any of the $p$ variables that fails to improve on the rolling average forecast performance by at least 1% is eliminated at this stage, as it is considered to have little value for forecasting. We are left with $p^* \leq p$ variables used as weighting variables.

3. Estimate the multivariate optimised bandwidths $h_j$ for the $p^*$ variables that are not eliminated in Step 2 by minimising the cross validation criterion in Equation (12). As opposed to Step 1, this optimisation is done simultaneously over all $p^*$ bandwidths.

Having obtained the optimised bandwidths from Step 3, we then forecast the VCM for the $d$ day-ahead time period ending at $T+d$ using Equations (5), (6) and (7).
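A rough sketch of Steps 1-3 is given below (Python; it reuses the mvqlike and kernel_vcm_forecast helpers sketched earlier, restricts itself to 1-day-ahead forecasts, and replaces the paper's numerical optimiser with a simple grid search, so the benchmark construction, the grids and K = 50 are illustrative assumptions).

import numpy as np
from itertools import product

def cv_mvq(V, X, kinds, bandwidths, K=50):
    """CVMVQ of Equation (12) for 1-day-ahead forecasts: mean MVQLIKE of the last
    K in-sample forecasts, each formed with data available one day earlier."""
    T = X.shape[0]
    losses = []
    for target in range(T - K, T):            # 0-based index of the day being forecast
        H = kernel_vcm_forecast(V[:target], X[:target], kinds, bandwidths, d=1)
        losses.append(mvqlike(H, V[target]))
    return float(np.mean(losses))

def select_variables_and_bandwidths(V, X, kinds, grids, K=50, threshold=0.01):
    """Steps 1-3: univariate screening against a simple-average benchmark,
    then a joint (coarse) grid search over the surviving bandwidths."""
    T, p = X.shape
    # CVMVQ_R benchmark: forecast each day with the average of all past VCMs
    bench = np.mean([mvqlike(V[:t].mean(axis=0), V[t]) for t in range(T - K, T)])
    keep = []
    for j in range(p):                        # Steps 1 and 2: one variable at a time
        best = min(cv_mvq(V, X[:, [j]], [kinds[j]], [h], K) for h in grids[j])
        if best < (1.0 - threshold) * bench:  # must beat the benchmark by at least 1%
            keep.append(j)
    # Step 3: joint grid search over the retained variables' bandwidths
    best_bw, best_loss = None, np.inf
    for bw in product(*[grids[j] for j in keep]):
        loss = cv_mvq(V, X[:, keep], [kinds[j] for j in keep], list(bw), K)
        if loss < best_loss:
            best_bw, best_loss = list(bw), loss
    return keep, best_bw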
6 Potential Variables

The approach outlined in the previous section illustrates how the $p^*$ relevant variables are identified from a list of $p$ variables initially considered to be potentially relevant. The $p^*$ selected variables then contribute to the calculation of the weights used in Equation (5). Here we describe the set of $p$ variables from which we select the $p^*$ variables considered to be relevant. The variables used can be classified in three categories. First, the time variable, as it is used in the Riskmetrics approach. This assumes that VCM observations close to the time period $T$ at which the forecast is made are more relevant than observations further back in time. The second class of potential weighting variables are measures of matrix closeness. In essence, the more similar a VCM at any time $t < T$ is to the VCM at time $T$, the larger should be the weight given to the associated observed VCM over the subsequent $d$ days in Equation (5). These measures of matrix closeness are discussed in Section 6.1. Finally we consider variables that can broadly be categorised as describing the prevalent economic circumstances at time $t$. Larger weight is to be given to a VCM if the associated macroeconomic conditions are similar to those prevalent at the time of forecast formation, $T$. Variables that fall into this category are described in Section 6.2.
6.1 VCM Comparison Variables

Moskowitz (2003) discusses a number of summary statistics that measure the difference between two matrices. We consider three of these statistics here. The first is the ratio of the eigenvalues of the (squared) VCM at time $t$ relative to those of the (squared) VCM at time $T$:
$$\frac{\sqrt{\mathrm{trace}(V_t' V_t)}}{\sqrt{\mathrm{trace}(V_T' V_T)}} \qquad (13)$$
Values close to 1 indicate that the matrices are similar to each other.
The second statistic adopted from Moskowitz (2003) evaluates the absolute elementwise differences between the two matrices $V_t$ and $V_T$. The sum of all absolute differences is standardised by the sum of all elements in $V_T$. The statistic is defined as
$$\frac{\iota' \left| V_T - V_t \right| \iota}{\iota' V_T \iota} \qquad (14)$$
where $\iota$ is an $n \times 1$ vector of ones. For identical matrices this statistic will take a value of 0.
A third metric suggested in Moskowitz (2003) is based on the realized correlation matrices $C_t$ and $C_T$. (The realized correlation matrices are calculated from $C_t = D_t^{-1} V_t D_t^{-1}$, where $D_t$ is an $(n \times n)$ diagonal matrix with $\sqrt{V_{iit}}$ as its $i$th diagonal element and $V_{iit}$ is the $(i,i)$ element of $V_t$.) The statistic compares how similar $C_t$ and $C_T$ are in relation to the average realized correlation matrix $\bar{C}$. Specifically we are concerned with the position of a particular correlation relative to its long-run average. $\mathrm{sign}(\mathrm{vech}(C_t - \bar{C})_i)$ delivers a positive (negative) sign if the realized correlation (of the $i$th unique element) at time $t$ is larger (smaller) than the relevant average correlation. The statistic considered here essentially calculates the proportion of the $m$ unique elements in $C_t$ that have identical deviations from the long-run correlations as those in $C_T$:
$$\frac{1}{m} \sum_{i=1}^{m} I\!\left\{ \mathrm{sign}\!\left(\mathrm{vech}(C_t - \bar{C})_i\right) = \mathrm{sign}\!\left(\mathrm{vech}(C_T - \bar{C})_i\right) \right\}. \qquad (15)$$
$I\{\cdot\}$ is an indicator taking the value of 1 when the statement inside the brackets is true and 0 otherwise, and $m = \frac{n}{2}(n-1)$ is the number of unique correlations in the $n \times n$ correlation matrix. If matrices are identical with respect to this measure the statistic will take the value 1.
We also compare VCM matrices using the MVQLIKE loss function (Laurent, Rombouts and Violante, 2009), due to it being a robust multivariate loss function. Here it is defined as
$$\mathrm{tr}\!\left(V_t^{-1} V_T\right) - \log \left| V_t^{-1} V_T \right| - n \qquad (16)$$
such that matrices which are identical will deliver a statistic of value 0.

These four statistics will determine the level of similarity between the VCMs at time $t$ and time $T$. The variable selection and bandwidth estimation strategy described previously will determine which of these variables are relevant for VCM forecasting.
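For concreteness, the four comparison statistics of Equations (13)-(16) could be computed along the following lines (a Python sketch; Cbar stands for the average realized correlation matrix and is assumed to be supplied by the caller).

import numpy as np

def eigen_ratio(Vt, VT):
    # Equation (13): ratio of the square roots of trace(V'V) (Frobenius norms)
    return np.sqrt(np.trace(Vt.T @ Vt)) / np.sqrt(np.trace(VT.T @ VT))

def abs_elementwise_diff(Vt, VT):
    # Equation (14): summed absolute differences scaled by the sum of V_T's elements
    return np.abs(VT - Vt).sum() / VT.sum()

def sign_agreement(Vt, VT, Cbar):
    # Equation (15): share of the unique correlations that deviate from the average
    # correlation matrix Cbar in the same direction at times t and T
    def corr(V):
        d = 1.0 / np.sqrt(np.diag(V))
        return V * np.outer(d, d)
    iu = np.triu_indices_from(Cbar, k=1)      # the m = n(n-1)/2 unique correlations
    return np.mean(np.sign(corr(Vt)[iu] - Cbar[iu]) == np.sign(corr(VT)[iu] - Cbar[iu]))

def mvqlike_distance(Vt, VT):
    # Equation (16): Stein distance between V_t and V_T, zero for identical matrices
    A = np.linalg.solve(Vt, VT)
    return np.trace(A) - np.linalg.slogdet(A)[1] - len(VT)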
6.2 Economic Variables

The variables introduced in this section attempt to identify economic variables that are potentially significant for VCM behaviour, based on findings in the existing literature. The first variable we introduce to our model is the term spread, as it was used in Aït-Sahalia and Brandt (2001), Campbell (1987), Fama and French (1989) and Harvey (1991) when investigating the time varying volatility of asset returns. This variable, as defined in Aït-Sahalia and Brandt (2001), is the difference between 1 and 10 year US government bond yields, and so if we define the yield on an $x$ year US government bond at time $t$ as $Y_{Gx,t}$, the term spread variable is
$$Y_{G10,t} - Y_{G1,t}. \qquad (17)$$
Aït-Sahalia and Brandt (2001), Fama and French (1989), Whitelaw (1994) and Schwert (1989) investigated the relation between return volatility and the default spread. The default spread measures the difference in yield between Moody's Aaa and Baa rated corporate bonds. Hence, defining the yields on Aaa and Baa rated bonds at time $t$ as $Y_{Aaa,t}$ and $Y_{Baa,t}$ respectively, the default spread variable is
$$Y_{Baa,t} - Y_{Aaa,t}. \qquad (18)$$
Both oil prices ($Oil_t$) and gold prices ($Gold_t$) have been shown to influence stock return volatility (Sjaastad and Scacciavillani, 1996, Sadorsky, 1999, and Hamilton, 1996); based on this we include both prices in our investigation.

Schwert (1989), Hamilton and Lin (1996) and Campbell et al. (2001) demonstrate that volatility increases during economic downturns. We therefore include a dummy variable identifying bull and bear market periods as described in Pagan and Sossounov (2003). (The algorithm identifies bull and bear periods based on monthly data, as daily data is often too noisy to support identification of broad trends. As a result, once the algorithm identifies a month as belonging to a bull/bear period, all of the constituent days are also assumed to belong to this period.) When applying this to data we use only the information available up until time $T$ in determining turning points between states of the market. We define the variable $Bull_t$ as having a value of one when the market is bullish and 0 otherwise.
As we are interested in the volatility of a stock portfolio it may also be useful to include a market measure of volatility in our list of potential variables. In order to do this we use the volatility index (VIX) quoted by the Chicago Board Options Exchange, $VIX_t$. This provides a measure of volatility implied by market prices and so may be useful as a guide to the level of volatility expected by the market.
In addition to these variables we include two variables which we expect to be irrelevant for the purpose of VCM forecasting. The spurious variables we use are the temperature in Dubai ($DUBAITEMP_t$), obtained from the University of Dayton's daily temperature archive (see http://www.engr.udayton.edu/weather/), and a random variable generated using a standard normal distribution random number generator ($RANDOM_t$). In the absence of a sensible simulation strategy that can evaluate the "size" or "power" of our approach, these variables are included as a sensibility check for the results produced. Any sensible methodology should eliminate such variables. As it turns out, the proposed methodology does indeed eliminate these two irrelevant variables at all forecasting horizons.
7 Forecasting Competition

The empirical application presented in this paper is designed to answer the following two questions. First, does the forecasting approach introduced in Section 5 compare favourably to more established forecasting techniques for high dimensional VCMs? Second, and more specifically, do the economic indicators discussed in Section 6.2 add valuable information to the process of VCM forecasting?

While one could think of a sensible Monte-Carlo setup to establish the answer to the first question, this seems an impossible task with respect to the second. One would have to devise a large multivariate system that jointly modelled stock returns and macroeconomic variables. It is likely that any results would be highly specific to the system devised, and therefore this paper is restricted to an empirical analysis of the questions posed above. In order to obtain sufficient information to address the two issues raised we apply the multivariate kernel approach to two sets of potential weighting variables. In one set of forecasts (MVK) the variable elimination and bandwidth optimisation strategy described in Section 5.1 is applied to the entire set of potential weighting variables. In a second set of forecasts (MVKnm) the weighting variables are restricted to come from a set that includes the time variable and the variables describing matrix similarities from Section 6.1. If the latter set, which includes no economic variables, does significantly worse than the first, we will consider this evidence that economic variables contain useful information in the context of VCM forecasting.

In Section 7.1 alternative forecasting models are introduced and Section 7.2 discusses data and estimation setup issues. The model confidence set methodology used to establish the statistical significance of our results is reviewed in Section 7.3. Results are presented in Section 8.
7.1 Models Included in the Forecast Comparisons

In addition to the forecasting model we propose above, we include the Dynamic Conditional Correlation (DCC) model of Engle and Sheppard (2001) and versions of the RiskMetrics method of J.P. Morgan (1996) in our MCS evaluations of forecast accuracy. Here we provide a brief summary of the models we use.

The DCC model is perhaps the most popular of recent models focused on the VCM of stock returns. In general it models the variances of individual stocks using a GARCH process and then applies a similar process to the correlation matrices. Hence, assuming returns are non-forecastable, the vector of returns $r_t$ is distributed as
$$r_t = \mu + \varepsilon_t, \qquad \varepsilon_t = H_t^{1/2} z_t, \qquad z_t \sim IID(0_n, I_n)$$
where $0_n$ is an $(n \times 1)$ vector of zeroes, $I_n$ is an $(n \times n)$ identity matrix and $H_t$ is the $(n \times n)$ variance covariance matrix. Each of the variances, which form the diagonal of $H_t$, is then modelled using a GARCH(1,1) process and the DCC models the underlying correlation matrix. Define $D_t$ as an $(n \times n)$ diagonal matrix with the $(n \times 1)$ vector of standard deviations of returns at time $t$ on the diagonal. We can then restate $H_t$ as
$$H_t = D_t R_t D_t. \qquad (19)$$
Here $R_t$ is the $(n \times n)$ correlation matrix describing how stocks move together on day $t$, with 1's on the diagonal and off-diagonal elements between -1 and 1. A further transformation is performed on this matrix in order to ensure these properties; the transformation is
$$R_t = Q_t^{*} Q_t Q_t^{*} \qquad (20)$$
$$Q_t = (1 - \alpha - \beta)\bar{Q} + \alpha\, \varepsilon_{t-1} \varepsilon_{t-1}' + \beta Q_{t-1}$$
$$Q_t^{*} = \mathrm{diag}(Q_t)^{-1/2}$$
where $\bar{Q}$ is the long run correlation matrix and $\alpha$ and $\beta$ are parameters determining how the correlations move through time. In order to ensure that correlations are stationary the restrictions $\alpha, \beta > 0$ and $\alpha + \beta < 1$ are imposed.
In order to obtain multiple horizon forecasts from the DCC we utilise one of a pair of suggestions from Engle and Sheppard (2001), which yields the $h$ step forecast equations
$$\hat{H}_{t+h} = \hat{D}_{t+h} \hat{R}_{t+h} \hat{D}_{t+h}$$
$$\hat{R}_{t+h} = \hat{Q}^{*}_{t+h} \hat{Q}_{t+h} \hat{Q}^{*}_{t+h}$$
$$\hat{Q}_{t+h} = \sum_{i=0}^{h-2} (1 - \alpha - \beta)\bar{Q}(\alpha + \beta)^i + (\alpha + \beta)^{h-1} \hat{Q}_{t+1}$$
where the elements of $\hat{D}_{t+h}$ are forecast from the models used for the univariate volatilities.

The DCC model can be difficult to estimate as $n$ increases. In the context of this paper ($n = 20$) we utilise the composite likelihood approach of Engle, Shephard and Sheppard (2008) to obtain estimates of $\alpha$ and $\beta$ (see Appendix B for details).
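The multi-step DCC forecast above could be coded roughly as follows (a Python sketch under the stated recursion; estimation of alpha, beta, Q-bar, Q_{t+1} and the univariate volatility forecasts entering D_hat is assumed to have been carried out elsewhere).

import numpy as np

def dcc_forecast(Q_bar, Q_next, D_hat, alpha, beta, h):
    """h-step-ahead DCC VCM forecast using the multi-step approximation above.

    Q_bar  : (n, n) long-run correlation matrix.
    Q_next : (n, n) one-step-ahead Q_{t+1} from the DCC recursion.
    D_hat  : (n, n) diagonal matrix of forecast standard deviations for day t+h.
    """
    ab = alpha + beta
    # Q_{t+h} = sum_{i=0}^{h-2} (1-a-b) Qbar (a+b)^i + (a+b)^(h-1) Q_{t+1}
    Q_h = (1 - ab) * Q_bar * sum(ab ** i for i in range(h - 1)) + ab ** (h - 1) * Q_next
    # rescale to a proper correlation matrix: R = diag(Q)^(-1/2) Q diag(Q)^(-1/2)
    q_star = np.diag(1.0 / np.sqrt(np.diag(Q_h)))
    R_h = q_star @ Q_h @ q_star
    return D_hat @ R_h @ D_hat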
Another popular method for forecasting the VCM of stock returns is the Riskmetrics forecasting model (J.P. Morgan, 1996) as described in Section 4. The weighting parameter $\lambda$ in (2) determines the weighting scheme, and it is recommended in J.P. Morgan (1996) that this be set at a value of 0.94 for daily data and 0.97 for monthly data. These values are used for the 1 and 22 day forecast comparisons respectively. There is no guidance for what $\lambda$ should be set to when using weekly data and so we follow Laws and Thompson (2005) in setting $\lambda = 0.95$ when obtaining forecasts for a five day ahead period. Forecasts from this model will be labeled RM.

The recommended values are the result of averaging optimal $\lambda$s over several different economic time-series models, not all of which will be representative for the data at hand. We therefore expect an optimised value of $\lambda$ to outperform the fixed recommendation. As well as adopting the above recommendations for $\lambda$ we, therefore, also include a version of Riskmetrics for which we optimise $\lambda$, choosing the value of $\lambda$ that minimises CVMVQ for in-sample estimates of the VCMs (RMopt).

We introduce one further adjustment to the Riskmetrics methodology. The new information entering the Riskmetrics forecast at time $T$ is the cross product of daily returns $r_T r_T'$ (see Equation 1). This can be interpreted as a very noisy proxy for the variance covariance structure at day $T$. It is well known that a less noisy proxy is the realized VCM $V_T$, and hence we propose (similar to Fleming, Kirby and Ostdiek, 2003) the following forecasting model (illustrated for a $d = 1$ day forecasting horizon; the generalisation is straightforward):
$$H_{T+1} = \lambda H_T + (1 - \lambda) V_T. \qquad (21)$$
As the series of $V_T$ has different properties compared to $r_T r_T'$, it is apparent that the fixed values of $\lambda$ recommended for the latter should not be applied here. We use the cross validation approach proposed above to find optimal values for the weighting parameter. In what follows, forecasts from this model are labeled RMvcm.
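A sketch of the RMvcm recursion of Equation (21), with a simple grid search standing in for the cross-validated choice of lambda (Python; the grid, the burn-in length and the reuse of the mvqlike helper from above are illustrative choices, not the paper's exact procedure):

import numpy as np

def rmvcm_path(V, lam):
    """Recursive forecasts of Equation (21): H[t] = lam * H[t-1] + (1 - lam) * V[t-1],
    so H[t] is the forecast of V[t] formed from realized VCMs up to index t-1.
    V : (T, n, n) array of daily realized VCMs."""
    H = np.empty_like(V)
    H[0] = V[0]                               # initialisation with the first realized VCM
    for t in range(1, len(V)):
        H[t] = lam * H[t - 1] + (1 - lam) * V[t - 1]
    return H

def optimise_lambda(V, grid=np.linspace(0.80, 0.99, 20), burn=300):
    """Pick the lambda that minimises the average MVQLIKE of in-sample forecasts."""
    best_lam, best_loss = None, np.inf
    for lam in grid:
        H = rmvcm_path(V, lam)
        loss = np.mean([mvqlike(H[t], V[t]) for t in range(burn, len(V))])
        if loss < best_loss:
            best_lam, best_loss = lam, loss
    return best_lam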
7.2 Data and Setup

The VCM forecasts included in our analysis all relate to a portfolio of 20 stocks listed on the NYSE over the period 28/11/1997-31/8/2006. A full list of the stocks used can be found in Appendix A at the end of this paper. The macroeconomic information used covers the same period. The information on term spreads (GVUS05(CM10)~U$, GVUS05(CM01)~U$), default spreads (DAAA, DBAA), oil prices (OILBREN) and gold prices (GOLBLN) was obtained from Datastream (the information in brackets gives the Datastream codes/names for the data series used to construct these variables). The bull and bear dummy variables were calculated using the algorithm suggested in Pagan and Sossounov (2003) based on monthly S&P 500 index prices. The algorithm was adjusted so that the values of the dummies were those which would have been calculated using the data available at the point in time at which we make our forecast. The VIX data was obtained from the Chicago Board Options Exchange (CBOE) website (see http://www.cboe.com/micro/vix/historical.aspx).

In versions of the model in which we make multiple day forecasts we take averages of the data over periods of the same length, with the exception of the bull and bear market variable, for which we take the value on the first day of the period.
The non-parametric approach described in this paper utilises realized variance covariance matrices compiled from intra-day price quotes. In order to compile our realized VCMs we use the following method. We obtain vectors of returns over the period between the market closing on day $t-1$ and opening on day $t$; we denote these as $r_{Ot}$. We also obtain vectors of returns over every 5 minute period during the time the market is open (as all 20 stocks are very liquid and frequently traded we do not anticipate any microstructure or non-synchronicity issues at a 5 minute sampling interval); hence, as the stocks are traded over the period 9:30-16:00, we obtain 55 intraday return vectors $r_{it}$, $i = 1, \ldots, 55$. In order to calculate a VCM for an entire 24 hour time period we use one of the methods for such calculations proposed in Hansen and Lunde (2005). We calculate the realized variance-covariance matrix for day $t$ as
$$V_t = r_{Ot} r_{Ot}' + \sum_{i=1}^{55} r_{it} r_{it}'. \qquad (22)$$
As the close to open returns on a stock represent a significant part of the risk of holding stocks it seems appropriate to include these in our forecasting approach. When forecasting over multiple days ($d = 5, 22$) we require $V^{(d)}_{t+d}$, which can be obtained from $V^{(d)}_{t+d} = \sum_{\tau=t+1}^{t+d} V_\tau$.
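Equation (22) and the d-day aggregation could be implemented along these lines (a Python sketch with illustrative array shapes):

import numpy as np

def realized_vcm(r_overnight, r_intraday):
    """Daily realized VCM of Equation (22).

    r_overnight : (n,) close-to-open return vector r_Ot.
    r_intraday  : (m, n) array of intraday (e.g. 5-minute) return vectors r_it.
    """
    V = np.outer(r_overnight, r_overnight)
    V += r_intraday.T @ r_intraday            # adds the sum of outer products r_it r_it'
    return V

def realized_vcm_multiday(V_daily, t, d):
    """d-day realized VCM V^(d)_{t+d}: the sum of the daily VCMs for days t+1, ..., t+d
    (t is a 1-based day index into the (T, n, n) array V_daily)."""
    return V_daily[t:t + d].sum(axis=0)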
The initial estimation period for all time horizons in the forecast competition results below consists of the first 936 datapoints. All forecasting periods are non-overlapping, leading to 1,266, 253 and 57 forecast periods for the 1, 5 and 22 day forecast horizons respectively. As our model makes use of instances in the past when conditions are similar to the forecast point, it seems logical to allow the model access to as much data as possible, and so we allow expanding estimation samples to be used in the compilation of forecasts. In order to ensure that the DCC is not hampered by data restrictions we also employ the expanding dataset in the estimation of the DCC parameters.

The cross-validation procedure for eliminating variables that do not contribute to improved VCM forecasts is very computing intensive and is therefore performed every 200 days, seven times throughout our sample period. On each of these seven occasions a variable is either included in the model or not. Significant variables are then retained for the following 200 days. On each day, however, a new multivariate bandwidth optimisation (as described in Section 5.1), for the fixed set of retained variables, is performed.
7.3 Analysis of Results: Model Confidence Sets (MCS)

In our forecast competition we want to determine which of the six models provides the best forecasts. In order to do this we use the MCS, introduced in Hansen, Lunde and Nason (2003), which analyses forecasting performance in order to distill the group of models which contains the best forecasting model with a given confidence level. This collection of forecasting models is called the model confidence set (MCS). The models that remain in the MCS at the end of the process are assumed to have equal predictive power.
We begin the process of forming the MCS with a set of forecasting models $\mathcal{M}_0$. The first stage of the process tests the null hypothesis that all of these models have equal predictive accuracy (EPA) when their performance is measured against a set of ex-post observations. If $H_{it}$ is the $i$th forecast of the VCM at time $t$ and $\Sigma_t$ is the observed VCM (or a consistent estimate; in the forecast experiments below we use the realized VCM, $V_t$, in place of $\Sigma_t$ as it is a consistent estimator of the unobserved VCM) for the same period, then the value of a loss function based on a comparison of these is denoted $L(H_{it}, \Sigma_t)$. The evaluation of the EPA hypothesis is based on loss differentials between the values of the loss functions for different models, where the loss differential between forecasting models $i$ and $j$ for time $t$, $d_{ij,t}$, is defined as
$$d_{ij,t} = L(H_{it}, \Sigma_t) - L(H_{jt}, \Sigma_t). \qquad (23)$$
If all of the forecasters are equally accurate then the loss differentials between all pairs of forecasters should not be significantly different from zero. The null hypothesis of EPA is then
$$H_0 : E(d_{ij,t}) = 0 \quad \forall \, i > j \in \mathcal{M} \qquad (24)$$
and failure to reject $H_0$ implies all forecasting models have equal predictive ability. We test (24) using the semi-quadratic test statistic described in Hansen and Lunde (2007). If the null hypothesis is rejected at an $\alpha$% confidence level, we remove the model with the worst loss function and begin the process again with the reduced set of forecasting models, $\mathcal{M}_1$. This process is iterated until the test of equal predictive accuracy cannot be rejected, or a single model remains. The model(s) which survive form the MCS with $\alpha$% confidence.

The loss function we use to analyse the performance of our VCM forecasts is the MVQLIKE (Stein distance) function described above in (16). This is a robust loss function, as described in Laurent, Rombouts and Violante (2009). Clements, Doolan, Hurn and Becker (2009) and Laurent et al. (2010) established that this loss function, compared to other loss functions, identifies a correctly specified forecasting model in a smaller MCS; hence it is more discriminatory than, say, the mean square forecast error criterion.
8 Forecast Comparison - Results

Here we present the results of our forecasting competitions for 1, 5 and 22 day forecast horizons. The results analysed here are the means of the MVQLIKE loss functions for the forecasts and the MCS p-values. If an MCS p-value is greater than the significance level then a model is included in the MCS, otherwise it is omitted.

We present two sets of results, which differ in the way in which the time variable is included in the multivariate kernel forecast. It was discussed in Section 5 that it could be introduced (see Equation 9) in such a way that observations far distant from the time at which the forecast is made are heavily penalised (Table 1). Alternatively (see Equation 10) it could be specified such that close observations obtain a higher weight, but observations in the long past are not excluded from attracting significant positive weights (Table 2).
Referring to the results presented in Table 1 relating to 1 day ahead forecasts, it can be concluded that the MVK model is the only model surviving in the MCS. All other models have a p-value smaller than 5% and are hence excluded from a 95% confidence level MCS. The forecasting model with the second largest p-value is the MVKnm forecasting model that excludes the economic variables. This allows the conclusion that for short-term forecasts the inclusion of economic variables adds value to the VCM forecasts in our setup. It is also interesting to note that the standard Riskmetrics approach (RM, with fixed $\lambda$), and to a lesser extent RMopt, deliver VCM forecasts with loss functions that exceed those of other forecasting models by a large margin (considering the variation in loss functions between the other models).
        1 Day Forecasts            5 Day Forecasts            22 Day Forecasts
Model   MVQ     p-value    Model   MVQ     p-value    Model   MVQ     p-value
MVK     13.03   1.0000     RMvcm    6.78   1.0000     RMvcm    4.43   1.0000
MVKnm   13.12   0.0033     MVKnm    6.94   0.1235     MVK      5.49   0.1033
RMvcm   13.31   0.0000     MVK      7.10   0.1235     MVKnm    5.68   0.1033
DCC     14.65   0.0000     DCC      8.80   0.0004     DCC      6.39   0.0597
RMopt   17.05   0.0000     RMopt   13.04   0.0000     RMopt   21.93   0.0006
RM      43.35   0.0000     RM      25.35   0.0000     RM      26.22   0.0077

Table 1: MCS Results 1. Uses the time kernel of Equation (9). The table reports the MCS results for the multivariate kernel (with macro variables - MVK; without macro variables - MVKnm), the DCC, the Riskmetrics (RM), the Riskmetrics with cross-validated λ (RMopt) and the Riskmetrics forecasting model using realized VCM (RMvcm). Forecasts are for 1, 5 and 22 day horizons. MVQ is the average loss function and p-value is the MCS p-value.
A further interesting conclusion can be drawn from comparing the results for MVK and the Riskmetrics approach that uses realized VCMs and an optimised $\lambda$ (RMvcm). Essentially the latter is a special case of the former that excludes all potential weighting variables but the time variable. The fact that RMvcm is excluded from the MCS for 1 day ahead forecasts indicates that matrix comparison and macroeconomic variables can add significant information to the VCM forecasting process at short forecast horizons.
The finding that RM and RMopt deliver inferior VCM forecasts generalises to longer forecast horizons. In none of the forecast comparisons considered in this paper is either of these two forecasting models close to being included in an MCS. As the forecast horizon is increased to 5 and 22 days, the relative performance of the RMvcm model improves significantly. For both these horizons it has the smallest loss function, although it shares membership of the MCS with both the MVK and the MVKnm forecast models. This appears to indicate that the value of the matrix closeness measures and the economic variables is largest for very short-term forecasts. It is also notable that for the longest forecast horizon the DCC model is included in a 95% (but not a 90%) confidence level MCS.
In Table 2 we present results based on kernel weighted VCM forecasts that use the modified time kernel proposed in Equation 10. This kernel ensures that observations are not automatically discarded just because they occur long before the forecast period. While the non-kernel forecasting methods remain unchanged (the average loss functions for these forecasting models are identical in Tables 1 and 2), the MCS methodology has to be reapplied, as the MCS p-values are conditional on the initial set of forecasts used.
        1 Day Forecasts            5 Day Forecasts            22 Day Forecasts
Model   MVQ     p-value    Model   MVQ     p-value    Model   MVQ     p-value
MVK     13.06   1.0000     RMvcm    6.78   1.0000     RMvcm    4.43   1.0000
RMvcm   13.31   0.0110     MVK      6.90   0.2209     MVK      5.55   0.0010
DCC     14.65   0.0000     DCC      8.80   0.0000     DCC      6.39   0.0005
RMopt   17.05   0.0000     MVKnm   11.09   0.0000     MVKnm    8.87   0.0000
MVKnm   17.97   0.0000     RMopt   13.04   0.0000     RMopt   21.93   0.0000
RM      43.35   0.0000     RM      25.35   0.0000     RM      26.22   0.0000

Table 2: MCS Results 2. Uses the time kernel of Equation (10). The table reports the MCS results for the multivariate kernel (with macro variables - MVK; without macro variables - MVKnm), the DCC, the Riskmetrics (RM), the Riskmetrics with cross-validated λ (RMopt) and the Riskmetrics forecasting model using realized VCM (RMvcm). Forecasts are for 1, 5 and 22 day horizons. MVQ is the average loss function and p-value is the MCS p-value.
Some of the basic findings discussed above remain unchanged. RM and RMopt are still inferior to all other forecasting methodologies used here. The value of the kernel VCM forecasting method is more apparent for shorter forecasting horizons, and hence the value of matrix closeness measures and economic variables diminishes with increasing forecast horizon. As the time variable has a less dominant impact on the kernel forecasts, it is not surprising to find larger differences between the kernel forecasts that include macroeconomic variables (MVK) and those that do not (MVKnm). In all cases the former has clearly superior average loss measures, and for the 1 and 5 day forecast horizons MVK is included in the MCS while MVKnm is not. At the 1 day horizon MVK is unambiguously the best model, at the 5 day horizon it is in the MCS together with RMvcm, while the latter is the unique surviving model at the 22 day forecasting horizon.
Before highlighting some more aspects of the multivariate kernel forecasts, it should be noted that the results for the RMvcm forecasts are rather impressive allowing for the limited information set utilised in these forecasts. While Fleming et al. (2003) used essentially the same model, they estimated the optimal decay parameter in a slightly different way. It is apparent that this extension to the traditional Riskmetrics approach should be seriously considered in the context of high dimensional VCM forecasting.

It is interesting to compare the MVK forecast performance for the two different time kernels. When using the kernel that converges to 0, and therefore virtually eliminates observations with large $(T-t)$, the difference in loss functions between the forecasting model that uses the economic variables and that which does not is fairly small, although statistically significant at the 1 day forecasting horizon. When applying the time kernel that converges to 1, and hence does not penalise observations with large $(T-t)$, the difference between the two sets of weighting variables becomes larger. This result is best interpreted in combination with the observation that the loss function values for the MVK forecasts remain almost unchanged across the two kernel types, whereas those for MVKnm deteriorate as one moves from kernel (9) to (10). The obvious interpretation of this result is that evaluating similarity merely on the basis of matrix closeness measures is not a good strategy. The kernel forecast will then give weight to past observations of $V_t$ that do not positively contribute to forecasting performance. Past observations appear relevant when $V_t$ and $V_T$ have similar characteristics and the prevailing macroeconomic conditions at time $t$ are close to those at time $T$.
In both Table 1 and Table 2 we find evidence that the macroeconomic information and VCM-based
similarity statistics significantly improve on a kernel based purely on time at a 1 day forecast
horizon.^18 We investigate which of the potential kernel weighting variables considered tend to
survive the variable elimination strategy described in Section 5.1.

The survival probabilities are reported in Table 3, which shows the percentage of times that a
variable is included in the kernel at each forecasting horizon; as mentioned above, the inclusion/exclusion
decision is made every 200 days, or seven times for each forecast horizon. This Table refers to results
generated with the time kernel that converges to 0 (Equation (9)).^19 It is apparent that a good
number of the variables considered remained in the weighting mechanism in the vast majority
(if not all) of instances. The variables that were eliminated more often than not are the correlation
comparison variable and the MVQLIKE measure of closeness. There is little variation across
the different forecast horizons, with the exception that the comparison of absolute differences is
dropped, and the correlation comparison included, more often for the 22 day horizon. In general
the inclusion frequencies drop somewhat for the 22 day forecast horizon. The exceptions are the
term spread as well as the oil and gold prices, which are never dropped at any forecasting horizon.
^18 We also ran versions of the model in which the 5 and 22 day periods in the estimation of the bandwidths were
non-overlapping. We found that this marginally improved mean MVQs but did not alter the qualitative nature of
the MCS analysis.
^19 Note that the results for all variables other than time are identical regardless of the time variable we use, as the
inclusion/exclusion is based on a univariate kernel.
Variable                                                        Percentage Inclusion
                                                                1 Day     5 Day     22 Day
Ratios of eigenvalues of V_T and V_t                             100%      100%      85.7%
Absolute relative differences of elements of V_T and V_t         100%      100%         0%
Proportion of correlations with the same sign at time t and T      0%        0%      42.9%
MVQLIKE of V_t when compared to V_T                             28.6%     71.4%      57.1%
Term spread of government bonds                                  100%      100%       100%
Default spread on corporate debt                                 100%     85.7%      85.7%
Oil price                                                        100%      100%       100%
Gold price                                                       100%      100%       100%
VIX index                                                       85.7%      100%      42.9%
Time                                                             100%      100%      85.7%
Bull and bear market phases                                      100%      100%      85.7%

Table 3: Variables Included in the Kernel Model.
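Since the inclusion/exclusion decision is taken seven times per forecast horizon, the percentages in Table 3 are simply inclusion counts divided by seven (for example, 6 of 7 inclusions gives 85.7%). The short sketch below, with made-up decisions, shows how one column of the table could be produced; the decision values are hypothetical.

```python
# Hypothetical inclusion decisions (True = variable kept) across the seven
# re-selection dates for one forecast horizon; 6 of 7 inclusions gives 85.7%.
decisions = {"Term spread": [True] * 7,
             "VIX index": [True, True, True, False, True, True, True]}

for name, kept in decisions.items():
    print(f"{name}: {100 * sum(kept) / len(kept):.1f}%")
```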
9 Conclusion
This paper presents a flexible kernel model which can be used to forecast symmetric, positive
definite variance-covariance matrices for large portfolios of stocks while being able to incorporate
a wide array of data. This is in contrast to many of the more popular approaches to VCM
modelling, which have to simplify their estimation processes or parameterisations in
order to handle large covariance matrices. The model relies on techniques well established
in the nonparametric econometrics literature. Importantly, the computational task
scales with the number of variables used to determine the kernel weights rather than with
the dimension of the covariance matrix. Our model is flexible, capable of using a wide range of
economic information, and can be used as easily for small as for large matrices. It does, however,
depend on the availability of positive definite VCM estimates, which may not be easily obtained
for very high dimensional problems.
This paper establishes the feasibility of the proposed forecasting approach and further demon-
strates that using a larger set of information (matrix closeness measures and economic variables)
can have statistically significant advantages. These advantages appear to be strongest at very short
forecast horizons. We also found that a version of the popular Riskmetrics model, using VCMs
based on high frequency data and a cross-validated decay parameter, proved extremely useful.
While the kernel method dominated at the very short horizons, the modified Riskmetrics ap-
proach performed best for 5 and 22 day ahead forecasts. This very simple forecasting method has
not attracted much attention in the empirical literature and it is suggested that its merits should
be re-evaluated.
A number of issues for further research are beyond the scope of this paper. While we evaluated
forecast performance with statistical measures, future research should establish whether the proposed
forecasting methodology delivers economically significant improvements. We also anticipate
that the ability to incorporate exogenous information into the VCM forecasting process will
allow researchers to re-evaluate the type of variables considered in the context of VCM forecasting.
10 Bibliography
References
Aït-Sahalia, Y. & Brandt, M.W. (2001) "Variable selection for portfolio choice", The Journal of
Finance, Vol. 56, no. 4, pp. 1297-1351.
Aitchison, J. & Aitken, C.G.G. (1976) "Multivariate binary discrimination by the kernel method",
Biometrika, Vol. 63, no. 3, pp. 413-420.
Bowman, A.W. (1984) "An alternative method of cross-validation for the smoothing of density
estimates", Biometrika, Vol. 71, pp. 353-360.
Bowman, A. W. (1997) Applied smoothing techniques for data analysis : the kernel approach with
S -Plus illustrations, Clarendon Press, Oxford.
Blair, B. J., Poon, S. H., & Taylor, S. J. (2001), "Forecasting S&P 100 volatility: the incre-
mental information content of implied volatilities and high-frequency index returns", Journal of
Econometrics, vol. 105, no. 1, pp. 5-26
Campbell, J.Y. (1987) "Stock returns and the term structure", Journal of Financial Economics,
Vol. 18, pp. 373-399.
Campbell, J.Y., Lettau, M., Malkiel, B.G. & Xu, Y. (2001) "Have individual stocks become more
volatile? An empirical exploration of idiosyncratic risk" The Journal of Finance, Vol. 56, No. 1,
pp1-43.
Clements, A., Doolan, M., Hurn, S. & Becker, R. (2009) "Evaluating multivariate volatility
forecasts", NCER Working Paper Series 41, National Centre for Econometric Research.
Clements, A., Hurn, S. & Becker, R. (2010) "Semi-Parametric Forecasting of Realized Volatility",
to appear in: Studies in Nonlinear Dynamics & Econometrics.
Chiriac, R. & Voev, V. (2008) "Modelling and forecasting multivariate realized volatility", CoFE
Discussion Paper 08-06, Center of Finance and Econometrics, University of Konstanz.
Colacito, R., Engle, R.F. & Ghysels, E. (2007), "A component model for dynamic correlations",
Unpublished.
Engle, R.F. (2002) "Dynamic conditional correlation - a simple class of multivariate GARCH
models", Journal of Business and Economic Statistics, Vol. 20, no. 3, pp. 339-350.
Engle, R.F. & Kelly, B.T. (2008), "Dynamic equicorrelation", NYU Working Paper FIN-08-038.
Engle, R. F. & Sheppard, K. (2001), "Theoretical and empirical properties of dynamic conditional
correlation multivariate GARCH", NBER Working Paper No. 8554.
Engle, R.F., Shephard, N. & Sheppard, K. (2008) "Fitting vast dimensional time-varying covari-
ance models", unpublished mimeo, http://www.oxford-man.ox.ac.uk/~nshephard/.
Fama, E.F. & French, K.R. (1989) "Business conditions and expected returns on stocks and
bonds", Journal of Financial Economics, Vol. 25, pp. 23-49.
Fleming, J., Kirby, C. & Ostdiek, B. (2003) "The economic value of volatility timing using
'realized' volatility", Journal of Financial Economics, Vol. 67, no. 3, pp. 473-509.
Gijbels, I., Pope, A. & Wand, M.P. (1999) "Understanding exponential smoothing via kernel
regression", Journal of the Royal Statistical Society, Vol. 61, pp. 39-50.
Hamilton, J.D. & Lin, G. (1996), "Stock market volatility and the business cycle", Journal of
Applied Econometrics, Vol. 11, No. 5, pp. 573-93.
Hamilton, J.D. (1996), "This is what h appened to the oil price-macroeconomy relationship",
Journal of Monetary Economics, Vol. 38, pp. 215-220.
Hansen, P.R. (2001) "A test for superior predictive ability", Brown University, Department of
Economics Working Paper 2003-09.
Hansen, P.R. & Lunde, A. (2005) "A realized variance for the whole day based on intermittent
data", Journal of Financial Econometrics, Vol 3 (4), pp525-554.
Hansen, P.R. & Lunde, A. (2007) "MULCOM 1.00, Econometric toolkit for multiple comparisons"
(Packaged with Mulcom package)
Hansen, P.R., Lunde, A. & Nason, J.M. (2003), "Choosing the best volatility models: the model
confidence set approach", Oxford Bulletin of Economics and Statistics, Vol. 65, Supplement,
pp. 839-861.
Hansen, P.R., Lunde, A. & Nason, J.M. (2004), "Model confidence sets for forecasting models",
Federal Reserve Bank of Atlanta Working Paper No. 2005-7, http://ssrn.com/paper=522382.
Harvey, C.R. (1989) "Time-varying conditional covariance in tests of asset pricing models", Jour-
nal of Financial Economics, Vol. 24, pp. 289-317.
Harvey, C.R. (1991) "The specification of conditional expectations", Working paper, Duke Univer-
sity.
J.P. Morgan (1996) Riskmetrics Technical Document 4th Edition, J.P. Morgan, New York.
Krolzig, H.M. & Hendry, D.F. (2001) "Computer automation of general-to-specific model selection
procedures", Journal of Economic Dynamics & Control, Vol. 25, pp. 831-866.
Laurent, S., Rombouts, J.V.K. & Violante, F. (2009) "On loss functions and ranking forecasting
performances of multivariate volatility models", CIRPÉE Working Paper 09-48.
Laurent, S., Rombouts, J.V.K. & Violante, F. (2010), "On the forecasting accuracy of multivariate
GARCH models", CIRPÉE Working Paper 10-21.
Laws, J. & Thompson, J. (2005) "Hedging effectiveness of stock index futures", European Journal
of Operational Research, Vol. 163, pp. 171-191.
Li, Q. & Racine, J.S. (2007) Nonparametric Econometrics: Theory and Practice, Princeton Uni-
versity Press, Princeton.
Moskowitz, T.J. (2003) "An analysis of covariance risk and pricing anomalies", The Review of
Financial Studies, Vol. 16, pp 417-457.
Pagan, A.R. & Sossounov, K.A. (2003) "A simple framework for analysing bull and bear markets",
Journal of Applied Econometrics, Vol. 18, No. 1, pp. 23-46.
Pelletier, D. (2006) "Regime switching for dynamic correlations", Journal of Econometrics, Vol.
131, no. 1-2, pp. 445-473.
Poon, S-H. & Granger, C.W.J. (2003) "Forecasting volatility in financial markets: a review",
Journal of Economic Literature, Vol. 41, pp. 478-539.
Rudemo, M. (1982) "Empirical choice of histograms and kernel density estimators", Scandinavian
Journal of Statistics, Vol. 9, No. 2, pp65-78.
Sadorsky, P. (1999) "Oil price shocks and stock market activity", Energy Economics, Vol. 21, pp.
449-469.
Schwert, G.W. (1989) "Why does stock market volatility change over time", The Journal of
Finance, Vol. 44, no. 5, pp 1115-1153.
Silverman, B.W. (1986) Density Estimation for Statistics and Data Analysis, Chapman & Hall,
London.
Sjaastad, L.A. & Scacciavillani, F. (1996) "The price of gold and the exchange rate", Journal of
International Money and Finance, Vol. 15, No. 6, pp. 879-897.
Vasilellis, G.A. & Meade, N. (1996), "Forecasting volatility for portfolio selection", Journal of
Business, Finance and Accounting, Vol. 23, pp. 125-143.
Wand, M.P. & Jones, M.C. (1995) Kernel Smoothing, Chapman & Hall, London.
Whitelaw, R.F. (1994) "Time variations and covariations in the expectation and volatility of stock
market returns", The Journal of Finance, Vol. 49, No.2, pp. 515-541.
11 Appendix A
Table 4 provides a full list of the stocks used in the analysis in section XX.
Ticker symbol Company Name
1 AA Alcoa Inc
2 AXP American Express Inc
3 BA Boeing Co.
4 BAC Bank of America Corp.
5 BMY Bristol Myers Squibb Co.
6 CL Colgate Palmolive
7 DD E.I. du Pont de Nemours & Co.
8 DIS Walt Disney Corp.
9 GD General Dynamics Corp
10 GE General Electric Co.
11 IBM IBM
12 JNJ Johnson & Johnson
13 JPM JP Morgan Chase Co.
14 KO Coca Cola Corp.
15 MCD McDonald's Corp
16 MER Merrill Lynch & Co. Inc
17 MMM 3M co.
18 PEP Pepsico Inc.
19 PFE Pfizer Inc
20 TYC Tyco International ltd.
Table 4: Stocks included in the forecasting experiment in section X.
12 Appendix B
In the text of the article reference is made to the composite-likelihood DCC estimation method sug-
gested by Engle, Shephard and Sheppard (2008) in order to make estimation of the DCC parameters
feasible for a large scale problem.
In a DCC model we assume that

r_t = \mu + \varepsilon_t, \qquad \varepsilon_t = H_t^{1/2} z_t, \qquad z_t \sim IID(0_n, I_n),

and H_t is governed by a process dependent on a parameter vector \theta, which contains the values of
\alpha, \beta and the unique diagonal elements of \bar{Q} from (20). The parameter vector is estimated
by maximising the likelihood equation

\log L(\theta; z_T) = \sum_{t=1}^{T} L_t(\theta), \qquad
L_t(\theta) = -\frac{1}{2} \log |H_t| - \frac{1}{2} r_t' H_t^{-1} r_t,
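As a small illustration of the per-observation contribution L_t(\theta) written out above (with the additive constant dropped, as in the text), the helper below simply evaluates the expression for a given conditional covariance H_t and return vector r_t; the function name is hypothetical.

```python
import numpy as np

def loglik_contribution(H_t, r_t):
    """L_t = -0.5*log|H_t| - 0.5 * r_t' H_t^{-1} r_t (additive constant omitted)."""
    _, logdet = np.linalg.slogdet(H_t)
    quad = r_t @ np.linalg.solve(H_t, r_t)
    return -0.5 * logdet - 0.5 * quad

H = np.array([[1.0, 0.3], [0.3, 1.0]])
r = np.array([0.5, -0.2])
print(loglik_contribution(H, r))
```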
however it is difficult to maximise this if we have a high dimensional VCM.

The method suggested by Engle, Shephard and Sheppard (2008) to circumvent this problem
is composite-likelihood estimation. The first step is to compute a likelihood for several
pairs of stocks. For example, if we have three stocks we may compute the likelihoods for DCC
models using the pairs of stocks (1,2), (2,3) and (1,3), and we denote these likelihoods as L_{j,t}(\theta_j),
j = 1, 2, 3, respectively. These are computed using the equation for L_t(\theta) above, except that
the dimensions of H_t and r_t are now 2x2 and 2x1 respectively. The
composite likelihood approach then sums these likelihoods over time and averages over
the Z pairwise combinations included. We then choose the parameters to maximise the sum
CL(\theta) = \frac{1}{Z} \sum_{t=1}^{T} \sum_{j=1}^{Z} L_{j,t}(\theta_j).
Hence we find the DCC parameters which maximise this average of pairwise likelihoods. By doing this it is
possible to estimate the DCC parameters using only pairwise calculations; however, the estimated
parameters will not be the same unless the pairs are independent. The parameter vector \theta_j varies
over j = 1, 2, 3 only in the long run correlation parameters for the pair of stocks, obtained from
the standardised GARCH residuals from the first step of DCC estimation. As in the aggregate
model these are replaced by the full set of long run correlations; it is in this respect that \theta
and each \theta_j differ.

Engle, Shephard and Sheppard (2008) note that the loss in efficiency from using fewer than all
available pairs is only very small, and so we include each of our stocks in only one pairing of stocks
for the calculation of the composite-likelihood DCC.
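The sketch below illustrates the composite-likelihood construction described above: for a fixed set of non-overlapping stock pairs, the bivariate log-likelihood contributions are summed over time and averaged over pairs. Building the bivariate conditional covariances H_{j,t} from the GARCH and correlation recursions is omitted (a precomputed array of pair covariances is assumed), so this is an illustration of the objective only, not an implementation of the Engle, Shephard and Sheppard (2008) estimator; all names are hypothetical.

```python
import numpy as np

def pair_loglik(H_pair, r_pair):
    """Sum over time of bivariate Gaussian log-likelihood contributions for one pair.
    H_pair: (T, 2, 2) conditional covariances; r_pair: (T, 2) returns."""
    total = 0.0
    for H_t, r_t in zip(H_pair, r_pair):
        _, logdet = np.linalg.slogdet(H_t)
        total += -0.5 * logdet - 0.5 * r_t @ np.linalg.solve(H_t, r_t)
    return total

def composite_loglik(H_by_pair, r_by_pair):
    """Composite likelihood: average over the Z pairs of the pairwise log-likelihoods."""
    Z = len(H_by_pair)
    return sum(pair_loglik(H, r) for H, r in zip(H_by_pair, r_by_pair)) / Z

# toy example with two pairs and ten time periods each
rng = np.random.default_rng(3)
T = 10
H_by_pair = [np.stack([np.eye(2)] * T) for _ in range(2)]
r_by_pair = [rng.standard_normal((T, 2)) * 0.1 for _ in range(2)]
print(composite_loglik(H_by_pair, r_by_pair))
```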