Abstract
The forecasting of variance-covariance matrices is an important issue. In recent years an increasing body of literature has focused on multivariate models to forecast this quantity. This paper develops a nonparametric technique for generating multivariate volatility forecasts from a weighted average of historical volatility and a broader set of macroeconomic variables. As opposed to traditional techniques, where the weights decay solely as a function of time, this approach employs a kernel weighting scheme in which historical periods exhibiting conditions most similar to those at the time the forecast is formed attract the greatest weight. It is found that the proposed method leads to superior forecasts, with macroeconomic information playing an important role.
NCER Working Paper Series

A Kernel Technique for Forecasting the Variance-Covariance Matrix

Ralf Becker
Adam Clements
Robert O'Neill

Working Paper #66
October 2010
A Kernel Technique for Forecasting the Variance-Covariance Matrix

Ralf Becker†, Adam Clements‡ and Robert O'Neill†

† Economics, School of Social Sciences, University of Manchester
‡ School of Economics and Finance, Queensland University of Technology

October 28, 2010

Corresponding author:
Ralf Becker
Economics, School of Social Sciences
University of Manchester
email: ralf.becker@manchester.ac.uk
Ph +44 (0)161 275 4807
1 Introduction
The forecasting of variance-covariance matrices (VCMs) is an important issue in finance, having applications in portfolio selection and risk management as well as being directly used in the pricing of several financial assets. In recent years an increasing body of literature has developed multivariate models to forecast this matrix; these include the DCC of Engle and Sheppard (2001), the VARFIMA model of Chiriac and Voev (2009) and Riskmetrics of J.P. Morgan (1996). All of these models can be used to forecast the VCM of a portfolio and all do so using data relating only to the performance of the stocks under consideration.

Previous studies, focusing on modelling the volatility of single assets, have identified economic variables that may influence the variance of returns and attempted to utilise such variables in forecasting. For example, Aït-Sahalia and Brandt (2001) investigate which of a range of factors influence stock volatility. In this paper we bring this approach to a multivariate setting, as we introduce a technique for utilising macroeconomic variables when forecasting the VCM of a number of stocks.
Advances in multivariate volatility modelling are complicated by the requirement that all forecasts of VCMs must be positive-definite and symmetric, restrictions which have also made it difficult for models to incorporate macroeconomic information. Multivariate models have also encountered estimation problems as the dimensions of the VCM are allowed to grow. The technique presented here satisfies all of the required constraints and can be applied to a relatively large portfolio of stocks.
In this paper we introduce a non-parametric approach to forecasting that is similar in technique to the Riskmetrics approach. However, while the latter delivers a weighted average of past realized VCMs with the weights determined by the distance of observations to the time at which the forecast is made, we introduce a weighting approach that allows a wider range of variables to determine the weights. We allow statistics that measure how similar matrices are, and macroeconomic information (such as interest rates), to be used in addition to the time information used in the Riskmetrics approach. Technically, we use a multivariate kernel to obtain such weights. This approach builds on the work of Clements and Becker (2010), who show in a univariate setting that employing kernels to determine weight structures dependent on the similarity of volatility observations through time can improve forecast accuracy when compared to more established methods. As the method essentially calculates VCM forecasts as weighted averages of past VCMs it guarantees symmetry and positive-definiteness by construction.
We apply our proposed methods to a large real-life dataset. As the method makes use of a potentially large set of exogenous information it is impossible to devise a representative simulation setup that could serve to establish the usefulness of the proposed forecasting tool. We therefore provide a careful forecasting experiment in which we compare our method with other models. The results of this forecasting experiment are promising in that they establish that our nonparametric approach is able to produce forecasts of the VCM that can be statistically superior. Interestingly, we can demonstrate that macroeconomic data are critical to this improvement, providing evidence that these variables contain important information for predicting the behaviour of the VCM.
The rest of this paper is organised as follows. Section 2 introduces some notation and assumptions. Section 3 reviews the literature surrounding multivariate modelling, nonparametric econometrics and the relationship between macroeconomic variables and stock return volatility. Section 4 describes how Riskmetrics, a popular volatility forecasting tool, can be viewed as a kernel approach based on time. Section 5 outlines how our model uses kernel techniques to obtain forecasts of the VCM using a wider range of data, while Section 6 introduces the variables we include in our model. Section 7 outlines our forecasting experiment, Section 8 presents and discusses the results, and Section 9 concludes and notes further areas of interest.
2 Notation and Assumptions

The model we present is used to forecast the volatility of stock returns in an $n$ stock portfolio. For any given day $t$, the $(n \times 1)$ vector of returns is denoted by $r_t = (r_{1t}, \ldots, r_{nt})'$, where $r_{it}$ is the return on stock $i$ on day $t$, and we assume that given all information available at time $t-1$, $\mathcal{F}_{t-1}$, the mean is unforecastable, i.e. $E(r_t \mid \mathcal{F}_{t-1}) = 0$. The object of interest is the $(n \times n)$ positive-definite variance-covariance matrix of returns, $\mathrm{Var}(r_t \mid \mathcal{F}_{t-1}) = \Sigma_t$, which we assume to be time-varying, predictable and, although unobserved, consistently estimated by a realized variance-covariance matrix $V_t$. Generally in this paper $\Sigma$ represents the actual VCM, $V$ is an observed realized value of the VCM, calculated from intraday data, and $H$ is used to denote a forecast of the matrix.
3 Literature Review
In this paper we use nonparametric econometrics to produce forecasts of a variance-covariance matrix. This approach has previously been used when forecasting univariate volatility in Clements et al. (2010), in which forecasts are a weighted average of historical values of realized volatility. The weights are increased when the pattern of historical volatility behaviour is similar to that around the time at which the forecast is made. Clements et al. (2010) show that at a 1 day forecast horizon such an approach performs well against competing volatility forecasting techniques.
What distinguishes this approach from many other VCM forecasting models is the role played by the time between a past VCM observation and the period in which a forecast is made. In general, the weight given to past observations decreases for observations further in the past. An important example of this is RiskMetrics, described in J.P. Morgan (1996), a popular method of forecasting the VCM using an exponentially weighted moving average (EWMA) approach. Gijbels, Pope and Wand (1999) show that this approach can be interpreted as a kernel approach in which weights on historical observations are determined by the lag at which a realization was observed. Fleming, Kirby and Ostdiek (2003) also used an EWMA weighting scheme, similar to that used in Riskmetrics, to show that, judged on economic performance, weighted averages of realized covariance matrices can improve forecasts of volatility compared to methods using daily returns.
The performance of the model proposed in this paper is compared to currently available multivariate forecasting models, which have proliferated in number in recent years. The most popular multivariate model is the dynamic conditional correlation model (DCC) of Engle (2002) and Engle and Sheppard (2001), which allows correlations between stocks to vary according to a GARCH type process. Other interesting recent models include the regime switching dynamic correlation model (RSDC) of Pelletier (2006), which assumes that correlations/covariances change depending on the state of the world. Colacito, Engle and Ghysels (2007) introduce a model which captures long run correlation behaviour using a MIDAS approach and short term behaviour via the DCC; however, such an approach requires a complex system of restrictions in order to ensure positive definiteness of the VCM. Engle and Kelly (2008) introduce the dynamic equicorrelation model, which assumes that correlations between all stocks have the same value, but that this value changes through time. All of these models share the fact that they must impose restrictions, either on parameter values or in their setup, to ensure that estimates and forecasts obtained from them are positive-definite and symmetric.
An interesting new approach to VCM modelling is that of Chiriac and Voev (2009), who introduce a VARFIMA model for the behaviour of the elements of the Cholesky decomposition of the VCM; however, this suffers from the problem that interpretation of the elements of the decomposition is difficult and incorporating additional variables is far from straightforward.
In this paper we introduce a nonparametric technique for obtaining VCM forecasts using a multivariate kernel approach that encompasses the Riskmetrics method as a special case. This extends the univariate approach of Clements et al. (2010) to the multivariate context. Importantly, we show how variables other than time delay can be used in a multiplicative kernel weighting.

Our contribution depends heavily on contributions made in the nonparametric, kernel estimation literature. It is well known that when estimating densities or conditional expectations by means of kernel methods the properties of the resulting estimators will depend on the choice of kernels and, more importantly, the choice of bandwidth used in the kernel estimators. Plug-in bandwidth rules have been proposed, but it has also been recognised that these may be inadequate when data do not meet strict assumptions (Silverman, 1986, Bowman, 1997). Here we will use cross-validation methods (Bowman, 1984, and Rudemo, 1982) to find optimal bandwidths.
Application of cross-validation will also facilitate the selection of relevant variables to be used in the kernel weighting algorithm. One class of variables considered for the kernel weighting algorithm are scalar transformations of matrices, as they can be used to establish the closeness of matrices. The idea is to give higher weight to past observations that relate to times when the VCM was similar to the current VCM (regardless of how distant that observation is). Moskowitz (2003) proposes three statistics to evaluate the closeness of VCMs. The first metric compares the matrices' eigenvalues, the second looks at the relative differences between the individual matrix elements and the third considers how many of the correlations have the same sign in the matrices. Taken together these three metrics can be used to determine the level of similarity between two VCMs. Other functions used to compare matrices, often called loss functions, have been discussed in the literature (Laurent, Rombouts and Violante, 2009). One such loss function is the Stein distance, also known as the MVQLIKE function. This loss function is shown to perform well in discriminating between VCM forecasts in Clements, Doolan, Hurn and Becker (2009) and Laurent, Rombouts and Violante (2010) and represents another useful tool for comparing VCMs.
The second class of variables used in the kernel weighting algorithm are variables carrying information on the state of the economy. The basic idea is to give past VCMs larger weights in the forecast if the macroeconomic conditions are similar to those prevalent at the time of the forecast formation. Aït-Sahalia and Brandt (2001) investigate factors influencing stock volatility and propose dividend yield, default spreads and term spreads as factors. This builds on existing work which identifies term spreads and default spreads as potential drivers of stock volatility processes. Campbell (1987), Fama and French (1989) and Harvey (1991) investigate the relationship between term spreads and volatility, while Fama and French (1989), Whitelaw (1994) and Schwert (1989) consider a volatility-default spread relationship. In addition, Harvey (1989) considers the impact of default spreads on covariances. Hence there is an established literature relating these variables to the behaviour of elements of a VCM.
Empirical evidence in Schwert (1989), Hamilton and Lin (1996) and Campbell, Lettau, Malkiel and Xu (2001) suggests that during market downturns/recessions stock return volatility can be expected to increase. Based on these findings we propose to use an algorithm, such as that detailed in Pagan and Sossounov (2003), to identify periods in which the stock market is upbeat, as VCMs in such periods may have common characteristics. Commodity prices, such as gold (Sjaastad and Scacciavillani, 1996) and oil (Sadorsky, 1999, and Hamilton, 1996) prices, have also been linked to stock market volatility and are therefore considered here as potential variables to contribute to the kernel weighting functions.
The final variable used in this paper is implied volatility, namely the VIX index of the Chicago Board Options Exchange. This is often interpreted as the market's view on future stock market volatility. This measure has been used in the context of univariate volatility forecasting (Poon and Granger, 2003, Blair, Poon and Taylor, 2001) and is here considered as another variable in the multivariate kernel weighting scheme.
4 Riskmetrics as a Kernel Approach
In this section we restate a result by Gijbels, Pope and Wand (1999) establishing that a Riskmetrics-type exponential smoothing forecast can be represented as a univariate kernel forecast in which weights vary with time. This provides a special case of the more general methodology introduced in Section 5, in which we introduce a multivariate kernel that potentially utilises the variables listed in the previous section.
In a multivariate setting, the variance-covariance matrix forecast $H_{T+1}$ at time $T$, given by the standard Riskmetrics equation, is
$$H_{T+1} = \lambda H_T + (1 - \lambda)\, r_T r_T' \qquad (1)$$
when observations are equally spaced in time and $\lambda$ is a smoothing parameter, $0 < \lambda < 1$, commonly set at a value recommended in J.P. Morgan (1996). From recursive substitution, and with $H_1 = r_1 r_1'$, the forecast of the VCM can be expressed as
$$H_{T+1} = (1 - \lambda) \sum_{j=0}^{T-1} \lambda^j \, r_{T-j} r_{T-j}' \qquad (2)$$
The sum of the weights is thus equal to $1 - \lambda^T$ and, as noted in Gijbels, Pope and Wand (1999), this approaches 1 as we allow $T$ to approach infinity. However, in order to normalise the sum of the weights to be exactly 1 we restate the Riskmetrics model as
$$H_{T+1} = \frac{\sum_{j=0}^{T-1} \lambda^j \, r_{T-j} r_{T-j}'}{\sum_{j=0}^{T-1} \lambda^j} \qquad (3)$$
We can now reformulate (3) as a kernel (more accurately a half kernel, as it is zero for $T+1, T+2, \ldots$). Defining $h = -1/\log(\lambda)$ and $K(u) = \exp(u)\, 1_{u \leq 0}$, we can restate (3) as
$$H_{T+1} = \frac{\sum_{t=1}^{T} K\!\left(\frac{t-T}{h}\right) r_t r_t'}{\sum_{t=1}^{T} K\!\left(\frac{t-T}{h}\right)} = \sum_{t=1}^{T} W_{rm,t} V_{rm,t} \qquad (4)$$
From this we replicate the conclusion of Gijbels, Pope and Wand (1999) that Riskmetrics is a zero degree local polynomial kernel estimate with bandwidth $h$. From a practical point of view the Riskmetrics kernel determines weights, $W_{rm,t} = K\!\left(\frac{t-T}{h}\right) / \sum_{t=1}^{T} K\!\left(\frac{t-T}{h}\right)$, based on how close observations of $V_{rm,t} = r_t r_t'$ are to time $T$, the period at which a forecast is being made. The largest weight is attached to the observation at time $T$ and the weights decrease according to an exponentially weighted smoothing pattern. In the remainder of this paper we aim to expand such an approach by including factors other than time in our estimation of kernel weights.
5 Multivariate Kernel Methodology
In this section we present the method by which we obtain the kernel and subsequent forecasts of the VCM. The inputs to our model are a set of $p$ variables, which we believe to contain information relevant to forecasting the VCM, and a time series of realized variance-covariance matrices. Calculation of the $n \times n$ realized variance-covariance matrix, $V_t$, is a non-trivial issue. Here we compute it using standard methods from the realized (co)variance literature and we assume that $V_t$ is positive definite. The method used to calculate the matrices used in the rest of this paper is described in Section 7.2.

At time $T$ we wish to obtain a forecast of the $d$-step ahead VCM, which is the matrix describing variances and covariances over the time period $T+1$ to $T+d$, denoted by $H^{(d)}_{T+d}$. We obtain our forecast by taking a weighted combination of historical VCMs, hence
$$H^{(d)}_{T+d} = \sum_{t=1}^{T-d} W_t V^{(d)}_{t+d}. \qquad (5)$$
As our forecast is a weighted combination of symmetric, positive definite matrices, $H^{(d)}_{T+d}$ also has these properties and so is a valid covariance matrix. Ensuring that forecasts of the variance-covariance matrix are positive definite is rarely so straightforward, and models usually have to employ parameter restrictions or decompositions of $V_t$ in order to ensure this.

The focus of much of the remainder of this section is the method by which we determine the optimal weights $W_t$ to use in (5). In order to ensure that the weights sum to one we impose the following normalisation:
$$W_t = \frac{\omega_t}{\sum_{i=1}^{T-d} \omega_i}. \qquad (6)$$
This allows Equation (5) to be interpreted as a weighted average, ensuring an appropriate scaling for $H^{(d)}_{T+d}$.
We now explain how to determine $\omega_t$ using kernel estimation techniques. The idea underpinning the approach is to determine which of the past time periods had conditions most similar to those at the time we make the forecast, $T$. We then place more weight on the VCMs that occurred over the $d$ periods following the dates that were most similar to time $T$.

We determine the similarity of other time periods to time $T$ using $p$ variables, collected in a $T \times p$ data matrix $x$, and employ a multivariate kernel to calculate the raw weight applicable to day $t$, hence
$$\omega_t = \prod_{j=1}^{p} K_j(x_{t,j}, x_{T,j}; h_j) \qquad (7)$$
where $x_{T,j}$ is the element from the $T$th row and $j$th column of the data matrix and $h_j$ is the bandwidth for the $j$th variable.
For continuous variables $K_j(x_{t,j}, x_{T,j}; h_j)$ is the standard normal density kernel (Silverman, 1986, and Bowman, 1997), defined as
$$K_j(x_{t,j}, x_{T,j}; h_j) = (2\pi)^{-0.5} \exp\!\left[ -\frac{1}{2} \left( \frac{x_{T,j} - x_{t,j}}{h_j} \right)^2 \right].$$
(We normalise continuous variables before applying the kernel function.)
In the case of a discrete dummy variable, such as a bull/bear market dummy, we use the discrete univariate kernel proposed in Aitchison and Aitken (1976). The form of the kernel is
$$K_j(x_{t,j}, x_{T,j}; h_j) = \begin{cases} 1 - h_j & \text{if } x_{t,j} = x_{T,j} \\ h_j/(s_j - 1) & \text{if } x_{t,j} \neq x_{T,j} \end{cases} \qquad (8)$$
where $s_j$ is the number of possible values the discrete variable can take ($s_j = 2$ in the case of the bull/bear market variable). In the two-state discrete case $h_j \in [0, 0.5]$. If $h_j = 0.5$ the value of the discrete variable has no impact on the forecast, while if $h_j = 0$ we disregard data points which do not share the same discrete variable value as $x_{T,j}$.
In addition to the discrete and continuous kernels we use a third approach when we include time as one of the $p$ variables. In that case
$$K_j(x_{t,j}, x_{T,j}; h_j) = \frac{h_j^{T-t}}{\sum_{q=1}^{T} h_j^{T-q}} \qquad (9)$$
which has the same structure as the Riskmetrics approach in Equation (3). However, here we allow a flexible bandwidth, $h_j \in [0, 1]$, as opposed to a prespecified value as in J.P. Morgan (1996).
As we are using a multiplicative kernel, the time kernel suggested in Equation (9) is problematic: $K_j(\cdot)$, with increasing $(T-t)$, will quickly decline towards zero, which will make the value of $\omega_t$ in (7) approach zero. This implies that, in effect, observations with sufficiently large $(T-t)$ will be ignored regardless of how similar the macroeconomic and VCM characteristics are to the point of forecast. We therefore propose an alternative scheme where
$$K_j(x_{t,j}, x_{T,j}; h_j) = \frac{h_j^{T-t}}{\sum_{q=1}^{T} h_j^{T-q}} + 1. \qquad (10)$$
This minor alteration ensures that the weights decline to a value of one. There is still an increased weight on more recent observations; however, less recent time periods are not ignored because of the effects of the time kernel. We present results using both versions of the time kernel in order to demonstrate the impact of such an approach.
While the general approach presented through Equations (5), (6) and (7) has the Riskmetrics approach as a special case (using $V_t = r_t r_t'$ rather than a realized VCM), it introduces a significant amount of additional flexibility by allowing the weights $W_t$ to be determined from a set of $p$ variables.
5.1 Choice of Bandwidth

The choice of bandwidth is a non-trivial issue in nonparametric econometrics; however, a common rule of thumb quoted for multivariate density estimation is
$$h_j = \left( \frac{4}{(p+2)\,T} \right)^{\frac{1}{p+4}} \sigma_j$$
where $\sigma_j$ is the standard deviation of the $j$th variable. Although this rule of thumb provides a simple method for choosing bandwidths, as noted in Wand and Jones (1995) these bandwidths may be sub-optimal.
Importantly, if one were to optimise (using cross-validation) the bandwidth parameters, the optimised values, $h_j$, would contain information on whether the $j$th weighting variable contributes significant information to the optimal weights $W_t$. As noted in Li and Racine (2007, pp. 140-141), irrelevant (continuous) variables are associated with $h_j = \infty$. For binary variables (and a kernel as in Equation (8)) and a time variable (and a kernel as in Equation (10)) the bandwidths $h_j = 0.5$ and $h_j = 1$ respectively represent irrelevant variables.

Cross-validation is a bandwidth optimisation strategy introduced in Rudemo (1982) and Bowman (1984). It selects bandwidths to minimise the mean integrated squared error (MISE) of density estimates and is generally recommended as the method of choice in the context of nonparametric density and regression analysis (Wand and Jones, 1995, Li and Racine, 2007). As we are interested in forecast performance rather than density estimation, we obtain bandwidths which minimise the MVQLIKE of our forecasts rather than the MISE.
MVQLIKE is a robust loss function for the comparison of matrices. Let $H^{(1)}_t = H_t$ denote the $(n \times n)$-dimensional 1 period ahead forecast of the VCM at time $t$ and $V^{(1)}_t = V_t$ the realized VCM at time $t$. (The following argument is, for notational ease, made for 1 period ahead forecasts, but the extension to $d$ period forecasts is straightforward. Initially we also suppress the dependence of $H_t$ on $h$, the $(p \times 1)$ vector of bandwidths.) The loss function is calculated as
$$MVQLIKE(H_t) = \mathrm{tr}(H_t^{-1} V_t) - \log \left| H_t^{-1} V_t \right| - n. \qquad (11)$$
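Translated directly into code, Equation (11) could read as follows (a minimal Python sketch, not the authors' implementation):

import numpy as np

def mvqlike(H, V):
    """MVQLIKE (Stein distance) of Equation (11): zero when the forecast H equals
    the realized VCM V, positive otherwise."""
    n = V.shape[0]
    A = np.linalg.solve(H, V)                 # computes H^{-1} V without inverting H
    return np.trace(A) - np.linalg.slogdet(A)[1] - n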
This is the criterion function to be minimised in our cross-validation approach. Consider that we have data available up to and including time period $T$ and we aim to forecast the VCM for $T+1$. The available data over time periods 1 to $T$ can be used to identify the optimal bandwidths for use in forecasting. This is done by evaluating $K$ ($< T$) forecasts for periods $T-K+1$ to $T$. The initial $T-K$ observations (we set $T-K = 300$, which means that every forecast is based on a minimum of 300 observations) are used to produce the first forecast $H_{T-K+1}$. For any period $\tau$, $T-K+1 \leq \tau \leq T$, the forecast $H_\tau$ is based on observations of the weighting variables available at time $\tau - 1$.

Having obtained these forecasts we select bandwidths to minimise the mean of MVQLIKE over these in-sample forecasts:
$$CVMVQ(h) = \frac{1}{K} \sum_{\tau = T-K+1}^{T} MVQLIKE(H_\tau(h)) \qquad (12)$$
where dependence on $h$, the $(p \times 1)$ vector of bandwidths, is now made explicit. The bandwidths that minimise (12) are then used in Equations (5), (6) and (7) in order to forecast $H_{T+1}$.

The optimised bandwidth values should carry information on which of the $p$ variables contribute significantly to the determination of the weights in Equation (5). Li and Racine (2007) suggest that a cross-validation approach in the context of a multivariate kernel regression should, asymptotically, deliver bandwidth estimates that approach their "irrelevant" values discussed above ($h_j = \infty$, $h_j = 0.5$ and $h_j = 1$ respectively for continuous, binary and time variables). They suggest that, therefore, there is no need to eliminate irrelevant variables.
When following this strategy we encountered significant difficulties in the optimisation process; in particular, our nonlinear bandwidth optimisation was unable to identify an optimum. We therefore recommend an alternative strategy which eliminates irrelevant variables and identifies optimal bandwidths only for the remaining variables. The elimination of variables is achieved as follows. Each variable is used as the only variable determining the kernel weights. We find the optimal bandwidth, $\tilde{h}_j$, for each variable by minimising the criterion in (12). The optimal $CVMVQ(\tilde{h}_j)$ is then compared to $CVMVQ_R$, which is obtained by forming VCM forecasts from averaging all available past VCMs. The rationale is that a relevant variable should deliver improvements compared to a simple average. This is illustrated in Figure 1 for a forecast horizon of $d = 1$ (similar illustrations for 5 and 22 day forecast horizons offer no additional insight and so are not presented here). The dashed lines indicate $CVMVQ_R$ and the solid lines represent $CVMVQ(h_j)$. The minima of the latter identify the bandwidth $\tilde{h}_j$ that minimises $CVMVQ(h_j)$. Weighting variables that do not improve on $CVMVQ_R$ by at least 1% are then eliminated.
In order to obtain a handle on the size of this threshold we simulated 1000 random variables which were subsequently considered as potential weighting variables (and their $CVMVQ(\tilde{h}_{rv})$ calculated). As it turns out, a threshold of 1% would eliminate virtually all of these irrelevant random variables (a more conservative threshold of 2% left the results virtually unchanged and they are therefore not reported). Despite this, the threshold is essentially ad hoc and it is envisaged that future research may improve on this aspect of the proposed methodology.
Figure 1: Graphs of 1 day ahead CVMVQ against bandwidth values for 12 variables. The dashed line represents the CVMVQ from a rolling average forecaster. Descriptions of the variables used are provided in Section 6.
In short, the process of variable elimination and bandwidth optimisation can be summarised in the following three step procedure (a code sketch follows the list):

1. For each of the $p$ variables considered for inclusion in the multivariate kernel, apply cross validation to obtain the optimal bandwidth when only that variable is included in the kernel estimator. We refer to these as univariate optimised bandwidths $\tilde{h}_j$, $j = 1, \ldots, p$.

2. Compare the forecasting performance of the univariate optimised bandwidths from Step 1, $CVMVQ(\tilde{h}_j)$, against $CVMVQ_R$ from a simple average forecasting model. Any of the $p$ variables that fails to improve on the rolling average forecast performance by at least 1% is eliminated at this stage, as it is considered to have little value for forecasting. We are left with $p^* \leq p$ variables used as weighting variables.

3. Estimate the multivariate optimised bandwidths $h_j$ for the $p^*$ variables that are not eliminated in Step 2 by minimising the cross validation criterion in Equation (12). As opposed to Step 1, this optimisation is done simultaneously over all $p^*$ bandwidths.

Having obtained the optimised bandwidths from Step 3, we then forecast the VCM for the $d$ day-ahead time period ending at $T+d$ using Equations (5), (6) and (7).
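A rough sketch of Steps 1-3 is given below (Python; it reuses the mvqlike and kernel_vcm_forecast helpers sketched earlier, restricts itself to 1-day-ahead forecasts, and replaces the paper's numerical optimiser with a simple grid search, so the benchmark construction, the grids and K = 50 are illustrative assumptions).

import numpy as np
from itertools import product

def cv_mvq(V, X, kinds, bandwidths, K=50):
    """CVMVQ of Equation (12) for 1-day-ahead forecasts: mean MVQLIKE of the last
    K in-sample forecasts, each formed with data available one day earlier."""
    T = X.shape[0]
    losses = []
    for target in range(T - K, T):            # 0-based index of the day being forecast
        H = kernel_vcm_forecast(V[:target], X[:target], kinds, bandwidths, d=1)
        losses.append(mvqlike(H, V[target]))
    return float(np.mean(losses))

def select_variables_and_bandwidths(V, X, kinds, grids, K=50, threshold=0.01):
    """Steps 1-3: univariate screening against a simple-average benchmark,
    then a joint (coarse) grid search over the surviving bandwidths."""
    T, p = X.shape
    # CVMVQ_R benchmark: forecast each day with the average of all past VCMs
    bench = np.mean([mvqlike(V[:t].mean(axis=0), V[t]) for t in range(T - K, T)])
    keep = []
    for j in range(p):                        # Steps 1 and 2: one variable at a time
        best = min(cv_mvq(V, X[:, [j]], [kinds[j]], [h], K) for h in grids[j])
        if best < (1.0 - threshold) * bench:  # must beat the benchmark by at least 1%
            keep.append(j)
    # Step 3: joint grid search over the retained variables' bandwidths
    best_bw, best_loss = None, np.inf
    for bw in product(*[grids[j] for j in keep]):
        loss = cv_mvq(V, X[:, keep], [kinds[j] for j in keep], list(bw), K)
        if loss < best_loss:
            best_bw, best_loss = list(bw), loss
    return keep, best_bw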
6 Potential Variables

The approach outlined in the previous section illustrates how the $p^*$ relevant variables are identified from a list of $p$ variables initially considered to be potentially relevant. The $p^*$ selected variables then contribute to the calculation of the weights used in Equation (5). Here we describe the set of $p$ variables from which we select the $p^*$ variables considered to be relevant. The variables used can be classified in three categories. First, the time variable, as it is used in the Riskmetrics approach. This assumes that VCM observations close to the time period $T$ at which the forecast is made are more relevant than observations further back in time. The second class of potential weighting variables are measures of matrix closeness. In essence, the more similar a VCM at any time $t < T$ is to the VCM at time $T$, the larger should be the weight given to the associated observed VCM over the subsequent $d$ days in Equation (5). These measures of matrix closeness are discussed in Section 6.1. Finally we consider variables that can broadly be categorised as describing the prevalent economic circumstances at time $t$. Larger weight is to be given to a VCM if the associated macroeconomic conditions are similar to those prevalent at the time of forecast formation, $T$. Variables that fall into this category are described in Section 6.2.
6.1 VCM Comparison Variables

Moskowitz (2003) discusses a number of summary statistics that measure the difference between two matrices. We consider three of these statistics here. The first is the ratio of the eigenvalues of the (squared) VCM at time $t$ relative to those of the (squared) VCM at time $T$:
$$\frac{\sqrt{\mathrm{trace}(V_t' V_t)}}{\sqrt{\mathrm{trace}(V_T' V_T)}} \qquad (13)$$
Values close to 1 indicate that the matrices are similar to each other.
The second statistic adopted from Moskowitz (2003) evaluates the absolute elementwise differences between the two matrices $V_t$ and $V_T$. The sum of all absolute differences is standardised by the sum of all elements in $V_T$. The statistic is defined as
$$\frac{\iota' \left| V_T - V_t \right| \iota}{\iota' V_T \iota} \qquad (14)$$
where $\iota$ is an $n \times 1$ vector of ones. For identical matrices this statistic will take a value of 0.
A third metric suggested in Moskowitz (2003) is based on the realized correlation matrices $C_t$ and $C_T$. (The realized correlation matrices are calculated from $C_t = D_t^{-1} V_t D_t^{-1}$, where $D_t$ is an $(n \times n)$ diagonal matrix with $\sqrt{V_{iit}}$ as its $i$th diagonal element and $V_{iit}$ is the $(i,i)$ element of $V_t$.) The statistic compares how similar $C_t$ and $C_T$ are in relation to the average realized correlation matrix $\bar{C}$. Specifically we are concerned with the position of a particular correlation relative to its long-run average. $\mathrm{sign}(\mathrm{vech}(C_t - \bar{C})_i)$ delivers a positive (negative) sign if the realized correlation (of the $i$th unique element) at time $t$ is larger (smaller) than the relevant average correlation. The statistic considered here essentially calculates the proportion of the $m$ unique elements in $C_t$ that have identical deviations from the long-run correlations as those in $C_T$:
$$\frac{1}{m} \sum_{i=1}^{m} I\!\left\{ \mathrm{sign}\!\left(\mathrm{vech}(C_t - \bar{C})_i\right) = \mathrm{sign}\!\left(\mathrm{vech}(C_T - \bar{C})_i\right) \right\}. \qquad (15)$$
$I\{\cdot\}$ is an indicator taking the value of 1 when the statement inside the brackets is true and 0 otherwise, and $m = \frac{n}{2}(n-1)$ is the number of unique correlations in the $n \times n$ correlation matrix. If matrices are identical with respect to this measure the statistic will take the value 1.
We also compare VCM matrices using the MVQLIKE loss function (Laurent, Rombouts and Violante, 2009), due to it being a robust multivariate loss function. Here it is defined as
$$\mathrm{tr}\!\left(V_t^{-1} V_T\right) - \log \left| V_t^{-1} V_T \right| - n \qquad (16)$$
such that matrices which are identical will deliver a statistic of value 0.

These four statistics will determine the level of similarity between the VCMs at time $t$ and time $T$. The variable selection and bandwidth estimation strategy described previously will determine which of these variables are relevant for VCM forecasting.
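For concreteness, the four comparison statistics of Equations (13)-(16) could be computed along the following lines (a Python sketch; Cbar stands for the average realized correlation matrix and is assumed to be supplied by the caller).

import numpy as np

def eigen_ratio(Vt, VT):
    # Equation (13): ratio of the square roots of trace(V'V) (Frobenius norms)
    return np.sqrt(np.trace(Vt.T @ Vt)) / np.sqrt(np.trace(VT.T @ VT))

def abs_elementwise_diff(Vt, VT):
    # Equation (14): summed absolute differences scaled by the sum of V_T's elements
    return np.abs(VT - Vt).sum() / VT.sum()

def sign_agreement(Vt, VT, Cbar):
    # Equation (15): share of the unique correlations that deviate from the average
    # correlation matrix Cbar in the same direction at times t and T
    def corr(V):
        d = 1.0 / np.sqrt(np.diag(V))
        return V * np.outer(d, d)
    iu = np.triu_indices_from(Cbar, k=1)      # the m = n(n-1)/2 unique correlations
    return np.mean(np.sign(corr(Vt)[iu] - Cbar[iu]) == np.sign(corr(VT)[iu] - Cbar[iu]))

def mvqlike_distance(Vt, VT):
    # Equation (16): Stein distance between V_t and V_T, zero for identical matrices
    A = np.linalg.solve(Vt, VT)
    return np.trace(A) - np.linalg.slogdet(A)[1] - len(VT)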
6.2 Economic Variables

The variables introduced in this section attempt to identify economic variables that are potentially significant for VCM behaviour, based on findings in the existing literature. The first variable we introduce to our model is the term spread, as it was used in Aït-Sahalia and Brandt (2001), Campbell (1987), Fama and French (1989) and Harvey (1991) when investigating the time varying volatility of asset returns. This variable, as defined in Aït-Sahalia and Brandt (2001), is the difference between 1 and 10 year US government bond yields, and so if we define the yield on an $x$ year US government bond at time $t$ as $Y_{Gx,t}$, the term spread variable is
$$Y_{G10,t} - Y_{G1,t}. \qquad (17)$$
Aït-Sahalia and Brandt (2001), Fama and French (1989), Whitelaw (1994) and Schwert (1989) investigated the relation between return volatility and the default spread. The default spread measures the difference in yield between Moody's Aaa and Baa rated corporate bonds. Hence, defining the yields on Aaa and Baa rated bonds at time $t$ as $Y_{Aaa,t}$ and $Y_{Baa,t}$ respectively, the default spread variable is
$$Y_{Baa,t} - Y_{Aaa,t}. \qquad (18)$$
Both oil prices ($Oil_t$) and gold prices ($Gold_t$) have been shown to influence stock return volatility (Sjaastad and Scacciavillani, 1996, Sadorsky, 1999, and Hamilton, 1996); based on this we include both prices in our investigation.

Schwert (1989), Hamilton and Lin (1996) and Campbell et al. (2001) demonstrate that volatility increases during economic downturns. We therefore include a dummy variable identifying bull and bear market periods as described in Pagan and Sossounov (2003). (The algorithm identifies bull and bear periods based on monthly data, as daily data is often too noisy to support identification of broad trends. As a result, once the algorithm identifies a month as belonging to a bull/bear period, all of the constituent days are also assumed to belong to this period.) When applying this to data we use only the information available up until time $T$ in determining turning points between states of the market. We define the variable $Bull_t$ as having a value of one when the market is bullish and 0 otherwise.
As we are interested in the volatility of a stock portfolio it may also be useful to include a market measure of volatility in our list of potential variables. In order to do this we use the volatility index (VIX) quoted by the Chicago Board Options Exchange, $VIX_t$. This provides a measure of volatility implied by market prices and so may be useful as a guide to the level of volatility expected by the market.
In addition to these variables we include two variables which we expect to be irrelevant for the purpose of VCM forecasting. The spurious variables we use are the temperature in Dubai ($DUBAITEMP_t$), obtained from the University of Dayton's daily temperature archive (see http://www.engr.udayton.edu/weather/), and a random variable generated using a standard normal distribution random number generator ($RANDOM_t$). In the absence of a sensible simulation strategy that can evaluate the "size" or "power" of our approach, these variables are included as a sensibility check for the results produced. Any sensible methodology should eliminate such variables. As it turns out, the proposed methodology does indeed eliminate these two irrelevant variables at all forecasting horizons.
7 Forecasting Competition

The empirical application presented in this paper is designed to answer the following two questions. First, does the forecasting approach introduced in Section 5 compare favourably to more established forecasting techniques for high dimensional VCMs? Second, and more specifically, do the economic indicators discussed in Section 6.2 add valuable information to the process of VCM forecasting?

While one could think of a sensible Monte-Carlo setup to establish the answer to the first question, this seems an impossible task with respect to the second. One would have to devise a large multivariate system that jointly modelled stock returns and macroeconomic variables. It is likely that any results would be highly specific to the system devised, and therefore this paper is restricted to an empirical analysis of the questions posed above. In order to obtain sufficient information to address the two issues raised we apply the multivariate kernel approach to two sets of potential weighting variables. In one set of forecasts (MVK) the variable elimination and bandwidth optimisation strategy described in Section 5.1 is applied to the entire set of potential weighting variables. In a second set of forecasts (MVKnm) the weighting variables are restricted to come from a set that includes the time variable and the variables describing matrix similarities from Section 6.1. If the latter set, which includes no economic variables, does significantly worse than the first, we will consider this evidence that economic variables contain useful information in the context of VCM forecasting.

In Section 7.1 alternative forecasting models are introduced and Section 7.2 discusses data and estimation setup issues. The model confidence set methodology used to establish the statistical significance of our results is reviewed in Section 7.3. Results are presented in Section 8.
7.1 Models Included in the Forecast Comparisons

In addition to the forecasting model we propose above, we include the Dynamic Conditional Correlation (DCC) model of Engle and Sheppard (2001) and versions of the RiskMetrics method of J.P. Morgan (1996) in our MCS evaluations of forecast accuracy. Here we provide a brief summary of the models we use.

The DCC model is perhaps the most popular of recent models focused on the VCM of stock returns. In general it models the variances of individual stocks using a GARCH process and then applies a similar process to the correlation matrices. Hence, assuming returns are non-forecastable, the vector of returns $r_t$ is distributed as
$$r_t = \mu + \varepsilon_t, \qquad \varepsilon_t = H_t^{1/2} z_t, \qquad z_t \sim IID(0_n, I_n)$$
where $0_n$ is an $(n \times 1)$ vector of zeroes, $I_n$ is an $(n \times n)$ identity matrix and $H_t$ is the $(n \times n)$ variance covariance matrix. Each of the variances, which form the diagonal of $H_t$, is then modelled using a GARCH(1,1) process and the DCC models the underlying correlation matrix. Define $D_t$ as an $(n \times n)$ diagonal matrix with the $(n \times 1)$ vector of standard deviations of returns at time $t$ on the diagonal. We can then restate $H_t$ as
$$H_t = D_t R_t D_t. \qquad (19)$$
Here $R_t$ is the $(n \times n)$ correlation matrix describing how stocks move together on day $t$, with 1's on the diagonal and off-diagonal elements between -1 and 1. A further transformation is performed on this matrix in order to ensure these properties; the transformation is
$$R_t = Q_t^{*} Q_t Q_t^{*} \qquad (20)$$
$$Q_t = (1 - \alpha - \beta)\bar{Q} + \alpha\, \varepsilon_{t-1} \varepsilon_{t-1}' + \beta Q_{t-1}$$
$$Q_t^{*} = \mathrm{diag}(Q_t)^{-1/2}$$
where $\bar{Q}$ is the long run correlation matrix and $\alpha$ and $\beta$ are parameters determining how the correlations move through time. In order to ensure that correlations are stationary the restrictions $\alpha, \beta > 0$ and $\alpha + \beta < 1$ are imposed.
In order to obtain multiple horizon forecasts from the DCC we utilise one of a pair of suggestions from Engle and Sheppard (2001), which yields the $h$ step forecast equations
$$\hat{H}_{t+h} = \hat{D}_{t+h} \hat{R}_{t+h} \hat{D}_{t+h}$$
$$\hat{R}_{t+h} = \hat{Q}^{*}_{t+h} \hat{Q}_{t+h} \hat{Q}^{*}_{t+h}$$
$$\hat{Q}_{t+h} = \sum_{i=0}^{h-2} (1 - \alpha - \beta)\bar{Q}(\alpha + \beta)^i + (\alpha + \beta)^{h-1} \hat{Q}_{t+1}$$
where the elements of $\hat{D}_{t+h}$ are forecast from the models used for the univariate volatilities.

The DCC model can be difficult to estimate as $n$ increases. In the context of this paper ($n = 20$) we utilise the composite likelihood approach of Engle, Shephard and Sheppard (2008) to obtain estimates of $\alpha$ and $\beta$ (see Appendix B for details).
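The multi-step DCC forecast above could be coded roughly as follows (a Python sketch under the stated recursion; estimation of alpha, beta, Q-bar, Q_{t+1} and the univariate volatility forecasts entering D_hat is assumed to have been carried out elsewhere).

import numpy as np

def dcc_forecast(Q_bar, Q_next, D_hat, alpha, beta, h):
    """h-step-ahead DCC VCM forecast using the multi-step approximation above.

    Q_bar  : (n, n) long-run correlation matrix.
    Q_next : (n, n) one-step-ahead Q_{t+1} from the DCC recursion.
    D_hat  : (n, n) diagonal matrix of forecast standard deviations for day t+h.
    """
    ab = alpha + beta
    # Q_{t+h} = sum_{i=0}^{h-2} (1-a-b) Qbar (a+b)^i + (a+b)^(h-1) Q_{t+1}
    Q_h = (1 - ab) * Q_bar * sum(ab ** i for i in range(h - 1)) + ab ** (h - 1) * Q_next
    # rescale to a proper correlation matrix: R = diag(Q)^(-1/2) Q diag(Q)^(-1/2)
    q_star = np.diag(1.0 / np.sqrt(np.diag(Q_h)))
    R_h = q_star @ Q_h @ q_star
    return D_hat @ R_h @ D_hat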
Another popular method for forecasting the VCM of stock returns is the Riskmetrics forecasting model (J.P. Morgan, 1996) as described in Section 4. The weighting parameter $\lambda$ in (2) determines the weighting scheme, and it is recommended in J.P. Morgan (1996) that this be set at a value of 0.94 for daily data and 0.97 for monthly data. These values are used for the 1 and 22 day forecast comparisons respectively. There is no guidance for what $\lambda$ should be set to when using weekly data and so we follow Laws and Thompson (2005) in setting $\lambda = 0.95$ when obtaining forecasts for a five day ahead period. Forecasts from this model will be labeled RM.

The recommended values are the result of averaging optimal $\lambda$s over several different economic time-series models, not all of which will be representative for the data at hand. We therefore expect an optimised value of $\lambda$ to outperform the fixed recommendation. As well as adopting the above recommendations for $\lambda$ we, therefore, also include a version of Riskmetrics for which we optimise $\lambda$, choosing the value of $\lambda$ that minimises CVMVQ for in-sample estimates of the VCMs (RMopt).

We introduce one further adjustment to the Riskmetrics methodology. The new information entering the Riskmetrics forecast at time $T$ is the cross product of daily returns $r_T r_T'$ (see Equation 1). This can be interpreted as a very noisy proxy for the variance covariance structure at day $T$. It is well known that a less noisy proxy is the realized VCM $V_T$, and hence we propose (similar to Fleming, Kirby and Ostdiek, 2003) the following forecasting model (illustrated for a $d = 1$ day forecasting horizon; the generalisation is straightforward):
$$H_{T+1} = \lambda H_T + (1 - \lambda) V_T. \qquad (21)$$
As the series of $V_T$ has different properties compared to $r_T r_T'$, it is apparent that the fixed values of $\lambda$ recommended for the latter should not be applied here. We use the cross validation approach proposed above to find optimal values for the weighting parameter. In what follows, forecasts from this model are labeled RMvcm.
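A sketch of the RMvcm recursion of Equation (21), with a simple grid search standing in for the cross-validated choice of lambda (Python; the grid, the burn-in length and the reuse of the mvqlike helper from above are illustrative choices, not the paper's exact procedure):

import numpy as np

def rmvcm_path(V, lam):
    """Recursive forecasts of Equation (21): H[t] = lam * H[t-1] + (1 - lam) * V[t-1],
    so H[t] is the forecast of V[t] formed from realized VCMs up to index t-1.
    V : (T, n, n) array of daily realized VCMs."""
    H = np.empty_like(V)
    H[0] = V[0]                               # initialisation with the first realized VCM
    for t in range(1, len(V)):
        H[t] = lam * H[t - 1] + (1 - lam) * V[t - 1]
    return H

def optimise_lambda(V, grid=np.linspace(0.80, 0.99, 20), burn=300):
    """Pick the lambda that minimises the average MVQLIKE of in-sample forecasts."""
    best_lam, best_loss = None, np.inf
    for lam in grid:
        H = rmvcm_path(V, lam)
        loss = np.mean([mvqlike(H[t], V[t]) for t in range(burn, len(V))])
        if loss < best_loss:
            best_lam, best_loss = lam, loss
    return best_lam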
7.2 Data and Setup

The VCM forecasts included in our analysis all relate to a portfolio of 20 stocks listed on the NYSE over the period 28/11/1997-31/8/2006. A full list of the stocks used can be found in Appendix A at the end of this paper. The macroeconomic information used covers the same period. The information on term spreads (GVUS05(CM10)~U$, GVUS05(CM01)~U$), default spreads (DAAA, DBAA), oil prices (OILBREN) and gold prices (GOLBLN) was obtained from Datastream (the information in brackets gives the Datastream codes/names for the data series used to construct these variables). The bull and bear dummy variables were calculated using the algorithm suggested in Pagan and Sossounov (2003) based on monthly S&P 500 index prices. The algorithm was adjusted so that the values of the dummies were those which would have been calculated using the data available at the point in time at which we make our forecast. The VIX data was obtained from the Chicago Board Options Exchange (CBOE) website (see http://www.cboe.com/micro/vix/historical.aspx).

In versions of the model in which we make multiple day forecasts we take averages of the data over periods of the same length, with the exception of the bull and bear market variable, for which we take the value on the first day of the period.
The non-parametric approach described in this paper utilises realized variance covariance matrices compiled from intra-day price quotes. In order to compile our realized VCMs we use the following method. We obtain vectors of returns over the period between the market closing on day $t-1$ and opening on day $t$; we denote these as $r_{Ot}$. We also obtain vectors of returns over every 5 minute period during the time the market is open (as all 20 stocks are very liquid and frequently traded we do not anticipate any microstructure or non-synchronicity issues at a 5 minute sampling interval); hence, as the stocks are traded over the period 9:30-16:00, we obtain 55 intraday return vectors $r_{it}$, $i = 1, \ldots, 55$. In order to calculate a VCM for an entire 24 hour time period we use one of the methods for such calculations proposed in Hansen and Lunde (2005). We calculate the realized variance-covariance matrix for day $t$ as
$$V_t = r_{Ot} r_{Ot}' + \sum_{i=1}^{55} r_{it} r_{it}'. \qquad (22)$$
As the close to open returns on a stock represent a significant part of the risk of holding stocks it seems appropriate to include these in our forecasting approach. When forecasting over multiple days ($d = 5, 22$) we require $V^{(d)}_{t+d}$, which can be obtained from $V^{(d)}_{t+d} = \sum_{\tau=t+1}^{t+d} V_\tau$.
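Equation (22) and the d-day aggregation could be implemented along these lines (a Python sketch with illustrative array shapes):

import numpy as np

def realized_vcm(r_overnight, r_intraday):
    """Daily realized VCM of Equation (22).

    r_overnight : (n,) close-to-open return vector r_Ot.
    r_intraday  : (m, n) array of intraday (e.g. 5-minute) return vectors r_it.
    """
    V = np.outer(r_overnight, r_overnight)
    V += r_intraday.T @ r_intraday            # adds the sum of outer products r_it r_it'
    return V

def realized_vcm_multiday(V_daily, t, d):
    """d-day realized VCM V^(d)_{t+d}: the sum of the daily VCMs for days t+1, ..., t+d
    (t is a 1-based day index into the (T, n, n) array V_daily)."""
    return V_daily[t:t + d].sum(axis=0)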
The initial estimation period for all time horizons in the forecast competition results below consists of the first 936 datapoints. All forecasting periods are non-overlapping, leading to 1,266, 253 and 57 forecast periods for the 1, 5 and 22 day forecast horizons respectively. As our model makes use of instances in the past when conditions are similar to the forecast point, it seems logical to allow the model access to as much data as possible, and so we allow expanding estimation samples to be used in the compilation of forecasts. In order to ensure that the DCC is not hampered by data restrictions we also employ the expanding dataset in the estimation of the DCC parameters.

The cross-validation procedure for eliminating variables that do not contribute to improved VCM forecasts is very computing intensive and is therefore performed every 200 days, seven times throughout our sample period. On each of these seven occasions a variable is either included in the model or not. Significant variables are then retained for the following 200 days. On each day, however, a new multivariate bandwidth optimisation (as described in Section 5.1), for the fixed set of retained variables, is performed.
7.3 Analysis of Results: Model Confidence Sets (MCS)

In our forecast competition we want to determine which of the six models provides the best forecasts. In order to do this we use the MCS, introduced in Hansen, Lunde and Nason (2003), which analyses forecasting performance in order to distill the group of models which contains the best forecasting model with a given confidence level. This collection of forecasting models is called the model confidence set (MCS). The models that remain in the MCS at the end of the process are assumed to have equal predictive power.
We begin the process of forming the MCS with a set of forecasting models $\mathcal{M}_0$. The first stage of the process tests the null hypothesis that all of these models have equal predictive accuracy (EPA) when their performance is measured against a set of ex-post observations. If $H_{it}$ is the $i$th forecast of the VCM at time $t$ and $\Sigma_t$ is the observed VCM (or a consistent estimate; in the forecast experiments below we use the realized VCM, $V_t$, in place of $\Sigma_t$ as it is a consistent estimator of the unobserved VCM) for the same period, then the value of a loss function based on a comparison of these is denoted $L(H_{it}, \Sigma_t)$. The evaluation of the EPA hypothesis is based on loss differentials between the values of the loss functions for different models, where the loss differential between forecasting models $i$ and $j$ for time $t$, $d_{ij,t}$, is defined as
$$d_{ij,t} = L(H_{it}, \Sigma_t) - L(H_{jt}, \Sigma_t). \qquad (23)$$
If all of the forecasters are equally accurate then the loss differentials between all pairs of forecasters should not be significantly different from zero. The null hypothesis of EPA is then
$$H_0 : E(d_{ij,t}) = 0 \quad \forall \, i > j \in \mathcal{M} \qquad (24)$$
and failure to reject $H_0$ implies all forecasting models have equal predictive ability. We test (24) using the semi-quadratic test statistic described in Hansen and Lunde (2007). If the null hypothesis is rejected at an $\alpha$% confidence level, we remove the model with the worst loss function and begin the process again with the reduced set of forecasting models, $\mathcal{M}_1$. This process is iterated until the test of equal predictive accuracy cannot be rejected, or a single model remains. The model(s) which survive form the MCS with $\alpha$% confidence.

The loss function we use to analyse the performance of our VCM forecasts is the MVQLIKE (Stein distance) function described above in (16). This is a robust loss function, as described in Laurent, Rombouts and Violante (2009). Clements, Doolan, Hurn and Becker (2009) and Laurent et al. (2010) established that this loss function, compared to other loss functions, identifies a correctly specified forecasting model in a smaller MCS; hence it is more discriminatory than, say, the mean square forecast error criterion.
8 Forecast Comparison - Results

Here we present the results of our forecasting competitions for 1, 5 and 22 day forecast horizons. The results analysed here are the means of the MVQLIKE loss functions for the forecasts and the MCS p-values. If an MCS p-value is greater than the significance level then a model is included in the MCS, otherwise it is omitted.

We present two sets of results, which differ in the way in which the time variable is included in the multivariate kernel forecast. It was discussed in Section 5 that it could be introduced (see Equation 9) in such a way that observations far distant from the time at which the forecast is made are heavily penalised (Table 1). Alternatively (see Equation 10) it could be specified such that close observations obtain a higher weight, but observations in the long past are not excluded from attracting significant positive weights (Table 2).
Referring to the results presented in Table 1 relating to 1 day ahead forecasts, it can be concluded that the MVK model is the only model surviving in the MCS. All other models have a p-value smaller than 5% and are hence excluded from a 95% confidence level MCS. The forecasting model with the second largest p-value is the MVKnm forecasting model that excludes the economic variables. This allows the conclusion that for short-term forecasts the inclusion of economic variables adds value to the VCM forecasts in our setup. It is also interesting to note that the standard Riskmetrics approach (RM, with fixed $\lambda$), and to a lesser extent RMopt, deliver VCM forecasts with loss functions that exceed those of other forecasting models by a large margin (considering the variation in loss functions between the other models).
        1 Day Forecasts            5 Day Forecasts            22 Day Forecasts
Model   MVQ     p-value    Model   MVQ     p-value    Model   MVQ     p-value
MVK     13.03   1.0000     RMvcm    6.78   1.0000     RMvcm    4.43   1.0000
MVKnm   13.12   0.0033     MVKnm    6.94   0.1235     MVK      5.49   0.1033
RMvcm   13.31   0.0000     MVK      7.10   0.1235     MVKnm    5.68   0.1033
DCC     14.65   0.0000     DCC      8.80   0.0004     DCC      6.39   0.0597
RMopt   17.05   0.0000     RMopt   13.04   0.0000     RMopt   21.93   0.0006
RM      43.35   0.0000     RM      25.35   0.0000     RM      26.22   0.0077

Table 1: MCS Results 1. Uses the time kernel of Equation (9). The table reports the MCS results for the multivariate kernel (with macro variables - MVK; without macro variables - MVKnm), the DCC, the Riskmetrics (RM), the Riskmetrics with cross-validated λ (RMopt) and the Riskmetrics forecasting model using realized VCM (RMvcm). Forecasts are for 1, 5 and 22 day horizons. MVQ is the average loss function and p-value is the MCS p-value.
A further interesting conclusion can be drawn from comparing the results for MVK and the Riskmetrics approach that uses realized VCMs and an optimised $\lambda$ (RMvcm). Essentially the latter is a special case of the former that excludes all potential weighting variables but the time variable. The fact that RMvcm is excluded from the MCS for 1 day ahead forecasts indicates that matrix comparison and macroeconomic variables can add significant information to the VCM forecasting process at short forecast horizons.
The finding that RM and RMopt deliver inferior VCM forecasts generalises to longer forecast horizons. In none of the forecast comparisons considered in this paper is either of these two forecasting models close to being included in an MCS. As the forecast horizon is increased to 5 and 22 days, the relative performance of the RMvcm model improves significantly. For both these horizons it has the smallest loss function, although it shares membership of the MCS with both the MVK and the MVKnm forecast models. This appears to indicate that the value of the matrix closeness measures and the economic variables is largest for very short-term forecasts. It is also notable that for the longest forecast horizon the DCC model is included in a 95% (but not a 90%) confidence level MCS.
In Table 2 we present results based on kernel weighted VCM forecasts that use the modified time kernel proposed in Equation 10. This kernel ensures that observations are not automatically discarded just because they occur long before the forecast period. While the non-kernel forecasting methods remain unchanged (the average loss functions for these forecasting models are identical in Tables 1 and 2), the MCS methodology has to be reapplied, as the MCS p-values are conditional on the initial set of forecasts used.
        1 Day Forecasts            5 Day Forecasts            22 Day Forecasts
Model   MVQ     p-value    Model   MVQ     p-value    Model   MVQ     p-value
MVK     13.06   1.0000     RMvcm    6.78   1.0000     RMvcm    4.43   1.0000
RMvcm   13.31   0.0110     MVK      6.90   0.2209     MVK      5.55   0.0010
DCC     14.65   0.0000     DCC      8.80   0.0000     DCC      6.39   0.0005
RMopt   17.05   0.0000     MVKnm   11.09   0.0000     MVKnm    8.87   0.0000
MVKnm   17.97   0.0000     RMopt   13.04   0.0000     RMopt   21.93   0.0000
RM      43.35   0.0000     RM      25.35   0.0000     RM      26.22   0.0000

Table 2: MCS Results 2. Uses the time kernel of Equation (10). The table reports the MCS results for the multivariate kernel (with macro variables - MVK; without macro variables - MVKnm), the DCC, the Riskmetrics (RM), the Riskmetrics with cross-validated λ (RMopt) and the Riskmetrics forecasting model using realized VCM (RMvcm). Forecasts are for 1, 5 and 22 day horizons. MVQ is the average loss function and p-value is the MCS p-value.
Some of the basic findings discussed above remain unchanged. RM and RMopt are still inferior to all other forecasting methodologies used here. The value of the kernel VCM forecasting method is more apparent for shorter forecasting horizons, and hence the value of matrix closeness measures and economic variables diminishes with increasing forecast horizon. As the time variable has a less dominant impact on the kernel forecasts, it is not surprising to find larger differences between the kernel forecasts that include macroeconomic variables (MVK) and those that do not (MVKnm). In all cases the former has clearly superior average loss measures, and for the 1 and 5 day forecast horizons MVK is included in the MCS while MVKnm is not. At the 1 day horizon MVK is unambiguously the best model, at the 5 day horizon it is in the MCS together with RMvcm, while the latter is the unique surviving model at the 22 day forecasting horizon.
Before highlighting some more aspects of the multivariate kernel forecasts, it should be noted that the results for the RMvcm forecasts are rather impressive allowing for the limited information set utilised in these forecasts. While Fleming et al. (2003) used essentially the same model, they estimated the optimal decay parameter in a slightly different way. It is apparent that this extension to the traditional Riskmetrics approach should be seriously considered in the context of high dimensional VCM forecasting.

It is interesting to compare the MVK forecast performance for the two different time kernels. When using the kernel that converges to 0, and therefore virtually eliminates observations with large $(T-t)$, the difference in loss functions between the forecasting model that uses the economic variables and that which does not is fairly small, although statistically significant at the 1 day forecasting horizon. When applying the time kernel that converges to 1, and hence does not penalise observations with large $(T-t)$, the difference between the two sets of weighting variables becomes larger. This result is best interpreted in combination with the observation that the loss function values for the MVK forecasts remain almost unchanged across the two kernel types, whereas those for MVKnm deteriorate as one moves from kernel (9) to (10). The obvious interpretation of this result is that evaluating similarity merely on the basis of matrix closeness measures is not a good strategy. The kernel forecast will then give weight to past observations of $V_t$ that do not positively contribute to forecasting performance. Past observations appear relevant when $V_t$ and $V_T$ have similar characteristics and the prevailing macroeconomic conditions at time $t$ are close to those at time $T$.
In both Table 1 and Table 2 we find evidence that the macroeconomic information and VCM-based
similarity statistics significantly improve on a kernel based purely on time at a 1 day forecast
horizon.^18 We investigate which of the potential kernel weighting variables considered tend to
survive the variable elimination strategy described in Section 5.1.

The survival probabilities are reported in Table 3, which shows the percentage of times that a
variable is included in the kernel at each forecasting horizon; as mentioned above, the inclusion/exclusion
decision is made every 200 days, or seven times for each forecast horizon. This Table refers to results
generated with the time kernel that converges to 0 (Equation (9)).^19 It is apparent that a good
number of the variables considered remained in the weighting mechanism in the vast majority
(if not all) of instances. The variables that were eliminated more often than not are the correlation
comparison variable and the MVQLIKE measure of closeness. There is little variation across
the different forecast horizons, with the exception that the comparison of absolute differences is
dropped, and the correlation comparison included, more often for the 22 day horizon. In general
the inclusion frequencies drop somewhat for the 22 day forecast horizon. The exceptions are the
term spread as well as the oil and gold prices, which are never dropped at any forecasting horizon.
^18 We also ran versions of the model in which the 5 and 22 day periods in the estimation of the bandwidths were
non-overlapping. We found that this marginally improved mean MVQs but did not alter the qualitative nature of
the MCS analysis.
^19 Note that the results for all variables other than time are identical regardless of the time variable we use, as the
inclusion/exclusion is based on a univariate kernel.
Variable                                                        Percentage Inclusion
                                                                1 Day     5 Day     22 Day
Ratios of eigenvalues of V_T and V_t                             100%      100%      85.7%
Absolute relative differences of elements of V_T and V_t         100%      100%         0%
Proportion of correlations with the same sign at time t and T      0%        0%      42.9%
MVQLIKE of V_t when compared to V_T                             28.6%     71.4%      57.1%
Term spread of government bonds                                  100%      100%       100%
Default spread on corporate debt                                 100%     85.7%      85.7%
Oil price                                                        100%      100%       100%
Gold price                                                       100%      100%       100%
VIX index                                                       85.7%      100%      42.9%
Time                                                             100%      100%      85.7%
Bull and bear market phases                                      100%      100%      85.7%

Table 3: Variables Included in the Kernel Model.
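Since the inclusion/exclusion decision is taken seven times per forecast horizon, the percentages in Table 3 are simply inclusion counts divided by seven (for example, 6 of 7 inclusions gives 85.7%). The short sketch below, with made-up decisions, shows how one column of the table could be produced; the decision values are hypothetical.

```python
# Hypothetical inclusion decisions (True = variable kept) across the seven
# re-selection dates for one forecast horizon; 6 of 7 inclusions gives 85.7%.
decisions = {"Term spread": [True] * 7,
             "VIX index": [True, True, True, False, True, True, True]}

for name, kept in decisions.items():
    print(f"{name}: {100 * sum(kept) / len(kept):.1f}%")
```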
9 Conclusion
This paper presents a flexible kernel model which can be used to forecast symmetric, positive
definite variance-covariance matrices for large portfolios of stocks while being able to incorporate
a wide array of data. This is in contrast to many of the more popular approaches to VCM
modelling, which have to simplify their estimation processes or parameterisations in
order to handle large covariance matrices. The model relies on techniques well established
in the nonparametric econometrics literature. Importantly, the computational task
scales with the number of variables used to determine the kernel weights rather than with
the dimension of the covariance matrix. Our model is flexible, capable of using a wide range of
economic information, and can be used as easily for small as for large matrices. It does, however,
depend on the availability of positive definite VCM estimates, which may not be easily obtained
for very high dimensional problems.
This paper establishes the feasibility of the proposed forecasting approach and further demon-
strates that using a larger set of information (matrix closeness measures and economic variables)
can have statistically significant advantages. These advantages appear to be strongest at very short
forecast horizons. We also found that a version of the popular Riskmetrics model, using VCMs
based on high frequency data and a cross-validated decay parameter, proved extremely useful.
While the kernel method dominated at the very short horizons, the modified Riskmetrics ap-
proach performed best for 5 and 22 day ahead forecasts. This very simple forecasting method has
not attracted much attention in the empirical literature and it is suggested that its merits should
be re-evaluated.
A number of issues for further research are beyond the scope of this paper. While we evaluated
forecast performance with statistical measures, future research should establish whether the proposed
forecasting methodology delivers economically significant improvements. We also anticipate
that the ability to incorporate exogenous information into the VCM forecasting process will
allow researchers to re-evaluate the type of variables considered in the context of VCM forecasting.
10 Bibliography
References
Aït-Sahalia, Y. & Brandt, M.W. (2001) "Variable selection for portfolio choice", The Journal of
Finance, Vol. 56, no. 4, pp. 1297-1351.
Aitchison, J. & Aitken, C.G.G. (1976) "Multivariate binary discrimination by the kernel method",
Biometrika, Vol. 63, no. 3, pp. 413-420.
Bowman, A.W. (1984) "An alternative method of cross-validation for the smoothing of density
estimates", Biometrika, Vol. 71, pp. 353-360.
Bowman, A. W. (1997) Applied smoothing techniques for data analysis : the kernel approach with
S -Plus illustrations, Clarendon Press, Oxford.
Blair, B. J., Poon, S. H., & Taylor, S. J. (2001), "Forecasting S&P 100 volatility: the incre-
mental information content of implied volatilities and high-frequency index returns", Journal of
Econometrics, vol. 105, no. 1, pp. 5-26
Campbell, J.Y. (1987) "Stock returns and the term structure", Journal of Financial Economics,
Vol. 18, pp. 373-399.
Campbell, J.Y., Lettau, M., Malkiel, B.G. & Xu, Y. (2001) "Have individual stocks become more
volatile? An empirical exploration of idiosyncratic risk" The Journal of Finance, Vol. 56, No. 1,
pp1-43.
Clements, A., Doolan, M., Hurn, S. & Becker, R. (2009) "Evaluating multivariate volatility
forecasts", NCER Working Paper Series 41, National Centre for Econometric Research.
Clements, A., Hurn, S. & Becker, R. (2010) "Semi-Parametric Forecasting of Realized Volatility",
to appear in: Studies in Nonlinear Dynamics & Econometrics.
Chiriac, R. & Voev, V. (2008) "Modelling and forecasting multivariate realized volatility", CoFE
Discussion Paper 08-06, Center of Finance and Econometrics, University of Konstanz.
Colacito, R., Engle, R.F. & Ghysels, E. (2007), "A component model for dynamic correlations",
Unpublished.
Engle, R.F. (2002) "Dynamic conditional correlation - a simple class of multivariate GARCH
models", Journal of Business and Economic Statistics, Vol. 20, no. 3, pp. 339-350.
Engle, R.F. & Kelly, B.T. (2008), "Dynamic equicorrelation", NYU Working Paper FIN-08-038.
Engle, R. F. & Sheppard, K. (2001), "Theoretical and empirical properties of dynamic conditional
correlation multivariate GARCH", NBER Working Paper No. 8554.
Engle, R.F., Shephard, N. & Sheppard, K. (2008) "Fitting vast dimensional time-varying covari-
ance models", unpublished mimeo, http://www.oxford-man.ox.ac.uk/~nshephard/.
Fama, E.F. & French, K.R. (1989) "Business conditions and expected returns on stocks and
bonds", Journal of Financial Economics, Vol. 25, pp. 23-49.
Fleming, J., Kirby, C. & Ostdiek, B. (2003) "The economic value of volatility timing using
'realized' volatility", Journal of Financial Economics, Vol. 67, no. 3, pp. 473-509.
Gijbels, I., Pope, A. & Wand, M.P. (1999) "Understanding exponential smoothing via kernel
regression", Journal of the Royal Statistical Society, Vol. 61, pp. 39-50.
Hamilton, J.D. & Lin, G. (1996), "Stock market volatility and the business cycle", Journal of
Applied Econometrics, Vol. 11, No. 5, pp. 573-93.
Hamilton, J.D. (1996), "This is what h appened to the oil price-macroeconomy relationship",
Journal of Monetary Economics, Vol. 38, pp. 215-220.
Hansen, P.R. (2001) "A test for superior predictive ability", Brown University, Department of
Economics Working Paper 2003-09.
Hansen, P.R. & Lunde, A. (2005) "A realized variance for the whole day based on intermittent
data", Journal of Financial Econometrics, Vol 3 (4), pp525-554.
Hansen, P.R. & Lunde, A. (2007) "MULCOM 1.00, Econometric toolkit for multiple comparisons"
(Packaged with Mulcom package)
Hansen, P.R., Lunde, A. & Nason, J.M. (2003), "Choosing the best volatility models: the model
confidence set approach", Oxford Bulletin of Economics and Statistics, Vol. 65, Supplement,
pp. 839-861.
Hansen, P.R., Lunde, A. & Nason, J.M. (2004), "Model confidence sets for forecasting models",
Federal Reserve Bank of Atlanta Working Paper No. 2005-7, http://ssrn.com/paper=522382.
Harvey, C.R. (1989) "Time-varying conditional covariance in tests of asset pricing models", Jour-
nal of Financial Economics, Vol. 24, pp. 289-317.
Harvey, C.R. (1991) "The specification of conditional expectations", Working paper, Duke Univer-
sity.
J.P. Morgan (1996) Riskmetrics Technical Document 4th Edition, J.P. Morgan, New York.
Krolzig, H.M. & Hendry, D.F. (2001) "Computer automation of general-to-specific model selection
procedures", Journal of Economic Dynamics & Control, Vol. 25, pp. 831-866.
Laurent, S., Rombouts, J.V.K. & Violante, F. (2009) "On loss functions and ranking forecasting
performances of multivariate volatility models", CIRPÉE Working Paper 09-48.
Laurent, S., Rombouts, J.V.K. & Violante, F. (2010), "On the forecasting accuracy of multivariate
GARCH models", CIRPÉE Working Paper 10-21.
Laws, J. & Thompson, J. (2005) "Hedging effectiveness of stock index futures", European Journal
of Operational Research, Vol. 163, pp. 171-191.
Li, Q. & Racine, J.S. (2007) Nonparametric Econometrics: Theory and Practice, Princeton Uni-
versity Press, Princeton.
Moskowitz, T.J. (2003) "An analysis of covariance risk and pricing anomalies", The Review of
Financial Studies, Vol. 16, pp 417-457.
Pagan, A.R. & Sossounov, K.A. (2003) "A simple framework for analysing bull and bear markets",
Journal of Applied Econometrics, Vol. 18, No. 1, pp. 23-46.
Pelletier, D. (2006) "Regime switching for dynamic correlations", Journal of Econometrics, Vol.
131, no. 1-2, pp. 445-473.
Poon, S-H. & Granger, C.W.J. (2003) "Forecasting volatility in financial markets: a review",
Journal of Economic Literature, Vol. 41, pp. 478-539.
Rudemo, M. (1982) "Empirical choice of histograms and kernel density estimators", Scandinavian
Journal of Statistics, Vol. 9, No. 2, pp65-78.
Sadorsky, P. (1999) "Oil price shocks and stock market activity", Energy Economics, Vol. 21, pp.
449-469.
Schwert, G.W. (1989) "Why does stock market volatility change over time", The Journal of
Finance, Vol. 44, no. 5, pp 1115-1153.
Silverman, B.W. (1986) Density Estimation for Statistics and Data Analysis, Chapman & Hall,
London.
Sjaastad, L.A. & Scacciavillani, F. (1996) "The price of gold and the exchange rate", Journal of
International Money and Finance, Vol. 15, No. 6, pp. 879-897.
Vasilellis, G.A. & Meade, N. (1996), "Forecasting volatility for portfolio selection", Journal of
Business, Finance and Accounting, Vol. 23, pp. 125-143.
Wand, M.P. & Jones, M.C. (1995) Kernel Smoothing, Chapman & Hall, London.
Whitelaw, R.F. (1994) "Time variations and covariations in the expectation and volatility of stock
market returns", The Journal of Finance, Vol. 49, No.2, pp. 515-541.
11 Appendix A
Table 4 provides a full list of the stocks used in the analysis in section XX.
Ticker symbol Company Name
1 AA Alcoa Inc
2 AXP American Express Inc
3 BA Boeing Co.
4 BAC Bank of America Corp.
5 BMY Bristol Myers Squibb Co.
6 CL Colgate Palmolive
7 DD E.I. du Pont de Nemours & Co.
8 DIS Walt Disney Corp.
9 GD General Dynamics Corp
10 GE General Electric Co.
11 IBM IBM
12 JNJ Johnson & Johnson
13 JPM JP Morgan Chase Co.
14 KO Coca Cola Corp.
15 MCD McDonald's Corp
16 MER Merrill Lynch & Co. Inc
17 MMM 3M co.
18 PEP Pepsico Inc.
19 PFE Pfizer Inc
20 TYC Tyco International ltd.
Table 4: Stocks included in the forecasting experiment in section X.
12 Appendix B
In the text of the article reference is made to the composite-likelihood DCC estimation method sug-
gested by Engle, Shephard and Sheppard (2008) in order to make estimation of the DCC parameters
feasible for a large scale problem.
In a DCC model we assume that

r_t = \mu + \varepsilon_t, \qquad \varepsilon_t = H_t^{1/2} z_t, \qquad z_t \sim IID(0_n, I_n),

and H_t is governed by a process dependent on a parameter vector \theta, which contains the values of
\alpha, \beta and the unique diagonal elements of \bar{Q} from (20). The parameter vector is estimated
by maximising the likelihood equation

\log L(\theta; z_T) = \sum_{t=1}^{T} L_t(\theta), \qquad
L_t(\theta) = -\frac{1}{2} \log |H_t| - \frac{1}{2} r_t' H_t^{-1} r_t,
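As a small illustration of the per-observation contribution L_t(\theta) written out above (with the additive constant dropped, as in the text), the helper below simply evaluates the expression for a given conditional covariance H_t and return vector r_t; the function name is hypothetical.

```python
import numpy as np

def loglik_contribution(H_t, r_t):
    """L_t = -0.5*log|H_t| - 0.5 * r_t' H_t^{-1} r_t (additive constant omitted)."""
    _, logdet = np.linalg.slogdet(H_t)
    quad = r_t @ np.linalg.solve(H_t, r_t)
    return -0.5 * logdet - 0.5 * quad

H = np.array([[1.0, 0.3], [0.3, 1.0]])
r = np.array([0.5, -0.2])
print(loglik_contribution(H, r))
```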
however it is difficult to maximise this if we have a high dimensional VCM.

The method suggested by Engle, Shephard and Sheppard (2008) to circumvent this problem
is composite-likelihood estimation. The first step is to compute a likelihood for several
pairs of stocks. For example, if we have three stocks we may compute the likelihoods for DCC
models using the pairs of stocks (1,2), (2,3) and (1,3), and we denote these likelihoods as L_{j,t}(\theta_j),
j = 1, 2, 3, respectively. These are computed using the equation for L_t(\theta) above, except that
the dimensions of H_t and r_t are now 2x2 and 2x1 respectively. The
composite likelihood approach then sums these likelihoods over time and averages over
the Z pairwise combinations included. We then choose the parameters to maximise the sum
CL(\theta) = \frac{1}{Z} \sum_{t=1}^{T} \sum_{j=1}^{Z} L_{j,t}(\theta_j).
Hence we find the DCC parameters which maximise this average of pairwise likelihoods. By doing this it is
possible to estimate the DCC parameters using only pairwise calculations; however, the estimated
parameters will not be the same unless the pairs are independent. The parameter vector \theta_j varies
over j = 1, 2, 3 only in the long run correlation parameters for the pair of stocks, obtained from
the standardised GARCH residuals from the first step of DCC estimation. As in the aggregate
model these are replaced by the full set of long run correlations; it is in this respect that \theta
and each \theta_j differ.

Engle, Shephard and Sheppard (2008) note that the loss in efficiency from using fewer than all
available pairs is only very small, and so we include each of our stocks in only one pairing of stocks
for the calculation of the composite-likelihood DCC.
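The sketch below illustrates the composite-likelihood construction described above: for a fixed set of non-overlapping stock pairs, the bivariate log-likelihood contributions are summed over time and averaged over pairs. Building the bivariate conditional covariances H_{j,t} from the GARCH and correlation recursions is omitted (a precomputed array of pair covariances is assumed), so this is an illustration of the objective only, not an implementation of the Engle, Shephard and Sheppard (2008) estimator; all names are hypothetical.

```python
import numpy as np

def pair_loglik(H_pair, r_pair):
    """Sum over time of bivariate Gaussian log-likelihood contributions for one pair.
    H_pair: (T, 2, 2) conditional covariances; r_pair: (T, 2) returns."""
    total = 0.0
    for H_t, r_t in zip(H_pair, r_pair):
        _, logdet = np.linalg.slogdet(H_t)
        total += -0.5 * logdet - 0.5 * r_t @ np.linalg.solve(H_t, r_t)
    return total

def composite_loglik(H_by_pair, r_by_pair):
    """Composite likelihood: average over the Z pairs of the pairwise log-likelihoods."""
    Z = len(H_by_pair)
    return sum(pair_loglik(H, r) for H, r in zip(H_by_pair, r_by_pair)) / Z

# toy example with two pairs and ten time periods each
rng = np.random.default_rng(3)
T = 10
H_by_pair = [np.stack([np.eye(2)] * T) for _ in range(2)]
r_by_pair = [rng.standard_normal((T, 2)) * 0.1 for _ in range(2)]
print(composite_loglik(H_by_pair, r_by_pair))
```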