ArticlePDF Available

Modeling the IR curve: the Exponential Polynomial Model ("EPM"), the true extension of Nielson-Siegel

March 2018

March 2018

Authors:

ALM-Vision

The paper presents a new model called "Exponential Polynomial Model". EPM is an extension of the famous Nielson-Siegel model and an alternative to Nielson-Siegel-Svensson model (NSS). NSS raises a lot of issues, being difficult to estimate and explain, with colinearity between factors. EPM is conditionally linear and much more elegant and easier to interpret. Furthermore, the model allows to easily appreciate the improvement provided by additional parameters. Finally, the author discusses some points on AOA and multi-dimentional binomial pricing.

Content uploaded by Serge Moulin

Content may be subject to copyright.

01/03/2018

Modeling the interest rate curve

The exponential polynomial model:

the true extension of Nelson-Siegel

Serge MOULIN

ALM-VISION

 
1 
 
Modeling the interest rate curve:  
the exponential polynomial model,  
the true extension of Nelson-Siegel 
 
 
Table of contents 
 
Abstract ................................................................................................................................................... 2 
Degrees of freedom of the interest rate curve ....................................................................................... 2 
Explaining the moves of the curve .......................................................................................................... 3 
Interest rate curve modeling ................................................................................................................... 4 
Model of Nelson-Siegel ....................................................................................................................... 5 
Estimation of parameters with IR points ............................................................................................. 8 
Dynamic Nelson-Siegel ...................................................................................................................... 11 
Generalizing Nelson-Siegel ................................................................................................................ 11 
The exponential polynomial model EPM(n) ...................................................................................... 12 
Estimating EPM(n) parameters using observed yields ...................................................................... 13 
Explaining the EPM(n) model ............................................................................................................ 14 
An example .................................................................................................................................... 16 
Estimating EPM(n) parameters using prices or in fine yields ............................................................ 17 
Optimizing using in fine yields ....................................................................................................... 20 
About weights and errors .................................................................................................................. 21 
Other statistical models versus least square estimation .................................................................. 22 
Pricing of derivatives ............................................................................................................................. 24 
Conclusion ............................................................................................................................................. 25 
Bibliography ........................................................................................................................................... 27 
About ALM-VISION ................................................................................................................................ 28 
Disclaimer .............................................................................................................................................. 28 
 
   

Abstract

Most of the central banks provide models of the interest rate curve based on the Spline methodology

or the Nelson-Siegel model and its extensions: Nelson-Siegel Svensson and Generalized Nelson-Siegel.

But these last two models raise calibration issues, being far from linear.

After reminding some key elements about the IR curve, we present here a new methodology which

appears easier and more elegant in term of optimum, choice of the number of parameters and their

explanation. Furthermore, the model includes the Nelson-Siegel approach as a specific case. We call it

“the exponential polynomial model” or “EPM(n)” with n being the degree of the model. Nelson-Siegel

is an EPM(1). Estimations are easier also, allowing to choose the required level of parameters.

We finally give an introduction to derivatives pricing using this new methodology, a simple and elegant

way of expanding Cox-Ross but which raises issues in term of AOA assumption.

Degrees of freedom of the interest rate curve

Factorial analysis of interest rate curves generates orthogonal factors explaining its moves. The three

first factors explain usually between 90% and 96% of the moves. They are:

• First, the general level of the curve describing the translation effects. It explains about

50% of the moves.

• Second, the slope (difference between short-term and long-term rates) describing the

rotation effects around a pivot between 2 and 3 years (the usual limit between short-

term and long-term markets). It explains around 30% of the moves.

• Third, the curvature defining the shape of the slope explaining around 10% of the

moves.

• Additional effects allowing to specify the shape of the curvature.

The Nelson-Siegel (“NS”) model focus on the three first parameters and offer a formula giving the

shape of the IR curve, knowing the 3 first factors. However, when the market provides more than five

points, NS’s level of precision is often insufficient and modeling requires adding additional parameters.

The strongest empirical result is the fact that the NS factors match the three first orthogonal ones

and subsequently can also be considered as orthogonal. Furthermore, at least for the first three, their

significance is clear: level, slope, and curvature. This opens up real opportunities in term of modeling

and pricing (see below): with 3 orthogonal parameters, one captures 90% to 96% of the degrees of

freedom of the IR curve. By adding additional factors, one can get an even higher level of accuracy.

The factors describing the curve move over times with the evolution of economic reality. The analysis

of the propagation of a shock toward the different segments of the curve provides an indication of its

economic nature. For example, cyclical moves of the short-term yields have low impact in the long-

term. However, an anticipation of strengthening of the monetary policy in reaction to inflationary

tensions will spread also to the long-term yields. Symmetrically, a decrease in the short-term rates by

central banks in order to reshuffle growth to the economy appears often simultaneously with a

decrease in the long-term rates (in this description of effects, we do not try to identify which move

comes from the macro-economic situation and which move comes from the reaction of the central

bank).

Explaining the moves of the curve

Obviously, identifying the economic parameters explaining the moves of the IR curve is one of the most

natural and fundamental question in finance. Unfortunately, it is also one of the most difficult.

The first explanation came at the beginning of the 20th century with Eugen von Böhm-Bawerk, who

observed that the shape of the IR curve depends on the preference of the economic agents between

receiving their cash now or delaying payment against an additional revenue (the interest). So the first

answer was already focusing on the offer and demand for liquidity and yield.

In 1930, Irving Fisher took over the subject and expressed the hypothesis that, in absence of default

risk, investors expect a return covering them against their anticipation of inflation. Depending on the

level of uncertainty of their anticipations, they shall request a risk premium. Assuming that uncertainty

remains stable, real interest rate (that is net of inflation) should be extremely stable. Empirical studies

on “Fisher effect” shows a clear correlation between core inflation (that is out of its two volatile

components: energy and food) and interest rate. However, modeling is difficult, due to the difficulties

of forecasting inflation, its volatility and the modeling of the risk premium expected by the market.

The second parameter impacting interest rate is the evolution of the GDP compared to its long-term

trend: if it speed up, the market expect a period of tension on the economic capacities, so a higher

demand for credit and higher inflation. It will therefore request a higher premium. Inversely, below

the long-term trend, market anticipates a weak economic activity and shall request a lower premium.

In 1958, Phillips also identified a link between inflation and unemployment and made the conclusion

that interest rate should be a tool in order to “pilot” the unemployment rate around its “structural”

level. Since then, numerous studies have detailed and modulated this link, which sometimes

statistically disappears. Indeed, IR level impacts more industrial activities, consuming of credit, than

services. An excessive decrease of the interest rate generates speculative bubbles and inflation of the

value of assets (by opposition to the inflation of consumption’s goods and services).

Other technical parameters are involved in the shaping of the IR curve: refunding rates by the central

bank (even though this is a short-term instrument only), calendars specificities, profitability of IR

against stock equity expected return, characteristics of the economy and social system (demography,

funding of retirements…)

Analyzing the links between the orthogonal factors and these macro-economic parameters is a rather

new area. Globally, one gets:

• The level of the curve (first factor) is linked to the expected inflation,

• The slope (second factor) is linked to the real activity (evolution of the real GDP) and the

monetary policy (central banks increase short-term rates when the factors of production are

heavily used),

• The curvature is linked to the volatility of inflation but tests are less significant.

Interest rate curve modeling

Several models are used mostly by central banks and IR derivatives trading desks, in order to smooth

the curve in a more constant manner than with the methodology of the polynomial Splines (which

generates waves) and, more importantly, in order to use less parameters.

There are roughly two families of models:

• Models with latent factors are based on the absence of arbitrage opportunities (AOA) and see

the IR curve as the consequence of the forward rate. These models, intellectually attractive,

often face difficulties when seeking to represent the observed reality but allow pricing

derivatives by modeling uncertainty around the forward rate. We include in this category of

models the classical Vasicek (1977), Cox-Ingersoll-Ross (1985), HJM (1992), Duffie and Kan

(1996)…

• The descriptive models do not focus on respecting the AOA but just aim at representing the

reality. They are indeed a simple exercise of optimizing the adjustment between a parametric

curve and observed points. Their proponents justify this approach by the fact that in reality,

pure arbitrages are not available. So, modeling reality should result in AOA model. Some

models have then been extended or clarified in a second step in order to respect formally the

AOA.

In this last case of descriptive models, there is a wide range of shapes of curves’ families to be adjusted

to the observed curve or to the instantaneous forward rates, the two being linked by:

( )

( ) ( )

1..

tf s ds t r t

P t e e

r t f s ds

−−



=

with P(t) the price of the zero-coupon of duration t, r(t) the zero-coupon rate and f(t) the anticipated

instantaneous forward rate at time t.

For exemple:

• The linear interpolation (simple and acceptable if there are numerous points and a

regular curve)

• Legendre polynomials (Almeida – 2005)

• Laguerre functions (Nelson-Siegel – 1987 and Nelson-Siegel Svensson)

• Kernel estimations (Fengler – 2007)

• Numerous improved Splines methodologies…

We will focus on Laguerre

functions which are the most widely used and the most elegant to explain.

Not to confuse with polynomials of Laguerre. The term is regularly used. However, the author was not able to

verify the links with Laguerre’s writing.

Model of Nelson-Siegel

The most common model used by central banks requires four factors only and provides an elegant

approximation of the IR curve explaining directly the three degrees of freedom mentioned above. The

model is the following:

( )

r t l s c e







−−

−



−−

= + + −





The curve can take most of the observed shapes: increasing (in contengo), decreasing (in

backwardation), concave, convex, “S” shape…

5,000%

5,500%

6,000%

6,500%

7,000%

7,500%

8,000%

8,500%

9,000%

0,00 5,00 10,00 15,00 20,00 25,00 30,00

estimation : in contengo curve

observed rates est. Rates NS or EPM(n=1) est. Rates EPM(n=2)

Most importantly, it is easy to explain since the three factors l, s, and c correspond exactly with the

three first degrees of freedom:

• l is the level of the curve: the interest rate when t goes to infinity,

• s describes the slope, that is the difference between short-term rates (l+s) and long-term rate

(l),

• c describes the curvature.

5,000%

5,500%

6,000%

6,500%

7,000%

7,500%

8,000%

8,500%

0,00 5,00 10,00 15,00 20,00 25,00 30,00

estimation : U-shape

observed rates est. Rates NS or EPM(n=1) est. Rates EPM(n=2)

One get trivially:

( ) ( )

( )

lim r 0

lim

t f l s

r t l

+

= = +

Instantaneous forward rates are given by a simple formula. Indeed, using the price of a zero-coupon

P(t):

( )

( ) ( )

( ) ( ) ( )

( ) ( ) ( ) ( )

ln . .

1. . . .t.

tf s ds t r t

P t e e

P t f s ds t r t

t r t

Pt f t l s e c e

P t t t





−−



= − = −









−

= = = + +





This last equation gives an interpretation of λ. Indeed, the instantaneous forward rate at time t (seen

at time 0) converge toward the long-term rate at a speed depending on λ.

But this equation can also be seen as a modeling of degree 1 amid a set of exponential polynomial

functions, which is the key element of this article:

( ) ( )

. . . . .

i i t t

n i n n

f t cst a t e cst e P t





−−



= + = +



Finally, Nelson Siegel provides an elegant extension of the notion of modified duration (or Macaulay

duration). Indeed, for an IR asset representing the time set of cash flows CF(tn), one gets:

( )

.t . .

.t . . . .

Nr t t

n n n

Nr t t t

P CF e

dP CF e dr t

dP CF e dl ds e dc







−

−−

=−





−−

= − + + −











• The term in dl expresses the classical modified duration: the first order effect of the translation of the

IR curve in the price,

• The term in ds expresses the sensitivity to a steepening of the curve,

• The term in dc expresses the sensitivity to the curvature.

• The sensitivity to lambda is usually estimated by approximation, so we didn’t include it.

This formula is useful for the estimation of parameters based on prices.

Estimation of parameters with IR points

The estimation is usually made separately for λ and for the other parameters (but we will see below

that the global estimation is not that complex).

• In order to have a first estimate of λ, one can search empirically tmax which maximizes the

curvature that is:

( )

. . 1

c t e

e t t







−−



−

=−





=



= + +

This equation in λ.tmax is solved using an algorithm of approximation (Newton Raphson…) and one gests

the link between tmax and λ:

λ.tmax = 1.793282

The problem is that the observed curve is not c(t) but r(t).

In first approximation, one can estimate approximatively tmax (in general tmax is between 1.5 and 3

years) as the area where the curvature of r(t) is maximal to the line linking short-term and long-term

rates. From there, one get λ

(between 0.6 and 1.2) but this is a rough estimate which isn’t most of the

time sufficient.

• Knowing λ, the three parameters l,s and c are classically estimated using the least squares

methodology. One get:

( )

. . . . .

s X W X X W Y



−







With W the diagonal matrix of weights of the observations, Y the vector of the n observed interest rate

and X the Jacobian matrix:

1..

. . .

1..











−−

−

−−

−



−−

−



=

−−



−





This methodology is mostly used with observed zero-coupons but also sometimes on the in-fine IR. In

this case, the implied zero-coupons do not follow a NS, which may be confusing (and raises the issue

of the definition of the instantaneous forward rate).

Keep in mind that the observed weights in the least square methodology are equivalent to the variance

of the residue (they are positive). One can impose higher weights on longer maturities in proportion

of their modified duration in order to maintain the same level of accuracy in term of prices.

• The calculation of optimum lambda requires rewriting the Lagrangian:

One can also link λ with the speed of convergence of instantaneous forward rates but this method is even less

accurate.

( ) ( ) ( )

( ) ( )

min , min . . . .

0 2. . . 1

0 2. . . . . . 2

L Y X W Y X

LX W Y X

X W Y X

   

   





  

= − −



= = − −





  



= = − + −



  





Equation (1) gives again

( )

. . . . .

X W X X W Y



−

which can be reinjected into (2) in order to get an

equation in lambda λ only.

For doing so, let us express first







as a function of X and θ, which is an inelegant matrix derivation

calculation:

( ) ( )

( ) ( ) ( ) ( )

( ) ( )

. . . . .

. . . . . . . . . .

. . . . . . . . . . .

. . . . . . . . . . . . .

t t t t

t t t

X W X X W Y

X W X X W Y X W X X W Y

X W X X W X X W X X W Y

X X X

X W X W X X W X W X W Y











  

−

−−





=





 









   

= − +

   





  

   

= − + +



   

  

   





With:

( )

.11

122

1 . . . 1

1 . . 1

0..

. . .

1 . . . 1

1 . . 1

0..

tnn

t t e















−



+ + −

+−





=



+ + −

+−





One get an explicit formula for





(definitely not simple) and equation (2) can be solved using a

numerical approximation (c# program is available in ALM-VISION pricing library) with the algorithm of

Lagrange or by dichotomy.

The following example shows results with four points, properly located.

0.1

2.00

3.00

3.50

4.00

The calculation of lambda requires calculating





, the left member of equation (2). Results as a

function of lambda are shown on the graph below, with the value of the Lagrangian (that is the sum of

the square of the residuals).

Optimal parameters equal:

Optimum curve

lambda

0.581765

level

4.375706

slope

-2.429906

curvature

1.10489E-10

The curve shows an excellent fit, which is often the case with low curvature and few observations.

0,010000

0,015000

0,020000

0,025000

0,030000

0,035000

-0,02

-0,015

-0,01

-0,005

0,005

0,01

0,015

0,02

0 0,5 1 1,5 2 2,5 3 3,5

Optimisation of lambda

dL/d(lambda) equation (2) Lagrangian

1,50000

2,00000

2,50000

3,00000

3,50000

4,00000

4,50000

0 5 10 15 20

Example of Nelson-Siegel curve

r(t)

Dynamic Nelson-Siegel

The three parameters θ move in times and the model was extended and used to provide longer term

forecast than using only the forward rates. Indeed, the use of forward rates has the great disadvantage

to get the IR curve converging toward a flat curve at the level of the long-term rates l.

Nelson-Siegel allows building scenarios keeping a realistic curve, by assuming that the three factors

follow autoregressive processes. In reality, one observes indeed autoregressivity (or random process

which is included in the family of autoregressive processes with ρ=1). However, long-term level of each

of these factors is difficult to calibrate. Intuitively, they translate the fact that IR move with the

economic cycle around an average value linked to the structural level of inflation and growth of the

economy. But in our fast changing current economic environment, the duration of cycles and their

structural level of inflation is difficult to estimate.

Dynamic Nelson Siegel AR(1) model was completed with a determinist factor achieving AOA (Diebold

2008, Christensen 1999). The model becomes with our notations:

( ) ( )

r t l s c e

t t t

r t X t









−−

−



−−

= + + − −





=−

with c(t) a determinist function and theta following tri-dimensional Ornstein-Uhlenbeck process:

( )

. . . T

d dT dw

  

=  − + 

with dwT a Brownian vector and matrix

0 0 0









 = −





One shows that (c.f. Diebolt):

( ) ( ) ( )

. 2. .

2 . 2. .

1 2 3 4 5 6 7 8 9

. . . . . . .

ct ee

a t a t a a a e a a t e a a t

t t t



−−

= + + + + + + + +

Indeed, the determinist factor tends toward infinite at infinite. This is different from most classical AOA

models which assume that the rate converges toward a deterministic value at infinite.

Indeed, one great difficulty of I.R. modeling is the fact that AOA implies that long term rates is at infinite

a non decreasing process (Dybvig, Ingersoll and Ross 1996). This very powerful result has

consequences: linear class of models for example can’t allow a random finite long-term interest rate.

We can accept to work with a model providing infinite rate at infinite or we must go toward nonlinear

models. A third option is to make the AOA more flexible, which seems to better match realty.

Another default of AFNS is that λ is determinist. In realty, it isn’t the case.

Generalizing Nelson-Siegel

Still, Nelson-Siegel does not allow modeling each specificity of the IR curve and the model was

completed:

• By using one additional factor of curvature: Nelson-Siegel Svensson (with sometimes s2=0).

( )

1 1 2 2

. . . .

1 1 2 2

1 1 1 1

. . . .

t t t t

e e e e

r t l s c e s c e

t t t t

   



   

− − − −

−−

   

− − − −

= + + − + + −

   

   

• By using n additional factors of curvature: generalized Nelson Siegel.

Both models present several major defaults:

• The orthogonality is no longer respected,

• The models are no longer easy to estimate analytically,

• The implied forward rate evolution becomes complicate:

( )

1 2 1 2

. . . .

1 2 1 1 2 2

. . . .t. . .t.

t t t t

NSS

f t l s e s e c e c e

   



− − − −

= + + + +

• It is anesthetic.

The exponential polynomial model EPM(n)

We propose here another extension of NS using our remark that NS model can be seen as a degree 1

modeling inside the set of exponential polynomial functions describing forward rates:

( ) ( )

, 0 , 0

. . . . .

i i t t

n i n n

f t a a t e a e P t





−−



= + = +



One get subsequently:

( ) ( ) ( )

( )

0 , 0 ,

. . . . . .

. . . . . . .

i n i n i

i O i O

r t f s ds a P s e ds

a a s e ds a a I



−

= = +

= + = +







Since the set of integrals

( )

. . .

tis

I s e ds



−

=

is well known and, by integrating by part, one

gets

( )

.. .

I i I



−

= − +

, so by recurrence for i>0 :

( )

1.1

!. 1 . !



−

=−



=−







and:

( ) ( ) ( )

1. . ! . 1 . .

i n t

r t a a i e X





−



= + − =







We have an elegant generalization of Nelson-Siegel while keeping the previous methodology.

Estimating EPM(n) parameters using observed yields

Estimation remain linear and factors can be interpreted exactly in the same way as for Nelson Siegel,

just we added additional dimensions, fully generalizing the notion of modified duration.

For p observations and a degree n exponential polynomial model, the matrix X(p lines, n+2 columns)

becomes (the matrix shows values for columns 1, 2, i+2 and n+2):

( ) ( )

...

111

...

1!!

1. . 1 . . . 1 .

.. ! . !

.. . . . .

. . . .

1!!

1. . 1 . . . 1 .

.. ! . !

tin

lll

ein

tt j t j

Xtt

ein

tt j t j















−−−

−

   

−−−

   

   

   

=−−−

   

   

−



( ) ( )

. . . .

. . 1 . . . 1 .

.. ! . !

ppp

tt j t j



−−





   



   



−−

   



   





After the two first columns, columns are linked by

( ) ( ) ( )

. 1 . .

X i i X i t e



−−

= − −

The derivative of X against λ is still required for calculating the optimum λ (the formula doesn’t change

in X):

( ) ( ) ( ) ( )

200

. . . ! . . . 1 . 1

X t t

a i e j t







−



= − + −









So that:

( ) ( ) ( ) ( ) ( )

( ) ( ) ( )

...

111

200

1 . . 1 !!

. . . . 1 . 1 . . . . 1 . 1

0. ! . !

.. . . . .

. . . .

1 . . 1 !

. . . . 1 . 1 .

0.!

tin

te in

e j t e j t

t j t j

te ie j t



















−

−−

−

   

+− − + − − + −

   

   



== +− − + −











( ) ( )

( )

( ) ( ) ( ) ( )

.t ..

200

!. . . 1 . 1

. . . .

1 . . 1 !!

. . . . 1 . 1 . . . . 1 . 1

0. ! . !

tll

npp

ne j t

te in

e j t e j t

t j t j















−

−−









− + −







   



+−    



− + − − + −

   



   







After the two first columns, columns are linked by

( ) ( ) ( ) ( )

2.2

. . . . 1 . .

X i X i

i t e i t t





−−

  − 

= − − −





Formula remain the same. Just one need to build the two matrix X and





depending on the number

of factors n:

( )

( ) ( )

( )

. . . . .

0 . . . . . . . . . . . . . . . . . .

t t t

X W X X W Y

X X X X

X X W X W X X W X W X W Y W Y X



  

   

−

−−

=







   

   



= + − + + −



   

   

   









The second equation gives a condition to be respected by λ as previously. Attention, there isn’t a

unique solution to the equation specially for lambda small, one can get several solutions. Therefore,

one usually uses a two steps methodology identifying first the area around which is the minimum

Lagrangian and then using the second equation to find the local minimum.

After having calculated optimal lambda, the system is then a simple least-square with n parameters.

Explaining the EPM(n) model

The modeling is much more elegant than NSS or generalized Nelson-Siegel since:

• It is more intuitive,

• It respects the idea that each axis corresponds to a degree of freedom of the IR curve,

• It allows infinite expansion while keeping analytical expressions.

This type of functions is often observed in mathematical literature: functions of Laguerre, distributions

of Sargan… This is due to the fact that they are very closed to polynomial interpolations.

In terms of forward rates, the model just adds additional perturbation, smaller and farer. Indeed:

( )

, 0 ,

. . .

t i i

n i n

f t a e a t





−



=+ 

Functions

i i t



−

add a perturbation centered on their maximum that is i/λ. Since most of the time,

λ is between 0.2 and 2, these additional perturbations give us flexibility to adjust the curve for the

additional observations we get (usually some deformation to adjust before the 2 or 3 years pivot and

after between 5 and 7 years).

For n equal to the number of observations minus 1, there is a direct linked with the Legendre

polynomials.

Of course, the model assumes a finite interest rate at infinite, even though we will see that this

parameter changes in time. This assumption is a limitation of the model but again, what we observe in

real life is that perpetual rents (traded until the XIXth century) had a finite interest rate. As well 100

years US government bonds don’t show a trend toward an infinite IR at infinite.

Dybvig, Ingersoll and Ross (1996) and in parallel Antoine Frachot and Nicole El Karoui

(1997) published

a very strong result according to which at infinite, IR curve is a non-decreasing process. This very

intuitive result can be translated by the fact that the forward instantaneous IR curve f(t) is also non-

decreasing. In this view, our model expresses only the fact instantaneous forward rates suffer some

perturbations which effect disappear in time. The previous result can be translated by the fact an,n < 0.

It is the case most of the time, since IR curves are never seen in backwardation long term. This gives a

better view of the modeling of the forward rate with pi,n the roots of the polynomial:

( )

. . . nin

n LT n n i

f t a a e t





−



= − −







The forward rate is modeled using a simple adjusted polynomial interpolation. The adjustment in



−

allows to have the perturbations disappearing in time.

Indeed, the model doesn’t give any constraint on the curve of forward short-term rate. The proponents

of the methodology argue that there is no such need since the model just interpolates “ad minima”

the reality. And the reality doesn’t allow for absolute arbitrage, so should the model.

Still, in addition of having an,n < 0, the AOA should have a translation in term of regularity of the curve

of forward rates. Intuitively, one can expect that the interpolation should not add any additional wave

to the forward curve.

Another view of the EPM model is to consider that forward rates are converging toward their long-

term value a0 but, before converging, are facing perturbations which disappear in time. These

perturbations being defined as the difference between the observed forward rate and the long term

one, we model them through polynomial interpolations, using the exponential factor to express their

disappearance in time:

( )

, 0 ,

., 0 ,

. . .

t i i

n i n

t i i

n i n

f t a e a t

e f t a a t



−



−=



In this approach,EPM(n) is the adaptation of the polynomial interpolation to a family of functions

converging toward a long term value.

•

“A note on the Behavior of Long Zero Coupon Rates in a No Arbitrage Framework”, N. El

Karoui, A. Frachot, H. Geman, June 1997

An example

Let take 20 points describing the IR curve and compare results of EPM(n) for n=1 (Nelson-Siegel) to 4.

One get:

Lagrangian

lambda

beta 0

beta 1

beta 2

beta 3

beta 4

beta 5

beta 6

0.002116

1.2874

3.195

-1.105

0.000

0.000905

0.4618

3.406

-1.324

1.804

-1.183

0.000897

0.5994

3.385

-1.299

0.986

-0.125

-0.272

0.000458

1.6035

3.280

-1.281

2.017

-5.638

3.373

-0.534

0.000458

1.6035

3.280

-1.281

2.017

-5.638

3.373

-0.534

0.000

The lagrangian doesn’t move regularly. For example, from n=1 (Nelson-Siegel) to n=2, there is a

significant reduction of the residuals as between n=3 and n=4. It isn’t however the case between n=2

and n=3 or between n=4 and n=5

Even though movements are less brutal than what can be found with a direct polynomial interpolation,

the practitioner need to use its market judgment to appreciate the necessity to integrate small

irregularities which may come from temporary market disequilibrium and bid/ask spread.

This subject is key and not directly relative to the model of interpolation of the curve. As it has been

explained in introduction, it is important to analyze each coefficient and their correlation with classical

orthogonal factors. EPM(1) captures most of the moves (usually above 90%). The addition of one or

two more factors can only be justified by the necessity to translate a specific irregularity of the curve.

The example shows also the difficulty linked to the fact that





isn’t strictly growing and the

estimation must be made around the absolute minimum to converge properly. That means that the

2,00

2,20

2,40

2,60

2,80

3,00

3,20

3,40

0,00 5,00 10,00 15,00 20,00 25,00

Example of EPM interpolation using different degrees

vector rates X.theta n=1 X.theta n=2 X.theta n=3 X.theta n=4

methodology of optimizing will require to avoid local minima even though Lagrange convergence

methodology provides only local minimum.

Finally, it is clear that optimizing λ is key and models with fixed λ will provide very likely a much lower

fit with the real observations.

Estimating EPM(n) parameters using prices or in fine yields

Most of the time, only prices of bonds are observed and not zero-coupon yields. Actually working on

yields directly is possible only for swap markets or when liquid striping of bonds is available.

Let assume we have now p observations of prices Pl of bonds paying at time ti a cash flow CFi. One get:

( ) ( )

( )

X t t

t t t t

act i

CF i

P t CF i e

r t t



−−

−



Which expresses the price of trade l as the actualized value of the future cash flow of the bond using

the zero-coupon curve of parameter θ.

Please notice that we present the actuarial rates in parallel to the continuous rates. Using continuous

rates is more rigorous from a methodological point of view but include one additional level of

complexity to communicate to the market

. The two methods are very closed (identical in first degree,

different in second degree but the estimator does adjust for that difference). Both rates are linked by

the simple equality:

( )

exp

ln 1

actuarial modified duration = - =

exponential modified duration = -

act

r t t act

act act

ZC t e r t

d ZC t t

dr rt

d ZC t t

−

=−

= = +

So one usually performs calculation with exponential rates and then convert results in order to

communicate on an actuarial yield curve.

Whatever, the Lagrangian to minimize becomes:

( ) ( )

min , min .

l l l

L w P P

   

  

=−







This is no longer linear but can be approximate with a linear methodology since the functions of prices

are infinitely derivables (c.f. Gourieroux-Monfort-Trognon-Renault). Notice that we did not do any

assumption on the law of probability of our residuals, since it isn’t required.

We change our notations to:

( )

0... 2 0... 2

2. . .

2. . . .

l l l

i n i n

l l l

i j i j i j

dL L

Z w P P x

dx x x

P P P

H w P P x

x x x x x x





= + = +





=





   



= = = − −

   



   



   

  



= = − −



   



     

   





With Z the gradient of the Lagrangian and H its Hessian.

By linearization, we assume that the second term of the sum is negligible:

This article, like every article published by ALM-Vision, gives the theoretical background to an operational

subject. This explains our pragmatic approach, sometimes away from academic publications.

( )

2. . . . 2. . .

l l l l l

l l l l

i j i j i j i i ij

P P P P P

H w P P x w

x x x x x x x x







 



    



= = − − 



 



       





 







Let us assume we have an estimation (θ0,λ0) near the optimum value, one gets the local minimum by

applying a Newton algorithm:

( ) ( ) ( ) ( )

x n x n H x Z x

−

+ = −

That for, we need to calculate the derivatives of L against xi that is:

( ) ( ) ( ) ( )

( ) ( ) ( )

, 0 0 0 0 10

,n 1 0 0 0 0 0 0 0 0 0

. , . .

, , with 0, =1 and 1, . 1 .

, , . , . . ,

l s s

li t

CF s X i s t t

Z X s X i s e

CF s t



   







        

  



−





= = − + = −



+









= = − +



  







With, in approximation:

( )

( ) ( )

( )

( ) ( )

0 0 0 0 0

, . . . . . . . . . . . . . ,

. . . . . . . . . . . . . .

t t t yield

t t t

X X X

X W X W X X W X W X W Y

X X X

X W X W X X W X W X W X

    

   



  

−−



   

   

 − + +



   

   

   







  

   

 − + +



   

  

   





Where

is the matrix of equivalent weights on the initial rates, that is multiplied by the modified

duration.

For simplification, one can also simply estimate the derivative against lambda around the observed

point. The fact that the derivative is maximal around the optimum allows this approximation.

By bootstrapping, one then get the optimum parameters.

There are two possibilities for starting the bootstrapping:

• If we had an estimation the day before, we can obviously use it since IR curve shift are

almost never above some teens of basis points.

• If not, we can use as starting points, the yield and modified duration of the bonds.

Zero-coupon curves are also usually relatively closed from the in fine yields.

This methodology is obviously more complex than the optimizing on yields directly but the approach

is relatively similar.

Convergence is extremely fast, in 3 to 5 iterations in our test. Still it provides us with a local optimum

and it is important to combine this approach with the one on the zero-coupon rates directly in order

to get a global view on the shape of the curve.

Optimizing using in fine yields

When the data are made of in fine yields, the user can deduct the zero-coupons rate only if all the

information is available, that is the 1y, 2y…, 20y. Unfortunately, most of the time, one gets the

information only on some tenures: 1y, 2y, 3y, 5y, 7y, 10y, 15y, 20y, 30y. The other must be deducted

from these ones.

Actually, the methodology is the same with, for the tenure l:

( ) ( )

( )

100

1 , .

1 , 1 ,

lX t t

t t t t

i t t

act i act l

P t CF i e

r t t r t t



−−

= = + =



Just the price of each observation is equal to 1 and the related cash flows are equal to c until l and 1+c

at time l. In this case, we optimize on the price which implicitely overweights the long duration

observations.

Observed prices and estimated prices using EPM(2) on Euribor swap curve (with derivatives of the price

against each parameter)

Price

month

in fine

weight

est. Price

dP/dlambda

dP/dtheta0

dP/dtheta1

dP/dtheta2

dP/dtheta3

100%

-0,3640%

4,00%

100,00%

0,00

100%

-0,3780%

4,00%

100,00%

0,00

-0,02

0,00

100%

-0,3710%

4,00%

100,01%

0,00

-0,04

0,00

100%

-0,3690%

4,00%

100,01%

0,00

-0,04

0,00

100%

-0,3390%

4,00%

100,03%

0,00

-0,16

0,00

100%

-0,3280%

4,00%

100,04%

0,00

-0,24

0,00

100%

-0,2780%

4,00%

100,07%

0,00

-0,49

-0,48

-0,02

0,00

100%

-0,2600%

4,00%

100,05%

-0,02

-1,00

-0,93

-0,06

-0,01

100%

-0,1100%

4,00%

99,95%

-0,06

-2,00

-1,74

-0,23

-0,04

100%

0,0900%

4,00%

99,92%

-0,13

-3,00

-2,45

-0,48

-0,13

100%

0,3000%

4,00%

99,99%

-0,20

-4,00

-3,06

-0,78

-0,28

100%

0,4800%

4,00%

100,04%

-0,28

-4,98

-3,58

-1,11

-0,48

100%

0,6300%

4,00%

100,05%

-0,37

-5,95

-4,03

-1,45

-0,75

100%

0,7600%

4,00%

100,02%

-0,44

-6,90

-4,41

-1,80

-1,07

100%

0,8800%

4,00%

100,02%

-0,51

-7,83

-4,74

-2,14

-1,43

100%

108

0,9800%

4,00%

99,97%

-0,57

-8,74

-5,01

-2,47

-1,83

100%

120

1,0700%

4,00%

99,94%

-0,61

-9,63

-5,24

-2,78

-2,25

100%

144

1,2300%

8,00%

100,01%

-0,67

-11,35

-5,60

-3,35

-3,13

100%

180

1,3900%

8,00%

100,01%

-0,66

-13,80

-5,96

-4,04

-4,45

100%

240

1,5300%

8,00%

100,00%

-0,49

-17,55

-6,27

-4,82

-6,40

100%

360

1,5800%

8,00%

100,00%

0,04

-24,31

-6,48

-5,55

-8,90

One can also optimize on the coupon by observing that:

( ) ( )

( )

cZC



 



−

=

So that:

( ) ( ) ( ) ( )

( )

( ) ( )

( )

( ) ( ) ( )

0 0 0 0 0 0 0

0 0 0

min . . , . min . , . ,

min . , . ,

l l l

l l l l l l

d d d d

l l l

dd l

dc dc dc

w c c d d w c c d d

d d d

w Y Z d d

   





          

   

    









− − − = − −









=−









We are back again to the previous situation with different Y and Z.

About weights and errors

Weights are proportional to the inverse of the variance of the error and allow for some adjustment in

order to take into account the size of a trade, its significance…

• An important point is that optimization on IR and optimization on prices are implicitly using

different weights. Indeed, optimization on IR gives the same weight to the long term observations than

the short term. This generates a higher error in term of prices for long term bonds since in first

approximation dP = modified duration x dr.

At the opposite, optimization on prices compensate mechanically for the higher sensitivity of long term

bonds.

In order to get a higher level of accuracy for long term IR when optimizing directly on IR, one must

increase the weight of long term bonds in proportion of their modified duration.

• In many illiquid markets, there are few trades per day and the curve can’t be accurately

estimated using only the trades of the day. Subsequently, one uses older observations which weight

must be reduced with their age to go down to 0 after 30 to 90 days depending on the market. If the

observation is recent (a few days) in an illiquid market, it should keep almost all its weight to then

decrease quicker in time: for example,

( )

min

.max 0,

min

.max 0,

k t t

wt k t t

−−

−

=−

( )

1 .1ta

wt a





=−











• Another classical adjustment is made in proportion of the size of the trade.

Another difficult issue is linked to the bid/ask spread and the functioning of the market. In a market

dominated by market makers and specialists, trades are often done between the market maker and

its client, the client being usually buyer on the primary market and rather seller on the secondary one

(because big institutional investors keep their lines as much as possible). Subsequently, observed

trades can include the margin of the market maker. This information is not always available and may

generate unjustified volatility to the curve. Reducing the weight of certain type of trades may reduce

this effect.

One must keep in mind that choosing different weights implies obviously modifying the value of the

estimation, which may have significant theoretical impact in term of implicit model (see below).

Other statistical models versus least square estimation

The previous estimations of the parameters chose to minimize the distance between the observations

and the curve, the classical least square methodology:

( )

min ,

iy f t



−



This pure geometric approach doesn’t make any statistical assumption

, just using the standard

Euclidian distance. It has the great advantage of expressing the intuitive objective of the traders: to

get an analytical formula as “closed” as possible from observed prices (using the standard Euclidian

metrics).

However, this methodology requires to eliminates outliers, which can heavily disturb the process,

exactly in the same way as it is the case for a simple least square linear regression. It also requires to

control that residuals follow a gaussian law (so that the Least Square Estimate be also the Estimator

of Maximum Likelihood) or at least are of same variance and not auto-correlated. Indeed, the least

square methodology provides the maximum likelihood estimation only if the residuals are following a

gaussian law.

Identifying outliers is key but sometimes difficult to perform. A classical statistical way of doing so is to

divide the residuals by their observed standard deviation and spot the normalized residuals larger than

[2] or [2.5] in absolute value. A more financial approach is to eliminate the trades which are obviously

“off-market” compared to the practice of the market: standard bid-ask, market makers quotation

spreads… Trades which weren’t done on the market (OTC trades), buy and sell trades (to generate

accounting results in certain accounting system or to roll a line into a new line), end of year or quarter

sell and buy trades have a higher probability to be done “off-market”.

If one doesn’t want or can eliminate outliers, another criteria, giving less weight to outliers could be

the old method of the sum of the absolute values of residuals (the Least Absolute Deviation or LAD

model

( )

min ,

iy f t



−



Still it doesn’t give us any indication whether the EPM is a proper model or not. It simply gives us the

parameters which minimize the sum of the absolute distance between the cloud of observation and

the EPM curve.

Only if residuals follow a Laplace law is the LAD estimate the maximum likelihood estimate. It is often

the case for heavy tails, asymmetric distribution of residuals or very large sample. Indeed, the author

found standardized residuals following a Laplace distribution in the analysis of 40 000 trades in a low

liquidity market: most of the trades were on or very closed to the estimated market curve with heavy

tail of distribution and a sharp decrease of the number of trades out of the curve (compared to the

Explicitly but implicitly, the least square methodology is the maximum likelihood estimator for gaussian

residuals only. If not, it is still unbiased but may not be the optimum.

Also called L1-regression. For LAD and other alternative models, see Birkes and Dodge “alternative methods of

regressions”, 1993, Wiley-interscience.

For MLE, see “An inductive approach to calculate the MLE for the double exponential distribution”, W. Hurley,

Journal of modern applied statistical methods: JMASM – nov. 2009

rather flat shape of the gaussian standard distribution at the average). In this sample, there was no

significant asymmetry.

Other methods include M-regression or nonparametric-regressions5 which perform well with outliers

but are much heavier to implement.

Actually, as for any least square analysis, if we consider Yi as a random variable, moving from a

geometric approach to a statistical one, both methods are based on the assumption that

( )

i i i

E Y t f t



and so we must control that

( )

E Y f t



−=

The difficulty is that the residuals aren’t necessarily of same variance or not auto-correlated (the issue

is exactly the same than with a least square linear regression which is based on the assumptions that

residuals are centered, not auto-correlated and of constant variance. If they are not, the estimate is

still unbiased but is no longer realizing the minimum variance).

Indeed, the analysis of the residuals may require to take a statistical approach assuming a model for

the residuals:

•

( )

. with 0,1

i t i i i

Y f N

  

= + 

or a Laplace law

•

( ) ( )

. with 0,1

i i i

Y f t e N



 

−

=

or a Laplace law

In both models, σi may vary depending on market parameters: volume of the transaction compared to

similar trades, type of trade (primary, secondary, inter, intra), type of line (new line, old line without

volume)…

The resolution of the optimization requires to write the likelihood.

-6 -4 -2 0 2 4 6

Distribution of standardized errors in a low liquidity market

fobs(x): observed density function

f(x): gaussian standard law

distribution function for comparison

Pricing of derivatives

Obviously, the family of exponential polynomial models allow to build simple multi-dimensional

extensions of Cox-Ross to interest rate while keeping a legitimate theoretical background. Indeed,

assuming λ is stable, the IR curve is fully described by vector θ of orthogonal variables.

Very elegant in theory, this class of model is consuming of calculation capacities since a 1000 step tree

generates 10003=109 final possible positions for a Nelson Siegel (or EPM(1) ). There is no point using a

higher model since pricing of IR long-term derivatives isn’t an exact science and the market prices a

significant bid/ask inside the volatility to cover for the other independent factors.

The model assumes that:

• The three parameters θ1 , θ2 and θ3 between two successive periods t and t+dt can either get

up by a % with probability p under the risk neutral probability, or get down by b %, that is :

( ) ( ) ( )

( ) ( )

. 1 proba.

. 1 proba. 1-

i i i

ii i i

t a p

t dt t b p









+





The major difference with a classical Cox-Ross is that r(t) depend on θ(t) and is no longer “risk

free”. It is just for the period between t and t+dt.

• The “risk free rate between t and t+dt” is the instantaneous short-term rate, it depends on the

current value of interest rate:

( ) ( ) ( )

r t t t



. However, in this case, the probabilities

become path-dependent and don’t allow to build a clean tree so we will take the

approximation used in the market of a constant r, even though we know that is far from realty

and raised serious issues from a methodological point of view.

There is no explicit AOA in the model, still we give the constraint as if each parameter was an asset (in

first approximation, the variation of the price of a zero-coupon is proportional to the variations of the

three parameters θ) and real assets were derivatives on these three notional assets (for example, the

price of the zero-coupon is a derivative of the parameters θ). This local AOA is correct since zero-

coupons are regular infinitely derivable functions. This methodology is rather abstract but one get:

( ) ( )( ) ( ) ( ) ( ) ( ) ( )

( )

( ) ( )

1 . . 1 . 1 . 1

i i i i i i i i

iii

E t dt t a p t b p t r t

r t b

pt ab

   

+ = + + + − = +

−

=−

Locally, at each step, the model allows defining the quantity of θ required to be sure to be able in step

+ 1 to honor its derivative contract which is at the base of Cox-Ross approach. The AOA constraint

simply is no longer fully respected because there is no clear risk-neutral probability.

One can choose the IR over the full period in order to avoid dependence on the steps in case of pricing

in fine (for a zero-coupon, neutrality is respected with the forward instantaneous rates).

Each of the parameters are following a tree and after n steps including j up, one get:

( ) ( ) ( ) ( )

, up 0 . 1 . 1

j n j

i i i i

n j a b



−

= + +

There are independent by construction and the state of the IR curve at time N is defined by the vector

θ after i move up of θ0 , j move up of θ1 and k move up of θ2 under probability:

( )

( ) ( ) ( )

0 1 2 0 1 2

, , , up . . . 1 . 1 . 1

N i N j N k

i j k

p N i j k p p p p p p

− − −

= − − −

By recurrence, one can get the price of any asset paying cash flow at time tN in state (I,j,k up) as a

derivative of the three parameters, which allow for market calibration using swaptions prices:

( ) ( ) ( )

( )

0 1 2 , , 0

0, 0 , 0 , 0 . ,N, , , up . , , , up

ti j k

P t P t i j k p N i j k

  

+

American options are calculated by recurrence, with at each step, 23 possible directions:

( ) ( ) ( )

( )

, , 0

max intrinsec value , . 1, , , up . 1, , , up

1dt i j k

u n n u n i j k p i j k

rt =









One still has the classical link with the volatility of each θ, if one assume that each θ follow a log-normal

process (that is doesn’t change sign):

( )

ln 1

rNa



=



+−

=

+



+=

+



But this formula is an approximation since there is no risk free rate over the full period anymore (which

is the realty of the market).

The formula allows to price prepayment options on loans.

Let the value of a loan:

( )

,1. i

ttt

tt t t t

Vt X



−

−

=++



Let us define a statistical rule for exercising the option at time t, that is the percentage of client

exercising function of the IR: q(t, θ). It can be a function of the difference of interest rate with the initial

rate or the difference of marked to market as decided by the bank.

( ) ( )

( )

, , 0

1 q ,

max , 1 .q , , . 1, , , up . 1, , , up

nn dt i j k

u n V t n t n u n i j k p i j k







−







= − +









Conclusion

The Nelson-Siegel model is fascinating because it provides a simple representation of the IR curve with

orthogonal factors easily macro-economically interpretable. It can be generalized in EPM, a powerful

family of model.

However, the operational implementation requires a rigorous methodology since classical

optimization methodologies provide only local optimum.

In addition, this class of models doesn’t respect the AOA. Scholars proposed an analytical modification

of the model in order to get AOA but these solutions generate infinite long term rates. Another

promising direction would be to soften AOA and integrate in the model that there is no risk free rate

on the market. Under these assumptions, EPM provides a powerful and simple tool for pricing

derivatives.

Finally, the study of the coefficients in time is opening also interesting fields of research for asset

management.

Bibliography

• BIS papers n°25 “zero-coupon yield curves estimated by central banks”

• European Central Bank working papers serie n°917, juillet 2008 “modeling and forcasting the

yield curve under model uncertainty”

• European Central Bank working papers serie n°874, février 2008 “how arbitrage free is the

Nelson Siegel model”

• “Yield curve modeling and forecasting: the dynamic Nelson Siegel approach”, Francis X.

Diebold & Glenn D. Rudebush, 29 avril 2012

• Annales d’économie et de statistique n°8 – 1987 “Les méthodes du pseudo-maximum de

vraisemblance”, Alain Trognon

• Méthode des moindres carrés généralisés, Jean Debord, avril 2003

• COMISEF working papers series WPS-031 30/03/2010 “calibrating the Nelson-Siegel Model”,

M. Gilli, S. Gross, E. Schumann

• “A note on the Behavior of Long Zero Coupon Rates in a No Arbitrage Framework”, N. El

Karoui, A. Frachot, H. Geman, June 1997

• Function Minimization, proceedings of the 1972 CERN computing and data processing school,

Pertisau, Austria, 10-24 September, 1972 (CERN 72-21) – page 31 mainly

• Alternative methods of regression, David Birkes & Yadolah Dodge, Wiley Interscience 1993.

• “An inductive approach to calculate the MLE for the double exponential distribution”, W.

Hurley, Journal of modern applied statistical methods: JMASM – nov. 2009

About ALM-VISION

ALM-Vision is a quantitative modeling company founded in 2011. Its mission is to provide quantitative analysis

and scientific support to financial institutions.

The core of our business activity is Asset Liability Management (ALM) modeling. Our modeling tool ALM-

Solutions® is proprietary software developed internally by our team for highly precise state of the art modeling

of banking assets and liabilities to monitor the financial institutions’ interest rate, credit and liquidity risk and to

understand the impact of a variety of economic scenarios on the balance sheet and income statements, including

stress testing. We have also high quality pricing capacities for complex structured financial products.

In addition to ALM modeling, ALM-Vision provides advisory services to financial institutions and is called in to

intervene on technical matters that require high pricing capacity and substantial and extensive experience in the

financial markets (CVA, FVA, deal structuring, ABS, NBT, inflation-linked products, commodity derivatives and

modeling, credit restructuring…).

Most bank ALM and/or risk teams are left alone to handle the new regulatory environment. With the current

difficult environment for the financial industry, both human and technological resources are scarce and the

teams have neither the time nor the capacity to develop the scientific part of their job. We provide our customers

with this technical support and act as a bridge for best practices between our customers. Indeed, each customer

brings us new needs, new issues, new requirements which reinforce our expertise. Our rule is to systematically

share new non-client specific developments as a way to diffuse best practices around the industry. We strongly

believe that our success is based on the fact that we are not a simple IT provider but a true scientific support

team, with strong financial expertise assisting our customers in the whole modeling and analysis of their balance

sheet.

In ALM, software is just the tool. The core of the added value is the modeling and the analysis. Leveraging on our

strong financial and market experience, we help our clients focus on this core in the most efficient way.

Disclaimer

The contents of this document are proprietary to ALM-VISION. This document is produced by ALM-

VISION for institutional investors only and is not financial research. This document is for information

purposes only and is neither an offer to buy or sell, nor a solicitation or a recommendation to buy or

sell, securities or any other product. ALM-VISION makes no representation as to the accounting, tax,

regulatory or other treatment of the structure of any transaction and/or strategy described in this

document and the recipient should perform its own investigation and analysis of the operations and

the risk factors involved before determining whether such transaction is one which it is proper and

appropriate for it to enter into. Some information contained in this document may have been received

from third party or publicly available sources that we believe to be reliable. We have not verified any

such information and assume no responsibility for the accuracy or completeness thereof. This

document may also include details of historic performance levels of various rates, benchmarks or

indices. Past performance is not indicative of future results.

reserved. No information in this document may be reproduced or distributed in whole

or in part without the express written prior consent of ALM-VISION, except for personal

use.

ResearchGate has not been able to resolve any citations for this publication.

A Note on the Behavior of Long Zero Coupon Rates in a No Arbitrage Framework.

Article

Full-text available

Feb 1996

This paper investigates the behavior of long zero-coupon rates and its consequences for usual arbitrage models of the term structure.

Yield Curve Modeling and Forecasting: The Dynamic Nelson-Siegel Approach

Book

Jan 2013

Understanding the dynamic evolution of the yield curve is critical to many financial tasks, including pricing financial assets and their derivatives, managing financial risk, allocating portfolios, structuring fiscal debt, conducting monetary policy, and valuing capital goods. Unfortunately, most yield curve models tend to be theoretically rigorous but empirically disappointing, or empirically successful but theoretically lacking. In this book, Francis Diebold and Glenn Rudebusch propose two extensions of the classic yield curve model of Nelson and Siegel that are both theoretically rigorous and empirically successful. The first extension is the dynamic Nelson-Siegel model (DNS), while the second takes this dynamic version and makes it arbitrage-free (AFNS). Diebold and Rudebusch show how these two models are just slightly different implementations of a single unified approach to dynamic yield curve modeling and forecasting. They emphasize both descriptive and efficient-markets aspects, they pay special attention to the links between the yield curve and macroeconomic fundamentals, and they show why DNS and AFNS are likely to remain of lasting appeal even as alternative arbitrage-free models are developed. Based on the Econometric and Tinbergen Institutes Lectures, Yield Curve Modeling and Forecasting contains essential tools with enhanced utility for academics, central banks, governments, and industry.

• COMISEF working papers series WPS-031 30/03/2010 "calibrating the Nelson-Siegel Model

M Gilli
S Gross
E Schumann

• COMISEF working papers series WPS-031 30/03/2010 "calibrating the Nelson-Siegel Model", M. Gilli, S. Gross, E. Schumann

An inductive approach to calculate the MLE for the double exponential distribution

Jan 2009
J Mod Appl Stat Meth

• "An inductive approach to calculate the MLE for the double exponential distribution", W. Hurley, Journal of modern applied statistical methods: JMASM -nov. 2009

Modeling the IR curve: the Exponential Polynomial Model ("EPM"), the true extension of Nielson-Siegel

Abstract

Recommended publications

Modeling of pricing processes: parameter estimation and forecast

Modelling credit default swap prices

Modelling of IPO Underpricing in Bangladesh

MULTIPLICATIVE DYNAMIC-ADJUSTMENT MODEL OF SALES RESPONSE TO MARKETING-MIX VARIABLES.