Modeling the IR curve: the Exponential Polynomial Model ("EPM"), the true extension of Nelson-Siegel

Authors:
  • ALM-Vision

Abstract

The paper presents a new model called the "Exponential Polynomial Model". EPM is an extension of the famous Nelson-Siegel model and an alternative to the Nelson-Siegel-Svensson model (NSS). NSS raises a number of issues, being difficult to estimate and explain, with collinearity between factors. EPM is conditionally linear, more elegant and easier to interpret. Furthermore, the model makes it easy to assess the improvement provided by additional parameters. Finally, the author discusses some points on the absence of arbitrage opportunities (AOA) and multi-dimensional binomial pricing.
01/03/2018
Modeling the interest rate curve
The exponential polynomial model:
the true extension of Nelson-Siegel
Serge MOULIN
ALM-VISION
Table of contents
Abstract
Degrees of freedom of the interest rate curve
Explaining the moves of the curve
Interest rate curve modeling
    Model of Nelson-Siegel
    Estimation of parameters with IR points
    Dynamic Nelson-Siegel
    Generalizing Nelson-Siegel
    The exponential polynomial model EPM(n)
    Estimating EPM(n) parameters using observed yields
    Explaining the EPM(n) model
        An example
    Estimating EPM(n) parameters using prices or in fine yields
        Optimizing using in fine yields
    About weights and errors
    Other statistical models versus least square estimation
Pricing of derivatives
Conclusion
Bibliography
About ALM-VISION
Disclaimer
Abstract
Most central banks provide models of the interest rate curve based on the spline methodology or on the Nelson-Siegel model and its extensions, Nelson-Siegel-Svensson and generalized Nelson-Siegel. These last two models raise calibration issues, being far from linear.
After recalling some key facts about the IR curve, we present here a new methodology which proves easier and more elegant in terms of the optimum, the choice of the number of parameters and their interpretation. Furthermore, the model includes the Nelson-Siegel approach as a specific case. We call it "the exponential polynomial model" or "EPM(n)", n being the degree of the model; Nelson-Siegel is an EPM(1). Estimation is also easier, allowing one to choose the required number of parameters.
We finally give an introduction to derivatives pricing using this new methodology, a simple and elegant way of extending Cox-Ross which nevertheless raises issues in terms of the no-arbitrage (AOA) assumption.
Degrees of freedom of the interest rate curve
Factorial analysis of interest rate curves generates orthogonal factors explaining its moves. The first three factors usually explain between 90% and 96% of the moves. They are:
- the general level of the curve, describing the translation effects; it explains about 50% of the moves;
- the slope (difference between short-term and long-term rates), describing the rotation effects around a pivot between 2 and 3 years (the usual limit between the short-term and long-term markets); it explains around 30% of the moves;
- the curvature, defining the shape of the slope, explaining around 10% of the moves;
- additional effects specifying the shape of the curvature.
The Nelson-Siegel ("NS") model focuses on the first three factors and offers a formula giving the shape of the IR curve from them. However, when the market provides more than five points, NS's level of precision is often insufficient and modeling requires additional parameters.
The strongest empirical result is that the NS factors match the first three orthogonal factors and can subsequently also be considered orthogonal. Furthermore, at least for the first three, their significance is clear: level, slope and curvature. This opens up real opportunities in terms of modeling and pricing (see below): with 3 orthogonal parameters, one captures 90% to 96% of the degrees of freedom of the IR curve. By adding further factors, one can reach an even higher level of accuracy.
The factors describing the curve move over time with the evolution of economic reality. The analysis of the propagation of a shock across the different segments of the curve provides an indication of its economic nature. For example, cyclical moves of short-term yields have little impact on the long term. However, an anticipated tightening of monetary policy in reaction to inflationary tensions will also spread to long-term yields. Symmetrically, a decrease in short-term rates by central banks in order to restore growth often appears simultaneously with a decrease in long-term rates (in this description of effects, we do not try to identify which move comes from the macro-economic situation and which comes from the reaction of the central bank).
Explaining the moves of the curve
Obviously, identifying the economic parameters explaining the moves of the IR curve is one of the most natural and fundamental questions in finance. Unfortunately, it is also one of the most difficult.
The first explanation came at the beginning of the 20th century with Eugen von Böhm-Bawerk, who observed that the shape of the IR curve depends on the preference of economic agents between receiving their cash now and delaying payment in exchange for an additional revenue (the interest). So the first answer already focused on the supply of and demand for liquidity and yield.
In 1930, Irving Fisher took up the subject and expressed the hypothesis that, in the absence of default risk, investors expect a return covering them against their anticipation of inflation. Depending on the level of uncertainty of their anticipations, they will request a risk premium. Assuming that this uncertainty remains stable, the real interest rate (that is, net of inflation) should be extremely stable. Empirical studies of the "Fisher effect" show a clear correlation between core inflation (that is, excluding its two volatile components, energy and food) and interest rates. However, modeling is difficult, owing to the difficulty of forecasting inflation and its volatility and of modeling the risk premium expected by the market.
The second parameter impacting interest rates is the evolution of GDP compared with its long-term trend: if GDP speeds up, the market expects a period of tension on economic capacities, hence a higher demand for credit and higher inflation, and will therefore request a higher premium. Conversely, below the long-term trend, the market anticipates weak economic activity and will request a lower premium.
In 1958, Phillips also identified a link between inflation and unemployment and concluded that the interest rate should be a tool to "pilot" the unemployment rate around its "structural" level. Since then, numerous studies have detailed and qualified this link, which sometimes disappears statistically. Indeed, the IR level impacts industrial activities, which consume credit, more than services. An excessive decrease of interest rates generates speculative bubbles and inflation of asset values (as opposed to inflation of consumption goods and services).
Other technical parameters are involved in the shaping of the IR curve: the refinancing rates of the central bank (even though this is a short-term instrument only), calendar specificities, the profitability of IR assets against the expected return on equities, and the characteristics of the economy and the social system (demography, funding of retirement...).
Analyzing the links between the orthogonal factors and these macro-economic parameters is a rather new area. Globally, one gets:
- the level of the curve (first factor) is linked to expected inflation;
- the slope (second factor) is linked to real activity (evolution of real GDP) and to monetary policy (central banks increase short-term rates when the factors of production are heavily used);
- the curvature is linked to the volatility of inflation, but tests are less significant.
Interest rate curve modeling
Several models are used, mostly by central banks and IR derivatives trading desks, in order to smooth the curve in a more consistent manner than with the polynomial spline methodology (which generates waves) and, more importantly, in order to use fewer parameters.
There are roughly two families of models:
- Models with latent factors are based on the absence of arbitrage opportunities (AOA) and see the IR curve as the consequence of the forward rate. These models, intellectually attractive, often face difficulties when seeking to represent the observed reality, but they allow pricing derivatives by modeling the uncertainty around the forward rate. We include in this category the classical models of Vasicek (1977), Cox-Ingersoll-Ross (1985), HJM (1992), Duffie and Kan (1996)...
- Descriptive models do not focus on respecting the AOA but simply aim at representing reality. They are indeed a simple exercise of optimizing the adjustment between a parametric curve and observed points. Their proponents justify this approach by the fact that, in reality, pure arbitrages are not available; so a model fitting reality should, in effect, be an AOA model. Some of these models have later been extended or clarified so as to respect the AOA formally.
In this last case of descriptive models, there is a wide range of families of curve shapes that can be adjusted to the observed curve or to the instantaneous forward rates, the two being linked by:
P(t) = e^{-\int_0^t f(s)\,ds} = e^{-t\,r(t)}, \qquad r(t) = \frac{1}{t}\int_0^t f(s)\,ds
with P(t) the price of the zero-coupon of duration t, r(t) the zero-coupon rate and f(t) the anticipated
instantaneous forward rate at time t.
For example:
- linear interpolation (simple and acceptable if there are numerous points and a regular curve);
- Legendre polynomials (Almeida 2005);
- Laguerre functions (Nelson-Siegel 1987 and Nelson-Siegel-Svensson);
- kernel estimations (Fengler 2007);
- numerous improved spline methodologies.
We will focus on Laguerre functions, which are the most widely used and the most elegant to explain.
(Not to be confused with Laguerre polynomials. The term is regularly used; however, the author was not able to verify the link with Laguerre's writings.)
Model of Nelson-Siegel
The most common model used by central banks requires only four factors and provides an elegant approximation of the IR curve, directly explaining the three degrees of freedom mentioned above. The model is the following:

r(t) = l + s\,\frac{1-e^{-\lambda t}}{\lambda t} + c\left(\frac{1-e^{-\lambda t}}{\lambda t} - e^{-\lambda t}\right)
The curve can take most of the observed shapes: increasing (in contango), decreasing (in backwardation), concave, convex, "S"-shaped...
[Figure: estimation of an in contango curve, observed rates versus estimated rates for NS / EPM(n=1) and EPM(n=2), maturities 0 to 30 years.]
Most importantly, it is easy to explain, since the three factors l, s and c correspond exactly to the first three degrees of freedom:
- l is the level of the curve: the interest rate when t goes to infinity;
- s describes the slope, that is the difference between the short-term rate (l+s) and the long-term rate (l);
- c describes the curvature.
[Figure: estimation of a U-shaped curve, observed rates versus estimated rates for NS / EPM(n=1) and EPM(n=2), maturities 0 to 30 years.]
One gets trivially:

\lim_{t \to 0} r(t) = f(0) = l + s, \qquad \lim_{t \to +\infty} r(t) = l
Instantaneous forward rates are given by a simple formula. Indeed, using the price of a zero-coupon
P(t):
P(t) = e^{-\int_0^t f(s)\,ds} = e^{-t\,r(t)}

\ln P(t) = -\int_0^t f(s)\,ds = -t\,r(t)

f(t) = -\frac{P'(t)}{P(t)} = \frac{\partial\big(t\,r(t)\big)}{\partial t} = l + s\,e^{-\lambda t} + c\,\lambda\,t\,e^{-\lambda t}
This last equation gives an interpretation of λ: the instantaneous forward rate at time t (seen at time 0) converges toward the long-term rate at a speed depending on λ.
But this equation can also be seen as a model of degree 1 within the family of exponential polynomial functions, which is the key element of this article:

f_n(t) = cst + \sum_{i \le n} a_{i,n}\, t^i\, e^{-\lambda t} = cst + e^{-\lambda t}\, P_n(t)
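To make the degree-1 statement explicit, the following identification (a restatement of the formulas above, using the a_{i,n} notation introduced later for EPM) shows how the Nelson-Siegel forward rate fits this family:

f_1(t) = a_0 + \big(a_{0,1} + a_{1,1}\,t\big)\,e^{-\lambda t} = l + s\,e^{-\lambda t} + c\,\lambda\,t\,e^{-\lambda t}
\quad\text{with}\quad a_0 = l,\;\; a_{0,1} = s,\;\; a_{1,1} = c\,\lambda.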
Finally, Nelson-Siegel provides an elegant extension of the notion of modified duration (or Macaulay duration). Indeed, for an IR asset paying the set of cash flows CF(t_n), one gets:

P = \sum_{n=1}^{N} CF(t_n)\, e^{-r(t_n)\,t_n}

dP = -\sum_{n=1}^{N} CF(t_n)\, t_n\, e^{-r(t_n)\,t_n}\, dr(t_n)

dP = -\sum_{n=1}^{N} CF(t_n)\, t_n\, e^{-r(t_n)\,t_n}\left[\, dl + \frac{1-e^{-\lambda t_n}}{\lambda t_n}\, ds + \left(\frac{1-e^{-\lambda t_n}}{\lambda t_n} - e^{-\lambda t_n}\right) dc \,\right]
- The term in dl expresses the classical modified duration: the first-order effect on the price of a translation of the IR curve.
- The term in ds expresses the sensitivity to a steepening of the curve.
- The term in dc expresses the sensitivity to the curvature.
The sensitivity to λ is usually estimated by approximation, so we do not include it here.
This formula is useful for estimating the parameters from prices.
Estimation of parameters with IR points
The estimation is usually made separately for λ and for the other parameters (but we will see below that the global estimation is not that complex).
In order to get a first estimate of λ, one can search empirically for the t_max which maximizes the curvature term, that is:

c(t) = \frac{1-e^{-\lambda t}}{\lambda t} - e^{-\lambda t}

\frac{\partial c}{\partial t} = 0 \iff e^{\lambda t} = 1 + \lambda t + (\lambda t)^2
This equation in λ·t_max is solved using an approximation algorithm (Newton-Raphson...) and one gets the link between t_max and λ:
λ·t_max = 1.793282
The problem is that the observed curve is not c(t) but r(t).
As a first approximation, one can estimate t_max (in general between 1.5 and 3 years) as the area where the curvature of r(t) relative to the line linking short-term and long-term rates is maximal. From there, one gets λ (between 0.6 and 1.2), but this is a rough estimate which most of the time is not sufficient.
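As a quick numerical illustration of the fixed point above, the minimal Python sketch below (our own helper, not the paper's C# library) solves e^x = 1 + x + x² by Newton-Raphson and recovers λ·t_max ≈ 1.7933; the 2.5-year pivot used to deduce λ is an illustrative assumption.

import math

def solve_lambda_tmax(x0=2.0, tol=1e-12, max_iter=50):
    """Solve g(x) = exp(x) - 1 - x - x^2 = 0 by Newton-Raphson, with x = lambda * t_max."""
    x = x0
    for _ in range(max_iter):
        g = math.exp(x) - 1.0 - x - x * x
        dg = math.exp(x) - 1.0 - 2.0 * x
        step = g / dg
        x -= step
        if abs(step) < tol:
            break
    return x

x_star = solve_lambda_tmax()      # ~1.793282
t_max = 2.5                       # assumed pivot maturity in years (illustrative only)
lam = x_star / t_max              # rough estimate of lambda
print(round(x_star, 6), round(lam, 4))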
Knowing λ, the three parameters l, s and c are classically estimated using the least-squares methodology. One gets:

\theta = \begin{pmatrix} l \\ s \\ c \end{pmatrix} = \big(X^T W X\big)^{-1} X^T W\, Y

with W the diagonal matrix of observation weights, Y the vector of the n observed interest rates and X the Jacobian matrix:
X = \begin{pmatrix}
1 & \dfrac{1-e^{-\lambda t_1}}{\lambda t_1} & \dfrac{1-e^{-\lambda t_1}}{\lambda t_1} - e^{-\lambda t_1} \\
\vdots & \vdots & \vdots \\
1 & \dfrac{1-e^{-\lambda t_n}}{\lambda t_n} & \dfrac{1-e^{-\lambda t_n}}{\lambda t_n} - e^{-\lambda t_n}
\end{pmatrix}
This methodology is mostly applied to observed zero-coupon rates but also sometimes to in-fine rates. In this case, the implied zero-coupons do not follow an NS curve, which may be confusing (and raises the issue of the definition of the instantaneous forward rate).
Keep in mind that the weights in the least-squares methodology are inversely proportional to the variance of the residuals (and are positive). One can impose higher weights on longer maturities, in proportion to their modified duration, in order to maintain the same level of accuracy in terms of prices.
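For a fixed λ, the closed-form weighted least-squares step above is straightforward to implement. The following is a minimal Python/NumPy sketch (the helper names are ours, not the paper's C# library), building the Nelson-Siegel design matrix and solving for θ = (l, s, c):

import numpy as np

def ns_design_matrix(t, lam):
    """Nelson-Siegel regressors [1, slope loading, curvature loading] for maturities t."""
    t = np.asarray(t, dtype=float)
    slope = (1.0 - np.exp(-lam * t)) / (lam * t)
    curv = slope - np.exp(-lam * t)
    return np.column_stack([np.ones_like(t), slope, curv])

def fit_ns_given_lambda(t, y, lam, weights=None):
    """Weighted least squares: theta = (X'WX)^-1 X'WY."""
    X = ns_design_matrix(t, lam)
    W = np.diag(weights if weights is not None else np.ones(len(t)))
    theta = np.linalg.solve(X.T @ W @ X, X.T @ W @ np.asarray(y, dtype=float))
    return theta  # [level l, slope s, curvature c]

# illustrative use with hypothetical observations
t_obs = [0.25, 1, 2, 5, 10, 30]
y_obs = [0.010, 0.012, 0.015, 0.022, 0.028, 0.032]
print(fit_ns_given_lambda(t_obs, y_obs, lam=0.6))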
The calculation of the optimal λ requires rewriting the Lagrangian (one can also link λ to the speed of convergence of the instantaneous forward rates, but this method is even less accurate):
\min_{\theta,\lambda} L(\theta,\lambda) = \min_{\theta,\lambda}\; (Y - X\theta)^T\, W\, (Y - X\theta)

\frac{\partial L}{\partial \theta} = 0 = -2\, X^T W\, (Y - X\theta) \qquad (1)

\frac{\partial L}{\partial \lambda} = 0 = -2\, \left(\frac{\partial X}{\partial\lambda}\,\theta\right)^T W\, (Y - X\theta) \qquad (2)

Equation (1) gives again \theta = (X^T W X)^{-1} X^T W Y, which can be reinjected into (2) in order to get an equation in λ only.
For doing so, let us first express \partial\theta/\partial\lambda as a function of X and θ, an inelegant piece of matrix differentiation:

\theta = (X^T W X)^{-1} X^T W\, Y

\frac{\partial\theta}{\partial\lambda} = -(X^T W X)^{-1}\left[\left(\frac{\partial X}{\partial\lambda}\right)^T W X + X^T W \frac{\partial X}{\partial\lambda}\right](X^T W X)^{-1} X^T W\, Y + (X^T W X)^{-1}\left(\frac{\partial X}{\partial\lambda}\right)^T W\, Y

= (X^T W X)^{-1}\left[\left(\frac{\partial X}{\partial\lambda}\right)^T W\, Y - \left(\left(\frac{\partial X}{\partial\lambda}\right)^T W X + X^T W \frac{\partial X}{\partial\lambda}\right)\theta\right]
with:

\frac{\partial X}{\partial\lambda} = \begin{pmatrix}
0 & \dfrac{(1+\lambda t_1)\,e^{-\lambda t_1} - 1}{\lambda^2 t_1} & \dfrac{(1+\lambda t_1)\,e^{-\lambda t_1} - 1}{\lambda^2 t_1} + t_1\, e^{-\lambda t_1} \\
\vdots & \vdots & \vdots \\
0 & \dfrac{(1+\lambda t_n)\,e^{-\lambda t_n} - 1}{\lambda^2 t_n} & \dfrac{(1+\lambda t_n)\,e^{-\lambda t_n} - 1}{\lambda^2 t_n} + t_n\, e^{-\lambda t_n}
\end{pmatrix}
One gets an explicit formula for ∂L/∂λ (definitely not simple) and equation (2) can be solved using a numerical approximation (a C# program is available in the ALM-VISION pricing library), with the algorithm of Lagrange or by dichotomy.
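In practice one can also profile the Lagrangian on λ (θ being given in closed form at each λ) and locate the root of dL/dλ by dichotomy. The sketch below is a minimal Python illustration, reusing the fit_ns_given_lambda and ns_design_matrix helpers defined earlier and replacing the analytical gradient by a finite difference (an assumption of ours, not the paper's exact algorithm):

import numpy as np

def profiled_lagrangian(t, y, lam, weights=None):
    """Sum of squared residuals with theta at its closed-form optimum for this lambda."""
    theta = fit_ns_given_lambda(t, y, lam, weights)
    resid = np.asarray(y, float) - ns_design_matrix(t, lam) @ theta
    w = np.ones(len(t)) if weights is None else np.asarray(weights, float)
    return float(resid @ (w * resid))

def optimize_lambda(t, y, lam_lo=0.05, lam_hi=3.0, n_grid=60, tol=1e-8):
    """Grid scan to bracket the global minimum, then dichotomy on the numerical dL/dlambda."""
    grid = np.linspace(lam_lo, lam_hi, n_grid)
    best = grid[np.argmin([profiled_lagrangian(t, y, g) for g in grid])]
    lo, hi = max(lam_lo, best - 0.1), min(lam_hi, best + 0.1)
    dL = lambda lam, h=1e-6: (profiled_lagrangian(t, y, lam + h)
                              - profiled_lagrangian(t, y, lam - h)) / (2 * h)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dL(lo) * dL(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# the four observed points of the example that follows (rates quoted in percent)
t_obs, y_obs = [0.1, 2, 5, 10], [2.00, 3.00, 3.50, 4.00]
lam_star = optimize_lambda(t_obs, y_obs)
print(lam_star, fit_ns_given_lambda(t_obs, y_obs, lam_star))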
The following example shows results with four points, properly located:

t      y
0.1    2.00
2      3.00
5      3.50
10     4.00
The calculation of λ requires computing ∂L/∂λ, the left-hand member of equation (2). Results as a function of λ are shown in the graph below, together with the value of the Lagrangian (that is, the sum of the squared residuals).
Optimal parameters equal:

lambda       0.581765
level        4.375706
slope       -2.429906
curvature    1.10489E-10

The curve shows an excellent fit, which is often the case with low curvature and few observations.
[Figure: optimisation of lambda, showing dL/d(lambda) from equation (2) and the Lagrangian as functions of lambda.]
[Figure: example of a Nelson-Siegel curve r(t), maturities 0 to 20 years.]
Dynamic Nelson-Siegel
The three parameters θ move over time, and the model has been extended and used to provide longer-term forecasts than those obtained from forward rates alone. Indeed, using forward rates has the great disadvantage of making the IR curve converge toward a flat curve at the level of the long-term rate l.
Nelson-Siegel allows building scenarios that keep a realistic curve, by assuming that the three factors follow autoregressive processes. In reality, one does observe autoregressivity (or a random walk, which belongs to the family of autoregressive processes with ρ = 1). However, the long-term level of each of these factors is difficult to calibrate. Intuitively, they translate the fact that interest rates move with the economic cycle around an average value linked to the structural level of inflation and growth of the economy. But in our fast-changing economic environment, the duration of cycles and their structural level of inflation are difficult to estimate.
The dynamic Nelson-Siegel AR(1) model was completed with a deterministic factor achieving AOA (Diebold 2008, Christensen 1999). With our notation, the model becomes:
r(t) = l_t + s_t\,\frac{1-e^{-\lambda t}}{\lambda t} + c_t\left(\frac{1-e^{-\lambda t}}{\lambda t} - e^{-\lambda t}\right) - \frac{c(t)}{t} = X(t)\,\theta_t - \frac{c(t)}{t}

with c(t) a deterministic function and θ following a three-dimensional Ornstein-Uhlenbeck process

d\theta_T = \Phi\,(\theta_0 - \theta_T)\,dT + \Sigma\, dw_T

with dw_T a Brownian vector and the mean-reversion matrix

\Phi = \begin{pmatrix} 0 & 0 & 0 \\ 0 & \lambda & -\lambda \\ 0 & 0 & \lambda \end{pmatrix}.

One shows that (cf. Diebold):

\frac{c(t)}{t} = a_1\, t^2 + a_2\, t + a_3 + a_4\,\frac{1-e^{-\lambda t}}{\lambda t} + a_5\, e^{-\lambda t} + a_6\, t\, e^{-\lambda t} + a_7\,\frac{1-e^{-2\lambda t}}{\lambda t} + a_8\, e^{-2\lambda t} + a_9\, t\, e^{-2\lambda t}
Indeed, this deterministic factor tends to infinity as t tends to infinity. This differs from most classical AOA models, which assume that the rate converges toward a deterministic value at infinity.
Indeed, one great difficulty of IR modeling is the fact that AOA implies that the long-term rate is asymptotically a non-decreasing process (Dybvig, Ingersoll and Ross 1996). This very powerful result has consequences: linear classes of models, for example, cannot allow a random finite long-term interest rate. We can either accept working with a model providing an infinite rate at infinity, or move toward nonlinear models. A third option is to make the AOA more flexible, which seems to better match reality.
Another drawback of AFNS is that λ is deterministic; in reality, this is not the case.
Generalizing Nelson-Siegel
Still, Nelson-Siegel does not allow modeling every specificity of the IR curve, and the model has been extended:
- by using one additional factor of curvature: Nelson-Siegel-Svensson (sometimes with s_2 = 0),

r(t) = l + s_1\,\frac{1-e^{-\lambda_1 t}}{\lambda_1 t} + c_1\left(\frac{1-e^{-\lambda_1 t}}{\lambda_1 t} - e^{-\lambda_1 t}\right) + s_2\,\frac{1-e^{-\lambda_2 t}}{\lambda_2 t} + c_2\left(\frac{1-e^{-\lambda_2 t}}{\lambda_2 t} - e^{-\lambda_2 t}\right)

- by using n additional factors of curvature: generalized Nelson-Siegel.
Both models present several major defects:
- orthogonality is no longer respected;
- the models are no longer easy to estimate analytically;
- the implied forward rate evolution becomes complicated:

f_{NSS}(t) = l + s_1\, e^{-\lambda_1 t} + s_2\, e^{-\lambda_2 t} + c_1\,\lambda_1\, t\, e^{-\lambda_1 t} + c_2\,\lambda_2\, t\, e^{-\lambda_2 t}

- it is unaesthetic.
The exponential polynomial model EPM(n)
We propose here another extension of NS, based on our remark that the NS model can be seen as a degree-1 model within the family of exponential polynomial functions describing forward rates:

f_n(t) = a_0 + \sum_{i \le n} a_{i,n}\, t^i\, e^{-\lambda t} = a_0 + e^{-\lambda t}\, P_n(t)
One subsequently gets:

r(t) = \frac{1}{t}\int_0^t f(s)\,ds = a_0 + \frac{1}{t}\int_0^t P_n(s)\, e^{-\lambda s}\,ds = a_0 + \frac{1}{t}\sum_{i=0}^{n} a_{i,n}\int_0^t s^i\, e^{-\lambda s}\,ds = a_0 + \frac{1}{t}\sum_{i=0}^{n} a_{i,n}\, I_i
The set of integrals I_i = \int_0^t s^i\, e^{-\lambda s}\,ds is well known; integrating by parts, one gets

I_i = -\frac{t^i\, e^{-\lambda t}}{\lambda} + \frac{i}{\lambda}\, I_{i-1},

so, by recurrence (for i > 0, with I_0 = (1 - e^{-\lambda t})/\lambda):

I_i = \frac{i!}{\lambda^{i+1}}\left(1 - e^{-\lambda t}\sum_{j=0}^{i}\frac{(\lambda t)^j}{j!}\right)
and:

r(t) = a_0 + \sum_{i=0}^{n} a_{i,n}\,\frac{i!}{\lambda^{i+1}\, t}\left(1 - e^{-\lambda t}\sum_{j=0}^{i}\frac{(\lambda t)^j}{j!}\right) = X\,\theta

We have an elegant generalization of Nelson-Siegel while keeping the previous methodology.
Estimating EPM(n) parameters using observed yields
Estimation remains linear and the factors can be interpreted exactly as in Nelson-Siegel; we have simply added extra dimensions, fully generalizing the notion of modified duration.
For p observations and an exponential polynomial model of degree n, the matrix X (p rows, n+2 columns) becomes (showing the generic entries of columns 1, 2, i+2 and n+2):
X = \begin{pmatrix}
1 & \dfrac{1-e^{-\lambda t_1}}{\lambda t_1} & \cdots & \dfrac{i!}{\lambda^{i+1} t_1}\left(1 - e^{-\lambda t_1}\displaystyle\sum_{j=0}^{i}\dfrac{(\lambda t_1)^j}{j!}\right) & \cdots & \dfrac{n!}{\lambda^{n+1} t_1}\left(1 - e^{-\lambda t_1}\displaystyle\sum_{j=0}^{n}\dfrac{(\lambda t_1)^j}{j!}\right) \\
\vdots & \vdots & & \vdots & & \vdots \\
1 & \dfrac{1-e^{-\lambda t_p}}{\lambda t_p} & \cdots & \dfrac{i!}{\lambda^{i+1} t_p}\left(1 - e^{-\lambda t_p}\displaystyle\sum_{j=0}^{i}\dfrac{(\lambda t_p)^j}{j!}\right) & \cdots & \dfrac{n!}{\lambda^{n+1} t_p}\left(1 - e^{-\lambda t_p}\displaystyle\sum_{j=0}^{n}\dfrac{(\lambda t_p)^j}{j!}\right)
\end{pmatrix}
After the first two columns, successive columns are linked (writing C_i for the column of degree i, i.e. column i+2) by:

C_i = \frac{1}{\lambda}\left( i\, C_{i-1} - t^{\,i-1}\, e^{-\lambda t}\right)
The derivative of X with respect to λ is still required in order to calculate the optimal λ (the estimation formulas themselves do not change):
\frac{\partial\, (X\theta)_l}{\partial \lambda} = \sum_{i=0}^{n} a_{i,n}\,\frac{i!}{\lambda^{i+2}\, t_l}\left[\, e^{-\lambda t_l}\left((i+1)\sum_{j=0}^{i}\frac{(\lambda t_l)^j}{j!} + \frac{(\lambda t_l)^{i+1}}{i!}\right) - (i+1) \right]
So that the matrix of derivatives is obtained by stacking, for each observation t_l (rows l = 1..p) and each degree i = 0..n (columns 2 to n+2; the first column, attached to the constant a_0, is zero):

\left(\frac{\partial X}{\partial\lambda}\right)_{l,\,i+2} = \frac{i!}{\lambda^{i+2}\, t_l}\left[\, e^{-\lambda t_l}\left((i+1)\sum_{j=0}^{i}\frac{(\lambda t_l)^j}{j!} + \frac{(\lambda t_l)^{i+1}}{i!}\right) - (i+1) \right]
After the first two columns, the derivative columns are linked by the recurrence (with C_i the column of degree i):

\frac{\partial C_i}{\partial \lambda} = \frac{1}{\lambda}\left( i\,\frac{\partial C_{i-1}}{\partial \lambda} + t^{\,i}\, e^{-\lambda t} - C_i \right)
The formulas remain the same; one only needs to build the two matrices X and ∂X/∂λ for the chosen number of factors n:

\theta = \big(X^T W X\big)^{-1}\, X^T W\, Y

0 = \left(\frac{\partial X}{\partial\lambda}\,\theta\right)^T W\, \big(Y - X\theta\big)
The second equation gives, as previously, a condition to be respected by λ. Beware: there is not a unique solution to this equation; especially for small λ, one can get several solutions. Therefore, one usually uses a two-step methodology: first identify the area where the Lagrangian reaches its minimum, then use the second equation to find the local minimum.
Once the optimal λ has been calculated, the system is then a simple least-squares problem in the remaining parameters.
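A minimal sketch of this two-step approach in Python, reusing the epm_design_matrix and fit_epm_given_lambda helpers above (the grid bounds and refinement step are arbitrary choices of ours, and the fine scan stands in for the local gradient condition):

import numpy as np

def epm_lagrangian(t, y, lam, n, weights=None):
    """Weighted sum of squared residuals with theta at its closed-form optimum for this lambda."""
    theta = fit_epm_given_lambda(t, y, lam, n, weights)
    resid = np.asarray(y, float) - epm_design_matrix(t, lam, n) @ theta
    w = np.ones(len(t)) if weights is None else np.asarray(weights, float)
    return float(resid @ (w * resid))

def two_step_lambda(t, y, n, lam_grid=np.linspace(0.05, 3.0, 60), n_refine=40):
    """Step 1: coarse scan to locate the region of the global minimum.
       Step 2: finer scan inside that bracket to pin down the local minimum."""
    values = [epm_lagrangian(t, y, lam, n) for lam in lam_grid]
    k = int(np.argmin(values))
    lo = lam_grid[max(k - 1, 0)]
    hi = lam_grid[min(k + 1, len(lam_grid) - 1)]
    fine = np.linspace(lo, hi, n_refine)
    lam_star = fine[np.argmin([epm_lagrangian(t, y, lam, n) for lam in fine])]
    return lam_star, fit_epm_given_lambda(t, y, lam_star, n)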
Explaining the EPM(n) model
The modeling is much more elegant than NSS or generalized Nelson-Siegel since:
- it is more intuitive;
- it respects the idea that each axis corresponds to a degree of freedom of the IR curve;
- it allows indefinite expansion while keeping analytical expressions.
This type of function is often found in the mathematical literature: Laguerre functions, Sargan distributions... This is because they are very close to polynomial interpolations.
In terms of forward rates, the model simply adds additional perturbations, smaller and further out. Indeed:

f_n(t) = a_0 + e^{-\lambda t}\sum_{i \le n} a_{i,n}\, t^i

The functions t^i\, e^{-\lambda t} each add a perturbation centered on their maximum, located at t = i/λ (see the short derivation below). Since λ is most of the time between 0.2 and 2, these additional perturbations give us the flexibility to adjust the curve to the additional observations (usually some deformation to adjust before the 2- or 3-year pivot and between 5 and 7 years).
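The location of that maximum is a one-line computation:

\frac{d}{dt}\left(t^i\, e^{-\lambda t}\right) = t^{\,i-1}\, e^{-\lambda t}\,\big(i - \lambda t\big) = 0 \quad\Longrightarrow\quad t = \frac{i}{\lambda}.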
For n equal to the number of observations minus 1, there is a direct link with Legendre polynomials.
Of course, the model assumes a finite interest rate at infinity, even though we will see that this parameter changes over time. This assumption is a limitation of the model but, again, what we observe in real life is that perpetual rents (traded until the 19th century) had a finite interest rate. Likewise, 100-year US government bonds do not show a trend toward an infinite IR at infinity.
Dybvig, Ingersoll and Ross (1996) and, in parallel, Antoine Frachot and Nicole El Karoui (1997) published a very strong result according to which, at infinity, the IR curve is a non-decreasing process. This very intuitive result can be translated into the fact that the instantaneous forward IR curve f(t) is also non-decreasing. In this view, our model simply expresses the fact that instantaneous forward rates suffer perturbations whose effect disappears over time. The previous result can be translated into the condition a_{n,n} < 0, which is the case most of the time, since IR curves are never seen in long-term backwardation. This gives a better view of the modeling of the forward rate, with p_{i,n} the roots of the polynomial:

f_n(t) = a_{LT} + a_{n,n}\, e^{-\lambda t}\prod_{i=1}^{n}\big(t - p_{i,n}\big)

The forward rate is thus modeled using a simple adjusted polynomial interpolation; the adjustment by e^{-\lambda t} makes the perturbations disappear over time.
Indeed, the model does not place any constraint on the curve of forward short-term rates. The proponents of the methodology argue that there is no such need, since the model just interpolates reality "ad minima", and reality does not allow absolute arbitrage, so neither should the model.
Still, in addition to having a_{n,n} < 0, the AOA should translate into some regularity of the forward rate curve. Intuitively, one can expect that the interpolation should not add any additional wave to the forward curve.
Another view of the EPM model is to consider that forward rates converge toward their long-term value a_0 but, before converging, face perturbations which disappear over time. These perturbations being defined as the difference between the observed forward rate and the long-term one, we model them through polynomial interpolations, using the exponential factor to express their disappearance over time:

f_n(t) - a_0 = e^{-\lambda t}\sum_{i \le n} a_{i,n}\, t^i
\quad\Longleftrightarrow\quad
e^{\lambda t}\big( f_n(t) - a_0 \big) = \sum_{i \le n} a_{i,n}\, t^i

In this approach, EPM(n) is the adaptation of polynomial interpolation to a family of functions converging toward a long-term value.
“A note on the Behavior of Long Zero Coupon Rates in a No Arbitrage Framework”, N. El
Karoui, A. Frachot, H. Geman, June 1997
An example
Let us take 20 points describing the IR curve and compare the results of EPM(n) for n = 1 (Nelson-Siegel) to 4. One gets:
n   Lagrangian   lambda   beta 0   beta 1   beta 2   beta 3   beta 4   beta 5   beta 6
1   0.002116     1.2874   3.195    -1.105    0.000
2   0.000905     0.4618   3.406    -1.324    1.804    -1.183
3   0.000897     0.5994   3.385    -1.299    0.986    -0.125   -0.272
4   0.000458     1.6035   3.280    -1.281    2.017    -5.638    3.373   -0.534
5   0.000458     1.6035   3.280    -1.281    2.017    -5.638    3.373   -0.534    0.000
The lagrangian doesn’t move regularly. For example, from n=1 (Nelson-Siegel) to n=2, there is a
significant reduction of the residuals as between n=3 and n=4. It isn’t however the case between n=2
and n=3 or between n=4 and n=5
Even though movements are less brutal than what can be found with a direct polynomial interpolation,
the practitioner need to use its market judgment to appreciate the necessity to integrate small
irregularities which may come from temporary market disequilibrium and bid/ask spread.
This subject is key and not directly relative to the model of interpolation of the curve. As it has been
explained in introduction, it is important to analyze each coefficient and their correlation with classical
orthogonal factors. EPM(1) captures most of the moves (usually above 90%). The addition of one or
two more factors can only be justified by the necessity to translate a specific irregularity of the curve.
The example shows also the difficulty linked to the fact that
L
isn’t strictly growing and the
estimation must be made around the absolute minimum to converge properly. That means that the
2,00
2,20
2,40
2,60
2,80
3,00
3,20
3,40
0,00 5,00 10,00 15,00 20,00 25,00
Example of EPM interpolation using different degrees
vector rates X.theta n=1 X.theta n=2 X.theta n=3 X.theta n=4
17
methodology of optimizing will require to avoid local minima even though Lagrange convergence
methodology provides only local minimum.
Finally, it is clear that optimizing λ is key and models with fixed λ will provide very likely a much lower
fit with the real observations.
Estimating EPM(n) parameters using prices or in fine yields
Most of the time, only bond prices are observed, not zero-coupon yields. Working directly on yields is actually possible only for swap markets or when a liquid strip market exists.
Let us assume we now have p observed prices P_l of bonds paying a cash flow CF_l(i) at time t_i. One gets:

P_l(\theta) = \sum_i CF_l(i)\; e^{-X(t_i)\,\theta\; t_i} = \sum_i \frac{CF_l(i)}{\big(1 + r_{act}(t_i)\big)^{t_i}}

which expresses the price of trade l as the discounted value of the future cash flows of the bond, using the zero-coupon curve of parameter θ.
Note that we present the actuarial rates in parallel with the continuous rates. Using continuous rates is more rigorous from a methodological point of view but adds a level of complexity when communicating to the market (this article, like every article published by ALM-Vision, gives the theoretical background to an operational subject, which explains our pragmatic approach, sometimes away from academic publications). The two conventions are very close (identical at first order, different at second order, but the estimator adjusts for that difference). The two rates are linked by the simple equality:
r_{exp} = \ln\big(1 + r_{act}\big)

ZC(t) = e^{-r_{exp}(t)\, t} = \big(1 + r_{act}(t)\big)^{-t}

\text{actuarial modified duration} = -\frac{\partial \ln ZC(t)}{\partial r_{act}} = \frac{t}{1 + r_{act}}

\text{exponential modified duration} = -\frac{\partial \ln ZC(t)}{\partial r_{exp}} = t
So one usually performs the calculation with exponential rates and then converts the results in order to communicate an actuarial yield curve.
In any case, the Lagrangian to minimize becomes:

\min_{\theta,\lambda} L(\theta,\lambda) = \min_{\theta,\lambda}\; \sum_{l=1}^{p} w_l\,\big(P_l - P_l(\theta,\lambda)\big)^2

This is no longer linear, but it can be approximated with a linear methodology since the price functions are infinitely differentiable (cf. Gourieroux-Monfort-Trognon-Renault). Notice that we make no assumption on the probability law of the residuals, since none is required.
We change our notation to:

x = \begin{pmatrix} \theta \\ \lambda \end{pmatrix},\qquad
Z_i = \frac{dL}{dx_i} = -2\sum_l w_l\,\big(P_l - P_l(x)\big)\,\frac{\partial P_l}{\partial x_i}

H_{i,j} = \frac{\partial^2 L}{\partial x_i\,\partial x_j} = -2\sum_l w_l\left[\big(P_l - P_l(x)\big)\,\frac{\partial^2 P_l}{\partial x_i\,\partial x_j} - \frac{\partial P_l}{\partial x_i}\,\frac{\partial P_l}{\partial x_j}\right], \qquad i,j = 0,\ldots,n+2

with Z the gradient of the Lagrangian and H its Hessian.
By linearization, we assume that the term involving the residual times the second derivative is negligible, so that:

H_{i,j} \approx 2\sum_l w_l\,\frac{\partial P_l}{\partial x_i}\,\frac{\partial P_l}{\partial x_j}
Assuming we have an estimate (θ_0, λ_0) near the optimal value, one obtains the local minimum by applying a Newton algorithm:

x_{n+1} = x_n - H(x_n)^{-1}\, Z(x_n)
For that, we need the derivatives of the prices with respect to the components of x. Writing X(i, t) for the regressor of column i at maturity t (X(0, t) = 1 and, for the higher columns, X(i, t) as defined above), one gets:

\frac{\partial P_l}{\partial \theta_i} = -\sum_s CF_l(s)\; t_s\; X(i, t_s)\; e^{-X(t_s)\,\theta\, t_s}

\frac{\partial P_l}{\partial \lambda} = -\sum_s CF_l(s)\; t_s \left(\frac{\partial X(t_s)}{\partial \lambda}\,\theta\right) e^{-X(t_s)\,\theta\, t_s}
With, in approximation, the same expression for the derivative of θ as in the yield-based estimation:

\frac{\partial \theta}{\partial \lambda} \approx \big(X^T \tilde W X\big)^{-1}\left[\left(\frac{\partial X}{\partial\lambda}\right)^T \tilde W\, Y - \left(\left(\frac{\partial X}{\partial\lambda}\right)^T \tilde W X + X^T \tilde W\, \frac{\partial X}{\partial\lambda}\right)\theta\right]

where \tilde W is the matrix of equivalent weights on the underlying rates, that is, the price weights multiplied by the modified durations.
For simplification, one can also simply estimate the derivative with respect to λ numerically around the observed point; the fact that the derivative is largest around the optimum allows this approximation.
By bootstrapping, one then gets the optimal parameters. There are two possibilities for starting the bootstrap:
- if we have an estimate from the previous day, we can obviously use it, since IR curve shifts are almost never more than a few tens of basis points;
- if not, we can use as starting points the yields and modified durations of the bonds; zero-coupon curves are also usually relatively close to the in-fine yields.
This methodology is obviously more complex than optimizing directly on yields, but the approach is relatively similar.
Convergence is extremely fast, within 3 to 5 iterations in our tests. Still, it provides a local optimum, and it is important to combine this approach with the one applied directly to zero-coupon rates in order to get a global view of the shape of the curve.
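A compact Python sketch of this Gauss-Newton / bootstrapping loop on prices (our own helper names; it reuses epm_design_matrix from above and replaces the analytical derivatives by a finite-difference Jacobian, purely for brevity):

import numpy as np

def bond_price(cashflows, lam, theta, n):
    """Price of one bond: cash flows (t, CF) discounted on the EPM(n) zero-coupon curve."""
    t = np.array([c[0] for c in cashflows], dtype=float)
    cf = np.array([c[1] for c in cashflows], dtype=float)
    r = epm_design_matrix(t, lam, n) @ theta          # continuous zero-coupon rates
    return float(np.sum(cf * np.exp(-r * t)))

def fit_epm_on_prices(bonds, prices, n, x0, weights=None, n_iter=5, h=1e-6):
    """Gauss-Newton on x = (theta, lambda); bonds is a list of cash-flow schedules."""
    x = np.asarray(x0, dtype=float)                   # starting point, e.g. yesterday's fit
    w = np.ones(len(bonds)) if weights is None else np.asarray(weights, float)
    def model(x):
        lam, theta = x[-1], x[:-1]
        return np.array([bond_price(b, lam, theta, n) for b in bonds])
    for _ in range(n_iter):
        resid = np.asarray(prices, float) - model(x)
        # finite-difference Jacobian dP/dx
        J = np.column_stack([
            (model(x + h * np.eye(len(x))[k]) - model(x - h * np.eye(len(x))[k])) / (2 * h)
            for k in range(len(x))
        ])
        W = np.diag(w)
        # Gauss-Newton step: x <- x + (J'WJ)^-1 J'W (P_obs - P_model)
        x = x + np.linalg.solve(J.T @ W @ J, J.T @ W @ resid)
    return x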
Optimizing using in fine yields
When the data consist of in-fine yields, the user can deduce the zero-coupon rates directly only if all the information is available, that is, the 1y, 2y, ..., 20y points. Unfortunately, most of the time the information is available only for some tenors: 1y, 2y, 3y, 5y, 7y, 10y, 15y, 20y, 30y. The others must be deduced from these.
Actually, the methodology is the same with, for the tenor l:
1 = P_l(\theta) = \sum_{i=1}^{l} c_l\, e^{-X(t_i)\,\theta\, t_i} + e^{-X(t_l)\,\theta\, t_l} = \sum_{i=1}^{l} \frac{c_l}{\big(1 + r_{act}(t_i)\big)^{t_i}} + \frac{1}{\big(1 + r_{act}(t_l)\big)^{t_l}}

The price of each observation is simply equal to 1, and the related cash flows are equal to c_l until l and to 1 + c_l at time l. In this case, we optimize on the price, which implicitly overweights the long-duration observations.
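Continuing the sketch above, the in-fine quotes can be turned into unit-price observations before reusing the same price-based fit; the annual coupon dates and the quotes themselves are illustrative assumptions of ours.

import numpy as np

def par_bond_cashflows(coupon, maturity_years):
    """Cash flows of an in-fine instrument quoted at par: coupon each year, 1 + coupon at maturity."""
    cfs = [(t, coupon) for t in range(1, maturity_years)]
    cfs.append((maturity_years, 1.0 + coupon))
    return cfs

# hypothetical in-fine quotes (tenor in years, annual coupon as a decimal)
quotes = [(2, 0.0010), (3, 0.0020), (5, 0.0050), (7, 0.0080),
          (10, 0.0110), (15, 0.0135), (20, 0.0150), (30, 0.0158)]
bonds = [par_bond_cashflows(c, m) for m, c in quotes]
prices = [1.0] * len(bonds)                        # par instruments: observed price is 1
n = 2
x_start = np.array([0.02, -0.02, 0.0, 0.0, 0.5])   # (theta_0..theta_3, lambda) starting guess
x_fit = fit_epm_on_prices(bonds, prices, n, x_start)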
Observed prices and estimated prices using EPM(2) on the Euribor swap curve (with the derivatives of the price against each parameter):

Price   month   in fine     weight   est. Price   dP/dlambda   dP/dtheta0   dP/dtheta1   dP/dtheta2   dP/dtheta3
100%    JJ      -0.3640%    4.00%    100.00%       0.00         0.00         0.00         0.00         0.00
100%    1W      -0.3780%    4.00%    100.00%       0.00        -0.02        -0.02         0.00         0.00
100%    2W      -0.3710%    4.00%    100.01%       0.00        -0.04        -0.04         0.00         0.00
100%    1       -0.3690%    4.00%    100.01%       0.00        -0.04        -0.04         0.00         0.00
100%    2       -0.3390%    4.00%    100.03%       0.00        -0.16        -0.16         0.00         0.00
100%    3       -0.3280%    4.00%    100.04%       0.00        -0.24        -0.24         0.00         0.00
100%    6       -0.2780%    4.00%    100.07%       0.00        -0.49        -0.48        -0.02         0.00
100%    12      -0.2600%    4.00%    100.05%      -0.02        -1.00        -0.93        -0.06        -0.01
100%    24      -0.1100%    4.00%     99.95%      -0.06        -2.00        -1.74        -0.23        -0.04
100%    36       0.0900%    4.00%     99.92%      -0.13        -3.00        -2.45        -0.48        -0.13
100%    48       0.3000%    4.00%     99.99%      -0.20        -4.00        -3.06        -0.78        -0.28
100%    60       0.4800%    4.00%    100.04%      -0.28        -4.98        -3.58        -1.11        -0.48
100%    72       0.6300%    4.00%    100.05%      -0.37        -5.95        -4.03        -1.45        -0.75
100%    84       0.7600%    4.00%    100.02%      -0.44        -6.90        -4.41        -1.80        -1.07
100%    96       0.8800%    4.00%    100.02%      -0.51        -7.83        -4.74        -2.14        -1.43
100%    108      0.9800%    4.00%     99.97%      -0.57        -8.74        -5.01        -2.47        -1.83
100%    120      1.0700%    4.00%     99.94%      -0.61        -9.63        -5.24        -2.78        -2.25
100%    144      1.2300%    8.00%    100.01%      -0.67       -11.35        -5.60        -3.35        -3.13
100%    180      1.3900%    8.00%    100.01%      -0.66       -13.80        -5.96        -4.04        -4.45
100%    240      1.5300%    8.00%    100.00%      -0.49       -17.55        -6.27        -4.82        -6.40
100%    360      1.5800%    8.00%    100.00%       0.04       -24.31        -6.48        -5.55        -8.90
One can also optimize on the coupon, observing that:

c_n(\theta,\lambda) = \frac{1 - ZC(t_n;\,\theta,\lambda)}{\displaystyle\sum_{i \le n} ZC(t_i;\,\theta,\lambda)}
So that:

\min_{d\theta,\,d\lambda}\;\sum_{l=1}^{p} w_l\left(c_l - c_l(\theta_0,\lambda_0) - \frac{\partial c_l}{\partial \theta}\,d\theta - \frac{\partial c_l}{\partial \lambda}\,d\lambda\right)^2
\;=\; \min_{d\theta,\,d\lambda}\;\sum_{l=1}^{p} w_l\,\big(Y_l - Z_l\,(d\theta, d\lambda)\big)^2

We are back to the previous situation with different Y and Z.
About weights and errors
Weights are proportional to the inverse of the variance of the error and allow for some adjustments in order to take into account the size of a trade, its significance...
An important point is that optimization on IR and optimization on prices implicitly use different weights. Indeed, optimization on IR gives the same weight to long-term observations as to short-term ones. This generates a higher error in terms of prices for long-term bonds, since, in first approximation, dP = modified duration × dr. Conversely, optimization on prices mechanically compensates for the higher sensitivity of long-term bonds.
In order to get a higher level of accuracy on long-term IR when optimizing directly on IR, one must increase the weight of long-term bonds in proportion to their modified duration.
In many illiquid markets, there are few trades per day and the curve cannot be accurately estimated using only the trades of the day. One therefore uses older observations, whose weight must be reduced with their age, going down to 0 after 30 to 90 days depending on the market. If the observation is recent (a few days) in an illiquid market, it should keep almost all its weight and then decrease more quickly over time: for example,
w(t) = \frac{1 - e^{-k\,\max(0,\; t - t_{min})}}{k\,\max(0,\; t - t_{min})}
\qquad\text{or}\qquad
w(t) = a_1\left(1 - \frac{t^2}{a_2}\right)
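A tiny helper in the same spirit as the prose above (full weight while the observation is recent, then decay to zero after a cutoff); the 5- and 60-day cutoffs and the linear decay are illustrative choices of ours, not a recommendation:

import numpy as np

def age_weight(age_days, full_weight_days=5, zero_weight_days=60):
    """One possible age-decay scheme: full weight while the trade is recent,
    then linear decay down to zero (cutoffs are illustrative only)."""
    age = np.asarray(age_days, dtype=float)
    w = (zero_weight_days - age) / (zero_weight_days - full_weight_days)
    return np.clip(w, 0.0, 1.0)

print(age_weight([1, 10, 30, 90]))   # -> [1.0, ~0.91, ~0.55, 0.0]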
Another classical adjustment is made in proportion to the size of the trade.
Another difficult issue is linked to the bid/ask spread and to the functioning of the market. In a market dominated by market makers and specialists, trades are often done between the market maker and its client, the client usually being a buyer on the primary market and rather a seller on the secondary one (because big institutional investors keep their lines as long as possible). Consequently, observed trades can include the margin of the market maker. This information is not always available and may add unjustified volatility to the curve. Reducing the weight of certain types of trades may reduce this effect.
One must keep in mind that choosing different weights obviously modifies the value of the estimate, which may have a significant theoretical impact in terms of the implicit model (see below).
Other statistical models versus least square estimation
The previous estimations of the parameters chose to minimize the distance between the observations and the curve, the classical least-squares methodology:

\min_{\theta,\lambda} \sum_i \big(y_i - f(t_i)\big)^2

This purely geometric approach makes no statistical assumption; it just uses the standard Euclidian distance. It has the great advantage of expressing the intuitive objective of the traders: to get an analytical formula as close as possible to the observed prices (in the standard Euclidian metric).
However, this methodology requires eliminating outliers, which can heavily disturb the process, exactly as in a simple least-squares linear regression. It also requires controlling that the residuals follow a Gaussian law (so that the least-squares estimate is also the maximum-likelihood estimator), or at least have the same variance and are not auto-correlated. Indeed, the least-squares methodology provides the maximum-likelihood estimate only if the residuals follow a Gaussian law.
Identifying outliers is key but sometimes difficult. A classical statistical way of doing so is to divide the residuals by their observed standard deviation and flag the normalized residuals larger than [2] or [2.5] in absolute value. A more financial approach is to eliminate the trades which are obviously "off-market" compared with market practice: standard bid-ask, market makers' quotation spreads... Trades which were not done on the market (OTC trades), buy-and-sell trades (done to generate accounting results under certain accounting systems or to roll a line into a new line), and end-of-year or end-of-quarter sell-and-buy trades have a higher probability of being done "off-market".
If one does not want to, or cannot, eliminate outliers, another criterion giving less weight to outliers is the old method of the sum of the absolute values of the residuals (the Least Absolute Deviation, or LAD, model, also called L1-regression):

\min_{\theta,\lambda} \sum_i \big| y_i - f(t_i) \big|
Still it doesn’t give us any indication whether the EPM is a proper model or not. It simply gives us the
parameters which minimize the sum of the absolute distance between the cloud of observation and
the EPM curve.
Only if residuals follow a Laplace law is the LAD estimate the maximum likelihood estimate. It is often
the case for heavy tails, asymmetric distribution of residuals or very large sample. Indeed, the author
found standardized residuals following a Laplace distribution in the analysis of 40 000 trades in a low
liquidity market: most of the trades were on or very closed to the estimated market curve with heavy
tail of distribution and a sharp decrease of the number of trades out of the curve (compared to the
Explicitly but implicitly, the least square methodology is the maximum likelihood estimator for gaussian
residuals only. If not, it is still unbiased but may not be the optimum.
Also called L1-regression. For LAD and other alternative models, see Birkes and Dodge “alternative methods of
regressions”, 1993, Wiley-interscience.
For MLE, see “An inductive approach to calculate the MLE for the double exponential distribution”, W. Hurley,
Journal of modern applied statistical methods: JMASM nov. 2009
23
rather flat shape of the gaussian standard distribution at the average). In this sample, there was no
significant asymmetry.
Other methods include M-regression or nonparametric regressions, which perform well with outliers but are much heavier to implement.
Actually, as for any least-squares analysis, if we consider Y_i as a random variable, moving from a geometric approach to a statistical one, both methods are based on the assumption that

E\big(Y_i \mid t_i\big) = f(t_i)

and so we must control that

E\big(Y_i - f(t_i)\big) = 0.

The difficulty is that the residuals are not necessarily of the same variance, nor uncorrelated (the issue is exactly the same as with a least-squares linear regression, which is based on the assumptions that residuals are centered, not auto-correlated and of constant variance; if they are not, the estimate is still unbiased but no longer achieves the minimum variance).
Indeed, the analysis of the residuals may require taking a statistical approach, assuming a model for the residuals:

Y_i = f(t_i) + \sigma_i\,\varepsilon_i \quad\text{with}\quad \varepsilon_i \sim N(0,1) \text{ or a Laplace law}

or

Y_i = f(t_i)\; e^{\sigma_i\,\varepsilon_i - \sigma_i^2/2} \quad\text{with}\quad \varepsilon_i \sim N(0,1) \text{ or a Laplace law.}

In both models, σ_i may vary depending on market parameters: the volume of the transaction compared with similar trades, the type of trade (primary, secondary, inter, intra), the type of line (new line, old line without volume)... The resolution of the optimization requires writing the likelihood.
[Figure: distribution of standardized errors in a low-liquidity market, showing the observed density function f_obs(x) against the standard Gaussian density f(x) for comparison.]
Pricing of derivatives
Obviously, the family of exponential polynomial models allows building simple multi-dimensional extensions of Cox-Ross to interest rates while keeping a legitimate theoretical background. Indeed, assuming λ is stable, the IR curve is fully described by the vector θ of orthogonal variables.
Very elegant in theory, this class of models is demanding in terms of computation capacity, since a 1000-step tree generates 1000^3 = 10^9 final possible positions for a Nelson-Siegel (or EPM(1)). There is no point in using a higher-degree model, since the pricing of long-term IR derivatives is not an exact science and the market prices a significant bid/ask inside the volatility to cover the other independent factors.
The model assumes that:
- Between two successive periods t and t+dt, each of the three parameters θ_0, θ_1 and θ_2 can either go up by a_i% with probability p_i under the risk-neutral probability, or go down by b_i%, that is:

\theta_i(t+dt) = \begin{cases}\theta_i(t)\,(1 + a_i) & \text{with probability } p_i \\ \theta_i(t)\,(1 + b_i) & \text{with probability } 1 - p_i\end{cases}

- The major difference with a classical Cox-Ross is that r(t) depends on θ(t) and is no longer "risk free"; it is such only for the period between t and t+dt.
- The risk-free rate between t and t+dt is the instantaneous short-term rate; it depends on the current value of the curve: r(t) = \theta_0(t) + \theta_1(t). In this case, however, the probabilities become path-dependent and do not allow building a clean tree, so we will take the approximation used in the market of a constant r, even though we know it is far from reality and raises serious methodological issues.
There is no explicit AOA in the model; still, we impose the constraint as if each parameter were an asset (in first approximation, the variation of the price of a zero-coupon is proportional to the variations of the three parameters θ) and real assets were derivatives on these three notional assets (for example, the price of a zero-coupon is a derivative of the parameters θ). This local AOA is acceptable since zero-coupon prices are regular, infinitely differentiable functions. This methodology is rather abstract, but one gets:
E\big(\theta_i(t+dt)\big) = \theta_i(t)\,(1+a_i)\,p_i + \theta_i(t)\,(1+b_i)\,(1-p_i) = \theta_i(t)\,(1+r)
\quad\Longrightarrow\quad
p_i = \frac{r - b_i}{a_i - b_i}

Locally, at each step, the model allows defining the quantity of θ required to be sure of being able, at step + 1, to honor the derivative contract, which is the basis of the Cox-Ross approach. The AOA constraint is simply no longer fully respected, because there is no clear risk-neutral probability.
One can choose the IR over the full period in order to avoid dependence on the number of steps when pricing in-fine instruments (for a zero-coupon, neutrality is respected with the forward instantaneous rates).
Each parameter follows a tree and, after n steps including j up-moves, one gets:

\theta_i(n, j \text{ up}) = \theta_i(0)\,(1+a_i)^j\,(1+b_i)^{\,n-j}
The parameters are independent by construction, and the state of the IR curve at time N is defined by the vector θ after i up-moves of θ_0, j up-moves of θ_1 and k up-moves of θ_2, with probability:

p(N, i, j, k \text{ up}) = \binom{N}{i}\binom{N}{j}\binom{N}{k}\; p_0^{\,i}(1-p_0)^{N-i}\; p_1^{\,j}(1-p_1)^{N-j}\; p_2^{\,k}(1-p_2)^{N-k}

By recurrence, one can then get the price of any asset paying a cash flow at time t_N in state (i, j, k up) as a derivative of the three parameters, which allows market calibration using swaption prices:

P\big(0, \theta(0)\big) = \frac{1}{(1+r)^{t_N}}\sum_{i,j,k=0}^{N} P\big(t_N,\; \theta(N, i, j, k \text{ up})\big)\; p(N, i, j, k \text{ up})
American options are calculated by backward recurrence with, at each step, 2^3 possible directions:

u(n) = \max\left(\text{intrinsic value}(n),\; \frac{1}{\big(1+r(t)\big)^{dt}}\sum_{i,j,k=0}^{1} u\big(n+1,\, i, j, k \text{ up}\big)\; p\big(1, i, j, k \text{ up}\big)\right)
One still has the classical link with the volatility of each θ, if one assumes that each θ follows a log-normal process (that is, does not change sign):

\ln(1 + a_i) = \frac{T}{N}\,\ln(1 + r) + \sigma_i\sqrt{\frac{T}{N}}, \qquad \ln(1 + b_i) = \frac{T}{N}\,\ln(1 + r) - \sigma_i\sqrt{\frac{T}{N}}

But this formula is an approximation, since there is no longer a risk-free rate over the full period (which is the reality of the market).
This formula allows pricing prepayment options on loans. Let the value of a loan be:

V(t, \theta_t) = \sum_{t_i > t} \frac{CF_{t_i}}{\big(1 + X(t_i - t)\,\theta_t\big)^{\,t_i - t}}

Let us define a statistical rule for exercising the option at time t, that is, the percentage of clients exercising as a function of the IR: q(t, θ). It can be a function of the difference between the current and the initial interest rate, or of the difference in mark-to-market as computed by the bank. The backward recurrence then mixes exercise and continuation:

u(n) = q\big(t_n, \theta_n\big)\, V\big(t_n, \theta_n\big) + \Big(1 - q\big(t_n, \theta_n\big)\Big)\,\frac{1}{\big(1+r(t)\big)^{dt}}\sum_{i,j,k=0}^{1} u\big(n+1,\, i, j, k \text{ up}\big)\; p\big(1, i, j, k \text{ up}\big)
Conclusion
The Nelson-Siegel model is fascinating because it provides a simple representation of the IR curve with orthogonal factors that are easily interpretable in macro-economic terms. It can be generalized into EPM, a powerful family of models.
However, the operational implementation requires a rigorous methodology, since classical optimization methodologies provide only local optima.
In addition, this class of models does not respect the AOA. Scholars have proposed analytical modifications of the model in order to obtain AOA, but these solutions generate infinite long-term rates. Another promising direction would be to soften the AOA and to integrate into the model the fact that there is no risk-free rate on the market. Under these assumptions, EPM provides a powerful and simple tool for pricing derivatives.
Finally, the study of the coefficients over time also opens interesting fields of research for asset management.
Bibliography
BIS Papers No. 25, "Zero-coupon yield curves estimated by central banks".
European Central Bank Working Paper Series No. 917, July 2008, "Modeling and forecasting the yield curve under model uncertainty".
European Central Bank Working Paper Series No. 874, February 2008, "How arbitrage free is the Nelson-Siegel model".
"Yield curve modeling and forecasting: the dynamic Nelson-Siegel approach", Francis X. Diebold & Glenn D. Rudebusch, 29 April 2012.
Annales d'économie et de statistique No. 8, 1987, "Les méthodes du pseudo-maximum de vraisemblance", Alain Trognon.
"Méthode des moindres carrés généralisés", Jean Debord, April 2003.
COMISEF Working Papers Series WPS-031, 30/03/2010, "Calibrating the Nelson-Siegel Model", M. Gilli, S. Gross, E. Schumann.
"A note on the Behavior of Long Zero Coupon Rates in a No Arbitrage Framework", N. El Karoui, A. Frachot, H. Geman, June 1997.
"Function Minimization", Proceedings of the 1972 CERN Computing and Data Processing School, Pertisau, Austria, 10-24 September 1972 (CERN 72-21), mainly page 31.
"Alternative methods of regression", David Birkes & Yadolah Dodge, Wiley-Interscience, 1993.
"An inductive approach to calculate the MLE for the double exponential distribution", W. Hurley, Journal of Modern Applied Statistical Methods (JMASM), Nov. 2009.
About ALM-VISION
ALM-Vision is a quantitative modeling company founded in 2011. Its mission is to provide quantitative analysis
and scientific support to financial institutions.
The core of our business activity is Asset Liability Management (ALM) modeling. Our modeling tool ALM-
Solutions® is proprietary software developed internally by our team for highly precise state of the art modeling
of banking assets and liabilities to monitor the financial institutions’ interest rate, credit and liquidity risk and to
understand the impact of a variety of economic scenarios on the balance sheet and income statements, including
stress testing. We also have high-quality pricing capacities for complex structured financial products.
In addition to ALM modeling, ALM-Vision provides advisory services to financial institutions and is called in to
intervene on technical matters that require high pricing capacity and substantial and extensive experience in the
financial markets (CVA, FVA, deal structuring, ABS, NBT, inflation-linked products, commodity derivatives and
modeling, credit restructuring…).
Most bank ALM and/or risk teams are left alone to handle the new regulatory environment. With the current
difficult environment for the financial industry, both human and technological resources are scarce and the
teams have neither the time nor the capacity to develop the scientific part of their job. We provide our customers
with this technical support and act as a bridge for best practices between our customers. Indeed, each customer
brings us new needs, new issues, new requirements which reinforce our expertise. Our rule is to systematically
share new non-client specific developments as a way to diffuse best practices around the industry. We strongly
believe that our success is based on the fact that we are not a simple IT provider but a true scientific support
team, with strong financial expertise assisting our customers in the whole modeling and analysis of their balance
sheet.
In ALM, software is just the tool. The core of the added value is the modeling and the analysis. Leveraging on our
strong financial and market experience, we help our clients focus on this core in the most efficient way.
Disclaimer
The contents of this document are proprietary to ALM-VISION. This document is produced by ALM-
VISION for institutional investors only and is not financial research. This document is for information
purposes only and is neither an offer to buy or sell, nor a solicitation or a recommendation to buy or
sell, securities or any other product. ALM-VISION makes no representation as to the accounting, tax,
regulatory or other treatment of the structure of any transaction and/or strategy described in this
document and the recipient should perform its own investigation and analysis of the operations and
the risk factors involved before determining whether such transaction is one which it is proper and
appropriate for it to enter into. Some information contained in this document may have been received
from third party or publicly available sources that we believe to be reliable. We have not verified any
such information and assume no responsibility for the accuracy or completeness thereof. This
document may also include details of historic performance levels of various rates, benchmarks or
indices. Past performance is not indicative of future results.
© 2018 ALM-VISION. www.alm-vision.com. Contact info@alm-vision.com. All rights
reserved. No information in this document may be reproduced or distributed in whole
or in part without the express written prior consent of ALM-VISION, except for personal
use.