Numerical Aspects of Different Kalman Filter Implementations

MICHEL VERHAEGEN AND PAUL VAN DOOREN, MEMBER, IEEE

IEEE Transactions on Automatic Control, vol. AC-31, no. 10, pp. 907-917, October 1986

Abstract: A theoretical analysis is made of the error propagation due to numerical roundoff for four different Kalman filter implementations: the conventional Kalman filter, the square root covariance filter, the square root information filter, and the Chandrasekhar square root filter. An experimental analysis is performed to validate the new insights gained by the theoretical analysis.
I. INTRODUCTION
SINCE the appearance of Kalman's 1960 paper [1], the so-called Kalman filter (KF) has been applied successfully to many practical problems, especially in aeronautical and aerospace applications. As applications became more numerous, some pitfalls of the KF were discovered, such as the problem of divergence due to the lack of reliability of the numerical algorithm or to inaccurate modeling of the system under consideration [2].
Therefore, several modified implementations of the KF were presented in an effort to avoid these numerical problems. Many of these modifications were based on heuristics (as in the stabilized KF [3], or the conventional KF with lower bounding [4]) which often require much experience in order to implement them effectively. Later, more reliable KF implementations were described, such as the square root filter (SRF) proposed by Potter in 1963 [5]. For this filter the reliability of the filter estimates is expected to be better because of the use of numerically stable orthogonal transformations for each recursion step. On the other hand, the SRF implementation required more computations than the conventional KF [6]. This problem of cost efficiency gave rise to the development of modified versions of the SRF such as the UDU'-algorithms [7] and the Chandrasekhar form [9]. These implementations can be made as efficient as the conventional KF, or, for the Chandrasekhar SRF, even more efficient under some special experimental conditions.
In this paper we reconsider the numerical robustness of existing KF's and derive some results giving new and/or better insights into their numerical performance. Here we investigate four "basic" KF implementations: the conventional Kalman filter (CKF), the square root covariance filter (SRCF), the Chandrasekhar square root filter (CSRF), and the square root information filter (SRIF). (The implementations chosen come from [12]; these differ substantially from the forms described in [7] with the same names!) This certainly does not cover all possible implementations encountered in practice, but insights gained for these general cases are very useful in judging variants such as the efficient KF algorithms based on the sequential processing technique [7] or the "condensed form" versions [10], [11]. After a brief description of the above filters in Section II, we perform in Section III a detailed first-order perturbation study of the error propagation due to roundoff for the above four KF implementations. In Section IV a realistic simulation study is performed in order to validate the results of the theoretical analysis. Section V then outlines a comparison between the different filter implementations using the results of the theoretical error analysis and the simulation study. We end with some concluding remarks in Section VI.

Manuscript received January 21, 1985; revised November 21, 1985 and April 28, 1986. Paper recommended by Associate Editor H. L. Weinert. M. Verhaegen is with the NASA Ames Research Center, Moffett Field, CA 94035. P. Van Dooren is with the Philips Research Laboratory, Brussels, Belgium. IEEE Log Number 8610087.
II. NOTATION AND PRELIMINARIES
In this section we introduce our notation and list the different Kalman filter types that are discussed in the paper. We consider the discrete time-varying linear system

x_{k+1} = A_k x_k + B_k w_k + D_k u_k   (1)

and the linear observation process

y_k = C_k x_k + v_k   (2)
where x_k, u_k, and y_k are, respectively, the state vector to be estimated (in R^n), the deterministic input vector, and the measurement vector (in R^p), where w_k and v_k are the process noise and the measurement noise (in R^p) of the system, and, finally, where A_k, B_k, C_k, and D_k are known matrices of appropriate dimensions. The process noise and measurement noise sequences are assumed zero mean and uncorrelated:

E{w_k} = 0,  E{v_k} = 0,  E{w_k v_j'} = 0   (3)
with known covariances

E{w_j w_k'} = Q_k delta_{jk},  E{v_j v_k'} = R_k delta_{jk}   (4)

where E{.} denotes the mathematical expectation and Q_k and R_k are positive definite matrices.
The assumption that Q_k is nonsingular does not restrict the generality of the system description, since for the case of singular Q_k the linearly dependent components in w_k can always be removed first [12]. On the other hand, the regularity of R_k rules out the possibility of including perfect measurements not corrupted by noise. In the particular case of perfect measurements, special adaptations are required for some of the KF implementations, such as the use of the Moore-Penrose inverse for the CKF [12]. Such special implementations are not considered here except for a few comments in the concluding remarks.
The SRF algorithm uses the Choleski factors of the covariance matrices or their inverse (with nonsingularity required for the SRIF) in order to solve the optimal filtering problem. Since the process noise covariance matrix Q_k and the measurement noise covariance matrix R_k are assumed to be positive definite, the following Choleski factorizations¹ exist:

Q_k = Q_k^{1/2} [Q_k^{1/2}]',  R_k = R_k^{1/2} [R_k^{1/2}]'.   (5)

¹ Notice that historically Q_k^{1/2} and R_k^{1/2} have erroneously been called "square roots" instead of "Choleski factors." However, we will maintain the adjective "square root" as far as the names of the filters are concerned, because of the familiarity that they have acquired.
where the factors Q_k^{1/2} and R_k^{1/2} may be chosen upper or lower triangular. This freedom of choice is exploited in the development of the fast KF implementations presented in Section II-E.
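As a concrete illustration (ours, not part of the original paper), both factor choices in (5) are easy to compute with numpy: `cholesky` returns the lower triangular factor, and an upper triangular factor of the same product form is obtained through a reversal permutation.

```python
import numpy as np

Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])

# Lower triangular Choleski factor: Q = L @ L.T
L = np.linalg.cholesky(Q)

# Upper triangular factor U with Q = U @ U.T, obtained by factoring the
# row/column-reversed matrix and undoing the reversal afterwards.
J = np.eye(2)[::-1]
U = J @ np.linalg.cholesky(J @ Q @ J) @ J

assert np.allclose(L @ L.T, Q) and np.allclose(U @ U.T, Q)
```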
The problem is now to compute the minimum variance estimate of the stochastic variable x_k, provided y_1 up to y_j have been measured:

x_hat_{k|j} = x_hat_{k | y_1, ..., y_j}.   (6)

When j = k this estimate is called the filtered estimate and for j = k - 1 it is referred to as the one-step predicted or, shortly, the predicted estimate. The above problem is restricted here to these two types of estimates except for a few comments in the concluding remarks. Kalman filtering is a recursive method to solve this problem. This is done by computing the variances P_{k|k} and/or P_{k|k-1} and the estimates x_hat_{k|k} and/or x_hat_{k|k-1} from their previous values, this for k = 1, 2, .... Thereby one assumes P_{0|-1} (i.e., the covariance matrix of the initial state x_0) and x_hat_{0|-1} (i.e., the mean of the initial state x_0) to be given.
A. The Conventional Kalman Filter (CKF)
The above recursive solution can be computed by the CKF equations, summarized in the following "covariance form" [12]:

R^e_k = R_k + C_k P_{k|k-1} C_k'   (7)
K_k = A_k P_{k|k-1} C_k' [R^e_k]^{-1}   (8)
P_{k+1|k} = A_k P_{k|k-1} A_k' - K_k R^e_k K_k' + B_k Q_k B_k'   (9)
x_hat_{k+1|k} = A_k x_hat_{k|k-1} + D_k u_k + K_k (y_k - C_k x_hat_{k|k-1}).   (10)

This set of equations has been implemented in various forms; see [12]. An efficient implementation that exploits the symmetry of the different matrices in (7)-(10) requires per step 3n^3/2 + n^2(3p + m/2) + n(3p^2/2 + m^2) + p^3/6 "flops" (where 1 flop = 1 multiplication + 1 addition). By not exploiting the symmetry of the matrices in (7)-(10) one requires (n^3/2 + n^2 m/2 + n p^2/2) more flops. In the error analysis, it is this "costly" implementation that is initially denoted as the CKF, for reasons that are explained there. In Section II-E we also give some other variants that lead to further improvements in the number of operations.
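For reference, a minimal sketch of one recursion (7)-(10) in Python/numpy is given below. This is an illustration only; it exploits no symmetry and ignores the flop-count considerations above.

```python
import numpy as np

def ckf_step(x, P, y, u, A, B, C, D, Q, R):
    """One CKF recursion in the covariance form (7)-(10).

    x, P stand for x_hat_{k|k-1} and P_{k|k-1}; the function returns
    x_hat_{k+1|k} and P_{k+1|k}.
    """
    Re = R + C @ P @ C.T                                # (7) innovation covariance
    K = A @ P @ C.T @ np.linalg.inv(Re)                 # (8) Kalman gain
    P_next = A @ P @ A.T - K @ Re @ K.T + B @ Q @ B.T   # (9) Riccati recursion
    x_next = A @ x + D @ u + K @ (y - C @ x)            # (10) predicted estimate
    return x_next, P_next
```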
B. The Square Root Covariance Filter (SRCF)
Square root covariance filters propagate the Choleski factors of the error covariance matrix P_{k|k-1},

P_{k|k-1} = S_k S_k'   (11)

where S_k is chosen to be lower triangular. The computational method is summarized by the following scheme [12]:

( R_k^{1/2}   C_k S_k   0              )         ( (R^e_k)^{1/2}   0        0 )
( 0           A_k S_k   B_k Q_k^{1/2}  ) . U_1 = ( G_k             S_{k+1}  0 )   (12)

x_hat_{k+1|k} = A_k x_hat_{k|k-1} + D_k u_k + G_k (R^e_k)^{-1/2} (y_k - C_k x_hat_{k|k-1})   (13)

with K_k = G_k (R^e_k)^{-1/2}, and where U_1 is an orthogonal transformation that triangularizes the prearray. Such a triangularization can, e.g., be obtained using Householder transformations [13]. This recursion is now initiated with x_hat_{0|-1} and the Choleski factor S_0 of P_{0|-1} as defined in (11).
The number of flops needed for (12) and (13) is 7n^3/6 + n^2(5p/2 + m) + n(p^2 + m^2/2). In order to reduce the amount of work, we only compute here the diagonal elements of the covariance matrix P_{k+1|k}, since usually diag{P_{k+1|k}} carries enough information about the estimate x_hat_{k+1|k} (namely the variance of the individual components). For this reason our operation counts differ, e.g., from those of [6]. In Section II-E we shortly discuss some other variants that lead to further improvements in the number of operations.
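The following sketch (our own illustration, based on the reconstruction of (12)-(13) above) realizes the orthogonal triangularization of the prearray through a QR factorization of its transpose, which numerically plays the role of the Householder transformation U_1.

```python
import numpy as np

def srcf_step(x, S, y, u, A, B, C, D, Q_half, R_half):
    """One SRCF recursion: propagates S_k with P_{k|k-1} = S_k S_k'."""
    n, p, m = A.shape[0], C.shape[0], Q_half.shape[0]
    pre = np.block([[R_half,           C @ S, np.zeros((p, m))],
                    [np.zeros((n, p)), A @ S, B @ Q_half]])
    # LQ factorization via QR of the transpose: pre @ q is lower trapezoidal
    q, r = np.linalg.qr(pre.T)
    post = r.T
    Re_half = post[:p, :p]             # Choleski factor of R^e_k
    G = post[p:, :p]                   # G_k = K_k (R^e_k)^{1/2}
    S_next = post[p:, p:p + n]         # Choleski factor of P_{k+1|k}
    x_next = A @ x + D @ u + G @ np.linalg.solve(Re_half, y - C @ x)
    return x_next, S_next
```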
C. The Chandrasekhar Square Root Filter (CSRF)
If the system model (1), (2) is time-invariant, the SRCF described in Section II-B may be simplified to the Chandrasekhar square root filter, described in [14], [9]. Here one formulates recursions for the increment of the covariance matrix, defined as

inc P_k = P_{k+1|k} - P_{k|k-1}.   (14)

In general, this matrix can be factored as

inc P_k = L_k Sigma L_k'   (15)

where the rank of inc P_k is n_1 + n_2 and Sigma = diag{I_{n_1}, -I_{n_2}} is called its signature matrix. The CSRF propagates recursions for L_k and x_hat_{k+1|k} using [14]:

( R^{e1/2}_{k-1}   C L_{k-1} )         ( R^{e1/2}_k   0   )
( G_{k-1}          A L_{k-1} ) . U_2 = ( G_k          L_k )   (16)

x_hat_{k+1|k} = A x_hat_{k|k-1} + D u_k + G_k (R^e_k)^{-1/2} (y_k - C x_hat_{k|k-1})   (17)

where U_2 is Sigma_e-unitary (i.e., U_2 Sigma_e U_2' = Sigma_e) with

Sigma_e = ( I_p   0     )
          ( 0     Sigma ).

Such transformations are easily constructed using "skew Householder" transformations (using an indefinite Sigma_e-norm) and require as many operations as the classical Householder transformations [14]. (Later, it is noted that numerically they are not always well behaved.) For this implementation the operation count is (n_1 + n_2)(n^2 + 3np + p^2) flops.
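A sketch for the special case Sigma = I (e.g., P_{0|-1} = 0, where inc P_0 = B Q B' is nonnegative definite) follows; in this case Sigma_e = I and the transformation U_2 can again be realized as an ordinary orthogonal (LQ) triangularization. This is our own illustration of the recursion reconstructed in (16), not the paper's mechanization.

```python
import numpy as np

def csrf_init(B, C, Q_half, R):
    """Start of the CSRF for P_{0|-1} = 0: inc P_0 = B Q B' = L_0 L_0'."""
    L = B @ Q_half                            # n x m, so n1 + n2 = m
    Re_half = np.linalg.cholesky(R)           # R^{e1/2}_0 (since C P_0 C' = 0)
    G = np.zeros((B.shape[0], C.shape[0]))    # G_0 = K_0 R^{e1/2}_0 = 0
    return L, G, Re_half

def csrf_step(L, G, Re_half, A, C):
    """One CSRF recursion (16) with Sigma = I, so U_2 is orthogonal."""
    n, p = A.shape[0], C.shape[0]
    pre = np.block([[Re_half, C @ L],
                    [G,       A @ L]])
    q, r = np.linalg.qr(pre.T)                # pre @ q is lower trapezoidal
    post = r.T
    return post[p:, p:], post[p:, :p], post[:p, :p]  # L_k, G_k, R^{e1/2}_k
```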
D. The Square Root Information Filter (SRIF)
The information filter accentuates the recursive least-squares nature of filtering [7], [12]. The SRIF propagates the Choleski factor T_k of the inverse of the error covariance matrix,

P_{k|k}^{-1} = T_k' T_k,   (18)

using the Choleski factors of the inverses of the process- and measurement-noise covariance matrices,

Q_k^{-1} = [Q_k^{-1/2}]' Q_k^{-1/2},  R_k^{-1} = [R_k^{-1/2}]' R_k^{-1/2},

where the right factors are all chosen upper triangular. We now present the Dyer and McReynolds formulation of the SRIF (except for the fact that the time and measurement updates are combined here as in [12]), which differs from the one presented by Bierman (see [7] for details). One recursion of the SRIF algorithm triangularizes, by an orthogonal transformation, a prearray assembled from T_k A_k^{-1}, Q_k^{-1/2}, R_{k+1}^{-1/2} C_{k+1}, and the corresponding right-hand side vectors (19)-(22); the updated factor T_{k+1} and vector z_hat_{k+1|k+1} are read off from the resulting (postarray), and the filtered state estimate is computed by

x_hat_{k+1|k+1} = T_{k+1}^{-1} z_hat_{k+1|k+1}.   (23)

An operation count of this filter is 7n^3/6 + n^2(p + 7m/2) + n(p^2/2 + m^2) flops. Here we did not count the operations needed for the inversion and/or factorization of Q_k, R_k, and A_k (for the time-invariant case, e.g., these are computed only once), and again (as for the SRCF) only the diagonal elements of the information matrix P_{k|k}^{-1} are computed at each step.
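Since T_{k+1} is triangular, the estimate in (23) and the diagonal of the information matrix are cheap to obtain. A small illustration (ours, assuming the upper triangular convention used above):

```python
import numpy as np
from scipy.linalg import solve_triangular

def srif_estimate(T, z):
    """x_hat_{k|k} = T_k^{-1} z_{k|k} via back-substitution, cf. (23)."""
    return solve_triangular(T, z, lower=False)

def information_diagonal(T):
    """diag(P_{k|k}^{-1}) = diag(T_k' T_k): squared column norms of T_k."""
    return np.sum(T * T, axis=0)
```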
E. Efficient Implementations
Variants of the above basic KF implementations have been developed which mainly exploit some particular structure of the given problem in order to reduce the amount of computations; e.g., when the measurement noise covariance matrix R_k is diagonal, it is possible to perform the measurement update in p scalar updates. This is the so-called sequential processing technique, a feature that is exploited by the UDU'-algorithm to operate for the multivariable output case. A similar processing technique for the time update can be formulated when the process noise covariance matrix Q_k is diagonal, which is then exploited in the SRIF algorithm. Notice that no such technique can be used for the CSRF. The UDU'-algorithm also saves operations by using unit triangular factors U and a diagonal matrix D in the updating formulas, for which special versions can then be obtained [7]. By using modified Givens rotations [15] one could also obtain similar savings for the updating of the usual Choleski factors, but these variants are not reported in the sequel.
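As an illustration of the sequential processing idea (a generic sketch of the scalar-update principle, not the UDU' mechanization itself), a measurement update with diagonal R_k reduces to p scalar updates, each involving only a scalar innovation variance:

```python
import numpy as np

def sequential_measurement_update(x, P, y, C, R_diag):
    """Measurement update as p scalar updates (requires diagonal R_k)."""
    for i in range(C.shape[0]):
        c = C[i]                           # i-th output row
        re = R_diag[i] + c @ P @ c         # scalar innovation variance
        k = P @ c / re                     # gain for this single output
        x = x + k * (y[i] - c @ x)
        P = P - np.outer(k, c @ P)         # rank-one covariance update
    return x, P
```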
For the time-invariant case, the matrix multiplications and transformations that characterize the described KF implementations can be made more efficient when the system matrices {A, B, C} are first transformed by unitary similarity transformations to so-called condensed form, whereby these system matrices {A_t, B_t, C_t} contain many zeros. From the point of view of reliability, these forms are particularly interesting here because no loss of accuracy is incurred by these unitary similarity transformations [10]. The following types of condensed forms can be used to obtain considerable savings in computation time in the subsequent filter recursions [10]: the Schur form, where A_t is in upper or lower Schur form; the observer-Hessenberg form, where the compound matrix (A_t', C_t')' is upper trapezoidal; and the controller-Hessenberg form, where the compound matrix (A_t, B_t) is upper trapezoidal. In [10], an application is considered where these efficient implementations are also valid for the time-varying case. Note that the use of condensed forms and "sequential processing" could very well be combined to yield even faster implementations.
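For instance, the Schur-form condensation can be sketched as follows (our illustration using scipy; the orthogonal transformation U is computed once, offline):

```python
import numpy as np
from scipy.linalg import schur

def to_schur_condensed(A, B, C):
    """Unitary similarity to a condensed (real Schur) form, cf. [10].

    A_t = U' A U is quasi upper triangular (2x2 bumps may remain for
    complex eigenvalue pairs), B_t = U' B and C_t = C U.  Since U is
    orthogonal, no accuracy is lost in the transformation itself.
    """
    A_t, U = schur(A, output='real')       # A = U @ A_t @ U.T
    return A_t, U.T @ B, C @ U, U
```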
The operation counts for particular mechanizations of these variants are all given in Table I and indicated by, respectively, the "seq.," "Schur," "o-Hess.," and "c-Hess." abbreviations, while "full" refers to the implementations described in the previous sections.

TABLE I
OPERATION COUNTS FOR THE DIFFERENT KF's

CKF   full:    (3/2)n^3 + n^2(3p + m/2) + n(3p^2/2 + m^2) + p^3/6
CKF   seq.:    (3/2)n^3 + n^2(3p + m/2) + n(p^2 + m^2)
CKF   Schur:   (3/4)n^3 + n^2(5p/2 + m/2) + n(3p^2/2 + m^2) + p^3/6
CKF   o-Hess.: (3/4)n^3 + n^2(7p/2 + m/2) + n(2p^2 + m^2) + p^3/6
SRCF  full:    (7/6)n^3 + n^2(5p/2 + m) + n(p^2 + m^2/2)
SRCF  seq.:    (7/6)n^3 + n^2(5p/2 + m) + n(m^2/2)
SRCF  o-Hess.: (1/6)n^3 + n^2(5p/2 + m) + n(2p^2)
SRIF  c-Hess.: (7/6)n^3 + n^2(p + 7m/2) + n(m^2/2)
CSRF  full:    (n_1 + n_2)(n^2 + 3np + p^2)

(Here the full implementation of the CKF exploits symmetry.)
III. ERROR ANALYSIS

In this section we analyze the effect of rounding errors on Kalman filtering in the four different implementations described above. The analysis is split in three parts: 1) what bounds can be obtained for the errors performed in step k; 2) how do errors performed in step k propagate in subsequent steps; and 3) how do errors performed in different steps interact and accumulate. Although this appears to be the logical order in which one should treat the problem of error buildup in the KF, we first look at the second aspect, which is also the only one that has been studied in the literature so far. Therefore, we first need the following lemma, which is easily proved by inspection.
Lemma 1: Let A be a square nonsingular matrix with smallest singular value sigma_min, and let E be a perturbation of the order of delta = ||E||_2 << sigma_min(A), with ||.||_2 denoting the 2-norm. Then

(A + E)^{-1} = A^{-1} + Delta_1 = A^{-1} - A^{-1} E A^{-1} + Delta_2   (24)

where

||Delta_1||_2 <= delta / [sigma_min (sigma_min - delta)] = O(delta)   (25)
||Delta_2||_2 <= delta^2 / [sigma_min^2 (sigma_min - delta)] = O(delta^2).   (26)

Notice that when A and E are symmetric, these first- and second-order approximations (25) and (26) are also symmetric.
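A quick numerical check of Lemma 1 (ours, not in the paper) confirms the two orders of magnitude:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5)) + 5 * np.eye(5)   # safely nonsingular
E = 1e-4 * rng.standard_normal((5, 5))            # delta of order 1e-4

Ainv = np.linalg.inv(A)
D1 = np.linalg.inv(A + E) - Ainv                  # Delta_1 in (24): O(delta)
D2 = D1 + Ainv @ E @ Ainv                         # Delta_2 in (24): O(delta^2)
print(np.linalg.norm(D1, 2), np.linalg.norm(D2, 2))
```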
We now thus consider the propagation of errors from step k to step k + 1 when no additional errors are performed during that update. We denote the quantities computed in finite precision with an upper bar, i.e., P-bar_{k|k-1}, x-bar_{k|k-1}, G-bar_k, S-bar_k, T-bar_k, R-bar^{e1/2}_k, K-bar_k, or L-bar_k, depending on the algorithm.

For the CKF, let deltaP_{k|k-1} and deltax_hat_{k|k-1} be the accumulated errors in step k; then

P-bar_{k|k-1} = P_{k|k-1} + deltaP_{k|k-1},  x-bar_{k|k-1} = x_hat_{k|k-1} + deltax_hat_{k|k-1}.   (27)

By using Lemma 1 for the inverse of R-bar^e_k = R^e_k + C_k deltaP_{k|k-1} C_k', we find

[R-bar^e_k]^{-1} = [R^e_k]^{-1} - [R^e_k]^{-1} C_k deltaP_{k|k-1} C_k' [R^e_k]^{-1} + O(delta^2).   (28)
From this, one then derives

deltaK_k = A_k deltaP_{k|k-1} C_k' R_k^{e-1} - A_k P_{k|k-1} C_k' R_k^{e-1} C_k deltaP_{k|k-1} C_k' R_k^{e-1} + O(delta^2)
         = F_k deltaP_{k|k-1} C_k' R_k^{e-1} + O(delta^2)   (29)

where

F_k = A_k (I - P_{k|k-1} C_k' R_k^{e-1} C_k) = A_k - K_k C_k   (30)

since K_k = A_k P_{k|k-1} C_k' R_k^{e-1}. Analogous first-order expressions (31)-(34) are then obtained for the propagated errors deltaP_{k+1|k}, deltax_hat_{k+1|k}, deltaP_{k+1|k+1}, and deltax_hat_{k+1|k+1}. We thus find that when deltaP_{k|k-1} or deltaP_{k|k} is symmetric, only the first term in (31) or (33) remains and the error propagation behaves roughly as

||deltaP_{k+1|k}||_2 = ||F_k||_2^2 ||deltaP_{k|k-1}||_2 = gamma_k^2 ||deltaP_{k|k-1}||_2   (35)
||deltaP_{k+1|k+1}||_2 = ||F-bar_k||_2^2 ||deltaP_{k|k}||_2 = gamma-bar_k^2 ||deltaP_{k|k}||_2   (36)

which are decreasing in time when F_k and F-bar_k are contractions (i.e., when gamma_k and gamma-bar_k < 1). The latter is usually the case when the matrices A_k, B_k, C_k, Q_k, and R_k do not vary too wildly in time [12]. Here F-bar_k = (I - P_{k+1|k} C_{k+1}' R^{e-1}_{k+1} C_{k+1}) A_k has the same spectrum as F_{k+1} in the time-invariant case, since F_{k+1} A_k = A_{k+1} F-bar_k [16]. For the time-invariant case one can improve on this by saying that F_k and F-bar_k tend to the constant matrices F_inf and F-bar_inf, respectively, with (equal) spectral radius rho_inf < 1, and one then has for some appropriate matrix norm [17]:

||deltaP_{k+1|k}|| = rho_inf^2 ||deltaP_{k|k-1}||   (37)
||deltaP_{k+1|k+1}|| = rho_inf^2 ||deltaP_{k|k}||   (38)

for sufficiently large k. Notice that rho_inf is smaller than gamma_inf or gamma-bar_inf; hence, (37), (38) are better bounds than (35), (36). Using this, it then also follows from (37), (38) that all three errors deltaP_{k|k-1}, deltaK_k, and deltax_hat_{k|k-1} are decreasing in time when no additional errors are performed. The fact that past errors are weighted in such a manner is the main reason why many Kalman filters do not diverge in presence of rounding errors.

The property (35)-(38) was already observed before [2], but for symmetric deltaP_{k|k-1}. However, if symmetry is removed, divergence may occur when A_k (i.e., the original plant) is unstable. Indeed, from (31), (33) we see that when A_k is unstable the larger part of the error is skew symmetric:

DeltaP_{k+1|k} = A_k (deltaP_{k|k-1} - deltaP'_{k|k-1}) A_k'   (39)
DeltaP_{k+1|k+1} = A_k (deltaP_{k|k} - deltaP'_{k|k}) A_k'   (40)

and the lack of symmetry diverges as k increases. This phenomenon is well known in the extensive literature about Kalman filtering, and experimental experience has led to a number of different "remedies" to overcome it. The above first-order perturbation analysis in fact explains why they work.
1) A first method to avoid divergence due to the loss of symmetry when A_k is unstable is to symmetrize P_{k|k-1} or P_{k|k} at each recursion of the CKF by averaging it with its transpose. This makes the errors on P symmetric, and hence the largest terms in (31), (33) disappear!
2) A second method to make the errors on P symmetric simply computes only the upper (or lower) triangular part of these matrices, as indicated by the implementation in Table I.
3) A third technique to avoid the loss of symmetry is the so-called (Joseph's) stabilized KF [3]. In this implementation, the set of equations for updating P is rearranged as follows:

P_{k+1|k} = F_k P_{k|k-1} F_k' + K_k R_k K_k' + B_k Q_k B_k'.   (41)

A similar first-order perturbation study as for the CKF above shows that no symmetrization is required in order to avoid divergence, since here the error propagation model becomes

deltaP_{k+1|k} = F_k deltaP_{k|k-1} F_k' + O(delta^2)   (42)

where there are no terms anymore related to the loss of symmetry.
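The remedies are easy to state in code. The sketch below (ours, in the notation of (7)-(10)) contrasts the plain update (9) followed by symmetrization with Joseph's form (41), which propagates symmetric errors by construction:

```python
import numpy as np

def riccati_updates(A, C, K, Re, R, BQB, P):
    """Plain update (9) with symmetrization, and Joseph's form (41).

    K and Re are the Kalman gain and innovation covariance of the
    current step; BQB stands for B_k Q_k B_k'.
    """
    P_plain = A @ P @ A.T - K @ Re @ K.T + BQB
    P_sym = 0.5 * (P_plain + P_plain.T)          # remedy 1: average with transpose

    F = A - K @ C                                # filter transition matrix (30)
    P_joseph = F @ P @ F.T + K @ R @ K.T + BQB   # remedy 3: Joseph form (41)
    return P_sym, P_joseph
```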
Since for the moment we assume that no additional errors are performed in the recursions, one inherently computes the same equations for the SRCF as for the CKF. Therefore, starting with errors deltaS_k and deltax_hat_{k|k-1}, (29), (31), (32), (35), and (37) still hold, whereby now

deltaP_{k|k-1} = S_k deltaS_k' + deltaS_k S_k' + deltaS_k deltaS_k'   (43)

is clearly symmetric by construction. According to (31) this now ensures the convergence to zero of deltaP_{k|k-1}, and hence of deltaK_k and deltax_hat_{k|k-1}, if gamma_k is sufficiently bounded in the time-varying case.
For the SRIF we start with errors deltaT_k and deltaz_hat_{k|k} and use the identities

deltaP_{k|k}^{-1} = T_k' deltaT_k + deltaT_k' T_k + deltaT_k' deltaT_k   (44)
deltax_hat_{k|k} = (T_k + deltaT_k)^{-1} deltaz_hat_{k|k}   (45)

to relate this problem to the CKF as well. Here one apparently does not compute x_hat_{k+1|k+1} from x_hat_{k|k}, and therefore one would expect no propagation of errors between them. Yet, such a propagation is present via the relation (45) with the errors on deltaz_hat_{k+1|k+1} and deltaz_hat_{k|k}, which do propagate from one step to another. This in fact is reflected in the recurrence (34) derived earlier.
Since the SRIF update is inherently equivalent to an update of P_{k|k} and x_hat_{k|k} as in the CKF, the equations (33), (36) still hold, where now the symmetry of deltaP_{k|k} is ensured because of (44). From this it follows that deltaP_{k|k} and deltax_hat_{k|k}, and therefore also deltaT_k and deltaz_hat_{k|k}, converge to zero as k increases, provided gamma-bar_k is sufficiently bounded in the time-varying case.
Finally, for the CSRF we start with errors deltaL_{k-1}, deltaG_{k-1}, deltaR^{e1/2}_{k-1}, and deltax_hat_{k|k-1}. Because of these errors, (16) is perturbed exactly as follows:

( R^{e1/2}_{k-1} + deltaR^{e1/2}_{k-1}   C (L_{k-1} + deltaL_{k-1}) )
( G_{k-1} + deltaG_{k-1}                 A (L_{k-1} + deltaL_{k-1}) ) . U-bar_2   (46)

where U-bar_2 is also Sigma_e-unitary. When lambda = ||C L_{k-1}|| / ||R^{e1/2}_{k-1}|| is small (which is satisfied when k is sufficiently large), Lemma A.3 yields after some manipulations the perturbed postarray (47). Now the (1,1) and (1,2) blocks of U-bar_2 are easily checked to be given by R^{e-1/2}_{k-1} R^{e1/2}_k and R^{e-1/2}_{k-1} C L_{k-1} Sigma, respectively. From this, one then derives that for k sufficiently large

deltaR^{e1/2}_k = deltaR^{e1/2}_{k-1} [R^{e-1/2}_{k-1} R^{e1/2}_k]' + C deltaL_{k-1} [R^{e-1/2}_{k-1} C L_{k-1} Sigma]' + O(delta lambda)
                = deltaR^{e1/2}_{k-1} [R^{e-1/2}_{k-1} R^{e1/2}_k]' + O(delta lambda)   (48)

deltaG_k = deltaG_{k-1} [R^{e-1/2}_{k-1} R^{e1/2}_k]' + A deltaL_{k-1} [R^{e-1/2}_{k-1} C L_{k-1} Sigma]' + O(delta lambda)
         = deltaG_{k-1} [R^{e-1/2}_{k-1} R^{e1/2}_k]' + O(delta lambda).   (49)

Here again, thus, the errors deltaR^{e1/2}_k and deltaG_{k-1} are multiplied by the matrix [R^{e-1/2}_{k-1} R^{e1/2}_k]' at each step. When Sigma is the identity matrix (i.e., when inc P_k is nonnegative) this is a contraction, since R^e_k = R^e_{k-1} + C L_{k-1} L'_{k-1} C'. From this, we then derive similar formulas for the propagation of deltaK_k and deltax_hat_{k+1|k}. Using Lemma 1 for the perturbation of the inverse in K_k = G_k R_k^{e-1/2}, we find

deltaK_k = deltaG_k R_k^{e-1/2} - G_k R_k^{e-1/2} deltaR^{e1/2}_k R_k^{e-1/2} + O(delta^2)
         = deltaG_k R_k^{e-1/2} - K_k deltaR^{e1/2}_k R_k^{e-1/2} + O(delta^2).   (50)
Using (49), (50) and the fact that for large k, K_k = K_{k-1} + O(lambda), we then obtain the recurrence (51), which suggests that the inherent decaying of errors performed in previous steps will be less apparent for this filter. Besides that, nothing is claimed about deltaL_k or deltaP_{k+1|k}, but apparently these are less important for this implementation of the KF since they do not directly affect the precision of the estimate x_hat_{k+1|k}. Moreover, when Sigma is not the identity matrix, the above matrix has norm larger than 1 and divergence may be expected. This has also been observed experimentally, as shown in Section IV.
We now turn our attention to the numerical errors performed in one single step k. Bounds for these errors are derived in the following theorem.

Theorem 1: Denote the norms of the absolute errors due to roundoff during the construction of P_{k+1|k}, K_k, x_hat_{k+1|k}, S_k, T_k, P^{-1}_{k+1|k+1}, and z_hat_{k|k} by Delta_P, Delta_K, Delta_x, Delta_S, Delta_T, Delta_Pinv, and Delta_z, respectively. Upper bounds for these local errors (all norms being 2-norms) are then derived for each of the four implementations: 1) the CKF; 2) the SRCF; 3) the CSRF; and 4) the SRIF.
Combining these local bounds with the propagation results derived above, one obtains for the CKF accumulated error bounds of the type (57), (58), in which the total errors Delta_tot P_{k|k-1}, Delta_tot K_k, and Delta_tot x_hat_{k|k-1} are bounded by weighted sums of the local errors Delta_P, Delta_K, and Delta_x, with weights of the order 1/(1 - gamma_hat) and 1/(1 - gamma_hat^2), if gamma_hat < 1, where gamma_hat is the largest of the gamma_k's. When gamma_k tends to a fixed value gamma_inf, it is easily shown that gamma_hat can be replaced by gamma_inf in (58), since the contributing terms to the summation are those with growing index k. For a time-invariant system, finally, this can then be replaced by rho_inf, as was remarked earlier, and the condition gamma_hat = rho_inf < 1 is then always satisfied.
For the SRCF, one uses the relation to the CKF (as far as the propagation of errors from one step to another is concerned) to derive (58) in an analogous fashion, but now with Delta_P, Delta_K, and Delta_x appropriately adapted for the SRCF as in Theorem 1. For the SRIF one also obtains analogously the top and bottom inequalities of (57) for Delta_P and Delta_x adapted for the SRIF as in Theorem 1, and where now gamma_hat is the largest of the gamma-bar_k's. Upon convergence, the same remarks hold as above for replacing gamma_hat by gamma-bar_inf and rho_inf.
Finally, for the CSRF, we can only derive from (52), (53) a recursion of the type

( Delta_tot K_k            )    ( beta_k   0       )   ( Delta_tot K_{k-1}        )   ( Delta_K )
(                          ) <= (                  ) . (                          ) + (         )   (59)
( Delta_tot x_hat_{k+1|k}  )    ( c        gamma_k )   ( Delta_tot x_hat_{k|k-1}  )   ( Delta_x )

where beta_k = ||R^{e1/2}_{k-1} R_k^{e-1/2}||_2. Recursive summation of these inequalities, as was done to obtain (58), only converges here (for both Delta_tot K_k and Delta_tot x_hat_{k+1|k}) when the beta_k increase sufficiently slowly to 1 as k grows. We remark here that these are only upper bounds (just as the bounds for the other filters), but the fact that they may diverge does indeed indicate that for the CSRF numerical problems are more likely to occur.

Notice that the first-order analysis of this section collapses when O(delta^2) and O(delta) errors become comparable. According to Lemma 1, this happens when kappa(R^e_k) = 1/delta, but in such cases it is highly probable that divergence will occur for all filter implementations.
IV. EXPERIMENTAL EVALUATION OF THE DIFFERENT KF's
In this section we show a series of experiments reflecting the results of our error analysis. For these examples the upper bounds for numerical roundoff developed in the previous section are reasonably close to the true error buildup.
A. Experimental Setup
The simulations are performed for a realistic flight-path reconstruction problem, described in [10]. The numerical difficulties observed in a preliminary experimental analysis with the CKF [11] showed that this case study is ideally suited to validate the theoretical analysis of Section III. Reversely, it demonstrates how this first-order perturbation study contributes in understanding and solving these difficulties. In order to shed more light on the trouble spots of some of the filters, we have "artificially" modified the realistic conditions of our problem (see Table II). We then show that the behavior of the different filters can be predicted by the error analysis of Section III. This analysis indicated the following parameters as being relevant for the error propagation in the four different KF implementations we considered.
1) The initial condition P_{0|-1} for the error covariance matrix.
2) The condition number kappa(R^e_k) of the innovation signal covariance matrix. In our example, it turns out that kappa(R_k) approximately determines kappa(R^e_k) during the whole run. This is partly due to the fact that ||P_{k|k-1}|| is small compared to ||R_k||.
3) The spectral norm gamma_k and radius rho_k of the matrix F_k. This can be affected by "weighting" of the system matrix A_k by a scalar factor.
4) The condition number kappa(Q_k) of the process noise covariance matrix.
5) The condition number kappa(A_k) of the system state transition matrix. This is affected by the choice of a state-space coordinate system.
6) The condition number kappa(T_k) of the Choleski factor of the (inverse) error covariance matrix. This parameter is hard to estimate a priori.

These are also the parameters we tried to influence in our experimental setup as given in Table II.
To study roundoff errors in single precision, mixed precision computations were carried out and double precision results are considered to be exact. The roundoff errors on three different quantities that result from a KF were considered in the simulations, namely:

1) on the state error covariance matrix P, denoted by Delta_tot P_{k|k-1} = ||P-bar_{k|k-1} - P_{k|k-1}|| = ||delta_tot P_{k|k-1}||;
2) on the Kalman gain K, denoted by Delta_tot K_k = ||K-bar_k - K_k||;
3) on the reconstructed state quantities x_hat_{k|k-1} or x_hat_{k|k}, denoted by Delta_tot x_hat_k = ||x-bar_{k|k-1} - x_hat_{k|k-1}|| or ||x-bar_{k|k} - x_hat_{k|k}||.
In the experiments, the total roundoff error Delta_tot in (57) and (59) is approximated by the Frobenius norm of the difference between the single and double precision quantities, which are, respectively, denoted by a bar and by no accent. For the state error covariance matrix P_{k|k-1} this approximation becomes

Delta_tot P_{k|k-1} = ||P-bar_{k|k-1} - P_{k|k-1}||_F = ||delta_tot P_{k|k-1}||_F.

It is noted that the SRIF does not require the Kalman gain K_k explicitly to compute the filtered state quantities. Therefore, the second parameter will not be considered for this implementation.
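The error measurement itself is straightforward; a sketch (ours) of the mixed-precision bookkeeping:

```python
import numpy as np

def total_roundoff_error(P_single, P_double):
    """Frobenius-norm distance between the single precision quantity and
    the double precision one, the latter being treated as exact."""
    return np.linalg.norm(P_single.astype(np.float64) - P_double, 'fro')
```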
Since the accuracy of the first two quantities determines the accuracy of the reconstructed state, a first analysis can be restricted to these quantities. If conditions can be formulated under which accuracy degradation of these two quantities occurs, extensive simulation tests with input and output time histories of the real (or simulated) system become obsolete.

Because of the inclusion of the CSRF, only the time-invariant case will be considered here. The SRCF and the SRIF algorithms are closely related from a numerical point of view. They are, therefore, first compared to the CKF and second to the CSRF.
B. Comparing the SRCF/SRIF with the CKF
The experimental conditions of the different tests are listed in Table II. From the theoretical analysis of Section III it follows that the relevant parameters that influence the reliability of the CKF are kappa(R^e_k) and rho(F_k), the spectral radius of F_k. The magnitudes of the variables kappa(R^e_k) and rho(F_k) are very close to the values of kappa(R) and rho(A) given in Table II. Two tests were performed to analyze their effect. The results of these tests are plotted in Fig. 1. From this figure the following observations are made.
1) Test 1 - Fig. 1(a) (rho(A) = 1.0 and kappa(R) = 10^2): Since symmetry of the error state covariance matrix P is not preserved by the CKF, the roundoff error propagation model for the local error deltaP_{k|k-1} given by (31) shows that divergence of roundoff errors on P, and hence on K, will occur if the original system is unstable. This experiment confirms this divergence phenomenon also when rho(A) = 1.0, as is the case for the considered flight-path reconstruction problem [10]. Furthermore, it is observed from Fig. 1(a) that the error on P with the CKF is almost completely determined by the loss of symmetry, computed by ||P-bar_{k|k-1} - P-bar'_{k|k-1}|| = Delta_sym P_{k|k-1}.

As indicated in the previous section, different methods have been proposed to solve this problem. One particular class of methods consists of forcing the error on the state covariance matrix to become symmetric, which is done here by averaging the off-diagonal elements of P after each recursion. The behavior of Delta_tot x_hat_{k|k-1} for this implementation, denoted by CKF(s) in Fig. 1(a), clearly indicates that this implementation becomes again competitive, even when the original system is unstable. A similar effect has also been observed when computing only the upper triangular part of P. On the other hand, the behavior of Delta_tot P_{k|k-1} for Joseph's stabilized CKF, denoted by (J)CKF in Fig. 1(a), confirms that the roundoff errors do not diverge even when the symmetry of P is not retained. We also observe from Fig. 1(a) that the roundoff error on P with these modified CKF's remains higher (a factor 10) than with the SRCF/SRIF.

2) Test 2 - Fig. 1(b) (rho(A) = 0.9 and kappa(R) = 10^2): If we make the original system stable, the CKF is numerically stable. Moreover, the accuracy with which the Kalman gain is computed is of the same order as that of the SRCF. This is in contrast with a general opinion that SRF's improve the calculations of the Kalman gain or filtered estimates [6], [3]. We can state that they do not make accuracy poorer. From Fig. 1(b) it is observed that only the error covariance matrix P is computed more accurately, which confirms the upper bounds for the roundoff errors as given in Section III. Summarizing these bounds, we obtained in Section III the following recurrences:

Delta_tot P_{k+1|k} <= gamma_k^2 Delta_tot P_{k|k-1} + Delta_P   (60)
Delta_tot K_k <= c gamma_k Delta_tot P_{k|k-1} + Delta_K   (61)

where the upper bounds for the local errors Delta_P and Delta_K are given in Theorem 1, parts 1 and 2, for the CKF and the SRCF. A comparison of the (a) and (b) bounds indicates that when the accuracy of the Kalman gain is considered, no preference should exist for the SRF's over the CKF when A_k is stable and time-invariant. (For situations where A_k has eigenvalues on or outside the unit circle, the CKF has to be changed, e.g., to the CKF(s) implementation.) However, the experimental results demonstrate that for the latter conditions the loss of accuracy with a CKF(s) is still higher than with the SRF's. This is also generally observed for other SRF variants such as the UDU' filters [7]. Here we only want to draw attention to the clear difference to be expected (and also reflected by the experiments) between the accuracy of P_{k|k-1} and K_k in the CKF(s) implementation with respect to those of the SRF filters.

TABLE II: TEST CONDITIONS TO EVALUATE THE DIFFERENT KF IMPLEMENTATIONS

Fig. 1. Comparison of the SRCF/SRIF and the CKF, (a)-(b).
C. Comparison SRCF/SRIF with CSRF
The upper bounds for the roundoff errors of the Kalman gain and the state estimate x_hat_{k+1|k} computed by the CSRF (for large k) can be summarized as follows:

Delta_tot K_k <= beta_k Delta_tot K_{k-1} + Delta_K   (62)
Delta_tot x_hat_{k+1|k} <= c Delta_tot K_{k-1} + gamma_k Delta_tot x_hat_{k|k-1} + Delta_x   (63)

with the upper bounds for the local errors Delta_K and Delta_x given in Theorem 1, part 3. This model indicates that the error propagation is convergent when beta_k = ||R^{e1/2}_{k-1} (R^e_k)^{-1/2}|| < 1, which is the case only if the signature matrix Sigma is the identity matrix I. Note that the error variation Delta_tot K_k is now weighted by beta_k (instead of gamma_k for the other filters), which even for Sigma = I becomes very close to 1 for large k. This is also the main reason for the poor numerical behavior of this filter. When Sigma != I (which depends on the choice of P_{0|-1} != 0), beta_k is larger than 1 and kappa(U_2) may also become large. Both these phenomena have a negative influence on the above bounds and may eventually cause divergence. In addition to the numerical sensitivity introduced by the choice of P_{0|-1}, it also influences the efficiency of the CSRF implementation. This is indicated by the parameter (n_1 + n_2) in Table I, which lies in the interval [min(m, p), n]. The influence of the choice of P_{0|-1} is analyzed by the following two tests.
1) Test 3 - Fig. 2(a) (P_{0|-1} != 0, rho(A) = 1.0 and kappa(R) = 1.0): The choice of P_{0|-1} != 0 influences the CSRF implementation negatively. First, in this experiment the computational efficiency decreases in comparison to the case P_{0|-1} = 0, discussed in the following test. This is because (n_1 + n_2) in Table I becomes greater than p or m. This was the case for all the tests performed with P_{0|-1} != 0. Second, the transformations used in each recursion to triangularize the prearray become Sigma_e-unitary, i.e., having a condition number > 1. This is due to the fact that inc P_0 is not definite. From Fig. 2(a) this negative effect is clearly observed. Both the error levels on P and K are a factor 10^2 larger than for the SRCF or SRIF. For the covariance type algorithms considered here, it is observed that the error on the Kalman gain is always higher than the error on the state error covariance matrix. This is partly due to the extra calculation G_k(R^e_k)^{-1/2} needed for the Kalman gain, where the condition number of (R^e_k)^{1/2} determines the loss of accuracy.
2) Test 4 - Fig. 2(b) (P_{0|-1} = 0, rho(A) = 1.0 and kappa(R) = 1.0): For this case inc P_0 = B Q B' is positive definite, causing the transformations used in each recursion to be unitary. On the other hand, (n_1 + n_2) in Table I is equal to m, which makes the CSRF "slightly" more efficient compared to the new SRCF/SRIF implementation based on the "condensed" system representations.

Fig. 2. Comparison of the SRCF/SRIF and the CSRF, (a) Test 3, (b) Test 4.
From the experimental results in Fig. 2(b) we observe that the error on P is very small, while the error on K is much higher than for the SRCF calculations. Furthermore, the errors on K with the CSRF increase very slowly because the coefficient beta_k becomes very close to 1. This is due to the fact that for the CSRF roundoff errors are carried along on three matrices, namely G_k, (R^e_k)^{1/2}, and L_k, while for the SRCF/SRIF errors are carried along only on the square roots of P or P^{-1}. For the error on L_k (supposing inc P_k factored as L_k L_k') this effect does not cause the errors on P_{k|k-1},

P_k = sum_{i=0}^{k-1} L_i L_i' + P_0,   (64)

to accumulate, because L_k converges rapidly enough to zero such that the accumulated error on P_k,

Delta_tot P_k = sum_{i=0}^{k-1} (L_i delta_tot L_i' + delta_tot L_i L_i'),   (65)

also converges if the Delta_tot L_i are not too large.
The absolute value of the total errors on (R^e_k)^{1/2} and G_k remains much higher. This is clearly reflected in the loss of accuracy in the calculation of K_k by G_k(R^e_k)^{-1/2}.

Generally, the CSRF is less reliable than the SRCF/SRIF combination. Only for zero initial conditions of the state error covariance matrix can maximal reliability be achieved with the CSRF. Therefore, for situations where n >> m, the CSRF may be preferred because of its increased computational efficiency despite its loss of accuracy. We stress the fact that this property is only valid for the time-invariant case. Modifications of the CSRF exist taking into account certain time-varying effects [15], e.g., for the process noise covariance matrix Q. This, however, induces again an increased computational complexity.

D. Comparison of the SRCF and the SRIF
In the previous experiments the SRCF/SRIF combination performed equally well. In this section a further analysis is made to compare both implementations. Using the error model that indicates the upper bound for the roundoff errors made during one SRIF recursion,

Delta_tot P_{k+1|k+1} <= gamma-bar_k^2 Delta_tot P_{k|k} + Delta_P   (66)
Delta_tot x_hat_{k+1|k+1} <= gamma-bar_k Delta_tot x_hat_{k|k} + c kappa(T_k) Delta_tot P_{k|k} + Delta_x   (67)

with the upper bounds of the local errors Delta_P and Delta_x given in Theorem 1, part 4, it follows that besides kappa(T_k) and rho(F-bar_k) other system parameters influence the roundoff error accumulation in the SRIF. The effect of these parameters is analyzed in the following tests.
1) Test 5 - Fig. 3(a): In this test very large condition numbers for A, Q, and R (see Table II) are considered. As expected, this indeed causes the error on P to be much higher (a factor 10^3) for the SRIF than for the SRCF. As in test 2, the large value of kappa(R) again causes a great loss in the accuracy of the Kalman gain calculation in the SRCF. The level of roundoff errors on K indeed becomes a factor 10^2 larger than the roundoff level of P.

In this test we analyzed the deterioration of the error covariance matrix by the SRIF implementation by (fairly unrealistic) large condition numbers. In many practical situations, the effect of high kappa(Q_k) and kappa(R_k) can be relaxed by scaling, rearranging the system matrices, or using scalar measurement and/or input updates [12]. Furthermore, we observed in the experiments that a high kappa(Q_k) did not result in a high kappa(T_k), which is in contrast with what was observed for kappa(R_k). However, the effect of a high kappa(A_k) is much harder to control and, as we have seen, may influence the accuracy of the SRIF negatively. We repeat here that this is due to a careful choice of the problem coefficients [here kappa(A_k) and kappa(Q_k)] in order to put forward the dependency on these parameters.
2) Test 6 - Fig. 3(b): For this test, the measurement error statistics were taken from real flight-test measurement calibrations [16]. This results in the following forms for the process noise covariance matrix Q, respectively, the measurement noise covariance matrix R:

Q = diag{8e-6, 5e-7, 5e-8},  R = diag{5e-2, 2e-1}.   (68)

The relevant parameters for the roundoff error propagation are listed in Table II. In Fig. 3(b) the simulated error Delta_tot x_hat on the state calculations is plotted for both filter implementations. Here, the error level with the SRIF is significantly higher than that for the SRCF, while P is computed with roughly equal accuracy. This is due to the high condition number of T_k (obtained by the test conditions given in Table II) in the calculation of the filtered state with the SRIF by (23).
V. COMPARISON OF THE DIFFERENT FILTERS
In this section we compare the different filter implementations based on the error analysis of Section III, strengthened by the simulation study of Section IV and the complexity analysis of Section II-E.

We first look at the time-varying case (hence excluding the CSRF). According to the error bounds of Theorem 1, it appears that the SRCF has the lowest estimate for the local errors generated in a single step k. The accumulated errors during subsequent steps are governed by the norms gamma_k for all three filters in a similar fashion (at least for the error on the estimate); this of course under the assumption that a "symmetrized" version of the CKF or the stabilized CKF is considered. From these modifications, the implementation computing only the upper (or lower) triangular part of the state error covariance matrix is the most efficient. The experiments of Section IV with the realistic flight-path reconstruction problem indeed demonstrate that the CKF, the SRCF, and the SRIF seem to yield a comparable accuracy for the estimates x_hat_{k+1|k} or x_hat_{k+1|k+1}, unless some of the "influential" parameters in the error bounds of Theorem 1 become critical. This is, e.g., true for the SRIF, which is likely to give worse results when choosing matrices A_k, R_k, or Q_k that are hard to invert.
Fig. 3. Comparison of the SRCF and the SRIF, (a) Test 5, (b) Test 6.

As far as R_k or Q_k is concerned, this is in a sense an artificial disadvantage, since in some situations the inverses R_k^{-1} and Q_k^{-1} are the given data and the matrices R_k and Q_k have then to be computed. This then would of course disadvantage the SRCF. In [23] it is shown that the problems of inverting covariances can always be bypassed as well for the SRIF as for the SRCF. The problem of inverting A_k, on the other hand, is always present in the SRIF.
For the computational cost, the SRCF/SRIF have a marginal advantage over the CKF when n is significantly larger than m and p (which is a reasonable assumption in general), even when computing the upper (or lower) triangular part of P with the CKF. Moreover, preference should go to the SRCF (respectively, SRIF) when p < m (respectively, p > m), with a slight preference for the SRCF when p = m. As is shown in [10], [14], condensed forms or even the CSRF can sometimes be used in the time-varying case as well, when, e.g., only some of the matrices are time-varying or when the variations are structured. In that case the latter two may yield significant savings in computing time. Similarly, considerable savings can be obtained by using sequential processing [7] when diagonal covariances are being treated (which is often the case in practice).
For the time-invariant case, the same comments as above hold for the accuracy of the CKF, SRCF, and SRIF. The fourth candidate, the CSRF, has in general a much poorer accuracy than the other three. This is now not due to pathologically chosen parameters, but to the simple fact that the accumulation of rounding errors from one step to another is usually much more significant than for the three other filters, as was pointed out in Section III. This is particularly the case when the signature matrix Sigma is not the identity matrix, which may then lead to divergence, as shown experimentally in Section IV.
As for the complexity, the Hessenberg forms of the SRCF and the SRIF seem to be the most appealing candidates, except when the coefficient n_1 + n_2 in Table I for the CSRF is much smaller than n. This is, e.g., the case when the initial covariance P_{0|-1} is zero, in which case the CSRF becomes the fastest of all four filters. Although the Schur implementations of the SRCF and SRIF are almost as fast as the Hessenberg implementations, they also have the small additional disadvantage that the original state-space transformation U for condensing the model to Schur form is more expensive than that for the other condensed forms, and that the (real) Schur form is not always exactly triangular but may contain some 2 x 2 "bumps" on the diagonal (corresponding to complex eigenvalues of the real matrix A). Finally, the c-Hess. form of the SRIF (given in Table I) requires a more complex initial transformation U, since it is constructed from the pair (A^{-1}, A^{-1}B), which also may be numerically more delicate due to the inversion of A.
As a general conclusion, we recommend the SRCF, and its observer-Hessenberg implementation in the time-invariant case, as the optimal choice of KF implementation because of its good balance of reliability and efficiency. Other choices may of course be preferable in some specific cases because of special conditions that would then be satisfied.
VI. CONCLUDING REMARKS
In this paper we have analyzed four different KF algorithms and some new variants for their reliability and computational efficiency. We note here that our implementations may differ substantially from similarly named algorithms described in [7]. The comparison is based on an error analysis and an operation count where we have made full use of possible savings in the time-invariant case.

From the error models a better insight is also obtained about which parameters influence the error propagation in the different KF algorithms that have been investigated. For the CKF and the SRCF these are the condition number of the innovation signal covariance matrix R^e_k and the spectral norm (radius) of the filter state transition matrix F_k, while for the SRIF the relevant parameters are the condition numbers of R_k, Q_k, A_k, and of the Choleski factor T_k, and the spectral norm (radius) of the filter state transition matrix F-bar_k. For the CSRF the choice of the initial error covariance matrix P_{0|-1} and the condition number of the innovation signal covariance matrix R^e_k become critical. This influence is also verified by a simulation study on the flight-path reconstruction problem [10] given in Section IV.

Further extensions of these techniques to the problem of computing other estimates x_hat_{k|j} (e.g., for smoothing) or using mixed representations of covariances (see, e.g., [23]) can also be considered. These mixed representations have the advantage that they allow for singular covariance matrices Q_k, R_k, or P_{k|k-1} and even for singular information matrices I_{k|k} = P_{k|k}^{-1}, thereby avoiding any use of generalized inverses.
APPENDIX

Here we briefly recall the propagation of rounding errors in some basic problems in linear algebra. The norm used is the 2-norm.
Let the matrix-vector pair (A, b) be known with relative precision delta_A and delta_b, respectively,

delta_A = ||deltaA|| / ||A||,  delta_b = ||deltab|| / ||b||;

then we have the following lemma (assuming A to be invertible).

Lemma A.1 [20]: The errors on A.b and A^{-1}.b can be bounded by

||(A-bar b-bar) - (A b)|| <= (delta_A + delta_b) ||A|| ||b|| + O(delta^2)

and by a corresponding bound for A^{-1}.b containing in addition the condition number kappa(A). When the errors delta_A and delta_b are the backward errors of the above problems solved on a computer with machine precision epsilon, then the above bounds are reasonably well approximated with all epsilon_i of the order of epsilon.

The above approximation implies that no serious cancellations occur in the product A.b, which in general is a reasonable assumption.
Let A now be an m x n matrix of rank m < n and transform the compound matrix [A | b] by a unitary transformation Q into a trapezoidal form in which the leading block A_1 is invertible and b_2 denotes the corresponding residual part. Then we have the following lemma (where (.)^+ denotes the generalized inverse of a matrix).
Lemma A.2 [19]: The errors on the least-squares solution A^+ b and the residual b_2 can be bounded by

||(A-bar^+ b-bar) - (A^+ b)|| <= delta_A {kappa(A) ||A^+ b|| + kappa(A) ||A^+|| ||b_2||} + delta_b ||A^+|| ||b|| + O(delta^2)
||(b-bar_2) - (b_2)|| <= delta_A kappa(A) ||b|| + delta_b ||b|| + O(delta^2).

When the errors delta_A and delta_b are the backward errors of the above problems solved on a computer with machine precision epsilon, then they are both of the order of epsilon and the above bounds are reasonably well approximated by

||(A-bar^+ b-bar) - (A^+ b)|| <= epsilon_4 {kappa(A) ||A^+ b|| + kappa(A) ||A^+|| ||b_2|| + ||A^+|| ||b_2|| / cos(phi)}
||(b-bar_2) - (b_2)|| <= epsilon_5 {1 + kappa(A)} ||b_2|| / cos(phi)

where all epsilon_i are of the order of epsilon and cos(phi) = ||b_2|| / ||b||.
We terminate with perturbation bounds on the QR-factorization of a matrix A.

Lemma A.3 [22]: Let A = Q R, where A has full column rank n, Q' Q = I_n, and R is upper triangular. Then for a small perturbation A-bar of A there exist perturbations Q-bar and R-bar of the factors, such that A-bar = Q-bar R-bar and

(A - A-bar) = Q (R - R-bar) + Delta  with  ||Delta|| <= delta_A kappa(A)^2 ||A||.

When the error delta_A is the backward error of the above decomposition solved on a computer with machine precision epsilon, then it is of the order of epsilon and the above bound becomes

||Delta|| <= epsilon_2 kappa(A)^2 ||A||.
A similar result can also be found for a "skew decomposition," i.e., where Q' Sigma Q = Sigma for some signature matrix Sigma.

Although all these bounds are written for the 2-norm, they also hold for several other norms, up to a constant which is close to 1 and can therefore be absorbed in the epsilon_i. This is, e.g., important when deriving the above bounds for a matrix B instead of a vector b. This is done by using the bounds for each column b_i of the matrix B and combining these bounds into a bound involving the norm of B, for which in this case the Frobenius norm is a natural choice [21]. These mixed bounds (as far as norms are concerned) can then again be formulated in terms of one norm only, by again adapting the epsilon_i appropriately.
REFERENCES

[1] R. E. Kalman, "A new approach to linear filtering and prediction problems," Trans. ASME (J. Basic Eng.), vol. 82D, pp. 34-45, Mar. 1960.
[2] A. H. Jazwinski, Stochastic Processes and Filtering Theory. New York: Academic, 1970.
[3] G. J. Bierman and C. L. Thornton, "Numerical comparison of Kalman filter algorithms: Orbit determination case study," Automatica, vol. 13, pp. 23-35, 1977.
[4] A. E. Bryson, "Kalman filter divergence and aircraft motion estimators," Int. J. Guidance Contr., vol. 1, no. 1, pp. 71-79, 1978.
[5] J. E. Potter and R. G. Stern, "Statistical filtering of space navigation measurements," in Proc. 1963 AIAA Guidance Contr. Conf., 1963.
[6] P. G. Kaminski, A. Bryson, and S. Schmidt, "Discrete square root filtering: A survey of current techniques," IEEE Trans. Automat. Contr., vol. AC-16, pp. 727-736, Dec. 1971.
[7] G. J. Bierman, Factorization Methods for Discrete Sequential Estimation. New York: Academic, 1977.
[8] J. Mendel, "Computational requirements for a discrete Kalman filter," IEEE Trans. Automat. Contr., vol. AC-16, no. 6, pp. 748-758, 1971.
[9] T. Kailath, "Some new algorithms for recursive estimation in constant linear systems," IEEE Trans. Inform. Theory, vol. IT-19, pp. 750-760, Nov. 1973.
[10] M. H. Verhaegen and P. Van Dooren, "An efficient implementation of square root filtering: Error analysis, complexity and simulation on flight-path reconstruction," in Proc. INRIA Conf. Anal. Optimiz. Syst., Nice, June 1984, Springer-Verlag, vol. 62-63, pp. 250-267.
[11] M. H. Verhaegen, "A new class of algorithms in linear system theory, with application to real-time aircraft model identification," Ph.D. dissertation, Catholic Univ. Leuven, Leuven, Belgium, Nov. 1985.
[12] B. D. O. Anderson and J. B. Moore, Optimal Filtering (Information and System Sciences Series). Englewood Cliffs, NJ: Prentice-Hall, 1979.
[13] G. H. Golub, "Numerical methods for solving linear least squares problems," Numerische Mathematik, vol. 7, pp. 206-216, 1965.
[14] M. Morf and T. Kailath, "Square-root algorithms for least squares estimation," IEEE Trans. Automat. Contr., vol. AC-20, pp. 487-497, Aug. 1975.
[15] M. Gentleman, "Least squares computations by Givens transformations without square roots," J. Inst. Math. Appl., vol. 12, pp. 329-336, 1973.
[16] F. R. Gantmacher, The Theory of Matrices. New York: Chelsea, 1960.
[17] R. E. Bellman, Matrix Analysis, 2nd ed. New York: McGraw-Hill, 1968.
[18] G. W. Stewart, "On the perturbation of pseudo-inverses, projections and linear least squares problems," SIAM Rev., vol. 19, pp. 634-662, Oct. 1977.
[19] A. van der Sluis, "Stability of the solution of linear least squares problems," Numerische Mathematik, vol. 23, pp. 241-254, 1975.
[20] J. H. Wilkinson, The Algebraic Eigenvalue Problem. Oxford: Clarendon, 1965.
[21] G. W. Stewart, Introduction to Matrix Computations. New York: Academic, 1973.
[22] G. W. Stewart, "Perturbation bounds for the QR factorization of a matrix," SIAM J. Numer. Anal., vol. 14, pp. 509-518, June 1977.
[23] C. C. Paige, "Covariance matrix representation in linear filtering," in Special Issue of Contemporary Mathematics on Linear Algebra and Its Role in Systems Theory. Providence, RI: AMS, 1985.
Michel Verhaegen was born in Antwerp, Belgium, on September 2, 1959. He received the engineering degree in aeronautics (cum laude) from the Delft University of Technology, Delft, The Netherlands, in 1982, and the doctoral degree in applied sciences from the Catholic University of Leuven, Leuven, Belgium, in 1985.

From 1982 to 1985 he was holder of an IWONL Research Assistantship in the Department of Electrical Engineering of the Catholic University of Leuven. He is currently holder of a Postdoctoral Fellowship of the National Research Council that enables him to conduct research at the NASA Ames Research Center, Moffett Field, CA. His research interests are mainly in the interdisciplinary domain between numerical analysis and linear system theory, with emphasis on identification and filtering.
Paul Van Dooren (S'79-M'80) was born in Tienen, Belgium, on November 5, 1950. He received the engineering degree in computer science and the doctoral degree in applied sciences, both from the Catholic University of Leuven, Leuven, Belgium, in 1974 and 1979, respectively.

From 1974 to 1977 he was Assistant in the Department of Applied Mathematics and Computer Science of the Catholic University of Leuven. He was a Research Associate at the University of Southern California in 1978-1979, a Postdoctoral Fellow at Stanford University in 1979-1980, and a Visiting Fellow at the Australian National University in 1985. He is currently with the Philips Research Laboratory, Brussels, Belgium. His main interests lie in the areas of numerical linear algebra, linear system theory, digital signal processing, and parallel algorithms.

Dr. Van Dooren received the Householder Award in 1981. He is an Associate Editor of Systems and Control Letters and of the Journal of Computational and Applied Mathematics.
... The equation (5) shows that if S is propagated instead of P in the filter calculations, it effectively doubles the precision of the filter. This approach may prevent numerical convergence issues because the condition number of P is the square of the condition number of S. One of often used algorithms to compute square root matrix S is the Cholsky factorisation of the covariance matrix P. The Cholesky factorisation [5] produces matrix S which is often called Cholesky triangle because it is triangular matrix. ...
... The equation (5) shows that if S is propagated instead of P in the filter calculations, it effectively doubles the precision of the filter. This approach may prevent numerical convergence issues because the condition number of P is the square of the condition number of S. One of often used algorithms to compute square root matrix S is the Cholsky factorisation of the covariance matrix P. The Cholesky factorisation [5] produces matrix S which is often called Cholesky triangle because it is triangular matrix. ...
... The modification used in MGS normalizes the columns at each step, so that the numerical errors are bounded. The details of various orthogonalisation algorithms and its properties can be found in literature [2], [5], [7]. Different square root algorithms use update equations based on different transformations. ...
Article
Full-text available
The paper summarises and describes the most commonly used matrix factorisation methods applied in design of the Kalman filter in order to improve computational efficiency and avoid divergence issues caused by numerical round-off and truncation errors. Some forms of the Kalman filter are more prone to the growth of numerical error sand possible divergence than other implementations. In order to prevent the algorithm’s divergence additional processing is needed and this paper discusses pros and cons of different implementations and their numerical characteristics. Numerical issues still arise in finite word length implementations of algorithms, which frequently occur in embedded systems. This paper describes algorithms based on different factorisations such as Cholesky, U-D, SVD and their basic numerical properties.
... This latter method being essentially inflation of the covariance matrix. Almost two decades later, these ad-hoc techniques would, in fact, be mathematically proved to be appropriate approaches [11]. ...
... Artificially inflating the covariance matrix, was one of the solutions proposed to also"control" the Kalman filter divergence problem that engineers started to face at the time [12], [13]. Although the Kalman filter divergence was mainly attributed to modeling errors, round-off errors also were recognized to affect the filter stability [11]. ...
... where 11) and Mathematically, the conventional formulation of the Schmidt filter starts by partitioning the state vector,x − , measurement matrix,H, and Kalman gain K into states and parameters aŝ ...
Preprint
Full-text available
A Schmidt filter is a modification of the Kalman filter that allows appending system parameters as states and considers their uncertainty effect in the filtering process without attempting to estimate those parameters. The states that are only considered but not estimated are generally known as 'consider' or 'considered' states. The main contributions of this research are the formulations of a Schmidt-Kalman filter that incorporates the numerical robustness of the well-known square root and factorized filtering forms, plus the capacity of actively attempting to update the 'considered' states. The filter formulations proposed in this research are a fundamental extension of the Kalman filter. Therefore, the formulations of this work also apply within the Extended Kalman filter framework. More importantly, they are shown to handle nonlinearities, larger initial uncertainties, and poorly conditioned systems better than a typical Extended or Schmidt Kalman filter. Because the new filters are directly based on the Schmidt filter, they offer a novel and straightforward filtering framework, allowing the use of a simpler filter where a more advanced or elaborate technique might otherwise have been needed.
... Significant energy gains from optimized quantization have been demonstrated in [28][29][30] for signal processing and digital communications applications and in [31][32][33] for neural networks. The effects of quantization on the Kalman filter were first studied in [34,35] to understand the convergence of filters with reduced precision. ...
... To develop this framework, we build on the approach of [35], which consists of evaluating the covariance matrix of the estimation error at each filter iteration by considering both error propagation from previous iterations and errors introduced at the current iteration. Our analysis also includes quantized filter parameters and further incorporates the effect of unreliable memories. ...
... Our objective is to compute the total error $\Delta x_{k+1|k+1}$ on the computation of $\hat{x}_{k+1}$ at step $k+1$ by considering the two types of errors: quantization and unreliable memory. To handle the recursion as in [35], we choose to split the error model into two parts: the errors occurring at step $k$ and the errors from the previous steps, which are propagated up to step $k$. ...
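Under that split, a natural first-order form of the recursion (our illustrative notation, not necessarily the exact equations of [35]) is

$$\Delta x_{k+1|k+1} = (I - K_{k+1} H)\,F\,\Delta x_{k|k} + e_{k+1},$$

where the first term propagates all errors accumulated up to step $k$ through the filter dynamics, and $e_{k+1}$ collects the quantization and memory errors injected at step $k+1$.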
Article
Full-text available
This paper presents a quantized Kalman filter implemented using unreliable memories. We consider that both the quantization and the unreliable memories introduce errors in the computations, and we develop an error propagation model that takes into account these two sources of errors. In addition to providing updated Kalman filter equations, the proposed error model accurately predicts the covariance of the estimation error and gives a relation between the performance of the filter and its energy consumption, depending on the noise level in the memories. Then, since memories are responsible for a large part of the energy consumption of embedded systems, optimization methods are introduced to minimize the memory energy consumption under the desired estimation performance of the filter. The first method computes the optimal energy levels allocated to each memory bank individually, and the second one optimizes the energy allocation per groups of memory banks. Simulations show a close match between the theoretical analysis and experimental results. Furthermore, they demonstrate an important reduction in energy consumption of more than 50%.
... By subtracting (33) from (27), we define the following residual error vector ...
... The initial values of (36) lie in $P_0$, but are invariant to the sensors' errors. Their estimation is mostly performed through careful heuristics, relying on the knowledge and intuition of the system designers [33]. In contrast, the process noise covariance propagates the PSD matrix, which embodies the spectral intensity and correlation of the process noise vector. ...
Preprint
Full-text available
Inertial navigation systems (INS) are widely used in almost any operational environment, including aviation, marine, and land vehicles. Inertial measurements from accelerometers and gyroscopes allow the INS to estimate the position, velocity, and orientation of its host vehicle. However, as inherent sensor measurement errors propagate into the state estimates, accuracy degrades over time. To mitigate the resulting drift in the state estimates, different parametric and state estimation approaches have been proposed to compensate for the undesirable errors, using frequency-domain filtering or external information fusion. Another approach uses multiple inertial sensors, a field with rapid growth potential and many applications. The increased sampling of the observed phenomenon results in the improvement of several key factors such as signal accuracy, frequency resolution, noise rejection, and redundancy. This study offers a tutorial analysis of basic multiple-inertial-sensor operation, with a new perspective on the relationship of the error to time and to the number of sensors. To that end, a stationary and levelled sensor array is taken, and its robustness against instrumental errors is analyzed. Subsequently, the hypothesized analytical model is compared with the experimental results, and the level of agreement between them is thoroughly discussed. Ultimately, our results showcase the vast potential of employing multiple sensors, as we observe improvements spanning from the signal level to the navigation states. This tutorial is suitable both for newcomers and for people experienced with multiple inertial sensors.
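The signal-level improvement this abstract refers to follows, for independent sensor noise, the familiar $1/\sqrt{N}$ law. A minimal simulation with synthetic white noise (our test data, not the study's) makes it visible:

    import numpy as np

    rng = np.random.default_rng(3)
    sigma, n_samples = 0.05, 100_000
    for N in (1, 4, 16):
        readings = sigma * rng.standard_normal((N, n_samples))  # N noisy channels
        fused = readings.mean(axis=0)                           # simple averaging
        print(N, fused.std(), sigma / np.sqrt(N))               # measured vs predicted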
... where $L(q) = R(q)B_a$, with $P^-_{11}$ from equation (39) being the covariance matrix of the predicted quaternions $q^-$. ...
... The implementation of the EKF yields the following number of flops per time-step, as stated by [39]: ...
... where n is the state dimension (8), p is the measurement dimension (16), m is the state process dimension (8), and N is the window length for the adaptive estimation (10). ...
... The first term represents the standard cost of the KF [16]; the second and third take into account, respectively, the evaluation of the integral in Eq. (18) and the application of the moving-average window. ...
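The exact flop expression from [39] is not reproduced in the excerpt, so the sketch below substitutes a generic dominant-term cost model for one dense Kalman filter step (our assumption, not the paper's formula), evaluated at the dimensions quoted above:

    def kf_flops(n: int, p: int) -> int:
        # Rough dominant-term flop count for one dense KF step (illustrative).
        predict = 4 * n**3                          # F P and (F P) F^T
        gain = 2 * n**2 * p + 2 * n * p**2 + p**3   # H P H^T, its inverse, and K
        update = 2 * n**2 * p + 2 * n**3            # K H and (I - K H) P
        return predict + gain + update

    print(kf_flops(n=8, p=16))   # state and measurement dimensions quoted in the excerpt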
Conference Paper
Full-text available
Currently, on-board operating spacecraft navigation systems use Kalman filtering of GPS data to achieve real-time Precise Orbit Determination (POD). This greatly reduces dependency on ground support, but high accuracy still relies on manual tuning from ground analysts. A suitable solution to this problem is the adaptive filtering approach. This paper presents the performance analysis of a POD adaptive Kalman filter on Hardware-in-the-Loop (HIL) data generated by a dedicated GNSS facility. The study has demonstrated the capability of the innovation-based adaptive filter to perform self-tuning and improve the performance w.r.t. the sub-optimal parameter initialization case.
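One widely used innovation-based adaptation rule (a Mohamed-Schwarz style estimate, given here as a generic illustration rather than as this paper's exact scheme) refreshes the measurement noise covariance from a moving window of innovations:

    import numpy as np

    def adapt_R(innovations, H, P_pred):
        # innovations: (N, p) array holding the last N innovation vectors.
        nu = np.asarray(innovations)
        C = nu.T @ nu / len(nu)          # windowed sample covariance of innovations
        return C - H @ P_pred @ H.T      # subtract the predicted-state contribution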
... The software environment with a 64-bit architecture uses 52 bits for its mantissa, and therefore the condition number should be below $10^{15}$ to guarantee numerical stability of the system. Additionally, round-off errors on the solution of the system $Ax = b$ can be shown to be bounded by the following formula (Verhaegen and Van Dooren, 1986): ...
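The interplay between the 52-bit mantissa and the $10^{15}$ threshold follows from the unit round-off alone. The snippet below (illustrative only; the bound referred to in the excerpt is not reproduced here) tabulates the worst-case relative error $\kappa \cdot \epsilon$ for a few condition numbers:

    eps = 2.0 ** -52                     # unit round-off of a 52-bit mantissa
    for kappa in (1e3, 1e9, 1e15):
        print(f"cond = {kappa:.0e} -> relative error up to ~ {kappa * eps:.1e}")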
Article
Full-text available
A relatively novel approach of autonomous navigation employing platform dynamics as the primary process model raises new implementational challenges. These are related to: (i) potential numerical instabilities during longer flights; (ii) the quality of model self-calibration and its applicability to different flights; (iii) the establishment of a global estimation methodology when handling different initialisation flight phases; and (iv) the possibility of reducing computational load through model simplification. We propose a unified strategy for handling different flight phases with a combination of factorisation and a partial Schmidt–Kalman approach. We then investigate the stability of the in-air initialisation and the suitability of reusing pre-calibrated model parameters with their correlations. Without GNSS updates, we suggest setting a subset of the state vector as ‘considered’ states within the filter to remove their estimation from the remaining observations. We support all propositions with new empirical evidence: first in model-parameter self-calibration via optimal smoothing and second through applying our methods on three test flights with dissimilar durations and geometries. Our experiments demonstrate a significant improvement in autonomous navigation quality for twelve different scenarios.
... [46,49] formulated the square root of the error covariance matrix and propagated it during the state estimation run. The square root form keeps the covariance matrix symmetric and positive semi-definite [46,57]. This leads to improved stability of the system together with enhanced numerical accuracy. ...
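The symmetry and semi-definiteness hold for the covariance reconstructed from the factor: whatever rounding does to S, the product S S^T stays symmetric positive semi-definite. A minimal check with an arbitrary test matrix:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((4, 4))
    P = A @ A.T + 4.0 * np.eye(4)                  # a symmetric positive-definite covariance
    S = np.linalg.cholesky(P)                      # square root factor, P = S @ S.T

    P_rec = S @ S.T                                # symmetric by construction
    print(np.allclose(P_rec, P_rec.T))             # True
    print(np.linalg.eigvalsh(P_rec).min() >= 0.0)  # positive semi-definite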
Article
Full-text available
Square-root unscented Kalman filter (SRUKF) is a widely used state estimator for several state-of-the-art, highly nonlinear, and critical applications. It improves the stability and numerical accuracy of the system compared to the non-square-root formulation, the unscented Kalman filter (UKF). At the same time, SRUKF is less computationally intensive than UKF, making it suitable for portable and battery-powered applications. This paper proposes a low-complexity and power-efficient architecture design methodology for SRUKF, presented with a use case of the simultaneous localization and mapping (SLAM) problem. Implementation results show that the proposed SRUKF methodology is highly stable and achieves higher accuracy than the extensively used extended Kalman filter and UKF when developed for highly critical nonlinear applications such as SLAM. The design is synthesized and implemented on the resource-constrained Zynq-7000 XC7Z020 FPGA-based Zedboard development kit and compared with state-of-the-art Kalman filter-based FPGA designs. Synthesis results show that the architecture is highly stable and has significant computation savings in DSP cores and clock cycles. The power consumption was reduced by 64% compared to the state-of-the-art UKF design methodology. The ASIC design was synthesized using UMC 90-nm technology, and the results for on-chip area and power consumption are discussed.
Article
Estimating the position, velocity, and time of GNSS receivers is based on tracking loops and extracting navigation satellite information. There are two methods for navigation satellite signal tracking: the Scalar Tracking Loop (STL) and the Vector Tracking Loop (VTL). Inexpensive GNSS receivers usually use the STL approach, but the VTL approach has advantages such as weak-signal tracking, better resilience against signal interference, and faster recovery of tracking after temporary signal outages. Hence, using this method for tracking can improve the performance of a GNSS receiver. The main drawback of the VTL method is its high computational load. In this paper, three techniques are proposed to reduce the computational load of this tracking method, enabling the implementation of the VTL method on inexpensive receivers. The results with real data and simulations show that if these three techniques are used simultaneously, the computational load is reduced by more than 90%, while the performance of the tracking loops shows no significant degradation compared to the standard VTL. Therefore, the simultaneous use of these three techniques to implement the VTL method can be very effective on low-cost and low-dynamic receivers. Two of the three proposed techniques can also be used to implement the VTL on all types of GNSS receivers.
Article
This paper surveys perturbation theory for the pseudo-inverse (Moore-Penrose generalized inverse), for the orthogonal projection onto the column space of a matrix, and for the linear least squares problem.
Article
Let A be an $m \times n$ matrix of rank n. The $QR$ factorization of A decomposes A into the product of an $m \times n$ matrix Q with orthonormal columns and a nonsingular upper triangular matrix R. The decomposition is essentially unique, Q being determined up to the signs of its columns and R up to the signs of its rows. If E is an $m \times n$ matrix such that $A + E$ is of rank n, then $A + E$ has an essentially unique factorization $(Q + W)(R + F)$. In this paper bounds on $\| W \|$ and $\| F \|$ in terms of $\| E \|$ are given. In addition, perturbation bounds are given for the closely related Cholesky factorization of a positive definite matrix B into the product $R^T R$ of a lower triangular matrix and its transpose.
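An empirical companion to these bounds, with a random test matrix of our own construction: perturb A by a small E and compare the resulting changes in the factors with $\| E \|$.

    import numpy as np

    rng = np.random.default_rng(4)
    A = rng.standard_normal((6, 4))
    E = 1e-8 * rng.standard_normal((6, 4))        # small perturbation

    Q, R = np.linalg.qr(A)
    Qe, Re = np.linalg.qr(A + E)
    # ||W|| and ||F|| should be of the same order as ||E||, scaled by cond(A).
    print(np.linalg.norm(Qe - Q), np.linalg.norm(Re - R), np.linalg.norm(E))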
Article
A common problem in a Computer Laboratory is that of finding linear least squares solutions. These problems arise in a variety of areas and in a variety of contexts. Linear least squares problems are particularly difficult to solve because they frequently involve large quantities of data, and they are ill-conditioned by their very nature. In this paper, we shall consider stable numerical methods for handling these problems. Our basic tool is a matrix decomposition based on orthogonal Householder transformations.
Article
In 1966 Golub and Wilkinson gave upper bounds for the errors of least squares solutions based on orthogonal transformations, in which the square of the condition number of the matrix occurs. In the present paper a geometrical explanation and lower bounds are given. The upper bounds will be shown to be realistic for some classes of problems and types of perturbations, and to be unrealistic for others.
Article
The classical filtering and prediction problem is re-examined using the Bode-Shannon representation of random processes and the "state-transition" method of analysis of dynamic systems. New results are: (1) The formulation and methods of solution of the problem apply without modification to stationary and nonstationary statistics and to growing-memory and infinite-memory filters. (2) A nonlinear difference (or differential) equation is derived for the covariance matrix of the optimal estimation error. From the solution of this equation the coefficients of the difference (or differential) equation of the optimal linear filter are obtained without further calculations. (3) The filtering problem is shown to be the dual of the noise-free regulator problem. The new method developed here is applied to two well-known problems, confirming and extending earlier results. The discussion is largely self-contained and proceeds from first principles; basic concepts of the theory of random processes are reviewed in the Appendix.
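The nonlinear difference equation announced in item (2) is what is now called the discrete-time Riccati recursion for the error covariance; in modern notation (ours, not the paper's original) it reads

$$P_{k+1} = F_k P_k F_k^T + Q_k - F_k P_k H_k^T \left( H_k P_k H_k^T + R_k \right)^{-1} H_k P_k F_k^T,$$

from whose solution the optimal filter gain follows without further calculation.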