IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 58, NO. 10, OCTOBER 2010 5277
Distributed Adaptive Node-Specific Signal Estimation
in Fully Connected Sensor Networks—Part I:
Sequential Node Updating
Alexander Bertrand, Student Member, IEEE, and Marc Moonen, Fellow, IEEE
Abstract—We introduce a distributed adaptive algorithm for
linear minimum mean squared error (MMSE) estimation of
node-specific signals in a fully connected broadcasting sensor
network where the nodes collect multichannel sensor signal obser-
vations. We assume that the node-specific signals to be estimated
share a common latent signal subspace with a dimension that is
small compared to the number of available sensor channels at
each node. In this case, the algorithm can significantly reduce the
required communication bandwidth and still provide the same
optimal linear MMSE estimators as the centralized case. Further-
more, the computational load at each node is smaller than in a
centralized architecture in which all computations are performed
in a single fusion center. We consider the case where nodes update
their parameters in a sequential round robin fashion. Numerical
simulations support the theoretical results. Because of its adaptive
nature, the algorithm is suited for real-time signal estimation in
dynamic environments, such as speech enhancement with acoustic
sensor networks.
Index Terms—Adaptive estimation, distributed estimation, wire-
less sensor networks (WSNs).
I. INTRODUCTION
In a sensor network [1] a general objective is to utilize all
sensor signal observations available in the entire network to
perform a certain task, such as the estimation of a parameter or
signal. Gathering all observations in a fusion center to calculate
an optimal estimate may however require a large communication
bandwidth and computational power. This approach is often
referred to as centralized fusion or estimation. An alternative is
a distributed approach where each node has its own processing
unit and the estimation relies on distributed processing and
cooperation. This approach is preferred, especially so when it is
scalable in terms of its communication bandwidth requirement
and computational complexity.
Manuscript received October 21, 2009; accepted March 21, 2010. Date of
publication June 10, 2010; date of current version September 15, 2010. The
associate editor coordinating the review of this manuscript and approving it for
publication was Dr. Ta-Sung Lee. The work of A. Bertrand was supported by
a Ph.D. grant of the I.W.T. (Flemish Institute for the Promotion of Innovation
through Science and Technology). This work was carried out at the ESAT
Laboratory of Katholieke Universiteit Leuven, in the frame of K.U. Leuven Research
Council CoE EF/05/006 Optimization in Engineering (OPTEC), Concerted
Research Action GOA-AMBioRICS, Concerted Research Action GOA-MaNet,
the Belgian Programme on Interuniversity Attraction Poles initiated by the
Belgian Federal Science Policy Office IUAP P6/04 (DYSCO, “Dynamical
systems, control and optimization,” 2007–2011), and Research Project FWO nr.
G.0600.08 (“Signal processing and network design for wireless acoustic sensor
networks”). The scientific responsibility is assumed by its authors.
The authors are with the Department of Electrical Engineering (ESAT-SCD/
SISTA), Katholieke Universiteit Leuven, B-3001 Leuven, Belgium (e-mail:
alexander.bertrand@esat.kuleuven.be; marc.moonen@esat.kuleuven.be).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TSP.2010.2052612
In many sensor network estimation frameworks, the sensor
signal observations are used to estimate a common network-wide
desired parameter or signal, denoted here by $d$. This means
that all nodes contribute to a common goal, i.e., the estimation of
the globally defined variable $d$, which is the same for all nodes
(see for example [2]–[8]). This can be viewed as a special case
of the more general problem, which is considered here, where
each node in the network estimates a different node-specific
desired signal, i.e., node $k$ estimates the locally defined signal $d_k$.
This means that all nodes have a different local objective, which
they pursue through cooperation with other nodes. We describe
a distributed adaptive node-specific signal estimation (DANSE)
algorithm that operates in an ideal fully connected network. The
nodes broadcast compressed multichannel sensor signal
observations that can be captured by all other nodes in the network,
possibly with the help of relay nodes. The computational load
is distributed over the different nodes in the network.
The DANSE algorithm is designed for the case where the
node-specific desired signals share a common (unknown) latent
signal subspace. If this signal space has a small dimension
compared to the number of available sensor channels at each
node, the DANSE algorithm exploits this common interest of
the nodes to significantly compress the data to be broadcast, and
yet converge to the optimal linear minimum mean squared error
(MMSE) estimators as if all sensor signal observations were
available at each node. Although the DANSE algorithm implicitly
assumes a specific structure in the relationship between the
desired signals of the different nodes, it is noted that the actual
parameters of these latent dependencies are not assumed to be
known, i.e., nodes do not know how their desired signal is
related to the desired signals of other nodes. The model that is
assumed in the DANSE algorithm naturally emerges in adaptive
signal estimation problems in dynamic scenarios where the
target signal statistics and the transfer functions to the sensors
are not known and may change during operation of the algorithm.
Therefore, the original target signal cannot be recovered,
and so an option is then to let the nodes optimally estimate the
signal as it is observed locally by the node’s sensors. In this case,
the desired signals of the different nodes are differently filtered
versions of the same target signal, i.e., they share a common
latent signal subspace.
1053-587X/$26.00 © 2010 IEEE
Because of its adaptive nature, the DANSE algorithm is
suited for real-time applications in dynamic environments.
Typical applications are vibration monitoring, wireless acoustic
sensor networks (for surveillance, video conferencing, domotics,
audio recording), and noise reduction in hearing aids
with external sensor nodes and/or cooperation between multiple
hearing aids [9], [10]. Node-specific estimation is particularly
important in applications where a target signal needs to be
estimated as it is observed at a specific sensor position. For
instance, in acoustic surveillance, it is often required to be able
to locate a sound source, so spatial information in the
observations of different nodes must be retained in the estimation
process. In cooperating hearing aids, it is important to estimate
the signal as it impinges at the hearing aid itself, to preserve the
auditory cues for directional hearing [11], [12].
The DANSE algorithm is based on linear compression of
multichannel sensor signal observations. Linear compression
of sensor signal observations for data fusion has been the
topic of earlier work, e.g., [5]–[8]. The presented techniques,
however, assume prior knowledge of the intra- and intersensor
(cross-)correlation structure in the entire network. This must
be obtained by a priori training using all uncompressed sensor
signal observations, or must be derived from a specific data
model. Such assumptions make it difficult to apply the resulting
algorithms in adaptive networks or dynamic environments
where the statistics of the desired signals or sensor signals may
change. The DANSE algorithm can adapt to these changes
because nodes estimate and reestimate all required statistical
quantities on the compressed data during operation. For this,
we assume that each node can adaptively estimate the cross cor-
relation between its local sensor signals and its desired signal.
It is noted that the acquisition of these signal statistics is often
difficult or impossible, since the target signal is assumed to be
unknown. However, we will explain that in particular cases,
it is possible to estimate the required statistics, e.g., when the
target signal has an ON–OFF behavior (such as speech signals),
or when the target source periodically transmits a priori known
training sequences. In cases where the local statistics cannot
be estimated adaptively, the DANSE algorithm can still be
used in a semi-adaptive context, i.e., scenarios with static noise
statistics but with changing target signal statistics or vice versa,
assuming that the static correlation structure is a priori known.
In [13], a batch-mode description of the DANSE algorithm
was briefly introduced. In this paper, we provide more details,
i.e., we include a convergence proof and introduce a truly adap-
tive version. In addition, we address implementation aspects,
and provide extensive simulation results, both in batch mode and
in a dynamic scenario. We only consider the case where nodes
update their parameters in a sequential round robin fashion. The
case where nodes update simultaneously or asynchronously is
treated in a companion paper [14]. In [10], a pruned version
of the DANSE algorithm has been used for microphone-array
based speech enhancement in binaural hearing aids, where it
was referred to as distributed multichannel Wiener filtering. In
this application, two hearing aids in a binaural configuration ex-
change a linear combination of their microphone signals to esti-
mate the target sound that is recorded by their reference micro-
phone. Convergence of the two-node system has been proven for
the special case where there is a single target speaker. The more
general DANSE algorithm provided in this paper allows for a
nontrivial extension to a scenario with multiple target speakers
and a network with more than two nodes. Using extra acoustic
sensor nodes that communicate with the hearing aids generally
improves the noise reduction performance, since the acoustic
sensors physically cover a larger area [9].
The paper is organized as follows. The problem formulation
and notation are presented in Section II. In Section III, we first
address the simple case in which the node-specific desired
signals are scaled versions of each other and we prove convergence
of the DANSE algorithm to the optimal linear MMSE
estimators when nodes update their parameters sequentially. In
Section IV, this algorithm is generalized to the case in which
the node-specific desired signals share a common latent
$Q$-dimensional signal subspace. In Section V, we address some
implementation details of DANSE and we study the complexity
of the algorithm. Finally, Section VI illustrates the convergence
results with numerical simulations. Conclusions are given in
Section VII.
II. PROBLEM FORMULATION AND NOTATION
A. Node-Specific Linear MMSE Estimation
We consider an ideal fully connected network with $J$ sensor
nodes $\mathcal{J} = \{1, \ldots, J\}$, in which data broadcast by a node
$k \in \mathcal{J}$ can be captured by all other nodes in the network
through an ideal link. Node $k$ collects observations of a
complex-valued (see footnote 1) $M_k$-channel signal $\mathbf{y}_k[t]$, where $t$ is the
discrete time index, and where $\mathbf{y}_k[t]$ is an $M_k$-dimensional column
vector. Each channel $y_{km}$, $m = 1, \ldots, M_k$, of the signal $\mathbf{y}_k$
corresponds to a sensor signal to which node $k$ has access.
We assume that all signals are stationary and ergodic. In practice,
the stationarity and ergodicity assumption can be relaxed to
short-term stationarity and ergodicity, in which case the theory
should be applied to finite signal segments that are assumed to
be stationary and ergodic. For the sake of an easy exposition, we
will omit the time index when referring to a signal, and we will
only write the time index when referring to one specific
observation, i.e., $\mathbf{y}_k[t]$ is the observation of the signal $\mathbf{y}_k$ at time $t$. We
define $\mathbf{y}$ as the $M$-channel signal in which all $\mathbf{y}_k$ are stacked,
where $M = \sum_{k=1}^{J} M_k$. This scenario is described in Fig. 1.
It is noted that this problem formulation also allows for
hierarchical network architectures, in which the sensors are grouped
in $J$ clusters. The $M_k$ sensors of a specific cluster $k$ then transmit
their observations to a nearby fusion center, i.e., a “higher level”
node. The fusion centers then correspond to the $J$ nodes in
the above framework, and the collected observations in sensor
cluster $k$ correspond to the $M_k$-channel signals $\mathbf{y}_k$ as explained
above. Fig. 2 shows such a scenario for a network with three
fusion centers ($J = 3$).
We first consider the centralized estimation problem, i.e., we
assume that each node has access to the observations of the
entire $M$-channel signal $\mathbf{y}$. This corresponds to the case where
nodes broadcast their uncompressed observations to all other
nodes. In Sections III and IV, the general goal will be to
compress the broadcast signals, while preserving the estimation
performance of this centralized estimator. The objective for node $k$
is to estimate a complex-valued node-specific signal $d_k$, referred
to as the desired signal, from the observations of $\mathbf{y}$. We consider
the general case where $d_k$ is not an observed signal, i.e., it is
assumed to be unknown, as is the case in signal enhancement
(e.g., in speech enhancement, $d_k$ is the speech component in a
noisy microphone signal). Node $k$ uses a linear estimator $\mathbf{w}_k$ to
estimate $d_k$ as $\bar{d}_k = \mathbf{w}_k^H \mathbf{y}$, where $\mathbf{w}_k$ is a complex-valued
$M$-dimensional vector, and where superscript $H$ denotes the
conjugate transpose operator. We assume that the $M$-channel signal $\mathbf{y}$
is correlated to the node-specific desired signals, but unlike [6],
[8], we do not restrict ourselves to any data model generating the
sensor signals, nor do we make any assumptions on the
probability distributions of the involved signals. We consider linear
MMSE estimation based on a node-specific estimator $\hat{\mathbf{w}}_k$, i.e.
$$\hat{\mathbf{w}}_k = \arg\min_{\mathbf{w}_k} E\left\{ \left| d_k - \mathbf{w}_k^H \mathbf{y} \right|^2 \right\} \tag{1}$$
with $E\{\cdot\}$ the expected value operator. Assuming that the
correlation matrix $\mathbf{R}_{yy} = E\{\mathbf{y}\mathbf{y}^H\}$ has full rank (see footnote 2), the unique
solution of (1) is [15]:
$$\hat{\mathbf{w}}_k = \mathbf{R}_{yy}^{-1} \mathbf{r}_{yd_k} \tag{2}$$
with $\mathbf{r}_{yd_k} = E\{\mathbf{y} d_k^*\}$, where $d_k^*$ denotes the complex conjugate
of $d_k$. Based on the assumption that the signals are ergodic, $\mathbf{R}_{yy}$
and $\mathbf{r}_{yd_k}$ can be estimated by time averaging. The matrix $\mathbf{R}_{yy}$ is
directly estimated from the sensor signal observations. Since $d_k$ is
assumed to be unknown, the estimation of the correlation vector
$\mathbf{r}_{yd_k}$ has to be done indirectly, based on specific strategies, e.g.,
by exploiting the ON–OFF behavior of the target signal (e.g., for
speech enhancement [9], [10]), by using training sequences, or
by using partial prior knowledge when the estimation is
performed in a semi-adaptive context. We will provide more details
on these strategies in Section V-A. In the sequel, we assume that
$\mathbf{r}_{yd_k}$ can be estimated during operation of the algorithm.
1 Throughout this paper, all signals are assumed to be complex valued to
permit frequency-domain descriptions.
Fig. 1. Description of the scenario. The network contains $J$ sensor nodes,
$k = 1, \ldots, J$, where node $k$ collects $M_k$-channel sensor signal observations and
estimates a node-specific desired signal $\mathbf{d}_k$, which is a mixture of the channels
of a common latent signal $\mathbf{d}$.
Fig. 2. A hierarchical architecture with 3 fusion centers ($J = 3$), each one
collecting sensor signals from nearby sensors.
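As an illustration of (1) and (2), the following minimal sketch estimates the two correlation quantities by time averaging and solves for the linear MMSE estimator. It assumes a training scenario in which the desired signal is available, and the synthetic data model, the sizes, and the names `crandn`, `Y`, `h`, and `sample_mse` are hypothetical choices made only for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
M, T = 6, 5000  # hypothetical: 6 channels, 5000 training observations


def crandn(*shape):
    """Complex Gaussian samples (illustrative helper, not from the paper)."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


# Assumed training scenario: d is known here, which is NOT the general case.
d = crandn(T)                              # desired signal samples d[t]
h = crandn(M)                              # hypothetical mixing vector
Y = np.outer(h, d) + 0.5 * crandn(M, T)    # columns are the observations y[t]

R_yy = Y @ Y.conj().T / T                  # time average estimating E{y y^H}
r_yd = Y @ d.conj() / T                    # time average estimating E{y d^*}
w_hat = np.linalg.solve(R_yy, r_yd)        # eq. (2) with the estimated statistics


def sample_mse(w):
    """Sample version of the cost in (1): mean of |d[t] - w^H y[t]|^2."""
    return np.mean(np.abs(d - w.conj() @ Y) ** 2)
```

Because `w_hat` solves the normal equations built from the same samples, it minimizes `sample_mse` exactly, so any perturbed vector yields a cost that is at least as large.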
In the above estimation procedure, temporal correlation
appears to be ignored. However, differently delayed versions of
one or more sensor signals at node $k$ can be added to the
channels of $\mathbf{y}_k$, to also exploit the temporal information in the
signals. For example, assume that node $k$ has access to 4 sensor
signals. Each of these signals can then be delayed by 1 up to a
chosen maximum number of samples, where every delay adds
4 extra (delayed) channels, and the dimension $M_k$ of $\mathbf{y}_k$ grows
accordingly.
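The channel stacking described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `stack_delays` and the parameter `num_delays` (the largest delay in samples) are hypothetical, and samples before the start of the recording are filled with zeros.

```python
import numpy as np


def stack_delays(Y, num_delays):
    """Append delayed copies of every channel of Y (shape: channels x time).

    The result has (num_delays + 1) * channels rows: the original channels
    followed by the copies delayed by 1, 2, ..., num_delays samples.
    """
    channels, T = Y.shape
    out = np.zeros(((num_delays + 1) * channels, T), dtype=Y.dtype)
    for tau in range(num_delays + 1):
        # Rows of this block hold Y delayed by tau samples (zero-padded start).
        out[tau * channels:(tau + 1) * channels, tau:] = Y[:, :T - tau]
    return out


# Four sensor signals with three time samples each, delays of 1 and 2 samples.
Y_k = np.arange(12.0).reshape(4, 3)
Y_stacked = stack_delays(Y_k, num_delays=2)   # shape (12, 3)
```

With 4 sensor signals and `num_delays` delays, the stacked signal has `4 * (num_delays + 1)` channels, which is the dimension the estimator then has to handle.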
It is noted that our problem statement differs from [2]–[4],
where each node collects different spatio-temporal observations
of two correlated signals. The objective is then to find
the best common linear fit between these observations, with a
single set of coefficients $\mathbf{w}$, which is assumed to be the same for
each node. Since the coefficients in $\mathbf{w}$ are of interest, only the
locally estimated $\mathbf{w}$'s must be shared between nodes, whereas
the sensor observations themselves are only used locally to
update the estimate of $\mathbf{w}$. Since all nodes are assumed to estimate
the same set of coefficients, incremental or diffusive averaging
strategies can be used.
B. Common Latent Signal Subspace
In our problem statement, each node $k$ only collects
observations of $\mathbf{y}_k$, which corresponds to a subset of the channels of the
full signal $\mathbf{y}$. To find the optimal MMSE solution (2), each node $k$
therefore in principle has to broadcast its observations of $\mathbf{y}_k$
to all other nodes in the network, which requires a large
communication bandwidth. One possibility to reduce the required
bandwidth is to broadcast only a few linear combinations of the
$M_k$ components of the observations $\mathbf{y}_k[t]$ instead of all $M_k$
components. Finding the optimal linear compression is often a
nontrivial task, and in general this will not lead to the optimal
solutions (2). In many practical cases, however, the $d_k$ signals share
a common latent signal subspace, and then this can be exploited
in the compression. The simplest case is when all $d_k = d$,
i.e., the desired signal is the same for all nodes. We will first
handle the slightly more general case where all $d_k$ are scaled
versions of a common latent single-channel signal $d$. For this
scenario, we will introduce the DANSE$_1$ algorithm, in which
the data to be broadcast by each node $k$ is compressed by a
factor $M_k$. Despite this compression, the algorithm converges
to the optimal node-specific solution (2) at every node as if no
compression were used for the broadcasts.
2 This assumption is mostly satisfied in practice because of a noise component
at every sensor that is independent of other sensors, e.g., thermal noise. If not,
pseudoinverses should be used. A further comment on the rank-deficient case is
made in Section IV-C.
This scenario can then be extended to the more general case
where the desired signals share a common $Q$-dimensional signal
subspace, i.e.
$$d_k = \mathbf{a}_k^H \mathbf{d}, \quad \forall k \in \mathcal{J} \tag{3}$$
with $\mathbf{a}_k$ defining an unknown $Q$-dimensional complex vector,
and $\mathbf{d}$ a latent complex-valued $Q$-channel signal defining the
$Q$-dimensional signal subspace that contains all $d_k$ signals. This
model applies to situations where the desired signal is generated
by multiple latent processes simultaneously (e.g., measuring
vibrations when there are multiple exciters, or recording a
conversation between multiple speakers [9]). Since the statistics of
the latent signals as well as the propagation properties to the
different sensors are generally unknown, the signal estimation
procedure can only use statistics that can be obtained from the
local sensor signal observations. The desired signal of each node
is then the linear mixture of the latent target signals as locally
observed by a reference sensor.
In the sequel, we consider the general case where node $k$
estimates a $K$-channel desired signal
$$\mathbf{d}_k = \mathbf{A}_k \mathbf{d} \tag{4}$$
with $\mathbf{A}_k$ a complex-valued $K \times Q$ matrix. This data model is
depicted in Fig. 1. It is noted that the matrix $\mathbf{A}_k$ and the
latent signal $\mathbf{d}$ are assumed to be unknown, i.e., nodes do not
know how their node-specific desired signals are related to
each other. Since we also consider complex-valued signals, (4)
can correspond to a frequency-domain description of a
convolutive mixture in the time domain, as in [9], [10]. Expression
(4) then defines a different estimation problem for each specific
frequency. This yields frequency-dependent estimators,
which translate to multitap filters in the time domain.
Notice that, if $K = Q$, the desired signal $\mathbf{d}_k$ spans the
complete signal subspace defined by the $Q$-channel signal $\mathbf{d}$
(provided that the matrix $\mathbf{A}_k$ has full rank). If this holds
for each node in the network, we will show that the data to
be broadcast by node $k$ can be compressed by a factor $M_k/K$.
This means that node $k$ only needs to broadcast $K$ linear
combinations of the components of its observations of $\mathbf{y}_k$, while the
optimal node-specific solution (2) is still obtained at all nodes.
Notice that in practical applications, the actual signal(s) of
interest can be a subset of the entries in $\mathbf{d}_k$, in which case the
other entries should be seen as auxiliary channels to capture
the latent $Q$-dimensional signal subspace that contains the $d_k$'s.
For instance, consider the case where nodes estimate the target
signal as observed by their reference sensor, i.e., node $k$
estimates the node-specific desired signal $d_k$ as in (3). Node $k$ then
selects $K - 1$ extra auxiliary reference sensors, and also
estimates the target signal as it arrives on these sensors. The
resulting $K$-channel desired signal $\mathbf{d}_k$ then spans the complete
signal subspace if $K = Q$.
III. DANSE WITH SINGLE-CHANNEL BROADCAST SIGNALS
The algorithm introduced in this paper is an iterative scheme
referred to as distributed adaptive node-specific signal
estimation (DANSE), since its objective is to estimate a node-specific
signal at each node in a distributed fashion. In the general
scheme, each node $k$ broadcasts $K$-component
compressed sensor signal observations. We will refer to this as
DANSE$_K$, where the subscript $K$ refers to the number of
channels of the broadcast signals. For the sake of an easy
exposition, we first introduce the DANSE algorithm for the simple case
where $K = 1$ and we will show that DANSE$_1$ converges to the
optimal filters if $Q = 1$, i.e., if the single-channel desired signals
$d_k$ are nonzero scaled versions of the same latent single-channel
signal $d$. In Section IV we generalize this to the more general
DANSE$_K$ algorithm, and we will show that this algorithm
converges to the optimal filters if $K = Q$ and if all $\mathbf{A}_k$ in (4) have
rank $Q$.
A. Algorithm
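The goal for each node $k$ is to estimate the signal $d_k$ with a
linear estimator that uses all observations in the entire network,
i.e., $\bar{d}_k = \mathbf{w}_k^H \mathbf{y}$. We aim to obtain the MMSE solutions (2),
without the need for each node $k$ to broadcast all $M_k$ components
of the $\mathbf{y}_k$ observations. For this, we define a partitioning of the
estimator $\mathbf{w}_k$ as $\mathbf{w}_k = [\mathbf{w}_{k1}^T \ldots \mathbf{w}_{kJ}^T]^T$ with $\mathbf{w}_{kq}$ denoting the
$M_q$-dimensional subvector of $\mathbf{w}_k$ that is applied to $\mathbf{y}_q$, and with
superscript $T$ denoting the transpose operator. In this way, (1)
is equivalent to
$$\hat{\mathbf{w}}_k = \begin{bmatrix} \hat{\mathbf{w}}_{k1} \\ \vdots \\ \hat{\mathbf{w}}_{kJ} \end{bmatrix} = \arg\min_{\{\mathbf{w}_{k1}, \ldots, \mathbf{w}_{kJ}\}} E\left\{ \left| d_k - \sum_{q=1}^{J} \mathbf{w}_{kq}^H \mathbf{y}_q \right|^2 \right\} \tag{5}$$
Since node $k$ only has access to the sensor signal observations
of $\mathbf{y}_k$, it can only control a specific part of the estimator $\mathbf{w}_k$,
namely $\mathbf{w}_{kk}$. In the DANSE$_1$ algorithm, each node $k$
broadcasts the output of this partial estimator, i.e., observations of the
compressed signal $z_k = \mathbf{w}_{kk}^H \mathbf{y}_k$. This reduces the data to be
broadcast by a factor $M_k$. It is noted that $\mathbf{w}_{kk}$ acts both as a
compressor and as a part of the estimator $\mathbf{w}_k$, i.e., the
observations of the compressed signal $z_k$ that are broadcast by node $k$ are
also used in the estimation of $d_k$ at node $k$ itself.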
A node $k$ now has access to $M_k + J - 1$ input channels, i.e.,
its own $M_k$ sensor signals and $J - 1$ signals $z_q$, $q \neq k$, that it receives
from the other nodes. Node $k$ will compute the optimal linear
combiner of these input channels to estimate $d_k$.
The coefficient that is applied to the signal observations of $z_q$
at node $k$ is denoted by $g_{kq}$. A schematic illustration of this
scheme (for $J = 3$) is shown in Fig. 3. Notice that there is
no decompression involved, i.e., node $k$ does not expand the
observations of the $z_q$ signal, but only scales these with a scaling
factor $g_{kq}$. As visualised in Fig. 3, the parametrization of the
$\mathbf{w}_k$ now effectively applied at node $k$ is therefore
$$\tilde{\mathbf{w}}_k = \begin{bmatrix} g_{k1} \mathbf{w}_{11} \\ g_{k2} \mathbf{w}_{22} \\ \vdots \\ g_{kJ} \mathbf{w}_{JJ} \end{bmatrix} \tag{6}$$
i.e., each $\tilde{\mathbf{w}}_k$ is now defined by the set of $\mathbf{w}_{qq}$'s, $q \in \mathcal{J}$,
together with a vector $\mathbf{g}_k = [g_{k1} \ldots g_{kJ}]^T$, defining the scaling
parameters. We use a tilde to indicate that the estimator is
parametrized according to (6), which defines a solution space for
all $\tilde{\mathbf{w}}_k$, $k \in \mathcal{J}$, with a specific structure. In this parametrization,
node $k$ can only manipulate the parameters $\mathbf{w}_{kk}$ and $\mathbf{g}_k$. In the
sequel, we set $g_{kk} = 1$, $\forall k \in \mathcal{J}$, to remove the ambiguity in
the product $g_{kk}\mathbf{w}_{kk}$ (hence $g_{kk}$ is omitted in Fig. 3). Notice that
the solution space of $\tilde{\mathbf{w}}_k$, $\forall k \in \mathcal{J}$, is $(M + J(J-1))$-dimensional,
which is smaller (see footnote 3) than the original $JM$-dimensional solution
space corresponding to the centralized algorithm,
i.e., the solution space of the optimization problem (1). Still, the
goal of the DANSE$_1$ algorithm is to iteratively update the
parameters of (6) until $\tilde{\mathbf{w}}_k = \hat{\mathbf{w}}_k$, $\forall k \in \mathcal{J}$.
Fig. 3. The DANSE$_1$ scheme with three nodes ($J = 3$). Each node $k$
estimates a signal $d_k$ using its own $M_k$-channel sensor signal observations, and
two single-channel signals broadcast by the other two nodes.
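The structured parametrization (6) can be checked on a single observation. The sketch below is an illustration only: the sizes and the names `w_local`, `y_parts`, and `g_k` are hypothetical, and all values are random. It verifies that applying the network-wide vector built from the locally held compressors and the scaling factors of node $k$ gives the same output as combining the local sensor signals of node $k$ with the received single-channel signals.

```python
import numpy as np

rng = np.random.default_rng(2)
J, M_k = 3, 4  # hypothetical: 3 nodes with 4 sensors each


def crandn(*shape):
    """Complex Gaussian samples (illustrative helper, not from the paper)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


w_local = [crandn(M_k) for _ in range(J)]   # w_qq, held by node q
y_parts = [crandn(M_k) for _ in range(J)]   # one observation y_q[t] per node

# Each node q broadcasts only the scalar z_q = w_qq^H y_q.
z = np.array([np.vdot(w_local[q], y_parts[q]) for q in range(J)])

k = 0
g_k = crandn(J)
g_k[k] = 1.0                                # convention g_kk = 1

# Network-wide estimator implied at node k, structured as in (6).
w_tilde_k = np.concatenate([g_k[q] * w_local[q] for q in range(J)])
y = np.concatenate(y_parts)

estimate_network_wide = np.vdot(w_tilde_k, y)            # w_tilde_k^H y
estimate_local = np.vdot(w_local[k], y_parts[k]) + sum(
    np.conj(g_k[q]) * z[q] for q in range(J) if q != k)  # what node k computes
```

Both quantities coincide, which is why no decompression of the received signals is needed: node $k$ only scales them.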
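In the sequel, we will use the following notation and
definitions. In general, we will use $x^i$ to denote $x$ at iteration $i$,
where $x$ can be a signal or a parameter. The $J$-channel signal
$\mathbf{z}$ is defined as $\mathbf{z} = [z_1 \ldots z_J]^T$. We define $\mathbf{z}_{-k}$ as the vector $\mathbf{z}$
with entry $z_k$ omitted. Similarly, we define $\mathbf{g}_{k,-k}$ as the vector $\mathbf{g}_k$
with entry $g_{kk}$ omitted.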
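At every iteration $i$ in the DANSE$_1$ algorithm, one specific
node $k$ will update its local parameters $\mathbf{w}_{kk}$ and $\mathbf{g}_{k,-k}$, by
solving its local node-specific MMSE problem with respect to
its input signals, consisting of its own sensor signal
observations $\mathbf{y}_k$ and the compressed signal observations of $\mathbf{z}_{-k}^i$, i.e.,
it solves
$$\begin{bmatrix} \mathbf{w}_{kk}^{i+1} \\ \mathbf{g}_{k,-k}^{i+1} \end{bmatrix} = \arg\min_{\mathbf{w}_{kk},\, \mathbf{g}_{k,-k}} E\left\{ \left| d_k - \left[ \mathbf{w}_{kk}^H \,\middle|\, \mathbf{g}_{k,-k}^H \right] \begin{bmatrix} \mathbf{y}_k \\ \mathbf{z}_{-k}^i \end{bmatrix} \right|^2 \right\} \tag{7}$$
Let $\tilde{\mathbf{y}}_k^i$ denote the stacked version of the local input signals at
node $k$, i.e.
$$\tilde{\mathbf{y}}_k^i = \begin{bmatrix} \mathbf{y}_k \\ \mathbf{z}_{-k}^i \end{bmatrix} \tag{8}$$
Then the solution of (7) is
$$\begin{bmatrix} \mathbf{w}_{kk}^{i+1} \\ \mathbf{g}_{k,-k}^{i+1} \end{bmatrix} = \left( \mathbf{R}_{\tilde{y}_k \tilde{y}_k}^i \right)^{-1} \mathbf{r}_{\tilde{y}_k d_k}^i \tag{9}$$
with
$$\mathbf{R}_{\tilde{y}_k \tilde{y}_k}^i = E\left\{ \tilde{\mathbf{y}}_k^i \tilde{\mathbf{y}}_k^{iH} \right\} \tag{10}$$
$$\mathbf{r}_{\tilde{y}_k d_k}^i = E\left\{ \tilde{\mathbf{y}}_k^i d_k^* \right\} \tag{11}$$
Since there is no decompression involved, the local estimation
problems (7) have a smaller dimension than the original
network-wide estimation problems (1), i.e., the matrix $\mathbf{R}_{\tilde{y}_k \tilde{y}_k}^i$
is smaller than the matrix $\mathbf{R}_{yy}$ in (2).
3 It is assumed here that $M > J$, i.e., $M_k \geq 1$, $\forall k \in \mathcal{J}$, and there is at least
one node $k$ for which $M_k > 1$.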
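We define a block size $B$ which denotes the number of
observations that the nodes collect in between two successive node
updates, i.e., in between two increments of $i$. The DANSE$_1$
algorithm now consists of the following steps:
1) Initialize: $i \leftarrow 0$, $k \leftarrow 1$.
Initialize $\mathbf{w}_{qq}^0$ and $\mathbf{g}_{q,-q}^0$ with random vectors, $\forall q \in \mathcal{J}$.
2) Each node $q \in \mathcal{J}$ performs the following operation cycle:
• Collect the sensor observations $\mathbf{y}_q[iB + j]$,
$j = 1, \ldots, B$.
• Compress these $M_q$-dimensional observations to
$$z_q^i[iB + j] = \mathbf{w}_{qq}^{iH} \mathbf{y}_q[iB + j], \quad j = 1, \ldots, B \tag{12}$$
• Broadcast the compressed observations $z_q^i[iB + j]$,
$j = 1, \ldots, B$, to the other nodes.
• Collect the $(J-1)$-dimensional data vectors
$\mathbf{z}_{-q}^i[iB + j]$, $j = 1, \ldots, B$, which are stacked
versions of the compressed observations received from
the other nodes.
• Update the estimates of $\mathbf{R}_{\tilde{y}_q \tilde{y}_q}^i$ and $\mathbf{r}_{\tilde{y}_q d_q}^i$, by including
the newly collected data (see footnote 4).
• Update the node-specific parameters:
$$\begin{bmatrix} \mathbf{w}_{qq}^{i+1} \\ \mathbf{g}_{q,-q}^{i+1} \end{bmatrix} = \begin{cases} \left( \mathbf{R}_{\tilde{y}_q \tilde{y}_q}^i \right)^{-1} \mathbf{r}_{\tilde{y}_q d_q}^i & \text{if } q = k \\ \begin{bmatrix} \mathbf{w}_{qq}^{i} \\ \mathbf{g}_{q,-q}^{i} \end{bmatrix} & \text{if } q \neq k \end{cases} \tag{13}$$
• Compute the estimate of $d_q[iB + j]$, $j = 1, \ldots, B$, as
$$\bar{d}_q[iB + j] = \mathbf{w}_{qq}^{i+1\,H} \mathbf{y}_q[iB + j] + \mathbf{g}_{q,-q}^{i+1\,H} \mathbf{z}_{-q}^i[iB + j] \tag{14}$$
3) $k \leftarrow (k \bmod J) + 1$.
4) $i \leftarrow i + 1$.
5) Return to step 2).
4 In Section V-A, we will suggest some possible strategies to estimate these
parameters.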
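The sequential updating scheme can be sketched numerically as follows. This is a simplified illustration under stated assumptions, not the adaptive implementation: it replaces the block-wise estimation of the correlation quantities by exact statistics of an assumed model $\mathbf{y} = \mathbf{h} d + \mathbf{n}$ with unit-variance $d$ and $d_k = a_k d$, and all sizes and names (`h`, `a`, `C`, `mse`, `history`, `excess`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
J, M_k = 3, 4                       # hypothetical: 3 nodes with 4 sensors each
M = J * M_k
blocks = [slice(q * M_k, (q + 1) * M_k) for q in range(J)]


def crandn(*shape):
    """Complex Gaussian samples (illustrative helper, not from the paper)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


# Assumed model: y = h d + n, E{|d|^2} = 1, d_k = a_k d, noise uncorrelated with d.
h, N, a = crandn(M), crandn(M, M), crandn(J)
R_yy = np.outer(h, h.conj()) + N @ N.conj().T / M + 0.1 * np.eye(M)
r_yd = [np.conj(a[k]) * h for k in range(J)]                 # E{y d_k^*}
w_opt = [np.linalg.solve(R_yy, r_yd[k]) for k in range(J)]   # centralized (2)


def mse(k, w_full):
    """E{|d_k - w^H y|^2} evaluated from the exact statistics."""
    return (abs(a[k]) ** 2 - 2 * np.vdot(w_full, r_yd[k]).real
            + np.vdot(w_full, R_yy @ w_full).real)


w = [crandn(M_k) for _ in range(J)]  # w_qq, random initialization
g = [crandn(J) for _ in range(J)]    # g_k, random initialization
for k in range(J):
    g[k][k] = 1.0                    # convention g_kk = 1


def w_tilde(k):
    """Network-wide estimator implied at node k, structured as in (6)."""
    return np.concatenate([g[k][q] * w[q] for q in range(J)])


history = []                         # (node, cost before update, cost after)
for i in range(60):
    k = i % J                        # round-robin choice of the updating node
    others = [q for q in range(J) if q != k]
    # C^H y equals the local input [y_k; z_{-k}] of node k, as in (8).
    C = np.zeros((M, M_k + J - 1), dtype=complex)
    C[blocks[k], :M_k] = np.eye(M_k)
    for col, q in enumerate(others):
        C[blocks[q], M_k + col] = w[q]
    R_loc = C.conj().T @ R_yy @ C    # local correlation matrix, as in (10)
    r_loc = C.conj().T @ r_yd[k]     # local correlation vector, as in (11)
    before = mse(k, w_tilde(k))
    sol = np.linalg.solve(R_loc, r_loc)          # local solution, as in (9)
    w[k], g[k][others] = sol[:M_k], sol[M_k:]
    history.append((k, before, mse(k, w_tilde(k))))

# Excess MSE of each node relative to the centralized solution.
excess = [mse(k, w_tilde(k)) - mse(k, w_opt[k]) for k in range(J)]
```

Each update can only lower the cost of the updating node, and no node can do better than the centralized estimator, so the entries of `excess` are nonnegative. Under the single-latent-signal assumption analyzed in Section III-B, one would expect them to approach zero as the number of iterations grows.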
Remark I: Notice that the different iterations are spread out
over time. Therefore, iterative characteristics of the algorithm
do not have an impact on the amount of data that is transmitted,
i.e., each sample is only broadcast once since the time index in
(12) and (14) shifts together with the iteration index.
Remark II: In the above algorithm description, it is not
mentioned how the correlation matrix $\mathbf{R}_{\tilde{y}_k \tilde{y}_k}^i$ and the
correlation vector $\mathbf{r}_{\tilde{y}_k d_k}^i$ should be estimated. This estimation
process depends on the application and the signals involved.
In Section V-A, we will suggest some possible strategies to
estimate $\mathbf{R}_{\tilde{y}_k \tilde{y}_k}^i$ and $\mathbf{r}_{\tilde{y}_k d_k}^i$.
Remark III: It is noted that, when a node $k$ updates its
node-specific parameters $\mathbf{w}_{kk}$ and $\mathbf{g}_{k,-k}$, the signal statistics of $z_k$
change, i.e., $z_k^i$ changes to $z_k^{i+1}$. Therefore, the next node to
perform an update needs a sufficient number of observations of
$z_k^{i+1}$ to reliably estimate the correlation coefficients involving this
signal. Therefore, the block length $B$ should be chosen large
enough.
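B. Convergence and Optimality of DANSE$_1$ if $Q = 1$ and
Nonzero Desired Signals
We now assume that all $d_k$ are a nonzero scaled version of
the same signal $d$, i.e., $d_k = a_k d$, with $a_k$ a nonzero complex
scalar but unknown to the individual nodes. Formula (2) shows
that in this case, all $\hat{\mathbf{w}}_k$ are parallel, i.e.
$$\hat{\mathbf{w}}_k = \alpha_{kq} \hat{\mathbf{w}}_q, \quad \forall k, q \in \mathcal{J} \tag{15}$$
with $\alpha_{kq} = a_k^* / a_q^*$. Therefore, the set $\{\hat{\mathbf{w}}_1, \ldots, \hat{\mathbf{w}}_J\}$ belongs
to the solution space used by DANSE$_1$, as specified by (6), i.e.,
(6) holds with $\mathbf{w}_{qq} = \hat{\mathbf{w}}_{qq}$ and $g_{kq} = \alpha_{kq}$.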
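The parallelism of the optimal estimators can be verified numerically. In the sketch below, which is only an illustration, the correlation matrix, the vector `h`, and the scalars `a` are random hypothetical values. With $d_k = a_k d$ and unit-variance $d$, the correlation vector of node $k$ equals $a_k^*$ times a common vector, so the solutions of (2) differ only by the ratio $a_k^* / a_q^*$.

```python
import numpy as np

rng = np.random.default_rng(3)
M, J = 8, 3  # hypothetical: 8 channels in total, 3 nodes


def crandn(*shape):
    """Complex Gaussian samples (illustrative helper, not from the paper)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


h = crandn(M)                                   # E{y d^*} for unit-variance d
N = crandn(M, M)
R_yy = np.outer(h, h.conj()) + N @ N.conj().T / M + 0.1 * np.eye(M)
a = crandn(J)                                   # nonzero scalings, d_k = a_k d

# r_{y d_k} = E{y d_k^*} = a_k^* h, so eq. (2) gives w_hat_k = a_k^* R_yy^{-1} h.
w_hat = [np.linalg.solve(R_yy, np.conj(a[k]) * h) for k in range(J)]


def alpha(k, q):
    """Ratio relating the optimal estimators of nodes k and q."""
    return np.conj(a[k]) / np.conj(a[q])
```

For any pair of nodes, `w_hat[k]` then equals `alpha(k, q) * w_hat[q]` up to rounding, which is the property that lets a single broadcast channel per node suffice.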
In the theoretical convergence analysis in the sequel, we
assume that the correlation matrices $\mathbf{R}_{\tilde{y}_k \tilde{y}_k}^i$ and the correlation
vectors $\mathbf{r}_{\tilde{y}_k d_k}^i$, $\forall k \in \mathcal{J}$, are perfectly estimated, i.e., as if they
are computed over an infinite observation window. Under this
assumption, the following theorem guarantees convergence and
optimality of the DANSE$_1$ algorithm.
Theorem III.1: If the sensor signal correlation matrix $\mathbf{R}_{yy}$ has
full rank, and if $d_k = a_k d$, $\forall k \in \mathcal{J}$, with $d$ a complex-valued
single-channel signal and $a_k \in \mathbb{C} \setminus \{0\}$, then the DANSE$_1$
algorithm converges for any initialization of its parameters to
the MMSE solution (2) for all $k$.
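Before proving this theorem, we introduce some additional
notation. The vector $\mathbf{w}$ (without subscript) denotes the stacked
vector of all $\mathbf{w}_{kk}$ vectors, i.e.
$$\mathbf{w} = \begin{bmatrix} \mathbf{w}_{11} \\ \vdots \\ \mathbf{w}_{JJ} \end{bmatrix} \tag{16}$$
We also define the following MSE cost functions corresponding
to node $k$:
$$J_k(\mathbf{w}_k) = E\left\{ \left| d_k - \mathbf{w}_k^H \mathbf{y} \right|^2 \right\} \tag{17}$$
$$\tilde{J}_k(\mathbf{w}, \mathbf{g}_k) = E\left\{ \left| d_k - \tilde{\mathbf{w}}_k^H \mathbf{y} \right|^2 \right\} \tag{18}$$
where $\tilde{\mathbf{w}}_k$ is defined from $\mathbf{w}$ and $\mathbf{g}_k$ as in (6). Notice that $\mathbf{g}_k$
contains the entry $g_{kk}$, which is a fictitious variable that is never
actually computed by the algorithm. We define $F_k$
as the function that generates $\mathbf{w}_{kk}^{i+1}$ from $\mathbf{w}^i$ according to (9), i.e.
$$F_k(\mathbf{w}^i) = \left[ \mathbf{I}_{M_k} \;\; \mathbf{O}_{M_k \times (J-1)} \right] \left( \mathbf{R}_{\tilde{y}_k \tilde{y}_k}^i \right)^{-1} \mathbf{r}_{\tilde{y}_k d_k}^i \tag{19}$$
with $\mathbf{I}_{M_k}$ denoting an $M_k \times M_k$ identity matrix and $\mathbf{O}_{M_k \times (J-1)}$ denoting
an all-zero $M_k \times (J-1)$ matrix. It is noted that the right-hand side
of (19) depends on all entries of the argument $\mathbf{w}^i$ through the
signal $\mathbf{z}_{-k}^i$, which is not explicitly revealed in this expression.
The proof of Theorem III.1 provided here differs from the
proof in [10], where a scheme similar to DANSE$_1$ with $J = 2$
has been proved to converge to the optimal solution. Unlike
the proof in [10], our proof allows for a generalization to the
DANSE$_K$ case with $K > 1$, it allows $J > 2$, and provides
more insight in the convergence properties of the algorithm. We
first prove the convergence statement of Theorem III.1, and then
the optimality statement.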
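Proof of Convergence: We prove that the sequence
$\{\mathbf{w}^i\}_{i \in \mathbb{N}}$ and the sequences $\{\mathbf{g}_{k,-k}^i\}_{i \in \mathbb{N}}$, $\forall k \in \mathcal{J}$, converge to a
limit point $\mathbf{w}^\infty$ and $\mathbf{g}_{k,-k}^\infty$ respectively. When node $k$ performs
an update of its variables $\mathbf{w}_{kk}$ and $\mathbf{g}_{k,-k}$ at iteration $i$, these
are replaced by the solution of the local MMSE problem (7),
repeated here for convenience:
$$\min_{\mathbf{w}_{kk},\, \mathbf{g}_{k,-k}} E\left\{ \left| d_k - \left[ \mathbf{w}_{kk}^H \,\middle|\, \mathbf{g}_{k,-k}^H \right] \begin{bmatrix} \mathbf{y}_k \\ \mathbf{z}_{-k}^i \end{bmatrix} \right|^2 \right\} \tag{20}$$
If another node $q \neq k$ were to optimize the variables $\mathbf{w}_{kk}$ and $\mathbf{g}_{q,-k}$
with respect to its own node-specific estimation problem, it
would solve the problem
$$\min_{\mathbf{w}_{kk},\, \mathbf{g}_{q,-k}} E\left\{ \left| d_q - \left[ \mathbf{w}_{kk}^H \,\middle|\, \mathbf{g}_{q,-k}^H \right] \begin{bmatrix} \mathbf{y}_k \\ \mathbf{z}_{-k}^i \end{bmatrix} \right|^2 \right\} \tag{21}$$
Since $d_q = \alpha d_k$ with $\alpha = a_q / a_k$, the solutions of (20) and
(21) are identical up to a scalar $\alpha^*$. This means that an update
of $\mathbf{w}_{kk}$ and $\mathbf{g}_{k,-k}$ at node $k$, which is an optimization leading
to a decrease of $\tilde{J}_k$, will also lead to a decrease of $\tilde{J}_q$ for any
$q \in \mathcal{J}$ if node $q$ were allowed to also perform a responding
optimization of its $\mathbf{g}_q$. This shows that for any $q \in \mathcal{J}$ (independent of
the selection of the node $k$ that actually performs an update at
iteration $i$)
$$\min_{\mathbf{g}_q} \tilde{J}_q(\mathbf{w}^{i+1}, \mathbf{g}_q) \leq \min_{\mathbf{g}_q} \tilde{J}_q(\mathbf{w}^{i}, \mathbf{g}_q) \tag{22}$$
Since all $\tilde{J}_q$ have a lower bound, each sequence
$\{\min_{\mathbf{g}_q} \tilde{J}_q(\mathbf{w}^i, \mathbf{g}_q)\}_{i \in \mathbb{N}}$ converges to a limit $L_q$, i.e.
$$\lim_{i \to \infty} \min_{\mathbf{g}_q} \tilde{J}_q(\mathbf{w}^{i}, \mathbf{g}_q) = L_q, \quad \forall q \in \mathcal{J} \tag{23}$$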
If we again assume that node performs an update at iteration
, then because of the strict convexity of the cost function in (20),
the following expression holds:
(24)
with
(25)
This shows that, after convergence of the sequences
, , any update of a
must correspond to a scaling. Notice however that
.
.
..
.
.(26)
i.e., a scaling of a in node does not change the update of
in node , since the scaling is implicitly compensated in
by the parameter . This proves convergence of the sequence
to a limit point and therefore also the sequences
must converge to a limit point , . Notice
that after convergence, based on what was stated earlier
(27)
or equivalently
(28)
From the proof of convergence, one can also conclude that
convergence of the cost functions will be monotonic, when
sampled at the iteration steps in which node updates its
parameters. Indeed, whenever node optimizes its own local
MMSE problem, it also optimizes the corresponding MMSE
problem in node , at least when the latter is allowed to perform
a responding update of its parameter . This shows that the
algorithm is at least as fast as a centralized equivalent
that would use an alternating optimization (AO) technique
[16], which is often referred to as the nonlinear Gauss-Seidel
algorithm [17], with partitioning following directly from the
parameters and for each node.
Proof of Optimality: We now prove that is the solution
of (1) for every node , which is equivalent to proving that the
gradient of is zero when evaluated at equilibrium, i.e.
(29)
Because the solution of (20) sets the partial gradient of with
respect to to zero, we find that
(30)
Since , we can show that
(31)
Combining (30) and (31) yields
(32)
Notice that (27) is equivalent to
(33)
Substituting (33) in (32) yields
(34)
which is equivalent to (29). This proves the theorem.
IV. DANSE WITH $K$-CHANNEL BROADCAST SIGNALS
A. Algorithm
In the DANSE$_K$ algorithm, each node $k$ broadcasts
$K$-component compressed sensor signal observations
to the other nodes. This compresses the data to be
sent by node $k$ by a factor of $M_k/K$. We assume
that each node $k$ estimates a $K$-channel desired signal
$d_k$. Assuming that the desired signals
share a common $Q$-dimensional latent signal subspace, we
will show in Section IV-B that DANSE$_K$ achieves the optimal
estimators if $K$ is chosen equal to $Q$. Notice that the actual
signal(s) of interest can be a subset of the vector , and the
other entries should then be seen as auxiliary channels to fully
capture the latent signal subspace, as explained in Section II-B.
Generally, these auxiliary channels are obtained by choosing
extra reference sensors at node .
Again, we use a linear estimator to estimate as
. The objective for node is to
find the linear MMSE estimator
(35)
The solution of (35) is
(36)
with . Again, we define a partitioning of the
estimator as with denoting
the submatrix of that is applied to . We wish
to obtain (36) without the need for each node to broadcast all
components of the observations. Instead each node
will broadcast observations of the -channel compressed signal
. Since the channels of will be highly corre-
lated, further joint compression is possible, but we will not take
this into consideration throughout this paper.
A node can transform the observations of that it receives
from node by a transformation matrix . Again,
it is noted that does not decompress the observations of
the signal , but makes new linear combinations of their
components. The parametrization of the effectively applied
at node is then
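$$W_k = \begin{bmatrix} W_{11} G_{k1} \\ W_{22} G_{k2} \\ \vdots \\ W_{JJ} G_{kJ} \end{bmatrix} \qquad (37)$$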
which is a generalization of (6). Here, node can only optimize
the parameters and . We set
with denoting the identity matrix.
The -channel signal is a stacked ver-
sion of all the broadcast signals. Similarly to the notation in
Section III, we define the signal as the signal with
omitted, and we define as the matrix with the subma-
trix omitted. The MMSE problem that is solved at node ,
at iteration , is now
(38)
The solution of (38) is
(39)
with defined as in (10) and with
(40)
The algorithm consists of the following steps:
1) Initialize: , .
Initialize and with random matrices, .
2) Each node performs the following operation cycle:
• Collect the sensor observations ,
.
• Compress these -dimensional observations to
-dimensional vectors
(41)
• Broadcast the compressed observations ,
, to the other nodes.
• Collect the -dimensional data vectors
, , which are stacked
versions of the compressed observations received from
the other nodes.
• Update the estimates of and , by including
the newly collected data.
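• Update the node-specific parameters: the node whose turn
it is to update replaces its parameters by the solution (39)
of its local MMSE problem (38); every other node keeps
its current parameters. (42)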
• Compute the estimate of , ,
as
(43)
3) Increment the iteration index.
4) Select the next node, in round robin order, as the updating node.
5) Return to step 2)
DANSE$_K$ is a straightforward generalization of the DANSE$_1$
algorithm as explained in Section III-A, where all vector variables
are replaced by their matrix equivalent. Similarly, expressions
(16)–(19) can be straightforwardly generalized to their
matrix equivalent.
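The operation cycle above can be sketched in batch mode as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: signals are real-valued (so transposes replace Hermitian transposes), correlation matrices are replaced by their least-squares equivalents, the desired signal of each node is taken as the noise-free component in its first $K$ sensors, and the interval $[-0.5, 0.5]$, the sample count, the number of sweeps, and all variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
J, M, Q, N = 4, 10, 3, 5000        # nodes, sensors per node, latent dimension, samples
K = Q                              # broadcast channels per node (K chosen equal to Q)

d = rng.uniform(-0.5, 0.5, size=(Q, N))                      # latent Q-channel signal
A = [rng.uniform(0.0, 1.0, size=(M, Q)) for _ in range(J)]   # mixing matrix per node
noise_std = np.sqrt(0.5 * d.var())                           # noise at half the signal power
Y = [A[k] @ d + noise_std * rng.standard_normal((M, N)) for k in range(J)]
D = [A[k][:K] @ d for k in range(J)]    # desired signals: all lie in the same latent subspace

def ls_solve(X, T):
    """Return W minimising ||T - W^T X||_F^2 (least-squares version of the MMSE solution)."""
    return np.linalg.lstsq(X.T, T.T, rcond=None)[0]

# Centralized reference: every node would use all J*M sensor signals.
Y_all = np.vstack(Y)
cost_central = [np.mean((D[k] - ls_solve(Y_all, D[k]).T @ Y_all) ** 2) for k in range(J)]

# Random initialisation of the local compressors W[k] and the transforms G[k].
W = [rng.uniform(0.0, 1.0, size=(M, K)) for _ in range(J)]
G = [rng.uniform(0.0, 1.0, size=(K * (J - 1), K)) for _ in range(J)]

def local_input(k):
    """Stack node k's own sensors with the K-channel broadcasts of all other nodes."""
    Z = [W[q].T @ Y[q] for q in range(J)]
    return np.vstack([Y[k]] + [Z[q] for q in range(J) if q != k])

def node_cost(k):
    """LS cost of node k with its current local parameters."""
    W_tilde = np.vstack([W[k], G[k]])
    return np.mean((D[k] - W_tilde.T @ local_input(k)) ** 2)

for i in range(30 * J):                 # 30 round-robin sweeps
    q = i % J                           # only node q updates at iteration i
    W_tilde = ls_solve(local_input(q), D[q])
    W[q], G[q] = W_tilde[:M], W_tilde[M:]

cost_danse = [node_cost(k) for k in range(J)]
```

In this sketch each node solves a problem of dimension $M + K(J-1) = 19$ instead of $JM = 40$, and `cost_danse` can be compared against `cost_central` to check how closely the distributed solution matches the centralized one.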
B. Convergence and Optimality of if and
Full Rank
We now assume that , , with a
matrix of rank and a complex valued -channel signal.
This means that all desired signals share the same -dimen-
sional latent signal subspace (i.e., ). Formula (36) shows
that in this case all have the same column space, i.e.
(44)
with . Therefore, the set be-
longs to the solution space used by , as specified by
(37), i.e., . The following
theorem generalizes Theorem III.1.
Theorem IV.1: If the sensor signal correlation matrix
has full rank, and if , , with a complex
valued -channel signal and a matrix of rank ,
then the algorithm converges for any initialization of
its parameters to the MMSE solution (36) for all .
Proof: The proof of Theorem III.1 can straightforwardly
be generalized to prove Theorem IV.1, by replacing every
and by its matrix version and .
In practice, the matrices should be well-conditioned to
obtain the optimal estimators, which is reflected in Theorem
IV.1 by the condition that has full rank. If the -channel
desired signal is defined as the target signal in reference
sensors at node , this matrix can be ill-conditioned if the refer-
ence sensors are close to each other. This problem is investigated
in [9], where the DANSE algorithm is used for noise reduction
in acoustic sensor networks, and a solution is proposed to tackle
this problem.
C. DANSE Under Rank Deficiency
Until now, we have avoided the case where $R_{yy}$ does not
have full rank or when the parameter $K$ is overestimated, i.e.,
$K > Q$. Both cases can result in broadcast data for which the
correlation matrix is rank deficient (see footnote 5). In this case,
(38) becomes ill-posed since singular correlation matrices are
involved. The algorithm can cope with these situations by adding
a minimum-norm constraint to the local MMSE problems (38),
i.e., using the pseudo-inverse instead of a matrix inverse in the
computation of the solution of (38) [15]. Extensive simulations
have shown that with this modification, the algorithm
still converges to an MMSE solution for rank deficient
estimation problems (see Section VI).
Footnote 5: In the case where , (44) has multiple solutions for since
, . Therefore, the correlation matrix of the broadcast
signal becomes singular, once the submatrix reaches this
rank deficiency.
However, if the matrix does not have full rank, the so-
lution of (1) is not unique. Simulations have shown that the
solutions obtained by the algorithm, although
leading to a minimal MSE cost at node , are generally different
from the solutions provided by the centralized minimum norm
version, i.e.
(45)
where superscript denotes the pseudoinverse.
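The effect of the minimum-norm constraint can be illustrated with a small rank-deficient example. This is a sketch with hypothetical data (a three-channel signal whose third channel is the sum of the first two) and hypothetical variable names; it only shows that the pseudoinverse selects one of infinitely many estimators that all reach the same cost.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 2000
s = rng.standard_normal((2, N))
Y = np.vstack([s, s[0] + s[1]])      # third channel is redundant, so rank(Y) = 2
d = 0.7 * s[0] - 0.3 * s[1]          # desired signal

R_yy = Y @ Y.T / N                   # singular 3x3 sample correlation matrix
r_yd = Y @ d / N

# Minimum-norm solution via the pseudoinverse (explicit cutoff for tiny singular values).
w_min_norm = np.linalg.pinv(R_yy, rcond=1e-10) @ r_yd

# Adding any vector from the null space of R_yy gives another optimal estimator.
w_other = w_min_norm + 0.5 * np.array([1.0, 1.0, -1.0])
```

Both `w_min_norm` and `w_other` produce the same estimate of `d`, which is consistent with the observation that a distributed and a centralized minimum-norm solution may differ while attaining the same MSE.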
V. IMPLEMENTATION ASPECTS
A. Estimation of the Signal Statistics
In the theoretical analysis of the algorithm, it is as-
sumed that the second order signal statistics, which are needed
to solve the MMSE problem (38) are perfectly known. How-
ever, in a practical application, the correlation matrices
and have to be estimated, based on the collected signal
observations. In this section, we will describe some strategies to
estimate these quantities.
Estimation of signal correlation matrices is typically done by
time averaging. This means that some assumptions are made on
short-term ergodicity and stationarity of the signals involved.
However, this stationarity assumption is not necessarily strict.
Even when the signals involved are nonstationary (such as in
speech processing), the algorithm can provide good
estimators. By using long-term correlation matrices, the influ-
ence of rapidly changing temporal statistics is smoothed out,
yielding estimators that mainly exploit the spatial coherence
between the sensors. Since spatial coherence typically changes
slowly, the algorithm is able to provide good estima-
tors, even when the signals themselves are highly nonstationary
(this is demonstrated, e.g., by the multichannel speech enhancement
experiments in [9]).
We let denote the estimate of at time . Signal
correlation matrices are often estimated in practice by means of
a forgetting factor , i.e.
(46)
Notice that in the algorithm, the statistics change
every time a node updates its parameters. Therefore, (46) is not
suited to compute and , since it uses an infi-
nite time window. A better alternative is a simple time averaging
in a finite observation window, i.e.
(47)
where is the length of the observation window. The procedure
(46) puts more emphasis on the most recent samples, whereas
(47) applies an equal weight to all past samples in the obser-
vation window. The procedure (47) can be implemented recur-
sively by means of an updating and a downdating term, i.e.
(48)
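The two estimation procedures can be sketched as follows. The exact normalization used in (46)–(48) is not reproduced here; the forms below (a convex combination for the forgetting factor, and a $1/L$-weighted rank-one update and downdate for the finite window) are the standard ones and should be read as assumptions, as are the function and variable names.

```python
import numpy as np

def forgetting_update(R_prev, y, lam):
    """Exponentially weighted estimate with forgetting factor 0 < lam < 1 (infinite window)."""
    return lam * R_prev + (1.0 - lam) * np.outer(y, y.conj())

def sliding_window_update(R_prev, y_new, y_old, L):
    """Finite-window estimate: add the newest sample, remove the one leaving the window."""
    return R_prev + (np.outer(y_new, y_new.conj()) - np.outer(y_old, y_old.conj())) / L

rng = np.random.default_rng(3)
M, L, T = 4, 50, 200                 # channels, window length, total samples
y = rng.standard_normal((T, M)) + 1j * rng.standard_normal((T, M))

# Start from the direct average over the first L samples, then slide the window.
R = sum(np.outer(v, v.conj()) for v in y[:L]) / L
for t in range(L, T):
    R = sliding_window_update(R, y[t], y[t - L], L)

# Direct (non-recursive) average over the last L samples, for comparison.
R_direct = sum(np.outer(v, v.conj()) for v in y[T - L:T]) / L
```

The recursive form costs two rank-one operations per sample regardless of the window length, at the price of storing the last $L$ observations.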
Notice that the window length introduces a trade-off between
tracking performance and estimation performance. Indeed, to
have a fast tracking, the statistics must be estimated from short
signal segments, yielding larger estimation errors in the correla-
tion matrices that are used to compute the estimators at the dif-
ferent nodes. However, as will be demonstrated in Section VI-B,
the algorithm is more robust to these errors, com-
pared to the equivalent centralized algorithm, due to the fact
that uses correlation matrices with smaller dimen-
sions than the network-wide estimation problem.
The estimation of is less straightforward since the
signal cannot be observed directly. However, depending on
the application and the signals involved, some strategies can be
developed to estimate , as explained in the following two
examples.
If the transmitting sources are controlled by the application
itself, as is the case in a communications scheme, the source
signals that define the different channels in can be manipu-
lated directly. At periodic intervals, a deterministic training se-
quence can be broadcast by the transmitters. If the nodes have
knowledge about these training sequences, they can use this to
compute in a similar way as in (48), during the broad-
cast of these training sequences. After the broadcast, the esti-
mate is fixed until new training sequences are broadcast.
A different strategy can be applied if the desired signal has
an ON–OFF behavior (see footnote 6). Assume that the sensor signals in
consist of a desired component and an additive noise component
, i.e., , where has an ON–OFF behavior, and where
then . In many practical applications, it can
also be assumed that and are independent, and therefore (see footnote 7)
(49)
If there is a detection mechanism available that detects whether
the signal is present or not, one can estimate in
time segments where only noise is observed (“noise-only seg-
ments”). Since the noise is uncorrelated to the desired compo-
nent , we find that
(50)
with
(51)
where is the desired component in the signal . The selection
matrix is used to select the first columns corresponding
to . Define the noise correlation matrix
(52)
where denotes the noise component in the signal . With
(50), and similarly to (49), we readily find that
(53)
Using (53), one can compute as the difference between
and , where the latter is computed as in (48),
during noise-only periods.
Footnote 6: This is often used in speech enhancement applications, since a
speech signal typically contains a lot of silent pauses in between words or
sentences.
Footnote 7: For the sake of an easy exposition, we assume that the signals
and have zero mean.
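The subtraction idea behind (49) and (53) can be illustrated in its simplest uncompressed form. This is a sketch under assumptions: a single hypothetical source with unit power when active, a perfectly known ON–OFF pattern, real-valued signals, and the signal-plus-noise correlation estimated only over active segments. The variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
M, N = 5, 20000
a = rng.standard_normal(M)                    # hypothetical source-to-sensor gains
active = (np.arange(N) % 100) < 66            # known ON-OFF pattern, active 66% of the time
s = rng.standard_normal(N) * active           # source signal, zero in OFF segments
n = 0.5 * rng.standard_normal((M, N))         # additive noise, independent of the source
y = np.outer(a, s) + n

def corr(x):
    """Sample correlation matrix of the columns of x."""
    return x @ x.conj().T / x.shape[1]

R_yy = corr(y[:, active])      # estimated while the source is active
R_nn = corr(y[:, ~active])     # estimated in noise-only segments
R_dd = R_yy - R_nn             # estimate of the desired-signal correlation matrix

R_true = np.outer(a, a)        # exact value for a unit-power source
rel_err = np.linalg.norm(R_dd - R_true) / np.linalg.norm(R_true)
```

The quality of `R_dd` depends on how many samples fall in each type of segment, which is one reason the window lengths for the two estimates can be chosen differently.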
Notice that, even if the target signal does not have this ON–OFF
behavior, the above strategy can be used in a semi-adaptive con-
text, i.e., where the target signal statistics may change but the
noise statistics are static and a priori known (or vice versa). In-
deed, if is known, then (53) can be used to compute
the required statistics. Notice that in (53) is a compressed
version of , i.e., it depends on the current parameters
in . Therefore, each node has to broadcast the entries of
, which are needed in the other nodes to compress the cor-
responding submatrices in . Since these values change
only once for each observations that are collected by the
sensors, the resulting increase in bandwidth is negligible com-
pared to the transmission of the samples of .
B. Computational Complexity
The estimation of the correlation matrices and ,
and the inversion of the former, are the most computationally
expensive steps of the algorithm. From (48) it fol-
lows that an update of at node , has a computational
complexity of
(54)
i.e., it is quadratic in the number of nodes , the number of
channels in the broadcast signals, and the number of channels
of the signal . If node updates its parameters and
according to (39), it performs a matrix inversion, which is
computationally more expensive than (54). However, instead of
computing this inversion, node can directly update the inverse
of at each time by means of the matrix inversion
lemma [15], i.e.
(55)
(56)
This update also has computational complexity (54), and
therefore this is the overall complexity for a single node in the
algorithm.
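A possible realization of this inverse update is sketched below. It applies the rank-one form of the matrix inversion lemma (Sherman-Morrison) twice per sample, once for the updating term and once for the downdating term of a sliding window. The exact expressions (55) and (56) are not reproduced; the $1/L$ scaling and all names are assumptions.

```python
import numpy as np

def rank_one_inverse_update(P, u, sign=1.0):
    """Return inv(A + sign * u u^H) given P = inv(A), for Hermitian A."""
    Pu = P @ u
    denom = 1.0 + sign * np.vdot(u, Pu)       # np.vdot conjugates its first argument
    return P - sign * np.outer(Pu, Pu.conj()) / denom

rng = np.random.default_rng(5)
M, L = 6, 40                                   # channels, window length
Y = rng.standard_normal((M, L + 1)) + 1j * rng.standard_normal((M, L + 1))

R_old = Y[:, :L] @ Y[:, :L].conj().T / L       # window over samples 0 .. L-1
P = np.linalg.inv(R_old)                       # one explicit inversion at start-up

y_new, y_old = Y[:, L], Y[:, 0]
P = rank_one_inverse_update(P, y_new / np.sqrt(L), +1.0)   # updating term
P = rank_one_inverse_update(P, y_old / np.sqrt(L), -1.0)   # downdating term

R_new = Y[:, 1:] @ Y[:, 1:].conj().T / L       # window over samples 1 .. L
```

Each call only needs matrix-vector products and one outer product, so the cost per sample is quadratic in the dimension of the correlation matrix rather than cubic.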
VI. NUMERICAL SIMULATIONS
In this section, we provide simulation results to demonstrate
the behavior of the algorithm. In Section VI-A, we
perform batch mode simulations where the required statistics
are computed over the full length signals, and where the ’s are
available (see footnote 8) to compute . In the batch version of ,
all iterations are performed on the same set of signal observa-
tions. In Section VI-B, a more practical scenario with moving
sources is considered. The algorithm adapts to the
changes in the scenario, and each set of observations is only
broadcast once, i.e., subsequent iterations are performed over
different observation sets. Furthermore, a practical estimation
of the correlation matrices is used, where the ’s are assumed
to be unavailable.
A. Batch Mode Simulations
In this section, we simulate the algorithm in batch
mode. This means that all iterations are performed on the full
signal length. The network consists of four nodes , each
having 10 sensors . The dimension of the latent signal
subspace defined by is . All 3 channels of are uni-
formly distributed random processes on the interval [ ]
from which samples are generated. The coefficients
in are generated by a uniform random process on the unit in-
terval. The sensor signals in consist of the different random
mixtures of the latent -channel signal to which zero-mean
white noise is added with half the power of the channels of .
The initial values of all and are taken from a uniform
random distribution on the unit interval.
The batch mode performance of the algorithm as
well as the algorithm is simulated for this particular
scenario. All evaluations of the MSE cost functions are per-
formed on the equivalent least-squares (LS) cost functions, i.e.
(57)
Also, the correlation matrices are replaced by their least squares
equivalent, i.e., is replaced by where denotes
the sample matrix that contains samples of the variable
in its columns.
The results are illustrated in Fig. 4, showing the LS cost of
node 1 versus the iteration index . Node 1 is the first node
that performs an update. It is observed that the al-
gorithm converges to the optimal linear LS solution, whereas
the algorithm does not since in this case.
Downsampling the curve corresponding to by a factor
, keeping only the iterations in which node 1 updates its
parameters, results in a monotonically decreasing cost. This is
because of expression (22), showing that the cost indeed monotonically
decreases whenever a node optimizes its parameters. If the curve
corresponding to is downsampled with the same factor, we do not obtain a
monotonically decreasing cost, since expression (22) is not valid anymore
for this case.
Footnote 8: This is similar to using a priori known training sequences.
Fig. 4. LS error of node 1 versus iteration for four different scenarios in a
network with nodes. Each node has 10 sensors.
Fig. 5. LS error of node 1 versus iteration for networks with ,
and nodes respectively. Each node has 10 sensors.
In Fig. 5, we vary the number of nodes , keeping all other
parameters unchanged. All nodes again have 10 sensors. Not
surprisingly, the convergence time of increases lin-
early with since the effective number of updates per time unit
in node 1 is reduced. As soon as each node has updated its pa-
rameters three times, the cost is almost at its minimum at each
node.
In Fig. 6(a), we increase the value of while keeping
. Notice that this corresponds to the case where is
overestimated and hence communication bandwidth is used
inefficiently. The estimation problem becomes rank deficient in
this case, and so the algorithm should be modified by replacing
matrix inversions by pseudoinversions (see Section IV-C). The
algorithm still converges, and the optimal LS cost is again
reached after three iterations per node when is overesti-
mated. In Fig. 6(b), we increase the value of together with
, keeping . This is again observed to have a negligible
effect on convergence time.
As a general conclusion, we can state that for all settings
of the parameters , , , the algorithm approxi-
mately achieves convergence as soon as each node has updated
its parameters three times.
Simulation results with speech signals are provided in a
follow-up paper [9]. In this paper, a distributed speech enhance-
ment algorithm based on and its variations, is tested
in a simulated acoustic sensor network scenario.
B. Adaptive Implementation
In this section, we show simulation results of a practical
implementation of the algorithm in a scenario with
moving sources. The main difference with the batch mode
simulations is that subsequent iterations are now performed
on different signal segments, i.e., the same data is never used
twice. This yields larger estimation errors, since shorter signal
segments are used to estimate the statistics of the input signals.
Furthermore, we will use a practical estimation procedure to
estimate the correlation matrices and , yielding
larger estimation errors.
The scenario is depicted in Fig. 7. The network contains
nodes . Each node has a reference sensor at the node itself,
and can collect observations of five additional sensors that
are uniformly distributed within a 1.6-m radius around the node.
Eight localized white Gaussian noise sources are present.
Two target sources move back and forth over the indicated
straight lines at a speed of 1 m/s, and halt for 2 s at the end points
of these lines. The first source (moving on the vertical line)
transmits a low-pass filtered white noise signal with a cut-off
frequency of 1600 Hz. The other source transmits a band-pass
filtered white noise signal in the frequency range from 1600 to
3200 Hz. Both target sources have an ON–OFF behavior with a
period of 0.2 s and both are active 66% of the time. It is assumed
that at each time , all nodes can detect whether the sources are
active or not. The time between two consecutive updates is 0.4 s,
which corresponds to two ON–OFF cycles of the target sources.
This means that, every 0.4 s, the iteration index changes to
. The sensors observe their signals at a sampling frequency
of .
The target source signals have half the power of the noise
sources. In addition to the spatially correlated noise, indepen-
dent white Gaussian sensor noise is added to each sensor signal.
This noise component is 10% of the power of the localized
noise signals. The individual signals originating from the target
sources and the noise sources that are collected by a specific
sensor are attenuated in power and summed. The attenuation
factor of the signal power is , where denotes the distance
between the source and the sensor. We assume that there is no
time delay in the transmission path between the sources and the
sensors (see footnote 9). Each node collects six sensor signal observations,
and uses five differently delayed versions of each of these signals in
its estimation process to exploit the temporal correlation in the
target source signals. This means that $M_k = 30$.
Fig. 6. LS error of node 1 versus iteration in a network with nodes. Each node has 10 sensors. (a) Different values of , keeping and (b) different
values of .
Fig. 7. Description of the simulated scenario. The network contains four nodes,
each node collecting observations in a cluster of six sensors. One
sensor of each cluster is positioned at the node itself. Two target sources are
moving over the indicated straight lines. Eight noise sources are present.
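The construction of the per-node input from delayed sensor signals can be sketched as follows. The specific delays (0 to 4 samples) are an assumption, since the text only states that five differently delayed versions are used; the function name and test signal are hypothetical.

```python
import numpy as np

def stack_delays(x, n_delays):
    """Stack n_delays delayed copies (0 ... n_delays-1 samples) of each channel of x."""
    n_ch, n_samp = x.shape
    out = np.zeros((n_ch * n_delays, n_samp), dtype=x.dtype)
    for ch in range(n_ch):
        for dly in range(n_delays):
            # Shift channel ch by dly samples; the first dly samples stay zero.
            out[ch * n_delays + dly, dly:] = x[ch, :n_samp - dly]
    return out

x = np.arange(6 * 8, dtype=float).reshape(6, 8)   # six sensor signals, eight samples
y = stack_delays(x, 5)                            # 6 sensors x 5 delays = 30 channels
```

With six sensors and five delays per sensor, the stacked observation has 30 channels per node.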
We let denote the signal that is collected at the reference
sensor of node . It consists of an unknown mixture of the
two target source signals, and a noise component , i.e.
(58)
where is the two-channel signal containing the two target
source signals, and where denotes an unknown mixture
vector. The goal for node is to estimate the signal , i.e., the
target source component in its reference sensor. Since $Q = 2$,
the DANSE$_2$ algorithm is used, and therefore an auxiliary
desired channel is used to obtain a two-channel desired signal at
every sensor. The auxiliary channel of consists of the target
source component in the signal that is collected by another
sensor of node . This component consists of another unknown
mixture of the target sources, so that the conditions of
Theorem IV.1 are satisfied.
Footnote 9: Since the time delays are the same for all sensors, the spatial
information is purely energy based in this case. Therefore, the nodes cannot
perform any beamforming towards specific locations by exploiting different
delay paths between sources and sensors.
The correlation matrix is computed according to
(53). The estimates and are computed sim-
ilarly to (48) with a window length of and
, respectively, which matches the time between two con-
secutive updates.
We will use the signal-to-error ratio (SER) as a measure to as-
sess the performance of the estimators. The instantaneous SER
for node at time and iteration is computed over 3200 sam-
ples, and is defined as
(59)
where denotes the first column of the estimator , as
defined in (37). Notice that this is the estimator that is of ac-
tual interest, since it estimates the desired component in
the reference sensor. The other column of is viewed as an
auxiliary estimator that is used for the generation of the second
channel of the broadcast signal .
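A block-wise signal-to-error ratio can be computed as sketched below. The exact expression (59) is not reproduced here; the definition used (ratio of desired-signal energy to estimation-error energy over a block, expressed in dB) is a common one and is an assumption, as are the function name and the synthetic signals.

```python
import numpy as np

def ser_db(d, d_hat):
    """Signal-to-error ratio in dB between a desired signal d and its estimate d_hat."""
    err = d - d_hat
    return 10.0 * np.log10(np.sum(np.abs(d) ** 2) / np.sum(np.abs(err) ** 2))

rng = np.random.default_rng(6)
d = rng.standard_normal(3200)     # one block of 3200 samples of a synthetic desired signal
d_hat = 0.9 * d                   # an estimate with a 10% amplitude error
value = ser_db(d, d_hat)          # error energy is 1% of the signal energy
```

In this synthetic case the error energy is one hundredth of the signal energy, so the ratio evaluates to about 20 dB.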
Fig. 8 shows the SER of the four nodes at different time in-
stants. Dashed vertical lines are plotted to indicate the points in
time where both sources start moving, and full vertical lines in-
dicate when they stop moving. The sources stand still in the time
intervals [0–4] s, [10–12] s, and [18–20] s. The performance is
compared to the centralized version, in which all sensor signals
are centralized in a single fusion center that computes the
optimal estimators according to (2).
Fig. 8. SER versus time at the four nodes depicted in Fig. 7. The centralized version is added as a reference. Window lengths are and .
In the first 4 s, both sources stand still. The algo-
rithm needs some time to reach a good estimator at each node
(about 2 s), whereas the centralized algorithm converges much
faster. This is because the algorithm updates its nodes
one at a time, with 0.4 s in between two consecutive updates.
The centralized algorithm on the other hand, can update its es-
timators every time a new sample is collected. After a number
of iterations however, the algorithm converges to the
optimal estimators.
Not surprisingly, it is observed that the centralized algorithm
has better tracking capabilities than the algorithm.
This is again a consequence of the fact that the centralized
version computes a new estimator each time a new sample is
collected, yielding a much faster convergence. However, the
algorithm is able to react to changes in the scenario
and always regains optimality after a number of iterations.
Notice that, once the algorithm has converged, it
outperforms the centralized algorithm. This can be explained
by the fact that the algorithm uses correlation ma-
trices with smaller dimension compared to the correlation ma-
trices that are used by the centralized algorithm. Small ma-
trices are generally better conditioned and have a smaller es-
timation error than larger matrices. This performance increase
of compared to its centralized version is observed
to become more significant when the number of sensors in-
creases, yielding larger matrices, or when the window length
decreases, yielding larger estimation errors in the correla-
tion matrices. Fig. 9 shows the performance of and
its centralized version, now with window lengths
and , i.e., roughly half the sizes of the first ex-
periment. It is observed that the estimation performance of the
centralized algorithm significantly decreases compared to the
first experiment, whereas the algorithm is less influ-
enced by the short window length. This observation demon-
strates that is more robust to estimation errors in the
correlation matrices compared to its centralized equivalent. No-
tice that converges much faster in the second exper-
iment, since the time between two consecutive updates is now
0.2 s instead of 0.4 s, due to the shorter window lengths. As al-
ready mentioned in Section V, this faster tracking comes with
the drawback that the estimation performance decreases due to
larger errors in the estimation of the correlation matrices.
In [14], a modified algorithm is studied, where
an improved tracking performance is obtained by letting nodes
update simultaneously.
Fig. 9. SER versus time at the four nodes depicted in Fig. 7. The centralized version is added as a reference. Window lengths are and .
VII. CONCLUSION
In this paper, we have introduced a distributed adaptive
algorithm for linear MMSE estimation of node-specific
signals in a fully connected broadcasting sensor network,
where each sensor node collects multichannel sensor signal
observations. The algorithm significantly compresses the data to
be broadcast, and the computational load is shared amongst
the nodes. It is shown that, if the node-specific desired sig-
nals share a common low-dimensional latent signal subspace,
converges and provides the optimal linear MMSE
estimator for every node-specific estimation problem, as if all
nodes have access to all the sensor signals in the network. Sim-
ulations demonstrate that the algorithm achieves the same per-
formance as a centralized algorithm. A practical adaptive imple-
mentation of the algorithm is described and simulated, demon-
strating the tracking capabilities of the algorithm in a dynamic
scenario. It is observed that the algorithm is more ro-
bust to estimation errors in the correlation matrices, compared to
its centralized equivalent. In this paper, we have only considered
the case where nodes update their parameters in a sequential
round robin fashion. A modified algorithm is studied
in a companion paper [14], where an improved tracking perfor-
mance is obtained, by letting nodes update simultaneously.
ACKNOWLEDGMENT
The authors would like to thank B. Cornelis and the anony-
mous reviewers for their valuable comments after proof-reading
this paper.
REFERENCES
[1] D. Estrin, L. Girod, G. Pottie, and M. Srivastava, “Instrumenting the
world with wireless sensor networks,” in Proc. 2001 IEEE Int. Conf.
Acoust., Speech, Signal Processing (ICASSP ’01), 2001, vol. 4, pp.
2033–2036.
[2] C. G. Lopes and A. H. Sayed, “Incremental adaptive strategies over
distributed networks,” IEEE Trans. Signal Processing, vol. 55, pp.
4064–4077, Aug. 2007.
[3] C. G. Lopes and A. H. Sayed, “Diffusion least-mean squares over adap-
tive networks: Formulation and performance analysis,” IEEE Trans.
Signal Processing, vol. 56, pp. 3122–3136, Jul. 2008.
[4] F. Cattivelli, C. G. Lopes, and A. H. Sayed, “Diffusion recursive least-
squares for distributed estimation over adaptive networks,” IEEE Trans.
Signal Processing, vol. 56, pp. 1865–1877, May 2008.
[5] I. Schizas, G. Giannakis, and Z.-Q. Luo, “Distributed estimation using
reduced-dimensionality sensor observations,” IEEE Trans. Signal Pro-
cessing, vol. 55, pp. 4284–4299, Aug. 2007.
[6] Z.-Q. Luo, G. Giannakis, and S. Zhang, “Optimal linear decentralized
estimation in a bandwidth constrained sensor network,” in Proc. 2005
Int. Symp. Inf. Theory (ISIT ), Sept. 2005, pp. 1441–1445.
[7] K. Zhang, X. Li, P. Zhang, and H. Li, “Optimal linear estimation fu-
sion—Part VI: Sensor data compression,” in Proc. 2003 Sixth Int. Conf.
Inf. Fusion, 2003, vol. 1, pp. 221–228.
[8] Y. Zhu, E. Song, J. Zhou, and Z. You, “Optimal dimensionality re-
duction of sensor data in multisensor estimation fusion,” IEEE Trans.
Signal Processing, vol. 53, pp. 1631–1639, May 2005.
[9] A. Bertrand and M. Moonen, “Robust distributed noise reduction in
hearing aids with external acoustic sensor nodes,” EURASIP J. Adv.
Signal Process., vol. 2009, p. 14, 2009, 10.1155/2009/530435, Article
ID 530435.
[10] S. Doclo, T. van den Bogaert, M. Moonen, and J. Wouters, “Reduced-
bandwidth and distributed MWF-based noise reduction algorithms for
binaural hearing aids,” IEEE Trans. Audio, Speech, Language Process.,
vol. 17, pp. 38–51, Jan. 2009.
[11] T. Klasen, T. Van den Bogaert, M. Moonen, and J. Wouters, “Binaural
noise reduction algorithms for hearing aids that preserve interaural time
delay cues,” IEEE Trans. Signal Processing, vol. 55, pp. 1579–1585,
April 2007.
[12] S. Doclo, T. Klasen, T. Van den Bogaert, J. Wouters, and M. Moonen,
“Theoretical analysis of binaural cue preservation using multi-channel
Wiener filtering and interaural transfer functions,” in Proc. Int. Work-
shop Acoust. Echo Noise Contr. (IWAENC), Paris, France, Sep. 2006.
[13] A. Bertrand and M. Moonen, “Distributed adaptive estimation of cor-
related node-specific signals in a fully connected sensor network,” in
Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP),
Apr. 2009, pp. 2053–2056.
[14] A. Bertrand and M. Moonen, “Distributed adaptive node-specific signal
estimation in fully connected sensor networks—Part II: Simultaneous
and asynchronous node updating,” IEEE Trans. Signal Process., vol.
58, no. 10, pp. 5292–5306, Oct. 2010.
[15] G. H. Golub and C. F. van Loan, Matrix Computations, 3rd ed. Bal-
timore, MD: The Johns Hopkins University Press, 1996.
[16] J. C. Bezdek and R. J. Hathaway, “Some notes on alternating optimiza-
tion,” in Advances in Soft Computing. Berlin, Germany: Springer,
2002, pp. 187–195.
[17] D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Compu-
tation: Numerical Methods. Belmont, MA: Athena Scientific, 1997.
Alexander Bertrand (S’08) was born in Roeselare,
Belgium, in 1984. He received the M.Sc. degree in
electrical engineering from Katholieke Universiteit
Leuven, Belgium, in 2007.
He is currently pursuing the Ph.D. degree with
the Electrical Engineering Department (ESAT),
Katholieke Universiteit Leuven, and was supported
by a Ph.D. grant of the Institute for the Promotion
of Innovation through Science and Technology in
Flanders (IWT-Vlaanderen). His research interests
are in multichannel signal processing, ad hoc sensor
arrays, wireless sensor networks, distributed signal enhancement, speech
enhancement, and distributed estimation.
Marc Moonen (M’94–SM’06–F’07) received the
electrical engineering degree and the Ph.D. degree
in applied sciences from Katholieke Universiteit
Leuven, Belgium, in 1986 and 1990, respectively.
Since 2004, he has been a Full Professor with
the Electrical Engineering Department, Katholieke
Universiteit Leuven, where he heads a research
team working in the area of numerical algorithms
and signal processing for digital communications,
wireless communications, DSL, and audio signal
processing.
Dr. Moonen received the 1994 KU Leuven Research Council Award, the 1997
Alcatel Bell (Belgium) Award (with P. Vandaele), the 2004 Alcatel Bell (Bel-
gium) Award (with R. Cendrillon), and was a 1997 “Laureate of the Belgium
Royal Academy of Science.” He received a journal Best Paper award from the
IEEE TRANSACTIONS ON SIGNAL PROCESSING (with G. Leus) and from Elsevier
Signal Processing (with S. Doclo). He was chairman of the IEEE Benelux Signal
Processing Chapter (1998–2002), and is currently Past-President of European
Association for Signal Processing (EURASIP) and a member of the IEEE Signal
Processing Society Technical Committee on Signal Processing for
Communications. He served as Editor-in-Chief for the EURASIP Journal on
Applied Signal Processing (2003–2005), and has been a member of the
editorial board of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II
(2002–2003) and IEEE SIGNAL PROCESSING MAGAZINE (2003–2005) and
Integration, the VLSI Journal. He is currently a member of the editorial
board of EURASIP
Journal on Advances in Signal Processing, EURASIP Journal on Wireless Com-
munications and Networking, and Signal Processing.