Localizing multiple audio sources
in a wireless acoustic sensor network
Anthony Griffin (1), Anastasios Alexandridis (1,2), Despoina Pavlidi (1,2), Yiannis Mastorakis (1,2), Athanasios Mouchtaris (1,2)
(1) Foundation for Research & Technology – Hellas, Institute of Computer Science (FORTH-ICS), Heraklion, Crete, Greece, GR-70013
(2) University of Crete, Department of Computer Science, Heraklion, Crete, Greece, GR-70013
Abstract
In this work, we propose a grid-based method to estimate the location of multiple sources
in a wireless acoustic sensor network, where each sensor node contains a microphone array
and only transmits direction-of-arrival (DOA) estimates in each time interval, reducing the
transmissions to the central processing node. We present new work on modeling the DOA
estimation error in such a scenario. Through extensive, realistic simulations, we show our
method outperforms other state-of-the-art methods, in both accuracy and complexity. We
also present localization results of real recordings in an outdoor cell of a sensor network.
Keywords: Acoustic sensors, acoustic source localization, location estimation, microphone
arrays, wireless acoustic sensor networks
Email addresses: anthonybgriffin@gmail.com, analexan@ics.forth.gr, pavlidi@ics.forth.gr, jmastor@csd.uoc.gr, mouchtar@ics.forth.gr (Anthony Griffin, Anastasios Alexandridis, Despoina Pavlidi, Yiannis Mastorakis, Athanasios Mouchtaris)
Preprint submitted to Signal Processing, May 28, 2015
1. Introduction
Microphone arrays have become increasingly popular due to their ability to perform
direction-of-arrival (DOA) estimation. Identifying the direction of incoming sound is the
basis for performing many operations, such as beamforming, speech enhancement, and distant
sound acquisition. However, in many situations not only the DOA, but the actual
location of a sound source in space is required. Wireless acoustic sensor networks (WASNs),
where a number of microphones or microphone arrays are distributed over an area, have
emerged from the need to provide better spatial coverage and perform localization. WASNs
have attracted a lot of interest due to their variety of applications in hearing aids, ambient
intelligence, hands-free telephony and acoustic monitoring [1].
Source localization in a WASN is a challenging task as the sensor network poses many
constraints related to time-synchronization, power and bandwidth limitations, etc. For
these reasons, approaches that require the transmission of the full audio signals to the
central processing node are often unsuitable as they are bandwidth consuming, and the
required transmission power can reduce the battery-life of the sensors. Moreover, such
approaches require the signals to be synchronized. The work in [2] circumvented the problem
of synchronization by using special nodes that used their internal Global Positioning System
(GPS) chips to resample the audio samples with a network-common timestamp. However,
the full audio signals still need to be transmitted to the central processing node.
By allowing increased computational ability in the nodes, the absolute minimum transmission
bandwidth can be attained when each sensor node only transmits DOA estimates to
the central processing node [3, 4, 5]. Localization using bearing-only (i.e., DOA) estimates
can also tolerate unsynchronized output given that the sources are static or that they move
at a rather slow rate relative to the analysis frame.
The bearing-only localization problem for a single source has been thoroughly investigated
and a variety of estimators are available in the literature. Closed-form solutions
include the Stansfield estimator [6], which is a weighted linear least squares estimator. The
weights are determined from range information between the source and the sensors. When
range information is not available, the Stansfield estimator reduces to the Orthogonal Vector
(OV) estimator [7]—the unweighted version of the Stansfield estimator.
While simple in their implementation, these linear least squares algorithms suffer from
increased estimation bias. For this reason, maximum-likelihood (ML) and non-linear least
squares (NLS) algorithms have been investigated [8, 9, 10, 11, 12]. A comparison between
the Stansfield estimator and the ML estimator in [8] reveals that the Stansfield estimator
provides biased estimates even for a large number of measurements and that the bias may not
vanish as the number of measurements increases. The work in [9] forms geometric relationships
between the measured data and formulates the localization problem as a constrained
optimization task, while [10] proposes a variant of the ML estimator that theoretically performs
better than the traditional ML approach. Estimators that take into account the
velocity of a moving source, especially for vehicle tracking, are discussed in [11, 12].
The aforementioned methods consider the problem of localizing a single source. However,
in many realistic scenarios multiple sources may co-exist in an area and the location of all
sources may need to be known. The bearing-only multiple source localization problem of
acoustic sources poses many challenging issues. First of all, the so-called data association
problem occurs, where the central processing node receiving DOA estimates for multiple
sources from the different sensors cannot know to which source they belong. Erroneous
DOA combinations across the sensors will result in “ghost sources” that do not correspond
to real sources. A solution to this problem was given in [13] and later generalized in [14]
but has been found to be Non-deterministic Polynomial-time hard (NP-hard) when the
number of sensors is at least 3. Another solution is discussed in [15, 16] but is suitable only
for noiseless scenarios. The work in [17] proposes a solution based on statistical clustering
of the intersection of bearing lines. However, they again consider idealized scenarios of no
missed detections and no spurious measurements. Localization of multiple sources by angle
and frequency measurements is considered in [18], but this method will fail if the sources
contain the same frequencies, and thus it cannot be applied to the case of acoustic sources.
A method for multiple source localization using non-linear least squares that tries to surpass
the data association problem is discussed in [19]. However, ghost sources are not eliminated,
leading to severe performance degradation.
Our previous experience with DOA estimation [20, 21] has revealed that when the sources
are close together some arrays might only detect one source. This is a valid observation made
from experiments using real recorded signals [20, 21]. As a result, the DOAs of some sources
from some sensors might be missing. This problem of missing DOA estimates as a function
of the sources’ locations is an important aspect which—to the best of our knowledge—has
not been widely examined so far.
Our work in [22] considers a method for localizing two sources using far-field DOA
measurements in an outdoor WASN. This paper extends [22] to more than two sources.
Moreover, this paper proposes a novel iterative grid-based approach that can be thought
of as an alternative solution to the NLS estimator. Other iterative solutions for source
localization have also been proposed, the most popular of which are Steered Response Power
(SRP) based approaches [23]. However, when applied to a WASN, such approaches require
a significantly higher amount of information to be transmitted to the central processing
node. In our approach only DOA estimates are transmitted to the central node, keeping
bandwidth requirements to the minimum. When localizing a single source, our grid-based
approach maintains the accuracy of the standard NLS, while performing much better in
terms of computation time.
The computational efficiency allows our approach to be extended to localize multiple
sources. To do so, we apply the single-source grid-based method to each possible combination
of DOA measurements from the sensors and then solve the data association problem
using a sub-optimal—yet efficient—method which relies on the estimated locations and the
corresponding DOA combinations to decide on the actual source locations. Our approach is
real-time and as our simulations and real experiments show, it remains accurate.
Our simulations use new results that we present here to model the DOA estimation error
of the algorithm of [21], and consider the problem of missing DOAs as a function of source
location, which makes them more realistic than simulations considered so far. The problem
of missing DOAs when the sources are close together occurs very often in practice as our
real experiments in this paper suggest.
The remainder of the paper is organized as follows. Section 2 sets up the basic definitions
and assumptions for the problem. Section 3 reviews single source localization methods
using DOA estimates and proposes the intersection point and the grid-based method. Then,
Section 4 discusses the multiple source localization problem extending the intersection point
and the grid-based methods for multiple sources. Simulation results and real experiments
that compare the proposed methods with other state-of-the-art methods in realistic scenarios
are presented in Section 5. Finally, Section 6 concludes the paper.
2. The framework
Our framework is a wireless acoustic sensor network whose $M$ nodes are each equipped
with a microphone array—which we will also refer to as a sensor. This enables each node to
generate a direction-of-arrival estimate for any sources that it can “hear” (any sources whose
signal-to-noise ratio (SNR) at the node is high enough to be detected). It is important to
note that each node’s estimates consist of direction only, and no range information, thus one
node’s DOA estimates are not sufficient to obtain absolute positions for sources.
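Let the x- and y-coordinates of the location of the m-th node be given by
$$\mathbf{q}_m = \begin{bmatrix} q_{x,m} & q_{y,m} \end{bmatrix}^T, \tag{1}$$
and, similarly, let the x- and y-coordinates of the location of the s-th source be given by
$$\mathbf{p}_s = \begin{bmatrix} p_{x,s} & p_{y,s} \end{bmatrix}^T. \tag{2}$$
Given $S$ active sound sources, the $2S \times 1$ position vector of all the sources can be written as
$$\mathbf{p} = \begin{bmatrix} \mathbf{p}_1^T & \mathbf{p}_2^T & \cdots & \mathbf{p}_s^T & \cdots & \mathbf{p}_S^T \end{bmatrix}^T, \tag{3}$$
and we can define the DOA vector of the m-th node as
$$\boldsymbol{\theta}_m(\mathbf{p}) = \begin{bmatrix} h_{m,1} & h_{m,2} & \cdots & h_{m,s} & \cdots & h_{m,S} \end{bmatrix}^T, \tag{4}$$
where
$$h_{m,s} = \arctan \frac{p_{y,s} - q_{y,m}}{p_{x,s} - q_{x,m}} \tag{5}$$
with arctan(·) denoting the four quadrant arctangent function of the argument that returns
an angle in the range of [0,2π).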
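As an illustration of (4) and (5), the following minimal Python sketch computes the bearing from a node to a source with the standard library's four-quadrant arctangent and wraps it into [0, 2π). The function names `doa` and `doa_vector` and the example coordinates are assumptions made for this sketch, not part of the method description.

```python
import math

TWO_PI = 2.0 * math.pi

def doa(node, source):
    """Bearing h_{m,s} of eq. (5) from a node (qx, qy) to a source (px, py)."""
    qx, qy = node
    px, py = source
    # atan2 returns an angle in (-pi, pi]; the modulo shifts it into [0, 2*pi).
    angle = math.atan2(py - qy, px - qx) % TWO_PI
    # Guard: a tiny negative angle can round up to exactly 2*pi after the modulo.
    return 0.0 if angle >= TWO_PI else angle

def doa_vector(node, sources):
    """DOA vector theta_m(p) of eq. (4) for one node and a list of sources."""
    return [doa(node, s) for s in sources]

# Example (hypothetical values): a node at the origin and two sources.
angles = doa_vector((0.0, 0.0), [(1.0, 1.0), (1.0, -1.0)])
```

The wrap to [0, 2π) matters only for consistency with the angular distance defined later; any estimator that uses sines and cosines of the bearing is unaffected by it.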
In the ideal scenario where the microphone array at each node is able to detect all sources,
the m-th array outputs an $S \times 1$ vector of noisy DOA measurements
$$\hat{\boldsymbol{\theta}}_m = \boldsymbol{\theta}_m(\mathbf{p}) + \boldsymbol{\eta}_m, \tag{6}$$
where $\boldsymbol{\eta}_m$ is the DOA noise at the m-th sensor, which is assumed to be zero-mean Gaussian
with covariance matrix $\boldsymbol{\Sigma}_m = \mathrm{diag}(\sigma_{m,1}^2, \sigma_{m,2}^2, \ldots, \sigma_{m,S}^2)$. The variance of the DOA noise at
each sensor can depend on several factors, such as the DOA estimation method used and
the SNR of the source signals at the microphones. Moreover, reverberation can also affect
the DOA estimation method, resulting in estimates with a greater amount of noise [21].
Figure 1: Example cell with four sensor nodes (blue circles, numbered 1 to 4), and the DOAs ($\theta_1$ to $\theta_4$) to a
source (the red circle).
Note that we assume localization in two dimensions, similar to other works, e.g., [19,
12, 5, 7]. However, results from real experiments in Section 5.4 indicate that our method
will still work—estimating a location in two dimensions—even when the sound sources are
located at different elevation angles from the microphone arrays, as long as the arrays and
the sources lie approximately in the same plane.
3. Single-source localization from multiple DOA estimates
Let us first consider the case of localizing a single source from multiple DOA estimates.
This is a well-studied problem, but it also serves as an introduction to the multiple source
case. Fig. 1 illustrates an example cell in a network with four nodes, separated by V, and
the DOA estimates to the source. It is clear that in the ideal case—i.e., perfect DOA
estimates—the source could be localized by finding the points where the four DOA lines
intersect. In practice—or any realistic simulation—the DOA estimates will not be perfect,
and will not all intersect at the same point. We will now discuss some of the state-of-the-art
ways to solve this, followed by our proposed methods and the performance limitations of this
problem, based on the Cramér-Rao Lower Bound (CRLB). Note that as we are considering
only one source here, (2) and (3) reduce to
$$\mathbf{p} = \begin{bmatrix} p_x & p_y \end{bmatrix}^T. \tag{7}$$
3.1. Linear least squares
In its simplest form, the linear least squares (LLS) estimator [7, 12] can be described in
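the following manner. Given the DOA measurement $\hat{\theta}_m$ from the m-th microphone array,
the source is assumed to be located on the line described by:
$$q_{x,m} \sin\hat{\theta}_m - q_{y,m} \cos\hat{\theta}_m = p_x \sin\hat{\theta}_m - p_y \cos\hat{\theta}_m. \tag{8}$$
Using all the DOAs from the $M$ sensors leads to the following system of linear equations
with two unknowns:
$$\mathbf{A}\mathbf{p} = \mathbf{b} \tag{9}$$
where
$$\mathbf{A} = \begin{bmatrix} \sin\hat{\theta}_1 & -\cos\hat{\theta}_1 \\ \vdots & \vdots \\ \sin\hat{\theta}_M & -\cos\hat{\theta}_M \end{bmatrix} \quad \text{and} \quad \mathbf{b} = \begin{bmatrix} q_{x,1}\sin\hat{\theta}_1 - q_{y,1}\cos\hat{\theta}_1 \\ \vdots \\ q_{x,M}\sin\hat{\theta}_M - q_{y,M}\cos\hat{\theta}_M \end{bmatrix}.$$
As the DOA measurements are contaminated by noise, an exact solution to (9) cannot
be found, so the linear least squares solution is used and the location estimate is found as:
$$\hat{\mathbf{p}}_{\text{LLS}} = (\mathbf{A}^T \mathbf{A})^{-1} \mathbf{A}^T \mathbf{b} \tag{10}$$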
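Since there are only two unknowns, the normal equations behind (10) form a 2 × 2 system that can be solved in closed form. The sketch below, in plain Python, accumulates the entries of A^T A and A^T b row by row; the function name `lls_localize` and the example geometry (the cell used for Fig. 2, with noise-free DOAs) are assumptions made for illustration.

```python
import math

def lls_localize(nodes, doas):
    """Solve (A^T A) p = A^T b, eqs. (9)-(10), for a single source position."""
    s11 = s12 = s22 = t1 = t2 = 0.0
    for (qx, qy), theta in zip(nodes, doas):
        a1, a2 = math.sin(theta), -math.cos(theta)  # one row of A
        b = qx * a1 + qy * a2                        # matching entry of b
        s11 += a1 * a1
        s12 += a1 * a2
        s22 += a2 * a2
        t1 += a1 * b
        t2 += a2 * b
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("bearing lines are (nearly) parallel")
    # Closed-form inverse of the symmetric 2 x 2 matrix A^T A.
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)

# Example (hypothetical): square cell of side 4 with noise-free DOAs.
nodes = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
source = (2.6, 3.0)
doas = [math.atan2(source[1] - qy, source[0] - qx) for qx, qy in nodes]
estimate = lls_localize(nodes, doas)
```

With noise-free DOAs all bearing lines pass through the source, so the least squares solution recovers it up to rounding; with noisy DOAs the estimate inherits the bias discussed in the introduction.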
3.2. Non-linear least squares
The non-linear least squares (NLS) estimator for the single-source case reported in [12],
is the maximum-likelihood estimator when the DOA noise standard deviation is the same
at all sensors. This approach aims at finding the location estimate $\hat{\mathbf{p}}_{\text{NLS}}$ that minimizes the
following cost function:
$$C(\mathbf{p}) = \sum_{m=1}^{M} \left| \hat{\theta}_m - \theta_m(\mathbf{p}) \right|^2 \tag{11}$$
Figure 2: Example square cell with four sensor nodes (blue circles, numbered 1 to 4), the DOAs ($\hat{\theta}_1$ to $\hat{\theta}_4$) to
a source (the red circle), and the intersection points (grey squares, labeled $I_{1,2}$ to $I_{3,4}$) of DOA vector pairs.
The minimization problem can be solved by using recursive gradient-descent methods,
and the location estimate from the linear least squares estimator of Section 3.1 can be used
to initialize the search.
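To make the minimization of (11) concrete, the sketch below uses a Gauss-Newton iteration, which is swapped in here for the gradient-descent search mentioned above because it needs no step-size tuning. Two further assumptions are made: the angle differences are wrapped to [-π, π) before squaring, and the starting point `p0` is supplied by the caller (for instance, an LLS estimate). All names are illustrative.

```python
import math

TWO_PI = 2.0 * math.pi

def bearing(node, p):
    return math.atan2(p[1] - node[1], p[0] - node[0]) % TWO_PI

def wrap(angle):
    """Wrap an angle difference to [-pi, pi)."""
    return (angle + math.pi) % TWO_PI - math.pi

def nls_localize(nodes, doas, p0, iterations=20):
    """Gauss-Newton minimization of the cost in eq. (11), started from p0."""
    px, py = p0
    for _ in range(iterations):
        s11 = s12 = s22 = t1 = t2 = 0.0
        for (qx, qy), theta in zip(nodes, doas):
            dx, dy = px - qx, py - qy
            d2 = dx * dx + dy * dy
            if d2 == 0.0:
                continue  # bearing undefined when the iterate sits on a node
            gx, gy = -dy / d2, dx / d2  # gradient of theta_m(p) w.r.t. p
            r = wrap(theta - bearing((qx, qy), (px, py)))  # residual
            s11 += gx * gx
            s12 += gx * gy
            s22 += gy * gy
            t1 += gx * r
            t2 += gy * r
        det = s11 * s22 - s12 * s12
        if abs(det) < 1e-15:
            break
        # Gauss-Newton step: p <- p + (J^T J)^{-1} J^T r
        px += (s22 * t1 - s12 * t2) / det
        py += (s11 * t2 - s12 * t1) / det
    return px, py

# Example (hypothetical): noise-free DOAs, search started at the cell centre.
nodes = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
source = (2.6, 3.0)
doas = [bearing(q, source) for q in nodes]
estimate = nls_localize(nodes, doas, p0=(2.0, 2.0))
```

Like any local search, this iteration can settle in a local minimum when started far from the source, which is one of the weaknesses the grid-based formulation aims to avoid.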
3.3. Intersection point method
Our intersection point (IP) method [22] is based on finding the location of a source
by taking the centroid of the intersections of pairs of DOA lines. The centroid is simply
the mean of the set of intersection points, and minimizes the sum of squared Euclidean
distances between itself and each point in the set. This method can be thought of as a
sub-optimal version of the LLS method of Section 3.1, but we will show later it extends more
easily to the multiple source case.
Fig. 2 illustrates this method with an example, where the DOA estimates have an error
of up to ±5°, and the intersection points are labeled $I_{1,2}$ to $I_{3,4}$. The locations of sensors 1 to
4 are: (0, 0), (4, 0), (4, 4), (0, 4), respectively, and the source is at (2.6, 3.0). The estimated
location from the centroid of the intersection points is (2.40, 2.77), which is a distance error
of 0.43, or 11% of the inter-sensor spacing, V. Further inspection of Fig. 2 reveals that the
effect of $I_{1,3}$ is significant. By excluding this point from the centroid, the estimated location
becomes (2.64, 2.99) and the error drops to 0.03, or 1% of V.
A question that then naturally arises is: how can we detect and exclude outliers such as
I1,3? It can be shown that these outliers are caused by DOA lines that are almost parallel. A
small change in the slope of either of these lines—due to DOA estimation error—can move
their point of intersection significantly. Thus excluding the intersection points of pairs of
DOA lines that are almost parallel improves the accuracy of the location estimation.
Before proceeding, let us first define the function $A(X, Y)$, the minimum angular distance
between $X$ and $Y$, whose output will be in the range $[0, \pi]$. A simple and programmatically
efficient implementation is to first ensure that $X$ and $Y$ are in the range $[0, 2\pi)$, then by
defining
$$A_{X,Y} = (X - Y) \pmod{2\pi} \tag{12}$$
$$A_{Y,X} = (Y - X) \pmod{2\pi} \tag{13}$$
the minimum angular distance is given by
$$A(X, Y) = \min(A_{X,Y}, A_{Y,X}) \tag{14}$$
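Equations (12) to (14) translate almost directly into code. The sketch below is one possible Python rendering; the name `angular_distance` is an assumption. Python's `%` operator returns a non-negative result for a positive modulus, so the two differences already lie in [0, 2π) even if the inputs have not been wrapped beforehand.

```python
import math

TWO_PI = 2.0 * math.pi

def angular_distance(x, y):
    """Minimum angular distance A(X, Y) of eqs. (12)-(14), in [0, pi]."""
    a_xy = (x - y) % TWO_PI  # eq. (12)
    a_yx = (y - x) % TWO_PI  # eq. (13)
    return min(a_xy, a_yx)   # eq. (14)

# Example: two bearings on either side of the 0 / 2*pi wrap-around.
d = angular_distance(0.1, TWO_PI - 0.1)
```

The example shows why a plain absolute difference is a poor similarity measure for angles: the two bearings are 0.2 rad apart, although their numerical difference is close to 2π.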
Now let $\gamma_{\parallel}$ be a “parallelness” threshold. Source localization using the intersection point
method can then be summarized as:
1. Collect the $M$ DOA estimates.
2. Take each of the pairs of DOA estimates $\theta_{m_i}, \theta_{m_j}$, $i \neq j$, from sensors $m_i$ and $m_j$ and
discard it if either of the two conditions is met:
$$A(\theta_{m_i}, \theta_{m_j}) < \gamma_{\parallel}, \tag{15}$$
$$A(\theta_{m_i}, \theta_{m_j}) > \pi - \gamma_{\parallel}. \tag{16}$$
3. Calculate the points of intersection of the remaining pairs.
4. The estimate of the source location $\hat{\mathbf{p}}_{\text{IP}}$ is then given by the centroid of the points of
intersection.
Note that this method is extremely computationally efficient, and its resolution has no
inherent limitations, being affected only by the accuracy of the DOA estimates.
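The four steps above can be sketched in a few lines of Python. The sketch treats each DOA as an infinite line through its node, intersects the lines pairwise by Cramer's rule, and averages the surviving points. The function names and the 5° threshold in the example are assumptions chosen for illustration; no particular threshold value is implied by the description above.

```python
import math
from itertools import combinations

TWO_PI = 2.0 * math.pi

def angular_distance(x, y):
    return min((x - y) % TWO_PI, (y - x) % TWO_PI)

def intersect(node_i, th_i, node_j, th_j):
    """Intersection of two bearing lines written as a*x + b*y = c."""
    a1, b1 = math.sin(th_i), -math.cos(th_i)
    a2, b2 = math.sin(th_j), -math.cos(th_j)
    c1 = a1 * node_i[0] + b1 * node_i[1]
    c2 = a2 * node_j[0] + b2 * node_j[1]
    det = a1 * b2 - a2 * b1  # equals sin(th_j - th_i)
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

def ip_localize(nodes, doas, gamma):
    """Centroid of pairwise intersections, skipping near-parallel pairs."""
    points = []
    for i, j in combinations(range(len(nodes)), 2):
        a = angular_distance(doas[i], doas[j])
        if a < gamma or a > math.pi - gamma:  # conditions (15) and (16)
            continue
        points.append(intersect(nodes[i], doas[i], nodes[j], doas[j]))
    if not points:
        return None  # every pair was rejected as too parallel
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

# Example (hypothetical): noise-free DOAs and an assumed 5 degree threshold.
nodes = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
source = (2.6, 3.0)
doas = [math.atan2(source[1] - qy, source[0] - qx) % TWO_PI for qx, qy in nodes]
estimate = ip_localize(nodes, doas, gamma=math.radians(5.0))
```

Rejecting near-parallel pairs also keeps the determinant in `intersect` away from zero, so no separate division guard is needed as long as the threshold is positive.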
3.4. Grid-based method
We now propose a novel grid-based (GB) method to solve the single source localization
problem. Our method is an alternative formulation of the NLS estimator of Section 3.2,
which tries to alleviate the major weaknesses of that approach, namely the need for a good
initial point to ensure the estimator does not converge to any local minimum, and the
computational burden of the minimization procedure.
Our approach is based on making the search space discrete by constructing a grid of $N$
points over the area of interest, and then finding the grid point whose DOAs most closely match
the estimated DOAs. Moreover, since our measurements are angles, we propose the use of
the angular distance, defined in Section 3.3, as a more appropriate measure of “similarity” than
the absolute distance utilized in (11). As we will show later, this approach is much more
computationally efficient without losing any accuracy, particularly in the multiple source
case.
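We first form the $(M \times N)$ matrix,
$$\boldsymbol{\Psi} = \begin{bmatrix}
\psi_{1,1} & \psi_{1,2} & \cdots & \psi_{1,n} & \cdots & \psi_{1,N} \\
\psi_{2,1} & \psi_{2,2} & \cdots & \psi_{2,n} & \cdots & \psi_{2,N} \\
\vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\
\psi_{m,1} & \psi_{m,2} & \cdots & \psi_{m,n} & \cdots & \psi_{m,N} \\
\vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\
\psi_{M,1} & \psi_{M,2} & \cdots & \psi_{M,n} & \cdots & \psi_{M,N}
\end{bmatrix}, \tag{17}$$
where $\psi_{m,n}$ is the DOA from the m-th sensor to the n-th grid point. Note that the n-th
column of $\boldsymbol{\Psi}$ is formed from the $M$ DOAs to the n-th grid point, as illustrated in Fig. 3.
We then find the index of the grid point whose DOAs most closely match the estimated
DOAs by solving
$$n^* = \arg\min_n \sum_{m=1}^{M} \left[ A(\hat{\theta}_m, \psi_{m,n}) \right]^2, \tag{18}$$
where $A(X, Y)$ is the angular distance function defined in (12)-(14). The source position
estimate $\hat{\mathbf{p}}_{\text{GB}}$ is then simply given as the co-ordinates of the $n^*$-th grid point.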
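A single-resolution version of the search in (18) can be sketched as follows. Rather than storing the matrix of (17) explicitly, the sketch computes each column on the fly, which gives the same result for a one-off search; precomputing the matrix would pay off when the same grid is reused across time frames. The helper names, the square grid layout, and the 41 × 41 grid in the example are assumptions.

```python
import math

TWO_PI = 2.0 * math.pi

def angular_distance(x, y):
    return min((x - y) % TWO_PI, (y - x) % TWO_PI)

def bearing(node, point):
    return math.atan2(point[1] - node[1], point[0] - node[0]) % TWO_PI

def make_grid(x0, y0, side, points_per_side):
    """Square grid with its lower-left corner at (x0, y0)."""
    step = side / (points_per_side - 1)
    return [(x0 + i * step, y0 + k * step)
            for k in range(points_per_side)
            for i in range(points_per_side)]

def grid_localize(nodes, doas, grid):
    """Return the grid point minimizing the cost of eq. (18)."""
    best, best_cost = None, math.inf
    for point in grid:
        # One column of Psi: the DOAs from every node to this grid point.
        cost = sum(angular_distance(theta, bearing(q, point)) ** 2
                   for q, theta in zip(nodes, doas))
        if cost < best_cost:
            best, best_cost = point, cost
    return best

# Example (hypothetical): 41 x 41 grid (spacing 0.1) over a cell of side 4.
nodes = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
source = (2.6, 3.0)
doas = [bearing(q, source) for q in nodes]
estimate = grid_localize(nodes, doas, make_grid(0.0, 0.0, 4.0, 41))
```

The cost of the search grows linearly with the number of grid points, which is what motivates the coarse-to-fine refinement described later in this section.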
Figure 3: Example cell with four nodes, showing the DOAs to the n-th grid point, and their associated
column vector of $\boldsymbol{\Psi}$.
A potential issue with this method is the localization error introduced by the discrete
nature of our approach. If we assume that the method works perfectly, or that there are
no DOA errors, then the method will still exhibit a localization error incurred by discretizing the
area. We will refer to that error as the bias introduced from the use of the grid, and in
Appendix A we derive the resultant root mean square error as:
$$E_{\text{GB}} = \frac{V}{\sqrt{6}\,(\sqrt{N} - 1)}. \tag{19}$$
From (19) it should be clear that for a cell of given dimensions, the number of grid
points $N$, determined by the resolution of the grid $G$, will determine the method's bias.
Increasing $N$ can decrease the position estimation error, as it can make the error incurred
from sampling the area very small, but it will also increase the complexity of the
algorithm.
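The size of this bias can be checked numerically. The sketch below assumes that (19) has the form E_GB = V / (√6 (√N − 1)), i.e., G/√6 for a grid spacing G = V/(√N − 1), and compares it with the root mean square distance between a uniformly distributed source and its nearest grid point, computed by a midpoint-rule average over one grid square. Both the assumed form of (19) and the uniform-source model are assumptions of this sketch; the function names are illustrative.

```python
import math

def grid_bias(V, N):
    """Assumed form of eq. (19): RMS error of snapping to a sqrt(N) x sqrt(N) grid."""
    G = V / (math.sqrt(N) - 1.0)  # grid spacing
    return G / math.sqrt(6.0)

def rms_snap_error(G, samples=200):
    """Midpoint-rule RMS distance to the centre of a G x G square."""
    total = 0.0
    for i in range(samples):
        x = (i + 0.5) / samples * G - G / 2.0
        for k in range(samples):
            y = (k + 0.5) / samples * G - G / 2.0
            total += x * x + y * y
    return math.sqrt(total / samples ** 2)

# Example (hypothetical): cell of side 4 with a 41 x 41 grid, spacing 0.1.
analytic = grid_bias(4.0, 41 ** 2)
numeric = rms_snap_error(0.1)
```

Under these assumptions the two values agree closely, reflecting that each coordinate contributes a variance of G²/12, as for a uniform quantization error.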
To maintain a computationally efficient method when a very dense grid (i.e., a large
$N$) is considered, we propose an iterative solution to (18) which starts with a coarse
grid (low value of N), and once the best grid point is found, a new grid centered on this
point is generated, with a smaller spacing between grid points, but also a smaller scope.
Then, the best grid point in the new grid is found. This may be repeated until the desired
accuracy is obtained, while keeping the complexity under control, as it does not require an
exhaustive search over all grid points of the final resolution grid. A possible implementation
of the iterative grid-based method can be summarized in the following steps:
1. Denote the initial resolution of the grid as $G_{\text{initial}}$, the target resolution as $G_{\text{target}}$, and
let $r$ be the factor of decrease in resolution after each iteration.
2. Set $G = G_{\text{initial}}$.
3. Construct a grid over the area of interest with resolution $G$.
4. Find the grid point $n^*$ by using (18).
5. If $G \le G_{\text{target}}$ go to step 9.
6. Set $V = G$, $G = G/r$.
7. Construct a square grid of dimensions $V$ and resolution $G$ centered on $n^*$.
8. Go to step 4.
9. Output the co-ordinates of $n^*$ as the estimated location.
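The nine steps above can be sketched as a short coarse-to-fine loop. In this Python sketch the stopping test of step 5 is read as "G less than or equal to the target", and the helper names, the integer reduction factor, and the example resolutions are assumptions made for illustration.

```python
import math

TWO_PI = 2.0 * math.pi

def angular_distance(x, y):
    return min((x - y) % TWO_PI, (y - x) % TWO_PI)

def bearing(node, point):
    return math.atan2(point[1] - node[1], point[0] - node[0]) % TWO_PI

def search_grid(nodes, doas, centre, side, spacing):
    """Best point, in the sense of eq. (18), on a square grid centred on `centre`."""
    n = max(2, int(round(side / spacing)) + 1)  # grid points per side
    step = side / (n - 1)
    x0, y0 = centre[0] - side / 2.0, centre[1] - side / 2.0
    best, best_cost = None, math.inf
    for k in range(n):
        for i in range(n):
            point = (x0 + i * step, y0 + k * step)
            cost = sum(angular_distance(theta, bearing(q, point)) ** 2
                       for q, theta in zip(nodes, doas))
            if cost < best_cost:
                best, best_cost = point, cost
    return best

def iterative_grid_localize(nodes, doas, centre, side, g_initial, g_target, r):
    spacing = g_initial                                       # step 2
    best = search_grid(nodes, doas, centre, side, spacing)    # steps 3-4
    while spacing > g_target:                                 # step 5
        side, spacing = spacing, spacing / r                  # step 6
        best = search_grid(nodes, doas, best, side, spacing)  # steps 7, 4
    return best                                               # step 9

# Example (hypothetical): cell of side 4, spacing refined from 0.5 to below 0.01.
nodes = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
source = (2.6, 3.0)
doas = [bearing(q, source) for q in nodes]
estimate = iterative_grid_localize(nodes, doas, (2.0, 2.0), 4.0, 0.5, 0.01, 2)
```

With a reduction factor of 2, each refinement searches only a 3 × 3 grid, so the example evaluates 81 points for the initial grid and 9 points for each subsequent refinement.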
It is easy to observe that this iterative version finds a solution in a fixed number of
iterations, which depends on the initial and target grid resolutions and the resolution
decrease rate $r$ in Step 6. The number $K$ of iterations required can be calculated as:
$$K = \left\lceil \log_r \frac{G_{\text{initial}}}{G_{\text{target}}} \right\rceil \tag{20}$$
where $\lceil x \rceil$ denotes the smallest integer greater than or equal to $x$.
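The iteration count of (20) and the resulting saving in grid evaluations can be illustrated with a short Python sketch. To avoid floating-point surprises when the logarithm is close to an integer, the count is obtained by repeated division rather than by calling a logarithm. The function names are assumptions, and the point-count formulas additionally assume an integer reduction factor and a square cell.

```python
def refinement_count(g_initial, g_target, r):
    """Number K of refinements: smallest K with g_initial / r**K <= g_target."""
    if r <= 1:
        raise ValueError("the reduction factor r must exceed 1")
    k, g = 0, g_initial
    while g > g_target * (1.0 + 1e-12):  # small tolerance for rounding
        g /= r
        k += 1
    return k

def iterative_points(side, g_initial, g_target, r):
    """Grid points evaluated by the coarse-to-fine search (integer r assumed)."""
    k = refinement_count(g_initial, g_target, r)
    initial = (round(side / g_initial) + 1) ** 2
    return initial + k * (round(r) + 1) ** 2

def exhaustive_points(side, g_initial, g_target, r):
    """Grid points of one exhaustive grid at the final spacing."""
    k = refinement_count(g_initial, g_target, r)
    final_spacing = g_initial / r ** k
    return (round(side / final_spacing) + 1) ** 2

# Example (hypothetical): cell of side 4, spacing refined from 0.5 towards 0.01.
K = refinement_count(0.5, 0.01, 2)
coarse_to_fine = iterative_points(4.0, 0.5, 0.01, 2)
brute_force = exhaustive_points(4.0, 0.5, 0.01, 2)
```

In this example six refinements are needed, and the coarse-to-fine search evaluates 135 grid points where a single grid at the final spacing would contain 263169.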
Moreover, as our simulation results in Section 5.2 indicate, the iterative version achieves
the same performance as its brute-force counterpart, thus being able to find the optimal
solution to the problem of (18) without requiring an exhaustive search over all grid points
of the target resolution grid.
Our proposed grid-based method can be extended to 3D localization, as long as DOA
estimation methods able to estimate both azimuth and elevation angles are employed. Our
localization method could be easily extended by employing a grid in three dimensions, and
considering the angular distance of both the azimuth and the elevation angles in (18).
3.5. Performance limitations: Cramér-Rao Lower Bound
The Cramér-Rao Lower Bound (CRLB) represents the minimum localization error covariance
for any unbiased estimator and is defined as the inverse of the Fisher Information
Matrix (FIM) $\mathbf{J}(\mathbf{p})$ [24]:
$$E\{(\hat{\mathbf{p}} - \mathbf{p})(\hat{\mathbf{p}} - \mathbf{p})^T\} \ge \mathbf{J}^{-1}(\mathbf{p}) \tag{21}$$
where $\hat{\mathbf{p}}$ is the estimate of $\mathbf{p}$ and $E\{\cdot\}$ is the expectation operator.
Figure 4: Example cell with four sensor nodes (blue circles, numbered 1 to 4), and the estimated DOAs
($\hat{\theta}_1$ to $\hat{\theta}_4$) to two sources (not shown).
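Under the Gaussian assumption for the measurement noise, the FIM is derived as [5]:
$$\mathbf{J}(\mathbf{p}) = \sum_{m=1}^{M} \frac{1}{\sigma_m^2} \nabla_{\mathbf{p}} \theta_m(\mathbf{p}) \left[ \nabla_{\mathbf{p}} \theta_m(\mathbf{p}) \right]^T. \tag{22}$$
Note that for the multiple source case, the gradient $\nabla_{\mathbf{p}} \theta_m(\mathbf{p})$ is simply replaced by the
Jacobian of $\boldsymbol{\theta}_m(\mathbf{p})$ and the noise variance at sensor $m$, $\sigma_m^2$, is replaced by the noise covariance
matrix $\boldsymbol{\Sigma}_m$.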
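For a single source the FIM of (22) is a 2 × 2 matrix, so the bound in (21) can be evaluated in closed form. The Python sketch below uses the gradient of the bearing with respect to the source position, which follows from differentiating (5); the function names, the example geometry, and the noise level of 0.05 rad are assumptions made for illustration.

```python
import math

def fisher_information(nodes, source, sigmas):
    """Entries (J11, J12, J22) of the symmetric 2 x 2 FIM of eq. (22)."""
    px, py = source
    j11 = j12 = j22 = 0.0
    for (qx, qy), sigma in zip(nodes, sigmas):
        dx, dy = px - qx, py - qy
        d2 = dx * dx + dy * dy
        gx, gy = -dy / d2, dx / d2  # gradient of theta_m(p) w.r.t. (px, py)
        w = 1.0 / sigma ** 2
        j11 += w * gx * gx
        j12 += w * gx * gy
        j22 += w * gy * gy
    return j11, j12, j22

def crlb(nodes, source, sigmas):
    """Inverse of the FIM: the lower bound on the error covariance, eq. (21)."""
    j11, j12, j22 = fisher_information(nodes, source, sigmas)
    det = j11 * j22 - j12 * j12
    return ((j22 / det, -j12 / det), (-j12 / det, j11 / det))

def rmse_bound(nodes, source, sigmas):
    """Square root of the trace of the CRLB: a bound on the position RMSE."""
    c = crlb(nodes, source, sigmas)
    return math.sqrt(c[0][0] + c[1][1])

# Example (hypothetical): source at the centre of a cell of side 4,
# with a DOA noise standard deviation of 0.05 rad at every node.
nodes = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
bound = crlb(nodes, (2.0, 2.0), [0.05] * 4)
rmse = rmse_bound(nodes, (2.0, 2.0), [0.05] * 4)
```

Because each gradient scales with the inverse of the node-to-source distance, the bound grows as the source moves away from the nodes, for a fixed DOA noise level.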
4. Multiple-source localization from multiple DOA estimates
The localization of multiple audio sources from DOA estimates is a considerably more
challenging problem than its single source counterpart. The presence of multiple sources
introduces further problems above those of the single-source case. Consider Fig. 4, depicting
an example cell with noisy DOA estimates from two sources. The processing node receiving
the DOA estimates cannot know to which source they belong, and the localization algorithm
must take this into account. An additional complication is that some sensor nodes may
only detect one source, as the sources’ DOAs may be too close together for that node to
discriminate between them (see node number 3 in Fig. 4). We call the smallest angular separation
that a node can resolve the minimum angular source separation (MASS), i.e., if the angular
distance between two sources is less than the
MASS, then the sensor node will only detect one source. The DOA estimation method
used by a sensor node, the spectral content of the source signals, and the array geometry
determine the MASS at this node. Thus, any localization algorithm must deal with the
ambiguity that each DOA estimate may originate from either source, and that some (or
even all) of the sensor nodes may underestimate the number of sources. In the following,
let $S_m$ denote the number of sources detected by the m-th sensor. Then let the
maximum value of $S_m$ be $S$, which is the highest number of sources detected by at least one
sensor. Let $X_S$ be the set of sensors surrounding a cell detecting $S$ sources in that cell, and
let $C_S$ be the size of that set, i.e., $C_S = |X_S|$.
We now present extensions of the single-source algorithms of Section 3. However, note
that there is no multiple source version of the LLS method of Section 3.1.
4.1. Position non-linear least squares
An extension of the NLS method of Section 3.2 was developed in [19]—called position
non-linear least squares (P-NLS)—and works in two stages. In the first stage, all unique
combinations of DOA estimates are formed, and a location estimate for each combination
is calculated as described in Section 3.1. Then in the second stage, the final locations are
estimated by minimizing the following cost function using the estimates from the previous
stage as initial guesses:
$$C_{\text{P-NLS}}(\mathbf{p}) = \sum_{m=1}^{M} \min_i \left| \hat{\theta}_{m,i} - \theta_m(\mathbf{p}) \right|^2, \tag{23}$$
where $\hat{\theta}_{m,i}$ is the i-th element of $\hat{\boldsymbol{\theta}}_m$. For every DOA combination the minima of this
cost function are expected to correspond to the locations of the true sources. However, some
“ghost” sources appear due to spurious intersections of bearing lines from the sensors. We mitigated
these effects by extending this method to use the approaches of Sections 4.3.1 and 4.3.2
as a third stage to produce $S$ (or fewer) final estimated locations.
4.2. Intersection point method
The extension of the method of Section 3.3 to the multiple source case is relatively
straightforward. Let us first describe the concept of our geometric approach to solving this
problem. We take advantage of the fact that each DOA estimate (from a sensor in $X_S$)
can only belong to one source. By dividing the possible locations for sources into the $S^{C_S}$
unique combinations of DOA estimates, we obtain up to $S^{C_S}$ regions (we say “up to” as
some of these regions may be null, depending on the orientation of the DOA estimates).
By counting the number of intersection points in each region, and choosing the one that
contains the most intersection points, we obtain the one that is most likely to contain one
of the sources. Once we have chosen a region (and thus one of the combinations of DOA
estimates) we then choose the next most likely, and so on, until we are left with only
one remaining possible combination of DOA estimates pointing to the final source. Our
proposed algorithm to localize $S$ sources can be more formally stated as:
1. Find the intersection points of all of the pairs of DOA lines, removing any pair whose
lines are too parallel, as in step 2 of the single-source algorithm of Section 3.3.
2. Determine $S$, $X_S$ and then $C_S$, and set the counter $s$ to zero.
3. Find the $C_S(S-1)$ circular means of the adjacent pairs of DOAs from the sensors in
$X_S$.
4. The vectors of these circular means form $C_S S$ half-planes. Find the regions defined
by all the intersections of all the possible combinations of pairs of half-planes from
different sensors. There will be $S^{C_S}$ of them.
5. Find the region with the most intersection points. If there is a tie, choose the region
whose intersection points have the minimum variance. The location of the s-th source
is given by the centroid of the intersection points in this region. Increment $s$.
6. If $s < S$, remove all regions that are not distinct from the already chosen region(s)
and go to the previous step.
Note that we have described this algorithm conceptually, but it can be implemented very
efficiently by using line tests—testing whether a point is above, below, or on a line—and
binary masks.
4.3. Grid-based method
For multiple sources, the grid-based method must account for the fact that the correct
association of DOAs to the sources is unknown. The localization consists of a two-step
procedure: in the first step, an initial candidate location is estimated for each possible
combination of DOA measurements, while in the second step, the final $S$ source locations
must be chosen from the candidate locations.
Let $\mathcal{J}$ denote the set of all possible unique combinations of DOA estimates and let $j$ enumerate
the combinations. Moreover, let $\hat{\boldsymbol{\theta}}^{(j)}$ be the $M \times 1$ vector of DOAs for the j-th
combination, and let $\hat{\theta}^{(j)}_m$ denote the DOA of sensor $m$ for the j-th combination. The cardinality
of $\mathcal{J}$ depends on the number of sources each sensor is able to detect and can be
computed as:
$$|\mathcal{J}| = \sum_{s=1}^{S} s^{C_s} \tag{24}$$
As the correct association of the DOAs of each sensor to the sources cannot be known,
the single-source GB method of Section 3.4 is applied to each element of $\mathcal{J}$ and the set
$\mathcal{L}$ of candidate source locations is formed with $|\mathcal{L}| = |\mathcal{J}|$. Note that this multiple source
localization algorithm increases complexity by at least $|\mathcal{J}| - 1$ times that of the single source
algorithm, which highlights even more the need for a computationally efficient method to
perform the localization of each DOA combination. As we will show later, our iterative grid-based
approach of Section 3.4 to minimize the non-linear cost function of (18) is significantly
more computationally efficient and results in a similar accuracy as the numerical search
methods for finding the minimum of (11).
In the next step, the final $S$ source locations must be identified from the set of candidate
locations $\mathcal{L}$ by solving the data association problem.
4.3.1. Brute-force approach
A brute-force solution to the data association problem is to perform an exhaustive search
over all possible S-tuples of DOA combinations and select the most likely one. An S-tuple
of DOA combinations is defined as the list of $S$ DOA combinations (elements of $\mathcal{J}$), each
of them being an $M \times 1$ vector of DOA measurements from the $M$ sensors. Moreover, in
forming an S-tuple each sensor must contribute to each of the $S$ DOA combinations with a
different estimate, as the same DOA cannot belong to more than one source. In the case
where a sensor has not detected all sources the same DOA can be repeated.
The brute-force approach can be summarized in the following steps:
1. Form all possible S-tuples of DOA combinations by combining the elements of set $\mathcal{J}$.
The i-th S-tuple will be of the form:
$$T_i = \left\{ \hat{\boldsymbol{\theta}}^{(1)}, \hat{\boldsymbol{\theta}}^{(2)}, \cdots, \hat{\boldsymbol{\theta}}^{(S)} \right\}. \tag{25}$$
Note that each DOA combination $\hat{\boldsymbol{\theta}}^{(j)}$ is associated with a candidate source location
$\mathbf{p}^{(j)} = \begin{bmatrix} p_{x_j} & p_{y_j} \end{bmatrix}^T$ in the set $\mathcal{L}$.
2. For each S-tuple $i$, calculate the sum of residuals of each DOA combination in the
tuple as:
$$r_i = \sum_{j=1}^{S} \sum_{m=1}^{M} \left[ A(\hat{\theta}^{(j)}_m, \theta_m(\mathbf{p}^{(j)})) \right]^2. \tag{26}$$
3. Choose the S-tuple that yields the minimum residual and output the corresponding
candidate locations from that tuple as the final source locations.
This approach suffers from very high complexity as the number of tuples that need to
be tested can grow as high as $O((S!)^M)$, making this method highly impractical even for a
moderate number of sources and sensors. In the next section, we propose an alternative way
of solving the data association problem that approximates the performance of the brute-force
method and is much more computationally efficient.
4.3.2. Sequential approach
In this section, we propose a computationally efficient approach to solve the data association
problem. It is a sub-optimal alternative to the brute-force method that relies on
a sequential procedure to find the $S$ DOA combinations that approximate the minimum
residual of (26) without testing all the possible S-tuples of DOA combinations.
Our sequential approach can be stated as:
1. Create a set J′ = J.
2. For each DOA combination j in the set J′ compute the residual:
$$r_j = \sum_{m=1}^{M} \left[ A\left( \hat{\theta}_m^{(j)}, \theta_m(\mathbf{p}^{(j)}) \right) \right]^2. \tag{27}$$
3. Choose the DOA combination j with the minimum residual and output the corresponding
location $\mathbf{p}^{(j)}$ as the location of one of the sources.
4. Update J′ by removing all DOA combinations that contain DOAs that are part
of the previously chosen combination j. Only DOAs of the sensors that have not
detected all sources are allowed to take part in other combinations.
5. Repeat steps 2–4 until J′ = ∅, i.e., all S sources have been found.
Note that this approach does not need to test all possible S-tuples of DOA combinations,
significantly reducing the computational burden to that of testing only $O(S^M)$ DOA
combinations.
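A minimal Python sketch of the sequential procedure is given below. It rests on assumptions that are not part of the text: all sensors detect all S sources (so the rule allowing a DOA to be reused is not implemented), the angular distance A is the wrapped angle difference, and each combination's candidate location comes from a least-squares intersection of bearing lines rather than from the grid-based method. The function names are hypothetical.

```python
import itertools
import numpy as np

def wrap(a):
    """Wrap angle differences to [-pi, pi)."""
    return (a + np.pi) % (2 * np.pi) - np.pi

def locate(sensors, doas):
    """Stand-in locator (assumption): least-squares bearing-line intersection."""
    n = np.column_stack((np.sin(doas), -np.cos(doas)))
    p, *_ = np.linalg.lstsq(n, np.sum(n * sensors, axis=1), rcond=None)
    return p

def residual(sensors, doas, p):
    """Sum of squared angular distances between measured and implied DOAs."""
    implied = np.arctan2(p[1] - sensors[:, 1], p[0] - sensors[:, 0])
    return float(np.sum(wrap(doas - implied) ** 2))

def sequential(sensors, doa_sets):
    """doa_sets[m] lists the S DOAs (radians) of sensor m. Returns S locations."""
    M, S = len(sensors), len(doa_sets[0])
    # Step 1: the working set starts as every combination of one DOA index
    # per sensor, i.e. S^M combinations.
    remaining = set(itertools.product(range(S), repeat=M))
    # Step 2: residual of each combination (computed once, they never change).
    scored = {}
    for idx in remaining:
        doas = np.array([doa_sets[m][k] for m, k in enumerate(idx)])
        p = locate(sensors, doas)
        scored[idx] = (residual(sensors, doas, p), p)
    found = []
    while remaining and len(found) < S:
        # Step 3: pick the combination with the minimum residual.
        best = min(remaining, key=lambda idx: scored[idx][0])
        found.append(scored[best][1])
        # Step 4: discard combinations that reuse any DOA of the chosen one.
        remaining = {idx for idx in remaining
                     if all(a != b for a, b in zip(idx, best))}
    return found
```

Only the S^M single combinations are ever scored, which is where the saving over the exhaustive tuple search comes from.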
5. Results and Discussion
In order to investigate the performance of our proposed localization method, we per-
formed simulations and real measurements of a square 4-node cell of a WASN, similar to
that of Fig. 4. Although this is just a study of a cell in a larger sensor network, it is a
reasonable assumption that the performance in each cell would dominate the performance
of the whole network, as the other sensors not belonging to this cell would receive the source
signals with low SNR or not be able to detect the sources’ DOAs at all. Sensors that detect
the sources’ DOAs but do not belong to the cell could be excluded by a higher-layer sensor
selection algorithm. Restricting the localization task to a specific cell has also been used in
other works, e.g., in [25].
First, we investigate source localization using DOAs contaminated by noise of different
levels. We assume a 4-node cell of a WASN where the sources are located inside the cell.
Non-directional isotropic environmental noise and sensor noise will contaminate the sources’
signals received at the microphones of each sensor. This noise can be modeled as white
Gaussian noise of equal power at all microphones, uncorrelated with the source signal and
the noise at the other microphones, resulting in a certain level of SNR for each source signal
at the sensors. As we are considering circular arrays of the same number of omni-directional
microphones, we can assume that the accuracy of the DOA estimates of a source at each
sensor is determined by the SNR of that source’s signal at that sensor.
By defining the SNR at each sensor when the source is at the center of the cell (reference
SNR), we can thus estimate the SNR at the sensors when the source is located at any location
within the cell based on the attenuation of the source signal at that location compared to
the center of the cell. We assume that the signal of a source radiates as a spherical wave, and
the attenuation experienced by the source signal travelling from $r_1$ meters from the source
to $r_2$ meters from the source is given by [26]
$$a = 20 \log_{10} \frac{r_2}{r_1} \ \text{dB}. \tag{28}$$
Note that the attenuation can be either positive or negative, resulting in SNR at the
sensors which is lower or higher than the reference SNR. Thus, given a reference SNR at the
center of the cell, the SNR of a source signal at the sensors when the source is located at a
given location can be calculated through geometry and the use of (28).
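As a small worked illustration of this calculation, the Python sketch below applies (28) with $r_1$ taken as the distance from the cell center to the sensor and $r_2$ as the distance from the actual source position to the sensor. It assumes the noise power at the sensor is unchanged, so the SNR shifts by exactly the attenuation; the function name `sensor_snr_db` is hypothetical.

```python
import math

def sensor_snr_db(ref_snr_db, source, sensor, center):
    """SNR (dB) at `sensor` for a source at `source`, given the reference SNR
    measured when the source sits at `center`. Points are (x, y) in meters."""
    r1 = math.dist(center, sensor)   # path length for the reference position
    r2 = math.dist(source, sensor)   # path length for the actual position
    attenuation_db = 20.0 * math.log10(r2 / r1)   # spherical spreading, (28)
    return ref_snr_db - attenuation_db

# Example: 4 m cell, sensor at one corner, 10 dB reference SNR.
# Doubling the path length costs about 6 dB; halving it gains about 6 dB.
far = sensor_snr_db(10.0, (4.0, 4.0), (0.0, 0.0), (2.0, 2.0))
near = sensor_snr_db(10.0, (1.0, 1.0), (0.0, 0.0), (2.0, 2.0))
```

The function is undefined for a source placed exactly on a sensor ($r_2 = 0$), which a simulation would need to exclude.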
The source’s SNR at each sensor will then define the standard deviation of the DOA
error of (6). Thus, to proceed with our simulations, we need to model the DOA error as
a function of SNR. It must be emphasized here that our framework results in a different
SNR and, therefore, a different DOA estimation error standard deviation at each sensor.
Moreover, in order to simulate multiple simultaneous sources within the MASS, we need to
study the effect of the MASS on the DOA estimation. The modeling of these parameters is
presented in Section 5.1. Then, Sections 5.2–5.5 present evaluation results using simulations
and real data.
5.1. DOA Estimation Error Modeling
The DOA estimation error at each sensor was assumed to be normally distributed with
a zero mean and a variance that was assumed to be dependent only upon the SNR at each
sensor, which was in turn determined by the length of the path from the source to the sensor.
Figure 5: Modeling the effect of SNR on DOA estimation error standard deviation for a circular microphone array.
Figure 6: Modeling the effect of MASS and SIR on DOA estimation error for a circular microphone array. (Axes: SIR (dB) versus normalized DOA; curves: simulated data, fitted curve.)
Following the DOA estimation method of [21], we performed simulations to characterize the
DOA estimation error, using a sensor consisting of a 4-element circular microphone array
with a radius of 2 cm. The parameters of the DOA estimation method used are the same
as the ones in [21]. We assumed an anechoic environment and simulated a speech source
(male speaker) contaminated by white Gaussian noise at various SNR cases ranging from -5
dB to 20 dB. The noise at each microphone is uncorrelated with the speech source and with
the noise at all the other microphones. For each signal-to-noise ratio, the simulation was
repeated with the source rotated in 1° increments around the array to avoid any orientation
biasing effects. Fig. 5 shows the standard deviations obtained when the DOA estimation
error at each SNR was fitted with a Gaussian distribution. The fitted curve in Fig. 5 is
given by
$$\mathrm{std}(\mathrm{SNR}) = 1.979\, e^{-0.2815\,\mathrm{SNR}} + 1.884. \tag{29}$$
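The fitted model can be evaluated directly, as in the Python sketch below. It assumes the exponent is negative, so that the standard deviation (in degrees) falls as the SNR (in dB) rises; with that sign the curve gives roughly 10° at −5 dB, in line with the range plotted in Fig. 5. The helper names and the use of Python's `random` module to draw the noise are illustrative choices, not part of the original simulations.

```python
import math
import random

def doa_std_deg(snr_db):
    """Standard deviation (degrees) of the DOA error at a given SNR (dB),
    following the fitted curve with an assumed negative exponent."""
    return 1.979 * math.exp(-0.2815 * snr_db) + 1.884

def noisy_doa_deg(true_doa_deg, snr_db, rng):
    """Draw one DOA estimate: true DOA plus zero-mean Gaussian error."""
    return (true_doa_deg + rng.gauss(0.0, doa_std_deg(snr_db))) % 360.0

rng = random.Random(0)                       # seeded for repeatability
sample = noisy_doa_deg(45.0, 10.0, rng)      # one noisy estimate at 10 dB
```

Because each sensor sees a different SNR, a simulation would call `doa_std_deg` once per source and sensor pair.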
Figure 7: Position estimation error of the two versions of the grid-based method (exhaustive search and iterative) as a percentage of cell size V for a single source in a square 4-node cell. (Axes: reference SNR (dB) versus RMSE as % of cell size; curves: GB Iter, GB Exh, CRLB.)
As mentioned earlier, in order to simulate multiple simultaneous sources, it was also
important to study the effect on DOA estimation when two sources were within the MASS
of a sensor. We performed a simulation study where two speech sources (one male, one
female) were set at various separations of up to 20° (below the MASS of the method of
[21]) and the energy of the second source was incrementally decreased so the
signal-to-interferer ratio (SIR) seen by the first source varied from 0 dB to 20 dB. These simulations
were then repeated with the sources being rotated around the array in 1° increments,
whilst preserving their angular separation, to avoid any orientation biasing effects. In all
simulations only one source was detected, and Fig. 6 shows the results of these simulations,
where the DOA offset has been normalized by the separation between the sources. The
fitted curve of the normalized DOA estimate, $\mathrm{DOA}_n$, (Fig. 6) is given by
$$\mathrm{DOA}_n(\mathrm{SIR}) = 0.5\, e^{-0.12987\,\mathrm{SIR}}. \tag{30}$$
It is clear that the detected source’s DOA is estimated exactly in the middle of the true DOAs
when the sources have equal energy, and moves gradually towards the dominant source as
the weaker source decreases in energy. We used the fitted curve of Fig. 6 in all simulations
involving more than one source.
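The merging model can be applied as in the following Python sketch. It assumes a negative exponent in the fitted curve (0.5 at 0 dB SIR, decaying towards 0), and it interprets the normalized value as the offset of the single detected DOA from the dominant source, expressed as a fraction of the angular separation. The function name `merged_doa_deg` and the wrap-around handling are illustrative.

```python
import math

def merged_doa_deg(doa_dominant_deg, doa_weak_deg, sir_db):
    """Single DOA (degrees) reported when two sources fall within the MASS.
    sir_db is the dominant-to-weak energy ratio in dB (>= 0)."""
    # Signed separation from the dominant to the weak source, in [-180, 180).
    sep = (doa_weak_deg - doa_dominant_deg + 180.0) % 360.0 - 180.0
    # Normalized offset from the fitted curve (assumed negative exponent).
    frac = 0.5 * math.exp(-0.12987 * sir_db)
    return (doa_dominant_deg + frac * sep) % 360.0

equal_energy = merged_doa_deg(10.0, 30.0, 0.0)    # midway between the two
strong_first = merged_doa_deg(10.0, 30.0, 20.0)   # close to the dominant one
```

At 0 dB the estimate lands in the middle of the two true DOAs, and at 20 dB it sits within about 4% of the separation from the dominant source.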
5.2. Simulation Results
In all simulations, the sources were located anywhere within the cell with independent
uniform probability and the error measurement used was the root mean square error (RMSE)
Figure 8: Position estimation error as a percentage of cell size V for a single source in a square 4-node cell, for various values of SNR measured at the center of the cell. (Axes: reference SNR (dB) versus RMSE as % of cell size; curves: IP, LLS, NLS, GB, CRLB.)
between the estimated positions and the true source positions. For each run—i.e., a different
positioning of the sources—the sources’ true DOAs to the sensors were calculated using (5)
and zero-mean Gaussian DOA noise was added. The standard deviation of the DOA noise
was taken from Fig. 5, according to the sources’ SNRs at the sensors, which in turn were
estimated based on the reference SNR at the middle of the cell. For multiple sources, when
the sources were within the MASS, one DOA was estimated through the use of (30).
In our first simulation, we consider the single-source case and compare the performance
of the GB method when an exhaustive search over all grid points is performed against the
iterative version of the method. For the iterative version we used an initial grid and a final
grid with grid point spacings of 12.5% and 0.25% of the sensor spacing, respectively. In
each iteration we reduce the grid point spacing to one half of the previous one (r= 2). For
the exhaustive search version, we use the same grid (i.e., 0.25% of the sensor spacing) and
perform an exhaustive search over all grid points to find the source location according to (18).
Fig. 7 presents the results over 10000 runs for each reference SNR case. It is evident that
the iterative version achieves the same performance without requiring an exhaustive search
over all grid points of the final resolution grid, thus being more computationally efficient.
For all the results presented in the remainder of this paper, the iterative GB method is used
with initial and final grids of 12.5% and 0.25% of the sensor spacing, respectively, and r= 2.
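To make the coarse-to-fine idea concrete, the Python sketch below refines a grid from 12.5% to 0.25% of the sensor spacing, halving the spacing at each iteration (r = 2). It is only an illustration: the cost function (sum of squared wrapped angle errors), the rule for contracting the search region around the best point, and the clamping of the last spacing to the final value are all assumptions, and the names `doa_cost` and `coarse_to_fine` are hypothetical.

```python
import numpy as np

def wrap(a):
    """Wrap angle differences to [-pi, pi)."""
    return (a + np.pi) % (2 * np.pi) - np.pi

def doa_cost(points, sensors, doas):
    """Assumed cost: sum over sensors of squared angular error at each point."""
    dx = points[:, None, 0] - sensors[None, :, 0]
    dy = points[:, None, 1] - sensors[None, :, 1]
    return np.sum(wrap(doas[None, :] - np.arctan2(dy, dx)) ** 2, axis=1)

def coarse_to_fine(sensors, doas, V, g0=0.125, g_final=0.0025, r=2.0):
    """Search the cell [0, V] x [0, V]; g0 and g_final are grid spacings
    expressed as fractions of the sensor spacing V."""
    lo, hi = np.zeros(2), np.full(2, float(V))
    g, g_min = g0 * V, g_final * V
    while True:
        xs = np.arange(lo[0], hi[0] + g / 2, g)
        ys = np.arange(lo[1], hi[1] + g / 2, g)
        pts = np.array(np.meshgrid(xs, ys)).reshape(2, -1).T
        best = pts[np.argmin(doa_cost(pts, sensors, doas))]
        if g <= g_min:
            return best
        # Contract the search region to one grid step around the best point
        # (assumed rule), then refine the spacing by the factor r.
        lo = np.maximum(best - g, 0.0)
        hi = np.minimum(best + g, V)
        g = max(g / r, g_min)
```

With these settings the first pass evaluates a 9 × 9 grid and every later pass only a handful of points, whereas a single exhaustive pass at 0.25% spacing would evaluate 401 × 401 points.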
Fig. 8 presents the results of our simulations of a single source, with the five curves
Figure 9: Position estimation error as a percentage of cell size V for two sources in a square 4-node cell with a MASS of 0°, for various values of SNR measured at the center of the cell. (Axes: reference SNR (dB) versus RMSE as % of cell size; curves: IP, P-NLS, GB, CRLB.)
Figure 10: Position estimation error as a percentage of cell size V for three sources in a square 4-node cell with a MASS of 0°, for various values of SNR measured at the center of the cell. (Axes: reference SNR (dB) versus RMSE as % of cell size; curves: IP, P-NLS, GB, CRLB.)
representing the methods of Sections 3.1–3.4 and the bound of Section 3.5. The RMSE is
calculated over 10000 runs for each reference SNR case. It is clear that all the methods
perform close to the bound, with the NLS and GB methods being the closest. However,
as we will show later (Section 5.3) the GB method is significantly more efficient in terms
of computation time. For the IP method, we set $\gamma_k$ = 20° for all the results presented in
this paper. Through several simulations, these parameters for the IP and GB methods were
found to achieve good performance.
The performance of the multiple source localization methods of Sections 4.1–4.3 for two
and three sources was also evaluated through simulations. For all our simulations with
multiple sources presented hereafter, the RMSE is calculated over 5000 runs.
The performance of the methods for two and three sources for the case of 0° MASS
Figure 11: Position estimation error as a percentage of cell size V for two sources in a square 4-node cell with a MASS of 20°, for various values of SNR measured at the center of the cell. (Axes: reference SNR (dB) versus RMSE as % of cell size; curves: IP, P-NLS, GB, CRLB.)
Figure 12: Position estimation error as a percentage of cell size V for three sources in a square 4-node cell with a MASS of 20°, for various values of SNR measured at the center of the cell. (Axes: reference SNR (dB) versus RMSE as % of cell size; curves: IP, P-NLS, GB, CRLB.)
is displayed in Figures 9 & 10, respectively. Both the P-NLS and GB methods used the
brute force approach of Section 4.3.1 for the final source location selection. These results
are for the idealized case of 0° MASS; nonetheless, it is very encouraging to see how close
the performance of the GB method gets to the lower bound. However, it is evident that the
performance of the IP method degrades with three sources.
Any realistic sensors and DOA estimation algorithm will have a non-zero MASS, and
the performance of all localization algorithms is expected to degrade significantly as the
MASS increases. This is due to the fact that the accuracy of the algorithms degrades
as $C_S$ decreases, and an increasing MASS directly decreases $C_S$, especially as the number
of sources increases. Another way to think of this is that as the MASS increases, the
accuracy of the DOA estimates from each sensor is much more likely to degrade significantly,
Figure 13: Position estimation error as a percentage of cell size V for two sources in a square 4-node cell with a MASS of 0° for 20 dB SNR at the center of the cell, for various values of extra DOA error standard deviation. (Axes: extra DOA error standard deviation, 0° to 10°, versus RMSE as % of cell size; curves: IP, P-NLS, GB, CRLB.)
Figure 14: Position estimation error as a percentage of cell size V for two sources in a square 4-node cell with a MASS of 20° for 20 dB SNR at the center of the cell, for various values of extra DOA error standard deviation. (Axes: extra DOA error standard deviation, 0° to 10°, versus RMSE as % of cell size; curves: IP, P-NLS, GB, CRLB.)
due to the “merging” effect illustrated in Fig. 6. In the extreme case, $C_S$ will be zero
(i.e., no sensors will detect the true number of sources) and the localization algorithm will
underestimate the number of source locations. A more realistic case of 20° MASS is presented
in Figures 11 & 12, and the degrading effect of the increased MASS is clear, particularly for
the three source case. Note again that the GB method consistently performs the best.
All the previous results have considered the DOA estimation error at the sensors to be
modeled as in Fig. 5. In Figures 13 & 14 we consider the position error for two sources
with increased DOA estimation error when the reference SNR is 20 dB. This is modeled
by taking the result of Fig. 5 and adding an additional zero-mean Gaussian noise term
with a standard deviation of 1°–10° at each sensor node. Again, in the 0° MASS case,
Figure 15: Position estimation error as a percentage of cell size V for two sources in a square 4-node cell using the grid-based method and the final step approaches of Sections 4.3.1 & 4.3.2 with various values of MASS and SNR measured at the center of the cell. Panels: (a) Brute force; (b) Sequential. (Axes: reference SNR (dB) versus RMSE as % of cell size; curves: MASS = 0°, 5°, 10°, 15°, 20°, 25°, 30°, and CRLB.)
Figure 16: Position estimation error as a percentage of cell size V for three sources in a square 4-node cell using the grid-based method and the final step approaches of Sections 4.3.1 & 4.3.2 with various values of MASS and SNR measured at the center of the cell. Panels: (a) Brute force; (b) Sequential. (Axes: reference SNR (dB) versus RMSE as % of cell size; curves: MASS = 0°, 5°, 10°, 15°, 20°, 25°, 30°, and CRLB.)
the methods show a reasonable agreement with the lower bound, and as the MASS moves
to 20°, the performance of all the methods suffers. Once again, the proposed GB method
performs the best with the added DOA estimation error.
With the sequential approach of Section 4.3.2 we presented a solution to the high-complexity
brute force approach of Section 4.3.1, whilst acknowledging that its performance may be
worse than that of the brute force approach. Figures 15 & 16 illustrate the difference in
performance for the two approaches with the GB method for two and three sources, respectively.
It is clear that little performance is lost using the sequential approach, particularly at the
higher (and more realistic) values of MASS. The loss in performance is higher at low
values of MASS, and for the three source case. Although it is not shown here due to space
considerations, because the P-NLS method must use either the brute force or the sequential
approach, it too suffers a very similar performance loss to that of the GB method.
Figures 15 & 16 also illustrate the effect of MASS on the RMSE, highlighting the importance
of using a DOA estimation method with a low MASS.

Table 1: Mean execution times in milliseconds for localization methods for one set of DOA estimations

                                  MASS = 0°            MASS = 20°
                  one         two       three       two       three
Method            source      sources   sources     sources   sources
LLS                 0.12        –         –           –         –
IP                  0.69        6.89      44.49       5.31      16.16
GB (& BF)           1.72       36.03    2961.57      19.18     214.34
GB (& Seq.)         1.72       29.39     162.79      16.79      26.69
P-NLS (& BF)       18.88      381.95    5033.43     205.12     509.59
P-NLS (& Seq.)     18.88      375.32    2238.82     202.72     322.08
5.3. Complexity
All the localization algorithms of Sections 3 & 4 were implemented in MATLAB on
a Windows laptop with a Core i5 CPU running at 2.53 GHz with 4GB RAM, and their
mean execution times are presented in Table 1. Note that while the absolute execution
times may be highly dependent on the machine, we are only interested here in the relative
times between the methods. In the one source case, the LLS method is clearly the fastest,
while the IP method is the fastest in the multiple source cases. The (P-)NLS methods
are clearly the slowest methods, due to the non-linear optimization they require. Table 1 also
highlights the dramatic reduction in complexity when using the sequential rather than the
Figure 17: Position estimates (the red clouds) using the proposed grid-based method in a square 4-node cell, for real recordings of two [(a)–(g)] or three [(h)–(l)] simultaneous sources (the blue X’s). Panels: (a) $C_2 = 4$; (b) $C_2 = 2$, $C_1 = 2$; (c) $C_2 = 2$, $C_1 = 2$; (d) $C_2 = 2$, $C_1 = 2$; (e) $C_2 = 1$, $C_1 = 3$; (f) $C_1 = 4$; (g) $C_2 = 2$, $C_1 = 2$; (h) $C_3 = 2$, $C_1 = 2$; (i) $C_3 = 1$, $C_2 = 2$, $C_1 = 1$; (j) $C_2 = 4$; (k) $C_3 = 3$, $C_2 = 1$; (l) $C_1 = 4$.
brute force approach, particularly in the three source case. These results, together with
those of Section 5.2, strongly suggest that the GB method with the sequential approach is
the best choice given its accuracy and moderate complexity. To further verify this suitability,
we implemented the GB method with the sequential approach in C++ and measured that
it only consumed 25% of the available processing time, making it an excellent candidate for
a real-time system.
Figure 18: Empirical Cumulative Distribution Functions (CDFs) of the error between the estimated and true source positions using real recorded data. (Axes: error (%) versus empirical CDF; curves: GBM, P-NLS, IPM.)
5.4. Results of Real Measurements
We also performed some real recordings of acoustic sources in a 4-node square cell with
sides 4 meters long. The sensors on the nodes were circular 4-element microphone arrays with
a radius of 2 cm, and the DOA estimation was performed by our real-time system of [20, 21].
The sources were recorded speech, sampled at 44.1 kHz, played back simultaneously through
loudspeakers at different locations, and their SNR at the center of the cell was measured to
be about 10 dB. The DOA estimation and source localization were performed on frames of
2048 samples with 50% overlap. Although a 4 × 4 meter square is not a particularly large
area, since we measure our reference SNR at the center of the cell, these results should be
scalable to larger cells. Fig. 17 shows the position estimates from the real recordings using
the proposed grid-based method for different layouts of two and three sources. The red dots
show the cloud of estimates over about 5 seconds, and show quite accurate localization. The
pairs (f) & (g) and (j) & (k) warrant further discussion. All of the plots except (g) and
(k) used the standard parameter set of [20, 21], which has a MASS of around 20°, and it is
clear that in (f) and (j) the number of source positions is underestimated. By modifying some of the
parameters of the DOA estimation, we were able to decrease the system’s MASS so that all
the sources in (g) and (k) could be localized, albeit with a greater variance in the estimates.
The performance of the P-NLS and our proposed grid-based and intersection point methods
was also compared on our real recorded data. Again, for DOA estimation our real-time
system of [20, 21] was used. Fig. 18 shows the empirical Cumulative Distribution Functions
(CDFs) of the error between the estimated and true source positions for the three localization
methods. The error was calculated using all frames for all the source positionings of
Fig. 17. It is evident that the P-NLS and GB methods perform the best. However, note that
while the P-NLS and the GB methods have similar performance, our proposed GB method
is much more computationally efficient (Section 5.3).

Table 2: RMSE as a percentage of cell size for the real recordings (outdoors) of Fig. 17 and their corresponding reverberant simulations with T60 = 400 ms

                  outdoor                    reverberant
layout     GBM     P-NLS    IP        GBM     P-NLS    IP
(a)        4.33    4.33     3.80      12.13   32.05    31.58
(b)        6.33    6.33    10.83      18.60   19.07    23.28
(c)        7.45    9.99     3.66      24.47   23.64    32.23
(d)        4.53    4.52     9.84      16.30   17.85    20.81
(e)       14.92   17.03    12.15      14.64   16.51    15.08
(f)       13.39   13.39    13.42      12.81   12.81    13.22
(g)        5.41    5.41    11.93      15.70   15.71    18.24
(h)        7.71    7.70     8.87      11.77   12.49    13.91
(i)        4.61    4.61     6.02      20.64   24.43    19.91
(j)       20.69   21.39    33.04      23.20   23.02    37.27
(k)       12.99   14.15    18.64      24.07   24.45    23.69
(l)       12.38   12.37    12.85      10.72   10.72    10.70
It should be noted that these recordings took place outdoors, and as such did not have
many reflections, but there was a significant level of distant noise sources, such as cars and
dogs barking. Furthermore, the orientations of the sensors were not finely calibrated, and the
DOA estimates likely have unintended offsets of a few degrees. Thus the conditions were far
from ideal, making the results of our proposed localization method even more encouraging.
Figure 19: Empirical Cumulative Distribution Functions (CDFs) of the error between the estimated and true source positions using recordings in a simulated room with T60 = 400 ms. (Axes: error (%) versus empirical CDF; curves: GBM, P-NLS, IPM.)
5.5. Results in reverberant environments
In this section, we test the efficiency of our localization methods in reverberant
environments. We used the Image-Source method [27] to simulate a reverberant room of
dimensions 6 × 6 × 3 meters with reverberation time T60 = 400 ms. We placed a 4-node
cell of sides 4 meters long in the middle of the room. Thus, the nodes’ centers are located
at (1, 1), (5, 1), (5, 5), and (1, 5) meters. Again, the nodes consist of circular 4-element
crophone arrays with a radius of 2 cm. We considered the same source positionings and the
same speech signals that we used for our real recordings in Fig. 17. For DOA estimation we
used again our system of [20, 21] on frames of 2048 samples with 50% overlap. Fig. 19 shows
the CDFs of the error between the estimated and true source positions using all frames and
all source positionings. Once again, the grid-based method performs the best. A perfor-
mance degradation for all methods is evident compared to the results in Fig. 18. This is
because reverberation affects the DOA estimation algorithm providing more erroneous DOA
estimates.
Table 2 shows the RMSE over all frames for each position layout of Fig. 17. The results
of the table agree with Figures 18 & 19 as the performance degradation due to reverberation
is evident, and the GB method generally performs the best. It is of note that in layouts
(f) and (l), the outdoor recordings have greater RMSE than the reverberant ones. These
layouts correspond to the cases where the DOA estimation algorithm in all arrays always
detects one source. The DOA estimation of this practically single-source case is the one least
affected by reverberation [21]. This fact, combined with the fact that the outdoor recordings were
performed in a real rather than a simulated environment, can explain this small difference
in the RMSE between the two scenarios.
5.6. Tracking Potential
Due to their real-time natures, the DOA estimation algorithm of [21] and the GB method
we present here suggest the potential for integration with a tracking system. To illustrate
the tracking potential, we implemented a tracking algorithm based on particle filtering using
the framework of [28]. The tracking system uses the location estimates of the GB method
to assign weights to the particles through the following likelihood function:
$$p_{tr}\left( \hat{\mathbf{p}}_s^{(t)} \mid \mathbf{x}_{j,i}^{(t)} \right) = \mathcal{N}\left( \hat{\mathbf{p}}_s^{(t)}, \mathbf{x}_{j,i}^{(t)}; \boldsymbol{\Sigma} \right) \tag{31}$$
where $\hat{\mathbf{p}}_s^{(t)}$ is the s-th source location estimate from the GB method at time t, $\mathbf{x}_{j,i}^{(t)}$ is the
location of particle i associated with the tracked source j at time t, and $\mathcal{N}$ denotes the
two-dimensional Gaussian distribution with mean $\mathbf{x}_{j,i}^{(t)}$ and covariance matrix $\boldsymbol{\Sigma}$, evaluated
at $\hat{\mathbf{p}}_s^{(t)}$. Assuming that the measurements are independent in the x- and y-coordinates, the
covariance matrix can be written as $\boldsymbol{\Sigma} = \mathrm{diag}(\sigma_x^2, \sigma_y^2)$, where the variances $\sigma_x^2$ and $\sigma_y^2$ are
used to quantify the location error that the localization system is expected to produce in
the x- and y-coordinates.
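The weighting step can be sketched in Python as below. The sketch evaluates the two-dimensional Gaussian with a diagonal covariance at one location estimate for every particle of a tracked source, and then normalizes the weights to sum to one, which is a common particle-filter convention assumed here rather than taken from the text. The default standard deviations of 0.15 mirror the value used in the example of this section, and the function name `particle_weights` is hypothetical.

```python
import numpy as np

def particle_weights(estimate, particles, sigma_x=0.15, sigma_y=0.15):
    """Normalized weights for the particles of one tracked source.

    estimate  : (2,) location estimate from the localization method
    particles : (K, 2) particle locations, same units as the estimate
    """
    d = particles - estimate
    # Exponent of the 2-D Gaussian with covariance diag(sigma_x^2, sigma_y^2).
    expo = -0.5 * ((d[:, 0] / sigma_x) ** 2 + (d[:, 1] / sigma_y) ** 2)
    lik = np.exp(expo) / (2.0 * np.pi * sigma_x * sigma_y)
    # Normalization is an assumption; it fails if every likelihood underflows.
    return lik / lik.sum()

particles = np.array([[1.0, 1.0], [1.15, 1.0], [2.0, 2.0]])
weights = particle_weights(np.array([1.0, 1.0]), particles)
```

A particle one standard deviation away from the estimate receives exp(−1/2), about 0.61, of the weight of a particle that coincides with it.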
We now illustrate the potential of tracking with a simple example. In the 4 m × 4 m
square cell considered in our simulations, three sources were set to move in straight lines
at different velocities. In this example, the MASS is set to be 15°. To implement (31) we
empirically set $\sigma_x = \sigma_y = 0.15$. The RMSE over time for 250 runs is shown in Fig. 20. It
is evident that the tracking system consistently improves the localization performance. It
is worth noting the region between 0.5 seconds and 1 second where the sources are located
such that due to the MASS the localization is able to detect only two out of three sources.
In that region, the tracking is able to keep the track of the lost source and significantly
improve the performance.
Figure 20: Position estimation error as a percentage of cell size V for two sources in a square 4-node cell with a MASS of 20° and signals having 20 dB SNR at the center of the cell, for various values of extra DOA error standard deviation. (Axes: time (seconds) versus RMSE as % of cell size; curves: Localization, Tracking, Localization Mean, Tracking Mean.)
6. Conclusions
In this work, we have considered the challenge of localization in a WASN where each
sensor node only transmits direction-of-arrival estimates, reducing the transmissions to the
processing node. We considered some of the real problems in such a scenario, such as modeled
DOA estimation error, and the merging of two estimates that are too close together to be
resolved by the DOA estimation algorithm. We presented a real-time grid-based method to
perform the position estimation of multiple sources along with a sequential approach to the
final source location selection. Through extensive simulations and measurements we showed
that our proposed method outperforms the other state-of-the-art methods considered in
both accuracy and computational complexity.
Appendix A. Grid-based Error Bound
Any grid-based localization method’s accuracy will be limited by the density of its grid
points. To investigate this further, we now calculate a lower bound for the root mean
squared position error. If we assume single source localization, and that the method works
perfectly—or that there are no DOA errors—then the method will always choose the closest
grid point. Let the grid points be uniformly spaced, with G being the inter-point spacing in
the x and y directions (see Fig. 3). Without loss of generality, let us consider a grid point
at (0, 0); then, due to symmetry, we only need to analyze the squared error in the square
defined by (0, 0) and (G/2, G/2). Let us also assume that a source may be located anywhere
in the square under consideration, with a uniform probability density function given by
$$p(x, y) = p(x) \cdot p(y) = \frac{2}{G} \cdot \frac{2}{G} = \frac{4}{G^2}, \tag{A.1}$$
due to the independence between p(x) and p(y). The squared error between (0, 0) and a
point (x, y) is simply $x^2 + y^2$, and the mean squared error is then given by
$$E_{GB}^2 = \int_0^{G/2} \int_0^{G/2} (x^2 + y^2)\, p(x, y)\, dx\, dy = \frac{G^2}{6}, \tag{A.2}$$
with the root mean square error being
$$E_{GB} = \frac{G}{\sqrt{6}}. \tag{A.3}$$
If the inter-sensor spacing in the x (and y) direction is defined as V (see Fig. 1), the number
of grid points can be written as
$$N = \left( \frac{V}{G} + 1 \right)^2, \tag{A.4}$$
and from (A.2), we can write
$$E_{GB} = \frac{V}{\sqrt{6}\,(\sqrt{N} - 1)}. \tag{A.5}$$
Note that this analysis is independent of the method, and should apply to any grid-based
localization method.
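The bound can be checked numerically with a short Monte Carlo experiment, sketched below in Python. Sources are drawn uniformly over a V × V cell and snapped to the nearest point of a grid with spacing G, which mimics an error-free grid-based method; the resulting RMSE should approach G/√6. The function names, the number of trials, and the fixed seed are illustrative choices.

```python
import numpy as np

def grid_rmse_bound(V, N):
    """Closed-form bound: V / (sqrt(6) * (sqrt(N) - 1)) for N grid points."""
    return V / (np.sqrt(6.0) * (np.sqrt(N) - 1.0))

def simulated_grid_rmse(V, G, trials=200_000, seed=0):
    """RMSE when every source is assigned to its nearest grid point."""
    rng = np.random.default_rng(seed)
    src = rng.uniform(0.0, V, size=(trials, 2))
    nearest = np.round(src / G) * G          # closest point of the grid
    return float(np.sqrt(np.mean(np.sum((src - nearest) ** 2, axis=1))))

V, G = 4.0, 0.5
N = (V / G + 1) ** 2                         # 81 grid points
bound = grid_rmse_bound(V, N)                # equals G / sqrt(6)
simulated = simulated_grid_rmse(V, G)        # agrees with the bound to ~1%
```

As a usage note, a final grid spacing of 0.25% of the sensor spacing gives a bound of 0.25%/√6, roughly 0.1% of the cell size, so the grid itself contributes little to the RMSE values reported in the simulations.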
References
[1] A. Bertrand, Applications and trends in wireless acoustic sensor networks: A signal processing per-
spective, in: IEEE Symp. on Communications and Vehicular Technology in the Benelux, 2011, pp.
1–6.
[2] D. J. Mennill, M. Battiston, D. R. Wilson, J. R. Foote, S. M. Doucet, Field test of an affordable,
portable, wireless microphone array for spatial monitoring of animal ecology and behaviour, Methods
in Ecology and Evolution 3 (4) (2012) 704–712.
[3] A. Ledeczi, G. Kiss, B. Feher, P. Volgyesi, G. Balogh, Acoustic source localization fusing sparse direction
of arrival estimates, in: Int. Workshop on Intelligent Solutions in Embedded Systems, 2006, pp. 1–13.
[4] H. Wang, C. E. Chen, A. Ali, S. Asgari, R. E. Hudson, K. Yao, D. Estrin, C. Taylor, Acoustic sensor
networks for woodpecker localization, in: F. T. Luk (Ed.), SPIE Conf. on Advanced Signal Processing
Algorithms, Architectures, and Implementations, Vol. 5910, 2005, pp. 80–91.
[5] A. Farina, Target tracking with bearings-only measurements, Signal Processing 78 (1) (1999) 61–78.
[6] R. G. Stansfield, Statistical theory of D.F. fixing, J. of the Inst. of Electr. Eng. - Part IIIA: Radiocom-
munication 94 (15) (1947) 762–770.
[7] K. Doğançay, Bearings-only target localization using total least squares, Signal Processing 85 (9) (2005)
1695–1710.
[8] M. Gavish, A. J. Weiss, Performance analysis of bearing-only target location algorithms, IEEE Trans.
on Aerospace and Electr. Syst. 28 (3) (1992) 817–828.
[9] A. Bishop, B. D. O. Anderson, B. Fidan, P. Pathirana, G. Mao, Bearing-only localization using geomet-
rically constrained optimization, IEEE Trans. on Aerospace and Electr. Syst. 45 (1) (2009) 308–320.
[10] Z. Wang, J. Luo, X. Zhang, A novel location-penalized maximum likelihood estimator for bearing-only
target localization, IEEE Trans. on Signal Processing 60 (12) (2012) 6166–6181.
[11] L. M. Kaplan, Q. Le, On exploiting propagation delays for passive target localization using bearings-
only measurements, J. of the Franklin Institute 342 (2) (2005) 193–211.
[12] L. M. Kaplan, Q. Le, N. Molnar, Maximum likelihood methods for bearings-only target localization, in:
IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), Vol. 5, 2001, pp. 3001–3004.
[13] K. Pattipati, S. Deb, Y. Bar-Shalom, R. B. Washburn, A new relaxation algorithm and passive sensor
data association, IEEE Trans. on Automatic Control 37 (2) (1992) 198–213.
[14] S. Deb, M. Yeddanapudi, K. Pattipati, Y. Bar-Shalom, A generalized S-D assignment algorithm for
multisensor-multitarget state estimation, IEEE Trans. on Aerospace and Electr. Syst. 33 (2) (1997)
523–538.
[15] A. Bishop, P. Pathirana, Localization of emitters via the intersection of bearing lines: A ghost elimi-
nation approach, IEEE Trans. on Vehicular Technology 56 (5) (2007) 3106–3110.
[16] A. Bishop, P. Pathirana, A discussion on passive location discovery in emitter networks using angle-only
measurements, in: Int. Conf. on Wireless Communications and Mobile Computing, ACM, 2006.
[17] J. Reed, C. da Silva, R. Buehrer, Multiple-source localization using line-of-bearing measurements:
Approaches to the data association problem, in: IEEE Military Communications Conf. (MILCOM),
2008, pp. 1–7.
[18] H. W. L. Naus, C. V. Van Wijk, Simultaneous localisation of multiple emitters, IEEE Proc. - Radar,
Sonar and Navigation 151 (2) (2004) 65–70. doi:10.1049/ip-rsn:20040184.
[19] L. M. Kaplan, P. Molnar, Q. Le, Bearings-only target localization for an acoustical unattended ground
sensor network, in: Proc. SPIE, Vol. 4393, 2001, pp. 40–51.
[20] D. Pavlidi, M. Puigt, A. Griffin, A. Mouchtaris, Real-time multiple sound source localization using a
circular microphone array based on single-source confidence measures, in: Proc. of IEEE Int. Conf. on
Acoustics, Speech, and Signal Processing (ICASSP), 2012.
[21] D. Pavlidi, A. Griffin, M. Puigt, A. Mouchtaris, Real-time multiple sound source localization and
counting using a circular microphone array, IEEE Trans. on Audio, Sp., and Lang. Proc. 21 (10) (2013)
2193–2206.
[22] A. Griffin, A. Mouchtaris, Localizing multiple audio sources from DOA estimates in a wireless acoustic
sensor network, in: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2013.
[23] H. Do, H. Silverman, A Fast Microphone Array SRP-PHAT Source Location Implementation using
Coarse-To-Fine Region Contraction (CFRC), in: IEEE Workshop on Applications of Signal Processing
to Audio and Acoustics (WASPAA), 2007, pp. 295–298.
[24] S. M. Kay, Fundamentals of statistical signal processing: estimation theory, Prentice-Hall, Inc., Upper
Saddle River, NJ, USA, 1993.
[25] J. Tiete, F. Domínguez, B. Silva, L. Segers, K. Steenhaut, A. Touhafi, SoundCompass: A Distributed
MEMS Microphone Array-Based Sensor for Sound Source Localization, Sensors 14 (2) (2014) 1918–1949.
[26] M. J. Crocker, Handbook of Acoustics, Wiley, 1998.
[27] E. Lehmann, A. Johansson, Diffuse reverberation model for efficient image-source simulation of room
impulse responses, IEEE Trans. on Audio, Speech, and Lang. Proc. 18 (6) (2010) 1429–1439.
[28] J. Valin, F. Michaud, J. Rouat, Robust localization and tracking of simultaneous moving sound sources
using beamforming and particle filtering, Robotics and Autonomous Systems 55 (3) (2007) 216–228.