
Localizing multiple audio sources in a wireless acoustic sensor network

February 2015
Signal Processing 107:54-67


DOI:10.1016/j.sigpro.2014.08.013

Authors:

Anthony Griffin

Auckland University of Technology

Anastasios Alexandridis

Foundation for Research and Technology - Hellas

Despoina Pavlidi

Foundation for Research and Technology - Hellas

Yannis Mastorakis

University of Crete




Localizing multiple audio sources in a wireless acoustic sensor network

Anthony Griffin¹, Anastasios Alexandridis¹,², Despoina Pavlidi¹,², Yiannis Mastorakis¹,², Athanasios Mouchtaris¹,²

¹ Foundation for Research & Technology – Hellas, Institute of Computer Science (FORTH-ICS), Heraklion, Crete, Greece, GR-70013

² University of Crete, Department of Computer Science, Heraklion, Crete, Greece, GR-70013

Abstract

In this work, we propose a grid-based method to estimate the location of multiple sources

in a wireless acoustic sensor network, where each sensor node contains a microphone array

and only transmits direction-of-arrival (DOA) estimates in each time interval, reducing the

transmissions to the central processing node. We present new work on modeling the DOA

estimation error in such a scenario. Through extensive, realistic simulations, we show our

method outperforms other state-of-the-art methods, in both accuracy and complexity. We

also present localization results of real recordings in an outdoor cell of a sensor network.

Keywords: Acoustic sensors, acoustic source localization, location estimation, microphone

arrays, wireless acoustic sensor networks

Email addresses: anthonybgriffin@gmail.com, analexan@ics.forth.gr, pavlidi@ics.forth.gr, jmastor@csd.uoc.gr, mouchtar@ics.forth.gr (Anthony Griffin, Anastasios Alexandridis, Despoina Pavlidi, Yiannis Mastorakis, Athanasios Mouchtaris)

Preprint submitted to Signal Processing, May 28, 2015

1. Introduction

Microphone arrays have become increasingly popular due to their ability to perform direction-of-arrival (DOA) estimation. Identifying the direction of incoming sound is the basis for many operations, such as beamforming, speech enhancement, and distant sound acquisition. However, in many situations not only the DOA but the actual location of a sound source in space is required. Wireless acoustic sensor networks (WASNs), where a number of microphones or microphone arrays are distributed over an area, have emerged from the need to provide better spatial coverage and perform localization. WASNs have attracted a lot of interest due to their variety of applications in hearing aids, ambient intelligence, hands-free telephony and acoustic monitoring [1].

Source localization in a WASN is a challenging task as the sensor network poses many

constraints related to time-synchronization, power and bandwidth limitations, etc. For

these reasons, approaches that require the transmission of the full audio signals to the

central processing node are often unsuitable as they are bandwidth consuming, and the

required transmission power can reduce the battery-life of the sensors. Moreover, such

approaches require the signals to be synchronized. The work in [2] circumvented the problem

of synchronization by using special nodes that used their internal Global Positioning System

(GPS) chips to resample the audio samples with a network-common timestamp. However,

the full audio signals still need to be transmitted to the central processing node.

By allowing increased computational ability in the nodes, the absolute minimum transmission bandwidth can be attained when each sensor node only transmits DOA estimates to the central processing node [3, 4, 5]. Localization using bearing-only (i.e., DOA) estimates can also tolerate unsynchronized output given that the sources are static or that they move at a rather slow rate relative to the analysis frame.

The bearing-only localization problem for a single source has been thoroughly investigated and a variety of estimators are available in the literature. Closed-form solutions include the Stansfield estimator [6], which is a weighted linear least squares estimator. The weights are determined from range information between the source and the sensors. When range information is not available, the Stansfield estimator reduces to the Orthogonal Vector (OV) estimator [7], which is the unweighted version of the Stansfield estimator.

While simple to implement, these linear least squares algorithms suffer from increased estimation bias. For this reason, maximum-likelihood (ML) and non-linear least squares (NLS) algorithms have been investigated [8, 9, 10, 11, 12]. A comparison between the Stansfield estimator and the ML estimator in [8] reveals that the Stansfield estimator provides biased estimates even for a large number of measurements and that the bias may not vanish as the number of measurements increases. The work in [9] forms geometric relationships between the measured data and formulates the localization problem as a constrained optimization task, while [10] proposes a variant of the ML estimator that theoretically performs better than the traditional ML approach. Estimators that take into account the velocity of a moving source, especially for vehicle tracking, are discussed in [11, 12].

The aforementioned methods consider the problem of localizing a single source. However, in many realistic scenarios multiple sources may co-exist in an area and the location of all sources may need to be known. The bearing-only localization of multiple acoustic sources poses many challenges. First of all, the so-called data association problem occurs: the central processing node receiving DOA estimates for multiple sources from the different sensors cannot know to which source each estimate belongs. Erroneous DOA combinations across the sensors will result in “ghost sources” that do not correspond to real sources. A solution to this problem was given in [13] and later generalized in [14], but it has been found to be non-deterministic polynomial-time hard (NP-hard) when the number of sensors is ≥ 3. Another solution is discussed in [15, 16] but is suitable only for noiseless scenarios. The work in [17] proposes a solution based on statistical clustering of the intersections of bearing lines. However, it again considers idealized scenarios with no missed detections and no spurious measurements. Localization of multiple sources by angle and frequency measurements is considered in [18], but this method will fail if the sources contain the same frequencies, and thus it cannot be applied to the case of acoustic sources. A method for multiple source localization using non-linear least squares that tries to overcome the data association problem is discussed in [19]. However, ghost sources are not eliminated, leading to severe performance degradation.

Our previous experience with DOA estimation [20, 21] has revealed that when the sources

are close together some arrays might only detect one source. This is a valid observation made

from experiments using real recorded signals [20, 21]. As a result, the DOAs of some sources

from some sensors might be missing. This problem of missing DOA estimates as a function

of the sources’ locations is an important aspect which—to the best of our knowledge—has

not been widely examined so far.

Our work in [22] considers a method for localizing two sources using far-field DOA measurements in an outdoor WASN. This paper extends [22] to more than two sources. Moreover, this paper proposes a novel iterative grid-based approach that can be thought of as an alternative solution to the NLS estimator. Other iterative solutions for source localization have also been proposed, the most popular of which are Steered Response Power (SRP) based approaches [23]. However, when applied to a WASN, such approaches require a significantly higher amount of information to be transmitted to the central processing node. In our approach only DOA estimates are transmitted to the central node, keeping bandwidth requirements to the minimum. When localizing a single source, our grid-based approach maintains the accuracy of the standard NLS, while performing much better in terms of computation time.

This computational efficiency allows our approach to be extended to localize multiple sources. To do so, we apply the single-source grid-based method to each possible combination of DOA measurements from the sensors and then solve the data association problem using a sub-optimal, yet efficient, method which relies on the estimated locations and the corresponding DOA combinations to decide on the actual source locations. Our approach runs in real time and, as our simulations and real experiments show, it remains accurate.

Our simulations use new results that we present here to model the DOA estimation error

of the algorithm of [21], and consider the problem of missing DOAs as a function of source

location, which makes them more realistic than simulations considered so far. The problem

of missing DOAs when the sources are close together occurs very often in practice as our

real experiments in this paper suggest.

The remainder of the paper is organized as follows. Section 2 sets up the basic definitions and assumptions for the problem. Section 3 reviews single-source localization methods using DOA estimates and proposes the intersection point and the grid-based methods. Then, Section 4 discusses the multiple source localization problem, extending the intersection point and the grid-based methods to multiple sources. Simulation results and real experiments that compare the proposed methods with other state-of-the-art methods in realistic scenarios are presented in Section 5. Finally, Section 6 concludes the paper.

2. The framework

Our framework is a wireless acoustic sensor network whose $M$ nodes are each equipped with a microphone array, which we will also refer to as a sensor. This enables each node to generate a direction-of-arrival estimate for any sources that it can “hear” (any sources whose signal-to-noise ratio (SNR) at the node is high enough to be detected). It is important to note that each node’s estimates consist of direction only, with no range information; thus one node’s DOA estimates are not sufficient to obtain absolute positions for sources.

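Let the x- and y-coordinates of the location of the $m$-th node be given by

$$\mathbf{q}_m = \begin{bmatrix} q_{x,m} & q_{y,m} \end{bmatrix}^T, \tag{1}$$

and, similarly, let the x- and y-coordinates of the location of the $s$-th source be given by

$$\mathbf{p}_s = \begin{bmatrix} p_{x,s} & p_{y,s} \end{bmatrix}^T. \tag{2}$$

Given $S$ active sound sources, the $2S \times 1$ position vector of all the sources can be written as

$$\mathbf{p} = \begin{bmatrix} \mathbf{p}_1^T & \mathbf{p}_2^T & \cdots & \mathbf{p}_s^T & \cdots & \mathbf{p}_S^T \end{bmatrix}^T, \tag{3}$$

and we can define the DOA vector of the $m$-th node as

$$\boldsymbol{\theta}_m(\mathbf{p}) = \begin{bmatrix} h_{m,1} & h_{m,2} & \cdots & h_{m,s} & \cdots & h_{m,S} \end{bmatrix}^T, \tag{4}$$

where

$$h_{m,s} = \arctan\frac{p_{y,s} - q_{y,m}}{p_{x,s} - q_{x,m}} \tag{5}$$

with $\arctan(\cdot)$ denoting the four-quadrant arctangent function, which returns an angle in the range $[0, 2\pi)$.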
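As a minimal illustration of (4) and (5), the following Python sketch computes the noiseless DOA from a node to a source with the four-quadrant arctangent and wraps it to $[0, 2\pi)$. The function names `doa` and `doa_vector` and the second source position are assumptions made for this example only; the node layout and the first source follow the example cell of Fig. 2.

```python
import math

TWO_PI = 2.0 * math.pi

def doa(node, source):
    """DOA h_{m,s} of (5), in [0, 2*pi), from a node (qx, qy) to a source (px, py)."""
    qx, qy = node
    px, py = source
    # math.atan2 returns an angle in (-pi, pi]; the modulo maps it to [0, 2*pi).
    return math.atan2(py - qy, px - qx) % TWO_PI

def doa_vector(node, sources):
    """DOA vector theta_m(p) of (4): one noiseless DOA per source."""
    return [doa(node, s) for s in sources]

# Node layout of the example cell in Fig. 2; the second source is hypothetical.
nodes = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
sources = [(2.6, 3.0), (1.0, 1.0)]
theta = [doa_vector(q, sources) for q in nodes]
```

Each entry of `theta` is the noiseless DOA vector of one node; the measurement noise of the model would be added on top of these values.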

In the ideal scenario where the microphone array at each node is able to detect all sources, the $m$-th array outputs an $S \times 1$ vector of noisy DOA measurements

$$\hat{\boldsymbol{\theta}}_m = \boldsymbol{\theta}_m(\mathbf{p}) + \boldsymbol{\eta}_m, \tag{6}$$

where $\boldsymbol{\eta}_m$ is the DOA noise at the $m$-th sensor, which is assumed to be zero-mean Gaussian with covariance matrix $\boldsymbol{\Sigma}_m = \mathrm{diag}(\sigma_{m,1}^2, \sigma_{m,2}^2, \ldots, \sigma_{m,S}^2)$. The variance of the DOA noise at each sensor can depend on several factors, such as the DOA estimation method used and the SNR of the source signals at the microphones. Moreover, reverberation can also affect the DOA estimation method, resulting in estimates with a greater amount of noise [21].

Figure 1: Example cell with four sensor nodes (blue circles, numbered 1 to 4), and the DOAs ($\theta_1$–$\theta_4$) to a source (the red circle).

Note that we assume localization in two dimensions, similar to other works, e.g., [5, 7, 12, 19]. However, results from real experiments in Section 5.4 indicate that our method will still work, estimating a location in two dimensions, even when the sound sources are located at different elevation angles from the microphone arrays, as long as the arrays and the sources lie approximately in the same plane.

3. Single-source localization from multiple DOA estimates

Let us first consider the case of localizing a single source from multiple DOA estimates. This is a well-studied problem, but it also serves as an introduction to the multiple source case. Fig. 1 illustrates an example cell in a network with four nodes, separated by $V$, and the DOA estimates to the source. It is clear that in the ideal case, i.e., with perfect DOA estimates, the source could be localized by finding the point where the four DOA lines intersect. In practice, or in any realistic simulation, the DOA estimates will not be perfect, and the lines will not all intersect at the same point. We will now discuss some of the state-of-the-art ways to solve this, followed by our proposed methods and the performance limitations of this problem, based on the Cramér-Rao Lower Bound (CRLB). Note that as we are considering only one source here, (2) and (3) reduce to

$$\mathbf{p} = \begin{bmatrix} p_x & p_y \end{bmatrix}^T. \tag{7}$$

3.1. Linear least squares

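In its simplest form, the linear least squares (LLS) estimator [7, 12] can be described in the following manner. Given the DOA measurement $\hat{\theta}_m$ from the $m$-th microphone array, the source is assumed to be located on the line described by:

$$q_{x,m} \sin\hat{\theta}_m - q_{y,m} \cos\hat{\theta}_m = p_x \sin\hat{\theta}_m - p_y \cos\hat{\theta}_m. \tag{8}$$

Using all the DOAs from the $M$ sensors leads to the following system of linear equations with two unknowns:

$$\mathbf{A}\mathbf{p} = \mathbf{b}, \tag{9}$$

where

$$\mathbf{A} = \begin{bmatrix} \sin\hat{\theta}_1 & -\cos\hat{\theta}_1 \\ \vdots & \vdots \\ \sin\hat{\theta}_M & -\cos\hat{\theta}_M \end{bmatrix} \quad \text{and} \quad \mathbf{b} = \begin{bmatrix} q_{x,1} \sin\hat{\theta}_1 - q_{y,1} \cos\hat{\theta}_1 \\ \vdots \\ q_{x,M} \sin\hat{\theta}_M - q_{y,M} \cos\hat{\theta}_M \end{bmatrix}.$$

As the DOA measurements are contaminated by noise, an exact solution to (9) cannot in general be found, so the linear least squares solution is used and the location estimate is found as:

$$\hat{\mathbf{p}}_{\mathrm{LLS}} = (\mathbf{A}^T \mathbf{A})^{-1} \mathbf{A}^T \mathbf{b}. \tag{10}$$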
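A minimal Python sketch of (8) to (10) is given below. Because there are only two unknowns, the normal equations $(\mathbf{A}^T\mathbf{A})\,\mathbf{p} = \mathbf{A}^T\mathbf{b}$ reduce to a 2×2 system that can be solved in closed form. The function name `lls_estimate` and the singularity tolerance are assumptions of this example, not part of the estimators in [7, 12].

```python
import math

def lls_estimate(nodes, doas):
    """Linear least squares position estimate from one DOA per node, as in (10)."""
    s11 = s12 = s22 = t1 = t2 = 0.0
    for (qx, qy), th in zip(nodes, doas):
        a1, a2 = math.sin(th), -math.cos(th)  # m-th row of A
        b = qx * a1 + qy * a2                 # m-th element of b
        s11 += a1 * a1                        # accumulate A^T A
        s12 += a1 * a2
        s22 += a2 * a2
        t1 += a1 * b                          # accumulate A^T b
        t2 += a2 * b
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:  # hypothetical tolerance: all DOA lines (nearly) parallel
        raise ValueError("A^T A is singular; the DOA lines do not intersect")
    px = (s22 * t1 - s12 * t2) / det
    py = (s11 * t2 - s12 * t1) / det
    return px, py

# Noiseless check on the example cell: the estimate coincides with the source.
nodes = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
source = (2.6, 3.0)
doas = [math.atan2(source[1] - qy, source[0] - qx) for qx, qy in nodes]
estimate = lls_estimate(nodes, doas)
```

With noisy DOAs the same function returns the least squares compromise between the bearing lines.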

3.2. Non-linear least squares

The non-linear least squares (NLS) estimator for the single-source case reported in [12] is the maximum-likelihood estimator when the DOA noise standard deviation is the same at all sensors. This approach aims at finding the location estimate $\hat{\mathbf{p}}_{\mathrm{NLS}}$ that minimizes the following cost function:

$$C(\mathbf{p}) = \sum_{m=1}^{M} \left| \hat{\theta}_m - \theta_m(\mathbf{p}) \right|^2. \tag{11}$$

The minimization problem can be solved using iterative gradient-descent methods, with the location estimate from the linear least squares estimator of Section 3.1 serving as the initial point of the search.

Figure 2: Example square cell with four sensor nodes (blue circles, numbered 1 to 4), the DOAs ($\hat{\theta}_1$–$\hat{\theta}_4$) to a source (the red circle), and the intersection points (grey squares, labeled $I_{1,2}$–$I_{3,4}$) of DOA vector pairs. Node 1 is at the origin (0, 0).

3.3. Intersection point method

Our intersection point (IP) method [22] is based on finding the location of a source by taking the centroid of the intersections of pairs of DOA lines. The centroid is simply the mean of the set of intersection points, and minimizes the sum of squared Euclidean distances between itself and each point in the set. This method can be thought of as a sub-optimal version of the LLS method of Section 3.1, but we will show later that it extends more easily to the multiple source case.

Fig. 2 illustrates this method with an example, where the DOA estimates have an error of up to ±5°, and the intersection points are labeled $I_{1,2}$–$I_{3,4}$. The locations of sensors 1 to 4 are (0, 0), (4, 0), (4, 4) and (0, 4), respectively, and the source is at (2.6, 3.0). The estimated location from the centroid of the intersection points is (2.40, 2.77), which is a distance error of 0.43, or 11% of the inter-sensor spacing, $V$. Further inspection of Fig. 2 reveals that the effect of $I_{1,3}$ is significant. By excluding this point from the centroid, the estimated location becomes (2.64, 2.99) and the error drops to 0.03, or 1% of $V$.

A question that then naturally arises is: how can we detect and exclude outliers such as $I_{1,3}$? It can be shown that these outliers are caused by DOA lines that are almost parallel. A small change in the slope of either of these lines, due to DOA estimation error, can move their point of intersection significantly. Thus excluding the intersection points of pairs of DOA lines that are almost parallel improves the accuracy of the location estimation.

Before proceeding, let us first define the function $A(X, Y)$, the minimum angular distance between $X$ and $Y$, whose output will be in the range $[0, \pi]$. A simple and programmatically efficient implementation is to first ensure that $X$ and $Y$ are in the range $[0, 2\pi)$; then, by defining

$$A_{X,Y} = (X - Y) \pmod{2\pi}, \tag{12}$$

$$A_{Y,X} = (Y - X) \pmod{2\pi}, \tag{13}$$

the minimum angular distance is given by

$$A(X, Y) = \min\left(A_{X,Y}, A_{Y,X}\right). \tag{14}$$
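The definition in (12) to (14) translates almost directly into code. The sketch below is one possible Python rendering; the function name `angular_distance` is an assumption of this example. Python's `%` operator returns a non-negative result for a positive modulus, so both differences fall in $[0, 2\pi)$ without further checks.

```python
import math

TWO_PI = 2.0 * math.pi

def angular_distance(x, y):
    """Minimum angular distance A(X, Y) of (14), in [0, pi], for angles in radians."""
    a_xy = (x - y) % TWO_PI  # (12)
    a_yx = (y - x) % TWO_PI  # (13)
    return min(a_xy, a_yx)   # (14)

# 350 degrees and 10 degrees are 20 degrees apart, not 340.
d = angular_distance(math.radians(350.0), math.radians(10.0))
```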

Now let $\gamma_k$ be a “parallelness” threshold. Source localization using the intersection point method can then be summarized as:

1. Collect the $M$ DOA estimates.

2. Take each of the pairs of DOA estimates $\theta_{m_i}, \theta_{m_j}$, $i \neq j$, from sensors $m_i$ and $m_j$, and discard it if either of the following two conditions is met:

$$A(\theta_{m_i}, \theta_{m_j}) < \gamma_k, \tag{15}$$

$$A(\theta_{m_i}, \theta_{m_j}) > \pi - \gamma_k. \tag{16}$$

3. Calculate the points of intersection of the remaining pairs.

4. The estimate of the source location $\hat{\mathbf{p}}_{\mathrm{IP}}$ is then given by the centroid of the points of intersection.

Note that this method is extremely computationally efficient, and its resolution has no inherent limitations, being affected only by the accuracy of the DOA estimates.
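The four steps above can be sketched in Python as follows. This is an illustrative implementation under stated assumptions rather than the authors' code: the helper names (`line_intersection`, `ip_estimate`) are hypothetical, the DOA lines are treated as infinite lines, and the threshold `gamma` is assumed to be strictly positive so that parallel pairs are always discarded before the intersection is computed.

```python
import math
from itertools import combinations

TWO_PI = 2.0 * math.pi

def angular_distance(x, y):
    return min((x - y) % TWO_PI, (y - x) % TWO_PI)

def line_intersection(q1, th1, q2, th2):
    """Intersection of the lines through q1 and q2 with directions th1 and th2."""
    d1 = (math.cos(th1), math.sin(th1))
    d2 = (math.cos(th2), math.sin(th2))
    cross = d1[0] * d2[1] - d1[1] * d2[0]  # sin(th2 - th1), non-zero if not parallel
    # Solve q1 + t * d1 = q2 + u * d2 for t.
    t = ((q2[0] - q1[0]) * d2[1] - (q2[1] - q1[1]) * d2[0]) / cross
    return (q1[0] + t * d1[0], q1[1] + t * d1[1])

def ip_estimate(nodes, doas, gamma):
    """Centroid of the pairwise intersections, skipping nearly parallel pairs."""
    points = []
    for i, j in combinations(range(len(nodes)), 2):
        a = angular_distance(doas[i], doas[j])
        if a < gamma or a > math.pi - gamma:  # conditions (15) and (16)
            continue
        points.append(line_intersection(nodes[i], doas[i], nodes[j], doas[j]))
    if not points:
        return None  # every pair was rejected
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

# Noiseless check on the example cell; gamma = 5 degrees is a hypothetical value.
nodes = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
source = (2.6, 3.0)
doas = [math.atan2(source[1] - qy, source[0] - qx) % TWO_PI for qx, qy in nodes]
estimate = ip_estimate(nodes, doas, math.radians(5.0))
```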

3.4. Grid-based method

We now propose a novel grid-based (GB) method to solve the single-source localization problem. Our method is an alternative formulation of the NLS estimator of Section 3.2, which tries to alleviate the major weaknesses of that approach, namely the need for a good initial point to ensure the estimator does not converge to a local minimum, and the computational burden of the minimization procedure.

Our approach is based on making the search space discrete by constructing a grid of $N$ points over the area of interest, and then finding the grid point whose DOAs most closely match the estimated DOAs. Moreover, since our measurements are angles, we propose the use of the angular distance defined in Section 3.3 as a more appropriate measure of “similarity” than the absolute distance utilized in (11). As we will show later, this approach is much more computationally efficient without losing any accuracy, particularly in the multiple source case.

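We first form the $(M \times N)$ matrix

$$\boldsymbol{\Psi} = \begin{bmatrix} \psi_{1,1} & \psi_{1,2} & \cdots & \psi_{1,n} & \cdots & \psi_{1,N} \\ \psi_{2,1} & \psi_{2,2} & \cdots & \psi_{2,n} & \cdots & \psi_{2,N} \\ \vdots & \vdots & & \vdots & & \vdots \\ \psi_{m,1} & \psi_{m,2} & \cdots & \psi_{m,n} & \cdots & \psi_{m,N} \\ \vdots & \vdots & & \vdots & & \vdots \\ \psi_{M,1} & \psi_{M,2} & \cdots & \psi_{M,n} & \cdots & \psi_{M,N} \end{bmatrix}, \tag{17}$$

where $\psi_{m,n}$ is the DOA from the $m$-th sensor to the $n$-th grid point. Note that the $n$-th column of $\boldsymbol{\Psi}$ is formed from the $M$ DOAs to the $n$-th grid point, as illustrated in Fig. 3.

We then find the index of the grid point whose DOAs most closely match the estimated DOAs by solving

$$n^* = \arg\min_{n} \sum_{m=1}^{M} \left[ A(\hat{\theta}_m, \psi_{m,n}) \right]^2, \tag{18}$$

where $A(X, Y)$ is the angular distance function defined in (12)–(14). The source position estimate $\hat{\mathbf{p}}_{\mathrm{GB}}$ is then simply given as the co-ordinates of the $n^*$-th grid point.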
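A single (non-iterative) grid search following (18) might look as follows in Python. The names `make_grid` and `grid_search` are assumptions of this sketch, and the DOAs $\psi_{m,n}$ are computed on the fly rather than stored in a precomputed matrix $\boldsymbol{\Psi}$, which is a simplification made here for brevity.

```python
import math

TWO_PI = 2.0 * math.pi

def angular_distance(x, y):
    return min((x - y) % TWO_PI, (y - x) % TWO_PI)

def doa(node, point):
    return math.atan2(point[1] - node[1], point[0] - node[0]) % TWO_PI

def make_grid(x0, y0, side, points_per_side):
    """Square grid with lower-left corner (x0, y0); N = points_per_side ** 2."""
    step = side / (points_per_side - 1)
    return [(x0 + i * step, y0 + j * step)
            for j in range(points_per_side) for i in range(points_per_side)]

def grid_search(nodes, doas, grid):
    """Return the grid point that minimizes the cost of (18)."""
    best_point, best_cost = None, float("inf")
    for point in grid:
        # doa(q, point) plays the role of psi_{m,n} for this grid point.
        cost = sum(angular_distance(th, doa(q, point)) ** 2
                   for q, th in zip(nodes, doas))
        if cost < best_cost:
            best_point, best_cost = point, cost
    return best_point

# Noiseless check: a 41 x 41 grid (spacing 0.1) over the 4 x 4 example cell.
nodes = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
source = (2.6, 3.0)
doas = [doa(q, source) for q in nodes]
estimate = grid_search(nodes, doas, make_grid(0.0, 0.0, 4.0, 41))
```

In a deployment where the node positions and the grid are fixed, precomputing $\boldsymbol{\Psi}$ once would avoid repeating the arctangent evaluations for every set of DOA estimates.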

Figure 3: Example cell with four nodes, showing the DOAs to the $n$-th grid point, and their associated column vector of $\boldsymbol{\Psi}$, $\boldsymbol{\psi}_{*,n} = \begin{bmatrix} \psi_{1,n} & \psi_{2,n} & \psi_{3,n} & \psi_{4,n} \end{bmatrix}^T$.

A potential issue with this method is the localization error introduced by the discrete nature of our approach. Even if we assume that the method works perfectly, or that there are no DOA errors, the method will exhibit a localization error incurred by discretizing the area. We will refer to that error as the bias introduced by the use of the grid, and in Appendix A we derive the resultant root mean square error as:

$$E_{\mathrm{GB}} = \frac{V}{\sqrt{6}\left(\sqrt{N} - 1\right)}. \tag{19}$$

From (19) it should be clear that for a cell of given dimensions, the number of grid points $N$, which is determined by the resolution of the grid $G$, will determine the method’s bias. Increasing $N$ decreases the position estimation error, as it can make the error incurred by sampling the area arbitrarily small, but it also increases the complexity of the algorithm.
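To give a feel for the magnitude of this bias, the short Python sketch below evaluates (19) for a square cell. The function name `grid_rmse` and the numerical values are assumptions chosen for illustration; $N$ is assumed to be a perfect square so that the grid has $\sqrt{N}$ points per side.

```python
import math

def grid_rmse(cell_side, num_points):
    """Root mean square error of (19) for a square cell of side V and N grid points."""
    return cell_side / (math.sqrt(6.0) * (math.sqrt(num_points) - 1.0))

# A 4 x 4 cell with 41 x 41 = 1681 grid points has a grid spacing of 0.1.
coarse = grid_rmse(4.0, 41 ** 2)
# Halving the spacing (81 x 81 points) halves the bias.
fine = grid_rmse(4.0, 81 ** 2)
```

Here the bias equals the grid spacing divided by $\sqrt{6}$, so each halving of the spacing roughly quadruples the number of grid points to be searched.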

To maintain a computationally efficient method when a very dense grid (i.e., a large value of $N$) is considered, we propose an iterative solution to (18) which starts with a coarse grid (low value of $N$). Once the best grid point is found, a new grid centered on this point is generated, with a smaller spacing between grid points, but also a smaller scope. Then, the best grid point in the new grid is found. This may be repeated until the desired accuracy is obtained, while keeping the complexity under control, as it does not require an exhaustive search over all grid points of the final resolution grid. A possible implementation of the iterative grid-based method can be summarized in the following steps:

1. Denote the initial resolution of the grid as $G_{\mathrm{initial}}$, the target resolution as $G_{\mathrm{target}}$, and let $r$ be the factor of decrease in resolution after each iteration.

2. Set $G = G_{\mathrm{initial}}$.

3. Construct a grid over the area of interest with resolution $G$.

4. Find the grid point $n^*$ by using (18).

5. If $G \leq G_{\mathrm{target}}$ go to step 9.

6. Set $V = G$, $G = G/r$.

7. Construct a square grid of dimensions $V$ and resolution $G$ centered on $n^*$.

8. Go to step 4.

9. Output the co-ordinates of $n^*$ as the estimated location.
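The steps above can be sketched in Python as shown below. This is a minimal illustration, not a definitive implementation: the function names are hypothetical, the "resolution" $G$ is interpreted as the spacing between adjacent grid points, and the refinement factor $r = 2$ used in the example is an arbitrary choice.

```python
import math

TWO_PI = 2.0 * math.pi

def angular_distance(x, y):
    return min((x - y) % TWO_PI, (y - x) % TWO_PI)

def doa(node, point):
    return math.atan2(point[1] - node[1], point[0] - node[0]) % TWO_PI

def square_grid(center, side, spacing):
    """Grid points of a square of the given side length, centered on `center`."""
    n = int(round(side / spacing)) + 1
    x0, y0 = center[0] - side / 2.0, center[1] - side / 2.0
    return [(x0 + i * spacing, y0 + j * spacing)
            for j in range(n) for i in range(n)]

def best_point(nodes, doas, grid):
    """Grid point minimizing the cost of (18)."""
    return min(grid, key=lambda pt: sum(
        angular_distance(th, doa(q, pt)) ** 2 for q, th in zip(nodes, doas)))

def iterative_grid_search(nodes, doas, center, side, g_initial, g_target, r):
    g = g_initial                                                    # step 2
    best = best_point(nodes, doas, square_grid(center, side, g))    # steps 3-4
    while g > g_target:                                              # step 5
        side, g = g, g / r                                           # step 6
        best = best_point(nodes, doas, square_grid(best, side, g))  # steps 7, 4
    return best                                                      # step 9

# Noiseless check on the 4 x 4 example cell (hypothetical resolutions).
nodes = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
source = (2.6, 3.0)
doas = [doa(q, source) for q in nodes]
estimate = iterative_grid_search(nodes, doas, (2.0, 2.0), 4.0, 1.0, 0.01, 2.0)
error = math.hypot(estimate[0] - source[0], estimate[1] - source[1])
```

With these settings each refinement searches only a 3 × 3 grid, whereas an exhaustive search at the final spacing would have to visit several hundred thousand points.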

It is easy to observe that this iterative version finds a solution in a fixed number of iterations, which depends on the initial and target grid resolutions and on the rate $r$ at which the resolution is decreased in step 6. The number $K$ of iterations required can be calculated as:

$$K = \left\lceil \log_r \frac{G_{\mathrm{initial}}}{G_{\mathrm{target}}} \right\rceil, \tag{20}$$

where $\lceil x \rceil$ denotes the smallest integer greater than or equal to $x$.
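As a quick check of (20), the Python sketch below counts how many times the resolution has to be divided by $r$ before it reaches the target, and compares the count with the closed-form expression. The function names and the numerical values are assumptions of this example.

```python
import math

def num_refinements(g_initial, g_target, r):
    """Count the refinements needed until G <= G_target (steps 5 and 6)."""
    k, g = 0, g_initial
    while g > g_target:
        g /= r
        k += 1
    return k

def num_refinements_closed_form(g_initial, g_target, r):
    """Direct evaluation of (20).

    Floating-point logarithms can be off by one when the ratio is an exact
    power of r, so the loop above is the more robust of the two.
    """
    return math.ceil(math.log(g_initial / g_target) / math.log(r))

k_loop = num_refinements(1.0, 0.01, 2.0)
k_formula = num_refinements_closed_form(1.0, 0.01, 2.0)
```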

Moreover, as our simulation results in Section 5.2 indicate, the iterative version achieves the same performance as its brute-force counterpart, and is thus able to find the optimal solution to the problem of (18) without requiring an exhaustive search over all grid points of the target resolution grid.

Our proposed grid-based method can be extended to 3D localization, as long as DOA estimation methods able to estimate both azimuth and elevation angles are employed. The extension would employ a grid in three dimensions and consider the angular distance of both the azimuth and the elevation angles in (18).

3.5. Performance limitations: Cramér-Rao Lower Bound

The Cramér-Rao Lower Bound (CRLB) represents the minimum localization error covariance for any unbiased estimator and is defined as the inverse of the Fisher Information Matrix (FIM) $\mathbf{J}(\mathbf{p})$ [24]:

$$E\left\{ (\hat{\mathbf{p}} - \mathbf{p})(\hat{\mathbf{p}} - \mathbf{p})^T \right\} \geq \mathbf{J}^{-1}(\mathbf{p}), \tag{21}$$

where $\hat{\mathbf{p}}$ is the estimate of $\mathbf{p}$ and $E\{\cdot\}$ is the expectation operator.

Under the Gaussian assumption for the measurement noise, the FIM is derived as [5]:

$$\mathbf{J}(\mathbf{p}) = \sum_{m=1}^{M} \frac{1}{\sigma_m^2} \nabla_{\mathbf{p}} \theta_m(\mathbf{p}) \left[ \nabla_{\mathbf{p}} \theta_m(\mathbf{p}) \right]^T. \tag{22}$$

Note that for the multiple source case, the gradient $\nabla_{\mathbf{p}} \theta_m(\mathbf{p})$ is simply replaced by the Jacobian of $\boldsymbol{\theta}_m(\mathbf{p})$ and the noise variance at sensor $m$, $\sigma_m^2$, is replaced by the noise covariance matrix $\boldsymbol{\Sigma}_m$.

Figure 4: Example cell with four sensor nodes (blue circles, numbered 1 to 4), and the estimated DOAs ($\hat{\theta}_1$–$\hat{\theta}_4$) to two sources (not shown).
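For the single-source case, (21) and (22) can be evaluated with a few lines of Python, as sketched below. The gradient of $\theta_m(\mathbf{p})$ follows from differentiating (5), which gives $\left[-(p_y - q_{y,m}),\; p_x - q_{x,m}\right]^T / d_m^2$ with $d_m$ the node-to-source distance. The function names, the scalar summary $\sqrt{\mathrm{trace}(\mathbf{J}^{-1})}$ and the noise levels used are assumptions of this example.

```python
import math

def fim(nodes, source, sigmas):
    """Entries (Jxx, Jxy, Jyy) of the 2 x 2 FIM of (22) for one source."""
    jxx = jxy = jyy = 0.0
    for (qx, qy), sigma in zip(nodes, sigmas):
        dx, dy = source[0] - qx, source[1] - qy
        d2 = dx * dx + dy * dy
        gx, gy = -dy / d2, dx / d2  # gradient of theta_m w.r.t. (p_x, p_y)
        w = 1.0 / sigma ** 2
        jxx += w * gx * gx
        jxy += w * gx * gy
        jyy += w * gy * gy
    return jxx, jxy, jyy

def crlb_rmse(nodes, source, sigmas):
    """sqrt(trace(J^-1)): lower bound on the position RMSE of an unbiased estimator."""
    jxx, jxy, jyy = fim(nodes, source, sigmas)
    det = jxx * jyy - jxy * jxy
    # For a symmetric 2 x 2 matrix, trace(J^-1) = (Jxx + Jyy) / det(J).
    return math.sqrt((jxx + jyy) / det)

# Example cell; a DOA noise standard deviation of 2 degrees is hypothetical.
nodes = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
bound = crlb_rmse(nodes, (2.6, 3.0), [math.radians(2.0)] * 4)
```

Because the FIM scales with $1/\sigma_m^2$, doubling the DOA noise standard deviation at every sensor doubles this bound.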

4. Multiple-source localization from multiple DOA estimates

The localization of multiple audio sources from DOA estimates is a considerably more challenging problem than its single-source counterpart. Consider Fig. 4, depicting an example cell with noisy DOA estimates from two sources. The processing node receiving the DOA estimates cannot know to which source they belong, and the localization algorithm must take this into account. An additional complication is that some sensor nodes may only detect one source, as the sources’ DOAs may be too close together for that node to discriminate between them (see node number 3 in Fig. 4). We call the smallest angular separation that a node can resolve the minimum angular source separation (MASS), i.e., if the angular distance between two sources is less than the MASS, then the sensor node will only detect one source. The DOA estimation method used by a sensor node, the spectral content of the source signals, and the array geometry determine the MASS at this node. Thus, any localization algorithm must deal with the ambiguity that each DOA estimate may originate from either source, and that some (or even all) of the sensor nodes may underestimate the number of sources. In the following, let $S_m$ denote the number of sources detected by the $m$-th sensor. Then let the maximum value of $S_m$ be $S$, which is the highest number of sources detected by at least one sensor. Let $X_S$ be the set of sensors surrounding a cell that detect $S$ sources in that cell, and let $C_S$ be the size of that set, i.e., $C_S = |X_S|$.

We now present extensions of the single-source algorithms of Section 3. However, note

that there is no multiple source version of the LLS method of Section 3.1.

4.1. Position non-linear least squares

An extension of the NLS method of Section 3.2, called position non-linear least squares (P-NLS), was developed in [19] and works in two stages. In the first stage, all unique combinations of DOA estimates are formed, and a location estimate for each combination is calculated as described in Section 3.1. Then in the second stage, the final locations are estimated by minimizing the following cost function using the estimates from the previous stage as initial guesses:

$$C_{\text{P-NLS}}(\mathbf{p}) = \sum_{m=1}^{M} \min_{i} \left| \hat{\theta}_{m,i} - \theta_m(\mathbf{p}) \right|^2, \tag{23}$$

where $\hat{\theta}_{m,i}$ is the $i$-th element of $\hat{\boldsymbol{\theta}}_m$. For every DOA combination the minima of this cost function are expected to correspond to the locations of the true sources; however, some “ghost” sources appear due to spurious intersections of bearing lines from the sensors. We mitigated these effects by extending this method to use the approaches of Sections 4.3.1 and 4.3.2 as a third stage to produce $S$ (or fewer) final location estimates.

4.2. Intersection point method

The extension of the method of Section 3.3 to the multiple source case is relatively straightforward. Let us first describe the concept of our geometric approach to solving this problem. We take advantage of the fact that each DOA estimate from a sensor in $X_S$ can only belong to one source. By dividing the possible locations for sources according to the $S^{C_S}$ unique combinations of DOA estimates, we obtain up to $S^{C_S}$ regions (we say “up to” as some of these regions may be null, depending on the orientation of the DOA estimates). By counting the number of intersection points in each region, and choosing the one that contains the most intersection points, we obtain the region that is most likely to contain one of the sources. Once we have chosen a region, and thus one of the combinations of DOA estimates, we then choose the next most likely, and so on, until we are left with only one remaining possible combination of DOA estimates pointing to the final source. Our proposed algorithm to localize $S$ sources can be more formally stated as:

1. Find the intersection points of all of the pairs of DOA lines, removing any pair whose lines are too parallel, as in step 2 of the single-source algorithm of Section 3.3.

2. Determine $S$, $X_S$ and then $C_S$; set the counter $s$ to zero.

3. Find the $C_S(S-1)$ circular means of the adjacent pairs of DOAs from the sensors in $X_S$.

4. The vectors of these circular means form $C_S S$ half-planes. Find the regions defined by all the intersections of all the possible combinations of pairs of half-planes from different sensors. There will be $S^{C_S}$ of them.

5. Find the region with the most intersection points. If there is a tie, choose the region whose intersection points have the minimum variance. The location of the $s$-th source is given by the centroid of the intersection points in this region. Increment $s$.

6. If $s < S$, remove all regions that are not distinct from the already chosen region(s) and go to the previous step.

Note that we have described this algorithm conceptually, but it can be implemented very efficiently by using line tests (testing whether a point is above, below, or on a line) and binary masks.
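Two of the building blocks mentioned above, the circular mean of a pair of DOAs and the line test, can be sketched in Python as follows. The source does not specify how these are computed, so both functions (and their names) are assumptions of this example: the circular mean is taken as the direction of the sum of the two unit vectors, and the line test uses the sign of a 2D cross product.

```python
import math

TWO_PI = 2.0 * math.pi

def circular_mean(a, b):
    """Circular mean of two angles in radians, returned in [0, 2*pi).

    Undefined (numerically unstable) when the angles are exactly opposite.
    """
    s = math.sin(a) + math.sin(b)
    c = math.cos(a) + math.cos(b)
    return math.atan2(s, c) % TWO_PI

def side_of_line(origin, angle, point):
    """Line test: > 0 if `point` lies to the left of the line through `origin`
    with direction `angle`, < 0 if to the right, and 0 if on the line."""
    return (math.cos(angle) * (point[1] - origin[1])
            - math.sin(angle) * (point[0] - origin[0]))

# The circular mean of 80 and 100 degrees is 90 degrees.
boundary = circular_mean(math.radians(80.0), math.radians(100.0))
# A point at (1, 3) lies to the left of the boundary line leaving node (2, 0).
side = side_of_line((2.0, 0.0), boundary, (1.0, 3.0))
```

Evaluating `side_of_line` for every intersection point against every boundary line yields the binary masks from which the region membership of each point can be read off.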

4.3. Grid-based method

For multiple sources, the grid-based method must account for the fact that the correct association of DOAs to the sources is unknown. The localization consists of a two-step procedure: in the first step, an initial candidate location is estimated for each possible combination of DOA measurements, while in the second step, the final $S$ source locations must be chosen from the candidate locations.

Let $\mathcal{J}$ denote the set of all possible unique combinations of DOA estimates and let $j$ enumerate the combinations. Moreover, let $\hat{\boldsymbol{\theta}}^{(j)}$ be the $M \times 1$ vector of DOAs for the $j$-th combination, and let $\hat{\theta}_m^{(j)}$ denote the DOA of sensor $m$ for the $j$-th combination. The cardinality of $\mathcal{J}$ depends on the number of sources each sensor is able to detect and can be computed as:

$$|\mathcal{J}| = \prod_{s=1}^{S} s^{C_s}, \tag{24}$$

where $C_s$ is the number of sensors that detect $s$ sources.
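As an illustration of (24), the Python sketch below enumerates the DOA combinations with `itertools.product`, taking exactly one DOA estimate from each sensor. The number of combinations is the product of the per-sensor detection counts $S_m$, which equals $\prod_s s^{C_s}$ once sensors with the same count are grouped together. The function names and the DOA values are assumptions of this example.

```python
from itertools import product
from math import prod

def doa_combinations(doas_per_sensor):
    """All M-tuples that take exactly one DOA estimate from each sensor."""
    return list(product(*doas_per_sensor))

def num_combinations(detected_counts):
    """|J| of (24), written as the product of S_m over the M sensors."""
    return prod(detected_counts)

# Hypothetical DOA estimates (radians): three sensors detect both sources,
# while the third sensor detects only one.
doas_per_sensor = [[0.70, 1.10], [2.00, 2.40], [3.90], [5.60, 5.95]]
combos = doa_combinations(doas_per_sensor)
count = num_combinations([len(d) for d in doas_per_sensor])
```

In this example $C_1 = 1$ and $C_2 = 3$, so (24) gives $1^1 \cdot 2^3 = 8$ combinations, each of which is passed to the single-source grid-based method.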

As the correct association of the DOAs of each sensor to the sources cannot be known, the single-source GB method of Section 3.4 is applied to each element of $\mathcal{J}$ and the set $\mathcal{L}$ of candidate source locations is formed with $|\mathcal{L}| = |\mathcal{J}|$. Note that this multiple source localization algorithm increases the complexity by at least $|\mathcal{J}| - 1$ times that of the single-source algorithm, which highlights even more the need for a computationally efficient method to perform the localization of each DOA combination. As we will show later, our iterative grid-based approach of Section 3.4 to minimize the non-linear cost function of (18) is significantly more computationally efficient than the numerical search methods for finding the minimum of (11), and achieves similar accuracy.

In the next step, the final $S$ source locations must be identified from the set of candidate locations $\mathcal{L}$ by solving the data association problem.

4.3.1. Brute-force approach

A brute-force solution to the data association problem is to perform an exhaustive search

over all possible S-tuples of DOA combinations and select the most likely one. An S-tuple

of DOA combinations is defined as a list of S DOA combinations (elements of J), each of them being an M × 1 vector of DOA measurements from the M sensors. Moreover, in forming an S-tuple, each sensor must contribute a different estimate to each of the S DOA combinations, as the same DOA cannot belong to more than one source. Only when a sensor has not detected all the sources can the same DOA be repeated.

The brute-force approach can be summarized in the following steps:

1. Form all possible S-tuples of DOA combinations by combining the elements of set $J$. The $i$-th S-tuple will be of the form:

$$T_i = \left\{ \hat{\boldsymbol{\theta}}^{(1)}, \hat{\boldsymbol{\theta}}^{(2)}, \cdots, \hat{\boldsymbol{\theta}}^{(S)} \right\}. \qquad (25)$$

Note that each DOA combination $\hat{\boldsymbol{\theta}}^{(j)}$ is associated with a candidate source location $\mathbf{p}^{(j)} = [p_{x_j} \; p_{y_j}]^T$ in the set $L$.

2. For each S-tuple $i$, calculate the sum of residuals of each DOA combination in the tuple as:

$$r_i = \sum_{j=1}^{S} \sum_{m=1}^{M} \left[ A\left(\hat{\theta}_m^{(j)}, \theta_m(\mathbf{p}^{(j)})\right) \right]^2. \qquad (26)$$

3. Choose the S-tuple that yields the minimum residual and output the corresponding candidate locations from that tuple as the final source locations.

This approach suffers from very high complexity as the number of tuples that need to

be tested can grow as high as $O((S!)^M)$, making this method highly impractical even for a

moderate number of sources and sensors. In the next section, we propose an alternative way

of solving the data association problem that approximates the performance of the brute-force

method and is much more computationally eﬃcient.
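The exhaustive search can be sketched as follows, under simplifying assumptions that are not part of the paper: every sensor is assumed to report exactly S DOAs, a combination is represented as a tuple of DOA indices (one per sensor), and `residual` is a hypothetical callable returning the per-combination residual (the inner sum of (26)) after localization. Fixing the ordering of the first sensor removes relabelled duplicates, so this sketch tests (S!)^(M-1) tuples.

```python
from itertools import permutations, product

def brute_force_association(num_sensors, num_sources, residual):
    """Exhaustive search over S-tuples of DOA combinations.

    Assumes every sensor reports exactly `num_sources` DOAs.
    `residual(combo)` is a hypothetical callable giving the residual of one
    combination, where `combo` is a tuple of DOA indices, one per sensor.
    """
    orders = list(permutations(range(num_sources)))
    best_tuple, best_cost = None, float("inf")
    # Sensor 0 keeps the identity ordering; all other sensors are permuted.
    for rest in product(orders, repeat=num_sensors - 1):
        per_sensor = (orders[0],) + rest
        # The k-th combination takes the k-th entry of each sensor's ordering,
        # so no DOA is used by more than one combination.
        s_tuple = [tuple(order[k] for order in per_sensor)
                   for k in range(num_sources)]
        cost = sum(residual(combo) for combo in s_tuple)
        if cost < best_cost:
            best_tuple, best_cost = s_tuple, cost
    return best_tuple, best_cost

# Toy residuals: two "true" associations have small residuals, all others large.
truth = {(0, 1, 0): 0.1, (1, 0, 1): 0.2}
best, cost = brute_force_association(3, 2, lambda c: truth.get(c, 5.0))
assert best == [(0, 1, 0), (1, 0, 1)]
```

Even in this reduced form the loop count grows factorially with S and exponentially with M, which is the cost the text refers to.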

4.3.2. Sequential approach

In this section, we propose a computationally efficient approach to solve the data association problem. It is a sub-optimal alternative to the brute-force method that relies on a sequential procedure to find the S DOA combinations that approximate the minimum residual of (26) without testing all the possible S-tuples of DOA combinations.

Our sequential approach can be stated as:

1. Create a set $J' = J$.

2. For each DOA combination $j$ in the set $J'$ compute the residual:

$$r_j = \sum_{m=1}^{M} \left[ A\left(\hat{\theta}_m^{(j)}, \theta_m(\mathbf{p}^{(j)})\right) \right]^2. \qquad (27)$$

3. Choose the DOA combination $j^*$ with the minimum residual and output the corresponding location $\mathbf{p}^{(j^*)}$ as the location of one of the sources.

4. Update $J'$ by removing all DOA combinations that contain DOAs that are part of the previously chosen combination $j^*$. Only DOAs of the sensors that have not detected all sources are allowed to take part in other combinations.

5. Repeat steps 2–4 until $J' = \emptyset$, i.e., all S sources have been found.

Note that this approach does not need to test all possible S-tuples of DOA combinations, significantly reducing the computational burden to that of testing only $O(S^M)$ DOA combinations.
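The five steps above can be sketched as a greedy loop. This is only an illustration under stated assumptions: combinations are tuples of DOA indices (one per sensor), `doa_counts[m]` is the number of DOAs reported by sensor m, and `residual` is a hypothetical callable standing in for the per-combination residual of (27). The explicit stop after S selections is an added safeguard for the case where no sensor has detected all sources.

```python
from itertools import product

def sequential_association(doa_counts, num_sources, residual):
    """Greedy selection of DOA combinations (sketch of the sequential steps).

    `residual(combo)` is a hypothetical callable; ties are broken arbitrarily.
    """
    remaining = set(product(*(range(n) for n in doa_counts)))  # step 1
    # Sensors that detected all S sources may use each DOA only once.
    exclusive = [m for m, n in enumerate(doa_counts) if n == num_sources]
    chosen = []
    while remaining and len(chosen) < num_sources:
        best = min(remaining, key=residual)                    # steps 2-3
        chosen.append(best)
        remaining.discard(best)
        # Step 4: drop combinations sharing a DOA with `best` at any
        # sensor that detected all sources.
        remaining = {c for c in remaining
                     if all(c[m] != best[m] for m in exclusive)}
    return chosen

# Toy example: the third sensor reports a single merged DOA, which both
# selected combinations are allowed to reuse.
merged = {(0, 1, 0): 0.1, (1, 0, 0): 0.2}
picked = sequential_association([2, 2, 1], 2, lambda c: merged.get(c, 5.0))
assert picked == [(0, 1, 0), (1, 0, 0)]
```

Only the combinations still in the working set are scored at each pass, rather than every S-tuple.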

5. Results and Discussion

In order to investigate the performance of our proposed localization method, we per-

formed simulations and real measurements of a square 4-node cell of a WASN, similar to

that of Fig. 4. Although this is just a study of a cell in a larger sensor network, it is a

reasonable assumption that the performance in each cell would dominate the performance

of the whole network, as the other sensors not belonging to this cell would receive the source

signals with low SNR or not be able to detect the sources’ DOAs at all. Sensors that detect

the sources’ DOAs but do not belong to the cell could be excluded by a higher-layer sensor

selection algorithm. Restricting the localization task to a speciﬁc cell has also been used in

other works, e.g., in [25].

First, we investigate source localization using DOAs contaminated by noise of diﬀerent

levels. We assume a 4-node cell of a WASN where the sources are located inside the cell.

Non-directional isotropic environmental noise and sensor noise will contaminate the sources’

signals received at the microphones of each sensor. This noise can be modeled as white

Gaussian noise of equal power at all microphones, uncorrelated with the source signal and

the noise at the other microphones, resulting in a certain level of SNR for each source signal

at the sensors. As we are considering circular arrays of the same number of omni-directional

microphones, we can assume that the accuracy of the DOA estimates of a source at each

sensor is determined by the SNR of that source’s signal at that sensor.

By deﬁning the SNR at each sensor when the source is at the center of the cell (reference

SNR), we can thus estimate the SNR at the sensors when the source is located at any location

within the cell based on the attenuation of the source signal at that location compared to

the center of the cell. We assume that the signal of a source radiates as a spherical wave, and

the attenuation experienced by the source signal travelling from $r_1$ meters from the source to $r_2$ meters from the source is given by [26]

$$a = 20 \log_{10}\left(\frac{r_2}{r_1}\right) \ \text{dB}. \qquad (28)$$

Note that the attenuation can be either positive or negative, resulting in SNR at the

sensors which is lower or higher than the reference SNR. Thus, given a reference SNR at the

center of the cell, the SNR of a source signal at the sensors when the source is located at a

given location can be calculated through geometry and the use of (28).
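A minimal sketch of this geometric calculation is given below. It assumes spherical spreading with fixed noise power at every sensor, so that the SNR at a sensor changes only by the attenuation between the path length from the cell center and the actual path length. The function name and argument layout are hypothetical.

```python
import math

def snr_at_sensors(source, sensors, center, reference_snr_db):
    """SNR (dB) of a source at each sensor, given the reference SNR that
    holds when the source sits at the cell center.

    Assumes spherical spreading and equal noise power at all sensors.
    The source must not coincide with a sensor position.
    """
    snrs = []
    for sensor in sensors:
        r_ref = math.dist(center, sensor)  # path length for a source at the center
        r_src = math.dist(source, sensor)  # actual path length
        attenuation_db = 20.0 * math.log10(r_src / r_ref)
        snrs.append(reference_snr_db - attenuation_db)
    return snrs

# Square cell with 4 m sides, as in the recordings described later.
sensors = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
center = (2.0, 2.0)
near = snr_at_sensors((1.0, 1.0), sensors, center, reference_snr_db=10.0)
```

Here the first sensor is closer to the source than it is to the center, so its SNR exceeds the reference, while the opposite sensor falls below it.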

The source’s SNR at each sensor will then define the standard deviation of the DOA

error of (6). Thus, to proceed with our simulations, we need to model the DOA error as

a function of SNR. It must be emphasized here that our framework results in a diﬀerent

SNR and, therefore, a diﬀerent DOA estimation error standard deviation at each sensor.

Moreover, in order to simulate multiple simultaneous sources within the MASS, we need to

study the eﬀect of the MASS on the DOA estimation. The modeling of these parameters is

presented in Section 5.1. Then, Sections 5.2–5.5 present evaluation results using simulations

and real data.

5.1. DOA Estimation Error Modeling

The DOA estimation error at each sensor was assumed to be normally distributed with

a zero mean and a variance that was assumed to be dependent only upon the SNR at each

sensor, which was in turn determined by the length of the path from the source to the sensor.

Figure 5: Modeling the effect of SNR on DOA estimation error standard deviation for a circular microphone array. (Axes: SNR (dB) from −5 to 20 against standard deviation from 0° to 10°; curves: simulated data and fitted curve.)

Figure 6: Modeling the effect of MASS and SIR on DOA estimation error for a circular microphone array. (Axes: SIR (dB) from 0 to 20 against normalized DOA; curves: simulated data and fitted curve.)

Following the DOA estimation method of [21], we performed simulations to characterize the

DOA estimation error, using a sensor consisting of a 4-element circular microphone array

with a radius of 2 cm. The parameters of the DOA estimation method used are the same

as the ones in [21]. We assumed an anechoic environment and simulated a speech source

(male speaker) contaminated by white Gaussian noise at various SNR cases ranging from -5

dB to 20 dB. The noise at each microphone is uncorrelated with the speech source and with

the noise at all the other microphones. For each signal-to-noise ratio, the simulation was

repeated with the source rotated in 1° increments around the array to avoid any orientation biasing effects. Fig. 5 shows the standard deviations obtained when the DOA estimation error at each SNR was fitted with a Gaussian distribution. The fitted curve in Fig. 5 is given by

$$\mathrm{std}(\mathrm{SNR}) = 1.979\, e^{-0.2815\,\mathrm{SNR}} + 1.884. \qquad (29)$$
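The fitted curve can be used directly to draw simulated DOA measurements. The sketch below assumes SNR in dB and angles in degrees, as on the axes of Fig. 5; the function names are hypothetical.

```python
import math
import random

def doa_error_std_deg(snr_db):
    """Fitted DOA error standard deviation (degrees) as a function of SNR (dB)."""
    return 1.979 * math.exp(-0.2815 * snr_db) + 1.884

def noisy_doa_deg(true_doa_deg, snr_db, rng):
    """Add zero-mean Gaussian error to a true DOA and wrap the result to one turn."""
    return (true_doa_deg + rng.gauss(0.0, doa_error_std_deg(snr_db))) % 360.0

rng = random.Random(0)  # fixed seed so that a simulation run is repeatable
sample = noisy_doa_deg(45.0, snr_db=10.0, rng=rng)
```

The constant term means the modeled standard deviation never drops below 1.884° however high the SNR.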

Figure 7: Position estimation error of the two versions of the grid-based method (exhaustive search and iterative) as a percentage of cell size V for a single source in a square 4-node cell. (Axes: reference SNR (dB) against RMSE as % of cell size; curves: GB Iter, GB Exh, CRLB.)

As mentioned earlier, in order to simulate multiple simultaneous sources, it was also important to study the effect on DOA estimation when two sources were within the MASS of a sensor. We performed a simulation study where two speech sources (one male, one female) were set at various separations of up to 20°, i.e., below the MASS of the method of [21], and the energy of the second source was incrementally decreased so that the signal-to-interferer ratio (SIR) seen by the first source varied from 0 dB to 20 dB. These simulations were then repeated with the sources being rotated around the array in 1° increments, whilst preserving their angular separation, to avoid any orientation biasing effects. In all simulations only one source was detected. Fig. 6 shows the results of these simulations, where the DOA offset has been normalized by the separation between the sources. The fitted curve of the normalized DOA estimate, $\mathrm{DOA}_n$ (Fig. 6), is given by

$$\mathrm{DOA}_n(\mathrm{SIR}) = 0.5\, e^{-0.12987\,\mathrm{SIR}}. \qquad (30)$$

It is clear that the detected source’s DOA is estimated exactly in the middle of the true DOAs

when the sources have equal energy, and moves gradually towards the dominant source as

the weaker source decreases in energy. We used the ﬁtted curve of Fig. 6 in all simulations

involving more than one source.
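One way to apply the fitted merging curve in a simulation is sketched below. It assumes that the normalized offset is measured from the dominant source towards the weaker one, which matches the behaviour described above (midpoint at 0 dB SIR, dominant source at high SIR). The function name is hypothetical.

```python
import math

def merged_doa_deg(dominant_deg, weaker_deg, sir_db):
    """Single DOA reported when two sources fall within the MASS.

    Assumes the fitted normalized offset is measured from the dominant
    source towards the weaker source.
    """
    # Signed smallest angular difference from dominant to weaker, in [-180, 180).
    separation = (weaker_deg - dominant_deg + 180.0) % 360.0 - 180.0
    offset_fraction = 0.5 * math.exp(-0.12987 * sir_db)
    return (dominant_deg + offset_fraction * separation) % 360.0

equal_energy = merged_doa_deg(10.0, 30.0, sir_db=0.0)   # midpoint of the two DOAs
strong_first = merged_doa_deg(10.0, 30.0, sir_db=20.0)  # close to the dominant source
```

The signed-difference step keeps the result correct for source pairs that straddle the 0°/360° boundary.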

5.2. Simulation Results

In all simulations, the sources were located anywhere within the cell with independent uniform probability and the error measurement used was the root mean square error (RMSE) between the estimated positions and the true source positions. For each run, i.e., a different positioning of the sources, the sources’ true DOAs to the sensors were calculated using (5) and zero-mean Gaussian DOA noise was added. The standard deviation of the DOA noise was taken from Fig. 5, according to the sources’ SNRs at the sensors, which in turn were estimated based on the reference SNR at the middle of the cell. For multiple sources, when the sources were within the MASS, one DOA was estimated through the use of (30).

Figure 8: Position estimation error as a percentage of cell size V for a single source in a square 4-node cell, for various values of SNR measured at the center of the cell. (Axes: reference SNR (dB) against RMSE as % of cell size; recoverable legend entries: LLS, NLS, CRLB.)

In our ﬁrst simulation, we consider the single-source case and compare the performance

of the GB method when an exhaustive search over all grid points is performed against the

iterative version of the method. For the iterative version we used an initial grid and a ﬁnal

grid with grid point spacings of 12.5% and 0.25% of the sensor spacing, respectively. In

each iteration we reduce the grid point spacing to one half of the previous one (r= 2). For

the exhaustive search version, we use the same grid (i.e., 0.25% of the sensor spacing) and

perform an exhaustive search over all grid points to ﬁnd the source location according to (18).

Fig. 7 presents the results over 10000 runs for each reference SNR case. It is evident that

the iterative version achieves the same performance without requiring an exhaustive search

over all grid points of the ﬁnal resolution grid, thus being more computationally eﬃcient.

For all the results presented in the remainder of this paper, the iterative GB method is used

with initial and ﬁnal grids of 12.5% and 0.25% of the sensor spacing, respectively, and r= 2.

Fig. 8 presents the results of our simulations of a single source, with the five curves representing the methods of Sections 3.1–3.4 and the bound of Section 3.5. The RMSE is calculated over 10000 runs for each reference SNR case. It is clear that all the methods perform close to the bound, with the NLS and GB methods being the closest. However, as we will show later (Section 5.3), the GB method is significantly more efficient in terms of computation time. For the IP method, we set $\gamma_k = 20^\circ$ for all the results presented in this paper. Through several simulations, these parameters for the IP and GB methods were found to achieve good performance.

Figure 9: Position estimation error as a percentage of cell size V for two sources in a square 4-node cell with a MASS of 0°, for various values of SNR measured at the center of the cell. (Axes: reference SNR (dB) against RMSE as % of cell size; recoverable legend entries: P-NLS, CRLB.)

Figure 10: Position estimation error as a percentage of cell size V for three sources in a square 4-node cell with a MASS of 0°, for various values of SNR measured at the center of the cell. (Axes: reference SNR (dB) against RMSE as % of cell size; recoverable legend entries: P-NLS, CRLB.)

The performance of the multiple source localization methods of Sections 4.1–4.3 for two

and three sources was also evaluated through simulations. For all our simulations with

multiple sources presented hereafter, the RMSE is calculated over 5000 runs.

The performance of the methods for two and three sources for the case of 0° MASS is displayed in Figures 9 & 10, respectively. Both the P-NLS and GB methods used the brute force approach of Section 4.3.1 for the final source location selection. Although these results are for the idealized case of 0° MASS, it is very encouraging to see how close the performance of the GB method gets to the lower bound. However, it is evident that the performance of the IP method degrades with three sources.

Figure 11: Position estimation error as a percentage of cell size V for two sources in a square 4-node cell with a MASS of 20°, for various values of SNR measured at the center of the cell. (Axes: reference SNR (dB) against RMSE as % of cell size; recoverable legend entries: P-NLS, CRLB.)

Figure 12: Position estimation error as a percentage of cell size V for three sources in a square 4-node cell with a MASS of 20°, for various values of SNR measured at the center of the cell. (Axes: reference SNR (dB) against RMSE as % of cell size; recoverable legend entries: P-NLS, CRLB.)

Any realistic sensors and DOA estimation algorithm will have a non-zero MASS, and the performance of all localization algorithms is expected to degrade significantly as the MASS increases. This is due to the fact that the accuracy of the algorithms degrades as $C_S$ decreases, and an increasing MASS directly decreases $C_S$, especially as the number of sources increases. Another way to think of this is that as the MASS increases, the accuracy of the DOA estimates from each sensor is much more likely to degrade significantly, due to the “merging” effect illustrated in Fig. 6. In the extreme case, $C_S$ will be zero, i.e., no sensors will detect the true number of sources, and the localization algorithm will underestimate the number of source locations. A more realistic case of 20° MASS is presented in Figures 11 & 12, and the degrading effect of the increased MASS is clear, particularly for the three source case. Note again that the GB method consistently performs the best.

Figure 13: Position estimation error as a percentage of cell size V for two sources in a square 4-node cell with a MASS of 0° for 20 dB SNR at the center of the cell, for various values of extra DOA error standard deviation. (Axes: extra DOA error standard deviation from 0° to 10° against RMSE as % of cell size; recoverable legend entries: P-NLS, CRLB.)

Figure 14: Position estimation error as a percentage of cell size V for two sources in a square 4-node cell with a MASS of 20° for 20 dB SNR at the center of the cell, for various values of extra DOA error standard deviation. (Axes: extra DOA error standard deviation from 0° to 10° against RMSE as % of cell size; recoverable legend entries: P-NLS, CRLB.)

All the previous results have considered the DOA estimation error at the sensors to be modeled as in Fig. 5. In Figures 13 & 14 we consider the position error for two sources with increased DOA estimation error when the reference SNR is 20 dB. This is modeled by taking the result of Fig. 5 and adding an additional zero-mean Gaussian noise term with a standard deviation of 1°–10° at each sensor node. Again, in the 0° MASS case, the methods show a reasonable agreement with the lower bound, and as the MASS moves to 20°, the performance of all the methods suffers. Once again, the proposed GB method performs the best with the added DOA estimation error.

Figure 15: Position estimation error as a percentage of cell size V for two sources in a square 4-node cell using the grid-based method and the final step approaches of Sections 4.3.1 & 4.3.2 with various values of MASS and SNR measured at the center of the cell. (Panels: (a) Brute force, (b) Sequential. Axes: reference SNR (dB) against RMSE as % of cell size; curves: MASS = 0° to 30° in 5° steps, CRLB.)

Figure 16: Position estimation error as a percentage of cell size V for three sources in a square 4-node cell using the grid-based method and the final step approaches of Sections 4.3.1 & 4.3.2 with various values of MASS and SNR measured at the center of the cell. (Panels: (a) Brute force, (b) Sequential. Axes: reference SNR (dB) against RMSE as % of cell size; curves: MASS = 0° to 30° in 5° steps, CRLB.)

With the sequential approach of Section 4.3.2 we presented a solution to the high complexity of the brute force approach of Section 4.3.1, whilst acknowledging that its performance may be worse than that of the brute force approach. Figures 15 & 16 illustrate the difference in performance for the two approaches with the GB method for two and three sources, respectively. It is clear that little performance is lost using the sequential approach, particularly at the higher (and more realistic) values of MASS. The loss in performance is higher at low values of MASS, and for the three source case. Although it is not shown here due to space considerations, because the P-NLS method must use either the brute force or the sequential approach, it too suffers a very similar performance loss to that of the GB method. Figures 15 & 16 also illustrate the effect of MASS on the RMSE, highlighting how important it is that the DOA estimation method used has a low MASS.

Table 1: Mean execution times in milliseconds for localization methods for one set of DOA estimations

                   MASS = 0°                                 MASS = 20°
Method             one source   two sources   three sources  two sources   three sources
LLS                0.12         –             –              –             –
IP                 0.69         6.89          44.49          5.31          16.16
GB (& BF)          1.72         36.03         2961.57        19.18         214.34
GB (& Seq.)        1.72         29.39         162.79         16.79         26.69
P-NLS (& BF)       18.88        381.95        5033.43        205.12        509.59
P-NLS (& Seq.)     18.88        375.32        2238.82        202.72        322.08

5.3. Complexity

All the localization algorithms of Sections 3 & 4 were implemented in MATLAB on a Windows laptop with a Core i5 CPU running at 2.53 GHz with 4 GB RAM, and their mean execution times are presented in Table 1. Note that while the absolute execution times may be highly dependent on the machine, we are only interested here in the relative times between the methods. In the one source case, the LLS method is clearly the fastest, while the IP method is the fastest in the multiple source cases. The (P-)NLS methods are clearly the slowest, due to the non-linear optimization they require. Table 1 also highlights the dramatic reduction in complexity when using the sequential rather than the brute force approach, particularly in the three source case. These results, together with those of Section 5.2, strongly suggest that the GB method with the sequential approach is the best choice given its accuracy and moderate complexity. To further verify this suitability, we implemented the GB method with the sequential approach in C++ and measured that it only consumed 25% of the available processing time, making it an excellent candidate for a real-time system.

Figure 17: Position estimates (the red clouds) using the proposed grid-based method in a square 4-node cell, for real recordings of two [(a)–(g)] or three [(h)–(l)] simultaneous sources (the blue X’s). Panel labels: (a) $C_2 = 4$; (b) $C_2 = 2$, $C_1 = 2$; (c) $C_2 = 2$, $C_1 = 2$; (d) $C_2 = 2$, $C_1 = 2$; (e) $C_2 = 1$, $C_1 = 3$; (f) $C_1 = 4$; (g) $C_2 = 2$, $C_1 = 2$; (h) $C_3 = 2$, $C_1 = 2$; (i) $C_3 = 1$, $C_2 = 2$, $C_1 = 1$; (j) $C_2 = 4$; (k) $C_3 = 3$, $C_2 = 1$; (l) $C_1 = 4$.

Figure 18: Empirical Cumulative Distribution Functions (CDFs) of the error between the estimated and true source positions using real recorded data. (Axes: error (%) against empirical CDF; curves: GBM, P-NLS, IPM.)

5.4. Results of Real Measurements

We also performed some real recordings of acoustic sources in a 4-node square cell with

sides 4 meters long. The sensors on the nodes were circular 4-element microphone arrays with

a radius of 2 cm, and the DOA estimation was performed by our real-time system of [20, 21].

The sources were recorded speech, sampled at 44.1 kHz, played back simultaneously through

loudspeakers at diﬀerent locations, and their SNR at the center of the cell was measured to

be about 10 dB. The DOA estimation and source localization was performed on frames of

2048 samples with 50% overlap. Although a 4 × 4 meter square is not a particularly large

area, since we measure our reference SNR at the center of the cell, these results should be

scalable to larger cells. Fig. 17 shows the position estimates from the real recordings using

the proposed grid-based method for diﬀerent layouts of two and three sources. The red dots

show the cloud of estimates over about 5 seconds, and show quite accurate localization. The

pairs (f) & (g) and (j) & (k) warrant further discussion. All of the plots except (g) and

(k) used the standard parameter set of [20, 21], which has a MASS of around 20°, and it is clear that in (f) and (j) the number of sources is underestimated. By modifying some of the

parameters of the DOA estimation, we were able to decrease the system’s MASS so that all

the sources in (g) and (k) could be localized, albeit with a greater variance in the estimates.

The performance of the P-NLS and our proposed grid-based and intersection point methods was also compared on our real recorded data. Again, for DOA estimation our real-time system of [20, 21] was used. Fig. 18 shows the empirical Cumulative Distribution Functions (CDFs) of the error between the estimated and true source positions for the three localization methods. The error was calculated using all frames for all the source positionings of Fig. 17. It is evident that the P-NLS and GB methods perform the best. However, note that while the P-NLS and the GB methods have similar performance, our proposed GB method is much more computationally efficient (Section 5.3).

Table 2: RMSE as a percentage of cell size for the real recordings (outdoors) of Fig. 17 and their corresponding reverberant simulations with T60 = 400 ms

           outdoor                       reverberant
layout     GBM      P-NLS    IP         GBM      P-NLS    IP
(a)        4.33     4.33     3.80       12.13    32.05    31.58
(b)        6.33     6.33     10.83      18.60    19.07    23.28
(d)        4.53     4.52     9.84       16.30    17.85    20.81
(e)        14.92    17.03    12.15      14.64    16.51    15.08
(f)        13.39    13.39    13.42      12.81    12.81    13.22
(g)        5.41     5.41     11.93      15.70    15.71    18.24
(h)        7.71     7.70     8.87       11.77    12.49    13.91
(i)        4.61     4.61     6.02       20.64    24.43    19.91
(j)        20.69    21.39    33.04      23.20    23.02    37.27
(k)        12.99    14.15    18.64      24.07    24.45    23.69
(l)        12.38    12.37    12.85      10.72    10.72    10.70

It should be noted that these recordings took place outdoors, and as such did not have

many reﬂections, but there was a signiﬁcant level of distant noise sources, such as cars and

dogs barking. Furthermore, the orientations of the sensors were not ﬁnely calibrated, and the

DOA estimates likely have unintended oﬀsets of a few degrees. Thus the conditions were far

from ideal, making the results of our proposed localization method even more encouraging.

Figure 19: Empirical Cumulative Distribution Functions (CDFs) of the error between the estimated and true source positions using recordings in a simulated room with T60 = 400 ms. (Axes: error (%) against empirical CDF; curves: GBM, P-NLS, IPM.)

5.5. Results in reverberant environments

In this section, we test the efficiency of our localization methods in reverberant environments. We used the Image-Source method [27] to simulate a reverberant room with dimensions of 6 × 6 × 3 meters and a reverberation time of T60 = 400 ms. We placed a 4-node cell with sides 4 meters long in the middle of the room. Thus, the nodes’ centers are located at (1,1), (5,1), (5,5), and (1,5) meters. Again, the nodes consist of circular 4-element microphone arrays with a radius of 2 cm. We considered the same source positionings and the

same speech signals that we used for our real recordings in Fig. 17. For DOA estimation we

used again our system of [20, 21] on frames of 2048 samples with 50% overlap. Fig. 19 shows

the CDFs of the error between the estimated and true source positions using all frames and

all source positionings. Once again, the grid-based method performs the best. A performance degradation for all methods is evident compared to the results in Fig. 18. This is because reverberation affects the DOA estimation algorithm, which then provides more erroneous DOA estimates.

Table 2 shows the RMSE over all frames for each position layout of Fig. 17. The results

of the table agree with Figures 18 & 19 as the performance degradation due to reverberation

is evident, and the GB method generally performs the best. It is of note that in layouts

(f) and (l), the outdoor recordings have greater RMSE than the reverberant ones. These

layouts correspond to the cases where the DOA estimation algorithm in all arrays always

detects one source. The DOA estimation of this practically single source case is the one least

affected by reverberation [21]. This fact, combined with the fact that the outdoor recordings were performed in a real rather than a simulated environment, can explain this small difference in the RMSE between the two scenarios.

5.6. Tracking Potential

Due to their real-time natures, the DOA estimation algorithm of [21] and the GB method

we present here suggest the potential for integration with a tracking system. To illustrate

the tracking potential, we implemented a tracking algorithm based on particle ﬁltering using

the framework of [28]. The tracking system uses the location estimates of the GB method

to assign weights to the particles through the following likelihood function:

$$p_{tr}\left(\hat{\mathbf{p}}_s^{(t)} \mid \mathbf{x}_{j,i}^{(t)}\right) = \mathcal{N}\left(\hat{\mathbf{p}}_s^{(t)}, \mathbf{x}_{j,i}^{(t)}; \boldsymbol{\Sigma}\right) \qquad (31)$$

where $\hat{\mathbf{p}}_s^{(t)}$ is the $s$-th source location estimate from the GB method at time $t$, $\mathbf{x}_{j,i}^{(t)}$ is the location of particle $i$ associated with the tracked source $j$ at time $t$, and $\mathcal{N}$ denotes the two-dimensional Gaussian distribution with mean $\mathbf{x}_{j,i}^{(t)}$ and covariance matrix $\boldsymbol{\Sigma}$, evaluated at $\hat{\mathbf{p}}_s^{(t)}$. Assuming that the measurements are independent in the x- and y-coordinates, the covariance matrix can be written as $\boldsymbol{\Sigma} = \mathrm{diag}(\sigma_x^2, \sigma_y^2)$, where the variances $\sigma_x^2$ and $\sigma_y^2$ are used to quantify the location error that the localization system is expected to produce in the x- and y-coordinates.
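The likelihood above could be evaluated for a set of particles roughly as follows. This is a sketch only: the normalization of the weights, the fallback to uniform weights on numerical underflow, and the function name are assumptions, and the remaining steps of the particle filter of [28] (prediction, resampling, association of estimates to tracks) are not shown.

```python
import math

def particle_weights(estimate, particles, sigma_x, sigma_y):
    """Normalized particle weights from a 2-D Gaussian likelihood with
    diagonal covariance diag(sigma_x**2, sigma_y**2).

    `estimate` is one (x, y) location estimate; `particles` is a list of
    (x, y) particle locations belonging to one tracked source.
    """
    ex, ey = estimate
    norm = 1.0 / (2.0 * math.pi * sigma_x * sigma_y)
    likelihoods = [
        norm * math.exp(-0.5 * (((ex - px) / sigma_x) ** 2
                                + ((ey - py) / sigma_y) ** 2))
        for px, py in particles
    ]
    total = sum(likelihoods)
    if total == 0.0:
        # All likelihoods underflowed: fall back to uniform weights (assumption).
        return [1.0 / len(particles)] * len(particles)
    return [value / total for value in likelihoods]

# Hypothetical particle set around an estimate at (1, 1).
weights = particle_weights((1.0, 1.0), [(1.0, 1.0), (1.1, 1.0), (3.0, 3.0)],
                           sigma_x=0.15, sigma_y=0.15)
```

Particles close to the location estimate receive most of the weight, while distant particles are effectively suppressed.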

We now illustrate the potential of tracking with a simple example. In the 4 m × 4 m square cell considered in our simulations, three sources were set to move in straight lines at different velocities. In this example, the MASS is set to be 15°. To implement (31) we empirically set $\sigma_x = \sigma_y = 0.15$. The RMSE over time for 250 runs is shown in Fig. 20. It

is evident that the tracking system consistently improves the localization performance. It

is worth noting the region between 0.5 seconds and 1 second where the sources are located

such that due to the MASS the localization is able to detect only two out of three sources.

In that region, the tracking is able to keep track of the lost source and significantly

improve the performance.

Figure 20: Position estimation error as a percentage of cell size V for two sources in a square 4-node cell with a MASS of 20° and signals having 20 dB SNR at the center of the cell, for various values of extra DOA error standard deviation. (Axes: time (seconds) against RMSE as % of cell size; curves: Localization, Tracking, Localization Mean, Tracking Mean.)

6. Conclusions

In this work, we have considered the challenge of localization in a WASN where each

sensor node only transmits direction-of-arrival estimates, reducing the transmissions to the

processing node. We considered some of the real problems in such a scenario, such as modeled

DOA estimation error, and the merging of two estimates that are too close together to be

resolved by the DOA estimation algorithm. We presented a real-time grid-based method to

perform the position estimation of multiple sources along with a sequential approach to the

ﬁnal source location selection. Through extensive simulations and measurements we showed

that our proposed method outperforms the other state-of-the-art methods considered in

both accuracy and computational complexity.

Appendix A. Grid-based Error Bound

Any grid-based localization method’s accuracy will be limited by the density of its grid points. To investigate this further, we now calculate a lower bound for the root mean squared position error. If we assume single source localization, and that the method works perfectly (or that there are no DOA errors), then the method will always choose the closest grid point. Let the grid points be uniformly spaced, with $G$ being the inter-point spacing in the $x$ and $y$ directions (see Fig. 3). Without loss of generality, let us consider a grid point at $(0, 0)$; then, due to symmetry, we only need to analyze the squared error in the square defined by $(0, 0)$ and $(G/2, G/2)$. Let us also assume that a source may be located anywhere in the square under consideration, with a uniform probability density function given by

$$p(x, y) = p(x) \cdot p(y) = \frac{2}{G} \cdot \frac{2}{G} = \frac{4}{G^2}, \qquad (A.1)$$

due to the independence between $p(x)$ and $p(y)$. The squared error between $(0, 0)$ and a point $(x, y)$ is simply $x^2 + y^2$, and the mean squared error is then given by

$$E_{GB}^2 = \int_0^{G/2} \int_0^{G/2} (x^2 + y^2)\, p(x, y)\, dx\, dy = \frac{G^2}{6}, \qquad (A.2)$$

with the root mean square error being

$$E_{GB} = \frac{G}{\sqrt{6}}. \qquad (A.3)$$

If the inter-sensor spacing in the $x$ (and $y$) direction is defined as $V$ (see Fig. 1), the number of grid points can be written as

$$N = \left( \frac{V}{G} + 1 \right)^2, \qquad (A.4)$$

and, substituting $G = V / (\sqrt{N} - 1)$ from (A.4) into (A.3), we can write

$$E_{GB} = \frac{V}{\sqrt{6}\,(\sqrt{N} - 1)}. \qquad (A.5)$$

Note that this analysis is independent of the method, and should apply to any grid-based localization method.
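The closed-form bound can be checked numerically. The sketch below integrates the squared distance to the nearest grid point with a midpoint rule and compares it with G/√6; the function names and the number of integration steps are arbitrary choices made for illustration.

```python
import math

def grid_rmse_numeric(G, n=400):
    """Midpoint-rule estimate of the RMSE to the nearest grid point, for a
    source uniformly distributed over the square [0, G/2] x [0, G/2]."""
    h = (G / 2.0) / n
    density = 4.0 / G**2
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            total += (x * x + y * y) * density * h * h
    return math.sqrt(total)

def grid_rmse_bound(V, N):
    """Bound expressed through the cell size V and the number of grid points N."""
    return V / (math.sqrt(6.0) * (math.sqrt(N) - 1.0))

# Final grid spacing of 0.25% of the sensor spacing, with V normalized to 1.
G = 0.0025
numeric = grid_rmse_numeric(G)
closed_form = G / math.sqrt(6.0)
```

With this spacing the bound is roughly 0.10% of the cell size, so the grid resolution contributes only a small part of the RMSE values reported in the results.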

References

[1] A. Bertrand, Applications and trends in wireless acoustic sensor networks: A signal processing per-

spective, in: IEEE Symp. on Communications and Vehicular Technology in the Benelux, 2011, pp.

1–6.

[2] D. J. Mennill, M. Battiston, D. R. Wilson, J. R. Foote, S. M. Doucet, Field test of an aﬀordable,

portable, wireless microphone array for spatial monitoring of animal ecology and behaviour, Methods

in Ecology and Evolution 3 (4) (2012) 704–712.

[3] A. Ledeczi, G. Kiss, B. Feher, P. Volgyesi, G. Balogh, Acoustic source localization fusing sparse direction

of arrival estimates, in: Int. Workshop on Intelligent Solutions in Embedded Systems, 2006, pp. 1–13.

[4] H. Wang, C. E. Chen, A. Ali, S. Asgari, R. E. Hudson, K. Yao, D. Estrin, C. Taylor, Acoustic sensor

networks for woodpecker localization, in: F. T. Luk (Ed.), SPIE Conf. on Advanced Signal Processing

Algorithms, Architectures, and Implementations, Vol. 5910, 2005, pp. 80–91.

[5] A. Farina, Target tracking with bearings-only measurements, Signal Processing 78 (1) (1999) 61–78.

[6] R. G. Stansﬁeld, Statistical theory of D.F. ﬁxing, J. of the Inst. of Electr. Eng. - Part IIIA: Radiocom-

munication 94 (15) (1947) 762–770.

[7] K. Doğançay, Bearings-only target localization using total least squares, Signal Processing 85 (9) (2005)

1695–1710.

[8] M. Gavish, A. J. Weiss, Performance analysis of bearing-only target location algorithms, IEEE Trans.

on Aerospace and Electr. Syst. 28 (3) (1992) 817–828.

[9] A. Bishop, B. D. O. Anderson, B. Fidan, P. Pathirana, G. Mao, Bearing-only localization using geomet-

rically constrained optimization, IEEE Trans. on Aerospace and Electr. Syst. 45 (1) (2009) 308–320.

[10] Z. Wang, J. Luo, X. Zhang, A novel location-penalized maximum likelihood estimator for bearing-only target localization, IEEE Trans. on Signal Processing 60 (12) (2012) 6166–6181.

[11] L. M. Kaplan, Q. Le, On exploiting propagation delays for passive target localization using bearings-only measurements, J. of the Franklin Institute 342 (2) (2005) 193–211.

[12] L. M. Kaplan, Q. Le, N. Molnar, Maximum likelihood methods for bearings-only target localization, in: IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), Vol. 5, 2001, pp. 3001–3004.

[13] K. Pattipati, S. Deb, Y. Bar-Shalom, R. B. Washburn, A new relaxation algorithm and passive sensor data association, IEEE Trans. on Automatic Control 37 (2) (1992) 198–213.

[14] S. Deb, M. Yeddanapudi, K. Pattipati, Y. Bar-Shalom, A generalized S-D assignment algorithm for multisensor-multitarget state estimation, IEEE Trans. on Aerospace and Electr. Syst. 33 (2) (1997) 523–538.

[15] A. Bishop, P. Pathirana, Localization of emitters via the intersection of bearing lines: A ghost elimination approach, IEEE Trans. on Vehicular Technology 56 (5) (2007) 3106–3110.

[16] A. Bishop, P. Pathirana, A discussion on passive location discovery in emitter networks using angle-only measurements, in: Int. Conf. on Wireless Communications and Mobile Computing, ACM, 2006.

[17] J. Reed, C. da Silva, R. Buehrer, Multiple-source localization using line-of-bearing measurements: Approaches to the data association problem, in: IEEE Military Communications Conf. (MILCOM), 2008, pp. 1–7.

[18] H. W. L. Naus, C. V. Van Wijk, Simultaneous localisation of multiple emitters, IEE Proc. - Radar, Sonar and Navigation 151 (2) (2004) 65–70. doi:10.1049/ip-rsn:20040184.

[19] L. M. Kaplan, P. Molnar, Q. Le, Bearings-only target localization for an acoustical unattended ground sensor network, in: Proc. SPIE, Vol. 4393, 2001, pp. 40–51.

[20] D. Pavlidi, M. Puigt, A. Griffin, A. Mouchtaris, Real-time multiple sound source localization using a circular microphone array based on single-source confidence measures, in: Proc. of IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2012.

[21] D. Pavlidi, A. Griffin, M. Puigt, A. Mouchtaris, Real-time multiple sound source localization and counting using a circular microphone array, IEEE Trans. on Audio, Sp., and Lang. Proc. 21 (10) (2013) 2193–2206.

[22] A. Griffin, A. Mouchtaris, Localizing multiple audio sources from DOA estimates in a wireless acoustic sensor network, in: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2013.

[23] H. Do, H. Silverman, A Fast Microphone Array SRP-PHAT Source Location Implementation using Coarse-To-Fine Region Contraction (CFRC), in: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2007, pp. 295–298.

[24] S. M. Kay, Fundamentals of statistical signal processing: estimation theory, Prentice-Hall, Inc., Upper Saddle River, NJ, USA, 1993.

[25] J. Tiete, F. Domínguez, B. Silva, L. Segers, K. Steenhaut, A. Touhafi, SoundCompass: A Distributed MEMS Microphone Array-Based Sensor for Sound Source Localization, Sensors 14 (2) (2014) 1918–1949.

[26] M. J. Crocker, Handbook of Acoustics, Wiley, 1998.

[27] E. Lehmann, A. Johansson, Diffuse reverberation model for efficient image-source simulation of room impulse responses, IEEE Trans. on Audio, Speech, and Lang. Proc. 18 (6) (2010) 1429–1439.

[28] J. Valin, F. Michaud, J. Rouat, Robust localization and tracking of simultaneous moving sound sources using beamforming and particle filtering, Robotics and Autonomous Systems 55 (3) (2007) 216–228.
