Good practice guidelines for long-term ecoacoustic monitoring in the UK

Abstract

Passive acoustic monitoring has great potential as a cost-effective method for long-term biodiversity monitoring. However, to maximise its efficacy, standardisation of survey protocols is necessary to ensure data are comparable and permit reliable inferences. The aim of these guidelines is to outline a basic long-term acoustic monitoring protocol that can be adapted to suit a range of projects according to specific objectives and size. Here we summarise some basic recommendations for audible-range terrestrial ecosystem monitoring - more detail can be found in the following chapters. A ‘Quick start guide’ giving further rationale for these recommendations can be found in Appendix 1.
Good practice guidelines for long-term
ecoacoustic monitoring in the UK
With a particular focus on terrestrial biodiversity
at the human-audible frequency range
Foreword
The popularity of ecoacoustics as an innovative environmental discipline has enjoyed immense
growth within the last ve years, to a point where it is now becoming dicult to keep up with
all the new research papers published. What soon becomes apparent, however, is a lack of
consensus on which recording and analysis protocols to follow; partly a result of the differing
requirements of each research project, but also an historical artefact of the tropical origins
of much of this research. As more acoustic long-term monitoring schemes start to become
established throughout the UK and neighbouring countries there arises a need to adopt a more
common set of protocols, more akin to our temperate conditions, to allow for valid future analysis
and comparison. To that end a group of ecoacoustic researchers and practitioners met in June
2022 to discuss the formulation of such a set. This work was then taken forward by the authors to
generate the guidelines contained herein.
Digital technologies now allow us the ability to record our acoustic environments widely, with
relative ease; and to subject the resulting recordings to an ever-expanding range of analytical
methods. This opens up the potential to create new approaches to gauging biodiversity and
assessing the changing fortunes of species and their habitats. To maximise these benefits it
is vitally important that we secure now, and into the future, data which will illustrate baseline
assessments and highlight change. These guidelines therefore provide welcome instruction and
conformity, particularly for those new to ecoacoustics. Please use them, as appropriate, to help
guide your own contributions to the growing awareness, and use, of sound as an environmental
metric within the UK and Europe.
Bob Ashington (Natural England)
Figure I. Urban nesting Kittiwakes Rissa
tridactyla. Passive acoustic monitoring
has been used to eectively monitor
large seabird colonies - could these noisy
birds be a good candidate for long-term
ecoacoustic monitoring?
Aims
Our good practice guidelines represent the opinions of an experienced team of researchers and
consultants who have come together to synthesise the latest academic research and expert
judgement on eld-proven ways to apply ecoacoustic survey techniques, especially tailored
to long-term biodiversity monitoring. The guidelines are focussed on the use of ecoacoustic
monitoring of audible sounds within terrestrial, temperate ecosystems typical of the UK and
elsewhere in Europe, but we hope they will have wider application. We explicitly do not consider
species that produce sound in the ultrasonic range, or marine acoustics, as well-developed monitoring
protocols already exist for these purposes - although naturally there is a degree of overlap.
The co-production of these guidelines follows a UK Acoustics Network (UKAN+) ecoacoustics
symposium held at Manchester Metropolitan University, Manchester, UK on 15-16th June
2022, and attended by over 160 people both online and in-person. The guidelines are intended
to reect the discussions and emerging conclusions from that event - as well as applicable
information and research generated around the world on the topic.
Authors
The guidelines have been produced by the following team:
Oliver Metcalf - Postdoctoral research associate at Manchester Metropolitan University, using
ecoacoustic methods to monitor wildlife.
Carlos Abrahams - Director of Bioacoustics at Baker Consultants, with experience in bird
bioacoustics and freshwater ecoacoustics.
Bob Ashington - Lead Adviser (Ecoacoustics & Earth Observation), Natural England
Ed Baker - Acoustic Biology Researcher at Natural History Museum, London.
Tom Bradfer-Lawrence - Postdoctoral Research Fellow at the University of Stirling, landscape
ecologist with an interest in soundscapes and acoustic indices.
Ella Browning - Postdoctoral Research Fellow, Centre for Environment and Biodiversity Research,
Dept Genetics Evolution and Environment, University College London & Research Scientist, Bat
Conservation Trust, London.
Jonathan Carruthers-Jones - Post-Doctoral Research Fellow, Wildland Research Institute &
School of Earth and Environment, University of Leeds.
Jennifer Darby - MSc Conservation Biology student at Manchester Metropolitan University.
Jan Dick - Senior Social Ecologist, UK Centre for Ecology and Hydrology, Bush Estate, Penicuik,
Scotland.
Alice Eldridge - Reader in Sonic Systems, University of Sussex, interested in experimental and
applied computational and transdisciplinary ecoacoustics.
David Elliott - Associate Professor at Derby University.
Becky Heath - Researcher at Imperial College London, appraising and developing new methods
in (spatial) sound ecology
Paul Howden-Leach - Director of Ecology at Skyline Ecology, specialising in bat, bird and
soundscape acoustics, and Business Development Consultant for Wildlife Acoustics Inc.
Alison Johnston - Reader in Statistics, Centre for Research into Ecological and Environmental
Modelling, University of St Andrews, UK
Alexander Lees - Reader in conservation biology at Manchester Metropolitan University.
Christoph Meyer - Reader in Global Ecology and Conservation at the University of Salford.
Usue Ruiz Arana - Chartered Landscape Architect and Lecturer, Newcastle University, with
experience in creative soundscape methods for landscape design and management.
Siobhan Smyth - MSc Animal Behaviour student at Manchester Metropolitan University.
Suggested citation: Metcalf, O. et al. (2022) Good practice guidelines for long-term
ecoacoustic monitoring in the UK. UK Acoustics Network.
These guidelines are an output of The UK Acoustics Network Plus [grant number EP/V007866/1],
with additional funding from Manchester Metropolitan University and Baker Consulting Ltd.
Contents
Foreword
Aims
Authors
Executive Summary
Equipment and settings
Analysis
Targeted analysis
Soundscape analysis
Glossary
Chapter 1: Introduction
1.1 Biodiversity monitoring
1.2. Why use ecoacoustic monitoring?
1.3 Purpose of these guidelines
1.4. Soundscapes from the human perspective
1.5 How to use these guidelines
Chapter 2: Hardware
2.1. ARU Specifications and what they mean
2.1.1. Automated recording unit
2.1.2. Microphones
2.2. Cost trade-offs with recording units
2.2.1. Budget options
2.2.2. Mid-range options
2.2.3. Top-end options
2.2.4. Localisation-enabled options
2.3. Maintenance and calibration
2.4. Software for programming ARUs
2.5. Future-proofing
Chapter 3: Study Protocol
3.1 Temporal considerations
3.1.1. Deployment Schedule
3.1.2. Recording period
3.1.3. Sampling schedule
3.2 Spatial considerations
3.2.1. Detection distance
3.2.2. ARU Positioning
3.3 Audio settings
3.4 Metadata
3.5 Data storage
Chapter 4. Data Exploration
4.1 Basic data checks
4.2 Spectrograms
4.3 False-colour spectrograms/plots
4.4. Data pre-processing
Chapter 5: Targeted Monitoring
5.1 Acoustic analysis
5.1.1. Manual analysis
5.1.2. Automated and semi-automated approaches
5.1.3. Sound event detection
5.1.4. Template matching
5.1.5. Machine learning
5.1.6. Deep learning
5.1.7. Assessing classification performance
5.2 Ecological analysis
5.2.1. Presence and absence
5.2.2. Community analysis
5.2.3. Occupancy models
5.2.4. Localisation
5.2.5. Density/Abundance
Chapter 6: Soundscape Analysis
6.1. Introduction to acoustic indices
6.2. Acoustic Analysis
6.3. Computation of acoustic indices
6.4. Sampling effort to capture soundscape variability
6.5. Ecological Analysis
6.5.1. Indices to characterise landscapes
6.5.2. Indices as proxies for biodiversity metrics
6.5.3. Deep Learning for Soundscape Analysis
References
Appendix 1: An evidence-based quick-start guide for ecoacoustics deployment
Appendix 2: A table of acoustic monitoring guidance documents from around the world
Appendix 3: R code for false-colour plots
Executive Summary
Passive acoustic monitoring has great potential as a cost-eective method for long-term
biodiversity monitoring. However, to maximise its ecacy, standardisation of survey protocols is
necessary to ensure data are comparable and permit reliable inferences.
The aim of these guidelines is to outline a basic long-term acoustic monitoring protocol that
can be adapted to suit a range of projects according to specific objectives and size. Here we
summarise some basic recommendations for audible-range terrestrial ecosystem monitoring -
more detail can be found in the following chapters. A ‘Quick start guide’ giving further rationale
for these recommendations can be found in Appendix 1.
Equipment and settings
Recording devices should be capable of autonomous recording for extended periods
(Section 2.2) to minimise disturbance of the study site, and should use microphones with a flat
frequency response across human-audible frequencies (Section 2.1). All devices in
a study should ideally be the same model (Section 2.5), with a consistent gain
setting across all recorders (Section 2.1). A non-exhaustive list of available devices is
given in Table 2.1.
We provide a recommended quick-start protocol for those new to ecoacoustics projects
in Appendix 1. This recommends the following settings and programme:
Sounds should be recorded in .wav format, at a bit depth of 16 bits, and with a 48 kHz sampling rate.
Sounds should be recorded as 1 minute length files, with one recording every five minutes (1 minute on - 4 minutes off) through the full 24 hour daily cycle.
Deployments should last for a minimum of one week, and take place four times per year, one in each season.
Recording devices should be placed at least 250 metres apart, with their locations selected in relation to habitat type or other features of interest.
Consistent metadata should be collected for each deployment, with each term matching an equivalent in Audubon Core (Section 3.4).
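The quick-start settings above imply a predictable data volume, which is worth estimating before choosing memory cards. The following is a minimal sketch of that arithmetic, not part of the guidelines themselves. It assumes single-channel (mono) recording, uncompressed PCM audio and a 44-byte WAV header per file; none of these assumptions is stated in the text, and the function names are illustrative only.

```python
# Rough storage estimate for the quick-start recording settings.
# Assumptions (not stated in the guidelines): mono recording, uncompressed
# PCM audio, and a 44-byte WAV header on every file.

SAMPLE_RATE_HZ = 48_000   # 48 kHz sampling rate
BIT_DEPTH = 16            # bits per sample
CHANNELS = 1              # assumption: single-channel recorder
FILE_SECONDS = 60         # one-minute files
CYCLE_MINUTES = 5         # 1 minute on, 4 minutes off
WAV_HEADER_BYTES = 44     # assumption: canonical PCM WAV header size


def bytes_per_file() -> int:
    """Size in bytes of one uncompressed one-minute WAV file."""
    audio = SAMPLE_RATE_HZ * (BIT_DEPTH // 8) * CHANNELS * FILE_SECONDS
    return audio + WAV_HEADER_BYTES


def files_per_day() -> int:
    """Number of recordings made across the full 24-hour cycle."""
    return (24 * 60) // CYCLE_MINUTES


def bytes_per_deployment(days: int) -> int:
    """Total bytes produced by one recorder over a deployment of `days` days."""
    return bytes_per_file() * files_per_day() * days


if __name__ == "__main__":
    print(f"{files_per_day()} files/day at {bytes_per_file() / 1e6:.2f} MB each")
    print(f"One-week deployment: {bytes_per_deployment(7) / 1e9:.1f} GB per recorder")
```

Under these assumptions a single recorder produces 288 files (roughly 1.7 GB) per day, or about 11.6 GB for the minimum one-week deployment. Stereo recording would double these figures.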
Analysis
We recommend including both targeted and whole soundscape analysis.
Targeted analysis
Birds are readily detectable using passive acoustic monitoring, are a relatively
speciose group, and are well studied both in terms of their suitability for
passive acoustic monitoring and their UK ecology; we recommend including
targeted bird surveys, although other taxa may be preferable in different
circumstances (see Chapter 5).
Targeted studies should be conducted using recorders set at least 250m
apart over a suitable area (Section 3.2). Sampling should be conducted across
the breeding season and for at least one week in each of summer, autumn,
and winter (Section 3.1). At least one hour of data should be sampled for
analysis, with recordings of one minute duration and spaced at least 5
minutes apart during deployment, which should cover the diel period from
30 minutes before sunrise until four hours after sunrise (Section 3.1).
Detection and identication of the species present should be conducted
either manually or using a well-tested automated identication algorithm
such as BirdNET (Section 5.1). At least some traditional bird surveys should be
conducted in parallel to conrm the ecacy of the monitoring protocol.
Soundscape analysis
Soundscape analysis can give insights into environmental sound, including
anthropogenic noise pollution, and into the diversity of the acoustic community and its
interactions (see Chapter 6).
Sampling for soundscape monitoring should comprise at least one month
of deployment of independent recorders (i.e., > 250m apart) during each
of the four seasons, consistently repeated across years, and comprise one
minute of recording for every ve minutes across the diel cycle (Section
3.1). Analyses should be undertaken with acoustic indices whose properties
are well understood (for example the Acoustic Complexity Index or the
Bioacoustic Index), and at frequency ranges suitable for the environment
(Section 6.2). At least some ground-truthing (such as the targeted bird surveys
above) should be conducted.
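The 250 m independence criterion can be checked against a list of planned recorder locations before deployment. The sketch below uses the haversine great-circle formula on a spherical Earth, which is an approximation chosen here for simplicity and not a method prescribed by these guidelines; the site names and coordinates are hypothetical.

```python
import math
from itertools import combinations

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def too_close(sites: dict, min_separation_m: float = 250.0) -> list:
    """Return (site_a, site_b, distance_m) for every pair nearer than the minimum."""
    problems = []
    for (name_a, pos_a), (name_b, pos_b) in combinations(sites.items(), 2):
        distance = haversine_m(*pos_a, *pos_b)
        if distance < min_separation_m:
            problems.append((name_a, name_b, distance))
    return problems


if __name__ == "__main__":
    # Hypothetical recorder locations (latitude, longitude), for illustration only.
    planned = {
        "A": (53.0000, -1.5000),
        "B": (53.0010, -1.5000),
        "C": (53.0050, -1.5000),
    }
    for a, b, d in too_close(planned):
        print(f"Sites {a} and {b} are only {d:.0f} m apart")
```

For the small distances involved, projected grid coordinates (for example British National Grid eastings and northings) with a simple Euclidean distance would serve equally well.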
This basic protocol can be adapted to suit the constraints and objectives
of your monitoring project. The discussion in the following chapters aims
to provide you with the requisite knowledge and insight to make sensible
decisions to this end. Equally, it would be straightforward to extend this
monitoring protocol to include species vocalising in ultrasonic ranges such
as bats and small mammals, using the pre-existing guidance documents
highlighted in Chapter 1.
Figure 1.1. Passive Acoustic Monitoring offers the opportunity to monitor rewilding
projects such as this one at Sunart Fields, Derbyshire. Credit: Rachel Evatt.
Glossary
Note that some of these terms have a more general meaning (e.g. ‘array’, ‘aliasing’) - here we define
them in the context they are used in these guidelines.
Terminology | Definition
acoustic indices | Statistical summaries of sound energy. Many are designed for use as proxies for traditional ecological metrics like species richness (see Chapter 6).
aliasing | When the frequency of the original sound signal is misidentified during digital representation due to an insufficient sampling rate.
anthropophony | Sound produced from man-made sources, e.g. traffic noise (see section 4.4). Note: sometimes shortened to ‘anthrophony’ elsewhere, or described as ‘technophony’.
array | Multiple microphones recording simultaneously at a monitoring location (see section 3.2).
attenuation | The energy loss of a sound wave as it travels through air, water, soil or other media (see section 3.2.1.).
audible sounds | Sounds which have frequencies between 20 Hz and 20 kHz.
autonomous recording unit (ARU) | Audio recording devices which can be programmed to record at set times and left unattended in the field (due to autonomous powering and data storage or transmission) for passive acoustic monitoring (see Chapter 2).
bioacoustics | The study of the production, transmission and detection of sounds by animals.
biophony | Sound produced from biological sources, e.g. bird song.
bit depth | The number of bits (0s or 1s) used to store each sample: a higher number increases the amplitude resolution and the theoretical signal-to-noise ratio (see section 2.1.1.).
clipping | Sound signal distortion which occurs when an amplifier receives a signal beyond its maximum sound pressure level. The top and bottom of soundwaves are cut off or ‘clipped’ (see section 2.1.2.).
detection distance | The maximum distance at which a recorder can detect target sound signals. This distance varies depending on the properties (amplitude, frequency, etc.) of the emitted sound (see section 3.2.1.).
diel cycle | The full 24 hour period.
dynamic range | The difference in sound pressure level between the highest and lowest amplitudes that a microphone can handle (see section 2.1.2.).
ecoacoustics | A fundamental and applied science that investigates the ecological role of sound across levels of ecological organisation.
false-colour spectrogram | Spectrograms which use the results of three acoustic indices as the values in the Red-Green-Blue channels to colourise the spectrogram (see section 4.3.).
fast Fourier transform (commonly abbreviated to FFT) | A signal processing method used to transform audio data from the time-amplitude domain to the time-frequency domain. Within a given time window, the frequency components of the signal and their relative amplitudes are calculated. Applied over the recording as a sliding window, a spectrogram is generated enabling visual sound identification.
frequency response | The variation in sensitivity of a microphone to different frequencies within the range that it can detect (see section 2.1.2.).
gain | The amount of amplification the recorder applies to the incoming audio signal before recording it (see section 2.1.1.).
geophony | Sound produced from non-living environmental sources, e.g. wind, water.
infrasonic | Sounds with frequencies below the lower limit of human hearing (< 20 Hz).
machine learning | Computational models built using algorithms and statistical methods that learn through data-based inference, rather than following explicit sets of rules as in traditional programming (see sections 5.1.5. and 6.5.).
passive acoustic monitoring (PAM) | Automated recording of sounds for ecological monitoring, without the need for human presence.
recording format | The file format in which an ARU is able to store sound recordings (e.g. WAV, FLAC, MP3) (see section 2.1.1.).
recording period | Periods of time during deployment of an ARU when the recorder is active, normally arranged for when target ecological activity takes place. An appropriate sampling schedule must be chosen to record a representative sample of acoustic activity from these periods (see section 3.1.2.).
recording time | How long an ARU can record continuously (see section 2.1.1.).
sampling rate | The number of samples of audio taken per second by a recording device. A sampling rate of 48,000 Hz represents 48,000 samples per second and determines the range of frequencies that can be recorded (up to half the sampling rate, here 24 kHz) (see section 2.1.1.).
sampling schedule | The time that an ARU is set to record during deployment. For example, a device may be set to record for 5 minutes out of every hour across the targeted recording period (see section 3.1.3.).
signal-to-noise ratio | Decibel (dB) measure of how clearly the loudest sounds (signals) stand out from quieter background sounds (noise) made by the electronics in the microphone and recorder itself (see section 2.1.2.).
soniferous species | Species which deliberately produce sounds, e.g. song, calls, stridulation, drumming, etc.
sound pressure level (SPL) | The pressure deviation from ambient levels caused by a sound wave. Measured in decibels (dB SPL), which express the signal amplitude relative to the quietest pressure waves humans can hear (2 × 10⁻⁵ Pa) (see section 2.1.2.).
soundscape | The whole acoustic environment resulting from the combination of all audible sounds in an ecosystem (see Chapter 6).
spectrogram | Visual representations of the spectrum of frequencies in a sound file, with frequency on the y-axis, time on the x-axis and amplitude expressed through intensity of colour (see Figure 2.3).
ultrasonic | Sounds with a frequency above the upper limit of human hearing (i.e. sounds above 20,000 Hz).
waveform | A graph of a signal (sound) showing amplitude versus time (Figure 2.3).
zero-crossing-rate | The number of times an audio signal crosses zero (negative to positive or vice versa); this serves as a primitive proxy for basic pitch detection.
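Several of the glossary terms correspond to simple calculations, and working through them can make the definitions more concrete. The sketch below is illustrative only and uses standard textbook relationships rather than anything specific to these guidelines: decibels relative to the 2 × 10⁻⁵ Pa reference, the level range spanned by a given bit depth (about 6 dB per bit), the apparent frequency of a tone sampled below twice its frequency, and a count of zero crossings. All function names are hypothetical.

```python
import math

P_REF_PA = 2e-5  # reference pressure for dB SPL, as given in the glossary


def spl_db(pressure_pa: float) -> float:
    """Sound pressure level in dB SPL for a sound pressure in pascals."""
    return 20 * math.log10(pressure_pa / P_REF_PA)


def quantisation_range_db(bit_depth: int) -> float:
    """Level range in dB spanned by the 2**bit_depth values of one sample."""
    return 20 * math.log10(2 ** bit_depth)


def alias_hz(tone_hz: float, sample_rate_hz: float) -> float:
    """Apparent frequency of a pure tone after sampling with no anti-alias filter.

    Tones at or below half the sampling rate are returned unchanged;
    higher tones fold back into the recordable range.
    """
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


def zero_crossings(samples) -> int:
    """Count sign changes between adjacent samples (zero counts as positive)."""
    return sum((a >= 0) != (b >= 0) for a, b in zip(samples, samples[1:]))


if __name__ == "__main__":
    print(f"1 Pa is {spl_db(1.0):.1f} dB SPL")
    print(f"16-bit audio spans {quantisation_range_db(16):.1f} dB")
    print(f"A 30 kHz tone sampled at 48 kHz appears at {alias_hz(30_000, 48_000):.0f} Hz")
    print(f"Zero crossings in [1, -1, 1, -1]: {zero_crossings([1, -1, 1, -1])}")
```

Dividing the zero-crossing count by the duration of the clip in seconds gives a rate; for a clean pure tone this is roughly twice the tone's frequency, which is why it can act as a crude pitch estimate.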
Chapter 1: Introduction
1.1 Biodiversity monitoring
The entwined global biodiversity and climate crises and their effect on associated
ecosystem services pose a serious threat to planetary health as well as human
health, well-being and the global economy [1]. This is particularly evident in the United
Kingdom, one of the most nature-depleted countries on the planet [2]. Given this
context, monitoring biodiversity is vital to provide information on the status of wildlife
populations, invasive species, changes in habitat quality and resilience of ecosystem
functions. In turn, effective biodiversity monitoring is a requirement for evidence-led
conservation policy and the adoption of effective adaptive management protocols.
The UK government and civil society have responded to the threat of biodiversity
loss with a range of measures aimed at conserving and increasing biodiversity. These
currently include the Biodiversity Net Gain [3] approach to development, an increased
focus on agri-environment schemes, and a rapid increase in rewilding projects across the
country [4] – alongside the continuation of more traditional conservation actions. Given
the recent commitments at COP 15 [5] to restore 30% of degraded land and protect 30% of
the most important areas for biodiversity globally, it is likely we will see an increase in, and
diversification of, these projects in the coming years.
These large-scale projects and schemes require biodiversity monitoring that is effective
over broad spatial and temporal scales. Many of the UK government responses are
born from the credo of the Lawton report [6] - ‘More, bigger, better, and joined up’ –
meaning that they are intended to foster change at large spatial scales. There is also an
increased understanding that ecological change, both positive and negative, occurs
over long periods and not just as an immediate response to one-off interventions [7]. In
consequence, ecosystem monitoring is in increased (and long-term) demand, but is not
always feasible with traditional ‘boots on the ground’ survey methods.
Fortunately, a range of new, technology-driven approaches are being developed in
wildlife monitoring globally [8,9]. These include the use of drones, camera traps, and the
focus of these guidelines, autonomous sound recording units (ARUs), which can be
deployed in the field to collect sound recordings without regular intervention. These
new technologies offer the capacity to accumulate large quantities of environmental
data, whilst also presenting novel practical and analytical challenges [10,11]. These
challenges are exacerbated by a lack of standards. These guidelines therefore set out
current good practice for the use of ARUs for long-term biodiversity monitoring.
1.2. Why use ecoacoustic monitoring?
Ecoacoustics is an interdisciplinary science that investigates natural and anthropogenic
sounds and their relationship with the environment. An increasing range of ecoacoustic
methods support the use of sound to study the environment. This is a rapidly expanding
approach for the collection and analysis of environmental data, which offers potential
for valuable contributions to long-term biodiversity monitoring and subsequent
management [9]. Recent developments in more affordable ARUs and sophisticated
and accessible audio data analysis tools have widened the taxonomic, temporal,
and geographical scope of acoustic studies, as well as the research questions being
investigated [12,13]. Prior to these developments, terrestrial acoustic research focussed
primarily on recording bats with hand-held devices, with either no recording capability,
or with subsequent manual acoustic analysis of the recordings. Geographically, passive
acoustic studies were largely restricted to Europe and North America and limited in
temporal scope. However, the passive acoustic research landscape is now changing
dramatically.
The expanding availability of the ecoacoustic toolkit is reflected in a rising number of
review papers on the use of ecoacoustics – highly useful resources for those new to
the field. These include general reviews on the discipline of ecoacoustics [14,15], together
with more targeted reviews on animal communication [16], avian bioacoustics [17,18,19], use
in freshwater habitats [20,21], acoustic data processing [22], acoustic indices [23], localisation of
individuals [24], and estimation of population densities [25,26].
Ecoacoustics is underpinned by the use of ARUs to record soundscapes in the absence
of a human observer [27], thus allowing Passive Acoustic Monitoring (PAM). PAM has several
advantages over more traditional survey methods. The biggest benefit of PAM is the
capacity to function for long periods without frequent human intervention, allowing
studies to be conducted over broad spatiotemporal scales [15,19]. This allows surveys to
be conducted in places where regular access is logistically challenging, minimises
human impact on the study site, eases surveying at times that are unfavourable for
traditional surveys, and enables the collection of large quantities of data. Furthermore,
pre-programmed recording schedules allow for a variety of sampling regimes, reducing
power consumption and further extending the duration over which ARUs can record
without human intervention - in a flexible, predictable, and replicable manner. This
reduces the cost of data collection in comparison to traditional survey techniques [19], and
facilitates targeting of nocturnal, rare, or hard to detect species that may only vocalise at
specific times [28]. ARUs offer the capacity to record continuously and at broad frequency
spectra, meaning that PAM can be used to simultaneously monitor all soniferous species
in an area, increasing the cost-effectiveness of multi-taxa surveys and facilitating surveys
of understudied taxonomic groups such as insects [15,29].
Figure 1.2. A typical deployment of an autonomous recording unit in UK woodland.
© Copyright Carlos Abrahams.
Practical obstacles slowing uptake of PAM have diminished in recent years. The cost of
recording units has fallen greatly, with some costing as little as £65 (AudioMoth) and
a general trend towards miniaturisation assisting with logistical challenges in field
placement [22,30,31,32,33]. Similarly, memory cards for ARU devices have increased in capacity
while their costs, and that of long-term storage, are falling. Meanwhile, cloud computing
increasingly represents a long-term, relatively affordable solution for both data storage
and computational capacity for analyses [34,35].
PAM oers several other advantages in both data collection and analysis over traditional
in situ methods. For example, it makes standardisation of surveys easier, avoiding eects
from observer presence36 and observer bias in the eld37. Critically, the collection of raw
audio data can provide a permanent record, which has at least six benets:
They are permanent records of the surveys conducted, as well as the results
encountered, something that may be of particular importance for those wishing to
use PAM commercially.
Due to the permanent data record, it is possible to verify and correct bias introduced
at the analysis stage19.
It limits the requirement for specialist observers in the eld, as a single expert can
independently analyse a large number of surveys afterwards38,39,40.
Data are available for reanalysis in case of technological advancements, or for
application to new questions38,41.
The data can be used as tools to engage local stakeholders or engage wider
audiences in conservation, for example around restoration and rewilding projects42.
Recordings represent an acoustic ‘time capsule’, providing historic records that may
provide critical evidence of changing soundscapes in the decades to come43.
Alongside these clear benefits, there are some challenges to the effective use of PAM:
Acoustic methods can only record soniferous species when they are producing sound. Silent or quiet individuals or species will go undetected.
Recording hardware is still being rapidly developed, and the microphone, circuitry and firmware vary between manufacturers and models, with consequent effects on the audio data collected.
The storage of data for large projects can be problematic, and there are few established repositories in which to archive recordings.
There are also current challenges in the analysis of ecoacoustic data and the interpretation of outputs.
However, with the exponential growth of this interdisciplinary field in recent years,
combined with the reducing costs of equipment, data storage, computational power,
and ever increasing commitments to address the biodiversity crisis, we believe that
many of these challenges will be ameliorated in the near term.
Item | Reason | PAM vs traditional methods – effect size* | Confidence that PAM has an advantage/disadvantage
Soundscape analysis | Can only be undertaken with recorded acoustics | +++ | High
Temporal scaling | ARUs can be deployed to record for long periods at any time of the day | +++ | High
Data archiving | Acoustic data and analysis processes can be stored as a permanent record | +++ | High
Standardisation | Sampling and analysis are easier to standardise with identical ARUs and computational analysis methods | +++ | Medium
Multi-taxa surveys | The same acoustic data can be analysed for multiple taxa | +++ | Medium-High
Reanalysing data | Surveys can be played back to find overlooked species, or re-analysed using new methods | +++ | High
Phenology studies | Long-duration recordings facilitate long-term studies | +++ | High
Avoiding disturbance | Human presence not required during survey periods | +++ | High
Species richness | PAM more effective overall at detecting higher species richness | +++ | Medium-High
Reliance on expert labour | Analysis can be undertaken away from busy survey periods, for instance outside breeding seasons when experts may have more availability | ++ | High
Spatial scaling | ARUs can be deployed at multiple sites to record simultaneously | ++ | Medium
Vocal activity rate | Relatively straightforward to measure with PAM | ++ | Medium
Localisation/Non-invasive tracking | Complex, but could be done over long periods and in near real-time | ++ | Low
Detection of rare species | Increased likelihood of detection with longer recordings, but impractical to actively search | + | Medium
Species occupancy | Easier to collect replicate samples | + | Medium
Material and labour costs | Dependent on number of sites/visits and distances to travel. Equipment is often more expensive, but requires fewer site visits | = | Low
Weather | Recordings impacted by wind and rain, but long deployments can allow sampling to avoid bad weather | = | High
Density | Can be estimated using PAM, but likely simpler in most cases to estimate density using traditional methods | - | Low
Behaviour | Lack of visual observations can make interpretation difficult | -- | Medium
Number of detections | Not always clear how many calling individuals are present | -- | Medium
Mobility | Restricted to stationary survey methods | --- | High
Survey area | Difficult to estimate the exact area covered | --- | High
Visual detections | No visual data – impossible to detect some species or behaviours | --- | High
*(+++ indicates the largest advantage of PAM over traditional survey methods; --- indicates the greatest disadvantage compared to traditional
survey methods; = indicates there is no difference between PAM and traditional survey methods.)
Table 1.1. Benefits and challenges of passive acoustic monitoring compared with point counts for
biodiversity monitoring. Adapted from Darras et al., 2019 [19].
1.3 Purpose of these guidelines
Whilst these guidelines are likely to be of interest to anyone working in ecoacoustics,
they are explicitly targeted at those wishing to use PAM for long-term acoustic
monitoring of European biodiversity, with a particular focus on the UK and audible
sounds. The objective is to provide a clear set of good-practice recommendations,
drawing from academic literature and the authors’ experience, for those with the
greatest opportunity to apply PAM to biodiversity management projects – including
land-managers, ecological consultants, conservation practitioners, and rewilders.
There are some ambiguities surrounding both what is meant by ‘long-term’ monitoring,
and by the broad term ‘biodiversity. Whilst we do not wish to limit this document to use
only at specic timescales, by ‘long-term’ we have in mind the sort of periods over which
an agri-environment scheme may take eect (approx. 3-10 years), a large construction
project that may need to be monitored for ecological impact (approx. 10 years), or the
duration of a rewilding or net-gain project (perhaps 30+ years). Long-term monitoring
can be conducted in two ways – either continuous or periodic. PAM can lend itself to
both approaches. As the choice of intensity and duration of survey periods is likely to be
highly dependent on local and project context, we do not attempt to prescribe a ‘best’
method, but highlight a range of tools and examples that are suitable, as evidenced by
the scientic literature.
It is also necessary to dene more narrowly what we mean by ‘biodiversity. These
guidelines are aimed at facilitating the monitoring of terrestrial biodiversity that
produces sound at or near human-audible frequencies (approximately 20 Hz - 20 kHz) -
an area where good-practice guidelines are currently lacking. Aquatic biodiversity and
species vocalising in ultrasound, such as bats, fall outside the remit of this guidance, in
large part because very good guidelines already exist for acoustic monitoring of these
taxa44,45. In practice, this means the focus is primarily on birds and many mammals,
amphibians, and insects that produce sound at frequencies audible to humans.
However, some of the information here will still be useful to those wishing to monitor
aquatic and ultrasonic species, and these guidelines will apply to those wishing to
simultaneously monitor wildlife that produce sound at any frequency. In practice, that
means for species-specic considerations, the focus is primarily on birds, along with
many mammals, amphibians, and insects that produce sound at frequencies audible to
humans. In addition, we explicitly cover monitoring of whole soundscapes and how to
relate these soundscapes to the biological components of the environment.
There are two main approaches for the use of acoustic data in biodiversity studies. The
first is to detect, identify and analyse specific spectral and/or temporal features of the
acoustic environment. We refer to this as ‘targeted’ monitoring - the detected features
will most likely be sounds emitted by a target species. This method can also incorporate
the detection and identification of any individual sound - for instance anthropogenic
sounds in a study, such as gunshots, which may be evidence of disturbance events.
The second approach is soundscape monitoring. Here the entire soundscape is treated
as an emergent property of the landscape and environment, and is analysed through
statistical representations of this whole. This can entail, for example, understanding
whether it is a soundscape with a large variety of sounds from a range of sources, or
a simpler soundscape with few and sparse sounds. A great deal more information is
included on these diering approaches in Chapters 5 and 6 respectively, with their
corresponding benets and drawbacks. Which of the approaches is chosen (or how the
two are combined) will inuence all other aspects of study design.
1.4. Soundscapes from the human perspective
Whilst these guidelines focus on ecoacoustics, another set of guidelines is currently
being developed to study the soundscape as perceived by humans, with overlapping
applications in acoustics, urban planning and design, and landscape design and
management. These guidelines are contained within the following publications: ‘ISO
12913 Acoustics - Soundscape Part 1: Definition and conceptual framework [46], Part 2:
Data collection and reporting requirements [47], and Part 3: Data analysis’, with further
parts to be developed (British Standards Institution, 2014, 2018, 2019) [48]. This approach
studies the soundscape through qualitative methods first, adopting a bottom-up
approach, and afterwards by acoustic measurements, with a particular focus on human
perception and amenity. Whilst these sets of guidelines are being developed separately,
certain environmental research and industry projects might benefit from the integration
of both, and users will likely find complementarity between the two.
Figure 1.3. Dicult habitat to survey on foot, such as wet woodland can be an excellent place
to deploy autonomous recording units. Credit: Oliver Metcalf.
1.5 How to use these guidelines
These guidelines are organised to take the reader through the process of carrying
out an ecoacoustic monitoring study in the order that it might naturally occur – that
is, from purchasing hardware, designing survey protocols, collecting acoustic data,
analysing the data and inferring ecological insight. However, this is not necessarily the
best order to plan a programme of passive acoustic monitoring. An optimal plan must
be informed by the individual context and aims of each project, and inevitably shaped
by time and nancial constraints. For instance, someone reading the guidelines in the
order presented here may determine that three top-of-the-range ARUs are preferable
to ten cheaper but lower quality models, but on coming to the analysis chapter realise
that the ecological analysis they hoped to conduct is simply not feasible with only
three recording units. Similarly, a user with a very clear idea of the ecological objective
of their study may determine the necessary analysis, but on reading the hardware
chapter realise that undertaking such an analysis falls outside of their time or budget
constraints and have to revisit which analyses are possible. Hence these guidelines are
not intended to only be read linearly. Each chapter will inform trade-offs between each
of the considerations above, and it is likely that a reader will want to move between the
chapters as they plan a study.
We have attempted to provide a comprehensive introduction to all stages of ecoacoustic
monitoring in these guidelines, but there is a great deal of literature elsewhere that
contains valuable information on how to optimally conduct such surveys (see Table 1.2).
Whilst we refer to these texts throughout, it is worth highlighting here a number of other
excellent existing guidelines which readers may find useful.
Figure 1.4. Grassland, wetland, and woodland can hold a diverse range of sonifying
biodiversity. Credit: Oliver Metcalf.
Taxa | Region | Title | Authors and link
Amphibians | USA | Amphibian Monitoring Protocol (Version 2.0) | National Park Service, Great Lakes Inventory and Monitoring Network https://www.nps.gov/im/glkn/amphibians.htm
Bats | USA | Range-wide Indiana bat & Northern long-eared bat survey guidelines. | U.S. Fish and Wildlife Service. (2022). https://www.fws.gov/library/collections/range-wide-indiana-bat-and-northern-long-eared-bat-survey-guidelines
Bats | USA | Guidance for conducting acoustic surveys for bats: Version 1 detector deployment, file processing and database version | National Park Service https://irma.nps.gov/DataStore/Reference/Profile/2231984
Bats | UK | Designing effective survey and sampling protocols for passive acoustic monitoring as part of the national bat monitoring | Newson, S.E., Boughey, K.L., Robinson, R.A. & Gillings, S. 2021. JNCC Report No. 688, JNCC, Peterborough, ISSN 0963-8091 https://hub.jncc.gov.uk/assets/4cc324dc-1ad8-446e-acdd-a656348025b3
Bats | Scotland | Bats and onshore wind turbines - survey, assessment and mitigation | NatureScot, 2021 https://www.nature.scot/doc/bats-and-onshore-wind-turbines-survey-assessment-and-mitigation
Bats | UK | Bat Surveys for Professional Ecologists: Good Practice Guidelines | Collins, J. (ed.) (2016). 3rd edition. The Bat Conservation Trust, London. ISBN-13 978-1-872745-96-1 https://www.bats.org.uk/resources/guidance-for-professionals/bat-surveys-for-professional-ecologists-good-practice-guidelines-3rd-edition
Bats | UK | Guidelines for passive acoustics surveys of bats in woodland | Bat Conservation Trust https://www.bats.org.uk/our-work/national-bat-monitoring-programme/passive-acoustic-surveys/guidelines-for-passive-acoustic-surveys-of-bats-in-woodland
Birds | Canada | How to Most Effectively Use Autonomous Recording Units When Data are Processed by Human Listeners | Bayne, E., Knaggs, M., and Sólymos, P. Bioacoustic Unit, Bayne Lab at the University of Alberta & Alberta Biodiversity Monitoring Institute. 2017 http://bioacoustic.abmi.ca/wp-content/uploads/2017/08/ARUs_and_Human_Listeners.pdf
Birds | UK | Bird Bioacoustic Surveys – Developing a Standard Protocol | Abrahams, C. In Practice the Bulletin of the Chartered Institute of Ecology and Environmental Management. December 2018. https://www.researchgate.net/publication/329443381_Bird_Bioacoustic_Surveys_-_Developing_a_Standard_Protocol
Cetaceans | USA | Baseline Long-term Passive Acoustic Monitoring of Baleen and Sperm Whales and Offshore Wind Development | Appendix I of: Van Parijs, S. M., Baker, K., Carduner, J., Daly, J., Davis, G. E., Esch, C., … Staaterman, E. (2021). NOAA and BOEM Minimum Recommendations for Use of Passive Acoustic Listening Systems in Offshore Wind Energy Development Monitoring and Mitigation Programs. Frontiers in Marine Science, 8, 1575. doi:10.3389/fmars.2021.760840
Cetaceans | Scotland | Use of Static Passive Acoustic Monitoring (PAM) for monitoring cetaceans at Marine Renewable Energy Installations (MREIs) for Marine Scotland | Embling, C. B., Wilson, B., Benjamins, S., Pikesley, S., Thompson, P., Graham, I., Cheney, B., Brookes, K.L., Godley, B.J. & Witt, M. J. https://tethys.pnnl.gov/sites/default/files/publications/emblingetal.pdf
Soundscapes | Norway | Management relevant applications of acoustic monitoring for Norwegian nature – The Sound of Norway | Sethi, S. S., Fossøy, F., Cretois, B. & Rosten, C. M. 2021. NINA Report 2064. Norwegian Institute for Nature Research. https://brage.nina.no/nina-xmlui/handle/11250/2832294
Soundscapes and animals | Global | Passive acoustic monitoring in ecology and conservation | Ella Browning, Rory Gibb, Paul Glover-Kapfer & Kate E. Jones. 2017. WWF Conservation Technology Series 1(2). WWF-UK, Woking, United Kingdom. https://www.wwf.org.uk/sites/default/files/2019-04/Acousticmonitoring-WWF-guidelines.pdf
Soundscapes | UK | The potential use of acoustic indices for biodiversity monitoring at long-term ecological research (LTER) sites | Andrews, C. and Dick, J. 2021. UK Centre for Ecology & Hydrology https://nora.nerc.ac.uk/id/eprint/531301/1/N531301CR.pdf
Table 1.2. Selected acoustic monitoring guidelines for other taxa and regions.
For the full table see Appendix 2.
These guidelines represent the current opinions of experts in the field on what
constitutes good practice for long-term acoustic monitoring of UK biodiversity. That
does not mean they are perfect; not every challenge in ecoacoustic monitoring has been
investigated, quantified, or properly assessed, and the available hardware and software
tools are constantly evolving. The guidelines, therefore, can only supplement the
knowledge and experience of those undertaking monitoring studies.
Whilst we have attempted to make these guidelines as comprehensive as possible,
there is no substitute for experience. In an emerging interdisciplinary field, it can be
challenging to find expertise in all of the relevant subdisciplines when carrying out a
project – ecology, acoustics, signal processing, statistics, and in some cases
machine-learning. Nevertheless, we urge those wishing to undertake acoustic monitoring of
biodiversity without such a range of skills not to be put off, but to reach out to the
myriad sources of help and information highlighted in this document prior to designing
or undertaking their studies.
Finally, as the ultimate end product of biodiversity monitoring is ecological knowledge,
the value of real-world, local, expertise is paramount. It is vital that the information and
guidance in this document is interpreted by experienced and skilled ecologists at every
step in order to apply these methods in an optimal manner.
Figure 1.5. Dicult habitat to survey on foot, such as marsh can be an excellent place to
deploy autonomous recording units. Credit: Oliver Metcalf.
Chapter 2: Hardware
Autonomous recording units (ARUs) underpin ecoacoustic monitoring, enabling the collection of
extensive amounts of data with relative ease. Recent years have seen significant developments in
the price, quality, and availability of these devices - although not necessarily simultaneously in
the same device. It is, however, a fast-moving area of development, with new devices emerging
annually. Choosing which unit to purchase is likely to be one of the first decisions made by those
looking to take up ecoacoustic monitoring. Yet there are complex trade-offs to be made, and
the best unit for a particular monitoring situation should be chosen only after obtaining
a clear idea of the objective of the study, potential recording schedules and requisite analysis
methods [49]. In this chapter, we provide an introduction to ARUs, describing key considerations for
hardware specification, device cost, and the need for device performance calibration.
2.1. ARU Specications and what they mean
Most passive acoustic hardware comes with a long list of technical specications, but for
the novice ecoacoustician it may not always be clear what these mean, or how important
they are. This section explains some of the common specications found in passive
acoustic hardware manuals.
2.1.1. Automated recording unit
Size and weight - these specications are fairly self-explanatory, but are
an important consideration. Larger units can often hold more batteries and
memory cards, so can be left in the field for longer, but may take longer to
deploy and collect as fewer can be carried in a single trip. In addition, smaller
units can be easier to find ideal deployment locations for and are less obtrusive
in the field, which also reduces the chance of theft.
Recording Time - how long an ARU can record continuously. Note that these
manufacturer values are often given based on battery capacity rather than
memory storage limits - recording at a high sampling rate may mean that the
memory cards fill before the batteries run out; similarly the battery life will
depend upon battery type, recording schedule and temperature. See Figure
6 in Sugai et al. (2020) [12] for an illustration of these tradeoffs. Because there
is a greater power draw on start up, non-continuous recording schedules
may reduce total record time, but most manufacturers’ scheduling software
is able to estimate the maximum total recording time based on different
schedules.
Recording Format - the le format in which the unit is able to store sound
recordings. The default option here is the .wav le format, which saves
uncompressed data. Some units oer the capacity to record in lossless
compression formats (.FLAC or .W4V), or lossy compressed (.MP3), which can
dramatically increase the storage capacity. Lossy formats irreversibly alter the
acoustic data in a way that is inaudible to humans, but that may potentially
lose ecologically valuable sound data.
Sample Rates - the sampling rate is how often the recording device samples
the analogue signal in order to convert it to a digital representation. The sound
signal needs to be sampled at a rate at least twice the maximum frequency
of the sound of interest, e.g. using a sampling rate of 4 kHz will allow recording
of sounds up to just 2 kHz, whilst a sampling rate of 36 kHz will allow the
recording of sounds up to 18 kHz. It is therefore necessary to ensure that
the chosen ARU has an available sampling rate at least double the maximum
frequency of the sounds you wish to record. For human-audible frequencies
this is generally not an issue, but for those wishing to use the same device
to record bats, it is worth ensuring that the device sampling rates go high
enough to record ultrasound. Most, if not all, devices offer variable sampling
rates. This is a useful feature, as the size of the audio file increases linearly with
the sampling rate (e.g. a 1 minute audio file recorded at 32 kHz sampling rate
requires twice as much storage space on disk as one recorded at 16 kHz). For
projects focussed on only species vocalising at low frequencies, being able to
record at a low sampling rate is therefore a useful memory-saving feature.
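The two relationships described under Sample Rates (the factor-of-two limit on recordable frequency, and the linear growth of file size with sampling rate) can be checked with a short calculation. The following is a minimal sketch for uncompressed, single-channel PCM audio. The function names and the 16-bit and 32 GB example values are illustrative assumptions, not settings of any particular ARU, and the small header that real .wav files carry is ignored.

```python
def max_recordable_frequency_hz(sample_rate_hz: float) -> float:
    """Highest frequency that can be represented at a given sampling rate."""
    return sample_rate_hz / 2.0


def min_sample_rate_hz(max_frequency_hz: float) -> float:
    """Lowest sampling rate that captures sounds up to max_frequency_hz."""
    return 2.0 * max_frequency_hz


def pcm_bytes(duration_s: float, sample_rate_hz: int,
              bit_depth: int = 16, channels: int = 1) -> int:
    """Size of uncompressed PCM audio data in bytes (file header ignored)."""
    return int(duration_s * sample_rate_hz * channels * bit_depth // 8)


def hours_until_card_full(card_bytes: int, sample_rate_hz: int,
                          bit_depth: int = 16, channels: int = 1) -> float:
    """Continuous recording time before storage (not battery) runs out."""
    bytes_per_second = sample_rate_hz * channels * bit_depth // 8
    return card_bytes / bytes_per_second / 3600.0


if __name__ == "__main__":
    # A 36 kHz sampling rate records sounds up to 18 kHz.
    print(max_recordable_frequency_hz(36_000))
    # One minute at 32 kHz needs twice the space of one minute at 16 kHz.
    print(pcm_bytes(60, 16_000), pcm_bytes(60, 32_000))
    # Assumed example: a 32 GB card recording 16-bit mono at 48 kHz.
    print(round(hours_until_card_full(32 * 10**9, 48_000), 1))
```

Comparing the storage-limited figure with the manufacturer's battery-limited recording time indicates which of the two is likely to end a deployment first.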
Bit depth - The number of bits (0s or 1s) used to store each sample: a higher
number increases the amplitude resolution and increases the theoretical
signal to noise ratio. Digital data are stored as binary values, thus a bit depth
of n can represent 2^n distinct amplitude values. An 8-bit system has a resolution
of 256, 16-bit gives 65,536 etc. A higher bit depth therefore uses more memory when
recording audio, but also allows for greater recovery of data in the case of audio clipping.
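The effect of bit depth can be illustrated numerically. The sketch below computes the number of amplitude values for a given bit depth and the theoretical signal-to-quantisation-noise ratio of an ideal converter driven by a full-scale sine wave, which follows the familiar rule of thumb of roughly 6.02 dB per bit plus 1.76 dB. The function names are illustrative, and real recorders will fall short of this ideal figure because of microphone and preamplifier self-noise.

```python
import math


def quantisation_levels(bit_depth: int) -> int:
    """Number of distinct amplitude values a single sample can take."""
    return 2 ** bit_depth


def theoretical_sqnr_db(bit_depth: int) -> float:
    """Ideal signal-to-quantisation-noise ratio for a full-scale sine wave.

    Exact form of the rule of thumb 6.02 * bit_depth + 1.76 dB.
    """
    return 20.0 * math.log10(2 ** bit_depth) + 10.0 * math.log10(1.5)


if __name__ == "__main__":
    for bits in (8, 16, 24):
        print(bits, quantisation_levels(bits),
              round(theoretical_sqnr_db(bits), 1))
```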
Figure 2.1. AudioMoths are small and relatively cheap, making them readily
deployable in a range of locations and with custom-made covers.
Credit: Oliver Metcalf.
Power Options - ARU devices are run from a range of power sources. Most
often they are battery-powered; alkaline batteries are cheaper but tend to hold
a lower charge than lithium-ion batteries. Note that the rules around flying
with, and posting/shipping of, lithium-ion batteries are much more restrictive
than those for alkaline batteries. Increasingly, some models allow for additional
power sources such as solar panels, 12V or mains electricity to power the units
so that they can run indefinitely. However, solar panels remain quite expensive
and deep cycle batteries can lose efficiency over time. Note also the increased
obtrusiveness and subsequent increased chances of theft with the use of bulky
solar panels. Self-built ARU designs can be run from power banks and other
large batteries (including car batteries), meaning they can be exceptionally
long-lasting in the field. Note that all batteries are affected by cold and
performance will decline in sub-zero temperatures; advances in carbon-based
materials may change this in the future.
Data Storage - most devices take SD or micro-SD cards which are widely
available and are relatively cheap. If you wish to use larger capacity SD cards
(>32 GB) make sure that the device supports exFAT formatted cards, which
will be the case for most units. Additional card slots allow the devices to be
deployed for longer, and at higher sampling rates, meaning that power supply
is the constraining factor on unattended survey times.
Material and design - ARUs face a range of challenging scenarios under
field conditions. Users will want to ensure that ARUs are fully waterproof,
but also include vents to allow condensation to escape and sound to enter,
if microphones are internal. Adding silica absorbent material into the
enclosure is a good way to ensure electronics aren’t affected by condensation.
Additionally, it is not uncommon for units to be of great interest to a range
of wildlife, so internal or small external microphones can be desirable,
as can limiting the number of points at which ants and other invertebrates can gain
ingress, as these can damage devices. For external microphones, long-term
moisture exposure can cause degradation of recording quality, and additional
weatherproofing of the microphone can be desirable. Additionally, as with
other autonomous devices like camera traps, theft remains a risk - particularly
in more urban areas. Devices in dull colours avoid additional cost and effort in
camouflaging them. Plastic surrounds can be advantageous as they allow the
owner to brand identification marks directly on the unit, reducing resale value.
Some devices, such as the Wildlife Acoustics SM4, have additional mounting
plates and/or points for attaching security cables.
Figure 2.2. A Song Meter Micro deployed to monitor wetlands. Credit: Oliver Metcalf.
Interface - some devices have built in LCD screens and buttons allowing
them to be manually scheduled in the field. Others have no interactivity and
can only be adjusted through an app that can connect with the device or by
loading a program onto the SD card/directly to the unit pre-deployment. Both
options can work well. LCD screens allow for impromptu alteration of settings
in the field, whereas the requirement to program devices before deployment
can lead to more careful consideration and setting up of the devices whilst
inside in a more benign environment - potentially eliminating
a source of mistakes and additional water ingress to devices.
Temperature - some devices will function reliably over a broader range
of temperatures; all devices are likely to function well within the average
temperature ranges expected in the UK.
Gain settings - most recorders offer variable gain settings. Gain settings
determine the amplitude with which a given environmental sound is recorded
as digital data and therefore determine the effective spatial range. Increasing
the gain increases ‘background’ as well as ‘signal’ within targeted acoustic
research. Setting the gain too high can cause clipping, so tests should always
be carried out to determine optimal tradeoffs according to the monitoring
aims. Analogue gain increases the amplitude of the sound signal before it is
converted into digital data. Note that gain is distinguished from volume which
describes the dB scaling of output - for example when listening to a playback
of a recording.
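The trade-off between gain and clipping can be explored with a simple calculation before field tests. The sketch below assumes that gain is expressed in decibels and that signal peaks are expressed as a fraction of digital full scale (1.0 being the largest value the recorder can store). Both are assumptions made for illustration: many ARUs expose gain only as named levels such as low, medium and high, and the function names here are hypothetical.

```python
def gain_db_to_amplitude_factor(gain_db: float) -> float:
    """Linear amplitude multiplier corresponding to a gain in decibels."""
    return 10.0 ** (gain_db / 20.0)


def peak_after_gain(peak_fraction_of_full_scale: float, gain_db: float) -> float:
    """Peak level, as a fraction of full scale, after applying the gain."""
    return peak_fraction_of_full_scale * gain_db_to_amplitude_factor(gain_db)


def would_clip(peak_fraction_of_full_scale: float, gain_db: float) -> bool:
    """True if the amplified peak would exceed digital full scale."""
    return peak_after_gain(peak_fraction_of_full_scale, gain_db) > 1.0


if __name__ == "__main__":
    # Assumed example: a call peaking at 20% of full scale with no gain.
    for gain in (0, 6, 12, 20):
        print(gain, round(peak_after_gain(0.2, gain), 2), would_clip(0.2, gain))
```

Because the same multiplier applies to background noise, raising the gain in this way makes quiet sounds easier to see without improving the ratio between signal and background.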
GPS - internal GPS units can be useful for two reasons. They allow sound files
to be stamped with an accurate recording location, and they permit accurate
time synchronisation of units across an array, which is necessary for sound
localisation studies. The downside to internal GPS devices is that they have
higher battery use, so if precise time synchronisation is not required, a device
without GPS may be preferable.
Thermometer - a few devices offer inbuilt thermometers to record the
temperature during recording. As temperature can affect sound transmission
through the air, this can be useful for detailed studies wishing to estimate
detection distances, localise sound, and other analyses requiring the speed or
spatial distribution of sound signals.
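To give a sense of scale for the temperature effect mentioned under Thermometer, the sketch below uses a standard approximation for the speed of sound in dry air and converts a difference in arrival time between two recorders into a difference in path length. It ignores humidity and wind, which also matter in the field, and the function names and example delay are illustrative assumptions.

```python
import math


def speed_of_sound_m_s(temperature_c: float) -> float:
    """Approximate speed of sound in dry air at the given temperature."""
    return 331.3 * math.sqrt(1.0 + temperature_c / 273.15)


def path_difference_m(arrival_delay_s: float, temperature_c: float) -> float:
    """Path-length difference implied by an arrival-time delay."""
    return speed_of_sound_m_s(temperature_c) * arrival_delay_s


if __name__ == "__main__":
    # Assumed example: a 50 ms delay between two synchronised recorders.
    for temp in (0.0, 10.0, 20.0):
        print(temp, round(speed_of_sound_m_s(temp), 1),
              round(path_difference_m(0.050, temp), 2))
```

Over the range of temperatures in this example the speed changes by a few percent, which is small for presence and absence surveys but can be relevant for localisation.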
2.1.2. Microphones
Directional characteristics - microphones can be either directional or omni-
directional. Most microphones supplied with ARUs will be omni-directional,
meaning they sample a three-dimensional sphere around the sensor with
equal sensitivity. Some ARUs may allow attachment of external directional
microphones, which have a cone shaped pick up pattern spreading out in front
of the microphone. These produce more ‘focused’ recordings which may be
useful for studies in which the spatial location of targets is precisely known, or
potentially for some types of localisation analysis.
Microphone sensitivity - when exposed to the same sound source,
different microphone models may produce different output levels, as some
microphones are more sensitive than others. Microphone sensitivity is the
measure of the microphone’s ability to convert sound pressure into an electric
voltage. The higher the sensitivity, the less pre-amplification is required to
bring the sound to a usable level. The lower the sensitivity, the greater the
pre-amplification required. Lower sensitivity does not necessarily mean a poor
microphone. Microphone sensitivity differs as microphones are designed
for capturing specic sounds. Low sensitivity microphones are designed
for capturing loud sounds and generally feature in the music industry for
recording sounds such as guitar ampliers or drum kits. These microphones
are not recommended for quieter sounds, as in order to capture quieter
sounds more gain will be required, resulting in a poorer signal-to-noise ratio.
This also works in reverse, as highly sensitive microphones are designed to
capture quieter sounds. However, if the sound to be captured is too loud, then
the recording will clip, leading to a distorted recording.
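Manufacturers usually quote sensitivity either in millivolts per pascal or in decibels relative to 1 V/Pa, so it can help to convert between the two when comparing microphones. The sketch below does this and estimates the output voltage for a sound of a given sound pressure level, using the standard reference pressure of 20 micropascals. The function names and the two example sensitivities are illustrative assumptions and do not describe any particular microphone.

```python
import math

REFERENCE_PRESSURE_PA = 20e-6  # 0 dB SPL in air


def sensitivity_mv_per_pa_to_db(sensitivity_mv_per_pa: float) -> float:
    """Convert sensitivity in mV/Pa to dB relative to 1 V/Pa."""
    return 20.0 * math.log10(sensitivity_mv_per_pa / 1000.0)


def spl_db_to_pascals(spl_db: float) -> float:
    """Convert a sound pressure level in dB SPL to pressure in pascals."""
    return REFERENCE_PRESSURE_PA * 10.0 ** (spl_db / 20.0)


def output_millivolts(sensitivity_mv_per_pa: float, spl_db: float) -> float:
    """Microphone output voltage (mV) for a sound at the given SPL."""
    return sensitivity_mv_per_pa * spl_db_to_pascals(spl_db)


if __name__ == "__main__":
    # Assumed example: two microphones exposed to the same 60 dB SPL sound.
    for sens in (2.0, 10.0):
        print(sens, round(sensitivity_mv_per_pa_to_db(sens), 1),
              round(output_millivolts(sens, 60.0), 3))
```

In this example the less sensitive microphone would need about 14 dB more pre-amplification to reach the same output level, which is the extra gain the paragraph above refers to.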
Dynamic Range - The dynamic range of a microphone is the sound pressure
level (SPL) difference between the highest and the lowest amplitude levels
that the microphone and its circuitry can handle. Generally this is measured
as the loudest SPL a microphone can capture without distorting (see Max
input SPL below) and the quietest signal above the self noise (hiss) of the
microphone and preamplifier. Once transduced to a digital representation, the
dynamic range of amplitude is determined by the bit depth.
Signal to Noise Ratio - When conducting species- or taxon-specific
monitoring, wildlife sounds are rarely recorded in isolation. A recording will
contain both the sound that you want to record (signal) and the sounds you
do not want to record (noise). The relationship between these two elements
is the Signal to Noise Ratio (SNR). The larger the difference between the signal
and the noise, the clearer the recorded target sound will be, and the greater
the potential detection distance. Generally there are three types of noise
considered when evaluating SNR. The first is anthropophonic noise generated
by humans. This can be anything from the low rumble of vehicles such as
aeroplanes or cars to the chatter of humans or industrial sounds. Second is the
noise generated from the natural world (biophony and geophony) that masks
the signal we wish to record, such as wind, rain, the movement of trees
or even non-target animal sounds drowning out the target sounds. Finally,
there is the self-noise, or Equivalent Input Noise (EIN), which is generated by
ARUs themselves and is heard as a faint hiss, even when there is no mic
input. This is a result of the movement of electrons in the device circuitry being
picked up by the recording process along with the signals coming through
and from the microphones. Generally, older or less expensive recorders will
produce a higher level of self-noise. A recorder’s SNR level, as published by
the manufacturer, refers to the self-noise generated from the recorder and
microphone.
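One common way to put a number on the SNR of a recording is to compare the root-mean-square (RMS) amplitude of a segment containing the target sound with that of a nearby segment containing only background, and to express the ratio in decibels. The sketch below assumes the two segments have already been selected by the user and are supplied as plain sequences of sample values; the function names are illustrative, and this is one of several possible definitions rather than a prescribed method.

```python
import math
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude of a sequence of samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def snr_db(signal_segment: Sequence[float],
           noise_segment: Sequence[float]) -> float:
    """SNR in dB from a signal segment and a background-only segment."""
    noise_rms = rms(noise_segment)
    if noise_rms == 0.0:
        raise ValueError("noise segment is silent; SNR is undefined")
    return 20.0 * math.log10(rms(signal_segment) / noise_rms)


if __name__ == "__main__":
    # Toy example: the call segment is ten times the background amplitude.
    call = [1.0, -1.0, 1.0, -1.0]
    background = [0.1, -0.1, 0.1, -0.1]
    print(round(snr_db(call, background), 1))
```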
Figure 2.3. An illustration of varying signal to noise ratio across different devices.
These spectrograms, created in Audacity [50], show the same Redwing Turdus iliacus
call on multiple devices. Devices consist of: a cheap USB microphone connected to a
desktop PC, an AudioMoth in a plastic bag, an AudioMoth [31] in a homemade waterproof
case, a lapel microphone (EM172) with a digital audio recorder (Zoom H4n Pro;
record level set to 80/100) and a Dodotronic parabolic microphone with a Sound
Devices MixPre 3 digital audio recorder. For full details of the experiment comparing
equipment for monitoring nocturnally migrating birds, see
https://nocmig.com/2020/02/26/equipment-comparison-february-2020/. ©Simon Gillings
Frequency Response - dened as the range of sound or frequencies which a
microphone can reproduce and how these vary within that range. In recording
equipment, the frequency response describes the ability of a product to
capture sound at a range of dierent frequencies - a atter response produces
a more faithful representation of the original signal. There is no perfect
microphone for all situations, as microphones are developed to perform
specic tasks. For example a microphone for recording ultrasonic sounds may
not be very good at recording acoustic signals between 20Hz and 20 kHz, and
conversely an acoustic microphone is unlikely to be able to record ultrasonic
sounds eectively, however atter responses are preferable in scientic work.
The frequency response of a microphone is usually displayed graphically,
giving a relative indication of the microphone response at a set range of
frequencies. Figure 2.2 gives an example of a typical frequency response chart.
Figure 2.4. A frequency response chart, showing a microphone with a relatively flat
response across the human-audible range, but a sharp decline in very low and higher
frequencies.
Max Input Sound Pressure Level (SPL) - The maximum sound pressure level
a microphone can take without distorting. Distortion or clipping occurs when
the signal exceeds the maximum input SPL of the microphone. Figure 2.5 shows two
examples of the same recording: the waveform at the top of the first image shows that
the signal is well within the range of the microphone, whereas in the second
example the waveform goes beyond the maximum SPL, and the red elements of
the spectrogram show the frequencies at which the sound is clipping.
Figure 2.5. Waveform (A1+B1) and spectrogram (A2+B2) plots for recordings showing sound
pressure levels within (A) and exceeding (B) the capacity of the recording unit. Regular
occurrence of sound pressure levels exceeding the recording unit capacity may indicate
that the gain has been set too high. Spectrograms produced in Kaleidoscope Pro [51].
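Clipping of the kind shown in the figure can also be screened for numerically, by counting how many samples in a recording sit at the limits of the digital range. The sketch below assumes signed integer PCM samples (as stored in a typical 16-bit .wav file) that have already been read into a list; the function name and the example values are illustrative, and what fraction counts as 'regular' clipping is left to the user to decide.

```python
from typing import Sequence


def clipped_fraction(samples: Sequence[int], bit_depth: int = 16) -> float:
    """Fraction of samples at the maximum or minimum storable value."""
    if not samples:
        return 0.0
    max_value = 2 ** (bit_depth - 1) - 1   # e.g. 32767 for 16-bit audio
    min_value = -(2 ** (bit_depth - 1))    # e.g. -32768 for 16-bit audio
    clipped = sum(1 for s in samples if s >= max_value or s <= min_value)
    return clipped / len(samples)


if __name__ == "__main__":
    # Toy example: two of these four samples sit at the digital limits.
    print(clipped_fraction([0, 100, 32767, -32768]))
```

Running a check like this on test recordings made at each candidate gain setting is one way to carry out the tests recommended under Gain settings.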
2.2. Cost trade-os with recording units
2.2.1. Budget options
Budget options are likely to be sufficient for those with simple recording
requirements - a single channel able to record the human-audible frequency range,
and no need for a built-in GPS. AudioMoths [31] have revolutionised PAM since
becoming available in 2019, with the low price-point making them widely
accessible. AudioMoths have been used in acoustic studies all over the world.
Although the microphone quality is somewhat lower than more expensive
models, it has proven good enough for studies of many species in the
human-audible frequency range. Any loss in detection distance is made up for by the
fact that, in most cases, it is still cheaper to buy a second unit than it is to buy
an ARU with a better quality microphone. The recent release of AudioMoth
version 1.2 includes an affordable weather-proof case and the potential to
solder a 3.5mm jack for adding an external microphone, allowing the use of a
range of cheaply available, good quality external microphones - as well as the
addition of a GPS unit if desired.
Unfortunately, there is currently one major challenge in using AudioMoths,
which is their availability. As a non-commercial organisation, Open Acoustic
Devices, the producers of AudioMoth, use the GroupGets [52] platform to collect
bulk orders of the devices before sending them to be manufactured. The
timing of these group purchases is unpredictable; they are often announced at short
notice and typically sell out fast (within hours). AudioMoths are also available for direct
sale through Labmaker [53] at a higher price, but production has been highly
impacted by the global chip shortage and at the time of writing none are
available until at least 2023 - although this is also increasingly the case for all
suppliers of ARUs. There is also limited formal customer support, although
there are useful forums on the Open Acoustic Devices [54] website. Additionally,
AudioMoths are not supplied with a robust weather-proof case and users must
separately purchase or make one. Those wishing to leave recorders out in the
field for extended periods in bad weather may look to more expensive units
with more robust cases. For those undertaking casual or voluntary projects
who are prepared to wait for initial purchases and replacement devices,
AudioMoths may be an ideal solution, but commercial projects may prefer a
more expensive unit that can be more readily obtained. That option may well
be the Wildlife Acoustics SongMeter (SM) Micro [55], which has better recording
quality than an AudioMoth but is similar in many other respects.
The nal option for those with a limited budget but high specication
requirements is to self-build an ARU following open-source designs, such as
the SOLO56, ARUPI57, AURITA58, BUGG59, or Sonitor60 devices. These devices are
all variations based on adding components to the cheap Raspberry Pi61 boards
- except Sonitor which is primarily concerned with the cheap construction
of taxon-specic microphones that can be attached to one of the previous
devices. The resultant products can have external microphones, larger or
adjustable power sources from AA batteries, car batteries to solar panels,
waterproong, and optional network connectivity. Performance of these
devices is as good as some of the top-end devices listed in Table 1.1. However,
obtaining the individual components can be time-consuming (from our own
experience some of the recommended components are no longer available,
and understanding what replacements are suitable requires at least some
knowledge of electrical engineering), and in some cases requires soldering.
That means that the process is altogether more difficult than the simple click
and purchase of more commercial products, and fully waterproofing the
devices can be challenging. Nevertheless, building a device yourself allows a
degree of flexibility and customisation not available in off-the-shelf products,
and may be the best, or only, option for more complex acoustic projects.
2.2.2. Mid-range options
The SM Mini55 oers an upgrade on the SM Micro, with better recording
quality. The microphone, like all of the other units in the mid-range and top-
end categories, is removable, meaning that it can be changed as performance
begins to decline after long exposure to the elements, without needing to
replace the entire unit. The main appeal of the SM Mini however, is the ability
to add the optional battery lid (£165) and use six 18650 Li-ion batteries giving
it a battery life of 1100 hours (over 6 weeks) - meaning it is a very good option
for those trying to minimise human time in recorder deployment. The SM Mini
and SM Micro can also be programmed and checked through a bluetooth
connection app.
The Titley Chorus62 is a relatively new product, and as far as the authors
are aware has not yet been used in any published academic research, or
publicly available ecological studies in the UK. However, the manufacturer’s
specifications and price point mean that for many, this unit may be the ideal
trade-off between top-quality recording, robustness and price.
2.2.3. Top-end options
The Wildlife Acoustics SM455 and its Australian counterpart, the Frontier Labs
BAR-LT63 are considered the market-leaders amongst ARUs, with a price tag to
match. Both can be deployed in the field for extended periods of time, with
huge storage capacity, robust casing, long battery life and optional capacity
to add solar panels to keep them running for even longer. Both units have
benefits and disadvantages for particular types of study - the BAR-LT has a
built-in GPS for localisation, whilst the SM4s are slightly easier to calibrate for
long-term recording - but either is likely to be suitable for any of the analyses
discussed in this guide.
2.2.4. Localisation-enabled options
Localisation with ARUs is largely still in its infancy, with researchers and
hobbyists generally making custom setups or modifying hardware/software
of existing omnidirectional devices. Popular advances include CARACAL64, Dev-
Audio/VoxNet65, WASN66 and MAARU67. At the time of writing, no dedicated
commercial localising devices/platforms are yet available but may soon
become so.
2.3. Maintenance and calibration
Environmental conditions have substantial impacts on the durability and reliability of
acoustic sampling units. As recorders are repeatedly exposed to adverse environmental
conditions, they will degrade in performance - especially exposed parts of the
equipment such as microphones and their windshields. Protection from temperature
extremes, rain or humidity may therefore be required for both microphone and
recording unit68 - this may consist of the standard case normally provided as part of
the recording system, potentially with other modifications to protect the unit further
from rainfall, wind and animals. Procedures for the regular inspection, maintenance
and calibration of recording systems are also needed to support field studies69,70,71.
Microphone management, calibration and checking are very important before and after
field deployments, as degradation in microphone quality over time can significantly
affect results. To aid this, recorders and microphones should be individually numbered,
checked and calibrated on a regular basis (at least once per year), using a piston-phone,
standardised sound emitters, sweep tests, or other evaluation set-ups to confirm that the
sensitivity of the recording system has not been adversely affected (useful maintenance
resources are available from the Alberta Bioacoustic Unit72). Where smaller and cheaper
ARUs cannot be directly calibrated, it is important to check microphones are still working
within acceptable limits.
Table 2.1. Table of common ARU choices available in the UK. Adapted from Darras et al., 201919.
I. https://www.openacousticdevices.info/audiomoth
II. Price taken from the most recent round of sales in GroupGets https://groupgets.com/manufacturers/open-acoustic-devices/products/audiomoth and converted to GBP
III. https://www.frontierlabs.com.au/bar-lt
IV. Price obtained from NHBS on 25/07/2022: https://www.nhbs.com/frontier-labs-bar-lt-
bioacoustic-recorder
V. titley-scientic.com/uk/chorus.html
VI. https://www.wildlifeacoustics.com/products/song-meter-micro
VII. https://www.wildlifeacoustics.com/products/song-meter-mini
VIII. https://www.wildlifeacoustics.com/products/song-meter-sm4
IX. https://www.instructables.com/ARUPi-A-Low-Cost-Automated-Recording-Unit-for-Soun/
X. https://www.tandfonline.com/doi/suppl/10.1080/09524622.2018.1463293?scroll=top
XI. https://www.bugg.xyz/
XII. https://solo-system.github.io/home.html
XIII. Darras, K., Kolbrek, B., Knorr, A., Meyer, V., Zippert, M., & Wenzel, A. (2021). Assembling
cheap, high-performance microphones for recording terrestrial wildlife: the Sonitor system.
F1000Research, 7, 1984. doi:10.12688/f1000research.17511.3
| Model | Audiomoth 1.2 with case (I) | BAR-LT (III) | Chorus (V) | SM Micro (VI) | SM Mini (VII) | SM4 (VIII) | ARUPI, AURITA, BUGG, Solo, Sonitor (IX, X, XI, XII, XIII) |
|---|---|---|---|---|---|---|---|
| Manufacturer | Open Acoustic Devices | Frontier Labs | Titley | Wildlife Acoustics | Wildlife Acoustics | Wildlife Acoustics | Raspberry-Pi based recorders |
| Channels | 1 | 1 or 2 | 2 | 1 | 1 | 2 | 1 or 2 |
| Signal-to-noise ratio at 1kHz | 63 | 80 | 80 | 73 | 78 | 80 | 80 |
| Price in GBP (on 25/07/2022) | 95 (II) | 879 (IV) | 474 | 239 (IV) | 489 (IV) | 845 (IV) | Variable, approx. 100-300 |
| Storage | 1 micro-SD card | 4 SD cards | 1 SD card | 1 micro-SD card | 1 SD card | 2 SD cards | 1 micro-SD card |
| Power | 3 AA cells | 6 18650 cells | 4 AA cells | 3 AA cells | 18650 Li-ion cell or 4 AA batteries | 4 D cells | Power bank/car battery/solar |
| Solar panel | no | optional | no | no | no | optional | optional |
| Continuous recording time | 187 | 600 | 300 | 200 | 1200 | 510 | variable |
| GPS | no | integrated | integrated | no | no | optional | no |
| Frequency range | yes | yes | yes | no | no | no | yes - some |
2.4. Software for programming ARUs
Most of the devices listed above come with their own software for programming,
synchronising, and scheduling devices. In the case of Audiomoth, Wildlife Acoustics,
and Frontier Labs (which the authors have experience of), these are simple, reasonably
intuitive programs that allow for a great deal of flexibility and make the process of
preparing recorders for deployment relatively straightforward. However, this is not
generally the case for the self-built devices (SOLO/ARUPI/AURITA), and although some
rudimentary software may be available, the flexibility of recording protocols is inevitably
lower, and coding skills are often required.
2.5. Future-proong
Whilst the fast-paced development of acoustic hardware is a great benet to acoustic
monitoring, it also presents particular novel challenges for long-term monitoring.
Ensuring that any acoustic dierences recorded in the same study ten years apart are
due to real-world ecological change and not dierences in the performance of the
recorder used is of paramount importance. However,it is an issue that has been largely
neglected in the academic literature. The simplest solution is to ensure that the same
devices are used throughout any monitoring project, with regular calibration and
replacement of deteriorating parts.
However, this sort of continuity may not be possible for several reasons. The first is that
manufacturers are unlikely to maintain production of the same models with the same
specifications over long enough periods to allow like-for-like replacement - for instance
two highly popular devices, Audiomoth v1.031 and Wildlife Acoustics SM 255, have been
discontinued in recent years, and are no longer available for purchase new. Secondly,
the capacity of a team to visit the field may change, meaning that they may require
devices with different characteristics that can be left to record for longer. Similarly, when
devices such as the BUGG59, which are able to record continuously using solar powered
chargers and transmit data in real-time using a mobile phone SIM card, are available for
commercial purchase, the power and memory benefits of such advances may outweigh
the negatives of lost continuity.
To these challenges we cannot offer a certain solution, but several prudent measures
could be taken in anticipation of better solutions emerging in the future. Firstly, we
recommend playing broadband white noise at a known amplitude and distance from
the recorder, from an unobscured point. White noise sound files are easily sourced
from a range of locations on the internet, or can be easily generated in Audacity50,73
(see Chapter 4.2 for more on analysis software). Climatic conditions (temperature and
humidity in particular) should also be recorded. This should be done when the ARU is
first deployed, and at regular intervals thereafter. Additionally, when a device is being
replaced, the new ARU and old ARU should be deployed simultaneously for a period of
time. This will allow some reference data for comparison and may allow some degree of
calibration between the devices.
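As an alternative to downloading a file or using Audacity, a reference white noise file can be generated with a short script. The following is a minimal sketch using only the Python standard library; the function name, the 16-bit mono format and the half-scale peak are illustrative assumptions rather than part of these guidelines. It only fixes the digital level of the file: the acoustic level reaching the recorder still depends on the speaker and distance, and must be measured in the field.

```python
import io
import random
import struct
import wave


def write_white_noise(target, seconds=10.0, sample_rate=48_000,
                      peak_fraction=0.5, seed=0):
    """Write mono 16-bit white noise to a file path or file-like object.

    Samples are drawn independently from a uniform distribution, which
    gives a flat spectrum on average. peak_fraction scales the digital
    peak relative to full scale (an assumed, adjustable value).
    """
    rng = random.Random(seed)            # fixed seed: the same file every time
    peak = int(32767 * peak_fraction)
    n_frames = int(seconds * sample_rate)
    frames = struct.pack(
        f"<{n_frames}h",
        *(rng.randint(-peak, peak) for _ in range(n_frames)),
    )
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)              # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(frames)
    return n_frames


# Example: build a one-second test signal in memory.
buffer = io.BytesIO()
frames_written = write_white_noise(buffer, seconds=1.0)
print(frames_written, "frames written")
```

Passing a file name such as "white_noise_48k.wav" instead of the in-memory buffer writes the file to disk. Using a fixed seed means the same reference signal can be regenerated for every calibration visit.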
Chapter 3: Study Protocol
As with any ecological study, survey design is vital for drawing robust inferences from the
data collected. When considering survey design, there are likely to be complex trade-offs to
be considered between landscape, size of the study site, budgetary limitations and human
effort available - experienced ecologists with familiarity of the study area are likely to be best
placed to make these decisions. This chapter discusses some of the most important aspects for
consideration when designing an acoustic monitoring study.
Placement of ARUs and timing of deployment requires careful planning - the key objective
here should be to obtain acoustic data that are representative of the ecological features being
investigated.
3.1 Temporal considerations
Temporal programming within the ARU deployment period can be usefully considered
at different temporal scales: deployment schedule, recording periods, and sampling
schedule. Deployment schedule refers to the times when an ARU will be placed in the
field during weeks, months, seasons. Recording periods describe the time that recording
takes place within a 24 hour cycle - either continuously or targeted, for instance during
the dawn chorus. The sampling schedule describes the pattern of recording within a
given recording period. This could range from continuous recording to short recordings
of just a few seconds every hour and is determined by the recording length and
inter-recording intervals.
3.1.1. Deployment Schedule
When considering optimal deployment schedules for long-term monitoring,
it is necessary to consider both the temporal and spatial aspects of the survey
design together in order to ensure the study objectives can be met. In general,
there are two approaches to deployment schedules that can be taken when
assuming equal survey eort. For studies that prioritise tracking temporal
patterns, using a continuous or near continuous deployment at the expense of
a higher number of recording devices is likely to be preferable.
Imagine here a small area of 20 hectares allocated for a rewilding project and
where assessing habitat change over time is the priority. In this case, four
recorders could be placed, either at random locations, or at selected important
sites such as key habitats, and left to record throughout the year. In contrast,
for studies more concerned with the spatial aspects of target species presence,
then using short but intensive study periods with a greater number of devices
distributed spatially is likely to be preferable. Imagine a large farm, concerned
about the eect of land management changes on the site’s bird population.
An array of 15 recorders could be placed in a regular grid across the site for a
month-long period during the breeding season, and again for a similar period
in winter, with annual repeats in order to assess community turnover.
Figure 3.1. Illustration of possible temporal scheduling of PAM surveys.
Deployment of ARUs (top) can be continuous throughout the year, or targeted at
certain signicant periods. Recording periods and sampling schedules (bottom) can
be programmed to only collect data when desired - illustrated here is a continuous
recording period across the diel cycle with a sampling schedule of two minutes every
ten (orange), and a non-continuous recording period targeting dawn and dusk but
sampling continuously during these periods (grey).
3.1.2. Recording period
Most ARUs come with the capacity to set recording periods. Non-continuous
recording periods and sampling schedules are useful when resources are
limited; they enable study designs that can be robust enough to meet the
study aims, whilst reducing the amount of data collected and battery power
used, therefore increasing how long ARUs can be left in the field and reducing
overall effort. A key consideration here is that it is impossible to analyse
data that doesn’t exist, but it is easy to discard or disregard data if too much
is collected. In most cases, it will be desirable to conduct a pilot study to
establish exactly how much data collection is required to make the desired
analysis feasible. It may often be sensible to have some data redundancy and
collect more than necessary, but this must be balanced with the carbon cost
of data storage as the big data of remote-sensing scales globally. Guidance in
Chapters 5 and 6 can be used to assess survey completeness.
Choosing recording periods within a deployment is relatively straightforward.
For general soundscape studies or studies without strong hypotheses about
key periods for target taxa vocalisation, they should cover the entire diel
cycle. Other more targeted options where biophonic activity is of interest
may be to only sample at day or night, at dawn, or avoiding periods of high
anthropogenic activity12. Alternatively, it may be desirable to only record
during periods of expected peak activity in studies with strong hypotheses
about the timing of the vocal activity of focal species (e.g. Natterjack Toad
Epidalea calamita chorusing at dusk).
3.1.3. Sampling schedule
The choice of sampling schedule is dependent on the type of study being
conducted, and the goals of the study. For projects aimed at sampling
ecological communities (Chapter 5), studies have shown74,75,76 that using
samples with shorter recording length and smaller inter-recording intervals,
dispersed over long periods, are likely to be more effective in obtaining a good
representation of the community present than longer duration samples with
greater inter-recording intervals over shorter periods. A UK study aiming to
obtain a good representation of the bird community at a single location with
one hour of sampling effort would likely capture a higher proportion of the
species present using sixty samples of 1 minute duration spread across the
entire bird breeding season than using a single hour during one morning75.
However, it is likely that this effect declines with lower species richness, and
is unlikely to have a very strong impact in the UK. The optimal selection will
depend on the sampled community, how often species make identifiable
sounds, and daily behaviour patterns of the species of interest.
For soundscape studies (Chapter 6), the optimal sampling schedule will
depend upon the phenomena of interest. Where diurnal patterns are of
interest and events of interest are not too rare, a common approach is to have a
sampling schedule recording one minute in every ten (e.g. a recording length
of one minute, with a nine minute inter-recording interval and a continuous
recording period), particularly when deployment periods are throughout the
year or across entire seasons. Shorter or more targeted deployment periods
are likely to require a more frequent sampling schedule.
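The relationship between recording length, inter-recording interval and the amount of audio collected can be checked before programming a recorder. The sketch below is illustrative only (the function and variable names are assumptions, not part of any ARU software); it lists the start times produced by a one-minute-in-ten schedule over a continuous 24 hour recording period and totals the audio collected per day.

```python
from datetime import datetime, timedelta


def sampling_schedule(period_start, period_end, record_s=60, interval_s=540):
    """Return the start time of each recording within one recording period.

    record_s is the recording length and interval_s the inter-recording
    interval, both in seconds. A recording is only scheduled if it can
    finish before period_end.
    """
    step = timedelta(seconds=record_s + interval_s)
    length = timedelta(seconds=record_s)
    starts = []
    current = period_start
    while current + length <= period_end:
        starts.append(current)
        current += step
    return starts


# One minute in every ten across a full diel cycle (the date is arbitrary).
day_start = datetime(2022, 7, 25, 0, 0)
starts = sampling_schedule(day_start, day_start + timedelta(days=1))
recorded_hours = len(starts) * 60 / 3600
print(len(starts), "recordings,", recorded_hours, "hours of audio per day")
```

Changing period_start and period_end to a dawn window, or shortening interval_s, shows how quickly the daily data volume grows with a more frequent schedule.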
3.2 Spatial considerations
The exact number and placement of ARUs should be determined by an ecologist
following the same principles of representativeness and sample size that would be
applied to any ecological study. Distance between recorders will be determined by
the objective of the study and the target species, but for small passerine birds, spacing
of approximately 250m should be enough in most habitats to ensure independence
of recordings if desired, or under 50m if overlap in recordings is necessary (e.g. for
localisation). Note that recording distance will also be determined by input gain, see
section 1.
3.2.1. Detection distance
Understanding the ‘detection distance’ being monitored by a single ARU is one
of the most important considerations when designing a study. However, it is
also one of the most dicult to calculate. The amplitude of sounds at source
hugely vary across potential ecological targets (e.g. the sound of a barking
Roe Deer Capreolus capreolus or duetting Tawny Owls Strix aluco will carry
much further than a singing Goldcrest Regulus regulus). Additionally, there are
a number of factors that impact sound attenuation. Sound attenuation is the
energy loss of a sound wave as it travels through air, soil, water or other media
- once enough energy has been lost, a sound wave becomes indistinguishable
from background noise. These factors include environmental parameters that
vary throughout the day, such as background noise level, temperature, air
pressure and humidity - meaning that detection distances at a single location
will vary over time. Attenuation is also impacted by the physical surroundings,
such as vegetation type and density, and local topography. Sound attenuation
occurs at dierent rates at dierent frequencies. In general, lower pitch sounds
have less sound attenuation than higher pitch sounds, but this can vary, as
some frequencies carry better through vegetation than others. There is an
increasing body of academic research on measuring detection distances and
ecological sound attenuation77,78, but none of the methods so far proposed
are straightforward. Some problems with estimation appear intractable,
such as animals moving or facing in different directions whilst vocalising, or
intraspecific variation in vocalisation amplitude, and these rely on assumptions
that using an average is reasonable (e.g. that a deer barks as often facing the
microphone as it does when facing away).
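To give a feel for why detection distance is so sensitive to source amplitude and background noise, the sketch below uses a deliberately simplified textbook model: spherical spreading (a loss of about 6 dB per doubling of distance) plus a single linear excess-attenuation term standing in for air and vegetation losses. This is not one of the methods from the cited literature, and all numerical values are hypothetical; real attenuation depends on frequency and changes with weather and habitat.

```python
import math


def received_level_db(source_db, distance_m, ref_distance_m=1.0,
                      excess_db_per_m=0.0):
    """Level at distance_m for a source measured as source_db at ref_distance_m.

    Combines spherical spreading (20 * log10 of the distance ratio) with a
    linear excess-attenuation term (an assumed simplification).
    """
    spreading = 20.0 * math.log10(distance_m / ref_distance_m)
    excess = excess_db_per_m * (distance_m - ref_distance_m)
    return source_db - spreading - excess


def detection_distance_m(source_db, noise_db, excess_db_per_m=0.0,
                         max_m=5000):
    """Largest whole-metre distance at which the call still exceeds noise_db."""
    distance = 0
    for d in range(1, max_m + 1):
        if received_level_db(source_db, d, 1.0, excess_db_per_m) <= noise_db:
            break
        distance = d
    return distance


# Hypothetical values: a 90 dB call (measured at 1 m), 40 dB background noise.
open_ground = detection_distance_m(90, 40)
dense_cover = detection_distance_m(90, 40, excess_db_per_m=0.05)
print(open_ground, dense_cover)
```

Even in this idealised model, raising the background noise by 6 dB roughly halves the spreading-limited distance, which illustrates why detection distances at a single location vary through the day.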
In consequence, most ecoacoustic studies do not estimate detection distances
precisely12. Instead many studies use broad estimates obtained by playing
sounds at increasing distances and at regular time intervals, or relying on rules-
of-thumb. At the UKAN+ Long-term Acoustic Monitoring of UK Biodiversity
Symposium, one such approximation that had widespread agreement was
that if a human observer could hear a call, then a good quality ARU was likely
to be able to record this as well.
It is worth noting that not knowing precise sound attenuation rates limits
the types of analysis possible, as measures of estimating abundance often
require a strong understanding of the location/distance of calling individuals.
This means that comparisons of community composition and soundscape
characterisation tend to be favoured in ecoacoustic studies, although even
here limitations in understanding detection distances should be carefully
considered when making comparisons between sites.
3.2.2. ARU Positioning
Having identied ARU sites, and the deployment schedule, the microsite
location of recorders can also have an impact on the data collected. Although
ARU microphones are mostly omnidirectional, sound can be blocked by solid
objects. For instance, placement of an ARU against a very broad tree trunk
will inhibit collection of sounds from directly behind the tree; too close to the
ground will introduce reections. If the study is targeted towards a particular
area or species, care should be taken to ensure there is a clear line of sight
between the ARU and the position the target sounds are most likely to occur.
For general recording of the environment, locating the ARU with as open an
aspect as possible will be benecial. Most studies place recorders 1-2m o
the ground, both to avoid reections and interference by curious ungulates.
This is likely to be suitable for most UK-based studies, but placing them higher
may be benecial if a focus on canopy dwelling species is desirable, or there is
concern that equipment may be vandalised.
In many cases, ideal sites for the ARUs may not be possible. Careful
consideration should be given to the risk of theft, potential damage from
passing bovines, and to the privacy of any passers by who are using the area
(see Chapter 3.3 for more on privacy concerns). It may well be necessary to
make considerable concessions in concealing the location of ARUs to avoid
theft - imperfect data is better than returning after several months to an
absent ARU! In the experience of the authors, locating an ARU tucked in on the
edge of a bush does little to limit the collection of soundscape data, assuming
that rustling branches can be avoided. Additionally, there is some research79
that suggests the use of personal and polite labels left on the recorder, as
opposed to neutral or aggressive messaging, is most effective in deterring
thefts of unattended scientific equipment, although this is likely to vary greatly
between cultures. In addition, warning signs that recording is taking
place, possibly at some distance from the actual devices, may go some way to
alleviating privacy concerns, especially on private sites. Landowner permission
should always be obtained before deployment of ARUs for ecological
monitoring.
3.3 Audio settings
A major decision in relation to programming audio settings on ARUs is the sampling
frequency. As mentioned previously, the sampling rate needs to be at least double the
highest frequency at which you wish to record data (or triple if the intention is to use the same
data to survey ultrasonic acoustic diversity). For general studies of human-audible
sound, we recommend using a sampling rate of 48 kHz. The reasoning behind this is
purely pragmatic: it covers the entire human audible range, and is a common sampling
frequency available on the majority of sound recorders.
The le size of audio data increases linearly with sampling rate, meaning that in some
cases it may be preferable to use a lower sampling rate. This will be primarily in studies
targeted at species with low frequency calls. For instance, a study on Common Cuckoo
Cuculus canorus which sing at ~1 kHz and below, could use a sampling frequency of
4 kHz. This would give audio data for everything below 2 kHz - capturing all cuckoo
song, whilst also requiring just 8.3% of the storage capacity of a 48 kHz sampling rate.
However, this would inevitably constrain future questions that might be investigated
with the same set of recordings, and low sample rate can result in poor data resolution.
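The arithmetic behind the cuckoo example can be reproduced in a few lines. The sketch below assumes uncompressed 16-bit mono audio and ignores the small file header; the function names are illustrative.

```python
def wav_bytes_per_hour(sample_rate_hz, bit_depth=16, channels=1):
    """Approximate size of one hour of uncompressed PCM audio, in bytes."""
    return sample_rate_hz * (bit_depth // 8) * channels * 3600


def max_frequency_hz(sample_rate_hz):
    """Highest frequency that can be represented at a given sampling rate."""
    return sample_rate_hz / 2


full = wav_bytes_per_hour(48_000)
reduced = wav_bytes_per_hour(4_000)
print(f"48 kHz: {full / 1e6:.1f} MB per hour, up to {max_frequency_hz(48_000):.0f} Hz")
print(f" 4 kHz: {reduced / 1e6:.1f} MB per hour, up to {max_frequency_hz(4_000):.0f} Hz")
print(f"4 kHz needs {100 * reduced / full:.1f}% of the storage of 48 kHz")
```

Because file size scales linearly with sampling rate, bit-depth and number of channels, the same function can be used to compare any combination of settings offered by a recorder.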
The other options likely requiring input when scheduling an ARU are the file type,
bit-depth and gain. One study has shown that compression of .wav audio to MP3 had
a surprisingly small impact on the calculation of some acoustic index values, while
others were more severely affected80. Other studies have found similarly mixed effects
in targeted analysis. Nevertheless, this form of compression does entail some loss of
original data. Most users tend to record using uncompressed files (.wav) or lossless
compression (e.g. .flac), which avoids any risk of losing information. Furthermore,
it is increasingly standard for long-term audio storage to be in .wav format, so we
recommend that initial recordings are made in this format and converted later if
necessary.
Bit-depth determines the number of steps in the amplitude scale of a recording, with
increasing bit-depth representing higher resolution in the amplitude, and hence
providing more discrimination between loud and quiet sounds. Bit-depth also determines
the dynamic range of capture and will impact the amount of information collected;
at higher bit-depths, incorrect gain settings are less likely to impact data collection. Here,
we strongly recommend a bit-depth of 16 or higher, as a bit-depth of 8 tends to be of
relatively poor quality. There is ongoing debate about whether the difference in quality
between 16 and 24 bit recordings is discernible to the human ear using most audio
equipment, and consequently a bit-depth of 16 is a common choice.
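The link between bit-depth, amplitude steps and dynamic range can be made concrete with the standard rule of thumb that each additional bit adds about 6 dB of theoretical dynamic range. The sketch below computes this for the three bit-depths mentioned above; it gives the idealised figure for linear PCM and does not account for microphone or pre-amplifier noise, which usually limit real recorders well before the bit-depth does.

```python
import math


def quantisation_steps(bit_depth):
    """Number of discrete amplitude values available at a given bit-depth."""
    return 2 ** bit_depth


def dynamic_range_db(bit_depth):
    """Theoretical ratio between full scale and one quantisation step, in dB."""
    return 20 * math.log10(quantisation_steps(bit_depth))


for bits in (8, 16, 24):
    print(f"{bits}-bit: {quantisation_steps(bits)} steps, "
          f"about {dynamic_range_db(bits):.1f} dB")
```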
Finally, gain is the amount of amplification the recorder applies to the incoming audio
signal before recording it - the recording-side counterpart of the volume control on a
television. In most cases a medium gain setting of ~+20 dB will likely be most appropriate:
it will help in collecting some quieter sound at a high enough quality to be recognisable,
without resulting in excess clipping. However, if target species are known to be at a great
distance, or are particularly quiet and there is reason to think clipping won’t be an issue,
then using higher gain settings may be appropriate - it is important to test this if at all
possible. Note that gain settings will be hardware specific - this is a good reason for only
using one type of ARU across a survey - but if it is necessary to use more than one type
they will require calibration across devices.
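Clipping occurs when the amplified signal exceeds the largest value the recording format can store, so the waveform is flattened at full scale. One simple way to test a gain setting on pilot recordings is to count how many samples sit at or near full scale. The sketch below does this for 16-bit .wav files using the Python standard library; the function name and the 0.999 threshold are illustrative assumptions, and what counts as an acceptable fraction is a judgement for each project.

```python
import io
import struct
import wave


def clipped_fraction(source, threshold=0.999):
    """Fraction of samples at or near full scale in a 16-bit PCM WAV file.

    source may be a file path or a file-like object. For stereo files the
    fraction is taken across the samples of both channels together.
    """
    with wave.open(source, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        raw = wav.readframes(wav.getnframes())
    samples = struct.unpack(f"<{len(raw) // 2}h", raw)
    if not samples:
        return 0.0
    limit = int(32767 * threshold)
    return sum(1 for s in samples if abs(s) >= limit) / len(samples)


# Tiny in-memory example: two of the four samples are at full scale.
demo = io.BytesIO()
with wave.open(demo, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(48_000)
    wav.writeframes(struct.pack("<4h", 0, 1000, 32767, -32768))
demo.seek(0)
fraction = clipped_fraction(demo)
print(fraction)
```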
3.4 Metadata
Metadata is the information about the recorded data: date, time, location, recording
device, gain settings, etc. Sound files contain a great deal of valuable information for
biodiversity scientists. Without appropriate metadata, however, these files have no
significant purpose. Metadata places audio data within an
informative context in the same way that appropriate labels provide meaningful context
to voucher specimens deposited in a museum collection. As a bare minimum, this
information should provide the location, date, time, details of who made the recording,
the equipment and settings used. Spoken metadata at the start of a recording has the
advantage of being hard to separate from the data itself, but has the disadvantage
of potentially interfering with, or at least complicating, automated analysis pipelines
and can only be done at the start and end of PAM deployments. Spoken metadata is
not a substitute for metadata that enables quick searching by humans and machines.
Searching effectively for a file by date, time and location requires the metadata to be in
text form. Many devices will embed this in the file name and in a generated text file.
There are two options for metadata storage: within the file and in an associated
database. Both have advantages and are not mutually exclusive, so a combination of
both is often the best solution. Many tools allow for encoding metadata within files
(examples); the metadata are stored within the file and persist if the files are accidentally
renamed.
The primary advantage of a metadata database is that complex queries are easily
constructed and executed quickly. Another advantage of a database is that relationships
between files and the results of analyses can be defined. Machine learning algorithms
may find numerous species of birds singing within an audio file at different times. A
properly constructed metadata database can quickly identify periods where a Eurasian
Blackbird Turdus merula is singing from many thousands of audio files. Depositing your
files into an appropriate repository may provide the level of functionality required
(and long-term storage) in exchange for making the files publicly available (either
immediately or after an embargo period).
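The blackbird query described above can be sketched with a small relational database. The example below uses SQLite through the Python standard library, with one table for recordings and one for detections linked by file name. The table layout, column names, file names and detections are all hypothetical and chosen for illustration; they are not Audubon Core terms or the output of any particular classifier.

```python
import sqlite3


def build_example_db():
    """Create an in-memory database with hypothetical recordings and detections."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE recordings (
            file_name      TEXT PRIMARY KEY,
            site           TEXT,
            recorder_id    TEXT,
            start_time     TEXT,     -- ISO 8601, so text sorting is chronological
            sample_rate_hz INTEGER,
            gain_db        REAL
        );
        CREATE TABLE detections (
            file_name TEXT REFERENCES recordings(file_name),
            species   TEXT,
            start_s   REAL,          -- offset from the start of the file
            end_s     REAL
        );
        """
    )
    conn.executemany(
        "INSERT INTO recordings VALUES (?, ?, ?, ?, ?, ?)",
        [
            ("wood_20220501_050000.wav", "wood", "ARU01",
             "2022-05-01T05:00:00", 48000, 20.0),
            ("farm_20220501_050000.wav", "farm", "ARU02",
             "2022-05-01T05:00:00", 48000, 20.0),
        ],
    )
    conn.executemany(
        "INSERT INTO detections VALUES (?, ?, ?, ?)",
        [
            ("wood_20220501_050000.wav", "Turdus merula", 3.0, 9.5),
            ("wood_20220501_050000.wav", "Strix aluco", 20.0, 24.0),
            ("farm_20220501_050000.wav", "Turdus merula", 41.0, 47.0),
        ],
    )
    return conn


def periods_for_species(conn, species):
    """Return (site, file start time, start_s, end_s) for every detection."""
    query = """
        SELECT r.site, r.start_time, d.start_s, d.end_s
        FROM detections AS d
        JOIN recordings AS r ON r.file_name = d.file_name
        WHERE d.species = ?
        ORDER BY r.site, d.start_s
    """
    return conn.execute(query, (species,)).fetchall()


conn = build_example_db()
blackbird_periods = periods_for_species(conn, "Turdus merula")
for row in blackbird_periods:
    print(row)
```

Keeping detections in their own table means new analyses can be added without touching the recording metadata, and the same join works whether there are three files or many thousands.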
Ensuring interoperability with existing and future bioacoustics infrastructures such as
repositories and aggregators should also be considered. Generally, this means using the
most atomic metadata elds that are practical.
Audubon Core81 (the Biodiversity Information Standards (TDWG) standard for audio-
visual data) is yet to be as widely used as its sister standard DarwinCore82 but has
an increasing number of users within the biodiversity community. Over the last two
years, the standard has actively engaged with the bioacoustics community to ensure
the metadata needs of the bioacoustic and ecoacoustic communities are met by the
standard. Additionally, the recent “RegionOfInterest” addition expands the standard
to include metadata about regions within a file (e.g. periods of blackbird song as
discussed above). Making sure that each field in the metadata matches an equivalent
AudubonCore term will help to future-proof your metadata. There is work within the
AudubonCore Maintenance Group to provide a user guide for audio files and analyses,
which will soon become a helpful document for the ecoacoustics community.
Standardised metadata will have long-term benets for the community, making it
easier to archive and aggregate datasets. An interesting example of this, the Global
Soundscapes Project83, aims to collate metadata from soundscape recording datasets
globally, and currently holds metadata on 392 projects - and is actively looking for new
collaborators.
Figure 3.2. Woodland can have a diverse range of vocalising species and can vary
greatly by season – a woodland soundscape will sound very different in autumn
compared to spring. Credit: Oliver Metcalf.
3.5 Data storage
Data storage remains one of the most challenging aspects of ecoacoustics. It is easy
to collect terabytes of acoustic data, even over relatively short survey periods. When
planning surveys it is important that storing the collected data in triplicate - and at a
minimum of two independent locations - is budgeted for. When working in the field in
remote areas, aim to work with duplicate copies on portable hard drives, until they can
be backed up to more permanent storage solutions.
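A rough storage budget can be estimated before a survey begins. The sketch below assumes uncompressed 16-bit mono recordings at 48 kHz stored in triplicate, and uses decimal terabytes; the function name and the two example surveys (which borrow the recorder numbers from the deployment examples in Section 3.1.1) are illustrative only.

```python
def project_storage_tb(recorders, days, hours_per_day, sample_rate_hz=48_000,
                       bit_depth=16, channels=1, copies=3):
    """Estimated storage in terabytes (10**12 bytes) for uncompressed audio."""
    bytes_per_second = sample_rate_hz * (bit_depth // 8) * channels
    recorded_seconds = recorders * days * hours_per_day * 3600
    return copies * recorded_seconds * bytes_per_second / 1e12


# 15 recorders recording continuously for a 30-day deployment.
intensive = project_storage_tb(recorders=15, days=30, hours_per_day=24)

# 4 recorders all year, sampling one minute in ten (2.4 hours of audio per day).
long_term = project_storage_tb(recorders=4, days=365, hours_per_day=2.4)

print(f"Intensive month: {intensive:.1f} TB in triplicate")
print(f"Year-round, 1 min in 10: {long_term:.1f} TB in triplicate")
```

Under these assumptions a single month of continuous recording on 15 devices needs roughly three times the storage of a full year of one-in-ten sampling on four devices, which is the kind of trade-off worth costing at the planning stage.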
Cloud based computing appears to be the future for long-term large data storage, as
it is scalable, flexible, and provides data security and regular data back-up. However,
the cost can be prohibitive for smaller projects and the carbon cost is invariably higher.
Slow internet connections can result in local storage being far faster for analysis. Cloud
storage also offers the potential for easier sharing of data and more collaborative
projects - however there are few options available for large datasets. Free longer term
cloud storage facilities for scientific projects are being developed at national level
in some countries, but the UK lags behind. One potential option is to upload data to
Arbimon35, which offers free storage in exchange for sharing data. For small amounts
of acoustic data, such as good recordings of rare species or particularly interesting
soundscapes, short audio files or sets can be uploaded to online repositories such
as xeno-canto85 or the Macaulay Library86 - both of which hold a huge amount of
ecoacoustic reference material useful for acoustic identification or as training data for
classification models.
One other option to reduce data storage requirements is to compress the files being
stored. A lossless file compression such as .flac may be a good option here. One of
the largest long-term acoustic monitoring projects globally, the Australian Acoustic
Observatory87, advocates an extreme form of acoustic data compression, converting
the original audio files to a series of acoustic indices from which some of the most
relevant ecological data can be retrieved88, although all original data is kept as well. This
is estimated to require six to eight orders of magnitude less storage than preserving
the original audio. Nevertheless, we would recommend storing the original audio data
(or at the very least a representative subsample) and only using this method as a last
resort as it entails a high degree of information loss.
Chapter 4. Data Exploration
Once audio data have been collected from a PAM study, the sheer quantity of data can seem
overwhelming. It is generally desirable to undertake some preliminary data exploration to
determine whether the ARUs have worked correctly, assess the quality of the data, and to get
a feel for the soundscapes recorded. This last point in particular can be vital for gaining an
understanding of the acoustic environment being recorded and the way it changes across the
diel cycle and over longer periods, and for formulating hypotheses for future or additional studies.
Several processes which can help handle large quantities of audio data are discussed in this
chapter.
4.1 Basic data checks
Often the simplest of checks are the most important in ensuring the data collected is
what it is expected to be. Some of the most important and useful file metadata can
be accessed by viewing the collected files on a computer. For instance, in Windows
operating systems (OS), opening the folder that contains the relevant audio files,
selecting the View tab and the ‘Details’ option, then using the ‘Add columns’ menu to
add the relevant file information to the screen can be a very useful way to quickly view
the recording metadata (the same information is available in the Finder sidebar on Mac
OS). Spending ten minutes checking that the start and end dates of recordings, numbers of
files from each ARU, file sizes, file duration, stereo/mono, sample rate and bit-depth are
all as expected can be invaluable in identifying any problems carried over from set-up or
recording.
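For larger deployments the same checks can be scripted. The following is a minimal sketch using only the Python standard library; the function names and the expected settings (48 kHz, mono, 16-bit) are illustrative assumptions and should be replaced with the values actually configured on the ARUs. It reads uncompressed WAV files only.

```python
import wave
from pathlib import Path


def summarise_wav(path):
    """Read basic recording metadata from the header of a WAV file."""
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        return {
            "file": Path(path).name,
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "bit_depth": wav.getsampwidth() * 8,
            "duration_s": wav.getnframes() / rate,
        }


def find_unexpected(folder, expected_rate=48_000, expected_channels=1,
                    expected_bits=16):
    """List (file name, problem) pairs for files that differ from the set-up."""
    problems = []
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() != ".wav":  # matches both .wav and .WAV
            continue
        try:
            info = summarise_wav(path)
        except (wave.Error, EOFError) as err:
            # Corrupt or truncated files are reported rather than skipped.
            problems.append((path.name, f"unreadable: {err}"))
            continue
        if info["sample_rate_hz"] != expected_rate:
            problems.append((path.name, f"sample rate {info['sample_rate_hz']} Hz"))
        if info["channels"] != expected_channels:
            problems.append((path.name, f"{info['channels']} channels"))
        if info["bit_depth"] != expected_bits:
            problems.append((path.name, f"bit depth {info['bit_depth']}"))
    return problems
```

Calling `find_unexpected` on each recorder's folder returns an empty list when every file matches the expected settings.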
4.2 Spectrograms
Spectrograms provide a visual representation of the audio data, with the frequency on
the y-axis, time on the x-axis, and the amplitude represented by the intensity of colour
(Figures 2.1, 2.3, 4.1). Spectrograms are produced by transforming the raw audio data
from the time-amplitude domain to the time-frequency domain typically using a fast
Fourier transform (FFT). They are one of the commonest ways in which to assess audio data.
Generally it is a good idea to make a quick inspection of any new data collected by
visually inspecting a small portion of the data, and displaying it with 15 seconds to 1
minute viewable on the x-axis at a time. Visual inspection of audio at this timescale is
likely to be signicantly faster than listening to the data directly. This will help to identify
any periods in which the recorder may have malfunctioned, or anomalous sound events
such as a period of construction work close to the recorder. Most spectrogram software
has a playback function and we strongly advocate listening to as much data as possible,
sampling whilst viewing to develop your understanding of the soundscape patterns at
the study site in order to support interpretation of later statistical analyses.
Figure 4.1. An example of a busy spectrogram. Recorded during the dawn chorus in the Lower
Derwent Valley NNR on 05/05/2020. Spectrogram produced using Raven Pro91.
Most generalist sound-editing software can display spectrograms, and allow the sound
displayed to be listened to simultaneously, whilst also offering various options for
editing sound. We do not provide a comprehensive list of all of the software available,
instead focussing on some popular choices in ecoacoustics. We have illustrated this
guidance document with spectrograms created from a range of software to provide
comparisons. For a wider range of software options, an extensive list has been made by
Tessa Rhinehart89.
One of the most popular choices for viewing and editing audio files is Audacity50, a free,
easy to use open-source audio editor (see Figure 2.1). It is a powerful piece of software
capable of a wide range of visualisation and editing processes. As Audacity is intended
as a general purpose audio editor, it is not necessarily optimised for conducting
ecoacoustic analysis - however, a good article on setting up Audacity for this purpose
(focussed on birds) can be found on the xeno-canto website90.
Raven Lite (free) and Raven Pro91 (licensed) are programs explicitly designed for
bioacoustic analysis, meaning they are somewhat more intuitive to use, at least initially,
than Audacity. They are convenient for paging through large audio files, or large quantities
of smaller files. Raven is also good for easy labelling of audio data, although some
of the automatic measurement and labelling options are limited to Raven Pro only. The
Raven User Guide is also an excellent document for anyone looking for an explanation as
to how to configure a spectrogram for maximum clarity, and what the different settings
do, in a way that is applicable to many different programs.
Kaleidoscope Pro, and its free viewer option, Kaleidoscope Lite, are produced by Wildlife
Acoustics51. They can load and close sound files with a single keyboard click, allowing
extremely rapid visual review of spectrograms for a batch of files, which can be easily
tuned for gain and contrast. Sonic Visualiser92 is a free tool for visualisation, analysis and
annotation which was designed for music analysis but with high resolution and fast
loading spectrogram viewing capacity.
It is also possible to create your own spectrograms using R93 (e.g. in the ‘seewave’
package94), Python95 (e.g. in Matplotlib96 or SciPy97) or MATLAB98 code (e.g. Signal
Processing Toolbox), although these tend to be less interactive and it can be tricky to
obtain as much clarity as in the custom made sound-editing software without good
knowledge of the scripting language. There are multiple well-documented packages
available in each language.
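As an illustration of the scripted route, the sketch below computes a spectrogram in Python with SciPy. It is a minimal example rather than a recommended configuration: the window size of 1024 samples and 50% overlap are assumed starting values, and a synthetic 3 kHz tone stands in for a real recording so that the example is self-contained.

```python
import numpy as np
from scipy import signal


def spectrogram_db(samples, sample_rate_hz, window_size=1024, overlap=0.5):
    """Return frequencies (Hz), times (s) and power in dB for an audio signal."""
    freqs, times, power = signal.spectrogram(
        samples,
        fs=sample_rate_hz,
        window="hann",
        nperseg=window_size,
        noverlap=int(window_size * overlap),
    )
    # A small constant avoids taking the logarithm of zero in silent frames.
    return freqs, times, 10 * np.log10(power + 1e-12)


# Synthetic two-second 3 kHz tone used in place of a field recording.
sample_rate = 22_050
t = np.arange(0, 2.0, 1 / sample_rate)
tone = np.sin(2 * np.pi * 3000 * t)

freqs, times, power_db = spectrogram_db(tone, sample_rate)
peak_hz = freqs[power_db.mean(axis=1).argmax()]
print(f"Strongest frequency band is centred near {peak_hz:.0f} Hz")
```

The three returned arrays can be passed to a plotting function such as Matplotlib's `pcolormesh(times, freqs, power_db)` to draw the image. Larger windows give finer frequency resolution at the cost of coarser time resolution.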
4.3 False-colour spectrograms/plots
False-colour spectrograms99 and plots are methods to visualise sound over long time
periods, normally using time on the x-axis and frequency or date on the y-axis. Unlike
standard spectrograms, instead of using the raw audio data as the input to ascertain
amplitude, false-colour spectrograms take the results of three acoustic indices (see
Chapter 6 for more on acoustic indices), and use these as the values in the Red-Green-
Blue channels to colourise the spectrogram (Figure 4.2). This means that the input
data is far less granular than a standard spectrogram, allowing for clearer visualisation
of patterns and trends over longer time periods. The principle behind false-colour
spectrograms can be extended to false-colour Extended Acoustic Summary images100,
replacing frequency on the y-axis with another measure of time (e.g. month, year), to
allow visualisation of acoustic change over prolonged periods (Figure 4.3).
The developers of the false-colour spectrogram, Queensland University of Technology
Ecoacoustics Lab, currently provide the Ecoacoustic Analysis Programs software package
for easy generation of false-colour spectrograms, freely downloadable from GitHub101.
In addition, code to create false-colour spectrograms in R is available in Appendix 3,
the Python package scikit-maad102 contains functions to create your own, and further
Python code is available on Sarab Sethi’s GitHub103.
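The core idea of mapping three acoustic indices onto the red, green and blue channels can be sketched in a few lines. This is not the Ecoacoustic Analysis Programs or scikit-maad implementation: the function names are hypothetical, the percentile-based scaling is an assumed choice, and random numbers stand in for real index values calculated per frequency bin and per minute.

```python
import numpy as np


def normalise(index_values, lower_pct=2, upper_pct=98):
    """Scale index values to the range 0-1, clipping extreme percentiles."""
    lo, hi = np.percentile(index_values, [lower_pct, upper_pct])
    if hi == lo:
        return np.zeros_like(index_values, dtype=float)
    return np.clip((index_values - lo) / (hi - lo), 0.0, 1.0)


def false_colour_image(red_index, green_index, blue_index):
    """Stack three (frequency bins x time steps) index matrices as RGB."""
    channels = [normalise(np.asarray(c, dtype=float))
                for c in (red_index, green_index, blue_index)]
    return np.dstack(channels)


# Placeholder data: 256 frequency bins x 1440 one-minute steps (24 hours).
rng = np.random.default_rng(0)
index_a, index_b, index_c = (rng.random((256, 1440)) for _ in range(3))
image = false_colour_image(index_a, index_b, index_c)
print(image.shape)
```

The resulting array has one row per frequency bin and one column per minute, and can be displayed with any image function that accepts RGB values between 0 and 1.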
Figure 4.2 False-colour spectrogram showing a 24 hour period, and a frequency range up
to 22 kHz. The dawn chorus is visible between 5-8am, with corresponding high values in the
acoustic indices assigned to the red and green colour bands. Credit: Sarab Sethi.
Figure 4.3 False-colour plot for nine ARUs deployed simultaneously across a site over a 17 day
period. The purple bands visible in most plots show low acoustic index values during the night,
with the green blocks in site 9573 showing high levels of acoustic activity at this site.
4.4 Data pre-processing
Having initially visually assessed the data, it may be apparent that some of the data are
problematic. Problematic data can occur for a range of reasons, including recorder faults,
an excess of anthrophony or geophony (e.g. from roadworks nearby, or a period of high
wind and rain), or the absence of any acoustic signals of interest. Careful thought needs
to be given as to what constitutes unwanted noise in the data, and what could be an
important part of a site’s acoustic character - this will vary by the study objectives.
In general, there are few automated processes or documented methods for the removal
of such problematic data. The hardRain package104 in R can identify and remove periods
of intense rainfall from datasets, but is primarily aimed at data collected in tropical
forests, and is less eective in temperate environments. There are also published
methods for identifying wind aected les and ‘denoising’ them (i.e. minimising the
impact of wind noise)105. In many cases, it is likely to be easiest to manually search
for outliers by extracting a range of acoustic index values, and then either visually
examining false colour spectrograms, or by standard statistical methods of identifying
outliers in a dataset. It is also a good idea in large datasets to remove the rst 15 minutes
of recording after deployment (or longer if possible), and the last 15 minutes prior to
collection to limit any impact from the presence of people during this period.
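Both steps can be scripted once file start times and index values are in a table. The sketch below is one possible approach, not a documented method: the function names are hypothetical, the outlier rule is a median-based modified z-score with a conventional cut-off of 3.5, and the index values are invented for illustration. Flagged files should be inspected rather than deleted automatically.

```python
from datetime import datetime, timedelta
from statistics import median


def outside_visit_buffer(start, duration_s, deployed, collected, buffer_min=15):
    """True if a recording lies wholly outside the buffer around site visits."""
    buffer = timedelta(minutes=buffer_min)
    end = start + timedelta(seconds=duration_s)
    return start >= deployed + buffer and end <= collected - buffer


def flag_outliers(values, threshold=3.5):
    """Flag values whose modified z-score (based on the median) is extreme."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [False] * len(values)
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]


# Invented acoustic index values for six files; the last is unusually high.
index_values = [0.51, 0.49, 0.50, 0.52, 0.48, 0.95]
flags = flag_outliers(index_values)

deployed = datetime(2022, 5, 1, 10, 0)
collected = datetime(2022, 5, 8, 10, 0)
keep_first = outside_visit_buffer(datetime(2022, 5, 1, 10, 5), 600,
                                  deployed, collected)
print(flags, keep_first)
```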
A type of problematic data that may be less apparent during a visual inspection is
private conversations of people in proximity to the recorders. The presence of human
speech in passive acoustic data raises a number of ethical concerns, but has received
little attention in the ecoacoustics literature. The simplest way to eliminate human
speech from audio data is to apply a high-pass filter at a frequency that would remove all
or most human sound; ~2 kHz, for instance, would certainly be enough. However, a large
amount of biophony would also be removed; for instance Great Bittern Botaurus stellaris
and Common Cuckoo Cuculus canorus vocalisations would also be entirely eliminated.
There are a number of well-developed voice activity detection software packages available,
however they are primarily developed for indoor use with voices in close proximity - only
one program has been designed for identification of (Norwegian) speech in ARUs106 - the
Python code is freely available online. Another option is to set a recording schedule of
intermittent short clips that would break up any unintentionally recorded conversations
into unintelligible snippets. However, this may have a significant impact on the detection
of target sounds or temporal analysis of the soundscape.
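A high-pass filter of the kind described above can be applied with SciPy. The sketch below is a minimal example under stated assumptions: the eighth-order Butterworth design is an arbitrary choice, the 2 kHz cut-off is taken from the text, and two synthetic tones (300 Hz and 4 kHz) stand in for low-frequency and high-frequency sound.

```python
import numpy as np
from scipy import signal


def high_pass(samples, sample_rate_hz, cutoff_hz=2000.0, order=8):
    """Remove content below cutoff_hz using a zero-phase Butterworth filter."""
    sos = signal.butter(order, cutoff_hz, btype="highpass",
                        fs=sample_rate_hz, output="sos")
    # Filtering forwards and backwards avoids shifting sounds in time.
    return signal.sosfiltfilt(sos, samples)


# One second of synthetic audio containing a 300 Hz and a 4 kHz tone.
sample_rate = 16_000
t = np.arange(sample_rate) / sample_rate
mixed = np.sin(2 * np.pi * 300 * t) + np.sin(2 * np.pi * 4000 * t)
filtered = high_pass(mixed, sample_rate)

before = np.abs(np.fft.rfft(mixed))
after = np.abs(np.fft.rfft(filtered))
print(f"300 Hz retained: {after[300] / before[300]:.4f}, "
      f"4 kHz retained: {after[4000] / before[4000]:.4f}")
```

Any biophony below the cut-off is lost along with the speech, so the filtered files are best stored alongside a clear record of the filter settings used.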
Ultimately, it is better to deal with privacy issues by avoiding collecting human speech
and warning of the risk of being recorded at the deployment stage, than it is to deal with
speech once it has been collected. We are not in a position to advise on the legality of storing PAM data in
respect to the UK General Data Protection Regulation, and practitioners should take
care to ensure they are fully compliant.
Figure 4.4. Hay meadows and wet grassland often have strong dawn choruses dominated by
Skylark Alauda arvensis, Reed Bunting Emberiza schoeniclus, and Sedge Warbler Acrocephalus
schoenobaenus. Credit: Oliver Metcalf.
Chapter 5. Targeted Monitoring
Once data have been collected and subjected to initial pre-processing, there are a range of available
analysis options. This chapter covers those methods that deal with detecting and identifying
specific ecologically relevant signals, such as bird calls, within the audio data (Section 5.1 Acoustic
analysis), and how they can be used to gain ecological insight (Section 5.2 Ecological analysis).
The chapter is labelled ‘Targeted monitoring’, as many of the methods can be applied equally to
individual target species, multiple species, or ecologically relevant anthropogenic sounds, such as
gunshots.
5.1 Acoustic analysis
5.1.1. Manual analysis
In many cases, the most accurate and efficient method for obtaining useful
ecological data from audio files will be manual analysis. This is especially true if
community data are required (e.g. data comparable to point counts conducted
in the field), or if data on the detection and non-detection of a single species
are required with a high degree of accuracy. In these cases, the effort involved
in manually reviewing data is likely to be less than that of training a highly
accurate single-species automated classifier, or reviewing predictions of
off-the-shelf multi-species classifiers.
The process for retrieving specific data from audio files is similar to that
described in Section 4.2, visualising audio files using spectrograms and then
listening to them as necessary to identify signals of interest. Labels, such as
species identifications or call types, can be attributed to the relevant section
of the spectrogram, then used later in ecological analysis. All of the software
listed in Section 4.2 support labelling of specific sections of the spectrograms
and would be suitable for this type of manual analysis. Although this process
requires considerable human input, visualising the data with spectrograms
can speed up analysis considerably for experienced practitioners. Birdwatchers
using ARUs to record nocturnally migrating birds over their gardens report
being able to analyse a night’s recording of eight hours with average migration
activity in about one hour, identifying and labelling all significant bird
vocalisations107.
Figure 5.1. Three vocalisations from a rare breeding bird, the Spotted Crake Porzana
porzana, recorded during a targeted PAM survey of the Lower Derwent Valley NNR on
05/04/2022.
Given the relatively high amount of human effort required in analysing audio
data this way, the benefits compared to traditional field-based surveys may
be less immediately obvious. However, manual analysis of species data from
recordings is likely to produce better assessment of community species
richness in birds compared to conducting point-counts in the field19, detecting
an average of 10% more species. Additional benefits are that analysis
can be undertaken at any time (e.g. outside of busy fieldwork periods), that
the same observer can analyse temporally synchronous data (i.e. effectively
undertake multiple surveys simultaneously), thus reducing inter-observer
error, and that other observers can repeat the analysis to correct for errors.
Importantly for commercial enterprises, it also provides a fully evidenced
analysis workflow, should the presence or absence of certain species be
queried later.
5.1.2. Automated and semi-automated approaches
Automated approaches to detecting and in some cases identifying signals of
interest are potentially time-saving alternatives to manual analysis. There are
a range of approaches available, varying in complexity and output, ranging
from simple algorithms used to detect when any sound event occurs, right up
to cutting-edge neural networks to detect and identify the songs of multiple
species from across the globe, that push at the boundaries of deep-learning
development.
5.1.3. Sound event detection
When choosing an approach to take, it is important to be aware of the
difference between sound event detection models and classification models.
Sound event detection models are useful when looking for rare sound events
during long quiet periods, or when the majority of sounds are of interest and
it is valuable to isolate them. These are often relatively simplistic approaches
that look for sounds that pass predefined thresholds for amplitude or
signal-to-noise ratio (e.g. Raven Pro91, Kaleidoscope Pro51, Tadarida D108), but can
be parameterised to only apply at certain frequencies or with minimum or
maximum time intervals. Given their simplicity, these sorts of models, which
exist on a range of acoustic analysis software, can be quick to configure and
fast to apply to large quantities of data. They can be an effective way of quickly
removing large quantities of audio that is not of interest, with reasonable
confidence. What they do not do, however, is identify the detected sounds as
belonging to any species or source.
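The thresholding idea can be illustrated with a simple energy detector. This sketch does not reproduce the detectors in Raven Pro, Kaleidoscope Pro or Tadarida; the frame length, the 10 dB threshold above the median level and the minimum duration are assumed values, and the audio is synthetic background noise containing one half-second tone.

```python
import numpy as np


def detect_events(samples, sample_rate_hz, frame_s=0.05,
                  threshold_db=10.0, min_duration_s=0.1):
    """Return (start_s, end_s) for runs of frames louder than the background."""
    frame_len = int(round(frame_s * sample_rate_hz))
    n_frames = len(samples) // frame_len
    frames = np.asarray(samples[:n_frames * frame_len], dtype=float)
    frames = frames.reshape(n_frames, frame_len)
    level_db = 20 * np.log10(np.sqrt((frames ** 2).mean(axis=1)) + 1e-12)
    # The median frame level is used as a simple estimate of background noise.
    active = level_db > np.median(level_db) + threshold_db

    events, start = [], None
    for i, is_active in enumerate(np.append(active, False)):
        if is_active and start is None:
            start = i
        elif not is_active and start is not None:
            if (i - start) * frame_s >= min_duration_s:
                events.append((start * frame_s, i * frame_s))
            start = None
    return events


# Ten seconds of quiet noise with a 1 kHz tone between 3.0 s and 3.5 s.
rng = np.random.default_rng(1)
sample_rate = 8_000
audio = rng.normal(0, 0.01, 10 * sample_rate)
t = np.arange(int(0.5 * sample_rate)) / sample_rate
audio[3 * sample_rate:3 * sample_rate + len(t)] += 0.5 * np.sin(2 * np.pi * 1000 * t)

events = detect_events(audio, sample_rate)
print(events)
```

Restricting the detector to a frequency band, as described above, would mean band-pass filtering the audio before the frame levels are calculated.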
5.1.4. Template matching
One method to obtain detections of an identified sound type is through
template matching. In this method, the user provides one or more templates
of the desired sound, and an algorithm compares the template to the dataset
provided. The output is then a series of detection periods and sometimes
frequencies, with an associated confidence score. Users can determine their
own thresholds for accepting a confidence score as a true detection. This
method has distinct advantages over more complex algorithms as it requires
limited user input in training the algorithm, doesn’t require a high level of
technical skill to undertake, and is conceptually simple. However, template
matching can be quite slow to run over large quantities of audio data, and
is only likely to be highly accurate for stereotyped calls in relatively simple
acoustic environments109. The process can still be applied to less
stereotyped calls or noisier environments: in most cases users choose to set
a relatively low threshold to avoid missing too many calls, then manually
reassess the detections produced to eliminate false positives. This can still be
quite time consuming, but potentially less so than manually assessing all of
the data, or building a more complex classification algorithm110.
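The comparison step can be sketched as a normalised cross-correlation between a template spectrogram and successive windows of a longer spectrogram. This is an illustration of the principle only, not the algorithm used by any particular package: the function names and the 0.6 threshold are assumptions, the template must cover the same frequency bins as the spectrogram, and random matrices stand in for real spectrograms.

```python
import numpy as np


def match_template(spec, template):
    """Correlation score (-1 to 1) of the template at every time offset."""
    width = template.shape[1]
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    scores = np.zeros(spec.shape[1] - width + 1)
    for start in range(len(scores)):
        window = spec[:, start:start + width]
        w = window - window.mean()
        denom = t_norm * np.sqrt((w ** 2).sum())
        scores[start] = (t * w).sum() / denom if denom > 0 else 0.0
    return scores


def detections(scores, threshold=0.6):
    """Time offsets (in spectrogram columns) scoring at or above the threshold."""
    return [(int(i), float(scores[i])) for i in np.flatnonzero(scores >= threshold)]


# Placeholder spectrogram (32 frequency bins x 200 columns) with the
# template pattern embedded at column 50 on top of low-level noise.
rng = np.random.default_rng(0)
template = rng.random((32, 10))
spec = rng.random((32, 200)) * 0.1
spec[:, 50:60] += template

scores = match_template(spec, template)
hits = detections(scores, threshold=0.6)
print(hits)
```

Lowering the threshold returns more candidate detections for manual review, which mirrors the trade-off described above.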
Figure 5.2. After the sun sets wetlands can come alive with bird and amphibian
sound – passive acoustic monitoring offers a great way to monitor this.
Credit: Oliver Metcalf.
There are several methods available for template-matching analysis. The
most user-friendly is the Arbimon online platform35, which allows data to be
uploaded, stored, and analysed in various ways, including template matching,
free of charge. Arbimon uses a slightly more complex form of template
matching, in which initially provided templates are then used to train a
random-forest detection model - although this process requires very little
user input beyond the initial template111. This type of template matching has
been used in academic ecoacoustic studies successfully across the globe112,113.
However, uploading large quantities of audio data to the web can be time-
consuming, and it may not be suitable for commercial or sensitive projects
due to the somewhat opaque policies about data re-use. It is however an
interesting integrated analysis platform that is worth exploring as an analysis
option, especially for those without coding skills in R or Python, or the time to
develop their own pipelines.
For those with the capacity for basic coding in R, development of a template-
matching pipeline is straightforward thanks to the monitoR package114. There
is a tutorial video on basic setup of such an approach by Daniella Teixeira
available on the UKAN+ YouTube page115. Again, this approach has been well
used in academic studies globally, and there are several papers outlining ways
to optimise the use of such an approach116,117.
Figure 5.3. Targeted passive acoustic monitoring can be a good way to
establish the presence of rare or scarce nocturnal and crepuscular species
such as this Short-eared Owl Asio flammeus. Credit: Oliver Metcalf.
5.1.5. Machine learning
Contemporary approaches to machine learning for acoustic analyses include
supervised and unsupervised learning. Supervised machine learning relies
on labelled input and output training data, whereas unsupervised learning
processes unlabelled or raw data, such as the clustering algorithms used in
Kaleidoscope51.
Early supervised learning models are a type of machine-learning algorithm
that uses acoustic features identified by the user to train a model capable
of distinguishing between pre-specified classes. The algorithm is provided
with large quantities of training data, from which it can ‘learn’ the patterns
in the provided features. The trained model then makes predictions on the
probability of new data belonging to a particular class. For instance,
imagine a simple soundscape in which only two species vocalise. Someone
looking at the respective calls could observe that there is a great deal of
difference between the two species in the rate they repeat their calls, and the
pitch at which the calls are given. The algorithm would therefore be provided
with measurements of inter-syllable gap and frequency from calls belonging
to each class (species), and a model trained to predict the probability of which
species the calls emanated from. Generally in complex acoustic classification
tasks, many more features are selected. These types of models have been
used with reasonable success for automated classification of call types, but
are generally being used less as they are out-performed by deep-learning
methods.
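The two-species example above can be made concrete with a small classifier. The sketch below uses logistic regression written directly in NumPy, which is only one of many possible feature-based models; the call measurements are simulated, and the means and spreads chosen for each species are invented purely for illustration.

```python
import numpy as np


def train_logistic(features, labels, learning_rate=0.1, n_iter=2000):
    """Fit a logistic regression by gradient descent on standardised features."""
    mean, std = features.mean(axis=0), features.std(axis=0)
    x = (features - mean) / std
    weights, bias = np.zeros(x.shape[1]), 0.0
    for _ in range(n_iter):
        p = 1 / (1 + np.exp(-(x @ weights + bias)))
        error = p - labels
        weights -= learning_rate * (x.T @ error) / len(labels)
        bias -= learning_rate * error.mean()
    return {"mean": mean, "std": std, "weights": weights, "bias": bias}


def predict_proba(model, features):
    """Probability that each call belongs to the class labelled 1."""
    x = (features - model["mean"]) / model["std"]
    return 1 / (1 + np.exp(-(x @ model["weights"] + model["bias"])))


# Simulated features: inter-syllable gap (s) and peak frequency (Hz).
rng = np.random.default_rng(42)
species_a = np.column_stack([rng.normal(0.2, 0.03, 100), rng.normal(3000, 200, 100)])
species_b = np.column_stack([rng.normal(0.6, 0.05, 100), rng.normal(5500, 300, 100)])
features = np.vstack([species_a, species_b])
labels = np.array([0] * 100 + [1] * 100)  # 0 = species A, 1 = species B

model = train_logistic(features, labels)
new_calls = np.array([[0.22, 3100.0], [0.58, 5400.0]])
probabilities = predict_proba(model, new_calls)
print(probabilities.round(3))
```

Real calls overlap far more than these simulated ones, which is why many more features are usually needed in practice.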
Figure 5.4. Small mammals such as this Pygmy Shrew Sorex minutus often make
sound, so can be a good target for acoustic monitoring, although some of the sounds
can be beyond the range of human hearing. Credit: Oliver Metcalf.
An overview of many of the software options that use machine-learning
approaches (amongst others) is available in Table 4 of Priyadarshani et al.,
(2018)118. We do not go into great detail here on these programs as for most
people interested in ecological sound classification, the BirdNET app119
or Kaleidoscope Pro51 programs will likely be the best approach, and are
discussed in more detail below.
For those interested in having more control over the classification process,
the Tadarida toolbox108 has been used successfully in Europe to classify a
range of bird, insect, and small mammal sounds, although it is most effective
in ultrasonic frequencies. In many cases developing bespoke pipelines in R
or Python can be most effective. R has several packages that can
help with this; gibbonR120 in particular has many useful functions, but it is also
possible to extract acoustic features using a package such as warbleR121, before
using a specialist machine-learning package such as caret122 to perform the
classification.
Clustering algorithms are not provided with training data. Instead a sound
event detection method is undertaken first, after which the clustering
algorithm groups sounds by similarity. In theory, as call variation should be
greater among species than within species, if configured correctly these
algorithms should result in clusters of single species calls that can then be
identified by an ecologist.
One of the most popular commercial software packages for ecoacoustic analysis,
Kaleidoscope Pro51, uses clustering. Kaleidoscope Pro can quickly analyse
large quantities of data, and is user-friendly. In many cases, it is likely to be the
optimum species-specific analysis software for commercial projects, although
it is expensive and still requires ecological knowledge to identify sounds once
they are placed in a cluster.
5.1.6. Deep learning
Deep learning neural network models follow the classical machine learning
paradigm, but instead of requiring a primary feature extraction step the raw
audio (or more commonly its spectrogram) is presented as input and a high
dimensional representation of the audio is learned.
Supervised, unsupervised and increasingly semi-supervised and reinforcement
deep learning paradigms exist. The most popular approach to classification
of acoustic data is the convolutional neural network123. Large quantities of data
labelled with a single species or taxon (binary classification) or for multiple
species (multi-label classification) are provided. The algorithms are able to
independently ‘learn’ which features are most relevant in telling them apart
from other sounds. The principle is that this learned representation can
generalise to new data. Convolutional neural networks have produced the
best accuracy metrics for automated classification of any of the methods
mentioned here124. Deep learning algorithms can be effective in quickly
and accurately assessing large quantities of acoustic data, and are the only
classification method that can realistically be fully automated. However,
there is a high level of technical knowledge required to initially train one
of these algorithms, and the process of finding, identifying, and labelling
enough appropriate training data to create an accurate classifier can be very
time consuming. For those with basic Python skills, the OpenSoundscape125
package offers a relatively straightforward way to build classification models,
and has a very clear user guide on how to do so.
An alternative to training classifiers for individual use is to use pre-trained
classification models built by others and made available for use. As the
production and use of deep-learning models is still relatively new in
ecoacoustics, the number of open-source models is limited, but is likely to
increase. Fortunately one of the few available, BirdNET developed by Cornell
Lab of Ornithology119, works for almost all European bird species and is freely
available to download as a standalone program. These multiclass models have
advantages (they only need to be run once over the data to obtain a complete
list of all species present) but also have disadvantages - they can produce some
obscure false predictions, and can in some cases be less accurate than models
trained for one or a few species. Nevertheless, the freely available and user-
friendly nature of the software is likely to make BirdNET a game changer for
analysis of acoustic data for birds, although currently it is only available under
a non-commercial Creative Commons licence. Note also that BirdNET is most
accurate when using the feature that allows it to be constrained by local lists of
birds generated from eBird.
Figure 5.5. A comparison of manual annotation (light blue) vs BirdNET classification
(purple) of 1 minute of audio from the Lower Derwent Valley NNR at 04:00 on 4th
May 2020. Manual annotations were made in Raven Pro by Oliver Metcalf and took
approximately 6 minutes. BirdNET was run through the desktop graphical user
interface, was given the latitude and longitude of the recording, the week of the year,
with overlap set to 2 seconds, sensitivity of 1.0, minimum confidence of 0.1 and 4
threads and took 5.6 seconds. The spectrogram was produced in Raven Pro.
Similar automated classification algorithms are available commercially for
bats, and through the BTO Acoustic Pipeline126 for a range of bats, small
rodents, and insects. The BTO Acoustic Pipeline is free for small quantities of
non-commercial audio analysis, but a paid-for model is available for larger
and/or commercial projects, and provides a very good way to obtain a suite
of accurate automated identifications of non-bird species without building
individual classification algorithms. However, as the Acoustic Pipeline was
originally designed to process ultrasonic data it requires data with a sampling
rate of 192 kHz or higher (recommended at 384 kHz for AudioMoths), so it may
not be suitable for long periods of monitoring, as SD cards in ARUs would fill
rapidly.
Finally, for those with access to experienced data scientists, it is possible to
develop your own deep-learning classification models. Labelling enough
training and test data to develop these models will represent a significant
start-up cost in terms of time and effort, so this approach is likely only
cost-effective if it is to be applied to very large quantities of acoustic data over
a long period, potentially for multiple species, and when a high degree
of accuracy is required. The development of deep-learning models for
ecoacoustic classification is a field of active research in computer science, and
as such the field is developing rapidly - it is worth undertaking a review of the
latest academic literature in the field before undertaking any such projects.
5.1.7. Assessing classification performance
When using automated classification, one of the most important questions
to answer is how accurate the model is in its predictions. Unfortunately this is
not straightforward to answer, and even the most user-friendly classification
software does not offer reliable estimates of model accuracy. Standard
machine-learning methods for assessing classification performance involve splitting
the labelled dataset in a ratio of 80:20 or 70:30, and using the larger sample
to train the algorithm and the smaller set for testing model performance.
Unfortunately this approach does not translate very well for ecoacoustics,
as the labelled data are often highly unrepresentative of the acoustic data
the model will be applied to, because the training data need lots of examples of the
target species calls, but these will likely be far rarer in the natural environment.
Consequently, when using automated classification, it is necessary to budget a
substantial amount of effort to manually assess an independently sampled test
dataset. Knight et al., (2019)127 provide an excellent set of guidelines for which
accuracy metrics should be used, and how to benchmark results.
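A short worked calculation shows why a test set rich in target calls gives a misleading picture. Assuming, purely for illustration, a classifier that detects 90% of true calls and wrongly flags 5% of clips without the target species, the expected precision depends heavily on how common the species is in the data being classified.

```python
def expected_precision(prevalence, recall, false_positive_rate):
    """Expected proportion of predicted presences that are correct."""
    true_positives = recall * prevalence
    false_positives = false_positive_rate * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# Assumed classifier behaviour: recall 0.9, false positive rate 0.05.
balanced_test_set = expected_precision(0.50, 0.9, 0.05)  # half the clips contain the species
field_recordings = expected_precision(0.01, 0.9, 0.05)   # 1 in 100 clips contain the species
print(f"Balanced test set: {balanced_test_set:.2f}, field data: {field_recordings:.2f}")
```

Under these assumed figures the same classifier appears to be right about 95% of the time on the balanced test set, but only about 15% of its predicted presences would be correct in field data where the species occurs in 1% of clips.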
Note that automated classification models can be very sensitive to different
soundscapes, so if the classifier is applied to data taken from large spatial
or long temporal scales, it is also a good idea to check for variation in
classification performance across the study data - some methods for how
to do so are provided in Metcalf et al., (2022)128. There is an additional need
for caution when using ‘closed’, off-the-shelf, sound classification tools.
Where the training data and call/sound features used to train the models
are not published, it is not possible to assess how the algorithm is making
classifications and what the potential biases or errors might be, which
ultimately will affect the inference.
The metrics used to assess model performance will be dependent on the
objectives of the particular study being undertaken, and the use to which the
classification data will be put. In ecoacoustic studies, precision (the proportion
of correctly predicted presence amongst all predicted presence) and recall (the
proportion of all true presences which are predicted) are generally considered
most useful. When calculating accuracy metrics it is commonplace to use an
initial confidence score threshold of 0.5, so that for any confidence scores
below 0.5 the target species is predicted absent, and above 0.5 the species is
predicted as present. If initial accuracy metrics have not achieved the desired
level of accuracy, there are two methods to remedy it. The first is to retrain the
model, using more (or better/augmented) training data. However, this can be
time consuming, and it is likely to be more effective to first try the second: adjusting the
confidence score threshold to achieve an optimal trade-off between precision
and recall. Assessing precision-recall trade-offs at different thresholds can
be done formally by building precision-recall curves (there are various R and
Python packages available to do this, such as ROCR129 and PRROC130 in R) using the
test dataset, or less formally by taking stratified subsamples from across the
range of predicted confidence scores post-classification - an approach that
may be particularly useful for pre-built classifiers like BirdNET.
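The effect of moving the threshold can be seen with a handful of manually checked predictions. The scores and labels below are invented for illustration, and treating a score exactly equal to the threshold as a predicted presence is an assumption of this sketch.

```python
def precision_recall(scores, truths, threshold):
    """Precision and recall when scores at or above threshold count as present."""
    pairs = list(zip(scores, truths))
    tp = sum(1 for s, t in pairs if s >= threshold and t)
    fp = sum(1 for s, t in pairs if s >= threshold and not t)
    fn = sum(1 for s, t in pairs if s < threshold and t)
    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    return precision, recall


# Invented confidence scores with the result of manual checking
# (True = target species genuinely present in the clip).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]
truths = [True, True, True, False, True, False, True, False, False, False]

for threshold in (0.25, 0.50, 0.75):
    precision, recall = precision_recall(scores, truths, threshold)
    print(f"threshold {threshold:.2f}: precision {precision:.2f}, recall {recall:.2f}")
```

Raising the threshold from 0.25 to 0.75 in this toy example lifts precision from 0.62 to 1.00 while recall falls from 1.00 to 0.60, which is the trade-off a precision-recall curve traces out across all thresholds.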
Figure 5.6. Caledonian pine forests hold a range of biodiversity suitable for passive
acoustic monitoring. One of the rarest species – Capercaillie Tetrao urogallus – leks in early
spring and is readily disturbed by human presence, so autonomous recorders have
proven to be a good alternative monitoring method. Credit: Oliver Metcalf.
Very large studies generating large numbers of positive predictions will require
a fully automated workflow, and it is likely to be necessary to set a high confidence
score threshold to minimise false positives (i.e. predicted presences when the
target species is in fact absent). Precision scores of >0.9 and recall of >0.5 are
probably good targets for most studies. The disparity between these scores
arises because, when setting thresholds, we recommend focussing on achieving
a low number of false positives at the expense of missing some true presences
(i.e. prioritising precision over recall). This is because i) species usually vocalise
multiple times, often in quick succession, and most classification algorithms
give predictions over short timescales, so missing the presence of a species in one
3 second clip is mitigated by detecting it in the next, and ii) falsely assuming
a species is present when it is not is generally more detrimental to ecological
studies than missing presence. To mitigate missed detections further, it can be
effective to summarise classifications over time - e.g. turn multiple predictions
from three second files into a single prediction of presence over ten minutes.
For smaller studies, where every case in which a target species is predicted to be
present can be manually assessed, the inverse approach can be taken. This process
is known as semi-automated classification. Here, the threshold is set low
to maximise recall, so that very few true presences are missed. This will
necessarily produce a higher number of false positives, but these are then
weeded out by manual assessment. This will result in a dataset with fewer
classification errors at the expense of greater manual labelling effort. Beyond
small datasets, this approach is also likely to be desirable for
studies in which it is important to know exactly how often a species vocalises
or is present, or when a classification model performs poorly overall.
5.2 Ecological analysis
5.2.1. Presence and absence
The most basic ecological data obtainable from PAM is the presence or
absence of certain species. In many cases, especially when dealing with
wildlife legislation, even a single data point confirming species presence can
be critically important. PAM can be a very good choice for this sort of survey,
and has been proven effective for a range of species globally131,132,133,134. In
particular, if only a single presence is required, an automatic workflow heavily
weighted in favour of precision presents a potentially very efficient way of
obtaining confirmation of species presence. When attempting to establish
absence, the opposite approach should be taken, and either manual analysis
should be adopted, or a semi-automated approach heavily favouring recall.
It is quite clear that for many vocal species, such as those that are elusive,
nocturnal, live in difficult to survey habitats such as reedbeds, or are at
low density, well-designed acoustic surveys over long durations can be
more effective at confirming presence than manual surveys. Given this, it is
unfortunate and somewhat illogical that the UK Bird Survey Guidelines135
advise that PAM should not be used to establish species absence.
5.2.2. Community analysis
A list of species from a location is likely to be high on most ecologists’ lists of
desirable products from an ecoacoustic study. A list of species is relatively easy
to generate, either through manual assessment of the data or by running a
multi-species classifier such as BirdNET. However, acoustic data is not a census
of the species within an area - ARUs may not cover the entire area, recording
schedules may not be continuous, or recordings may be subsampled
for manual analysis after collection. Further, species will not be equally
represented in the acoustic data, as some make more sound, or are easier to
detect. It is often therefore desirable to assess how representative any species
list is of the whole community.
There are several methods that can be used to assess the completeness of the
species list, although this exercise is somewhat circular. One very informal way
to do so is to compare the generated list to pre-existing lists of species present
in the area at similar times of year, using data from local recording schemes,
and/or from online citizen science repositories such as eBird136. This process
can help to ensure that a disproportionate number of species aren’t being
missed and also identify any species which may be erroneously identified from
the acoustic dataset, but are not present in other datasets. To more formally
assess survey completeness, species accumulation curves can be easily and
quickly built in R using the iNEXT137 or vegan138 packages, in which the number
of species found is predicted by the number of survey samples used. Once
the accumulation curves plateau, more survey effort is unlikely to result in
the detection of many more species, so it is reasonable to assume the species
list obtained from that much survey effort is representative of the entire
community present. Conversely, if accumulation curves do not plateau, this
indicates that more survey effort is needed.
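To illustrate the idea behind a species accumulation curve, the sketch below averages the cumulative species count over random orderings of the survey samples. It is a simplified stand-in written in plain Python, not the method implemented in the R packages named above, and the species lists are invented for the example.

```python
import random


def accumulation_curve(samples, n_permutations=100, seed=0):
    """Mean cumulative species count after 1..n samples, over random orderings.

    samples: list of sets, each holding the species detected in one sample.
    """
    rng = random.Random(seed)
    totals = [0.0] * len(samples)
    for _ in range(n_permutations):
        order = samples[:]
        rng.shuffle(order)
        seen = set()
        for i, species in enumerate(order):
            seen |= species
            totals[i] += len(seen)
    return [t / n_permutations for t in totals]


# Hypothetical species lists from six survey samples (e.g. six recording days)
samples = [
    {"Robin", "Wren"}, {"Robin"}, {"Wren", "Blackbird"},
    {"Robin", "Blackbird"}, {"Wren"}, {"Robin", "Wren", "Dunnock"},
]
curve = accumulation_curve(samples)
print([round(v, 2) for v in curve])
```

A curve that is still rising steeply at the final sample would suggest that further survey effort is likely to add species.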
Figure 5.7. A Frontier Labs Bioacoustic Recorder deployed in a Spotted Crake
Porzana porzana territory to help monitor this rare, nocturnal, and elusive wetland
breeding species. Credit: Oliver Metcalf.
5.2.3. Occupancy models
Occupancy models estimate the probability of species occupying a site,
in relation to environmental covariates. Importantly, occupancy models
estimate detectability from a structure of multiple visits, and can infer species
occupancy in suitable habitat, even if the species is not detected at a site.
Acoustic data can be divided into discrete, independent, temporal units, and
then treated as ‘repeat visits’ to a site. Occupancy models have been widely used
to analyse data derived from acoustic studies, both in the UK, where they have
been used to study rare heathland bird species139, and abroad for elusive forest
passerines140,141. They are well suited to acoustic data as they take
presence/absence data as input and allow for imperfect detection of the sort caused by
classification error or silent individuals. They can also be extended to
multi-season or dynamic occupancy models to allow for understanding of changes
in species occupancy, which are especially useful in long-term monitoring142,143.
Additionally, they can be used for multi-species models to investigate the
impact of co-occurring species, which may be useful in monitoring the impact
of reintroduced or newly occurring species at rewilding sites or conservation
projects. There are various R packages for fitting occupancy models, the most
widely used of which is ‘unmarked’144. There are also an increasing number of
papers looking at methods to deal with the sort of errors caused by automated
classification workflows for acoustic data145.
5.2.4. Localisation
Localising acoustic signals has a multitude of applications, including
non-invasive behaviour monitoring, abundance counting, and locating the
position of chainsaws used in illegal logging146 or gunshots64. As with the
hardware, there are few off-the-shelf software solutions for sound localisation.
Fortunately, there is an excellent review of the approaches taken to sound
localisation so far, which should give anyone wishing to undertake such
an analysis a good starting point, and an idea of the analytical challenges
they are likely to face24. Open localisation tools are increasingly being
released, often with user-friendly interfaces, e.g. HARKBird147 and ODAS148.
There are also functions and tools in the Python packages scikit-maad102
and OpenSoundscape125 likely to be useful to anyone attempting sound
localisation.
5.2.5. Density/Abundance
Another highly desirable output from acoustic data is a measure of
species abundance or density. The topic was recently reviewed149, and that review
is recommended reading for anyone wanting to explore it. Estimating
abundance or density is not a simple task, and whilst there isn’t yet one proven
method successful in all scenarios, there are three general approaches.
The first is to use vocal activity rate. The second is to use localisation of sounds
and complex statistical approaches. The third is to use individual identification.
Each of these has its own strengths and weaknesses, and currently each
is only appropriate for a few species or studies that meet the stringent
assumptions.
Vocal activity rate150 is predicated on the idea that if individual animals vocalise
at a consistent rate, then vocalisations from a species are linearly related to
the number of individuals present. The challenge with this simple approach is
that accurate information is needed on the average sound production rate of
individuals (or cue rate)151. The approach assumes that average cue rate is the
same between individuals and that detectability is the same at different sites.
Much of the development of this work has been with marine mammals152.
In the terrestrial realm, this approach may be applicable to some species,
such as amphibians and territorial bird species during the breeding season.
Importantly, it has also been paired with template matching for classification
of Forster’s Tern Sterna forsteri in the USA153 and automated classification
for Cory’s shearwater Calonectris borealis calls on the Azores to successfully
estimate the size of nesting colonies154 - a potentially very valuable use in
the UK for monitoring potential development impact on colonially nesting
species, or the effects of rat eradication. For other species, it seems unlikely
to be successful or requires a great deal of further research - for instance for
nocturnally migrating flocks of birds where call rate is influenced by a wide
range of factors155, or large flocks of wintering geese where a saturation point
in call rate seems likely to be quickly reached.
A word of caution. The approach outlined above can be used to estimate the
number of individuals within the area surveyed by the acoustic sensor and
associated identification algorithms. However, this is not an estimate of density
without a concurrent estimate of the area surveyed. Therefore, estimating
the density of a species requires more complex methods to estimate the area
surveyed. With colony-nesting species, it may be straightforward to be sure
that the entire colony could be detected by the sensors. However, in
wider-landscape situations it is more complicated. In these scenarios, the
cue-counting method above can be used to estimate the number of individuals
detected at each sensor. Without an estimate of the area surveyed, the
cue-counting approach can be assumed to estimate relative abundance at
different sites, but not to estimate density.
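The arithmetic behind cue counting can be sketched as below. This is a deliberately simplified illustration that assumes every cue is detected, there are no false positives, and all individuals share the same average cue rate. The function names and numbers are hypothetical, and published methods add corrections that are not shown here.

```python
def individuals_from_cues(n_cues, cue_rate_per_hour, hours_recorded):
    """Estimate individuals present as cues / (cues per individual per hour * hours)."""
    if cue_rate_per_hour <= 0 or hours_recorded <= 0:
        raise ValueError("cue rate and recording duration must be positive")
    return n_cues / (cue_rate_per_hour * hours_recorded)


def density_per_hectare(n_individuals, area_surveyed_ha):
    """Density is only available when the area surveyed has been estimated."""
    if area_surveyed_ha <= 0:
        raise ValueError("area surveyed must be positive")
    return n_individuals / area_surveyed_ha


# Hypothetical: 1200 detected calls in 6 hours, 20 calls per individual per hour
n = individuals_from_cues(n_cues=1200, cue_rate_per_hour=20, hours_recorded=6)
print(n)                            # estimated individuals within sensor range
print(density_per_hectare(n, 4.0))  # needs a separate estimate of area surveyed
```

Without the second step, the first value can only be compared between sites as a relative abundance, as the text notes.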
A second approach involves the localisation of calling individuals156,157,158. These
locations are then used to estimate distances, to which distance-sampling methods
are applied. Alternatively, the localisation is used to identify the same sound
detected on multiple sensors, and spatially-explicit capture-recapture methods
are used. Both of these are complex approaches that have only been successful
in a small number of studies, with bespoke analytical development for each
situation. Anyone looking to adopt such an approach in the UK is, for the time
being, likely to need to develop their own bespoke methods. However, some
academic studies have been able to successfully localise passerine species in
North America and estimate density, and the scikit-maad package102 in Python
has several useful functions to facilitate the analysis process.
Finally, some species have calls or songs that are unique to individuals, so
abundance can be estimated by identifying the individuals present.
In some cases, such as Cetti’s Warbler159 and Tawny Owl160, the songs contain
unique phrases or ordering that make this process feasible using PAM, albeit
time-consuming. In other species, the individual differences are likely to be
subtle and require better quality recordings than are standardly collected with
ARUs equipped with omnidirectional microphones. As with the first approach,
without estimates of area sampled, this method estimates species abundance,
but not density.
Chapter 6: Soundscape Analysis
Soundscape analysis considers the whole soundscape, combining biophony, geophony and
anthropophony. In doing so, soundscape analysis gives space to less charismatic, unheard, or
understudied species, and can also be used to monitor the manner, extent and perhaps impact
of anthropophony. Currently, soundscape analysis is primarily conducted through the use of
acoustic indices - a family of methods used to quantify variation in acoustic energy and relate that
variation to the sonic environment.
Acoustic indices have been used to monitor species richness161,162, community composition163,
and the relative contributions of biophony, geophony and anthropophony to a soundscape164,
to approximate species abundance165, as a means of more intuitively visualising soundscapes100, and
even as a means of mapping an area’s wildness166. However, acoustic indices have yielded mixed
results. A recent meta-analysis167 exploring their association with biodiversity highlighted a weak
relationship and highly variable effect sizes between many of the most commonly used acoustic
indices and species diversity metrics.
Increasingly, deep-learning based methods are being used as an alternative to acoustic indices
for soundscape analysis. Soundscape descriptors built from deep-learning embeddings make
for informative visualisations and are successful predictors of landscape, biomass, and species168.
Deep-learning embeddings have been shown to outperform acoustic indices on landscape
classification tasks and are more robust to experimental variation80,168. However, they can be
complicated to generate, and their opaque nature makes interpretation difficult (see below).
6.1. Introduction to acoustic indices
Soundscape analysis with acoustic indices represents an entirely different analysis
paradigm to species-specific analyses. This approach pre-supposes that soundscapes
from different locations, habitats, and ecological communities are different and that
those differences are possible to quantify using statistical measures of acoustic energy to
provide ecological information that may complement species data. The varied statistical
methods of measuring variation in acoustic power are collectively termed acoustic
indices161. Of the dozens that have been proposed, most entail calculation of power
ratios between multiple frequency and/or time bins across a recording, creating more
nuanced versions of conventional sound pressure and spectral density metrics. This
approach has been increasingly popular in the academic literature, used both as a means
to characterise soundscapes and the corresponding landscapes, and in some situations
as proxies for traditional biodiversity metrics such as species richness and species
diversity169.
There is a range of reasons for supposing that the spectro-temporal structures of
a soundscape would be reflective of its ecological components. The most developed
theories in this field are the Acoustic Niche Hypothesis and the Acoustic Adaptation
Hypothesis169. The Acoustic Niche Hypothesis170 suggests that species that have evolved
together will also have evolved their own niche in time and frequency space, in which
they can communicate clearly to conspecifics without interference from other species.
For example, birds may call at frequencies lower than more dominant cricket species,
whilst other species may avoid vocalising when the avian dawn chorus is at its peak.
The theory posits that a soundscape with fewer quiet gaps in frequency or time will be
reflective of higher species richness, as more species have co-evolved to fill the space.
Conversely, degraded habitats will show empty gaps in the soundscape, which represent
the niches of species no longer present. The Acoustic Adaptation Hypothesis171 suggests
that species adapt their vocalisations to the habitat they occur in to maximise how far
the signal is carried. Think for example of the high-pitched and sibilant calls of species
such as Common Kingfisher Alcedo atthis, Grey Wagtail Motacilla cinerea,
and White-throated Dipper Cinclus cinclus, as species that have all evolved alongside
noisy fast-flowing water. This convergence of calls due to the impact of habitat gives
the soundscapes of different areas and habitats unique and recognisable properties.
However, it is worth noting that both of these theories are controversial172,173, and there is
evidence both for and against them.
6.2. Acoustic Analysis
Soundscape analysis has several major benefits over species-specific analyses. It is
generally easy and quick to calculate acoustic index values, and computationally
relatively inexpensive. Additionally, taking a soundscape approach reduces the need
for complex algorithms or species identification experts. In combination, this can be
an attractive proposition. However, it is worth noting that what is gained in ease of
application is somewhat lost in ease of interpretability - it is not always clear what
differences or changes in acoustic index values mean ecologically.
Careful use of acoustic indices is therefore necessary. One of the most effective uses of
acoustic indices is to ‘characterise’ soundscapes174,175. Whilst all soundscapes are unique,
those coming from similar places, times, and habitats tend to have similarities. Indices
can be used to quantify these similarities and differences, to identify change, or to make
predictions about the environment in which the recordings were made, without needing
to necessarily understand the underlying causal mechanisms. Indices are also commonly
used as proxies for traditional biodiversity metrics, although this approach may only be
reliable under certain conditions (see section 6.5 for more details) and it is necessary to
ensure there is a great deal of ground-truthed data also available.
Acoustic indices range in complexity - the most basic are simple audio descriptors (e.g.
zero-crossing rate, counts of acoustic events, background noise levels), whilst others
have been designed heuristically to capture the intensity of biophony, ratio of biophony
to anthropophony or distribution of energy across the spectrum under the assumption
that this may reflect composition of the acoustic community.
Acoustic indices are not a magic solution and must be applied and interpreted in
context. For example, a measure of the number of acoustic events in an urban park
is unlikely to say very much at all about the number or types of non-human species
present in the park as it is likely to be dominated by anthropogenic sounds. However,
there may well be a relationship between the number of acoustic events and the
number of several seabird species breeding at a colonial nesting site, for instance176.
More heuristic indices may reflect some aspect of ecology. However, it is necessary to
check the assumptions of the individual index to ensure that the circumstances it was
designed to reflect pertain to the data it is applied to, and to “sound-truth” it against some
form of manually assessed data.
Table 6.1. An overview of some of the most commonly used acoustic indices adapted from
Bradfer-Lawrence et al (in prep). Each entry gives the index and original reference, the
index description, and the associated soundscape patterns.

Acoustic Complexity Index (ACI)177
Index description: Determines the difference in amplitude between one time sample and the
next within a frequency band, relative to the total amplitude within that band. The concept
underlying this index is that biophony is often of variable intensity, whilst anthrophony
such as engine noise is generally constant. Acoustically rich habitats may produce low ACI
values if intensity does not vary greatly over time even if there are multiple contributing
sound sources. It is also impervious to constant biophony such as tropical insect noise.
Soundscape patterns: High values might indicate storms, intermittent rain drops falling
from vegetation, stridulating insects, or high levels of avian biophony. Low values are
associated with constant noise that fills the whole spectrogram, for example from loud
technophony or excessive cicada chorus. ACI value is cumulative; longer recordings will
give higher values. Taking a mean is sensible.

Acoustic Diversity Index (ADI)178
Index description: Derived by calculating the Shannon entropy of the distribution of
acoustic energy among frequency bands. ADI ranges from 0 to the log of the number of
frequency bins used. ADI will increase with greater evenness of energy among frequency
bands. An even signal will give a high value (could be noisy across frequency bands or
completely silent) and a pure tone (i.e. all energy in one frequency band) will be closer
to 0.
Soundscape patterns: High values associated with high levels of geophony and technophony,
which fill the spectrogram with noise, or from very quiet recordings with little variation
among frequency bands. Lowest values reflect dominance by a narrow frequency band, such as
nocturnal insect noises in the tropics.

Acoustic Evenness (AEve)178
Index description: Derived by calculating the Gini coefficient of the distribution of
acoustic energy among frequency bands. Values lie between 0 and 1. Higher values indicate
greater unevenness among frequency bands, i.e. most of the sound is in a restricted
frequency range.
Soundscape patterns: Inverse of the patterns in ADI. High values identify spectrograms
dominated by a narrow frequency band. Low values indicate many evenly-occupied frequency
bands, although this can also occur in near silent recordings.

Activity (ACT)100
Index description: Proportion of values in the noise-reduced decibel envelope that exceed
3 dB.
Soundscape patterns: Higher values indicate greater acoustic activity.

Acoustic Space Use (ASU)179
Index description: A matrix derived by calculating the number of time-frequency bins (of
given duration and frequency bin size) that are ‘active’ - e.g. surpass a predetermined
amplitude threshold.
Soundscape patterns: Higher values reflect the times and frequencies when acoustic
activity is high.

Background noise (BGN)100
Index description: The mode of the sound energy distribution of the waveform envelope.
Soundscape patterns: Higher values indicate a greater level of acoustic energy, such as
during rainstorms.

Bioacoustic Index (Bio)180
Index description: Derived from the sum of the mean amplitudes of individual frequency
bands between 2 – 8 kHz minus that of the quietest frequency band.
Soundscape patterns: High values are produced by recordings with high amplitude and
greater disparity between loudest and quietest frequency bands. Low values arise when
there is no sound between 2 and 8 kHz.

Spectral entropy (Hs)163
Index description: Calculated from the relative mean amplitude of individual frequency
bands of a spectrogram. Uses the Shannon diversity index on those values as a measure of
evenness. Scaled to range between 0 and 1.
Soundscape patterns: Larger values imply a more even distribution of acoustic energy
among frequency bands.

Temporal entropy (Ht)163
Index description: Calculated with the relative values of the amplitude envelope. Uses the
Shannon diversity index on those values as a measure of evenness. Scaled to range between
0 and 1.
Soundscape patterns: Larger values imply greater temporal evenness.

Acoustic entropy (H)163
Index description: Derived by multiplying spectral entropy (Hf) and temporal entropy (Ht),
again scaled to range between 0 and 1. Within recording sets this tends to be dominated
by Hf.
Soundscape patterns: Higher values reflect greater evenness of amplitude among frequency
bands (from either noisy or completely silent soundscapes). Lower values indicate acoustic
energy concentrated in a narrow frequency range.

Events per second (EVN)100
Index description: Number of times per second the noise-reduced decibel envelope crosses a
3 dB threshold. Given as the mean per-second value over the recording.
Soundscape patterns: Higher values indicate more frequent changes in amplitude.

Median of the amplitude envelope (M)181
Index description: Louder recordings will give higher values, and so reflect noisier
soundscapes.
Soundscape patterns: High values associated with high amplitude events such as storms.
Low levels from very quiet recordings.

Normalised Difference Soundscape Index (NDSI)164
Index description: This index relies on the theoretical frequency split between
anthrophony (1 – 2 kHz) and biophony (2 – 8 kHz) (although this may not hold in many
systems, see text). NDSI is calculated from the power spectral density of the largest
biophony band against that of the anthrophony band: (bio - anthro) / (bio + anthro).
NDSI ranges from -1 to +1, with +1 indicating no sound in the anthrophony range.
Soundscape patterns: High values reflect large amounts of sound somewhere in the 2 – 8 kHz
range, with minimal noise between 1 – 2 kHz. Low values associated with more noise in the
1 – 2 kHz band.

Number of frequency peaks (NP)182
Index description: The number of individual peaks in the mean amplitude spectrum of a
recording, scaled between 0 and 1. A peak is defined as having an amplitude slope > 0.01
and being > 200 Hz from the next.
Soundscape patterns: Higher diversity of sounds should generate a higher number of peaks,
although in highly saturated soundscapes there may be very few peaks if these sounds
overlap.

Signal-to-Noise ratio (SNR)100
Index description: The difference between the maximum dB value in the decibel envelope and
the Background Index (see above).
Soundscape patterns: Higher values should reflect a transient sound event with a much
higher amplitude above the background noise level.

Soundscape Saturation (Sm)183
Index description: The proportion of frequency bins that are acoustically active per
minute. Derived from the power (maximum amplitude in dB) in each frequency band minus the
modal amplitude of that same frequency band. If these values exceed a threshold then the
band is active.
Soundscape patterns: Higher values indicate a more active spectrogram; the soundscape is
more saturated.
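Two of the calculations in Table 6.1 are simple enough to sketch directly. The Python below shows the NDSI ratio and a Shannon entropy of energy across frequency bands, the quantity on which ADI is based. It is a bare illustration working on hypothetical band energies; the published implementations in the R and Python packages include further steps (such as band selection and thresholds) that are not reproduced here.

```python
import math


def ndsi(bio_power, anthro_power):
    """(bio - anthro) / (bio + anthro), ranging from -1 to +1."""
    total = bio_power + anthro_power
    if total == 0:
        raise ValueError("no power in either band")
    return (bio_power - anthro_power) / total


def shannon_entropy(band_energy):
    """Shannon entropy (natural log) of the proportion of energy in each band."""
    total = sum(band_energy)
    proportions = [e / total for e in band_energy if e > 0]
    return sum(p * math.log(1 / p) for p in proportions)


# Hypothetical band energies (arbitrary units)
print(ndsi(bio_power=3.0, anthro_power=1.0))       # biophony-dominated
print(shannon_entropy([1.0, 1.0, 1.0, 1.0]))       # even energy: log(4)
print(shannon_entropy([5.0, 0.0, 0.0, 0.0]))       # single band: 0
```

Note that the even-energy case gives the same high value whether the bands are uniformly noisy or uniformly quiet, which is the interpretation caveat flagged in the table.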
6.3. Computation of acoustic indices
Computing acoustic indices is relatively straightforward, so it is somewhat surprising
that there are not more user-friendly programs available to calculate them. The Analysis
Programs software package from Queensland University of Technology Ecoacoustics
Lab101 provides a wide range of index calculations, whilst Arbimon35 offers some
soundscape calculations akin to acoustic space use. Kaleidoscope Pro51 offers a small
number of simple acoustic descriptors, but none of the more commonly used heuristic
indices.
Fortunately, it is simple to calculate acoustic indices using the R or Python coding
languages – an index value for a single sound file can be generated in just two lines
of code. The first line reads in a sound file and the second calculates the index. The
excellent seewave94 and soundecology184 packages in R and the scikit-maad package102
in Python offer functions that will compute a wide range of the most commonly used
acoustic indices very simply. For other less common indices, code is often freely shared
as supplementary information in associated scientific publications. In general, acoustic
indices are calculated over 1 minute sound files, with any indices that generate values at
a finer temporal scale averaged to that duration, and that is the recommendation here.
Technically it is possible to calculate indices over longer time periods. If your ecological
hypotheses or questions motivate this, it can be advantageous to calculate variance,
median, minimum and maximum as well as mean for frame-based indices (ACI, RMS,
ZCR etc.). Note that this will reduce the sample size of the data collected, and likely slow
down computation time, as reading in and calculating indices over larger sound files
generally takes disproportionately longer, and could use up a high proportion of RAM
on smaller computers.
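The idea of summarising a frame-based index can be sketched with one of the simplest descriptors, zero-crossing rate (ZCR). The example below uses only the Python standard library and a synthetic tone as a stand-in for a real recording. The frame length and signal are hypothetical choices, and in practice the audio would be read and the indices computed with one of the packages named above.

```python
import math
import statistics


def zero_crossing_rate(frame):
    """Fraction of consecutive sample pairs whose sign differs."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def summarise_frames(samples, frame_len):
    """Compute ZCR per non-overlapping frame, then summarise across frames."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    values = [zero_crossing_rate(f) for f in frames]
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "variance": statistics.pvariance(values),
        "min": min(values),
        "max": max(values),
    }


# Synthetic stand-in for a recording: one second of a 1 kHz tone sampled at 8 kHz
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate + math.pi / 8)
        for n in range(sample_rate)]
summary = summarise_frames(tone, frame_len=800)
print(summary)
```

For a steady tone the frame values are identical, so the variance is zero; real soundscapes would show spread across frames, which is why the extra summary statistics can be informative.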
Figure 6.1. An example of the advantages and disadvantages of acoustic indices with real
data. The top panel shows acoustic index values calculated for 600 1-minute audio files - recorded
every minute between 20:00-06:00 on the night of 3rd-4th May 2020 in the Lower Derwent
Valley NNR. Acoustic Complexity Index (ACI) and the Bioacoustic Index (Bio) were calculated
using the soundecology package in R, with Acoustic Entropy (H) using the seewave package.
All indices were calculated between 0.2-8 kHz and were centred between 0-1 afterwards.
Calculation took <5 minutes. The quiet period before midnight is clearly visible, as is the onset
of the dawn chorus shortly after 03:00. The bottom panel illustrates some of the complexity in
interpreting index values. A) shows the quiet period, without a strong response from any of the
indices. B) shows the minimum H value, with very little increase in ACI or Bio values - caused by
strong wind at low frequencies, C) shows the maximum ACI value, paired with small changes in
ACI and Bio, caused by heavy rainfall across the frequency range and D) the maximum Bio value,
with similar increase in ACI but no change in H - caused by bird song. It is worth noting however
that the sound file at D only contains sound from a single species - Eurasian Skylark Alauda
arvensis, that has a broadband song, conspicuous vocal mimicry and high temporal variation.
6.4. Sampling eort to capture soundscape variability
Generally, it is not necessary to collect data continuously for soundscape analysis.
However, it is important to be condent that the recordings have captured the naturally
occurring variation in a soundscape. This should be checked before any further analyses,
such as through comparisons among sites. As a rule of thumb, indices should be
calculated from at least 120 hours of recordings from the target period - e.g. if a study
wished to compare index values from dawn at dierent seasons, it would be necessary
to record for 3 hours each morning for 40 days in each of winter, spring, summer,
and autumn. However, it is worth noting that the 120-hour threshold was estimated
from data collected in the tropics, and using indices calculated across both the entire
frequency range and the entire diel cycle. It is likely that the number of hours needed
to capture biophonic soundscape variability in the UK is considerably less, and that
measuring indices at narrower time and frequency bins could reduce this still further.
To be condent that the subsequently derived indices values are still representative
of the variation in the soundscape would require a more formal assessment of survey
completeness.
One option for assessing the level of precision with which soundscapes have been
captured involves assessing reduction in the variance of the cumulative standard error
of acoustic indices185. Standard errors stabilise when natural variability rather than data
paucity is driving index variance. Bradfer-Lawrence et al (2019)185 found that variance
in indices’ standard errors reached ~10% with 120 hours (five days) of continuous
recording. Although the quantity of recordings required to reach this 10% threshold will
vary among systems, standard error variance will follow a similar shape of exponential
decline with increasing quantity of recordings (TBL pers. obs.). Logistical constraints may
necessitate deployments of less than 120 hours, but acousticians should strive to ensure
they have at least passed the modelled inflection point, and recognise that shorter
deployments result in less comprehensive capture of the soundscape.
The steps to calculate reduction in variance are as follows:
1. Generate acoustic indices values for each recording.
2. Randomly assign the recordings from a single site into groups*. Each group
comprises recordings equivalent to one hour of deployment time.
3. Calculate the standard error of each index at each site. Standard error is cumulative:
use progressively larger quantities of recordings by adding data for an additional
group for each calculation. For example, standard error for the third deployment hour
is calculated using the index values from the first three groups, for the fourth hour
with index values from the first four groups, and so on.
4. For each acoustic index, divide each group’s standard error by the maximum value
across all groups from that site’s recordings1, to give proportions of the maximum.
5. Quantify the reduction in variance with increasing quantities of recordings using
non-linear regression with a Weibull distribution. The forthcoming ‘AcousticIndices’
R package includes functions to automate standard error calculations and the
non-linear modelling.
* Or a single site-by-deployment combination if there was more than one deployment at
a site.
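Steps 2 to 4 above can be sketched in a few lines of Python for a single index at a single site. This is an illustrative reading of the steps using only the standard library, not the forthcoming package's implementation. The function name, the simulated index values and the choice of six recordings per hour are assumptions, and the Weibull regression in step 5 is not shown.

```python
import math
import random
import statistics


def cumulative_standard_errors(index_values, recordings_per_hour, seed=0):
    """Cumulative standard errors as proportions of their maximum (steps 2-4).

    index_values: one acoustic index value per recording, all from one site.
    recordings_per_hour: recordings making up one hour of deployment (>= 2).
    """
    if recordings_per_hour < 2:
        raise ValueError("need at least two recordings per group")
    values = list(index_values)
    random.Random(seed).shuffle(values)       # step 2: random assignment to groups
    n_groups = len(values) // recordings_per_hour
    if n_groups < 1:
        raise ValueError("not enough recordings for one group")
    errors = []
    for g in range(1, n_groups + 1):          # step 3: add one group at a time
        subset = values[: g * recordings_per_hour]
        errors.append(statistics.stdev(subset) / math.sqrt(len(subset)))
    peak = max(errors)
    if peak == 0:
        raise ValueError("index values are constant")
    return [e / peak for e in errors]         # step 4: proportion of the maximum


# Hypothetical data: 120 one-minute recordings, six recordings per deployment hour
rng = random.Random(1)
values = [rng.gauss(0.5, 0.1) for _ in range(120)]
proportions = cumulative_standard_errors(values, recordings_per_hour=6)
print([round(p, 2) for p in proportions])
```

Plotting these proportions against deployment hours would give the declining curve to which the non-linear model in step 5 is fitted.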
6.5. Ecological Analysis
6.5.1. Indices to characterise landscapes
Indices have been used in a number of studies to successfully characterise
soundscapes. What this means in practice is that each soundscape is unique,
but that soundscapes from similar habitats have enough in common that
machine-learning algorithms can differentiate them based on index values
generated from recordings in those habitats. In general, it is a good idea to use
a suite of acoustic indices to characterise soundscapes in order to represent
different aspects of the sound present in the recordings. Commonly used
indices for this purpose include Acoustic Complexity Index177, the Bioacoustic
Index180, Acoustic Entropy163, Acoustic Evenness/Diversity178 and the number of
frequency peaks182, although a wide range of others can and perhaps should
be used186.
Additionally, the sensitivity of soundscape characterisation can be improved
by calculating indices over a range of time and frequency bins187. These
subsets of the entire soundscape are best selected by choosing periods
and frequencies that are representative of when the target community or
taxa are likely to have a strong presence in the soundscape - for instance
around dawn and between 0.5-10 kHz for a study targeting birds. This is
because soundscapes change across the diel cycle - think of the difference
in sound between the dawn chorus and the middle of the afternoon - and at
different frequencies - the insects stridulating at higher frequencies may
differ more between sites than mammals and birds at lower frequencies.
Calculating single values across the entire temporal and frequency ranges
masks these subtle differences, so it is better to generate index values at
a range of different frequency and temporal bins, ideally based on prior
ecological knowledge of the timings and frequencies at which species groups
communicate. If taking this approach, ensure that you adjust the parameters
of the indices accordingly, as some (e.g. NDSI, ADI, BI) have default values,
such as frequency limits, that may not match the chosen bins.
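As an illustration of restricting an index to a time and frequency bin, the sketch below computes a normalised spectral entropy over a chosen frequency band, and checks whether a recording falls inside a chosen time window. It is a simplified example rather than the implementation used by any particular package: the function names, the input format (a list of frequencies with matching power values) and the band and window limits are all assumptions.

```python
import math
from datetime import time


def band_limited_spectral_entropy(freqs_hz, power, f_min_hz, f_max_hz):
    """Shannon entropy of the power spectrum within a frequency band,
    scaled to 0-1 by the log of the number of bins in the band."""
    band = [p for f, p in zip(freqs_hz, power) if f_min_hz <= f <= f_max_hz]
    total = sum(band)
    if len(band) < 2 or total <= 0:
        raise ValueError("band must hold at least two bins with some energy")
    probabilities = [p / total for p in band]
    entropy = -sum(p * math.log(p) for p in probabilities if p > 0)
    return entropy / math.log(len(band))


def in_time_window(recording_start, window_start, window_end):
    """True if a recording's start time falls within the window."""
    return window_start <= recording_start <= window_end


# Hypothetical bird-focused bin: 0.5-10 kHz, 04:00-08:00.
freqs = [0, 2000, 4000, 6000, 12000]
power = [9.0, 1.0, 1.0, 1.0, 9.0]
if in_time_window(time(5, 30), time(4, 0), time(8, 0)):
    dawn_entropy = band_limited_spectral_entropy(freqs, power, 500, 10000)
```

Energy outside the band (here the strong components at 0 Hz and 12 kHz) has no effect on the result, which is the point of binning.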
Once acoustic index values have been generated, these values can be used
in standard ecological analyses - exploratory ordination, classification or
regression models. Random Forests188 are a common choice: they are an
ensemble learning method that can be used for classification or regression
on multivariate data, and as a non-parametric method with no formal
distributional assumptions they can handle skewed as well as categorical
data.
These algorithms are ‘trained’ on a subset of labelled data (e.g. a series of index
values which are labelled as having come from a certain habitat), learning
what aspects of the data are most characteristic of that particular habitat.
Having done so, it is then possible to use the algorithm to make predictions
as to whether a new recording belongs to that habitat or not. In theory, this
could provide an indication of habitat quality - several studies have shown that
degraded or secondary forest in the tropics can be distinguished accurately
from undisturbed primary forest using acoustic indices162,187,189, as well as
broader land-use types such as woodland and farmland162. In some situations
acoustic indices have been shown to predict habitat type more accurately
than species lists, suggesting that soundscape analyses may provide
complementary ecological information to targeted analyses162. This possibility
requires further investigation. If exploring this approach in new habitats,
ecological ground-truthing should be conducted.
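The train-then-predict workflow described above can be illustrated with a deliberately simple stand-in. The sketch below uses a nearest-centroid classifier, which is much cruder than a Random Forest and is swapped in here only so the example runs with the Python standard library. The function names, index values and habitat labels are all made up for illustration.

```python
import math


def train_centroids(labelled_samples):
    """Learn one mean index vector (centroid) per habitat label.

    labelled_samples: list of (index_vector, habitat_label) pairs.
    """
    sums, counts = {}, {}
    for vector, label in labelled_samples:
        if label not in sums:
            sums[label] = [0.0] * len(vector)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vector)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in sums[label]]
        for label in sums
    }


def predict_habitat(centroids, vector):
    """Assign a new recording to the habitat with the closest centroid."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vector))


# Illustrative (invented) values for two indices per recording.
training = [
    ([0.82, 0.91], "woodland"),
    ([0.78, 0.88], "woodland"),
    ([0.41, 0.55], "farmland"),
    ([0.45, 0.60], "farmland"),
]
model = train_centroids(training)
predicted = predict_habitat(model, [0.75, 0.85])
```

In practice the index values would usually be standardised first so that no single index dominates the distance, and predictions would be checked against held-out labelled recordings.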
This research also highlights the potential to track habitat changes over time
through soundscape analysis. For example, we might expect the soundscape
of a rewilding project on farmland to start out with acoustic index values
similar to neighbouring farmland, but as scrub habitat and then woodland
develops, index values should grow closer to those of neighbouring natural
habitats. This is potentially a cost-effective and efficient way of tracking
long-term changes in a landscape.
The potential for this approach is supported by a study demonstrating
that three acoustic indices - acoustic richness, median amplitude and
temporal entropy - successfully characterised the difference between islands
with invasive predators still present, and those in which predators had
been removed and had recovering populations of Leach’s Storm-petrels
Oceanodroma leucorhoa190. Elsewhere, acoustic indices have been used to
assess the impact of construction and drilling at a gas platform development
in tropical forest113.
6.5.2. Indices as proxies for biodiversity metrics
Early ecoacoustic research investigated the potential for acoustic indices as
proxies for biodiversity metrics; however, a recent review reveals mixed success.
This approach is heavily grounded in the Acoustic Niche Hypothesis, and relies
on the idea that a more ‘complete’ soundscape entails more species being
present. In general, heuristic indices designed to capture this soundscape
‘completeness’ are modelled against some metric of biodiversity, often species
richness, derived from traditional survey methods. Simple Spearman’s Rank
correlations and linear models are often used to do so, although these are
too simple to capture what is likely a complex relationship. Under
this approach, in order to try and eliminate masking sounds from sources not
relevant to the biodiversity metrics, it is important to only calculate acoustic
indices at appropriate times and frequency bins187,191,192.
Overall, whilst some studies have successfully shown a strong relationship
between acoustic indices and species richness, there is a great deal more
research needed before we understand the relationship between the number
and abundance of species present in an area and the emergent soundscape193.
In particular, the impact of species with song mimicry is poorly understood, as
is the level at which acoustic indices may saturate - e.g. it may be possible to
establish dierences in the soundscape between one and ten vocalising frogs,
but not between 100 and 1000.
In contrast, the relatively simple measure of soundscape saturation has been
effectively used to measure community turnover in selectively-logged tropical
forest in Papua New Guinea194. Here, the spectrogram was gridded, and each
cell was considered to be acoustically active when its amplitude
passed a threshold. These measures of soundscape saturation were then used
to measure the dissimilarity between the different forest types, finding that
logging leads to increasing homogeneity of the soundscape, with a loss
of characteristic dawn and dusk choruses. This approach (and the conceptually
similar Acoustic Space Use) could be applied to large restoration projects,
especially when it is ecologically rational to hypothesise that restoration will
increase ecological and soundscape diversity whilst unrestored areas will
remain homogeneous.
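The gridding-and-threshold idea can be sketched as follows. This is one simple reading of soundscape saturation, not the exact procedure of the cited study: the function name, the input layout (one row per time step, one column per frequency bin, values in dB) and the threshold are assumptions for the example.

```python
def soundscape_saturation(spectrogram_db, threshold_db):
    """Proportion of frequency bins that are acoustically active
    at each time step of a gridded spectrogram.

    spectrogram_db: list of rows, one per time step; each row holds
    the level (dB) of every frequency bin for that time step.
    """
    saturation = []
    for time_step in spectrogram_db:
        # A cell counts as active when its level exceeds the threshold.
        active_cells = sum(1 for level in time_step if level > threshold_db)
        saturation.append(active_cells / len(time_step))
    return saturation


# Two time steps, four frequency bins each (invented levels).
example = [
    [-60.0, -40.0, -30.0, -70.0],
    [-80.0, -80.0, -80.0, -80.0],
]
values = soundscape_saturation(example, threshold_db=-50.0)
```

Saturation profiles produced this way for different sites or forest types can then be compared with a dissimilarity measure of the analyst's choice.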
6.5.3. Deep Learning for Soundscape Analysis
Just as deep neural network models can be used to support automated
species detection, deep learning methods can be used to learn soundscape
representations168. Rather than training a new model from data, numerous
pre-trained models are now readily shared. It has been shown that a large
model trained on hundreds of hours of labelled YouTube data (VGGish) can be
applied to ecological tasks. How does this work? The power of deep learning
models comes from their many layers - VGGish has 24 layers, for example.
Typically, data are presented to an input layer and predictions read from the
output layer. Under this approach, the ‘hidden layers’ are inspected instead, and
the representations the model has learned in order to make predictions are
adopted as a ‘learned representation’, akin to a multivariate acoustic index.
These learned representations have been used to characterise soundscapes, to
detect anomalous sound events such as gunshots or chainsaws, and to predict
with a high degree of accuracy the presence or absence of a range of forest
indicator species in Borneo195.
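The difference between reading the output layer and reading a hidden layer can be shown with a toy network. The sketch below is not VGGish and its weights are invented: it is a two-layer network written in plain Python, with hypothetical names throughout, meant only to show where a learned representation is taken from.

```python
def relu(values):
    """Rectified linear activation applied to each value."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer; weights holds one row per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def forward(features, params):
    """Return both the hidden-layer activations and the final output."""
    hidden = relu(dense(features, params["w_hidden"], params["b_hidden"]))
    output = dense(hidden, params["w_out"], params["b_out"])
    return hidden, output


# Invented weights standing in for a pre-trained model.
TOY_PARAMS = {
    "w_hidden": [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]],
    "b_hidden": [0.0, 0.1],
    "w_out": [[1.0, -1.0]],
    "b_out": [0.0],
}

# The hidden activations are kept as the learned representation
# (embedding); the output-layer prediction is ignored.
embedding, prediction = forward([1.0, 2.0, 3.0], TOY_PARAMS)
```

With a real pre-trained model the embedding would have many more dimensions, and one such vector would be produced for each short segment of audio.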
Pre-trained models can be further “tuned” with local soundscape recordings
using self-supervised methods - greatly reducing the human effort in labelling
data and increasing the accuracy of the representation. However, as with
all things, there are trade-offs. Learned representations can be very powerful,
as they can provide detailed representations and are not based on human
assumptions about potential links between soundscape facets and ecology.
However, deep learning approaches are notoriously opaque, making it difficult
to interpret results. Current research applies methods from visual learning to
investigate the relationships between these abstract learned representations
and the more humanly accessible spectrogram representations they are
trained on. Current approaches provide potential for monitoring
change; in the future we may gain insight into the ecological significance of this
change.
Research in this eld is fast-moving and implementation requires strong
technical skills. However, these advances will likely support accessible,
interactive interfaces for data exploration in the near future.
Figure 6.2. Passive acoustic monitoring can be an effective way to monitor wildlife
on farmland. Credit: Oliver Metcalf
References
1. IPBES (2019). Summary for policymakers of the
global assessment report on biodiversity and
ecosystem services of the Intergovernmental
Science-Policy Platform on Biodiversity and
Ecosystem Services. S. Díaz, J. Settele, E. S. Brondízio,
H. T. Ngo, M. Guèze, J. Agard, A. Arneth, P. Balvanera,
K. A. Brauman, S. H. M. Butchart, K. M. A. Chan, L.
A. Garibaldi, K. Ichii, J. Liu, S. M. Subramanian, G.
F. Midgley, P. Miloslavich, Z. Molnár, D. Obura, A.
Pfaff, S. Polasky, A. Purvis, J. Razzaque, B. Reyers, R.
Roy Chowdhury, Y. J. Shin, I. J. Visseren-Hamakers,
K. J. Willis, and C. N. Zayas (eds.). IPBES secretariat,
Bonn, Germany. 56 pages. https://doi.org/10.5281/
zenodo.3553579
2. Davis, J. (2020). UK has ‘led the world’ in destroying
the natural environment. Natural History Museum.
https://www.gov.uk/guidance/biodiversity-metric-
calculate-the-biodiversity-net-gain-of-a-project-or-
development
3. Planning Advisory Service (No date). Biodiversity Net
Gain for local authorities. https://www.local.gov.uk/
pas/topics/environment/biodiversity-net-gain-local-
authorities
4. Rewilding Britain (No date). What is rewilding?
https://www.rewildingbritain.org.uk/explore-
rewilding/what-is-rewilding
5. Conference of the Parties to the Convention on
Biological Diversity (2022). Kunming-Montreal Global
Biodiversity framework. https://www.cbd.int/doc/c/
e6d3/cd1d/daf663719a03902a9b116c34/cop-15-l-
25-en.pdf
6. Lawton, J.H., Brotherton, P.N.M., Brown, V.K., Elphick,
C., Fitter, A.H., Forshaw, J., Haddow, R.W., Hilborne, S.,
Leafe, R.N., Mace, G.M., Southgate, M.P., Sutherland,
W.J., Tew, T.E., Varley, J., & Wynne, G.R. (2010) Making
Space for Nature: a review of England’s wildlife sites
and ecological network. Report to Defra.
7. Watts, N., Amann, M., Arnell, N., Ayeb-Karlsson, S.,
Beagley, J., Belesova, K., … Costello, A. (2021). The
2020 report of The Lancet Countdown on health and
climate change: responding to converging crises.
The Lancet. doi:10.1016/S0140-6736(20)32290-X
8. Berger-Tal, O., & Lahoz-Monfort, J. J. (2018).
Conservation technology: The next generation.
Conservation Letters, 11(6), e12458. doi:10.1111/
conl.12458
9. Lahoz-Monfort, J. J., & Magrath, M. J. L. (2021,
October 4). A Comprehensive Overview of
Technologies for Species and Habitat Monitoring
and Conservation. BioScience. Oxford Academic.
doi:10.1093/biosci/biab073
10. Priyadarshani, N., Marsland, S., & Castro, I. (2018).
Automated birdsong recognition in complex
acoustic environments: a review. Journal of Avian
Biology, 49(5). doi:10.1111/JAV.01447
11. Wearn, O. R., Freeman, R., & Jacoby, D. M. P. (2019).
Responsible AI for conservation. Nature Machine
Intelligence, 1, 72–73. doi:10.1038/
s42256-019-0022-7
12. Sugai, L. S. M., Desjonquères, C., Silva, T. S. F., &
Llusia, D. (2020, November 13). A roadmap for
survey designs in terrestrial acoustic monitoring.
(N. Pettorelli & V. Lecours, Eds.), Remote Sensing in
Ecology and Conservation. doi:10.1002/rse2.131
13. Chambert, T., Waddle, J. H., Miller, D. A. W., Walls,
S. C., & Nichols, J. D. (2018). A new framework for
analysing automated acoustic species detection
data: Occupancy estimation and optimization of
recordings post-processing. Methods in Ecology
and Evolution, 9(3), 560–570. doi:10.1111/2041-
210X.12910
14. Gibb, R., Browning, E., Glover-Kapfer, P., & Jones, K.
E. (2019). Emerging opportunities and challenges
for passive acoustics in ecological assessment and
monitoring. Methods in Ecology and Evolution.
Wiley/Blackwell (10.1111). doi:10.1111/2041-
210X.13101
15. Sugai, L. S. M., Silva, T. S. F., Ribeiro, J. W., & Llusia,
D. (2019). Terrestrial Passive Acoustic Monitoring:
Review and Perspectives. BioScience. Oxford
University Press. doi:10.1093/biosci/biy147
16. Teixeira, D., Maron, M., & Rensburg, B. J. (2019).
Bioacoustic monitoring of animal vocal behavior for
conservation. Conservation Science and Practice,
1(8). doi:10.1111/csp2.72
17. 17. Shoneld, J., & Bayne, E. M. (2017). Autonomous
recording units in avian ecological research: current
use and future applications. Avian Conservation and
Ecology, 12(1), art14. doi:10.5751/ACE-00974-120114
18. Darras, K., Batáry, P., Furnas, B., Celis-Murillo, A., Van
Wilgenburg, S. L., Mulyani, Y. A., & Tscharntke, T.
(2018). Comparing the sampling performance of
sound recorders versus point counts in bird surveys:
A meta-analysis. Journal of Applied Ecology, 55(6),
2575–2586. doi:10.1111/1365-2664.13229
19. Darras, K., Batáry, P., Furnas, B. J., Grass, I., Mulyani,
Y. A., & Tscharntke, T. (2019). Autonomous sound
recording outperforms human observation for
sampling birds: a systematic map and user guide.
Ecological Applications, 29(6). doi:10.1002/eap.1954
20. Greenhalgh, J. A., Genner, M. J., Jones, G., &
Desjonquères, C. (2020). The role of freshwater
bioacoustics in ecological research. Wiley
Interdisciplinary Reviews: Water, 7(3), 1–20.
doi:10.1002/wat2.1416
21. Abrahams, C., Desjonquères, C., & Greenhalgh,
J. (2021). Pond Acoustic Sampling Scheme: A
draft protocol for rapid acoustic data collection in
small waterbodies. Ecology and Evolution, 11(12),
7532–7543. doi:10.1002/ece3.7585
22. Merchant, N. D., Fristrup, K. M., Johnson, M. P., Tyack,
P. L., Witt, M. J., Blondel, P., & Parks, S. E. (2015).
Measuring acoustic habitats. Methods in Ecology
and Evolution, 6(3), 257–265. doi:10.1111/2041-
210X.12330
23. Alcocer, I., Lima, H., Sugai, L. S. M., & Llusia, D. (2022).
Acoustic indices as proxies for biodiversity: a meta-
analysis. Biological Reviews, (1), 0–000. doi:10.1111/
brv.12890
24. Rhinehart, T. A., Chronister, L. M., Devlin, T.,
& Kitzes, J. (2020, July 1). Acoustic localization of
terrestrial wildlife: Current practices and future
opportunities. Ecology and Evolution. John Wiley &
Sons, Ltd. doi:10.1002/ece3.6216
25. Marques, T. A., Thomas, L., Martin, S. W., Mellinger,
D. K., Ward, J. A., Moretti, D. J., Tyack, P. L. (2013).
Estimating animal population density using passive
acoustics. Biological Reviews, 88(2), 287–309.
doi:10.1111/brv.12001
26. Pérez-Granados, C., & Traba, J. (2021, March 8).
Estimating bird density using passive acoustic
monitoring: a review of methods and suggestions
for further research. Ibis. Wiley. doi:10.1111/ibi.12944
27. Sueur, J., & Farina, A. (2015). Ecoacoustics: the
Ecological Investigation and Interpretation of
Environmental Sound. Biosemiotics, 8(3), 493–502.
doi:10.1007/s12304-015-9248-x
28. Williams, E. M., O’Donnell, C. F. J., & Armstrong, D. P.
(2018). Cost-benet analysis of acoustic recorders
as a solution to sampling challenges experienced
monitoring cryptic species. Ecology and Evolution,
8(13), 6839–6848. doi:10.1002/ece3.4199
29. Moussy, C., Bureld, I. J., Stephenson, P. J., Newton, A.
F. E., Butchart, S. H. M., Sutherland, W. J., … Donald,
P. F. (2022, February 17). A quantitative global review
of species population monitoring. Conservation
Biology. doi:10.1111/cobi.13721
30. Beason, R. D., Riesch, R., & Koricheva, J. (2019).
AURITA: an aordable, autonomous recording device
for acoustic monitoring of audible and ultrasonic
frequencies. Bioacoustics, 28(4), 381–396. doi:10.108
0/09524622.2018.1463293
31. Hill, A. P., Prince, P., Piña Covarrubias, E., Doncaster,
C. P., Snaddon, J. L., & Rogers, A. (2018). AudioMoth:
Evaluation of a smart open acoustic device for
monitoring biodiversity and the environment.
Methods in Ecology and Evolution, 9(5), 1199–1211.
doi:10.1111/2041-210X.12955
32. Sethi, S. S., Ewers, R. M., Jones, N. S., Orme, C. D. L., &
Picinali, L. (2018). Robust, real-time and autonomous
monitoring of ecosystems with an open, low-
cost, networked device. Methods in Ecology and
Evolution, 9(12), 2383–2387. doi:10.1111/2041-
210X.13089
33. Whytock, R. C., & Christie, J. (2017, November). Solo:
an open source, customizable and inexpensive
audio recorder for bioacoustic research. (K.
Jones, Ed.), Methods in Ecology and Evolution.
doi:10.1111/2041-210X.12678
34. Brown, A., Garg, S., & Montgomery, J. (2020).
AcoustiCloud: A cloud-based system for managing
large-scale bioacoustics processing. Environmental
Modelling and Software, 131, 104778. doi:10.1016/j.
envsoft.2020.104778
35. RFCx Arbimon platform www.arbimon.rfcx.org
36. Alldredge, M. W., Pollock, K. H., Simons, T. R., Collazo,
J. A., & Shriner, S. A. (2007). Time-of-detection
method for estimating abundance from point-count
surveys. Auk, 124(2), 653–664. doi:10.1642/0004-
8038
37. Sauer, J. R., Peterjohn, B. G., & Link, W. A. (1994).
Observer Dierences in the North American
Breeding Bird Survey. The Auk, 111(1), 50–62.
doi:10.2307/4088504
38. Campbell, M., & Francis, C. M. (2011, April). Using
stereo-microphones to evaluate observer variation
in North American Breeding Bird Survey point
counts. Auk. American Ornithological Society.
doi:10.1525/auk.2011.10005
39. Digby, A., Towsey, M., Bell, B. D., & Teal, P. D.
(2013). A practical comparison of manual and
autonomous methods for acoustic monitoring.
Methods in Ecology and Evolution, 4(7), 675–683.
doi:10.1111/2041-210X.12060
40. Wheeldon, A., Mossman, H. L., Sullivan, M. J. P.,
Mathenge, J., & de Kort, S. R. (2019). Comparison
of acoustic and traditional point count methods
to assess bird diversity and composition in the
Aberdare National Park, Kenya. African Journal of
Ecology, 57(2), 168–176. doi:10.1111/aje.12596
41. Swiston, K. A., & Mennill, D. J. (2009). Comparison
of manual and automated methods for identifying
target sounds in audio recordings of Pileated,
Pale-billed, and putative Ivory-billed woodpeckers.
Journal of Field Ornithology, 80(1), 42–50.
doi:10.1111/j.1557-9263.2009.00204.x
42. Barclay, L. (2017). Listening to Communities and
Environments. Contemporary Music Review, 36(3),
143–158. doi:10.1080/07494467.2017.1395140
43. Sugai, L. S. M., & Llusia, D. (2019). Bioacoustic time
capsules: Using acoustic monitoring to document
biodiversity. Ecological Indicators, 99, 149–152.
doi:10.1016/j.ecolind.2018.12.021
44. Collins, J. (2016). Bat Surveys for Professional
Ecologists: Good Practice Guidelines (3rd edn).
The Bat Conservation Trust, London.
Retrieved from https://cdn.bats.org.uk/uploads/
pdf/Resources/Bat_Survey_Guidelines_2016_NON_
PRINTABLE.pdf?v=1542281971
45. Abrahams, C., Desjonquères, C., & Greenhalgh, J.
(2021). Pond Acoustic Sampling Scheme: A draft
protocol for rapid acoustic data collection in
small waterbodies. Ecology and Evolution, 11(12),
7532–7543. doi:10.1002/ece3.7585
46. BSI (The British Standards Institution) (2014) ‘BS ISO
12913-1:2014 Acoustics - Soundscape Part 1:
Definition and conceptual framework’.
47. BSI (The British Standards Institution) (2018) ‘PD ISO/
TS 12913-2 Acoustics - Soundscape Part 2: Data
collection and reporting requirements’.
48. BSI (The British Standards Institution) (2019) ‘PD
ISO/TS 12913-3:2019 BSI Standards Publication
Acoustics - Soundscape Part 3: Data analysis’.
49. Ella Browning, Rory Gibb, Paul Glover-Kapfer &
Kate E. Jones. 2017. WWF Conservation Technology
Series 1(2). WWF-UK, Woking, United Kingdom.
https://www.wwf.org.uk/sites/default/les/2019-04/
Acousticmonitoring-WWF-guidelines.pdf
50. Audacity software. https://www.audacityteam.org/.
Accessed on 12/10/2022.
51. Wildlife Acoustics Kaleidoscope software. https://
www.wildlifeacoustics.com/products/kaleidoscope.
Accessed 13/11/2022.
52. GroupGets https://groupgets.com/manufacturers/
open-acoustic-devices/products/audiomoth.
Accessed on 12/10/2022
53. Labmaker https://www.labmaker.org/collections/
earth-and-ecology/products/audiomoth-v1-2-
0?www.labmaker.org&gclid=Cj0KCQjwy5maBh
DdARIsAMxrkw02lCpBLEapeoCLk0Y_1yaFZRC
FZ-U6-RqyQ9loDqhGX0cxadTg2-EaAivZEALw_wcB.
Accessed on 12/10/2022.
54. Open Acoustics Support forums https://www.
openacousticdevices.info/support. Accessed on
12/10/2022.
55. Wildlife Acoustics https://www.wildlifeacoustics.
com. Accessed on 12/10/2022.
56. Whytock, R. C., & Christie, J. (2017, November). Solo:
an open source, customizable and inexpensive
audio recorder for bioacoustic research. (K.
Jones, Ed.), Methods in Ecology and Evolution.
doi:10.1111/2041-210X.12678. Device link: https://
solo-system.github.io/home.html
57. ARUPI https://www.instructables.com/ARUPi-A-
Low-Cost-Automated-Recording-Unit-for-Soun/.
Accessed on 12/10/2022.
58. Beason, R. D., Riesch, R., & Koricheva, J. (2019).
AURITA: an aordable, autonomous recording device
for acoustic monitoring of audible and ultrasonic
frequencies. Bioacoustics, 28(4), 381–396. doi:10.108
0/09524622.2018.1463293
59. Sethi, S. S., Ewers, R. M., Jones, N. S., Orme, C. D. L., &
Picinali, L. (2018). Robust, real-time and autonomous
monitoring of ecosystems with an open, low-
cost, networked device. Methods in Ecology and
Evolution, 9(12), 2383–2387. doi:10.1111/2041-
210X.13089. Device: https://www.bugg.xyz/
60. Darras, K., Kolbrek, B., Knorr, A., Meyer, V., Zippert,
M., & Wenzel, A. (2021). Assembling cheap, high-
performance microphones for recording terrestrial
wildlife: the Sonitor system. F1000Research, 7, 1984.
doi:10.12688/f1000research.17511.3
61. Raspberry Pi. https://www.raspberrypi.org/.
Accessed on 12/10/22
62. Titley Chorus. https://www.titley-scientific.com/
uk/products/anabat-systems/chorus. Accessed on
12/10/2022
63. Frontier Labs. https://www.frontierlabs.com.au/bar-
lt. Accessed on 12/10/2022.
64. Wijers, M., Loveridge, A., Macdonald, D. W., &
Markham, A. N. (2019). CARACAL: A versatile passive
acoustic monitoring tool for wildlife research and
conservation. Bioacoustics, 1–17.
https://doi.org/10.1080/09524622.2019.1685408.
65. Allen, M., Girod, L., Newton, R., Madden, S.,
Blumstein, D. T., & Estrin, D. (2008). VoxNet: An
Interactive, rapidly-deployable acoustic monitoring
platform. In: 2008 7th International Conference
on Information Processing in Sensor Networks
(IPSN 2008). St. Louis, MO: IEEE, pp. 371–382. doi:
https://doi.org/10.1109/IPSN.2008.45. url: http://
ieeexplore.ieee.org/document/4505488/ (visited on
09/07/2018).
66. L. Brüggemann, B. Schütz and N. Aschenbruck,
“Ornithology meets the IoT: Automatic Bird
Identication, Census, and Localization,” 2021
IEEE 7th World Forum on Internet of Things
(WF-IoT), 2021, pp. 765-770, doi: 10.1109/WF-
IoT51360.2021.9595401.
67. Heath et al - in prep.
68. Blumstein, D. T., Mennill, D. J., Clemins, P., Girod,
L., Yao, K., Patricelli, G., … Kirschel, A. N. G. (2011).
Acoustic monitoring in terrestrial environments
using microphone arrays: Applications,
technological considerations and prospectus.
Journal of Applied Ecology, 48(3), 758–767.
doi:10.1111/j.1365-2664.2011.01993.x
69. Adams, A. M., Jantzen, M. K., Hamilton, R. M., &
Fenton, M. B. (2012). Do you hear what I hear?
Implications of detector selection for acoustic
monitoring of bats. Methods in Ecology and
Evolution, 3(6), 992–998. doi:10.1111/j.2041-
210X.2012.00244.x
70. Merchant, N. D., Fristrup, K. M., Johnson, M. P., Tyack,
P. L., Witt, M. J., Blondel, P., & Parks, S. E. (2015).
Measuring acoustic habitats. Methods in Ecology
and Evolution, 6(3), 257–265. doi:10.1111/2041-
210X.12330
71. Turgeon, P. J., Van Wilgenburg, S. L., & Drake, K. L.
(2017). Microphone variability and degradation:
implications for monitoring programs employing
autonomous recording units. Avian Conservation
and Ecology, 12(1). doi:10.5751/ACE-00958-120109
72. Bioacoustics Unit (2016). SongMeter (SM3)
Maintenance Protocol. Bayne Lab at the University of
Alberta & Alberta Biodiversity Monitoring Institute.
https://www.wildtrax.ca/dam/jcr:9a5ad9ac-
c684-4712-a811-74f882acfd5b/BU_2019_
SM3MaintenanceProtocol.pdf
73. TechJunkie (2010). Generating White, Pink Or Brown
Noise With Audacity https://www.youtube.com/
watch?v=qjspInr_Ps4. Accessed on 12/10/2022.
74. Wimmer, J., Towsey, M., Roe, P., & Williamson, I.
(2013). Sampling environmental acoustic recordings
to determine bird species richness. Ecological
Applications, 23(6). doi:10.1890/12-2088.1
75. Bayne, E., Knaggs, M., & Solymos, P. (2017). How to
Most Eectively Use Autonomous Recording Units
When Data are Processed by Human Listeners.
Bioacoustic Unit, University of Alberta and Alberta
Biodiversity Monitoring Institute. Retrieved from
http://bioacoustic.abmi.ca/
76. Metcalf, O. C., Barlow, J., Marsden, S., Gomes de
Moura, N., Berenguer, E., Ferreira, J., & Lees, A. C.
(2022). Optimizing tropical forest bird surveys using
passive acoustic monitoring and high temporal
resolution sampling. Remote Sensing in Ecology and
Conservation, 8(1), 45–56. doi:10.1002/rse2.227
77. Yip, D. A., Leston, L., Bayne, E. M., Sólymos, P., &
Grover, A. (2017). Experimentally derived detection
distances from audio recordings and human
observers enable integrated analysis of point
count data. Avian Conservation and
Ecology, 12(1). doi:10.5751/ACE-00997-120111
78. Piña-Covarrubias, E., Hill, A. P., Prince, P., Snaddon, J.
L., Rogers, A., & Doncaster, C. P. (2019). Optimization
of sensor deployment for acoustic detection and
localization in terrestrial environments. Remote
Sensing in Ecology and Conservation, 5(2), 180–192.
doi:10.1002/rse2.97
79. Clarin, B. M., Bitzilekis, E., Siemers, B. M., & Goerlitz,
H. R. (2014). Personal messages reduce vandalism
and theft of unattended scientific equipment.
Methods in Ecology and Evolution, 5(2), 125–131.
doi:10.1111/2041-210X.12132
80. Heath, B. E., Sethi, S. S., Orme, C. D. L., Ewers, R. M., &
Picinali, L. (2021). How index selection, compression,
and recording schedule impact the description of
ecological soundscapes. Ecology and Evolution,
11(19), 13206–13217. doi:10.1002/ece3.8042
81. Audubon Core. https://ac.tdwg.org/. Accessed
13/11/2022.
82. Darwin Core. https://dwc.tdwg.org/. Accessed
13/11/2022.
83. Kevin F.A. Darras, Steven Van Wilgenburg, Rodney
Rountree, Yuhang Song, Youfang Chen, & Thomas
Cherico Wanger. (2022). The Global Soundscapes
Project: overview of datasets and meta-data
(0.2.1) [Data set]. Zenodo. https://doi.org/10.5281/
zenodo.6537739
84. Arbimon by Rainforest Connection https://arbimon.
rfcx.org/. Accessed 14/11/2022.
85. Xeno-canto. Online acoustic repository. https://
xeno-canto.org/. Accessed 14/11/2022.
86. Macaulay Library at the Cornell Lab of Ornithology.
Multimedia natural history archive. https://www.
macaulaylibrary.org/. Accessed 14/11/2022.
87. Australian Acoustic Observatory (A2O). https://
acousticobservatory.org/.
88. Phillips, Y. F., Towsey, M., & Roe, P. (2018). Revealing
the ecological content of long-duration audio-
recordings of the environment through clustering
and visualisation. PLoS ONE, 13(3). doi:10.1371/
journal.pone.0193345
89. Bioacoustics Software list by Tessa Rhinehart (rhine3)
on Github https://github.com/rhine3/bioacoustics-
software. Accessed on 28/10/2022.
90. Volker, A. (2012). Using Audacity: hints and tricks.
Xeno-canto webpage. https://xeno-canto.org/
forum/topic/3042. Accessed on 28/10/2022
91. Bioacoustics Research Program, 2010. Raven Pro:
Interactive Sound Analysis Software. Cornell Lab of
Ornithology.
92. Sonic Visualiser. https://www.sonicvisualiser.org/.
Accessed 13/11/2022.
93. R Core Team (2022). R: A language and environment
for statistical computing. R Foundation for Statistical
Computing, Vienna, Austria. https://www.R-project.
org/.
94. Sueur J, Aubin T, Simonis C (2008). seewave: a free
modular tool for sound analysis and synthesis.
Bioacoustics, 18: 213-226
95. Python Software Foundation. Python Language
Reference. Available at http://www.python.org.
96. J. D. Hunter, “Matplotlib: A 2D Graphics
Environment”, Computing in Science & Engineering,
vol. 9, no. 3, pp. 90-95, 2007.
97. Virtanen, P., Gommers, R., Oliphant, T. E., Haberland,
M., Reddy, T., Cournapeau, D., … Vázquez-Baeza,
Y. (2020). SciPy 1.0: fundamental algorithms for
scientic computing in Python. Nature Methods,
17(3), 261–272. doi:10.1038/s41592-019-0686-2
98. MATLAB, The MathWorks, Inc., Natick, Massachusetts,
United States. https://uk.mathworks.com/products/
matlab.html
99. Towsey, M., Zhang, L., Cottman-Fields, M., Wimmer,
J., Zhang, J., & Roe, P. (2014). Visualization of long-
duration acoustic recordings of the environment. In
Procedia Computer Science (Vol. 29, pp. 703–712).
doi:10.1016/j.procs.2014.05.063
100. Towsey, M., Znidersic, E., Broken-Brow, J., Indraswari,
K., Watson, D. M., Phillips, Y., … Roe, P. (2018).
Long-duration, false-colour spectrograms for
detecting species in large audio data-sets. Journal of
Ecoacoustics, 2(1), 1–1. doi:10.22261/jea.iuswui
101. Towsey, M., Truskinger, A., Cottman-Fields, M., &
Roe, P. (2018, March 5). Ecoacoustics Audio Analysis
Software v18.03.0.41 (Version v18.03.0.41). Zenodo.
http://doi.org/10.5281/zenodo.1188744
102. Ulloa, J. S., Haupert, S., Latorre, J. F., Aubin,
T., & Sueur, J. (2021). scikit-maad: An open-
source and modular toolbox for quantitative
soundscape analysis in Python. Methods in
Ecology and Evolution, 12, 2334– 2340. https://doi.
org/10.1111/2041-210X.13711
103. Sethi, S., False-colour Index Spectrogram. GitHub
repository. https://github.com/sarabsethi/false_
colour_index_spectrogram. Accessed 07/11/2022.
104. Metcalf, O. C., Lees, A. C., Barlow, J., Marsden, S. J.,
& Devenish, C. (2020). hardRain: An R package for
quick, automated rainfall detection in ecoacoustic
datasets using a threshold-based approach.
Ecological Indicators, 109, 105793. doi:10.1016/j.
ecolind.2019.105793
105. Juodakis, J., & Marsland, S. (2022). Wind-robust
sound event detection and denoising for
bioacoustics. Methods in Ecology and Evolution,
13, 2005– 2017. https://doi.org/10.1111/2041-
210X.13928
106. Cretois, B., Rosten, C. M., & Sethi, S. S. (2022). Voice
activity detection in eco-acoustic data enables
privacy protection and is a proxy for human
disturbance. Methods in Ecology and Evolution, 00,
1– 10. https://doi.org/10.1111/2041-210X.14005
107. Gillings, S., https://nocmig.com/processing/.
Accessed on 14/11/2022
108. Bas, Y., Bas, D., & Julien, J.-F. (2017). Tadarida:
A Toolbox for Animal Detection on Acoustic
Recordings. Journal of Open Research Software, 5(1),
6. doi:10.5334/jors.154
109. Katz, J., Hafner, S. D., & Donovan, T. (2016).
Assessment of Error Rates in Acoustic Monitoring
with the R package monitoR. Bioacoustics, 25(2),
177–196. doi:10.1080/09524622.2015.1133320
110. Balantic, C. M., & Donovan, T. M. (2020). Statistical
learning mitigation of false positives from template-
detected data in automated acoustic wildlife
monitoring. Bioacoustics, 29(3), 296–321.
doi:10.1080/09524622.2019.1605309
111. Aide, T. M., Corrada-Bravo, C., Campos-Cerqueira, M.,
Milan, C., Vega, G., & Alvarez, R. (2013). Real-time
bioacoustics monitoring and automated species
identification. PeerJ, 2013(1). doi:10.7717/peerj.103
112. Campos-Cerqueira, M., Mena, J. L., Tejeda-Gómez, V.,
Aguilar-Amuchastegui, N., Gutierrez, N., & Aide,
T. M. (2020). How does FSC forest certification affect
the acoustically active fauna in Madre de Dios, Peru?
Remote Sensing in Ecology and Conservation, 6(3),
274–285. doi:10.1002/rse2.120
113. Deichmann, J. L., Hernández-Serna, A., Delgado C.,
J. A., Campos-Cerqueira, M., & Aide, T. M. (2017).
Soundscape analysis and acoustic monitoring
document impacts of natural gas exploration on
biodiversity in a tropical forest. Ecological Indicators,
74, 39–48. doi:10.1016/j.ecolind.2016.11.002
114. Katz, J., Hafner, S. D., & Donovan, T. (2016).
Tools for automated acoustic monitoring within the
R package monitoR. Bioacoustics, 25(2), 197–210.
doi:10.1080/09524622.2016.1138415
115. Teixeira, D., 2022. Presentation at the UKAN+ Long-
term Bioacoustic Monitoring in the UK symposium.
YouTube. https://www.youtube.com/watch?v=zd5
eG30TWnA&list=PLtnYW6qlPmyhiqGw78P7MnY1-
TcI2hpfH&index=3.
116. Ducrettet, M., Forget, P. M., Ulloa, J. S., Yguel, B.,
Gaucher, P., Princé, K., … Sueur, J. (2020). Monitoring
canopy bird activity in disturbed landscapes with
automatic recorders: A case study in the tropics.
Biological Conservation, 245. doi:10.1016/j.
biocon.2020.108574
117. Teixeira, D., Linke, S., Hill, R., Maron, M., & van
Rensburg, B. J. (2022). Fledge or fail: Nest monitoring
of endangered black-cockatoos using bioacoustics
and open-source call recognition. Ecological
Informatics, 69, 101656. doi:10.1016/j.
ecoinf.2022.101656
118. Priyadarshani, N., Marsland, S., & Castro, I. (2018).
Automated birdsong recognition in complex
acoustic environments: a review. Journal of Avian
Biology, 49(5). doi:10.1111/JAV.01447
119. Kahl, S., Wood, C. M., Eibl, M., & Klinck, H. (2021).
BirdNET: A deep learning solution for avian diversity
monitoring. Ecological Informatics, 61, 101236.
doi:10.1016/j.ecoinf.2021.101236
120. Clink, D, and Klinck, H., (2019). gibbonR - R package
for automated detection, classification and
visualization of acoustic signals. K. Lisa Yang Center
for Conservation Bioacoustics, Cornell Lab of
Ornithology, Cornell University. GitHub repository:
https://github.com/DenaJGibbon/gibbonR
121. Araya-Salas, M., & Smith-Vidaurre, G. (2017). warbleR:
an r package to streamline analysis of animal
acoustic signals. Methods in Ecology and Evolution,
8(2), 184–191. doi:10.1111/2041-210X.12624
122. Kuhn, M. (2021). caret: Classification and Regression
Training. Retrieved from https://cran.r-project.org/
package=caret
123. Stowell, D. (2022, December 13). Computational
bioacoustics with deep learning: a review and
roadmap. PeerJ. doi:10.7717/peerj.13152
124. Kahl, S., Denton, T., Klinck, H., Glotin, H., Goëau,
H., Vellinga, W. P., … Joly, A. (2021). Overview of
BirdCLEF 2021: Bird call identification in soundscape
recordings. In CEUR Workshop Proceedings (Vol.
2936, pp. 1437–1450). Retrieved from https://ceur-
ws.org/Vol-2936/paper-123.pdf
125. Rhinehart, T., Lapp, S., & Kitzes, J. (2022). Identifying
and building on the current state of bioacoustics
software. The Journal of the Acoustical Society of
America, 151(4), A27–A27. doi:10.1121/10.0010544
126. British Trust for Ornithology Acoustic Pipeline.
https://www.bto.org/our-science/projects/bto-
acoustic-pipeline. Accessed on 14/11/2022
127. Knight, E. C., Hannah, K. C., Foley, G. J., Scott, C. D.,
Brigham, R. M., & Bayne, E. (2017). Recommendations
for acoustic recognizer performance assessment
with application to ve common automated signal
recognition programs. Avian Conservation and
Ecology, 12(2), art14. doi:10.5751/ACE-01114-120214
128. Metcalf, O. C., Barlow, J., Bas, Y., Berenguer, E.,
Devenish, C., França, F., … Lees, A. C. (2022).
Detecting and reducing heterogeneity of error in
acoustic classication. Methods in Ecology and
Evolution, 00, 1–13. doi:10.1111/2041-210X.13967
129. Sing T, Sander O, Beerenwinkel N, Lengauer T (2005).
“ROCR: visualizing classifier performance in R.”
Bioinformatics, 21(20), 7881. http://rocr.bioinf.mpi-
sb.mpg.de.
130. Grau J, Grosse I, Keilwagen J (2015). “PRROC:
computing and visualizing precision-recall and
receiver operating characteristic curves in R.”
Bioinformatics, 31(15), 2595-2597.
131. Manzano, R., Bota, G., Brotons, L., Soto-Largo, E., &
Pérez-Granados, C. (2022). Low-cost open-source
recorders and ready-to-use machine learning
approaches provide effective monitoring of
threatened species. Ecological Informatics, 101910.
doi:10.1016/J.ECOINF.2022.101910
132. Duchac, L. S., Lesmeister, D. B., Dugger, K. M., Ruff, Z.
J., & Davis, R. J. (2020). Passive acoustic monitoring
effectively detects Northern Spotted Owls and
Barred Owls over a range of forest conditions.
Condor, 122(3). doi:10.1093/condor/duaa017
133. Celis-Murillo, A., Deppe, J. L., & Ward, M. P. (2012).
Eectiveness and utility of acoustic recordings
for surveying tropical birds. Journal of Field
Ornithology, 83(2), 166–179. doi:10.1111/j.1557-
9263.2012.00366.x
134. Frommolt, K. H. (2017). Information obtained from
long-term acoustic recordings: applying bioacoustic
techniques for monitoring wetland birds during
breeding season. Journal of Ornithology, 158(3),
659–668. doi:10.1007/s10336-016-1426-3.
135. UK Bird Survey Guidelines. https://
birdsurveyguidelines.org/803-2/. Accessed
14/11/2022.
136. Sullivan, B.L., C.L. Wood, M.J. Iliff, R.E. Bonney, D.
Fink, and S. Kelling. 2009. eBird: a citizen-based bird
observation network in the biological sciences.
Biological Conservation 142: 2282-2292.
137. Hsieh TC, Ma KH, Chao A (2022). iNEXT: Interpolation
and Extrapolation for Species Diversity. R package
version 3.0.0, http://chao.stat.nthu.edu.tw/
wordpress/software_download/.
138. Oksanen, J., Blanchet, F. G., Friendly, M., Kindt, R.,
Legendre, P., McGlinn, D., Minchin, P.R., O’Hara, R.
B., Simpson, G.L., Solymos, P., Henry, M., Stevens, H.,
Szoecs, E., Wagner, H. (2020). vegan: Community
Ecology Package. R package version 2.5-7. https://
CRAN.R-project.org/package=vegan
139. Abrahams, C., & Geary, M. (2020). Combining
bioacoustics and occupancy modelling for improved
monitoring of rare breeding bird populations.
Ecological Indicators, 112, 106131. doi:10.1016/j.
ecolind.2020.106131
140. Metcalf, O. C., Ewen, J. G., McCready, M., Williams,
E. M., & Rowclie, J. M. (2019). A novel method for
using ecoacoustics to monitor post-translocation
behaviour in an endangered passerine. Methods
in Ecology and Evolution, 10(5), 626–636.
doi:10.1111/2041-210X.13147
141. Campos-Cerqueira, M., & Aide, T. M. (2016).
Improving distribution data of threatened species
by combining acoustic monitoring and occupancy
modelling. Methods in Ecology and Evolution, 7(11),
1340–1348. doi:10.1111/2041-210X.12599
142. MacKenzie, D. I., Nichols, J. D., Royle, J. A., Pollock,
K. H., Bailey, L. L., & Hines, J. E. (2017). Occupancy
Estimation and Modeling: Inferring Patterns and
Dynamics of Species Occurrence: Second Edition.
Elsevier. doi:10.1016/C2012-0-01164-7
143. Kéry, M., Guillera-Arroita, G., & Lahoz-Monfort, J.
J. (2013). Analysing and mapping species range
dynamics using occupancy models. Journal of
Biogeography, 40(8), 1463–1474. doi:10.1111/
jbi.12087
144. Fiske, I. J., & Chandler, R. B. (2011). Unmarked: An R
package for tting hierarchical models of wildlife
occurrence and abundance. Journal of Statistical
Software, 43(10), 1–23. doi:10.18637/jss.v043.i10
145. Chambert, T., Waddle, J. H., Miller, D. A. W., Walls,
S. C., & Nichols, J. D. (2018). A new framework for
analysing automated acoustic species detection
data: Occupancy estimation and optimization of
recordings post-processing. Methods in Ecology
and Evolution, 9(3), 560–570. doi:10.1111/2041-
210X.12910
146. Andrei, V. (2015). Considerations on Developing
a Chainsaw Intrusion Detection and Localization
System for Preventing Unauthorized Logging.
Journal of Electrical and Electronic Engineering, 3(6),
202. https://doi.org/10.11648/j.jeee.20150306.15
147. Shinji Sumitani, Reiji Suzuki, Naoaki Chiba, Shiho
Matsubayashi, Takaya Arita, Kazuhiro Nakadai,
Hiroshi G. Okuno: An Integrated Framework for
Field Recording, Localization, Classification and
Annotation of Birdsongs Using Robot Audition
Techniques — Harkbird 2.0, Proc. of 2019 IEEE
International Conference on Acoustics, Speech and
Signal Processing (ICASSP2019), pp. 8246-8250.
148. F. Grondin, D. Létourneau, C. Godin, J.-S. Lauzon, J.
Vincent, S. Michaud, S. Faucher, F. Michaud, ODAS:
Open embeddeD Audition System, Frontiers in
Robotics and AI, Volume 9, 2022
149. Pérez-Granados, C., & Traba, J. (2021, July 1).
Estimating bird density using passive acoustic
monitoring: a review of methods and suggestions
for further research. Ibis. John Wiley & Sons, Ltd.
doi:10.1111/ibi.12944
150. Pérez-Granados, C., Bota, G., Giralt, D., Barrero, A.,
Gómez-Catasús, J., Bustillo-De La Rosa, D., & Traba,
J. (2019). Vocal activity rate index: a useful method
to infer terrestrial bird abundance with acoustic
monitoring. Ibis, 161(4), 901–907. doi:10.1111/
ibi.12728
151. Pérez-Granados, C., Gómez-Catasús, J., Bustillo-
de la Rosa, D., Barrero, A., Reverter, M., & Traba, J.
(2019). Eort needed to accurately estimate Vocal
Activity Rate index using acoustic monitoring: A
case study with a dawn-time singing passerine.
Ecological Indicators, 107, 105608. doi:10.1016/j.
ecolind.2019.105608
152. Marques, T. A., Thomas, L., Martin, S. W., Mellinger,
D. K., Ward, J. A., Moretti, D. J., … Tyack, P. L. (2013).
Estimating animal population density using passive
acoustics. Biological Reviews, 88(2), 287–309.
doi:10.1111/brv.12001
153. Borker, A. L., Mckown, M. W., Ackerman, J. T., Eagles-
Smith, C. A., Tershy, B. R., & Croll, D. A. (2014). Vocal
activity as a low cost and scalable index of seabird
colony size. Conservation Biology, 28(4), 1100–1108.
doi:10.1111/cobi.12264
154. Oppel, S., Hervías, S., Oliveira, N., Pipa, T., Silva, C.,
Geraldes, P., … McKown, M. (2014). Estimating
population size of a nocturnal burrow-nesting
seabird using acoustic monitoring and habitat
mapping. Nature Conservation, 7, 1–13. doi:10.3897/
natureconservation.7.6890
155. Gillings, S., & Scott, C. (2021). Nocturnal flight calling
behaviour of thrushes in relation to artificial light at
night. Ibis, 163(4), 1379–1393. doi:10.1111/ibi.12955
156. Stevenson, B. C., Borchers, D. L., Altwegg, R., Swift, R.
J., Gillespie, D. M., & Measey, G. J. (2015). A general
framework for animal density estimation from
acoustic detections across a fixed microphone array.
Methods in Ecology and Evolution, 6(1), 38–48.
doi:10.1111/2041-210X.12291
157. Frommolt, K. H., & Tauchert, K. H. (2014). Applying
bioacoustic methods for long-term monitoring of a
nocturnal wetland bird. Ecological Informatics, 21,
4–12. doi:10.1016/j.ecoinf.2013.12.009
158. Sebastián-González, E., Camp, R. J., Tanimoto, A.
M., de Oliveira, P. M., Lima, B. B., Marques, T. A.,
& Hart, P. J. (2018). Density estimation of sound-
producing terrestrial animals using single automatic
acoustic recorders and distance sampling. Avian
Conservation and Ecology, 13(2). doi:10.5751/ACE-
01224-130207
159. Luschi, P., & Del Seppia, C. (1996). Song-type function
during territorial encounters in male Cetti’s Warblers
Cettia cetti. Ibis, 138(3), 479–484. doi:10.1111/j.1474-
919x.1996.tb08068.x
160. Galeotti, P., & Pavan, G. (1991). Individual
recognition of male tawny owls (Strix aluco) using
spectrograms of their territorial calls. Ethology
Ecology and Evolution, 3(2), 113–126.
doi:10.1080/08927014.1991.9525378
161. Sueur, J., Farina, A., Gasc, A., Pieretti, N., & Pavoine, S.
(2014). Acoustic indices for biodiversity assessment
and landscape investigation. Acta Acustica United
with Acustica, 100(4), 772–781. doi:10.3813/
AAA.918757
162. Eldridge, A., Guyot, P., Moscoso, P., Johnston, A.,
Eyre-Walker, Y., & Peck, M. (2018). Sounding out
ecoacoustic metrics: Avian species richness is
predicted by acoustic indices in temperate but not
tropical habitats. Ecological Indicators, 95, 939–952.
doi:10.1016/j.ecolind.2018.06.012
163. Sueur, J., Pavoine, S., Hamerlynck, O., & Duvail,
S. (2008). Rapid acoustic survey for biodiversity
appraisal. PLoS ONE, 3(12). doi:10.1371/journal.
pone.0004065
164. Kasten, E. P., Gage, S. H., Fox, J., & Joo, W. (2012).
The remote environmental assessment laboratory’s
acoustic library: An archive for studying soundscape
ecology. Ecological Informatics, 12, 50–67.
doi:10.1016/j.ecoinf.2012.08.001
165. Papin M, Aznar M, Germain E, Guérold F, Pichenot J
(2019) Using acoustic indices to estimate wolf pack
size. Ecological Indicators, 103, 202-211.
166. Carruthers-Jones, J., Eldridge, A., Guyot, P.,
Hassall, C., & Holmes, G. (2019). The call of the
wild: Investigating the potential for ecoacoustic
methods in mapping wilderness areas. Science
of the Total Environment, 695. doi:10.1016/j.
scitotenv.2019.133797
167. Alcocer, I., Lima, H., Sugai, L. S. M., & Llusia, D. (2022).
Acoustic indices as proxies for biodiversity: a meta-
analysis. Biological Reviews, (1), 0–000. doi:10.1111/
brv.12890
168. Sethi, S. S., Jones, N. S., Fulcher, B. D., Picinali, L., Clink,
D. J., Klinck, H., … Ewers, R. M. (2020). Characterizing
soundscapes across diverse ecosystems using a
universal acoustic feature set. Proceedings of the
National Academy of Sciences of the United States
of America, 117(29), 17049–17055. doi:10.1073/
pnas.2004702117
169. Sueur, J., & Farina, A. (2015). Ecoacoustics: the
Ecological Investigation and Interpretation of
Environmental Sound. Biosemiotics, 8(3), 493–502.
doi:10.1007/s12304-015-9248-x
170. Krause, B. (1993). The niche hypothesis. Soundscape
Newsletter, 6, 6–10. Retrieved from https://www.
researchgate.net/publication/295609070
171. Morton, E. S. (1975). Ecological Sources of Selection
on Avian Sounds. The American Naturalist, 109(965),
17–34. doi:10.1086/282971
172. Tobias, J. A., Planqué, R., Cram, D. L., & Seddon, A.
N. (2014). Species interactions and the structure of
complex communication networks. Proceedings
of the National Academy of Sciences of the United
States of America, 111(3), 1020–1025. doi:10.1073/
pnas.1314337111.
173. Boncoraglio, G., & Saino, N. (2007). Habitat structure
and the evolution of bird song: A meta-analysis of
the evidence for the acoustic adaptation hypothesis.
Functional Ecology, 21(1). https://doi.org/10.1111/
j.1365-2435.2006.01207.x
174. Rendon, N., Rodríguez-Buritica, S., Sanchez-Giraldo,
C., Daza, J. M., & Isaza, C. (2022). Automatic acoustic
heterogeneity identication in transformed
landscapes from Colombian tropical dry forests.
Ecological Indicators, 140. doi:10.1016/J.
ECOLIND.2022.109017
175. Barbaro, L., Sourdril, A., Froidevaux, J. S. P., Cauchoix,
M., Calatayud, F., Deconchat, M., & Gasc, A. (2022).
Linking acoustic diversity to compositional and
congurational heterogeneity in mosaic landscapes.
Landscape Ecology, 37(4), 1125–1143. https://doi.
org/10.1007/s10980-021-01391-8
176. Brownlie, K., Monash, R., Geeson, JJ., Fort, J.,
Bustamante, P., & Arnould, J.P.Y., (2020) Developing
a passive acoustic monitoring technique for
Australia’s most numerous seabird, the Short-
tailed Shearwater (Ardenna tenuirostris), Emu
- Austral Ornithology, 120:2, 123-134, DOI:
10.1080/01584197.2020.1732828.
177. Pieretti, N., Farina, A., & Morri, D. (2011). A new
methodology to infer the singing activity of an avian
community: The Acoustic Complexity Index (ACI).
Ecological Indicators, 11(3), 868–873. doi:10.1016/j.
ecolind.2010.11.005
178. Villanueva-Rivera, L. J., Pijanowski, B. C., Doucette,
J., & Pekin, B. (2011). A primer of acoustic analysis
for landscape ecologists. Landscape Ecology, 26(9),
1233–1246. doi:10.1007/s10980-011-9636-9
179. Aide, T. M., Hernández-Serna, A., Campos-Cerqueira,
M., Acevedo-Charry, O., & Deichmann, J. L. (2017).
Species richness (of insects) drives the use of
acoustic space in the tropics. Remote Sensing, 9(11).
doi:10.3390/rs9111096
180. Boelman, N. T., Asner, G. P., Hart, P. J., & Martin, R. E.
(2007). Multi-trophic invasion resistance in Hawaii:
Bioacoustics, eld surveys, and airborne remote
sensing. Ecological Applications, 17(8), 2137–2144.
doi:10.1890/07-0004.1
181. Depraetere, M., Pavoine, S., Jiguet, F., Gasc, A.,
Duvail, S., & Sueur, J. (2012). Monitoring animal
diversity using acoustic indices: Implementation in
a temperate woodland. Ecological Indicators, 13(1),
46–54. doi:10.1016/j.ecolind.2011.05.006
182. Gasc, A., Sueur, J., Pavoine, S., Pellens, R., &
Grandcolas, P. (2013). Biodiversity Sampling Using
a Global Acoustic Approach: Contrasting Sites with
Microendemics in New Caledonia. PLoS ONE, 8(5),
e65311. doi:10.1371/journal.pone.0065311
183. Burivalova, Z., Towsey, M., Boucher, T., Truskinger,
A., Apelis, C., Roe, P., & Game, E. T. (2018). Using
soundscapes to detect variable degrees of human
inuence on tropical forests in Papua New Guinea.
Conservation Biology, 32(1), 205–215. doi:10.1111/
cobi.12968
184. Villanueva-Rivera, L.J., and Pijanowski, B.C.,
(2018). soundecology: Soundscape Ecology. R
package version 1.3.3. https://CRAN.R-project.org/
package=soundecology
185. Bradfer-Lawrence, T., Gardner, N., Bunnefeld, L.,
Bunnefeld, N., Willis, S. G., & Dent, D. H. (2019).
Guidelines for the use of acoustic indices in
environmental research. Methods in Ecology and
Evolution, 10(10), 1796–1807. doi:10.1111/2041-
210X.13254
186. Buxton, R. T., McKenna, M. F., Clapp, M., Meyer,
E., Stabenau, E., Angeloni, L. M., … Wittemyer, G.
(2018). Ecacy of extracting indices from large-
scale acoustic recordings to monitor biodiversity.
Conservation Biology, 32(5), 1174–1184.
doi:10.1111/cobi.13119
187. Metcalf, O. C., Barlow, J., Devenish, C., Marsden, S.,
Berenguer, E., & Lees, A. C. (2021). Acoustic indices
perform better when applied at ecologically
meaningful time and frequency scales. Methods
in Ecology and Evolution, 12(3), 421–431.
doi:10.1111/2041-210X.13521
188. Breiman, L. (2001). Random forests. Machine
Learning, 45(1), 5–32. doi:10.1023/A:1010933404324
189. Do Nascimento, L. A., Campos-Cerqueira, M., &
Beard, K. H. (2020). Acoustic metrics predict habitat
type and vegetation structure in the Amazon.
Ecological Indicators, 117, 106679. doi:10.1016/j.
ecolind.2020.106679
190. Borker, A. L., Buxton, R. T., Jones, I. L., Major, H. L.,
Williams, J. C., Tershy, B. R., & Croll, D. A. (2020).
Do soundscape indices predict landscape-scale
restoration outcomes? A comparative study of
restored seabird island soundscapes. Restoration
Ecology, 28(1), 252–260. doi:10.1111/rec.13038
191. Ross, S. R. P. J., Friedman, N. R., Yoshimura, M.,
Yoshida, T., Donohue, I., & Economo, E. P. (2021).
Utility of acoustic indices for ecological monitoring
in complex sonic environments. Ecological
Indicators, 121. doi:10.1016/j.ecolind.2020.107114
192. Sánchez-Giraldo, C., Bedoya, C. L., Morán-Vásquez,
R. A., Isaza, C. V., & Daza, J. M. (2020). Ecoacoustics
in the rain: understanding acoustic indices under
the most common geophonic source in tropical
rainforests. Remote Sensing in Ecology and
Conservation, 6(3), 248–261. doi:10.1002/rse2.162.
193. Wang, Y., Zhang, Y., Xia, C., & Moller, A. P. (2022). A
meta-analysis of the eects in alpha acoustic indices.
Biodiversity Science, 0. doi:10.17520/BIODS.2022369
194. Burivalova, Z., Purnomo, Wahyudi, B., Boucher, T. M.,
Ellis, P., Truskinger, A., … Game, E. T. (2019). Using
soundscapes to investigate homogenization of
tropical forest diversity in selectively logged forests.
Journal of Applied Ecology, 56(11), 2493–2504.
doi:10.1111/1365-2664.13481
195. Sethi, S. S., Ewers, R. M., Jones, N. S., Sleutel,
J., Shabrani, A., Zulkii, N., & Picinali, L. (2022).
Soundscapes predict species occurrence in tropical
forests. Oikos, 2022(3). doi:10.1111/oik.08525
Appendix 1: An evidence-based quick-start
guide for ecoacoustics deployment
The programming and deployment of automated acoustic recorders can be confusing to new
practitioners, with a bewildering array of decisions to make on machine settings and fieldwork
approaches. Research studies have used a wide variety of methods, with little coordination or
development of good practice - the limited guidelines previously available have been scattered
throughout the literature.
Below, we set out some recommendations for implementing an ecoacoustics study focussed on
long-term monitoring and the use of Acoustic Indices. The recommendations are based, where
possible, on evidence from the scientific literature. The recommendations advise on audio settings
for the recorders, recording schedules, and how best to deploy detectors in the field spatially and
temporally. The choices provided will not be optimal for every project, but this quick-start guide
is intended to advise those who are starting with ecoacoustics, or who are seeking consistency in
approach. The recommendations are not intended to constrain experienced researchers who fully
understand the best options for their own field of study.
As part of the evidence gathered for these recommendations, a questionnaire was circulated to
attendees at the UK Acoustics Network (UKAN+) ecoacoustics symposium held at Manchester
Metropolitan University on 15-16th June 2022. This asked a series of questions related to the
parameters of a recommended survey protocol for an ecoacoustics study, focussed on developing
audio data for analysis with acoustic indices. The favoured choices of the 84 respondents to the
survey are included below (referred to as ‘UKAN questionnaire’).
Sample Rate
Sample rate is the number of sound samples recorded per second. It affects the temporal
resolution of acoustic data, and sets the highest frequency of the sound that will be recorded. The
sample rate is programmed within the recording device settings.
We recommend using a 48 kHz sample rate
Research evidence
UKAN questionnaire: 55% of respondents selected a sample rate of 44.1/48 kHz for
ecoacoustic studies.
Within the Global Soundscapes Project database (Darras, 2022), 183 of the 325 projects
listed used a 48 kHz sample rate.
https://doi.org/10.5281/zenodo.6486836
The Silent Cities project used a 48 kHz sample rate in its global scale study of
soundscapes during the Covid-19 lockdown.
DOI: 10.17605/OSF.IO/H285U
The most common (37%) sampling rate used in the 35 acoustic index studies reviewed
by Alcocer et al. (2022) was 44.1 kHz.
https://doi.org/10.1111/brv.12890
Burivalova et al. (2018) used a 44 kHz sample rate when using soundscapes to detect the
effects of human influence on tropical forests.
https://doi.org/10.1111/cobi.12968
For bird surveys, Darras et al. (2018) recommended recording all frequencies in the
audible range - with a sampling frequency of 44.1 kHz.
DOI: 10.1111/1365-2664.13229
Rationale
A 48 kHz sample rate will record the full range of human hearing, and be able to capture
a wide range of biological and environmental sounds in high resolution.
The sample rate you should use for audio files in ecoacoustic studies depends on the
specific requirements of your study and the types of sounds you are trying to capture.
In general, a higher sample rate will result in a greater temporal resolution and allow for
more accurate representation of the original sound. However, higher sample rates also
result in larger file sizes, which can be an issue with large datasets.
The sample rate needs to be at least twice the highest frequency of sound that is to be
recorded (as defined in the Nyquist theorem). For example, the upper range of human
hearing is ~20 kHz, so needs a sample rate of at least 40 kHz to be recorded. Similarly,
lesser horseshoe bats have a call at 110 kHz, and so a sample rate of 256 kHz is normally
used to ensure these calls are captured.
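The Nyquist relationship described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the two target frequencies are the examples given in the text.

```python
def min_sample_rate_hz(max_frequency_hz: float) -> float:
    """Lowest sample rate able to represent max_frequency_hz (Nyquist criterion)."""
    if max_frequency_hz <= 0:
        raise ValueError("max_frequency_hz must be positive")
    return 2.0 * max_frequency_hz


def nyquist_frequency_hz(sample_rate_hz: float) -> float:
    """Highest frequency that a given sample rate can represent."""
    return sample_rate_hz / 2.0


# Example targets taken from the text above (values in Hz).
targets_hz = {
    "upper limit of human hearing": 20_000,
    "lesser horseshoe bat call": 110_000,
}

for name, frequency_hz in targets_hz.items():
    needed_khz = min_sample_rate_hz(frequency_hz) / 1000
    print(f"{name}: needs at least {needed_khz:g} kHz")

# A 48 kHz recording represents frequencies up to 24 kHz.
print(nyquist_frequency_hz(48_000))
```

In practice a rate somewhat above the theoretical minimum is chosen, which is why 48 kHz is used for the audible range and 256 kHz for a 110 kHz bat call.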
In ecoacoustic studies, it is common to use sample rates in the range of 44.1 kHz to 48
kHz. These sample rates are sufficient for capturing a wide range of sounds, including
most vocalizations and other biological sounds. Some studies may require higher sample
rates if they are focused on capturing very high frequency sounds or if they are trying to
capture very fine temporal detail. In these cases, sample rates of 96 kHz or higher may be
necessary.
The large Silent Cities citizen-science project used a 48 kHz sample rate, and the widely
used BirdNET algorithm for birdsong classification is designed to work with a 48 kHz
sample rate.
Most audio recorders have a number of potential sample rate options, with rates such as
16, 24, 32, 44.1, 48, 64, 96, 128 and 256 kHz being fairly standard.
Sample rate determines the size of audio files, with high sample rates having a
correspondingly high file size. A mono .wav file at a 256 kHz sample rate of 8 seconds
length may be around 4 MB in size, while a file of the same length at 44.1 kHz might only
be around 700 KB.
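The relationship between sample rate and file size can be checked with a short calculation. The sketch below assumes uncompressed mono PCM at 16 bits per sample (the text does not state the bit depth for this example) and ignores the small WAV header; the function name is hypothetical.

```python
def wav_data_bytes(sample_rate_hz: int, duration_s: float,
                   bit_depth: int = 16, channels: int = 1) -> int:
    """Approximate size of the audio data in an uncompressed PCM WAV file.

    Assumption: the file header (typically a few tens of bytes) is ignored.
    """
    bytes_per_sample = bit_depth // 8
    n_samples = int(sample_rate_hz * duration_s)
    return n_samples * bytes_per_sample * channels


# Eight-second mono clips at the two sample rates mentioned in the text.
for rate_hz in (256_000, 44_100):
    size_mb = wav_data_bytes(rate_hz, 8) / 1_000_000
    print(f"{rate_hz / 1000:g} kHz, 8 s: about {size_mb:.2f} MB")

# A one-minute mono file at the recommended 48 kHz sample rate.
print(wav_data_bytes(48_000, 60) / 1_000_000, "MB")
```

Multiplying the per-file figure by the number of files recorded per day gives a first estimate of the storage needed for a deployment.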
High sample rates can be ‘downsampled’ to reduce file size if necessary - the opposite is
not possible, as frequencies that were never recorded cannot be recovered afterwards.
Bit depth
The bit depth of an audio file refers to the number of bits used to represent each sample of the
audio signal. A higher bit depth allows for a greater dynamic range, which means that the audio
file can capture a wider range of amplitude levels. However, higher bit depths also result in larger
file sizes.
We recommend using a 16 bit depth encoding
Research evidence
There has been little study of the effects of bit depth on ecoacoustic studies.
Rationale
In ecoacoustic studies, it is common to use bit depths of 16 bits or 24 bits. These bit
depths are sufficient for capturing a wide range of sound volumes. Higher bit depths,
especially 32 bit recordings, reduce the potential for ‘clipping’ with loud sounds.
For the majority of automated acoustic recorders, the bit depth is set by the unit’s
firmware and cannot be changed. Hence, no decision on this parameter is normally
necessary by the user. The majority of automated units, e.g. Wildlife Acoustics,
Audiomoth, Frontier Labs and Swift, all record 16 bit files. Handheld recorders from
manufacturers such as Tascam and Zoom can also record at 24 or 32 bit depth, which
provides a wider amplitude scale.
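As a rough guide to what bit depth means for dynamic range, the sketch below applies the standard rule for linear (integer) PCM, where each extra bit adds about 6 dB. The function name is hypothetical. The formula is an idealised upper bound: it does not describe microphone or preamplifier noise, and it does not apply to 32 bit floating-point recordings, which some handheld recorders produce.

```python
import math


def pcm_dynamic_range_db(bit_depth: int) -> float:
    """Theoretical dynamic range of linear integer PCM, in decibels.

    This is the ratio between the largest and smallest representable
    amplitude steps: 20 * log10(2 ** bit_depth).
    """
    if bit_depth <= 0:
        raise ValueError("bit_depth must be positive")
    return 20.0 * math.log10(2 ** bit_depth)


for bits in (16, 24):
    print(f"{bits} bit: about {pcm_dynamic_range_db(bits):.1f} dB")
```

Under this idealisation, 16 bit audio spans roughly 96 dB and 24 bit audio roughly 144 dB.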
File type
There are a number of dierent le types that can be used for audio les in ecoacoustic studies.
Some common le types include WAV, AIFF, FLAC, and MP3.
We recommend using .wav les
Research evidence
The meta-analysis of acoustic index studies by Alcocer et al. (2022) revealed that 94% of
the projects used WAV format audio files.
https://doi.org/10.1111/brv.12890
Heath et al (2021) describe how, with compressed recordings, the signal is altered in
relation to the level of compression, with higher frequencies and quieter sounds most
severely altered. Lossless compression should be preferred in ecoacoustic studies, but
if data storage is an issue, then MP3 encoding can be used while potentially having
minimal impact on most acoustic indices.
https://doi.org/10.1002/ece3.8042
For bird surveys, Darras et al. (2018) recommended recording all audible frequencies in
uncompressed WAV or FLAC audio file format.
https://doi.org/10.1111/1365-2664.13229
Rationale
WAV (Waveform Audio File Format) is a ubiquitous file type that can be produced by
most recorders, and processed by most software. Although file sizes can be larger than
other file types, the files are uncompressed and lossless, preserving all the data from the
original recording.
FLAC (Free Lossless Audio Codec) is a lossless compressed format, with file sizes
often approximately half those of an equivalent WAV file. Some researchers therefore
use this format to archive recordings, saving space (and cost), while not reducing the
information held within the audio recording.
AIFF (Audio Interchange File Format) is another lossless file format that is similar to WAV.
It is also widely supported and is a good choice for preserving the quality of the original
audio.
MP3 (MPEG Audio Layer 3) is a compressed file format that is widely used for storing
audio data. It is a lossy format, which means that it removes some of the audio data in
order to reduce the size of the file. While MP3 files are generally smaller in size compared
to WAV and AIFF files, they may not be as suitable for preserving the quality of the
original audio.
Zero-crossing audio les are simple representations of when the recorded audio signal
crosses the zero line. They can be used to reconstruct a sound wave, and hence provide
data on frequency, but not on amplitude. Zero-crossing audio les are typically created
by applying a threshold to the original audio signal, such that only those samples that
exceed the threshold are retained. The le sizes are very small compared to other types.
The choice of le type depends partly on the recorder used. For example, Audiomoths
record only in WAV format, while Wildlife Acoustics can save les as WAV, a proprietary
W4V compressed format, and as ZC zero-crossing les. The Frontier Lab’s BAR-LT
supports WAV and FLAC les.
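To illustrate why zero-crossing data retain frequency but not amplitude, the toy sketch below estimates the frequency of a synthetic tone purely from the positions at which the signal crosses zero in the upward direction. It is an illustration only: it does not reproduce any manufacturer's ZC file format, and the function names and the 1 kHz test tone are assumptions made for the example.

```python
import math


def upward_zero_crossings(samples):
    """Indices i where the signal rises from negative to non-negative at i."""
    return [i for i in range(1, len(samples))
            if samples[i - 1] < 0 <= samples[i]]


def estimate_frequency_hz(samples, sample_rate_hz):
    """Estimate a tone's frequency from the spacing of upward zero crossings."""
    crossings = upward_zero_crossings(samples)
    if len(crossings) < 2:
        return None  # not enough crossings to measure a full cycle
    cycles = len(crossings) - 1
    duration_s = (crossings[-1] - crossings[0]) / sample_rate_hz
    return cycles / duration_s


# Synthetic 1 kHz tone, 0.1 s long, sampled at 48 kHz (hypothetical test signal).
sample_rate_hz = 48_000
tone_hz = 1_000
loud = [math.sin(2 * math.pi * tone_hz * n / sample_rate_hz + 0.5)
        for n in range(sample_rate_hz // 10)]
quiet = [0.01 * s for s in loud]  # same tone at 1% of the amplitude

print(estimate_frequency_hz(loud, sample_rate_hz))
print(estimate_frequency_hz(quiet, sample_rate_hz))
```

Both signals give the same frequency estimate because the crossing positions are identical; the difference in loudness is lost, which is the trade-off made for the very small file sizes.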
File length
Acoustic recorders can be programmed to record file lengths ranging from seconds to hours. The
choice of recording length normally depends on issues around practical file management and
how the recorded data will be processed.
We recommend using a 1-minute file length
Research evidence
UKAN questionnaire: 31% of respondents selected a 1 minute file length, with 20%
selecting 5 minutes.
The 35 acoustic index studies reviewed by Alcocer et al. (2022) used file lengths equal to
(40%) or shorter than 1 minute (40%).
https://doi.org/10.1111/brv.12890
When manually processing birdsong recordings, Bayne et al. (2017) found that shorter
duration (1 min) files increased detection rates for species and allowed for wider
coverage of times of day and different dates.
http://bioacoustic.abmi.ca/wp-content/uploads/2017/08/ARUs_and_Human_Listeners.
pdf
A literature review by Minkova et al. (2020) found that a small number of studies have
contrasted alternative file-length choices, indicating that short duration files (e.g. 15 s–1
min) are most effective and efficient for detecting species, particularly for species that
are relatively common. Minkova et al. (2020) chose to use 1 min audio clips.
https://www.dnr.wa.gov/publications/lm_oesf_pac_sp.pdf
In the tundra breeding bird survey by Thompson et al. (2017), analysis indicated that
for most species of birds, a single 10 min survey during
times and dates of high availability (June, between 0500 hours and 2000 hours) is likely
sufficient to establish occupancy status. However, in any single 10 min recording, the
majority of species are detected within the first few minutes; thus shorter duration
recordings are likely to be more efficient in detecting species occupancy.
https://wildlife.onlinelibrary.wiley.com/doi/abs/10.1002/jwmg.21285
Cook & Hartley (2018) used two different time-sampling methods to calculate species
richness and acoustic prevalence of birds, comparing 5 min sections of recordings with
the first 10 s of each minute to create a composite of 5 min duration. The 10 s composite
samples detected 26% more species and produced improved prevalence indices,
requiring 60% less listening time to detect as many species as the 5 min sections.
doi.org/10.5751/ACE-01221-130121
Metcalf et al. (2021) compared the results of sampling one hour of data by using 240 15 s
samples spread randomly across a survey window, with sampling of four 15 min samples.
They found that the shorter files, providing a ‘higher temporal resolution’, outperformed
the less frequent longer files in every metric considered, detecting 50% higher alpha
diversity, and 10% higher gamma diversity.
https://doi.org/10.1111/2041-210X.13521
Cifuentes et al (2021) suggest that short recordings sampled throughout the survey
period accurately represent acoustic patterns, with an optimal schedule of ten 1 minute
samples per hour.
https://doi.org/10.21068/c2021.v22n01a02
Melo et al. (2021) employed a 2 minute file length (with a single recording per hour),
and were able to detect a large number of anuran species with an appropriate level of
sampling effort and temporal scale.
https://doi.org/10.1016/j.ecolind.2021.108305
The review by Sugai (2019) found that for studies with 24 h diel recordings, the most
commonly used recording lengths were up to 3 min (59%), or between 3-10 min (31.8%).
https://doi.org/10.1093/biosci/biy147
Rationale
Analysis of data for ecoacoustic studies has commonly been undertaken using 1 min
length files, such that this has become a de facto standard:
https://research.ecosounds.org/2019/08/09/analyzing-data-in-one-minute-chunks.html
This relatively short file length enables a greater range of time periods to be covered
for the same data volume, aids parallel computation with manageable file sizes, retains
sufficient detail of vocalisation structures (e.g. birdsong sequences), and can be easily
viewed in reasonable temporal detail on a standard computer screen. In addition, when
calculating acoustic indices, this file length seems to achieve a compromise between
introducing boundary effects from cropping sound sequences into short segments,
and over-smoothing temporal variation to gross averages. Finally, one minute has been
shown to be an efficient length for listening by analysts, without attention fading and
signals being missed.
Files per hour
A number of studies have found that a stratified ‘on-off’ time sampling programme (e.g. recording
1 minute in every 10), can capture comparable data to continuous recording, with consequent
benefits in terms of battery life, data storage and processing time.
We recommend recording 12 files per hour
Research evidence
UKAN questionnaire: 27% of respondents selected 6 files per hour, with 17% selecting 12
files per hour.
Bradfer-Lawrence et al. (2019) assessed the length of time required to generate stable
acoustic index values at a location, and concluded that continuous recordings are
more eective for rapidly capturing soundscape character, while sparse time-sampling
delayed this process. As a result, their recommendation was to sample continuously to
minimise the required deployment period.
https://doi.org/10.1111/2041-210X.13254
Pieretti et al. (2015) simulated five different recording schedules from continuous sound
files: (i) one minute every five; (ii) one minute every 10; (iii) one minute every 20; (iv)
one minute every 30; and (v) one minute every 60. For each schedule they calculated
the Acoustic Complexity Index. The 1 min in five schedule closely correlated with the
soundscape captured by continuous recordings (r>0.90; p<0.01), while providing an 80%
storage space and battery power reduction compared to the continuous sampling.
https://doi.org/10.1111/2041-210X.13254#mee313254-bib-0035
Shaw et al. (2022) investigated the eort required to estimate bird species richness and
composition in European forests. They compared sampling intensity for 1 min les,
in intervals from 1-in-3 (n = 20 per hour) to 1-in-60 min (n = 1 per hour). The highest
species richness was with recordings at the highest intensity of one every 3 mins.
https://doi.org/10.1002/ece3.9491
The studies in the literature review by Minkova et al. (2020) recorded a daily total of
recordings ranging from 10-240 min per 24-hour period (equal to 0.4-10 minutes per
hour). However, the sampling protocol was often influenced by study limitations such as
availability of personnel, hardware and data storage capacity. Minkova et al. (2020) used
two sampling densities: four 1 min clips from each hour 0400-1000, and two 1 min clips
from each hour 1000-2200.
https://www.dnr.wa.gov/publications/lm_oesf_pac_sp.pdf
The review by Sugai et al. (2019) found that for studies with 24 h diel recordings, most used
a single recording per hour (47%), with the remaining studies using 2, 4, or 6 recordings
per hour.
https://doi.org/10.1093/biosci/biy147
Rationale
In combination with le lengths of one minute, as recommended above, 12 recordings
per hour provide a 20% time-sampling coverage. This level of sampling eort has been
shown to adequately capture soundscape characteristics or species directions, while
balancing data storage and processing requirements.
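The 20% figure follows directly from the schedule: 12 files of one minute each cover 12 of every 60 minutes. The short Python sketch below reproduces that arithmetic and lists evenly spaced start minutes within the hour. The function names, and the choice to space files evenly, are illustrative assumptions rather than part of the guidelines.

```python
def sampling_coverage(files_per_hour: int, file_length_min: float = 1.0) -> float:
    """Return the fraction of each hour that is recorded (0 to 1)."""
    recorded_min = files_per_hour * file_length_min
    if recorded_min > 60:
        raise ValueError("schedule records more than 60 minutes per hour")
    return recorded_min / 60.0


def start_minutes(files_per_hour: int) -> list[int]:
    """Return evenly spaced start minutes within the hour.

    Assumes (for illustration) that files are spread evenly and that
    60 is divisible by the number of files per hour.
    """
    if files_per_hour <= 0 or 60 % files_per_hour != 0:
        raise ValueError("files_per_hour must be a positive divisor of 60")
    step = 60 // files_per_hour
    return list(range(0, 60, step))


if __name__ == "__main__":
    # Recommended schedule: 12 one-minute files per hour
    print(f"Coverage: {sampling_coverage(12):.0%}")
    print("Start minutes:", start_minutes(12))
```

Under the same arithmetic, the 1-minute-in-5 schedule tested by Pieretti et al. (2015) records 20% of the time, which is consistent with the 80% storage and battery saving they report.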
Daily programme
Detection probability for birds and other taxa normally varies with time of the day, so recording
times distributed throughout the day will sample the entire community most effectively.
We recommend recording for the full 24 hour cycle
Research evidence
UKAN questionnaire: 67% of respondents selected the full 24 hr daily cycle
Bradfer-Lawrence et al. (2019) found that characteristic diel patterns are important for
determining dierences between habitat types. Acoustic indices may be highly similar
between habitats at some times of the day, while diering widely at other times. A wide
range of recording times is therefore useful in characterising habitat types.
https://doi.org/10.1111/2041-210X.13254
The bird study by La & Nudds (2016) found that morning-only acoustic recordings
underestimated species richness, and that the greatest number of species per unit of
sampling eort was detected with on-the-hour samples between 07:00 and 12:00, and
at 21:00.
https://doi.org/10.1002/ecs2.1294
Shaw et al. (2022) investigated the eort to estimate bird species richness and
composition in European forests. They compared recording in a dawn period (1 hr before
sunrise), a morning period (1 hr beginning 3 hr after sunrise), and a combined period
including both day phases. Species richness was signicantly higher when including
both day phases compared to dawn alone, and was slightly higher in the morning
compared to dawn (yielding 80% of recorded species). However, certain nocturnal/
crepuscular species could only be observed in the dawn period.
https://doi.org/10.1002/ece3.9491
Thompson et al. (2017) deployed recorders to assess how avian detection varied at different
times of day and dates. In their subarctic tundra sites, without a distinct dawn or dusk,
most species displayed circadian patterns, with detection peaking at 0800-1200 hours,
but remaining high through the day for some species. Between 2200 hours and 0500
hours, detection rates dropped to near zero, signalling a rest period for most species. The
peak time of detection for most species took place in the late morning (0900–1000 hours).
doi/abs/10.1002/jwmg.21285
Sugai et al. (2019) reviewed the recording periods for 460 studies that used passive
acoustic monitoring. Due to a concentration on bat and anuran studies, sampling effort
was mostly concentrated during the night. However, soundscape studies, not targeted
at particular taxa, recorded through more of the diel cycle, with most effort at dawn and
dusk.
https://doi.org/10.1093/biosci/biy147
Froidevaux et al. (2014) showed that sampling the full night was essential to fully
capture the maximum number of bat species in forest habitats: covering only the dusk and
dawn peaks in bat activity did not record the rarer species with low detection
probabilities.
https://doi.org/10.1002/ece3.1296
Linke et al. (2020) demonstrated that acoustic activity is highly sensitive to diurnal
variation, with only 25-50% of sound types in tropical freshwaters detectable in any 4 hr
period. A comprehensive sampling strategy therefore needs to include a 24 hr recording
schedule to capture soundscape patterns.
https://doi.org/10.1111/fwb.13227
Rationale
Recording through the full 24 hour period will capture events at all times of day,
including the avian dawn and evening choruses, and nocturnal animals. It also allows the
soundscape to be characterised evenly through the diel cycle.
Recording sound through the 24 hour diel cycle can be important in ecoacoustic studies
to capture the full range of sounds produced in an ecosystem, and to study the effects
of diel patterns on sound production. Many ecosystems are characterised by changes in
the soundscape produced over the course of a day in response to the natural history and
behaviour of different species. By recording sound over a full diel cycle, it is possible to
study these effects.
Deployment period
Automated recorders are able to be powered for extended periods, particularly if using extended
battery packs or even solar power. The storage capacity of SD cards has also expanded to the
extent that days or weeks of sound data can be recorded on single deployments.
We recommend that deployments should last for a minimum of one week
Research evidence
UKAN questionnaire: 22% of respondents considered that a one week deployment was
appropriate for ecoacoustic studies, with 20% selecting two weeks.
The 35 acoustic index studies reviewed by Alcocer et al. (2022) recorded for an average
of 44 days (range 1–282 days).
https://doi.org/10.1111/brv.12890
Bayne et al. (2017) state that, for singing birds, deployment over several days results
in higher detection and occupancy rates than using a single day. However, there are
diminishing returns - with fewer benefits from month-long deployments in comparison
to covering more locations.
http://bioacoustic.abmi.ca/wp-content/uploads/2017/08/ARUs_and_Human_Listeners.pdf
Bradfer-Lawrence et al. (2019) recommend collecting at least 120 hr of continuous
recordings per site, to fully describe the soundscape in tropical habitats. These
soundscapes are often more complex than those of temperate systems, and so less time
may be required in (e.g.) European contexts.
https://doi.org/10.1111/2041-210X.13254
Minkova et al. (2020) studied breeding forest birds and recorded for a 10 day period,
before extracting four discrete 24 hour periods from this total.
https://www.dnr.wa.gov/publications/lm_oesf_pac_sp.pdf
The acoustic bird survey by Franklin et al. (2020) recorded for 15 hrs/site and resulted
in an average of 88% completeness of the assemblage; 73% completeness could be
achieved with 5 hrs of recordings.
https://www.researchgate.net/publication/339665372_Establishing_the_adequacy_of_recorded_acoustic_surveys_of_forest_bird_assemblages
Shaw et al. (2022) investigated the eort to estimate bird species richness and
composition in European forests. They compared durations of 1–4 recording days for
each recorder. Bird richness signicantly increased with each added day up to 3 days,
with no dierence from adding the 4th day.
https://doi.org/10.1002/ece3.9491
The bird study by La & Nudds (2016) found that a survey period of at least 3 days was
required to maximise species richness.
https://doi.org/10.1002/ecs2.1294
Furnas & Bowie (2020) stated the importance of adopting a temporal schedule that
represents the range of conditions likely to affect detection probabilities (e.g. changes
in weather, phenology and movement of animals). In most cases, this requires sampling
over several days, with appropriate environmental covariates being recorded as part of
the study protocol.
https://doi.org/10.2989/00306525.2020.1788829
Melo et al. (2021) considered that species detection in monitoring programs is strongly
associated with both sampling effort and temporal range of monitoring. Their study
compared six potential sampling scenarios: single hour/day, five night/full-day, thirty
night/full-day using recordings of 2 mins every hour. The greatest species richness was
recorded with the thirty full day scenario.
https://doi.org/10.1016/j.ecolind.2021.108305
Rationale
Automated passive acoustic methods enable long-term deployments that cannot
normally be matched by observers. They thus enable a higher sampling effort and wider
temporal range of sampling than traditional approaches, and consequently produce
higher probabilities of species detection.
Number of deployments per year
While many studies focus on particular times of year, such as the spring bird breeding period,
for long-term ecoacoustics studies there will be considerable value in recording audio data
throughout the annual cycle.
We recommend that deployments should take place a minimum of four times per year, once
per season
Research evidence
UKAN questionnaire: 51% of respondents selected 4 deployments per year (one per
season).
Bradfer-Lawrence et al. (2019) considered that short deployments during distinct
seasons may be as suitable as a single long deployment (e.g. to total 120+ hours).
https://doi.org/10.1111/2041-210X.13254
Siddagangaiah et al. (2022) studied the annual variation in underwater soundscapes,
finding a phenology of fish chorusing that changed between seasons, reflecting species
behaviour.
https://doi.org/10.1038/s43247-022-00442-5
Rationale
Species occupancy and vocal activity levels will vary throughout the year, as will the
overall soundscape of an ecosystem. To adequately capture this annual variation, it is
recommended that ecoacoustic studies should cover all seasons: summer, autumn,
winter and spring.
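To see what the recommended protocol yields in recorded audio, the following Python sketch combines the recommendations above (1 minute files, 12 files per hour, 24 hour recording, one week per deployment, four deployments per year). It is arithmetic only: the function name and defaults are illustrative assumptions, and the result should not be read as equivalent to the same number of hours of continuous recording.

```python
def recorded_hours(
    days: int,
    files_per_hour: int = 12,
    file_length_min: float = 1.0,
    hours_per_day: int = 24,
) -> float:
    """Return total hours of audio recorded over a deployment."""
    total_min = days * hours_per_day * files_per_hour * file_length_min
    return total_min / 60.0


if __name__ == "__main__":
    per_deployment = recorded_hours(days=7)  # one-week minimum deployment
    per_year = 4 * per_deployment            # one deployment per season
    print(f"Audio per one-week deployment: {per_deployment:.1f} h")
    print(f"Audio per year (4 deployments): {per_year:.1f} h")
```

On these assumptions a single one-week deployment produces about 33.6 hours of audio, and four seasonal deployments about 134.4 hours per site per year, which is of the same order as the 120+ hours mentioned by Bradfer-Lawrence et al. (2019).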
Spatial layout
When using multiple recorders, a decision needs to be made on how to arrange these spatially.
Random, transect, grid or fractal patterns can be used, or the location of recorders can be selected
based on target features such as habitat types or nesting locations.
We recommend that recorder locations should be selected based on parameters such as
habitat type
Research evidence
UKAN questionnaire: 58% of respondents would use a selected/optimised spatial
distribution of sensors (e.g. by habitat type), with 17% choosing a grid-based
arrangement.
Wood & Peery (2022) discuss two different sampling frameworks for acoustic studies.
Recorders may be deployed preferentially in areas known to be important to a
species, such as nest sites, implying an ‘area of occupancy’ concept of a species range.
Alternatively, recording locations may be randomly determined without relation to any
knowledge of species use, e.g. in a survey grid, within a wider ‘extent of occurrence’.
Preferential sampling requires substantial pre-survey information, but leads to intuitive
parameter interpretation and greater precision due to its finer spatial scale; while greater
survey coverage is attainable with random sampling.
https://doi.org/10.1111/ibi.13092
Piña-Covarrubias et al. (2018) tested how the placement of acoustic sensors could be
optimized, as an alternative to the use of standard grids. They found that, on hilly terrain,
selected placements on higher ground could halve the required number of sensors to
cover an area, compared to a square grid.
https://doi.org/10.1002/rse2.97
In their study on bats, Froidevaux et al. (2014) showed that the three-dimensional
structure of forests, including all microhabitats, must be sampled to adequately record
the full species complement of bat communities.
https://doi.org/10.1002/ece3.1296
Rationale
The spatial layout of recorders in a study will largely depend on the aims of the project.
Investigations of environmental gradients will promote the use of linear, i.e. transect,
layouts, while studies examining differences between habitat types will likely employ a
selected or stratified grid layout. Projects to determine occupancy of particular habitat
features, such as amphibian presence in ponds, will clearly make use of closely targeted
locations. Many studies have used survey designs where detectors are rotated across
a number of locations to increase geographical coverage. This reduces comparability
between sites in terms of the dates when sampling occurs, but can be effective in
maximising limited hardware resources.
Spatial density
Unless simultaneous recordings are specifically required across an array of recorders (for the
purposes of localization), then spacing between units is normally set to prevent any replication of
sounds between sites. When undertaking species-specific studies, the spatial density of recording
sites may usefully correspond to typical territory size of the target species.
We recommend that recorder locations should be a minimum of 250m apart
Research evidence
UKAN questionnaire: 30% of respondents selected a 500m separation distance between
recorders (equal to 4 recorders/km²), with 23% choosing a 250m distance (equal to 16
recorders/km²).
Minkova et al. (2020) aimed to evaluate bird habitat use in forest stands of different ages
and management types. Their preliminary field tests (in Kuehne et al. 2019) showed that
the effective detection range of their Songmeter units was unlikely to exceed 125m for
their species of interest, and so they spaced sampling locations ≥250m apart.
https://www.dnr.wa.gov/publications/lm_oesf_pac_sp.pdf
The Yip et al. (2017) study on bird sounds confirmed that, for all species calls and
broadcast tones, detection probability declined with increasing distance and decreasing
sound amplitude, and was higher in open vegetation than in closed vegetation.
Furnas & Bowie (2020) state the recommendation, following traditional point
counts, that independent sampling locations at least 250m apart should be used for
autonomous sound recorders. This separation distance will address the potential for
double counting and spatial autocorrelation, with their resulting biases on results and
precision.
https://doi.org/10.2989/00306525.2020.1788829
Rationale
For coverage of a site, the aim is normally to sample across the range of the habitats
and species of interest, with recorders placed to limit overlap of detection radii so that
counts are independent. The effective radius of most recorders is in the region of 50m,
so a minimum separation distance of at least 100m should be used. As a recommended
standard, a larger 250m spacing between recorder locations would provide 16 sampling
locations/km². This is dense enough to provide a good level of survey data, and is also
likely to be relevant to the territory sizes of many species of interest within ecological
assessments.
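The densities quoted above follow from simple geometry. The Python sketch below shows the two calculations: the minimum separation needed to keep detection radii from overlapping, and the number of sampling locations per km² on a regular square grid. The function names are illustrative, and the density figure assumes an idealised square grid with edge effects ignored.

```python
def min_separation_m(detection_radius_m: float) -> float:
    """Smallest spacing at which two detection circles do not overlap."""
    return 2 * detection_radius_m


def recorders_per_km2(spacing_m: float) -> float:
    """Sampling locations per km² on a regular square grid.

    Assumes each recorder 'owns' a spacing x spacing cell, so edge
    effects at the boundary of the survey area are ignored.
    """
    if spacing_m <= 0:
        raise ValueError("spacing must be positive")
    return (1000.0 / spacing_m) ** 2


if __name__ == "__main__":
    print(min_separation_m(50))    # 50 m effective radius
    print(min_separation_m(125))   # 125 m range reported by Minkova et al. (2020)
    print(recorders_per_km2(250))  # recommended spacing
    print(recorders_per_km2(500))  # most popular questionnaire response
```

With a 50m radius the minimum separation is 100m, and with the 125m range reported by Minkova et al. (2020) it is 250m; spacings of 250m and 500m give 16 and 4 locations per km² respectively, matching the figures in the text.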
Appendix 2: A table of acoustic monitoring guidance documents from around the world

| Taxa | Region | Title | Authors and link |
| --- | --- | --- | --- |
| Amphibians | USA | Amphibian Monitoring Protocol (Version 2.0) | National Park Service, Great Lakes Inventory and Monitoring Network. https://www.nps.gov/im/glkn/amphibians.htm |
| Bats | USA | Range-wide Indiana bat & Northern long-eared bat survey guidelines | U.S. Fish and Wildlife Service. (2022). https://www.fws.gov/library/collections/range-wide-indiana-bat-and-northern-long-eared-bat-survey-guidelines |
| Bats | USA | A Plan for the North American Bat Monitoring Program (NABat) | USDA (2015). https://www.srs.fs.usda.gov/pubs/gtr/gtr_srs208.pdf |
| Bats | USA | Guidance for conducting acoustic surveys for bats: Version 1 detector deployment, file processing and database version | National Park Service. https://irma.nps.gov/DataStore/Reference/Profile/2231984 |
| Bats | UK | Designing effective survey and sampling protocols for passive acoustic monitoring as part of the national bat monitoring | Newson, S.E., Boughey, K.L., Robinson, R.A. & Gillings, S. 2021. JNCC Report No. 688, JNCC, Peterborough, ISSN 0963-8091. https://hub.jncc.gov.uk/assets/4cc324dc-1ad8-446e-acdd-a656348025b3 |
| Bats | Scotland | Bats and onshore wind turbines - survey, assessment and mitigation | NatureScot, 2021. https://www.nature.scot/doc/bats-and-onshore-wind-turbines-survey-assessment-and-mitigation |
| Bats | UK | Bat Surveys for Professional Ecologists: Good Practice Guidelines | Collins, J. (ed.) (2016). 3rd edition. The Bat Conservation Trust, London. ISBN-13 978-1-872745-96-1. https://www.bats.org.uk/resources/guidance-for-professionals/bat-surveys-for-professional-ecologists-good-practice-guidelines-3rd-edition |
| Bats | UK | Guidelines for passive acoustics surveys of bats in woodland | Bat Conservation Trust. https://www.bats.org.uk/our-work/national-bat-monitoring-programme/passive-acoustic-surveys/guidelines-for-passive-acoustic-surveys-of-bats-in-woodland |
| Birds | Canada | Species detection survey protocols: Forest bird surveys | Saskatchewan Ministry of Environment. 2014. Forest Birds Survey Protocol. Fish and Wildlife Branch Technical Report No. 2014-10.0. 3211 Albert Street, Regina, Saskatchewan. http://www.environment.gov.sk.ca/Default.aspx?DN=bcaf2087-feef-4e7e-acbf-e788a0734e71 |
| Birds | New Zealand | Protocols for the inventory and monitoring of populations of the endangered Australasian bittern (Botaurus poiciloptilus) in New Zealand | O’Donnell, C., and Williams, E., New Zealand Department of Conservation. 2015. https://www.researchgate.net/publication/275465977_Protocols_for_the_inventory_and_monitoring_of_populations_of_the_endangered_Australasian_bittern_in_New_Zealand |
| Birds | UK | Bird Survey Guidelines: Passive audio recording | Bird Survey & Assessment Steering Group. (2022). Bird Survey Guidelines for assessing ecological impacts, v.0.1.7. https://birdsurveyguidelines.org/803-2/ |
| Birds | Canada | How to Most Effectively Use Autonomous Recording Units When Data are Processed by Human Listeners | Bayne, E., Knaggs, M., and Sólymos, P. Bioacoustic Unit, Bayne Lab at the University of Alberta & Alberta Biodiversity Monitoring Institute. 2017. http://bioacoustic.abmi.ca/wp-content/uploads/2017/08/ARUs_and_Human_Listeners.pdf |
| Birds | UK | Bird Bioacoustic Surveys – Developing a Standard Protocol | Abrahams, C. inpractice the Bulletin of the Chartered Institute of Ecology and Environmental Management. December 2018. https://www.researchgate.net/publication/329443381_Bird_Bioacoustic_Surveys_-_Developing_a_Standard_Protocol |
| Birds | Canada | Terrestrial ABMI Autonomous Recording Unit (ARU) and Remote Camera Trap Protocols | Alberta Biodiversity Monitoring Institute. 2021. https://www.abmi.ca/home/publications/551-600/599 |
| Cetaceans | USA | Baseline Long-term Passive Acoustic Monitoring of Baleen and Sperm Whales and Offshore Wind Development | Appendix I of: Van Parijs, S. M., Baker, K., Carduner, J., Daly, J., Davis, G. E., Esch, C., … Staaterman, E. (2021). NOAA and BOEM Minimum Recommendations for Use of Passive Acoustic Listening Systems in Offshore Wind Energy Development Monitoring and Mitigation Programs. Frontiers in Marine Science, 8, 1575. https://www.frontiersin.org/articles/10.3389/fmars.2021.760840/full |
| Cetaceans | Global | Position Statement 3: Passive Acoustic Monitoring | Marine Mammal Observer Association, 2013. https://www.mmo-association.org/mmoa-activities/position-statements?id=111 |
| Cetaceans | Scotland | Use of Static Passive Acoustic Monitoring (PAM) for monitoring cetaceans at Marine Renewable Energy Installations (MREIs) for Marine Scotland | Embling, C. B., Wilson, B., Benjamins, S., Pikesley, S., Thompson, P., Graham, I., Cheney, B., Brookes, K.L., Godley, B.J. & Witt, M. J. https://tethys.pnnl.gov/sites/default/files/publications/emblingetal.pdf |
| Cetaceans | New Zealand | Report of the Marine Mammal Observer/Passive Acoustic Monitoring Requirements Technical Working Group | DOC (Ed) 2016. Marine Species and Threats, Department of Conservation, Wellington, New Zealand. 47 p. https://www.doc.govt.nz/globalassets/documents/conservation/marine-and-coastal/seismic-surveys-code-of-conduct/twg-reports-2016/01-scr-mmo-pam-reqs.pdf |
| Devices | Canada | Autonomous Recording Unit Deployment Protocol: SM2, SM3, and SM4 Models of Song Meters | Lankau, H., Bioacoustic Unit, Bayne Lab at the University of Alberta & Alberta Biodiversity Monitoring Institute. 2017. http://bioacoustic.abmi.ca/wp-content/uploads/2018/01/DeploymentProtocol_e.pdf |
| Devices | Canada | SongMeter (SM3) Maintenance Protocol | Bioacoustic Unit, Bayne Lab at the University of Alberta & Alberta Biodiversity Monitoring Institute. 2016. https://www.wildtrax.ca/dam/jcr:9a5ad9ac-c684-4712-a811-74f882acfd5b/BU_2019_SM3MaintenanceProtocol.pdf |
| Devices | Australia | Deployment manual for solar powered acoustic sensors | The Australian Acoustic Observatory (A2O). https://acousticobservatory.org/deployment-information/ |
| Fish | Northeast Atlantic | ICES Survey Protocols – Manual for Acoustic Surveys Coordinated under ICES Working Group on Acoustic and Egg Surveys for Small Pelagic Fish | Doray, M., Boyra, G., and van der Kooij, J. (Eds.). 2021. 1st Edition. ICES Techniques in Marine Environmental Sciences Vol. 64. 100 pp. https://doi.org/10.17895/ices.pub.7462 |
| Whole Soundscape | Norway | Management relevant applications of acoustic monitoring for Norwegian nature – The Sound of Norway | Sethi, S. S., Fossøy, F., Cretois, B. & Rosten, C. M. 2021. NINA Report 2064. Norwegian Institute for Nature Research. https://brage.nina.no/nina-xmlui/handle/11250/2832294 |
| Whole soundscape | UK | Bioacoustics for Agri-Environment Monitoring | Excerpt from: Developing technologies for agri-environment monitoring Developing approaches to agri-environment monitoring (M&E Baseline/Programme Development) - LM04108 CEH Project reference: 7379 Date 22/02/2021 Roy, D.B., Abrahams, C., August, T., Christelow, J., Gerard, F., Howell, K., Logie, M., McCracken, M., Pallet, D., Pocock, M., Read, D.S. & Staley, J. https://randd.defra.gov.uk/ProjectDetails?ProjectId=20551 |
| Whole Soundscapes | Global | Silent·Cities: A participatory monitoring programme of an exceptional modification of urban soundscapes | Samuel Challéat, Amandine Gasc, Nicolas Farrugia, Jérémy Froidevaux. https://osf.io/h285u/ |
| Whole Soundscapes | Global | Passive acoustic monitoring in ecology and conservation | Ella Browning, Rory Gibb, Paul Glover-Kapfer & Kate E. Jones. 2017. WWF Conservation Technology Series 1(2). WWF-UK, Woking, United Kingdom. https://www.wwf.org.uk/sites/default/files/2019-04/Acousticmonitoring-WWF-guidelines.pdf |
| Whole Soundscapes | UK | The potential use of acoustic indices for biodiversity monitoring at long-term ecological research (LTER) sites | Andrews, C. and Dick, J. 2021. UK Centre for Ecology & Hydrology. https://nora.nerc.ac.uk/id/eprint/531301/1/N531301CR.pdf |
Appendix 3: R code for false-colour plots
# Kaleidoscope False Colour Plot
# Carlos Abrahams 2022-12-23
library(tidyverse)
library(lubridate) # ymd_hms(), yday(), hour(); attached explicitly for older tidyverse versions
library(scales)    # rescale()

# Example dataset #######
# Generate example data for hourly samples over five days
set.seed(123)
ai_data <- tibble(
  ACI = runif(120, min = 150, max = 200),
  BI = runif(120, min = 50, max = 100),
  NDSI = runif(120, min = -1, max = 1),
  ai_dtime = seq(ymd_hms("2018-08-06 00:00:00"),
                 ymd_hms("2018-08-10 23:59:00"),
                 by = "1 hour")
)

# Extract year_day and hour from ai_dtime POSIX
ai_data <- ai_data %>%
  mutate(ai_date = yday(ai_dtime),
         ai_time = hour(ai_dtime))

# Rescale all Acoustic Index scores to 0-1 for RGB plotting #######
ai_data <- ai_data %>%
  mutate(
    ACInorm = rescale(ACI),
    BInorm = rescale(BI),
    NDSInorm = rescale(NDSI)
  )

# Plot false-colour raster #######
# Each cell (day x hour) is coloured by its three rescaled index values
ggplot(ai_data, aes(x = ai_date, y = ai_time)) +
  geom_raster(fill = rgb(
    red = ai_data$ACInorm,
    green = ai_data$BInorm,
    blue = ai_data$NDSInorm
  )) +
  labs(
    x = "Year Day",
    y = "Time",
    title = "False-colour plot of acoustic indices",
    subtitle = "ACI = Red, BI = Green, NDSI = Blue"
  )
Page 82
... In recent years, noisy environments have changed significantly and rapidly, where areas with animal abundance are now quieter or substituted by non-natural sound sources, e.g., electric vehicles, drones, and heat pumps (Waddington et al., 2022). At the beginning of 2023, the Acoustics Network of the UK, Manchester Metropolitan University and Baker Consultants released a "Good practice guidelines for long-term ecoacoustic monitoring in the UK", which provides recommendations about hardware (equipment and sensors) to use, study protocol (temporal and spatial considerations, audio settings, metadata and data storage) (Metcalf et al., 2023). Some essential information regarding equipment and settings for long-term ecoacoustic monitoring are stated in the guideline, such as type of equipment, sampling rate, bit depth, duration and periodicity. ...
... The periodicity should be one week and take place four times per year, one in each season. In terms of acoustic indices were highlighted: Acoustic Complexity Index (ACI), Acoustic Diversity Index (ADI), Acoustic Evenness (AEve), Activity (ACT), Acoustic Space Use (ASU), Background noise (BGN), Bioacoustic Index (Bio), Spectral entropy (Hs), Temporal entropy (Ht), Acoustic entropy (H), Events per second (EVN), Median of the amplitude envelope (M), Normalised Difference Soundscape Index (NDSI), Number of frequency peaks (NP), Signal-to-Noise ratio (SNR), and Soundscape Saturation (Sm) (Metcalf et al., 2023). ...
Conference Paper
In recent years, several studies have shown how anthropogenic noise impacts wildlife. The methodologies used to quantify noise appear to influence data reliability and subsequent findings. Therefore, it is appropriate to review the robustness of acoustic measurement procedures to understand the extent to which studies can be relied upon. In 2023, the UK Acoustics Network produced "Good practice guidelines for long-term ecoacoustic monitoring in the UK". These guidelines will be used for the methodological parametrisation of our investigation. This study quantifies the reliability of existing studies on anthropogenic noise impacts on birds without confounding factors (on an acoustic basis only) through a systematic literature review. The criteria investigated are: equipment used, calibration, frequency range and duration. Additionally, data on how birds are influenced by anthropogenic noise and the indices used were extracted to quantify and qualify noise impact. The screening of manuscripts will follow the Prisma procedure for systematic reviews, and the results will be clustered according to geographical location. This work expects to summarises how anthropogenic noise impacts birds worldwide and how the robustness of the acoustic measurements influences these results.
... Leveraging sophisticated machine learning algorithms like BirdNET [4], this work underscores the potential for PAM technology to revolutionise bird monitoring practices and solve the challenges mentioned. Guided by standardised protocols [5] and the investigation of various types of autonomous recording units [6,7], this research aims to contribute to the advancement of ecological studies, conservation initiatives, and the protection of South African wetland birds and their habitats. The existing solutions and systems lay the foundation for the development of an effective monitoring sensor for South African wetland birds, with the potential to enhance ecological research and conservation efforts in these vital ecosystems. ...
Article
Full-text available
Biodiversity monitoring, particularly in a country as diverse as South Africa with its extensive migratory bird population, presents significant challenges. This challenge becomes even more pronounced in environments with a multitude of coexisting bird species, notably wetlands, which serve as crucial breeding and feeding grounds for various avian species. This research will address this challenge by designing a cost-effective sound-based sensor system capable of deployment in diverse wetland ecosystems. The primary aim is to aid in the monitoring of bird species by detecting their presence and distribution and then transmitting this valuable data to a central base station. To assess the system’s feasibility and performance, a series of experiments were conducted at the Rondevlei Nature Reserve in Cape Town, South Africa. These experiments focused on the sensor’s capacity to accurately identify avian species while maintaining robustness in varying environmental conditions. The results yielded promising outcomes, demonstrating the successful identification of bird species. Furthermore, the system exhibited reliability across different weather conditions, positioning it as a viable choice for long-term deployment in wetland environments. Beyond species detection, this project also delved into practical aspects of data transfer and storage efficiency, ensuring the system’s suitability for real-world applications. Modularity was another crucial consideration, simplifying maintenance and upgrades. Moreover, a preliminary cost analysis indicated the cost-effectiveness of the system compared to commercial alternatives. The integration of climate sensors into the monitoring system was explored as a future direction. This addition holds the potential to provide a more comprehensive approach to environmental monitoring by incorporating climate data into the analysis. 
Such a holistic approach can further enrich our understanding of bird behaviour in relation to changing environmental conditions. The findings of this research have significant implications for avian conservation and ecological studies, particularly in the unique context of South Africa. This project introduces an affordable and practical tool for monitoring bird species in wetland habitats, offering valuable insights into the preservation and management of these critical ecosystems.
Book
Sound and listening are intrinsically linked to how we experience and engage with places and communities. This guide invites landscape architects and urban designers to become soundscape architects and offers practical advice on sound and listening applicable to each stage of a design project: from reading the environment to intervening in it. The book will be of interest to landscape architects, together with other design professionals such as urban designers, architects, artists, planners and engineers who play a primary role in the composition of the soundscape.
Preprint
Fifty-nine percent of species on Earth inhabit the soil. However, soils are degrading at unprecedented rates, necessitating efficient, cost-effective, and minimally intrusive biodiversity monitoring methods to aid in their restoration. Ecoacoustics is emerging as a promising tool for detecting and monitoring soil biodiversity, recently proving effective in a temperate forest restoration context. However, understanding the efficacy of soil ecoacoustics in other ecosystems and bioregions is essential. Here, we applied ecoacoustics tools and indices (Acoustic Complexity Index, Bioacoustic Index, Normalised Difference Soundscape Index) to measure soil biodiversity in an Australian grassy woodland restoration chronosequence. We collected 240 soil acoustic samples from two cleared plots (continuously cleared through active management), two woodland restoration plots (revegetated 14-15 years ago), and two remnant vegetation plots over 5 days at Mount Bold, South Australia. We used a below-ground sampling device and sound attenuation chamber to record soil invertebrate communities, which were also manually counted. We show that acoustic complexity and diversity were significantly higher in revegetated and remnant plots than in cleared plots, both in-situ and in sound attenuation chambers. Acoustic complexity and diversity were also strongly positively associated with soil invertebrate abundance and richness, and each chronosequence age class supported distinct invertebrate communities. Our results provide support that soil ecoacoustics can effectively measure soil biodiversity in woodland restoration contexts. This technology holds promise in addressing the global need for effective soil biodiversity monitoring methods and protecting our most diverse ecosystems.
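As an illustration of the kind of index named in the abstract above, the following is a minimal sketch of the Acoustic Complexity Index computed over a single temporal block. It is not the authors' pipeline: the function name, the plain list-of-lists spectrogram layout (one row per frequency bin) and the toy values are assumptions made for this example.

```python
def acoustic_complexity_index(spectrogram):
    """Sketch of the Acoustic Complexity Index for one temporal block.

    `spectrogram` is assumed to be a list of rows, one per frequency bin,
    each row holding non-negative amplitudes at successive time steps.
    """
    total = 0.0
    for row in spectrogram:
        energy = sum(row)
        if energy == 0:
            continue  # a silent bin contributes nothing
        # Sum of absolute amplitude changes between adjacent time steps
        change = sum(abs(b - a) for a, b in zip(row, row[1:]))
        total += change / energy
    return total


# Toy input (hypothetical values): a steady tone versus a pulsed signal
steady = [[1.0, 1.0, 1.0, 1.0]]
pulsed = [[1.0, 3.0, 1.0, 3.0]]
print(acoustic_complexity_index(steady))  # 0.0
print(acoustic_complexity_index(pulsed))  # 0.75
```

A constant signal scores zero while a signal that fluctuates between time steps scores higher, which is the property that makes the index sensitive to irregular sounds against a steady background.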
Article
Purpose of Review: Urban green spaces provide benefits for human health and well-being, among other properties, thanks to their ability to attenuate environmental pollutants. The sound environment is not healthy in most cities, and this situation has not changed in recent decades. These green spaces are potential quiet areas with good acoustic quality if they are designed and planned properly from a multidisciplinary perspective. Although the mitigating effects of green infrastructure have been extensively studied, their application in green areas has been very limited. The objective of this study is to analyze those characteristics of green spaces that contribute to a healthy soundscape and, in turn, the benefits that such a soundscape would bring to green areas, their users, and their physical environment. Recent Findings: Current studies show that to accurately determine the relationship between green spaces and health and well-being benefits, it is necessary to know the interaction with other environmental variables, including the soundscape. The development and application of ISO/TS 12913-2 have promoted the consideration of the soundscape and the use of appropriate procedures for its evaluation. Summary: The inclusion of soundscape quality in epidemiological studies will improve the quantification of the effects of green spaces on the health and well-being of citizens. Only the consideration of global indicators, such as Lden (dB), shows the importance of the sound environment in the interaction with other environmental variables and user activities for the determination of the effects of green spaces on health.
Article
Passive acoustic monitoring is a powerful tool for monitoring vocally active taxa. Automated signal recognition software reduces the expert time needed for recording analyses and allows researchers and managers to manage large acoustic datasets. The application of state-of-the-art techniques for automated identification, such as Convolutional Neural Networks, may be challenging for ecologists and managers without informatics or engineering expertise. Here, we evaluated the use of AudioMoth (a low-cost and open-source sound recorder) to monitor a threatened and patchily distributed species, the Eurasian bittern (Botaurus stellaris). Passive acoustic monitoring was carried out across 17 potential wetlands in northern Spain. We also assessed the performance of BirdNET (an automated and freely available classifier able to identify over 3000 bird species) and Kaleidoscope Pro (a user-friendly recognition software) to detect the vocalizations and the presence of the target species. The percentage of presences and vocalizations of the Eurasian bittern automatically detected by BirdNET and Kaleidoscope software was compared to manual annotations of 205 recordings. The species was effectively recorded up to distances of 801–900 m, with at least 50% of the vocalizations uttered within that distance being manually detected; this distance was reduced to 601–700 m when considering the analyses carried out using Kaleidoscope. BirdNET detected the species in 59 of the 63 (93.7%) recordings with known presence of the species, while Kaleidoscope detected the bittern in 62 recordings (98.4%). At the vocalization level, BirdNET and Kaleidoscope were able to detect 76% and 78%, respectively, of the vocalizations detected by a human observer. Our study highlights the ability of AudioMoth to detect the bittern at large distances, which increases the potential of that technique for monitoring the species at large spatial scales. 
According to our results, a single AudioMoth could be useful for monitoring the species' presence in wetlands of up to 150 ha. Our study proves the utility of passive acoustic monitoring, coupled with BirdNET or Kaleidoscope Pro, as an accurate, repeatable, and cost-efficient method for monitoring the Eurasian bittern at large spatial and temporal scales. Nonetheless, further research should evaluate the performance of BirdNET on a larger number of species, and under different recording conditions (e.g., more closed habitats), to improve our knowledge about BirdNET's ability to perform bird monitoring. Future studies should also aim to develop an adequate protocol to perform effective passive acoustic monitoring of the Eurasian bittern.
Article
Eco‐acoustic monitoring is increasingly being used to map biodiversity across large scales, yet little thought is given to the privacy concerns and potential scientific value of inadvertently recorded human speech. Automated speech detection is possible using voice activity detection (VAD) models, but it is not clear how well these perform in diverse natural soundscapes. In this study we present the first evaluation of VAD models for anonymization of eco‐acoustic data and demonstrate how speech detection frequency can be used as one potential measure of human disturbance. We first generated multiple synthetic datasets using different data preprocessing techniques to train and validate deep neural network models. We evaluated the performance of our custom models against existing state‐of‐the‐art VAD models using playback experiments with speech samples from a man, woman and child. Finally, we collected long‐term data from a Norwegian forest heavily used for hiking to evaluate the ability of the models to detect human speech and quantify a proxy for human disturbance in a real monitoring scenario. In playback experiments, all models could detect human speech with high accuracy at distances where the speech was intelligible (up to 10 m). We showed that training models using location specific soundscapes in the data preprocessing step resulted in a slight improvement in model performance. Additionally, we found that the number of speech detections correlated with peak traffic hours (using bus timings) demonstrating how VAD can be used to derive a proxy for human disturbance with fine temporal resolution. Anonymizing audio data effectively using VAD models will allow eco‐acoustic monitoring to continue to deliver invaluable ecological insight at scale, while minimizing the risk of data misuse. Furthermore, using speech detections as a proxy for human disturbance opens new opportunities for eco‐acoustic monitoring to shed light on nuanced human–wildlife interactions.
Article
Passive acoustic monitoring can be an effective method for monitoring species, allowing the assembly of large audio datasets, removing logistical constraints in data collection and reducing anthropogenic monitoring disturbances. However, the analysis of large acoustic datasets is challenging and fully automated machine learning processes are rarely developed or implemented in ecological field studies. One of the greatest uncertainties hindering the development of these methods is spatial generalisability: can an algorithm trained on data from one place be used elsewhere? First, we demonstrate that heterogeneity of error across space is a problem that could go undetected using common classification accuracy metrics. Second, we develop a method to assess the extent of heterogeneity of error in a random forest classification model for six Amazonian bird species. Finally, we propose two complementary ways to reduce heterogeneity of error, by (i) accounting for it in the thresholding process and (ii) using a secondary classifier that uses contextual data. We found that using a thresholding approach that accounted for heterogeneity of precision error reduced the coefficient of variation of the precision score from a mean of 0.61 ± 0.17 (SD) to 0.41 ± 0.25 in comparison to the initial classification with threshold selection based on F‐score. The use of a secondary, contextual classification with thresholding selection accounting for heterogeneity of precision reduced it further still, to 0.16 ± 0.13, and was significantly lower than the initial classification in all but one species. Mean average precision scores increased, from 0.66 ± 0.4 for the initial classification, to 0.95 ± 0.19, a significant improvement for all species. We recommend assessing, and if necessary correcting for, heterogeneity of precision error when using automated classification on acoustic data to quantify species presence as a function of an environmental, spatial or temporal predictor variable.
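The coefficient of variation of precision mentioned in the abstract above can be illustrated with a short sketch. It is not the authors' method: the site names, the counts and the use of the sample standard deviation are assumptions made for this example, which only shows how a pooled precision score can hide variation between sites.

```python
from statistics import mean, stdev


def precision(true_positives, false_positives):
    """Precision = TP / (TP + FP); None when a site has no detections."""
    detections = true_positives + false_positives
    return true_positives / detections if detections else None


def precision_cv(site_counts):
    """Coefficient of variation (SD / mean) of per-site precision.

    `site_counts` maps a site label to a (TP, FP) pair. The sample
    standard deviation is used here, which is an assumption.
    """
    scores = [precision(tp, fp) for tp, fp in site_counts.values()]
    scores = [s for s in scores if s is not None]
    if len(scores) < 2:
        raise ValueError("need at least two sites with detections")
    return stdev(scores) / mean(scores)


# Hypothetical verified detections per site: (true positives, false positives)
site_counts = {"site_a": (9, 1), "site_b": (5, 5), "site_c": (7, 3)}

pooled = precision(
    sum(tp for tp, _ in site_counts.values()),
    sum(fp for _, fp in site_counts.values()),
)
print(f"pooled precision: {pooled:.2f}")
print(f"CV of per-site precision: {precision_cv(site_counts):.2f}")
```

In this toy case the pooled precision looks like a single acceptable number, while the per-site scores range from 0.5 to 0.9. That spread is what a coefficient of variation is meant to expose.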
Article
Tropical ecosystems with high levels of endemism are under threat due to climate change and deforestation. Conservation actions are urgent and must rely on a clear understanding of landscape heterogeneity from transformed landscapes. Currently, passive acoustic monitoring uses the soundscape to understand the dynamics of biological communities and physical components of the sites and thus complement the information about the structures of the landscape. However, the link between the analysis and quantification of ecosystem transformation based on acoustic methods and acoustic heterogeneity is just beginning to be analyzed. This document proposes a new beta Acoustic Heterogeneity Index (AHI) that quantifies the acoustic heterogeneity related to landscape transformation. AHI estimates the acoustic dissimilarity between sites by modeling membership degrees of mixture models in three transformation states: high, medium, and low. We hypothesized that if acoustic recordings of different habitats are analyzed looking for particular patterns, it is possible to quantify the landscape heterogeneity between sites using sound. To calculate the AHI we propose a methodology of five steps: (1) filtering out recordings with high noise levels, (2) estimating acoustic indices, (3) including temporal patterns, (4) using GMM classification models to recognize habitat transformation levels, and (5) calculating the proposed AHI. We tested the proposal with data collected from 2015 to 2017 for 22 tropical dry forest (TDF) sites in two watersheds of the Colombian Caribbean region. The sites were labeled by the level of landscape transformation using forest degradation indicators with satellite imagery. We compared these labels with the predicted transformation of our method, showing an F1 score of 92% and 90% in the regions of La Guajira and Bolívar respectively. To use AHI interactively, we analyzed the soundscape similarities on geographic maps in the study regions. 
We identified that AHI allows estimating the similarity of points with similar transformations, and where the soundscape provides information about the transition states. This proposal allows complementing landscape transformation studies with information on the acoustic heterogeneity between pairs of sites.
Presentation
Bioacoustics is a powerful and increasingly commonly used tool for terrestrial and marine biological assessments. As the scale of bioacoustic data collection has increased, techniques for processing these data have diversified. However, with analysis methods rapidly evolving and dozens of analysis software packages already available, it is challenging to identify which software, if any, meets a particular researcher’s needs. We reviewed bioacoustics software to identify packages aimed at or used by bioacoustics researchers in ecology. We compiled descriptions of the function of 65 stable or actively developed software packages used for bioacoustics analyses. Of these, 59 were free or open-source packages. In addition, we developed free, open-source Python software, OpenSoundscape, that addresses gaps in available software. OpenSoundscape simplifies the process of creating flexible, scalable deep learning algorithms for bioacoustic analysis. It can be used to train binary or multiclass convolutional neural networks with any PyTorch-implemented model structure (e.g., ResNet50, Inception v3). Researchers can easily customize its spectrogram preprocessing and data augmentation routines to improve model performance. OpenSoundscape also includes modules to work with annotated acoustic data, apply additional signal processing algorithms, perform acoustic localization, and “open the black box” of deep learning using Grad-CAM.
Article
Animal vocalisations and natural soundscapes are fascinating objects of study, and contain valuable evidence about animal behaviours, populations and ecosystems. They are studied in bioacoustics and ecoacoustics, with signal processing and analysis an important component. Computational bioacoustics has accelerated in recent decades due to the growth of affordable digital sound recording devices, and to huge progress in informatics such as big data, signal processing and machine learning. Methods are inherited from the wider field of deep learning, including speech and image processing. However, the tasks, demands and data characteristics are often different from those addressed in speech or music analysis. There remain unsolved problems, and tasks for which evidence is surely present in many acoustic signals, but not yet realised. In this paper I perform a review of the state of the art in deep learning for computational bioacoustics, aiming to clarify key concepts and identify and analyse knowledge gaps. Based on this, I offer a subjective but principled roadmap for computational bioacoustics with deep learning: topics that the community should aim to address, in order to make the most of future developments in AI and informatics, and to use audio data in answering zoological and ecological questions.
Article
Context: There is a long-standing quest in landscape ecology for holistic biodiversity metrics accounting for multi-taxa diversity in heterogeneous habitat mosaics. Passive acoustic monitoring of biodiversity may provide integrative indices for investigating how soundscapes are shaped by compositional and configurational heterogeneity of mosaic landscapes. Objectives: We tested the effects of dominant habitat and landscape heterogeneity on acoustic diversity indices across a large range of mosaic landscapes from two long-term socio-ecological research areas in Occitanie, France and Arizona, USA. Methods: We assessed acoustic diversity by automated recording for 44 landscapes distributed along gradients of compositional and configurational heterogeneity. We analyzed the responses of six acoustic indices and a composite multiacoustic index to habitat type and multi-scale landscape metrics for three time periods: 24 h-diel cycles, dawns and nights. Results: Landscape mosaics dominated by permanent grasslands in Occitanie and woodlands in Arizona produced the highest values of acoustic diversity. Moreover, several indices including H, ADI, NDSI, NP and the multiacoustic index consistently responded to edge density in both study regions, but with contrasting patterns, increasing in Occitanie and decreasing in Arizona. Landscape configuration was a key driver of acoustic diversity for diel and nocturnal soundscapes, while dawn soundscapes depended more on landscape composition. Conclusions: Acoustic diversity was correlated more with configurational than compositional heterogeneity in both regions, with contrasting effects explained by the interplay between biogeography and land use history. We suggest that multiple acoustic indices are needed to properly account for complex responses of soundscapes to large-scale habitat heterogeneity in mosaic landscapes.
Article
Accurate occurrence data is necessary for the conservation of keystone or endangered species, but acquiring it is usually slow, laborious and costly. Automated acoustic monitoring offers a scalable alternative to manual surveys but identifying species vocalisations requires large manually annotated training datasets, and is not always possible (e.g. for lesser studied or silent species). A new approach is needed that rapidly predicts species occurrence using smaller and more coarsely labelled audio datasets. We investigated whether local soundscapes could be used to infer the presence of 32 avifaunal and seven herpetofaunal species in 20 min recordings across a tropical forest degradation gradient in Sabah, Malaysia. Using acoustic features derived from a convolutional neural network (CNN), we characterised species indicative soundscapes by training our models on a temporally coarse labelled point-count dataset. Soundscapes successfully predicted the occurrence of 34 out of the 39 species across the two taxonomic groups, with area under the curve (AUC) metrics from 0.53 up to 0.87. The highest accuracies were achieved for species with strong temporal occurrence patterns. Soundscapes were a better predictor of species occurrence than above-ground carbon density – a metric often used to quantify habitat quality across forest degradation gradients. Our results demonstrate that soundscapes can be used to efficiently predict the occurrence of a wide variety of species and provide a new direction for data driven large-scale assessments of habitat suitability.
Article
Sound recordings are used in various ecological studies, including wildlife monitoring by acoustic surveys. Such surveys often require automatic detection of target sound events in the large amount of data produced. However, current processing methods, especially those relying on sound intensity for detection, are severely impacted by wind, which causes transient intensity peaks. The rapid dynamics of this noise invalidate standard noise estimators, and no satisfactory method for dealing with wind exists in bioacoustics, where simple training and generalization between conditions are important. We estimate the transient noise level by fitting short‐term spectrum models to a wavelet packet representation. This estimator is then combined with log‐spectral subtraction to stabilize the background level. The resulting adjusted wavelet series can be analysed by standard detectors. We use real data from long‐term acoustic monitoring to tune this workflow, demonstrate its denoising capabilities and test the improved detection in two population surveys of birds. The proposed short‐term estimator was more effective than standard (constant) noise estimates in both denoising and detection tasks. In the surveys, the noise‐robust workflow greatly reduced the number of false alarms. As a result, the survey efficiency (precision of the estimated call density) improved for both species. In contrast to existing methods, the proposed estimator can adjust for transient broadband noises without requiring additional hardware or extensive tuning to each species. It improved the detection workflow based on very little training data, making it particularly attractive for detection of rare species or general soundscape analysis.
Article
Ecologists are increasingly using bioacoustics in wildlife monitoring programs. Remote autonomous sound recorders provide new options for collecting data for species and in contexts that were previously difficult. However, post-processing of sound files to extract relevant data remains a significant challenge. Detection algorithms, or call recognizers, can aid automation of species detection but their performance and reliability have been mixed. Further, building recognizers typically requires either costly commercial software or expert programming skills, both of which reduce their accessibility to ecologists responsible for monitoring. In this study we investigated the performance of open-source call recognizers provided by the monitoR package in R, a language popular among ecologists. We tested recognizers on sound data collected under natural conditions at nests of two endangered subspecies of black-cockatoo, the Kangaroo Island glossy black-cockatoo Calyptorhynchus lathami halmaturinus (n = 23 nests), and the south-eastern red-tailed black-cockatoo Calyptorhynchus banksii graptogyne (n = 20 nests). Specifically, we tested the performance of binary point matching recognizers in confirming daily nest activity (active or inactive) and nesting outcome (fledge or fail). We tested recognizers on recordings from nests of known status using 3 × 3-h recordings per nest, from early, mid and late stages of the recording period. Daily nest activity was correctly assigned in 61.7% of survey days analysed (n = 60 days) for the red-tailed black-cockatoo, and 62.3% of survey days (n = 69 days) for the glossy black-cockatoo. Fledging was successfully detected in all cases. Precision (true positives / (true positives + false positives)) of individual detections was 70.2% for the south-eastern red-tailed black-cockatoo and 37.1% for the Kangaroo Island glossy black-cockatoo. 
Manual verification of outputs is still required, but it is not necessary to verify all detections to confirm an active nest (i.e., nest is deemed active when true positives are identified). We conclude that bioacoustics combined with semi-automated post-processing can be an appropriate tool for nest monitoring in these endangered subspecies.
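The point that not every detection needs checking to confirm an active nest can be sketched as an early-stopping review loop. This is an illustration only, not the study's workflow: the detection records, the `score` and `is_call` fields, and the choice to review detections in descending score order are all assumptions made for this example.

```python
def nest_is_active(detections, verify):
    """Review detections until one is confirmed as a true positive.

    `detections` is a list of dicts with a "score" key (hypothetical
    recognizer score); `verify` stands in for a human reviewer and
    returns True when a detection is a genuine call.
    Returns (active, number_of_detections_reviewed).
    """
    reviewed = 0
    # Assumption: higher-scoring detections are reviewed first
    for detection in sorted(detections, key=lambda d: d["score"], reverse=True):
        reviewed += 1
        if verify(detection):
            return True, reviewed  # one confirmed call is enough
    return False, reviewed


# Hypothetical recognizer output for one survey day at one nest
day_detections = [
    {"score": 0.42, "is_call": False},
    {"score": 0.91, "is_call": False},
    {"score": 0.77, "is_call": True},
    {"score": 0.55, "is_call": True},
]

active, reviewed = nest_is_active(day_detections, lambda d: d["is_call"])
print(active, reviewed)  # True 2
```

Here the nest is confirmed active after two of the four detections have been reviewed. A day with no confirmed calls still requires every detection to be checked before the nest is recorded as inactive.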