Integration of genomic and ecological methods inform management of an undescribed, yet highly exploited, sardine species

The Royal Society
Proceedings of the Royal Society B

Assessing genetic diversity within species is key for conservation strategies in the context of human-induced biotic changes. This is important in marine systems, where many species remain undescribed while being overfished, and conflicts between resource-users and conservation agencies are common. Combining niche modelling with population genomics can contribute to resolving those conflicts by identifying management units and understanding how past climatic cycles resulted in current patterns of genetic diversity. We addressed these issues on an undescribed but already overexploited species of sardine of the genus Harengula. We find that the species distribution is determined by salinity and depth, with a continuous distribution along the Brazilian mainland and two disconnected oceanic archipelagos. Genomic data indicate that such biogeographic barriers are associated with two divergent intraspecific lineages. Changes in habitat availability during the last glacial cycle led to different demographic histories among stocks. One coastal population experienced a 3.6-fold expansion, whereas an island-associated population contracted 3-fold, relative to the size of the ancestral population. Our results indicate that the island population should be managed separately from the coastal population, and that a Marine Protected Area covering part of the island population distribution can support the viability of this lineage.
royalsocietypublishing.org/journal/rspb
Research
Cite this article: Coelho JFR, Mendes LF,
Di Dario F, Carvalho PH, Dias RM, Lima SMQ,
Verba JT, Pereira RJ. 2024 Integration of
genomic and ecological methods inform
management of an undescribed, yet highly
exploited, sardine species. Proc. R. Soc. B 291:
20232746.
https://doi.org/10.1098/rspb.2023.2746
Received: 5 December 2023
Accepted: 6 February 2024
Subject Category:
Global change and conservation
Subject Areas:
ecology, evolution, genomics
Keywords:
Harengula, Clupeidae, fishery genomics,
genetic diversity, marine protected areas
Author for correspondence:
Jéssica Fernanda Ramos Coelho
e-mail: jessicovsky@gmail.com
†Co-last authors.
Electronic supplementary material is available
online at https://doi.org/10.6084/m9.figshare.c.7090089.
Jéssica Fernanda Ramos Coelho¹, Liana de Figueiredo Mendes², Fabio Di Dario³, Pedro Hollanda Carvalho³, Ricardo Marques Dias⁴, Sergio Maia Queiroz Lima¹, Julia Tovar Verba⁵,† and Ricardo J. Pereira⁵,⁶,†
¹Departamento de Botânica e Zoologia, and ²Departamento de Ecologia, Universidade Federal do Rio Grande do Norte, Avenida Senador Salgado Filho S/N, Campus Universitário, 59078-970, Natal/RN, Brazil
³Instituto de Biodiversidade e Sustentabilidade - Universidade Federal do Rio de Janeiro, Avenida São José do Barreto, 764, 27965-045, Macaé/RJ, Brazil
⁴Museu Nacional, Universidade Federal do Rio de Janeiro, Quinta da Boa Vista - São Cristóvão, 20940-040, Rio de Janeiro/RJ, Brazil
⁵Evolutionary Biology, Ludwig Maximilian University of Munich, Grosshaderner Strasse 2, 82152, Planegg-Martinsried, Germany
⁶Department of Zoology, State Museum of Natural History Stuttgart, Rosenstein 13, 70191, Stuttgart, Germany
JFRC, 0000-0002-3736-8912; JTV, 0000-0001-5399-6890; RJP, 0000-0002-8076-4822
© 2024 The Author(s) Published by the Royal Society. All rights reserved.
Downloaded from https://royalsocietypublishing.org/ on 06 March 2024
1. Introduction
The current biodiversity crisis is leading to decreases in population size and to the extinction of some species [1,2], threatening the balance of ecosystems [3,4]. The Convention on Biological Diversity (CBD) recognizes genetic diversity within species as one of the three main levels of biodiversity that must be monitored and preserved, along with species and ecosystem diversity [5]. This is because genetic diversity within species is the heritable variability upon which selection can act, and thus underlies the ability of a species to adapt and persist in changing environments [6]. Although the CBD set a target to maintain and restore genetic diversity within species by 2030 in the Kunming-Montreal Global Biodiversity Framework, monitoring changes in genetic diversity within species is challenging because it requires understanding how past and current environments affect the evolution of intraspecific lineages. This understanding is especially challenging in marine ecosystems, particularly in hyper-diverse tropical regions of the globe, where species are still being described, their relative abundances and connectivity are unknown, and yet some species are under strong human exploitation. Conflicts of interest between resource users and conservation managers start at a local scale owing to environmental policies that disregard scientific data and users' participation in management [7]. These local conflicts can, however, affect economy, food availability and ecosystem functioning at a broader spatial scale, demanding updated scientific information to support better conservation laws and practices. The integration of genomic data [8–10] and ecological modelling [11,12] can potentially provide evidence-based suggestions for conservation plans and contribute to resolving such conflicts.
Determining the number of intraspecific lineages within a species of concern is an important task in conservation biology, as evolutionarily independent units often require independent management plans [13]. Likewise, assessing relative levels of genetic diversity within each intraspecific lineage (Ne or effective population size) is important because it reflects both the differences in current number of individuals (N or census size) and the demographic history of these lineages in response to past environmental change [14]. Both the number of intraspecific lineages and their genetic diversity depend on evolutionary processes such as genetic drift in isolated populations, selection in divergent environments and gene flow among them. Because such processes are highly dynamic over space and time, the integration of genomic analyses and ecological niche modelling can aid conservation planning by providing fast and useful information regarding the drivers of population size and connectivity of exploited species. These processes are well understood in terrestrial systems [15–17], yet only recently have they begun to be studied in marine systems [18]. Despite the high potential dispersal ability of marine taxa, recent studies have shown that several oceanographic barriers, such as oceanic currents and depth, constrain gene flow [19], therefore enabling diversification within species. Understanding how such oceanographic barriers condition the number of management stocks, their current levels of genetic diversity and their genetic connectivity remains an important task in conservation biology.
Marine systems support the livelihood of millions of people worldwide, increasing the conflict between local communities and conservation strategies. The application of genomics to fisheries management has contributed to resolving some of these conflicts by identifying the number of stocks within species of commercial interest, their levels of genetic connectivity, and relative measures of diversity [20]. Yet, such efforts have been mainly restricted to marine species from the Northern Hemisphere, where species-level diversity is lower relative to tropical regions. For example, genomic studies in the yellow-fin tuna and cod have helped identify divergent intraspecific lineages as distinct fishery stocks with high genetic differentiation that require independent management [21,22]. Genomic data on species of conservation concern, such as cod, herring and European hake, have been used to identify the fishing origin of commercial products, and thus aid against illegal, unreported and unregulated fisheries [23]. A study in yellow-fin tuna indicates that the populations occurring in different oceans present significant and asymmetric gene flow [24], allowing the identification of populations that act as source or sink of migrants, another factor relevant for defining conservation priorities. Studies in a pipefish [19] and in the Arctic charr [25] have quantified changes in effective population size (Ne) of different stocks, potentially driven by both their past evolution and more contemporary exploitation by fisheries. Much less is known about species in tropical waters, even though such species play an indispensable role in the subsistence of local communities [26].
One such economically important group of fishes worldwide is the Clupeidae, a family that includes several species of forage fishes popularly known as sardines, herrings and shads, among others [27,28]. The genus Harengula Valenciennes 1847 comprises four taxonomically recognized species, three of which are reported as occurring in the western Atlantic. Whereas Harengula humeralis (Cuvier, 1829) is anatomically distinct and restricted mostly to the central Atlantic, Harengula clupeola (Cuvier, 1829) and Harengula jaguana Poey, 1865 are reported as having a much broader distribution, from the US coast to southern Brazil [27]. However, recent molecular studies indicate that specimens of Harengula off the Brazilian coast belong to a third mitochondrial lineage, which is more divergent (3.2% in the barcoding gene CO1) than H. clupeola and H. jaguana (1.8%) [29]. The lineage of Harengula likely endemic to the Brazilian Biogeographic Province, herein Harengula sp., is apparently allopatric relative to congeners and is the only member of the Clupeidae present in two oceanic archipelagos of the South Atlantic (Fernando de Noronha (FNO) and Trindade-Martin Vaz; [30]), suggesting a higher tolerance to ecological conditions than other overall similar species of sardines [31]. Although Harengula sp. remains unrecognized taxonomically, it shows signs of decline in population size [32] and likely requires specific management measures [33,34].
Harengula sp. is consumed as part of traditional dishes and is the main source of live bait in several regions of the Brazilian Exclusive Economic Zone [35]. Currently, there is no specific management strategy for Harengula sp. in Brazil, except in Marine Protected Areas (MPAs) and their buffer zones. The fishery of this species within the MPA of the FNO archipelago led to conflict between environmental managers, who established a no-take zone within the MPA in the 1980s, and local fishers, who claimed the right to fish in this area [36]. In an attempt to resolve this conflict, the Brazilian government lifted restrictions on exploiting sardine in the no-take zone of the FNO MPA, initially from 2020 to 2022, and later extended this lift of restrictions until 2024 [37]. Such conflicts have an impact on the local economy and fishery strategies of the island. To date there is no scientific evaluation of how divergent the island population is from the coastal population, what their degree of genetic connectivity is, or whether there are temporal and spatial changes in genetic diversity (Ne).
Here, we addressed these issues by combining ecological niche models and population genomic methods in this highly exploited species of sardine that remains largely unknown by scientists and legislators. First, we used ecological niche models to establish predictions on how temporal and spatial changes in habitat suitability have constrained the distribution of Harengula sp. relative to other formally recognized species of the genus, and how these changes have affected divergence and gene flow within Harengula sp. Second, we used genomic data to test how those oceanographic barriers led to intraspecific divergence. Finally, we used demographic modelling to estimate the magnitude and direction of gene flow between independent lineages and changes in Ne. These results provide science-based information to support better management strategies focused on the sustainable exploitation of this species.
2. Material and methods
(a) Habitat dynamics
To understand the environmental drivers of the distribution of Harengula sp., and how these have changed over time, we fitted an ecological niche model (ENM) in the current climate, and then projected it to the Last Glacial Maximum (LGM), approximately 21 thousand years ago. We used a machine-learning algorithm of maximum entropy (MaxEnt v. 3.4.4; [38]) in the dismo package 1.35 in R v. 4.0.5 [39,40]. We fitted the models with the default MaxEnt parameters (e.g. beta multiplier and feature classes) and evaluated them according to the area under the receiver operating characteristic (ROC) curve (AUC) [41,42].
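As a rough illustration of the evaluation criterion, the AUC can be read as the probability that a randomly chosen presence point receives a higher suitability score than a randomly chosen background point. The sketch below computes it by direct pairwise comparison; the function name and the scores are hypothetical, and this is not the dismo or MaxEnt implementation.

```python
def auc_from_scores(presence_scores, background_scores):
    """Share of presence/background pairs in which the presence point
    scores higher; ties count as half a win."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


# Hypothetical suitability scores, not values from the study.
presence = [0.9, 0.8, 0.7, 0.4]
background = [0.5, 0.3, 0.2, 0.1]
print(auc_from_scores(presence, background))
```

An AUC of 0.5 corresponds to a model that ranks presence and background points no better than chance, and 1.0 to perfect discrimination.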
We retrieved 227 georeferenced occurrence records (latitude and longitude) of Harengula clupeola and H. jaguana in the Brazilian Exclusive Economic Zone (EEZ) from online databases and from our fieldwork collection [43,44] (electronic supplementary material, table S1). Since a previous study showed that Harengula sp. is the only species of Harengula occurring in the Brazilian EEZ [29], we re-classified these observations as Harengula sp. (figure 1). To increase accuracy in our database, we only considered georeferenced records that refer to specimens collected after 1945 [46] and that are available as vouchers in biological collections.
We downloaded four abiotic layers characterizing present and LGM climatic and geophysical aspects of the environment from MARSPEC, using the sdmpredictors package in R [40,47]. The layers used in the models were: bathymetry (or depth; m), slope (degrees), mean of sea surface salinity (psu) and mean of sea surface temperature (°C) [48,49]. We chose these variables because they (i) are of general relevance for species of the Clupeidae [27], (ii) show low correlation between them (Pearson's r < 0.8), and (iii) are available for both present and LGM scenarios. To better delimit the area of occurrence of this species and to avoid model overfitting due to differences between occurrence points and background points [50], we cropped the layers into provinces as defined by Spalding et al. [45]. See electronic supplementary material for details.
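The correlation criterion in (ii) can be sketched as a pairwise screen over the environmental layers. The code below is a minimal illustration that assumes each layer has already been sampled at the same set of grid cells; the function names are hypothetical, and comparing the absolute value of r against 0.8 is an assumption rather than a detail stated in the text.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def highly_correlated_pairs(layers, threshold=0.8):
    """Return (name_a, name_b, r) for every pair of layers with |r| >= threshold.

    layers maps a layer name to its values at the same grid cells.
    """
    flagged = []
    for (name_a, a), (name_b, b) in combinations(layers.items(), 2):
        r = pearson_r(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged
```

A set of layers would pass the screen when `highly_correlated_pairs` returns an empty list.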
(b) Molecular analyses
We collected 92 specimens from 12 sites representing most of the known distribution of Harengula sp., 10 of which are closely associated with the Brazilian coast, one is located at the coastal island of Abrolhos (ABR), and another is located in the oceanic archipelago of Fernando de Noronha (FNO) (figure 1; electronic supplementary material, table S1, licences SISBIO 76 053-1 and 74 802-1). Samples were either collected using a beach seine or bought in local fish markets. Muscle samples were preserved in 100% ethanol and sent for DNA extraction, sequencing and single nucleotide polymorphism (SNP) genotyping at Diversity Arrays Technology (DArT), following the automated DArTseq method, hereafter DArTseq. DArTseq is a cost-effective approach for SNP discovery using a reduced representation of the genome [51]. It is similar to the double digest restriction-site associated DNA sequencing (ddRAD) method as it uses one restriction enzyme that recognizes a common sequence motif, and another that recognizes a rarer motif. By linking sequencing adaptors to genomic DNA fragments that have both restriction sites and are within a target fragment size, we recover thousands of sequencing tags (or loci) scattered randomly along the genome, which contain one or more SNPs each [52]. This methodology does not require a reference genome for identifying thousands of SNPs in homologous sites across individuals and hence is widely applicable to undescribed species lacking such resources.
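The double-digest logic described above can be illustrated with a toy in-silico digest. This sketch keeps only fragments that are flanked by one rare-cutter site and one common-cutter site and that fall within a size window. The recognition motifs (CTGCAG and CCGG), the size window and the function names are assumptions chosen for illustration; the text does not state which enzymes or fragment sizes the DArTseq service used, and cut positions are simplified to the start of each motif.

```python
import re


def cut_sites(sequence, motif):
    """Start positions of every (possibly overlapping) occurrence of motif."""
    return [m.start() for m in re.finditer(f"(?={motif})", sequence)]


def double_digest_fragments(sequence, rare_motif, common_motif, min_len, max_len):
    """Return (start, end) of fragments bounded by two different enzyme sites
    whose length lies within [min_len, max_len]."""
    sites = sorted(
        [(pos, "rare") for pos in cut_sites(sequence, rare_motif)]
        + [(pos, "common") for pos in cut_sites(sequence, common_motif)]
    )
    fragments = []
    # Adjacent cut sites delimit the fragments left after a complete digest.
    for (start, kind_a), (end, kind_b) in zip(sites, sites[1:]):
        length = end - start
        if kind_a != kind_b and min_len <= length <= max_len:
            fragments.append((start, end))
    return fragments


# Toy sequence: rare site, spacer, two common sites, spacer, rare site.
toy = "CTGCAG" + "A" * 20 + "CCGG" + "A" * 5 + "CCGG" + "A" * 30 + "CTGCAG"
print(double_digest_fragments(toy, "CTGCAG", "CCGG", 20, 30))
```

Fragments bounded by two sites of the same enzyme, or falling outside the size window, are dropped, which is what reduces the genome to a repeatable subset of loci.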
[Figure 1: map of the southwest Atlantic (60° W to 10° W, 10° S to 30° S) showing occurrences of Harengula sp. (sampling sites FNO, CE, RN, PB, PE, AL, BA, ABR, ES, RJ, SP, SC and other occurrences), the North Brazil Current, Southern Equatorial Current and Brazil Current, river outflow, and the ecoregions North Brazil Shelf, tropical southwestern Atlantic and warm temperate southwestern Atlantic.]
Figure 1. Observation of Harengula sp. in the southwest Atlantic. Observations are marked by asterisks (occurrence record only), or by coloured circles for sampling sites included in the genomic analyses. Arrows indicate the direction of main oceanic currents. Ecoregions are marked as in Spalding et al. [45]. Sampling localities: FNO, Fernando de Noronha archipelago (oceanic island); CE, Ceará; RN, Rio Grande do Norte; PB, Paraíba; PE, Pernambuco; AL, Alagoas; BA, Bahia; ABR, Abrolhos (continental island); ES, Espírito Santo; RJ, Rio de Janeiro; SP, São Paulo; SC, Santa Catarina.
From the unfiltered dataset, we removed two individuals with an excessive amount of missing data (greater than 98%), and 166 loci that are putatively sex-linked (electronic supplementary material, figure S1). We further filtered SNPs considering the assumptions of each analysis (electronic supplementary material, table S2), using the package dartR in R [53]. For the diversity analyses, we generated a 'linked-SNPs' dataset, which contained all SNPs across loci, irrespective of the amount of missing data, and a 'linked-SNPs-0MD' dataset removing all SNPs containing missing data. For population structure analyses and divergence, we generated an 'unlinked-best-SNPs' dataset containing the most informative (i.e. higher frequency) SNP per locus. Finally, for the demographic analyses, we generated an 'unlinked-random-SNPs' dataset, which retained a random SNP per locus, irrespective of its frequency. See electronic supplementary material, table S2 for details on the DArTseq service and data filtering.
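The difference between the two unlinked datasets can be sketched as two ways of keeping one SNP per locus. The code below is illustrative only: the record layout, the function name and the use of minor allele frequency (`maf`) as the measure of informativeness are assumptions, and the actual filtering was done with dartR.

```python
import random
from collections import defaultdict


def one_snp_per_locus(snps, strategy="best", seed=0):
    """Keep a single SNP per locus.

    snps: list of dicts with keys 'locus', 'position' and 'maf' (hypothetical layout).
    strategy: 'best' keeps the SNP with the highest frequency,
              'random' keeps one SNP drawn at random.
    """
    by_locus = defaultdict(list)
    for snp in snps:
        by_locus[snp["locus"]].append(snp)

    rng = random.Random(seed)  # fixed seed so the random choice is repeatable
    chosen = []
    for locus in sorted(by_locus):
        candidates = by_locus[locus]
        if strategy == "best":
            chosen.append(max(candidates, key=lambda s: s["maf"]))
        elif strategy == "random":
            chosen.append(rng.choice(candidates))
        else:
            raise ValueError(f"unknown strategy: {strategy}")
    return chosen


# Hypothetical records for two loci.
example_snps = [
    {"locus": "L1", "position": 12, "maf": 0.05},
    {"locus": "L1", "position": 40, "maf": 0.30},
    {"locus": "L2", "position": 7, "maf": 0.10},
]
print(one_snp_per_locus(example_snps, strategy="best"))
```

Choosing at random avoids biasing the site frequency spectrum towards common variants, which is presumably why the random variant was used for the demographic analyses.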
(c) Population structure and genetic diversity
To estimate the number of intraspecific lineages of Harengula sp. we used two approaches that differ in their model assumptions. Because both approaches assume no linkage between SNPs within loci, we used the unlinked-best-SNPs dataset. First, we performed a non-model-based principal component analysis (PCA) to visualize the genetic similarities between individuals, using the dartR package with the file in the genlight format [53]. Second, we estimated the most likely number of clusters using the software ADMIXTURE v. 1.3.0 [54], which estimates individual ancestry by maximizing Hardy–Weinberg and linkage equilibria within K ancestral clusters. For this, we converted the genlight file to plink using the gl2plink function [53]. We considered models with K from 1 to 12 (total number of sampling sites), and used 10 independent runs for each K. The most likely number of clusters was estimated based on the lowest cross-validation error, which indicates the predictive accuracy of each model [55].
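The choice of K by cross-validation can be sketched as a small log parser. ADMIXTURE reports lines of the form `CV error (K=2): 0.51234` when run with cross-validation; the sketch below collects these across runs and picks the K with the lowest mean error. Averaging over replicate runs is an assumption made here for illustration, and the error values are invented.

```python
import re
from collections import defaultdict

CV_LINE = re.compile(r"CV error \(K=(\d+)\): ([0-9.]+)")


def best_k(log_lines):
    """Return (K with the lowest mean CV error, {K: mean CV error})."""
    errors = defaultdict(list)
    for line in log_lines:
        match = CV_LINE.search(line)
        if match:
            errors[int(match.group(1))].append(float(match.group(2)))
    mean_error = {k: sum(v) / len(v) for k, v in errors.items()}
    return min(mean_error, key=mean_error.get), mean_error


# Hypothetical log lines from replicate runs, not results from the study.
example_log = [
    "CV error (K=1): 0.60",
    "CV error (K=2): 0.50",
    "CV error (K=2): 0.52",
    "CV error (K=3): 0.55",
]
print(best_k(example_log)[0])
```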
To estimate the level of genetic divergence within Harengula sp., we used the same unlinked-best-SNPs dataset to calculate pairwise genetic differentiation (FST; [56]) between all sampling sites in R (function gl.fst.pop) [40,53]. Finally, we tested for isolation by distance within the coastal genetic cluster using the Mantel test in the dartR package (function gl.ibd), assessing its significance using 999 permutations [53,57,58].
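The permutation logic behind a Mantel test can be sketched as follows: correlate the two distance matrices, then repeatedly shuffle the site labels of one matrix to build a null distribution for the correlation. This is a generic one-tailed version written for illustration, not the gl.ibd implementation, and the distances in the example are hypothetical.

```python
import math
import random


def upper_triangle(matrix, order):
    """Upper-triangle entries of a square matrix after relabelling sites by order."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(genetic, geographic, permutations=999, seed=1):
    """Return (observed r, one-tailed permutation p-value)."""
    identity = list(range(len(genetic)))
    geo = upper_triangle(geographic, identity)
    r_obs = pearson_r(upper_triangle(genetic, identity), geo)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # shuffle site labels, keeping the matrix structure
        if pearson_r(upper_triangle(genetic, order), geo) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical sites along a coastline (km) with differentiation rising linearly.
positions = [0, 100, 250, 400, 700, 1000]
geographic = [[abs(a - b) for b in positions] for a in positions]
genetic = [[0.0 if a == b else 0.005 + 0.00003 * abs(a - b) for b in positions]
           for a in positions]
r, p = mantel_test(genetic, geographic)
print(r, p)
```

Permuting whole rows and columns together, rather than individual matrix entries, respects the non-independence of pairwise distances that share a site.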
To estimate genetic diversity within individuals, sampling sites and clusters, we used the linked-SNPs dataset and calculated nucleotide diversity based only on variant sites (π-SNP). In contrast with the traditional measure of nucleotide diversity (π; [59]), which computes the average number of nucleotide differences across sequences with variant and invariant sites, π-SNP specifically assesses nucleotide diversity in variant sites only. We estimated π-SNP using pixy v. 1.2.7.beta1 [60]. For this, we converted the genlight object to vcf using plink v. 3 and the gl2vcf function [53]. Using the linked-SNPs-0MD dataset, we calculated the inbreeding coefficient (FIS), i.e. the level of heterozygosity of an individual relative to the heterozygosity observed in its cluster or population, using the basic.stats function from the hierfstat package [61] in R. It is important to note that diversity summary statistics assume that each group conforms to the expectations for a Wright–Fisher population, and thus that allelic frequencies are not influenced by gene flow between populations nor by changes in population size. In contrast, the algorithm used in ADMIXTURE and in the following demographic analysis considers possible gene flow between clusters.
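A diversity measure restricted to variant sites can be sketched by counting, at each site, the pairwise differences among the allele copies that were actually genotyped, and dividing the summed differences by the summed comparisons. This follows the general idea of handling missing data per site; the genotype encoding and the function name are assumptions, and the sketch is not the pixy implementation.

```python
def pi_variant_sites(genotypes):
    """Nucleotide diversity over the supplied (variant) sites.

    genotypes: list of sites; each site is a list with one entry per diploid
    individual, coded as the count of the alternative allele (0, 1 or 2),
    or None when the genotype is missing.
    """
    total_differences = 0
    total_comparisons = 0
    for site in genotypes:
        called = [g for g in site if g is not None]
        n_alleles = 2 * len(called)          # allele copies actually observed
        n_alt = sum(called)
        n_ref = n_alleles - n_alt
        total_differences += n_ref * n_alt   # pairs of copies that differ
        total_comparisons += n_alleles * (n_alleles - 1) // 2
    if total_comparisons == 0:
        return float("nan")
    return total_differences / total_comparisons


# Hypothetical genotypes: two sites, four individuals, one missing call.
example_sites = [[0, 1, 2, None], [0, 0, 0, 1]]
print(pi_variant_sites(example_sites))
```

Because invariant sites are excluded from the denominator, values computed this way are much higher than conventional π and are comparable only among groups genotyped with the same marker set.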
(d) Demographic history
To test whether the observed genetic variation significantly deviates from the expected variation under neutrality (e.g. due to demographic change or to selection), we calculated Tajima's D [62] for each identified cluster, using DnaSP 6 [63] and the linked-SNPs-0MD dataset because this software does not take missing data into consideration.
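For reference, Tajima's D compares two estimators of diversity: the mean number of pairwise differences and the number of segregating sites scaled by a sample-size constant. The sketch below implements the standard formula; the function name and the input values are hypothetical, and it is not the DnaSP implementation.

```python
import math


def tajimas_d(n_sequences, segregating_sites, mean_pairwise_diff):
    """Tajima's D from the number of sampled sequences (n), the number of
    segregating sites (S) and the mean number of pairwise differences."""
    n, s = n_sequences, segregating_sites
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n ** 2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two estimators, scaled by its standard deviation.
    return (mean_pairwise_diff - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical inputs: 10 sequences, 20 segregating sites, 3 pairwise differences.
print(tajimas_d(10, 20, 3.0))
```

Negative values indicate an excess of rare variants relative to neutral expectations, which can arise from population expansion or from selection.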
To estimate the magnitude and direction of gene flow between population clusters, we used a diffusion approximation method implemented in δaδi v. 2.0.5 [64] and the δaδi-pipeline [65]. Because this method is based on the site frequency spectrum, we estimated the spectrum from the unlinked-random-SNPs dataset, which contains 34% missing data, considering variant sites only. We converted the data to vcf using the function gl2vcf. We reduced the amount of missing data by projecting down the number of SNPs and individuals, choosing a projection that maintained a relatively large number of segregating sites and individuals. We tested three simpler demographic scenarios where change in effective population size occurs during population splitting: (i) no migration, with three parameters: effective size of population 1 (Ne1), effective size of population 2 (Ne2) and time since split (T); (ii) symmetric migration, with a fourth parameter reflecting migration (2Nem); and (iii) asymmetric migration, with two migration parameters in opposing directions (2Nem1→2 and 2Nem2→1). Additionally, we tested three more complex models allowing additional changes of effective population size after population splitting (T2): (iv) symmetric migration with size change in one population, with six parameters (Ne1a, Ne1b, Ne2, 2Nem, T1 and T2); (v) symmetric migration with size change in both populations, with a seventh parameter describing change in the second population (Ne2b); and (vi) asymmetric migration with size change in both populations, with an eighth parameter describing different migration rates (2Nem1→2 and 2Nem2→1). Since mutation rate and generation time are unknown for this species, we were not able to estimate absolute parameter values. Therefore, all parameters were estimated relative to the effective population size of the ancestral population [64]. We selected the model with a combination of lowest Akaike information criterion (AIC) and best parameter convergence, accounting for the different number of parameters in competing models [66,67]. To verify how well the selected model and estimated parameters fitted our data, we visually inspected the residuals between empirical and model site frequency spectra (SFSs) plotted using the δaδi-pipeline [64]. We considered the parameter values estimated under the model and replicate with lowest AIC score.
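The AIC comparison can be sketched with the usual definition, AIC = 2k − 2 ln L, where k is the number of free parameters and ln L is the maximized log-likelihood. The parameter counts below follow the three simpler models described above, taking the asymmetric model to have five parameters (two sizes, a split time and two migration rates), whereas the log-likelihoods are invented for illustration and are not results from the study.

```python
def aic(n_parameters, log_likelihood):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_parameters - 2 * log_likelihood


def rank_models(models):
    """Return (name, AIC, delta AIC) tuples sorted from best to worst.

    models maps a model name to (number of parameters, log-likelihood).
    """
    scores = {name: aic(k, lnl) for name, (k, lnl) in models.items()}
    best = min(scores.values())
    return sorted(((name, score, score - best) for name, score in scores.items()),
                  key=lambda item: item[1])


# Hypothetical log-likelihoods, chosen only to show the penalty for extra parameters.
candidate_models = {
    "no migration": (3, -1500.0),
    "symmetric migration": (4, -1200.0),
    "asymmetric migration": (5, -1199.8),
}
for name, score, delta in rank_models(candidate_models):
    print(name, score, round(delta, 1))
```

In this toy case the extra migration parameter improves the likelihood too little to offset its penalty, so the simpler symmetric model ranks first.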
3. Results
(a) Habitat dynamics
Our ecological niche model (ENM) for Harengula sp. shows good ability to discriminate between areas where the species occurs and areas where it does not occur (AUC 0.964, s.d. 0.017). Bathymetry and mean sea surface salinity showed the highest percentage contribution to model performance, of 73.5 and 23.7%, respectively (electronic supplementary material, figure S2 and table S3). The ENM for the present (figure 2b) shows continuous, although heterogeneous, habitat suitability along ca 6100 km of the Brazilian continental shelf, bounded by the Amazon–Orinoco Barrier in the north and the La Plata River in the south. The suitable habitats at the oceanic archipelagos of FNO (included in the genetic sampling of this study) and of Trindade-Martin Vaz (not sampled here) are separated from the coast by approximately 400 and 1000 km of unsuitable habitat, respectively. Seamounts between these islands and the coast provide patchy suitable area. All the islands located on the continental shelf, such as Abrolhos (ABR), are connected to the coast by highly suitable habitat.
The ENM for the LGM (figure 2a) shows that the suitable habitat was narrower than in the present, although still continuous
along the Brazilian coast. Habitat suitability at the oceanic archipelagos of FNO and Trindade-Martin Vaz was similar to the
present model. Islands in the continental shelf had lower suitability than present and were exposed owing to the lower sea level.
(b) Molecular analyses
Our unfiltered dataset comprised 91 individuals genotyped for 66 639 loci, with 98 313 binary SNPs (on average, two SNPs per locus), a total of 30.35% missing data and an average read depth across all markers of 20 reads per locus. Individuals had an average of 28.84% missing data, while SNPs had 69%. Five individuals had more than 40% missing data and were excluded from downstream analyses. After filtering for genotype call consistency (call rate) and locus quality (reproducibility) (electronic supplementary material, table S2), we kept 86 individuals genotyped for 62 295 loci, containing 88 639 SNPs and 29.5% missing data (linked-SNPs dataset). Further removal of missing data retained 9369 SNPs in the linked-SNPs-0MD dataset. The unlinked-best-SNPs and unlinked-random-SNPs datasets had monomorphic loci removed, resulting in each dataset consisting of 59 992 SNPs with 34% missing data.
(c) Population structure and genetic diversity
The first dimensional axis of the PCA (figure 3b) explains 4% of the genetic variance and separates individuals from the coast of Brazil (hereafter 'coastal' population), including those from Abrolhos Island on the continental shelf, from individuals collected in the oceanic FNO archipelago (hereafter 'island' population). The second dimensional axis explains 1.8% of the variance and further divides the individuals from the coastal population latitudinally.
The ADMIXTURE analysis shows the highest likelihood for the model assuming two ancestral clusters (K = 2, figure 3a), which separated all the island individuals from all the coastal individuals. Individuals from the island do not share ancestry with the coastal population. Conversely, individuals from the coast share some ancestry with the island population, with the population closest to the island (CE) showing up to 32.4% of island ancestry, and the furthest population (SC) showing no island ancestry.
Genetic differentiation (FST) between the island population and every locality from the coastal population was significant and above 0.17 (electronic supplementary material, table S4); FST between the island cluster and the coastal cluster was 0.14 (p < 0.05). Between coastal sites, FST was relatively low and was highest between the extremes of the coastal distribution (CE–SC: 0.04, p < 0.05; electronic supplementary material, table S4). Accordingly, there is significant isolation by distance in the coastal population (y = 0.0054 + 7.1 × 10⁻⁹x, R² = 0.318, p = 0.001; figure 4a).
Estimates of individual nucleotide diversity based only on variant sites (π-SNP) ranged from 0.047 to 0.069 for the coast, and from 0.044 to 0.054 for the island (figure 4b). The sampling sites at the known extremes of the species distribution showed average levels of diversity. Overall π-SNP for the coast was 0.06 while in FNO it was 0.05 (figure 4b; electronic supplementary material, table S5). The inbreeding coefficient (FIS) was 0.033 in FNO and ranged from 0.024 to 0.109 for coastal sampling sites (electronic supplementary material, table S6 and figure S3).
[Figure 2 panels: (a) LGM; (b) present. Maps span 50° W to 30° W and 10° S to 30° S, with FNO and ABR marked. Colour scale: habitat suitability, 0 to 0.98. Shading: land exposed today; land exposed LGM.]
Figure 2. Ecological niche model for the occurrence of the scaled sardine Harengula sp. in (a) the Last Glacial Maximum (LGM, approx. 21 ka) and (b) present climates. The magnified area in the upper inset shows suitability among seamounts connecting the oceanic archipelago of Fernando de Noronha (FNO) to the Brazilian coast in the LGM, and that in the lower inset highlights the increase in habitat suitability at the continental shelf Abrolhos Island (ABR). The contour of the models reflects marine provinces as described in Spalding et al. [45].
(d) Demographic history
Values of Tajima's D were negative for every coastal locality (between −0.24 and −0.85; electronic supplementary material, table S5), as well as for the whole coastal cluster considered together (−1.50, p > 0.1). Tajima's D was also negative for the FNO population, yet closer to neutral expectations (−0.29, p > 0.1). These results are consistent with deviations from the assumptions of a Wright–Fisher population [68]. Some such deviations, such as change in population size and migration, are accounted for in the demographic models below.
After down-projecting the SNP data, we considered 19 905 SNPs, 62 individuals from the coastal populations and seven individuals
from FNO. All models including gene flow have lower AIC scores than the model without gene flow (electronic
supplementary material, figure S4). The lowest AIC values were observed for the 'symmetric migration' and 'asymmetric
migration' models (electronic supplementary material, table S7; figure 3c). However, only the simpler 'symmetric
migration' model showed complete convergence of the estimated parameters across replicates and presented low residuals (electronic
supplementary material, figures S4 and S5). Ne estimation indicated that the population from the coast experienced an expansion
after splitting (Ne1 = 3.597 relative to the ancestral population), while the island population experienced a contraction (Ne2 = 0.324;
electronic supplementary material, table S7; figure 3d). The estimated symmetric migration (2Nem) is 2.785.
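The model comparison described above can be sketched as follows, using the standard definition AIC = 2k − 2 ln L and Akaike weights [67]. The log-likelihoods below are invented placeholders, not values from table S7, and the parameter counts for the 'no migration' and 'asymmetric migration' models (3 and 5) are assumptions; only the four parameters of the 'symmetric migration' model are stated in the text.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def akaike_weights(aic_scores):
    """Relative support for each model given a dict of AIC scores."""
    best = min(aic_scores.values())
    rel = {m: math.exp(-0.5 * (score - best)) for m, score in aic_scores.items()}
    total = sum(rel.values())
    return {m: r / total for m, r in rel.items()}


# Hypothetical (log-likelihood, number of parameters) pairs, for illustration only.
models = {
    "no_migration": (-1250.0, 3),
    "symmetric_migration": (-1100.0, 4),
    "asymmetric_migration": (-1099.5, 5),
}

scores = {name: aic(ll, k) for name, (ll, k) in models.items()}
weights = akaike_weights(scores)
best_model = min(scores, key=scores.get)
```

With these placeholder numbers the extra parameter of the asymmetric model buys too little likelihood to offset its penalty, so the simpler symmetric model has the lowest AIC. AIC alone does not capture the convergence criterion that the text also applies.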
4. Discussion
(a) How does habitat suitability generate intraspecific divergence?
Ecological niche models contribute to uncovering environmental corridors and locations of potential population settlement. For
species with high dispersal ability, like fishes, these models identify potential corridors connecting distantly located habitats, as
well as zones of environmental inadequacy isolating them, aiding the detection of population structure that can be tested with
other tools, such as genomics. Our ecological models indicate that depth and salinity explain most of the distribution of Harengula
sp. (contributions of 73.5% and 23.7%, respectively; electronic supplementary material, table S3). The present-day model (figure 2b)
shows a continuous suitable habitat covering the continental shelf of most of the Brazilian coast, whereas suitable habitats in the
two oceanic archipelagos (i.e. Fernando de Noronha (FNO) and Trindade-Martin Vaz) are disconnected from the coast. During
the LGM, chains of seamounts provided suitable habitat that connected the coast to the islands (inset; figure 2a); such a putative
ecological corridor has low relative suitability in the present-day model. Studies on co-distributed species dependent on shallow
waters, such as parrotfishes [69], octopuses [70], moray eels, mullets and other animals [30], highlight the importance of seamounts
for the colonization of oceanic islands during glacial periods of lower sea level.
[Figure 3 panels: (a) genetic admixture (ADMIXTURE proportion per site: FNO, CE, RN, PB, PE, AL, BA, ABR, ES, RJ, SP, SC); (b) genetic divergence (axis 1, 4.0%; axis 2, 1.8%; groups: coast north, coast south, FNO); (c) demographic models (M1, no migration; M2, symmetric migration; M3, asymmetric migration); (d) demographic parameters (Ne coast = 3.6; Ne FNO = 0.3; 2Nem = 2.8).]
Figure 3. Genomic divergence and gene flow between intraspecific lineages of the scaled sardine Harengula sp. (a) ADMIXTURE analyses assuming two ancestral
clusters (model with highest likelihood). (b) Principal component analysis. (c) Competing demographic models depicting divergence between the coastal and island
clusters. The coloured model is most likely to fit the data according to Akaike information criterion (electronic supplementary material, table S7), but parameters are
not to scale. (d) Demographic parameters estimated using the best demographic model (symmetric migration). The width of the population boxes reflects effective
population size (Ne) relative to the ancestral Ne (anc). For explanation of abbreviations for site names, see figure 1.
Our results also reveal the effectiveness of oceanographic barriers in the evolution of Harengula sp. The population structure
analyses indicate that Harengula sp. is structured into a coastal population, including the Abrolhos Island on the continental shelf,
and an island population at FNO (figure 3a,b). Some co-distributed species that likely originated on the Brazilian coast
and colonized FNO also show this genetic divergence, as in the case of corals [71], lobsters [72] and the reef-associated rockpool
blenny [73]. However, other species that colonized FNO do not show this pattern of divergence, such as parrotfishes [74], octopuses
[70] and snappers [75], indicating that species occurring in similar habitats can have different evolutionary responses to
the same oceanographic barriers. Future studies can clarify whether such idiosyncratic patterns of genetic divergence are related to
dispersal traits, such as the duration of the larval stage, type of reproduction and body size.
In accordance with our ENM showing continuous suitability along the 3700 km of the coastal range sampled here, we find low
genetic differentiation between coastal localities (FST < 0.04; electronic supplementary material, table S4) and significant isolation
by distance (figure 4a). These levels of genetic differentiation are similar to values found in other pelagic species with large
dispersal capacity, such as the dog snapper (FST = 0.037) [75]. Thus, our results suggest that continuous habitat allows gene flow along
the coastal range, but that the limited dispersal rate of this species has also resulted in clinal divergence along the coast.
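To make the isolation-by-distance test concrete, the sketch below linearises pairwise FST as FST / (1 − FST), the quantity plotted in figure 4a, and correlates it with geographical distance using a simple Mantel-style permutation test. It is a toy illustration under stated assumptions, not the analysis behind figure 4a: the four sites, their distances and their FST values are invented, and the function names are hypothetical.

```python
import math
import random


def linearise_fst(fst):
    """Linearised genetic distance FST / (1 - FST)."""
    return fst / (1.0 - fst)


def upper_triangle(matrix, order=None):
    """Flatten the upper triangle of a symmetric matrix, optionally relabelling sites."""
    n = len(matrix)
    idx = order if order is not None else list(range(n))
    return [matrix[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def mantel(genetic, geographic, n_perm=999, seed=1):
    """One-sided Mantel test: permute site labels of the genetic matrix."""
    rng = random.Random(seed)
    geo_vec = upper_triangle(geographic)
    observed = pearson(upper_triangle(genetic), geo_vec)
    n = len(genetic)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        order = list(range(n))
        rng.shuffle(order)
        if pearson(upper_triangle(genetic, order), geo_vec) >= observed:
            hits += 1
    return observed, hits / (n_perm + 1)


# Hypothetical sites placed along a straight coastline (positions in km).
positions = [0.0, 1000.0, 2000.0, 3000.0]
geo = [[abs(a - b) for b in positions] for a in positions]
# Hypothetical FST values that increase with distance (maximum 0.03).
fst = [[0.00001 * d for d in row] for row in geo]
gen = [[linearise_fst(v) for v in row] for row in fst]

r, p_value = mantel(gen, geo)
```

With only four sites there are 24 possible relabellings, so the permutation p-value in this toy case is coarse. The point is the workflow, not the number.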
(b) How does demographic history condition current levels of genetic variability?
We tested if changes in habitat suitability during glacial cycles are consistent with temporal changes in effective population size
(Ne), a measure of genetic variability. Ecological and demographic models of Harengula sp. show congruent scenarios. Our ENMs
suggest that the suitable habitats for Harengula sp. strongly increased at the coast but remained stable at the oceanic islands,
including the sampled archipelago FNO (figure 2). Similar spatial patterns have been detected for multiple co-distributed coastal species
[76,77], raising the hypothesis that marine species restricted to shallow habitats might have contrasting demographic histories at
oceanic islands relative to the coast, a hypothesis that is supported by our demographic analyses of Harengula sp.
Our demographic modelling shows that a relatively simple model of split between coastal and island populations with instantaneous
change in population size, followed by symmetric gene flow (i.e. the 'symmetric migration' model with four parameters), is
the best model explaining the demographic history of these populations and is a good representation of the observed genomic data
(electronic supplementary material, figure S5). This model, along with other models allowing gene flow, clearly rejects a 'no migration'
model (electronic supplementary material, figure S4), showing that, after the colonization of the oceanic island from the coast (i.e.
population split), the two divergent sardine populations evolved in the presence of genetic connectivity across the oceanographic
barriers described above. More complex models including asymmetric migration or a second change in population size do not
show convergence in parameter estimates and AIC scores across replicates, and therefore could not be interpreted (electronic
supplementary material, figure S4). This genetic connectivity is congruent with our ENM for Harengula sp., suggesting that seamount
chains between the Brazilian coast and the oceanic archipelago of FNO might act as an effective ecological corridor between
differentiated populations, particularly during the glacial periods, when the sea level was lower (figure 2).
[Figure 4 panels: (a) isolation by distance, FST / (1 – FST) against distance (km), R2 = 0.318, p = 0.001; (b) genetic diversity, nucleotide diversity (π-SNP) per sampling site (SC, SP, RJ, ES, ABR, BA, AL, PE, PB, RN, CE, FNO).]
Figure 4. Genetic diversity of Harengula sp. (a) Isolation by distance, considering mainland coastal sites only (excluding FNO), and (b) individual nucleotide diversity
(π-SNP) grouped per sampling site (mainland coastal sites in orange and oceanic island in dark grey). For explanation of abbreviations for site names, see figure 1.
Under the best model of symmetric migration, we find that the coastal population experienced an expansion to 3.6 times the
size of the ancestral population, while the island population experienced a contraction to one-third of the ancestral population
(electronic supplementary material, table S7). Such contrasting demographic histories recapitulate the results from our ENM,
suggesting that the observed changes in genomic variability and effective population size (Ne) reflect long-term evolutionary
processes associated with the glacial cycles rather than shorter-term processes associated with human-induced over-exploitation. The
magnitude of the gene flow (2Nem, or the effective number of gene migrations received by a population per generation) detected is
also surprising. In the absence of selection, 2Nem values equal to or higher than 1 prevent populations from accumulating or
maintaining divergence [68]. Therefore, finding higher values of 2Nem between populations that maintain genomic differentiation
implies that selection, at least in part of the genome, must counteract gene flow [78]. In Harengula sp. we find that 2Nem is 2.8,
suggesting that selection is maintaining divergence between coastal and island populations in the face of such high migration
rates. Yet, it is unclear which forms of selection could maintain divergence between these populations. Interestingly, Harengula
sp. is the only species among more than 400 species of the Clupeiformes that occurs in the archipelago of FNO [30], a situation
that indicates that oceanic islands in Brazil are generally unsuitable for sardines and herrings. Future studies (of e.g. diet,
morphology, physiology, development) are necessary for understanding which factors are associated with divergence and
persistence of the oceanic lineage of Harengula sp. in such an unusual habitat. In addition, increasing the genomic data by
using high-coverage whole-genome sequencing can provide insights about underlying genetic adaptation in the island population.
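The fold changes quoted in this section follow directly from the relative effective sizes reported in the Results (Ne1 = 3.597 for the coast and Ne2 = 0.324 for FNO, both relative to the ancestral population). The short sketch below only makes that arithmetic explicit; the helper name `fold_change` is hypothetical.

```python
def fold_change(relative_ne):
    """Return (direction, fold) for an Ne expressed relative to the ancestral Ne."""
    if relative_ne >= 1.0:
        return "expansion", relative_ne
    # A relative size below 1 is a contraction by the reciprocal factor.
    return "contraction", 1.0 / relative_ne


NE_COAST = 3.597  # relative Ne of the coastal population (Results)
NE_FNO = 0.324    # relative Ne of the island population (Results)

coast = fold_change(NE_COAST)
island = fold_change(NE_FNO)
# Ratio between the two present-day effective sizes implied by the same estimates.
coast_to_island = NE_COAST / NE_FNO

print(f"coast: {coast[0]} of {coast[1]:.1f}-fold")
print(f"island: {island[0]} of {island[1]:.1f}-fold")
print(f"coast / island: {coast_to_island:.1f}")
```

Under these estimates the coastal effective size is roughly eleven times that of the island population, which is a consequence of the two reported values rather than an additional result.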
(c) How does evolutionary genomics provide insights into conservation and management of fishery stocks?
Previous studies using catch data of Harengula sp. from the coast of Brazil have shown that the species already presents signs of
over-exploitation [32], suggesting the need for protection measures. Yet, it was unclear to what extent the coastal population is
representative of the whole species range, making it difficult to establish recommendations for sustainable exploitation of this species
in its entire distribution. Our study provides the first evidence that this undescribed, yet exploited, sardine species encompasses one
unique coastal lineage, and at least another isolated lineage in the oceanic archipelago of FNO. It remains to be tested if there is a third
lineage in the isolated oceanic archipelago of Trindade-Martin Vaz. Levels of genomic differentiation between lineages (FST = 0.14;
electronic supplementary material, table S4) are similar to or above those distinguishing fishery stocks such as the European sardine
(minimum FST between stocks > 0.05; [79]), the European hake (min FST > 0.058; [80]) and yellowfin tuna (min FST > 0.15; [22])
estimated using similar data. The coastal and oceanic populations of Harengula sp. should therefore be managed as independent
stocks, with strategies considering the particularities of each location, including the different fishing activities and their economic
dependence on this natural resource. The FNO population, in particular, is likely locally adapted to environmental conditions that
are exceptional for species of the Clupeidae. The ecological and social importance of the island lineage, in addition to its independent
evolution, suggests that the no-take zone of the Marine Protected Area in FNO, which covers only part of the distribution of the island
population, is important for supporting the viability of this population of Harengula sp. This reinforces the need for monitoring and
management of this likely sensitive resource in order to maintain exploitation at sustainable levels in the long term, ensuring not only
the livelihoods of the local fishing population, but also the proper functioning of the archipelago's marine ecosystem, which is highly
dependent on these forage fishes. More generally, the framework presented here can be applied to other species in need of urgent
management decisions but where ecological and genetic information remains limited. Such limitation disproportionately affects
less developed areas of the world, particularly tropical regions of the Southern Hemisphere, where human dependence on natural
resources is stronger, scientific resources are more limited, and a significant proportion of biodiversity remains to be described.
Thus, the integration of ecological and genomic methods can offer a low-cost and efficient solution to resolve the accelerating conflicts
between management and local communities that depend on undescribed biodiversity.
Ethics. This work did not require ethical approval from a human subject or animal welfare committee.
Data accessibility. Data used in the ecological niche model and genomic analyses are available at the Dryad Digital Repository and can be accessed
here: https://doi.org/10.5061/dryad.np5hqbzzg [81].
Supplementary material is available online [82].
Declaration of AI use. We have not used AI-assisted technologies in creating this article.
Authors' contributions. J.F.R.C.: conceptualization, data curation, formal analysis, investigation, visualization, writing – original draft, writing – review
and editing; L.d.F.M.: data curation; F.D.D.: data curation, review and editing; P.H.C.: data curation; R.M.D.: data curation; S.M.Q.L.: conceptualization,
resources, supervision, writing – review and editing; J.T.V.: conceptualization, formal analysis, funding acquisition, investigation, project
administration, supervision, visualization, writing – original draft, writing – review and editing; R.J.P.: conceptualization, project administration,
resources, supervision, visualization, writing – original draft, writing – review and editing.
All authors gave final approval for publication and agreed to be held accountable for the work performed herein.
Conflict of interest declaration. The authors have no conflict of interest to declare.
Funding. J.F.R.C. was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – Brasil (CAPES), finance code 001.
Financial support to F.D.D. was provided by CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico, PROTAX 443302/2020).
S.M.Q.L. receives a Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) productivity research grant (no. Proc 312066/2021-0).
This study was developed in the context of the 'Projeto MULTIPESCA – Ciência para a sustentabilidade da pesca, pescado e pescadores do
Rio de Janeiro', which received support from the Marine and Fisheries Research Project. The Marine and Fisheries Research Project is an offset
measure established under a consent decree agreed between the company PRIO and the Federal Public Prosecutor's Office in Rio de Janeiro. It is
implemented by FUNBIO.
Acknowledgments. We thank Ricardo Betancur-R (Scripps Institution of Oceanography) for donating samples.
References
1. Barnosky AD et al. 2011 Has the Earth's sixth mass extinction already arrived? Nature 471, 51–57. (doi:10.1038/nature09678)
2. Pereira HM et al. 2010 Scenarios for global biodiversity in the 21st century. Science 330, 1496–1501. (doi:10.1126/science.1196624)
3. Cardinale BJ et al. 2012 Biodiversity loss and its impact on humanity. Nature 486, 59–67. (doi:10.1038/nature11148)
4. Ceballos G, Ehrlich PR, Dirzo R. 2017 Biological annihilation via the ongoing sixth mass extinction signaled by vertebrate population losses and declines. Proc. Natl Acad. Sci. USA 114, E6089–E6096. (doi:10.1073/pnas.1704949114)
5. CBD. 1992 Convention on Biological Diversity: text and annexes. Montreal, Canada: Secretariat of the Convention on Biological Diversity.
6. O'Brien D et al. 2022 Bringing together approaches to reporting on within species genetic diversity. J. Appl. Ecol. 59, 2227–2233. (doi:10.1111/1365-2664.14225)
7. Lopes PFM, Rosa EM, Salyvonchyk S, Nora V, Begossi A. 2013 Suggestions for fixing top-down coastal fisheries management through participatory approaches. Mar. Pol. 40, 100–110. (doi:10.1016/j.marpol.2012.12.033)
8. Garner BA et al. 2016 Genomics in conservation: case studies and bridging the gap between data and application. Trends Ecol. Evol. 31, 81–83. (doi:10.1016/j.tree.2015.10.009)
9. Hogg CJ, Ottewell K, Latch P, Rossetto M, Biggs J, Gilbert A, Richmond S, Belov K. 2022 Threatened Species Initiative: empowering conservation action using genomic resources. Proc. Natl Acad. Sci. USA 119, e2115643118. (doi:10.1073/pnas.2115643118)
10. Xuereb A, D'Aloia CC, Andrello M, Bernatchez L, Fortin M. 2021 Incorporating putatively neutral and adaptive genomic data into marine conservation planning. Conserv. Biol. 35, 909–920. (doi:10.1111/cobi.13609)
11. Costa GC, Nogueira C, Machado RB, Colli GR. 2010 Sampling bias and the use of ecological niche modeling in conservation planning: a field evaluation in a biodiversity hotspot. Biodivers. Conserv. 19, 883–899. (doi:10.1007/s10531-009-9746-8)
12. Schwartz MW. 2012 Using niche models with climate projections to inform conservation management decisions. Biol. Conserv. 155, 149–156. (doi:10.1016/j.biocon.2012.06.011)
13. Smedbol RK, Stephenson R. 2001 The importance of managing within-species diversity in cod and herring fisheries of the north-western Atlantic. J. Fish Biol. 59, 109–128. (doi:10.1111/j.1095-8649.2001.tb01382.x)
14. Hare MP, Nunney L, Schwartz MK, Ruzzante DE, Burford M, Waples RS, Ruegg K, Palstra F. 2011 Understanding and estimating effective population size for practical application in marine species management. Conserv. Biol. 25, 438–449. (doi:10.1111/j.1523-1739.2010.01637.x)
15. Carnaval AC, Hickerson MJ, Haddad CFB, Rodrigues MT, Moritz C. 2009 Stability predicts genetic diversity in the Brazilian Atlantic Forest hotspot. Science 323, 785–789. (doi:10.1126/science.1166955)
16. Chen Y, Jiang Z, Fan P, Ericson PGP, Song G, Luo X, Lei F, Qu Y. 2022 The combination of genomic offset and niche modelling provides insights into climate change-driven vulnerability. Nat. Commun. 13, 4821. (doi:10.1038/s41467-022-32546-z)
17. Templeton AR, Robertson RJ, Brisson J, Strasburg J. 2001 Disrupting evolutionary processes: the effect of habitat fragmentation on collared lizards in the Missouri Ozarks. Proc. Natl Acad. Sci. USA 98, 5426–5432. (doi:10.1073/pnas.091093098)
18. Fraser DJ, Weir LK, Bernatchez L, Hansen MM, Taylor EB. 2011 Extent and scale of local adaptation in salmonid fishes: review and meta-analysis. Heredity 106, 404–420. (doi:10.1038/hdy.2010.167)
19. Knutsen H et al. 2022 Combining population genomics with demographic analyses highlights habitat patchiness and larval dispersal as determinants of connectivity in coastal fish species. Mol. Ecol. 31, 2562–2577. (doi:10.1111/mec.16415)
20. Valenzuela-Quiñonez F. 2016 How fisheries management can benefit from genomics? Brief Funct. Genomics 15, 352–357. (doi:10.1093/bfgp/elw006)
21. Johansen T, Besnier F, Quintela M, Jorde PE, Glover KA, Westgaard J, Dahle G, Lien S, Kent MP. 2020 Genomic analysis reveals neutral and adaptive patterns that challenge the current management regime for East Atlantic cod Gadus morhua L. Evol. Appl. 13, 2673–2688. (doi:10.1111/eva.13070)
22. Pecoraro C et al. 2018 The population genomics of yellowfin tuna (Thunnus albacares) at global geographic scale challenges current stock delineation. Scient. Rep. 8, 13890. (doi:10.1038/s41598-018-32331-3)
23. Nielsen EE et al. 2012 Gene-associated markers provide tools for tackling illegal fishing and false eco-certification. Nat. Commun. 3, 851. (doi:10.1038/ncomms1845)
24. Barth JMI, Damerau M, Matschiner M, Jentoft S, Hanel R. 2017 Genomic differentiation and demographic histories of Atlantic and Indo-Pacific yellowfin tuna (Thunnus albacares) populations. Genome Biol. Evol. 9, 1084–1098. (doi:10.1093/gbe/evx067)
25. Layton KKS et al. 2021 Genomic evidence of past and future climate-linked loss in a migratory Arctic fish. Nat. Clim. Change 11, 158–165. (doi:10.1038/s41558-020-00959-7)
26. Bevilacqua AHV, Angelini R, Steenbeek J, Christensen V, Carvalho AR. 2019 Following the fish: the role of subsistence in a fish-based value chain. Ecol. Econ. 159, 326–334. (doi:10.1016/j.ecolecon.2019.02.004)
27. Whitehead PJP. 1985 FAO species catalogue. Clupeoid fishes of the world. An annotated and illustrated catalogue of the herrings, sardines, pilchards, sprats, anchovies and wolfherrings. Part 1 - Chirocentridae, Clupeidae and Pristigasteridae. FAO Fish. Synop. no. 125, vol. 7, pt 1. Rome, Italy: Food and Agriculture Organization of the United Nations.
28. Birge TL, Ralph GM, Di Dario F, Munroe TA, Bullock RW, Maxwell SM, Santos MD, Hata H, Carpenter KE. 2021 Global conservation status of the world's most prominent forage fishes (Teleostei: Clupeiformes). Biol. Conserv. 253, 108903. (doi:10.1016/j.biocon.2020.108903)
29. Araújo TFP. 2020 Da ginga à sardinha: etnoictiologia e sistemática molecular de pequenos peixes de valor cultural da costa Brasileira [From ginga to sardines: ethnoichthyology and molecular systematics of a small fish of cultural value at the Brazilian coast]. Master's thesis, Universidade Federal do Rio Grande do Norte, Natal.
30. Pinheiro HT et al. 2015 Fish biodiversity of the Vitória-Trindade seamount chain, southwestern Atlantic: an updated database. PLoS ONE 10, e0118180. (doi:10.1371/journal.pone.0118180)
31. Barletta M, Lima ARA. 2019 Systematic review of fish ecology and anthropogenic impacts in South American estuaries: setting priorities for ecosystem conservation. Front. Mar. Sci. 6, 237. (doi:10.3389/fmars.2019.00237)
32. Verba JT, Pennino MG, Coll M, Lopes PFM. 2020 Assessing drivers of tropical and subtropical marine fish collapses of Brazilian Exclusive Economic Zone. Sci. Total Environ. 702, 134940. (doi:10.1016/j.scitotenv.2019.134940)
33. Liu J, Slik F, Zheng S, Lindenmayer DB. 2022 Undescribed species have higher extinction risk than known species. Conserv. Lett. 15, e12876. (doi:10.1111/conl.12876)
34. Tedesco PA, Bigorne R, Bogan AE, Giam X, Jézéquel C, Hugueny B. 2014 Estimating how many undescribed species have gone extinct. Conserv. Biol. 28, 1360–1370. (doi:10.1111/cobi.12285)
35. Ferreira-Araujo T, Macedo Lopes PF, Queiroz Lima SM. 2021 Size matters: identity of culturally important herrings in northeastern Brazil. Ethnobiol. Conserv. 10, 1–29. (https://ethnobioconservation.com/index.php/ebc/article/view/402)
36. Lopes PFM, Mendes L, Fonseca V, Villasante S. 2017 Tourism as a driver of conflicts and changes in fisheries value chains in Marine Protected Areas. J. Environ. Manage. 200, 123–134. (doi:10.1016/j.jenvman.2017.05.080)
37. DOU. 2021 Extrato de compromisso [Commitment summary]. Diário Oficial da União, Governo Federal do Brasil 13, 119.
38. Phillips SJ, Anderson RP, Schapire RE. 2006 Maximum entropy modeling of species geographic distributions. Ecol. Modell. 190, 231–259. (doi:10.1016/j.ecolmodel.2005.03.026)
39. Hijmans RJ, Phillips SJ, Leathwick J, Elith J. 2017 Package 'dismo'. See https://cran.r-project.org/web/packages/dismo/dismo.pdf.
40. R Core Team. 2023 R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. See http://www.R-project.org/.
41. Allouche O, Tsoar A, Kadmon R. 2006 Assessing the accuracy of species distribution models: prevalence, kappa and the true skill statistic (TSS). J. Appl. Ecol. 43, 1223–1232. (doi:10.1111/j.1365-2664.2006.01214.x)
42. Shabani F, Kumar L, Ahmadi M. 2018 Assessing accuracy methods of species distribution models: AUC, specificity, sensitivity and the true skill statistic. Glob. J. Hum. Social Sci. B 18, 7–18.
43. GBIF. 2021 Global Biodiversity Information Facility, 9254. (doi:10.15468/dl.kuxret)
44. GBIF. 2021 Global Biodiversity Information Facility, 0. (doi:10.15468/dl.3z4egq)
45. Spalding MD et al. 2007 Marine Ecoregions of the World: A Bioregionalization of Coastal and Shelf Areas. BioScience 57, 573–583. (doi:10.1641/B570707)
46. Zizka A et al. 2020 No one-size-fits-all solution to clean GBIF. PeerJ 8, e9916. (doi:10.7717/peerj.9916)
47. Bosch S, Tyberghein L, De Clerck O, Fernandez S, Schepers L, LifeWatch Belgium. 2022 sdmpredictors: Species distribution modelling predictor datasets. See https://cran.r-project.org/web/packages/sdmpredictors/sdmpredictors.pdf.
48. Sbrocco EJ. 2014 Paleo-MARSPEC: gridded ocean climate layers for the mid-Holocene and Last Glacial Maximum: ecological archives E095-149. Ecology 95, 1710. (doi:10.1890/14-0443.1)
49. Sbrocco EJ, Barber PH. 2013 MARSPEC: ocean climate layers for marine spatial ecology: ecological archives E094-086. Ecology 94, 979. (doi:10.1890/12-1358.1)
50. Radosavljevic A, Anderson RP. 2014 Making better Maxent models of species distributions: complexity, overfitting and evaluation. J. Biogeogr. 41, 629–643. (doi:10.1111/jbi.12227)
51. Kilian A et al. 2012 Diversity arrays technology: a generic genome profiling technology on open platforms. In Data production and analysis in population genomics (eds F Pompanon, A Bonin), pp. 67–89. Totowa, NJ: Humana Press.
52. Jaccoud D. 2001 Diversity arrays: a solid state technology for sequence information independent genotyping. Nucleic Acids Res. 29, 25e225. (doi:10.1093/nar/29.4.e25)
53. Gruber B, Unmack PJ, Berry OF, Georges A. 2018 DARTR: an R package to facilitate analysis of SNP data generated from reduced representation genome sequencing. Mol. Ecol. Resour. 18, 691–699. (doi:10.1111/1755-0998.12745)
54. Alexander DH, Novembre J, Lange K. 2009 Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664. (doi:10.1101/gr.094052.109)
55. Wold S. 1978 Cross-validatory estimation of the number of components in factor and principal components models. Technometrics 20, 397–405. (doi:10.1080/00401706.1978.10489693)
56. Wright S. 1943 Isolation by distance. Genetics 28, 114–138. (doi:10.1093/genetics/28.2.114)
57. Manly BFJ. 1986 Randomization and regression methods for testing for associations with geographical, environmental and biological distances between populations. Popul. Ecol. 28, 201–218. (doi:10.1007/BF02515450)
58. Mantel N. 1967 The detection of disease clustering and a generalized regression approach. Cancer Res. 27, 175–178.
59. Nei M, Li WH. 1979 Mathematical model for studying genetic variation in terms of restriction endonucleases. Proc. Natl Acad. Sci. USA 76, 5269–5273. (doi:10.1073/pnas.76.10.5269)
60. Korunes KL, Samuk K. 2021 pixy: Unbiased estimation of nucleotide diversity and divergence in the presence of missing data. Mol. Ecol. Resour. 21, 1359–1368. (doi:10.1111/1755-0998.13326)
61. Goudet J. 2005 hierfstat, A package for R to compute and test hierarchical F-statistics. Mol. Ecol. Notes 5, 184–186. (doi:10.1111/j.1471-8286.2004.00828.x)
62. Tajima F. 1989 Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123, 585–595. (doi:10.1093/genetics/123.3.585)
63. Rozas J, Ferrer-Mata A, Sánchez-DelBarrio JC, Guirao-Rico S, Librado P, Ramos-Onsins SE, Sánchez-Gracia A. 2017 DnaSP 6: DNA sequence polymorphism analysis of large data sets. Mol. Biol. Evol. 34, 3299–3302. (doi:10.1093/molbev/msx248)
64. Gutenkunst RN, Hernandez RD, Williamson SH, Bustamante CD. 2009 Inferring the joint demographic history of multiple populations from multidimensional SNP frequency data. PLoS Genet. 5, e1000695. (doi:10.1371/journal.pgen.1000695)
65. Portik DM, Leaché AD, Rivera D, Barej MF, Burger M, Hirschfeld M, Rödel M, Blackburn DC, Fujita MK. 2017 Evaluating mechanisms of diversification in a Guineo-Congolian tropical forest frog using demographic model selection. Mol. Ecol. 26, 5245–5263. (doi:10.1111/mec.14266)
66. Akaike H. 1973 Maximum likelihood identification of Gaussian autoregressive moving average models. Biometrika 60, 255–265. (doi:10.1093/biomet/60.2.255)
67. Wagenmakers E-J, Farrell S. 2004 AIC model selection using Akaike weights. Psychon. Bull. Rev. 11, 192–196. (doi:10.3758/BF03206482)
68. Wright S. 1931 Evolution in Mendelian populations. Genetics 16, 97–159. (doi:10.1093/genetics/16.2.97)
69. Roos NC, Carvalho AR, Lopes PFM, Pennino MG. 2015 Modeling sensitive parrotfish (Labridae: Scarini) habitats along the Brazilian coast. Mar. Environ. Res. 110, 92–100. (doi:10.1016/j.marenvres.2015.08.005)
70. Bein B, Lima FD, Lazzarotto H, Rocha LA, Leite TS, Lima SMQ, Pereira RJ. 2023 Population genomics of an Octopus species identify oceanographic barriers and inbreeding patterns. Mar. Biol. 170, 161. (doi:10.1007/s00227-023-04307-z)
71. Peluso L, Tascheri V, Nunes FLD, Castro CB, Pires DO, Zilberberg C. 2018 Contemporary and historical oceanographic processes explain genetic connectivity in a southwestern Atlantic coral. Scient. Rep. 8, 2684. (doi:10.1038/s41598-018-21010-y)
72. Gaeta J, Acevedo I, López-Márquez V, Freitas R, Cruz R, Maggioni R, Herrera R, Machordom A. 2020 Genetic differentiation among Atlantic island populations of the brown spiny lobster Panulirus echinatus (Decapoda: Palinuridae). Aquat. Conserv. Mar. 30, 868–881. (doi:10.1002/aqc.3297)
73. Neves JMM, Lima SMQ, Mendes LF, Torres RA, Pereira RJ, Mott T. 2016 Population structure of the rockpool blenny Entomacrodus vomerinus shows source-sink dynamics among ecoregions in the tropical southwestern Atlantic. PLoS ONE 11, e0157472. (doi:10.1371/journal.pone.0157472)
74. Verba JT, Ferreira CEL, Pennino MG, Hagberg L, Lopes PFM, Padovani Ferreira B, Maia Queiroz Lima S, Stow A. 2023 Genetic structure of the threatened gray parrotfish (Sparisoma axillare) in the southwestern Atlantic. Coral Reefs 42, 105–117. (doi:10.1007/s00338-022-02324-w)
75. Verba JT, Stow A, Bein B, Pennino MG, Lopes PFM, Ferreira BP, Mortier M, Maia Queiroz Lima S, Pereira RJ. 2022 Low population genetic structure is consistent with high habitat connectivity in a commercially important fish species (Lutjanus jocu). Mar. Biol. 170, 5. (doi:10.1007/s00227-022-04149-1)
76. Jenkins TL, Castilho R, Stevens JR. 2018 Meta-analysis of northeast Atlantic marine taxa shows contrasting phylogeographic patterns following post-LGM expansions. PeerJ 6, e5684. (doi:10.7717/peerj.5684)
77. Marko PB, Hoffman JM, Emme SA, Mcgovern TM, Keever CC, Cox LN. 2010 The 'expansion–contraction' model of Pleistocene biogeography: rocky shores suffer a sea change? Mol. Ecol. 19, 146–169. (doi:10.1111/j.1365-294X.2009.04417.x)
78. Hey J, Pinho C. 2012 Population genetics and objectivity in species diagnosis. Evolution 66, 1413–1429. (doi:10.1111/j.1558-5646.2011.01542.x)
79. Fonseca R da et al. 2022 Population genomics reveals the underlying structure of the small pelagic European sardine and suggests low connectivity within Macaronesia. (doi:10.22541/au.161628445.52373083/v3)
80. Leone A, Álvarez P, García D, Saborido-Rey F, Rodriguez-Ezpeleta N. 2019 Genome-wide SNP based population structure in European hake reveals the need for harmonizing biological and management units. ICES J. Mar. Sci. 76, 2260–2266. (doi:10.1093/icesjms/fsz161)
81. Coelho JFR, Mendes LF, Di Dario F, Carvalho PH, Dias RM, Lima SMQ, Verba JT, Pereira RJ. 2024 Data from: Integration of genomic and ecologic methods inform management of an undescribed, yet highly exploited, sardine species. Dryad Digital Repository. (doi:10.5061/dryad.np5hqbzzg)
82. Coelho JFR, Mendes LF, Di Dario F, Carvalho PH, Dias RM, Lima SMQ, Verba JT, Pereira RJ. 2024 Integration of genomic and ecological methods inform management of an undescribed, yet highly exploited, sardine species. Figshare. (doi:10.6084/m9.figshare.c.7090089)
royalsocietypublishing.org/journal/rspb Proc. R. Soc. B 291: 20232746
... However, Harengula sp. is absent at the São Pedro e São Paulo archipelago, an oceanic island of the Brazilian EEZ located ca. 630 km northeast of FNO (Pinheiro et al. 2020). Previous modelling and genetic studies with this unique insular sardine have only included data from FNO and mainland coastal sites (Bennemann 2022; Coelho et al. 2024). The absence of information from ATR and TMV highlights the difficult and costly logistics involved in sampling in MPAs on oceanic islands. ...
... Similarly, our models indicate that environmental suitability for the occurrence of the scaled sardine Harengula sp. will shift southwards, which may lead this species to become less abundant or even disappear from most of its northern distribution along the Brazilian coast, and become more abundant in coastal zones of the south, where suitability is predicted to increase in the future even under the worst-case scenario of climate change (RCP 8.5). ENMs also show that the high freshwater discharge of the Amazon-Orinoco River plume is an area of low suitability for the occurrence of Harengula sp., probably delimiting the northern boundary of habitat suitability for this species, as previously suggested (Coelho et al. 2024). Additionally, we show that MPAs closer to the mainland coast can export sardine eggs and larvae that could replenish the fishery stock biomass at the Brazilian coast, while such passive dispersal was not detected from more distant MPAs. ...
... during the Last Glacial Maximum (ca. 21 kya) indicate narrow suitability along the Brazilian coast, followed by expansion of habitat suitability towards the present climate, agreeing with the signal of population expansion detected by genomic data (Coelho et al. 2024). Niche models and genetic data (mitochondrial DNA and genomic SNPs) showed that Harengula sp. at the Brazilian coast and the oceanic island of FNO are separated by depth, while individuals from the continental island of ABR are connected to those at the Brazilian coast (Bennemann 2022; Coelho et al. 2024). ...
Article
Fishery statistics are mainly made by recording the popular fish names, which are later translated into scientific identification. However, these names often either refer to a species group and/or vary along their distribution, increasing identification uncertainty. Species that have cultural value for traditional communities are known as culturally important species (CIS). Herein, we assessed Fishers’ Ecological Knowledge to investigate small-silvery herrings (ginga) used as part of a traditional dish, “ginga com tapioca”, that is recognized as a cultural heritage in northeastern Brazil. Through 103 interviews conducted in six communities in three states, we determined that ginga, although a name known elsewhere, is only traded as such in the metropolitan area of Natal. In this region, ginga is caught with drift net and deemed profitable by fishers. We identified both over- and under-differentiation, with ginga recognized by fishers as five, and sold as three main species, namely Opisthonema oglinum, Harengula sp., and Lile piquitinga. The larger specimens of two of those species (O. oglinum and Harengula sp.) were also traded as sardines. We found that most individuals sold as ginga were juveniles, which might impact the recruitment of some fish species. Due to its unique cultural relevance to the local community of Natal, ginga could be considered a CIS, which could aid future management or conservation measures.
Article
Coastal marine ecosystems are highly productive and important for global fisheries. To mitigate overexploitation and to establish efficient conservation management plans for species of economic interest, it is necessary to identify the oceanographic barriers that condition divergence and gene flow between populations of those species, and that determine their relative amounts of genetic variability. Here, we present the first population genomic study of an Octopus species, Octopus insularis, which was described in 2008 and is distributed in coastal and oceanic island habitats in the tropical Atlantic Ocean. Using genomic data, we identify the South Equatorial Current as the main barrier to gene flow between southern and northern parts of the range, followed by discontinuities in the habitat associated with depth. We find that genetic diversity of insular populations significantly decreases after colonization from the continental shelf, also reflecting low habitat availability. Using demographic modelling, we find signatures of a stronger population expansion for coastal relative to insular populations, consistent with estimated increases in habitat availability since the Last Glacial Maximum. The direction of gene flow is coincident with unidirectional currents and bidirectional eddies between otherwise isolated populations. Together, our results show that oceanic currents and habitat breaks are determinant in the diversification of coastal marine species where adults have a sedentary behavior but paralarvae are dispersed passively, shaping standing genetic variability within populations. Lower genetic diversity within insular populations implies that these are particularly vulnerable to current human exploitation and selective pressures, calling for the revision of their protection status.
Article
The level of habitat availability influences genetic divergence among populations and the genetic diversity within populations. In the marine environment, near-shore species are among the most sensitive to habitat changes. Knowledge of how historical environmental change affected habitat availability and genetic variation can be applied to the development of proactive management strategies of exploited species. Here, we modeled the contemporary and historical distribution of Lutjanus jocu in Brazil. We describe patterns of genomic diversity to better understand how climatic cycles might correlate with the species demographic history and current genetic structure. We show that during the Last Glacial Maximum, there were ecological barriers that are absent today, possibly dividing the range of the species into three geographically separated areas of suitable habitat. Consistent with a historical reduction in habitat area, our analysis of demographic changes shows that L. jocu experienced a severe bottleneck followed by a population size expansion. We also found an absence of genetic structure and similar levels of genetic diversity throughout the sampled range of the species. Collectively, our results suggest that habitat availability changes have not obviously influenced contemporary levels of genetic divergence between populations. However, our demographic analyses suggest that the high sensitivity of this species to environmental change should be taken into consideration for management strategies. Furthermore, the general low levels of genetic structure and inference of high gene flow suggest that L. jocu likely constitutes a single stock in Brazilian waters and, therefore, requires coordinated legislation and management across its distribution.
Article
Despite the marine environment being typified by a lack of obvious barriers to dispersal, levels of genetic divergence can arise in marine organisms from historical changes in habitat availability, current oceanographic regimes and anthropogenic factors. Here we describe the genetic structure of the Gray Parrotfish, Sparisoma axillare, and identify environmental variables associated with patterns of genetic divergence throughout most of its distribution in Brazil. The heavily exploited Gray Parrotfish is endemic to Brazil, and there is a lack of data on population structure that is needed to support sustainable management. To address this shortfall we analyzed 5429 SNPs from individuals sampled in nine locations, ranging from tropical to subtropical reef systems and coastal to oceanic environments with varying levels of protection. We found low levels of genetic structure along the coast, including the oceanic island of Fernando de Noronha, and that a combination of water depth, ocean currents and geographic distance were the major drivers explaining genetic divergence. We identified a distinct genetic population around Trindade Island, 1000 km from the coast, highlighting the conservation significance of this population. Colonization of this oceanic site probably occurred during the Pleistocene periods of lower sea levels, allowing this shallow water-dependent species to use the seamount chain as stepping stones to Trindade. Our data further suggest that two protected areas, Costa dos Corais and Fernando de Noronha, likely play an important role as larval sources for much of the species distribution.
Article
Global warming is increasingly exacerbating biodiversity loss. Populations locally adapted to spatially heterogeneous environments may respond differentially to climate change, but this intraspecific variation has only recently been considered when modelling vulnerability under climate change. Here, we incorporate intraspecific variation in genomic offset and ecological niche modelling to estimate climate change-driven vulnerability in two bird species in the Sino-Himalayan Mountains. We found that the cold-tolerant populations show higher genomic offset but risk less challenge for niche suitability decline under future climate than the warm-tolerant populations. Based on a genome-niche index estimated by combining genomic offset and niche suitability change, we identified the populations with the least genome-niche interruption as potential donors for evolutionary rescue, i.e., the populations tolerant to climate change. We evaluated potential rescue routes via a landscape genetic analysis. Overall, we demonstrate that the integration of genomic offset, niche suitability modelling, and landscape connectivity can improve climate change-driven vulnerability assessments and facilitate effective conservation management.
Preprint
The European sardine (Sardina pilchardus, Walbaum 1792) is indisputably a commercially important species. Previous studies using uneven sampling or a limited number of markers have presented sometimes conflicting evidence for the genetic structure of S. pilchardus populations. Here we show that whole genome data from 108 individuals from 16 sampling areas across 5,000 km of the species’ distribution range (from the Eastern Mediterranean to the archipelago of Azores) supports at least three genetic clusters. One includes individuals from Azores and Madeira, with evidence of substructure separating these two archipelagos in the Atlantic. Another cluster broadly corresponds to the center of the distribution including the sampling sites around Iberia, separated by the Almeria-Oran front from the third cluster that includes all of the Mediterranean samples, except those from the Alboran Sea. Individuals from the Canary Islands appear as belonging to the same ancestral group as those from the Mediterranean. This suggests at least two important geographical barriers to gene flow, even though these do not seem complete, with many individuals from around Iberia and the Mediterranean showing some patterns compatible with admixture with other genetic clusters. Genomic regions corresponding to the top outliers of genetic differentiation are located in areas of low recombination, indicating that genetic architecture also has a role in shaping population structure. These regions include genes related to otolith formation, a calcium carbonate structure in the inner ear previously used to distinguish S. pilchardus populations. Our results provide a baseline for further characterization of physical and genetic barriers that divide European sardine populations, and information for transnational stock management of this highly exploited species towards sustainable fisheries.
Article
Genetic diversity is one of the three main levels of biodiversity recognised in the Convention on Biological Diversity (CBD). Fundamental for species adaptation to environmental change, genetic diversity is nonetheless under‐reported within global and national indicators. When it is reported, the focus is often narrow and confined to domesticated or other commercial species. Several approaches have recently been developed to address this shortfall in reporting on genetic diversity of wild species. While multiplicity of approaches is helpful in any development process, it can also lead to confusion among policy makers and heighten a perception that conservation genetics is too abstract to be of use to organisations and governments. As the developers of five of the different approaches, we have come together to explain how various approaches relate to each other and propose a scorecard, as a unifying reporting mechanism for genetic diversity. Policy implications. We believe the proposed combined approach captures the strengths of its components and is practical for all nations and subnational governments. It is scalable and can be used to evaluate species conservation projects as well as genetic conservation projects.
Article
Gene flow shapes spatial genetic structure and the potential for local adaptation. Among marine animals with non‐migratory adults, the presence or absence of a pelagic larval stage is thought to be a key determinant in shaping gene flow and the genetic structure of populations. In addition, the spatial distribution of suitable habitats is expected to influence the distribution of biological populations and their connectivity patterns. We used whole genome sequencing to study demographic history and reduced representation (ddRAD) sequencing data to analyze spatial genetic structure in broadnosed pipefish (Syngnathus typhle). Its main habitat is eelgrass beds, which are patchily distributed along the study area in southern Norway. Demographic connectivity among populations was inferred from long‐term (~30 year) population counts that uncovered a rapid decline in spatial correlations in abundance with distance as short as ~2 km. These findings were contrasted with data for two other fish species that have a pelagic larval stage (corkwing wrasse, Symphodus melops; black goby, Gobius niger). For these latter species, we found wider spatial scales of connectivity and weaker genetic isolation‐by‐distance patterns, except where both species experienced a strong barrier to gene flow, seemingly due to lack of suitable habitat. Our findings verify expectations that a fragmented habitat and absence of a pelagic larval stage promote genetic structure, while presence of a pelagic larval stage increases demographic connectivity and gene flow, except perhaps over extensive habitat gaps.
Article
Newly discovered species are often threatened with extinction but in many cases have received limited conservation effort. To guide future conservation, it is important to determine the extinction risk of newly described species. Here, we test how time since formal description of a species is linked to its threat status to obtain a better insight into the possible threat status of newly described species and as yet undescribed species. We compiled IUCN Red List data for 53,808 species from five vertebrate groups described since 1758. Extinction risk for more recently described species has increased significantly over time; the proportion of threatened species among newly described species has increased from 11.9% for species described between 1758 and 1767 to 30.0% for those described between 2011 and 2020. Based on projections from our analysis, this could further increase to 47.1% by 2050. The pattern is consistent across vertebrate taxonomic groups and biomes. Current species extinction rates estimated from data of all known species are therefore highly likely to be underestimated. Intensive fieldwork to boost discovery of new species and immediate conservation action for newly described species, especially in tropical areas, is urgently required.
Article
Globally, 15,521 animal species are listed as threatened by the International Union for the Conservation of Nature, and of these less than 3% have genomic resources that can inform conservation management. To combat this, global genome initiatives are developing genomic resources, yet production of a reference genome alone does not conserve a species. The reference genome allows us to develop a suite of tools to understand both genome-wide and functional diversity within and between species. Conservation practitioners can use these tools to inform their decision-making. But, at present there is an implementation gap between the release of genome information and the use of genomic data in applied conservation by conservation practitioners. In May 2020, we launched the Threatened Species Initiative and brought a consortium of genome biologists, population biologists, bioinformaticians, population geneticists, and ecologists together with conservation agencies across Australia, including government, zoos, and nongovernment organizations. Our objective is to create a foundation of genomic data to advance our understanding of key Australian threatened species, and ultimately empower conservation practitioners to access and apply genomic data to their decision-making processes through a web-based portal. Currently, we are developing genomic resources for 61 threatened species from a range of taxa, across Australia, with more than 130 collaborators from government, academia, and conservation organizations. Developed in direct consultation with government threatened-species managers and other conservation practitioners, herein we present our framework for meeting their needs and our systematic approach to integrating genomics into threatened species recovery.