
The Design of Sampling Transects for Characterizing Water Quality in Estuaries

Abstract

The high spatial variability of estuaries poses a challenge for characterizing estuarine water quality. This problem was examined by conducting monthly high-resolution transects for several water quality variables (chlorophyll a, suspended particulate matter and salinity) in San Francisco Bay (California, U.S.A.). Using these data, six different ways of choosing station locations along a transect, in order to estimate mean conditions, were compared. In addition, 11 approaches to estimating the variance of the transect mean when stations are equally spaced were compared, and the relationship between variance of the estimated transect mean and number of stations was determined. The results provide guidelines for sampling along the axis of an estuary: (1) choose as many equally-spaced stations as practical; (2) estimate the variance of the mean $\bar{y}$ by $\operatorname{var}(\bar{y}) = \frac{1}{10n^2}\sum_{j=2}^{n}(y_j - y_{j-1})^2$, where $y_1, \ldots, y_n$ are the measurements at the $n$ stations; and (3) attain the desired precision by adjusting the number of stations according to $\operatorname{var}(\bar{y}) \propto 1/n^2$. The inverse power of 2 in the last step is a consequence of the underlying spatial correlation structure in San Francisco Bay; more studies of spatial structure at other estuaries are needed to determine the generality of this relationship.
Estuarine, Coastal and Shelf Science (1997) 45, 285–302
A. D. Jassby (a), B. E. Cole (b) and J. E. Cloern (b)
(a) Division of Environmental Studies, University of California, Davis, CA 95616, U.S.A.
(b) U.S. Geological Survey, Water Resources Division, 345 Middlefield Road, Menlo Park, CA 94025, U.S.A.
Received 8 April 1996 and accepted in revised form 30 September 1996
© 1997 Academic Press Limited
Keywords: sampling design; spatial structure; water quality; chlorophyll distribution; suspended particulate matter;
salinity; geostatistics; San Francisco Bay
Introduction
Estuaries present unusual difficulties in characterizing
the spatial distributions of the properties that collectively
define water quality: nutrients, dissolved gases,
trace contaminants, suspended sediments, salinity and
plankton populations. Large-scale patterns of spatial
variability include the longitudinal salinity gradient
along the continuum between the estuarine drainage
basin and the coastal ocean. Superimposed onto this
trend are sources of smaller-scale spatial variability,
including distributed point sources; features of water
circulation such as fronts, eddies or convergences that
create localized turbidity maxima (e.g. Peterson et al.,
1975); patchiness resulting from irregularities in
bottom topography (e.g. Powell et al., 1986); and
biologically-mediated spatial differences in processes
such as primary production and biogeochemical
transformations of reactive constituents (e.g. Jassby
et al., 1993; Cloern, 1996). Many of these sources
of spatial variability are unique to or amplified for
estuaries.
At the same time, by virtue of the large human
populations often associated with estuaries, anthropo-
genic impacts on water quality are strong and the need
for characterizing ambient conditions and temporal
trends in these conditions is correspondingly urgent.
The variability inherent in estuaries implies that a
greater sampling effort is often necessary to describe
water quality adequately, compared to other aquatic
systems. The question of how to sample the spatial
extent of estuaries most efficiently arises naturally,
whether the objective is to describe current conditions
or temporal trends in these conditions. Historically,
most station configurations in estuaries, and arguably
in most other aquatic ecosystems as well, have been
chosen on the basis of surface physiographic features
or by a cursory knowledge of spatial heterogeneity.
These configurations may very well turn out to be
near-optimal in some useful sense, but there is no way
to tell without a more objective approach.
This paper considers the general question: how
should samples be taken in an estuary or subembay-
ment so that regional properties (e.g. mean concen-
tration or mean population abundance) can be
compared from one time period to another or from
one subregion to another? Although this is perhaps the
simplest form of trend detection (the underlying goal
of most monitoring and assessment programmes), it is
a significant issue for several reasons. First, for certain
important water quality variables, the regional
(estuary-wide) or subregional (subembayment) mean
provides an informative scalar index of ambient
conditions. We want to know, for example, if the
trophic state of an estuary (as indexed by chloro-
phyll a) is exhibiting a positive temporal trend, or if a
trace contaminant is higher in one subembayment
than another. Second, use of the mean enables one to
connect empirical observations in the estuary to a
large body of results from sampling theory and geo-
statistics. This connection supports the aim of provid-
ing a conceptual framework for both understanding
the observations and generalizing them to other water
bodies. Finally, the regional and subregional means
provide an important way of communicating estuarine
conditions to the public and environmental managers,
precisely because the mean is so simple and widely
understood.
The answer to the question posed here depends on
the (usually unknown) spatial structure of the water
quality measurements of interest. As spatial structure
differs among the different components of water quality
(Powell et al., 1989), the present work followed the
lead of previous estuarine studies (Madden & Day,
1992; Childers et al., 1994) in choosing three separate
but complementary water quality indicators: salinity,
suspended particulate matter (SPM) and chlorophyll
a. Salinity is a conservative tracer of mixing along the
river–ocean continuum and therefore a surrogate for
longitudinal processes. Suspended particulate matter,
strongly affected by rapid exchange between the water
column and bottom sediments, is a surrogate for
vertical processes. Chlorophyll a, a measure of phyto-
plankton biomass and a representative non-
conservative constituent that quickly responds to
spatially-variable sources and sinks, often reflects
lateral processes (Huzzey et al., 1990). As the spatial
structures of these different components change
with time, the measurements were repeated at
monthly intervals. The sampling programme was
conducted over an annual period in San Francisco
Bay, a complex estuarine system that exhibits all
modes of spatial-temporal variability expected in
shallow coastal ecosystems influenced by tidal,
wind, riverine and anthropogenic effects (Cloern &
Nichols, 1985).
Site description
The San Francisco Estuary or ‘Bay-Delta’ consists of
a landward, tidal freshwater region known as the Delta
and a seaward region known as San Francisco Bay
(Figure 1). The Delta is a highly dissected region of
channels and islands where the Sacramento, San
Joaquin and other rivers coalesce and narrow as they
flow westward. The outflow from the Delta passes
through a narrow notch in the Coast Range into a
series of subembayments, and ultimately through a
narrow deep trough—the Golden Gate—into the
Pacific Ocean. Four major subembayments are
usually recognized: South, Central, San Pablo and
Suisun Bays. Together they constitute San Francisco
Bay, the largest coastal embayment on the Pacific
coast of the United States. Ninety percent of the
freshwater input into the Bay flows through the Delta
from regional drainage; the remainder is supplied by
local tributaries. The drainage basin of the estuary
encompasses 40% of California’s land area. River
inputs are highly seasonal, consisting of rainfall during
autumn and winter, and snowmelt during spring and
early summer. In addition to this dependence on
climate, flow is affected by a series of upstream
reservoirs that are managed for agriculture, power,
flood control and repulsion of salinity intrusions. A
large portion of the flow reaching the Delta is diverted,
mostly for agricultural purposes, before it can reach
the Bay.
Water quality problems in the Bay-Delta are multi-
ple, complex and linked in various ways. A major
underlying issue is management of freshwater inflow,
which affects estuarine population abundances both
directly, through transport, and indirectly, through
effects on salinity and other variables (Jassby et al.,
1995). Contaminants include sediments and metals
introduced from mining operations, domestic sewage,
persistent and toxic trace substances from industrial
discharge and urban runoff, and biocides in agricultural
drainage (Davis et al., 1991). Occasional high
chlorophyll concentrations and threats of harmful
algal blooms are also of concern (Jassby et al., 1994).
Several large monitoring efforts are in place with
the goals of assessing existing water quality, determin-
ing trends in trace contaminants and population
abundances, and exploring the underlying causal
processes. The size of these programmes, the social
importance of the water quality problems and
the extreme variability of the estuary all demand a
closer and more objective examination of the sampling
effort.
General approach
This paper considers only the longitudinal variability
along the central channel that connects the
seaward and landward domains of the San Francisco
Bay system. By using variables that together reflect all
three spatial dimensions, however, these observations
in the estuarine channel encompass processes occur-
ring upstream, in adjacent marshes and lateral shoals,
due to point source discharges, and within the local
water column proper. The specific goal is to choose a
minimal number of sampling locations along the
transect from which one can estimate a scalar index of
conditions (in this case, the mean) with sufficient
precision (i.e. with sufficiently low variance) that
useful comparisons can be made among different time
periods.
Deciding on a station array requires consideration
of three linked issues, addressed here in sequence:
(1) What kind of sampling design should be
adopted (e.g. random, systematic or stratified)?
(2) How can the precision (variance) of the transect
mean be estimated?
(3) For a prescribed level of precision, how many
samples (stations along a transect) are required?
Answers to these questions require knowledge
about the underlying distribution of the parent popu-
lation of all possible samples. The authors’ approach
is empirical and is not driven by theoretical assump-
tions about the underlying distribution. Although it is
impossible to sample the entire population of water
quality measurements in an estuary, a surrogate
parent population can be acquired by collecting a
large number (thousands) of samples as closely-
spaced sensor measurements made while a ship pro-
files an axial transect. A modified version of the
integrated software-instrument package MIDAS
(Multiple Interface Data Acquisition System; Walser
et al., 1992; see also Madden & Day, 1992) was used
to collect and store measurements from flow-through
water quality sensors and a Global Positioning System
(GPS) navigation system. Subsampling from the high-
resolution MIDAS transect data was then used to
address the issues listed above.
Figure 1. San Francisco Bay. The MIDAS cruise track is shown as a solid line along the axis of the estuary. Labelled stations: Coyote Creek, Dumbarton Bridge, San Bruno Shoal, Oakland Bay Bridge, Angel I., Pt. San Pablo, Mare I., Martinez, Avon, Chipps I., Sacramento R. and Rio Vista.
We begin with a consideration of the spatial sam-
pling design (Issue #1). In simple random sampling,
each station is selected randomly and independently
in space along the transect line. Although simple in
concept and obviously unbiased, random sampling
has an important drawback in situations where
the spatial correlation is high; if two stations are
randomly chosen too close together, then they will
have similar values and one of them is, to a certain
extent, wasted.
Systematic sampling, i.e. equidistant spacing of
stations along the transect line, avoids this problem
and therefore yields a more precise estimate of the
spatial mean in many situations (Murthy & Rao,
1988). A more precise estimate of the spatial mean
implies, in turn, that temporal trends of a given
size can be detected in fewer years or, alternately,
that smaller temporal trends can be detected in any
given time interval. Systematic spatial designs are
also more convenient to implement. Systematic
samples suffer, however, from a serious drawback in
that unbiased estimates of precision are unavailable,
and approximations based on assumptions about
the nature of the underlying population must be
utilized (Bellhouse, 1988). The precision cannot
be estimated based on the sampling design alone
because a systematic sample is essentially a random
sample of size one; once the first station is selected,
the locations of the others are completely specified as
well. Furthermore, systematic sampling does not
always yield the most precise estimates; the relative
performance of different designs depends on the
structure of the underlying population (Cochran,
1977).
Stratified sampling refers to, in this case, dividing a
relatively heterogeneous estuary into more homo-
geneous subdomains and then carrying out either a
random or systematic programme of sampling inde-
pendently within each subdomain (stratum). Insofar
as the within-subdomain variability is reduced relative
to the between-subdomain variability, stratification
can lead to a more precise estimate of the mean
than either simple random or systematic sampling
(Cochran, 1977). The strong spatial correlation char-
acteristic of estuaries (Powell et al., 1986) suggests
that stratification of sampling into spatially contiguous
subregions might be appropriate. In order to choose
the strata in a consistent way, a novel method is
employed here: the machinery of tree-based modelling
(Clark & Pregibon, 1992).
The MIDAS transect data enable one to evaluate
the relative performance of these different sampling
designs. In particular, simple random sampling, sys-
tematic sampling and their stratified counterparts,
stratified random and stratified systematic sampling,
are compared.
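The contrast between simple random and systematic sampling described above can be illustrated with a minimal Python sketch. It assumes a synthetic transect consisting of a pure linear gradient (not the MIDAS data) and a population size that is an exact multiple of the sample size; the function names are hypothetical.

```python
import numpy as np

def srs_variance(pop, n):
    """Variance of the mean under simple random sampling
    (finite population correction ignored)."""
    return float(np.var(pop, ddof=1) / n)

def systematic_variance(pop, n):
    """Exact variance of the mean over all m = N / n equally spaced samples."""
    pop = np.asarray(pop, dtype=float)
    if len(pop) % n != 0:
        raise ValueError("this sketch assumes N is a multiple of n")
    m = len(pop) // n                      # spacing between stations, in records
    sample_means = np.array([pop[k::m].mean() for k in range(m)])
    return float(np.mean((sample_means - pop.mean()) ** 2))

# Synthetic transect: a linear gradient at 1500 points (illustrative only).
transect = np.linspace(0.0, 30.0, 1500)
v_ran = srs_variance(transect, 20)
v_sys = systematic_variance(transect, 20)
percent_decrease = 100 * (1 - v_sys / v_ran)
```

For a strongly trended population such as this one, equally spaced stations give a much smaller variance than randomly placed ones; a noisy or periodic population could behave differently.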
As suggested above, if systematic sampling turns
out to give the most precise estimate of the underlying
mean, one must decide how the variance of the
estimate can best be calculated from the low-
resolution station arrays commonly encountered in
practice (Issue #2). Many different estimators have
been proposed, most of which are based on an as-
sumed model of population behaviour and so are
appropriate only when the model truly represents the
population. Whether or not a single tractable model
can be applied to estuarine data in general is not
known. At different times and locations, transects
appear to be dominated by noise, linear or higher-
order trends, persistence (spatial autocorrelation) or,
most often, a complex combination of these basic
patterns. When only low-resolution samples are avail-
able, there is little hope of identifying a suitable
model. The authors’ intention is, therefore, to assess
the robustness of the different estimators for use when
high-resolution data are not accessible, given that the
appropriate model may be temporally sensitive. Sub-
sets of these methods have been compared for demo-
graphic (Wolter, 1984) and stereological (Mattfeldt,
1989) data but their relative performance cannot be
extrapolated to natural ecosystems, which can exhibit
quite different population structures. Again, the MIDAS
data enable a direct assessment of the different
variance estimators by providing high-resolution de-
scriptions of the underlying populations in an estuary.
Given a sampling design and a way to estimate the
resulting precision, how does one choose an appropri-
ate number of stations (Issue #3)? The use of some
criterion of performance or objective function is
an essential step in completing this phase of design,
but the criterion depends on the overall monitoring
objective (e.g. to describe ambient conditions, assess
compliance with standards, detect trends or determine
causal mechanisms) and the costs and uncertainties
associated with different designs. Rather than linking
this analysis to a specific objective function, it was
asked how the ability to reproduce the underlying
data, as measured by the variance of the estimated
mean, depends on the number of stations. This rela-
tion is simple enough to calculate with a high-
resolution data set in hand, but of specific interest is
what can be said when the only data available are from
sparser transects (i.e. the usual kind of data collected
in monitoring programmes). Therefore, one should
look for generalities in the relation that may be char-
acteristic of the underlying spatial structure in an
estuary, and can be used to guide sample size when
only low-resolution data are available.
Methods
Data collection
Ten cruises were conducted from November 1994
through September 1995 at approximately neap tide,
collecting data along a 150 km transect in the main
channel of San Francisco Bay from the landward end
of South Bay through Central Bay to Rio Vista on the
Sacramento River (Figure 1). The timing of the
cruises captured the broad range of freshwater flow
conditions experienced over the year (Figure 2), as
well as major events such as the spring bloom in South
Bay and summer estuarine turbidity maximum in the
northern Bay. Peak flows during the 1995 water year
were among the highest of the last 40 years; the range
of flow conditions encountered during these transects
is therefore unusually large compared to an ‘average
water year’ (Figure 2).
Fourteen hydrographic, meteorological and naviga-
tional variables were measured using the Multiple
Interface Data Acquisition System (MIDAS) on
board the RV Polaris. The ship’s location was
determined with a Trimble NavTrac XL Global
Positioning System. Water for the hydrographic
parameters was pumped from a through-hull fitting
located at the bow of the ship at a depth of approxi-
mately 2 m. The pumped water sample was directed
through an array of sensors for continuous analysis.
The hydrographic variables were measures of salinity,
temperature, chlorophyll fluorescence and turbidity.
Salinity was derived from measures of conductivity
and temperature made using a Sea-Bird Electronics
SBE-21 thermosalinograph. Temperature was
measured with a Sea-Bird Electronics SBE-3
temperature probe located at the bow of the ship in
the pumped sample stream, chlorophyll fluorescence
with a Turner Designs Model 10 flow-through fluor-
ometer, and turbidity with a Turner Designs Model
10 flow-through nephelometer.
Discrete water samples for chlorophyll a and SPM
were collected at 10–18 selected stations during the
ongoing recording of fluorescence and nephelometry
data signals (Edmunds et al., 1995). Samples for
chlorophyll were filtered onto a Gelman A/E glass
fibre filter and immediately frozen. The air-dried
filter was ground in 90% acetone within 1 week of
collection. After extraction for 18–24 h at −10 °C,
absorbances of the extracts were determined on a
Hewlett-Packard 8452A diode array spectrophotometer.
Chlorophyll a values were calculated using
Lorenzen’s (1967) equations. Samples for SPM were
filtered onto preweighed, 0·4-µm pore size, polycarbonate
membrane filters and then air dried. The
filters were reweighed and the concentration of SPM
calculated after a correction was made for salt on the
filters. The fluorescence and nephelometry signals
were calibrated separately for each cruise using the
discrete values collected during that cruise. For the
first three cruises and the July cruise, the authors were
unable to obtain significant regressions of the MIDAS
fluorescence signal on discrete chlorophyll values.
Fluorescence data were not used in the analyses.
The MIDAS data acquisition system records the
data at a sampling interval of approximately 6 s. The
ship’s speed over ground varied with the tides but was
generally about 5 m s⁻¹ (10 knots), which resulted in
a spatial sampling interval of approximately 30 m.
During each transect from South Bay to Rio Vista,
approximately 5000 measures of each parameter were
collected. The average distance between successive
data points in the raw database was approximately
30 m, but the actual distances were variable because
of changing ship speed. The implementation of tree-
based regression used here is sensitive to the data
density (data points per km of transect), and changes
in this density over the course of the transect can bias
the analysis. In order to equalize the data density
over the transect, a subset of the data was formed
by marking the transect at 100-m intervals measured
along the transect from the starting point, and
selecting the single data record closest to each marker.
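The thinning step described above can be sketched in Python as follows. This is an illustration rather than the authors' code: it assumes along-transect distances in metres, sorted in ascending order and measured from the starting point, and the function name `thin_to_markers` is hypothetical.

```python
import numpy as np

def thin_to_markers(distance_m, values, spacing_m=100.0):
    """Keep the single record closest to each marker placed every
    spacing_m metres along the transect (distance_m sorted ascending)."""
    distance_m = np.asarray(distance_m, dtype=float)
    values = np.asarray(values)
    markers = np.arange(0.0, distance_m[-1] + 1e-9, spacing_m)
    # Index of the first record at or beyond each marker, kept inside the array.
    right = np.searchsorted(distance_m, markers).clip(1, len(distance_m) - 1)
    left = right - 1
    # Choose whichever neighbour lies closer to the marker (ties go left).
    use_left = (markers - distance_m[left]) <= (distance_m[right] - markers)
    nearest = np.where(use_left, left, right)
    return distance_m[nearest], values[nearest]

# Irregularly spaced records, as produced by a changing ship speed.
d = [0.0, 30.0, 62.0, 95.0, 130.0, 160.0, 205.0]
v = [0, 10, 20, 30, 40, 50, 60]
d_thin, v_thin = thin_to_markers(d, v)
```

The result has one record per marker, so the data density per kilometre is roughly constant regardless of how the ship speed varied.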
Horizontal stratification of the estuary
Stratification calculations. To compare the different
sampling designs, the estuary must first be stratified.
Tree-based modelling or regression operates by successively
splitting a dataset (transect) into increasingly
homogeneous subsets or strata until some stopping
rule comes into effect. In this case, each split is chosen
to maximize the difference between the variance of the
‘parent’ stratum and the sum of the variances of the
two ‘children’ strata. As different transects and variables
result in different splits, a further step is to
extract from the combined collection of splits those
regions where they tend to cluster and collectively
support the placement of a boundary. Tree-based
modelling therefore serves more as a guide to the
location of strata boundaries than as an exact
specification of these boundaries.
Figure 2. Freshwater outflow (net Delta outflow, m³ s⁻¹) from the Sacramento-San Joaquin Delta into San Francisco Bay, by month of water year 1995. Solid line, water year 1995 (1 October 1994–30 September 1995); dotted line, average for the water years 1956–95; symbols, cruise dates.
Trees were ‘grown’ using the algorithms of
S-PLUS (Clark & Pregibon, 1992; Statistical
Sciences, 1994). Each transect was successively split
along the transect path in order to maximize the
quantity $\Delta D$:
$$\Delta D = \sum_{i \in L \cup R} (y_i - \mu)^2 - \left[ \sum_{i \in L} (y_i - \mu_L)^2 + \sum_{i \in R} (y_i - \mu_R)^2 \right] \qquad (1)$$
where $y_i$ is the ith observation in the parent stratum; $\mu$
is the mean value of the parent stratum; L and R are
the sets of indices defining the left-hand and right-hand
children strata, respectively; and $\mu_L$ and $\mu_R$ are
the mean values in the two respective children strata.
The splitting process continued until none of the
resulting strata could account for more than 10% of
the original variance. Strata were then compared
among variables for the same transect and among
transects for the same variable. The value of 10% was
chosen because smaller values resulted in strata that
were probably dependent on tidal stage. For example,
the splits computed for two successive transects on 18
and 19 January 1995 (beginning at the same time of
day but at opposite ends of the Bay) essentially
coincided if 10% was used as a cutoff; on the other
hand, splits that resulted in strata accounting for less
than 10% of the original variance did not coincide.
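The splitting procedure can be sketched in Python for a one-dimensional transect. This is not the S-PLUS implementation: it assumes that a stratum "accounts for" variance through its sum of squared deviations about its own mean, and that splitting stops once no stratum holds more than 10% of the sum of squares of the whole transect. The function names are hypothetical.

```python
import numpy as np

def best_split(y):
    """Return (index, delta_D) for the split y[:index] | y[index:]
    that maximizes the reduction in sum of squares."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    total = np.sum((y - y.mean()) ** 2)           # parent sum of squares
    csum, csum2 = np.cumsum(y), np.cumsum(y ** 2)
    k = np.arange(1, n)                           # size of the left child
    left = csum2[:-1] - csum[:-1] ** 2 / k
    right = (csum2[-1] - csum2[:-1]) - (csum[-1] - csum[:-1]) ** 2 / (n - k)
    delta = total - (left + right)
    best = int(np.argmax(delta))
    return best + 1, float(delta[best])

def stratify(y, threshold=0.10):
    """Split recursively until no stratum exceeds `threshold` of the
    original sum of squares; returns sorted boundary indices."""
    y = np.asarray(y, dtype=float)
    total = np.sum((y - y.mean()) ** 2)
    boundaries, stack = [], [(0, len(y))]
    while stack:
        lo, hi = stack.pop()
        seg = y[lo:hi]
        if hi - lo < 2 or np.sum((seg - seg.mean()) ** 2) <= threshold * total:
            continue
        idx, _ = best_split(seg)
        boundaries.append(lo + idx)
        stack += [(lo, lo + idx), (lo + idx, hi)]
    return sorted(boundaries)

# A transect with one sharp front halfway along (illustrative only).
step = np.concatenate([np.zeros(50), np.full(50, 10.0)])
split_index, gain = best_split(step)
strata = stratify(step)
```

On real transects the boundaries found this way differ among variables and cruises, which is why the text treats them as a guide rather than an exact specification.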
Sample allocation among strata. In order to test the
efficacy of a stratification scheme, one must also
decide how to allocate samples among the various
strata. This can be done in several different ways. The
simplest method is proportional allocation, in which the
number of samples in any stratum is directly propor-
tional to the stratum size. Stratum size in the case of
a MIDAS transect is simply the length along the
transect between stratum boundaries.
In contrast to proportional allocation, the most
efficient or optimal allocation of stations among strata
takes into account stratum variability and sampling
costs, in addition to stratum size. For any stratum i of
a transect, the number of stations that minimizes the
variance of the estimated mean for a given total cost is:
$$n_i = n \, \frac{W_i S_i / \sqrt{c_i}}{\sum_{k=1}^{h} W_k S_k / \sqrt{c_k}} \qquad (2)$$
where n is the total number of stations, h is the number of
strata, $W_i$ is the stratum size, $S_i$ is the stratum standard
deviation and $c_i$ is the cost per sample in that stratum
(Cochran, 1977). If the cost of sampling a station is
constant throughout the estuary, then Equation 2
implies that the density of stations within a stratum is
simply proportional to the standard deviation of the
transect variable.
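Proportional and optimal allocation can be compared with a short Python sketch. It assumes the textbook form of optimal allocation from Cochran (1977), with stations proportional to $W_i S_i / \sqrt{c_i}$; the stratum sizes and standard deviations below are invented for illustration, and the function names are hypothetical.

```python
import numpy as np

def proportional_allocation(n, W):
    """Stations per stratum in proportion to stratum size W."""
    W = np.asarray(W, dtype=float)
    return n * W / W.sum()

def optimal_allocation(n, W, S, c=None):
    """Stations per stratum in proportion to W * S / sqrt(c).
    With c omitted, the cost per sample is taken as constant."""
    W, S = np.asarray(W, dtype=float), np.asarray(S, dtype=float)
    c = np.ones_like(W) if c is None else np.asarray(c, dtype=float)
    weights = W * S / np.sqrt(c)
    return n * weights / weights.sum()

W = np.array([0.5, 0.3, 0.2])   # relative stratum lengths (made up)
S = np.array([1.0, 2.0, 4.0])   # stratum standard deviations (made up)
n_prop = proportional_allocation(20, W)
n_opt = optimal_allocation(20, W, S)
density = n_opt / W             # stations per unit length in each stratum
```

The allocations are fractional and would need rounding to whole stations in practice. With constant cost, `density` is proportional to `S`, matching the statement above.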
Because optimal allocations may differ among
variables (owing to different values of
$S_i$), the efficacy of a compromise allocation among
strata was also examined. The compromise was
effected by minimizing the average over all three
variables of the proportional increase in variance over
optimal allocation (Chatterjee, 1967). It can be shown
that the resulting sample sizes are:
$$n_i = n \, \frac{\sqrt{\sum_j n_{ij}^2}}{\sum_{k=1}^{h} \sqrt{\sum_j n_{kj}^2}} \qquad (3)$$
where $n_{ij}$ is the optimum sample size in stratum i for
variable j.
Sampling design
The variance of the mean was calculated for several
different practical sampling strategies, and compared
to simple random sampling. The authors’ approach
was to regard the MIDAS transect data as the underlying
population. The variance of the simple random
sampling estimate is then given by:
$$\operatorname{var}(\bar{y}) = \frac{S^2}{n}(1 - f) \qquad (4)$$
where S is the population standard deviation, n
is the sample size, and f = n/N is the sampling fraction
with N the population size (Cochran, 1977). In
practice, n will be much less than 75 and N is
usually around 1500, so the true sampling fraction is
much less than 5%. As a rule of thumb, the finite
population correction (1 − f) can be ignored when
f < 0·05 (Barnett, 1991), and it will be ignored in what
follows.
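For stratified random sampling, the variance of the
estimated mean depends on the sample allocation
strategy. In the case of proportional allocation, the
variance is:
$$\operatorname{var}(\bar{y}) = \frac{1}{n} \sum_{i=1}^{h} W_i S_i^2 \qquad (5)$$
where $W_i$ is the relative size and $S_i$ is the standard
deviation of the ith stratum, and there are h strata. For
optimal allocation, the corresponding calculation is:
$$\operatorname{var}(\bar{y}) = \frac{1}{n} \left( \sum_{i=1}^{h} W_i S_i \right)^2 \qquad (6)$$
The variance for a compromise allocation can be
determined from the general result for stratified
samples, which can be expressed in the form:
$$\operatorname{var}(\bar{y}) = \sum_{j=1}^{h} \frac{W_j^2 S_j^2}{n_j} = \frac{1}{n} \sum_{j=1}^{h} \frac{W_j^2 S_j^2}{\alpha_j} \qquad (7)$$
where $n_j = n\alpha_j$.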
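The stratified-sampling variances can be checked numerically with a small Python sketch. It assumes the standard results from Cochran (1977) with the finite population correction ignored and equal sampling costs; the two-stratum example values are invented, and the function names are hypothetical.

```python
import numpy as np

def var_proportional(W, S, n):
    """Variance of the stratified mean under proportional allocation."""
    W, S = np.asarray(W, dtype=float), np.asarray(S, dtype=float)
    return float(np.sum(W * S ** 2) / n)

def var_optimal(W, S, n):
    """Variance under optimal allocation with equal costs."""
    W, S = np.asarray(W, dtype=float), np.asarray(S, dtype=float)
    return float(np.sum(W * S) ** 2 / n)

def var_general(W, S, n_j):
    """General stratified result for arbitrary stratum sample sizes n_j."""
    W, S, n_j = (np.asarray(a, dtype=float) for a in (W, S, n_j))
    return float(np.sum(W ** 2 * S ** 2 / n_j))

def percent_decrease(V, V_ran):
    """Percent decrease in variance relative to simple random sampling."""
    return 100.0 * (1.0 - V / V_ran)

W = np.array([0.5, 0.5])   # relative stratum sizes (made up)
S = np.array([1.0, 3.0])   # stratum standard deviations (made up)
n = 10
v_prop = var_proportional(W, S, n)
v_opt = var_optimal(W, S, n)
# The general formula reproduces both special cases when given their n_j.
v_prop_check = var_general(W, S, n * W)
v_opt_check = var_general(W, S, n * W * S / np.sum(W * S))
```

Because every expression scales as 1/n, the ratio of any of these variances to the simple random sampling variance does not depend on n.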
The traditional station configuration in San
Francisco Bay has been an approximately systematic
one, i.e. with equal distances between adjacent
stations. Assuming N is a multiple of n, there are
m = N/n possible systematic samples. When N was
not a multiple of n, the method of circular systematic
sampling due to Lahiri (Bellhouse, 1988) was used.
The variance of the systematic estimate is then simply:
$$\operatorname{var}(\bar{y}) = \frac{1}{m} \sum_{k=1}^{m} (\bar{Y}_k - \bar{Y})^2 \qquad (8)$$
where $\bar{Y}_k$ is the mean of the kth potential systematic
sample and $\bar{Y}$ is the transect mean.
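When N is not a multiple of n, the enumeration over all possible samples can be sketched with circular systematic sampling. The version below is an assumption about the details: it takes the sampling interval as the integer nearest to N/n, treats the transect as a closed loop, and enumerates all N possible starting records. The function name is hypothetical.

```python
import numpy as np

def circular_systematic_variance(pop, n):
    """Variance of the sample mean over all N circular systematic samples.
    Returns (variance, array of the N sample means)."""
    pop = np.asarray(pop, dtype=float)
    N = len(pop)
    k = int(N / n + 0.5)                    # interval: integer nearest to N / n
    offsets = k * np.arange(n)              # positions relative to the start
    starts = np.arange(N)[:, None]          # every record is a possible start
    sample_means = pop[(starts + offsets) % N].mean(axis=1)
    variance = float(np.mean((sample_means - pop.mean()) ** 2))
    return variance, sample_means

values = np.arange(10, dtype=float)         # N = 10 is not a multiple of n = 3
variance, sample_means = circular_systematic_variance(values, 3)
```

Each record appears in the same number of samples, so the average of the sample means equals the population mean, which is what makes the design unbiased.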
Finally, the performance of stratified sampling with
proportional allocation was investigated again, but
with systematic rather than simple random sampling
within strata. Variances within strata are then given by
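Equation 8, but the variance of the estimated overall
transect mean is:
$$\operatorname{var}(\bar{y}) = \sum_{i=1}^{h} W_i^2 \operatorname{var}(\bar{y}_i) \qquad (9)$$
where $\operatorname{var}(\bar{y}_i)$ is the variance for the estimated mean of
stratum i.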
Each sampling strategy was compared to simple
random sampling by calculating the percent decrease
in variance $100(1 - V/V_{ran})$, where $V_{ran}$ is the variance
given by Equation 4 and V is the variance due to
one of the other strategies. For stratified random
sampling, n drops out and the comparisons are
independent of sample size. For systematic and strati-
fied systematic, however, the ratio depends on n
and so the results for three sample sizes (10, 20 and
40), typical of the range for transects in estuarine
research and monitoring programmes and specifically
covering the range used in San Francisco Bay, were
examined.
Variance estimators
A variety of estimators that have been proposed for
systematic sampling and that are simple to compute
were examined (Table 1). The first estimator SRS is
simply the variance of the simple random sampling
estimate (Equation 4). Estimator MURT1 considers
the systematic sample as a stratified random sample
with two samples from each of n/2 strata; MURT2 is
similar but based on successive differences (Murthy
& Rao, 1988). The next three estimators are based
on higher-order differences; WOLT1 and WOLT2
attempt to account for trends and WOLT3 for auto-
correlation (Wolter, 1984). The estimator KOOP
consists of a pseudo-replication in which the sample is
split into two systematic subsamples (Koop, 1971).
The next three estimators also attempt to take into
account the spatial correlation structure of the popu-
lation. Estimator COCH is an asymptotic result due to
Cochran (1946) and assumes an autoregressive pro-
cess of order one; estimators CHEV (Yates, 1960;
Chevrou, 1976) and GUND (Gunderson & Jensen,
1987) are based on regionalized variable theory
(Mattfeldt, 1989). CHEV was developed specifically
for error estimation in linear systematic arrays, while
GUND is based on a quadratic approximation to the
variogram. The final estimator MAHA is based on
two independent systematic samples of size n/2, a
technique known as the method of interpenetrating
subsamples (Mahalanobis, 1946).
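Two of the simplest estimators can be written out in Python as a sketch. The first is the simple random sampling estimate (sample variance divided by n, finite population correction ignored). The second is the successive-difference formula quoted in the abstract; which named entry of Table 1 it corresponds to is not assumed here, and the function names are hypothetical.

```python
import numpy as np

def var_srs_estimate(y):
    """Simple random sampling estimate: sample variance divided by n."""
    y = np.asarray(y, dtype=float)
    return float(np.var(y, ddof=1) / len(y))

def var_successive_difference(y):
    """Estimate from the abstract: sum of squared successive
    differences divided by 10 * n**2."""
    y = np.asarray(y, dtype=float)
    return float(np.sum(np.diff(y) ** 2) / (10 * len(y) ** 2))

stations = [1.0, 2.0, 4.0, 7.0]     # made-up measurements at n = 4 stations
v_srs = var_srs_estimate(stations)
v_diff = var_successive_difference(stations)
```

For a smoothly trending transect such as this one, the successive-difference estimate is far smaller than the simple random sampling estimate, because differences between neighbouring stations remove most of the trend.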
The estimators for sample sizes of 10, 20 and 40
stations were compared. For a given sample size,
hydrographic variable and transect, each estimator
was applied to all possible systematic samples, and the
performance of an estimator was summarized with the
mean square error MSE:
Results for all transects, for a given sample size and
variable, were then averaged and ranked.
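A mean square error of this kind is conventionally written as below. The notation is an assumption rather than the authors' own: $v_k$ denotes the value of a given estimator applied to the kth of the m possible systematic samples, $\operatorname{var}(\bar{y})$ is the true variance from Equation 8, and no normalization is applied.

```latex
\mathrm{MSE} = \frac{1}{m} \sum_{k=1}^{m} \left[ v_k - \operatorname{var}(\bar{y}) \right]^2
```

Under this form, an estimator is penalized both for bias and for variability across the possible systematic samples.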
Sample size
The relation between variance and sample size was
demonstrated empirically by computing $\operatorname{var}(\bar{y})$ (Equation 8)
from all possible systematic subsamples for a
range of sample sizes, specifically 5 through 50. This
range encompasses the number of fixed stations likely
to be encountered in practice. The calculation was
repeated for each transect and variable. The relation
was assumed to be of the form:
$$\operatorname{var}(\bar{y}) \propto n^{-\alpha} \qquad (11)$$
and estimates of $\alpha$ were extracted using the Golub-Pereyra
algorithm for partially linear models (Bates &
Chambers, 1992).
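The exponent in a power-law relation between variance and sample size can be estimated with a short Python sketch. Note the substitution: this uses ordinary least squares on log-transformed values, a simpler stand-in for the Golub-Pereyra algorithm named above, and it assumes the form var = b · n^(−α) with a hypothetical constant b. The data are synthetic, and the function names are hypothetical.

```python
import numpy as np

def fit_power_law(n, v):
    """Fit log v = log b - alpha * log n by least squares; returns (alpha, b)."""
    slope, intercept = np.polyfit(np.log(n), np.log(v), 1)
    return float(-slope), float(np.exp(intercept))

def stations_for_target(n_now, v_now, v_target, alpha):
    """Stations needed to reach v_target, assuming var is
    proportional to n ** -alpha."""
    return int(np.ceil(n_now * (v_now / v_target) ** (1.0 / alpha)))

n = np.arange(5, 51, dtype=float)   # sample sizes 5 through 50
v = 3.0 * n ** -2.0                 # synthetic variances with alpha = 2
alpha, b = fit_power_law(n, v)
needed = stations_for_target(10, 4.0, 1.0, 2.0)
```

With an exponent of 2, cutting the variance by a factor of four requires only a doubling of the number of stations, whereas independent samples (exponent 1) would require four times as many.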
The relation between variance and sample size
is also a relation between variance and interstation
distance. In order to portray how variance changes
with spatial scale, as opposed to sample size, $\operatorname{var}(\bar{y})$
(Equation 8) was computed from all possible
systematic subsamples for station separations of
1–64 km.
Results
Horizontal stratification of the estuary
The transect data are essentially one-dimensional and
so are best portrayed as a function of distance
measured along the transect. In order to simplify the
presentation, the main points regarding stratification
are illustrated with results from two cruises (Figures 3
and 4). The first feature to note is that the strata
chosen can differ among variables for the same cruise.
For example, chlorophyll a exhibits near-homogeneity
between the Bay Bridge and Sacramento River on 4
April 1995, while four strata have been attributed to
SPM in this same region (Figure 3). Secondly, and in
a similar vein, the strata can differ among cruises
for the same variable. For example, five strata are
required to describe chlorophyll a on 4 April 1995
while eight are required on 21 September 1995
T 1. Estimators of variance for systematic sampling
Estimator Description
The sample is denoted by y
j
,j=1,2,...,n.
Figure 3. MIDAS transects with tree-based model superimposed
for 4 April 1995. Panels: salinity, SPM (mg l⁻¹) and Chl a (µg l⁻¹)
against transect distance (km). DuBr, Dumbarton Bridge; SBSh,
San Bruno Shoal; BBr, Bay Bridge; PtSP, Pt. San Pablo;
Mare, Mare I.; Avon, Avon Pier; Chip, Chipps I.; SacR, first
Sacramento R. Station; SPM, suspended particulate matter.
(Figure 4). Finally, strata boundaries do not necess-
arily demarcate homogeneous regions, but may occur
in the middle of a strong spatial gradient, such as the
one for salinity on 21 September 1995. These
three features illustrate that an estuarine model of
variable-independent, temporally stable and homo-
geneous subdomains can be a poor approximation.
The subdomains defined by tree-based regression,
which turn out to be the appropriate ones for stratified
sampling (see below), change among variables and
seasons. Furthermore, the frequent presence of large
gradients over much of the estuary contradicts the
very notion of homogeneous subdomains.
Although the model of homogeneous subdomains
may be in some sense a poor one for this estuary, the
model need not fit perfectly for stratification to improve
estimates of the overall mean. The only requirement
is that the typical within-stratum variance
is sufficiently small compared to the variance of the
within-stratum means (Barnett, 1991). Furthermore,
indicating the positions of splits but not their
importance does not fully characterize the results
and may bias one’s view of the efficacy of stratification.
In order to test its efficacy more objectively,
one needs to decide on a compromise stratification
that summarizes the commonalities among
the tree-based regressions for individual cruises and
variables.
For each variable, the positions of the splits for all
cruises were superimposed on the cruise track (Figure
5). Each split of a stratum is represented by a square,
the area of which is proportional to the total variance
represented by that stratum. Major splits for chlorophyll a
are situated near the Dumbarton Bridge, the
San Bruno Shoal and Angel Island. For SPM, the
major boundaries are in the vicinity of the Dumbarton
Bridge, Angel Island and in northern San Pablo Bay.
Salinity is stratified most strongly near Angel Island,
northern San Pablo Bay and Martinez. Most of these
locations coincide with important physiographic and
hydrological features (Figure 1). The Dumbarton
Bridge marks a significant constriction in southern
South Bay; the San Bruno Shoal is a large shallow
expanse that is also a hydrodynamic and biological
boundary (Powell et al., 1986); Angel Island marks
the southern boundary of the river-dominated portion
of the estuary, where the flow from the Sacramento
and San Joaquin Rivers turns westward and exits
through the Golden Gate; and Martinez marks the
upstream boundary of the Carquinez Strait, a narrow
constriction in the northern Bay. A number of major
boundaries cluster toward the northern end of San
Pablo Bay, but do not clearly demarcate any single
position. The ‘lability’ in this region is due to the
strong gradients often present in San Pablo Bay.
Rather than situating a boundary at some location in
the centre of these splits that has no physiographic or
hydrodynamic significance, the authors chose to
locate a boundary at Mare Island, which marks the
northern end of these splits and the seaward boundary
of the Carquinez Strait.
In this way, all the boundaries have a physiographic
or hydrodynamic significance. The exact boundaries
of the six strata are defined in Table 2. Note that
stratum size (length) changes slightly among transects
because of small differences in the actual course taken
by the ship. The means and standard deviations of
water quality variables for each stratum and transect
are summarized in Table 3.
Proportional allocation of stations is the same for all
variables, and simply mirrors stratum size (Table 4).
Optimal allocation, on the other hand, differs greatly
among variables for most strata, indicating that there
is no general optimal allocation. The compromise
allocation resembles the proportional allocation
although, based on their covariance among strata, it is
most similar to the optimal allocation for chlorophyll.
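The two allocation rules can be illustrated with a short calculation. The sketch below uses the stratum lengths of Table 2 and, as one example, the within-stratum salinity standard deviations for 4 April 1995 from Table 3. Proportional allocation assigns stations in proportion to stratum size. For optimal allocation the sketch assumes the standard Neyman rule (stations proportional to stratum size times within-stratum standard deviation, with equal sampling costs), which may differ in detail from the rule applied in the study. Function names are hypothetical.

```python
# Stratum lengths (km) from Table 2.
stratum_length = [6.9, 23.3, 28.7, 37.3, 13.1, 51.8]
# Within-stratum salinity SDs for 4 April 1995 from Table 3.
salinity_sd = [0.34, 1.00, 1.2, 2.1, 0.162, 0.011]

def proportional_allocation(size):
    """Percentage of stations per stratum, proportional to stratum size."""
    total = sum(size)
    return [100.0 * s / total for s in size]

def neyman_allocation(size, sd):
    """Percentage of stations per stratum, proportional to size times SD."""
    weight = [s * d for s, d in zip(size, sd)]
    total = sum(weight)
    return [100.0 * w / total for w in weight]

prop = proportional_allocation(stratum_length)
opt = neyman_allocation(stratum_length, salinity_sd)
```

For this single transect the proportional percentages fall close to the proportional column of Table 4, and the Neyman rule places the largest share of stations in stratum 4, where the salinity gradient is strongest.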
Figure 4. MIDAS transects with tree-based model superimposed
for 21 September 1995. Panels: salinity, SPM (mg l⁻¹) and
Chl a (µg l⁻¹) against transect distance (km). DuBr, Dumbarton Bridge;
SBSh, San Bruno Shoal; BBr, Bay Bridge; PtSP, Pt. San
Pablo; Mare, Mare I.; Avon, Avon Pier; Chip, Chipps I.;
SacR, first Sacramento R. Station; SPM, suspended particulate
matter.
Sampling design
In the case of stratified random sampling, propor-
tional allocation showed large increases in precision
compared to simple random sampling, largest in the
case of salinity but substantial for all variables (Table
5). Optimal allocation exhibited further increases
in precision in all cases, eliminating 23–45% of
the remaining variance. The compromise allocation,
Figure 5. Summary plot of the split locations for all transects. The area of the square representing each split is proportional
to the decrease in deviance due to the split. (a) Chlorophyll a; (b) suspended particulate matter; (c) salinity. Axes in each panel: Easting (km) and Northing (km).
T 2. Definition of a stratification scheme for the MIDAS transects in San Francisco Bay
Stratum
no. Description Size&SD
(km) Northing
(km) Easting
(km)
1 South of Dumbarton Br. 6·9&0·3 <151·4
2 Dumbarton Br. to San Bruno Shoal 23·3&0·6 151·4–165·3
3 San Bruno Shoal to Angel I. 28·7&0·7 165·3–188·8
4 Angel I. to Mare I. 37·3&2·2 §188·8 <564·6
5 Mare I. to Martinez 13·1&1·1 §188·8 564·6–574·5
6 East of Martinez 51·8&1·7 §188·8 §574·5
Locations are specified in terms of UTM coordinates.
however, showed only a modest improvement over
proportional allocation for salinity, and was slightly
worse in the case of chlorophyll a and SPM.
Systematic sampling performed better than stratified
random sampling with as few as 10 stations,
although the large standard deviations imply that
results were highly transect dependent (Table 5). As
the sample size increased, the precision of systematic
sampling increased and exceeded even stratified random
sampling with optimal allocation using only 20
stations. Stratified systematic sampling was slightly
better than simple systematic in the case of salinity but
worse in the case of SPM, regardless of sample size.
For chlorophyll, stratified systematic was sometimes
better, sometimes worse.
Variance estimators
The top three estimators, COCH, CHEV and GUND,
were among those which attempted to account for
spatial autocorrelation (Table 6). Estimator CHEV
had an average ranking of 1·6, COCH 2·6 and GUND
T 3. Mean&standard deviation of water quality variables within each stratum of Table 2
Stratum number
1234 5 6
29 November 1994
Salinity 28·5&0·5 30·4&0·4 30·3&0·1 27·4&2·4 19·9&2·2 6·31&5·26
SPM 62·8&10·6 24·9&10·8 6·75&1·57 20·3&13·3 55·7&7·0 37·5&14·3
Chl a1·82&0·08 2·08&0·07 2·02&0·08 1·70&0·10 1·46&0·04 1·53&0·05
18 January 1995
Salinity 19·7&1·3 19·6&1·7 14·8&1·8 6·01&4·76 0·252&0·141 0·0775&0·0179
SPM 27·2&14·1 1·19&2·78 1·53&2·26 55·7&52·5 157&4 152&18
Chl a1·46&0·04 1·53&0·01 1·54&0·01 1·42&0·13 1·16&0·02 1·19&0·06
7 February 1995
Salinity 15·5&0·2 15·4&0·6 13·4&1·3 5·8&2·2 1·18&0·96 0·0795&0·0054
SPM 4·57&0·96 5·34&0·68 4·91&2·43 36·5&19·0 97·9&13·8 96·3&11·8
Chl a1·44&0·01 1·45&0·02 1·51&0·02 1·41&0·08 1·14&0·06 1·15&0·04
7 March 1995
Salinity 15·1&0·3 17·5&1·0 20·1&1·1 9·54&3·37 1·54&1·15 0·101&0·021
SPM 47·6&3·8 28·5&9·9 9·72&2·96 20·7&10·1 40·6&3·7 25·9&7·8
Chl a38·1&2·0 28·1&6·5 10·5&5·6 5·09&1·46 4·87&0·28 3·61&0·60
4 April 1995
Salinity 6·82&0·34 9·03&1·00 11·5&1·2 4·3&2·1 0·211&0·162 0·108&0·011
SPM 59&16 23&7 10·8&5·4 48·7&12·5 62·5&5·6 43·5&8·3
Chl a11·5&1·5 8·12&1·42 8·74&5·16 5·56&1·23 5·12&0·28 4·861·56
2 May 1995
Salinity 14&0 16·2&1·4 18·9&1·7 6·4&3·6 0·121&0·032 0·0785&0·0137
SPM 95·3&37·2 34·4&13·4 18·5&10·2 70·9&21·5 53·4&14·4 22·1&4·7
Chl a13·6&6·6 3·41&2·05 5·28&4·03 15·2&4·2 7·05&2·24 1·57&0·80
13 June 1995
Salinity 13·6&1·3 17·5&0·9 21·6&1·5 15&6 2·58&1·35 0·146&0·171
SPM 634&145 85·7&101 14·9&28·9 29·1&25·7 167&73 79&86
Chl a6·97&1·92 2·63&0·529 2·18&0·12 2·27&0·11 2·72&0·30 2·51&0·39
18 July 1995
Salinity 20·4&0·4 22·2&0·7 24·8&0·8 13·7&5·0 2·42&1·57 0·107&0·113
SPM 19·1&5·2 5·38&4·07 2·54&1·01 19·4&14·6 76&10 30·3&18·7
Chl a2·46&0·09 2·27&0·13 2·22&0·16 2·34&0·24 3·4&0·3 2·67&0·43
16 August 1995
Salinity 22·5&0·3 24·1&0·7 27·1&1·1 20·9&4·0 10·1&2·2 1·54&2·03
SPM 32·7&5·5 10·5&6·3 3·67&1·08 20·7&7·1 25·1&3·8 32·6&11·9
Chl a6·31&1·2 1·23&1·10 0·9&1·6 8·65&2·29 5·4&1·2 5·62&2·41
21 September 1995
Salinity 22·6&0·9 26·2&1·0 27·8&1·0 24·2&3·6 14·7&2·1 2·14&3·17
SPM 26·3&19·4 5·78&2·21 4·48&1·05 5·76&3·91 25·7&4·8 29&7
Chl a4·94&1·35 1·92&0·52 1·44&0·36 2·18&1·05 2·38&0·30 2·4&30·53
SPM, suspended particulate matter; Chl a, chlorophyll a.
4·0. The next best estimators were KOOP and
MAHA, which base their estimates on two subsamples
of equal size. The estimator SRS, which treats the
sample as if it were a simple random sample, is highly
inefficient; it came in last in every instance.
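The average rankings quoted above can be recomputed directly from the entries of Table 6. The short sketch below transcribes the ranks (three variables at three sample sizes each) and averages them per estimator; the variable names are illustrative only.

```python
# Ranks from Table 6; columns are Chl a, SPM and salinity, each at n = 10, 20, 40.
ranks = {
    "SRS":   [11, 11, 11, 11, 11, 11, 11, 11, 11],
    "MURT1": [9, 9, 8, 9, 9, 9, 9, 9, 9],
    "MURT2": [10, 10, 10, 10, 10, 10, 10, 10, 10],
    "WOLT1": [7, 6, 6, 8, 8, 7, 5, 4, 5],
    "WOLT2": [6, 7, 7, 7, 6, 6, 8, 3, 6],
    "WOLT3": [8, 5, 9, 6, 7, 8, 7, 5, 7],
    "KOOP":  [1, 4, 3, 2, 4, 4, 4, 8, 8],
    "COCH":  [4, 3, 2, 4, 2, 3, 1, 1, 3],
    "CHEV":  [3, 1, 1, 1, 1, 1, 2, 2, 2],
    "GUND":  [2, 2, 4, 5, 5, 5, 3, 6, 4],
    "MAHA":  [5, 8, 5, 3, 3, 2, 6, 7, 1],
}

# Average rank per estimator, then estimators ordered from best to worst.
mean_rank = {name: sum(r) / len(r) for name, r in ranks.items()}
ordered = sorted(mean_rank, key=mean_rank.get)
```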
Sample size
Fits of the inverse power relationship separately to
each variable and transect (Equation 11) resulted in a
narrow range of α averaging 1·9 ± 0·1 (SE) (Figure 6).
No significant effects of either variables or transects
were found. For theoretical reasons discussed below,
the ability of an inverse square curve to fit all data
simultaneously was examined. The resulting fits each
appeared to be a satisfactory description of the data
(Figure 7).
The relative standard error of the median transect is
almost always less than 10% when stations are spaced
up to 8 km apart (Figure 8).
Discussion
Horizontal stratification of the estuary
Horizontal stratification of an estuary, i.e. division of
the estuary into subdomains, can be motivated by
many different goals:
(1) The need for precise estimates of estuary-wide
statistics such as the overall mean was the authors’
primary motivation. As discussed above, if the estuary
can be divided into subdomains that are relatively
homogeneous compared to the between-subdomain
variability, then estimates of the overall mean will be
more precise than for a simple random sample. These
T 4. Sample sizes within strata expressed as a percentage of the total number of samples
Stratum
no. Proportional
allocation&SD
Optimal Allocation&SD Compromise
allocation&SDChlorophyll aSPM Salinity
1 4·3&0·2 8·6&8·0 6·0&4·9 1·4&0·9 6·5&4·2
2 14·5&0·5 17·0&11·5 11·2&7·4 9·2&5·0 13·1&5·1
3 17·8&0·6 23·5&15·9 5·8&4·4 14·7&8·0 15·6&8·1
4 23·1&0·9 22·3&12·7 32·1&14·6 54·1&15·9 34·1&5·0
5 8·1&0·5 3·9&2·5 6·5&2·9 5·6&3·5 5·1&1·4
6 32·2&1·0 24·7&13·9 38·4&12·8 15·0&23·7 25·6&13·4
SPM, suspended particulate matter.
The standard deviations represent variation among transects.
T 5. Percent decrease in var(y¯) compared to simple random sampling for dierent types of
sampling strategies
Sampling type Chlorophyll a&SD SPM&SD Salinity&SD
Stratified random
Proportional 65·4&19·9 73·1&10·1 93·4&2·0
Optimal 75·1&15·6 79·3&9·5 96·4&1·1
Compromise 63·5&23·8 72·4&12·0 94·6&2·2
n=10
Systematic 70·8&24·0 86·6&15·3 95·7&1·1
Stratified systematic 66·6&32·7 82·8&9·0 96·9&1·2
n=20
Systematic 78·3&11·2 93·1&6·1 97·4&1·3
Stratified systematic 84·1&9·2 91·0&4·6 98·4&0·6
n=40
Systematic 86·7&10·3 95·2&5·6 99·2&0·4
Stratified systematic 91·8&4·2 94·4&3·0 99·2&0·3
SPM, suspended particulate matter.
The standard deviations represent variation among transects.
results show that stratification is very effective in
improving precision over simple random sampling.
Stratified random sampling, however, is inferior to
simple systematic sampling with as few as 10 samples.
Moreover, stratified systematic sampling offers no real
improvements over simple systematic sampling. The
authors’ conclusion that horizontal stratification is
ineffective refers specifically to this context.
(2) Administrative convenience can be a valid
reason when, for example, different sampling methods
are required for different habitats of an estuary (e.g.
shoals vs channels).
(3) Stratification may also proceed along political
boundaries, particularly when the issue is one of
compliance with government regulations.
(4) Division into subdomains can also be motivated
by the need to understand underlying causal mecha-
nisms, in which case one might want to stratify on the
basis of covariability of different spatial locations in
time. In fact, previous research on the San Francisco
Bay-Delta has clearly shown how different (overlapping)
spatial subdomains can be identified with
separate causal mechanisms through the use of rotated
principal component analysis, a regionalization procedure
common in meteorology (Jassby & Powell,
1994; Cloern & Jassby, 1995).
Tree-based regression is one of many approaches to
the problem of grouping objects (in this case, loca-
tions) into subgroups according to their similarity.
Legendre (1987) has reviewed a number of these
other techniques, some similar to tree-based regres-
sion, that respect spatial contiguity, i.e. that give
weight to proximity in space as well as to similarity in
magnitude. Several features of tree-based regression
attracted us originally. First, by operating through a
binary recursive partitioning, it automatically pre-
serves spatial contiguity within subdomains. Second,
although not so much a consideration in this study, it
can be applied to higher-dimensional data. Finally, it
is easily shown that the criterion used by tree-based
regression to choose splits (Equation 1) is equivalent
to maximizing:
T 6. Ranking of estimators in Table 1 based on their MSE (Equation 10) for MIDAS transect
data
Estimator
Chl aSPM Salinity
n=10 n=20 n=40 n=10 n=20 n=40 n=10 n=20 n=40
SRS 11 11 11 11 11 11 11 11 11
MURT1 998999999
MURT2 10 10 10 10 10 10 10 10 10
WOLT1 766887545
WOLT2 677766836
WOLT3 859678757
KOOP 143244488
COCH 432423113
CHEV 311111222
GUND 224555364
MAHA 585332671
n, size of systematic sample.
SPM, suspended particulate matter; Chl a, chlorophyll a.
Figure 6. Histogram of values for the inverse power α in
Equation 11. Each individual value is the result of fitting
Equation 11 to the data for a single variable (salinity,
suspended particulate matter or chlorophyll a) and transect.
Axes: Power and Number.
Figure 7. Variance (relative to maximum value) plotted against systematic sample size for each cruise (1–10) and variable.
The lines are fitted inverse square curves. Columns: chlorophyll a, SPM and salinity; rows: the 10 cruises from 29 November 1994 to 21 September 1995. SPM, suspended particulate matter.
which is the difference in the variances for simple
random sampling and stratified random sampling with
proportional allocation (Cochran, 1977). In other
words, at each iteration, tree-based regression chooses
the split that maximizes the benefits of stratified
sampling.
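The equivalence can be checked numerically. The sketch below assumes that the deviance of a node is the sum of squared deviations from the node mean, the usual choice for regression trees. By the standard analysis-of-variance identity, the decrease in deviance from a split equals the sum over strata of N_h (ȳ_h − ȳ)². This is, up to a constant factor, the between-stratum term that sampling texts give as the gain of proportional allocation over simple random sampling when terms in 1/N_h are neglected. Function names and the small data set are hypothetical.

```python
def deviance(y):
    """Sum of squared deviations from the mean (regression-tree deviance)."""
    centre = sum(y) / len(y)
    return sum((v - centre) ** 2 for v in y)

def deviance_reduction(y, split):
    """Decrease in deviance when y is split into y[:split] and y[split:]."""
    return deviance(y) - deviance(y[:split]) - deviance(y[split:])

def between_stratum_term(y, split):
    """Sum over the two strata of N_h times (stratum mean - overall mean)^2."""
    overall = sum(y) / len(y)
    total = 0.0
    for part in (y[:split], y[split:]):
        total += len(part) * (sum(part) / len(part) - overall) ** 2
    return total

# Hypothetical transect with a step between the third and fourth points.
y = [2.0, 2.5, 2.2, 8.0, 8.4, 7.9, 8.1]
best_split = max(range(1, len(y)), key=lambda s: deviance_reduction(y, s))
```

Because the two quantities agree for every candidate split, choosing the split with the largest deviance reduction is the same as choosing the two-stratum design with the largest between-stratum term.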
The performance of tree-based regression may
sometimes appear disappointing, specifically in the
presence of strong gradients (e.g. the salinity panel in
Figure 4); boundaries are laid down at apparently
arbitrary locations on the gradient that have no distin-
guishing features. This behaviour, however, reflects
the fact that the estuary often does not fit the implicit
model of comprising homogeneous subdomains. Simi-
lar behaviour would be found with other techniques
that partition to minimize the within-subdomain varia-
bility. In fact, tree-based regression was actually very
effective in guiding the authors’ choice of stratum
boundaries, considering that the resulting stratified
sampling estimate decreased the variance of the esti-
mated mean by 73 to 97% (Table 5). Note, however,
that tree-based regression may not be appropriate for
identifying subdomains in other contexts; as pointed
out above, it fails to isolate transitional subdomains,
tending to split them instead.
Sampling design
Despite the efficacy of stratified sampling, systematic
sampling almost always yields a higher precision,
regardless of the method of sample allocation for
stratified sampling (Table 5). In theory, the relative
performance of the different sampling designs depends
on the properties of the underlying population
(Cochran, 1977; Murthy & Rao, 1988). If the population
is completely randomly arranged, systematic sampling
is no better than simple random sampling. For a
population dominated by a linear trend, stratified random
sampling is the most efficient. For a population
varying periodically in space, performance of the systematic
sample depends on the interstation interval: if
the sampling interval is an integral multiple of the
wavelength, estimates will be highly inefficient; on the
other hand, if the sampling interval is an odd multiple
of half the wavelength, estimates will be highly efficient. For
populations with just serial correlation, the results depend
on the nature of the spatial covariance structure.
For example, Hájek (1959) extended the earlier results
of Cochran (1946) to show that, in the case of stationary
populations, systematic sampling minimizes the
variance of the sample mean as long as the spatial
correlation function is positive, decreasing and convex.
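The periodic case lends itself to a small numerical illustration. The sketch below builds a hypothetical sinusoidal transect (a wavelength of 20 grid points over 10 full cycles, both arbitrary choices) and computes the variance of the systematic-sample mean over all possible starting points for two sampling intervals: one equal to the wavelength and one equal to half the wavelength.

```python
import math

def variance_of_systematic_mean(y, interval):
    """Variance of the systematic-sample mean over all possible starting points."""
    overall = sum(y) / len(y)
    means = []
    for start in range(interval):
        sample = y[start::interval]
        means.append(sum(sample) / len(sample))
    return sum((m - overall) ** 2 for m in means) / len(means)

wavelength = 20                       # grid points per cycle (assumed)
y = [math.sin(2 * math.pi * i / wavelength) for i in range(10 * wavelength)]

var_full = variance_of_systematic_mean(y, wavelength)        # interval = wavelength
var_half = variance_of_systematic_mean(y, wavelength // 2)   # interval = half wavelength
```

When the interval equals the wavelength, every station in a sample sits at the same phase, so the sample mean simply reproduces the value at the starting point and its variance equals that of the wave itself. At half the wavelength, successive stations cancel and the variance is essentially zero.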
The study transects do not fall clearly into any of
these ideal categories. Many of the features of the
large-scale variability are clearly related to large-scale
structural aspects of the estuary basin, such as the
transition from the narrow Carquinez Strait to open
San Pablo Bay in the vicinity of Mare Island or the
shallow expanse of the San Bruno Shoal in the south-
ern Bay (Figures 3 and 4). The large-scale variability,
therefore, is most properly treated as a ‘deterministic’
spatial trend. In more confined reaches or on smaller
scales, many of the aforementioned special cases may
apply. For example, a linear trend in fluorescence
occurs on the 10-km scale between San Bruno Shoal
and the Dumbarton Bridge (Figure 3), while station-
ary time series models incorporating serial correlation
appear to be appropriate for this series on the
scale 1 km and smaller. These results, therefore,
demonstrate the robustness of systematic sampling for
a range of spatial variability types found in estuaries.
Figure 8. Relative standard error of the estimated transect
mean from systematic samples as a function of the interstation
distance. Panels: salinity, SPM and chlorophyll a; axes:
standard error (%) against interstation interval (km). The
boxplot for each distance represents
the variability among transects. The interior horizontal line
marks the median; the lower and upper box boundaries
mark the first and third quartiles, respectively; the vertical
lines extend to all points within 1·5 times the interquartile
distance; more extreme points are shown by horizontal lines
standing alone. SPM, suspended particulate matter.
Note that a stratified systematic design offers only
modest improvements at best, and sometimes even
worse precision than unstratified systematic sampling
(Table 5). As the systematic samples for different
strata are chosen independently, stations from different
strata may fall very close together near boundaries
between strata and provide redundant information,
also a failing of both the simple and stratified random
design. The improvements are too modest to warrant
the additional complications of stratified systematic
sampling.
Variance estimators
The variance estimators with the lowest MSE (Equa-
tion 10)—COCH, CHEV and GUND—are those
specifically devised to account for spatial autocorrela-
tion in the population (Table 1). Estimator CHEV is
recommended on the basis of its empirical perform-
ance; interestingly, it was designed specifically for
linear systematic samples (Chevrou, 1976) and the
results here attest to the success of that design. Esti-
mator COCH would also be a good choice. It is one
devised by Cochran (1946) for stationary populations
with an autoregressive structure of order one, equiva-
lent to an exponential correlogram or variogram. As
discussed below, the correlograms commonly encoun-
tered in these data are in fact exponential. Wolter
(1984) observed that this estimator had remarkably
good properties for artificial populations that are
dominated by linear trends or autocorrelation. On the
other hand, it was found that treating the systematic
sample as if it were a simple random sample (SRS)
leads to a poor estimate of the variance. Note also that
MURT1 and MURT2 turned out to be relatively
inefficient for these estuarine data, in contrast to their
superior performance for demographic data (Wolter,
1984).
Sample size
The relation between variance of the mean and
sample size is well described by an inverse power law
with an average power of 1·9 ± 0·1, and more than
80% of the cases occur in the range 1·5–2·5 (Figure
6). It can be shown that the power is related to the
nature of the variogram (Simard et al., 1992). Let n
systematic samples be taken along the transect distance
T, so L = T/n is the distance between samples.
Suppose for distances h up to L, the variogram has the
form:
$$\gamma(h) = c\,h^{b}, \quad 0 \le b < 2$$
where c is a constant. Then one can derive for the
one-dimensional case (namely, a transect) that (D.
Marcotte, pers. comm.):
Now in the case of exponential, linear or spherical
variograms (Isaaks & Srivastava, 1989), provided that
the range is much larger than L, b = 1 and an inverse
square law would be expected. For Gaussian variograms,
which have parabolic as opposed to linear
behaviour near the origin, b = 2, and for a nugget
effect, b = 0, so that a different power law would hold
in these cases. The present finding of an inverse
square law is therefore in agreement with past studies
of spatial correlation in San Francisco Bay, which
have demonstrated the presence of exponential spatial
correlation (Powell et al., 1986).
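The dependence of the power on b can be written compactly. The relation below is an inference from the cases discussed in the text (an inverse square law for b = 1, and powers ranging from 1 to 3 as b runs from 0 to 2), not a quotation of the derived expression. The proportionality constant is left unspecified and the form should be read as an assumption.

```latex
\gamma(h) = c\,h^{b}
\quad\Longrightarrow\quad
\operatorname{var}(\bar{y}) \propto n^{-(1+b)},
\qquad\text{so that}\qquad
b = 1 \;\Rightarrow\; \operatorname{var}(\bar{y}) \propto n^{-2}.
```

Under this reading a pure nugget effect (b = 0) recovers the familiar 1/n behaviour of independent observations.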
How widely can the inverse square relationship be
applied to other systems? Based on the above argu-
ments, this question can be rephrased by asking how
representative are variograms with linear behaviour
near the origin (i.e. linear, exponential and spherical
variograms). A nugget effect has been observed for
water quality variables in both estuarine (Legendre &
Troussellier, 1988; Legendre et al., 1989; Simard et al.,
1992) and coastal waters (Denman & Freeland, 1985;
Yoder et al., 1987). The shapes of the variograms,
however, were usually exponential or at least compatible
with an exponential shape where the resolution
was too poor to be certain. Furthermore, the nugget
effect may represent sampling error and not be an
inherent feature such as high short scale variability
(Isaaks & Srivastava, 1989); with more precise
measurement techniques, the nugget effect could
weaken or disappear. Nonetheless, the evidence from
estuaries on variogram shape is sparse. Parabolic
behaviour has been observed, moreover, in at least one
other tidal estuary, North Inlet, South Carolina
(Childers et al., 1994). Based on the existing evidence,
an inverse square law cannot therefore be assumed for
other estuaries, and the actual power could lie
between 1 and 3. There is clearly a need here for
expanding the empirical knowledge of estuarine
spatial autocorrelation. Theoretical investigation of
the link between the variogram or correlogram and
underlying physical and biological processes could
also help resolve the generality of any sampling design.
Where high-resolution data such as the MIDAS
data are not available, it may be possible to take a
geostatistical approach both to the variance estimate
and the relation between precision and sample size.
Based on a model of the underlying spatial auto-
correlation, i.e. the variogram, kriging methodology
provides optimal point or global estimates, including
the precision of the estimates. It has been widely used
in ecology (Rossi et al., 1992) and sometimes applied
to estuaries (Simard et al., 1992). It can also be used
to generate an empirical relation between variance and
sample size (Burgess et al., 1981; Oliver & Webster,
1991). A number of important conditions must be
satisfied, however, and as many as 200 stations may be
required to properly define the variogram for kriging
(Oliver & Webster, 1991), much larger than the
number of stations typically constituting an estuarine
sampling programme. With too few points, it may be
impossible to resolve behaviour near the origin and
variance estimates can then be unreliable (Thioulouse
et al., 1993). In that case, the variance estimator
suggested by this study provides a useful alternative,
although how to scale the variance for different sample
sizes will remain uncertain without knowledge of the
variogram shape.
Despite the central importance of the mean for both
theoretical and practical reasons, as pointed out in the
Introduction, it is not the most relevant statistic for
many water quality variables. In the case of pollutant
indices such as fecal coliforms, for example, the pro-
portion of the population exceeding some specific
level is the characteristic of interest. A global mean
that falls within sanitary guidelines may disguise
locally important water quality problems. Further
work should, therefore, consider not only how to
generalize these results regarding regional means to
other estuaries, but also the need for similar analyses
on other population statistics such as quantiles.
Concluding remarks
The results of this study provide guidelines for esti-
mating estuary-wide means with low-resolution data
from fixed stations. Given any desired precision for
the estimate, the first step is to take systematic
samples with as many stations as practical. Next, the
estimator CHEV or COCH (Table 1) is used to
calculate the variance of each sample, which will of
course vary somewhat from transect to transect even
when the number of stations is constant (Figure 8).
Finally, the desired station number is determined
from the typical or characteristic variance found in the
previous step, the target variance, and the inverse
power relation between variance and sample size. At
present, the actual value for the power must come
from prior knowledge of the variogram or correlogram
shape or, if enough stations are used, by calculating
the variogram from the initial array of stations. Fur-
ther research may reveal some general rules for deduc-
ing variogram shape or this power from features of
estuarine dynamics that can be observed with fewer
stations.
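The final step can be sketched as a small calculation. Assuming the variance of the mean scales as n^(-power), the station number needed to reach a target variance follows from the characteristic variance observed with an initial array of stations. The function name and arguments below are hypothetical, and the default power of 2 reflects the inverse square law found for San Francisco Bay; a different value should be supplied where the variogram shape suggests one.

```python
import math

def required_stations(n_pilot, var_pilot, var_target, power=2.0):
    """Stations needed to reach var_target, assuming var is proportional to n**(-power)."""
    if min(n_pilot, var_pilot, var_target, power) <= 0:
        raise ValueError("all arguments must be positive")
    n_needed = n_pilot * (var_pilot / var_target) ** (1.0 / power)
    # Small tolerance so that exact answers are not rounded up by float error.
    return math.ceil(n_needed - 1e-9)
```

For example, if 10 stations give a characteristic variance four times larger than the target, an inverse square law calls for 20 stations, whereas an inverse first-power law would call for 40.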
Acknowledgements
The authors thank Jody Edmunds and Jane Carey
for their central roles in the MIDAS data collection
effort. Denis Marcotte provided invaluable advice on
how spatial structure affects the relationship between
variance and sample size; the authors are grateful to
him for always being willing to help in a timely fashion
and to share his unpublished results. Two anonymous
reviewers suggested important improvements to the
original manuscript. The data collection and analysis
were supported by the San Francisco Estuary
Regional Monitoring Program for Trace Substances
(SFEI 135-95), the U.S. Geological Survey San
Francisco Bay Ecosystem Program and the U.S.
Environmental Protection Agency (R819658)
through the Center for Ecological Health Research at
the University of California, Davis. Although the U.S.
EPA partially funded preparation of this document, it
does not necessarily reflect the views of the agency and
no official endorsement should be inferred.
References
Barnett, V. 1991 Sample Survey: Principles and Methods. Edward
Arnold, London, 173 pp.
Bates, D. M. & Chambers, J. M. 1992 Nonlinear models. In
Statistical Models in S (Chambers, J. M. & Hastie, T. J., eds).
Wadsworth & Brooks/Cole Advanced Books & Software, Pacific
Grove, California, pp. 421–454.
Bellhouse, D. R. 1988 Systematic sampling. In Handbook of Statis-
tics, vol. 6 (Krishnaiah, P. R. & Rao, C. R., eds). Elsevier Science
Publishers, Amsterdam, pp. 125–145.
Burgess, T. M., Webster, R. & McBratney, A. B. 1981 Optimal
interpolation and isarithmic mapping of soil properties. IV.
Sampling strategy. Journal of Soil Science 32, 643–659.
Chatterjee, S. 1967 A note on optimum stratification. Skandinavisk
Aktuarietidskrift 50, 40–44.
Chevrou, R. B. 1976 Précisions des mesures de superficie estimée
par grille de points ou intersections de parallèles. Annales des
Sciences Forestières 33, 257–269.
Childers, D. L., Sklar, F. H. & Hutchinson, S. E. 1994 Statistical
treatment and comparative analysis of scale-dependent aquatic
transect data in estuarine landscapes. Landscape Ecology 9, 127–
141.
Clark, L. A. & Pregibon, D. 1992 Tree-based models. In Statistical
Models in S (Chambers, J. M. & Hastie, T. J., eds). Wadsworth
& Brooks/Cole Advanced Books & Software, Pacific Grove,
California, pp. 377–419.
Cloern, J. E. 1996 Phytoplankton bloom dynamics in coastal
ecosystems—a review with some general lessons from sustained
investigation of San Francisco Bay (California, USA). Reviews of
Geophysics 34, 127–168.
Cloern, J. E. & Jassby, A. D. 1995 Year-to-year fluctuation of the
spring phytoplankton bloom in South San Francisco Bay: an
example of ecological variability at the land-sea interface. In
Ecological Time Series (Steele, J. H. & Powell, T. M., eds).
Chapman and Hall, New York, pp. 139–149.
Cloern, J. E. & Nichols, F. H. 1985 Temporal dynamics of an
estuary: San Francisco Bay. Hydrobiologia 129, 1–237.
Cochran, W. G. 1946 Relative accuracy of systematic and stratified
random samples for a certain class of populations. Annals of
Mathematical Statistics 17, 164–177.
Cochran, W. G. 1977 Sampling Techniques, 3rd ed. John Wiley &
Sons, New York, 428 pp.
Davis, J. A., Gunther, A. J., Richardson, B. J., O’Connor, J. M.,
Spies, R. B., Wyatt, E., Larson, E. and A. Chan Meiorin. 1991
Status and Trends Report on Pollutants in the San Francisco Estuary.
San Francisco Estuary Project, Oakland, California, 240 pp.
Denman, K. L. & Freeland, H. J. 1985 Correlation scales, objective
mapping and a statistical test of geostrophy over the continental
shelf. Journal of Marine Research 43, 517–539.
Edmunds, J. L., Cole, B. E., Cloern, J. E., Carey, J. M. & Jassby,
A. D. 1995 Studies of the San Francisco Bay, California, Estuarine
Ecosystem: Pilot Regional Monitoring Results, 1994. Open-File
Report 95–378. U.S. Geological Survey, Menlo Park, California,
436 pp.
Gunderson, H. J. G. & Jensen, E. B. 1987 The efficiency of
systematic sampling in stereology and its prediction. Journal of
Microscopy 147, 229–263.
Hájek, J. 1959 Optimum strategy and other problems in probability
sampling. Casopis Pro Pestovani Matematiky 84, 387–420.
Huzzey, L. M., Cloern, J. E. & Powell, T. M. 1990 Episodic
changes in lateral transport and phytoplankton distribution
in south San Francisco Bay. Limnology and Oceanography 35,
472–478.
Isaaks, E. H. & Srivastava, R. M. 1989 An Introduction to Applied
Geostatistics. Oxford University Press, New York.
Jassby, A. D. & Powell, T. M. 1994 Hydrodynamic influences on
interannual chlorophyll variability in an estuary: upper San
Francisco Bay-Delta (California, U.S.A.). Estuarine, Coastal and
Shelf Science 39, 595–618.
Jassby, A. D., Cloern, J. E. & Powell, T. M. 1993 Organic carbon
sources and sinks in San Francisco Bay: variability induced by
river flow. Marine Ecology Progress Series 95, 39–54.
Jassby, A. D., Cloern, J. E., Carey, J. M., Cole, B. E. & Rudek, J.
1994 San Francisco Bay/Delta Regional Monitoring Program:
plankton and water quality pilot study 1993. In San Francisco
Estuary Regional Monitoring Program for Trace Substances: 1993
Annual Report. San Francisco Estuary Institute, Richmond,
California, pp. 117–128.
Jassby, A. D., Kimmerer, W. J., Monismith, S. G. et al. 1995
Isohaline position as a habitat indicator for estuarine populations.
Ecological Applications 5, 272–289.
Koop, J. C. 1971 On splitting a systematic sample for variance
estimation. Annals of Mathematical Statistics 42, 1084–1087.
Legendre, P. 1987 Constrained clustering. In Developments in
Numerical Ecology (Legendre, P. & Legendre, L., eds). Springer-
Verlag, Berlin, pp. 289–307.
Legendre, P. & Troussellier, M. 1988 Aquatic heterotrophic
bacteria: Modeling in the presence of spatial autocorrelation.
Limnology and Oceanography 33, 1055–1067.
Legendre, P., Troussellier, M., Jarry, V. & Fortin, M.-J. F. 1989
Design for simultaneous sampling of ecological variables: from
concepts to numerical solutions. Oikos 55, 30–42.
Lorenzen, C. J. 1967 Determination of chlorophyll and pheo-pigments:
Spectrophotometric equations. Limnology and Oceanography
12, 343–346.
Madden, C. J. & Day, J. W. Jr. 1992 An instrument system for high
speed mapping of chlorophyll a and physico-chemical variables in
surface waters. Estuaries 15, 421–427.
Mahalanobis, P. C. 1946 Recent experiments in statistical sampling
in the Indian Statistical Institute. Journal of the Royal Statistical
Society, Series A 109, 325–370.
Mattfeldt, T. 1989 The accuracy of one-dimensional systematic
sampling. Journal of Microscopy 153, 301–313.
Murthy, M. N. & Rao, T. J. 1988 Systematic sampling with
illustrative examples. In Handbook of Statistics, vol. 6 (Krishnaiah,
P. R. & Rao, C. R., eds). Elsevier Science, Amsterdam,
pp. 147–185.
Oliver, M. A. & Webster, R. 1991 How geostatistics can help you.
Soil Use and Management 7, 206–217.
Peterson, D. H., Conomos, T. J., Broenkow, W. W. & Doherty,
P. C. 1975 Location of the non-tidal current null zone in
northern San Francisco Bay. Estuarine and Coastal Marine Science
3, 1–11.
Powell, T. M., Cloern, J. E. & Walters, R. A. 1986 Phytoplankton
spatial distribution in South San Francisco Bay: mesoscale and
small-scale variability. In Estuarine Variability (Wolfe, D. A., ed.).
Academic Press, London, pp. 369–383.
Powell, T. M., Cloern, J. E. & Huzzey, L. M. 1989 Spatial and
temporal variability in South San Francisco Bay (USA). I.
Horizontal distributions of salinity, suspended sediments, and
phytoplankton biomass and productivity. Estuarine, Coastal and
Shelf Science 28, 583–597.
Rossi, R. E., Mulla, D. J., Journel, A. G. & Franz, E. H. 1992
Geostatistical tools for modeling and interpreting ecological
spatial dependence. Ecological Monographs 62, 277–314.
Simard, Y., Legendre, P. & Lavoie, G. M. D. 1992 Mapping,
estimating biomass, and optimizing sampling programs for
spatially autocorrelated data: case study of the northern shrimp
(Pandalus borealis). Canadian Journal of Fisheries and Aquatic
Sciences 49, 32–45.
Statistical Sciences 1995 S-PLUS, Version 3.3 for Windows.
Statistical Sciences, Seattle, Washington.
Thioulouse, J., Royet, J. P., Ploye, H. & Houllier, F. 1993
Evaluation of the precision of systematic sampling: nugget effect and
covariogram modelling. Journal of Microscopy 172, 249–256.
Walser, W. E. Jr., Hughes, R. G. & Rabalais, S. C. 1992 Multiple
interface data acquisition speeds at-sea research. Sea Technology
33, 29–34.
Wolter, K. M. 1984 An investigation of some estimators of variance
for systematic sampling. Journal of the American Statistical
Association 79, 781–790.
Yates, F. 1960 Sampling Methods for Censuses and Surveys, 3rd ed.
Charles Griffin and Company, London, 440 pp.
Yoder, J. A., McClain, C. R., Blanton, J. O. & Oey, L. Y. 1987
Spatial scales in CZCS-chlorophyll imagery of the southeastern
U.S. continental shelf. Limnology and Oceanography 32, 929–941.