Article

A hierarchical zero-inflated Poisson regression model for stream fish distribution and abundance

Abstract

Ecologists are frequently confronted with the challenge of accurately modelling species abundance. However, this task requires one to deal with both presence/absence and abundance. Traditional Poisson regression models are not adequate when attempting to deal with both issues simultaneously. Zero-inflated regression models have been proposed to deal with this problem with much success. We extend these models to incorporate both a multilevel hierarchical structure and spatial correlation. The model is illustrated using a dataset concerning Hypseleotris galii (Fire-tailed Gudgeon), a species native to eastern Australia.
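
For readers unfamiliar with the zero-inflated Poisson structure the abstract refers to, one common way to write it is sketched below. The symbols are assumptions introduced here for illustration and are not taken from the paper: \psi_i is the probability that the species is present at site i, \lambda_i is the Poisson mean abundance given presence, and the logit and log links with covariate vectors x_i and z_i are a typical choice, not necessarily the authors' specification. The hierarchical and spatial components described in the abstract are not shown.

```latex
\begin{align}
  P(Y_i = 0) &= (1 - \psi_i) + \psi_i \, e^{-\lambda_i}, \\
  P(Y_i = y) &= \psi_i \, \frac{\lambda_i^{y} e^{-\lambda_i}}{y!}, \qquad y = 1, 2, \dots, \\
  \operatorname{logit}(\psi_i) &= x_i^{\top} \alpha, \qquad
  \log(\lambda_i) = z_i^{\top} \beta .
\end{align}
```

Under this form a zero count can arise either because the species is absent (probability 1 - \psi_i) or because it is present but none were counted, which is why a plain Poisson model underestimates the number of zeros.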


... We developed a Bayesian hierarchical zero-inflated Poisson (ZIP) mixture model similar to that developed by Boone et al. (2012). The model estimates the probability of species absence as a mixture of two separate processes; a Bernoulli process that determines the species' presence or absence and a Poisson process that determines the species abundance (Martin et al. 2005, Boone et al. 2012). Parameters describing ecological relationships within the model are estimated at several levels of the hierarchy, described as either Habitat unit, Reach, River or Region (Fig. 3). ...
... distance to the river mouth) which vary in space only. We also quantified the spatial autocorrelation in mean species abundances as well as the local-scale species–environment relationships, g lij estimated in the hierarchical component of the model (Boone et al. 2012). Just as these relationships may vary according to patch context, variation in these relationships may also display some level of spatial autocorrelation. ...
Article
Understanding the determinants of species’ distributions and abundances is a central theme in ecology. The development of statistical models to achieve this has a long history and the notion that the model should closely reflect underlying scientific understanding has encouraged ecologists to adopt complex statistical methods as they arise. In this paper we describe a Bayesian hierarchical model that reflects a conceptual ecological model of multi-scaled environmental determinants of riverine fish species’ distributions and abundances. We illustrate this with distribution and abundance data of a small-bodied fish species, the Empire gudgeon Hypseleotris galii, in the Mary and Albert Rivers, Queensland, Australia. Specifically, the model sought to address: 1) the extent to which landscape-scale abiotic variables can explain the species’ distribution compared to local-scale variables, 2) how local-scale abiotic variables can explain species’ abundances, and 3) how these local-scale relationships are mediated by landscape-scale variables. Overall, the model accounted for around 60% of variation in the distribution and abundance of H. galii. The findings show that the landscape-scale variables explain much of the distribution of the species; however, there was considerable improvement in estimating the species’ distribution with the addition of local-scale variables. There were many strong relationships between abundance and local-scale abiotic variables; however, several of these relationships were mediated by some of the landscape-scale variables. The extent of spatial autocorrelation in the data was relatively low compared to the distances among sampling reaches. Our findings exemplify that Bayesian statistical modelling provides a robust framework for statistical modelling that reflects our ecological understanding. This allows ecologists to address a range of ecological questions with a single unified probability model rather than a series of disconnected analyses.
... Increasingly sophisticated applications of ZIMs have been proposed to model freshwater fish distributions, including models that include species observations with multiple size classes (Kanno et al., 2012), account for non-linear responses of species to environmental conditions, and quantify environmental conditions at multiple spatial scales (Stewart-Koster et al., 2013). Hierarchical model structures have also been proposed to account for spatial autocorrelation among environmental conditions due to the hierarchical structure of the river network (Boone et al., 2012). However, while studies applying ZIMs to marine fish observations have incorporated the observation process (i.e., the equipment used and sampling effort) into their proposed models, there are relatively few studies that propose similar models for freshwater species. ...
Article
Statistical species distribution models (SDMs) are widely used to quantify how taxa respond to environmental conditions and to predict their distribution. However, the application of SDMs to freshwater fish taxa is complicated by the active dispersal of fish taxa through river networks, and the species- and habitat-dependent observation process (i.e., the sampling method and effort) required to accurately sample their distributions. Many studies have applied presence-absence models (PAMs) to fish taxa, while more recent studies have proposed zero-inflated models (ZIMs) to account for count observations with many zeroes. However, relatively few studies have incorporated the observation process into the model structure, which would facilitate the combination of data from various monitoring programs that differ in their observation process. In this study, we use conceptual models to identify potentially dominant natural and anthropogenic environmental conditions with a direct, mechanistic effect on the distributions of freshwater fish taxa in Switzerland, a region with a large range of environmental conditions, from alpine streams that are mainly affected by hydromorphological alterations to lowland streams in densely populated areas with intensive agricultural land use. Moreover, numerous barriers impede fish migration along the entire river network. Using combined data from two fish monitoring programs in Switzerland, we applied an exhaustive cross-validation procedure to select a set of environmental variables with the highest (out-of-sample) predictive performance for the PAM and ZIM for fish density (individuals/m²) of the seven most prevalent fish taxa (Salmo spp., Cottus spp., Squalius spp., Barbatula spp., Barbus spp., Phoxinus spp., Gobio spp.). We used these variables to develop a PAM and ZIM for each taxon that accounts for differences in sampling methods and sampling effort. 
We quantified the quality of fit during calibration using all samples and predictive performance during 5-fold cross-validation of each model. Results show that stream temperature and stream morphology within the accessible habitat commonly appear among the best predictive presence-absence models for multiple taxa. Spatial variables that account for migration barriers and quantify morphological conditions within the accessible habitat were selected for 6 out of 7 taxa. The selected PAMs performed well for all taxa with an intermediate prevalence (10–40%), with an explanatory power (D2) between 0.32 and 0.37 during calibration using all samples and only minor decreases in explanatory power during cross-validation (D2 = 0.34–0.44). As expected, the PAM for the highly prevalent Salmo spp. (91%) failed to predict the few absence data points. By contrast, the ZIM performed best for Salmo spp., with a standardized likelihood ratio of 1.56. For all other taxa besides Barbus spp. the ZIMs also had likelihood ratios above one, indicating a better predictive performance than the null model. We hope this study stimulates the development and application of fish species distribution models based on prior knowledge of causally linked environmental variables and incorporating observation errors to improve their predictive performance. This can facilitate learning from biomonitoring data to support management.
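
The abstract reports explanatory power (D2) for the presence-absence models without defining it. A minimal sketch follows, assuming D2 is the usual explained deviance, 1 minus the ratio of model deviance to null deviance for a Bernoulli response. The function names are hypothetical, and this is an illustration of that common definition rather than the authors' computation.

```python
import math

def bernoulli_deviance(y, prob):
    """Bernoulli deviance: -2 * log-likelihood of 0/1 outcomes y under probabilities prob."""
    eps = 1e-12  # clip probabilities so the logarithm stays finite
    dev = 0.0
    for yi, pi in zip(y, prob):
        pi = min(max(pi, eps), 1.0 - eps)
        dev += -2.0 * (yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi))
    return dev

def explained_deviance(y, prob):
    """D2 = 1 - model deviance / null deviance (null model predicts the prevalence everywhere)."""
    prevalence = sum(y) / len(y)
    null_dev = bernoulli_deviance(y, [prevalence] * len(y))
    return 1.0 - bernoulli_deviance(y, prob) / null_dev

# Hypothetical presence-absence data and fitted probabilities.
observed = [1, 0, 1, 0, 1, 0]
fitted = [0.8, 0.2, 0.7, 0.3, 0.9, 0.1]
print(explained_deviance(observed, fitted))
```

A model that predicts the overall prevalence at every site scores 0 under this definition, and values approach 1 as the fitted probabilities approach the observed outcomes.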
... Although there is much research on spatial zero-inflated models, limited examples are available on extending ZIP models to multiscale data (Boone, Stewart-Koster, & Kennard, 2012). On the other hand, many studies have been done on multiscale models to address scaling effects (Fonseca & Ferreira, 2017; Kolaczyk & Huang, 2001; Louie & Kolaczyk, 2004, 2006a), but these models have not been applied to zero-inflated data. ...
Article
It is our primary focus to study the spatial distribution of disease incidence at different geographical levels. Often, spatial data are available in the form of aggregation at multiple scale levels such as census tract, county, and state. When data are aggregated from a fine (e.g., county) to a coarse (e.g., state) geographical level, there will be loss of information. The problem is more challenging when excessive zeros are available at the fine level. After data aggregation, the excessive zeros at the fine level will be reduced at the coarse level. If we ignore the zero inflation and the aggregation effect, we could get inconsistent risk estimates at the fine and coarse levels. Hence, in this paper, we address those problems using zero-inflated multiscale models that jointly describe the risk variations at different geographical levels. For the excessive zeros at the fine level, we use a zero-inflated convolution model, whereas we consider a regular convolution model for the smoothed data at the coarse level. These methods provide a consistent risk estimate at the fine and coarse levels when high percentages of structural zeros are present in the data.
... In the presence of an excess number of zeros, an alternative modeling approach can be obtained using a zero-inflated negative binomial (or Poisson) regression. Zero-inflated models consist of a mixture where a zero-inflated structure is incorporated such that there are two classes of zeros in the count process, one coming from a point mass and the other from a non-truncated process (Boone et al. 2012). However, the Gibbs sampler proposed in this paper is not straightforward or generalizable to zero-inflated counts and should be considered for future research. ...
Article
Whole genome prediction models are useful tools for breeders when selecting candidate individuals early in life for rapid genetic gains. However, most prediction models developed so far assume that the response variable is continuous and that its empirical distribution can be approximated by a Gaussian model. A few models have been developed for ordered categorical phenotypes, but there is a lack of genomic prediction models for count data. There are well-established regression models for count data that cannot be used for genomic-enabled prediction because they were developed for a large sample size (n) and a small number of parameters (p); however, the rule in genomic-enabled prediction is that p is much larger than the sample size n. Here we propose a Bayesian mixed negative binomial (BMNB) regression model for counts, and we present the conditional distributions necessary to efficiently implement a Gibbs sampler. The proposed Bayesian inference can be implemented routinely. We evaluated the proposed BMNB model together with a Poisson model, a Normal model with untransformed response, and a Normal model with transformed response using a logarithm, and applied them to two real wheat datasets from the International Maize and Wheat Improvement Center. Based on the criteria used for assessing genomic prediction accuracy, results indicated that the BMNB model is a viable alternative for analyzing count data.
... No spatial autocorrelation was detected in the model residuals. To predict the future spatial distribution of bass, we used the 7DAD mean water temperatures forecasted for early and late summer of the 2040s and 2080s in the fitted Bayesian hierarchical model, and then obtained the median of the posterior predictive distribution of bass abundances per reach (e.g., Boone et al. 2012). ...
Article
Predicting how climate change is likely to interact with myriad other stressors that threaten species of conservation concern is an essential challenge in aquatic ecosystems. This study provides a framework to accomplish this task in salmon-bearing streams of the northwestern United States, where land-use related reductions in riparian shading have caused changes in stream thermal regimes, and additional warming from projected climate change may result in significant losses of coldwater fish habitat over the next century. Predatory non-native smallmouth bass have also been introduced into many northwestern streams and their range is likely to expand as streams warm, presenting an additional challenge to the persistence of threatened Pacific salmon. The goal of this work was to forecast the interactive effects of climate change, riparian management, and non-native species on stream-rearing salmon, and to evaluate the capacity of restoration to mitigate these effects. We intersected downscaled global climate forecasts with a local-scale water temperature model to predict mid- and end-of-century temperatures in streams in the Columbia River basin; we compared one stream that is thermally impaired due to the loss of riparian vegetation and another that is cooler and has a largely intact riparian corridor. Using the forecasted stream temperatures in conjunction with fish-habitat models, we predicted how stream-rearing Chinook salmon and bass distributions would change as each stream warmed. In the highly modified stream, end-of-century warming may cause near total loss of Chinook salmon rearing habitat and a complete invasion of the upper watershed by bass. In the less modified stream, bass were thermally restricted from the upstream-most areas. In both systems, temperature increases resulted in higher predicted spatial overlap between stream-rearing Chinook salmon and potentially predatory bass in the early summer (2-4-fold increase) and greater abundance of bass. 
We found that riparian restoration could prevent the extirpation of Chinook salmon from the more altered stream, and could also restrict bass from occupying the upper 31 km of salmon rearing habitat. The proposed methodology and model predictions are critical for prioritizing climate-change adaptation strategies before salmonids are exposed to both warmer water and greater predation risk by non-native species.
... Specification of a zero-inflated or hurdle mixed effects model in a Bayesian framework is considerably more complex. Boone et al. (2012) successfully specified a zero-inflated Poisson regression model with multilevel hierarchical structure and spatial correlation to model stream fish distribution and abundance. We attempted to incorporate the complex design of DEMO experiment into both zero-inflated and hurdle regression models, but the models failed to converge even with a small number of predictors. ...
... The Bayesian inference is consistent with the scientific process of progressive learning and offers a natural mechanism for sequentially updating beliefs (specified in terms of model parameters) every time new data are collected from the system and for predicting the consequences of future management actions, while properly accounting for uncertainty in the updated beliefs (Arhonditsis et al., 2008a). Recent research has also shown that the Bayesian paradigm can effectively alleviate problems of spatiotemporal resolution mismatch among different submodels of integrated environmental modeling systems, overcome the conceptual or scale misalignment between processes of interest and supporting information, exploit disparate sources of information that differ with regard to the measurement error and resolution, and accommodate tightly intertwined environmental processes operating at different spatiotemporal scales (Boone et al., 2012; Hooten et al., 2011; Qian et al., 2010; Wikle, 2003; Wikle et al., 1998; Zhang and Arhonditsis, 2009). Several recent studies have attempted to demonstrate the benefits of Bayesian inference techniques in the context of model-based water quality management. ...
Article
Count regression models maintain a steadfast presence in modern applied statistics as highlighted by their usage in diverse areas like biometry, ecology, and insurance. However, a common practical problem with observed count data is the presence of excess zeros relative to the assumed count distribution. The seminal work of Lambert (1992) was one of the first articles to thoroughly treat the problem of zero-inflated count data in the presence of covariates. Since then, a vast literature has emerged regarding zero-inflated count regression models. In this first of two review articles, we survey some of the classic and contemporary literature on parametric zero-inflated count regression models, with emphasis on the utility of different univariate discrete distributions. We highlight some of the primary computational tools available for estimating and assessing the adequacy of these models. We concurrently emphasize the diverse data problems to which these models have been applied.
This article is categorized under:
• Statistical Models > Generalized Linear Models
• Software for Computational Statistics > Software/Statistical Software
• Algorithms and Computational Methods > Maximum Likelihood Methods
Graphical abstract: Estimated conditional mean portions of four zero-inflated (ZI) regression models: ZI Poisson (ZIP), ZI negative binomial (ZINB), ZI generalized Poisson (ZIGP), and ZI Conway–Maxwell–Poisson (ZICMP).
Article
There is a rich literature on the analysis of longitudinal data with missing values. However, the analysis becomes complex for semi-continuous (zero-inflated) longitudinal response with missingness. In this article, we propose a partially varying coefficients regression model for analysing such data. We use a two-part model, where in the first part we propose a latent dynamic model for accounting a ‘zero’ or a ‘non-zero’ response, and in the second part we use another dynamic model for estimating the mean trajectories of non-zero responses. Two dynamic models are linked through subject-specific random effects. The missing covariates are imputed repeatedly based on their respective posterior predictive distributions and the missing responses are imputed using the working model under different identifying restrictions. We analyse data from the Health and Retirement Study (HRS) for aged individuals and develop a dynamic model for predicting out-of-pocket medical expenditures (OOPME) containing excess zeros. The operating characteristics of the proposed model are investigated through extensive simulation studies.
Article
Models that formulate mathematical linkages between fish use and habitat characteristics are applied for many purposes. For riverine fish, these linkages are often cast as resource selection functions with variables including depth and velocity of water and distance to nearest cover. Ecologists are now recognizing the role that detection plays in observing organisms, and failure to account for imperfect detection can lead to spurious inference. Herein, we present a flexible N-mixture model to associate habitat characteristics with the abundance of riverine salmonids that simultaneously estimates detection probability. Our formulation has the added benefits of accounting for demographics variation and can generate probabilistic statements regarding intensity of habitat use. In addition to the conceptual benefits, model application to data from the Trinity River, California, yields interesting results. Detection was estimated to vary among surveyors, but there was little spatial or temporal variation. Additionally, a weaker effect of water depth on resource selection is estimated than that reported by previous studies not accounting for detection probability. N-mixture models show great promise for applications to riverine resource selection.
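
To make the N-mixture idea concrete, here is a minimal sketch of the basic form of such a model, not the authors' formulation. It assumes a latent abundance N at a site drawn from Poisson(lam) and each repeated count drawn from Binomial(N, p), with constant lam and p and no habitat covariates, surveyor effects, or demographic structure. The function names and the truncation bound n_max are hypothetical.

```python
import math

def poisson_pmf(n, lam):
    """Poisson probability mass, computed on the log scale to avoid overflow."""
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))

def binom_pmf(y, n, p):
    """Binomial probability of detecting y of n individuals with detection probability p."""
    return math.comb(n, y) * p**y * (1.0 - p) ** (n - y)

def site_likelihood(counts, lam, p, n_max=200):
    """Marginal likelihood of repeated counts at one site, summing out the latent abundance N."""
    total = 0.0
    # N cannot be smaller than the largest count observed at the site.
    for n in range(max(counts), n_max + 1):
        term = poisson_pmf(n, lam)
        for y in counts:
            term *= binom_pmf(y, n, p)
        total += term
    return total

# Hypothetical example: three repeat surveys at one site.
print(site_likelihood([3, 2, 4], lam=6.0, p=0.5))
```

Repeated counts at the same site are what allow abundance and detection to be separated. With a single visit the marginal distribution collapses to Poisson(lam * p), so only the product of the two is identifiable.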
Conference Paper
The Flood Pulse Concept in River-Floodplain Systems
Wolfgang J. Junk, Max Planck Institut für Limnologie, August Thienemann Strasse 2, Postfach 165, D-2320 Plön, West Germany
Peter B. Bayley and Richard E. Sparks, Illinois Natural History Survey, 607 E. Peabody Dr., Champaign, IL 61820, USA
Abstract
JUNK, W. J., P. B. BAYLEY, AND R. E. SPARKS. 1989. The flood pulse concept in river-floodplain systems, p. 110-127. In D. P. Dodge [ed.] Proceedings of the International Large River Symposium. Can. Spec. Publ. Fish. Aquat. Sci. 106.
The principal driving force responsible for the existence, productivity, and interactions of the major biota in river-floodplain systems is the flood pulse. A spectrum of geomorphological and hydrological conditions produces flood pulses, which range from unpredictable to predictable and from short to long duration. Short and generally unpredictable pulses occur in low-order streams or heavily modified systems with floodplains that have been leveed and drained by man. Because low-order stream pulses are brief and unpredictable, organisms have limited adaptations for directly utilizing the aquatic/terrestrial transition zone (ATTZ), although aquatic organisms benefit indirectly from transport of resources into the lotic environment. Conversely, a predictable pulse of long duration engenders organismic adaptations and strategies that efficiently utilize attributes of the ATTZ. This pulse is coupled with a dynamic edge effect, which extends a "moving littoral" throughout the ATTZ. The moving littoral prevents prolonged stagnation and allows rapid recycling of organic matter and nutrients, thereby resulting in high productivity. Primary production associated with the ATTZ is much higher than that of permanent water bodies in unmodified systems. Fish yields and production are strongly related to the extent of accessible floodplain, whereas the main river is used as a migration route by most of the fishes. In temperate regions, light and/or temperature variations may modify the effects of the pulse, and anthropogenic influences on the flood pulse or floodplain frequently limit production. A local floodplain, however, can develop by sedimentation in a river stretch modified by a low head dam. Borders of slowly flowing rivers turn into floodplain habitats, becoming separated from the main channel by levées. The flood pulse is a "batch" process and is distinct from concepts that emphasize the continuous processes in flowing water environments, such as the river continuum concept. Floodplains are distinct because they do not depend on upstream processing inefficiencies of organic matter, although their nutrient pool is influenced by periodic lateral exchange of water and sediments with the main channel. The pulse concept is distinct because the position of a floodplain within the river network is not a primary determinant of the processes that occur. The pulse concept requires an approach other than the traditional limnological paradigms used in lotic or lentic systems.
Résumé
Flooding caused by rising water in river-floodplain systems is the principal factor determining the nature and productivity of the dominant biota, as well as the interactions among organisms and between organisms and their environment. These transient floods, whose duration and predictability vary, are produced by a set of geomorphological and hydrological factors. Short floods, generally unpredictable, occur in sparsely branched drainage networks or in networks that have undergone major changes following the diking and draining of floodplains by humans. Because floods in low-order drainage networks are brief and unpredictable, the adaptations of living organisms for exploiting the resources of the transition zone between the aquatic and terrestrial environments (ATTZ) are limited, although aquatic organisms benefit indirectly from materials transported into the lotic environment. Conversely, a predictable flood of long duration favours the development of adaptations and strategies that allow organisms to exploit the ATTZ efficiently. Such a flood is accompanied by a dynamic edge effect that turns the ATTZ into a "moving littoral". Under these circumstances there is no prolonged stagnation, and organic matter and nutrients are recycled rapidly, which results in high productivity. Primary production in the ATTZ is much higher than that of permanent water bodies in unmodified drainage networks. Fish yield and production are closely related to the extent of the floodplain, while the main course of the river is used as a migration route by most fish.
Article
Zero-inflated Poisson (ZIP) regression is a model for count data with excess zeros. It assumes that with probability p the only possible observation is 0, and with probability 1 – p, a Poisson(λ) random variable is observed. For example, when manufacturing equipment is properly aligned, defects may be nearly impossible. But when it is misaligned, defects may occur according to a Poisson(λ) distribution. Both the probability p of the perfect, zero defect state and the mean number of defects λ in the imperfect state may depend on covariates. Sometimes p and λ are unrelated; other times p is a simple function of λ such as p = 1/(1 + λ^τ) for an unknown constant τ. In either case, ZIP regression models are easy to fit. The maximum likelihood estimates (MLE's) are approximately normal in large samples, and confidence intervals can be constructed by inverting likelihood ratio tests or using the approximate normality of the MLE's. Simulations suggest that the confidence intervals based on likelihood ratio tests are better, however. Finally, ZIP regression models are not only easy to interpret, but they can also lead to more refined data analyses. For example, in an experiment concerning soldering defects on printed wiring boards, two sets of conditions gave about the same mean number of defects, but the perfect state was more likely under one set of conditions and the mean number of defects in the imperfect state was smaller under the other set of conditions; that is, ZIP regression can show not only which conditions give lower mean number of defects but also why the means are lower.
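
The mixture described in this abstract (a zero with probability p, otherwise a Poisson(λ) draw) can be sketched in a few lines. This is a minimal illustration with constant p and λ, without the covariate dependence the abstract mentions. The function names are hypothetical and not taken from any library.

```python
import math

def poisson_pmf(y, lam):
    """Poisson probability mass, computed on the log scale for numerical stability."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

def zip_pmf(y, p, lam):
    """P(Y = y) when Y is 0 with probability p and Poisson(lam) with probability 1 - p."""
    base = (1.0 - p) * poisson_pmf(y, lam)
    # Zeros come from both the perfect state and the Poisson state.
    return p + base if y == 0 else base

def zip_loglik(counts, p, lam):
    """Log-likelihood of independent counts under constant p and lam."""
    return sum(math.log(zip_pmf(y, p, lam)) for y in counts)

def zip_mean(p, lam):
    """Expected count: only the Poisson state contributes to the mean."""
    return (1.0 - p) * lam

# Hypothetical defect counts with many zeros.
counts = [0, 0, 0, 2, 0, 1, 0, 3, 0, 0]
print(zip_loglik(counts, p=0.5, lam=1.5))
```

Because the mean is (1 - p) * λ, two settings can share the same mean while differing in p and λ, which is the distinction the soldering example in the abstract draws.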
Article
Full-text available
We assessed the relative importance of environmental variation, interspecific competition for space, and predator abundance on assemblage structure and microhabitat use in a stream fish assemblage inhabiting Coweeta Creek, North Carolina, USA. Our study encompassed a 10-yr time span (1983-1992) and included some of the highest and lowest flows in the last 58 years. We collected 16 seasonal samples which included data on: (1) habitat availability (total and microhabitat) and microhabitat diversity, (2) assemblage structure (i.e., the number and abundances of species comprising a subset of the community), and (3) microhabitat use and overlap. We classified habitat availability data on the basis of year, season, and hydrologic period. Hydrologic period (i.e., pre-drought (PR), drought (D), and post-drought (PO)) represented the temporal location of a sample with respect to a four-year drought that occurred during the study. Hydrologic period explained a greater amount of variance in habitat availability data than either season or year. Total habitat availability was significantly greater during PO than in PR or D, although microhabitat diversity did not differ among either seasons or hydrologic periods. There were significantly fewer high-flow events (i.e., ≥2.1 m³/s) during D than in either PR or PO periods. We observed a total of 16 species during our investigation, and the total number of species was significantly higher in D than in PR samples. Correlation analyses between the number of species present (total and abundant species) and environmental data yielded limited results, although the total number of species was inversely correlated with total habitat availability. A cluster analysis grouped assemblage structure samples by hydrologic period rather than season or year, supporting the contention that variation in annual flow had a strong impact on this assemblage.
The drought had little effect on the numerical abundance of benthic species in this assemblage; however, a majority of water-column species increased in abundance. The increased abundances of water-column species may have been related to the decrease in high-flow events observed during the drought. Such high-flow events are known to cause mortality in stream fishes. Microhabitat use data showed that species belonged to one of three microhabitat guilds: benthic, lower water column, and mid water column. In general, species within the same guild did not exhibit statistically distinguishable patterns of microhabitat use, and most significant differences occurred between members of different guilds. However, lower water-column guild species frequently were not separable from all members of either benthic or mid-water-column species. Variations in the abundance of potential competitors or predators did not produce strong shifts in microhabitat use by assemblage members. Predators were present in the site in only 9 of 16 seasonal samples and never were abundant (maximum number observed per day was 2). In conclusion, our results demonstrate that variability in both mean and peak flows had a much stronger effect on the structure and use of spatial resources within this assemblage than either interspecific competition for space or predation. Consequently, we suspect that the patterns in both assemblage structure and resource use displayed by fishes in Coweeta Creek arose from the interaction between environmental variation and species-specific evolutionary constraints on behavior, morphology, and physiology.
Article
Full-text available
We examine evidence for the structuring of fish communities from stream and lake systems and the roles of biotic, abiotic, and spatial factors in determining the species composition. Piscivory by fish is a dominant factor in both stream and lake systems whereas evidence for the importance of competition appears less convincing. Within small streams or lakes, the impact of predation may exclude other species, thereby leading to mutually exclusive distributions and strong differences in community composition. Within a geographic region, abiotic effects frequently dictate the relative importance of piscivory, thereby indirectly influencing the composition of prey species present. The spatial scale of studies influences our perceived importance of biotic versus abiotic factors, with small-scale studies indicating a greater importance of competition and large-scale studies emphasizing abiotic controls. The scale of the individual sites considered is critical because smaller systems have higher variability and wider extremes of conditions than larger lakes and rivers. The stability of physical systems and degree of spatial connectivity contribute to increased diversity in both larger stream and larger lake systems. We identify challenges and needs that must be addressed both to advance the field of fish community ecology and to face the problems associated with human-induced changes.
Article
Full-text available
1. Dam removal has great potential for restoring rivers and streams, yet limited data exist documenting recovery of associated biota within these systems following removals, especially on larger systems. This study examined the effects of a dam breach on benthic macroinvertebrate and fish assemblages in the Fox River, Illinois, U.S.A. 2. Benthic macroinvertebrates and fish were collected above and below the breached dam and three nearby intact dams for 1 year pre‐ and 3 years post‐breach (2 years of additional pre‐breach fish data were obtained from previous surveys). We also examined the effects of the breach on associated habitat by measuring average width, depth, flow rate and bed particle size at each site. 3. Physical habitat at the former impoundment (IMP) became comparable to free‐flowing sites (FF) within 1 year of the breach (width and depth decreased, flow rate and bed particle size increased). We also found a strong temporal effect on depth and flow rate at all surveyed sites. 4. Following the breach, relative abundance of Ephemeroptera, Plecoptera and Trichoptera (largely due to hydropsychid caddisflies) increased, whereas relative abundance of Ostracoda decreased, in the former IMP to levels comparable to FF sites. High variation in other metrics (e.g. total taxa, diversity) precluded determination of an effect of the breach on these aspects of the assemblage. However, non‐metric multidimensional scaling (NMDS) ordinations indicated that overall macroinvertebrate assemblage structure at the former IMP shifted to a characteristically FF assemblage 2 years following the breach. 5. Total fish taxa and a regional fish index of biotic integrity became more similar in the former IMP to FF sites following the breach. However, other fish metrics (e.g. biomass, diversity, density) did not show a strong response to the breach of the dam. 
Ordinations of abundance data suggested the fish assemblage only slightly shifted to FF characteristics 3 years after the breach. 6. Effects of the breach to the site immediately below the former dam included minor alterations in habitat (decreased flow rate and increased particle size) and short‐term changes in several macroinvertebrate metrics (e.g. decreased assemblage diversity and EPT richness for first post‐year), but longer‐term alterations in several fish metrics (e.g. decreased assemblage richness for all three post‐years; decreased density for first two post‐years). However, NMDS ordinations suggested no change to overall assemblage structure for both macroinvertebrates and fish following the breach at this downstream site. 7. Collectively, our results support the effectiveness of dam removal as a restoration practice for impaired streams and rivers. However, differences in response times of macroinvertebrates and fish coupled with the temporal effect on several habitat variables highlight the need for longer‐term studies.
Article
Full-text available
1. Society benefits immeasurably from rivers. Yet over the past century, humans have changed rivers dramatically, threatening river health. As a result, societal well-being is also threatened because goods and services critical to human society are being depleted. 2. ‘Health’— shorthand for good condition (e.g. healthy economy, healthy communities) — is grounded in science yet speaks to citizens. 3. Applying the concept of health to rivers is a logical outgrowth of scientific principles, legal mandates, and changing societal values. 4. Success in protecting the condition, or health, of rivers depends on realistic models of the interactions of landscapes, rivers, and human actions. 5. Biological monitoring and biological endpoints provide the most integrative view of river condition, or river health. Multimetric biological indices are an important and relatively new approach to measuring river condition. 6. Effective multimetric indices depend on an appropriate classification system, the selection of metrics that give reliable signals of river condition, systematic sampling protocols that measure those biological signals, and analytical procedures that extract relevant biological patterns. 7. Communicating results of biological monitoring to citizens and political leaders is critical if biological monitoring is to influence environmental policies. 8. Biological monitoring is essential to identify biological responses to human actions. By using the results to describe the condition, or health, of rivers and their adjacent landscapes and to diagnose causes of degradation, we can develop restoration plans, estimate the ecological risks associated with land use plans in a watershed, or select among alternative development options to minimize river degradation.
Article
Full-text available
Longitudinal analysis of the distribution and abundance of river fishes provides a context-specific characterization of species responses to riverscape heterogeneity. We examined spatially continuous longitudinal profiles (35–70 km) of fish distribution and aquatic habitat (channel gradient, depth, temperature, and water velocity) for three northeastern Oregon rivers. We evaluated spatial patterns of river fishes and habitat using multivariate analysis to compare gradients in fish assemblage structure among rivers and at multiple spatial scales. Spatial structuring of fish assemblages exhibited a generalized pattern of cold- and coolwater fish assemblage zones but was variable within thermal zones, particularly in the warmest river. Landscape context (geographic setting and thermal condition) influenced the observed relationship between species distribution and channel gradient. To evaluate the effect of spatial extent and geographical context on observed assemblage patterns and fish–habitat relationships, we performed multiple ordinations on subsets of our data from varying lengths of each river and compared gradients in assemblage structure within and among rivers. The relative associations of water temperature increased and channel morphology decreased as the spatial scale of analysis increased. The crossover point where both variables explained equal amounts of variation was useful for identifying transitions between cool- and coldwater fish assemblages. Spatially continuous analysis of river fishes and their habitats revealed unexpected ecological patterns and provided a unique perspective on fish distribution that emphasized the importance of habitat heterogeneity and spatial variability in fish–habitat relationships.
Article
Full-text available
Habitat models serve three main purposes: First, to predict species occurrences on the basis of abiotic and biotic variables, second to improve the understanding of species-habitat relationships and third, to quantify habitat requirements. The use of statistical models to predict the likely occurrence or distribution of species based on relevant variables is becoming an increasingly important tool in conservation planning and wildlife management. This article aims to provide an overview of the current status of development and application of statistical methodologies for analysing the species-environment association, with a clear emphasis on aquatic habitat. It describes the main types of univariate and multivariate techniques available for analysis of species-environment association, and specifically focuses on the assessment of the strengths and weaknesses of the available statistical methods to estimate habitat suitability. A second objective of this article is to propose new approaches using existing statistical methods. A wide array of habitat statistical models has been developed to analyse habitat-species relationship. Generally, physical habitat is dependent on more than one variable (e.g. depth, velocity, substrate, cover) and several suitability indices must be combined to define a composite index. Multivariate approaches are more appropriate for the analysis of aquatic habitat as they inherently consider the interrelation and correlation structure of the environmental variables. Ordinary multiple linear regression and logistic regression are popular methods often used for modelling of species and their relationships with environment. Ridge regression and Principal component regression are particularly useful when the independent variables are highly correlated. More recent regression modelling paradigms like generalized linear models (GLMs) present advantages in dealing with non-normal environmental variables. 
Generalized additive models (GAMs) and artificial neural networks are better suited for analysis of non-linear relationships between species distribution and environmental variables. The fuzzy logic approach presents advantages in dealing with uncertainties that often exist in habitat modelling. Appropriate methods for analysis of multi-species data are also presented. Finally, the few existing comparative studies for predictive modelling are reviewed, and advantages and disadvantages of different methods are discussed. Copyright © 2006 John Wiley & Sons, Ltd.
Article
Full-text available
A common feature of ecological data sets is their tendency to contain many zero values. Statistical inference based on such data is likely to be inefficient or wrong unless careful thought is given to how these zeros arose and how best to model them. In this paper, we propose a framework for understanding how zero-inflated data sets originate and deciding how best to model them. We define and classify the different kinds of zeros that occur in ecological data and describe how they arise: either from ‘true zero’ or ‘false zero’ observations. After reviewing recent developments in modelling zero-inflated data sets, we use practical examples to demonstrate how failing to account for the source of zero inflation can reduce our ability to detect relationships in ecological data and at worst lead to incorrect inference. The adoption of methods that explicitly model the sources of zero observations will sharpen insights and improve the robustness of ecological analyses.
Article
Full-text available
SUMMARY 1. The prediction of species distributions is of primary importance in ecology and conservation biology. Statistical models play an important role in this regard; however, researchers have little guidance when choosing between competing methodologies because few comparative studies have been conducted. 2. We provide a comprehensive comparison of traditional and alternative techniques for predicting species distributions using logistic regression analysis, linear discriminant analysis, classification trees and artificial neural networks to model: (1) the presence/absence of 27 fish species as a function of habitat conditions in 286 temperate lakes located in south-central Ontario, Canada and (2) simulated data sets exhibiting deterministic, linear and non-linear species response curves. 3. Detailed evaluation of model predictive power showed that approaches produced species models that differed in overall correct classification, specificity (i.e. ability to correctly predict species absence) and sensitivity (i.e. ability to correctly predict species presence) and in terms of which of the study lakes they correctly classified. On average, neural networks outperformed the other modelling approaches, although all approaches predicted species presence/absence with moderate to excellent success. 4. Based on simulated non-linear data, classification trees and neural networks greatly outperformed traditional approaches, whereas all approaches exhibited similar correct classification rates when modelling simulated linear data. 5. Detailed evaluation of model explanatory insight showed that the relative importance of the habitat variables in the species models varied among the approaches, where habitat variable importance was similar among approaches for some species and very different for others. 6. 
In general, differences in predictive power (both correct classification rate and identity of the lakes correctly classified) among the approaches corresponded with differences in habitat variable importance, suggesting that non-linear modelling approaches (i.e. classification trees and neural networks) are better able to capture and model complex, non-linear patterns found in ecological data. The results from the comparisons using simulated data further support this notion. 7. By employing parallel modelling approaches with the same set of data and focusing on comparing multiple metrics of predictive performance, researchers can begin to choose predictive models that not only provide the greatest predictive power, but also best fit the proposed application.
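The three predictive-performance measures named in this abstract (overall correct classification, sensitivity and specificity) can be computed from paired presence/absence vectors. The following is a minimal sketch under that assumption; the function name classification_metrics and the example data are hypothetical and are not taken from the study.

```python
def classification_metrics(observed, predicted):
    """Summarise presence/absence predictions.

    observed, predicted : equal-length sequences of 0 (absent) / 1 (present)
    Returns overall correct classification rate, sensitivity (presences
    correctly predicted) and specificity (absences correctly predicted).
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")

    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, q in pairs if o == 1 and q == 1)  # true presences
    tn = sum(1 for o, q in pairs if o == 0 and q == 0)  # true absences
    fp = sum(1 for o, q in pairs if o == 0 and q == 1)  # false presences
    fn = sum(1 for o, q in pairs if o == 1 and q == 0)  # false absences

    def ratio(num, den):
        # Undefined when a class is missing from the observations.
        return num / den if den > 0 else float("nan")

    return {
        "correct_classification": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


if __name__ == "__main__":
    observed = [1, 1, 1, 0, 0, 0, 0, 1]   # made-up lake survey outcomes
    predicted = [1, 0, 1, 0, 0, 1, 0, 1]  # made-up model predictions
    print(classification_metrics(observed, predicted))
```

Reporting all three values matters for rare species, where a model that always predicts absence can score a high overall rate while having zero sensitivity.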
Article
Full-text available
In this article we use moving averages to develop new classes of models in a flexible modeling framework for stream networks. Streams and rivers are among our most important resources, yet models with autocorrelated errors for spatially continuous stream networks have been described only recently. We develop models based on stream distance rather than on Euclidean distance. Spatial autocovariance models developed for Euclidean distance may not be valid when using stream distance. We begin by describing a stream topology. We then use moving averages to build several classes of valid models for streams. Various models are derived depending on whether the moving average has a "tail-up" stream, a "tail-down" stream, or a "two-tail" construction. These models also can account for the volume and direction of flowing water. The data for this article come from the Ecosystem Health Monitoring Program in Southeast Queensland, Australia, an important national program aimed at monitoring water quality. We model two water chemistry variables, pH and conductivity, for sample sizes close to 100. We estimate fixed effects and make spatial predictions. One interesting aspect of stream networks is the possible dichotomy of autocorrelation between flow-connected and flow-unconnected locations. For this reason, it is important to have a flexible modeling framework, which we achieve on the example data using a variance component approach.
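To make the distinction between stream distance and flow-connectedness concrete, here is a small sketch on a toy dendritic network. It does not implement the moving-average covariance models of the article. It assumes, purely for illustration, that sites sit at network nodes and that the network is stored as a mapping from each node to its downstream neighbour and the length of the connecting segment; all names and lengths are hypothetical.

```python
def distances_downstream(node, downstream):
    """Map each node reachable by travelling downstream from `node`
    (including `node` itself) to the distance travelled to reach it."""
    reached = {node: 0.0}
    total = 0.0
    while node in downstream:
        node, length = downstream[node]
        total += length
        reached[node] = total
    return reached


def stream_distance(a, b, downstream):
    """Shortest distance between a and b measured along the network."""
    from_a = distances_downstream(a, downstream)
    from_b = distances_downstream(b, downstream)
    shared = set(from_a) & set(from_b)
    if not shared:
        raise ValueError("sites are not on the same network")
    # The nearest shared node is where the two downstream paths meet.
    return min(from_a[n] + from_b[n] for n in shared)


def flow_connected(a, b, downstream):
    """True if water flows from one site to the other."""
    return (b in distances_downstream(a, downstream)
            or a in distances_downstream(b, downstream))


if __name__ == "__main__":
    # Toy network: tributaries A and B join at C, which drains to outlet O.
    downstream = {"A": ("C", 2.0), "B": ("C", 3.0), "C": ("O", 4.0)}
    print(stream_distance("A", "B", downstream), flow_connected("A", "B", downstream))
    print(stream_distance("A", "O", downstream), flow_connected("A", "O", downstream))
```

In this toy network A and B are close in stream distance but not flow-connected, which is the kind of pair for which the abstract notes autocorrelation may behave differently.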
Article
Full-text available
We investigated the relationships between different environmental variables and the spatial distribution patterns of the stoneloach (Barbatula barbatula) at the stream system, the stream site, and the mesohabitat (riffle/pool) scales in south-western France. Stoneloach occurred at 240 sites (out of 554 sampling sites), chiefly close to the source, in areas at low elevation and with weak slopes. Population density at a site was primarily influenced by physical conditions. Stream width was positively related to the probability of presence of stoneloach within the stream system, but negatively related to local density. These results indicate that stoneloaches can occur in a wide range of streams, but they are less abundant in wide rivers, probably because of lower habitat heterogeneity. Slope was negatively correlated with both fish presence at the regional scale and local density, suggesting that the stoneloach's swimming performance was weak under greater erosive forces. These results suggested that the distribution of populations and the density of stoneloach were governed by the suitability of physical habitat. Multi-scale studies of factors influencing a species' distribution allow patterns observed at different scales to be integrated, and enhance our understanding of interactions between animals and their environment. The use of a few pertinent variables in successful final models could reduce the effort and cost of data collection for water management applications.
Article
Full-text available
Understanding the relationship between an index of biological community state and habitat is important for policy makers and environmental managers. A common approach to modeling this relationship is to use regression. However, this simple method becomes complicated when the data are clustered and have both within-cluster and between-cluster spatial correlation. This article proposes a Bayesian hierarchical model that incorporates both types of spatial correlation. This model yields both an understanding of the within-cluster relationships as well as an overall relationship between these variables. We apply this method to evaluate the relationship between the index of biotic integrity (a common measure of fish condition) and the qualitative habitat evaluation index (a common measure of habitat quality). This method allows us to show that there is a relationship between the biological community state and habitat and that this relationship varies across river basins, while accounting for the within- and between-spatial correlations.
Article
Full-text available
Classification of streams and stream habitats is useful for research involving establishment of monitoring stations, determination of local impacts of land-use practices, generalization from site-specific data, and assessment of basin-wide, cumulative impacts of human activities on streams and their biota. This article presents a framework for a hierarchical classification system, entailing an organized view of spatial and temporal variation among and within stream systems. Stream habitat systems, defined and classified on several spatiotemporal scales, are associated with watershed geomorphic features and events. Variables selected for classification define relative long-term capacities of systems, not simply short-term states. Streams and their watershed environments are classified within the context of a regional biogeoclimatic landscape classification. The framework is a perspective that should allow more systematic interpretation and description of watershed-stream relationships.
Article
Full-text available
The variability of benthic macroinvertebrate (bmi) communities at three spatial scales was examined in the Juncal Stream, Coiba Island, Panama. Standard Surber samples were taken at riffle habitats of first, second and third order streams of the Juncal Stream watershed. In a nested design, we had three riffles within each stream order, and three sampling points within each riffle. Bmi total density and richness showed greater variation among stream orders and within riffles, while individual taxa varied mostly among and within riffles, and community evenness varied within riffles. Current velocity was strongly related to the variability of bmi community descriptors. Multidimensional scaling showed that first and third stream orders were more similar in bmi composition than second order. Riffles were more heterogeneous in composition within first order than within second or third orders, probably related to the greater substrate heterogeneity in first order riffles.
Article
This paper extends the two-component approach to modelling count data with extra zeros, considered by Mullahy (1986), Heilbron (1994) and Welsh et al. (1996), to take account of possible serial dependence between repeated observations. Generalized estimating equations (Liang & Zeger, 1986) are constructed for each component of the model by incorporating correlation matrices into each of the maximum likelihood estimating equations. The proposed method is demonstrated on weekly counts of Noisy Friarbirds (Philemon corniculatus), which were recorded by observers for the Canberra Garden Bird Survey (Hermes, 1981).
Book
Freshwater Fishes of North-Eastern Australia provides details of the ecology, systematics, biogeography and management of 79 species of native fish present in the region. It includes detailed information on their identification, evolutionary history, breeding biology, feeding ecology, movement patterns, macro-, meso- and micro-habitat use, water quality tolerances, conservation status and current threats, as well as environmental flow and management needs. Based on the results of extensive field surveys and a comprehensive review of existing literature, it is designed to assist environmental practitioners and managers to make informed decisions about future management strategies. It will also encourage a greater research effort into the region’s aquatic fauna by providing a comprehensive resource that enables other researchers to adopt a more quantitative and strategic framework for their research. Joint winner of the 2005 Whitley Medal.
Article
It is argued that the problem of pattern and scale is the central problem in ecology, unifying population biology and ecosystems science, and marrying basic and applied ecology. Applied challenges, such as the prediction of the ecological causes and consequences of global climate change, require the interfacing of phenomena that occur on very different scales of space, time, and ecological organization. Furthermore, there is no single natural scale at which ecological phenomena should be studied; systems generally show characteristic variability on a range of spatial, temporal, and organizational scales. The observer imposes a perceptual bias, a filter through which the system is viewed. This has fundamental evolutionary significance, since every organism is an "observer" of the environment, and life history adaptations such as dispersal and dormancy alter the perceptual scales of the species, and the observed variability. It likewise has fundamental significance for our own study of ecological systems, since the patterns that are unique to any range of scales will have unique causes and biological consequences. The key to prediction and understanding lies in the elucidation of mechanisms underlying observed patterns. Typically, these mechanisms operate at different scales than those on which the patterns are observed; in some cases, the patterns must be understood as emerging from the collective behaviors of large ensembles of smaller scale units. In other cases, the pattern is imposed by larger scale constraints. Examination of such phenomena requires the study of how pattern and variability change with the scale of description, and the development of laws for simplification, aggregation, and scaling. Examples are given from the marine and terrestrial literatures.
Book
Introduction.- Data management and software.- Advice for teachers.- Exploration.- Linear regression.- Generalised linear modelling.- Additive and generalised additive modelling.- Introduction to mixed modelling.- Univariate tree models.- Measures of association.- Ordination--first encounter.- Principal component analysis and redundancy analysis.- Correspondence analysis and canonical correspondence analysis.- Introduction to discriminant analysis.- Principal coordinate analysis and non-metric multidimensional scaling.- Time series analysis--Introduction.- Common trends and sudden changes.- Analysis and modelling lattice data.- Spatially continuous data analysis and modelling.- Univariate methods to analyse abundance of decapod larvae.- Analysing presence and absence data for flatfish distribution in the Tagus estuary, Portugal.- Crop pollination by honeybees in an Argentinean pampas system using additive mixed modelling.- Investigating the effects of rice farming on aquatic birds with mixed modelling.- Classification trees and radar detection of birds for North Sea wind farms.- Fish stock identification through neural network analysis of parasite fauna.- Monitoring for change: using generalised least squares, nonmetric multidimensional scaling, and the Mantel test on western Montana grasslands.- Univariate and multivariate analysis applied on a Dutch sandy beach community.- Multivariate analyses of South-American zoobenthic species--spoilt for choice.- Principal component analysis applied to harbour porpoise fatty acid data.- Multivariate analysis of morphometric turtle data--size and shape.- Redundancy analysis and additive modelling applied on savanna tree data.- Canonical correspondence analysis of lowland pasture vegetation in the humid tropics of Mexico.- Estimating common trends in Portuguese fisheries landings.- Common trends in demersal communities on the Newfoundland-Labrador Shelf.- Sea level change and salt marshes in the Wadden Sea: a time series analysis.- Time series analysis of Hawaiian waterbirds.- Spatial modelling of forest community features in the Volzhsko-Kamsky reserve.
Article
With the rise of new powerful statistical techniques and GIS tools, the development of predictive habitat distribution models has rapidly increased in ecology. Such models are static and probabilistic in nature, since they statistically relate the geographical distribution of species or communities to their present environment. A wide array of models has been developed to cover aspects as diverse as biogeography, conservation biology, climate change research, and habitat or species management. In this paper, we present a review of predictive habitat distribution modeling. The variety of statistical techniques used is growing. Ordinary multiple regression and its generalized form (GLM) are very popular and are often used for modeling species distributions. Other methods include neural networks, ordination and classification methods, Bayesian models, locally weighted approaches (e.g. GAM), environmental envelopes or even combinations of these models. The selection of an appropriate method should not depend solely on statistical considerations. Some models are better suited to reflect theoretical findings on the shape and nature of the species’ response (or realized niche). Conceptual considerations include e.g. the trade-off between optimizing accuracy versus optimizing generality. In the field of static distribution modeling, the latter is mostly related to selecting appropriate predictor variables and to designing an appropriate procedure for model selection. New methods, including threshold-independent measures (e.g. receiver operating characteristic (ROC)-plots) and resampling techniques (e.g. bootstrap, cross-validation) have been introduced in ecology for testing the accuracy of predictive models. The choice of an evaluation measure should be driven primarily by the goals of the study. This may possibly lead to the attribution of different weights to the various types of prediction errors (e.g. omission, commission or confusion). 
Testing the model in a wider range of situations (in space and time) will permit one to define the range of applications for which the model predictions are suitable. In turn, the qualification of the model depends primarily on the goals of the study that define the qualification criteria and on the usability of the model, rather than on statistics alone.
Article
* This paper is a development of some of the estimation problems discussed by Utting and Cole [5]. The author wishes to express his indebtedness to J. A. C. Brown of the Department of Applied Economics for helpful criticism and for suggesting the application of the Poisson series distribution to the analysis of household composition. In a number of situations we are faced with the problem of determining efficient estimates of the mean and variance of a distribution specified by (i) a non-zero probability that the variable assumes a zero value, together with (ii) a conditional distribution for the positive values of the variable. This estimation problem is analyzed and its implications for the Pearson type III, exponential, lognormal and Poisson series conditional distributions are investigated. Two simple examples are given.
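The mean and variance of such a distribution, with a point mass at zero plus a conditional distribution for positive values, follow from the laws of total expectation and total variance. The sketch below is illustrative only: the function name and the arguments `p_zero`, `cond_mean` and `cond_var` are assumptions introduced here, and it evaluates the moments from known parameters rather than reproducing the efficient estimators the paper derives.

```python
def zero_mixture_moments(p_zero, cond_mean, cond_var):
    """Mean and variance of X, where X = 0 with probability p_zero and
    otherwise follows a positive distribution with the given conditional
    mean and variance (hypothetical helper, not from the paper)."""
    if not 0.0 <= p_zero <= 1.0:
        raise ValueError("p_zero must lie in [0, 1]")
    p_pos = 1.0 - p_zero
    # Law of total expectation: E[X] = P(X > 0) * E[X | X > 0]
    mean = p_pos * cond_mean
    # Law of total variance: within-component part plus between-component part
    var = p_pos * cond_var + p_zero * p_pos * cond_mean ** 2
    return mean, var
```

For example, with half the observations equal to zero, a conditional mean of 2 and a conditional variance of 1, the overall mean is 1 and the overall variance is 1.5.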
Article
Local species richness is a function of many factors operating at multiple spatial and temporal scales. We examined stream fish communities from regions throughout Virginia to assess (1) the relative influence of local vs. regional factors on local species richness, (2) evidence for community saturation, and (3) scale dependency of regional influences. We defined regions at four spatial scales: major drainages, drainage-physiography units, hydrologic-physiography units, and sites. We used multiple regression to identify key correlates of local native and introduced diversity for each regional scale. Both local (e.g., microhabitat diversity) and regional (e.g., species richness) factors were correlated with local diversity; regional diversity was the most consistent correlate. Plots of local vs. regional native diversity were asymptotic for the three largest regional definitions, thereby suggesting community saturation. However, analogous plots for introduced species were not asymptotic; local introduced diversity was a linear function of regional introduced diversity. Introduced populations were pervasive, but less abundant locally than native populations, thereby suggesting that native species are better adapted. Overall, stream fish communities in Virginia appeared to be neither completely saturated nor freely invadable. The ability of regression models and particular independent variables to account for variation in local diversity changed considerably with regional scale. Most regional correlates of local diversity were scale dependent. The concept of hierarchical environmental filters provides a useful framework for integrating the multiple scales over which ecological processes organize communities. Retrospective analyses of the impacts of introduced species on native communities provide some insight regarding community saturation, but conclusive evidence must await studies that couple comparative and experimental approaches. 
Clear interpretation of regional influences on local diversity will require careful definition of regions. Comparative analyses at multiple regional scales may be the most insightful approach to understanding the complete array of processes that organize communities.
Article
We examined the predictive power and transferability of habitat-based models by comparing associations of tangerine darter Percina aurantiaca and stream habitat at local and regional scales in North Fork Holston River (NFHR) and Little River, Virginia. Our models correctly predicted the presence or absence of tangerine darters in NFHR for 64% (local model) and 78% (regional model) of the sampled habitat-units (i.e., pools, runs, riffles). The distribution of tangerine darters apparently was influenced more by regional variables than local variables. Data from Little River and 37 historical records from Virginia were used to assess transferability of our models developed from NFHR data. In general, the models did not transfer well to Little River; all models predicted that either no (regional model) or few (local model) habitat-units in Little River would contain tangerine darter even though the species was observed in 83% of the habitat-units sampled. Conversely, the regional model correctly predicted presence of tangerine darters for 95% of the historical records. Principal components analysis showed extensive overlap in NFHR and Little River habitat, which suggests that the two streams are ecologically similar. The suitability of Little River for tangerine darters was shown more clearly by principal components analysis than by our models. Because different limiting factors may apply in different systems, the elimination of potentially important ecological variables may compromise model transferability. A hierarchical approach to habitat modeling, with regard to variable retention, may improve transferability of habitat models.
Article
The hierarchical structure of natural systems can be useful in designing ecological studies that are informative at multiple spatial scales. Although stream systems have long been recognized as having a hierarchical spatial structure, there is a need for more empirical research that exploits this structure to generate an understanding of population biology, community ecology, and species–ecosystem linkages across spatial scales. We review studies that link pattern and process across multiple scales of stream-habitat organization, highlighting the insight derived from this multiscale approach and the role that mechanistic hypotheses play in its successful application. We also describe a frontier in stream research that relies on this multiscale approach: assessing the consequences and mechanisms of ecological processes occurring at the network scale. Broader use of this approach will advance many goals in applied stream ecology, including the design of reserves to protect stream biodiversity and the conservation of freshwater resources and services.
Book
An overview of statistical decision theory, which emphasizes the use and application of the philosophical ideas and mathematical structure of decision theory. The text assumes a knowledge of basic probability theory and some advanced calculus is also required.
Article
1. We compared assemblages of ground-active, terrestrial beetles and spiders from different areas of river red gum Eucalyptus camaldulensis floodplain forest in subhumid, south-eastern Australia before and for 2 years following a managed flood to determine whether the Flood Pulse Concept is an appropriate ecological model for this regulated, lowland river-floodplain system. 2. Immediately following flooding, the abundance, species richness and biomass of beetles were greatest at sites that had been inundated for the longest period (approximately 4 months). The abundance, species richness and biomass of spiders were not reduced at sites that were flooded for 4 months compared with unflooded or briefly flooded areas. Sites recently flooded for several months had high densities of predatory, hygrophilic beetles (Carabidae) and spiders (Lycosidae). 3. Over the 2 years following the flood, beetles generally were more abundant at sites that had been inundated for longer. At all sampling times, the species richness of beetles at sites increased with the length of time sites were inundated, even before the flood. Neither the abundance nor species richness of spiders was related to duration of flooding. 4. The structure of beetle and spider assemblages at sites that were flooded for different lengths of time did not appear to converge monotonically over the 2 years after the flood. 5. Managed flooding promotes diversity of beetles and spiders both by providing conditions that create a ‘pulse’ in populations of hygrophilic specialists in the short term, and by creating subtle, persistent changes in forest-floor conditions. Despite its monotypic canopy, river red gum floodplain forest is a habitat mosaic generated by differing inundation histories.
Article
When analyzing Poisson count data sometimes a high frequency of extra zeros is observed. The Zero-Inflated Poisson (ZIP) model is a popular approach to handle zero-inflation. In this paper we generalize the ZIP model and its regression counterpart to accommodate the extent of individual exposure. Empirical evidence drawn from an occupational injury data set confirms that the incorporation of exposure information can exert a substantial impact on the model fit. Tests for zero-inflation are also considered. Their finite sample properties are examined in a Monte Carlo study.
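As a rough illustration of the idea, the sketch below evaluates a ZIP probability mass function in which the Poisson mean is scaled by an exposure term. The parameterisation (mean `lam * exposure`, zero-inflation probability `pi` independent of exposure) and the function name are assumptions made here for illustration; the paper's exact formulation may differ.

```python
import math

def zip_pmf(y, pi, lam, exposure=1.0):
    """P(Y = y) for a zero-inflated Poisson count.

    pi       -- probability of a structural (extra) zero
    lam      -- Poisson rate per unit of exposure
    exposure -- assumed multiplicative exposure, so the count mean is lam * exposure
    """
    mu = lam * exposure
    poisson = math.exp(-mu) * mu ** y / math.factorial(y)
    if y == 0:
        # Zeros arise from both the inflation component and the Poisson component
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson
```

Under this parameterisation the expected count is `(1 - pi) * lam * exposure`, so longer exposure raises the expected count and lowers the share of zeros that come from the Poisson component.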
Article
This paper extends the two-component approach to modelling count data with extra zeros, considered by Mullahy (1986), Heilbron (1994) and Welsh et al. (1996), to take account of possible serial dependence between repeated observations. Generalized estimating equations (Liang & Zeger, 1986) are constructed for each component of the model by incorporating correlation matrices into each of the maximum likelihood estimating equations. The proposed method is demonstrated on weekly counts of Noisy Friarbirds (Philemon corniculatus), which were recorded by observers for the Canberra Garden Bird Survey (Hermes, 1981).
Article
SUMMARY 1. Fish in Mediterranean streams survive through the summer in residual surface waters, encompassing a broad range of abiotic and biotic conditions. Yet, the extent to which fish assemblages may be shaped by functional heterogeneity in dry-season refugia is largely unknown. This study addresses this issue, by examining fish assemblage and population attributes, and predation patterns in residual summer habitats (12 pools and six runs) across a Mediterranean catchment in south-west Portugal. 2. Species richness was fairly constant among runs but increased with pool size, with the addition of exotics and rare natives to large pools resulting in nested subsets. The four most common species (chub, nase, loach and eel) were considered generalists in terms of their use of dry-season refugia. Conversely, rare species presented more specialised habitats, with barbel and exotics favouring pools and stickleback favouring runs. 3. Age and size of the two most common species varied among dry-season habitats. Age 0 chub were restricted to runs, where spawning stages (age 2 and older fish) were also more represented. Age 0 nase also concentrated in runs, but the individuals collected in pools exhibited greater growth. Conversely, age 2 and older nase were proportionally more abundant in pools, but with greater growth and better condition in runs. 4. The otter was the main fish predator, consuming fish of all species and size classes, irrespective of habitat. Otter activity concentrated in pools, where predation risk for cyprinids seemed to be much higher than in runs. 5. Dry-season refugia apparently vary in functional importance for different fish species and life stages, acting as complementary units in the landscape. Therefore, the presence of both pool and run refugia through the summer dry season may play a critical role in promoting the persistence of native species in Mediterranean streams.
Article
1. Studies in several parts of the world have examined variation in univariate descriptors of macroinvertebrate assemblage structure in perennially flowing stony streams across hierarchies of spatial scale using nested analyses of variance. However, few have investigated whether this spatial variation changes with time or whether these results are representative of habitats other than riffles or of other stream types, such as intermittently flowing streams. 2. We describe patterns in taxon richness and abundance from two sets of samples from stony streams in the Otway Range and the Grampians Range, Victoria, Australia, collected using hierarchical designs. Sampling of riffles was repeated in the Otways, to determine whether spatial patterns were consistent among times. In the Grampians, spatial patterns were compared between intermittent and perennially flowing streams (stream type) by sampling pools. 3. In the Otways streams, most variation in the dependent variables occurred between sample units. Patterns of variation among the other scales (streams, segments, riffles, groups of stones) were not consistent between sampling times, suggesting that they may have little ecological significance. 4. In the Grampians streams, variation in macroinvertebrate taxon richness and abundance differed significantly between replicate streams within each stream type but not between stream types or pools. The largest source of variation in taxon richness was stream type. Little variation occurred among sample units. 5. The pattern of most variation occurring among sample units is robust both to differences in the method of sampling and different dependent variables among studies and increasingly appears to be a property of riffles in stony, perennial upland streams. 
High variation among sample units (residual variation) limits the explanatory power of linear models and therefore, where samples are from a single sampling time, small but significant components of variation are unlikely to represent features of assemblage structure that will be stable over time.
Article
1. Many natural ecosystems are heterogeneous at scales ranging from microhabitats to landscapes. Running waters are no exception in this regard, and their environmental heterogeneity is reflected in the distribution and abundance of stream organisms across multiple spatial scales. 2. We studied patchiness in benthic macroinvertebrate abundance and functional feeding group (FFG) composition at three spatial scales in a boreal river system. Our sampling design incorporated a set of fully nested scales, with three tributaries, two stream sections (orders) within each tributary, three riffles within each section and ten benthic samples in each riffle. 3. According to nested anovas, most of the variation in total macroinvertebrate abundance, abundances of FFGs, and number of taxa was accounted for by the among-riffle and among-sample scales. Such small-scale variability reflected similar patterns of variation in in-stream variables (moss cover, particle size, current velocity and depth). Scraper abundance, however, varied most at the scale of stream sections, probably mirroring variation in canopy cover. 4. Tributaries and stream sections within tributaries differed significantly in the structure and FFG composition of the macroinvertebrate assemblages. Furthermore, riffles in headwater (second order) sections were more variable than those in higher order (third order) sections. 5. Stream biomonitoring programs should consider this kind of scale-dependent variability in assemblage characteristics because: (i) small-scale variability in abundance suggests that a few replicate samples are not enough to capture macroinvertebrate assemblage variability present at a site, and (ii) riffles from the same stream may support widely differing benthic assemblages.
Article
1. We intensively sampled 16 western Oregon streams to characterize: (1) the variability in macroinvertebrate assemblages at seven spatial scales; and (2) the change in taxon richness with increasing sampling effort. An analysis of variance (ANOVA) model calculated spatial variance components for taxon richness, total density, percent individuals of Ephemeroptera, Plecoptera and Trichoptera (EPT), percent dominance and Shannon diversity. 2. At the landscape level, ecoregion and among‐streams components dominated variance for most metrics, accounting for 43–72% of total variance. However, ecoregion accounted for very little variance in total density and 36% of the variance was attributable to differences between streams. For other metrics, variance components were more evenly divided between stream and ecoregion effects. 3. Within streams, approximately 70% of variance was associated with unstructured local spatial variation and not associated with habitat type or transect position. The remaining variance was typically split about evenly between habitat and transect. Sample position within a transect (left, centre or right) accounted for virtually none of the variance for any metric. 4. New taxa per stream increased rapidly with sampling effort with the first four to eight Surber samples (500–1000 individuals counted), then increased more gradually. After counting more than 50 samples, new taxa continued to be added in stream reaches that were 80 times as long as their mean wetted width. Thus taxon richness was highly dependent on sampling effort, and comparisons between sites or streams must be normalized for sampling effort. 5. Characterization of spatial variance structure is fundamental to designing sampling programmes where spatial comparisons range from local to regional scales. 
Differences in metric responses across spatial scales demonstrate the importance of designing sampling strategies and analyses capable of discerning differences at the scale of interest.
Article
1. Species-discharge relationships (SDR) are aquatic analogues of species-area relationships, and are increasingly used in both basic research and conservation planning. SDR studies are often limited, however, by two shortcomings. First, they do not determine whether reported SDRs, which normally use complete drainage basins as sampling units, are scale dependent. Second, they do not account for the effects of habitat diversity within or among samples. 2. We addressed both problems by using discrete fish zones as sampling units in a SDR analysis. To do so, we first tested for longitudinal zonation in three rivers in the southeastern U.S.A. In each river, we detected successive ‘lower’, ‘middle’, and ‘upper’ fish zones, which were characterized by distinct fish assemblages with predictable habitat requirements. Because our analyses combined fish data from multiple sources, we also used rarefaction and Monte Carlo simulation to ensure that our zonation results were robust to spurious sampling effects. 3. Next, we estimated the average discharge within each zone, and plotted these estimates against the respective species richness within each zone (log10-transformed data). This revealed a significant, linear SDR (r² = 0.83; P < 0.01). Notably, this zonal SDR fit the empirical data better than a comparable SDR that did not discriminate among longitudinal zones. We therefore conclude that the southeastern fish SDR is scale dependent, and that accounting for within-basin habitat diversity is an important step in explaining the high diversity of southeastern fishes. 4. We then discuss how our zonal SDR can be used to improve conservation planning. Specifically, we show how the slope of the SDR can be used to forecast potential extinction rates, and how the zonal data can be used to identify species of greatest concern.
Article
1. Macroinvertebrate count data often exhibit nested or hierarchical structure. Examples include multiple measurements along each of a set of streams, and multiple synoptic measurements from each of a set of ponds. With data exhibiting hierarchical structure, outcomes at both sampling (e.g. within stream) and aggregated (e.g. stream) scales are often of interest. Unfortunately, methods for modelling hierarchical count data have received little attention in the ecological literature. 2. We demonstrate the use of hierarchical count models using fingernail clam (Family: Sphaeriidae) count data and habitat predictors derived from sampling and aggregated spatial scales. The sampling scale corresponded to that of a standard Ponar grab (0.052 m²) and the aggregated scale to impounded and backwater regions within 38–197 km reaches of the Upper Mississippi River. Impounded and backwater regions were resampled annually for 10 years. Consequently, measurements on clams were nested within years. Counts were treated as negative binomial random variates, and means from each resampling event as random departures from the impounded and backwater region grand means. 3. Clam models were improved by the addition of covariates that varied at both the sampling and regional scales. Substrate composition varied at the sampling scale and was associated with model improvements, and reductions (for a given mean) in variance at the sampling scale. Inorganic suspended solids (ISS) levels, measured in the summer preceding sampling, also yielded model improvements and were associated with reductions in variances at the regional rather than sampling scales. ISS levels were negatively associated with mean clam counts. 4. Hierarchical models allow hierarchically structured data to be modelled without ignoring information specific to levels of the hierarchy. In addition, information at each hierarchical level may be modelled as functions of covariates that themselves vary by and within levels. 
As a result, hierarchical models provide researchers and resource managers with a method for modelling hierarchical data that explicitly recognises both the sampling design and the information contained in the corresponding data.
Article
Ecologists have recognized for decades the importance of spatial scale in ecological processes and patterns, as well as the complications scale poses for understanding ecological mechanisms. Here we highlight the opportunity attention to scale offers experimental ecology. Despite many advantages to considering scale, a review of the literature indicates that multi‐scale experimental studies are rare. Although much work has focused on scale as a primary factor (e.g. island size), we draw attention to scale as a ‘lurking’ variable: one which influences the relationship between two or more variables that are not usually understood to be scale‐dependent. We highlight three basic observations from which scale‐dependence arises: abundance increases with area, environmental conditions vary across space, and the effect of an organism on its environment is spatially limited. From these arise first‐order scale‐dependence, which relates an ecological variable of interest to a measure of scale. Combining first‐order relationships together, we can produce second‐order scale‐dependencies, which occur when the relationship between two or more variables is mediated by scale. It is these relationships that are of particular interest, as they have the potential to confound experimental results. Most ecological experiments have incorporated scale either implicitly or not at all. We suggest that an explicit consideration of scale could help resolve some long‐standing debates when scale is turned from a lurking variable into a working variable. Finally, we review and evaluate four different experimental sampling designs and corresponding statistical analyses that can be used to address the effects of scale in ecological experiments.
Article
Environmental data are spatial, temporal, and often come with many zeros. In this paper, we included space-time random effects in zero-inflated Poisson (ZIP) and ‘hurdle’ models to investigate haulout patterns of harbor seals on glacial ice. The data consisted of counts, for 18 dates on a lattice grid of samples, of harbor seals hauled out on glacial ice in Disenchantment Bay, near Yakutat, Alaska. A hurdle model is similar to a ZIP model except it does not mix zeros from the binary and count processes. Both models can be used for zero-inflated data, and we compared space-time ZIP and hurdle models in a Bayesian hierarchical model. Space-time ZIP and hurdle models were constructed by using spatial conditional autoregressive (CAR) models and temporal first-order autoregressive (AR(1)) models as random effects in ZIP and hurdle regression models. We created maps of smoothed predictions for harbor seal counts based on ice density, other covariates, and spatio-temporal random effects. For both models predictions around the edges appeared to be positively biased. The linex loss function is an asymmetric loss function that penalizes overprediction more than underprediction, and we used it to correct for prediction bias to get the best map for space-time ZIP and hurdle models. Published in 2007 by John Wiley & Sons, Ltd.
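To make the role of the linex loss concrete, the sketch below uses a common textbook form of the loss, exp(a·d) − a·d − 1 with d = prediction − truth, which for a > 0 penalizes overprediction more heavily. The function names, the shape parameter `a` and the use of plain posterior draws are assumptions introduced here; the abstract does not state the exact parameterisation the authors used.

```python
import math

def linex_loss(pred, truth, a=1.0):
    """Linex loss for the error d = pred - truth; with a > 0, positive
    errors (overprediction) cost more than negative errors of equal size."""
    d = pred - truth
    return math.exp(a * d) - a * d - 1.0

def linex_point_prediction(draws, a=1.0):
    """Prediction minimizing the average linex loss over posterior draws.

    Setting the derivative of the average loss to zero gives
    pred = -(1/a) * log(mean(exp(-a * y))).
    """
    m = sum(math.exp(-a * y) for y in draws) / len(draws)
    return -math.log(m) / a
```

By Jensen's inequality this prediction never exceeds the mean of the draws when a > 0, which is how the loss pulls predictions downward in regions where the posterior mean would overpredict.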
Article
1. We used field surveys to compare the density and mesohabitat-scale distribution of the native coastrange sculpin (Cottus aleuticus) and the prickly sculpin (C. asper) in coastal rivers in north-western California, U.S.A., with and without an introduced piscivorous fish, the Sacramento pikeminnow, Ptychocheilus grandis. We also measured mortality of tethered prickly sculpin in a field experiment including river, habitat type (pools versus riffles) and cover as factors. 2. Average sculpin density (C. aleuticus and C. asper combined) in two rivers without pikeminnow was 21 times higher than the average density in two rivers in a drainage with introduced pikeminnow. In riffles, differences in the density of sculpins among rivers could be linked to differences in cover. However, riffles in rivers without pikeminnow had an average sculpin density 77 times higher than rivers with pikeminnow, yet only nine times more cover. In pools, cover availability did not differ among rivers, but the density of sculpins in rivers without pikeminnow was 11 times higher than rivers with pikeminnow. 3. In the field experiment, mortality of tethered sculpin varied substantially among treatments and ANOVA indicated a significant River × Habitat × Cover interaction (P < 0.001). Overall, tethered prickly sculpin suffered 40% mortality over 24 h in rivers with pikeminnow and 2% mortality in rivers without pikeminnow, suggesting that predation is the mechanism by which the pikeminnow affects sculpins. 4. The apparent reduction in sculpin abundance by introduced pikeminnow has probably significantly altered food webs and nutrient transport processes, and increased the probability of extinction of coastrange and prickly sculpins in the Eel River drainage.
Article
Count data arise in many contexts. Here our concern is with spatial count data which exhibit an excessive number of zeros. Using the class of zero-inflated count models provides a flexible way to address this problem. Available covariate information suggests formulation of such modeling within a regression framework. We employ zero-inflated Poisson regression models. Spatial association is introduced through suitable random effects yielding a hierarchical model. We propose fitting this model within a Bayesian framework considering issues of posterior propriety, informative prior specification and well-behaved simulation based model fitting. Finally, we illustrate the model fitting with a data set involving counts of isopod nest burrows for 1649 pixels over a portion of the Negev desert in Israel.