Figure - available from: Environmental and Ecological Statistics
a Changes that occurred between 2000 and 2012 on the satellite map, constituted by 40,000 quadrat pixels of size 1 ha covering a 20 km × 20 km quadrat area located in Sardinia. b Changes in the reference map artificially constructed from the satellite map

Source publication
Article
Full-text available
Large-scale remote sensing-based inventories of forest cover are usually carried out by combining unsupervised classifications of satellite pixels into forest/non-forest classes (map data) with subsequent time-consuming visual on-screen imagery classification of a probabilistic sample of pixels taken as the ground truth (reference data). In this pa...
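As a hedged illustration of how map data and a reference sample are commonly combined in this kind of assessment, the following R sketch post-stratifies a per-class reference sample by the map classes and computes a stratified estimate of the forest proportion. All class weights, sample sizes and rates are invented for the example and do not reproduce the estimators proposed in the article.

```r
# Hedged sketch: stratified (map-class) estimation of the forest proportion
# from a classified map plus a per-class reference sample. All weights,
# sample sizes and "true" rates below are invented for illustration.

set.seed(1)
W_h <- c(forest = 0.35, nonforest = 0.65)      # map class weights (proportions of pixels)
n_h <- c(forest = 200, nonforest = 200)        # reference sample size per map class

p_true <- c(forest = 0.92, nonforest = 0.05)   # assumed true-forest rate within each class
y <- lapply(names(W_h), function(h) rbinom(n_h[h], 1, p_true[h]))  # 1 = reference says forest
names(y) <- names(W_h)

p_h   <- sapply(y, mean)                       # per-class estimated forest proportion
v_h   <- sapply(y, function(x) var(x) / length(x))
p_hat <- sum(W_h * p_h)                        # stratified estimate of forest proportion
se    <- sqrt(sum(W_h^2 * v_h))                # approximate standard error
c(estimate = p_hat, se = se)
```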

Similar publications

Article
Full-text available
Traditional forestry, ecology, and fuels monitoring methods can be costly and error-prone, and are often used beyond their original assumptions due to difficulty or unavailability of more appropriate methods. These traditional methods tend to be rigid and may not be useful for detecting new ecological changes or required data at modern levels of pr...
Article
Full-text available
Sufficient plant-available water is one of the most important requirements for vital, stable, and well-growing forest stands. In the face of climate change, there are various approaches to derive recommendations considering tree species selection based on plant-available water provided by measurements or simulations. Owing to the small-parcel manag...
Article
Full-text available
Tree-related microhabitats (TreMs) play an important role in maintaining forest biodiversity and have recently received more attention in ecosystem conservation, forest management and research. However, TreMs have until now only been assessed by experts during field surveys, which are time-consuming and difficult to reproduce. In this study, we eva...
Article
Full-text available
• The impact of disturbances on boreal forest plant communities is not fully understood, particularly when different disturbances are combined, and regime shifts to alternate stable states are possible after disturbance. A long‐term monitored semi‐natural forest site subject to intense combined storm and bark beetle damage beginning in 2005 provide...
Article
Full-text available
Forest aboveground biomass (AGB) is of great significance since it represents large carbon storage and may reduce global climate change. However, there are still considerable uncertainties in forest AGB estimates, especially in rugged regions, due to the lack of effective algorithms to remove the effects of topography and the lack of comprehensive...

Citations

... Among the three schemes, OPSS has historically been widely adopted [33] and is currently implemented in several national forest inventories (e.g., Italy and the USA), as it allows the sample pixels to be spread over the entire AOI. In fact, while SRSWoR can lead to over/under-representation of the AOI, OPSS reduces the probability of selecting neighboring pixels, ensuring spatially balanced samples with related advantages [34]. ...
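The spatial spreading attributed to OPSS in this passage can be illustrated with a small R sketch; the grid size, sample size and block partition are illustrative assumptions, not the schemes used in the cited national inventories.

```r
# Minimal sketch contrasting SRSWoR with a one-pixel-per-block selection that
# spreads the sample over a square grid (an OPSS-like scheme).

set.seed(42)
side <- 200; N <- side^2; n <- 100
pix <- expand.grid(col = 1:side, row = 1:side)

srs <- pix[sample(N, n), ]                      # SRSWoR: may cluster or leave gaps

b <- side / sqrt(n)                             # block side (here 20 pixels)
pix$block <- paste(ceiling(pix$col / b), ceiling(pix$row / b))
opss <- do.call(rbind, lapply(split(pix, pix$block),
                              function(d) d[sample(nrow(d), 1), ]))  # one pixel per block
```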
Article
Full-text available
Remote sensing products are typically assessed using a single accuracy estimate for the entire map, despite significant variations in accuracy across different map areas or classes. Estimating per-pixel uncertainty is a major challenge for enhancing the usability and potential of remote sensing products. This paper introduces the dataDriven open access tool, a novel statistical design-based approach that specifically addresses this issue by estimating per-pixel uncertainty through a bootstrap resampling procedure. Leveraging Sentinel-2 remote sensing data as auxiliary information, the capabilities of the Google Earth Engine cloud computing platform, and the R programming language, dataDriven can be applied in any region of the world and to any variable of interest. In this study, the dataDriven tool was tested in the Rincine forest estate study area in eastern Tuscany, Italy, focusing on volume density as the variable of interest. The average volume density was 0.042 m³ per m², corresponding to 420 m³ per hectare. The estimated pixel errors ranged between 93 m³ and 979 m³ per hectare and were 285 m³ per hectare on average. The ability to produce error estimates for each pixel in the map is a novel aspect in the context of the current advances in remote sensing and forest monitoring and assessment. It constitutes significant support for forest management applications and also a powerful communication tool, since it informs users about areas where map estimates are unreliable while highlighting the areas where the information provided by the map is more trustworthy. In light of this, the dataDriven tool aims to support researchers and practitioners in the spatially exhaustive use of remote sensing-derived products and in map validation.
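As a generic, hedged sketch of how bootstrap resampling can attach a standard error to a model-based pixel prediction (not the dataDriven implementation; the plot data and the NDVI predictor below are simulated):

```r
# Generic illustration: bootstrap resampling of field plots to attach a
# standard error to a model-based prediction for one pixel.

set.seed(7)
n   <- 150
dat <- data.frame(ndvi = runif(n, 0.2, 0.9))
dat$vol <- 300 + 400 * dat$ndvi + rnorm(n, sd = 60)   # hypothetical volume density

newpix <- data.frame(ndvi = 0.55)                     # auxiliary value of a target pixel
B <- 500
boot_pred <- replicate(B, {
  d <- dat[sample(n, n, replace = TRUE), ]            # resample plots with replacement
  predict(lm(vol ~ ndvi, data = d), newdata = newpix) # refit the model and predict
})
c(prediction = as.numeric(predict(lm(vol ~ ndvi, data = dat), newdata = newpix)),
  boot_se    = sd(boot_pred))                         # bootstrap pixel-level error
```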
... To obtain the best results, one must collect a sufficient number of high-quality reference data samples that capture spectral class variability efficiently (Li et al. 2014; Persello and Bruzzone 2014). Studies have shown that spreading samples evenly across regions and allocating them in proportion to stratum size can help improve classification accuracy when the main objective is to increase overall accuracy (Jin, Stehman, and Mountrakis 2014; Pagliarella, Corona, and Fattorini 2018; Zhu et al. 2016). Collecting good and adequate samples poses a great challenge for most researchers, as it is a laborious and time-consuming process that requires field expertise, domain knowledge and rigorous field visits, sometimes across complex and rough terrain that is inaccessible or not feasible to visit. ...
Article
In today’s world, by integrating remote sensing technology and modern state-of-the-art machine learning techniques, obtaining Land Use Land Cover (LULC) maps has become easier in comparison to traditional manual methods. The performance of a machine learning classifier is influenced by various factors. The objective of this study is to evaluate the impact of sampling design in rough, complex terrain located in the Northern Himalayan region in Uttarakhand state, India, where reference data are often limited due to the geographical characteristics of the study area. Three sampling design strategies have been incorporated in this study for the LULC classification, namely stratified random sampling with a proportional number of samples, (SRS)_proportional; stratified random sampling with an equal number of samples, (SRS)_equivalent; and stratified systematic sampling with an equal number of samples and a minimum distance of 10 m between consecutive samples, (SSS)_D=10m. In this study, Sentinel-2 data of 10 m spatial resolution for the study area of Dehradun district, Uttarakhand, India, have been selected. The following conclusions can be drawn from the results of this study: (i) (SRS)_proportional achieved the highest Overall Accuracy (OA) among the three sampling techniques. The OA and kappa score (ka) using (SRS)_proportional are OA = 90.25 and ka = 0.874 for Random Forest, OA = 88.84 and ka = 0.856 for Support Vector Machine, and OA = 87.72 and ka = 0.842 for k Nearest Neighbours (kNN). (ii) It was found that, in the case of (SRS)_proportional, majority classes such as deciduous forest, evergreen forest and cropland achieved higher recall and precision values in comparison to those obtained with the other two sampling strategies, i.e. (SRS)_equivalent and (SSS)_D=10m. (iii) The results showed that, when switching from (SRS)_proportional to (SRS)_equivalent or from (SRS)_proportional to (SSS)_D=10m, there was a slight reduction in the precision and recall values for the majority classes and a slight increase for a few of the minority classes.
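A minimal R sketch of the two allocation rules contrasted in the study, proportional-per-stratum versus equal-per-stratum; the class shares and the total sample size are assumed for illustration only.

```r
# Sketch of the two allocation rules: proportional versus equal sample
# sizes per stratum. Class shares and total sample size are illustrative.

area_share <- c(forest = 0.45, cropland = 0.30, builtup = 0.15, water = 0.10)
n_total <- 400

alloc_prop  <- round(n_total * area_share)                   # (SRS)_proportional
alloc_equal <- setNames(rep(n_total / length(area_share),
                            length(area_share)),
                        names(area_share))                   # (SRS)_equivalent
rbind(alloc_prop, alloc_equal)
```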
... Validation data points were selected for visual interpretation to identify forest type, presence of change, type of change (deforestation, degradation), date of change and presence of one or more direct drivers. A stratified random sampling scheme was developed to select spatially balanced samples proportional to the size of the map classes of forest type and change (Pagliarella et al., 2018). At least 150 random points per change class were selected, with more points for larger map classes (N = 11,078), along with a random sample of stable points from all land cover classes (N = 1,192). ...
... Similar kinds of changes in the microbial community of agricultural systems can also be expected during different crop seasons. Thus, the timing of the field study becomes critically important in biodiversity studies and needs adequate attention from researchers (Fattorini 2003; Pagliarella et al. 2018). ...
... In biodiversity research, plant and animal ecologists generally employ advanced statistical concepts and tools, such as sampling designs, categorization, normalization, data extrapolation, regression analysis, ANOVA, factor analysis, cluster analysis, logistic regression, and generalized linear and generalized additive modelling, in order to enhance the quality of their studies (Guisan et al. 2002; Fattorini 2003; Chiarucci et al. 2011; Pagliarella et al. 2018). However, very few microbiologists have given suitable consideration to these tools when assessing the microbial diversity of diverse habitats (Ampe and Miambi 2000; Oliveira et al. 2020; Banerjee et al. 2020). ...
Article
Full-text available
Microalgal and cyanobacterial communities have a key role in sustaining the fertility of aquatic and terrestrial habitats. Thus, understanding the actual biodiversity of these communities is a task of utmost importance. However, this particular task suffers from several technical constraints and challenges. Sampling procedures and criteria for counting individuals of various microalgal and cyanobacterial species in such systems have not been standardized. Biodiversity indices are considered promising; however, ambiguity with respect to the species concept and the characterization criteria of microalgal and cyanobacterial forms makes the determination of biodiversity indices a complicated task. Recently, DNA barcoding has been employed for the identification of microalgal and cyanobacterial species; however, it needs sufficient experimental validation. Functional diversity and zeta diversity, which are helpful in ecosystem process assessment, are largely unexplored for microalgal and cyanobacterial communities. Adequate knowledge of sampling designs, methods for detecting outliers and errors, and data transformation in biodiversity studies is crucial. Several analytical tools, such as analysis of variance (ANOVA), analysis of similarity (ANOSIM), multidimensional scaling (MDS) and cluster analysis, are obligatory for understanding the compositional differences of different microbial communities. Regression and multiple correlations are important for understanding the relationships among different environmental factors. Principal component analysis (PCA) and canonical correspondence analysis (CCA) are effective in interpreting the influence of environmental factors on the distribution of microalgal and cyanobacterial species in a geographical region or a land patch. Nevertheless, statistical software packages are the backbone of research activities these days, so the development of new biodiversity software packages specific to microalgae and cyanobacteria is required.
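For instance, a minimal R sketch of one of the diversity indices discussed above (the Shannon index), computed from hypothetical counts per taxon in a single sample:

```r
# Shannon diversity index H' from hypothetical counts per taxon.

counts <- c(taxon1 = 120, taxon2 = 45, taxon3 = 30, taxon4 = 5)  # illustrative counts
p <- counts / sum(counts)         # relative abundances
H <- -sum(p * log(p))             # Shannon index (natural log)
H
```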
... At present, a few studies have explored the distribution of samples [18][19][20][21]. In these studies, simple random sampling, stratified sampling, and even distribution among classes were investigated. ...
... In these studies, simple random sampling, stratified sampling, and even distribution among classes were investigated. The conclusions were not consistent, but most studies indicated that, when the main focus was overall accuracy, allocating samples in proportion to stratum size and distributing them in a spatially balanced way across regions helped to improve classification accuracy [10,20,21]. To obtain better classification results with fewer but more informative labeled samples, active learning has been widely used in land cover classification with remotely sensed images [22,23]. ...
Article
Full-text available
High-quality training samples are essential for accurate land cover classification. Due to the difficulties in collecting a large number of training samples, it is of great significance to collect a high-quality sample dataset with a limited sample size but an effective sample distribution. In this paper, we proposed an object-oriented sampling approach that segments image blocks expanded from systematically distributed seeds (the object-oriented sampling approach) and carried out a rigorous comparison of seven sampling strategies, including random sampling, systematic sampling, stratified sampling (stratified sampling with strata defined by the land cover classes of an existing classification product, Latin hypercube sampling, and spatial Latin hypercube sampling), object-oriented sampling, and manual sampling, to explore the impact of training sample distribution on the accuracy of land cover classification when the samples are limited. Five study areas from different climate zones were selected along the China–Mongolia border. Our research identified the proposed object-oriented sampling approach as the first-choice strategy for collecting training samples. This approach improved the diversity and completeness of the training sample set. Stratified sampling with strata defined by the combination of different attributes and stratified sampling with the strata of land cover classes had their limitations, and they performed well in specific situations when enough prior knowledge or a high-accuracy product is available. Manual sampling was greatly influenced by the experience of the interpreters. All the sampling strategies mentioned above outperformed random sampling and systematic sampling in this study. The results indicate that the sampling strategies used for training datasets do have great impacts on land cover classification accuracies when the sample size is limited. This paper will provide guidance for efficient training sample collection to increase classification accuracies.
... These considerations suggest that, for reasons of transparency and communication, sampling schemes should be kept simple, simply ensuring spatially balanced samples (e.g. Pagliarella et al., 2018), while auxiliary information should be exploited at the estimation level. Indeed, simple, spatially balanced schemes can be readily obtained by partitioning the area to be sampled into tessels of equal size and randomly selecting a plot/pixel in each tessel. ...
Article
Model-assisted estimation of forest wood volume is approached by exploiting the wall-to-wall information available from satellite data and the partial information obtained from airborne laser scanning (ALS) covering a portion of the survey area. If the portion covered by ALS is selected by a probabilistic sampling scheme, two-phase estimators are considered in which the two sources of information are exploited by means of linear and non-linear models. If the portion covered by ALS is fixed because it is purposively selected, the two sources of information are exploited by the double-calibration estimator. The performance of the proposed strategies is checked by a simulation study based on two study areas in Southern and Northern Italy.
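A hedged R sketch of a generalized difference (model-assisted) estimator of a population total under SRSWoR, with simulated wall-to-wall auxiliary data standing in for the satellite information and a simulated subsample standing in for the ALS coverage; this is a textbook form of the estimator, not necessarily the exact two-phase or double-calibration estimators developed in the article.

```r
# Generalized difference estimator of a total: wall-to-wall model predictions
# are summed over the population and corrected by expanded residuals from the
# sampled portion. All data are simulated for illustration.

set.seed(5)
N <- 5000; n <- 250
x <- runif(N, 10, 40)                   # wall-to-wall auxiliary (e.g. satellite metric)
y <- 5 + 6 * x + rnorm(N, sd = 20)      # survey variable (e.g. wood volume)
s <- sample(N, n)                       # SRSWoR "ALS" sample, so pi_i = n / N

fit    <- lm(y[s] ~ x[s])               # working model fitted on the sampled portion
y_hat  <- coef(fit)[1] + coef(fit)[2] * x              # predictions for every unit
T_diff <- sum(y_hat) + (N / n) * sum(y[s] - y_hat[s])  # prediction sum + expanded residuals
c(estimate = T_diff, true_total = sum(y))
```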
... Forest inventories have a long history of using systematic sampling (Spurr 1952, p. 379) that continues to this date at local, regional, and national levels (Brooks and Wiant Jr 2004; Kangas and Maltamo 2006; Nelson et al. 2008; Tomppo et al. 2010; Vidal et al. 2016). Since forests exhibit non-random spatial structures (Sherrill et al. 2008; Alves et al. 2010; von Gadow et al. 2012; Pagliarella et al. 2018), the main benefit of a uniform sampling intensity across the population under study (i.e. spatial balance) is an anticipated lower variance in an estimate of the population mean (or total). ...
Article
Full-text available
Background: Large-area forest inventories often use regular grids (with a single random start) of sample locations to ensure a uniform sampling intensity across the space of the surveyed populations. A design-unbiased estimator of variance does not exist for this design. Oftentimes, a quasi-default estimator applicable to simple random sampling (SRS) is used, even though it carries the likely risk of overestimating the variance by a practically important margin. To better exploit the precision of systematic sampling, we assess the performance of five estimators of variance, including the quasi-default. In this study, simulated systematic sampling was applied to artificial populations with contrasting covariance structures and with or without linear trends. We compared the results obtained with the SRS, Matérn’s, successive difference replication (SDR), Ripley’s, and D’Orazio’s (DOR) variance estimators. Results: The variances obtained with the four alternatives to the SRS estimator of variance were strongly correlated and, in all study settings, consistently closer to the target design variance than the SRS estimator. The latter always produced the greatest overestimation. In populations with near-zero spatial autocorrelation, all estimators performed equally and delivered estimates close to the actual design variance. Conclusion: Without a linear trend, the SDR and DOR estimators were best, with variance estimates more narrowly distributed around the benchmark; yet in terms of the least average absolute deviation, Matérn’s estimator held a narrow lead. With a strong or moderate linear trend, Matérn’s estimator is the estimator of choice. In large populations and at a low sampling intensity, the performance of the investigated estimators becomes more similar.
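A simplified, one-dimensional R illustration of the issue studied above: the SRS variance formula applied to a systematic sample versus a successive-difference estimator (a non-replication analogue of SDR). The population, trend and sample size are invented, and the estimators compared in the paper are more elaborate.

```r
# SRS-formula variance versus successive-difference variance for the mean of
# a 1-D systematic sample drawn from a population with a linear trend.

set.seed(3)
N <- 10000; n <- 100; k <- N / n
pop <- 50 + 0.01 * (1:N) + rnorm(N, sd = 2)       # population with a linear trend
y   <- pop[seq(sample(k, 1), N, by = k)]          # systematic sample, random start

v_srs <- (1 - n / N) * var(y) / n                             # quasi-default SRS formula
v_sd  <- (1 - n / N) * sum(diff(y)^2) / (2 * n * (n - 1))     # successive differences
c(v_srs = v_srs, v_sd = v_sd)   # under a trend the SRS formula grossly overstates the variance
```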
... From a modelling and efficiency perspective, it is advantageous to distribute the forest inventory samples uniformly across the population of interest (Mostafa, Ahmad 2017; Pagliarella et al. 2018; Räty et al. 2018) or to emulate the population distribution of one or more auxiliary variables (Grafström, Ringvall 2013; Grafström et al. 2017). These and other sampling approaches are becoming popular, and have in many instances replaced stand-level inventories (Duplat, Perrotte 1981; Mäkelä, Pekkarinen 2004; Muukkonen, Heiskanen 2007; Nothdurft et al. 2009; Kangas et al. 2018). ...
Article
Full-text available
Forest inventories provide predictions of stand means on a routine basis from models with auxiliary variables from remote sensing as predictors and response variables from field data. Many forest inventory sampling designs do not afford a direct estimation of the among-stand variance. As a consequence, the confidence interval for a model-based prediction of a stand mean is typically too narrow. We propose a new method to compute (from empirical regression residuals) an among-stand variance under sample designs that stratify sample selections by an auxiliary variable but otherwise do not allow a direct estimation of this variance. We test the method in simulated sampling from a complex artificial population with an age-class structure. Two sampling designs are used (one-per-stratum and quasi-systematic), neither of which recognizes stands. Among-stand estimates of variance obtained with the proposed method underestimated the actual variance by 30–50%, yet 95% confidence intervals for a stand mean achieved a coverage that was either slightly better than or on par with the coverage achieved with empirical linear best unbiased estimates obtained under less efficient two-stage designs.
... Model selection has often been acknowledged as one of the most critical steps by modellers in many research fields, correctly achievable through a statistical comparison [30][31][32]. Several studies have also focused on data quality when dealing with climate [33,34], forest mensuration [35,36] and modelling activities in general [37][38][39]. While data quality was detected here once more as the focal point for cost-effective research, the same cannot be said for modelling tools. ...
Article
Full-text available
Stem tapers are mathematical functions modelling the relative decrease of diameter (rD) as the relative height (rH) increases in trees, and they can be successfully used in precision forest harvesting. In this paper, the diameters of the stem at various heights of 202 Pinus nigra trees were fully measured by means of an optical relascope (CRITERION RD 1000), adopting a two-step non-destructive strategy. Data were modelled with four equations: a linear model, two polynomial functions (second and third order) and a generalised additive model. Predictions were also compared with the output of the TapeR R package, an object-oriented tool implementing B-spline functions and widely used in the literature and scientific research. Overall, the high quality of the database was detected as the most important driver for modelling, with the algorithms being almost equivalent to each other. The use of a non-destructive sampling method allowed the full measurement of all the trees necessary to properly build a mathematical function. The results clearly highlight the ability of all the tested models to reach a high statistical significance, with an adjusted R-squared higher than 0.9. A very low mean relative absolute error was also calculated with a cross-validation procedure, with small associated standard deviations. Substantial differences were detected with respect to the TapeR predictions. Indeed, the use of mixed models improved the data handling, with outputs not affected by autocorrelation, which is one of the main issues when measuring tree profiles. The profile data violate one of the basic assumptions of modelling: the independence of sampled units (i.e., autocorrelation of measured values across the stem of a tree). Consequently, the use of simple parametric equations can only be a temporary resource until more complex built-in apps allow basic users to exploit more powerful modelling techniques.
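A brief R sketch of the kind of polynomial taper fit described above, using simulated rD-rH pairs; the coefficients and noise level are invented and do not correspond to the measured Pinus nigra data.

```r
# Fit second- and third-order polynomial taper curves to simulated
# relative diameter (rD) versus relative height (rH) data.

set.seed(11)
rH <- runif(400, 0, 1)                                                # relative height
rD <- pmax(0, 1.1 - 1.3 * rH + 0.3 * rH^2 + rnorm(400, sd = 0.04))    # relative diameter

fit2 <- lm(rD ~ poly(rH, 2))                 # second-order polynomial
fit3 <- lm(rD ~ poly(rH, 3))                 # third-order polynomial
sapply(list(order2 = fit2, order3 = fit3),
       function(m) summary(m)$adj.r.squared) # compare adjusted R-squared
```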
... Their use has been recently discouraged in forest studies (e.g. Pagliarella et al., 2018). ...
Article
Spatial populations are usually located on a continuous support. They can be surfaces representing the values of the survey variable at any location, finite collections of units with the corresponding values of the survey variable, or finite collections of areal units partitioning the support, where the survey variable is the total amount of an attribute within. We derive conditions on the design sequence ensuring consistency of the Horvitz–Thompson estimator of spatial population totals, supposing minimal requirements on the survey variable. A simulation study is performed to check theoretical results. Consistency and its implications in real surveys are discussed with focus on environmental surveys.
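For reference, a minimal R sketch of the Horvitz–Thompson estimator whose consistency the article studies; the sample values and inclusion probabilities below are purely illustrative.

```r
# Horvitz-Thompson estimator of a population total from a probability sample.

y    <- c(12.4, 8.1, 20.3, 5.7)      # survey variable on the sampled units
pik  <- c(0.02, 0.01, 0.05, 0.01)    # first-order inclusion probabilities
T_ht <- sum(y / pik)                 # Horvitz-Thompson estimate of the total
T_ht
```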