Figure - uploaded by Yin Wang
The evaluation of stepwise, RF and XGBoost models for maize AGB estimation.

Source publication
Article
Full-text available
Monitoring the aboveground biomass (AGB) of maize is essential for improving site-specific nutrient management and predicting yield to ensure food safety. A low-altitude unmanned aerial vehicle (UAV) was employed to acquire hyperspectral imagery of the maize canopy at three growth stages (V6, R1, R3) to estimate the maize AGB. Five maize nitrogen (...

Context in source publication

Context 1
... compare the AGB estimation results based on linear regression and machine learning models (RF, XGBoost), the same variables, including 30 VIs and crop height, were employed in the three models. As Table 3 shows, regardless of which model was applied, the estimation of dry AGB always achieved better performance than the estimation of fresh AGB at the V6 growth stage. Although RMSEs of estimated dry AGB were lower than the RMSEs of estimated fresh AGB at R1 growth stage, REs of estimated dry AGB were higher than REs of estimated fresh AGB, and this was due to the large mean value of fresh AGB at R1 growth stage. ...
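The interplay between RMSE and RE described in this context can be illustrated with a small sketch. It assumes RE is the RMSE divided by the mean of the observed values, expressed in percent (a common convention; the excerpt does not state the exact formula). All numbers are invented for illustration and are not taken from Table 3.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def relative_error(observed, predicted):
    """RMSE as a percentage of the observed mean (assumed definition of RE)."""
    mean_observed = sum(observed) / len(observed)
    return 100.0 * rmse(observed, predicted) / mean_observed

# Hypothetical fresh AGB (t/ha): large mean, every prediction off by 4.
fresh_obs = [38.0, 40.0, 42.0, 40.0]
fresh_pred = [42.0, 36.0, 46.0, 36.0]

# Hypothetical dry AGB (t/ha): small mean, every prediction off by 1.2.
dry_obs = [7.0, 8.0, 9.0, 8.0]
dry_pred = [8.2, 6.8, 10.2, 6.8]

for label, obs, pred in [("fresh", fresh_obs, fresh_pred), ("dry", dry_obs, dry_pred)]:
    print(f"{label}: RMSE = {rmse(obs, pred):.2f} t/ha, RE = {relative_error(obs, pred):.1f} %")
```

In this toy case the dry AGB has the lower RMSE but the higher RE, because its error is divided by a much smaller mean than the fresh AGB error is.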

Similar publications

Article
Full-text available
The management of low-density savannah and woodland forests for carbon storage presents a mechanism to offset the expense of ecologically informed forest management strategies. However, existing carbon monitoring systems draw on vast amounts of either field observations or aerial light detection and ranging (LiDAR) collections, making them financia...
Article
Full-text available
Remote sensing is not yet widely used for plantation forest management in Java, even though its capabilities for land cover monitoring and forest resource mapping have been developed, ranging from low-resolution imagery for global areas to moderate- and high-resolution imagery for small-scale areas. Data availability and human res...
Article
Full-text available
Remote sensing techniques are frequently applied for the surveying of remote areas, where the use of conventional surveying techniques remains difficult and impracticable. In this paper, we focus on one of the remote glacier areas, namely the Tyndall Glacier area in the Southern Patagonian Icefield in Chile. Based on optical remote sensing data in...
Article
Full-text available
Maize is the crop with the largest planting area in the middle reaches of the Heihe River, with large water requirements and high evapotranspiration during the growing period. Accurately obtaining the maize planting area is important for adjusting the crop planting structure and for reasonable planning of water resources in the region...
Article
Full-text available
Forests are an essential part of the ecosystem and play an irreplaceable role in maintaining the balance of the ecosystem and protecting biodiversity. The monitoring of forest distribution plays an important role in the conservation and management of forests. This paper analyzes and compares the performance of imagery from GF-1 WFV, Landsat 8, and...

Citations

... Higher prediction accuracies were obtained during the flowering, grain-filling, and late grain maturity stages. During these stages, the characteristics of the crop change significantly with the intensity of greenness, chlorophyll concentrations, number of leaves, and plant height [29,86,87]. Therefore, the growth period can influence the capability of the UAV data to predict maize yield. ...
Article
Full-text available
Optimizing the prediction of maize (Zea mays L.) yields in smallholder farming systems enhances crop management and thus contributes to reducing hunger and achieving one of the Sustainable Development Goals (SDG 2-zero hunger). This research investigated the capability of unmanned aerial vehicle (UAV)-derived data and machine learning algorithms to estimate maize yield and evaluate its spatiotemporal variability through the phenological cycle of the crop in Bronkhorstspruit, South Africa, where UAV data collection took place over four dates (pre-flowering, flowering, grain filling, and maturity). Vegetation indices and grey-level co-occurrence matrix textural features were computed from the five spectral bands (red, green, blue, near-infrared, and red-edge) of the UAV data. Feature selection relied on the correlation between these features and the measured maize yield to estimate maize yield at each growth period. Crop yield prediction was then conducted using four machine learning (ML) regression models: Random Forest, Gradient Boosting (GradBoost), Categorical Boosting, and Extreme Gradient Boosting. The GradBoost regression showed the best overall model accuracy, with R² ranging from 0.05 to 0.67 and root mean square error from 1.93 to 2.9 t/ha. The yield variability across the growing season indicated that overall higher yield values were predicted in the grain-filling and mature growth stages for both maize fields. An analysis of variance using Welch's test indicated statistically significant differences in maize yields from the pre-flowering to mature growing stages of the crop (p-value < 0.01). These findings show the utility of UAV data and advanced modelling in detecting yield variations across space and time within smallholder farming environments. Assessing the spatiotemporal variability of maize yields in such environments accurately and in a timely manner improves decision-making, which is essential for ensuring sustainable crop production.
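The correlation-based feature selection mentioned in this abstract could look roughly like the sketch below. The abstract does not give the selection rule, so the threshold of 0.5 on the absolute Pearson correlation, the feature names, and all values are hypothetical and serve only to illustrate the idea.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def select_features(features, target, threshold=0.5):
    """Keep features whose |r| with the target meets the (assumed) threshold.

    Returns the kept names, strongest correlation first, plus all scores.
    """
    scores = {name: pearson(values, target) for name, values in features.items()}
    kept = [name for name, r in scores.items() if abs(r) >= threshold]
    kept.sort(key=lambda name: abs(scores[name]), reverse=True)
    return kept, scores

# Hypothetical plot-level data: measured yield (t/ha) and three candidate features.
measured_yield = [2.0, 3.0, 4.0, 5.0, 6.0]
candidate_features = {
    "ndvi": [0.3, 0.4, 0.5, 0.6, 0.7],          # rises with yield
    "red_band": [0.20, 0.18, 0.15, 0.13, 0.10],  # falls with yield
    "texture": [1.0, 3.0, 2.0, 3.0, 1.0],        # unrelated to yield
}

selected, scores = select_features(candidate_features, measured_yield)
print(selected)
```

Using the absolute value keeps strongly negative predictors (here the made-up red band) as well as positive ones; a real workflow would repeat the selection per growth period, as the abstract describes.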
... Although handheld SPAD meters partially compensate for the shortcomings of traditional methods, the collection of chlorophyll content in large-scale vegetation communities still requires significant time and labor costs [12]. Spectral remote sensing technology is essential for establishing links between canopy reflectance and biophysical and biochemical parameters and can estimate chlorophyll content at the canopy scale. ...
... Currently, there is extensive research focused on crops and cash crops [7,12,20], but studies applying hyperspectral remote sensing technology to the inversion of chlorophyll content in glycyrrhiza are scarce. This study applies hyperspectral remote sensing technology to the inversion of chlorophyll content in glycyrrhiza, comparing the performance across different growth stages and various models. ...
Article
Full-text available
Glycyrrhiza is an important medicinal crop that has been extensively utilized in the food and medical sectors, yet studies on hyperspectral remote sensing monitoring of glycyrrhiza are currently scarce. This study analyzes glycyrrhiza hyperspectral images, extracts characteristic bands and vegetation indices, and constructs inversion models using different input features. The study obtained ground and unmanned aerial vehicle (UAV) hyperspectral images and chlorophyll content (called Soil and Plant Analyzer Development (SPAD) values) from sampling sites at three growth stages of glycyrrhiza (regreening, flowering, and maturity). Hyperspectral data were smoothed using the Savitzky–Golay filter, and the feature vegetation index was selected using the Pearson Correlation Coefficient (PCC) and Recursive Feature Elimination (RFE). Feature extraction was performed using Competitive Adaptive Reweighted Sampling (CARS), Genetic Algorithm (GA), and Successive Projections Algorithm (SPA). The SPAD values were then inverted using Partial Least Squares Regression (PLSR), Support Vector Regression (SVR), Random Forest (RF), and Extreme Gradient Boosting (XGBoost), and the results were analyzed visually. The results indicate that in the ground glycyrrhiza inversion model, the GA-XGBoost model combination performed best during the regreening period, with R2, RMSE, and MAE values of 0.95, 0.967, and 0.825, respectively, showing improved model accuracy compared to full-spectrum methods. In the UAV glycyrrhiza inversion model, the CARS-PLSR combination algorithm yielded the best results during the maturity stage, with R2, RMSE, and MAE values of 0.83, 1.279, and 1.215, respectively. This study proposes a method combining feature selection techniques and machine learning algorithms that can provide a reference for rapid, nondestructive inversion of glycyrrhiza SPAD at different growth stages using hyperspectral sensors. 
This is significant for monitoring the growth of glycyrrhiza, managing fertilization, and advancing precision agriculture.
... Their optimized AGB models, constructed using random forests and multi-step regression methods, exhibited high accuracy levels for various tree species. Furthermore, Zhang's team [12] estimated maize biomass under different nitrogen fertilizer levels in Pingshan County, Jilin Province, utilizing low-altitude unmanned aerial vehicles and hyperspectral images. Employing the XGBoost model, they achieved high-precision predictions for fresh weight and dry weight AGB, particularly during the V6 growth stage (R² = 0.81, RMSE = 0.27 t/ha). ...
Article
Full-text available
Deep learning methodologies employed for biomass prediction often neglect the intricate relationships between labels and samples, resulting in suboptimal predictive performance. This paper introduces an advanced supervised contrastive learning technique, termed Improved Supervised Contrastive Deep Regression (SCDR), which is adept at effectively capturing the nuanced relationships between samples and labels in the feature space, thereby mitigating this limitation. Simultaneously, we propose the U-like Hierarchical Residual Fusion Network (BioUMixer), a bespoke biomass prediction network tailored for image data. BioUMixer enhances feature extraction from biomass image data, facilitating information exchange and fusion while considering both global and local features within the images. The efficacy of the proposed method is validated on the Pepper_Biomass dataset, which encompasses over 600 original images paired with corresponding biomass labels. The results demonstrate a noteworthy enhancement in deep regression tasks, as evidenced by performance metrics on the Pepper_Biomass dataset, including RMSE = 252.18, MAE = 201.98, and MAPE = 0.107. Additionally, assessment on the publicly accessible GrassClover dataset yields metrics of RMSE = 47.92, MAE = 31.74, and MAPE = 0.192. This study not only introduces a novel approach but also provides compelling empirical evidence supporting the digitization and precision improvement of agricultural technology. The research outcomes align closely with the identified problem and research statement, underscoring the significance of the proposed methodologies in advancing the field of biomass prediction through state-of-the-art deep learning techniques.
... Compared with satellite remote sensing, UAV remote sensing is characterized by flexible operation, high resolution, and solid real-time data acquisition capability, which extends the application scope in various fields [16]. Particularly noteworthy is the application of UAV hyperspectral data, which provides more detailed spectral information for features, captures slight differences between spectra, significantly improves the classification ability, and has shown great potential for application in the fields of environmental monitoring [17,18], agricultural production [19,20], and resource investigation [21]. However, the numerous spectral bands of hyperspectral data also bring complex data structures and large data volumes, requiring efficient analysis methods [16]. ...
Article
Hickory trees possess substantial economic, nutritional, and ecological value, and they play a pivotal role in both human societies and natural ecosystems. To maximize benefits, many bases have adopted a composite planting strategy. Effective and timely monitoring of hickory forest species information is crucial for precise management and conservation. In this study, a low-altitude unmanned aerial vehicle (UAV) was employed to capture RGB and hyperspectral data from the canopy of two hickory bases. A hybrid convolutional neural network structure was then utilized to classify different tree species and congeneric hickory species. By introducing a channel attention module to refine the features of hyperspectral images, classification stability was enhanced. The experimental results demonstrate that hyperspectral images yield superior classification performance in hickory species identification compared to RGB images, particularly in classifying highly homogeneous tree species. The 3D-2DCNN-CA proposed in this paper demonstrated the best performance in classifying hickory species compared to other classification methods. In the classification of different tree species, the accuracy for a single hickory reached 99.38%, and in the classification of the same hickory species, it achieved an accuracy of 93.97%. Furthermore, this method also achieved significant classification results at the single-tree scale. These results indicate that the method can realize fine-scale monitoring of hickory forests and provide substantial support for forest land management and expert guidance on planting distribution.
... They concluded that accuracy was better when data was initially divided into different groups before classification. Research on crop height for maize biomass by Zhang [40] used UAV hyperspectral imagery. Aboveground biomass (AGB) was predicted using stepwise regression and an XGBoost regression model, with a highest accuracy of R-squared (R²) of 0.81 and a root mean square error (RMSE) of 0.27, showing that hyperspectral imagery can play a vital role in the estimation of maize aboveground biomass with better accuracy. ...
Article
Full-text available
Recently, there has been a notable surge of interest in scientific research regarding spectral images. The potential of these images to revolutionize the digital photography industry, like aerial photography through Unmanned Aerial Vehicles (UAVs), has captured considerable attention. One encouraging aspect is their combination with machine learning and deep learning algorithms, which have demonstrated remarkable outcomes in image classification. As a result of this powerful amalgamation, the adoption of spectral images has experienced exponential growth across various domains, with agriculture being one of the prominent beneficiaries. This paper presents an extensive survey encompassing multispectral and hyperspectral images, focusing on their applications for classification challenges in diverse agricultural areas, including plants, grains, fruits, and vegetables. By meticulously examining primary studies, we delve into the specific agricultural domains where multispectral and hyperspectral images have found practical use. Additionally, our attention is directed towards utilizing machine learning techniques for effectively classifying hyperspectral images within the agricultural context. The findings of our investigation reveal that deep learning and support vector machines have emerged as widely employed methods for hyperspectral image classification in agriculture. Nevertheless, we also shed light on the various issues and limitations of working with spectral images. This comprehensive analysis aims to provide valuable insights into the current state of spectral imaging in agriculture and its potential for future advancements.
... In their study, they utilized data acquired by airborne hyperspectral and thermal sensors; meanwhile, machine learning techniques were employed to establish a relationship between physiological features extracted from hyperspectral and thermal images and wheat grain protein content. [12] utilized hyperspectral images obtained by unmanned aerial vehicles, in conjunction with crop height and narrowband vegetation indices, and successfully estimated aboveground biomass in maize. However, most research focused on post-processing operation of the acquired images of line-scanning spectrometers. ...
Article
Full-text available
In this paper, the principles of spectral data cube reconstruction based on an integral field snapshot imaging spectrometer and GPU-based acceleration are presented. The primary focus is on improving the reconstruction algorithm using GPU parallel computing technology to enhance the computational efficiency for real-time applications. The computational tasks of the spectral reconstruction algorithm were transferred to the GPU through program parallelization and memory optimization, resulting in significant performance gains. Experimental results indicate that the average processing time of the GPU-based parallel algorithm is approximately 29.43 ms, showing a substantial acceleration ratio of about 14.27 compared to the traditional CPU serial algorithm with an average processing time of around 420.46 ms. The study aims to refine the GPU parallelization algorithm for continued improvement in computational efficiency and overall performance. The anticipated applications of this research include providing crucial technical support for the perception and monitoring of crop growth traits in agricultural production, contributing to the modernization and advancement of intelligence in the field.
... The conventional regression approaches have been overcome by ML and deep learning to provide precise and accurate statistical predictions [20,21]. Several studies have recently observed the statistical metrics of ML algorithms, for instance, support vector regression (SVR) [22], random forest (RF) regression [23], and XGBoost regression [24], to predict crop production at local (i.e., province) scales. The study by [8] investigated eight different ML classifiers and regressors to forecast the outcome of wheat in the winter season in China. ...
Article
Full-text available
Predictions of crop production in the Chi basin are of major importance for decision support tools in countries such as Thailand, which aims to increase domestic income and global food security by implementing the appropriate policies. This research aims to establish a predictive model for predicting crop production for an internal crop growth season prior to harvest at the province scale for fourteen provinces in Thailand’s Chi basin between 2011 and 2019. We provide approaches for reducing redundant variables and multicollinearity in remotely sensed (RS) and meteorological data to avoid overfitting models using correlation analysis (CA) and the variance inflation factor (VIF). The temperature condition index (TCI), the normalized difference vegetation index (NDVI), land surface temperature (LSTnighttime), and mean temperature (Tmean) were the resulting variables in the prediction model with a p-value < 0.05 and a VIF < 5. The baseline data (2011–2017: June to November) were used to train four regression models, which revealed that eXtreme Gradient Boosting (XGBoost), random forest (RF), and XGBoost achieved R2 values of 0.95, 0.94, and 0.93, respectively. In addition, the testing dataset (2018–2019) displayed a minimum root-mean-square error (RMSE) of 0.18 ton/ha for the optimal solution by integrating variables and applying the XGBoost model. Accordingly, it is estimated that between 2020 and 2022, the total crop production in the Chi basin region will be 7.88, 7.64, and 7.72 million tons, respectively. The results demonstrated that the proposed model is proficient at greatly improving crop yield prediction accuracy when compared to a conventional regression method and that it may be deployed in different regions to assist farmers and policymakers in making more informed decisions about agricultural practices and resource allocation.
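The VIF screening step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it relies on the standard identity that the VIF of each predictor equals the corresponding diagonal entry of the inverse of the predictors' correlation matrix, and it applies the VIF < 5 cutoff once rather than iteratively. The variable values are invented, and the correlation analysis (CA) step is not shown.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting.

    Fails with ZeroDivisionError if the matrix is singular, which happens
    when predictors are perfectly collinear.
    """
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for row in range(n):
            if row != col:
                factor = aug[row][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [row[n:] for row in aug]

def vif(columns):
    """Variance inflation factor per predictor (diagonal of the inverse correlation matrix)."""
    names = list(columns)
    corr = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    inverse = invert(corr)
    return {name: inverse[i][i] for i, name in enumerate(names)}

# Hypothetical predictor values; the names echo the abstract, the numbers do not.
predictors = {
    "tci": [1.0, 2.0, 3.0, 4.0, 5.0],
    "ndvi": [2.0, 1.0, 4.0, 3.0, 5.0],
    "lst_night": [1.0, 3.0, 2.0, 3.0, 1.0],
}

vifs = vif(predictors)
kept = [name for name, value in vifs.items() if value < 5]  # cutoff from the abstract
print(vifs, kept)
```

In practice the predictor with the largest VIF is often dropped and the VIFs recomputed until all remaining values fall below the cutoff; the single pass here only shows how the statistic itself is obtained.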
... Spectral positions of ASD and UHD185 primarily exhibited sensitivity to AGB within the green peak, red valley, red edge and near-infrared regions. These spectral positions had frequently been utilized in previous research for estimating AGB (Liu et al., 2021a; Yue et al., 2021b; Zhang et al., 2021). The sensitive spectral bands selected through CARS were primarily located within the aforementioned spectral region (Fig. 6 and Table 4). ...
Article
Full-text available
Accurately estimating potato above-ground biomass (AGB), which is closely associated with the growth and yield of crops, carries significant importance for guiding field management practices. Hyperspectral techniques have emerged as a powerful and efficient tool for quickly and non-invasively acquiring information about AGB due to its capability to provide rich spectral data closely related to crop physiology and biochemistry. However, using spectral features obtained from hyperspectral data, such as spectral reflectance and vegetation indices (VIs), often leads to inaccurate estimations of crop AGB at multiple growth stages due to spectral saturation effects and dynamic changes in spectral responses. To enhance the robustness of AGB estimation models, this study proposed a harmonic decomposition (HD) method derived from Fourier series to extract energy features. The ground (referred to as ASD) and unmanned aerial vehicle hyperspectral (referred to as UHD185) remote sensing data from three growth stages of potatoes in 2018 (validation set) and 2019 (calibration set) were utilized in the study. Firstly, a comparison was made between the spectral reflectance of the potato canopy measured by the ASD and UHD185 sensors. Subsequently, the correlation between spectral reflectance, VIs, and harmonic components obtained from ASD and UHD185 sensors was analyzed in relation to AGB at both the individual and whole growth stage. Then, sensitive bands selected through CARS (competitive adaptive reweighted sampling), the entire spectral reflectance, VIs, and harmonic components, were utilized to construct AGB estimation models by partial least squares regression (PLSR). Finally, the optimal model performance was validated across different years, growth stages, and treatment conditions. 
The results showed there were differences in spectral reflectance acquired by ASD and UHD185 sensors across various wavelengths, but overall, there was a high level of consistency between the two. The correlation of spectral reflectance and VIs with potato AGB at individual growth stage was notably higher than that observed for entire growth stages. The accuracy of AGB estimation using VIs obtained from ASD (the R2, RMSE and NRMSE of validation sets were 0.52, 592 kg/hm2 and 26.91 %, respectively) and UHD185 (the R2, RMSE and NRMSE of validation sets were 0.46, 612 kg/hm2 and 27.82 %, respectively) sensors were low. Utilizing sensitive bands and full spectral reflectance separately improved the precision of models, although the enhancement was somewhat limited. The HD-PLSR models from ASD (the R2, RMSE and NRMSE of validation sets were 0.69, 477 kg/hm2 and 21.69 %, respectively) and UHD185 (the R2, RMSE and NRMSE of validation sets were 0.66, 481 kg/hm2 and 21.86 %, respectively) achieved the best AGB estimation results. Using the HD-PLSR model to estimate AGB for two years, the R2 values were 0.79 and 0.76 for ASD and UHD185, with RMSE values of 381 kg/hm2 and 386 kg/hm2 and NRMSE values of 22.35 % and 22.70 %, respectively. The capability of the HD-PLSR model was confirmed at various growth stages and treatments. This work offers valuable remote sensing technical support for implementing potato growth monitoring and yield assessment in the field.
... Li et al. (2021) observed that XGB surpassed RF in performance, and another comparison by Li et al. (2020) revealed that XGB excelled beyond both RF and linear regression. The findings of this study are also in concordance with the research done by Zhang et al. (2021) and Luo et al. (2022), which have shown that XGB tends to surpass RF in the performance of regression models. The RF algorithm demonstrated greater ease of calibration and resilience against overfitting compared to BRT, an advantage linked to the bagging technique, which lessens the prediction model's variance. ...
Article
Estimating above-ground biomass (AGB) using machine learning (ML) algorithms and multi-sensor satellite data is a promising approach for monitoring and managing forest resources. This research integrated synthetic aperture radar (SAR) and multispectral imagery alongside in-field observations to accurately estimate above-ground biomass (AGB) in the Purna regional landscape of northern Western Ghats, India. The satellite data employed in the study included dual-polarization (VV + VH) imagery from Sentinel-1 and multi-spectral bands from Sentinel-2, processed and analysed using advanced ML algorithms. The ML algorithms, namely Random Forest (RF), Extreme Gradient Boosting (XGB), and Boosted Regression Trees (BRT), were strategically applied across different model scenarios to determine their effectiveness in AGB prediction. The XGB model displayed the highest accuracy with an R2 value of 0.61 and the lowest RMSE of 37.85 t/ha. The spatial distribution of AGB was successfully mapped, showing varied biomass concentrations throughout the study area. The study’s findings demonstrate the potential of integrating SAR and multispectral data for enhanced AGB estimation and suggest that ML models, specifically algorithms like RF, XGB, and BRT can address the complex relationships between AGB and satellite-derived variables more effectively than traditional methods.
... The rice sector is a significant ... conventional regression approaches have been overcome by ML and deep learning to provide precise and accurate statistical predictions [20,21]. Several studies have recently observed the statistical metrics of ML algorithms, for instance, support vector regression (SVR) [22], random forest (RF) regression [23], and XGBoost regression [24], to predict crop production at local (i.e., province) scales. This study [8] investigated eight different ML classifiers and regressors to forecast the outcome of wheat in the winter season for China. ...
Preprint
Full-text available
Predictions of crop production in the Chi basin are of major importance for decision support tools in countries such as Thailand, which aim to increase domestic income and global food security by implementing the appropriate policies. This research aims to establish a predictive model for predicting crop production for an internal crop growth season prior to harvest at the province scale for fourteen provinces in Thailand's Chi basin between 2011 and 2019. We provide approaches for reducing redundant variables and multicollinearity in remotely sensed (RS) and meteorological data to avoid overfitting models using correlation analysis (CA) and variance inflation factor (VIF). Temperature condition index (TCI), normalized difference vegetation index (NDVI), land surface temperature (LSTnight), and mean temperature (Tmean) were the resulting variables in the prediction model with a p-value < 0.05 and a VIF < 5. The baseline data (2011–2017: June to November) were used to train four regression models, which revealed that eXtreme Gradient Boosting (XGBoost), random forest (RF), and XGBoost achieved R2 values of 0.95, 0.94, and 0.93, respectively. In addition, the testing dataset (2018–2019) displayed a minimum root mean square error (RMSE) of 0.18 ton/ha for the optimal solution by integrating variables and applying the XGBoost model. Accordingly, it is estimated that between 2020 and 2022, the total crop production in the Chi basin region would be 7.88, 7.64, and 7.72 million tons, respectively. The results demonstrated that the proposed model is proficient at greatly improving crop yield prediction accuracy when compared to a conventional regression method and that it may be deployed in different regions to assist farmers and policymakers in making more informed decisions about agricultural practices and resource allocation.