Kernel density estimation plot of model vs. observation for all ground site observations compared to the model (a) and the corrected model (b) for 2016-2017. The dashed line indicates the 1 : 1 line, and the coloured line indicates the line of best fit using orthogonal regression. The plot is made up of 3 783 303 data points.

Source publication
Article
Full-text available
Predictions from process-based models of environmental systems are biased, due to uncertainties in their inputs and parameterizations, reducing their utility. We develop a predictor for the bias in tropospheric ozone (O3, a key pollutant) calculated by an atmospheric chemistry transport model (GEOS-Chem), based on outputs from the model and observa...

Context in source publication

Context 1
... point-by-point comparison between all of the surface data (1 January 2016–31 December 2017) and the model, with and without the bias corrector, is shown in Fig. 4. The bias corrector removes virtually all of the model bias (NMB), taking it from 0.29 to −0.04, substantially reduces the error (RMSE) from 16.2 to 7.5 ppb, and increases the correlation (Pearson's R) from 0.48 to 0.84. Although this evaluation is for a different time period than the training dataset, it is still for the same sites. ...
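The three headline statistics in this excerpt (NMB, RMSE, and Pearson's R) are straightforward to reproduce with NumPy. A minimal sketch, assuming the standard definitions of each statistic (the function name is illustrative, not from the paper):

```python
import numpy as np

def evaluate_bias(model, obs):
    """Return NMB, RMSE (in the input units), and Pearson's R
    for paired model and observation values."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    nmb = (model - obs).sum() / obs.sum()          # normalised mean bias
    rmse = np.sqrt(np.mean((model - obs) ** 2))    # root-mean-square error
    r = np.corrcoef(model, obs)[0, 1]              # Pearson correlation
    return nmb, rmse, r
```

Applied to paired hourly ozone values, these are the quantities reported above (NMB 0.29 to −0.04, RMSE 16.2 to 7.5 ppb, R 0.48 to 0.84).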

Citations

... Thus, a hybrid approach combining ML algorithms and CTM-simulated results has increasingly been used in recent years to predict air pollutants and understand their trends. Integrating data from various sources, ML methods have been used as a tool to correct the biases in the lower-resolution simulated results from CTMs (Di et al., 2017; Ivatt and Evans, 2020; Ma et al., 2021). By building on process-based CTMs, which integrate decades of accumulated knowledge in Earth system science, while taking advantage of ML to address remaining model errors, the hybrid approach has great potential to tackle air quality problems (Irrgang et al., 2021). ...
... Our analysis revealed that training the model with 1 year or more of data results in only marginal reductions in RMSE and enhancements in R² (Fig. S1 in the Supplement); thus a timescale of 2 years appears to strike a good balance between computational burden and model accuracy. These results align with the findings of Ivatt and Evans (2020), who suggested that much of the variability in the power spectrum of surface O3 can be captured by timescales of a year or less. Therefore, here we utilized observations from the 2016–2017 period as the training data, which offered a more economical computing cost and improved training time efficiency, and observations in 2018 as the independent test data to evaluate model performance. ...
Article
Full-text available
Surface ozone (O3) is well known for posing significant threats to both human health and crop production worldwide. However, a multidecadal assessment of the impacts of O3 on public health and crop yields in China is lacking due to insufficient long-term continuous O3 observations. In this study, we used a machine learning (ML) algorithm to correct the biases of O3 concentrations simulated by a chemical transport model from 1981–2019 by integrating multi-source datasets. The ML-enabled bias correction offers improved performance in reproducing observed O3 concentrations and thus further improves our estimates of the impacts of O3 on human health and crop yields. The warm-season trends of increasing O3 in Beijing–Tianjin–Hebei and its surroundings (BTHs) as well as in the Yangtze River Delta (YRD), Sichuan Basin (SCB), and Pearl River Delta (PRD) regions are 0.32, 0.63, 0.84, and 0.81 µg m−3 yr−1 from 1981 to 2019, respectively. In more recent years, O3 concentrations experienced more fluctuations in the four major regions. Our results show that only BTHs have a perceptible increasing trend of 0.81 µg m−3 yr−1 during 2013–2019. Using accumulated O3 over a threshold of 40 ppb (AOT40-China) exposure–yield response relationships, the estimated relative yield losses (RYLs) for wheat, rice, soybean, and maize are 17.6 %, 13.8 %, 11.3 %, and 7.3 % in 1981, increasing to 24.2 %, 17.5 %, 16.3 %, and 9.8 % in 2019, with an increasing rate of +0.03 % yr−1, +0.04 % yr−1, +0.27 % yr−1, and +0.13 % yr−1, respectively. The estimated annual all-cause premature deaths induced by O3 increased from ∼55 900 in 1981 to ∼162 000 in 2019 with an increasing trend of ∼2980 deaths per year. The annual premature deaths related to respiratory and cardiovascular disease are ∼34 200 and ∼40 300 in 1998 and ∼26 500 and ∼79 000 in 2019, having a rate of change of −546 and +1770 deaths per year during 1998–2019, respectively. 
Our study, for the first time, used ML to provide a robust dataset of O3 concentrations over the past 4 decades in China, enabling a long-term evaluation of O3-induced crop losses and health impacts. These findings are expected to fill the gap in long-term O3 trend and impact assessment in China.
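The AOT40 exposure metric used in this abstract accumulates hourly ozone in excess of a 40 ppb threshold, conventionally over daytime hours of the growing season. A minimal sketch of the calculation (function and variable names are illustrative, not from the paper; season and daytime selection are left to the caller):

```python
import numpy as np

def aot40(hourly_o3_ppb, daytime_mask):
    """Accumulated ozone over a 40 ppb threshold (units: ppb h).
    Only hours flagged True in daytime_mask contribute."""
    o3 = np.asarray(hourly_o3_ppb, dtype=float)
    excess = np.clip(o3 - 40.0, 0.0, None)           # only exceedances count
    return float(excess[np.asarray(daytime_mask, bool)].sum())
```

Exposure–yield response relationships such as those cited above then map the accumulated AOT40 value onto a relative yield loss for each crop.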
... GBM is similar to AdaBoost; the major difference is that GBM has a fixed base estimator, i.e., a decision tree (DT), whereas in AdaBoost the base estimator can be changed according to requirements. In GBM, the base model is built to predict the observations in the training dataset [41]. ...
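The scheme the excerpt describes — a fixed tree base estimator fitted round after round to the current residual — can be sketched in a few lines. Here a hand-rolled one-dimensional regression stump stands in for the decision tree so the example stays self-contained; this is a simplification for illustration, not the cited implementation:

```python
import numpy as np

def fit_stump(x, r):
    """Best single-split regression stump on a 1-D feature x for target r."""
    best_sse, best = np.inf, None
    for s in np.unique(x)[:-1]:                    # candidate split points
        left, right = r[x <= s], r[x > s]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best_sse:
            best_sse, best = sse, (s, left.mean(), right.mean())
    return best

def stump_predict(stump, x):
    s, lo, hi = stump
    return np.where(x <= s, lo, hi)

def gbm_fit(x, y, n_rounds=100, lr=0.1):
    """Least-squares gradient boosting with a fixed base learner:
    each round fits the base learner to the current residual."""
    pred = np.full(len(y), y.mean())               # initial constant model
    stumps = []
    for _ in range(n_rounds):
        residual = y - pred                        # negative gradient of squared loss
        st = fit_stump(x, residual)
        pred = pred + lr * stump_predict(st, x)
        stumps.append(st)
    return y.mean(), stumps

def gbm_predict(base, stumps, x, lr=0.1):
    return base + sum(lr * stump_predict(st, x) for st in stumps)
```

Swapping the stump for a depth-limited decision tree recovers the usual GBM; in AdaBoost, by contrast, the base estimator is interchangeable, as the excerpt notes.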
Article
Full-text available
Thyroid disease has been on the rise during the past few years. Owing to its importance in metabolism, early detection of thyroid disease is a task of critical importance. Despite several existing works on thyroid disease detection, the problem of class imbalance is not investigated very well. In addition, existing studies predominantly focus on the binary-class problem. This study aims to solve these issues by the proposed approach where ten types of thyroid diseases are considered. The proposed approach uses a differential evolution (DE)-based optimization algorithm to fine-tune the parameters of machine learning models. Moreover, conditional generative adversarial networks are used for data augmentation. Several sets of experiments are carried out to analyze the performance of the proposed approach with and without model optimization. Results suggest that a 0.998 accuracy score can be obtained using AdaBoost with DE optimization which is better than existing state-of-the-art models.
... For example, some studies have used recurrent neural networks (RNN) and long short-term memory (LSTM) models to predict air pollution from historical time series of pollutant and meteorological data [23], [24]. Accordingly, a new trend is to use the power of deep learning to improve the performance of deterministic models, which can be complemented by machine learning [25]. Deep learning techniques can potentially accelerate the atmospheric chemistry computations of CTMs by emulating the chemical mechanism [19]. ...
Article
Full-text available
According to the Air Quality Directive 2008/50/EC, air quality zoning divides a territory into air quality zones where pollution and citizen exposure are similar and can be monitored using similar strategies. However, there is no standardized computational methodology to solve this problem, and only a few prior experiences exist, in the Comunidad de Madrid, based on CHIMERE-WRF. In this study, we propose a methodological improvement based on the application of deep learning. Our method uses the CHIMERE-WRF air quality modelling system and adds a step that uses neural network architectures to calibrate the simulations. We have validated our method in the Region of Murcia. The results obtained are promising given the values of the Pearson coefficient, obtaining r = 0.94 for NO2 and r = 0.95 for O3, improving on the performances reported in the state of the art by 86 % and 29 %, respectively. In addition, the cluster score improves after applying neural networks, demonstrating that neural networks improve the consistency of clusters compared to the current air quality zoning. This opened new research opportunities based on the use of neural networks for dimension reduction in spatial clustering problems, and we were able to provide recommendations for a new measurement point in the Region of Murcia Air Quality Network.
... To investigate the impact of correlated and dependent features on XAI methods, we use simulated atmospheric chemistry data. This data is a useful case study as it represents a real machine learning use case with an active research community (e.g., Ivatt & Evans, 2020; Keller & Evans, 2018; Kelp et al., 2022), and the system represents a wide range of correlations across chemical species. The dataset we used was generated through coarse resolution simulations from the NASA GEOS composition forecast modeling system (GEOS-CF, Keller et al., 2021), which predicts the global distribution of hundreds of chemical species using the GEOS-Chem chemistry mechanism (Bey et al., 2001). ...
... We predicted the reaction rate using boosted regression trees, implemented with the XGBoost software package (Chen & Guestrin, 2016). Boosted regression trees, and XGBoost in particular, have been used widely across the atmospheric chemistry space and Earth and atmospheric sciences more generally (e.g., Batunacun et al., 2021; Dietz et al., 2019; Ivatt & Evans, 2020; Lee et al., 2019; Silva et al., 2020). ...
Article
Explainable Artificial Intelligence (XAI) methods are becoming popular tools for scientific discovery in the Earth and atmospheric sciences. While these techniques have the potential to revolutionize the scientific process, there are known limitations in their applicability that are frequently ignored. These limitations include that XAI methods explain the behavior of the A.I. model, not the behavior of the training dataset, and that caution should be used when these methods are applied to datasets with correlated and dependent features. Here, we explore the potential cost associated with ignoring these limitations with a simple case-study from the atmospheric chemistry literature – learning the reaction rate of a bimolecular reaction. We demonstrate that dependent and highly correlated input features can lead to spurious process-level explanations. We posit that the current generation of XAI techniques should largely only be used for understanding system-level behavior and recommend caution when using XAI methods for process-level scientific discovery in the Earth and atmospheric sciences.
... Previous studies focusing on forecasting surface ozone with ML have largely focused on predicting ozone in specific regions, often with relatively short time series of data. A variety of methods have been used, including bias-corrected CTMs (Neal et al., 2014;Ivatt and Evans, 2020), linear regression (Olszyna et al., 1997;Thompson et al., 2001), and feed-forward neural networks (Comrie, 1997;Cobourn et al., 2000). More recently, recurrent neural networks (RNNs) and convolutional neural networks (CNNs) have been used in an effort to better capture spatial and temporal dependencies (Biancofiore et al., 2015;Eslami et al., 2020;Ma et al., 2020;Sayeed et al., 2020;Kleinert et al., 2021). ...
... To our knowledge, this is the first study that uses a purely transformer-based model to make accurate forecasts at a large number of stations across different environments and countries, and furthermore, that evaluates the capacity of an ML model to make forecasts on data drawn from countries outside the training dataset. In addition, as far as we are aware, only one other study has applied an architecture with a transformer-based component to ozone forecasting (Chen et al., 2022). ...
[Comparison table recovered from the excerpt — model, R, RMSE (ppb), number of stations: AQUM (Neal et al., 2014): 0.64, 20.9, 61; bias-corrected AQUM (Neal et al., 2014): 0.76, 16.4, 61; bias-corrected GEOS-Chem (Ivatt and Evans, 2020): 0.84, 7.5, 2,200. ML methods — DRR (Debry and Mallet, 2014): 0.70, 6.3, 729; CNN: 0.77, 8.8, 21; CNN: 0.79, 12.0, 25; RNN (Biancofiore et al., 2015): 0.86, 12.5, 1; CNN-transformer (Chen et al., 2022): NA.]
Article
Full-text available
Surface ozone is an air pollutant that contributes to hundreds of thousands of premature deaths annually. Accurate short-term ozone forecasts may allow improved policy actions to reduce the risk to human health. However, forecasting surface ozone is a difficult problem, as its concentrations are controlled by a number of physical and chemical processes that act on varying timescales. We implement a state-of-the-art transformer-based model, the temporal fusion transformer, trained on observational data from three European countries. In four-day forecasts of daily maximum 8-hour ozone (DMA8), our novel approach is highly skillful (MAE = 4.9 ppb, coefficient of determination R² = 0.81) and generalizes well to data from 13 other European countries unseen during training (MAE = 5.0 ppb, R² = 0.78). The model outperforms other machine learning models on our data (ridge regression, random forests, and long short-term memory networks) and compares favorably to the performance of other published deep learning architectures tested on different data. Furthermore, we illustrate that the model pays attention to physical variables known to control ozone concentrations and that the attention mechanism allows the model to use the most relevant days of past ozone concentrations to make accurate forecasts on test data. The skillful performance of the model, particularly in generalizing to unseen European countries, suggests that machine learning methods may provide a computationally cheap approach for accurate air quality forecasting across Europe.
... These issues may be amplified given that models are often tested by taking random samples across space and time for validation, which does not provide accurate generalizable performance statistics when applying the model to make predictions at unobserved spatial locations. To improve upon these limitations, machine learning has been combined with climate or atmospheric chemistry models to speed up computation or improve prediction accuracy, by replacing time-consuming mechanism components (e.g., the chemical integrator) [12,13], memorizing the input-output relationship in mechanism models [14], correcting the bias [15], or representing subgrid processes [16]. Furthermore, there has been a growing trend of integrating physical knowledge into deep learning techniques to model fluid fields, incorporating constraints such as partial differential equations (PDEs) [17,18], yet apart from a few limited applications [19,20] there continue to be gaps in effective strategies for embedding physics in deep learning to improve finely-resolved air quality assessments. ...
... However, the lack of physical mechanisms in most machine learning methods may lead to solutions that seriously violate physical principles such as conservation laws, with potential bias and low interpretability [68]. For example, tree-based machine learners (e.g., random forest, XGBoost, and extremely randomized trees) have been widely used alone [46,50,69] or in physical models [12,15] with reported high accuracy, but the binary discretization in these algorithms inherently violates the continuity law of air pollutants, which can lead to discontinuity bias [36]. In contrast to these existing methods [70,71], and building on deep neural networks as general nonlinear approximators of any continuous function [72], the DGM directly simulated the dynamics of air mass by graph convolution and encoded the physical laws of fluid mass conservation and continuity via PDE residuals, a combined strategy for computational efficiency and generalization. ...
Article
Full-text available
Existing methods for fine-scale air quality assessment have significant gaps in their reliability. Purely data-driven methods lack any physically-based mechanisms to simulate the interactive process of air pollution, potentially leading to physically inconsistent or implausible results. Here, we report a hybrid multilevel graph neural network that encodes fluid physics to capture spatial and temporal dynamic characteristics of air pollutants. On a multi-air pollutant test in China, our method consistently improved extrapolation accuracy by an average of 11–22% compared to several baseline machine learning methods, and generated physically consistent spatiotemporal trends of air pollutants at fine spatial and temporal scales.
... GBTs can automatically select relevant features and assign appropriate weights to them during the training process, effectively handling feature interactions that may be missed by linear models. Recent studies suggest that gradient-boosted decision tree algorithms can be used in atmospheric science to improve the performance of atmospheric chemistry transport models [53] and accurately predict meteorological parameters such as air temperature and humidity [54]. ...
Article
Full-text available
This survey presents an in-depth analysis of machine learning techniques applied to lidar observations for the detection of aerosol and cloud optical, geometrical, and microphysical properties. Lidar technology, with its ability to probe the atmosphere at very high spatial and temporal resolution and measure backscattered signals, has become an invaluable tool for studying these atmospheric components. However, the complexity and diversity of lidar technology requires advanced data processing and analysis methods, where machine learning has emerged as a powerful approach. This survey focuses on the application of various machine learning techniques, including supervised and unsupervised learning algorithms and deep learning models, to extract meaningful information from lidar observations. These techniques enable the detection, classification, and characterization of aerosols and clouds by leveraging the rich features contained in lidar signals. In this article, an overview of the different machine learning architectures and algorithms employed in the field is provided, highlighting their strengths, limitations, and potential applications. Additionally, this survey examines the impact of machine learning techniques on improving the accuracy, efficiency, and robustness of aerosol and cloud real-time detection from lidar observations. By synthesizing the existing literature and providing critical insights, this survey serves as a valuable resource for researchers, practitioners, and students interested in the application of machine learning techniques to lidar technology. It not only summarizes current state-of-the-art methods but also identifies emerging trends, open challenges, and future research directions, with the aim of fostering advancements in this rapidly evolving field.
... This is likely because the data are approximately exponential in distribution, with most of the values for each regressor being 0 or near 0, except for a few spikes or outliers. The MAE is a more appropriate metric for the Laplacian-like errors often produced by exponentially distributed variables (Hodson, 2022). Moreover, as often shown in the literature, adding randomness to the GBRT also increases the score (Friedman, 2002). ...
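The point about metric choice can be illustrated numerically: over a set of values, the constant that minimises MAE is the median, while the constant that minimises MSE is the mean, and for skewed, exponential-like data the two disagree, so the two metrics reward different fits. A small demonstration on synthetic data (not the paper's data):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic regressor values: approximately exponential, mostly near zero
values = rng.exponential(scale=3.0, size=100_000)

mae = lambda c: np.abs(values - c).mean()    # MAE of a constant fit c
mse = lambda c: ((values - c) ** 2).mean()   # MSE of a constant fit c

# The median minimises MAE; the mean minimises MSE. For this skewed
# sample the median is well below the mean, so the metrics diverge.
print(mae(np.median(values)) < mae(values.mean()))   # True
print(mse(values.mean()) < mse(np.median(values)))   # True
```

A squared-error score would therefore penalise a model that correctly predicts the many near-zero values but misses a few spikes far more heavily than MAE does, which is the intuition behind preferring MAE for Laplacian-like errors.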
Article
Full-text available
Lagrangian particle dispersion models (LPDMs) have been used extensively to calculate source-receptor relationships (“footprints”) for use in applications such as greenhouse gas (GHG) flux inversions. Because a single model simulation is required for each data point, LPDMs do not scale well to applications with large data sets such as flux inversions using satellite observations. Here, we develop a proof-of-concept machine learning emulator for LPDM footprints over a ∼ 350 km × 230 km region around an observation point, and test it for a range of in situ measurement sites from around the world. As opposed to previous approaches to footprint approximation, it does not require the interpolation or smoothing of footprints produced by the LPDM. Instead, the footprint is emulated entirely from meteorological inputs. This is achieved by independently emulating the footprint magnitude at each grid cell in the domain using gradient-boosted regression trees with a selection of meteorological variables as inputs. The emulator is trained based on footprints from the UK Met Office's Numerical Atmospheric-dispersion Modelling Environment (NAME) for 2014 and 2015, and the emulated footprints are evaluated against hourly NAME output from 2016 and 2020. When compared to CH4 concentration time series generated by NAME, we show that our emulator achieves a mean R-squared score of 0.69 across all sites investigated between 2016 and 2020. The emulator can predict a footprint in around 10 ms, compared to around 10 min for the 3D simulator. This simple and interpretable proof-of-concept emulator demonstrates the potential of machine learning for LPDM emulation.
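The emulation strategy described here fits one independent regressor per grid cell of the footprint domain, driven only by meteorological inputs. The study uses gradient-boosted regression trees per cell; in the sketch below plain least squares is substituted purely to keep the illustration self-contained, and all names and shapes are illustrative assumptions:

```python
import numpy as np

def fit_per_cell_emulators(met_inputs, footprints):
    """Fit one independent least-squares emulator per grid cell.
    met_inputs: (n_samples, n_features); footprints: (n_samples, ny, nx)."""
    n, ny, nx = footprints.shape
    X = np.hstack([met_inputs, np.ones((n, 1))])   # append intercept column
    Y = footprints.reshape(n, ny * nx)             # one target column per cell
    coefs, *_ = np.linalg.lstsq(X, Y, rcond=None)  # (n_features+1, ny*nx)
    return coefs, (ny, nx)

def predict_footprint(coefs, shape, met_sample):
    """Emulate the full 2-D footprint for one meteorological sample."""
    x = np.append(np.asarray(met_sample, float), 1.0)
    return (x @ coefs).reshape(shape)
```

The per-cell independence is what makes the approach embarrassingly parallel and fast at prediction time, at the cost of ignoring spatial correlation between neighbouring cells.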
... Streamflow is subject to numerous interacting factors, some meteorological (e.g., precipitation, temperature) and some related to landscape attributes (e.g., land use pattern, slope). In recent decades, owing to their simplicity and flexibility, traditional machine learning models such as Artificial Neural Networks (ANN), Support Vector Regression (SVR), and Gradient Boosted Regression Trees (GBRT) have been extensively applied to streamflow forecasting (Kumar et al., 2021; Ferreira et al., 2021). The GBRT model, powerful and popular in many domains (Pan et al., 2019; Ivatt et al., 2020; García-Nieto et al., 2021), was deemed useful for our study. For example, Liao et al. (2020) showed a GBRT model to outperform ANN and SVR models, particularly in the forecasting of 4–10 day-ahead inflow into a dam reservoir in southern China. ...
Preprint
Full-text available
As much as accurate streamflow forecasts are important and significant for arid regions, they remain deficient and challenging. An ensemble learning strategy of decomposition-based machine learning and deep learning models was proposed to forecast multi-time-step-ahead streamflow for northwest China's Dunhuang Oasis. The efficiency and reliability of a Bayesian Model Averaging (BMA) ensemble strategy for 1-, 2-, and 3-day-ahead streamflow forecasting was evaluated in comparison with decomposition-based machine learning and deep learning models: (i) a variational-mode-decomposition model coupled with a deep-belief-network model (VMD-DBN), (ii) a variational-mode-decomposition model coupled with a gradient-boosted-regression-tree model (VMD-GBRT), (iii) a complete ensemble empirical mode decomposition with adaptive noise model coupled with a deep-belief-network model (CEEMDAN-DBN), and (iv) a complete ensemble empirical mode decomposition with adaptive noise model coupled with a gradient-boosted-regression-tree model (CEEMDAN-GBRT). Satisfactory forecasts were achieved with all proposed models at all lead times; however, based on Nash–Sutcliffe efficiency coefficient (NSE) values of 0.976, 0.967, and 0.957, the BMA model achieved the greatest accuracy for 1-, 2-, and 3-day-ahead streamflow forecasts, respectively. Uncertainty analysis confirmed the reliability of the BMA model in yielding consistently accurate streamflow forecasts. Thus, the BMA ensemble strategy could provide an efficient alternative approach to multi-time-step-ahead streamflow forecasting for areas where physically based models cannot be used due to a lack of land surface data. The application of the BMA model was particularly valuable when the ensemble members gave equivalently satisfactory performances, making it difficult to choose amongst them.
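For point forecasts, the BMA combination step reduces to a posterior-weighted average of the member models' predictions (the weights themselves are typically estimated, e.g. by EM, from performance over a training period). A minimal sketch, with illustrative names; the weight-estimation step is deliberately left out:

```python
import numpy as np

def bma_combine(member_forecasts, weights):
    """BMA point forecast: posterior-weighted average of member forecasts.
    member_forecasts: (n_members, n_times); weights: (n_members,)."""
    w = np.asarray(weights, float)
    w = w / w.sum()                  # normalise posterior model weights to 1
    return np.average(np.asarray(member_forecasts, float), axis=0, weights=w)
```

When all members perform comparably, as in the study's final remark, the weights are nearly equal and the BMA forecast behaves like a simple ensemble mean while still quantifying between-model uncertainty.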
... Ensemble modelling is one of the commonly listed modelling techniques with favourable properties for surrogate modelling (Alizadeh et al., 2020; Archetti and Candelieri, 2019). Applications in atmospheric dispersion modelling can be found in Lucas et al. (2017), for nuclear power source estimation, and in Ivatt and Evans (2020), for predicting the bias in tropospheric ozone calculated by an atmospheric chemistry transport model. The surrogate model is tested for a wide variety of realistic annual meteorological conditions, and the performance statistics against the physical model are evaluated. ...
Article
Full-text available
Atmospheric dispersion models predict the dispersion of harmful substances in case of accidents at industrial facilities and nuclear power plants (NPPs). However, high computation time limits their usage in an emergency or long-term analyses. This paper reduces the computation time by designing a surrogate data-driven model using a grid of tree ensemble models as a surrogate for the physical model and meteorological station measurements as model regressors. Regression tree modelling provided information for selecting the most important variables for prediction, while model ensembles improved the prediction accuracy. The approach is tested for an NPP in complex terrain to predict spatial (2D) maps of population doses for 24 hours after a radiological release. The average performance of 2D maps against the physical model is SMSE (Standardized Mean Square Error) < 0.5 and FMS (Figure of Merit in Space) > 0.5. The designed model performs very well in predicting the long-term mean and 95th percentile of population doses. The main shortcoming is the underestimation of very high doses. Performance is expected to be further improved by selecting training data using pattern selection techniques and potentially by alternative machine learning algorithms or interconnected models, which we intend to apply in future work.