Chapter

Accelerating Extreme-Scale Numerical Weather Prediction

Abstract

Numerical Weather Prediction (NWP) and climate simulations have been intimately connected with progress in supercomputing since the first numerical forecast was made about 65 years ago. The biggest challenge to state-of-the-art computational NWP arises today from its own software productivity shortfall. The application software at the heart of most NWP services is ill-equipped to efficiently adapt to the rapidly evolving heterogeneous hardware provided by the supercomputing industry. If this challenge is not addressed, it will have dramatic negative consequences for weather and climate prediction and associated services. This article introduces Atlas, a flexible data structure framework developed at the European Centre for Medium-Range Weather Forecasts (ECMWF) to facilitate a variety of numerical discretisation schemes on heterogeneous architectures, as a necessary step towards affordable exascale high-performance simulations of weather and climate. A newly developed hybrid MPI-OpenMP finite volume module built upon Atlas serves as a first demonstration of the parallel performance that can be achieved using Atlas' initial capabilities.
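The hybrid MPI-OpenMP parallelisation mentioned above combines distributed-memory partitioning of the grid with shared-memory threading inside each partition. The minimal sketch below illustrates that pattern only: grid points are split across MPI tasks, each task updates its local points with OpenMP threads, and a global diagnostic is formed with an MPI reduction. The point count and the update kernel are illustrative assumptions, not the Atlas finite volume module itself.

    // Minimal sketch of a hybrid MPI-OpenMP update over distributed grid points.
    // Compile with e.g. "mpicxx -fopenmp"; all quantities here are placeholders.
    #include <mpi.h>
    #include <omp.h>
    #include <cmath>
    #include <cstdio>
    #include <vector>

    int main(int argc, char** argv) {
      MPI_Init(&argc, &argv);
      int rank = 0, size = 1;
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);

      const long nglobal = 1000000;                      // hypothetical global point count
      const long nlocal  = nglobal / size + (rank < nglobal % size ? 1 : 0);
      std::vector<double> field(nlocal, 1.0);

      // Thread-parallel update of the task-local points (stand-in for a flux kernel).
      #pragma omp parallel for
      for (long i = 0; i < nlocal; ++i)
        field[i] += std::sin(0.001 * i);

      // Task-local partial sum, reduced across all MPI tasks to a global diagnostic.
      double local_sum = 0.0;
      #pragma omp parallel for reduction(+ : local_sum)
      for (long i = 0; i < nlocal; ++i)
        local_sum += field[i];

      double global_sum = 0.0;
      MPI_Allreduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
      if (rank == 0) std::printf("global sum = %f\n", global_sum);

      MPI_Finalize();
      return 0;
    }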


... For reduced grids such as the ones shown in Figure 4b and Figure 4d, or for uniformly distributed unstructured grids, an "equal regions" domain decomposition is more advantageous [11]-[13]. The "equal regions" partitioning algorithm divides a two-dimensional grid of the sphere (i.e. ...
Preprint
Full-text available
This document is one of the deliverable reports created for the ESCAPE project. ESCAPE stands for Energy-efficient Scalable Algorithms for Weather Prediction at Exascale. The project develops world-class, extreme-scale computing capabilities for European operational numerical weather prediction and future climate models. This is done by identifying Weather & Climate dwarfs, which are key patterns in terms of computation and communication (in the spirit of the Berkeley dwarfs). These dwarfs are then optimised for different hardware architectures (single and multi-node) and alternative algorithms are explored. Performance portability is addressed through the use of domain specific languages. In this deliverable report, we present Atlas, a new software library that is currently being developed at the European Centre for Medium-Range Weather Forecasts (ECMWF), with the aim of handling data structures required for NWP applications in a flexible and massively parallel way. Atlas provides a versatile framework for the future development of efficient NWP and climate applications on emerging HPC architectures. The applications range from full Earth system models to specific tools required for post-processing weather forecast products. Atlas provides data structures for building various numerical strategies to solve equations on the sphere or on limited areas of the sphere. These data structures may contain a distribution of points (grid) and, possibly, a composition of elements (mesh), required to implement the desired numerical operations. Atlas can also represent a given field within a specific spatial projection. Atlas is capable of mapping fields between different grids as part of pre- and post-processing stages or as part of coupling processes whose respective fields are discretised on different grids or meshes.
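The grid / mesh / field separation described above can be pictured with a few plain data types. The sketch below uses hypothetical stand-in types chosen for illustration only; it does not reproduce the Atlas API.

    // Hypothetical stand-ins for the concepts described above: a Grid is a
    // distribution of points, a Mesh an optional composition of elements over
    // those points, and a Field holds data attached to the points of a Grid.
    #include <array>
    #include <cstddef>
    #include <cstdio>
    #include <string>
    #include <vector>

    struct PointLonLat { double lon, lat; };              // geographic coordinate (degrees)

    struct Grid {                                         // distribution of points on the sphere
      std::vector<PointLonLat> points;
    };

    struct Mesh {                                         // composition of elements over a Grid
      std::vector<std::array<std::size_t, 3>> triangles;  // node indices per element
    };

    struct Field {                                        // data attached to the grid points
      std::string name;
      std::vector<double> values;                         // one value per point (single level)
    };

    // Fields are created against a grid, so that operators such as gradients or
    // remapping between two grids can be built on the same point distribution.
    inline Field create_field(const Grid& grid, std::string name) {
      return Field{std::move(name), std::vector<double>(grid.points.size(), 0.0)};
    }

    int main() {
      Grid grid;
      grid.points = {{0.0, 0.0}, {90.0, 0.0}, {180.0, 0.0}, {270.0, 0.0}};
      Field t = create_field(grid, "temperature");
      std::printf("field '%s' has %zu values\n", t.name.c_str(), t.values.size());
      return 0;
    }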
Article
The assessment of dissolved oxygen (DO) concentration at the sea surface is essential for comprehending the global ocean oxygen cycle and associated environmental and biochemical processes, as it serves as the primary site for photosynthesis and sea-air exchange. However, limited comprehensive measurements and imprecise numerical simulations have impeded the study of global sea surface DO and its relationship with environmental challenges. This paper presents a novel spatiotemporal information embedding machine-learning framework that provides explanatory insights into the underlying driving mechanisms. By integrating extensive in situ data and high-resolution satellite data, the proposed framework successfully generated high-resolution (0.25° × 0.25°) estimates of DO concentration with exceptional accuracy (R2 = 0.95, RMSE = 11.95 μmol/kg, and test number = 2805) for near-global sea surface areas from 2010 to 2018, with an estimated uncertainty of ±13.02 μmol/kg. The resulting sea surface DO data set exhibits precise spatial distribution and reveals compelling correlations with prominent marine phenomena and environmental stressors. Leveraging its interpretability, our model further revealed the key influence of marine factors on surface DO and their implications for environmental issues. The presented machine-learning framework offers an improved DO data set with higher resolution, facilitating the exploration of oceanic DO variability, deoxygenation phenomena, and their potential consequences for environments.
Chapter
To determine the best method for solving a numerical problem modeled by a partial differential equation, one should consider the discretization of the problem, the computational hardware used and the implementation of the software solution. In solving a scientific computing problem, the level of accuracy can also be important, with some numerical methods being efficient for low accuracy simulations, but others more efficient for high accuracy simulations. Very few high performance benchmarking efforts allow the computational scientist to easily measure such tradeoffs in order to obtain an accurate enough numerical solution at a low computational cost. These tradeoffs are examined in the numerical solution of the one-dimensional Klein-Gordon equation on single cores of an ARM CPU, an AMD x86-64 CPU, two Intel x86-64 CPUs and a NEC SX-ACE vector processor. The work focuses on comparing the speed and accuracy of several high order finite difference spatial discretizations using a conjugate gradient linear solver and a fast Fourier transform based spatial discretization. In addition, implementations using second and fourth order timestepping are also included in the comparison. The work uses accuracy-efficiency frontiers to compare the effectiveness of five hardware platforms.
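For readers unfamiliar with the benchmark problem, the sketch below integrates a 1-D linear Klein-Gordon equation u_tt = u_xx - u on a periodic domain with a second-order central difference in space and a leapfrog step in time. The paper above compares higher-order stencils, FFT-based discretisations and different time integrators; this block only shows the simplest baseline, and all parameters are illustrative.

    // Simplified baseline discretisation of the 1-D linear Klein-Gordon equation
    // u_tt = u_xx - u on a periodic domain: second-order central differences in
    // space, second-order leapfrog in time. Illustration only.
    #include <cmath>
    #include <cstdio>
    #include <vector>

    int main() {
      const double PI = std::acos(-1.0);
      const int    n  = 512;                       // grid points
      const double L  = 2.0 * PI;                  // periodic domain length
      const double dx = L / n;
      const double dt = 0.25 * dx;                 // CFL-limited time step
      const int    steps = 2000;

      std::vector<double> um(n), u(n), up(n);      // u at t-dt, t, t+dt
      for (int i = 0; i < n; ++i) {                // smooth initial condition, zero velocity
        u[i]  = std::sin(i * dx);
        um[i] = u[i];
      }

      for (int s = 0; s < steps; ++s) {
        for (int i = 0; i < n; ++i) {
          const int im = (i + n - 1) % n, ip = (i + 1) % n;
          const double uxx = (u[ip] - 2.0 * u[i] + u[im]) / (dx * dx);
          up[i] = 2.0 * u[i] - um[i] + dt * dt * (uxx - u[i]);   // leapfrog update
        }
        um.swap(u);                                // rotate time levels
        u.swap(up);
      }

      std::printf("u[0] after %d steps = %f\n", steps, u[0]);
      return 0;
    }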
Conference Paper
AIAA Paper 2018-3497. The paper examines recent advancements in the class of Nonoscillatory Forward-in-Time (NFT) schemes that exploit the implicit LES (ILES) properties of the Multidimensional Positive Definite Advection Transport Algorithm (MPDATA). The reported developments address both global and limited area models spanning a range of atmospheric flows, from the hydrostatic regime at planetary scale down to mesoscale and microscale where flows are inherently non-hydrostatic. All models operate on fully unstructured (and hybrid) meshes and utilize a median dual mesh finite volume discretisation. High performance computations for global flows employ a bespoke hybrid MPI-OpenMP approach and utilise the ATLAS library. Simulations across scales, from a global baroclinic instability epitomising the evolution of weather systems down to stratified orographic flows rich in turbulent phenomena due to gravity-wave breaking in dispersive media, verify the computational advancements and demonstrate the efficacy of ILES both in regularizing large-scale flows at the scale of the mesh resolution and in taking the role of a subgrid-scale turbulence model in simulations of turbulent flows in the LES regime.
Article
Full-text available
Weather and climate models are complex pieces of software which include many individual components, each of which is evolving under pressure to exploit advances in computing to enhance some combination of a range of possible improvements (higher spatio-temporal resolution, increased fidelity in terms of resolved processes, more quantification of uncertainty, etc.). However, after many years of a relatively stable computing environment with little choice in processing architecture or programming paradigm (basically X86 processors using MPI for parallelism), the existing menu of processor choices includes significant diversity, and more is on the horizon. This computational diversity, coupled with ever increasing software complexity, leads to the very real possibility that weather and climate modelling will arrive at a chasm which will separate scientific aspiration from our ability to develop and/or rapidly adapt codes to the available hardware. In this paper we review the hardware and software trends which are leading us towards this chasm, before describing current progress in addressing some of the tools which we may be able to use to bridge the chasm. This brief introduction to current tools and plans is followed by a discussion outlining the scientific requirements for quality model codes which have satisfactory performance and portability, while simultaneously supporting productive scientific evolution. We assert that the existing method of incremental model improvements employing small steps which adjust to the changing hardware environment is likely to be inadequate for crossing the chasm between aspiration and hardware at a satisfactory pace, in part because institutions cannot have all the relevant expertise in house. Instead, we outline a methodology based on large community efforts in engineering and standardisation, which will depend on identifying a taxonomy of key activities – perhaps based on existing efforts to develop domain-specific languages, identify common patterns in weather and climate codes, and develop community approaches to commonly needed tools and libraries – and then collaboratively building up those key components. Such a collaborative approach will depend on institutions, projects, and individuals adopting new interdependencies and ways of working.
Article
Full-text available
Members in ensemble forecasts differ due to the representations of initial uncertainties and model uncertainties. The inclusion of stochastic schemes to represent model uncertainties has improved the probabilistic skill of the ECMWF ensemble by increasing reliability and reducing the error of the ensemble mean. Recent progress, challenges and future directions regarding stochastic representations of model uncertainties at ECMWF are described in this paper. The coming years are likely to see a further increase in the use of ensemble methods in forecasts and assimilation. This will put increasing demands on the methods used to perturb the forecast model. An area that is receiving greater attention than 5 to 10 years ago is the physical consistency of the perturbations. Other areas where future efforts will be directed are the expansion of uncertainty representations to the dynamical core and to other components of the Earth system, as well as the overall computational efficiency of representing model uncertainty.
Article
Full-text available
The steady path of doubling the global horizontal resolution approximately every 8 years in numerical weather prediction (NWP) at the European Centre for Medium-Range Weather Forecasts may be substantially altered with emerging novel computing architectures. It coincides with the need to appropriately address and determine forecast uncertainty with increasing resolution, in particular when convective-scale motions start to be resolved. Blunt increases in the model resolution will quickly become unaffordable and may not lead to improved NWP forecasts. Consequently, there is a need to accordingly adjust proven numerical techniques. An informed decision on the modelling strategy for harnessing exascale, massively parallel computing power thus also requires a deeper understanding of the sensitivity to uncertainty, for each part of the model, and ultimately a deeper understanding of multi-scale interactions in the atmosphere and their numerical realization in ultra-high-resolution NWP and climate simulations. This paper explores opportunities for substantial increases in the forecast efficiency by judicious adjustment of the formal accuracy or relative resolution in the spectral and physical space. One path is to reduce the formal accuracy by which the spectral transforms are computed. The other pathway explores the importance of the ratio used for the horizontal resolution in gridpoint space versus wavenumbers in spectral space. This is relevant for both high-resolution simulations and ensemble-based uncertainty estimation.
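To make the gridpoint/spectral ratio more concrete, the snippet below prints the approximate equatorial grid spacing implied by the commonly used linear, quadratic and cubic relations between a spectral truncation T and the number of gridpoints around the equator (roughly 2(T+1), 3(T+1) and 4(T+1) points, respectively). Operational grids (e.g. octahedral reduced grids) deviate slightly from these counts, so the numbers are rough estimates only.

    // Back-of-the-envelope equatorial grid spacing for linear, quadratic and
    // cubic gridpoint/spectral resolution ratios. Estimates only.
    #include <cstdio>

    int main() {
      const double equator_km = 40075.0;                 // Earth's circumference
      const int truncations[] = {639, 1279};             // example truncations
      const struct { const char* name; int factor; } grids[] = {
          {"linear", 2}, {"quadratic", 3}, {"cubic", 4}};

      for (int T : truncations) {
        for (const auto& g : grids) {
          const int npoints = g.factor * (T + 1);        // points around the equator
          std::printf("T%-5d %-9s grid: ~%d points, ~%.1f km spacing\n",
                      T, g.name, npoints, equator_km / npoints);
        }
      }
      return 0;
    }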
Technical Report
Full-text available
The recent switch to parallel microprocessors is a milestone in the history of computing. Industry has laid out a roadmap for multicore designs that preserves the programming paradigm of the past via binary compatibility and cache coherence. Conventional wisdom is now to double the number of cores on a chip with each silicon generation. A multidisciplinary group of Berkeley researchers met nearly two years to discuss this change. Our view is that this evolutionary approach to parallel hardware and software may work for 2- or 8-processor systems, but is likely to face diminishing returns as 16- and 32-processor systems are realized, just as returns fell with greater instruction-level parallelism. We believe that much can be learned by examining the success of parallelism at the extremes of the computing spectrum, namely embedded computing and high performance computing. This led us to frame the parallel landscape with seven questions, and to recommend the following:
  • The overarching goal should be to make it easy to write programs that execute efficiently on highly parallel computing systems.
  • The target should be 1000s of cores per chip, as these chips are built from processing elements that are the most efficient in MIPS (Million Instructions per Second) per watt, MIPS per area of silicon, and MIPS per development dollar.
  • Instead of traditional benchmarks, use 13 "dwarfs" to design and evaluate parallel programming models and architectures. (A dwarf is an algorithmic method that captures a pattern of computation and communication.)
  • "Autotuners" should play a larger role than conventional compilers in translating parallel programs.
  • To maximize programmer productivity, future programming models must be more human-centric than the conventional focus on hardware or applications.
  • To be successful, programming models should be independent of the number of processors.
  • To maximize application efficiency, programming models should support a wide range of data types and successful models of parallelism: task-level parallelism, word-level parallelism, and bit-level parallelism.
  • Architects should not include features that significantly affect performance or energy if programmers cannot accurately measure their impact via performance counters and energy counters.
  • Traditional operating systems will be deconstructed and operating system functionality will be orchestrated using libraries and virtual machines.
  • To explore the design space rapidly, use system emulators based on Field Programmable Gate Arrays (FPGAs) that are highly scalable and low cost.
Since real world applications are naturally parallel and hardware is naturally parallel, what we need is a programming model, system software, and a supporting architecture that are naturally parallel. Researchers have the rare opportunity to re-invent these cornerstones of computing, provided they simplify the efficient programming of highly parallel systems.
Article
Full-text available
The recursive zonal equal area (EQ) sphere partitioning algorithm is a practical algorithm for partitioning higher dimensional spheres into regions of equal area and small diameter. This paper describes the partition algorithm and its implementation in Matlab, provides numerical results and gives a sketch of the proof of the bounds on the diameter of regions. A companion paper [13] gives details of the proof.
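The following sketch conveys the flavour of such an equal-area construction: one cap at each pole plus latitude collars, with each collar receiving a share of the remaining regions proportional to its area, so that every region has area close to 4π/N. It is a simplified illustration, not the recursive zonal algorithm of the paper, and it makes no attempt to bound region diameters.

    // Simplified equal-area-style partition of the unit sphere into N regions:
    // two polar caps plus latitude collars with region counts proportional to
    // collar area. Illustration only, not the EQ algorithm itself.
    #include <algorithm>
    #include <cmath>
    #include <cstdio>
    #include <vector>

    int main() {
      const int    N  = 64;                         // requested number of regions
      const double PI = std::acos(-1.0);
      const double A  = 4.0 * PI / N;               // target area per region

      // Polar caps: colatitude theta_c such that the cap area 2*pi*(1-cos theta_c) = A.
      const double theta_c = std::acos(1.0 - A / (2.0 * PI));

      // Split the remaining band into collars of height close to sqrt(A), then
      // give each collar a share of the N-2 regions proportional to its area.
      const double band     = PI - 2.0 * theta_c;
      const int    ncollars = std::max(1, (int)std::round(band / std::sqrt(A)));
      const double dtheta   = band / ncollars;

      int assigned = 0;
      std::vector<int> regions(ncollars);
      for (int c = 0; c < ncollars; ++c) {
        const double t0 = theta_c + c * dtheta, t1 = t0 + dtheta;
        const double area = 2.0 * PI * (std::cos(t0) - std::cos(t1));
        regions[c] = std::max(1, (int)std::round(area / A));
        assigned += regions[c];
      }
      regions[ncollars / 2] += (N - 2) - assigned;  // absorb rounding error near the equator

      std::printf("caps: 2, collars: %d\n", ncollars);
      for (int c = 0; c < ncollars; ++c)
        std::printf("collar %2d: %d regions\n", c, regions[c]);
      return 0;
    }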
Article
The paper documents the development of a global nonhydrostatic finite-volume module designed to enhance an established spectral-transform based numerical weather prediction (NWP) model. The module adheres to NWP standards, with formulation of the governing equations based on the classical meteorological latitude-longitude spherical framework. In the horizontal, a bespoke unstructured mesh with finite-volumes built about the reduced Gaussian grid of the existing NWP model circumvents the notorious stiffness in the polar regions of the spherical framework. All dependent variables are co-located, accommodating both spectral-transform and grid-point solutions at the same physical locations. In the vertical, a uniform finite-difference discretisation facilitates the solution of intricate elliptic problems in thin spherical shells, while the pliancy of the physical vertical coordinate is delegated to generalised continuous transformations between computational and physical space. The newly developed module assumes the compressible Euler equations as default, but includes reduced soundproof PDEs as an option. Furthermore, it employs semi-implicit forward-in-time integrators of the governing PDE systems, akin to but more general than those used in the NWP model. The module shares the equal regions parallelisation scheme with the NWP model, with multiple layers of parallelism hybridising MPI tasks and OpenMP threads. The efficacy of the developed nonhydrostatic module is illustrated with benchmarks of idealised global weather.
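As a small, concrete example of a continuous transformation between computational and physical space in the vertical, the sketch below evaluates the classical terrain-following (Gal-Chen and Somerville) mapping. The module described above admits more general transformations; this one is shown only because it is simple and widely known, and the terrain height used is hypothetical.

    // Classical terrain-following vertical coordinate: computational height eta
    // in [0, H] is mapped to physical height z, with z = zs at eta = 0 and
    // z = H at the model top. Illustrative values only.
    #include <cstdio>

    double physical_height(double eta, double zs, double H) {
      return zs + eta * (H - zs) / H;
    }

    int main() {
      const double H  = 20000.0;   // model top [m]
      const double zs = 1500.0;    // hypothetical terrain elevation [m]
      for (double eta = 0.0; eta <= H; eta += 5000.0)
        std::printf("eta = %7.0f m  ->  z = %7.0f m\n", eta, physical_height(eta, zs, H));
      return 0;
    }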
Conference Paper
Since the mid-90s, IFS has used a 2-dimensional scheme for partitioning grid point space to MPI tasks. While this scheme has served ECMWF well, there have nevertheless been some areas of concern, namely the communication overheads for IFS reduced grids at the poles to support the semi-Lagrangian scheme, and the halo requirements needed to support the interpolation of fields between model and radiation grids. These issues have been addressed by the implementation of a new partitioning scheme called EQ_REGIONS, which is characterised by an increasing number of partitions in bands from the poles to the equator. The number of bands and the number of partitions in each particular band are derived so as to provide partitions of equal area and small 'diameter'. The EQ_REGIONS algorithm used in IFS is based on the work of Paul Leopardi, School of Mathematics, University of New South Wales, Sydney, Australia.
Article
Very high resolution spectral transform models are believed to become prohibitively expensive, due to the relative increase in computational cost of the Legendre transforms compared to the gridpoint computations. This article describes the implementation of a practical fast spherical harmonics transform into the Integrated Forecasting System (IFS) at ECMWF. Details of the accuracy of the computations, of the parallelisation and memory use are discussed. Results are presented that demonstrate the cost-effectiveness and accuracy of the fast spherical harmonics transform, successfully mitigating the concern about the disproportionally growing computational cost. Using the new transforms, the first T7999 global weather forecast (equivalent to approximately 2.5km horizontal grid size) using a spectral transform model has been produced.
Article
Integrations of spectral models are presented in which the 'Gaussian' grid of points at which the nonlinear terms are evaluated is reduced as the poles are approached. A maximum saving in excess of one-third of the number of points covering the globe is obtained by requiring that the grid length in the zonal direction does not exceed the grid length at the equator, and that the number of points around a latitude circle enables the use of a fast Fourier transform. The results show that such a reduced grid can be used for short- and medium-range prediction (and presumably also for climate studies) with no significant loss of accuracy compared with the use of a conventional grid, which is uniform in longitude. The saving in computational time is between 20% and 25% for the T106 forecast model.
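The construction described above (zonal grid length bounded by its value at the equator, with FFT-friendly point counts per latitude circle) can be sketched in a few lines. Equally spaced latitudes are used here instead of true Gaussian latitudes, and the grid sizes are arbitrary, so this is an illustration of the rule rather than an actual reduced Gaussian grid.

    // Sketch of a "reduced" grid: the number of points per latitude circle is
    // shrunk towards the poles so the zonal grid length never exceeds the one
    // at the equator, while staying a product of small primes for the FFT.
    #include <algorithm>
    #include <cmath>
    #include <cstdio>

    // Smallest integer >= n whose prime factors are only 2, 3 and 5.
    int next_fft_friendly(int n) {
      for (int m = n;; ++m) {
        int r = m;
        const int primes[] = {2, 3, 5};
        for (int p : primes)
          while (r % p == 0) r /= p;
        if (r == 1) return m;
      }
    }

    int main() {
      const double PI = std::acos(-1.0);
      const int nlat = 24;                       // latitude circles pole to pole
      const int nlon_equator = 96;               // points on the equator

      for (int j = 0; j < nlat; ++j) {
        const double lat = -90.0 + (j + 0.5) * 180.0 / nlat;     // circle centre
        const int needed = (int)std::ceil(nlon_equator * std::cos(lat * PI / 180.0));
        const int nlon   = next_fft_friendly(std::max(needed, 4));
        std::printf("lat %6.1f deg: %3d points\n", lat, nlon);
      }
      return 0;
    }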
Article
A deterministic initial‐value test case for dry dynamical cores of atmospheric general‐circulation models is presented that assesses the evolution of an idealized baroclinic wave in the northern hemisphere. The initial zonal state is quasi‐realistic and completely defined by analytic expressions which are a steady‐state solution of the adiabatic inviscid primitive equations with pressure‐based vertical coordinates. A two‐component test strategy first evaluates the ability of the discrete approximations to maintain the steady‐state solution. Then an overlaid perturbation is introduced which triggers the growth of a baroclinic disturbance over the course of several days. The test is applied to four very different dynamical cores at varying horizontal and vertical resolutions. In particular, the NASA/NCAR Finite Volume dynamics package, the National Center for Atmospheric Research spectral transform Eulerian and the semi‐Lagrangian dynamical cores of the Community Atmosphere Model CAM3 are evaluated. In addition, the icosahedral finite‐difference model GME of the German Weather Service is tested. These hydrostatic dynamical cores represent a broad range of numerical approaches and, at very high resolutions, provide independent reference solutions. The paper discusses the convergence‐with‐resolution characteristics of the schemes and evaluates the uncertainty of the high‐resolution reference solutions.
Article
An arbitrary finite-volume approach is developed for discretising partial differential equations governing fluid flows on the sphere. Unconventionally for unstructured-mesh global models, the governing equations are cast in the anholonomic geospherical framework established in computational meteorology. The resulting discretisation retains proven properties of the geospherical formulation, while it offers the flexibility of unstructured meshes in enabling irregular spatial resolution. The latter allows for a global enhancement of the spatial resolution away from the polar regions as well as for a local mesh refinement. A class of non-oscillatory forward-in-time edge-based solvers is developed and applied to numerical examples of three-dimensional hydrostatic flows, including shallow-water benchmarks, on a rotating sphere.
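A generic picture of the edge-based, finite-volume update underlying such solvers is given below: a flux is computed once per edge of the (dual) mesh and scattered, with opposite signs, to the two control volumes that share the edge. The tiny mesh, the first-order upwind flux and all values are placeholders; they are not the scheme of the paper.

    // Generic edge-based finite-volume update: one flux per edge, scattered to
    // the two control volumes sharing that edge. All data are placeholders.
    #include <cstddef>
    #include <cstdio>
    #include <vector>

    struct Edge { int n1, n2; double sx, sy; };   // node indices and edge-normal area vector

    int main() {
      // A tiny hand-made dual mesh: 4 nodes connected by 4 edges.
      std::vector<Edge>   edges  = {{0, 1, 1.0, 0.0}, {1, 2, 0.0, 1.0},
                                    {2, 3, -1.0, 0.0}, {3, 0, 0.0, -1.0}};
      std::vector<double> volume = {1.0, 1.0, 1.0, 1.0};   // control-volume sizes
      std::vector<double> q      = {1.0, 2.0, 3.0, 4.0};   // transported scalar
      std::vector<double> vx     = {1.0, 1.0, 1.0, 1.0};   // advecting velocity
      std::vector<double> vy     = {0.0, 0.0, 0.0, 0.0};
      const double dt = 0.1;

      std::vector<double> rhs(q.size(), 0.0);
      for (const Edge& e : edges) {
        // First-order upwind flux through the edge, evaluated once per edge.
        const double un   = 0.5 * ((vx[e.n1] + vx[e.n2]) * e.sx + (vy[e.n1] + vy[e.n2]) * e.sy);
        const double flux = un > 0.0 ? un * q[e.n1] : un * q[e.n2];
        rhs[e.n1] -= flux;            // flux leaves control volume n1 ...
        rhs[e.n2] += flux;            // ... and enters control volume n2
      }
      for (std::size_t i = 0; i < q.size(); ++i) q[i] += dt * rhs[i] / volume[i];

      for (std::size_t i = 0; i < q.size(); ++i) std::printf("q[%zu] = %f\n", i, q[i]);
      return 0;
    }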
Mozdzynski, G.: A new partitioning approach for ECMWF's Integrated Forecasting System (IFS). In: Proceedings of the Twelfth ECMWF Workshop: Use of High Performance Computing in Meteorology, 30 October - 3 November 2006, Reading, UK, pp. 148-166. World Scientific (2007)
Smolarkiewicz, P.K., Deconinck, W., Hamrud, M., Kühnlein, C., Mozdzynski, G., Szmelter, J., Wedi, N.P.: A hybrid all-scale finite-volume module for stratified flows on a rotating sphere. J. Comput. Phys. (2016, submitted)