Wei Xue
Tsinghua University | TH · Department of Computer Science and Technology

142
Publications
41,005
Reads
2,832
Citations

Publications (142)
Article
The next generation Sunway supercomputer employs the SW26010pro processor, which features a specialized on-chip heterogeneous architecture. Applications with significant hotspots can benefit from the great computation capacity of Sunway many-core architectures by carefully making intensive manual many-core parallelization efforts. However, some pro...
Preprint
The next generation Sunway supercomputer employs the SW26010pro processor, which features a specialized on-chip heterogeneous architecture. Applications with significant hotspots can benefit from the great computation capacity improvement of Sunway many-core architectures by carefully making intensive manual many-core parallelization efforts. Howev...
Article
Full-text available
The underestimation of cloud fraction, especially the low stratus cloud fraction (LSC) over the eastern oceans, remains a problem in most AGCMs. This study investigated potential improvements through perturbing nine moist physical parameters, using uniform sampling and Latin hypercube sampling methods, and quantified the parametric uncertainty and...
Article
Molecular dynamics (MD) simulations of biological systems are playing an increasingly important role in the research of pathogens and drugs. Most MD methods for biological simulations rely on listed bonds, which interact among specific groups of atoms identified by atom tags (unique atom tags regardless of the storage location). However, efficient...
Preprint
Full-text available
The Single Column Atmospheric Model (SCAM) is an essential tool for analyzing and improving the physics schemes of CAM. Although it already largely reduces the compute cost from a complete CAM, the exponentially-growing parameter space makes a combined analysis or tuning of multiple parameters difficult. In this paper, we propose a hybrid framework...
Article
Variations in the performance of parallel and distributed systems are becoming increasingly challenging. The runtimes of different executions can vary greatly even with a fixed number of computing nodes. Many HPC applications on supercomputers exhibit such variance. This not only leads to unpredictable execution times, but also renders the system’s...
Article
With the increasing complexity of scientific computing, it is imperative to enhance the efficiency and ease of High Performance Computing (HPC) utilization. Scientific workflows have been introduced to that end, but the current infrastructure still needs optimization. In this paper, we discuss the current problems based on scientific computing scenarios an...
Article
The tridiagonal solver is an important kernel used in a wide range of applications and is well supported in mainstream numerical libraries. Quite a few parallel algorithms have been developed, but the best-performing algorithm may vary across architectures as well as input sizes. To address this algorithm-selection challenge, we present a model guide...
Preprint
Hybrid modeling combining data-driven techniques and numerical methods is an emerging and promising research direction for efficient climate simulation. However, previous works lack practical platforms, making developing hybrid modeling a challenging programming problem. Furthermore, the lack of standard data sets and evaluation metrics may hamper...
Article
Full-text available
In climate models, subgrid parameterizations of convection and clouds are one of the main causes of the biases in precipitation and atmospheric circulation simulations. In recent years, due to the rapid development of data science, machine learning (ML) parameterizations for convection and clouds have been demonstrated to have the potential to perf...
Article
Full-text available
Background: Large uncertainty in modeling land carbon (C) uptake heavily impedes the accurate prediction of the global C budget. Identifying the uncertainty sources among models is crucial for model improvement yet has been difficult due to multiple feedbacks within Earth System Models (ESMs). Here we present a Matrix-based Ensemble Model Inter-comp...
Article
The tridiagonal solver is an important kernel and is widely supported in mainstream numerical libraries. While parallel algorithms have been studied for many-core architectures, the performance of current algorithms and implementations is still hindered by input size sensitivity and cross-platform portability. In this paper, we propose a novel algo...
Article
Molecular dynamics (MD) simulations are playing an increasingly important role in many areas ranging from chemical materials to biological molecules. With the continuing development of MD models, the potentials are getting larger and more complex. In this paper, we focus on the reactive force field (ReaxFF) potential from LAMMPS to optimize the com...
Article
The Community Atmosphere Model (CAM) has been ported, redesigned, and scaled to the full system of the Sunway TaihuLight, and provides peta-scale climate modeling performance. Based on a novel domain decomposition method, we have fully optimized the complete model code by using both OpenACC refactoring and more aggressive and finer-grained Athread...
Article
Full-text available
With semiconductor technology gradually approaching its physical and thermal limits, recent supercomputers have adopted major architectural changes to continue increasing the performance through more power-efficient heterogeneous many-core systems. Examples include Sunway TaihuLight that has four management processing elements (MPEs) and 256 comput...
Article
Full-text available
A team effort to develop a Community Integrated Earth System Model (CIESM) was initiated in China in 2012. The model was based on the NCAR Community Earth System Model (Version 1.2.1) with several novel developments and modifications aimed at overcoming some persistent systematic biases, such as the double Intertropical Convergence Zone problem and under...
Article
Large-scale molecular dynamics (MD) simulations on supercomputers play an increasingly important role in many research areas. With the capability of simulating charge equilibration (QEq), bonds and so on, the reactive force field (ReaxFF) enables the precise simulation of chemical reactions. Compared to the first-principles molecular dynamics (FPM...
Article
Full-text available
The ever-growing complexity of HPC applications and computer architectures makes learning application behaviors more costly than ever. In this paper, we propose APMT, an Automatic Performance Modeling Tool, to understand and predict performance efficiently in the regimes of interest to developers and performance analysts while outperforming...
Article
Full-text available
In the original version of this article, the second author’s first name was misspelled as Zhipeng. The correct spelling is Zipeng. The sixth author’s first name was misspelled as Jirong. The correct spelling is Jinrong. The correct version is as follows: LICOM Model Datasets for the CMIP6 Ocean Model Intercomparison Project Pengfei LIN1,4, Zipeng Y...
Article
The Sunway TaihuLight supercomputer has been installed for several years, and many applications have been ported to or built for it. Initially, most applications running on TaihuLight had regular memory access patterns, such as dense linear algebra, structured grids and dynamic programming. In 2018, developers published a g...
Article
Full-text available
The datasets of two Ocean Model Intercomparison Project (OMIP) simulation experiments from the LASG/IAP Climate Ocean Model, version 3 (LICOM3), forced by two different sets of atmospheric surface data, are described in this paper. The experiment forced by CORE-II (Co-ordinated Ocean–Ice Reference Experiments, Phase II) data (1948–2009) is called O...
Preprint
Full-text available
With semiconductor technology gradually approaching its physical and heat limits, recent supercomputers have adopted major architectural changes to continue increasing the performance through more power-efficient heterogeneous many-core systems. Examples include Sunway TaihuLight that has four Management Processing Elements (MPEs) and...
Article
Full-text available
Uncertain parameters in physical parameterizations of general circulation models (GCMs) greatly impact model performance. In recent years, automatic parameter optimization has been introduced for tuning model performance of GCMs, but most of the optimization methods are unconstrained optimization methods under a given performance indicator. Therefo...
Conference Paper
GROMACS is one of the most popular Molecular Dynamics (MD) applications and is widely used in the study of chemical and biomolecular systems. Similar to other MD applications, it needs long run-time for large-scale simulations. Therefore, many high performance platforms have been employed to accelerate it, such as Knights Landing (KNL), Cell Pro...
Article
The ability to predict the climate system depends heavily on the efficient integration of observations and simulations of the Earth, which is regarded as a canonical example of a cyber-physical system. The climate system model, the simulation engine in this cyber-physical system, is one of the most challenging applications in scientific computi...
Conference Paper
The Weather Research and Forecasting (WRF) Model is one of the most widely used mesoscale numerical weather prediction systems and is designed for both atmospheric research and operational forecasting applications. However, it is an extremely time-consuming application: running a single simulation takes researchers days to weeks as the simulation size sc...
Article
As scientific applications are increasingly ported to GPUs to benefit from both the powerful computing capacity and high throughput, accelerating explicit solvers for GPU-based finite volume methods is gaining more and more attention. In this paper, based on the detailed analysis of the FVM algorithm, we present a set of novel optimization methods,...
Article
Full-text available
Uncertain parameters in physical parameterizations of General Circulation Models (GCMs) greatly impact model performance. In recent years, automatic parameter optimization has been introduced for tuning model performance of GCMs but most of the optimization methods are unconstrained optimization methods under a given performance indicator, so that...
Article
Tropical cyclone (TC) genesis is a problem of great significance in climate and weather research. Although various environmental conditions necessary for TC genesis have been recognized for a long time, prediction of TC genesis remains a challenge due to complex and stochastic processes involved during TC genesis. Different from traditional statist...
Preprint
We introduce NAMSG, an adaptive first-order algorithm for training neural networks. The method is efficient in computation and memory, and straightforward to implement. It computes the gradients at configurable remote observation points, in order to expedite the convergence by adjusting the step size for directions with different curvatures, in the...
Article
In this paper, we propose an efficient time-space-domain optimized (OptTS) finite difference scheme to model 2D and 3D scalar wave propagation. It adopts piecewise constant interpolation coefficients for several consecutive Courant number ranges, which avoids the extra time costs caused by loading the coefficients consecutively according to differe...
Article
Full-text available
Coastal areas, where sea breezes are prevalent, generally have good wind power resources and are favorable sites for wind farms. Corkscrew sea breezes, having greater wind power than backdoor sea breezes, dominate the local circulations in summer over the coastal area of Jiangsu province, China. Daily Weather Research and Forecasting simulations wer...
Article
Full-text available
Traditional trial-and-error tuning of uncertain parameters in global atmospheric general circulation models (GCMs) is time consuming and subjective. This study explores the feasibility of automatic optimization of GCM parameters for fast physics by using short-term hindcasts. An automatic workflow is described and applied to the Community Atmospher...
Conference Paper
The sparse triangular solver (SpTRSV) is one of the most essential kernels in many scientific and engineering applications. Efficiently parallelizing the SpTRSV on modern many-core architectures is considerably difficult due to inherent dependency of computation and discontinuous memory accesses. Achieving high performance of SpTRSV is even more ch...
Article
Full-text available
Soil organic carbon (SOC) has a significant effect on carbon emissions and climate change. However, the current SOC prediction accuracy of most models is very low. Most evaluation studies indicate that the prediction error mainly comes from parameter uncertainties, which can be improved by parameter calibration. Data assimilation techniques have be...
Conference Paper
Sparse Matrix-Vector Multiplication (SpMV) is an essential computation kernel for many data-analytic workloads running in both supercomputers and data centers. The intrinsic irregularity in SpMV makes achieving high performance challenging, especially when porting to new architectures. In this paper, we present our work on designing and implementing...
Conference Paper
Full-text available
Owing to its advantages in scalability and reliability, the floating random walk (FRW) algorithm has been widely adopted for calculating the capacitances among three-dimensional (3-D) conductors. This is evidenced by the industrial practice of interconnect capacitance extraction during the design of high-performance very large-scale integrated (VLSI)...
Article
Full-text available
Electromagnetic transients (EMT) simulation is the most accurate and intensive computation for power systems. Past research has shown the potential of accelerating such simulations using graphics processing units (GPUs). In this paper, an efficient GPU-based parallel EMT simulator is designed. Thread-oriented model transformations are first propose...
Article
Full-text available
Traditional trial-and-error tuning of uncertain parameters in global atmospheric General Circulation Models (GCM) is time consuming and subjective. This study explores the feasibility of automatic optimization of GCM parameters for fast physics by using short-term hindcasts. An automatic workflow is described and applied to the Community Atmospheri...
Conference Paper
Full-text available
Performance variance is becoming an increasing challenge on current large-scale HPC systems. Even using a fixed number of computing nodes, the execution time of several runs can vary significantly. Many parallel programs executing on supercomputers suffer from such variance. Performance variance not only causes unpredictable performance requirement vi...
Conference Paper
Full-text available
Sparse triangular solve (SpTRSV) is one of the most important kernels in many real-world applications. Currently, much research on parallel SpTRSV focuses on level-set construction for reducing the number of inter-level synchronizations. However, the out-of-control data reuse and high cost for global memory or shared cache access in inter-level syn...
Article
Sparse triangular solve (SpTRSV) is one of the most important kernels in many real-world applications. Currently, much research on parallel SpTRSV focuses on level-set construction for reducing the number of inter-level synchronizations. However, the out-of-control data reuse and high cost for global memory or shared cache access in inter-level syn...
Article
Performance variance is becoming an increasing challenge on current large-scale HPC systems. Even using a fixed number of computing nodes, the execution time of several runs can vary significantly. Many parallel programs executing on supercomputers suffer from such variance. Performance variance not only causes unpredictable performance requirement vi...
Article
Full-text available
The original version of this Article contained an error in Figure 2. In panel a, the x axis of the graph was incorrectly labeled 'precipitation bias', and should have read 'negative precipitation bias'. This error has been corrected in both the PDF and HTML versions of the Article.
Article
Full-text available
The response of surface winds over the equatorial Pacific to cloud-related parameters is quantified by using a uniform sampling method and conducting a large number of perturbed parameter simulations. The results show that the surface winds are highly sensitive to, and even linearly dependent on some parameters that include the precipitation effici...
Conference Paper
Full-text available
This paper reports our large-scale nonlinear earthquake simulation software on Sunway TaihuLight. Our innovations include: (1) a customized parallelization scheme that employs the 10 million cores efficiently at both the process and the thread levels; (2) an elaborate memory scheme that integrates on-chip halo exchange through register communication...
Conference Paper
Memory accesses limit the performance and scalability of countless applications. Many design and optimization efforts will benefit from an in-depth understanding of memory access behavior, which is not offered by extant access tracing and profiling methods. In this paper, we adopt a holistic memory access profiling approach to enable a better under...
Article
Full-text available
To investigate the impacts of uncertain parameters on simulated Pacific Walker circulation (PWC), a large number of perturbed parameter simulations are conducted using GAMIL2 (the Grid-point Atmospheric Model of IAP/LASG, version 2), and three different PWC indices are selected. The results show that the influences of some parameters on PWC are dep...
Article
Full-text available
Climate models show a conspicuous summer warm and dry bias over the central United States. Using results from 19 climate models in the Coupled Model Intercomparison Project Phase 5 (CMIP5), we report a persistent dependence of warm bias on dry bias with the precipitation deficit leading the warm bias over this region. The precipitation deficit is a...
Article
Full-text available
Soil organic carbon (SOC) has a significant effect on carbon emissions and climate change. However, the current SOC prediction accuracy of most models is very low. Most evaluation studies indicate that the prediction error mainly comes from parameter uncertainties, which can be clearly improved by parameter calibration. Data assimilation technique...
Article
Full-text available
The scientific demand for more accurate modeling of the climate system calls for more computing power to support higher resolutions, inclusion of more component models, more complicated physics schemes, and larger ensembles. As the recent improvements in computing power mostly come from the increasing number of nodes in a system and the integration...
Article
FPGA-based reconfigurable dataflow engines provide a novel architecture to achieve breakthroughs in both time and energy to solution in numerical simulations. This article presents an efficient dataflow methodology for solving the Euler atmospheric dynamic equations, an essential step for mesoscale atmospheric simulation. The authors present custom...
Article
Full-text available
In this paper, we study the problem of keyword search with access control over encrypted data in cloud computing. We first propose a scalable framework where a user can use their attribute values and a search query to locally derive a search capability, and a file can be retrieved only when its keywords match the query and the user’s attribute values c...
Conference Paper
Full-text available
An ultra-scalable fully-implicit solver is developed for stiff time-dependent problems arising from the hyperbolic conservation laws in nonhydrostatic atmospheric dynamics. In the solver, we propose a highly efficient hybrid domain-decomposed multigrid preconditioner that can greatly accelerate the convergence rate at the extreme scale. For solvin...
Article
Full-text available
The Sunway TaihuLight supercomputer is the world’s first system with a peak performance greater than 100 PFlops. In this paper, we provide a detailed introduction to the TaihuLight system. In contrast with other existing heterogeneous supercomputers, which include both CPU processors and PCIe-connected many-core accelerators (NVIDIA GPU or Intel Xe...
Conference Paper
In climate change studies, the atmospheric model is an essential component for building a high-resolution climate simulation system. While the accuracy of atmospheric simulations has long been limited by the computational capabilities of CPU platforms, the heterogeneous platforms equipped with accelerators are becoming promising candidates for achi...
Conference Paper
The tridiagonal solver is an important kernel in many scientific and engineering applications. Although quite a few parallel algorithms have been exploited recently, challenges still remain when solving tridiagonal systems on many-core architectures. In this paper, quantitative analysis is conducted to guide the selection of algorithms on different...
Article
Physical parameterization is one of the most important sources of uncertainties in the current climate system models. With the increasing complexity of models and the diverse requirements for climate studies, the a priori, manual model tuning method for physical parameterization has become a bottleneck to further improving the climate system model....
Article
In the present study, the LASG/IAP Climate system Ocean Model version 2 (LICOM2) was implemented to replace the original ocean component in the Community Earth System Model version 1.0.4 (CESM1) to form a new coupled model referred to as CESM1+LICOM2. The simulation results from a 300-yr preindustrial experiment by using this model were evaluated a...
Chapter
An overview of the Chinese National Key Basic Research Project entitled “Development and Evaluation of High-Resolution Climate System Models” under grant No. 2010CB951900 is presented. The background and the objectives of the project are introduced. The main progress made in the past 5 years of the project is the development of “one system” and “tw...
Chapter
The ensemble method is effective at reducing model uncertainties. In this work, a novel ensemble technology has been developed and applied to the coupling process in the climate system model, forming a flexible multi-model ensemble coupling platform. This platform can perform the coupling of an ensemble of multiple atmospheric models or multiple re...
Book
This book is based on the project "Development and Validation of High Resolution Climate System Models" with the support of the National Key Basic Research Project under grant No. 2010CB951900. It demonstrates the major advances in the development of new, dynamical Atmospheric General Circulation Model (AGCM) and Ocean General Circulation Model (OG...
Article
Full-text available
Physical parameterizations in general circulation models (GCMs), having various uncertain parameters, greatly impact model performance and model climate sensitivity. Traditional manual and empirical tuning of these parameters is time-consuming and ineffective. In this study, a "three-step" methodology is proposed to automatically and effectively ob...
Article
Full-text available
In this work, an ultra-scalable algorithm is designed and optimized to accelerate a 3D compressible Euler atmospheric model on the CPU-MIC hybrid system of Tianhe-2. We first reformulate the mesoscale model to avoid long-latency operations, and then employ carefully designed inter-node and intra-node domain decomposition algorithms to achieve balance...
Article
Full-text available
Physical parameterizations in General Circulation Models (GCMs), having various uncertain parameters, greatly impact model performance and model climate sensitivity. Traditional manual and empirical tuning of these parameters is time consuming and ineffective. In this study, a "three-step" methodology is proposed to automatically and effectively ob...
Article
Full-text available
Stencils are among the most important and time-consuming kernels in many applications. While stencil optimization has been a well-studied topic on CPU platforms, achieving higher performance and efficiency for the evolving numerical stencils on the more recent multi-core and many-core architectures is still an important issue. In this paper, we exp...
Article
An Interactive Ensemble (IE) platform was established based on a Standard Coupled (SC) climate model with seven atmosphere–land model realizations coupled to a single ocean model and a single sea ice model. The IE strategy reduces stochastic noise generated by atmospheric dynamics and therefore can be used to estimate the impact of atmospheric pert...
Article
Full-text available
Scientific data analysis and visualization have become a key component of today's large-scale simulations. Due to the rapidly increasing data volume and awkward I/O patterns among highly structured files, known serial methods/tools cannot scale well and usually lead to poor performance over traditional architectures. In this paper, we propose a ne...
Conference Paper
The tridiagonal system solver is an important kernel in many scientific and engineering applications. Even though quite a few parallel algorithms and implementations have been proposed in recent years, challenges still remain when solving large-scale tridiagonal systems on heterogeneous supercomputers. In this paper, a hierarchical algorithm framework S...
Conference Paper
Full-text available
Atmospheric modeling is an essential issue in the study of climate change. However, due to the complicated algorithmic and communication models, scientists and researchers are facing tough challenges in finding efficient solutions to solve the atmospheric equations. In this paper, we accelerate a solver for the three-dimensional Euler atmospheric...
Article
Numerical weather forecasting is a highly efficient means of reducing the effects of unexpected weather events. Driven by increasing prediction precision and time-critical requirements, high performance computing technologies have improved considerably. However, I/O has become a significant performance bottleneck when scaling up to thousands of processe...
Conference Paper
Full-text available
This paper presents a hybrid algorithm for the petascale global simulation of atmospheric dynamics on Tianhe-2, the world's current top-ranked supercomputer developed by China's National University of Defense Technology (NUDT). Tianhe-2 is equipped with both Intel Xeon CPUs and Intel Xeon Phi accelerators. A key idea of the hybrid algorithm is to e...
Article
Full-text available
The chaotic atmospheric circulations and the ocean–atmosphere coupling may both cause variations in the North Atlantic Oscillation (NAO). This study uses an interactive ensemble (IE) coupled model to study the contribution of the atmospheric noise and coupling to the monthly variability of the NAO. In the IE model, seven atmospheric general circula...
Article
Full-text available
One of the most essential and challenging components in climate modeling is the atmospheric model. To solve multiphysical atmospheric equations, developers have to face extremely complex stencil kernels that are costly in terms of both computing and memory resources. This article aims to accelerate the solution of global shallow water equations (SW...
Article
Distributed watershed ecohydrological modelling, which involves massive data and intensive computation, has a rising demand for high-performance computing. Until now, model parallelisation has mainly been conducted at sub-basin granularity, which has low parallel efficiency and tends to cause load imbalance. Few studies have been conducted at a granularity of grid...
Conference Paper
Full-text available
One of the most essential and challenging components in a climate system model is the atmospheric model. To solve the multi-physical atmospheric equations, developers have to face extremely complex stencil kernels. In this paper, we propose a hybrid CPU-FPGA algorithm that applies single and multiple FPGAs to compute the upwind stencil for the glob...
Conference Paper
This paper presents a novel strategy to improve the scalability of the barotropic mode in the Parallel Ocean Program (POP), by theoretically analyzing the barotropic communications bottleneck. POP discretizes the elliptic equations of the barotropic mode into a linear system Ax=b and solves it using the Preconditioned Conjugate Gradient (PCG) met...
Article
This paper discusses performance optimization of the dynamical core of the global numerical weather prediction model in the Global/Regional Assimilation and Prediction System (GRAPES). GRAPES is a new generation of numerical weather prediction system developed and currently used by the China Meteorological Administration. The computational performance of the dy...
Article
Full-text available
Developing highly scalable algorithms for global atmospheric modeling is becoming increasingly important as scientists seek to understand behaviors of the global atmosphere at extreme scales. Nowadays, heterogeneous architecture based on both processors and accelerators is becoming an important solution for large-scale computing. However, large-...
Conference Paper
Full-text available
Cloud computing cuts down large capital outlays in facilities purchase and eliminates complex system management for users. To protect data confidentiality in cloud utilization, sensitive data are usually stored in encrypted form, making traditional search service on plaintext inapplicable. Thus, enabling keyword search over encrypted data becomes a...
Article
Full-text available
Sea ice is an important component in the Earth’s climate system. Coupled climate system models are indispensable tools for the study of sea ice, its internal processes, interaction with other components, and projection of future changes. This paper evaluates the simulation of sea ice by the Flexible Global Ocean-Atmosphere-Land System model Grid-po...
Conference Paper
Summary form only given. As the only method to study long-term climate trends and to predict potential climate risks, climate modeling is becoming a key research topic among governments and research organizations. One of the most essential and challenging components in climate modeling is the atmospheric model. To cover high resolution in climate simulation...
