Ewing Lusk
Argonne National Laboratory | ANL · Division of Mathematics and Computer Science

About

276 Publications
32,273 Reads
17,962 Citations

Publications (276)
Article
Quasielastic neutrino scattering is an important aspect of the experimental program to study fundamental neutrino properties including neutrino masses, mixing angles, the mass hierarchy and CP-violating phase. Proper interpretation of the experiments requires reliable theoretical calculations of neutrino-nucleus scattering. In this paper we present...
Article
A major goal of nuclear theory is to explain the spectra and stability of nuclei in terms of effective many-body interactions amongst the nucleus' constituents: the nucleons, i.e., protons and neutrons. Such an approach, referred to below as the basic model of nuclear theory, is formulated in terms of point-like nucleons, which emerge as effective d...
Article
We take a historical approach to our presentation of self-scheduled task parallelism, a programming model with its origins in early irregular and nondeterministic computations encountered in automated theorem proving and logic programming. We show how an extremely simple task model has evolved into a system, asynchronous dynamic load balancing (ADL...
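As an illustration of the self-scheduling idea behind ADLB, here is a minimal manager-worker sketch in C with MPI. This is not the ADLB API itself; the task payload and the do_task function are placeholders.

```c
#include <mpi.h>

#define TAG_WORK 1
#define TAG_DONE 2

/* Placeholder for the application's work on one task. */
static double do_task(int task) { return task * 2.0; }

int main(int argc, char **argv)
{
    int rank, size, ntasks = 100;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {            /* manager: hand out tasks on demand */
        int next = 0, done = 0, dummy;
        MPI_Status st;
        while (done < size - 1) {
            MPI_Recv(&dummy, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &st);
            if (next < ntasks) {
                MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_WORK,
                         MPI_COMM_WORLD);
                next++;
            } else {
                MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_DONE,
                         MPI_COMM_WORLD);
                done++;
            }
        }
    } else {                    /* worker: ask for work whenever idle */
        int task, req = 0;
        MPI_Status st;
        for (;;) {
            MPI_Send(&req, 1, MPI_INT, 0, TAG_WORK, MPI_COMM_WORLD);
            MPI_Recv(&task, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &st);
            if (st.MPI_TAG == TAG_DONE) break;
            do_task(task);
        }
    }
    MPI_Finalize();
    return 0;
}
```

Because workers pull tasks only when idle, load balances itself even when task costs are irregular, which is the property the paper traces back to theorem proving and logic programming.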
Chapter
An overview of the most prominent contemporary parallel processing programming models, written in a unique tutorial style. With the coming of the parallel computing era, computer scientists have turned their attention to designing programming models that are suited for high-performance parallel computing and supercomputing systems. Programming para...
Conference Paper
Full-text available
Scientific applications are often complex collections of many large-scale tasks. Mature tools exist for describing task-parallel workflows consisting of serial tasks, and a variety of tools exist for programming a single data-parallel operation. However, few tools cover the intersection of these two models. In this work, we extend the load balancin...
Article
Full-text available
An "ab initio" calculation of the Carbon-12 elastic form factor, and sum rules of longitudinal and transverse response functions measured in inclusive (e,e') scattering, is reported, based on realistic nuclear potentials and electromagnetic currents. The longitudinal elastic form factor and sum rule are found to be in satisfactory agreement with av...
Article
Efficiently utilizing the rapidly increasing concurrency of multi-petaflop computing systems is a significant programming challenge. One approach is to structure applications with an upper layer of many loosely coupled coarse-grained tasks, each comprising a tightly-coupled parallel function or program. “Many-task” programming models such as functi...
Conference Paper
Full-text available
Many scientific applications are conceptually built up from independent component tasks as a parameter study, optimization, or other search. Large batches of these tasks may be executed on high-end computing systems; however, the coordination of the independent processes, their data, and their data dependencies is a significant scalability challeng...
Conference Paper
Full-text available
Many scientific applications are conceptually built up from independent component tasks as a parameter study, optimization, or other search. Large batches of these tasks may be executed on high-end computing systems; however, the coordination of the independent processes, their data, and their data dependencies is a significant scalability challeng...
Conference Paper
Full-text available
Swift/T, a novel programming language implementation for highly scalable data flow programs, is presented.
Poster
Swift/T, a novel programming language implementation for highly scalable data flow programs, is presented.
Conference Paper
This tutorial will cover several advanced topics in MPI. We will cover one-sided communication, dynamic processes, multithreaded communication and hybrid programming, and parallel I/O. We will also discuss new features in the newest version of MPI, MPI-3, which is expected to be officially released a few days before this tutorial. The tutorial will...
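As a pointer to the one-sided communication material such tutorials cover, here is a minimal sketch using the standard MPI-2 RMA calls with fence synchronization; the value written and the one-to-all pattern are illustrative only.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int buf = -1;
    MPI_Win win;
    /* Each process exposes one integer for remote access. */
    MPI_Win_create(&buf, sizeof(int), sizeof(int), MPI_INFO_NULL,
                   MPI_COMM_WORLD, &win);

    MPI_Win_fence(0, win);
    if (rank == 0) {
        /* Write a value into every other process's window. */
        int val = 42;
        for (int r = 1; r < size; r++)
            MPI_Put(&val, 1, MPI_INT, r, 0, 1, MPI_INT, win);
    }
    MPI_Win_fence(0, win);      /* completes all pending RMA operations */

    printf("rank %d sees %d\n", rank, buf);
    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
```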
Conference Paper
Efficiently utilizing the rapidly increasing concurrency of multi-petaflop computing systems is a significant programming challenge. One approach is to structure applications with an upper layer of many loosely coupled coarse-grained tasks, each comprising a tightly coupled parallel function or program. "Many-task" programming models such as functi...
Article
Full-text available
Most parallel computing applications in high-performance computing use the Message Passing Interface (MPI) API. Given the fundamental importance of parallel computing to science and engineering research, application correctness is paramount. MPI was originally developed around 1993 by the MPI Forum, a group of vendors, parallel programming researche...
Article
Full-text available
Petascale parallel computers with more than a million processing cores are expected to be available in a couple of years. Although MPI is the dominant programming interface today for large-scale systems that at the highest end already have close to 300,000 processors, a challenging question to both researchers and users is whether MPI will scale to...
Article
Achieving high performance for distributed I/O on a wide-area network continues to be an elusive holy grail. Despite enhancements in network hardware as well as software stacks, achieving high performance remains a challenge. In this paper, our worldwide team took a completely new and non-traditional approach to distributed I/O, called ParaMEDIC: P...
Conference Paper
Full-text available
Parallel programming models on large-scale systems require a scalable system for managing the processes that make up the execution of a parallel program. The process-management system must be able to launch millions of processes quickly when starting a parallel program and must provide mechanisms for the processes to exchange the information needed...
Conference Paper
Full-text available
The Message Passing Interface (MPI) is one of the most widely used programming models for parallel computing. However, the amount of memory available to an MPI process is limited by the amount of local memory within a compute node. Partitioned Global Address Space (PGAS) models such as Unified Parallel C (UPC) are growing in popularity because of t...
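The paper pairs MPI with UPC; as an MPI-only illustration of sharing memory among the processes of one node, here is a sketch using the MPI-3 shared-memory window calls, a later standard feature rather than the hybrid approach the paper describes.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* Group processes that can share physical memory (same node). */
    MPI_Comm node;
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &node);
    int nrank;
    MPI_Comm_rank(node, &nrank);

    /* Node rank 0 allocates; the others attach with zero bytes. */
    MPI_Aint sz = (nrank == 0) ? 1024 * sizeof(double) : 0;
    double *base;
    MPI_Win win;
    MPI_Win_allocate_shared(sz, sizeof(double), MPI_INFO_NULL, node,
                            &base, &win);

    /* Query rank 0's segment so every process sees the same array. */
    MPI_Aint qsize;
    int disp;
    double *shared;
    MPI_Win_shared_query(win, 0, &qsize, &disp, &shared);

    if (nrank == 0) shared[0] = 3.14;   /* plain load/store access */
    MPI_Barrier(node);
    printf("node rank %d reads %f\n", nrank, shared[0]);

    MPI_Win_free(&win);
    MPI_Comm_free(&node);
    MPI_Finalize();
    return 0;
}
```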
Article
Full-text available
With processor speeds no longer doubling every 18-24 months owing to the exponential increase in power consumption and heat dissipation, modern HEC systems tend to rely less on the performance of single processing units. Instead, they rely on achieving high performance by using the parallelism of a massive number of low-frequency/low-power proces...
Conference Paper
Full-text available
Commercial HPC applications are often run on clusters that use the Microsoft Windows operating system and need an MPI implementation that runs efficiently in the Windows environment. The MPI developer community, however, is more familiar with the issues involved in implementing MPI in a Unix environment. In this paper, we discuss some of the differ...
Article
Full-text available
This is the story of a simple programming model, its implementation for extreme computing, and a breakthrough in nuclear physics. A critical issue for the future of high-performance computing is the programming model to use on next-generation architectures. Described here is a promising approach: program very large machines by combining a simplifie...
Article
Full-text available
With petascale systems already available, researchers are devoting their attention to the issues needed to reach the next major level in performance, namely, exascale. Explicit message passing using the Message Passing Interface (MPI) is the most commonly used model for programming petascale systems today. In this paper, we investigate what is need...
Article
Full-text available
One question before the high-performance computing community is 'How will application developers write code for exascale machines?' At this point it looks like they might be riding a rough beast indeed. This paper is a brief assessment of where we stand now with respect to writing programs for our largest supercomputers and what we should do next....
Conference Paper
Full-text available
The MPI datatype functionality provides a powerful tool for describing structured memory and file regions in parallel applications, enabling noncontiguous data to be operated on by MPI communication and I/O routines. However, no facilities are provided by the MPI standard to allow users to efficiently manipulate MPI datatypes in their own codes. W...
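A short example of the MPI datatype facility the paper builds on: a vector type describing one column of a row-major matrix, transferred in a single call. The matrix size is arbitrary and at least two processes are assumed.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* One column of a 4x4 row-major matrix of doubles:
       4 blocks of 1 element, stride 4 elements apart. */
    MPI_Datatype column;
    MPI_Type_vector(4, 1, 4, MPI_DOUBLE, &column);
    MPI_Type_commit(&column);

    double a[16];
    for (int i = 0; i < 16; i++) a[i] = (rank == 0) ? i : -1.0;

    if (rank == 0) {
        MPI_Send(&a[1], 1, column, 1, 0, MPI_COMM_WORLD);  /* 2nd column */
    } else if (rank == 1) {
        MPI_Recv(&a[1], 1, column, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("received column: %g %g %g %g\n", a[1], a[5], a[9], a[13]);
    }

    MPI_Type_free(&column);
    MPI_Finalize();
    return 0;
}
```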
Conference Paper
Full-text available
Petascale machines with close to a million processors will soon be available. Although MPI is the dominant programming model today, some researchers and users wonder (and perhaps even doubt) whether MPI will scale to such large processor counts. In this paper, we examine the issue of how scalable MPI is. We first examine the MPI specification itse...
Conference Paper
MPI is an instantiation of a general-purpose programming model, and high-performance implementations of the MPI standard have provided scalability for a wide range of applications. Ease of use was not an explicit goal of the MPI design process, which emphasized completeness, portability, and performance. Thus it is not surprising that MPI is occasi...
Conference Paper
Full-text available
Considerable work has been done on providing fault tolerance capabilities for different software components on large-scale high-end computing systems. Thus far, however, these fault-tolerant components have worked insularly and independently and information about faults is rarely shared. Such lack of system-wide fault tolerance is emergi...
Article
Full-text available
Upcoming exascale capable systems are expected to comprise more than a million processing elements. As researchers continue to work toward architecting these systems, it is becoming increasingly clear that these systems will utilize a significant amount of shared hardware between processing units; this includes shared caches, memory and network compo...
Article
The historical context with regard to the origin of the DARPA High Productivity Computing Systems (HPCS) program is important for understanding why federal government agencies launched this new, long-term high-performance computing program and renewed their commitment to leadership computing in support of national security, large science an...
Conference Paper
Full-text available
In this paper, we describe disparity, a tool that does parallel, scalable anomaly detection for clusters. Disparity uses basic statistical methods and scalable reduction operations to perform data reduction on client nodes and uses these results to locate node anomalies. We discuss the implementation of disparity and present results of its use on a...
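Disparity's actual interface is not reproduced here; the following is a generic sketch of the underlying idea: compute a global mean and standard deviation with a single reduction, then let each node flag itself as an outlier. The per-node metric and the 3-sigma threshold are assumptions.

```c
#include <mpi.h>
#include <math.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Hypothetical per-node metric, e.g. a micro-benchmark timing. */
    double metric = 1.0 + 0.01 * rank;

    /* One reduction carries both the sum and the sum of squares. */
    double local[2] = { metric, metric * metric }, global[2];
    MPI_Allreduce(local, global, 2, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    double mean = global[0] / size;
    double var  = global[1] / size - mean * mean;
    double sd   = sqrt(var > 0 ? var : 0);

    /* Each node flags itself if it deviates by more than 3 sigma. */
    if (sd > 0 && fabs(metric - mean) > 3 * sd)
        printf("rank %d: metric %g is anomalous (mean %g, sd %g)\n",
               rank, metric, mean, sd);

    MPI_Finalize();
    return 0;
}
```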
Conference Paper
Full-text available
Developing fault management mechanisms is a difficult task because of the unpredictable nature of failures. In this paper, we present a fault simulation framework for Blue Gene/P systems implemented as a part of the Cobalt resource manager. The primary goal of this framework is to support system software development. We also present a hardware diag...
Conference Paper
Full-text available
Modern HEC systems, such as Blue Gene/P, rely on achieving high performance by using the parallelism of a massive number of low-frequency/low-power processing cores. This means that the local pre- and post-communication processing required by the MPI stack might not be very fast, owing to the slow processing cores. Similarly, small amounts of ser...
Article
Full-text available
A powerful method to aid in understanding the performance of parallel applications uses log or trace files containing time-stamped events and states (pairs of events). These trace files can be very large, often hundreds or even thousands of megabytes. Because of the cost of accessing and displaying such files, other methods are often used that redu...
Conference Paper
Full-text available
The paper describes some very early experiments on new architectures that support the hybrid programming model. The results are promising in that OpenMP threads interact with MPI as desired, allowing OpenMP-agnostic tools to be used. They explore three environments: a 'typical' Linux cluster, a new large-scale machine from SiCortex, and the new IBM...
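A minimal hybrid MPI+OpenMP sketch of the kind such experiments exercise: MPI_Init_thread requests FUNNELED support, OpenMP threads compute inside each process, and MPI reduces across processes. The work inside the parallel region is a placeholder.

```c
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    /* FUNNELED: only the main thread will make MPI calls. */
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    if (provided < MPI_THREAD_FUNNELED)
        MPI_Abort(MPI_COMM_WORLD, 1);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double local = 0.0;
    #pragma omp parallel reduction(+ : local)
    {
        /* Thread-parallel placeholder work inside each MPI process. */
        local += omp_get_thread_num() + 1;
    }

    double total;
    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0) printf("global sum = %g\n", total);

    MPI_Finalize();
    return 0;
}
```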
Conference Paper
Full-text available
MPI-2 introduced many new capabilities, including dynamic process management, one-sided communication, and parallel I/O. Implementations of these features are becoming widespread. This tutorial shows how to use these features by showing all of the steps involved in designing, coding, and tuning solutions to specific problems. The problems are chose...
Conference Paper
Full-text available
In this paper, we present an architecture that encapsulates system hardware inside a software component used for job execution and status monitoring. The development of this interface has enabled system simulation, which yields a number of novel benefits, including dramatically improved debug and testing capabilities.
Article
Full-text available
An MPI profiling library is a standard mechanism for intercepting MPI calls by applications. Profiling libraries are so named because they are commonly used to gather runtime information about performance characteristics. Here we present a profiling library whose purpose is to detect user errors in the use of MPI's collective operations. While some...
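The interception mechanism such libraries rely on is the standard name-shifted PMPI interface. Here is a minimal sketch, not the library's own checks: a wrapper for MPI_Bcast that verifies all processes agree on the root before forwarding to the real implementation. It compiles as a separate object linked ahead of the MPI library.

```c
#include <mpi.h>
#include <stdio.h>

/* Interposed MPI_Bcast: check root consistency, then call the real
   routine via PMPI. The check itself uses PMPI to avoid recursion. */
int MPI_Bcast(void *buf, int count, MPI_Datatype type, int root,
              MPI_Comm comm)
{
    int rank, root0 = root;
    PMPI_Comm_rank(comm, &rank);

    /* Distribute rank 0's root value and compare locally. */
    PMPI_Bcast(&root0, 1, MPI_INT, 0, comm);
    if (root0 != root)
        fprintf(stderr, "rank %d: MPI_Bcast root mismatch (%d vs %d)\n",
                rank, root, root0);

    return PMPI_Bcast(buf, count, type, root, comm);
}
```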
Article
Full-text available
While MPI is the most common mechanism for expressing parallelism, MPI programs are not composable by using current MPI process managers or parallel shells. We introduce MPISH2, an MPI process manager analogous to serial Unix shells. It allows the composition of MPI and serial Unix utilities with one another to perform scalable tasks across large n...
Article
Full-text available
We present a summary of the current state of DARPA's HPCS language project. We describe the challenges facing any new language for scalable parallel computing, including the strong competition presented by MPI and the existing Partitioned Global Address Space (PGAS) Languages. We identify some of the major features of the proposed languages, using...
Conference Paper
The computing power to be made available to applications in the coming years continues to increase. Hardware vendors anticipate many cores on single chips and fast networks connecting them, enabling a bewildering array of new approaches to parallel programming whose superiority to "classical" approaches (MPI) remains uncertain. One certainty is tha...
Conference Paper
Flash has been successful in simulating a wide variety of astrophysical problems, both within the Flash Center and in the external community. The code has steadily gained acceptance since its initial release for the following reasons: (1) it is easily ported to a variety of computer architectures and the distribution includes support for many stand...
Conference Paper
Full-text available
This paper proposes an interface that will allow MPI 2 dynamic programs – those using MPI SPAWN, CONNECT/ACCEPT, or JOIN – to provide information to parallel debuggers such as TotalView about the set of processes that constitute an individual application. The TotalView parallel debugger currently obtains information about the identity of processes...
Conference Paper
MPI-2 introduced many new capabilities, including dynamic process management, one-sided communication, and parallel I/O. Implementations of these features are becoming widespread. This tutorial shows how to use these features by showing all of the steps involved in designing, coding, and tuning solutions to specific problems. The problems are chose...
Article
Full-text available
Systems software for clusters typically derives from a multiplicity of sources: the kernel itself, software associated with a particular distribution, site-specific purchased or open-source software, and assorted home-grown tools and procedures that attempt to glue everything together to meet the needs of the users and administrators of a particula...
Conference Paper
Teraflop performance is no longer something of the future as complex integrated and multiscale 3D simulations drive supercomputer development. This tutorial addresses computation at the highest end. An overview of architectures is given (BlueGene/L, Columbia, NEC SX-8, Cray and IBM lines, high-performing clusters) along with programming tools neces...
Conference Paper
Full-text available
This tutorial is about advanced use of MPI, in particular the parallel I/O and one-sided communication features added in MPI-2. Implementations are now available both from vendors and from open-source projects so that these MPI-2 capabilities can now really be used in practice. The tutorial will be heavily example-driven. For each example we introd...
Conference Paper
Full-text available
While MPI is the most common mechanism for expressing parallelism, MPI programs remain poorly integrated in Unix environments. We introduce MPISH2, an MPI process manager analogous to serial Unix shells. It provides better integration capabilities for MPI programs by providing a uniform execution mechanism for parallel and serial programs, expo...
Conference Paper
Systems software for clusters and other parallel systems affects multiple types of users. End users interact with it to submit and interact with application jobs and to avail themselves of scalable system tools. Systems administrators interact with it to configure and build software installations on individual nodes, schedule, manage, and account f...
Conference Paper
Full-text available
While previous work has shown MPI to provide capabilities for system software, actual adoption has not widely occurred. We discuss process management shortcomings in MPI implementations and their impact on MPI usability for system software and management tasks. We introduce MPISH, a parallel shell designed to address these issues.
Conference Paper
Full-text available
An MPI profiling library is a standard mechanism for intercepting MPI calls by applications. Profiling libraries are so named because they are commonly used to gather performance data on MPI programs. Here we present a profiling library whose purpose is to detect user errors in the use of MPI’s collective operations. While some errors can be detect...
Article
Full-text available
The growth in computing resources at scientific computing centers has created new challenges for system software. These multi-teraflop systems often exceed the capabilities of the system software and require new approaches to accommodate these large processor counts. The costs associated with development and maintenance of this software are also si...
Conference Paper
Full-text available
We describe the use of component architecture in an area to which this approach has not been classically applied, the area of cluster system software. By "cluster system software," we mean the collection of programs used in configuring and maintaining individual nodes, together with the software involved in submission, scheduling, monitoring, and t...
Conference Paper
By “cluster system software,” we mean the software that turns a collection of individual machines into a powerful resource for a wide variety of applications. In this talk we will examine one loosely integrated collection of open-source cluster system software that includes an infrastructure for building component-based systems management tools, a...
Conference Paper
Full-text available
We describe the use of MPI for writing system software and tools, an area where it has not been previously applied. By "system software" we mean collections of tools used for system management and operations. We describe the common methodologies used for system software development, together with our experiences in implementing three items of sys...
Article
Full-text available
In this paper we examine the topic of writing fault-tolerant Message Passing Interface (MPI) applications. We discuss the meaning of fault tolerance in general and what the MPI Standard has to say about it. We survey several approaches to this problem, namely checkpointing, restructuring a class of standard MPI programs, modifying MPI semantics, an...
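One of the surveyed directions, handling errors instead of aborting, can be sketched with standard MPI error handlers. Note that the MPI Standard does not guarantee the communicator remains usable after a failure, so the recovery step below is a placeholder.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* By default MPI aborts on error; request error codes instead,
       so the application can attempt recovery (e.g., from checkpoint). */
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int buf = rank;
    int err = MPI_Bcast(&buf, 1, MPI_INT, 0, MPI_COMM_WORLD);
    if (err != MPI_SUCCESS) {
        char msg[MPI_MAX_ERROR_STRING];
        int len;
        MPI_Error_string(err, msg, &len);
        fprintf(stderr, "rank %d: Bcast failed: %s\n", rank, msg);
        /* recovery logic (reload checkpoint, rebuild communicator) here */
    }

    MPI_Finalize();
    return 0;
}
```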
Article
Full-text available
We report on an effort to develop methodologies for formal verification of parts of the Multi-Purpose Daemon (MPD) parallel process management system. MPD is a distributed collection of communicating processes. While the individual components of the collection execute simple algorithms, their interaction leads to unexpected errors that are difficul...
Conference Paper
Full-text available
The systems software necessary to operate large-scale parallel computers presents a variety of research and development issues. One approach is to consider systems software as a collection of interacting components, with well-defined published interfaces. The scalable systems software SciDAC project is currently exploring the feasibility of archite...
Chapter
The major research results from the Scalable Input/Output Initiative, exploring software and algorithmic solutions to the I/O imbalance. As we enter the "decade of data," the disparity between the vast amount of data storage capacity (measurable in terabytes and petabytes) and the bandwidth available for accessing it has created an input/output bot...
Conference Paper
MPI is often thought of as a low-level approach, even as a sort of “assembly language,” for parallel programming. This is both true and false. While MPI is designed to afford the programmer the ability to control the flow of data at a detailed level for maximum performance, MPI also provides highly expressive operations that support high-level prog...
Conference Paper
The Scalable Systems Software Project is exploring the design of a systems software architecture based on separate, replaceable components interacting through publicly defined interfaces. This talk will describe how a scalable process manager has provided the implementation of the process management component of that design. We describe a general,...
Article
We describe an architecture for the runtime environment for parallel applications as prelude to describing how parallel application might interface to their environment in a portable way. We propose extensions to the Message-Passing Interface (MPI) Standard that provide for dynamic process management, including spawning of new processes by a runnin...
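A minimal sketch of the dynamic process management extension discussed: MPI_Comm_spawn starts child processes and returns an intercommunicator connecting parents and children. The "./worker" executable is hypothetical; it would call MPI_Comm_get_parent to reach this program.

```c
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Collectively spawn 4 instances of a (hypothetical) worker. */
    MPI_Comm children;
    int errs[4];
    MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 4, MPI_INFO_NULL, 0,
                   MPI_COMM_WORLD, &children, errs);

    if (rank == 0) {
        int work = 7;
        /* Ranks in the intercommunicator refer to the remote group,
           so this sends to child rank 0. */
        MPI_Send(&work, 1, MPI_INT, 0, 0, children);
    }

    MPI_Comm_disconnect(&children);
    MPI_Finalize();
    return 0;
}
```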
Article
Full-text available
This paper examines the topic of writing fault-tolerant MPI applications. We discuss the meaning of fault tolerance in general and what the MPI Standard has to say about it. We survey several approaches to this problem, namely checkpointing, restructuring a class of standard MPI programs, modifying MPI semantics, and extending the MPI specification...
Conference Paper
Large-scale parallel programs present multiple problems in process management, from scalable process startup to runtime monitoring and signal delivery, to rundown and cleanup. Interactive parallel jobs present special problems in management of standard I/O. In this talk we will present an approach that addresses these issues. The key concept is tha...
Conference Paper
This tutorial will cover parallel programming with the MPI message passing interface, with special attention paid to the issues that arise in a computational grid environment. After a summary of MPI programming, we will address the issue of process management, first in a single administrative domain and then across multiple administrative domai...
Article
Full-text available
PVM and MPI, two systems for programming clusters, are often compared. The comparisons usually start with the unspoken assumption that PVM and MPI represent different solutions to the same problem. In this paper we show that, in fact, the two systems often are solving different problems. In cases where the problems do match but the solutions chosen...
Article
The efficient implementation of collective communication operations has received much attention. Initial efforts produced "optimal" trees based on network communication models that assumed equal point-to-point latencies between any two processes.
Conference Paper
Full-text available
We describe our experiences in using Spin to verify parts of the Multi Purpose Daemon (MPD) parallel process management system.
Conference Paper
Summary form only given. In April of 1992, a group of parallel computing vendors, computer science researchers, and application scientists met at a one-day workshop and agreed to cooperate on the development of a community standard for the message-passing model of parallel computing. The MPI Forum that eventually emerged from that workshop became a...
Conference Paper
Full-text available
PVM and MPI, two systems for programming clusters, are often compared. The comparisons usually start with the unspoken assumption that PVM and MPI represent different solutions to the same problem. In this paper we show that, in fact, the two systems often are solving different problems. In cases where the problems do match but the solutions chosen...
Article
Full-text available
The efficient implementation of collective communication operations has received much attention. Initial efforts produced "optimal" trees based on network communication models that assumed equal point-to-point latencies between any two processes. This assumption is violated in most practical settings, however, particularly in heterogeneous systems s...
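A sketch of the latency-aware idea, not the paper's algorithm: broadcast over the slow inter-node links first, among one leader per node, then over the fast intra-node links. It uses the MPI-3 MPI_COMM_TYPE_SHARED split (a later standard feature) and assumes the data originates at global rank 0, which is taken to be a node leader.

```c
#include <mpi.h>
#include <stdio.h>

/* Two-level broadcast: leaders (one per node) first, nodes second. */
int hier_bcast(void *buf, int count, MPI_Datatype type, MPI_Comm comm)
{
    MPI_Comm node, leaders;
    int nrank;

    MPI_Comm_split_type(comm, MPI_COMM_TYPE_SHARED, 0, MPI_INFO_NULL, &node);
    MPI_Comm_rank(node, &nrank);

    /* Node-local rank 0 processes form the leader communicator. */
    MPI_Comm_split(comm, nrank == 0 ? 0 : MPI_UNDEFINED, 0, &leaders);

    if (leaders != MPI_COMM_NULL) {
        MPI_Bcast(buf, count, type, 0, leaders);   /* slow links first */
        MPI_Comm_free(&leaders);
    }
    MPI_Bcast(buf, count, type, 0, node);          /* fast links second */

    MPI_Comm_free(&node);
    return MPI_SUCCESS;
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, x = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) x = 99;
    hier_bcast(&x, 1, MPI_INT, MPI_COMM_WORLD);
    printf("rank %d got %d\n", rank, x);
    MPI_Finalize();
    return 0;
}
```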
Chapter
Comprehensive guides to the latest Beowulf tools and methodologies. Beowulf clusters, which exploit mass-market PC hardware and software in conjunction with cost-effective commercial network technology, are becoming the platform for many scientific, engineering, and commercial applications. With growing popularity has come growing complexity. Addre...
Chapter
Comprehensive guides to the latest Beowulf tools and methodologies. Beowulf clusters, which exploit mass-market PC hardware and software in conjunction with cost-effective commercial network technology, are becoming the platform for many scientific, engineering, and commercial applications. With growing popularity has come growing complexity. Addre...
Chapter
Comprehensive guides to the latest Beowulf tools and methodologies. Beowulf clusters, which exploit mass-market PC hardware and software in conjunction with cost-effective commercial network technology, are becoming the platform for many scientific, engineering, and commercial applications. With growing popularity has come growing complexity. Addre...
Chapter
Comprehensive guides to the latest Beowulf tools and methodologies. Beowulf clusters, which exploit mass-market PC hardware and software in conjunction with cost-effective commercial network technology, are becoming the platform for many scientific, engineering, and commercial applications. With growing popularity has come growing complexity. Addre...
Article
We describe a family of MPI applications we call the Parallel Unix Commands. These commands are natural parallel versions of common Unix user commands such as ls, ps, and find, together with a few similar commands particular to the parallel environment. We describe the design and implementation of these programs and present some performance results...
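In the spirit of (but not reproducing) the Parallel Unix Commands, here is a sketch of a parallel directory listing: every rank counts the entries in a path on its own node and rank 0 prints one merged line per process.

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <dirent.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const char *path = argc > 1 ? argv[1] : "/tmp";
    int count = 0;
    DIR *d = opendir(path);
    if (d) {
        while (readdir(d)) count++;
        closedir(d);
    }

    char host[MPI_MAX_PROCESSOR_NAME];
    int len;
    MPI_Get_processor_name(host, &len);

    int *counts = NULL;
    char *hosts = NULL;
    if (rank == 0) {
        counts = malloc(size * sizeof(int));
        hosts  = malloc(size * MPI_MAX_PROCESSOR_NAME);
    }
    MPI_Gather(&count, 1, MPI_INT, counts, 1, MPI_INT, 0, MPI_COMM_WORLD);
    MPI_Gather(host, MPI_MAX_PROCESSOR_NAME, MPI_CHAR,
               hosts, MPI_MAX_PROCESSOR_NAME, MPI_CHAR, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        for (int r = 0; r < size; r++)
            printf("%s: %d entries in %s\n",
                   &hosts[r * MPI_MAX_PROCESSOR_NAME], counts[r], path);
        free(counts);
        free(hosts);
    }

    MPI_Finalize();
    return 0;
}
```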
Article
Parallel jobs are different from sequential jobs and require a different type of process management. We present here a process management system for parallel programs such as those written using MPI. A primary goal of the system, which we call MPD (for multipurpose daemon), is to be scalable. By this we mean that startup of interactive parallel job...
Conference Paper
We describe a family of MPI applications we call the Parallel Unix Commands. These commands are natural parallel versions of common Unix user commands such as ls, ps, and find, together with a few similar commands particular to the parallel environment. We describe the design and implementation of these programs and present some performance results...
Article
Full-text available
We describe a family of MPI applications we call the Parallel Unix Commands. These commands are natural parallel versions of common Unix user commands such as ls, ps, and find, together with a few similar commands particular to the parallel environment. We describe the design and implementation of these programs and present some performance results...
Article
The I/O access patterns of many parallel applications consist of accesses to a large number of small, noncontiguous pieces of data. If an application's I/O needs are met by making many small, distinct I/O requests, however, the I/O performance degrades drastically. To avoid this problem, MPI–IO allows users to access noncontiguous data with a singl...
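A minimal sketch of expressing a strided, noncontiguous access pattern as one collective MPI-IO call, via a file view built from a vector datatype. The file name "data.bin" and the block sizes are assumptions; the file is assumed to be large enough.

```c
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each process reads every size-th block of 4 ints: a strided,
       noncontiguous pattern expressed once as a datatype. */
    MPI_Datatype filetype;
    MPI_Type_vector(100, 4, 4 * size, MPI_INT, &filetype);
    MPI_Type_commit(&filetype);

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "data.bin", MPI_MODE_RDONLY,
                  MPI_INFO_NULL, &fh);
    /* Offset each process to its own 4-int block within the stride. */
    MPI_File_set_view(fh, rank * 4 * (MPI_Offset)sizeof(int), MPI_INT,
                      filetype, "native", MPI_INFO_NULL);

    int buf[400];
    /* One collective call covers all 100 noncontiguous blocks. */
    MPI_File_read_all(fh, buf, 400, MPI_INT, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Type_free(&filetype);
    MPI_Finalize();
    return 0;
}
```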
Conference Paper
A variety of projects worldwide are developing what we call "heterogeneous MPI". These MPI implementations are designed to operate on multiple computers, perhaps of different types, ranging in complexity from a set of desktop workstations to several supercomputers connected via a wide area network. These considerations led us to investigate the fea...
Article
Full-text available
We describe a number of early efforts to make use of the Message-Passing Interface (MPI) standard in applications, based on an informal survey conducted in May-June, 1994. Rather than a definitive statement of all MPI developmental work, this paper addresses the initial successes, progress, and impressions that application developers have had wit...
Conference Paper
Full-text available
Abstract not available.
Conference Paper
In this paper we describe a trace analysis framework, from trace generation to visualization. It includes a unified tracing facility on IBM SP systems, a self-defining interval file format, an API for framework extensions, utilities for merging and statistics generation, and a visualization tool with preview and multiple time-space diagrams. The...
Article
Full-text available
This paper describes our experiences so far...
Article
In this paper we describe a trace analysis framework, from trace generation to visualization. It includes a unified tracing facility on IBM SP systems, a self-defining interval file format, an API for framework extensions, utilities for merging and statistics generation, and a visualization tool with preview and multiple time-space diagrams. The tr...
Conference Paper
Full-text available
We present a process management system for parallel programs such as those written using MPI. A primary goal of the system, which we call MPD (for multipurpose daemon), is to be scalable. By this we mean that startup of interactive parallel jobs comprising a thousand processes is quick, that signals can be quickly delivered to processes, and that s...
Article
The Message Passing Interface (MPI) can be used as a portable, high-performance programming model for wide-area computing systems. The wide-area environment introduces challenging problems for the MPI implementor, due to the heterogeneity of both the underlying physical infrastructure and the software environment at different sites. In this article...
Article
Full-text available
Contents: 1. Introduction; 2. Linking and running programs (2.1 Scripts to Compile and Link Applications; 2.2 Running with mpirun; 2.3 More detailed control); 3. Special features of different syste...
Article
Full-text available
The efficient implementation of collective communication operations has received much attention. Initial efforts modeled network communication and produced "optimal" trees based on those models. However, the models used by these initial efforts assumed equal point-to-point latencies between any two processes. This assumption is violated in heterogeneous sy...
Article
Full-text available
The Center for Astrophysical Thermonuclear Flashes is constructing a new generation of codes designed to study runaway thermonuclear burning on the surface or in the interior of evolved compact stars. The center has completed the first version of Flash, Flash-1, which addresses various astrophysics problems. Flash-1 represents a major advance towar...
Article
Full-text available
Parallel computers are increasingly being used to run large-scale applications that also have huge I/O requirements. However, many applications obtain poor I/O performance on modern parallel machines. This special issue of IJSA contains papers that describe the I/O requirements and the techniques used to perform I/O in real parallel applications. W...
Article
Full-text available
Many large-scale applications on parallel machines are bottlenecked by the I/O performance rather than the CPU or communication performance of the system. To improve the I/O performance, it is first necessary for system designers to understand the I/O requirements of various applications. This paper presents the results of a study of the I/O charac...
Article
PVM and MPI are often compared. These comparisons usually start with the unspoken assumption that PVM and MPI represent different solutions to the same problem. In this paper we show that, in fact, the two systems often are solving different problems. In cases where the problems do match but the solutions chosen by PVM and MPI are different, we e...
Conference Paper
The efficient implementation of collective communication operations has received much attention. Initial efforts modeled network communication and produced “optimal” trees based on those models. However, the models used by these initial efforts assumed equal point-to-point latencies between any two processes. This assumption is violated in heteroge...
