Article

CODE: A Unified Approach to Parallel Programming

Authors: J. C. Browne, M. Azam, S. Sobek

Abstract

The authors describe CODE (computation-oriented display environment), which can be used to develop modular parallel programs graphically in an environment built around fill-in templates. It also lets programs written in any sequential language be incorporated into parallel programs targeted for any parallel architecture. Broad expressive power was obtained in CODE by including abstractions of all the dependency types that occur in the widely used parallel-computation models and by keeping the form used to specify firing rules general. The CODE programming language is a version of generalized dependency graphs designed to encode the unified parallel-computation model. A simple example is used to illustrate the abstraction level in specifying dependencies and how they are separated from the computation-unit specification. The most important CODE concepts are described by developing a declarative, hierarchical program with complex firing rules and multiple dependency types.
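As a concrete (and purely illustrative) rendering of the model just described, the C sketch below shows the flavor of a generalized dependency-graph node: input arcs hold queued data, the firing rule is an arbitrary predicate over the arcs, and the computation unit is ordinary sequential code kept separate from the dependency specification. The names and the conjunctive rule are our own assumptions, not CODE's notation.

    #include <stdbool.h>
    #include <stdio.h>

    #define CAP 16

    /* One input arc of a node: a FIFO of pending data values. */
    typedef struct { int buf[CAP]; int head, tail; } Arc;

    static bool has_datum(const Arc *a) { return a->head != a->tail; }
    static void put(Arc *a, int v)      { a->buf[a->tail++ % CAP] = v; }
    static int  take(Arc *a)            { return a->buf[a->head++ % CAP]; }

    /* A firing rule is any predicate over the states of the input arcs.
       This one is conjunctive; CODE's general rules may also be
       disjunctive or mix dependency types. */
    static bool may_fire(const Arc *x, const Arc *y) {
        return has_datum(x) && has_datum(y);
    }

    /* The computation unit is plain sequential code, specified
       separately from the dependencies. */
    static int compute(int a, int b) { return a + b; }

    int main(void) {
        Arc x = {0}, y = {0}, out = {0};
        put(&x, 1); put(&x, 2);              /* upstream nodes deposit data */
        put(&y, 10);
        while (may_fire(&x, &y))             /* scheduler: fire while enabled */
            put(&out, compute(take(&x), take(&y)));
        while (has_datum(&out))
            printf("%d\n", take(&out));      /* prints 11; 2 stays queued on x */
        return 0;
    }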


... Few researchers have combined Artificial Intelligence with a compiler to automate the process. (The citing text's footnotes explain that Java is a concurrent, class-based, object-oriented programming language; that C++ is a programming language standardized by the International Organisation for Standardization (ISO); and that logic programming is a programming paradigm largely based on formal logic.) It has been summarised in the research paper "Artificial Intelligence meets Compilers", where a human or machine writes programs based on knowledge of algorithms, data structures, design patterns, etc. (Browne et al., 1989; Weiss and Gerhard, 1999). Programming by Example projects would use AI mechanisms to accelerate computing by creating automatic database queries (Browne et al., 1989). ...
... AI techniques should be initiated to represent, find, and substantiate design patterns so as to automate the program. ...
... The belief that '…programming hasn't changed in 30 years…' (Browne et al., 1989) can be challenged. The power-based strategy for AI uses the speed and power of the computer to derive answers to problems by search, starting from a small number of principles. ...
Article
Full-text available
... The observed orderings, therefore, indicate those dependences of actions that "caused" the events to be ordered. So, the event orderings of Fig. 2(a) indicate that they were caused by the dependences of Fig. 2(b): ordering p_1 < w_3 was caused by the dependence (p, w), ordering q_1 < w_2 by the dependence (q, w), and ordering w_1 < w_2 by the dependence (w, w). ...
... The write-read dependence on a shared synchronization variable, or on a message, forces the actions to execute in a particular order. Our motivation for representing the synchronization dependences as data-flow dependences comes from the language-independence and machine-independence goals of the CODE graphical programming environment [2], [15]. The data-flow characterization of the synchronization and control-flow dependences in CODE allows the environment to support shared-memory as well as distributed systems. ...
... The representation of the state of a data-flow dependence in § 2.0 by a string of values (or an infinite FIFO buffer) takes this dependence into account. Moreover, it allows us to model the general cases of the send and receive of messages in distributed systems, and the data-flow dependences of graphical/visual languages like CODE [2], [15]. But the representation may create problems in modeling the synchronization primitives of shared-memory systems. ...
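The shared-memory caveat in the snippet above can be made concrete with a few lines of C. This is our illustration, not code from the cited work: a data-flow dependence modeled as a string of values remembers every send, while a binary synchronization flag saturates, so two signals before a wait collapse into one.

    #include <stdbool.h>
    #include <stdio.h>

    /* Data-flow view: the dependence state is a string of values, so
       every send is remembered (modeled as a count of queued tokens). */
    typedef struct { int pending; } FlowDep;
    static void flow_send(FlowDep *d) { d->pending++; }
    static bool flow_recv(FlowDep *d) { return d->pending > 0 ? (d->pending--, true) : false; }

    /* Shared-memory view: a binary flag saturates, losing the second
       of two back-to-back signals -- the modeling problem noted above. */
    typedef struct { bool set; } BinFlag;
    static void flag_signal(BinFlag *f) { f->set = true; }
    static bool flag_wait(BinFlag *f)   { return f->set ? (f->set = false, true) : false; }

    int main(void) {
        FlowDep d = {0}; BinFlag f = {false};
        flow_send(&d); flow_send(&d);         /* two sends   */
        flag_signal(&f); flag_signal(&f);     /* two signals */
        printf("fifo: %d %d\n", flow_recv(&d), flow_recv(&d)); /* 1 1 */
        printf("flag: %d %d\n", flag_wait(&f), flag_wait(&f)); /* 1 0 */
        return 0;
    }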
... The template-based approach to parallel programming was used in the late 1980s in systems like CODE [6] and FrameWorks [27]. Some recent systems based on similar techniques include CODE2 [7], Enterprise [25], HeNCE [7], PUL-TUF [31], and Tracs [4]. ...
... Its significance was recognized and explored by a number of researchers and system designers. A number of parallel programming systems support such structures [2, 4, 6, 7, 10-12, 25-27]. All these systems employ the idea of separation of specifications (also refer to section 1). ...
Article
Parallel programming is complicated. This complexity arises from the compounding of low-level parallelism-related issues with the problems of writing good sequential code. Over the years, various approaches have been proposed to aid parallel program developers. These approaches employ high-level models of parallel computation, thus hiding the low-level parallelism-related details from the user. Different approaches employ different abstraction techniques, such as communication libraries, macros, new parallel languages and abstract data types. In this paper, we present a template-based approach to parallel application development, which uses frequently occurring patterns for parallelism. A parallel template is a re-usable, application-independent encapsulation of a commonly used parallel computing pattern. It is implemented as a re-usable code-skeleton for quick and reliable development of parallel applications. In the past, parallel programming systems have allowed fast prototyping of parallel applications based on commonly occurring communication and synchronization structures. The uniqueness of this approach is that the templates in this model are generic, with associated structural and behavioral attributes which can be parameterized. Templates have standard interfaces which facilitate their composition. Unlike the similar approaches in the past, which were mostly suitable for solving a limited subset of parallel applications, this approach provides a systematic development model for the hierarchical development and the subsequent refinement of a vast majority of coarse-grained parallel applications, which can be suitably solved on a network cluster. Two of the main issues addressed are: the degree of flexibility in application development and the extendibility (hence adaptability) of the development system as per the user's needs. Both of these issues were major concerns in the past.
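The abstract's notion of a generic template with parameterizable structural and behavioral attributes can be sketched in C as a structure that carries the application-independent shape plus user-supplied hooks. This is a minimal sketch under our own naming, not the system's actual interface, and it runs the "farm" sequentially where a real system would dispatch in parallel.

    #include <stdio.h>

    /* A parallel "template": application-independent structure with a
       structural attribute (workers) and behavioral attributes (hooks). */
    typedef struct {
        int workers;                        /* structural attribute        */
        int (*work)(int task);              /* behavioral: per-task code   */
        int (*merge)(int acc, int partial); /* behavioral: combine results */
    } FarmTemplate;

    static int farm_run(const FarmTemplate *t, int ntasks) {
        int acc = 0;
        for (int i = 0; i < ntasks; i++)    /* stand-in for parallel dispatch */
            acc = t->merge(acc, t->work(i));
        return acc;
    }

    /* Application-specific components supplied by the user. */
    static int square(int x)     { return x * x; }
    static int add(int a, int b) { return a + b; }

    int main(void) {
        FarmTemplate farm = { .workers = 4, .work = square, .merge = add };
        printf("sum of squares 0..9 = %d\n", farm_run(&farm, 10)); /* 285 */
        return 0;
    }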
... Many of the systems that employ a separation of specifications and code are based on the data-flow model. Example systems are CODE [10], DGL [22], LGDF [15] and Paralex [3]. Some of these models also provide hierarchical resolution of parallelism [10, 24]; others don't [3, 15, 22]. ...
Conference Paper
Full-text available
For almost a decade we have been working at developing and using template-based models for coarse-grained parallel computing. Our initial system, FrameWorks, was positively received but had a number of shortcomings. The Enterprise parallel programming environment evolved out of this work, and now, after several years of experience with the system, its shortcomings are becoming evident. This paper outlines our experiences in developing and using the two parallel programming systems. Many of our observations are relevant to other parallel programming systems, even though they may be based on different assumptions. Although template-based models have the potential for simplifying the complexities of parallel programming, they have yet to realize these expectations for high-performance applications.
... Starting with the late 80s, several pattern-based systems have been built with the intention to facilitate the rapid development of parallel applications through the use of pre-implemented, reusable components. Some of the earlier systems include Code [4] and Frameworks [28]. Some of the recent systems based on similar ideas are: Enterprise [26], Code2 [5], HeNCE [5], Tracs [2], and DPnDP [30]. ...
... These are some of the other essential features that are lacking in most of the existing pattern-based approaches to parallel computing. In the past, several parallel programming systems have supported frequently used parallel interactions [4, 5, 26, 28]. However, in all these cases a fixed set of high-level parallel interactions has been hard-coded into the system. ...
Article
The concept of design patterns has been extensively studied and applied in the context of object-oriented software design. Similar ideas are being explored in other areas of computing as well. Over the past several years, researchers have been experimenting with the feasibility of employing design-patterns related concepts in the parallel computing domain. In the past, several pattern-based systems have been developed with the intention to facilitate faster parallel application development through the use of pre-implemented and reusable components that are based on frequently used parallel computing design patterns. However, most of these systems face several serious limitations such as limited flexibility, zero extensibility, and the ad hoc nature of their components. Lack of flexibility in a parallel programming system limits a programmer to using only the high-level components provided by the system. Lack of extensibility here refers to the fact that most of the existing pattern-based parallel programming systems come with a set of pre-built patterns integrated into the system. However, the system provides no obvious way of increasing the repertoire of patterns when the need arises. Also, most of these systems do not offer any generic view of a parallel computing pattern, a fact which may be at the root of several of their shortcomings. This research proposes a generic (i.e., pattern- and application-independent) model for realizing and using parallel design patterns. The term "Parallel Architectural Skeleton" is used to represent the set of generic attributes associated with a pattern. The Parallel Architectural Skeleton Model (PASM) is based on the message-passing paradigm, which makes it suitable for a LAN of workstations and PCs. The model is flexible as it allows the intermixing of high-level patterns with low-level message-passing primitives. An object-oriented and library-based implementation of the model has been completed using C++ and MPI, without necessitating any language extension. The generic model and the library-based implementation allow new patterns to be defined and included into the system. The skeleton-library serves as a "framework" for the systematic, hierarchical development of network-oriented parallel applications.
... The CODE, CODE 2.0, and HeNCE languages [28-30] are based on graph structures, where nodes define simple operations and the edges represent the order of their execution. Such a visual approach allows for a better representation of the concurrency of computations. ...
Article
Full-text available
Distributed, large-scale computing is typically performed using textual general-purpose programming languages. This requires significant programming skills associated with the parallelisation and distribution of computations. In this paper, we present a visual (graphical) programming language called the Computation Application Language (CAL) to raise abstraction in distributed computing. CAL programs define computation workflows by visualising data flowing between computation units. The goal is to reduce the amount of traditional code needed and thus facilitate development even by non-professional programmers. The language follows the low-code paradigm, i.e. its implementation (the editor and the runtime system) is available online. We formalise the language by defining its syntax using a metamodel and specifying its semantics using a two-step approach. We define a translation of CAL into an intermediate language which is then defined using an operational approach. This formalisation was used to develop a programming and execution environment. The environment orchestrates computations by interpreting the intermediate language and managing the instantiation of computation modules using data tokens. We also present an explanatory case-study example that shows a practical application of the language.
... Data flow languages generally operate at the level of fundamental operations rather than at a functional granularity. One exception is CODE 2, which permits incorporation of sequential code into a dynamic flow graph, but restricts shared state to a special node type [Browne et al. 1989; Browne et al. 2000]. Data flow languages also typically prohibit global state. ...
Article
Programming high-performance server applications is challenging: it is both complicated and error-prone to write the concurrent code required to deliver high performance and scalability. Server performance bottlenecks are difficult to identify and correct. Finally, it is difficult to predict server performance prior to deployment. This paper presents Flux, a language that dramatically simplifies the construction of scalable high-performance server applications. Flux lets programmers compose off-the-shelf, sequential C, C++, or Java functions into concurrent servers. The Flux compiler type-checks programs and guarantees that they are deadlock-free. We have built a number of servers in Flux, including a web server with PHP support, an image-rendering server, a BitTorrent peer, and a game server. These Flux servers perform comparably to their counterparts written entirely in C. By tracking hot paths through a running server, Flux simplifies the identification of performance bottlenecks. The Flux compiler also automatically generates discrete event simulators that accurately predict actual server performance under load and with different hardware resources.
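Flux's actual syntax is not reproduced here; the pthreads sketch below (stage names invented) only illustrates the core idea of composing off-the-shelf sequential functions into a concurrent server, with the concurrency supplied around, rather than inside, the stages.

    #include <pthread.h>
    #include <stdio.h>

    /* Off-the-shelf sequential stages; the "program" is just their
       composition, and the runtime supplies the concurrency. */
    static int  parse(int req)  { return req * 10; }
    static int  render(int job) { return job + 1;  }
    static void respond(int r)  { printf("response %d\n", r); }

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static int next_req = 0;

    static void *worker(void *arg) {
        (void)arg;
        for (;;) {
            pthread_mutex_lock(&lock);          /* claim the next request */
            int req = next_req < 8 ? next_req++ : -1;
            pthread_mutex_unlock(&lock);
            if (req < 0) return NULL;
            respond(render(parse(req)));        /* the composed flow */
        }
    }

    int main(void) {
        pthread_t t[4];
        for (int i = 0; i < 4; i++) pthread_create(&t[i], NULL, worker, NULL);
        for (int i = 0; i < 4; i++) pthread_join(t[i], NULL);
        return 0;
    }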
... A compiler has many filters (lexical analysis, parsing, semantic analysis and code generation) through which a program passes, after which we get the final machine code. Other well-known examples of the pipe-and-filter style are programming in the Unix shell [1], the signal processing domain [2], parallel programming [3], functional programming [4] and distributed systems. ...
... There have been a few modeling efforts in the parallel programming domain. The CODE [10] programming language is based on a generalized dependency graph to express the computation in a unified parallel computation model without any implementation details. GASPARD [16] is another visual parallel programming environment supporting task and data parallelism. ...
Article
Full-text available
As the computation power in desktops advances, parallel programming has emerged as one of the essential skills needed by next generation software engineers. However, programs written in popular parallel programming paradigms have a substantial amount of sequential code mixed with the parallel code. Several such versions supporting different platforms are necessary to find the optimum version of the program for the available resources and problem size. As revealed by our study on benchmark programs, sequential code is often duplicated in these versions. This can affect code comprehensibility and re-usability of the software. In this paper, we discuss a framework named PPModel, which is designed and implemented to free programmers from these scenarios. Using PPModel, a programmer can separate parallel blocks in a program, map these blocks to various platforms, and re-execute the entire program. We provide a graphical modeling tool (PPModel) intended for Eclipse users and a Domain-Specific Language (tPPModel) for non-Eclipse users to facilitate the separation, the mapping, and the re-execution. This is illustrated with a case study from a benchmark program, which involves re-targeting a parallel block to CUDA and another parallel block to OpenMP. The modified program gave almost 5× performance gain compared to the sequential counterpart, and 1.5× gain compared to the existing OpenMP version.
... We feel that graph hierarchies are very useful for structuring large graphs. A somewhat different graph representation is used in CODE/ROPE [9, 8]. The user can specify dependencies between program components using dependency graphs. ...
... The skeleton-based approach to parallel programming is not something new. It was used in the 1980s in systems like CODE [5, 6] and FrameWorks [24, 25]. Some recent systems based on skeletons and similar techniques include CODE2 [6], Enterprise [22], HeNCE [6], PUL-TUF [28], TRAC [3] and DPnDP [26, 27]. ...
Article
One of the greatest obstacles to the mainstream adoption of parallel computing is its complexity. Over the years, various approaches have been proposed to aid parallel program developers. Most of these approaches employ a high-level model of parallel computation, thus hiding the low-level parallelism-related details from the user. Different models employ different abstraction techniques, such as communication libraries, macros, new parallel languages and abstract data types. In this paper we present a skeleton-based approach which uses frequently occurring structures for parallelism, and is a hybrid of high- and low-level models. Each skeleton is a re-usable, application-independent component providing a commonly used parallel structure. A number of such skeletons can be combined together to create the skeleton of the entire application, which can then be filled in with the application specific components. Unlike other skeleton-based approaches in the past, this work is unique in the following aspects: First, it gives a generic definition to a skeleton, with associated structural and behavioral components. The crucial behavioral components were missing in the related works of the past. Second, it gives a clear-cut and natural model to compose the individual skeletons to develop the entire parallel application. As a result, it is easy for the user to compose skeletons correctly. Third, unlike the previous approaches, the user can work at various levels of abstraction and also intermix them. For instance, the user can intermix skeletons with the lowest level of communication primitives available to him. This gives him a high degree of flexibility in developing his application. Fourth, a library-based approach, together with a generic definition of a skeleton, makes it a highly extendible approach, i.e. a new skeleton can be added to the system as per need. Some recent approaches, which intended to be extendible, were in fact hardly extendible due to the absence of a generic viewpoint of a skeleton. As a direct realization of the model, a library-based development system using object-oriented design methodologies in C++ and the standard Message-Passing Interface (MPI) has been implemented. The latter part of the paper focuses on the implementation and presents experimental results obtained on a cluster of workstations.
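The intermixing of a high-level skeleton with low-level message-passing primitives that the abstract describes might look roughly like the following C/MPI farm, where only user_work is application-specific. This is our sketch of the idea, not the paper's library API.

    #include <mpi.h>
    #include <stdio.h>

    static int user_work(int x) { return x * x; }  /* application component */

    int main(int argc, char **argv) {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        if (rank == 0) {                            /* skeleton: master side */
            for (int w = 1; w < size; w++)          /* one task per worker   */
                MPI_Send(&w, 1, MPI_INT, w, 0, MPI_COMM_WORLD);
            for (int w = 1; w < size; w++) {
                int r;
                MPI_Recv(&r, 1, MPI_INT, w, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                printf("result from %d: %d\n", w, r);
            }
        } else {                                    /* skeleton: worker side */
            int task, result;
            MPI_Recv(&task, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            result = user_work(task);
            MPI_Send(&result, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        }
        MPI_Finalize();
        return 0;
    }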
... Related Work: The general concerns which led to the design of MANIFOLD are not new. The CODE system [45, 46] ...
Article
Full-text available
Management of the communications among a set of concurrent processes arises in many applications and is a central concern in parallel computing. In this paper we introduce MANIFOLD: a co-ordination language whose sole purpose is to describe and manage complex interconnections among independent, concurrent processes. In the underlying paradigm of this language the primary concern is not with what functionality the individual processes in a parallel system provide. Instead, the emphasis is on how these processes are interconnected and how their interaction patterns change during the execution life of the system. This paper also includes an overview of our implementation of MANIFOLD. As an example of the application of MANIFOLD, we present a series of small manifold programs which describe the skeletons of some adaptive recursive algorithms that are of particular interest in computer graphics. Our concern in this paper is to show the expressiveness of MANIFOLD, the feasibility of its implementation and its usefulness in practice. Issues regarding performance and optimization are beyond the scope of this paper.
... Examples of systems that use graphical notations to express parallel computations include Fel [30], Poker [37], CODE [14], Alex [32], LGDF [6] and apE [22]. None of these systems addresses fault tolerance nor provides a programming environment in the sense of Paralex. ...
Conference Paper
Full-text available
Modern distributed systems consisting of powerful workstations and high-speed interconnection networks are an economical alternative to special-purpose supercomputers. The technical issues that need to be addressed in exploiting the parallelism inherent in a distributed system include heterogeneity, high-latency communication, fault tolerance and dynamic load balancing. Current software systems for parallel programming provide little or no automatic support towards these issues and require users to be experts in fault-tolerant distributed computing. The Paralex system is aimed at exploring the extent to which the parallel application programmer can be liberated from the complexities of distributed systems. Paralex is a complete programming environment and makes extensive use of graphics to define, edit, execute and debug parallel scientific applications. All of the necessary code for distributing the computation across a network and replicating it to achieve fault tolerance and dynamic load balancing is automatically generated by the system. In this paper we give an overview of Paralex and present our experiences with a prototype implementation.
... In these environments, a parallel program is specified as a graph with nodes containing a textual description of a sequential program. Most of the environments specify parallelism based on the large-grain dataflow model developed by Babb [2] (for example: CODE [4], DGL [11], LGDF [8], Paralex [1], PPSE [12], TDFL [19]). In these models, an application is usually defined as a dataflow graph whose nodes contain sequential modules and whose edges represent data dependencies between the modules. ...
Conference Paper
Full-text available
Workstation environments have been in use for more than a decade. Although a network of workstations represents a large amount of aggregate computing power, single users often cannot utilize these resources for their applications. Enterprise is a programming environment for designing, coding, debugging, testing, monitoring, profiling and executing programs in a distributed hardware environment. Programs written using Enterprise look like familiar sequential C code; the parallelism is expressed graphically. The system automatically inserts the code necessary to handle communication, synchronization and fault tolerance, allowing the rapid construction of correct distributed programs. Enterprise programs run on a network of computers, absorbing the idle cycles on machines. The system supports load balancing, limited process migration, and dynamic distribution of work in environments with changing resource utilization. This paper concentrates on the user's view of programming in Enterprise.
... In the past decade, many parallel pattern-based systems have been developed to employ design patterns related concepts in the HPC domain. Some of the systems based on similar ideas include Code [6], Frameworks [25], Enterprise [23], HeNCE [7], Tracs [5], DPnDP [26], and CO2P3S [4]. Unfortunately, most of these systems lack practical usability for the CB field because of the following reasons: ...
Conference Paper
Full-text available
Computational biology research is now faced with the burgeoning volume of genome data. The rigorous post-processing of this data requires an increased role for high performance computing (HPC). Because the development of HPC applications for computational biology problems is much more complex than that of the corresponding sequential applications, existing traditional programming techniques have demonstrated their inadequacy. Many high-level programming techniques, such as skeleton and pattern based programming, have therefore been designed to provide users new ways to get HPC applications without much effort. However, most of them remain absent from the mainstream practice for computational biology. In this paper, we present a new parallel pattern-based system prototype for computational biology. The underlying programming techniques are based on generic programming, a programming technique suited for the generic representation of abstract concepts. This allows the system to be built in a generic way at application level and thus provides good extensibility and flexibility. We show how this system can be used to develop HPC applications for popular computational biology algorithms and lead to significant runtime savings on distributed memory architectures.
... Data flow languages generally operate at the level of fundamental operations rather than at a functional granularity. One exception is CODE 2, which permits incorporation of sequential code into a dynamic flow graph, but restricts shared state to a special node type [7, 8]. Data flow languages also typically prohibit global state. ...
Conference Paper
Programming high-performance server applications is challenging: it is both complicated and error-prone to write the concurrent code required to deliver high performance and scalability. Server performance bottlenecks are difficult to identify and correct. Finally, it is difficult to predict server performance prior to deployment. This paper presents Flux, a language that dramatically simplifies the construction of scalable high-performance server applications. Flux lets programmers compose off-the-shelf, sequential C or C++ functions into concurrent servers. Flux programs are type-checked and guaranteed to be deadlock-free. We have built a number of servers in Flux, including a web server with PHP support, an image-rendering server, a BitTorrent peer, and a game server. These Flux servers match or exceed the performance of their counterparts written entirely in C. By tracking hot paths through a running server, Flux simplifies the identification of performance bottlenecks. The Flux compiler also automatically generates discrete event simulators that accurately predict actual server performance under load and with different hardware resources.
... In the past decade, many parallel pattern-based systems have been developed to employ design patterns related concepts in the parallel computing domain in the context of object-oriented programming techniques. Some of the systems based on similar ideas include Code [2], Frameworks [15], Enterprise [16], HeNCE [4], Tracs [3], and DPnDP [14]. However, most of these systems lack practical usability for the following reasons [1,9]: ...
Conference Paper
Full-text available
Parallel program design patterns provide users a new way to get parallel programs without much effort. However, it is always a serious limitation for most existing parallel pattern-based systems that there is no generic description for the structure and behavior of a pattern at application level. This limitation has so far greatly hindered the practical use of these systems. In this paper, we present a new parallel pattern-based system for bioinformatics. The underlying programming techniques are based on generic programming, a programming technique suited for the generic representation of abstract concepts. This allows the new system to be built in a generic way at application level. We show how this system efficiently addresses the shortcomings of existing systems and leads to significant runtime savings for some popular applications in bioinformatics on PC clusters.
... Many IEs were introduced in the last two decades. Among them are CODE [10], HeNCE [8], GRAPNEL [13], GRADE [12], and TRAPPER [11]. ...
Article
Full-text available
In a wide variety of scientific parallel applications, both task and data parallelism must be exploited to achieve the best possible performance on a multiprocessor machine. These applications induce task-graph parallelism with coarse-grain granularity. Nevertheless, using the available task-graph parallelism and combining it with data parallelism can increase the performance of parallel applications considerably since an additional degree of parallelism is exploited. The OpenMP standard supports data parallelism but does not support task-graph parallelism. In this paper we present an integration of task-graph parallelism in OpenMP by extending the parallel sections constructs to include task-index and precedence-relations matrix clauses. There are many ways in which task-graph parallelism can be supported in a programming environment. A fundamental design decision is whether the programmer has to write programs with explicit precedence relations, or if the responsibility of precedence relations generation is delegated to the compiler. One of the benefits provided by parallel programming models like OpenMP is that they liberate the programmer from dealing with the underlying details of communication and synchronization, which are cumbersome and error-prone tasks. If task-graph parallelism is to find acceptance, writing task-graph parallel programs must be no harder than writing data parallel programs, and therefore, in our design, precedence relations are described through simple programmer annotations, with implementation details handled by the system. This paper concludes with a description of several parallel application kernels that were developed to study the practical aspects of task-graph parallelism in OpenMP. The examples demonstrate that exploiting data and task parallelism in a single framework is the key to achieving good performance in a variety of applications.
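The paper's proposed task-index and precedence-relations clauses predate standardized support, so purely as a point of comparison: since OpenMP 4.0, a task graph such as a -> b, a -> c, {b, c} -> d can be expressed with standard depend clauses, as in this C sketch.

    #include <stdio.h>

    int main(void) {
        int a = 0, b = 0, c = 0, d = 0;
        #pragma omp parallel
        #pragma omp single
        {
            #pragma omp task depend(out: a)
            a = 1;
            #pragma omp task depend(in: a) depend(out: b)
            b = a + 1;
            #pragma omp task depend(in: a) depend(out: c)
            c = a + 2;
            #pragma omp task depend(in: b, c)   /* waits for both b and c */
            d = b + c;
            #pragma omp taskwait
            printf("d = %d\n", d);              /* prints d = 5 */
        }
        return 0;
    }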
... Phred is similar to several other visual parallel programming environments, namely Code [9,20], HeNCE [4,5], Paralex [1,2], and Schedule [12]. However, Phred is unique in its graph structures and its emphasis on determinacy. ...
Article
Phred is a visual parallel programming language in which programs can be statically analyzed for deterministic behavior. This paper presents the Phred language, techniques for analyzing the language, and a programming environment which supports Phred programming. There are many methods for specifying synchronization and data sharing in parallel programs. The Phred programmer uses graph constructs for describing parallelism, synchronization and data sharing. These graphs are formally described in this paper as a graph grammar. The use of graphs in Phred provides an intuitive and visual representation for parallel computations. The inadvertent specification of nondeterministic computations is a common error in parallel programming. Phred addresses the issue of determinacy by visually indicating regions of a program where nondeterminacy may exist. This analysis and its integration into a programming environment is presented here. The Phred programming environment supports the specification, analysis and execution of Phred programs. The distribution of the programming environment itself over several workstations is also described.
Chapter
We sketch the reasons for the I/O bottleneck in parallel and distributed systems, pointing out that it can be viewed as a special case of a general bottleneck that arises at all levels of the memory hierarchy. We argue that because of its severity, the I/O bottleneck deserves systematic attention at all levels of system design. We then present a survey of the issues raised by the I/O bottleneck in five key areas of parallel and distributed systems: applications, algorithms, compilers, operating systems and architecture. Finally, we address some of the trends we observe emerging in new paradigms of parallel and distributed computing: the convergence of networking and I/O, I/O for massively distributed “global information systems” such as the World Wide Web, and I/O for mobile computing and wireless communications. These considerations suggest exciting new research directions in I/O for parallel and distributed systems in the years to come.
Chapter
Parsec is a parallel programming environment whose goal is to simplify the development of multicomputer programs without, as is often the case, sacrificing performance. We have reconciled these objectives by “compiling” the structure of parallel applications into information to configure each of a small set of communication primitives on a context sensitive basis. In this chapter we show how Parsec can be used to implement a high-performance processor farm and compare Parsec and hand-optimized implementations to demonstrate that Parsec can achieve a similar level of performance. Extensive static analysis and optimization is necessary to achieve these results. We discuss both the tools which perform these tasks as well as the user interface that provides the necessary declarative structural information. Using the processor farm, we show how Parsec simplifies the task of specifying the structure of a parallel application and improves the result by supporting abstraction, reuse and scalability.
Chapter
Support for the programming of distributed computing systems has been a primary focus of distributed computing research. It has been recognized that programming a distributed system is more difficult than programming a centralized system. Many of the functions, such as task mapping, interprocess communication, remote invocation, synchronization, and reconfiguration, are very difficult to program. Tools that support parallel and distributed programming can greatly simplify such programming tasks.
Chapter
Wide area computer networks have become a basic part of today's computing infrastructure. These networks connect a variety of machines, presenting an enormous computing resource. In this project we focus on developing methods and tools which allow a programmer to tap into this resource. In this talk we describe PVM and HeNCE, tools and a methodology under development that assist a programmer in developing programs to execute on a networked group of heterogeneous machines. HeNCE is implemented on top of a system called PVM (Parallel Virtual Machine). PVM is a software package that allows the utilization of a heterogeneous network of parallel and serial computers as a single computational resource. PVM provides facilities for spawning, communication, and synchronization of processes over a network of heterogeneous machines. While PVM provides the low-level tools for implementing parallel programs, HeNCE provides the programmer with a higher-level abstraction for specifying parallelism.
Article
Full-text available
We present a framework for a high-level toolkit for solving partial differential equations. The requirements for very large and complex PDE applications such as computational dynamics and numerical relativity are examined in the framework of a modular toolkit approach based on visual programming. We address some of the principal non-numerical technical challenges: software integration, scheduling and distribution of the computation over a metacomputer. We also discuss some of the challenges found in creating run-time support systems and parallel grid generation modules for future systems.
Article
Designing parallel, distributed computations is a significant barrier to the effective use of contemporary equipment. One aspect of the barrier is the difficulty of partitioning a serial solution into a set of communicating computational subsets (e.g., processes) that can be distributed over heterogeneous processors in a distributed hardware environment. The Parallel Distributed computation Graph Model (ParaDiGM) and the VISual Assistant (VISA) have been designed to assist with the partitioning problem. The formal model is composed of two components: a micro model focuses on the functionality of the computation, while a consistent macro model explicitly represents the partition and the communication mechanisms. ParaDiGM encourages the designer to address functionality and partitioning in different submodels, maintaining a mapping between elements in the two submodels. ParaDiGM is formal, but has an intuitive visual presentation; its use is supported by the VISual Assistant (VISA), a tool for designing, animating, simulating, and prototyping distributed computations. This note informally describes ParaDiGM and VISA, then illustrates how they can be used to assist with the design of parallel, distributed computations.
Article
Current approaches to software engineering practice for parallel systems are reviewed. The parallel software designer has not only to address the issues involved in the characterization of the application domain and the underlying hardware platform, but, in many instances, the production of portable, scalable software is desirable. In order to accommodate these requirements, a number of specific techniques and tools have been proposed, and these are discussed in this review in the framework of the parallel software life-cycle. The paper outlines the role of formal methods in the practical production of parallel software, but its main focus is the emergence of development methodologies and environments. These include CASE tools and run-time support systems, as well as the use of methods taken from experience of conventional software development. Because of the particular emphasis on performance of parallel systems, work on performance evaluation and monitoring systems is considered.
Article
A concurrent software application, whether running on a single machine or distributed across multiple machines, is composed of tasks that interact (communicate and synchronize) in order to achieve some goal. Developing such concurrent programs so they cooperate effectively is a complex task, requiring that programmers craft their modules (the components from which concurrent applications are built) to meet both functional requirements and communication requirements. Unfortunately the result of this effort is a module that is difficult to reason about and even more difficult to reuse. Making programmers treat too many diverse issues simultaneously leads to increased development costs and opportunities for error. This suggests the need for ways that a developer may specify control requirements separately from the implementation of functional requirements, but then have this information used automatically when building the component executables. The result is an environment where programmers have increased flexibility in composing software modules into concurrent applications, and in reusing those same modules. This paper describes our research toward a technology for control integration, where we have developed techniques for users to express control objectives for an application and a system that translates those specifications for use in packaging executables.
Article
In this expository overview, I briefly review the basics of computer architecture as they relate to parallel computers. Distributed-memory multiprocessor systems are emphasized. I cover methods to parallelize some fundamental types of ecological simulation models: foodweb models, individual-based population models, population models based on partial differential equations, and individual movement models. Recent developments in parallel operating systems and programming tools on multiprocessors are reviewed. Because of complex relationships between parallel computer architecture and efficient algorithms, I conclude that ecological modelers will need to become more acquainted with hardware than previously.
Article
We report on an experience of implementing process farms on distributed systems. Rather than focusing on applications, we analyse in detail the techniques we have used for implementing the corresponding support mechanisms. They are actually part of a more general framework that can be easily extended to include other parallel programming paradigms. We try to substantiate the claim that our highly modular structuring may constitute both a practical and powerful approach to several problems of distributed programming support.
Article
The Parallel Evaluation and Experimentation Platform (PEEP) is the result of an effort at Rome Laboratory to identify the most promising general-purpose software development tools, techniques and approaches from industry and academia for programming high performance parallel computers to meet the needs of Command and Control (C2) applications. The PEEP is a prototype platform for evaluating the applicability of results from parallel programming research efforts to improve the productivity of designers and developers. Intermetrics conducted a study of available innovative tools and techniques beginning in early 1990. From the survey, Intermetrics chose candidates for inclusion on a prototype platform, and began to install and evaluate the chosen components. With the prototype PEEP, a number of case studies were conducted to develop small parallel programs using the selected tools. The purpose of these case studies was not to advance the state of the art in parallel algorithms, but to exercise the tools collected for the prototype PEEP. This work identified requirements on architectures, life cycle activities and technologies to support parallel development and developed a long range plan for the PEEP. The conclusions from these case studies also suggest useful methodologies for developing parallel software, and have led to recommendations based on the performance of the current tools and on the projected needs of parallel software development.
Article
A software development framework for parallel processing systems based on the parallel object-oriented functional computation model PROOF is evaluated. PROOF/L, a C++ based programming language with additional parallel constructs required by PROOF, is extended to include an array data type and input/output features to make PROOF/L easier to use in developing software for parallel processing systems. The front-end translator from PROOF/L to the intermediate form IF1, and the back-end translators from IF1 to the C language on two different MIMD parallel machines, nCube and KSR, are developed. Our framework is evaluated by comparing it with existing software development approaches for parallel processing systems. Our framework is suitable for large-scale parallel software development because it supports the concepts of hierarchical design and shared data, and frees the software developer from considering explicit synchronization, communication, and parallelism. The software development efforts using our framework can be greatly reduced due to implicit synchronization and communication and the compactness of PROOF/L programs. The extension of PROOF/L and the integration of PROOF/L with other programming languages to utilize existing library functions written in languages such as C and FORTRAN are also discussed.
Article
Full-text available
Complete application tasks, of the type that would be of interest to Rome Laboratory, are large and complex. One approach to dealing with them is heterogeneous computing. Two types of heterogeneous computing systems are: (1) mixed-mode, wherein multiple types of parallelism are available on a single machine; and (2) mixed-machine, wherein a suite of different high-performance computers is connected by high-speed links. In this effort, we studied ways to decompose an application into subtasks and then match each subtask to the mode or machine, which results in the smallest total task execution time. Our accomplishments include: (1) conducting a mixed-mode case study; (2) developing an approach for automatically decomposing a task for mixed-mode execution, and assigning modes to subtasks; (3) extending this approach for use as an heuristic for a particular class of mixed-machine heterogeneous computing systems; (4) surveying the state-of-the-art of heterogeneous computing, and constructing a conceptual framework for automatic mixed-machine heterogeneous computing; (5) examining how to estimate non-deterministic execution of subtasks and complete tasks; and (6) devising an optimal scheme for inter-machine data transfers for a given matching of subtasks to machines.
Article
Today's supercomputers and parallel computers provide an unprecedented amount of computational power in one machine. A basic understanding of the parallel computing techniques that assist in the capture and utilization of that computational power is essential to appreciate the capabilities and the limitations of parallel supercomputers. In addition, an understanding of technical vocabulary is critical in order to converse about parallel computers. The relevant techniques, vocabulary, currently available hardware architectures, and programming languages which provide the basic concepts of parallel computing are introduced in this document. This document updates the document entitled Introduction to Parallel Supercomputing, M88-42, October 1988. It includes a new section on languages for parallel computers, updates the hardware-related sections, and includes current references.
Article
Parsec is a parallel programming environment whose goal is to simplify the development of multicomputer programs without, as is often the case, sacrificing performance. We have reconciled these objectives by "compiling" the structure of parallel applications into information to configure each of a small set of communication primitives on a context-sensitive basis. In this paper, we show how Parsec can be used to implement a high-performance processor farm and compare Parsec and hand-optimized implementations to demonstrate that Parsec can achieve a similar level of performance. Extensive static analysis and optimization is necessary to achieve these results. We discuss both the tools which perform these tasks as well as the user interface that provides the necessary declarative structural information. Using the processor farm, we show how Parsec simplifies the task of specifying the structure of a parallel application and improves the result by supporting abstraction, reuse and scalability.
Article
Full-text available
In this paper we present and discuss a real experience of reusing sequential software in a parallel and physically distributed computing environment. Specifically, we have combined the functionalities of two existing systems previously developed at our Department. One, Tracs, is a programming environment for networked, heterogeneous machines that, among other things, is able to generate process farms out of pure sequential code. The other, SPACE, is a graphical tool that generates sequential Fortran programs for simulating digital transmission systems. We have implemented a tool that restructures SPACE-generated programs to let them match the input required by the Tracs process farm generator. The result is that users of SPACE can transparently take advantage of networked and heterogeneous workstations to run their simulations. We have tackled the problems arising from both parallelism and distribution. The techniques we have used can be easily applied to any problem that can be modelled according to the process farm paradigm. Moreover, our experience shows that the Tracs framework may constitute a sound basis for facilitating engineering efforts on the reuse of sequential software in distributed environments.
Chapter
Performance engineering of parallel and distributed applications is a complex task that iterates through various phases, ranging from modeling and prediction, to performance measurement, experiment management, data collection, and bottleneck analysis. There is no evidence so far that all of these phases should/can be integrated in a single monolithic tool. Moreover, the emergence of Cloud computing as well as established Grid infrastructures as a wide-area platform for high-performance computing raises the idea to provide tools as interacting Web services that share resources, support interoperability among different users and tools, and most important provide omni-present services over Grid or Cloud infrastructures.
Chapter
Full-text available
The most visible facet of the Computationally-Oriented Display Environment (CODE) is its graphical interface. However, the most important fact about CODE is that it is a programming system based on a formal unified computation graph model of parallel computation which was intended for actual program development. Most previous programming systems based on formal models of computation have been intended primarily to serve as specification systems. This paper focuses on the interaction between the development of the formal model of parallel computation and the development of a practical programming environment. Basing CODE on a formal model of parallel computation was integral to attaining the initial project goals: raising the level of abstraction at which parallel program structure is represented and achieving architectural independence. It also led to other significant research directions, such as a calculus of composition for parallel programs, and has suggested other directions of research in parallel programming that we have not yet had the opportunity to pursue. We hope this experience with the interaction of the theoretical and the practical may be of interest and benefit to other designers and developers of parallel programming systems.
Chapter
The ParaGraph graph editor is a tool for specifying the graphical structure of parallel algorithms. Based on an extended formalism of Aggregate Rewriting Graph Grammars, it is an improvement on existing techniques for describing the families of regular, scalable communication graphs. We expect that ParaGraph will prove useful as a testbed for new techniques for describing, visualizing and analyzing the structure of very large graphs. This work describes ongoing formal (and practical) efforts to make ParaGraph an effective tool for specifying massive parallelism.
Conference Paper
Design patterns make it easier to reuse successful designs and architectures. Expressing proven techniques as design patterns makes them more accessible to developers of new systems and helps a designer arrive at a design faster. In the sequential and object-oriented programming domains, design patterns have played a very important role, but in parallel programming they have seen little application. We propose a design-pattern based parallel programming model and implement a parallel programming environment on the SMP platform to help developers build their parallel application systems efficiently. The experimental results show our system is effective and competent.
Article
The Task-Level Dataflow Language is a graphical language for architecture-independent parallel programming and is intended for the writing of new programs and the adaptation of existing ones. It is the first coarse-grained dataflow language that supports dynamic modification of program graphs. It provides a systematic use of program constructs to support particular programming styles, such as nondeterminism, iteration, and replication. It has been used successfully in a course on parallel programming.
Conference Paper
We have shown how graphical languages such as CODE/ROPE and PPSE can be used to design SIMD or data parallel programs. The advantages of this approach are machine independence, design clarity, automated program analysis, and accelerated software development. The disadvantages are that many problems remain in this approach, such as how to reduce design clutter, how to automate optimal processor mapping, and how to perform message-passing optimization. More work is needed before this approach can be used in the design of large-scale applications, but we believe the approach is promising. The main contribution is to show how the simple ideas of stencils, stream generators, and replicators can be used effectively to extend the classical dataflow design paradigm into the data parallel design paradigm. While more work is needed, these three ideas lead to greater expressiveness of design. The PPSE toolset currently does much of the mapping, scheduling, and automatic code generation described in this paper. However, PPSE does not currently handle data parallel programming. Work is progressing toward a full implementation of these ideas. Even so, PPSE has been invaluable for performing a variety of what-if analyses on parallel programs. Insights have been gained with this approach that would not be possible with a purely textual representation of the parallel program.
Conference Paper
This paper describes the use of the UNITY [6] notation and the UC compiler [2] in the design of parallel programs for execution on the Connection Machine CM2 (CM). We illustrate our ideas in the context of a computer simulation of particle diffusion and aggregation in porous media. We begin with a UNITY specification, progressively refine the specification to derive a UNITY program, translate the program to UC abstractions, which may be further refined to improve efficiency, and finally implement the program on the CM. Performance results on the efficiency of the program constructed using this approach are also included.
Conference Paper
Full-text available
An architecture-independent software development approach for parallel processing systems is presented. This approach is based on the parallel object-oriented and functional computation model PROOF and separates the architecture-dependent issues from software development. It also facilitates software development for any parallel processing system by relieving the programmers from the consideration of processor topology and various parallelization aspects of the software. Our approach allows the exploitation of parallelism at both levels of granularity: object level and method level, thereby making our approach effective for software development for various MIMD computers. Software developed using our approach reflects the parallel structure of the problem space, which makes the software more understandable and modifiable. A framework consisting of object-oriented analysis, object-design, coding and transformation phases is presented for software development for parallel processing systems. An example is given to illustrate this approach.
Conference Paper
Full-text available
Programming languages that can utilize the underlying parallel architecture in shared memory, distributed memory or Graphics Processing Units (GPUs) are used extensively for solving scientific problems. However, from our observation of studying multiple parallel programs from various domains, such programming languages have a substantial amount of sequential code mixed with the parallel code. When rewriting the parallel code for another platform, the same sequential code is often reused without much modification. Although this is a common occurrence, existing tools and programming environments do not offer much support for this process. In this paper, we introduce a tool named PPmodel, which was designed and implemented to assist programmers in separating the core computation from the details of a specific parallel architecture. Using PPmodel, a programmer can identify and retarget the parallel section of a program to execute in a different platform. With PPmodel, a programmer is better enabled to focus on the parallel section of interest, while ignoring other parallel and sequential sections in a program. The tool is explained by example execution of the parallel section of an OpenMP program for the circuit satisfiability problem in a cluster using the Message Passing Interface (MPI).
Conference Paper
Application of pattern-based approaches to parallel programming is an active area of research today. The main objective of pattern-based approaches to parallel programming is to facilitate the reuse of frequently occurring structures for parallelism whereby a user supplies mostly the application specific code-components and the programming environment generates most of the code for parallelization. Parallel Architectural Skeleton (PAS) is such a pattern-based parallel programming model and environment. The PAS model provides a generic way of describing the architectural/structural aspects of patterns in message-passing parallel computing. Application development using PAS is hierarchical, similar to conventional parallel programming using MPI, however with the added benefit of reusability and high-level patterns. Like most other pattern-based parallel programming models, the benefits of PAS were offset by some of its drawbacks such as difficulty in: (1) extending PAS and (2) skeleton composition. SuperPAS is an extension of PAS that addresses these issues. SuperPAS provides a skeleton description language for the generic PAS. Using SuperPAS, a skeleton developer can extend PAS by adding new skeletons to the repository (i.e., extensibility). SuperPAS also makes the PAS system more flexible by defining composition of skeletons. In this paper, we describe SuperPAS and elaborate its use through examples.
Article
This paper describes Parallel Proto (PProto), an integrated environment for constructing prototypes of parallel programs. Using functional and performance modeling of dataflow specifications, PProto assists in analysis of high-level software and hardware architectural tradeoffs. Facilities provided by PProto include a visual language and an editor for describing hierarchical dataflow graphs, a resource modeling tool for creating parallel architectures, mechanisms for mapping software components to hardware components, an interactive simulator for prototype interpretation, and a reuse capability. The simulator contains components for instrumenting, animating, debugging, and displaying results of functional and performance models. The PProto environment is built on top of a substrate for managing user interfaces and database objects to provide consistent views of design objects across system tools.
Article
Performance orientation in the development process of parallel software is motivated by outlining the misconception of current approaches, where performance activities come in only at the very end of development, mainly as measurement or monitoring after the implementation phase. At that time, the major part of the development work is already done, and performance pitfalls are very hard to repair, if they can be repaired at all. A development process for parallel programs that launches performance ...
Article
Full-text available
A simple program that approximates π by numerical quadrature is rewritten to run on nine commercially available processors to illustrate the complications that arise in parallel programming in FORTRAN. The machines used are the Alliant FX/8, BBN Butterfly, Cray X-MP/48, ELXSI 6400, Encore Multimax, Flex/32, IBM 3090/VF, Intel iPSC, and Sequent Balance. Some general impediments to using parallel processors for production work are identified.
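The underlying kernel is small; a hedged modern rendering in C++ with OpenMP (the paper's versions were FORTRAN) approximates π = ∫₀¹ 4/(1+x²) dx by the midpoint rule:

    #include <cstdio>

    int main() {
        const long n = 10000000;       // number of subintervals
        const double h = 1.0 / n;
        double sum = 0.0;

        // Each iteration is independent; the reduction combines partial sums.
        #pragma omp parallel for reduction(+:sum)
        for (long i = 0; i < n; ++i) {
            double x = (i + 0.5) * h;  // midpoint of subinterval i
            sum += 4.0 / (1.0 + x * x);
        }
        std::printf("pi ~= %.12f\n", h * sum);
    }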
Article
The work of Adams, Karp and Miller, Luconi, and Rodriguez on formal models for parallel computations and computer systems is reviewed. A general definition of a parallel schema is given so that the similarities and differences of the models can be discussed. Primary emphasis is on the control structures used to achieve parallel operation and on properties of the models such as determinacy and equivalence. Decidable and undecidable properties are summarized.
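Determinacy, the central property above, can be seen in a few lines (a hypothetical illustration, not one of the reviewed models): two unordered, conflicting writes make the result depend on the interleaving, which is exactly what a determinate schema rules out.

    #include <atomic>
    #include <iostream>
    #include <thread>

    int main() {
        std::atomic<int> x{0};
        std::thread a([&] { x.store(1); });  // writer 1
        std::thread b([&] { x.store(2); });  // writer 2, conflicts with 1
        a.join();
        b.join();
        // Prints 1 or 2 depending on which store happened last:
        // the computation is not determinate.
        std::cout << x.load() << '\n';
    }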
Article
In this paper we briefly describe and compare a number of theoretical models for parallel computation; namely, Petri nets, computation graphs, and parallel program schemata. We discuss various problems and properties of parallel computation that can be studied within these formulations and indicate the ties between these properties and the more practical aspects of parallel computation. We show how marked graphs, a particular type of Petri net, are a restricted type of computation graph and indicate how some results of marked graphs can be obtained from known results of computation graphs. Also, for schemata we discuss the decidability versus undecidability of various properties and several techniques of schemata composition.
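A marked graph's firing rule is simple enough to state as code; the sketch below (hypothetical data structures, not from the paper) lets a node fire when every input edge holds a token, moving one token from each input edge to each output edge.

    #include <cstdio>
    #include <vector>

    struct Edge { int tokens; };
    struct Node { std::vector<int> in, out; };  // indices into the edge list

    // A node is enabled when every incoming edge carries at least one token.
    bool enabled(const Node& n, const std::vector<Edge>& e) {
        for (int i : n.in) if (e[i].tokens == 0) return false;
        return true;
    }

    // Firing consumes one token per input edge and produces one per output edge.
    void fire(const Node& n, std::vector<Edge>& e) {
        for (int i : n.in)  --e[i].tokens;
        for (int i : n.out) ++e[i].tokens;
    }

    int main() {
        // Two-node cycle: edge 0 feeds node B from A, edge 1 feeds A from B.
        std::vector<Edge> edges{{1}, {0}};    // one token, on edge 0
        std::vector<Node> nodes{{{1}, {0}},   // A: consumes e1, produces e0
                                {{0}, {1}}};  // B: consumes e0, produces e1
        for (int step = 0; step < 4; ++step)
            for (auto& n : nodes)
                if (enabled(n, edges)) fire(n, edges);
        std::printf("tokens: e0=%d e1=%d\n", edges[0].tokens, edges[1].tokens);
    }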
Article
The emergence of commercially produced parallel computers has greatly increased the problem of producing transportable mathematical software. Exploiting these new parallel capabilities has led to extensions of existing languages such as FORTRAN and to proposals for entirely new parallel languages. We present an attempt at a short-term solution to the transportability problem, motivated by the desire to extend capabilities beyond loop-based parallelism and to provide a convenient, machine-independent user interface. A package called SCHEDULE is described which provides a standard user interface to several shared-memory parallel machines. A user writes standard FORTRAN code and calls SCHEDULE routines that express and enforce the large-grain data dependencies of a parallel algorithm. Machine dependencies are internal to SCHEDULE and change from one machine to another, but the user's code remains essentially the same across all such machines. The semantics and usage of SCHEDULE are described, and several examples of parallel algorithms implemented using SCHEDULE are presented.
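The flavor of large-grain dependency programming can be suggested with standard C++ futures (a stand-in for the idea only, not SCHEDULE's actual FORTRAN interface): the dependences (A, C) and (B, C) are enforced by waiting on A's and B's results before launching C, which is the role SCHEDULE's dependency declarations play.

    #include <future>
    #include <iostream>

    int main() {
        auto a = std::async(std::launch::async, [] { return 2; });  // task A
        auto b = std::async(std::launch::async, [] { return 3; });  // task B

        // Enforce the large-grain dependences (A, C) and (B, C): C cannot
        // start until both of its inputs are available.
        int av = a.get(), bv = b.get();
        auto c = std::async(std::launch::async, [=] { return av * bv; });
        std::cout << c.get() << '\n';  // prints 6
    }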
Conference Paper
PFG (parallel flow graphs), a language for expressing concurrent, time-dependent computations, is described. PFG is rich enough to express many of the common concurrent control structures found in parallel languages, as well as some less common ones. Each syntactic structure in PFG has a direct translation into a portion of a time Petri net model. The net created by legally combining PFG structures is guaranteed to be well-formed, in the sense that each Petri net is in the free-choice class and has a clear interpretation in terms of a hardware/software system. Several techniques have been defined which allow the model produced from a PFG program to be analyzed for concurrency properties, such as deadlock freedom and proper mutual exclusion on shared data structures.
Article
The Navier-Stokes computer is a high-performance, reconfigurable, pipelined machine designed to solve large computational fluid dynamics problems. Due to the complexity of the architecture, development of effective, high-level language compilers for the system appears to be a very difficult task. Consequently, a visual programming methodology has been developed which allows users to program the system at an architectural level by constructing diagrams of the pipeline configuration. These schematic program representations can then be checked for validity and automatically translated into machine code. The visual environment is illustrated by using a prototype graphical editor to program an example problem.
Conference Paper
The goals for the Computation Oriented Display Environment (CODE) are to provide representational power sufficient for facile expression of a wide class of parallel algorithms, to permit compilation to reasonably efficient programs on a wide spectrum of parallel execution environments, and to provide a hierarchical approach to the development of parallel programs. CODE is based on a formally specified model of parallel computation which covers most conventional MIMD models and is formulated at a higher level of abstraction than conventional MIMD shared-name-space and partitioned-name-space models. The conceptual foundation of CODE, in particular basing the language on an abstract model of parallel computation, has led to two significant capabilities that had not been anticipated: a calculus of composition, which may be exploitable for automated or semiautomated program construction, and a natural basis for highly effective component reuse.
A Comparison of 12 Parallel Fortran Dialects
  • A.H. Karp
  • R.G. Babb
A.H. Karp and R.G. Babb, "A Comparison of 12 Parallel Fortran Dialects," IEEE Software, Sept. 1988, pp. 52-68.
Programming with CODE: A Computation-Oriented Display Environment
  • J.C. Browne
J.C. Browne, M. Azam, and C.L. Lin, "Programming with CODE: A Computation-Oriented Display Environment," tech. report, Computer Sciences Dept., Univ. of Texas, Austin, Texas, 1988.
A Survey of Models for Parallel Computing
  • T.H. Bredt
T.H. Bredt, "A Survey of Models for Parallel Computing," Tech. Report 8, Digital Systems Lab.
A Comparison of Models of Parallel Computation
  • J.L. Peterson
  • T.H. Bredt
J.L. Peterson and T.H. Bredt, "A Comparison of Models of Parallel Computation," Proc., 1974.
ROPE User's Manual: A Reusability-Oriented Parallel Programming Environment
  • T.J. Lee
  • C.L. Lin
A Visual Programming Environment for the Navier-Stokes Computer
  • S. Tomboulian
  • T.W. Crockett
  • D. Middleton
A Constructive Unified Model of Parallel Computation
  • S.M. Sobek