Article

Implementing distributed process farms


Abstract

We report on our experience implementing process farms on distributed systems. Rather than focusing on applications, we analyse in detail the techniques we used to implement the corresponding support mechanisms. These mechanisms are part of a more general framework that can easily be extended to include other parallel programming paradigms. We try to substantiate the claim that our highly modular structuring can be both a practical and a powerful approach to several problems of distributed programming support.
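As a rough illustration of the paradigm the paper targets, the following sketch (in Python, with a placeholder task and a local worker pool; the paper's actual distributed support mechanisms are not reproduced) shows the basic shape of a process farm: a master distributes independent work items to a fixed set of workers and collects their results.

# Minimal process-farm sketch: local processes stand in for remote farm workers.
import multiprocessing as mp

def worker(tasks: mp.Queue, results: mp.Queue) -> None:
    """Repeatedly take a work item from the farm, process it, send the result back."""
    while True:
        item = tasks.get()
        if item is None:          # sentinel: no more work
            break
        results.put((item, item * item))   # placeholder computation

def run_farm(data, n_workers=4):
    tasks, results = mp.Queue(), mp.Queue()
    procs = [mp.Process(target=worker, args=(tasks, results)) for _ in range(n_workers)]
    for p in procs:
        p.start()
    for item in data:             # master distributes the work items
        tasks.put(item)
    for _ in procs:               # one sentinel per worker
        tasks.put(None)
    collected = [results.get() for _ in data]
    for p in procs:
        p.join()
    return dict(collected)

if __name__ == "__main__":
    print(run_farm(range(10)))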


Article
The growing complexity and performance of modern digital communication links call for enhanced efficiency of the computer simulation software packages that are routinely used as particularly effective and flexible analysis and design tools. The typical computing environment in telecommunication design laboratories is a cluster of heterogeneous workstations connected by a local area network (LAN). In this paper, we describe how a pre-existing tool for the simulation of transmission systems is parallelized so that the distributed computing power is exploited as fully as possible. The experiment was carried out with the aid of TRACS, a programming environment for networked, heterogeneous machines, and led to the development of a ‘parallel execution’ feature for simulation runs. The feature is activated by the end user and requires no further modification of the description of the transmission system under consideration. A few results are reported on the performance gain of the parallel version over standard serial execution.
Article
In parallel programming, communication patterns are rarely arbitrary and unstructured. Instead, parallel applications tend to employ predetermined patterns of communication between their components. If the most commonly used patterns - such as pipelines, farms and trees - are identified (both in terms of their components and their communication), an environment can make them available as high-level abstractions for use in writing applications. This can yield a structured approach to parallel programming. The paper shows how this structured approach can be accommodated within an object-oriented language. A class library provides the most commonly used patterns and programmers can exploit inheritance to define new patterns. Several examples illustrate the approach and show that it can be efficiently implemented.
Keywords: Structured Parallelism, Object-Orientation, Communication Patterns. This work has been partially supported by the 'Progetto Finalizzato Sistemi Informatici e Calcolo P...
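The sketch below illustrates the idea of offering patterns as a class library, with invented class names (Pattern, Farm, Pipeline) and a local thread pool standing in for the paper's actual object-oriented runtime; new patterns, or compositions of existing ones, are obtained by subclassing and aggregation.

# Hypothetical pattern library: not the paper's class hierarchy, only the general idea.
from concurrent.futures import ThreadPoolExecutor

class Pattern:
    """Base class: every pattern maps a stream of inputs to a stream of outputs."""
    def run(self, inputs):
        raise NotImplementedError

class Farm(Pattern):
    """Apply one worker function to every input, in parallel."""
    def __init__(self, worker, n_workers=4):
        self.worker, self.n_workers = worker, n_workers
    def run(self, inputs):
        with ThreadPoolExecutor(self.n_workers) as pool:
            return list(pool.map(self.worker, inputs))

class Pipeline(Pattern):
    """Feed the output of each stage into the next; stages may themselves be patterns."""
    def __init__(self, *stages):
        self.stages = stages
    def run(self, inputs):
        for stage in self.stages:
            inputs = stage.run(inputs) if isinstance(stage, Pattern) else list(map(stage, inputs))
        return inputs

# Composition: a pipeline whose middle stage is a farm.
app = Pipeline(lambda x: x + 1, Farm(lambda x: x * x), lambda x: -x)
print(app.run(range(5)))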
Conference Paper
Paralex is a programming environment that allows parallel programs to be developed and executed on distributed systems as if the latter were uniform parallel multiprocessor computers. Architectural heterogeneity, remote communication and failures are rendered transparent to the programmer through automatic system support. The authors address the problems of initial mapping and dynamic alteration of the association between parallel computation components and distributed hosts. Results include novel heuristics and mechanisms to resolve these problems despite the complexities introduced by architectural heterogeneity and fault tolerance.
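As a hedged illustration of the mapping problem Paralex addresses (not the paper's own heuristics), the sketch below greedily assigns computation components to heterogeneous hosts by always placing the next-heaviest component on the host that would finish it earliest; the component costs and host speeds are invented.

# Generic greedy placement heuristic, for illustration only.
def greedy_map(components, hosts):
    """components: {name: relative cost}; hosts: {name: relative speed}."""
    finish = {h: 0.0 for h in hosts}           # projected completion time per host
    placement = {}
    for comp, cost in sorted(components.items(), key=lambda kv: -kv[1]):
        best = min(hosts, key=lambda h: finish[h] + cost / hosts[h])
        placement[comp] = best
        finish[best] += cost / hosts[best]
    return placement, finish

placement, load = greedy_map(
    components={"fft": 8.0, "filter": 5.0, "stats": 2.0, "render": 6.0},
    hosts={"sparc1": 1.0, "alpha1": 2.5, "rs6000": 1.8},
)
print(placement, load)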
Article
Enterprise is a programming environment for designing, coding, debugging, testing, monitoring, profiling, and executing programs for distributed hardware. Developers using Enterprise do not deal with low-level programming details such as marshalling data, sending/receiving messages, and synchronization. Instead, they write their programs in C, augmented by new semantics that allow procedure calls to be executed in parallel. Enterprise automatically inserts the necessary code for communication and synchronization. However, Enterprise does not choose the type of parallelism to apply. The developer is often the best judge of how parallelism can be exploited in a particular application, so Enterprise lets the programmer draw a diagram of the parallelism using a familiar analogy that is inherently parallel: a business organization, or enterprise, which divides large tasks into smaller tasks and allocates assets to perform those tasks. These assets correspond to techniques used in most large-grained parallel programs: pipelines, master/slave processes, divide-and-conquer, and so on. The number and kinds of assets used determine the amount of parallelism.
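A minimal analogue of this idea, using Python futures in place of the code Enterprise inserts automatically: calls are issued like ordinary procedure calls and the caller blocks only when it consumes a result. The procedure and its workload are placeholders, not taken from the paper.

# Futures as a stand-in for Enterprise's automatically parallelized procedure calls.
from concurrent.futures import ProcessPoolExecutor

def price_asset(asset_id: int) -> float:
    """Stand-in for an expensive, sequential C procedure."""
    return sum(i * 0.0001 for i in range(100_000)) + asset_id

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        # Calls look like normal procedure calls but return immediately...
        futures = [pool.submit(price_asset, i) for i in range(8)]
        # ...synchronization happens only here, when the results are consumed.
        total = sum(f.result() for f in futures)
    print(total)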
Article
The Conic environment provides a language-based approach to the building of distributed systems which combines the simplicity and safety of a language approach with the flexibility and accessibility of an operating systems approach. It provides a comprehensive set of tools for program compilation, configuration, debugging, and execution in a distributed environment. A separate configuration language is used to specify the configuration of software components into logical nodes. This provides a concise configuration description and facilitates the reuse of program components in different configurations. Applications are constructed as sets of one or more interconnected logical nodes. Arbitrary, incremental change is supported by dynamic configuration. In addition, the system provides user-transparent datatype transformation between heterogeneous processors. Applications may be run on a mixed set of interconnected computers running the Unix operating system and on bare target machines with no resident operating system. The basic principles adopted in the construction of the Conic environment are outlined and the configuration and run-time facilities provided are described.
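The sketch below illustrates the general idea of a configuration description kept separate from component code: component instances, their grouping into logical nodes, and links between ports are stated declaratively and can be changed without touching the components. The syntax and names are invented for illustration and are not Conic's configuration language.

# Hypothetical declarative configuration plus a minimal consistency check.
configuration = {
    "instances": {                      # component type per instance
        "sensor1": "Sensor",
        "sensor2": "Sensor",
        "logger":  "Logger",
    },
    "nodes": {                          # grouping of instances into logical nodes
        "field_node":   ["sensor1", "sensor2"],
        "control_node": ["logger"],
    },
    "links": [                          # output port -> input port
        ("sensor1.readings", "logger.input"),
        ("sensor2.readings", "logger.input"),
    ],
}

def check(config):
    """Every linked instance must exist and be placed on some logical node."""
    placed = {i for members in config["nodes"].values() for i in members}
    for src, dst in config["links"]:
        for end in (src, dst):
            inst = end.split(".")[0]
            assert inst in config["instances"] and inst in placed, f"unplaced instance: {inst}"
    return True

print(check(configuration))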
Chapter
The paper reviews the problems inhibiting the widespread use of parallel processing by both industry and software houses. The two key issues of portability of code and of generality of parallel architectures are discussed. An overview of useful computational models and programming paradigms for parallel machines is presented along with some detailed case studies implemented on transputer arrays. Valiant's results on optimally universal parallel machines are reviewed along with the prospects of building truly general-purpose parallel computers. Some remarks on language and software tool developments for parallel programming form the conclusion to the paper.
Conference Paper
This paper describes an X-window based software environment called HeNCE (Heterogeneous Network Computing Environment) designed to assist scientists in developing parallel programs that run on a network of computers. HeNCE is built on top of a software package called PVM which supports process management and communication between a network of heterogeneous computers. HeNCE is based on a parallel programming paradigm where an application program can be described by a graph. Nodes of the graph represent subroutines and the arcs represent data dependencies. HeNCE is composed of integrated graphical tools for creating, compiling, executing, and analyzing HeNCE programs.
1 Introduction Wide area computer networks have become a basic part of today's computing infrastructure. These networks connect a variety of machines, presenting an enormous computing resource. In this project we focus on developing methods and tools which allow a programmer to tap into this resource. In this paper we des...
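A small sketch of the graph model described above, with invented node functions: nodes stand for subroutines, arcs for data dependencies, and a node may run as soon as all of its predecessors have produced results. Real HeNCE distributes the nodes over networked machines on top of PVM; this version only runs them in local threads.

# Dependency-graph execution: run each node when all of its inputs are ready.
from concurrent.futures import ThreadPoolExecutor

def run_graph(nodes, deps, max_workers=4):
    """nodes: {name: f(dict of dependency results)}; deps: {name: [dependency names]}."""
    results, remaining = {}, dict(deps)
    with ThreadPoolExecutor(max_workers) as pool:
        while remaining:
            ready = [n for n, d in remaining.items() if all(p in results for p in d)]
            assert ready, "cyclic or unsatisfied dependencies"
            futures = {n: pool.submit(nodes[n], {p: results[p] for p in remaining[n]}) for n in ready}
            for n, f in futures.items():
                results[n] = f.result()
                del remaining[n]
    return results

nodes = {
    "load":   lambda deps: list(range(10)),
    "square": lambda deps: [x * x for x in deps["load"]],
    "cube":   lambda deps: [x ** 3 for x in deps["load"]],
    "merge":  lambda deps: sum(deps["square"]) + sum(deps["cube"]),
}
deps = {"load": [], "square": ["load"], "cube": ["load"], "merge": ["square", "cube"]}
print(run_graph(nodes, deps)["merge"])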
Article
The PVM system is a programming environment for the development and execution of large concurrent or parallel applications that consist of many interacting, but relatively independent, components. It is intended to operate on a collection of heterogeneous computing elements interconnected by one or more networks. The participating processors may be scalar machines, multiprocessors, or special-purpose computers, enabling application components to execute on the architecture most appropriate to the algorithm. PVM provides a straightforward and general interface that permits the description of various types of algorithms (and their interactions), while the underlying infrastructure permits the execution of applications on a virtual computing environment that supports multiple parallel computation models. PVM contains facilities for concurrent, sequential or conditional execution of application components, is portable to a variety of architectures, and supports certain forms of error detection and recovery.
Article
Distributed computations are concurrent programs in which processes communicate by message passing. Such programs typically execute on network architectures such as networks of workstations or distributed memory parallel machines (i.e., multicomputers such as hypercubes). Several paradigms - examples or models - for process interaction in distributed computations are described. These include networks of filters, clients and servers, heartbeat algorithms, probe/echo algorithms, broadcast algorithms, token-passing algorithms, decentralized servers, and bags of tasks. These paradigms are applicable to numerous practical problems. They are illustrated by solving problems including parallel sorting, file servers, computing the topology of a network, distributed termination detection, replicated databases, and parallel adaptive quadrature. Solutions to all problems are derived in a step-wise fashion from a general specification of the problem to a concrete solution. The derivations illustrate techniques for developing distributed algorithms.
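One of the paradigms listed, the bag of tasks, applied to parallel adaptive quadrature: workers repeatedly take an interval from a shared bag and either accept its estimate or split it and return the two halves to the bag. The thread-based sketch below is a simplification of the message-passing derivations in the paper.

# Bag-of-tasks adaptive quadrature: workers both consume and generate tasks.
import math, queue, threading

def quad_bag(f, a, b, tol=1e-8, n_workers=4):
    bag = queue.Queue()
    bag.put((a, b))
    total, lock = [0.0], threading.Lock()

    def trapezoid(lo, hi):
        return (f(lo) + f(hi)) * (hi - lo) / 2.0

    def worker():
        while True:
            lo, hi = bag.get()
            mid = (lo + hi) / 2.0
            whole, halves = trapezoid(lo, hi), trapezoid(lo, mid) + trapezoid(mid, hi)
            if abs(whole - halves) < tol:
                with lock:
                    total[0] += halves          # accept this interval's estimate
            else:
                bag.put((lo, mid))              # otherwise split: two new tasks
                bag.put((mid, hi))
            bag.task_done()

    for _ in range(n_workers):
        threading.Thread(target=worker, daemon=True).start()
    bag.join()                                  # done when the bag is empty and all work finished
    return total[0]

print(quad_bag(math.sin, 0.0, math.pi))         # approximately 2.0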
Article
The Tracs graphical programming environment promotes a modular approach to the development of distributed applications. A few types of reusable design components make the environment both simple and powerful. Tracs exploits modularity in an original way. Its support of message models, task models, and architecture models as basic design components provides programmers with a framework that has proven practical, powerful, and easy to understand. Furthermore, modularity has allowed us to add advanced facilities to the environment, with little implementation and integration effort. From this point of view, our choice of supporting message models as a basic design component has proven appropriate. Several of the ideas explored in Tracs will be useful in future work on programming environments for parallel and distributed systems.
Article
The authors describe CODE (computation-oriented display environment), which can be used to develop modular parallel programs graphically in an environment built around fill-in templates. It also lets programs written in any sequential language be incorporated into parallel programs targeted for any parallel architecture. Broad expressive power was obtained in CODE by including abstractions of all the dependency types that occur in the widely used parallel-computation models and by keeping the form used to specify firing rules general. The CODE programming language is a version of generalized dependency graphs designed to encode the unified parallel-computation model. A simple example is used to illustrate the abstraction level in specifying dependencies and how they are separated from the computation-unit specification. The most important CODE concepts are described by developing a declarative, hierarchical program with complex firing rules and multiple dependency types.
Article
The parallelism in a data flow computer is both microscopic (much more so than in a multiprocessor) and all-encompassing (much more so than in a vector processor). Like the other forms of parallel computer, data flow computers are best programmed in special languages. In fact, their need for such languages is stronger - most data flow designs would be extremely inefficient if programmed in conventional languages such as FORTRAN or PL/I. There are three "data flow" languages discussed in this paper: VAL, ID and LUCID.
Article
Over the past two decades (1974-94), advances in semiconductor and integrated circuit technology have fuelled the drive toward faster, ever more efficient computational machines. Today, the most powerful supercomputers can perform computation at billions of floating-point operations per second (gigaflops). This increase in capability is intensifying the demand for even more powerful machines. Computational limits for the largest supercomputers are expected to exceed the teraflops barrier in the coming years. The following areas are discussed: the nature of I/O in massively parallel processing; operating and file systems; runtime systems and compilers; and networking technology. The recurrent themes in the parallel I/O problem are the existence of a great variety of access patterns and the sensitivity of current I/O systems to these access patterns. An increase in the variability of access patterns is also expected, and single resource-management approaches will likely not suffice. Providing the I/O infrastructure that will support these requirements will necessitate research in operating systems (parallel file systems, runtime systems, and drivers), language interfaces to high-performance storage systems, high-speed networking, graphics and visualization systems, and new hardware technology for I/O and storage systems.