Article

A Practical Algorithm for Static Analysis of Parallel Programs.

Abstract

One approach to analyzing the behavior of a concurrent program requires determining the reachable program states. A program state consists of a set of task states, the values of shared variables used for synchronization, and local variables whose values derive directly from synchronization operations. However, the number of reachable states rises exponentially with the number of tasks and becomes intractable for many concurrent programs. A variation of this approach merges a set of related states into a single virtual state. Using this approach, the analysis of concurrent programs becomes feasible, as the number of virtual states is often orders of magnitude smaller than the number of reachable states. This paper presents a method for determining the virtual states that describe the reachable program states, and the reduction in the number of states is analyzed. The algorithms given have been implemented in a static program analyzer for multitasking Fortran, and the results obtained are discussed.
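The state-merging idea in the abstract can be made concrete with a small sketch (an illustrative toy, not the paper's algorithm for multitasking Fortran): reachable states are enumerated by interleaving task steps, but the visited set is keyed on an abstraction that treats identical tasks as interchangeable, so many concrete states collapse into one virtual state. The task program, semaphore model, and all names below are invented for the example.

```python
from collections import deque

# Hypothetical task model: each of N identical tasks runs the same short
# synchronization skeleton, expressed as abstract operations on one shared
# semaphore ("P" = wait, "V" = signal, "work" = local step).
PROGRAM = ["P", "work", "V"]
N_TASKS = 4

def successors(state):
    """Yield states reachable by letting any one task take its next step."""
    pcs, sem = state
    for i, pc in enumerate(pcs):
        if pc >= len(PROGRAM):
            continue                      # this task has terminated
        op = PROGRAM[pc]
        if op == "P" and sem == 0:
            continue                      # blocked on the semaphore
        new_sem = sem - 1 if op == "P" else sem + 1 if op == "V" else sem
        new_pcs = pcs[:i] + (pc + 1,) + pcs[i + 1:]
        yield (new_pcs, new_sem)

def virtual(state):
    """Abstraction: identical tasks are interchangeable, so a virtual state
    keeps only the multiset of program counters plus the shared value."""
    pcs, sem = state
    return (tuple(sorted(pcs)), sem)

def explore(merge=True):
    start = ((0,) * N_TASKS, 1)           # semaphore initially 1
    seen = {virtual(start) if merge else start}
    frontier = deque([start])
    while frontier:
        s = frontier.popleft()
        for t in successors(s):
            key = virtual(t) if merge else t
            if key not in seen:
                seen.add(key)
                frontier.append(t)
    return len(seen)

print("concrete states:", explore(merge=False))
print("virtual states: ", explore(merge=True))
```

For this toy program the concrete state count grows exponentially with the number of identical tasks, while the virtual-state count grows only polynomially, which mirrors the orders-of-magnitude reduction claimed above.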

... It is important to detect atomicity races [10][11][12][13] in applications for safety-critical avionics systems. CodeSonar [14] is a representative tool which is widely used to identify concurrency bugs, such as data races and deadlocks, as well as sequential errors, based on static analysis techniques [15][16][17] applied to source code. Static analysis tools are sound but imprecise, because they produce many false positives by evaluating all possible executions, including impractical execution paths that are never reached in the actual execution of the applications. ...
... However, it is difficult to use this information to check for atomicity races in the applications, because detecting atomicity races requires understanding the parallel execution of processes and predicting their nondeterministic behaviors. Therefore, in general, a range of automatic detection tools based on sophisticated techniques [10][11][12][13][14][15][16][17][18][19][20][21][22][23][24][25][26][27] is employed to locate atomicity races that exist in ARINC 653 applications. A representative tool for detecting atomicity races in avionics applications is CodeSonar. ...
Article
Full-text available
Atomicity races in ARINC 653 applications are a kind of concurrency bug that causes nondeterministic behavior of parallel processes. These defects must be detected to ensure the reliability of the applications, because they may lead to results that are unpredictable to the programmer. This paper presents a tool, called AR653, to dynamically detect atomicity races in an execution of an application. The tool monitors only minimal information, such as processes, semaphores, and read/write accesses to shared resources, and analyzes the synchronization relations to report atomicity races through a locking discipline of semaphores. We compared the accuracy of AR653 with CodeSonar using synthetic programs on a simulation system for integrated modular avionics. The empirical results show that our tool correctly reports atomicity races in cases using shared pointers as well as in cases using shared variables, while CodeSonar only locates atomicity races in cases using shared variables.
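The "locking discipline of semaphores" mentioned above is not spelled out in the abstract; the sketch below is only a generic lockset-style discipline check in the spirit of such tools (not AR653 itself), with an invented trace format: a shared resource accessed without any semaphore held in common with its previous accesses is flagged as a potential race.

```python
from collections import defaultdict

# Generic lockset-style discipline check (a simplified, Eraser-like sketch,
# not AR653 itself). Each trace event is (process, action, target), e.g.
#   ("p1", "sem_wait", "S"), ("p1", "read", "X"), ("p1", "sem_post", "S").
def check_discipline(trace):
    held = defaultdict(set)               # process -> semaphores currently held
    candidate = {}                        # shared object -> lockset so far
    reports = []
    for proc, action, target in trace:
        if action == "sem_wait":
            held[proc].add(target)
        elif action == "sem_post":
            held[proc].discard(target)
        elif action in ("read", "write"):
            locks = set(held[proc])
            if target not in candidate:
                candidate[target] = locks
            else:
                candidate[target] &= locks
            if not candidate[target]:     # no common semaphore protects target
                reports.append((proc, action, target))
    return reports

trace = [
    ("p1", "sem_wait", "S"), ("p1", "write", "X"), ("p1", "sem_post", "S"),
    ("p2", "write", "X"),     # access without the semaphore -> reported
]
print(check_discipline(trace))
```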
... The control-flow graphs of individual processes are modified to highlight the synchronization structure, abstracting away other details. Subsequently, the complete state-transition graph of the execution, known as the reachability graph, is constructed, thereby modeling the concurrent program as the set of all possible execution sequences [5,6]. Traditional reachability analysis suffers from combinatorial explosion, i.e., the number of states generated for analysis increases exponentially with the number of concurrent threads of execution. ...
... The practical utility of the apportioning technique can be seen from the following observations: The complexity (number of states generated) of traditional reachability analysis [5,6] is O(p^T), where T is the number of threads and p is the number of interactions for any thread. Extending such techniques to concurrent object-oriented programs by performing additional analysis for each class results in a complexity of O(c(p_l)^m + p^T), where c is the number of classes, m the number of methods in each class, and p_l is the number of LAPs in any method. ...
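A quick back-of-the-envelope computation makes the exponential term in these expressions tangible; the values of p and T below are arbitrary illustrations, not figures from the cited work.

```python
# Illustrative values only: show how the O(p^T) state count of traditional
# reachability analysis blows up with the number of threads T,
# for p = 8 interactions per thread.
p = 8
for T in (2, 4, 6, 8):
    print(T, p ** T)
# 2 64
# 4 4096
# 6 262144
# 8 16777216
```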
Article
Reachability analysis is an important and well-known tool for static analysis of critical properties in concurrent programs, such as freedom from deadlocks. Direct application of traditional reachability analysis to concurrent object-oriented programs has many problems, such as incomplete analysis for reusable classes (not safe) and increased computational complexity (not efficient). Apportioning is a technique that overcomes these limitations and enables safe and efficient reachability analysis of concurrent object-oriented programs. Apportioning is based upon a simple but powerful idea of classification of program analysis points as local (having influence within a class) and global (having possible influence outside a class). Given a program and a classification of its analysis points, reachability graphs are generated for: (i) an abstract version of each class in the program having only local analysis points and (ii) an abstract version of the whole program having only global analysis points. The error to be checked is decomposed into a number of sub-properties, which are checked in the appropriate reachability graphs. In this paper we present the development of ARA, an apportioning based tool for analysis of concurrent Java programs. Some of the main features of ARA are: varying the classification of analysis points, distributing the generation of reachability graphs over several machines, and the use of efficient data structures, to further reduce the time required for reachability analysis. We also present our experience with using ARA for the analysis of several programs.
... Typically, the reduction in model size often results in a significant decrease in the size of the corresponding state space. Other methods exploit potential parallelism [6,19] or symmetries [10,17] inherent in the program being analyzed to avoid generation of the full state space. Compositional methods are aimed at performing the analysis in a modular fashion [21]. ...
... Our future work plans are aimed at (1) further improving the efficiency of the analysis, and (2) broadening our example set. In particular, we are currently investigating symmetry-based reduction methods [10,17]. We are also seeking ways to incorporate semantic information about Ada-tasking analysis in stubborn sets and partial orders. ...
... One approach to determining potential races is based on computing all of the reachable concurrency states of the program [McD89, Tay84]. The major disadvantage of this approach is that the number of concurrency states may become prohibitively large. ...
... Alternatively, each synchronization event could include the source line number of the statement generating the event. From the source line numbers the path between two adjacent events can be determined and the variables referenced along the path can be computed [McD89]. ...
Conference Paper
Full-text available
One of the fundamental problems encountered when debugging a parallel program is determining the race conditions in the program. This paper presents a tool for automatically detecting data races in parallel programs by analyzing program traces. Given a trace of important events, the authors present a series of algorithms which identify those pairs of events which may have participated in a race condition. The linear event trace is first converted into a partial ordering which reflects one of the possible executions. Other algorithms modify the partial ordering on the events, extracting information common to all of the executions which could have generated the linear trace. This allows one to analyze sets of executions rather than just one specific execution based on the trace information. A working trace analyzer has been implemented for IBM Parallel Fortran. The trace analyzer can report various data races in parallel programs by finding unordered pairs of events and variable access conflicts.
... One major problem is deciding which pairs of inputs and SYN-sequences to select for a concurrent program. A number of methods for solving this problem have been proposed [20], [21], [22]. Fig. 1 illustrates the concept of reachability testing. ...
... This is based on the thread synchronization procedure of the Java programming language [25], for which the entry and exit protocols are very similar to the protocols shown in Figs. 22. Also, the Lock_DB(DB) and UnLock_DB(DB) functions are replaced with wait(DB.sem) and signal(DB.sem), which are Java subroutines that implement the binary semaphore operations defined in [4]. ...
Article
Full-text available
The execution of a client/server application involving database access requires a sequence of database transaction events (or, T-events), called a transaction sequence (or, T-sequence). A client/server database application may have nondeterministic behavior in that multiple executions thereof with the same input may produce different T-sequences. We present a framework for testing all possible T-sequences of a client/server database application. We first show how to define a T-sequence in order to provide sufficient information to detect race conditions between T-events. Second, we design algorithms to change the outcomes of race conditions in order to derive race variants, which are prefixes of other T-sequences. Third, we develop a prefix-based replay technique for race variants derived from T-sequences. We prove that our framework can derive all the possible T-sequences in cases where every execution of the application terminates. A formal proof and an analysis of the proposed framework are given. We describe a prototype implementation of the framework and present experimental results obtained from it.
... Static analysis extracts more information from P when "probe-effect" [Sto88], a principle similar to Heisenberg's uncertainty principle, limits further instrumentation [Tay83], [TaO80], [BBC88], [CaS89], [McD89]. They analyze P and are limited to the P → M part of the debugging cycle. ...
... From a text containing blocking synchronization, signal synchronization and non-synchronization statements, static analysis techniques routinely extract a synchronization-control-flow graph [Tay83], [TO80], [BBC88], [CaS89], [McD89], [MiC89]. The graph typically contains three types of nodes and two types of arcs. ...
Thesis
Full-text available
Debugging is a process that involves establishing relationships between several entities: The behavior specified in the program, P, the model/predicate of the expected behavior, M, and the observed execution behavior, E. The thesis of the unified approach is that a consistent representation for P, M and E greatly simplifies the problem of concurrent debugging, both from the viewpoint of the programmer attempting to debug a program and from the viewpoint of the implementor of debugging facilities. Provision of such a consistent representation becomes possible when sequential behavior is separated from concurrent or parallel structuring. Given this separation, the program becomes a set of sequential actions and relationships among these actions. The debugging process, then, becomes a matter of specifying and determining relations on the set of program actions. The relations are specified in P, modeled in M and observed in E. This simplifies debugging because it allows the programmer to think in terms of the program which he understands. It also simplifies the development of a unified debugging system because all of the different approaches to concurrent debugging become instances of the establishment of relationships between the actions. The unified approach defines a formal model for concurrent debugging in which the entire debugging process is specified in terms of program actions. The unified model places all of the approaches to debugging of parallel programs such as execution replay, race detection, model/predicate checking, execution history displays and animation, which are commonly formulated as disjoint facilities, in a single, uniform framework. We have also developed a feasibility demonstration prototype implementation of this unified model of concurrent debugging in the context of the CODE 2.0 parallel programming system. This implementation demonstrates and validates the claims of integration of debugging facilities in a single framework. It is further the case that the unified model of debugging greatly simplifies the construction of a concurrent debugger. All of the capabilities previously regarded as separate for debugging of parallel programs, both in shared memory models of execution and distributed memory models of execution, are supported by this prototype.
... One approach to determining potential races is based on computing all of the reachable concurrency states of the program [McD89, Tay84]. The major disadvantage of this approach is that the number of concurrency states may become prohibitively large. ...
... Alternatively, each synchronization event could include the source line number of the statement generating the event. From the source line numbers the path between two adjacent events can be determined and the variables referenced along the path can be computed [McD89]. ...
Article
Full-text available
This paper describes techniques which automatically detect data races in parallel programs by analyzing program traces. We view a program execution as a partial ordering of events, and define which executions are consistent with a given trace. In general, it is not possible to determine which of the consistent executions occurred. Therefore we introduce the notion of "safe orderings" between events which are guaranteed to hold in every execution which is consistent with the trace. The main result of the paper is a series of algorithms which determine many of the "safe orderings". An algorithm is also presented to distinguish unordered sequential events from concurrent events. A working trace analyzer has been implemented. The trace analyzer can report various data races in parallel programs by finding unordered pairs of events and variable access conflicts. Keywords: data race, time vector, program trace, parallel programming, debugging, distributed systems
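The "time vector" idea in the keywords can be sketched with a textbook vector-clock construction; this is not the paper's safe-ordering algorithms, and the event and trace format below is invented for illustration. Two events are ordered exactly when their vector timestamps are componentwise comparable; otherwise they are potentially concurrent.

```python
from collections import defaultdict

# Minimal vector-clock sketch over a trace of events.  An event is
# (process, kind, tag): kind is "local", "signal", or "wait"; a "wait" is
# assumed to be ordered after the matching "signal" with the same tag.
def vector_clocks(trace, processes):
    clock = {p: defaultdict(int) for p in processes}
    last_signal = {}                      # tag -> clock snapshot at the signal
    stamped = []
    for proc, kind, tag in trace:
        clock[proc][proc] += 1            # tick own component
        if kind == "wait" and tag in last_signal:
            for q, t in last_signal[tag].items():
                clock[proc][q] = max(clock[proc][q], t)
        if kind == "signal":
            last_signal[tag] = dict(clock[proc])
        stamped.append(((proc, kind, tag), dict(clock[proc])))
    return stamped

def ordered(v1, v2):
    """True if the event stamped v1 happens before the event stamped v2."""
    keys = set(v1) | set(v2)
    return all(v1.get(k, 0) <= v2.get(k, 0) for k in keys) and v1 != v2

trace = [
    ("p1", "local", None), ("p1", "signal", "s"),
    ("p2", "wait", "s"), ("p2", "local", None),
]
stamped = vector_clocks(trace, ["p1", "p2"])
print(ordered(stamped[0][1], stamped[3][1]))   # True: p1's first event precedes p2's last
```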
... These techniques are classified into two analysis approaches: static analysis and dynamic analysis. The static-analysis-based techniques [3,20] are classical approaches that maximize data race detection coverage by tracking all execution paths of a given parallel program. However, their extensive analysis leads to an excessive number of false positive data races [5] that are not actually harmful to the program. ...
Article
Shared-memory based parallel programming with OpenMP and Posix-thread APIs becomes more common to fully take advantage of multiprocessor computing environments. One of the critical risks in multithreaded programming is data races, which are hard to debug and greatly damaging to parallel applications if they go uncaught. Although ample effort has been made in building specialized data race detection techniques, state-of-the-art tools such as the Intel Thread Checker still have various functionality and performance problems. In this paper, we present a Versatile On-the-fly Race Detection (VORD) tool that provides an agile, efficient, and scalable race detection environment for various parallel programming models. VORD can automatically construct an empirically optimal set of race engines by utilizing classification and adaptation mechanisms. A Race-Detection Classification (RDC) table is created to categorize adequate engines in the aspect of labeling, detecting, and filtering. An Engine Code Property Selector (ECPS) uses the RDC table to adapt optimal engines for the given target programming models. In addition to RDC and ECPS, we have also implemented an OpenMP parser and a source instrumentor. The functionality and efficiency of VORD were compared with those of the Intel Thread Checker by using a set of OpenMP based kernel programs. The experimental results show that VORD can detect data races in more challenging programming models such as nested thread and synchronization models, and can achieve a couple of orders of magnitude faster processing time than the Intel Thread Checker in large parallel programs.
... This reduction technique is based on the fact that blocking conditions depend only on the partial order of events rather than on the particular sequence of these events. In another related approach, symmetry in the system is traced back to its components and then exploited to reduce the complexity of the reachability analysis [McD89]. ...
Article
Full-text available
Within the formal language and automata settings, a modelling and analysis paradigm for multiprocess discrete event systems is proposed. The modelling structure, referred to as one of interacting discrete event systems (IDES), features explicit representation of the system components. In addition, different forms of interactions between the components can be directly represented - as interaction specification - in the modelling structure. A multilevel extension to the model is introduced. The...
... The reachability analysis approach has been widely used for exhaustive analysis of finite-state distributed systems [1,8,9,20,29,32,39,46,49]. It constructs a state transition model of a system from models of primitive processes. ...
Article
Full-text available
Behaviour analysis is useful at all stages in the design and maintenance of well-behaved distributed systems. Dataflow and reachability analyses are two orthogonal but complementary behaviour analysis techniques. Individually, each of these techniques may be inadequate for the analysis of large-scale distributed systems. On the one hand, dataflow analysis algorithms, while tractable, may not be sufficiently accurate to provide meaningful detection of errors. On the other hand, reachability analysis, while providing exhaustive analysis, may be computationally too expensive for complex systems. In this paper, we present a method which integrates a dataflow and a reachability analysis technique to provide a flexible and effective means for analysing distributed systems at preliminary and final design stages respectively. We also describe some effective measures taken to improve the adequacy of the individual analysis techniques using concepts of action dependency and context constraints. A prototype supporting the method has been built and its performance is presented in the paper. A realistic example of a distributed track control system is used as a case study.
... We believe it may be possible to optimize reachability analysis further, especially by exploiting symmetry and other similarity properties of the problem state space. Such work would draw on the research of [17], [40], [36], and [25]. ...
Article
Full-text available
The behavior of a concurrent program often depends on the arbitrary interleaving of computations performed by asynchronous processes. The resulting non-determinism can lead to such phenomena as deadlock and starvation, making program development extremely difficult, and consequently making the development of tools for formal analysis highly desirable. A specification-based approach to concurrency analysis is a particularly promising way of addressing some of the difficulties inherent in concurrent program development. According to this approach, a programmer first writes a specification describing the interprocess communication behavior of a concurrent program. A set of formal analysis techniques are then applied in an effort to determine whether the specification can be fully satisfied. If the analysis is successful, target code is generated automatically that conforms to the specification. This approach has a variety of benefits. While such properties as safety and liveness are rather difficult to discern in actual code, they are actually easy to include as part of a specification. Moreover, state spaces induced by specifications tend to be smaller and more manageable than state spaces of actual code, and this leads to more effective analysis techniques. Finally, the generation of interprocess communication code from formal specifications is accomplished in a relatively straightforward manner.
... An algorithm presented in McDowell [1989] computes parallel(i, j) for programs written in FORTRAN with extensions to support explicit parallelism. Whereas the simple language in Taylor and Osterweil [1980] explicitly prohibits the execution of a process with itself, the algorithm in McDowell [1989] uses the fact that many parallel numerical applications are expressed as collections of identical tasks executing in parallel on shared data. The result is that many fewer states are generated. ...
Article
Full-text available
The main problems associated with debugging concurrent programs are increased complexity, the 'probe effect', nonrepeatability, and the lack of a synchronized global clock. The probe effect refers to the fact that any attempt to observe the behavior of a distributed system may change the behavior of that system. For some parallel programs, different executions with the same data will result in different results even without any attempt to observe the behavior. Even when the behavior can be observed, in many systems the lack of a synchronized global clock makes the results of the observation difficult to interpret. This paper discusses these and other problems related to debugging concurrent programs and presents a survey of current techniques used in debugging concurrent programs. Systems using three general techniques are described: traditional or breakpoint style debuggers, event monitoring systems, and static analysis systems. In addition, techniques for limiting, organizing, and displaying a large amount of data produced by the debugging systems are discussed.
... Partial order reduction [4] is an instance of these approaches, in which the effect of representing concurrency with interleaving is alleviated. Symmetrical reduction is another approach, in which symmetry in the system is traced back to its components and then exploited to reduce the complexity of the reachability analysis [5]. See [2] for a survey on reduction techniques for blocking detection in logical systems. ...
Conference Paper
Full-text available
Not Available
... Thus, either heuristics are proposed to avoid the consideration of all the interleavings [McDowell 1989], or restricted situations are considered, which do not require considering the interleavings at all. Grunwald and Srinivasan [1993a] for example require data independence of parallel components according to the PCF Fortran standard. ...
Article
Full-text available
Probably, the reason for this deficiency is that a naive adaptation fails [Midkiff and Padua 1990] and that the straightforward correct adaptation needs an unacceptable effort, which is caused by the interleavings that manifest the possible executions of a parallel program. Thus, either heuristics are proposed to avoid the consideration of all the interleavings [McDowell 1989], or restricted situations are considered, which do not require considering the interleavings at all. Grunwald and Srinivasan [1993a] for example require data independence of parallel components according to the PCF Fortran standard. Th...
... Long and Clarke [LC89] propose a similar task interaction concurrency graph representation, and cite empirical evidence of a linear reduction in the number of states with respect to the concurrency state graph. McDowell [McDo89] builds a reduced concurrency history graph structure by aggregating related concurrency history states into "clans"; the resulting graph may still exceed polynomial size in the worst case. ...
Article
Infinite wait anomalies associated with a barrier rendezvous model (e.g., Ada) can be divided into two classes: stalls and deadlocks. Although precise static deadlock detection is NP-hard, we present two polynomial time algorithms which operate on a statically derivable program representation, the sync graph, to certify a useful class of programs free of deadlocks. We identify three conditions local to any deadlocked tasks, and a fourth global condition on all tasks, which must occur in the sync graph of any program which can deadlock. Again, exact checking of the local conditions is NP-hard; the algorithms check them using conservative approximations. Certifying stall freedom is intractable for programs with conditional branching, including loops. We give program transforms which may help alleviate this difficulty. Keywords: synchronization anomalies, Ada, deadlocks, static analysis, parallel programming
... The algorithm computes the set of possible concurrency states for a program and determines the possible sequences in which they may occur. McDowell [17], and Helmbold and McDowell [12] also enumerate possible concurrency states, but present techniques to reduce the number of states. The worst case number of states remains superpolynomial, however. ...
Article
Many coarse-grained, explicitly parallel programs execute in phases delimited by barriers to preserve sets of cross process data dependencies. One of the major obstacles to optimizing these programs is the necessity to conservatively assume that any two statements in the program may execute concurrently. Consequently, compilers fail to take advantage of opportunities to apply optimizing transformations, particularly those designed to improve data locality, both within and across the phases of the program. We present a simple and efficient compile time algorithm that uses the presence of barriers to perform non-concurrency analysis on coarse-grain, explicitly parallel programs. It works by dividing the program into a set of phases and computing the control flow between them. Each phase consists of one or more sequences of program statements that are delimited by barrier synchronization events and can execute concurrently. We show that the algorithm performs perfectly on all but one of...
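The key observation, that statements separated by a barrier can never execute concurrently, can be shown with a much simpler sketch than the cited algorithm (straight-line code only, no control flow between phases; the statement encoding is made up):

```python
# Toy non-concurrency check for a straight-line SPMD program: statements in
# different barrier-delimited phases can never execute concurrently, so only
# same-phase statement pairs need to be checked for conflicts.
def phases(statements):
    """Split a statement list into phases at each 'barrier' marker."""
    result, current = [], []
    for s in statements:
        if s == "barrier":
            result.append(current)
            current = []
        else:
            current.append(s)
    result.append(current)
    return result

def may_run_concurrently(statements, a, b):
    for phase in phases(statements):
        if a in phase and b in phase:
            return True
    return False

prog = ["write A", "barrier", "read A", "write B", "barrier", "read B"]
print(may_run_concurrently(prog, "write A", "read A"))   # False: separated by a barrier
print(may_run_concurrently(prog, "read A", "write B"))   # True: same phase
```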
Article
A concurrency history graph is a way to represent every state that can be entered by a parallel program. Such a graph can be used to detect errors in and verify properties of the parallel program. However, concurrency history graphs are usually too large to be generated in the obvious manner. A new abstraction which allows many program states to be represented by the same node in a concurrency history graph is presented. This new abstraction is more general than previous work by the authors. Using this new abstraction mechanism it is possible to produce concurrency history graphs requiring much less storage than that suggested by a simple worst case complexity analysis. A static analysis tool capable of detecting race conditions in parallel programs is being built based on concurrency history graphs using this new abstraction.
Article
Concurrent programs exhibit nondeterministic behavior in that multiple executions thereof with the same input might produce different sequences of synchronization events and different results. This is because different executions of a concurrent program with the same input may exhibit different interleavings. Thus, one of the major issues in the testing of concurrent programs is how to explore different interleavings or exhaust all the possible interleavings of the target programs. However, terminating concurrent programs that have cyclic state spaces, due to iterative statements such as busy-waiting loops, might have an infinite number of feasible synchronization sequences; that is, there is an infinite number of possible interleavings, which makes it impossible to explore all the possible interleavings for this type of concurrent program. To overcome this problem, we propose a testing scheme called dynamic effective testing that can perform state-cover testing for nondeterministic terminating concurrent programs with an infinite number of synchronization sequences. Dynamic effective testing does not require static analysis of the target concurrent program or the assistance of a model checker, and thus is loosely coupled to the syntax of the target concurrent program. It only needs to analyze sequences of synchronization events produced by the execution of the concurrent programs for race detection and state-traversal control. Therefore, the method is easy to port to different programming languages. In addition, only reiterated states discovered in a single SYN-sequence need to be stored. The implementation and experimental results obtained with real code demonstrate that source-code-level dynamic testing can be systematically performed on nondeterministic concurrent programs with infinite synchronization sequences.
Conference Paper
In this paper, we give a matrix-based approach for blocking detection in multiprocess systems. Based on the matrix expression, deadlock detection is discussed first, and the potential blocking states in composite automata are identified by examining shared transitions in the components. Then livelock detection in multiprocess systems is studied using the proposed approach to detect cycles and cliques. Examples are also given for illustration.
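As a rough illustration of the matrix flavor of this kind of analysis (the cited construction over composed automata is more involved), a boolean wait-for matrix can be closed transitively; a true entry on the diagonal then exposes a circular wait, i.e., a potential deadlock. The wait-for relation below is invented.

```python
# Boolean-matrix sketch of cycle (deadlock) detection in a wait-for relation.
# wait_for[i][j] = True means process i is waiting for a resource held by j.
def has_cycle(wait_for):
    n = len(wait_for)
    reach = [row[:] for row in wait_for]
    # Warshall-style transitive closure using boolean matrix operations.
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    reach[i][j] = reach[i][j] or reach[k][j]
    return any(reach[i][i] for i in range(n))

# p0 waits for p1, p1 waits for p2, p2 waits for p0 -> circular wait.
wait_for = [
    [False, True,  False],
    [False, False, True ],
    [True,  False, False],
]
print(has_cycle(wait_for))   # True
```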
Article
Full-text available
The constrained expression approach to analysis of concurrent software systems can be used with a variety of design and programming languages and does not require a complete enumeration of the set of reachable states of the concurrent system. The construction of a toolset automating the main constrained expression analysis techniques and the results of experiments with that toolset are reported. The toolset is capable of carrying out completely automated analyses of a variety of concurrent systems, starting from source code in an Ada-like design language and producing system traces displaying the properties represented by the analyst's queries. The strengths and weaknesses of the toolset and the approach are assessed on both theoretical and empirical grounds.
Article
Full-text available
This special issue contains extended versions of selected papers from the workshops on Formal Methods for Industrial Critical Systems (FMICS) held in Eindhoven, The Netherlands, in November 2009 and in Antwerp, Belgium, in September 2010. These were, respectively, the 14th and 15th of a series of international workshops organized by an open working group supported by ERCIM (European Research Consortium for Informatics and Mathematics) that promotes research in all aspects of formal methods (see details in http://www.inrialpes.fr/vasy/fmics/). The FMICS workshops that have produced this special issue considered papers describing original, previously unpublished research, not simultaneously submitted for publication elsewhere, and dealing with the following themes:
* Design, specification, code generation and testing based on formal methods.
* Methods, techniques and tools to support automated analysis, certification, debugging, learning, optimization and transformation of complex, distributed, real-time and embedded systems.
* Verification and validation methods that address shortcomings of existing methods with respect to their industrial applicability (e.g., scalability and usability issues).
* Tools for the development of formal design descriptions.
* Case studies and experience reports on industrial applications of formal methods, focusing on lessons learned or new research directions.
* Impact and costs of the adoption of formal methods.
* Application of formal methods in standardization and industrial forums.
The selected papers are the result of several evaluation steps. In response to the call for papers, FMICS 2009 received 24 papers and FMICS 2010 received 33 papers, with 10 and 14 accepted, respectively, which were published by Springer-Verlag in the series Lecture Notes in Computer Science (volumes 5825 [1] and 6371 [2]). Each paper was reviewed by at least three anonymous referees who provided full written evaluations. After the workshops, the authors of 10 papers were invited to submit extended journal versions to this special issue. These papers passed two review phases, and finally 7 were accepted to be included in the journal.
Article
The system-level design decision making and feasibility analysis of the real-time embedded systems are discussed. The design decision-making is related to the mapping and scheduling of the system behavior on the system structure. The opportunities and problems related to the system-on-a-chip technology are also discussed. A method for modeling application-specific real-time embedded systems for the semi-automatic design decision-making related to the system architecture exploration phase in system design is also proposed.
Article
We have designed a safe, polynomial time approximation algorithm for static deadlock detection in a subset of the Ada language [MR90b]. We extend the program representation to include nearly all of the Ada rendezvous primitives, and present preliminary experimental results for an implementation of our algorithm. Our goal is to develop an automatic facility to accurately certify deadlock freedom for a large class of Ada programs.
Conference Paper
One of the central problems in the automatic analysis of distributed or parallel systems is the combinatorial state explosion leading to models which are exponential in the number of their parallel components. The only known cure for this problem is application-specific techniques, which avoid the state explosion problem under special frame conditions. In this paper we present a new such technique, which is tailored to bitvector analyses, which are very common in data flow analysis. In fact, our method allows adapting most of the practically relevant optimizations for sequential programs to a parallel setting with shared variables and arbitrary interference between parallel components.
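For context, a bitvector analysis of the kind targeted here looks like the following sequential live-variable computation; this is only the sequential building block and deliberately ignores parallel interference, which is exactly what the cited technique adds. The block structure and variable names are invented.

```python
# Sequential liveness as a bitvector-style analysis: one "bit" per variable,
# gen = variables used, kill = variables defined.  This sketch is the
# sequential building block only and ignores parallel interference entirely.
def liveness(blocks, succ):
    live_in = {b: frozenset() for b in blocks}
    live_out = {b: frozenset() for b in blocks}
    changed = True
    while changed:                        # iterate to a fixpoint
        changed = False
        for b, (use, defs) in blocks.items():
            out = (frozenset().union(*(live_in[s] for s in succ[b]))
                   if succ[b] else frozenset())
            new_in = use | (out - defs)
            if new_in != live_in[b] or out != live_out[b]:
                live_in[b], live_out[b] = new_in, out
                changed = True
    return live_in, live_out

# b0: x = 1         (uses {},  defines {x})
# b1: y = x + 1     (uses {x}, defines {y})
# b2: print(y)      (uses {y}, defines {})
blocks = {"b0": (frozenset(), {"x"}),
          "b1": ({"x"}, {"y"}),
          "b2": ({"y"}, frozenset())}
succ = {"b0": ["b1"], "b1": ["b2"], "b2": []}
live_in, _ = liveness(blocks, succ)
print(sorted(live_in["b1"]))   # ['x']
```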
Article
The problem of analyzing concurrent systems has been investigated by many researchers, and several solutions have been proposed. Among the proposed techniques, reachability analysis—systematic enumeration of reachable states in a finite-state model—is attractive because it is conceptually simple and relatively straightforward to automate and can be used in conjunction with model-checking procedures to check for application-specific as well as general properties. This article shows that the nature of the translation from source code to a modeling formalism is of greater practical importance than the underlying formalism. Features identified as pragmatically important are the representation of internal choice, selection of a dynamic or static matching rule, and the ease of applying reductions. Since combinatorial explosion is the primary impediment to application of reachability analysis, a particular concern in choosing a model is facilitating divide-and-conquer analysis of large programs. Recently, much interest in finite-state verification systems has centered on algebraic theories of concurrency. Algebraic structure can be used to decompose reachability analysis based on a flowgraph model. The semantic equivalence of graph and Petri net-based models suggests that one ought to be able to apply a similar strategy for decomposing Petri nets. We describe how category-theoretic treatments of Petri nets provide a basis for decomposition of Petri net reachability analysis.
Chapter
The term "Distributed Software Engineering" isambiguous1. It includes both the engineering ofdistributed software and the process of distributeddevelopment of software, such as cooperative work. Thispaper concentrates on the former, giving an indication ofthe special needs and rewards in distributed computing.In essence, we argue that the structure of these systems asinteracting components is a blessing which forcessoftware engineers towards compositional techniqueswhich offer the...
Conference Paper
Many parallel programs are written in SPMD style, i.e. by running the same sequential program on all processes. SPMD programs include synchronization, but it is easy to write incorrect synchronization patterns. We propose a system that verifies a program's synchronization pattern. We also propose language features to make the synchronization pattern more explicit and easily checked. We have implemented a prototype of our system for Split-C and successfully verified the synchronization structure of realistic programs.
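The flavor of structural property such a verifier enforces can be illustrated by a far simpler check (this is not the cited system's analysis): every path through a conditional should cross the same number of barriers, otherwise processes taking different branches can block at mismatched barriers. The program encoding below is invented.

```python
# Structural barrier-matching sketch for SPMD code (not the cited system's
# analysis).  A program is a list of "barrier", "work", or ("if", then, else)
# items; it is considered well synchronized here only if every path crosses
# the same number of barriers, so processes taking different branches cannot
# end up waiting at mismatched barriers.
def barrier_counts(program):
    counts = {0}
    for item in program:
        if item == "barrier":
            counts = {c + 1 for c in counts}
        elif isinstance(item, tuple) and item[0] == "if":
            _, then_part, else_part = item
            counts = {c + t for c in counts for t in
                      barrier_counts(then_part) | barrier_counts(else_part)}
        # plain "work" items do not affect the count
    return counts

def check(program):
    counts = barrier_counts(program)
    return "ok" if len(counts) == 1 else f"mismatch: {sorted(counts)}"

good = ["work", ("if", ["barrier", "work"], ["work", "barrier"]), "barrier"]
bad  = ["work", ("if", ["barrier"], ["work"]), "barrier"]
print(check(good))   # ok
print(check(bad))    # mismatch: [1, 2]
```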
Conference Paper
Traditional optimization techniques for sequential programs are not directly applicable to parallel programs where concurrent activities may interfere with each other through shared variables. New compiler techniques must be developed to accommodate features found in parallel languages. In this paper, we use abstract interpretation to obtain useful properties of programs, e.g., side effects, data dependences, object lifetime and concurrent expressions, for a language that supports first-class functions, pointers, dynamic allocations and explicit parallelism through cobegin. These analyses may facilitate many applications, such as program optimization, parallelization, restructuring, memory management, and detecting access anomalies.Our semantics is based on a labeled transition system and is instrumented with procedure strings to record the procedural/concurrency movement along the program interpretation. We develop analyses in both concrete domains and abstract domains, and prove the correctness and termination of the abstract interpretation.
Conference Paper
Shared-memory based parallel programming with OpenMP and Posix-thread APIs is becoming more common to fully take advantage of multiprocessor computing environments. One of the critical risks in multithreaded programming is data races, which are hard to debug and greatly damaging to parallel applications if they go uncaught. Although ample effort has been made in building specialized data race detection techniques, state-of-the-art tools such as the Intel Thread Checker still have various functionality and performance problems. In this paper, we present an efficient data race detection mechanism named ADAT (Adaptive Dynamic Analysis Tool). ADAT analyzes target program models to categorize the race engines (RDC: Race-Detection Classification) and then selects adequate engines to detect races automatically based upon the RDC (ECPS: Engine Code Property Selector). ADAT constructs an empirically optimal set of race engines in the aspect of labeling, filtering, and detection. In addition to RDC and ECPS, we have implemented an OpenMP parser and a source instrumentor in ADAT to support OpenMP programs. The functionality and efficiency of ADAT are compared with those of the Intel Thread Checker by using a set of OpenMP based kernel programs. The experimental results show that ADAT can detect data races with more challenging target program models and can achieve a couple of orders of magnitude faster processing time than the Intel Thread Checker.
Conference Paper
Although the data-flow framework is a powerful tool to statically analyze a program, current data-flow analysis techniques have not addressed the effect of procedures on concurrency analysis. This work develops a data race detection technique using a data-flow framework that analyzes concurrent events in a program in which tasks and procedures interact. There are no restrictions placed on the interactions between procedures and tasks, and thus recursion is permitted. Solving a system of data-flow equations, the technique computes a partial execution order for regions in the program by considering the control flow within a program unit, communication between tasks, and the calling/creation context of procedures and tasks. From the computed execution order, concurrent events are determined as unordered events. We show how the information about concurrent events can be used in debugging to automatically detect data races.
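The last step described above, deriving concurrent events as unordered events once a partial execution order is known, reduces to a closure computation; the sketch below shows only that step, with an invented region and edge encoding, not the interprocedural data-flow system that produces the order.

```python
# Given "must precede" edges between program regions (as a dataflow analysis
# like the one described above would compute), two regions may happen in
# parallel exactly when neither reaches the other in the transitive closure.
def may_happen_in_parallel(regions, precedes):
    closure = {r: set() for r in regions}
    for a, b in precedes:
        closure[a].add(b)
    changed = True
    while changed:                        # naive transitive closure
        changed = False
        for a in regions:
            for b in set(closure[a]):
                new = closure[b] - closure[a]
                if new:
                    closure[a] |= new
                    changed = True
    return {(a, b) for a in regions for b in regions
            if a < b and b not in closure[a] and a not in closure[b]}

regions = ["main.init", "task1.body", "task2.body", "main.join"]
precedes = [("main.init", "task1.body"), ("main.init", "task2.body"),
            ("task1.body", "main.join"), ("task2.body", "main.join")]
print(may_happen_in_parallel(regions, precedes))
# {('task1.body', 'task2.body')}
```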
Article
Full-text available
Cats (Concurrency Analysis Tool Suite) is designed to satisfy several criteria: it must analyze implementation-level Ada source code and check user-specified conditions associated with program source code; it must be modularized in a fashion that supports flexible composition with other tool components, including integration with a variety of testing and analysis techniques; and its performance and capacity must be sufficient for analysis of real application programs. Meeting these objectives together is significantly more difficult than meeting any of them alone. We describe the design and rationale of Cats and report experience with an implementation. The issues addressed here are primarily practical concerns for modularizing and integrating tools for analysis of actual source programs. We also report successful application of Cats to major subsystems of a (nontoy) highly concurrent user interface system.
Article
A comprehensive overview of data flow frameworks and their characterizing properties is presented, to clarify property definitions and demonstrate their interrelation. Properties ensuring the existence of a solution are differentiated from those guaranteeing particular convergence behavior for specific solution procedures. Examples illustrate the orthogonality of these precision and convergence properties. In addition, several data flow problems are categorized with respect to these properties.
Article
Earlier work has shown the effectiveness of hand-applied program transformations optimizing high-level interprocess communication mechanisms. This paper describes the static analysis techniques necessary to ensure correct compiler application of the optimizing transformations. These techniques include both dataflow analysis and interprocess analysis. This paper focuses on the analysis of communication mechanisms within program modules; however, the analysis techniques can be generalized to handle inter-module optimization analysis as well. The major contributions of this paper include the application of dataflow analysis and the extension of interprocedural analysis—interprocess analysis—to real concurrent programming languages and, more specifically, to the optimization of interprocess communication and synchronization mechanisms that use both static and dynamic channels. In addition, the use of attribute grammars to perform interprocess analysis is significant. This paper also describes an implementation of both intra-process dataflow and interprocess analysis techniques using attribute grammars.
Conference Paper
The object-oriented paradigm provides support for modular and reusable design and is attractive for the construction of large and complex concurrent systems. Reachability analysis is an important and well-known tool for static (pre-run-time) analysis of concurrent programs. However its direct application to concurrent object-oriented programs has many problems, such as incomplete analysis for reusable classes and increased computational complexity. It also seems impossible to arrive at a single general-purpose strategy that is both safe and effective for all programs. The authors propose a tool-suite based approach for the reachability analysis of concurrent object-oriented programs. This approach enables choice of an appropriate `ideal' tool, for the given program and also provides the flexibility for incorporation of additional tools. They have also proposed a novel abstraction-based partitioning methodology for effective reachability analysis of concurrent object-oriented programs. Using this methodology, they have developed a variety of tools, having different degrees of safety, effectiveness and efficiency, for incorporation into the tool-suite. They have formally shown the safety of these tools for appropriate classes of programs and have evaluated their effectiveness and efficiency
Conference Paper
Traditional compiler techniques operating on control flow graphs are not adequate for analyzing parallel programs where data can flow from one node to another through the shared memory, even though the nodes are not related by control flow edges. Abstract interpretation provides a general and unified framework for program analyses, and can be applied to parallel programs without much difficulty. However, the state space explosion problem in abstract interpretation of parallel programs must be relieved in order to make compile-time analyses practical. Although abstract interpretation itself provides an excellent mechanism for state space reduction by state abstraction, lower precision analysis often results from taking a higher degree of abstraction. In this paper, we present state space reduction that preserves analysis precision by eliminating redundant interleavings, based on Valmari's (1990) stubborn set method. We also propose an iterative algorithm for analyzing programs with pointers and closures, in which knowledge about shared locations required by existing methods is not available. The proposed algorithm has been implemented, and we discuss preliminary results of the implementation
Article
One of the fundamental problems encountered when debugging a parallel program is determining the possible orders in which events could have occurred. Various problems, such as data races and intermittent deadlock, arise when there is insufficient synchronization between the tasks in a parallel program. A sequential trace of an execution can be misleading, as it implies additional event orderings, distorting the concurrent nature of the computation. Algorithms to generate, from the trace of an execution, those event orderings that can be relied on by the programmer are described. By its very nature, the information in an execution trace pertains only to that execution of the program, and may not generalize to other executions. This difficulty is mitigated by defining an inferred program based on the trace and original program, analyzing this inferred program, and showing how the inferred program relates to the original. The results of the algorithms can be used by other automated tools such as a data race detector or constraint checker
Article
The authors address the question of how to use existing sequential Fortran code on multiprocessors. Their answer is Start/Pat, an interactive toolkit that automates the parallelization of sequential Fortran as it teaches the programmer how to exploit and understand parallel structures and architectures. The Start/Pat prototype has been installed at several user sites. The authors discuss the choice of PCF Fortran, the toolkit components, and the features of Pat and Start
Article
The object-oriented paradigm in software engineering provides support for the construction of modular and reusable program components and is attractive for the design of large and complex distributed systems. Reachability analysis is an important and well-known tool for static analysis of critical properties in concurrent programs, such as deadlock freedom. It involves the systematic enumeration of all possible global states of program execution and provides the same level of assurance for properties of the synchronization structure in concurrent programs as formal verification. However, direct application of traditional reachability analysis to concurrent object-oriented programs has many problems, such as incomplete analysis for reusable classes (not safe) and increased computational complexity (not efficient). We propose a novel technique called apportioning, for safe and efficient reachability analysis of concurrent object-oriented programs, that is based upon a simple but powerful idea of classification of program analysis points as local (having influence within a class) and global (having possible influence outside a class). We have developed a number of apportioning-based algorithms, having different degrees of safety and efficiency. We present the details of one of these algorithms, formally show its safety for an appropriate class of programs, and present experimental results to demonstrate its efficiency for various examples
Article
Static analysis of concurrent programs has been hindered by the well-known state explosion problem. Although many different techniques have been proposed to combat this state explosion, there is little empirical data comparing the performance of the methods. This information is essential for assessing the practical value of a technique and for choosing the best method for a particular problem. In this paper, we carry out an evaluation of three techniques for combating the state explosion problem in deadlock detection: reachability searching with a partial-order state-space reduction, symbolic model checking and inequality-necessary conditions. We justify the method used for the comparison, and carefully analyze several sources of potential bias. The results of our evaluation provide valuable data on the kinds of programs to which each technique might best be applied. Furthermore, we believe that the methodological issues we discuss are of general significance in comparison of analysis techniques
Article
The commenters discuss several flaws they found in the above-titled paper by G.M. Koran and R.J.A. Burh (see ibid., vol.17, no.10, p.109-1125, (1991)). The commenters argue that the characterization of operational and axiomatic proof method is modified and inaccurate; the classification of modeling techniques for concurrent systems confuses the distinction between state-based and event-based models with the essential distinction between explicit enumeration of behaviors and symbolic manipulation of properties; the statements about the limitations of linear-time temporal logic in relation to nondeterminism are inaccurate; and the characterization of the computational complexity of the analysis technique is overly optimistic.
Article
Full-text available
This paper presents a taxonomy of methods for detecting race conditions in parallel programs, shows how recent results fit into the taxonomy, and presents some new results for previously unexamined points in the taxonomy. It also presents a taxonomy of "races" and suggested terminology.
Article
Full-text available
Code motion is well-known as a powerful technique for the optimization of sequential programs. It improves the run-time efficiency by avoiding unnecessary recomputations of values, and it is even possible to obtain computationally optimal results, i.e., results where no program path can be improved any further by means of semantics preserving code motion. In this paper we present a code motion algorithm that for the first time achieves this optimality result for parallel programs. Fundamental is the framework of [KSV1] showing how to perform optimal bitvector analyses for parallel programs as easily and as efficiently as for sequential ones. Moreover, the analyses can easily be adapted from their sequential counterparts. This is demonstrated here by constructing a computationally optimal code motion algorithm for parallel programs by systematically extending its counterpart for sequential programs, the busy code motion transformation of [KRS1, KRS2]. Keywords Parallelism, interleaving...
Article
Full-text available
This paper presents a taxonomy that categorizes methods for determining event
Article
Full-text available
The debugging cycle is the most common methodology for finding and correcting errors in sequential programs. Cyclic debugging is effective because sequential programs are usually deterministic. Debugging parallel programs is considerably more difficult because successive executions of the same program often do not produce the same results. In this paper we present a general solution for reproducing the execution behavior of parallel programs, termed Instant Replay. During program execution we save the relative order of significant events as they occur, not the data associated with such events. As a result, our approach requires less time and space to save the information needed for program replay than other methods. Our technique is not dependent on any particular form of interprocess communication. It provides for replay of an entire program, rather than individual processes in isolation. No centralized bottlenecks are introduced and there is no need for synchronized clocks or a globally consistent logical time. We describe a prototype implementation of Instant Replay on the BBN Butterfly™ Parallel Processor, and discuss how it can be incorporated into the debugging cycle for parallel programs.
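A toy rendering of the "order of events, not their data" idea (not the actual Instant Replay protocol on the BBN Butterfly): during recording, log the sequence of threads that touch each shared object; during replay, make each access wait for its recorded turn. All class and function names below are invented.

```python
import threading

# Toy "record the order, not the data" replay sketch (not the real Instant
# Replay protocol).  Recording logs, per shared object, the sequence of
# accessing threads; replay blocks each access until it is that thread's
# recorded turn, reproducing the original interleaving.
class ReplayObject:
    def __init__(self, name):
        self.name, self.value = name, 0
        self.log, self.cursor = [], 0
        self.cond = threading.Condition()

    def access(self, thread_id, update, mode, recorded=None):
        with self.cond:
            if mode == "record":
                self.log.append(thread_id)        # remember who accessed, in order
            else:                                 # replay mode
                while recorded[self.cursor] != thread_id:
                    self.cond.wait()              # not our turn yet
                self.cursor += 1
            self.value = update(self.value)
            self.cond.notify_all()

shared = ReplayObject("counter")

def worker(tid, mode, recorded=None):
    shared.access(tid, lambda v: v + 1, mode, recorded)

# Recording run: whatever order the scheduler produces is logged.
threads = [threading.Thread(target=worker, args=(i, "record")) for i in range(3)]
for t in threads: t.start()
for t in threads: t.join()
recorded = list(shared.log)

# Replay run: the logged order is enforced regardless of scheduling.
shared = ReplayObject("counter")
threads = [threading.Thread(target=worker, args=(i, "replay", recorded)) for i in range(3)]
for t in threads: t.start()
for t in threads: t.join()
print("recorded order reproduced:", recorded)
```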
Article
Algorithms are presented for detecting errors and anomalies in programs which use synchronization constructs to implement concurrency. The algorithms employ data flow analysis techniques. First used in compiler object code optimization, the techniques have more recently been used in the detection of variable usage errors in single process programs. By adapting these existing algorithms, the same classes of variable usage errors can be detected in concurrent process programs. Important classes of errors unique to concurrent process programs are also described, and algorithms for their detection are presented.
Article
Developing and verifying concurrent programs presents several problems. A static analysis algorithm is presented here that addresses the following problems: how processes are synchronized, what determines when programs are run in parallel, and how errors are detected in the synchronization structure. Though the research focuses on Ada, the results can be applied to other concurrent programming languages such as CSP.
Article
One of the more pressing problems with runtime monitoring is how to provide a useful description of a deadness error. Simply detecting dead states at runtime requires less information than is needed to satisfactorily describe those states. The authors first describe the debugging facilities for dealing with deadness errors in an experimental monitor. They then give an example of the monitor's use on a tasking program - a simplified model of activities in a gas station. This example illustrates both the strong and weak points in the current monitor. Finally, they describe several possible enhancements to the monitor's debugging facilities. These enhancements not only improve the monitor's reporting of dead states, but also allow a wide class of task sequencing errors to be detected and diagnosed.