All content in this area was uploaded by David P. Helmbold on Nov 05, 2014
6. Conclusion
References
[AP87] T. R. Allen and D. A. Padua. Debugging Fortran on a shared memory machine. In Proc. International Conf. on Parallel Processing, pages 721-727, 1987.
[Dij65] E. W. Dijkstra. Solution of a problem in concurrent programming control. Communications of the ACM, 8(9), September 1965.
[EGP89] P. A. Emrath, S. Ghosh, and D. A. Padua. Event synchronization analysis for debugging parallel programs. In Supercomputing '89, Reno, NV, November 1989.
[EP88] P. A. Emrath and D. A. Padua. Automatic detection of nondeterminacy in parallel programs. In Proc. Workshop on Parallel and Distributed Debugging, pages 89-99, May 1988.
[Fid88] C. J. Fidge. Partial orders for parallel debugging. In Proc. Workshop on Parallel and Distributed Debugging, pages 183-194, May 1988.
[GPH*88] M. D. Guzzi, D. A. Padua, J. P. Hoeflinger, and D. H. Lawrie. Cedar Fortran and other vector and parallel Fortran dialects. In Proceedings Supercomputing '88, pages 114-121, 1988.
[IBM88] Parallel FORTRAN language and library reference. IBM, 1988.
[Lam78] L. Lamport. Time, clocks, and the ordering of events in a distributed system. CACM, 21(7):558-565, July 1978.
[Lam86] L. Lamport. The mutual exclusion problem: part I - a theory of interprocess communication. JACM, 33(2):290-312, April 1986.
[Mat88] F. Mattern. Virtual time and global states of distributed systems. In M. Cosnard, editor, Proceedings of Parallel and Distributed Algorithms, 1988.
[McD89] C. E. McDowell. A practical algorithm for static analysis of parallel programs. Journal of Parallel and Distributed Computing, June 1989.
[NM89] R. Netzer and B. P. Miller. Detecting Data Races in Parallel Program Executions. Technical Report 894, University of Wisconsin-Madison, November 1989.
[Tay84] R. N. Taylor. Debugging Real-Time Software in a Host-Target Environment. Technical Report 212, U.C. Irvine, 1984.
of events. We feel that this is misleading: an execution is more properly viewed as a
partial ordering on the events. Fidge and Mattern have pioneered the use of time vectors
to represent these partial orders. We have extended this approach by using time vectors
to analyze sets of executions rather than just capturing a single execution.
After adding the virtual edge from BW1 to CW1, CW1 becomes the second wait on S1. Using Algorithm 5:
$S(\mathrm{BW1},\mathrm{CW1}) = \{$(BW1,CW1), (BW1,CS1), (BW1,CS2), (BS1,CW1), (BS1,CS1), (BS1,CS2)$\}$.
After adding the virtual edge from CS1 to BW1, BW1 becomes the second wait on S1. Again using Algorithm 5:
$S(\mathrm{CW1},\mathrm{BW1}) = \{$(BW1,CW1), (BS1,CW1), (BS2,CW1), (BW1,CS1), (BS1,CS1), (BS2,CS1)$\}$.
$S(\mathrm{BW1},\mathrm{CW1}) \cap S(\mathrm{CW1},\mathrm{BW1}) = \{$(BW1,CW1), (BW1,CS1), (BS1,CS1), (BS1,CW1)$\}$
Figure 5.1: Detect Critical Regions
The problem is made even more difficult when there is no clear correspondence between
the blocking and enabling events in the trace.
This paper contains a series of algorithms for extracting useful information from
sequential traces with anonymous synchronization. The first algorithm is very similar to
the vector timestamp methods of Fidge and Mattern [Fid88, Mat88]. The other algorithms
systematically manipulate these vectors of timestamps in order to discover pairs of events
that must be ordered in every execution which is consistent with the trace. In addition to
presenting our algorithms, we have also proved their correctness.
Although our algorithms find many of these "must-be-ordered" relationships, we have
been unable to prove that they find all of them. We are investigating additional procedures
which can increase the number of "must-be-ordered" relationships found. We would also
like to distinguish all pairs of events that are concurrent in some consistent execution from
pairs of events which can happen in either order, but not concurrently.
Some parallel programming environments view a parallel execution as a linear sequence
If $s - w \ge 2 \Rightarrow e \parallel e'$, i.e., if there are enough signals for both waits to proceed, then the two waits can happen concurrently.
If $s - w = 1 \Rightarrow \neg(e \parallel e')$, i.e., if there is only one signal allowing a wait to proceed, then we can conclude that they cannot happen concurrently. The starting points of critical regions have been found. The following procedure is used to determine the unordered sequential event pairs in the critical region.
1. First, assume that event $e$ happened before $e'$. Thus $e'$ is the $w+2$nd wait for $S$. Use Algorithm 4 with $k = w + 2$ to calculate time vectors for event $e'$ and other events.
Let $S(e, e') = \{(e_i, e_j) : (e_i, e_j) \in \mathrm{Conc},\ e_i \in E_i,\ e_j \in E_j$, and $\hat\tau(e_i)[i] \le \hat\tau(e_j)[i]$ or $\hat\tau(e_j)[j] \le \hat\tau(e_i)[j]\}$.
Undo the timestamp updating.
2. Similarly, assume that event $e'$ happened before $e$. Thus $e$ is the $w+2$nd wait for $S$. Use Algorithm 4 with $k = w + 2$ to calculate time vectors for event $e$ and other events.
Let $S(e', e) = \{(e_i, e_j) : (e_i, e_j) \in \mathrm{Conc},\ e_i \in E_i,\ e_j \in E_j$, and $\hat\tau(e_i)[i] \le \hat\tau(e_j)[i]$ or $\hat\tau(e_j)[j] \le \hat\tau(e_i)[j]\}$.
Undo the timestamp updating.
Let $\mathrm{Seq}_t = S(e, e') \cap S(e', e)$. Notice that $\mathrm{Seq}_t$ holds the set of unordered event pairs in the critical region. They are not concurrent in any execution, whether $e$ happened before $e'$ or $e'$ occurred before $e$.
3. Let $\mathrm{Seq} = \mathrm{Seq} \cup \mathrm{Seq}_t$.
Let $\mathrm{Conc} = \mathrm{Conc} - \mathrm{Seq}_t$.
$s - w \le 0$ means neither of them can proceed. In this case, there is a deadlock.
End Algorithm 5.
Algorithm 5 generates two sets of event pairs. Conc contains those concurrent event
pairs. Seq contains those unordered sequential event pairs. The remaining event pairs are
ordered. Figure 5.1 shows the application of this algorithm to the trace from Figure 1.1.
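The bookkeeping at the end of Algorithm 5 can be illustrated with a short Python sketch. It is not the authors' implementation: the names `classify_wait_pair` and `move_sequential_pairs` are hypothetical, pairs are modeled as ordered tuples rather than unordered sets, and the starting Conc set is only a stand-in built from the two sets listed in Figure 5.1.

```python
def classify_wait_pair(s: int, w: int) -> str:
    """Apply the s - w test of Algorithm 5 to two unordered waits."""
    if s - w >= 2:
        return "concurrent"   # enough signals for both waits to proceed
    if s - w == 1:
        return "sequential"   # only one wait can proceed at a time
    return "deadlock"         # s - w <= 0: neither wait can proceed


def move_sequential_pairs(conc, seq, s_fwd, s_bwd):
    """Compute Seq_t = S(e,e') & S(e',e) and move it from Conc to Seq."""
    seq_t = s_fwd & s_bwd
    return conc - seq_t, seq | seq_t, seq_t


# The two sets from Figure 5.1.
s_bw1_cw1 = {("BW1", "CW1"), ("BW1", "CS1"), ("BW1", "CS2"),
             ("BS1", "CW1"), ("BS1", "CS1"), ("BS1", "CS2")}
s_cw1_bw1 = {("BW1", "CW1"), ("BS1", "CW1"), ("BS2", "CW1"),
             ("BW1", "CS1"), ("BS1", "CS1"), ("BS2", "CS1")}

# Assumption: use the union of both sets as a stand-in for Conc.
conc0 = s_bw1_cw1 | s_cw1_bw1
conc, seq, seq_t = move_sequential_pairs(conc0, set(), s_bw1_cw1, s_cw1_bw1)
```

With the Figure 5.1 data, `seq_t` holds the four pairs given in the figure's intersection, and those pairs are removed from the stand-in Conc set.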
6 Conclusion
One of the most difficult tasks in debugging parallel programs is determining the timing
relationships between the events performed by the parallel program. Although several
parallel systems include facilities for creating a trace of the significant events, the sequential
nature of the trace makes it difficult to determine which events could have happened in
parallel.
From equation 4.6 we know that at most $k$ non-shadowed signals (excluding $e_i$) in $R(e)$ do not follow $e_i$ (i.e. the $k+1$st smallest and later always follow $e_i$).
Therefore, in every execution, at least one of the $k+1$ non-shadowed signals preceding $e$ follows (or is equal to) $e_i$.
By transitivity $e_i$ happens before $e$ in every execution consistent with the trace, so $e_i$ must happen before $e$.
5 Adjusting the Timestamps to Determine Concurrency
Up to now, we have computed a partial order that reflects a safe order relation between
events from the trace $E$. Given any two events $e_i \in E_i$ and $e_j \in E_j$, if $\hat\tau(e_i)[i] \le \hat\tau(e_j)[i]$ or
$\hat\tau(e_j)[j] \le \hat\tau(e_i)[j]$ then the two events are ordered. Otherwise, $e_i$ and $e_j$ are two unordered
events. The unordered events are not necessarily concurrent events. They may have to
occur sequentially. In this case, we call them unordered sequential events. For example,
if the program has a properly implemented lock around a critical region, then different
executions may have tasks entering the critical region in different orders. In no execution,
however, do two tasks concurrently enter the critical region.
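The ordered test stated at the start of this section reduces to two component comparisons. The following Python sketch shows it under stated assumptions: the function name `ordered` is hypothetical, tasks are numbered from 0, and the sample vectors are illustrative values rather than vectors taken from the paper's figures.

```python
def ordered(tau_ei, i, tau_ej, j):
    """Return True if e_i (task i) and e_j (task j) are ordered.

    tau_ei and tau_ej are the time vectors of the two events; the test
    compares only the components owned by the two tasks involved.
    """
    return tau_ei[i] <= tau_ej[i] or tau_ej[j] <= tau_ei[j]


# Illustrative vectors for a three-task trace (tasks 0, 1, 2).
first_in_task0 = (1, 0, 0)
second_in_task0 = (2, 0, 0)
first_in_task2 = (1, 0, 1)

is_ordered = ordered(first_in_task0, 0, first_in_task2, 2)
is_unordered = not ordered(second_in_task0, 0, first_in_task2, 2)
```

A pair that fails this test is only unordered; as the text explains, it may still be unordered sequential rather than concurrent.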
When debugging parallel programs, we would like to distinguish those pairs of events
that are concurrent in some consistent execution from pairs of events which can happen
in either order, but not concurrently. Unfortunately, the concurrent relation cannot be
determined immediately from the timestamps. We cannot necessarily say $e_i$ can happen
concurrently with event $e_j$ even if we know $\hat\tau(e_i) \parallel \hat\tau(e_j)$. As an example, in Figure 4.2,
even though $\hat\tau(\mathrm{BW1}) \parallel \hat\tau(\mathrm{CW1})$, the two W1 events cannot occur at the same time. It is,
in general, a hard problem to determine whether two unordered events can really happen
concurrently.
Let $e, e' \in E$ be a pair of events. Event $e$ may happen concurrently with $e'$ only
if $\hat\tau(e) \parallel \hat\tau(e')$. The following procedure can be used to detect critical regions, and
to determine unordered sequential event pairs in critical regions. The algorithm will
calculate two sets. The set Conc contains concurrent event pairs, while the set Seq
contains unordered sequential event pairs. Initially, we assume that all unordered events are
potential concurrent events. Once some critical regions have been detected, the algorithm
will move those unordered sequential event pairs from Conc to Seq.
Algorithm 5:
Initially let $\mathrm{Conc} = \{\{e, e'\} : e, e' \in E$, and $\hat\tau(e) \parallel \hat\tau(e')\}$. Let $\mathrm{Seq} = \emptyset$.
Repeat the following procedure until no more changes are possible.
Pick any two unordered wait events $e$ and $e'$ for semaphore $S$ where $(e, e') \in \mathrm{Conc}$.
Let $G(e, e')$ be the set of wait events for semaphore $S$ which precede either event $e$ or $e'$ (based on current timestamps $\hat\tau$).
Let $R(e, e') = \{e'' : e''$ is a signal event using $S$ and $e''$ precedes $e$ or $e'\} \cup \{e'' : e''$ is not shadowed with respect to either $e$ or $e'$, and $e''$ does not follow either $e$ or $e'\}$.
Let $s = |R(e, e')|$ and $w = |G(e, e')|$.
Figure 4.2: Expanding the Safe Order Relation
Case 1: Assume $\hat\tau(e)[i] = \hat\tau(e^p)[i]$. Then
$\hat\tau(e_i)[i] \le \hat\tau(e^p)[i]$
$\Rightarrow$ $e_i$ must happen before $e^p$, by the induction hypothesis (4.4)
$\Rightarrow$ $e_i$ must happen before $e$, by transitivity (4.5)
Case 2: $\hat\tau(e)[i] \ne \hat\tau(e^p)[i]$.
Event $e$ is a wait on semaphore $S$. Let $k$ be computed as specified in the algorithm, then
$\hat\tau(e_i)[i] \le \hat\tau(e)[i] = \min_{k+1}\{\hat\tau(e_s)[i] : e_s \in R(e)\}$ (4.6)
where the non-shadowed signal set $R(e)$ is computed according to the algorithm, and $\min_{k+1}$ selects the $k+1$st smallest value from the set.
In every execution, at least $k+1$ signal events precede $e$, since at least $k$ waits on the same semaphore must happen before $e$.
In any arbitrary execution P, let $k_s$ be the number of shadowed signals (with respect to $e$) that precede $e$.
By transitivity the corresponding $k_s$ shadowing waits precede $e$, and at least $k + k_s$ waits on $S$ precede wait event $e$.
Therefore, at least $k + k_s + 1$ signal events precede $e$ in the execution and $k+1$ of them are non-shadowed signals.
Therefore, the signal event $e'_s$ is shadowed by some wait event $e'_w$, where $e_s$ precedes $e'_w$ and $e'_w$ precedes $e'_s$, with respect to $e$.
This forms a contradiction with the assumption that $e'_s$ is shadowed by $e_w$.
Algorithm 4 is based on the following observation. If $e$ is a wait event on semaphore $S$ and $k$ other wait events on $S$ must happen before $e$, then at least $k+1$ non-shadowed signal events happen before $e$ in every execution consistent with the trace.
Algorithm 4:
Initially $\hat\tau(e) = \tau'(e)$ for all events $e \in E$.
Repeat the following procedure until no more changes are possible.
Pick an event $e$. If $e$ is a wait event using semaphore $S$, let
$W(S)$ be the set of wait events on semaphore $S$,
$k$ be the number of wait events $e_w \in W(S)$ such that $e_w \ne e$ and, if $e_w \in E_i$, then $\hat\tau(e_w)[i] \le \hat\tau(e)[i]$,
$R(e) = \{\hat e : \hat e$ is a signal event on $S$, $e$ does not precede $\hat e$ as indicated by the $\hat\tau$ timestamps, and $\hat e$ is not shadowed with respect to $e\}$, and
$v_s$ = the $k+1$st component-wise minimum of $\hat\tau(\hat e)$ for $\hat e \in R(e)$.
If $e$ is not a wait event, let $v_s$ be the 0 vector.
$\hat\tau(e) = \max(\hat\tau(e^p), \tau^{\#}(e), v_s)$
End Algorithm 4.
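The one non-obvious operation in Algorithm 4 is the "$k+1$st component-wise minimum". The sketch below shows one reading of it in Python, where each component of the result is the $k+1$st smallest value of that component over the given vectors. The helper name `kth_componentwise_min` is hypothetical, and computing $k$ and the non-shadowed set $R(e)$ is left to the caller.

```python
def kth_componentwise_min(vectors, k):
    """Return the (k+1)-st component-wise minimum of the given vectors.

    For every component c, the result holds the (k+1)-st smallest value
    found in position c among all vectors (k = 0 gives the ordinary
    component-wise minimum).
    """
    vectors = list(vectors)
    if len(vectors) <= k:
        raise ValueError("need at least k + 1 vectors")
    # zip(*vectors) yields one column per component; sort each column.
    return tuple(sorted(column)[k] for column in zip(*vectors))


# Illustrative vectors for three signal events in a three-task trace.
signal_vectors = [(1, 0, 2), (1, 2, 0), (1, 0, 0)]
v_min = kth_componentwise_min(signal_vectors, 0)
v_third = kth_componentwise_min(signal_vectors, 2)
```

Raising an error when fewer than $k+1$ vectors are supplied is a design choice of this sketch; the text does not say how that situation should be handled.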
Figure 4.2 shows the new $\hat\tau$ timestamps generated when Algorithm 4 is executed starting
with Figure 3.1.
Theorem 5: Algorithm 4 generates only safe order relations, i.e., for any two events $e_i \in E_i$ and $e \in E$:
$\hat\tau(e_i)[i] \le \hat\tau(e)[i] \Rightarrow e_i$ must happen before $e$.
Proof: The proof is by induction on the number of updates. As a base case the theorem holds for the initial values of $\hat\tau$ from Theorem 4.
Assume the theorem holds before some update. Consider two events $e_i \in E_i$ and $e \in E$ where $\hat\tau(e_i)[i] > \hat\tau(e)[i]$ before the update, and $\hat\tau(e_i)[i] \le \hat\tau(e)[i]$ after the update.
Because $\hat\tau(e_i)[i]$ never changes, $\hat\tau(e)[i]$ was updated.
We consider two cases.
Figure 4.1: Shadowed Signal Event
Since for each shadowed signal there is only one corresponding shadowing wait (by Definition 15), we have $|R^i_s(e)| \ge |R^i_w(e)|$.
We only need to show that $|R^i_s(e)| = |R^i_w(e)|$.
Assume to the contrary that $|R^i_s(e)| > |R^i_w(e)|$, which means that there are at least two signals $e_s$ and $e'_s$ in $R^i_s(e)$ shadowed by some $e_w \in R^i_w(e)$.
Assume $e_w$ precedes $e_s$, which precedes $e'_s$. Let $w_1$ and $s_1$ be the number of waits and signals on $S$ performed by $T_i$ between $e_w$ and $e_s$, and let $w_2$ and $s_2$ be the number of waits and signals on $S$ performed by $T_i$ between $e_s$ and $e'_s$. This is shown in the following, which represents the local subsequence of events performed by some task, where time moves from left to right.
        |<-- s_1 -->|   |<-- s_2 -->|
  ...  e_w   ...   e_s   ...   e'_s  ...
        |<-- w_1 -->|   |<-- w_2 -->|
Therefore,
$w_1 = s_1$ since $e_s$ is shadowed by $e_w$ (4.1)
$w_1 + w_2 = s_1 + s_2 + 1$ since $e'_s$ is shadowed by $e_w$ (4.2)
Combining equations 4.1 and 4.2 gives us
$w_2 = s_2 + 1$ (4.3)
However, equation 4.3 means that the subsequence between $e_s$ and $e'_s$ contains more waits on $S$ than signals.
add additional safe orderings into the partial order using the fact that only some wait
events in the trace can actually proceed immediately after each signal event. The partial
order resulting from this final step will be represented by the time vectors $\hat\tau(e)$. Initially,
$\hat\tau(e) = \tau'(e)$.
Definition 14: Let $e \in E_i$ be a wait event and $e_s \in E_j$ be a signal event on the same semaphore $S$ where $\hat\tau(e) \parallel \hat\tau(e_s)$. Let $E(e, e_s)$ be the subsequence of $E_j$ containing every event $e_j$ where $e_j$ precedes $e_s$ and $\hat\tau(e_j) \parallel \hat\tau(e)$. If any suffix of $E(e, e_s)$ contains more wait events on $S$ than signal events on $S$, then the signal event $e_s$ is shadowed with respect to $e$.
Definition 15: Let $E'(e, e_s)$ be the shortest suffix of $E(e, e_s)$ which contains more wait events than signal events on $S$, and let $e_w$ be the first event of $E'(e, e_s)$. We say $e_s$ is shadowed by event $e_w$ with respect to $e$.
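Definitions 14 and 15 both come down to scanning suffixes of $E(e, e_s)$ while counting waits against signals. The Python sketch below illustrates that scan under stated assumptions: the caller has already built $E(e, e_s)$ and reduced it to a list of labels, and the labels `"wait"`, `"signal"`, and `"other"` and the function name `shadowing_wait_index` are hypothetical.

```python
def shadowing_wait_index(kinds):
    """Locate the shadowing wait of Definition 15, if there is one.

    kinds lists the events of E(e, e_s) in task order, each labelled
    "wait" or "signal" (on semaphore S) or "other".  The function returns
    the index of the first event of the shortest suffix holding more
    waits than signals, or None when no such suffix exists, in which case
    e_s is not shadowed with respect to e.
    """
    balance = 0  # waits minus signals in the suffix scanned so far
    for idx in range(len(kinds) - 1, -1, -1):
        if kinds[idx] == "wait":
            balance += 1
        elif kinds[idx] == "signal":
            balance -= 1
        if balance > 0:
            return idx  # shortest qualifying suffix starts here
    return None


only_wait = shadowing_wait_index(["wait"])
balanced = shadowing_wait_index(["wait", "signal"])
```

Scanning from the end means the first position where waits outnumber signals starts the shortest qualifying suffix, and the event at that position is necessarily a wait, in line with Lemma 2.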
Lemma 2: Given a wait event $e$ and a signal event $e_s$ on the same semaphore $S$, if $e_s$ is shadowed by some event $e_w$ with respect to $e$ then
1. Event $e_w$ is a wait event on semaphore $S$,
2. The event $e_w$, which shadows $e_s$ with respect to $e$, is unique. We define $e_w$ to be the shadowing wait event corresponding to $e_s$, and
3. The subsequence between $e_w$ and $e_s$ (in the same task) contains as many signal events as wait events on semaphore $S$.
Proof: The proof is straightforward from the definitions.
Definition 16: For any wait event $e \in E$, let
$R_s(e) = \{e_s : e_s$ is shadowed with respect to $e\}$, and
$R_w(e) = \{e_w : \exists\, e_s \in R_s(e)$ s.t. $e_s$ is shadowed by $e_w$ with respect to $e\}$.
In the example shown in Figure 4.1, the signal event CS1 is shadowed by CW1 with respect to two wait events performed by task B.
Lemma 3: For any wait event $e \in E$, the correspondence between shadowed signal and shadowing wait is one to one, i.e.,
$|R_s(e)| = |R_w(e)|$.
Proof: Let $e \in E$ be a wait event on semaphore $S$. From Definitions 14 and 15, we know that any pair of corresponding shadowed signal and shadowing wait belongs to the same task.
Therefore, it is enough to show that the correspondence between shadowed signal and shadowing wait is one to one within each task $T_i$ where $1 \le i \le n$.
Let $R^i_s(e)$ and $R^i_w(e)$ be the sets of shadowed signal events and shadowing wait events performed by task $T_i$ with respect to $e$.
Equation 3.7 implies for some $c$
$\max(\tau^P(e^p), \tau^{\#}(e), \tau^P(e_s))[c] < \max(\tau'(e^p), \tau^{\#}(e), \min(\tau'(e_{s_1}), \ldots, \tau'(e_{s_m})))[c]$. (3.8)
Equations 3.5 and 3.8 imply
$\tau^P(e_s)[c] < \min(\tau'(e_{s_1}), \ldots, \tau'(e_{s_m}))[c]$ (3.9)
$\tau^P(e_s)[c] < \tau'(e_s)[c]$ (3.10)
$\tau^P(e_s) \not\ge \tau'(e_s)$ (3.11)
Again 3.11 contradicts the assumption that $e$ is the first event in the topological order of the partial order P such that $\tau^P(e) \not\ge \tau'(e)$.
Therefore, there is no event $e$ in any execution P such that $\tau^P(e) \not\ge \tau'(e)$.
Theorem 4: After rewinding, we have a partial order that is a safe order relation, i.e.
$\tau'(e_i) < \tau'(e) \Rightarrow e_i$ must happen before $e$.
Proof: Let $i$ be the task performing $e_i$, so $\tau'(e_i)[i] = \tau^P(e_i)[i] = \tau^{\#}(e_i)[i]$.
$\tau'(e_i)[i] \le \tau'(e)[i]$ from the hypothesis (3.12)
$\tau'(e_i)[i] \le \tau^P(e)[i]$ by Lemma 1 (3.13)
$\tau^P(e_i)[i] \le \tau^P(e)[i]$ since $e_i \in E_i$ (3.14)
$e_i \xrightarrow{P} e$ for all $P$, and (3.15)
$e_i$ must happen before $e$, from the definition of the safe order relation. (3.16)
The rewinding process is based on the fact that any signal event might enable any
wait event on the same semaphore. We may have lost some safe order relations during
rewinding. As an example, in Figure 3.1, time vector $\tau'$ says that two W2 events and
the W1 event in task A may happen concurrently with all of the events in task B and C.
However, it is obvious that the W1 in task A must happen after the two S1 events in task
B and C, and the second W2 in task A has to wait until all of the events in B and C have
occurred. The final step in the algorithm will find some of the order relations lost during
the rewinding procedure.
4 Expanding the Safe Order Relation
The result of the rewind step is a partial order that is a safe order relation. It is an
overly conservative safe order relation because it assumed that any wait could happen
immediately after any signal for the same semaphore. We now undertake a process to
From the inductive hypothesis
$\tau'(e) \le \min(\tau'(e_{S_1}), \ldots, \tau'(e_{S_k}))$,
and from the algorithm
$\min(\tau'(e_{S_1}), \ldots, \tau'(e_{S_k})) < \tau'(\hat e)$.
Therefore $\tau'(e) < \tau'(\hat e)$.
After rewinding, we have a partial order that is a safe order relation. If event $e_i$ has
an earlier time vector than $e$, we can say $e_i$ will happen before $e$ in all executions that are
consistent with the given trace. Before we prove this in Theorem 4 we first present one
lemma used in the proof.
Lemma 1: For any execution $P$ consistent with a trace $E$ and for all events $e \in E$,
$\tau^P(e) \ge \tau'(e)$.
Proof: Assume to the contrary that there is an execution $P$ and some event $e$ such that $\tau^P(e) \not\ge \tau'(e)$.
In any topological ordering (with respect to partial order P) of events in E, let $e$ be the first event in the topological ordering such that $\tau^P(e) \not\ge \tau'(e)$.
We consider two cases.
Case 1: If $e$ is not a wait event then from Algorithms 1 and 3:
$\tau'(e) = \max(\tau'(e^p), \tau^{\#}(e))$ (3.3)
$\tau^P(e) = \max(\tau^P(e^p), \tau^{\#}(e))$ (3.4)
Note that $\tau^P(e^p) \not\ge \tau'(e^p)$ by our choice of $e$. This contradicts the assumption that $e$ is the first event in the topological order of the partial order P such that $\tau^P(e) \not\ge \tau'(e)$.
Case 2: Event $e$ is a wait event. From the choice of $e$ we get
$\tau^P(e^p) \ge \tau'(e^p)$ (3.5)
$\tau^P(e) \not\ge \tau'(e)$ (3.6)
Substituting the definitions of $\tau^P$ and $\tau'$ into 3.6 gives:
$\max(\tau^P(e^p), \tau^{\#}(e), \tau^P(e_s)) \not\ge \max(\tau'(e^p), \tau^{\#}(e), \min(\tau'(e_{s_1}), \ldots, \tau'(e_{s_m})))$ (3.7)
where $e_s$ is the corresponding signal event of $e$ and thus appears before $e$ in the topological order, and each $e_{s_i}$ for $1 \le i \le m$ is one of the $m$ signal events for the semaphore waited on by $e$.
Figure 3.1: Rewinding the Time Vectors
Proof: It is enough to show that $\tau'(e)[i] \le \tau'(\hat e)[i] \Rightarrow \tau'(e) < \tau'(\hat e)$. The opposite direction follows directly from the definition of vector comparison.
The proof is by induction on the number of updates made by Algorithm 3. As the base case, from Theorem 2, the theorem holds for the initial $\tau'$ values.
Assume the theorem holds before some update. Consider two arbitrary events, $e \in E_i$ and $\hat e \in E_j$, after updating a single time vector.
Since Algorithm 3 does not change $\tau'(e)[i]$ and never increases time vectors, updating $\tau'(e)$ cannot make $\tau'(e)[i] \le \tau'(\hat e)[i] \Rightarrow \tau'(e) < \tau'(\hat e)$ false. Therefore, we consider three cases when $\tau'(\hat e)$ was updated.
Case 1: $i = j$. If $e = \hat e^p$ then from the algorithm $\tau'(e) < \tau'(\hat e)$. Otherwise, $\tau'(e) < \tau'(\hat e^p)$ which implies $\tau'(e) < \tau'(\hat e)$.
Case 2: $i \ne j$ and $\tau'(\hat e)[i] = \tau'(\hat e^p)[i]$.
This implies $\tau'(e)[i] \le \tau'(\hat e^p)[i]$. Since neither $\tau'(\hat e^p)$ nor $\tau'(e)$ changed, by the induction hypothesis $\tau'(e) \le \tau'(\hat e^p)$, and the algorithm ensures that $\tau'(\hat e^p) < \tau'(\hat e)$. Therefore $\tau'(e) < \tau'(\hat e)$.
Case 3: $i \ne j$ and $\tau'(\hat e)[i] \ne \tau'(\hat e^p)[i]$. This implies that $\hat e$ is a wait event for some semaphore $S$.
Let $e_{S_1}, \ldots, e_{S_k}$ be the signal events for the semaphore S. From the algorithm definition and the assumption we know
$\tau'(e)[i] \le \tau'(\hat e)[i] = \min(\tau'(e_{S_1}), \ldots, \tau'(e_{S_k}))[i]$.
3 Rewinding the Time Vectors
The result of the initialize step in the previous section is an unsafe order relation. It is
unsafe because we assumed that the $k$th signal event for a particular semaphore was the
one allowing the $k$th wait event to proceed. The next step is to rewind the time vectors to
account for the fact that any signal event might be the one that allowed any wait event on
the same semaphore to complete. We use $\tau'(e)$ to represent the new time vector assigned
to event $e$ during and after the rewinding process. Initially $\tau'$ is the same as $\tau$.
Suppose $e$ is a wait event, and $e_1$ and $e_2$ are two signal events, either of which could
have caused $e$ to complete. In this case, we only know that either $e_1$ or $e_2$ must have
happened before $e$. The trace might be in any of the forms:
$\ldots, e_1, \ldots, e, \ldots, e_2, \ldots$;
$\ldots, e_2, \ldots, e, \ldots, e_1, \ldots$;
$\ldots, e_1, \ldots, e_2, \ldots, e, \ldots$; or
$\ldots, e_2, \ldots, e_1, \ldots, e, \ldots$.
However, we can conclude that the common ancestors of $e_1$ and $e_2$ must occur before $e$.
Therefore if $e_a$ must happen before $e_1$ and $e_a$ must happen before $e_2$, then $e_a$ must happen before $e$. The rewind step defined below uses this fact
to obtain a safe order relation.
Algorithm 3:
Initially, $\forall e \in E$, $\tau'(e) = \tau(e)$.
Repeat the following procedure until no further changes are possible.
For all events $e \in E$, let
$\tau'(e) = \max(\tau'(e^p), \tau^{\#}(e), v_s)$
where if $e$ is a wait event on semaphore S:
$v_s = \min(\tau'(e_{s_1}), \ldots, \tau'(e_{s_k}))$ (3.1)
where $e_{s_1}, \ldots, e_{s_k}$ are all the signal events for the semaphore S, (3.2)
otherwise $v_s$ is the 0 vector.
End Algorithm 3.
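A minimal Python sketch of the rewind step follows. It is an illustration rather than the authors' code: events are modeled as `(task, kind, semaphore)` tuples with tasks numbered from 0, the function name `rewind` is hypothetical, and the initial vectors in the example were worked out by hand by applying Algorithm 2 to the example trace AS1, CW1, CS1, CS2, BW1, BS1, BS2, AW2, AW2, AW1 (tasks A, B, C numbered 0, 1, 2), so they should be checked against Figure 1.1(a).

```python
def vmax(*vs):
    return tuple(max(c) for c in zip(*vs))


def vmin(*vs):
    return tuple(min(c) for c in zip(*vs))


def rewind(events, tau, n_tasks):
    """Iterate the update of Algorithm 3 until no vector changes."""
    zero = (0,) * n_tasks
    local, prev = [], []                 # tau^#(e) and index of e^p
    last, count = [None] * n_tasks, [0] * n_tasks
    signals = {}                         # semaphore -> indices of its signals
    for idx, (task, kind, sem) in enumerate(events):
        count[task] += 1
        local.append(tuple(count[task] if t == task else 0
                           for t in range(n_tasks)))
        prev.append(last[task])
        last[task] = idx
        if kind == "signal":
            signals.setdefault(sem, []).append(idx)
    rew = list(tau)
    changed = True
    while changed:
        changed = False
        for idx, (_, kind, sem) in enumerate(events):
            v_p = rew[prev[idx]] if prev[idx] is not None else zero
            sigs = signals.get(sem, [])
            # v_s: minimum over ALL signals on the semaphore (eq. 3.1)
            v_s = vmin(*(rew[s] for s in sigs)) if kind == "wait" and sigs else zero
            new = vmax(v_p, local[idx], v_s)
            if new != rew[idx]:
                rew[idx], changed = new, True
    return rew


A, B, C = 0, 1, 2
trace = [(A, "signal", "S1"), (C, "wait", "S1"), (C, "signal", "S1"),
         (C, "signal", "S2"), (B, "wait", "S1"), (B, "signal", "S1"),
         (B, "signal", "S2"), (A, "wait", "S2"), (A, "wait", "S2"),
         (A, "wait", "S1")]
# Hand-computed initial vectors (assumption, see the note above).
tau0 = [(1, 0, 0), (1, 0, 1), (1, 0, 2), (1, 0, 3), (1, 1, 2),
        (1, 2, 2), (1, 3, 2), (2, 0, 3), (3, 3, 3), (4, 3, 3)]
tau_rewound = rewind(trace, tau0, 3)
```

In this sketch the three wait events of task A end with vectors whose B and C components are zero, which is consistent with the later remark that, after rewinding, those waits appear unordered with respect to all events of tasks B and C.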
Observe that the only difference between Algorithm 3 and Algorithm 2 (used to
compute $\tau$) is that for wait events in Algorithm 3, $v_s$ is the minimum of a set of time
vectors, which includes the time vector used for $v_s$ in computing $\tau$. Therefore the values
of $\tau'$ will only get smaller as Algorithm 3 executes.
Theorem 3: For any two distinct events $e \in E_i$, $\hat e \in E_j$,
$\tau'(e)[i] \le \tau'(\hat e)[i] \iff \tau'(e) < \tau'(\hat e)$
2. If $e_l$ is the corresponding signal event and $e_i \to e_l$ then $\tau(e_i)[i] \le \tau(e_l)[i]$ and $e_l \to e \Rightarrow \tau(e_l)[i] \le \tau(e)[i]$ and the result follows.
3. If $e_l$ is the corresponding signal event and $e_i \not\to e_l$ then
$e_i \to e^p$ Property 8 (2.8)
$\tau(e_i)[i] \le \tau(e^p)[i]$ from the inductive hypothesis (2.9)
$\tau(e^p)[i] \le \tau(e)[i]$ from the definition of $\tau$ (2.10)
and the result follows.
Theorem 2: For any two distinct events $e_i \in E_i$, $e \in E$,
$\tau(e_i)[i] \le \tau(e)[i] \iff \tau(e_i) < \tau(e)$
Proof: The $\Leftarrow$ direction is trivial. For the $\Rightarrow$ direction, assume to the contrary that there are two events, $e_i \in E_i$ and $e \in E$ where $\tau(e_i)[i] \le \tau(e)[i]$ but $\tau(e_i) \not< \tau(e)$. Thus there is some vector component $c$ such that
$\tau(e_i)[c] > \tau(e)[c]$. (2.11)
Let $e_c$ be the event occurring in $E_c$ with sequence number $\tau(e_i)[c]$, then
$\tau(e_c)[c] = \tau(e_i)[c]$. (2.12)
$e_c \to e_i$ from Theorem 1 and 2.12 (2.13)
$e_i \to e$ from the hypothesis and Theorem 1 (2.14)
$e_c \to e$ from transitivity of $\to$ (2.15)
$\tau(e_c)[c] \le \tau(e)[c]$ from 2.15 and Theorem 1 (2.16)
Combining 2.12 and 2.16 forms a contradiction with 2.11.
Corollary 1: For any two distinct events $e \in E_i$, $\hat e \in E_j$, $i \ne j$:
$\tau(e)[i] > \tau(\hat e)[i]$ and $\tau(\hat e)[j] > \tau(e)[j] \Rightarrow e \parallel \hat e$
The initialization process creates a partial ordering of the events in the trace. This
partial ordering corresponds to an execution which is strongly consistent with the trace.
It describes the happened before relation for the canonical execution.
Unfortunately, this partial order only gives the happened before relationships between
events for the canonical execution, i.e. it is an unsafe order relation. The $k$th signal event in
one execution might not necessarily be the $k$th signal in some other execution. Therefore,
event $e$ may not happen before $\hat e$ in some other execution even if it did in this execution.
Even when $\tau(e) < \tau(\hat e)$ we cannot say $e$ must happen before $\hat e$.
Proof: From Properties 3 and 7 we know that if either side holds then $e_i$ appears before $e$ in the trace. Therefore, it suffices to prove that whenever the algorithm assigns a time vector to some event $e$, and $e_i$ is any event appearing earlier in the trace (and thus already assigned a time vector by the algorithm) the two conditions are equivalent. We prove this by induction on the position of $e$ in the trace.
After the first event is assigned a time vector, the theorem trivially holds as no distinct pairs of events have been assigned time vectors. We now show that the time vector assigned to the next event, $e$, satisfies the theorem assuming that the time vectors assigned to all events appearing before $e$ in the trace satisfy the theorem.
We first show that assuming $\tau(e_i)[i] \le \tau(e)[i]$ then $e_i \to e$. If $e \in E_i$, so that the two events are in the same task, $T_i$, the implication follows because the selected vector component is the event count for task $T_i$. Otherwise the events occur in different tasks and $\tau(e)[i] = \tau(\hat e)[i]$ where $\hat e$ is either $e^p$ or possibly $e_j$ if $e$ is a wait event and $e_j$ is the corresponding signal event.
In either case $\hat e$ has previously been assigned a time vector and
$\tau(e)[i] = \tau(\hat e)[i]$ by the definition of $\tau$ (2.1)
$\tau(e_i)[i] \le \tau(\hat e)[i]$ from the assumption (2.2)
$\hat e \to e$ from the definition of $\hat e$ (2.3)
Either $e_i = \hat e$ and the theorem is proven or by the induction hypothesis $e_i \to \hat e$, and by transitivity $e_i \to e$.
To prove that $e_i \to e \Rightarrow \tau(e_i)[i] \le \tau(e)[i]$ we consider three cases.
Case 1: If $e \in E_i$, so that the two events are in the same task, the result follows from Properties 3 and 4.
Case 2: If $e$ is not a wait event then
$\tau(e)[i] = \tau(e^p)[i]$ (2.4)
$e_i \to e^p$ Property 8 (2.5)
$\tau(e_i)[i] \le \tau(e^p)[i]$ from the hypothesis (2.6)
$\tau(e_i)[i] \le \tau(e)[i]$. (2.7)
Case 3: If $e$ is a wait event then we have three subcases:
1. If $e_i$ is the corresponding signal event then the result trivially holds.
Algorithm 2: To compute initial time vectors, $\tau(e_i)$, from a trace $E$ use Algorithm 1 with the following modifications.
1. The $k$th wait event on semaphore $S$ (in trace order) corresponds to the $k$th signal event on $S$.
2. The events are assigned time vectors in the order they appear in the trace.
End Algorithm 2.
For the given trace, Figure 1.1(a) shows the result of the initialization procedure.
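As an illustration of Algorithm 2, the following Python sketch pairs the $k$th wait on a semaphore with the $k$th signal and assigns vectors in trace order. The representation is an assumption of this sketch: events are `(task, kind, semaphore)` tuples, tasks A, B, C are numbered 0, 1, 2, and the function name `initial_vectors` is hypothetical. The trace is the example trace from Section 1.

```python
def vmax(*vs):
    return tuple(max(c) for c in zip(*vs))


def initial_vectors(events, n_tasks):
    """Assign canonical time vectors to a trace (sketch of Algorithm 2)."""
    zero = (0,) * n_tasks
    count = [0] * n_tasks      # local event counters
    last = [None] * n_tasks    # index of each task's previous event
    signals_seen = {}          # semaphore -> signal indices in trace order
    waits_seen = {}            # semaphore -> number of waits so far
    tau = []
    for idx, (task, kind, sem) in enumerate(events):
        count[task] += 1
        local = tuple(count[task] if t == task else 0 for t in range(n_tasks))
        v_t = tau[last[task]] if last[task] is not None else zero
        v_s = zero
        if kind == "signal":
            signals_seen.setdefault(sem, []).append(idx)
        elif kind == "wait":
            k = waits_seen.get(sem, 0)
            waits_seen[sem] = k + 1
            # Assumes a valid trace: the k-th signal already appeared.
            v_s = tau[signals_seen[sem][k]]
        tau.append(vmax(v_t, v_s, local))
        last[task] = idx
    return tau


A, B, C = 0, 1, 2
trace = [(A, "signal", "S1"), (C, "wait", "S1"), (C, "signal", "S1"),
         (C, "signal", "S2"), (B, "wait", "S1"), (B, "signal", "S1"),
         (B, "signal", "S2"), (A, "wait", "S2"), (A, "wait", "S2"),
         (A, "wait", "S1")]
tau = initial_vectors(trace, 3)
```

The sketch fails with a lookup error on a trace in which a wait appears before its matching signal; that is deliberate, since such a sequence would not be a valid semaphore trace.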
The time vectors computed for the canonical execution have the following properties:
Property 4: If $e$ and $\hat e$ are two events in the same task $T_i$ and $e$ occurred before $\hat e$ in the trace, then $e \to \hat e$ and $\tau(e) < \tau(\hat e)$.
Property 5: If $e$ and $\hat e$ are the corresponding signal/wait pair (the $k$th signal and the $k$th wait on the same semaphore $S$ in the trace), then $e \to \hat e$ and $\tau(e) < \tau(\hat e)$.
Property 6: At any point in the trace, the maximum value of any time vector component is the number of events performed by the corresponding task up to that point.
Property 7: If $e \in E_i$ and $\tau(e)[i] \le \tau(\hat e)[i]$ then either $e = \hat e$ or $e$ appears before $\hat e$ in the trace.
Because an event is only constrained to follow its predecessor in the same task, and in the case of wait events, the corresponding signal, the following property holds.
Property 8: If $e \to \hat e$ then one of the following is true:
1. $e = \hat e^p$,
2. $e = \hat e_s$ where $\hat e$ is a wait event and $\hat e_s$ is the corresponding signal event,
3. $e \to \hat e^p$ or
4. $e \to \hat e_s$ where $\hat e$ is a wait event and $\hat e_s$ is the corresponding signal event.
Given the correspondence between signal and wait events for execution P, events can
be assigned time vectors by using Algorithm 1. Mattern [Mat88] has shown that the time
vector $\tau^P$ correctly represents the partial order relation $\xrightarrow{P}$, i.e., for any pair of distinct
events $e_i \in E_i$ and $e \in E$,
$\tau^P(e_i)[i] \le \tau^P(e)[i] \iff e_i \xrightarrow{P} e$.
For completeness, we now prove that the initial time vectors, $\tau$, correctly represent the
happened before relation for the canonical execution.
Theorem 1: For any pair of distinct events $e_i \in E_i$ and $e \in E$,
$\tau(e_i)[i] \le \tau(e)[i] \iff e_i \to e$
1.3 An Overview of the New Algorithms
In the following sections we will introduce a series of algorithms to calculate different
time vectors for trace events. By comparing their final time vectors, we can distinguish
many ordered events from the unordered, potentially concurrent, events. Our goal is a set
of time vectors where if event $e_1$ has an earlier time vector than $e_2$, then $e_1$ will happen
before $e_2$ in all executions that are consistent with the given trace (see footnote 3).
The three phases of the algorithm are "initialize", "rewind", and "expand". The
initialization uses Algorithm 1. The resulting partial order is similar to that computed
by the algorithm of [Fid88]. This partial order is shown to be equivalent to the "happened
before", $\xrightarrow{P}$, relation for a canonical execution $P$. Note that the canonical execution is in
general not the same execution which generated the trace. The result of the rewinding
phase is a partial order that is a subrelation of the "must happen before" relation. Unfortunately this safe
order relation is overly conservative, in that there may be many "must happen before"
relations that it does not include. The third and final phase results in a safe partial order
that is closer to the "must happen before" relation.
2 Initializing the Vectors
Before giving the algorithm for computing the initial time vectors, we define a canonical
execution that will be used to verify the "correctness" of the time vectors.
Definition 13: Given a trace $E$ with the total ordering of events, $<_E$, the partial order $\xrightarrow{P}$ corresponding to the canonical execution $P$ is constructed by selecting and taking the transitive closure of the following subrelation of $<_E$.
1. If $e_i$ and $e_j$ are two events from the same task and $e_i <_E e_j$ then $e_i \xrightarrow{P} e_j$.
2. If $e_i$ and $e_j$ are the $k$th signal and wait events respectively on the same semaphore, then $e_i \xrightarrow{P} e_j$.
In the remainder of the paper we will use $\to$ to mean $\xrightarrow{P}$ where $P$ is the canonical execution defined above.
Property 3: If $e \to \hat e$ then $e$ appears before $\hat e$ in the trace.
Footnote 3: Given a specific input and a trace, there are in general executions which are not consistent with that
trace; however, any such execution will contain a race if and only if a race occurred in the execution that
generated the trace [AP87].
1. Introduction
length $n$, where $n$ is the total number of tasks (see footnote 2). Each task $T_i$ has its own vector component
$C_i[i]$ which guarantees a strict temporal ordering of events occurring in that task. A local
event counter which is incremented each time an event occurs in the task can be used as
the local clock.
Before presenting the algorithms for computing time vectors from a trace, we need to
dene some notation.
Denition 9:
For an event
e
2
E
i
,
e
p
is the previous event performed by the same task
T
i
if such an event exists.
Denition 10:
For an event
e
2
E
i
,
#
(
e
)
is the time vector containing the local event
count for
e
in the
i
th position and zeros elsewhere.
Definition 11: For any two time vectors $u, v$ in $Z^n$:
1. $u \le v \iff \forall i \, (u[i] \le v[i])$
2. $u < v \iff u \le v$ and $u \ne v$
3. $u \parallel v \iff \neg(u < v)$ and $\neg(v < u)$.
Definition 12: For any $k$ time vectors $v_1, \ldots, v_k$ of $Z^n$, $\min(v_1, \ldots, v_k)$ is a vector of $Z^n$
whose $i$th component is $\min(v_1[i], \ldots, v_k[i])$, and $\max(v_1, \ldots, v_k)$ is a vector of $Z^n$
whose $i$th component is $\max(v_1[i], \ldots, v_k[i])$.
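The comparisons of Definition 11 and the componentwise operations of Definition 12 translate directly into code. The following is a minimal sketch in which time vectors are plain Python tuples of equal length; the function names (`leq`, `less`, `concurrent`, `vmin`, `vmax`) are chosen here for illustration and do not come from the paper.

```python
def leq(u, v):
    """u <= v: every component of u is at most the matching component of v."""
    return all(a <= b for a, b in zip(u, v))

def less(u, v):
    """u < v: u <= v and the two vectors differ."""
    return leq(u, v) and tuple(u) != tuple(v)

def concurrent(u, v):
    """u || v: neither vector is strictly less than the other."""
    return not less(u, v) and not less(v, u)

def vmin(*vectors):
    """Componentwise minimum of one or more vectors."""
    return tuple(min(column) for column in zip(*vectors))

def vmax(*vectors):
    """Componentwise maximum of one or more vectors."""
    return tuple(max(column) for column in zip(*vectors))
```

Note that, read literally, item 3 of Definition 11 makes a vector concurrent with itself, and `concurrent` follows that reading.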
The following algorithm (derived from [Mat88, Fid88]) computes time vectors for the
events in an execution. This algorithm requires the correspondence between signal and
wait events. The time vectors produced reflect the execution's partial order.
Algorithm 1: Given the correspondence between signal and wait events for execution $P$,
events are assigned time vectors, $\tau^{P}(e_i)$, in topological order.
\[ \tau^{P}(e_i) = \max(v_t, v_s, \tau^{\#}(e_i)) \]
where
\[ v_t = \begin{cases} \tau^{P}(e_i^{p}) & \text{if } e_i \text{ has a predecessor} \\ \text{the 0 vector} & \text{otherwise} \end{cases} \]
\[ v_s = \begin{cases} \tau^{P}(\hat{e}) & \text{if } e_i \text{ is a wait event and } \hat{e} \text{ is the corresponding signal event} \\ \text{the 0 vector} & \text{if } e_i \text{ is not a wait event} \end{cases} \]
End Algorithm 1.
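As an illustration, Algorithm 1 can be sketched in Python for the special case of the canonical execution, where the $k$th wait on a semaphore corresponds to its $k$th signal and the trace order itself serves as the topological order. The event encoding, the task ordering `TASKS`, and the function name `time_vectors` are assumptions of this sketch; for another execution the signal/wait correspondence would have to be supplied instead of derived.

```python
TASKS = ["A", "B", "C"]  # assumed component order of the vectors

# Example trace, one (task, op, semaphore) triple per event (assumed encoding).
TRACE = [
    ("A", "signal", "S1"), ("C", "wait", "S1"), ("C", "signal", "S1"),
    ("C", "signal", "S2"), ("B", "wait", "S1"), ("B", "signal", "S1"),
    ("B", "signal", "S2"), ("A", "wait", "S2"), ("A", "wait", "S2"),
    ("A", "wait", "S1"),
]

def time_vectors(trace, tasks):
    """Assign a time vector to every trace event, in trace order."""
    index = {task: i for i, task in enumerate(tasks)}
    zero = (0,) * len(tasks)
    previous = {}                   # task -> vector of its previous event
    count = {task: 0 for task in tasks}
    signal_vectors = {}             # semaphore -> vectors of its signals, in order
    waits_seen = {}                 # semaphore -> number of waits so far
    result = []
    for task, op, sem in trace:
        count[task] += 1
        # Local event count in the task's own position, zeros elsewhere.
        local = tuple(count[task] if i == index[task] else 0
                      for i in range(len(tasks)))
        v_t = previous.get(task, zero)
        v_s = zero
        if op == "wait":            # canonical pairing: k-th wait, k-th signal
            k = waits_seen.get(sem, 0)
            v_s = signal_vectors[sem][k]
            waits_seen[sem] = k + 1
        vector = tuple(max(column) for column in zip(v_t, v_s, local))
        if op == "signal":
            signal_vectors.setdefault(sem, []).append(vector)
        previous[task] = vector
        result.append(vector)
    return result

VECTORS = time_vectors(TRACE, TASKS)
```

With this pairing the sketch assigns (1, 0, 3) to CS2 and (1, 1, 2) to BW1. Neither vector is less than the other, so the two events are unordered in the canonical execution.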
2
We use an integer valued clock in our discussion although a real number valued clock can also be used.
Definition 4: An execution is strongly consistent with a trace if it is consistent and
the total order specified by the trace is an extension of the partial order specified by the
execution.
For example, consider the trace {AS1, CW1, CS1, CS2, BW1, BS1, BS2, AW2, AW2, AW1}.
Event AS1 means task A performs a signal($S_1$), AW1 means task A performs
wait($S_1$), etc. Figure 1.1 shows the four executions which are consistent with this trace. In
addition, the executions (a) and (b) are strongly consistent with the trace.
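The counts in this example can be checked by brute force. The sketch below rests on an assumption made only for illustration: an execution is identified with a one-to-one assignment of a signal to every wait on the same semaphore, and it is kept when the program-order edges plus the signal-to-wait edges form no cycle. The event encoding and all function names are hypothetical.

```python
from itertools import permutations, product

# Example trace, one (task, op, semaphore) triple per event (assumed encoding).
TRACE = [
    ("A", "signal", "S1"), ("C", "wait", "S1"), ("C", "signal", "S1"),
    ("C", "signal", "S2"), ("B", "wait", "S1"), ("B", "signal", "S1"),
    ("B", "signal", "S2"), ("A", "wait", "S2"), ("A", "wait", "S2"),
    ("A", "wait", "S1"),
]

def is_acyclic(n, edges):
    """Kahn's algorithm: True if the directed graph on 0..n-1 has no cycle."""
    indegree = [0] * n
    successors = [[] for _ in range(n)]
    for a, b in edges:
        successors[a].append(b)
        indegree[b] += 1
    ready = [v for v in range(n) if indegree[v] == 0]
    removed = 0
    while ready:
        v = ready.pop()
        removed += 1
        for w in successors[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                ready.append(w)
    return removed == n

def consistent_pairings(trace):
    """Yield every acyclic correspondence as a dict {wait index: signal index}."""
    program_edges, last = [], {}
    for i, (task, _, _) in enumerate(trace):
        if task in last:
            program_edges.append((last[task], i))
        last[task] = i
    options = []
    for sem in sorted({s for _, _, s in trace}):
        signals = [i for i, (_, op, s) in enumerate(trace) if op == "signal" and s == sem]
        waits = [i for i, (_, op, s) in enumerate(trace) if op == "wait" and s == sem]
        # Each wait receives a distinct signal on the same semaphore.
        options.append([dict(zip(waits, chosen))
                        for chosen in permutations(signals, len(waits))])
    for combo in product(*options):
        pairing = {w: s for part in combo for w, s in part.items()}
        sync_edges = [(s, w) for w, s in pairing.items()]
        if is_acyclic(len(trace), program_edges + sync_edges):
            yield pairing

def strongly_consistent(pairing):
    """Every signal precedes its wait in the trace (program edges always do)."""
    return all(s < w for w, s in pairing.items())

PAIRINGS = list(consistent_pairings(TRACE))
STRONG = [p for p in PAIRINGS if strongly_consistent(p)]
```

On the example trace this enumeration yields four acyclic correspondences, two of which are strongly consistent, in agreement with the counts stated above.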
Definition 5: Consider the correspondence between signal and wait events in execution $P$
and two distinct events $e, e'$. If $e \overset{P}{\nrightarrow} e'$ and $e' \overset{P}{\nrightarrow} e$ then events $e$ and $e'$ are concurrent, and
thus can happen at the same time, in the execution.
Definition 6: The symbol "$\parallel$" is used to represent the concurrent relationship between
events. Two events $e$ and $e'$ are concurrent, i.e. $e \parallel e'$, if they can happen at the same
time in some execution which is consistent with the trace.
Definition 7: The symbol "$\prec$" is used to represent the must happen before relationship
between events. Given two events $e$ and $e'$, if $e \prec e'$, then event $e$ will happen before $e'$ in
all executions that are consistent with the given trace. Events $e$ and $e'$ are ordered if
$e \prec e'$ or $e' \prec e$; otherwise, they are unordered.
Concurrent events are always unordered, but unordered events need not be concurrent.
For example, see events BW1 and CW1 in Figure 1.1.
Notice that, in general, $e \prec e'$ is different from the relation $e \xrightarrow{P} e'$ for any choice of $P$.
The former relation tells us that $e$ must happen before $e'$ in all executions consistent with
the trace being analyzed, while the latter says that $e$ happened before $e'$ in the execution
represented by the partial order $P$. If $e \prec e'$ then $e \xrightarrow{P} e'$ for all consistent executions $P$.
But the converse condition does not hold.
In Figure 1.1, CS1 $\xrightarrow{P}$ BW1 if $P$ is the execution (a). However, if $P$ is the execution
(c), BS1 $\xrightarrow{P}$ CW1, and BW1 $\xrightarrow{P}$ CS1 by transitivity. Event AS1 happens before BW1 and
CW1 in all executions consistent with the trace, therefore AS1 $\prec$ BW1 and AS1 $\prec$ CW1.
There is no order relation between event CS2 and BW1 in execution (a). Therefore, they
can happen concurrently, i.e., CS2 $\parallel$ BW1.
Definition 8: A partial ordering $R$ on the events is a safe order relation if
$e_i \, R \, e_j \Rightarrow e_i \prec e_j$. If $R$ is not safe, then $R$ is unsafe.
1.2 Virtual Time
The concept of virtual time for distributed systems was introduced by Lamport in 1978
[Lam78]. The time vectors we compute in this paper are an extension of the time vectors of
Fidge [Fid88] and Mattern [Mat88]. There, each task $T_i$ has a clock $C_i$ which is a vector of
Trace = {AS1, CW1, CS1, CS2, BW1, BS1, BS2, AW2, AW2, AW1}
Figure 1.1: Trace, Executions, and Time Vectors
and a (positive integer) sequence number equal to one plus the number of previous
operations performed by the task.
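The four-field event tuple of the model (operation, semaphore, task id, sequence number) might be represented as follows. The class name `Event`, its field names, and the helper `number_events` are assumptions made for this sketch.

```python
from typing import NamedTuple

class Event(NamedTuple):
    op: str         # "wait" or "signal"
    semaphore: str  # the semaphore affected
    task: str       # id of the task that performed the operation
    seq: int        # one plus the number of previous operations by that task

def number_events(raw):
    """Attach sequence numbers to (op, semaphore, task) triples given in trace order."""
    performed = {}  # task -> number of operations seen so far
    events = []
    for op, semaphore, task in raw:
        performed[task] = performed.get(task, 0) + 1
        events.append(Event(op, semaphore, task, performed[task]))
    return events
```

For instance, `number_events([("signal", "S1", "A"), ("wait", "S1", "C"), ("wait", "S2", "A")])` gives the two events of task A the sequence numbers 1 and 2, and the event of task C the sequence number 1.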
In order to perform the final race analysis, it must be possible to determine from a trace
what shared objects are referenced between any two synchronization events. This can be
done by additionally associating with each event the source line number of the statement
generating the event. From this the path between two adjacent events can be determined
and the variables referenced along the path can be computed [McD89].
Many other kinds of synchronization operations can be simulated by using counting
semaphores. Consider, for example, the event "init task t" which creates a new task t
and the event "await task t" which blocks the running task until task t has terminated.
Given a trace containing these events, we can create an equivalent trace containing only
semaphore events.
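The text does not spell out the translation, so the following Python sketch shows one plausible encoding and should be read as an assumption rather than the authors' construction. Each created task t gets two hypothetical semaphores: `start_t`, signalled by the creator and waited on by t before its first event, and `done_t`, signalled by t after its last event and waited on by the task that awaits t. It further assumes each task is awaited at most once.

```python
def simulate_task_events(trace):
    """Rewrite init_task/await_task events as semaphore events.

    Input and output events are (task, kind, argument) triples. For "signal"
    and "wait" the argument is a semaphore; for "init_task" and "await_task"
    it is the id of the task being created or awaited.
    """
    last = {task: i for i, (task, _, _) in enumerate(trace)}  # last event per task
    created = {arg for _, kind, arg in trace if kind == "init_task"}
    started = set()
    out = []
    for i, (task, kind, arg) in enumerate(trace):
        if task in created and task not in started:
            out.append((task, "wait", "start_" + task))   # child waits to be created
            started.add(task)
        if kind == "init_task":
            out.append((task, "signal", "start_" + arg))
            if arg not in last:                           # child has no events of its own
                out.append((arg, "wait", "start_" + arg))
                out.append((arg, "signal", "done_" + arg))
        elif kind == "await_task":
            out.append((task, "wait", "done_" + arg))
        else:
            out.append((task, kind, arg))
        if i == last[task] and task in created:
            out.append((task, "signal", "done_" + task))  # child announces termination
    return out
```

Because events are inserted, sequence numbers would have to be reassigned after the rewrite.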
In each execution every wait event has a corresponding signal event. We use this
correspondence to define a partial order representing that execution.
Definition 1: An execution of a parallel program is a partial ordering of the events
performed. This partial order is the transitive closure of edges from each event to the
next event performed by the same task and edges to each wait event from the corresponding
signal event.
The relation defined by the partial order $P$ representing an execution is called the
happened before relation and is denoted with the symbol $\xrightarrow{P}$. Our definition of "happened before" is
consistent with that of Lamport [Lam78].
Definition 2: A trace of an execution is an interleaving of the local sequences of events
$E_i$ for $1 \le i \le n$ where for every prefix of the trace and every semaphore S, the prefix
contains at least as many signal(S) events as wait(S) events.
Every trace must satisfy the following properties:
Property 1:
No two events in the trace have both the same task id and the same sequence
number.
Property 2: If there is an event with task id $t$ and sequence number $k$, then for every
$1 \le i < k$, there is an event with task id $t$ and sequence number $i$ appearing earlier in the
trace.
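The prefix condition of Definition 2 and the two properties above can be checked in a single pass over a trace. The sketch below assumes events are `(op, semaphore, task, seq)` tuples; the function name `check_trace` and the wording of the reported problems are illustrative only.

```python
def check_trace(trace):
    """Return (position, message) pairs for every violation found."""
    problems = []
    balance = {}  # semaphore -> signals minus waits in the prefix so far
    seen = {}     # task -> sequence numbers that have already appeared
    for pos, (op, sem, task, seq) in enumerate(trace):
        balance[sem] = balance.get(sem, 0) + (1 if op == "signal" else -1)
        if balance[sem] < 0:
            problems.append((pos, "prefix has more wait than signal events on " + sem))
        earlier = seen.setdefault(task, set())
        if seq in earlier:
            problems.append((pos, "Property 1: repeated (task id, sequence number)"))
        if any(i not in earlier for i in range(1, seq)):
            problems.append((pos, "Property 2: an earlier sequence number is missing"))
        earlier.add(seq)
    return problems
```

An empty result means the trace satisfies all three conditions. A trace that begins with a wait event, for example, is reported at position 0.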
A single execution usually has many possible traces. Similarly, a single trace could have
been generated by any one of a number of executions. (Figures 1.1(a) and 1.1(b) show two
different executions for the same trace).
Definition 3: An execution is consistent with a trace if, for each task $1 \le i \le n$, the local
sequence of trace events $E_i$ is the same as in the execution.
occur" execution order. Our algorithms appear to be more efficient and may find more
guaranteed order relations.
Netzer and Miller [NM89] present a formal model of a program execution based
on Lamport's model of concurrent systems [Lam86]. Their model includes fork/join
parallelism and synchronization using semaphores. They distinguish between an actual
data race, which is a data race exhibited by the particular program execution generating
the trace, and a feasible data race, which is a data race that could have been exhibited
due to timing variations. They show how to characterize each detected data race as either
being feasible, or as belonging to a set of data races such that at least one data race in
the set is feasible. They rely on the trace for their ordering information. As an example,
when two tasks try to enter some critical regions surrounded by some binary semaphore
S, their algorithm will say that these two tasks are ordered when accessing these regions.
Under their definitions there is neither an actual nor feasible data race even if two tasks
write to some shared variable in this case. We view the ordering relationships in the trace
with suspicion, and wish to generate race reports in this situation.
We believe that it is more helpful to analyze sets of executions rather than just one
specific execution based on some trace information. We feel that, in terms of detecting data
races by trace analysis, it is critical to distinguish the ordered events from the unordered,
potentially concurrent, events. In this paper we present a collection of algorithms that
extend previous work in computing partial orders. The algorithms presented compute a
partial order containing only "must occur" type orderings from a linearly ordered trace
containing anonymous synchronization. The algorithms presented in this paper make
few assumptions about specific trace features and can be adjusted to work with traces
generated by many parallel systems, including IBM Parallel Fortran [IBM88], and Cedar
Fortran [GPH*88].
1.1 Description of the Model
We view a parallel program as a finite set of tasks $T_1, \ldots, T_n$ where $n$ is the number
of tasks in the system. These tasks perform synchronization and computation operations,
including computation on shared data (see footnote 1). In an execution, each task $T_i$ is a sequential entity
characterized by a local sequence $E_i$ of events. Different tasks may perform operations
concurrently. We assume, for convenience, that each task has a unique identifier.
In our model, programs synchronize using only counting semaphores which are assumed
to be initialized to zero. Therefore, each event is a tuple containing:
the operation completed (wait or signal),
the semaphore affected,
the id of the task that performed the operation,
1
Although operations on shared data can be used for synchronization [Dij65], we only consider explicit
synchronization operations as capable of generating synchronization events.
1 Introduction
One of the fundamental problems encountered when debugging a parallel program is
determining the race conditions in the program. A race condition may exist when two
or more parallel tasks access shared data in an unspecified order and at least one of the
accesses is a write access. Notice that races include both accesses that may occur "at the
same time" and accesses that must occur sequentially but the order is unspecified (e.g.
accesses protected by a lock). One approach to determining potential races is based on
computing all of the reachable concurrent states of the program [McD89, Tay84]. The
major disadvantage of this approach is that the number of concurrent states may become
prohibitively large. Another approach to determining potential races is based on analyzing
a trace from an execution of the program [EP88, EGP89, NM89]. This approach has the
disadvantage that a trace must be recorded, and is limited to determining races that
can occur given the input data used. Even for the given data, it may not be possible
to determine all races [AP87]. Nevertheless, this latter approach can provide important
information to help in debugging parallel programs and is the subject of this paper.
A trace specifies a total ordering of the events performed by the program. For our
purposes, the trace reflects only one of the orders in which the events could have occurred.
A more restrictive definition that is difficult to achieve in practice would be for a trace to
specify the exact order in which the events did occur. Since traces are only approximations
of executions, there are usually several executions that are consistent with a given trace.
What we want to compute is the orderings between pairs of events that must occur in all
executions which are consistent with the trace. In general this will be a partial order. If
the partial order contains all orderings that must occur, then a pair of events not ordered
by this "must occur" partial ordering can potentially execute in either order.
Much research has been directed towards determining the partial ordering of events in
parallel and distributed systems. Previous models have assumed point-to-point communication
which makes it very easy to determine which events were caused by which other
events (e.g. "message received by B from A" is clearly caused by "message sent by A to
B"). Unfortunately the synchronization models supported by several parallel programming
languages allow for anonymous communication, where the partner is unknown. Examples
of anonymous communication include locks, semaphores, and monitors.
Emrath, Ghosh, and Padua [EGP89] present a method for detecting non-determinacy
in parallel programs that utilize fork/join and event style synchronization instructions
with the Post, Wait, and Clear primitives. They construct a Task Graph from the given
synchronization instructions and the sequential components of the program that is intended
to show the guaranteed orderings between events. For each Wait event node, all Post nodes
that might have triggered that Wait are identified. An edge is then added from the closest
common ancestor of these Post events to the Wait event node. The idea of the algorithm
is very simple, but it may be computationally complex. Also some of the guaranteed
order relations may be missed by their algorithm. Rather than repeatedly computing
the common ancestor information, we use time vectors to calculate the guaranteed "must
Analyzing Traces with
Anonymous Synchronization
David P. Helmbold
Charles E. McDowell
Jian-Zhong Wang
UCSC-CRL-89-42
December, 1989
Board of Studies in Computer and Information Sciences
University of California at Santa Cruz
Santa Cruz, CA 95064
abstract
In a parallel system, events can occur concurrently. However, programmers are often
forced to rely on misleading sequential traces for information about their program's behavior.
We present a series of algorithms which extract ordering information from a sequential
trace with anonymous semaphore-style synchronization.
We view a program execution as a partial ordering of events, and define which executions
are consistent with a given trace. Although it is generally not possible to determine which
of the consistent executions occurred, we define the notion of "safe orderings" which are
guaranteed to occur in every execution which is consistent with the trace.
The main results of the paper are algorithms which determine many of the "safe orderings".
The first algorithm starts from a sequential trace and creates a partially ordered
canonical execution. The second algorithm strips away the ordering relationships particular
to the canonical execution, so that the resulting partial order is safe. The third algorithm
increases the amount of ordering information while maintaining a safe partial order. All
three algorithms are accompanied by proofs of correctness.
keywords: virtual time, program tracing, parallel processing, debugging
This work was supported by IBM under agreement SL 88096.