Refactoring via Program Slicing and Sliding
Ran Ettinger
Wolfson College
Trinity Term, 2006
Submitted in partial fulfilment of the requirements for
the degree of Doctor of Philosophy
Oxford University Computing Laboratory
Programming Tools Group
Refactoring via Program Slicing and Sliding
Ran Ettinger, Wolfson College
Thesis submitted for the degree of Doctor of Philosophy
at the University of Oxford
Trinity Term, 2006
Abstract
Mark Weiser’s observation that “programmers use slices when debugging”, back in 1982,
started a new field of research. Program slicing, the study of meaningful subprograms that capture
a subset of an existing program’s behaviour, aims at providing programmers with tools to assist
in a variety of software development and maintenance activities.
Two decades later, the work leading to this thesis was initiated with the observation that
“programmers use slices when refactoring”. Hence, the thesis explores ways in which known
refactoring techniques can be automated through slicing and related program analyses.
Common to all slicing-related refactorings, as explored in this thesis, is the goal of improving
reusability, comprehensibility and hence maintainability of existing code. A problem of slice
extraction is posed and its potential contribution to refactoring research highlighted. Limitations
of existing slice-extraction solutions include low applicability and high levels of code duplication.
Advanced techniques for the automation of slice extraction are proposed. The key to their
success lies in a novel program representation introduced in the thesis. We think of a program as
a collection of transparency slides, placed one on top of the other. On each such slide, a subset of
the original statement, not necessarily a contiguous one, is printed. Thus, a subset of a statement’s
slides can be extracted from the remaining slides in an operation of sideways movement, called
sliding. Semantic properties of such sliding operations are extensively studied through known
techniques of predicate calculus and program semantics.
This thesis makes four significant contributions to the slicing and refactoring fields of research.
Firstly, it develops a theoretical framework for slicing-based behaviour-preserving transformations
of existing code. Secondly, it provides a provably correct slicing algorithm. This application of our
theory acts as evidence for its expressive power whilst enabling constructive descriptions of slicing-
based transformations. Thirdly, it applies the above framework and slicing algorithm in solving
the problem of slice extraction. The solution, a family of provably correct sliding transformations,
provides high levels of accuracy and applicability. Finally, the thesis outlines the application of
sliding to known refactorings, making them automatable for the first time.
These contributions provide strong evidence that, indeed, slicing and related analyses can assist
in building automatic tools for refactoring.
To Dana, Amir, Zohara and Ze’ev; and to beloved Bärbel.
Acknowledgements
I would first like to thank Prof. Oege de Moor, for taking on the tough role of supervising me and
my work. Not without struggle, we finally came through. By not hassling me while
I was slowly progressing, and by acting as devil’s advocate, Oege has strongly contributed to the
development and success of this work. I also thank Oege for introducing me to his fine group of
students — some of whom will remain forever my friends (I hope) — and for introducing me to
Dijkstra’s work on program semantics. I finally thank Oege for his effort in securing some funding
for this work, after Intercomp’s unsurprising withdrawal, on my arrival to Oxford.
Mike Spivey supervised my work during Oege’s 2003 Sabbatical and influenced my direction
tremendously. I’m especially grateful to Mike for the full attention and presence during our
supervision meetings, always leaving me inspired and with fresh ideas. Jeremy Gibbons and Mike
were my transfer examiners and later Jeremy and Jeff Sanders performed my confirmation of
status. I am greatly indebted to all three for the insightful comments and feedback, which left a
huge mark on the later results leading to this thesis.
The Programming Tools Group at Oxford, of which I was a member, provided a strong working
environment, through weekly meetings, where talks could be practiced and ideas could be shared
and discussed, and through joint work. I’m grateful to past and present members of the group, in
particular to Iván Sanabria, Yorck Hünke, Stephen Drape, Mathieu Verbaere, and Damien Sereni,
for the professional assistance and collaboration, for the mental support on rough days, and for
the friendship. The collaborative highlight of my Oxford time was the work with Mathieu during
his MSc project. I loved our endless discussions on programming and whatever, and I’m glad you
and Dorothée finally returned for the DPhil, and became close friends. Thank you!
Other comlab (past) students, like Gordon, Fatima, Eran, Silvija, Jussi, Abigail, Penelope,
Eldar, David, and Edouard, have contributed immensely to this professional and personal voyage.
I’d also like to extend warm thanks to comlab’s administrative staff, for their continuously diligent
support.
Big thanks go to my final examiners, Jeremy Gibbons and Mark Harman, for the interesting
discussion during a good-spirited viva, and for their valuable suggestions, making this thesis a
better and more professional scientific report. Special thanks go to Raghavan Komondoor, Yvonne
Greene, Iván Sanabria, Steve Drape, Sharona Meushar and Itamar Kahn, for commenting on final
drafts of the thesis.
Intercomp Ltd. in Israel was where I first came up with the ideas for this research. I thank my
colleagues and friends there, and I thank Prof. Alan Mycroft of Cambridge for his contribution to
the development of the initial research proposal. I also thank Prof. Mooly Sagiv for introducing
me to the academic world of program analysis, as well as Eran Tirer and Dr. Mayer Goldberg, for
supporting my Oxford application with letters of recommendation. Itamar Kahn was inspirational
in his own way, and the first to recommend Oxford to me. Sharon Attia was instrumental in the
acceptance of an eventual Oxford offer, and started that adventure with me.
Student life in Oxford is brilliant, mainly due to the distribution of all students into colleges.
Wolfson College provided a beautiful and peaceful environment, perfect for my living and studying.
I would like to thank all my Wolfsonian friends, the staff, members of the boat club, and most
importantly members of our football club. (After all, football was my number one reason for
choosing England.) Participating in sports competitions with the many other colleges, and once
a year with our sister college in Cambridge, Darwin, provided some of the best moments of my
fantastic DPhil experience. On the Jewish side of things, I warmly thank Rabbi Eli and Freidy
Brackman for providing a friendly, social and educational environment in their thriving Oxford
Chabad society. And as for nutrition, nothing compares to Oxford’s late night Kebab vans! Thank
you all for taking good care of me.
Financially surviving the five years of research, as a self-funded student, required some
fundraising. I acknowledge the financial support of IBM’s Eclipse Innovation Grant, the ORS Scheme,
and Oxford University’s Hardship Fund. Sun Labs hired me for a magnificent California intern-
ship (thanks to Michael Van De Vanter, James Gosling, Tom Ball and Tim Prinzing). Continuous
teaching appointments by Oxford’s Computing Laboratory and the Software Engineering Pro-
gramme were fun to perform and provided the much needed extra funds (thanks to Silvija Seres,
Jeremy Gibbons, Steve McKeever, the administrative staff, and all students). In the final 8 months
I was fully supported by my parents (Toda Ima Aba!) and my new position at the IBM Haifa
Research Lab (thanks to Dr. Yishai Feldman and the SAM group) helps pay back Wolfson
College’s Senior Tutor loan (many thanks to Dr. Martin Francis).
Shai, Uri, Itamar & Anna, Yacove, Koby & Adi, Sharona, Becky, Hezi, Keren, and Yo’av,
all helped keep morale high by visiting and maintaining overseas friendships. My sister Dana,
brother Amir, and their families, my parents, Zohara and Ze’ev, and the rest of the family,
including our UK-based relatives, most notably Yvonne Greene and her lovely family in Banbury
where I found a home away from home, were all extremely supportive in ways more than one; and
indeed very patient. I am forever grateful!
And last but not least, I am happy to thank Georgia Barbara Jettinger for her love and immea-
surable contribution to the success of this journey. And it was actually Bärbel’s slip of the tongue
that triggered the invention of this thesis’ sliding metaphor. I am deeply grateful to Oxford (and
Edouard and Raya) for introducing us. Following you on your fieldwork to Paris, for my year of
writing-up, turned out brilliant. Génial!
Rani Ettinger,
15 June 2007,
Tel Aviv, Israel
Slip sliding away
Slip sliding away
You know the nearer your destination
The more you slip sliding away.
Simon & Garfunkel
Contents
1 Introduction
   1.1 Refactoring enables iterative and incremental software development
   1.2 The gap: refactoring tools are important but weak
      1.2.1 Motivating example: Fowler's video store
   1.3 Programmers use slices when refactoring
   1.4 Automatic slice-extraction refactoring via sliding
   1.5 Overview: chapter by chapter
   1.6 Contributions
2 Background and Related Work
   2.1 Refactoring
      2.1.1 Informal reality
      2.1.2 Underlying theory
   2.2 Slicing
      2.2.1 Slicing examples
      2.2.2 On slicing and termination
      2.2.3 Slicing criterion
      2.2.4 Syntax-preserving vs. amorphous and semantic slices
      2.2.5 Flow sensitivity: backward vs. forward slicing
      2.2.6 Slicing algorithms
      2.2.7 SSA form
   2.3 Slice-extraction refactoring
3 Formal Semantics: Predicate Transformers
   3.1 Set theory for program variables
      3.1.1 Sets and lists of distinct variables
      3.1.2 Disjoint sets and tuples
      3.1.3 Generating fresh variable names
   3.2 Predicate calculus
      3.2.1 The state-space metaphor
      3.2.2 Structures, expressions and predicates
      3.2.3 Square brackets: the ‘everywhere’ operator
      3.2.4 Functions and equality
      3.2.5 Global variables in expressions, predicates and programs
      3.2.6 Substitutions
      3.2.7 Proof format
      3.2.8 From the calculus
   3.3 Program semantics
      3.3.1 Predicate transformers
      3.3.2 Different types of junctivity
      3.3.3 A definition of deterministic program statements
   3.4 Program refinement
4 A Theoretical Framework
   4.1 Preliminaries
      4.1.1 On slips and slides: an alternative to substatements
      4.1.2 Why deterministic?
      4.1.3 On deterministic program semantics
      4.1.4 On refinement, termination and program equivalence
      4.1.5 Semantic language requirements
      4.1.6 Global variables in transformed predicates
   4.2 The programming language
      4.2.1 Expressions, variables and types
      4.2.2 Core language
      4.2.3 Extended language
   4.3 Laws of program analysis and manipulation
      4.3.1 Manipulating core statements
      4.3.2 Assertion-based program analysis
      4.3.3 Manipulating liveness information
   4.4 Summary
5 Proof Method for Correct Slicing-Based Refactoring
   5.1 Introducing slice-refinements and co-slice-refinements
   5.2 Variable-wise proofs
      5.2.1 Proving slice-refinements
      5.2.2 A co-slice-refinement is a slice-refinement of the complement
   5.3 Slice and co-slice refinements yield a general refinement
      5.3.1 A corollary for program equivalence
   5.4 Example proof: swap independent statements
   5.5 Summary
6 Statement Duplication
   6.1 Example
   6.2 Sequential simulation of independent parallel execution
   6.3 Formal derivation
   6.4 Summary and discussion
7 Semantic Slice Extraction
   7.1 Example
   7.2 Live variables analysis
      7.2.1 Simultaneous liveness
   7.3 Formal derivation using statement duplication
   7.4 Requirements of slicing
      7.4.1 Ward’s definition of syntactic and semantic slices
   7.5 Summary
8 Slides: A Program Representation
   8.1 Slideshow: a program execution metaphor
   8.2 Slides in refactoring: sliding
      8.2.1 One slide per statement
      8.2.2 A separate slide for each variable
      8.2.3 A separate slide for each individual assignment
   8.3 Representing non-contiguous statements
   8.4 Collecting slides: the union of non-contiguous code
   8.5 Slide dependence and independence
      8.5.1 Smallest enclosing slide-independent set
   8.6 SSA form
      8.6.1 Transform to SSA
      8.6.2 Back from SSA
      8.6.3 SSA is de-SSA-able
   8.7 Summary
9 A Slicing Algorithm
   9.1 Flow-insensitive slicing
      9.1.1 The algorithm
      9.1.2 Example
   9.2 Make it flow-sensitive using SSA-based slides
      9.2.1 Formal derivation of flow-sensitive slicing
      9.2.2 An SSA-based slice is de-SSA-able
      9.2.3 The refined algorithm
      9.2.4 Example
   9.3 Slice extraction revisited
      9.3.1 The transformation
      9.3.2 Evaluation and discussion
   9.4 Summary
10 Co-Slicing
   10.1 Over-duplication: an example
   10.2 Final-use substitution
      10.2.1 Example
      10.2.2 Deriving the transformation
   10.3 Advanced sliding
      10.3.1 Statement duplication with final-use substitution
      10.3.2 Slicing after final-use substitution
      10.3.3 Definition of co-slicing
      10.3.4 The sliding transformation
   10.4 Summary
11 Penless Sliding
   11.1 Eliminating redundant backup variables
      11.1.1 Motivation
      11.1.2 Example
      11.1.3 Dead-assignments-elimination and variable-merging
   11.2 Compensation-free (or penless) co-slicing
   11.3 Sliding with penless co-slices
   11.4 Summary
12 Optimal Sliding
   12.1 The minimal penless co-slice
      12.1.1 A polynomial-time algorithm
   12.2 Slice inclusion
   12.3 The optimal sliding transformation
   12.4 Summary
13 Conclusion
   13.1 Slicing-based refactoring
      13.1.1 Replace Temp with Query
      13.1.2 More refactorings
   13.2 Advanced issues and limitations
   13.3 Future work
      13.3.1 Formal program re-design
      13.3.2 Further applications of sliding: beyond refactoring
A Formal Language Definition
   A.1 Core language
   A.2 Extended language
B Laws of Program Manipulation
   B.1 Manipulating core statements
   B.2 Assertion-based program analysis
      B.2.1 Introduction of assertions
      B.2.2 Propagation of assertions
      B.2.3 Substitution
   B.3 Live variables analysis
      B.3.1 Introduction and removal of liveness information
      B.3.2 Propagation of liveness information
      B.3.3 Dead assignments: introduction and elimination
C Properties of Slides
   C.1 Lemmata for proving independent slides yield slices
   C.2 Slide independence and liveness
D SSA
   D.1 General derivation
   D.2 Transform to SSA
   D.3 Back from SSA
   D.4 SSA is de-SSA-able
   D.5 An SSA-based slice is de-SSA-able
E Final-Use Substitution
   E.1 Formal derivation
   E.2 Lemmata for proving statement dup. with final use
   E.3 Stepwise final-use substitution
F Summary of Laws
   F.1 Manipulating core statements
   F.2 Assertion-based program analysis
      F.2.1 Introduction of assertions
      F.2.2 Propagation of assertions
      F.2.3 Substitution
   F.3 Live variables analysis
      F.3.1 Introduction and removal of liveness information
      F.3.2 Propagation of liveness information
      F.3.3 Dead assignments: introduction and elimination
Bibliography
Chapter 1
Introduction
1.1 Refactoring enables iterative and incremental software
development
Programming is a relatively young discipline. In its earlier days, the leading so-called Waterfall
methodology for software development involved separate phases for design and for actual imple-
mentation. This was based on the assumption that a system can be fully specified up-front, ahead
of its implementation. Any later change was considered software maintenance, and involved its
own separate set of processes.
Modern methodologies, in contrast, inherently accommodate change by admitting a more
iterative and incremental software development process. That is, throughout its lifecycle, software
is developed and released in iterations. Each such iteration typically targets an increment in
functionality. Thus, an iteration may involve any and all aspects of development, including design
and coding. Examples include the Rational Unified Process (RUP) [34], eXtreme Programming
(XP) [7] and other so-called agile methodologies [65, 67].
Refactoring [48, 20] is the process of improving the design of existing software. This is achieved
by performing source code transformations that preserve the original functionality. The ability to
update the design and internal structure of programs through refactoring enables change during the
lifecycle of any software system. Thus, refactoring is key to the success of software development.
The premise, when refactoring, is that the design should be clearly reflected by the code itself.
Thus, clarity of code is imperative. Indeed, the goal of many refactoring transformations (e.g. for
renaming variables) is to improve the readability of code.
Another theme in refactoring is the removal of duplication in code. (As a system evolves,
duplication creeps in e.g. by the common ‘quick-and-dirty’ practice of ‘copy-and-paste’ of existing
code.) Such redundancies can and should be removed. This removal is achieved by refactoring
transformations geared at enhancing reusability of code (e.g. by extracting common code into new
methods with the so-called Extract Method refactoring).
The refactorings in this thesis will indeed target both improved comprehensibility and enhanced
reusability, in supporting the development and maintenance of quality software systems.
Modern software development environments, e.g. MS Visual Studio [68] and Eclipse [66], in-
clude useful support for some refactoring techniques. However, the incompleteness and, at times,
incorrectness of those tools call for progress in the underlying theory.
In what follows, we illustrate the promise of refactoring and the power of its supporting tools,
on the one hand, while identifying the gap to be filled by this thesis, on the other.
1.2 The gap: refactoring tools are important but weak
In code, a function that yields a value without causing any observable side effects is very valuable.
“You can call this function as often as you like” [20, Page 279]. Such a call is also known as a
query.
A refactoring technique called Replace Temp with Query was introduced by Fowler [20] to
turn the use of a temporary variable holding the value of an expression into a query. The benefit
is increased readability (in the refactored version) and reusability (of the extracted computation).
This scenario is indeed supported in e.g. Eclipse, as a special case of the Extract Method tool.
A more complicated case of Replace Temp with Query occurs when the temp is not assigned
the result of an expression, but rather the result of a computation spanning several lines of code.
If those lines are consecutive in code (i.e. contiguous), they can be selected by the user and again
the Extract Method tool may handle them successfully. Unfortunately, this will not always be
the case; instead, when the code for computing a temporary result is tangled with code for other
concerns, it is said to be non-contiguous.
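To make the simple case concrete, here is a minimal Java sketch of Replace Temp with Query; the Order class and its fields are illustrative assumptions, not code from Fowler’s book:

    // Before: a temp holds the value of an expression.
    class Order {
        private double quantity, itemPrice;

        double totalWithTax() {
            double basePrice = quantity * itemPrice; // the temp
            return basePrice * 1.2;
        }
    }

    // After: the temp is replaced by a side-effect-free query,
    // which other methods can now reuse as often as they like.
    class OrderRefactored {
        private double quantity, itemPrice;

        double totalWithTax() {
            return basePrice() * 1.2;
        }

        private double basePrice() { // the extracted query
            return quantity * itemPrice;
        }
    }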
1.2.1 Motivating example: Fowler’s video store
The following example is taken (with minor changes) from Martin Fowler’s refactoring book [20],
where all refactorings are performed manually.¹ The example concerns a piece of software for
running a video store, focusing on the implementation of one feature: composing and printing
a customer rental record statement. The statement includes information on each of the rentals
made by a customer, and summary information; a sample statement is shown in Figure 1.1.
In the original implementation, the preparation of the text to be printed and the computation
of the summary information are tangled inside a single method (see Figure 1.2). In fact, Fowler
¹The example itself, as well as a variation on the accompanying discussion, has appeared in a paper titled:
“Untangling: A Slice Extraction Refactoring” [17] by the author of this thesis, co-authored with Mathieu Verbaere.
Figure 1.1: A sample customer statement.
Figure 1.2: A tangled statement method.
Figure 1.3: The statement method after extracting the computations of the total charge and
frequent renter points.
starts off with a much longer statement method containing all the logic for determining the amount
to be charged and the number of frequent renter points earned per movie rental. These results
depend on the type of the rented movie (regular, children’s or new release) and the number of
days it was rented for.
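Since the figures are not reproduced in this extract, the following Java sketch approximates the tangled statement method of Figure 1.2, at the stage where the rental-specific logic has already been moved to the Rental class; the exact line numbers cited in the steps below refer to the original figure:

    // Approximation of Figure 1.2 (inside the Customer class, which
    // holds a collection _rentals): text layout, total charge and
    // frequent renter points are all computed in one tangled loop.
    public String statement() {
        double totalAmount = 0;
        int frequentRenterPoints = 0;
        Enumeration rentals = _rentals.elements();
        String result = "Rental Record for " + getName() + "\n";
        while (rentals.hasMoreElements()) {
            Rental each = (Rental) rentals.nextElement();
            frequentRenterPoints += each.getFrequentRenterPoints();
            result += "\t" + each.getMovie().getTitle() + "\t"
                    + String.valueOf(each.getCharge()) + "\n";
            totalAmount += each.getCharge();
        }
        result += "Amount owed is " + String.valueOf(totalAmount) + "\n";
        result += "You earned " + String.valueOf(frequentRenterPoints)
                + " frequent renter points";
        return result;
    }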
Fowler then gradually improves the design by factoring out that rental-specific logic (into the
Rental class, which is not shown here). The suggested refactoring steps are motivated by the
introduction of a new requirement, namely the ability to print an html version of the statement.
A quick-and-dirty approach would be to copy the body of the statement method, paste it into a
new htmlStatement method and replace the text-based layout control strings with corresponding
html tags. This would lead to duplication of the code for computing the temporary totalAmount
and frequentRenterPoints variables.
For brevity, we join the refactoring session at the stage Fowler calls Removing Temps ([20,
Page 26]). At this stage the computations of totalAmount and frequentRenterPoints are factored
out (see figures 1.3-1.5 for the result of those two steps).
Figure 1.4: The extracted total charge computation.
Figure 1.5: The extracted frequent renter points computation.
Fowler describes the path by which this
was achieved as “not the simplest case of Replace Temp with Query; totalAmount was assigned to
within the loop, so I have to copy the loop into the query method”. Indeed, to date, no refactoring
tool supports such cases.
Here is an outline of the mechanical steps that need to be performed by a programmer, in the
absence of tool support, for extracting the total charge computation:
1. In the statement method of Figure 1.2, look for the temporary variable that is assigned the
result of the total charge computation. This is totalAmount which is declared to be of type
double in line 14. Its final value is added to the customer statement in line 29.
2. Create a new method, and name it after the intention of the computation: getTotalCharge.
Declare it to return the type of the extracted variable: double. See line 36 in Figure 1.4.
3. Identify all the statements that contribute to the computation of totalAmount. In this case
these are the statements in lines {14, 16, 19, 20, 25, 26}.
4. Copy the identified statements to the new method. See lines 37 to 42 in Figure 1.4.
5. Scan the extracted code for references to any variables that are parameters to the statement
method. These should be parameters to getTotalCharge as well. In this case, the parameter
list is empty.
6. Look to see which of the extracted statements are no longer needed in the statement method
and delete those. In this case, the while loop is still relevant, and therefore the statements
in lines {16, 19, 20, 26} cannot be deleted; instead, they are duplicated. Lines {14, 25} are
needed only in the extracted code and are therefore deleted. In Figure 1.3 they are shown
as blank lines, for clarity.
7. Rename the extracted variable, totalAmount, in the extracted method, getTotalCharge, to
result, and add a return statement at the end of that method (see line 43 in Figure 1.4).
8. Replace the reference to the result of the extracted computation with a call to the target
getTotalCharge method (line 29 in Figure 1.3).
9. Compile and test.
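A Java sketch of the outcome of steps 2-8, approximating Figure 1.4 (the enclosing Customer class and Rental interface are assumed as above):

    // The extracted query: the while loop is duplicated rather than
    // moved (step 6), since the statement method still traverses the
    // rentals for its other concerns.
    private double getTotalCharge() {
        double result = 0; // renamed from totalAmount (step 7)
        Enumeration rentals = _rentals.elements();
        while (rentals.hasMoreElements()) {
            Rental each = (Rental) rentals.nextElement();
            result += each.getCharge();
        }
        return result; // the added return statement (step 7)
    }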
A refactoring tool could reduce the above scenario to (a) selecting a temporary variable (whose
computation is to be extracted), and (b) choosing a name for the extracted method. The tool,
in turn, would either perform the transformation, or reject it if behaviour preservation cannot be
guaranteed. For example, note that the correctness of the transformation above depended on the
immutability of the traversed collection rentals (thus allowing untangling of the three traversals).
An attempt at providing such a tool, in early stages of the research leading to this thesis,
suffered several drawbacks. Firstly, in order to guarantee behaviour preservation, the identified
preconditions (e.g. no global variable defined in the extracted code) were clearly stronger than
necessary. Secondly, the levels of code duplication were, again, higher than necessary. The dupli-
cation is due to extracted statements (identified in step 3 above) that are not deleted from the
original (see step 6). As usual, code duplication could be considered harmful in itself, but perhaps
more importantly, it indirectly affected applicability.
A successful reduction in duplication and weakening of preconditions, thus leading to a refined
and more generally applicable tool, required a careful and rigorous study of the many intricacies
in this refactoring. Results of that study are reported in this thesis.
The complete video-store scenario, particularly the breaking up and distribution of the ini-
tially monolithic statement method, motivates and justifies Fowler and Beck’s big refactoring to
Convert Procedural Design to Objects [20, Chapter 12]: “You have code written in a procedural
style. Turn the data records into objects, break up the behavior, and move the behavior to the
objects”.
The steps of turning procedural design to objects mainly involve introducing new classes, ex-
tracting methods, moving variables and methods (to the new classes), inlining methods and renam-
ing variables and methods. All those are either straightforward or already supported by modern
refactoring tools. It is the extraction of non-contiguous code (as in Replace Temp with Query)
for which automation is missing and required.
However, hope is not lost, as some solutions to the extraction of non-contiguous code have been
proposed and investigated. (In fact, as will be shown later, those tackle a problem different from
the above, but closely related.) Inspired by those, we shall dedicate this thesis to the development
of a novel solution; one that will benefit from the advantages of each of those, whilst highlighting
and overcoming respective limitations.
The extraction of non-contiguous code, especially when dealing with the automation of steps
3 and 6 of the mechanics in the example above, leads us to the following observation.
1.3 Programmers use slices when refactoring
To untangle the desired statements from their context, one can employ program slicing [61, 64]. A
program slice singles out those statements that might have affected the value of a given variable
at a given point in the program. A typical scenario is one in which the programmer selects a
variable (or set of variables) and point of interest, e.g. totalAmount at line 29, in the example
above (Figure 1.2); a slicing tool, in response, computes the (smallest possible) corresponding
slice, e.g. the non-contiguous code of lines {14, 16, 19, 20, 25, 26}. This slice can then be extracted
into a new method, as was the case in steps 3 and 4 of that example. The idea of using slicing for
refactoring has been suggested by Maruyama [42].
Program slicing was invented, by Mark Weiser, for “times when only a portion of a program’s
behavior is of interest” [61], and with the observation that “programmers use slices when debug-
ging” [62]. According to Weiser, slicing is a “method of program decomposition” that “is applied
to programs after they are written, and is therefore useful in maintenance rather than design”
[61].
This is no longer true. In modern software development, as was mentioned earlier, some design
is normally done on each and every development iteration. Thus, since code of earlier iterations
is already available when designing further features (or corrections to existing ones), slicing can
be useful there too.
Therefore, the research leading to this thesis started with the observation that slicing can
be useful in daily program development activities, even outside its initial domain of software
maintenance. As a first step towards such usage, and since refactoring presents such an interesting
blend of design, existing code and behaviour-preserving transformations, this research was initiated
with the question: “How can program slicing and related analyses assist in building automatic
tools for refactoring?”²
1.4 Automatic slice-extraction refactoring via sliding
We shall propose automation of the Replace Temp with Query refactoring in latter stages of this
thesis. The solution will be composed of a number of behaviour-preserving steps, in a manner
slightly different from the earlier mechanics of manual transformation. In the first step, a selected
slice will be extracted from its so-called complement (i.e. code for the remaining computation).
The problem of slice extraction can be formulated as follows:
Definition 1.1 (Slice extraction). Let S be a program statement and V be a set of variables;
extract the computation of V from S (i.e. the slice of S with respect to V) as a reusable program
entity, and update the original S to reuse the extracted slice.
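As a small illustration of Definition 1.1 (a hypothetical Java instance, anticipating the sum-and-product example of Section 2.2.1), take S to be a loop computing both a sum and a product, and V = {sum}:

    // Before: S computes sum and prod together.
    static double[] sumAndProduct(int[] a) {
        double sum = 0, prod = 1;
        for (int i = 0; i < a.length; i++) {
            sum += a[i];
            prod *= a[i];
        }
        return new double[] { sum, prod };
    }

    // After slice extraction with V = {sum}: the slice of sum becomes
    // a reusable method, and the original is updated to reuse it.
    static double sumOf(int[] a) { // the extracted slice
        double sum = 0;
        for (int i = 0; i < a.length; i++) sum += a[i];
        return sum;
    }

    static double[] sumAndProductRefactored(int[] a) {
        double prod = 1; // the complement
        for (int i = 0; i < a.length; i++) prod *= a[i];
        return new double[] { sumOf(a), prod };
    }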
A novel solution shall be developed in the course of this thesis, thus automating slice extraction.
The automation will be based on a correct (i.e. behaviour-preserving) slicing algorithm. This al-
gorithm will itself be based on a special program representation, specifically designed for capturing
non-contiguous code. This representation’s primitive elements will be called slides. (This decom-
position of a program into slides is in accordance with a program execution metaphor of overhead
projection of programs printed on transparency slides; see Chapter 8.)
²The author would like to gratefully acknowledge Prof. Alan Mycroft’s advice during preparation of the research
proposal, particularly in the formulation of this research question.
It is in illustrating and formalising the slice-extraction refactoring that the program medium
of slides will be instrumental. Suppose the code of a program statement S is printed on a single
transparency slide. Our initial solution begins by duplicating that slide, yielding two clones, say
S1 and S2, and placing them one on top of the other. This is then followed by sliding one of the
slides (say of S2) sideways, and by adding so-called compensatory code. This compensation will
be responsible for preserving behaviour.
Behaviour can be preserved by keeping initial values of all relevant variables (in fresh backup
variables) ahead of S1, and retrieving those after S1 but ahead of S2. Furthermore, extracted
results, V, can be saved after S1 and retrieved on exit from S2. Pictorially, sliding of S, V will
turn S into something like the following (with ; for sequential composition; read top-down
for chronological order):
    (keep backup of relevant initial values)
  ; S1    (first clone, i.e. extracted code)
  ; (keep backup of final values of the extracted V)
  ; (retrieve backup of initial values)
  ; S2    (second clone, i.e. complement)
  ; (retrieve backup of final values)
A naive sliding transformation, in the form of full statement duplication (as described above),
is formally developed in Chapter 6.
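To make the schema concrete, here is a hypothetical Java rendering of naive sliding, with S the sum-and-product loop used above and V = {sum}; the backup-variable names are invented for the sketch:

    // Naive sliding: S is duplicated into clones S1 and S2, with
    // compensatory backup code around them to preserve behaviour.
    static double[] slide(int[] a) {
        double sum = 0, prod = 1;
        int i = 0;
        // keep backup of relevant initial values
        double sum0 = sum, prod0 = prod;
        int i0 = i;
        // S1: first clone, i.e. the extracted code
        while (i < a.length) { sum += a[i]; prod *= a[i]; i++; }
        // keep backup of final values of the extracted V = {sum}
        double sumFinal = sum;
        // retrieve backup of initial values
        sum = sum0; prod = prod0; i = i0;
        // S2: second clone, i.e. the complement
        while (i < a.length) { sum += a[i]; prod *= a[i]; i++; }
        // retrieve backup of final values
        sum = sumFinal;
        return new double[] { sum, prod };
    }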
A number of improved versions of sliding, with the goal of reducing code duplication, will be
explored and formalised throughout the thesis. Those will benefit from our decomposition of a
program statement into smaller entities of non-contiguous code (i.e. slides, to be formalised in
Chapter 8).
The reduction of duplication will be achieved by making both the extracted code (i.e. S1
above) and the complement (i.e. S2) smaller. In later improvements, the compensatory code will
be made smaller too (see Chapter 11).
1.5 Overview: chapter by chapter
This opening chapter has introduced the challenge of slice-extraction untangling transformations,
with the goals of improving readability and reusability of existing code. The importance and
potential implications of this refactoring and its automation have been highlighted and briefly
demonstrated through a known example from the refactoring literature. Finally, hints to our path
for automating slice extraction have been given. The rest of this thesis is structured as follows:
In Chapter 2 we present background material and related work. This includes refactoring,
slicing, and the application of slicing to refactoring in extraction of non-contiguous code.
In Chapter 3 we give background to the adopted formal approach, introducing some rele-
vant concepts from predicate calculus and predicate transformers, set theory and program
refinement.
In Chapters 4 and 5 we begin the presentation of original work by developing a formal
framework for correct slicing-based refactoring, including the definition of a programming
language, a collection of laws to facilitate program analysis and manipulation, and a method
for proving general algorithmic refinements through newly introduced slicing-related ones.
In Chapter 6 we take the first step towards slice extraction by formally developing a trans-
formation of statement duplication. The result is a naive sliding transformation, with both
the extracted code and complement being clones of the original statement.
In Chapters 7, 8 and 9 we develop a first improvement of sliding. The semantic and syntactic
requirements of slicing are derived, leading to the formalisation of a novel slicing algorithm,
one that is based on a program representation of slides. With this slicer, both the extracted
code and the complement are specialised to be the slice of extracted variables and the slice
of the remaining defined variables, respectively.
In Chapter 10 we target further reductions in the duplication caused by sliding. Those are
based on the observation that the complement (or co-slice), previously being the slice of all
non-extracted variables, can become smaller by reusing values of extracted variables.
In Chapter 11 we target the identification and elimination of redundant compensatory code,
a result of earlier formulations of sliding.
In Chapter 12 we pose and solve a couple of optimisation problems, thus yielding an optimal
slice-extraction solution via sliding.
Finally, we conclude in Chapter 13 by considering the application of sliding for automat-
ing known refactorings, discussing advanced issues and limitations, and suggesting possible
directions for future work.
1.6 Contributions
This thesis brings together the fields of program slicing and refactoring. As such, it makes four
significant contributions to those fields, as listed below.
1. It develops a theoretical framework for slicing-based behaviour-preserving transformations of
existing code. The framework, based on wp-calculus, includes a new proof method, specif-
ically designed to support slicing-related transformations of deterministic programs. The
framework further includes a novel program decomposition technique of program slides,
aiming to capture non-contiguous code.
2. It provides a provably correct slicing algorithm. This application of our theory acts as
evidence for its expressive power whilst enabling constructive descriptions of slicing-based
transformations.
3. It applies the above framework and slicing algorithm in solving the problem of slice extrac-
tion. The solution takes the form of a family of provably correct sliding transformations.
Drawing inspiration from a number of existing solutions to related problems of method
extraction, sliding is successful in providing high levels of accuracy and applicability.
4. It identifies and outlines the application of sliding to known refactorings, making them au-
tomatable for the first time. Examples of such refactorings include Replace Temp with Query
and Split Loop.
These contributions provide strong evidence for the validity of our research question. Indeed,
slicing and related analyses can assist in building automatic tools for refactoring.
Chapter 2
Background and Related Work
2.1 Refactoring
2.1.1 Informal reality
Refactoring is defined informally as the process of improving the design of existing software sys-
tems. The improvement takes the form of source code transformations. Each such transformation
is expected to preserve the behaviour of the system while making it more amenable to change. A
programmer can refactor either manually or with the assistance of automatic tools.
Refactoring was introduced by William Opdyke in his PhD thesis [48] and later became widely
known with the introduction of Martin Fowler’s book [20].
The refactoring.com website [71] maintains a list of refactoring tools and an online catalog of
refactorings [69]. The refactoring community discusses the techniques, tools and philosophy on
the refactoring mailing list [72].
In [69, 20], around 100 refactoring techniques are described. There are simple refactorings
such as renaming a class and some more complicated ones, e.g. for extracting a class or a method,
or for moving a method from one class to another. Some bigger refactorings may involve a
whole hierarchy of classes, for example introducing polymorphism, collapsing a redundant class
hierarchy, or even ones as complex and ambitious as converting a program with procedural design to a
more object-oriented one.
Being driven mostly by examples, the description of each refactoring, in [69, 20], is fairly
informal and imprecise. The success of each transformation depends on the programmer’s good
judgement, complemented with expected assistance from the compiler and the availability of a
comprehensive suite of automated tests.
Eliminating that unconvincing dependence on testing is one of the challenges of refactoring
tools. Such a tool is typically interactive: the programmer selects a specific refac-
toring from the menus; the tool, in response, performs the transformation, asking the programmer
to fill in any required details such as new names for introduced program elements.
Another (related) goal of refactoring tools is to speed up the process of refactoring, thus
supporting improved productivity of programmers. Ultimately, programmers would trust the
tools, employ them frequently, on a daily basis, as is dictated by requests for change in the
existing software on which they work.
The RefactoringBrowser for Smalltalk, developed by John Brant and Don Roberts at the
University of Illinois [53], was the first dedicated refactoring tool. Its success was followed by
several attempts to develop refactoring tools for the Java programming language [25], including
IntelliJ’s IDEA, Microsoft’s Visual Studio and (initially IBM’s) Eclipse. Those tools support
some of the offered refactorings, such as Move/Pull-Up/Push-Down/Extract/Inline Method,
Rename Field/Method/Class, Self-Encapsulate Field, Add Parameter, and Extract Interface.
However, that support is far from perfect, as short experiments we performed (first in 2003
and then again in 2005 [70, 55]) revealed. There, it was demonstrated how modern tools are
particularly weak in supporting cases where non-trivial data-flow and control-flow analyses are
required. These shortcomings led, in some cases, to an apparently successful refactoring that
yielded grammatically incorrect code; in other cases, potentially correct transformations were
unnecessarily rejected due to inaccurate, and at times incorrect, analysis.
Such bugs in refactoring tools call for a review of refactoring theory. Their existence also acts
as motivation for the formal approach taken in this thesis.
2.1.2 Underlying theory
Program representation and analysis
As a result of developing several versions of the Smalltalk RefactoringBrowser, Roberts [52] iden-
tified several criteria, both technical and practical, necessary to the success of a refactoring tool.
The technical requirement is that the tool must maintain a program database, that holds all the
required information about the refactored program’s entities, e.g. packages, classes, fields, methods
and statements, and also their relations and cross-references. The database should enable the tool
to check properties of the program both when checking whether a refactoring request is legal and
when performing the transformation. As the source code may constantly change, either manually by
the programmer or by the refactoring (or any other source code manipulation) tool, the program
database must also be constantly updated. Regarding the techniques that can be used to construct
the program database, Roberts states that “at one end of the scale are fast, lexical tools such as
grep. At the other end are sophisticated analysis techniques such as dependency graphs. Some-
where in the middle is syntactic analysis using abstract syntax trees. There are tradeoffs between
speed, accuracy, and richness of information that must be made when deciding which technique
to use. For instance, grep is extremely fast but can be fooled by things like commented-out code.
Dependency information is useful, but often takes a considerable amount of time to compute”.
Existing tools mostly use the abstract syntax tree (AST) compromise, whereas the analysis
required for transformations in this thesis will be of the kind applied in constructing dependency
graphs. In doing so, and in light of Roberts’ observation, as stated above, we pay some attention
to efficiency and performance, when constructively expressing algorithms for analysis and trans-
formation. In particular, most of those will indeed be tree based and require only one pass over
an analysed program’s AST. (This is made possible by the simplicity of our supported language.)
However, behaviour preservation, as is discussed next, will be our prime goal. Consequently,
we shall be concerned with correctness of our algorithms more than with their corresponding
performance and complexity.
Behaviour preservation
Roberts further discusses the accuracy property expected from a refactoring tool. He argues that
the refactorings that a tool implements must reasonably preserve the behaviour of programs, as
total behaviour preservation is impossible to achieve. “For example, a refactoring might make
a program a few milliseconds faster or slower. Usually, this would not affect a program, but
if the program requirements include hard real-time constraints, this could cause a program to
be incorrect”. The reasonable behaviour-preservation degree that should be expected from a
refactoring tool was formally defined by Opdyke [48] as a list of seven properties that the two
versions of a program (before and after a refactoring) must satisfy. The first six involve syntactical correctness
properties that are necessary for a clean compilation of both versions of the program. The seventh
property is called “Semantically equivalent references and operations”, and is defined as follows:
“Let the external interface to the program be via the function main. If the function main is called
twice (once before and once after the refactoring) with the same set of inputs, the resulting set of
output values must be the same” [48].
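Opdyke’s seventh property can be read as the following (hypothetical) harness, comparing the outputs of the two versions on the same inputs; a real guarantee must of course cover all inputs, not a finite sample:

    import java.util.List;
    import java.util.function.Function;

    class BehaviourCheck {
        // Returns true iff the two versions of the program agree on
        // every one of the given inputs.
        static <I, O> boolean agreeOn(Function<I, O> before,
                                      Function<I, O> after,
                                      List<I> inputs) {
            for (I input : inputs) {
                if (!before.apply(input).equals(after.apply(input)))
                    return false; // observable behaviour changed
            }
            return true;
        }
    }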
This property, when dealing with terminating sequential programs, corresponds to the concept
of refinement (see Section 3.4 in the next chapter). And indeed, in his PhD thesis (“Refactorings
as Formal Refinements” [11]), Márcio Cornélio formalised a large number of Fowler’s refactorings
as “algebraic refinement rules involving program terms”. The supported language (ROOL, for
“Refinement Object-Oriented Language”) is said to be “a Java-like object-oriented language”
with formal semantics based on weakest preconditions (see Chapter 3 ahead).
However, Cornélio does not support the refactoring for removing temps (Replace Temp with
Query) which is targeted by this thesis. To formalise and solve such refactoring problems, the
original refinement calculus approach, as presented by Morgan [45] and others, needs to be
combined with projection onto a subset of the program variables, as we discuss in Chapter 5. Thus,
like Cornélio, we base this work on formal refinement and weakest-preconditions semantics.
For simplicity, and due to the intricate nature of the problem, we shall target a very simple
imperative language, rather than a fully object-oriented one. For the same reasons, we shall focus
on preservation of semantics alone, while avoiding all (important in themselves) questions over
syntactic validity of transformed programs (as e.g. expressed in Opdyke’s first six properties).
Composition of refactorings
Opdyke defined high-level refactoring techniques as a composition of lower-level ones [48]. Each
low-level refactoring is defined with a corresponding set of preconditions. Those, expressed in first
order logic over predicates available in the program database, must be satisfied by the program
and the refactoring criterion (i.e. the type of refactoring and the accompanying parameters chosen
by the user) before a correct refactoring can be performed. The refactoring tool is responsible for
performing such checks.
A naive implementation of refactoring composition would update the program database after
every step. When the composition consists of a long sequence of refactorings, this may yield an
inefficient and slow tool. One approach for reducing the amount of analysis in the implementation
of composite refactorings was introduced in [52]. There, each refactoring’s definition was aug-
mented with a set of properties that a program will definitely satisfy after the transformation, i.e.
postconditions. Using this information, after each step of the composed refactoring, the program
database can be incrementally updated, rather than be re-computed from scratch.
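A minimal sketch, with all names assumed, of this precondition/postcondition scheme for composite refactorings:

    import java.util.List;

    // Each step checks its preconditions against the program database,
    // transforms the source, and then applies its postconditions to
    // update the database incrementally rather than from scratch.
    interface ProgramDatabase { /* classes, methods, cross-references... */ }

    interface Refactoring {
        boolean preconditionsHold(ProgramDatabase db);
        void perform(ProgramDatabase db);
        void applyPostconditions(ProgramDatabase db);
    }

    class CompositeRefactoring {
        // Runs a sequence of steps, rejecting the composition as soon
        // as one step's preconditions fail.
        static boolean run(List<Refactoring> steps, ProgramDatabase db) {
            for (Refactoring step : steps) {
                if (!step.preconditionsHold(db)) return false;
                step.perform(db);
                step.applyPostconditions(db);
            }
            return true;
        }
    }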
The approach taken in this thesis, however, is somewhat different. Indeed, our transformations
shall be composed of (at times exceedingly long) sequences of smaller steps. But instead of
expecting the actual tool to perform each and every step, we shall first formally develop a solution
“by hand”; then overall preconditions shall be carefully collected; thus the bigger refactorings shall
be formally derived from existing, smaller ones, hence potentially leading to more efficient tools.
As was mentioned in the preceding chapter, refactoring tools can benefit from the capabilities
of a decomposition technique known as program slicing. For this we now turn to present relevant
background on slicing, before relating the two (in the section to follow).
2.2 Slicing
Program slicing is the study of meaningful subprograms. Typically applied to the code of an
existing program, a slicing algorithm is responsible for producing a program (or subprogram) that
preserves a subset of the original program’s behaviour. A specification of that subset is known as
a slicing criterion, and the resulting subprogram is a slice.
Slicing was invented with the observation that “programmers use slices when debugging” [62].
Nevertheless, the application of program slicing does not stop there. Further applications include
testing [29, 8], program comprehension [9, 51], model checking [44, 15], parallelisation [63, 5],
software metrics [50, 43], as well as software restructuring and refactoring [40, 42, 17]. The latter
application is considered in this thesis.
There can be many different slices for a given program and slicing criterion. Indeed, there
is always at least one slice for a given slicing criterion: the program itself [61]. However, slicing
algorithms are usually expected to produce the smallest possible slices, as those are most useful
in the majority of applications.
2.2.1 Slicing examples
Here is a variation on the “hello world” of program slicing, computing and printing the sum and
product of all numbers in a given array of integers. The index of each statement is given to its
left, for later reference.
original

1   i := 0
2 ; sum := 0
3 ; prod := 1
4 ; while i<a.length do
5     sum := sum+a[i]
6   ; prod := prod*a[i]
7   ; i := i+1
    od
8 ; out << sum
9 ; out << prod

slice of sum from 8

  i := 0
; sum := 0
; while i<a.length do
    sum := sum+a[i]
  ; i := i+1
  od

slice of prod from 9

  i := 0
; prod := 1
; while i<a.length do
    prod := prod*a[i]
  ; i := i+1
  od
A slice of sum from statement 8 must contain the statements {1,2,4,5,7}, and thus can be
obtained by deleting the irrelevant statements {3,6,8,9}. Similarly, a backward slice of prod from
9 should contain {1,3,4,6,7}, and can be obtained by deleting {2,5,8,9}.
2.2.2 On slicing and termination
An interesting aspect of program behaviour is that of termination. Do we expect a slice to
preserve conditions for termination? For example, should the loop statement in the program
above be included in the slice for the array a from statement 9? And what if a.length is negative?
(Of course this should never happen, unless the program is misbehaving. But a slicer must be
prepared for all possible program behaviours, including such abnormalities.)
When conditions for termination are preserved, the slice is said to be termination sensitive
[35]. Such slices have been applied e.g. in model reduction [15].
However, in an attempt to yield the smallest possible slices, it is common to remove non-
affecting code even if this code might not terminate. This way, the empty statement skip is a
valid slice for the array a in the example above. Thus, slicing may introduce termination, as is
incidentally the case with refinement (see Section 3.4 of the following chapter).
2.2.3 Slicing criterion
An interactive tool for slicing can be seen as an extension to a source code editor, where the user
can select where to slice from and the tool answers by highlighting the set of statements in the
slice. The user selection, i.e. the slicing criterion, can be specified in different ways. In Weiser’s original definition [64], it was a pair, ⟨i, V⟩, combining a program point, i, and a set of variables, V, e.g. ⟨8, {sum}⟩ in the example above.
A simplified version, ⟨i⟩, is obtained by omitting the set of variables, e.g. ⟨8⟩. There, all the variables that are used in the selected substatement are of interest (e.g. ⟨8, {out, sum}⟩ above). Some slicing algorithms (most notably the PDG-based slicers [6, 33]) support this kind of criterion exclusively.
A third variation of the slicing criterion formulation is obtained by selecting a (possibly compound) statement and a set of variables of interest. Here, by avoiding any mention of a program point, we mean to slice from the end of the selected statement. For example, the slice of ⟨8, {sum}⟩ (shown above) is a valid slice of ⟨S, {sum}⟩ (where S stands for the compound statement 1-9), whereas the slice for ⟨T, {sum}⟩ (where T is the compound statement of 1-3) would consist of substatement 2 alone. Similarly, the slice for ⟨S, {out}⟩ is the full S. (Note that here the scope for slicing is mentioned explicitly whereas otherwise it is implicitly expected to be the whole program.) This kind of criterion has appeared e.g. in [59] and is used, exclusively, in this thesis.
2.2.4 Syntax-preserving vs. amorphous and semantic slices
When the slice is limited to constructs from the original program, it is said to be syntax preserving.
Such slices are constructed by deleting irrelevant statements from the original program. Thus, a
syntax-preserving slice of a given program statement corresponds to a substatement, possibly a
non-contiguous one.
Amorphous slicing, in turn, combines slicing with a range of transformations, in simplifying
the resulting code (see e.g. [27]). For example, a termination-sensitive slice of the array a, above,
would potentially be able to exclude the loop from the slice, replacing it with a single test of the
length of a. This way, the initialisation of variable i would be successfully and correctly removed.
Semantic slicing (defined e.g. in [59]) specifies the semantic requirements expected of slices, and accepts any program meeting those requirements as a valid slice. When constructively computing semantic slices, techniques similar to those of amorphous slicing are used to simplify the result.
2.2.5 Flow sensitivity: backward vs. forward slicing
When a program analysis result depends on the order of the statements (i.e. when the analysis of
a program S1; S2 is expected to differ from the analysis of S2; S1) the analysis is said to be flow-sensitive [47]. For producing smaller, more accurate slices, a slicing algorithm should be
flow-sensitive. As such, it can be applied in one of two directions.
Traditional slicing, as presented so far, is known in that respect as backward slicing. Indeed, as
is the case with backward analysis [47, 54], its algorithms propagate information against the flow
of control, while answering questions of what might have happened before arriving at a program
point. The complementary technique is called forward slicing and is computed by looking forward
from a selected program point, thus answering the question “which statements may be affected
by the value computed at this point?”
A slicing algorithm to be developed in this thesis, in Chapter 9, will compute backward slices.
An initial version will be flow-insensitive. Then, flow-sensitivity will be gained by transforming
a program to and from a static-single-assignment (SSA) form, before and after the slicing, re-
spectively. Background on the SSA form will be given shortly, but first we turn to discuss slicing
algorithms.
2.2.6 Slicing algorithms
In this thesis we shall target intraprocedural, backward, syntax-preserving and static slicing. A
variety of such algorithms exists. Those are based on a wide range of program representations,
from abstract syntax trees (AST) [19, 18], through value dependence graphs (VDG) [16, 60]
and the static single information (SSI) form [54], to control flow graphs (CFG) [61, 64] and the
program dependence graph (PDG) [49, 33, 6]. We briefly mention Weiser’s original approach and
the PDG-based one.
According to Weiser [61], automatic slicing should be performed on the program’s flow graph
[1] using data-flow equations. Those approximate the set of variables that may be (directly or
indirectly) relevant for a given slicing criterion at each node of the graph. A node n is added to the
slice if it defines (i.e. may modify the value of) any of those relevant variables (associated with n).
Furthermore, any “branch statement which can choose to execute or not execute” another node
which is already in the slice, should, in general, be added to the slice [61]. Thus, both data-flow
and control-flow influences are taken into account, iteratively, until a fixed point is reached.
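As a rough illustration — ours, in Python, and no substitute for Weiser’s full algorithm — the data-flow part of the scheme can be sketched for straight-line code, where a single backward pass suffices and neither control-flow influences nor the fixed-point iteration are needed. The (label, defs, refs) representation of statements is an assumption of the sketch:

def weiser_slice(stmts, criterion_vars):
    # Backward pass: a statement enters the slice if it defines a variable
    # that is currently relevant; its referenced variables then become relevant.
    relevant = set(criterion_vars)
    in_slice = set()
    for label, defs, refs in reversed(stmts):
        if defs & relevant:
            in_slice.add(label)
            relevant = (relevant - defs) | refs
    return in_slice

# Statements 1-3 of the sum-and-prod example, sliced for sum at their end:
stmts = [(1, {'i'}, set()), (2, {'sum'}, set()), (3, {'prod'}, set())]
print(weiser_slice(stmts, {'sum'}))   # {2}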
When the slicing criterion involves all variables that are referenced in a selected program point,
the program dependence graph (PDG) offers a fast algorithm for computing the corresponding
slice. The PDG, like the flow graph, contains a node for each program statement; the directed
edges (relevant for slicing) correspond to direct data-flow and control-flow influences, respectively.
Thus, slicing is reduced to a reachability problem. Each node from which there is a directed path
to the selected node (the one corresponding to the slicing criterion) is added to the slice. This solution is
particularly effective in a situation in which many slices are to be computed on the same program,
since the time to construct the PDG can hence be amortised over all slice computations.
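To make the reachability reading concrete, the following Python sketch slices the sum-and-prod example by graph search; the dependence edges are written out by hand here (a real slicer would compute them), and under this node-based criterion the selected statement itself belongs to the slice:

from collections import deque

# node -> the nodes it directly depends on (data- or control-flow)
DEPS = {1: set(), 2: set(), 3: set(),
        4: {1, 7},             # the loop test reads i
        5: {1, 2, 4, 5, 7},    # sum := sum+a[i]
        6: {1, 3, 4, 6, 7},    # prod := prod*a[i]
        7: {1, 4, 7},          # i := i+1
        8: {2, 5},             # out << sum
        9: {3, 6, 8}}          # out << prod (out was appended to at 8)

def pdg_slice(node):
    # Everything from which the criterion node is reachable, i.e. everything
    # reachable from it along reversed dependence edges.
    seen, work = {node}, deque([node])
    while work:
        for m in DEPS[work.popleft()]:
            if m not in seen:
                seen.add(m)
                work.append(m)
    return seen

print(sorted(pdg_slice(8)))   # [1, 2, 4, 5, 7, 8]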
Note that both Weiser’s and the PDG-based approaches consider control and data dependences
on the same level. That is, at each step of the respective algorithm, both kinds of influences are
taken into account in adding statements to the computed slice. That choice is challenged in this
thesis.
As an alternative, we shall primarily consider control dependences, in producing program
entities called slides (see Chapter 8). Those slides shall then be treated as primitives in a novel slicing algorithm (Chapter 9) that involves data dependences (or rather slide dependences) only.
Thus, when interested in slices from the end of a program, our algorithm will yield traditional
slices, as in the algorithms above, on the one hand, while producing potentially smaller statements,
for what we call co-slices (Chapter 10), on the other.
Our program representation of slides and hence the slicing algorithm will benefit from the popu-
lar intermediate representation (IR) of static single assignment (SSA) [12, 54], which is introduced
next.
2.2.7 SSA form
Static single assignment form is an intermediate program representation in which “every variable
assigned a value in it occurs as the target of only one assignment” [46]. As such, it has proved useful
in static program analysis, in particular for implementing fast and effective optimizing compilers
[46]. Typically, a program is translated into SSA form before performing some program analyses
and related transformations (e.g. constant propagation, invariant code motion); once done, the
transformed program is translated back to its original form.
Jeremy Singer defines the SSA form, along with its younger sister of static single information
(SSI), as members of a more general family of IRs, of virtual register renaming schemes (VRRS)
[54]. Any VRRS family member can be generally described as a control flow graph (CFG) with a
certain renaming scheme applied to its set of virtual registers.
When a defined register (or variable) is renamed, all its references must be renamed too. Thus,
each reference must be reached by a single corresponding definition. This is achieved by adding so-
called pseudo variables in merge points of the CFG. Those merge all reaching definitions into one
new name, using a so-called φ-function. For example, on exit from an IF statement, x3 := φ.(x1, x2) would merge the two values x1 and x2 (we call them instances, and accordingly x3 a pseudo instance) of the two branches of the IF. The merge is such that x1 will be taken on arrival from one branch, and x2 if arriving from the other. In general, there are as many arguments to a
φ-function as there are incoming edges in the control flow.
In our context of source-to-source transformations, supporting a simple language with struc-
tured control flow and no aliasing (as will be explained later), we shall be interested in an SSA-like
renaming on the level of program variables (rather than virtual registers). For formalising and
giving examples of SSA, and since our φ-functions will always require two arguments, we shall
avoid them altogether. Instead, they will be pushed back to the incoming branches, separated
into two individual assignments (e.g. x3 := x1 at the end of one branch of an IF statement, and x3 := x2 at the other). Accordingly, a variable that is assigned (and used) in a DO loop will
have a designated pseudo instance assigned both before the loop and at the end of its body. For
example, the SSA version of the sum and prod program from above will be as follows:
  i1 := 0
; sum2 := 0
; prod3 := 1
; i4,sum4,prod4 := i1,sum2,prod3
; while i4<a.length do
    sum5 := sum4+a[i4]
  ; prod6 := prod4*a[i4]
  ; i7 := i4+1
  ; i4,sum4,prod4 := i7,sum5,prod6
  od
; out8 := out ++ sum4
; out9 := out8 ++ prod4
Note that appending to the out stream had to be simplified (with ++ taking the place of <<).
Note also that, following SSA tradition, we rename instances by adding a subscript. However, we
deviate from the common practice of using increasing natural numbers for the instances of each
variable. Instead, in program examples we prefer to use, for each assignment, its (informal) label
as a unique subscript (for all variables defined in it). Thus, our loop pseudo instances (e.g. i4) are
defined in two distinct places, on the one hand, but both define the same instance, on the other.
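For straight-line code, such a renaming is a single forward pass. The following Python sketch (with our own simplified representation — one assignment per line, whitespace-separated tokens — and leaving the loop pseudo-instance treatment described above aside) illustrates the idea:

def to_ssa(stmts):
    current = {}    # variable -> its current instance name
    renamed = []
    for label, target, rhs in stmts:
        # rename uses to current instances, then create a fresh instance
        # for the target, subscripted by the assignment's label
        rhs = ' '.join(current.get(tok, tok) for tok in rhs.split())
        current[target] = f'{target}{label}'
        renamed.append(f'{current[target]} := {rhs}')
    return renamed

print(to_ssa([(1, 'i', '0'), (2, 'sum', '0'), (3, 'sum', 'sum + i')]))
# ['i1 := 0', 'sum2 := 0', 'sum3 := sum2 + i1']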
The SSA form will be formalised in Section 8.6 and Appendix D and applied for slicing and
sliding. In particular, our new slicing algorithm is based on slides of the SSA form. This slicing
algorithm should in turn be useful in automating slice-extraction transformations. Accordingly,
we now turn to present background material on that problem.
2.3 Slice-extraction refactoring
An interactive process for behaviour-preserving method extraction was first proposed by Opdyke
[48] and (independently) by Griswold and Notkin [26]. This was however restricted to the extrac-
tion of contiguous code.
Maruyama, in a paper titled “Automated Method-Extraction Refactoring by Using Block-
Based Slicing” from 2001 [42], proposed a scenario in which extraction is performed on a block of statements (i.e. a compound statement — we shall simply call it a statement — acting as scope for extraction) and a user-selected variable whose computation is to be extracted from the
remaining computation. A slice of the selected variable in the selected scope is extracted into a
new method; a call to that method is placed ahead of the code for the remaining computation. We
call the latter the complement, borrowing Gallagher’s terminology [22]. Maruyama’s rudimentary
solution is described formally; the importance of proving correctness is highlighted, but no proof
is given.
Earlier research on extracting slices from existing systems, in the context of software reverse
engineering and reengineering, including that of Lanubile and Visaggio [41] and Cimitile et al. [10],
has focused mainly on how to discover reasonable slicing criteria. In our context of refactoring,
we prefer to leave the choice of what to extract to the programmer. Our approach, in turn, will
focus mainly on how to reorganise the original code to benefit from the extraction.
In what follows, we explore state-of-the-art solutions to a related problem, which we call arbitrary method extraction, in which, instead of extracting the slice of a variable, a set of (not necessarily contiguous) statements is selected for extraction.
Lakhotia and Deprez defined an arbitrary-method-extraction transformation called Tuck [40].
In tucking, a selection of statements and scope for extraction is made (either by the user or
some other tool), and a tool, in response, computes the slice (of the selected statements) and
complement, and composes them sequentially, along with some compensation that may include
backup of initial values and variable renaming (in the complement). The complement itself is
computed through slicing from all non-extracted statements (in the selected scope). If both the
slice and its complement define a variable that is live on exit from scope, the transformation is
rejected.
Suppose we are asked to extract the slice of statement 2, in the following:
1 ; while i<a.length do
2     sum := sum+a[i]
3   ; prod := prod*a[i]
4   ; i := i+1
    od
Tucking would compute a statement made of {1,2,4} as the slice to be extracted; from the remaining statement, 3, it would compute {1,3,4} as basis for the complement; thus so long as the variable i is not live-on-exit (i.e. will not be used before being re-defined after the loop), we
would get something like:
ii := i
; while i<a.length do
sum := sum+a[i]
; i := i+1
od
; while ii<a.length do
prod := prod*a[ii]
; ii := ii+1
od
The final step of Tuck would then fold the extracted slice into a reusable method.
The 2000 version of arbitrary method extraction by Komondoor and Horwitz (to be referred
to as KH00 [38]) is particularly effective in reordering statements. A sequence of statements
is selected as scope, and a subset of those is selected for extraction. The algorithm seeks valid
permutations of the sequence in which the selected statements are grouped together (i.e. forming
contiguous code). The validity of a permutation depends on some control flow and data flow
related constraints.
For example, suppose we are asked to extract the computation and printing of sum, on state-
ments {2,3,7}in the following:
1   i,sum := 0,0
2 ; while i<a.length do
3     i,sum := i+1,sum+a[i]
    od
4 ; i,prod := 0,1
5 ; while i<a.length do
6     i,prod := i+1,prod*a[i]
    od
7 ; out << sum
8 ; out << prod
KH00 would successfully yield the following two alternatives:
1   i,sum := 0,0
2 ; while i<a.length do
3     i,sum := i+1,sum+a[i]
    od
7 ; out << sum
4 ; i,prod := 0,1
5 ; while i<a.length do
6     i,prod := i+1,prod*a[i]
    od
8 ; out << prod

4   i,prod := 0,1
5 ; while i<a.length do
6     i,prod := i+1,prod*a[i]
    od
1 ; i,sum := 0,0
2 ; while i<a.length do
3     i,sum := i+1,sum+a[i]
    od
7 ; out << sum
8 ; out << prod
In comparison to tucking, this algorithm extracts precisely the selected statements, even if
those do not form a complete slice. This is made possible by allowing two complementary parts:
one to be executed before the extracted code, the other after.
According to this algorithm, neither duplication nor compensatory code is permitted. Conse-
quently, in cases where no permutation satisfies all constraints, the transformation is rejected.
In their 2003 version of arbitrary method extraction [39], Komondoor and Horwitz have relaxed
their earlier restriction on duplication: this time predicates (as well as jumps, which are outside
the scope of this thesis) are allowed to be duplicated, while other statements, e.g. assignments, are
not. Furthermore, this time, instead of having to reject some transformation requests, when no
permutation satisfies all ordering constraints, the new strategy is to drag problematic statements
along with the selected statements, to the extracted part of the resulting program. They refer to
that dragging as promotion.
A first criticism of such promotion, by Harman et al., has appeared in [28], where amorphous
slicing has been suggested for procedure and function extraction. Their target was to support
program comprehension. Accordingly, their transformations are exploratory, with no aim to keep
the resulting program, as we do in refactoring.
Another technique of code untangling for program comprehension, called fission (reverse of
fusion), has been suggested by Jeremy Gibbons [24]. With fission, the design of a program can
be reconstructed from its implementation. Gibbons illustrates the approach using code examples
from the slicing literature [22]. Indeed, according to Gibbons, “slicing is a fission transformation,
reversing the fusion of independent but similarly-structured computations”. In contrast to program
comprehension, in refactoring the focus is on automation with syntax preservation, such that the
resulting program would reflect an update in the design whilst being easily recognised by the
programmer.
As with Harman et al. and Gibbons, the KH03 is not (and was not designed to be) a slice-
extraction algorithm. (It was actually designed for reducing code duplication by eliminating
multiple clones of code, replacing them all with method calls.) For example, KH03 cannot untangle
the computation of sum from prod, as the Tuck transformation does, since that would involve
a duplication of the assignment to i. In general, in KH03, any loop would have to be either
completely extracted or not at all.
In comparison to its predecessor KH00, the KH03 algorithm presents a step forward in the sense
that some duplication is allowed, thus rendering it more applicable (in fact it is totally applicable,
as in the worst case it extracts the whole program in scope). It offers another improvement with
respect to jumps (which are, again, outside the scope of our investigation). However, in an attempt
to make it more scalable (its complexity is polynomial, compared with the exponential KH00), it
might extract more statements. In the example of KH00 above, the KH03 algorithm would fail to
move the computation of prod out of the way; instead it would extract it along with the selected
statements.
Nevertheless, the KH03 algorithm offers one improvement over its predecessors, which is rele-
vant for slice extraction. Komondoor and Horwitz criticise (in [39]) the Tuck transformation for
not allowing data to flow from the extracted slice to its complement. This results in too large
complements, as is demonstrated in the next example. Suppose we are asked to extract statements
{1,2,4,6} (i.e. the slice of out) from the following program (the first version below). KH03 would yield, in response, the second version:
1 ; while i<a.length do
2     sum := sum+a[i]
3   ; prod := prod*a[i]
4   ; i := i+1
    od
5 ; avg := sum/a.length
6 ; out << sum

1 ; while i<a.length do
2     sum := sum+a[i]
3   ; prod := prod*a[i]
4   ; i := i+1
    od
6 ; out << sum
5 ; avg := sum/a.length
Note that tucking, on this example, would have duplicated the entire computation of sum,
whereas the KH00 algorithm would have failed, since the selection does not form a valid sequence.
The challenge of this thesis will be to combine the untangling abilities of Tuck with improved
applicability and reduced levels of code duplication, as in KH03, thus yielding (for the example
above, and even in a case where i is live-on-exit) something like:
ii := i
; while i<a.length do
sum := sum+a[i]
; i := i+1
od
; out << sum
;
i := ii
; while i<a.length do
prod := prod*a[i]
; i := i+1
od
; avg := sum/a.length
This completes our presentation of background material and related work on the topics of
refactoring, slicing and slicing-based refactoring. Our approach to solving the problem of slice
extraction is based on formal semantics using so-called predicate transformers. The next chapter,
our second and last background chapter, will introduce relevant concepts and basic theory.
Chapter 3
Formal Semantics: Predicate Transformers
This chapter introduces background material on the formal approach for program semantics to be
adopted by this thesis. It is mainly based on Dijkstra and Scholten’s monograph Predicate Calculus
and Program Semantics [13] (to be referred to as DS). Relevant properties and theorems will be
recalled. Those will later be used in formally developing our framework of correct transformations.
Furthermore, background on the concept of refinement and its relevance for refactoring and slicing
is presented.
3.1 Set theory for program variables
Some operations and properties from set theory will be useful in discussing sets of program vari-
ables and in calculating program properties.
3.1.1 Sets and lists of distinct variables
For simplicity and convenience, we will interchangeably speak of lists and sets of variables. In
programs (as well as descriptions of transformations), lists will often be used, whereas in semantic
reasoning and calculation sets will be preferred. This choice can be justified by the fact that in
our use of lists, the order of elements will only be significant for matching with corresponding
lists (e.g. the two lists in a multiple assignment statement x,y := 1,2) and elements will not
appear more than once (as is the case for sets).
Thus, we will also take the liberty to use set operations directly on (such) lists. This is a
mere shorthand for taking the sets corresponding to those lists before applying the operation, and
turning the result back into linear list form, afterwards.
The size of a set (or length of a list) V will be denoted |V|.
3.1.2 Disjoint sets and tuples
Since disjointness of two sets will be used extensively, we adopt the notation V1 ⊥ V2 as shorthand for V1 ∩ V2 = ∅, where V1, V2 are either lists or sets.
When referring to the union of disjoint sets (of variables), say X and Y, we shall write (X,Y). This should be understood as X ∪ Y with an implicit statement that X ⊥ Y is given. Note that as with n-tuples, any number of sets would be admitted, and the brackets are not optional.
That same notation shall be used also for pair (or in general n-tuple) forming. There, however,
the elements will not necessarily be sets of variables.
Admittedly, having the same notation for both tuples and disjoint set-union is potentially
confusing. Nonetheless, it appears that the actual meaning can be easily inferred from the context.
3.1.3 Generating fresh variable names
When performing a transformation, we shall soon find ourselves with a need to generate fresh
variable names. For this purpose, we offer two versions of a function called fresh. The first version
shall take a pair (n,V) as an argument, with n a natural number (for length) and V a set of variables (that are presently in use). In turn, it shall produce a set of fresh names, say X′, such that (Q1:) |X′| = n, and (Q2:) X′ ⊥ V.
Here, (Q1:) and (Q2:) are names of the formal requirements (or postconditions). Those
names will be recalled when applying fresh, in hints of derivation steps. We shall use this format
throughout the thesis.
Using an infix ‘.’ (= full stop) for function application, we shall be writing X′ := fresh.(n,V), where n will typically be the length of a given set, say |X|.
When new instances (e.g. for backup) of existing variables are required, the second version of fresh will be used. It takes the form X′ := fresh.(X,V) with (X,V) a pair of sets of variables. This time, we postulate (Q1:) |X′| = |X|, (Q2:) X′ ⊥ V, as before, and an extra requirement (Q3:) X = sv.X′, where sv is a globally available mapping of such freshly generated variables to their corresponding original source variables.
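A small Python sketch of both versions follows, under the assumption (ours) that names of the shape v0, v1, ... are acceptable as fresh; the caller is expected to pass all names presently in use as V:

sv = {}   # the globally available mapping from fresh names to source variables

def fresh(arg, in_use):
    # fresh.(n,V): arg is a number n; yields n names disjoint from V (Q1, Q2).
    # fresh.(X,V): arg is a list X; additionally records sources in sv (Q3).
    sources = list(arg) if isinstance(arg, (list, tuple)) else [None] * arg
    names, i = [], 0
    for src in sources:
        while f'v{i}' in in_use or f'v{i}' in names:
            i += 1
        names.append(f'v{i}')
        if src is not None:
            sv[names[-1]] = src
    return names

print(fresh(2, {'x', 'v0'}))       # ['v1', 'v2']
print(fresh(['x', 'y'], {'v1'}))   # ['v0', 'v2'], with sv mapping them to x, y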
3.2 Predicate calculus
3.2.1 The state-space metaphor
In imperative programming, a given program, say S, manipulates variables by changing their
corresponding value. The collective value of all program variables, at any point of execution, is
known as ‘the state’. A computation under control of S begins with a given initial state (i.e. its
input), and terminates (if at all) in a final state (i.e. its output).
This terminology follows a metaphor of a ‘state space’, according to which, each program
variable, say x, being associated with a possibly infinite but denumerable and non-empty set of
distinct possible values (i.e. its type, denoted T.x), stands for a dimension of the state space.
Each possible value, val ∈ T.x, is then associated with a single coordinate. Thus, any point in the state space uniquely represents (by its coordinates) the value of all program variables. (This should not be confused with abstract values, which are normally represented by program variables — it is the variables’ corresponding concrete values that are represented, at least metaphorically, by a point in the state space.)
3.2.2 Structures, expressions and predicates
According to DS, a structure is “an abstraction over expressions in program variables in the sense
that the state space with its individually named dimensions has been eliminated from the picture”
[13, Page 5]. There, that abstraction was chosen for developing a general theory. Since we adopt
their theory only in the context of program semantics, we shall directly speak of expressions (over
the state space). Note that structures, and hence expressions in program variables, are associated
with a type. Thus integer expressions are distinguished from e.g. boolean expressions. The latter
expressions are also known as predicates.
Hence, a predicate is an expression whose so-called global (i.e. free) variables are program
variables. When evaluated, on a particular state, those variables are assigned (i.e. replaced with)
the specific values (corresponding to that state). Thus, a predicate expresses a dichotomy on a
program’s state space.
The syntax for expressing predicates includes the constants (or boolean scalars) false and true, relational operations (e.g. =, ≠, <, ≤) on expressions with program variables and possibly logical variables which must be local (i.e. bound) in the predicate, logical connectives (e.g. ¬, ∧, ∨, ⇒), the universal and existential quantifiers (∀ and ∃ respectively) and specific predicate transformers (i.e. functions from predicates to predicates) which will define the semantics of our programming language.
The universal (∀) and existential (∃) quantifiers generalise conjunction and disjunction, respectively. The format of the former is (quoted here from DS):
(∀ dummies : range : term) .
“Here, dummies stands for an unordered list of local variables, whose scope is delineated by the
outer parenthesis pair. In what follows, x and y will be used to denote dummies; the dummies
may be of any understood type.
The two components range and term are boolean structures, and so is the whole quantified
expression, which is a boolean scalar if both range and term are boolean scalars. Range and
term may depend on the dummies; their potential dependence on the dummies will be indicated
explicitly by using a functional notation, e.g., if a range has the form r.x ∧ s.x.y, it is a conjunction of r.x, which may depend on x but does not depend on y, and s.x.y, which may depend on both.
. . . For the sake of brevity, the range true is omitted” [13, Pages 62-63].
The format for existential quantification is similar, only with ∃ in place of the ∀.
3.2.3 Square brackets: the ‘everywhere’ operator
As mentioned, predicates may involve global occurrences of program variables. An important
function from boolean structures (i.e. predicates) to boolean scalars (i.e. true and false ) is the
so-called everywhere operator [13, Page 8]. Its application is denoted by surrounding a predicate,
say P, with a pair of square brackets, [P].
When applied to a boolean scalar, the everywhere operator acts as identity (i.e. [true] = true
and [false] = false ); when applied to a predicate on a given state space, it acts as the universal
quantification over all variables (i.e. dimensions) of that space ([13, Page 115]). Thus, [P] yields
true if and only if P holds in every single point of the state space (i.e. for any possible assignment
of values to program variables occurring in it).
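Since the state space is infinite, [P] cannot in general be computed by enumeration; still, a brute-force check over a small finite fragment — a sketch of our own, useful only for refuting a claimed [P] — conveys the idea:

from itertools import product

def everywhere(pred, variables, domain):
    # evaluate pred at every point of the finite state space domain^|variables|
    return all(pred(dict(zip(variables, point)))
               for point in product(domain, repeat=len(variables)))

# [x = y] fails to hold everywhere, whereas a tautology does:
assert not everywhere(lambda s: s['x'] == s['y'], ['x', 'y'], range(-3, 4))
assert everywhere(lambda s: s['x'] == s['y'] or s['x'] != s['y'],
                  ['x', 'y'], range(-3, 4))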
3.2.4 Functions and equality
Functions in DS are always total in their arguments (i.e. well-defined for all possible values of their
arguments). They are defined as the unique solution of “an equation that contains the argument(s)
of the function being defined as parameter(s)” [13, Page 18].
Function application — as mentioned, denoted by an infix ‘.’ (= full stop) — is left-associative,
such that f.x.y should be read as (f.x).y.
For example, an integer increment function incr.x = x + 1 (with the operator + itself defined as a function of its operands) is defined as the solution of y : [y = x + 1], or simply [incr.x = x + 1].
Note that the equality of a pair of expressions over the state space, as in y = x + 1, is not merely a boolean scalar, but rather a boolean expression over that same state space. (For example, consider the state space spanned by (x,y); there, applying y = x+1 to point (5,6) yields true but
applying it to (5,7) yields false.) Hence, it is the square brackets, i.e. the everywhere operator,
that turns the boolean expression into a scalar.
That function application preserves equality — a statement attributed to Leibniz and hence
sometimes referred to, in hints, as “Leibniz” — is formulated, for any function f and arguments
x and y, as
[x = y] ⇒ [f.x = f.y] .    (3.1)
The equality of expressions of type boolean, i.e. equivalence of predicates, can be written either with = or ≡. The latter is assigned a lower binding power than all other logical connectives, such that the round brackets in e.g. [(P ∧ Q) ≡ (¬P ∨ Q)] can be removed.
3.2.5 Global variables in expressions, predicates and programs
The set of program variables occurring as global (i.e. free) variables in any expression E (including predicates) and program statement S will be referred to as glob.E and glob.S, respectively.
For convenience and brevity, we shall allow the argument to glob to be any mixed n-tuple of expressions and statements. This should be read as shorthand for the union of all individual sets. For example glob.(S1,P,E,S2) is short for glob.S1 ∪ glob.P ∪ glob.E ∪ glob.S2.
3.2.6 Substitutions
Functions from predicates to predicates, known as predicate transformers, will shortly be presented
and (later) applied for defining program semantics. In our context of predicates on a state space,
such primitive transformers are known as substitution predicate transformers [13, Page 114]. (See
also [13, Chapter 2] for an introduction on substitution and replacement.)
A new predicate, say P′, can be generated from an existing predicate P, by replacing all global occurrences of program variables V with a matching list (in length and corresponding types) of expressions E. For the syntax of substitutions we deviate from DS (who would write (V := E).P) and adopt Morgan’s P[V\E] (for P with V replaced by E, [45]). Those square brackets are assigned the highest binding power; with postfix application being left associative, this will allow writing e.g. f.P[V1\E1].Q[V2\E2][V3\E3] for (f.(P[V1\E1])).((Q[V2\E2])[V3\E3]). (Also, this format will cleanly allow later definition of special kinds of substitution, by prefixing V with the new substitution’s name.)
By definition, substitution distributes over all logical connectives. When distributing a substi-
tution over a quantifier, potential name clashing (i.e. if local variables whose scope is bound by
the quantifier have the same name as global variables in E) is avoided by renaming the local ones.
Just as we did with the function glob above, for convenience, and since predicates are merely
boolean expressions in program variables, we apply substitutions to any expression, program
statement, or a mixed n-tuple of those. For statements, we shall avoid potential problems (e.g.
what does it mean to replace the target of an assignment with an expression?) by restricting
ourselves to a so-called simple substitution [45, Page 105], substituting variables by variables.
Moreover, to avoid introducing aliases, the new variables will have to be distinct and fresh. More
precisely, for freshness in S[X\Y] we expect Y ⊥ (glob.S \ X).
Some properties of substitution are worth noting. In the following, let A stand for any expression (including predicates) or statement.
Let X be any list of variables; then redundant self-substitutions can always be introduced or removed; we thus postulate
A[X\X] = A .    (3.2)
Another simplification allows the merge of following substitutions, if they form a needless chain; as in the postulate
A[X\Y][Y\E] = A[X\E]    (3.3)
provided Y ⊥ (glob.A \ X).
From the preceding two postulates, we can derive conditions for removing (or introducing) redundant reversed double substitutions. Thus
A[X\Y][Y\X] = A    (3.4)
provided Y ⊥ (glob.A \ X).
A different kind of merge of following substitutions is postulated for cases when the substituted variables are disjoint, and the first substitution does not affect the second. Thus
A[X1\E1][X2\E2] = A[X1,X2\E1,E2]    (3.5)
provided X1 ⊥ X2 and X2 ⊥ glob.E1.
Finally, we can simply derive from the preceding postulate the conditions for swapping independent substitutions. Thus
A[X1\E1][X2\E2] = A[X2\E2][X1\E1]    (3.6)
provided X1 ⊥ X2, X1 ⊥ glob.E2 and X2 ⊥ glob.E1.
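Postulate (3.4) and the role of its proviso can be exercised mechanically, e.g. with sympy — a sketch of ours, in which sympy’s subs plays the role of our substitution:

from sympy import symbols

x, y, e = symbols('x y e')
A = x + 2*y                  # glob.A = {x, y}

# (3.4): A[x\e][e\x] = A holds, since e is fresh (e disjoint from glob.A \ {x}):
assert A.subs({x: e}).subs({e: x}) == A

# violating the proviso with Y = y, which occurs in A, breaks the round trip:
assert A.subs({x: y}).subs({y: x}) == 3*x   # not A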
3.2.7 Proof format
According to the DS proof format [13, Chapter 4], designed for avoiding needless repetition in long derivations, if [A = C] can be proved by [A = B] and [B = C] for some intermediate expression B, we write
A
=    {here comes a hint to why [A = B]}
B
=    {and here another hint, for [B = C]}
C ,
from which the desired [A = C] can be inferred. Note that A, B and C are not necessarily boolean expressions, and hence the = rather than the more specific ≡. In any case, following DS, even for boolean expressions the = will be preferred (in derivations). The ≡, instead, will mostly be used in single-line expressions, thus exploiting its low binding power and emphasising that the arguments are boolean.
Since [A ⇒ B] ∧ [B ⇒ C] ⇒ [A ⇒ C], when some steps in a derivation are of implication (⇒), the conclusion is an implication too. Similarly for the follows-from (⇐) connective, as long as the two (⇒ and ⇐) are not mixed (in a single derivation).
3.2.8 From the calculus
The following set of theorems and equations are borrowed from the calculus of boolean structures,
as defined (and proved) by Dijkstra and Scholten in [13]. Instead of an exhaustive collection, we
state here only non-trivial results that will be of use in the course of this thesis.
The first theorem is proved in (DS: 5,96) of [13], i.e. Equation 96 of Chapter 5.
Theorem 3.1. For any set W, predicate P, and any function f from the (type of the) elements of W to predicates,
W ≠ ∅  ⇒  [(∀x : x ∈ W : P ∧ f.x) ≡ P ∧ (∀x : x ∈ W : f.x)] ,    (3.7)
i.e., provided the range is non-empty, conjunction distributes over universal quantification.
The following is taken from (DS: 5,69) in [13].
Theorem 3.2 (Contra-positive). For any P, Q
[P ⇒ Q  ≡  ¬Q ⇒ ¬P] .    (3.8)
The (punctual) monotonicity of quantifiers is borrowed from (DS: 5,102) and [13, Page 79].
Theorem 3.3. For any r, f, g
[(∀x : r.x : f.x ⇒ g.x) ⇒ ((∀x : r.x : f.x) ⇒ (∀x : r.x : g.x))]    (3.9)
[(∀x : r.x : f.x ⇒ g.x) ⇒ ((∃x : r.x : f.x) ⇒ (∃x : r.x : g.x))] .    (3.10)
Another property of existential quantification is taken from [13, Page 79].
Theorem 3.4. For any P, r, f
[P ∧ (∃x : r.x : f.x) ≡ (∃x : r.x : P ∧ f.x)] .    (3.11)
Finally, the Laws of Absorption are proved in (DS: 5,23) and (DS: 5,24) of [13].
Theorem 3.5. Conjunction and disjunction satisfy the Laws of Absorption, i.e., for any P, Q
[P ∧ (P ∨ Q) ≡ P]    (3.12)
[P ∨ (P ∧ Q) ≡ P] .    (3.13)
3.3 Program semantics
3.3.1 Predicate transformers
In Dijkstra and Scholten’s approach to program semantics, a program S stands for the set of all computations possible under its control. With respect to a predicate P, defining a dichotomy on the state space on which S operates, each computation C may be of one of the following three classes: (1) “eternal” (i.e. fails to terminate); (2) “finally P” (i.e. terminates in a final state satisfying P); or (3) “finally ¬P” (i.e. terminates in a final state satisfying ¬P).
Each of the following three predicates defines a dichotomy on the initial state space of S (with the second and third corresponding to final states satisfying P): (1) wp.S.true (i.e. each computation under control of S is either “finally P” or “finally ¬P”); (2) wlp.S.P (i.e. either “eternal” or “finally P”); and (3) wlp.S.(¬P) (either “eternal” or “finally ¬P”).
Here wlp.S emerges as a function from predicates to predicates, i.e. a predicate transformer, and wlp stands for weakest liberal precondition. Only universally conjunctive predicate transformers are admitted as wlp.S. (See the following section for a formal definition of different types of junctivity.)
Similarly to wlp, the weakest precondition predicate transformer wp.S is defined as
[wp.S.P ≡ wp.S.true ∧ wlp.S.P] for all P.
Being functions (from predicates to predicates), predicate transformers enjoy Leibniz’s Rule, i.e. for any predicate transformer f, we have [P ≡ Q] ⇒ [f.P ≡ f.Q].
A predicate transformer f is said to be monotonic (with respect to implication) if and only if (∀P,Q :: [P ⇒ Q] ⇒ [f.P ⇒ f.Q]). Indeed, all predicate transformers used for program semantics will be monotonic.
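By way of illustration, here is a Python sketch of wp for assignment, sequential composition and IF, using sympy for the predicates; the three equations in the comments are the textbook ones, and the thesis’s official definitions follow in Section 4.2 and Appendix A:

from sympy import symbols, And, Implies, Not

#   wp.(v := e).P                =  P[v\e]
#   wp.(S1 ; S2).P               =  wp.S1.(wp.S2.P)
#   wp.(if b then S1 else S2).P  =  (b => wp.S1.P) and (not b => wp.S2.P)
def wp(S, P):
    tag = S[0]
    if tag == 'assign':
        _, v, e = S
        return P.subs({v: e})
    if tag == 'seq':
        _, S1, S2 = S
        return wp(S1, wp(S2, P))
    if tag == 'if':
        _, b, S1, S2 = S
        return And(Implies(b, wp(S1, P)), Implies(Not(b), wp(S2, P)))
    raise ValueError(tag)

x, y, m = symbols('x y m')
prog = ('if', x > y, ('assign', m, x), ('assign', m, y))
pre = wp(prog, And(m >= x, m >= y))
assert pre.subs({x: 5, y: 3}) == True   # the precondition holds at (5, 3)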
3.3.2 Different types of junctivity
A predicate transformer f is universally conjunctive if and only if
(∀V : V is a bag of predicates : [f.(∀P : P ∈ V : P) ≡ (∀P : P ∈ V : f.P)]) ; similarly, it is
universally disjunctive if and only if
(∀V : V is a bag of predicates : [f.(∃P : P ∈ V : P) ≡ (∃P : P ∈ V : f.P)]).
Other (weaker) types of junctivity include positive junctivity and finite junctivity. The former
differs from universal junctivity in that its junctivity should apply not to any bag (of predicates)
V, but rather to non-empty ones, whereas the latter’s junctivity is expected to apply to any
non-empty bag with a “finite number of distinct predicates” [13, Page 87].
From the definitions, it follows that if a predicate transformer f is universally conjunctive it
is also positively so, and if positively conjunctive it is finitely so. It can also be shown that if f
is finitely conjunctive, it is monotonic (as defined above). Finally, note that a similar weakening
order holds for the corresponding disjunctivity types.
In the next chapter (see Section 4.1.5) we shall define the semantics of a programming lan-
guage exclusively from universally disjunctive and positively conjunctive predicate transformers. It
should be noted that all substitutions, defined earlier (in Section 3.2.6) as predicate transformers,
are universally junctive (see [13, Page 117]).
As an example for the use of finite junctivity, consider the following theorem, which deals with
absorption of termination.
Theorem 3.6 (Absorption of Termination). For any statement S and predicate P, we have
[wp.S.P ∧ wp.S.true ≡ wp.S.P]    (3.14)
provided wp.S is finitely conjunctive. We also have
[wp.S.P ∨ wp.S.true ≡ wp.S.true]    (3.15)
provided wp.S is finitely disjunctive.
Proof. For the former, we observe
wp.S.P ∧ wp.S.true
=    {wp.S is finitely conjunctive (proviso)}
wp.S.(P ∧ true)
=    {identity element of ‘∧’}
wp.S.P ,
and then for the latter, we similarly observe
wp.S.P ∨ wp.S.true
=    {wp.S is finitely disjunctive (proviso)}
wp.S.(P ∨ true)
=    {zero element of ‘∨’}
wp.S.true .
Note that each part of the proof, being two steps long, will only save one step whenever applied. However, at least the former case will be extensively used, and will thus be worth its while.
Since our focus will be on deterministic programs, we shall now turn to define formally what
is meant by a program being deterministic.
3.3.3 A definition of deterministic program statements
Interpreting the predicate wlp.S.(¬P) as holding in all initial states for which no computation under control of S is “finally P”, leads to another interesting predicate, ¬wlp.S.(¬P), holding in initial states for which there exists such a computation. So wp.S.P holds where termination in P is unavoidable and ¬wlp.S.(¬P) holds where terminating in P is merely possible. Dijkstra and Scholten’s interpretation of a program being deterministic follows “what is possible is also unavoidable”, and thus the expectation [wp.S.P ⇐ ¬wlp.S.(¬P)] for all P.
Since it can be shown that [wp.S.P ⇒ ¬wlp.S.(¬P)] for all P, a program S is considered deterministic if and only if
[wp.S.P ≡ ¬wlp.S.(¬P)]    (3.16)
for all P.
The so-called conjugate of a predicate transformer f is a predicate transformer, f∗, for which [f∗.P ≡ ¬f.(¬P)] holds for any predicate P. Surely, due to the redundancy of double negation, “if one predicate transformer is the conjugate of another, they are each other’s conjugate” [13, Page 83] (and hence the term conjugate).
Thus, the definition of deterministic statements can be rewritten as (see also (DS: 7,7) in [13])
Definition 3.7. We have for any statement S
(S is deterministic) ≡ (wp.S and wlp.S are each other’s conjugate) .    (3.17)
The significance of this definition of conjugates comes from the fact that for any predicate transformer f and its conjugate f∗ and all types of junctivity, we have (DS: 6,13)
(the conjunctivity type of f) = (the disjunctivity type of f∗) .    (3.18)
3.4 Program refinement
In his PhD thesis [2], Back introduced in 1978 the concept of refinement as a binary relation between programs. Adapted to the DS notation, refinement can be formally defined as follows.
Definition 3.8 (Refinement). For any pair of program statements, S and T, statement S is said to be refined by T (or T is a refinement of S), writing S ⊑ T, when for any predicate P we have [wp.S.P ⇒ wp.T.P].
Since the semantics of the programming language is defined with monotonic predicate trans-
formers, any part of a given program can be replaced with a refinement of itself, independently of
the surrounding program.
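For deterministic programs the definition specialises nicely (a point developed in Section 4.1.4): T refines S exactly when T terminates, with the same final state, wherever S does. The following Python sketch checks this over a finite state space, modelling a deterministic program as a dict from initial to final states (undefined entries standing for eternal computations) — an encoding of ours, not the thesis’s:

def refines(S, T):
    # S ⊑ T: wherever S terminates, T terminates and agrees with S
    return all(s in T and T[s] == S[s] for s in S)

S = {0: 1}           # terminates only from state 0
T = {0: 1, 2: 3}     # agrees on 0, and also terminates from 2
assert refines(S, T) and not refines(T, S)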
In refinement calculi (e.g. [4, 45]), the programming language admits both specifications and
executable constructs, known as code. The goal is a process for construction of provably correct
code. According to the proposed process, this goal is achieved by specifying the requirements
formally, using a so-called specification statement. Then, in a stepwise manner, the specification
is refined into code. Each step is taken from a vocabulary of provably correct laws of refinement.
We shall introduce a similar set of laws, as relevant for our context, in the next chapter.
Refinements have been shown to be useful for behaviour-preserving transformations of existing
code (e.g. by Ward [56] and Cornélio [11]). Ward applies refinements for e.g. reengineering and
migration of code from one language to another [58]. Accordingly, his object language, a wide
spectrum language (called WSL), is fundamentally non-deterministic.
Cornélio has applied refinements directly in the context of refactoring [11] in his PhD thesis
from 2004. There, a large number of known refactorings have been formulated for a Java-like
object-oriented language, called ROOL, and applied in introducing known design patterns [23]
into a given program.
We follow, in this thesis, a similar path of using the refinement relation in developing behaviour-
preserving transformations. However, for simplicity, and since refactoring is concerned with trans-
forming code, we restrict ourselves to a simple imperative deterministic language. Furthermore,
instead of targeting a wide range of refactorings, we focus on the specific problem of slice extrac-
tion.
Slices have been formalised in the context of refinement by Ward [57, 59]. We return to those
definitions later in the thesis (in Chapter 7).
This concludes our presentation of background to the formal semantics of this thesis. In sum-
mary, we have adopted Dijkstra and Scholten’s program semantics of predicate transformers. Set
theory and the refinement relation have also been introduced, and will be used when manipulating
programs and developing behaviour-preserving transformations.
Chapter 4
A Theoretical Framework
The original part of this thesis, as introduced so far, begins in this chapter, in which we develop a
theoretical framework for proving correct transformations. The framework, building on traditional
refinement calculus, will aim to support transformations of programs written in a simple imperative
programming language. In contrast to earlier work on refinement, the language will be restricted
to deterministic constructs — thus avoiding the need to synchronize duplicated non-deterministic
choices, as will be explained. This decision is justified by the observation that refactoring is
concerned with transforming existing code rather than specifications.
The chapter begins with a preliminary section, in which some basic concepts are defined. Then,
a variation on Dijkstra’s language of guarded commands is introduced and formalised through
weakest-preconditions semantics. This will be the object language for our transformations.
The framework will be extended and specialised later, e.g. with a slicing-based proof method in
the next chapter and a slicing algorithm in Chapter 9, and will hence be applied in the formulation
and development of solutions to slice extraction and related transformations.
4.1 Preliminaries
As a programming notation, this thesis adopts a subset of Dijkstra’s guarded commands, along
with selected elements from Dijkstra and Scholten’s “Predicate Calculus and Program Semantics”
(DS) [13]. As will be described shortly, a subset of Dijkstra’s language is chosen as our core
language. This is then extended with some advanced constructs borrowed, with adaptations, from
e.g. Morgan’s “Programming from Specifications” [45].
For the sake of simplicity, we choose to make some restricting assumptions on our programming
language. These will allow a concise formulation of transformations. Admittedly, some of those
assumptions are unrealistic and others might simply not be desirable (e.g. due to performance
considerations). We return to discuss those choices in the concluding chapter, where we (briefly
and informally) evaluate the applicability of the approach to modern programming languages.
There, we shall propose to complement language extensions with extra applicability conditions.
Hence, in our language, all variables may be copied, leading to an independent clone in a new
storage location. This includes the ability to clone the input and output streams. We restrict our
attention to sequential programs, and expressions in our language (appearing in statements such
as an assignment, or the guard of an IF statement) have no side-effects. Moreover, features such as
aliasing, class hierarchy, overloading, exceptions or concurrency have been left out. As mentioned,
possible implications of including such features are evaluated in the concluding chapter.
4.1.1 On slips and slides: an alternative to substatements
A slice captures a subset of the original behaviour, thus it is said to be a subprogram. When
slicing is syntax preserving (as it normally is), one may also wish to say a slice (of statement S
with respect to variables V) is a substatement (of S). But is that so? And what is a substatement,
anyway?
For example, let S be the statement if x>y then m := x else m := y; now let S1 be m := x and S2 be if x>y then m := x. Is S1 a substatement of S? How about S2? Some may consider the former a substatement, since in terms of syntax trees, it may stand for a subtree. At the same time, others might claim the latter is a substatement, since in terms of nodes in a flow graph, it represents a subgraph.
We avoid such potential confusion, in this thesis, by refraining from speaking of substatements. Instead, S1, being a subtree, is said to be a slip of S, whereas S2 is a slide.
Any part of a statement which is in itself a statement is a slip (of that statement). Thus, if S is a primitive statement, S itself is its only slip. However, when S is a compound statement (i.e. is compounded of parts S.i, each of which is a statement in itself), then S itself is one of its slips, and all slips of each such S.i are, too, slips (or even proper slips) of S. Those slips of each S.i are sometimes referred to as proper slips, whereas the slips S.i themselves are also considered immediate slips (of S). In terms of the abstract syntax tree (AST), a slip corresponds to a subtree.
A slide is complementary to a slip and is defined for each pair of statement and slip. The slide of S on itself is the statement S with all immediate slips (if any) replaced with the empty statement skip. When S is a compound statement with immediate slips S.i, a slide of S on a slip T of S.j is the statement S with S.j replaced with the slide of S.j on T, and all other immediate slips S.i (with i ≠ j) replaced with the empty statement skip. In terms of the AST, a slide of S on T corresponds to the nodes on the path from (the node acting as root of) S to (the root of its subtree) T. But slides will not be further discussed until later, in Chapter 8, where they will be formalised.
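Though their formal treatment is deferred to Chapter 8, the two notions are easy to prototype. In the following Python sketch (our own AST representation: a node and its list of immediate slips), the slide of S on a slip T keeps the path from S to T and replaces every other immediate slip with skip:

class Node:
    def __init__(self, label, kids=()):
        self.label, self.kids = label, list(kids)
    def __repr__(self):
        args = ', '.join(map(repr, self.kids))
        return self.label if not self.kids else f'{self.label}({args})'

SKIP = Node('skip')

def slips(S):
    # S and, recursively, all slips of its immediate slips
    yield S
    for kid in S.kids:
        yield from slips(kid)

def slide(S, T):
    if S is T:    # the slide of S on itself: immediate slips become skip
        return Node(S.label, [SKIP] * len(S.kids))
    return Node(S.label, [slide(k, T) if T in set(slips(k)) else SKIP
                          for k in S.kids])

then_part = Node('m:=x')
S = Node('if x>y', [then_part, Node('m:=y')])
print(slide(S, then_part))   # if x>y(m:=x, skip)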
4.1.2 Why deterministic?
The main reason for focusing on deterministic programs is illustrated by the following inequivalence:

if true → x,y := 1,1
[] true → x,y := 2,2
fi

≠

if true → x := 1
[] true → x := 2
fi
; if true → y := 1
  [] true → y := 2
  fi
Whereas x = y is a true postcondition (for any initial state) of the first program, the second may terminate with e.g. x = 1 ∧ y = 2.
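The discrepancy can be seen by enumerating outcome sets directly — a sketch, modelling each nondeterministic program simply by its set of possible final (x, y) pairs:

from itertools import product

combined = {(1, 1), (2, 2)}               # one choice assigns both variables
split = set(product([1, 2], repeat=2))    # two independent choices
print(split - combined)                   # {(1, 2), (2, 1)}: x = y is violated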
In essence, when duplicating a non-deterministic choice, one must synchronize the two choices
in order to ensure behaviour preservation.
In this thesis we avoid such cases by restricting ourselves to deterministic programs. Future
extension of this work to include non-determinism should be possible and interesting.
Our decision can also be justified, as mentioned earlier, by the observation that non-determinism is typically useful in specification and during the design process, whereas refactoring is concerned with transforming actual code.
4.1.3 On deterministic program semantics
In general, wlp.S is more fundamental than wp.S, as there is no way of defining the former in terms of the latter. However, for deterministic programs, we observe that due to (3.16) and the redundancy of double negation (twice) wlp.S can be defined by
[wlp.S.Q ≡ ¬wp.S.(¬Q)] .    (4.1)
Thus in this thesis we leave weakest liberal preconditions alone and define the semantics of our programming language solely in terms of weakest preconditions.
But before dismissing wlp, we shall use it once more, in investigating the difference between
refinement and equivalence of deterministic program statements. (A final mention of wlp will
follow, in Section 4.1.5.)
4.1.4 On refinement, termination and program equivalence
Two deterministic program statements S, T are considered semantically equivalent if for all P we have [wp.S.P ≡ wp.T.P]. In this thesis this is denoted S = T. A weaker relation is that of
refinement. There, S is said to be refined by T (or T is a refinement of S, denoted S ⊑ T) if [wp.S.P ⇒ wp.T.P] for all P.
Essentially, what is the difference between the two relations? Clearly, it can be shown (through predicate calculus) that S = T if and only if S ⊑ T ∧ S ⊒ T. But what does it mean for T to be a refinement of S and S not a refinement of T (and thus S ≠ T)? This happens when T is “more terminating” than S. Operationally speaking, T may terminate on input for which S does not. But on input for which both terminate, the final state is guaranteed to be the same. In other words, if S is refined by T and both terminate under the exact same conditions, they are also equivalent. This is formulated in the following theorem.
Theorem 4.1. Let S, T be any two deterministic statements; then
(S = T)  ≡  (S ⊑ T ∧ [wp.S.true ≡ wp.T.true]) .
Proof.
S = T
=    {def. of program equivalence}
(∀P :: [wp.S.P ≡ wp.T.P])
=    {Lemma 4.2 (see below)}
(∀P :: [wp.S.P ⇒ wp.T.P] ∧ [wp.S.true ≡ wp.T.true])
=    {pred. calc. (3.7): the range is non-empty}
(∀P :: [wp.S.P ⇒ wp.T.P]) ∧ [wp.S.true ≡ wp.T.true]
=    {def. of refinement}
S ⊑ T ∧ [wp.S.true ≡ wp.T.true] .
Lemma 4.2. Let S, T be any two deterministic statements; then
(∀P :: [wp.S.P ≡ wp.T.P])  ≡  (∀P :: [wp.S.P ⇒ wp.T.P] ∧ [wp.S.true ≡ wp.T.true]) .
Proof. The LHS ⇒ RHS part is trivial, due to predicate calculus (≡ implies ⇒).
For LHS ⇐ RHS, note that since [wp.S.P ⇒ wp.T.P] is already given for any predicate P (RHS), we only need to show [wp.S.P ⇐ wp.T.P], for which we observe
wp.S.P
=    {def. of wp}
wlp.S.P ∧ wp.S.true
=    {(4.1) above: S is deterministic}
¬wp.S.(¬P) ∧ wp.S.true
⇐    {RHS and pred. calc. (Theorem of the Contra-positive, 3.8)}
¬wp.T.(¬P) ∧ wp.S.true
=    {(4.1) again: T is deterministic}
wlp.T.P ∧ wp.S.true
=    {RHS}
wlp.T.P ∧ wp.T.true
=    {def. of wp}
wp.T.P .
In contrast to refinement, program equivalence is amenable to deriving correct transformations
in both directions. However, on one of those, a refinement may yield more accurate results (as it
does in slicing by removing irrelevant loops even if those may not terminate — recall Section 2.2.2).
The above result is important as it allows us to confidently focus on developing refinement
rules, where appropriate, knowing that the extra step of turning them into equivalences is always
available.
4.1.5 Semantic language requirements
Dijkstra and Scholten insist on two basic requirements the semantics of each language construct must satisfy. Firstly, (R0:) any wlp.S is universally conjunctive; and secondly, (R1:) [wp.S.false ≡ false] for any S. Requirement R1 — known as “The Law of the Excluded Miracle” — is due to the observation that no state satisfies false and the predicate wp.S.false holds in states where no computation under control of S exists.
When defining semantics in terms of weakest preconditions alone, requirement R0 can be replaced with a new requirement (RE1:) that wp.S is universally disjunctive. We prove that RE1 implies (for deterministic S) both R0 and R1 in what follows.
Theorem 4.3. Let S be any deterministic statement with (RE1:) wp.S being universally disjunctive; we then have both (R0:) wlp.S is universally conjunctive, and (R1:) [wp.S.false ≡ false].
Proof. First, for R0, we observe (on the lines of the proof for (DS: 7,9) in [13])
the conjunctivity type of wlp.S
=    {properties of conjugate: see (3.18)}
the disjunctivity type of (wlp.S)∗
=    {(3.17); S is deterministic}
the disjunctivity type of wp.S
=    {RE1}
universal .
Then, we observe for R1
wp.S.false
=    {existential quantification over the empty range yields false}
wp.S.(∃P : P ∈ ∅ : P)
=    {wp.S is univ. disj. (RE1)}
(∃P : P ∈ ∅ : wp.S.P)
=    {again, existential quantification over the empty range yields false}
false .
Thus, in our context, RE1 faithfully takes the place of DS’s R0, R1. We note that a consequence of RE1 and Theorem 4.3 above (thus having R0, R1 available) is that wp.S is positively conjunctive (as proved in (DS: 7,8) of [13]). However, it is not universally so for (possibly) non-terminating S, since universal quantification over the empty range yields true.
4.1.6 Global variables in transformed predicates
According to our adopted formalism, programs manipulate predicates over the program's state
space, expressed syntactically as boolean structures. In our analysis and manipulation of such
programs, we shall be interested in the set of global variables actually mentioned in the transformed
predicates.
Let P be any predicate. We denote the set of global (i.e. free) variables in P as glob.P.
Let S be any given statement; variables in glob.P may be subject to direct substitution by the
transformer wp.S. What do we know of glob.(wp.S.P)?
First, we denote the variables that will definitely be substituted by wp.S as ddef.S. Those will be
the variables that are definitely (i.e. for any initial state) defined by S. (Examples of ddef, as well
as of the other properties to be introduced shortly, will be given in Section 4.2.) Second, we observe
that some of those variables, as well as others, may find their way into glob.(wp.S.P) even if not
in glob.P. Those are the variables whose initial value may affect the result of S and are hence
denoted input.S. We are now ready to postulate requirement RE2:
glob.(wp.S.P) ⊆ ((glob.P \ ddef.S) ∪ input.S)
for all S, P.
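As a small sanity check of RE2 (an example of our own): take S to be x := y + 1 and P to be x > z. Then wp.S.P ≡ (y + 1 > z), and indeed
glob.(wp.S.P) = {y, z} ⊆ ({x, z} \ {x}) ∪ {y} = (glob.P \ ddef.S) ∪ input.S ,
with ddef.S = {x} and input.S = {y}: the substituted x disappears, while the input y may newly appear.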
The transformation of wp.S on some predicates will be restricted to adding a conjunct ex-
pressing termination of S. That is, for such S, P we expect [wp.S.P ≡ P ∧ wp.S.true]. This is
so whenever all variables in glob.P are guaranteed not to be modified by S (in case S
terminates). We thus denote by def.S the set of variables that may be modified by S. That is, a
variable x must be in def.S if there exists a terminating computation under control of S for which
the final value of x differs from its initial value. When a variable is not in def.S, we know its initial
and final values will in any case be the same. We hence postulate the requirement RE3:
[wp.S.P ≡ P ∧ wp.S.true]
for all S, P with glob.P ∩ def.S = ∅.
As can be expected, all definitely defined variables ddef.S shall always take part in the set of
(possibly) defined variables, def.S. We thus postulate the requirement RE4:
ddef.S ⊆ def.S
for all S.
In addition to the sets def.S, ddef.S and input.S, we shall define for each language construct
its set of global (i.e. free) variables. In fact, we shall expect this set to consist of variables from
the three previously mentioned sets. Keeping in mind ddef.S ⊆ def.S for all S (RE4 above), we
now postulate the requirement RE5:
glob.S = def.S ∪ input.S
for all S. (Recall the overloading of glob, as was first mentioned in Section 3.2.5, being applicable
to predicates, as before, as well as to program statements, as in this case, or even to program
expressions of any type.)
Here is a summary of the required properties. For any statement S we require
RE1 wp.S is universally disjunctive
RE2 glob.(wp.S.P) ⊆ ((glob.P \ ddef.S) ∪ input.S) for all P
RE3 [wp.S.P ≡ P ∧ wp.S.true] for all P with glob.P ∩ def.S = ∅
RE4 ddef.S ⊆ def.S
RE5 glob.S = def.S ∪ input.S
4.2 The programming language
Here is an introduction to the chosen language constructs and their corresponding semantics. A
full definition with proof of all requirements can be found in Appendix A.
4.2.1 Expressions, variables and types
As our transformations deal exclusively with statements and names of variables, while preserving
all types (of variables and expressions), it is tempting and not uncommon to avoid any men-
tion of those. However, to prevent confusion, it is worth mentioning the types with which our
programming language (and hence the code examples in this thesis) shall be concerned.
The basic types (of variables and expressions) shall include integers (with typical arithmetic
and relational operators) and booleans (with similar syntax to predicates).
Further to that, we shall allow variables of type array or stream. Each array variable, say a,
will always be associated with an extra variable, a.length. A stream, dedicated either for reading
(i.e. input) or writing (i.e. output), shall be implemented as a special case of array, with an extra
(implicit) index variable associated with it. This variable shall be pointing to the next available
location.
4.2.2 Core language
Assignment
The first language construct is the assignment statement. It takes the form X := E, with X
standing for a finite list of variables and with E standing for a list of expressions (of the same length
as X). Type compatibility is assumed. This is a so-called simultaneous assignment (or multiple
assignment), in which all expressions are evaluated before being assigned to their corresponding
target variables. It should be noted that the target variables must be distinct. To the case where
the lists X and E are both empty, we sometimes refer as "skip".
[wp.(X := E).P ≡ P[X\E]] for all P;
def.(X := E) ≜ X;
ddef.(X := E) ≜ X;
input.(X := E) ≜ glob.E; and
glob.(X := E) ≜ X ∪ glob.E.
It is worth noting that, for simplicity, all expressions in our language are assumed to be well
formed, and all operators and functions are complete and hence well defined for all possible values.
The special case of assignment to an array element, say a[i] := E, shall be understood as
an assignment to the whole array, a := a[i ↦ E], meaning that the array a ends up being as
before in all elements other than the i-th, in which it gets the value of E.
An output stream, say out (with dedicated index variable, say out.i), can be appended to through
a statement out << E, which should be interpreted as out, out.i := out[out.i ↦ E], out.i + 1.
Similarly, reading from an input stream, say in (with index variable in.i), takes the form
in >> x, and should be interpreted as x, in.i := in[in.i], in.i + 1.
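To illustrate the array case (a small example of our own): for a[i] := 0 and postcondition a[j] = 1, the desugaring gives
wp.(a := a[i ↦ 0]).(a[j] = 1) ≡ ((a[i ↦ 0])[j] = 1) ≡ (i ≠ j ∧ a[j] = 1) ,
since (a[i ↦ 0])[j] denotes 0 when i = j and a[j] otherwise.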
Sequential composition
The first compound construct is sequential composition. It takes the form S1 ; S2 and starts
executing S2 only after normal completion of S1.
[wp.(S1 ; S2).P ≡ wp.S1.(wp.S2.P)] for all P;
def.(S1 ; S2) ≜ def.S1 ∪ def.S2;
ddef.(S1 ; S2) ≜ ddef.S1 ∪ ddef.S2;
input.(S1 ; S2) ≜ input.S1 ∪ (input.S2 \ ddef.S1); and
glob.(S1 ; S2) ≜ glob.(S1, S2).
Recall glob.(S1, S2) is short for glob.S1 ∪ glob.S2.
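For instance (our own small check of the input clause): with S1 := (x := y) and S2 := (z := x + w),
input.(S1 ; S2) = {y} ∪ ({x, w} \ {x}) = {y, w} ;
the use of x in S2 is not an input of the composition, as x is definitely defined by S1 first.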
Alternative construct
The alternative construct takes the form of if B then S1 else S2 (and is sometimes abbre-
viated to IF). Upon execution, if the guard B, a boolean expression, is evaluated to true, S1 is
executed; otherwise, S2 is executed.
[wp.IF.P ≡ (B ⇒ wp.S1.P) ∧ (¬B ⇒ wp.S2.P)] for all P;
def.IF ≜ def.S1 ∪ def.S2;
ddef.IF ≜ ddef.S1 ∩ ddef.S2;
input.IF ≜ glob.B ∪ input.S1 ∪ input.S2; and
glob.IF ≜ glob.B ∪ glob.S1 ∪ glob.S2.
Repetitive construct
The repetitive construct takes the form of while B do S od (and is sometimes abbreviated to
DO). Upon execution, if the guard B is evaluated to true, the guarded S is executed. Once S
terminates successfully, the process is repeated, until the guard is evaluated to false, in which case
the loop terminates successfully.
[wp.DO.P ≡ (∃i : 0 ≤ i : (k^i.false))] for all P,
with k given by (DS: 9,44) [13]: [k.Q ≡ (B ∨ P) ∧ (¬B ∨ wp.S.Q)] ;
def.DO ≜ def.S;
ddef.DO ≜ ∅;
input.DO ≜ glob.B ∪ input.S; and
glob.DO ≜ glob.B ∪ glob.S.
As for earlier language constructs (with the exception of substitutions), this formulation of
wp.DO follows the DS notation — for its deterministic subset. Note, however, that the semantics
of repetition could equally have been defined in terms of implication, by [k.Q ≡ (B ⇒ wp.S.Q)
∧ (¬B ⇒ P)]. This would render the similarity to the semantics of IF clearer: one could think of
the DO statement as a recursive construct, say DO', comprising an IF statement on the lines of
if B then S ; DO' else skip. Nevertheless, this thesis adopts the DS formulation of loops.
This completes our core language, a subset of Dijkstra and Scholten's guarded commands [13].
The following constructs are extensions borrowed from Morgan [45], with some adaptations as our
context requires. Later, in order to emphasise that a certain statement S is restricted to constructs
of the core language, we shall call it a core statement.
4.2.3 Extended language
Assertions
An assertion statement (called an "assumption" by Morgan [45]) is a boolean expression on the
program state. If true, execution goes on normally; otherwise the program aborts. (In guarded
commands, "the operational interpretation of abort is that for all initial states its execution fails
to terminate". [13, Page 135])
[wp.{B}.P ≡ B ∧ P] for all P;
def.{B} ≜ ∅;
ddef.{B} ≜ ∅;
input.{B} ≜ glob.B; and
glob.{B} ≜ glob.B.
Assertions, locally expressing the surrounding context, will serve as a vehicle for performing
correct local transformations and refinements. The assertions will typically be added to a program,
temporarily, by propagating knowledge through a given (compound) statement. They will thus
express intermediate as well as final results of a provably correct program analysis, prior to some
transformation.
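As a small foretaste (an example of our own; the laws involved are formulated in Section 4.3 below), an assertion can record an assignment's effect, justify a substitution, and then be dropped again:
x := 1 ; y := x  =  x := 1 ; {x = 1} ; y := x  =  x := 1 ; {x = 1} ; y := 1  =  x := 1 ; y := 1 .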
Local variables
Normally, in (block-based, imperative) programming languages, local variables serve for storing
temporary results of computations, before using those in further computations. Since such variables are local to a
certain statement, assignments to them have no effect on the surrounding context (even
if a variable with the same name exists there as well). According to Morgan's definition, a local
variable is initialised (on entry to its scope) to any possible value. However, since such non-
determinism is not permitted in our context, we could either insist local variables must not be
used before being defined — as is commonly enforced in modern languages — or agree on some
initial value. As will be explained shortly, we prefer the latter.
A special case of local variables is that of parameters. Typically, a method (or procedure)
may declare e.g. 'value' parameters; such local variables will be initialised, whenever the method
is called, with a copy of the actual value sent. Again, any local modifications will be hidden from
the caller. In [45], Morgan allows value parameters to be sent to any given statement, say S,
through so-called value substitutions; e.g. S[value f\E] would send the value of E to locals
f in S.
It turns out that we do not require, in our context, the full power of such value substitutions.
Instead, we shall do with what can be termed a self value substitution (i.e. S[value f\f], to use
Morgan's syntax). The effect of such a self-substitution is that, on the one hand, any modification
to f in S shall be local (i.e. hidden from the surrounding context), while, on the other hand, f
will be initialised to its actual value in that global context.
Since only self value substitutions will be needed in this work, and since the value-substitution
notation, for those, presents a redundancy (i.e. repeating f in the above), we opt to avoid the
introduction of value substitutions. Instead, we shall get the same effect (of self value substitution)
by assuming local variables are initialised to their corresponding global value. In reality, such
initialisation should only take place if a local may be used before being defined.
Local variables may be introduced anywhere a statement is expected. Their introduction takes
the form of |[var L ; S]| where S is any statement and L stands for a list of variable names. If
a local variable is used before being defined in S, its entry value is used. Since definitions of L in
S should be local, its entry value is kept in a fresh backup variable on entry and retrieved on exit;
accordingly, the semantic definition of |[var L ; S]| follows that of L' := L ; S ; L := L'
where L' is fresh.
[wp.|[var L ; S]|.P ≡ (wp.S.P[L\L'])[L'\L]] for all P with glob.P ∩ L' = ∅; or the simpler
[wp.|[var L ; S]|.Q ≡ wp.S.Q] for all Q with glob.Q ∩ (L, L') = ∅;
def.|[var L ; S]| ≜ def.S \ L;
ddef.|[var L ; S]| ≜ ddef.S \ L;
input.|[var L ; S]| ≜ input.S; and
glob.|[var L ; S]| ≜ (def.S \ L) ∪ input.S; or, equivalently,
glob.|[var L ; S]| ≜ glob.S \ (L \ input.S).
Note that generality is not lost by restricting P as we do. Whenever elements of L' appear in the
postcondition, those can be locally renamed.
A common case is one in which the declared variables are immediately defined, e.g. |[var L ;
L := E ; S]|. In such cases, the shorthand |[var L := E ; S]| is allowed.
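For illustration (our own example): |[var t := x ; x := y ; y := t]| swaps x and y through the local t. Its backup-and-restore reading, t' := t ; t := x ; x := y ; y := t ; t := t' (with t' fresh), makes both t's initialisation and the invisibility of its local modification explicit; the block thus behaves exactly like the simultaneous assignment x, y := y, x.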
Live variables
In program analysis [47], a variable x is considered live, at a given program point, if there exists
a path (from the given point) to a use of x which is free of re-definitions of x. If a variable is not
live at a given point (i.e. it is dead), its value at that point is of no interest to the program, and
can be modified to anything.
In our refactorings we shall take advantage of this notion of liveness, for example in removing
dead assignments (i.e. assignments to dead variables). However, since there is no simple deter-
ministic way of saying "that variable can hold any value, at this point", we choose to add the
concept of liveness to the programming language, or rather to the meta-language.
We do so by defining a dual of local variables. Instead of stating which variables are local to
a statement S, we explicitly state the variables that are not. This way, it can be assumed that
those variables will be live on exit. In contrast, all other variables, being local, will be guaranteed
to hold, on exit from S, their corresponding initial value. Thus, all local definitions of those will
be of no relevance to the surrounding context. Consequently, modifying those to any value, just
before exiting S, will have no effect.
We define S[live V] ≜ |[var L ; S]| where L := def.S \ V. Thus, the semantics and
properties can be derived from those of local variables, as is summarised in the following. For a
given statement S, set of variables V, a corresponding set L := def.S \ V and fresh L', we have:
[wp.S[live V].P ≡ (wp.S.P[L\L'])[L'\L]] for all P with glob.P ∩ L' = ∅; or the simpler
[wp.S[live V].Q ≡ wp.S.Q] for all Q with glob.Q ∩ (L, L') = ∅;
def.S[live V] ≜ def.S ∩ V;
ddef.S[live V] ≜ ddef.S ∩ V;
input.S[live V] ≜ input.S; and
glob.S[live V] ≜ (def.S ∩ V) ∪ input.S.
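For example (our own): in (x := 1 ; y := 2)[live x], we have def.S = {x, y} and V = {x}, so L = {y} and the annotation is sugar for |[var y ; x := 1 ; y := 2]|: the assignment to y is dead, and the dead-assignment laws of the next section allow its removal, leaving x := 1.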
4.3 Laws of program analysis and manipulation
The weakest-preconditions semantics, as defined above for all language constructs, along with
known theorems from the predicate calculus, can and will be applied in proving a collection of
laws for correct program analysis and manipulation.
Such laws, in turn, will be useful for proving and deriving rules of program equivalence, re-
finement and transformation. (The latter is distinguished from the former two in that instead of
relating two programs, it shall describe how to produce a new program from a given one.)
The collection is by no means exhaustive, though; only laws that are directly useful in the thesis
are formulated. A summary of those laws can be found in Appendix F, and all proofs are given
in Appendix B.
4.3.1 Manipulating core statements
A first set of laws supports easy manipulation of core statements. Very similar laws have been
defined elsewhere (e.g. [45, 3, 56, 30]).
For example, the following is a definition of a law (see Law 3 in the appendices) to support
the distribution of a statement into (or out of, when applied from right to left) both branches of a
following IF statement, provided the former does not define (i.e. modify the value of) any variable
that is tested in the IF's guard:
Let S, S1, S2, B be three statements and a boolean expression, respectively; then
S ; if B then S1 else S2  =  if B then S ; S1 else S ; S2
provided def.S ∩ glob.B = ∅.
Another code-motion related law (Law 5 in the appendices) supports moving a (certain kind
of loop-invariant) assignment statement forward, outside a DO loop's body (or into its end, when
applied from right to left):
Let S1, X, B, E be any statement, set of variables, boolean expression and set of expressions,
respectively; then
{X = E} ; while B do S1 ; (X := E) od  =  {X = E} ; while B do S1 od ; (X := E)
provided X ∩ (glob.B ∪ input.S1 ∪ glob.E) = ∅.
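A concrete instance (our own): with X, E := x, 0, B := (i < n) and S1 := (i := i + 1),
{x = 0} ; while i < n do i := i + 1 ; x := 0 od  =  {x = 0} ; while i < n do i := i + 1 od ; x := 0 ,
the proviso holding since x ∉ {i, n} = glob.B ∪ input.S1 ∪ glob.E.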
4.3.2 Assertion-based program analysis
When flow-sensitive properties of specific program points are desired, we shall introduce and then
propagate assertions throughout the program, thus expressing both intermediate and final results
of a program analysis. Again, similar sets of laws have been defined and employed elsewhere, e.g.
by Back [2], Morgan [45] and Ward [56].
For example, the following (Law 7) supports both introduction and elimination of assertions
following an assignment:
Let X, Y, E1, E2 be two sets of variables and two sets of expressions, respectively; then
X, Y := E1, E2  =  X, Y := E1, E2 ; {Y = E2}
provided (X, Y) ∩ glob.E2 = ∅.
The following law (Law 12) will be used for propagating assertions forward into branches of
an IF statement, as well as backward ahead of the IF:
Let S1, S2, B1, B2 be two statements and two boolean expressions, respectively; then
{B1} ; if B2 then S1 else S2  =  if B2 then {B1} ; S1 else {B1} ; S2 .
Note that this law is a direct corollary of the more general Law 3 (from above) and the fact
that the def set of assertions is empty.
After introducing and propagating assertions, and before eliminating them, they will typically
be used in making substitutions. If two variables are known to hold the same value ahead of
a statement, the immediate use of one can be replaced with the other, in that statement. By
immediate use, we refer to the used expressions in assignments and the guard of an IF statement.
In the guard of a DO loop, however, we can make such a substitution only if the required assertion
is available both before the loop and at the end of its body. We refer to such substitutions, in
hints, as assertion-based substitution.
Since such substitutions will often be preceded by an introduction of the assertion, following
an assignment statement, we introduce a combined law (Law 18) to which we refer in hints as
assignment-based substitution:
Let S1, S2, B be two statements and a boolean expression, respectively; let X, X', Y, Z,
E1, E1', E2, E3 be four lists of variables and corresponding lists of expressions; then
X, Y := E1, E2 ; Z := E3  =  X, Y := E1, E2 ; Z := E3[Y\E2] ;
X, Y := E1, E2 ; IF  =  X, Y := E1, E2 ; IF' ; and
X, Y := E1, E2 ; DO  =  X, Y := E1, E2 ; DO'
provided ((X, X'), Y) ∩ glob.E2 = ∅
where IF := if B then S1 else S2,
IF' := if B[Y\E2] then S1 else S2,
DO := while B do S1 ; X', Y := E1', E2 od
and DO' := while B[Y\E2] do S1 ; X', Y := E1', E2 od .
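A minimal instance of the first form (our own): x, y := 1, 2 ; z := y + 1  =  x, y := 1, 2 ; z := 3, since E3[Y\E2] = (y + 1)[y\2] = 3 and glob.E2 = ∅ makes the proviso hold trivially.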
4.3.3 Manipulating liveness information
Let S, V be any statement and set of variables, respectively, and recall our definition of liveness
information in our extended language. The fact that, out of the variables defined in S, only those in
V are live on exit from S is expressed as S[live V]. This is syntactic sugar for |[var coV ; S]|
where coV := def.S \ V.
Laws for manipulating liveness information (see Sections B.3 and F.3 for proofs of all laws and
a summary, respectively) include introduction and removal of auxiliary information, distribution and
propagation of liveness information, and finally introduction and elimination of dead assignments.
(By auxiliary information we refer to information that is locally redundant but may have some
importance in the global context.)
Whenever only (a subset of the) mentioned live variables are actually defined (in a statement
S), the liveness information is redundant, and can be dropped. Conversely, any superset of the
defined variables (in any S) can safely augment S as liveness information. This is expressed by
the following law (Law 19) for introducing and removing auxiliary liveness information:
Let S, V be any statement and set of variables, respectively, with def.S ⊆ V; then
S  =  S[live V] .
In propagating liveness information over sequential composition, it is interesting to see that,
on the one hand, some live-on-exit variables may be intermediately dead, whereas on the other
hand, some dead-on-exit variables may become intermediately live. This is demonstrated by the
set V2 in Law 20:
Let S1, S2, V1, V2 be any two statements and two sets of variables, respectively; then
(S1 ; S2)[live V1]  =  (S1[live V2] ; S2[live V1])[live V1]
provided V2 = (V1 \ ddef.S2) ∪ input.S2.
Here, variables in ddef.S2 are said to be “killed” by S2 whereas variables in input.S2 are
“generated”. Such propagation of information, by removing KILL sets and adding GEN sets, is
common practice in intraprocedural data flow analysis (see e.g. [47, Section 2.1]).
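For instance (our own): with S1 := (x := y), S2 := (z := x) and V1 := {z}, Law 20 gives
(x := y ; z := x)[live z]  =  ((x := y)[live x] ; (z := x)[live z])[live z]
where V2 = ({z} \ ddef.(z := x)) ∪ input.(z := x) = {x}: the dead-on-exit x is intermediately live, whereas the live-on-exit z is intermediately dead.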
Note that we propagate information directly on the abstract syntax (i.e. its tree-like represen-
tation) rather than on a flow graph. This is made possible due to the simplicity and structured
nature of our language. In that respect, it should also be noted that in all analyses, assuming
the availability of the def, ddef and input sets for all program elements, our algorithms will involve a
single pass of the program's tree. However, in presentation, we shall not be concerned with time
or space complexities.
Another law for propagating liveness information is Law 22:
Let B, S, V1, V2 be any boolean expression, statement and two sets of variables, respectively;
then
(while B do S od)[live V1]  =  (while B do S[live V2] od)[live V1]
provided V2 = V1 ∪ (glob.B ∪ input.S).
Liveness information, explicitly added to (and propagated through) a program, can help in
identifying dead assignments. Those can subsequently be removed. The following is one such law
for dead-assignment elimination (Law 24):
Let S, V, Y, E be any statement, two sets of variables and a set of expressions, respectively; then
S[live V]  =  (S ; Y := E)[live V]
provided Y ∩ V = ∅.
Note that the law can (and indeed will) also be used to introduce dead assignments.
4.4 Summary
This completes the initial introduction to our framework for refactoring. A subset of Dijkstra's
language of guarded commands, along with extensions for representing program-analysis informa-
tion, has been defined with predicate-transformer semantics and related sets of variables. Those
have been used in presenting laws for program analysis and manipulation.
The next chapter will extend our framework by applying some of its elements in devising a
novel method for proving the correctness of slicing-based refactoring transformations.
Chapter 5
Proof Method for Correct
Slicing-Based Refactoring
Our framework for slicing-based refactoring is enhanced in this chapter, with the development of
a proof method for the refinement of deterministic statements. This method will be specifically
tailored for slicing-based refactoring.
5.1 Introducing slice-refinements and co-slice-refinements
A law of refinement typically associates two meta-programs, S and T, with some applicability
conditions. Any two programs satisfying those conditions are then guaranteed to be related
through refinement; S is then said to be refined by T, such that the latter preserves the total
correctness of the former. Operationally speaking, whenever both versions are started in the same
state, one in which S is known to be terminating, we expect T to produce the exact same result as
S (i.e. to terminate in the same state). With input for which S does not terminate, T is allowed
to do anything.
Taken formally, in terms of predicate-transformer semantics, we expect that for any given
predicate P, the weakest precondition of T applied to P will follow from that of S on (the same)
P, everywhere. Thus, when aiming to prove the correctness of such a refinement law, we are
required to show [wp.S.P ⇒ wp.T.P] for all P.
Alternatively, when restricting ourselves to deterministic statements, one can prove the correct-
ness of a refinement law in a slice-wise manner. That is, instead of considering general predicates
over the full state space, we consider predicates over a slice (corresponding to a subset of the
program variables, or accordingly to the subspace spanned by their potential values) separately
from predicates over its complement.
We first define a relation of slice-refinement and a complementary relation of co-slice-refinement
as follows.
(S ⊑V T) ≜ (∀P : glob.P ⊆ V : [wp.S.P ⇒ wp.T.P]), in which case T is said to be a slice-
refinement of S with respect to V; and
(S ⊑(V) T) ≜ (∀P : glob.P ∩ V = ∅ : [wp.S.P ⇒ wp.T.P]), in which case T is said to be a
co-slice-refinement of S with respect to V.
The subscript V in S ⊑V T means (as the definition shows) that the refinement relation holds for
all predicates with global variables in V. Accordingly, the (V) in S ⊑(V) T guarantees the
refinement holds for all predicates with no global mention of V.
5.2 Variable-wise proofs
With the above definitions, we now investigate how each slice-refinement (as well as co-slice-
refinements, later) can be proved in a point-wise fashion. Instead of considering any possible
postcondition (on the sliced variables V), only very particular postconditions of the form
"x = val", for any variable x ∈ V and possible value val, are considered.
In the following, we assume any variable, x, has a type, T.x, associated with it (even if that
type is not explicitly declared in the program). All variable types are assumed to be non-empty,
possibly infinite, sets of distinct values.
5.2.1 Proving slice-refinements
Theorem 5.1. For any pair of deterministic statements S and SV and any set of variables V,
we have
(S ⊑V SV) ≡ ([wp.S.true ⇒ wp.SV.true]
∧ (∀x, val : x ∈ V ∧ val ∈ T.x : [wp.S.(x = val) ⇒ wp.SV.(x = val)])) .
Here, the (otherwise arbitrary) name SV was chosen as a hint that this statement has some-
thing to do with S and V.
Before proving the theorem, we turn to some motivation. At first glance, the theorem may
seem obvious, perhaps due to our mention of point-wise proof and the universal disjunctivity of
wp.S and wp.T. However, a closer look reveals that this alternative view of slice-refinement is
more variable-wise than it is point-wise (even though the proof will indeed involve points in the
state space).
Furthermore, it turns out that this formulation is not (always) suitable in the presence of non-
determinism. Recall the example from Section 4.1.2, where our focus on deterministic programs
was justified. Despite the fact that all postconditions of the form "x = val" or "y = val" yielded
false as the weakest precondition of both programs, the duplicated version was not a slice-refinement
of the other one with respect to V = {x, y}. This was revealed by the postcondition x = y and
the fact that [true ⇒ false] does not hold.
We are now ready for a proof of correctness, keeping in mind that S, SV are both deterministic
programs and hence wp.S, wp.SV are both universally disjunctive and positively conjunctive.
Proof. (LHS ⇒ RHS): Trivial; glob.true = ∅ and glob.(x = val) ⊆ V for all x ∈ V of type T.x
and value val ∈ T.x.
(LHS ⇐ RHS): We need to prove that for any predicate P with glob.P ⊆ V we have
[wp.S.P ⇒ wp.SV.P].
The only two predicates on the empty space (when V = ∅) are the boolean scalars false and
true. Now [wp.S.false ⇒ wp.SV.false] is given by the Law of the Excluded Miracle, and [wp.S.true
⇒ wp.SV.true] is given by the (RHS) proviso.
Thus in the remainder of this proof we shall assume V is not empty. We now note that the
predicate P expresses a dichotomy on the state subspace spanned by variables V. This dichotomy
can also be represented by the (possibly infinite) set of points at which the predicate is evaluated to
true. Each point can then be represented (by its coordinates) as a conjunction of simple formulae
of the form x = val, one formula for each variable (axis) in V.
Let n := |V| and let p enumerate all n-dimensional points in the state space (p.i is the value of
the i-th dimension, i.e. of variable V.i); and P.p is true if P (with a substitution of each variable,
V.i, with its corresponding value p.i) is evaluated to true at p; then P can be rewritten as:
[P ≡ (∃p : P.p = true : (∀i : 0 ≤ i < n : V.i = p.i))] .   (5.1)
We now need to prove that [wp.S.P ⇒ wp.SV.P] (for all P with glob.P ⊆ V) under the assump-
tions [wp.S.true ⇒ wp.SV.true] and, for all x ∈ V and value val ∈ T.x:
[wp.S.(x = val) ⇒ wp.SV.(x = val)] .   (5.2)
We recall V is non-empty (i.e. 0 < n) and observe (for all P with glob.P ⊆ V)
wp.S.P
≡ {(5.1) above and Leibniz}
wp.S.(∃p : P.p = true : (∀i : 0 ≤ i < n : V.i = p.i))
≡ {RE1: wp.S is universally disjunctive}
(∃p : P.p = true : wp.S.(∀i : 0 ≤ i < n : V.i = p.i))
≡ {0 < n and wp.S is positively conjunctive (S deterministic)}
(∃p : P.p = true : (∀i : 0 ≤ i < n : wp.S.(V.i = p.i)))
⇒ {assumption (5.2) above}
(∃p : P.p = true : (∀i : 0 ≤ i < n : wp.SV.(V.i = p.i)))
≡ {0 < n and wp.SV is positively conjunctive (SV deterministic)}
(∃p : P.p = true : wp.SV.(∀i : 0 ≤ i < n : V.i = p.i))
≡ {RE1: wp.SV is universally disjunctive}
wp.SV.(∃p : P.p = true : (∀i : 0 ≤ i < n : V.i = p.i))
≡ {(5.1) above and Leibniz}
wp.SV.P .
Note the correctness of the (⇒) step above, due to the monotonicity of both ∃ (3.9) and
∀ (3.10).
5.2.2 A co-slice-refinement is a slice-refinement of the complement
Co-slice-refinements, like slice-refinements, can be proved in a variable-wise manner.
Corollary 5.2. For any pair of deterministic statements S and ScoV and any set of variables V,
we have
(S ⊑(V) ScoV) ≡ ([wp.S.true ⇒ wp.ScoV.true]
∧ (∀x, val : x ∈ coV ∧ val ∈ T.x : [wp.S.(x = val) ⇒ wp.ScoV.(x = val)]))
where coV := ((def.S ∪ def.ScoV) \ V).
Proof. Recalling S and ScoV are deterministic, we observe
(S ⊑(V) ScoV)
≡ {Theorem 5.3, see below}
(S ⊑coV ScoV)
≡ {Theorem 5.1 with SV, V := ScoV, coV}
([wp.S.true ⇒ wp.ScoV.true]
∧ (∀x, val : x ∈ coV ∧ val ∈ T.x : [wp.S.(x = val) ⇒ wp.ScoV.(x = val)])) .
Theorem 5.3. A co-slice-refinement is a slice-refinement of the complementary set of defined
variables. That is, for any pair of deterministic statements S and ScoV and any set of variables
V, we have
(S ⊑(V) ScoV) ≡ (S ⊑coV ScoV)
where coV := ((def.S ∪ def.ScoV) \ V).
Proof. (LHS ⇒ RHS): Due to V ∩ coV = ∅, all predicates P with glob.P ⊆ coV (as required on the
RHS) have glob.P ∩ V = ∅. Thus the LHS yields [wp.S.P ⇒ wp.ScoV.P].
(LHS ⇐ RHS): We observe for all P with glob.P ∩ V = ∅ and glob.P \ coV ≠ ∅ (without the
latter the RHS would already yield the required [wp.S.P ⇒ wp.ScoV.P]):
wp.S.P
≡ {pointwise version of P on the state space spanned by (coV1, ND):
let coV1 := glob.P ∩ coV, ND := glob.P \ coV, n := |coV1| and
n' := |glob.P|}
wp.S.(∃p : P.p = true : (∀i : 0 ≤ i < n : coV1.i = p.i)
∧ (∀i : n ≤ i < n' : ND.(i−n) = p.i))
≡ {junctivity of wp.S: recall S is deterministic and
0 < |ND| due to (glob.P \ coV) ≠ ∅}
(∃p : P.p = true : wp.S.(∀i : 0 ≤ i < n : coV1.i = p.i)
∧ (∀i : n ≤ i < n' : wp.S.(ND.(i−n) = p.i)))
≡ {RE3: ND ∩ def.S = ∅ and RE2}
(∃p : P.p = true : wp.S.(∀i : 0 ≤ i < n : coV1.i = p.i)
∧ (∀i : n ≤ i < n' : wp.S.true ∧ (ND.(i−n) = p.i)))
⇒ {RHS, twice: coV1 ⊆ coV and glob.true = ∅}
(∃p : P.p = true : wp.ScoV.(∀i : 0 ≤ i < n : coV1.i = p.i)
∧ (∀i : n ≤ i < n' : wp.ScoV.true ∧ (ND.(i−n) = p.i)))
≡ {RE3: ND ∩ def.ScoV = ∅ and RE2}
(∃p : P.p = true : wp.ScoV.(∀i : 0 ≤ i < n : coV1.i = p.i)
∧ (∀i : n ≤ i < n' : wp.ScoV.(ND.(i−n) = p.i)))
≡ {junctivity of wp.ScoV: ScoV is deterministic and again 0 < |ND|}
wp.ScoV.(∃p : P.p = true : (∀i : 0 ≤ i < n : coV1.i = p.i)
∧ (∀i : n ≤ i < n' : ND.(i−n) = p.i))
≡ {pointwise version of P}
wp.ScoV.P .
5.3 Slice and co-slice refinements yield a general refinement
Combining separate variable-wise proofs for a slice-refinement and its complementary co-slice-
refinement, we can discard the variable-wise approach.
Corollary 5.4. Let S, T be any pair of deterministic statements and let V be any set of variables;
then
(S ⊑ T) ≡ ((S ⊑V T) ∧ (S ⊑(V) T)) .
Proof. We observe
S ⊑ T
≡ {def. of refinement; glob.P ∩ ∅ = ∅ holds for all P}
(∀P : glob.P ∩ ∅ = ∅ : [wp.S.P ⇒ wp.T.P])
≡ {def. of co-slice-refinement}
S ⊑(∅) T
≡ {Corollary 5.2 with ScoV, V := T, ∅: S, T deterministic}
[wp.S.true ⇒ wp.T.true]
∧ (∀x, val : x ∈ (def.S ∪ def.T) ∧ val ∈ T.x : [wp.S.(x = val) ⇒ wp.T.(x = val)])
≡ {Lemma 5.5 (see below) and pred. calc.}
[wp.S.true ⇒ wp.T.true]
∧ (∀x, val : x ∈ (def.S ∪ def.T) ∧ val ∈ T.x : [wp.S.(x = val) ⇒ wp.T.(x = val)])
∧ (∀x, val : x ∈ (V \ (def.S ∪ def.T)) ∧ val ∈ T.x :
[wp.S.(x = val) ⇒ wp.T.(x = val)])
≡ {merging the ranges}
[wp.S.true ⇒ wp.T.true]
∧ (∀x, val : x ∈ (V ∪ def.S ∪ def.T) ∧ val ∈ T.x :
[wp.S.(x = val) ⇒ wp.T.(x = val)])
≡ {splitting the range; pred. calc.}
[wp.S.true ⇒ wp.T.true]
∧ (∀x, val : x ∈ V ∧ val ∈ T.x : [wp.S.(x = val) ⇒ wp.T.(x = val)])
∧ [wp.S.true ⇒ wp.T.true]
∧ (∀x, val : x ∈ ((def.S ∪ def.T) \ V) ∧ val ∈ T.x :
[wp.S.(x = val) ⇒ wp.T.(x = val)])
≡ {Theorem 5.1 with SV := T and Corollary 5.2 with ScoV := T: again,
S, T are deterministic}
(S ⊑V T) ∧ (S ⊑(V) T) .
Lemma 5.5. Let S, T be any pair of statements and let P be any predicate, with glob.P ∩ (def.S
∪ def.T) = ∅; then
[wp.S.true ⇒ wp.T.true] ⇒ [wp.S.P ⇒ wp.T.P] .
Proof. For all such S, T, P, with [wp.S.true ⇒ wp.T.true] and glob.P ∩ (def.S ∪ def.T) = ∅, we observe
wp.S.P
≡ {RE3: glob.P ∩ def.S = ∅ (proviso and set theory)}
P ∧ wp.S.true
⇒ {proviso}
P ∧ wp.T.true
≡ {RE3 again: glob.P ∩ def.T = ∅ (proviso and set theory)}
wp.T.P .
5.3.1 A corollary for program equivalence
An immediate corollary of the above refinement proof method will support proofs of program
equivalence.
Corollary 5.6. Let S, T be any pair of deterministic statements and let V be any set of variables;
then
(S = T) ≡
((∀P : glob.P ⊆ V : [wp.S.P ≡ wp.T.P]) ∧ (∀Q : glob.Q ∩ V = ∅ : [wp.S.Q ≡ wp.T.Q])) .
Proof. Recalling S and T are deterministic, we observe
S = T
≡ {Theorem 4.1}
(S ⊑ T) ∧ [wp.S.true ≡ wp.T.true]
≡ {Corollary 5.4}
(S ⊑V T) ∧ (S ⊑(V) T) ∧ [wp.S.true ≡ wp.T.true]
≡ {def. of slice-refinement and co-slice-refinement}
(∀P : glob.P ⊆ V : [wp.S.P ⇒ wp.T.P]) ∧ (∀Q : glob.Q ∩ V = ∅ : [wp.S.Q ⇒ wp.T.Q])
∧ [wp.S.true ≡ wp.T.true]
≡ {pred. calc. ((3.7), twice): the ranges are non-empty}
(∀P : glob.P ⊆ V : [wp.S.P ⇒ wp.T.P] ∧ [wp.S.true ≡ wp.T.true])
∧ (∀Q : glob.Q ∩ V = ∅ : [wp.S.Q ⇒ wp.T.Q] ∧ [wp.S.true ≡ wp.T.true])
≡ {Lemma 4.2, twice}
(∀P : glob.P ⊆ V : [wp.S.P ≡ wp.T.P])
∧ (∀Q : glob.Q ∩ V = ∅ : [wp.S.Q ≡ wp.T.Q]) .
5.4 Example proof: swap independent statements
To illustrate our new method of proof, consider the following program equivalence for swapping
independent statements:
Program equivalence 5.7. Let S1, S2 be any pair of deterministic statements; then
S1 ; S2  =  S2 ; S1
provided def.S1 ∩ def.S2 = ∅, def.S1 ∩ input.S2 = ∅ and input.S1 ∩ def.S2 = ∅.
Note that the provisos are actually gathered from the following derivation. This is representa-
tive of our general approach to refinement, program equivalence and transformation.
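A minimal instance (our own): x := y + 1 ; z := w  =  z := w ; x := y + 1, where def.S1 = {x}, input.S1 = {y}, def.S2 = {z} and input.S2 = {w}; all three provisos hold, as neither statement defines a variable the other defines or reads.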
Proof. We first observe for all P with glob.P ⊆ def.S1 (note that def.S2 ∩ glob.P = ∅ due to proviso
def.S1 ∩ def.S2 = ∅):
wp.(S1 ; S2).P
≡ {wp of ' ; '}
wp.S1.(wp.S2.P)
≡ {RE3: def.S2 ∩ glob.P = ∅}
wp.S1.(P ∧ wp.S2.true)
≡ {conj. of wp.S1}
wp.S1.P ∧ wp.S1.(wp.S2.true)
≡ {RE3: def.S1 ∩ glob.(wp.S2.true) = ∅ due to RE2 and proviso def.S1 ∩ input.S2 = ∅}
wp.S1.P ∧ wp.S1.true ∧ wp.S2.true
≡ {absorb term. (3.14)}
wp.S1.P ∧ wp.S2.true
≡ {RE3: def.S2 ∩ glob.(wp.S1.P) = ∅ due to RE2, proviso input.S1 ∩ def.S2 = ∅,
and choice of P}
wp.S2.(wp.S1.P)
≡ {wp of ' ; '}
wp.(S2 ; S1).P .
We now observe for all P with glob.P ∩ def.S1 = ∅:
wp.(S2 ; S1).P
≡ {wp of ' ; '}
wp.S2.(wp.S1.P)
≡ {RE3: def.S1 ∩ glob.P = ∅}
wp.S2.(P ∧ wp.S1.true)
≡ {conj. of wp.S2}
wp.S2.P ∧ wp.S2.(wp.S1.true)
≡ {RE3: def.S2 ∩ glob.(wp.S1.true) = ∅ due to RE2 and proviso input.S1 ∩ def.S2 = ∅}
wp.S2.P ∧ wp.S2.true ∧ wp.S1.true
≡ {absorb term. (3.14)}
wp.S2.P ∧ wp.S1.true
≡ {RE3: def.S1 ∩ glob.(wp.S2.P) = ∅ due to RE2, proviso def.S1 ∩ input.S2 = ∅,
and choice of P}
wp.S1.(wp.S2.P)
≡ {wp of ' ; '}
wp.(S1 ; S2).P .
Taken together, the above two derivations yield the required program equivalence, due to
Corollary 5.6 and the determinism of S1 and S2.
5.5 Summary
This chapter has extended our transformation framework by introducing a proof method for both
refinements and program equivalence, specifically designed to support slicing-related refactoring
transformations. Two complementary concepts of slice-refinement and co-slice-refinement have
been introduced. It has been shown that proving each kind of refinement separately is equivalent
to proving normal refinements of code. This approach has been shown to be applicable for proving
program equivalence as well, and one such example, the swapping of independent statements, has been
proved.
The next chapter will apply this proof method in developing our first version of sliding.
Chapter 6
Statement Duplication
In this chapter, the first step towards slice extraction is taken by formally developing a program
equivalence that yields a transformation of statement duplication. The duplication begins by
making two clones of the original program. These are composed sequentially, and correctness is
ensured by the addition of compensatory code. This code is responsible for keeping and retrieving
backup of initial and final values. One clone is specialized for computing the results carried by the
variables selected for extraction, whereas the other is dedicated to the remaining computations
(as captured by the remaining variables).
6.1 Example
When asked to extract the computation of sum in the following program fragment
while i<a.length do
i,sum,prod :=
i+1,sum+a[i],prod*a[i]
od
we offer to duplicate the selected statement, and systematically add some compensatory code,
to make the transformation correct. This would yield the following version, in which the actual
computation of sum is in the first clone, whereas the remaining results (i.e. in i,prod ) are computed
in the second (complementary) clone:
|[var isum,iprod,ii,fsum
; isum,iprod,ii := sum,prod,i
; while i<a.length do
i,sum,prod :=
i+1,sum+a[i],prod*a[i]
od
; fsum := sum
;
sum,prod,i := isum,iprod,ii
; while i<a.length do
i,sum,prod :=
i+1,sum+a[i],prod*a[i]
od
; sum:=fsum
]|
6.2 Sequential simulation of independent parallel execution
Consider the effects of a program statement's execution as the results carried by its defined vari-
ables, i.e. def.S for a statement S, in case of termination. Now consider a partition of def.S, say
into two subsets def.S = (V, coV). (Here, the otherwise arbitrary name coV was chosen as a hint
that this set of variables is complementary to V.)
If S is deterministic, its computation can be accomplished by two — or more, depending on
the number of partitions — independent machines. Each such machine will be given the same
initial state and the same program statement for execution. Clearly, due to determinism, both
machines terminate under the same conditions. Then, in case of termination, the results can be
collected from the two machines, say V from the first machine and coV from the second.
The usefulness of the above construction will become clear later on, when each clone of S will
be independently simplified to achieve its specific designated goal.
For simulating the above scenario (of two independent machines), using our sequential imper-
ative language (for a single machine), we propose to sequentially compose two clones of statement
S. For the second clone to work properly, we insist that the first clone will not modify the original
set of variables. This will be guaranteed by keeping a backup of initial values, and retrieving those
values upon entry to the second clone. Using our available language constructs, this transforma-
tion is formalised in the following section. (We actually formalise it as a program equivalence,
thus keeping it general enough to be relevant for the reverse transformation as well.)
6.3 Formal derivation
Program equivalence 6.1. Let S, V, coV, iV, icoV, fV be any deterministic statement and five
sets of variables, respectively; then
S =
(iV, icoV := V, coV
; S
; fV := V
;
V, coV := iV, icoV
; S
; V := fV)[live V, coV]
provided def.S = (V, coV)
and (iV, icoV, fV) ∩ glob.S = ∅.
Proof.
S
= {prepare for statement duplication (Lemma 6.2 below):
def.S = (V, coV) and (iV, icoV, fV) ∩ glob.S = ∅}
(iV, icoV := V, coV ; S ; fV := V ; V := fV)[live V, coV]
= {statement duplication (Lemma 6.3 below): S is deterministic}
(iV, icoV := V, coV ; S ; fV := V
; V, coV := iV, icoV ; S ; V := fV)[live V, coV] .
Lemma 6.2. Let S, V, coV, iV, icoV, fV be any statement and five sets of variables, respectively;
then
S =
(iV, icoV := V, coV
; S
; fV := V
;
V := fV)[live V, coV]
provided def.S = (V, coV)
and (iV, icoV, fV) ∩ glob.S = ∅.
Proof.
(iV, icoV := V, coV ; S ; fV := V ; V := fV)[live V, coV]
= {assignment-based sub. (Law 18): V ∩ fV = ∅ since
V ⊆ def.S (proviso), def.S ⊆ glob.S (RE5) and glob.S ∩ fV = ∅ (proviso)}
(iV, icoV := V, coV ; S ; fV := V ; V := V)[live V, coV]
= {remove aux. self assignment (Law 2)}
(iV, icoV := V, coV ; S ; fV := V)[live V, coV]
= {remove dead assignments (Law 24): fV ∩ (V, coV) = ∅ (proviso and RE5)}
(iV, icoV := V, coV ; S)[live V, coV]
= {remove dead assignments (Law 25):
(iV, icoV) ∩ (((V, coV) \ ddef.S) ∪ input.S) = ∅ (again, proviso and RE5)}
S[live V, coV]
= {remove aux. liveness info. (Law 19): def.S ⊆ (V, coV)}
S .
Lemma 6.3. Let S, V, coV, iV, icoV, fV be any deterministic statement and five sets of vari-
ables, respectively; then
iV, icoV := V, coV
; S
; fV := V
=
iV, icoV := V, coV
; S
; fV := V
;
V, coV := iV, icoV
; S
provided def.S = (V, coV)
and (iV, icoV, fV) ∩ glob.S = ∅.
Proof.
iV, icoV := V, coV ; S ; fV := V
= {intro. following assertion (Law 7)}
iV, icoV := V, coV ; {V, coV = iV, icoV} ; S ; fV := V
= {see below}
iV, icoV := V, coV ; {V, coV = iV, icoV} ; S ; fV := V
; V, coV := iV, icoV ; S
= {remove following assertion (Law 7)}
iV, icoV := V, coV ; S ; fV := V ; V, coV := iV, icoV ; S .
Note that no data may flow from the first clone to the second:
def.(S ; fV := V) ∩ input.(V, coV := iV, icoV ; S) = ∅. We now observe for all P
wp.({V, coV = iV, icoV} ; S ; fV := V ; V, coV := iV, icoV ; S).P
≡ {wp of ' ; ' and assertions}
(V, coV = iV, icoV) ∧ wp.(S ; fV := V ; V, coV := iV, icoV ; S).P
≡ {wp of ' ; ' and ':='}
(V, coV = iV, icoV) ∧ wp.S.((wp.(V, coV := iV, icoV ; S).P)[fV\V])
≡ {wp of ' ; ' and ':='}
(V, coV = iV, icoV) ∧ wp.S.((wp.S.P)[V, coV\iV, icoV][fV\V]) .
At this point, due to Corollary 5.6 and the determinism of S, we are ready to distinguish two
complementary cases: (a) glob.P ⊆ fV; and (b) glob.P ∩ fV = ∅. The former case involves results
computed in the first clone of S whereas the latter takes care of the computations from the second
clone, which are — due to the lack of data flow — independent of the first clone's results.
Case (a): glob.P ⊆ fV
(V, coV = iV, icoV) ∧ wp.S.((wp.S.P)[V, coV\iV, icoV][fV\V])
≡ {RE3: glob.P ∩ def.S = ∅}
(V, coV = iV, icoV) ∧ wp.S.((wp.S.true ∧ P)[V, coV\iV, icoV][fV\V])
≡ {dist. of normal subs over ∧}
(V, coV = iV, icoV) ∧ wp.S.((wp.S.true)[V, coV\iV, icoV][fV\V]
∧ P[V, coV\iV, icoV][fV\V])
≡ {remove redundant subs: RE2 (fV ∩ (input.S, iV, icoV) = ∅)}
(V, coV = iV, icoV) ∧ wp.S.((wp.S.true)[V, coV\iV, icoV]
∧ P[V, coV\iV, icoV][fV\V])
≡ {remove redundant subs: (V, coV) ∩ glob.P = ∅}
(V, coV = iV, icoV) ∧ wp.S.((wp.S.true)[V, coV\iV, icoV] ∧ P[fV\V])
≡ {wp.S is conj.}
(V, coV = iV, icoV) ∧ wp.S.((wp.S.true)[V, coV\iV, icoV]) ∧ wp.S.(P[fV\V])
≡ {RE3: (iV, icoV) ∩ def.S = ∅ (recall (V, coV) = def.S)}
(V, coV = iV, icoV) ∧ wp.S.true ∧ (wp.S.true)[V, coV\iV, icoV]
∧ wp.S.(P[fV\V])
≡ {remove redundant subs: (V, coV = iV, icoV)}
(V, coV = iV, icoV) ∧ wp.S.true ∧ wp.S.true ∧ wp.S.(P[fV\V])
≡ {absorb termination (3.14), twice}
(V, coV = iV, icoV) ∧ wp.S.(P[fV\V])
≡ {wp of ':=' and ' ; '}
(V, coV = iV, icoV) ∧ wp.(S ; fV := V).P
≡ {wp of assertions and ' ; '}
wp.({iV, icoV = V, coV} ; S ; fV := V).P .
Case (b): glob.P ∩ fV = ∅
(V, coV = iV, icoV) ∧ wp.S.((wp.S.P)[V, coV\iV, icoV][fV\V])
≡ {remove redundant sub.: fV ∩ (glob.S ∪ glob.P ∪ (iV, icoV)) = ∅, proviso and RE2}
(V, coV = iV, icoV) ∧ wp.S.((wp.S.P)[V, coV\iV, icoV])
≡ {RE3: glob.((wp.S.P)[V, coV\iV, icoV]) ∩ def.S = ∅
(recall (V, coV) = def.S)}
(V, coV = iV, icoV) ∧ wp.S.true ∧ (wp.S.P)[V, coV\iV, icoV]
≡ {remove redundant subs: (V, coV = iV, icoV)}
(V, coV = iV, icoV) ∧ wp.S.true ∧ wp.S.P
≡ {absorb termination (3.14)}
(V, coV = iV, icoV) ∧ wp.S.P
≡ {intro. redundant sub.: proviso}
(V, coV = iV, icoV) ∧ wp.S.(P[fV\V])
≡ {wp of ':=' and ' ; '}
(V, coV = iV, icoV) ∧ wp.(S ; fV := V).P
≡ {wp of assertions and ' ; '}
wp.({iV, icoV = V, coV} ; S ; fV := V).P .
6.4 Summary and discussion
This chapter has introduced our first solution to slice extraction, through a naive sliding approach
of statement duplication. Two clones of a given statement are composed for sequential execution.
The first clone, i.e. the extracted code, is dedicated to computing a selected subset of the original
program's results, whereas the second clone, i.e. the complement, is responsible for the remaining
results. Behaviour preservation is guaranteed, as is formally proved in the chapter, by the addition
of compensatory code.
This includes copying of the initial state, saving it in backup variables, in an approach borrowed
from the Tuck transformation of Lakhotia and Deprez [40]. However, in order to keep the resulting
program as close to the original as possible, we refrain from their decision to rename variables
in the complement. Instead, the initial state is retrieved from backup variables into the original
ones, just before the complement begins execution.
The success of this approach is based on a new type of compensation, according to which the
final value of extracted variables is also kept in backup variables, ahead of retrieving the initial
state for the complement. Accordingly, those are retrieved once the complement’s execution is
over.
The proof of correctness is based on our proof method, as developed in the preceding chap-
ter. The equivalence of the original program and its duplicated version has been proved for the
extracted set of variables separately from the proof for the complementary set.
Having proved the equivalence of a program and its duplicated version, rather than expressing
it as a direct transformation, the result of this chapter is also applicable for merging a duplicated
statement. However, as we are interested in slice extraction for untangling code, rather than
tangling it, that direction will not be pursued further (hence the chapter's title).
Several improvements of statement duplication will be developed in later chapters. Both the
extracted code and its complement will be reduced by slicing (in the next three chapters). Then,
the complement will be further reduced by reusing extracted results (Chapter 10) and redundant
compensation will be eliminated in Chapter 11.
Our correctness proof has been decomposed in a certain manner such that those further im-
provements will be able to reuse parts of it. In particular, both Lemma 6.2 and Lemma 6.3 will
be reused in Chapter 10 when reducing the complement.
Finally, we consider statement duplication as a naive sliding operation, at least metaphorically,
due to the following observation. The code of the given program statement can be printed on a
single transparency slide and photocopied. Placing the two slides one on top of the other yields
the original program; then sliding one away from the other and adding compensatory code would
yield the duplicated version. The further improvements will refine this approach by representing
a program with more slides. On each such slide, in turn, a part of the program will be printed.
Chapter 7
Semantic Slice Extraction
As was introduced earlier in the thesis (back in Chapter 1), the main challenge in slice extraction is
to be able to untangle the extracted code from its complement, whilst minimizing code duplication.
With respect to the goal of minimizing duplication, it may seem self-defeating to base our novel
approach on statement duplication. However, this duplication can be justified by the following
observation.
Once a statement has been duplicated, and each of its clones has been specialized (through
copying of initial and final values, in the compensatory code) for computing only a subset of its
results, we have potentially rendered some of its internal statements dead. Those can subsequently
be removed by slicing.
In this chapter, requirements of slicing are derived from the earlier statement-duplication for-
malisation and a form of live variables analysis, to be introduced in the chapter. The result of this
derivation, besides slicing requirements, is a refinement relation similar to but more general than
the program equivalence of statement duplication. This refinement rule will later (in Chapter 9)
be applied in deriving our first slice-extraction transformation.
7.1 Example
Going back to the sum and prod example, we now start with the following version:
i,sum,prod := 0,0,1
; while i<a.length do
i,sum,prod :=
i+1,sum+a[i],prod*a[i]
od
and try to extract sum. Transforming the code according to the statement-duplication program
equivalence (6.1) would yield the following version (replacing liveness information with local vari-
ables):
|[var isum,iprod,ii,fsum
; isum,iprod,ii := sum,prod,i
; i,sum,prod := 0,0,1
; while i<a.length do
i,sum,prod :=
i+1,sum+a[i],prod*a[i]
od
; fsum := sum
;
sum,prod,i := isum,iprod,ii
; i,sum,prod := 0,0,1
; while i<a.length do
i,sum,prod :=
i+1,sum+a[i],prod*a[i]
od
; sum:=fsum
]|
The duplicated statement can now be simplified by slicing. The result
|[var isum,iprod,ii,fsum
; isum,iprod,ii := sum,prod,i
; i,sum := 0,0
; while i<a.length do
i,sum :=
i+1,sum+a[i]
od
; fsum := sum
;
sum,prod,i := isum,iprod,ii
; i, prod := 0, 1
; while i<a.length do
i, prod :=
i+1, prod*a[i]
od
; sum:=fsum
]|
can then be re-formatted as
|[var isum,iprod,ii,fsum
; isum,iprod,ii := sum,prod,i
; i,sum := 0,0
; while i<a.length do
i,sum := i+1,sum+a[i]
od
; fsum := sum
;
sum,prod,i := isum,iprod,ii
; i,prod := 0,1
; while i<a.length do
i,prod := i+1,prod*a[i]
od
; sum:=fsum
]|
Notice that the compensatory code, i.e. all backup (local) variables isum, iprod, ii and fsum,
along with their respective initialization code, has become redundant (although this will not al-
ways be the case). That redundancy will be removed in later simplification steps (see Chapter 11).
Later in the chapter we shall formally derive the (syntactic and semantic) requirements of
slices, by developing an improved solution to slice extraction (on the lines of the above example).
This solution, as well as the semantics of slices, will be based on a formal approach to live variables
analysis.
But first, we turn to introduce that liveness analysis.
7.2 Live variables analysis
Suppose we wish to perform liveness analysis on a core statement S with respect to a given set
of variables, V. Let coV := def.S \ V, the complementary set of defined variables, be considered
live throughout S (i.e. on exit from any of its slips). We first note that the over-approximation of
considering elements of coV live, even in places where they are not, will not be harmful.
Moreover, variables outside (V, coV) are not defined in any slip T of S (since S is a core
statement, with no local variables). Such variables will keep their initial value throughout S, and
hence there will be no harm in ignoring them.
Accordingly, liveness analysis in S begins with S = S[live V, coV] for any given S, V with
coV := def.S \ V. This step is correct due to Law 19 (with V := (V, coV)).
Then, liveness information is propagated to all slips of S, in a syntax-directed manner, as
follows.
For sequential composition, we turn any (S1 ; S2)[live V1, coV] with V1 ⊆ V into
(S1[live V2, coV] ; S2[live V1, coV])[live V1, coV], where
V2 := (V1 \ ddef.S2) ∪ (V ∩ input.S2), which is correct due to Law 20 and our earlier comments
on the redundancy of variables outside (V, coV) and the legitimacy of over-approximation (in coV).
For IF statements, we simply turn any (if B then S1 else S2)[live V1, coV] into
if B then S1[live V1, coV] else S2[live V1, coV] by applying Law 21 with V := (V1, coV).
Finally, for DO loops, we turn (while B do S od)[live V1, coV] into
(while B do S[live V2, coV] od)[live V1, coV], where
V2 := V1 ∪ (V ∩ (glob.B ∪ input.S)), which is correct due to Law 22 and, again, the earlier
comment on the redundancy of variables outside (V, coV).
We refer to the results of liveness analysis of statement S on variables V by saying
"let T[live V1, coV] be any slip of S[live V, coV]". In a full live variables analysis of a statement
S, the set of variables, V, is not explicitly selected and the set def.S is taken in its place.
Hence, in full liveness analysis, the complementary set coV is empty and we say "let T[live V1]
be any slip of S". Then T[live V1] can be safely augmented with following assignments to any
of the variables in def.S \ V1. This augmentation will still keep all auxiliary liveness information
redundant (and hence removable). That is, any augmentation by assignment to dead variables
(from def.S) is correct, in the context of S. (Augmentation by assignment to any other dead
variable not in def.S will be correct in S[live V] but not in S itself. However, we will not be
interested in such augmentations.)
Note that, in deviation from traditional liveness analysis (as in [47]), which typically propagates
information on a flow graph until a fixed point is reached, our algorithm requires one pass of the
program's tree. This is possible due to the simplicity of our language (e.g. no jumps) and the
availability of summary information (i.e. the sets def, ddef and input).
In that light, it is important and interesting to verify that our algorithm is insensitive to
different parses of a given program — which are possible due to the associativity of sequential
composition. That is, just as S1 ; (S2 ; S3) = (S1 ; S2) ; S3 in our language, so
will the analysis produce identical results in both cases, as is shown in the following.
Theorem 7.1. Distribution of liveness information over sequential composition is associative.
Proof. Suppose we perform liveness analysis on a core statement S with def.S = V, and we
reach a slip of the form S1 ; S2 ; S3 with live-on-exit variables V3. (Note that the liveness
analysis guarantees V3 ⊆ V.) We now need to show that whatever the internal parsing, the
liveness-analysis algorithm would identify the same results for slips S1, S2 and S3, on both
(S1 ; (S2 ; S3))[live V3] and ((S1 ; S2) ; S3)[live V3] .
For the former, we have
(S1 ; (S2 ; S3))[live V3]
= {liveness analysis (on V):
let V1 := (V3 \ ddef.(S2 ; S3)) ∪ (V ∩ input.(S2 ; S3))}
(S1[live V1] ; (S2 ; S3)[live V3])[live V3]
= {liveness analysis (on V): let V2 := (V3 \ ddef.S3) ∪ (V ∩ input.S3)}
(S1[live V1] ; (S2[live V2] ; S3[live V3])[live V3])[live V3] ,
and for the latter, we have
((S1 ; S2) ; S3)[live V3]
= {liveness analysis (on V), with V2 as above}
((S1 ; S2)[live V2] ; S3[live V3])[live V3]
= {liveness analysis (on V): let V1' := (V2 \ ddef.S2) ∪ (V ∩ input.S2)}
((S1[live V1'] ; S2[live V2])[live V2] ; S3[live V3])[live V3] .
Finally, we observe that V1 = V1', as expected, since
V1
= {def. of V1}
(V3 \ ddef.(S2 ; S3)) ∪ (V ∩ input.(S2 ; S3))
= {set theory: V3 ⊆ V}
V ∩ ((V3 \ ddef.(S2 ; S3)) ∪ input.(S2 ; S3))
= {Lemma 7.2, see below}
V ∩ ((((V3 \ ddef.S3) ∪ input.S3) \ ddef.S2) ∪ input.S2)
= {set theory: again, V3 ⊆ V}
(((V3 \ ddef.S3) ∪ (V ∩ input.S3)) \ ddef.S2) ∪ (V ∩ input.S2)
= {def. of V2}
(V2 \ ddef.S2) ∪ (V ∩ input.S2)
= {def. of V1'}
V1' .
Lemma 7.2. Let S1, S2, V be any two statements and a set of variables, respectively; then
(V \ ddef.(S1 ; S2)) ∪ input.(S1 ; S2)
=
(((V \ ddef.S2) ∪ input.S2) \ ddef.S1) ∪ input.S1 .
Proof. We observe
(V \ ddef.(S1 ; S2)) ∪ input.(S1 ; S2)
= {ddef and input of ' ; '}
(V \ (ddef.S1 ∪ ddef.S2)) ∪ (input.S1 ∪ (input.S2 \ ddef.S1))
= {set theory}
(V \ (ddef.S1 ∪ ddef.S2)) ∪ (input.S2 \ ddef.S1) ∪ input.S1
= {set theory}
((V \ ddef.S2) \ ddef.S1) ∪ (input.S2 \ ddef.S1) ∪ input.S1
= {set theory}
(((V \ ddef.S2) ∪ input.S2) \ ddef.S1) ∪ input.S1 .
7.2.1 Simultaneous liveness
Liveness analysis will be useful beyond the elimination of dead assignments.
Definition 7.3 (Simultaneous Liveness). When performing full live variables analysis on a given
S[live V], a set of variables X is considered simultaneously-live (in S[live V]) if more than one
element of X is in the live-variables set of some slip T of S. When no such slip exists, X is not
simultaneously-live in S[live V].
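To illustrate (an example of our own): in (a := 1 ; b := 2 ; c := a + b)[live c], the set {a, b} is simultaneously-live, as both a and b are live on exit from b := 2; in (a := 1 ; c := a ; b := 2 ; d := b)[live c, d] it is not, the propagated live sets — {a}, {c}, {b, c} and {c, d} — never containing more than one of a and b.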
The concept of simultaneous liveness will be useful mainly in the merging of live ranges [54].
A set of non-simultaneously-live variables can (under some further conditions) be merged into one
variable. This will be explored, formalised and applied later in the thesis (see Section 8.6.2 and
Appendix D).
This concludes our introduction to liveness analysis, which will be applied next for slice ex-
traction and later in the thesis e.g. for reducing compensation after sliding (Chapter 11).
7.3 Formal derivation using statement duplication
Refinement 7.4. Let S, SV, ScoV, V, coV, iV, icoV, fV be three deterministic statements and
five sets of variables, respectively; then
S ⊑
(iV, icoV := V, coV
; SV
; fV := V
;
V, coV := iV, icoV
; ScoV
; V := fV)[live V, coV]
provided def.S = (V, coV),
S[live V] ⊑ SV[live V],
S[live coV] ⊑ ScoV[live coV],
def.SV ⊆ def.S,
def.ScoV ⊆ def.S and
(iV, icoV, fV) ∩ glob.S = ∅.
Proof.
S
= {duplicate statement (Program equivalence 6.1): S is deterministic,
def.S = (V, coV) and (iV, icoV, fV) ∩ glob.S = ∅ (provisos)}
(iV, icoV := V, coV ; S ; fV := V
; V, coV := iV, icoV ; S ; V := fV)[live V, coV]
⊑ {Refinement 7.5 with S' := S}
(iV, icoV := V, coV ; SV ; fV := V
; V, coV := iV, icoV ; ScoV ; V := fV)[live V, coV] .
Refinement 7.5. Let S, S', SV, ScoV, V, coV, iV, icoV, fV be four statements and five sets of
variables, respectively; then
(iV, icoV := V, coV
; S
; fV := V
;
V, coV := iV, icoV
; S'
; V := fV)[live V, coV]
⊑
(iV, icoV := V, coV
; SV
; fV := V
;
V, coV := iV, icoV
; ScoV
; V := fV)[live V, coV]
provided
P1: def.S = (V, coV),
P2: def.S' = (V, coV),
P3: S[live V] ⊑ SV[live V],
P4: S'[live coV] ⊑ ScoV[live coV],
P5: def.SV ⊆ def.S,
P6: def.ScoV ⊆ def.S' and
P7: (iV, icoV, fV) ∩ glob.S = ∅.
Proof.
(iV, icoV := V, coV ; S ; fV := V
; V, coV := iV, icoV ; S′ ; V := fV )[live V, coV ]
= {liveness analysis: fV ∩ def.S′ = ∅ (P7, RE5, P1 and P2)}
((iV, icoV := V, coV ; S ; fV := V ;
V, coV := iV, icoV ; S′[live coV ])[live fV, coV ] ; V := fV )[live V, coV ]
⊑ {(P4)}
((iV, icoV := V, coV ; S ; fV := V ;
V, coV := iV, icoV ; ScoV [live coV ])[live fV, coV ] ; V := fV )[live V, coV ]
= {liveness removal: def.ScoV ∩ fV = ∅ (P1, P2, P6, P7 and RE5)}
((iV, icoV := V, coV ; S ; fV := V ;
V, coV := iV, icoV ; ScoV )[live fV, coV ] ; V := fV )[live V, coV ]
= {liveness analysis: (input.ScoV ∖ ddef.(fV := V ; V, coV := iV, icoV ))
∩ def.(iV, icoV := V, coV ; S) ⊆ (iV, icoV ); then (iV, icoV ) ∩ def.S = ∅}
(((iV, icoV := V, coV ; S[live V ])[live V, iV, icoV ] ; fV := V ;
V, coV := iV, icoV ; ScoV )[live fV, coV ] ; V := fV )[live V, coV ]
⊑ {(P3)}
(((iV, icoV := V, coV ; SV [live V ])[live V, iV, icoV ] ; fV := V ;
V, coV := iV, icoV ; ScoV )[live fV, coV ] ; V := fV )[live V, coV ]
= {liveness removal: def.SV ∩ (iV, icoV ) = ∅ (P5, P7, RE5);
then, since coV ∩ (V, iV, icoV ) = ∅ (P1, P7 and RE5), all potentially dead coV
remain dead, since coV ⊆ (ddef.(fV := V ; V, coV := iV, icoV )
∖ input.(fV := V ; V, coV := iV, icoV ))}
(iV, icoV := V, coV ; SV ; fV := V ;
V, coV := iV, icoV ; ScoV ; V := fV )[live V, coV ] .
7.4 Requirements of slicing
From the above law of refinement, we can gather conditions P3 and P5 as requirements for slicing.
That is, for a given deterministic statement S and set of variables V, any statement SV satisfying
(Q1:) S[live V ] ⊑ SV [live V ]
is (at least semantically) a correct slice of S with respect to V. Furthermore, if condition
(Q2:) def.SV ⊆ def.S
holds too, we know SV can successfully replace the extracted S in a transformation of slice
extraction of V from S.
For sanity checking (of the generality of those requirements), we observe conditions P4 and P6
above, for the complement. There, requirement Q1 with S, V, SV := S′, coV, ScoV holds due to
P4, and Q2 with SV, S := ScoV, S′ holds through P6. Thus, any slice ScoV of S′ with respect
to coV would make a good complementary statement in a transformation of slice extraction of V
from S.
For requirement Q1 above, we make one further observation. Following the definitions of
liveness and of the relation of slice-refinement (as defined in Chapter 5), any slice-refinement of S
with respect to V is a semantic slice. That is, any SV satisfying S ⊑V SV
satisfies Q1 and is thus a semantic slice of S, V.
In general, any known refinement technique can be applied to S[live V] (rather than directly
to S) in deriving a slice-refinement. However, in an attempt to constructively describe related
transformations, a slicing algorithm will be formally developed later in the thesis.
7.4.1 Ward’s definition of syntactic and semantic slices
Our semantic definition of a slice is akin to several definitions by Martin Ward (e.g. in [57], and
most recently in “Conditioned Semantic Slicing via Abstraction and Refinement in FermaT” by
Ward, Zedan and Hardcastle [59]), and has indeed been inspired by those. Ward et al. base their
semantic definition on a novel relation between programs, called semi-refinement, which involves
the introduction of termination. That is, a program S′ is a semi-refinement of another program
S if the two are equivalent on all input on which S is guaranteed to terminate. On other inputs,
S′ is free to terminate, or to do anything else. Recalling that refinement involves the introduction
of either (or both) termination and determinism, we note that in our context of deterministic
programs, refinement and semi-refinement are the same.
In their formalism (also based on predicate transformers), a semantic slice of a given program
S on variables X is any program S′ for which S′ ; remove(W ∖ X) is a semi-refinement of
S ; remove(W ∖ X), where W is the final state space for S and S′, and with remove restricting
that state space. Note how our S[live X] concisely captures their S ; remove(W ∖ X). In
effect, their combination of state-space restriction and semi-refinement is captured by our notation
for live variables and normal refinement (and hence the requirement Q1 above) — in our context
of deterministic programs.
A second relation between programs, that of a reduction, is introduced by Ward et al. to
participate in the definition of a syntactic slice. A program reduction involves the replacement of
substatements (i.e. slips in our terminology) with skip statements (or exit, which is beyond the
scope of our investigation), thus maintaining the original syntactic structure. Then, any semantic
slice S′ of S on X is also a syntactic slice if it is a reduction of S. The next chapter will define
the program entities of slides to achieve a similar effect. This will allow a later formulation of a
provably-correct syntax-preserving slicing algorithm.
7.5 Summary
This chapter has developed a refinement rule for slice extraction, based on statement duplication
(from the preceding chapter) and a live variables analysis. Our approach to liveness analysis has
been formalised in the chapter.
Liveness information is introduced into a program statement by first assuming all variables are
live; then the information is propagated to all slips of the original statement; next, local
transformations such as dead-assignment-elimination can be performed; finally, under some conditions,
the correct local transformations are also globally correct such that all liveness information can
be removed.
According to our new liveness-based refinement rule (Refinement 7.4) for slice extraction, both
the extracted code and the complement are slices (of the same program, on two complementary
sets of variables).
Advanced strategies for minimizing the amount of duplication in slice extraction will be
explored later in the thesis, after developing a slicing algorithm (in Chapter 9 ahead); this will be a
semantically correct algorithm, following the slicing requirements (Q1 and Q2) as derived in this
chapter. That algorithm will allow us to constructively describe a transformation based on the
semantic slice-extraction refinement laws of this chapter.
The next chapter will lay the foundations for our slicing algorithm by formalising a novel
decomposition of programs into syntactic elements called slides.
Chapter 8
Slides: A Program Representation
8.1 Slideshow: a program execution metaphor
In this thesis, the execution of imperative procedural sequential programs is thought of as a
systematic slideshow.
According to the slideshow metaphor, the executable code of each procedure is printed on a
single transparent slide, one that is identifiable by the procedure’s unique signature.
For a reason that will soon become clear, we choose to think of the slides as A4 (or longer, if
needs be) transparencies that can be projected using a classroom-like overhead projector. This is
in contrast to traditional photography-related slide projectors, where a picture is printed onto
film (which is then placed inside a cardboard or plastic shell) and, whenever selected for viewing,
is mechanically slid away from a tray (stacking a normally prearranged collection of pictures)
and onto the projector’s lamp.
It is the latter projection style that is responsible for the English terminology of a slide.
Nevertheless, in a sliding transformation, to be introduced shortly and developed throughout the
thesis, the sideways movement of slides will be of a somewhat different nature.
Why do we prefer overhead projection? One reason is that this way, while projecting (i.e.
executing) a program, the presenter can use a non-permanent (i.e. erasable) pen for writing notes
on the slide itself (or alternatively on a separate blank slide that is placed directly on top). This can
be useful e.g. for keeping track of current values of local variables. The other reason is related to
the order of presented slides. In tracking program execution, the order will not be as prearranged,
static and sequential as is usually the case with photographic slideshows.
The slideshow is a demonstration of a typical von Neumann style of sequential program
execution. The program itself is a collection of procedures storing imperative subprograms. (The
model is probably extendable for concurrent and even truly parallel program execution, by having
either one presenter, simultaneously using multiple trays or projectors/screens, or maybe even a
combination of many independent presenters/projectors.)
In the rest of this thesis, the idea of program execution as a slideshow will play no further
part. (It was introduced here merely as an illustration aid.) Instead, the slideshow metaphor
will be applied to the development and evolution of programs, or more specifically to slicing and
refactoring.
8.2 Slides in refactoring: sliding
8.2.1 One slide per statement
It is in illustrating and formalising the slice-extraction refactoring that the program medium of
slides will be instrumental. A plausible interpretation of our initial solution, that of statement
duplication (from Chapter 6 above), goes as follows.
Suppose the code of a program statement S is printed on a single transparency slide; duplicate
that slide, thus yielding two clones, say S1 and S2; place them one on top of the other (thus
getting the original S); slide one of them (say S2) sideways; finally, for behaviour preservation,
add compensatory code.
But duplication of code is bad. The interpretation of our first step for reducing such duplication
(as was defined in the preceding chapter and will be automated in the next), in terms of sliding,
is described in what follows.
8.2.2 A separate slide for each variable
In a first step forward, we will no longer think of a statement S as being printed on a single slide.
Instead, we take further advantage of features of transparency slides, and dedicate a separate slide
for each defined variable. On each such slide, the slice of that variable (from the end of S) can be
printed.
Assuming no dead code, it can be shown that the union of such slides is S itself. Then, when
a set of variables V is selected for extraction from S, all slides of variables in the complementary
set coV := def.S ∖ V can be separated from the slides of V by sliding. As in the previous solution,
compensatory code should be added, to ensure behaviour preservation.
Another feature of transparency slides that proves useful here is that the relative location of slid
program elements remains the same. This is a fact that existing approaches for syntax-preserving
slice extraction, e.g. KH03 [39], have struggled with, both in illustration and formalisation. With
the slideshow metaphor in mind, this requirement has become relatively trivial.
The result is the extraction of a slice (of V ), with the complement being also a slice, of the
complementary set coV . But the complement can be made even smaller, by reusing the extracted
results of V, as in the following.
8.2.3 A separate slide for each individual assignment
Instead of having a slide for each (defined) variable, our final improvement will involve designating
a separate slide for each individual assignment. On each such slide we shall print the assignment
itself, and all guards (controlling whether the assignment will or will not be executed). We shall
pay special attention to preserving layout (on the slide, both metaphorically and later when
formalising slides), such that the original program will be reproducible, as the union of all slides.
Similarly, each slice will consist of the union of all slides of included assignments.
This time, when asked to extract variables V from S, all slides in the slice of V will be
separated from the remaining slides by sliding, leaving a potentially smaller complement. However,
for preserving behaviour, some extra measures will need to be taken. These include duplication of
some slides (that must appear in both the extracted slice and its complement) and the renaming
of reused extracted values in the complement.
This sliding transformation, along with the extra measures, will be formalised later in the
thesis (see Chapter 10). Then, the need for renaming reused extracted values, in the complement,
will be removed in Chapter 11.
8.3 Representing non-contiguous statements
For slice extraction to be implemented as a sliding operation, we need to decompose a given
statement into a set of not-necessarily contiguous statements. As was introduced earlier, in
Section 4.1.1, instead of speaking of substatements as parts of a program statement, we speak of slips
and slides. In terms of the abstract syntax tree, the former correspond to subtrees and the latter
to paths from the root to a node. More precisely, a slide is a statement formed from such a path
by replacing any statement child (i.e. slip) of a node on the path, which is itself not on the path,
with the empty statement skip . For convenience, in concrete examples, we avoid mentioning the
skip , leaving an empty space instead. Note that this space is empty but considered transparent,
in contrast to the misleading convention of “whitespace”. (Admittedly, a concrete syntax that
understands such empty spaces as the empty statement would have been preferable.)
For example, the following program can be represented as the union of the slides of its
individual assignments:
Let S be any core statement and V be any set of variables; then
slides.S.V ≜ skip when V ∩ def.S = ∅; otherwise we have the following definitions:
slides.(X1,X2 := E1,E2).V ≜ (X1 := E1) where X1 ⊆ V and X2 ∩ V = ∅ ;
slides.(S1;S2).V ≜ (slides.S1.V );(slides.S2.V ) ;
slides.(if B then S1 else S2).V ≜ if B then (slides.S1.V ) else (slides.S2.V ) ;
slides.(while B do S od).V ≜ while B do (slides.S.V ) od .
Figure 8.1: Computing the slides of a core statement with respect to a set of variables.
if x>y then
m:=x
else
m:=y
fi
=
if x>y then
m:=x
else

fi
∪
if x>y then

else
m:=y
fi
Such a union operation will be formalised shortly (in the next section) and such slides of
individual assignments will be formalised later in the chapter (in Section 8.6). But first, we find it
more convenient to formalise a more coarse-grained concept. We define the statement formed by
the union of all individual-assignment slides of a certain program statement S with respect to a
set of variables V. (This way, we avoid having to formalise the access to an individual assignment,
e.g. through labels.)
For a given program statement S and any set of variables V, we define the subprogram of S
containing all assignments to variables in V, along with all their enclosing compound statements,
as slides.S.V (see Figure 8.1). In the example above, with the original statement as S, the
statement slides.S.{x,y,z} is the empty statement skip , whereas slides.S.{m} is the union of the
two individual slides (which is in this case the whole program, S).
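As an illustration, the definition of Figure 8.1 translates almost literally into the hypothetical tuple AST introduced with the liveness sketch of Section 7.2; the representation and names remain our own assumptions, not the thesis's formalism.

def defs(stmt):
    """def.S: every variable assigned somewhere in S."""
    kind = stmt[0]
    if kind == 'skip':
        return frozenset()
    if kind == 'assign':
        return frozenset(x for x, _ in stmt[1])
    if kind == 'seq':
        return defs(stmt[1]) | defs(stmt[2])
    if kind == 'if':
        return defs(stmt[2]) | defs(stmt[3])
    if kind == 'while':
        return defs(stmt[2])
    raise ValueError(kind)

def slides(stmt, V):
    """slides.S.V: keep only assignments to V, under their original guards."""
    if not (frozenset(V) & defs(stmt)):
        return ('skip',)                  # V ∩ def.S = ∅
    kind = stmt[0]
    if kind == 'assign':                  # project the multiple assignment on V
        return ('assign', tuple((x, r) for x, r in stmt[1] if x in V))
    if kind == 'seq':
        return ('seq', slides(stmt[1], V), slides(stmt[2], V))
    if kind == 'if':
        return ('if', stmt[1], slides(stmt[2], V), slides(stmt[3], V))
    if kind == 'while':
        return ('while', stmt[1], slides(stmt[2], V))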
Note that in general, a given core statement S is represented as slides.S.(def.S). Further note
that we choose to name that function slides, instead of say slide or slideFor, despite the fact that it
yields a single statement, because its result should be thought of as the collection of all individual-
assignment slides of variables in the selected set. For convenience, we choose not to distinguish
that collection from the actual statement its union would yield. Indeed, when collecting a set of
slides, putting them one on top of the other, the resulting program is the union of those slides.
This is formalised next.
Let S1, S2, B, X1, X2, X3, E1, E2, E3 be two core statements, a boolean expression, three sets of
variables and three corresponding expressions, respectively; then
S1 ∪ skip ≜ S1 ;
skip ∪ S2 ≜ S2 ;
(X1,X2 := E1,E2) ∪ (X1,X3 := E1,E3) ≜ (X1,X2,X3 := E1,E2,E3)
provided X2 ∩ X3 = ∅ ;
(S1;S2) ∪ (S1′;S2′) ≜ (S1 ∪ S1′);(S2 ∪ S2′) ;
(if B then S1 else S2) ∪ (if B then S1′ else S2′) ≜
if B then (S1 ∪ S1′) else (S2 ∪ S2′) ;
(while B do S1 od) ∪ (while B do S1′ od) ≜ while B do (S1 ∪ S1′) od .
Figure 8.2: Unifying (or merging) statements.
8.4 Collecting slides: the union of non-contiguous code
We define an operation for unifying (or merging) two program statements, S1 and S2, into a single
statement, S1 ∪ S2 (see Figure 8.2).
Note that two statements S1 and S2 are unifiable (i.e. S1 ∪ S2 is well-defined) only when
they have the same shape, as is implicitly expressed in the definition of ∪. For example, an IF
statement can only be merged with an empty statement or with another IF statement whose guard
and two branches are unifiable with the corresponding guard and branches of the former.
Furthermore, note that we do not write wp.(S1 ∪ S2) and do not define wp-semantics for slide
union (or for taking slides in general). This is so since ∪ is not a construct of our
programming language. It is rather a meta-program operation, generating (when well-defined) a program
statement.
Following its definition, it is easy to verify that the union of statements is commutative,
associative and idempotent; hence the choice of infix ∪. The following theorem shows that the
union of slides (for a given statement and a pair of variable sets) is equivalent to the slides of the
union.
Theorem 8.1. Any pair of slides of a single statement, slides.S.V1 and slides.S.V2, is unifiable.
Furthermore, we have
(slides.S.(V1 ∪ V2)) = ((slides.S.V1) ∪ (slides.S.V2)) .
The proof, by induction on the structure of S, can be found in Appendix C.
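The union of Figure 8.2 can likewise be sketched over the assumed tuple AST; shared targets must carry identical expressions (the X1 part), while the remaining targets must be disjoint (X2 ∩ X3 = ∅).

def union(s1, s2):
    """S1 ∪ S2 of Figure 8.2; raises ValueError when not unifiable."""
    if s1[0] == 'skip':
        return s2
    if s2[0] == 'skip':
        return s1
    if s1[0] != s2[0]:
        raise ValueError('statements differ in shape')
    kind = s1[0]
    if kind == 'assign':
        merged = dict(s1[1])
        for x, r in s2[1]:
            if x in merged and merged[x] != r:   # shared target, different E
                raise ValueError('conflicting assignments to ' + x)
            merged[x] = r
        return ('assign', tuple(sorted(merged.items(), key=lambda p: p[0])))
    if kind == 'seq':
        return ('seq', union(s1[1], s2[1]), union(s1[2], s2[2]))
    if kind == 'if':
        if s1[1] != s2[1]:
            raise ValueError('guards differ')
        return ('if', s1[1], union(s1[2], s2[2]), union(s1[3], s2[3]))
    if kind == 'while':
        if s1[1] != s2[1]:
            raise ValueError('guards differ')
        return ('while', s1[1], union(s1[2], s2[2]))

On the if-example of Section 8.3, the union of the two slides of m indeed reproduces the original statement, as Theorem 8.1 guarantees in general.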
Representing program statements as collections of slides will be useful for slicing. A slicing
algorithm typically takes into account both data and control flow. Slides encompass control
dependences and have been defined here in a syntax-directed way, due to the simplicity of structure
of our language. Data dependences, on the level of slides, will be considered next.
8.5 Slide dependence and independence
Data flow between sets of slides is formalised through a relation of slide dependence and a corre-
sponding concept of slide independence. We start with the latter.
Definition 8.2 (Slide Independence). A set of variables V is considered slide independent with
respect to a given statement S, if the condition
glob.(slides.S.V ) ∩ def.S ⊆ V
holds. (Recall slides.S.V is a normal statement, so glob.(slides.S.V ) is the set of global variables
in that statement.)
Interesting (semantic) properties of independent slides will be extensively investigated in the
next chapter, when developing a slicing algorithm. The complementary notion, of slide
dependence, is defined as follows.
Definition 8.3 (A Relation of Slide Dependence). A set of variables V1 depends on another set
V2 with respect to a given statement S, i.e. V1 is related to V2 through slide dependence, when
input.(slides.S.V1) ∩ def.(slides.S.V2) ≠ ∅ .
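Both notions are directly computable in the tuple-AST sketch of the previous sections, given a ddef/input pair in the style of the liveness rules; as before, the representation is our own illustrative assumption.

def ddef(stmt):
    """ddef.S: variables defined on every path through S."""
    kind = stmt[0]
    if kind == 'assign':
        return frozenset(x for x, _ in stmt[1])
    if kind == 'seq':
        return ddef(stmt[1]) | ddef(stmt[2])
    if kind == 'if':
        return ddef(stmt[2]) & ddef(stmt[3])
    return frozenset()                        # skip, while: no guarantee

def inputs(stmt):
    """input.S: variables whose incoming value may be read by S."""
    kind = stmt[0]
    if kind == 'skip':
        return frozenset()
    if kind == 'assign':
        return frozenset().union(*(r for _, r in stmt[1]))
    if kind == 'seq':
        return inputs(stmt[1]) | (inputs(stmt[2]) - ddef(stmt[1]))
    if kind == 'if':
        return frozenset(stmt[1]) | inputs(stmt[2]) | inputs(stmt[3])
    if kind == 'while':
        return frozenset(stmt[1]) | inputs(stmt[2])

def slide_independent(stmt, V):
    """Definition 8.2: glob.(slides.S.V) ∩ def.S ⊆ V (glob = def ∪ input, RE5)."""
    sl = slides(stmt, V)
    return ((defs(sl) | inputs(sl)) & defs(stmt)) <= frozenset(V)

def depends(stmt, V1, V2):
    """Definition 8.3: V1 slide-depends on V2 in S."""
    return bool(inputs(slides(stmt, V1)) & defs(slides(stmt, V2)))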
8.5.1 Smallest enclosing slide-independent set
The reflexive transitive closure of a set V in the context of slides.S, denoted V∗, is the smallest
slide-independent superset of V. (Recall that slide dependence is indeed a relation between sets
of variables.)
When asked to compute the reflexive transitive closure of slide dependence, for a given
statement S and set of variables V, we choose to avoid computing the full relation. Instead, we take a
faster lazy approach, repeatedly adding (to V ) slides on which V depends, until a fixed point is
reached. At each step, all variables in U := input.(slides.S.V ) ∩ def.S are added to V. A fixed
point is reached when U ⊆ V.
This can be slightly improved by observing the relationship between global variables and
input. In general, we note that computing the set of global variables, in a collection of slides
(as for any statement), is faster than computing its set of input variables. Now from RE5 we
know glob.T = def.T ∪ input.T. So, for computing the set U, we observe that even though
glob.(slides.S.V ) ∩ def.S may be larger than input.(slides.S.V ) ∩ def.S, the extra variables would
be from def.(slides.S.V ) and hence from V. We conclude that since the only purpose of U, in
the present algorithm, is to be tested for inclusion in V (and then possibly be added to V ), there
would be no harm in including variables from V in U.
Thus the algorithm for computing the reflexive transitive closure of slide dependence, for any
given S, V is as follows:
slides-dep-rtc.S.V ≜ if U ⊆ V then V else slides-dep-rtc.S.(V ∪ U)
where U := glob.(slides.S.V ) ∩ def.S .
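In the tuple-AST sketch, with defs and inputs as above (and glob as their union, by RE5), this fixed-point computation is direct; the function name is ours.

def slides_dep_rtc(stmt, V):
    """Reflexive transitive closure of slide dependence, lazily computed."""
    V = frozenset(V)
    while True:
        sl = slides(stmt, V)
        U = (defs(sl) | inputs(sl)) & defs(stmt)   # glob.(slides.S.V) ∩ def.S
        if U <= V:                                 # fixed point reached
            return V
        V = V | U                                  # add what V depends on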
8.6 SSA form
Up till now we had one slide for each variable, including all definitions of that variable. Can we
refine the representation such that each slide will be dedicated to a specific instance (i.e. definition
point) of a variable?
We do that by splitting the selected variable, such that a new variable is defined at each
definition point, in the style of SSA. We then show that under some conditions the instances of
a variable can be merged back (to the original) even after performing some transformations (e.g.
slicing).
8.6.1 Transform to SSA
The set of instance variables replacing a variable of the original program in the SSA form is
expected to maintain a property of no-simultaneous-liveness. This way, it will be possible to
transform the program back from SSA.
A toSSA algorithm is formally derived for our core language as Transformation D.5 in
Appendix D and is repeated here as Figure 8.3.
In transforming a given statement S with respect to variables X (i.e. splitting definitions of
X alone), we aim to end up with a statement S′ free of occurrences of X (i.e. glob.S′ ∩ X = ∅) and
with at most one instance (of each member of X) live at each point of S′.
According to Transformation D.5 and its corresponding preconditions P1-P7, we observe that
X should be partitioned into six mutually-disjoint subsets X1, X2, X3, X4, X5, X6. However,
following postcondition Q1 and preconditions P5 and P6, we further observe that of those, only
X4 and X5 are both live-on-exit and defined in S. Since in general we mean to transform all
variables in X := def.S, and since we expect all members of X to be live-on-exit, we are left with
Let S, X, Y be any core statement and two (disjoint) sets of variables; let X1, X2, X3, X4, X5 be
five (mutually disjoint) subsets of X, and let XL1i, XL2i, XL3i, XL4i, XL4f, XL5f be six sets of
instances, all included in the full set of instances XLs; let S′ be the SSA form of S defined by
S′ := toSSA.(S, X, (XL1i,XL2i,XL3i,XL4i), (XL3i,XL4f,XL5f ), Y, XLs); then (Q1:)
(S ; XL3i,XL4f,XL5f := X3,X4,X5)[live XL3i,XL4f,XL5f,Y ] =
(XL1i,XL2i,XL3i,XL4i := X1,X2,X3,X4 ; S′)[live XL3i,XL4f,XL5f,Y ]
and (Q2:) X ∩ glob.S′ = ∅
provided
P1: glob.S ⊆ (X,Y ),
P2: (X1,X2,X3,X4,X5) ⊆ X,
P3: (XL1i,XL2i,XL3i,XL4i,XL4f,XL5f ) ⊆ XLs,
P4: XLs ∩ (X,Y ) = ∅,
P5: (X1,X3) ∩ def.S = ∅,
P6: (X2,X4,X5) ⊆ def.S and
P7: (X ∩ (((X3,X4,X5) ∖ ddef.S) ∪ input.S)) ⊆ (X1,X2,X3,X4) .
Figure 8.3: Transformation D.5 of toSSA (from the appendix).
the 2-partition X = (X4,X5). Of those, we observe (by P2, P7) that only members of X4 are
live-on-entry (i.e. in (X ∖ ddef.S) ∪ (X ∩ input.S)).
We now need to prepare initial and final instances for X4 and X5. For the former, we observe
how Q1 and P3 imply that the set of initial instances XL4i must be disjoint from the set XL4f of
final instances, whereas for the latter only final instances (XL5f ) are required. From P2, P3 and
P4 it is clear that all instances must be fresh (i.e. disjoint from (X,Y )).
Thus, the toSSA algorithm can be applied as follows:
Let S be any given statement; let X := def.S and Y := (glob.S ∖ X) be a 2-partition of
glob.S; let X4 := (X ∖ ddef.S) ∪ (X ∩ input.S) and X5 := X ∖ X4 be a 2-partition of X;
let (XL4i,XL4f,XL5f ) := fresh.((X4,X4,X5),(X,Y )) be three sets of fresh instances — according
to property Q2 of our definition of fresh (see Section 3.1.3) we indeed get the required
(XL4i,XL4f,XL5f ) ∩ (X,Y ) = ∅ — and finally let
S′ := toSSA.(S,(X4,X5),XL4i,(XL4f,XL5f ),Y,(XL4i,XL4f,XL5f )) and let XLim := glob.S′ ∖
(Y,XL4i,XL4f,XL5f ) be the set of all intermediate instances; we then observe
S
= {intro aux. liveness info.; intro. dead assignment;
intro. self-assignment; assignment-based sub.}
(S ; XL4f,XL5f := X4,X5 ; X4,X5 := XL4f,XL5f )[live X4,X5]
= {prop. liveness info.}
((S ; XL4f,XL5f := X4,X5)[live XL4f,XL5f ]
; X4,X5 := XL4f,XL5f )[live X4,X5]
= {Q1 of Transformation D.5: P1-P7 hold by construction (as justified above)}
((XL4i := X4 ; S′)[live XL4f,XL5f ] ; X4,X5 := XL4f,XL5f )[live X4,X5]
= {remove aux. liveness info.}
(XL4i := X4 ; S′ ; X4,X5 := XL4f,XL5f )[live X4,X5]
= {def. of live : def. of XLim; note XLim ∩ (X4,X5) = ∅ due to
Q2 of toSSA (X ∩ glob.S′ = ∅)}
|[ var XL4i,XL4f,XL5f,XLim ; XL4i := X4 ; S′ ; X4,X5 := XL4f,XL5f ]| .
8.6.2 Back from SSA
As the derivation above is made of program equations, the reversed derivation is used for returning
from SSA. However, returning from SSA is more general since we wish to return even after having
made some transformations on the immediate SSA version. In effect, returning from SSA involves
Let S′ be any core statement and (XL1i ∪ XL2f ) ⊆ XLs; let S be a statement defined by
S := merge-vars.(S′,XLs,X,XL1i,XL2f,Y ); then (Q1:)
(XL1i := X1 ; S′)[live XL2f,Y ] = (S ; XL2f := X2)[live XL2f,Y ]
and (Q2:) XLs ∩ glob.S = ∅
provided
P1: glob.S′ ⊆ (XLs,Y ),
P2: (XL1i ∪ XL2f ) ⊆ XLs,
P3: (X1 ∪ X2) ⊆ X,
P4: X ∩ (XLs,Y ) = ∅,
P5: no two instances of any member of X are simultaneously-live at any point in S′[live XL2f,Y ],
P6: (XLs ∩ ((XL2f ∖ ddef.S′) ∪ input.S′)) ⊆ XL1i,
P7: no def-on-live: i.e. no instance is defined where another instance is live-on-exit,
P8: no multiple-defs: i.e. each assignment defines at most one instance (of any X.i).
Figure 8.4: Transformation D.6 of merge-vars (from the appendix).
the merge of all instances of an original program variable. This way, all definitions of pseudo
instances become redundant self-assignments and are hence removed.
Accordingly, the fromSSA algorithm is defined to call the more general merge-vars algorithm, as
derived in Transformation D.6 in Appendix D and repeated here as Figure 8.4. Thus, fromSSA
is defined as
fromSSA.(S′,X,XL1i,XLf,Y,XLs) ≜ merge-vars.(S′,XLs,X,XL1i,XLf,Y )
where S′ is a statement in SSA form with respect to variables X, variables X1 are live-on-entry
with corresponding initial instances XL1i, XLf are final instances of X, variables Y are non-SSA
program variables, and XLs is the complete set of instances of X.
In the following, we show that the toSSA algorithm is invertible, with fromSSA its inverse.
8.6.3 SSA is de-SSA-able
When a statement in SSA form can be converted back, merging all instances of each transformed
variable into its original name, we say it is de-SSA-able. In the following theorem we prove that
toSSA yields a de-SSA-able statement. This result is not surprising. It was expected and is stated
here for ‘sanity checking’. However, the theorem will actually become useful a little later, when
proving de-SSA-ability of SSA-based slices. There, we shall combine the result of this theorem
with an observation that slicing preserves de-SSA-ability.
Theorem 8.4. Let S be any core statement and let X, Y := (def.S),(glob.S ∖ def.S), X1 :=
(X ∩ ((X ∖ ddef.S) ∪ input.S)) and (XL1i,XLf ) := fresh.((X1,X),(X,Y )); let S′ be the SSA
version of S, defined as S′ := toSSA.(S,X,XL1i,XLf,Y,(XL1i,XLf )); then S′ is de-SSA-able.
That is, all preconditions, P1-P8, of the fromSSA algorithm hold for
S′′ := fromSSA.(S′,X,XL1i,XLf,Y,XLs) where XLs := ((XL1i,XLf ) ∪ (def.S′ ∖ Y )).
The proof can be found in Appendix D.
8.7 Summary
A program representation of non-contiguous statements has been defined along with an operation
for merging such statements. Slides — following an original program execution metaphor that has
been introduced — encompass control dependences. In the context of our simply structured
language, this is done in a syntax-directed manner. The complementary notion of data dependences
has been captured by a relation of slide dependence. Together, those will take part in computing
slices, in the next chapter.
For any given statement S, the function slides.S takes a set of variables, say V, and yields a
statement (slides.S.V ) which includes the union of individual-assignment slides of all assignments
to variables in V. That way, we have avoided the need to formalise labels of internal program
points (for distinguishing one assignment of a variable from another).
Instead, the finer-grained level of individual-assignment slides has been made accessible through
the development of transformations to and from the popular static single assignment (SSA) form.
Slides of a particular instance of a variable, on the SSA form, correspond to the individual assign-
ment of that instance, on the original program. The SSA form and its related slides will help in
the next chapter to turn a naive flow-insensitive slicing algorithm into a flow-sensitive one.
Chapter 9
A Slicing Algorithm
This chapter develops a provably correct slicing algorithm. The algorithm is based on the
observation that a slide-independent collection of slides yields a semantically correct slice.
The algorithm’s development will consist of two stages. A first attempt will produce crude
(i.e. too large) slices. Then, by adopting the refined program representation of SSA-based slides,
the same algorithm will be shown to produce refined (i.e. smaller, more accurate and desirable)
slices.
9.1 Flow-insensitive slicing
The observation that independent slides yield correct slices is proved in the following.
Theorem 9.1. Let S, V be a core statement and set of variables, respectively. Then provided V
is slide independent in S (i.e. glob.(slides.S.V ) ∩ def.S ⊆ V ), slides.S.V is a slice-refinement of S
with respect to V.
Proof. The proof is by induction on the structure of S. We assume that for any slip T of S (for
which slides.T.V is independent in T, as is guaranteed by Lemma 9.2), we have [wp.T.Q ⇒
wp.(slides.T.V ).Q] for all Q with glob.Q ∩ def.S ⊆ V. We then prove that provided V is slide
independent in S, we have [wp.S.P ⇒ wp.(slides.S.V ).P] for all such P with glob.P ∩ def.S ⊆ V.
First, if V ∩ def.S = ∅ we observe for all P with glob.P ∩ def.S ⊆ V (i.e. glob.P ∩ def.S = ∅):
wp.(slides.S.V ).P
= {slides when V ∩ def.S = ∅}
wp.skip.P
= {wp of skip}
P
⇐ {pred. calc.}
P ∧ wp.S.true
= {RE3: glob.P ∩ def.S = ∅}
wp.S.P .
In the remaining cases we shall assume V ∩ def.S ≠ ∅.
S = X := E: We observe for all P with glob.P ∩ def.S ⊆ V
wp.(X := E).P
= {wp of ‘:=’}
P[X\E]
= {remove redundant subs.: let X1 := X ∩ V and the proviso ensures
glob.P ∩ X ⊆ X1}
P[X1\E1]
= {wp of ‘:=’}
wp.(X1 := E1).P
= {slides of ‘:=’}
wp.(slides.(X := E).V ).P .
S = S1;S2: We observe for all P with glob.P ∩ def.S ⊆ V
wp.(S1;S2).P
= {wp of ‘;’}
wp.S1.(wp.S2.P)
⇒ {ind. hypo.: glob.P ∩ def.S ⊆ V and V is slide ind. in S2}
wp.S1.(wp.(slides.S2.V ).P)
⇒ {ind. hypo.: glob.(wp.(slides.S2.V ).P) ∩ def.S ⊆ V due to RE2
since both glob.P ∩ def.S ⊆ V and
input.(slides.S2.V ) ∩ def.S ⊆ V (slide ind. of V in S2);
V is slide ind. in S1}
wp.(slides.S1.V ).(wp.(slides.S2.V ).P)
= {wp of ‘;’}
wp.((slides.S1.V );(slides.S2.V )).P
= {slides of ‘;’}
wp.(slides.(S1;S2).V ).P .
S = if B then S1 else S2: We observe for all P with glob.P ∩ def.S ⊆ V
wp.(if B then S1 else S2).P
= {wp of IF}
(B ⇒ wp.S1.P) ∧ (¬B ⇒ wp.S2.P)
⇒ {ind. hypo., twice: glob.P ∩ def.S ⊆ V and V is slide ind. in both S1 and S2}
(B ⇒ wp.(slides.S1.V ).P) ∧ (¬B ⇒ wp.(slides.S2.V ).P)
= {wp of IF}
wp.(if B then slides.S1.V else slides.S2.V ).P
= {slides of IF}
wp.(slides.(if B then S1 else S2).V ).P .
S = while B do S1 od: We observe for all P with glob.P ∩ def.S ⊆ V
wp.(while B do S1 od).P
= {wp of DO: [k.Q ≡ (B ∨ P) ∧ (¬B ∨ wp.S1.Q)]}
(∃ i: 0 ≤ i: kⁱ.false)
⇒ {see below; [l.Q ≡ (B ∨ P) ∧ (¬B ∨ wp.(slides.S1.V ).Q)]}
(∃ i: 0 ≤ i: lⁱ.false)
= {wp of DO with l as above}
wp.(while B do (slides.S1.V ) od).P
= {slides of DO}
wp.(slides.(while B do S1 od).V ).P .
We finish by proving, for the second step above, by induction, that [kⁱ.false ⇒ lⁱ.false] holds for
all i, provided [wp.S1.P ⇒ wp.(slides.S1.V ).P] for all P with glob.P ∩ def.S ⊆ V (induction
hypothesis above).
The base case (i = 0) is trivial ([false ⇒ false], recall the definition of function iteration).
Then, for the induction step, we assume [kⁱ.false ⇒ lⁱ.false] and prove [kⁱ⁺¹.false ⇒ lⁱ⁺¹.false].
kⁱ⁺¹.false
= {def. of func. it.}
k.(kⁱ.false)
= {def. of k}
(B ∨ P) ∧ (¬B ∨ wp.S1.(kⁱ.false))
⇒ {ind. hypo.}
(B ∨ P) ∧ (¬B ∨ wp.S1.(lⁱ.false))
⇒ {slide ind., proviso and glob.(lⁱ.false) ∩ def.S ⊆ V since
((glob.B ∪ glob.P ∪ input.S1) ∩ def.S) ⊆ V}
(B ∨ P) ∧ (¬B ∨ wp.(slides.S1.V ).(lⁱ.false))
= {def. of l}
l.(lⁱ.false)
= {def. of func. it.}
lⁱ⁺¹.false .
Lemma 9.2. Let S be any core statement (i.e. no local variable scopes); let V be a set of slide-
independent variables (in S); let T be any slip of S; then V is also slide independent in T. That
is,
glob.(slides.T.V ) ∩ def.T ⊆ V .
Proof.
glob.(slides.T.V ) ∩ def.T
⊆ {Lemma 9.3}
glob.(slides.S.V ) ∩ def.T
⊆ {Lemma 9.4}
glob.(slides.S.V ) ∩ def.S
⊆ {proviso (V is slide ind. in S)}
V .
The proofs of the remaining lemmata are given in Appendix C.
Lemma 9.3. Let S be any core statement; let T be any slip of S and let V be any set of variables;
then
glob.(slides.T.V ) ⊆ glob.(slides.S.V ) .
Lemma 9.4. Let S be a core statement; let T be any slip of S; then
def.T ⊆ def.S .
Given a core statement S and variables of interest V, compute the flow-insensitive slice,
fi-slice.S.V, as follows:
fi-slice.S.V ≜ slides.S.V∗
where V∗ := slides-dep-rtc.S.V, with slides-dep-rtc.S.V defined recursively as follows:
slides-dep-rtc.S.V ≜ if U ⊆ V then V else slides-dep-rtc.S.(V ∪ U)
where U := glob.(slides.S.V ) ∩ def.S .
Figure 9.1: A flow-insensitive slicing algorithm.
9.1.1 The algorithm
Our flow-insensitive slicing algorithm is given in Figure 9.1. Given a core statement S and a set
of variables V, the algorithm first computes the smallest possible slide-independent superset V∗
of V (i.e. the reflexive transitive closure of the slide dependence of S on V, as in Section 8.5.1);
then, the union of slides of S on V∗, i.e. the statement slides.S.V∗, is produced as the slice of S
on V.
The algorithm is correct in the sense that, for any S and V, we get fi-slice.S.V as a valid slice
of S with respect to V. That is, requirements Q1 and Q2 of slicing both hold.
For the former (S[live V ] ⊑ (fi-slice.S.V )[live V ]), it suffices to show that S ⊑V∗ fi-slice.S.V
holds. This is so due to Theorem 9.1 with V := V∗ (which gives S ⊑V∗ fi-slice.S.V ) and since
V ⊆ V∗ (by construction of slides-dep-rtc.S.V ). To see why this results in S ⊑V fi-slice.S.V,
recall the definition of slice-refinement, from which we indeed get for all predicates P with glob.P
⊆ V ⊆ V∗ the required [wp.S.P ⇒ wp.(fi-slice.S.V ).P].
The latter (def.(fi-slice.S.V ) ⊆ def.S) holds since all defined variables in any set of slides of S
are also defined in S itself, as the following lemma (which is proved in Appendix C) shows.
Lemma 9.5. Let S, V be any core statement and set of variables, respectively; then
def.(slides.S.V ) ⊆ def.S .
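Under the assumed tuple AST of Chapter 8's sketches, fi-slice is a one-liner on top of slides and slides_dep_rtc; the tangled sum/prod program of the next subsection (with each expression reduced to the set of variables it reads, which suffices for these analyses) illustrates how the dependence closure drags i in.

def fi_slice(stmt, V):
    """Figure 9.1: the union of slides of the dependence closure of V."""
    return slides(stmt, slides_dep_rtc(stmt, V))

# The sum/prod program of Section 9.1.2 (a is the array, read-only):
prog = ('seq',
        ('assign', (('i', frozenset()), ('sum', frozenset()))),
        ('seq',
         ('while', frozenset({'i', 'a'}),
          ('assign', (('i', frozenset({'i'})),
                      ('sum', frozenset({'sum', 'a', 'i'}))))),
         ('seq',
          ('assign', (('i', frozenset()), ('prod', frozenset()))),
          ('while', frozenset({'i', 'a'}),
           ('assign', (('i', frozenset({'i'})),
                       ('prod', frozenset({'prod', 'a', 'i'}))))))))

assert slides_dep_rtc(prog, frozenset({'sum'})) == {'i', 'sum'}
# fi_slice(prog, {'sum'}) therefore keeps every assignment to i,
# including the second loop: the crudeness discussed in Section 9.1.2.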
9.1.2 Example
The crudeness of this flow-insensitive slicing is demonstrated with the following example. Slicing
for sum on the first program below would yield the second.
i,sum := 0,0
; while i<a.length do
i,sum := i+1,sum+a[i]
od
; i,prod := 0,1
; while i<a.length do
i,prod := i+1,prod*a[i]
od

i,sum := 0,0
; while i<a.length do
i,sum := i+1,sum+a[i]
od
; i := 0
; while i<a.length do
i := i+1
od
The second loop is unnecessarily included since the slide of sum depends on the slide of i, which
in turn includes all assignments to i (along with their enclosing control structures). Nevertheless,
this result is still a correct slice, with respect to sum. But we can do better, as is shown next.
9.2 Make it flow-sensitive using SSA-based slides
Now that we have a provably correct slicing algorithm, we refine it by making it sensitive to
control flow. This will lead to (potentially) smaller, more accurate slices. (As a bonus, the refined
algorithm will be applicable for the more traditional backward slicing, in which slicing criteria
may refer to internal program points. However, this kind of slicing will not be pursued.)
The key to turning the flow-insensitive slicing algorithm into a flow-sensitive one lies in the
splitting of variables. The previous algorithm, in an attempt to stay on the safe side, added the slides of
all assignments to a variable as soon as any of those slides was needed — completely ignoring
the program point(s) in which the value of that variable was used and the order of execution of
substatements.
By splitting a variable into several instances, such that each definition point introduces a new
variable, as in our SSA-like form (see Section 8.6 of the preceding chapter), the existing algorithm
can gain flow-sensitivity.
9.2.1 Formal derivation of flow-sensitive slicing
Take a core statement S and any set of variables of interest V. Transform S to SSA form and take
the flow-insensitive slice of the final instances of V. Finally, all instances can be merged (back to
the original name).
S[live V ]
= {Q1 of Transformation D.5 with
S′ := toSSA.(S,(V, coV ),(VL1i, coVL1i),(VLf, coVLf ),
ND,(VL1i, coVL1i, VLf, coVLf )), def.S = (V, coV ),
ND := glob.S ∖ (V, coV ), V1, coV1 := (V ∩ input.S),(coV ∩ input.S) and
(VL1i, coVL1i, VLf, coVLf ) := fresh.((V1, coV1, V, coV ), glob.S)}
(VL1i, coVL1i := V1, coV1 ; S′ ; V, coV := VLf, coVLf )[live V ]
= {remove dead assignment (Law 23): coV ∩ V = ∅}
(VL1i, coVL1i := V1, coV1 ; S′ ; V := VLf )[live V ]
= {prop. liveness info. (Law 20)}
((VL1i, coVL1i := V1, coV1 ; S′)[live VLf ] ; (V := VLf ))[live V ]
= {prop. liveness info. (Law 20)}
((VL1i, coVL1i := V1, coV1 ; S′[live VLf ])[live VLf ] ; (V := VLf ))[live V ]
⊑ {SV′ := fi-slice.S′.VLf (Q1 of fi-slice)}
((VL1i, coVL1i := V1, coV1 ; SV′[live VLf ])[live VLf ] ; (V := VLf ))[live V ]
= {prop. liveness info. (Law 20)}
((VL1i, coVL1i := V1, coV1 ; SV′)[live VLf ] ; (V := VLf ))[live V ]
= {Q1 of Transformation D.6, Theorem 9.6: DLs := glob.S′ ∖ ND,
SV := fromSSA.(SV′,(V, coV ),(VL1i, coVL1i),(VLf, coVLf ), ND, DLs )}
((SV ; VLf := V )[live VLf ] ; (V := VLf ))[live V ]
= {prop. liveness info. (Law 20)}
(SV ; (VLf := V ) ; (V := VLf ))[live V ]
= {assignment-based sub. (Law 18): VLf ∩ V = ∅}
(SV ; (VLf := V ) ; (V := V ))[live V ]
= {remove redundant self-assignment (Law 2)}
(SV ; (VLf := V ))[live V ]
= {remove dead assignment (Law 24): VLf ∩ V = ∅}
SV [live V ] .
The success of fromSSA (in the derivation above) depends on the validity of preconditions
P1-P8 of Transformation D.6. This is indeed guaranteed as is shown in the following.
9.2.2 An SSA-based slice is de-SSA-able
Let S′ be the SSA version of a core statement S. Then, the slicing algorithm, once operated on
the set of slides slides.S′, is flow-sensitive. For such slices to be correct syntax-preserving slices,
we need to show they are de-SSA-able.
Theorem 9.6. Any slide-independent statement from the SSA version of any core statement is
de-SSA-able.
That is, let S be any core statement and let X, Y := (def.S),(glob.S ∖ def.S), X1 := (X ∩ ((X ∖
ddef.S) ∪ input.S)) and (XL1i,XLf ) := fresh.((X1,X),(X,Y )); let S′ be the SSA version of S,
defined as S′ := toSSA.(S,X,XL1i,XLf,Y,(XL1i,XLf )); let XLs := ((XL1i,XLf ) ∪ (def.S′ ∖
Y )) be the full set of instances (of X, in S′) and let XLI be any (slide-independent) subset of
those instances, with final instances XL2f := XLI ∩ XLf ; finally let SI′ := slides.S′.XLI be the
corresponding (slide-independent) statement; then SI′ is de-SSA-able. That is, all preconditions,
P1-P8, of the fromSSA algorithm hold for SI := fromSSA.(SI′,X,XL1i,XL2f,Y,XLs).
The proof can be found in Appendix D.
9.2.3 The refined algorithm
Following the derivation above, our SSA-based flow-sensitive slicing algorithm is given in
Figure 9.2. A given program S is translated into its corresponding SSA version S′; the flow-insensitive
slice SV′ of S′ is taken with respect to final instances VLf of V ; finally, SV′ is translated back
from SSA by merging all instances.
The algorithm is correct in the sense that, for any core (and hence deterministic) statement
S and set of variables V, we get slice.S.V as a valid slice of S with respect to V. That is,
requirements Q1 and Q2 of slicing both hold.
The former (S[live V ] ⊑ (slice.S.V )[live V ]) follows from the derivation above. Note that,
indirectly, this property is a consequence of the corresponding Q1 of fi-slice.
For the latter (def.(slice.S.V ) ⊆ def.S), we need to investigate the effects of toSSA, fi-slice
and fromSSA on defined variables. Let SV := slice.S.V and let Vd := V ∩ def.S such that
def.S = (Vd, coV ). Thus we need to show def.SV ⊆ (Vd, coV ).
Firstly, since def.S = (Vd, coV ), we observe def.S′ involves instances of (Vd, coV ) exclusively.
Secondly, Q2 of fi-slice ensures def.SV′ ⊆ def.S′. Finally, since all instances of (Vd, coV ) in SV′
are successfully merged (see Q2 of fromSSA and Theorem 9.6), we end up with def.SV ⊆ (Vd, coV )
as required.
Given a core statement S and variables of interest V, compute the flow-sensitive slice, slice.S.V,
as follows:
slice.S.V ≜ fromSSA.(SV′,(V, coV ),(VL1i, coVL1i),(VLf, coVLf ), ND, DLs )
where coV := def.S ∖ V,
SV′ := fi-slice.S′.VLf,
S′ := toSSA.(S,(V, coV ),(VL1i, coVL1i),(VLf, coVLf ), ND,(VL1i, coVL1i, VLf, coVLf )),
V1, coV1 := (V ∩ input.S),(coV ∩ input.S),
DLs := glob.S′ ∖ ND,
(VL1i, coVL1i, VLf, coVLf ) := fresh.((V1, coV1, V, coV ), glob.S)
and ND := glob.S ∖ (V, coV ) .
Figure 9.2: An SSA-based flow-sensitive slicing algorithm.
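The composition of Figure 9.2 can be sketched as a thin wrapper over the earlier fi_slice; here to_ssa and from_ssa are assumed helpers with invented signatures (hypothetical stand-ins for Transformations D.5 and D.6), so this shows the pipeline only, not the SSA transformations themselves.

def slice_(stmt, V, to_ssa, from_ssa):
    """Flow-sensitive slicing: into SSA, fi-slice on final instances, back.
    to_ssa : stmt, V -> (ssa_stmt, final_instances_of_V, bookkeeping)
    from_ssa : ssa_stmt, bookkeeping -> stmt   (both assumed, not shown)"""
    ssa_stmt, v_final, meta = to_ssa(stmt, V)
    sliced = fi_slice(ssa_stmt, v_final)   # the naive slicer, now accurate
    return from_ssa(sliced, meta)          # merge instances back (Theorem 9.6)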
9.2.4 Example
In contrast to the flow-insensitive slicer (as demonstrated in Section 9.1.2), this refined algorithm
would yield an accurate slice for sum, as is shown here (the original program followed by its slice):
i,sum := 0,0
; while i<a.length do
i,sum := i+1,sum+a[i]
od
; i,prod := 0,1
; while i<a.length do
i,prod := i+1,prod*a[i]
od

i,sum := 0,0
; while i<a.length do
i,sum := i+1,sum+a[i]
od
The first step of the algorithm is to turn the program into SSA form:
|[var i1,i2,i3,i4,i5,i6,sum1,sum2,sum3,prod4,prod5,prod6
; i1,sum1:= 0,0
; i2,sum2:= i1,sum1
; while i2<a.length do
i3,sum3:= i2+1,sum2+a[i2]
; i2,sum2:= i3,sum3
od
; i4,prod4:= 0,1
; i5,prod5:= i4,prod4
; while i5<a.length do
i6,prod6:= i5+1,prod5*a[i5]
; i5,prod5:= i6,prod6
od
; i,sum,prod := i5,sum2,prod5
]|
At this point, the flow-insensitive algorithm slices the middle part above for the final instance
sum2 of sum. The slide of sum2 depends on the slides of sum1, i2 and sum3; i2 in turn depends on
{i1,i2,i3} whereas sum3 depends on {i2,sum2}. We thus get {i1,i2,i3,sum1,sum2,sum3} as the
reflexive transitive closure of slide dependence on sum2, hence the smallest slide-independent set
enclosing {sum2}, yielding the following program:
; i1,sum1:= 0,0
; i2,sum2:= i1,sum1
; while i2<a.length do
i3,sum3:= i2+1,sum2+a[i2]
; i2,sum2:= i3,sum3
od
Returning from SSA would then, as desired, yield
i,sum := 0,0
; while i<a.length do
i,sum := i+1,sum+a[i]
od
Similarly, requesting the slice of prod from this SSA-based slicing algorithm would identify
the second loop alone, whereas the earlier naive algorithm would unnecessarily add the first loop
(for i).
Indeed, the example above was specifically chosen to highlight the differences between our
naive flow-insensitive slicer and the SSA-based one. Nevertheless, it should be noted that had
the two loops been tangled into one, the slicer (in fact both slicers) would still identify the desired,
accurate slice.
9.3 Slice extraction revisited
9.3.1 The transformation
With the SSA-based flow-sensitive slicing algorithm, we are for the first time in a position to
constructively express a sliding transformation. We base the transformation on Refinement 7.4 and
produce the slice of the variables for extraction as the extracted code and the slice of the remaining
variables as the complement.
Transformation 9.7. Let S be any core statement and let V be any (user selected) set of
variables to be extracted; then
S ⊑
|[ var iV, icoV, fV ; iV, icoV := V′, coV
; SV
; fV := V′
;
V′, coV := iV, icoV
; ScoV
; V′ := fV ]|
where V′ := V ∩ def.S,
coV := def.S ∖ V′,
(iV, icoV, fV ) := fresh.((V′, coV, V′),(V ∪ glob.S)),
SV := slice.S.V′
and ScoV := slice.S.coV .
Proof.
S
⊑ {Refinement 7.4: (V′, coV ) = def.S by def. of V′, coV ;
S[live V′] ⊑ SV [live V′] by Q1 of slice;
def.SV ⊆ def.S by Q2 of slice; similarly
S[live coV ] ⊑ ScoV [live coV ] by Q1 of slice;
def.ScoV ⊆ def.S by Q2 of slice}
(iV, icoV := V′, coV ; SV ; fV := V′
; V′, coV := iV, icoV ; ScoV ; V′ := fV )[live V′, coV ]
= {def. of live : (def.SV ∪ def.ScoV ) ⊆ (V′, coV )
and (iV, icoV, fV ) ∩ (V′, coV ) = ∅}
|[ var iV, icoV, fV
; iV, icoV := V′, coV ; SV ; fV := V′
; V′, coV := iV, icoV ; ScoV ; V′ := fV ]| .
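As a sketch under stated assumptions (fresh names are simulated with i_/f_ prefixes rather than the thesis's fresh function, and any slicer satisfying Q1 and Q2 may be plugged in; the runnable fi_slice is used as a default), Transformation 9.7 assembles the refined program as follows.

def seq(*stmts):
    """Right-nested sequential composition of tuple-AST statements."""
    out = stmts[-1]
    for s in reversed(stmts[:-1]):
        out = ('seq', s, out)
    return out

def copy_to(dsts, srcs):
    """dsts := srcs as one multiple assignment."""
    return ('assign', tuple((d, frozenset({s})) for d, s in zip(dsts, srcs)))

def sliding(stmt, V, slicer=fi_slice):
    """Transformation 9.7 (sketch): slice for V, then the complement."""
    V0 = sorted(frozenset(V) & defs(stmt))
    coV = sorted(defs(stmt) - frozenset(V))
    iV, icoV = ['i_' + x for x in V0], ['i_' + x for x in coV]
    fV = ['f_' + x for x in V0]                  # assumed-fresh backups
    SV, ScoV = slicer(stmt, frozenset(V0)), slicer(stmt, frozenset(coV))
    return seq(copy_to(iV + icoV, V0 + coV),     # iV,icoV := V,coV
               SV,                               # extracted slice of V
               copy_to(fV, V0),                  # fV := V
               copy_to(V0 + coV, iV + icoV),     # V,coV := iV,icoV
               ScoV,                             # the complement
               copy_to(V0, fV))                  # V := fV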
9.3.2 Evaluation and discussion
With the above transformation, for example, the computation of sum in the scenario of Section 7.1,
can be correctly untangled from that of prod, as desired.
Our current approach has been inspired by the tucking transformation [40]. In comparing
Tuck to sliding, we first observe that global variables are inherently unsupported by Tuck, and
whenever a live-on-exit variable is defined in both the extracted slice and its complement, the
transformation has to be rejected. For example, when untangling sum and prod as was just
recalled from Section 7.1, the loop variable iis defined in both loops of the resulting program;
had the final value of ibeen used after the loop (e.g. in computing the average), Tuck would have
been rejected.
Our semantic framework, in contrast, is expressive enough to avoid such limitations. The
importance of this improvement over tucking is highlighted by the observation that in the presence
of global variables, and in order to avoid the need for full program analysis (i.e. beyond the context
of extraction), one has to assume all those variables are, indeed, live-on-exit.
Another notable difference between Tuck and sliding is in the construction of the complement.
Their complement is the slice from all non-extracted statements, whereas we slice from the end
of scope, on all non-extracted variables. This approach has been inspired by Gallagher’s view
of a program as a union of slices [22]. There, a program maintenance process, based on that
view, is formalised along with the dependences between various slices. Consequently, conditions
are derived for detecting non-interference of changes on a set of slices. Thus, some changes, e.g.
when debugging, can be performed on a subprogram — leaving the merge of those changes into the
full program to an accompanying tool — with a confidence that eliminates the need for, e.g.,
regression testing.
In comparing Tuck’s and sliding’s approaches, we note that, on the one hand, their complement
would include slices from dead statements, if present. This in turn might lead to unnecessary
duplication and possible rejection. Since the slice of all defined variables on a given statement, as
in Gallagher’s view, will never include such dead code, its presence would have no effect on our
approach.
On the other hand, Tuck’s complement has the potential of being more accurate than that of
the current version of sliding. This might be the case whenever any possibly-final-definition of
a non-extracted variable y (i.e. a definition that may reach the end of scope) is included in the
extracted code. Tuck’s complement will include such a definition only if it is indirectly relevant
for other non-extracted statements, whereas ours would definitely include it.
In order to understand the implications of this problem, we further investigate it, distinguishing
two cases. Firstly, if all possibly-final-definitions of such a variable, y, are extracted, the full
slice on y is guaranteed to be included in the extracted code. In such cases, we solve the problem
by including y in the set of extracted variables (see Chapter 12, where an optimal sliding
transformation is sought).
Secondly, in cases where at least one such definition of y was extracted, and at least one
other definition was not, we must distinguish two sub-cases. If y is live-on-exit, they will reject
the transformation. Otherwise, their complement may indeed be smaller, since we assume all
variables are live-on-exit. Accordingly, our sliding transformation may benefit from deriving a
simple corollary for the case in which the live-on-exit variables are explicitly given; there,
the complement should be composed of the slice of all non-extracted and live-on-exit variables.
We conclude that our results so far enjoy Tuck’s untangling ability with improved applicability
and comparable levels of duplication. Further reduction in such duplication is still possible, as is
explained next.
A valid criticism (of both Tuck and our current version of sliding) is that the levels of code
duplication are potentially too high. This is highlighted by Komondoor in his PhD thesis [37].
For example, if the extracted variable’s final value is used in the complement, the entire slice would
be duplicated.
As was explained in Chapter 2, Komondoor’s alternative solutions (KH00 and KH03 with
Horwitz [38, 39], as well as a variation on KH03 in his PhD thesis [37]) were not designed for
untangling by slice extraction and are hence not applicable in our immediate context. Nevertheless,
ideas from his approach have inspired and contributed to our decomposition of slides and the
corresponding improvements to sliding, as will be explored shortly.
9.4 Summary
This chapter has developed a slicing algorithm, based on the observation that slide-independent
sets of slides (from the previous chapter) yield correct slices. The SSA-based slicing algorithm has
allowed a constructive formulation of a first sliding transformation, based on the refinement rule
of semantic slice extraction (from two chapters back).
This version of sliding has been shown to enjoy Tuck’s untangling abilities, with improved
applicability, while suffering, as Tuck does, from potential over-duplication. Reductions in such duplication
will be explored and formalised in the next chapter.
Chapter 10
Co-Slicing: Advanced Duplication
Reduction in Sliding
In the preceding chapters, a basic slice-extraction refactoring has been introduced. Its automation
through a sliding transformation has been discussed, along with correctness issues. The
problematic duplication of the whole program in scope, as in our initial formulation, has subsequently
been reduced by slicing both the extracted code and its complement.
However, the levels of duplication introduced by sliding, so far, are still too high. This is so
in cases where both the slice and its complement share some computation, but instead of reusing
this computation’s extracted result in the complement, the computation’s code ends up being
duplicated.
In this chapter, an advanced strategy for reducing the levels of duplication is proposed,
formalised and applied to sliding.
10.1 Over-duplication: an example
As an example of over-duplication, consider the following sliding of sum.
i,sum,prod := 0,0,1
; while i<a.length do
i,sum,prod :=
i+1,sum+a[i],prod*a[i]
od
; out << sum
; out << prod
In applying Transformation 9.7 with V:= {sum}, we note that the whole extracted code ends
up being duplicated in the complement, unnecessarily.
|[var ii,isum,iprod,iout,fsum
; ii,isum,iprod,iout:=i,sum,prod,out
; i,sum := 0,0
; while i<a.length do
i,sum :=
i+1,sum+a[i]
od
; fsum:=sum
;
i,sum,prod,out:=ii,isum,iprod,iout
; i,sum,prod := 0,0,1
; while i<a.length do
i,sum,prod :=
i+1,sum+a[i],prod*a[i]
od
; out << sum
; out << prod
; sum:=fsum
]|
10.2 Final-use substitution
We propose to further reduce duplication in sliding through what we call final-use substitution. A
final use is a reference to a variable’s value (e.g. sum in the example above), at a program point
where it is guaranteed to hold its final value (e.g. where sum is appended to out ).
If the slice of the variable under discussion has been extracted through sliding, that final
value might be available in the complement (e.g. in backup variable fsum), saving us the need to
recompute it.
CHAPTER 10. CO-SLICING 107
10.2.1 Example
To demonstrate the workings of final-use substitution, we return to the example (from above) of
computing and printing the sum and product of an array of integers. As was already mentioned,
trying to extract sum using our currently proposed transformation would lead to duplication of the
whole extracted slice, unnecessarily.
One way of avoiding that duplication is to use fsum in the complement.
|[var ii,isum,iprod,iout,fsum
; ii,isum,iprod,iout:=i,sum,prod,out
; i,sum,prod := 0,0,1
; while i<a.length do
i,sum,prod :=
i+1,sum+a[i],prod*a[i]
od
; out << sum
; out << prod
; fsum:=sum
;
i,sum,prod,out:=ii,isum,iprod,iout
; i,sum,prod := 0,0,1
; while i<a.length do
i,sum,prod :=
i+1,sum+a[i],prod*a[i]
od
; out << fsum
; out << prod
; sum:=fsum
]|
Note that not all uses of sum were replaced with fsum; the use inside the loop is not of a final
value, and must not be replaced.
As before (in Transformation 9.7), the above version of statement duplication, with final-use
substitution, has the potential of introducing dead code, which can subsequently be removed. At
this point, slicing (for {sum} in the extracted code and {i,prod,out} in the complement) would
successfully remove the repeated computation of sum, leading to:
|[var ii,isum,iprod,iout,fsum
; ii,isum,iprod,iout:=i,sum,prod,out
; i,sum := 0,0
; while i<a.length do
i,sum :=
i+1,sum+a[i]
od
; fsum:=sum
;
i,sum,prod,out:=ii,isum,iprod,iout
; i, prod := 0, 1
; while i<a.length do
i, prod :=
i+1, prod*a[i]
od
; out << fsum
; out << prod
; sum:=fsum
]|
10.2.2 Deriving the transformation
Final-use substitution can be formalised in the following way. Starting with S ; {V = fV }
where fV ∩ glob.S = ∅, we transform S into S′ := S[final-use V\fV ], demanding S ; {V = fV } =
S′ ; {V = fV }. Statement S′ will be using variables in fV instead of V at points to which
the corresponding assertion can be propagated.
The full derivation of S[final-use V\fV ] is given in Appendix E; the resulting transformation
is given in Figure 10.1.
With final-use substitution constructively defined, we now turn to derive an advanced solution
to slice extraction via sliding.
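Before doing so, it is worth seeing the substitution at work: the following sketches the algorithm of Figure 10.1 (floated below) on the tuple AST of the earlier chapters, with the substitution given as a map m from X to its assumed-fresh X′; the representation, as before, is our own.

def rename(reads, m):
    """Apply the final-use renaming to a set of read variables."""
    return frozenset(m.get(x, x) for x in reads)

def final_use(stmt, m):
    """S[final-use X\\X'] of Figure 10.1, for the map m : X -> X'."""
    kind = stmt[0]
    if kind == 'skip':
        return stmt
    if kind == 'assign':      # targets (X1) are redefined here: not final uses
        targets = {x for x, _ in stmt[1]}
        m2 = {x: y for x, y in m.items() if x not in targets}
        return ('assign', tuple((x, rename(r, m2)) for x, r in stmt[1]))
    if kind == 'seq':         # in S1, only X2 := X \ def.S2 holds final values
        m1 = {x: y for x, y in m.items() if x not in defs(stmt[2])}
        return ('seq', final_use(stmt[1], m1), final_use(stmt[2], m))
    if kind == 'if':          # the guard may only use X2 := X \ def.(S1,S2)
        m2 = {x: y for x, y in m.items()
              if x not in (defs(stmt[2]) | defs(stmt[3]))}
        return ('if', rename(stmt[1], m2),
                final_use(stmt[2], m), final_use(stmt[3], m))
    if kind == 'while':       # inside the loop, only X2 := X \ def.S1 is final
        m2 = {x: y for x, y in m.items() if x not in defs(stmt[2])}
        return ('while', rename(stmt[1], m2), final_use(stmt[2], m2))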
10.3 Advanced sliding
10.3.1 Statement duplication with final-use substitution
In the following, we show that any core statement S is equivalent to its duplicated version, in
which the computation of variables (Vr, Vnr ) is separated from that of the complementary set
coV (such that def.S = (Vr, Vnr, coV )), and variables Vr are offered for reuse in the complement,
through the backup variables of their final values, fVr .
Program equivalence 10.1. Let S, Vr, Vnr, coV, iVr, iVnr, icoV, fVr, fVnr be a core statement
and eight sets of variables, respectively; then
Let S be a core statement and X a set of variables, to be substituted by a corresponding set X′ of
fresh variables; the final-use substitution of S on X with X′ is defined, by cases of S, as follows:
(X1,Y := E1,E2)[final-use X1,X2\X1′,X2′] ≜
X1,Y := E1[X2\X2′],E2[X2\X2′] where
X = (X1,X2), X ∩ Y = ∅ and X′ = (X1′,X2′) ;
(S1;S2)[final-use X1,X2\X1′,X2′] ≜
S1[final-use X2\X2′] ; S2[final-use X1,X2\X1′,X2′] where
X1 := X ∩ def.S2, X2 := X ∖ X1 and
X1′,X2′ are the corresponding subsets of X′ ;
(if B then S1 else S2)[final-use X1,X2\X1′,X2′] ≜
if B[X2\X2′] then S1[final-use X1,X2\X1′,X2′]
else S2[final-use X1,X2\X1′,X2′] where
X1 := X ∩ def.(S1,S2), X2 := X ∖ X1 and
X1′,X2′ are the corresponding subsets of X′ ; and
(while B do S1 od)[final-use X1,X2\X1′,X2′] ≜
while B[X2\X2′] do S1[final-use X2\X2′] od where
X1 := X ∩ def.S1, X2 := X ∖ X1 and
X2′ is the subset of X′ corresponding to X2 .
Figure 10.1: An algorithm of final-use substitution.
S =
(iVr, iVnr, icoV := Vr, Vnr, coV
; S
; fVr, fVnr := Vr, Vnr
;
Vr, Vnr, coV := iVr, iVnr, icoV
; S[final-use Vr\fVr ]
; Vr, Vnr := fVr, fVnr )[live Vr, Vnr, coV ]
provided def.S = (Vr, Vnr, coV )
and (iVr, iVnr, icoV, fVr, fVnr ) ∩ glob.S = ∅ .
Proof.
S
= {prepare for statement duplication (Lemma 6.2) with
V, iV, fV := (Vr, Vnr ),(iVr, iVnr ),(fVr, fVnr ):
provisos def.S = (Vr, Vnr, coV ) and (iVr, iVnr, icoV, fVr, fVnr ) ∩ glob.S = ∅}
(iVr, iVnr, icoV := Vr, Vnr, coV ; S ; fVr, fVnr := Vr, Vnr
; Vr, Vnr := fVr, fVnr )[live Vr, Vnr, coV ]
= {intro. following assertion (Law 7) with X, Y, E1, E2 := fVnr, fVr, Vnr, Vr:
(fVr, fVnr ) ∩ Vr = ∅ due to both provisos and RE5}
(iVr, iVnr, icoV := Vr, Vnr, coV ; S ; fVr, fVnr := Vr, Vnr
; {Vr = fVr } ; Vr, Vnr := fVr, fVnr )[live Vr, Vnr, coV ]
= {statement duplication (Lemma 6.3) with
V, iV, fV := (Vr, Vnr ),(iVr, iVnr ),(fVr, fVnr ): provisos}
(iVr, iVnr, icoV := Vr, Vnr, coV ; S ; fVr, fVnr := Vr, Vnr
; Vr, Vnr, coV := iVr, iVnr, icoV ; S ; {Vr = fVr }
; Vr, Vnr := fVr, fVnr )[live Vr, Vnr, coV ]
= {final-use sub.: correct by construction (see Figure 10.1 and Appendix E) and
variables fVr are indeed fresh (proviso fVr ∩ glob.S = ∅)}
(iVr, iVnr, icoV := Vr, Vnr, coV ; S ; fVr, fVnr := Vr, Vnr
; Vr, Vnr, coV := iVr, iVnr, icoV ; S[final-use Vr\fVr ] ; {Vr = fVr }
; Vr, Vnr := fVr, fVnr )[live Vr, Vnr, coV ]
= {remove aux. assertion (Lemma 10.2, see below) with
V, iV, fV := (Vr, Vnr ),(iVr, iVnr ),(fVr, fVnr )}
(iVr, iVnr, icoV := Vr, Vnr, coV ; S ; fVr, fVnr := Vr, Vnr
; Vr, Vnr, coV := iVr, iVnr, icoV ; S[final-use Vr\fVr ] ; Vr, Vnr := fVr, fVnr )
[live Vr, Vnr, coV ] .
Lemma 10.2. Let S be any core statement with def.S = (V,coV), Vr ⊆ V (and fVr the corresponding
subset of fV) and (iV,icoV,fV) ∩ glob.S = ∅; we then have
iV,icoV := V,coV
; S
; fV := V
;
V,coV := iV,icoV
; S[final-use Vr \ fVr]
; {Vr = fVr}
=
iV,icoV := V,coV
; S
; fV := V
;
V,coV := iV,icoV
; S[final-use Vr \ fVr]
The proof of that lemma is given in Appendix E.
10.3.2 Slicing after final-use substitution
The statement duplication with final-use substitution, as in the program equivalence above, can
now be followed up by slicing, as was done earlier in Chapter 7.
Refinement 10.3. Let S,SV,ScoV,Vr,Vnr,coV,iVr,iVnr,icoV,fVr,fVnr be three core statements
and eight sets of variables, respectively; then
S ⊑
(iVr,iVnr,icoV := Vr,Vnr,coV
; SV
; fVr,fVnr := Vr,Vnr
;
Vr,Vnr,coV := iVr,iVnr,icoV
; ScoV
; Vr,Vnr := fVr,fVnr)[live Vr,Vnr,coV]
provided def.S = (Vr,Vnr,coV),
S[live Vr,Vnr] ⊑ SV[live Vr,Vnr],
S[final-use Vr \ fVr][live coV] ⊑ ScoV[live coV],
def.SV ⊆ def.S,
def.ScoV ⊆ def.S and
(iVr,iVnr,icoV,fVr,fVnr) ∩ glob.S = ∅.
Proof.
S
= {duplicate statement (Program equivalence 10.1):
   being a core statement, S is indeed deterministic,
   def.S = (Vr,Vnr,coV) and (iVr,iVnr,icoV,fVr,fVnr) ∩ glob.S = ∅ (proviso)}
(iVr,iVnr,icoV := Vr,Vnr,coV ; S ; fVr,fVnr := Vr,Vnr
; Vr,Vnr,coV := iVr,iVnr,icoV ; S[final-use Vr \ fVr]
; Vr,Vnr := fVr,fVnr)[live Vr,Vnr,coV]
⊑ {Refinement 7.5 with
   S′,V,iV,fV := S[final-use Vr \ fVr],(Vr,Vnr),(iVr,iVnr),(fVr,fVnr):
   def.S[final-use Vr \ fVr] = def.S since only uses are replaced by final-use sub.}
(iVr,iVnr,icoV := Vr,Vnr,coV ; SV ; fVr,fVnr := Vr,Vnr
; Vr,Vnr,coV := iVr,iVnr,icoV ; ScoV
; Vr,Vnr := fVr,fVnr)[live Vr,Vnr,coV].
10.3.3 Definition of co-slicing
Observing the requirements S[final-use Vr \ fVr][live coV] ⊑ ScoV[live coV] and def.ScoV ⊆ def.S
above, we formalise co-slices as follows:
Definition 10.4 (Complement-Slice (or Co-Slice)). Let S be a core (and hence deterministic)
statement and V be a set of variables for extraction; let Vr be a subset of V to be made reusable
through fresh variables fVr. Any statement ScoV for which the two requirements
(Q1:) S[final-use Vr \ fVr][live coV] ⊑ ScoV[live coV] and
(Q2:) def.ScoV ⊆ def.S
both hold is a correct co-slice of S with respect to V, Vr and fVr.
A co-slicing algorithm (see Figure 10.2) is consequently derived from the above definition
and the corresponding properties Q1 and Q2 of slicing. From those properties, the algorithm’s
correctness follows.
10.3.4 The sliding transformation
The refinement rule from above, along with the formal definition of co-slices and the corresponding
constructive co-slicing algorithm, are now combined to yield an advanced sliding transformation.
Transformation 10.5. Let S be any core statement and let Vr,Vnr be any two disjoint (user-selected)
sets of variables to be extracted, with Vr to be made available for reuse in the complement;
then
Let S,V,Vr,fVr be a core statement and three sets of variables, respectively. The function
co-slice for statement S with respect to V,Vr,fVr is defined as follows:
co-slice.S.V.Vr.fVr ≜ slice.S[final-use Vr \ fVr].coV
    where coV := def.S \ V
    provided Vr ⊆ V
    and fVr ∩ (V ∪ glob.S) = ∅.
Figure 10.2: A co-slicing algorithm, based on slicing and final-use substitution.
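Continuing the Python sketch, Figure 10.2 transcribes directly; slice_of and glob are
assumed helpers (a slicing algorithm and a free-variables function), fVr is a dictionary
mapping each member of Vr to its fresh backup name, and final_use is the sketch given
after Figure 10.1.

    def co_slice(S, V, Vr, fVr):
        # provisos of Figure 10.2: Vr is a subset of V, fVr is fresh
        assert set(Vr) <= set(V)
        assert not (set(fVr.values()) & (set(V) | glob(S)))
        coV = defs(S) - set(V)                 # the complementary results
        return slice_of(final_use(S, set(Vr), fVr), coV)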
S ⊑
|[var iVr,iVnr,icoV,fVr,fVnr
; iVr,iVnr,icoV := Vr′,Vnr′,coV
; SV
; fVr,fVnr := Vr′,Vnr′
;
Vr′,Vnr′,coV := iVr,iVnr,icoV
; ScoV
; Vr′,Vnr′ := fVr,fVnr
]|
where Vr′,Vnr′ := (Vr ∩ def.S),(Vnr ∩ def.S),
coV := def.S \ (Vr′,Vnr′),
(iVr,iVnr,icoV,fVr,fVnr) := fresh.((Vr′,Vnr′,coV,Vr′,Vnr′),((Vr,Vnr) ∪ glob.S)),
SV := slice.S.(Vr′,Vnr′)
and ScoV := co-slice.S.(Vr′,Vnr′).Vr′.fVr.
Proof.
S
⊑ {Refinement 10.3: (Vr′,Vnr′,coV) = def.S by def. of Vr′,Vnr′,coV;
   S[live Vr′,Vnr′] ⊑ SV[live Vr′,Vnr′] by Q1 of slice;
   def.SV ⊆ def.S by Q2 of slice; similarly
   S[final-use Vr′ \ fVr][live coV] ⊑ ScoV[live coV] by Q1 of co-slice;
   def.ScoV ⊆ def.S by Q2 of co-slice}
(iVr,iVnr,icoV := Vr′,Vnr′,coV ; SV ; fVr,fVnr := Vr′,Vnr′
; Vr′,Vnr′,coV := iVr,iVnr,icoV ; ScoV
; Vr′,Vnr′ := fVr,fVnr)[live Vr′,Vnr′,coV]
= {def. of live: (def.SV ∪ def.ScoV) ⊆ (V′,coV) (again, Q2 of slice and co-slice)
   and (iVr,iVnr,icoV,fVr,fVnr) ∩ (Vr′,Vnr′,coV) = ∅}
|[var iVr,iVnr,icoV,fVr,fVnr;
iVr,iVnr,icoV := Vr′,Vnr′,coV ; SV ; fVr,fVnr := Vr′,Vnr′;
Vr′,Vnr′,coV := iVr,iVnr,icoV ; ScoV ; Vr′,Vnr′ := fVr,fVnr]|.
10.4 Summary
This chapter has introduced an advanced sliding transformation in which the complement reuses
a selection of extracted results, thus yielding a potentially smaller complement, or, as we call it,
a co-slice. Co-slicing has been formalised through a so-called final-use substitution. Constructive
definitions of that substitution, and hence of a co-slicing algorithm, have been developed.
In comparison to our earlier sliding transformation, the advanced version potentially duplicates
less code. However, the price takes the form of extra compensatory code, due to final-use
substitution renaming some used variables in the complement. This renaming was introduced in
order to avoid name clashes; in general we are opposed to it, in an attempt to keep the resulting
program as close to the original as possible. However, since our co-slicing algorithm involves the
removal of dead code by slicing, after final-use substitution, some renaming can potentially be
undone.
The elimination of redundant compensatory code, after sliding, will be pursued in the next
chapter. In particular, undoing the renaming of final-use substitution will be formalised, thus
yielding the concepts of compensation-free co-slices and compensation-free sliding.
Chapter 11
Penless Sliding
When sliding is expected to maintain all variable names (i.e. no renaming), it is not the case that
any final-use substitution yields a valid co-slice. The notion of a compensation-free (or penless)
co-slice is introduced in this chapter. Moreover, a general improvement of sliding by eliminating
redundant backup variables is explored, ultimately leading to the formulation of (the conditions
for) a completely penless sliding transformation. The elimination of backup variables is based on
a liveness-analysis-related approach to variable merging, along the lines of our merge-vars algorithm
(Appendix D), which has been applied for the return from SSA (in Section 8.6.2).
11.1 Eliminating redundant backup variables
We begin by detecting and eliminating redundant backup variables. When sliding variables
(Vr,Vnr) away from coV on statement S (with def.S ⊆ (Vr,Vnr,coV) and with Vr available
for reuse in the complement), we naively introduce backup variables (iVr,iVnr,icoV) for initial
values, and fVr,fVnr for the final values of extracted variables.
However, some of those backup variables might in fact be redundant and should hence be
removed.
11.1.1 Motivation
Why should those be removed? Following practical considerations, we note that such backup
would require an unnecessarily large amount of storage space, and the two operations of making
the backup and retrieving it would have an unwanted impact on execution time.
Furthermore, suppose we waive our language assumption that any variable is cloneable, and
in turn strengthen sliding's preconditions to ensure no uncloneable variable is actually cloned. In
that context, the removal of redundant backup variables will be crucial for the applicability of
sliding.
Finally, as was hinted above, when renaming of variables in the complement must be avoided,
the removal of redundant backup variables will allow such renaming to be undone.
11.1.2 Example
Recall the co-slicing example from the preceding chapter (Section 10.2.1). There, asking to
extract and reuse the variable sum (i.e. when Vr,Vnr,coV := {sum},∅,{i,prod,out} in applying
Transformation 10.5) from the program of Section 10.1 gave us the following result:
|[var ii,isum,iprod,iout,fsum
; ii,isum,iprod,iout:=i,sum,prod,out
; i,sum := 0,0
; while i<a.length do
i,sum :=
i+1,sum+a[i]
od
; fsum:=sum
;
i,sum,prod,out:=ii,isum,iprod,iout
; i, prod := 0, 1
; while i<a.length do
i, prod :=
i+1, prod*a[i]
od
; out << fsum
; out << prod
; sum:=fsum
]|
Here, the sets of backup variables of initial values iVr,iVnr,icoV are {isum},∅,{ii,iprod,iout},
respectively, and the backup of final values fVr,fVnr is {fsum},∅, respectively. Which of the
backup variables {ii,isum,iprod,iout,fsum} are redundant?
11.1.3 Dead-assignments-elimination and variable-merging
We remove redundant backup variables by combining dead-assignments-elimination with the merging
of such backup variables with their corresponding original variables. Recall our merge-vars
algorithm (as mentioned for returning from SSA; see Section 8.6.2 and Appendix D). According to
that approach, merging members of iVr,iVnr,icoV with corresponding members of Vr,Vnr,coV
is possible if they are never simultaneously live, never defined in the same assignment, and one is
never defined in an assignment where the other is live-on-exit.
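As a sketch only, the merge-vars criterion just quoted can be checked per pair of variables
over a list of assignment records; the record layout (dicts with 'defs' and 'live_out' keys)
is an assumption of this illustration, and liveness is approximated here at assignment exits.

    def can_merge(v, b, assignments):
        # v: original variable, b: its backup; assignments carry the sets of
        # variables they define and the variables live on exit from them
        for a in assignments:
            if v in a['defs'] and b in a['defs']:
                return False              # defined in the same assignment
            if (v in a['defs'] and b in a['live_out']) or \
               (b in a['defs'] and v in a['live_out']):
                return False              # one defined where the other is live
            if v in a['live_out'] and b in a['live_out']:
                return False              # simultaneously live
        return True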
Definition in the same assignment (e.g. of sum and its backup isum or fsum) is not possible
after sliding, since the backup variables are defined only in designated statements. Furthermore,
since we precede this step by dead-assignments-elimination, cases of def-on-live may only occur
in conjunction with simultaneous liveness; the defined variable must be live too, or its definition
would have been removed. (Note that at this stage, dead-assignments-elimination removes
unused backup variables of initial values only, e.g. {ii,isum,iprod} above, but not iout, since the
retrieval from backup of all final values, e.g. the sum := fsum above, renders such backup, e.g.
fsum, live at its point of initialisation, e.g. the fsum := sum above.)
So it is simultaneous liveness that we should worry about. Backup variables for initial values
(e.g. iout; all the others have already been removed) are live from entry to the extracted slice all
the way to the exit from the initialisation of the backup of final values. There, the defined extracted
variables V′ (e.g. {sum}) are used, and their corresponding live initial backup variables (none of
those in our example, since isum is gone) must remain, as they are simultaneously live on exit
from the extracted slice. On the other hand, members of coV (e.g. {i,prod,out}) are live in
the extracted slice SV only if in glob.SV. Thus, the backup variables for non-extracted initial
variables that do not occur free in the extracted slice can be merged with the corresponding
original variables. Hence, in our example, out and iout can be merged.
Initially, after sliding, the backup variables of all final values (e.g. fsum) are live-on-exit from
the complement, and hence also when initialised, but the corresponding program variables (i.e.
members of V′, e.g. sum) are not, since they are defined there. If those are neither used nor
defined in the complement, the need for backup disappears. That is, backup variables of final values
of V′ \ glob.ScoV should be merged with the corresponding original variables. In our example,
hence, sum can be merged with fsum.
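The two merging rules just derived can be summarised as a sketch, again with glob as the
assumed free-variables helper: initial backups are redundant for non-extracted variables
absent from the extracted slice, and final backups for extracted variables absent from the
complement.

    def redundant_backups(SV, ScoV, coV, V_ext):
        # coV: non-extracted variables; V_ext: extracted variables V'
        mergeable_initial = {x for x in coV if x not in glob(SV)}
        mergeable_final = {x for x in V_ext if x not in glob(ScoV)}
        return mergeable_initial, mergeable_final

In the running example, redundant_backups would report out (so iout merges with out)
and sum (so fsum merges with sum).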
The following is the resulting sliding refinement rule, after eliminating redundant backup variables.
Refinement 11.1. Let S,SV,ScoV,Vr,Vnr,coV,iVr,iVnr,icoV,fVr,fVnr be three core statements
and eight sets of variables, respectively; then
S ⊑
(iVr1,iVnr1,icoV11 :=
    Vr1,Vnr1,coV11
; SV
; fVr1,fVr2,fVnr1,fVnr2 :=
    Vr1,Vr2,Vnr1,Vnr2
;
Vr1,Vnr1,coV11 :=
    iVr1,iVnr1,icoV11
; ScoV[fVr3 \ Vr3]
; Vr1,Vr2,Vnr1,Vnr2 :=
    fVr1,fVr2,fVnr1,fVnr2)[live Vr,Vnr,coV]
provided def.S = (Vr,Vnr,coV),
S[live Vr,Vnr] ⊑ SV[live Vr,Vnr],
S[final-use Vr \ fVr][live coV] ⊑ ScoV[live coV],
def.SV ⊆ def.S,
def.ScoV ⊆ def.S,
(iVr,iVnr,icoV,fVr,fVnr) ∩ glob.S = ∅,
(Vr1,Vr2,Vr3) =
((Vr ∩ input.ScoV),(Vr ∩ (def.ScoV \ input.ScoV)),(Vr \ glob.ScoV)),
with (iVr1,iVr2,iVr3) the corresponding subsets of iVr
and with (fVr1,fVr2,fVr3) the corresponding subsets of fVr,
(Vnr1,Vnr2,Vnr3) =
((Vnr ∩ input.ScoV),(Vnr ∩ (def.ScoV \ input.ScoV)),(Vnr \ glob.ScoV))
with (iVnr1,iVnr2,iVnr3) the corresponding subsets of iVnr
and with (fVnr1,fVnr2,fVnr3) the corresponding subsets of fVnr and
(coV11,coV12,coV2) =
((coV ∩ def.SV ∩ input.ScoV),(coV ∩ (input.ScoV \ def.SV)),(coV \ input.ScoV))
with (icoV11,icoV12,icoV2) the corresponding subsets of icoV.
Proof.
S
⊑ {Refinement 10.3}
(iVr,iVnr,icoV := Vr,Vnr,coV ; SV ; fVr,fVnr := Vr,Vnr
; Vr,Vnr,coV := iVr,iVnr,icoV ; ScoV ; Vr,Vnr := fVr,fVnr)
[live Vr,Vnr,coV]
= {liveness analysis: Vr1,Vnr1,coV1 =
   (Vr ∩ input.ScoV),(Vnr ∩ input.ScoV),(coV ∩ input.ScoV)}
(((iVr,iVnr,icoV := Vr,Vnr,coV ; SV ; fVr,fVnr := Vr,Vnr
; Vr,Vnr,coV := iVr,iVnr,icoV)[live fVr,fVnr,Vr1,Vnr1,coV1]
; ScoV)[live fVr,fVnr,coV] ; Vr,Vnr := fVr,fVnr)[live Vr,Vnr,coV]
= {remove dead assignments, see below (big step 1)}
((iVr1,iVnr1,icoV1 := Vr1,Vnr1,coV1 ; SV ; fVr,fVnr := Vr,Vnr
; Vr1,Vnr1,coV1 := iVr1,iVnr1,icoV1)[live fVr,fVnr,Vr1,Vnr1,coV1]
; ScoV ; Vr,Vnr := fVr,fVnr)[live Vr,Vnr,coV]
= {(coV1,icoV1) = ((coV11,coV12),(icoV11,icoV12))}
((iVr1,iVnr1,icoV11,icoV12 := Vr1,Vnr1,coV11,coV12
; SV ; fVr,fVnr := Vr,Vnr
; Vr1,Vnr1,coV11,coV12 := iVr1,iVnr1,icoV11,icoV12)
[live fVr,fVnr,Vr1,Vnr1,coV1]
; ScoV ; Vr,Vnr := fVr,fVnr)[live Vr,Vnr,coV]
= {eliminate redundant backup of initial values, see below (big step 2)}
((iVr1,iVnr1,icoV11 := Vr1,Vnr1,coV11 ; SV ; fVr,fVnr := Vr,Vnr
; Vr1,Vnr1,coV11 := iVr1,iVnr1,icoV11)[live fVr,fVnr,Vr1,Vnr1,coV1]
; ScoV ; Vr,Vnr := fVr,fVnr)[live Vr,Vnr,coV]
= {remove liveness info.; (Vr,Vnr,fVr,fVnr) =
   ((Vr1,Vr2,Vr3),(Vnr1,Vnr2,Vnr3),(fVr1,fVr2,fVr3),(fVnr1,fVnr2,fVnr3))}
(iVr1,iVnr1,icoV11 := Vr1,Vnr1,coV11 ; SV
; fVr1,fVr2,fVr3,fVnr1,fVnr2,fVnr3 := Vr1,Vr2,Vr3,Vnr1,Vnr2,Vnr3
; Vr1,Vnr1,coV11 := iVr1,iVnr1,icoV11 ; ScoV
; Vr1,Vr2,Vr3,Vnr1,Vnr2,Vnr3 := fVr1,fVr2,fVr3,fVnr1,fVnr2,fVnr3)
[live Vr,Vnr,coV]
= {eliminate redundant backup of final values, see below (big step 3)}
(iVr1,iVnr1,icoV11 := Vr1,Vnr1,coV11 ; SV
; fVr1,fVr2,fVnr1,fVnr2 := Vr1,Vr2,Vnr1,Vnr2
; Vr1,Vnr1,coV11 := iVr1,iVnr1,icoV11 ; ScoV[fVr3 \ Vr3]
; Vr1,Vr2,Vnr1,Vnr2 := fVr1,fVr2,fVnr1,fVnr2)[live Vr,Vnr,coV].
For big step 1 above, in which dead assignments are removed, we observe
(iVr,iVnr,icoV := Vr,Vnr,coV ; SV ; fVr,fVnr := Vr,Vnr
; Vr,Vnr,coV := iVr,iVnr,icoV)[live fVr,fVnr,Vr1,Vnr1,coV1]
= {remove dead assignments}
(iVr,iVnr,icoV := Vr,Vnr,coV ; SV ; fVr,fVnr := Vr,Vnr
; Vr1,Vnr1,coV1 := iVr1,iVnr1,icoV1)[live fVr,fVnr,Vr1,Vnr1,coV1]
= {liveness analysis}
((iVr,iVnr,icoV := Vr,Vnr,coV)[live iVr1,iVnr1,icoV1]
; SV ; fVr,fVnr := Vr,Vnr
; Vr1,Vnr1,coV1 := iVr1,iVnr1,icoV1)[live fVr,fVnr,Vr1,Vnr1,coV1]
= {remove dead assignments}
((iVr1,iVnr1,icoV1 := Vr1,Vnr1,coV1)[live iVr1,iVnr1,icoV1]
; SV ; fVr,fVnr := Vr,Vnr
; Vr1,Vnr1,coV1 := iVr1,iVnr1,icoV1)[live fVr,fVnr,Vr1,Vnr1,coV1]
= {remove liveness info.}
(iVr1,iVnr1,icoV1 := Vr1,Vnr1,coV1 ; SV ; fVr,fVnr := Vr,Vnr
; Vr1,Vnr1,coV1 := iVr1,iVnr1,icoV1)[live fVr,fVnr,Vr1,Vnr1,coV1].
For big step 2 above, in which we eliminate redundant backup of initial values, we observe
(iVr1,iVnr1,icoV1 := Vr1,Vnr1,coV1 ; SV ; fVr,fVnr := Vr,Vnr
; Vr1,Vnr1,coV11,coV12 := iVr1,iVnr1,icoV11,icoV12)
[live fVr,fVnr,Vr1,Vnr1,coV1]
= {split assignment: (iVr1,iVnr1,icoV11) ∩ coV12 = ∅}
(iVr1,iVnr1,icoV11 := Vr1,Vnr1,coV11 ; icoV12 := coV12
; SV ; fVr,fVnr := Vr,Vnr
; Vr1,Vnr1,coV11,coV12 := iVr1,iVnr1,icoV11,icoV12)
[live fVr,fVnr,Vr1,Vnr1,coV1]
= {swap statements: icoV12 ∩ (glob.SV ∪ (fVr,fVnr,Vr,Vnr)) = ∅ and
   (def.SV,fVr,fVnr) ∩ (icoV12,coV12) = ∅}
(iVr1,iVnr1,icoV11 := Vr1,Vnr1,coV11 ; SV ; fVr,fVnr := Vr,Vnr
; icoV12 := coV12 ; Vr1,Vnr1,coV11,coV12 := iVr1,iVnr1,icoV11,icoV12)
[live fVr,fVnr,Vr1,Vnr1,coV1]
= {assignment-based sub.: coV12 ∩ icoV12 = ∅}
(iVr1,iVnr1,icoV11 := Vr1,Vnr1,coV11 ; SV ; fVr,fVnr := Vr,Vnr
; icoV12 := coV12 ; Vr1,Vnr1,coV11,coV12 := iVr1,iVnr1,icoV11,coV12)
[live fVr,fVnr,Vr1,Vnr1,coV1]
= {remove redundant self-assignment;
   remove dead assignment: icoV12 ∩ (fVr,fVnr,iVr1,iVnr1,icoV11) = ∅}
(iVr1,iVnr1,icoV11 := Vr1,Vnr1,coV11 ; SV ; fVr,fVnr := Vr,Vnr
; Vr1,Vnr1,coV11 := iVr1,iVnr1,icoV11)[live fVr,fVnr,Vr1,Vnr1,coV1].
Finally, for big step 3 above, in which we eliminate redundant backup of final values, we observe
(writing ISV for the prefix iVr1,iVnr1,icoV11 := Vr1,Vnr1,coV11 ; SV)
(ISV ; fVr1,fVr2,fVr3,fVnr1,fVnr2,fVnr3 := Vr1,Vr2,Vr3,Vnr1,Vnr2,Vnr3
; Vr1,Vnr1,coV11 := iVr1,iVnr1,icoV11 ; ScoV
; Vr1,Vr2,Vr3,Vnr1,Vnr2,Vnr3 := fVr1,fVr2,fVr3,fVnr1,fVnr2,fVnr3)
[live Vr,Vnr,coV]
= {split assignment: (fVr1,fVr2,fVnr1,fVnr2) ∩ (Vr3,Vnr3) = ∅}
(ISV ; fVr1,fVr2,fVnr1,fVnr2 := Vr1,Vr2,Vnr1,Vnr2
; fVr3,fVnr3 := Vr3,Vnr3 ; Vr1,Vnr1,coV11 := iVr1,iVnr1,icoV11 ; ScoV
; Vr1,Vr2,Vr3,Vnr1,Vnr2,Vnr3 := fVr1,fVr2,fVr3,fVnr1,fVnr2,fVnr3)
[live Vr,Vnr,coV]
= {swap statements:
   (fVr3,fVnr3) ∩ (Vr1,Vnr1,coV11,iVr1,iVnr1,icoV11) = ∅ and
   (Vr1,Vnr1,coV11) ∩ (fVr3,fVnr3,Vr3,Vnr3) = ∅}
(ISV ; fVr1,fVr2,fVnr1,fVnr2 := Vr1,Vr2,Vnr1,Vnr2
; Vr1,Vnr1,coV11 := iVr1,iVnr1,icoV11 ; fVr3,fVnr3 := Vr3,Vnr3 ; ScoV
; Vr1,Vr2,Vr3,Vnr1,Vnr2,Vnr3 := fVr1,fVr2,fVr3,fVnr1,fVnr2,fVnr3)
[live Vr,Vnr,coV]
= {assignment-based sub.: Vr3 ∩ (fVr3 ∪ glob.ScoV) = ∅ and
   fVr3 ∩ def.ScoV = ∅}
(ISV ; fVr1,fVr2,fVnr1,fVnr2 := Vr1,Vr2,Vnr1,Vnr2
; Vr1,Vnr1,coV11 := iVr1,iVnr1,icoV11 ; fVr3,fVnr3 := Vr3,Vnr3
; ScoV[fVr3 \ Vr3]
; Vr1,Vr2,Vr3,Vnr1,Vnr2,Vnr3 := fVr1,fVr2,fVr3,fVnr1,fVnr2,fVnr3)
[live Vr,Vnr,coV]
= {swap statements:
   (fVr3,fVnr3) ∩ glob.ScoV[fVr3 \ Vr3] = ∅ and
   def.ScoV[fVr3 \ Vr3] ∩ (fVr3,fVnr3,Vr3,Vnr3) = ∅}
(ISV ; fVr1,fVr2,fVnr1,fVnr2 := Vr1,Vr2,Vnr1,Vnr2
; Vr1,Vnr1,coV11 := iVr1,iVnr1,icoV11 ; ScoV[fVr3 \ Vr3]
; fVr3,fVnr3 := Vr3,Vnr3
; Vr1,Vr2,Vr3,Vnr1,Vnr2,Vnr3 := fVr1,fVr2,fVr3,fVnr1,fVnr2,fVnr3)
[live Vr,Vnr,coV]
= {assignment-based sub.: (fVr3,fVnr3) ∩ (Vr3,Vnr3) = ∅}
(ISV ; fVr1,fVr2,fVnr1,fVnr2 := Vr1,Vr2,Vnr1,Vnr2
; Vr1,Vnr1,coV11 := iVr1,iVnr1,icoV11 ; ScoV[fVr3 \ Vr3]
; fVr3,fVnr3 := Vr3,Vnr3
; Vr1,Vr2,Vr3,Vnr1,Vnr2,Vnr3 := fVr1,fVr2,Vr3,fVnr1,fVnr2,Vnr3)
[live Vr,Vnr,coV]
= {remove redundant self-assignment;
   remove dead assignment: (fVr3,fVnr3) ∩ (fVr1,fVr2,fVnr1,fVnr2,coV) = ∅}
(ISV ; fVr1,fVr2,fVnr1,fVnr2 := Vr1,Vr2,Vnr1,Vnr2
; Vr1,Vnr1,coV11 := iVr1,iVnr1,icoV11 ; ScoV[fVr3 \ Vr3]
; Vr1,Vr2,Vnr1,Vnr2 := fVr1,fVr2,fVnr1,fVnr2)[live Vr,Vnr,coV].
For our example above, we end up with
; i,sum := 0,0
; while i<a.length do
i,sum :=
i+1,sum+a[i]
od
;
i, prod := 0, 1
; while i<a.length do
i, prod :=
i+1, prod*a[i]
od
; out << sum
; out << prod
Note how in the complement, this time, the original variable sum is used where fsum was used
before. This was made possible by the successful merging of sum with its two backup variables
isum and fsum.
11.2 Compensation-free (or penless) co-slicing
Since we consider the renaming of variables (by final-use substitution, when co-slicing) as part
of sliding's compensation, we accordingly consider a co-slice with no renaming, or with all initial
renaming eventually undone, as in the above, a compensation-free co-slice. Since in our metaphor
of slides and sliding, compensatory code is written using a non-permanent pen on top of printed
transparencies, the merging can be thought of as the erasure of such earlier writing.
Hence, compensation-free co-slices will also be termed penless co-slices. Accordingly, the process
of producing such co-slices will be termed penless co-slicing.
We define a penless co-slice to be a co-slice that involves no renaming, and thus no compensation,
in the following way:
penless-co-slice.S.V.Vr ≜ (co-slice.S.V.Vr.fVr)[fVr \ Vr] where
fVr := fresh.(Vr,(V ∪ glob.S)).
Note that since normal substitution is defined only when the new names are fresh, a penless
co-slice is well-defined only when all reused variables are gone (from the co-slice). That is,
penless-co-slice.S.V.Vr is well-defined when Vr ∩ glob.(co-slice.S.V.Vr.fVr) = ∅.
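A sketch of this definition in Python, assuming (as before) the hypothetical helpers
co_slice and glob, plus fresh (returning a dictionary of fresh backup names) and rename
(variable-for-variable substitution on a statement):

    def penless_co_slice(S, V, Vr):
        fVr = fresh(Vr, set(V) | glob(S))      # x -> fresh backup name
        c = co_slice(S, V, Vr, fVr)
        # well-defined only when all reused variables are gone from the co-slice
        assert not (set(Vr) & glob(c)), "penless co-slice undefined here"
        return rename(c, {fVr[x]: x for x in Vr})   # undo the final-use renaming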
Now that the elimination of redundant backup variables and the construction of penless co-slices
have been formalised, we are in a position to derive preconditions for compensation-free sliding,
or penless sliding, as follows.
11.3 Sliding with penless co-slices
The following is a sliding transformation with penless co-slicing and with the elimination of
redundant backup variables:
Transformation 11.2. Let S be any core statement and let Vr,Vnr be any two disjoint (user-selected)
sets of variables to be extracted, with Vr to be made available for reuse in the complement;
then
S ⊑
|[var iVnr1,icoV11,fVnr1,fVnr2
; iVnr1,icoV11 := Vnr1,coV11
; SV
; fVnr1,fVnr2 := Vnr1,Vnr2
;
Vnr1,coV11 := iVnr1,icoV11
; ScoV
; Vnr1,Vnr2 := fVnr1,fVnr2
]|
where Vr′,Vnr′ := (Vr ∩ def.S),(Vnr ∩ def.S),
coV := def.S \ (Vr′,Vnr′),
SV := slice.S.(Vr′,Vnr′),
ScoV := penless-co-slice.S.(Vr′,Vnr′).Vr′,
Vnr1,Vnr2 := (Vnr′ ∩ input.ScoV),(Vnr′ ∩ (def.ScoV \ input.ScoV)),
coV11 := (coV ∩ def.SV ∩ input.ScoV)
and (iVnr1,icoV11,fVnr1,fVnr2) :=
fresh.((Vnr1,coV11,Vnr1,Vnr2),((Vr,Vnr) ∪ glob.S))
provided Vr′ ∩ glob.(co-slice.S.(Vr′,Vnr′).Vr′.fVr) = ∅ for any fresh fVr.
Proof.
S
⊑ {Refinement 11.1 with Vr,Vnr,ScoV := Vr′,Vnr′,co-slice.S.(Vr′,Vnr′).Vr′.fVr
   on fresh fVr: (Vr′,Vnr′,coV) = def.S by def. of Vr′,Vnr′,coV;
   S[live Vr′,Vnr′] ⊑ SV[live Vr′,Vnr′] by Q1 of slice;
   S[final-use Vr′ \ fVr][live coV] ⊑ (co-slice.S.(Vr′,Vnr′).Vr′.fVr)[live coV]
   by Q1 of co-slice;
   def.SV ⊆ def.S by Q2 of slice; similarly
   def.ScoV ⊆ def.S by Q2 of co-slice;
   (iVnr1,icoV11,fVnr1,fVnr2) ∩ glob.S = ∅ by Q1 of fresh;
   note that Vr1,Vr2,Vr3 := ∅,∅,Vr′ due to the proviso;
   consequently iVr1,iVr2,fVr1 and fVr2 are all empty; also note (for our ScoV)
   penless-co-slice.S.(Vr′,Vnr′).Vr′ = (co-slice.S.(Vr′,Vnr′).Vr′.fVr)[fVr \ Vr′]
   by def. of penless-co-slice (which is indeed well-defined due to the proviso)}
(iVnr1,icoV11 := Vnr1,coV11 ; SV ; fVnr1,fVnr2 := Vnr1,Vnr2
; Vnr1,coV11 := iVnr1,icoV11 ; ScoV ; Vnr1,Vnr2 := fVnr1,fVnr2)
[live Vr′,Vnr′,coV]
= {def. of live: (def.SV ∪ def.ScoV) ⊆ (Vr′,Vnr′,coV) (again, Q2 of slice and
   co-slice) and (iVnr1,icoV11,fVnr1,fVnr2) ∩ (Vr′,Vnr′,coV) = ∅}
|[var iVnr1,icoV11,fVnr1,fVnr2;
iVnr1,icoV11 := Vnr1,coV11 ; SV ; fVnr1,fVnr2 := Vnr1,Vnr2
; Vnr1,coV11 := iVnr1,icoV11 ; ScoV ; Vnr1,Vnr2 := fVnr1,fVnr2]|.
11.4 Summary
In this chapter, an approach for reducing compensation after sliding has been developed. Redundant
backup variables of initial and final values have been eliminated. This elimination has been
conducted by the removal of dead assignments and by merging such backup variables with their
original counterparts. When eliminating the backup of reusable final values of extracted variables,
the reused variables had to be absent from the complement. This has been formalised through a
concept of compensation-free co-slices, or penless co-slices.
A penless co-slice is constructed by first introducing reusable variables through final-use substitution,
then slicing for the remaining variables, and finally undoing the substitution. It is
interesting to see that such a co-slice is potentially smaller than the corresponding slice (of non-extracted
variables).
A sliding transformation whose complement is a penless co-slice has been developed. We call
it penless sliding. Moreover, if all backup variables are successfully eliminated when sliding, as
was the case in the given example, we say the result is completely penless.
In this light, the KH approach to arbitrary method extraction (both KH00 [38] and KH03 [39],
apart from some specific treatment of jumps in the latter) can be described as completely penless.
Indeed, the approach taken in the last two chapters, leading to the formulation of penless co-slices
and penless sliding, has been inspired by their algorithms as well as their criticism of Tuck's lack
of data flow from slice to complement (as stated in [39, 37]).
Looking back at the sliding transformations of the current and previous chapters, we note that
the user was asked to provide not only the statement in scope S and the variables for extraction V,
as was the case in the earlier sliding transformation (of Chapter 9), but also the subset Vr of
extracted variables to be reused in the complement. However, our original formulation of slice
extraction (in Definition 1.1) had no mention of such a Vr. When the goal is to extract precisely the
slice of S on V, whilst producing the smallest possible complement, one could ask "which subset
Vr would yield the smallest possible complement?" This question, as well as a related question
on sliding itself, will be treated in the next chapter.
Chapter 12
Optimal Sliding
In previous chapters, all co-slicing related transformations assumed the subset of (extracted) variables
to be made reusable, Vr, is given. In this chapter, however, that assumption is waived.
This immediately raises a variety of optimisation problems. When extracting the computation
of V in S, using a certain co-slice related sliding transformation, which partition of def.S into
((Vr,Vnr),coV) (with (Vr,Vnr) extracted, and Vr offered for reuse) would yield an optimal
result? Surely we need to be more specific in describing what is meant by 'optimal', and which
transformation is being applied.
In this chapter, we focus on sliding with penless co-slices (i.e. Transformation 11.2 from the
preceding chapter). For any given program statement S and set of variables to be extracted V,
an optimal solution will identify a set of variables V′ (possibly larger than V itself, as will be
explained shortly), made of subsets (Vr,Vnr), for which the extracted slice SV′ will be precisely
slice.S.V; its complement, the penless co-slice ScoV, must end up being the smallest possible, in
terms of the number of individual assignments in it.
It should be noted that we do not mean to consider any substatement in our search: finding
such minimal co-slices is in general impossible, just as finding minimal slices is, since it is equivalent
to solving the halting problem [64]. Instead, the goal is to find the smallest out of all possible
results of our specific (penless) co-slicing algorithm. In our quest, the program statement to be
co-sliced shall be given and fixed, whereas the set of variables on which to co-slice, as well as its
subset of variables to be made available for reuse, shall vary.
We begin our search for an optimal solution by devising an algorithm to find the smallest
possible penless co-slice for given S and V. We then complete the solution by observing that the
given V itself is not necessarily the best option for the set of extracted variables V′. As it turns
out, some larger sets may yield precisely the same extracted slice, and with enhanced opportunities
for reuse, thus possibly yielding an even smaller complement.
12.1 The minimal penless co-slice
A statement S and set of variables V have at most 2^N different penless co-slices (with N = |V|).
This is so since any subset Vr of V can be offered for reuse, thus possibly leading to a different
co-slice. (It should however be remembered that not all subsets necessarily yield well-defined
penless co-slices.)
Let the size of a program statement be determined by the number of individual assignments in
it. With this definition, is there (for given S and V, with different reusable subsets Vr) a single
smallest result to our algorithm of penless co-slicing? If so, which subset Vr yields it? And how
can this Vr be found?
Our conjecture is that indeed there is a single smallest penless co-slice (for any given S and
V). How do we find it? Surely, one could try all subsets of V, composing all possible co-slices
and measuring the size of the penless ones. But this algorithm would be very expensive: in terms
of time complexity, it would grow exponentially with |V|. Is there a faster solution?
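For contrast, the exhaustive search just dismissed can be sketched as follows, reusing the
hypothetical helpers of the earlier sketches plus an assumed size helper counting individual
assignments; its inner loop runs 2^|V| times.

    from itertools import combinations

    def smallest_penless_by_brute_force(S, V):
        best = None
        for k in range(len(V) + 1):
            for Vr in map(set, combinations(sorted(V), k)):
                fVr = fresh(Vr, set(V) | glob(S))
                c = co_slice(S, V, Vr, fVr)
                if not (Vr & glob(c)):            # this Vr is penless
                    if best is None or size(c) < size(best):
                        best = c
        return best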
12.1.1 A polynomial-time algorithm
The size of (penless or not) co-slices, for any given S and V, is anti-monotone with respect to the
set Vr of reusable variables. Thus, if Vr1 and Vr are two sets which lead to valid penless co-slices
(in the context S,V; we refer to such sets as 'mergeable', as they can be merged with their
corresponding backup variables after co-slicing), with Vr1 ⊆ Vr ⊆ V, we have
|penless-co-slice.S.V.Vr| ≤ |penless-co-slice.S.V.Vr1|. (Recall that the size of a statement here is the
number of individual assignments.)
In fact, we should be looking for the largest Vr that yields a penless co-slice. Why largest? Due
to the anti-monotonicity of the size of penless co-slices (with respect to the set of reusable variables),
such a penless co-slice will never be larger than the penless co-slice of any other mergeable set of
reusable variables.
Is there only one such largest set? Yes, due to the following definitions and observation.
After co-slicing, the variables in Vr \ glob.(co-slice.S.V.Vr.fVr) are definitely mergeable (i.e.
they can be merged with their corresponding members of fVr) whereas variables in
Vr ∩ glob.(co-slice.S.V.Vr.fVr) are considered non-mergeable. We thus define the set of mergeable
reusable variables (after co-slicing of S,V with reusable Vr ⊆ V and fresh fVr, i.e. fVr ∩ (V ∪
glob.S) = ∅) as
mergeable.S.V.Vr ≜ Vr \ glob.(co-slice.S.V.Vr.fVr). Accordingly, variables in
Vr \ mergeable.S.V.Vr are said to be non-mergeable.
Moreover, when all members of a set of reusable variables Vr are mergeable with respect to
S,V,Vr (i.e. when Vr = mergeable.S.V.Vr, which is the case iff Vr ∩ glob.(co-slice.S.V.Vr.fVr) = ∅),
we say Vr is ‘penless’ with respect to co-slice.S.V.
Lemma 12.1. Penlessness is closed under set-union. That is, if Vr1 and Vr2 are two penless
subsets of V, their union Vr1 ∪ Vr2 is penless too.
Proof. Assuming the two subsets Vr1 and Vr2 of extracted variables V are both penless with
respect to co-slice.S.V, we need to show the union Vr3 := (Vr1 ∪ Vr2) is penless too.
On the one hand, we observe
glob.(co-slice.S.V.Vr3.fVr3)
= {def. of co-slice; let coV := def.S \ V}
glob.(slice.S[final-use Vr3 \ fVr3].coV)
= {stepwise final-use sub. (see Section E.3): let Vr21 := Vr2 \ Vr1}
glob.(slice.S[final-use Vr1 \ fVr1][final-use Vr21 \ fVr21].coV)
⊆ {Lemma 12.2, see below}
fVr21 ∪ glob.(slice.S[final-use Vr1 \ fVr1].coV)
= {def. of co-slice}
fVr21 ∪ glob.(co-slice.S.V.Vr1.fVr1).
Thus Vr1 ∩ glob.(co-slice.S.V.Vr3.fVr3) = ∅ (due to the freshness of fVr21 and the penlessness of Vr1
in co-slice.S.V). On the other hand, we have
glob.(co-slice.S.V.Vr3.fVr3)
= {def. of co-slice; let coV := def.S \ V}
glob.(slice.S[final-use Vr3 \ fVr3].coV)
= {stepwise final-use sub. (see Section E.3): let Vr11 := Vr1 \ Vr2}
glob.(slice.S[final-use Vr2 \ fVr2][final-use Vr11 \ fVr11].coV)
⊆ {again Lemma 12.2, see below}
fVr11 ∪ glob.(slice.S[final-use Vr2 \ fVr2].coV)
= {def. of co-slice}
fVr11 ∪ glob.(co-slice.S.V.Vr2.fVr2).
Thus, similarly to the preceding derivation, Vr2 ∩ glob.(co-slice.S.V.Vr3.fVr3) = ∅. We then conclude
(from set theory and the definition of Vr3) the desired Vr3 ∩ glob.(co-slice.S.V.Vr3.fVr3) = ∅.
Lemma 12.2. Let S,X,fX,Y be any core statement and three sets of variables, respectively;
then
glob.(slice.S[final-use X \ fX].Y) ⊆ fX ∪ glob.(slice.S.Y)
provided fX ∩ ((X,Y) ∪ glob.S) = ∅.
Proof. Recall the definition of slice. There, the given program statement is first translated into
SSA, where it is sliced in a flow-insensitive way, before returning from SSA.
The difference between the SSA versions of S and S[final-use X \ fX] is only in the references
to fX, since final-use substitution changes only uses, and no definition. So both versions have
the same sets of defined variables and the same sets of slides, with potential differences in the
used variables on those slides. These potential differences, in turn, may lead to differences in the
respective relations of slide dependence. Since fX ∩ def.S[final-use X \ fX] = ∅, the introduced uses of
fX do not yield any new slide dependence.
Consequently, representing the relations of slide dependence as a set of pairs, the set of slide
dependences of the SSA version of S[final-use X \ fX] is a (not necessarily strict) subset of the
corresponding set for S itself. Thus, the slide-independent set of YLf (i.e. the closure, under the
reflexive transitive slide-dependence relation, of the set YLf of final instances of Y) of the former
is a subset of the corresponding set of the latter. The result is that, excluding fX itself, and even
after returning from SSA, the set of global variables in the former is a subset of the global variables
in the latter.
Finally, how do we find the largest penless set? The following observations suggest an optimistic
approach.
Lemma 12.3. Mergeability is monotone with respect to co-slicing (in the set of reusable variables).
That is,
mergeable.S.V.Vr1 ⊆ mergeable.S.V.(Vr1,Vr2).
Proof.
mergeable.S.V.Vr1
= {def. of mergeable}
Vr1 \ glob.(co-slice.S.V.Vr1.fVr1)
= {set theory: fVr1 ∩ Vr1 = ∅}
Vr1 \ (glob.(co-slice.S.V.Vr1.fVr1) \ fVr1)
⊆ {see below}
Vr1 \ (glob.(co-slice.S.V.(Vr1,Vr2).(fVr1,fVr2)) \ (fVr1,fVr2))
= {set theory: (fVr1,fVr2) ∩ Vr1 = ∅}
Vr1 \ glob.(co-slice.S.V.(Vr1,Vr2).(fVr1,fVr2))
⊆ {set theory}
(Vr1,Vr2) \ glob.(co-slice.S.V.(Vr1,Vr2).(fVr1,fVr2))
= {def. of mergeable}
mergeable.S.V.(Vr1,Vr2).
A useful property of co-slicing (for the third step above) is
(glob.(co-slice.S.V.(Vr1,Vr2).(fVr1,fVr2)) \ (fVr1,fVr2)) ⊆ (glob.(co-slice.S.V.Vr1.fVr1) \ fVr1).
To see why this is so, we observe
glob.(co-slice.S.V.(Vr1,Vr2).(fVr1,fVr2)) \ (fVr1,fVr2)
= {def. of co-slice; let coV := def.S \ V}
glob.(slice.S[final-use Vr1,Vr2 \ fVr1,fVr2].coV) \ (fVr1,fVr2)
= {stepwise final-use sub. (see Section E.3)}
glob.(slice.S[final-use Vr1 \ fVr1][final-use Vr2 \ fVr2].coV) \ (fVr1,fVr2)
= {set theory}
(glob.(slice.S[final-use Vr1 \ fVr1][final-use Vr2 \ fVr2].coV) \ fVr2) \ fVr1
⊆ {Lemma 12.2 with S,X,fX,Y := S[final-use Vr1 \ fVr1],Vr2,fVr2,coV}
glob.(slice.S[final-use Vr1 \ fVr1].coV) \ fVr1
= {def. of co-slice; coV = def.S \ V}
glob.(co-slice.S.V.Vr1.fVr1) \ fVr1.
An interesting consequence of the monotonicity of mergeability (one which calls for an
optimistic algorithm, when seeking the largest set of penless reusable variables) is the following.
Corollary 12.4. When reducing the set of reusable variables from (Vr1,Vr2) to (Vr1,Vr21),
where Vr21 ⊆ Vr2 and Vr1 ∩ mergeable.S.V.(Vr1,Vr2) = ∅, the subset Vr1 of non-mergeable variables
in the former remains non-mergeable in the latter (i.e. Vr1 ∩ mergeable.S.V.(Vr1,Vr21) = ∅).
Proof. Due to the monotonicity of mergeability (Lemma 12.3 above), any member of
Vr1 ∩ mergeable.S.V.(Vr1,Vr21) would have to be in Vr1 ∩ mergeable.S.V.(Vr1,Vr2) as well
(due to Vr21 ⊆ Vr2), thus contradicting the assumption of Vr1 being non-mergeable in
co-slice.S.V.(Vr1,Vr2).
Given a core statement S and variables of interest V, compute the largest subset
largest-penless-reusable.S.V of V which, when offered for reuse, yields a penless co-slice
(penless-co-slice.S.V.(largest-penless-reusable.S.V), which is in turn not larger than any other
penless co-slice of S,V), as follows:

largest-penless-reusable.S.V ≜ largest-penless-reusable-rec.S.V.V

largest-penless-reusable-rec.S.V.Vr ≜ if nonMergeable = ∅ then Vr else
    largest-penless-reusable-rec.S.V.(Vr \ nonMergeable)
where nonMergeable := glob.(co-slice.S.V.Vr.fVr) ∩ Vr
and fVr := fresh.(Vr,(V ∪ glob.S)).

Figure 12.1: An algorithm for finding the largest-penless-reusable set.
Remark: the above property should not be confused with 'non-mergeability is anti-monotone
with respect to co-slicing'. In fact, judging by our definition of non-mergeability, the latter is not
true. When decreasing the set of reusable variables, say from (Vr1,Vr2) to Vr1, a non-mergeable
variable from Vr2 will no longer be considered either (mergeable or non-mergeable), as it will no
longer be offered for reuse.
So, with an optimistic approach, the algorithm begins by trying to reuse all extracted variables
V. It then removes all non-mergeable variables. Now, should we trust the result (i.e.
the set Vr := mergeable.S.V.V) to be penless (i.e. Vr = mergeable.S.V.Vr)? Unfortunately
that is not necessarily so. By no longer reusing variables in V \ Vr, the set of global variables
glob.(co-slice.S.V.Vr.fVr) \ fVr is possibly larger than the corresponding glob.(co-slice.S.V.V.fV) \
fV; the former might include members of Vr, thus rendering Vr non-penless. However, such variables
can subsequently be removed, repeatedly.
Hence, the algorithm (see Figure 12.1) is optimistic and recursive. Starting with the largest
available set V, we repeatedly identify and remove non-mergeable variables, until a fixed point is
reached.
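Figure 12.1 transcribes directly into an iterative Python sketch (the recursion becomes a
loop; co_slice, glob and fresh are the assumed helpers used throughout these sketches).
Termination is immediate, since Vr strictly decreases on every pass.

    def largest_penless_reusable(S, V):
        Vr = set(V)                        # optimistic: try to reuse all of V
        while True:
            fVr = fresh(Vr, set(V) | glob(S))
            non_mergeable = glob(co_slice(S, V, Vr, fVr)) & Vr
            if not non_mergeable:          # fixed point reached: Vr is penless
                return Vr
            Vr -= non_mergeable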
12.2 Slice inclusion
When sliding V in S, e.g. in Transformation 11.2, the extracted computation consists of the
full slice of V in S, i.e. slice.S.V. The complement, in turn, consists of the code for computing
the remaining results coV := def.S \ V, i.e. penless-co-slice.S.V.Vr, with Vr (being e.g.
largest-penless-reusable.S.V, as shown above) a subset of V of extracted variables whose final
extracted value is to be offered for reuse. Variables in coV, however, might also be modified in
the extracted slice, if those contribute to the computation of V. In such a case, the compensatory
code ensures (through backup variables) that those modifications do not interfere with the eventual
computation of coV in the complement.
With this transformation, the final value of the extracted variables can be reused in the complement.
But how about the final values of other variables? In the following example, an attempt
to slide the computation of avg (and reusing it in the complement) would lead to duplication of
the code for computing sum.
i,sum,prod := 0,0,1
; while i<a.length do
i,sum,prod :=
i+1,sum+a[i],prod*a[i]
od
; avg := sum/a.length
; out << sum
; out << prod
; out << avg
The result will look this way:
i,sum := 0,0
; while i<a.length do
i,sum :=
i+1,sum+a[i]
od
; avg := sum/a.length
;
i,sum,prod := 0,0,1
; while i<a.length do
i,sum,prod :=
i+1,sum+a[i],prod*a[i]
od
; out << sum
; out << prod
; out << avg
Notice that the final value of avg was successfully reused. In contrast, the whole computation
of sum had to be duplicated. The reason is that its value at the end of the extracted slice was
ignored, instead of being offered for reuse through final-use substitution (like avg).
In general, there is no reason to restrict final-use substitution to the set of extracted variables,
V. All other variables whose final value is computed in the extracted slice might be good candidates
for reuse too. In the example above, it was sum whose final value was computed both in
the slice and the complement.
How can we tell it was the final value of a variable that was computed in the extracted slice?
This is the case whenever the full slice of such a variable, with respect to the program scope, is
included in the extracted code. In general, we can say that all variables whose slices are included
in the slice for V are candidates for final-use substitution. We denote this set V′ and propose to
update our slice-extraction transformations to extract that extended set rather than the requested
set V.
In the example above, where V was {avg}, the corresponding V′ set is {avg,sum,i}.
Applying Transformation 11.2 to that latter set, with the largest penless reusable set Vr :=
{avg,sum}, would therefore lead to:
|[var fi
; i,sum := 0,0
; while i<a.length do
i,sum :=
i+1,sum+a[i]
od
; avg := sum/a.length
; fi:=i
;
i, prod := 0, 1
; while i<a.length do
i, prod :=
i+1, prod*a[i]
od
; out << sum
; out << prod
; out << avg
; i:=fi
]|
Note that this time, the final extracted value of sum was reused in the complement, instead
of being ignored and thus recomputed. This resulting code is considered better than the previous
result in the sense that the code for computing sum is no longer duplicated. And in terms of our
optimisation problem, it yields a smaller co-slice, with fewer assignments.
On the other hand, we now have more compensatory code. To understand this, we further
note that the largest set Vr used for final-use substitution was {avg,sum}. The variable i was
excluded since its intermediate values are used in the complement, for the computation of prod.
Instead, the slice for i was duplicated and its modifications in the complement were ignored
through a backup variable. In this case, considering levels of code duplication, ignoring effects
on i in the extracted slice, instead, would have been as good. However, in terms of the number of
backup variables, the latter would have been better.
Accordingly, when two sliding combinations are similar in terms of code duplication, it might
be desirable to choose the one that minimizes the need for backup variables, as those entail both
extra storage and time for copying.
In this thesis, we leave this aspect (of minimizing such compensatory code) alone, and focus
solely on levels of code duplication, as displayed by the number of individual assignments in the
co-slice.
12.3 The optimal sliding transformation
Transformation 12.5. Let S be any core statement and V be a set of variables to be extracted;
then
S ⊑
|[var iVnr1,icoV11,fVnr1,fVnr2
; iVnr1,icoV11 := Vnr1,coV11
; SV
; fVnr1,fVnr2 := Vnr1,Vnr2
;
Vnr1,coV11 := iVnr1,icoV11
; ScoV
; Vnr1,Vnr2 := fVnr1,fVnr2
]|
where V′ is the set of variables in V ∪ def.S whose slice is included in slice.S.V,
Vr := largest-penless-reusable.S.V′,
Vnr := V′ \ Vr,
coV := def.S \ (Vr,Vnr),
SV := slice.S.V′,
ScoV := penless-co-slice.S.V′.Vr,
Vnr1,Vnr2 := (Vnr ∩ input.ScoV),(Vnr ∩ (def.ScoV \ input.ScoV)),
coV11 := (coV ∩ def.SV ∩ input.ScoV)
and (iVnr1,icoV11,fVnr1,fVnr2) := fresh.((Vnr1,coV11,Vnr1,Vnr2),(V ∪ glob.S)).
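Before the proof, the overall pipeline of Transformation 12.5 can be summarised as a sketch
composing the previous sketches (all helper names remain assumptions of this illustration):

    def optimal_sliding_inputs(S, V):
        V_ext = extended_extraction_set(S, V)     # the superset V' of Section 12.2
        Vr = largest_penless_reusable(S, V_ext)   # offered for reuse (Figure 12.1)
        Vnr = V_ext - Vr
        SV = slice_of(S, V_ext)                   # extracted code; equals slice.S.V
        ScoV = penless_co_slice(S, V_ext, Vr)     # smallest penless complement
        return SV, ScoV, Vr, Vnr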
Proof.
S
⊑ {Refinement 11.1 with ScoV := co-slice.S.V′.Vr.fVr
   on fresh fVr: (Vr,Vnr,coV) = def.S by def. of Vr,Vnr,coV;
   S[live Vr,Vnr] ⊑ SV[live Vr,Vnr] by Q1 of slice;
   S[final-use Vr \ fVr][live coV] ⊑ (co-slice.S.V′.Vr.fVr)[live coV]
   by Q1 of co-slice;
   def.SV ⊆ def.S by Q2 of slice; similarly
   def.ScoV ⊆ def.S by Q2 of co-slice;
   (iVnr1,icoV11,fVnr1,fVnr2) ∩ glob.S = ∅ by Q1 of fresh;
   note that Vr1,Vr2,Vr3 := ∅,∅,Vr since Vr is penless (by def. of
   largest-penless-reusable);
   consequently iVr1,iVr2,fVr1 and fVr2 are all empty; also note (for our ScoV)
   penless-co-slice.S.V′.Vr = (co-slice.S.V′.Vr.fVr)[fVr \ Vr]
   by def. of penless-co-slice (which is indeed well-defined, Vr being penless)}
(iVnr1,icoV11 := Vnr1,coV11 ; SV ; fVnr1,fVnr2 := Vnr1,Vnr2
; Vnr1,coV11 := iVnr1,icoV11 ; ScoV ; Vnr1,Vnr2 := fVnr1,fVnr2)
[live Vr,Vnr,coV]
= {def. of live: (def.SV ∪ def.ScoV) ⊆ (Vr,Vnr,coV) (again, Q2 of slice and
   co-slice) and (iVnr1,icoV11,fVnr1,fVnr2) ∩ (Vr,Vnr,coV) = ∅}
|[var iVnr1,icoV11,fVnr1,fVnr2;
iVnr1,icoV11 := Vnr1,coV11 ; SV ; fVnr1,fVnr2 := Vnr1,Vnr2
; Vnr1,coV11 := iVnr1,icoV11 ; ScoV ; Vnr1,Vnr2 := fVnr1,fVnr2]|.
12.4 Summary
This chapter has addressed two related optimisation problems with regard to penless co-slicing
and penless sliding, from the preceding chapter. There, the sliding transformation assumed the
subset of extracted variables to be reused in the complement is given. Here, in contrast, all possible
subsets have been considered, and algorithms for finding the optimal ones have been developed.
The smallest possible penless co-slice is found through an optimistic polynomial-time algorithm
that assumes all extracted variables should be made reusable, and repeatedly removes those that
violate penlessness. When a fixed point is reached, the resulting set is guaranteed to yield the
smallest possible penless co-slice. The correctness of this algorithm has been proved through
a number of properties of final-use substitution, slicing and co-slicing that have been formally
developed.
The optimal sliding transformation, one which extracts precisely the slice of selected variables
and with the smallest possible penless co-slice (in terms of the number of individual assignments),
has been shown to involve the extraction of a superset of the selected variables and the smallest
penless co-slice, as in our solution to the first optimisation problem. The superset of extracted
variables includes the selected set and all other variables whose slice is included in the extracted
code.
A relation of slice inclusion, contributing to our detection of optimal sliding, has been introduced
by Gallagher and Lyle [22].
In a final note, we return to our declared challenge from the end of Chapter 2. There,
it was shown that if the user requests the extraction of variable out (or equivalently statements
{1,2,4,6}), the Tuck transformation would duplicate the entire extracted slice (when not rejecting
the extraction), the KH00 algorithm would fail, and KH03 would insist on extracting statement
3 too, which is illegal in our context of slice extraction.
Our optimal sliding from Transformation 12.5 above, with V = {out}, would detect V′ =
{out,sum,i}, of which Vr = {out,sum} would be offered for reuse. Consequently, the challenge
of untangling like Tuck whilst minimizing code duplication and improving applicability, like KH03,
would be met.
This concludes our current investigation of slice extraction via sliding. Potential applications,
for refactoring and otherwise, as well as possible directions for future work, will be outlined in the
next chapter.
Chapter 13
Conclusion
This thesis has explored the application of program slicing and related analyses to the construction
of automatic tools for refactoring.
A theoretical framework for slicing-based refactoring has been developed. The framework has
been introduced in Chapter 4 and further extended in Chapter 5, where a new proof method
has been developed. The method is based on two complementary types of refinements, i.e. slice-refinement
and co-slice-refinement. In our deterministic context, when a program S′ is both a slice-refinement
and a co-slice-refinement of another program S, it is guaranteed to be a full refinement
of S. This enables the decomposition of a proof, following the specific decomposition applied in
a given transformation. The construction of our framework has been finalised in Chapter 8, with
the formalisation of a novel program decomposition technique of program slides. We think of a
program as represented by a collection of transparency slides. On each such slide, a non-contiguous
part of the original program is printed, such that the union of all slides yields back the program
itself.
Based on our theoretical framework, a provably correct slicing algorithm has been provided in
Chapter 9. The algorithm is based on the observation that slides capture the control flow aspect
of slicing, whereas complementary data-flow influences are captured by our binary relation of slide
dependence. Thus, a slide-independent set of slides yields a correct slice.
Our framework and slicing algorithm have been applied in solving the problem of slice extrac-
tion, as posed in the introduction chapter, via a family of provably correct sliding transformations.
Building on existing method-extraction algorithms, our approach shares the advantages of those
whilst avoiding some of the respective weaknesses. Thus sliding is successful in providing high
levels of accuracy and applicability.
The thesis comes to a conclusion in this chapter, by discussing implications and potential
applications of sliding transformations, first in the context of refactoring, and more generally,
later. Furthermore, advanced issues and limitations of sliding are evaluated and some ideas for
future work are presented.
13.1 Slicing-based refactoring
13.1.1 Replace Temp with Query
Our journey had started with a promise to offer general automation of Fowler’s refactoring of
Replace Temp with Query. This was motivated, in part, by Fowler and Beck’s big refactoring of
Convert Procedural Design to Objects and the observation that support for removing temps was
missing.
But then, instead, we turned to formulate and solve the problem of slice extraction (via sliding).
The time has come to explain how sliding can contribute to automating Replace Temp with Query.
Our observation is that
Replace Temp with Query = Extract Slice + Inline Temp + Merge Temps
whereby
Extract Slice = Sliding + Extract Method
with Inline Temp a refactoring to eliminate simple temps that are "getting in the way of other
refactorings" [20, Page 119], and with Merge Temps as in the elimination of compensatory code
after sliding (see e.g. Refinement 11.1).
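As a schematic illustration of this composition (our own Python example in the spirit of
Fowler's, not taken from the thesis), consider a temp whose computing slice is extracted into
a query and whose uses are then inlined:

    def report(order):
        base_price = order.quantity * order.item_price    # the temp
        discount_factor = 0.98 if base_price > 1000 else 1.0
        return base_price * discount_factor

    # After Extract Slice (the slice computing the temp becomes a query)
    # and Inline Temp (each use of the temp becomes a call to the query):

    def base_price(order):
        return order.quantity * order.item_price

    def report_refactored(order):
        discount_factor = 0.98 if base_price(order) > 1000 else 1.0
        return base_price(order) * discount_factor

In this trivially contiguous case Extract Method alone would do; sliding earns its keep when
the temp's slice is tangled, non-contiguously, with the rest of the method.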
13.1.2 More refactorings
Our Extract Slice refactoring, automated via sliding, can help with automating some more known
(and yet to be supported) refactorings.
Split/merge loops
The Split Loop refactoring is an immediate candidate for automation through sliding. “You have
a loop that is doing two things. Duplicate the loop” [69]. Indeed, this is what we did throughout
the thesis in most of our examples. However, it should be emphasised that splitting loops is just
a special case of our general slice-extraction refactoring.
It should also be remembered that tangled loops are not bad practice as such. It is left to the
programmer to apply this refactoring judiciously, e.g. when one of the computations in the loop
should be extracted for reuse.
Our separation of refinement and program equivalence rules from actual transformations,
throughout the thesis, was made with the understanding that reverse sliding operations, e.g.
for entangling loops (a new Merge Loops refactoring?), may under some circumstances be as de-
sirable. The decision when to apply which refactoring, in our understanding, should be left to the
programmer’s good judgement. We merely provide the enabling tools.
Separate Query from Modifier
Side effects in functions can be problematic, e.g. hampering potential reuse. “You have a method
that returns a value but also changes the state of an object”, is a situation that calls for the
Separate Query from Modifier refactoring. “Create two methods, one for the query and one for
the modification” is Fowler’s suggestion [20, Page 279].
Cornélio has formalised this refactoring for simple cases in which the modifier is made of an
individual assignment (to an object's field) and the query returns the old value of the assigned
variable (i.e. the field) [11, Page 128]. But what if the querying code is tangled with the modifier?
(Indeed this is the case in Fowler’s original example.)
Our observation is that such cases require untangling of non-contiguous code, as is offered by
sliding. However, working out exact details of this refactoring will require further investigation. If
successful, such application of sliding would yield a novel and advanced solution to this important
and highly non-trivial refactoring.
Arbitrary method extraction
The Extract Method refactoring is considered so important that its automation, in a behaviour-
preserving way, has been declared “The Rubicon” to be crossed by refactoring tools, before those
can be considered serious [21]. Furthermore, many of the other catalogued refactoring transfor-
mations depend on Extract Method as a building block.
One limitation of method extraction, as formulated and supported to-date, is the insistence
on extracting a single fragment of (contiguous) code. Indeed, extracting an arbitrary selection
of fragments, from a given program scope, is much harder. We call this generalisation arbitrary
method extraction. It involves the extraction of a (not necessarily contiguous) set of program
fragments into a single new method.
The most closely related work to sliding, which was introduced early in the thesis (in Sec-
tion 2.3) and indeed influenced its development, includes the Tucking transformation by Lakhotia
and Deprez [40] and two algorithms by Komondoor and Horwitz (KH00 [38] and KH03 [39]). As
was mentioned, none of them actually targeted slice extraction, as was defined in Section 1.4.
In fact, it was (different flavours of) arbitrary method extraction that they targeted. Common
to all three is that an arbitrary selection of fragments is given as input. They differ (from each
other), however, in the rules of the game. Those include (1) the way to determine the enclosing
program scope (from which to extract), (2) the applicability conditions (i.e. when to reject a
transformation), (3) which non-selected statements can (or should) be dragged along with the
extracted code, (4) what is in the complement and how to compose it with the extracted code, (5)
which parts of the program can be duplicated and/or reversed, and (6) what kind of compensation
is to be allowed.
Our conjecture is that each of the three arbitrary-method-extraction flavours (i.e. Tuck, KH00
and KH03) can be reduced to slice extraction. The results of an initial investigation in that di-
rection have suggested such reductions indeed exist. Those involve the formulation and reuse of
backward slices from internal program points. This can be done with existing sliding transforma-
tions, to be performed on the SSA-form rather than the original. Then, the (existing solution’s)
applicability conditions should be shown to imply de-SSA-ability of the result. Thus, the existing
solutions will be re-formulated, proved correct and automated through sliding.
Consequently, as was the case for slice extraction, corresponding improvements can be expected
to present themselves. Those might yield new solutions with higher applicability (compared with
Tuck and KH00), higher accuracy of extracted code (Tuck and KH03) and complement (Tuck),
reduced levels of duplication and compensation (again, Tuck), and enhanced scalability (mostly
the exponential KH00 but also the cubic KH03).
It should be said that this application of sliding is, on the one hand, somewhat surprising (to
the author, at least), as it was not at all anticipated (in earlier stages of this research). But on
the other hand, it is very reasonable, since the results of Tuck, KH00 and KH03 (in particular
the comparison between the former and latter, in [37] and [39]) have directly contributed to the
invention of slides and sliding.
13.2 Advanced issues and limitations
In choosing a programming language, we have made some simplifying assumptions, such that
formal derivation of the concepts behind sliding has become feasible. It is natural to ask whether
sliding transformations can be upgraded to support “real” languages. In what follows, we consider
the lifting of some earlier restrictions on the supported language.
Firstly, our assumption of all variables being cloneable was made so that we can make backups
of initial and final values, as part of sliding's compensatory code. Thanks to our penless-sliding
effort to remove redundant backup variables, back in Chapter 11, this restriction can be easily lifted.
This lifting must be complemented with a strengthening of sliding's applicability conditions. Added
preconditions will ensure all backup variables of non-cloneable program variables are mergeable
and hence removed. Otherwise, the transformation would be rejected. Alternatively, some measures
might be taken (as in KH03, where no backup variables are allowed) to avoid the need for
such backup.
Secondly, if aliasing is to be permitted, a preliminary step may perform alias analysis before
sliding begins. This step would rename variables such that the aliases are transparent to the sliding
algorithm. Furthermore, since sliding aims to keep the source as close to the original as possible,
this would have to be complemented with a following step to undo the renaming. There, special
care would have to be taken with compensatory code, to retrieve the backup values of all relevant
variables.
Thirdly, allowing structured jumps or even arbitrary control flow, as is the case with existing
solutions to arbitrary method extraction, would require a complete reformulation, at least on
the lower level of our program analysis and manipulation approach (e.g. laws for propagating
assertions). Nevertheless, there appears to be no reason why slides and sliding should not be
applicable in such settings. For example, the slide of an assignment would have to include all its
control-dependence predecessors instead of merely syntax-tree ancestors. (In our simple language,
indeed the latter subsumes the former.) Another example is final-use substitution, whereby instead
of propagating assertions as far as possible before making the substitution, it should be possible
to formulate the substitution in terms of paths over the control flow graph. There, a final use (of
variable x) is one from which all paths to the exit involve no re-definition (of that x).
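As a small illustration of final uses (our example): in y := x; x := 0; z := x, the use of x in y := x is not final, since the path to the exit passes through the re-definition x := 0, whereas the use of x in z := x is final. On a control flow graph, checking this amounts to a reachability test over definition-free paths, with no assertion propagation involved.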
Fourthly, in the presence of exceptions, sliding’s reordering of statements may be problematic
in the sense that the transformed program might perform more operations before throwing an
exception or might even throw a different exception. When the thrown exceptions are part of
the behaviour to be preserved, such reordering should be limited or even completely abandoned.
Instead, it should be possible to adopt alternative extraction strategies which involve no reordering
of statements. We have proposed two such alternative strategies, one object oriented, and the
other aspect oriented [36], in a paper titled “Untangling: A Slice Extraction Refactoring” [17].
Explained in our context of slips and slides, both strategies involve the sliding of slips, without
their controlling guards. Such slip sliding would extract the statement of a slip into a method of
its own, in the object oriented case, or into an advice in an aspect-oriented context. In the former,
the slip would be replaced with a call to the new method whereas in the latter, the extracted
advice slip will be associated with a pointcut designator, ensuring it is slid back (i.e. woven) into
its original point, before execution.
Fifthly, in the presence of concurrency it is unclear whether sliding is at all appropriate, again,
due to reordering of computations. Nevertheless, some solutions to slicing are available for such
settings (e.g. [35, 15]). It is possible that as it was for aliasing, new preconditions to sliding may
be formed to ensure behaviour preservation.
Finally, supporting procedures, parameters, overloading, and object-oriented constructs (e.g.
inheritance and polymorphism) would be a great challenge. Indeed, slicing research has already
proposed solutions to such problems, on the one hand, whereas predicate transformers (e.g. for
the language ROOL [11]) have been defined and even applied to refactoring, on the other. It is
currently unknown whether such advanced solutions will be amenable to supporting formulations
of sliding.
In the context of PDG-based slicing, extra language features (e.g. arbitrary control flow [6])
are typically handled by the addition of more edges to the graph. This way, the slicing algorithm
itself is oblivious to those features and remains simple. Similarly, it can be expected that sliding,
being based on slides and slicing, be enhanced by the addition of slide dependences.
13.3 Future work
Sliding, as presented in this thesis, offers an abundance of possible directions for future work.
Earlier in this chapter, we have already mentioned a number of possible applications of sliding
in implementing known refactorings and in extending method extraction to support arbitrary
selections of non-contiguous code fragments.
In our discussion on supporting advanced language features and limitations of sliding, some
further ideas have been highlighted, including the support for different strategies of extraction, as
in our paper on refactoring into aspects [17].
Some further ideas may involve the theory behind sliding, or practicalities, or even other
applications, beyond the initial domain of refactoring.
13.3.1 Formal program re-design
In this thesis, we have restricted our supported language to deterministic constructs. If the earlier
section considered the lifting of language restrictions when supporting “real” languages, here we
turn the other way, considering the effects of supporting non-determinism and specifications.
Our problem with non-determinism has been related to the duplication of such constructs. As
in the above section, this problem can be treated by adding a precondition to sliding, ensuring no
non-deterministic choice is duplicated. Alternatively, a mechanism to ensure exact repetition of
non-deterministic choices, wherever duplicated, can be installed. The details of such mechanisms
would require further work.
It is hoped that with robust support for change of programs and specifications — and sliding
may offer a step towards such support — formal methods of program design, and hence of re-design,
would become more agile and perhaps, consequently, more widely used.
13.3.2 Further applications of sliding: beyond refactoring
The sliding family of program equivalence and refinement rules, as introduced in the thesis, has
been applied to behaviour-preserving transformations for refactoring, with the aim of supporting
change in software. Nonetheless, it does not have to stop there. Sliding carries the potential
of being relevant and applicable wherever program equivalence or behaviour-preserving program
changes are of interest.
Software obfuscation
The sliding refinement rules of Chapters 10 and 11 provide a large universe of equivalent programs,
as was explored in the optimisation problems of Chapter 12. By construction, those programs differ
only in the subset of reusable variables and hence in the size and shape of the co-slice. In obfuscation
[14], a program is transformed with the aim of making it less readable. This is desirable e.g.
for software security and protection. In a way, obfuscation is the opposite of refactoring, but as
it also involves behaviour-preserving transformations, it may benefit from sliding. Moreover, the
large number of equivalent programs carries the potential of rendering the reversal of sliding-based
obfuscation harder.
Clone elimination
The arbitrary-method-extraction algorithms by Komondoor and Horwitz [38, 39, 37] target the
elimination of clones, or duplicated (not-necessarily contiguous) statements, in existing programs.
Their approach eliminates pairs of clones as well as clone groups. Since, with some more work,
sliding is expected to be made applicable to such method-extraction techniques (as KH00 and
KH03), it should also be useful in that context.
Integration of program variants
In general, as said, sliding can be expected to be useful wherever program equivalence is. One
interesting application of such equivalence is in the integration of variants of a program. This is
useful, for example, when a group of programmers is working simultaneously on a given code base.
Horwitz, Prins and Reps [31, 32] have suggested some PDG-based and slicing related algorithms
of program merging for integration. Those were based on the observation that if two programs are
represented by isomorphic dependence graphs, they are equivalent. However, the reverse is not
true, obviously, as the problem of program equivalence is in general undecidable. With sliding, we
identify a range of equivalent programs whose dependence graphs will not be isomorphic. This is
due to duplication of guards and assignments. This sliding-related family of equivalent programs
might, in turn, enhance the capabilities of such program integration algorithms.
Optimising compilers
In this thesis, we have adopted some program analyses, representations and transformations from
the world of optimising compilers, such as reaching definitions, SSA form, live variables analysis
and the related dead-assignments-elimination.
In turn, it should be interesting to investigate the relevance of sliding transformations to that
domain. It appears that sliding offers more powerful code-motion transformations than the state
of the art.
Programming education
On a different level altogether, it is hoped that slides and sliding, either as a metaphor or in
theory and practice, can find their way into the programming education curriculum, especially in
the education of non-mathematically inclined programmers. For example, teaching and learning
the concept of recursion with the slideshow metaphor, having a single slide for each iteration,
on which the values of parameters and local variables are written with an erasable pen, may prove
simpler and more tangible than present methods. Furthermore, since often programmers think of
programs as slices of non-contiguous code rather than trees or flow graphs, as Weiser has shown
[62], it is hoped that representing programs as collections of slides would appeal to programmers.
Beyond programming
Finally, it should be hard but interesting to examine the application of slides and sliding beyond
the world of programming. In general, any context of evolvable structured documents could benefit
from such techniques.
For example, this thesis has been written, or rather developed, using LaTeX. Moreover, its
content and structure have evolved throughout. Could slicing, slides and sliding not assist in such
activities?
Appendix A
Formal Language Definition
This appendix gives a full definition of the language used in this thesis. Each language construct is
given semantics formulated as a wp predicate transformer. Then, some syntactic approximations
to required semantic properties of that construct, regarding program variables, are given. Finally,
for each construct, the basic requirements (RE1-RE5) are proved. Those were defined back in
Chapter 4 and are re-stated next. For any statement S we require
RE1: wp.S is universally disjunctive
RE2: glob.(wp.S.P) ⊆ (glob.P \ ddef.S) ∪ input.S for all P
RE3: [wp.S.P ≡ P ∧ wp.S.true] for all P with glob.P ∩ def.S = ∅
RE4: ddef.S ⊆ def.S
RE5: glob.S = def.S ∪ input.S
A.1 Core language
Assignment
[wp.X:= E.P ≡ P[X\E]] for all P;
def.X:= E ≜ X;
ddef.X:= E ≜ X;
input.X:= E ≜ glob.E; and
glob.X:= E ≜ X ∪ glob.E.
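(As a small worked instance of these definitions, added here for illustration: for x := y+z we get def.x:= y+z = ddef.x:= y+z = {x}, input.x:= y+z = glob.(y+z) = {y,z} and glob.x:= y+z = {x,y,z}; RE5 is then immediate, as {x,y,z} = {x} ∪ {y,z}.)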
RE2: "glob.(wp.X:= E.P) ⊆ (glob.P \ ddef.X:= E) ∪ input.X:= E"
glob.(wp.X:= E.P)
= {wp of ':='}
glob.(P[X\E])
⊆ {glob of normal sub.}
(glob.P \ X) ∪ glob.E
= {ddef and input of ':='}
(glob.P \ ddef.X:= E) ∪ input.X:= E
RE3: "[wp.X:= E.P ≡ P ∧ wp.X:= E.true]" provided def.X:= E ∩ glob.P = ∅
P ∧ wp.X:= E.true
= {wp of ':='}
P ∧ true[X\E]
= {glob.true = ∅}
P ∧ true
= {identity element of ∧}
P
= {redundant normal sub.: X ∩ glob.P = ∅ due to proviso and definition of def}
P[X\E]
= {wp of ':='}
wp.X:= E.P
RE4: "ddef.X:= E ⊆ def.X:= E": Trivially so, since X ⊆ X.
RE5: "glob.X:= E = def.X:= E ∪ input.X:= E"
glob.X:= E
= {glob of ':='}
X ∪ glob.E
= {def and input of ':='}
def.X:= E ∪ input.X:= E
Sequential composition
[wp.S1;S2.P ≡ wp.S1.(wp.S2.P)] for all P;
def.S1;S2 ≜ def.S1 ∪ def.S2 ;
ddef.S1;S2 ≜ ddef.S1 ∪ ddef.S2 ;
input.S1;S2 ≜ input.S1 ∪ (input.S2 \ ddef.S1) ; and
glob.S1;S2 ≜ glob.(S1,S2) .
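(For illustration, our example: taking S1 to be x := y and S2 to be z := x+w, we get ddef.S1 = {x} and input.S2 = {x,w}, hence input.S1;S2 = {y} ∪ ({x,w} \ {x}) = {y,w}; the x read by S2 is certainly defined by S1 and so does not count as an input of the composition.)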
RE2: "glob.(wp.S1;S2.P) ⊆ (glob.P \ ddef.S1;S2) ∪ input.S1;S2" provided glob.(wp.S1.Q) ⊆ (glob.Q \ ddef.S1) ∪ input.S1 and glob.(wp.S2.P) ⊆ (glob.P \ ddef.S2) ∪ input.S2 for any predicates P,Q
glob.(wp.S1;S2.P)
= {wp of ';'}
glob.(wp.S1.(wp.S2.P))
⊆ {proviso with Q := wp.S2.P}
(glob.(wp.S2.P) \ ddef.S1) ∪ input.S1
⊆ {proviso}
(((glob.P \ ddef.S2) ∪ input.S2) \ ddef.S1) ∪ input.S1
= {set theory: (a ∪ b) \ c = (a \ c) ∪ (b \ c) and
(d \ e) \ f = d \ (e ∪ f)}
(glob.P \ (ddef.S1 ∪ ddef.S2)) ∪ (input.S2 \ ddef.S1) ∪ input.S1
= {ddef and input of ';'}
(glob.P \ ddef.S1;S2) ∪ input.S1;S2
RE3: "[wp.S1;S2.P ≡ P ∧ wp.S1;S2.true]" provided def.S1;S2 ∩ glob.P = ∅,
[wp.S1.Q ≡ Q ∧ wp.S1.true] for any Q with def.S1 ∩ glob.Q = ∅ and [wp.S2.R ≡ R ∧ wp.S2.true]
for any R with def.S2 ∩ glob.R = ∅
wp.S1;S2.P
= {wp of ';'}
wp.S1.(wp.S2.P)
= {proviso: def.S2 ∩ glob.P = ∅}
wp.S1.(P ∧ wp.S2.true)
= {wp.S1 is finitely conjunctive}
wp.S1.P ∧ wp.S1.(wp.S2.true)
= {proviso: def.S1 ∩ glob.P = ∅}
P ∧ wp.S1.true ∧ wp.S1.(wp.S2.true)
= {finitely conjunctive}
P ∧ wp.S1.(true ∧ wp.S2.true)
= {identity element of ∧}
P ∧ wp.S1.(wp.S2.true)
= {wp of ';'}
P ∧ wp.S1;S2.true
RE4: "ddef.S1;S2 ⊆ def.S1;S2" provided ddef.S1 ⊆ def.S1 and ddef.S2 ⊆ def.S2: Indeed ddef.S1 ∪ ddef.S2 ⊆ def.S1 ∪ def.S2, due to the proviso and set theory.
RE5: "glob.S1;S2 = def.S1;S2 ∪ input.S1;S2" provided glob.S1 = def.S1 ∪ input.S1 and glob.S2 = def.S2 ∪ input.S2
def.S1;S2 ∪ input.S1;S2
= {def and input of ';'}
def.S1 ∪ def.S2 ∪ input.S1 ∪ (input.S2 \ ddef.S1)
= {set theory: ddef.S1 ⊆ def.S1}
def.S1 ∪ def.S2 ∪ input.S1 ∪ input.S2
= {proviso}
glob.S1 ∪ glob.S2
= {glob of ';'}
glob.S1;S2
Alternative construct
[wp.IF.P ≡ (B ⇒ wp.S1.P) ∧ (¬B ⇒ wp.S2.P)] for all P;
def.IF ≜ def.S1 ∪ def.S2 ;
ddef.IF ≜ ddef.S1 ∩ ddef.S2 ;
input.IF ≜ glob.B ∪ input.S1 ∪ input.S2 ; and
glob.IF ≜ glob.B ∪ glob.S1 ∪ glob.S2 .
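(For illustration, our example: with IF = if b then x,y := 1,2 else x := 3, we get def.IF = {x,y}, ddef.IF = {x,y} ∩ {x} = {x}, since only x is certainly defined whichever branch is taken, input.IF = {b} and glob.IF = {b,x,y}.)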
RE2: "glob.(wp.IF.P) ⊆ (glob.P \ ddef.IF) ∪ input.IF" provided glob.(wp.S1.P) ⊆ (glob.P \ ddef.S1) ∪ input.S1 and glob.(wp.S2.P) ⊆ (glob.P \ ddef.S2) ∪ input.S2
glob.(wp.IF.P)
= {wp of IF}
glob.((B ⇒ wp.S1.P) ∧ (¬B ⇒ wp.S2.P))
= {def. of glob; glob.B = glob.(¬B)}
glob.B ∪ glob.(wp.S1.P) ∪ glob.(wp.S2.P)
⊆ {proviso, twice}
glob.B ∪ (glob.P \ ddef.S1) ∪ input.S1 ∪ (glob.P \ ddef.S2) ∪ input.S2
= {set theory}
(glob.P \ (ddef.S1 ∩ ddef.S2)) ∪ glob.B ∪ input.S1 ∪ input.S2
= {ddef and input of IF}
(glob.P \ ddef.IF) ∪ input.IF
RE3: "[wp.IF.P ≡ P ∧ wp.IF.true]" provided def.IF ∩ glob.P = ∅
wp.IF.P
= {wp of IF}
(B ⇒ wp.S1.P) ∧ (¬B ⇒ wp.S2.P)
= {proviso, twice: def.IF ∩ glob.P = ∅ implies def.S1 ∩ glob.P = ∅ and def.S2 ∩ glob.P = ∅}
(B ⇒ P ∧ wp.S1.true) ∧ (¬B ⇒ P ∧ wp.S2.true)
= {dist. of ⇒ over ∧, twice}
(B ⇒ P) ∧ (B ⇒ wp.S1.true) ∧ (¬B ⇒ P) ∧ (¬B ⇒ wp.S2.true)
= {pred. calc.: [(Y ⇒ Z) ∧ (¬Y ⇒ Z) ≡ Z]}
P ∧ (B ⇒ wp.S1.true) ∧ (¬B ⇒ wp.S2.true)
= {wp of IF}
P ∧ wp.IF.true
RE4: "ddef.IF ⊆ def.IF" provided ddef.S1 ⊆ def.S1 and ddef.S2 ⊆ def.S2: Indeed ddef.S1 ∩ ddef.S2 ⊆ def.S1 ∪ def.S2, due to the proviso and set theory (a ∩ b ⊆ a ⊆ a ∪ b).
RE5: "glob.IF = def.IF ∪ input.IF" provided glob.S1 = def.S1 ∪ input.S1 and glob.S2 = def.S2 ∪ input.S2
def.IF ∪ input.IF
= {def and input of IF}
def.S1 ∪ def.S2 ∪ glob.B ∪ input.S1 ∪ input.S2
= {proviso}
glob.B ∪ glob.S1 ∪ glob.S2
= {glob of IF}
glob.IF
Repetitive construct
[wp.DO.P ≡ (∃i: 0 ≤ i: k^i.false)] for all P,
with k given by (DS:9,44) [13]: [k.Q ≡ (B ∨ P) ∧ (¬B ∨ wp.S.Q)] ;
def.DO ≜ def.S ;
ddef.DO ≜ ∅ ;
input.DO ≜ glob.B ∪ input.S ; and
glob.DO ≜ glob.B ∪ glob.S .
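(For illustration, our example: with DO = while i < n do s,i := s+i,i+1 od, we get def.DO = {s,i}, ddef.DO = ∅, since the loop body may never execute and so nothing is certainly defined, input.DO = {i,n} ∪ {s,i} = {s,i,n} and glob.DO = {s,i,n}.)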
RE2: "glob.(wp.DO.P) ⊆ (glob.P \ ddef.DO) ∪ input.DO" provided glob.(wp.S.Q) ⊆ (glob.Q \ ddef.S) ∪ input.S
Recall that wp.DO.P is equivalent to (∃i: 0 ≤ i: k^i.false) with [k.Q ≡ (B ∨ P) ∧ (¬B ∨ wp.S.Q)]
and that ddef.DO ≜ ∅ and input.DO ≜ glob.B ∪ input.S. Observing that glob.false ⊆ glob.P ∪
glob.B ∪ input.S is trivially true, we are left to prove
glob.((B ∨ P) ∧ (¬B ∨ wp.S.Q)) ⊆ glob.P ∪ glob.B ∪ input.S for any Q with glob.Q ⊆ glob.P ∪
glob.B ∪ input.S:
glob.((B ∨ P) ∧ (¬B ∨ wp.S.Q))
= {def. of glob}
glob.B ∪ glob.P ∪ glob.(wp.S.Q)
⊆ {proviso}
glob.B ∪ glob.P ∪ (glob.Q \ ddef.S) ∪ input.S
⊆ {set theory}
glob.B ∪ glob.P ∪ glob.Q ∪ input.S
⊆ {proviso}
glob.B ∪ glob.P ∪ glob.P ∪ glob.B ∪ input.S ∪ input.S
= {set theory}
glob.P ∪ glob.B ∪ input.S
RE3: "[wp.DO.P ≡ P ∧ wp.DO.true]" provided def.DO ∩ glob.P = ∅ and [wp.S.P ≡ P ∧ wp.S.true]
Here, recall that wp.DO.P is equivalent to (∃i: 0 ≤ i: k^i.false) with [k.Q ≡ (B ∨ P) ∧
(¬B ∨ wp.S.Q)] and def.DO ≜ def.S. Furthermore, note that wp.DO.true is equivalent to
(∃i: 0 ≤ i: l^i.false) with [l.Q ≡ ¬B ∨ wp.S.Q] due to true being the zero element of ∨ as
well as the identity element of ∧. Hence, we need to prove:
[(∃i: 0 ≤ i: k^i.false) ≡ P ∧ (∃i: 0 ≤ i: l^i.false)] (A.1)
and we do so by proving (by induction) for all j (≥ 0):
[(∃i: 0 ≤ i ≤ j: k^i.false) ≡ P ∧ (∃i: 0 ≤ i ≤ j: l^i.false)]
again, provided def.DO ∩ glob.P = ∅ and [wp.S.P ≡ P ∧ wp.S.true]:
Base case: j = 0
P ∧ (∃i: 0 ≤ i ≤ 0: l^i.false)
= {one point rule}
P ∧ l^0.false
= {definition of function iteration}
P ∧ false
= {zero element of ∧}
false
= {definition of function iteration}
k^0.false
= {one point rule}
(∃i: 0 ≤ i ≤ 0: k^i.false)
Step: j+1 (with 0 ≤ j)
(∃i: 0 ≤ i ≤ j+1: k^i.false)
= {splitting the range}
(∃i: (0 = i) ∨ (1 ≤ i ≤ j+1): k^i.false)
= {one point rule and transforming the dummy}
k^0.false ∨ (∃i: 0 ≤ i ≤ j: k^{i+1}.false)
= {def. of func. it., twice}
false ∨ (∃i: 0 ≤ i ≤ j: k.(k^i.false))
= {id. elem. of ∨; k is pos. disj. (and even universally so)}
k.(∃i: 0 ≤ i ≤ j: k^i.false)
= {def. of k}
(B ∨ P) ∧ (¬B ∨ wp.S.(∃i: 0 ≤ i ≤ j: k^i.false))
= {ind. hypo.}
(B ∨ P) ∧ (¬B ∨ wp.S.(P ∧ (∃i: 0 ≤ i ≤ j: l^i.false)))
= {wp.S is fin. conj. (and even univ. so)}
(B ∨ P) ∧ (¬B ∨ (wp.S.P ∧ wp.S.(∃i: 0 ≤ i ≤ j: l^i.false)))
= {proviso}
(B ∨ P) ∧ (¬B ∨ (P ∧ wp.S.true ∧ wp.S.(∃i: 0 ≤ i ≤ j: l^i.false)))
= {wp.S is fin. conj. (and even univ. so)}
(B ∨ P) ∧ (¬B ∨ (P ∧ wp.S.(true ∧ (∃i: 0 ≤ i ≤ j: l^i.false))))
= {id. elem. of ∧}
(B ∨ P) ∧ (¬B ∨ (P ∧ wp.S.(∃i: 0 ≤ i ≤ j: l^i.false)))
= {∨ distributes over ∧}
(B ∨ P) ∧ (¬B ∨ P) ∧ (¬B ∨ wp.S.(∃i: 0 ≤ i ≤ j: l^i.false))
= {pred. calc.: [(C ∨ D) ∧ (¬C ∨ D) ≡ D]}
P ∧ (¬B ∨ wp.S.(∃i: 0 ≤ i ≤ j: l^i.false))
= {def. of l}
P ∧ l.(∃i: 0 ≤ i ≤ j: l^i.false)
= {id. elem. of ∨; l is pos. disj. (and even universally so)}
P ∧ (false ∨ (∃i: 0 ≤ i ≤ j: l.(l^i.false)))
= {def. of func. it., twice}
P ∧ (l^0.false ∨ (∃i: 0 ≤ i ≤ j: l^{i+1}.false))
= {one point rule and transforming the dummy}
P ∧ (∃i: (0 = i) ∨ (1 ≤ i ≤ j+1): l^i.false)
= {splitting the range}
P ∧ (∃i: 0 ≤ i ≤ j+1: l^i.false)
RE4: "ddef.DO ⊆ def.DO" provided ddef.S ⊆ def.S: Trivially so since ddef.DO ≜ ∅.
RE5: "glob.DO = def.DO ∪ input.DO" provided glob.S = def.S ∪ input.S
def.DO ∪ input.DO
= {def and input of DO}
def.S ∪ glob.B ∪ input.S
= {proviso}
glob.B ∪ glob.S
= {glob of DO}
glob.DO
This completes our subset of Dijkstra and Scholten's guarded commands [13]. The following
constructs are extensions borrowed from Morgan [45], with some adaptations as our context re-
quires. Since those constructs were not present in [13], we shall also be responsible for proving
requirement RE1 (i.e. universal disjunctivity).
A.2 Extended language
Assertions
[wp.{B}.P ≡ B ∧ P] for all P;
def.{B} ≜ ∅ ;
ddef.{B} ≜ ∅ ;
input.{B} ≜ glob.B ; and
glob.{B} ≜ glob.B .
RE1: "wp.{B}" is universally disjunctive
wp.{B}.(∃P: P ∈ Ps: P)
= {wp of assertions}
B ∧ (∃P: P ∈ Ps: P)
= {∧ distributes over ∃ (3.11)}
(∃P: P ∈ Ps: B ∧ P)
= {again, wp of assertions}
(∃P: P ∈ Ps: wp.{B}.P)
RE2: "glob.(wp.{B}.P) ⊆ (glob.P \ ddef.{B}) ∪ input.{B}"
glob.(wp.{B}.P)
= {wp of assertions}
glob.(B ∧ P)
= {def. of glob}
glob.B ∪ glob.P
= {set theory and ddef and input of assertions}
(glob.P \ ddef.{B}) ∪ input.{B}
RE3: "[wp.{B}.P ≡ P ∧ wp.{B}.true]" provided def.{B} ∩ glob.P = ∅
P ∧ wp.{B}.true
= {wp of assertions}
P ∧ B ∧ true
= {identity element of ∧}
P ∧ B
= {wp of assertions}
wp.{B}.P
RE4: "ddef.{B} ⊆ def.{B}": Trivially so, since ddef.{B} = def.{B} = ∅.
RE5: "glob.{B} = def.{B} ∪ input.{B}"
glob.{B}
= {glob of assertions}
glob.B
= {def and input of assertions}
def.{B} ∪ input.{B}
Local variables
|[var L; S]| ≜ L′ := L; S; L := L′ where L′ is fresh.
[wp.|[var L; S]|.P ≡ (wp.S.P[L\L′])[L′\L]] for all P with glob.P ∩ L′ = ∅; or the simpler
[wp.|[var L; S]|.Q ≡ wp.S.Q] for all Q with glob.Q ∩ (L,L′) = ∅
def.|[var L; S]| ≜ def.S \ L ;
ddef.|[var L; S]| ≜ ddef.S \ L ;
input.|[var L; S]| ≜ input.S ; and
glob.|[var L; S]| ≜ (def.S \ L) ∪ input.S ; or
glob.|[var L; S]| ≜ glob.S \ (L \ input.S) .
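(For illustration, our example: with S = t := x; x := y; y := t, a swap via the local t, we get def.|[var t; S]| = {t,x,y} \ {t} = {x,y} and likewise ddef.|[var t; S]| = {x,y}; input.|[var t; S]| = input.S = {x,y}, so glob.|[var t; S]| = {x,y}, with the local t hidden from the outside.)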
RE1: "wp.|[var L; S]|" is universally disjunctive, provided wp.S is
wp.|[var L; S]|.(∃P: P ∈ Ps: P)
= {wp of locals: L ∩ glob.Ps = ∅}
wp.S.(∃P: P ∈ Ps: P)
= {proviso}
(∃P: P ∈ Ps: wp.S.P)
= {again, wp of locals}
(∃P: P ∈ Ps: wp.|[var L; S]|.P)
RE2: "glob.(wp.|[var L; S]|.P) ⊆ (glob.P \ ddef.|[var L; S]|) ∪ input.|[var L; S]|"
provided glob.P ∩ L = ∅ and glob.(wp.S.P) ⊆ (glob.P \ ddef.S) ∪ input.S
glob.(wp.|[var L; S]|.P)
= {wp of locals: L ∩ glob.P = ∅}
glob.(wp.S.P)
⊆ {proviso and property of '\'}
(glob.P \ ddef.S) ∪ input.S
= {ddef of locals and set theory: L ∩ glob.P = ∅; input of locals}
(glob.P \ ddef.|[var L; S]|) ∪ input.|[var L; S]|
RE3: "[wp.|[var L; S]|.P ≡ P ∧ wp.|[var L; S]|.true]" provided def.|[var L; S]| ∩ glob.P = ∅
wp.|[var L; S]|.P
= {wp of locals: L ∩ glob.P = ∅}
wp.S.P
= {proviso: (def.S \ L) ∩ glob.P = ∅ but L ∩ glob.P = ∅ so
def.S ∩ glob.P = ∅}
P ∧ wp.S.true
= {wp of locals: glob.true = ∅}
P ∧ wp.|[var L; S]|.true
RE4: "ddef.|[var L; S]| ⊆ def.|[var L; S]|" provided ddef.S ⊆ def.S: Indeed ddef.S \ L ⊆ def.S \ L due to the proviso and set theory.
RE5: "glob.|[var L; S]| = def.|[var L; S]| ∪ input.|[var L; S]|"
glob.|[var L; S]|
= {glob of locals}
(def.S \ L) ∪ input.S
= {def and input of locals}
def.|[var L; S]| ∪ input.|[var L; S]|
Live variables
Enclosing a statement with liveness information (e.g. S[live V]) guarantees only definitions
of the live variables V may be observable on exit from S. We define
S[live V] ≜ |[var L; S]| where L := def.S \ V.
Since this definition is by transformation (to another, existing language construct), there is
no need to prove any of the requirements, as they can be inferred. Similarly, there is no need to
define variable properties, as those can be calculated. Thus, the semantics and properties can be
derived from those of local variables, as is summarised in the following. For a given statement S,
set of variables V, a corresponding set L := def.S \ V and fresh L′, we have:
[wp.S[live V].P ≡ (wp.S.P[L\L′])[L′\L]] for all P with glob.P ∩ L′ = ∅; or the simpler
[wp.S[live V].Q ≡ wp.S.Q] for all Q with glob.Q ∩ (L,L′) = ∅
def.S[live V] ≜ def.S ∩ V ;
ddef.S[live V] ≜ ddef.S ∩ V ;
input.S[live V] ≜ input.S ; and
glob.S[live V] ≜ (def.S ∩ V) ∪ input.S .
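(For illustration, our example: with S = s := s+i; i := i+1 and V = {s}, we get L = def.S \ V = {i}, so S[live {s}] = |[var i; S]|; accordingly def.S[live {s}] = {s,i} ∩ {s} = {s}, while input.S[live {s}] = input.S = {s,i}: the final value of i is no longer observable, but its initial value is still read.)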
Appendix B
Laws of Program Manipulation
B.1 Manipulating core statements
Law 1. Let X,Y,E1,E2 be two sets of variables and two sets of expressions, respectively; then
X := E1; Y := E2 = X,Y := E1,E2
provided X ∩ (Y ∪ glob.E2) = ∅.
Proof.
wp.X:= E1; Y:= E2.P
= {wp of ';' and twice ':='}
(X:= E1).((Y:= E2).P)
= {merge subs: proviso}
(X,Y:= E1,E2).P
= {wp of ':='}
wp.X,Y:= E1,E2.P.
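(As a usage note, our example: x := 1; y := z merges into x,y := 1,z, since x ∉ {y} ∪ glob.z; by contrast, x := 1; y := x violates the proviso, as x ∈ glob.x, and indeed the sequential composition assigns 1 to y whereas the simultaneous assignment x,y := 1,x would use the initial value of x.)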
Law 2. Let S,X be a statement and set of variables, respectively; then
S = S; X := X.
Proof. We first observe for all P
wp.S; X:= X.P
= {wp of ';' and ':='}
wp.S.((X:= X).P)
= {remove redundant self sub.}
wp.S.P.
We next observe for all Q with glob.Q ∩ def.S = ∅ (i.e. L ∩ glob.Q = ∅)
wp.|[var L; S]|.Q
= {wp of locals: choose L′ ∉ glob.(L,S,Q)}
(L′:= L).(wp.S.((L:= L′).Q))
= {remove redundant sub.: L ∩ glob.Q = ∅ (proviso)}
(L′:= L).(wp.S.Q)
= {remove redundant sub.: glob.(wp.S.Q) ⊆ glob.(S,Q) due to RE2}
wp.S.Q.
Law 3. Let S,S1,S2,B be three statements and a boolean expression, respectively; then
S; if B then S1 else S2 = if B then S; S1 else S; S2
provided def.S ∩ glob.B = ∅.
Proof.
wp.S; if B then S1 else S2.P
= {wp of ';'}
wp.S.(wp.if B then S1 else S2.P)
= {wp of IF}
wp.S.((B ⇒ wp.S1.P) ∧ (¬B ⇒ wp.S2.P))
= {wp.S is fin. conj.}
wp.S.(B ⇒ wp.S1.P) ∧ wp.S.(¬B ⇒ wp.S2.P)
= {Lemma B.1, twice: proviso (see below)}
(B ⇒ wp.S.(wp.S1.P)) ∧ (¬B ⇒ wp.S.true)
∧ (¬B ⇒ wp.S.(wp.S2.P)) ∧ (B ⇒ wp.S.true)
= {pred. calc.}
(B ⇒ wp.S.(wp.S1.P) ∧ wp.S.true) ∧ (¬B ⇒ wp.S.true ∧ wp.S.(wp.S2.P))
= {absorb termination (3.14) and wp of ';', twice}
(B ⇒ wp.S;S1.P) ∧ (¬B ⇒ wp.S;S2.P)
= {wp of IF}
wp.if B then S; S1 else S; S2.P.
Lemma B.1. Let S,P,Q be a statement and two predicates, respectively, with def.S ∩ glob.P = ∅;
then
[wp.S.(P ⇒ Q) ≡ (P ⇒ wp.S.Q) ∧ (¬P ⇒ wp.S.true)] .
Proof.
wp.S.(P ⇒ Q)
= {pred. calc.; wp.S is fin. disj.}
wp.S.(¬P) ∨ wp.S.Q
= {RE3: proviso}
(¬P ∧ wp.S.true) ∨ wp.S.Q
= {pred. calc.}
(¬P ∨ wp.S.Q) ∧ (wp.S.true ∨ wp.S.Q)
= {pred. calc.; termination absorbs (3.15)}
(P ⇒ wp.S.Q) ∧ wp.S.true
= {pred. calc.}
(P ⇒ wp.S.Q) ∧ (P ⇒ wp.S.true) ∧ (¬P ⇒ wp.S.true)
= {pred. calc.}
(P ⇒ wp.S.Q ∧ wp.S.true) ∧ (¬P ⇒ wp.S.true)
= {absorb termination (3.14)}
(P ⇒ wp.S.Q) ∧ (¬P ⇒ wp.S.true).
Law 4. Let S1,S2,S3,B1 be three statements and a boolean expression, respectively; then
if B1 then S1 else S2; S3 = if B1 then S1; S3 else S2; S3.
Proof.
wp.if B1 then S1; S3 else S2; S3.P
= {wp of IF}
(B1 ⇒ wp.S1;S3.P) ∧ (¬B1 ⇒ wp.S2;S3.P)
= {wp of ';', twice}
(B1 ⇒ wp.S1.(wp.S3.P)) ∧ (¬B1 ⇒ wp.S2.(wp.S3.P))
= {wp of IF}
wp.if B1 then S1 else S2.(wp.S3.P)
= {wp of ';'}
wp.if B1 then S1 else S2; S3.P.
Law 5. Let S1,X,B,E be any statement, set of variables, boolean expression and set of
expressions, respectively; then
{X=E}; while B do S1; (X := E) od = {X=E}; while B do S1 od; (X := E)
provided X ∩ (glob.B ∪ input.S1 ∪ glob.E) = ∅.
Proof.
wp.{X=E}; while B do S1 od; (X:= E).P
= {wp of ';', twice}
wp.{X=E}.(wp.while B do S1 od.(wp.X:= E.P))
= {wp of assertions and ':='}
(X=E) ∧ wp.while B do S1 od.((X:= E).P)
= {wp of DO with [k.Q ≡ (B ∨ (X:= E).P) ∧ (¬B ∨ wp.S1.Q)]}
(X=E) ∧ (∃i: 0 ≤ i: k^i.false)
= {∧ distributes over ∃ (3.11)}
(∃i: 0 ≤ i: (X=E) ∧ k^i.false)
= {see below; [l.Q ≡ (B ∨ P) ∧ (¬B ∨ wp.S1;(X:= E).Q)]}
(∃i: 0 ≤ i: (X=E) ∧ l^i.false)
= {∧ distributes over ∃ (3.11)}
(X=E) ∧ (∃i: 0 ≤ i: l^i.false)
= {wp of DO with l as above}
(X=E) ∧ wp.while B do S1; (X:= E) od.P
= {wp of assertions and ';'}
wp.{X=E}; while B do S1; (X:= E) od.P.
We finish by proving for the middle step above, by induction, [(X=E) ∧ k^i.false ≡
(X=E) ∧ l^i.false] for all i, provided X ∩ (glob.B ∪ input.S1 ∪ glob.E) = ∅.
For the base case i = 0, we observe that indeed [(X=E) ∧ false ≡ (X=E) ∧ false] (recall the
definition of function iteration). Then, for the induction step, we assume [(X=E) ∧ k^i.false ≡
(X=E) ∧ l^i.false] and prove [(X=E) ∧ k^{i+1}.false ≡ (X=E) ∧ l^{i+1}.false].
(X=E) ∧ k^{i+1}.false
= {def. of func. it.}
(X=E) ∧ k.(k^i.false)
= {def. of k}
(X=E) ∧ (B ∨ (X:= E).P) ∧ (¬B ∨ wp.S1.(k^i.false))
= {replace equals with equals}
(X=E) ∧ (B ∨ (X:= X).P) ∧ (¬B ∨ wp.S1.(k^i.false))
= {remove redundant self-sub.}
(X=E) ∧ (B ∨ P) ∧ (¬B ∨ wp.S1.(k^i.false))
= {intro. redundant sub.: X ∩ glob.(k^i.false) = ∅ due to RE2 and
X ∩ (glob.B ∪ input.S1 ∪ glob.E) = ∅ (proviso)}
(X=E) ∧ (B ∨ P) ∧ (¬B ∨ wp.S1.((X:= E).(k^i.false)))
= {ind. hypo.}
(X=E) ∧ (B ∨ P) ∧ (¬B ∨ wp.S1.((X:= E).(l^i.false)))
= {wp of ';' and ':='}
(X=E) ∧ (B ∨ P) ∧ (¬B ∨ wp.S1;(X:= E).(l^i.false))
= {def. of l}
(X=E) ∧ l.(l^i.false)
= {def. of func. it.}
(X=E) ∧ l^{i+1}.false.
Law 6. Let X,E be any set of variables and set of expressions, respectively; then
{X=E} = {X=E}; X := E.
Proof.
wp.{X=E}; X:= E.P
= {wp of ';', assertions and assignments}
(X=E) ∧ ((X:= E).P)
= {remove redundant sub.}
(X=E) ∧ P
= {wp of assertions}
wp.{X=E}.P.
B.2 Assertion-based program analysis
B.2.1 Introduction of assertions
Law 7. Let X,Y,E1,E2 be two sets of variables and two sets of expressions, respectively; then
X,Y := E1,E2 = X,Y := E1,E2; {Y=E2}
provided (X,Y) ∩ glob.E2 = ∅.
Proof.
wp.X,Y:= E1,E2; {Y=E2}.P
= {wp of ';'}
wp.X,Y:= E1,E2.(wp.{Y=E2}.P)
= {wp of assertions}
wp.X,Y:= E1,E2.((Y=E2) ∧ P)
= {wp.X,Y:= E1,E2 is fin. conj.}
wp.X,Y:= E1,E2.(Y=E2) ∧ wp.X,Y:= E1,E2.P
= {wp of ':='}
(X,Y:= E1,E2).(Y=E2) ∧ wp.X,Y:= E1,E2.P
= {normal sub.: proviso}
(E2 = E2) ∧ wp.X,Y:= E1,E2.P
= {id. elem. of ∧}
wp.X,Y:= E1,E2.P.
Law 8. Let X,X′,E be (same length) lists of variables and expressions, respectively, with
X ∩ X′ = ∅; then
X,X′ := E,E = X,X′ := E,E; {X=X′}.
Proof.
wp.X,X′:= E,E; {X=X′}.P
= {wp of ';'}
wp.X,X′:= E,E.(wp.{X=X′}.P)
= {wp of assertions}
wp.X,X′:= E,E.((X=X′) ∧ P)
= {wp is fin. conj.}
wp.X,X′:= E,E.(X=X′) ∧ wp.X,X′:= E,E.P
= {wp of ':='}
(X,X′:= E,E).(X=X′) ∧ wp.X,X′:= E,E.P
= {normal sub.: proviso}
(E=E) ∧ wp.X,X′:= E,E.P
= {id. elem. of ∧}
wp.X,X′:= E,E.P.
Law 9. Let S1,B1,B2 be any given statement and two boolean expressions, respectively;
then
while B1 do S1 od = while B1 do {B2}; S1 od
provided [B1 ⇒ B2].
Proof. In order to prove that the two loop statements are equivalent, it suffices to show for all Q
[¬B1 ∨ wp.S1.Q ≡ ¬B1 ∨ wp.{B2}; S1.Q].
¬B1 ∨ wp.{B2}; S1.Q
= {wp of ';' and assertions}
¬B1 ∨ (B2 ∧ wp.S1.Q)
= {pred. calc.}
(¬B1 ∨ B2) ∧ (¬B1 ∨ wp.S1.Q)
= {proviso}
true ∧ (¬B1 ∨ wp.S1.Q)
= {id. elem.}
¬B1 ∨ wp.S1.Q.
B.2.2 Propagation of assertions
Law 10. Let S,B be a statement and boolean expression, respectively; then
{wp.S.B}; S = S; {B}.
Proof. We observe for all P:
wp.S; {B}.P
= {wp of ';'}
wp.S.(wp.{B}.P)
= {wp of assertions}
wp.S.(B ∧ P)
= {conj. of wp.S}
wp.S.B ∧ wp.S.P
= {wp of assertions}
wp.{wp.S.B}.(wp.S.P)
= {wp of ';'}
wp.{wp.S.B}; S.P.
Law 11. Let S,B be a statement and boolean expression, respectively; then
{B}; S = S; {B}
provided def.S ∩ glob.B = ∅.
Proof.
S; {B}
= {swap statements (Law 5.7): def of assertions is empty and
def.S ∩ glob.B = ∅ (proviso)}
{B}; S.
The following law will be used for propagating assertions forward into branches of an IF as
well as backward ahead of an IF.
Law 12. Let S1,S2,B1,B2 be two statements and two boolean expressions, respectively;
then
{B1}; if B2 then S1 else S2 = if B2 then {B1}; S1 else {B1}; S2.
Proof.
{B1}; if B2 then S1 else S2
= {Law 3: def.{B1} = ∅ for any assertion}
if B2 then {B1}; S1 else {B1}; S2.
The next law will allow the propagation of assertions forward to the (head of the) body of a
loop.
Law 13. Let S,B1,B2,B3,B4 be a statement and four boolean expressions, respectively;
then
{B1}; while B2 do S; {B3} od = {B1}; while B2 do {B4}; S; {B3} od
provided [B1 ⇒ B4] and [B3 ⇒ B4].
Proof.
{B1}; while B2 do S; {B3} od
= {Law 14: proviso}
{B1}; while B2 ∧ B4 do S; {B3} od
= {Law 9: [B2 ∧ B4 ⇒ B4]}
{B1}; while B2 ∧ B4 do {B4}; S; {B3} od
= {Law 14 again, right to left: proviso}
{B1}; while B2 do {B4}; S; {B3} od.
Law 14. Let S,B1,B2,B3,B4 be a statement and four boolean expressions, respectively;
then
{B1}; while B2 do S; {B3} od = {B1}; while B2 ∧ B4 do S; {B3} od
provided [B1 ⇒ B4] and [B3 ⇒ B4].
Proof.
{B1}; while B2 do S; {B3} od
= {proviso and pred. calc.}
{B1 ∧ B4}; while B2 do S; {B3 ∧ B4} od
= {Law 15}
{B1}; {B4}; while B2 do S; {B3}; {B4} od
= {see below}
{B1}; {B4}; while B2 ∧ B4 do S; {B3}; {B4} od
= {Law 15}
{B1 ∧ B4}; while B2 ∧ B4 do S; {B3 ∧ B4} od
= {proviso and pred. calc.}
{B1}; while B2 ∧ B4 do S; {B3} od.
So, we are left to prove the middle step above, simplified to
{B4}; while B2 do S1; {B4} od = {B4}; while B2 ∧ B4 do S1; {B4} od.
wp.{B4}; while B2 do S1; {B4} od.P
= {wp of ';' and assertions}
B4 ∧ wp.while B2 do S1; {B4} od.P
= {wp of DO with [k.Q ≡ (B2 ∨ P) ∧ (¬B2 ∨ wp.S1;{B4}.Q)]}
B4 ∧ (∃i: 0 ≤ i: k^i.false)
= {∧ distributes over ∃ (3.11)}
(∃i: 0 ≤ i: B4 ∧ k^i.false)
= {see below; [l.Q ≡ ((B2 ∧ B4) ∨ P) ∧ (¬(B2 ∧ B4) ∨ wp.S1;{B4}.Q)]}
(∃i: 0 ≤ i: B4 ∧ l^i.false)
= {∧ distributes over ∃ (3.11)}
B4 ∧ (∃i: 0 ≤ i: l^i.false)
= {wp of DO with l as above}
B4 ∧ wp.while B2 ∧ B4 do S1; {B4} od.P
= {wp of ';' and assertions}
wp.{B4}; while B2 ∧ B4 do S1; {B4} od.P.
We complete the proof by showing for the middle step above [B4 ∧ k.Q ≡ B4 ∧ l.Q] for all Q.
B4 ∧ l.Q
= {def. of l}
B4 ∧ ((B2 ∧ B4) ∨ P) ∧ (¬(B2 ∧ B4) ∨ wp.S1;{B4}.Q)
= {pred. calc.: [A ∧ ((A ∧ B) ∨ C) ≡ A ∧ (B ∨ C)]}
B4 ∧ (B2 ∨ P) ∧ (¬(B2 ∧ B4) ∨ wp.S1;{B4}.Q)
= {pred. calc.: de Morgan}
B4 ∧ (B2 ∨ P) ∧ (¬B2 ∨ ¬B4 ∨ wp.S1;{B4}.Q)
= {pred. calc.: [A ∧ (¬A ∨ B) ≡ A ∧ B]}
B4 ∧ (B2 ∨ P) ∧ (¬B2 ∨ wp.S1;{B4}.Q)
= {def. of k}
B4 ∧ k.Q.
Law 15. Let B1,B2 be two boolean expressions; then
{B1 ∧ B2} = {B1}; {B2}.
Proof.
wp.{B1}; {B2}.P
= {wp of ';' and assertions}
B1 ∧ wp.{B2}.P
= {wp of assertions}
B1 ∧ B2 ∧ P
= {wp of assertions}
wp.{B1 ∧ B2}.P.
Law 16. Let S,B1,B2 be a statement and two boolean expressions, respectively; then
{B1}; while B2 do S od = {B1}; while B2 do {B1}; S od
provided glob.B1 ∩ def.S = ∅.
Proof.
wp.{B1}; while B2 do {B1}; S od.P
= {wp of ';' and assertions}
B1 ∧ wp.while B2 do {B1}; S od.P
= {Law 11: glob.B1 ∩ def.S = ∅ (proviso)}
B1 ∧ wp.while B2 do S; {B1} od.P
= {wp of DO with [k.Q ≡ (B2 ∨ P) ∧ (¬B2 ∨ wp.S;{B1}.Q)]}
B1 ∧ (∃i: 0 ≤ i: k^i.false)
= {∧ distributes over ∃ (3.11)}
(∃i: 0 ≤ i: B1 ∧ k^i.false)
= {see below; [l.Q ≡ (B2 ∨ P) ∧ (¬B2 ∨ wp.S.Q)]}
(∃i: 0 ≤ i: B1 ∧ l^i.false)
= {∧ distributes over ∃ (3.11)}
B1 ∧ (∃i: 0 ≤ i: l^i.false)
= {wp of DO with l as above}
B1 ∧ wp.while B2 do S od.P
= {wp of assertions and ';'}
wp.{B1}; while B2 do S od.P.
We finish by proving for the middle step above, by induction, [B1 ∧ k^i.false ≡ B1 ∧
l^i.false] for all i, provided glob.B1 ∩ def.S = ∅.
For the base case i = 0, we observe that indeed [B1 ∧ false ≡ B1 ∧ false] (recall the definition
of function iteration). Then, for the induction step, we assume [B1 ∧ k^i.false ≡ B1 ∧ l^i.false] and
prove [B1 ∧ k^{i+1}.false ≡ B1 ∧ l^{i+1}.false].
B1 ∧ k^{i+1}.false
= {def. of func. it.}
B1 ∧ k.(k^i.false)
= {def. of k}
B1 ∧ (B2 ∨ P) ∧ (¬B2 ∨ wp.S;{B1}.(k^i.false))
= {wp of ';' and assertions}
B1 ∧ (B2 ∨ P) ∧ (¬B2 ∨ wp.S.(B1 ∧ k^i.false))
= {ind. hypo.}
B1 ∧ (B2 ∨ P) ∧ (¬B2 ∨ wp.S.(B1 ∧ l^i.false))
= {wp.S is fin. conj.}
B1 ∧ (B2 ∨ P) ∧ (¬B2 ∨ (wp.S.B1 ∧ wp.S.(l^i.false)))
= {RE3: proviso}
B1 ∧ (B2 ∨ P) ∧ (¬B2 ∨ (B1 ∧ wp.S.true ∧ wp.S.(l^i.false)))
= {absorb termination (3.14)}
B1 ∧ (B2 ∨ P) ∧ (¬B2 ∨ (B1 ∧ wp.S.(l^i.false)))
= {pred. calc.}
B1 ∧ (B2 ∨ P) ∧ (¬B2 ∨ B1) ∧ (¬B2 ∨ wp.S.(l^i.false))
= {pred. calc.: absorption}
B1 ∧ (B2 ∨ P) ∧ (¬B2 ∨ wp.S.(l^i.false))
= {def. of l}
B1 ∧ l.(l^i.false)
= {def. of func. it.}
B1 ∧ l^{i+1}.false.
B.2.3 Substitution
Law 17. Let S1,S2,B be two statements and a boolean expression, respectively; let X,E be
a set of variables and a corresponding list of expressions; and let Y,Y′ be two sets of variables;
then
{Y=Y′}; X := E = {Y=Y′}; X := E[Y\Y′] ;
{Y=Y′}; IF = {Y=Y′}; IF′ ; and
{Y=Y′}; DO = {Y=Y′}; DO′
where IF := if B then S1 else S2,
IF′ := if B[Y\Y′] then S1 else S2,
DO := while B do S1; {Y=Y′} od
and DO′ := while B[Y\Y′] do S1; {Y=Y′} od.
Proof. Assignment:
wp.{Y=Y′}; X:= E[Y\Y′].P
= {wp of ';' and assertions}
(Y=Y′) ∧ wp.X:= E[Y\Y′].P
= {wp of ':='}
(Y=Y′) ∧ (X:= E[Y\Y′]).P
= {replace equals for equals}
(Y=Y′) ∧ (X:= E[Y\Y]).P
= {redundant self sub.}
(Y=Y′) ∧ (X:= E).P
= {wp of ':='}
(Y=Y′) ∧ wp.X:= E.P
= {wp of assertions and ';'}
wp.{Y=Y′}; X:= E.P.
IF:
wp.{Y=Y′}; if B[Y\Y′] then S1 else S2.P
= {wp of ';' and assertions}
(Y=Y′) ∧ wp.if B[Y\Y′] then S1 else S2.P
= {wp of IF}
(Y=Y′) ∧ (B[Y\Y′] ⇒ wp.S1.P) ∧ (¬B[Y\Y′] ⇒ wp.S2.P)
= {replace equals for equals}
(Y=Y′) ∧ (B[Y\Y] ⇒ wp.S1.P) ∧ (¬B[Y\Y] ⇒ wp.S2.P)
= {redundant self sub., twice}
(Y=Y′) ∧ (B ⇒ wp.S1.P) ∧ (¬B ⇒ wp.S2.P)
= {wp of IF}
(Y=Y′) ∧ wp.if B then S1 else S2.P
= {wp of assertions and ';'}
wp.{Y=Y′}; if B then S1 else S2.P.
DO:
wp.{Y=Y′}; while B do S1; {Y=Y′} od.P
= {wp of ';' and assertions}
(Y=Y′) ∧ wp.while B do S1; {Y=Y′} od.P
= {wp of DO with [k.Q ≡ (B ∨ P) ∧ (¬B ∨ wp.S1;{Y=Y′}.Q)]}
(Y=Y′) ∧ (∃i: 0 ≤ i: k^i.false)
= {∧ distributes over ∃ (3.11)}
(∃i: 0 ≤ i: (Y=Y′) ∧ k^i.false)
= {see below; [l.Q ≡ (B′ ∨ P) ∧ (¬B′ ∨ wp.S1;{Y=Y′}.Q)] with
B′ ≜ B[Y\Y′]}
(∃i: 0 ≤ i: (Y=Y′) ∧ l^i.false)
= {∧ distributes over ∃ (3.11)}
(Y=Y′) ∧ (∃i: 0 ≤ i: l^i.false)
= {wp of DO with l as above}
(Y=Y′) ∧ wp.while B[Y\Y′] do S1; {Y=Y′} od.P
= {wp of assertions and ';'}
wp.{Y=Y′}; while B[Y\Y′] do S1; {Y=Y′} od.P.
We finish by proving for the middle step above, by induction, [(Y=Y′) ∧ k^i.false ≡
(Y=Y′) ∧ l^i.false] for all i.
For the base case i = 0, we observe that indeed [(Y=Y′) ∧ false ≡ (Y=Y′) ∧ false] (recall the
definition of function iteration). Then, for the induction step, we assume [(Y=Y′) ∧ k^i.false ≡
(Y=Y′) ∧ l^i.false] and prove [(Y=Y′) ∧ k^{i+1}.false ≡ (Y=Y′) ∧ l^{i+1}.false].
(Y=Y′) ∧ l^{i+1}.false
= {def. of func. it.}
(Y=Y′) ∧ l.(l^i.false)
= {def. of l}
(Y=Y′) ∧ (B[Y\Y′] ∨ P) ∧ (¬B[Y\Y′] ∨ wp.S1;{Y=Y′}.(l^i.false))
= {replace equals for equals}
(Y=Y′) ∧ (B[Y\Y] ∨ P) ∧ (¬B[Y\Y] ∨ wp.S1;{Y=Y′}.(l^i.false))
= {redundant self sub., twice}
(Y=Y′) ∧ (B ∨ P) ∧ (¬B ∨ wp.S1;{Y=Y′}.(l^i.false))
= {wp of ';' and assertions}
(Y=Y′) ∧ (B ∨ P) ∧ (¬B ∨ wp.S1.((Y=Y′) ∧ l^i.false))
= {ind. hypo.}
(Y=Y′) ∧ (B ∨ P) ∧ (¬B ∨ wp.S1.((Y=Y′) ∧ k^i.false))
= {wp of ';' and assertions}
(Y=Y′) ∧ (B ∨ P) ∧ (¬B ∨ wp.S1;{Y=Y′}.(k^i.false))
= {def. of k}
(Y=Y′) ∧ k.(k^i.false)
= {def. of func. it.}
(Y=Y′) ∧ k^{i+1}.false.
Law 18. Let S1,S2,B be two statements and a boolean expression, respectively; let
X,X′,Y,Z,E1,E1′,E2,E3 be four lists of variables and corresponding lists of expressions; then
X,Y := E1,E2; Z := E3 = X,Y := E1,E2; Z := E3[Y\E2] ;
X,Y := E1,E2; IF = X,Y := E1,E2; IF′ ; and
X,Y := E1,E2; DO = X,Y := E1,E2; DO′
provided ((X,X′),Y) ∩ glob.E2 = ∅
where IF := if B then S1 else S2,
IF′ := if B[Y\E2] then S1 else S2,
DO := while B do S1; X′,Y := E1′,E2 od
and DO′ := while B[Y\E2] do S1; X′,Y := E1′,E2 od.
Proof.
X,Y:= E1,E2; Z:= E3
= {intro. following assertion (Law 7): (X,Y) ∩ glob.E2 = ∅ (proviso)}
X,Y:= E1,E2; {Y=E2}; Z:= E3
= {assertion-based sub. (Law 17) with Y′ := E2}
X,Y:= E1,E2; {Y=E2}; Z:= E3[Y\E2]
= {remove following assertion (Law 7)}
X,Y:= E1,E2; Z:= E3[Y\E2].
X,Y:= E1,E2; if B then S1 else S2
= {intro. following assertion (Law 7): (X,Y) ∩ glob.E2 = ∅ (proviso)}
X,Y:= E1,E2; {Y=E2}; if B then S1 else S2
= {assertion-based sub. (Law 17) with Y′ := E2}
X,Y:= E1,E2; {Y=E2}; if B[Y\E2] then S1 else S2
= {remove following assertion (Law 7)}
X,Y:= E1,E2; if B[Y\E2] then S1 else S2.
X,Y:= E1,E2; while B do S1; X′,Y := E1′,E2 od
= {intro. following assertion (Law 7), twice:
(X,Y) ∩ glob.E2 = ∅ and (X′,Y) ∩ glob.E2 = ∅ (proviso)}
X,Y:= E1,E2; {Y=E2}; while B do S1; X′,Y := E1′,E2; {Y=E2} od
= {assertion-based sub. (Law 17) with Y′ := E2}
X,Y:= E1,E2; {Y=E2};
while B[Y\E2] do S1; X′,Y := E1′,E2; {Y=E2} od
= {remove following assertion (Law 7), twice}
X,Y:= E1,E2; while B[Y\E2] do S1; X′,Y := E1′,E2 od.
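(As a usage note, our example: the first equivalence rewrites x,y := x+1,5; z := y+1 into x,y := x+1,5; z := 5+1, substituting the just-assigned constant for y; the proviso fails, however, for x,y := x+1,x; z := y+1, where glob.E2 = {x} meets the assignment's targets, and the substitution z := x+1 would wrongly pick up the new value of x.)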
B.3 Live variables analysis
B.3.1 Introduction and removal of liveness information
Law 19. Let S,V be any statement and set of variables, respectively, with def.S ⊆ V; then
S = S[live V].
Proof. We observe for all Q
wp.S[live V].Q
= {def. of live with coV := def.S \ V}
wp.|[var coV; S]|.Q
= {wp of locals: coV = ∅ due to proviso}
wp.S.Q.
B.3.2 Propagation of liveness information
Law 20. Let S1,S2,V1,V2 be any two statements and two sets of variables, respectively; then
(S1; S2)[live V1] = (S1[live V2]; S2[live V1])[live V1]
provided V2 = (V1 \ ddef.S2) ∪ input.S2.
Proof. We observe for all P (with glob.P ⊆ V1)
wp.(S1[live V2]; S2[live V1])[live V1].P
= {wp of live: glob.P ⊆ V1}
wp.S1[live V2]; S2[live V1].P
= {wp of ';'}
wp.S1[live V2].(wp.S2[live V1].P)
= {wp of live: glob.P ⊆ V1}
wp.S1[live V2].(wp.S2.P)
= {wp of live: glob.(wp.S2.P) ⊆ (V1 \ ddef.S2) ∪ input.S2
due to RE2 and the proviso}
wp.S1.(wp.S2.P)
= {wp of ';'}
wp.S1;S2.P
= {wp of live: glob.P ⊆ V1}
wp.(S1;S2)[live V1].P.
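(As a usage note, our example: with S2 = a := b and V1 = {a}, the proviso yields V2 = ({a} \ {a}) ∪ {b} = {b}, so in (S1; a := b)[live {a}] the first statement may be simplified under S1[live {b}]: only b need survive S1.)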
Law 21. Let B,S1,S2,V be any boolean expression, two statements and set of variables,
respectively; then
(if B then S1 else S2)[live V] = (if B then S1[live V] else S2[live V])[live V].
Proof. We observe for all P with glob.P ⊆ V
wp.(if B then S1[live V] else S2[live V])[live V].P
= {wp of live: glob.P ⊆ V}
wp.if B then S1[live V] else S2[live V].P
= {wp of IF}
(B ⇒ wp.S1[live V].P) ∧ (¬B ⇒ wp.S2[live V].P)
= {wp of live, twice: glob.P ⊆ V}
(B ⇒ wp.S1.P) ∧ (¬B ⇒ wp.S2.P)
= {wp of IF}
wp.if B then S1 else S2.P
= {wp of live: glob.P ⊆ V}
wp.(if B then S1 else S2)[live V].P.
Law 22. Let B,S,V1,V2 be any boolean expression, statement and two sets of variables,
respectively; then
(while B do S od)[live V1] = (while B do S[live V2] od)[live V1]
provided V2 = V1 ∪ glob.B ∪ input.S.
Proof. We observe for all P with glob.P ⊆ V1
wp.(while B do S[live V2] od)[live V1].P
= {wp of live: glob.P ⊆ V1}
wp.while B do S[live V2] od.P
= {wp of DO with [k.Q ≡ (B ∨ P) ∧ (¬B ∨ wp.S[live V2].Q)]}
(∃i: 0 ≤ i: k^i.false)
= {see below; [l.Q ≡ (B ∨ P) ∧ (¬B ∨ wp.S.Q)]}
(∃i: 0 ≤ i: l^i.false)
= {wp of DO with l as above}
wp.while B do S od.P
= {wp of live: glob.P ⊆ V1}
wp.(while B do S od)[live V1].P.
We go on by proving for the missing step above [k^i.false ≡ l^i.false] for all i. We begin
by observing [wp.S.Q ≡ wp.S[live V2].Q] for all Q with glob.Q ⊆ V2, due to wp of live.
And indeed glob.((B ∨ P) ∧ (¬B ∨ wp.S.Q)) ⊆ V2 for all such Q. This is due to the proviso
V2 = V1 ∪ glob.B ∪ input.S, the given glob.P ⊆ V1, and RE2.
B.3.3 Dead assignments: introduction and elimination
Law 23. Let S,V,X,Y,E1,E2 be any statement, three sets of variables and two sets of
expressions, respectively; then
(S; X := E1)[live V] = (S; X,Y := E1,E2)[live V]
provided Y ∩ (X ∪ V) = ∅.
Proof. We observe for all P with glob.P ⊆ V
wp.S; X,Y:= E1,E2.P
= {wp of ';' and ':='}
wp.S.(P[X,Y\E1,E2])
= {remove redundant sub.: Y ∩ glob.P = ∅}
wp.S.(P[X\E1])
= {wp of ';' and ':='}
wp.S; X:= E1.P.
Law 24. Let S,V,Y,E be any statement, two sets of variables and set of expressions,
respectively; then
S[live V] = (S; Y := E)[live V]
provided Y ∩ V = ∅.
Proof.
(S; Y:= E)[live V]
= {Law 23 with X := ∅}
S[live V].
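(As a usage note, our example: since y ∉ {x}, Law 24 gives (x := 1)[live {x}] = (x := 1; y := 2)[live {x}]; read right-to-left, this is dead-assignment elimination, removing the assignment to the non-live y.)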
Law 25. Let S,V,X,Y,E1,E2 be any statement, three sets of variables and two sets of
expressions, respectively; then
(X := E1; S)[live V] = (X,Y := E1,E2; S)[live V]
provided Y ∩ (X ∪ (V \ ddef.S) ∪ input.S) = ∅.
Proof. We observe for all P with glob.P ⊆ V
wp.X,Y:= E1,E2; S.P
= {wp of ';' and ':='}
(wp.S.P)[X,Y\E1,E2]
= {remove redundant sub.: Y ∩ glob.(wp.S.P) = ∅ due to proviso and RE2}
(wp.S.P)[X\E1]
= {wp of ':=' and ';'}
wp.X:= E1; S.P.
Law 26. Let B,S1,S2,Y,V,E be a boolean expression, two statements, two sets of variables
and a set of expressions, respectively; then
(S1; while B do S2 od)[live V] = (S1; while B do S2; (Y := E) od)[live V]
provided Y ∩ (V ∪ glob.B ∪ input.S2) = ∅.
Proof. We observe for all P with glob.P ⊆ V
wp.S1; while B do S2; (Y:= E) od.P
= {wp of ';'}
wp.S1.(wp.while B do S2; (Y:= E) od.P)
= {wp of DO with [k.Q ≡ (B ∨ P) ∧ (¬B ∨ wp.S2;(Y:= E).Q)]}
wp.S1.(∃i: 0 ≤ i: k^i.false)
= {see below; [l.Q ≡ (B ∨ P) ∧ (¬B ∨ wp.S2.Q)]}
wp.S1.(∃i: 0 ≤ i: l^i.false)
= {wp of DO with l as above}
wp.S1.(wp.while B do S2 od.P)
= {wp of ';'}
wp.S1; while B do S2 od.P.
We go on by proving for the middle step above [k^i.false ≡ l^i.false] for all i. Keeping in mind
(glob.(P,B) ∪ input.S2) ∩ Y = ∅, we begin by observing glob.(wp.S2.Q) ∩ Y = ∅ for all Q with glob.Q ∩ Y = ∅.
Thus glob.(k^i.false) ∩ Y = ∅ for all i. We now complete the proof by showing [wp.S2;(Y:=
E).Q ≡ wp.S2.Q] for all such Q.
wp.S2;(Y:= E).Q
= {wp of ';' and assignment}
wp.S2.(Q[Y\E])
= {remove redundant sub.: proviso}
wp.S2.Q.
Appendix C
Properties of Slides
Lemma C.1. Let S be any core statement and let V1,V2 be two sets of variables with V2 ∩
(V1 ∪ def.S) = ∅; then
(slides.S.V1) = (slides.S.(V1,V2)) .
Proof. We prove the equivalence by induction on the structure of S.
First, when V1 ∩ def.S = ∅ we get
slides.S.V1
= {def. of slides}
skip
= {def. of slides: (V1,V2) ∩ def.S = ∅}
slides.S.(V1,V2).
In the remaining cases we can assume V1 ∩ def.S ≠ ∅.
S = X:= E:
slides.X:= E.(V1,V2)
= {slides of ':=': X1 := X ∩ (V1,V2);
let E1 be the subset of E corresponding to X1}
X1 := E1
= {slides of ':=': X1 = X ∩ V1 due to V2 ∩ X = ∅}
slides.X:= E.V1.
S = S1;S2:
slides.S1;S2.(V1,V2)
= {slides of ';'}
(slides.S1.(V1,V2)); (slides.S2.(V1,V2))
= {ind. hypo., twice: def.S1 ⊆ def.S1;S2; similarly for S2}
(slides.S1.V1); (slides.S2.V1)
= {slides of ';'}
slides.S1;S2.V1.
S = if B then S1 else S2:
slides.if B then S1 else S2.(V1,V2)
= {slides of IF}
if B then slides.S1.(V1,V2) else slides.S2.(V1,V2)
= {ind. hypo., twice: def.S1 ⊆ def.IF; similarly for S2}
if B then slides.S1.V1 else slides.S2.V1
= {slides of IF}
slides.if B then S1 else S2.V1.
S = while B do S1 od:
slides.while B do S1 od.(V1,V2)
= {slides of DO}
while B do (slides.S1.(V1,V2)) od
= {ind. hypo.}
while B do (slides.S1.V1) od
= {slides of DO: def.S1 = def.DO}
slides.while B do S1 od.V1.
Theorem 8.1 (Slides Distribute over Union). Any pair of slides of a single core
statement, slides.S.V1 and slides.S.V2, is unifiable. Furthermore, we have
(slides.S.(V1 ∪ V2)) = ((slides.S.V1) ∪ (slides.S.V2)) .
Proof. We prove the equivalence by induction on the structure of S.
First, when V1 ∩ def.S = ∅ we get
slides.S.V1 ∪ slides.S.V2
= {def. of slides when def.S ∩ V1 = ∅}
skip ∪ slides.S.V2
= {def. of ∪}
slides.S.V2
= {Lemma C.1: (V1 \ V2) ∩ def.S = ∅}
slides.S.(V1 ∪ V2).
A similar derivation proves the case of V2 ∩ def.S = ∅. Thus in the remaining cases we are left to
assume both V1 ∩ def.S ≠ ∅ and V2 ∩ def.S ≠ ∅.
S = X:= E:
slides.X:= E.V1 ∪ slides.X:= E.V2
= {slides of ':=', twice: X1 := X ∩ V1, X2 := X ∩ V2 and
E1,E2 are the corresponding subsets of E}
X1 := E1 ∪ X2 := E2
= {def. of ∪ for assignments: X12 := X1 ∪ X2 and
E12, the corresponding union of E1 and E2, is well defined:
any X.i in X12 that is both in X1 and X2 is also in X and so both
E1.i and E2.i are the same (original) E.i}
X12 := E12
= {slides of ':=' and set theory: (X ∩ V1) ∪ (X ∩ V2) = X ∩ (V1 ∪ V2)}
slides.X:= E.(V1 ∪ V2).
S = S1;S2:
slides.S1;S2.V1 ∪ slides.S1;S2.V2
= {slides of ';', twice}
(slides.S1.V1); (slides.S2.V1)
∪ (slides.S1.V2); (slides.S2.V2)
= {def. of ∪ for ';'}
((slides.S1.V1) ∪ (slides.S1.V2)); ((slides.S2.V1) ∪ (slides.S2.V2))
= {ind. hypo., twice}
(slides.S1.(V1 ∪ V2)); (slides.S2.(V1 ∪ V2))
= {slides of ';'}
slides.S1;S2.(V1 ∪ V2).
S = if B then S1 else S2:
slides.if B then S1 else S2.V1 ∪ slides.if B then S1 else S2.V2
= {slides of IF, twice}
if B then slides.S1.V1 else slides.S2.V1 ∪ if B then slides.S1.V2 else slides.S2.V2
= {def. of ∪ for IF}
if B then ((slides.S1.V1) ∪ (slides.S1.V2)) else ((slides.S2.V1) ∪ (slides.S2.V2))
= {ind. hypo., twice}
if B then slides.S1.(V1 ∪ V2) else slides.S2.(V1 ∪ V2)
= {slides of IF}
slides.if B then S1 else S2.(V1 ∪ V2).
S = while B do S1 od:
slides.while B do S1 od.V1 ∪ slides.while B do S1 od.V2
= {slides of DO, twice}
while B do (slides.S1.V1) od ∪ while B do (slides.S1.V2) od
= {def. of ∪ for DO}
while B do ((slides.S1.V1) ∪ (slides.S1.V2)) od
= {ind. hypo.}
while B do (slides.S1.(V1 ∪ V2)) od
= {slides of DO}
slides.while B do S1 od.(V1 ∪ V2).
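(For illustration, our example: with S = x,y,z := 1,2,3 we have slides.S.{x} = (x := 1) and slides.S.{y} = (y := 2), whose union x,y := 1,2 is exactly slides.S.{x,y}, as the theorem promises; the unifiability part guarantees that two slides never disagree on a shared assignment, since both inherit it from S.)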
Lemma 9.5. Let S,V be any core statement and set of variables, respectively; then
def.(slides.S.V) ⊆ def.S.
Proof. The proof is by induction on the structure of S.
First, when V ∩ def.S = ∅ we get
def.(slides.S.V)
= {def. of slides: V ∩ def.S = ∅}
def.skip
= {def of skip}
∅
⊆ {set theory}
def.S.
In the remaining cases we can assume V ∩ def.S ≠ ∅.
S = X:= E:
def.(slides.X:= E.V)
= {slides of ':=': X1 := X ∩ V;
let E1 be the subset of E corresponding to X1}
def.X1 := E1
= {def of assignments}
X1
⊆ {def. of X1 and set theory}
X
= {def of assignments}
def.X:= E.
S = S1;S2:
def.(slides.S1;S2.V)
= {slides of ';'}
def.(slides.S1.V); (slides.S2.V)
= {def of ';'}
def.(slides.S1.V) ∪ def.(slides.S2.V)
⊆ {ind. hypo., twice}
def.S1 ∪ def.S2
= {def of ';'}
def.S1;S2.
S = if B then S1 else S2:
def.(slides.if B then S1 else S2.V)
= {slides of IF}
def.if B then slides.S1.V else slides.S2.V
= {def of IF}
def.(slides.S1.V) ∪ def.(slides.S2.V)
⊆ {ind. hypo., twice}
def.S1 ∪ def.S2
= {def of IF}
def.if B then S1 else S2.
S = while B do S1 od:
def.(slides.while B do S1 od.V)
= {slides of DO}
def.while B do slides.S1.V od
= {def of DO}
def.(slides.S1.V)
⊆ {ind. hypo.}
def.S1
= {def of DO}
def.while B do S1 od.
C.1 Lemmata for proving independent slides yield slices
Lemma 9.3. Let S be any core statement; let T be any slip of S and let V be any set of
variables; then
glob.(slides.T.V) ⊆ glob.(slides.S.V).
Proof. The proof is by induction over the structure of S.
First, when V ∩ def.T = ∅ we have slides.T.V = skip and the inclusion is trivial. Hence, in the
remaining cases we shall assume V ∩ def.T ≠ ∅ and the implied V ∩ def.S ≠ ∅, due to def.T ⊆ def.S
(Lemma 9.4).
S = X:= E: Here, T must be X:= E, and the inclusion is trivial.
This will be the case whenever T is S itself. Hence, in the remaining cases we shall assume T
is a proper slip of S.
S = S1;S2:
glob.(slides.S1;S2.V)
= {slides of ';'}
glob.((slides.S1.V); (slides.S2.V))
= {glob of ';'}
glob.(slides.S1.V) ∪ glob.(slides.S2.V)
⊇ {T must be a slip of either S1 or S2, to which the ind. hypo. applies}
glob.(slides.T.V).
S = if B then S1 else S2:
glob.(slides.if B then S1 else S2.V)
= {slides of IF}
glob.if B then slides.S1.V else slides.S2.V
= {glob of IF}
glob.B ∪ glob.(slides.S1.V) ∪ glob.(slides.S2.V)
⊇ {T must be a slip of either S1 or S2, to which the ind. hypo. applies}
glob.(slides.T.V).
S = while B do S1 od:
glob.(slides.while B do S1 od.V)
= {slides of DO}
glob.while B do slides.S1.V od
= {glob of DO}
glob.B ∪ glob.(slides.S1.V)
⊇ {T must be a slip of S1 and the ind. hypo. applies}
glob.(slides.T.V).
Lemma 9.4. Let S be a core statement; let T be any slip of S; then
def.T ⊆ def.S.
Proof. The proof is by induction over the structure of S.
S = X:= E: Here, T must be X:= E, and the inclusion is trivial.
S = S1;S2:
def.S1;S2
= {def of ';'}
def.S1 ∪ def.S2.
If T is either S, S1 or S2, the inclusion is trivial. Otherwise, it must be a slip of either S1 or
S2, and thus def.T must be included in either def.S1 or def.S2, respectively, due to the induction
hypothesis.
S = if B then S1 else S2:
def.if B then S1 else S2
= {def of IF}
def.S1 ∪ def.S2.
If T is either S, S1 or S2, the inclusion is trivial. Otherwise, it must be a slip of either S1 or
S2, and thus def.T must be included in either def.S1 or def.S2, respectively, due to the induction
hypothesis.
S = while B do S1 od:
def.while B do S1 od
= {def of DO}
def.S1.
If T is either S or S1, the inclusion is trivial. Otherwise, it must be a slip of S1, and thus def.T
must be included in def.S1 due to the induction hypothesis.
C.2 Slide independence and liveness
Theorem C.2. At any program point in a slide-independent statement, any variable of the
slide-independent set, or one that was not defined in the original statement, is live only if it
was live in the corresponding point of the original statement. That is, let S,XI,Y,LV be
any statement and three sets of variables, respectively, with XI slide independent in slides.S (i.e.
glob.(slides.S.XI) ∩ def.S ⊆ XI) and LV live-on-exit; let LV′ be the set of live variables at a certain
point in (slides.S.XI)[live LV] and let LV″ be the set of live variables at the corresponding point
of S[live LV]; then (LV′ \ Y) ⊆ (LV″ \ Y) provided def.S \ XI ⊆ Y.
Proof. We prove by induction over the structure of slides.S.XI a variation, stating that provided
LV1,LV2 are the live variables on exit from slides.S.XI and S, respectively, with (LV1 \ Y) ⊆
(LV2 \ Y), we also get (LV1′ \ Y) ⊆ (LV2′ \ Y) for the sets of live variables LV1′,LV2′ at any
corresponding points in (slides.S.XI)[live LV1] and S[live LV2].
First, for live-on-entry variables LV1′ := (LV1 \ ddef.(slides.S.XI)) ∪ input.(slides.S.XI) and
LV2′ := (LV2 \ ddef.S) ∪ input.S, we observe
LV1′ \ Y
= {def. of LV1′}
((LV1 \ ddef.(slides.S.XI)) ∪ input.(slides.S.XI)) \ Y
= {set theory}
((LV1 \ ((LV1 \ Y) ∩ ddef.(slides.S.XI))) ∪ input.(slides.S.XI)) \ Y
= {Lemma C.4, see below: (LV1 \ Y) ∩ def.S ⊆ XI}
((LV1 \ ((LV1 \ Y) ∩ ddef.S)) ∪ input.(slides.S.XI)) \ Y
= {set theory}
((LV1 \ ddef.S) ∪ input.(slides.S.XI)) \ Y
= {set theory: RE5 and glob.(slides.S.XI) ⊆ glob.S}
((LV1 \ ddef.S) ∪ ((glob.S \ Y) ∩ input.(slides.S.XI))) \ Y
⊆ {Lemma C.5, see below: (glob.S \ Y) ∩ def.S ⊆ XI}
((LV1 \ ddef.S) ∪ ((glob.S \ Y) ∩ input.S)) \ Y
= {set theory: input.S ⊆ glob.S}
((LV1 \ ddef.S) ∪ input.S) \ Y
⊆ {set theory: proviso (LV1 \ Y) ⊆ (LV2 \ Y)}
((LV2 \ ddef.S) ∪ input.S) \ Y
= {def. of LV2′}
LV2′ \ Y.
For internal points of slides.S.XI, we need to examine sequential composition, IF statements
and DO loops, assuming XI ∩ def.S ≠ ∅ (since otherwise we get slides.S.XI = skip, with no
internal points).
Recall that (due to Lemma 9.2) we know that XI, being slide independent in slides.S, is also
slide ind. in slides.T, for any slip T of S. Furthermore, since def.T ⊆ def.S for any slip T of
S, the proviso def.S \ XI ⊆ Y implies def.T \ XI ⊆ Y. Thus, the induction hypothesis can
be correctly applied to any slip, provided its respective live-on-exit variables LV1 and LV2 have
(LV1 \ Y) ⊆ (LV2 \ Y).
S = S1;S2:
The induction hypothesis on (slides.S2.XI)[live LV1] and S2[live LV2] ensures (LV1′ \ Y) ⊆
(LV2′ \ Y) for any point in slides.S2.XI, including its entry. Thus the ind. hypo. for
(slides.S1.XI)[live LV1′] and S1[live LV2′] yields the requested result.
S = if B then S1 else S2:
Here, the live-on-exit variables are also live-on-exit of both branches; the ind. hypo. then takes
care of those branches.
S = while B do S1 od:
The variables live-on-exit from the loop body are exactly the variables live-on-entry (to the DO
loop), LV1′,LV2′; and those are already known to have the requested (LV1′ \ Y) ⊆ (LV2′ \ Y).
Corollary C.3. Slide independence preserves non-simultaneous liveness. That is, let
S,VI,X,X1,Y be any statement and four sets of variables, respectively, with VI slide in-
dependent in slides.S (i.e. glob.(slides.S.VI) ∩ def.S ⊆ VI) and X non-simultaneously-live in
S[live X1,Y]; then X is also non-simultaneously-live in (slides.S.VI)[live X1,Y] provided X1 ⊆
X, |X1| ≤ 1, Y = glob.S \ X and X ∩ def.S ⊆ VI.
Proof. Let LV′ be the set of live variables at a certain point in (slides.S.VI)[live X1,Y]; let LV″
be the set of live variables at the corresponding point in S[live X1,Y]. From Theorem C.2 we
know (LV′ \ Y) ⊆ (LV″ \ Y), because VI is slide ind. in slides.S and def.S \ VI ⊆ Y (the latter
is due to def.S ⊆ glob.S and X ∩ def.S ⊆ VI). Thus (by set theory, recall Y = glob.S \ X and
note only variables from glob.S ∪ X may be live) we get X ∩ LV′ ⊆ X ∩ LV″. Combine this with
the non-simultaneous liveness of X in S (i.e. |X ∩ LV″| ≤ 1) and the non-simultaneous liveness
of X in (slides.S.VI)[live X1,Y] is proved.
Lemma C.4. Let S,V,X be any core statement and two sets of variables, respectively; then
(X ∩ ddef.S) = (X ∩ ddef.(slides.S.V))
provided X ∩ def.S ⊆ V.
Proof. First, for the case of V ∩ def.S = ∅ we observe
X ∩ ddef.(slides.S.V)
= {def. of slides when V ∩ def.S = ∅}
X ∩ ddef.skip
= {ddef of skip}
X ∩ ∅
= {set theory}
∅
= {X ∩ def.S = ∅ due to V ∩ def.S = ∅ and proviso;
hence X ∩ ddef.S = ∅ due to RE4}
X ∩ ddef.S.
When V ∩ def.S ≠ ∅, we prove the equivalence by induction over the structure of S.
S = V1,Y1 := E1,E2 with V1 ⊆ V and Y1 ∩ V = ∅:
X ∩ ddef.(slides.V1,Y1 := E1,E2.V)
= {slides of ':=': V1 ⊆ V and Y1 ∩ V = ∅}
X ∩ ddef.V1 := E1
= {ddef of ':='}
X ∩ V1
= {set theory: X ∩ Y1 = ∅ since X ∩ (V1,Y1) ⊆ V and V ∩ Y1 = ∅}
(X ∩ V1) ∪ (X ∩ Y1)
= {set theory: ∩ distributes over ∪}
X ∩ (V1,Y1)
= {ddef of ':='}
X ∩ ddef.V1,Y1 := E1,E2.
S = S1;S2:
X ∩ ddef.(slides.S1;S2.V)
= {slides of ';'}
X ∩ ddef.(slides.S1.V); (slides.S2.V)
= {ddef of ';'}
X ∩ (ddef.(slides.S1.V) ∪ ddef.(slides.S2.V))
= {set theory}
(X ∩ ddef.(slides.S1.V)) ∪ (X ∩ ddef.(slides.S2.V))
= {ind. hypo., twice: def.S1 ⊆ def.S1;S2; similarly for S2}
(X ∩ ddef.S1) ∪ (X ∩ ddef.S2)
= {set theory}
X ∩ (ddef.S1 ∪ ddef.S2)
= {ddef of ';'}
X ∩ ddef.S1;S2.
S = if B then S1 else S2:
X ∩ ddef.(slides.if B then S1 else S2.V)
= {slides of IF}
X ∩ ddef.if B then slides.S1.V else slides.S2.V
= {ddef of IF}
X ∩ (ddef.(slides.S1.V) ∩ ddef.(slides.S2.V))
= {set theory}
(X ∩ ddef.(slides.S1.V)) ∩ (X ∩ ddef.(slides.S2.V))
= {ind. hypo., twice: def.S1 ⊆ def.IF; similarly for S2}
(X ∩ ddef.S1) ∩ (X ∩ ddef.S2)
= {set theory}
X ∩ (ddef.S1 ∩ ddef.S2)
= {ddef of IF}
X ∩ ddef.if B then S1 else S2.
S = while B do S1 od:
X ∩ ddef.(slides.while B do S1 od.V)
= {slides of DO}
X ∩ ddef.while B do slides.S1.V od
= {ddef of DO}
X ∩ ∅
= {ddef of DO}
X ∩ ddef.while B do S1 od.
Lemma C.5. Let S,VI,X be any core statement and two sets of variables, respectively, with VI
slide independent in slides.S (i.e. glob.(slides.S.VI) ∩ def.S ⊆ VI); then
(X ∩ input.(slides.S.VI)) ⊆ (X ∩ input.S)
provided X ∩ def.S ⊆ VI.
Proof. First, if VI ∩ def.S = ∅ we get slides.S.VI = skip and the inclusion becomes trivial.
When VI ∩ def.S ≠ ∅, we prove the inclusion by induction over the structure of S.
S = VI1,Y1 := E1,E2 with VI1 ⊆ VI and Y1 ∩ VI = ∅:
input.(slides.VI1,Y1 := E1,E2.VI)
= {slides of ':='}
input.VI1 := E1
= {input of ':='}
glob.E1
⊆ {set theory}
glob.E1 ∪ glob.E2
= {input of ':='}
input.VI1,Y1 := E1,E2.
S = S1;S2:
X ∩ input.(slides.S1;S2.VI)
= {slides of ';'}
X ∩ input.(slides.S1.VI; slides.S2.VI)
= {input of ';'}
X ∩ (input.(slides.S1.VI) ∪ (input.(slides.S2.VI) \ ddef.(slides.S1.VI)))
= {set theory}
(X ∩ input.(slides.S1.VI)) ∪ ((X ∩ input.(slides.S2.VI)) \ (X ∩ ddef.(slides.S1.VI)))
⊆ {ind. hypo., twice}
(X ∩ input.S1) ∪ ((X ∩ input.S2) \ (X ∩ ddef.(slides.S1.VI)))
= {Lemma C.4}
(X ∩ input.S1) ∪ ((X ∩ input.S2) \ (X ∩ ddef.S1))
= {set theory}
X ∩ (input.S1 ∪ (input.S2 \ ddef.S1))
= {input of ';'}
X ∩ input.S1;S2.
S = if B then S1 else S2:
X ∩ input.(slides.if B then S1 else S2.VI)
= {slides of IF}
X ∩ input.(if B then slides.S1.VI else slides.S2.VI)
= {input of IF}
X ∩ (glob.B ∪ input.(slides.S1.VI) ∪ input.(slides.S2.VI))
= {set theory}
(X ∩ glob.B) ∪ (X ∩ input.(slides.S1.VI)) ∪ (X ∩ input.(slides.S2.VI))
⊆ {ind. hypo., twice}
(X ∩ glob.B) ∪ (X ∩ input.S1) ∪ (X ∩ input.S2)
= {set theory}
X ∩ (glob.B ∪ input.S1 ∪ input.S2)
= {input of IF}
X ∩ input.if B then S1 else S2.
S = while B do S1 od:
X ∩ input.(slides.while B do S1 od.VI)
= {slides of DO}
X ∩ input.(while B do slides.S1.VI od)
= {input of DO}
X ∩ (glob.B ∪ input.(slides.S1.VI))
= {set theory}
(X ∩ glob.B) ∪ (X ∩ input.(slides.S1.VI))
⊆ {ind. hypo.}
(X ∩ glob.B) ∪ (X ∩ input.S1)
= {set theory}
X ∩ (glob.B ∪ input.S1)
= {input of DO}
X ∩ input.while B do S1 od.
Appendix D
SSA
D.1 General derivation
The transformation to and from SSA will be based on the following general derivation.
Program equivalence D.1. A variable in X may or may not be live-on-exit; independently,
it may or may not be live-on-entry and it may be self-defined or normally defined or not at all
defined. So potentially we have 12 cases. However, in our context, some combinations are not
possible. Firstly, if a variable is self-defined, the used instances must be live-on-entry. Secondly, a
variable may not be live-on-exit-only, unless it is actually defined. So we are left with nine cases
to be distinguished.
For self-definitions, we have variables live-on-entry-only (XL1f := XL1i) or live-on-both (XL2 :=
XL2i); for normally defined variables, we have the live-on-both, live-on-entry-only, live-on-exit-
only and the dead variables, respectively (XL3f,XL4,XL5f,XL6 := E1′,E2′,E3′,E4′); of the
non-defined variables, we have variables live-on-both (X7), live-on-entry-only (X8) and, again,
dead variables (X9).
We note that subsets X1,X2,X3,X4,X7,X8 are live-on-entry, with initial instances
XL1i,XL2i,XL3i,XL4i,XL7i,XL8i, respectively. The final instances XL1f,XL3f,XL5f,XL7i of
subsets X1,X3,X5,X7 are all live-on-exit. Finally, note that subsets XL2,XL4,XL6 represent
dead assignments.
Let XLs be the set of all instances; let Y,Y1 be two more sets of program variables with
Y1 ⊆ Y and Y live on exit; finally, let E1,E2,E3,E4,E5,E1′,E2′,E3′,E4′,E5′ be ten lists of
expressions; then
(X3,X4,X5,X6,Y1 := E1,E2,E3,E4,E5;
XL1f,XL3f,XL5f,XL7i := X1,X3,X5,X7)[live XL1f,XL3f,XL5f,XL7i,Y] =
(XL1i,XL2i,XL3i,XL4i,XL7i,XL8i := X1,X2,X3,X4,X7,X8;
XL1f,XL2,XL3f,XL4,XL5f,XL6,Y1 := XL1i,XL2i,E1′,E2′,E3′,E4′,E5′)
[live XL1f,XL3f,XL5f,XL7i,Y]
provided
P1: (X1,X2,X3,X4,X5,X6,X7,X8) ⊆ X,
P2: (XL1i,XL2i,XL3i,XL4i,XL7i,XL8i) ⊆ XLs,
P3: (XL1f,XL2,XL3f,XL4,XL5f,XL6,XL7i) ⊆ XLs,
P4: Y1 ⊆ Y,
P5: (X,XLs,Y) disjoint,
P6: glob.(E1,E2,E3,E4,E5) ⊆ (X1,X2,X3,X4,X7,X8,Y),
P7: [E1′ = E1[X1,X2,X3,X4,X7,X8\XL1i,XL2i,XL3i,XL4i,XL7i,XL8i]],
P8: [E2′ = E2[X1,X2,X3,X4,X7,X8\XL1i,XL2i,XL3i,XL4i,XL7i,XL8i]],
P9: [E3′ = E3[X1,X2,X3,X4,X7,X8\XL1i,XL2i,XL3i,XL4i,XL7i,XL8i]],
P10: [E4′ = E4[X1,X2,X3,X4,X7,X8\XL1i,XL2i,XL3i,XL4i,XL7i,XL8i]] and
P11: [E5′ = E5[X1,X2,X3,X4,X7,X8\XL1i,XL2i,XL3i,XL4i,XL7i,XL8i]].
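(As a minimal instance of this equivalence, our example: take a single variable x that is normally defined and live on both entry and exit, i.e. X3 = {x} with instances XL3i = {xi} and XL3f = {xf}, and all other subsets empty; the equivalence then reads
(x := x+1; xf := x)[live xf,Y] = (xi := x; xf := xi+1)[live xf,Y]
which is precisely the translation of x := x+1 into SSA form, renaming the use of x to its initial instance xi and its definition to the final instance xf.)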
Proof.
(XL1i,XL2i,XL3i,XL4i,XL7i,XL8i:= X1,X2,X3,X4,X7,X8;
XL1f,XL2,XL3f,XL4,XL5f,XL6,Y1 := XL1i,XL2i,E10,E20,E30,E40,E50)
[live XL1f,XL3f,XL5f,XL7i,Y]
={assignment-based sub. (Law 18): due to P1,P2 and P5
(X1,X2,X3,X4,X7,X8) (XL1i,XL2i,XL3i,XL4i,XL7i,XL8i);
remove redundant double-sub.: P7-P11 and then P1,P2,P5,P6 give
(XL1i,XL2i,XL3i,XL4i,XL7i,XL8i)glob.(E1,E2,E3,E4,E5)}
(XL1i,XL2i,XL3i,XL4i,XL7i,XL8i:= X1,X2,X3,X4,X7,X8;
XL1f,XL2,XL3f,XL4,XL5f,XL6,Y1 := X1,X2,E1,E2,E3,E4,E5)
[live XL1f,XL3f,XL5f,XL7i,Y]
={remove dead assignment (Law 23): due to P1,P2,P5 and P6
(XL1i,XL2i,XL3i,XL4i,XL8i)
(((XL7i,Y)\Y1) glob.(E1,E2,E3,E4,E5))}
(XL7i:= X7;XL1f,XL2,XL3f,XL4,XL5f,XL6,Y1 :=
X1,X2,E1,E2,E3,E4,E5)[live XL1f,XL3f,XL5f,XL7i,Y]
={intro. dead assignment (Law 23): due to P1,P2 and P5
(X3,X4,X5,X6) (XL1f,XL3f,XL5f,XL7i,Y)}
(XL7i:= X7;XL1f,XL2,XL3f,X3,XL4,X4,XL5f,X5,XL6,X6,Y1 :=
X1,X2,E1,E1,E2,E2,E3,E3,E4,E4,E5)[live XL1f,XL3f,XL5f,XL7i,Y]
={intro. following assertion (Laws 7, 8)}
(XL7i:= X7;XL1f,XL2,XL3f,X3,XL4,X4,XL5f,X5,XL6,X6,Y1 :=
X1,X2,E1,E1,E2,E2,E3,E3,E4,E4,E5;
{X1,X3,X5 = XL1f,XL3f,XL5f})[live XL1f,XL3f,XL5f,XL7i,Y]
={intro. following assignment (Law 6)}
(XL7i:= X7;XL1f,XL2,XL3f,X3,XL4,X4,XL5f,X5,XL6,X6,Y1 :=
X1,X2,E1,E1,E2,E2,E3,E3,E4,E4,E5;{X1,X3,X5 = XL1f,XL3f,XL5f}
;XL1f,XL3f,XL5f:= X1,X3,X5)[live XL1f,XL3f,XL5f,XL7i,Y]
={remove following assertion (Laws 7, 8)}
(XL7i:= X7;XL1f,XL2,XL3f,X3,XL4,X4,XL5f,X5,XL6,X6,Y1 :=
X1,X2,E1,E1,E2,E2,E3,E3,E4,E4,E5;
XL1f,XL3f,XL5f:= X1,X3,X5)[live XL1f,XL3f,XL5f,XL7i,Y]
={remove dead assignment (Law 23): due to P1,P3 and P5
(XL1f,XL2,XL3f,XL4f,XL5f,XL6) (XL7i,Y,X1,X3,X5)}
(XL7i:= X7;X3,X4,X5,X6,Y1 := E1,E2,E3,E4,E5;
XL1f,XL3f,XL5f:= X1,X3,X5)[live XL1f,XL3f,XL5f,XL7i,Y]
={swap statements (Law 5.7): X7(X3,X4,X5,X6,Y1), due to P1,P4 and P5; and
XL7i((X3,X4,X5,X6,Y1) glob.(E1,E2,E3,E4,E5)), by P1,P2,P4,P5}
(X3,X4,X5,X6,Y1 := E1,E2,E3,E4,E5;XL7i:= X7;
XL1f,XL3f,XL5f:= X1,X3,X5)[live XL1f,XL3f,XL5f,XL7i,Y]
={merge assignments (Law 1): XL7i(X1,X3,X5), by P1,P2,P5}
(X3,X4,X5,X6,Y1 := E1,E2,E3,E4,E5;
XL1f,XL3f,XL5f,XL7i:= X1,X3,X5,X7)[live XL1f,XL3f,XL5f,XL7i,Y].
Program equivalence D.2. Let S1, S2, S1′, S2′, X1, X2, XL1i, XL2f, Y be four statements and five sets of variables, respectively; then

(S1;S2;XL2f := X2)[live XL2f,Y] = (XL1i := X1;S1′;S2′)[live XL2f,Y]

provided

P1: (S1;XL3 := X3)[live XL3,Y] = (XL1i := X1;S1′)[live XL3,Y] and
P2: (S2;XL2f := X2)[live XL2f,Y] = (XL3 := X3;S2′)[live XL2f,Y]

where XL3 := ((XL2f\ddef.S2′) ∪ (input.S2′\Y)).
Proof.

(XL1i := X1;S1′;S2′)[live XL2f,Y]
= {prop. liveness info.: by def. of XL3 (and set theory) we get
   (XL3,Y) = (((XL2f,Y)\ddef.S2′) ∪ input.S2′)}
((XL1i := X1;S1′)[live XL3,Y];S2′)[live XL2f,Y]
= {P1}
((S1;XL3 := X3)[live XL3,Y];S2′)[live XL2f,Y]
= {remove liveness info.: again (XL3,Y) = (((XL2f,Y)\ddef.S2′) ∪ input.S2′)}
(S1;XL3 := X3;S2′)[live XL2f,Y]
= {prop. liveness info.}
(S1;(XL3 := X3;S2′)[live XL2f,Y])[live XL2f,Y]
= {P2}
(S1;(S2;XL2f := X2)[live XL2f,Y])[live XL2f,Y]
= {remove liveness info.}
(S1;S2;XL2f := X2)[live XL2f,Y].
Program equivalence D.3. Let B, B′, S1, S2, S1′, S2′, X1, X2, XL1i, XL2f, Y be two boolean expressions, four statements and five sets of variables, respectively; then

(if B then S1 else S2;XL2f := X2)[live XL2f,Y] =
(XL1i := X1;if B′ then S1′ else S2′)[live XL2f,Y]

provided

P1: [B′ ≡ B[X1\XL1i]],
P2: XL1i ∩ X1 = ∅,
P3: XL1i ∩ glob.B = ∅,
P4: (S1;XL2f := X2)[live XL2f,Y] = (XL1i := X1;S1′)[live XL2f,Y] and
P5: (S2;XL2f := X2)[live XL2f,Y] = (XL1i := X1;S2′)[live XL2f,Y].
Proof.

(XL1i := X1;if B′ then S1′ else S2′)[live XL2f,Y]
= {P1}
(XL1i := X1;if B[X1\XL1i] then S1′ else S2′)[live XL2f,Y]
= {assignment-based sub. (Law 18): XL1i ∩ X1 = ∅ (P2)}
(XL1i := X1;if B[X1\XL1i][XL1i\X1] then S1′ else S2′)[live XL2f,Y]
= {remove redundant (reversed) double-sub.: XL1i ∩ glob.B = ∅ (P3)}
(XL1i := X1;if B then S1′ else S2′)[live XL2f,Y]
= {dist. statement over IF (Law 3): P3 again}
(if B then XL1i := X1;S1′ else XL1i := X1;S2′)[live XL2f,Y]
= {prop. liveness info.}
(if B then (XL1i := X1;S1′)[live XL2f,Y]
 else (XL1i := X1;S2′)[live XL2f,Y])[live XL2f,Y]
= {P4,P5}
(if B then (S1;XL2f := X2)[live XL2f,Y]
 else (S2;XL2f := X2)[live XL2f,Y])[live XL2f,Y]
= {remove liveness info.}
(if B then S1;XL2f := X2 else S2;XL2f := X2)[live XL2f,Y]
= {dist. IF over ‘;’ (Law 4)}
(if B then S1 else S2;XL2f := X2)[live XL2f,Y].
Program equivalence D.4. Let B, B′, S1, S1′, X1, X2, XL1i, XL2i, Y be two boolean expressions, two statements and five (disjoint) sets of variables; then

(DO;XL2i := X2)[live XL2i,Y] = (XL1i,XL2i := X1,X2;DO′)[live XL2i,Y]

where DO := while B do S1 od and
DO′ := while B′ do S1′ od

provided

P1: (X1,X2,XL1i,XL2i,Y) are disjoint,
P2: (XL1i,XL2i) ∩ input.DO = ∅,
P3: input.DO′ ⊆ (XL1i,XL2i,Y),
P4: [B′ ≡ B[X1,X2\XL1i,XL2i]] and
P5: (S1;XL1i,XL2i := X1,X2)[live XL1i,XL2i,Y] =
    (XL1i,XL2i := X1,X2;S1′)[live XL1i,XL2i,Y].
Proof.
(XL1i,XL2i:= X1,X2;while B0do S10od)[live XL2i,Y]
={intro. dead assignment (Law 26): due to P1,P3
(X1,X2) ((XL2i,Y)glob.B0input.S10)}
(XL1i,XL2i:= X1,X2;while B0do
S10;X1,X2 := XL1i,XL2iod)[live XL2i,Y]
={intro. following assertion (Law 7), twice}
(XL1i,XL2i:= X1,X2;{XL1i,XL2i=X1,X2};while B0do
S10;X1,X2 := XL1i,XL2i;{XL1i,XL2i=X1,X2}od)[live XL2i,Y]
={prop. assertion (Law 13)}
(XL1i,XL2i:= X1,X2;{XL1i,XL2i=X1,X2};while B0do
{XL1i,XL2i=X1,X2};S10;X1,X2 := XL1i,XL2i;{XL1i,XL2i=X1,X2}
od)[live XL2i,Y]
={intro. following assignment (Law 6)}
(XL1i,XL2i:= X1,X2;{XL1i,XL2i=X1,X2};while B0do
{XL1i,XL2i=X1,X2};XL1i,XL2i:= X1,X2;S10;X1,X2 := XL1i,XL2i;
{XL1i,XL2i=X1,X2}od)[live XL2i,Y]
={prop. assertion (Law 13)}
(XL1i,XL2i:= X1,X2;{XL1i,XL2i=X1,X2};while B0do
XL1i,XL2i:= X1,X2;S10;X1,X2 := XL1i,XL2i;{XL1i,XL2i=X1,X2}
od)[live XL2i,Y]
={remove following assertion (Law 7), twice}
(XL1i,XL2i:= X1,X2;while B0do
XL1i,XL2i:= X1,X2;S10;X1,X2 := XL1i,XL2iod)[live XL2i,Y]
={liveness analysis: P3 gives input.DO0(XL1i,XL2i,Y)}
(XL1i,XL2i:= X1,X2;while B0do
((XL1i,XL2i:= X1,X2;S10)[live XL1i,XL2i,Y];
X1,X2 := XL1i,XL2i)[live XL1i,XL2i,Y]od)[live XL2i,Y]
={P5}
(XL1i,XL2i:= X1,X2;while B0do
((S1;XL1i,XL2i:= X1,X2)[live XL1i,XL2i,Y];
X1,X2 := XL1i,XL2i)[live XL1i,XL2i,Y]od)[live XL2i,Y]
={remove liveness info.}
(XL1i,XL2i:= X1,X2;while B0do
(S1;XL1i,XL2i:= X1,X2;X1,X2 := XL1i,XL2i)[live XL1i,XL2i,Y]
od)[live XL2i,Y]
={assignment-based sub. (Law 18): (X1,X2) (XL1i,XL2i) due to P1}
(XL1i,XL2i:= X1,X2;while B0do
(S1;XL1i,XL2i:= X1,X2;X1,X2 := X1,X2)[live XL1i,XL2i,Y]
od)[live XL2i,Y]
={remove redundant self-assignment (Law 2); remove liveness info.}
(XL1i,XL2i:= X1,X2;while B0do S1;XL1i,XL2i:= X1,X2od)[live XL2i,Y]
={P4}
(XL1i,XL2i:= X1,X2;while B[X1,X2\XL1i,XL2i]do
S1;XL1i,XL2i:= X1,X2od)[live XL2i,Y]
={assignment-based sub. (Law 18): (X1,X2) (XL1i,XL2i)}
(XL1i,XL2i:= X1,X2;while B[X1,X2\XL1i,XL2i][XL1i,XL2i\X1,X2] do
S1;XL1i,XL2i:= X1,X2od)[live XL2i,Y]
={remove redundant (reversed) double-sub.: P2 gives (XL1i,XL2i)glob.B}
(XL1i,XL2i:= X1,X2;while Bdo S1;XL1i,XL2i:= X1,X2od)[live XL2i,Y]
={code motion (Law 5): due to P1,P2
(XL1i,XL2i)(glob.Binput.S1(X1,X2))}
(XL1i,XL2i:= X1,X2;while Bdo S1od ;XL1i,XL2i:= X1,X2)[live XL2i,Y]
={remove dead assignment (Law 23): by P1 we get XL1i(XL2i,Y)}
(XL1i,XL2i:= X1,X2;while Bdo S1od ;XL2i:= X2)[live XL2i,Y]
={remove dead assignment (Law 23):
(XL1i,XL2i)((X2,Y)glob.Binput.S1) due to P1,P2}
(while Bdo S1od ;XL2i:= X2)[live XL2i,Y].
D.2 Transform to SSA
We now apply the results of the above general derivation in deriving an algorithm to transform
any given program statement to SSA.
Transformation D.5. Let S, X, Y be any core statement and two (disjoint) sets of variables; let X1, X2, X3, X4, X5 be five (mutually disjoint) subsets of X, and let XL1i, XL2i, XL3i, XL4i, XL4f, XL5f be six sets of instances, all included in the full set of instances XLs; let S′ be the SSA form of S, defined by
S′ := toSSA.(S,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4f,XL5f),Y,XLs); then (Q1:)

(S;XL3i,XL4f,XL5f := X3,X4,X5)[live XL3i,XL4f,XL5f,Y] =
(XL1i,XL2i,XL3i,XL4i := X1,X2,X3,X4;S′)[live XL3i,XL4f,XL5f,Y]

and (Q2:) X ∩ glob.S′ = ∅

provided

P1: glob.S ⊆ (X,Y),
P2: (X1,X2,X3,X4,X5) ⊆ X,
P3: (XL1i,XL2i,XL3i,XL4i,XL4f,XL5f) ⊆ XLs,
P4: XLs ∩ (X,Y) = ∅,
P5: (X1,X3) ∩ def.S = ∅,
P6: (X2,X4,X5) ⊆ def.S and
P7: (X ∩ (((X3,X4,X5)\ddef.S) ∪ input.S)) ⊆ (X1,X2,X3,X4).
Preconditions P1 and P2 identify all program variables (in S); then P3 and P4 (along with P1) ensure all instances in XLs are fresh; P5 and P6 (along with P3 and the repetition of XL3i in Q1) ensure any live-on-exit final instance is also live-on-entry if and only if its respective program variable is not defined in S (if it is both defined and live-on-entry, a different initial instance will be used); finally, P7 (along with P2) makes postcondition Q2 achievable, by demanding the availability of an initial instance for all live-on-entry variables (in X).
Proof. The derivation of the toSSA algorithm is given hand in hand with its proof of correctness. For a given statement S, we assume for any slip T of S the correctness of its toSSA transformation (provided all preconditions are met) in proving the correctness for S itself.

As will be seen, in the course of the following derivation, for each case of
S′ := toSSA.(S,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4f,XL5f),Y,XLs) we shall be obliged to show

DP1: (((XL3i,XL4f,XL5f)\ddef.S′) ∪ (input.S′\Y)) ⊆ (XL1i,XL2i,XL3i,XL4i) and
DP2: (XL4f,XL5f) ⊆ ddef.S′

provided P1-P7 hold.

In order to allow all recursive calls to introduce fresh instances, without clashing with surrounding variables, we shall make the names of all global program variables (i.e. from X and Y) invariably available to recursive calls. Since those calls will be applied to slips of S, P1 will be guaranteed (due to glob.T ⊆ glob.S for any slip T of S).

Furthermore, whenever fresh instances are introduced, they will be added to XLs in further recursive calls. Thus the inclusion of all instances in XLs, as required by P3, will be maintained. This way, choosing all fresh names to be distinct from (X,Y,XLs) will maintain P4. For keeping P5, P6 and the disjointness of instances (XL1i,XL2i,XL3i,XL4i,XL4f,XL5f) (as implied by P3), special care will be needed. In the various cases, such considerations will be key in deriving the details of the transformation. Finally, for P7 to be invariably maintained, we shall propagate liveness information backward (following our laws of liveness analysis) whilst propagating forward assignments to initial instances (following both laws of program manipulation and postcondition Q1).
Assignment
We begin by analyzing the relevant subsets of live variables X1-X5; of those, variables X2, X4, X5 are defined in S, with X2, X4 live-on-entry and X4, X5 live-on-exit. Furthermore, there is a potential of defined variables X6 ⊆ (X\(X1,X2,X3,X4,X5)); those are neither live-on-entry nor live-on-exit (i.e. dead assignments, as with X2).

All remaining defined variables (Y1, disjoint from X) will have to be from Y (due to P1).

For variables (X3,X4), being live on both entry and exit, we recall from Q1 and P5 that the non-defined X3 must have the same initial and final instances (XL3i), whereas (from Q1, P3 and P6) the defined X4 should be given fresh initial instances XL4i.

Likewise, all dead assignments to (X2,X6) must yield fresh instances (XL2,XL6). (An alternative would have been to allow the merging algorithm to remove dead assignments. This would have caused complications in changing the results of liveness analysis and thus raise questions — which are better avoided — over the order of translation. Instead, one can perform the elimination of dead assignments independently of the merging.)
In terms of Program equivalence D.1, we observe (for Q1) that our X4, X2, X5, X6, X3, X1 correspond to X3, X4, X5, X6, X7, X8 over there. Thus

(X4,X2,X5,X6,Y1 := E1,E2,E3,E4,E5;
XL3i,XL4f,XL5f := X3,X4,X5)[live XL3i,XL4f,XL5f,Y]
= {Program equivalence D.1 with
   X1,X2,X3,X4,X5,X6,X7,X8 := ∅,∅,X4,X2,X5,X6,X3,X1,
   XL3i,XL4i,XL7i,XL8i := XL4i,XL2i,XL3i,XL1i,
   XL3f,XL4,XL5f,XL6 := XL4f,XL2,XL5f,XL6,
   XLs := (XLs,XL2,XL4i,XL6) and
   (E1′,E2′,E3′,E4′,E5′) := (E1,E2,E3,E4,E5)
       [X1,X2,X3,X4\XL1i,XL2i,XL3i,XL4i]:
   P1 is due to our P2; P2 and P3 are due to our P3 and the fresh choices;
   P4 holds by choice of Y1 (disjoint from X) and our P1;
   P5 is a result of our P4 and the fresh choices;
   P6 is due to our P1, P2 and P7; finally P7-P11 hold by construction}
(XL1i,XL2i,XL3i,XL4i := X1,X2,X3,X4;
XL4f,XL2,XL5f,XL6,Y1 := E1′,E2′,E3′,E4′,E5′)
[live XL3i,XL4f,XL5f,Y].
We thus derive

toSSA.(X4,X2,X5,X6,Y1 := E1,E2,E3,E4,E5,
X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4f,XL5f),Y,XLs) ≜
XL4f,XL2,XL5f,XL6,Y1 := E1′,E2′,E3′,E4′,E5′
where (E1′,E2′,E3′,E4′,E5′) := (E1,E2,E3,E4,E5)
[X1,X2,X3,X4\XL1i,XL2i,XL3i,XL4i]
and (XL2,XL6) := fresh.((X2,X6),(X,Y,XLs)).
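As an illustration only, the derived assignment case admits a direct functional reading. The following Python sketch (the helper names and the token-tuple representation of expressions are assumptions, not the thesis's notation) renames used variables to their initial instances and chooses final or fresh instances for the defined ones:

    import itertools

    def subst(tokens, mapping):
        # Token-wise renaming of used program variables to their instances.
        return tuple(mapping.get(t, t) for t in tokens)

    def to_ssa_assign(targets, exprs, init_inst, final_inst,
                      counter=itertools.count()):
        """targets := exprs, as parallel lists; init_inst maps each
        live-on-entry variable to its initial instance; final_inst maps
        each defined, live-on-exit variable to its final instance; dead
        definitions receive fresh instances (and are kept, as argued above).
        The module-level counter keeps fresh names unique across calls."""
        new_exprs = [subst(e, init_inst) for e in exprs]
        new_targets = [final_inst.get(t, t + "_L" + str(next(counter)))
                       for t in targets]
        return new_targets, new_exprs

    # e.g. x,y := y+1,x with both live-on-entry and only x live-on-exit:
    tgt, ex = to_ssa_assign(["x", "y"], [("y", "+", "1"), ("x",)],
                            {"x": "xi", "y": "yi"}, {"x": "xf"})
    assert tgt == ["xf", "y_L0"] and ex == [("yi", "+", "1"), ("xi",)]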
Q2. Indeed X ∩ glob.(XL4f,XL2,XL5f,XL6,Y1 := E1′,E2′,E3′,E4′,E5′) = ∅, since
(X ∩ glob.(E1,E2,E3,E4,E5)) ⊆ (X1,X2,X3,X4) (due to input of ‘:=’ and P7) and
(X1,X2,X3,X4) are all substituted by elements of XLs (P3), which are disjoint from X (P4).

DP1.

((XL3i,XL4f,XL5f)\(XL4f,XL2,XL5f,XL6,Y1))
∪ (glob.(E1′,E2′,E3′,E4′,E5′)\Y)
= {set theory: XL3i ∩ (XL4f,XL2,XL5f,XL6,Y1) = ∅ due to P3,
   the fresh choice of XL2, XL6 and P1, P4 (for Y1)}
XL3i ∪ (glob.(E1′,E2′,E3′,E4′,E5′)\Y)
⊆ {def. of E1′,E2′,E3′,E4′,E5′ and
   glob.(E1,E2,E3,E4,E5) ⊆ (X1,X2,X3,X4,Y) due to P1 and P7}
(XL1i,XL2i,XL3i,XL4i).

DP2. Following P5 and P6 we observe that
(X3,X4,X5) ∩ def.(X4,X2,X5,X6,Y1 := E1,E2,E3,E4,E5) is (X4,X5), and indeed the matching final instances (XL4f,XL5f) are in
ddef.(XL4f,XL2,XL5f,XL6,Y1 := E1′,E2′,E3′,E4′,E5′).
Sequential composition
The key here is to determine an intermediate set of instances for which all preconditions (P1-P7) hold for the two recursive calls to toSSA. Let (X1,X2,X3,X4) be the live-on-entry variables, and let (X3,X4,X5) be live-on-exit, with (X1,X3) ∩ (def.S1 ∪ def.S2) = ∅ and (X2,X4,X5) ⊆ (def.S1 ∪ def.S2). We have no problem with X3, as all instances (i.e. final, intermediate and initial) will have to be the same (XL3i), since those variables are not defined in S1;S2. The remaining live intermediate variables are X6 := ((X\X3) ∩ (((X4,X5)\ddef.S2) ∪ input.S2)).

Of X6, variables in X11, X21, X41 := (X1 ∩ X6), ((X2 ∩ X6)\def.S1), ((X4 ∩ X6)\def.S1) will have to reuse the initial instances XL11i, XL21i, XL41i. Similarly, variables in X42, X51 := ((X4 ∩ X6)\def.S2), ((X5 ∩ X6)\def.S2) will reuse final instances XL42f, XL51f. (Note that X41 ∩ X42 = ∅ since — due to X4 ⊆ (def.S1 ∪ def.S2) — variables in X41 must be in def.S2 whereas variables in X42 must not.)

Finally, the remaining variables X61 := (X6\(X11,X21,X41,X42,X51)) must get fresh instances (XL61 := fresh.(X61,(X,Y,XLs))). In summary, the intermediately-live instances will be

XL6 := (XL11i,XL21i,XL3i,XL41i,XL42f,XL51f,XL61).

The above construction of XL6, along with the given P1-P7 and the associativity of liveness analysis (Lemma 7.2, for P7), ensures that
S1′ := toSSA.(S1,X,(XL1i,XL2i,XL3i,XL4i),XL6,Y,XLs′) — where XLs′ := (XLs,XL61) —
enjoys all its P1-P7 and thus (Q1)

(S1;XL6 := X6)[live XL6,Y] =
(XL1i,XL2i,XL3i,XL4i := X1,X2,X3,X4;S1′)[live XL6,Y].

Similarly, all P1-P7 for S2′ := toSSA.(S2,X,XL6,(XL3i,XL4f,XL5f),Y,XLs″) are guaranteed for any XLs″ ⊇ XLs′, thus yielding (Q1)

(S2;XL3i,XL4f,XL5f := X3,X4,X5)[live XL3i,XL4f,XL5f,Y] =
(XL6 := X6;S2′)[live XL3i,XL4f,XL5f,Y].

In toSSA, we shall insist on XLs″ := (XLs′ ∪ (glob.S1′\Y)) in order to avoid defining the same instance twice, and thus losing the static-single-assignment property.
Q1.

(S1;S2;XL3i,XL4f,XL5f := X3,X4,X5)[live XL3i,XL4f,XL5f,Y]
= {Program equivalence D.2 (with XL6, XLs′, XLs″ as defined above):
   XL1i := (XL1i,XL2i,XL3i,XL4i); XL2f := (XL3i,XL4f,XL5f);
   S1′ := toSSA.(S1,X,(XL1i,XL2i,XL3i,XL4i),XL6,Y,XLs′);
   S2′ := toSSA.(S2,X,XL6,(XL3i,XL4f,XL5f),Y,XLs″);
   ind. hypo. (Q1), twice: P1-P7 of both cases justified above}
(XL1i,XL2i,XL3i,XL4i := X1,X2,X3,X4;S1′;S2′)
[live XL3i,XL4f,XL5f,Y].
We thus derive

toSSA.(S1;S2,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4f,XL5f),Y,XLs) ≜
S1′;S2′
where X6 := ((X\X3) ∩ (((X4,X5)\ddef.S2) ∪ input.S2)),
X11 := (X1 ∩ X6),
X21 := ((X2 ∩ X6)\def.S1),
X41 := ((X4 ∩ X6)\def.S1),
X42 := ((X4 ∩ X6)\def.S2),
X51 := ((X5 ∩ X6)\def.S2),
X61 := (X6\(X11,X21,X41,X42,X51)),
XL61 := fresh.(X61,(X,Y,XLs)),
XL6 := (XL11i,XL21i,XL3i,XL41i,XL42f,XL51f,XL61),
XLs′ := (XLs,XL61),
S1′ := toSSA.(S1,X,(XL1i,XL2i,XL3i,XL4i),XL6,Y,XLs′),
XLs″ := (XLs′ ∪ (glob.S1′\Y))
and S2′ := toSSA.(S2,X,XL6,(XL3i,XL4f,XL5f),Y,XLs″).
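For orientation, the construction of the intermediate instance set can also be phrased operationally. The sketch below (assumed names; a simplification in that it works per variable rather than per subset) mirrors the choice of XL11i, XL21i, XL41i, XL42f, XL51f and XL61 above:

    def intermediate_instance(v, def_s1, def_s2, init_inst, final_inst, fresh):
        """For a variable v that is live between S1 and S2: reuse its
        initial instance when S1 does not define it, reuse its final
        instance when S2 does not define it (and v is live-on-exit),
        and otherwise invent a fresh intermediate instance."""
        if v not in def_s1:
            return init_inst[v]       # X11, X21, X41 (and X3): reuse initial
        if v not in def_s2 and v in final_inst:
            return final_inst[v]      # X42, X51: reuse final (live-on-exit)
        return fresh(v)               # X61: fresh intermediate instance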
Q2. Indeed X ∩ glob.(S1′;S2′) = ∅ due to the ind. hypo. (Q2), twice.

DP1.

((XL3i,XL4f,XL5f)\ddef.(S1′;S2′)) ∪ (input.(S1′;S2′)\Y)
= {associativity of liveness (Lemma 7.2)}
(((((XL3i,XL4f,XL5f)\ddef.S2′) ∪ input.S2′)\ddef.S1′) ∪ input.S1′)\Y
= {set theory}
(((((XL3i,XL4f,XL5f)\ddef.S2′) ∪ (input.S2′\Y))\ddef.S1′) ∪ input.S1′)\Y
⊆ {ind. hypo. (DP1 of S2′)}
((XL6\ddef.S1′) ∪ input.S1′)\Y
= {set theory: XL6 ∩ Y = ∅ by construction of XL6 (P3, P4 and freshness of XL61)}
(XL6\ddef.S1′) ∪ (input.S1′\Y)
⊆ {ind. hypo. (DP1 of S1′)}
(XL1i,XL2i,XL3i,XL4i).

DP2. Due to ddef of ‘;’, we need to show (XL4f,XL5f) ⊆ (ddef.S1′ ∪ ddef.S2′). Final instances of variables in (X4,X5) ∩ def.S2 are in ddef.S2′ due to the ind. hypo. (i.e. DP2 of S2′). The remaining elements of (X4,X5) must be in X6\def.S2 and hence in (X42,X51). Thus, their final instances must be in (XL42f,XL51f) and hence in ddef.S1′ due to the ind. hypo. (i.e. DP2 of S1′), as required.
IF
The key this time lies in DP2 and the definition of ddef of IF. Variables in (X4,X5) are defined in the IF statement and must be defined in both branches of the resulting IF′. We achieve that by ending both the then and else branches of IF′ with assignments to the final instances (XL4f,XL5f). But what do we assign to members of (XL4f,XL5f)? In each branch the answer will be different, depending on whether the variable is defined in that branch and, if not, whether it is live on entry (i.e. has an instance in XL4i) or not.

Variables in X4d1 := (X4 ∩ (def.S1\def.S2)) should be given fresh instances (XL4d1t) in the then branch but use their initial instances (XL4d1i) as final instances in the else branch. (Failing to reuse such initial instances would inevitably render DP2 false, by yielding simultaneous liveness.)

Similarly, variables in X4d2 := (X4 ∩ (def.S2\def.S1)) should be given fresh instances (XL4d2e) in the else branch but reuse initial instances XL4d2i as final instances in the then branch.

Now each of the remaining variables of X4 (i.e. in X4d1d2 := X4\(X4d1 ∪ X4d2)) and each member of X5 should be given two (distinct) fresh instances (i.e. (XL4d1d2t,XL4d1d2e,XL5t,XL5e)). Those new instances will in turn act as final instances in the two recursive calls.

Finally, for brevity, we define XL4t := (XL4d1t,XL4d2i,XL4d1d2t) and
XL4e := (XL4d1i,XL4d2e,XL4d1d2e). In summary, the then branch will end with an assignment XL4f,XL5f := XL4t,XL5t. Similarly, the else branch will end with the assignment XL4f,XL5f := XL4e,XL5e.

The above construction along with the given P1-P7 ensures that
S1′ := toSSA.(S1,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4t,XL5t),Y,XLs′) — where XLs′ := (XLs,XL4d1t,XL4d2e,XL4d1d2t,XL4d1d2e,XL5t,XL5e) — enjoys all its P1-P7. Similarly, all P1-P7 for
S2′ := toSSA.(S2,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4e,XL5e),Y,XLs″) are guaranteed for any XLs″ ⊇ XLs′. As before, in order to avoid double assignments to any instance, we shall insist on XLs″ := (XLs′ ∪ (glob.S1′\Y)).
We now aim to apply Program equivalence D.3 with
S1″ := S1′;XL4f,XL5f := XL4t,XL5t and S2″ := S2′;XL4f,XL5f := XL4e,XL5e.
For this to be correct, we have to show

(S1;XL3i,XL4f,XL5f := X3,X4,X5)[live XL3i,XL4f,XL5f,Y] =
(XL1i,XL2i,XL3i,XL4i := X1,X2,X3,X4;S1″)[live XL3i,XL4f,XL5f,Y] and
(S2;XL3i,XL4f,XL5f := X3,X4,X5)[live XL3i,XL4f,XL5f,Y] =
(XL1i,XL2i,XL3i,XL4i := X1,X2,X3,X4;S2″)[live XL3i,XL4f,XL5f,Y].

For the former, we observe

(XL1i,XL2i,XL3i,XL4i := X1,X2,X3,X4;S1″)[live XL3i,XL4f,XL5f,Y]
= {def. of S1″}
(XL1i,XL2i,XL3i,XL4i := X1,X2,X3,X4;S1′;
XL4f,XL5f := XL4t,XL5t)[live XL3i,XL4f,XL5f,Y]
= {prop. liveness}
((XL1i,XL2i,XL3i,XL4i := X1,X2,X3,X4;S1′)[live XL3i,XL4t,XL5t,Y];
XL4f,XL5f := XL4t,XL5t)[live XL3i,XL4f,XL5f,Y]
= {ind. hypo. (Q1 of S1′)}
((S1;XL3i,XL4t,XL5t := X3,X4,X5)[live XL3i,XL4t,XL5t,Y];
XL4f,XL5f := XL4t,XL5t)[live XL3i,XL4f,XL5f,Y]
= {remove liveness}
(S1;XL3i,XL4t,XL5t := X3,X4,X5;
XL4f,XL5f := XL4t,XL5t)[live XL3i,XL4f,XL5f,Y]
= {assignment-based sub. (Law 18): (X4,X5) ∩ (XL3i,XL4t,XL5t) = ∅ due to
   P2, P3, P4 and the freshness of (XL4t,XL5t)}
(S1;XL3i,XL4t,XL5t := X3,X4,X5;
XL4f,XL5f := X4,X5)[live XL3i,XL4f,XL5f,Y]
= {remove dead assignments: (XL4t,XL5t) ∩ (XL3i,X4,X5,Y) = ∅ again, by
   P2, P3, P4 and the freshness of (XL4t,XL5t)}
(S1;XL3i := X3;XL4f,XL5f := X4,X5)[live XL3i,XL4f,XL5f,Y]
= {merge following assignments (Law 1): XL3i ∩ (XL4f,XL5f,X4,X5) = ∅ by P2, P3, P4}
(S1;XL3i,XL4f,XL5f := X3,X4,X5)[live XL3i,XL4f,XL5f,Y].
The corresponding proof for S2″ is similar (and thus omitted). We are now ready to transform the IF statement into IF′, as follows:

(if B then S1 else S2;
XL3i,XL4f,XL5f := X3,X4,X5)[live XL3i,XL4f,XL5f,Y]
= {Program equivalence D.3 with S1″, S2″ as defined above,
   B′ := B[X1,X2,X3,X4\XL1i,XL2i,XL3i,XL4i],
   XL1i := (XL1i,XL2i,XL3i,XL4i) and XL2f := (XL3i,XL4f,XL5f):
   P1 holds by construction (of B′); P2 holds due to our P2, P3, P4;
   P3 holds due to P1, P3, P4; and P4, P5 hold as proved above}
(XL1i,XL2i,XL3i,XL4i := X1,X2,X3,X4;
if B′ then S1′;XL4f,XL5f := XL4t,XL5t
else S2′;XL4f,XL5f := XL4e,XL5e)[live XL3i,XL4f,XL5f,Y].
We thus derive

toSSA.(IF,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4f,XL5f),Y,XLs) ≜ IF′
where IF := if B then S1 else S2,
IF′ := if B[X1,X2,X3,X4\XL1i,XL2i,XL3i,XL4i]
       then S1′;XL4f,XL5f := XL4t,XL5t else S2′;XL4f,XL5f := XL4e,XL5e,
X4d1 := (X4 ∩ (def.S1\def.S2)),
X4d2 := (X4 ∩ (def.S2\def.S1)),
X4d1d2 := X4 ∩ def.S1 ∩ def.S2,
(XL4d1t,XL4d2e,XL4d1d2t,XL4d1d2e,XL5t,XL5e) :=
    fresh.((X4d1,X4d2,X4d1d2,X4d1d2,X5,X5),(X,Y,XLs)),
XL4t := (XL4d1t,XL4d2i,XL4d1d2t),
XL4e := (XL4d1i,XL4d2e,XL4d1d2e),
XLs′ := (XLs,XL4d1t,XL4d2e,XL4d1d2t,XL4d1d2e,XL5t,XL5e),
S1′ := toSSA.(S1,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4t,XL5t),Y,XLs′),
XLs″ := (XLs′ ∪ (glob.S1′\Y))
and S2′ := toSSA.(S2,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4e,XL5e),Y,XLs″).
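The three-way split of X4 drives the instance choice per branch; the sketch below (a hypothetical rendering, with fresh_then and fresh_else as assumed fresh-name suppliers) returns the pair of final instances a variable receives at the ends of the then and else branches:

    def branch_final_instances(v, def_then, def_else, init_inst,
                               fresh_then, fresh_else):
        """v is in X4: defined somewhere in the IF and live on both entry
        and exit. Returns (instance at end of then, instance at end of else)."""
        if v in def_then and v not in def_else:        # v in X4d1
            return fresh_then(v), init_inst[v]
        if v in def_else and v not in def_then:        # v in X4d2
            return init_inst[v], fresh_else(v)
        # v in X4d1d2: defined in both branches, two distinct fresh instances
        return fresh_then(v), fresh_else(v)

Members of X5, being dead on entry, always fall into the last case (two fresh instances), as in the definitions of XL5t and XL5e above.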
Q2. Indeed X ∩ glob.IF′ = ∅ due to the ind. hypo. (Q2 of S1′ and S2′) and since (X ∩ glob.B) ⊆ (X1,X2,X3,X4) by P7.

DP1.

((XL3i,XL4f,XL5f)\ddef.IF′) ∪ (input.IF′\Y)
= {set theory: (XL4f,XL5f) ⊆ ddef.IF′ and XL3i ∩ ddef.IF′ = ∅}
XL3i ∪ (input.IF′\Y)
= {input of IF′}
XL3i ∪ ((glob.B′ ∪ input.(S1′;XL4f,XL5f := XL4t,XL5t)
    ∪ input.(S2′;XL4f,XL5f := XL4e,XL5e))\Y)
= {input of ‘;’; def. of XL4t, XL4e}
XL3i ∪ ((glob.B′ ∪ input.S1′ ∪ ((XL4d1t,XL4d2i,XL4d1d2t,XL5t)\ddef.S1′)
    ∪ input.S2′ ∪ ((XL4d1i,XL4d2e,XL4d1d2e,XL5e)\ddef.S2′))\Y)
= {ind. hypo., twice: (XL4d1t,XL4d1d2t,XL5t) ⊆ ddef.S1′ (DP2 of S1′) and
   (XL4d2e,XL4d1d2e,XL5e) ⊆ ddef.S2′ (DP2 of S2′)}
XL3i ∪ ((glob.B′ ∪ input.S1′ ∪ (XL4d2i\ddef.S1′)
    ∪ input.S2′ ∪ (XL4d1i\ddef.S2′))\Y)
⊆ {set theory: (XL4d1i,XL4d2i) ⊆ XL4i}
(XL3i,XL4i) ∪ ((glob.B′ ∪ input.S1′ ∪ input.S2′)\Y)
= {set theory}
(XL3i,XL4i) ∪ (glob.B′\Y) ∪ (input.S1′\Y) ∪ (input.S2′\Y)
⊆ {def. of B′ and glob.B ⊆ (X1,X2,X3,X4,Y) due to P1 and P7}
(XL1i,XL2i,XL3i,XL4i) ∪ (input.S1′\Y) ∪ (input.S2′\Y)
⊆ {ind. hypo., twice: (input.S1′\Y) ⊆ (XL1i,XL2i,XL3i,XL4i) (DP1 of S1′) and
   (input.S2′\Y) ⊆ (XL1i,XL2i,XL3i,XL4i) (DP1 of S2′)}
(XL1i,XL2i,XL3i,XL4i).

DP2. By construction, we indeed have (XL4f,XL5f) ⊆ ddef.IF′.
DO
We need to enforce the policy of non-simultaneous liveness of the instances of a program variable. Since ddef.DO is empty, the final instance of a live-on-exit variable must also be live-on-entry to the loop. Thus, the set of live-on-exit-only variables X5 is empty. Furthermore, if the (live-on-exit) variable is also in input.DO, it must be the final instance that is live-on-entry to the SSA loop DO′. We achieve that by defining such instances that are also defined in the loop (XL4f) just before the loop begins and at the end of its body. Similarly, other variables in def.DO ∩ input.DO (not being live-on-exit) should have a dedicated (fresh) loop-entry instance (XL2); this choice is also summarized in the sketch following this case. Thus, when recursively transforming the loop body S1 to SSA, non-simultaneous liveness is ensured by sending instances (XL1i,XL2,XL3i,XL4f) as initial values. Now what about final values for S1′?

Non-defined live variables (X1,X3) should be given the corresponding initial instances (XL1i,XL3i). As for the defined variables (X2,X4), fresh instances (XL2b,XL4b) must be invented for their final S1′ value. These should in turn be assigned back to the loop-entry instances, just after S1′.

The above construction along with the given P1-P7 ensures that
S1′ := toSSA.(S1,X,(XL1i,XL2,XL3i,XL4f),(XL1i,XL2b,XL3i,XL4b),Y,XLs′)
— where XLs′ := (XLs,XL2,XL2b,XL4b) — enjoys all its P1-P7.

We now aim to apply Program equivalence D.4 with S1″ := S1′;XL2,XL4f := XL2b,XL4b playing the role of its S1′, B[X1,X2,X3,X4\XL1i,XL2,XL3i,XL4f] that of B′, and (XL1i,XL2),(XL3i,XL4f) those of XL1i,XL2i respectively. For this to be correct, we have to show its P1-P5. P1 is a result of our P1-P4 and the freshness of XL2; P2 is given by our P1, P3, P4 along with RE5 (which yields input.DO ⊆ glob.DO); P3 is due to the ind. hypo. (DP1 of S1′) and the def. of B′ along with P1, P7 (yielding glob.B ⊆ (X1,X2,X3,X4,Y)); P4 is correct by construction (of B′); and finally,
for P5, we observe
(S1;XL1i,XL2,XL3i,XL4f:= X1,X2,X3,X4)[live XL1i,XL2,XL3i,XL4f,Y]=
(XL1i,XL2,XL3i,XL4f:= X1,X2,X3,X4;S100)[live XL1i,XL2,XL3i,XL4f,Y]:
(XL1i,XL2,XL3i,XL4f:= X1,X2,X3,X4;
S100)[live XL1i,XL2,XL3i,XL4f,Y]
={def. of S100 }
(XL1i,XL2,XL3i,XL4f:= X1,X2,X3,X4;
S10;XL2,XL4f:= XL2b,XL4b)[live XL1i,XL2,XL3i,XL4f,Y]
={prop. liveness}
((XL1i,XL2,XL3i,XL4f:= X1,X2,X3,X4;
S10)[live XL1i,XL2,XL3i,XL4f,Y];
XL2,XL4f:= XL2b,XL4b)[live XL1i,XL2,XL3i,XL4f,Y]
={ind. hypo. (Q1 of S10)}
((S1;XL1i,XL2,XL3i,XL4f:= X1,X2,X3,X4)
[live XL1i,XL2,XL3i,XL4f,Y];
XL2,XL4f:= XL2b,XL4b)[live XL1i,XL2,XL3i,XL4f,Y]
={remove liveness}
(S1;XL1i,XL2,XL3i,XL4f:= X1,X2,X3,X4;
XL2,XL4f:= XL2b,XL4b)[live XL1i,XL2,XL3i,XL4f,Y]
={assignment-based sub. (Law 18): (X2,X4) (XL1i,XL2,XL3i,XL4f) due to
P2,P3,P4 and the freshness of XL2}
(S1;XL1i,XL2,XL3i,XL4f:= X1,X2,X3,X4;
XL2,XL4f:= X2,X4)[live XL1i,XL2,XL3i,XL4f,Y]
={remove dead assignments: (XL2,XL4f)(XL1i,X2,XL3i,X4,Y) again, by
P2,P3,P4 and the freshness of XL2}
(S1;XL1i,XL3i:= X1,X3;
XL2,XL4f:= X2,X4)[live XL1i,XL2,XL3i,XL4f,Y]
={merge following assignments (Law 1):
(XL1i,XL3i)(XL2,XL4f,X2,X4) by P2,P3,P4}
(S1;XL1i,XL2,XL3i,XL4f:= X1,X2,X3,X4)
[live XL1i,XL2,XL3i,XL4f,Y].
Let DO := while B do S1 od and
DO′ := while B′ do S1′;XL2,XL4f := XL2b,XL4b od. We are now ready to transform the DO statement, as follows:

(DO;XL3i,XL4f := X3,X4)[live XL3i,XL4f,Y]
= {Program equivalence D.4, as explained and justified above}
(XL1i,XL2,XL3i,XL4f := X1,X2,X3,X4;DO′)[live XL3i,XL4f,Y]
= {split assignments (Law 1), P2-P4: (XL1i,XL3i) ∩ (X2,X4) = ∅}
(XL1i,XL3i := X1,X3;XL2,XL4f := X2,X4;DO′)[live XL3i,XL4f,Y]
= {intro. dead assignment (Law 23), DP1: (XL2i,XL4i) ∩
   ((((XL3i,XL4f) ∪ (input.DO′\Y))\(XL2,XL4f)) ∪ (X2,X4)) = ∅}
(XL1i,XL2i,XL3i,XL4i := X1,X2,X3,X4;
XL2,XL4f := X2,X4;DO′)[live XL3i,XL4f,Y]
= {assignment-based sub. (Law 18), P2-P4: (X2,X4) ∩ (XL1i,XL2i,XL3i,XL4i) = ∅}
(XL1i,XL2i,XL3i,XL4i := X1,X2,X3,X4;
XL2,XL4f := XL2i,XL4i;DO′)[live XL3i,XL4f,Y].
We thus derive

toSSA.(DO,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4f),Y,XLs) ≜
XL2,XL4f := XL2i,XL4i;DO′
where DO := while B do S1 od,
DO′ := while B′ do S1′;XL2,XL4f := XL2b,XL4b od,
(XL2,XL2b,XL4b) := fresh.((X2,X2,X4),(X,Y,XLs)),
XLs′ := (XLs,XL2,XL2b,XL4b),
B′ := B[X1,X2,X3,X4\XL1i,XL2,XL3i,XL4f]
and S1′ := toSSA.(S1,X,(XL1i,XL2,XL3i,XL4f),(XL1i,XL2b,XL3i,XL4b),Y,XLs′).
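Informally, the loop-entry instance of each variable can be read off as follows; the sketch (assumed names; per variable rather than per subset) mirrors the choice of XL1i, XL2, XL3i and XL4f above:

    def loop_entry_instance(v, defined_in_loop, init_inst, final_inst, fresh):
        """Instance of v live on entry to the SSA loop DO' and at the end
        of its body. final_inst holds the final instances of live-on-exit
        variables (X4); init_inst holds initial instances (X1, X3)."""
        if v not in defined_in_loop:
            return init_inst[v]     # X1, X3: the initial instance flows through
        if v in final_inst:
            return final_inst[v]    # X4: the final instance is live around the loop
        return fresh(v)             # X2: a dedicated fresh loop-entry instance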
Q2. Indeed X ∩ glob.(XL2,XL4f := XL2i,XL4i;DO′) = ∅ due to the ind. hypo. (Q2 of S1′) and since P7 gives (X ∩ glob.B) ⊆ (X1,X2,X3,X4).
DP1.
((XL3i,XL4f)\ddef.XL2,XL4f:= XL2i,XL4i;DO0)
(input.XL2,XL4f:= XL2i,XL4i;DO0\Y)
={ddef and input of ‘:=’, ‘ ;’ and DO}
((XL3i,XL4f)\(XL2,XL4f)) (((XL2i,XL4i)(input.DO0\(XL2,XL4f))) \Y)
={set theory: XL3i(XL2,XL4f) due to P3 and the freshness of XL2;
(XL2i,XL4i)Ydue to P3,P4}
(XL2i,XL3i,XL4i)((input.DO0\(XL2,XL4f)) \Y)
={input of DO’ and set theory}
(XL2i,XL3i,XL4i)
((glob.B0input.S10;XL2,XL4f:= XL2b,XL4b)\(XL2,XL4f,Y))
⊆ {set theory: (glob.B0\(XL2,XL4f,Y)) (XL1i,XL3i)
due to P1,P6 and XDO0}
(XL1i,XL2i,XL3i,XL4i)
(input.S10;XL2,XL4f:= XL2b,XL4b\(XL2,XL4f,Y))
={input of ‘ ;’ and ‘:=’}
(XL1i,XL2i,XL3i,XL4i)
((input.S10((XL2b,XL4b)\ddef.S10)) \(XL2,XL4f,Y))
⊆ {ind. hypo. (DP1 of S10): (input.S10\Y)(XL1i,XL2,XL3i,XL4f)}
(XL1i,XL2i,XL3i,XL4i)(((XL2b,XL4b)\ddef.S10)\(XL2,XL4f,Y))
={(XL2b,XL4b)ddef.S10(derived property DP2):
(XL2b,XL4b) are final instances in S10, of defined variables (X2,X4) def.S1}
(XL1i,XL2i,XL3i,XL4i).
DP2.

For variables in X4 we observe that the corresponding final instances XL4f are clearly in ddef.(XL2,XL4f := XL2i,XL4i) and hence in
ddef.(XL2,XL4f := XL2i,XL4i;DO′), as required.
D.3 Back from SSA
The following is a derivation of S := fromSSA.(S′,X,XL1i,XL2f,Y,XLs) when S′ includes at most one live instance of any variable in X at each program point. The goal is to turn
(XL1i := X1;S′)[live XL2f,Y] with X ∩ glob.S′ = ∅ into the equivalent
(S;XL2f := X2)[live XL2f,Y] with XLs ∩ glob.S = ∅. This way, a program statement in SSA form can be turned back to the original, as in the following derivation:

|[var XLi,XLf,XLim; XL1i := X1;S′;X := XLf ]|
= {def. of live with Y := def.S′\XLs}
(XL1i := X1;S′;X := XLf )[live X,Y]
= {prop. liveness}
((XL1i := X1;S′)[live XLf,Y];X := XLf )[live X,Y]
= {S := fromSSA.(S′,X,XL1i,XLf,Y,XLs) (Transformation D.6, see below)}
((S;XLf := X)[live XLf,Y];X := XLf )[live X,Y]
= {remove aux. liveness info.}
(S;XLf := X;X := XLf )[live X,Y]
= {assignment-based sub.; remove self-assignment;
   remove dead assignment; remove aux. liveness info.}
S.
Instead of deriving the fromSSA algorithm directly from the SSA form, we shall take a more general approach. The goal is to allow some transformations (e.g. slicing) to be performed on the SSA form itself, before returning to the original form.

We observe that the return from SSA involves the merge of all instances (in XLs) back into the original program variables (X). We hypothesize that as long as there is no simultaneous liveness of any two instances (to be merged) at any program point, and as long as no such instances are simultaneously defined (i.e. in a statement of multiple assignment), the merge of all instances should be possible. Insisting on the removal of self-assignments (after, or while, merging) will allow assignments to pseudo instances (in IFs and DO loops) to be eliminated.

Thus we shall develop an algorithm for merging sets of variables and use it for returning from SSA, as follows:

fromSSA.(S′,X,XL1i,XLf,Y,XLs) ≜ merge-vars.(S′,XLs,X,XL1i,XLf,Y).
Transformation D.6. Let S′ be any core statement and (XL1i ∪ XL2f) ⊆ XLs; let S be a statement defined by
S := merge-vars.(S′,XLs,X,XL1i,XL2f,Y); then (Q1:)

(XL1i := X1;S′)[live XL2f,Y] = (S;XL2f := X2)[live XL2f,Y]

and (Q2:) XLs ∩ glob.S = ∅

provided

P1: glob.S′ ⊆ (XLs,Y),
P2: (XL1i ∪ XL2f) ⊆ XLs,
P3: (X1 ∪ X2) ⊆ X,
P4: X ∩ (XLs,Y) = ∅,
P5: no two instances of any member of X are simultaneously live at any point in S′[live XL2f,Y],
P6: (XLs ∩ ((XL2f\ddef.S′) ∪ input.S′)) ⊆ XL1i,
P7: no def-on-live: i.e. no instance is defined where another instance is live-on-exit, and
P8: no multiple-defs: i.e. each assignment defines at most one instance (of any x ∈ X); P7 and P8 are made precise in the sketch below.
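For concreteness, P7 and P8 admit the following direct checks on each (multiple) assignment; the Python sketch below is an assumed rendering, with var_of mapping each instance to its program variable:

    def check_assignment(defined, live_out, var_of):
        """defined: instances assigned by one multiple assignment;
        live_out: instances live on exit from it."""
        # P8, no multiple-defs: at most one instance of any variable is defined.
        variables = [var_of[d] for d in defined]
        assert len(variables) == len(set(variables)), "P8 violated"
        # P7, no def-on-live: a defined instance may not coexist with a
        # live-on-exit instance of the same program variable (other than itself).
        for d in defined:
            for l in live_out:
                assert l == d or var_of[l] != var_of[d], "P7 violated"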
Proof. As with the toSSA algorithm, the derivation of merge-vars (and hence of fromSSA) is given hand in hand with its proof of correctness. For a given statement S′, we assume for any slip T′ of S′ the correctness of its merge-vars transformation in proving the correctness for S′ itself.
Assignment
Aiming to use Program equivalence D.1, we distinguish eight disjoint subsets of X (X1-X8) and the corresponding defined instances (XL1f,XL2,XL3f,XL4,XL5f,XL6; indeed at most one instance of each variable is defined, due to our P8; of these, XL1f,XL3f,XL5f are live-on-exit whereas XL2,XL4,XL6 represent dead assignments) and used instances (XL1i,XL2i,XL3i,XL4i,XL7i,XL8i, of which XL7i is live on both entry and exit, and the rest are live-on-entry only).

In terms of our expression of merge-vars and its preconditions P1-P8, we rewrite its X1 as (X1,X2,X3,X4,X7,X8), XL1i as (XL1i,XL2i,XL3i,XL4i,XL7i,XL8i), X2 as (X1,X3,X5,X7) and XL2f as (XL1f,XL3f,XL5f,XL7i). According to this decomposition, our target Q1 can be rewritten as

(XL1i,XL2i,XL3i,XL4i,XL7i,XL8i := X1,X2,X3,X4,X7,X8;S′)
[live XL1f,XL3f,XL5f,XL7i,Y]
=
(S;XL1f,XL3f,XL5f,XL7i := X1,X3,X5,X7)
[live XL1f,XL3f,XL5f,XL7i,Y]

with S′ being the assignment
XL1f,XL2,XL3f,XL4,XL5f,XL6,Y1 := XL1i,XL2i,E1′,E2′,E3′,E4′,E5′. Accordingly, preconditions P1-P8 can be understood as

P1: glob.S′ ⊆ (XLs,Y),
P2: ((XL1i,XL2i,XL3i,XL4i,XL7i,XL8i) ∪ (XL1f,XL3f,XL5f,XL7i)) ⊆ XLs,
P3: ((X1,X2,X3,X4,X7,X8) ∪ (X1,X3,X5,X7)) ⊆ X,
P4: X ∩ (XLs,Y) = ∅,
P5: no two instances of any member of X are simultaneously live at any point in S′,
P6: (XLs ∩ (XL7i ∪ input.S′)) ⊆ (XL1i,XL2i,XL3i,XL4i,XL7i,XL8i),
P7: no def-on-live: i.e. (X2,X4,X6) ∩ (X1,X3,X5,X7) = ∅,
P8: no multiple-defs: i.e. (X1,X2,X3,X4,X5,X6) are disjoint.
For Program equivalence D.1 to be correct, we need to verify its P1-P11.
P1 ((X1,X2,X3,X4,X5,X6,X7,X8) ⊆ X) holds by construction;
P2 ((XL1i,XL2i,XL3i,XL4i,XL7i,XL8i) ⊆ XLs) is due to our P2;
P3 ((XL1f,XL2,XL3f,XL4,XL5f,XL6,XL7i) ⊆ XLs) is due to our P2, P7 and the disjointness of instances (XLs.i ∩ XLs.j = ∅ for all i ≠ j);
P4 (Y1 ⊆ Y) is due to our P1;
P5 (disjointness of (X,XLs,Y)) is due to our P4;
for P6-P11 to hold, we define E1,E2,E3,E4,E5 :=
(E1′,E2′,E3′,E4′,E5′)[XL1i,XL2i,XL3i,XL4i,XL7i,XL8i\X1,X2,X3,X4,X7,X8]; now P6
(glob.(E1,E2,E3,E4,E5) ⊆ (X1,X2,X3,X4,X7,X8,Y)) is due to our P1, P6 and the redundancy of reversed double sub. ((X1,X2,X3,X4,X7,X8) ∩ glob.(E1′,E2′,E3′,E4′,E5′) = ∅ due to P1, P4); the latter argument proves P7-P11 as well.
We are now ready to apply Program equivalence D.1, as follows:

(XL1i,XL2i,XL3i,XL4i,XL7i,XL8i := X1,X2,X3,X4,X7,X8;
XL1f,XL2,XL3f,XL4,XL5f,XL6,Y1 := XL1i,XL2i,E1′,E2′,E3′,E4′,E5′)
[live XL1f,XL3f,XL5f,XL7i,Y]
= {Program equivalence D.1 with
   E1,E2,E3,E4,E5 := (E1′,E2′,E3′,E4′,E5′)
   [XL1i,XL2i,XL3i,XL4i,XL7i,XL8i\X1,X2,X3,X4,X7,X8]}
(X3,X4,X5,X6,Y1 := E1,E2,E3,E4,E5;
XL1f,XL3f,XL5f,XL7i := X1,X3,X5,X7)[live XL1f,XL3f,XL5f,XL7i,Y].
We thus derive

merge-vars.(XL1f,XL2,XL3f,XL4,XL5f,XL6,Y1 := XL1i,XL2i,E1′,E2′,E3′,E4′,E5′,
XLs,X,(XL1i,XL2i,XL3i,XL4i,XL7i,XL8i),(XL1f,XL3f,XL5f,XL7i),Y) ≜
X3,X4,X5,X6,Y1 := E1,E2,E3,E4,E5
where E1,E2,E3,E4,E5 := (E1′,E2′,E3′,E4′,E5′)
[XL1i,XL2i,XL3i,XL4i,XL7i,XL8i\X1,X2,X3,X4,X7,X8].

Q2. XLs ∩ glob.(X3,X4,X5,X6,Y1 := E1,E2,E3,E4,E5) = ∅,
since (due to P2) (XLs ∩ glob.(E1′,E2′,E3′,E4′,E5′)) ⊆ (XL1i,XL2i,XL3i,XL4i,XL7i,XL8i)
and those are all substituted by elements of (X1,X2,X3,X4,X7,X8), which are disjoint from XLs (due to P2-P4).
Sequential composition
We define XL3 := (XLs ∩ ((XL2f\ddef.S2′) ∪ input.S2′)) to be the set of intermediately-live instances. Due to P5 (no simultaneously-live instances), XL3 includes at most one instance of each member of X, as required for the two recursive calls: S1 := merge-vars.(S1′,XLs,X,XL1i,XL3,Y) and S2 := merge-vars.(S2′,XLs,X,XL3,XL2f,Y). Preconditions P1, P4, P5, P7 and P8 are trivial consequences of S1′ and S2′ both being slips of S′ and of our construction of S1 and S2 (in terms of the parameters to merge-vars). P2 for both cases is fine as well, due to our P2 and the construction of XL3. P3 is fine due to our P3 and by construction of X3, the set of program variables (in X) corresponding to XL3. Finally, P6 for the call on S2′ is trivial, again by construction of XL3; it is also correct for S1′, due to our P6 and the associativity of liveness (Lemma 7.2). As a result, we get

(S2;XL2f := X2)[live XL2f,Y] = (XL3 := X3;S2′)[live XL2f,Y] and
(S1;XL3 := X3)[live XL3,Y] = (XL1i := X1;S1′)[live XL3,Y], as required (as P2 and P1 respectively) in the following:

(XL1i := X1;S1′;S2′)[live XL2f,Y]
= {Program equivalence D.2 with
   S1 := merge-vars.(S1′,XLs,X,XL1i,XL3,Y);
   S2 := merge-vars.(S2′,XLs,X,XL3,XL2f,Y)}
(S1;S2;XL2f := X2)[live XL2f,Y].

We thus derive

merge-vars.(S1′;S2′,XLs,X,XL1i,XL2f,Y) ≜
merge-vars.(S1′,XLs,X,XL1i,XL3,Y);merge-vars.(S2′,XLs,X,XL3,XL2f,Y)
where XL3 := XLs ∩ ((XL2f\ddef.S2′) ∪ input.S2′).

Q2. As required, XLs ∩ glob.(S1;S2) = ∅ due to the ind. hypo. (Q2 of S1 and S2).
IF
All live-on-exit variables of the IF are also live-on-exit from each branch. Similarly, it should be safe to assume all of the IF's live-on-entry variables are live-on-entry to each branch. This may be an over-approximation, but a harmless one, since the initial instances of all those variables are, in any case, available on entry to both branches. This harmlessness can be verified by the observations that ddef of each branch is a superset of ddef.IF′ and that the corresponding input is a subset of input.IF′.

Thus, the recursive calls, computing S1 := merge-vars.(S1′,XLs,X,XL1i,XL2f,Y) and S2 := merge-vars.(S2′,XLs,X,XL1i,XL2f,Y), faithfully maintain P6. Since the calls are on slips of IF′, and since all remaining parameters are identical, all other preconditions P1-P5, P7, P8 (of both recursive calls) trivially hold. As a result, we get

(S1;XL2f := X2)[live XL2f,Y] = (XL1i := X1;S1′)[live XL2f,Y] and
(S2;XL2f := X2)[live XL2f,Y] = (XL1i := X1;S2′)[live XL2f,Y], as required (in P4 and P5 respectively) for

(XL1i := X1;if B′ then S1′ else S2′)[live XL2f,Y]
= {Program equivalence D.3 with
   B := B′[XL1i\X1];
   S1 := merge-vars.(S1′,XLs,X,XL1i,XL2f,Y);
   S2 := merge-vars.(S2′,XLs,X,XL1i,XL2f,Y):
   P1 ([B′ ≡ B′[XL1i\X1][X1\XL1i]]) is due to
   X1 ∩ glob.B′ = ∅ (our P1, P3, P4) and the redundancy of reversed double sub.;
   P2 (XL1i ∩ X1 = ∅) is due to our P2-P4; it also proves
   P3 (XL1i ∩ glob.(B′[XL1i\X1]) = ∅); finally
   the ind. hypo. (Q1), twice, gives P4 and P5}
(if B then S1 else S2;XL2f := X2)[live XL2f,Y].

We thus derive

merge-vars.(if B′ then S1′ else S2′,XLs,X,XL1i,XL2f,Y) ≜ if B′[XL1i\X1]
then merge-vars.(S1′,XLs,X,XL1i,XL2f,Y) else merge-vars.(S2′,XLs,X,XL1i,XL2f,Y).

Q2. We get XLs ∩ glob.IF = ∅, as required, due to the ind. hypo. (Q2, twice) and since (XLs ∩ glob.B′) ⊆ XL1i (due to input of IF and P6).
DO
Let XL1i, XL2i be the live-on-entry instances, with XL2i also live-on-exit (these must be the same instances as on entry, due to P6 and ddef.DO′ being empty); let (X1,X2) ⊆ X be the corresponding program variables (the one-to-one mapping from XL1i,XL2i to X1,X2 is due to P5). Since all live variables on entry to DO′ are also live on both ends of S1′, we define
S1 := merge-vars.(S1′,XLs,X,(XL1i,XL2i),(XL1i,XL2i),Y). The validity of P1-P8 of the call to merge-vars is a consequence of the given P1-P8 (of DO′), of S1′ being a slip of DO′ and of the def. of ddef and input of DO loops.

We now aim to use Program equivalence D.4 with the merged S1 and
B := B′[XL1i,XL2i\X1,X2], and need to show correctness of its P1-P5 preconditions. P1 (disjointness of (X1,X2,XL1i,XL2i,Y)) is due to P2-P4 and the definition of XL1i,XL2i (only the latter being live-on-exit from DO′); P2 ((XL1i,XL2i) ∩ input.DO = ∅) is due to P2, P4, Q2 and RE5; P3 is due to our P1 and P6; P4 ([B′ ≡ B′[XL1i,XL2i\X1,X2][X1,X2\XL1i,XL2i]]) is due to (X1,X2) ∩ glob.B′ = ∅ (our P1, P3, P4) and the redundancy of reversed double sub.; and finally P5 is given by the induction hypothesis (Q1 of S1′); then

(XL1i,XL2i := X1,X2;
while B′ do S1′ od)[live XL2i,Y]
= {Program equivalence D.4 with B := B′[XL1i,XL2i\X1,X2] and
   S1 := merge-vars.(S1′,XLs,X,(XL1i,XL2i),(XL1i,XL2i),Y)}
(while B do S1 od;XL2i := X2)[live XL2i,Y].

We thus derive

merge-vars.(while B′ do S1′ od,XLs,X,(XL1i,XL2i),XL2i,Y) ≜
while B′[XL1i,XL2i\X1,X2] do merge-vars.(S1′,XLs,X,(XL1i,XL2i),(XL1i,XL2i),Y) od.

Q2. Finally, we get the required XLs ∩ glob.DO = ∅ due to the ind. hypo. (Q2 on S1′) and since (XLs ∩ glob.B′) ⊆ (XL1i,XL2i).
D.4 SSA is de-SSA-able
Theorem 8.4. Let S be any core statement and let X,Y := (def.S),(glob.S\def.S), X1 := (X ∩ ((X\ddef.S) ∪ input.S)) and (XL1i,XLf) := fresh.((X1,X),(X,Y)); let S′ be the SSA version of S, defined as S′ := toSSA.(S,X,XL1i,XLf,Y,(XL1i,XLf)); then S′ is de-SSA-able. That is, all preconditions, P1-P8, of the fromSSA algorithm hold for S″ := fromSSA.(S′,X,XL1i,XLf,Y,XLs) where XLs := ((XL1i,XLf) ∪ (def.S′\Y)).

Proof. Preconditions P1 (glob.S′ ⊆ (XLs,Y)) and P2 ((XL1i ∪ XLf) ⊆ XLs) hold by definition of XLs; P3 ((X1 ∪ X) ⊆ X) holds by definition of X1 and set theory; P4 (X ∩ (XLs,Y) = ∅) is due to the definition of X, Y, XLs, to RE5 (i.e. def.S ⊆ glob.S) and to Q2 of toSSA (i.e. X ∩ glob.S′ = ∅); and P6 ((XLs ∩ ((XLf\ddef.S′) ∪ input.S′)) ⊆ XL1i) is due to DP1 of toSSA.

We are left to show no-simultaneous-liveness (P5), no-def-on-live (P7) and no-multiple-defs (P8). These shall be proved by induction over the structure of S.

For P5 (no-simultaneous-liveness) we first note that having one final instance for each live-on-exit program variable, and similarly having one initial instance for each live-on-entry variable, as is guaranteed by toSSA's derived property DP1, ensures no simultaneous liveness on entry. Thus, in proving P5 for each specific case, we shall only be obliged to show no simultaneous liveness at internal program points.
Assignment
Recall toSSA.(X4,X2,X5,X6,Y1 := E1,E2,E3,E4,E5,
X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4f,XL5f),Y,XLs) ≜
XL4f,XL2,XL5f,XL6,Y1 := E1′,E2′,E3′,E4′,E5′ where
(E1′,E2′,E3′,E4′,E5′) := (E1,E2,E3,E4,E5)
[X1,X2,X3,X4\XL1i,XL2i,XL3i,XL4i]
and (XL2,XL6) := fresh.((X2,X6),(X,Y,XLs)).

P7 (no-def-on-live) and P8 (no-multiple-defs) are both due to the disjointness of (X2,X4,X5,X6) and (XL4f,XL5f), the freshness of (XL2,XL6) and hence the disjointness of (XL2,XL4f,XL5f,XL6). Note that XL4f,XL5f are actually live-on-exit, thus not breaking P7.
Sequential composition
Recall toSSA.(S1;S2,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4f,XL5f),Y,XLs) ≜
S1′;S2′ where both S1′, S2′ are constructed by recursive calls to toSSA.

P5, P7, P8: The induction hypothesis ensures no simultaneous liveness at any point of S1′ or S2′ and no def-on-live or multiple-defs in any internal assignment slip.
IF
toSSA.(IF,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4f,XL5f),Y,XLs) ≜ IF′
where IF := if B then S1 else S2,
IF′ := if B[X1,X2,X3,X4\XL1i,XL2i,XL3i,XL4i]
       then S1′;XL4f,XL5f := XL4t,XL5t else S2′;XL4f,XL5f := XL4e,XL5e,
XL4t := (XL4d1t,XL4d2i,XL4d1d2t),
XL4e := (XL4d1i,XL4d2e,XL4d1d2e),
S1′ := toSSA.(S1,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4t,XL5t),Y,XLs′)
and S2′ := toSSA.(S2,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4e,XL5e),Y,XLs″).

P5: We have (XL3i,XL4f,XL5f) live at the end of IF′ and at the end of both branches. The then branch, ending with the assignment XL4f,XL5f := XL4t,XL5t, yields (XL3i,XL4t,XL5t) as live at the end of S1′ (with one instance for each member of (X3,X4,X5)), and maintains no simultaneous liveness in S1′ due to the induction hypothesis. Similarly, the triple (XL3i,XL4e,XL5e) includes one instance for each member of (X3,X4,X5), being live at the end of S2′.

P7, P8: The (pseudo) assignments at the end of both branches of IF′ are both to the live-on-exit (XL4f,XL5f), final instances of program variables (X4,X5). This, along with the ind. hypo. on S1′ and S2′, maintains P7 and P8.
DO
toSSA.(DO,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4f),Y,XLs) ≜
XL2,XL4f := XL2i,XL4i;DO′
where DO := while B do S1 od,
DO′ := while B′ do S1′;XL2,XL4f := XL2b,XL4b od,
(XL2,XL2b,XL4b) := fresh.((X2,X2,X4),(X,Y,XLs)),
XLs′ := (XLs,XL2,XL2b,XL4b),
B′ := B[X1,X2,X3,X4\XL1i,XL2,XL3i,XL4f]
and S1′ := toSSA.(S1,X,(XL1i,XL2,XL3i,XL4f),(XL1i,XL2b,XL3i,XL4b),Y,XLs′).

P5. First, live instances ahead of DO′, as well as at the end of its body, are from (XL1i,XL2,XL3i,XL4f), one instance for each member of (X1,X2,X3,X4). This is so due to DP1 of S1′ (i.e. input.S1′ ⊆ (XL1i,XL2,XL3i,XL4f)) and the definition of B′. Then the assignment at the end of the loop body, XL2,XL4f := XL2b,XL4b, yields (XL1i,XL2b,XL3i,XL4b) as live instances on exit from S1′. The ind. hypo. ensures no simultaneous liveness in (and ahead of) S1′ itself.

P7, P8: As mentioned above, live instances at the end of the loop body are from (XL1i,XL2,XL3i,XL4f). Thus, the assignment there to the live (XL2,XL4f), one instance for each member of (X2,X4), indeed maintains both P7 and P8. Similarly, the assignment (to the same instances) ahead of DO′ (where, again, the same instances are live) maintains P7 and P8.
D.5 An SSA-based slice is de-SSA-able
Theorem 9.6. Any slide-independent statement from the SSA version of any core statement is de-SSA-able.

That is, let S be any core statement and let X,Y := (def.S),(glob.S\def.S), X1 := (X ∩ ((X\ddef.S) ∪ input.S)) and (XL1i,XLf) := fresh.((X1,X),(X,Y)); let S′ be the SSA version of S, defined as S′ := toSSA.(S,X,XL1i,XLf,Y,(XL1i,XLf)); let XLs := ((XL1i,XLf) ∪ (def.S′\Y)) be the full set of instances (of X, in S′) and let XLI be any (slide-independent) subset of those instances, with final instances XL2f := XLI ∩ XLf; finally, let SI′ := slides.S′.XLI be the corresponding (slide-independent) statement; then SI′ is de-SSA-able. That is, all preconditions, P1-P8, of the fromSSA algorithm hold for SI := fromSSA.(SI′,X,XL1i,XL2f,Y,XLs).

Proof. Preconditions P1 (glob.SI′ ⊆ (XLs,Y)) and P2 ((XL1i ∪ XL2f) ⊆ XLs) hold by definition of XLs; P3 ((X1 ∪ X2) ⊆ X) holds by definition of X1 (and set theory) and by definition of XL2f and its mapping to program variables X2 in X; and P4 (X ∩ (XLs,Y) = ∅) is due to the definition of X, Y, XLs and to Q2 of toSSA (i.e. X ∩ glob.S′ = ∅).

For P5, we observe that no-simultaneous-liveness is known for S′[live XL2f,Y] (Theorem 8.4). This property is preserved by taking the slides of any slide-independent set (Corollary C.3). Thus (slides.S′.XLI)[live XL2f,Y] enjoys no simultaneous liveness for instances of (each member of) X.
For P6 ((XLs ∩ ((XL2f\ddef.SI′) ∪ input.SI′)) ⊆ XL1i), we observe

XLs ∩ ((XL2f\ddef.(slides.S′.XLI)) ∪ input.(slides.S′.XLI))
= {recall def. of XLs := ((XL1i,XLf) ∪ (def.S′\Y));
   let XLs′ := XLs\XL1i such that XLs = (XL1i,XLs′);
   note that XLs′ ⊆ def.S′ since DP2 of toSSA and RE4 give XLf ⊆ def.S′}
(XL1i,XLs′) ∩ ((XL2f\ddef.(slides.S′.XLI)) ∪ input.(slides.S′.XLI))
⊆ {set theory}
XL1i ∪ (XLs′ ∩ ((XL2f\ddef.(slides.S′.XLI)) ∪ input.(slides.S′.XLI)))
= {XLI ∩ def.S′ ⊆ XLs′, XL2f ⊆ XLI and, since XLI is slide-independent in slides.S′,
   we get input.(slides.S′.XLI) ∩ def.S′ ⊆ XLI}
XL1i ∪ (XLI ∩ def.S′ ∩ ((XL2f\ddef.(slides.S′.XLI)) ∪ input.(slides.S′.XLI)))
⊆ {set theory}
XL1i ∪ (XLI ∩ def.S′ ∩ ((XL2f\(XLI ∩ ddef.(slides.S′.XLI)))
    ∪ (XLI ∩ input.(slides.S′.XLI))))
= {Lemma C.4 with S,V,X := S′,XLI,XLI:
   indeed XLI ∩ def.S′ ⊆ XLI}
XL1i ∪ (XLI ∩ def.S′ ∩ ((XL2f\(XLI ∩ ddef.S′))
    ∪ (XLI ∩ input.(slides.S′.XLI))))
= {set theory: XL2f ⊆ XLI by definition and
   XL2f ⊆ ddef.S′ by DP2 of toSSA}
XL1i ∪ (XLI ∩ def.S′ ∩ (XLI ∩ input.(slides.S′.XLI)))
⊆ {Lemma C.5 with S,VI,X := S′,XLI,XLI:
   indeed XLI ∩ def.S′ ⊆ XLI}
XL1i ∪ (XLI ∩ def.S′ ∩ (XLI ∩ input.S′))
= {XLs ∩ input.S′ ⊆ XL1i by DP1 of toSSA; XLI ⊆ XLs}
XL1i.
For P7 (no def-on-live), we recall that no def-on-live is known for S′[live XL2f,Y] (Theorem 8.4). As with P5, we show that this property is preserved by taking the slides of any slide-independent set. Since S′[live XL2f,Y] enjoys the property of no-def-on-live (of XLI), and since any assignment slip of the form (XLI1 := E1)[live XLI21,Y] in (slides.S′.XLI)[live XL2f,Y] has a corresponding slip (XLI1,coXLI1 := E1,E2)[live XLI2,Y] of S′[live XL2f,Y] with XLI21 ⊆ XLI2 (due to Theorem C.2), we observe that a defined instance x′ ∈ XLI1 may only cause a def-on-live violation if another instance x″ (of the same program variable x) is live-on-exit from the assignment, i.e. x″ ∈ XLI21. This can only happen if x″ ∈ XLI2 as well (since XLI21 ⊆ XLI2). In such a case, x′ already causes a def-on-live violation in the corresponding (XLI1,coXLI1 := E1,E2)[live XLI2,Y], thus contradicting the de-SSA-ability of S′[live XL2f,Y].

Finally, P8 (no-multiple-defs) holds for S′ (see Theorem 8.4) and hence for any of its slides.
Appendix E
Final-Use Substitution
E.1 Formal derivation
The following is a formal derivation of final-use substitution, for any core statement S and matching sets of variables X and X′.

We begin with S;{X = X′} where X′ ∩ glob.S = ∅, and while propagating the assertion backwards into S, as far as possible, we make local assertion-based substitutions to each slip that ends up being preceded by the assertion. Finally, we remove all assertions.

An equality x = x′ will successfully propagate backward over any statement that does not define x; it will also propagate into any IF statement and into those DO loops whose body does not define x.

Side note: in terms of control-flow paths, the assertion ends up propagating to (the entry of) any node from which all paths to the exit involve no definition of x. When we are interested in a formulation of cases in which all uses of x will be substituted, we should be able to express that as follows: all paths to the exit from any use of x are clear of definitions of x.
S = X1,Y := E1,E2: in deriving (X1,Y := E1,E2)[final-use X1,X2\X1′,X2′] when
(X1′,X2′) ∩ glob.S = ∅ and X2 ∩ Y = ∅, we observe

X1,Y := E1,E2;{X1,X2 = X1′,X2′}
= {split assertion (Law 15)}
X1,Y := E1,E2;{X2 = X2′};{X1 = X1′}
= {swap statement and assertion (Law 11): (X2,X2′) ∩ (X1,Y) = ∅}
{X2 = X2′};X1,Y := E1,E2;{X1 = X1′}
= {assertion-based sub. (Law 17); E1′ := E1[X2\X2′] and E2′ := E2[X2\X2′]}
{X2 = X2′};X1,Y := E1′,E2′;{X1 = X1′}
= {swap assertion and statement (Law 11): (X2,X2′) ∩ (X1,Y) = ∅}
X1,Y := E1′,E2′;{X2 = X2′};{X1 = X1′}
= {merge assertions (Law 15)}
X1,Y := E1′,E2′;{X1,X2 = X1′,X2′}.

We thus derive (X1,Y := E1,E2)[final-use X1,X2\X1′,X2′] ≜
X1,Y := E1[X2\X2′],E2[X2\X2′].
S = S1;S2: in deriving (S1;S2)[final-use X1,X2\X1′,X2′] when X2 ∩ def.S2 = ∅ and
(X1′,X2′) ∩ glob.S = ∅, we observe

S1;S2;{X1,X2 = X1′,X2′}
= {final-use sub.: let S2′ := S2[final-use X1,X2\X1′,X2′]}
S1;S2′;{X1,X2 = X1′,X2′}
= {split assertion (Law 15)}
S1;S2′;{X2 = X2′};{X1 = X1′}
= {swap statement and assertion (Law 11): (X2,X2′) ∩ def.S2 = ∅}
S1;{X2 = X2′};S2′;{X1 = X1′}
= {final-use sub.: let S1′ := S1[final-use X2\X2′]}
S1′;{X2 = X2′};S2′;{X1 = X1′}
= {swap assertion and statement (Law 11): (X2,X2′) ∩ def.S2′ = ∅}
S1′;S2′;{X2 = X2′};{X1 = X1′}
= {merge assertions (Law 15)}
S1′;S2′;{X1,X2 = X1′,X2′}.

We thus derive (S1;S2)[final-use X1,X2\X1′,X2′] ≜
S1[final-use X2\X2′];S2[final-use X1,X2\X1′,X2′] where X1 := X ∩ def.S2, X2 := X\X1
and X1′,X2′ are the corresponding subsets of X′.
S = if B then S1 else S2: in deriving
(if B then S1 else S2)[final-use X1,X2\X1′,X2′]
when X2 ∩ (def.S1 ∪ def.S2) = ∅ and (X1′,X2′) ∩ glob.S = ∅, we observe

if B then S1 else S2;{X1,X2 = X1′,X2′}
= {dist. assertion over IF (Law 12)}
if B then S1;{X1,X2 = X1′,X2′} else S2;{X1,X2 = X1′,X2′}
= {final-use sub., twice: let S1′ := S1[final-use X1,X2\X1′,X2′] and
   S2′ := S2[final-use X1,X2\X1′,X2′]}
if B then S1′;{X1,X2 = X1′,X2′} else S2′;{X1,X2 = X1′,X2′}
= {dist. IF over ‘;’ (Law 4)}
if B then S1′ else S2′;{X1,X2 = X1′,X2′}
= {split assertion (Law 15)}
if B then S1′ else S2′;{X2 = X2′};{X1 = X1′}
= {swap statement and assertion (Law 11): (X2,X2′) ∩ (def.S1′ ∪ def.S2′) = ∅}
{X2 = X2′};if B then S1′ else S2′;{X1 = X1′}
= {assertion-based sub. (Law 17)}
{X2 = X2′};if B[X2\X2′] then S1′ else S2′;{X1 = X1′}
= {swap assertion and statement (Law 11): (X2,X2′) ∩ (def.S1′ ∪ def.S2′) = ∅}
if B[X2\X2′] then S1′ else S2′;{X2 = X2′};{X1 = X1′}
= {merge assertions (Law 15)}
if B[X2\X2′] then S1′ else S2′;{X1,X2 = X1′,X2′}.

We thus derive (if B then S1 else S2)[final-use X1,X2\X1′,X2′] ≜
if B[X2\X2′] then S1[final-use X1,X2\X1′,X2′] else S2[final-use X1,X2\X1′,X2′] where
X1 := X ∩ def.(S1,S2), X2 := X\X1 and X1′,X2′ are the corresponding subsets of X′.
S = while B do S1 od: in deriving (while B do S1 od)[final-use X1,X2\X1′,X2′]
(when X2 ∩ def.S1 = ∅ and (X1′,X2′) ∩ glob.S = ∅), we observe

while B do S1 od;{X1,X2 = X1′,X2′}
= {split assertion (Law 15)}
while B do S1 od;{X2 = X2′};{X1 = X1′}
= {swap statement and assertion (Law 11): (X2,X2′) ∩ def.S1 = ∅}
{X2 = X2′};while B do S1 od;{X1 = X1′}
= {prop. assertion forward into loop (Law 16): (X2,X2′) ∩ def.S1 = ∅}
{X2 = X2′};while B do {X2 = X2′};S1 od;{X1 = X1′}
= {swap assertion and statement (Law 11): (X2,X2′) ∩ def.S1 = ∅}
{X2 = X2′};while B do S1;{X2 = X2′} od;{X1 = X1′}
= {final-use sub.: let S1′ := S1[final-use X2\X2′]}
{X2 = X2′};while B do S1′;{X2 = X2′} od;{X1 = X1′}
= {assertion-based sub. (Law 17): let B′ := B[X2\X2′]}
{X2 = X2′};while B′ do S1′;{X2 = X2′} od;{X1 = X1′}
= {swap statement and assertion (Law 11): (X2,X2′) ∩ def.S1′ = ∅}
{X2 = X2′};while B′ do {X2 = X2′};S1′ od;{X1 = X1′}
= {prop. assertion backward outside loop (Law 16): (X2,X2′) ∩ def.S1′ = ∅}
{X2 = X2′};while B′ do S1′ od;{X1 = X1′}
= {swap assertion and statement (Law 11): (X2,X2′) ∩ def.S1′ = ∅}
while B′ do S1′ od;{X2 = X2′};{X1 = X1′}
= {merge assertions (Law 15)}
while B′ do S1′ od;{X1,X2 = X1′,X2′}.

We thus derive (while B do S1 od)[final-use X1,X2\X1′,X2′] ≜
(while B[X2\X2′] do S1[final-use X2\X2′] od) where X1 := X ∩ def.S1, X2 := X\X1 and
X2′ is the subset of X′ corresponding to X2.
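The four derived cases assemble into one recursive procedure. The following Python sketch is a hypothetical rendering (tagged tuples for statements and token tuples for expressions are assumptions): at each construct, only the part of the substitution surviving the relevant definitions is applied, exactly as in the derivations above.

    def defs(s):
        # def.S for the four core-statement forms
        tag = s[0]
        if tag == "assign":  return set(s[1])            # ("assign", targets, exprs)
        if tag == "seq":     return defs(s[1]) | defs(s[2])
        if tag == "if":      return defs(s[2]) | defs(s[3])
        if tag == "while":   return defs(s[2])

    def subst_expr(e, sub):
        return tuple(sub.get(t, t) for t in e)

    def final_use(s, sub):
        """S[final-use X\\X'] with sub mapping each x in X to x'."""
        tag = s[0]
        if tag == "assign":
            # only variables not defined by this assignment are substituted
            live = {v: w for v, w in sub.items() if v not in s[1]}
            return ("assign", s[1], [subst_expr(e, live) for e in s[2]])
        if tag == "seq":
            # S1 keeps only the part of the substitution not defined in S2
            sub1 = {v: w for v, w in sub.items() if v not in defs(s[2])}
            return ("seq", final_use(s[1], sub1), final_use(s[2], sub))
        if tag == "if":
            # the guard keeps only variables defined in neither branch
            live = {v: w for v, w in sub.items()
                    if v not in defs(s[2]) | defs(s[3])}
            return ("if", subst_expr(s[1], live),
                    final_use(s[2], sub), final_use(s[3], sub))
        if tag == "while":
            # variables defined in the body drop out of the substitution entirely
            live = {v: w for v, w in sub.items() if v not in defs(s[2])}
            return ("while", subst_expr(s[1], live), final_use(s[2], live))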
E.2 Lemmata for proving statement dup. with final use
Lemma 10.2. Let S be any core statement with def.S = (V,coV), Vr ⊆ V (and fVr the corresponding subset of fV) and (iV,icoV,fV) ∩ glob.S = ∅; we then have

iV,icoV := V,coV;
S;
fV := V;
V,coV := iV,icoV;
S[final-use Vr\fVr];
{Vr = fVr}
=
iV,icoV := V,coV;
S;
fV := V;
V,coV := iV,icoV;
S[final-use Vr\fVr]
Proof.
iV ,icoV := V,coV ;S;fV := V;
V,coV := iV ,icoV ;S[final-use Vr \fVr];{Vr =fVr }
={swap statement and assertion (Law 10)}
iV ,icoV := V,coV ;S;fV := V;
V,coV := iV ,icoV ;{wp.S[final-use Vr \fVr].(Vr =fVr )};
S[final-use Vr \fVr]
={property of final-use sub. (Lemma E.1, see below)}
iV ,icoV := V,coV ;S;fV := V;
V,coV := iV ,icoV ;{wp.S.(Vr =fVr)};S[final-use Vr \fVr ]
={intro. following assertion (Law 7)}
iV ,icoV := V,coV ;({iV ,icoV =V,coV };S;fV := V;
V,coV := iV ,icoV );{wp.S.(Vr =fVr)};S[final-use Vr \fVr ]
={swap statement and assertion (Law 10) and wp of ‘ ;}
iV ,icoV := V,coV ;
{wp.{iV ,icoV =V,coV };S;fV := V;V,coV := iV ,icoV ;
S.(Vr =fVr)};({iV ,icoV =V,coV };S;fV := V;
V,coV := iV ,icoV );S[final-use Vr \fVr]
={Lemma E.2, see below}
iV ,icoV := V,coV ;{wp.{iV ,icoV =V,coV };S.true };
({iV ,icoV =V,coV };S);fV := V;V,coV := iV ,icoV ;
S[final-use Vr \fVr]
={swap assertion and statement (Law 10)}
iV ,icoV := V,coV ;({iV ,icoV =V,coV };S);{true };fV := V;
V,coV := iV ,icoV ;S[final-use Vr \fVr]
={remove true assertion; remove following assertion (Law 7)}
iV ,icoV := V,coV ;S;fV := V;V,coV := iV ,icoV ;
S[final-use Vr \fVr].
Lemma E.1. Let S, V, fV be any core statement and two sets of variables, respectively; we then have

[wp.(S[final-use V\fV ]).(V = fV) ≡ wp.S.(V = fV)]

provided fV ∩ (V ∪ glob.S) = ∅.

Proof.

wp.S.(V = fV)
= {pred. calc.}
wp.S.((V = fV) ∧ true)
= {wp of assertions}
wp.S.(wp.{V = fV}.true)
= {wp of ‘;’}
wp.(S;{V = fV}).true
= {final-use sub.}
wp.(S[final-use V\fV ];{V = fV}).true
= {wp of ‘;’}
wp.(S[final-use V\fV ]).(wp.{V = fV}.true)
= {wp of assertions}
wp.(S[final-use V\fV ]).((V = fV) ∧ true)
= {pred. calc.}
wp.(S[final-use V\fV ]).(V = fV).
Lemma E.2. Let S be any core statement with def.S = (V,coV). We then have

[wp.({iV,icoV = V,coV};S;fV := V;V,coV := iV,icoV;S).(Vr = fVr) ≡
 wp.({iV,icoV = V,coV};S).true]

provided Vr ⊆ V, fVr is the corresponding subset of fV and (iV,icoV,fV) ∩ glob.S = ∅.

Proof.

wp.({iV,icoV = V,coV};S;fV := V;V,coV := iV,icoV;S).(Vr = fVr)
= {statement duplication (Lemma 6.3)}
wp.({iV,icoV = V,coV};S;fV := V).(Vr = fVr)
= {wp of ‘;’ and ‘:=’}
wp.({iV,icoV = V,coV};S).((fV := V).(Vr = fVr))
= {normal sub. (proviso)}
wp.({iV,icoV = V,coV};S).true.
E.3 Stepwise final-use substitution
Theorem E.3. The final-use substitution can be performed in a stepwise manner. That is, for any core statement S and four sets of variables X1, X2, fX1, fX2, we have

S[final-use X1,X2\fX1,fX2] = S[final-use X1\fX1][final-use X2\fX2]

provided (fX1,fX2) ∩ glob.S = ∅.

Proof. We follow the semantic requirement of final-use substitution
(S;{X = fX} = S[final-use X\fX ];{X = fX}) and observe

S[final-use X1,X2\fX1,fX2];{X1,X2 = fX1,fX2}
= {final-use sub.: (fX1,fX2) ∩ glob.S = ∅ (proviso)}
S;{X1,X2 = fX1,fX2}
= {split assertion: Law 15}
S;{X1 = fX1};{X2 = fX2}
= {final-use sub.: fX1 ∩ glob.S = ∅ (proviso)}
S[final-use X1\fX1];{X1 = fX1};{X2 = fX2}
= {swap statements (Program equivalence 5.7): def of assertions is empty}
S[final-use X1\fX1];{X2 = fX2};{X1 = fX1}
= {final-use sub.: fX2 ∩ glob.(S[final-use X1\fX1]) = ∅ since
   fX2 ∩ (fX1 ∪ glob.S) = ∅ (as implied by the proviso)}
S[final-use X1\fX1][final-use X2\fX2];{X2 = fX2};{X1 = fX1}
= {merge assertions: Law 15}
S[final-use X1\fX1][final-use X2\fX2];{X1,X2 = fX1,fX2}.
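Using the final_use sketch above, the stepwise property can be checked on a tiny example (fx, fy are assumed fresh, so the proviso holds):

    S = ("seq", ("assign", ("x",), [("y", "+", "1")]),
                ("assign", ("z",), [("x", "*", "y")]))
    both = final_use(S, {"x": "fx", "y": "fy"})
    stepwise = final_use(final_use(S, {"x": "fx"}), {"y": "fy"})
    assert both == stepwise   # both yield x := fy+1 ; z := fx*fy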
Appendix F
Summary of Laws
F.1 Manipulating core statements
Law 1. Let X, Y, E1, E2 be two sets of variables and two sets of expressions, respectively; then

X := E1;Y := E2 = X,Y := E1,E2

provided X ∩ (Y ∪ glob.E2) = ∅.

Law 2. Let S, X be a statement and a set of variables, respectively; then

S = S;X := X.

Law 3. Let S, S1, S2, B be three statements and a boolean expression, respectively; then

S;if B then S1 else S2 = if B then S;S1 else S;S2

provided def.S ∩ glob.B = ∅.

Law 4. Let S1, S2, S3, B be three statements and a boolean expression, respectively; then

if B then S1 else S2;S3 = if B then S1;S3 else S2;S3.

Law 5. Let S1, X, B, E be any statement, set of variables, boolean expression and set of expressions, respectively; then

{X = E};while B do S1;(X := E) od = {X = E};while B do S1 od;(X := E)

provided X ∩ (glob.B ∪ input.S1 ∪ glob.E) = ∅.

Law 6. Let X, E be any set of variables and set of expressions, respectively; then

{X = E} = {X = E};X := E.
F.2 Assertion-based program analysis
F.2.1 Introduction of assertions
Law 7. Let X, Y, E1, E2 be two sets of variables and two sets of expressions, respectively; then

X, Y := E1, E2 = X, Y := E1, E2 ; {Y = E2}

provided (X, Y) ∉ glob.E2.

Law 8. Let X, X′, E be (same-length) lists of variables and expressions, respectively, with X ∉ X′;
then

X, X′ := E, E = X, X′ := E, E ; {X = X′}.

Law 9. Let S1, B1, B2 be any given statement and two boolean expressions, respectively; then

while B1 do S1 od = while B1 do {B2} ; S1 od

provided [B1 ⇒ B2].
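Two small instances: by Law 7, x, y := z, 5 = x, y := z, 5 ; {y = 5}, since glob.E2 is empty here; and by Law 9, taking B1 = (x > 0) and B2 = (x ≥ 0), for which [x > 0 ⇒ x ≥ 0],

while x > 0 do S1 od = while x > 0 do {x ≥ 0} ; S1 od.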
F.2.2 Propagation of assertions
Law 10. Let S, B be a statement and boolean expression, respectively; then

{wp.S.B} ; S = S ; {B}.
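For instance, with S = (x := x + 1) and B = (x > 0) we have wp.S.B ≡ (x + 1 > 0), and Law 10 reads

{x + 1 > 0} ; x := x + 1 = x := x + 1 ; {x > 0}.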
Law 11. Let S, B be a statement and boolean expression, respectively; then

{B} ; S = S ; {B}

provided def.S ∉ glob.B.

Law 12. Let S1, S2, B1, B2 be two statements and two boolean expressions, respectively; then

{B1} ; if B2 then S1 else S2 = if B2 then {B1} ; S1 else {B1} ; S2.

Law 13. Let S, B1, B2, B3, B4 be a statement and four boolean expressions, respectively; then

{B1} ; while B2 do S ; {B3} od = {B1} ; while B2 do {B4} ; S ; {B3} od

provided [B1 ⇒ B4] and [B3 ⇒ B4].

Law 14. Let S, B1, B2, B3, B4 be a statement and four boolean expressions, respectively; then

{B1} ; while B2 do S ; {B3} od = {B1} ; while B2 ∧ B4 do S ; {B3} od

provided [B1 ⇒ B4] and [B3 ⇒ B4].
Law 15. Let B1, B2 be two boolean expressions; then

{B1 ∧ B2} = {B1} ; {B2}.
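For instance, {x = 1 ∧ y = 2} = {x = 1} ; {y = 2}: both sides act as skip precisely when x = 1 and y = 2 both hold.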
Law 16. Let S, B1, B2 be a statement and two boolean expressions, respectively; then

{B1} ; while B2 do S od = {B1} ; while B2 do {B1} ; S od

provided glob.B1 ∉ def.S.
F.2.3 Substitution
Law 17. Let S1, S2, B be two statements and a boolean expression, respectively; let X, E be a
set of variables and a corresponding list of expressions; and let Y, Y′ be two sets of variables; then

{Y = Y′} ; X := E = {Y = Y′} ; X := E[Y\Y′] ;
{Y = Y′} ; IF = {Y = Y′} ; IF′ ; and
{Y = Y′} ; DO = {Y = Y′} ; DO′

where IF := if B then S1 else S2,
IF′ := if B[Y\Y′] then S1 else S2,
DO := while B do S1 ; {Y = Y′} od
and DO′ := while B[Y\Y′] do S1 ; {Y = Y′} od.
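An instance of the assignment case: {y = y′} ; x := y + 1 = {y = y′} ; x := y′ + 1; the recorded equality licenses replacing the use of y with y′.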
Law 18. Let S1, S2, B be two statements and a boolean expression, respectively; let
X, X′, Y, Z, E1, E1′, E2, E3 be four lists of variables and corresponding lists of expressions; then

X, Y := E1, E2 ; Z := E3 = X, Y := E1, E2 ; Z := E3[Y\E2] ;
X, Y := E1, E2 ; IF = X, Y := E1, E2 ; IF′ ; and
X, Y := E1, E2 ; DO = X, Y := E1, E2 ; DO′

provided ((X ∪ X′), Y) ∉ glob.E2

where IF := if B then S1 else S2,
IF′ := if B[Y\E2] then S1 else S2,
DO := while B do S1 ; X′, Y := E1′, E2 od
and DO′ := while B[Y\E2] do S1 ; X′, Y := E1′, E2 od.
F.3 Live variables analysis
F.3.1 Introduction and removal of liveness information
Law 19. Let S, V be any statement and set of variables, respectively, with def.S ⊆ V; then

S = S[live V].
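For instance, with S = (x, y := 1, 2) and V = {x, y, z} we have def.S = {x, y} ⊆ V, and hence x, y := 1, 2 = (x, y := 1, 2)[live {x, y, z}]: annotating a statement with a live set that covers all the variables it defines changes nothing.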
F.3.2 Propagation of liveness information
Law 20. Let S1, S2, V1, V2 be any two statements and two sets of variables, respectively; then

(S1 ; S2)[live V1] = (S1[live V2] ; S2[live V1])[live V1]

provided V2 = (V1 \ ddef.S2) ∪ input.S2.

Law 21. Let B, S1, S2, V be any boolean expression, two statements and set of variables, respectively; then

(if B then S1 else S2)[live V] = (if B then S1[live V] else S2[live V])[live V].

Law 22. Let B, S, V1, V2 be any boolean expression, statement and two sets of variables, respectively; then

(while B do S od)[live V1] = (while B do S[live V2] od)[live V1]

provided V2 = V1 ∪ (glob.B ∪ input.S).
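A worked instance of Law 20: take S1 = (x := y), S2 = (z := x) and V1 = {z}. Then ddef.S2 = {z} and input.S2 = {x}, so V2 = ({z} \ {z}) ∪ {x} = {x}, giving

(x := y ; z := x)[live {z}] = ((x := y)[live {x}] ; (z := x)[live {z}])[live {z}].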
F.3.3 Dead assignments: introduction and elimination
Law 23. Let S, V, X, Y, E1, E2 be any statement, three sets of variables and two sets of expressions, respectively; then

(S ; X := E1)[live V] = (S ; X, Y := E1, E2)[live V]

provided Y ∉ (X ∪ V).

Law 24. Let S, V, Y, E be any statement, two sets of variables and set of expressions, respectively;
then

S[live V] = (S ; Y := E)[live V]

provided Y ∉ V.

Law 25. Let S, V, X, Y, E1, E2 be any statement, three sets of variables and two sets of expressions, respectively; then

(X := E1 ; S)[live V] = (X, Y := E1, E2 ; S)[live V]

provided Y ∉ (X ∪ (V \ ddef.S) ∪ input.S).

Law 26. Let B, S1, S2, Y, V, E be a boolean expression, two statements, two sets of variables
and a set of expressions, respectively; then

(S1 ; while B do S2 od)[live V] = (S1 ; while B do S2 ; (Y := E) od)[live V]

provided Y ∉ (V ∪ glob.B ∪ input.S2).
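Read from right to left, Law 24 eliminates a trailing dead assignment; for instance, with V = {x}, Y = {t} and t ∉ {x},

(x := y ; t := x + 1)[live {x}] = (x := y)[live {x}].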
Bibliography
[1] A. V. Aho, R. Sethi, and J. D. Ullman. Compilers: Principles, Techniques and Tools. Addison-
Wesley, 1988. 18
[2] R.-J. Back. On the Correctness of Refinement Steps in Program Development. PhD thesis,
Åbo Akademi, Department of Computer Science, Helsinki, Finland, 1978. Report A–1978–4.
36, 50
[3] R.-J. Back. Correctness Preserving Program Refinements: Proof Theory and Applications,
volume 131 of Mathematical Center Tracts. Mathematical Centre, Amsterdam, The Nether-
lands, 1980. 49
[4] R.-J. Back. A Calculus of Refinements for Program Derivations. Acta Informatica, 25:593–
624, 1988. 36
[5] L. Badger and M. Weiser. Minimizing Communication for Synchronizing Parallel Dataflow
Programs. In International Conference on Parallel Processing (ICPP), The Pennsylvania State
University, University Park, PA, USA, August 1988, pages 122–126, 1988. 16
[6] T. Ball and S. Horwitz. Slicing Programs with Arbitrary Control-flow. In Automated and
Algorithmic Debugging (AADEBUG), pages 206–222, 1993. 17, 18, 142
[7] K. Beck. Extreme Programming Explained: Embrace Change. Addison-Wesley Longman
Publishing Co., Inc., Boston, MA, USA, 2000. 1
[8] D. Binkley. The Application of Program Slicing to Regression Testing. Information and
Software Technology, 40(11-12):583–594, 1998. 16
[9] D. Binkley, L. R. Raszewski, C. Smith, and M. Harman. An Empirical Study of Amorphous
Slicing as a Program Comprehension Support Tool. In IWPC ’00: Proceedings of the 8th
International Workshop on Program Comprehension, page 161, Washington, DC, USA, 2000.
IEEE Computer Society. 16
[10] A. Cimitile, A. D. Lucia, and M. Munro. Identifying Reusable Functions Using Specification
Driven Program Slicing: A Case Study. In ICSM ’95: Proceedings of the International
Conference on Software Maintenance, pages 124–133, 1995. 21
[11] M. Cornélio. Refactorings as Formal Refinements. PhD thesis, Universidade Federal de
Pernambuco, Centro de Informática, Recife, PE, Brazil, 2004. 14, 36, 139, 142
[12] R. Cytron, J. Ferrante, B. K. Rosen, M. N. Wegman, and F. K. Zadeck. Efficiently Computing
Static Single Assignment Form and the Control Dependence Graph. ACM Transactions on
Programming Languages and Systems, 13(4):451–490, 1991. 19
[13] E. W. Dijkstra and C. S. Scholten. Predicate Calculus and Program Semantics. Springer-
Verlag New York, Inc., New York, NY, USA, 1990. 26, 28, 29, 30, 31, 32, 33, 34, 35, 37, 41,
42, 46, 150, 153
[14] S. Drape. Obfuscation of Abstract Data Types. DPhil thesis, University of Oxford, United
Kingdom, 2004. 143
[15] M. B. Dwyer, J. Hatcliff, M. Hoosier, V. P. Ranganath, Robby, and T. Wallentine. Evaluating
the Effectiveness of Slicing for Model Reduction of Concurrent Object-Oriented Programs.
In Tools and Algorithms for the Construction and Analysis of Systems, 12th International
Conference, TACAS 2006, Vienna, Austria, pages 73–89, 2006. 16, 17, 141
[16] M. D. Ernst. Practical fine-grained static slicing of optimized code. Technical Report MSR-
TR-94-14, Microsoft Research, Redmond, WA, July 26, 1994. 18
[17] R. Ettinger and M. Verbaere. Untangling: A Slice Extraction Refactoring. In AOSD ’04:
Proceedings of the 3rd International Conference on Aspect-Oriented Software Development,
pages 93–101, New York, NY, USA, 2004. ACM Press. 2, 16, 141, 142
[18] J. Field, G. Ramalingam, and F. Tip. Parametric program slicing. In POPL ’95: Proceedings
of the 22nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages,
pages 379–392, New York, NY, USA, 1995. ACM Press. 18
[19] J. Field and F. Tip. Dynamic Dependence in Term Rewriting Systems and its Application
to Program Slicing. In PLILP ’94: Proceedings of the 6th International Symposium on
Programming Language Implementation and Logic Programming, pages 415–431, London,
UK, 1994. Springer-Verlag. 18
[20] M. Fowler. Refactoring: Improving the Design of Existing Code. Addison Wesley, 2000. 1,
2, 4, 7, 12, 138, 139
[21] M. Fowler. Crossing Refactoring’s Rubicon. February 2001.
http://www.martinfowler.com/articles/refactoringRubicon.html. 139
[22] K. B. Gallagher and J. R. Lyle. Using Program Slicing in Software Maintenance. IEEE
Transactions on Software Engineering, 17(8):751–761, 1991. 21, 24, 102, 136
[23] E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design Patterns: Elements of Reusable
Object-Oriented Software. Addison-Wesley, 1995. 36
[24] J. Gibbons. Fission for Program Comprehension. In T. Uustalu, editor, Mathematics of Pro-
gram Construction, 8th International Conference, MPC 2006, Kuressaare, Estonia, volume
4014 of Lecture Notes in Computer Science, pages 162–179. Springer, 2006. 24
[25] J. Gosling, B. Joy, G. Steele, and G. Bracha. The Java Language Specification Second Edition.
Addison-Wesley, Boston, Mass., 2000. 13
[26] W. Griswold and D. Notkin. Automated Assistance for Program Restructuring. ACM Transactions on Software Engineering and Methodology, 2(3):228–269, July 1993. 21
[27] M. Harman, D. Binkley, and S. Danicic. Amorphous Program Slicing. Journal of Systems
and Software, 68(1):45–64, 2003. 18
[28] M. Harman, D. Binkley, R. Singh, and R. M. Hierons. Amorphous Procedure Extraction.
In SCAM ’04: Proceedings of the Fourth IEEE International Workshop on Source Code
Analysis and Manipulation, pages 85–94, 2004. 24
[29] M. Harman and S. Danicic. Using Program Slicing to Simplify Testing. Software Testing,
Verification and Reliability, 5(3):143–162, 1995. 16
[30] C. A. R. Hoare, I. J. Hayes, H. Jifeng, C. C. Morgan, A. W. Roscoe, J. W. Sanders, I. H.
Sorensen, J. M. Spivey, and B. A. Sufrin. Laws of Programming. Communications of the
ACM, 30(8):672–686, 1987. 49
[31] S. Horwitz, J. Prins, and T. W. Reps. Integrating Non-Interfering Versions of Programs. In
POPL ’88: Proceedings of the 15th ACM SIGPLAN-SIGACT Symposium on Principles of
Programming Languages, pages 133–145, 1988. 143
[32] S. Horwitz, J. Prins, and T. W. Reps. Integrating Noninterfering Versions of Programs. ACM
Transactions on Programming Languages and Systems (TOPLAS), 11(3):345–387, 1989. 143
[33] S. Horwitz, T. W. Reps, and D. Binkley. Interprocedural Slicing Using Dependence Graphs.
ACM Transactions on Programming Languages and Systems (TOPLAS), 12(1):26–60, 1990.
17, 18
[34] I. Jacobson, G. Booch, and J. Rumbaugh. The Unified Software Development Process.
Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1999. 1
[35] G. Jayaraman, V. P. Ranganath, and J. Hatcliff. Kaveri: Delivering the Indus Java Program
Slicer to Eclipse. In Fundamental Approaches to Software Engineering, 8th International
Conference, FASE 2005, pages 269–272, 2005. 17, 141
[36] G. Kiczales, J. Lamping, A. Mendhekar, C. Maeda, C. Lopes, J.-M. Loingtier, and J. Irwin.
Aspect-Oriented Programming. In M. Akşit and S. Matsuoka, editors, Proceedings European
Conference on Object-Oriented Programming, volume 1241, pages 220–242. Springer-Verlag,
Berlin, Heidelberg, and New York, 1997. 141
[37] R. Komondoor. Automated Duplicated-Code Detection and Procedure Extraction. PhD
thesis, University of Wisconsin-Madison, WI, USA, 2003. 103, 125, 140, 143
[38] R. Komondoor and S. Horwitz. Semantics-Preserving Procedure Extraction. In POPL ’00:
Proceedings of the 27th ACM SIGPLAN-SIGACT Symposium on Principles of Programming
Languages, pages 155–169, New York, NY, USA, 2000. ACM Press. 22, 103, 125, 139, 143
[39] R. Komondoor and S. Horwitz. Effective Automatic Procedure Extraction. In Proceedings
of the 11th IEEE International Workshop on Program Comprehension, 2003. 23, 24, 81, 103,
125, 139, 140, 143
[40] A. Lakhotia and J.-C. Deprez. Restructuring Programs by Tucking Statements into Functions.
Information and Software Technology, 40(11-12):677–690, 1998. 16, 21, 68, 102, 139
[41] F. Lanubile and G. Visaggio. Extracting Reusable Functions by Flow Graph-Based Program
Slicing. IEEE Transactions on Software Engineering, 23(4):246–259, 1997. 21
[42] K. Maruyama. Automated Method-Extraction Refactoring by using Block-Based Slicing. In
SSR ’01: Proceedings of the 2001 Symposium on Software Reusability, pages 31–40. ACM
Press, 2001. 8, 16, 21
[43] T. M. Meyers and D. Binkley. Slice-Based Cohesion Metrics and Software Intervention. In
11th Working Conference on Reverse Engineering (WCRE 2004), November 2004, Delft, The
Netherlands, pages 256–265. IEEE Computer Society, 2004. 16
[44] L. Millett and T. Teitelbaum. Slicing Promela and its Applications to Model Checking. In
Proceedings on Model Checking of Software, 1998. 16
[45] C. Morgan. Programming from Specifications (2nd ed.). Prentice Hall International (UK)
Ltd., Hertfordshire, UK, 1994. 14, 30, 36, 37, 46, 47, 49, 50, 153
[46] S. S. Muchnick. Advanced Compiler Design and Implementation. Morgan Kaufmann, 1997.
19
[47] F. Nielson, H. R. Nielson, and C. Hankin. Principles of Program Analysis. Springer, December
2004. 18, 48, 52, 73
[48] W. F. Opdyke. Refactoring Object-Oriented Frameworks. PhD thesis, University of Illinois
at Urbana-Champaign, IL, USA, 1992. 1, 12, 14, 15, 21
[49] K. Ottenstein and L. Ottenstein. The Program Dependence Graph in a Software Development
Environment. In Proceedings of the ACM SIGSOFT/SIGPLAN Software Engineering Symposium on
Practical Software Development Environments, pages 177–184, 1984. 18
[50] J. Rilling and T. Klemola. Identifying Comprehension Bottlenecks using Program Slicing
and Cognitive Complexity Metrics. In IWPC ’03: Proceedings of the 11th IEEE Interna-
tional Workshop on Program Comprehension, page 115, Washington, DC, USA, 2003. IEEE
Computer Society. 16
[51] J. Rilling and S. P. Mudur. 3D Visualization Techniques to Support Slicing-Based Program
Comprehension. Computers & Graphics, 29(3):311–329, 2005. 16
[52] D. Roberts. Practical Analysis for Refactoring. PhD thesis, University of Illinois at Urbana-
Champaign, IL, USA, 1999. 13, 15
[53] D. Roberts, J. Brant, and R. Johnson. A Refactoring Tool for Smalltalk. Theory and Practice
of Object Systems, 3(4), 1997. 13
[54] J. Singer. Static Program Analysis based on Virtual Register Renaming. DPhil thesis,
University of Cambridge, United Kingdom, 2006. 18, 19, 75
[55] M. Verbaere, R. Ettinger, and O. de Moor. JunGL: a Scripting Language for Refactoring.
In D. Rombach and M. L. Soffa, editors, ICSE’06: Proceedings of the 28th International
Conference on Software Engineering, pages 172–181, New York, NY, USA, 2006. ACM Press.
13
[56] M. Ward. Proving Program Refinements and Transformations. DPhil thesis, University of
Oxford, United Kingdom, 1989. 36, 49, 50
[57] M. P. Ward. Program Slicing via FermaT Transformations. In Computer Software and
Applications Conference, 2002. COMPSAC 2002. Proceedings. 26th Annual International,
pages 357–362, 2002. 36, 78
[58] M. P. Ward. Pigs from Sausages? Reengineering from Assembler to C via FermaT Transfor-
mations. Science of Computer Programming, 52:213–255, 2004. 36
[59] M. P. Ward, H. Zedan, and T. Hardcastle. Conditioned Semantic Slicing via Abstraction and
Refinement in FermaT. In CSMR ’05: Proceedings of the Ninth European Conference on
Software Maintenance and Reengineering, pages 178–187, 2005. 17, 18, 36, 78
[60] D. Weise, R. F. Crew, M. D. Ernst, and B. Steensgaard. Value Dependence Graphs: Rep-
resentation Without Taxation. In Proceedings of the 21st Annual ACM SIGPLAN-SIGACT
Symposium on Principles of Programming Languages, pages 297–310, Portland, OR, Jan.
1994. 18
[61] M. Weiser. Program Slicing. In ICSE ’81: Proceedings of the 5th International Conference
on Software Engineering, pages 439–449, 1981. 7, 8, 16, 18, 19
[62] M. Weiser. Programmers Use Slices When Debugging. Communications of the ACM,
25(7):446–452, 1982. 8, 16, 144
[63] M. Weiser. Reconstructing Sequential Behavior from Parallel Behavior Projections. Informa-
tion Processing Letters, 17(3):129–135, 1983. 16
[64] M. Weiser. Program Slicing. IEEE Transactions on Software Engineering, 10(4):352–357,
1984. 7, 17, 18, 126
[65] Agile alliance website. http://www.agilealliance.com/. 1
[66] Eclipse Website. http://www.eclipse.org/. 2
[67] Manifesto for Agile Software Development. http://www.agilemanifesto.org/. 1
[68] Microsoft Visual Studio Official Website. http://msdn.microsoft.com/vstudio/. 2
[69] An Online Refactoring Catalog. http://www.refactoring.com/catalog/. 12, 138
[70] Refactoring Bugs in Eclipse, IntelliJ IDEA and Visual Studio.
http://progtools.comlab.ox.ac.uk/projects/refactoring/bugreports. 13
[71] Refactoring Website. http://www.refactoring.com/. 12
[72] Yahoo Refactoring Group Mailing List. refactoring@yahoogroups.com. 12