Refactoring via Program Slicing and Sliding
Ran Ettinger
Wolfson College
Trinity Term, 2006
Submitted in partial fulfilment of the requirements for
the degree of Doctor of Philosophy
Oxford University Computing Laboratory
Programming Tools Group
Refactoring via Program Slicing and Sliding
Ran Ettinger, Wolfson College
Thesis submitted for the degree of Doctor of Philosophy
at the University of Oxford
Trinity Term, 2006
Abstract
Mark Weiser’s observation that “programmers use slices when debugging”, back in 1982,
started a new field of research. Program slicing, the study of meaningful subprograms that capture
a subset of an existing program’s behaviour, aims at providing programmers with tools to assist
in a variety of software development and maintenance activities.
Two decades later, the work leading to this thesis was initiated with the observation that
“programmers use slices when refactoring”. Hence, the thesis explores ways in which known
refactoring techniques can be automated through slicing and related program analyses.
Common to all slicing-related refactorings, as explored in this thesis, is the goal of improving
reusability, comprehensibility and hence maintainability of existing code. A problem of slice
extraction is posed and its potential contribution to refactoring research highlighted. Limitations
of existing slice-extraction solutions include low applicability and high levels of code duplication.
Advanced techniques for the automation of slice extraction are proposed. The key to their
success lies in a novel program representation introduced in the thesis. We think of a program as
a collection of transparency slides, placed one on top of the other. On each such slide, a subset of
the original statement, not necessarily a contiguous one, is printed. Thus, a subset of a statement’s
slides can be extracted from the remaining slides in an operation of sideways movement, called
sliding. Semantic properties of such sliding operations are extensively studied through known
techniques of predicate calculus and program semantics.
This thesis makes four significant contributions to the slicing and refactoring fields of research.
Firstly, it develops a theoretical framework for slicing-based behaviour-preserving transformations
of existing code. Secondly, it provides a provably correct slicing algorithm. This application of our
theory acts as evidence for its expressive power whilst enabling constructive descriptions of slicing-
based transformations. Thirdly, it applies the above framework and slicing algorithm in solving
the problem of slice extraction. The solution, a family of provably correct sliding transformations,
provides high levels of accuracy and applicability. Finally, the thesis outlines the application of
sliding to known refactorings, making them automatable for the first time.
These contributions provide strong evidence that, indeed, slicing and related analyses can assist
in building automatic tools for refactoring.
To Dana, Amir, Zohara and Ze’ev; and to beloved Bärbel.
Acknowledgements
I would first like to thank Prof. Oege de Moor, for taking on the tough role of supervising me and
my work. Not without struggle, we finally came through. By not hassling me while
I was slowly progressing, and by acting as devil’s advocate, Oege has strongly contributed to the
development and success of this work. I also thank Oege for introducing me to his fine group of
students — some of whom will remain forever my friends (I hope) — and for introducing me to
Dijkstra’s work on program semantics. I finally thank Oege for his effort in securing some funding
for this work, after Intercomp’s unsurprising withdrawal, on my arrival to Oxford.
Mike Spivey supervised my work during Oege’s 2003 Sabbatical and influenced my direction
tremendously. I’m especially grateful to Mike for the full attention and presence during our
supervision meetings, always leaving me inspired and with fresh ideas. Jeremy Gibbons and Mike
were my transfer examiners and later Jeremy and Jeff Sanders performed my confirmation of
status. I am greatly indebted to all three for the insightful comments and feedback, which left a
huge mark on the later results leading to this thesis.
The Programming Tools Group at Oxford, of which I was a member, provided a strong working
environment, through weekly meetings, where talks could be practiced and ideas could be shared
and discussed, and through joint work. I’m grateful to past and present members of the group, in
particular to Iván Sanabria, Yorck Hünke, Stephen Drape, Mathieu Verbaere, and Damien Sereni,
for the professional assistance and collaboration, for the mental support on rough days, and for
the friendship. The collaborative highlight of my Oxford time was the work with Mathieu during
his MSc project. I loved our endless discussions on programming and whatever, and I’m glad you
and Dorothée finally returned for the DPhil, and became close friends. Thank you!
Other comlab (past) students, like Gordon, Fatima, Eran, Silvija, Jussi, Abigail, Penelope,
Eldar, David, and Edouard, have contributed immensely to this professional and personal voyage.
I’d also like to extend warm thanks to comlab’s administrative staff, for their continuously diligent
support.
Big thanks go to my final examiners, Jeremy Gibbons and Mark Harman, for the interesting
discussion during a good-spirited viva, and for their valuable suggestions, making this thesis a
better and more professional scientific report. Special thanks go to Raghavan Komondoor, Yvonne
Greene, Iván Sanabria, Steve Drape, Sharona Meushar and Itamar Kahn, for commenting on final
drafts of the thesis.
Intercomp Ltd. in Israel was where I first came up with the ideas for this research. I thank my
colleagues and friends there, and I thank Prof. Alan Mycroft of Cambridge for his contribution to
the development of the initial research proposal. I also thank Prof. Mooly Sagiv for introducing
me to the academic world of program analysis, as well as Eran Tirer and Dr. Mayer Goldberg, for
supporting my Oxford application with letters of recommendation. Itamar Kahn was inspirational
in his own way, and the first to recommend Oxford to me. Sharon Attia was instrumental in the
acceptance of an eventual Oxford offer, and started that adventure with me.
Student life in Oxford is brilliant, mainly due to the distribution of all students into colleges.
Wolfson College provided a beautiful and peaceful environment, perfect for my living and studying.
I would like to thank all my Wolfsonian friends, the staff, members of the boat club, and most
importantly members of our football club. (After all, football was my number one reason for
choosing England.) Participating in sports competitions with the many other colleges, and once
a year with our sister college in Cambridge, Darwin, provided some of the best moments of my
fantastic DPhil experience. On the Jewish side of things, I warmly thank Rabbi Eli and Freidy
Brackman for providing a friendly, social and educational environment in their thriving Oxford
Chabad society. And as for nutrition, nothing compares to Oxford’s late night Kebab vans! Thank
you all for taking good care of me.
Financially surviving the five years of research, as a self-funded student, required some
fundraising. I acknowledge the financial support of IBM’s Eclipse Innovation Grant, the ORS Scheme,
and Oxford University’s Hardship Fund. Sun Labs hired me for a magnificent California intern-
ship (thanks to Michael Van De Vanter, James Gosling, Tom Ball and Tim Prinzing). Continuous
teaching appointments by Oxford’s Computing Laboratory and the Software Engineering Pro-
gramme were fun to perform and provided the much needed extra funds (thanks to Silvija Seres,
Jeremy Gibbons, Steve McKeever, the administrative staff, and all students). In the final 8 months
I was fully supported by my parents (Toda Ima Aba!) and my new position at the IBM Haifa
Research Lab (thanks to Dr. Yishai Feldman and the SAM group) helps pay back Wolfson
College’s Senior Tutor loan (many thanks to Dr. Martin Francis).
Shai, Uri, Itamar & Anna, Yacove, Koby & Adi, Sharona, Becky, Hezi, Keren, and Yo’av,
all helped keep morale high by visiting and maintaining overseas friendships. My sister Dana,
brother Amir, and their families, my parents, Zohara and Ze’ev, and the rest of the family,
including our UK-based relatives, most notably Yvonne Greene and her lovely family in Banbury
where I found a home away from home, were all extremely supportive in ways more than one; and
indeed very patient. I am forever grateful!
And last but not least, I am happy to thank Georgia Barbara Jettinger for her love and immea-
surable contribution to the success of this journey. And it was actually Bärbel’s slip of the tongue
that triggered the invention of this thesis’ sliding metaphor. I am deeply grateful to Oxford (and
Edouard and Raya) for introducing us. Following you on your fieldwork to Paris, for my year of
writing-up, turned out brilliant. Génial!
Rani Ettinger,
15 June 2007,
Tel Aviv, Israel
Slip sliding away
Slip sliding away
You know the nearer your destination
The more you slip sliding away.
Simon & Garfunkel
Contents
1 Introduction
   1.1 Refactoring enables iterative and incremental software development
   1.2 The gap: refactoring tools are important but weak
      1.2.1 Motivating example: Fowler's video store
   1.3 Programmers use slices when refactoring
   1.4 Automatic slice-extraction refactoring via sliding
   1.5 Overview: chapter by chapter
   1.6 Contributions
2 Background and Related Work
   2.1 Refactoring
      2.1.1 Informal reality
      2.1.2 Underlying theory
   2.2 Slicing
      2.2.1 Slicing examples
      2.2.2 On slicing and termination
      2.2.3 Slicing criterion
      2.2.4 Syntax-preserving vs. amorphous and semantic slices
      2.2.5 Flow sensitivity: backward vs. forward slicing
      2.2.6 Slicing algorithms
      2.2.7 SSA form
   2.3 Slice-extraction refactoring
3 Formal Semantics: Predicate Transformers
   3.1 Set theory for program variables
      3.1.1 Sets and lists of distinct variables
      3.1.2 Disjoint sets and tuples
      3.1.3 Generating fresh variable names
   3.2 Predicate calculus
      3.2.1 The state-space metaphor
      3.2.2 Structures, expressions and predicates
      3.2.3 Square brackets: the ‘everywhere’ operator
      3.2.4 Functions and equality
      3.2.5 Global variables in expressions, predicates and programs
      3.2.6 Substitutions
      3.2.7 Proof format
      3.2.8 From the calculus
   3.3 Program semantics
      3.3.1 Predicate transformers
      3.3.2 Different types of junctivity
      3.3.3 A definition of deterministic program statements
   3.4 Program refinement
4 A Theoretical Framework
   4.1 Preliminaries
      4.1.1 On slips and slides: an alternative to substatements
      4.1.2 Why deterministic?
      4.1.3 On deterministic program semantics
      4.1.4 On refinement, termination and program equivalence
      4.1.5 Semantic language requirements
      4.1.6 Global variables in transformed predicates
   4.2 The programming language
      4.2.1 Expressions, variables and types
      4.2.2 Core language
      4.2.3 Extended language
   4.3 Laws of program analysis and manipulation
      4.3.1 Manipulating core statements
      4.3.2 Assertion-based program analysis
      4.3.3 Manipulating liveness information
   4.4 Summary
5 Proof Method for Correct Slicing-Based Refactoring
   5.1 Introducing slice-refinements and co-slice-refinements
   5.2 Variable-wise proofs
      5.2.1 Proving slice-refinements
      5.2.2 A co-slice-refinement is a slice-refinement of the complement
   5.3 Slice and co-slice refinements yield a general refinement
      5.3.1 A corollary for program equivalence
   5.4 Example proof: swap independent statements
   5.5 Summary
6 Statement Duplication
   6.1 Example
   6.2 Sequential simulation of independent parallel execution
   6.3 Formal derivation
   6.4 Summary and discussion
7 Semantic Slice Extraction
   7.1 Example
   7.2 Live variables analysis
      7.2.1 Simultaneous liveness
   7.3 Formal derivation using statement duplication
   7.4 Requirements of slicing
      7.4.1 Ward’s definition of syntactic and semantic slices
   7.5 Summary
8 Slides: A Program Representation
   8.1 Slideshow: a program execution metaphor
   8.2 Slides in refactoring: sliding
      8.2.1 One slide per statement
      8.2.2 A separate slide for each variable
      8.2.3 A separate slide for each individual assignment
   8.3 Representing non-contiguous statements
   8.4 Collecting slides: the union of non-contiguous code
   8.5 Slide dependence and independence
      8.5.1 Smallest enclosing slide-independent set
   8.6 SSA form
      8.6.1 Transform to SSA
      8.6.2 Back from SSA
      8.6.3 SSA is de-SSA-able
   8.7 Summary
9 A Slicing Algorithm
   9.1 Flow-insensitive slicing
      9.1.1 The algorithm
      9.1.2 Example
   9.2 Make it flow-sensitive using SSA-based slides
      9.2.1 Formal derivation of flow-sensitive slicing
      9.2.2 An SSA-based slice is de-SSA-able
      9.2.3 The refined algorithm
      9.2.4 Example
   9.3 Slice extraction revisited
      9.3.1 The transformation
      9.3.2 Evaluation and discussion
   9.4 Summary
10 Co-Slicing
   10.1 Over-duplication: an example
   10.2 Final-use substitution
      10.2.1 Example
      10.2.2 Deriving the transformation
   10.3 Advanced sliding
      10.3.1 Statement duplication with final-use substitution
      10.3.2 Slicing after final-use substitution
      10.3.3 Definition of co-slicing
      10.3.4 The sliding transformation
   10.4 Summary
11 Penless Sliding
   11.1 Eliminating redundant backup variables
      11.1.1 Motivation
      11.1.2 Example
      11.1.3 Dead-assignments-elimination and variable-merging
   11.2 Compensation-free (or penless) co-slicing
   11.3 Sliding with penless co-slices
   11.4 Summary
12 Optimal Sliding
   12.1 The minimal penless co-slice
      12.1.1 A polynomial-time algorithm
   12.2 Slice inclusion
   12.3 The optimal sliding transformation
   12.4 Summary
13 Conclusion
   13.1 Slicing-based refactoring
      13.1.1 Replace Temp with Query
      13.1.2 More refactorings
   13.2 Advanced issues and limitations
   13.3 Future work
      13.3.1 Formal program re-design
      13.3.2 Further applications of sliding: beyond refactoring
A Formal Language Definition
   A.1 Core language
   A.2 Extended language
B Laws of Program Manipulation
   B.1 Manipulating core statements
   B.2 Assertion-based program analysis
      B.2.1 Introduction of assertions
      B.2.2 Propagation of assertions
      B.2.3 Substitution
   B.3 Live variables analysis
      B.3.1 Introduction and removal of liveness information
      B.3.2 Propagation of liveness information
      B.3.3 Dead assignments: introduction and elimination
C Properties of Slides
   C.1 Lemmata for proving independent slides yield slices
   C.2 Slide independence and liveness
D SSA
   D.1 General derivation
   D.2 Transform to SSA
   D.3 Back from SSA
   D.4 SSA is de-SSA-able
   D.5 An SSA-based slice is de-SSA-able
E Final-Use Substitution
   E.1 Formal derivation
   E.2 Lemmata for proving statement dup. with final use
   E.3 Stepwise final-use substitution
F Summary of Laws
   F.1 Manipulating core statements
   F.2 Assertion-based program analysis
      F.2.1 Introduction of assertions
      F.2.2 Propagation of assertions
      F.2.3 Substitution
   F.3 Live variables analysis
      F.3.1 Introduction and removal of liveness information
      F.3.2 Propagation of liveness information
      F.3.3 Dead assignments: introduction and elimination
Bibliography
Chapter 1
Introduction
1.1 Refactoring enables iterative and incremental software
development
Programming is a relatively young discipline. In its earlier days, the leading so-called Waterfall
methodology for software development involved separate phases for design and for actual imple-
mentation. This was based on the assumption that a system can be fully specified up-front, ahead
of its implementation. Any later change was considered software maintenance, and involved its
own separate set of processes.
Modern methodologies, in contrast, inherently accommodate change by admitting a more
iterative and incremental software development process. That is, throughout its lifecycle, software
is developed and released in iterations. Each such iteration typically targets an increment in
functionality. Thus, an iteration may involve any and all aspects of development, including design
and coding. Examples include the Rational Unified Process (RUP) [34], eXtreme Programming
(XP) [7] and other so-called agile methodologies [65, 67].
Refactoring [48, 20] is the process of improving the design of existing software. This is achieved
by performing source code transformations that preserve the original functionality. The ability to
update the design and internal structure of programs through refactoring enables change during the
lifecycle of any software system. Thus, refactoring is key to the success of software development.
The premise, when refactoring, is that the design should be clearly reflected by the code itself.
Thus, clarity of code is imperative. Indeed, the goal of many refactoring transformations (e.g. for
renaming variables) is to improve the readability of code.
Another theme in refactoring is the removal of duplication in code. (As a system evolves,
duplication creeps in e.g. by the common ‘quick-and-dirty’ practice of ‘copy-and-paste’ of existing
code.) Such redundancies can and should be removed. This removal is achieved by refactoring
transformations geared at enhancing reusability of code (e.g. by extracting common code into new
methods with the so-called Extract Method refactoring).
The refactorings in this thesis will indeed target both improved comprehensibility and enhanced
reusability, in supporting the development and maintenance of quality software systems.
Modern software development environments, e.g. MS Visual Studio [68] and Eclipse [66], in-
clude useful support for some refactoring techniques. However, the incompleteness and, at times,
incorrectness of those tools call for progress in the underlying theory.
In what follows, we illustrate the promise of refactoring and the power of its supporting tools,
on the one hand, while identifying the gap to be filled by this thesis, on the other.
1.2 The gap: refactoring tools are important but weak
In code, a function that yields a value without causing any observable side effects is very valuable.
“You can call this function as often as you like” [20, Page 279]. Such a call is also known as a
query.
A refactoring technique called Replace Temp with Query was introduced by Fowler [20] to
turn the use of a temporary variable holding the value of an expression into a query. The benefit
is increased readability (in the refactored version) and reusability (of the extracted computation).
This scenario is indeed supported in e.g. Eclipse, as a special case of the Extract Method tool.
A more complicated case of Replace Temp with Query occurs when the temp is not assigned
the result of an expression, but rather the result of a computation spanning several lines of code.
If those lines are consecutive in code (i.e. contiguous), they can be selected by the user and again
the Extract Method tool may handle them successfully. Unfortunately, this will not always be
the case; instead, when the code for computing a temporary result is tangled with code for other
concerns, it is said to be non-contiguous.
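To make the simple case concrete, here is a minimal Java sketch of Replace Temp with Query; the Order class and its fields are illustrative assumptions, not code from Fowler’s book:

    // Before: a temp holds the value of an expression.
    class Order {
        private double quantity, itemPrice;

        double totalWithTax() {
            double basePrice = quantity * itemPrice; // the temp
            return basePrice * 1.2;
        }
    }

    // After: the temp is replaced by a side-effect-free query,
    // which other methods can now reuse as often as they like.
    class OrderRefactored {
        private double quantity, itemPrice;

        double totalWithTax() {
            return basePrice() * 1.2;
        }

        private double basePrice() { // the extracted query
            return quantity * itemPrice;
        }
    }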
1.2.1 Motivating example: Fowler’s video store
The following example is taken (with minor changes) from Martin Fowler’s refactoring book [20],
where all refactorings are performed manually.¹ The example concerns a piece of software for
running a video store, focusing on the implementation of one feature: composing and printing
a customer rental record statement. The statement includes information on each of the rentals
made by a customer, and summary information; a sample statement is shown in Figure 1.1.
In the original implementation, the preparation of the text to be printed and the computation
of the summary information are tangled inside a single method (see Figure 1.2). In fact, Fowler
¹The example itself, as well as a variation on the accompanying discussion, has appeared in a paper titled:
“Untangling: A Slice Extraction Refactoring” [17] by the author of this thesis, co-authored with Mathieu Verbaere.
Figure 1.1: A sample customer statement.
Figure 1.2: A tangled statement method.
Figure 1.3: The statement method after extracting the computations of the total charge and
frequent renter points.
starts off with a much longer statement method containing all the logic for determining the amount
to be charged and the number of frequent renter points earned per movie rental. These results
depend on the type of the rented movie (regular, children’s or new release) and the number of
days it was rented for.
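Since the figures are not reproduced in this extract, the following Java sketch approximates the tangled statement method of Figure 1.2, at the stage where the rental-specific logic has already been moved to the Rental class; the exact line numbers cited in the steps below refer to the original figure:

    // Approximation of Figure 1.2 (inside the Customer class, which
    // holds a collection _rentals): text layout, total charge and
    // frequent renter points are all computed in one tangled loop.
    public String statement() {
        double totalAmount = 0;
        int frequentRenterPoints = 0;
        Enumeration rentals = _rentals.elements();
        String result = "Rental Record for " + getName() + "\n";
        while (rentals.hasMoreElements()) {
            Rental each = (Rental) rentals.nextElement();
            frequentRenterPoints += each.getFrequentRenterPoints();
            result += "\t" + each.getMovie().getTitle() + "\t"
                    + String.valueOf(each.getCharge()) + "\n";
            totalAmount += each.getCharge();
        }
        result += "Amount owed is " + String.valueOf(totalAmount) + "\n";
        result += "You earned " + String.valueOf(frequentRenterPoints)
                + " frequent renter points";
        return result;
    }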
Fowler then gradually improves the design by factoring out that rental-specific logic (into the
Rental class, which is not shown here). The suggested refactoring steps are motivated by the
introduction of a new requirement, namely the ability to print an html version of the statement.
A quick-and-dirty approach would be to copy the body of the statement method, paste it into a
new htmlStatement method and replace the text-based layout control strings with corresponding
html tags. This would lead to duplication of the code for computing the temporary totalAmount
and frequentRenterPoints variables.
For brevity, we join the refactoring session at the stage Fowler calls Removing Temps ([20,
Page 26]). At this stage the computations of totalAmount and frequentRenterPoints are factored
out (see figures 1.3-1.5 for the result of those two steps).
Figure 1.4: The extracted total charge computation.
Figure 1.5: The extracted frequent renter points computation.
Fowler describes the path by which this
was achieved as “not the simplest case of Replace Temp with Query; totalAmount was assigned to
within the loop, so I have to copy the loop into the query method”. Indeed, to date, no refactoring
tool supports such cases.
Here is an outline of the mechanical steps that need to be performed by a programmer, in the
absence of tool support, for extracting the total charge computation:
1. In the statement method of Figure 1.2, look for the temporary variable that is assigned the
result of the total charge computation. This is totalAmount which is declared to be of type
double in line 14. Its final value is added to the customer statement in line 29.
2. Create a new method, and name it after the intention of the computation: getTotalCharge.
Declare it to return the type of the extracted variable: double. See line 36 in Figure 1.4.
3. Identify all the statements that contribute to the computation of totalAmount. In this case
these are the statements in lines {14, 16, 19, 20, 25, 26}.
4. Copy the identified statements to the new method. See lines 37 to 42 in Figure 1.4.
5. Scan the extracted code for references to any variables that are parameters to the statement
method. These should be parameters to getTotalCharge as well. In this case, the parameter
list is empty.
6. Look to see which of the extracted statements are no longer needed in the statement method
and delete those. In this case, the while loop is still relevant, and therefore the statements
in lines {16, 19, 20, 26} cannot be deleted; instead, they are duplicated. Lines {14, 25} are
needed only in the extracted code and are therefore deleted. In Figure 1.3 they are shown
as blank lines, for clarity.
7. Rename the extracted variable, totalAmount, in the extracted method, getTotalCharge, to
result, and add a return statement at the end of that method (see line 43 in Figure 1.4).
8. Replace the reference to the result of the extracted computation with a call to the target
getTotalCharge method (line 29 in Figure 1.3).
9. Compile and test.
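A Java sketch of the outcome of steps 2-8, approximating Figure 1.4 (the enclosing Customer class and Rental interface are assumed as above):

    // The extracted query: the while loop is duplicated rather than
    // moved (step 6), since the statement method still traverses the
    // rentals for its other concerns.
    private double getTotalCharge() {
        double result = 0; // renamed from totalAmount (step 7)
        Enumeration rentals = _rentals.elements();
        while (rentals.hasMoreElements()) {
            Rental each = (Rental) rentals.nextElement();
            result += each.getCharge();
        }
        return result; // the added return statement (step 7)
    }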
A refactoring tool could reduce the above scenario to (a) selecting a temporary variable (whose
computation is to be extracted), and (b) choosing a name for the extracted method. The tool,
in turn, would either perform the transformation, or reject it if behaviour preservation cannot be
guaranteed. For example, note that the correctness of the transformation above depended on the
immutability of the traversed collection rentals (thus allowing untangling of the three traversals).
An attempt at providing such a tool, in early stages of the research leading to this thesis,
suffered several drawbacks. Firstly, in order to guarantee behaviour preservation, the identified
preconditions (e.g. no global variable defined in the extracted code) were clearly stronger than
necessary. Secondly, the levels of code duplication were, again, higher than necessary. The dupli-
cation is due to extracted statements (identified in step 3 above) that are not deleted from the
original (see step 6). As usual, code duplication could be considered harmful in itself, but perhaps
more importantly, it indirectly affected applicability.
A successful reduction in duplication and weakening of preconditions, thus leading to a refined
and more generally applicable tool, required a careful and rigorous study of the many intricacies
in this refactoring. Results of that study are reported in this thesis.
The complete video-store scenario, particularly the breaking up and distribution of the ini-
tially monolithic statement method, motivates and justifies Fowler and Beck’s big refactoring to
Convert Procedural Design to Objects [20, Chapter 12]: “You have code written in a procedural
style. Turn the data records into objects, break up the behavior, and move the behavior to the
objects”.
The steps of turning procedural design to objects mainly involve introducing new classes, ex-
tracting methods, moving variables and methods (to the new classes), inlining methods and renam-
ing variables and methods. All those are either straightforward or already supported by modern
refactoring tools. It is the extraction of non-contiguous code (as in Replace Temp with Query)
for which automation is missing and required.
However, hope is not lost, as some solutions to the extraction of non-contiguous code have been
proposed and investigated. (In fact, as will be shown later, those tackle a problem different from
the above, but closely related.) Inspired by those, we shall dedicate this thesis to the development
of a novel solution; one that will benefit from the advantages of each of those, whilst highlighting
and overcoming respective limitations.
The extraction of non-contiguous code, especially when dealing with the automation of steps
3 and 6 of the mechanics in the example above, leads us to the following observation.
1.3 Programmers use slices when refactoring
To untangle the desired statements from their context, one can employ program slicing [61, 64]. A
program slice singles out those statements that might have affected the value of a given variable
at a given point in the program. A typical scenario is one in which the programmer selects a
variable (or set of variables) and point of interest, e.g. totalAmount at line 29, in the example
above (Figure 1.2); a slicing tool, in response, computes the (smallest possible) corresponding
slice, e.g. the non-contiguous code of lines {14, 16, 19, 20, 25, 26}. This slice can then be extracted
into a new method, as was the case in steps 3 and 4 of that example. The idea of using slicing for
refactoring has been suggested by Maruyama [42].
Program slicing was invented, by Mark Weiser, for “times when only a portion of a program’s
behavior is of interest” [61], and with the observation that “programmers use slices when debug-
ging” [62]. According to Weiser, slicing is a “method of program decomposition” that “is applied
to programs after they are written, and is therefore useful in maintenance rather than design”
[61].
This is no longer true. In modern software development, as was mentioned earlier, some design
is normally done on each and every development iteration. Thus, since code of earlier iterations
is already available when designing further features (or corrections to existing ones), slicing can
be useful there too.
Therefore, the research leading to this thesis started with the observation that slicing can
be useful in daily program development activities, even outside its initial domain of software
maintenance. As a first step towards such usage, and since refactoring presents such an interesting
blend of design, existing code and behaviour-preserving transformations, this research was initiated
with the question: “How can program slicing and related analyses assist in building automatic
tools for refactoring?”²
1.4 Automatic slice-extraction refactoring via sliding
We shall propose automation of the Replace Temp with Query refactoring in latter stages of this
thesis. The solution will be composed of a number of behaviour-preserving steps, in a manner
slightly different from the earlier mechanics of manual transformation. In the first step, a selected
slice will be extracted from its so-called complement (i.e. code for the remaining computation).
The problem of slice extraction can be formulated as follows:
Definition 1.1 (Slice extraction). Let S be a program statement and V be a set of variables;
extract the computation of V from S (i.e. the slice of S with respect to V) as a reusable program
entity, and update the original S to reuse the extracted slice.
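As a small illustration of Definition 1.1 (a hypothetical Java instance, anticipating the sum-and-product example of Section 2.2.1), take S to be a loop computing both a sum and a product, and V = {sum}:

    // Before: S computes sum and prod together.
    static double[] sumAndProduct(int[] a) {
        double sum = 0, prod = 1;
        for (int i = 0; i < a.length; i++) {
            sum += a[i];
            prod *= a[i];
        }
        return new double[] { sum, prod };
    }

    // After slice extraction with V = {sum}: the slice of sum becomes
    // a reusable method, and the original is updated to reuse it.
    static double sumOf(int[] a) { // the extracted slice
        double sum = 0;
        for (int i = 0; i < a.length; i++) sum += a[i];
        return sum;
    }

    static double[] sumAndProductRefactored(int[] a) {
        double prod = 1; // the complement
        for (int i = 0; i < a.length; i++) prod *= a[i];
        return new double[] { sumOf(a), prod };
    }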
A novel solution shall be developed in the course of this thesis, thus automating slice extraction.
The automation will be based on a correct (i.e. behaviour-preserving) slicing algorithm. This al-
gorithm will itself be based on a special program representation, specifically designed for capturing
non-contiguous code. This representation’s primitive elements will be called slides. (This decom-
position of a program into slides is in accordance with a program execution metaphor of overhead
projection of programs printed on transparency slides; see Chapter 8.)
²The author would like to gratefully acknowledge Prof. Alan Mycroft’s advice during preparation of the research
proposal, particularly in the formulation of this research question.
It is in illustrating and formalising the slice-extraction refactoring that the program medium
of slides will be instrumental. Suppose the code of a program statement S is printed on a single
transparency slide. Our initial solution begins by duplicating that slide, yielding two clones, say
S1 and S2, and placing them one on top of the other. This is then followed by sliding one of the
slides (say of S2) sideways, and by adding so-called compensatory code. This compensation will
be responsible for preserving behaviour.
Behaviour can be preserved by keeping initial values of all relevant variables (in fresh backup
variables) ahead of S1, and retrieving those after S1 but ahead of S2. Furthermore, extracted
results, V, can be saved after S1 and retrieved on exit from S2. Pictorially, sliding of S, V will
turn S into something like the following (with ; for sequential composition; read top-down
for chronological order):
    (keep backup of relevant initial values)
  ; S1    (first clone, i.e. extracted code)
  ; (keep backup of final values of the extracted V)
  ; (retrieve backup of initial values)
  ; S2    (second clone, i.e. complement)
  ; (retrieve backup of final values)
A naive sliding transformation, in the form of full statement duplication (as described above),
is formally developed in Chapter 6.
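To make the schema concrete, here is a hypothetical Java rendering of naive sliding, with S the sum-and-product loop used above and V = {sum}; the backup-variable names are invented for the sketch:

    // Naive sliding: S is duplicated into clones S1 and S2, with
    // compensatory backup code around them to preserve behaviour.
    static double[] slide(int[] a) {
        double sum = 0, prod = 1;
        int i = 0;
        // keep backup of relevant initial values
        double sum0 = sum, prod0 = prod;
        int i0 = i;
        // S1: first clone, i.e. the extracted code
        while (i < a.length) { sum += a[i]; prod *= a[i]; i++; }
        // keep backup of final values of the extracted V = {sum}
        double sumFinal = sum;
        // retrieve backup of initial values
        sum = sum0; prod = prod0; i = i0;
        // S2: second clone, i.e. the complement
        while (i < a.length) { sum += a[i]; prod *= a[i]; i++; }
        // retrieve backup of final values
        sum = sumFinal;
        return new double[] { sum, prod };
    }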
A number of improved versions of sliding, with the goal of reducing code duplication, will be
explored and formalised throughout the thesis. Those will benefit from our decomposition of a
program statement into smaller entities of non-contiguous code (i.e. slides, to be formalised in
Chapter 8).
The reduction of duplication will be achieved by making both the extracted code (i.e. S1
above) and the complement (i.e. S2) smaller. In later improvements, the compensatory code will
be made smaller too (see Chapter 11).
1.5 Overview: chapter by chapter
This opening chapter has introduced the challenge of slice-extraction untangling transformations,
with the goals of improving readability and reusability of existing code. The importance and
potential implications of this refactoring and its automation have been highlighted and briefly
demonstrated through a known example from the refactoring literature. Finally, hints to our path
for automating slice extraction have been given. The rest of this thesis is structured as follows:
In Chapter 2 we present background material and related work. This includes refactoring,
slicing, and the application of slicing to refactoring in extraction of non-contiguous code.
In Chapter 3 we give background to the adopted formal approach, introducing some rele-
vant concepts from predicate calculus and predicate transformers, set theory and program
refinement.
In Chapters 4 and 5 we begin the presentation of original work by developing a formal
framework for correct slicing-based refactoring, including the definition of a programming
language, a collection of laws to facilitate program analysis and manipulation, and a method
for proving general algorithmic refinements through newly introduced slicing-related ones.
In Chapter 6 we take the first step towards slice extraction by formally developing a trans-
formation of statement duplication. The result is a naive sliding transformation, with both
the extracted code and complement being clones of the original statement.
In Chapters 7, 8 and 9 we develop a first improvement of sliding. The semantic and syntactic
requirements of slicing are derived, leading to the formalisation of a novel slicing algorithm,
one that is based on a program representation of slides. With this slicer, both the extracted
code and the complement are specialised to be the slice of extracted variables and the slice
of the remaining defined variables, respectively.
In Chapter 10 we target further reductions in the duplication caused by sliding. Those are
based on the observation that the complement (or co-slice), previously being the slice of all
non-extracted variables, can become smaller by reusing values of extracted variables.
In Chapter 11 we target the identification and elimination of redundant compensatory code,
a result of earlier formulations of sliding.
In Chapter 12 we pose and solve a couple of optimisation problems, thus yielding an optimal
slice-extraction solution via sliding.
Finally, we conclude in Chapter 13 by considering the application of sliding for automat-
ing known refactorings, discussing advanced issues and limitations, and suggesting possible
directions for future work.
1.6 Contributions
This thesis brings together the fields of program slicing and refactoring. As such, it makes four
significant contributions to those fields, as listed below.
1. It develops a theoretical framework for slicing-based behaviour-preserving transformations of
existing code. The framework, based on wp-calculus, includes a new proof method, specif-
ically designed to support slicing-related transformations of deterministic programs. The
framework further includes a novel program decomposition technique of program slides,
aiming to capture non-contiguous code.
2. It provides a provably correct slicing algorithm. This application of our theory acts as
evidence for its expressive power whilst enabling constructive descriptions of slicing-based
transformations.
3. It applies the above framework and slicing algorithm in solving the problem of slice extrac-
tion. The solution takes the form of a family of provably correct sliding transformations.
Drawing inspiration from a number of existing solutions to related problems of method
extraction, sliding is successful in providing high levels of accuracy and applicability.
4. It identifies and outlines the application of sliding to known refactorings, making them au-
tomatable for the first time. Examples of such refactorings include Replace Temp with Query
and Split Loop.
These contributions provide strong evidence for the validity of our research question. Indeed,
slicing and related analyses can assist in building automatic tools for refactoring.
Chapter 2
Background and Related Work
2.1 Refactoring
2.1.1 Informal reality
Refactoring is defined informally as the process of improving the design of existing software sys-
tems. The improvement takes the form of source code transformations. Each such transformation
is expected to preserve the behaviour of the system while making it more amenable to change. A
programmer can refactor either manually or with the assistance of automatic tools.
Refactoring was introduced by William Opdyke in his PhD thesis [48] and later became widely
known with the introduction of Martin Fowler’s book [20].
The refactoring.com website [71] maintains a list of refactoring tools and an online catalog of
refactorings [69]. The refactoring community discusses the techniques, tools and philosophy on
the refactoring mailing list [72].
In [69, 20], around 100 refactoring techniques are described. There are simple refactorings
such as renaming a class and some more complicated ones, e.g. for extracting a class or a method,
or for moving a method from one class to another. Some bigger refactorings may involve a
whole hierarchy of classes, for example introducing polymorphism, collapsing a redundant class
hierarchy, or even ones as complex and ambitious as converting a program with procedural design to a
more object-oriented one.
Being driven mostly by examples, the description of each refactoring, in [69, 20], is fairly
informal and imprecise. The success of each transformation depends on the programmer’s good
judgement, complemented with expected assistance from the compiler and the availability of a
comprehensive suite of automated tests.
Eliminating that unconvincing dependence on testing is one of the challenges of refactoring
tools. Such a tool is typically interactive: the programmer selects a specific refac-
toring from the menus; the tool, in response, performs the transformation, asking the programmer
to fill in any required details such as new names for introduced program elements.
Another (related) goal of refactoring tools is to speed up the process of refactoring, thus
supporting improved productivity of programmers. Ultimately, programmers would trust the
tools, employ them frequently, on a daily basis, as is dictated by requests for change in the
existing software on which they work.
The RefactoringBrowser for Smalltalk, developed by John Brant and Don Roberts at the
University of Illinois [53], was the first dedicated refactoring tool. Its success was followed by
several attempts to develop refactoring tools for the Java programming language [25], including
IntelliJ’s IDEA, Microsoft’s Visual Studio and (initially IBM’s) Eclipse. Those tools support
some of the offered refactorings, such as Move/Pull-Up/Push-Down/Extract/Inline Method,
Rename Field/Method/Class, Self-Encapsulate Field, Add Parameter, and Extract Interface.
However, that support is far from perfect, as short experiments we performed (first in 2003
and then again in 2005 [70, 55]) revealed. There, it was demonstrated how modern tools are
particularly weak in supporting cases where non-trivial data-flow and control-flow analyses are
required. These shortcomings led, in some cases, to an apparently successful refactoring that
yielded grammatically incorrect code; in other cases, potentially correct transformations were
unnecessarily rejected due to inaccurate, and at times incorrect, analysis.
Such bugs in refactoring tools call for a review of refactoring theory. Their existence also acts
as motivation for the formal approach taken in this thesis.
2.1.2 Underlying theory
Program representation and analysis
As a result of developing several versions of the Smalltalk RefactoringBrowser, Roberts [52] iden-
tified several criteria, both technical and practical, necessary to the success of a refactoring tool.
The technical requirement is that the tool must maintain a program database, that holds all the
required information about the refactored program’s entities, e.g. packages, classes, fields, methods
and statements, and also their relations and cross-references. The database should enable the tool
to check properties of the program both when checking whether a refactoring request is legal and
when performing the transformation. As the source code may constantly change, either manually by
the programmer or by the refactoring (or any other source code manipulation) tool, the program
database must also be constantly updated. Regarding the techniques that can be used to construct
the program database, Roberts states that “at one end of the scale are fast, lexical tools such as
grep. At the other end are sophisticated analysis techniques such as dependency graphs. Some-
where in the middle is syntactic analysis using abstract syntax trees. There are tradeoffs between
speed, accuracy, and richness of information that must be made when deciding which technique
to use. For instance, grep is extremely fast but can be fooled by things like commented-out code.
Dependency information is useful, but often takes a considerable amount of time to compute”.
Existing tools mostly use the abstract syntax tree (AST) compromise, whereas the analysis
required for transformations in this thesis will be of the kind applied in constructing dependency
graphs. In doing so, and in light of Roberts’ observation, as stated above, we pay some attention
to efficiency and performance, when constructively expressing algorithms for analysis and trans-
formation. In particular, most of those will indeed be tree based and require only one pass over
an analysed program’s AST. (This is made possible by the simplicity of our supported language.)
However, behaviour preservation, as is discussed next, will be our prime goal. Consequently,
we shall be concerned with correctness of our algorithms more than with their corresponding
performance and complexity.
Behaviour preservation
Roberts further discusses the accuracy property expected from a refactoring tool. He argues that
the refactorings that a tool implements must reasonably preserve the behaviour of programs, as
total behaviour preservation is impossible to achieve. “For example, a refactoring might make
a program a few milliseconds faster or slower. Usually, this would not affect a program, but
if the program requirements include hard real-time constraints, this could cause a program to
be incorrect”. The reasonable behaviour-preservation degree that should be expected from a
refactoring tool was formally defined by Opdyke [48] as a list of seven properties that the two
versions of a program (before and after a refactoring) must satisfy. The first six involve syntactical correctness
properties that are necessary for a clean compilation of both versions of the program. The seventh
property is called “Semantically equivalent references and operations”, and is defined as follows:
“Let the external interface to the program be via the function main. If the function main is called
twice (once before and once after the refactoring) with the same set of inputs, the resulting set of
output values must be the same” [48].
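Opdyke’s seventh property can be read as the following (hypothetical) harness, comparing the outputs of the two versions on the same inputs; a real guarantee must of course cover all inputs, not a finite sample:

    import java.util.List;
    import java.util.function.Function;

    class BehaviourCheck {
        // Returns true iff the two versions of the program agree on
        // every one of the given inputs.
        static <I, O> boolean agreeOn(Function<I, O> before,
                                      Function<I, O> after,
                                      List<I> inputs) {
            for (I input : inputs) {
                if (!before.apply(input).equals(after.apply(input)))
                    return false; // observable behaviour changed
            }
            return true;
        }
    }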
This property, when dealing with terminating sequential programs, corresponds to the concept
of refinement (see Section 3.4 in the next chapter). And indeed, in his PhD thesis (“Refactorings
as Formal Refinements” [11]), Márcio Cornélio formalised a large number of Fowler’s refactorings
as “algebraic refinement rules involving program terms”. The supported language (ROOL, for
“Refinement Object-Oriented Language”) is said to be “a Java-like object-oriented language”
with formal semantics based on weakest preconditions (see Chapter 3 ahead).
However, Cornélio does not support the refactoring for removing temps (Replace Temp with
Query) which is targeted by this thesis. To formalise and solve such refactoring problems, the
original refinement calculus approach, as presented by Morgan [45] and others, needs to be
combined with projection onto a subset of the program variables, as we discuss in Chapter 5. Thus,
like Cornélio, we base this work on formal refinement and weakest-preconditions semantics.
For simplicity, and due to the intricate nature of the problem, we shall target a very simple
imperative language, rather than a fully object-oriented one. For the same reasons, we shall focus
on preservation of semantics alone, while avoiding all (important in themselves) questions over
syntactic validity of transformed programs (as e.g. expressed in Opdyke’s first six properties).
Composition of refactorings
Opdyke defined high-level refactoring techniques as a composition of lower-level ones [48]. Each
low-level refactoring is defined with a corresponding set of preconditions. Those, expressed in first
order logic over predicates available in the program database, must be satisfied by the program
and the refactoring criterion (i.e. the type of refactoring and the accompanying parameters chosen
by the user) before a correct refactoring can be performed. The refactoring tool is responsible for
performing such checks.
A naive implementation of refactoring composition would update the program database after
every step. When the composition consists of a long sequence of refactorings, this may yield an
inefficient and slow tool. One approach for reducing the amount of analysis in the implementation
of composite refactorings was introduced in [52]. There, each refactoring’s definition was aug-
mented with a set of properties that a program will definitely satisfy after the transformation, i.e.
postconditions. Using this information, after each step of the composed refactoring, the program
database can be incrementally updated, rather than be re-computed from scratch.
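A minimal sketch, with all names assumed, of this precondition/postcondition scheme for composite refactorings:

    import java.util.List;

    // Each step checks its preconditions against the program database,
    // transforms the source, and then applies its postconditions to
    // update the database incrementally rather than from scratch.
    interface ProgramDatabase { /* classes, methods, cross-references... */ }

    interface Refactoring {
        boolean preconditionsHold(ProgramDatabase db);
        void perform(ProgramDatabase db);
        void applyPostconditions(ProgramDatabase db);
    }

    class CompositeRefactoring {
        // Runs a sequence of steps, rejecting the composition as soon
        // as one step's preconditions fail.
        static boolean run(List<Refactoring> steps, ProgramDatabase db) {
            for (Refactoring step : steps) {
                if (!step.preconditionsHold(db)) return false;
                step.perform(db);
                step.applyPostconditions(db);
            }
            return true;
        }
    }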
The approach taken in this thesis, however, is somewhat different. Indeed, our transformations
shall be composed of (at times exceedingly long) sequences of smaller steps. But instead of
expecting the actual tool to perform each and every step, we shall first formally develop a solution
“by hand”; then overall preconditions shall be carefully collected; thus the bigger refactorings shall
be formally derived from existing, smaller ones, hence potentially leading to more efficient tools.
As was mentioned in the preceding chapter, refactoring tools can benefit from the capabilities
of a decomposition technique known as program slicing. For this we now turn to present relevant
background on slicing, before relating the two (in the section to follow).
2.2 Slicing
Program slicing is the study of meaningful subprograms. Typically applied to the code of an
existing program, a slicing algorithm is responsible for producing a program (or subprogram) that
preserves a subset of the original program’s behaviour. A specification of that subset is known as
a slicing criterion, and the resulting subprogram is a slice.
Slicing was invented with the observation that “programmers use slices when debugging” [62].
Nevertheless, the application of program slicing does not stop there. Further applications include
testing [29, 8], program comprehension [9, 51], model checking [44, 15], parallelisation [63, 5],
software metrics [50, 43], as well as software restructuring and refactoring [40, 42, 17]. The latter
application is considered in this thesis.
There can be many different slices for a given program and slicing criterion. Indeed, there
is always at least one slice for a given slicing criterion: the program itself [61]. However, slicing
algorithms are usually expected to produce the smallest possible slices, as those are most useful
in the majority of applications.
2.2.1 Slicing examples
Here is a variation on the “hello world” of program slicing, computing and printing the sum and
product of all numbers in a given array of integers. The index of each statement is given to its
left, for later reference.
original

1   i := 0
2 ; sum := 0
3 ; prod := 1
4 ; while i<a.length do
5     sum := sum+a[i]
6   ; prod := prod*a[i]
7   ; i := i+1
    od
8 ; out << sum
9 ; out << prod

slice of sum from 8

  i := 0
; sum := 0
; while i<a.length do
    sum := sum+a[i]
  ; i := i+1
  od

slice of prod from 9

  i := 0
; prod := 1
; while i<a.length do
    prod := prod*a[i]
  ; i := i+1
  od
A slice of sum from statement 8 must contain the statements {1,2,4,5,7}, and thus can be
obtained by deleting the irrelevant statements {3,6,8,9}. Similarly, a backward slice of prod from
9 should contain {1,3,4,6,7}, and can be obtained by deleting {2,5,8,9}.
2.2.2 On slicing and termination
An interesting aspect of program behaviour is that of termination. Do we expect a slice to
preserve conditions for termination? For example, should the loop statement in the program
above be included in the slice for the array a from statement 9? And what if a.length is negative?
(Of course this should never happen, unless the program is misbehaving. But a slicer must be
prepared for all possible program behaviours, including such abnormalities.)
When conditions for termination are preserved, the slice is said to be termination sensitive
[35]. Such slices have been applied e.g. in model reduction [15].
However, in an attempt to yield the smallest possible slices, it is common to remove non-
affecting code even if this code might not terminate. This way, the empty statement skip is a
valid slice for the array a in the example above. Thus, slicing may introduce termination, as is
incidentally the case with refinement (see Section 3.4 of the following chapter).
2.2.3 Slicing criterion
An interactive tool for slicing can be seen as an extension to a source code editor, where the user
can select where to slice from and the tool answers by highlighting the set of statements in the
slice. The user selection, i.e. the slicing criterion, can be specified in different ways. In Weiser’s original definition [64], it was a pair, ⟨i, V⟩, combining a program point, i, and a set of variables, V, e.g. ⟨8, {sum}⟩ in the example above.
A simplified version, ⟨i⟩, is obtained by omitting the set of variables, e.g. ⟨8⟩. There, all the variables that are used in the selected substatement are of interest (e.g. ⟨8, {out, sum}⟩ above). Some slicing algorithms (most notably the PDG-based slicers [6, 33]) support this kind of criterion exclusively.
A third variation of the slicing criterion formulation is obtained by selecting a (possibly compound) statement and a set of variables of interest. Here, by avoiding any mention of a program point, we mean to slice from the end of the selected statement. For example, the slice of ⟨8, {sum}⟩ (shown above) is a valid slice of ⟨S, {sum}⟩ (where S stands for the compound statement 1-9), whereas the slice for ⟨T, {sum}⟩ (where T is the compound statement of 1-3) would consist of substatement 2 alone. Similarly, the slice for ⟨S, {out}⟩ is the full S. (Note that here the scope for slicing is mentioned explicitly whereas otherwise it is implicitly expected to be the whole program.) This kind of criterion has appeared e.g. in [59] and is used, exclusively, in this thesis.
2.2.4 Syntax-preserving vs. amorphous and semantic slices
When the slice is limited to constructs from the original program, it is said to be syntax preserving.
Such slices are constructed by deleting irrelevant statements from the original program. Thus, a
syntax-preserving slice of a given program statement corresponds to a substatement, possibly a
non-contiguous one.
Amorphous slicing, in turn, combines slicing with a range of transformations, in simplifying
the resulting code (see e.g. [27]). For example, a termination-sensitive slice of the array a, above,
would potentially be able to exclude the loop from the slice, replacing it with a single test of the
length of a. This way, the initialisation of variable i would be successfully and correctly removed.
Semantic slicing (defined e.g. in [59]) specifies the semantic requirements expected of slices, and accepts any program meeting those requirements as a valid slice. When constructively computing semantic slices, techniques similar to those of amorphous slicing are used to simplify the result.
2.2.5 Flow sensitivity: backward vs. forward slicing
When a program analysis result depends on the order of the statements (i.e. when the analysis of
a program S1; S2 is expected to differ from the analysis of S2; S1) the analysis is said to be flow-sensitive [47]. For producing smaller, more accurate slices, a slicing algorithm should be
flow-sensitive. As such, it can be applied in one of two directions.
Traditional slicing, as presented so far, is known in that respect as backward slicing. Indeed, as
is the case with backward analysis [47, 54], its algorithms propagate information against the flow
of control, while answering questions of what might have happened before arriving at a program
point. The complementary technique is called forward slicing and is computed by looking forward
from a selected program point, thus answering the question “which statements may be affected
by the value computed at this point?”
A slicing algorithm to be developed in this thesis, in Chapter 9, will compute backward slices.
An initial version will be flow-insensitive. Then, flow-sensitivity will be gained by transforming
a program to and from a static-single-assignment (SSA) form, before and after the slicing, re-
spectively. Background on the SSA form will be given shortly, but first we turn to discuss slicing
algorithms.
2.2.6 Slicing algorithms
In this thesis we shall target intraprocedural, backward, syntax-preserving and static slicing. A
variety of such algorithms exists. Those are based on a wide range of program representations,
from abstract syntax trees (AST) [19, 18], through value dependence graphs (VDG) [16, 60]
and the static single information (SSI) form [54], to control flow graphs (CFG) [61, 64] and the
program dependence graph (PDG) [49, 33, 6]. We briefly mention Weiser’s original approach and
the PDG-based one.
According to Weiser [61], automatic slicing should be performed on the program’s flow graph
[1] using data-flow equations. Those approximate the set of variables that may be (directly or
indirectly) relevant for a given slicing criterion at each node of the graph. A node n is added to the
slice if it defines (i.e. may modify the value of) any of those relevant variables (associated with n).
Furthermore, any “branch statement which can choose to execute or not execute” another node
which is already in the slice, should, in general, be added to the slice [61]. Thus, both data-flow
and control-flow influences are taken into account, iteratively, until a fixed point is reached.
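As a rough illustration — ours, in Python, and no substitute for Weiser’s full algorithm — the data-flow part of the scheme can be sketched for straight-line code, where a single backward pass suffices and neither control-flow influences nor the fixed-point iteration are needed. The (label, defs, refs) representation of statements is an assumption of the sketch:

def weiser_slice(stmts, criterion_vars):
    # Backward pass: a statement enters the slice if it defines a variable
    # that is currently relevant; its referenced variables then become relevant.
    relevant = set(criterion_vars)
    in_slice = set()
    for label, defs, refs in reversed(stmts):
        if defs & relevant:
            in_slice.add(label)
            relevant = (relevant - defs) | refs
    return in_slice

# Statements 1-3 of the sum-and-prod example, sliced for sum at their end:
stmts = [(1, {'i'}, set()), (2, {'sum'}, set()), (3, {'prod'}, set())]
print(weiser_slice(stmts, {'sum'}))   # {2}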
When the slicing criterion involves all variables that are referenced in a selected program point,
the program dependence graph (PDG) offers a fast algorithm for computing the corresponding
slice. The PDG, like the flow graph, contains a node for each program statement; the directed
edges (relevant for slicing) correspond to direct data-flow and control-flow influences, respectively.
Thus, slicing is reduced to a reachability problem. Each node from which there is a directed path
to the selected node (the one corresponding to the slicing criterion) is added to the slice. This solution is
particularly effective in a situation in which many slices are to be computed on the same program,
since the time to construct the PDG can hence be amortised over all slice computations.
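To make the reachability reading concrete, the following Python sketch slices the sum-and-prod example by graph search; the dependence edges are written out by hand here (a real slicer would compute them), and under this node-based criterion the selected statement itself belongs to the slice:

from collections import deque

# node -> the nodes it directly depends on (data- or control-flow)
DEPS = {1: set(), 2: set(), 3: set(),
        4: {1, 7},             # the loop test reads i
        5: {1, 2, 4, 5, 7},    # sum := sum+a[i]
        6: {1, 3, 4, 6, 7},    # prod := prod*a[i]
        7: {1, 4, 7},          # i := i+1
        8: {2, 5},             # out << sum
        9: {3, 6, 8}}          # out << prod (out was appended to at 8)

def pdg_slice(node):
    # Everything from which the criterion node is reachable, i.e. everything
    # reachable from it along reversed dependence edges.
    seen, work = {node}, deque([node])
    while work:
        for m in DEPS[work.popleft()]:
            if m not in seen:
                seen.add(m)
                work.append(m)
    return seen

print(sorted(pdg_slice(8)))   # [1, 2, 4, 5, 7, 8]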
Note that both Weiser’s and the PDG-based approaches consider control and data dependences
on the same level. That is, at each step of the respective algorithm, both kinds of influences are
taken into account in adding statements to the computed slice. That choice is challenged in this
thesis.
As an alternative, we shall primarily consider control dependences, in producing program
entities called slides (see Chapter 8). Those slides shall then be treated as primitives in a novel slicing algorithm (Chapter 9) that involves data dependences (or rather slide dependences) only.
Thus, when interested in slices from the end of a program, our algorithm will yield traditional
slices, as in the algorithms above, on the one hand, while producing potentially smaller statements,
for what we call co-slices (Chapter 10), on the other.
Our program representation of slides and hence the slicing algorithm will benefit from the popu-
lar intermediate representation (IR) of static single assignment (SSA) [12, 54], which is introduced
next.
2.2.7 SSA form
Static single assignment form is an intermediate program representation in which “every variable
assigned a value in it occurs as the target of only one assignment” [46]. As such, it has proved useful
in static program analysis, in particular for implementing fast and effective optimizing compilers
[46]. Typically, a program is translated into SSA form before performing some program analyses
and related transformations (e.g. constant propagation, invariant code motion); once done, the
transformed program is translated back to its original form.
Jeremy Singer defines the SSA form, along with its younger sister of static single information
(SSI), as members of a more general family of IRs, of virtual register renaming schemes (VRRS)
[54]. Any VRRS family member can be generally described as a control flow graph (CFG) with a
certain renaming scheme applied to its set of virtual registers.
When a defined register (or variable) is renamed, all its references must be renamed too. Thus,
each reference must be reached by a single corresponding definition. This is achieved by adding so-
called pseudo variables in merge points of the CFG. Those merge all reaching definitions into one
new name, using a so-called φ-function. For example, on exit from an IF statement, x3 := φ.(x1, x2) would merge the two values x1 and x2 (we call them instances, and accordingly x3 a pseudo instance) of the two branches of the IF. The merge is such that x1 will be taken on arrival from one branch, and x2 if arriving from the other. In general, there are as many arguments to a
φ-function as there are incoming edges in the control flow.
In our context of source-to-source transformations, supporting a simple language with struc-
tured control flow and no aliasing (as will be explained later), we shall be interested in an SSA-like
renaming on the level of program variables (rather than virtual registers). For formalising and
giving examples of SSA, and since our φ-functions will always require two arguments, we shall
avoid them altogether. Instead, they will be pushed back to the incoming branches, separated
into two individual assignments (e.g. x3 := x1 at the end of one branch of an IF statement, and x3 := x2 at the other). Accordingly, a variable that is assigned (and used) in a DO loop will
have a designated pseudo instance assigned both before the loop and at the end of its body. For
example, the SSA version of the sum and prod program from above will be as follows:
  i1 := 0
; sum2 := 0
; prod3 := 1
; i4,sum4,prod4 := i1,sum2,prod3
; while i4<a.length do
    sum5 := sum4+a[i4]
  ; prod6 := prod4*a[i4]
  ; i7 := i4+1
  ; i4,sum4,prod4 := i7,sum5,prod6
  od
; out8 := out ++ sum4
; out9 := out8 ++ prod4
Note that appending to the out stream had to be simplified (with ++ taking the place of <<).
Note also that, following SSA tradition, we rename instances by adding a subscript. However, we
deviate from the common practice of using increasing natural numbers for the instances of each
variable. Instead, in program examples we prefer to use, for each assignment, its (informal) label
as a unique subscript (for all variables defined in it). Thus, our loop pseudo instances (e.g. i4) are
defined in two distinct places, on the one hand, but both define the same instance, on the other.
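For straight-line code, such a renaming is a single forward pass. The following Python sketch (with our own simplified representation — one assignment per line, whitespace-separated tokens — and leaving the loop pseudo-instance treatment described above aside) illustrates the idea:

def to_ssa(stmts):
    current = {}    # variable -> its current instance name
    renamed = []
    for label, target, rhs in stmts:
        # rename uses to current instances, then create a fresh instance
        # for the target, subscripted by the assignment's label
        rhs = ' '.join(current.get(tok, tok) for tok in rhs.split())
        current[target] = f'{target}{label}'
        renamed.append(f'{current[target]} := {rhs}')
    return renamed

print(to_ssa([(1, 'i', '0'), (2, 'sum', '0'), (3, 'sum', 'sum + i')]))
# ['i1 := 0', 'sum2 := 0', 'sum3 := sum2 + i1']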
The SSA form will be formalised in Section 8.6 and Appendix D and applied for slicing and
sliding. In particular, our new slicing algorithm is based on slides of the SSA form. This slicing
algorithm should in turn be useful in automating slice-extraction transformations. Accordingly,
we now turn to present background material on that problem.
2.3 Slice-extraction refactoring
An interactive process for behaviour-preserving method extraction was first proposed by Opdyke
[48] and (independently) by Griswold and Notkin [26]. This was however restricted to the extrac-
tion of contiguous code.
Maruyama, in a paper titled “Automated Method-Extraction Refactoring by Using Block-
Based Slicing” from 2001 [42], proposed a scenario in which extraction is performed on a block of statements (i.e. a compound statement — we shall simply call it a statement — acting as scope for extraction) and a user-selected variable whose computation is to be extracted from the
remaining computation. A slice of the selected variable in the selected scope is extracted into a
new method; a call to that method is placed ahead of the code for the remaining computation. We
call the latter the complement, borrowing Gallagher’s terminology [22]. Maruyama’s rudimentary
solution is described formally; the importance of proving correctness is highlighted, but no proof
is given.
Earlier research on extracting slices from existing systems, in the context of software reverse
engineering and reengineering, including that of Lanubile and Visaggio [41] and Cimitile et al. [10],
has focused mainly on how to discover reasonable slicing criteria. In our context of refactoring,
we prefer to leave the choice of what to extract to the programmer. Our approach, in turn, will
focus mainly on how to reorganise the original code to benefit from the extraction.
In what follows, we explore state-of-the-art solutions to a related problem, which we call arbitrary method extraction, in which, instead of extracting the slice of a variable, a set of (not necessarily contiguous) statements is selected for extraction.
Lakhotia and Deprez defined an arbitrary-method-extraction transformation called Tuck [40].
In tucking, a selection of statements and scope for extraction is made (either by the user or
some other tool), and a tool, in response, computes the slice (of the selected statements) and
complement, and composes them sequentially, along with some compensation that may include
backup of initial values and variable renaming (in the complement). The complement itself is
computed through slicing from all non-extracted statements (in the selected scope). If both the
slice and its complement define a variable that is live on exit from scope, the transformation is
rejected.
Suppose we are asked to extract the slice of statement 2, in the following:
1 ; while i<a.length do
2     sum := sum+a[i]
3   ; prod := prod*a[i]
4   ; i := i+1
    od
Tucking would compute a statement made of {1,2,4} as the slice to be extracted; from the remaining statement, 3, it would compute {1,3,4} as basis for the complement; thus so long as the variable i is not live-on-exit (i.e. will not be used before being re-defined after the loop), we
would get something like:
ii := i
; while i<a.length do
sum := sum+a[i]
; i := i+1
od
; while ii<a.length do
prod := prod*a[ii]
; ii := ii+1
od
The final step of Tuck would then fold the extracted slice into a reusable method.
The 2000 version of arbitrary method extraction by Komondoor and Horwitz (to be referred
to as KH00 [38]) is particularly effective in reordering statements. A sequence of statements
is selected as scope, and a subset of those is selected for extraction. The algorithm seeks valid
permutations of the sequence in which the selected statements are grouped together (i.e. forming
contiguous code). The validity of a permutation depends on some control flow and data flow
related constraints.
For example, suppose we are asked to extract the computation and printing of sum, on state-
ments {2,3,7}in the following:
1   i,sum := 0,0
2 ; while i<a.length do
3     i,sum := i+1,sum+a[i]
    od
4 ; i,prod := 0,1
5 ; while i<a.length do
6     i,prod := i+1,prod*a[i]
    od
7 ; out << sum
8 ; out << prod
KH00 would successfully yield the following two alternatives:
1   i,sum := 0,0
2 ; while i<a.length do
3     i,sum := i+1,sum+a[i]
    od
7 ; out << sum
4 ; i,prod := 0,1
5 ; while i<a.length do
6     i,prod := i+1,prod*a[i]
    od
8 ; out << prod

4   i,prod := 0,1
5 ; while i<a.length do
6     i,prod := i+1,prod*a[i]
    od
1 ; i,sum := 0,0
2 ; while i<a.length do
3     i,sum := i+1,sum+a[i]
    od
7 ; out << sum
8 ; out << prod
In comparison to tucking, this algorithm extracts precisely the selected statements, even if
those do not form a complete slice. This is made possible by allowing two complementary parts:
one to be executed before the extracted code, the other after.
According to this algorithm, neither duplication nor compensatory code is permitted. Conse-
quently, in cases where no permutation satisfies all constraints, the transformation is rejected.
In their 2003 version of arbitrary method extraction [39], Komondoor and Horwitz have relaxed
their earlier restriction on duplication: this time predicates (as well as jumps, which are outside
the scope of this thesis) are allowed to be duplicated, while other statements, e.g. assignments, are
not. Furthermore, this time, instead of having to reject some transformation requests, when no
permutation satisfies all ordering constraints, the new strategy is to drag problematic statements
along with the selected statements, to the extracted part of the resulting program. They refer to
that dragging as promotion.
A first criticism of such promotion, by Harman et al., has appeared in [28], where amorphous
slicing has been suggested for procedure and function extraction. Their target was to support
program comprehension. Accordingly, their transformations are exploratory, with no aim to keep
the resulting program, as we do in refactoring.
Another technique of code untangling for program comprehension, called fission (reverse of
fusion), has been suggested by Jeremy Gibbons [24]. With fission, the design of a program can
be reconstructed from its implementation. Gibbons illustrates the approach using code examples
from the slicing literature [22]. Indeed, according to Gibbons, “slicing is a fission transformation,
reversing the fusion of independent but similarly-structured computations”. In contrast to program
comprehension, in refactoring the focus is on automation with syntax preservation, such that the
resulting program would reflect an update in the design whilst being easily recognised by the
programmer.
As with Harman et al. and Gibbons, the KH03 is not (and was not designed to be) a slice-
extraction algorithm. (It was actually designed for reducing code duplication by eliminating
multiple clones of code, replacing them all with method calls.) For example, KH03 cannot untangle
the computation of sum from prod, as the Tuck transformation does, since that would involve
a duplication of the assignment to i. In general, in KH03, any loop would have to be either
completely extracted or not at all.
In comparison to its predecessor KH00, the KH03 algorithm presents a step forward in the sense
that some duplication is allowed, thus rendering it more applicable (in fact it is totally applicable,
as in the worst case it extracts the whole program in scope). It offers another improvement with
respect to jumps (which are, again, outside the scope of our investigation). However, in an attempt
to make it more scalable (its complexity is polynomial, compared with the exponential KH00), it
might extract more statements. In the example of KH00 above, the KH03 algorithm would fail to
move the computation of prod out of the way; instead it would extract it along with the selected
statements.
Nevertheless, the KH03 algorithm offers one improvement over its predecessors, which is rele-
vant for slice extraction. Komondoor and Horwitz criticise (in [39]) the Tuck transformation for
not allowing data to flow from the extracted slice to its complement. This results in too large
complements, as is demonstrated in the next example. Suppose we are asked to extract statements
{1,2,4,6} (i.e. the slice of out) from the following program (the first version below). KH03 would yield, in response, the second version:
1 ; while i<a.length do
2     sum := sum+a[i]
3   ; prod := prod*a[i]
4   ; i := i+1
    od
5 ; avg := sum/a.length
6 ; out << sum

1 ; while i<a.length do
2     sum := sum+a[i]
3   ; prod := prod*a[i]
4   ; i := i+1
    od
6 ; out << sum
5 ; avg := sum/a.length
Note that tucking, on this example, would have duplicated the entire computation of sum,
whereas the KH00 algorithm would have failed, since the selection does not form a valid sequence.
The challenge of this thesis will be to combine the untangling abilities of Tuck with improved
applicability and reduced levels of code duplication, as in KH03, thus yielding (for the example
above, and even in a case where i is live-on-exit) something like:
ii := i
; while i<a.length do
sum := sum+a[i]
; i := i+1
od
; out << sum
;
i := ii
; while i<a.length do
prod := prod*a[i]
; i := i+1
od
; avg := sum/a.length
This completes our presentation of background material and related work on the topics of
refactoring, slicing and slicing-based refactoring. Our approach to solving the problem of slice
extraction is based on formal semantics using so-called predicate transformers. The next chapter,
our second and last background chapter, will introduce relevant concepts and basic theory.
Chapter 3
Formal Semantics: Predicate Transformers
This chapter introduces background material on the formal approach for program semantics to be
adopted by this thesis. It is mainly based on Dijkstra and Scholten’s monograph Predicate Calculus
and Program Semantics [13] (to be referred to as DS). Relevant properties and theorems will be
recalled. Those will later be used in formally developing our framework of correct transformations.
Furthermore, background on the concept of refinement and its relevance for refactoring and slicing
is presented.
3.1 Set theory for program variables
Some operations and properties from set theory will be useful in discussing sets of program vari-
ables and in calculating program properties.
3.1.1 Sets and lists of distinct variables
For simplicity and convenience, we will interchangeably speak of lists and sets of variables. In
programs (as well as descriptions of transformations), lists will often be used, whereas in semantic
reasoning and calculation sets will be preferred. This choice can be justified by the fact that in
our use of lists, the order of elements will only be significant for matching with corresponding
lists (e.g. the two lists in a multiple assignment statement x,y := 1,2) and elements will not
appear more than once (as is the case for sets).
Thus, we will also take the liberty to use set operations directly on (such) lists. This is a
mere shorthand for taking the sets corresponding to those lists before applying the operation, and
turning the result back into linear list form, afterwards.
The size of a set (or length of a list) V will be denoted |V|.
3.1.2 Disjoint sets and tuples
Since disjointness of two sets will be used extensively, we adopt the notation V1 ⊥ V2 as shorthand for V1 ∩ V2 = ∅, where V1, V2 are either lists or sets.
When referring to the union of disjoint sets (of variables), say X and Y, we shall write (X,Y). This should be understood as X ∪ Y with an implicit statement that X ⊥ Y is given. Note that as with n-tuples, any number of sets would be admitted, and the brackets are not optional.
That same notation shall be used also for pair (or in general n-tuple) forming. There, however,
the elements will not necessarily be sets of variables.
Admittedly, having the same notation for both tuples and disjoint set-union is potentially
confusing. Nonetheless, it appears that the actual meaning can be easily inferred from the context.
3.1.3 Generating fresh variable names
When performing a transformation, we shall soon find ourselves with a need to generate fresh
variable names. For this purpose, we offer two versions of a function called fresh. The first version
shall take a pair (n,V) as an argument, with n a natural number (for length) and V a set of variables (that are presently in use). In turn, it shall produce a set of fresh names, say X′, such that (Q1:) |X′| = n, and (Q2:) X′ ⊥ V.
Here, (Q1:) and (Q2:) are names of the formal requirements (or postconditions). Those
names will be recalled when applying fresh, in hints of derivation steps. We shall use this format
throughout the thesis.
Using an infix ‘.’ (= full stop) for function application, we shall be writing X′ := fresh.(n,V), where n will typically be the length of a given set, say |X|.
When new instances (e.g. for backup) of existing variables are required, the second version of fresh will be used. It takes the form X′ := fresh.(X,V) with (X,V) a pair of sets of variables. This time, we postulate (Q1:) |X′| = |X|, (Q2:) X′ ⊥ V, as before, and an extra requirement (Q3:) X = sv.X′, where sv is a globally available mapping of such freshly generated variables to their corresponding original source variables.
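A small Python sketch of both versions follows, under the assumption (ours) that names of the shape v0, v1, ... are acceptable as fresh; the caller is expected to pass all names presently in use as V:

sv = {}   # the globally available mapping from fresh names to source variables

def fresh(arg, in_use):
    # fresh.(n,V): arg is a number n; yields n names disjoint from V (Q1, Q2).
    # fresh.(X,V): arg is a list X; additionally records sources in sv (Q3).
    sources = list(arg) if isinstance(arg, (list, tuple)) else [None] * arg
    names, i = [], 0
    for src in sources:
        while f'v{i}' in in_use or f'v{i}' in names:
            i += 1
        names.append(f'v{i}')
        if src is not None:
            sv[names[-1]] = src
    return names

print(fresh(2, {'x', 'v0'}))       # ['v1', 'v2']
print(fresh(['x', 'y'], {'v1'}))   # ['v0', 'v2'], with sv mapping them to x, y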
3.2 Predicate calculus
3.2.1 The state-space metaphor
In imperative programming, a given program, say S, manipulates variables by changing their
corresponding value. The collective value of all program variables, at any point of execution, is
known as ‘the state’. A computation under control of S begins with a given initial state (i.e. its
input), and terminates (if at all) in a final state (i.e. its output).
This terminology follows a metaphor of a ‘state space’, according to which, each program
variable, say x, being associated with a possibly infinite but denumerable and non-empty set of
distinct possible values (i.e. its type, denoted T.x), stands for a dimension of the state space.
Each possible value, val ∈ T.x, is then associated with a single coordinate. Thus, any point in the state space uniquely represents (by its coordinates) the value of all program variables. (This should not be confused with abstract values, which are normally represented by program variables — it is the variables’ corresponding concrete values that are represented, at least metaphorically, by a point in the state space.)
3.2.2 Structures, expressions and predicates
According to DS, a structure is “an abstraction over expressions in program variables in the sense
that the state space with its individually named dimensions has been eliminated from the picture”
[13, Page 5]. There, that abstraction was chosen for developing a general theory. Since we adopt
their theory only in the context of program semantics, we shall directly speak of expressions (over
the state space). Note that structures, and hence expressions in program variables, are associated
with a type. Thus integer expressions are distinguished from e.g. boolean expressions. The latter
expressions are also known as predicates.
Hence, a predicate is an expression whose so-called global (i.e. free) variables are program
variables. When evaluated, on a particular state, those variables are assigned (i.e. replaced with)
the specific values (corresponding to that state). Thus, a predicate expresses a dichotomy on a
program’s state space.
The syntax for expressing predicates includes the constants (or boolean scalars) false and true, relational operations (e.g. =, ≠, <, ≤) on expressions with program variables and possibly logical variables which must be local (i.e. bound) in the predicate, logical connectives (e.g. ¬, ∧, ∨, ⇒), the universal and existential quantifiers (∀ and ∃ respectively) and specific predicate transformers (i.e. functions from predicates to predicates) which will define the semantics of our programming language.
The universal (∀) and existential (∃) quantifiers generalise conjunction and disjunction, respectively. The format of the former is (quoted here from DS):
(∀ dummies : range : term) .
“Here, dummies stands for an unordered list of local variables, whose scope is delineated by the
outer parenthesis pair. In what follows, x and y will be used to denote dummies; the dummies
may be of any understood type.
The two components range and term are boolean structures, and so is the whole quantified
expression, which is a boolean scalar if both range and term are boolean scalars. Range and
term may depend on the dummies; their potential dependence on the dummies will be indicated
explicitly by using a functional notation, e.g., if a range has the form r.x ∧ s.x.y, it is a conjunction of r.x, which may depend on x but does not depend on y, and s.x.y, which may depend on both.
. . . For the sake of brevity, the range true is omitted” [13, Pages 62-63].
The format for existential quantification is similar, only with ∃ in place of the ∀.
3.2.3 Square brackets: the ‘everywhere’ operator
As mentioned, predicates may involve global occurrences of program variables. An important
function from boolean structures (i.e. predicates) to boolean scalars (i.e. true and false ) is the
so-called everywhere operator [13, Page 8]. Its application is denoted by surrounding a predicate,
say P, with a pair of square brackets, [P].
When applied to a boolean scalar, the everywhere operator acts as identity (i.e. [true] = true
and [false] = false ); when applied to a predicate on a given state space, it acts as the universal
quantification over all variables (i.e. dimensions) of that space ([13, Page 115]). Thus, [P] yields
true if and only if P holds in every single point of the state space (i.e. for any possible assignment
of values to program variables occurring in it).
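Since the state space is infinite, [P] cannot in general be computed by enumeration; still, a brute-force check over a small finite fragment — a sketch of our own, useful only for refuting a claimed [P] — conveys the idea:

from itertools import product

def everywhere(pred, variables, domain):
    # evaluate pred at every point of the finite state space domain^|variables|
    return all(pred(dict(zip(variables, point)))
               for point in product(domain, repeat=len(variables)))

# [x = y] fails to hold everywhere, whereas a tautology does:
assert not everywhere(lambda s: s['x'] == s['y'], ['x', 'y'], range(-3, 4))
assert everywhere(lambda s: s['x'] == s['y'] or s['x'] != s['y'],
                  ['x', 'y'], range(-3, 4))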
3.2.4 Functions and equality
Functions in DS are always total in their arguments (i.e. well-defined for all possible values of their
arguments). They are defined as the unique solution of “an equation that contains the argument(s)
of the function being defined as parameter(s)” [13, Page 18].
Function application — as mentioned, denoted by an infix ‘.’ (= full stop) — is left-associative,
such that f.x.y should be read as (f.x).y.
For example, an integer increment function incr.x = x + 1 (with the operator + itself defined as a function of its operands) is defined as the solution of y : [y = x + 1], or simply [incr.x = x + 1].
Note that the equality of a pair of expressions over the state space, as in y = x + 1, is not merely a boolean scalar, but rather a boolean expression over that same state space. (For example, consider the state space spanned by (x,y); there, applying y = x+1 to point (5,6) yields true but
applying it to (5,7) yields false.) Hence, it is the square brackets, i.e. the everywhere operator,
that turns the boolean expression into a scalar.
That function application preserves equality — a statement attributed to Leibniz and hence
sometimes referred to, in hints, as “Leibniz” — is formulated, for any function f and arguments
x and y, as
[x = y] ⇒ [f.x = f.y] .    (3.1)
The equality of expressions of type boolean, i.e. equivalence of predicates, can be written either with = or ≡. The latter is assigned a lower binding power than all other logical connectives, such that the round brackets in e.g. [(P ∧ Q) ≡ (¬P ∨ Q)] can be removed.
3.2.5 Global variables in expressions, predicates and programs
The set of program variables occurring as global (i.e. free) variables in any expression E (including predicates) and program statement S will be referred to as glob.E and glob.S, respectively.
For convenience and brevity, we shall allow the argument to glob to be any mixed n-tuple of expressions and statements. This should be read as shorthand for the union of all individual sets. For example glob.(S1,P,E,S2) is short for glob.S1 ∪ glob.P ∪ glob.E ∪ glob.S2.
3.2.6 Substitutions
Functions from predicates to predicates, known as predicate transformers, will shortly be presented
and (later) applied for defining program semantics. In our context of predicates on a state space,
such primitive transformers are known as substitution predicate transformers [13, Page 114]. (See
also [13, Chapter 2] for an introduction on substitution and replacement.)
A new predicate, say P′, can be generated from an existing predicate P, by replacing all global occurrences of program variables V with a matching list (in length and corresponding types) of expressions E. For the syntax of substitutions we deviate from DS (who would write (V := E).P) and adopt Morgan’s P[V\E] (for P with V replaced by E, [45]). Those square brackets are assigned the highest binding power; with postfix application being left associative, this will allow writing e.g. f.P[V1\E1].Q[V2\E2][V3\E3] for (f.(P[V1\E1])).((Q[V2\E2])[V3\E3]). (Also, this format will cleanly allow later definition of special kinds of substitution, by prefixing V with the new substitution’s name.)
By definition, substitution distributes over all logical connectives. When distributing a substi-
tution over a quantifier, potential name clashing (i.e. if local variables whose scope is bound by
the quantifier have the same name as global variables in E) is avoided by renaming the local ones.
Just as we did with the function glob above, for convenience, and since predicates are merely
boolean expressions in program variables, we apply substitutions to any expression, program
statement, or a mixed n-tuple of those. For statements, we shall avoid potential problems (e.g.
what does it mean to replace the target of an assignment with an expression?) by restricting
ourselves to a so-called simple substitution [45, Page 105], substituting variables by variables.
Moreover, to avoid introducing aliases, the new variables will have to be distinct and fresh. More
precisely, for freshness in S[X\Y] we expect Y ⊥ (glob.S \ X).
Some properties of substitution are worth noting. In the following, let A stand for any expression (including predicates) or statement.
Let X be any list of variables; then redundant self-substitutions can always be introduced or removed; we thus postulate
A[X\X] = A .    (3.2)
Another simplification allows the merge of following substitutions, if they form a needless chain; as in the postulate
A[X\Y][Y\E] = A[X\E]    (3.3)
provided Y ⊥ (glob.A \ X).
From the preceding two postulates, we can derive conditions for removing (or introducing) redundant reversed double substitutions. Thus
A[X\Y][Y\X] = A    (3.4)
provided Y ⊥ (glob.A \ X).
A different kind of merge of following substitutions is postulated for cases when the substituted variables are disjoint, and the first substitution does not affect the second. Thus
A[X1\E1][X2\E2] = A[X1,X2\E1,E2]    (3.5)
provided X1 ⊥ X2 and X2 ⊥ glob.E1.
Finally, we can simply derive from the preceding postulate the conditions for swapping independent substitutions. Thus
A[X1\E1][X2\E2] = A[X2\E2][X1\E1]    (3.6)
provided X1 ⊥ X2, X1 ⊥ glob.E2 and X2 ⊥ glob.E1.
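Postulate (3.4) and the role of its proviso can be exercised mechanically, e.g. with sympy — a sketch of ours, in which sympy’s subs plays the role of our substitution:

from sympy import symbols

x, y, e = symbols('x y e')
A = x + 2*y                  # glob.A = {x, y}

# (3.4): A[x\e][e\x] = A holds, since e is fresh (e disjoint from glob.A \ {x}):
assert A.subs({x: e}).subs({e: x}) == A

# violating the proviso with Y = y, which occurs in A, breaks the round trip:
assert A.subs({x: y}).subs({y: x}) == 3*x   # not A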
3.2.7 Proof format
According to the DS proof format [13, Chapter 4], designed for avoiding needless repetition in long derivations, if [A = C] can be proved by [A = B] and [B = C] for some intermediate expression B, we write
A
=    {here comes a hint to why [A = B]}
B
=    {and here another hint, for [B = C]}
C ,
from which the desired [A = C] can be inferred. Note that A, B and C are not necessarily boolean expressions, and hence the = rather than the more specific ≡. In any case, following DS, even for boolean expressions the = will be preferred (in derivations). The ≡, instead, will mostly be used in single-line expressions, thus exploiting its low binding power and emphasising that the arguments are boolean.
Since [A ⇒ B] ∧ [B ⇒ C] ⇒ [A ⇒ C], when some steps in a derivation are of implication (⇒), the conclusion is an implication too. Similarly for the follows-from (⇐) connective, as long as the two (⇒ and ⇐) are not mixed (in a single derivation).
3.2.8 From the calculus
The following set of theorems and equations are borrowed from the calculus of boolean structures,
as defined (and proved) by Dijkstra and Scholten in [13]. Instead of an exhaustive collection, we
state here only non-trivial results that will be of use in the course of this thesis.
The first theorem is proved in (DS: 5,96) of [13], i.e. Equation 96 of Chapter 5.
Theorem 3.1. For any set W, predicate P, and any function f from the (type of the) elements of W to predicates,
W ≠ ∅  ⇒  [(∀x : x ∈ W : P ∧ f.x) ≡ P ∧ (∀x : x ∈ W : f.x)] ,    (3.7)
i.e., provided the range is non-empty, conjunction distributes over universal quantification.
The following is taken from (DS: 5,69) in [13].
Theorem 3.2 (Contra-positive). For any P, Q
[P ⇒ Q  ≡  ¬Q ⇒ ¬P] .    (3.8)
The (punctual) monotonicity of quantifiers is borrowed from (DS: 5,102) and [13, Page 79].
Theorem 3.3. For any r, f, g
[(∀x : r.x : f.x ⇒ g.x) ⇒ ((∀x : r.x : f.x) ⇒ (∀x : r.x : g.x))]    (3.9)
[(∀x : r.x : f.x ⇒ g.x) ⇒ ((∃x : r.x : f.x) ⇒ (∃x : r.x : g.x))] .    (3.10)
Another property of existential quantification is taken from [13, Page 79].
Theorem 3.4. For any P, r, f
[P ∧ (∃x : r.x : f.x) ≡ (∃x : r.x : P ∧ f.x)] .    (3.11)
Finally, the Laws of Absorption are proved in (DS: 5,23) and (DS: 5,24) of [13].
Theorem 3.5. Conjunction and disjunction satisfy the Laws of Absorption, i.e., for any P, Q
[P ∧ (P ∨ Q) ≡ P]    (3.12)
[P ∨ (P ∧ Q) ≡ P] .    (3.13)
3.3 Program semantics
3.3.1 Predicate transformers
In Dijkstra and Scholten’s approach to program semantics, a program S stands for the set of all computations possible under its control. With respect to a predicate P, defining a dichotomy on the state space on which S operates, each computation C may be of one of the following three classes: (1) “eternal” (i.e. fails to terminate); (2) “finally P” (i.e. terminates in a final state satisfying P); or (3) “finally ¬P” (i.e. terminates in a final state satisfying ¬P).
Each of the following three predicates defines a dichotomy on the initial state space of S (with the second and third corresponding to final states satisfying P): (1) wp.S.true (i.e. each computation under control of S is either “finally P” or “finally ¬P”); (2) wlp.S.P (i.e. either “eternal” or “finally P”); and (3) wlp.S.(¬P) (either “eternal” or “finally ¬P”).
Here wlp.S emerges as a function from predicates to predicates, i.e. a predicate transformer, and wlp stands for weakest liberal precondition. Only universally conjunctive predicate transformers are admitted as wlp.S. (See the following section for a formal definition of different types of junctivity.)
Similarly to wlp, the weakest precondition predicate transformer wp.S is defined as
[wp.S.P ≡ wp.S.true ∧ wlp.S.P] for all P.
Being functions (from predicates to predicates), predicate transformers enjoy Leibniz’s Rule, i.e. for any predicate transformer f, we have [P ≡ Q] ⇒ [f.P ≡ f.Q].
A predicate transformer f is said to be monotonic (with respect to implication) if and only if (∀P,Q :: [P ⇒ Q] ⇒ [f.P ⇒ f.Q]). Indeed, all predicate transformers used for program semantics will be monotonic.
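By way of illustration, here is a Python sketch of wp for assignment, sequential composition and IF, using sympy for the predicates; the three equations in the comments are the textbook ones, and the thesis’s official definitions follow in Section 4.2 and Appendix A:

from sympy import symbols, And, Implies, Not

#   wp.(v := e).P                =  P[v\e]
#   wp.(S1 ; S2).P               =  wp.S1.(wp.S2.P)
#   wp.(if b then S1 else S2).P  =  (b => wp.S1.P) and (not b => wp.S2.P)
def wp(S, P):
    tag = S[0]
    if tag == 'assign':
        _, v, e = S
        return P.subs({v: e})
    if tag == 'seq':
        _, S1, S2 = S
        return wp(S1, wp(S2, P))
    if tag == 'if':
        _, b, S1, S2 = S
        return And(Implies(b, wp(S1, P)), Implies(Not(b), wp(S2, P)))
    raise ValueError(tag)

x, y, m = symbols('x y m')
prog = ('if', x > y, ('assign', m, x), ('assign', m, y))
pre = wp(prog, And(m >= x, m >= y))
assert pre.subs({x: 5, y: 3}) == True   # the precondition holds at (5, 3)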
3.3.2 Different types of junctivity
A predicate transformer f is universally conjunctive if and only if
(∀V : V is a bag of predicates : [f.(∀P : P ∈ V : P) ≡ (∀P : P ∈ V : f.P)]) ; similarly, it is
universally disjunctive if and only if
(∀V : V is a bag of predicates : [f.(∃P : P ∈ V : P) ≡ (∃P : P ∈ V : f.P)]).
Other (weaker) types of junctivity include positive junctivity and finite junctivity. The former
differs from universal junctivity in that its junctivity should apply not to any bag (of predicates)
V, but rather to non-empty ones, whereas the latter’s junctivity is expected to apply to any
non-empty bag with a “finite number of distinct predicates” [13, Page 87].
From the definitions, it follows that if a predicate transformer f is universally conjunctive it
is also positively so, and if positively conjunctive it is finitely so. It can also be shown that if f
is finitely conjunctive, it is monotonic (as defined above). Finally, note that a similar weakening
order holds for the corresponding disjunctivity types.
In the next chapter (see Section 4.1.5) we shall define the semantics of a programming lan-
guage exclusively from universally disjunctive and positively conjunctive predicate transformers. It
should be noted that all substitutions, defined earlier (in Section 3.2.6) as predicate transformers,
are universally junctive (see [13, Page 117]).
As an example for the use of finite junctivity, consider the following theorem, which deals with
absorption of termination.
Theorem 3.6 (Absorption of Termination). For any statement S and predicate P, we have
[wp.S.P ∧ wp.S.true ≡ wp.S.P]    (3.14)
provided wp.S is finitely conjunctive. We also have
[wp.S.P ∨ wp.S.true ≡ wp.S.true]    (3.15)
provided wp.S is finitely disjunctive.
Proof. For the former, we observe
wp.S.P ∧ wp.S.true
=    {wp.S is finitely conjunctive (proviso)}
wp.S.(P ∧ true)
=    {identity element of ‘∧’}
wp.S.P ,
and then for the latter, we similarly observe
wp.S.P ∨ wp.S.true
=    {wp.S is finitely disjunctive (proviso)}
wp.S.(P ∨ true)
=    {zero element of ‘∨’}
wp.S.true .
Note that each part of the proof, being two steps long, will only save one step whenever applied. However, at least the former case will be extensively used, and will thus be worth its while.
Since our focus will be on deterministic programs, we shall now turn to define formally what
is meant by a program being deterministic.
3.3.3 A definition of deterministic program statements
Interpreting the predicate wlp.S.(¬P) as holding in all initial states for which no computation under control of S is “finally P”, leads to another interesting predicate, ¬wlp.S.(¬P), holding in initial states for which there exists such a computation. So wp.S.P holds where termination in P is unavoidable and ¬wlp.S.(¬P) holds where terminating in P is merely possible. Dijkstra and Scholten’s interpretation of a program being deterministic follows “what is possible is also unavoidable”, and thus the expectation [wp.S.P ⇐ ¬wlp.S.(¬P)] for all P.
Since it can be shown that [wp.S.P ⇒ ¬wlp.S.(¬P)] for all P, a program S is considered deterministic if and only if
[wp.S.P ≡ ¬wlp.S.(¬P)]    (3.16)
for all P.
The so-called conjugate of a predicate transformer f is a predicate transformer, f∗, for which [f∗.P ≡ ¬f.(¬P)] holds for any predicate P. Surely, due to the redundancy of double negation, “if one predicate transformer is the conjugate of another, they are each other’s conjugate” [13, Page 83] (and hence the term conjugate).
Thus, the definition of deterministic statements can be rewritten as (see also (DS: 7,7) in [13])
Definition 3.7. We have for any statement S
(S is deterministic) ≡ (wp.S and wlp.S are each other’s conjugate) .    (3.17)
The significance of this definition of conjugates comes from the fact that for any predicate transformer f and its conjugate f∗ and all types of junctivity, we have (DS: 6,13)
(the conjunctivity type of f) = (the disjunctivity type of f∗) .    (3.18)
3.4 Program refinement
In his PhD thesis [2], Back introduced in 1978 the concept of refinement as a binary relation between programs. Adapted to the DS notation, refinement can be formally defined as follows.
Definition 3.8 (Refinement). For any pair of program statements, S and T, statement S is said to be refined by T (or T is a refinement of S), writing S ⊑ T, when for any predicate P we have [wp.S.P ⇒ wp.T.P].
Since the semantics of the programming language is defined with monotonic predicate trans-
formers, any part of a given program can be replaced with a refinement of itself, independently of
the surrounding program.
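For deterministic programs the definition specialises nicely (a point developed in Section 4.1.4): T refines S exactly when T terminates, with the same final state, wherever S does. The following Python sketch checks this over a finite state space, modelling a deterministic program as a dict from initial to final states (undefined entries standing for eternal computations) — an encoding of ours, not the thesis’s:

def refines(S, T):
    # S ⊑ T: wherever S terminates, T terminates and agrees with S
    return all(s in T and T[s] == S[s] for s in S)

S = {0: 1}           # terminates only from state 0
T = {0: 1, 2: 3}     # agrees on 0, and also terminates from 2
assert refines(S, T) and not refines(T, S)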
In refinement calculi (e.g. [4, 45]), the programming language admits both specifications and
executable constructs, known as code. The goal is a process for construction of provably correct
code. According to the proposed process, this goal is achieved by specifying the requirements
formally, using a so-called specification statement. Then, in a stepwise manner, the specification
is refined into code. Each step is taken from a vocabulary of provably correct laws of refinement.
We shall introduce a similar set of laws, as relevant for our context, in the next chapter.
Refinements have been shown to be useful for behaviour-preserving transformations of existing
code (e.g. by Ward [56] and Cornélio [11]). Ward applies refinements for e.g. reengineering and
migration of code from one language to another [58]. Accordingly, his object language, a wide
spectrum language (called WSL), is fundamentally non-deterministic.
Cornélio has applied refinements directly in the context of refactoring [11] in his PhD thesis
from 2004. There, a large number of known refactorings have been formulated for a Java-like
object-oriented language, called ROOL, and applied in introducing known design patterns [23]
into a given program.
We follow, in this thesis, a similar path of using the refinement relation in developing behaviour-
preserving transformations. However, for simplicity, and since refactoring is concerned with trans-
forming code, we restrict ourselves to a simple imperative deterministic language. Furthermore,
instead of targeting a wide range of refactorings, we focus on the specific problem of slice extrac-
tion.
Slices have been formalised in the context of refinement by Ward [57, 59]. We return to those
definitions later in the thesis (in Chapter 7).
This concludes our presentation of background to the formal semantics of this thesis. In sum-
mary, we have adopted Dijkstra and Scholten’s program semantics of predicate transformers. Set
theory and the refinement relation have also been introduced, and will be used when manipulating
programs and developing behaviour-preserving transformations.
Chapter 4
A Theoretical Framework
The original part of this thesis, as introduced so far, begins in this chapter, in which we develop a
theoretical framework for proving correct transformations. The framework, building on traditional
refinement calculus, will aim to support transformations of programs written in a simple imperative
programming language. In contrast to earlier work on refinement, the language will be restricted
to deterministic constructs — thus avoiding the need to synchronize duplicated non-deterministic
choices, as will be explained. This decision is justified by the observation that refactoring is
concerned with transforming existing code rather than specifications.
The chapter begins with a preliminary section, in which some basic concepts are defined. Then,
a variation on Dijkstra’s language of guarded commands is introduced and formalised through
weakest-preconditions semantics. This will be the object language for our transformations.
The framework will be extended and specialised later, e.g. with a slicing-based proof method in
the next chapter and a slicing algorithm in Chapter 9, and will hence be applied in the formulation
and development of solutions to slice extraction and related transformations.
4.1 Preliminaries
As a programming notation, this thesis adopts a subset of Dijkstra’s guarded commands, along
with selected elements from Dijkstra and Scholten’s “Predicate Calculus and Program Semantics”
(DS) [13]. As will be described shortly, a subset of Dijkstra’s language is chosen as our core
language. This is then extended with some advanced constructs borrowed, with adaptations, from
e.g. Morgan’s “Programming from Specifications” [45].
For the sake of simplicity, we choose to make some restricting assumptions on our programming
language. These will allow a concise formulation of transformations. Admittedly, some of those
assumptions are unrealistic and others might simply not be desirable (e.g. due to performance
considerations). We return to discuss those choices in the concluding chapter, where we (briefly
and informally) evaluate the applicability of the approach to modern programming languages.
There, we shall propose to complement language extensions with extra applicability conditions.
Hence, in our language, all variables may be copied, leading to an independent clone in a new
storage location. This includes the ability to clone the input and output streams. We restrict our
attention to sequential programs, and expressions in our language (appearing in statements such
as an assignment, or the guard of an IF statement) have no side-effects. Moreover, features such as
aliasing, class hierarchy, overloading, exceptions or concurrency have been left out. As mentioned,
possible implications of including such features are evaluated in the concluding chapter.
4.1.1 On slips and slides: an alternative to substatements
A slice captures a subset of the original behaviour, thus it is said to be a subprogram. When
slicing is syntax preserving (as it normally is), one may also wish to say a slice (of statement S
with respect to variables V) is a substatement (of S). But is that so? And what is a substatement,
anyway?
For example, let S be the statement if x>y then m := x else m := y; now let S1 be m := x and S2 be if x>y then m := x. Is S1 a substatement of S? How about S2? Some may consider the former a substatement, since in terms of syntax trees, it may stand for a subtree. At the same time, others might claim the latter is a substatement, since in terms of nodes in a flow graph, it represents a subgraph.
We avoid such potential confusion, in this thesis, by refraining from speaking of substatements. Instead, S1, being a subtree, is said to be a slip of S, whereas S2 is a slide.
Any part of a statement which is in itself a statement is a slip (of that statement). Thus, if S is a primitive statement, S itself is its only slip. However, when S is a compound statement (i.e. is compounded of parts S.i, each of which is a statement in itself), then S itself is one of its slips, and all slips of each such S.i are, too, slips (or even proper slips) of S. Those slips of each S.i are sometimes referred to as proper slips, whereas the slips S.i themselves are also considered immediate slips (of S). In terms of the abstract syntax tree (AST), a slip corresponds to a subtree.
A slide is complementary to a slip and is defined for each pair of statement and slip. The slide of S on itself is the statement S with all immediate slips (if any) replaced with the empty statement skip. When S is a compound statement with immediate slips S.i, a slide of S on a slip T of S.j is the statement S with S.j replaced with the slide of S.j on T, and all other immediate slips S.i (with i ≠ j) replaced with the empty statement skip. In terms of the AST, a slide of S on T corresponds to the nodes on the path from (the node acting as root of) S to (the root of its subtree) T. But slides will not be further discussed until later, in Chapter 8, where they will be formalised.
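Though their formal treatment is deferred to Chapter 8, the two notions are easy to prototype. In the following Python sketch (our own AST representation: a node and its list of immediate slips), the slide of S on a slip T keeps the path from S to T and replaces every other immediate slip with skip:

class Node:
    def __init__(self, label, kids=()):
        self.label, self.kids = label, list(kids)
    def __repr__(self):
        args = ', '.join(map(repr, self.kids))
        return self.label if not self.kids else f'{self.label}({args})'

SKIP = Node('skip')

def slips(S):
    # S and, recursively, all slips of its immediate slips
    yield S
    for kid in S.kids:
        yield from slips(kid)

def slide(S, T):
    if S is T:    # the slide of S on itself: immediate slips become skip
        return Node(S.label, [SKIP] * len(S.kids))
    return Node(S.label, [slide(k, T) if T in set(slips(k)) else SKIP
                          for k in S.kids])

then_part = Node('m:=x')
S = Node('if x>y', [then_part, Node('m:=y')])
print(slide(S, then_part))   # if x>y(m:=x, skip)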
4.1.2 Why deterministic?
The main reason for focusing on deterministic programs is illustrated by the following inequivalence:

if true → x,y := 1,1
[] true → x,y := 2,2
fi

≠

if true → x := 1
[] true → x := 2
fi
; if true → y := 1
  [] true → y := 2
  fi
Whereas x = y is a true postcondition (for any initial state) of the first program, the second may terminate with e.g. x = 1 ∧ y = 2.
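The discrepancy can be seen by enumerating outcome sets directly — a sketch, modelling each nondeterministic program simply by its set of possible final (x, y) pairs:

from itertools import product

combined = {(1, 1), (2, 2)}               # one choice assigns both variables
split = set(product([1, 2], repeat=2))    # two independent choices
print(split - combined)                   # {(1, 2), (2, 1)}: x = y is violated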
In essence, when duplicating a non-deterministic choice, one must synchronize the two choices
in order to ensure behaviour preservation.
In this thesis we avoid such cases by restricting ourselves to deterministic programs. Future
extension of this work to include non-determinism should be possible and interesting.
Our decision can also be justified, as mentioned earlier, by the observation that non-determinism is typically useful in specification and during the design process, whereas refactoring is concerned with transforming actual code.
4.1.3 On deterministic program semantics
In general, wlp.S is more fundamental than wp.S, as there is no way of defining the former in terms of the latter. However, for deterministic programs, we observe that due to (3.16) and the redundancy of double negation (twice) wlp.S can be defined by
[wlp.S.Q ≡ ¬wp.S.(¬Q)] .    (4.1)
Thus in this thesis we leave weakest liberal preconditions alone and define the semantics of our programming language solely in terms of weakest preconditions.
But before dismissing wlp, we shall use it once more, in investigating the difference between
refinement and equivalence of deterministic program statements. (A final mention of wlp will
follow, in Section 4.1.5.)
4.1.4 On refinement, termination and program equivalence
Two deterministic program statements S, T are considered semantically equivalent if for all P we have [wp.S.P ≡ wp.T.P]. In this thesis this is denoted S = T. A weaker relation is that of
refinement. There, S is said to be refined by T (or T is a refinement of S, denoted S ⊑ T) if [wp.S.P ⇒ wp.T.P] for all P.
Essentially, what is the difference between the two relations? Clearly, it can be shown (through predicate calculus) that S = T if and only if S ⊑ T ∧ S ⊒ T. But what does it mean for T to be a refinement of S and S not a refinement of T (and thus S ≠ T)? This happens when T is “more terminating” than S. Operationally speaking, T may terminate on input for which S does not. But on input for which both terminate, the final state is guaranteed to be the same. In other words, if S is refined by T and both terminate under the exact same conditions, they are also equivalent. This is formulated in the following theorem.
Theorem 4.1. Let S, T be any two deterministic statements; then
(S = T)  ≡  (S ⊑ T ∧ [wp.S.true ≡ wp.T.true]) .
Proof.
S = T
=    {def. of program equivalence}
(∀P :: [wp.S.P ≡ wp.T.P])
=    {Lemma 4.2 (see below)}
(∀P :: [wp.S.P ⇒ wp.T.P] ∧ [wp.S.true ≡ wp.T.true])
=    {pred. calc. (3.7): the range is non-empty}
(∀P :: [wp.S.P ⇒ wp.T.P]) ∧ [wp.S.true ≡ wp.T.true]
=    {def. of refinement}
S ⊑ T ∧ [wp.S.true ≡ wp.T.true] .
Lemma 4.2. Let S, T be any two deterministic statements; then
(∀P :: [wp.S.P ≡ wp.T.P])  ≡  (∀P :: [wp.S.P ⇒ wp.T.P] ∧ [wp.S.true ≡ wp.T.true]) .
Proof. The LHS ⇒ RHS part is trivial, due to predicate calculus (≡ implies ⇒).
For LHS ⇐ RHS, note that since [wp.S.P ⇒ wp.T.P] is already given for any predicate P (RHS), we only need to show [wp.S.P ⇐ wp.T.P], for which we observe
wp.S.P
=    {def. of wp}
wlp.S.P ∧ wp.S.true
=    {(4.1) above: S is deterministic}
¬wp.S.(¬P) ∧ wp.S.true
⇐    {RHS and pred. calc. (Theorem of the Contra-positive, 3.8)}
¬wp.T.(¬P) ∧ wp.S.true
=    {(4.1) again: T is deterministic}
wlp.T.P ∧ wp.S.true
=    {RHS}
wlp.T.P ∧ wp.T.true
=    {def. of wp}
wp.T.P .
In contrast to refinement, program equivalence is amenable to deriving correct transformations
in both directions. However, on one of those, a refinement may yield more accurate results (as it
does in slicing by removing irrelevant loops even if those may not terminate — recall Section 2.2.2).
The above result is important as it allows us to confidently focus on developing refinement
rules, where appropriate, knowing that the extra step of turning them into equivalences is always
available.
4.1.5 Semantic language requirements
Dijkstra and Scholten insist on two basic requirements the semantics of each language construct must satisfy. Firstly, (R0:) any wlp.S is universally conjunctive; and secondly, (R1:) [wp.S.false ≡ false] for any S. Requirement R1 — known as “The Law of the Excluded Miracle” — is due to the observation that no state satisfies false and the predicate wp.S.false holds in states where no computation under control of S exists.
When defining semantics in terms of weakest preconditions alone, requirement R0 can be replaced with a new requirement (RE1:) that wp.S is universally disjunctive. We prove that RE1 implies (for deterministic S) both R0 and R1 in what follows.
Theorem 4.3. Let S be any deterministic statement with (RE1:) wp.S being universally disjunctive; we then have both (R0:) wlp.S is universally conjunctive, and (R1:) [wp.S.false ≡ false].
Proof. First, for R0, we observe (on the lines of the proof for (DS: 7,9) in [13])
the conjunctivity type of wlp.S
=    {properties of conjugate: see (3.18)}
the disjunctivity type of (wlp.S)∗
=    {(3.17); S is deterministic}
the disjunctivity type of wp.S
=    {RE1}
universal .
Then, we observe for R1
wp.S.false
=    {existential quantification over the empty range yields false}
wp.S.(∃P : P ∈ ∅ : P)
=    {wp.S is univ. disj. (RE1)}
(∃P : P ∈ ∅ : wp.S.P)
=    {again, existential quantification over the empty range yields false}
false .
Thus, in our context, RE1 faithfully takes the place of DS’s R0, R1. We note that a consequence of RE1 and Theorem 4.3 above (thus having R0, R1 available) is that wp.S is positively conjunctive (as proved in (DS: 7,8) of [13]). However, it is not universally so for (possibly) non-terminating S, since universal quantification over the empty range yields true.
4.1.6 Global variables in transformed predicates
According to our adopted formalism, programs manipulate predicates over the program's state
space, expressed syntactically as boolean structures. In our analysis and manipulation of such
programs, we shall be interested in the set of global variables actually mentioned in the transformed
predicates.
Let P be any predicate. We denote the set of global (i.e. free) variables in P as glob.P.
Let S be any given statement; variables in glob.P may be subject to direct substitution by the
transformer wp.S. What do we know of glob.(wp.S.P)?
First, we denote the variables that will definitely be substituted by wp.S as ddef.S. Those will be
the variables that are definitely (i.e. for any initial state) defined by S. (Examples of ddef, as well
as of the other properties to be introduced shortly, will be given in Section 4.2.) Second, we observe
that some of those variables, as well as others, may find their way into glob.(wp.S.P) even if not
in glob.P. Those are the variables whose initial value may affect the result of S and are hence
denoted input.S. We are now ready to postulate requirement RE2:
glob.(wp.S.P) ⊆ ((glob.P \ ddef.S) ∪ input.S)
for all S, P.
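As a small sanity check of RE2 (an example of our own): take S to be x := y + 1 and P to be x > z. Then wp.S.P ≡ (y + 1 > z), and indeed
glob.(wp.S.P) = {y, z} ⊆ ({x, z} \ {x}) ∪ {y} = (glob.P \ ddef.S) ∪ input.S ,
with ddef.S = {x} and input.S = {y}: the substituted x disappears, while the input y may newly appear.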
The transformation of wp.S on some predicates will be restricted to adding a conjunct ex-
pressing termination of S. That is, for such S, P we expect [wp.S.P ≡ P ∧ wp.S.true]. This is
so whenever all variables in glob.P are guaranteed not to be modified by S (in case S
terminates). We thus denote by def.S the set of variables that may be modified by S. That is, a
variable x must be in def.S if there exists a terminating computation under control of S for which
the final value of x differs from its initial value. When a variable is not in def.S, we know its initial
and final values will in any case be the same. We hence postulate the requirement RE3:
[wp.S.P ≡ P ∧ wp.S.true]
for all S, P with glob.P ∩ def.S = ∅.
As can be expected, all definitely defined variables ddef.S shall always take part in the set of
(possibly) defined variables, def.S. We thus postulate the requirement RE4:
ddef.S ⊆ def.S
for all S.
In addition to the sets def.S, ddef.S and input.S, we shall define for each language construct
its set of global (i.e. free) variables. In fact, we shall expect this set to consist of variables from
the three previously mentioned sets. Keeping in mind ddef.S ⊆ def.S for all S (RE4 above), we
now postulate the requirement RE5:
glob.S = def.S ∪ input.S
for all S. (Recall the overloading of glob, as was first mentioned in Section 3.2.5, being applicable
to predicates, as before, as well as to program statements, as in this case, or even to program
expressions of any type.)
Here is a summary of the required properties. For any statement S we require
RE1 wp.S is universally disjunctive
RE2 glob.(wp.S.P) ⊆ ((glob.P \ ddef.S) ∪ input.S) for all P
RE3 [wp.S.P ≡ P ∧ wp.S.true] for all P with glob.P ∩ def.S = ∅
RE4 ddef.S ⊆ def.S
RE5 glob.S = def.S ∪ input.S
4.2 The programming language
Here is an introduction to the chosen language constructs and their corresponding semantics. A
full definition with proof of all requirements can be found in Appendix A.
4.2.1 Expressions, variables and types
As our transformations deal exclusively with statements and names of variables, while preserving
all types (of variables and expressions), it is tempting and not uncommon to avoid any men-
tion of those. However, to prevent confusion, it is worth mentioning the types with which our
programming language (and hence the code examples in this thesis) shall be concerned.
The basic types (of variables and expressions) shall include integers (with typical arithmetic
and relational operators) and booleans (with similar syntax to predicates).
Further to that, we shall allow variables of type array or stream. Each array variable, say a,
will always be associated with an extra variable, a.length. A stream, dedicated either for reading
(i.e. input) or writing (i.e. output), shall be implemented as a special case of array, with an extra
(implicit) index variable associated with it. This variable shall be pointing to the next available
location.
4.2.2 Core language
Assignment
The first language construct is the assignment statement. It takes the form X := E, with X
standing for a finite list of variables and with E standing for a list of expressions (of the same length
as X). Type compatibility is assumed. This is a so-called simultaneous assignment (or multiple
assignment), in which all expressions are evaluated before being assigned to their corresponding
target variables. It should be noted that the target variables must be distinct. To the case where
the lists X and E are both empty, we sometimes refer as "skip".
[wp.(X := E).P ≡ P[X\E]] for all P;
def.(X := E) ≜ X;
ddef.(X := E) ≜ X;
input.(X := E) ≜ glob.E; and
glob.(X := E) ≜ X ∪ glob.E.
It is worth noting that, for simplicity, all expressions in our language are assumed to be well
formed, and all operators and functions are complete and hence well defined for all possible values.
The special case of assignment to an array element, say a[i] := E, shall be understood as
an assignment to the whole array, a := a[i ↦ E], meaning that the array a ends up being as
before in all elements other than the i-th, in which it gets the value of E.
An output stream, say out (with dedicated index variable, say out.i), can be appended to through
a statement out << E, which should be interpreted as out, out.i := out[out.i ↦ E], out.i + 1.
Similarly, reading from an input stream, say in (with index variable in.i), takes the form
in >> x, and should be interpreted as x, in.i := in[in.i], in.i + 1.
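To illustrate the array case (a small example of our own): for a[i] := 0 and postcondition a[j] = 1, the desugaring gives
wp.(a := a[i ↦ 0]).(a[j] = 1) ≡ ((a[i ↦ 0])[j] = 1) ≡ (i ≠ j ∧ a[j] = 1) ,
since (a[i ↦ 0])[j] denotes 0 when i = j and a[j] otherwise.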
Sequential composition
The first compound construct is sequential composition. It takes the form S1 ; S2 and starts
executing S2 only after normal completion of S1.
[wp.(S1 ; S2).P ≡ wp.S1.(wp.S2.P)] for all P;
def.(S1 ; S2) ≜ def.S1 ∪ def.S2;
ddef.(S1 ; S2) ≜ ddef.S1 ∪ ddef.S2;
input.(S1 ; S2) ≜ input.S1 ∪ (input.S2 \ ddef.S1); and
glob.(S1 ; S2) ≜ glob.(S1, S2).
Recall glob.(S1, S2) is short for glob.S1 ∪ glob.S2.
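For instance (our own small check of the input clause): with S1 := (x := y) and S2 := (z := x + w),
input.(S1 ; S2) = {y} ∪ ({x, w} \ {x}) = {y, w} ;
the use of x in S2 is not an input of the composition, as x is definitely defined by S1 first.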
Alternative construct
The alternative construct takes the form of if B then S1 else S2 (and is sometimes abbre-
viated to IF). Upon execution, if the guard B, a boolean expression, is evaluated to true, S1 is
executed; otherwise, S2 is executed.
[wp.IF.P ≡ (B ⇒ wp.S1.P) ∧ (¬B ⇒ wp.S2.P)] for all P;
def.IF ≜ def.S1 ∪ def.S2;
ddef.IF ≜ ddef.S1 ∩ ddef.S2;
input.IF ≜ glob.B ∪ input.S1 ∪ input.S2; and
glob.IF ≜ glob.B ∪ glob.S1 ∪ glob.S2.
Repetitive construct
The repetitive construct takes the form of while B do S od (and is sometimes abbreviated to
DO). Upon execution, if the guard B is evaluated to true, the guarded S is executed. Once S
terminates successfully, the process is repeated, until the guard is evaluated to false, in which case
the loop terminates successfully.
[wp.DO.P ≡ (∃i : 0 ≤ i : (k^i.false))] for all P,
with k given by (DS: 9,44) [13]: [k.Q ≡ (B ∨ P) ∧ (¬B ∨ wp.S.Q)] ;
def.DO ≜ def.S;
ddef.DO ≜ ∅;
input.DO ≜ glob.B ∪ input.S; and
glob.DO ≜ glob.B ∪ glob.S.
As for earlier language constructs (with the exception of substitutions), this formulation of
wp.DO follows the DS notation — for its deterministic subset. Note, however, that the semantics
of repetition could equally have been defined in terms of implication, by [k.Q ≡ (B ⇒ wp.S.Q)
∧ (¬B ⇒ P)]. This would render the similarity to the semantics of IF clearer: one could think of
the DO statement as a recursive construct, say DO', comprising an IF statement on the lines of
if B then S ; DO' else skip. Nevertheless, this thesis adopts the DS formulation of loops.
This completes our core language, a subset of Dijkstra and Scholten's guarded commands [13].
The following constructs are extensions borrowed from Morgan [45], with some adaptations as our
context requires. Later, in order to emphasise that a certain statement S is restricted to constructs
of the core language, we shall call it a core statement.
4.2.3 Extended language
Assertions
An assertion statement (called an "assumption" by Morgan [45]) is a boolean expression on the
program state. If true, execution goes on normally; otherwise the program aborts. (In guarded
commands, "the operational interpretation of abort is that for all initial states its execution fails
to terminate". [13, Page 135])
[wp.{B}.P ≡ B ∧ P] for all P;
def.{B} ≜ ∅;
ddef.{B} ≜ ∅;
input.{B} ≜ glob.B; and
glob.{B} ≜ glob.B.
Assertions, locally expressing the surrounding context, will serve as a vehicle for performing
correct local transformations and refinements. The assertions will typically be added to a program,
temporarily, by propagating knowledge through a given (compound) statement. They will thus
express intermediate as well as final results of a provably correct program analysis, prior to some
transformation.
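As a small foretaste (an example of our own; the laws involved are formulated in Section 4.3 below), an assertion can record an assignment's effect, justify a substitution, and then be dropped again:
x := 1 ; y := x  =  x := 1 ; {x = 1} ; y := x  =  x := 1 ; {x = 1} ; y := 1  =  x := 1 ; y := 1 .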
Local variables
Normally, in (block-based, imperative) programming languages, local variables serve for storing
temporary results of computations, before using those in further computations. Since such variables are local to a
certain statement, assignments to them have no effect on the surrounding context (even
if a variable with the same name exists there as well). According to Morgan's definition, a local
variable is initialised (on entry to its scope) to any possible value. However, since such non-
determinism is not permitted in our context, we could either insist local variables must not be
used before being defined — as is commonly enforced in modern languages — or agree on some
initial value. As will be explained shortly, we prefer the latter.
A special case of local variables is that of parameters. Typically, a method (or procedure)
may declare e.g. 'value' parameters; such local variables will be initialised, whenever the method
is called, with a copy of the actual value sent. Again, any local modifications will be hidden from
the caller. In [45], Morgan allows value parameters to be sent to any given statement, say S,
through so-called value substitutions; e.g. S[value f\E] would send the value of E to locals
f in S.
It turns out that we do not require, in our context, the full power of such value substitutions.
Instead, we shall do with what can be termed a self value substitution (i.e. S[value f\f], to use
Morgan's syntax). The effect of such a self-substitution is that, on the one hand, any modification
to f in S shall be local (i.e. hidden from the surrounding context), while, on the other hand, f
will be initialised to its actual value in that global context.
Since only self value substitutions will be needed in this work, and since the value-substitution
notation, for those, presents a redundancy (i.e. repeating f in the above), we opt to avoid the
introduction of value substitutions. Instead, we shall get the same effect (of self value substitution)
by assuming local variables are initialised to their corresponding global value. In reality, such
initialisation should only take place if a local may be used before being defined.
Local variables may be introduced anywhere a statement is expected. Their introduction takes
the form of |[var L ; S]| where S is any statement and L stands for a list of variable names. If
a local variable is used before being defined in S, its entry value is used. Since definitions of L in
S should be local, its entry value is kept in a fresh backup variable on entry and retrieved on exit;
accordingly, the semantic definition of |[var L ; S]| follows that of L' := L ; S ; L := L'
where L' is fresh.
[wp.|[var L ; S]|.P ≡ (wp.S.P[L\L'])[L'\L]] for all P with glob.P ∩ L' = ∅; or the simpler
[wp.|[var L ; S]|.Q ≡ wp.S.Q] for all Q with glob.Q ∩ (L, L') = ∅;
def.|[var L ; S]| ≜ def.S \ L;
ddef.|[var L ; S]| ≜ ddef.S \ L;
input.|[var L ; S]| ≜ input.S; and
glob.|[var L ; S]| ≜ (def.S \ L) ∪ input.S; or, equivalently,
glob.|[var L ; S]| ≜ glob.S \ (L \ input.S).
Note that generality is not lost by restricting P as we do. Whenever elements of L' appear in the
postcondition, those can be locally renamed.
A common case is one in which the declared variables are immediately defined, e.g. |[var L ;
L := E ; S]|. In such cases, the shorthand |[var L := E ; S]| is allowed.
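For illustration (our own example): |[var t := x ; x := y ; y := t]| swaps x and y through the local t. Its backup-and-restore reading, t' := t ; t := x ; x := y ; y := t ; t := t' (with t' fresh), makes both t's initialisation and the invisibility of its local modification explicit; the block thus behaves exactly like the simultaneous assignment x, y := y, x.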
Live variables
In program analysis [47], a variable x is considered live, at a given program point, if there exists
a path (from the given point) to a use of x which is free of re-definitions of x. If a variable is not
live at a given point (i.e. it is dead), its value at that point is of no interest to the program, and
can be modified to anything.
In our refactorings we shall take advantage of this notion of liveness, for example in removing
dead assignments (i.e. assignments to dead variables). However, since there is no simple deter-
ministic way of saying "that variable can hold any value, at this point", we choose to add the
concept of liveness to the programming language, or rather to the meta-language.
We do so by defining a dual of local variables. Instead of stating which variables are local to
a statement S, we explicitly state the variables that are not. This way, it can be assumed that
those variables will be live on exit. In contrast, all other variables, being local, will be guaranteed
to hold, on exit from S, their corresponding initial value. Thus, all local definitions of those will
be of no relevance to the surrounding context. Consequently, modifying those to any value, just
before exiting S, will have no effect.
We define S[live V] ≜ |[var L ; S]| where L := def.S \ V. Thus, the semantics and
properties can be derived from those of local variables, as is summarised in the following. For a
given statement S, set of variables V, a corresponding set L := def.S \ V and fresh L', we have:
[wp.S[live V].P ≡ (wp.S.P[L\L'])[L'\L]] for all P with glob.P ∩ L' = ∅; or the simpler
[wp.S[live V].Q ≡ wp.S.Q] for all Q with glob.Q ∩ (L, L') = ∅;
def.S[live V] ≜ def.S ∩ V;
ddef.S[live V] ≜ ddef.S ∩ V;
input.S[live V] ≜ input.S; and
glob.S[live V] ≜ (def.S ∩ V) ∪ input.S.
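For example (our own): in (x := 1 ; y := 2)[live x], we have def.S = {x, y} and V = {x}, so L = {y} and the annotation is sugar for |[var y ; x := 1 ; y := 2]|: the assignment to y is dead, and the dead-assignment laws of the next section allow its removal, leaving x := 1.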
4.3 Laws of program analysis and manipulation
The weakest-preconditions semantics, as defined above for all language constructs, along with
known theorems from the predicate calculus, can and will be applied in proving a collection of
laws for correct program analysis and manipulation.
Such laws, in turn, will be useful for proving and deriving rules of program equivalence, re-
finement and transformation. (The latter is distinguished from the former two in that instead of
relating two programs, it shall describe how to produce a new program from a given one.)
The collection is by no means exhaustive, though; only laws that are directly useful in the thesis
are formulated. A summary of those laws can be found in Appendix F, and all proofs are given
in Appendix B.
4.3.1 Manipulating core statements
A first set of laws supports easy manipulation of core statements. Very similar laws have been
defined elsewhere (e.g. [45, 3, 56, 30]).
For example, the following is a definition of a law (see Law 3 in the appendices) to support
the distribution of a statement into (or out of, when applied from right to left) both branches of a
following IF statement, provided the former does not define (i.e. modify the value of) any variable
that is tested in the IF's guard:
Let S, S1, S2, B be three statements and a boolean expression, respectively; then
S ; if B then S1 else S2  =  if B then S ; S1 else S ; S2
provided def.S ∩ glob.B = ∅.
Another code-motion related law (Law 5 in the appendices) supports moving a (certain kind
of loop-invariant) assignment statement forward, outside a DO loop's body (or into its end, when
applied from right to left):
Let S1, X, B, E be any statement, set of variables, boolean expression and set of expressions,
respectively; then
{X = E} ; while B do S1 ; (X := E) od  =  {X = E} ; while B do S1 od ; (X := E)
provided X ∩ (glob.B ∪ input.S1 ∪ glob.E) = ∅.
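A concrete instance (our own): with X, E := x, 0, B := (i < n) and S1 := (i := i + 1),
{x = 0} ; while i < n do i := i + 1 ; x := 0 od  =  {x = 0} ; while i < n do i := i + 1 od ; x := 0 ,
the proviso holding since x ∉ {i, n} = glob.B ∪ input.S1 ∪ glob.E.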
4.3.2 Assertion-based program analysis
When flow-sensitive properties of specific program points are desired, we shall introduce and then
propagate assertions throughout the program, thus expressing both intermediate and final results
of a program analysis. Again, similar sets of laws have been defined and employed elsewhere, e.g.
by Back [2], Morgan [45] and Ward [56].
For example, the following (Law 7) supports both introduction and elimination of assertions
following an assignment:
Let X, Y, E1, E2 be two sets of variables and two sets of expressions, respectively; then
X, Y := E1, E2  =  X, Y := E1, E2 ; {Y = E2}
provided (X, Y) ∩ glob.E2 = ∅.
The following law (Law 12) will be used for propagating assertions forward into branches of
an IF statement, as well as backward ahead of the IF:
Let S1, S2, B1, B2 be two statements and two boolean expressions, respectively; then
{B1} ; if B2 then S1 else S2  =  if B2 then {B1} ; S1 else {B1} ; S2 .
Note that this law is a direct corollary of the more general Law 3 (from above) and the fact
that the def set of assertions is empty.
After introducing and propagating assertions, and before eliminating them, they will typically
be used in making substitutions. If two variables are known to hold the same value ahead of
a statement, the immediate use of one can be replaced with the other, in that statement. By
immediate use, we refer to the used expressions in assignments and the guard of an IF statement.
In the guard of a DO loop, however, we can make such a substitution only if the required assertion
is available both before the loop and at the end of its body. We refer to such substitutions, in
hints, as assertion-based substitution.
Since such substitutions will often be preceded by an introduction of the assertion, following
an assignment statement, we introduce a combined law (Law 18) to which we refer in hints as
assignment-based substitution:
Let S1, S2, B be two statements and a boolean expression, respectively; let X, X', Y, Z,
E1, E1', E2, E3 be four lists of variables and corresponding lists of expressions; then
X, Y := E1, E2 ; Z := E3  =  X, Y := E1, E2 ; Z := E3[Y\E2] ;
X, Y := E1, E2 ; IF  =  X, Y := E1, E2 ; IF' ; and
X, Y := E1, E2 ; DO  =  X, Y := E1, E2 ; DO'
provided ((X, X'), Y) ∩ glob.E2 = ∅
where IF := if B then S1 else S2,
IF' := if B[Y\E2] then S1 else S2,
DO := while B do S1 ; X', Y := E1', E2 od
and DO' := while B[Y\E2] do S1 ; X', Y := E1', E2 od .
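A minimal instance of the first form (our own): x, y := 1, 2 ; z := y + 1  =  x, y := 1, 2 ; z := 3, since E3[Y\E2] = (y + 1)[y\2] = 3 and glob.E2 = ∅ makes the proviso hold trivially.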
4.3.3 Manipulating liveness information
Let S, V be any statement and set of variables, respectively, and recall our definition of liveness
information in our extended language. The fact that, out of the variables defined in S, only those in
V are live on exit from S is expressed as S[live V]. This is syntactic sugar for |[var coV ; S]|
where coV := def.S \ V.
Laws for manipulating liveness information (see Sections B.3 and F.3 for proofs of all laws and
a summary, respectively) include introduction and removal of auxiliary information, distribution and
propagation of liveness information, and finally introduction and elimination of dead assignments.
(By auxiliary information we refer to information that is locally redundant but may have some
importance in the global context.)
Whenever only (a subset of the) mentioned live variables are actually defined (in a statement
S), the liveness information is redundant, and can be dropped. Conversely, any superset of the
defined variables (in any S) can safely augment S as liveness information. This is expressed by
the following law (Law 19) for introducing and removing auxiliary liveness information:
Let S, V be any statement and set of variables, respectively, with def.S ⊆ V; then
S  =  S[live V] .
In propagating liveness information over sequential composition, it is interesting to see that,
on the one hand, some live-on-exit variables may be intermediately dead, whereas on the other
hand, some dead-on-exit variables may become intermediately live. This is demonstrated by the
set V2 in Law 20:
Let S1, S2, V1, V2 be any two statements and two sets of variables, respectively; then
(S1 ; S2)[live V1]  =  (S1[live V2] ; S2[live V1])[live V1]
provided V2 = (V1 \ ddef.S2) ∪ input.S2.
Here, variables in ddef.S2 are said to be “killed” by S2 whereas variables in input.S2 are
“generated”. Such propagation of information, by removing KILL sets and adding GEN sets, is
common practice in intraprocedural data flow analysis (see e.g. [47, Section 2.1]).
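For instance (our own): with S1 := (x := y), S2 := (z := x) and V1 := {z}, Law 20 gives
(x := y ; z := x)[live z]  =  ((x := y)[live x] ; (z := x)[live z])[live z]
where V2 = ({z} \ ddef.(z := x)) ∪ input.(z := x) = {x}: the dead-on-exit x is intermediately live, whereas the live-on-exit z is intermediately dead.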
Note that we propagate information directly on the abstract syntax (i.e. its tree-like represen-
tation) rather than on a flow graph. This is made possible due to the simplicity and structured
nature of our language. In that respect, it should also be noted that in all analyses, assuming
the availability of the def, ddef and input sets for all program elements, our algorithms will involve a
single pass of the program's tree. However, in presentation, we shall not be concerned with time
or space complexities.
Another law for propagating liveness information is Law 22:
Let B, S, V1, V2 be any boolean expression, statement and two sets of variables, respectively;
then
(while B do S od)[live V1]  =  (while B do S[live V2] od)[live V1]
provided V2 = V1 ∪ (glob.B ∪ input.S).
Liveness information, explicitly added to (and propagated through) a program, can help in
identifying dead assignments. Those can subsequently be removed. The following is one such law
for dead-assignment elimination (Law 24):
Let S, V, Y, E be any statement, two sets of variables and a set of expressions, respectively; then
S[live V]  =  (S ; Y := E)[live V]
provided Y ∩ V = ∅.
Note that the law can (and indeed will) also be used to introduce dead assignments.
4.4 Summary
This completes the initial introduction to our framework for refactoring. A subset of Dijkstra's
language of guarded commands, along with extensions for representing program-analysis informa-
tion, has been defined with predicate-transformer semantics and related sets of variables. Those
have been used in presenting laws for program analysis and manipulation.
The next chapter will extend our framework by applying some of its elements in devising a
novel method for proving the correctness of slicing-based refactoring transformations.
Chapter 5
Proof Method for Correct
Slicing-Based Refactoring
Our framework for slicing-based refactoring is enhanced in this chapter, with the development of
a proof method for the refinement of deterministic statements. This method will be specifically
tailored for slicing-based refactoring.
5.1 Introducing slice-refinements and co-slice-refinements
A law of refinement typically associates two meta-programs, S and T, with some applicability
conditions. Any two programs satisfying those conditions are then guaranteed to be related
through refinement; S is then said to be refined by T, such that the latter preserves the total
correctness of the former. Operationally speaking, whenever both versions are started in the same
state, one in which S is known to be terminating, we expect T to produce the exact same result as
S (i.e. to terminate in the same state). With input for which S does not terminate, T is allowed
to do anything.
Taken formally, in terms of predicate-transformer semantics, we expect that for any given
predicate P, the weakest precondition of T applied to P will follow from that of S on (the same)
P, everywhere. Thus, when aiming to prove the correctness of such a refinement law, we are
required to show [wp.S.P ⇒ wp.T.P] for all P.
Alternatively, when restricting ourselves to deterministic statements, one can prove the correct-
ness of a refinement law in a slice-wise manner. That is, instead of considering general predicates
over the full state space, we consider predicates over a slice (corresponding to a subset of the
program variables, or accordingly to the subspace spanned by their potential values) separately
from predicates over its complement.
We first define a relation of slice-refinement and a complementary relation of co-slice-refinement
as follows.
(S ⊑V T) ≜ (∀P : glob.P ⊆ V : [wp.S.P ⇒ wp.T.P]), in which case T is said to be a slice-
refinement of S with respect to V; and
(S ⊑(V) T) ≜ (∀P : glob.P ∩ V = ∅ : [wp.S.P ⇒ wp.T.P]), in which case T is said to be a
co-slice-refinement of S with respect to V.
The subscript V in S ⊑V T means (as the definition shows) that the refinement relation holds for
all predicates with global variables in V. Accordingly, the (V) in S ⊑(V) T guarantees the
refinement holds for all predicates with no global mention of V.
5.2 Variable-wise proofs
With the above definitions, we now investigate how each slice-refinement (as well as co-slice-
refinements, later) can be proved in a point-wise fashion. Instead of considering any possible
postcondition (on the sliced variables V), only very particular postconditions of the form
"x = val", for any variable x ∈ V and possible value val, are considered.
In the following, we assume any variable, x, has a type, T.x, associated with it (even if that
type is not explicitly declared in the program). All variable types are assumed to be non-empty,
possibly infinite, sets of distinct values.
5.2.1 Proving slice-refinements
Theorem 5.1. For any pair of deterministic statements S and SV and any set of variables V,
we have
(S ⊑V SV) ≡ ([wp.S.true ⇒ wp.SV.true]
∧ (∀x, val : x ∈ V ∧ val ∈ T.x : [wp.S.(x = val) ⇒ wp.SV.(x = val)])) .
Here, the (otherwise arbitrary) name SV was chosen as a hint that this statement has some-
thing to do with S and V.
Before proving the theorem, we turn to some motivation. At first glance, the theorem may
seem obvious, perhaps due to our mention of point-wise proof and the universal disjunctivity of
wp.S and wp.T. However, a closer look reveals that this alternative view of slice-refinement is
more variable-wise than it is point-wise (even though the proof will indeed involve points in the
state space).
Furthermore, it turns out that this formulation is not (always) suitable in the presence of non-
determinism. Recall the example from Section 4.1.2, where our focus on deterministic programs
was justified. Despite the fact that all postconditions of the form "x = val" or "y = val" yielded
false as the weakest precondition of both programs, the duplicated version was not a slice-refinement
of the other one with respect to V = {x, y}. This was revealed by the postcondition x = y and
the fact that [true ⇒ false] does not hold.
We are now ready for a proof of correctness, keeping in mind that S, SV are both deterministic
programs and hence wp.S, wp.SV are both universally disjunctive and positively conjunctive.
Proof. (LHS ⇒ RHS): Trivial; glob.true = ∅ and glob.(x = val) ⊆ V for all x ∈ V of type T.x
and value val ∈ T.x.
(LHS ⇐ RHS): We need to prove that for any predicate P with glob.P ⊆ V we have
[wp.S.P ⇒ wp.SV.P].
The only two predicates on the empty space (when V = ∅) are the boolean scalars false and
true. Now [wp.S.false ⇒ wp.SV.false] is given by the Law of the Excluded Miracle, and [wp.S.true
⇒ wp.SV.true] is given by the (RHS) proviso.
Thus in the remainder of this proof we shall assume V is not empty. We now note that the
predicate P expresses a dichotomy on the state subspace spanned by variables V. This dichotomy
can also be represented by the (possibly infinite) set of points at which the predicate is evaluated to
true. Each point can then be represented (by its coordinates) as a conjunction of simple formulae
of the form x = val, one formula for each variable (axis) in V.
Let n := |V| and let p enumerate all n-dimensional points in the state space (p.i is the value of
the i-th dimension, i.e. of variable V.i); and P.p is true if P (with a substitution of each variable,
V.i, with its corresponding value p.i) is evaluated to true at p; then P can be rewritten as:
[P ≡ (∃p : P.p = true : (∀i : 0 ≤ i < n : V.i = p.i))] .   (5.1)
We now need to prove that [wp.S.P ⇒ wp.SV.P] (for all P with glob.P ⊆ V) under the assump-
tions [wp.S.true ⇒ wp.SV.true] and, for all x ∈ V and value val ∈ T.x:
[wp.S.(x = val) ⇒ wp.SV.(x = val)] .   (5.2)
We recall V is non-empty (i.e. 0 < n) and observe (for all P with glob.P ⊆ V)
wp.S.P
≡ {(5.1) above and Leibniz}
wp.S.(∃p : P.p = true : (∀i : 0 ≤ i < n : V.i = p.i))
≡ {RE1: wp.S is universally disjunctive}
(∃p : P.p = true : wp.S.(∀i : 0 ≤ i < n : V.i = p.i))
≡ {0 < n and wp.S is positively conjunctive (S deterministic)}
(∃p : P.p = true : (∀i : 0 ≤ i < n : wp.S.(V.i = p.i)))
⇒ {assumption (5.2) above}
(∃p : P.p = true : (∀i : 0 ≤ i < n : wp.SV.(V.i = p.i)))
≡ {0 < n and wp.SV is positively conjunctive (SV deterministic)}
(∃p : P.p = true : wp.SV.(∀i : 0 ≤ i < n : V.i = p.i))
≡ {RE1: wp.SV is universally disjunctive}
wp.SV.(∃p : P.p = true : (∀i : 0 ≤ i < n : V.i = p.i))
≡ {(5.1) above and Leibniz}
wp.SV.P .
Note the correctness of the (⇒) step above, due to the monotonicity of both ∃ (3.9) and
∀ (3.10).
5.2.2 A co-slice-refinement is a slice-refinement of the complement
Co-slice-refinements, like slice-refinements, can be proved in a variable-wise manner.
Corollary 5.2. For any pair of deterministic statements S and ScoV and any set of variables V,
we have
(S ⊑(V) ScoV) ≡ ([wp.S.true ⇒ wp.ScoV.true]
∧ (∀x, val : x ∈ coV ∧ val ∈ T.x : [wp.S.(x = val) ⇒ wp.ScoV.(x = val)]))
where coV := ((def.S ∪ def.ScoV) \ V).
Proof. Recalling S and ScoV are deterministic, we observe
(S ⊑(V) ScoV)
≡ {Theorem 5.3, see below}
(S ⊑coV ScoV)
≡ {Theorem 5.1 with SV, V := ScoV, coV}
([wp.S.true ⇒ wp.ScoV.true]
∧ (∀x, val : x ∈ coV ∧ val ∈ T.x : [wp.S.(x = val) ⇒ wp.ScoV.(x = val)])) .
Theorem 5.3. A co-slice-refinement is a slice-refinement of the complementary set of defined
variables. That is, for any pair of deterministic statements S and ScoV and any set of variables
V, we have
(S ⊑(V) ScoV) ≡ (S ⊑coV ScoV)
where coV := ((def.S ∪ def.ScoV) \ V).
Proof. (LHS ⇒ RHS): Due to V ∩ coV = ∅, all predicates P with glob.P ⊆ coV (as required on the
RHS) have glob.P ∩ V = ∅. Thus the LHS yields [wp.S.P ⇒ wp.ScoV.P].
(LHS ⇐ RHS): We observe for all P with glob.P ∩ V = ∅ and glob.P \ coV ≠ ∅ (without the
latter the RHS would already yield the required [wp.S.P ⇒ wp.ScoV.P]):
wp.S.P
≡ {pointwise version of P on the state space spanned by (coV1, ND):
let coV1 := glob.P ∩ coV, ND := glob.P \ coV, n := |coV1| and
n' := |glob.P|}
wp.S.(∃p : P.p = true : (∀i : 0 ≤ i < n : coV1.i = p.i)
∧ (∀i : n ≤ i < n' : ND.(i−n) = p.i))
≡ {junctivity of wp.S: recall S is deterministic and
0 < |ND| due to (glob.P \ coV) ≠ ∅}
(∃p : P.p = true : wp.S.(∀i : 0 ≤ i < n : coV1.i = p.i)
∧ (∀i : n ≤ i < n' : wp.S.(ND.(i−n) = p.i)))
≡ {RE3: ND ∩ def.S = ∅ and RE2}
(∃p : P.p = true : wp.S.(∀i : 0 ≤ i < n : coV1.i = p.i)
∧ (∀i : n ≤ i < n' : wp.S.true ∧ (ND.(i−n) = p.i)))
⇒ {RHS, twice: coV1 ⊆ coV and glob.true = ∅}
(∃p : P.p = true : wp.ScoV.(∀i : 0 ≤ i < n : coV1.i = p.i)
∧ (∀i : n ≤ i < n' : wp.ScoV.true ∧ (ND.(i−n) = p.i)))
≡ {RE3: ND ∩ def.ScoV = ∅ and RE2}
(∃p : P.p = true : wp.ScoV.(∀i : 0 ≤ i < n : coV1.i = p.i)
∧ (∀i : n ≤ i < n' : wp.ScoV.(ND.(i−n) = p.i)))
≡ {junctivity of wp.ScoV: ScoV is deterministic and again 0 < |ND|}
wp.ScoV.(∃p : P.p = true : (∀i : 0 ≤ i < n : coV1.i = p.i)
∧ (∀i : n ≤ i < n' : ND.(i−n) = p.i))
≡ {pointwise version of P}
wp.ScoV.P .
5.3 Slice and co-slice refinements yield a general refinement
Combining separate variable-wise proofs for a slice-refinement and its complementary co-slice-
refinement, we can discard the variable-wise approach.
Corollary 5.4. Let S, T be any pair of deterministic statements and let V be any set of variables;
then
(S ⊑ T) ≡ ((S ⊑V T) ∧ (S ⊑(V) T)) .
Proof. We observe
S ⊑ T
≡ {def. of refinement; glob.P ∩ ∅ = ∅ holds for all P}
(∀P : glob.P ∩ ∅ = ∅ : [wp.S.P ⇒ wp.T.P])
≡ {def. of co-slice-refinement}
S ⊑(∅) T
≡ {Corollary 5.2 with ScoV, V := T, ∅: S, T deterministic}
[wp.S.true ⇒ wp.T.true]
∧ (∀x, val : x ∈ (def.S ∪ def.T) ∧ val ∈ T.x : [wp.S.(x = val) ⇒ wp.T.(x = val)])
≡ {Lemma 5.5 (see below) and pred. calc.}
[wp.S.true ⇒ wp.T.true]
∧ (∀x, val : x ∈ (def.S ∪ def.T) ∧ val ∈ T.x : [wp.S.(x = val) ⇒ wp.T.(x = val)])
∧ (∀x, val : x ∈ (V \ (def.S ∪ def.T)) ∧ val ∈ T.x :
[wp.S.(x = val) ⇒ wp.T.(x = val)])
≡ {merging the ranges}
[wp.S.true ⇒ wp.T.true]
∧ (∀x, val : x ∈ (V ∪ def.S ∪ def.T) ∧ val ∈ T.x :
[wp.S.(x = val) ⇒ wp.T.(x = val)])
≡ {splitting the range; pred. calc.}
[wp.S.true ⇒ wp.T.true]
∧ (∀x, val : x ∈ V ∧ val ∈ T.x : [wp.S.(x = val) ⇒ wp.T.(x = val)])
∧ [wp.S.true ⇒ wp.T.true]
∧ (∀x, val : x ∈ ((def.S ∪ def.T) \ V) ∧ val ∈ T.x :
[wp.S.(x = val) ⇒ wp.T.(x = val)])
≡ {Theorem 5.1 with SV := T and Corollary 5.2 with ScoV := T: again,
S, T are deterministic}
(S ⊑V T) ∧ (S ⊑(V) T) .
Lemma 5.5. Let S, T be any pair of statements and let P be any predicate, with glob.P ∩ (def.S
∪ def.T) = ∅; then
[wp.S.true ⇒ wp.T.true] ⇒ [wp.S.P ⇒ wp.T.P] .
Proof. For all such S, T, P, with [wp.S.true ⇒ wp.T.true] and glob.P ∩ (def.S ∪ def.T) = ∅, we observe
wp.S.P
≡ {RE3: glob.P ∩ def.S = ∅ (proviso and set theory)}
P ∧ wp.S.true
⇒ {proviso}
P ∧ wp.T.true
≡ {RE3 again: glob.P ∩ def.T = ∅ (proviso and set theory)}
wp.T.P .
5.3.1 A corollary for program equivalence
An immediate corollary of the above refinement proof method will support proofs of program
equivalence.
Corollary 5.6. Let S, T be any pair of deterministic statements and let V be any set of variables;
then
(S = T) ≡
((∀P : glob.P ⊆ V : [wp.S.P ≡ wp.T.P]) ∧ (∀Q : glob.Q ∩ V = ∅ : [wp.S.Q ≡ wp.T.Q])) .
Proof. Recalling S and T are deterministic, we observe
S = T
≡ {Theorem 4.1}
(S ⊑ T) ∧ [wp.S.true ≡ wp.T.true]
≡ {Corollary 5.4}
(S ⊑V T) ∧ (S ⊑(V) T) ∧ [wp.S.true ≡ wp.T.true]
≡ {def. of slice-refinement and co-slice-refinement}
(∀P : glob.P ⊆ V : [wp.S.P ⇒ wp.T.P]) ∧ (∀Q : glob.Q ∩ V = ∅ : [wp.S.Q ⇒ wp.T.Q])
∧ [wp.S.true ≡ wp.T.true]
≡ {pred. calc. ((3.7), twice): the ranges are non-empty}
(∀P : glob.P ⊆ V : [wp.S.P ⇒ wp.T.P] ∧ [wp.S.true ≡ wp.T.true])
∧ (∀Q : glob.Q ∩ V = ∅ : [wp.S.Q ⇒ wp.T.Q] ∧ [wp.S.true ≡ wp.T.true])
≡ {Lemma 4.2, twice}
(∀P : glob.P ⊆ V : [wp.S.P ≡ wp.T.P])
∧ (∀Q : glob.Q ∩ V = ∅ : [wp.S.Q ≡ wp.T.Q]) .
5.4 Example proof: swap independent statements
To illustrate our new method of proof, consider the following program equivalence for swapping
independent statements:
Program equivalence 5.7. Let S1, S2 be any pair of deterministic statements; then
S1 ; S2  =  S2 ; S1
provided def.S1 ∩ def.S2 = ∅, def.S1 ∩ input.S2 = ∅ and input.S1 ∩ def.S2 = ∅.
Note that the provisos are actually gathered from the following derivation. This is representa-
tive of our general approach to refinement, program equivalence and transformation.
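A minimal instance (our own): x := y + 1 ; z := w  =  z := w ; x := y + 1, where def.S1 = {x}, input.S1 = {y}, def.S2 = {z} and input.S2 = {w}; all three provisos hold, as neither statement defines a variable the other defines or reads.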
Proof. We first observe for all P with glob.P ⊆ def.S1 (note that def.S2 ∩ glob.P = ∅ due to proviso
def.S1 ∩ def.S2 = ∅):
wp.(S1 ; S2).P
≡ {wp of ' ; '}
wp.S1.(wp.S2.P)
≡ {RE3: def.S2 ∩ glob.P = ∅}
wp.S1.(P ∧ wp.S2.true)
≡ {conj. of wp.S1}
wp.S1.P ∧ wp.S1.(wp.S2.true)
≡ {RE3: def.S1 ∩ glob.(wp.S2.true) = ∅ due to RE2 and proviso def.S1 ∩ input.S2 = ∅}
wp.S1.P ∧ wp.S1.true ∧ wp.S2.true
≡ {absorb term. (3.14)}
wp.S1.P ∧ wp.S2.true
≡ {RE3: def.S2 ∩ glob.(wp.S1.P) = ∅ due to RE2, proviso input.S1 ∩ def.S2 = ∅,
and choice of P}
wp.S2.(wp.S1.P)
≡ {wp of ' ; '}
wp.(S2 ; S1).P .
We now observe for all P with glob.P ∩ def.S1 = ∅:
wp.(S2 ; S1).P
≡ {wp of ' ; '}
wp.S2.(wp.S1.P)
≡ {RE3: def.S1 ∩ glob.P = ∅}
wp.S2.(P ∧ wp.S1.true)
≡ {conj. of wp.S2}
wp.S2.P ∧ wp.S2.(wp.S1.true)
≡ {RE3: def.S2 ∩ glob.(wp.S1.true) = ∅ due to RE2 and proviso input.S1 ∩ def.S2 = ∅}
wp.S2.P ∧ wp.S2.true ∧ wp.S1.true
≡ {absorb term. (3.14)}
wp.S2.P ∧ wp.S1.true
≡ {RE3: def.S1 ∩ glob.(wp.S2.P) = ∅ due to RE2, proviso def.S1 ∩ input.S2 = ∅,
and choice of P}
wp.S1.(wp.S2.P)
≡ {wp of ' ; '}
wp.(S1 ; S2).P .
Taken together, the above two derivations yield the required program equivalence, due to
Corollary 5.6 and the determinism of S1 and S2.
5.5 Summary
This chapter has extended our transformation framework by introducing a proof method for both
refinements and program equivalence, specifically designed to support slicing-related refactoring
transformations. Two complementary concepts of slice-refinement and co-slice-refinement have
been introduced. It has been shown that proving each kind of refinement separately is equivalent
to proving normal refinements of code. This approach has been shown to be applicable for proving
program equivalence as well, and one such example, the swapping of independent statements, has been
proved.
The next chapter will apply this proof method in developing our first version of sliding.
Chapter 6
Statement Duplication
In this chapter, the first step towards slice extraction is taken by formally developing a program
equivalence that yields a transformation of statement duplication. The duplication begins by
making two clones of the original program. These are composed sequentially, and correctness is
ensured by the addition of compensatory code. This code is responsible for keeping and retrieving
backup of initial and final values. One clone is specialized for computing the results carried by the
variables selected for extraction, whereas the other is dedicated to the remaining computations
(as captured by the remaining variables).
6.1 Example
When asked to extract the computation of sum in the following program fragment
while i<a.length do
i,sum,prod :=
i+1,sum+a[i],prod*a[i]
od
we offer to duplicate the selected statement, and systematically add some compensatory code,
to make the transformation correct. This would yield the following version, in which the actual
computation of sum is in the first clone, whereas the remaining results (i.e. in i,prod ) are computed
in the second (complementary) clone:
|[var isum,iprod,ii,fsum
; isum,iprod,ii := sum,prod,i
; while i<a.length do
i,sum,prod :=
i+1,sum+a[i],prod*a[i]
od
; fsum := sum
;
sum,prod,i := isum,iprod,ii
; while i<a.length do
i,sum,prod :=
i+1,sum+a[i],prod*a[i]
od
; sum:=fsum
]|
6.2 Sequential simulation of independent parallel execution
Consider the effects of a program statement's execution as the results carried by its defined vari-
ables, i.e. def.S for a statement S, in case of termination. Now consider a partition of def.S, say
into two subsets def.S = (V, coV). (Here, the otherwise arbitrary name coV was chosen as a hint
that this set of variables is complementary to V.)
If S is deterministic, its computation can be accomplished by two — or more, depending on
the number of partitions — independent machines. Each such machine will be given the same
initial state and the same program statement for execution. Clearly, due to determinism, both
machines terminate under the same conditions. Then, in case of termination, the results can be
collected from the two machines, say V from the first machine and coV from the second.
The usefulness of the above construction will become clear later on, when each clone of S will
be independently simplified to achieve its specific designated goal.
For simulating the above scenario (of two independent machines), using our sequential imper-
ative language (for a single machine), we propose to sequentially compose two clones of statement
S. For the second clone to work properly, we insist that the first clone will not modify the original
set of variables. This will be guaranteed by keeping a backup of initial values, and retrieving those
values upon entry to the second clone. Using our available language constructs, this transforma-
tion is formalised in the following section. (We actually formalise it as a program equivalence,
thus keeping it general enough to be relevant for the reverse transformation as well.)
6.3 Formal derivation
Program equivalence 6.1. Let S, V, coV, iV, icoV, fV be any deterministic statement and five
sets of variables, respectively; then
S =
(iV, icoV := V, coV
; S
; fV := V
;
V, coV := iV, icoV
; S
; V := fV)[live V, coV]
provided def.S = (V, coV)
and (iV, icoV, fV) ∩ glob.S = ∅.
Proof.
S
= {prepare for statement duplication (Lemma 6.2 below):
def.S = (V, coV) and (iV, icoV, fV) ∩ glob.S = ∅}
(iV, icoV := V, coV ; S ; fV := V ; V := fV)[live V, coV]
= {statement duplication (Lemma 6.3 below): S is deterministic}
(iV, icoV := V, coV ; S ; fV := V
; V, coV := iV, icoV ; S ; V := fV)[live V, coV] .
Lemma 6.2. Let S, V, coV, iV, icoV, fV be any statement and five sets of variables, respectively;
then
S =
(iV, icoV := V, coV
; S
; fV := V
;
V := fV)[live V, coV]
provided def.S = (V, coV)
and (iV, icoV, fV) ∩ glob.S = ∅.
Proof.
(iV, icoV := V, coV ; S ; fV := V ; V := fV)[live V, coV]
= {assignment-based sub. (Law 18): V ∩ fV = ∅ since
V ⊆ def.S (proviso), def.S ⊆ glob.S (RE5) and glob.S ∩ fV = ∅ (proviso)}
(iV, icoV := V, coV ; S ; fV := V ; V := V)[live V, coV]
= {remove aux. self assignment (Law 2)}
(iV, icoV := V, coV ; S ; fV := V)[live V, coV]
= {remove dead assignments (Law 24): fV ∩ (V, coV) = ∅ (proviso and RE5)}
(iV, icoV := V, coV ; S)[live V, coV]
= {remove dead assignments (Law 25):
(iV, icoV) ∩ (((V, coV) \ ddef.S) ∪ input.S) = ∅ (again, proviso and RE5)}
S[live V, coV]
= {remove aux. liveness info. (Law 19): def.S ⊆ (V, coV)}
S .
Lemma 6.3. Let S, V, coV, iV, icoV, fV be any deterministic statement and five sets of vari-
ables, respectively; then
iV, icoV := V, coV
; S
; fV := V
=
iV, icoV := V, coV
; S
; fV := V
;
V, coV := iV, icoV
; S
provided def.S = (V, coV)
and (iV, icoV, fV) ∩ glob.S = ∅.
Proof.
iV, icoV := V, coV ; S ; fV := V
= {intro. following assertion (Law 7)}
iV, icoV := V, coV ; {V, coV = iV, icoV} ; S ; fV := V
= {see below}
iV, icoV := V, coV ; {V, coV = iV, icoV} ; S ; fV := V
; V, coV := iV, icoV ; S
= {remove following assertion (Law 7)}
iV, icoV := V, coV ; S ; fV := V ; V, coV := iV, icoV ; S .
Note that no data may flow from the first clone to the second:
def.(S ; fV := V) ∩ input.(V, coV := iV, icoV ; S) = ∅. We now observe for all P
wp.({V, coV = iV, icoV} ; S ; fV := V ; V, coV := iV, icoV ; S).P
≡ {wp of ' ; ' and assertions}
(V, coV = iV, icoV) ∧ wp.(S ; fV := V ; V, coV := iV, icoV ; S).P
≡ {wp of ' ; ' and ':='}
(V, coV = iV, icoV) ∧ wp.S.((wp.(V, coV := iV, icoV ; S).P)[fV\V])
≡ {wp of ' ; ' and ':='}
(V, coV = iV, icoV) ∧ wp.S.((wp.S.P)[V, coV\iV, icoV][fV\V]) .
At this point, due to Corollary 5.6 and the determinism of S, we are ready to distinguish two
complementary cases: (a) glob.P ⊆ fV; and (b) glob.P ∩ fV = ∅. The former case involves results
computed in the first clone of S whereas the latter takes care of the computations from the second
clone, which are — due to the lack of data flow — independent of the first clone's results.
Case (a): glob.P ⊆ fV
(V, coV = iV, icoV) ∧ wp.S.((wp.S.P)[V, coV\iV, icoV][fV\V])
≡ {RE3: glob.P ∩ def.S = ∅}
(V, coV = iV, icoV) ∧ wp.S.((wp.S.true ∧ P)[V, coV\iV, icoV][fV\V])
≡ {dist. of normal subs over ∧}
(V, coV = iV, icoV) ∧ wp.S.((wp.S.true)[V, coV\iV, icoV][fV\V]
∧ P[V, coV\iV, icoV][fV\V])
≡ {remove redundant subs: RE2 (fV ∩ (input.S, iV, icoV) = ∅)}
(V, coV = iV, icoV) ∧ wp.S.((wp.S.true)[V, coV\iV, icoV]
∧ P[V, coV\iV, icoV][fV\V])
≡ {remove redundant subs: (V, coV) ∩ glob.P = ∅}
(V, coV = iV, icoV) ∧ wp.S.((wp.S.true)[V, coV\iV, icoV] ∧ P[fV\V])
≡ {wp.S is conj.}
(V, coV = iV, icoV) ∧ wp.S.((wp.S.true)[V, coV\iV, icoV]) ∧ wp.S.(P[fV\V])
≡ {RE3: (iV, icoV) ∩ def.S = ∅ (recall (V, coV) = def.S)}
(V, coV = iV, icoV) ∧ wp.S.true ∧ (wp.S.true)[V, coV\iV, icoV]
∧ wp.S.(P[fV\V])
≡ {remove redundant subs: (V, coV = iV, icoV)}
(V, coV = iV, icoV) ∧ wp.S.true ∧ wp.S.true ∧ wp.S.(P[fV\V])
≡ {absorb termination (3.14), twice}
(V, coV = iV, icoV) ∧ wp.S.(P[fV\V])
≡ {wp of ':=' and ' ; '}
(V, coV = iV, icoV) ∧ wp.(S ; fV := V).P
≡ {wp of assertions and ' ; '}
wp.({iV, icoV = V, coV} ; S ; fV := V).P .
Case (b): glob.P ∩ fV = ∅
(V, coV = iV, icoV) ∧ wp.S.((wp.S.P)[V, coV\iV, icoV][fV\V])
≡ {remove redundant sub.: fV ∩ (glob.S ∪ glob.P ∪ (iV, icoV)) = ∅, proviso and RE2}
(V, coV = iV, icoV) ∧ wp.S.((wp.S.P)[V, coV\iV, icoV])
≡ {RE3: glob.((wp.S.P)[V, coV\iV, icoV]) ∩ def.S = ∅
(recall (V, coV) = def.S)}
(V, coV = iV, icoV) ∧ wp.S.true ∧ (wp.S.P)[V, coV\iV, icoV]
≡ {remove redundant subs: (V, coV = iV, icoV)}
(V, coV = iV, icoV) ∧ wp.S.true ∧ wp.S.P
≡ {absorb termination (3.14)}
(V, coV = iV, icoV) ∧ wp.S.P
≡ {intro. redundant sub.: proviso}
(V, coV = iV, icoV) ∧ wp.S.(P[fV\V])
≡ {wp of ':=' and ' ; '}
(V, coV = iV, icoV) ∧ wp.(S ; fV := V).P
≡ {wp of assertions and ' ; '}
wp.({iV, icoV = V, coV} ; S ; fV := V).P .
6.4 Summary and discussion
This chapter has introduced our first solution to slice extraction, through a naive sliding approach
of statement duplication. Two clones of a given statement are composed for sequential execution.
The first clone, i.e. the extracted code, is dedicated to computing a selected subset of the original
program's results, whereas the second clone, i.e. the complement, is responsible for the remaining
results. Behaviour preservation is guaranteed, as is formally proved in the chapter, by the addition
of compensatory code.
This includes copying of the initial state, saving it in backup variables, in an approach borrowed
from the Tuck transformation of Lakhotia and Deprez [40]. However, in order to keep the resulting
program as close to the original as possible, we refrain from their decision to rename variables
in the complement. Instead, the initial state is retrieved from backup variables into the original
ones, just before the complement begins execution.
The success of this approach is based on a new type of compensation, according to which the
final value of extracted variables is also kept in backup variables, ahead of retrieving the initial
state for the complement. Accordingly, those are retrieved once the complement’s execution is
over.
The proof of correctness is based on our proof method, as developed in the preceding chap-
ter. The equivalence of the original program and its duplicated version has been proved for the
extracted set of variables separately from the proof for the complementary set.
Having proved the equivalence of a program and its duplicated version, rather than expressing
it as a direct transformation, the result of this chapter is also applicable for merging a duplicated
statement. However, as we are interested in slice extraction for untangling code, rather than
tangling it, that direction will not be pursued further (hence the chapter's title).
Several improvements of statement duplication will be developed in later chapters. Both the
extracted code and its complement will be reduced by slicing (in the next three chapters). Then,
the complement will be further reduced by reusing extracted results (Chapter 10) and redundant
compensation will be eliminated in Chapter 11.
Our correctness proof has been decomposed in a certain manner such that those further im-
provements will be able to reuse parts of it. In particular, both Lemma 6.2 and Lemma 6.3 will
be reused in Chapter 10 when reducing the complement.
Finally, we consider statement duplication as a naive sliding operation, at least metaphorically,
due to the following observation. The code of the given program statement can be printed on a
single transparency slide and photocopied. Placing the two slides one on top of the other yields
the original program; then sliding one away from the other and adding compensatory code would
yield the duplicated version. The further improvements will refine this approach by representing
a program with more slides. On each such slide, in turn, a part of the program will be printed.
Chapter 7
Semantic Slice Extraction
As was introduced earlier in the thesis (back in Chapter 1), the main challenge in slice extraction is
to be able to untangle the extracted code from its complement, whilst minimizing code duplication.
With respect to the goal of minimizing duplication, it may seem self-defeating to base our novel
approach on statement duplication. However, this duplication can be justified by the following
observation.
Once a statement has been duplicated, and each of its clones has been specialized (through
copying of initial and final values, in the compensatory code) for computing only a subset of its
results, we have potentially rendered some of its internal statements dead. Those can subsequently
be removed by slicing.
In this chapter, requirements of slicing are derived from the earlier statement-duplication for-
malisation and a form of live variables analysis, to be introduced in the chapter. The result of this
derivation, besides slicing requirements, is a refinement relation similar to but more general than
the program equivalence of statement duplication. This refinement rule will later (in Chapter 9)
be applied in deriving our first slice-extraction transformation.
7.1 Example
Going back to the sum and prod example, we now start with the following version:
i,sum,prod := 0,0,1
; while i<a.length do
i,sum,prod :=
i+1,sum+a[i],prod*a[i]
od
and try to extract sum. Transforming the code according to the statement-duplication program
equivalence (6.1) would yield the following version (replacing liveness information with local vari-
ables):
|[var isum,iprod,ii,fsum
; isum,iprod,ii := sum,prod,i
; i,sum,prod := 0,0,1
; while i<a.length do
i,sum,prod :=
i+1,sum+a[i],prod*a[i]
od
; fsum := sum
;
sum,prod,i := isum,iprod,ii
; i,sum,prod := 0,0,1
; while i<a.length do
i,sum,prod :=
i+1,sum+a[i],prod*a[i]
od
; sum:=fsum
]|
The duplicated statement can now be simplified by slicing. The result
|[var isum,iprod,ii,fsum
; isum,iprod,ii := sum,prod,i
; i,sum := 0,0
; while i<a.length do
i,sum :=
i+1,sum+a[i]
od
; fsum := sum
;
sum,prod,i := isum,iprod,ii
; i, prod := 0, 1
; while i<a.length do
i, prod :=
i+1, prod*a[i]
od
; sum:=fsum
]|
can then be re-formatted as
|[var isum,iprod,ii,fsum
; isum,iprod,ii := sum,prod,i
; i,sum := 0,0
; while i<a.length do
i,sum := i+1,sum+a[i]
od
; fsum := sum
;
sum,prod,i := isum,iprod,ii
; i,prod := 0,1
; while i<a.length do
i,prod := i+1,prod*a[i]
od
; sum:=fsum
]|
Notice that the compensatory code, i.e. all backup (local) variables isum, iprod, ii and fsum,
along with their respective initialization code, has become redundant (although this will not al-
ways be the case). That redundancy will be removed in later simplification steps (see Chapter 11).
Later in the chapter we shall formally derive the (syntactic and semantic) requirements of
slices, by developing an improved solution to slice extraction (on the lines of the above example).
This solution, as well as the semantics of slices, will be based on a formal approach to live variables
analysis.
But first, we turn to introduce that liveness analysis.
7.2 Live variables analysis
Suppose we wish to perform liveness analysis on a core statement S with respect to a given set
of variables, V. Let coV := def.S \ V, the complementary set of defined variables, be considered
live throughout S (i.e. on exit from any of its slips). We first note that the over-approximation of
considering elements of coV live, even in places where they are not, will not be harmful.
Moreover, variables outside (V, coV) are not defined in any slip T of S (since S is a core
statement, with no local variables). Such variables will keep their initial value throughout S, and
hence there will be no harm in ignoring them.
Accordingly, liveness analysis in S begins with S = S[live V, coV] for any given S, V with
coV := def.S \ V. This step is correct due to Law 19 (with V := (V, coV)).
Then, liveness information is propagated to all slips of S, in a syntax-directed manner, as
follows.
For sequential composition, we turn any (S1 ; S2)[live V1, coV] with V1 ⊆ V into
(S1[live V2, coV] ; S2[live V1, coV])[live V1, coV], where
V2 := (V1 \ ddef.S2) ∪ (V ∩ input.S2), which is correct due to Law 20 and our earlier comments
on the redundancy of variables outside (V, coV) and the legitimacy of over-approximation (in coV).
For IF statements, we simply turn any (if B then S1 else S2)[live V1, coV] into
if B then S1[live V1, coV] else S2[live V1, coV] by applying Law 21 with V := (V1, coV).
Finally, for DO loops, we turn (while B do S od)[live V1, coV] into
(while B do S[live V2, coV] od)[live V1, coV], where
V2 := V1 ∪ (V ∩ (glob.B ∪ input.S)), which is correct due to Law 22 and, again, the earlier
comment on the redundancy of variables outside (V, coV).
We refer to the results of liveness analysis of statement S on variables V by saying
"let T[live V1, coV] be any slip of S[live V, coV]". In a full live variables analysis of a statement
S, the set of variables, V, is not explicitly selected and the set def.S is taken in its place.
Hence, in full liveness analysis, the complementary set coV is empty and we say "let T[live V1]
be any slip of S". Then T[live V1] can be safely augmented with following assignments to any
of the variables in def.S \ V1. This augmentation will still keep all auxiliary liveness information
redundant (and hence removable). That is, any augmentation by assignment to dead variables
(from def.S) is correct, in the context of S. (Augmentation by assignment to any other dead
variable not in def.S will be correct in S[live V] but not in S itself. However, we will not be
interested in such augmentations.)
Note that, in deviation from traditional liveness analysis (as in [47]), which typically propagates
information on a flow graph until a fixed point is reached, our algorithm requires one pass of the
program's tree. This is possible due to the simplicity of our language (e.g. no jumps) and the
availability of summary information (i.e. the sets def, ddef and input).
In that light, it is important and interesting to verify that our algorithm is insensitive to
different parses of a given program — which are possible due to the associativity of sequential
composition. That is, just as S1 ; (S2 ; S3) = (S1 ; S2) ; S3 in our language, so
will the analysis produce identical results in both cases, as is shown in the following.
Theorem 7.1. Distribution of liveness information over sequential composition is associative.
Proof. Suppose we perform liveness analysis on a core statement S with def.S = V, and we
reach a slip of the form S1 ; S2 ; S3 with live-on-exit variables V3. (Note that the liveness
analysis guarantees V3 ⊆ V.) We now need to show that whatever the internal parsing, the
liveness-analysis algorithm would identify the same results for slips S1, S2 and S3, on both
(S1 ; (S2 ; S3))[live V3] and ((S1 ; S2) ; S3)[live V3] .
For the former, we have
(S1 ; (S2 ; S3))[live V3]
= {liveness analysis (on V):
let V1 := (V3 \ ddef.(S2 ; S3)) ∪ (V ∩ input.(S2 ; S3))}
(S1[live V1] ; (S2 ; S3)[live V3])[live V3]
= {liveness analysis (on V): let V2 := (V3 \ ddef.S3) ∪ (V ∩ input.S3)}
(S1[live V1] ; (S2[live V2] ; S3[live V3])[live V3])[live V3] ,
and for the latter, we have
((S1 ; S2) ; S3)[live V3]
= {liveness analysis (on V), with V2 as above}
((S1 ; S2)[live V2] ; S3[live V3])[live V3]
= {liveness analysis (on V): let V1' := (V2 \ ddef.S2) ∪ (V ∩ input.S2)}
((S1[live V1'] ; S2[live V2])[live V2] ; S3[live V3])[live V3] .
Finally, we observe that V1 = V1', as expected, since
V1
= {def. of V1}
(V3 \ ddef.(S2 ; S3)) ∪ (V ∩ input.(S2 ; S3))
= {set theory: V3 ⊆ V}
V ∩ ((V3 \ ddef.(S2 ; S3)) ∪ input.(S2 ; S3))
= {Lemma 7.2, see below}
V ∩ ((((V3 \ ddef.S3) ∪ input.S3) \ ddef.S2) ∪ input.S2)
= {set theory: again, V3 ⊆ V}
(((V3 \ ddef.S3) ∪ (V ∩ input.S3)) \ ddef.S2) ∪ (V ∩ input.S2)
= {def. of V2}
(V2 \ ddef.S2) ∪ (V ∩ input.S2)
= {def. of V1'}
V1' .
Lemma 7.2. Let S1, S2, V be any two statements and a set of variables, respectively; then
(V \ ddef.(S1 ; S2)) ∪ input.(S1 ; S2)
=
(((V \ ddef.S2) ∪ input.S2) \ ddef.S1) ∪ input.S1 .
Proof. We observe
(V \ ddef.(S1 ; S2)) ∪ input.(S1 ; S2)
= {ddef and input of ' ; '}
(V \ (ddef.S1 ∪ ddef.S2)) ∪ (input.S1 ∪ (input.S2 \ ddef.S1))
= {set theory}
(V \ (ddef.S1 ∪ ddef.S2)) ∪ (input.S2 \ ddef.S1) ∪ input.S1
= {set theory}
((V \ ddef.S2) \ ddef.S1) ∪ (input.S2 \ ddef.S1) ∪ input.S1
= {set theory}
(((V \ ddef.S2) ∪ input.S2) \ ddef.S1) ∪ input.S1 .
7.2.1 Simultaneous liveness
Liveness analysis will be useful beyond the elimination of dead assignments.
Definition 7.3 (Simultaneous Liveness). When performing full live variables analysis on a given
S[live V], a set of variables X is considered simultaneously-live (in S[live V]) if more than one
element of X is in the live-variables set of some slip T of S. When no such slip exists, X is not
simultaneously-live in S[live V].
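To illustrate (an example of our own): in (a := 1 ; b := 2 ; c := a + b)[live c], the set {a, b} is simultaneously-live, as both a and b are live on exit from b := 2; in (a := 1 ; c := a ; b := 2 ; d := b)[live c, d] it is not, the propagated live sets — {a}, {c}, {b, c} and {c, d} — never containing more than one of a and b.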
The concept of simultaneous liveness will be useful mainly in the merging of live ranges [54].
A set of non-simultaneously-live variables can (under some further conditions) be merged into one
variable. This will be explored, formalised and applied later in the thesis (see Section 8.6.2 and
Appendix D).
This concludes our introduction to liveness analysis, which will be applied next for slice ex-
traction and later in the thesis e.g. for reducing compensation after sliding (Chapter 11).
7.3 Formal derivation using statement duplication
Refinement 7.4. Let S, SV, ScoV, V, coV, iV, icoV, fV be three deterministic statements and
five sets of variables, respectively; then
S ⊑
(iV, icoV := V, coV
; SV
; fV := V
;
V, coV := iV, icoV
; ScoV
; V := fV)[live V, coV]
provided def.S = (V, coV),
S[live V] ⊑ SV[live V],
S[live coV] ⊑ ScoV[live coV],
def.SV ⊆ def.S,
def.ScoV ⊆ def.S and
(iV, icoV, fV) ∩ glob.S = ∅.
Proof.
S
= {duplicate statement (Program equivalence 6.1): S is deterministic,
def.S = (V, coV) and (iV, icoV, fV) ∩ glob.S = ∅ (provisos)}
(iV, icoV := V, coV ; S ; fV := V
; V, coV := iV, icoV ; S ; V := fV)[live V, coV]
⊑ {Refinement 7.5 with S' := S}
(iV, icoV := V, coV ; SV ; fV := V
; V, coV := iV, icoV ; ScoV ; V := fV)[live V, coV] .
Refinement 7.5. Let S, S', SV, ScoV, V, coV, iV, icoV, fV be four statements and five sets of
variables, respectively; then
(iV, icoV := V, coV
; S
; fV := V
;
V, coV := iV, icoV
; S'
; V := fV)[live V, coV]
⊑
(iV, icoV := V, coV
; SV
; fV := V
;
V, coV := iV, icoV
; ScoV
; V := fV)[live V, coV]
provided
P1: def.S = (V, coV),
P2: def.S' = (V, coV),
P3: S[live V] ⊑ SV[live V],
P4: S'[live coV] ⊑ ScoV[live coV],
P5: def.SV ⊆ def.S,
P6: def.ScoV ⊆ def.S' and
P7: (iV, icoV, fV) ∩ glob.S = ∅.
Proof.
(iV, icoV := V, coV ; S ; fV := V
; V, coV := iV, icoV ; S′ ; V := fV )[live V, coV ]
= {liveness analysis: fV ∩ def.S′ = ∅ (P7, RE5, P1 and P2)}
((iV, icoV := V, coV ; S ; fV := V ;
V, coV := iV, icoV ; S′[live coV ])[live fV, coV ] ; V := fV )[live V, coV ]
⊑ {(P4)}
((iV, icoV := V, coV ; S ; fV := V ;
V, coV := iV, icoV ; ScoV [live coV ])[live fV, coV ] ; V := fV )[live V, coV ]
= {liveness removal: def.ScoV ∩ fV = ∅ (P1, P2, P6, P7 and RE5)}
((iV, icoV := V, coV ; S ; fV := V ;
V, coV := iV, icoV ; ScoV )[live fV, coV ] ; V := fV )[live V, coV ]
= {liveness analysis: (input.ScoV ∖ ddef.(fV := V ; V, coV := iV, icoV ))
∩ def.(iV, icoV := V, coV ; S) ⊆ (iV, icoV ); then (iV, icoV ) ∩ def.S = ∅}
(((iV, icoV := V, coV ; S[live V ])[live V, iV, icoV ] ; fV := V ;
V, coV := iV, icoV ; ScoV )[live fV, coV ] ; V := fV )[live V, coV ]
⊑ {(P3)}
(((iV, icoV := V, coV ; SV [live V ])[live V, iV, icoV ] ; fV := V ;
V, coV := iV, icoV ; ScoV )[live fV, coV ] ; V := fV )[live V, coV ]
= {liveness removal: def.SV ∩ (iV, icoV ) = ∅ (P5, P7, RE5);
then, since coV ∩ (V, iV, icoV ) = ∅ (P1, P7 and RE5), all potentially dead coV
remain dead, since coV ⊆ (ddef.(fV := V ; V, coV := iV, icoV )
∖ input.(fV := V ; V, coV := iV, icoV ))}
(iV, icoV := V, coV ; SV ; fV := V ;
V, coV := iV, icoV ; ScoV ; V := fV )[live V, coV ] .
7.4 Requirements of slicing
From the above law of refinement, we can gather conditions P3 and P5 as requirements for slicing.
That is, for a given deterministic statement S and set of variables V, any statement SV satisfying
(Q1:) S[live V ] ⊑ SV [live V ]
is (at least semantically) a correct slice of S with respect to V. Furthermore, if condition
(Q2:) def.SV ⊆ def.S
holds too, we know SV can successfully replace the extracted S in a transformation of slice
extraction of V from S.
For sanity checking (of the generality of those requirements), we observe conditions P4 and P6
above, for the complement. There, requirement Q1 with S, V, SV := S′, coV, ScoV holds due to
P4, and Q2 with SV, S := ScoV, S′ holds through P6. Thus, any slice ScoV of S′ with respect
to coV would make a good complementary statement in a transformation of slice extraction of V
from S.
For requirement Q1 above, we make one further observation. Following the definitions of
liveness and of the relation of slice-refinement (as defined in Chapter 5), any slice-refinement of S
with respect to V is a semantic slice. That is, any SV satisfying S ⊑V SV
satisfies Q1 and is thus a semantic slice of S, V.
In general, any known refinement technique can be applied to S[live V] (rather than directly
to S) in deriving a slice-refinement. However, in an attempt to constructively describe related
transformations, a slicing algorithm will be formally developed later in the thesis.
7.4.1 Ward’s definition of syntactic and semantic slices
Our semantic definition of a slice is akin to several definitions by Martin Ward (e.g. in [57], and
most recently in “Conditioned Semantic Slicing via Abstraction and Refinement in FermaT” by
Ward, Zedan and Hardcastle [59]), and has indeed been inspired by those. Ward et al. base their
semantic definition on a novel relation between programs, called semi-refinement, which involves
the introduction of termination. That is, a program S′ is a semi-refinement of another program
S if the two are equivalent on all input on which S is guaranteed to terminate. On other inputs,
S′ is free to terminate, or to do anything else. Recalling that refinement involves the introduction
of either (or both) termination and determinism, we note that in our context of deterministic
programs, refinement and semi-refinement are the same.
In their formalism (also based on predicate transformers), a semantic slice of a given program
S on variables X is any program S′ for which S′ ; remove(W ∖ X) is a semi-refinement of
S ; remove(W ∖ X), where W is the final state space for S and S′, and with remove restricting
that state space. Note how our S[live X] concisely captures their S ; remove(W ∖ X). In
effect, their combination of state-space restriction and semi-refinement is captured by our notation
for live variables and normal refinement (and hence the requirement Q1 above) — in our context
of deterministic programs.
A second relation between programs, that of a reduction, is introduced by Ward et al. to
participate in the definition of a syntactic slice. A program reduction involves the replacement of
substatements (i.e. slips in our terminology) with skip statements (or exit, which is beyond the
scope of our investigation), thus maintaining the original syntactic structure. Then, any semantic
slice S′ of S on X is also a syntactic slice if it is a reduction of S. The next chapter will define
the program entities of slides to achieve a similar effect. This will allow a later formulation of a
provably-correct syntax-preserving slicing algorithm.
7.5 Summary
This chapter has developed a refinement rule for slice extraction, based on statement duplication
(from the preceding chapter) and a live variables analysis. Our approach to liveness analysis has
been formalised in the chapter.
Liveness information is introduced into a program statement by first assuming all variables are
live; then the information is propagated to all slips of the original statement; next, local
transformations such as dead-assignment-elimination can be performed; finally, under some conditions,
the correct local transformations are also globally correct such that all liveness information can
be removed.
According to our new liveness-based refinement rule (Refinement 7.4) for slice extraction, both
the extracted code and the complement are slices (of the same program, on two complementary
sets of variables).
Advanced strategies for minimizing the amount of duplication in slice extraction will be
explored later in the thesis, after developing a slicing algorithm (in Chapter 9 ahead); this will be a
semantically correct algorithm, following the slicing requirements (Q1 and Q2) as derived in this
chapter. That algorithm will allow us to constructively describe a transformation based on the
semantic slice-extraction refinement laws of this chapter.
The next chapter will lay the foundations for our slicing algorithm by formalising a novel
decomposition of programs into syntactic elements called slides.
Chapter 8
Slides: A Program Representation
8.1 Slideshow: a program execution metaphor
In this thesis, the execution of imperative procedural sequential programs is thought of as a
systematic slideshow.
According to the slideshow metaphor, the executable code of each procedure is printed on a
single transparent slide, one that is identifiable by the procedure’s unique signature.
For a reason that will soon become clear, we choose to think of the slides as A4 (or longer, if
needs be) transparencies that can be projected using a classroom-like overhead projector. This is
in contrast to traditional photography-related slide projectors, where a picture is printed onto
film (which is then placed inside a cardboard or plastic shell) and, whenever selected for viewing,
is mechanically slid away from a tray (stacking a normally prearranged collection of pictures)
and onto the projector’s lamp.
It is the latter projection style that is responsible for the English terminology of a slide.
Nevertheless, in a sliding transformation, to be introduced shortly and developed throughout the
thesis, the sideways movement of slides will be of a somewhat different nature.
Why do we prefer overhead projection? One reason is that this way, while projecting (i.e.
executing) a program, the presenter can use a non-permanent (i.e. erasable) pen for writing notes
on the slide itself (or alternatively on a separate blank slide that is placed directly on top). This can
be useful e.g. for keeping track of current values of local variables. The other reason is related to
the order of presented slides. In tracking program execution, the order will not be as prearranged,
static and sequential as is usually the case with photographic slideshows.
The slideshow is a demonstration of a typical von Neumann style of sequential program
execution. The program itself is a collection of procedures storing imperative subprograms. (The
model is probably extendable for concurrent and even truly parallel program execution, by having
either one presenter, simultaneously using multiple trays or projectors/screens, or maybe even a
combination of many independent presenters/projectors.)
In the rest of this thesis, the idea of program execution as a slideshow will play no further
part. (It was introduced here merely as an illustration aid.) Instead, the slideshow metaphor
will be applied to the development and evolution of programs, or more specifically to slicing and
refactoring.
8.2 Slides in refactoring: sliding
8.2.1 One slide per statement
It is in illustrating and formalising the slice-extraction refactoring that the program medium of
slides will be instrumental. A plausible interpretation of our initial solution, that of statement
duplication (from Chapter 6 above), goes as follows.
Suppose the code of a program statement S is printed on a single transparency slide; duplicate
that slide, thus yielding two clones, say S1 and S2; place them one on top of the other (thus
getting the original S); slide one of them (say S2) sideways; finally, for behaviour preservation,
add compensatory code.
But duplication of code is bad. The interpretation of our first step for reducing such duplication
(as was defined in the preceding chapter and will be automated in the next), in terms of sliding,
is described in what follows.
8.2.2 A separate slide for each variable
In a first step forward, we will no longer think of a statement S as being printed on a single slide.
Instead, we take further advantage of features of transparency slides, and dedicate a separate slide
for each defined variable. On each such slide, the slice of that variable (from the end of S) can be
printed.
Assuming no dead code, it can be shown that the union of such slides is S itself. Then, when
a set of variables V is selected for extraction from S, all slides of variables in the complementary
set coV := def.S ∖ V can be separated from the slides of V by sliding. As in the previous solution,
compensatory code should be added, to ensure behaviour preservation.
Another feature of transparency slides that proves useful here is that the relative location of slid
program elements remains the same. This is a fact that existing approaches for syntax-preserving
slice extraction, e.g. KH03 [39], have struggled with, both in illustration and formalisation. With
the slideshow metaphor in mind, this requirement has become relatively trivial.
The result is the extraction of a slice (of V ), with the complement being also a slice, of the
complementary set coV . But the complement can be made even smaller, by reusing the extracted
results of V, as in the following.
8.2.3 A separate slide for each individual assignment
Instead of having a slide for each (defined) variable, our final improvement will involve designating
a separate slide for each individual assignment. On each such slide we shall print the assignment
itself, and all guards (controlling whether the assignment will or will not be executed). We shall
pay special attention to preserving layout (on the slide, both metaphorically and later when
formalising slides), such that the original program will be reproducible, as the union of all slides.
Similarly, each slice will consist of the union of all slides of included assignments.
This time, when asked to extract variables V from S, all slides in the slice of V will be
separated from the remaining slides by sliding, leaving a potentially smaller complement. However,
for preserving behaviour, some extra measures will need to be taken. These include duplication of
some slides (that must appear in both the extracted slice and its complement) and the renaming
of reused extracted values in the complement.
This sliding transformation, along with the extra measures, will be formalised later in the
thesis (see Chapter 10). Then, the need for renaming reused extracted values, in the complement,
will be removed in Chapter 11.
8.3 Representing non-contiguous statements
For slice extraction to be implemented as a sliding operation, we need to decompose a given
statement into a set of not-necessarily contiguous statements. As was introduced earlier, in
Section 4.1.1, instead of speaking of substatements as parts of a program statement, we speak of slips
and slides. In terms of the abstract syntax tree, the former correspond to subtrees and the latter
to paths from the root to a node. More precisely, a slide is a statement formed from such a path
by replacing any statement child (i.e. slip) of a node on the path, which is itself not on the path,
with the empty statement skip . For convenience, in concrete examples, we avoid mentioning the
skip , leaving an empty space instead. Note that this space is empty but considered transparent,
in contrast to the misleading convention of “whitespace”. (Admittedly, a concrete syntax that
understands such empty spaces as the empty statement would have been preferable.)
For example, the following program can be represented as the union of the slides of its
individual assignments:
Let S be any core statement and V be any set of variables; then
slides.S.V ≜ skip when V ∩ def.S = ∅; otherwise we have the following definitions:
slides.(X1,X2 := E1,E2).V ≜ (X1 := E1) where X1 ⊆ V and X2 ∩ V = ∅ ;
slides.(S1;S2).V ≜ (slides.S1.V );(slides.S2.V ) ;
slides.(if B then S1 else S2).V ≜ if B then (slides.S1.V ) else (slides.S2.V ) ;
slides.(while B do S od).V ≜ while B do (slides.S.V ) od .
Figure 8.1: Computing the slides of a core statement with respect to a set of variables.
if x>y then
m:=x
else
m:=y
fi
=
if x>y then
m:=x
else

fi
∪
if x>y then

else
m:=y
fi
Such a union operation will be formalised shortly (in the next section) and such slides of
individual assignments will be formalised later in the chapter (in Section 8.6). But first, we find it
more convenient to formalise a more coarse-grained concept. We define the statement formed by
the union of all individual-assignment slides of a certain program statement S with respect to a
set of variables V. (This way, we avoid having to formalise the access to an individual assignment,
e.g. through labels.)
For a given program statement S and any set of variables V, we define the subprogram of S
containing all assignments to variables in V, along with all their enclosing compound statements,
as slides.S.V (see Figure 8.1). In the example above, with the original statement as S, the
statement slides.S.{x,y,z} is the empty statement skip , whereas slides.S.{m} is the union of the
two individual slides (which is in this case the whole program, S).
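As an illustration, the definition of Figure 8.1 translates almost literally into the hypothetical tuple AST introduced with the liveness sketch of Section 7.2; the representation and names remain our own assumptions, not the thesis's formalism.

def defs(stmt):
    """def.S: every variable assigned somewhere in S."""
    kind = stmt[0]
    if kind == 'skip':
        return frozenset()
    if kind == 'assign':
        return frozenset(x for x, _ in stmt[1])
    if kind == 'seq':
        return defs(stmt[1]) | defs(stmt[2])
    if kind == 'if':
        return defs(stmt[2]) | defs(stmt[3])
    if kind == 'while':
        return defs(stmt[2])
    raise ValueError(kind)

def slides(stmt, V):
    """slides.S.V: keep only assignments to V, under their original guards."""
    if not (frozenset(V) & defs(stmt)):
        return ('skip',)                  # V ∩ def.S = ∅
    kind = stmt[0]
    if kind == 'assign':                  # project the multiple assignment on V
        return ('assign', tuple((x, r) for x, r in stmt[1] if x in V))
    if kind == 'seq':
        return ('seq', slides(stmt[1], V), slides(stmt[2], V))
    if kind == 'if':
        return ('if', stmt[1], slides(stmt[2], V), slides(stmt[3], V))
    if kind == 'while':
        return ('while', stmt[1], slides(stmt[2], V))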
Note that in general, a given core statement S is represented as slides.S.(def.S). Further note
that we choose to name that function slides, instead of say slide or slideFor, despite the fact that it
yields a single statement, because its result should be thought of as the collection of all individual-
assignment slides of variables in the selected set. For convenience, we choose not to distinguish
that collection from the actual statement its union would yield. Indeed, when collecting a set of
slides, putting them one on top of the other, the resulting program is the union of those slides.
This is formalised next.
Let S1, S2, B, X1, X2, X3, E1, E2, E3 be two core statements, a boolean expression, three sets of
variables and three corresponding expressions, respectively; then
S1 ∪ skip ≜ S1 ;
skip ∪ S2 ≜ S2 ;
(X1,X2 := E1,E2) ∪ (X1,X3 := E1,E3) ≜ (X1,X2,X3 := E1,E2,E3)
provided X2 ∩ X3 = ∅ ;
(S1;S2) ∪ (S1′;S2′) ≜ (S1 ∪ S1′);(S2 ∪ S2′) ;
(if B then S1 else S2) ∪ (if B then S1′ else S2′) ≜
if B then (S1 ∪ S1′) else (S2 ∪ S2′) ;
(while B do S1 od) ∪ (while B do S1′ od) ≜ while B do (S1 ∪ S1′) od .
Figure 8.2: Unifying (or merging) statements.
8.4 Collecting slides: the union of non-contiguous code
We define an operation for unifying (or merging) two program statements, S1 and S2, into a single
statement, S1 ∪ S2 (see Figure 8.2).
Note that two statements S1 and S2 are unifiable (i.e. S1 ∪ S2 is well-defined) only when
they have the same shape, as is implicitly expressed in the definition of ∪. For example, an IF
statement can only be merged with an empty statement or with another IF statement whose guard
and two branches are unifiable with the corresponding guard and branches of the former.
Furthermore, note that we do not write wp.(S1 ∪ S2) and do not define wp-semantics for slide
union (or for taking slides in general). This is so since ∪ is not a construct of our
programming language. It is rather a meta-program operation, generating (when well-defined) a program
statement.
Following its definition, it is easy to verify that the union of statements is commutative,
associative and idempotent; hence the choice of infix ∪. The following theorem shows that the
union of slides (for a given statement and a pair of variable sets) is equivalent to the slides of the
union.
Theorem 8.1. Any pair of slides of a single statement, slides.S.V1 and slides.S.V2, is unifiable.
Furthermore, we have
(slides.S.(V1 ∪ V2)) = ((slides.S.V1) ∪ (slides.S.V2)) .
The proof, by induction on the structure of S, can be found in Appendix C.
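The union of Figure 8.2 can likewise be sketched over the assumed tuple AST; shared targets must carry identical expressions (the X1 part), while the remaining targets must be disjoint (X2 ∩ X3 = ∅).

def union(s1, s2):
    """S1 ∪ S2 of Figure 8.2; raises ValueError when not unifiable."""
    if s1[0] == 'skip':
        return s2
    if s2[0] == 'skip':
        return s1
    if s1[0] != s2[0]:
        raise ValueError('statements differ in shape')
    kind = s1[0]
    if kind == 'assign':
        merged = dict(s1[1])
        for x, r in s2[1]:
            if x in merged and merged[x] != r:   # shared target, different E
                raise ValueError('conflicting assignments to ' + x)
            merged[x] = r
        return ('assign', tuple(sorted(merged.items(), key=lambda p: p[0])))
    if kind == 'seq':
        return ('seq', union(s1[1], s2[1]), union(s1[2], s2[2]))
    if kind == 'if':
        if s1[1] != s2[1]:
            raise ValueError('guards differ')
        return ('if', s1[1], union(s1[2], s2[2]), union(s1[3], s2[3]))
    if kind == 'while':
        if s1[1] != s2[1]:
            raise ValueError('guards differ')
        return ('while', s1[1], union(s1[2], s2[2]))

On the if-example of Section 8.3, the union of the two slides of m indeed reproduces the original statement, as Theorem 8.1 guarantees in general.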
Representing program statements as collections of slides will be useful for slicing. A slicing
algorithm typically takes into account both data and control flow. Slides encompass control
dependences and have been defined here in a syntax-directed way, due to the simplicity of structure
of our language. Data dependences, on the level of slides, will be considered next.
8.5 Slide dependence and independence
Data flow between sets of slides is formalised through a relation of slide dependence and a corre-
sponding concept of slide independence. We start with the latter.
Definition 8.2 (Slide Independence). A set of variables V is considered slide independent with
respect to a given statement S, if the condition
glob.(slides.S.V ) ∩ def.S ⊆ V
holds. (Recall slides.S.V is a normal statement, so glob.(slides.S.V ) is the set of global variables
in that statement.)
Interesting (semantic) properties of independent slides will be extensively investigated in the
next chapter, when developing a slicing algorithm. The complementary notion, of slide
dependence, is defined as follows.
Definition 8.3 (A Relation of Slide Dependence). A set of variables V1 depends on another set
V2 with respect to a given statement S, i.e. V1 is related to V2 through slide dependence, when
input.(slides.S.V1) ∩ def.(slides.S.V2) ≠ ∅ .
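Both notions are directly computable in the tuple-AST sketch of the previous sections, given a ddef/input pair in the style of the liveness rules; as before, the representation is our own illustrative assumption.

def ddef(stmt):
    """ddef.S: variables defined on every path through S."""
    kind = stmt[0]
    if kind == 'assign':
        return frozenset(x for x, _ in stmt[1])
    if kind == 'seq':
        return ddef(stmt[1]) | ddef(stmt[2])
    if kind == 'if':
        return ddef(stmt[2]) & ddef(stmt[3])
    return frozenset()                        # skip, while: no guarantee

def inputs(stmt):
    """input.S: variables whose incoming value may be read by S."""
    kind = stmt[0]
    if kind == 'skip':
        return frozenset()
    if kind == 'assign':
        return frozenset().union(*(r for _, r in stmt[1]))
    if kind == 'seq':
        return inputs(stmt[1]) | (inputs(stmt[2]) - ddef(stmt[1]))
    if kind == 'if':
        return frozenset(stmt[1]) | inputs(stmt[2]) | inputs(stmt[3])
    if kind == 'while':
        return frozenset(stmt[1]) | inputs(stmt[2])

def slide_independent(stmt, V):
    """Definition 8.2: glob.(slides.S.V) ∩ def.S ⊆ V (glob = def ∪ input, RE5)."""
    sl = slides(stmt, V)
    return ((defs(sl) | inputs(sl)) & defs(stmt)) <= frozenset(V)

def depends(stmt, V1, V2):
    """Definition 8.3: V1 slide-depends on V2 in S."""
    return bool(inputs(slides(stmt, V1)) & defs(slides(stmt, V2)))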
8.5.1 Smallest enclosing slide-independent set
The reflexive transitive closure of a set V in the context of slides.S, denoted V∗, is the smallest
slide-independent superset of V. (Recall that slide dependence is indeed a relation between sets
of variables.)
When asked to compute the reflexive transitive closure of slide dependence, for a given
statement S and set of variables V, we choose to avoid computing the full relation. Instead, we take a
faster lazy approach, repeatedly adding (to V ) slides on which V depends, until a fixed point is
reached. At each step, all variables in U := input.(slides.S.V ) ∩ def.S are added to V. A fixed
point is reached when U ⊆ V.
This can be slightly improved by observing the relationship between global variables and
input. In general, we note that computing the set of global variables, in a collection of slides
(as for any statement), is faster than computing its set of input variables. Now from RE5 we
know glob.T = def.T ∪ input.T. So, for computing the set U, we observe that even though
glob.(slides.S.V ) ∩ def.S may be larger than input.(slides.S.V ) ∩ def.S, the extra variables would
be from def.(slides.S.V ) and hence from V. We conclude that since the only purpose of U, in
the present algorithm, is to be tested for inclusion in V (and then possibly be added to V ), there
would be no harm in including variables from V in U.
Thus the algorithm for computing the reflexive transitive closure of slide dependence, for any
given S, V is as follows:
slides-dep-rtc.S.V ≜ if U ⊆ V then V else slides-dep-rtc.S.(V ∪ U)
where U := glob.(slides.S.V ) ∩ def.S .
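In the tuple-AST sketch, with defs and inputs as above (and glob as their union, by RE5), this fixed-point computation is direct; the function name is ours.

def slides_dep_rtc(stmt, V):
    """Reflexive transitive closure of slide dependence, lazily computed."""
    V = frozenset(V)
    while True:
        sl = slides(stmt, V)
        U = (defs(sl) | inputs(sl)) & defs(stmt)   # glob.(slides.S.V) ∩ def.S
        if U <= V:                                 # fixed point reached
            return V
        V = V | U                                  # add what V depends on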
8.6 SSA form
Up till now we had one slide for each variable, including all definitions of that variable. Can we
refine the representation such that each slide will be dedicated to a specific instance (i.e. definition
point) of a variable?
We do that by splitting the selected variable, such that a new variable is defined at each
definition point, in the style of SSA. We then show that under some conditions the instances of
a variable can be merged back (to the original) even after performing some transformations (e.g.
slicing).
8.6.1 Transform to SSA
The set of instance variables replacing a variable of the original program in the SSA form is
expected to maintain a property of no-simultaneous-liveness. This way, it will be possible to
transform the program back from SSA.
A toSSA algorithm is formally derived for our core language as Transformation D.5 in
Appendix D and is repeated here as Figure 8.3.
In transforming a given statement S with respect to variables X (i.e. splitting definitions of
X alone), we aim to end up with a statement S′ free of occurrences of X (i.e. glob.S′ ∩ X = ∅) and
with at most one instance (of each member of X) live at each point of S′.
According to Transformation D.5 and its corresponding preconditions P1-P7, we observe that
X should be partitioned into six mutually-disjoint subsets X1, X2, X3, X4, X5, X6. However,
following postcondition Q1 and preconditions P5 and P6, we further observe that of those, only
X4 and X5 are both live-on-exit and defined in S. Since in general we mean to transform all
variables in X := def.S, and since we expect all members of X to be live-on-exit, we are left with
Let S, X, Y be any core statement and two (disjoint) sets of variables; let X1, X2, X3, X4, X5 be
five (mutually disjoint) subsets of X, and let XL1i, XL2i, XL3i, XL4i, XL4f, XL5f be six sets of
instances, all included in the full set of instances XLs; let S′ be the SSA form of S defined by
S′ := toSSA.(S, X, (XL1i,XL2i,XL3i,XL4i), (XL3i,XL4f,XL5f ), Y, XLs); then (Q1:)
(S ; XL3i,XL4f,XL5f := X3,X4,X5)[live XL3i,XL4f,XL5f,Y ] =
(XL1i,XL2i,XL3i,XL4i := X1,X2,X3,X4 ; S′)[live XL3i,XL4f,XL5f,Y ]
and (Q2:) X ∩ glob.S′ = ∅
provided
P1: glob.S ⊆ (X,Y ),
P2: (X1,X2,X3,X4,X5) ⊆ X,
P3: (XL1i,XL2i,XL3i,XL4i,XL4f,XL5f ) ⊆ XLs,
P4: XLs ∩ (X,Y ) = ∅,
P5: (X1,X3) ∩ def.S = ∅,
P6: (X2,X4,X5) ⊆ def.S and
P7: (X ∩ (((X3,X4,X5) ∖ ddef.S) ∪ input.S)) ⊆ (X1,X2,X3,X4) .
Figure 8.3: Transformation D.5 of toSSA (from the appendix).
the 2-partition X = (X4,X5). Of those, we observe (by P2, P7) that only members of X4 are
live-on-entry (i.e. in (X ∖ ddef.S) ∪ (X ∩ input.S)).
We now need to prepare initial and final instances for X4 and X5. For the former, we observe
how Q1 and P3 imply that the set of initial instances XL4i must be disjoint from the set XL4f of
final instances, whereas for the latter only final instances (XL5f ) are required. From P2, P3 and
P4 it is clear that all instances must be fresh (i.e. disjoint from (X,Y )).
Thus, the toSSA algorithm can be applied as follows:
Let S be any given statement; let X := def.S and Y := (glob.S ∖ X) be a 2-partition of
glob.S; let X4 := (X ∖ ddef.S) ∪ (X ∩ input.S) and X5 := X ∖ X4 be a 2-partition of X;
let (XL4i,XL4f,XL5f ) := fresh.((X4,X4,X5),(X,Y )) be three sets of fresh instances — according
to property Q2 of our definition of fresh (see Section 3.1.3) we indeed get the required
(XL4i,XL4f,XL5f ) ∩ (X,Y ) = ∅ — and finally let
S′ := toSSA.(S,(X4,X5),XL4i,(XL4f,XL5f ),Y,(XL4i,XL4f,XL5f )) and let XLim := glob.S′ ∖
(Y,XL4i,XL4f,XL5f ) be the set of all intermediate instances; we then observe
S
= {intro aux. liveness info.; intro. dead assignment;
intro. self-assignment; assignment-based sub.}
(S ; XL4f,XL5f := X4,X5 ; X4,X5 := XL4f,XL5f )[live X4,X5]
= {prop. liveness info.}
((S ; XL4f,XL5f := X4,X5)[live XL4f,XL5f ]
; X4,X5 := XL4f,XL5f )[live X4,X5]
= {Q1 of Transformation D.5: P1-P7 hold by construction (as justified above)}
((XL4i := X4 ; S′)[live XL4f,XL5f ] ; X4,X5 := XL4f,XL5f )[live X4,X5]
= {remove aux. liveness info.}
(XL4i := X4 ; S′ ; X4,X5 := XL4f,XL5f )[live X4,X5]
= {def. of live : def. of XLim; note XLim ∩ (X4,X5) = ∅ due to
Q2 of toSSA (X ∩ glob.S′ = ∅)}
|[ var XL4i,XL4f,XL5f,XLim ; XL4i := X4 ; S′ ; X4,X5 := XL4f,XL5f ]| .
8.6.2 Back from SSA
As the derivation above is made of program equations, the reversed derivation is used for returning
from SSA. However, returning from SSA is more general since we wish to return even after having
made some transformations on the immediate SSA version. In effect, returning from SSA involves
Let S′ be any core statement and (XL1i ∪ XL2f ) ⊆ XLs; let S be a statement defined by
S := merge-vars.(S′,XLs,X,XL1i,XL2f,Y ); then (Q1:)
(XL1i := X1 ; S′)[live XL2f,Y ] = (S ; XL2f := X2)[live XL2f,Y ]
and (Q2:) XLs ∩ glob.S = ∅
provided
P1: glob.S′ ⊆ (XLs,Y ),
P2: (XL1i ∪ XL2f ) ⊆ XLs,
P3: (X1 ∪ X2) ⊆ X,
P4: X ∩ (XLs,Y ) = ∅,
P5: no two instances of any member of X are simultaneously-live at any point in S′[live XL2f,Y ],
P6: (XLs ∩ ((XL2f ∖ ddef.S′) ∪ input.S′)) ⊆ XL1i,
P7: no def-on-live: i.e. no instance is defined where another instance is live-on-exit,
P8: no multiple-defs: i.e. each assignment defines at most one instance (of any X.i).
Figure 8.4: Transformation D.6 of merge-vars (from the appendix).
the merge of all instances of an original program variable. This way, all definitions of pseudo
instances become redundant self-assignments and are hence removed.
Accordingly, the fromSSA algorithm is defined to call the more general merge-vars algorithm, as
derived in Transformation D.6 in Appendix D and repeated here as Figure 8.4. Thus, fromSSA
is defined as
fromSSA.(S′,X,XL1i,XLf,Y,XLs) ≜ merge-vars.(S′,XLs,X,XL1i,XLf,Y )
where S′ is a statement in SSA form with respect to variables X, variables X1 are live-on-entry
with corresponding initial instances XL1i, XLf are final instances of X, variables Y are non-SSA
program variables, and XLs is the complete set of instances of X.
In the following, we show that the toSSA algorithm is invertible, with fromSSA its inverse.
8.6.3 SSA is de-SSA-able
When a statement in SSA form can be converted back, merging all instances of each transformed
variable into its original name, we say it is de-SSA-able. In the following theorem we prove that
toSSA yields a de-SSA-able statement. This result is not surprising. It was expected and is stated
here for ‘sanity checking’. However, the theorem will actually become useful a little later, when
proving de-SSA-ability of SSA-based slices. There, we shall combine the result of this theorem
with an observation that slicing preserves de-SSA-ability.
Theorem 8.4. Let S be any core statement and let X, Y := (def.S),(glob.S ∖ def.S), X1 :=
(X ∩ ((X ∖ ddef.S) ∪ input.S)) and (XL1i,XLf ) := fresh.((X1,X),(X,Y )); let S′ be the SSA
version of S, defined as S′ := toSSA.(S,X,XL1i,XLf,Y,(XL1i,XLf )); then S′ is de-SSA-able.
That is, all preconditions, P1-P8, of the fromSSA algorithm hold for
S′′ := fromSSA.(S′,X,XL1i,XLf,Y,XLs) where XLs := ((XL1i,XLf ) ∪ (def.S′ ∖ Y )).
The proof can be found in Appendix D.
8.7 Summary
A program representation of non-contiguous statements has been defined along with an operation
for merging such statements. Slides — following an original program execution metaphor that has
been introduced — encompass control dependences. In the context of our simply structured
language, this is done in a syntax-directed manner. The complementary notion of data dependences
has been captured by a relation of slide dependence. Together, those will take part in computing
slices, in the next chapter.
For any given statement S, the function slides.S takes a set of variables, say V, and yields a
statement (slides.S.V ) which includes the union of individual-assignment slides of all assignments
to variables in V. That way, we have avoided the need to formalise labels of internal program
points (for distinguishing one assignment of a variable from another).
Instead, the finer-grained level of individual-assignment slides has been made accessible through
the development of transformations to and from the popular static single assignment (SSA) form.
Slides of a particular instance of a variable, on the SSA form, correspond to the individual assign-
ment of that instance, on the original program. The SSA form and its related slides will help in
the next chapter to turn a naive flow-insensitive slicing algorithm into a flow-sensitive one.
Chapter 9
A Slicing Algorithm
This chapter develops a provably correct slicing algorithm. The algorithm is based on the
observation that a slide-independent collection of slides yields a semantically correct slice.
The algorithm’s development will consist of two stages. A first attempt will produce crude
(i.e. too large) slices. Then, by adopting the refined program representation of SSA-based slides,
the same algorithm will be shown to produce refined (i.e. smaller, more accurate and desirable)
slices.
9.1 Flow-insensitive slicing
The observation that independent slides yield correct slices is proved in the following.
Theorem 9.1. Let S, V be a core statement and set of variables, respectively. Then provided V
is slide independent in S (i.e. glob.(slides.S.V ) ∩ def.S ⊆ V ), slides.S.V is a slice-refinement of S
with respect to V.
Proof. The proof is by induction on the structure of S. We assume that for any slip T of S (for
which slides.T.V is independent in T, as is guaranteed by Lemma 9.2), we have [wp.T.Q ⇒
wp.(slides.T.V ).Q] for all Q with glob.Q ∩ def.S ⊆ V. We then prove that provided V is slide
independent in S, we have [wp.S.P ⇒ wp.(slides.S.V ).P] for all such P with glob.P ∩ def.S ⊆ V.
First, if V ∩ def.S = ∅ we observe for all P with glob.P ∩ def.S ⊆ V (i.e. glob.P ∩ def.S = ∅):
wp.(slides.S.V ).P
= {slides when V ∩ def.S = ∅}
wp.skip.P
= {wp of skip}
P
⇐ {pred. calc.}
P ∧ wp.S.true
= {RE3: glob.P ∩ def.S = ∅}
wp.S.P .
In the remaining cases we shall assume V ∩ def.S ≠ ∅.
S = X := E: We observe for all P with glob.P ∩ def.S ⊆ V
wp.(X := E).P
= {wp of ‘:=’}
P[X\E]
= {remove redundant subs.: let X1 := X ∩ V and the proviso ensures
glob.P ∩ X ⊆ X1}
P[X1\E1]
= {wp of ‘:=’}
wp.(X1 := E1).P
= {slides of ‘:=’}
wp.(slides.(X := E).V ).P .
S = S1;S2: We observe for all P with glob.P ∩ def.S ⊆ V
wp.(S1;S2).P
= {wp of ‘;’}
wp.S1.(wp.S2.P)
⇒ {ind. hypo.: glob.P ∩ def.S ⊆ V and V is slide ind. in S2}
wp.S1.(wp.(slides.S2.V ).P)
⇒ {ind. hypo.: glob.(wp.(slides.S2.V ).P) ∩ def.S ⊆ V due to RE2
since both glob.P ∩ def.S ⊆ V and
input.(slides.S2.V ) ∩ def.S ⊆ V (slide ind. of V in S2);
V is slide ind. in S1}
wp.(slides.S1.V ).(wp.(slides.S2.V ).P)
= {wp of ‘;’}
wp.((slides.S1.V );(slides.S2.V )).P
= {slides of ‘;’}
wp.(slides.(S1;S2).V ).P .
S = if B then S1 else S2: We observe for all P with glob.P ∩ def.S ⊆ V
wp.(if B then S1 else S2).P
= {wp of IF}
(B ⇒ wp.S1.P) ∧ (¬B ⇒ wp.S2.P)
⇒ {ind. hypo., twice: glob.P ∩ def.S ⊆ V and V is slide ind. in both S1 and S2}
(B ⇒ wp.(slides.S1.V ).P) ∧ (¬B ⇒ wp.(slides.S2.V ).P)
= {wp of IF}
wp.(if B then slides.S1.V else slides.S2.V ).P
= {slides of IF}
wp.(slides.(if B then S1 else S2).V ).P .
S = while B do S1 od: We observe for all P with glob.P ∩ def.S ⊆ V
wp.(while B do S1 od).P
= {wp of DO: [k.Q ≡ (B ∨ P) ∧ (¬B ∨ wp.S1.Q)]}
(∃ i: 0 ≤ i: kⁱ.false)
⇒ {see below; [l.Q ≡ (B ∨ P) ∧ (¬B ∨ wp.(slides.S1.V ).Q)]}
(∃ i: 0 ≤ i: lⁱ.false)
= {wp of DO with l as above}
wp.(while B do (slides.S1.V ) od).P
= {slides of DO}
wp.(slides.(while B do S1 od).V ).P .
We finish by proving, for the second step above, by induction, that [kⁱ.false ⇒ lⁱ.false] holds for
all i, provided [wp.S1.P ⇒ wp.(slides.S1.V ).P] for all P with glob.P ∩ def.S ⊆ V (induction
hypothesis above).
The base case (i = 0) is trivial ([false ⇒ false], recall the definition of function iteration).
Then, for the induction step, we assume [kⁱ.false ⇒ lⁱ.false] and prove [kⁱ⁺¹.false ⇒ lⁱ⁺¹.false].
kⁱ⁺¹.false
= {def. of func. it.}
k.(kⁱ.false)
= {def. of k}
(B ∨ P) ∧ (¬B ∨ wp.S1.(kⁱ.false))
⇒ {ind. hypo.}
(B ∨ P) ∧ (¬B ∨ wp.S1.(lⁱ.false))
⇒ {slide ind., proviso and glob.(lⁱ.false) ∩ def.S ⊆ V since
((glob.B ∪ glob.P ∪ input.S1) ∩ def.S) ⊆ V}
(B ∨ P) ∧ (¬B ∨ wp.(slides.S1.V ).(lⁱ.false))
= {def. of l}
l.(lⁱ.false)
= {def. of func. it.}
lⁱ⁺¹.false .
Lemma 9.2. Let S be any core statement (i.e. no local variable scopes); let V be a set of slide-
independent variables (in S); let T be any slip of S; then V is also slide independent in T. That
is,
glob.(slides.T.V ) ∩ def.T ⊆ V .
Proof.
glob.(slides.T.V ) ∩ def.T
⊆ {Lemma 9.3}
glob.(slides.S.V ) ∩ def.T
⊆ {Lemma 9.4}
glob.(slides.S.V ) ∩ def.S
⊆ {proviso (V is slide ind. in S)}
V .
The proofs of the remaining lemmata are given in Appendix C.
Lemma 9.3. Let S be any core statement; let T be any slip of S and let V be any set of variables;
then
glob.(slides.T.V ) ⊆ glob.(slides.S.V ) .
Lemma 9.4. Let S be a core statement; let T be any slip of S; then
def.T ⊆ def.S .
Given a core statement S and variables of interest V, compute the flow-insensitive slice,
fi-slice.S.V, as follows:
fi-slice.S.V ≜ slides.S.V∗
where V∗ := slides-dep-rtc.S.V, with slides-dep-rtc.S.V defined recursively as follows:
slides-dep-rtc.S.V ≜ if U ⊆ V then V else slides-dep-rtc.S.(V ∪ U)
where U := glob.(slides.S.V ) ∩ def.S .
Figure 9.1: A flow-insensitive slicing algorithm.
9.1.1 The algorithm
Our flow-insensitive slicing algorithm is given in Figure 9.1. Given a core statement S and a set
of variables V, the algorithm first computes the smallest possible slide-independent superset V∗
of V (i.e. the reflexive transitive closure of the slide dependence of S on V, as in Section 8.5.1);
then, the union of slides of S on V∗, i.e. the statement slides.S.V∗, is produced as the slice of S
on V.
The algorithm is correct in the sense that, for any S and V, we get fi-slice.S.V as a valid slice
of S with respect to V. That is, requirements Q1 and Q2 of slicing both hold.
For the former (S[live V ] ⊑ (fi-slice.S.V )[live V ]), it suffices to show that S ⊑V∗ fi-slice.S.V
holds. This is so due to Theorem 9.1 with V := V∗ (which gives S ⊑V∗ fi-slice.S.V ) and since
V ⊆ V∗ (by construction of slides-dep-rtc.S.V ). To see why this results in S ⊑V fi-slice.S.V,
recall the definition of slice-refinement, from which we indeed get for all predicates P with glob.P
⊆ V ⊆ V∗ the required [wp.S.P ⇒ wp.(fi-slice.S.V ).P].
The latter (def.(fi-slice.S.V ) ⊆ def.S) holds since all defined variables in any set of slides of S
are also defined in S itself, as the following lemma (which is proved in Appendix C) shows.
Lemma 9.5. Let S, V be any core statement and set of variables, respectively; then
def.(slides.S.V ) ⊆ def.S .
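Under the assumed tuple AST of Chapter 8's sketches, fi-slice is a one-liner on top of slides and slides_dep_rtc; the tangled sum/prod program of the next subsection (with each expression reduced to the set of variables it reads, which suffices for these analyses) illustrates how the dependence closure drags i in.

def fi_slice(stmt, V):
    """Figure 9.1: the union of slides of the dependence closure of V."""
    return slides(stmt, slides_dep_rtc(stmt, V))

# The sum/prod program of Section 9.1.2 (a is the array, read-only):
prog = ('seq',
        ('assign', (('i', frozenset()), ('sum', frozenset()))),
        ('seq',
         ('while', frozenset({'i', 'a'}),
          ('assign', (('i', frozenset({'i'})),
                      ('sum', frozenset({'sum', 'a', 'i'}))))),
         ('seq',
          ('assign', (('i', frozenset()), ('prod', frozenset()))),
          ('while', frozenset({'i', 'a'}),
           ('assign', (('i', frozenset({'i'})),
                       ('prod', frozenset({'prod', 'a', 'i'}))))))))

assert slides_dep_rtc(prog, frozenset({'sum'})) == {'i', 'sum'}
# fi_slice(prog, {'sum'}) therefore keeps every assignment to i,
# including the second loop: the crudeness discussed in Section 9.1.2.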
9.1.2 Example
The crudeness of this flow-insensitive slicing is demonstrated with the following example. Slicing
for sum on the first program below would yield the second.
i,sum := 0,0
; while i<a.length do
i,sum := i+1,sum+a[i]
od
; i,prod := 0,1
; while i<a.length do
i,prod := i+1,prod*a[i]
od

i,sum := 0,0
; while i<a.length do
i,sum := i+1,sum+a[i]
od
; i := 0
; while i<a.length do
i := i+1
od
The second loop is unnecessarily included since the slide of sum depends on the slide of i, which
in turn includes all assignments to i (along with their enclosing control structures). Nevertheless,
this result is still a correct slice, with respect to sum. But we can do better, as is shown next.
9.2 Make it flow-sensitive using SSA-based slides
Now that we have a provably correct slicing algorithm, we refine it by making it sensitive to
control flow. This will lead to (potentially) smaller, more accurate slices. (As a bonus, the refined
algorithm will be applicable for the more traditional backward slicing, in which slicing criteria
may refer to internal program points. However, this kind of slicing will not be pursued.)
The key to turning the flow-insensitive slicing algorithm into a flow-sensitive one lies in the
splitting of variables. The previous algorithm, in an attempt to stay on the safe side, added the slides of
all assignments to a variable as soon as any of those slides was needed — completely ignoring
the program point(s) in which the value of that variable was used and the order of execution of
substatements.
By splitting a variable into several instances, such that each definition point introduces a new
variable, as in our SSA-like form (see Section 8.6 of the preceding chapter), the existing algorithm
can gain flow-sensitivity.
9.2.1 Formal derivation of flow-sensitive slicing
Take a core statement S and any set of variables of interest V. Transform S to SSA form and take
the flow-insensitive slice of the final instances of V. Finally, all instances can be merged (back to
the original name).
S[live V ]
= {Q1 of Transformation D.5 with
S′ := toSSA.(S,(V, coV ),(VL1i, coVL1i),(VLf, coVLf ),
ND,(VL1i, coVL1i, VLf, coVLf )), def.S = (V, coV ),
ND := glob.S ∖ (V, coV ), V1, coV1 := (V ∩ input.S),(coV ∩ input.S) and
(VL1i, coVL1i, VLf, coVLf ) := fresh.((V1, coV1, V, coV ), glob.S)}
(VL1i, coVL1i := V1, coV1 ; S′ ; V, coV := VLf, coVLf )[live V ]
= {remove dead assignment (Law 23): coV ∩ V = ∅}
(VL1i, coVL1i := V1, coV1 ; S′ ; V := VLf )[live V ]
= {prop. liveness info. (Law 20)}
((VL1i, coVL1i := V1, coV1 ; S′)[live VLf ] ; (V := VLf ))[live V ]
= {prop. liveness info. (Law 20)}
((VL1i, coVL1i := V1, coV1 ; S′[live VLf ])[live VLf ] ; (V := VLf ))[live V ]
⊑ {SV′ := fi-slice.S′.VLf (Q1 of fi-slice)}
((VL1i, coVL1i := V1, coV1 ; SV′[live VLf ])[live VLf ] ; (V := VLf ))[live V ]
= {prop. liveness info. (Law 20)}
((VL1i, coVL1i := V1, coV1 ; SV′)[live VLf ] ; (V := VLf ))[live V ]
= {Q1 of Transformation D.6, Theorem 9.6: DLs := glob.S′ ∖ ND,
SV := fromSSA.(SV′,(V, coV ),(VL1i, coVL1i),(VLf, coVLf ), ND, DLs )}
((SV ; VLf := V )[live VLf ] ; (V := VLf ))[live V ]
= {prop. liveness info. (Law 20)}
(SV ; (VLf := V ) ; (V := VLf ))[live V ]
= {assignment-based sub. (Law 18): VLf ∩ V = ∅}
(SV ; (VLf := V ) ; (V := V ))[live V ]
= {remove redundant self-assignment (Law 2)}
(SV ; (VLf := V ))[live V ]
= {remove dead assignment (Law 24): VLf ∩ V = ∅}
SV [live V ] .
The success of fromSSA (in the derivation above) depends on the validity of preconditions
P1-P8 of Transformation D.6. This is indeed guaranteed as is shown in the following.
9.2.2 An SSA-based slice is de-SSA-able
Let S′ be the SSA version of a core statement S. Then, the slicing algorithm, once operated on
the set of slides slides.S′, is flow-sensitive. For such slices to be correct syntax-preserving slices,
we need to show they are de-SSA-able.
Theorem 9.6. Any slide-independent statement from the SSA version of any core statement is
de-SSA-able.
That is, let S be any core statement and let X, Y := (def.S),(glob.S ∖ def.S), X1 := (X ∩ ((X ∖
ddef.S) ∪ input.S)) and (XL1i,XLf ) := fresh.((X1,X),(X,Y )); let S′ be the SSA version of S,
defined as S′ := toSSA.(S,X,XL1i,XLf,Y,(XL1i,XLf )); let XLs := ((XL1i,XLf ) ∪ (def.S′ ∖
Y )) be the full set of instances (of X, in S′) and let XLI be any (slide-independent) subset of
those instances, with final instances XL2f := XLI ∩ XLf ; finally let SI′ := slides.S′.XLI be the
corresponding (slide-independent) statement; then SI′ is de-SSA-able. That is, all preconditions,
P1-P8, of the fromSSA algorithm hold for SI := fromSSA.(SI′,X,XL1i,XL2f,Y,XLs).
The proof can be found in Appendix D.
9.2.3 The refined algorithm
Following the derivation above, our SSA-based flow-sensitive slicing algorithm is given in
Figure 9.2. A given program S is translated into its corresponding SSA version S′; the flow-insensitive
slice SV′ of S′ is taken with respect to final instances VLf of V ; finally, SV′ is translated back
from SSA by merging all instances.
The algorithm is correct in the sense that, for any core (and hence deterministic) statement
S and set of variables V, we get slice.S.V as a valid slice of S with respect to V. That is,
requirements Q1 and Q2 of slicing both hold.
The former (S[live V ] ⊑ (slice.S.V )[live V ]) follows from the derivation above. Note that,
indirectly, this property is a consequence of the corresponding Q1 of fi-slice.
For the latter (def.(slice.S.V ) ⊆ def.S), we need to investigate the effects of toSSA, fi-slice
and fromSSA on defined variables. Let SV := slice.S.V and let Vd := V ∩ def.S such that
def.S = (Vd, coV ). Thus we need to show def.SV ⊆ (Vd, coV ).
Firstly, since def.S = (Vd, coV ), we observe def.S′ involves instances of (Vd, coV ) exclusively.
Secondly, Q2 of fi-slice ensures def.SV′ ⊆ def.S′. Finally, since all instances of (Vd, coV ) in SV′
are successfully merged (see Q2 of fromSSA and Theorem 9.6), we end up with def.SV ⊆ (Vd, coV )
as required.
Given a core statement S and variables of interest V, compute the flow-sensitive slice, slice.S.V,
as follows:
slice.S.V ≜ fromSSA.(SV′,(V, coV ),(VL1i, coVL1i),(VLf, coVLf ), ND, DLs )
where coV := def.S ∖ V,
SV′ := fi-slice.S′.VLf,
S′ := toSSA.(S,(V, coV ),(VL1i, coVL1i),(VLf, coVLf ), ND,(VL1i, coVL1i, VLf, coVLf )),
V1, coV1 := (V ∩ input.S),(coV ∩ input.S),
DLs := glob.S′ ∖ ND,
(VL1i, coVL1i, VLf, coVLf ) := fresh.((V1, coV1, V, coV ), glob.S)
and ND := glob.S ∖ (V, coV ) .
Figure 9.2: An SSA-based flow-sensitive slicing algorithm.
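The composition of Figure 9.2 can be sketched as a thin wrapper over the earlier fi_slice; here to_ssa and from_ssa are assumed helpers with invented signatures (hypothetical stand-ins for Transformations D.5 and D.6), so this shows the pipeline only, not the SSA transformations themselves.

def slice_(stmt, V, to_ssa, from_ssa):
    """Flow-sensitive slicing: into SSA, fi-slice on final instances, back.
    to_ssa : stmt, V -> (ssa_stmt, final_instances_of_V, bookkeeping)
    from_ssa : ssa_stmt, bookkeeping -> stmt   (both assumed, not shown)"""
    ssa_stmt, v_final, meta = to_ssa(stmt, V)
    sliced = fi_slice(ssa_stmt, v_final)   # the naive slicer, now accurate
    return from_ssa(sliced, meta)          # merge instances back (Theorem 9.6)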
9.2.4 Example
In contrast to the flow-insensitive slicer (as demonstrated in Section 9.1.2), this refined algorithm
would yield an accurate slice for sum, as is shown here (the original program followed by its slice):
i,sum := 0,0
; while i<a.length do
i,sum := i+1,sum+a[i]
od
; i,prod := 0,1
; while i<a.length do
i,prod := i+1,prod*a[i]
od

i,sum := 0,0
; while i<a.length do
i,sum := i+1,sum+a[i]
od
The first step of the algorithm is to turn the program into SSA form:
|[var i1,i2,i3,i4,i5,i6,sum1,sum2,sum3,prod4,prod5,prod6
; i1,sum1:= 0,0
; i2,sum2:= i1,sum1
; while i2<a.length do
i3,sum3:= i2+1,sum2+a[i2]
; i2,sum2:= i3,sum3
od
; i4,prod4:= 0,1
; i5,prod5:= i4,prod4
; while i5<a.length do
i6,prod6:= i5+1,prod5*a[i5]
; i5,prod5:= i6,prod6
od
; i,sum,prod := i5,sum2,prod5
]|
At this point, the flow-insensitive algorithm slices the middle part above for the final instance
sum2 of sum. The slide of sum2 depends on the slides of sum1, i2 and sum3; i2 in turn depends on
{i1,i2,i3} whereas sum3 depends on {i2,sum2}. We thus get {i1,i2,i3,sum1,sum2,sum3} as the
reflexive transitive closure of slide dependence on sum2, hence the smallest slide-independent set
enclosing {sum2}, yielding the following program:
; i1,sum1:= 0,0
; i2,sum2:= i1,sum1
; while i2<a.length do
i3,sum3:= i2+1,sum2+a[i2]
; i2,sum2:= i3,sum3
od
Returning from SSA would then, as desired, yield
i,sum := 0,0
; while i<a.length do
i,sum := i+1,sum+a[i]
od
Similarly, requesting the slice of prod from this SSA-based slicing algorithm would identify
the second loop alone, whereas the earlier naive algorithm would unnecessarily add the first loop
(for i).
Indeed, the example above was specifically chosen to highlight the differences between our
naive flow-insensitive slicer and the SSA-based one. Nevertheless, it should be noted that had
the two loops been tangled into one, the slicer (in fact both slicers) would still identify the desired,
accurate slice.
9.3 Slice extraction revisited
9.3.1 The transformation
With the SSA-based flow-sensitive slicing algorithm, we are for the first time in a position to
constructively express a sliding transformation. We base the transformation on Refinement 7.4 and
produce the slice of the variables for extraction as the extracted code and the slice of the remaining
variables as the complement.
Transformation 9.7. Let S be any core statement and let V be any (user selected) set of
variables to be extracted; then
S ⊑
|[ var iV, icoV, fV ; iV, icoV := V′, coV
; SV
; fV := V′
;
V′, coV := iV, icoV
; ScoV
; V′ := fV ]|
where V′ := V ∩ def.S,
coV := def.S ∖ V′,
(iV, icoV, fV ) := fresh.((V′, coV, V′),(V ∪ glob.S)),
SV := slice.S.V′
and ScoV := slice.S.coV .
Proof.
S
⊑ {Refinement 7.4: (V′, coV ) = def.S by def. of V′, coV ;
S[live V′] ⊑ SV [live V′] by Q1 of slice;
def.SV ⊆ def.S by Q2 of slice; similarly
S[live coV ] ⊑ ScoV [live coV ] by Q1 of slice;
def.ScoV ⊆ def.S by Q2 of slice}
(iV, icoV := V′, coV ; SV ; fV := V′
; V′, coV := iV, icoV ; ScoV ; V′ := fV )[live V′, coV ]
= {def. of live : (def.SV ∪ def.ScoV ) ⊆ (V′, coV )
and (iV, icoV, fV ) ∩ (V′, coV ) = ∅}
|[ var iV, icoV, fV
; iV, icoV := V′, coV ; SV ; fV := V′
; V′, coV := iV, icoV ; ScoV ; V′ := fV ]| .
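As a sketch under stated assumptions (fresh names are simulated with i_/f_ prefixes rather than the thesis's fresh function, and any slicer satisfying Q1 and Q2 may be plugged in; the runnable fi_slice is used as a default), Transformation 9.7 assembles the refined program as follows.

def seq(*stmts):
    """Right-nested sequential composition of tuple-AST statements."""
    out = stmts[-1]
    for s in reversed(stmts[:-1]):
        out = ('seq', s, out)
    return out

def copy_to(dsts, srcs):
    """dsts := srcs as one multiple assignment."""
    return ('assign', tuple((d, frozenset({s})) for d, s in zip(dsts, srcs)))

def sliding(stmt, V, slicer=fi_slice):
    """Transformation 9.7 (sketch): slice for V, then the complement."""
    V0 = sorted(frozenset(V) & defs(stmt))
    coV = sorted(defs(stmt) - frozenset(V))
    iV, icoV = ['i_' + x for x in V0], ['i_' + x for x in coV]
    fV = ['f_' + x for x in V0]                  # assumed-fresh backups
    SV, ScoV = slicer(stmt, frozenset(V0)), slicer(stmt, frozenset(coV))
    return seq(copy_to(iV + icoV, V0 + coV),     # iV,icoV := V,coV
               SV,                               # extracted slice of V
               copy_to(fV, V0),                  # fV := V
               copy_to(V0 + coV, iV + icoV),     # V,coV := iV,icoV
               ScoV,                             # the complement
               copy_to(V0, fV))                  # V := fV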
9.3.2 Evaluation and discussion
With the above transformation, for example, the computation of sum in the scenario of Section 7.1,
can be correctly untangled from that of prod, as desired.
Our current approach has been inspired by the tucking transformation [40]. In comparing
Tuck to sliding, we first observe that global variables are inherently unsupported by Tuck, and
whenever a live-on-exit variable is defined in both the extracted slice and its complement, the
transformation has to be rejected. For example, when untangling sum and prod as was just
recalled from Section 7.1, the loop variable iis defined in both loops of the resulting program;
had the final value of ibeen used after the loop (e.g. in computing the average), Tuck would have
been rejected.
Our semantic framework, in contrast, is expressive enough to avoid such limitations. The
importance of this improvement over tucking is highlighted by the observation that in the presence
of global variables, and in order to avoid the need for full program analysis (i.e. beyond the context
of extraction), one has to assume all those variables are, indeed, live-on-exit.
Another notable difference between Tuck and sliding is in the construction of the complement.
Their complement is the slice from all non-extracted statements, whereas we slice from the end
of scope, on all non-extracted variables. This approach has been inspired by Gallagher’s view
of a program as a union of slices [22]. There, a program maintenance process, based on that
view, is formalised along with the dependences between various slices. Consequently, conditions
are derived for detecting non-interference of changes on a set of slices. Thus, some changes, e.g.
when debugging, can be performed on a subprogram — leaving the merge of those changes into the
full program to an accompanying tool — with a confidence that eliminates the need for, e.g.,
regression testing.
In comparing Tuck’s and sliding’s approaches, we note that, on the one hand, their complement
would include slices from dead statements, if present. This in turn might lead to unnecessary
duplication and possible rejection. Since the slice of all defined variables on a given statement, as
in Gallagher’s view, will never include such dead code, its presence would have no effect on our
approach.
On the other hand, Tuck’s complement has the potential of being more accurate than that of
the current version of sliding. This might be the case whenever any possibly-final-definition of
a non-extracted variable y (i.e. a definition that may reach the end of scope) is included in the
extracted code. Tuck’s complement will include such a definition only if it is indirectly relevant
for other non-extracted statements, whereas ours would definitely include it.
In order to understand the implications of this problem, we further investigate it, distinguishing
two cases. Firstly, if all possibly-final-definitions of such a variable, y, are extracted, the full
slice on y is guaranteed to be included in the extracted code. In such cases, we solve the problem
by including y in the set of extracted variables (see Chapter 12, where an optimal sliding
transformation is sought).
Secondly, in cases where at least one such definition of y was extracted, and at least one
other definition was not, we must distinguish two sub-cases. If y is live-on-exit, they will reject
the transformation. Otherwise, their complement may indeed be smaller, since we assume all
variables are live-on-exit. Accordingly, our sliding transformation may benefit from deriving a
simple corollary for the case in which the live-on-exit variables are explicitly given; there,
the complement should be composed of the slice of all non-extracted and live-on-exit variables.
We conclude that our results so far enjoy Tuck’s untangling ability with improved applicability
and comparable levels of duplication. Further reduction in such duplication is still possible, as is
explained next.
A valid criticism (of both Tuck and our current version of sliding) is that the levels of code
duplication are potentially too high. This is highlighted by Komondoor in his PhD thesis [37].
For example, if the extracted variable’s final value is used in the complement, the entire slice would
be duplicated.
As was explained in Chapter 2, Komondoor’s alternative solutions (KH00 and KH03 with
Horwitz [38, 39], as well as a variation on KH03 in his PhD thesis [37]) were not designed for
untangling by slice extraction and are hence not applicable in our immediate context. Nevertheless,
ideas from his approach have inspired and contributed to our decomposition of slides and the
corresponding improvements to sliding, as will be explored shortly.
9.4 Summary
This chapter has developed a slicing algorithm, based on the observation that slide-independent
sets of slides (from the previous chapter) yield correct slices. The SSA-based slicing algorithm has
allowed a constructive formulation of a first sliding transformation, based on the refinement rule
of semantic slice extraction (from two chapters back).
This version of sliding has been shown to enjoy Tuck’s untangling abilities, with improved
applicability, while suffering, as Tuck does, from potential over-duplication. Reductions in such duplication
will be explored and formalised in the next chapter.
Chapter 10
Co-Slicing: Advanced Duplication
Reduction in Sliding
In the preceding chapters, a basic slice-extraction refactoring has been introduced. Its automation
through a sliding transformation has been discussed, along with correctness issues. The
problematic duplication of the whole program in scope, as in our initial formulation, has subsequently
been reduced by slicing both the extracted code and its complement.
However, the levels of duplication introduced by sliding, so far, are still too high. This is so
in cases where both the slice and its complement share some computation, but instead of reusing
this computation’s extracted result in the complement, the computation’s code ends up being
duplicated.
In this chapter, an advanced strategy for reducing the levels of duplication is proposed,
formalised and applied to sliding.
10.1 Over-duplication: an example
As an example of over-duplication, consider the following sliding of sum.
i,sum,prod := 0,0,1
; while i<a.length do
i,sum,prod :=
i+1,sum+a[i],prod*a[i]
od
; out << sum
; out << prod
In applying Transformation 9.7 with V:= {sum}, we note that the whole extracted code ends
up being duplicated in the complement, unnecessarily.
|[var ii,isum,iprod,iout,fsum
; ii,isum,iprod,iout:=i,sum,prod,out
; i,sum := 0,0
; while i<a.length do
i,sum :=
i+1,sum+a[i]
od
; fsum:=sum
;
i,sum,prod,out:=ii,isum,iprod,iout
; i,sum,prod := 0,0,1
; while i<a.length do
i,sum,prod :=
i+1,sum+a[i],prod*a[i]
od
; out << sum
; out << prod
; sum:=fsum
]|
10.2 Final-use substitution
We propose to further reduce duplication in sliding through what we call final-use substitution. A
final use is a reference to a variable’s value (e.g. sum in the example above), at a program point
where it is guaranteed to hold its final value (e.g. where sum is appended to out ).
If the slice of the variable under discussion has been extracted through sliding, that final
value might be available in the complement (e.g. in backup variable fsum), saving us the need to
recompute it.
CHAPTER 10. CO-SLICING 107
10.2.1 Example
To demonstrate the workings of final-use substitution, we return to the example (from above) of
computing and printing the sum and product of an array of integers. As was already mentioned,
trying to extract sum using our currently proposed transformation would lead to duplication of the
whole extracted slice, unnecessarily.
One way of avoiding that duplication is to use fsum in the complement.
|[var ii,isum,iprod,iout,fsum
; ii,isum,iprod,iout:=i,sum,prod,out
; i,sum,prod := 0,0,1
; while i<a.length do
i,sum,prod :=
i+1,sum+a[i],prod*a[i]
od
; out << sum
; out << prod
; fsum:=sum
;
i,sum,prod,out:=ii,isum,iprod,iout
; i,sum,prod := 0,0,1
; while i<a.length do
i,sum,prod :=
i+1,sum+a[i],prod*a[i]
od
; out << fsum
; out << prod
; sum:=fsum
]|
Note that not all uses of sum were replaced with fsum; the use inside the loop is not of a final
value, and must not be replaced.
As before (in Transformation 9.7), the above version of statement duplication, with final-use
substitution, has the potential of introducing dead code, which can subsequently be removed. At
this point, slicing (for {sum} in the extracted code and {i,prod,out} in the complement) would
successfully remove the repeated computation of sum, leading to:
|[var ii,isum,iprod,iout,fsum
; ii,isum,iprod,iout:=i,sum,prod,out
; i,sum := 0,0
; while i<a.length do
i,sum :=
i+1,sum+a[i]
od
; fsum:=sum
;
i,sum,prod,out:=ii,isum,iprod,iout
; i, prod := 0, 1
; while i<a.length do
i, prod :=
i+1, prod*a[i]
od
; out << fsum
; out << prod
; sum:=fsum
]|
10.2.2 Deriving the transformation
Final-use substitution can be formalised in the following way. Starting with S ; {V = fV }
where fV ∩ glob.S = ∅, we transform S into S′ := S[final-use V\fV ], demanding S ; {V = fV } =
S′ ; {V = fV }. Statement S′ will be using variables in fV instead of V at points to which
the corresponding assertion can be propagated.
The full derivation of S[final-use V\fV ] is given in Appendix E; the resulting transformation
is given in Figure 10.1.
With final-use substitution constructively defined, we now turn to derive an advanced solution
to slice extraction via sliding.
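Before doing so, it is worth seeing the substitution at work: the following sketches the algorithm of Figure 10.1 (floated below) on the tuple AST of the earlier chapters, with the substitution given as a map m from X to its assumed-fresh X′; the representation, as before, is our own.

def rename(reads, m):
    """Apply the final-use renaming to a set of read variables."""
    return frozenset(m.get(x, x) for x in reads)

def final_use(stmt, m):
    """S[final-use X\\X'] of Figure 10.1, for the map m : X -> X'."""
    kind = stmt[0]
    if kind == 'skip':
        return stmt
    if kind == 'assign':      # targets (X1) are redefined here: not final uses
        targets = {x for x, _ in stmt[1]}
        m2 = {x: y for x, y in m.items() if x not in targets}
        return ('assign', tuple((x, rename(r, m2)) for x, r in stmt[1]))
    if kind == 'seq':         # in S1, only X2 := X \ def.S2 holds final values
        m1 = {x: y for x, y in m.items() if x not in defs(stmt[2])}
        return ('seq', final_use(stmt[1], m1), final_use(stmt[2], m))
    if kind == 'if':          # the guard may only use X2 := X \ def.(S1,S2)
        m2 = {x: y for x, y in m.items()
              if x not in (defs(stmt[2]) | defs(stmt[3]))}
        return ('if', rename(stmt[1], m2),
                final_use(stmt[2], m), final_use(stmt[3], m))
    if kind == 'while':       # inside the loop, only X2 := X \ def.S1 is final
        m2 = {x: y for x, y in m.items() if x not in defs(stmt[2])}
        return ('while', rename(stmt[1], m2), final_use(stmt[2], m2))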
10.3 Advanced sliding
10.3.1 Statement duplication with final-use substitution
In the following, we show that any core statement S is equivalent to its duplicated version, in
which the computation of variables (Vr, Vnr ) is separated from that of the complementary set
coV (such that def.S = (Vr, Vnr, coV )), and variables Vr are offered for reuse in the complement,
through the backup variables of their final values, fVr .
Program equivalence 10.1. Let S, Vr, Vnr, coV, iVr, iVnr, icoV, fVr, fVnr be a core statement
and eight sets of variables, respectively; then
Let S be a core statement and X a set of variables, to be substituted by a corresponding set X′ of
fresh variables; the final-use substitution of S on X with X′ is defined, by cases of S, as follows:
(X1,Y := E1,E2)[final-use X1,X2\X1′,X2′] ≜
X1,Y := E1[X2\X2′],E2[X2\X2′] where
X = (X1,X2), X ∩ Y = ∅ and X′ = (X1′,X2′) ;
(S1;S2)[final-use X1,X2\X1′,X2′] ≜
S1[final-use X2\X2′] ; S2[final-use X1,X2\X1′,X2′] where
X1 := X ∩ def.S2, X2 := X ∖ X1 and
X1′,X2′ are the corresponding subsets of X′ ;
(if B then S1 else S2)[final-use X1,X2\X1′,X2′] ≜
if B[X2\X2′] then S1[final-use X1,X2\X1′,X2′]
else S2[final-use X1,X2\X1′,X2′] where
X1 := X ∩ def.(S1,S2), X2 := X ∖ X1 and
X1′,X2′ are the corresponding subsets of X′ ; and
(while B do S1 od)[final-use X1,X2\X1′,X2′] ≜
while B[X2\X2′] do S1[final-use X2\X2′] od where
X1 := X ∩ def.S1, X2 := X ∖ X1 and
X2′ is the subset of X′ corresponding to X2 .
Figure 10.1: An algorithm of final-use substitution.
S =
(iVr, iVnr, icoV := Vr, Vnr, coV
; S
; fVr, fVnr := Vr, Vnr
;
Vr, Vnr, coV := iVr, iVnr, icoV
; S[final-use Vr\fVr ]
; Vr, Vnr := fVr, fVnr )[live Vr, Vnr, coV ]
provided def.S = (Vr, Vnr, coV )
and (iVr, iVnr, icoV, fVr, fVnr ) ∩ glob.S = ∅ .
Proof.
S
= {prepare for statement duplication (Lemma 6.2) with
V, iV, fV := (Vr, Vnr ),(iVr, iVnr ),(fVr, fVnr ):
provisos def.S = (Vr, Vnr, coV ) and (iVr, iVnr, icoV, fVr, fVnr ) ∩ glob.S = ∅}
(iVr, iVnr, icoV := Vr, Vnr, coV ; S ; fVr, fVnr := Vr, Vnr
; Vr, Vnr := fVr, fVnr )[live Vr, Vnr, coV ]
= {intro. following assertion (Law 7) with X, Y, E1, E2 := fVnr, fVr, Vnr, Vr:
(fVr, fVnr ) ∩ Vr = ∅ due to both provisos and RE5}
(iVr, iVnr, icoV := Vr, Vnr, coV ; S ; fVr, fVnr := Vr, Vnr
; {Vr = fVr } ; Vr, Vnr := fVr, fVnr )[live Vr, Vnr, coV ]
= {statement duplication (Lemma 6.3) with
V, iV, fV := (Vr, Vnr ),(iVr, iVnr ),(fVr, fVnr ): provisos}
(iVr, iVnr, icoV := Vr, Vnr, coV ; S ; fVr, fVnr := Vr, Vnr
; Vr, Vnr, coV := iVr, iVnr, icoV ; S ; {Vr = fVr }
; Vr, Vnr := fVr, fVnr )[live Vr, Vnr, coV ]
= {final-use sub.: correct by construction (see Figure 10.1 and Appendix E) and
variables fVr are indeed fresh (proviso fVr ∩ glob.S = ∅)}
(iVr, iVnr, icoV := Vr, Vnr, coV ; S ; fVr, fVnr := Vr, Vnr
; Vr, Vnr, coV := iVr, iVnr, icoV ; S[final-use Vr\fVr ] ; {Vr = fVr }
; Vr, Vnr := fVr, fVnr )[live Vr, Vnr, coV ]
= {remove aux. assertion (Lemma 10.2, see below) with
V, iV, fV := (Vr, Vnr ),(iVr, iVnr ),(fVr, fVnr )}
(iVr, iVnr, icoV := Vr, Vnr, coV ; S ; fVr, fVnr := Vr, Vnr
; Vr, Vnr, coV := iVr, iVnr, icoV ; S[final-use Vr\fVr ] ; Vr, Vnr := fVr, fVnr )
[live Vr, Vnr, coV ] .
Lemma 10.2. Let S be any core statement with def.S = (V,coV), Vr ⊆ V (and fVr the corresponding
subset of fV) and (iV,icoV,fV) ∩ glob.S = ∅; we then have
iV,icoV := V,coV
; S
; fV := V
;
V,coV := iV,icoV
; S[final-use Vr \ fVr]
; {Vr = fVr}
=
iV,icoV := V,coV
; S
; fV := V
;
V,coV := iV,icoV
; S[final-use Vr \ fVr]
The proof of that lemma is given in Appendix E.
10.3.2 Slicing after final-use substitution
The statement duplication with final-use substitution, as in the program equivalence above, can
now be followed up by slicing, as was done earlier in Chapter 7.
Refinement 10.3. Let S,SV,ScoV,Vr,Vnr,coV,iVr,iVnr,icoV,fVr,fVnr be three core statements
and eight sets of variables, respectively; then
S ⊑
(iVr,iVnr,icoV := Vr,Vnr,coV
; SV
; fVr,fVnr := Vr,Vnr
;
Vr,Vnr,coV := iVr,iVnr,icoV
; ScoV
; Vr,Vnr := fVr,fVnr)[live Vr,Vnr,coV]
provided def.S = (Vr,Vnr,coV),
S[live Vr,Vnr] ⊑ SV[live Vr,Vnr],
S[final-use Vr \ fVr][live coV] ⊑ ScoV[live coV],
def.SV ⊆ def.S,
def.ScoV ⊆ def.S and
(iVr,iVnr,icoV,fVr,fVnr) ∩ glob.S = ∅.
Proof.
S
= {duplicate statement (Program equivalence 10.1):
   being a core statement, S is indeed deterministic,
   def.S = (Vr,Vnr,coV) and (iVr,iVnr,icoV,fVr,fVnr) ∩ glob.S = ∅ (proviso)}
(iVr,iVnr,icoV := Vr,Vnr,coV ; S ; fVr,fVnr := Vr,Vnr
; Vr,Vnr,coV := iVr,iVnr,icoV ; S[final-use Vr \ fVr]
; Vr,Vnr := fVr,fVnr)[live Vr,Vnr,coV]
⊑ {Refinement 7.5 with
   S′,V,iV,fV := S[final-use Vr \ fVr],(Vr,Vnr),(iVr,iVnr),(fVr,fVnr):
   def.S[final-use Vr \ fVr] = def.S since only uses are replaced by final-use sub.}
(iVr,iVnr,icoV := Vr,Vnr,coV ; SV ; fVr,fVnr := Vr,Vnr
; Vr,Vnr,coV := iVr,iVnr,icoV ; ScoV
; Vr,Vnr := fVr,fVnr)[live Vr,Vnr,coV].
10.3.3 Definition of co-slicing
Observing the requirements S[final-use Vr \ fVr][live coV] ⊑ ScoV[live coV] and def.ScoV ⊆ def.S
above, we formalise co-slices as follows:
Definition 10.4 (Complement-Slice (or Co-Slice)). Let S be a core (and hence deterministic)
statement and V be a set of variables for extraction; let Vr be a subset of V to be made reusable
through fresh variables fVr. Any statement ScoV for which the two requirements
(Q1:) S[final-use Vr \ fVr][live coV] ⊑ ScoV[live coV] and
(Q2:) def.ScoV ⊆ def.S
both hold is a correct co-slice of S with respect to V, Vr and fVr.
A co-slicing algorithm (see Figure 10.2) is consequently derived from the above definition
and the corresponding properties Q1 and Q2 of slicing. From those properties, the algorithm’s
correctness follows.
10.3.4 The sliding transformation
The refinement rule from above, along with the formal definition of co-slices and the corresponding
constructive co-slicing algorithm, are now combined to yield an advanced sliding transformation.
Transformation 10.5. Let S be any core statement and let Vr,Vnr be any two disjoint (user-selected)
sets of variables to be extracted, with Vr to be made available for reuse in the complement;
then
Let S,V,Vr,fVr be a core statement and three sets of variables, respectively. The function
co-slice for statement S with respect to V,Vr,fVr is defined as follows:
co-slice.S.V.Vr.fVr ≜ slice.S[final-use Vr \ fVr].coV
    where coV := def.S \ V
    provided Vr ⊆ V
    and fVr ∩ (V ∪ glob.S) = ∅.
Figure 10.2: A co-slicing algorithm, based on slicing and final-use substitution.
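Continuing the Python sketch, Figure 10.2 transcribes directly; slice_of and glob are
assumed helpers (a slicing algorithm and a free-variables function), fVr is a dictionary
mapping each member of Vr to its fresh backup name, and final_use is the sketch given
after Figure 10.1.

    def co_slice(S, V, Vr, fVr):
        # provisos of Figure 10.2: Vr is a subset of V, fVr is fresh
        assert set(Vr) <= set(V)
        assert not (set(fVr.values()) & (set(V) | glob(S)))
        coV = defs(S) - set(V)                 # the complementary results
        return slice_of(final_use(S, set(Vr), fVr), coV)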
S ⊑
|[var iVr,iVnr,icoV,fVr,fVnr
; iVr,iVnr,icoV := Vr′,Vnr′,coV
; SV
; fVr,fVnr := Vr′,Vnr′
;
Vr′,Vnr′,coV := iVr,iVnr,icoV
; ScoV
; Vr′,Vnr′ := fVr,fVnr
]|
where Vr′,Vnr′ := (Vr ∩ def.S),(Vnr ∩ def.S),
coV := def.S \ (Vr′,Vnr′),
(iVr,iVnr,icoV,fVr,fVnr) := fresh.((Vr′,Vnr′,coV,Vr′,Vnr′),((Vr,Vnr) ∪ glob.S)),
SV := slice.S.(Vr′,Vnr′)
and ScoV := co-slice.S.(Vr′,Vnr′).Vr′.fVr.
Proof.
S
⊑ {Refinement 10.3: (Vr′,Vnr′,coV) = def.S by def. of Vr′,Vnr′,coV;
   S[live Vr′,Vnr′] ⊑ SV[live Vr′,Vnr′] by Q1 of slice;
   def.SV ⊆ def.S by Q2 of slice; similarly
   S[final-use Vr′ \ fVr][live coV] ⊑ ScoV[live coV] by Q1 of co-slice;
   def.ScoV ⊆ def.S by Q2 of co-slice}
(iVr,iVnr,icoV := Vr′,Vnr′,coV ; SV ; fVr,fVnr := Vr′,Vnr′
; Vr′,Vnr′,coV := iVr,iVnr,icoV ; ScoV
; Vr′,Vnr′ := fVr,fVnr)[live Vr′,Vnr′,coV]
= {def. of live: (def.SV ∪ def.ScoV) ⊆ (V′,coV) (again, Q2 of slice and co-slice)
   and (iVr,iVnr,icoV,fVr,fVnr) ∩ (Vr′,Vnr′,coV) = ∅}
|[var iVr,iVnr,icoV,fVr,fVnr;
iVr,iVnr,icoV := Vr′,Vnr′,coV ; SV ; fVr,fVnr := Vr′,Vnr′;
Vr′,Vnr′,coV := iVr,iVnr,icoV ; ScoV ; Vr′,Vnr′ := fVr,fVnr]|.
10.4 Summary
This chapter has introduced an advanced sliding transformation in which the complement reuses
a selection of extracted results, thus yielding a potentially smaller complement, or, as we call it,
a co-slice. Co-slicing has been formalised through a so-called final-use substitution. Constructive
definitions of that substitution, and hence of a co-slicing algorithm, have been developed.
In comparison to our earlier sliding transformation, the advanced version potentially duplicates
less code. However, the price takes the form of extra compensatory code, due to final-use
substitution renaming some used variables in the complement. This renaming was introduced in
order to avoid name clashes; in general we are opposed to it, in an attempt to keep the resulting
program as close to the original as possible. However, since our co-slicing algorithm involves the
removal of dead code by slicing, after final-use substitution, some renaming can potentially be
undone.
The elimination of redundant compensatory code, after sliding, will be pursued in the next
chapter. In particular, undoing the renaming of final-use substitution will be formalised, thus
yielding the concepts of compensation-free co-slices and compensation-free sliding.
Chapter 11
Penless Sliding
When sliding is expected to maintain all variable names (i.e. no renaming), it is not the case that
any final-use substitution yields a valid co-slice. The notion of a compensation-free (or penless)
co-slice is introduced in this chapter. Moreover, a general improvement of sliding by eliminating
redundant backup variables is explored, ultimately leading to the formulation of (the conditions
for) a completely penless sliding transformation. The elimination of backup variables is based on
a liveness-analysis-related approach to variable merging, along the lines of our merge-vars algorithm
(Appendix D), which has been applied for the return from SSA (in Section 8.6.2).
11.1 Eliminating redundant backup variables
We begin by detecting and eliminating redundant backup variables. When sliding variables
(Vr,Vnr) away from coV on statement S (with def.S ⊆ (Vr,Vnr,coV) and with Vr available
for reuse in the complement), we naively introduce backup variables (iVr,iVnr,icoV) for initial
values, and fVr,fVnr for the final values of extracted variables.
However, some of those backup variables might in fact be redundant and should hence be
removed.
11.1.1 Motivation
Why should those be removed? Following practical considerations, we note that such backup
would require an unnecessarily large amount of storage space, and the two operations of making
the backup and retrieving it would have an unwanted impact on execution time.
Furthermore, suppose we waive our language assumption that any variable is cloneable, and
in turn strengthen sliding's preconditions to ensure no uncloneable variable is actually cloned. In
that context, the removal of redundant backup variables will be crucial for the applicability of
sliding.
Finally, as was hinted above, when renaming of variables in the complement must be avoided,
the removal of redundant backup variables will allow such renaming to be undone.
11.1.2 Example
Recall the co-slicing example from the preceding chapter (Section 10.2.1). There, asking to
extract and reuse the variable sum (i.e. when Vr,Vnr,coV := {sum},∅,{i,prod,out} in applying
Transformation 10.5) from the program of Section 10.1 gave us the following result:
|[var ii,isum,iprod,iout,fsum
; ii,isum,iprod,iout:=i,sum,prod,out
; i,sum := 0,0
; while i<a.length do
i,sum :=
i+1,sum+a[i]
od
; fsum:=sum
;
i,sum,prod,out:=ii,isum,iprod,iout
; i, prod := 0, 1
; while i<a.length do
i, prod :=
i+1, prod*a[i]
od
; out << fsum
; out << prod
; sum:=fsum
]|
Here, the sets of backup variables of initial values iVr,iVnr,icoV are {isum},∅,{ii,iprod,iout},
respectively, and the backup of final values fVr,fVnr is {fsum},∅, respectively. Which of the
backup variables {ii,isum,iprod,iout,fsum} are redundant?
11.1.3 Dead-assignments-elimination and variable-merging
We remove redundant backup variables by combining dead-assignments-elimination with the merging
of such backup variables with their corresponding original variables. Recall our merge-vars
algorithm (as mentioned for returning from SSA; see Section 8.6.2 and Appendix D). According to
that approach, merging members of iVr,iVnr,icoV with corresponding members of Vr,Vnr,coV
is possible if they are never simultaneously live, never defined in the same assignment, and one is
never defined in an assignment where the other is live-on-exit.
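As a sketch only, the merge-vars criterion just quoted can be checked per pair of variables
over a list of assignment records; the record layout (dicts with 'defs' and 'live_out' keys)
is an assumption of this illustration, and liveness is approximated here at assignment exits.

    def can_merge(v, b, assignments):
        # v: original variable, b: its backup; assignments carry the sets of
        # variables they define and the variables live on exit from them
        for a in assignments:
            if v in a['defs'] and b in a['defs']:
                return False              # defined in the same assignment
            if (v in a['defs'] and b in a['live_out']) or \
               (b in a['defs'] and v in a['live_out']):
                return False              # one defined where the other is live
            if v in a['live_out'] and b in a['live_out']:
                return False              # simultaneously live
        return True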
Definition in the same assignment (e.g. of sum and its backup isum or fsum) is not possible
after sliding, since the backup variables are defined only in designated statements. Furthermore,
since we precede this step by dead-assignments-elimination, cases of def-on-live may only occur
in conjunction with simultaneous liveness; the defined variable must be live too, or its definition
would have been removed. (Note that at this stage, dead-assignments-elimination removes
unused backup variables of initial values only, e.g. {ii,isum,iprod} above, but not iout, since the
retrieval from backup of all final values, e.g. the sum := fsum above, renders such backup, e.g.
fsum, live at its point of initialisation, e.g. the fsum := sum above.)
So it is simultaneous liveness that we should worry about. Backup variables for initial values
(e.g. iout; all the others have already been removed) are live from entry to the extracted slice all
the way to the exit from the initialisation of the backup of final values. There, the defined extracted
variables V′ (e.g. {sum}) are used, and their corresponding live initial backup variables (none of
those in our example, since isum is gone) must remain, as they are simultaneously live on exit
from the extracted slice. On the other hand, members of coV (e.g. {i,prod,out}) are live in
the extracted slice SV only if in glob.SV. Thus, the backup variables for non-extracted initial
variables that do not occur free in the extracted slice can be merged with the corresponding
original variables. Hence, in our example, out and iout can be merged.
Initially, after sliding, the backup variables of all final values (e.g. fsum) are live-on-exit from
the complement, and hence also when initialised, but the corresponding program variables (i.e.
members of V′, e.g. sum) are not, since they are defined there. If those are neither used nor
defined in the complement, the need for backup disappears. That is, backup variables of final values
of V′ \ glob.ScoV should be merged with the corresponding original variables. In our example,
hence, sum can be merged with fsum.
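The two merging rules just derived can be summarised as a sketch, again with glob as the
assumed free-variables helper: initial backups are redundant for non-extracted variables
absent from the extracted slice, and final backups for extracted variables absent from the
complement.

    def redundant_backups(SV, ScoV, coV, V_ext):
        # coV: non-extracted variables; V_ext: extracted variables V'
        mergeable_initial = {x for x in coV if x not in glob(SV)}
        mergeable_final = {x for x in V_ext if x not in glob(ScoV)}
        return mergeable_initial, mergeable_final

In the running example, redundant_backups would report out (so iout merges with out)
and sum (so fsum merges with sum).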
The following is the resulting sliding refinement rule, after eliminating redundant backup variables.
Refinement 11.1. Let S,SV,ScoV,Vr,Vnr,coV,iVr,iVnr,icoV,fVr,fVnr be three core statements
and eight sets of variables, respectively; then
S ⊑
(iVr1,iVnr1,icoV11 :=
    Vr1,Vnr1,coV11
; SV
; fVr1,fVr2,fVnr1,fVnr2 :=
    Vr1,Vr2,Vnr1,Vnr2
;
Vr1,Vnr1,coV11 :=
    iVr1,iVnr1,icoV11
; ScoV[fVr3 \ Vr3]
; Vr1,Vr2,Vnr1,Vnr2 :=
    fVr1,fVr2,fVnr1,fVnr2)[live Vr,Vnr,coV]
provided def.S = (Vr,Vnr,coV),
S[live Vr,Vnr] ⊑ SV[live Vr,Vnr],
S[final-use Vr \ fVr][live coV] ⊑ ScoV[live coV],
def.SV ⊆ def.S,
def.ScoV ⊆ def.S,
(iVr,iVnr,icoV,fVr,fVnr) ∩ glob.S = ∅,
(Vr1,Vr2,Vr3) =
((Vr ∩ input.ScoV),(Vr ∩ (def.ScoV \ input.ScoV)),(Vr \ glob.ScoV)),
with (iVr1,iVr2,iVr3) the corresponding subsets of iVr
and with (fVr1,fVr2,fVr3) the corresponding subsets of fVr,
(Vnr1,Vnr2,Vnr3) =
((Vnr ∩ input.ScoV),(Vnr ∩ (def.ScoV \ input.ScoV)),(Vnr \ glob.ScoV))
with (iVnr1,iVnr2,iVnr3) the corresponding subsets of iVnr
and with (fVnr1,fVnr2,fVnr3) the corresponding subsets of fVnr and
(coV11,coV12,coV2) =
((coV ∩ def.SV ∩ input.ScoV),(coV ∩ (input.ScoV \ def.SV)),(coV \ input.ScoV))
with (icoV11,icoV12,icoV2) the corresponding subsets of icoV.
Proof.
S
⊑ {Refinement 10.3}
(iVr,iVnr,icoV := Vr,Vnr,coV ; SV ; fVr,fVnr := Vr,Vnr
; Vr,Vnr,coV := iVr,iVnr,icoV ; ScoV ; Vr,Vnr := fVr,fVnr)
[live Vr,Vnr,coV]
= {liveness analysis: Vr1,Vnr1,coV1 =
   (Vr ∩ input.ScoV),(Vnr ∩ input.ScoV),(coV ∩ input.ScoV)}
(((iVr,iVnr,icoV := Vr,Vnr,coV ; SV ; fVr,fVnr := Vr,Vnr
; Vr,Vnr,coV := iVr,iVnr,icoV)[live fVr,fVnr,Vr1,Vnr1,coV1]
; ScoV)[live fVr,fVnr,coV] ; Vr,Vnr := fVr,fVnr)[live Vr,Vnr,coV]
= {remove dead assignments, see below (big step 1)}
((iVr1,iVnr1,icoV1 := Vr1,Vnr1,coV1 ; SV ; fVr,fVnr := Vr,Vnr
; Vr1,Vnr1,coV1 := iVr1,iVnr1,icoV1)[live fVr,fVnr,Vr1,Vnr1,coV1]
; ScoV ; Vr,Vnr := fVr,fVnr)[live Vr,Vnr,coV]
= {(coV1,icoV1) = ((coV11,coV12),(icoV11,icoV12))}
((iVr1,iVnr1,icoV11,icoV12 := Vr1,Vnr1,coV11,coV12
; SV ; fVr,fVnr := Vr,Vnr
; Vr1,Vnr1,coV11,coV12 := iVr1,iVnr1,icoV11,icoV12)
[live fVr,fVnr,Vr1,Vnr1,coV1]
; ScoV ; Vr,Vnr := fVr,fVnr)[live Vr,Vnr,coV]
= {eliminate redundant backup of initial values, see below (big step 2)}
((iVr1,iVnr1,icoV11 := Vr1,Vnr1,coV11 ; SV ; fVr,fVnr := Vr,Vnr
; Vr1,Vnr1,coV11 := iVr1,iVnr1,icoV11)[live fVr,fVnr,Vr1,Vnr1,coV1]
; ScoV ; Vr,Vnr := fVr,fVnr)[live Vr,Vnr,coV]
= {remove liveness info.; (Vr,Vnr,fVr,fVnr) =
   ((Vr1,Vr2,Vr3),(Vnr1,Vnr2,Vnr3),(fVr1,fVr2,fVr3),(fVnr1,fVnr2,fVnr3))}
(iVr1,iVnr1,icoV11 := Vr1,Vnr1,coV11 ; SV
; fVr1,fVr2,fVr3,fVnr1,fVnr2,fVnr3 := Vr1,Vr2,Vr3,Vnr1,Vnr2,Vnr3
; Vr1,Vnr1,coV11 := iVr1,iVnr1,icoV11 ; ScoV
; Vr1,Vr2,Vr3,Vnr1,Vnr2,Vnr3 := fVr1,fVr2,fVr3,fVnr1,fVnr2,fVnr3)
[live Vr,Vnr,coV]
= {eliminate redundant backup of final values, see below (big step 3)}
(iVr1,iVnr1,icoV11 := Vr1,Vnr1,coV11 ; SV
; fVr1,fVr2,fVnr1,fVnr2 := Vr1,Vr2,Vnr1,Vnr2
; Vr1,Vnr1,coV11 := iVr1,iVnr1,icoV11 ; ScoV[fVr3 \ Vr3]
; Vr1,Vr2,Vnr1,Vnr2 := fVr1,fVr2,fVnr1,fVnr2)[live Vr,Vnr,coV].
For big step 1 above, in which dead assignments are removed, we observe
(iVr,iVnr,icoV := Vr,Vnr,coV ; SV ; fVr,fVnr := Vr,Vnr
; Vr,Vnr,coV := iVr,iVnr,icoV)[live fVr,fVnr,Vr1,Vnr1,coV1]
= {remove dead assignments}
(iVr,iVnr,icoV := Vr,Vnr,coV ; SV ; fVr,fVnr := Vr,Vnr
; Vr1,Vnr1,coV1 := iVr1,iVnr1,icoV1)[live fVr,fVnr,Vr1,Vnr1,coV1]
= {liveness analysis}
((iVr,iVnr,icoV := Vr,Vnr,coV)[live iVr1,iVnr1,icoV1]
; SV ; fVr,fVnr := Vr,Vnr
; Vr1,Vnr1,coV1 := iVr1,iVnr1,icoV1)[live fVr,fVnr,Vr1,Vnr1,coV1]
= {remove dead assignments}
((iVr1,iVnr1,icoV1 := Vr1,Vnr1,coV1)[live iVr1,iVnr1,icoV1]
; SV ; fVr,fVnr := Vr,Vnr
; Vr1,Vnr1,coV1 := iVr1,iVnr1,icoV1)[live fVr,fVnr,Vr1,Vnr1,coV1]
= {remove liveness info.}
(iVr1,iVnr1,icoV1 := Vr1,Vnr1,coV1 ; SV ; fVr,fVnr := Vr,Vnr
; Vr1,Vnr1,coV1 := iVr1,iVnr1,icoV1)[live fVr,fVnr,Vr1,Vnr1,coV1].
For big step 2 above, in which we eliminate redundant backup of initial values, we observe
(iVr1,iVnr1,icoV1 := Vr1,Vnr1,coV1 ; SV ; fVr,fVnr := Vr,Vnr
; Vr1,Vnr1,coV11,coV12 := iVr1,iVnr1,icoV11,icoV12)
[live fVr,fVnr,Vr1,Vnr1,coV1]
= {split assignment: (iVr1,iVnr1,icoV11) ∩ coV12 = ∅}
(iVr1,iVnr1,icoV11 := Vr1,Vnr1,coV11 ; icoV12 := coV12
; SV ; fVr,fVnr := Vr,Vnr
; Vr1,Vnr1,coV11,coV12 := iVr1,iVnr1,icoV11,icoV12)
[live fVr,fVnr,Vr1,Vnr1,coV1]
= {swap statements: icoV12 ∩ (glob.SV ∪ (fVr,fVnr,Vr,Vnr)) = ∅ and
   (def.SV,fVr,fVnr) ∩ (icoV12,coV12) = ∅}
(iVr1,iVnr1,icoV11 := Vr1,Vnr1,coV11 ; SV ; fVr,fVnr := Vr,Vnr
; icoV12 := coV12 ; Vr1,Vnr1,coV11,coV12 := iVr1,iVnr1,icoV11,icoV12)
[live fVr,fVnr,Vr1,Vnr1,coV1]
= {assignment-based sub.: coV12 ∩ icoV12 = ∅}
(iVr1,iVnr1,icoV11 := Vr1,Vnr1,coV11 ; SV ; fVr,fVnr := Vr,Vnr
; icoV12 := coV12 ; Vr1,Vnr1,coV11,coV12 := iVr1,iVnr1,icoV11,coV12)
[live fVr,fVnr,Vr1,Vnr1,coV1]
= {remove redundant self-assignment;
   remove dead assignment: icoV12 ∩ (fVr,fVnr,iVr1,iVnr1,icoV11) = ∅}
(iVr1,iVnr1,icoV11 := Vr1,Vnr1,coV11 ; SV ; fVr,fVnr := Vr,Vnr
; Vr1,Vnr1,coV11 := iVr1,iVnr1,icoV11)[live fVr,fVnr,Vr1,Vnr1,coV1].
Finally, for big step 3 above, in which we eliminate redundant backup of final values, we observe
(writing ISV for the prefix iVr1,iVnr1,icoV11 := Vr1,Vnr1,coV11 ; SV)
(ISV ; fVr1,fVr2,fVr3,fVnr1,fVnr2,fVnr3 := Vr1,Vr2,Vr3,Vnr1,Vnr2,Vnr3
; Vr1,Vnr1,coV11 := iVr1,iVnr1,icoV11 ; ScoV
; Vr1,Vr2,Vr3,Vnr1,Vnr2,Vnr3 := fVr1,fVr2,fVr3,fVnr1,fVnr2,fVnr3)
[live Vr,Vnr,coV]
= {split assignment: (fVr1,fVr2,fVnr1,fVnr2) ∩ (Vr3,Vnr3) = ∅}
(ISV ; fVr1,fVr2,fVnr1,fVnr2 := Vr1,Vr2,Vnr1,Vnr2
; fVr3,fVnr3 := Vr3,Vnr3 ; Vr1,Vnr1,coV11 := iVr1,iVnr1,icoV11 ; ScoV
; Vr1,Vr2,Vr3,Vnr1,Vnr2,Vnr3 := fVr1,fVr2,fVr3,fVnr1,fVnr2,fVnr3)
[live Vr,Vnr,coV]
= {swap statements:
   (fVr3,fVnr3) ∩ (Vr1,Vnr1,coV11,iVr1,iVnr1,icoV11) = ∅ and
   (Vr1,Vnr1,coV11) ∩ (fVr3,fVnr3,Vr3,Vnr3) = ∅}
(ISV ; fVr1,fVr2,fVnr1,fVnr2 := Vr1,Vr2,Vnr1,Vnr2
; Vr1,Vnr1,coV11 := iVr1,iVnr1,icoV11 ; fVr3,fVnr3 := Vr3,Vnr3 ; ScoV
; Vr1,Vr2,Vr3,Vnr1,Vnr2,Vnr3 := fVr1,fVr2,fVr3,fVnr1,fVnr2,fVnr3)
[live Vr,Vnr,coV]
= {assignment-based sub.: Vr3 ∩ (fVr3 ∪ glob.ScoV) = ∅ and
   fVr3 ∩ def.ScoV = ∅}
(ISV ; fVr1,fVr2,fVnr1,fVnr2 := Vr1,Vr2,Vnr1,Vnr2
; Vr1,Vnr1,coV11 := iVr1,iVnr1,icoV11 ; fVr3,fVnr3 := Vr3,Vnr3
; ScoV[fVr3 \ Vr3]
; Vr1,Vr2,Vr3,Vnr1,Vnr2,Vnr3 := fVr1,fVr2,fVr3,fVnr1,fVnr2,fVnr3)
[live Vr,Vnr,coV]
= {swap statements:
   (fVr3,fVnr3) ∩ glob.ScoV[fVr3 \ Vr3] = ∅ and
   def.ScoV[fVr3 \ Vr3] ∩ (fVr3,fVnr3,Vr3,Vnr3) = ∅}
(ISV ; fVr1,fVr2,fVnr1,fVnr2 := Vr1,Vr2,Vnr1,Vnr2
; Vr1,Vnr1,coV11 := iVr1,iVnr1,icoV11 ; ScoV[fVr3 \ Vr3]
; fVr3,fVnr3 := Vr3,Vnr3
; Vr1,Vr2,Vr3,Vnr1,Vnr2,Vnr3 := fVr1,fVr2,fVr3,fVnr1,fVnr2,fVnr3)
[live Vr,Vnr,coV]
= {assignment-based sub.: (fVr3,fVnr3) ∩ (Vr3,Vnr3) = ∅}
(ISV ; fVr1,fVr2,fVnr1,fVnr2 := Vr1,Vr2,Vnr1,Vnr2
; Vr1,Vnr1,coV11 := iVr1,iVnr1,icoV11 ; ScoV[fVr3 \ Vr3]
; fVr3,fVnr3 := Vr3,Vnr3
; Vr1,Vr2,Vr3,Vnr1,Vnr2,Vnr3 := fVr1,fVr2,Vr3,fVnr1,fVnr2,Vnr3)
[live Vr,Vnr,coV]
= {remove redundant self-assignment;
   remove dead assignment: (fVr3,fVnr3) ∩ (fVr1,fVr2,fVnr1,fVnr2,coV) = ∅}
(ISV ; fVr1,fVr2,fVnr1,fVnr2 := Vr1,Vr2,Vnr1,Vnr2
; Vr1,Vnr1,coV11 := iVr1,iVnr1,icoV11 ; ScoV[fVr3 \ Vr3]
; Vr1,Vr2,Vnr1,Vnr2 := fVr1,fVr2,fVnr1,fVnr2)[live Vr,Vnr,coV].
For our example above, we end up with
; i,sum := 0,0
; while i<a.length do
i,sum :=
i+1,sum+a[i]
od
;
i, prod := 0, 1
; while i<a.length do
i, prod :=
i+1, prod*a[i]
od
; out << sum
; out << prod
Note how in the complement, this time, the original variable sum is used where fsum was used
before. This was made possible by the successful merging of sum with its two backup variables
isum and fsum.
11.2 Compensation-free (or penless) co-slicing
Since we consider the renaming of variables (by final-use substitution, when co-slicing) as part
of sliding's compensation, we accordingly consider a co-slice with no renaming, or with all initial
renaming eventually undone, as in the above, a compensation-free co-slice. Since in our metaphor
of slides and sliding, compensatory code is written using a non-permanent pen on top of printed
transparencies, the merging can be thought of as the erasure of such earlier writing.
Hence, compensation-free co-slices will also be termed penless co-slices. Accordingly, the process
of producing such co-slices will be termed penless co-slicing.
We define a penless co-slice to be a co-slice that involves no renaming, and thus no compensation,
in the following way:
penless-co-slice.S.V.Vr ≜ (co-slice.S.V.Vr.fVr)[fVr \ Vr] where
fVr := fresh.(Vr,(V ∪ glob.S)).
Note that since normal substitution is defined only when the new names are fresh, a penless
co-slice is well-defined only when all reused variables are gone (from the co-slice). That is,
penless-co-slice.S.V.Vr is well-defined when Vr ∩ glob.(co-slice.S.V.Vr.fVr) = ∅.
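A sketch of this definition in Python, assuming (as before) the hypothetical helpers
co_slice and glob, plus fresh (returning a dictionary of fresh backup names) and rename
(variable-for-variable substitution on a statement):

    def penless_co_slice(S, V, Vr):
        fVr = fresh(Vr, set(V) | glob(S))      # x -> fresh backup name
        c = co_slice(S, V, Vr, fVr)
        # well-defined only when all reused variables are gone from the co-slice
        assert not (set(Vr) & glob(c)), "penless co-slice undefined here"
        return rename(c, {fVr[x]: x for x in Vr})   # undo the final-use renaming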
Now that the elimination of redundant backup variables and the construction of penless co-slices
have been formalised, we are in a position to derive preconditions for compensation-free sliding,
or penless sliding, as follows.
11.3 Sliding with penless co-slices
The following is a sliding transformation with penless co-slicing and with the elimination of
redundant backup variables:
Transformation 11.2. Let S be any core statement and let Vr,Vnr be any two disjoint (user-selected)
sets of variables to be extracted, with Vr to be made available for reuse in the complement;
then
S ⊑
|[var iVnr1,icoV11,fVnr1,fVnr2
; iVnr1,icoV11 := Vnr1,coV11
; SV
; fVnr1,fVnr2 := Vnr1,Vnr2
;
Vnr1,coV11 := iVnr1,icoV11
; ScoV
; Vnr1,Vnr2 := fVnr1,fVnr2
]|
where Vr′,Vnr′ := (Vr ∩ def.S),(Vnr ∩ def.S),
coV := def.S \ (Vr′,Vnr′),
SV := slice.S.(Vr′,Vnr′),
ScoV := penless-co-slice.S.(Vr′,Vnr′).Vr′,
Vnr1,Vnr2 := (Vnr′ ∩ input.ScoV),(Vnr′ ∩ (def.ScoV \ input.ScoV)),
coV11 := (coV ∩ def.SV ∩ input.ScoV)
and (iVnr1,icoV11,fVnr1,fVnr2) :=
fresh.((Vnr1,coV11,Vnr1,Vnr2),((Vr,Vnr) ∪ glob.S))
provided Vr′ ∩ glob.(co-slice.S.(Vr′,Vnr′).Vr′.fVr) = ∅ for any fresh fVr.
Proof.
S
⊑ {Refinement 11.1 with Vr,Vnr,ScoV := Vr′,Vnr′,co-slice.S.(Vr′,Vnr′).Vr′.fVr
   on fresh fVr: (Vr′,Vnr′,coV) = def.S by def. of Vr′,Vnr′,coV;
   S[live Vr′,Vnr′] ⊑ SV[live Vr′,Vnr′] by Q1 of slice;
   S[final-use Vr′ \ fVr][live coV] ⊑ (co-slice.S.(Vr′,Vnr′).Vr′.fVr)[live coV]
   by Q1 of co-slice;
   def.SV ⊆ def.S by Q2 of slice; similarly
   def.ScoV ⊆ def.S by Q2 of co-slice;
   (iVnr1,icoV11,fVnr1,fVnr2) ∩ glob.S = ∅ by Q1 of fresh;
   note that Vr1,Vr2,Vr3 := ∅,∅,Vr′ due to the proviso;
   consequently iVr1,iVr2,fVr1 and fVr2 are all empty; also note (for our ScoV)
   penless-co-slice.S.(Vr′,Vnr′).Vr′ = (co-slice.S.(Vr′,Vnr′).Vr′.fVr)[fVr \ Vr′]
   by def. of penless-co-slice (which is indeed well-defined due to the proviso)}
(iVnr1,icoV11 := Vnr1,coV11 ; SV ; fVnr1,fVnr2 := Vnr1,Vnr2
; Vnr1,coV11 := iVnr1,icoV11 ; ScoV ; Vnr1,Vnr2 := fVnr1,fVnr2)
[live Vr′,Vnr′,coV]
= {def. of live: (def.SV ∪ def.ScoV) ⊆ (Vr′,Vnr′,coV) (again, Q2 of slice and
   co-slice) and (iVnr1,icoV11,fVnr1,fVnr2) ∩ (Vr′,Vnr′,coV) = ∅}
|[var iVnr1,icoV11,fVnr1,fVnr2;
iVnr1,icoV11 := Vnr1,coV11 ; SV ; fVnr1,fVnr2 := Vnr1,Vnr2
; Vnr1,coV11 := iVnr1,icoV11 ; ScoV ; Vnr1,Vnr2 := fVnr1,fVnr2]|.
11.4 Summary
In this chapter, an approach for reducing compensation after sliding has been developed. Redundant
backup variables of initial and final values have been eliminated. This elimination has been
conducted by the removal of dead assignments and by merging such backup variables with their
original counterparts. When eliminating the backup of reusable final values of extracted variables,
the reused variables had to be absent from the complement. This has been formalised through a
concept of compensation-free co-slices, or penless co-slices.
A penless co-slice is constructed by first introducing reusable variables through final-use substitution,
then slicing for the remaining variables, and finally undoing the substitution. It is
interesting to see that such a co-slice is potentially smaller than the corresponding slice (of non-extracted
variables).
A sliding transformation whose complement is a penless co-slice has been developed. We call
it penless sliding. Moreover, if all backup variables are successfully eliminated when sliding, as
was the case in the given example, we say the result is completely penless.
In this light, the KH approach to arbitrary method extraction (both KH00 [38] and KH03 [39],
apart from some specific treatment of jumps in the latter) can be described as completely penless.
Indeed, the approach taken in the last two chapters, leading to the formulation of penless co-slices
and penless sliding, has been inspired by their algorithms as well as their criticism of Tuck's lack
of data flow from slice to complement (as stated in [39, 37]).
Looking back at the sliding transformations of the current and previous chapters, we note that
the user was asked to provide not only the statement in scope S and the variables for extraction V,
as was the case in the earlier sliding transformation (of Chapter 9), but also the subset Vr of
extracted variables to be reused in the complement. However, our original formulation of slice
extraction (in Definition 1.1) had no mention of such a Vr. When the goal is to extract precisely the
slice of S on V, whilst producing the smallest possible complement, one could ask "which subset
Vr would yield the smallest possible complement?" This question, as well as a related question
on sliding itself, will be treated in the next chapter.
Chapter 12
Optimal Sliding
In previous chapters, all co-slicing related transformations assumed the subset of (extracted) variables
to be made reusable, Vr, is given. In this chapter, however, that assumption is waived.
This immediately raises a variety of optimisation problems. When extracting the computation
of V in S, using a certain co-slice related sliding transformation, which partition of def.S into
((Vr,Vnr),coV) (with (Vr,Vnr) extracted, and Vr offered for reuse) would yield an optimal
result? Surely we need to be more specific in describing what is meant by 'optimal', and which
transformation is being applied.
In this chapter, we focus on sliding with penless co-slices (i.e. Transformation 11.2 from the
preceding chapter). For any given program statement S and set of variables to be extracted V,
an optimal solution will identify a set of variables V′ (possibly larger than V itself, as will be
explained shortly), made of subsets (Vr,Vnr), for which the extracted slice SV′ will be precisely
slice.S.V; its complement, the penless co-slice ScoV, must end up being the smallest possible, in
terms of the number of individual assignments in it.
It should be noted that we do not mean to consider any substatement in our search: finding
such minimal co-slices is in general impossible, just as finding minimal slices is, since it is equivalent
to solving the halting problem [64]. Instead, the goal is to find the smallest out of all possible
results of our specific (penless) co-slicing algorithm. In our quest, the program statement to be
co-sliced shall be given and fixed, whereas the set of variables on which to co-slice, as well as its
subset of variables to be made available for reuse, shall vary.
We begin our search for an optimal solution by devising an algorithm to find the smallest
possible penless co-slice for given S and V. We then complete the solution by observing that the
given V itself is not necessarily the best option for the set of extracted variables V′. As it turns
out, some larger sets may yield precisely the same extracted slice, and with enhanced opportunities
for reuse, thus possibly yielding an even smaller complement.
12.1 The minimal penless co-slice
A statement S and set of variables V have at most 2^N different penless co-slices (with N = |V|).
This is so since any subset Vr of V can be offered for reuse, thus possibly leading to a different
co-slice. (It should however be remembered that not all subsets necessarily yield well-defined
penless co-slices.)
Let the size of a program statement be determined by the number of individual assignments in
it. With this definition, is there (for given S and V, with different reusable subsets Vr) a single
smallest result to our algorithm of penless co-slicing? If so, which subset Vr yields it? And how
can this Vr be found?
Our conjecture is that indeed there is a single smallest penless co-slice (for any given S and
V). How do we find it? Surely, one could try all subsets of V, composing all possible co-slices
and measuring the size of the penless ones. But this algorithm would be very expensive: in terms
of time complexity, it would grow exponentially with |V|. Is there a faster solution?
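For contrast, the exhaustive search just dismissed can be sketched as follows, reusing the
hypothetical helpers of the earlier sketches plus an assumed size helper counting individual
assignments; its inner loop runs 2^|V| times.

    from itertools import combinations

    def smallest_penless_by_brute_force(S, V):
        best = None
        for k in range(len(V) + 1):
            for Vr in map(set, combinations(sorted(V), k)):
                fVr = fresh(Vr, set(V) | glob(S))
                c = co_slice(S, V, Vr, fVr)
                if not (Vr & glob(c)):            # this Vr is penless
                    if best is None or size(c) < size(best):
                        best = c
        return best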
12.1.1 A polynomial-time algorithm
The size of (penless or not) co-slices, for any given S and V, is anti-monotone with respect to the
set Vr of reusable variables. Thus, if Vr1 and Vr are two sets which lead to valid penless co-slices
(in the context S,V; we refer to such sets as 'mergeable', as they can be merged with their
corresponding backup variables after co-slicing), with Vr1 ⊆ Vr ⊆ V, we have
|penless-co-slice.S.V.Vr| ≤ |penless-co-slice.S.V.Vr1|. (Recall that the size of a statement here is the
number of individual assignments.)
In fact, we should be looking for the largest Vr that yields a penless co-slice. Why largest? Due
to the anti-monotonicity of the size of penless co-slices (with respect to the set of reusable variables),
such a penless co-slice will never be larger than the penless co-slice of any other mergeable set of
reusable variables.
Is there only one such largest set? Yes, due to the following definitions and observation.
After co-slicing, the variables in Vr \ glob.(co-slice.S.V.Vr.fVr) are definitely mergeable (i.e.
they can be merged with their corresponding members of fVr) whereas variables in
Vr ∩ glob.(co-slice.S.V.Vr.fVr) are considered non-mergeable. We thus define the set of mergeable
reusable variables (after co-slicing of S,V with reusable Vr ⊆ V and fresh fVr, i.e. fVr ∩ (V ∪
glob.S) = ∅) as
mergeable.S.V.Vr ≜ Vr \ glob.(co-slice.S.V.Vr.fVr). Accordingly, variables in
Vr \ mergeable.S.V.Vr are said to be non-mergeable.
Moreover, when all members of a set of reusable variables Vr are mergeable with respect to
S,V,Vr (i.e. when Vr = mergeable.S.V.Vr, which is the case iff Vr ∩ glob.(co-slice.S.V.Vr.fVr) = ∅),
we say Vr is ‘penless’ with respect to co-slice.S.V.
Lemma 12.1. Penlessness is closed under set-union. That is, if Vr1 and Vr2 are two penless
subsets of V, their union Vr1 ∪ Vr2 is penless too.
Proof. Assuming the two subsets Vr1 and Vr2 of extracted variables V are both penless with
respect to co-slice.S.V, we need to show the union Vr3 := (Vr1 ∪ Vr2) is penless too.
On the one hand, we observe
glob.(co-slice.S.V.Vr3.fVr3)
= {def. of co-slice; let coV := def.S \ V}
glob.(slice.S[final-use Vr3 \ fVr3].coV)
= {stepwise final-use sub. (see Section E.3): let Vr21 := Vr2 \ Vr1}
glob.(slice.S[final-use Vr1 \ fVr1][final-use Vr21 \ fVr21].coV)
⊆ {Lemma 12.2, see below}
fVr21 ∪ glob.(slice.S[final-use Vr1 \ fVr1].coV)
= {def. of co-slice}
fVr21 ∪ glob.(co-slice.S.V.Vr1.fVr1).
Thus Vr1 ∩ glob.(co-slice.S.V.Vr3.fVr3) = ∅ (due to the freshness of fVr21 and the penlessness of Vr1
in co-slice.S.V). On the other hand, we have
glob.(co-slice.S.V.Vr3.fVr3)
= {def. of co-slice; let coV := def.S \ V}
glob.(slice.S[final-use Vr3 \ fVr3].coV)
= {stepwise final-use sub. (see Section E.3): let Vr11 := Vr1 \ Vr2}
glob.(slice.S[final-use Vr2 \ fVr2][final-use Vr11 \ fVr11].coV)
⊆ {again Lemma 12.2, see below}
fVr11 ∪ glob.(slice.S[final-use Vr2 \ fVr2].coV)
= {def. of co-slice}
fVr11 ∪ glob.(co-slice.S.V.Vr2.fVr2).
Thus, similarly to the preceding derivation, Vr2 ∩ glob.(co-slice.S.V.Vr3.fVr3) = ∅. We then conclude
(from set theory and the definition of Vr3) the desired Vr3 ∩ glob.(co-slice.S.V.Vr3.fVr3) = ∅.
Lemma 12.2. Let S,X,fX,Y be any core statement and three sets of variables, respectively;
then
glob.(slice.S[final-use X \ fX].Y) ⊆ fX ∪ glob.(slice.S.Y)
provided fX ∩ ((X,Y) ∪ glob.S) = ∅.
Proof. Recall the definition of slice. There, the given program statement is first translated into
SSA, where it is sliced in a flow-insensitive way, before returning from SSA.
The difference between the SSA versions of S and S[final-use X \ fX] is only in the references
to fX, since final-use substitution changes only uses, and no definition. So both versions have
the same sets of defined variables and the same sets of slides, with potential differences in the
used variables on those slides. These potential differences, in turn, may lead to differences in the
respective relations of slide dependence. Since fX ∩ def.S[final-use X \ fX] = ∅, the introduced uses of
fX do not yield any new slide dependence.
Consequently, representing the relations of slide dependence as a set of pairs, the set of slide
dependences of the SSA version of S[final-use X \ fX] is a (not necessarily strict) subset of the
corresponding set for S itself. Thus, the slide-independent set of YLf (i.e. the closure, under the
reflexive transitive slide-dependence relation, of the set YLf of final instances of Y) of the former
is a subset of the corresponding set of the latter. The result is that, excluding fX itself, and even
after returning from SSA, the set of global variables in the former is a subset of the global variables
in the latter.
Finally, how do we find the largest penless set? The following observations suggest an optimistic
approach.
Lemma 12.3. Mergeability is monotone with respect to co-slicing (in the set of reusable variables).
That is,
mergeable.S.V.Vr1 ⊆ mergeable.S.V.(Vr1,Vr2).
Proof.
mergeable.S.V.Vr1
= {def. of mergeable}
Vr1 \ glob.(co-slice.S.V.Vr1.fVr1)
= {set theory: fVr1 ∩ Vr1 = ∅}
Vr1 \ (glob.(co-slice.S.V.Vr1.fVr1) \ fVr1)
⊆ {see below}
Vr1 \ (glob.(co-slice.S.V.(Vr1,Vr2).(fVr1,fVr2)) \ (fVr1,fVr2))
= {set theory: (fVr1,fVr2) ∩ Vr1 = ∅}
Vr1 \ glob.(co-slice.S.V.(Vr1,Vr2).(fVr1,fVr2))
⊆ {set theory}
(Vr1,Vr2) \ glob.(co-slice.S.V.(Vr1,Vr2).(fVr1,fVr2))
= {def. of mergeable}
mergeable.S.V.(Vr1,Vr2).
A useful property of co-slicing (for the third step above) is
(glob.(co-slice.S.V.(Vr1,Vr2).(fVr1,fVr2)) \ (fVr1,fVr2)) ⊆ (glob.(co-slice.S.V.Vr1.fVr1) \ fVr1).
To see why this is so, we observe
glob.(co-slice.S.V.(Vr1,Vr2).(fVr1,fVr2)) \ (fVr1,fVr2)
= {def. of co-slice; let coV := def.S \ V}
glob.(slice.S[final-use Vr1,Vr2 \ fVr1,fVr2].coV) \ (fVr1,fVr2)
= {stepwise final-use sub. (see Section E.3)}
glob.(slice.S[final-use Vr1 \ fVr1][final-use Vr2 \ fVr2].coV) \ (fVr1,fVr2)
= {set theory}
(glob.(slice.S[final-use Vr1 \ fVr1][final-use Vr2 \ fVr2].coV) \ fVr2) \ fVr1
⊆ {Lemma 12.2 with S,X,fX,Y := S[final-use Vr1 \ fVr1],Vr2,fVr2,coV}
glob.(slice.S[final-use Vr1 \ fVr1].coV) \ fVr1
= {def. of co-slice; coV = def.S \ V}
glob.(co-slice.S.V.Vr1.fVr1) \ fVr1.
An interesting consequence of the monotonicity of mergeability (one which calls for an
optimistic algorithm, when seeking the largest set of penless reusable variables) is the following.
Corollary 12.4. When reducing the set of reusable variables from (Vr1,Vr2) to (Vr1,Vr21),
where Vr21 ⊆ Vr2 and Vr1 ∩ mergeable.S.V.(Vr1,Vr2) = ∅, the subset Vr1 of non-mergeable variables
in the former remains non-mergeable in the latter (i.e. Vr1 ∩ mergeable.S.V.(Vr1,Vr21) = ∅).
Proof. Due to the monotonicity of mergeability (Lemma 12.3 above), any member of
Vr1 ∩ mergeable.S.V.(Vr1,Vr21) would have to be in Vr1 ∩ mergeable.S.V.(Vr1,Vr2) as well
(due to Vr21 ⊆ Vr2), thus contradicting the assumption of Vr1 being non-mergeable in
co-slice.S.V.(Vr1,Vr2).
Given a core statement S and variables of interest V, compute the largest subset
largest-penless-reusable.S.V of V which, when offered for reuse, yields a penless co-slice
(penless-co-slice.S.V.(largest-penless-reusable.S.V), which is in turn not larger than any other
penless co-slice of S,V), as follows:

largest-penless-reusable.S.V ≜ largest-penless-reusable-rec.S.V.V

largest-penless-reusable-rec.S.V.Vr ≜ if nonMergeable = ∅ then Vr else
    largest-penless-reusable-rec.S.V.(Vr \ nonMergeable)
where nonMergeable := glob.(co-slice.S.V.Vr.fVr) ∩ Vr
and fVr := fresh.(Vr,(V ∪ glob.S)).

Figure 12.1: An algorithm for finding the largest-penless-reusable set.
Remark: the above property should not be confused with 'non-mergeability is anti-monotone
with respect to co-slicing'. In fact, judging by our definition of non-mergeability, the latter is not
true. When decreasing the set of reusable variables, say from (Vr1,Vr2) to Vr1, a non-mergeable
variable from Vr2 will no longer be considered either (mergeable or non-mergeable), as it will no
longer be offered for reuse.
So, with an optimistic approach, the algorithm begins by trying to reuse all extracted variables
V. It then removes all non-mergeable variables. Now, should we trust the result (i.e.
the set Vr := mergeable.S.V.V) to be penless (i.e. Vr = mergeable.S.V.Vr)? Unfortunately
that is not necessarily so. By no longer reusing variables in V \ Vr, the set of global variables
glob.(co-slice.S.V.Vr.fVr) \ fVr is possibly larger than the corresponding glob.(co-slice.S.V.V.fV) \
fV; the former might include members of Vr, thus rendering Vr non-penless. However, such variables
can subsequently be removed, repeatedly.
Hence, the algorithm (see Figure 12.1) is optimistic and recursive. Starting with the largest
available set V, we repeatedly identify and remove non-mergeable variables, until a fixed point is
reached.
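Figure 12.1 transcribes directly into an iterative Python sketch (the recursion becomes a
loop; co_slice, glob and fresh are the assumed helpers used throughout these sketches).
Termination is immediate, since Vr strictly decreases on every pass.

    def largest_penless_reusable(S, V):
        Vr = set(V)                        # optimistic: try to reuse all of V
        while True:
            fVr = fresh(Vr, set(V) | glob(S))
            non_mergeable = glob(co_slice(S, V, Vr, fVr)) & Vr
            if not non_mergeable:          # fixed point reached: Vr is penless
                return Vr
            Vr -= non_mergeable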
12.2 Slice inclusion
When sliding V in S, e.g. in Transformation 11.2, the extracted computation consists of the
full slice of V in S, i.e. slice.S.V. The complement, in turn, consists of the code for computing
the remaining results coV := def.S \ V, i.e. penless-co-slice.S.V.Vr, with Vr (being e.g.
largest-penless-reusable.S.V, as shown above) a subset of V of extracted variables whose final
extracted value is to be offered for reuse. Variables in coV, however, might also be modified in
the extracted slice, if those contribute to the computation of V. In such a case, the compensatory
code ensures (through backup variables) that those modifications do not interfere with the eventual
computation of coV in the complement.
With this transformation, the final value of the extracted variables can be reused in the complement.
But how about the final values of other variables? In the following example, an attempt
to slide the computation of avg (and reusing it in the complement) would lead to duplication of
the code for computing sum.
i,sum,prod := 0,0,1
; while i<a.length do
i,sum,prod :=
i+1,sum+a[i],prod*a[i]
od
; avg := sum/a.length
; out << sum
; out << prod
; out << avg
The result will look this way:
i,sum := 0,0
; while i<a.length do
i,sum :=
i+1,sum+a[i]
od
; avg := sum/a.length
;
i,sum,prod := 0,0,1
; while i<a.length do
i,sum,prod :=
i+1,sum+a[i],prod*a[i]
od
; out << sum
; out << prod
; out << avg
Notice that the final value of avg was successfully reused. In contrast, the whole computation
of sum had to be duplicated. The reason is that its value at the end of the extracted slice was
ignored, instead of being offered for reuse through final-use substitution (like avg).
In general, there is no reason to restrict final-use substitution to the set of extracted variables,
V. All other variables whose final value is computed in the extracted slice might be good candidates
for reuse too. In the example above, it was sum whose final value was computed both in
the slice and the complement.
How can we tell it was the final value of a variable that was computed in the extracted slice?
This is the case whenever the full slice of such a variable, with respect to the program scope, is
included in the extracted code. In general, we can say that all variables whose slices are included
in the slice for V are candidates for final-use substitution. We denote this set V′ and propose to
update our slice-extraction transformations to extract that extended set rather than the requested
set V.
In the example above, where V was {avg}, the corresponding V′ set is {avg,sum,i}.
Applying Transformation 11.2 to that latter set, with the largest penless reusable set Vr :=
{avg,sum}, would therefore lead to:
|[var fi
; i,sum := 0,0
; while i<a.length do
i,sum :=
i+1,sum+a[i]
od
; avg := sum/a.length
; fi:=i
;
i, prod := 0, 1
; while i<a.length do
i, prod :=
i+1, prod*a[i]
od
; out << sum
; out << prod
; out << avg
; i:=fi
]|
Note that this time, the final extracted value of sum was reused in the complement, instead
of being ignored and thus recomputed. This resulting code is considered better than the previous
result in the sense that the code for computing sum is no longer duplicated. And in terms of our
optimisation problem, it yields a smaller co-slice, with fewer assignments.
On the other hand, we now have more compensatory code. To understand this, we further
note that the largest set Vr used for final-use substitution was {avg,sum}. The variable i was
excluded since its intermediate values are used in the complement, for the computation of prod.
Instead, the slice for i was duplicated and its modifications in the complement were ignored
through a backup variable. In this case, considering levels of code duplication, ignoring effects
on i in the extracted slice, instead, would have been as good. However, in terms of the number of
backup variables, the latter would have been better.
Accordingly, when two sliding combinations are similar in terms of code duplication, it might
be desirable to choose the one that minimizes the need for backup variables, as those entail both
extra storage and time for copying.
In this thesis, we leave this aspect (of minimizing such compensatory code) alone, and focus
solely on levels of code duplication, as displayed by the number of individual assignments in the
co-slice.
12.3 The optimal sliding transformation
Transformation 12.5. Let S be any core statement and V be a set of variables to be extracted;
then
S ⊑
|[var iVnr1,icoV11,fVnr1,fVnr2
; iVnr1,icoV11 := Vnr1,coV11
; SV
; fVnr1,fVnr2 := Vnr1,Vnr2
;
Vnr1,coV11 := iVnr1,icoV11
; ScoV
; Vnr1,Vnr2 := fVnr1,fVnr2
]|
where V′ is the set of variables in V ∪ def.S whose slice is included in slice.S.V,
Vr := largest-penless-reusable.S.V′,
Vnr := V′ \ Vr,
coV := def.S \ (Vr,Vnr),
SV := slice.S.V′,
ScoV := penless-co-slice.S.V′.Vr,
Vnr1,Vnr2 := (Vnr ∩ input.ScoV),(Vnr ∩ (def.ScoV \ input.ScoV)),
coV11 := (coV ∩ def.SV ∩ input.ScoV)
and (iVnr1,icoV11,fVnr1,fVnr2) := fresh.((Vnr1,coV11,Vnr1,Vnr2),(V ∪ glob.S)).
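Before the proof, the overall pipeline of Transformation 12.5 can be summarised as a sketch
composing the previous sketches (all helper names remain assumptions of this illustration):

    def optimal_sliding_inputs(S, V):
        V_ext = extended_extraction_set(S, V)     # the superset V' of Section 12.2
        Vr = largest_penless_reusable(S, V_ext)   # offered for reuse (Figure 12.1)
        Vnr = V_ext - Vr
        SV = slice_of(S, V_ext)                   # extracted code; equals slice.S.V
        ScoV = penless_co_slice(S, V_ext, Vr)     # smallest penless complement
        return SV, ScoV, Vr, Vnr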
Proof.
S
⊑ {Refinement 11.1 with ScoV := co-slice.S.V′.Vr.fVr
   on fresh fVr: (Vr,Vnr,coV) = def.S by def. of Vr,Vnr,coV;
   S[live Vr,Vnr] ⊑ SV[live Vr,Vnr] by Q1 of slice;
   S[final-use Vr \ fVr][live coV] ⊑ (co-slice.S.V′.Vr.fVr)[live coV]
   by Q1 of co-slice;
   def.SV ⊆ def.S by Q2 of slice; similarly
   def.ScoV ⊆ def.S by Q2 of co-slice;
   (iVnr1,icoV11,fVnr1,fVnr2) ∩ glob.S = ∅ by Q1 of fresh;
   note that Vr1,Vr2,Vr3 := ∅,∅,Vr since Vr is penless (by def. of
   largest-penless-reusable);
   consequently iVr1,iVr2,fVr1 and fVr2 are all empty; also note (for our ScoV)
   penless-co-slice.S.V′.Vr = (co-slice.S.V′.Vr.fVr)[fVr \ Vr]
   by def. of penless-co-slice (which is indeed well-defined, Vr being penless)}
(iVnr1,icoV11 := Vnr1,coV11 ; SV ; fVnr1,fVnr2 := Vnr1,Vnr2
; Vnr1,coV11 := iVnr1,icoV11 ; ScoV ; Vnr1,Vnr2 := fVnr1,fVnr2)
[live Vr,Vnr,coV]
= {def. of live: (def.SV ∪ def.ScoV) ⊆ (Vr,Vnr,coV) (again, Q2 of slice and
   co-slice) and (iVnr1,icoV11,fVnr1,fVnr2) ∩ (Vr,Vnr,coV) = ∅}
|[var iVnr1,icoV11,fVnr1,fVnr2;
iVnr1,icoV11 := Vnr1,coV11 ; SV ; fVnr1,fVnr2 := Vnr1,Vnr2
; Vnr1,coV11 := iVnr1,icoV11 ; ScoV ; Vnr1,Vnr2 := fVnr1,fVnr2]|.
12.4 Summary
This chapter has addressed two related optimisation problems with regard to penless co-slicing
and penless sliding, from the preceding chapter. There, the sliding transformation assumed the
subset of extracted variables to be reused in the complement is given. Here, in contrast, all possible
subsets have been considered, and algorithms for finding the optimal ones have been developed.
The smallest possible penless co-slice is found through an optimistic polynomial-time algorithm
that assumes all extracted variables should be made reusable, and repeatedly removes those that
violate penlessness. When a fixed point is reached, the resulting set is guaranteed to yield the
smallest possible penless co-slice. The correctness of this algorithm has been proved through
a number of properties of final-use substitution, slicing and co-slicing that have been formally
developed.
The optimal sliding transformation, one which extracts precisely the slice of selected variables
and with the smallest possible penless co-slice (in terms of the number of individual assignments),
has been shown to involve the extraction of a superset of the selected variables and the smallest
penless co-slice, as in our solution to the first optimisation problem. The superset of extracted
variables includes the selected set and all other variables whose slice is included in the extracted
code.
A relation of slice inclusion, contributing to our detection of optimal sliding, has been introduced
by Gallagher and Lyle [22].
In a final note, we return to our declared challenge from the end of Chapter 2. There,
it was shown that if the user requests the extraction of variable out (or equivalently statements
{1,2,4,6}), the Tuck transformation would duplicate the entire extracted slice (when not rejecting
the extraction), the KH00 algorithm would fail, and KH03 would insist on extracting statement
3 too, which is illegal in our context of slice extraction.
Our optimal sliding from Transformation 12.5 above, with V = {out}, would detect V′ =
{out,sum,i}, of which Vr = {out,sum} would be offered for reuse. Consequently, the challenge
of untangling like Tuck whilst minimizing code duplication and improving applicability, like KH03,
would be met.
This concludes our current investigation of slice extraction via sliding. Potential applications,
for refactoring and otherwise, as well as possible directions for future work, will be outlined in the
next chapter.
Chapter 13
Conclusion
This thesis has explored the application of program slicing and related analyses to the construction
of automatic tools for refactoring.
A theoretical framework for slicing-based refactoring has been developed. The framework has
been introduced in Chapter 4 and further extended in Chapter 5, where a new proof method
has been developed. The method is based on two complementary types of refinements, i.e. slice-refinement
and co-slice-refinement. In our deterministic context, when a program S′ is both a slice-refinement
and a co-slice-refinement of another program S, it is guaranteed to be a full refinement
of S. This enables the decomposition of a proof, following the specific decomposition applied in
a given transformation. The construction of our framework has been finalised in Chapter 8, with
the formalisation of a novel program decomposition technique of program slides. We think of a
program as represented by a collection of transparency slides. On each such slide, a non-contiguous
part of the original program is printed, such that the union of all slides yields back the program
itself.
Based on our theoretical framework, a provably correct slicing algorithm has been provided in
Chapter 9. The algorithm is based on the observation that slides capture the control flow aspect
of slicing, whereas complementary data-flow influences are captured by our binary relation of slide
dependence. Thus, a slide-independent set of slides yields a correct slice.
Our framework and slicing algorithm have been applied in solving the problem of slice extrac-
tion, as posed in the introduction chapter, via a family of provably correct sliding transformations.
Building on existing method-extraction algorithms, our approach shares the advantages of those
whilst avoiding some of the respective weaknesses. Thus sliding is successful in providing high
levels of accuracy and applicability.
The thesis comes to a conclusion in this chapter, by discussing implications and potential
applications of sliding transformations, first in the context of refactoring, and more generally,
later. Furthermore, advanced issues and limitations of sliding are evaluated and some ideas for
future work are presented.
13.1 Slicing-based refactoring
13.1.1 Replace Temp with Query
Our journey had started with a promise to offer general automation of Fowler’s refactoring of
Replace Temp with Query. This was motivated, in part, by Fowler and Beck’s big refactoring of
Convert Procedural Design to Objects and the observation that support for removing temps was
missing.
But then, instead, we turned to formulate and solve the problem of slice extraction (via sliding).
The time has come to explain how sliding can contribute to automating Replace Temp with Query.
Our observation is that
Replace Temp with Query = Extract Slice + Inline Temp + Merge Temps
whereby
Extract Slice = Sliding + Extract Method
with Inline Temp a refactoring to eliminate simple temps that are "getting in the way of other
refactorings" [20, Page 119], and with Merge Temps as in the elimination of compensatory code
after sliding (see e.g. Refinement 11.1).
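As a schematic illustration of this composition (our own Python example in the spirit of
Fowler's, not taken from the thesis), consider a temp whose computing slice is extracted into
a query and whose uses are then inlined:

    def report(order):
        base_price = order.quantity * order.item_price    # the temp
        discount_factor = 0.98 if base_price > 1000 else 1.0
        return base_price * discount_factor

    # After Extract Slice (the slice computing the temp becomes a query)
    # and Inline Temp (each use of the temp becomes a call to the query):

    def base_price(order):
        return order.quantity * order.item_price

    def report_refactored(order):
        discount_factor = 0.98 if base_price(order) > 1000 else 1.0
        return base_price(order) * discount_factor

In this trivially contiguous case Extract Method alone would do; sliding earns its keep when
the temp's slice is tangled, non-contiguously, with the rest of the method.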
13.1.2 More refactorings
Our Extract Slice refactoring, automated via sliding, can help with automating some more known
(and yet to be supported) refactorings.
Split/merge loops
The Split Loop refactoring is an immediate candidate for automation through sliding. “You have
a loop that is doing two things. Duplicate the loop” [69]. Indeed, this is what we did throughout
the thesis in most of our examples. However, it should be emphasised that splitting loops is just
a special case of our general slice-extraction refactoring.
It should also be remembered that tangled loops are not bad practice as such. It is left to the
programmer to apply this refactoring judiciously, e.g. when one of the computations in the loop
should be extracted for reuse.
Our separation of refinement and program equivalence rules from actual transformations,
throughout the thesis, was made with the understanding that reverse sliding operations, e.g.
for entangling loops (a new Merge Loops refactoring?), may under some circumstances be as de-
sirable. The decision when to apply which refactoring, in our understanding, should be left to the
programmer’s good judgement. We merely provide the enabling tools.
Separate Query from Modifier
Side effects in functions can be problematic, e.g. hampering potential reuse. “You have a method
that returns a value but also changes the state of an object”, is a situation that calls for the
Separate Query from Modifier refactoring. “Create two methods, one for the query and one for
the modification” is Fowler’s suggestion [20, Page 279].
Cornélio has formalised this refactoring for simple cases in which the modifier is made of an
individual assignment (to an object's field) and the query returns the old value of the assigned
variable (i.e. the field) [11, Page 128]. But what if the querying code is tangled with the modifier?
(Indeed this is the case in Fowler’s original example.)
Our observation is that such cases require untangling of non-contiguous code, as is offered by
sliding. However, working out exact details of this refactoring will require further investigation. If
successful, such application of sliding would yield a novel and advanced solution to this important
and highly non-trivial refactoring.
Arbitrary method extraction
The Extract Method refactoring is considered so important that its automation, in a behaviour-
preserving way, has been declared “The Rubicon” to be crossed by refactoring tools, before those
can be considered serious [21]. Furthermore, many of the other catalogued refactoring transfor-
mations depend on Extract Method as a building block.
One limitation of method extraction, as formulated and supported to-date, is the insistence
on extracting a single fragment of (contiguous) code. Indeed, extracting an arbitrary selection
of fragments, from a given program scope, is much harder. We call this generalisation arbitrary
method extraction. It involves the extraction of a (not necessarily contiguous) set of program
fragments into a single new method.
The most closely related work to sliding, which was introduced early in the thesis (in Sec-
tion 2.3) and indeed influenced its development, includes the Tucking transformation by Lakhotia
and Deprez [40] and two algorithms by Komondoor and Horwitz (KH00 [38] and KH03 [39]). As
was mentioned, none of them actually targeted slice extraction, as was defined in Section 1.4.
In fact, it was (different flavours of) arbitrary method extraction that they targeted. Common
to all three is that an arbitrary selection of fragments is given as input. They differ (from each
other), however, in the rules of the game. Those include (1) the way to determine the enclosing
program scope (from which to extract), (2) the applicability conditions (i.e. when to reject a
transformation), (3) which non-selected statements can (or should) be dragged along with the
extracted code, (4) what is in the complement and how to compose it with the extracted code, (5)
which parts of the program can be duplicated and/or reversed, and (6) what kind of compensation
is to be allowed.
Our conjecture is that each of the three arbitrary-method-extraction flavours (i.e. Tuck, KH00
and KH03) can be reduced to slice extraction. The results of an initial investigation in that di-
rection have suggested such reductions indeed exist. Those involve the formulation and reuse of
backward slices from internal program points. This can be done with existing sliding transforma-
tions, to be performed on the SSA-form rather than the original. Then, the (existing solution’s)
applicability conditions should be shown to imply de-SSA-ability of the result. Thus, the existing
solutions will be re-formulated, proved correct and automated through sliding.
Consequently, as was the case for slice extraction, corresponding improvements can be expected
to present themselves. Those might yield new solutions with higher applicability (compared with
Tuck and KH00), higher accuracy of extracted code (Tuck and KH03) and complement (Tuck),
reduced levels of duplication and compensation (again, Tuck), and enhanced scalability (mostly
the exponential KH00 but also the cubic KH03).
It should be said that this application of sliding is, on the one hand, somewhat surprising (to
the author, at least), as it was not at all anticipated (in earlier stages of this research). But on
the other hand, it is very reasonable, since the results of Tuck, KH00 and KH03 (in particular
the comparison between the former and latter, in [37] and [39]) have directly contributed to the
invention of slides and sliding.
13.2 Advanced issues and limitations
In choosing a programming language, we have made some simplifying assumptions, such that
formal derivation of the concepts behind sliding has become feasible. It is natural to ask whether
sliding transformations can be upgraded to support “real” languages. In what follows, we consider
the lifting of some earlier restrictions on the supported language.
Firstly, our assumption of all variables being cloneable was made so that we can make backups
of initial and final values, as part of sliding's compensatory code. Thanks to our penless-sliding
effort to remove redundant backup variables, back in Chapter 11, this restriction can be easily lifted.
This lifting must be complemented with a strengthening of sliding's applicability conditions. Added
preconditions will ensure all backup variables of non-cloneable program variables are mergeable
and hence removed. Otherwise, the transformation would be rejected. Alternatively, some measures
might be taken (as in KH03, where no backup variables are allowed) to avoid the need for
such backup.
Secondly, if aliasing is to be permitted, a preliminary step may perform alias analysis before
sliding begins. This step would rename variables such that the aliases are transparent to the sliding
algorithm. Furthermore, since sliding aims to keep the source as close to the original as possible,
this would have to be complemented with a following step to undo the renaming. There, special
care would have to be taken with compensatory code, to retrieve the backup values of all relevant
variables.
Thirdly, allowing structured jumps or even arbitrary control flow, as is the case with existing
solutions to arbitrary method extraction, would require a complete reformulation, at least on
the lower level of our program analysis and manipulation approach (e.g. laws for propagating
assertions). Nevertheless, there appears to be no reason why slides and sliding should not be
applicable in such settings. For example, the slide of an assignment would have to include all its
control-dependence predecessors instead of merely syntax-tree ancestors. (In our simple language,
indeed the latter subsumes the former.) Another example is final-use substitution, whereby instead
of propagating assertions as far as possible before making the substitution, it should be possible
to formulate the substitution in terms of paths over the control flow graph. There, a final use (of
variable x) is one from which all paths to the exit involve no re-definition (of that x).
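As a small illustration of final uses (our example): in y := x; x := 0; z := x, the use of x in y := x is not final, since the path to the exit passes through the re-definition x := 0, whereas the use of x in z := x is final. On a control flow graph, checking this amounts to a reachability test over definition-free paths, with no assertion propagation involved.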
Fourthly, in the presence of exceptions, sliding’s reordering of statements may be problematic
in the sense that the transformed program might perform more operations before throwing an
exception or might even throw a different exception. When the thrown exceptions are part of
the behaviour to be preserved, such reordering should be limited or even completely abandoned.
Instead, it should be possible to adopt alternative extraction strategies which involve no reordering
of statements. We have proposed two such alternative strategies, one object oriented, and the
other aspect oriented [36], in a paper titled “Untangling: A Slice Extraction Refactoring” [17].
Explained in our context of slips and slides, both strategies involve the sliding of slips, without
their controlling guards. Such slip sliding would extract the statement of a slip into a method of
its own, in the object oriented case, or into an advice in an aspect-oriented context. In the former,
the slip would be replaced with a call to the new method whereas in the latter, the extracted
advice slip will be associated with a pointcut designator, ensuring it is slid back (i.e. woven) into
its original point, before execution.
Fifthly, in the presence of concurrency it is unclear whether sliding is at all appropriate, again,
due to reordering of computations. Nevertheless, some solutions to slicing are available for such
settings (e.g. [35, 15]). It is possible that as it was for aliasing, new preconditions to sliding may
be formed to ensure behaviour preservation.
Finally, supporting procedures, parameters, overloading, and object-oriented constructs (e.g.
inheritance and polymorphism) would be a great challenge. Indeed, slicing research has already
proposed solutions to such problems, on the one hand, whereas predicate transformers (e.g. for
the language ROOL [11]) have been defined and even applied to refactoring, on the other. It is
currently unknown whether such advanced solutions will be amenable to supporting formulations
of sliding.
In the context of PDG-based slicing, extra language features (e.g. arbitrary control flow [6])
are typically handled by the addition of more edges to the graph. This way, the slicing algorithm
itself is oblivious to those features and remains simple. Similarly, it can be expected that sliding,
being based on slides and slicing, be enhanced by the addition of slide dependences.
13.3 Future work
Sliding, as presented in this thesis, offers an abundance of possible directions for future work.
Earlier in this chapter, we have already mentioned a number of possible applications of sliding
in implementing known refactorings and in extending method extraction to support arbitrary
selections of non-contiguous code fragments.
In our discussion on supporting advanced language features and limitations of sliding, some
further ideas have been highlighted, including the support for different strategies of extraction, as
in our paper on refactoring into aspects [17].
Some further ideas may involve the theory behind sliding, or practicalities, or even other
applications, beyond the initial domain of refactoring.
13.3.1 Formal program re-design
In this thesis, we have restricted our supported language to deterministic constructs. If the earlier
section considered the lifting of language restrictions when supporting “real” languages, here we
turn the other way, considering the effects of supporting non-determinism and specifications.
Our problem with non-determinism has been related to the duplication of such constructs. As
in the above section, this problem can be treated by adding a precondition to sliding, ensuring no
non-deterministic choice is duplicated. Alternatively, a mechanism to ensure exact repetition of
non-deterministic choices, wherever duplicated, can be installed. The details of such mechanisms
would require further work.
It is hoped that with robust support for change of programs and specifications — and sliding
may offer a step towards such support — formal methods of program design, and hence of re-design,
would become more agile and perhaps, consequently, more widely used.
13.3.2 Further applications of sliding: beyond refactoring
The sliding family of program equivalence and refinement rules, as introduced in the thesis, has
been applied to behaviour-preserving transformations for refactoring, with the aim of supporting
change in software. Nonetheless, it does not have to stop there. Sliding carries the potential
of being relevant and applicable wherever program equivalence or behaviour-preserving program
changes are of interest.
Software obfuscation
The sliding refinement rules of Chapters 10 and 11 provide a large universe of equivalent programs,
as was explored in the optimisation problems of Chapter 12. By construction, those programs differ
only in the subset of reusable variables and hence in the size and shape of the co-slice. In obfuscation
[14], a program is transformed with the aim of making it less readable. This is desirable e.g.
for software security and protection. In a way, obfuscation is the opposite of refactoring, but as
it also involves behaviour-preserving transformations, it may benefit from sliding. Moreover, the
large number of equivalent programs carries the potential of rendering the reversal of sliding-based
obfuscation harder.
Clone elimination
The arbitrary-method-extraction algorithms by Komondoor and Horwitz [38, 39, 37] target the
elimination of clones, or duplicated (not-necessarily contiguous) statements, in existing programs.
Their approach eliminates pairs of clones as well as clone groups. Since, with some more work,
sliding is expected to be made applicable to such method-extraction techniques (as KH00 and
KH03), it should also be useful in that context.
Integration of program variants
In general, as said, sliding can be expected to be useful wherever program equivalence is. One
interesting application of such equivalence is in the integration of variants of a program. This is
useful, for example, when a group of programmers is working simultaneously on a given code base.
Horwitz, Prins and Reps [31, 32] have suggested some PDG-based and slicing related algorithms
of program merging for integration. Those were based on the observation that if two programs are
represented by isomorphic dependence graphs, they are equivalent. However, the reverse is not
true, obviously, as the problem of program equivalence is in general undecidable. With sliding, we
identify a range of equivalent programs whose dependence graphs will not be isomorphic. This is
due to duplication of guards and assignments. This sliding-related family of equivalent programs
might, in turn, enhance the capabilities of such program integration algorithms.
Optimising compilers
In this thesis, we have adopted some program analyses, representations and transformations from
the world of optimising compilers, such as reaching definitions, SSA form, live variables analysis
and the related dead-assignments-elimination.
In turn, it should be interesting to investigate the relevance of sliding transformations to that
domain. It appears that sliding offers more powerful code-motion transformations than the state
of the art.
Programming education
On a different level altogether, it is hoped that slides and sliding, either as a metaphor or in
theory and practice, can find their way into the programming education curriculum, especially in
the education of non-mathematically inclined programmers. For example, teaching and learning
the concept of recursion with the slideshow metaphor, having a single slide for each iteration,
on which the values of parameters and local variables are written with an erasable pen, may prove
simpler and more tangible than present methods. Furthermore, since often programmers think of
programs as slices of non-contiguous code rather than trees or flow graphs, as Weiser has shown
[62], it is hoped that representing programs as collections of slides would appeal to programmers.
Beyond programming
Finally, it should be hard but interesting to examine the application of slides and sliding beyond
the world of programming. In general, any context of evolvable structured documents could benefit
from such techniques.
For example, this thesis has been written, or rather developed, using LaTeX. Moreover, its
content and structure have evolved throughout. Could slicing, slides and sliding not assist in such
activities?
Appendix A
Formal Language Definition
This appendix gives a full definition of the language used in this thesis. Each language construct is
given semantics formulated as a wp predicate transformer. Then, some syntactic approximations
to required semantic properties of that construct, regarding program variables, are given. Finally,
for each construct, the basic requirements (RE1-RE5) are proved. Those were defined back in
Chapter 4 and are re-stated next. For any statement S we require
RE1: wp.S is universally disjunctive
RE2: glob.(wp.S.P) ⊆ (glob.P \ ddef.S) ∪ input.S for all P
RE3: [wp.S.P ≡ P ∧ wp.S.true] for all P with glob.P ∩ def.S = ∅
RE4: ddef.S ⊆ def.S
RE5: glob.S = def.S ∪ input.S
A.1 Core language
Assignment
[wp.X:= E.P ≡ P[X\E]] for all P;
def.X:= E ≜ X;
ddef.X:= E ≜ X;
input.X:= E ≜ glob.E; and
glob.X:= E ≜ X ∪ glob.E.
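(As a small worked instance of these definitions, added here for illustration: for x := y+z we get def.x:= y+z = ddef.x:= y+z = {x}, input.x:= y+z = glob.(y+z) = {y,z} and glob.x:= y+z = {x,y,z}; RE5 is then immediate, as {x,y,z} = {x} ∪ {y,z}.)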
RE2: "glob.(wp.X:= E.P) ⊆ (glob.P \ ddef.X:= E) ∪ input.X:= E"
glob.(wp.X:= E.P)
= {wp of ':='}
glob.(P[X\E])
⊆ {glob of normal sub.}
(glob.P \ X) ∪ glob.E
= {ddef and input of ':='}
(glob.P \ ddef.X:= E) ∪ input.X:= E
RE3: "[wp.X:= E.P ≡ P ∧ wp.X:= E.true]" provided def.X:= E ∩ glob.P = ∅
P ∧ wp.X:= E.true
= {wp of ':='}
P ∧ true[X\E]
= {glob.true = ∅}
P ∧ true
= {identity element of ∧}
P
= {redundant normal sub.: X ∩ glob.P = ∅ due to proviso and definition of def}
P[X\E]
= {wp of ':='}
wp.X:= E.P
RE4: "ddef.X:= E ⊆ def.X:= E": Trivially so, since X ⊆ X.
RE5: "glob.X:= E = def.X:= E ∪ input.X:= E"
glob.X:= E
= {glob of ':='}
X ∪ glob.E
= {def and input of ':='}
def.X:= E ∪ input.X:= E
Sequential composition
[wp.S1;S2.P ≡ wp.S1.(wp.S2.P)] for all P;
def.S1;S2 ≜ def.S1 ∪ def.S2 ;
ddef.S1;S2 ≜ ddef.S1 ∪ ddef.S2 ;
input.S1;S2 ≜ input.S1 ∪ (input.S2 \ ddef.S1) ; and
glob.S1;S2 ≜ glob.(S1,S2) .
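(For illustration, our example: taking S1 to be x := y and S2 to be z := x+w, we get ddef.S1 = {x} and input.S2 = {x,w}, hence input.S1;S2 = {y} ∪ ({x,w} \ {x}) = {y,w}; the x read by S2 is certainly defined by S1 and so does not count as an input of the composition.)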
RE2: "glob.(wp.S1;S2.P) ⊆ (glob.P \ ddef.S1;S2) ∪ input.S1;S2" provided glob.(wp.S1.Q) ⊆ (glob.Q \ ddef.S1) ∪ input.S1 and glob.(wp.S2.P) ⊆ (glob.P \ ddef.S2) ∪ input.S2 for any predicates P,Q
glob.(wp.S1;S2.P)
= {wp of ';'}
glob.(wp.S1.(wp.S2.P))
⊆ {proviso with Q := wp.S2.P}
(glob.(wp.S2.P) \ ddef.S1) ∪ input.S1
⊆ {proviso}
(((glob.P \ ddef.S2) ∪ input.S2) \ ddef.S1) ∪ input.S1
= {set theory: (a ∪ b) \ c = (a \ c) ∪ (b \ c) and
(d \ e) \ f = d \ (e ∪ f)}
(glob.P \ (ddef.S1 ∪ ddef.S2)) ∪ (input.S2 \ ddef.S1) ∪ input.S1
= {ddef and input of ';'}
(glob.P \ ddef.S1;S2) ∪ input.S1;S2
RE3: "[wp.S1;S2.P ≡ P ∧ wp.S1;S2.true]" provided def.S1;S2 ∩ glob.P = ∅,
[wp.S1.Q ≡ Q ∧ wp.S1.true] for any Q with def.S1 ∩ glob.Q = ∅ and [wp.S2.R ≡ R ∧ wp.S2.true]
for any R with def.S2 ∩ glob.R = ∅
wp.S1;S2.P
= {wp of ';'}
wp.S1.(wp.S2.P)
= {proviso: def.S2 ∩ glob.P = ∅}
wp.S1.(P ∧ wp.S2.true)
= {wp.S1 is finitely conjunctive}
wp.S1.P ∧ wp.S1.(wp.S2.true)
= {proviso: def.S1 ∩ glob.P = ∅}
P ∧ wp.S1.true ∧ wp.S1.(wp.S2.true)
= {finitely conjunctive}
P ∧ wp.S1.(true ∧ wp.S2.true)
= {identity element of ∧}
P ∧ wp.S1.(wp.S2.true)
= {wp of ';'}
P ∧ wp.S1;S2.true
RE4: "ddef.S1;S2 ⊆ def.S1;S2" provided ddef.S1 ⊆ def.S1 and ddef.S2 ⊆ def.S2: Indeed ddef.S1 ∪ ddef.S2 ⊆ def.S1 ∪ def.S2, due to the proviso and set theory.
RE5: "glob.S1;S2 = def.S1;S2 ∪ input.S1;S2" provided glob.S1 = def.S1 ∪ input.S1 and glob.S2 = def.S2 ∪ input.S2
def.S1;S2 ∪ input.S1;S2
= {def and input of ';'}
def.S1 ∪ def.S2 ∪ input.S1 ∪ (input.S2 \ ddef.S1)
= {set theory: ddef.S1 ⊆ def.S1}
def.S1 ∪ def.S2 ∪ input.S1 ∪ input.S2
= {proviso}
glob.S1 ∪ glob.S2
= {glob of ';'}
glob.S1;S2
Alternative construct
[wp.IF.P ≡ (B ⇒ wp.S1.P) ∧ (¬B ⇒ wp.S2.P)] for all P;
def.IF ≜ def.S1 ∪ def.S2 ;
ddef.IF ≜ ddef.S1 ∩ ddef.S2 ;
input.IF ≜ glob.B ∪ input.S1 ∪ input.S2 ; and
glob.IF ≜ glob.B ∪ glob.S1 ∪ glob.S2 .
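(For illustration, our example: with IF = if b then x,y := 1,2 else x := 3, we get def.IF = {x,y}, ddef.IF = {x,y} ∩ {x} = {x}, since only x is certainly defined whichever branch is taken, input.IF = {b} and glob.IF = {b,x,y}.)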
RE2: "glob.(wp.IF.P) ⊆ (glob.P \ ddef.IF) ∪ input.IF" provided glob.(wp.S1.P) ⊆ (glob.P \ ddef.S1) ∪ input.S1 and glob.(wp.S2.P) ⊆ (glob.P \ ddef.S2) ∪ input.S2
glob.(wp.IF.P)
= {wp of IF}
glob.((B ⇒ wp.S1.P) ∧ (¬B ⇒ wp.S2.P))
= {def. of glob; glob.B = glob.(¬B)}
glob.B ∪ glob.(wp.S1.P) ∪ glob.(wp.S2.P)
⊆ {proviso, twice}
glob.B ∪ (glob.P \ ddef.S1) ∪ input.S1 ∪ (glob.P \ ddef.S2) ∪ input.S2
= {set theory}
(glob.P \ (ddef.S1 ∩ ddef.S2)) ∪ glob.B ∪ input.S1 ∪ input.S2
= {ddef and input of IF}
(glob.P \ ddef.IF) ∪ input.IF
RE3: "[wp.IF.P ≡ P ∧ wp.IF.true]" provided def.IF ∩ glob.P = ∅
wp.IF.P
= {wp of IF}
(B ⇒ wp.S1.P) ∧ (¬B ⇒ wp.S2.P)
= {proviso, twice: def.IF ∩ glob.P = ∅ implies def.S1 ∩ glob.P = ∅ and def.S2 ∩ glob.P = ∅}
(B ⇒ P ∧ wp.S1.true) ∧ (¬B ⇒ P ∧ wp.S2.true)
= {dist. of ⇒ over ∧, twice}
(B ⇒ P) ∧ (B ⇒ wp.S1.true) ∧ (¬B ⇒ P) ∧ (¬B ⇒ wp.S2.true)
= {pred. calc.: [(Y ⇒ Z) ∧ (¬Y ⇒ Z) ≡ Z]}
P ∧ (B ⇒ wp.S1.true) ∧ (¬B ⇒ wp.S2.true)
= {wp of IF}
P ∧ wp.IF.true
RE4: "ddef.IF ⊆ def.IF" provided ddef.S1 ⊆ def.S1 and ddef.S2 ⊆ def.S2: Indeed ddef.S1 ∩ ddef.S2 ⊆ def.S1 ∪ def.S2, due to the proviso and set theory (a ∩ b ⊆ a ⊆ a ∪ b).
RE5: "glob.IF = def.IF ∪ input.IF" provided glob.S1 = def.S1 ∪ input.S1 and glob.S2 = def.S2 ∪ input.S2
def.IF ∪ input.IF
= {def and input of IF}
def.S1 ∪ def.S2 ∪ glob.B ∪ input.S1 ∪ input.S2
= {proviso}
glob.B ∪ glob.S1 ∪ glob.S2
= {glob of IF}
glob.IF
Repetitive construct
[wp.DO.P ≡ (∃i: 0 ≤ i: k^i.false)] for all P,
with k given by (DS:9,44) [13]: [k.Q ≡ (B ∨ P) ∧ (¬B ∨ wp.S.Q)] ;
def.DO ≜ def.S ;
ddef.DO ≜ ∅ ;
input.DO ≜ glob.B ∪ input.S ; and
glob.DO ≜ glob.B ∪ glob.S .
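(For illustration, our example: with DO = while i < n do s,i := s+i,i+1 od, we get def.DO = {s,i}, ddef.DO = ∅, since the loop body may never execute and so nothing is certainly defined, input.DO = {i,n} ∪ {s,i} = {s,i,n} and glob.DO = {s,i,n}.)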
RE2: "glob.(wp.DO.P) ⊆ (glob.P \ ddef.DO) ∪ input.DO" provided glob.(wp.S.Q) ⊆ (glob.Q \ ddef.S) ∪ input.S
Recall that wp.DO.P is equivalent to (∃i: 0 ≤ i: k^i.false) with [k.Q ≡ (B ∨ P) ∧ (¬B ∨ wp.S.Q)]
and that ddef.DO ≜ ∅ and input.DO ≜ glob.B ∪ input.S. Observing that glob.false ⊆ glob.P ∪
glob.B ∪ input.S is trivially true, we are left to prove
glob.((B ∨ P) ∧ (¬B ∨ wp.S.Q)) ⊆ glob.P ∪ glob.B ∪ input.S for any Q with glob.Q ⊆ glob.P ∪
glob.B ∪ input.S:
glob.((B ∨ P) ∧ (¬B ∨ wp.S.Q))
= {def. of glob}
glob.B ∪ glob.P ∪ glob.(wp.S.Q)
⊆ {proviso}
glob.B ∪ glob.P ∪ (glob.Q \ ddef.S) ∪ input.S
⊆ {set theory}
glob.B ∪ glob.P ∪ glob.Q ∪ input.S
⊆ {proviso}
glob.B ∪ glob.P ∪ glob.P ∪ glob.B ∪ input.S ∪ input.S
= {set theory}
glob.P ∪ glob.B ∪ input.S
RE3: "[wp.DO.P ≡ P ∧ wp.DO.true]" provided def.DO ∩ glob.P = ∅ and [wp.S.P ≡ P ∧ wp.S.true]
Here, recall that wp.DO.P is equivalent to (∃i: 0 ≤ i: k^i.false) with [k.Q ≡ (B ∨ P) ∧
(¬B ∨ wp.S.Q)] and def.DO ≜ def.S. Furthermore, note that wp.DO.true is equivalent to
(∃i: 0 ≤ i: l^i.false) with [l.Q ≡ ¬B ∨ wp.S.Q] due to true being the zero element of ∨ as
well as the identity element of ∧. Hence, we need to prove:
[(∃i: 0 ≤ i: k^i.false) ≡ P ∧ (∃i: 0 ≤ i: l^i.false)] (A.1)
and we do so by proving (by induction) for all j (≥ 0):
[(∃i: 0 ≤ i ≤ j: k^i.false) ≡ P ∧ (∃i: 0 ≤ i ≤ j: l^i.false)]
again, provided def.DO ∩ glob.P = ∅ and [wp.S.P ≡ P ∧ wp.S.true]:
Base case: j = 0
P ∧ (∃i: 0 ≤ i ≤ 0: l^i.false)
= {one point rule}
P ∧ l^0.false
= {definition of function iteration}
P ∧ false
= {zero element of ∧}
false
= {definition of function iteration}
k^0.false
= {one point rule}
(∃i: 0 ≤ i ≤ 0: k^i.false)
Step: j+1 (with 0 ≤ j)
(∃i: 0 ≤ i ≤ j+1: k^i.false)
= {splitting the range}
(∃i: (0 = i) ∨ (1 ≤ i ≤ j+1): k^i.false)
= {one point rule and transforming the dummy}
k^0.false ∨ (∃i: 0 ≤ i ≤ j: k^{i+1}.false)
= {def. of func. it., twice}
false ∨ (∃i: 0 ≤ i ≤ j: k.(k^i.false))
= {id. elem. of ∨; k is pos. disj. (and even universally so)}
k.(∃i: 0 ≤ i ≤ j: k^i.false)
= {def. of k}
(B ∨ P) ∧ (¬B ∨ wp.S.(∃i: 0 ≤ i ≤ j: k^i.false))
= {ind. hypo.}
(B ∨ P) ∧ (¬B ∨ wp.S.(P ∧ (∃i: 0 ≤ i ≤ j: l^i.false)))
= {wp.S is fin. conj. (and even univ. so)}
(B ∨ P) ∧ (¬B ∨ (wp.S.P ∧ wp.S.(∃i: 0 ≤ i ≤ j: l^i.false)))
= {proviso}
(B ∨ P) ∧ (¬B ∨ (P ∧ wp.S.true ∧ wp.S.(∃i: 0 ≤ i ≤ j: l^i.false)))
= {wp.S is fin. conj. (and even univ. so)}
(B ∨ P) ∧ (¬B ∨ (P ∧ wp.S.(true ∧ (∃i: 0 ≤ i ≤ j: l^i.false))))
= {id. elem. of ∧}
(B ∨ P) ∧ (¬B ∨ (P ∧ wp.S.(∃i: 0 ≤ i ≤ j: l^i.false)))
= {∨ distributes over ∧}
(B ∨ P) ∧ (¬B ∨ P) ∧ (¬B ∨ wp.S.(∃i: 0 ≤ i ≤ j: l^i.false))
= {pred. calc.: [(C ∨ D) ∧ (¬C ∨ D) ≡ D]}
P ∧ (¬B ∨ wp.S.(∃i: 0 ≤ i ≤ j: l^i.false))
= {def. of l}
P ∧ l.(∃i: 0 ≤ i ≤ j: l^i.false)
= {id. elem. of ∨; l is pos. disj. (and even universally so)}
P ∧ (false ∨ (∃i: 0 ≤ i ≤ j: l.(l^i.false)))
= {def. of func. it., twice}
P ∧ (l^0.false ∨ (∃i: 0 ≤ i ≤ j: l^{i+1}.false))
= {one point rule and transforming the dummy}
P ∧ (∃i: (0 = i) ∨ (1 ≤ i ≤ j+1): l^i.false)
= {splitting the range}
P ∧ (∃i: 0 ≤ i ≤ j+1: l^i.false)
RE4: "ddef.DO ⊆ def.DO" provided ddef.S ⊆ def.S: Trivially so since ddef.DO ≜ ∅.
RE5: "glob.DO = def.DO ∪ input.DO" provided glob.S = def.S ∪ input.S
def.DO ∪ input.DO
= {def and input of DO}
def.S ∪ glob.B ∪ input.S
= {proviso}
glob.B ∪ glob.S
= {glob of DO}
glob.DO
This completes our subset of Dijkstra and Scholten's guarded commands [13]. The following
constructs are extensions borrowed from Morgan [45], with some adaptations as our context re-
quires. Since those constructs were not present in [13], we shall also be responsible for proving
requirement RE1 (i.e. universal disjunctivity).
A.2 Extended language
Assertions
[wp.{B}.P ≡ B ∧ P] for all P;
def.{B} ≜ ∅ ;
ddef.{B} ≜ ∅ ;
input.{B} ≜ glob.B ; and
glob.{B} ≜ glob.B .
RE1: "wp.{B}" is universally disjunctive
wp.{B}.(∃P: P ∈ Ps: P)
= {wp of assertions}
B ∧ (∃P: P ∈ Ps: P)
= {∧ distributes over ∃ (3.11)}
(∃P: P ∈ Ps: B ∧ P)
= {again, wp of assertions}
(∃P: P ∈ Ps: wp.{B}.P)
RE2: "glob.(wp.{B}.P) ⊆ (glob.P \ ddef.{B}) ∪ input.{B}"
glob.(wp.{B}.P)
= {wp of assertions}
glob.(B ∧ P)
= {def. of glob}
glob.B ∪ glob.P
= {set theory and ddef and input of assertions}
(glob.P \ ddef.{B}) ∪ input.{B}
RE3: "[wp.{B}.P ≡ P ∧ wp.{B}.true]" provided def.{B} ∩ glob.P = ∅
P ∧ wp.{B}.true
= {wp of assertions}
P ∧ B ∧ true
= {identity element of ∧}
P ∧ B
= {wp of assertions}
wp.{B}.P
RE4: "ddef.{B} ⊆ def.{B}": Trivially so, since ddef.{B} = def.{B} = ∅.
RE5: "glob.{B} = def.{B} ∪ input.{B}"
glob.{B}
= {glob of assertions}
glob.B
= {def and input of assertions}
def.{B} ∪ input.{B}
Local variables
|[var L; S]| ≜ L′ := L; S; L := L′ where L′ is fresh.
[wp.|[var L; S]|.P ≡ (wp.S.P[L\L′])[L′\L]] for all P with glob.P ∩ L′ = ∅; or the simpler
[wp.|[var L; S]|.Q ≡ wp.S.Q] for all Q with glob.Q ∩ (L,L′) = ∅
def.|[var L; S]| ≜ def.S \ L ;
ddef.|[var L; S]| ≜ ddef.S \ L ;
input.|[var L; S]| ≜ input.S ; and
glob.|[var L; S]| ≜ (def.S \ L) ∪ input.S ; or
glob.|[var L; S]| ≜ glob.S \ (L \ input.S) .
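(For illustration, our example: with S = t := x; x := y; y := t, a swap via the local t, we get def.|[var t; S]| = {t,x,y} \ {t} = {x,y} and likewise ddef.|[var t; S]| = {x,y}; input.|[var t; S]| = input.S = {x,y}, so glob.|[var t; S]| = {x,y}, with the local t hidden from the outside.)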
RE1: "wp.|[var L; S]|" is universally disjunctive, provided wp.S is
wp.|[var L; S]|.(∃P: P ∈ Ps: P)
= {wp of locals: L ∩ glob.Ps = ∅}
wp.S.(∃P: P ∈ Ps: P)
= {proviso}
(∃P: P ∈ Ps: wp.S.P)
= {again, wp of locals}
(∃P: P ∈ Ps: wp.|[var L; S]|.P)
RE2: "glob.(wp.|[var L; S]|.P) ⊆ (glob.P \ ddef.|[var L; S]|) ∪ input.|[var L; S]|"
provided glob.P ∩ L = ∅ and glob.(wp.S.P) ⊆ (glob.P \ ddef.S) ∪ input.S
glob.(wp.|[var L; S]|.P)
= {wp of locals: L ∩ glob.P = ∅}
glob.(wp.S.P)
⊆ {proviso and property of '\'}
(glob.P \ ddef.S) ∪ input.S
= {ddef of locals and set theory: L ∩ glob.P = ∅; input of locals}
(glob.P \ ddef.|[var L; S]|) ∪ input.|[var L; S]|
RE3: "[wp.|[var L; S]|.P ≡ P ∧ wp.|[var L; S]|.true]" provided def.|[var L; S]| ∩ glob.P = ∅
wp.|[var L; S]|.P
= {wp of locals: L ∩ glob.P = ∅}
wp.S.P
= {proviso: (def.S \ L) ∩ glob.P = ∅ but L ∩ glob.P = ∅ so
def.S ∩ glob.P = ∅}
P ∧ wp.S.true
= {wp of locals: glob.true = ∅}
P ∧ wp.|[var L; S]|.true
RE4: "ddef.|[var L; S]| ⊆ def.|[var L; S]|" provided ddef.S ⊆ def.S: Indeed ddef.S \ L ⊆ def.S \ L due to the proviso and set theory.
RE5: "glob.|[var L; S]| = def.|[var L; S]| ∪ input.|[var L; S]|"
glob.|[var L; S]|
= {glob of locals}
(def.S \ L) ∪ input.S
= {def and input of locals}
def.|[var L; S]| ∪ input.|[var L; S]|
Live variables
Enclosing a statement with liveness information (e.g. S[live V]) guarantees only definitions
of the live variables V may be observable on exit from S. We define
S[live V] ≜ |[var L; S]| where L := def.S \ V.
Since this definition is by transformation (to another, existing language construct), there is
no need to prove any of the requirements, as they can be inferred. Similarly, there is no need to
define variable properties, as those can be calculated. Thus, the semantics and properties can be
derived from those of local variables, as is summarised in the following. For a given statement S,
set of variables V, a corresponding set L := def.S \ V and fresh L′, we have:
[wp.S[live V].P ≡ (wp.S.P[L\L′])[L′\L]] for all P with glob.P ∩ L′ = ∅; or the simpler
[wp.S[live V].Q ≡ wp.S.Q] for all Q with glob.Q ∩ (L,L′) = ∅
def.S[live V] ≜ def.S ∩ V ;
ddef.S[live V] ≜ ddef.S ∩ V ;
input.S[live V] ≜ input.S ; and
glob.S[live V] ≜ (def.S ∩ V) ∪ input.S .
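(For illustration, our example: with S = s := s+i; i := i+1 and V = {s}, we get L = def.S \ V = {i}, so S[live {s}] = |[var i; S]|; accordingly def.S[live {s}] = {s,i} ∩ {s} = {s}, while input.S[live {s}] = input.S = {s,i}: the final value of i is no longer observable, but its initial value is still read.)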
Appendix B
Laws of Program Manipulation
B.1 Manipulating core statements
Law 1. Let X,Y,E1,E2 be two sets of variables and two sets of expressions, respectively; then
X := E1; Y := E2 = X,Y := E1,E2
provided X ∩ (Y ∪ glob.E2) = ∅.
Proof.
wp.X:= E1; Y:= E2.P
= {wp of ';' and twice ':='}
(X:= E1).((Y:= E2).P)
= {merge subs: proviso}
(X,Y:= E1,E2).P
= {wp of ':='}
wp.X,Y:= E1,E2.P.
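(As a usage note, our example: x := 1; y := z merges into x,y := 1,z, since x ∉ {y} ∪ glob.z; by contrast, x := 1; y := x violates the proviso, as x ∈ glob.x, and indeed the sequential composition assigns 1 to y whereas the simultaneous assignment x,y := 1,x would use the initial value of x.)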
Law 2. Let S,X be a statement and set of variables, respectively; then
S = S; X := X.
Proof. We first observe for all P
wp.S; X:= X.P
= {wp of ';' and ':='}
wp.S.((X:= X).P)
= {remove redundant self sub.}
wp.S.P.
We next observe for all Q with glob.Q ∩ def.S = ∅ (i.e. L ∩ glob.Q = ∅)
wp.|[var L; S]|.Q
= {wp of locals: choose L′ ∉ glob.(L,S,Q)}
(L′:= L).(wp.S.((L:= L′).Q))
= {remove redundant sub.: L ∩ glob.Q = ∅ (proviso)}
(L′:= L).(wp.S.Q)
= {remove redundant sub.: glob.(wp.S.Q) ⊆ glob.(S,Q) due to RE2}
wp.S.Q.
Law 3. Let S,S1,S2,B be three statements and a boolean expression, respectively; then
S; if B then S1 else S2 = if B then S; S1 else S; S2
provided def.S ∩ glob.B = ∅.
Proof.
wp.S; if B then S1 else S2.P
= {wp of ';'}
wp.S.(wp.if B then S1 else S2.P)
= {wp of IF}
wp.S.((B ⇒ wp.S1.P) ∧ (¬B ⇒ wp.S2.P))
= {wp.S is fin. conj.}
wp.S.(B ⇒ wp.S1.P) ∧ wp.S.(¬B ⇒ wp.S2.P)
= {Lemma B.1, twice: proviso (see below)}
(B ⇒ wp.S.(wp.S1.P)) ∧ (¬B ⇒ wp.S.true)
∧ (¬B ⇒ wp.S.(wp.S2.P)) ∧ (B ⇒ wp.S.true)
= {pred. calc.}
(B ⇒ wp.S.(wp.S1.P) ∧ wp.S.true) ∧ (¬B ⇒ wp.S.true ∧ wp.S.(wp.S2.P))
= {absorb termination (3.14) and wp of ';', twice}
(B ⇒ wp.S;S1.P) ∧ (¬B ⇒ wp.S;S2.P)
= {wp of IF}
wp.if B then S; S1 else S; S2.P.
Lemma B.1. Let S,P,Q be a statement and two predicates, respectively, with def.S ∩ glob.P = ∅;
then
[wp.S.(P ⇒ Q) ≡ (P ⇒ wp.S.Q) ∧ (¬P ⇒ wp.S.true)] .
Proof.
wp.S.(P ⇒ Q)
= {pred. calc.; wp.S is fin. disj.}
wp.S.(¬P) ∨ wp.S.Q
= {RE3: proviso}
(¬P ∧ wp.S.true) ∨ wp.S.Q
= {pred. calc.}
(¬P ∨ wp.S.Q) ∧ (wp.S.true ∨ wp.S.Q)
= {pred. calc.; termination absorbs (3.15)}
(P ⇒ wp.S.Q) ∧ wp.S.true
= {pred. calc.}
(P ⇒ wp.S.Q) ∧ (P ⇒ wp.S.true) ∧ (¬P ⇒ wp.S.true)
= {pred. calc.}
(P ⇒ wp.S.Q ∧ wp.S.true) ∧ (¬P ⇒ wp.S.true)
= {absorb termination (3.14)}
(P ⇒ wp.S.Q) ∧ (¬P ⇒ wp.S.true).
Law 4. Let S1,S2,S3,B1 be three statements and a boolean expression, respectively; then
if B1 then S1 else S2; S3 = if B1 then S1; S3 else S2; S3.
Proof.
wp.if B1 then S1; S3 else S2; S3.P
= {wp of IF}
(B1 ⇒ wp.S1;S3.P) ∧ (¬B1 ⇒ wp.S2;S3.P)
= {wp of ';', twice}
(B1 ⇒ wp.S1.(wp.S3.P)) ∧ (¬B1 ⇒ wp.S2.(wp.S3.P))
= {wp of IF}
wp.if B1 then S1 else S2.(wp.S3.P)
= {wp of ';'}
wp.if B1 then S1 else S2; S3.P.
Law 5. Let S1,X,B,E be any statement, set of variables, boolean expression and set of
expressions, respectively; then
{X=E}; while B do S1; (X := E) od = {X=E}; while B do S1 od; (X := E)
provided X ∩ (glob.B ∪ input.S1 ∪ glob.E) = ∅.
Proof.
wp.{X=E}; while B do S1 od; (X:= E).P
= {wp of ';', twice}
wp.{X=E}.(wp.while B do S1 od.(wp.X:= E.P))
= {wp of assertions and ':='}
(X=E) ∧ wp.while B do S1 od.((X:= E).P)
= {wp of DO with [k.Q ≡ (B ∨ (X:= E).P) ∧ (¬B ∨ wp.S1.Q)]}
(X=E) ∧ (∃i: 0 ≤ i: k^i.false)
= {∧ distributes over ∃ (3.11)}
(∃i: 0 ≤ i: (X=E) ∧ k^i.false)
= {see below; [l.Q ≡ (B ∨ P) ∧ (¬B ∨ wp.S1;(X:= E).Q)]}
(∃i: 0 ≤ i: (X=E) ∧ l^i.false)
= {∧ distributes over ∃ (3.11)}
(X=E) ∧ (∃i: 0 ≤ i: l^i.false)
= {wp of DO with l as above}
(X=E) ∧ wp.while B do S1; (X:= E) od.P
= {wp of assertions and ';'}
wp.{X=E}; while B do S1; (X:= E) od.P.
We finish by proving for the middle step above, by induction, [(X=E) ∧ k^i.false ≡
(X=E) ∧ l^i.false] for all i, provided X ∩ (glob.B ∪ input.S1 ∪ glob.E) = ∅.
For the base case i = 0, we observe that indeed [(X=E) ∧ false ≡ (X=E) ∧ false] (recall the
definition of function iteration). Then, for the induction step, we assume [(X=E) ∧ k^i.false ≡
(X=E) ∧ l^i.false] and prove [(X=E) ∧ k^{i+1}.false ≡ (X=E) ∧ l^{i+1}.false].
(X=E) ∧ k^{i+1}.false
= {def. of func. it.}
(X=E) ∧ k.(k^i.false)
= {def. of k}
(X=E) ∧ (B ∨ (X:= E).P) ∧ (¬B ∨ wp.S1.(k^i.false))
= {replace equals with equals}
(X=E) ∧ (B ∨ (X:= X).P) ∧ (¬B ∨ wp.S1.(k^i.false))
= {remove redundant self-sub.}
(X=E) ∧ (B ∨ P) ∧ (¬B ∨ wp.S1.(k^i.false))
= {intro. redundant sub.: X ∩ glob.(k^i.false) = ∅ due to RE2 and
X ∩ (glob.B ∪ input.S1 ∪ glob.E) = ∅ (proviso)}
(X=E) ∧ (B ∨ P) ∧ (¬B ∨ wp.S1.((X:= E).(k^i.false)))
= {ind. hypo.}
(X=E) ∧ (B ∨ P) ∧ (¬B ∨ wp.S1.((X:= E).(l^i.false)))
= {wp of ';' and ':='}
(X=E) ∧ (B ∨ P) ∧ (¬B ∨ wp.S1;(X:= E).(l^i.false))
= {def. of l}
(X=E) ∧ l.(l^i.false)
= {def. of func. it.}
(X=E) ∧ l^{i+1}.false.
Law 6. Let X,E be any set of variables and set of expressions, respectively; then
{X=E} = {X=E}; X := E.
Proof.
wp.{X=E}; X:= E.P
= {wp of ';', assertions and assignments}
(X=E) ∧ ((X:= E).P)
= {remove redundant sub.}
(X=E) ∧ P
= {wp of assertions}
wp.{X=E}.P.
B.2 Assertion-based program analysis
B.2.1 Introduction of assertions
Law 7. Let X,Y,E1,E2 be two sets of variables and two sets of expressions, respectively; then
X,Y := E1,E2 = X,Y := E1,E2; {Y=E2}
provided (X,Y) ∩ glob.E2 = ∅.
Proof.
wp.X,Y:= E1,E2; {Y=E2}.P
= {wp of ';'}
wp.X,Y:= E1,E2.(wp.{Y=E2}.P)
= {wp of assertions}
wp.X,Y:= E1,E2.((Y=E2) ∧ P)
= {wp.X,Y:= E1,E2 is fin. conj.}
wp.X,Y:= E1,E2.(Y=E2) ∧ wp.X,Y:= E1,E2.P
= {wp of ':='}
(X,Y:= E1,E2).(Y=E2) ∧ wp.X,Y:= E1,E2.P
= {normal sub.: proviso}
(E2 = E2) ∧ wp.X,Y:= E1,E2.P
= {id. elem. of ∧}
wp.X,Y:= E1,E2.P.
Law 8. Let X,X′,E be (same length) lists of variables and expressions, respectively, with
X ∩ X′ = ∅; then
X,X′ := E,E = X,X′ := E,E; {X=X′}.
Proof.
wp.X,X′:= E,E; {X=X′}.P
= {wp of ';'}
wp.X,X′:= E,E.(wp.{X=X′}.P)
= {wp of assertions}
wp.X,X′:= E,E.((X=X′) ∧ P)
= {wp is fin. conj.}
wp.X,X′:= E,E.(X=X′) ∧ wp.X,X′:= E,E.P
= {wp of ':='}
(X,X′:= E,E).(X=X′) ∧ wp.X,X′:= E,E.P
= {normal sub.: proviso}
(E=E) ∧ wp.X,X′:= E,E.P
= {id. elem. of ∧}
wp.X,X′:= E,E.P.
Law 9. Let S1,B1,B2 be any given statement and two boolean expressions, respectively;
then
while B1 do S1 od = while B1 do {B2}; S1 od
provided [B1 ⇒ B2].
Proof. In order to prove that the two loop statements are equivalent, it suffices to show for all Q
[¬B1 ∨ wp.S1.Q ≡ ¬B1 ∨ wp.{B2}; S1.Q].
¬B1 ∨ wp.{B2}; S1.Q
= {wp of ';' and assertions}
¬B1 ∨ (B2 ∧ wp.S1.Q)
= {pred. calc.}
(¬B1 ∨ B2) ∧ (¬B1 ∨ wp.S1.Q)
= {proviso}
true ∧ (¬B1 ∨ wp.S1.Q)
= {id. elem.}
¬B1 ∨ wp.S1.Q.
B.2.2 Propagation of assertions
Law 10. Let S,B be a statement and boolean expression, respectively; then
{wp.S.B}; S = S; {B}.
Proof. We observe for all P:
wp.S; {B}.P
= {wp of ';'}
wp.S.(wp.{B}.P)
= {wp of assertions}
wp.S.(B ∧ P)
= {conj. of wp.S}
wp.S.B ∧ wp.S.P
= {wp of assertions}
wp.{wp.S.B}.(wp.S.P)
= {wp of ';'}
wp.{wp.S.B}; S.P.
Law 11. Let S,B be a statement and boolean expression, respectively; then
{B}; S = S; {B}
provided def.S ∩ glob.B = ∅.
Proof.
S; {B}
= {swap statements (Law 5.7): def of assertions is empty and
def.S ∩ glob.B = ∅ (proviso)}
{B}; S.
The following law will be used for propagating assertions forward into branches of an IF as
well as backward ahead of an IF.
Law 12. Let S1,S2,B1,B2 be two statements and two boolean expressions, respectively;
then
{B1}; if B2 then S1 else S2 = if B2 then {B1}; S1 else {B1}; S2.
Proof.
{B1}; if B2 then S1 else S2
= {Law 3: def.{B1} = ∅ for any assertion}
if B2 then {B1}; S1 else {B1}; S2.
The next law will allow the propagation of assertions forward to the (head of the) body of a
loop.
Law 13. Let S,B1,B2,B3,B4 be a statement and four boolean expressions, respectively;
then
{B1}; while B2 do S; {B3} od = {B1}; while B2 do {B4}; S; {B3} od
provided [B1 ⇒ B4] and [B3 ⇒ B4].
Proof.
{B1}; while B2 do S; {B3} od
= {Law 14: proviso}
{B1}; while B2 ∧ B4 do S; {B3} od
= {Law 9: [B2 ∧ B4 ⇒ B4]}
{B1}; while B2 ∧ B4 do {B4}; S; {B3} od
= {Law 14 again, right to left: proviso}
{B1}; while B2 do {B4}; S; {B3} od.
Law 14. Let S,B1,B2,B3,B4 be a statement and four boolean expressions, respectively;
then
{B1}; while B2 do S; {B3} od = {B1}; while B2 ∧ B4 do S; {B3} od
provided [B1 ⇒ B4] and [B3 ⇒ B4].
Proof.
{B1}; while B2 do S; {B3} od
= {proviso and pred. calc.}
{B1 ∧ B4}; while B2 do S; {B3 ∧ B4} od
= {Law 15}
{B1}; {B4}; while B2 do S; {B3}; {B4} od
= {see below}
{B1}; {B4}; while B2 ∧ B4 do S; {B3}; {B4} od
= {Law 15}
{B1 ∧ B4}; while B2 ∧ B4 do S; {B3 ∧ B4} od
= {proviso and pred. calc.}
{B1}; while B2 ∧ B4 do S; {B3} od.
So, we are left to prove the middle step above, simplified to
{B4}; while B2 do S1; {B4} od = {B4}; while B2 ∧ B4 do S1; {B4} od.
wp.{B4}; while B2 do S1; {B4} od.P
= {wp of ';' and assertions}
B4 ∧ wp.while B2 do S1; {B4} od.P
= {wp of DO with [k.Q ≡ (B2 ∨ P) ∧ (¬B2 ∨ wp.S1;{B4}.Q)]}
B4 ∧ (∃i: 0 ≤ i: k^i.false)
= {∧ distributes over ∃ (3.11)}
(∃i: 0 ≤ i: B4 ∧ k^i.false)
= {see below; [l.Q ≡ ((B2 ∧ B4) ∨ P) ∧ (¬(B2 ∧ B4) ∨ wp.S1;{B4}.Q)]}
(∃i: 0 ≤ i: B4 ∧ l^i.false)
= {∧ distributes over ∃ (3.11)}
B4 ∧ (∃i: 0 ≤ i: l^i.false)
= {wp of DO with l as above}
B4 ∧ wp.while B2 ∧ B4 do S1; {B4} od.P
= {wp of ';' and assertions}
wp.{B4}; while B2 ∧ B4 do S1; {B4} od.P.
We complete the proof by showing for the middle step above [B4 ∧ k.Q ≡ B4 ∧ l.Q] for all Q.
B4 ∧ l.Q
= {def. of l}
B4 ∧ ((B2 ∧ B4) ∨ P) ∧ (¬(B2 ∧ B4) ∨ wp.S1;{B4}.Q)
= {pred. calc.: [A ∧ ((A ∧ B) ∨ C) ≡ A ∧ (B ∨ C)]}
B4 ∧ (B2 ∨ P) ∧ (¬(B2 ∧ B4) ∨ wp.S1;{B4}.Q)
= {pred. calc.: de Morgan}
B4 ∧ (B2 ∨ P) ∧ (¬B2 ∨ ¬B4 ∨ wp.S1;{B4}.Q)
= {pred. calc.: [A ∧ (¬A ∨ B) ≡ A ∧ B]}
B4 ∧ (B2 ∨ P) ∧ (¬B2 ∨ wp.S1;{B4}.Q)
= {def. of k}
B4 ∧ k.Q.
Law 15. Let B1,B2 be two boolean expressions; then
{B1 ∧ B2} = {B1}; {B2}.
Proof.
wp.{B1}; {B2}.P
= {wp of ';' and assertions}
B1 ∧ wp.{B2}.P
= {wp of assertions}
B1 ∧ B2 ∧ P
= {wp of assertions}
wp.{B1 ∧ B2}.P.
Law 16. Let S,B1,B2 be a statement and two boolean expressions, respectively; then
{B1}; while B2 do S od = {B1}; while B2 do {B1}; S od
provided glob.B1 ∩ def.S = ∅.
Proof.
wp.{B1}; while B2 do {B1}; S od.P
= {wp of ';' and assertions}
B1 ∧ wp.while B2 do {B1}; S od.P
= {Law 11: glob.B1 ∩ def.S = ∅ (proviso)}
B1 ∧ wp.while B2 do S; {B1} od.P
= {wp of DO with [k.Q ≡ (B2 ∨ P) ∧ (¬B2 ∨ wp.S;{B1}.Q)]}
B1 ∧ (∃i: 0 ≤ i: k^i.false)
= {∧ distributes over ∃ (3.11)}
(∃i: 0 ≤ i: B1 ∧ k^i.false)
= {see below; [l.Q ≡ (B2 ∨ P) ∧ (¬B2 ∨ wp.S.Q)]}
(∃i: 0 ≤ i: B1 ∧ l^i.false)
= {∧ distributes over ∃ (3.11)}
B1 ∧ (∃i: 0 ≤ i: l^i.false)
= {wp of DO with l as above}
B1 ∧ wp.while B2 do S od.P
= {wp of assertions and ';'}
wp.{B1}; while B2 do S od.P.
We finish by proving for the middle step above, by induction, [B1 ∧ k^i.false ≡ B1 ∧
l^i.false] for all i, provided glob.B1 ∩ def.S = ∅.
For the base case i = 0, we observe that indeed [B1 ∧ false ≡ B1 ∧ false] (recall the definition
of function iteration). Then, for the induction step, we assume [B1 ∧ k^i.false ≡ B1 ∧ l^i.false] and
prove [B1 ∧ k^{i+1}.false ≡ B1 ∧ l^{i+1}.false].
B1 ∧ k^{i+1}.false
= {def. of func. it.}
B1 ∧ k.(k^i.false)
= {def. of k}
B1 ∧ (B2 ∨ P) ∧ (¬B2 ∨ wp.S;{B1}.(k^i.false))
= {wp of ';' and assertions}
B1 ∧ (B2 ∨ P) ∧ (¬B2 ∨ wp.S.(B1 ∧ k^i.false))
= {ind. hypo.}
B1 ∧ (B2 ∨ P) ∧ (¬B2 ∨ wp.S.(B1 ∧ l^i.false))
= {wp.S is fin. conj.}
B1 ∧ (B2 ∨ P) ∧ (¬B2 ∨ (wp.S.B1 ∧ wp.S.(l^i.false)))
= {RE3: proviso}
B1 ∧ (B2 ∨ P) ∧ (¬B2 ∨ (B1 ∧ wp.S.true ∧ wp.S.(l^i.false)))
= {absorb termination (3.14)}
B1 ∧ (B2 ∨ P) ∧ (¬B2 ∨ (B1 ∧ wp.S.(l^i.false)))
= {pred. calc.}
B1 ∧ (B2 ∨ P) ∧ (¬B2 ∨ B1) ∧ (¬B2 ∨ wp.S.(l^i.false))
= {pred. calc.: absorption}
B1 ∧ (B2 ∨ P) ∧ (¬B2 ∨ wp.S.(l^i.false))
= {def. of l}
B1 ∧ l.(l^i.false)
= {def. of func. it.}
B1 ∧ l^{i+1}.false.
B.2.3 Substitution
Law 17. Let S1,S2,B be two statements and a boolean expression, respectively; let X,E be
a set of variables and a corresponding list of expressions; and let Y,Y′ be two sets of variables;
then
{Y=Y′}; X := E = {Y=Y′}; X := E[Y\Y′] ;
{Y=Y′}; IF = {Y=Y′}; IF′ ; and
{Y=Y′}; DO = {Y=Y′}; DO′
where IF := if B then S1 else S2,
IF′ := if B[Y\Y′] then S1 else S2,
DO := while B do S1; {Y=Y′} od
and DO′ := while B[Y\Y′] do S1; {Y=Y′} od.
Proof. Assignment:
wp.{Y=Y′}; X:= E[Y\Y′].P
= {wp of ';' and assertions}
(Y=Y′) ∧ wp.X:= E[Y\Y′].P
= {wp of ':='}
(Y=Y′) ∧ (X:= E[Y\Y′]).P
= {replace equals for equals}
(Y=Y′) ∧ (X:= E[Y\Y]).P
= {redundant self sub.}
(Y=Y′) ∧ (X:= E).P
= {wp of ':='}
(Y=Y′) ∧ wp.X:= E.P
= {wp of assertions and ';'}
wp.{Y=Y′}; X:= E.P.
IF:
wp.{Y=Y′}; if B[Y\Y′] then S1 else S2.P
= {wp of ';' and assertions}
(Y=Y′) ∧ wp.if B[Y\Y′] then S1 else S2.P
= {wp of IF}
(Y=Y′) ∧ (B[Y\Y′] ⇒ wp.S1.P) ∧ (¬B[Y\Y′] ⇒ wp.S2.P)
= {replace equals for equals}
(Y=Y′) ∧ (B[Y\Y] ⇒ wp.S1.P) ∧ (¬B[Y\Y] ⇒ wp.S2.P)
= {redundant self sub., twice}
(Y=Y′) ∧ (B ⇒ wp.S1.P) ∧ (¬B ⇒ wp.S2.P)
= {wp of IF}
(Y=Y′) ∧ wp.if B then S1 else S2.P
= {wp of assertions and ';'}
wp.{Y=Y′}; if B then S1 else S2.P.
DO:
wp.{Y=Y′}; while B do S1; {Y=Y′} od.P
= {wp of ';' and assertions}
(Y=Y′) ∧ wp.while B do S1; {Y=Y′} od.P
= {wp of DO with [k.Q ≡ (B ∨ P) ∧ (¬B ∨ wp.S1;{Y=Y′}.Q)]}
(Y=Y′) ∧ (∃i: 0 ≤ i: k^i.false)
= {∧ distributes over ∃ (3.11)}
(∃i: 0 ≤ i: (Y=Y′) ∧ k^i.false)
= {see below; [l.Q ≡ (B′ ∨ P) ∧ (¬B′ ∨ wp.S1;{Y=Y′}.Q)] with
B′ ≜ B[Y\Y′]}
(∃i: 0 ≤ i: (Y=Y′) ∧ l^i.false)
= {∧ distributes over ∃ (3.11)}
(Y=Y′) ∧ (∃i: 0 ≤ i: l^i.false)
= {wp of DO with l as above}
(Y=Y′) ∧ wp.while B[Y\Y′] do S1; {Y=Y′} od.P
= {wp of assertions and ';'}
wp.{Y=Y′}; while B[Y\Y′] do S1; {Y=Y′} od.P.
We finish by proving for the middle step above, by induction, [(Y=Y′) ∧ k^i.false ≡
(Y=Y′) ∧ l^i.false] for all i.
For the base case i = 0, we observe that indeed [(Y=Y′) ∧ false ≡ (Y=Y′) ∧ false] (recall the
definition of function iteration). Then, for the induction step, we assume [(Y=Y′) ∧ k^i.false ≡
(Y=Y′) ∧ l^i.false] and prove [(Y=Y′) ∧ k^{i+1}.false ≡ (Y=Y′) ∧ l^{i+1}.false].
(Y=Y′) ∧ l^{i+1}.false
= {def. of func. it.}
(Y=Y′) ∧ l.(l^i.false)
= {def. of l}
(Y=Y′) ∧ (B[Y\Y′] ∨ P) ∧ (¬B[Y\Y′] ∨ wp.S1;{Y=Y′}.(l^i.false))
= {replace equals for equals}
(Y=Y′) ∧ (B[Y\Y] ∨ P) ∧ (¬B[Y\Y] ∨ wp.S1;{Y=Y′}.(l^i.false))
= {redundant self sub., twice}
(Y=Y′) ∧ (B ∨ P) ∧ (¬B ∨ wp.S1;{Y=Y′}.(l^i.false))
= {wp of ';' and assertions}
(Y=Y′) ∧ (B ∨ P) ∧ (¬B ∨ wp.S1.((Y=Y′) ∧ l^i.false))
= {ind. hypo.}
(Y=Y′) ∧ (B ∨ P) ∧ (¬B ∨ wp.S1.((Y=Y′) ∧ k^i.false))
= {wp of ';' and assertions}
(Y=Y′) ∧ (B ∨ P) ∧ (¬B ∨ wp.S1;{Y=Y′}.(k^i.false))
= {def. of k}
(Y=Y′) ∧ k.(k^i.false)
= {def. of func. it.}
(Y=Y′) ∧ k^{i+1}.false.
Law 18. Let S1,S2,B be two statements and a boolean expression, respectively; let
X,X′,Y,Z,E1,E1′,E2,E3 be four lists of variables and corresponding lists of expressions; then
X,Y := E1,E2; Z := E3 = X,Y := E1,E2; Z := E3[Y\E2] ;
X,Y := E1,E2; IF = X,Y := E1,E2; IF′ ; and
X,Y := E1,E2; DO = X,Y := E1,E2; DO′
provided ((X,X′),Y) ∩ glob.E2 = ∅
where IF := if B then S1 else S2,
IF′ := if B[Y\E2] then S1 else S2,
DO := while B do S1; X′,Y := E1′,E2 od
and DO′ := while B[Y\E2] do S1; X′,Y := E1′,E2 od.
Proof.
X,Y:= E1,E2; Z:= E3
= {intro. following assertion (Law 7): (X,Y) ∩ glob.E2 = ∅ (proviso)}
X,Y:= E1,E2; {Y=E2}; Z:= E3
= {assertion-based sub. (Law 17) with Y′ := E2}
X,Y:= E1,E2; {Y=E2}; Z:= E3[Y\E2]
= {remove following assertion (Law 7)}
X,Y:= E1,E2; Z:= E3[Y\E2].
X,Y:= E1,E2; if B then S1 else S2
= {intro. following assertion (Law 7): (X,Y) ∩ glob.E2 = ∅ (proviso)}
X,Y:= E1,E2; {Y=E2}; if B then S1 else S2
= {assertion-based sub. (Law 17) with Y′ := E2}
X,Y:= E1,E2; {Y=E2}; if B[Y\E2] then S1 else S2
= {remove following assertion (Law 7)}
X,Y:= E1,E2; if B[Y\E2] then S1 else S2.
X,Y:= E1,E2; while B do S1; X′,Y := E1′,E2 od
= {intro. following assertion (Law 7), twice:
(X,Y) ∩ glob.E2 = ∅ and (X′,Y) ∩ glob.E2 = ∅ (proviso)}
X,Y:= E1,E2; {Y=E2}; while B do S1; X′,Y := E1′,E2; {Y=E2} od
= {assertion-based sub. (Law 17) with Y′ := E2}
X,Y:= E1,E2; {Y=E2};
while B[Y\E2] do S1; X′,Y := E1′,E2; {Y=E2} od
= {remove following assertion (Law 7), twice}
X,Y:= E1,E2; while B[Y\E2] do S1; X′,Y := E1′,E2 od.
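(As a usage note, our example: the first equivalence rewrites x,y := x+1,5; z := y+1 into x,y := x+1,5; z := 5+1, substituting the just-assigned constant for y; the proviso fails, however, for x,y := x+1,x; z := y+1, where glob.E2 = {x} meets the assignment's targets, and the substitution z := x+1 would wrongly pick up the new value of x.)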
B.3 Live variables analysis
B.3.1 Introduction and removal of liveness information
Law 19. Let S,V be any statement and set of variables, respectively, with def.S ⊆ V; then
S = S[live V].
Proof. We observe for all Q
wp.S[live V].Q
= {def. of live with coV := def.S \ V}
wp.|[var coV; S]|.Q
= {wp of locals: coV = ∅ due to proviso}
wp.S.Q.
B.3.2 Propagation of liveness information
Law 20. Let S1,S2,V1,V2 be any two statements and two sets of variables, respectively; then
(S1; S2)[live V1] = (S1[live V2]; S2[live V1])[live V1]
provided V2 = (V1 \ ddef.S2) ∪ input.S2.
Proof. We observe for all P (with glob.P ⊆ V1)
wp.(S1[live V2]; S2[live V1])[live V1].P
= {wp of live: glob.P ⊆ V1}
wp.S1[live V2]; S2[live V1].P
= {wp of ';'}
wp.S1[live V2].(wp.S2[live V1].P)
= {wp of live: glob.P ⊆ V1}
wp.S1[live V2].(wp.S2.P)
= {wp of live: glob.(wp.S2.P) ⊆ (V1 \ ddef.S2) ∪ input.S2
due to RE2 and the proviso}
wp.S1.(wp.S2.P)
= {wp of ';'}
wp.S1;S2.P
= {wp of live: glob.P ⊆ V1}
wp.(S1;S2)[live V1].P.
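(As a usage note, our example: with S2 = a := b and V1 = {a}, the proviso yields V2 = ({a} \ {a}) ∪ {b} = {b}, so in (S1; a := b)[live {a}] the first statement may be simplified under S1[live {b}]: only b need survive S1.)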
Law 21. Let B,S1,S2,V be any boolean expression, two statements and set of variables,
respectively; then
(if B then S1 else S2)[live V] = (if B then S1[live V] else S2[live V])[live V].
Proof. We observe for all P with glob.P ⊆ V
wp.(if B then S1[live V] else S2[live V])[live V].P
= {wp of live: glob.P ⊆ V}
wp.if B then S1[live V] else S2[live V].P
= {wp of IF}
(B ⇒ wp.S1[live V].P) ∧ (¬B ⇒ wp.S2[live V].P)
= {wp of live, twice: glob.P ⊆ V}
(B ⇒ wp.S1.P) ∧ (¬B ⇒ wp.S2.P)
= {wp of IF}
wp.if B then S1 else S2.P
= {wp of live: glob.P ⊆ V}
wp.(if B then S1 else S2)[live V].P.
Law 22. Let B,S,V1,V2 be any boolean expression, statement and two sets of variables,
respectively; then
(while B do S od)[live V1] = (while B do S[live V2] od)[live V1]
provided V2 = V1 ∪ glob.B ∪ input.S.
Proof. We observe for all P with glob.P ⊆ V1
wp.(while B do S[live V2] od)[live V1].P
= {wp of live: glob.P ⊆ V1}
wp.while B do S[live V2] od.P
= {wp of DO with [k.Q ≡ (B ∨ P) ∧ (¬B ∨ wp.S[live V2].Q)]}
(∃i: 0 ≤ i: k^i.false)
= {see below; [l.Q ≡ (B ∨ P) ∧ (¬B ∨ wp.S.Q)]}
(∃i: 0 ≤ i: l^i.false)
= {wp of DO with l as above}
wp.while B do S od.P
= {wp of live: glob.P ⊆ V1}
wp.(while B do S od)[live V1].P.
We go on by proving for the missing step above [k^i.false ≡ l^i.false] for all i. We begin
by observing [wp.S.Q ≡ wp.S[live V2].Q] for all Q with glob.Q ⊆ V2, due to wp of live.
And indeed glob.((B ∨ P) ∧ (¬B ∨ wp.S.Q)) ⊆ V2 for all such Q. This is due to the proviso
V2 = V1 ∪ glob.B ∪ input.S, the given glob.P ⊆ V1, and RE2.
B.3.3 Dead assignments: introduction and elimination
Law 23. Let S,V,X,Y,E1,E2 be any statement, three sets of variables and two sets of
expressions, respectively; then
(S; X := E1)[live V] = (S; X,Y := E1,E2)[live V]
provided Y ∩ (X ∪ V) = ∅.
Proof. We observe for all P with glob.P ⊆ V
wp.S; X,Y:= E1,E2.P
= {wp of ';' and ':='}
wp.S.(P[X,Y\E1,E2])
= {remove redundant sub.: Y ∩ glob.P = ∅}
wp.S.(P[X\E1])
= {wp of ';' and ':='}
wp.S; X:= E1.P.
Law 24. Let S,V,Y,E be any statement, two sets of variables and set of expressions,
respectively; then
S[live V] = (S; Y := E)[live V]
provided Y ∩ V = ∅.
Proof.
(S; Y:= E)[live V]
= {Law 23 with X := ∅}
S[live V].
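(As a usage note, our example: since y ∉ {x}, Law 24 gives (x := 1)[live {x}] = (x := 1; y := 2)[live {x}]; read right-to-left, this is dead-assignment elimination, removing the assignment to the non-live y.)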
Law 25. Let S,V,X,Y,E1,E2 be any statement, three sets of variables and two sets of
expressions, respectively; then
(X := E1; S)[live V] = (X,Y := E1,E2; S)[live V]
provided Y ∩ (X ∪ (V \ ddef.S) ∪ input.S) = ∅.
Proof. We observe for all P with glob.P ⊆ V
wp.X,Y:= E1,E2; S.P
= {wp of ';' and ':='}
(wp.S.P)[X,Y\E1,E2]
= {remove redundant sub.: Y ∩ glob.(wp.S.P) = ∅ due to proviso and RE2}
(wp.S.P)[X\E1]
= {wp of ':=' and ';'}
wp.X:= E1; S.P.
Law 26. Let B,S1,S2,Y,V,E be a boolean expression, two statements, two sets of variables
and a set of expressions, respectively; then
(S1; while B do S2 od)[live V] = (S1; while B do S2; (Y := E) od)[live V]
provided Y ∩ (V ∪ glob.B ∪ input.S2) = ∅.
Proof. We observe for all P with glob.P ⊆ V
wp.S1; while B do S2; (Y:= E) od.P
= {wp of ';'}
wp.S1.(wp.while B do S2; (Y:= E) od.P)
= {wp of DO with [k.Q ≡ (B ∨ P) ∧ (¬B ∨ wp.S2;(Y:= E).Q)]}
wp.S1.(∃i: 0 ≤ i: k^i.false)
= {see below; [l.Q ≡ (B ∨ P) ∧ (¬B ∨ wp.S2.Q)]}
wp.S1.(∃i: 0 ≤ i: l^i.false)
= {wp of DO with l as above}
wp.S1.(wp.while B do S2 od.P)
= {wp of ';'}
wp.S1; while B do S2 od.P.
We go on by proving for the middle step above [k^i.false ≡ l^i.false] for all i. Keeping in mind
(glob.(P,B) ∪ input.S2) ∩ Y = ∅, we begin by observing glob.(wp.S2.Q) ∩ Y = ∅ for all Q with glob.Q ∩ Y = ∅.
Thus glob.(k^i.false) ∩ Y = ∅ for all i. We now complete the proof by showing [wp.S2;(Y:=
E).Q ≡ wp.S2.Q] for all such Q.
wp.S2;(Y:= E).Q
= {wp of ';' and assignment}
wp.S2.(Q[Y\E])
= {remove redundant sub.: proviso}
wp.S2.Q.
Appendix C
Properties of Slides
Lemma C.1. Let S be any core statement and let V1,V2 be two sets of variables with V2 ∩
(V1 ∪ def.S) = ∅; then
(slides.S.V1) = (slides.S.(V1,V2)) .
Proof. We prove the equivalence by induction on the structure of S.
First, when V1 ∩ def.S = ∅ we get
slides.S.V1
= {def. of slides}
skip
= {def. of slides: (V1,V2) ∩ def.S = ∅}
slides.S.(V1,V2).
In the remaining cases we can assume V1 ∩ def.S ≠ ∅.
S = X:= E:
slides.X:= E.(V1,V2)
= {slides of ':=': X1 := X ∩ (V1,V2);
let E1 be the subset of E corresponding to X1}
X1 := E1
= {slides of ':=': X1 = X ∩ V1 due to V2 ∩ X = ∅}
slides.X:= E.V1.
S = S1;S2:
slides.S1;S2.(V1,V2)
= {slides of ';'}
(slides.S1.(V1,V2)); (slides.S2.(V1,V2))
= {ind. hypo., twice: def.S1 ⊆ def.S1;S2; similarly for S2}
(slides.S1.V1); (slides.S2.V1)
= {slides of ';'}
slides.S1;S2.V1.
S = if B then S1 else S2:
slides.if B then S1 else S2.(V1,V2)
= {slides of IF}
if B then slides.S1.(V1,V2) else slides.S2.(V1,V2)
= {ind. hypo., twice: def.S1 ⊆ def.IF; similarly for S2}
if B then slides.S1.V1 else slides.S2.V1
= {slides of IF}
slides.if B then S1 else S2.V1.
S = while B do S1 od:
slides.while B do S1 od.(V1,V2)
= {slides of DO}
while B do (slides.S1.(V1,V2)) od
= {ind. hypo.}
while B do (slides.S1.V1) od
= {slides of DO: def.S1 = def.DO}
slides.while B do S1 od.V1.
Theorem 8.1 (Slides Distribute over Union). Any pair of slides of a single core
statement, slides.S.V1 and slides.S.V2, is unifiable. Furthermore, we have
(slides.S.(V1 ∪ V2)) = ((slides.S.V1) ∪ (slides.S.V2)) .
Proof. We prove the equivalence by induction on the structure of S.
First, when V1 ∩ def.S = ∅ we get
slides.S.V1 ∪ slides.S.V2
= {def. of slides when def.S ∩ V1 = ∅}
skip ∪ slides.S.V2
= {def. of ∪}
slides.S.V2
= {Lemma C.1: (V1 \ V2) ∩ def.S = ∅}
slides.S.(V1 ∪ V2).
A similar derivation proves the case of V2 ∩ def.S = ∅. Thus in the remaining cases we are left to
assume both V1 ∩ def.S ≠ ∅ and V2 ∩ def.S ≠ ∅.
S = X:= E:
slides.X:= E.V1 ∪ slides.X:= E.V2
= {slides of ':=', twice: X1 := X ∩ V1, X2 := X ∩ V2 and
E1,E2 are the corresponding subsets of E}
X1 := E1 ∪ X2 := E2
= {def. of ∪ for assignments: X12 := X1 ∪ X2 and
E12, the corresponding union of E1 and E2, is well defined:
any X.i in X12 that is both in X1 and X2 is also in X and so both
E1.i and E2.i are the same (original) E.i}
X12 := E12
= {slides of ':=' and set theory: (X ∩ V1) ∪ (X ∩ V2) = X ∩ (V1 ∪ V2)}
slides.X:= E.(V1 ∪ V2).
S = S1;S2:
slides.S1;S2.V1 ∪ slides.S1;S2.V2
= {slides of ';', twice}
(slides.S1.V1); (slides.S2.V1)
∪ (slides.S1.V2); (slides.S2.V2)
= {def. of ∪ for ';'}
((slides.S1.V1) ∪ (slides.S1.V2)); ((slides.S2.V1) ∪ (slides.S2.V2))
= {ind. hypo., twice}
(slides.S1.(V1 ∪ V2)); (slides.S2.(V1 ∪ V2))
= {slides of ';'}
slides.S1;S2.(V1 ∪ V2).
S = if B then S1 else S2:
slides.if B then S1 else S2.V1 ∪ slides.if B then S1 else S2.V2
= {slides of IF, twice}
if B then slides.S1.V1 else slides.S2.V1 ∪ if B then slides.S1.V2 else slides.S2.V2
= {def. of ∪ for IF}
if B then ((slides.S1.V1) ∪ (slides.S1.V2)) else ((slides.S2.V1) ∪ (slides.S2.V2))
= {ind. hypo., twice}
if B then slides.S1.(V1 ∪ V2) else slides.S2.(V1 ∪ V2)
= {slides of IF}
slides.if B then S1 else S2.(V1 ∪ V2).
S = while B do S1 od:
slides.while B do S1 od.V1 ∪ slides.while B do S1 od.V2
= {slides of DO, twice}
while B do (slides.S1.V1) od ∪ while B do (slides.S1.V2) od
= {def. of ∪ for DO}
while B do ((slides.S1.V1) ∪ (slides.S1.V2)) od
= {ind. hypo.}
while B do (slides.S1.(V1 ∪ V2)) od
= {slides of DO}
slides.while B do S1 od.(V1 ∪ V2).
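(For illustration, our example: with S = x,y,z := 1,2,3 we have slides.S.{x} = (x := 1) and slides.S.{y} = (y := 2), whose union x,y := 1,2 is exactly slides.S.{x,y}, as the theorem promises; the unifiability part guarantees that two slides never disagree on a shared assignment, since both inherit it from S.)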
Lemma 9.5. Let S,V be any core statement and set of variables, respectively; then
def.(slides.S.V) ⊆ def.S.
Proof. The proof is by induction on the structure of S.
First, when V ∩ def.S = ∅ we get
def.(slides.S.V)
= {def. of slides: V ∩ def.S = ∅}
def.skip
= {def of skip}
∅
⊆ {set theory}
def.S.
In the remaining cases we can assume V ∩ def.S ≠ ∅.
S = X:= E:
def.(slides.X:= E.V)
= {slides of ':=': X1 := X ∩ V;
let E1 be the subset of E corresponding to X1}
def.X1 := E1
= {def of assignments}
X1
⊆ {def. of X1 and set theory}
X
= {def of assignments}
def.X:= E.
S = S1;S2:
def.(slides.S1;S2.V)
= {slides of ';'}
def.(slides.S1.V); (slides.S2.V)
= {def of ';'}
def.(slides.S1.V) ∪ def.(slides.S2.V)
⊆ {ind. hypo., twice}
def.S1 ∪ def.S2
= {def of ';'}
def.S1;S2.
S = if B then S1 else S2:
def.(slides.if B then S1 else S2.V)
= {slides of IF}
def.if B then slides.S1.V else slides.S2.V
= {def of IF}
def.(slides.S1.V) ∪ def.(slides.S2.V)
⊆ {ind. hypo., twice}
def.S1 ∪ def.S2
= {def of IF}
def.if B then S1 else S2.
S = while B do S1 od:
def.(slides.while B do S1 od.V)
= {slides of DO}
def.while B do slides.S1.V od
= {def of DO}
def.(slides.S1.V)
⊆ {ind. hypo.}
def.S1
= {def of DO}
def.while B do S1 od.
C.1 Lemmata for proving independent slides yield slices
Lemma 9.3. Let S be any core statement; let T be any slip of S and let V be any set of
variables; then
glob.(slides.T.V) ⊆ glob.(slides.S.V).
Proof. The proof is by induction over the structure of S.
First, when V ∩ def.T = ∅ we have slides.T.V = skip and the inclusion is trivial. Hence, in the
remaining cases we shall assume V ∩ def.T ≠ ∅ and the implied V ∩ def.S ≠ ∅, due to def.T ⊆ def.S
(Lemma 9.4).
S = X:= E: Here, T must be X:= E, and the inclusion is trivial.
This will be the case whenever T is S itself. Hence, in the remaining cases we shall assume T
is a proper slip of S.
S = S1;S2:
glob.(slides.S1;S2.V)
= {slides of ';'}
glob.((slides.S1.V); (slides.S2.V))
= {glob of ';'}
glob.(slides.S1.V) ∪ glob.(slides.S2.V)
⊇ {T must be a slip of either S1 or S2, to which the ind. hypo. applies}
glob.(slides.T.V).
S = if B then S1 else S2:
glob.(slides.if B then S1 else S2.V)
= {slides of IF}
glob.if B then slides.S1.V else slides.S2.V
= {glob of IF}
glob.B ∪ glob.(slides.S1.V) ∪ glob.(slides.S2.V)
⊇ {T must be a slip of either S1 or S2, to which the ind. hypo. applies}
glob.(slides.T.V).
S = while B do S1 od:
glob.(slides.while B do S1 od.V)
= {slides of DO}
glob.while B do slides.S1.V od
= {glob of DO}
glob.B ∪ glob.(slides.S1.V)
⊇ {T must be a slip of S1 and the ind. hypo. applies}
glob.(slides.T.V).
Lemma 9.4. Let S be a core statement; let T be any slip of S; then
def.T ⊆ def.S.
Proof. The proof is by induction over the structure of S.
S = X:= E: Here, T must be X:= E, and the inclusion is trivial.
S = S1;S2:
def.S1;S2
= {def of ';'}
def.S1 ∪ def.S2.
If T is either S, S1 or S2, the inclusion is trivial. Otherwise, it must be a slip of either S1 or
S2, and thus def.T must be included in either def.S1 or def.S2, respectively, due to the induction
hypothesis.
S = if B then S1 else S2:
def.if B then S1 else S2
= {def of IF}
def.S1 ∪ def.S2.
If T is either S, S1 or S2, the inclusion is trivial. Otherwise, it must be a slip of either S1 or
S2, and thus def.T must be included in either def.S1 or def.S2, respectively, due to the induction
hypothesis.
S = while B do S1 od:
def.while B do S1 od
= {def of DO}
def.S1.
If T is either S or S1, the inclusion is trivial. Otherwise, it must be a slip of S1, and thus def.T
must be included in def.S1 due to the induction hypothesis.
C.2 Slide independence and liveness
Theorem C.2. At any program point in a slide-independent statement, any variable of the
slide-independent set, or one that was not defined in the original statement, is live only if it
was live in the corresponding point of the original statement. That is, let S,XI,Y,LV be
any statement and three sets of variables, respectively, with XI slide independent in slides.S (i.e.
glob.(slides.S.XI) ∩ def.S ⊆ XI) and LV live-on-exit; let LV′ be the set of live variables at a certain
point in (slides.S.XI)[live LV] and let LV″ be the set of live variables at the corresponding point
of S[live LV]; then (LV′ \ Y) ⊆ (LV″ \ Y) provided def.S \ XI ⊆ Y.
Proof. We prove by induction over the structure of slides.S.XI a variation, stating that provided
LV1,LV2 are the live variables on exit from slides.S.XI and S, respectively, with (LV1 \ Y) ⊆
(LV2 \ Y), we also get (LV1′ \ Y) ⊆ (LV2′ \ Y) for the sets of live variables LV1′,LV2′ at any
corresponding points in (slides.S.XI)[live LV1] and S[live LV2].
First, for live-on-entry variables LV1′ := (LV1 \ ddef.(slides.S.XI)) ∪ input.(slides.S.XI) and
LV2′ := (LV2 \ ddef.S) ∪ input.S, we observe
LV1′ \ Y
= {def. of LV1′}
((LV1 \ ddef.(slides.S.XI)) ∪ input.(slides.S.XI)) \ Y
= {set theory}
((LV1 \ ((LV1 \ Y) ∩ ddef.(slides.S.XI))) ∪ input.(slides.S.XI)) \ Y
= {Lemma C.4, see below: (LV1 \ Y) ∩ def.S ⊆ XI}
((LV1 \ ((LV1 \ Y) ∩ ddef.S)) ∪ input.(slides.S.XI)) \ Y
= {set theory}
((LV1 \ ddef.S) ∪ input.(slides.S.XI)) \ Y
= {set theory: RE5 and glob.(slides.S.XI) ⊆ glob.S}
((LV1 \ ddef.S) ∪ ((glob.S \ Y) ∩ input.(slides.S.XI))) \ Y
⊆ {Lemma C.5, see below: (glob.S \ Y) ∩ def.S ⊆ XI}
((LV1 \ ddef.S) ∪ ((glob.S \ Y) ∩ input.S)) \ Y
= {set theory: input.S ⊆ glob.S}
((LV1 \ ddef.S) ∪ input.S) \ Y
⊆ {set theory: proviso (LV1 \ Y) ⊆ (LV2 \ Y)}
((LV2 \ ddef.S) ∪ input.S) \ Y
= {def. of LV2′}
LV2′ \ Y.
For internal points of slides.S.XI, we need to examine sequential composition, IF statements
and DO loops, assuming XI ∩ def.S ≠ ∅ (since otherwise we get slides.S.XI = skip, with no
internal points).
Recall that (due to Lemma 9.2) we know that XI, being slide independent in slides.S, is also
slide ind. in slides.T, for any slip T of S. Furthermore, since def.T ⊆ def.S for any slip T of
S, the proviso def.S \ XI ⊆ Y implies def.T \ XI ⊆ Y. Thus, the induction hypothesis can
be correctly applied to any slip, provided its respective live-on-exit variables LV1 and LV2 have
(LV1 \ Y) ⊆ (LV2 \ Y).
S = S1;S2:
The induction hypothesis on (slides.S2.XI)[live LV1] and S2[live LV2] ensures (LV1′ \ Y) ⊆
(LV2′ \ Y) for any point in slides.S2.XI, including its entry. Thus the ind. hypo. for
(slides.S1.XI)[live LV1′] and S1[live LV2′] yields the requested result.
S = if B then S1 else S2:
Here, the live-on-exit variables are also live-on-exit of both branches; the ind. hypo. then takes
care of those branches.
S = while B do S1 od:
The variables live-on-exit from the loop body are exactly the variables live-on-entry (to the DO
loop), LV1′,LV2′; and those are already known to have the requested (LV1′ \ Y) ⊆ (LV2′ \ Y).
Corollary C.3. Slide independence preserves non-simultaneous liveness. That is, let
S,VI,X,X1,Y be any statement and four sets of variables, respectively, with VI slide in-
dependent in slides.S (i.e. glob.(slides.S.VI) ∩ def.S ⊆ VI) and X non-simultaneously-live in
S[live X1,Y]; then X is also non-simultaneously-live in (slides.S.VI)[live X1,Y] provided X1 ⊆
X, |X1| ≤ 1, Y = glob.S \ X and X ∩ def.S ⊆ VI.
Proof. Let LV′ be the set of live variables at a certain point in (slides.S.VI)[live X1,Y]; let LV″
be the set of live variables at the corresponding point in S[live X1,Y]. From Theorem C.2 we
know (LV′ \ Y) ⊆ (LV″ \ Y), because VI is slide ind. in slides.S and def.S \ VI ⊆ Y (the latter
is due to def.S ⊆ glob.S and X ∩ def.S ⊆ VI). Thus (by set theory, recall Y = glob.S \ X and
note only variables from glob.S ∪ X may be live) we get X ∩ LV′ ⊆ X ∩ LV″. Combine this with
the non-simultaneous liveness of X in S (i.e. |X ∩ LV″| ≤ 1) and the non-simultaneous liveness
of X in (slides.S.VI)[live X1,Y] is proved.
Lemma C.4. Let S,V,X be any core statement and two sets of variables, respectively; then
(X ∩ ddef.S) = (X ∩ ddef.(slides.S.V))
provided X ∩ def.S ⊆ V.
Proof. First, for the case of V ∩ def.S = ∅ we observe
X ∩ ddef.(slides.S.V)
= {def. of slides when V ∩ def.S = ∅}
X ∩ ddef.skip
= {ddef of skip}
X ∩ ∅
= {set theory}
∅
= {X ∩ def.S = ∅ due to V ∩ def.S = ∅ and proviso;
hence X ∩ ddef.S = ∅ due to RE4}
X ∩ ddef.S.
When V ∩ def.S ≠ ∅, we prove the equivalence by induction over the structure of S.
S = V1,Y1 := E1,E2 with V1 ⊆ V and Y1 ∩ V = ∅:
X ∩ ddef.(slides.V1,Y1 := E1,E2.V)
= {slides of ':=': V1 ⊆ V and Y1 ∩ V = ∅}
X ∩ ddef.V1 := E1
= {ddef of ':='}
X ∩ V1
= {set theory: X ∩ Y1 = ∅ since X ∩ (V1,Y1) ⊆ V and V ∩ Y1 = ∅}
(X ∩ V1) ∪ (X ∩ Y1)
= {set theory: ∩ distributes over ∪}
X ∩ (V1,Y1)
= {ddef of ':='}
X ∩ ddef.V1,Y1 := E1,E2.
S = S1;S2:
X ∩ ddef.(slides.S1;S2.V)
= {slides of ';'}
X ∩ ddef.(slides.S1.V); (slides.S2.V)
= {ddef of ';'}
X ∩ (ddef.(slides.S1.V) ∪ ddef.(slides.S2.V))
= {set theory}
(X ∩ ddef.(slides.S1.V)) ∪ (X ∩ ddef.(slides.S2.V))
= {ind. hypo., twice: def.S1 ⊆ def.S1;S2; similarly for S2}
(X ∩ ddef.S1) ∪ (X ∩ ddef.S2)
= {set theory}
X ∩ (ddef.S1 ∪ ddef.S2)
= {ddef of ';'}
X ∩ ddef.S1;S2.
S = if B then S1 else S2:
X ∩ ddef.(slides.if B then S1 else S2.V)
= {slides of IF}
X ∩ ddef.if B then slides.S1.V else slides.S2.V
= {ddef of IF}
X ∩ (ddef.(slides.S1.V) ∩ ddef.(slides.S2.V))
= {set theory}
(X ∩ ddef.(slides.S1.V)) ∩ (X ∩ ddef.(slides.S2.V))
= {ind. hypo., twice: def.S1 ⊆ def.IF; similarly for S2}
(X ∩ ddef.S1) ∩ (X ∩ ddef.S2)
= {set theory}
X ∩ (ddef.S1 ∩ ddef.S2)
= {ddef of IF}
X ∩ ddef.if B then S1 else S2.
S = while B do S1 od:
X ∩ ddef.(slides.while B do S1 od.V)
= {slides of DO}
X ∩ ddef.while B do slides.S1.V od
= {ddef of DO}
X ∩ ∅
= {ddef of DO}
X ∩ ddef.while B do S1 od.
Lemma C.5. Let S,VI,X be any core statement and two sets of variables, respectively, with VI
slide independent in slides.S (i.e. glob.(slides.S.VI) ∩ def.S ⊆ VI); then
(X ∩ input.(slides.S.VI)) ⊆ (X ∩ input.S)
provided X ∩ def.S ⊆ VI.
Proof. First, if VI ∩ def.S = ∅ we get slides.S.VI = skip and the inclusion becomes trivial.
When VI ∩ def.S ≠ ∅, we prove the inclusion by induction over the structure of S.
S = VI1,Y1 := E1,E2 with VI1 ⊆ VI and Y1 ∩ VI = ∅:
input.(slides.VI1,Y1 := E1,E2.VI)
= {slides of ':='}
input.VI1 := E1
= {input of ':='}
glob.E1
⊆ {set theory}
glob.E1 ∪ glob.E2
= {input of ':='}
input.VI1,Y1 := E1,E2.
S = S1;S2:
X ∩ input.(slides.S1;S2.VI)
= {slides of ';'}
X ∩ input.(slides.S1.VI; slides.S2.VI)
= {input of ';'}
X ∩ (input.(slides.S1.VI) ∪ (input.(slides.S2.VI) \ ddef.(slides.S1.VI)))
= {set theory}
(X ∩ input.(slides.S1.VI)) ∪ ((X ∩ input.(slides.S2.VI)) \ (X ∩ ddef.(slides.S1.VI)))
⊆ {ind. hypo., twice}
(X ∩ input.S1) ∪ ((X ∩ input.S2) \ (X ∩ ddef.(slides.S1.VI)))
= {Lemma C.4}
(X ∩ input.S1) ∪ ((X ∩ input.S2) \ (X ∩ ddef.S1))
= {set theory}
X ∩ (input.S1 ∪ (input.S2 \ ddef.S1))
= {input of ';'}
X ∩ input.S1;S2.
S = if B then S1 else S2:
X ∩ input.(slides.if B then S1 else S2.VI)
= {slides of IF}
X ∩ input.(if B then slides.S1.VI else slides.S2.VI)
= {input of IF}
X ∩ (glob.B ∪ input.(slides.S1.VI) ∪ input.(slides.S2.VI))
= {set theory}
(X ∩ glob.B) ∪ (X ∩ input.(slides.S1.VI)) ∪ (X ∩ input.(slides.S2.VI))
⊆ {ind. hypo., twice}
(X ∩ glob.B) ∪ (X ∩ input.S1) ∪ (X ∩ input.S2)
= {set theory}
X ∩ (glob.B ∪ input.S1 ∪ input.S2)
= {input of IF}
X ∩ input.if B then S1 else S2.
S = while B do S1 od:
X ∩ input.(slides.while B do S1 od.VI)
= {slides of DO}
X ∩ input.(while B do slides.S1.VI od)
= {input of DO}
X ∩ (glob.B ∪ input.(slides.S1.VI))
= {set theory}
(X ∩ glob.B) ∪ (X ∩ input.(slides.S1.VI))
⊆ {ind. hypo.}
(X ∩ glob.B) ∪ (X ∩ input.S1)
= {set theory}
X ∩ (glob.B ∪ input.S1)
= {input of DO}
X ∩ input.while B do S1 od.
Appendix D
SSA
D.1 General derivation
The transformation to and from SSA will be based on the following general derivation.
Program equivalence D.1. A variable in X may or may not be live-on-exit; independently,
it may or may not be live-on-entry and it may be self-defined or normally defined or not at all
defined. So potentially we have 12 cases. However, in our context, some combinations are not
possible. Firstly, if a variable is self-defined, the used instances must be live-on-entry. Secondly, a
variable may not be live-on-exit-only, unless it is actually defined. So we are left with nine cases
to be distinguished.
For self-definitions, we have variables live-on-entry-only (XL1f := XL1i) or live-on-both (XL2 :=
XL2i); for normally defined variables, we have the live-on-both, live-on-entry-only, live-on-exit-
only and the dead variables, respectively (XL3f,XL4,XL5f,XL6 := E1′,E2′,E3′,E4′); of the
non-defined variables, we have variables live-on-both (X7), live-on-entry-only (X8) and, again,
dead variables (X9).
We note that subsets X1,X2,X3,X4,X7,X8 are live-on-entry, with initial instances
XL1i,XL2i,XL3i,XL4i,XL7i,XL8i, respectively. The final instances XL1f,XL3f,XL5f,XL7i of
subsets X1,X3,X5,X7 are all live-on-exit. Finally, note that subsets XL2,XL4,XL6 represent
dead assignments.
Let XLs be the set of all instances; let Y,Y1 be two more sets of program variables with
Y1 ⊆ Y and Y live on exit; finally, let E1,E2,E3,E4,E5,E1′,E2′,E3′,E4′,E5′ be ten lists of
expressions; then
(X3,X4,X5,X6,Y1 := E1,E2,E3,E4,E5;
XL1f,XL3f,XL5f,XL7i := X1,X3,X5,X7)[live XL1f,XL3f,XL5f,XL7i,Y] =
(XL1i,XL2i,XL3i,XL4i,XL7i,XL8i := X1,X2,X3,X4,X7,X8;
XL1f,XL2,XL3f,XL4,XL5f,XL6,Y1 := XL1i,XL2i,E1′,E2′,E3′,E4′,E5′)
[live XL1f,XL3f,XL5f,XL7i,Y]
provided
P1: (X1,X2,X3,X4,X5,X6,X7,X8) ⊆ X,
P2: (XL1i,XL2i,XL3i,XL4i,XL7i,XL8i) ⊆ XLs,
P3: (XL1f,XL2,XL3f,XL4,XL5f,XL6,XL7i) ⊆ XLs,
P4: Y1 ⊆ Y,
P5: (X,XLs,Y) disjoint,
P6: glob.(E1,E2,E3,E4,E5) ⊆ (X1,X2,X3,X4,X7,X8,Y),
P7: [E1′ = E1[X1,X2,X3,X4,X7,X8\XL1i,XL2i,XL3i,XL4i,XL7i,XL8i]],
P8: [E2′ = E2[X1,X2,X3,X4,X7,X8\XL1i,XL2i,XL3i,XL4i,XL7i,XL8i]],
P9: [E3′ = E3[X1,X2,X3,X4,X7,X8\XL1i,XL2i,XL3i,XL4i,XL7i,XL8i]],
P10: [E4′ = E4[X1,X2,X3,X4,X7,X8\XL1i,XL2i,XL3i,XL4i,XL7i,XL8i]] and
P11: [E5′ = E5[X1,X2,X3,X4,X7,X8\XL1i,XL2i,XL3i,XL4i,XL7i,XL8i]].
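(As a minimal instance of this equivalence, our example: take a single variable x that is normally defined and live on both entry and exit, i.e. X3 = {x} with instances XL3i = {xi} and XL3f = {xf}, and all other subsets empty; the equivalence then reads
(x := x+1; xf := x)[live xf,Y] = (xi := x; xf := xi+1)[live xf,Y]
which is precisely the translation of x := x+1 into SSA form, renaming the use of x to its initial instance xi and its definition to the final instance xf.)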
Proof.
(XL1i,XL2i,XL3i,XL4i,XL7i,XL8i:= X1,X2,X3,X4,X7,X8;
XL1f,XL2,XL3f,XL4,XL5f,XL6,Y1 := XL1i,XL2i,E10,E20,E30,E40,E50)
[live XL1f,XL3f,XL5f,XL7i,Y]
={assignment-based sub. (Law 18): due to P1,P2 and P5
(X1,X2,X3,X4,X7,X8) (XL1i,XL2i,XL3i,XL4i,XL7i,XL8i);
remove redundant double-sub.: P7-P11 and then P1,P2,P5,P6 give
(XL1i,XL2i,XL3i,XL4i,XL7i,XL8i)glob.(E1,E2,E3,E4,E5)}
(XL1i,XL2i,XL3i,XL4i,XL7i,XL8i:= X1,X2,X3,X4,X7,X8;
XL1f,XL2,XL3f,XL4,XL5f,XL6,Y1 := X1,X2,E1,E2,E3,E4,E5)
[live XL1f,XL3f,XL5f,XL7i,Y]
={remove dead assignment (Law 23): due to P1,P2,P5 and P6
(XL1i,XL2i,XL3i,XL4i,XL8i)
(((XL7i,Y)\Y1) glob.(E1,E2,E3,E4,E5))}
(XL7i:= X7;XL1f,XL2,XL3f,XL4,XL5f,XL6,Y1 :=
X1,X2,E1,E2,E3,E4,E5)[live XL1f,XL3f,XL5f,XL7i,Y]
={intro. dead assignment (Law 23): due to P1,P2 and P5
(X3,X4,X5,X6) (XL1f,XL3f,XL5f,XL7i,Y)}
(XL7i:= X7;XL1f,XL2,XL3f,X3,XL4,X4,XL5f,X5,XL6,X6,Y1 :=
X1,X2,E1,E1,E2,E2,E3,E3,E4,E4,E5)[live XL1f,XL3f,XL5f,XL7i,Y]
={intro. following assertion (Laws 7, 8)}
(XL7i:= X7;XL1f,XL2,XL3f,X3,XL4,X4,XL5f,X5,XL6,X6,Y1 :=
X1,X2,E1,E1,E2,E2,E3,E3,E4,E4,E5;
{X1,X3,X5 = XL1f,XL3f,XL5f})[live XL1f,XL3f,XL5f,XL7i,Y]
={intro. following assignment (Law 6)}
(XL7i:= X7;XL1f,XL2,XL3f,X3,XL4,X4,XL5f,X5,XL6,X6,Y1 :=
X1,X2,E1,E1,E2,E2,E3,E3,E4,E4,E5;{X1,X3,X5 = XL1f,XL3f,XL5f}
;XL1f,XL3f,XL5f:= X1,X3,X5)[live XL1f,XL3f,XL5f,XL7i,Y]
={remove following assertion (Laws 7, 8)}
(XL7i:= X7;XL1f,XL2,XL3f,X3,XL4,X4,XL5f,X5,XL6,X6,Y1 :=
X1,X2,E1,E1,E2,E2,E3,E3,E4,E4,E5;
XL1f,XL3f,XL5f:= X1,X3,X5)[live XL1f,XL3f,XL5f,XL7i,Y]
={remove dead assignment (Law 23): due to P1,P3 and P5
(XL1f,XL2,XL3f,XL4f,XL5f,XL6) (XL7i,Y,X1,X3,X5)}
(XL7i:= X7;X3,X4,X5,X6,Y1 := E1,E2,E3,E4,E5;
XL1f,XL3f,XL5f:= X1,X3,X5)[live XL1f,XL3f,XL5f,XL7i,Y]
={swap statements (Law 5.7): X7(X3,X4,X5,X6,Y1), due to P1,P4 and P5; and
XL7i((X3,X4,X5,X6,Y1) glob.(E1,E2,E3,E4,E5)), by P1,P2,P4,P5}
(X3,X4,X5,X6,Y1 := E1,E2,E3,E4,E5;XL7i:= X7;
XL1f,XL3f,XL5f:= X1,X3,X5)[live XL1f,XL3f,XL5f,XL7i,Y]
={merge assignments (Law 1): XL7i(X1,X3,X5), by P1,P2,P5}
(X3,X4,X5,X6,Y1 := E1,E2,E3,E4,E5;
XL1f,XL3f,XL5f,XL7i:= X1,X3,X5,X7)[live XL1f,XL3f,XL5f,XL7i,Y].
Program equivalence D.2. Let S1, S2, S1′, S2′, X1, X2, XL1i, XL2f, Y be four statements and five sets of variables, respectively; then

(S1;S2;XL2f := X2)[live XL2f,Y] = (XL1i := X1;S1′;S2′)[live XL2f,Y]

provided

P1: (S1;XL3 := X3)[live XL3,Y] = (XL1i := X1;S1′)[live XL3,Y] and
P2: (S2;XL2f := X2)[live XL2f,Y] = (XL3 := X3;S2′)[live XL2f,Y]

where XL3 := ((XL2f\ddef.S2′) ∪ (input.S2′\Y)).
Proof.

(XL1i := X1;S1′;S2′)[live XL2f,Y]
= {prop. liveness info.: by def. of XL3 (and set theory) we get
   (XL3,Y) = (((XL2f,Y)\ddef.S2′) ∪ input.S2′)}
((XL1i := X1;S1′)[live XL3,Y];S2′)[live XL2f,Y]
= {P1}
((S1;XL3 := X3)[live XL3,Y];S2′)[live XL2f,Y]
= {remove liveness info.: again (XL3,Y) = (((XL2f,Y)\ddef.S2′) ∪ input.S2′)}
(S1;XL3 := X3;S2′)[live XL2f,Y]
= {prop. liveness info.}
(S1;(XL3 := X3;S2′)[live XL2f,Y])[live XL2f,Y]
= {P2}
(S1;(S2;XL2f := X2)[live XL2f,Y])[live XL2f,Y]
= {remove liveness info.}
(S1;S2;XL2f := X2)[live XL2f,Y].
Program equivalence D.3. Let B, B′, S1, S2, S1′, S2′, X1, X2, XL1i, XL2f, Y be two boolean expressions, four statements and five sets of variables, respectively; then

(if B then S1 else S2;XL2f := X2)[live XL2f,Y] =
(XL1i := X1;if B′ then S1′ else S2′)[live XL2f,Y]

provided

P1: [B′ ≡ B[X1\XL1i]],
P2: XL1i ∩ X1 = ∅,
P3: XL1i ∩ glob.B = ∅,
P4: (S1;XL2f := X2)[live XL2f,Y] = (XL1i := X1;S1′)[live XL2f,Y] and
P5: (S2;XL2f := X2)[live XL2f,Y] = (XL1i := X1;S2′)[live XL2f,Y].
Proof.

(XL1i := X1;if B′ then S1′ else S2′)[live XL2f,Y]
= {P1}
(XL1i := X1;if B[X1\XL1i] then S1′ else S2′)[live XL2f,Y]
= {assignment-based sub. (Law 18): XL1i ∩ X1 = ∅ (P2)}
(XL1i := X1;if B[X1\XL1i][XL1i\X1] then S1′ else S2′)[live XL2f,Y]
= {remove redundant (reversed) double-sub.: XL1i ∩ glob.B = ∅ (P3)}
(XL1i := X1;if B then S1′ else S2′)[live XL2f,Y]
= {dist. statement over IF (Law 3): P3 again}
(if B then XL1i := X1;S1′ else XL1i := X1;S2′)[live XL2f,Y]
= {prop. liveness info.}
(if B then (XL1i := X1;S1′)[live XL2f,Y]
 else (XL1i := X1;S2′)[live XL2f,Y])[live XL2f,Y]
= {P4,P5}
(if B then (S1;XL2f := X2)[live XL2f,Y]
 else (S2;XL2f := X2)[live XL2f,Y])[live XL2f,Y]
= {remove liveness info.}
(if B then S1;XL2f := X2 else S2;XL2f := X2)[live XL2f,Y]
= {dist. IF over ‘;’ (Law 4)}
(if B then S1 else S2;XL2f := X2)[live XL2f,Y].
Program equivalence D.4. Let B, B′, S1, S1′, X1, X2, XL1i, XL2i, Y be two boolean expressions, two statements and five (disjoint) sets of variables; then

(DO;XL2i := X2)[live XL2i,Y] = (XL1i,XL2i := X1,X2;DO′)[live XL2i,Y]

where DO := while B do S1 od and
DO′ := while B′ do S1′ od

provided

P1: (X1,X2,XL1i,XL2i,Y) are disjoint,
P2: (XL1i,XL2i) ∩ input.DO = ∅,
P3: input.DO′ ⊆ (XL1i,XL2i,Y),
P4: [B′ ≡ B[X1,X2\XL1i,XL2i]] and
P5: (S1;XL1i,XL2i := X1,X2)[live XL1i,XL2i,Y] =
    (XL1i,XL2i := X1,X2;S1′)[live XL1i,XL2i,Y].
Proof.
(XL1i,XL2i:= X1,X2;while B0do S10od)[live XL2i,Y]
={intro. dead assignment (Law 26): due to P1,P3
(X1,X2) ((XL2i,Y)glob.B0input.S10)}
(XL1i,XL2i:= X1,X2;while B0do
S10;X1,X2 := XL1i,XL2iod)[live XL2i,Y]
={intro. following assertion (Law 7), twice}
(XL1i,XL2i:= X1,X2;{XL1i,XL2i=X1,X2};while B0do
S10;X1,X2 := XL1i,XL2i;{XL1i,XL2i=X1,X2}od)[live XL2i,Y]
={prop. assertion (Law 13)}
(XL1i,XL2i:= X1,X2;{XL1i,XL2i=X1,X2};while B0do
{XL1i,XL2i=X1,X2};S10;X1,X2 := XL1i,XL2i;{XL1i,XL2i=X1,X2}
od)[live XL2i,Y]
={intro. following assignment (Law 6)}
(XL1i,XL2i:= X1,X2;{XL1i,XL2i=X1,X2};while B0do
{XL1i,XL2i=X1,X2};XL1i,XL2i:= X1,X2;S10;X1,X2 := XL1i,XL2i;
{XL1i,XL2i=X1,X2}od)[live XL2i,Y]
={prop. assertion (Law 13)}
(XL1i,XL2i:= X1,X2;{XL1i,XL2i=X1,X2};while B0do
XL1i,XL2i:= X1,X2;S10;X1,X2 := XL1i,XL2i;{XL1i,XL2i=X1,X2}
od)[live XL2i,Y]
={remove following assertion (Law 7), twice}
(XL1i,XL2i:= X1,X2;while B0do
XL1i,XL2i:= X1,X2;S10;X1,X2 := XL1i,XL2iod)[live XL2i,Y]
={liveness analysis: P3 gives input.DO0(XL1i,XL2i,Y)}
(XL1i,XL2i:= X1,X2;while B0do
((XL1i,XL2i:= X1,X2;S10)[live XL1i,XL2i,Y];
X1,X2 := XL1i,XL2i)[live XL1i,XL2i,Y]od)[live XL2i,Y]
={P5}
(XL1i,XL2i:= X1,X2;while B0do
((S1;XL1i,XL2i:= X1,X2)[live XL1i,XL2i,Y];
X1,X2 := XL1i,XL2i)[live XL1i,XL2i,Y]od)[live XL2i,Y]
={remove liveness info.}
(XL1i,XL2i:= X1,X2;while B0do
(S1;XL1i,XL2i:= X1,X2;X1,X2 := XL1i,XL2i)[live XL1i,XL2i,Y]
od)[live XL2i,Y]
={assignment-based sub. (Law 18): (X1,X2) (XL1i,XL2i) due to P1}
(XL1i,XL2i:= X1,X2;while B0do
(S1;XL1i,XL2i:= X1,X2;X1,X2 := X1,X2)[live XL1i,XL2i,Y]
od)[live XL2i,Y]
={remove redundant self-assignment (Law 2); remove liveness info.}
(XL1i,XL2i:= X1,X2;while B0do S1;XL1i,XL2i:= X1,X2od)[live XL2i,Y]
={P4}
(XL1i,XL2i:= X1,X2;while B[X1,X2\XL1i,XL2i]do
S1;XL1i,XL2i:= X1,X2od)[live XL2i,Y]
={assignment-based sub. (Law 18): (X1,X2) (XL1i,XL2i)}
(XL1i,XL2i:= X1,X2;while B[X1,X2\XL1i,XL2i][XL1i,XL2i\X1,X2] do
S1;XL1i,XL2i:= X1,X2od)[live XL2i,Y]
={remove redundant (reversed) double-sub.: P2 gives (XL1i,XL2i)glob.B}
(XL1i,XL2i:= X1,X2;while Bdo S1;XL1i,XL2i:= X1,X2od)[live XL2i,Y]
={code motion (Law 5): due to P1,P2
(XL1i,XL2i)(glob.Binput.S1(X1,X2))}
(XL1i,XL2i:= X1,X2;while Bdo S1od ;XL1i,XL2i:= X1,X2)[live XL2i,Y]
={remove dead assignment (Law 23): by P1 we get XL1i(XL2i,Y)}
(XL1i,XL2i:= X1,X2;while Bdo S1od ;XL2i:= X2)[live XL2i,Y]
={remove dead assignment (Law 23):
(XL1i,XL2i)((X2,Y)glob.Binput.S1) due to P1,P2}
(while Bdo S1od ;XL2i:= X2)[live XL2i,Y].
D.2 Transform to SSA
We now apply the results of the above general derivation in deriving an algorithm to transform
any given program statement to SSA.
Transformation D.5. Let S, X, Y be any core statement and two (disjoint) sets of variables; let X1, X2, X3, X4, X5 be five (mutually disjoint) subsets of X, and let XL1i, XL2i, XL3i, XL4i, XL4f, XL5f be six sets of instances, all included in the full set of instances XLs; let S′ be the SSA form of S, defined by
S′ := toSSA.(S,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4f,XL5f),Y,XLs); then (Q1:)

(S;XL3i,XL4f,XL5f := X3,X4,X5)[live XL3i,XL4f,XL5f,Y] =
(XL1i,XL2i,XL3i,XL4i := X1,X2,X3,X4;S′)[live XL3i,XL4f,XL5f,Y]

and (Q2:) X ∩ glob.S′ = ∅

provided

P1: glob.S ⊆ (X,Y),
P2: (X1,X2,X3,X4,X5) ⊆ X,
P3: (XL1i,XL2i,XL3i,XL4i,XL4f,XL5f) ⊆ XLs,
P4: XLs ∩ (X,Y) = ∅,
P5: (X1,X3) ∩ def.S = ∅,
P6: (X2,X4,X5) ⊆ def.S and
P7: (X ∩ (((X3,X4,X5)\ddef.S) ∪ input.S)) ⊆ (X1,X2,X3,X4).
Preconditions P1 and P2 identify all program variables (in S); then P3 and P4 (along with P1) ensure all instances in XLs are fresh; P5 and P6 (along with P3 and the repetition of XL3i in Q1) ensure any live-on-exit final instance is also live-on-entry if and only if its respective program variable is not defined in S (if it is both defined and live-on-entry, a different initial instance will be used); finally, P7 (along with P2) makes postcondition Q2 achievable, by demanding the availability of an initial instance for all live-on-entry variables (in X).
Proof. The derivation of the toSSA algorithm is given hand in hand with its proof of correctness. For a given statement S, we assume for any slip T of S the correctness of its toSSA transformation (provided all preconditions are met) in proving the correctness for S itself.

As will be seen, in the course of the following derivation, for each case of
S′ := toSSA.(S,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4f,XL5f),Y,XLs) we shall be obliged to show

DP1: (((XL3i,XL4f,XL5f)\ddef.S′) ∪ (input.S′\Y)) ⊆ (XL1i,XL2i,XL3i,XL4i) and
DP2: (XL4f,XL5f) ⊆ ddef.S′

provided P1-P7 hold.

In order to allow all recursive calls to introduce fresh instances, without clashing with surrounding variables, we shall make the names of all global program variables (i.e. from X and Y) invariably available to recursive calls. Since those calls will be applied to slips of S, P1 will be guaranteed (due to glob.T ⊆ glob.S for any slip T of S).

Furthermore, whenever fresh instances are introduced, they will be added to XLs in further recursive calls. Thus the inclusion of all instances in XLs, as required by P3, will be maintained. This way, choosing all fresh names to be distinct from (X,Y,XLs) will maintain P4. For keeping P5, P6 and the disjointness of instances (XL1i,XL2i,XL3i,XL4i,XL4f,XL5f) (as implied by P3), special care will be needed. In the various cases, such considerations will be key in deriving the details of the transformation. Finally, for P7 to be invariably maintained, we shall propagate liveness information backward (following our laws of liveness analysis) whilst propagating forward assignments to initial instances (following both laws of program manipulation and postcondition Q1).
Assignment
We begin by analyzing the relevant subsets of live variables X1-X5; of those, variables X2, X4, X5 are defined in S, with X2, X4 live-on-entry and X4, X5 live-on-exit. Furthermore, there is a potential of defined variables X6 ⊆ (X\(X1,X2,X3,X4,X5)); those are neither live-on-entry nor live-on-exit (i.e. dead assignments, as with X2).

All remaining defined variables (Y1, disjoint from X) will have to be from Y (due to P1).

For variables (X3,X4), being live on both entry and exit, we recall from Q1 and P5 that the non-defined X3 must have the same initial and final instances (XL3i), whereas (from Q1, P3 and P6) the defined X4 should be given fresh initial instances XL4i.

Likewise, all dead assignments to (X2,X6) must yield fresh instances (XL2,XL6). (An alternative would have been to allow the merging algorithm to remove dead assignments. This would have caused complications in changing the results of liveness analysis and thus raise questions — which are better avoided — over the order of translation. Instead, one can perform the elimination of dead assignments independently of the merging.)
In terms of Program equivalence D.1, we observe (for Q1) that our X4, X2, X5, X6, X3, X1 correspond to X3, X4, X5, X6, X7, X8 over there. Thus

(X4,X2,X5,X6,Y1 := E1,E2,E3,E4,E5;
XL3i,XL4f,XL5f := X3,X4,X5)[live XL3i,XL4f,XL5f,Y]
= {Program equivalence D.1 with
   X1,X2,X3,X4,X5,X6,X7,X8 := ∅,∅,X4,X2,X5,X6,X3,X1,
   XL3i,XL4i,XL7i,XL8i := XL4i,XL2i,XL3i,XL1i,
   XL3f,XL4,XL5f,XL6 := XL4f,XL2,XL5f,XL6,
   XLs := (XLs,XL2,XL4i,XL6) and
   (E1′,E2′,E3′,E4′,E5′) := (E1,E2,E3,E4,E5)
       [X1,X2,X3,X4\XL1i,XL2i,XL3i,XL4i]:
   P1 is due to our P2; P2 and P3 are due to our P3 and the fresh choices;
   P4 holds by choice of Y1 (disjoint from X) and our P1;
   P5 is a result of our P4 and the fresh choices;
   P6 is due to our P1, P2 and P7; finally P7-P11 hold by construction}
(XL1i,XL2i,XL3i,XL4i := X1,X2,X3,X4;
XL4f,XL2,XL5f,XL6,Y1 := E1′,E2′,E3′,E4′,E5′)
[live XL3i,XL4f,XL5f,Y].
We thus derive

toSSA.(X4,X2,X5,X6,Y1 := E1,E2,E3,E4,E5,
X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4f,XL5f),Y,XLs) ≜
XL4f,XL2,XL5f,XL6,Y1 := E1′,E2′,E3′,E4′,E5′
where (E1′,E2′,E3′,E4′,E5′) := (E1,E2,E3,E4,E5)
[X1,X2,X3,X4\XL1i,XL2i,XL3i,XL4i]
and (XL2,XL6) := fresh.((X2,X6),(X,Y,XLs)).
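As an illustration only, the derived assignment case admits a direct functional reading. The following Python sketch (the helper names and the token-tuple representation of expressions are assumptions, not the thesis's notation) renames used variables to their initial instances and chooses final or fresh instances for the defined ones:

    import itertools

    def subst(tokens, mapping):
        # Token-wise renaming of used program variables to their instances.
        return tuple(mapping.get(t, t) for t in tokens)

    def to_ssa_assign(targets, exprs, init_inst, final_inst,
                      counter=itertools.count()):
        """targets := exprs, as parallel lists; init_inst maps each
        live-on-entry variable to its initial instance; final_inst maps
        each defined, live-on-exit variable to its final instance; dead
        definitions receive fresh instances (and are kept, as argued above).
        The module-level counter keeps fresh names unique across calls."""
        new_exprs = [subst(e, init_inst) for e in exprs]
        new_targets = [final_inst.get(t, t + "_L" + str(next(counter)))
                       for t in targets]
        return new_targets, new_exprs

    # e.g. x,y := y+1,x with both live-on-entry and only x live-on-exit:
    tgt, ex = to_ssa_assign(["x", "y"], [("y", "+", "1"), ("x",)],
                            {"x": "xi", "y": "yi"}, {"x": "xf"})
    assert tgt == ["xf", "y_L0"] and ex == [("yi", "+", "1"), ("xi",)]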
Q2. Indeed X ∩ glob.(XL4f,XL2,XL5f,XL6,Y1 := E1′,E2′,E3′,E4′,E5′) = ∅, since
(X ∩ glob.(E1,E2,E3,E4,E5)) ⊆ (X1,X2,X3,X4) (due to input of ‘:=’ and P7) and
(X1,X2,X3,X4) are all substituted by elements of XLs (P3), which are disjoint from X (P4).

DP1.

((XL3i,XL4f,XL5f)\(XL4f,XL2,XL5f,XL6,Y1))
∪ (glob.(E1′,E2′,E3′,E4′,E5′)\Y)
= {set theory: XL3i ∩ (XL4f,XL2,XL5f,XL6,Y1) = ∅ due to P3,
   the fresh choice of XL2, XL6 and P1, P4 (for Y1)}
XL3i ∪ (glob.(E1′,E2′,E3′,E4′,E5′)\Y)
⊆ {def. of E1′,E2′,E3′,E4′,E5′ and
   glob.(E1,E2,E3,E4,E5) ⊆ (X1,X2,X3,X4,Y) due to P1 and P7}
(XL1i,XL2i,XL3i,XL4i).

DP2. Following P5 and P6 we observe that
(X3,X4,X5) ∩ def.(X4,X2,X5,X6,Y1 := E1,E2,E3,E4,E5) is (X4,X5), and indeed the matching final instances (XL4f,XL5f) are in
ddef.(XL4f,XL2,XL5f,XL6,Y1 := E1′,E2′,E3′,E4′,E5′).
Sequential composition
The key here is to determine an intermediate set of instances for which all preconditions (P1-P7) hold for the two recursive calls to toSSA. Let (X1,X2,X3,X4) be the live-on-entry variables, and let (X3,X4,X5) be live-on-exit, with (X1,X3) ∩ (def.S1 ∪ def.S2) = ∅ and (X2,X4,X5) ⊆ (def.S1 ∪ def.S2). We have no problem with X3, as all instances (i.e. final, intermediate and initial) will have to be the same (XL3i), since those variables are not defined in S1;S2. The remaining live intermediate variables are X6 := ((X\X3) ∩ (((X4,X5)\ddef.S2) ∪ input.S2)).

Of X6, variables in X11, X21, X41 := (X1 ∩ X6), ((X2 ∩ X6)\def.S1), ((X4 ∩ X6)\def.S1) will have to reuse the initial instances XL11i, XL21i, XL41i. Similarly, variables in X42, X51 := ((X4 ∩ X6)\def.S2), ((X5 ∩ X6)\def.S2) will reuse final instances XL42f, XL51f. (Note that X41 ∩ X42 = ∅ since — due to X4 ⊆ (def.S1 ∪ def.S2) — variables in X41 must be in def.S2 whereas variables in X42 must not.)

Finally, the remaining variables X61 := (X6\(X11,X21,X41,X42,X51)) must get fresh instances (XL61 := fresh.(X61,(X,Y,XLs))). In summary, the intermediately-live instances will be

XL6 := (XL11i,XL21i,XL3i,XL41i,XL42f,XL51f,XL61).

The above construction of XL6, along with the given P1-P7 and the associativity of liveness analysis (Lemma 7.2, for P7), ensures that
S1′ := toSSA.(S1,X,(XL1i,XL2i,XL3i,XL4i),XL6,Y,XLs′) — where XLs′ := (XLs,XL61) —
enjoys all its P1-P7 and thus (Q1)

(S1;XL6 := X6)[live XL6,Y] =
(XL1i,XL2i,XL3i,XL4i := X1,X2,X3,X4;S1′)[live XL6,Y].

Similarly, all P1-P7 for S2′ := toSSA.(S2,X,XL6,(XL3i,XL4f,XL5f),Y,XLs″) are guaranteed for any XLs″ ⊇ XLs′, thus yielding (Q1)

(S2;XL3i,XL4f,XL5f := X3,X4,X5)[live XL3i,XL4f,XL5f,Y] =
(XL6 := X6;S2′)[live XL3i,XL4f,XL5f,Y].

In toSSA, we shall insist on XLs″ := (XLs′ ∪ (glob.S1′\Y)) in order to avoid defining the same instance twice, and thus losing the static-single-assignment property.
Q1.

(S1;S2;XL3i,XL4f,XL5f := X3,X4,X5)[live XL3i,XL4f,XL5f,Y]
= {Program equivalence D.2 (with XL6, XLs′, XLs″ as defined above):
   XL1i := (XL1i,XL2i,XL3i,XL4i); XL2f := (XL3i,XL4f,XL5f);
   S1′ := toSSA.(S1,X,(XL1i,XL2i,XL3i,XL4i),XL6,Y,XLs′);
   S2′ := toSSA.(S2,X,XL6,(XL3i,XL4f,XL5f),Y,XLs″);
   ind. hypo. (Q1), twice: P1-P7 of both cases justified above}
(XL1i,XL2i,XL3i,XL4i := X1,X2,X3,X4;S1′;S2′)
[live XL3i,XL4f,XL5f,Y].
We thus derive

toSSA.(S1;S2,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4f,XL5f),Y,XLs) ≜
S1′;S2′
where X6 := ((X\X3) ∩ (((X4,X5)\ddef.S2) ∪ input.S2)),
X11 := (X1 ∩ X6),
X21 := ((X2 ∩ X6)\def.S1),
X41 := ((X4 ∩ X6)\def.S1),
X42 := ((X4 ∩ X6)\def.S2),
X51 := ((X5 ∩ X6)\def.S2),
X61 := (X6\(X11,X21,X41,X42,X51)),
XL61 := fresh.(X61,(X,Y,XLs)),
XL6 := (XL11i,XL21i,XL3i,XL41i,XL42f,XL51f,XL61),
XLs′ := (XLs,XL61),
S1′ := toSSA.(S1,X,(XL1i,XL2i,XL3i,XL4i),XL6,Y,XLs′),
XLs″ := (XLs′ ∪ (glob.S1′\Y))
and S2′ := toSSA.(S2,X,XL6,(XL3i,XL4f,XL5f),Y,XLs″).
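For orientation, the construction of the intermediate instance set can also be phrased operationally. The sketch below (assumed names; a simplification in that it works per variable rather than per subset) mirrors the choice of XL11i, XL21i, XL41i, XL42f, XL51f and XL61 above:

    def intermediate_instance(v, def_s1, def_s2, init_inst, final_inst, fresh):
        """For a variable v that is live between S1 and S2: reuse its
        initial instance when S1 does not define it, reuse its final
        instance when S2 does not define it (and v is live-on-exit),
        and otherwise invent a fresh intermediate instance."""
        if v not in def_s1:
            return init_inst[v]       # X11, X21, X41 (and X3): reuse initial
        if v not in def_s2 and v in final_inst:
            return final_inst[v]      # X42, X51: reuse final (live-on-exit)
        return fresh(v)               # X61: fresh intermediate instance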
Q2. Indeed X ∩ glob.(S1′;S2′) = ∅ due to the ind. hypo. (Q2), twice.

DP1.

((XL3i,XL4f,XL5f)\ddef.(S1′;S2′)) ∪ (input.(S1′;S2′)\Y)
= {associativity of liveness (Lemma 7.2)}
(((((XL3i,XL4f,XL5f)\ddef.S2′) ∪ input.S2′)\ddef.S1′) ∪ input.S1′)\Y
= {set theory}
(((((XL3i,XL4f,XL5f)\ddef.S2′) ∪ (input.S2′\Y))\ddef.S1′) ∪ input.S1′)\Y
⊆ {ind. hypo. (DP1 of S2′)}
((XL6\ddef.S1′) ∪ input.S1′)\Y
= {set theory: XL6 ∩ Y = ∅ by construction of XL6 (P3, P4 and freshness of XL61)}
(XL6\ddef.S1′) ∪ (input.S1′\Y)
⊆ {ind. hypo. (DP1 of S1′)}
(XL1i,XL2i,XL3i,XL4i).

DP2. Due to ddef of ‘;’, we need to show (XL4f,XL5f) ⊆ (ddef.S1′ ∪ ddef.S2′). Final instances of variables in (X4,X5) ∩ def.S2 are in ddef.S2′ due to the ind. hypo. (i.e. DP2 of S2′). The remaining elements of (X4,X5) must be in X6\def.S2 and hence in (X42,X51). Thus, their final instances must be in (XL42f,XL51f) and hence in ddef.S1′ due to the ind. hypo. (i.e. DP2 of S1′), as required.
IF
The key this time lies in DP2 and the definition of ddef of IF. Variables in (X4,X5) are defined in the IF statement and must be defined in both branches of the resulting IF′. We achieve that by ending both the then and else branches of IF′ with assignments to the final instances (XL4f,XL5f). But what do we assign to members of (XL4f,XL5f)? In each branch the answer will be different, depending on whether the variable is defined in that branch and, if not, whether it is live on entry (i.e. has an instance in XL4i) or not.

Variables in X4d1 := (X4 ∩ (def.S1\def.S2)) should be given fresh instances (XL4d1t) in the then branch but use their initial instances (XL4d1i) as final instances in the else branch. (Failing to reuse such initial instances would inevitably render DP2 false, by yielding simultaneous liveness.)

Similarly, variables in X4d2 := (X4 ∩ (def.S2\def.S1)) should be given fresh instances (XL4d2e) in the else branch but reuse initial instances XL4d2i as final instances in the then branch.

Now each of the remaining variables of X4 (i.e. in X4d1d2 := X4\(X4d1 ∪ X4d2)) and each member of X5 should be given two (distinct) fresh instances (i.e. (XL4d1d2t,XL4d1d2e,XL5t,XL5e)). Those new instances will in turn act as final instances in the two recursive calls.

Finally, for brevity, we define XL4t := (XL4d1t,XL4d2i,XL4d1d2t) and
XL4e := (XL4d1i,XL4d2e,XL4d1d2e). In summary, the then branch will end with an assignment XL4f,XL5f := XL4t,XL5t. Similarly, the else branch will end with the assignment XL4f,XL5f := XL4e,XL5e.

The above construction along with the given P1-P7 ensures that
S1′ := toSSA.(S1,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4t,XL5t),Y,XLs′) — where XLs′ := (XLs,XL4d1t,XL4d2e,XL4d1d2t,XL4d1d2e,XL5t,XL5e) — enjoys all its P1-P7. Similarly, all P1-P7 for
S2′ := toSSA.(S2,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4e,XL5e),Y,XLs″) are guaranteed for any XLs″ ⊇ XLs′. As before, in order to avoid double assignments to any instance, we shall insist on XLs″ := (XLs′ ∪ (glob.S1′\Y)).
We now aim to apply Program equivalence D.3 with
S1″ := S1′;XL4f,XL5f := XL4t,XL5t and S2″ := S2′;XL4f,XL5f := XL4e,XL5e.
For this to be correct, we have to show

(S1;XL3i,XL4f,XL5f := X3,X4,X5)[live XL3i,XL4f,XL5f,Y] =
(XL1i,XL2i,XL3i,XL4i := X1,X2,X3,X4;S1″)[live XL3i,XL4f,XL5f,Y] and
(S2;XL3i,XL4f,XL5f := X3,X4,X5)[live XL3i,XL4f,XL5f,Y] =
(XL1i,XL2i,XL3i,XL4i := X1,X2,X3,X4;S2″)[live XL3i,XL4f,XL5f,Y].

For the former, we observe

(XL1i,XL2i,XL3i,XL4i := X1,X2,X3,X4;S1″)[live XL3i,XL4f,XL5f,Y]
= {def. of S1″}
(XL1i,XL2i,XL3i,XL4i := X1,X2,X3,X4;S1′;
XL4f,XL5f := XL4t,XL5t)[live XL3i,XL4f,XL5f,Y]
= {prop. liveness}
((XL1i,XL2i,XL3i,XL4i := X1,X2,X3,X4;S1′)[live XL3i,XL4t,XL5t,Y];
XL4f,XL5f := XL4t,XL5t)[live XL3i,XL4f,XL5f,Y]
= {ind. hypo. (Q1 of S1′)}
((S1;XL3i,XL4t,XL5t := X3,X4,X5)[live XL3i,XL4t,XL5t,Y];
XL4f,XL5f := XL4t,XL5t)[live XL3i,XL4f,XL5f,Y]
= {remove liveness}
(S1;XL3i,XL4t,XL5t := X3,X4,X5;
XL4f,XL5f := XL4t,XL5t)[live XL3i,XL4f,XL5f,Y]
= {assignment-based sub. (Law 18): (X4,X5) ∩ (XL3i,XL4t,XL5t) = ∅ due to
   P2, P3, P4 and the freshness of (XL4t,XL5t)}
(S1;XL3i,XL4t,XL5t := X3,X4,X5;
XL4f,XL5f := X4,X5)[live XL3i,XL4f,XL5f,Y]
= {remove dead assignments: (XL4t,XL5t) ∩ (XL3i,X4,X5,Y) = ∅ again, by
   P2, P3, P4 and the freshness of (XL4t,XL5t)}
(S1;XL3i := X3;XL4f,XL5f := X4,X5)[live XL3i,XL4f,XL5f,Y]
= {merge following assignments (Law 1): XL3i ∩ (XL4f,XL5f,X4,X5) = ∅ by P2, P3, P4}
(S1;XL3i,XL4f,XL5f := X3,X4,X5)[live XL3i,XL4f,XL5f,Y].
The corresponding proof for S2″ is similar (and thus omitted). We are now ready to transform the IF statement into IF′, as follows:

(if B then S1 else S2;
XL3i,XL4f,XL5f := X3,X4,X5)[live XL3i,XL4f,XL5f,Y]
= {Program equivalence D.3 with S1″, S2″ as defined above,
   B′ := B[X1,X2,X3,X4\XL1i,XL2i,XL3i,XL4i],
   XL1i := (XL1i,XL2i,XL3i,XL4i) and XL2f := (XL3i,XL4f,XL5f):
   P1 holds by construction (of B′); P2 holds due to our P2, P3, P4;
   P3 holds due to P1, P3, P4; and P4, P5 hold as proved above}
(XL1i,XL2i,XL3i,XL4i := X1,X2,X3,X4;
if B′ then S1′;XL4f,XL5f := XL4t,XL5t
else S2′;XL4f,XL5f := XL4e,XL5e)[live XL3i,XL4f,XL5f,Y].
We thus derive

toSSA.(IF,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4f,XL5f),Y,XLs) ≜ IF′
where IF := if B then S1 else S2,
IF′ := if B[X1,X2,X3,X4\XL1i,XL2i,XL3i,XL4i]
       then S1′;XL4f,XL5f := XL4t,XL5t else S2′;XL4f,XL5f := XL4e,XL5e,
X4d1 := (X4 ∩ (def.S1\def.S2)),
X4d2 := (X4 ∩ (def.S2\def.S1)),
X4d1d2 := X4 ∩ def.S1 ∩ def.S2,
(XL4d1t,XL4d2e,XL4d1d2t,XL4d1d2e,XL5t,XL5e) :=
    fresh.((X4d1,X4d2,X4d1d2,X4d1d2,X5,X5),(X,Y,XLs)),
XL4t := (XL4d1t,XL4d2i,XL4d1d2t),
XL4e := (XL4d1i,XL4d2e,XL4d1d2e),
XLs′ := (XLs,XL4d1t,XL4d2e,XL4d1d2t,XL4d1d2e,XL5t,XL5e),
S1′ := toSSA.(S1,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4t,XL5t),Y,XLs′),
XLs″ := (XLs′ ∪ (glob.S1′\Y))
and S2′ := toSSA.(S2,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4e,XL5e),Y,XLs″).
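The three-way split of X4 drives the instance choice per branch; the sketch below (a hypothetical rendering, with fresh_then and fresh_else as assumed fresh-name suppliers) returns the pair of final instances a variable receives at the ends of the then and else branches:

    def branch_final_instances(v, def_then, def_else, init_inst,
                               fresh_then, fresh_else):
        """v is in X4: defined somewhere in the IF and live on both entry
        and exit. Returns (instance at end of then, instance at end of else)."""
        if v in def_then and v not in def_else:        # v in X4d1
            return fresh_then(v), init_inst[v]
        if v in def_else and v not in def_then:        # v in X4d2
            return init_inst[v], fresh_else(v)
        # v in X4d1d2: defined in both branches, two distinct fresh instances
        return fresh_then(v), fresh_else(v)

Members of X5, being dead on entry, always fall into the last case (two fresh instances), as in the definitions of XL5t and XL5e above.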
Q2. Indeed X ∩ glob.IF′ = ∅ due to the ind. hypo. (Q2 of S1′ and S2′) and since (X ∩ glob.B) ⊆ (X1,X2,X3,X4) by P7.

DP1.

((XL3i,XL4f,XL5f)\ddef.IF′) ∪ (input.IF′\Y)
= {set theory: (XL4f,XL5f) ⊆ ddef.IF′ and XL3i ∩ ddef.IF′ = ∅}
XL3i ∪ (input.IF′\Y)
= {input of IF′}
XL3i ∪ ((glob.B′ ∪ input.(S1′;XL4f,XL5f := XL4t,XL5t)
    ∪ input.(S2′;XL4f,XL5f := XL4e,XL5e))\Y)
= {input of ‘;’; def. of XL4t, XL4e}
XL3i ∪ ((glob.B′ ∪ input.S1′ ∪ ((XL4d1t,XL4d2i,XL4d1d2t,XL5t)\ddef.S1′)
    ∪ input.S2′ ∪ ((XL4d1i,XL4d2e,XL4d1d2e,XL5e)\ddef.S2′))\Y)
= {ind. hypo., twice: (XL4d1t,XL4d1d2t,XL5t) ⊆ ddef.S1′ (DP2 of S1′) and
   (XL4d2e,XL4d1d2e,XL5e) ⊆ ddef.S2′ (DP2 of S2′)}
XL3i ∪ ((glob.B′ ∪ input.S1′ ∪ (XL4d2i\ddef.S1′)
    ∪ input.S2′ ∪ (XL4d1i\ddef.S2′))\Y)
⊆ {set theory: (XL4d1i,XL4d2i) ⊆ XL4i}
(XL3i,XL4i) ∪ ((glob.B′ ∪ input.S1′ ∪ input.S2′)\Y)
= {set theory}
(XL3i,XL4i) ∪ (glob.B′\Y) ∪ (input.S1′\Y) ∪ (input.S2′\Y)
⊆ {def. of B′ and glob.B ⊆ (X1,X2,X3,X4,Y) due to P1 and P7}
(XL1i,XL2i,XL3i,XL4i) ∪ (input.S1′\Y) ∪ (input.S2′\Y)
⊆ {ind. hypo., twice: (input.S1′\Y) ⊆ (XL1i,XL2i,XL3i,XL4i) (DP1 of S1′) and
   (input.S2′\Y) ⊆ (XL1i,XL2i,XL3i,XL4i) (DP1 of S2′)}
(XL1i,XL2i,XL3i,XL4i).

DP2. By construction, we indeed have (XL4f,XL5f) ⊆ ddef.IF′.
DO
We need to enforce the policy of non-simultaneous liveness of the instances of a program variable. Since ddef.DO is empty, the final instance of a live-on-exit variable must also be live-on-entry to the loop. Thus, the set of live-on-exit-only variables X5 is empty. Furthermore, if the (live-on-exit) variable is also in input.DO, it must be the final instance that is live-on-entry to the SSA loop DO′. We achieve that by defining such instances that are also defined in the loop (XL4f) just before the loop begins and at the end of its body. Similarly, other variables in def.DO ∩ input.DO (not being live-on-exit) should have a dedicated (fresh) loop-entry instance (XL2); this choice is also summarized in the sketch following this case. Thus, when recursively transforming the loop body S1 to SSA, non-simultaneous liveness is ensured by sending instances (XL1i,XL2,XL3i,XL4f) as initial values. Now what about final values for S1′?

Non-defined live variables (X1,X3) should be given the corresponding initial instances (XL1i,XL3i). As for the defined variables (X2,X4), fresh instances (XL2b,XL4b) must be invented for their final S1′ value. These should in turn be assigned back to the loop-entry instances, just after S1′.

The above construction along with the given P1-P7 ensures that
S1′ := toSSA.(S1,X,(XL1i,XL2,XL3i,XL4f),(XL1i,XL2b,XL3i,XL4b),Y,XLs′)
— where XLs′ := (XLs,XL2,XL2b,XL4b) — enjoys all its P1-P7.

We now aim to apply Program equivalence D.4 with S1″ := S1′;XL2,XL4f := XL2b,XL4b playing the role of its S1′, B[X1,X2,X3,X4\XL1i,XL2,XL3i,XL4f] that of B′, and (XL1i,XL2),(XL3i,XL4f) those of XL1i,XL2i respectively. For this to be correct, we have to show its P1-P5. P1 is a result of our P1-P4 and the freshness of XL2; P2 is given by our P1, P3, P4 along with RE5 (which yields input.DO ⊆ glob.DO); P3 is due to the ind. hypo. (DP1 of S1′) and the def. of B′ along with P1, P7 (yielding glob.B ⊆ (X1,X2,X3,X4,Y)); P4 is correct by construction (of B′); and finally,
for P5, we observe
(S1;XL1i,XL2,XL3i,XL4f:= X1,X2,X3,X4)[live XL1i,XL2,XL3i,XL4f,Y]=
(XL1i,XL2,XL3i,XL4f:= X1,X2,X3,X4;S100)[live XL1i,XL2,XL3i,XL4f,Y]:
(XL1i,XL2,XL3i,XL4f:= X1,X2,X3,X4;
S100)[live XL1i,XL2,XL3i,XL4f,Y]
={def. of S100 }
(XL1i,XL2,XL3i,XL4f:= X1,X2,X3,X4;
S10;XL2,XL4f:= XL2b,XL4b)[live XL1i,XL2,XL3i,XL4f,Y]
={prop. liveness}
((XL1i,XL2,XL3i,XL4f:= X1,X2,X3,X4;
S10)[live XL1i,XL2,XL3i,XL4f,Y];
XL2,XL4f:= XL2b,XL4b)[live XL1i,XL2,XL3i,XL4f,Y]
={ind. hypo. (Q1 of S10)}
((S1;XL1i,XL2,XL3i,XL4f:= X1,X2,X3,X4)
[live XL1i,XL2,XL3i,XL4f,Y];
XL2,XL4f:= XL2b,XL4b)[live XL1i,XL2,XL3i,XL4f,Y]
={remove liveness}
(S1;XL1i,XL2,XL3i,XL4f:= X1,X2,X3,X4;
XL2,XL4f:= XL2b,XL4b)[live XL1i,XL2,XL3i,XL4f,Y]
={assignment-based sub. (Law 18): (X2,X4) (XL1i,XL2,XL3i,XL4f) due to
P2,P3,P4 and the freshness of XL2}
(S1;XL1i,XL2,XL3i,XL4f:= X1,X2,X3,X4;
XL2,XL4f:= X2,X4)[live XL1i,XL2,XL3i,XL4f,Y]
={remove dead assignments: (XL2,XL4f)(XL1i,X2,XL3i,X4,Y) again, by
P2,P3,P4 and the freshness of XL2}
(S1;XL1i,XL3i:= X1,X3;
XL2,XL4f:= X2,X4)[live XL1i,XL2,XL3i,XL4f,Y]
={merge following assignments (Law 1):
(XL1i,XL3i)(XL2,XL4f,X2,X4) by P2,P3,P4}
(S1;XL1i,XL2,XL3i,XL4f:= X1,X2,X3,X4)
[live XL1i,XL2,XL3i,XL4f,Y].
Let DO := while B do S1 od and
DO′ := while B′ do S1′;XL2,XL4f := XL2b,XL4b od. We are now ready to transform the DO statement, as follows:

(DO;XL3i,XL4f := X3,X4)[live XL3i,XL4f,Y]
= {Program equivalence D.4, as explained and justified above}
(XL1i,XL2,XL3i,XL4f := X1,X2,X3,X4;DO′)[live XL3i,XL4f,Y]
= {split assignments (Law 1), P2-P4: (XL1i,XL3i) ∩ (X2,X4) = ∅}
(XL1i,XL3i := X1,X3;XL2,XL4f := X2,X4;DO′)[live XL3i,XL4f,Y]
= {intro. dead assignment (Law 23), DP1: (XL2i,XL4i) ∩
   ((((XL3i,XL4f) ∪ (input.DO′\Y))\(XL2,XL4f)) ∪ (X2,X4)) = ∅}
(XL1i,XL2i,XL3i,XL4i := X1,X2,X3,X4;
XL2,XL4f := X2,X4;DO′)[live XL3i,XL4f,Y]
= {assignment-based sub. (Law 18), P2-P4: (X2,X4) ∩ (XL1i,XL2i,XL3i,XL4i) = ∅}
(XL1i,XL2i,XL3i,XL4i := X1,X2,X3,X4;
XL2,XL4f := XL2i,XL4i;DO′)[live XL3i,XL4f,Y].
We thus derive

toSSA.(DO,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4f),Y,XLs) ≜
XL2,XL4f := XL2i,XL4i;DO′
where DO := while B do S1 od,
DO′ := while B′ do S1′;XL2,XL4f := XL2b,XL4b od,
(XL2,XL2b,XL4b) := fresh.((X2,X2,X4),(X,Y,XLs)),
XLs′ := (XLs,XL2,XL2b,XL4b),
B′ := B[X1,X2,X3,X4\XL1i,XL2,XL3i,XL4f]
and S1′ := toSSA.(S1,X,(XL1i,XL2,XL3i,XL4f),(XL1i,XL2b,XL3i,XL4b),Y,XLs′).
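Informally, the loop-entry instance of each variable can be read off as follows; the sketch (assumed names; per variable rather than per subset) mirrors the choice of XL1i, XL2, XL3i and XL4f above:

    def loop_entry_instance(v, defined_in_loop, init_inst, final_inst, fresh):
        """Instance of v live on entry to the SSA loop DO' and at the end
        of its body. final_inst holds the final instances of live-on-exit
        variables (X4); init_inst holds initial instances (X1, X3)."""
        if v not in defined_in_loop:
            return init_inst[v]     # X1, X3: the initial instance flows through
        if v in final_inst:
            return final_inst[v]    # X4: the final instance is live around the loop
        return fresh(v)             # X2: a dedicated fresh loop-entry instance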
Q2. Indeed X ∩ glob.(XL2,XL4f := XL2i,XL4i;DO′) = ∅ due to the ind. hypo. (Q2 of S1′) and since P7 gives (X ∩ glob.B) ⊆ (X1,X2,X3,X4).
DP1.
((XL3i,XL4f)\ddef.XL2,XL4f:= XL2i,XL4i;DO0)
(input.XL2,XL4f:= XL2i,XL4i;DO0\Y)
={ddef and input of ‘:=’, ‘ ;’ and DO}
((XL3i,XL4f)\(XL2,XL4f)) (((XL2i,XL4i)(input.DO0\(XL2,XL4f))) \Y)
={set theory: XL3i(XL2,XL4f) due to P3 and the freshness of XL2;
(XL2i,XL4i)Ydue to P3,P4}
(XL2i,XL3i,XL4i)((input.DO0\(XL2,XL4f)) \Y)
={input of DO’ and set theory}
(XL2i,XL3i,XL4i)
((glob.B0input.S10;XL2,XL4f:= XL2b,XL4b)\(XL2,XL4f,Y))
⊆ {set theory: (glob.B0\(XL2,XL4f,Y)) (XL1i,XL3i)
due to P1,P6 and XDO0}
(XL1i,XL2i,XL3i,XL4i)
(input.S10;XL2,XL4f:= XL2b,XL4b\(XL2,XL4f,Y))
={input of ‘ ;’ and ‘:=’}
(XL1i,XL2i,XL3i,XL4i)
((input.S10((XL2b,XL4b)\ddef.S10)) \(XL2,XL4f,Y))
⊆ {ind. hypo. (DP1 of S10): (input.S10\Y)(XL1i,XL2,XL3i,XL4f)}
(XL1i,XL2i,XL3i,XL4i)(((XL2b,XL4b)\ddef.S10)\(XL2,XL4f,Y))
={(XL2b,XL4b)ddef.S10(derived property DP2):
(XL2b,XL4b) are final instances in S10, of defined variables (X2,X4) def.S1}
(XL1i,XL2i,XL3i,XL4i).
DP2.

For variables in X4 we observe that the corresponding final instances XL4f are clearly in ddef.(XL2,XL4f := XL2i,XL4i) and hence in
ddef.(XL2,XL4f := XL2i,XL4i;DO′), as required.
D.3 Back from SSA
The following is a derivation of S := fromSSA.(S′,X,XL1i,XL2f,Y,XLs) when S′ includes at most one live instance of any variable in X at each program point. The goal is to turn
(XL1i := X1;S′)[live XL2f,Y] with X ∩ glob.S′ = ∅ into the equivalent
(S;XL2f := X2)[live XL2f,Y] with XLs ∩ glob.S = ∅. This way, a program statement in SSA form can be turned back to the original, as in the following derivation:

|[var XLi,XLf,XLim; XL1i := X1;S′;X := XLf ]|
= {def. of live with Y := def.S′\XLs}
(XL1i := X1;S′;X := XLf )[live X,Y]
= {prop. liveness}
((XL1i := X1;S′)[live XLf,Y];X := XLf )[live X,Y]
= {S := fromSSA.(S′,X,XL1i,XLf,Y,XLs) (Transformation D.6, see below)}
((S;XLf := X)[live XLf,Y];X := XLf )[live X,Y]
= {remove aux. liveness info.}
(S;XLf := X;X := XLf )[live X,Y]
= {assignment-based sub.; remove self-assignment;
   remove dead assignment; remove aux. liveness info.}
S.
Instead of deriving the fromSSA algorithm directly from the SSA form, we shall take a more general approach. The goal is to allow some transformations (e.g. slicing) to be performed on the SSA form itself, before returning to the original form.

We observe that the return from SSA involves the merge of all instances (in XLs) back into the original program variables (X). We hypothesize that as long as there is no simultaneous liveness of any two instances (to be merged) at any program point, and as long as no such instances are simultaneously defined (i.e. in a statement of multiple assignment), the merge of all instances should be possible. Insisting on the removal of self-assignments (after, or while, merging) will allow assignments to pseudo instances (in IFs and DO loops) to be eliminated.

Thus we shall develop an algorithm for merging sets of variables and use it for returning from SSA, as follows:

fromSSA.(S′,X,XL1i,XLf,Y,XLs) ≜ merge-vars.(S′,XLs,X,XL1i,XLf,Y).
Transformation D.6. Let S′ be any core statement and (XL1i ∪ XL2f) ⊆ XLs; let S be a statement defined by
S := merge-vars.(S′,XLs,X,XL1i,XL2f,Y); then (Q1:)

(XL1i := X1;S′)[live XL2f,Y] = (S;XL2f := X2)[live XL2f,Y]

and (Q2:) XLs ∩ glob.S = ∅

provided

P1: glob.S′ ⊆ (XLs,Y),
P2: (XL1i ∪ XL2f) ⊆ XLs,
P3: (X1 ∪ X2) ⊆ X,
P4: X ∩ (XLs,Y) = ∅,
P5: no two instances of any member of X are simultaneously live at any point in S′[live XL2f,Y],
P6: (XLs ∩ ((XL2f\ddef.S′) ∪ input.S′)) ⊆ XL1i,
P7: no def-on-live: i.e. no instance is defined where another instance is live-on-exit, and
P8: no multiple-defs: i.e. each assignment defines at most one instance (of any x ∈ X); P7 and P8 are made precise in the sketch below.
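For concreteness, P7 and P8 admit the following direct checks on each (multiple) assignment; the Python sketch below is an assumed rendering, with var_of mapping each instance to its program variable:

    def check_assignment(defined, live_out, var_of):
        """defined: instances assigned by one multiple assignment;
        live_out: instances live on exit from it."""
        # P8, no multiple-defs: at most one instance of any variable is defined.
        variables = [var_of[d] for d in defined]
        assert len(variables) == len(set(variables)), "P8 violated"
        # P7, no def-on-live: a defined instance may not coexist with a
        # live-on-exit instance of the same program variable (other than itself).
        for d in defined:
            for l in live_out:
                assert l == d or var_of[l] != var_of[d], "P7 violated"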
Proof. As with the toSSA algorithm, the derivation of merge-vars (and hence of fromSSA) is given hand in hand with its proof of correctness. For a given statement S′, we assume for any slip T′ of S′ the correctness of its merge-vars transformation in proving the correctness for S′ itself.
Assignment
Aiming to use Program equivalence D.1, we distinguish eight disjoint subsets of X (X1-X8) and the corresponding defined instances (XL1f,XL2,XL3f,XL4,XL5f,XL6; indeed at most one instance of each variable is defined, due to our P8; of these, XL1f,XL3f,XL5f are live-on-exit whereas XL2,XL4,XL6 represent dead assignments) and used instances (XL1i,XL2i,XL3i,XL4i,XL7i,XL8i, of which XL7i is live on both entry and exit, and the rest are live-on-entry only).

In terms of our expression of merge-vars and its preconditions P1-P8, we rewrite its X1 as (X1,X2,X3,X4,X7,X8), XL1i as (XL1i,XL2i,XL3i,XL4i,XL7i,XL8i), X2 as (X1,X3,X5,X7) and XL2f as (XL1f,XL3f,XL5f,XL7i). According to this decomposition, our target Q1 can be rewritten as

(XL1i,XL2i,XL3i,XL4i,XL7i,XL8i := X1,X2,X3,X4,X7,X8;S′)
[live XL1f,XL3f,XL5f,XL7i,Y]
=
(S;XL1f,XL3f,XL5f,XL7i := X1,X3,X5,X7)
[live XL1f,XL3f,XL5f,XL7i,Y]

with S′ being the assignment
XL1f,XL2,XL3f,XL4,XL5f,XL6,Y1 := XL1i,XL2i,E1′,E2′,E3′,E4′,E5′. Accordingly, preconditions P1-P8 can be understood as

P1: glob.S′ ⊆ (XLs,Y),
P2: ((XL1i,XL2i,XL3i,XL4i,XL7i,XL8i) ∪ (XL1f,XL3f,XL5f,XL7i)) ⊆ XLs,
P3: ((X1,X2,X3,X4,X7,X8) ∪ (X1,X3,X5,X7)) ⊆ X,
P4: X ∩ (XLs,Y) = ∅,
P5: no two instances of any member of X are simultaneously live at any point in S′,
P6: (XLs ∩ (XL7i ∪ input.S′)) ⊆ (XL1i,XL2i,XL3i,XL4i,XL7i,XL8i),
P7: no def-on-live: i.e. (X2,X4,X6) ∩ (X1,X3,X5,X7) = ∅,
P8: no multiple-defs: i.e. (X1,X2,X3,X4,X5,X6) are disjoint.
For Program equivalence D.1 to be correct, we need to verify its P1-P11.
P1 ((X1,X2,X3,X4,X5,X6,X7,X8) ⊆ X) holds by construction;
P2 ((XL1i,XL2i,XL3i,XL4i,XL7i,XL8i) ⊆ XLs) is due to our P2;
P3 ((XL1f,XL2,XL3f,XL4,XL5f,XL6,XL7i) ⊆ XLs) is due to our P2, P7 and the disjointness of instances (XLs.i ∩ XLs.j = ∅ for all i ≠ j);
P4 (Y1 ⊆ Y) is due to our P1;
P5 (disjointness of (X,XLs,Y)) is due to our P4;
for P6-P11 to hold, we define E1,E2,E3,E4,E5 :=
(E1′,E2′,E3′,E4′,E5′)[XL1i,XL2i,XL3i,XL4i,XL7i,XL8i\X1,X2,X3,X4,X7,X8]; now P6
(glob.(E1,E2,E3,E4,E5) ⊆ (X1,X2,X3,X4,X7,X8,Y)) is due to our P1, P6 and the redundancy of reversed double sub. ((X1,X2,X3,X4,X7,X8) ∩ glob.(E1′,E2′,E3′,E4′,E5′) = ∅ due to P1, P4); the latter argument proves P7-P11 as well.
We are now ready to apply Program equivalence D.1, as follows:

(XL1i,XL2i,XL3i,XL4i,XL7i,XL8i := X1,X2,X3,X4,X7,X8;
XL1f,XL2,XL3f,XL4,XL5f,XL6,Y1 := XL1i,XL2i,E1′,E2′,E3′,E4′,E5′)
[live XL1f,XL3f,XL5f,XL7i,Y]
= {Program equivalence D.1 with
   E1,E2,E3,E4,E5 := (E1′,E2′,E3′,E4′,E5′)
   [XL1i,XL2i,XL3i,XL4i,XL7i,XL8i\X1,X2,X3,X4,X7,X8]}
(X3,X4,X5,X6,Y1 := E1,E2,E3,E4,E5;
XL1f,XL3f,XL5f,XL7i := X1,X3,X5,X7)[live XL1f,XL3f,XL5f,XL7i,Y].
We thus derive

merge-vars.(XL1f,XL2,XL3f,XL4,XL5f,XL6,Y1 := XL1i,XL2i,E1′,E2′,E3′,E4′,E5′,
XLs,X,(XL1i,XL2i,XL3i,XL4i,XL7i,XL8i),(XL1f,XL3f,XL5f,XL7i),Y) ≜
X3,X4,X5,X6,Y1 := E1,E2,E3,E4,E5
where E1,E2,E3,E4,E5 := (E1′,E2′,E3′,E4′,E5′)
[XL1i,XL2i,XL3i,XL4i,XL7i,XL8i\X1,X2,X3,X4,X7,X8].

Q2. XLs ∩ glob.(X3,X4,X5,X6,Y1 := E1,E2,E3,E4,E5) = ∅,
since (due to P2) (XLs ∩ glob.(E1′,E2′,E3′,E4′,E5′)) ⊆ (XL1i,XL2i,XL3i,XL4i,XL7i,XL8i)
and those are all substituted by elements of (X1,X2,X3,X4,X7,X8), which are disjoint from XLs (due to P2-P4).
Sequential composition
We define XL3 := (XLs ∩ ((XL2f\ddef.S2′) ∪ input.S2′)) to be the set of intermediately-live instances. Due to P5 (no simultaneously-live instances), XL3 includes at most one instance of each member of X, as required for the two recursive calls: S1 := merge-vars.(S1′,XLs,X,XL1i,XL3,Y) and S2 := merge-vars.(S2′,XLs,X,XL3,XL2f,Y). Preconditions P1, P4, P5, P7 and P8 are trivial consequences of S1′ and S2′ both being slips of S′ and of our construction of S1 and S2 (in terms of the parameters to merge-vars). P2 for both cases is fine as well, due to our P2 and the construction of XL3. P3 is fine due to our P3 and by construction of X3, the set of program variables (in X) corresponding to XL3. Finally, P6 for the call on S2′ is trivial, again by construction of XL3; it is also correct for S1′, due to our P6 and the associativity of liveness (Lemma 7.2). As a result, we get

(S2;XL2f := X2)[live XL2f,Y] = (XL3 := X3;S2′)[live XL2f,Y] and
(S1;XL3 := X3)[live XL3,Y] = (XL1i := X1;S1′)[live XL3,Y], as required (as P2 and P1 respectively) in the following:

(XL1i := X1;S1′;S2′)[live XL2f,Y]
= {Program equivalence D.2 with
   S1 := merge-vars.(S1′,XLs,X,XL1i,XL3,Y);
   S2 := merge-vars.(S2′,XLs,X,XL3,XL2f,Y)}
(S1;S2;XL2f := X2)[live XL2f,Y].

We thus derive

merge-vars.(S1′;S2′,XLs,X,XL1i,XL2f,Y) ≜
merge-vars.(S1′,XLs,X,XL1i,XL3,Y);merge-vars.(S2′,XLs,X,XL3,XL2f,Y)
where XL3 := XLs ∩ ((XL2f\ddef.S2′) ∪ input.S2′).

Q2. As required, XLs ∩ glob.(S1;S2) = ∅ due to the ind. hypo. (Q2 of S1 and S2).
IF
All live-on-exit variables of the IF are also live-on-exit from each branch. Similarly, it should be safe to assume all of the IF's live-on-entry variables are live-on-entry to each branch. This may be an over-approximation, but a harmless one, since the initial instances of all those variables are, in any case, available on entry to both branches. This harmlessness can be verified by the observations that ddef of each branch is a superset of ddef.IF′ and that the corresponding input is a subset of input.IF′.

Thus, the recursive calls, computing S1 := merge-vars.(S1′,XLs,X,XL1i,XL2f,Y) and S2 := merge-vars.(S2′,XLs,X,XL1i,XL2f,Y), faithfully maintain P6. Since the calls are on slips of IF′, and since all remaining parameters are identical, all other preconditions P1-P5, P7, P8 (of both recursive calls) trivially hold. As a result, we get

(S1;XL2f := X2)[live XL2f,Y] = (XL1i := X1;S1′)[live XL2f,Y] and
(S2;XL2f := X2)[live XL2f,Y] = (XL1i := X1;S2′)[live XL2f,Y], as required (in P4 and P5 respectively) for

(XL1i := X1;if B′ then S1′ else S2′)[live XL2f,Y]
= {Program equivalence D.3 with
   B := B′[XL1i\X1];
   S1 := merge-vars.(S1′,XLs,X,XL1i,XL2f,Y);
   S2 := merge-vars.(S2′,XLs,X,XL1i,XL2f,Y):
   P1 ([B′ ≡ B′[XL1i\X1][X1\XL1i]]) is due to
   X1 ∩ glob.B′ = ∅ (our P1, P3, P4) and the redundancy of reversed double sub.;
   P2 (XL1i ∩ X1 = ∅) is due to our P2-P4; it also proves
   P3 (XL1i ∩ glob.(B′[XL1i\X1]) = ∅); finally
   the ind. hypo. (Q1), twice, gives P4 and P5}
(if B then S1 else S2;XL2f := X2)[live XL2f,Y].

We thus derive

merge-vars.(if B′ then S1′ else S2′,XLs,X,XL1i,XL2f,Y) ≜ if B′[XL1i\X1]
then merge-vars.(S1′,XLs,X,XL1i,XL2f,Y) else merge-vars.(S2′,XLs,X,XL1i,XL2f,Y).

Q2. We get XLs ∩ glob.IF = ∅, as required, due to the ind. hypo. (Q2, twice) and since (XLs ∩ glob.B′) ⊆ XL1i (due to input of IF and P6).
DO
Let XL1i, XL2i be the live-on-entry instances, with XL2i also live-on-exit (these must be the same instances as on entry, due to P6 and ddef.DO′ being empty); let (X1,X2) ⊆ X be the corresponding program variables (the one-to-one mapping from XL1i,XL2i to X1,X2 is due to P5). Since all live variables on entry to DO′ are also live on both ends of S1′, we define
S1 := merge-vars.(S1′,XLs,X,(XL1i,XL2i),(XL1i,XL2i),Y). The validity of P1-P8 of the call to merge-vars is a consequence of the given P1-P8 (of DO′), of S1′ being a slip of DO′ and of the def. of ddef and input of DO loops.

We now aim to use Program equivalence D.4 with the merged S1 and
B := B′[XL1i,XL2i\X1,X2], and need to show correctness of its P1-P5 preconditions. P1 (disjointness of (X1,X2,XL1i,XL2i,Y)) is due to P2-P4 and the definition of XL1i,XL2i (only the latter being live-on-exit from DO′); P2 ((XL1i,XL2i) ∩ input.DO = ∅) is due to P2, P4, Q2 and RE5; P3 is due to our P1 and P6; P4 ([B′ ≡ B′[XL1i,XL2i\X1,X2][X1,X2\XL1i,XL2i]]) is due to (X1,X2) ∩ glob.B′ = ∅ (our P1, P3, P4) and the redundancy of reversed double sub.; and finally P5 is given by the induction hypothesis (Q1 of S1′); then

(XL1i,XL2i := X1,X2;
while B′ do S1′ od)[live XL2i,Y]
= {Program equivalence D.4 with B := B′[XL1i,XL2i\X1,X2] and
   S1 := merge-vars.(S1′,XLs,X,(XL1i,XL2i),(XL1i,XL2i),Y)}
(while B do S1 od;XL2i := X2)[live XL2i,Y].

We thus derive

merge-vars.(while B′ do S1′ od,XLs,X,(XL1i,XL2i),XL2i,Y) ≜
while B′[XL1i,XL2i\X1,X2] do merge-vars.(S1′,XLs,X,(XL1i,XL2i),(XL1i,XL2i),Y) od.

Q2. Finally, we get the required XLs ∩ glob.DO = ∅ due to the ind. hypo. (Q2 on S1′) and since (XLs ∩ glob.B′) ⊆ (XL1i,XL2i).
D.4 SSA is de-SSA-able
Theorem 8.4. Let S be any core statement and let X,Y := (def.S),(glob.S\def.S), X1 := (X ∩ ((X\ddef.S) ∪ input.S)) and (XL1i,XLf) := fresh.((X1,X),(X,Y)); let S′ be the SSA version of S, defined as S′ := toSSA.(S,X,XL1i,XLf,Y,(XL1i,XLf)); then S′ is de-SSA-able. That is, all preconditions, P1-P8, of the fromSSA algorithm hold for S″ := fromSSA.(S′,X,XL1i,XLf,Y,XLs) where XLs := ((XL1i,XLf) ∪ (def.S′\Y)).

Proof. Preconditions P1 (glob.S′ ⊆ (XLs,Y)) and P2 ((XL1i ∪ XLf) ⊆ XLs) hold by definition of XLs; P3 ((X1 ∪ X) ⊆ X) holds by definition of X1 and set theory; P4 (X ∩ (XLs,Y) = ∅) is due to the definition of X, Y, XLs, to RE5 (i.e. def.S ⊆ glob.S) and to Q2 of toSSA (i.e. X ∩ glob.S′ = ∅); and P6 ((XLs ∩ ((XLf\ddef.S′) ∪ input.S′)) ⊆ XL1i) is due to DP1 of toSSA.

We are left to show no-simultaneous-liveness (P5), no-def-on-live (P7) and no-multiple-defs (P8). These shall be proved by induction over the structure of S.

For P5 (no-simultaneous-liveness) we first note that having one final instance for each live-on-exit program variable, and similarly having one initial instance for each live-on-entry variable, as is guaranteed by toSSA's derived property DP1, ensures no simultaneous liveness on entry. Thus, in proving P5 for each specific case, we shall only be obliged to show no simultaneous liveness at internal program points.
Assignment
Recall toSSA.(X4,X2,X5,X6,Y1 := E1,E2,E3,E4,E5,
X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4f,XL5f),Y,XLs) ≜
XL4f,XL2,XL5f,XL6,Y1 := E1′,E2′,E3′,E4′,E5′ where
(E1′,E2′,E3′,E4′,E5′) := (E1,E2,E3,E4,E5)
[X1,X2,X3,X4\XL1i,XL2i,XL3i,XL4i]
and (XL2,XL6) := fresh.((X2,X6),(X,Y,XLs)).

P7 (no-def-on-live) and P8 (no-multiple-defs) are both due to the disjointness of (X2,X4,X5,X6) and (XL4f,XL5f), the freshness of (XL2,XL6) and hence the disjointness of (XL2,XL4f,XL5f,XL6). Note that XL4f,XL5f are actually live-on-exit, thus not breaking P7.
Sequential composition
Recall toSSA.(S1;S2,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4f,XL5f),Y,XLs) ≜
S1′;S2′ where both S1′, S2′ are constructed by recursive calls to toSSA.

P5, P7, P8: The induction hypothesis ensures no simultaneous liveness at any point of S1′ or S2′ and no def-on-live or multiple-defs in any internal assignment slip.
IF
toSSA.(IF,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4f,XL5f),Y,XLs) ≜ IF′
where IF := if B then S1 else S2,
IF′ := if B[X1,X2,X3,X4\XL1i,XL2i,XL3i,XL4i]
       then S1′;XL4f,XL5f := XL4t,XL5t else S2′;XL4f,XL5f := XL4e,XL5e,
XL4t := (XL4d1t,XL4d2i,XL4d1d2t),
XL4e := (XL4d1i,XL4d2e,XL4d1d2e),
S1′ := toSSA.(S1,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4t,XL5t),Y,XLs′)
and S2′ := toSSA.(S2,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4e,XL5e),Y,XLs″).

P5: We have (XL3i,XL4f,XL5f) live at the end of IF′ and at the end of both branches. The then branch, ending with the assignment XL4f,XL5f := XL4t,XL5t, yields (XL3i,XL4t,XL5t) as live at the end of S1′ (with one instance for each member of (X3,X4,X5)), and maintains no simultaneous liveness in S1′ due to the induction hypothesis. Similarly, the triple (XL3i,XL4e,XL5e) includes one instance for each member of (X3,X4,X5), being live at the end of S2′.

P7, P8: The (pseudo) assignments at the end of both branches of IF′ are both to the live-on-exit (XL4f,XL5f), final instances of program variables (X4,X5). This, along with the ind. hypo. on S1′ and S2′, maintains P7 and P8.
DO
toSSA.(DO,X,(XL1i,XL2i,XL3i,XL4i),(XL3i,XL4f),Y,XLs) ≜
XL2,XL4f := XL2i,XL4i;DO′
where DO := while B do S1 od,
DO′ := while B′ do S1′;XL2,XL4f := XL2b,XL4b od,
(XL2,XL2b,XL4b) := fresh.((X2,X2,X4),(X,Y,XLs)),
XLs′ := (XLs,XL2,XL2b,XL4b),
B′ := B[X1,X2,X3,X4\XL1i,XL2,XL3i,XL4f]
and S1′ := toSSA.(S1,X,(XL1i,XL2,XL3i,XL4f),(XL1i,XL2b,XL3i,XL4b),Y,XLs′).

P5. First, live instances ahead of DO′, as well as at the end of its body, are from (XL1i,XL2,XL3i,XL4f), one instance for each member of (X1,X2,X3,X4). This is so due to DP1 of S1′ (i.e. input.S1′ ⊆ (XL1i,XL2,XL3i,XL4f)) and the definition of B′. Then the assignment at the end of the loop body, XL2,XL4f := XL2b,XL4b, yields (XL1i,XL2b,XL3i,XL4b) as live instances on exit from S1′. The ind. hypo. ensures no simultaneous liveness in (and ahead of) S1′ itself.

P7, P8: As mentioned above, live instances at the end of the loop body are from (XL1i,XL2,XL3i,XL4f). Thus, the assignment there to the live (XL2,XL4f), one instance for each member of (X2,X4), indeed maintains both P7 and P8. Similarly, the assignment (to the same instances) ahead of DO′ (where, again, the same instances are live) maintains P7 and P8.
D.5 An SSA-based slice is de-SSA-able
Theorem 9.6. Any slide-independent statement from the SSA version of any core statement is de-SSA-able.

That is, let S be any core statement and let X,Y := (def.S),(glob.S\def.S), X1 := (X ∩ ((X\ddef.S) ∪ input.S)) and (XL1i,XLf) := fresh.((X1,X),(X,Y)); let S′ be the SSA version of S, defined as S′ := toSSA.(S,X,XL1i,XLf,Y,(XL1i,XLf)); let XLs := ((XL1i,XLf) ∪ (def.S′\Y)) be the full set of instances (of X, in S′) and let XLI be any (slide-independent) subset of those instances, with final instances XL2f := XLI ∩ XLf; finally, let SI′ := slides.S′.XLI be the corresponding (slide-independent) statement; then SI′ is de-SSA-able. That is, all preconditions, P1-P8, of the fromSSA algorithm hold for SI := fromSSA.(SI′,X,XL1i,XL2f,Y,XLs).

Proof. Preconditions P1 (glob.SI′ ⊆ (XLs,Y)) and P2 ((XL1i ∪ XL2f) ⊆ XLs) hold by definition of XLs; P3 ((X1 ∪ X2) ⊆ X) holds by definition of X1 (and set theory) and by definition of XL2f and its mapping to program variables X2 in X; and P4 (X ∩ (XLs,Y) = ∅) is due to the definition of X, Y, XLs and to Q2 of toSSA (i.e. X ∩ glob.S′ = ∅).

For P5, we observe that no-simultaneous-liveness is known for S′[live XL2f,Y] (Theorem 8.4). This property is preserved by taking the slides of any slide-independent set (Corollary C.3). Thus (slides.S′.XLI)[live XL2f,Y] enjoys no simultaneous liveness for instances of (each member of) X.
For P6 ((XLs ∩ ((XL2f\ddef.SI′) ∪ input.SI′)) ⊆ XL1i), we observe

XLs ∩ ((XL2f\ddef.(slides.S′.XLI)) ∪ input.(slides.S′.XLI))
= {recall def. of XLs := ((XL1i,XLf) ∪ (def.S′\Y));
   let XLs′ := XLs\XL1i such that XLs = (XL1i,XLs′);
   note that XLs′ ⊆ def.S′ since DP2 of toSSA and RE4 give XLf ⊆ def.S′}
(XL1i,XLs′) ∩ ((XL2f\ddef.(slides.S′.XLI)) ∪ input.(slides.S′.XLI))
⊆ {set theory}
XL1i ∪ (XLs′ ∩ ((XL2f\ddef.(slides.S′.XLI)) ∪ input.(slides.S′.XLI)))
= {XLI ∩ def.S′ ⊆ XLs′, XL2f ⊆ XLI and, since XLI is slide-independent in slides.S′,
   we get input.(slides.S′.XLI) ∩ def.S′ ⊆ XLI}
XL1i ∪ (XLI ∩ def.S′ ∩ ((XL2f\ddef.(slides.S′.XLI)) ∪ input.(slides.S′.XLI)))
⊆ {set theory}
XL1i ∪ (XLI ∩ def.S′ ∩ ((XL2f\(XLI ∩ ddef.(slides.S′.XLI)))
    ∪ (XLI ∩ input.(slides.S′.XLI))))
= {Lemma C.4 with S,V,X := S′,XLI,XLI:
   indeed XLI ∩ def.S′ ⊆ XLI}
XL1i ∪ (XLI ∩ def.S′ ∩ ((XL2f\(XLI ∩ ddef.S′))
    ∪ (XLI ∩ input.(slides.S′.XLI))))
= {set theory: XL2f ⊆ XLI by definition and
   XL2f ⊆ ddef.S′ by DP2 of toSSA}
XL1i ∪ (XLI ∩ def.S′ ∩ (XLI ∩ input.(slides.S′.XLI)))
⊆ {Lemma C.5 with S,VI,X := S′,XLI,XLI:
   indeed XLI ∩ def.S′ ⊆ XLI}
XL1i ∪ (XLI ∩ def.S′ ∩ (XLI ∩ input.S′))
= {XLs ∩ input.S′ ⊆ XL1i by DP1 of toSSA; XLI ⊆ XLs}
XL1i.
For P7 (no def-on-live), we recall that no def-on-live is known for S′[live XL2f,Y] (Theorem 8.4). As with P5, we show that this property is preserved by taking the slides of any slide-independent set. Since S′[live XL2f,Y] enjoys the property of no-def-on-live (of XLI), and since any assignment slip of the form (XLI1 := E1)[live XLI21,Y] in (slides.S′.XLI)[live XL2f,Y] has a corresponding slip (XLI1,coXLI1 := E1,E2)[live XLI2,Y] of S′[live XL2f,Y] with XLI21 ⊆ XLI2 (due to Theorem C.2), we observe that a defined instance x′ ∈ XLI1 may only cause a def-on-live violation if another instance x″ (of the same program variable x) is live-on-exit from the assignment, i.e. x″ ∈ XLI21. This can only happen if x″ ∈ XLI2 as well (since XLI21 ⊆ XLI2). In such a case, x′ already causes a def-on-live violation in the corresponding (XLI1,coXLI1 := E1,E2)[live XLI2,Y], thus contradicting the de-SSA-ability of S′[live XL2f,Y].

Finally, P8 (no-multiple-defs) holds for S′ (see Theorem 8.4) and hence for any of its slides.
Appendix E
Final-Use Substitution
E.1 Formal derivation
The following is a formal derivation of final-use substitution, for any core statement S and matching sets of variables X and X′.

We begin with S;{X = X′} where X′ ∩ glob.S = ∅, and while propagating the assertion backwards into S, as far as possible, we make local assertion-based substitutions to each slip that ends up being preceded by the assertion. Finally, we remove all assertions.

An equality x = x′ will successfully propagate backward over any statement that does not define x; it will also propagate into any IF statement and into those DO loops whose body does not define x.

Side note: in terms of control-flow paths, the assertion ends up propagating to (the entry of) any node from which all paths to the exit involve no definition of x. When we are interested in a formulation of cases in which all uses of x will be substituted, we should be able to express that as follows: all paths to the exit from any use of x are clear of definitions of x.
S = X1,Y := E1,E2: in deriving (X1,Y := E1,E2)[final-use X1,X2\X1′,X2′] when
(X1′,X2′) ∩ glob.S = ∅ and X2 ∩ Y = ∅, we observe

X1,Y := E1,E2;{X1,X2 = X1′,X2′}
= {split assertion (Law 15)}
X1,Y := E1,E2;{X2 = X2′};{X1 = X1′}
= {swap statement and assertion (Law 11): (X2,X2′) ∩ (X1,Y) = ∅}
{X2 = X2′};X1,Y := E1,E2;{X1 = X1′}
= {assertion-based sub. (Law 17); E1′ := E1[X2\X2′] and E2′ := E2[X2\X2′]}
{X2 = X2′};X1,Y := E1′,E2′;{X1 = X1′}
= {swap assertion and statement (Law 11): (X2,X2′) ∩ (X1,Y) = ∅}
X1,Y := E1′,E2′;{X2 = X2′};{X1 = X1′}
= {merge assertions (Law 15)}
X1,Y := E1′,E2′;{X1,X2 = X1′,X2′}.

We thus derive (X1,Y := E1,E2)[final-use X1,X2\X1′,X2′] ≜
X1,Y := E1[X2\X2′],E2[X2\X2′].
S = S1;S2: in deriving (S1;S2)[final-use X1,X2\X1′,X2′] when X2 ∩ def.S2 = ∅ and
(X1′,X2′) ∩ glob.S = ∅, we observe

S1;S2;{X1,X2 = X1′,X2′}
= {final-use sub.: let S2′ := S2[final-use X1,X2\X1′,X2′]}
S1;S2′;{X1,X2 = X1′,X2′}
= {split assertion (Law 15)}
S1;S2′;{X2 = X2′};{X1 = X1′}
= {swap statement and assertion (Law 11): (X2,X2′) ∩ def.S2 = ∅}
S1;{X2 = X2′};S2′;{X1 = X1′}
= {final-use sub.: let S1′ := S1[final-use X2\X2′]}
S1′;{X2 = X2′};S2′;{X1 = X1′}
= {swap assertion and statement (Law 11): (X2,X2′) ∩ def.S2′ = ∅}
S1′;S2′;{X2 = X2′};{X1 = X1′}
= {merge assertions (Law 15)}
S1′;S2′;{X1,X2 = X1′,X2′}.

We thus derive (S1;S2)[final-use X1,X2\X1′,X2′] ≜
S1[final-use X2\X2′];S2[final-use X1,X2\X1′,X2′] where X1 := X ∩ def.S2, X2 := X\X1
and X1′,X2′ are the corresponding subsets of X′.
S = if B then S1 else S2: in deriving
(if B then S1 else S2)[final-use X1,X2\X1′,X2′]
when X2 ∩ (def.S1 ∪ def.S2) = ∅ and (X1′,X2′) ∩ glob.S = ∅, we observe

if B then S1 else S2;{X1,X2 = X1′,X2′}
= {dist. assertion over IF (Law 12)}
if B then S1;{X1,X2 = X1′,X2′} else S2;{X1,X2 = X1′,X2′}
= {final-use sub., twice: let S1′ := S1[final-use X1,X2\X1′,X2′] and
   S2′ := S2[final-use X1,X2\X1′,X2′]}
if B then S1′;{X1,X2 = X1′,X2′} else S2′;{X1,X2 = X1′,X2′}
= {dist. IF over ‘;’ (Law 4)}
if B then S1′ else S2′;{X1,X2 = X1′,X2′}
= {split assertion (Law 15)}
if B then S1′ else S2′;{X2 = X2′};{X1 = X1′}
= {swap statement and assertion (Law 11): (X2,X2′) ∩ (def.S1′ ∪ def.S2′) = ∅}
{X2 = X2′};if B then S1′ else S2′;{X1 = X1′}
= {assertion-based sub. (Law 17)}
{X2 = X2′};if B[X2\X2′] then S1′ else S2′;{X1 = X1′}
= {swap assertion and statement (Law 11): (X2,X2′) ∩ (def.S1′ ∪ def.S2′) = ∅}
if B[X2\X2′] then S1′ else S2′;{X2 = X2′};{X1 = X1′}
= {merge assertions (Law 15)}
if B[X2\X2′] then S1′ else S2′;{X1,X2 = X1′,X2′}.

We thus derive (if B then S1 else S2)[final-use X1,X2\X1′,X2′] ≜
if B[X2\X2′] then S1[final-use X1,X2\X1′,X2′] else S2[final-use X1,X2\X1′,X2′] where
X1 := X ∩ def.(S1,S2), X2 := X\X1 and X1′,X2′ are the corresponding subsets of X′.
S = while B do S1 od: in deriving (while B do S1 od)[final-use X1,X2\X1′,X2′]
(when X2 ∩ def.S1 = ∅ and (X1′,X2′) ∩ glob.S = ∅), we observe

while B do S1 od;{X1,X2 = X1′,X2′}
= {split assertion (Law 15)}
while B do S1 od;{X2 = X2′};{X1 = X1′}
= {swap statement and assertion (Law 11): (X2,X2′) ∩ def.S1 = ∅}
{X2 = X2′};while B do S1 od;{X1 = X1′}
= {prop. assertion forward into loop (Law 16): (X2,X2′) ∩ def.S1 = ∅}
{X2 = X2′};while B do {X2 = X2′};S1 od;{X1 = X1′}
= {swap assertion and statement (Law 11): (X2,X2′) ∩ def.S1 = ∅}
{X2 = X2′};while B do S1;{X2 = X2′} od;{X1 = X1′}
= {final-use sub.: let S1′ := S1[final-use X2\X2′]}
{X2 = X2′};while B do S1′;{X2 = X2′} od;{X1 = X1′}
= {assertion-based sub. (Law 17): let B′ := B[X2\X2′]}
{X2 = X2′};while B′ do S1′;{X2 = X2′} od;{X1 = X1′}
= {swap statement and assertion (Law 11): (X2,X2′) ∩ def.S1′ = ∅}
{X2 = X2′};while B′ do {X2 = X2′};S1′ od;{X1 = X1′}
= {prop. assertion backward outside loop (Law 16): (X2,X2′) ∩ def.S1′ = ∅}
{X2 = X2′};while B′ do S1′ od;{X1 = X1′}
= {swap assertion and statement (Law 11): (X2,X2′) ∩ def.S1′ = ∅}
while B′ do S1′ od;{X2 = X2′};{X1 = X1′}
= {merge assertions (Law 15)}
while B′ do S1′ od;{X1,X2 = X1′,X2′}.

We thus derive (while B do S1 od)[final-use X1,X2\X1′,X2′] ≜
(while B[X2\X2′] do S1[final-use X2\X2′] od) where X1 := X ∩ def.S1, X2 := X\X1 and
X2′ is the subset of X′ corresponding to X2.
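The four derived cases assemble into one recursive procedure. The following Python sketch is a hypothetical rendering (tagged tuples for statements and token tuples for expressions are assumptions): at each construct, only the part of the substitution surviving the relevant definitions is applied, exactly as in the derivations above.

    def defs(s):
        # def.S for the four core-statement forms
        tag = s[0]
        if tag == "assign":  return set(s[1])            # ("assign", targets, exprs)
        if tag == "seq":     return defs(s[1]) | defs(s[2])
        if tag == "if":      return defs(s[2]) | defs(s[3])
        if tag == "while":   return defs(s[2])

    def subst_expr(e, sub):
        return tuple(sub.get(t, t) for t in e)

    def final_use(s, sub):
        """S[final-use X\\X'] with sub mapping each x in X to x'."""
        tag = s[0]
        if tag == "assign":
            # only variables not defined by this assignment are substituted
            live = {v: w for v, w in sub.items() if v not in s[1]}
            return ("assign", s[1], [subst_expr(e, live) for e in s[2]])
        if tag == "seq":
            # S1 keeps only the part of the substitution not defined in S2
            sub1 = {v: w for v, w in sub.items() if v not in defs(s[2])}
            return ("seq", final_use(s[1], sub1), final_use(s[2], sub))
        if tag == "if":
            # the guard keeps only variables defined in neither branch
            live = {v: w for v, w in sub.items()
                    if v not in defs(s[2]) | defs(s[3])}
            return ("if", subst_expr(s[1], live),
                    final_use(s[2], sub), final_use(s[3], sub))
        if tag == "while":
            # variables defined in the body drop out of the substitution entirely
            live = {v: w for v, w in sub.items() if v not in defs(s[2])}
            return ("while", subst_expr(s[1], live), final_use(s[2], live))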
E.2 Lemmata for proving statement dup. with final use
Lemma 10.2. Let S be any core statement with def.S = (V,coV), Vr ⊆ V (and fVr the corresponding subset of fV) and (iV,icoV,fV) ∩ glob.S = ∅; we then have

iV,icoV := V,coV;
S;
fV := V;
V,coV := iV,icoV;
S[final-use Vr\fVr];
{Vr = fVr}
=
iV,icoV := V,coV;
S;
fV := V;
V,coV := iV,icoV;
S[final-use Vr\fVr]
Proof.
iV ,icoV := V,coV ;S;fV := V;
V,coV := iV ,icoV ;S[final-use Vr \fVr];{Vr =fVr }
={swap statement and assertion (Law 10)}
iV ,icoV := V,coV ;S;fV := V;
V,coV := iV ,icoV ;{wp.S[final-use Vr \fVr].(Vr =fVr )};
S[final-use Vr \fVr]
={property of final-use sub. (Lemma E.1, see below)}
iV ,icoV := V,coV ;S;fV := V;
V,coV := iV ,icoV ;{wp.S.(Vr =fVr)};S[final-use Vr \fVr ]
={intro. following assertion (Law 7)}
iV ,icoV := V,coV ;({iV ,icoV =V,coV };S;fV := V;
V,coV := iV ,icoV );{wp.S.(Vr =fVr)};S[final-use Vr \fVr ]
={swap statement and assertion (Law 10) and wp of ‘ ;}
iV ,icoV := V,coV ;
{wp.{iV ,icoV =V,coV };S;fV := V;V,coV := iV ,icoV ;
S.(Vr =fVr)};({iV ,icoV =V,coV };S;fV := V;
V,coV := iV ,icoV );S[final-use Vr \fVr]
={Lemma E.2, see below}
iV ,icoV := V,coV ;{wp.{iV ,icoV =V,coV };S.true };
({iV ,icoV =V,coV };S);fV := V;V,coV := iV ,icoV ;
S[final-use Vr \fVr]
={swap assertion and statement (Law 10)}
iV ,icoV := V,coV ;({iV ,icoV =V,coV };S);{true };fV := V;
V,coV := iV ,icoV ;S[final-use Vr \fVr]
={remove true assertion; remove following assertion (Law 7)}
iV ,icoV := V,coV ;S;fV := V;V,coV := iV ,icoV ;
S[final-use Vr \fVr].
Lemma E.1. Let S, V, fV be any core statement and two sets of variables, respectively; we then have

[wp.(S[final-use V\fV ]).(V = fV) ≡ wp.S.(V = fV)]

provided fV ∩ (V ∪ glob.S) = ∅.

Proof.

wp.S.(V = fV)
= {pred. calc.}
wp.S.((V = fV) ∧ true)
= {wp of assertions}
wp.S.(wp.{V = fV}.true)
= {wp of ‘;’}
wp.(S;{V = fV}).true
= {final-use sub.}
wp.(S[final-use V\fV ];{V = fV}).true
= {wp of ‘;’}
wp.(S[final-use V\fV ]).(wp.{V = fV}.true)
= {wp of assertions}
wp.(S[final-use V\fV ]).((V = fV) ∧ true)
= {pred. calc.}
wp.(S[final-use V\fV ]).(V = fV).
Lemma E.2. Let S be any core statement with def.S = (V,coV). We then have

[wp.({iV,icoV = V,coV};S;fV := V;V,coV := iV,icoV;S).(Vr = fVr) ≡
 wp.({iV,icoV = V,coV};S).true]

provided Vr ⊆ V, fVr is the corresponding subset of fV and (iV,icoV,fV) ∩ glob.S = ∅.

Proof.

wp.({iV,icoV = V,coV};S;fV := V;V,coV := iV,icoV;S).(Vr = fVr)
= {statement duplication (Lemma 6.3)}
wp.({iV,icoV = V,coV};S;fV := V).(Vr = fVr)
= {wp of ‘;’ and ‘:=’}
wp.({iV,icoV = V,coV};S).((fV := V).(Vr = fVr))
= {normal sub. (proviso)}
wp.({iV,icoV = V,coV};S).true.
E.3 Stepwise final-use substitution
Theorem E.3. The final-use substitution can be performed in a stepwise manner. That is, for any core statement S and four sets of variables X1, X2, fX1, fX2, we have

S[final-use X1,X2\fX1,fX2] = S[final-use X1\fX1][final-use X2\fX2]

provided (fX1,fX2) ∩ glob.S = ∅.

Proof. We follow the semantic requirement of final-use substitution
(S;{X = fX} = S[final-use X\fX ];{X = fX}) and observe

S[final-use X1,X2\fX1,fX2];{X1,X2 = fX1,fX2}
= {final-use sub.: (fX1,fX2) ∩ glob.S = ∅ (proviso)}
S;{X1,X2 = fX1,fX2}
= {split assertion: Law 15}
S;{X1 = fX1};{X2 = fX2}
= {final-use sub.: fX1 ∩ glob.S = ∅ (proviso)}
S[final-use X1\fX1];{X1 = fX1};{X2 = fX2}
= {swap statements (Program equivalence 5.7): def of assertions is empty}
S[final-use X1\fX1];{X2 = fX2};{X1 = fX1}
= {final-use sub.: fX2 ∩ glob.(S[final-use X1\fX1]) = ∅ since
   fX2 ∩ (fX1 ∪ glob.S) = ∅ (as implied by the proviso)}
S[final-use X1\fX1][final-use X2\fX2];{X2 = fX2};{X1 = fX1}
= {merge assertions: Law 15}
S[final-use X1\fX1][final-use X2\fX2];{X1,X2 = fX1,fX2}.
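Using the final_use sketch above, the stepwise property can be checked on a tiny example (fx, fy are assumed fresh, so the proviso holds):

    S = ("seq", ("assign", ("x",), [("y", "+", "1")]),
                ("assign", ("z",), [("x", "*", "y")]))
    both = final_use(S, {"x": "fx", "y": "fy"})
    stepwise = final_use(final_use(S, {"x": "fx"}), {"y": "fy"})
    assert both == stepwise   # both yield x := fy+1 ; z := fx*fy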
Appendix F
Summary of Laws
F.1 Manipulating core statements
Law 1. Let X, Y, E1, E2 be two sets of variables and two sets of expressions, respectively; then

X := E1;Y := E2 = X,Y := E1,E2

provided X ∩ (Y ∪ glob.E2) = ∅.

Law 2. Let S, X be a statement and a set of variables, respectively; then

S = S;X := X.

Law 3. Let S, S1, S2, B be three statements and a boolean expression, respectively; then

S;if B then S1 else S2 = if B then S;S1 else S;S2

provided def.S ∩ glob.B = ∅.

Law 4. Let S1, S2, S3, B be three statements and a boolean expression, respectively; then

if B then S1 else S2;S3 = if B then S1;S3 else S2;S3.

Law 5. Let S1, X, B, E be any statement, set of variables, boolean expression and set of expressions, respectively; then

{X = E};while B do S1;(X := E) od = {X = E};while B do S1 od;(X := E)

provided X ∩ (glob.B ∪ input.S1 ∪ glob.E) = ∅.

Law 6. Let X, E be any set of variables and set of expressions, respectively; then

{X = E} = {X = E};X := E.
F.2 Assertion-based program analysis
F.2.1 Introduction of assertions
Law 7. Let X, Y, E1, E2 be two sets of variables and two sets of expressions, respectively; then

X, Y := E1, E2 = X, Y := E1, E2 ; {Y = E2}

provided (X, Y) ∉ glob.E2.

Law 8. Let X, X′, E be (same-length) lists of variables and expressions, respectively, with X ∉ X′;
then

X, X′ := E, E = X, X′ := E, E ; {X = X′}.

Law 9. Let S1, B1, B2 be any given statement and two boolean expressions, respectively; then

while B1 do S1 od = while B1 do {B2} ; S1 od

provided [B1 ⇒ B2].
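Two small instances: by Law 7, x, y := z, 5 = x, y := z, 5 ; {y = 5}, since glob.E2 is empty here; and by Law 9, taking B1 = (x > 0) and B2 = (x ≥ 0), for which [x > 0 ⇒ x ≥ 0],

while x > 0 do S1 od = while x > 0 do {x ≥ 0} ; S1 od.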
F.2.2 Propagation of assertions
Law 10. Let S, B be a statement and boolean expression, respectively; then

{wp.S.B} ; S = S ; {B}.
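For instance, with S = (x := x + 1) and B = (x > 0) we have wp.S.B ≡ (x + 1 > 0), and Law 10 reads

{x + 1 > 0} ; x := x + 1 = x := x + 1 ; {x > 0}.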
Law 11. Let S, B be a statement and boolean expression, respectively; then

{B} ; S = S ; {B}

provided def.S ∉ glob.B.

Law 12. Let S1, S2, B1, B2 be two statements and two boolean expressions, respectively; then

{B1} ; if B2 then S1 else S2 = if B2 then {B1} ; S1 else {B1} ; S2.

Law 13. Let S, B1, B2, B3, B4 be a statement and four boolean expressions, respectively; then

{B1} ; while B2 do S ; {B3} od = {B1} ; while B2 do {B4} ; S ; {B3} od

provided [B1 ⇒ B4] and [B3 ⇒ B4].

Law 14. Let S, B1, B2, B3, B4 be a statement and four boolean expressions, respectively; then

{B1} ; while B2 do S ; {B3} od = {B1} ; while B2 ∧ B4 do S ; {B3} od

provided [B1 ⇒ B4] and [B3 ⇒ B4].
Law 15. Let B1, B2 be two boolean expressions; then

{B1 ∧ B2} = {B1} ; {B2}.
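For instance, {x = 1 ∧ y = 2} = {x = 1} ; {y = 2}: both sides act as skip precisely when x = 1 and y = 2 both hold.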
Law 16. Let S, B1, B2 be a statement and two boolean expressions, respectively; then

{B1} ; while B2 do S od = {B1} ; while B2 do {B1} ; S od

provided glob.B1 ∉ def.S.
F.2.3 Substitution
Law 17. Let S1, S2, B be two statements and a boolean expression, respectively; let X, E be a
set of variables and a corresponding list of expressions; and let Y, Y′ be two sets of variables; then

{Y = Y′} ; X := E = {Y = Y′} ; X := E[Y\Y′] ;
{Y = Y′} ; IF = {Y = Y′} ; IF′ ; and
{Y = Y′} ; DO = {Y = Y′} ; DO′

where IF := if B then S1 else S2,
IF′ := if B[Y\Y′] then S1 else S2,
DO := while B do S1 ; {Y = Y′} od
and DO′ := while B[Y\Y′] do S1 ; {Y = Y′} od.
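An instance of the assignment case: {y = y′} ; x := y + 1 = {y = y′} ; x := y′ + 1; the recorded equality licenses replacing the use of y with y′.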
Law 18. Let S1, S2, B be two statements and a boolean expression, respectively; let
X, X′, Y, Z, E1, E1′, E2, E3 be four lists of variables and corresponding lists of expressions; then

X, Y := E1, E2 ; Z := E3 = X, Y := E1, E2 ; Z := E3[Y\E2] ;
X, Y := E1, E2 ; IF = X, Y := E1, E2 ; IF′ ; and
X, Y := E1, E2 ; DO = X, Y := E1, E2 ; DO′

provided ((X ∪ X′), Y) ∉ glob.E2

where IF := if B then S1 else S2,
IF′ := if B[Y\E2] then S1 else S2,
DO := while B do S1 ; X′, Y := E1′, E2 od
and DO′ := while B[Y\E2] do S1 ; X′, Y := E1′, E2 od.
F.3 Live variables analysis
F.3.1 Introduction and removal of liveness information
Law 19. Let S, V be any statement and set of variables, respectively, with def.S ⊆ V; then

S = S[live V].
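For instance, with S = (x, y := 1, 2) and V = {x, y, z} we have def.S = {x, y} ⊆ V, and hence x, y := 1, 2 = (x, y := 1, 2)[live {x, y, z}]: annotating a statement with a live set that covers all the variables it defines changes nothing.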
F.3.2 Propagation of liveness information
Law 20. Let S1, S2, V1, V2 be any two statements and two sets of variables, respectively; then

(S1 ; S2)[live V1] = (S1[live V2] ; S2[live V1])[live V1]

provided V2 = (V1 \ ddef.S2) ∪ input.S2.

Law 21. Let B, S1, S2, V be any boolean expression, two statements and set of variables, respectively; then

(if B then S1 else S2)[live V] = (if B then S1[live V] else S2[live V])[live V].

Law 22. Let B, S, V1, V2 be any boolean expression, statement and two sets of variables, respectively; then

(while B do S od)[live V1] = (while B do S[live V2] od)[live V1]

provided V2 = V1 ∪ (glob.B ∪ input.S).
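A worked instance of Law 20: take S1 = (x := y), S2 = (z := x) and V1 = {z}. Then ddef.S2 = {z} and input.S2 = {x}, so V2 = ({z} \ {z}) ∪ {x} = {x}, giving

(x := y ; z := x)[live {z}] = ((x := y)[live {x}] ; (z := x)[live {z}])[live {z}].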
F.3.3 Dead assignments: introduction and elimination
Law 23. Let S, V, X, Y, E1, E2 be any statement, three sets of variables and two sets of expressions, respectively; then

(S ; X := E1)[live V] = (S ; X, Y := E1, E2)[live V]

provided Y ∉ (X ∪ V).

Law 24. Let S, V, Y, E be any statement, two sets of variables and set of expressions, respectively;
then

S[live V] = (S ; Y := E)[live V]

provided Y ∉ V.

Law 25. Let S, V, X, Y, E1, E2 be any statement, three sets of variables and two sets of expressions, respectively; then

(X := E1 ; S)[live V] = (X, Y := E1, E2 ; S)[live V]

provided Y ∉ (X ∪ (V \ ddef.S) ∪ input.S).

Law 26. Let B, S1, S2, Y, V, E be a boolean expression, two statements, two sets of variables
and a set of expressions, respectively; then

(S1 ; while B do S2 od)[live V] = (S1 ; while B do S2 ; (Y := E) od)[live V]

provided Y ∉ (V ∪ glob.B ∪ input.S2).
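Read from right to left, Law 24 eliminates a trailing dead assignment; for instance, with V = {x}, Y = {t} and t ∉ {x},

(x := y ; t := x + 1)[live {x}] = (x := y)[live {x}].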
Bibliography
[1] A. V. Aho, R. Sethi, and J. D. Ullman. Compilers: Principles, Techniques and Tools. Addison-
Wesley, 1988. 18
[2] R.-J. Back. On the Correctness of Refinement Steps in Program Development. PhD thesis,
Åbo Akademi, Department of Computer Science, Helsinki, Finland, 1978. Report A–1978–4.
36, 50
[3] R.-J. Back. Correctness Preserving Program Refinements: Proof Theory and Applications,
volume 131 of Mathematical Center Tracts. Mathematical Centre, Amsterdam, The Nether-
lands, 1980. 49
[4] R.-J. Back. A Calculus of Refinements for Program Derivations. Acta Informatica, 25:593–
624, 1988. 36
[5] L. Badger and M. Weiser. Minimizing Communication for Synchronizing Parallel Dataflow
Programs. In International Conference on Parallel Processing (ICPP), The Pennsylvania State
University, University Park, PA, USA, August 1988, pages 122–126, 1988. 16
[6] T. Ball and S. Horwitz. Slicing Programs with Arbitrary Control-flow. In Automated and
Algorithmic Debugging (AADEBUG), pages 206–222, 1993. 17, 18, 142
[7] K. Beck. Extreme Programming Explained: Embrace Change. Addison-Wesley Longman
Publishing Co., Inc., Boston, MA, USA, 2000. 1
[8] D. Binkley. The Application of Program Slicing to Regression Testing. Information and
Software Technology, 40(11-12):583–594, 1998. 16
[9] D. Binkley, L. R. Raszewski, C. Smith, and M. Harman. An Empirical Study of Amorphous
Slicing as a Program Comprehension Support Tool. In IWPC ’00: Proceedings of the 8th
International Workshop on Program Comprehension, page 161, Washington, DC, USA, 2000.
IEEE Computer Society. 16
[10] A. Cimitile, A. D. Lucia, and M. Munro. Identifying Reusable Functions Using Specification
Driven Program Slicing: A Case Study. In ICSM ’95: Proceedings of the International
Conference on Software Maintenance, pages 124–133, 1995. 21
[11] M. Cornélio. Refactorings as Formal Refinements. PhD thesis, Universidade Federal de
Pernambuco, Centro de Informática, Recife, PE, Brazil, 2004. 14, 36, 139, 142
[12] R. Cytron, J. Ferrante, B. K. Rosen, M. N. Wegman, and F. K. Zadeck. Efficiently Computing
Static Single Assignment Form and the Control Dependence Graph. ACM Transactions on
Programming Languages and Systems, 13(4):451–490, 1991. 19
[13] E. W. Dijkstra and C. S. Scholten. Predicate Calculus and Program Semantics. Springer-
Verlag New York, Inc., New York, NY, USA, 1990. 26, 28, 29, 30, 31, 32, 33, 34, 35, 37, 41,
42, 46, 150, 153
[14] S. Drape. Obfuscation of Abstract Data Types. DPhil thesis, University of Oxford, United
Kingdom, 2004. 143
[15] M. B. Dwyer, J. Hatcliff, M. Hoosier, V. P. Ranganath, Robby, and T. Wallentine. Evaluating
the Effectiveness of Slicing for Model Reduction of Concurrent Object-Oriented Programs.
In Tools and Algorithms for the Construction and Analysis of Systems, 12th International
Conference, TACAS 2006, Vienna, Austria, pages 73–89, 2006. 16, 17, 141
[16] M. D. Ernst. Practical fine-grained static slicing of optimized code. Technical Report MSR-
TR-94-14, Microsoft Research, Redmond, WA, July 26, 1994. 18
[17] R. Ettinger and M. Verbaere. Untangling: A Slice Extraction Refactoring. In AOSD ’04:
Proceedings of the 3rd International Conference on Aspect-Oriented Software Development,
pages 93–101, New York, NY, USA, 2004. ACM Press. 2, 16, 141, 142
[18] J. Field, G. Ramalingam, and F. Tip. Parametric program slicing. In POPL ’95: Proceedings
of the 22nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages,
pages 379–392, New York, NY, USA, 1995. ACM Press. 18
[19] J. Field and F. Tip. Dynamic Dependence in Term Rewriting Systems and its Application
to Program Slicing. In PLILP ’94: Proceedings of the 6th International Symposium on
Programming Language Implementation and Logic Programming, pages 415–431, London,
UK, 1994. Springer-Verlag. 18
[20] M. Fowler. Refactoring: Improving the Design of Existing Code. Addison Wesley, 2000. 1,
2, 4, 7, 12, 138, 139
[21] M. Fowler. Crossing Refactoring’s Rubicon. February 2001.
http://www.martinfowler.com/articles/refactoringRubicon.html. 139
[22] K. B. Gallagher and J. R. Lyle. Using Program Slicing in Software Maintenance. IEEE
Transactions on Software Engineering, 17(8):751–761, 1991. 21, 24, 102, 136
[23] E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design Patterns: Elements of Reusable
Object-Oriented Software. Addison-Wesley, 1995. 36
[24] J. Gibbons. Fission for Program Comprehension. In T. Uustalu, editor, Mathematics of Pro-
gram Construction, 8th International Conference, MPC 2006, Kuressaare, Estonia, volume
4014 of Lecture Notes in Computer Science, pages 162–179. Springer, 2006. 24
[25] J. Gosling, B. Joy, G. Steele, and G. Bracha. The Java Language Specification Second Edition.
Addison-Wesley, Boston, Mass., 2000. 13
[26] W. Griswold and D. Notkin. Automated Assistance for Program Restructuring. ACM Transactions on Software Engineering and Methodology, 2(3):228–269, July 1993. 21
[27] M. Harman, D. Binkley, and S. Danicic. Amorphous Program Slicing. Journal of Systems
and Software, 68(1):45–64, 2003. 18
[28] M. Harman, D. Binkley, R. Singh, and R. M. Hierons. Amorphous Procedure Extraction.
In SCAM ’04: Proceedings of the Fourth IEEE International Workshop on Source Code
Analysis and Manipulation, pages 85–94, 2004. 24
[29] M. Harman and S. Danicic. Using Program Slicing to Simplify Testing. Software Testing,
Verification and Reliability, 5(3):143–162, 1995. 16
[30] C. A. R. Hoare, I. J. Hayes, H. Jifeng, C. C. Morgan, A. W. Roscoe, J. W. Sanders, I. H.
Sorensen, J. M. Spivey, and B. A. Sufrin. Laws of Programming. Communications of the
ACM, 30(8):672–686, 1987. 49
[31] S. Horwitz, J. Prins, and T. W. Reps. Integrating Non-Interfering Versions of Programs. In
POPL ’88: Proceedings of the 15th ACM SIGPLAN-SIGACT Symposium on Principles of
Programming Languages, pages 133–145, 1988. 143
[32] S. Horwitz, J. Prins, and T. W. Reps. Integrating Noninterfering Versions of Programs. ACM
Transactions on Programming Languages and Systems (TOPLAS), 11(3):345–387, 1989. 143
[33] S. Horwitz, T. W. Reps, and D. Binkley. Interprocedural Slicing Using Dependence Graphs.
ACM Transactions on Programming Languages and Systems (TOPLAS), 12(1):26–60, 1990.
17, 18
[34] I. Jacobson, G. Booch, and J. Rumbaugh. The Unified Software Development Process.
Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1999. 1
[35] G. Jayaraman, V. P. Ranganath, and J. Hatcliff. Kaveri: Delivering the Indus Java Program
Slicer to Eclipse. In Fundamental Approaches to Software Engineering, 8th International
Conference, FASE 2005, pages 269–272, 2005. 17, 141
[36] G. Kiczales, J. Lamping, A. Mendhekar, C. Maeda, C. Lopes, J.-M. Loingtier, and J. Irwin.
Aspect-Oriented Programming. In M. Akşit and S. Matsuoka, editors, Proceedings European
Conference on Object-Oriented Programming, volume 1241, pages 220–242. Springer-Verlag,
Berlin, Heidelberg, and New York, 1997. 141
[37] R. Komondoor. Automated Duplicated-Code Detection and Procedure Extraction. PhD
thesis, University of Wisconsin-Madison, WI, USA, 2003. 103, 125, 140, 143
[38] R. Komondoor and S. Horwitz. Semantics-Preserving Procedure Extraction. In POPL ’00:
Proceedings of the 27th ACM SIGPLAN-SIGACT Symposium on Principles of Programming
Languages, pages 155–169, New York, NY, USA, 2000. ACM Press. 22, 103, 125, 139, 143
[39] R. Komondoor and S. Horwitz. Effective Automatic Procedure Extraction. In Proceedings
of the 11th IEEE International Workshop on Program Comprehension, 2003. 23, 24, 81, 103,
125, 139, 140, 143
[40] A. Lakhotia and J.-C. Deprez. Restructuring Programs by Tucking Statements into Functions.
Information and Software Technology, 40(11-12):677–690, 1998. 16, 21, 68, 102, 139
[41] F. Lanubile and G. Visaggio. Extracting Reusable Functions by Flow Graph-Based Program
Slicing. IEEE Transactions on Software Engineering, 23(4):246–259, 1997. 21
[42] K. Maruyama. Automated Method-Extraction Refactoring by using Block-Based Slicing. In
SSR ’01: Proceedings of the 2001 Symposium on Software Reusability, pages 31–40. ACM
Press, 2001. 8, 16, 21
[43] T. M. Meyers and D. Binkley. Slice-Based Cohesion Metrics and Software Intervention. In
11th Working Conference on Reverse Engineering (WCRE 2004), November 2004, Delft, The
Netherlands, pages 256–265. IEEE Computer Society, 2004. 16
[44] L. Millett and T. Teitelbaum. Slicing Promela and its Applications to Model Checking. In
Proceedings on Model Checking of Software, 1998. 16
[45] C. Morgan. Programming from Specifications (2nd ed.). Prentice Hall International (UK)
Ltd., Hertfordshire, UK, 1994. 14, 30, 36, 37, 46, 47, 49, 50, 153
[46] S. S. Muchnick. Advanced Compiler Design and Implementation. Morgan Kaufmann, 1997.
19
[47] F. Nielson, H. R. Nielson, and C. Hankin. Principles of Program Analysis. Springer, December
2004. 18, 48, 52, 73
[48] W. F. Opdyke. Refactoring Object-Oriented Frameworks. PhD thesis, University of Illinois
at Urbana-Champaign, IL, USA, 1992. 1, 12, 14, 15, 21
[49] K. Ottenstein and L. Ottenstein. The Program Dependence Graph in a Software Development
Environment. In Proceedings of the ACM SIGSOFT/SIGPLAN Software Engineering Symposium on
Practical Software Development Environments, pages 177–184, 1984. 18
[50] J. Rilling and T. Klemola. Identifying Comprehension Bottlenecks using Program Slicing
and Cognitive Complexity Metrics. In IWPC ’03: Proceedings of the 11th IEEE Interna-
tional Workshop on Program Comprehension, page 115, Washington, DC, USA, 2003. IEEE
Computer Society. 16
[51] J. Rilling and S. P. Mudur. 3D Visualization Techniques to Support Slicing-Based Program
Comprehension. Computers & Graphics, 29(3):311–329, 2005. 16
[52] D. Roberts. Practical Analysis for Refactoring. PhD thesis, University of Illinois at Urbana-
Champaign, IL, USA, 1999. 13, 15
[53] D. Roberts, J. Brant, and R. Johnson. A Refactoring Tool for Smalltalk. Theory and Practice
of Object Systems, 3(4), 1997. 13
[54] J. Singer. Static Program Analysis based on Virtual Register Renaming. DPhil thesis,
University of Cambridge, United Kingdom, 2006. 18, 19, 75
[55] M. Verbaere, R. Ettinger, and O. de Moor. JunGL: a Scripting Language for Refactoring.
In D. Rombach and M. L. Soffa, editors, ICSE’06: Proceedings of the 28th International
Conference on Software Engineering, pages 172–181, New York, NY, USA, 2006. ACM Press.
13
[56] M. Ward. Proving Program Refinements and Transformations. DPhil thesis, University of
Oxford, United Kingdom, 1989. 36, 49, 50
[57] M. P. Ward. Program Slicing via FermaT Transformations. In Computer Software and
Applications Conference, 2002. COMPSAC 2002. Proceedings. 26th Annual International,
pages 357–362, 2002. 36, 78
[58] M. P. Ward. Pigs from Sausages? Reengineering from Assembler to C via FermaT Transfor-
mations. Science of Computer Programming, 52:213–255, 2004. 36
[59] M. P. Ward, H. Zedan, and T. Hardcastle. Conditioned Semantic Slicing via Abstraction and
Refinement in FermaT. In CSMR ’05: Proceedings of the Ninth European Conference on
Software Maintenance and Reengineering, pages 178–187, 2005. 17, 18, 36, 78
[60] D. Weise, R. F. Crew, M. D. Ernst, and B. Steensgaard. Value Dependence Graphs: Rep-
resentation Without Taxation. In Proceedings of the 21st Annual ACM SIGPLAN-SIGACT
Symposium on Principles of Programming Languages, pages 297–310, Portland, OR, Jan.
1994. 18
[61] M. Weiser. Program Slicing. In ICSE ’81: Proceedings of the 5th International Conference
on Software Engineering, pages 439–449, 1981. 7, 8, 16, 18, 19
[62] M. Weiser. Programmers Use Slices When Debugging. Communications of the ACM,
25(7):446–452, 1982. 8, 16, 144
[63] M. Weiser. Reconstructing Sequential Behavior from Parallel Behavior Projections. Informa-
tion Processing Letters, 17(3):129–135, 1983. 16
[64] M. Weiser. Program Slicing. IEEE Transactions on Software Engineering, 10(4):352–357,
1984. 7, 17, 18, 126
[65] Agile alliance website. http://www.agilealliance.com/. 1
[66] Eclipse Website. http://www.eclipse.org/. 2
[67] Manifesto for Agile Software Development. http://www.agilemanifesto.org/. 1
[68] Microsoft Visual Studio Official Website. http://msdn.microsoft.com/vstudio/. 2
[69] An Online Refactoring Catalog. http://www.refactoring.com/catalog/. 12, 138
[70] Refactoring Bugs in Eclipse, IntelliJ IDEA and Visual Studio.
http://progtools.comlab.ox.ac.uk/projects/refactoring/bugreports. 13
[71] Refactoring Website. http://www.refactoring.com/. 12
[72] Yahoo Refactoring Group Mailing List. refactoring@yahoogroups.com. 12