Figure 2 - uploaded by Cliff Click
1 Defining a set using f() = 1 and g(x) = x + 2
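Reading the caption as the usual inductive definition, i.e. the smallest set that contains f() = 1 and is closed under g(x) = x + 2 (the odd naturals), a minimal fixpoint sketch might look like the following; the function name and the bound are illustrative only.

```python
def generate_set(bound=10):
    """Least set containing f() = 1 and closed under g(x) = x + 2, up to bound."""
    s = set()
    changed = True
    while changed:
        changed = False
        # apply both rules to everything found so far
        candidates = {1} | {x + 2 for x in s if x + 2 <= bound}
        if not candidates <= s:
            s |= candidates
            changed = True
    return s

print(sorted(generate_set(10)))  # [1, 3, 5, 7, 9]
```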

1 Defining a set using f() = 1 and g(x) = x + 2

Source publication
Article
Full-text available
This paper presents a framework for describing optimizations. It shows how to combine two such frameworks and how to reason about the properties of the resulting framework. The structure of the framework provides insight into when a combination yields better results. To make the ideas more concrete, this paper presents a framework for combining con...

Citations

... The order in which CF and DCE are performed influences the final result of the compilation (see Figure 2). This phase-ordering problem is well known in the literature, and a practical solution is to simply perform a fixpoint iteration of the optimisation pipeline [Click and Cooper 1995]. Compiler engineers typically try to find an order of optimisations that yields well-optimised programs for either code size [Cooper et al. 1999] or performance [Kulkarni et al. 2006]. ...
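The fixpoint iteration mentioned in that excerpt can be sketched as follows; the toy passes and the list-of-tuples program representation are assumptions for illustration, not any real compiler's IR.

```python
def constant_fold(prog):
    # fold ("add", c1, c2) when both operands are literal constants
    out = []
    for ins in prog:
        if ins[0] == "add" and isinstance(ins[1], int) and isinstance(ins[2], int):
            out.append(("const", ins[1] + ins[2]))
        else:
            out.append(ins)
    return out

def dead_code_elim(prog):
    # toy DCE: drop instructions with no effect
    return [ins for ins in prog if ins[0] != "nop"]

def fixpoint(prog, passes):
    # rerun the whole pipeline until no pass changes the program
    while True:
        new = prog
        for p in passes:
            new = p(new)
        if new == prog:
            return prog
        prog = new

print(fixpoint([("add", 1, 2), ("nop",)], [constant_fold, dead_code_elim]))
# [('const', 3)]
```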
Preprint
To ensure that secure applications do not leak their secrets, they are required to uphold several security properties such as spatial and temporal memory safety as well as cryptographic constant time. Existing work shows how to enforce these properties individually, in an architecture-independent way, by using secure compiler passes that each focus on an individual property. Unfortunately, given two secure compiler passes that each preserve a possibly different security property, it is unclear what kind of security property is preserved by the composition of those secure compiler passes. This paper is the first to study what security properties are preserved across the composition of different secure compiler passes. Starting from a general theory of property composition for security-relevant properties (such as the aforementioned ones), this paper formalises a theory of composition of secure compilers. Then, it showcases this theory on a secure multi-pass compiler that preserves the aforementioned security-relevant properties. Crucially, this paper derives the security of the multi-pass compiler from the composition of the security properties preserved by its individual passes, which include security-preserving as well as optimisation passes. From an engineering perspective, this is the desirable approach to building secure compilers.
... If a code transformation can be based on the results of an analysis, and the results of an analysis can be made more precise by a previous code transformation, both are usually clearly separated and done in separate passes. This is unfortunate as it is known that performing analyses in a sequence of passes yields less precise results than combining them in a single pass [Click and Cooper 1995; Cousot and Cousot 1979]. ...
... Another application is formal verification. In general, combining analyses yields an analysis which is more precise than performing different analyses in a sequence of passes [Click and Cooper 1995; Cousot and Cousot 1979]. Viewing SSA as an abstract domain is thus interesting when analyzing programs where the SSA translation can be improved by a static analysis, and the static analysis can itself be improved by more precise SSA translation. ...
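A toy sketch of why a combined analysis can beat a sequence of passes, in the spirit of the classic conditional-constant-propagation example (this is an illustration under simplified assumptions, not the construction from the cited papers). Consider a loop in which x starts at 1 and is set to 2 only on a branch guarded by x != 1: an optimistic analysis that tracks constants and reachability together proves x == 1, while merging both arms without consulting reachability loses the fact.

```python
TOP = "unknown"  # lattice top: not a known constant

def join(a, b):
    return a if a == b else TOP

def optimistic():
    # Combined analysis: start from the best assumption (x == 1) and
    # only weaken when forced. The assignment x = 2 is guarded by
    # x != 1, which is unreachable while x is believed to be 1.
    x = 1
    while True:
        branch_dead = (x == 1)              # condition decidable => arm pruned
        new_x = x if branch_dead else join(x, 2)
        if new_x == x:
            return x
        x = new_x

def pessimistic():
    # Sequential analysis: constants are merged at the loop head without
    # reachability information, so both arms contribute.
    return join(1, 2)

print(optimistic())   # 1
print(pessimistic())  # unknown
```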
Article
Full-text available
Static single assignment (SSA) form is a popular intermediate representation that helps implement useful static analyses, including global value numbering (GVN), sparse dataflow analyses, or SMT-based abstract interpretation or model checking. However, the precision of the SSA translation itself depends on static analyses, and a priori static analysis is even indispensable in the case of low-level input languages like machine code. To solve this chicken-and-egg problem, we propose to turn the SSA translation into a standard static analysis based on abstract interpretation. This allows the SSA translation to be combined with other static analyses in a single pass, taking advantage of the fact that it is more precise to combine analyses than applying passes in sequence. We illustrate the practicality of these results by writing a simple dataflow analysis that performs SSA translation, optimistic global value numbering, sparse conditional constant propagation, and loop-invariant code motion in a single small pass; and by presenting a multi-language static analyzer for both C and machine code that uses the SSA abstract domain as its main intermediate representation.
... The dispatch graph is the high-level intermediate representation (IR) generated from a dispatch plan. The IR is inspired by the sea-of-nodes notation [7] and is subsequently optimized by the JVM. Each white box represents a function, while red boxes represent special nodes such as the beginning of a basic block or return instructions. ...
... The GraalVM IR is a sea of nodes data structure [3,4,6] that combines the control-flow graph and the data-flow expression graphs into a single graph structure, with hundreds of different kinds of nodes. The semantics of expressions has to correctly handle all the different data types, such as fixed-width integers of 1, 8, 16, 32, and 64 bits (signed and unsigned in some cases), to accurately implement the semantics of all the different languages supported by GraalVM. ...
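The structure described in those excerpts can be sketched as a single graph whose nodes carry both data inputs and control predecessors. The class shapes and node names below are assumptions for illustration; GraalVM's actual node hierarchy is far richer.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str                                       # e.g. "Start", "Const", "Add", "Return"
    inputs: list = field(default_factory=list)    # data-dependency edges (or a literal)
    control: object = None                        # control-predecessor edge, if any

def evaluate(node):
    """Tiny evaluator walking data edges only."""
    if node.op == "Const":
        return node.inputs[0]
    if node.op == "Add":
        return evaluate(node.inputs[0]) + evaluate(node.inputs[1])
    if node.op == "Return":
        return evaluate(node.inputs[0])
    raise ValueError(f"unhandled op: {node.op}")

# 'return 2 + 3' as one graph mixing control and data edges
start = Node("Start")
c2 = Node("Const", inputs=[2])
c3 = Node("Const", inputs=[3])
add = Node("Add", inputs=[c2, c3])
ret = Node("Return", inputs=[add], control=start)

print(evaluate(ret))  # 5
```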
Preprint
Full-text available
We want to verify the correctness of optimization phases in the GraalVM compiler, which consist of many thousands of lines of complex Java code performing sophisticated graph transformations. We have built high-level models of the data structures and operations of the code using the Isabelle/HOL theorem prover, and can formally verify the correctness of those high-level operations. But the remaining challenge is: how can we be sure that those high-level operations accurately reflect what the Java is doing? This paper addresses that issue by applying several different kinds of differential testing to validate that the formal model and the Java code have the same semantics. Many of these validation techniques should be applicable to other projects that are building formal models of real-world code.
... First, the most natural area comprises compilers, which have to parse the source code in order to compile it. For this purpose, a compiler first creates an internal program representation of the source code, performs optimizations, and finally transforms the (optimized) program representation into binary executable code [131]. In the first step, the compiler parses the source code and creates an Abstract Syntax Tree (AST) [130]. ...
Chapter
In the last decades, various concepts have been developed to support the development and maintenance of secure software systems. On the level of programming languages, concepts like Object-Orientation (OO) have been introduced to improve the structuring and reuse in programs. Those concepts have also been reflected in modeling languages like the Unified Modeling Language (UML). On both, various kinds of security and design checks have been introduced to support developers in developing secure software systems. Also, different development processes have been proposed to structure the development and make it projectable. Besides, additional concepts for giving early and constant feedback to developers have been developed to follow these processes successfully. At this point, the most prominent one is continuous integration. While there is an overlap between all of these concepts, these are only partly integrated. We give a short introduction to the enumerated concepts focusing on how the concepts contribute to the development of a secure software system and what are yet unsolved problems.
Chapter
Considering the integration of the individual contributions of this thesis as a holistic framework is essential for judging the feasibility and usability of the GRaViTY framework for the development of secure software systems. Therefore, we evaluate in two case studies whether the GRaViTY framework is suitable to support the development of secure software systems as intended. In this regard, we identified two objectives we focus on. First, we investigate whether the technical integration of GRaViTY allows an application of the GRaViTY approach throughout software development processes. Second, we focus on the perspective of developers and security experts working with GRaViTY. Here, we are interested in the practical usability of GRaViTY when applied to software development. Thereby, we focus more on usability as part of software development than on detailed usability in terms of software ergonomics, e.g., regarding the realized user interface. In the end, we investigate if GRaViTY can be applied to model-driven development. Altogether, we successfully applied GRaViTY as part of the two case studies.
... It induces non-trivial phase ordering issues: loop fusion to enhance temporal locality may alter the ability to recognize an efficient BLAS-2 or BLAS-3 implementation in a numerical library. Workarounds introduce constraints on the compiler that interfere with other decisions and passes, which is a longstanding and well-known problem in the compiler community [11]. This work seeks to alleviate the issue by designing higher-level IR components that are more conducive to transformations. ...
... This allows mixing transformations, canonicalizations, constant folding and other enabling rewrites in a single transformation. The result is a system where pass fusion [11] is simple to achieve and alleviates phase ordering issues. Indeed, our structured and retargetable code generation approach is deliberate about extending the notion of passes with more flexible and controlled application of rewrite rules. ...
... In such a case, we were not able to compile to the desired vblendps operation. Instead, we originally had to settle for a pure shuffle-based implementation. ...
Preprint
Full-text available
Despite significant investment in software infrastructure, machine learning systems, runtimes and compilers do not compose properly. We propose a new design aiming at providing unprecedented degrees of modularity, composability and genericity. This paper discusses a structured approach to the construction of domain-specific code generators for tensor compilers, with the stated goal of improving the productivity of both compiler engineers and end-users. The approach leverages the natural structure of tensor algebra. It has been the main driver for the design of progressive lowering paths in MLIR. The proposed abstractions and transformations span data structures and control flow with both functional (SSA form) and imperative (side-effecting) semantics. We discuss the implications of this infrastructure on compiler construction and present preliminary experimental results.
Book
For ensuring a software system's security, it is vital to keep up with changing security precautions, attacks, and mitigations. Although model-based development enables addressing security already at design-time, design models are often inconsistent with the implementation or among themselves. An additional burden are variants of software systems. To ensure security in this context, we present an approach based on continuous automated change propagation, allowing security experts to specify security requirements on the most suitable system representation. We automatically check all system representations against these requirements and provide security-preserving refactorings for preserving security compliance. For both, we show the application to variant-rich software systems. To support legacy systems, we allow to reverse-engineer variability-aware UML models and semi-automatically map existing design models to the implementation. Besides evaluations of the individual contributions, we demonstrate the approach in two open-source case studies, the iTrust electronics health records system and the Eclipse Secure Storage.
... The main contribution of this paper is to devise a formal semantics of the GraalVM IR in Isabelle/HOL [13]. The IR combines control flow and data flow into a single 'sea-of-nodes' graph structure [3], rather than a more conventional control-flow graph with basic blocks representing sequential flow. Sect. 2 gives further details of the GraalVM Compiler. ...
... We have described an Isabelle model and execution semantics for the sophisticated sea-of-nodes graph structure [3] that is used as the internal representation in the GraalVM optimizing compiler [5]. Additionally, we have proved several suites of local optimizations correct according to the semantics. ...
Preprint
Full-text available
The optimization phase of a compiler is responsible for transforming an intermediate representation (IR) of a program into a more efficient form. Modern optimizers, such as that used in the GraalVM compiler, use an IR consisting of a sophisticated graph data structure that combines data flow and control flow into a single structure. As part of a wider project on the verification of optimization passes of GraalVM, this paper describes a semantics for its IR within Isabelle/HOL. The semantics consists of a big-step operational semantics for data nodes (which are represented in a graph-based static single assignment (SSA) form) and a small-step operational semantics for handling control flow including heap-based reads and writes, exceptions, and method calls. We have proved a suite of canonicalization optimizations and conditional elimination optimizations with respect to the semantics.
... It was shown early on that combining optimization passes allows the compiler to discover more facts about the program. One of the first illustrations of the benefits of combining passes was to mix constant propagation, value numbering and unreachable code elimination [10]. ...
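The value-numbering ingredient of that combination can be sketched in a few lines: structurally identical pure expressions receive the same number, so a recomputed expression is recognised as redundant. The function name and tuple encoding are illustrative assumptions, not the cited algorithm.

```python
def value_number(exprs):
    """Assign a value number to each (op, *operands) tuple; equal tuples share a number."""
    table, numbers = {}, []
    for op, *args in exprs:
        key = (op, *args)
        if key not in table:
            table[key] = len(table)   # fresh number for a new expression shape
        numbers.append(table[key])
    return numbers

print(value_number([("add", "a", "b"), ("mul", "a", "b"), ("add", "a", "b")]))
# [0, 1, 0] -- the second "a + b" is the same value as the first
```

In the combined setting described above, the lookup key would also fold in constant facts and reachability, which is what lets the single-pass version discover more redundancies than any sequential ordering of the three passes.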