Efficient Analysis of Cyclic Redundancy Architectures via Boolean Fault Propagation

Authors: Marco Bozzano, Alessandro Cimatti, Alberto Griggio, and Martin Jonáš

Abstract

Many safety-critical systems guarantee fault tolerance by using several redundant copies of their components. When designing such redundancy architectures, it is crucial to analyze their fault trees, which describe combinations of faults of individual components that may cause malfunction of the system. State-of-the-art techniques for fault tree computation use first-order formulas with uninterpreted functions to model the transformations of signals performed by the redundancy system, and an AllSMT query for computation of the fault tree from this encoding. Scalability of the analysis can be further improved by techniques such as predicate abstraction, which reduces the problem to the Boolean case. In this paper, we show that as far as fault trees of redundancy architectures are concerned, signal transformation can be equivalently viewed in a purely Boolean way as fault propagation. This alternative view has important practical consequences. First, it applies also to general redundancy architectures with cyclic dependencies among components, to which the current state-of-the-art methods based on AllSMT are not applicable, and which currently require expensive sequential reasoning. Second, it allows for a simpler encoding of the problem and usage of efficient algorithms for analysis of fault propagation, which can significantly improve the runtime of the analyses. A thorough experimental evaluation demonstrates the superiority of the proposed techniques.
28th International Conference, TACAS 2022
Held as Part of the European Joint Conferences
on Theory and Practice of Software, ETAPS 2022
Munich, Germany, April 2–7, 2022
Proceedings, Part II
Tools and Algorithms
for the Construction
and Analysis of Systems
LNCS 13244 ARCoSS
DanaFisman
GrigoreRosu (Eds.)
Lecture Notes in Computer Science 13244
Founding Editors
Gerhard Goos, Germany
Juris Hartmanis, USA
Editorial Board Members
Elisa Bertino, USA
Wen Gao, China
Bernhard Steffen, Germany
Gerhard Woeginger, Germany
Moti Yung, USA
Advanced Research in Computing and Software Science
Subline of Lecture Notes in Computer Science
Subline Series Editors
Giorgio Ausiello, University of Rome La Sapienza, Italy
Vladimiro Sassone, University of Southampton, UK
Subline Advisory Board
Susanne Albers, TU Munich, Germany
Benjamin C. Pierce, University of Pennsylvania, USA
Bernhard Steffen, University of Dortmund, Germany
Deng Xiaotie, Peking University, Beijing, China
Jeannette M. Wing, Microsoft Research, Redmond, WA, USA
More information about this series at https://link.springer.com/bookseries/558
Dana Fisman Grigore Rosu (Eds.)
Tools and Algorithms
for the Construction
and Analysis of Systems
28th International Conference, TACAS 2022
Held as Part of the European Joint Conferences
on Theory and Practice of Software, ETAPS 2022
Munich, Germany, April 2–7, 2022
Proceedings, Part II
Editors
Dana Fisman
Ben-Gurion University of the Negev
Beer Sheva, Israel
Grigore Rosu
University of Illinois Urbana-Champaign
Urbana, IL, USA
ISSN 0302-9743 ISSN 1611-3349 (electronic)
Lecture Notes in Computer Science
ISBN 978-3-030-99526-3 ISBN 978-3-030-99527-0 (eBook)
https://doi.org/10.1007/978-3-030-99527-0
© The Editor(s) (if applicable) and The Author(s) 2022. This book is an open access publication.
Open Access This book is licensed under the terms of the Creative Commons Attribution 4.0 International
License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution
and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and
the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this book are included in the book's Creative Commons license,
unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative
Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use,
you will need to obtain permission directly from the copyright holder.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are
believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors
give a warranty, expressed or implied, with respect to the material contained herein or for any errors or
omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
ETAPS Foreword
Welcome to the 25th ETAPS! ETAPS 2022 took place in Munich, the beautiful capital
of Bavaria, in Germany.
ETAPS 2022 is the 25th instance of the European Joint Conferences on Theory and
Practice of Software. ETAPS is an annual federated conference established in 1998,
and consists of four conferences: ESOP, FASE, FoSSaCS, and TACAS. Each
conference has its own Program Committee (PC) and its own Steering Committee
(SC). The conferences cover various aspects of software systems, ranging from theo-
retical computer science to foundations of programming languages, analysis tools, and
formal approaches to software engineering. Organizing these conferences in a coherent,
highly synchronized conference program enables researchers to participate in an
exciting event, having the possibility to meet many colleagues working in different
directions in the field, and to easily attend talks of different conferences. On the
weekend before the main conference, numerous satellite workshops took place,
attracting many researchers from all over the globe.
ETAPS 2022 received 362 submissions in total, 111 of which were accepted,
yielding an overall acceptance rate of 30.7%. I thank all the authors for their interest in
ETAPS, all the reviewers for their reviewing efforts, the PC members for their con-
tributions, and in particular the PC (co-)chairs for their hard work in running this entire
intensive process. Last but not least, my congratulations to all authors of the accepted
papers!
ETAPS 2022 featured the unifying invited speakers Alexandra Silva (University
College London, UK, and Cornell University, USA) and Tomáš Vojnar (Brno
University of Technology, Czech Republic) and the conference-specific invited
speakers Nathalie Bertrand (Inria Rennes, France) for FoSSaCS and Lenore Zuck
(University of Illinois at Chicago, USA) for TACAS. Invited tutorials were provided by
Stacey Jeffery (CWI and QuSoft, The Netherlands) on quantum computing and
Nicholas Lane (University of Cambridge and Samsung AI Lab, UK) on federated
learning.
As this event was the 25th edition of ETAPS, part of the program was a special
celebration where we looked back on the achievements of ETAPS and its constituting
conferences in the past, but we also looked into the future, and discussed the challenges
ahead for research in software science. This edition also reinstated the ETAPS men-
toring workshop for PhD students.
ETAPS 2022 took place in Munich, Germany, and was organized jointly by the
Technical University of Munich (TUM) and the LMU Munich. The former was
founded in 1868, and the latter in 1472 as the 6th oldest German university still running
today. Together, they have 100,000 enrolled students, regularly rank among the top
100 universities worldwide (with TUM's computer-science department ranked #1 in
the European Union), and their researchers and alumni include 60 Nobel laureates.
The local organization team consisted of Jan Křetínský (general chair), Dirk Beyer
(general, financial, and workshop chair), Julia Eisentraut (organization chair), and
Alexandros Evangelidis (local proceedings chair).
ETAPS 2022 was further supported by the following associations and societies:
ETAPS e.V., EATCS (European Association for Theoretical Computer Science),
EAPLS (European Association for Programming Languages and Systems), and EASST
(European Association of Software Science and Technology).
The ETAPS Steering Committee consists of an Executive Board, and representa-
tives of the individual ETAPS conferences, as well as representatives of EATCS,
EAPLS, and EASST. The Executive Board consists of Holger Hermanns
(Saarbrücken), Marieke Huisman (Twente, chair), Jan Kofroň (Prague), Barbara König
(Duisburg), Thomas Noll (Aachen), Caterina Urban (Paris), Tarmo Uustalu (Reykjavik
and Tallinn), and Lenore Zuck (Chicago).
Other members of the Steering Committee are Patricia Bouyer (Paris), Einar Broch
Johnsen (Oslo), Dana Fisman (Beer Sheva), Reiko Heckel (Leicester), Joost-Pieter
Katoen (Aachen and Twente), Fabrice Kordon (Paris), Jan Křetínský (Munich), Orna
Kupferman (Jerusalem), Leen Lambers (Cottbus), Tiziana Margaria (Limerick),
Andrew M. Pitts (Cambridge), Elizabeth Polgreen (Edinburgh), Grigore Roşu (Illinois),
Peter Ryan (Luxembourg), Sriram Sankaranarayanan (Boulder), Don Sannella
(Edinburgh), Lutz Schröder (Erlangen), Ilya Sergey (Singapore), Natasha Sharygina
(Lugano), Pawel Sobocinski (Tallinn), Peter Thiemann (Freiburg), Sebastián Uchitel
(London and Buenos Aires), Jan Vitek (Prague), Andrzej Wasowski (Copenhagen),
Thomas Wies (New York), Anton Wijs (Eindhoven), and Manuel Wimmer (Linz).
I'd like to take this opportunity to thank all authors, attendees, organizers of the
satellite workshops, and Springer-Verlag GmbH for their support. I hope you all
enjoyed ETAPS 2022.
Finally, a big thanks to Jan, Julia, Dirk, and their local organization team for all their
enormous efforts to make ETAPS a fantastic event.
February 2022 Marieke Huisman
ETAPS SC Chair
ETAPS e.V. President
Preface
TACAS 2022 was the 28th edition of the International Conference on Tools and
Algorithms for the Construction and Analysis of Systems. TACAS 2022 was part of the
25th European Joint Conferences on Theory and Practice of Software (ETAPS 2022),
which was held from April 2 to April 7 in Munich, Germany, as well as online due to the
COVID-19 pandemic. TACAS is a forum for researchers, developers, and users inter-
ested in rigorous tools and algorithms for the construction and analysis of systems. The
conference aims to bridge the gaps between different communities with this common
interest and to support them in their quest to improve the utility, reliability, flexibility,
and efficiency of tools and algorithms for building computer-controlled systems.
There were four submission categories for TACAS 2022:
1. Research papers advancing the theoretical foundations for the construction and
analysis of systems.
2. Case study papers with an emphasis on a real-world setting.
3. Regular tool papers presenting a new tool, a new tool component, or novel
extensions to an existing tool.
4. Tool demonstration papers focusing on the usage aspects of tools.
Papers of categories 1–3 were restricted to 16 pages, and papers of category 4 to six
pages.
This year 159 papers were submitted to TACAS, consisting of 112 research papers,
five case study papers, 33 regular tool papers, and nine tool demo papers. Authors were
allowed to submit up to four papers. Each paper was reviewed by three Program
Committee (PC) members, who made use of subreviewers. Similarly to previous years,
it was possible to submit an artifact alongside a paper, which was mandatory for regular
tool and tool demo papers.
An artifact might consist of a tool, models, proofs, or other data required for vali-
dation of the results of the paper. The Artifact Evaluation Committee (AEC) was tasked
with reviewing the artifacts based on their documentation, ease of use, and, most
importantly, whether the results presented in the corresponding paper could be accu-
rately reproduced. Most of the evaluation was carried out using a standardized virtual
machine to ensure consistency of the results, except for those artifacts that had special
hardware or software requirements. The evaluation consisted of two rounds. The first
round was carried out in parallel with the work of the PC. The judgment of the AEC
was communicated to the PC and weighed in their discussion. The second round took
place after paper acceptance notifications were sent out; authors of accepted research
papers who did not submit an artifact in the first round could submit their artifact at this
time. In total, 86 artifacts were submitted (79 in the first round and seven in the second)
and evaluated by the AEC regarding their availability, functionality, and/or reusability.
Papers with an artifact that was successfully evaluated include one or more badges on
the first page, certifying the respective properties.
Selected authors were requested to provide a rebuttal for both papers and artifacts in
case a review gave rise to questions. Using the review reports and rebuttals, the
Program and the Artifact Evaluation Committees extensively discussed the papers and
artifacts and ultimately decided to accept 33 research papers, one case study, 12 tool
papers, and four tool demos.
This corresponds to an acceptance rate of 29.46% for research papers and an overall
acceptance rate of 31.44%.
Besides the regular conference papers, this two-volume proceedings also contains
16 short papers that describe the participating verification systems and a competition
report presenting the results of the 11th SV-COMP, the competition on automatic
software verifiers for C and Java programs. These papers were reviewed by a separate
Program Committee (PC); each of the papers was assessed by at least three reviewers.
A total of 47 verification systems with developers from 11 countries entered the sys-
tematic comparative evaluation, including four submissions from industry. Two ses-
sions in the TACAS program were reserved for the presentation of the results: (1) a
summary by the competition chair and of the participating tools by the developer teams
in the first session, and (2) an open community meeting in the second session.
We would like to thank all the people who helped to make TACAS 2022 successful.
First, we would like to thank the authors for submitting their papers to TACAS 2022.
The PC members and additional reviewers did a great job in reviewing papers: they
contributed informed and detailed reports and engaged in the PC discussions. We also
thank the steering committee, and especially its chair, Joost-Pieter Katoen, for his
valuable advice. Lastly, we would like to thank the overall organization team of
ETAPS 2022.
April 2022 Dana Fisman
Grigore Rosu
PC Chairs
Swen Jacobs
Andrew Reynolds
AEC Chairs, Tools, and Case-study Chairs
Dirk Beyer
Competition Chair
Organization
Program Committee
Parosh Aziz Abdulla Uppsala University, Sweden
Luca Aceto Reykjavik University, Iceland
Timos Antonopoulos Yale University, USA
Saddek Bensalem Verimag, France
Dirk Beyer LMU Munich, Germany
Nikolaj Bjorner Microsoft, USA
Jasmin Blanchette Vrije Universiteit Amsterdam, The Netherlands
Udi Boker Interdisciplinary Center Herzliya, Israel
Hana Chockler King's College London, UK
Rance Cleaveland University of Maryland, USA
Alessandro Coglio Kestrel Institute, USA
Pedro R. D'Argenio Universidad Nacional de Córdoba, Argentina
Javier Esparza Technical University of Munich, Germany
Bernd Finkbeiner CISPA Helmholtz Center for Information Security,
Germany
Dana Fisman (Chair) Ben-Gurion University, Israel
Martin Fränzle University of Oldenburg, Germany
Felipe Gorostiaga IMDEA Software Institute, Spain
Susanne Graf Université Joseph Fourier, France
Radu Grosu Stony Brook University, USA
Arie Gurfinkel University of Waterloo, Canada
Klaus Havelund Jet Propulsion Laboratory, USA
Holger Hermanns Saarland University, Germany
Falk Howar TU Clausthal / IPSSE, Germany
Swen Jacobs CISPA Helmholtz Center for Information Security,
Germany
Ranjit Jhala University of California, San Diego, USA
Jan Kretinsky Technical University of Munich, Germany
Viktor Kuncak Ecole Polytechnique Fédérale de Lausanne,
Switzerland
Kim Larsen Aalborg University, Denmark
Konstantinos Mamouras Rice University, USA
Daniel Neider Max Planck Institute for Software Systems, Germany
Dejan Nickovic AIT Austrian Institute of Technology, Austria
Corina Pasareanu Carnegie Mellon University, NASA, and KBR, USA
Doron Peled Bar Ilan University, Israel
Anna Philippou University of Cyprus, Cyprus
Andrew Reynolds University of Iowa, USA
Grigore Rosu (Chair) University of Illinois at Urbana-Champaign, USA
Kristin Yvonne Rozier Iowa State University, USA
Cesar Sanchez IMDEA Software Institute, Spain
Sven Schewe University of Liverpool, UK
Natasha Sharygina Università della Svizzera italiana, Switzerland
Jan Strejček Masaryk University, Czech Republic
Cesare Tinelli University of Iowa, USA
Stavros Tripakis Northeastern University, USA
Frits Vaandrager Radboud University, The Netherlands
Tomas Vojnar Brno University of Technology, Czech Republic
Christoph M. Wintersteiger Microsoft, USA
Lijun Zhang Institute of Software, Chinese Academy of Sciences,
China
Lingming Zhang University of Illinois at Urbana-Champaign, USA
Lenore Zuck University of Illinois at Chicago, USA
Artifact Evaluation Committee
Pavel Andrianov Ivannikov Institute for System Programming
of the RAS, Russia
Michael Backenköhler Saarland University, Germany
Sebastian Biewer Saarland University, Germany
Benjamin Bisping TU Berlin, Germany
Olav Bunte Eindhoven University of Technology, The Netherlands
Damien Busatto-Gaston Université Libre de Bruxelles, Belgium
Marek Chalupa IST Austria, Austria, and Masaryk University,
Czech Republic
Priyanka Darke Tata Consultancy Services, India
Alexandre Duret-Lutz LRDE, France
Shenghua Feng Institute of Software, Chinese Academy of Sciences,
Beijing, China
Mathias Fleury University of Freiburg, Germany
Kush Grover Technical University of Munich, Germany
Dominik Harmim Brno University of Technology, Czech Republic
Swen Jacobs (Chair) CISPA Helmholtz Center for Information Security,
Germany
Xiangyu Jin Institute of Software, Chinese Academy of Sciences
Juraj Sič Masaryk University, Czech Republic
Daniela Kaufmann Johannes Kepler University Linz, Austria
Maximilian Alexander Köhl Saarland University, Germany
Mitja Kulczynski Kiel University, Germany
Maurice Laveaux Eindhoven University of Technology, The Netherlands
Yong Li Institute of Software, Chinese Academy of Sciences,
China
Debasmita Lohar Max Planck Institute for Software Systems, Germany
Makai Mann Stanford University, USA
Fabian Meyer RWTH Aachen University, Germany
Stefanie Mohr Technical University of Munich, Germany
Malte Mues TU Dortmund, Germany
Yuki Nishida Kyoto University, Japan
Philip Offtermatt Université de Sherbrooke, Canada
Muhammad Osama Eindhoven University of Technology, The Netherlands
Jiří Pavela Brno University of Technology, Czech Republic
Adrien Pommellet LRDE, France
Mathias Preiner Stanford University, USA
José Proença CISTER-ISEP and HASLab-INESC TEC, Portugal
Tim Quatmann RWTH Aachen University, Germany
Etienne Renault LRDE, France
Andrew Reynolds (Chair) University of Iowa, USA
Mouhammad Sakr University of Luxembourg, Luxembourg
Morten Konggaard Schou Aalborg University, Denmark
Philipp Schlehuber-Caissier LRDE, France
Hans-Jörg Schurr Inria Nancy - Grand Est, France
Michael Schwarz Technische Universität München, Germany
Joseph Scott University of Waterloo, Canada
Ali Shamakhi Tehran Institute for Advanced Studies, Iran
Lei Shi University of Pennsylvania, USA
Matthew Sotoudeh University of California, Davis, USA
Jip Spel RWTH Aachen University, Germany
Veronika Šoková Brno University of Technology, Czech Republic
Program Committee and Jury SV-COMP
Fatimah Aljaafari University of Manchester, UK
Lei Bu Nanjing University, China
Thomas Bunk LMU Munich, Germany
Marek Chalupa Masaryk University, Czech Republic
Priyanka Darke Tata Consultancy Services, India
Daniel Dietsch University of Freiburg, Germany
Gidon Ernst LMU Munich, Germany
Fei He Tsinghua University, China
Matthias Heizmann University of Freiburg, Germany
Jera Hensel RWTH Aachen University, Germany
Falk Howar TU Dortmund, Germany
Soha Hussein University of Minnesota, USA
Dominik Klumpp University of Freiburg, Germany
Henrich Lauko Masaryk University, Czech Republic
Will Leeson University of Virginia, USA
Xie Li Chinese Academy of Sciences, China
Viktor Malík Brno University of Technology, Czech Republic
Raveendra Kumar Medicherla Tata Consultancy Services, India
Rafael Sá Menezes University of Manchester, UK
Vince Molnár Budapest University of Technology and Economics,
Hungary
Hernán Ponce de León Bundeswehr University Munich, Germany
Cedric Richter University of Oldenburg, Germany
Simmo Saan University of Tartu, Estonia
Emerson Sales Gran Sasso Science Institute, Italy
Peter Schrammel University of Sussex and Diffblue, UK
Frank Schüssele University of Freiburg, Germany
Ryan Scott Galois, USA
Ali Shamakhi Tehran Institute for Advanced Studies, Iran
Martin Spiessl LMU Munich, Germany
Michael Tautschnig Queen Mary University of London, UK
Anton Vasilyev ISP RAS, Russia
Vesal Vojdani University of Tartu, Estonia
Steering Committee
Dirk Beyer Ludwig-Maximilians-Universität München, Germany
Rance Cleaveland University of Maryland, USA
Holger Hermanns Universität des Saarlandes, Germany
Joost-Pieter Katoen (Chair) RWTH Aachen University, Germany, and Universiteit
Twente, The Netherlands
Kim G. Larsen Aalborg University, Denmark
Bernhard Steffen Technische Universität Dortmund, Germany
Additional Reviewers
Abraham, Erika
Aguilar, Edgar
Akshay, S.
Asadi, Sepideh
Attard, Duncan
Avni, Guy
Azeem, Muqsit
Bacci, Giorgio
Balasubramanian, A. R.
Barbanera, Franco
Bard, Joachim
Basset, Nicolas
Bendík, Jaroslav
Berani Abdelwahab, Erzana
Beutner, Raven
Bhandary, Shrajan
Biewer, Sebastian
Blicha, Martin
Brandstätter, Andreas
Bright, Curtis
Britikov, Konstantin
Brunnbauer, Axel
Capretto, Margarita
Castiglioni, Valentina
Castro, Pablo
Ceska, Milan
Chadha, Rohit
Chalupa, Marek
Changshun, Wu
Chen, Xiaohong
Cruciani, Emilio
Dahmen, Sander
Dang, Thao
Danielsson, Luis Miguel
Degiovanni, Renzo
Dell'Erba, Daniele
Demasi, Ramiro
Desharnais, Martin
Dierl, Simon
Dubslaff, Clemens
Egolf, Derek
Evangelidis, Alexandros
Fedyukovich, Grigory
Fiedor, Jan
Fitzpatrick, Stephen
Fleury, Mathias
Frenkel, Hadar
Gamboa Guzman, Laura P.
Garcia-Contreras, Isabel
Gianola, Alessandro
Goorden, Martijn
Gorostiaga, Felipe
Gorrieri, Roberto
Grahn, Samuel
Grastien, Alban
Grover, Kush
Grünbacher, Sophie
Guha, Shibashis
Gutiérrez Brida, Simón Emmanuel
Havlena, Vojtěch
He, Jie
Helfrich, Martin
Henkel, Elisabeth
Hicks, Michael
Hirschkoff, Daniel
Hofmann, Jana
Hojjat, Hossein
Holík, Lukáš
Hospodár, Michal
Huang, Chao
Hyvärinen, Antti
Inverso, Omar
Itzhaky, Shachar
Jaksic, Stefan
Jansen, David N.
Jin, Xiangyu
Jonas, Martin
Kanav, Sudeep
Karra, Shyam Lal
Katsaros, Panagiotis
Kempa, Brian
Klauck, Michaela
Kreitz, Christoph
Kröger, Paul
Köhl, Maximilian Alexander
König, Barbara
Lahijanian, Morteza
Larraz, Daniel
Le, Nham
Lemberger, Thomas
Lengal, Ondrej
Li, Chunxiao
Li, Jianlin
Lorber, Florian
Lung, David
Luppen, Zachary
Lybech, Stian
Major, Juraj
Manganini, Giorgio
McCarthy, Eric
Mediouni, Braham Lotfi
Meggendorfer, Tobias
Meira-Goes, Romulo
Melcer, Daniel
Metzger, Niklas
Milovancevic, Dragana
Mohr, Stefanie
Najib, Muhammad
Noetzli, Andres
Nouri, Ayoub
Offtermatt, Philip
Otoni, Rodrigo
Paoletti, Nicola
Parizek, Pavel
Parker, Dave
Parys, Paweł
Passing, Noemi
Perez Dominguez, Ivan
Perez, Guillermo
Pinna, G. Michele
Pous, Damien
Priya, Siddharth
Putruele, Luciano
Pérez, Jorge A.
Qu, Meixun
Raskin, Mikhail
Rauh, Andreas
Reger, Giles
Reynouard, Raphaël
Riener, Heinz
Rogalewicz, Adam
Roy, Rajarshi
Ruemmer, Philipp
Ruijters, Enno
Schilling, Christian
Schmitt, Frederik
Schneider, Tibor
Scholl, Christoph
Schultz, William
Schupp, Stefan
Schurr, Hans-Jörg
Schwammberger, Maike
Shafiei, Nastaran
Siber, Julian
Sickert, Salomon
Singh, Gagandeep
Smith, Douglas
Somenzi, Fabio
Stewing, Richard
Stock, Gregory
Su, Yusen
Tang, Qiyi
Tibo, Alessandro
Trefler, Richard
Trtík, Marek
Turrini, Andrea
Vaezipoor, Pashootan
van Dijk, Tom
Vašíček, Ondřej
Vediramana Krishnan, Hari Govind
Wang, Wenxi
Wendler, Philipp
Westfold, Stephen
Winter, Stefan
Wolovick, Nicolás
Yakusheva, Sophia
Yang, Pengfei
Zeljić, Aleksandar
Zhou, Yuhao
Zimmermann, Martin
Contents – Part II

Probabilistic Systems

A Probabilistic Logic for Verifying Continuous-time Markov Chains
Ji Guan and Nengkun Yu

Under-Approximating Expected Total Rewards in POMDPs
Alexander Bork, Joost-Pieter Katoen, and Tim Quatmann

Correct Probabilistic Model Checking with Floating-Point Arithmetic
Arnd Hartmanns

Correlated Equilibria and Fairness in Concurrent Stochastic Games
Marta Kwiatkowska, Gethin Norman, David Parker, and Gabriel Santos

Omega Automata

A Direct Symbolic Algorithm for Solving Stochastic Rabin Games
Tamajit Banerjee, Rupak Majumdar, Kaushik Mallik, Anne-Kathrin Schmuck, and Sadegh Soudjani

Practical Applications of the Alternating Cycle Decomposition
Antonio Casares, Alexandre Duret-Lutz, Klara J. Meyer, Florian Renkin, and Salomon Sickert

Sky Is Not the Limit: Tighter Rank Bounds for Elevator Automata in Büchi Automata Complementation
Vojtěch Havlena, Ondřej Lengál, and Barbora Šmahlíková

On-The-Fly Solving for Symbolic Parity Games
Maurice Laveaux, Wieger Wesselink, and Tim A. C. Willemse

Equivalence Checking

Distributed Coalgebraic Partition Refinement
Fabian Birkmann, Hans-Peter Deifel, and Stefan Milius

From Bounded Checking to Verification of Equivalence via Symbolic Up-to Techniques
Vasileios Koutavas, Yu-Yang Lin, and Nikos Tzevelekos

Equivalence Checking for Orthocomplemented Bisemilattices in Log-Linear Time
Simon Guilloud and Viktor Kunčak

Monitoring and Analysis

A Theoretical Analysis of Random Regression Test Prioritization
Pu Yi, Hao Wang, Tao Xie, Darko Marinov, and Wing Lam

Verified First-Order Monitoring with Recursive Rules
Sheila Zingg, Srđan Krstić, Martin Raszyk, Joshua Schneider, and Dmitriy Traytel

Maximizing Branch Coverage with Constrained Horn Clauses
Ilia Zlatkin and Grigory Fedyukovich

Efficient Analysis of Cyclic Redundancy Architectures via Boolean Fault Propagation
Marco Bozzano, Alessandro Cimatti, Alberto Griggio, and Martin Jonáš

Tools | Optimizations, Repair and Explainability

Adiar: Binary Decision Diagrams in External Memory
Steffan Christ Sølvsten, Jaco van de Pol, Anna Blume Jakobsen, and Mathias Weller Berg Thomasen

Forest GUMP: A Tool for Explanation
Alnis Murtovi, Alexander Bainczyk, and Bernhard Steffen

ALPINIST: An Annotation-Aware GPU Program Optimizer
Ömer Şakar, Mohsen Safari, Marieke Huisman, and Anton Wijs

Automatic Repair for Network Programs
Lei Shi, Yuepeng Wang, Rajeev Alur, and Boon Thau Loo

11th Competition on Software Verification: SV-COMP 2022

Progress on Software Verification: SV-COMP 2022
Dirk Beyer

AProVE: Non-Termination Witnesses for C Programs (Competition Contribution)
Jera Hensel, Constantin Mensendiek, and Jürgen Giesl
BRICK: Path Enumeration Based Bounded Reachability Checking of C Program (Competition Contribution)
Lei Bu, Zhunyi Xie, Lecheng Lyu, Yichao Li, Xiao Guo, Jianhua Zhao, and Xuandong Li

A Prototype for Data Race Detection in CSeq 3 (Competition Contribution)
Alex Coto, Omar Inverso, Emerson Sales, and Emilio Tuosto

DARTAGNAN: SMT-based Violation Witness Validation (Competition Contribution)
Hernán Ponce-de-León, Thomas Haas, and Roland Meyer

Deagle: An SMT-based Verifier for Multi-threaded Programs (Competition Contribution)
Fei He, Zhihang Sun, and Hongyu Fan

The Static Analyzer Frama-C in SV-COMP (Competition Contribution)
Dirk Beyer and Martin Spiessl

GDART: An Ensemble of Tools for Dynamic Symbolic Execution on the Java Virtual Machine (Competition Contribution)
Malte Mues and Falk Howar

Graves-CPA: A Graph-Attention Verifier Selector (Competition Contribution)
Will Leeson and Matthew B. Dwyer

GWIT: A Witness Validator for Java based on GraalVM (Competition Contribution)
Falk Howar and Malte Mues

The Static Analyzer Infer in SV-COMP (Competition Contribution)
Matthias Kettl and Thomas Lemberger

LART: Compiled Abstract Execution (Competition Contribution)
Henrich Lauko and Petr Ročkai

SYMBIOTIC 9: String Analysis and Backward Symbolic Execution with Loop Folding (Competition Contribution)
Marek Chalupa, Vincent Mihalkovič, Anna Řechtáčková, Lukáš Zaoral, and Jan Strejček

SYMBIOTIC-WITCH: A KLEE-Based Violation Witness Checker (Competition Contribution)
Paulína Ayaziová, Marek Chalupa, and Jan Strejček

THETA: portfolio of CEGAR-based analyses with dynamic algorithm selection (Competition Contribution)
Zsófia Ádám, Levente Bajczi, Mihály Dobos-Kovács, Ákos Hajdu, and Vince Molnár

ULTIMATE GEMCUTTER and the Axes of Generalization (Competition Contribution)
Dominik Klumpp, Daniel Dietsch, Matthias Heizmann, Frank Schüssele, Marcel Ebbinghaus, Azadeh Farzan, and Andreas Podelski

Wit4Java: A Violation-Witness Validator for Java Verifiers (Competition Contribution)
Tong Wu, Peter Schrammel, and Lucas C. Cordeiro

Author Index
Contents – Part I

Synthesis

HOLL: Program Synthesis for Higher Order Logic Locking
Gourav Takhar, Ramesh Karri, Christian Pilato, and Subhajit Roy

The Complexity of LTL Rational Synthesis
Orna Kupferman and Noam Shenwald

Synthesis of Compact Strategies for Coordination Programs
Kedar S. Namjoshi and Nisarg Patel

ZDD Boolean Synthesis
Yi Lin, Lucas M. Tabajara, and Moshe Y. Vardi

Verification

Comparative Verification of the Digital Library of Mathematical Functions and Computer Algebra Systems
André Greiner-Petter, Howard S. Cohl, Abdou Youssef, Moritz Schubotz, Avi Trost, Rajen Dey, Akiko Aizawa, and Bela Gipp

Verifying Fortran Programs with CIVL
Wenhao Wu, Jan Hückelheim, Paul D. Hovland, and Stephen F. Siegel

NORMA: a tool for the analysis of Relay-based Railway Interlocking Systems
Arturo Amendola, Anna Becchi, Roberto Cavada, Alessandro Cimatti, Andrea Ferrando, Lorenzo Pilati, Giuseppe Scaglione, Alberto Tacchella, and Marco Zamboni

Efficient Neural Network Analysis with Sum-of-Infeasibilities
Haoze Wu, Aleksandar Zeljić, Guy Katz, and Clark Barrett

Blockchain

Formal Verification of the Ethereum 2.0 Beacon Chain
Franck Cassez, Joanne Fuller, and Aditya Asgaonkar

Fast and Reliable Formal Verification of Smart Contracts with the Move Prover
David Dill, Wolfgang Grieskamp, Junkil Park, Shaz Qadeer, Meng Xu, and Emma Zhong

A Max-SMT Superoptimizer for EVM handling Memory and Storage
Elvira Albert, Pablo Gordillo, Alejandro Hernández-Cerezo, and Albert Rubio

Grammatical Inference

A New Approach for Active Automata Learning Based on Apartness
Frits Vaandrager, Bharat Garhewal, Jurriaan Rot, and Thorsten Wißmann

Learning Realtime One-Counter Automata
Véronique Bruyère, Guillermo A. Pérez, and Gaëtan Staquet

Scalable Anytime Algorithms for Learning Fragments of Linear Temporal Logic
Ritam Raha, Rajarshi Roy, Nathanaël Fijalkow, and Daniel Neider

Learning Model Checking and the Kernel Trick for Signal Temporal Logic on Stochastic Processes
Luca Bortolussi, Giuseppe Maria Gallo, Jan Křetínský, and Laura Nenzi

Verification Inference

Inferring Interval-Valued Floating-Point Preconditions
Jonas Krämer, Lionel Blatter, Eva Darulova, and Mattias Ulbrich

NeuReach: Learning Reachability Functions from Simulations
Dawei Sun and Sayan Mitra

Inferring Invariants with Quantifier Alternations: Taming the Search Space Explosion
Jason R. Koenig, Oded Padon, Sharon Shoham, and Alex Aiken

LinSyn: Synthesizing Tight Linear Bounds for Arbitrary Neural Network Activation Functions
Brandon Paulsen and Chao Wang

Short papers

Kmclib: Automated Inference and Verification of Session Types from OCaml Programs
Keigo Imai, Julien Lange, and Rumyana Neykova

Automated Translation of Natural Language Requirements to Runtime Monitors
Ivan Perez, Anastasia Mavridou, Tom Pressburger, Alwyn Goodloe, and Dimitra Giannakopoulou
MaskD: A Tool for Measuring Masking Fault-Tolerance
Luciano Putruele, Ramiro Demasi, Pablo F. Castro, and Pedro R. D'Argenio

Better Counterexamples for Dafny
Aleksandar Chakarov, Aleksandr Fedchin, Zvonimir Rakamarić, and Neha Rungta

Constraint Solving

cvc5: A Versatile and Industrial-Strength SMT Solver
Haniel Barbosa, Clark Barrett, Martin Brain, Gereon Kremer, Hanna Lachnitt, Makai Mann, Abdalrhman Mohamed, Mudathir Mohamed, Aina Niemetz, Andres Nötzli, Alex Ozdemir, Mathias Preiner, Andrew Reynolds, Ying Sheng, Cesare Tinelli, and Yoni Zohar

Clausal Proofs for Pseudo-Boolean Reasoning
Randal E. Bryant, Armin Biere, and Marijn J. H. Heule

Moving Definition Variables in Quantified Boolean Formulas
Joseph E. Reeves, Marijn J. H. Heule, and Randal E. Bryant

A Sorted Datalog Hammer for Supervisor Verification Conditions Modulo Simple Linear Arithmetic
Martin Bromberger, Irina Dragoste, Rasha Faqeh, Christof Fetzer, Larry González, Markus Krötzsch, Maximilian Marx, Harish K Murali, and Christoph Weidenbach

Model Checking and Verification

Property Directed Reachability for Generalized Petri Nets
Nicolas Amat, Silvano Dal Zilio, and Thomas Hujsa

Transition Power Abstractions for Deep Counterexample Detection
Martin Blicha, Grigory Fedyukovich, Antti E. J. Hyvärinen, and Natasha Sharygina

Searching for Ribbon-Shaped Paths in Fair Transition Systems
Marco Bozzano, Alessandro Cimatti, Stefano Tonetta, and Viktoria Vozarova

CoVeriTeam: On-Demand Composition of Cooperative Verification Systems
Dirk Beyer and Sudeep Kanav

Author Index
Probabilistic Systems
A Probabilistic Logic for Verifying
Continuous-time Markov Chains
Ji Guan¹ and Nengkun Yu²
¹ State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China
guanji1992@gmail.com
² Centre for Quantum Software and Information, University of Technology Sydney, Sydney, Australia
nengkunyu@gmail.com
Abstract. A continuous-time Markov chain (CTMC) execution is a continuous
class of probability distributions over states. This paper proposes a probabilistic
linear-time temporal logic, namely continuous-time linear logic (CLL), to reason
about the probability distribution execution of CTMCs. We define the syntax of
CLL on the space of probability distributions. The syntax of CLL includes
multiphase timed until formulas, and the semantics of CLL allows time reset to
study relatively temporal properties. We derive a corresponding model-checking
algorithm for CLL formulas. The correctness of the model-checking algorithm
depends on Schanuel's conjecture, a central open problem in transcendental
number theory. Furthermore, we provide a running example of CTMCs to
illustrate our method.
1 Introduction
As a popular model of probabilistic continuous-time systems, continuous-time
Markov chains (CTMCs) have been extensively studied since Kolmogorov [25].
In the past 20 years, probabilistic continuous-time model checking has received
much attention. Adapting probabilistic computational tree logic (PCTL) [22] to
this context with extra multiphase timed until formulas
$\Phi_1 U^{T_1} \Phi_2 \cdots U^{T_K} \Phi_{K+1}$, for state formulas $\Phi$ and time
intervals $T$, Aziz et al. proposed continuous stochastic logic (CSL) to specify the
branching-time properties of CTMCs, and the model-checking problem for CSL is
decidable [8]. After that, efficient model-checking algorithms were developed by
transient analysis of CTMCs using uniformization [9] and stratification [41] for a
restricted version (path formulas are restricted to single until formulas
$\Phi_1 U^I \Phi_2$) and the full version of CSL, respectively. These algorithms have
been practically implemented in the model checkers PRISM [26], MRMC [24] and
STORM [18]. Further details can be found in an excellent survey [23].
There are also different ways to specify the linear-time properties of CTMCs.
Timed automata were first used to achieve this task [11,13,14,15,19], and then
metric temporal logic (MTL) [12] was also considered in this context. Subsequently,
the probability of “the system being in state $s_0$ within five time units after
having continuously remained in state $s_1$” can be computed. However, some
statements cannot be specified and verified because of the lack of a probabilistic
linear-time temporal logic, for instance “the system being in state $s_0$ with high
probability ($\geq 0.9$) within five time units after having continuously remained
in state $s_1$ with low probability ($\leq 0.1$)”. Furthermore, this probabilistic
property cannot be expressed in CSL, because CSL cannot express properties that
are defined across several state transitions of the same time length in the execution
of a CTMC.
In this paper, aiming to express the aforementioned probabilistic linear-time
properties, we introduce continuous-time linear logic (CLL). In particular, we
adopt the viewpoint used in [2] of regarding CTMCs as transformers of probability
distributions over states. CLL studies the properties of the probability distribution
execution generated by a given initial probability distribution over time. Owing to
the fundamental difference between the views of state executions and probability
distribution executions of CTMCs, CLL and CSL are incomparable and
complementary, mirroring the relation between probabilistic linear-time temporal
logic (PLTL) and PCTL in model checking discrete-time Markov chains [2,
Section 3.3].

The atomic propositions of CLL are interpreted on the space of probability
distributions over states of CTMCs. We apply the method of symbolic dynamics to
the probability distributions of CTMCs. To be specific, we symbolize the
probability value space $[0,1]$ into a finite set of intervals
$\mathcal{I} = \{I_k \subseteq [0,1]\}_{k=1}^{m}$. A probability distribution $\mu$ over
the set of states $S = \{s_0, s_1, \ldots, s_{d-1}\}$ is then represented symbolically
as a set of symbols

$$S(\mu) = \{\langle s, I \rangle \in S \times \mathcal{I} : \mu(s) \in I\},$$

where each symbol $\langle s, I \rangle$ asserts $\mu(s) \in I$, i.e., the probability of
state $s$ in distribution $\mu$ falls in the interval $I$. For example,
$\langle s_0, [0.9, 1] \rangle$ means that the system is in state $s_0$ with a probability
between 0.9 and 1. The symbolization idea for distributions has been considered in
[2], choosing a disjoint cover of $[0,1]$:

$$\mathcal{I} = \{[0, p_1), [p_1, p_2), \ldots, [p_n, 1]\}.$$

Here, we remove this restriction and enrich the expressiveness of $\mathcal{I}$. A
crucial fact about this symbolization is that the set $S \times \mathcal{I}$ is finite.
Consequently, the (probability distribution) execution path generated by an initial
probability distribution $\mu$ induces a sequence of symbols in $S \times \mathcal{I}$
over time. Therefore, the dynamics of CTMCs can be studied in terms of a
(real-time) language over the alphabet $S \times \mathcal{I}$, which is the set of atomic
propositions of CLL.
Different from the non-probabilistic linear-time temporal logics LTL and MTL,
CLL has two types of formulas: state formulas and path formulas. The state
formulas are constructed using propositional connectives. The path formulas are
obtained by propositional connectives and a temporal modal operator timed until
$U^T$ for a bounded time interval $T$, as in MTL and CSL. The standard next-step
temporal operator of LTL is meaningless in continuous-time systems, since the
time domain (the real numbers) is uncountable. As a result, CLL can express the
above-mentioned probabilistic property “the system is at state $s_0$ with high
probability ($\geq 0.9$) within 5 time units after having continuously remained at
state $s_1$ with low probability ($\leq 0.1$)” as a path formula:

$$\varphi = \langle s_1, [0, 0.1]\rangle\, U^{[0,5]}\, \langle s_0, [0.9, 1]\rangle.$$

In this single until formula, there is a time instant $0 \leq t \leq 5$ at which state
$s_1$ with low probability transits to state $s_0$ with high probability. We illustrate
this on the following timeline.

[Timeline: $\langle s_1, [0, 0.1]\rangle$ holds from time 0 up to some instant $t \leq 5$, at which $\langle s_0, [0.9, 1]\rangle$ holds.]
Furthermore, CLL allows multiphase timed until formulas. The semantics of these
formulas focuses on relative time intervals, i.e., time can be reset as in timed
automata [5,6], while those of CSL [8] are for absolute time intervals.
Consequently, CLL can express not only relatively but also absolutely temporal
properties of CTMCs.

We illustrate the significant difference between relatively and absolutely temporal
properties of CTMCs. For instance, “before the probability distribution transition
$\varphi$ happens in 3 to 7 time units, the system always stays at state $s_0$ with a
high probability ($\geq 0.9$)” can be formalized as the path formula

$$\varphi' = \langle s_0, [0.9, 1]\rangle\, U^{[3,7]}\, (\langle s_1, [0, 0.1]\rangle\, U^{[0,5]}\, \langle s_0, [0.9, 1]\rangle).$$

As we can see, there are two time instants, namely $t_1$ and $t_2$, at which
distribution transitions happen. Time is reset to 0 after the first distribution
transition happens, and thus $t_2$ is relative to $t_1$. More clearly, we depict this
on the following timeline.

[Timeline: $\langle s_0, [0.9, 1]\rangle$ holds up to some $t_1$ with $3 \leq t_1 \leq 7$; then $\langle s_1, [0, 0.1]\rangle$ holds up to a further $t_2 \leq 5$ measured from $t_1$ (time reset), at which $\langle s_0, [0.9, 1]\rangle$ holds at absolute time $t_1 + t_2$.]

An absolute version is “the probability distribution transition $\varphi$ happens and
the system always stays at state $s_0$ with a high probability ($\geq 0.9$) in 3 to 7
time units”:

$$\varphi'' = \Box^{[3,7]}\langle s_0, [0.9, 1]\rangle \wedge (\langle s_1, [0, 0.1]\rangle\, U^{[0,5]}\, \langle s_0, [0.9, 1]\rangle).$$

We can get a clear timeline representation by simply adding
$\Box^{[3,7]}\langle s_0, [0.9, 1]\rangle$ to that of $\varphi$. Assume that $t < 3$:

[Timeline: $\langle s_1, [0, 0.1]\rangle$ holds up to some $t < 3$, at which $\langle s_0, [0.9, 1]\rangle$ holds; in addition, $\langle s_0, [0.9, 1]\rangle$ holds throughout the absolute interval $[3, 7]$.]
Time reset enriches the expressiveness of CLL but makes model checking CLL
more difficult than model checking CSL. We overcome this by translating relative
time to absolute time. As a result, we develop an algorithm to model check
CTMCs against CLL formulas. More precisely, we reduce the model-checking
problem to a reachability problem over absolute time intervals. The reachability
problem corresponds to the real root isolation problem for real
polynomial-exponential functions (PEFs) over the field of algebraic numbers, an
extensively studied question in the recent symbolic and algebraic computation
community (e.g., [1,20,28]). By developing a state-of-the-art real root isolation
algorithm, we resolve the latter problem under the assumption of the validity of
Schanuel's conjecture, a central open question in transcendental number theory
[27]. This conjecture has also been the foundation of the correctness of many
recent model-checking algorithms, including the decidability of continuous-time
Markov decision processes [30], the synthesis of inductive invariants for
continuous linear dynamical systems [4], termination analysis for probabilistic
programs with delays [39], and reachability analysis for dynamical systems [20].
In summary, the main contributions of this paper are as follows.
– Introducing a probabilistic logic, namely continuous-time linear logic (CLL), for reasoning about CTMCs;
– Developing a state-of-the-art real root isolation algorithm for PEFs over the field of algebraic numbers for checking atomic propositions of CLL;
– Proving that model checking CTMCs against CLL formulas is decidable subject to Schanuel's conjecture.
Organization of this paper. In the next section, we give the mathematical
preliminaries used in this paper. In Section 3, we recall the view of CTMCs as
distribution transformers. After that, the symbolic dynamics of CTMCs are
introduced by symbolizing distributions over states of CTMCs in Section 4. In the
subsequent section, we present our continuous-time probabilistic temporal logic
CLL. In Section 6, we develop an algorithm to solve the CLL model-checking
problem. A case study and related works are shown in Sections 7 and 8,
respectively. We summarize our results and point out future research directions in
the final section.
2 Preliminaries

For the convenience of the reader, we review basic definitions and notations of
number theory, in particular Schanuel's conjecture.

Throughout this paper, we write $\mathbb{C}$, $\mathbb{R}$, $\mathbb{Q}$ and $\mathbb{A}$
for the fields of all complex, real, rational and algebraic numbers, respectively. In
addition, $\mathbb{Z}$ denotes the set of all integers. For
$\mathbb{F} \in \{\mathbb{C}, \mathbb{R}, \mathbb{Q}, \mathbb{Z}, \mathbb{A}\}$, we use
$\mathbb{F}[t]$ and $\mathbb{F}^{n \times m}$ to denote the set of polynomials in $t$
with coefficients in $\mathbb{F}$ and the set of $n$-by-$m$ matrices with entries in
$\mathbb{F}$, respectively. Furthermore, for
$\mathbb{F} \in \{\mathbb{R}, \mathbb{Q}, \mathbb{Z}\}$, we use $\mathbb{F}^+$ to
denote the set of nonnegative elements (including 0) of $\mathbb{F}$.
A bounded (time) interval $T$ is a subset of $\mathbb{R}^+$, which may be open,
half-open or closed, with one of the following forms:

$$[t_1, t_2], \quad [t_1, t_2), \quad (t_1, t_2], \quad (t_1, t_2),$$

where $t_1, t_2 \in \mathbb{R}^+$ and $t_2 \geq t_1$ ($t_1 = t_2$ is only allowed in the
case of $[t_1, t_2]$). Here, $t_1$ and $t_2$ are called the left and right endpoints
of $T$, respectively. Conveniently, we use $\inf T$ and $\sup T$ to denote $t_1$ and
$t_2$, respectively. In this paper, we only consider bounded intervals.

For reasoning about temporal properties, we further define the addition and
subtraction of (time) intervals. The expression $T + t$ or $t + T$, for
$t \in \mathbb{R}^+$, denotes the interval $\{t + t' : t' \in T\}$. Similarly, $T - t$
stands for the interval $\{t' - t : t' \in T\}$ if $t \leq \inf T$. Furthermore, for two
intervals $T_1$ and $T_2$,

$$T_1 + T_2 = \bigcup_{t \in T_1} (t + T_2) = \{t_1 + t_2 : t_1 \in T_1 \text{ and } t_2 \in T_2\}.$$

Two intervals $T_1$ and $T_2$ are disjoint if their intersection is empty, i.e.,
$T_1 \cap T_2 = \emptyset$. Some concrete examples: $1 + (2, 3) = (3, 4)$,
$(2, 3) - 1 = (1, 2)$, $(2, 3) + [3, 4] = (5, 7)$, and $(2, 3)$, $[3, 4]$ are disjoint.
All such interval calculations are easy to compute.
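These operations are simple to implement; the following Python sketch (our illustration, not code from the paper) tracks open/closed endpoints and reproduces the examples above:

```python
# Interval arithmetic on bounded time intervals, tracking open/closed ends.
from dataclasses import dataclass

@dataclass(frozen=True)
class Ivl:
    lo: float
    hi: float
    lo_closed: bool = True
    hi_closed: bool = True

    def __str__(self):
        return (('[' if self.lo_closed else '(') + f"{self.lo}, {self.hi}"
                + (']' if self.hi_closed else ')'))

def shift(T: Ivl, t: float) -> Ivl:
    """t + T = { t + t' : t' in T }; use a negative t for T - t (requires t <= inf T)."""
    return Ivl(T.lo + t, T.hi + t, T.lo_closed, T.hi_closed)

def add(T1: Ivl, T2: Ivl) -> Ivl:
    """T1 + T2 = { t1 + t2 : t1 in T1, t2 in T2 }; an endpoint of the sum is
    closed only if both contributing endpoints are closed."""
    return Ivl(T1.lo + T2.lo, T1.hi + T2.hi,
               T1.lo_closed and T2.lo_closed,
               T1.hi_closed and T2.hi_closed)

print(shift(Ivl(2, 3, False, False), 1))        # 1 + (2,3) = (3, 4)
print(shift(Ivl(2, 3, False, False), -1))       # (2,3) - 1 = (1, 2)
print(add(Ivl(2, 3, False, False), Ivl(3, 4)))  # (2,3) + [3,4] = (5, 7)
```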
An algebraic number is a complex number that is a root of a non-zero polynomial
in one variable with rational coefficients (or, equivalently, integer coefficients, by
eliminating denominators). An algebraic number $\alpha$ is represented by a triple
$(P, (a, b), \varepsilon)$, where $P$ is the minimal polynomial of $\alpha$,
$a, b \in \mathbb{Q}$, and $a + bi$ is an approximation of $\alpha$ such that
$|\alpha - (a + bi)| < \varepsilon$ and $\alpha$ is the only root of $P$ in the open ball
$B(a + bi, \varepsilon)$. The minimal polynomial of $\alpha$ is the polynomial of
smallest degree in $\mathbb{Q}[t]$ such that $\alpha$ is a root of the polynomial and
the coefficient of the highest-degree term is 1. Any root of $f(t) \in \mathbb{A}[t]$
is algebraic. Moreover, given the representations of $a, b \in \mathbb{A}$, the
representations of $a \pm b$, $a/b$ and $a \cdot b$ can be computed in polynomial
time, as can equality checking [17].
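As an aside, computer algebra systems implement this kind of exact arithmetic on algebraic numbers via minimal polynomials. The SymPy snippet below (our illustration, not part of the paper) computes minimal polynomials of a few algebraic numbers and of their sums, products and quotients; note that SymPy returns an integer-coefficient primitive polynomial rather than the monic rational one of the definition above.

```python
# Exact arithmetic on algebraic numbers via minimal polynomials (SymPy).
from sympy import sqrt, symbols, minimal_polynomial

t = symbols('t')
a, b = sqrt(2), sqrt(3)

print(minimal_polynomial(a, t))      # t**2 - 2
print(minimal_polynomial(a + b, t))  # t**4 - 10*t**2 + 1
print(minimal_polynomial(a * b, t))  # t**2 - 6
print(minimal_polynomial(a / b, t))  # 3*t**2 - 2, i.e. sqrt(2/3)
```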
Furthermore, a complex number is called transcendental if it is not an algebraic
number. In general, it is challenging to verify relationships between transcendental
numbers [33]. On the other hand, one can use the Lindemann-Weierstrass theorem
to compare some transcendental numbers. The transcendence of $e$ and $\pi$ are
direct corollaries of this theorem.

Theorem 1 (Lindemann-Weierstrass theorem). Let $\eta_1, \ldots, \eta_n$ be
pairwise distinct algebraic complex numbers. Then
$\sum_k \lambda_k e^{\eta_k} \neq 0$ for any non-zero algebraic numbers
$\lambda_1, \ldots, \lambda_n$.
The following concepts are introduced to study general relations between
transcendental numbers.

Definition 1 (Algebraic independence). A set of complex numbers
$S = \{a_1, \ldots, a_n\}$ is algebraically independent over $\mathbb{Q}$ if the
elements of $S$ do not satisfy any nontrivial (non-constant) polynomial equation
with coefficients in $\mathbb{Q}$.

By the above definition, for any transcendental number $u$, $\{u\}$ is algebraically
independent over $\mathbb{Q}$, while $\{a\}$ for any algebraic number
$a \in \mathbb{A}$ is not. Thus, a set of complex numbers that is algebraically
independent over $\mathbb{Q}$ must consist of transcendental numbers.
$\{\pi, e^{\pi\sqrt{n}}\}$ is also algebraically independent over $\mathbb{Q}$ for any
positive integer $n$ [31]. Checking algebraic independence is challenging. For
example, it is still wide open whether $\{e, \pi\}$ is algebraically independent over
$\mathbb{Q}$.
Definition 2 (Extension field). Given two fields $E \subseteq F$, $F$ is an extension
field of $E$, denoted by $F/E$, if the operations of $E$ are those of $F$ restricted
to $E$.

For example, under the usual notions of addition and multiplication, the field of
complex numbers is an extension field of the real numbers.

Definition 3 (Transcendence degree). Let $L$ be an extension field of
$\mathbb{Q}$; the transcendence degree of $L$ over $\mathbb{Q}$ is defined as the
largest cardinality of an algebraically independent subset of $L$ over $\mathbb{Q}$.

For instance, let $\mathbb{Q}(e)/\mathbb{Q} = \{a + be \mid a, b \in \mathbb{Q}\}$ and
$\mathbb{Q}(\sqrt{2})/\mathbb{Q} = \{a + b\sqrt{2} \mid a, b \in \mathbb{Q}\}$ be two
extension fields of $\mathbb{Q}$. Then their transcendence degrees are 1 and 0,
respectively, noting that $e$ is a transcendental number and $\sqrt{2}$ is an
algebraic number.
Now, Schanuel's conjecture is ready to be presented.

Conjecture 1 (Schanuel's conjecture). Given any complex numbers
$z_1, \ldots, z_n$ that are linearly independent over $\mathbb{Q}$, the extension field
$\mathbb{Q}(z_1, \ldots, z_n, e^{z_1}, \ldots, e^{z_n})$ has transcendence degree at
least $n$ over $\mathbb{Q}$.

Stephen Schanuel proposed this conjecture during a course given by Serge Lang
at Columbia in the 1960s [27]. Schanuel's conjecture concerns the transcendence
degree of certain field extensions of the rational numbers. The conjecture, if
proven, would significantly generalize the most well-known results in
transcendental number theory [29,37]. For example, the algebraic independence of
$\{e, \pi\}$ would simply follow by setting $z_1 = 1$ and $z_2 = \pi i$, and using
Euler's identity $e^{\pi i} + 1 = 0$.
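To spell this example out (our elaboration of the argument just sketched, not text from the paper):

```latex
% 1 and \pi i are linearly independent over \mathbb{Q}: a rational relation
% q_1 \cdot 1 + q_2 \cdot \pi i = 0 forces q_1 = q_2 = 0, since \pi i is not real.
% Schanuel's conjecture with n = 2 then gives
\operatorname{trdeg}_{\mathbb{Q}} \mathbb{Q}\bigl(1,\, \pi i,\, e,\, e^{\pi i}\bigr) \;\geq\; 2.
% By Euler's identity, e^{\pi i} = -1 is algebraic, so
\mathbb{Q}\bigl(1,\, \pi i,\, e,\, e^{\pi i}\bigr) \;=\; \mathbb{Q}(\pi i,\, e),
% whence \pi i and e are algebraically independent over \mathbb{Q};
% since i is algebraic, so are \pi and e.
```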
3 Continuous-time Markov Chains as Distribution Transformers

We begin with the definition of continuous-time Markov chains (CTMCs). A
CTMC is a Markovian (memoryless) stochastic process that takes values in a finite
state set $S$ ($|S| = d < \infty$) and evolves in continuous time
$t \in \mathbb{R}^+$. Formally,

Definition 4. A CTMC is a pair $M = (S, Q)$, where $S$ ($|S| = d$) is a finite state
set and $Q \in \mathbb{Q}^{d \times d}$ is a transition rate matrix.

A transition rate matrix $Q$ is a matrix whose off-diagonal entries
$\{Q_{i,j}\}_{i \neq j}$ are nonnegative rational numbers, representing the transition
rate from state $s_j$ to state $s_i$, while the diagonal entries $Q_{j,j}$ are
constrained to be $-\sum_{i \neq j} Q_{i,j}$ for all $1 \leq j \leq d$. Consequently,
the column sums of $Q$ are all zero.
The evolution of a CTMC can be regarded as a distribution transformer. Given an
initial distribution $\mu \in \mathbb{Q}^{d \times 1} \cap \mathcal{D}(S)$, the
distribution at time $t \in \mathbb{R}^+$ is

$$\mu_t = e^{Qt}\mu,$$

where $\mathcal{D}(S)$ denotes the set of all probability distributions over $S$. We
call $\mathcal{D}(S)$ the probability distribution space of CTMCs. An execution
path of a CTMC is a continuous function indexed by the initial distribution
$\mu \in \mathcal{D}(S)$:

$$\sigma_\mu : \mathbb{R}^+ \to \mathcal{D}(S), \quad \sigma_\mu(t) = e^{Qt}\mu. \tag{1}$$
Example 1. We recall the illustrating example of a CTMC $M = (S, Q)$ from [8,
Figure 1] as the running example of our work. In particular, $M$ is a 5-dimensional
CTMC with initial distribution $\mu$, where $S = \{s_0, s_1, s_2, s_3, s_4\}$ and

$$Q = \begin{pmatrix}
-3 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 \\
2 & 0 & -7 & 0 & 0 \\
0 & 0 & 3 & 0 & 0 \\
0 & 0 & 4 & 0 & 0
\end{pmatrix}, \qquad
\mu = \begin{pmatrix} 0.1 \\ 0.2 \\ 0.3 \\ 0.4 \\ 0 \end{pmatrix}.$$

(The diagonal entries $-3$ and $-7$ make every column of $Q$ sum to zero, as
required above.)
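To make the distribution-transformer view concrete, here is a small Python sketch (our illustration, not code from the paper; it assumes NumPy and SciPy are available) that computes $\mu_t = e^{Qt}\mu$ for the running example and checks that probability mass is preserved:

```python
import numpy as np
from scipy.linalg import expm

# Rate matrix Q and initial distribution mu of the running example (Example 1).
Q = np.array([
    [-3, 0,  0, 0, 0],
    [ 1, 0,  0, 0, 0],
    [ 2, 0, -7, 0, 0],
    [ 0, 0,  3, 0, 0],
    [ 0, 0,  4, 0, 0],
], dtype=float)
mu = np.array([0.1, 0.2, 0.3, 0.4, 0.0])

# A transition rate matrix has zero column sums, so e^{Qt} maps
# distributions to distributions.
assert np.allclose(Q.sum(axis=0), 0.0)

def distribution_at(t: float) -> np.ndarray:
    """sigma_mu(t) = e^{Qt} mu, the distribution of the chain at time t (Eq. (1))."""
    return expm(Q * t) @ mu

for t in (0.0, 1.0, 5.0):
    mu_t = distribution_at(t)
    print(f"t = {t}: mu_t = {np.round(mu_t, 4)}, total = {mu_t.sum():.4f}")
```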
4 Symbolic Dynamics of CTMCs

In this section, we introduce symbolic dynamics to characterize the properties of
the probability distribution space of CTMCs.

First, we fix a finite set of intervals $\mathcal{I} = \{I_k \subseteq [0,1]\}_{k \in K}$,
where the endpoints of each $I_k$ are rational numbers. With the states
$S = \{s_0, s_1, \ldots, s_{d-1}\}$, we define the symbolization of distributions as a
function

$$S : \mathcal{D}(S) \to 2^{S \times \mathcal{I}}, \qquad S(\mu) = \{\langle s, I \rangle \in S \times \mathcal{I} : \mu(s) \in I\}, \tag{2}$$

where $\times$ denotes the Cartesian product and $2^{S \times \mathcal{I}}$ is the
power set of $S \times \mathcal{I}$. $\langle s, I \rangle \in S(\mu)$ asserts that the
probability of state $s$ in distribution $\mu$ is in the interval $I$. The
symbolization of distributions is a generalization of the discretization of
distributions, with $I_k \cap I_m = \emptyset$ for all $k \neq m$, which was studied
in [2]. This generalization increases the expressiveness of our continuous
linear-time logic, introduced in the next section. Now, we can represent any given
probability distribution by finitely many symbols from $S \times \mathcal{I}$. For
example, suppose

$$\mathcal{I} = \{[0, 0.1], (0.1, 0.9), [0.9, 1], [1, 1], [0.4, 0.4]\}; \tag{3}$$

then the initial distribution $\mu$ in Example 1 is symbolized as

$$S(\mu) = \{\langle s_0, [0, 0.1] \rangle, \langle s_1, (0.1, 0.9) \rangle, \langle s_2, (0.1, 0.9) \rangle, \langle s_3, (0.1, 0.9) \rangle, \langle s_3, [0.4, 0.4] \rangle, \langle s_4, [0, 0.1] \rangle\}. \tag{4}$$
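A direct implementation of the symbolization function is straightforward; the following sketch (ours, not the paper's code) uses exact rational arithmetic and reproduces Eq. (4) for the interval set of Eq. (3):

```python
# A minimal sketch of the symbolization function S of Eq. (2).
from fractions import Fraction
from typing import NamedTuple

class Interval(NamedTuple):
    lo: Fraction
    hi: Fraction
    lo_closed: bool = True
    hi_closed: bool = True

    def contains(self, p: Fraction) -> bool:
        above = p > self.lo or (self.lo_closed and p == self.lo)
        below = p < self.hi or (self.hi_closed and p == self.hi)
        return above and below

    def __str__(self):
        left = '[' if self.lo_closed else '('
        right = ']' if self.hi_closed else ')'
        return f"{left}{self.lo}, {self.hi}{right}"

Fr = Fraction  # Fraction accepts decimal strings, e.g. Fr('0.1') == 1/10

# The interval set I of Eq. (3).
INTERVALS = [
    Interval(Fr('0'), Fr('0.1')),
    Interval(Fr('0.1'), Fr('0.9'), lo_closed=False, hi_closed=False),
    Interval(Fr('0.9'), Fr('1')),
    Interval(Fr('1'), Fr('1')),
    Interval(Fr('0.4'), Fr('0.4')),
]

def symbolize(mu):
    """S(mu) = { <s, I> in S x I : mu(s) in I }  (Eq. (2))."""
    return {(s, iv) for s, p in enumerate(mu)
            for iv in INTERVALS if iv.contains(Fr(p))}

mu = ['0.1', '0.2', '0.3', '0.4', '0']
for s, iv in sorted(symbolize(mu)):
    print(f"<s_{s}, {iv}>")  # prints the six symbols of Eq. (4)
```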
As we can see from the above example, the symbolization of distributions over
states captures both the exact probabilities (singleton intervals) of states and the
ranges of their possibilities.

Next, we introduce the symbolization of CTMCs.

Definition 5. A symbolized CTMC is a tuple $SM = (S, Q, \mathcal{I})$, where
$M = (S, Q)$ is a CTMC and $\mathcal{I}$ is a finite set of intervals in $[0, 1]$.

As we can see, the set of intervals is chosen depending on the CTMC. Then, we
extend this symbolization to the path $\sigma_\mu$:

$$S \circ \sigma_\mu : \mathbb{R}^+ \to 2^{S \times \mathcal{I}}. \tag{5}$$

Definition 6. Given a symbolized CTMC $SM = (S, Q, \mathcal{I})$,
$S \circ \sigma_\mu$ is a symbolic execution path of $M = (S, Q)$.

Given a symbolized CTMC $SM = (S, Q, \mathcal{I})$, the path $\sigma_\mu$ of the
CTMC $M = (S, Q)$ over the real numbers $\mathbb{R}^+$ generated by a
probability distribution $\mu$ induces a symbolic execution path
$S \circ \sigma_\mu$ over the finite symbol set $S \times \mathcal{I}$. Subsequently,
the dynamics of CTMCs can be studied in terms of a language over
$S \times \mathcal{I}$. In other words, we can study the temporal properties of
CTMCs in the context of symbolized CTMCs.
5 Continuous Linear-time Logic

In this section, we introduce continuous linear-time logic (CLL), a probabilistic
linear-time temporal logic, to specify the temporal properties of a symbolized
CTMC $SM = (S, Q, \mathcal{I})$.

CLL has two types of formulas: state formulas and path formulas. The state
formulas are constructed using propositional connectives. The path formulas are
obtained by propositional connectives and a temporal modal operator timed until
$U^T$ for a bounded time interval $T$, as in MTL and CSL. Furthermore,
multiphase timed until formulas
$\Phi_0 U^{T_1} \Phi_1 U^{T_2} \Phi_2 \ldots U^{T_n} \Phi_n$ are allowed to enrich the
expressiveness of CLL. More importantly, time reset is involved in these
multiphase formulas, so both absolutely and relatively temporal properties of
CTMCs can be studied.

Definition 7. The state formulas of CLL are described according to the following
syntax:

$$\Phi := \text{true} \mid a \in AP \mid \neg\Phi \mid \Phi_1 \wedge \Phi_2,$$

where $AP$ denotes $S \times \mathcal{I}$, the set of atomic propositions.
The path formulas of CLL are constructed by the following syntax:

$$\varphi := \text{true} \mid \Phi_0 U^{T_1} \Phi_1 U^{T_2} \Phi_2 \ldots U^{T_n} \Phi_n \mid \neg\varphi \mid \varphi_1 \wedge \varphi_2,$$

where $n \in \mathbb{Z}^+$ is a positive integer, for all $0 \leq k \leq n$, $\Phi_k$ is a
state formula, and the $T_k$'s are time intervals with endpoints in
$\mathbb{Q}^+$, i.e., each $T_k$ has one of the following forms:

$$(a, b), \quad [a, b], \quad (a, b], \quad [a, b), \qquad a, b \in \mathbb{Q}^+.$$
The semantics of CLL state formulas is defined on the set D(S) of probability distributions over S with the symbolization function S in Eq. (2) of Section 4:
(1) µ |= true for all probability distributions µ ∈ D(S);
(2) µ |= a iff a ∈ S(µ);
(3) µ |= ¬Φ iff it is not the case that µ |= Φ (written µ ⊭ Φ);
(4) µ |= Φ1 ∧ Φ2 iff µ |= Φ1 and µ |= Φ2.
The semantics of CLL path formulas is defined on the execution paths {σ_µ}_{µ∈D(S)} of the CTMC M = (S, Q):
(1) σ_µ |= true for all probability distributions µ ∈ D(S);
(2) σ_µ |= Φ0 U^{T1} Φ1 U^{T2} Φ2 ... U^{Tn} Φn iff there is a time instant t ∈ T1 such that σ_{µ_t} |= Φ1 U^{T2} Φ2 ... U^{Tn} Φn, and for any t′ ∈ T1 ∩ [0, t), µ_{t′} |= Φ0, where σ_{µ_t} |= Φ iff µ_t |= Φ, and µ_t is the distribution of the chain at time instant t, i.e., µ_t = e^{Qt}µ for all t ∈ R+;
(3) σ_µ |= ¬φ iff it is not the case that σ_µ |= φ (written σ_µ ⊭ φ);
(4) σ_µ |= φ1 ∧ φ2 iff σ_µ |= φ1 and σ_µ |= φ2.
Not surprisingly, the other Boolean connectives are derived in the standard way, i.e., false = ¬true, Φ1 ∨ Φ2 = ¬(¬Φ1 ∧ ¬Φ2) and Φ1 → Φ2 = ¬Φ1 ∨ Φ2, and the path formulas follow the same way. Furthermore, we generalize the temporal operators ◊ (“eventually”) and □ (“always”) of discrete-time systems to their timed variants ◊^T and □^T, respectively:
$$\Diamond^T \Phi = \text{true}\; U^T\, \Phi, \qquad \Box^T \Phi = \neg\Diamond^T \neg\Phi.$$
For n = 1 in multiphase timed until formulas, the until operator U^{T1} is a timed variant of the until operator of LTL; the path formula Φ0 U^{T1} Φ1 asserts that Φ1 is satisfied at some time instant in the interval T1 and that at all preceding time instants in T1, Φ0 holds. For example,
$$\varphi = \langle s_1,[0,0.1]\rangle\; U^{[0,5]}\; \langle s_0,[0.9,1]\rangle,$$
as mentioned in the introduction.
For general n, the CLL path formula Φ0 U^{T1} Φ1 U^{T2} Φ2 ... U^{Tn} Φn is explained by induction on n. We first mention that U^T is right-associative, e.g., Φ0 U^{T1} Φ1 U^{T2} Φ2 stands for Φ0 U^{T1} (Φ1 U^{T2} Φ2). This enables time reset: T1 and T2 do not have to be disjoint, and the starting time point of T2 is relative to some time instant in T1. Recall the multiphase timed until formula from the introduction, which expresses a relative time property:
$$\varphi' = \langle s_0,[0.9,1]\rangle\; U^{[3,7]}\; (\langle s_1,[0,0.1]\rangle\; U^{[0,5]}\; \langle s_0,[0.9,1]\rangle),$$
which differs from the following CLL path formula representing an absolute temporal property of CTMCs:
$$\varphi'' = \Diamond^{[3,7]} \langle s_0,[0.9,1]\rangle \wedge (\langle s_1,[0,0.1]\rangle\; U^{[0,5]}\; \langle s_0,[0.9,1]\rangle).$$
As an example, we clarify the semantics of CLL by comparing the above two path formulas in their general forms:
$$\Phi_0\, U^{T_1}\, \Phi_1\, U^{T_2}\, \Phi_2 \qquad \text{and} \qquad \Phi_0\, U^{T_1}\, \Phi_1 \wedge \Phi_1\, U^{T_2}\, \Phi_2.$$
(1) σ_µ |= Φ0 U^{T1} Φ1 U^{T2} Φ2 asserts that there are time instants t1 ∈ T1 and t2 ∈ T2 such that µ_{t1+t2} |= Φ2 and, for any t′1 ∈ T1 ∩ [0, t1) and t′2 ∈ T2 ∩ [0, t2), µ_{t′1} |= Φ0 and µ_{t1+t′2} |= Φ1, where µ_t = e^{Qt}µ for all t ∈ R+. (The original depicts this on a timeline: Φ0 holds from time 0 up to t1 with inf T1 ≤ t1 ≤ sup T1, Φ1 holds from t1 up to t1+t2 with t1+t2 ≤ sup(T1+T2) and t2 ≥ inf T2, and Φ2 holds at t1+t2.)
(2) σ_µ |= Φ0 U^{T1} Φ1 ∧ Φ1 U^{T2} Φ2 asserts that there are time instants t1 ∈ T1 and t2 ∈ T2 such that µ_{t1} |= Φ1 and µ_{t2} |= Φ2 and, for any t′1 ∈ T1 ∩ [0, t1) and t′2 ∈ T2 ∩ [0, t2), µ_{t′1} |= Φ0 and µ_{t′2} |= Φ1, where µ_t = e^{Qt}µ for all t ∈ R+.
Before solving the model-checking problem of CTMCs against CLL formulas in the next section, we first discuss what can be specified in our logic CLL. Given a CTMC (S, Q), the CLL path formula ◊^{[0,1000]}⟨s, [1,1]⟩ expresses a liveness property: state s ∈ S is eventually reached with probability one before time instant 1000. In terms of safety properties, the formula □^{[100,1000]}⟨s, [0,0]⟩ represents that state s ∈ S is never reached (reached with probability zero) between time instants 100 and 1000. Furthermore, by choosing nontrivial intervals (neither [0,0] nor [1,1]), liveness and safety properties can be asserted with probabilities, such as ◊^{[0,1000]}⟨s, [0.5,1]⟩ and □^{[100,1000]}⟨s, [0,0.5]⟩. The multiphase timed until formula ⟨s, [0.7,1]⟩ U^{[2,3]} ⟨s, [0.7,1]⟩ ... U^{[2,3]} ⟨s, [0.7,1]⟩, where U^{[2,3]} occurs 100 times, asserts that the probability of state s is beyond 0.7 at time instants lying 2 to 3 time units after the previous one, and that this happens at least 100 times.
Next, we can classify members of I as representing “low” and “high” probabilities. For example, if I contains the 3 intervals {[0,0.1], (0.1,0.9), [0.9,1]}, we can declare the first interval as “low” and the last interval as “high”. In this case, □^{[10,1000)}(⟨s0,[0,0.1]⟩ → ⟨s1,[0.9,1]⟩) says that, in the time interval [10,1000), whenever the probability of state s0 is low, the probability of state s1 is high.
6 CLL Model Checking
In this section, we provide an algorithm to model check CTMCs against CLL formulas, i.e., we show that the following CLL model-checking problem (Problem 1) is decidable.
Problem 1 (CLL Model-checking Problem). Given a symbolized CTMC SM = (S, Q, I) with an initial distribution µ and a CLL path formula φ over AP = S×I, the goal is to decide whether σ_µ |= φ, where σ_µ(t) = e^{Qt}µ is the execution path defined in Eq. (1).
In particular, we show:
Theorem 2. Under the condition that Schanuel's conjecture holds, the CLL model-checking problem in Problem 1 is decidable.
In the following, we prove the above theorem, going from checking the basic formulas (atomic propositions) to the most complex ones (nontrivial multiphase timed until formulas). For readability, we put the proofs of all results in Appendix A of the extended version [21] of this paper.
We start with the simplest case, an atomic proposition ⟨s, I⟩. By the semantics of CLL, µ_t |= ⟨s, I⟩ if and only if µ_t(s) = (e^{Qt}µ)(s) ∈ I. To check this, we first observe that the execution path e^{Qt}µ of a CTMC is a system of polynomial-exponential functions (PEFs).
Definition 8. A function f : R → R is a polynomial-exponential function (PEF) if f has the following form:
$$f(t) = \sum_{k=0}^{K} f_k(t)\, e^{\lambda_k t} \tag{6}$$
where for all 0 ≤ k ≤ K < ∞, f_k(t) ∈ F1[t], f_k(t) ≠ 0, λ_k ∈ F2, and F1, F2 are fields. Without loss of generality, we assume that the λ_k are distinct.
Generally, for a PEF f(t) with range in the complex numbers C, g(t) = f(t) + \overline{f(t)} is a PEF with range in the real numbers R, where \overline{f(t)} is the complex conjugate of f(t). The argument t is omitted whenever convenient, i.e., f = f(t); t is called a root of a function f if f(t) = 0. PEFs often appear in transcendental number theory as auxiliary functions in proofs involving the exponential function [10].
Lemma 1. Given a CTMC M = (S, Q) with S = {s0, ..., s_{d−1}}, Q ∈ Q^{d×d}, and an initial distribution µ ∈ Q^{d×1}, for any 0 ≤ i ≤ d−1, (e^{Qt}µ)(s_i), the i-th entry of e^{Qt}µ, can be expressed as a PEF f : R+ → [0,1] as in Eq. (6) with F1 = F2 = A.
By the above lemma, for a given t in some bounded time interval T (made specific in the later discussion), (e^{Qt}µ)(s) ∈ I is determined by the algebraic structure of the PEF g(t) = (e^{Qt}µ)(s) on T, that is, by all maximum intervals T_max ⊆ T such that g(t) ∈ I for all t ∈ T_max; here an interval T_max ≠ ∅ is called maximum for g(t) ∈ I if there is no interval T′ with T_max ⊊ T′ ⊆ T for which the property still holds, i.e., g(t) ∈ I for all t ∈ T′. Then (e^{Qt}µ)(s) ∈ I if and only if t ∈ T_max for some maximum interval T_max. So we aim to compute the set T of all maximum intervals. By the continuity of the PEF g(t), this can be done by identifying a real root isolation of the following PEF f(t) in T: f(t) = (g(t) − inf I)(g(t) − sup I).
A (real) root isolation of a function f(t) in an interval T is a set of mutually disjoint intervals, denoted by Iso(f)_T = {(a_j, b_j) ⊆ T} with a_j, b_j ∈ Q, such that
– for any j, there is one and only one root of f(t) in (a_j, b_j);
– for any root t of f(t) in T, t ∈ (a_j, b_j) for some j.
Furthermore, if f has no root in T, then Iso(f)_T = ∅. Although there are infinitely many real root isolations of f(t) in T, the number of isolation intervals always equals the number of distinct roots of f(t) in T.
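For intuition only, root isolation can be approximated numerically by scanning for sign changes, as in the sketch below (assuming NumPy). This heuristic is not the paper's procedure: the algorithm behind Theorem 4 below works symbolically over algebraic numbers and is correct subject to Schanuel's conjecture, whereas a fixed grid may miss tangential roots or closely spaced root pairs.

import numpy as np

def isolate_roots_numeric(f, T=(0.0, 5.0), samples=100_000):
    """Approximate a real root isolation Iso(f)_T via sign changes on a grid."""
    ts = np.linspace(T[0], T[1], samples)
    vals = f(ts)
    flips = np.nonzero(np.sign(vals[:-1]) * np.sign(vals[1:]) < 0)[0]
    return [(float(ts[i]), float(ts[i + 1])) for i in flips]

# Example: a PEF with exactly one root in [0, 5] (it reappears in Sect. 7):
print(isolate_roots_numeric(lambda t: 0.3 * (1 - np.exp(-3 * t)) - 0.1))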
Finding real root isolations of PEFs is a long-standing problem that dates back at least to Ritt's paper [34] from 1929. Further results followed over the last century (e.g., [7,38]). The problem is essential in the reachability analysis of dynamical systems, an active field of symbolic and algebraic computation. For the case F1 = Q and F2 = N+, an algorithm named ISOL was proposed in [1] to isolate all real roots of f(t). Later, this algorithm was extended to the case F1 = Q and F2 = R [20]. A variant of the problem has also been studied in [28]. The correctness of these algorithms is based on Schanuel's conjecture. Other works use Schanuel's conjecture for root isolation of other classes of functions, such as exp-log functions [35] and tame elementary functions [36].
By Lemma 1, we pursue this problem in the context of CTMCs. The distinctive feature of solving real root isolation of PEFs in our paper is that we deal with complex numbers C, more specifically with algebraic numbers A, i.e., F1 = F2 = A; to the best of our knowledge, all previous works can only handle the case over R. Here, we develop a state-of-the-art real root isolation algorithm for PEFs over algebraic numbers. From now on, we always assume that PEFs are over A, i.e., F1 = F2 = A in Eq. (6). In this setting, it is worth noting that whether a PEF has a root in a given interval T ⊆ R+ is decidable subject to Schanuel's conjecture if T is bounded [16], which is the situation we consider in this paper.
Theorem 3 ([16]). Under the condition that Schanuel's conjecture holds, there is an algorithm to check whether a PEF f(t) has a root in an interval T, i.e., whether Iso(f)_T = ∅.
In this paper, we extend the above check of Iso(f)_T = ∅ to the computation of Iso(f)_T for a PEF f(t).
Theorem 4. Under the condition that Schanuel's conjecture holds, there is an algorithm to find a real root isolation Iso(f)_T for any PEF f(t) and interval T. Furthermore, the number of real roots is finite, i.e., |Iso(f)_T| < ∞.
With the above theorem we can compute the set T of all maximum intervals in order to check atomic propositions. Furthermore, we can compare the values of any real roots of PEFs, which is important for model checking general multiphase timed until formulas at the end of this section.
Lemma 2. Let f1(t) and f2(t) be two PEFs with domains T1 and T2, and let t1 ∈ T1 and t2 ∈ T2 be roots of them, respectively. Under the condition that Schanuel's conjecture holds, there is an efficient way to check whether or not t1 − t2 < g for any given rational number g ∈ Q.
For model checking a general state formula Φ, we can also use real root isolation of suitable PEFs to obtain the set of all maximum intervals T_max such that µ_t |= Φ for all t ∈ T_max. The reason is that Φ admits a conjunctive normal form consisting of atomic propositions. See the proof of the following lemma in Appendix A of the extended version [21] of this paper for the details.
Lemma 3. Under the condition that Schanuel's conjecture holds, given a time interval T, the set T of all maximum intervals in T satisfying µ_t |= Φ can be computed, where Φ is a state formula of CLL. Furthermore, the number of intervals in T is finite, and the left and right endpoints of each interval in T are roots of PEFs.
At last, we characterize the multiphase timed until formulas by a reachability analysis of time intervals (instants).
Lemma 4. σ_µ |= Φ0 U^{T1} Φ1 U^{T2} Φ2 ··· U^{Tn} Φn if and only if there exist time intervals {I_k ⊆ R+}_{k=0}^{n} with I_0 = [0,0] such that
– the satisfaction of intervals: for all 1 ≤ k ≤ n, µ_t |= Φ_{k−1} for all t ∈ I_k, and µ_{t*} |= Φ_n, where t* = sup I_n and µ_t = e^{Qt}µ for all t ∈ R+;
– the order of intervals: for all 1 ≤ k ≤ n, I_k ⊆ I_{k−1} + T_k and inf I_k = sup I_{k−1} + inf T_k.
By the above lemma, the problem of checking multiphase timed until formulas is reduced to verifying the existence of a sequence of time intervals.
Now we can show the proof of Theorem 2.
Proof. Recall that the nontrivial step is to model check a multiphase timed until formula Φ0 U^{T1} Φ1 U^{T2} Φ2 ··· U^{Tn} Φn, where {T_j}_{j=1}^{n} is a set of bounded rational intervals in R+ and, for 0 ≤ k ≤ n, Φ_k is a state formula.
By Lemma 4, for model checking the above formula we only need to check the existence of the time intervals {I_k}_{k=0}^{n} described in the lemma. The following procedure constructs such a set of intervals if it exists:
(1) Let ℐ_0 = {I_0 = [0,0]};
(2) For each 1 ≤ k ≤ n, obtain the set ℐ_k of all maximum intervals in [0, ∑_{j=1}^{k} sup T_j] such that µ_t |= Φ_{k−1} for all t ∈ I with I ∈ ℐ_k, where µ_t = e^{Qt}µ; this can be done by Lemma 3. Note that ℐ_k can be the empty set, i.e., ℐ_k = ∅;
(3) For k from 1 to n, update ℐ_k:
$$\mathcal{I}_k = \{ I \cap (I' + T_k) : I \in \mathcal{I}_k \text{ and } I' \in \mathcal{I}_{k-1} \}. \tag{7}$$
These updates can be carried out using Lemma 2. If ℐ_k = ∅, then the formula is not satisfied;
(4) Update ℐ_n: for each I ∈ ℐ_n, we replace I with [s − ε, s) for some constant ε > 0 if there is an s ∈ I with s − ε ∈ I such that µ_s |= Φ_n, where µ_s = e^{Qs}µ; otherwise, we remove this element from ℐ_n. Again, this can be done by Lemma 3. If ℐ_n = ∅, then the formula is not satisfied;
(5) Finally, for k from n−1 down to 1, update ℐ_k:
$$\mathcal{I}_k = \{ [s - \varepsilon - \inf T_{k+1},\ s - \inf T_{k+1}) : [s - \varepsilon, s) \in \mathcal{I}_{k+1} \}.$$
After the above procedure, we have non-empty sets {ℐ_k}_{k=0}^{n} with the following properties:
– for each 1 ≤ k ≤ n, µ_t |= Φ_{k−1} for all t ∈ I_k with I_k ∈ ℐ_k, and µ_{t*} |= Φ_n, where t* = sup I_n;
– for each 1 ≤ k ≤ n and I ∈ ℐ_k, there exists at least one I′ ∈ ℐ_{k−1} such that I ⊆ sup I′ + T_k and inf I = sup I′ + inf T_k.
Therefore, we can extract a set of intervals {I_k}_{k=0}^{n} satisfying the two conditions of Lemma 4 whenever one exists. On the other hand, it is easy to check that every such {I_k}_{k=0}^{n} must be contained in {ℐ_k}_{k=0}^{n}, i.e., for each k, I_k ⊆ I for some I ∈ ℐ_k. This ensures the correctness of the above procedure.
By the above constructive analysis, we obtain an algorithm for model checking CTMCs against CLL formulas. Focusing on the decidability problem, we do not provide pseudocode for the algorithm. Instead, we implement a numerical experiment illustrating the checking procedure in the next section.
7 Numerical Implementation
In this section, we implement a case study of checking CTMCs against CLL formulas. We consider a symbolized CTMC SM = (S, Q, I), where M = (S, Q) is the CTMC of Example 1 and the finite set I is the one considered in Eq. (3). We check the properties of M given by the following two CLL path formulas mentioned in the introduction, for different initial distributions:
$$\varphi = \langle s_1,[0,0.1]\rangle\; U^{[0,5]}\; \langle s_0,[0.9,1]\rangle,$$
$$\varphi' = \langle s_0,[0.9,1]\rangle\; U^{[3,7]}\; (\langle s_1,[0,0.1]\rangle\; U^{[0,5]}\; \langle s_0,[0.9,1]\rangle).$$
By Jordan decomposition, we have Q = SJS^{−1}, where
$$S = \begin{pmatrix} 0 & 6 & 0 & 0 & 0 \\ 0 & -2 & 0 & 0 & 1 \\ 7 & 3 & 0 & 0 & 0 \\ -3 & -3 & 0 & 1 & 0 \\ -4 & -4 & 1 & 0 & 0 \end{pmatrix}, \quad J = \begin{pmatrix} -7 & 0 & 0 & 0 & 0 \\ 0 & -3 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}, \quad S^{-1} = \begin{pmatrix} -\frac{1}{14} & 0 & \frac{1}{7} & 0 & 0 \\ \frac{1}{6} & 0 & 0 & 0 & 0 \\ \frac{8}{21} & 0 & \frac{4}{7} & 0 & 1 \\ \frac{2}{7} & 0 & \frac{3}{7} & 1 & 0 \\ \frac{1}{3} & 1 & 0 & 0 & 0 \end{pmatrix}.$$
Then, we consider the initial distribution µ of Example 1 and obtain the value of e^{Qt}µ as follows:
$$e^{Qt}\mu = \begin{pmatrix} e^{-3t} & 0 & 0 & 0 & 0 \\ \frac{1}{3}(1 - e^{-3t}) & 1 & 0 & 0 & 0 \\ \frac{1}{2}(e^{-3t} - e^{-7t}) & 0 & e^{-7t} & 0 & 0 \\ \frac{3}{14}e^{-7t} - \frac{1}{2}e^{-3t} + \frac{2}{7} & 0 & -\frac{3}{7}e^{-7t} + \frac{3}{7} & 1 & 0 \\ \frac{2}{7}e^{-7t} - \frac{2}{3}e^{-3t} + \frac{8}{21} & 0 & -\frac{4}{7}e^{-7t} + \frac{4}{7} & 0 & 1 \end{pmatrix} \begin{pmatrix} 0.1 \\ 0.2 \\ 0.3 \\ 0.4 \\ 0 \end{pmatrix} = \begin{pmatrix} \frac{1}{10}e^{-3t} \\ -\frac{1}{30}e^{-3t} + \frac{7}{30} \\ \frac{1}{20}e^{-3t} + \frac{1}{4}e^{-7t} \\ -\frac{1}{20}e^{-3t} - \frac{3}{28}e^{-7t} + \frac{39}{70} \\ -\frac{1}{15}e^{-3t} - \frac{1}{7}e^{-7t} + \frac{22}{105} \end{pmatrix}.$$
As we only consider states s0 and s1 in the formulas φ and φ′, we focus on the following PEFs: f0(t) = (1/10)e^{−3t} and f1(t) = −(1/30)e^{−3t} + 7/30.
Next, we initialize the model checking procedure introduced in the proof of Theorem 2. First, we compute the set T of all maximum intervals T ⊆ [0,5] such that e^{Qt}µ |= ⟨s0,[0.9,1]⟩ for t ∈ T, i.e., f0(t) ∈ [0.9,1] for t ∈ T. We obtain T = ∅ by the real root isolation algorithm of Theorem 4, and this indicates that σ_µ ⊭ φ, where σ_µ(t) = e^{Qt}µ is the path induced by µ as defined in Eq. (1).
To check whether σ_µ |= φ′, we compute the set T of all maximum intervals T ⊆ [0,12] such that e^{Qt}µ |= ⟨s0,[0.9,1]⟩ for t ∈ T, i.e., f0(t) ∈ [0.9,1] for t ∈ T. Again, we obtain T = ∅ by the real root isolation algorithm of Theorem 4. Therefore, σ_µ ⊭ φ′.
In the following, we consider a different initial distribution µ1:
$$e^{Qt}\mu_1 = e^{Qt}\begin{pmatrix} 0.9 \\ 0 \\ 0.1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} \frac{9}{10}e^{-3t} \\ \frac{3}{10}(1 - e^{-3t}) \\ \frac{9}{20}e^{-3t} - \frac{7}{20}e^{-7t} \\ -\frac{9}{20}e^{-3t} + \frac{3}{20}e^{-7t} + \frac{3}{10} \\ -\frac{3}{5}e^{-3t} + \frac{1}{5}e^{-7t} + \frac{2}{5} \end{pmatrix}.$$
The key PEFs are g0(t) = (9/10)e^{−3t} and g1(t) = (3/10)(1 − e^{−3t}).
Again, we initialize the model checking procedure from the proof of Theorem 2. We first compute the set T of all maximum intervals T ⊆ [0,5] such that e^{Qt}µ1 |= ⟨s1,[0,0.1]⟩ for t ∈ T, i.e., g1(t) ∈ [0,0.1] for t ∈ T. This can be done by finding a real root isolation of the PEF g′1(t) = (3/10)(1 − e^{−3t}) − 1/10.
By implementing the real root isolation algorithm of Theorem 4, we obtain Iso(g′1)_{[0,5]} = {(0.13, 0.14)} and thus T = {[0, t*]} for some t* ∈ (0.13, 0.14). In the same way, we compute the corresponding set of maximum intervals for e^{Qt}µ1 |= ⟨s0,[0.9,1]⟩. We then complete the model checking procedure from the proof of Theorem 2 and conclude σ_{µ1} |= φ. Repeating these steps for the second formula φ′ yields σ_{µ1} |= φ′.
8 Related Works
Agrawal et al. [2] introduced probabilistic linear-time temporal logic (PLTL) to reason about discrete-time Markov chains in the context of distribution transformers, as we did for CTMCs in this paper. Interestingly, the Skolem Problem can be reduced to the model checking problem for the logic PLTL [3]. The Skolem Problem asks whether a given linear recurrence sequence has a zero term; it plays a vital role in the reachability analysis of linear dynamical systems. Unfortunately, the decidability of the problem remains open [32]. Recently, the Continuous Skolem Problem has been proposed and shown to behave well (it is decidable, subject to Schanuel's conjecture); it forms a fundamental decision problem concerning reachability in continuous-time linear dynamical systems [16]. Not surprisingly, the Continuous Skolem Problem can be reduced to model checking CLL. The primary step in verifying CLL formulas is to find a real root isolation of a PEF in a given interval. Chonev, Ouaknine and Worrell reformulated the Continuous Skolem Problem in terms of whether a PEF has a root in a given interval, which is decidable subject to Schanuel's conjecture [16]. An algorithm for finding a root isolation can also answer the problem of checking the existence of roots of a PEF; the converse, however, does not hold in general. Therefore, the decidability of the Continuous Skolem Problem cannot be applied to establish that of our CLL model checking.
Remark 1. By adopting the method of this paper, we established the decidability of model checking quantum CTMCs against signal temporal logic [40]. Again, Schanuel's conjecture is needed to guarantee correctness. A quantum CTMC is governed by a Lindblad master equation and is a more general real-time probabilistic Markov model than a CTMC; indeed, a CTMC is an instance of a quantum CTMC. We converted the evolution of the Lindblad master equation into a distribution transformer that preserves the laws of quantum mechanics, and reduced the model-checking problem of quantum CTMCs to the real root isolation problem considered in this paper, so our method can be applied to it.
9 Conclusion
This paper revisited the study of temporal properties of finite-state CTMCs by symbolizing the probability value space [0,1] into a finite set of intervals. To specify relative and absolute temporal properties, we proposed a probabilistic logic for CTMCs, namely continuous linear-time logic (CLL), and considered the model checking problem in this setting. Our main result is a state-of-the-art real root isolation algorithm over the field of algebraic numbers, which establishes the decidability of the model checking problem under the condition that Schanuel's conjecture holds.
This paper aims to show decidability in as simple a fashion as possible, without paying much attention to complexity issues. Faster algorithms for our current constructions would bring significant improvements from a practical standpoint.
Acknowledgments
We want to thank Professor Joost-Pieter Katoen for his invaluable feedback and
for pointing out the references [14,15,30]. This work is supported by the National
Key R&D Program of China (Grant No: 2018YFA0306701), the National Natural
Science Foundation of China (Grant No: 61832015), ARC Discovery Program
(#DP210102449) and ARC DECRA (#DE180100156).
References
1. Achatz, M., McCallum, S., Weispfenning, V.: Deciding polynomial-exponential
problems. In: Proceedings of the Twenty-first International Symposium on Sym-
bolic and Algebraic Computation. pp. 215–222. ACM (2008)
2. Agrawal, M., Akshay, S., Genest, B., Thiagarajan, P.: Approximate verification of
the symbolic dynamics of Markov chains. Journal of the ACM (JACM) 62(1), 2
(2015)
3. Akshay, S., Antonopoulos, T., Ouaknine, J., Worrell, J.: Reachability problems for
Markov chains. Information Processing Letters 115(2), 155–158 (2015)
4. Almagor, S., Kelmendi, E., Ouaknine, J., Worrell, J.: Invariants for continuous
linear dynamical systems. arXiv preprint arXiv:2004.11661 (2020)
5. Alur, R., Dill, D.L.: A theory of timed automata. Theoretical Computer Science
126, 183–235 (1994)
6. Alur, R., Henzinger, T.A., Vardi, M.Y.: Parametric real-time reasoning. In: Pro-
ceedings of the Twenty-fifth Annual ACM Symposium on Theory of Computing.
pp. 592–601 (1993)
7. Avellar, C.E., Hale, J.K.: On the zeros of exponential polynomials. Journal of
Mathematical Analysis and Applications 73(2), 434–452 (1980)
8. Aziz, A., Sanwal, K., Singhal, V., Brayton, R.: Model-checking continuous-time
Markov chains. ACM Transactions on Computational Logic 1(1), 162–170 (2000)
9. Baier, C., Haverkort, B., Hermanns, H., Katoen, J.P.: Model-checking algorithms
for continuous-time Markov chains. IEEE Transactions on Software Engineering
29(6), 524–541 (2003)
10. Baker, A.: Transcendental number theory. Cambridge university press (1990)
11. Barbot, B., Chen, T., Han, T., Katoen, J.P., Mereacre, A.: Efficient CTMC model
checking of linear real-time objectives. In: International Conference on Tools and
Algorithms for the Construction and Analysis of Systems. pp. 128–142. Springer
(2011)
12. Chen, T., Diciolla, M., Kwiatkowska, M., Mereacre, A.: Time-bounded verification
of CTMCs against real-time specifications. In: International Conference on Formal
Modeling and Analysis of Timed Systems. pp. 26–42. Springer (2011)
13. Chen, T., Han, T., Katoen, J.P., Mereacre, A.: Quantitative model checking of
continuous-time Markov chains against timed automata specifications. In: 2009
24th Annual IEEE Symposium on Logic In Computer Science. pp. 309–318. IEEE
(2009)
14. Chen, T., Han, T., Katoen, J.P., Mereacre, A.: Model checking of continuous-
time Markov chains against timed automata specifications. Logical Methods in
Computer Science 7(1) (Mar 2011)
15. Chen, T., Han, T., Katoen, J.P., Mereacre, A.: Observing continuous-time MDPs
by 1-clock timed automata. In: International Workshop on Reachability Problems.
pp. 2–25. Springer (2011)
16. Chonev, V., Ouaknine, J., Worrell, J.: On the Skolem Problem for continuous linear dynamical systems. In: Chatzigiannakis, I., Mitzenmacher, M., Rabani, Y., Sangiorgi, D. (eds.) 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016). Leibniz International Proceedings in Informatics (LIPIcs), vol. 55, pp. 100:1–100:13. Schloss Dagstuhl–Leibniz-Zentrum für Informatik, Dagstuhl, Germany (2016)
17. Cohen, H.: A course in computational algebraic number theory, vol. 138. Springer
Science & Business Media (2013)
18. Dehnert, C., Junges, S., Katoen, J.P., Volk, M.: A STORM is coming: A mod-
ern probabilistic model checker. In: International Conference on Computer Aided
Verification. pp. 592–600. Springer (2017)
19. Feng, Y., Katoen, J.P., Li, H., Xia, B., Zhan, N.: Monitoring CTMCs by multi-clock
timed automata. In: International Conference on Computer Aided Verification. pp.
507–526. Springer (2018)
20. Gan, T., Chen, M., Li, Y., Xia, B., Zhan, N.: Reachability analysis for solvable
dynamical systems. IEEE Transactions on Automatic Control 63(7), 2003–2018
(2017)
21. Guan, J., Yu, N.: A probabilistic logic for verifying continuous-time markov chains.
arXiv preprint arXiv:2004.08059 (2020)
22. Hansson, H., Jonsson, B.: A logic for reasoning about time and reliability. Formal
Aspects of Computing 6(5), 512–535 (1994)
23. Katoen, J.P.: The probabilistic model checking landscape. In: Proceedings of the
31st Annual ACM/IEEE Symposium on Logic in Computer Science. pp. 31–45.
ACM (2016)
24. Katoen, J.P., Zapreev, I.S., Hahn, E.M., Hermanns, H., Jansen, D.N.: The ins and
outs of the probabilistic model checker MRMC. Performance Evaluation 68(2),
90–104 (2011)
25. Kolmogoroff, A.: Über die analytischen Methoden in der Wahrscheinlichkeitsrechnung. Mathematische Annalen 104(1), 415–458 (1931)
26. Kwiatkowska, M., Norman, G., Parker, D.: PRISM: Probabilistic symbolic model
checker. In: International Conference on Modelling Techniques and Tools for Com-
puter Performance Evaluation. pp. 200–204. Springer (2002)
27. Lang, S.: Introduction to transcendental numbers. Addison-Wesley Pub. Co. (1966)
28. Li, J.C., Huang, C.C., Xu, M., Li, Z.B.: Positive root isolation for poly-powers. In:
Proceedings of the ACM on International Symposium on Symbolic and Algebraic
Computation. pp. 325–332. ACM (2016)
29. Macintyre, A., Wilkie, A.J.: On the decidability of the real exponential field (1996)
30. Majumdar, R., Salamati, M., Soudjani, S.: On decidability of time-bounded reachability in CTMDPs. In: Czumaj, A., Dawar, A., Merelli, E. (eds.) 47th International Colloquium on Automata, Languages, and Programming (ICALP 2020). Leibniz International Proceedings in Informatics (LIPIcs), vol. 168, pp. 133:1–133:19. Schloss Dagstuhl–Leibniz-Zentrum für Informatik, Dagstuhl, Germany (2020)
31. Nesterenko, Y.: Modular functions and transcendence problems. Comptes rendus de l'Académie des sciences, Série I, Mathématique 322(10), 909–914 (1996)
32. Ouaknine, J., Worrell, J.: Decision problems for linear recurrence sequences. In:
International Workshop on Reachability Problems. pp. 21–28. Springer (2012)
33. Richardson, D.: How to recognize zero. Journal of Symbolic Computation 24(6),
627–645 (1997)
34. Ritt, J.F.: On the zeros of exponential polynomials. Transactions of the American
Mathematical Society 31(4), 680–686 (1929)
35. Strzebonski, A.: Real root isolation for exp-log functions. In: Proceedings of the
Twenty-first International Symposium on Symbolic and Algebraic Computation.
pp. 303–314 (2008)
36. Strzebonski, A.: Real root isolation for tame elementary functions. In: Proceedings
of the 2009 International Symposium on Symbolic and Algebraic Computation.
pp. 341–350 (2009)
37. Terzo, G.: Some consequences of Schanuel's conjecture in exponential rings. Communications in Algebra 36(3), 1171–1189 (2008)
38. Tijdeman, R.: On the number of zeros of general exponential polynomials. In:
Indagationes Mathematicae (Proceedings). vol. 74, pp. 1–7. North-Holland (1971)
39. Xu, M., Deng, Y.: Time-bounded termination analysis for probabilistic programs
with delays. Information and Computation 275, 104634 (2020)
40. Xu, M., Mei, J., Guan, J., Yu, N.: Model checking quantum continuous-time Markov chains. In: Haddad, S., Varacca, D. (eds.) 32nd International Conference on Concurrency Theory (CONCUR 2021). Leibniz International Proceedings in Informatics (LIPIcs), vol. 203, pp. 13:1–13:17. Schloss Dagstuhl–Leibniz-Zentrum für Informatik, Dagstuhl, Germany (2021)
41. Zhang, L., Jansen, D.N., Nielson, F., Hermanns, H.: Automata-based CSL model
checking. In: International Colloquium on Automata, Languages, and Program-
ming. pp. 271–282. Springer (2011)
Under-Approximating Expected Total Rewards in POMDPs⋆
Alexander Bork, Joost-Pieter Katoen, and Tim Quatmann
RWTH Aachen University, Aachen, Germany
alexander.bork@cs.rwth-aachen.de
⋆ This work is funded by the DFG RTG 2236 “UnRAVeL”.
Abstract. We consider the problem: is the optimal expected total reward to reach a goal state in a partially observable Markov decision process (POMDP) below a given threshold? We tackle this (generally undecidable) problem by computing under-approximations of these total expected rewards. This is done by abstracting finite unfoldings of the infinite belief MDP of the POMDP. The key issue is to find a suitable under-approximation of the value function. We provide two techniques: a simple (cut-off) technique that uses a good policy on the POMDP, and a more advanced technique (belief clipping) that uses minimal shifts of probabilities between beliefs. We use mixed-integer linear programming (MILP) to find such minimal probability shifts and experimentally show that our techniques scale quite well while providing tight lower bounds on the expected total reward.
1 Introduction
The relevance of POMDPs. Partially observable Markov decision processes (POMDPs) originated in operations research and nowadays are a pivotal model for planning in AI [40]. They inherit all features of classical MDPs: each state has a set of discrete probability distributions over the states, and rewards are earned when taking transitions. However, states are not fully observable. Intuitively, certain aspects of the states can be identified, such as a state's colour, but the states themselves cannot be observed. This partial observability reflects, for example, a robot's view of its environment while only having the limited perspective of its sensors at its disposal. The main goal is to obtain a policy, i.e. a plan for resolving the non-determinism in the model, for a given objective. The key problem here is that POMDP policies must base their decisions only on the observable aspects (e.g. colours) of states. This stands in contrast to policies for MDPs, which can make decisions dependent on the entire history of full state information.
Analysing POMDPs. Typical POMDP planning problems consider either finite-horizon objectives or infinite-horizon objectives under discounting. Finite-horizon objectives focus on reaching a certain goal state (such as “the robot has collected all items”) within a given number of steps. For infinite horizons, no step bound is provided, and typically rewards along a run are weighted by a discounting factor that indicates how much immediate rewards are favoured over more distant ones. Existing techniques to treat these objectives include variations of value iteration [46,36,20,18,52,53] and policy trees [29]. Point-based techniques [38,42] approximate a POMDP's value function using a finite subset of beliefs which is iteratively updated. Algorithms include PBVI [38], Perseus [48], SARSOP [30] and HSVI [45]. Point-based methods can treat large POMDPs for both finite- and discounted infinite-horizon objectives [42].
Problem statement. In this paper we consider the problem: is the maximal expected total reward to reach a given goal state in a POMDP below a given threshold? We thus consider an infinite-horizon objective without discounting, also called an indefinite-horizon objective. A specific instance of the considered problem is the reachability probability to eventually reach a given goal state in a POMDP. This problem is undecidable [33,34] in general. Intuitively, this is due to the fact that POMDP policies need to consider the entire (infinite) observation history to make optimal decisions. For a POMDP, this notion is captured by an infinite, fully observable MDP, its belief MDP. This MDP is obtained from observation sequences inducing probabilities of being in certain states of the POMDP.
Previously proposed methods to solve the problem include approximate value iteration [22], optimisation and search techniques [1,12], dynamic programming [6], Monte Carlo simulation [43], game-based abstraction [51], and machine learning [13,14,19]. Other approaches restrict the memory size of the policies [35]. The synthesis of (possibly randomised) finite-memory policies is ETR-complete¹ [28]. Techniques to obtain finite-memory policies use e.g. parameter synthesis [28] or satisfiability checking and SMT solving [15,50].
Our approach. We tackle the aforementioned problem by computing under-approximations of maximal total expected rewards. This is done by considering finite unfoldings of the infinite belief MDP of the POMDP, and then applying abstraction. The key issue here is to find a suitable under-approximation of the POMDP's value function. We provide two techniques: a simple (cut-off) technique that uses a good policy on the POMDP, and a more advanced technique (belief clipping) that uses minimal shifts of probabilities between beliefs and can be applied on top of the simple approach. We use mixed-integer linear programming (MILP) to find such minimal probability shifts. Cut-off techniques for indefinite-horizon objectives have been used on computation trees, rather than on the belief MDP as used here, in Goal-HSVI [24]. Belief clipping amends the probabilities in a belief to be in a state of the POMDP, yielding discretised values, i.e. an abstraction of the probability range [0,1] is applied. Such grid-based approximations are inspired by Lovejoy's grid-based belief MDP discretisation method [32]. They have also been used in [7] in the context of dynamic programming for POMDPs, and to over-approximate the value function in model checking of POMDPs [8]. In fact, this paper on determining lower bounds for indefinite-horizon objectives can be seen as the dual counterpart of [8]. Our key challenge, compared to the approach of [8], is that the value at a certain belief cannot easily be under-approximated with a convex combination of values of nearby beliefs. On the other hand, an under-approximation can benefit from a “good” guess of some initial POMDP policy. In the context of [8], such a guessed policy is of limited use for over-approximating values in the POMDP induced by an optimal policy. Although our approach is applicable to all thresholds, the focus of our work is on determining under-approximations for quantitative objectives. Dedicated verification techniques for the qualitative setting (almost-sure reachability) are presented in [17,16,27].
¹ A decision problem is ETR-complete if it can be reduced to a polynomial-length sentence in the Existential Theory of the Reals (for which the satisfiability problem is decidable) in polynomial time, and there is such a reduction in the reverse direction.
Experimental results. We have implemented our cut-off and belief clipping approaches on top of the probabilistic model checker Storm [23] and applied them to a range of benchmarks. We provide a comparison with the model checking approach of [37], and determine the tightness of our under-approximations by comparing them to over-approximations obtained using the algorithm from [8]. Our main findings from the experimental validation are:
– Cut-offs often generate tight bounds while being computationally inexpensive.
– The clipping approach may further improve the accuracy of the approximation.
– Our implementation can deal with POMDPs with tens of thousands of states.
– Mostly, the obtained under-approximations are less than 10% off.
2 Preliminaries and Problem Statement
Let Dist(A) := {µ : A → [0,1] | ∑_{a∈A} µ(a) = 1} denote the set of probability distributions over a finite set A. The set supp(µ) := {a ∈ A | µ(a) > 0} is the support of µ ∈ Dist(A). Let R̄ := R ∪ {∞, −∞}. We use Iverson bracket notation, where [x] = 1 if the Boolean expression x is true and [x] = 0 otherwise.
2.1 Partially Observable MDPs
Definition 1 (MDP). A Markov decision process (MDP) is a tuple M = ⟨S, Act, P, s_init⟩ with a (finite or infinite) set of states S, a finite set of actions Act, a transition function P : S × Act × S → [0,1] with ∑_{s′∈S} P(s, α, s′) ∈ {0,1} for all s ∈ S and α ∈ Act, and an initial state s_init.
We fix an MDP M := ⟨S, Act, P, s_init⟩. For s ∈ S and α ∈ Act, let post^M(s, α) := {s′ ∈ S | P(s, α, s′) > 0} denote the set of α-successors of s in M. The set of enabled actions in s ∈ S is given by Act(s) := {α ∈ Act | post^M(s, α) ≠ ∅}.
Definition 2 (POMDP). A partially observable MDP (POMDP) is a tuple M = ⟨M, Z, O⟩, where M is the underlying MDP with |S| ∈ N, i.e. S is finite, Z is a finite set of observations, and O : S → Z is an observation function such that O(s) = O(s′) implies Act(s) = Act(s′) for all s, s′ ∈ S.
We fix a POMDP M := ⟨M, Z, O⟩ with underlying MDP M. We lift the notion of enabled actions to observations z ∈ Z by setting Act(z) := Act(s) for some s ∈ S with O(s) = z, which is valid since states with the same observation are required to have the same enabled actions. The notions defined for MDPs below also straightforwardly apply to POMDPs.
Remark 1. More general observation functions of the form O : S × Act → Dist(Z) can be encoded in this formalism by using a polynomially larger state space [16].
An infinite path through an MDP (and a POMDP) is a sequence π̃ = s0 α1 s1 α2 ... such that α_{i+1} ∈ Act(s_i) and s_{i+1} ∈ post^M(s_i, α_{i+1}) for all i ∈ N. A finite path is a finite prefix π̂ = s0 α1 ... α_n s_n of an infinite path π̃. For finite π̂, let last(π̂) := s_n and |π̂| := n. For infinite π̃, set |π̃| := ∞ and let π̃[i] denote the finite prefix of length i ∈ N. We denote the sets of finite and infinite paths in M by Paths^M_fin and Paths^M_inf, respectively, and let Paths^M := Paths^M_fin ∪ Paths^M_inf. Paths are lifted to the observation level by observation traces. The observation trace of a (finite or infinite) path π = s0 α1 s1 α2 ... ∈ Paths^M is O(π) := O(s0) α1 O(s1) α2 .... Two paths π, π′ ∈ Paths^M are observation-equivalent if O(π) = O(π′).
Policies resolve the non-determinism present in MDPs (and POMDPs). Given a finite path π̂, a policy determines the action to take at last(π̂).
Definition 3 (Policy). A policy for M is a function σ : Paths^M_fin → Dist(Act) such that for each path π̂ ∈ Paths^M_fin, supp(σ(π̂)) ⊆ Act(last(π̂)).
A policy σ is deterministic if |supp(σ(π̂))| = 1 for all π̂ ∈ Paths^M_fin; otherwise it is randomised. σ is memoryless if for all π̂, π̂′ ∈ Paths^M_fin we have last(π̂) = last(π̂′) implies σ(π̂) = σ(π̂′). σ is observation-based if for all π̂, π̂′ ∈ Paths^M_fin it holds that O(π̂) = O(π̂′) implies σ(π̂) = σ(π̂′). We denote the set of policies for M by Σ^M and the set of observation-based policies for M by Σ^M_obs. A finite-memory policy (fm-policy) can be represented by a finite automaton where the current memory state and the state of the MDP determine the actions to take [4].
The probability measure µ^{σ,s}_M for paths in M under policy σ and initial state s is the probability measure of the Markov chain induced by M, σ, and s [4].
We use reward structures to model quantities like time or energy consumption.
Definition 4 (Reward Structure). A reward structure for M is a function R : S × Act × S → R̄ such that either R(s, α, s′) ≥ 0 for all s, s′ ∈ S, α ∈ Act, or R(s, α, s′) ≤ 0 for all s, s′ ∈ S, α ∈ Act. In the former case, we call R positive, otherwise negative.
We fix a reward structure R for M. The total reward along a path π is defined as rew_{M,R}(π) := ∑_{i=1}^{|π|} R(s_{i−1}, α_i, s_i). The total reward is always well-defined, even if π is infinite, since all rewards are assumed to be either non-negative or non-positive. For an infinite path π̃, we define the total reward until reaching a set of goal states G ⊆ S by
$$\mathit{rew}_{M,R,\Diamond G}(\tilde\pi) := \begin{cases} \mathit{rew}_{M,R}(\hat\pi) & \text{if } \exists i \in \mathbb{N}:\ \hat\pi = \tilde\pi[i] \wedge \mathit{last}(\hat\pi) \in G \wedge \forall j < i:\ \mathit{last}(\tilde\pi[j]) \notin G, \\ \mathit{rew}_{M,R}(\tilde\pi) & \text{otherwise.} \end{cases}$$
Intuitively, rew_{M,R,◊G}(π̃) accumulates reward along π̃ until the first visit of a goal state s ∈ G. If no goal state is reached, reward is accumulated along the infinite path. The expected total reward until reaching G for policy σ and state s is
$$\mathit{ER}^{\sigma}_{M,R}(s \models \Diamond G) := \int_{\tilde\pi \in \mathit{Paths}^M_{\mathit{inf}}} \mathit{rew}_{M,R,\Diamond G}(\tilde\pi)\cdot \mu^{\sigma,s}_M(\mathrm{d}\tilde\pi).$$
Observation-based policies capture the notion that a decision procedure for a POMDP only accesses the observations and their history, not the entire state of the system. We are interested in reasoning about minimal and maximal values over all observation-based policies. For our explanations we focus on maximising (non-negative or non-positive) expected rewards; minimisation can be achieved by negating all rewards.
Definition 5 (Maximal Expected Total Reward). The maximal expected total reward until reaching G from s in POMDP M is
$$\mathit{ER}^{\max}_{\mathcal{M},R}(s \models \Diamond G) := \sup_{\sigma \in \Sigma^{\mathcal{M}}_{\mathit{obs}}} \mathit{ER}^{\sigma}_{\mathcal{M},R}(s \models \Diamond G).$$
We define ER^max_{M,R}(◊G) := ER^max_{M,R}(s_init |= ◊G).
The central problem of our work, the indefinite-horizon total reward problem, asks whether the maximal expected total reward until reaching a goal exceeds a given threshold.
Problem 1. Given a POMDP M, reward structure R, set of goal states G ⊆ S, and threshold λ ∈ R, decide whether ER^max_{M,R}(◊G) ≤ λ.
Example 1. Fig. 1 shows a POMDP M with three states and two observations: s0 and s1 share one observation, while s2 has the other. A reward of 1 is collected when transitioning from s1 to s2 via the β-action; all other rewards are zero. (Figure 1 depicts M: from s0, action α leads to s0 and s1 with probability 1/2 each, and β leads to s2 with reward 0; from s1, α is a self-loop and β leads to s2 with reward 1; s2 only has an α self-loop.)
The policy that always selects α at s0 and β at s1 maximizes the expected total reward to reach G = {s2}, but it is not observation-based. The observation-based policy that selects α for the first n ∈ N transition steps and selects β afterwards yields an expected total reward of 1 − (1/2)^n. With n → ∞ we obtain ER^max_{M,R}(◊{s2}) = 1.
As computing maximal expected rewards exactly in POMDPs is undecidable [34], we aim at under-approximating the actual value ER^max_{M,R}(◊G). This allows us to answer our problem negatively if the computed lower bound exceeds λ.
Remark 2. Expected rewards can be used to describe reachability probabilities by assigning reward 1 to all transitions entering G and reward 0 to all other transitions. Our approach can thus be used to obtain lower bounds on reachability probabilities in POMDPs. This also holds for almost-sure reachability (i.e. “is the reachability probability one?”), though dedicated methods like those presented in [17,16,27] are better suited for that setting.
2.2 Beliefs
The semantics of a POMDP M are captured by its (fully observable) belief MDP. The infinite state space of this MDP consists of beliefs [3,44]. A belief is a distribution over the states of the POMDP where each component describes the likelihood of being in a POMDP state given a history of observations. We denote the set of all beliefs for M by B_M := {b ∈ Dist(S) | ∀s, s′ ∈ supp(b) : O(s) = O(s′)} and write O(b) ∈ Z for the unique observation O(s) of all s ∈ supp(b).
The belief MDP of M is constructed by starting in the belief corresponding to the initial state and computing successor beliefs to unfold the MDP. Let P(s, α, z) := ∑_{s′∈S} [O(s′) = z] · P(s, α, s′) be the probability to observe z ∈ Z after taking action α in POMDP state s. Then, the probability to observe z after taking action α in belief b is P(b, α, z) := ∑_{s∈S} b(s) · P(s, α, z). We refer to ⟦b|α, z⟧ ∈ B_M, the belief after taking α in b conditioned on observing z, as the α-z-successor of b. If P(b, α, z) > 0, it is defined component-wise as
$$⟦b\,|\,α, z⟧(s) := \frac{[O(s) = z]\cdot \sum_{s'\in S} b(s')\cdot P(s', α, s)}{P(b, α, z)}$$
for all s ∈ S. Otherwise, ⟦b|α, z⟧ is undefined.
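The α-z-successor is easy to compute directly from its definition. The following is a small sketch (hypothetical signatures: P(s, α, s′) and O(s) are passed in as functions; beliefs are dictionaries mapping states to probabilities):

from typing import Callable, Dict, List, Optional

Belief = Dict[str, float]

def belief_successor(b: Belief, alpha: str, z: str,
                     P: Callable[[str, str, str], float],
                     O: Callable[[str], str],
                     states: List[str]) -> Optional[Belief]:
    """The alpha-z-successor [[b | alpha, z]]; None if it is undefined."""
    # P(b, alpha, z): probability of observing z after taking alpha in b.
    p_baz = sum(b[s] * P(s, alpha, s2)
                for s in b for s2 in states if O(s2) == z)
    if p_baz == 0:
        return None
    return {s2: sum(b[s] * P(s, alpha, s2) for s in b) / p_baz
            for s2 in states if O(s2) == z}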
Definition 6 (Belief MDP). The belief MDP of M is the MDP bel(M) = ⟨B_M, Act, P_B, b_init⟩, where B_M is the set of all beliefs in M, Act is as for M, b_init := {s_init ↦ 1} is the initial belief, and P_B : B_M × Act × B_M → [0,1] is the belief transition function with
$$P_B(b, α, b') := \begin{cases} P(b, α, z) & \text{if } b' = ⟦b\,|\,α, z⟧, \\ 0 & \text{otherwise.} \end{cases}$$
We lift a POMDP reward structure R to the belief MDP [25].
Definition 7 (Belief Reward Structure). For beliefs b, b′ ∈ B_M and action α ∈ Act, the belief reward structure R_B based on R associated with bel(M) is given by
$$R_B(b, α, b') := \frac{\sum_{s\in S} b(s)\cdot \sum_{s'\in S} [O(s') = O(b')]\cdot R(s, α, s')\cdot P(s, α, s')}{P(b, α, O(b'))}.$$
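The lifted reward can be computed alongside the successor belief; below is a sketch continuing the assumptions of the previous code fragment (the observation z of b′ is recovered from any state in its support):

def belief_reward(b: Belief, alpha: str, b_succ: Belief,
                  P: Callable[[str, str, str], float],
                  O: Callable[[str], str],
                  R: Callable[[str, str, str], float],
                  states: List[str]) -> float:
    """R_B(b, alpha, b') as in Def. 7: expected reward, conditioned on O(b')."""
    z = O(next(iter(b_succ)))  # the unique observation of b'
    p_baz = sum(b[s] * P(s, alpha, s2)
                for s in b for s2 in states if O(s2) == z)
    weighted = sum(b[s] * R(s, alpha, s2) * P(s, alpha, s2)
                   for s in b for s2 in states if O(s2) == z)
    return weighted / p_baz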
Given a set of goal states G ⊆ S, we assume, for simplicity, that there is a set of observations Z′ ⊆ Z such that s ∈ G iff O(s) ∈ Z′. This assumption can always be ensured by transforming the POMDP M; see the full technical report [10] for details. The set of goal beliefs for G is given by G_B := {b ∈ B_M | supp(b) ⊆ G}.
We now lift the computation of expected rewards to the belief level. Based on the well-known Bellman equations [5], the belief MDP induces a function that maps every belief to the expected total reward accumulated from that belief.
Definition 8 (POMDP Value Function). For b ∈ B_M, the n-step value function V_n : B_M → R of M is defined recursively as V_0(b) := 0 and
$$V_n(b) := [b \notin G_B]\cdot \max_{α\in Act} \sum_{b'\in post_{bel(\mathcal{M})}(b,α)} P_B(b, α, b')\cdot \big(R_B(b, α, b') + V_{n-1}(b')\big).$$
(Figure 2. Belief MDP bel(M) of the POMDP M from Fig. 1: starting from {s0 ↦ 1}, the α-action successively leads to the beliefs {s0 ↦ 1/2, s1 ↦ 1/2}, {s0 ↦ 1/4, s1 ↦ 3/4}, {s0 ↦ 1/8, s1 ↦ 7/8}, ..., while from each belief the β-action leads to {s2 ↦ 1} with rewards R_B = 0, 1/2, 3/4, 7/8, ..., respectively.)
The (optimal) value function V : B_M → R̄ is given by V(b) := lim_{n→∞} V_n(b). The n-step value function is piecewise linear and convex [44]. Thus, the optimal value function can be approximated arbitrarily closely by a piecewise linear convex function [47]. The value function yields expected total rewards in M and bel(M):
$$\mathit{ER}^{\max}_{\mathcal{M},R}(s \models \Diamond G) = \mathit{ER}^{\max}_{bel(\mathcal{M}),R_B}(\{s\mapsto 1\} \models \Diamond G_B) = V(\{s\mapsto 1\}).$$
Example 2. Fig. 2 shows a fragment of the belief MDP of the POMDP from Fig. 1. Observe that ER^max_{bel(M),R_B}(◊{s2 ↦ 1}) = 1.
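To make Def. 8 concrete on the running example, the n-step values can be computed directly on the reachable beliefs: writing p for the probability of s0, action α moves the belief from p to p/2 with reward 0, while β reaches the goal with expected reward 1 − p. This is a sketch assuming the transition structure described for Fig. 1:

def V(n: int, p: float) -> float:
    """n-step value V_n (Def. 8) at belief {s0 -> p, s1 -> 1-p}."""
    if n == 0:
        return 0.0
    return max(1.0 - p,          # beta: reach the goal, expected reward 1-p
               V(n - 1, p / 2))  # alpha: belief moves to p/2, reward 0

print([round(V(n, 1.0), 4) for n in range(1, 8)])
# [0.0, 0.5, 0.75, 0.875, ...] -> converges to V({s0 -> 1}) = 1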
We reformulate our problem statement to focus on the belief MDP.
Problem 2 (equivalent to Problem 1). For a POMDP M, reward structure R, goal states G ⊆ S, and threshold λ ∈ R, decide whether V({s_init ↦ 1}) ≤ λ.
As the belief MDP is fully observable, standard results for MDPs apply. However, an exhaustive analysis of bel(M) is intractable since the belief MDP is, in general, infinitely large².
3 Finite Exploration Under-Approximation
Instead of approximating values directly on the POMDP, we consider approximations of the corresponding belief MDP. The basic idea is to construct a finite abstraction of the belief MDP by unfolding parts of it and approximating values at beliefs we decide not to explore. In the resulting finite MDP, under-approximative expected reward values can be computed by standard model checking techniques. We present two approaches for abstraction: belief cut-offs and belief clipping. We incorporate those techniques into an algorithmic framework that yields arbitrarily tight under-approximations. The technical report [10] contains formal proofs of our claims.
² The set of all beliefs (i.e. the state space of bel(M)) is uncountable. The reachable fragment is countable, though, since each belief has at most |Z| many successors.
(Figure 3. Applying belief cut-offs to the belief MDP from Fig. 2: exploration stops at the cut-off belief b = {s0 ↦ 1/4, s1 ↦ 3/4}, which instead gets a cut-off transition with probability 1 and reward R′ = V̲(b) to the dedicated goal state b_cut.)
3.1 Belief Cut-Offs
The general idea of belief cut-offs is to stop exploring the belief MDP at certain beliefs, the cut-off beliefs, and to assume that a goal state is immediately reached while sub-optimal reward is collected. Similar techniques have been discussed in the context of fully observable MDPs and other model types [11,26,49,2]. Our work adapts the idea of cut-offs for POMDP over-approximations described in [8] to under-approximations. The main idea of belief cut-offs shares similarities with the SARSOP [30] and Goal-HSVI [24] approaches. While they apply cut-offs on the level of the computation tree, our approach directly manipulates the belief MDP to yield a finite model.
Let V̲ : B_M → R̄ with V̲(b) ≤ V(b) for all b ∈ B_M. We call V̲ an under-approximative value function and V̲(b) the cut-off value of b. In each of the cut-off beliefs b, instead of adding the regular transitions to its successors, we add a transition with probability 1 to a dedicated goal state b_cut. In the modified reward structure R′, this cut-off transition is assigned a reward³ of V̲(b), causing the value for a cut-off belief b in the modified MDP to coincide with V̲(b). Hence, the exact value of the cut-off belief, and thus the value of all other explored beliefs, is under-approximated.
Example 3. Fig. 3 shows the resulting finite MDP obtained when considering the belief MDP from Fig. 2 with the single cut-off belief b = {s0 ↦ 1/4, s1 ↦ 3/4}.
Computing cut-off values. The question of finding a suitable under-approximative value function V̲ is central to the cut-off approach. For an effective approximation, such a function should be easy to compute while still providing values close to the optimum. If we assume a positive reward structure, the constant value 0 is always a valid under-approximation. A more sophisticated approach is to compute suboptimal expected reward values for the states of the POMDP using some arbitrary, fixed observation-based policy σ ∈ Σ^M_obs. Let U^σ : S → R̄ be such that for all s ∈ S, U^σ(s) = ER^σ_{M,R}(s |= ◊G). Then, we define the function U^σ : B_M → R̄ as U^σ(b) := ∑_{s∈supp(b)} b(s) · U^σ(s).
³ We slightly deviate from Def. 4 by allowing transition rewards to be −∞ or +∞. Alternatively, we could introduce new sink states with a non-zero self-loop reward.
Lemma 1. U^σ is an under-approximative value function, i.e. for all b ∈ B_M:
$$U^\sigma(b) := \sum_{s\in supp(b)} b(s)\cdot U^\sigma(s) \le V(b).$$
Thus, finding a suitable under-approximative value function reduces to finding “good” policies for M, e.g. by using randomly guessed fm-policies, machine learning methods [13], or a transformation to a parametric model [28].
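Once state values U^σ(s) have been computed for some fixed observation-based policy σ (e.g. by standard Markov chain analysis), the cut-off value of Lemma 1 is just a belief-weighted sum, as this one-line sketch illustrates:

from typing import Dict

def cutoff_value(b: Dict[str, float], U_sigma: Dict[str, float]) -> float:
    """Under-approximative cut-off value U^sigma(b) from Lemma 1."""
    return sum(prob * U_sigma[s] for s, prob in b.items())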
3.2 Belief Clipping
The cut-off approach provides a universal way to construct an MDP which under-approximates the expected total reward value for a given POMDP. The quality of the approximation, however, is highly dependent on the under-approximative value function used. Furthermore, regions where the belief MDP only slowly converges towards a belief may pose problems in practice.
As a potential remedy for these problems, we propose a different concept called belief clipping. Intuitively, the procedure shifts some of the probability mass of a belief b in order to transform b into another belief b̃. We then connect b to b̃ in such a way that the accuracy of our approximation of the value V(b) depends only on the approximation of V(b̃) and the so-called clipping value, a notion of distance between b and b̃ that we discuss below. We can thus focus on exploring the successors of b̃ to obtain good approximations for both beliefs b and b̃.
Definition 9 (Belief Clip). For b ∈ B_M, we call µ : supp(b) → [0,1] a belief clip if ∀s ∈ supp(b) : µ(s) ≤ b(s) and |µ| := ∑_{s∈supp(b)} µ(s) < 1. The belief (b ⊖ µ) ∈ B_M induced by µ is defined by
$$\forall s \in supp(b):\quad (b \ominus µ)(s) := \frac{b(s) - µ(s)}{1 - |µ|}.$$
Intuitively, a belief clip µ for b describes for each s ∈ supp(b) the probability mass that is removed (“clipped away”) from b(s). The induced belief is obtained by normalising the resulting values so that they sum up to one.
Example 4. For the belief b = {s0 ↦ 1/4, s1 ↦ 3/4}, consider the two belief clips µ1 = {s0 ↦ 1/4, s1 ↦ 1/4} and µ2 = {s0 ↦ 1/4, s1 ↦ 0}. Both induce the same belief: (b ⊖ µ1) = (b ⊖ µ2) = {s0 ↦ 0, s1 ↦ 1}.
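Def. 9 translates directly into code. This sketch checks the belief-clip conditions and reproduces Example 4 (both clips inducing the same belief):

from typing import Dict

def induced_belief(b: Dict[str, float], mu: Dict[str, float]) -> Dict[str, float]:
    """The belief (b (-) mu) induced by a belief clip mu (Def. 9)."""
    assert all(0.0 <= mu.get(s, 0.0) <= b[s] for s in b), "mu(s) <= b(s) required"
    mass = sum(mu.values())
    assert mass < 1.0, "a belief clip must remove strictly less than all mass"
    return {s: (b[s] - mu.get(s, 0.0)) / (1.0 - mass) for s in b}

b = {"s0": 0.25, "s1": 0.75}
print(induced_belief(b, {"s0": 0.25, "s1": 0.25}))  # {'s0': 0.0, 's1': 1.0}
print(induced_belief(b, {"s0": 0.25}))              # the same belief (Example 4)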
We have supp((b ⊖ µ)) ⊆ supp(b), which also implies O((b ⊖ µ)) = O(b). Given some candidate belief b̃, consider the set of inducing belief clips:
$$C(b, \tilde b) := \{\, µ : supp(b) \to [0,1] \mid µ \text{ is a belief clip for } b \text{ with } \tilde b = (b \ominus µ) \,\}.$$
Belief b̃ is called an adequate clipping candidate for b iff C(b, b̃) ≠ ∅.
Definition 10 (Clipping Value). For b ∈ B_M and an adequate clipping candidate b̃, the clipping value is ∆_{b→b̃} := |δ_{b→b̃}|, where δ_{b→b̃} := arg min_{µ∈C(b,b̃)} |µ|. The values δ_{b→b̃}(s) for s ∈ supp(b) are the state clipping values.
(Figure 4. Applying belief clipping to the belief MDP from Fig. 2: the belief b = {s0 ↦ 1/4, s1 ↦ 3/4} is connected to the clipping candidate b̃ = {s0 ↦ 0, s1 ↦ 1} with probability 3/4 and reward 0, and to b_cut with probability 1/4; from b̃, the β-action reaches {s2 ↦ 1} with reward R′ = 1.)
Given a belief b and an adequate clipping candidate b̃, we outline how the notion of belief clipping is used to obtain valid under-approximations. We assume b ≠ b̃, implying 0 < ∆_{b→b̃} < 1. Instead of exploring all successors of b in bel(M), the approach is to add a transition from b to b̃. The newly added transition has probability 1 − ∆_{b→b̃} and is assigned a reward of 0. The remaining probability mass (i.e. ∆_{b→b̃}) leads to a designated goal state b_cut. To guarantee that, in general, the clipping procedure yields a valid under-approximation, we need to add a corrective reward value to the transition from b to b_cut. Let L : S → R̄ map each POMDP state to its minimal expected reward in the underlying, fully observable MDP M of M⁴, i.e. L(s) = ER^min_{M,R}(s |= ◊G). This function soundly under-approximates the state values which can be achieved by any observation-based policy; it can be generated using standard MDP analysis. Given the state clipping values δ_{b→b̃}(s) for s ∈ supp(b), the reward for the transition from b to b_cut is ∑_{s∈supp(b)} (δ_{b→b̃}(s)/∆_{b→b̃}) · L(s).
⁴ When rewards are negative, we might have L(s) = −∞ for many s ∈ S \ G, in which case the applicability of the clipping approach is very limited.
Example 5. For the belief MDP from Fig. 2, belief b = {s0 ↦ 1/4, s1 ↦ 3/4}, and clipping candidate b̃ = {s0 ↦ 0, s1 ↦ 1}, we get ∆_{b→b̃} = 1/4, as δ_{b→b̃} = µ2 = {s0 ↦ 1/4, s1 ↦ 0} with the belief clip µ2 as in Example 4. Furthermore, L(s0) = 0. The resulting MDP following our construction above is shown in Fig. 4.
The following lemma shows that the construction yields an under-approximation.
Lemma 2.
$$(1 - ∆_{b→\tilde b})\cdot V(\tilde b) + ∆_{b→\tilde b}\cdot \sum_{s\in supp(b)} \frac{δ_{b→\tilde b}(s)}{∆_{b→\tilde b}}\cdot L(s) \;\le\; V(b).$$
Proof (sketch). To gain some intuition, consider the special case where ∆_{b→b̃} = δ_{b→b̃}(s) = b(s) for some s ∈ supp(b). The clipping candidate b̃ can then be interpreted as the conditional probability distribution arising from distribution b given that s is not the current state. The value V(b) can be split into the sum of (i) the probability that s is not the current state times the reward accumulated from belief b̃, and (ii) the probability that s is the current state times the reward accumulated from s, i.e. from the belief {s ↦ 1}. However, for the two summands we must consider a policy that does not distinguish between the beliefs b, b̃, and {s ↦ 1} as well as their observation-equivalent successors. In other words, the same sequence of actions must be executed when the same observations are made. We consider such a policy that in addition is optimal at b̃, i.e. the reward accumulated from b̃ is equal to V(b̃). For the reward accumulated from {s ↦ 1}, L(s) provides a lower bound. Hence, (1 − b(s)) · V(b̃) + b(s) · L(s) is a lower bound for the reward accumulated from b. A formal proof is given in [10]. ⊓⊔
To find a suitable clipping candidate for a given belief b, we consider a finite candidate set B ⊆ B_M consisting of beliefs with observation O(b). These beliefs do not need to be reachable in the belief MDP. The set can be constructed, e.g., from already explored beliefs or from a fixed, discretised set of beliefs. We are interested in minimising the clipping value ∆_{b→b′} over all candidate beliefs b′ ∈ B. A naive approach is to explicitly compute the clipping values for all candidates. We instead use mixed-integer linear programming (MILP) [41]. An MILP is a system of linear inequalities (constraints) and a linear objective function over real-valued and integer variables. A feasible solution of the MILP is a variable assignment that satisfies all constraints; an optimal solution is a feasible solution that minimises the objective function.
Definition 11 (Belief Clipping MILP). The belief clipping MILP for belief b ∈ 𝓑_𝓜 and finite set of candidates B ⊆ {b′ ∈ 𝓑_𝓜 | O(b′) = O(b)} is given by:

minimise ∆ such that:
  Σ_{b′∈B} a_{b′} = 1                                       (select exactly one candidate b′)   (1)
  ∀ b′ ∈ B: a_{b′} ∈ {0, 1}                                                                     (2)
  Σ_{s∈supp(b)} δ_s = ∆                                     (compute clipping value for selected b′)   (3)
  ∀ s ∈ supp(b): δ_s ∈ [0, b(s)]                                                                (4)
  ∀ b′ ∈ B, s ∈ supp(b): δ_s ≥ b(s) − (1 − ∆) · b′(s) − (1 − a_{b′})                            (5)
The MILP consists of O(|supp(b)| + |B|) variables and O(|supp(b)| · |B|) constraints. For b′ ∈ B, the binary variable a_{b′} indicates whether b′ has been chosen as the clipping candidate. Moreover, we have variables δ_s for s ∈ supp(b) and a variable ∆ to represent the (state) clipping values for b and the chosen candidate b′. Constraints (1) and (2) enforce that exactly one of the a_{b′} variables is one, i.e. exactly one belief is chosen. Constraint (3) forces ∆ to be the sum of all state clipping values. The δ_s variables get a value between zero and b(s) (Constraint (4)). Constraint (5) only affects δ_s if the corresponding belief is chosen. Otherwise, a_{b′} is set to 0 and the value on the right-hand side becomes negative. If a belief b′ is chosen, the minimisation forces Constraint (5) to hold with equality as the right-hand side is greater or equal to 0. Assuming ∆ is set to a value below 1, we obtain valid clipping values as

  ∀ s ∈ supp(b): δ_s = b(s) − (1 − ∆) · b′(s)  ⟺  b′(s) = (b(s) − δ_s) / (1 − ∆).
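Since the implementation solves these MILPs with Gurobi (see Sect. 4), a sketch of how Def. 11 could be set up through Gurobi's C API may be helpful. The variable layout and names below are our own, error handling is omitted, and this is not the actual Storm code.

#include <gurobi_c.h>
#include <stdlib.h>

/* Belief clipping MILP of Def. 11: b[0..n-1] is the belief on supp(b),
   cand[j][0..n-1] for j < m are the candidates b'. Column layout: 0..m-1
   are the binary a_{b'}, m..m+n-1 the delta_s, column m+n is Delta. */
int solve_clipping_milp(int n, int m, const double *b, double *const *cand,
                        double *Delta, double *delta, int *chosen)
{
    GRBenv *env = NULL;
    GRBmodel *model = NULL;
    GRBloadenv(&env, NULL);
    GRBnewmodel(env, &model, "clip", 0, NULL, NULL, NULL, NULL, NULL);

    for (int j = 0; j < m; j++) /* a_{b'}: binary selectors, Constraint (2) */
        GRBaddvar(model, 0, NULL, NULL, 0.0, 0.0, 1.0, GRB_BINARY, NULL);
    for (int i = 0; i < n; i++) /* delta_s in [0, b(s)], Constraint (4) */
        GRBaddvar(model, 0, NULL, NULL, 0.0, 0.0, b[i], GRB_CONTINUOUS, NULL);
    /* Delta with objective coefficient 1; Gurobi minimises by default. */
    GRBaddvar(model, 0, NULL, NULL, 1.0, 0.0, 1.0, GRB_CONTINUOUS, NULL);
    GRBupdatemodel(model);

    int cap = (m > n + 1) ? m : n + 1;
    int *ind = malloc(cap * sizeof(int));
    double *val = malloc(cap * sizeof(double));

    for (int j = 0; j < m; j++) { ind[j] = j; val[j] = 1.0; }
    GRBaddconstr(model, m, ind, val, GRB_EQUAL, 1.0, NULL); /* (1) */

    for (int i = 0; i < n; i++) { ind[i] = m + i; val[i] = 1.0; }
    ind[n] = m + n; val[n] = -1.0; /* sum_s delta_s - Delta = 0, (3) */
    GRBaddconstr(model, n + 1, ind, val, GRB_EQUAL, 0.0, NULL);

    /* (5) rearranged: delta_s - b'(s)*Delta - a_{b'} >= b(s) - b'(s) - 1 */
    for (int j = 0; j < m; j++)
        for (int i = 0; i < n; i++) {
            ind[0] = m + i; val[0] = 1.0;
            ind[1] = m + n; val[1] = -cand[j][i];
            ind[2] = j;     val[2] = -1.0;
            GRBaddconstr(model, 3, ind, val, GRB_GREATER_EQUAL,
                         b[i] - cand[j][i] - 1.0, NULL);
        }

    GRBoptimize(model);
    GRBgetdblattr(model, GRB_DBL_ATTR_OBJVAL, Delta);
    GRBgetdblattrarray(model, GRB_DBL_ATTR_X, m, n, delta);
    *chosen = -1;
    for (int j = 0; j < m; j++) {
        double aj;
        GRBgetdblattrelement(model, GRB_DBL_ATTR_X, j, &aj);
        if (aj > 0.5) *chosen = j;
    }
    free(ind); free(val);
    GRBfreemodel(model);
    GRBfreeenv(env);
    return 0;
}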
Input: POMDP 𝓜 = ⟨M, Z, O⟩ with M = ⟨S, Act, P, s_init⟩, reward structure R, goal states G ⊆ S, under-approx. value function V, function L: S → ℝ with L(s) = ER^min_{M,R}(s ⊨ ◇G)
Output: Clipping belief MDP K_𝓜 and reward structure R_K

1   S_K ← {b_init, b_cut} with b_init = {s_init ↦ 1} and a new belief state b_cut
2   P_K(b_cut, cut, b_cut) ← 1, R_K(b_cut, cut, b_cut) ← 0              // add self-loop
3   Q ← {b_init}                                                       // initialise exploration set
4   while Q ≠ ∅ do
5       b ← chooseBelief(Q), Q ← Q \ {b}                               // pop next belief to explore from Q
6       if supp(b) ⊆ G then P_K(b, goal, b) ← 1, R_K(b, goal, b) ← 0   // add self-loop
7       else if exploreBelief(b) then                                  // expand b
8           foreach α ∈ Act(b) do                                      // using bel(𝓜) and R_B as in Defs. 6 and 7
9               foreach b′ ∈ post_{bel(𝓜)}(b, α) do
10                  P_K(b, α, b′) ← P_B(b, α, b′), R_K(b, α, b′) ← R_B(b, α, b′)
11                  if b′ ∉ S_K then S_K ← S_K ∪ {b′}, Q ← Q ∪ {b′}
12      else                                                           // apply cut-off and clipping to b
13          P_K(b, cut, b_cut) ← 1, R_K(b, cut, b_cut) ← V(b)          // add cut-off transition
14          choose a finite set B ⊆ 𝓑_𝓜 of clipping candidates for b
15          b̃, ∆_{b→b̃}, δ_{b→b̃} ← solveClippingMILP(b, B)
16          if b̃ ≠ b and b̃ is adequate then                           // clip b using b̃
17              P_K(b, clip, b̃) ← (1 − ∆_{b→b̃}), P_K(b, clip, b_cut) ← ∆_{b→b̃}
18              R_K(b, clip, b̃) ← 0, R_K(b, clip, b_cut) ← Σ_{s∈supp(b)} (δ_{b→b̃}(s)/∆_{b→b̃}) · L(s)
19              if b̃ ∉ S_K then S_K ← S_K ∪ {b̃}, Q ← Q ∪ {b̃}
20  return K_𝓜 = ⟨S_K, Act ⊎ {goal, cut, clip}, P_K, b_init⟩ and R_K

Algorithm 1: Belief exploration algorithm with cut-offs and clipping
A trivial solution of the MILP is always obtained by setting a_{b′} and ∆ to 1 and δ_s to b(s) for all s and an arbitrary b′ ∈ B. This corresponds to an invalid belief clip. However, as we minimise the value for ∆, we can conclude that no belief in the candidate set is adequate for clipping if ∆ is 1 in an optimal solution.
Theorem 1. An optimal solution to the belief clipping MILP for belief b and candidate set B sets a_{b̃} to 1 and ∆ to a value below 1 iff b̃ ∈ B is an adequate clipping candidate for b with minimal clipping value.
3.3 Algorithm

We incorporate belief cut-offs and belief clipping into an algorithmic framework outlined in Algorithm 1. As input, the algorithm takes an instance of Problems 1 and 2, i.e. a POMDP 𝓜 with reward structure R and goal states G. In addition, the algorithm considers an under-approximative value function V (Sect. 3.1) and a function L for the computation of corrective reward values (Sect. 3.2).
Lines 1 and 2 initialise the state set S_K of the under-approximative MDP K_𝓜 with the initial belief b_init and the designated goal state b_cut, which has only one transition to itself with reward 0. Furthermore, we initialise the exploration set Q by adding b_init (Line 3). During the computation, Q is used to keep track of all beliefs we still need to process. We then execute the exploration loop (Lines 4 to 19) until Q becomes empty. In each exploration step, a belief b is selected⁵
and removed from Q. There are three cases for the currently processed belief b.
If supp(b) ⊆ G, i.e. b is a goal belief, we add a self-loop with reward 0 to b and continue with the next belief (Line 6). b is not expanded as successors of goal beliefs will not influence the result of the computation.
If b is not a goal belief, we use a heuristic function⁶ exploreBelief to decide if b is expanded in Line 7. Lines 8 to 11 outline the expansion step. The transitions from b to its successor beliefs and the corresponding rewards as in the original belief MDP (see Sect. 2.2) are added. Furthermore, the successor beliefs that have not been encountered before are added to the set of states S_K and the exploration set Q.
If b is not expanded, we apply the cut-off approach and the clipping approach to b in Lines 12 to 19. In Line 13 we add a cut-off transition from b to b_cut with a new action cut. We use the given under-approximative value function V to compute the cut-off reward. Towards the clipping approach, a set of candidate beliefs is chosen and the belief clipping MILP for b and the candidate set is constructed as described in Def. 11 (Lines 14 and 15). If an adequate candidate b̃ with clipping values ∆_{b→b̃} and δ_{b→b̃}(s) for s ∈ supp(b) has been found, we add the transitions from b to b_cut and to b̃ using a new action clip and probabilities ∆_{b→b̃} and 1 − ∆_{b→b̃}, respectively. Furthermore, we equip the transitions with reward values as described in Sect. 3.2 using the given function L (Lines 16 to 18). If the clipping candidate b̃ has not been encountered before, we add it to the state space of the MDP and to the exploration set in Line 19.
The result of the algorithm is an MDP K_𝓜 with reward structure R_K. The set of states S_K of K_𝓜 contains all encountered beliefs. To guarantee termination of the algorithm, the decision heuristic exploreBelief has to stop exploring further beliefs at some point. Moreover, the handling of clipping candidates in Line 19 should not add new beliefs to Q infinitely often. We therefore fix a finite set of candidate beliefs B# ⊆ 𝓑_𝓜 and make sure that the candidate sets B in Line 14 satisfy (B \ S_K) ⊆ B#. To ensure a certain progress in the exploration, “clip-cycles”—i.e. paths of the form b₁ –clip→ ... –clip→ b_n –clip→ b₁—are avoided in K_𝓜. This can be done, e.g. by always expanding the candidate beliefs b ∈ B#.
Expected total rewards until reaching the extended set of goal beliefs G_cut := G_B ∪ {b_cut} in K_𝓜 under-approximate the values in the belief MDP:
Theorem 2. For all beliefs b ∈ S_K \ {b_cut} it holds that

  ER^max_{K_𝓜,R_K}(b ⊨ ◇G_cut) ≤ V(b) = ER^max_{bel(𝓜),R_B}(b ⊨ ◇G_B).

Corollary 1. ER^max_{K_𝓜,R_K}(◇G_cut) ≤ ER^max_{𝓜,R}(◇G).
⁵ For example, Q can be implemented as a FIFO queue.
⁶ The decision can be made for example by considering the size of the already explored state space such that the expansion is stopped if a size threshold has been reached. More involved decision heuristics are subject to further research.
Table 1. Results for benchmark POMDPs with maximisation objective. Each benchmark lists the property type φ and the numbers of states/state-action pairs/observations (S/Act/Z). The Prism entry gives the result, the computation time, and the resolution η; each Storm entry gives the obtained value, the computation time, and the number of states in the abstraction MDP K_𝓜. TO/MO indicate time-/memory-outs.

Drone 4-1 (Pmax; 1226/2954/384): Prism: TO/MO. Storm cut-off only: 0.79, <1 s, 3·10⁴; η=2: 0.79, 1360 s, 3·10⁴; η=3,4,6: TO. Over-approx.: 0.94.
Drone 4-2 (Pmax; 1226/2954/761): Prism: TO/MO. Cut-off only: 0.86, <1 s, 2·10⁴; η=2: 0.91, 249 s, 2·10⁴; η=3: 0.92, 1902 s, 2·10⁴; η=4,6: TO. Over-approx.: 0.97.
Grid-av 4-0 (Pmax; 17/59/4): Prism: [0.21, 1.0], 5.14 s, η=6. Cut-off only: 0.86, <1 s, 238; η=2: 0.93, <1 s, 312; η=3: 0.93, 1.77 s, 472; η=4: 0.93, 3.63 s, 663; η=6: 0.93, 13.9 s, 1300. Over-approx.: 0.98.
Grid-av 4-0.1 (Pmax; 17/59/4): Prism: [0.21, 1.0], 1.47 s, η=3. Cut-off only: 0.82, <1 s, 238; η=2: 0.85, 26.1 s, 317; η=3: 0.82, 198 s, 461; η=4: 0.85, 1913 s, 759; η=6: TO. Over-approx.: 0.99.
Netw-p 2-8-20 (Rmax; 2·10⁴/3·10⁴/4909): Prism: [557, 557], 2355 s, η=10. Cut-off only: 537, 2.3 s, 8·10⁴; η=2: 537, 98.5 s, 1·10⁵; η=3: 537, 320 s, 1·10⁵; η=4: 537, 651 s, 1·10⁵; η=6: 537, 2368 s, 1·10⁵. Over-approx.: 558.
Netw-p 3-8-20 (Rmax; 2·10⁵/3·10⁵/2·10⁴): Prism: TO/MO. Cut-off only: 769, 290 s, 1·10⁶; η=2: 769, 6640 s, 1·10⁶; η=3,4,6: TO. Over-approx.: 819.
Refuel 06 (Pmax; 208/565/50): Prism: [0.67, 0.72], 4625 s, η=3. Cut-off only: 0.67, <1 s, 4576; η=2: 0.67, 5.89 s, 4834; η=3: 0.67, 24.3 s, 5204; η=4: 0.67, 92 s, 5603; η=6: 0.67, 2076 s, 6135. Over-approx.: 0.69.
Refuel 08 (Pmax; 470/1431/66): Prism: TO/MO. Cut-off only: 0.45, <1 s, 2·10⁴; η=2: 0.45, 839 s, 2·10⁴; η=3,4,6: TO. Over-approx.: 0.51.
4 Experimental Evaluation
Implementation details. We integrated Algorithm 1 in the probabilistic model checker Storm [23] as an extension of the POMDP verification framework described in [8]. Inputs are a POMDP—encoded either explicitly or using an extension of the Prism language [37]—and a property specification. Internally, POMDPs and MDPs are represented using sparse matrices. The implementation supports minimisation⁷ and maximisation of reachability probabilities, reach-avoid probabilities (i.e. the probability to avoid a set of bad states until a set of goal states is reached), and expected total rewards. In a preprocessing step, functions V and L as considered in Algorithm 1 are generated. For V, we consider the function U^σ as in Lemma 1, where σ is a memoryless observation-based policy given by a heuristic⁸. For the function L, we apply standard MDP analysis on the underlying MDP. When exploring the abstraction MDP K_𝓜, our heuristic expands a belief iff |S_K| ≤ |S| · max_{z∈Z} |O⁻¹(z)|, where |S_K| is the number of already explored beliefs and |O⁻¹(z)| is the number of POMDP states with observation z. Belief clipping can either be disabled entirely, or we consider candidate sets B ⊆ B#_η, where B#_η := {b ∈ 𝓑_𝓜 | ∀ s ∈ S: b(s) ∈ {i/η | i ∈ ℕ, 0 ≤ i ≤ η}} forms a finite, regular grid of beliefs with resolution η ∈ ℕ \ {0}. Grid beliefs b ∈ B#_η are always expanded.
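Membership in the grid B#_η is a per-entry divisibility check; a small C sketch (the tolerance parameter and array convention are ours, not the tool's):

#include <math.h>
#include <stdbool.h>

/* Every entry of the belief must equal i/eta for some integer i, up to a
   tolerance that absorbs floating-point representation error. */
bool is_grid_belief(const double *b, int n, int eta, double tol)
{
    for (int i = 0; i < n; i++) {
        double scaled = b[i] * (double)eta;
        if (fabs(scaled - round(scaled)) > tol)
            return false;
    }
    return true;
}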
⁷ For minimisation, the under-approximation yields upper bounds.
⁸ The heuristic uses optimal values obtained on the fully observable underlying MDP.
Table 2. Results for benchmark POMDPs with minimisation objective (layout as in Table 1; for minimisation, the under-approximation yields upper bounds).

Grid 4-0.1 (Rmin; 17/62/3): Prism: [4.52, 4.7], 649 s, η=10. Cut-off only: 4.78, <1 s, 258; η=2: 4.78, 15.6 s, 255; η=3: 4.78, 148 s, 255; η=4: 4.78, 1940 s, 255; η=6: TO. Over-approx.: 4.52.
Grid 4-0.3 (Rmin; 17/62/3): Prism: [6.12, 6.31], 1077 s, η=10. Cut-off only: 6.56, <1 s, 255; η=2: 6.56, 15.8 s, 256; η=3: 6.56, 148 s, 256; η=4: 6.56, 1983 s, 256; η=6: TO. Over-approx.: 6.08.
Maze2 0.1 (Rmin; 15/54/8): Prism: [6.32, 6.32], 1.79 s, η=10. Cut-off only: 6.34, <1 s, 91; η=2: 6.34, <1 s, 90; η=3: 6.34, <1 s, 90; η=4: 6.34, <1 s, 90; η=6: 6.34, 2.02 s, 90. Over-approx.: 6.32.
Netw 2-8-20 (Rmin; 4589/6973/1173): Prism: [3.17, 3.2], 211 s, η=10. Cut-off only: 6.56, <1 s, 2·10⁴; η=2: 6.56, 5.31 s, 2·10⁴; η=3: 6.56, 17.2 s, 2·10⁴; η=4: 6.56, 42.3 s, 3·10⁴; η=6: 6.56, 167 s, 3·10⁴. Over-approx.: 3.14.
Netw 3-8-20 (Rmin; 2·10⁴/3·10⁴/2205): Prism: [5.61, 6.79], 7133 s, η=6. Cut-off only: 11.9, 3.51 s, 1·10⁵; η=2: 11.9, 214 s, 2·10⁵; η=3: 11.9, 1372 s, 2·10⁵; η=4: 11.9, 4910 s, 2·10⁵; η=6: TO. Over-approx.: 6.13.
Rocks 12 (Rmin; 6553/3·10⁴/1645): Prism: TO/MO. Cut-off only: 38, 1.39 s, 3·10⁴; η=2: 38, 61.1 s, 3·10⁴; η=3: 38, 138 s, 3·10⁴; η=4: 20, 230 s, 5·10⁴; η=6: 21, 532 s, 6·10⁴. Over-approx.: 20.
Rocks 16 (Rmin; 1·10⁴/5·10⁴/2761): Prism: TO/MO. Cut-off only: 44, 3.85 s, 4·10⁴; η=2: 44, 114 s, 4·10⁴; η=3: 44, 230 s, 4·10⁴; η=4: 26, 399 s, 6·10⁴; η=6: 27, 1062 s, 1·10⁵. Over-approx.: 26.
Furthermore, we exclude clipping candidates b̃ with δ_{b→b̃}(s) > 0 for some s with L(s) = −∞; clipping with such candidates is not useful as it induces a value of −∞. Expected total rewards on fully observable MDPs are computed using Sound Value Iteration [39] with relative precision 10⁻⁶. MILPs are solved using Gurobi [21].
Set-up. We evaluate our under-approximation approach with cut-offs only and with enabled belief clipping procedure using grid resolutions η = 2, 3, 4, 6. We consider the same POMDP benchmarks⁹ as in [37, 8]. The POMDPs are scalable versions of case studies stemming from various application domains. To establish an external baseline, we compare with the approach of [37] implemented in Prism [31]. Prism generates an under-approximation based on an optimal policy for an over-approximative MDP which—in contrast to Storm—means that always both under- and over-approximations have to be computed. We ran Prism with resolutions η = 2, 3, 4, 6, 8, 10 and report on the best approximation obtained. To provide a further reference for the tightness of our under-approximation, we compute over-approximative bounds as in [8] using the implementation in Storm with a resolution of η = 8. All experiments were run on an Intel® Xeon® Platinum 8160 CPU using 4 threads¹⁰, 64 GB RAM, and a time limit of 2 hours.
⁹ Instances with a finite belief MDP that would be fully explored by our algorithm are omitted since the exact value can be obtained without approximation techniques.
¹⁰ For our implementation, only Gurobi runs multi-threaded. Prism uses multiple threads for garbage collection.

[Figure 5. Accuracy for Drone 4-2 with different sizes of the approximation MDP K_𝓜: the under-approximation value Pr(◇G) (y-axis, 0.5 to 1) plotted against the number of explored beliefs |S_K| (x-axis, 0 to 30,000), for the cut-off approach and clipping with η = 2.]

Results. Tables 1 and 2 show our results for maximising and minimising properties, respectively. The first columns contain for each POMDP the benchmark name,
model parameters, property type (probabilities (P) or rewards (R)), and the numbers of states, state-action pairs, and observations. Column Prism gives the result with the smallest gap between over- and under-approximation computed with the approach of [37]. For maximising (minimising) properties, our approach competes with the lower (upper) bound of the provided interval. We also provide the computation time and the considered resolution η. For our implementation, we give results for the configuration with disabled clipping and for clipping with different resolutions η. In each cell, we give the obtained value, the computation time, and the number of states in the abstraction MDP K_𝓜. Time- and memory-outs are indicated by TO and MO. The right-most column indicates the over-approximation value computed via [8].
Discussion. The pure cut-off approach yields valid under-approximations in all benchmark instances—often exceeding the accuracy of the approach of [37] while being consistently faster. In some cases, the resulting values improve when clipping is enabled. However, larger candidate sets significantly increase the computation time, which stems from the fact that many clipping MILPs have to be solved.
For Drone 4-2, Fig. 5 plots the resulting under-approximation values (y-axis) for varying sizes of the explored MDP K_𝓜 (x-axis). The horizontal, dashed line indicates the computed over-approximation value. The quality of the approximation further improves with an increased number of explored beliefs.
5 Conclusion
We presented techniques to safely under-approximate expected total rewards in
POMDPs. The approach scales to large POMDPs and often produces tight lower
bounds. Belief clipping generally does not improve on the simpler cut-off approach
in terms of results and performance. However, considering—and optimising—the
approach for particular classes of POMDPs might prove beneficial. Future work
includes integrating the algorithm into a refinement loop that also considers
over-approximation techniques from [8]. Furthermore, lifting our approach to
partially observable stochastic games is promising.
Data Availability. The artifact [9] accompanying this paper contains source code,
benchmark files, and replication scripts for our experiments.
References

1. Amato, C., Bernstein, D.S., Zilberstein, S.: Optimizing fixed-size stochastic controllers for POMDPs and decentralized POMDPs. Auton. Agents Multi Agent Syst. 21(3), 293–320 (2010)
2. Ashok, P., Butkova, Y., Hermanns, H., Křetínský, J.: Continuous-time Markov decisions based on partial exploration. In: ATVA. Lecture Notes in Computer Science, vol. 11138, pp. 317–334. Springer (2018)
3. Åström, K.J.: Optimal control of Markov processes with incomplete state information. J. of Mathematical Analysis and Applications 10(1), 174–205 (1965)
4. Baier, C., Katoen, J.P.: Principles of Model Checking. MIT Press (2008)
5. Bellman, R.: A Markovian decision process. Journal of Mathematics and Mechanics 6, 679–684 (1957)
6. Bonet, B.: Solving large POMDPs using real time dynamic programming. In: AAAI Fall Symp. on POMDPs (1998)
7. Bonet, B., Geffner, H.: Solving POMDPs: RTDP-Bel vs. point-based algorithms. In: IJCAI. pp. 1641–1646 (2009)
8. Bork, A., Junges, S., Katoen, J., Quatmann, T.: Verification of indefinite-horizon POMDPs. In: ATVA. Lecture Notes in Computer Science, vol. 12302, pp. 288–304. Springer (2020)
9. Bork, A., Katoen, J.P., Quatmann, T.: Artifact for Paper: Under-Approximating Expected Total Rewards in POMDPs. Zenodo (2022). https://doi.org/10.5281/zenodo.5643643
10. Bork, A., Katoen, J.P., Quatmann, T.: Under-Approximating Expected Total Rewards in POMDPs. arXiv e-print (2022), https://arxiv.org/abs/2201.08772
11. Brázdil, T., Chatterjee, K., Chmelik, M., Forejt, V., Křetínský, J., Kwiatkowska, M., Parker, D., Ujma, M.: Verification of Markov decision processes using learning algorithms. In: ATVA. Lecture Notes in Computer Science, vol. 8837, pp. 98–114. Springer (2014)
12. Braziunas, D., Boutilier, C.: Stochastic local search for POMDP controllers. In: AAAI. pp. 690–696. AAAI Press / The MIT Press (2004)
13. Carr, S., Jansen, N., Topcu, U.: Verifiable RNN-based policies for POMDPs under temporal logic constraints. In: IJCAI. pp. 4121–4127. ijcai.org (2020)
14. Carr, S., Jansen, N., Wimmer, R., Serban, A.C., Becker, B., Topcu, U.: Counterexample-guided strategy improvement for POMDPs using recurrent neural networks. In: IJCAI. pp. 5532–5539. ijcai.org (2019)
15. Chatterjee, K., Chmelík, M., Davies, J.: A symbolic SAT-based algorithm for almost-sure reachability with small strategies in POMDPs. In: AAAI. pp. 3225–3232 (2016)
16. Chatterjee, K., Chmelík, M., Gupta, R., Kanodia, A.: Optimal cost almost-sure reachability in POMDPs. Artificial Intelligence 234, 26–48 (2016)
17. Chatterjee, K., Doyen, L., Henzinger, T.A.: Qualitative analysis of partially-observable Markov decision processes. In: MFCS. Lecture Notes in Computer Science, vol. 6281, pp. 258–269. Springer (2010)
18. Cheng, H.T.: Algorithms for partially observable Markov decision processes. Ph.D. thesis, University of British Columbia (1988)
19. Doshi, F., Pineau, J., Roy, N.: Reinforcement learning with limited reinforcement: Using Bayes risk for active learning in POMDPs. In: ICML. pp. 256–263 (2008)
20. Eagle, J.N.: The optimal search for a moving target when the search path is constrained. Operations Research 32(5), 1107–1115 (1984)
21. Gurobi Optimization, LLC: Gurobi Optimizer Reference Manual (2021), https://www.gurobi.com
22. Hauskrecht, M.: Value-function approximations for partially observable Markov decision processes. J. Artif. Intell. Res. 13, 33–94 (2000)
23. Hensel, C., Junges, S., Katoen, J., Quatmann, T., Volk, M.: The probabilistic model checker Storm. Int. J. on Software Tools for Technology Transfer (2021). https://doi.org/10.1007/s10009-021-00633-z
24. Horák, K., Bošanský, B., Chatterjee, K.: Goal-HSVI: Heuristic search value iteration for goal POMDPs. In: IJCAI. pp. 4764–4770. ijcai.org (2018)
25. Itoh, H., Nakamura, K.: Partially observable Markov decision processes with imprecise parameters. Artificial Intelligence 171(8-9), 453–490 (2007)
26. Jansen, N., Dehnert, C., Kaminski, B.L., Katoen, J., Westhofen, L.: Bounded model checking for probabilistic programs. In: ATVA. Lecture Notes in Computer Science, vol. 9938, pp. 68–85 (2016)
27. Junges, S., Jansen, N., Seshia, S.A.: Enforcing almost-sure reachability in POMDPs. In: CAV (2). Lecture Notes in Computer Science, vol. 12760, pp. 602–625. Springer (2021)
28. Junges, S., Jansen, N., Wimmer, R., Quatmann, T., Winterer, L., Katoen, J.P., Becker, B.: Finite-state controllers of POMDPs via parameter synthesis. In: UAI. pp. 519–529. AUAI Press (2018)
29. Kaelbling, L.P., Littman, M.L., Cassandra, A.R.: Planning and acting in partially observable stochastic domains. Artificial Intelligence 101(1-2), 99–134 (1998)
30. Kurniawati, H., Hsu, D., Lee, W.S.: SARSOP: Efficient point-based POMDP planning by approximating optimally reachable belief spaces. In: Robotics: Science and Systems. vol. 2008 (2008)
31. Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: Verification of probabilistic real-time systems. In: CAV. Lecture Notes in Computer Science, vol. 6806, pp. 585–591. Springer (2011)
32. Lovejoy, W.S.: Computationally feasible bounds for partially observed Markov decision processes. Operations Research 39(1), 162–175 (1991)
33. Madani, O., Hanks, S., Condon, A.: On the undecidability of probabilistic planning and infinite-horizon partially observable Markov decision problems. In: AAAI/IAAI. pp. 541–548 (1999)
34. Madani, O., Hanks, S., Condon, A.: On the undecidability of probabilistic planning and related stochastic optimization problems. Artificial Intelligence 147(1-2), 5–34 (2003)
35. Meuleau, N., Kim, K.E., Kaelbling, L.P., Cassandra, A.R.: Solving POMDPs by searching the space of finite policies. In: UAI. pp. 417–426 (1999)
36. Monahan, G.E.: State of the art—a survey of partially observable Markov decision processes: theory, models, and algorithms. Management Science 28(1), 1–16 (1982)
37. Norman, G., Parker, D., Zou, X.: Verification and control of partially observable probabilistic systems. Real-Time Systems 53(3), 354–402 (2017)
38. Pineau, J., Gordon, G., Thrun, S.: Point-based value iteration: An anytime algorithm for POMDPs. In: IJCAI. vol. 3, pp. 1025–1032 (2003)
39. Quatmann, T., Katoen, J.: Sound value iteration. In: CAV (1). Lecture Notes in Computer Science, vol. 10981, pp. 643–661. Springer (2018)
40. Russell, S.J., Norvig, P.: Artificial Intelligence: A Modern Approach (4th Edition). Pearson (2020)
41. Schrijver, A.: Theory of Linear and Integer Programming. John Wiley & Sons (1986)
42. Shani, G., Pineau, J., Kaplow, R.: A survey of point-based POMDP solvers. Autonomous Agents and Multi-Agent Systems 27(1), 1–51 (2013)
43. Silver, D., Veness, J.: Monte-Carlo planning in large POMDPs. In: NIPS. pp. 2164–2172 (2010)
44. Smallwood, R.D., Sondik, E.J.: The optimal control of partially observable Markov processes over a finite horizon. Operations Research 21(5), 1071–1088 (1973)
45. Smith, T., Simmons, R.: Heuristic search value iteration for POMDPs. In: UAI. pp. 520–527 (2004)
46. Sondik, E.J.: The Optimal Control of Partially Observable Markov Processes. Ph.D. thesis, Stanford Univ Calif Stanford Electronics Labs (1971)
47. Sondik, E.J.: The optimal control of partially observable Markov processes over the infinite horizon: Discounted costs. Operations Research 26(2), 282–304 (1978)
48. Spaan, M.T., Vlassis, N.: Perseus: Randomized point-based value iteration for POMDPs. J. of Artificial Intelligence Research 24, 195–220 (2005)
49. Volk, M., Junges, S., Katoen, J.P.: Fast dynamic fault tree analysis by model checking techniques. IEEE Transactions on Industrial Informatics 14(1), 370–379 (2017)
50. Wang, Y., Chaudhuri, S., Kavraki, L.E.: Bounded policy synthesis for POMDPs with safe-reachability objectives. In: AAMAS. pp. 238–246 (2018)
51. Winterer, L., Junges, S., Wimmer, R., Jansen, N., Topcu, U., Katoen, J.P., Becker, B.: Motion planning under partial observability using game-based abstraction. In: CDC. pp. 2201–2208. IEEE (2017)
52. Zhang, N.L., Lee, S.S.: Planning with partially observable Markov decision processes: advances in exact solution method. In: UAI. pp. 523–530 (1998)
53. Zhang, N.L., Zhang, W.: Speeding up the convergence of value iteration in partially observable Markov decision processes. Journal of Artificial Intelligence Research 14, 29–51 (2001)
Open Access
This chapter is licensed under the terms of the Creative Commons
Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/),
which permits use, sharing, adaptation, distribution and reproduction in any medium or
format, as long as you give appropriate credit to the original author(s) and the source,
provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s
Creative Commons license, unless indicated otherwise in a credit line to the material. If
material is not included in the chapter’s Creative Commons license and your intended
use is not permitted by statutory regulation or exceeds the permitted use, you will need
to obtain permission directly from the copyright holder.
Correct Probabilistic Model Checking with Floating-Point Arithmetic⋆
Arnd Hartmanns
University of Twente, Enschede, The Netherlands
a.hartmanns@utwente.nl
Abstract. Probabilistic model checking computes probabilities and expected values related to designated behaviours of interest in Markov models. As a formal verification approach, it is applied to critical systems; thus we trust that probabilistic model checkers deliver correct results. To achieve scalability and performance, however, these tools use finite-precision floating-point numbers to represent and calculate probabilities and other values. As a consequence, their results are affected by rounding errors that may accumulate and interact in hard-to-predict ways. In this paper, we show how to implement fast and correct probabilistic model checking by exploiting the ability of current hardware to control the direction of rounding in floating-point calculations. We outline the complications in achieving correct rounding from higher-level programming languages, describe our implementation as part of the Modest Toolset’s mcsta model checker, and exemplify the trade-offs between performance and correctness in an extensive experimental evaluation across different operating systems and CPU architectures.
1 Introduction

Given a Markov chain or Markov decision process (MDP [25]) model of a safety- or performance-critical system, probabilistic model checking (PMC) calculates quantitative properties of interest: the probability of (rare or catastrophic) failures, the expected recovery time after service interruption, or the long-run average throughput. These properties involve probabilities or expected costs/rewards of sets of model behaviours, and are often specified in a temporal logic like PCTL [16]. As a formal verification approach, users place great trust in the results delivered by a PMC tool such as Prism [22], Storm [9], ePMC [15], or the Modest Toolset’s [18] mcsta. In contrast to classical model checkers for functional, Boolean-valued properties specified in e.g. LTL or CTL [2], a probabilistic model checker is inherently quantitative: the input model contains real-valued probabilities and costs/rewards; PCTL makes comparisons between real-valued constants and probabilities; the most efficient algorithms numerically iterate towards a fixpoint; and the final result itself may well be a real number.
⋆ This work was supported by NWO VENI grant no. 639.021.754 and the EU’s Horizon 2020 research and innovation programme under MSCA grant agreement 101008233.
© The Author(s) 2022. D. Fisman and G. Rosu (Eds.): TACAS 2022, LNCS 13244, pp. 41–59, 2022. https://doi.org/10.1007/978-3-030-99527-0_3
Often, we can restrict to rationals, which simplifies the theory and facilitates “exact” algorithms using arbitrary-precision rational number datatypes. These algorithms only work for small models (as shown in the most recent QComp 2020 competition of quantitative verification tools [6]). In this paper, we thus focus on the PMC techniques that scale to large problems: those building upon iterative numerical algorithms, in particular value iteration (VI) [8]. We restrict to probabilistic reachability, i.e. calculating the probability to eventually reach a goal state, as this is the core problem in PMC for MDP. Embedded in the usual recursive CTL algorithm, it allows us to check any (unbounded) PCTL formula.
Starting from a trivial underapproximation of the reachability probability for each state of the model, VI iteratively improves the value of each state based on its successors’ values. The true reachability probabilities are the least fixpoint of this procedure, towards which the algorithm converges. For roughly
a decade, PMC tools implemented VI by stopping once the relative or absolute difference between subsequent iterations was below a threshold ε. Haddad and Monmege [12] showed in 2014¹ that this does not guarantee a difference of ε between the reported and the true probability, putting in question the trust placed in PMC tools. Then variants of VI were developed that provide sound, i.e. ε-correct, results: interval iteration (II) [3,5,13], sound value iteration (SVI) [26], and optimistic value iteration (OVI) [19]. We focus on II as the prototypical sound algorithm. It additionally iterates on an overapproximation; its stopping criterion is the difference between over- and underapproximation being at most ε.
If all probabilities in an MDP are rational numbers, then the true reachability probability as well as all intermediate values in II are rational, too. Yet implementing II with arbitrary-precision rationals is impractical since the smaller-and-smaller differences between intermediate values end up using excessive computation time and memory. II is thus implemented with fixed-precision (usually 64-bit IEEE 754 double precision) floating-point numbers. These, however, cannot represent all rationals, so operations must round to nearby representable values. Although II is numerically benign, consisting only of multiplications and additions within [0,1], the default round to nearest, ties to even policy can cause II to deliver incorrect results. Wimmer et al. [29] show an example where PMC tools incorrectly state that a simple PCTL property is satisfied by a small Markov chain due to the underlying numeric difference having disappeared in rounding.
We confirmed with current versions of Prism, Storm, and mcsta that the problem persists to today, even when requesting a “sound” algorithm like II. Wimmer et al. propose interval arithmetic to avoid such problems, cautioning that
[...] the memory consumption will roughly double, since two numbers for
the interval bounds have to be stored [...]. The runtime will be higher by
a small factor, because we need to derive lower and upper bounds for the
intervals, requiring two model checking runs per sub-formula. [29, p. 5]
They did not provide an implementation, and we are not aware of any to date.
¹ Wimmer et al. [29] already in 2008 mention this problem in a more general setting, but neither give a concrete counterexample nor propose a solution tailored to PMC.
Our contribution. We present the first PMC implementation that computes correct lower and upper bounds on reachability probabilities despite using floating-point arithmetic. We benefit from two developments since Wimmer et al.’s paper of 2008: First, II (published 2014) already uses intervals (though not as Wimmer et al. envisioned), necessarily doubling memory consumption compared to VI (as do SVI and OVI, so it appears an unavoidable cost of soundness). In place of “two model checking runs per sub-formula”, we can make the two interleaved computations inside II safe w.r.t. rounding. Second, hardware and programming language support for controlling the rounding direction in floating-point operations has improved, in particular with the AVX-512 instruction set in the newest x86-64 CPUs and widespread compiler support for C99’s “floating-point environment” header fenv.h. Nevertheless, it is nontrivial to achieve runtime that is only “higher by a small factor”. For the analysis of probabilistic systems, the only related use of safe rounding we are aware of is in the SSMT tool SiSAT [27].
Structure. We recap PMC and II (Sect. 2) as well as problems and solutions related to rounding in floating-point arithmetic in Sect. 3. We then present our new approach in Sect. 4, including important implementation aspects. The performance of our approach is crucial to its adoption in tools; thus in Sect. 5 we report on extensive experiments across different software and hardware configurations on models from the Quantitative Verification Benchmark Set (QVBS) [20].
2 Probabilistic Model Checking

We write {x₁ ↦ y₁, ...} to denote the function that maps all xᵢ to yᵢ. Given a set S, its powerset is 2^S. A (discrete) probability distribution over S is a function µ: S → [0, 1] with countable support spt(µ) := {s ∈ S | µ(s) > 0} and Σ_{s∈spt(µ)} µ(s) = 1. Dist(S) is the set of all probability distributions over S. If µ(s) ∈ ℚ for all s ∈ S, we call µ a rational probability distribution, in Dist_ℚ(S).
Markov decision processes (MDP) [25] combine the nondeterminism of Kripke structures with the finite random choices of discrete-time Markov chains (DTMC).

Definition 1. A Markov decision process (MDP) is a triple M = ⟨S, s_I, T⟩ where S is a finite set of states with initial state s_I ∈ S and T: S → 2^{Dist_ℚ(S)} is the transition function. T(s) must be finite and non-empty for all s ∈ S.

For s ∈ S, an element µ of T(s) is a transition, and if s′ ∈ spt(µ), then the transition has a branch to successor state s′ with probability µ(s′). If |T(s)| = 1 for all s ∈ S, then M is a DTMC.
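For concreteness, one possible C encoding of Def. 1 stores transitions sparsely; the type and field names below are illustrative, not those of an existing tool.

/* Sparse representation of an MDP per Def. 1. */
typedef struct {
    int     num_branches; /* |spt(mu)| */
    int    *successors;   /* indices of the states s' in spt(mu) */
    double *probs;        /* the branch probabilities mu(s') */
} distribution_t;

typedef struct {
    int             num_transitions; /* |T(s)|, finite and non-empty */
    distribution_t *transitions;     /* the distributions mu in T(s) */
} state_t;

typedef struct {
    int      num_states; /* |S| */
    int      initial;    /* index of s_I */
    state_t *states;     /* T, indexed by state */
} mdp_t;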
Example 1. Fig. 1 shows our example MDP M^γ_n, which is actually a DTMC. It is a simplified and parametrised version of the counterexample of Wimmer et al. [29, Fig. 2]. It is parametrised in terms of n ∈ ℕ (determining the number of chained states with transitions labelled b) and γ ∈ (0, 0.5) (changing some probabilities). We draw transitions as lines to an intermediate node from which probability-labelled branches lead to successor states. We omit the intermediate node for transitions with a single branch, and label some transitions to easily refer to them. M^γ_n has 4 + n states and transitions, and 7 + 2n branches.

[Fig. 1. Example parametrised MDP M^γ_n, with states s_I, s₀, s₁, ..., s_n, s₊, and s₋, transitions labelled a, b, and c, and branch probabilities involving 1/2, γ, 1/2 − γ, 1 − γ, and 1.]
In practice, higher-level modelling languages like Modest [14] are used to specify MDP. The semantics of an MDP is captured by its paths. A path represents a concrete resolution of all nondeterministic and probabilistic choices. Formally:

Definition 2. A finite path is a sequence π_fin = s₀ µ₀ s₁ µ₁ ... µ_{n−1} s_n where sᵢ ∈ S for all i ∈ {0, ..., n} and µᵢ ∈ T(sᵢ) ∧ µᵢ(s_{i+1}) > 0 for all i ∈ {0, ..., n − 1}. Let |π_fin| := n and last(π_fin) := s_n. Π_fin(s) is the set of all finite paths starting in s. A path is an analogous infinite sequence π, and Π(s) is the set of all paths starting in s. We write s ∈ π if ∃ i: s = sᵢ.
A scheduler (or adversary, policy or strategy) only resolves the nondeterministic choices of M. For this paper, memoryless deterministic schedulers suffice [4].

Definition 3. A function σ: S → Dist(S) is a scheduler if, for all s ∈ S, we have σ(s) ∈ T(s). The set of all schedulers of M is S(M).

We are interested in reachability probabilities. Let M|_σ = ⟨S, s_I, T|_σ⟩ with T|_σ(s) = {σ(s)} be the DTMC induced by σ on M. Via the standard cylinder set construction [10, Sect. 2.2] on M|_σ, a scheduler induces probability measures P^{M,σ}_s on measurable sets of paths starting in s ∈ S.

Definition 4. For state s and goal state g ∈ S, the maximum and minimum probability of reaching g from s is defined as P^{M,s}_max(◇g) = sup_{σ∈S(M)} P^{M,σ}_s({π ∈ Π(s) | g ∈ π}) and P^{M,s}_min(◇g) = inf_{σ∈S(M)} P^{M,σ}_s({π ∈ Π(s) | g ∈ π}), respectively.

The definition extends to sets G of goal states. We omit the superscript for M when it is clear from the context, and if we omit that for s, then s = s_I. From now on, whenever we have an MDP with a set of goal states G, we assume w.l.o.g. that all g ∈ G are absorbing, i.e. every g only has one self-loop transition.

Definition 5. A maximal end component (MEC) of M is a maximal (sub-)MDP ⟨S′, T′, s′_I⟩ where S′ ⊆ S, T′(s) ⊆ T(s) for all s ∈ S′, and the directed graph with vertex set S′ and edge set {⟨s, s′⟩ | ∃ µ ∈ T′(s): µ(s′) > 0} is strongly connected.
1   function II(M = ⟨S, s_I, T⟩, G, opt, ε)
    // Preprocessing
2       if opt = max then M := CollapseMECs(M, G)          // collapse MECs
3       S₀ := Prob0(M, G, opt), S₁ := Prob1(M, G, opt)     // identify 0/1 states
4       l := {s ↦ 0 | s ∈ S \ S₁} ∪ {s ↦ 1 | s ∈ S₁}       // initialise lower vector
5       u := {s ↦ 0 | s ∈ S₀} ∪ {s ↦ 1 | s ∈ S \ S₀}       // initialise upper vector
    // Iteration
6       while (u(s_I) − l(s_I)) / l(s_I) > ε do            // while relative error > ε:
7           foreach s ∈ S \ (S₀ ∪ S₁) do                   // update non-0/1 states:
8               l(s) := opt_{µ∈T(s)} Σ_{s′∈spt(µ)} µ(s′) · l(s′)   // iterate lower vector
9               u(s) := opt_{µ∈T(s)} Σ_{s′∈spt(µ)} µ(s′) · u(s′)   // iterate upper vector
10      return ½ (u(s_I) + l(s_I))

Alg. 1: Interval iteration for probabilistic reachability
2.1 Algorithms

Interval iteration [3,5,12,13] computes reachability probabilities p(s) = P^s_opt(◇G), opt ∈ {max, min}. We show the basic algorithm as Alg. 1. It iteratively refines vectors l and u that map each state to a value in ℚ such that, at all times, we have l(s) ≤ p(s) ≤ u(s). In each iteration, the values in l and u are updated for all relevant states (line 7) via the classic Bellman equations of value iteration (lines 8-9). Their least fixpoint is p, towards which l converges from below. Some preprocessing is needed to ensure that the fixpoint is unique and also u converges towards p: for maximisation, we need to collapse MECs into single states (line 2). This can be done via graph-based algorithms (see e.g. [7]) that only consider the graph structure of the MDP as in Definition 1 but do not perform calculations with the concrete probability values. For both maximisation and minimisation, we need to identify the sets S₀ and S₁ such that ∀ s ∈ S₀: p(s) = 0 and ∀ s ∈ S₁: p(s) = 1 (line 3). This can equally be done via graph-based algorithms [10, Algs. 1-4]. We then initialise l and u to trivial under-/overapproximations of p (lines 4-5). Iteration stops when the relative difference between l and u at s_I is at most ε (which is often chosen as 10⁻³ or 10⁻⁶). The corresponding check in line 6 assumes that division by zero results in +∞, as is the default in IEEE 754. By convergence of l and u towards the fixpoint, II terminates, and we eventually return a value p̂ with the guarantee that p(s_I) ∈ [(1 − ε) · p̂, (1 + ε) · p̂]. This makes II sound.
PCTL. The temporal logic PCTL [16] allows us to construct complex branching-time properties. It takes standard CTL [2] and replaces the A(ψ) (“for all paths ψ holds”) and E(ψ) (“there exists a path for which ψ holds”) operators by the probabilistic operator P_{∼c}(ψ) for “under all schedulers, the probability of the measurable set of paths for which ψ holds is ∼ c” where ∼ ∈ {<, ≤, >, ≥} and c ∈ [0, 1]. To model-check a PCTL formula on MDP M, we follow the standard recursive CTL model checking algorithm [2, Sect. 6.4] except for the P operator, which can be reduced to computing reachability probabilities. For the “finally”/“eventually” case P_{∼c}(◇φ), we can directly use interval iteration: Let S_φ be the set of states recursively determined to satisfy φ. Call II(M, S_φ, opt, ε) of Alg. 1 with opt = max if ∼ ∈ {<, ≤} and opt = min otherwise, with two modifications: Change the stopping criterion of line 6 to check the difference for all states, and in line 10, return the set S_P := {s ∈ S | ∀ x ∈ [l(s), u(s)]: x ∼ c}. If for some state s the interval [l(s), u(s)] contains both values that satisfy and values that violate the comparison with c, however, we would need to either abort and report an “unknown” situation, or continue with a reduced ε until we can (hopefully eventually) decide the comparison. None of Prism, Storm, and mcsta appear to perform this extra check, though. In this paper, we only use PCTL for non-nested top-level P(◇...) operators; the results are then true if s_I ∈ S_P, should be unknown in case the “unknown” situation applies to s_I, and are false otherwise.
3 Floating-Point Arithmetic

The current implementations of II (in Prism, Storm, and mcsta) use IEEE 754 double-precision floating-point arithmetic to represent (a) the probabilities of the MDP’s branches and (b) the values in l and u. A floating-point number is stored as a significand d and an exponent e w.r.t. an agreed-upon base b such that it represents the value d · b^e. We fix b = 2. IEEE 754 double precision uses 64 bits in total, of which 1 is a sign bit, 52 are for d, and 11 are for e. Standard alternatives are 32-bit single precision (1 sign bit, 23 bits for d, and 8 for e) and the 80-bit x87 extended precision format (with 1 sign bit, 64 for d, and 15 for e). The subset of ℚ that can be represented in such a representation is determined by the numbers of bits for d and e. For example, 1/2 or 7/8 can be represented exactly in all formats, but 1/10 cannot. IEEE 754 prescribes that all basic operations (addition, multiplication, etc.) are performed at “infinite precision” with the result rounded to a representable number. The default rounding mode is to round to the nearest such number, choosing an even value in case of ties (round to nearest, ties to even). In single precision, 1/10 is thus by default rounded to 13421773 · 2⁻²⁷ = 0.100000001490116119384765625.
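This is easy to reproduce; the following C snippet prints the exact decimal expansion of the single-precision neighbour of 1/10 (assuming a libc, such as glibc, that prints exact expansions):

#include <stdio.h>

int main(void)
{
    float f = 0.1f; /* the nearest single-precision neighbour of 1/10 */
    /* Widening to double is exact, so this prints the represented value,
       0.100000001490116119384765625, digit for digit. */
    printf("%.27f\n", (double)f);
    return 0;
}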
A single rounded operation leads to an error of at most the distance between the two nearest representable numbers. In iterative computations, however, rounding may happen at every step. A striking example of the consequences is the failure of an American Patriot missile battery to intercept an incoming Iraqi Scud missile in February 1992 in Dharan, Saudi Arabia [28], which resulted in 28 fatalities. The Patriot system calculated time in seconds by multiplying its internal clock’s value by a rounded binary representation of 1/10. After 100 hours of continuous operation, this led to a cumulative rounding error large enough to miscalculate the incoming missile’s position by more than half a kilometre [1].
3.1 Errors in Probabilistic Model Checking
II accumulates and multiplies rounded floating-point values in the l and u vectors with potentially already-rounded values representing the rational probabilities of the model. Using the default rounding mode, how can we be sure that the final result does not miss the true probability by more than half a kilometre, too?
Following Wimmer et al. [29], let us consider MDP M^γ_n of Fig. 1 again, and determine whether P_{≥1/2}(◇{s₊}) holds. The model is acyclic, so it is easy to see that

  p := P_max(◇{s₊}) = 1/2 + γ^{n+2} > 1/2.

Let us fix n = 1 and γ = 10⁻⁶. Then p = 1/2 + 10⁻¹⁸. This value cannot be represented in double precision, and is by default rounded to 0.5.
We have encoded M^γ_n in the Modest and Prism languages, and checked the answers returned by Prism 4.7, Storm 1.6.4, and mcsta 3.1 for the property. The correct result would be false. Prism returns true in its default configuration, which uses an unsound algorithm, and false when requesting an algorithm with exact rational arithmetic, for which M^γ_n is small enough. If we explicitly request Prism to use II, then the result depends on the specified ε: for ε ≥ 10⁻¹¹, we get the correct result of false; for smaller ε ≤ 10⁻¹², i.e. higher precision, however, we incorrectly get true. Storm incorrectly returns true in its default configuration as well as when we request a sound algorithm via the --sound parameter. Only when using an exact rational algorithm via the --exact parameter does Storm correctly return false. mcsta, when using II (--alg IntervalIteration), incorrectly returns true, and additionally reports that it computed [l(s_I), u(s_I)] as [0.5, 0.5], thus not including the true value of p. Other algorithms are not immune to the problem, either; for example, mcsta also answers true when using SVI, OVI, and when solving the MDP as a linear programming problem via the Google OR Tools’ GLOP LP solver.
This example shows that using a sound algorithm does not guarantee correct results. The problem is not specific to cases of small probabilities like γ = 10⁻⁶ in the MDP; we can achieve the same effect using arbitrarily higher values of γ if we just increase n a little. Such bounded try-and-retry chains—where “normal” probabilities in the model result in very small values during iteration and on the final result—are not uncommon in the systems often modelled as MDPs, e.g. backoff schemes in communication protocols and randomised algorithms. In general, tiny differences in probabilities in one place may result in significant changes of the overall reachability probability; for example, in two-dimensional random walks, the long-run behaviour when the probabilities to move forward or backward are both 1/2 is vastly different from when they are 1/2 + δ and 1/2 − δ, respectively, for any δ > 0.
3.2 On Precision and Rounding Modes

In our concrete example, we may be able to avoid the problem by increasing precision: In the 80-bit extended format supported by all x86-64 CPUs, 1/2 + 10⁻¹⁸ is by default rounded to 5.000000000000000009... · 10⁻¹, so there is a chance of obtaining false unless other rounding during iterations would lose all the difference. Extended precision is used for C’s long double type by e.g. the GCC compiler; it is thus readily accessible to programmers. It is, however, the most precise format supported in common CPUs today; if we need more precision, we would have to resort to much slower software implementations using e.g. the GNU MPFR library. Any a-priori fixed precision, however, just shifts the problem to smaller differences, but does not eliminate it.
The more general solution that we propose in this paper is to control the rounding mode of the floating-point operations performed in the II algorithm. In addition to the default round to nearest, ties to even mode, the IEEE 754 standard defines three directed rounding modes: round towards zero (i.e. truncation), round towards +∞ (i.e. always round up), and round towards −∞ (i.e. always round down). As we will explain in Sect. 4, using the latter two gives us an easy way to make the computations inside II safe, i.e. guarantee the under- and overapproximation invariants for l and u, respectively. Control of the floating-point rounding mode however appears to be a rarely-used feature of IEEE 754 implementations; consequently the level and style of support for it in CPUs and high-level programming languages is diverse.
3.3 CPU Support for Rounding Modes

Storm and mcsta run exclusively on x86-64 systems (with the upcoming ARM-based systems so far only supported via their x86-64 emulation layers), while Prism additionally supports several other platforms via manual compilation. Thus we focus on x86-64 in this paper as the platform probabilistic model checkers overwhelmingly run on today.

X87 and SSE. All x86-64 CPUs support two instruction sets to perform floating-point operations in double precision: The x87 instruction set, originating from the 8087 floating-point coprocessor, and the SSE instruction set, which includes support for double precision since the Pentium 4’s SSE2 extension. Both implement operations according to the IEEE 754 standard. Aside from architectural particularities such as its stack-based approach to managing registers, the x87 instruction set notably includes support for 80-bit extended precision. In fact, by default, it performs all calculations in that extended precision, only rounding to double or single precision when storing values back to 64- or 32-bit memory locations. This has the advantage of reducing the error across sequences of operations, but for high-level languages makes the results depend on the compiler’s choices of when to load/store intermediate values in memory vs. keeping them in x87 registers. The SSE instructions only support single and double precision.
Both the x87 and SSE instruction sets support all four rounding modes mentioned above. The rounding mode of operations for x87 and SSE is determined by the current value of the x87 FPU control word stored in the x87 FPU control register or the current value of the SSE MXCSR control register, respectively. That is, to change rounding mode, we need to obtain the current control register value, change the two bits determining the rounding mode (with the other bits controlling other aspects of floating-point operations such as the treatment of NaNs), and apply the new value. This is done via the FNSTCW/FLDCW instruction pair on x87, and VSTMXCSR/VLDMXCSR for SSE. Rounding mode is thus part of the global (per-thread) state, and we must be careful to restore its original configuration when returning to code that does not expect rounding mode changes. Frequent changes of rounding mode thus incur a performance overhead due to the extra instructions that must be executed for every change and their effects on e.g. pipelining.
AVX-512. AVX-512 is the extension to 512 bits of the sequence of single instruction, multiple data (SIMD) instruction sets in x86-64 processors that started with SSE. It became available for general-purpose systems in high-end desktop (Skylake-X) and server (Xeon) CPUs in 2017, but it took until the 10th generation of Intel’s Core mobile CPUs in 2019 before it was more widely available in end-user systems. It is supposed to appear in AMD CPUs with the upcoming Zen 4 architecture. Aside from its 512-bit SIMD instructions, AVX-512 crucially also includes new instructions for single floating-point values where the operation’s rounding mode is specified as part of the instruction itself via the new “EVEX” encoding. Of particular note for implementing II are the new VFMADD(r₁r₂r₃)SD fused multiply-add instructions (the rᵢ determining how the operand registers are used) that can directly be used for the sums of products in the Bellman equations in lines 8-9 of Alg. 1. Overall, AVX-512 thus makes rounding mode independent of global state, and may improve performance by removing the need for extra instruction sequences to change rounding mode.
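For illustration, a minimal C sketch of such an EVEX-encoded operation via the corresponding intrinsic (requires an AVX-512F CPU and compilation with -mavx512f):

#include <immintrin.h>
#include <stdio.h>

/* Computes a*b + c on the low double lanes with round-towards-minus-infinity
   encoded in the instruction itself, leaving the MXCSR register untouched. */
static double fmadd_round_down(double a, double b, double c)
{
    __m128d va = _mm_set_sd(a), vb = _mm_set_sd(b), vc = _mm_set_sd(c);
    __m128d r = _mm_fmadd_round_sd(va, vb, vc,
                                   _MM_FROUND_TO_NEG_INF | _MM_FROUND_NO_EXC);
    return _mm_cvtsd_f64(r);
}

int main(void)
{
    /* One fused product-and-sum with a single downward rounding. */
    printf("%.17g\n", fmadd_round_down(1.0 / 3.0, 3.0, -1.0));
    return 0;
}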
3.4 Rounding Modes in Programming Languages

Support for non-default rounding modes is lacking in most high-level programming languages. Java, C#, and Python, for example, do not support them at all. If II is implemented in such a language, there is consequently no hope for a high-performance solution to the rounding problems described earlier.
For C and C++, the C99 and C++11 standards introduced access to the floating-point environment. The fenv.h/cfenv headers include the fegetround and fesetround functions to query the current rounding mode and change it, respectively. Implementations of these functions on x86-64 read/change both the x87 and SSE control registers accordingly. In the remainder of this paper, we focus on a C implementation, but most statements hold for C++ analogously. The level of support for the C99 floating-point features varies significantly between compilers; it is in particular still incomplete in Clang² and GCC [11, Further notes]. Still, both compilers provide access to the fegetround/fesetround functions (via the associated standard libraries), but GCC in particular is not rounding mode-aware in optimisations. This means that, for example, subexpressions that are evaluated twice, with a change in rounding mode in between, may be compiled by GCC into a single evaluation before the change, with the resulting value stored in a register and reused after the rounding mode change. This can even happen when using the -frounding-math option³. Programmers thus need to inspect the generated assembly to ensure that no problematic transformations have been made, or try to make them impossible by declaring values volatile or inserting inline assembly “barriers”.

² The documentation as of October 2021 states that C99 support in Clang “is feature-complete except for the C99 floating-point pragmas”.

Overall, C thus provides a standardised way to change the x87/SSE rounding mode, but programmers need to be aware of compiler quirks when using these facilities. Support for AVX-512 instructions that include rounding mode bits in C, on the other hand, is only slightly more convenient than programming in assembly as we can use the intrinsics in the immintrin.h header; there is no standard higher-level abstraction of this feature in either C or C++.
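As a small example of these facilities—and of the volatile workaround against GCC caching values across mode changes—consider bracketing a single division from below and above (compile with, e.g., gcc -std=c99 -frounding-math):

#include <fenv.h>
#include <stdio.h>
#pragma STDC FENV_ACCESS ON

/* Divides num by den under an explicitly chosen rounding direction. The
   volatile qualifier serves as a barrier against the compiler caching the
   quotient across the rounding mode change (cf. the GCC caveat above). */
static double div_rounded(double num, double den, int mode)
{
    int old = fegetround();
    fesetround(mode);
    volatile double q = num / den;
    fesetround(old);
    return q;
}

int main(void)
{
    double lo = div_rounded(1.0, 10.0, FE_DOWNWARD);
    double hi = div_rounded(1.0, 10.0, FE_UPWARD);
    printf("lo = %.17g\nhi = %.17g\n", lo, hi); /* lo < 1/10 < hi, 1 ulp apart */
    return 0;
}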
4 Correctly Rounding Interval Iteration

Let us now change II as in Alg. 1 to consistently round in safe directions at every numeric operation. Given that we can change or specify the rounding mode of all basic floating-point operations on current hardware, we expect that a high-performance implementation can be achieved. First, the preprocessing steps require no changes as they are purely graph-based. The changes to the iteration part of the algorithm are straightforward: In line 6,

  while (u(s_I) − l(s_I)) / l(s_I) > ε do ...,

we round the results of the subtraction and of the division towards +∞ to avoid stopping too early. In line 8,

  l(s) := opt_{µ∈T(s)} Σ_{s′∈spt(µ)} µ(s′) · l(s′),

the multiplications and additions round towards −∞ while the corresponding operations on the upper bound in line 9 round towards +∞. Recall that all probabilities in the MDP are rational numbers, i.e. representable as num/den with num, den ∈ ℕ. We assume that num and den can be represented exactly in the implementation. Then, in line 8, we calculate the floating-point values for the µ(s′) = num/den by rounding towards −∞. In line 9, we round the result of the corresponding division towards +∞. Finally, instead of returning the middle of the interval in line 10, we return [l(s_I), u(s_I)] so as not to lose any information (e.g. in case the result is compared to a constant as in the example of Sect. 3.1).
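A C sketch of the resulting safely rounded update of l—using a hypothetical sparse transition structure, and mirrored with FE_UPWARD (and probabilities rounded up) for u—could look as follows; it illustrates the scheme, not the literal mcsta code.

#include <fenv.h>
#pragma STDC FENV_ACCESS ON

/* Hypothetical sparse storage for the transitions T(s) of a single state. */
typedef struct {
    int           count; /* number of branches of this transition */
    const int    *succ;  /* successor state indices s' */
    const double *prob;  /* mu(s'), themselves pre-rounded towards -inf */
} transition_t;

/* One Bellman update of l(s), with every product and sum rounded towards
   minus infinity. */
static double update_lower(const transition_t *trans, int num_trans,
                           const double *l, int maximise)
{
    fesetround(FE_DOWNWARD);
    double best = maximise ? -1.0 : 2.0; /* values live in [0, 1] */
    for (int t = 0; t < num_trans; t++) {
        volatile double sum = 0.0; /* barrier against caching (Sect. 3.4) */
        for (int i = 0; i < trans[t].count; i++)
            sum += trans[t].prob[i] * l[trans[t].succ[i]];
        if (maximise ? (sum > best) : (sum < best))
            best = sum;
    }
    fesetround(FE_TONEAREST); /* restore the default for surrounding code */
    return best;
}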
With these changes, we obtain an interval guaranteed to contain the true reachability probability if the algorithm terminates. However, rounding away from the theoretical fixpoint in the updates of l and u means that we may reach an effective fixpoint—where l and u no longer change because all newly computed values round down/up to the values from the previous iteration—at a point where the relative difference of l(s_I) and u(s_I) is still above ε. This will happen in practice: In QComp 2020 [6], mcsta participated in the floating-point correct track by letting VI run until it reached a fixpoint under the default rounding mode with double precision. In 9 of the 44 benchmark instances that mcsta attempted to solve in this way, the difference between this fixpoint and the true value was more than the specified ε. With safe rounding away from the true fixpoint, this would likely have happened in even more cases.
To ensure termination, we thus need to make one further change to the II of Alg. 1: In each iteration of the while loop, we additionally keep track of whether any of the updates to l and u changes the previous value. If not, we end the loop and return the current interval, which will be wider than the requested relative difference. We refer to II with all of these modifications as safely rounding interleaved II (SR-III) in the remainder of this paper.

³ The documentation as of Oct. 2021 states that -frounding-math “does not currently guarantee to disable all GCC optimizations that are affected by rounding mode.”

1   function SR-SII(M = ⟨S, s_I, T⟩, G, opt, ε)
2       ... (preprocessing as in Alg. 1) ...
3       repeat
4           chg := false
5           fesetround(towards −∞)
6           foreach s ∈ S \ (S₀ ∪ S₁) do
7               l_new := opt_{µ∈T(s)} Σ_{s′∈spt(µ)} µ(s′) · l(s′)   // iterate lower vector
8               if l_new ≠ l(s) then chg := true
9               l(s) := l_new
10          fesetround(towards +∞)
11          foreach s ∈ S \ (S₀ ∪ S₁) do
12              u_new := opt_{µ∈T(s)} Σ_{s′∈spt(µ)} µ(s′) · u(s′)   // iterate upper vector
13              if u_new ≠ u(s) then chg := true
14              u(s) := u_new
15      until ¬chg ∨ (u(s_I) − l(s_I)) / l(s_I) ≤ ε
16      return [l(s_I), u(s_I)]

Alg. 2: Safely rounding sequential interval iteration (SR-SII) for x87 or SSE
4.1 Sequential Interval Iteration

When using the x87 or SSE instruction sets to implement SR-III, we need to insert a call to fesetround just before line 8, and another just before line 9. If, for an MDP with n states, we need m iterations of the while loop, we will make 2·n·m calls to fesetround. This might significantly impact performance for models with many states, or that need many iterations (such as the haddad-monmege model of the QVBS, which requires 7 million iterations with ε = 10⁻⁶ despite only having 41 states). As an alternative, we can rearrange the iteration phase of II as shown in Alg. 2: We first update l for all states (lines 6-9), then u for all states (lines 11-14), with the rounding mode changes in between (lines 5 and 10). We call this variant of II safely rounding sequential II (SR-SII). It only needs 2·m calls to fesetround, which should improve its performance. However, it also changes the memory access pattern of II with an a priori unknown effect on performance. We write III for II to stress that it is interleaved, and SII for Alg. 2 without the safe rounding, in the remainder of this paper.
4.2 Implementation Aspects
We have implemented III, SII, SR-III, and SR-SII in mcsta. While mcsta is writ-
ten in C#, the new algorithms are (necessarily) written in C, called from the
main tool via the P/Invoke mechanism. We used GCC 10.3.0 to compile our
implementations on both 64-bit Linux and Windows 10. We manually inspected
the disassembly of the generated code to ensure that GCC’s optimisations did
not interfere with rounding mode changes as described in Sect. 3.4. In a sig-
nificant architectural change, we modified mcsta’s state space exploration and
representation code to preserve the exact rational values for the probabilities
specified in the model, so that safely-rounded floating-point representations for
the µ(s0)can be computed during iteration as described above.
Of each algorithm, we implemented four variants: a default one that leaves the
choice of instruction set to the compiler and uses fesetround to change round-
ing mode; an x87 variant that forces floating-point operations to use the x87
instructions by attributing the relevant functions with target("fpmath=387")
and that changes rounding mode via inline assembly using FNSTCW/FLDCW;
an SSE variant that forces the SSE instruction set via target("fpmath=sse")
and uses VSTMXCSR/VLDMXCSR in inline assembly for rounding mode chan-
ges; and an AVX-512 variant that implements all floating-point operations re-
quiring non-default rounding modes via AVX-512 intrinsics, in particular using
_mm_fmadd_round_sd in the Bellman equations. All variants use double pre-
cision; default and SSE additionally have a single-precision version (which we
omit for x87 since the reduced precision does not speed up the operations we
use); and x87 also provides an 80-bit extended-precision version (however we
currently return its results as safely-rounded double-precision values due to the
unavailability of a long double equivalent in C#, which limits its use outside of
performance testing for now). All in all, we thus provide 28 variants of interval
iteration for comparison, out of which 14 provide guaranteed correct results.
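The AVX-512 variant sidesteps rounding mode changes entirely, because the rounding can be embedded in each instruction. As a hedged illustration (not the mcsta source), a single multiply-add of the Bellman equations with static round-towards-−∞ looks as follows using the documented Intel intrinsic; it needs a CPU and compiler with AVX-512F support (e.g., gcc -mavx512f):

    #include <immintrin.h>

    /* computes acc + p*v rounded towards -infinity, without touching MXCSR */
    static inline double fmadd_down(double p, double v, double acc) {
      __m128d r = _mm_fmadd_round_sd(
          _mm_set_sd(p), _mm_set_sd(v), _mm_set_sd(acc),
          _MM_FROUND_TO_NEG_INF | _MM_FROUND_NO_EXC);
      return _mm_cvtsd_f64(r);
    }

Since the mode is part of the instruction, the lower and upper sweeps can freely mix roundings, which is consistent with the near-zero safe-rounding overhead observed for AVX-512 below.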
In particular, the safe rounding makes PMC feasible at 32-bit single precision,
which would otherwise be too likely to produce incorrect results. While we expect
that this may deliver many results with low precision (but which are correct) due
to a rounded fixpoint being reached long before the relative width reaches ε, it
also halves the memory needed to store l and u, and may speed up computations.
At the opposite end, mcsta is now also the first PMC tool that can use 80-bit
extended precision, which however doubles the memory needed for l and u since
80-bit long double values occupy 16 bytes in memory (with GCC).
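The size claims are easy to check; on x86-64 with GCC, the following prints 4 8 16, confirming that the 80-bit x87 format is padded to 16 bytes:

    #include <stdio.h>

    int main(void) {
      /* with GCC on x86-64: float = 4, double = 8, long double = 16 bytes */
      printf("%zu %zu %zu\n", sizeof(float), sizeof(double), sizeof(long double));
      return 0;
    }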
5 Experiments
Using our implementation in mcsta, we first tested all variants of the algorithms
on M^γ_n in the setting of Sect. 3.1. As expected, and validating the correctness of
the approach and its implementation, all SR variants return unknown.
We then assembled a set of 31 benchmark instances—combinations of a
model, values for its configurable parameters, and a property to check—from
the QVBS covering DTMC, MDP, and probabilistic timed automata (PTA) [24]
transformed to MDP by mcsta using the digital clocks approach [23]. These are
all the models and probabilistic reachability probabilities from the QVBS sup-
ported by mcsta for which the result was not 0 or 1 (then it can be computed via
graph-based algorithms) and for which a parameter configuration was available
where PMC terminated within our timeout of 120 s but II needed enough time for
it to be measured reliably (≳ 0.2 s). We checked each of these benchmarks with
all 28 variants of our algorithms using ε = 10⁻⁶ on different x86-64 systems:
I11w: an Intel Core i5-1135G7 (up to 4.2GHz) laptop running Windows 10,
this being the only system we had access to with AVX-512 support; AMDw:
an AMD Ryzen 9 5900X (3.7-4.8GHz) workstation running Windows 10, repre-
senting current AMD CPUs in our evaluation; I4x: an Intel Core i7-4790 (3.6-
4.0GHz) workstation running Ubuntu Linux 18.04, representing older-generation
Intel desktop hardware; and IPx: an Intel Pentium Silver J5005 (1.5-2.8GHz)
compact PC running Ubuntu Linux 18.04, representing a non-Core low-power
Intel system. We show a selection of our experimental results in the remainder
of this section, mainly from I11w and AMDw. We remark on cases where the
other systems (all with Intel CPUs) showed different patterns from I11w.
We present results graphically as scatter plots like in Fig. 2. Each such plot
compares two algorithm variants in terms of runtime for the iteration phase of the
algorithm only (i.e. we exclude the time for state space exploration and prepro-
cessing). Every point ⟨x, y⟩ corresponds to a benchmark instance and indicates
that the variant noted on the x-axis took x seconds to solve this instance while
the one noted on the y-axis took y seconds. Thus points above the solid diagonal
line correspond to instances where the x-axis method was faster; points above
(below) the upper (lower) dotted diagonal line are where the x-axis method took
less than half (more than twice) as long.
Fig. 2 first shows the performance impact of enabling safe rounding for the
standard interleaved algorithm using double precision. The top row shows the
behaviour on I11w. We see that runtime is drastically longer in the default variant
that uses fesetround, but only increases by a factor of around 2 if we use
the specific inline assembly instructions. We note that GCC includes the code
for fesetround in the generated .dll file on Windows, but in contrast to the
assembly methods does not inline it into the callers. Some of the difference
may thus be function call overhead. The middle row shows the behaviour on
AMDw. Here, default is affected just as badly, but the effect on SSE is worse
while that on x87 is much lower than on the Intel I11w system. In the bottom
row, we show the impact on default on the Linux systems (bottom left and
bottom middle), which is much lower than on Windows. This is despite GCC
implementing fesetround as an external library call here. The overhead still
markedly differs between the two Intel CPUs, though. Finally, as expected, we
see on the bottom right that safe rounding has almost no performance impact
when using the AVX-512 instructions.
[Fig. 2. Performance impact of safe rounding across instruction sets and systems: nine scatter plots (runtimes from 0.2 s to timeout, log-log scale; DTMC, MDP and PTA instances) comparing SR-III against III for the default, SSE and x87 variants on I11w and AMDw, the default variant on I4x and IPx, and the AVX-512 variant on I11w.]

Seeing the significant impact enabling safe rounding can have, we next show
what the sequential algorithm brings to the table, in Fig. 3. On the top left, we
compare the base algorithms without safe rounding, where SII takes up to twice
as long in the worst case. This is likely due to the more cache-friendly memory
access pattern of III: we store l and u interleaved for III, so it always operates
on two adjacent values at a time. The bottom-left plot confirms that reducing
the number of rounding mode changes reduces the overhead of safe rounding to
essentially zero. The remaining four plots show the differences between SR-III
and SR-SII. In all cases except x87 on AMDw, SR-III is slower. We thus have
that III is fastest but unsafe, SII and SR-SII are equally fast but the latter is
safe, and SR-III is safe but tends to be slower on the Intel systems. On the AMD
system, SR-III surprisingly wins over SR-SII with x87, highlighting that the x87
instruction set in Zen 3 must be implemented very differently from SSE.
[Fig. 3. Performance of interleaved compared to sequential II: six scatter plots comparing SII against III (I11w, SSE), SR-SII against SR-III (I11w, SSE and x87; AMDw, SSE and x87), and SR-SII against SII (I11w, SSE).]
We further investigate the impact of the instruction set in Fig. 4. Confirming
the patterns we saw so far, SSE is slightly faster than x87 on I11w (and we see
similar behaviour on the other Intel systems) but slower by a factor of more
than 2 on the AMD CPU. The rightmost plot highlights that AVX-512 is the
fastest alternative on the most recent Intel CPUs, which may in part be due to
the availability of the fused multiply-add instruction that fits II so well.
[Fig. 4. Performance with different instruction sets: scatter plots comparing SR-III with x87 against SSE on I11w and AMDw, and SR-III with AVX-512 against SSE on I11w.]

All results so far were for double-precision computations. To conclude our
evaluation, we show in Fig. 5 that reducing to single precision does not bring
the expected performance benefits. We see in the leftmost plot that the overhead
of safe rounding has a much higher variance compared to Fig. 2. The detailed tool
outputs hint at the reason being that rounding away from the fixpoint occurs in
much larger steps with single precision, which significantly slows down or stops
the convergence in several instances. The middle plot shows that, aside from the
slowly converging outliers, using single precision does not provide a speedup over
using doubles. Finally, on the right, we show that the impact of enabling 80-bit
extended precision on x87 is minimal.

[Fig. 5. Performance with different precision settings (on I11w): scatter plots comparing SR-III (single) against III (SSE, single), SR-III (double) against SR-III (SSE, single), and SR-III (ext.) against SR-III (x87, double).]
6 Conclusion
There has been ample research into sound PMC algorithms over the past years,
but the problem of errors introduced by naive implementations using default
floating-point rounding has been all but ignored. We showed that a solution ex-
ists that, while perhaps conceptually simple, faces a number of implementation
and performance obstacles. In particular, hardware support for rounding modes
is arguably essential to achieve acceptable performance, but difficult to use from
C/C++ and impossible to access from most other programming languages. We
extensively explored the space of implementation variants, highlighting that per-
formance crucially depends on the combination of the variant, the CPU, and the
operating system. Nevertheless, our results show that truly correct PMC is pos-
sible today at a small cost in performance, which should all but disappear as
AVX-512 is more widely adopted. With our implementation in mcsta, we provide
the first PMC tool that is at once fast, scalable, and correct.
Acknowledgments. This work was triggered by Masahide Kashiwagi’s excellent
overview of the different ways to change rounding mode as used by his kv library
for verified numerical computations [21]. The author thanks Anke and Ursula
Hartmanns for contributing to the diversity of hardware on which the experi-
ments were performed by providing access to the AMDw and I11w systems.
Data availability. A dataset to replicate the experimental evaluation, including
the exact versions of the tools and models used, is archived and available at DOI
10.4121/19074047 [17].
References
1. Arnold, D.N.: Some disasters attributable to bad numerical computing: The Patriot
missile failure (2000), https://www-users.cse.umn.edu/~arnold/disasters/patriot.
html, last accessed 2021-10-14.
2. Baier, C., Katoen, J.P.: Principles of model checking. MIT Press (2008)
3. Baier, C., Klein, J., Leuschner, L., Parker, D., Wunderlich, S.: Ensuring the re-
liability of your model checker: Interval iteration for Markov decision processes.
In: Majumdar, R., Kuncak, V. (eds.) 29th International Conference on Computer
Aided Verification (CAV). Lecture Notes in Computer Science, vol. 10426, pp.
160–180. Springer (2017). https://doi.org/10.1007/978-3-319-63387-9_8
4. Bianco, A., de Alfaro, L.: Model checking of probabilistic and nondeterministic
systems. In: 15th Conference on Foundations of Software Technology and Theoret-
ical Computer Science (FSTTCS). Lecture Notes in Computer Science, vol. 1026,
pp. 499–513. Springer (1995). https://doi.org/10.1007/3-540-60692-0_70
5. Brázdil, T., Chatterjee, K., Chmelik, M., Forejt, V., Kretínský, J., Kwiatkowska,
M.Z., Parker, D., Ujma, M.: Verification of Markov decision processes us-
ing learning algorithms. In: Cassez, F., Raskin, J.F. (eds.) 12th International
Symposium on Automated Technology for Verification and Analysis (ATVA).
Lecture Notes in Computer Science, vol. 8837, pp. 98–114. Springer (2014).
https://doi.org/10.1007/978-3-319-11936-6_8
6. Budde, C.E., Hartmanns, A., Klauck, M., Kretínský, J., Parker, D., Quatmann,
T., Turrini, A., Zhang, Z.: On correctness, precision, and performance in quantitative
verification: QComp 2020 competition report. In: Margaria, T., Steffen, B.
(eds.) 9th International Symposium on Leveraging Applications of Formal Meth-
ods (ISoLA). Lecture Notes in Computer Science, vol. 12479, pp. 216–241. Springer
(2020). https://doi.org/10.1007/978-3-030-83723-5_15
7. Chatterjee, K., Henzinger, M.: Faster and dynamic algorithms for maxi-
mal end-component decomposition and related graph problems in proba-
bilistic verification. In: Randall, D. (ed.) Twenty-Second Annual ACM-SIAM
Symposium on Discrete Algorithms (SODA). pp. 1318–1336. SIAM (2011).
https://doi.org/10.1137/1.9781611973082.101
8. Chatterjee, K., Henzinger, T.A.: Value iteration. In: Grumberg, O., Veith,
H. (eds.) 25 Years of Model Checking - History, Achievements, Perspectives.
Lecture Notes in Computer Science, vol. 5000, pp. 107–138. Springer (2008).
https://doi.org/10.1007/978-3-540-69850-0_7
9. Dehnert, C., Junges, S., Katoen, J.P., Volk, M.: A Storm is coming: A modern prob-
abilistic model checker. In: Majumdar, R., Kuncak, V. (eds.) 29th International
Conference on Computer Aided Verification (CAV). Lecture Notes in Computer
Science, vol. 10427, pp. 592–600. Springer (2017). https://doi.org/10.1007/978-3-
319-63390-9_31
10. Forejt, V., Kwiatkowska, M.Z., Norman, G., Parker, D.: Automated verification
techniques for probabilistic systems. In: Bernardo, M., Issarny, V. (eds.) 11th In-
ternational School on Formal Methods for the Design of Computer, Communication
and Software Systems (SFM). Lecture Notes in Computer Science, vol. 6659, pp.
53–113. Springer (2011). https://doi.org/10.1007/978-3-642-21455-4_3
11. Free Software Foundation: Status of C99 features in GCC (2021), https://gcc.gnu.
org/c99status.html, as accessed on 2021-10-14.
12. Haddad, S., Monmege, B.: Reachability in MDPs: Refining convergence of value it-
eration. In: Ouaknine, J., Potapov, I., Worrell, J. (eds.) 8th International Workshop
on Reachability Problems (RP). Lecture Notes in Computer Science, vol. 8762, pp.
125–137. Springer (2014). https://doi.org/10.1007/978-3-319-11439-2_10
13. Haddad, S., Monmege, B.: Interval iteration algorithm for
MDPs and IMDPs. Theor. Comput. Sci. 735, 111–131 (2018).
https://doi.org/10.1016/j.tcs.2016.12.003
14. Hahn, E.M., Hartmanns, A., Hermanns, H., Katoen, J.P.: A compositional mod-
elling and analysis framework for stochastic hybrid systems. Formal Methods Syst.
Des. 43(2), 191–232 (2013). https://doi.org/10.1007/s10703-012-0167-z
15. Hahn, E.M., Li, Y., Schewe, S., Turrini, A., Zhang, L.: iscasMc: A web-based
probabilistic model checker. In: Jones, C.B., Pihlajasaari, P., Sun, J. (eds.) 19th
International Symposium on Formal Methods (FM). Lecture Notes in Computer
Science, vol. 8442, pp. 312–317. Springer (2014). https://doi.org/10.1007/978-3-
319-06410-9_22
16. Hansson, H., Jonsson, B.: A logic for reasoning about time and reliability. Formal
Aspects Comput. 6(5), 512–535 (1994). https://doi.org/10.1007/BF01211866
17. Hartmanns, A.: Correct probabilistic model checking with floating-
point arithmetic (artifact). 4TU.Centre for Research Data (2022).
https://doi.org/10.4121/19074047
18. Hartmanns, A., Hermanns, H.: The Modest Toolset: An integrated environment
for quantitative modelling and verification. In: Ábrahám, E., Havelund, K. (eds.)
20th International Conference on Tools and Algorithms for the Construction and
Analysis of Systems (TACAS). Lecture Notes in Computer Science, vol. 8413, pp.
593–598. Springer (2014). https://doi.org/10.1007/978-3-642-54862-8_51
19. Hartmanns, A., Kaminski, B.L.: Optimistic value iteration. In: Lahiri, S.K., Wang,
C. (eds.) 32nd International Conference on Computer Aided Verification (CAV).
Lecture Notes in Computer Science, vol. 12225, pp. 488–511. Springer (2020).
https://doi.org/10.1007/978-3-030-53291-8_26
20. Hartmanns, A., Klauck, M., Parker, D., Quatmann, T., Ruijters, E.: The quantita-
tive verification benchmark set. In: Vojnar, T., Zhang, L. (eds.) 25th International
Conference on Tools and Algorithms for the Construction and Analysis of Systems
(TACAS). Lecture Notes in Computer Science, vol. 11427, pp. 344–350. Springer
(2019). https://doi.org/10.1007/978-3-030-17462-0_20
21. Kashiwagi, M.: kv – a C++ library for verified numerical computation, http://
verifiedby.me/kv/index-e.html, last accessed 2021-10-13.
22. Kwiatkowska, M.Z., Norman, G., Parker, D.: PRISM 4.0: Verification of probabilis-
tic real-time systems. In: Gopalakrishnan, G., Qadeer, S. (eds.) 23rd International
Conference on Computer Aided Verification (CAV). Lecture Notes in Computer
Science, vol. 6806, pp. 585–591. Springer (2011). https://doi.org/10.1007/978-3-
642-22110-1_47
23. Kwiatkowska, M.Z., Norman, G., Parker, D., Sproston, J.: Performance analysis
of probabilistic timed automata using digital clocks. Formal Methods Syst. Des.
29(1), 33–78 (2006). https://doi.org/10.1007/s10703-006-0005-2
24. Kwiatkowska, M.Z., Norman, G., Segala, R., Sproston, J.: Automatic verification
of real-time systems with discrete probability distributions. Theor. Comput. Sci.
282(1), 101–150 (2002). https://doi.org/10.1016/S0304-3975(01)00046-9
25. Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic
Programming. Wiley Series in Probability and Statistics, Wiley (1994).
https://doi.org/10.1002/9780470316887
26. Quatmann, T., Katoen, J.P.: Sound value iteration. In: Chockler, H., Weis-
senbacher, G. (eds.) 30th International Conference on Computer Aided Verifica-
tion (CAV). Lecture Notes in Computer Science, vol. 10981, pp. 643–661. Springer
(2018). https://doi.org/10.1007/978-3-319-96145-3_37
27. Teige, T., Fränzle, M.: Constraint-based analysis of probabilistic hybrid systems.
In: Giua, A., Mahulea, C., Silva, M., Zaytoon, J. (eds.) 3rd IFAC Conference
on Analysis and Design of Hybrid Systems (ADHS). IFAC Proceedings Vol-
umes, vol. 42, pp. 162–167. Elsevier (2009). https://doi.org/10.3182/20090916-3-
ES-3003.00029
28. United States General Accounting Office: Software problem led to system failure
at Dhahran, Saudi Arabia. Report GAO/IMTEC-92-26 (February 1992), https:
//www-users.cse.umn.edu/~arnold/disasters/GAO-IMTEC-92-96.pdf
29. Wimmer, R., Kortus, A., Herbstritt, M., Becker, B.: Probabilistic model checking
and reliability of results. In: Straube, B., Drutarovský, M., Renovell, M., Gramata,
P., Fischerová, M. (eds.) 11th IEEE Workshop on Design & Diagnostics of Elec-
tronic Circuits & Systems (DDECS). pp. 207–212. IEEE Computer Society (2008).
https://doi.org/10.1109/DDECS.2008.4538787
Correlated Equilibria and Fairness in
Concurrent Stochastic Games
Marta Kwiatkowska1, Gethin Norman2, David Parker3, and Gabriel Santos1
1Department of Computer Science, University of Oxford, Oxford, UK
{marta.kwiatkowska,gabriel.santos}@cs.ox.ac.uk
2School of Computing Science, University of Glasgow, Glasgow, UK
gethin.norman@glasgow.ac.uk
3School of Computer Science, University of Birmingham, Birmingham, UK
d.a.parker@cs.bham.ac.uk
Abstract. Game-theoretic techniques and equilibria analysis facilitate
the design and verification of competitive systems. While algorithmic
complexity of equilibria computation has been extensively studied, prac-
tical implementation and application of game-theoretic methods is more
recent. Tools such as PRISM-games support automated verification and
synthesis of zero-sum and (ε-optimal subgame-perfect) social welfare
Nash equilibria properties for concurrent stochastic games. However,
these methods become inefficient as the number of agents grows and may
also generate equilibria that yield significant variations in the outcomes
for individual agents. We extend the functionality of PRISM-games to
support correlated equilibria, in which players can coordinate through
public signals, and introduce a novel optimality criterion of social fair-
ness, which can be applied to both Nash and correlated equilibria. We
show that correlated equilibria are easier to compute, are more equitable,
and can also improve joint outcomes. We implement algorithms for both
normal form games and the more complex case of multi-player concur-
rent stochastic games with temporal logic specifications. On a range of
case studies, we demonstrate the benefits of our methods.
1 Introduction
Game-theoretic verification techniques can support the modelling and design of
systems that comprise multiple agents operating in either a cooperative or com-
petitive manner. In many cases, to effectively analyse these systems we also need
to adopt a probabilistic approach to modelling, for example because agents oper-
ate in uncertain environments, use faulty hardware or unreliable communication
mechanisms, or explicitly employ randomisation for coordination.
In these cases, probabilistic model checking provides a convenient unified
framework for both formally modelling probabilistic multi-agent systems and
specifying their required behaviour. In recent years, progress has been made in
this direction for several models, including turn-based and concurrent stochastic
games (TSGs and CSGs), and for multiple temporal logics, such as rPATL [10]
and its extensions [24]. Tool support has been developed, in the form of PRISM-
games [22], and successfully applied to case studies across a broad range of areas.
Initially, the focus was on zero-sum specifications [24], which can be natural
for systems whose participants have directly opposing goals, such as the defender
and attacker in a security protocol minimising or maximising the probability of
a successful attack, respectively. However, agents often have objectives that are
distinct but not directly opposing, and may also want to cooperate to achieve
these objectives. Examples include network protocols and multi-robot systems.
For these purposes, Nash equilibria (NE) have also been integrated into prob-
abilistic model checking of CSGs [24], together with the social welfare (SW)
optimality criterion, resulting in social welfare Nash equilibria (SWNE). An SWNE
comprises a strategy for each player in the game where no player has an incen-
tive to deviate unilaterally from their strategy and the sum of the individual
objectives over all players is maximised.
One key limitation of SWNE, however, is that, as these techniques are ex-
tended to support larger numbers of players [21], the efficiency and scalability
of synthesising SWNE is significantly reduced. In addition, simply aiming to
maximise the sum of individual objectives may not produce the best perform-
ing equilibrium, either collectively or individually; for example, they can offer
higher gains for specific players, reducing the incentive of the other players to
collaborate and instead motivating them to deviate from the equilibrium.
In this paper, we adopt a different approach and introduce, for the first time
within formal verification, both social fairness as an optimality criterion and
correlated equilibria, and the insights required to make these usable in practical
applications. Social fairness (SF) is particularly novel, as it is inspired by similar
concepts used in economics and distinct from the fairness notions employed in
verification. Correlated equilibria (CE) [3], in which players are able to coordi-
nate through public signals, are easier to compute than NE and can yield better
outcomes. Social fairness, which minimises the differences between the objectives
of individual players, can be considered for both CE and NE.
We rst investigate these concepts for the simpler case of normal form games,
illustrating their dierences and benets. We then extend the approach to the
more powerful modelling formalism of CSGs and extend the temporal logic
rPATL to formally specify agent objectives. We present algorithms to synthesise
equilibria, using linear programming to nd CE and a combination of back-
wards induction or value iteration for CSGs. We implement our approach in
the PRISM-games tool [22] and demonstrate signicant gains in computation
time and that quantiably more fair and useful strategies can by synthesised
for a range of application domains. An extended version of this paper, with the
complete model checking algorithm, is available [23].
Related work. Nash equilibria have been considered for concurrent systems
in [18], where a temporal logic is proposed whose key operator is a novel path
quantifier which asserts that a property holds on all Nash equilibrium computa-
tions of the system. There is no stochasticity and correlated equilibria are not
considered. In [2], a probabilistic logic that can express equilibria is formulated,
along with complexity results, but no implementation has been provided.
The notion of fairness studied here is inspired by fairness of equilibria from
economics [33,34] and aims to minimise the difference between the payoffs, as
opposed to maximising the lowest payoff among the players in an NE [25]. Our
notion of fairness can be thought of as a constraint applied to equilibria strate-
gies, similar in style to social welfare, and used to select certain equilibria based
on optimality. This is distinct from fairness used in verification of concurrent
processes, where (strong) fairness refers to a property stating that, whenever a
process is enabled infinitely often, it is executed infinitely often. This notion is
typically defined as a constraint on infinite execution paths expressible in the
logics LTL and CTL* and needed to prove liveness properties. For probabilistic
models, verification under fairness constraints has been formulated for Markov
decision processes and the logic PCTL* [5,4]. For games on graphs, fairness
conditions expressed as ω-regular winning conditions can be used to synthesise
reactive processes [8]. Algorithms for strong transition fairness for ω-regular
games have been recently studied in [6]. Both qualitative and quantitative
approaches have been considered for verification under fairness constraints, but
not equilibria.
2 Normal Form Games
We start by considering normal form games (NFGs), then define our equilibria
concepts for these games, present algorithms and an implementation for com-
puting them, and finally summarise some experimental results.
We rst require the following notation. Let Dist(X)denote the set of prob-
ability distributions over set X. For any vector vRn, we use v(i)to refer
to the ith entry of the vector. For any tuple x= (x1, . . . , xn)Xn, element
xXand in, we dene the tuples xi
def
= (x1, . . . , xi1, xi+1, . . . , xn)and
xi[x]def
= (x1, . . . , xi1, x, xi+1, . . . , xn).
Denition 1 (Normal form game). A (nite, n-person) normal form game
(NFG) is a tuple N= (N, A, u)where: N={1, . . . , n}is a nite set of players;
A=A1× · · · ×Anand Aiis a nite set of actions available to player iN;
u= (u1, . . . , un)and ui:ARis a utility function for player iN.
We x an NFG N= (N, A, u)for the remainder of this section. In a play of N,
each player iNchooses an action from the set Aiat the same time. If each
player ichooses ai, then the utility received by player jequals uj(a1, . . . , an).
We next dene the strategies for players of Nand strategy proles comprising
a strategy for each player. We also dene correlated proles, which allow the
players to coordinate their choices through a (probabilistic) public signal.
Denition 2 (Strategy and prole). Astrategy σifor player iis an element
of Σi=Dist(Ai)and a strategy prole σis an element of ΣN=Σ1× · · · ×Σn.
For strategy σiof player i, the support is the set of actions {aiAi|σi(ai)>0}
and the support of a prole is the product of the supports of the strategies.
Denition 3 (Correlated prole). Acorrelated prole is a tuple (τ , ς)com-
prising τDist(D), where D=D1× · · · ×Dn,Diis a nite set of signals for
player i, and ς= (ς1, . . . , ςn), where ςi:DiAi.
For a correlated prole (τ, ς ), the public signal τis a joint distribution over
signals Difor each player isuch that, if player ireceives the signal diDi, then
it chooses action ςi(di). We can consider any correlated prole (τ, ς )as a joint
strategy, i.e., a distribution over A1× · · · ×Anwhere:
(τ, ς )(a1, . . . , an) = {τ(d1, . . . , dn)|diDiς(di) = aifor all iN}.
Conversely, any joint strategy τ ∈ Dist(A_1 × · · · × A_n) can be considered as a
correlated profile (τ, ς) where D_i = A_i and ς_i is the identity function for i ∈ N.
Any strategy profile σ can be mapped to an equivalent correlated profile (in
which τ is the joint distribution σ_1 × · · · × σ_n and ς_i is the identity function). On
the other hand, there are correlated profiles with no equivalent strategy profile.
Under prole σand correlated prole (τ, ς )the expected utilities of player iare:
ui(σ)def
=(a1,...,an)Aui(a1, . . . , an)·(n
j=1 σj(aj))
ui(τ, ς )def
=(d1,...,dn)Dτ(d1, . . . , dn)·ui(ς1(d1), . . . , ςn(dn)) .
Example 1. Consider the two-player NFG where A_i = {a^i_1, a^i_2} and a corre-
lated profile corresponding to the joint distribution τ ∈ Dist(A_1 × A_2) where
τ(a^1_1, a^2_1) = τ(a^1_2, a^2_2) = 0.5. Under this correlated profile the players share a fair
coin and both choose their first action if the coin is heads and their second action
otherwise. This has no equivalent strategy profile.
Optimal equilibria of NFGs. We now introduce the notions of Nash equilib-
rium [27] and correlated equilibrium [3], as well as different definitions of opti-
mality for these equilibria: social welfare and social fairness. Using the notation
introduced above for tuples, for any profile σ and strategy σ′_i, the strategy tuple
σ_{−i} corresponds to σ with the strategy of player i removed and σ_{−i}[σ′_i] to the
profile σ after replacing player i's strategy with σ′_i.
Denition 4 (Best response). For a prole σand correlated prole (τ , ς), a
best response for player ito σiand (τ, ςi)are, respectively:
a strategy σ
ifor player isuch that ui(σi[σ
i]) ui(σi[σi]) for all σiΣi;
a function ς
i:DiAifor player isuch that ui(τ, ςi[ς
i]) ui(τ, ςi[ςi])
for all functions ςi:DiAi.
Denition 5 (NE and CE). A strategy prole σis a Nash equilibrium (NE)
and a correlated prole (τ, ς )is a correlated equilibrium (CE) if:
σ
iis a best response to σ
ifor all iN;
ς
iis a best response to (τ, ς
i)for all iN;
respectively. We denote by ΣNand ΣCthe set of NE and CE, respectively.
  α                   u1(α)    u2(α)    u3(α)
  (pro1,pro2,pro3)    −1000    −1000    −100
  (pro1,pro2,yld3)    −1000    −100     −5
  (pro1,yld2,pro3)     5       −5        5
  (pro1,yld2,yld3)     5       −5       −5
  (yld1,pro2,pro3)    −5       −1000    −100
  (yld1,pro2,yld3)    −5        5       −5
  (yld1,yld2,pro3)    −5       −5        5
  (yld1,yld2,yld3)    −10      −10      −10

Fig. 1: Example: Cars at an intersection and the corresponding NFG.
Any NE of N is also a CE, while there can exist CEs that cannot be represented
by a strategy profile and therefore are not NEs. For each class of equilibria,
NE and CE, we introduce two optimality criteria, the first maximising social
welfare (SW), defined as the sum of the utilities, and the second maximising
social fairness (SF), which minimises the difference between the players' utilities.
Other variants of fairness have been considered for NE, such as in [25], where
the authors seek to maximise the lowest utility among the players.
Denition 6 (SW and SF). An equilibrium σis a social welfare (SW) equi-
librium if the sum of the utilities of the players under σis maximal over all
equilibria, while σis a social fair (SF) equilibrium if the dierence between the
player’s utilities under σis minimised over all equilibria.
We can also dene the dual concept of cost equilibria [24], where players try to
minimise, rather than maximise, their expected utilities by considering equilibria
of the game N= (N, A, u)in which the utilities of Nare negated.
Example 2. Consider the scenario, based on an example from [32], where three
cars meet at an intersection and want to proceed as indicated by the arrows
in Figure 1. Each car can either proceed or yield. If two cars with intersecting
paths proceed, then there is an accident. If an accident occurs, the car having
the right of way, i.e., the other car is to its right, has a utility of −100 and the
car that should yield has a utility of −1000. If a car proceeds without causing an
accident, then its utility is 5 and the cars that yield have a utility of −5. If all
cars yield, then, since this delays all cars, all have utility −10. The 3-player NFG
is given in Figure 1. Considering the different optimal equilibria of the NFG:
– the SWNE and SWCE are the same: for c2 to yield and c1 and c3 to proceed,
  with the expected utilities (5, −5, 5);
– the SFNE is for c1 to yield with probability 1, c2 to yield with probability
  0.863636 and c3 to yield with probability 0.985148, with the expected utilities
  (−9.254050, −9.925742, −9.318182);
– the SFCE gives a joint distribution where the probability of c2 yielding and
  of c1 and c3 yielding are both 0.5, with the expected utilities (0, 0, 0).
Modifying u2 such that u2(pro1, pro2, pro3) = −4.5 to, e.g., represent a reckless
driver, the SWNE becomes for c1 and c3 to yield and c2 to proceed, with the
expected utilities (−5, 5, −5), while the SWCE is still for c2 to yield and c1 and
c3 to proceed. The SFNE and SFCE also do not change.
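As a quick check of the SFCE values: the joint distribution puts probability 0.5 on the outcome (pro1, yld2, pro3), with utilities (5, −5, 5), and 0.5 on (yld1, pro2, yld3), with utilities (−5, 5, −5), so

\[
  \tfrac{1}{2}\,(5,-5,5) + \tfrac{1}{2}\,(-5,5,-5) = (0,0,0),
\]

matching the reported expected utilities and making the difference between the players' utilities zero.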
Algorithms for computing equilibria. Before we give our algorithm to com-
pute correlated equilibria, we briefly describe the approach of [21,24] for Nash
equilibria computation that this paper builds upon. Finding NE in two-player
NFGs is in the class of linear complementarity problems (LCPs) and we follow
the algorithm presented in [24], which reduces the problem to SMT via labelled
polytopes [28] by considering the regions of the strategy profile space, itera-
tively reducing the search space as positive probability assignments are found
and added as restrictions on this space. To find SWNE and SFNE, we can enu-
merate all NE and then find the optimal NE.
When there are more than two players, computing NE values becomes a more
complex task, as finding NE within a given support no longer reduces to a linear
programming (LP) problem. In [21] we presented an algorithm using support
enumeration [31], which exhaustively examines all sub-regions, i.e., supports,
of the strategy profile space, one at a time, checking whether that sub-region
contains NEs. For each support, finding SWNE can be reduced to a nonlinear
programming problem [21]. This nonlinear programming problem can be modified
to find SFNE in each support, similarly to how the LP problem for SWCEs is
modified to find SFCEs below.
In the case of CE we can first find a joint strategy for the players, i.e.,
a distribution over the action tuples, which, as explained above, can then be
mapped to a correlated profile. A SWCE can be found by solving the following
LP problem. Maximise Σ_{i∈N} Σ_{α∈A} u_i(α) · p_α subject to:

  Σ_{α_{−i}∈A_{−i}} (u_i(α_{−i}[a_i]) − u_i(α_{−i}[a′_i])) · p_{α_{−i}[a_i]} ≥ 0    (1)
  0 ≤ p_α ≤ 1                                                      (2)
  Σ_{α∈A} p_α = 1                                                  (3)

for all i ∈ N, α ∈ A, a_i, a′_i ∈ A_i and α_{−i} ∈ A_{−i}, where A_{−i} := {α_{−i} | α ∈ A}.
The variables p_α represent the probability of the joint strategy corresponding
to the correlated profile selecting the action-tuple α. The above LP has |A|
variables, one for each action-tuple, and Σ_{i∈N}(|A_i|² − |A_i|) + |A| + 1 constraints.
Computation of SFCE can be reduced to the following optimisation problem.
Minimise p_max − p_min subject to (1), (2) and (3), together with:

  p_i = Σ_{α∈A} p_α · u_i(α)                      (4)
  (∧_{m∈N} p_i ≥ p_m) → (p_max = p_i)             (5)
  (∧_{m∈N} p_i ≤ p_m) → (p_min = p_i)             (6)

for all i ∈ N, m ≠ i, α ∈ A, a_j, a_l ∈ A_i and α_{−i} ∈ A_{−i}. Again, the variables p_α in
the program represent the probability of the players playing the joint action α.
The constraint (4) requires p_i to equal the utility of player i. The constraints
(5) and (6) set p_max and p_min as the maximum and minimum values within the
utilities of the players, respectively. Given we use the constraints (1), (2) and
(3), we start with the same number of variables and constraints as needed to
compute SWCEs and incur an additional |N| + 2 variables and 3·|N| constraints.
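To make the LP concrete, the following self-contained C sketch instantiates constraints (1)-(3) for the textbook two-player "chicken" game (the payoffs are the standard ones from the game theory literature, not from this paper) and maximises social welfare; it uses the open-source GLPK library purely for illustration, whereas the implementation described below relies on Gurobi or Z3. Build with, e.g., gcc swce.c -lglpk.

    #include <stdio.h>
    #include <glpk.h>

    /* payoffs of a 2x2 "chicken" game: action 0 = dare, 1 = chicken */
    static const double u[2][2][2] = {
      {{0, 7}, {2, 6}},   /* u[0][a1][a2]: player 1 */
      {{0, 2}, {7, 6}},   /* u[1][a1][a2]: player 2 */
    };

    static int col(int a1, int a2) { return 1 + 2 * a1 + a2; } /* GLPK is 1-based */

    int main(void) {
      glp_prob *lp = glp_create_prob();
      glp_term_out(GLP_OFF);
      glp_set_obj_dir(lp, GLP_MAX);
      glp_add_cols(lp, 4);               /* one variable p_alpha per joint action */
      for (int a1 = 0; a1 < 2; a1++)
        for (int a2 = 0; a2 < 2; a2++) {
          glp_set_col_bnds(lp, col(a1, a2), GLP_DB, 0.0, 1.0);            /* (2) */
          glp_set_obj_coef(lp, col(a1, a2), u[0][a1][a2] + u[1][a1][a2]);
        }
      glp_add_rows(lp, 5);               /* 4 incentive constraints (1) plus (3) */
      int ia[13], ja[13]; double ar[13]; int k = 0, row = 0;
      for (int i = 0; i < 2; i++)        /* player i */
        for (int a = 0; a < 2; a++) {    /* recommended action a, deviation 1-a */
          glp_set_row_bnds(lp, ++row, GLP_LO, 0.0, 0.0);
          for (int b = 0; b < 2; b++) {  /* the other player's action */
            int a1 = (i == 0) ? a : b, a2 = (i == 0) ? b : a;
            int d1 = (i == 0) ? 1 - a : b, d2 = (i == 0) ? b : 1 - a;
            ++k; ia[k] = row; ja[k] = col(a1, a2);
            ar[k] = u[i][a1][a2] - u[i][d1][d2]; /* gain of following the signal */
          }
        }
      glp_set_row_bnds(lp, ++row, GLP_FX, 1.0, 1.0);  /* (3): a distribution */
      for (int j = 1; j <= 4; j++) { ++k; ia[k] = row; ja[k] = j; ar[k] = 1.0; }
      glp_load_matrix(lp, k, ia, ja, ar);
      glp_simplex(lp, NULL);
      printf("social welfare = %g\n", glp_get_obj_val(lp));
      for (int j = 1; j <= 4; j++) printf("p[%d] = %g\n", j, glp_get_col_prim(lp, j));
      glp_delete_prob(lp);
      return 0;
    }

For these payoffs the unique optimum is p = (0, 1/4, 1/4, 1/2) over the joint actions (dare,dare), (dare,chicken), (chicken,dare), (chicken,chicken), with social welfare 10.5; this CE is not a product distribution, so no NE achieves it.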
  Game             Players  |Ai|  |A|    NE supports  NE SW (s)  CE SW (s)  CE SF (s)
  Majority voting  2        4     16     225          0.07       0.02       0.08
  games            2        6     36     3,969        0.1        0.02       0.1
                   2        8     64     65,025       0.4        0.03       0.3
                   2        10    100    1,046,529    5.8        0.07       0.7
                   3        3     27     343          1.2        0.07       0.1
                   3        4     81     3,375        25.8       0.08       0.3
  Covariant        3        3     27     343          8.7        0.08       1.7
  games            3        4     81     3,375        598.5      0.08       2.9
                   8        2     256    6,561        TO         0.3        TO
                   8        3     6,561  5,764,801    TO         22.8       TO
                   10       2     1,024  59,049       TO         1.2        TO

Table 1: Times (s) for synthesis of equilibria in NFGs (timeout 30 mins).
Implementation. To find SWNE or SFNE of two-player NFGs, we adopt a
similar approach to [24], using labelled polytopes to characterise and find NE
values through a reduction to SMT in both Z3 [13] and Yices [14]. As an op-
timised precomputation step, when possible we also search for and filter out
dominated strategies, which speeds up the computation and reduces solver calls.
For NFGs with more than two players, solving the nonlinear programming
problem based on support enumeration has been implemented in [21] using a
combination of the SMT solver Z3 [13] and the nonlinear optimisation suite
Ipopt [38]. To mitigate the inefficiencies of an SMT solver for such problems,
we used Z3 to filter out unsatisfiable support assignments with a timeout and
then Ipopt is called to find SWNE values using an interior-point filter line-search
algorithm [39]. To speed up the overall computation, the support assignments are
analysed in parallel. Computing SFNE increases the complexity of the nonlinear
program and, due to the inefficiency in this approach [21], we have not extended
the implementation to compute SFNE.
As shown above, computing SWCE for NFGs reduces to solving an LP, and
we implement this using either the optimisation solver Gurobi [17] or the SMT
solver Z3 [13]. In the case of SFCE, the constraints (5) and (6) include impli-
cations, and therefore the problem does not reduce directly to an LP. When
using Z3, we can encode these constraints directly as it supports assertions that
combine inequalities with logical implications, a feature that linear solvers such
as Gurobi do not have. Section 5 discusses implementing SFCE computation in
Gurobi. Both solvers support the specification of lower-priority or soft objectives,
which makes it possible to have a consistent ordering for the players' payoffs in
cases where multiple equilibria exist.
Eciency and scalability. Table 1presents experimental results for solving
a selection of NFGs randomly generated with GAMUT [29], using Gurobi for
SWCE and NE of two-player NFGs, Z3 for SFCE and both Ipopt and Z3 for
NFGs of more than two players, and running on a 2.10GHz Intel Xeon Gold with
32GB of JVM memory. For each instance, Table 1lists the number of players,
actions for each player, joint actions and supports that need to be enumerated
when nding NE, as well as the time to nd SWNEs, SWCEs and SFCEs (the
time for nding SFNEs of two-player games is the same as for SWNEs). As the
results demonstrate, due to a simpler problem being solved and the fact that we
66 Marta Kwiatkowska, Gethin Norman, David Parker, Gabriel Santos
do not need to enumerate the solutions, computing CEs scales far better than
NEs as the number of players and actions increases. Finding NEs in games with
more than two players is particularly hard as the constraints are nonlinear. We
also see that SFCE computation is slower than SWCE, which is caused by the
additional variables and constraints required when nding SFCE and using Z3
rather than Gurobi for the solver.
3 Concurrent Stochastic Games
We now further develop our approach to support concurrent stochastic games
(CSGs) [36], in which players repeatedly make simultaneous action choices that
cause the game's state to be updated probabilistically. We extend the previously
introduced definitions of optimal equilibria to such games, focusing on subgame-
perfect equilibria, which are equilibria in every state of a CSG. We then present
algorithms to reason about and synthesise such equilibria.
Denition 7 (Concurrent stochastic game). Aconcurrent stochastic multi-
player game (CSG) is a tuple G= (N, S, ¯
S, A, ∆, δ, AP ,L)where:
N={1, . . . , n}is a nite set of players;
Sis a nite set of states and ¯
SSis a set of initial states;
A= (A1 {⊥})× · · · ×(An {⊥})and Aiis a nite set of actions available
to player iNand is an idle action disjoint from the set n
i=1Ai;
:S2n
i=1Aiis an action assignment function;
δ: (S×A)Dist(S)is a (partial) probabilistic transition function;
AP is a set of atomic propositions and L:S2AP is a labelling function.
For the remainder of this section we fix a CSG G as in Definition 7. The game
G starts in one of its initial states s̄ ∈ S̄ and, supposing G is in a state s, each
player i of G chooses an action from the set of available actions, defined
as A_i(s) := Δ(s) ∩ A_i if Δ(s) ∩ A_i is non-empty and A_i(s) := {⊥} otherwise.
Supposing each player chooses a_i, the game transitions to state s′ with
probability δ(s, (a_1, . . . , a_n))(s′). To enable quantitative analysis of G we augment it
with reward structures, which are tuples r = (r_A, r_S) of an action reward function
r_A : S × A → R and a state reward function r_S : S → R.
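For illustration only, a CSG in the sense of Definition 7, together with a reward structure, could be laid out in memory along the following lines; this is a hypothetical C sketch with invented field names (the PRISM-games engine itself is a sparse-matrix-based Java implementation, as described in Section 4):

    #include <stddef.h>

    typedef struct { int state; double prob; } Successor;

    typedef struct {
      int *joint_action;    /* one action index per player; -1 encodes idle (⊥) */
      Successor *succ;      /* support of the distribution δ(s, α) */
      size_t n_succ;
      double action_reward; /* r_A(s, α) */
    } Transition;

    typedef struct {
      size_t n_players, n_states;
      int *initial;          /* indices of the initial states */
      size_t n_initial;
      Transition **trans;    /* trans[s]: joint actions enabled in s (via Δ) */
      size_t *n_trans;
      double *state_reward;  /* r_S(s) */
      unsigned long *labels; /* per-state bitset over atomic propositions (L) */
    } CSG;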
A path of G is a sequence π = s_0 −α_0→ s_1 −α_1→ · · · where s_k ∈ S, α_k =
(a^k_1, . . . , a^k_n) ∈ A, a^k_i ∈ A_i(s_k) for i ∈ N and δ(s_k, α_k)(s_{k+1}) > 0 for all k ≥ 0.
We denote by FPaths_{G,s} and IPaths_{G,s} the sets of finite and infinite paths
starting in state s of G respectively, and drop the subscript s when considering
all finite and infinite paths of G. As for NFGs, we can define strategies of G
that resolve the choices of the players. Here, a strategy for player i is a function
σ_i : FPaths_G → Dist(A_i ∪ {⊥}) such that, if σ_i(π)(a_i) > 0, then a_i ∈ A_i(last(π)),
where last(π) is the final state of π. Furthermore, we can define strategy profiles,
correlated profiles and joint strategies analogously to Definitions 2 and 3.
The utility of a player i of G is defined by a random variable X_i : IPaths_G → R
over infinite paths. For a profile⁴ σ and state s, using standard techniques [20],
we can construct a probability measure Prob^σ_{G,s} over the paths with initial state s
corresponding to σ, denoted IPaths^σ_{G,s}, and the expected value E^σ_{G,s}(X_i) of player
i's utility from s under σ. Given utilities X_1, . . . , X_n for all the players of G, we
can then define NE and CE (see Definition 5) as well as the restricted classes of
SW and SF equilibria as for NFGs (see Definition 6). Following [24,21], we focus
on subgame-perfect equilibria [30], which are equilibria in every state of G.
Nonzero-sum properties. As in [24] (for two-player CSGs) and [21] (for n-
player CSGs) we can specify equilibria-based properties using temporal logic.
For simplicity, we restrict attention to nonzero-sum properties without nesting,
allowing for the specification of NE and CE against either SW or SF optimality.
Denition 8 (Nonzero-sum specications). The syntax of nonzero-sum spec-
ications θfor CSGs is given by the grammar:
ϕ:=C(1, 2)optx(θ)
θ:=P[ψ]+· · ·+P[ψ]|Rr[ρ]+· · ·+Rr[ρ]
ψ:=Xa|aUka|aUa
ρ:=I=k|Ck|Fa
where C=C1:· · · :Cm,C1, . . . , Cmare coalitions of players such that CiCj=
for all 1i=jmand m
i=1Ci=N,(1, 2) {ne,ce}×{sw,sf},opt
{min,max}, {<, ,, >},xQ,ris a reward structure, kNand ais
an atomic proposition.
The nonzero-sum formulae of Definition 8 extend the logic of [24,21] in that
we can now specify the type of equilibria, NE or CE, and the optimality criterion,
SW or SF. A probabilistic formula ⟨⟨C_1:· · ·:C_m⟩⟩(⋆1,⋆2)max∼x(P[ψ_1]+· · ·+P[ψ_m]) is
true in a state if, when the players form the coalitions C_1, . . . , C_m, there is a
subgame-perfect equilibrium of type ⋆1 meeting the optimality criterion ⋆2 for
which the sum of the values of the objectives P[ψ_1], . . . , P[ψ_m] for the coalitions
C_1, . . . , C_m satisfies ∼x. The objective ψ_i of coalition C_i is either a next (X a),
bounded until (a_1 U^{≤k} a_2) or until (a_1 U a_2) formula, with the usual equivalences,
e.g., F a ≡ true U a.
For a reward formula ⟨⟨C_1:· · ·:C_m⟩⟩(⋆1,⋆2)opt∼x(R^{r_1}[ρ_1]+· · ·+R^{r_m}[ρ_m]) the
meaning is similar; however, here the objective of coalition C_i refers to a re-
ward formula ρ_i with respect to reward structure r_i, and this formula is either
a bounded instantaneous reward (I^{=k}), bounded accumulated reward (C^{≤k}) or
reachability reward (F a).
For formulae of the form ⟨⟨C_1:· · ·:C_m⟩⟩(⋆1,⋆2)min∼x(θ), the dual notions of
cost equilibria are considered. We also allow numerical queries of the form
⟨⟨C_1:· · ·:C_m⟩⟩(⋆1,⋆2)opt=?(θ), which return the sum of the optimal subgame-
perfect equilibrium's values.
⁴ We can also construct such a probability measure and expected value given a
correlated profile or joint strategy.
Model checking nonzero-sum specifications. Similarly to [24,21], to allow
model checking of nonzero-sum properties we consider a restricted class of CSGs.
We make the following assumption, which can be checked using graph algorithms
with time complexity quadratic in the size of the state space [1].
Assumption 1. For each subformula P[a_1 U a_2], a state labelled ¬a_1 ∨ a_2 is
reached with probability 1 from all states under all strategy profiles and correlated
profiles. For each subformula R^r[F a], a state labelled a is reached with probability
1 from all states under all strategy profiles and correlated profiles.
We now show how to compute the optimal values of a nonzero-sum formula
ϕ = ⟨⟨C_1:· · ·:C_m⟩⟩(⋆1,⋆2)opt∼x(θ) when opt = max. The case when opt = min
can be computed by negating all utilities and maximising.
The model checking algorithm broadly follows those presented in [24,21], with
the differences described below. The problem is reduced to solving an m-player
coalition game G^C where C = {C_1, . . . , C_m} and the choices of each player i in G^C
correspond to the choices of the players in coalition C_i in G. Formally, we have
the following definition in which, without loss of generality, we assume C is of
the form {{1, . . . , n_1}, {n_1+1, . . . , n_2}, . . . , {n_{m−1}+1, . . . , n_m}} and let j_C denote
player j's position in its coalition.
Denition 9 (Coalition game). For CSG G= (N, S, ¯
S, A, ∆, δ, AP ,L)and
partition C={C1, . . . , Cm}of the players into mcoalitions, we dene the coali-
tion game GC= ({1, . . . , m}, S, ¯
S, AC, C, δ C,AP,L)as an m-player CSG where:
AC= (AC
1 {⊥})× · · · ×(AC
m {⊥});
AC
i= (jCi(Aj {⊥})\ {(, . . . , )})for all 1im;
for any sSand 1im:aC
iC(s)if and only if either (s)Aj=
and aC
i(jC) = or aC
i(jC)(s)for all jCi;
for any sSand (aC
1, . . . , aC
m)AC:δC(s, (aC
1, . . . , aC
m)) = δ(s, (a1, . . . , an))
where for iMand jCiif aC
i=, then aj=and otherwise aj=aC
i(jC).
If all the objectives in θ are finite-horizon, backward induction [35,27] can be ap-
plied to compute (precise) optimal equilibria values with respect to the criterion
⋆2 and equilibria type ⋆1. On the other hand, if all the objectives are infinite-
horizon, value iteration [9] can be used to approximate optimal equilibria values
and, when there is a combination of objectives, the game under study is modified
in a standard manner to make all objectives infinite-horizon.
Backward induction and value iteration over the CSG G^C both work by iter-
atively computing new values for each state s of G^C. The values for each state,
in each iteration, are found by computing optimal equilibria values of an NFG N
whose utility function is derived from the outgoing transition probabilities from
s in the CSG and the values computed for successor states of s in the previous
iteration. The difference here, with respect to [21], is that the NFGs are solved
for the additional equilibria and optimality conditions considered in this paper,
which we compute using the algorithms presented in Section 2.
Algorithm for probabilistic until. Because of space limitations, we only
present here the details of value iteration for (unbounded) probabilistic until, i.e.,
for ϕ = ⟨⟨C_1:· · ·:C_m⟩⟩(⋆1,⋆2)max∼x(θ) where θ = P[a^1_1 U a^1_2]+· · ·+P[a^m_1 U a^m_2].
The complete model checking algorithm can be found in [23].
Following [21], we use V_{G^C}(s, ⋆1, ⋆2, θ, n) to denote the vector of computed
values, at iteration n, in state s of G^C for optimality criterion ⋆2 (SW or SF),
equilibria type ⋆1 (NE or CE) and (until) objectives θ. We also use 1_m and 0_m
to denote a vector of size m whose entries all equal 1 or 0, respectively. For
any set of states S′, atomic proposition a and state s, we let η_{S′}(s) equal 1 if
s ∈ S′ and 0 otherwise, and η_a(s) equal 1 if a ∈ L(s) and 0 otherwise.
Each step of value iteration also keeps track of two sets D, E ⊆ M, where
M = {1, . . . , m} are the players of G^C. We use D for the subset of players that
have already reached their goal (by satisfying a^i_2) and E for the players who
can no longer satisfy their goal (having reached a state that fails to satisfy
a^i_1). It can then be ensured that their payoffs no longer change and are set to 1
or 0, respectively. In these cases, we effectively consider a modified game where,
although the payoffs for these players are set, we still need to take their strategies
into account in order to guarantee an optimal equilibrium.
Optimal values for all states s in the CSG G^C can be computed as the follow-
ing limit: V_{G^C}(s, ⋆1, ⋆2, θ) = lim_{n→∞} V_{G^C}(s, ⋆1, ⋆2, θ, n), where V_{G^C}(s, ⋆1, ⋆2, θ, n) =
V_{G^C}(s, ⋆1, ⋆2, ∅, ∅, θ, n) and, for any D, E ⊆ M such that D ∩ E = ∅:

V_{G^C}(s, ⋆1, ⋆2, D, E, θ, n) =
  (η_D(1), . . . , η_D(m))                  if D ∪ E = M
  (η_{a^1_2}(s), . . . , η_{a^m_2}(s))      else if n = 0
  V_{G^C}(s, ⋆1, ⋆2, D ∪ D′, E, θ, n)       else if D′ ≠ ∅
  V_{G^C}(s, ⋆1, ⋆2, D, E ∪ E′, θ, n)       else if E′ ≠ ∅
  val(N, ⋆1, ⋆2)                            otherwise

where D′ = {l ∈ M \ (D ∪ E) | a^l_2 ∈ L(s)}, E′ = {l ∈ M \ (D ∪ E) | a^l_1 ∉
L(s) and a^l_2 ∉ L(s)} and val(N, ⋆1, ⋆2) equals the optimal values of the NFG N =
(M, A^C, u) with respect to the criterion ⋆2 and equilibria type ⋆1, in which for
any 1 ≤ l ≤ m and α ∈ A^C:

u_l(α) =
  1                                           if l ∈ D
  0                                           else if l ∈ E
  Σ_{s′∈S} δ^C(s, α)(s′) · v^{s′,l}_{n−1}     otherwise

and (v^{s′,1}_{n−1}, v^{s′,2}_{n−1}, . . . , v^{s′,m}_{n−1}) = V_{G^C}(s′, ⋆1, ⋆2, D, E, θ, n−1) for all s′ ∈ S.
Since this paper considers equilibria for any number of coalitions (in par-
ticular, for more than two), the above follows the algorithm of [21] in the way
that it keeps track of the coalitions that have satisfied their objective (D) or can
no longer do so (E). By contrast, the CSG algorithm of [24] was limited to two
coalitions, which enabled the exploitation of efficient MDP analysis techniques
for such coalitions. As explained in [21], in such a scenario we cannot reduce the
analysis from an n-coalition game to an (n−1)-coalition game, as otherwise we
would give one of the remaining coalitions additional power (the action choices
of the coalition that has satisfied their objective or can no longer do so), which
would therefore give this coalition an advantage over the other coalitions.
Strategy synthesis. As in [24,21] we can extend the model checking algorithm
to perform strategy synthesis, generating a witness (i.e., a profile or joint strat-
egy) representing the corresponding optimal equilibrium. This is achieved by
storing the profile or joint strategy for the NFG solved in each state. Both the
profiles and joint strategies require finite memory and are probabilistic. Memory
is required as choices change after a path formula becomes true or a target is
reached, and to keep track of the step bound in finite-horizon properties. Ran-
domisation is required for both NE and CE of NFGs.
Correctness and complexity. The correctness of the algorithm follows directly
from [24,21], as changing the class of equilibria or optimality criterion does not
change the proof. The complexity of the algorithm is linear in the formula size,
and value iteration requires finding optimal NE or CE for an NFG in each state
of the model. Computing NEs of an NFG with two (or more) players is PPAD-
complete [12,11], while finding optimal CEs of an NFG is in P [15].
4 Case Studies and Experimental Results
We have developed an implementation of our techniques for equilibria synthe-
sis on CSGs, described above, building on top of the PRISM-games [22] model
checker. Our implementation extends the tool's existing support for construction
and analysis of CSGs, which is contained within its sparse matrix based "explicit"
engine written in Java. We have considered a range of CSG case studies (supple-
mentary material can be found at [40]). Below, we summarise the efficiency and
scalability of our approach, again running on a 2.10GHz Intel Xeon Gold with
32GB JVM memory, and then describe our findings on individual case studies.
Eciency and scalability. Table 2summarises the performance of our imple-
mentation on the case studies that we have considered. It shows the statistics for
each CSG, and the time taken to build it and perform equilibria synthesis, for
several dierent variants (NE vs. CE, SW vs. SF). Comparing the eciency of
synthesising SWNE and SWCE, we see that the latter is typically much faster.
For two-player NE, the social fairness variant is no more expensive to compute as
we enumerate all NEs. For CE, which uses Z3 rather than Gurobi for nding SF,
we note that, although Z3 is able to nd optimal equilibria, it is not primarily
developed as an optimisation suite, and therefore generally performs poorly in
comparison with Gurobi. The benets of the social fair equilibria, in terms of
the values yielded for individual players, are discussed in the in-depth coverage
of the dierent case studies below.
Aloha. In this case study, introduced in [24], a number of users try to send
packets using the slotted Aloha protocol. We suppose that each user has one
packet to send and, in a time slot, if k users try and send their packet, then
the probability that each packet is successfully sent is q/k where q ∈ [0, 1]. If a
user fails to send a packet, then the number of slots it waits before resending
the packet is set according to Aloha's exponential backoff scheme. The scheme
requires that each user maintains a backoff counter, which it increases each time
Correlated Equilibria and Fairness in Concurrent Stochastic Games 71
Case study & property [parameters]           Players  Eq.,opt.  Param. values  States     Trans.     Constr. (s)  Verif. (s)
Aloha                                           2      ne,sw     4,0.8          2,778      6,285      0.1          2.2
⟨⟨usr1:···:usrm⟩⟩min=?(R^{time}[F s_i])         2      ce,sw                                                       2.1
[bmax, q]                                       2      ne,sf                                                       2.1
                                                2      ce,sf                                                       23.3
                                                3      ce,sw     4,0.8          107,799    355,734    3.0          80.1
                                                3      ce,sf                                                       114.6
                                                4      ne,sw     2,0.8          68,689     161,904    1.9          1042.9
                                                4      ce,sw                                                       58.8
Aloha (deadline)                                4      ne,sw     2,0.8,8        159,892    388,133    3.9          1027.5
⟨⟨usr1:···:usrm⟩⟩max=?(P[F (s_i ∧ t ≤ D)])      4      ce,sw                                                       224.5
[bmax, q, D]                                    5      ce,sw     2,0.8,8        1,797,742  5,236,655  54.5         4,936.8
                                                5      ce,sf                                                       TO
Power control                                   2      ne,sw     8,40,0.2       32,812     260,924    1.2          564.5
⟨⟨p1:···:pm⟩⟩max=?(R^{r_i}[F e_i])              2      ne,sf                                                       566.3
[powmax, emax, qfail]                           2      ce,sw                                                       177.9
                                                3      ce,sw     5,15,0.2       42,156     740,758    3.5          147.0
                                                3      ce,sf                                                       TO
Public good                                     3      ne,sw     2.5,3          16,202     35,884     0.8          27.5
⟨⟨p1:···:pm⟩⟩max=?(R^{c_i}[I=rmax])             3      ce,sw                                                       1.9
[f, rmax]                                       4      ne,sw     3,3            391,961    923,401    13.0         71.9
                                                4      ce,sw                                                       35.3
                                                5      ce,sw     4,2            59,294     118,342    3.1          5.2
Investors                                       2      ce,sw     0.2,8          71,731     315,804    2.4          47.5
⟨⟨inv1:···:invm⟩⟩max=?(R^{pf_i}[F cin_i])       2      ce,sf                                                       2,401.9
[pbar, months]                                  3      ce,sw     0.2,5          83,081     462,920    3.6          79.3
                                                3      ce,sf                                                       861.2

Table 2: Statistics for a set of CSG verification instances (timeout 2 hours). Blank
model-statistics cells repeat the values of the row above for the same CSG.
there is a packet failure (up to bmax) and, if the counter equals k and a failure
occurs, randomly chooses the number of slots to wait from {0, 1, . . . , 2^k − 1}.
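For illustration, a minimal sketch of this backoff scheme (names and structure are ours, not the PRISM-games model):

# Sketch of one user's slotted-Aloha exponential backoff, as described above.
# Illustrative names; not taken from the PRISM-games model.
import random

def backoff_slots(counter: int) -> int:
    """After a failure with backoff counter k, wait a uniform number of
    slots from {0, 1, ..., 2**k - 1}."""
    return random.randrange(2 ** counter)

def on_failure(counter: int, bmax: int) -> int:
    """Increase the backoff counter on a packet failure, capped at bmax."""
    return min(counter + 1, bmax)

# Example: a user failing twice with bmax = 4.
k = 0
for _ in range(2):
    k = on_failure(k, bmax=4)
    print(f"counter={k}, wait={backoff_slots(k)} slots")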
We suppose that the objective of each user is to minimise the expected
time to send their packet, which is represented by the nonzero-sum formula
⟨⟨usr1:···:usrm⟩⟩min=?(R^{time}[F s1]+···+R^{time}[F sm]). Synthesising opti-
mal strategies for this specification, we find that the cases for SWNE and SWCE
coincide (although SWCE returns a joint strategy for the players, this joint strat-
egy can be separated to form a strategy profile). This profile requires one user
to try and send first, and then for the remaining users to take turns to try and
send afterwards. If a user fails to send, then they enter backoff and allow all
remaining users to try and send before trying to send again. There is no gain to
a user in trying to send at the same time as another, as this will increase the
probability of a sending failure, and therefore the user having to spend time in
backoff before getting to try again. For SFNE, which has only been implemented
for the two-player case, the two users follow identical strategies, which involve
randomly deciding whether to wait or transmit, unless they are the only user
that has not transmitted, and then they always try to send when not in backoff.
In the case of SFCE, users can employ a shared probabilistic signal to coordinate
which user sends next. Initially, this is a uniform choice over the users, but as
time progresses the signal favours the users with lower backoff counters as these
users have had fewer opportunities to send their packet previously.
In Figure 2 we have plotted the optimal values for the players, where SWi
corresponds to the optimal value (expected time to send their packets) for player
[Three plots of expected time against q for two users (curves SFNEi, SW1, SW2,
SFCEi), three users (SW1–SW3, SFi) and four users (SW1–SW4, SFi).]
Fig. 2: Aloha: ⟨⟨usr1:···:usrm⟩⟩min=?(R^{time}[F s1]+···+R^{time}[F sm])
i for both SWNE and SWCE for the cases of two, three and four users. We see
that the optimal values for the different users under SFNE and SFCE coincide,
while under SWNE and SWCE they are different for each user (with the user
sending first having the lowest and the user sending last the highest). Comparing
the sum of the SWNE (and SWCE) values and that of the SFCE values, we see
a small decrease in the sum of less than 2% of the total, while for SFNE there
is a greater difference as the players cannot coordinate, and hence try and send
at the same time.
Power control. This case study is based on a model of power control in cel-
lular networks from [7]. In the network there are a number of users that each
have a mobile phone. The phones emit signals that the users can strengthen by
increasing the phone's power level up to a bound (powmax). A stronger signal
can improve transmission quality, but uses more energy and lowers the qual-
ity of the transmissions of other phones due to interference. We use the ex-
tended model from [22], which adds a probability of failure (qfail) when a power
level is increased and assumes each phone has a limited battery capacity (emax).
There is a reward structure associated with each phone representing transmis-
sion quality, which is dependent on both the phone’s power level and the power
levels of other phones due to interference. We consider the nonzero-sum property
⟨⟨p1:···:pm⟩⟩max=?(R^{r1}[F e1]+···+R^{rm}[F em]), where each user tries
to maximise their expected reward before their phone's battery is depleted.
In Figure 3 we have presented the expected rewards of the players under
the synthesised SWCE and SFCE joint strategies. When performing strategy
synthesis, in the case of two users the SWNE and SWCE yield the same profile
in which, when the users' batteries are almost depleted, one user tries to increase
their phone's power level and, if successful, in the next step, the second user then
tries to increase their phone's power level. Since the first user's phone battery
is depleted when the second tries to increase, this increase does not cause any
interference. On the other hand, if the first user fails to increase their power
level, then both users increase their battery levels. For the SFCE, the users
can coordinate and flip a coin as to which user goes first: as demonstrated by
Figure 3 this yields equal rewards for the users, unlike the SWCE. In the case of
three users, the SWNE and SWCE differ (we were only able to synthesise SWNE
for powmax = 2 as for larger values the computation had not completed within
[Two plots of rewards against powmax: two users (curves SW1, SW2, FRCEi,
FRNEi) and three users (SWCE1–SWCE3, FRCEi).]
Fig. 3: Power control: ⟨⟨p1:···:pm⟩⟩max=?(R^{r1}[F e1]+···+R^{rm}[F em])
[Two plots against f for the three-player model: expected capital per player
(p1–p3 under SWCE and SWNE, pi under FRCE) and the sum of the expected
capital (SWNE, SWCE, SFCE).]
Fig. 4: Public good: ⟨⟨p1:···:pm⟩⟩max=?(R^{c1}[I=rmax]+···+R^{cm}[I=rmax])
the timeout), again users take turns to try and increase their phone’s power
level. However, here, if the users are unsuccessful, the SWCE can coordinate as to
which user next tries to increase their phone's battery level. Through this
coordination, the users’ rewards can be increased as the battery level of at most
one phone increases at a time, which limits interference. On the other hand, for
the SWNE users must decide independently whether to increase their phone’s
battery level and they each randomly decide whether to do so or not.
Public good. We next consider a variant of a public good game [19], based
on the one presented in [22] for the two-player case. In this game a number
of players each receive an initial amount of capital (einit) and, in each of rmax
months, can invest none, half or all of their current capital. The total invested
by the players in a month is multiplied by a factor f and distributed equally
among the players before the start of the next month. The aim of the players
is to maximise their expected capital, which is represented by the formula:
⟨⟨p1:···:pm⟩⟩max=?(R^{c1}[I=rmax]+···+R^{cm}[I=rmax]).
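For concreteness, a minimal sketch of one month of this game (our own illustrative code, following the investment and redistribution rule described above):

# Sketch of one month of the public good game described above; the action
# encoding (invest none, half or all) and names are ours, for illustration.
def play_month(capital, actions, f):
    """actions[i] in {0.0, 0.5, 1.0}: fraction of player i's capital invested.
    The total investment is multiplied by f and shared equally."""
    invested = [c * a for c, a in zip(capital, actions)]
    share = f * sum(invested) / len(capital)
    return [c - i + share for c, i in zip(capital, invested)]

# Three players with capital 10 each; one invests all, one half, one none.
print(play_month([10.0, 10.0, 10.0], [1.0, 0.5, 0.0], f=2.5))
# -> [12.5, 17.5, 22.5]: free-riding pays individually, though the total grows.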
Figure 4 plots, for the three-player model, both the expected capital of indi-
vidual players and the total expected capital after three months for the SWNE,
SWCE and SFCE as the parameter f varies. As the results demonstrate, the
players benefit, both as individuals and as a population, by coordinating through
a correlated strategy. In addition, under the SFCE, all players receive the same
expected capital with only a small decrease in the sum from that of the SWCE.
Investors. The final case study concerns a concurrent multi-player version of the
futures market investor model of [26], in which a number of investors (the players)
[Two plots of expected reward against the number of months: two-player CE
(solid) and NE (dashed) values (SW1, SW2, SFi), and three-player SW (solid)
and SF (dashed) values (CE1–CE3).]
Fig. 5: Investors: ⟨⟨inv1:···:invm⟩⟩max=?(R^{pf1}[F cin1]+···+R^{pfm}[F cinm])
interact with a probabilistic stock market. In successive months, the investors
choose whether to invest, wait or cash in their shares, while at the same time the
market decides with probability pbar to bar each investor, with the restriction
that an investor cannot be barred two months in a row or in the first month,
and then the values of shares and cap on values are updated probabilistically.
We consider both two- and three-player models, where each investor tries to
maximise its individual profit, represented by the following nonzero-sum prop-
erty: ⟨⟨inv1:···:invm⟩⟩max=?(R^{pf1}[F cin1]+···+R^{pfm}[F cinm]). In Figure 5
we have plotted the different optimal values for NE and CE of the two-player
game and the different optimal values for CE of the three-player game (the
computation of NE values timed out for the three-player case). As the results
demonstrate, again we see that the coordination that CEs offer can improve the
returns of the players and that, although considering social fairness does decrease
the returns of some players, this is limited, particularly for CEs.
5 Conclusions
We have presented novel techniques for game-theoretic verification of proba-
bilistic multi-agent systems, focusing on correlated equilibria and a notion of
social fairness. We began with the simpler case of normal form games and then
extended this to concurrent stochastic games, and used temporal logic to for-
mally specify equilibria. We proposed algorithms for equilibrium synthesis, im-
plemented them and illustrated their benefits, in terms of efficiency and fairness,
on case studies from a range of application domains.
Future work includes exploring the use of further game-theoretic topics within
this area, such as techniques for mechanism design or other concepts such as
Stackelberg equilibria. We plan to implement SFCE computation in Gurobi using
the big-M method [16] to encode implications and techniques from [37] to encode
conjunctions, which should yield a significant speed-up in their computation.
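As an illustration of the intended encoding (ours, not the planned implementation), a binary guard b ∈ {0,1} implying a linear constraint can be written, with a sufficiently large constant M, as the single linear constraint

    aᵀx ≤ c + M(1 − b),  where M ≥ sup_x (aᵀx − c),

so that for b = 1 it reduces to aᵀx ≤ c, and for b = 0 it is vacuously satisfied.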
Acknowledgements. This project was funded by the ERC under the European
Union’s Horizon 2020 research and innovation programme (FUN2MODEL, grant
agreement No. 834115).
References
1. de Alfaro, L.: Formal Verification of Probabilistic Systems. Ph.D. thesis, Stanford
University (1997)
2. Aminof, B., Kwiatkowska, M., Maubert, B., Murano, A., Rubin, S.: Probabilistic
strategy logic. In: Proc. IJCAI’19. pp. 32–38 (2019)
3. Aumann, R.: Subjectivity and correlation in randomized strategies. Journal of
Mathematical Economics 1(1), 67–96 (1974)
4. Baier, C., Katoen, J.P.: Principles of Model Checking. MIT Press (2008)
5. Baier, C., Kwiatkowska, M.: Model checking for a probabilistic branching time
logic with fairness. Distributed Computing 11(3), 125–155 (1998)
6. Banerjee, T., Majumdar, R., Mallik, K., Schmuck, A.K., Soudjani, S.: Fast symbolic
algorithms for omega-regular games under strong transition fairness. Tech. Rep.
MPI-SWS-2020-007r, Max Planck Institute (2021)
7. Brenguier, R.: PRALINE: A tool for computing Nash equilibria in concurrent
games. In: Sharygina, N., Veith, H. (eds.) Proc. CAV’13. LNCS, vol. 8044, pp.
890–895. Springer (2013), lsv.fr/Software/praline/
8. Chatterjee, K., Fijalkow, N.: A reduction from parity games to simple stochastic
games. EPTCS 54, 74–86 (2011)
9. Chatterjee, K., Henzinger, T.: Value iteration. In: 25 Years of Model Checking.
LNCS, vol. 5000, pp. 107–138. Springer (2008)
10. Chen, T., Forejt, V., Kwiatkowska, M., Parker, D., Simaitis, A.: Automatic verifi-
cation of competitive stochastic systems. Formal Methods in System Design 43(1),
61–92 (2013)
11. Chen, X., Deng, X., Teng, S.H.: Settling the complexity of computing two-player
Nash equilibria. J. ACM 56(3) (2009)
12. Daskalakis, C., Goldberg, P., Papadimitriou, C.: The complexity of computing a
Nash equilibrium. Communications of the ACM 52(2), 89–97 (2009)
13. De Moura, L., Bjørner, N.: Z3: An efficient SMT solver. In: Proc. TACAS'08.
LNCS, vol. 4963, pp. 337–340. Springer (2008), github.com/Z3Prover/z3
14. Dutertre, B.: Yices 2.2. In: Biere, A., Bloem, R. (eds.) Proc. CAV'14. LNCS,
vol. 8559, pp. 737–744. Springer (2014), yices.csl.sri.com
15. Gilboa, I., Zemel, E.: Nash and correlated equilibria: Some complexity considera-
tions. Games and Economic Behavior 1(1), 80–93 (1989)
16. Griva, I., Nash, S., Sofer, A.: Linear and Nonlinear Optimization: Second Edition.
CUP (2009)
17. Gurobi Optimization, LLC: Gurobi Optimizer Reference Manual (2021),
www.gurobi.com
18. Gutierrez, J., Harrenstein, P., Wooldridge, M.J.: Reasoning about equilibria in
game-like concurrent systems. In: Proc. 14th International Conference on Princi-
ples of Knowledge Representation and Reasoning (KR’14) (2014)
19. Hauser, O., Hilbe, C., Chatterjee, K., Nowak, M.: Social dilemmas among unequals.
Nature 572, 524–527 (2019)
20. Kemeny, J., Snell, J., Knapp, A.: Denumerable Markov Chains. Springer (1976)
21. Kwiatkowska, M., Norman, G., Parker, D., Santos, G.: Multi-player equilibria veri-
fication for concurrent stochastic games. In: Gribaudo, M., Jansen, D., Remke, A.
(eds.) Proc. QEST’20. LNCS, Springer (2020)
22. Kwiatkowska, M., Norman, G., Parker, D., Santos, G.: PRISM-games 3.0: Stochas-
tic game verification with concurrency, equilibria and time. In: Proc. CAV'20. pp.
475–487. LNCS, Springer (2020)
23. Kwiatkowska, M., Norman, G., Parker, D., Santos, G.: Correlated equilibria and
fairness in concurrent stochastic games (2022), arXiv:2201.09702
24. Kwiatkowska, M., Norman, G., Parker, D., Santos, G.: Automatic verification of
concurrent stochastic systems. Formal Methods in System Design pp. 1–63 (2021)
25. Littman, M., Ravi, N., Talwar, A., Zinkevich, M.: An efficient optimal-equilibrium
algorithm for two-player game trees. In: Proc. UAI’06. pp. 298–305. AUAI Press
(2006)
26. McIver, A., Morgan, C.: Results on the quantitative mu-calculus qMu. ACM Trans.
Computational Logic 8(1) (2007)
27. von Neumann, J., Morgenstern, O., Kuhn, H., Rubinstein, A.: Theory of Games
and Economic Behavior. Princeton University Press (1944)
28. Nisan, N., Roughgarden, T., Tardos, E., Vazirani, V.: Algorithmic Game Theory.
CUP (2007)
29. Nudelman, E., Wortman, J., Shoham, Y., Leyton-Brown, K.: Run the GAMUT:
A comprehensive approach to evaluating game-theoretic algorithms. In: Proc. AA-
MAS’04. pp. 880–887. ACM (2004), gamut.stanford.edu
30. Osborne, M., Rubinstein, A.: An Introduction to Game Theory. OUP (2004)
31. Porter, R., Nudelman, E., Shoham, Y.: Simple search methods for finding a Nash
equilibrium. In: Proc. AAAI’04. pp. 664–669. AAAI Press (2004)
32. Prisner, E.: Game Theory Through Examples. Mathematical Association of Amer-
ica, 1 edn. (2014)
33. Rabin, M.: Incorporating fairness into game theory and economics. The American
Economic Review 83(5), 1281–1302 (1993)
34. Rabin, M.: Fairness in repeated games. Working paper 97–252, University of Cali-
fornia at Berkeley (1997)
35. Schwalbe, U., Walker, P.: Zermelo and the early history of game theory. Games
and Economic Behavior 34(1), 123–137 (2001)
36. Shapley, L.: Stochastic games. PNAS 39, 1095–1100 (1953)
37. Stevens, S., Palocsay, S.: Teaching use of binary variables in integer linear pro-
grams: Formulating logical conditions. INFORMS Transactions on Education
18(1), 28–36 (2017)
38. Wächter, A.: Short tutorial: Getting started with ipopt in 90 minutes. In: Com-
binatorial Scientic Computing. No. 09061 in Dagstuhl Seminar Proceedings,
Leibniz-Zentrum für Informatik (2009), github.com/coin-or/Ipopt
39. Wächter, A., Biegler, L.: On the implementation of an interior-point lter line-
search algorithm for large-scale nonlinear programming. Mathematical Program-
ming 106(1), 25–57 (2006)
40. Supporting material, www.prismmodelchecker.org/files/tacas22equ/
Omega Automata
A Direct Symbolic Algorithm for Solving
Stochastic Rabin Games
Tamajit Banerjee1, Rupak Majumdar2, Kaushik Mallik2,
Anne-Kathrin Schmuck2, and Sadegh Soudjani3
1IIT Delhi, New Delhi, India
2MPI-SWS, Kaiserslautern, Germany
3Newcastle University, Newcastle upon Tyne, UK
Abstract. We consider turn-based stochastic 2-player games on graphs
with ω-regular winning conditions. We provide a direct symbolic algo-
rithm for solving such games when the winning condition is formulated
as a Rabin condition. For a stochastic Rabin game with k pairs over a
game graph with n vertices, our algorithm runs in O(n^{k+2} k!) symbolic
steps, which improves the state of the art.
We have implemented our symbolic algorithm, along with performance
optimizations including parallelization and acceleration, in a BDD-based
synthesis tool called Fairsyn. We demonstrate the superiority of Fairsyn
compared to the state of the art on a set of synthetic benchmarks derived
from the VLTS benchmark suite and on a control system benchmark from
the literature. In our experiments, Fairsyn performed significantly faster
with up to two orders of magnitude improvement in computation time.
1 Introduction
Symbolic algorithms for 2-player graph games are at the heart of many prob-
lems in the automatic synthesis of correct-by-construction hardware, software,
and cyber-physical systems from logical specifications. The problem has a
rich pedigree, going back to Church [10] and a sequence of seminal results
[6,31,17,30,13,14,34,21]. A chain of reductions can be used to reduce the syn-
thesis problem for ω-regular specifications to finding winning strategies in
2-player games on graphs, for which (symbolic) algorithms are known (see, e.g.,
[29,14,34,27]). These algorithms form the basis for algorithmic reactive synthesis.
For systems under uncertainty, it is also essential to capture non-determinism
quantitatively using probability distributions [5,18,22,25]. Turn-based stochas-
tic 2-player games [3,9], also known as 2½-player games, generalize 2-player
graph games with an additional category of “random” vertices: Whenever the
game reaches a random vertex, a random process picks one of the outgoing
edges according to a probability distribution. The qualitative winning problem
asks whether a vertex of the game graph is almost surely winning for Player 0.
Stochastic Rabin games were studied by Chatterjee et al. [7], who showed that
the problem is NP-complete and that winning strategies can be restricted to
be pure (non-randomized) and memoryless. Moreover, they showed a reduc-
tion from qualitative winning in an n-vertex k-pair stochastic Rabin game to
an O(n(k+1))-vertex (k+1)-pair (deterministic) Rabin game, resulting in an
O((n(k+1))^{k+2} (k+1)!) algorithm. In contrast, we provide a direct O(n^{k+2} k!)
symbolic algorithm for the problem.
Our new direct symbolic algorithm is obtained in the following way. We
replace the probabilistic transitions with transitions of the environment con-
strained by extreme fairness as described by Pnueli [28]. Extreme fairness is
specified via a special set of Player 1 vertices, called live vertices. A run is ex-
tremely fair if whenever a live vertex is visited infinitely often, every outgoing
edge from this vertex is taken infinitely often. As our first contribution, we show
that to solve a qualitative stochastic Rabin game, we can equivalently solve a
(deterministic) Rabin game over the same game graph by interpreting random
vertices of the stochastic game as live vertices.
As our second contribution we prove a direct symbolic algorithm to solve
(deterministic) Rabin games with live vertices, which we call extremely fair ad-
versarial Rabin games. In particular, we show a surprisingly simple syntactic
transformation that modifies the well-known symbolic fixpoint algorithm for solving
2-player Rabin games on graphs (without live vertices), such that the modified
fixpoint solves the extremely fair adversarial version of the game.
To appreciate the simplicity of our modification, let us consider the well-
known fixpoint algorithms for Büchi and co-Büchi games—particular classes of
Rabin games—given by the following µ-calculus formulas:

    Büchi:    νY. µX. (G ∩ Cpre(Y)) ∪ (Cpre(X)),
    co-Büchi: µX. νY. (G ∪ Cpre(X)) ∩ (Cpre(Y)),

where Cpre(·) denotes the controllable predecessor operator and G denotes the
set of goal states that should be visited recurrently. In the presence of strong
transition fairness, the new algorithm becomes
    Büchi:    νY. µX. (G ∩ Cpre(Y)) ∪ (Apre(Y, X)),
    co-Büchi: νW. µX. νY. (G ∪ Apre(W, X)) ∩ (Cpre(Y)).
The only syntactic change we make is to substitute the controllable predecessor
for the µ-variable X by a new almost sure predecessor operator Apre(Y, X)
incorporating also the preceding ν-variable Y; if the fixpoint starts with a
µ-variable (with no preceding ν-variable), as for co-Büchi games, we introduce
one additional ν-variable in front. For the general class of Rabin specifications,
with a more involved fixpoint and with arbitrarily high nesting depth depending
on the number of Rabin pairs, we need to perform this substitution for every
such Cpre(·) operator for every µ-variable.
We prove the correctness of this syntactic fixpoint transformation for solv-
ing Rabin games [31,27] in this paper. It can be shown that the same syntactic
transformation may be used to obtain fixpoint algorithms for qualitative solution
of stochastic games with other popular ω-regular objectives, namely Reachabil-
ity, Safety, (generalized) Büchi, (generalized) co-Büchi, Rabin-chain, parity, and
GR(1). Owing to page constraints, these additional fixpoints are only discussed
in the extended version [4] of this paper, where we also generalize all results
presented in this paper to a weaker notion of fairness, called transition fairness.
In a nutshell, these results show that one can solve games with live vertices
while retaining the algorithmic characteristics and implementability of known
symbolic fixpoint algorithms that do not consider fairness assumptions.
We have implemented our symbolic algorithm for solving stochastic Rabin
games in a symbolic BDD-based reactive synthesis tool called Fairsyn. Fairsyn
additionally uses parallelization and a fixpoint acceleration technique [23] to
boost performance. We evaluate our tool on two case studies, one using synthetic
benchmarks derived from the VLTS benchmark suite [15] and the other from
controller synthesis for stochastic control systems [12]. We show that Fairsyn
scales well on these case studies, and outperforms the state-of-the-art methods
by up to two orders of magnitude.
All the technical proofs, the fixpoints for various other specifications, and an
additional benchmark taken from the software engineering literature [8] can be
found in the extended version of this paper under a slightly more relaxed setting
of the problem (transition fairness instead of extreme fairness) [4].
2 Preliminaries
Notation: We write N0 to denote the set of natural numbers including zero.
Given a, b ∈ N0, we write [a; b] to denote the set {n ∈ N0 | a ≤ n ≤ b}. By
definition, [a; b] is the empty set if a > b. For any set A ⊆ U defined on the
universe U, we write A̅ to denote the complement of A. Given an alphabet A,
we use the notation A* and A^ω to denote respectively the set of all finite words
and the set of all infinite words formed using the letters of the alphabet A. Let
A and B be two sets and R ⊆ A × B be a relation. For any element a ∈ A, we
use the notation R(a) to denote the set {b ∈ B | (a, b) ∈ R}.
2½-player game graph: We consider usual turn-based stochastic games, also
known as 2½-player games, played between Player 0, Player 1, and a third player
representing environmental randomness, which is treated as a "half player." For-
mally, a 2½-player game graph is a tuple G = ⟨V, V0, V1, Vr, E⟩ where (i) V is a
finite set of vertices, (ii) V0, V1, and Vr are subsets of V which form a partition of
V, and (iii) E ⊆ V × V is the set of directed edges. The vertices in Vr are called
random vertices, and the edges originating in a random vertex are called random
edges, denoted Er. A 2½-player game graph with no random vertices (i.e.,
Vr = ∅) is called a 2-player game graph. A 2½-player game graph with V1 = ∅
is called a 1½-player game graph (also known as a Markov decision process or
MDP). A 2½-player game graph with V = Vr is known as a Markov chain.
Strategies: A (deterministic) strategy of Player 0 is a function ρ0 : V*V0 → V
with ρ0(wv) ∈ E(v) for every wv ∈ V*V0. Likewise, a strategy of Player 1 is a
function ρ1 : V*V1 → V with ρ1(wv) ∈ E(v) for every wv ∈ V*V1. We denote
the set of strategies of Player i by Πi. A strategy ρi of Player i (i ∈ {0, 1}) is
memoryless if for every w1v, w2v ∈ V*Vi, we have ρi(w1v) = ρi(w2v). In this
paper we restrict attention to deterministic strategies, as randomized strategies
are no more powerful than deterministic ones for 2½-player Rabin games [7].
Plays: Consider an infinite sequence of vertices4 π = v⁰v¹v² . . . ∈ V^ω. The
sequence π is called a play over G starting at the vertex v⁰ if for every i ∈ N0, we
have v^i ∈ V and (v^i, v^{i+1}) ∈ E. A play is finite if it is of the form v⁰v¹ . . . v^n for
some finite n ∈ N0. Let ρ0 ∈ Π0 and ρ1 ∈ Π1 be a pair of strategies for the two
players, and v⁰ ∈ V be a given initial vertex. For every finite play π = v⁰v¹ . . . v^n,
the next vertex v^{n+1} is obtained as follows: If v^n ∈ V0 then v^{n+1} = ρ0(v⁰ . . . v^n);
if v^n ∈ V1 then v^{n+1} = ρ1(v⁰ . . . v^n); and if v^n ∈ Vr then v^{n+1} is chosen uniformly
at random from the set Er(v^n). The uniform probability distribution over the
random edges is without loss of generality for the problem considered in this
paper; we will come back to this after setting up the problem statement. Every
play generated in this way by fixing ρ0, ρ1, and v⁰ is called a play compliant with
ρ0 and ρ1 that starts at vertex v⁰. The random choice in the random vertices
induces a probability measure P_{v⁰}^{ρ0,ρ1} on the sample space of plays.5 This is in
contrast to 2-player games, where for any choice of ρ0 ∈ Π0, ρ1 ∈ Π1, and
v⁰ ∈ V, the resulting compliant play is unique.
Winning Conditions: A winning condition ϕ is a set of infinite plays over G,
i.e., ϕ ⊆ V^ω, where the game graph G will always be clear from the context. We
adopt Linear Temporal Logic (LTL) notation for describing winning conditions.
The atomic propositions for the LTL formulas are sets of vertices, i.e., elements
of the set 2^V. We use the standard symbols for the Boolean and the temporal
operators: ¬ for negation, ∧ for conjunction, ∨ for disjunction, → for
implication, U for until (A U B means "the play remains inside the set A until
it moves to the set B"), ◯ for next (◯A means "the next vertex is in the set
A"), ◇ for eventually (◇A means "the play will eventually visit a vertex from
the set A"), and □ for always (□A means "the play will only visit vertices
from the set A"). The syntax and semantics of LTL can be found in standard
textbooks [3]. By slightly abusing notation, we use ϕ interchangeably to denote
both the LTL formula and the set of plays satisfying ϕ. Hence, we write π ⊨ ϕ
to denote the satisfaction of the formula ϕ by the play π.
Rabin Winning Conditions: A Rabin winning condition is expressed using a
set of k Rabin pairs R = {⟨G1, R1⟩, . . . , ⟨Gk, Rk⟩}, where k is any positive integer
and Gi, Ri ⊆ V for all i ∈ [1; k]. We say that R has the index set P = [1; k]. A
play π satisfies the Rabin condition R if π satisfies the LTL formula

    ϕ := ⋁_{i∈P} (◇□R̅i ∧ □◇Gi).    (2)
Almost Sure Winning: Let G be a 2½-player game graph, ρ0 ∈ Π0 and ρ1 ∈ Π1
be a pair of strategies, v⁰ ∈ V be an initial vertex, and ϕ be an ω-regular
specification over the vertices of G. Then Pρ01
v0(ϕ) denotes the probability of
satisfaction of ϕby the plays compliant with ρ0and ρ1and starting at v0.
The set of almost sure winning states of Player 0 for the specification ϕis
defined as the set Wa.s.Vsuch that for every v0 Wa.s.the following
holds: supρ0Π0infρ1Π1Pρ01
v0(ϕ) = 1.It is known [7, Thm. 4] that there is
an optimal (deterministic) memoryless strategy ρ
0Π0—called the optimal
almost sure winning strategy—such that for every v0 Wa.s.it holds that
infρ1Π1Pρ
01
v0(ϕ) = 1.
We extend the notion of winning to 2-player games as follows. Fix a 2-player
game graph G = ⟨V, V0, V1, ∅, E⟩ and an ω-regular specification ϕ over V. Player 0
wins the game from a vertex v⁰ ∈ V if Player 0 has a strategy ρ0 such that for
every Player 1 strategy ρ1, the unique resulting play starting at v⁰ is in ϕ. The
winning region W ⊆ V is the set of vertices from which Player 0 wins the game.
It is known that Player 0 has a memoryless strategy ρ0*—called the optimal
winning strategy—such that for every Player 1 strategy ρ1 ∈ Π1 and for every
initial vertex v⁰ ∈ W, the resulting unique compliant play is in ϕ [19].
3 Problem Statement and Outline
Given a 2½-player game graph G and a Rabin specification ϕ as in (2), we
consider the problem of solving the induced qualitative reactive synthesis prob-
lem. That is, we want to compute the set of almost sure winning states W^{a.s.}
of G w.r.t. ϕ and the corresponding optimal memoryless winning strategy ρ0* of
Player 0. This problem was solved by Chatterjee et al. [7] via a reduction from
qualitative winning in the original 2½-player Rabin game to winning in a larger
(deterministic) 2-player Rabin game with an additional Rabin pair.
Instead of inflating the game graph and introducing an extra Rabin pair at
the cost of more expensive computation, we propose a direct and computationally
more efficient symbolic algorithm over the original game graph G. We get this
algorithm by interpreting the random vertices of G as special Player 1 vertices,
called live vertices, which are subject to an extreme fairness assumption: along
every play, if a live vertex v is visited infinitely often, then all outgoing transitions
of v are also taken infinitely often. This re-interpretation results in a 2-player
Rabin game with special live Player 1 vertices that are subjected to extreme
fairness assumptions on Player 1's behavior. We call such games extremely fair
adversarial (2-player) Rabin games. The correctness of our symbolic algorithm
then follows from the two main results of our paper.
(I) We show that qualitative winning in a 2½-player Rabin game G is equiv-
alent to winning in the extremely fair adversarial (2-player) Rabin game Gℓ
obtained from G. Moreover, the winning strategy ρ0 of Player 0 in Gℓ is also the
optimal almost sure winning strategy in G for ϕ (see Thm. 1 in Sec. 4).
(II) We give a direct symbolic algorithm to compute the set of winning states,
along with the Player 0 winning strategy, for extremely fair adversarial (2-player)
Rabin games (see Thm. 2 in Sec. 5).
Both contributions are discussed in detail in Sec. 4 and Sec. 5, respectively.
Even though, for convenience, we have assumed a uniform probability distribu-
tion over the random edges, our contributions are valid for any arbitrary prob-
ability distribution. This follows from the established fact that the qualitative
analysis of 2½-player games does not depend on the precise probability values
but only on the supports of the distributions [7].
We conclude the paper with an experimental evaluation in Sec. 6.
4 From Randomness to Extreme Fairness
In this section, we show that qualitative winning in 2½-player Rabin games
is equivalent to winning in extremely fair adversarial (2-player) Rabin games
over the same underlying game graph. While it is known [16, Thm. 11.1] that
the reduction of random vertices to extreme fairness is sound and complete
for liveness winning conditions,6 we extend this connection to arbitrary Rabin
winning conditions in this section, and therefore to the entire class of ω-regular
specifications. We start with a formal definition of extremely fair adversarial
games and the connection between randomness and extreme fairness, before
stating our main result in Thm. 1.
Extremely Fair Adversarial Games: Let G = ⟨V, V0, V1, ∅, E⟩ be a 2-player
game graph with live vertices Vℓ ⊆ V1, denoted using the tuple Gℓ = ⟨G, Vℓ⟩.
The edges originating from the live vertices are called the live edges, denoted
Eℓ := (Vℓ × V) ∩ E. A play π over Gℓ is extremely fair with respect
to Vℓ if it satisfies the following LTL formula:

    α := ⋀_{(v,v′)∈Eℓ} (□◇v → □◇(v ∧ ◯v′)).    (3)

Given Gℓ and an ω-regular winning condition ϕ over V, Player 0 wins the ex-
tremely fair adversarial game over Gℓ for ϕ from a vertex v⁰ ∈ V if Player 0
wins the game over Gℓ for the winning condition α → ϕ from v⁰.
Randomness as Extreme Fairness: Let G = ⟨V, V0, V1, Vr, E⟩ be a 2½-player
game graph. Then we say that G induces the 2-player game graph with live
vertices Gℓ := ⟨⟨V, V0, V1 ∪ Vr, ∅, E⟩, Vr⟩. Intuitively, we interpret every random
vertex of G as a live Player 1 vertex in Gℓ. Obviously, this reinterpretation does
not change the structure of the underlying graph specified by V and E.
Soundness of the Reduction: It remains to show that the almost sure winning
set and the optimal almost sure winning strategy of Player 0 in G for ϕ are the same
as the winning state set and the winning strategy of Player 0 in Gℓ for ϕ. This is
formalized in the following theorem when ϕ is given as a Rabin condition. The
proof essentially shows that the random vertices of G simulate the live vertices
of Gℓ, and vice versa; details are in the extended version [4, App. B.6, pp. 61].
6 An LTL formula ϕ over V describes a liveness property if every finite play π over G
allows for a continuation π′ s.t. ππ′ ⊨ ϕ.
Theorem 1. Let G be a 2½-player game graph with vertex set V, ϕ ⊆ V^ω be a
Rabin winning condition as in (2), and Gℓ be the 2-player game graph with live
vertices induced by G. Let W ⊆ V be the set of vertices from which Player 0 wins
the extremely fair adversarial game over Gℓ with respect to ϕ, and W^{a.s.} be the
almost sure winning set of Player 0 in the 2½-player game G with respect to ϕ.
Then W = W^{a.s.}. Moreover, an optimal almost sure winning strategy in Gℓ is
also an optimal winning strategy in G, and vice versa.
5 Extremely Fair Adversarial Rabin Games
This section presents our main result, which is a symbolic fixpoint algorithm that
computes the winning region of Player 0 in the extremely fair adversarial game
over Gℓ with respect to any ω-regular property formalized as a Rabin winning
condition. This new symbolic fixpoint algorithm has multiple unique features.
(I) It works directly over Gℓ, without requiring any pre-processing step to reduce
Gℓ to a "normal" 2-player game with a larger set of vertices.
(II) Our new fixpoint algorithm is obtained from the algorithm of Piterman et al.
[27] by a simple syntactic change. We simply replace all controllable predecessor
operators over least fixpoint variables by a new almost sure predecessor operator
invoking the preceding maximal fixpoint variable. This makes the proof of our
new fixpoint algorithm conceptually simple (see Sec. 5.3).
At a higher level, we make a simple yet efficient syntactic transformation of
the fixpoint to incorporate the fairness assumption on the live vertices, without
introducing any extra computational complexity. Most remarkably, this transfor-
mation also works directly for fixpoint algorithms for reachability, safety, Büchi,
(generalized) co-Büchi, Rabin-chain, and parity games, as these can be formal-
ized as particular instances of a Rabin game. Moreover, it also works for gener-
alized Rabin, generalized Büchi, and GR(1) games. Owing to page constraints,
these additional cases are described in the extended version [4].
5.1 Preliminaries on Symbolic Computations over Game Graphs
Set Transformers: Our goal is to develop symbolic fixpoint algorithms to char-
acterize the winning region of an extremely fair adversarial game over a game
graph with live edges. As a first step, given Gℓ, we define the required symbolic
transformers of sets of states. We define the existential, universal, and control-
lable predecessor operators as follows. For S ⊆ V, we have

    Pre∃₀(S) := {v ∈ V0 | E(v) ∩ S ≠ ∅},        (4a)
    Pre∀₁(S) := {v ∈ V1 | E(v) ⊆ S}, and        (4b)
    Cpre(S)  := Pre∃₀(S) ∪ Pre∀₁(S).            (4c)

Intuitively, the controllable predecessor operator Cpre(S) computes the set of all
states that can be controlled by Player 0 to stay in S after one step, regardless
of the strategy of Player 1. Additionally, we define two operators which take
advantage of the fairness assumption on the live vertices. Given two sets S, T ⊆
V, we define the live-existential and almost sure predecessor operators:

    Lpre∃(S)   := {v ∈ Vℓ | E(v) ∩ S ≠ ∅}, and          (5a)
    Apre(S, T) := Cpre(T) ∪ (Lpre∃(T) ∩ Pre∀₁(S)).      (5b)

Intuitively, the almost sure predecessor operator7 Apre(S, T) computes the set
of all states that can be controlled by Player 0 to stay in T (via Cpre(T)) as well
as all Player 1 states in Vℓ that (a) will eventually make progress towards T if
Player 1 obeys its fairness assumptions encoded in α (via Lpre∃(T)) and (b) will
never leave S in the "meantime" (via Pre∀₁(S)). All the used set transformers are
monotonic with respect to set inclusion. Further, Cpre(T) ⊆ Apre(S, T) always
holds, Cpre(T) = Apre(S, T) if Vℓ = ∅, and Apre(S, T) ⊆ Cpre(S) if T ⊆ S.
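To make the operators in (4) and (5) concrete, the following is a minimal explicit-state sketch in Python; the set-based graph encoding is our illustration and stands in for the BDD-based symbolic representation used by the tool:

# Explicit-state sketch of the predecessor operators (4a)-(4c) and (5a)-(5b).
# Stand-in for the BDD-based implementation; names/encoding are illustrative.
def pre_exists_0(E, V0, S):
    """Pre∃₀(S): Player 0 vertices with some successor in S."""
    return {v for v in V0 if E[v] & S}

def pre_forall_1(E, V1, S):
    """Pre∀₁(S): Player 1 vertices with all successors in S."""
    return {v for v in V1 if E[v] <= S}

def cpre(E, V0, V1, S):
    """Cpre(S): states Player 0 can force into S in one step."""
    return pre_exists_0(E, V0, S) | pre_forall_1(E, V1, S)

def lpre_exists(E, Vl, S):
    """Lpre∃(S): live vertices with some successor in S."""
    return {v for v in Vl if E[v] & S}

def apre(E, V0, V1, Vl, S, T):
    """Apre(S, T) = Cpre(T) ∪ (Lpre∃(T) ∩ Pre∀₁(S))."""
    return cpre(E, V0, V1, T) | (lpre_exists(E, Vl, T) & pre_forall_1(E, V1, S))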
Fixpoint Algorithms in the µ-calculus: We use the µ-calculus [20] as a con-
venient logical notation to define a symbolic algorithm (i.e., an algorithm that
manipulates sets of states rather than individual states) for computing a set of
states with a particular property over a given game graph G. The formulas of the
µ-calculus, interpreted over a 2-player game graph G, are given by the grammar

    ϕ ::= p | X | ϕ ∪ ϕ | ϕ ∩ ϕ | pre(ϕ) | µX.ϕ | νX.ϕ

where p ranges over subsets of V, X ranges over a set of formal variables, pre
ranges over monotone set transformers in {Pre∃₀, Pre∀₁, Cpre, Lpre∃, Apre}, and µ
and ν denote, respectively, the least and the greatest fixed point of the functional
defined as X ↦ ϕ(X). Since the operations ∪, ∩, and the set transformers pre
are all monotonic, the fixed points are guaranteed to exist. A µ-calculus formula
evaluates to a set of states over G, and the set can be computed by induction over
the structure of the formula, where the fixed points are evaluated by iteration.
We omit the (standard) semantics of formulas (see [20]).
5.2 The Symbolic Algorithm
We now present our new symbolic fixpoint algorithm to compute the winning
region of Player 0 in the extremely fair adversarial game over Gℓ with respect to
a Rabin winning condition R. A detailed correctness proof can be found in the
extended version [4, App. B.3, pp. 40].
Theorem 2. Let Gℓ = ⟨G, Vℓ⟩ be a game graph with live edges and R be a Rabin
condition over G with index set P = [1; k]. Further, let Z denote the fixed point
of the following µ-calculus expression:

    νY_{p0}. µX_{p0}. ⋃_{p1∈P} νY_{p1}. µX_{p1}. ⋃_{p2∈P\1} νY_{p2}. µX_{p2}. ··· ⋃_{pk∈P\k−1} νY_{pk}. µX_{pk}. ⋃_{j=0}^{k} C_{pj},    (6a)
where

    C_{pj} := (⋂_{i=0}^{j} R̅_{pi}) ∩ [(G_{pj} ∩ Cpre(Y_{pj})) ∪ Apre(Y_{pj}, X_{pj})],    (6b)

with8 p0 = 0, G_{p0} := ∅ and R_{p0} := ∅, as well as P\i := P \ {p1, . . . , pi}. Then Z
is equivalent to the winning region W of Player 0 in the extremely fair adver-
sarial game over Gℓ for the winning condition ϕ in (2). Moreover, the fixpoint
algorithm runs in O(n^{k+2} k!) symbolic steps, and a memoryless winning strategy
for Player 0 can be extracted from it.
5.3 Proof Outline
Given a Rabin winning condition over a "normal" 2-player game, [27] provided a
symbolic fixpoint algorithm which computes the winning region for Player 0. The
fixpoint algorithm in their paper is almost identical to our fixpoint algorithm
in (6): it only differs in the last term of the constructed C-terms in (6b). [27]
defines the term C_{pj} as

    C_{pj} := (⋂_{i=0}^{j} R̅_{pi}) ∩ [(G_{pj} ∩ Cpre(Y_{pj})) ∪ Cpre(X_{pj})].

Intuitively, a single term C_{pj} computes the set of states that always remain within
Q_{pj} := ⋂_{i=0}^{j} R̅_{pi} while always re-visiting G_{pj}. That is, given the simpler (local)
winning condition

    ψ := □Q ∧ □◇G    (7)

for two sets Q, G ⊆ V, the set

    νY. µX. Q ∩ [(G ∩ Cpre(Y)) ∪ (Cpre(X))]    (8)

is known to define exactly the states of a "normal" 2-player game G from which
Player 0 has a strategy to win the game with winning condition ψ [26]. Such
games are typically called safe Büchi games. The key insight in the proof of
Thm. 2 is to show that the new definition of C-terms in (6b) via the new al-
most sure predecessor operator Apre actually computes the winning state sets
of extremely fair adversarial safe Büchi games. Subsequently, we generalize this
intuition to the fixpoint for the Rabin games.
Fair Adversarial Safe Büchi Games: The following theorem characterizes
the winning states in an extremely fair adversarial safe Büchi game.
Theorem 3. Let Gℓ = ⟨G, Vℓ⟩ be a game graph with live vertices and Q, G ⊆ V
be two state sets over G. Further, let

    Z := νY. µX. Q ∩ [(G ∩ Cpre(Y)) ∪ (Apre(Y, X))].    (9)

Then Z is equivalent to the winning region of Player 0 in the extremely fair ad-
versarial game over Gℓ for the winning condition ψ in (7). Moreover, the fixpoint
algorithm runs in O(n²) symbolic steps, and a memoryless winning strategy for
Player 0 can be extracted from it.
8 The Rabin pair ⟨G_{p0}, R_{p0}⟩ = ⟨∅, ∅⟩ in (6) is artificially introduced to make the
fixpoint representation more compact. It is not part of R.
Intuitively, the fixpoints in (8) and (9) consist of two parts: (a) a minimal
fixpoint over X which computes (for any fixed value of Y) the set of states that
can reach the "target state set" T := Q ∩ G ∩ Cpre(Y) while staying inside the
safe set Q, and (b) a maximal fixpoint over Y which ensures that the only states
considered in the target T are those that allow to re-visit a state in T while
staying in Q.
By comparing (8) and (9) we see that our syntactic transformation only
changes part (a). Hence, in order to prove Thm. 3 it essentially remains to show
that this transformation works for the even simpler safe reachability games.
Extremely Fair Adversarial Safe Reachability Games: A safe reachabil-
ity condition is a tuple ⟨T, Q⟩ with T, Q ⊆ V, and a play π satisfies the safe
reachability condition ⟨T, Q⟩ if π satisfies the LTL formula

    ψ := Q U T.    (10)

A safe reachability game is often called a reach-while-avoid game, where the
safe set is specified by an unsafe set R := Q̅ that needs to be avoided. Their
extremely fair adversarial version is formalized in the following theorem and
proved in the extended version [4, Thm. 3.3].
Theorem 4. Let Gℓ = ⟨G, Vℓ⟩ be a game graph with live edges and ⟨T, Q⟩ be a
safe reachability winning condition. Further, let

    Z := νY. µX. T ∪ (Q ∩ Apre(Y, X)).    (11)

Then Z is equivalent to the winning region of Player 0 in the extremely fair
adversarial game over Gℓ for the winning condition ψ in (10). Moreover, the fix-
point algorithm runs in O(n²) symbolic steps, and a memoryless winning strategy
for Player 0 can be extracted from it.
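To illustrate how (11) is evaluated by iteration, here is a self-contained explicit-state sketch; the toy game at the bottom is our own hypothetical example, not the game of Fig. 1:

# Explicit-state evaluation of Z = νY. µX. T ∪ (Q ∩ Apre(Y, X)) from (11).
# Illustrative only; the toy game below is hypothetical.
def apre(E, V0, V1, Vl, S, T):
    """Apre(S, T) = Cpre(T) ∪ (Lpre∃(T) ∩ Pre∀₁(S)) as in (5b)."""
    cpre = {v for v in V0 if E[v] & T} | {v for v in V1 if E[v] <= T}
    return cpre | {v for v in Vl if E[v] & T and E[v] <= S}

def safe_reach(E, V0, V1, Vl, T, Q):
    Y = set(E)                        # ν-iteration: start from all vertices
    while True:
        X = set()                     # µ-iteration: start from the empty set
        while True:
            X_new = T | (Q & apre(E, V0, V1, Vl, Y, X))
            if X_new == X: break
            X = X_new
        if X == Y: return Y           # outer fixpoint reached
        Y = X

# Toy game: vertex 0 is a Player 0 vertex, vertex 1 a live Player 1 vertex,
# vertex 2 the target, vertex 3 unsafe. Vertex 1 can escape to 3, so only
# the target itself is winning, mirroring the role of Pre∀₁(Y) in Apre.
E = {0: {1}, 1: {0, 2, 3}, 2: {2}, 3: {3}}
print(safe_reach(E, V0={0}, V1={1, 3}, Vl={1}, T={2}, Q={0, 1, 2}))  # -> {2}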
To gain some intuition on the correctness of Thm. 4, let us recall that the
fixpoint for safe reachability games without live edges is given by:

    µX. T ∪ (Q ∩ Cpre(X)).    (12)

Intuitively, the fixpoint computation in (12) is initialized with X⁰ = ∅ and
computes a sequence X⁰, X¹, . . . , X^k of increasing sets until X^k = X^{k+1}. We
say that v has rank r if v ∈ X^r \ X^{r−1}. All states contained in X^r allow Player 0
to force the play to reach T in at most r − 1 steps while staying in Q. The
corresponding Player 0 strategy ρ0 is known to be winning w.r.t. (10), and along
every play π compliant with ρ0, the path π remains in Q and the rank is always
decreasing.
To see why the same strategy is also sound in the extremely fair adversarial
safe reachability game Gℓ, first recall that for vertices v ∉ Vℓ of Gℓ, the operator
Apre(Y, X) simplifies to Cpre(X). With this, we see that for every v ∉ Vℓ a
Player 0 winning strategy ρ̃0 in Gℓ can always force plays to stay in Q and to
decrease their rank, similar to ρ0. Then every play π compliant with such a
strategy ρ̃0 and visiting a vertex in Vℓ only finitely often satisfies (10).
[Game graph over vertices 1–9; see the caption for the vertex classification.]
Fig. 1. Fair adversarial game graph discussed in Ex. 1 and Ex. 2, with Player 0 and
Player 1 vertices indicated by circles and squares, respectively. The live vertices
are Vℓ = {2, 3, 5} (double square, blue), the target vertices are G = {6, 9} (double
circle, green), and the unsafe vertices are Q̅ = {1} (red, dotted).
The only interesting case for soundness of Thm. 4 is therefore every play π
that visits states in Vℓ infinitely often. However, as the number of vertices is
finite, we only have a finite number of ranks, and hence a certain vertex v ∈ Vℓ
with a finite rank r needs to get visited by π infinitely often. From the definition
of Apre, we know that a state v ∈ Vℓ is only contained in X^r if v has an outgoing
edge reaching X^k with k < r. Because of the extreme fairness condition, reaching
v infinitely often implies that also a state with rank k s.t. k < r will get visited
infinitely often. As X¹ = T, we can show by induction that T is eventually visited
along π while π always remains in Q until then.
In order to prove completeness of Thm. 4 we need to show that all states
in V \ Z are losing for Player 0. Here, again, the reasoning is equivalent to the
"normal" safe reachability game for v ∉ Vℓ. For live vertices v ∈ Vℓ, we see
that v is not added to Z via Apre if v ∉ T and either (i) none of its outgoing
edges make progress towards T or (ii) some of its outgoing edges leave Z. One
can therefore construct a Player 1 strategy that for (i)-vertices always chooses
an arbitrary transition and thereby never makes progress towards T (also if v
is visited infinitely often), and for (ii)-vertices ensures that they are only visited
once on plays which remain in Q. This ensures that (ii)-vertices never make
progress towards T via their possibly existing rank-decreasing edges.
In the extended version [4], we have provided a detailed soundness and com-
pleteness proof of Thm. 4 along with the respective Player 0 and Player 1 strat-
egy constructions. In addition, there we also proved Thm. 3 using a reduction to
Thm. 4 for every iteration over Y.
Example 1 (Extremely fair adversarial safe reachability game). We consider an
extremely fair adversarial safe reachability game over the game graph depicted
in Fig. 1 with target vertex set T = G = {6, 9} and safe vertex set Q = V \ {1}.
We denote by Y^m the m-th iteration over the fixpoint variable Y in (11),
where Y⁰ = V. Further, we denote by X^{m,i} the set computed in the i-th iteration
over the fixpoint variable X in (11) during the computation of Y^m, where
X^{m,0} = ∅. We further have X^{m,1} = T = {6, 9}, as Apre(·, ∅) = ∅. Now we compute

    X^{1,2} = T ∪ (Q ∩ Apre(Y⁰, X^{1,1}))
            = {6,9} ∪ ((V \ {1}) ∩ [Cpre(X^{1,1}) ∪ (Lpre∃(X^{1,1}) ∩ Pre∀₁(V))])    (13)
            = {6,9} ∪ ((V \ {1}) ∩ [{7,8} ∪ {3,5}]) = {3,5,6,7,8,9},

where Cpre(X^{1,1}) = {7,8} and Lpre∃(X^{1,1}) ∩ Pre∀₁(V) = {3,5}.
We observe that the only vertices added to X via the Cpre term are 7 and
8. The live vertices 3 and 5 are added due to their outgoing edges leading to
the target vertex 6. The additional requirement Pre∀₁(V) in Apre(Y⁰, X^{1,1}) is
trivially satisfied for all vertices at this point, as Y⁰ = V, and can therefore be
ignored. Doing one more iteration over X, we see that now vertex 4 gets added
via the Cpre term (as it is a Player 0 vertex that allows progress towards 5) and
vertex 2 is added via the Apre term (as it is live and allows progress to 3). The
iteration over X terminates with Y¹ = X^{1,∞} = V \ {1}.
Re-iterating over X for Y¹ gives X^{2,2} = X^{1,2} = {3,5,6,7,8,9} as before.
However, now vertex 2 does not get added to X^{2,3} because vertex 2 has an
edge leading to V \ Y¹ = {1}. Therefore the iteration over X terminates with
Y² = X^{2,∞} = V \ {1,2}. When we now re-iterate over X for Y², we see that vertex
3 is not added to X^{3,2} any more, as vertex 3 has a transition to V \ Y² = {1,2}.
Therefore the iteration over X now terminates with Y³ = X^{3,∞} = V \ {1,2,3}.
Now re-iterating over X does not change the vertex set anymore and the fixed
point terminates with Y^∞ = Y³ = V \ {1,2,3}.
We note that the fixpoint expression (12) for "normal" safe reachability
games terminates after two iterations over X with X^∞ = {6,7,8,9}, as ver-
tices 7 and 8 are the only vertices added via the Cpre operator in (13). Due to
the stricter notion of Cpre, requiring that all outgoing edges of Player 1 vertices
make progress towards the target, (12) does not require an outer largest fixed
point over Y to "trap" the play in a set of vertices which allow progress when
"waiting long enough". This "trapping" required in (11) via the outer fixpoint
over Y actually fails for vertices 2 and 3 (as they are excluded from the winning
set of (11)). Here, Player 1 can enforce to "escape" to the unsafe vertex 1 in
two steps before 2 and 3 are visited infinitely often (which would imply progress
towards 6 via the existing live edges).
We see that the winning region in the "normal" game is much smaller than the
winning region for the extremely fair adversarial game, as adding live transitions
restricts the strategy choices of Player 1, making it easier for Player 0 to win.
Example 2 (Extremely fair adversarial safe Büchi game). We now consider an
extremely fair adversarial safe Büchi game over the game graph depicted in Fig. 1
with target set G = {6, 9} and safe set Q = V \ {1}.
We first observe that we can rewrite the fixpoint in (9) as

    νY. µX. [Q ∩ G ∩ Cpre(Y)] ∪ [Q ∩ Apre(Y, X)].    (14)

Using (14), we see that for Y⁰ = V we can define T⁰ := Q ∩ G ∩ Cpre(V) = G =
{6, 9}. Therefore the first iteration over X is equivalent to (13) and terminates
with Y¹ = X^{1,∞} = V \ {1}.
Now, however, we need to re-compute T for the next iteration over X and
obtain T¹ = Q ∩ G ∩ Cpre(Y¹) = (V \ {1}) ∩ {6,9} ∩ (V \ {1,2,9}) = {6}. This
re-computation of T¹ checks which target vertices are repeatedly reachable, as
required by the Büchi condition. As vertex 9 has no outgoing edge, it trivially
cannot be reached repeatedly.
With this, we see that for the next iteration over X we only have one target
vertex: T¹ = {6}. Unlike in the safe reachability case in Ex. 1, vertex 7 cannot
be added to X^{2,2}, since Player 1 can always decide to take the edge towards 9
from 7, and therefore prevents a repeated visit to a target state. Vertices 2 and 3
get eliminated, for the same reason as in the safe reachability game, within the
second and third iterations over Y. The overall fixpoint computation therefore
terminates with Y^∞ = Y³ = {4, 5, 6, 8}.
Proof of Thm. 2: The proof of Thm. 2 essentially follows from the same
arguments as in the soundness proof of the Rabin fixpoint for 2-player games by
Piterman et al. [27], utilizing Thm. 4 and Thm. 3 at all suitable places. In
[4, App. A, pp. 29], we illustrate the steps of the Rabin fixpoint in (6) using a
simple extremely fair adversarial Rabin game with two Rabin pairs.
Remark 1. We remark that the fixpoint (11), as well as the Apre operator, is
similar in structure to the characterization of almost sure winning states in concurrent
reachability games [1]. In concurrent games, the fixpoint captures the largest
set of states in which the game can be trapped while maintaining a positive
probability of reaching the target. In our case, the fixpoint captures the largest
set of states in which Player 0 can keep the game while ensuring a visit to the
target either directly or through some of the edges from the live vertices. The
commonality justifies our notation and terminology for Apre.
Remark 2. [2] studied fair CTL and LTL model checking, where the fairness con-
dition is given by extreme fairness with all vertices of the transition system being
live. They show that CTL model checking under this all-live fairness condition
can be syntactically transformed to non-fair CTL model checking. A similar
transformation is possible for fair model checking of Büchi, Rabin, and Streett
formulas. The correctness of their transformation is based on reasoning similar
to our Apre operator. For example, a state satisfies the CTL formula ∀◇p under
fairness iff all paths starting from the state either eventually visit p or always
visit states from which a visit to p is possible.
Complexity Analysis of (6): For Rabin games with k Rabin pairs, Piterman et
al. [27] proposed a fixpoint formula with alternation depth 2k + 1. Using the ac-
celerated fixpoint computation technique of Long et al. [23], they deduce a bound
of O(n^{k+1} k!) symbolic steps. We can apply the same acceleration technique to
our fixpoint (6), yielding a complexity upper bound of O(n^{k+2} k!) symbolic steps.
(The additional complexity is due to the additional outermost ν-fixpoint.)
6 Experimental Evaluation
We developed a C++-based tool Fairsyn9, which implements the symbolic fair
adversarial Rabin fixpoint from Eq. (6) using Binary Decision Diagrams (BDDs).
9Repository URL: https://gitlab.mpi-sws.org/kmallik/synthesis-with-edge-fairness
Fairsyn has a single-threaded and a multi-threaded version, which respectively
use the CUDD BDD library [32] and the Sylvan BDD library [11]. In both, we
used a fixpoint acceleration procedure that “warm-starts” the inner fixpoints by
exploiting a monotonicity property (detailed in the extended version [4]).
We demonstrate the effectiveness of our proposed symbolic algorithm for 2½-
player Rabin games using a set of synthetic benchmark experiments derived from
the VLTS benchmark suite (Sec. 6.1) and a controller synthesis experiment for
a stochastic dynamical system (Sec. 6.2); in the extended version [4], we include
an additional software engineering benchmark example from the literature. In
all of these examples, Fairsyn significantly outperformed the state-of-the-art.
The experiments in Sec. 6.1 were performed using the multi-threaded Fairsyn
on a computer equipped with a 3 GHz Intel Xeon E7 v2 processor with 48 CPU
cores and 1.5 TiB RAM. The experiments in Sec. 6.2 were performed using the
single-threaded Fairsyn on a Macbook Pro (2015) laptop equipped with a 2.7 GHz
Dual-Core Intel Core i5 processor with 16 GiB RAM.
6.1 The VLTS Benchmark Experiments
We present a collection of synthetic benchmarks for empirical evaluation of the
merits of our direct symbolic algorithm compared to the one using the reduction
to 2-player games [7]; in the following, we refer to the latter as the indirect approach.
Like our direct algorithm, the indirect approach has been implemented in Fairsyn
and benefits from the same Sylvan-based parallel BDD library and accelerated
fixpoint solution technique. We collect the first 20 transition systems from the
Very Large Transition Systems (VLTS) benchmark suite [15]; their descriptions
can be found on the VLTS benchmark website. For each of them, we randomly
generated instances of 2½-player Rabin games with up to 3 Rabin pairs using
the following procedure (sketched below): (i) we labeled a given fraction of the
vertices as random vertices, (ii) we equally partitioned the remaining vertices into
system and environment vertices, and (iii) for every set in R = {⟨G1, R1⟩, . . . ,
⟨Gk, Rk⟩}, we randomly selected up to 5% of all vertices to be contained in the
set. All the vertices in (i), (ii), and (iii) were selected randomly. In these examples,
the number of vertices ranged from 289–164,865, the number of BDD variables
ranged from 9–18, and the number of transitions from 1,224–2,621,480.
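A minimal sketch of this generation procedure (our illustration; the actual benchmark scripts may differ):

# Sketch of the random benchmark-generation steps (i)-(iii) described above;
# data layout and names are ours, not Fairsyn's benchmark scripts.
import random

def make_random_rabin_game(vertices, frac_random, k):
    """Partition 'vertices' into (V0, V1, Vr) and draw k random Rabin pairs."""
    vs = list(vertices)
    random.shuffle(vs)
    n_r = int(frac_random * len(vs))           # (i) random vertices
    Vr = set(vs[:n_r])
    rest = vs[n_r:]
    half = len(rest) // 2                      # (ii) split the rest equally
    V0, V1 = set(rest[:half]), set(rest[half:])
    pairs = []                                 # (iii) up to 5% per Rabin set
    for _ in range(k):
        G = set(random.sample(vs, random.randint(1, max(1, len(vs) // 20))))
        R = set(random.sample(vs, random.randint(1, max(1, len(vs) // 20))))
        pairs.append((G, R))
    return V0, V1, Vr, pairs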
In Fig. 2, we compare the running times of Fairsyn and the indirect approach.
On the left scatter plot, every point corresponds to one instance of the randomly
generated benchmarks, where the X and the Y coordinates represent the run-
ning time for Fairsyn and the indirect approach respectively. The solid red line
indicates the exact same performance for both methods, whereas the dashed
red line indicates an order of magnitude performance improvement for Fairsyn
compared to the indirect approach. Observe that Fairsyn was faster by up to
two orders of magnitude for the majority of the cases. In the experiments, the
memory footprint of Fairsyn and the indirect approach was similar.
In the right plot, the X-axis corresponds to the proportion of random vertices
within the set of vertices in percentage: 0% corresponds to a 2-player game and
100% corresponds to a Markov chain. The Y-axis corresponds to the running
94 T. Banerjee et al.
time normalized with respect to the running time for the 0% case. We observe
that Fairsyn was insensitive to the change of proportion of the random vertices.
On the other hand, the indirect approach took longer for larger proportions of random vertices, because for every random vertex it adds 3k + 2 additional vertices, thus causing a linear blowup in the size of the game graph. The large variations in the time differences of the two approaches are due to the varying sizes of the experiments: the larger a game graph is, the larger the difference.
Interestingly, for both Fairsyn and the indirect method, there is a dip in the
running time when all the vertices are random (i.e. the 100% case), which is
possibly due to faster computation of the Cpre and Apre operators and faster
convergence of the fixpoint algorithm, owing to the absence of Player 0 and
Player 1 vertices.
Fig. 2. LEFT: Comparison of running time of Fairsyn and the indirect approach on
the VLTS benchmarks. All axes are in log-scale. RIGHT: Sensitivity of normalized
running time w.r.t. variation of the proportion of random vertices. The blue and the red
lines correspond to different instances of Fairsyn and the indirect approach respectively.
6.2 Synthesis for Stochastically Perturbed Dynamical Systems
Synthesizing verified symbolic controllers for continuous dynamical systems is an
active area in cyber-physical systems research [33]. We consider a stochastically perturbed dynamical system model, called the bistable switch [12], which is an important model studied in molecular biology. The system model, call it Σ, has a continuous and compact two-dimensional state space X = [0, 4] × [0, 4] ⊆ R² and a finite input space U = {−0.5, 0, 0.5} × {−0.5, 0, 0.5}. Suppose that for any given time k ∈ N, x1(k), x2(k) are the two states, u1(k), u2(k) are the two inputs, and w1(k), w2(k) are a pair of statistically independent noise samples drawn from a pair of distributions with bounded supports W1 = [−0.4, 0.2] and W2 = [−0.4, 0.2], respectively. Then the states of Σ at the next time instant are:

x1(k+1) = x1(k) + 0.05 (−1.3 x1(k) + x2(k)) + u1(k) + w1(k),                (15)
x2(k+1) = x2(k) + 0.05 ((x1(k))² / ((x1(k))² + 1) − 0.25 x2(k)) + u2(k) + w2(k).

A controller C for Σ is a function C : X → U mapping the state x(k) at any time instant k to a suitable control input u(k). Then applying (15) repeatedly
Table 1. Performance comparison between Fairsyn and StochasticSynthesis (abbreviated as SS) [12] on a comparable implementation of the abstraction (uniform grid-based abstraction). Col. 1 shows the size of the resulting 2½-player game graph (computed using the algorithm given in [24]), Cols. 2 and 3 compare the total synthesis times, and Cols. 4 and 5 compare the peak memory footprint (as measured using the "time" command) for Fairsyn and SS respectively. "OoM" stands for out-of-memory.

# vertices in 2½-game abstraction | Synthesis time (Fairsyn) | Synthesis time (SS) | Peak memory (Fairsyn) | Peak memory (SS)
3.8 × 10³                         | 0.4 s                    | 30 s                | 66 MiB                | 156 MiB
2.2 × 10⁴                         | 8.2 s                    | 55 s                | 72 MiB                | 1 GiB
1.1 × 10⁵                         | 1 min 23 s               | 16 min 1 s          | 108 MiB               | 81 GiB
6.6 × 10⁵                         | 5 min 27 s               | OoM                 | 166 MiB               | 126 GiB
4.3 × 10⁶                         | 41 min 7 s               | OoM                 | 517 MiB               | 127 GiB
with u(k) = C(x(k)), starting with an initial state (x1(0), x2(0)) = x(0) = x_init, gives us an infinite sequence of states (x(0), x(1), x(2), ...) called a path. For a fixed controller C and a given initial state x_init, we obtain a probability measure P^C_{x_init} on the sample space of paths of Σ, in a way similar to how we obtained the probability measure P^{ρ0,ρ1}_{v0} over infinite plays of 2½-player games. Let ϕ ⊆ X^ω be a Rabin specification, defined using finitely many predicates over X.
Fig. 3. Predicates over X.
We extend the notion of almost sure winning to control systems in the obvious way: a state x ∈ X of Σ is almost sure winning if there is a controller C such that P^C_x(ϕ) = 1. The controller synthesis problem asks to compute an optimal controller C such that for every almost sure winning state x, P^C_x(ϕ) = 1.
Majumdar et al. [24] show that this synthesis problem can be approximately solved by lifting the system Σ to a finite 2½-player game. We used Fairsyn to solve the resulting 2½-player Rabin games obtained for the controller synthesis problem for Σ in (15) and for the following specification given in LTL using the predicates A, B, C, D as shown in Fig. 3: ϕ := (♦B → □C) ∧ □(A → ¬C).
In Table 1, we compare the performance of Fairsyn against the state-of-the-
art algorithm for solving this problem, which is implemented in the tool called
StochasticSynthesis (SS) [12]. It can be observed that Fairsyn significantly out-
performs SS for every abstraction of different coarseness considered here.
Acknowledgments:
R. Majumdar and K. Mallik are funded through the DFG project 389792660
TRR 248–CPEC, A.-K. Schmuck is funded through the DFG project (SCHM
3541/1-1), and S. Soudjani is funded through the EPSRC New Investigator
Award CodeCPS (EP/V043676/1).
References
1. de Alfaro, L., Henzinger, T.A., Kupferman, O.: Concurrent reachability games. In:
39th Annual Symposium on Foundations of Computer Science, FOCS. pp. 564–575.
IEEE Computer Society (1998)
2. Aminof, B., Ball, T., Kupferman, O.: Reasoning about systems with transition
fairness. In: 11th International Conference on Logic for Programming, Artificial
Intelligence, and Reasoning. LNCS, vol. 3452, pp. 194–208. Springer (2004)
3. Baier, C., Katoen, J.P.: Principles of Model Checking. MIT Press (2008)
4. Banerjee, T., Majumdar, R., Mallik, K., Schmuck, A.K., Soudjani, S.: Fast sym-
bolic algorithms for omega-regular games under strong transition fairness (2021),
https://www.mpi-sws.org/tr/2020-007.pdf
5. Belta, C., Yordanov, B., Gol, E.A.: Formal methods for discrete-time dynamical
systems, vol. 15. Springer (2017)
6. Büchi, J.R., Landweber, L.H.: Solving sequential conditions by finite-state strate-
gies. Transactions of the American Mathematical Society 138, 295–311 (1969)
7. Chatterjee, K., de Alfaro, L., Henzinger, T.A.: The complexity of stochastic Ra-
bin and Streett games. In: Proceedings of the 32nd International Colloquium on
Automata, Languages and Programming (ICALP). Lecture Notes in Computer
Science, vol. 3580, pp. 878–890. Springer (2005)
8. Chatterjee, K., De Alfaro, L., Faella, M., Majumdar, R., Raman, V.: Code aware
resource management. Formal Methods in System Design 42(2), 146–174 (2013)
9. Chatterjee, K., Jurdziński, M., Henzinger, T.A.: Quantitative stochastic parity
games. In: Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete
algorithms. pp. 121–130. Society for Industrial and Applied Mathematics (2004)
10. Church, A.: Logic, arithmetic, and automata. Proceedings of the International
Congress of Mathematicians, 1962 pp. 23–35 (1963)
11. van Dijk, T., van de Pol, J.: Sylvan: Multi-core decision diagrams. In: International
Conference on Tools and Algorithms for the Construction and Analysis of Systems.
pp. 677–691. Springer (2015)
12. Dutreix, M., Huh, J., Coogan, S.: Abstraction-based synthesis for stochastic sys-
tems with omega-regular objectives. arXiv preprint arXiv:2001.09236 (2020)
13. Emerson, E.A., Jutla, C.S.: The complexity of tree automata and logics of pro-
grams. In: FoCS. vol. 88, pp. 328–337 (1988)
14. Emerson, E.A., Jutla, C.S.: Tree automata, mu-calculus and determinacy. In: FoCS.
vol. 91, pp. 368–377 (1991)
15. Garavel, H., Descoubes, N.: Very large transition systems (2003), http://cadp.inria.fr/resources/vlts/
16. van Glabbeek, R., Höfner, P.: Progress, justness, and fairness. ACM Comput. Surv.
52(4) (2019)
17. Gurevich, Y., Harrington, L.: Trees, automata, and games. In: Proceedings of the
fourteenth annual ACM symposium on Theory of computing. pp. 60–65 (1982)
18. Kamgarpour, M., Summers, S., Lygeros, J.: Control design for property specifica-
tions on stochastic hybrid systems. Hybrid Systems: Computation and Control pp.
303–312 (April 2013)
19. Klarlund, N.: Progress measures, immediate determinacy, and a subset construc-
tion for tree automata. Annals of Pure and Applied Logic 69(2-3), 243–268 (1994)
20. Kozen, D.: Results on the propositional µ-calculus. Theoretical Computer Science 27(3), 333–354 (1983); International Colloquium on Automata, Languages and Programming (ICALP)
21. Kupferman, O., Vardi, M.Y.: Safraless decision procedures. In: 46th Annual IEEE
Symposium on Foundations of Computer Science (FOCS’05). pp. 531–540. IEEE
(2005)
22. Laurenti, L., Lahijanian, M., Abate, A., Cardelli, L., Kwiatkowska, M.: Formal
and efficient synthesis for continuous-time linear stochastic hybrid processes. IEEE
Transactions on Automatic Control (2020)
23. Long, D.E., Browne, A., Clarke, E.M., Jha, S., Marrero, W.R.: An improved al-
gorithm for the evaluation of fixpoint expressions. In: International Conference on
Computer Aided Verification. pp. 338–350. Springer (1994)
24. Majumdar, R., Mallik, K., Schmuck, A.K., Soudjani, S.: Symbolic qualitative con-
trol for stochastic systems via finite parity games. In: ADHS 2021 (2021)
25. Majumdar, R., Mallik, K., Soudjani, S.: Symbolic controller synthesis for Büchi
specifications on stochastic systems. In: Proceedings of the 23rd International Con-
ference on Hybrid Systems: Computation and Control. pp. 1–11 (2020)
26. Maler, O., Pnueli, A., Sifakis, J.: On the synthesis of discrete controllers for timed
systems. In: Annual Symposium on Theoretical Aspects of Computer Science. pp.
229–242. Springer Berlin Heidelberg (1995)
27. Piterman, N., Pnueli, A.: Faster solutions of Rabin and Streett games. In: 21st
Annual IEEE Symposium on Logic in Computer Science (LICS’06). pp. 275–284
(2006)
28. Pnueli, A.: On the extremely fair treatment of probabilistic algorithms. In: Pro-
ceedings of the fifteenth annual ACM symposium on Theory of computing. pp.
278–290 (1983)
29. Pnueli, A., Rosner, R.: A framework for the synthesis of reactive modules. In: Vogt,
F.H. (ed.) International Conference on Concurrency, Proceedings. LNCS, vol. 335,
pp. 4–17. Springer (1988)
30. Pnueli, A., Rosner, R.: On the synthesis of a reactive module. In: Annual ACM
Symposium on Principles of Programming Languages. pp. 179–190. ACM Press
(1989)
31. Rabin, M.O.: Decidability of second-order theories and automata on infinite trees.
Transactions of the American Mathematical Society 141, 1–35 (1969)
32. Somenzi, F.: Cudd 3.0.0 (2019), https://github.com/ivmai/cudd
33. Tabuada, P.: Verification and control of hybrid systems: a symbolic approach.
Springer Science & Business Media (2009)
34. Zielonka, W.: Infinite games on finitely coloured graphs with applications to au-
tomata on infinite trees. Theor. Comput. Sci. 200(1-2), 135–183 (1998)
Practical Applications of the
Alternating Cycle Decomposition
Antonio Casares¹, Alexandre Duret-Lutz², Klara J. Meyer³,
Florian Renkin², and Salomon Sickert⁴
¹ LaBRI, Université de Bordeaux, France, antonio.casares-santos@labri.fr
² LRDE, EPITA, France, adl@lrde.epita.fr, frenkin@lrde.epita.fr
³ Independent Researcher, email@klarameyer.de
⁴ School of Computer Science and Engineering, The Hebrew University, Israel, salomon.sickert@mail.huji.ac.il
Abstract. In 2021, Casares, Colcombet, and Fijalkow introduced the
Alternating Cycle Decomposition (ACD) to study properties and trans-
formations of Muller automata. We present the first practical implemen-
tation of the ACD in two different tools, Owl and Spot, and adapt it
to the framework of Emerson-Lei automata, i.e., ω-automata whose ac-
ceptance conditions are defined by Boolean formulas. The ACD provides
a transformation of Emerson-Lei automata into parity automata with
strong optimality guarantees: the resulting parity automaton is minimal
among those automata that can be obtained by duplication of states.
Our empirical results show that this transformation is usable in practice.
Further, we show how the ACD can generalize many other specialized
constructions such as deciding typeness of automata and degeneraliza-
tion of generalized Büchi automata, providing a framework of practical
algorithms for ω-automata.
1 Introduction
Automata over infinite words have many applications, including verification and
synthesis of reactive systems with specifications given in formalisms such as Lin-
ear Temporal Logic (LTL) [27,23,11,12,2,29]. The synthesis problem from
LTL specifications asks, given an LTL formula φ, to build a controller that pro-
cesses an input word letter by letter, producing an output word, such that the
combined input-output-word satisfies φ. The automata-theoretic approach to
this problem (first introduced by Pnueli and Rosner [27]) consists of building a
deterministic ω-automaton Aequivalent to the LTL specification φ, then con-
struct a game from Ain which the opponent chooses the input letters for the
automaton, and finally solve this game and obtain a controller from a winning
strategy (whenever such a strategy exists). The automaton Acan use differ-
ent kinds of acceptance conditions (Rabin, Emerson-Lei, Muller, parity...) and
Salomon Sickert is supported in part by the Deutsche Forschungsgemeinschaft (DFG)
under project number 436811179, and in part funded by the European Research
Council (ERC) under the European Union’s Horizon 2020 research and innovation
programme under grant agreement No. 787367 (PaVeS)
thus we obtain games with different winning conditions. Among these games, parity games are the easiest to solve and there are highly developed techniques for parity game solvers. Thus it is common practice to transform the automaton A into a parity one (for which we might need to augment the state space of the automaton). The top-ranked tools in the SyntComp competitions [17], Strix [23] (winner of the 2018, 2019, 2020, and 2021 editions) and ltlsynt [26], use this approach, producing a transition-based Emerson-Lei automaton (TELA) as an intermediate step before constructing the parity automaton. For this reason, optimal and efficient procedures to transform Emerson-Lei automata into parity automata are of great importance.
Emerson-Lei (EL) acceptance conditions (first defined by Emerson and Lei [10], and reinvented in the HOA format [3]) are arbitrary positive Boolean formulas over the primitives Inf(c) and Fin(c), where the c's are colors from a set Γ. A run is accepting if the set of colors F ∈ 2^Γ seen infinitely often is a satisfying assignment of the EL acceptance condition (see Section 2 for a formal definition). Note that an explicit representation of all satisfying assignments is comparable to the Muller condition [15, Section 1.3.2]. Since the Boolean structure of LTL formulas can be mimicked by Emerson-Lei acceptance conditions, a translation of LTL formulas to Emerson-Lei automata is particularly convenient.
Many algorithms to transform Emerson-Lei and Muller automata to parity automata have been proposed. In essence they all transform an automaton by turning each original state q into multiple states of the form (q, r), where r records some information about the current run, and transitions leaving (q, r) otherwise have a one-to-one mapping with those leaving q. Definition 3 calls this a locally bijective morphism, and we like to refer to those as algorithms that duplicate states. For instance, in the Later Appearance Record (LAR) [16], r is a list of all colors ordered by most recent appearance, producing therefore a blow-up of |Γ|! in the state space of the automaton. The State Appearance Record (SAR) [24,22] is a variation of this idea for state-based conditions, and the Color Appearance Record (CAR) [28] is a variation for the Emerson-Lei condition. The Index Appearance Record (IAR) [24,22,20] is a specialized construction for Rabin and Streett conditions, where r is now an ordering of pair indices. These algorithms have no particular insight into the input acceptance condition, such as inclusions or redundancies between colors (or pairs). In the Zielonka-tree transformation [31], r is a reference to a branch in a tree representation of a Muller condition. That tree representation is tailored to the condition and allows such simplifications compared to previous methods (it can be proven to be always better [6,25]). While none of these algorithms use the structure of the input automaton to optimize the produced automata, some heuristics have been proposed [28,25,21].
In 2021, inspired by the Zielonka tree, Casares et al. introduced the Alternating Cycle Decomposition (ACD) of a Muller automaton [6]. Simply put, the ACD is a forest, i.e., a list of trees, that captures how accepting and rejecting cycles interleave in the automaton. They use the ACD to transform Muller automata into parity automata, and they prove a strong optimality result: the resulting automaton uses an optimal number of colors and has a minimal number of states among those parity automata that can be obtained by duplicating states of the original one (see Theorem 1 for a formal statement). The main novelty of this transformation is that it does not only take into account the structure of both the acceptance condition and the automaton, but it exactly captures how they interact with each other. Moreover, Casares et al. [6] show that we can obtain some other valuable information about a Muller automaton from its ACD: for example, the ACD can be used to decide typeness, i.e., whether we can relabel it with another acceptance condition (parity, Rabin, Streett...). Their approach is primarily theoretical and puts the emphasis on how the ACD can be useful to obtain new results concerning Muller automata, but little is said about the costs of computing the ACD or the applicability of the transformation in practice.
Contributions. In this paper, we show that the ACD is practical. We adapt the
definition of the ACD to Emerson-Lei automata and the HOA format [3]. We
implement the ACD and the associated transformation in two tools: Owl [18]
and Spot [9], providing baselines for efficient implementations of these struc-
tures. We show that the ACD gives a usable and useful method to transform
Emerson-Lei automata into parity ones, improving upon any previous transfor-
mation in terms of the size of the output parity automaton. We extend the ACD
to produce state-based automata, and show that the ACD generally beats tradi-
tional degeneralization-based procedures. Our implementation can also use the
ACD to check typeness of deterministic automata.
Structure of the paper. We begin by providing some common definitions in Section 2. In Section 3, we define the Alternating Cycle Decomposition, adapting the definition of Casares et al. [6] to Emerson-Lei automata, and in Section 4 we provide an algorithm to compute it. In Section 5, we study the transformation of Emerson-Lei automata into parity ones using the ACD and we show experimental results obtained by comparing the ACD-transform implemented in Spot and Owl with other commonly used transformations. In Section 6 we show experimental results for the particular case of degeneralization of generalized Büchi automata. In Section 7 we discuss the utility of the ACD to decide typeness of automata.
2 Preliminaries
We denote by |A| the cardinality of a set A and by 2^A its power set. For a finite alphabet Σ, we write Σ* and Σω for the sets of finite and infinite words, respectively, over Σ. The empty word is denoted by ε. Given v ∈ Σ*, w ∈ Σω, we denote their concatenation by v·w, and we write v ⊑ w if v is a prefix of w. We write inf(w) for the set of letters that occur infinitely often in w. Given a map σ : A → B and a subset A′ ⊆ A, we denote by σ|A′ the restriction of σ to A′. We extend σ to A* and Aω component-wise and we denote these extensions by σ whenever no confusion arises.
A (directed, edge-colored) graph is a pair G = (V, E) where V is a finite set of vertices and E ⊆ V × Γ × V is a finite set of Γ-colored edges. Note that with
Table 1: Encoding of common acceptance conditions into Emerson-Lei conditions. The variables c, c0, c1, ... stand for arbitrary colors from the set Γ.

(B)  Büchi                 Inf(c)
(GB) generalized Büchi     ⋀_i Inf(c_i)
(C)  co-Büchi              Fin(c)
(GC) generalized co-Büchi  ⋁_i Fin(c_i)
(R)  Rabin                 ⋁_i (Fin(c_2i) ∧ Inf(c_2i+1))
(S)  Streett               ⋀_i (Inf(c_2i) ∨ Fin(c_2i+1))
(P)  parity min even       Inf(0) ∨ (Fin(1) ∧ (Inf(2) ∨ (Fin(3) ∧ ...)))
     parity min odd        Fin(0) ∧ (Inf(1) ∨ (Fin(2) ∧ (Inf(3) ∨ ...)))
this definition one can have multiple differently colored edges from a vertex v to a vertex u. A graph G′ = (V′, E′) is a subgraph of G (written G′ ⊑ G) if V′ ⊆ V and E′ ⊆ E. A graph G = (V, E) is strongly connected if for every pair of vertices (v, u) ∈ V² there is a path from v to u. A strongly connected component (SCC) of a graph G is a maximal strongly connected subgraph of G.
Emerson-Lei acceptance conditions. Let Γ = {0, ..., n−1} be a finite set of n integers called colors, from now on also written Γ = {0, 1, ...} in our examples. We define the set EL(Γ) of acceptance conditions according to the following grammar, where c stands for any color in Γ:

α ::= ⊤ | ⊥ | Inf(c) | Fin(c) | (α ∧ α) | (α ∨ α)

Acceptance conditions are interpreted over subsets of Γ. For C ⊆ Γ we define the satisfaction relation C ⊨ α inductively according to the following semantics:

C ⊨ ⊤      C ⊨ Inf(c) iff c ∈ C      C ⊨ α1 ∧ α2 iff C ⊨ α1 and C ⊨ α2
C ⊭ ⊥      C ⊨ Fin(c) iff c ∉ C      C ⊨ α1 ∨ α2 iff C ⊨ α1 or C ⊨ α2

We denote by ¬α the negation of the acceptance condition α, i.e., Fin(m) becomes Inf(m) and vice-versa, ⊤ becomes ⊥, etc. We assume that constants are propagated, i.e., a formula is either ⊤, ⊥, or does not contain ⊤ and ⊥.
Table 1 shows how common acceptance conditions can be encoded into Emerson-Lei conditions. Note that colors may appear multiple times; for instance (Fin(0) ∧ Inf(1)) ∨ (Fin(1) ∧ Inf(0)) is a Rabin condition.
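As an illustration of the semantics above, the following Python sketch evaluates the satisfaction relation C ⊨ α for EL conditions represented as nested tuples. The tuple encoding is our own and not a format used by Owl or Spot.

def sat(C, alpha):
    """Return True iff the set of colors C satisfies the EL condition alpha."""
    kind = alpha[0]
    if kind == 'true':  return True
    if kind == 'false': return False
    if kind == 'Inf':   return alpha[1] in C      # Inf(c): c occurs infinitely often
    if kind == 'Fin':   return alpha[1] not in C  # Fin(c): c occurs finitely often
    if kind == 'and':   return sat(C, alpha[1]) and sat(C, alpha[2])
    if kind == 'or':    return sat(C, alpha[1]) or sat(C, alpha[2])
    raise ValueError(kind)

# The Rabin condition (Fin(0) ∧ Inf(1)) ∨ (Fin(1) ∧ Inf(0)) from the text:
rabin = ('or', ('and', ('Fin', 0), ('Inf', 1)),
               ('and', ('Fin', 1), ('Inf', 0)))
assert sat({0}, rabin) and sat({1}, rabin) and not sat({0, 1}, rabin)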
Emerson-Lei automata. A transition-based Emerson-Lei automaton (TELA) is a tuple A = (Q, Σ, Q0, ∆, Γ, α), where Q is a finite set of states, Σ is a finite input alphabet, Q0 ⊆ Q is a non-empty set of initial states, Γ is a set of colors, ∆ ⊆ Q × Σ × 2^Γ × Q is a finite set of transitions, and α ∈ EL(Γ) is an Emerson-Lei condition. The graph of A is the directed edge-colored graph G_A = (Q, E) where the edges E = {(q, C, q′) : ∃a ∈ Σ. (q, a, C, q′) ∈ ∆} are obtained from ∆ by removing Σ. We denote the transition (q, a, C, q′) ∈ ∆ and the edge (q, C, q′) ∈ E by q –a:C→ q′ and q –C→ q′, respectively. Further, we might omit a or C if they are clear from the context. We denote by γ the projection of ∆ or E to the set of colors Γ. Given a word w = a0·a1·a2··· ∈ Σω, a run over w in A is a sequence ϱ = (q0, a0, C0, q1)·(q1, a1, C1, q2)··· ∈ ∆ω such that q0 ∈ Q0. The output of the run ϱ is the word γ(ϱ) ∈ (2^Γ)ω. A run ϱ is accepting if inf(γ(ϱ)) ⊨ α. A word w ∈ Σω is accepted (or recognized) by A if there exists an accepting run over w in A. We denote by L(A) the set of words accepted by A. Two automata A, A′ are equivalent if L(A) = L(A′). The size of an automaton, written |A|, is the cardinality of its set of states. A state q ∈ Q is reachable if there is a path from some state in Q0 to q in G_A.
An automaton A is deterministic if Q0 is a singleton and for every q ∈ Q and a ∈ Σ there is at most one transition q –a:C→ q′ leaving q labeled with a.
We will use automata with acceptance defined over transitions (instead of state-based acceptance) by default. However, in Sections 5 and 6 we will also discuss transformations towards automata with state-based acceptance.
If the acceptance condition of an automaton is represented as a condition of kind X (cf. Table 1), we call it an X-automaton. We assume that each transition of a parity automaton is colored with exactly one color; this can be achieved by substituting the set C in a transition q –a:C→ q′ by min C (if C ≠ ∅) or by {|Γ|+1} if C = ∅. (If C is a singleton we will omit the brackets in the notation.)
Labeled trees. A tree is a non-empty prefix-closed set T ⊆ N* whose elements are called nodes. It is partially ordered by the prefix relation; if x ⊑ y we say that x is an ancestor of y and y is a descendant of x (we add the adjective "strict" if moreover x ≠ y). The empty string ε is the root of the tree. The set of children of a node x ∈ T is Children_T(x) = {x·i ∈ T : i ∈ N}. The set of leaves of T is Leaves(T) = {x ∈ T : Children_T(x) = ∅}. Nodes belonging to a same set Children_T(x) are called siblings, and they are ordered from left to right by increasing value of their last component. If A is a set of labels, an A-labeled tree is a pair ⟨T, η⟩ of a tree T and a map η : T → A. The depth of a node x is Depth(x) = |x|. The height of T is Height(T) = 1 + max_{x∈T} Depth(x).
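The following Python helpers, reused by later sketches, mirror this vocabulary on trees represented as prefix-closed sets of integer tuples, the empty tuple playing the role of the root ε; the representation is our own choice for illustration.

def children(T, x):
    """Children_T(x), ordered left to right by the last component."""
    return sorted(y for y in T if len(y) == len(x) + 1 and y[:len(x)] == x)

def leaves(T):
    return sorted(x for x in T if not children(T, x))

def depth(x):
    return len(x)

def height(T):
    return 1 + max(depth(x) for x in T)

T = {(), (0,), (1,), (0, 0)}   # a root with two children; the left child has one child
assert leaves(T) == [(0, 0), (1,)] and height(T) == 3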
3 The Alternating Cycle Decomposition
The Alternating Cycle Decomposition (ACD), proposed by Casares et al. [6], is a generalization of the Zielonka tree. The ACD of an automaton A is a forest, a collection of trees, labeled with accepting and rejecting cycles of the automaton. For each SCC of A we have a unique tree, and the labeling of each tree alternates between accepting and rejecting cycles. Thus the ACD captures the complexity of the cycle structure of each SCC. We now present the definition of the ACD adapted to TELA.
For the rest of this section, let A = (Q, Σ, Q0, ∆, Γ, α) be a TELA and let G_A = (Q, E) be the associated graph with edges colored by γ : E → 2^Γ. We lift γ to sets and define γ(E′) = ⋃_{e∈E′} γ(e) for every subset E′ ⊆ E.
Definition 1. A cycle of A is a subset of edges ℓ ⊆ E forming a closed path in G_A. A cycle ℓ is accepting (resp. rejecting) if γ(ℓ) ⊨ α (resp. γ(ℓ) ⊭ α). The set of states of a cycle ℓ is States(ℓ) = {q ∈ Q : some e ∈ ℓ passes through q}. The set of cycles of A is denoted Cycles(A). It is (partially) ordered by set inclusion.

Definition 2 ([6]). Let S1, ..., Sk be an enumeration of the strongly connected components of G_A. The Alternating Cycle Decomposition of A, denoted ACD(A), is a collection of k Cycles(A)-labeled trees ⟨T1, ..., Tk⟩ with Ti = ⟨Ti, ηi⟩ such that:
– ηi(ε) is the set of edges of Si, for i = 1, ..., k.
– If x ∈ Ti and ηi(x) is an accepting cycle, then x has a child in Ti for each maximal element in {ℓ ∈ Cycles(A) : ℓ ⊆ ηi(x) and ℓ is rejecting}. In this case, we say that x is a round node.
– If x ∈ Ti and ηi(x) is a rejecting cycle, then x has a child in Ti for each maximal element in {ℓ ∈ Cycles(A) : ℓ ⊆ ηi(x) and ℓ is accepting}. In this case, we say that x is a square node.

If q ∈ Q is a state belonging to the SCC Si of A, we define the tree associated to q as the subtree Tq = ⟨Tq, ηq⟩ given by:

Tq = {ε} ∪ {x ∈ Ti : q ∈ States(ηi(x))},    ηq = ηi|Tq.
Remark 1. We provide examples online at https://spot.lrde.epita.fr/ipynb/zlktree.html; an executable copy of this notebook is included in the artifact [8].
4 An Efficient Computation of the ACD
In this section we give an algorithm to compute the Alternating Cycle Decomposition of an Emerson-Lei automaton A, implemented in Owl [18] and Spot [9]. This can be done by first computing an SCC decomposition of G_A, which gives us the labels of the roots of the trees ⟨T1, ..., Tk⟩, and then recursively computing the children of the nodes of each tree, following the definition of ACD(A). Algorithm 1 shows how to compute the children of a given node and uses notation we introduce now.
Let C ⊆ Γ be a subset of colors and let S = (QS, ES) ⊑ G_A be a subgraph. We define the projection of S on C, denoted S|C = (QS, E′S), as the subgraph of S obtained by removing the edges e ∈ ES such that γ(e) ⊈ C, that is, E′S = {(q, D, q′) ∈ ES : D ⊆ C}. We write Colors(S) = ⋃_{e∈ES} γ(e). We say that S′ ⊑ S is a C-strongly connected component in S (C-SCC) if it is an SCC of S|C and Colors(S′) = C. Further, max⊆ X denotes the set of all maximal elements of a set X of cycles according to the partial order defined by ⊆.
Note that Algorithm 1 uses Algorithm 2, which simplifies the Emerson-Lei condition before passing the formula to a Max-SAT function (a SAT solver that computes maximal satisfying assignments, e.g., by clause blocking) [4]. This preprocessing ensures that the ACD for Rabin or Streett acceptance conditions can be constructed without making use of the general-purpose algorithm for computing maximal satisfying assignments.
Algorithm 1 Computing the children of a node.
1: Input: A cycle S = ηi(x) corresponding to the label of a node x of ACD(A).
2: Output: The set of labels for the children of x, (S1, ..., Sk).
3: function Compute-Children(S)
4:   children ← ∅, C ← Colors(S)
5:   if C ⊨ α then                                   ▷ Maximal subsets D ⊆ C such that D ⊨ α iff C ⊭ α
6:     {C1, ..., Ck} ← Max-Satisfying-Subsets(C, ¬α)
7:   else
8:     {C1, ..., Ck} ← Max-Satisfying-Subsets(C, α)
9:   for D ∈ {C1, ..., Ck} do
10:    for S′ ∈ SCCs of S|D do                       ▷ These might not be D-SCCs in S
11:      if Colors(S′) ⊨ α ⟺ D ⊨ α then
12:        children ← children ∪ {S′}
13:      else
14:        children ← children ∪ Compute-Children(S′)
15:  return max⊆ children                            ▷ Remove from children non-maximal cycles
Algorithm 2 The subprocedure Max-Satisfying-Subsets.
1: Input: A subset of colors C ⊆ Γ and an EL condition α ∈ EL(Γ).
2: Output: max⊆ {D ⊆ C : D ⊨ α}.
3: function Max-Satisfying-Subsets(C, α)
4:   if C ⊨ α then
5:     return {C}
6:   α ← α[if c ∈ C then c else ⊥]        ▷ Replace colors not in C by false
7:   L ← {c ∈ C : ¬c does not occur in α}
8:   if L ≠ ∅ then
9:     α ← α[if c ∈ L then ⊤ else c]      ▷ Replace colors in L by true
10:    {C1, ..., Ck} ← Max-Satisfying-Subsets(C \ L, α)
11:    return {C1 ∪ L, ..., Ck ∪ L}
12:  if α = ¬c1 ∨ ··· ∨ ¬cn then
13:    return {{c1, ..., cn} \ {ci} : 1 ≤ i ≤ n}
14:  return Max-SAT(α)
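To make the specification of Algorithm 2 concrete, here is a brute-force Python reference for max⊆{D ⊆ C : D ⊨ α}, reusing the evaluator sat and the condition rabin sketched in Section 2. It enumerates all subsets and is exponential in |C|; Algorithm 2 avoids this blow-up through its recursion and a final Max-SAT call.

from itertools import combinations

def max_satisfying_subsets(C, alpha):
    """All maximal D ⊆ C with D |= alpha (brute force, for reference only)."""
    found = []
    for size in range(len(C), -1, -1):       # from largest to smallest subsets
        for D in map(frozenset, combinations(sorted(C), size)):
            # D is maximal iff it satisfies alpha and no larger satisfying
            # subset (necessarily found earlier) strictly contains it.
            if sat(D, alpha) and not any(D < E for E in found):
                found.append(D)
    return found

# The maximal subsets of {0, 1} satisfying the Rabin condition are {0} and {1}:
assert set(max_satisfying_subsets({0, 1}, rabin)) == {frozenset({0}), frozenset({1})}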
Memoization. To optimize the construction of the ACD and to avoid duplicated recursive calls, we perform two kinds of memoization: first, we memoize the results of calling Algorithm 2 from Algorithm 1 (thus we implicitly construct a Zielonka DAG for α); second, we memoize the recursive calls to Algorithm 1: this is useful, as distinct nodes in the ACD can be labeled by the same cycles.
5 From Emerson-Lei to Parity Automata
In this section we describe the transformation from TELA to parity automata using the Alternating Cycle Decomposition [6]. This transformation provides strong optimality guarantees: the resulting parity automaton has minimal size among those that can be produced without merging states of the TELA, and it uses an optimal number of colors (Theorem 1). We also show that this transformation can be adapted to produce state-based automata. Note that in this case we lose the first optimality guarantee.
5.1 The ACD Transformation
Let A = (Q, Σ, Q0, ∆, Γ, α) be a TELA and let ACD(A) = ⟨T1, ..., Tk⟩. We introduce the following notation that will allow us to move in the ACD.
Given a transition e = q –a:C→ q′ such that both q and q′ belong to the i-th SCC of A and a node x ∈ Ti, we define Support(x, e) to be the least ancestor z of x in Ti such that e ∈ ηi(z). If Support(x, e) ≠ x and it is not a leaf in Tq, let z be the only child of Support(x, e) that is an ancestor of x, and let y1, ..., ys be an enumeration from left to right of the nodes in Children_Tq(Support(x, e)). We define NextBranch(x, e) as:
– Support(x, e), if Support(x, e) = x or if Support(x, e) is a leaf in Tq,
– y1, if z = ys,
– y_{j+1}, if z = yj, 1 ≤ j < s.
We define a parity automaton P_ACD(A) = (P, Σ, P0, ∆P, ΓP, β) (the ACD transform of A) equivalent to A as follows:
States. The states of P_ACD(A) are of the form (q, x), for q ∈ Q and x a leaf of the tree associated to q. Initial states are of the form (q0, x) where q0 ∈ Q0 is an initial state of A and x is the leftmost leaf of its corresponding tree:

P = ⋃_{q∈Q} {q} × Leaves(Tq),    P0 = {(q0, x) : q0 ∈ Q0, x the leftmost leaf in Tq0}.

Transitions. For each transition e = q –a:C→ q′ in ∆ and each state (q, x) ∈ P, we define a transition (q, x) –a:p→ (q′, y) in ∆P as follows: first, q′ is the destination state of the original transition. If q and q′ are not in the same SCC, then y is the leftmost leaf in Tq′ and p = 1 (except if all Ti have height 1 and a round root: in that case p = 0). Otherwise, if both q and q′ belong to the i-th SCC of A, then the destination leaf y is the leftmost descendant of NextBranch(x, e) in Tq′.
We define the color p of the transition as Depth(Support(x, e)) if the root of Ti is a round node (ηi(ε) ⊨ α), or as Depth(Support(x, e)) + 1 otherwise. We remark that in this way, p is even if and only if ηi(Support(x, e)) ⊨ α.
Parity condition. The condition β is a parity min even condition (cf. Table 1).
Remark 2. If the color 0 does not appear on any transition, then we shift all colors by 1 and replace β by a parity min odd condition.

Proposition 1 ([6]). The automaton P_ACD(A) recognizes L(A).
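The navigation primitives above can be sketched in a few lines of Python on top of the tuple-based trees of Section 2. Here eta maps each node of Ti to its cycle (a set of edges) and root_is_round records whether ηi(ε) ⊨ α; both are assumed to be precomputed (e.g., by Algorithm 1), and for brevity we navigate in Ti, eliding the restriction to the subtree Tq.

def support(eta, x, e):
    """Least ancestor z of x with e in eta(z)."""
    for d in range(len(x), -1, -1):          # x itself, then ever shorter prefixes
        if e in eta[x[:d]]:
            return x[:d]
    raise ValueError("edge outside the root cycle")

def next_branch(T, eta, x, e):
    s = support(eta, x, e)
    kids = children(T, s)
    if s == x or not kids:                   # Support(x, e) = x, or a leaf
        return s
    z = x[:len(s) + 1]                       # the child of s that is an ancestor of x
    return kids[(kids.index(z) + 1) % len(kids)]   # next sibling, wrapping to y1

def color(eta, x, e, root_is_round):
    """Priority emitted by the transition leaving (q, x) along edge e."""
    s = support(eta, x, e)
    return depth(s) if root_is_round else depth(s) + 1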
Remark 3. The ACD transformation preserves many properties (determinism,
completeness, good-for-gameness, unambiguity...) of the automaton A, see [6].
Remark 4. Since the number of colors used by P_ACD(A) is at most the height of a tree in ACD(A), we obtain that P_ACD(A) never uses more than |Γ| + 1 colors. Furthermore, since the TELA format does not require all transitions to have a color, we can omit the maximal one and produce an automaton with at most |Γ| colors.
In order to state the optimality of this transformation we introduce the notion of locally bijective morphisms of automata. Given an automaton A = (Q, Σ, Q0, ∆, Γ, α) and q ∈ Q, we denote by Out_A(q) the set of outgoing transitions of q, i.e., Out_A(q) = {q –a:C→ q′ ∈ ∆ : a ∈ Σ, C ⊆ Γ, q′ ∈ Q}.

Definition 3 ([6]). Let A = (Q, Σ, Q0, ∆, Γ, α) and A′ = (Q′, Σ, Q′0, ∆′, Γ′, α′) be two EL automata over Σ. A locally bijective morphism from A to A′ (denoted φ : A → A′) is a pair of maps φQ : Q → Q′, φ∆ : ∆ → ∆′ such that:
– φQ|Q0 is a bijection between Q0 and Q′0.
– φ∆(q1 –a:C→ q2) = φQ(q1) –a:C′→ φQ(q2) for some C′ ⊆ Γ′.
– For every q ∈ Q, φ∆|Out_A(q) is a bijection between Out_A(q) and Out_A′(φQ(q)).
– For every run ϱ ∈ ∆ω in A, ϱ is accepting iff φ∆(ϱ) is accepting in A′.
Theorem 1 ([6]). Let A be an Emerson-Lei automaton, and let P_ACD(A) be the parity automaton obtained by applying the ACD transformation. Then:
– There is a locally bijective morphism φ : P_ACD(A) → A.
– If P is a parity automaton admitting a locally bijective morphism to A, then |P_ACD(A)| ≤ |P|.
– If P is a parity automaton recognizing L(A), then P uses at least as many colors as P_ACD(A).

Note that all state-duplicating constructions mentioned in the introduction create locally bijective morphisms. Thus the above theorem shows that the ACD transformation duplicates the least number of states.
5.2 Experimental Results
Figures 1 and 2 compare four different paritization procedures applied to 1065 TELA generated⁵ from LTL formulas from the Synthesis Competition. These automata have between 2 and 55 colors (mean 5.92, median 5) and between 1 and 245,761 states (mean 2023.20, median 20). Automata with fewer than 2 colors have been ignored since they are trivial to paritize.
The procedures are Owl's and Spot's implementations of the ACD transform, as well as Spot's implementation of the Zielonka Tree transform [6], and Spot's previous paritization function (called to_parity) [28]. We refer the reader to Section 8 for information about the versions used. Two dotted lines on the sides
⁵ We used ltl2tgba -G -D from Spot, and ltl2dela from Owl.
Fig. 1: Comparison of the output size of the four paritization procedures. (Scatter plots with log-scale axes, Spot's ACD transform on the y-axis: vs. Owl's ACD transform, 9 cases above the diagonal and 14 below; vs. Spot's Zielonka Tree transform, 4 above and 877 below; vs. Spot's to_parity, 1 above and 123 below.)
Fig. 2: Time spent performing these four paritization procedures. (Scatter plots with log-scale axes, Spot's ACD transform on the y-axis: vs. Owl's ACD transform, 180 cases above the diagonal and 884 below; vs. Spot's Zielonka Tree transform, 552 above and 508 below; vs. Spot's to_parity, 37 above and 1020 below.)
of the plots hold cases that did not finish within 500 seconds (red, inner line), or where the tool reported an error⁶ (orange, outer line). Pink dots represent input automata that already have parity acceptance: for those, running the ACD transform still makes sense, as it will produce an output with a minimal number of colors. However, Owl's implementation, which mostly cares about reducing the number of states, uses a shortcut and returns the input automaton unmodified in this case: this explains the pink cloud on the left of Figure 2.
Owl's and Spot's implementations of the ACD transform produce automata of the same size, as expected. The cases that are not on the diagonal all correspond to timeouts or tool errors. The Zielonka Tree transform, which does not take the automaton structure into consideration, produces automata that are on average 2.11 times bigger (median 1.60), while its runtime is on average 6.55 times slower (median 0.97). Lastly, Spot's to_parity function is not far from the optimal size given by the ACD transform: on average its output is 3.28 times larger, but the median of that size ratio is 1.00. Similarly, it is on average 15.94 times slower, but with a median of 1.04.
⁶ Either "out-of-memory" or "too many colors", as Spot is restricted to 32 colors.
5.3 ACD Transformation Towards State-Based Parity Automata
Sometimes it is desirable to obtain an automaton with the acceptance defined over states. A state-based parity automaton is a tuple A = (Q, Σ, Q0, ∆, ϕ : Q → N) where (Q, Σ, Q0, ∆) is the underlying structure defined as for transition-based automata in Section 2 (with the only difference that ∆ ⊆ Q × Σ × Q now), and ϕ : Q → N is a map associating colors to states. A run of A is accepting if the minimal color visited infinitely often is even.
Let A be a TELA with ACD(A) = ⟨T1, ..., Tk⟩. We define an equivalent state-based parity automaton P_sb-ACD(A) = (P, Σ, P0, ∆P, ϕ : P → N) as follows:
States. States are of the form (q, x), for q ∈ Q and x ∈ Tq (now the second component corresponds to a node of the ACD that is not necessarily a leaf). The set of initial states is the same as for P_ACD(A):

P = ⋃_{q∈Q} {q} × Tq,    P0 = {(q0, x) : q0 ∈ Q0, x the leftmost leaf in Tq0}.
Transitions. For each transition e = q –a:C→ q′ ∈ ∆ and (q, x) ∈ P we define one transition (q, x) –a→ (q′, y) ∈ ∆P. To specify the destination node y, we distinguish two cases:
– Suppose that x is a leaf in Tq. If NextBranch(x, e) is not the leftmost child of Support(x, e) in Tq, then y is the leftmost leaf below NextBranch(x, e) in Tq (as in the transition-based case). If NextBranch(x, e) is the leftmost child (a "lap" around Support(x, e) is finished), then we set y = Support(x, e).
– If x is not a leaf in Tq, the destination y is determined exactly as if the transition started in (q, x′) for x′ the leftmost leaf in Tq under x.
Parity condition. ϕ((q, x)) = Depth(x) if the root of Tq is a round node, and ϕ((q, x)) = Depth(x) + 1 otherwise.
Note that we do not have the same optimality guarantee as in the transition-based case: if x is not a leaf of its corresponding tree, then the states of the form (q, x) ∈ P are not necessarily reachable in P_sb-ACD(A). We only need to add those that can be reached from the initial states. However, the set of reachable states does depend on the ordering of the children in the trees of the ACD, and therefore the size of the final automaton depends on this ordering.
We propose a heuristic to order the children of nodes in ACD(A). Let Ti be a tree in ACD(A) and x ∈ Ti. We define:

Di(x) = {q ∈ Q : q –a→ q′ ∉ ηi(x), for some q ∈ States(ηi(x)), a ∈ Σ}.

The heuristic consists in ordering the children x of a node of Ti by decreasing |Di(x)|. Experiments involving transformations towards state-based automata and testing this heuristic can be found in Section 6.2.
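A sketch of this heuristic in Python, reusing the tree helpers of Section 2; edges stands for the edge set of G_A (with hashable color sets) and eta for the labeling of Ti. Both representations are assumptions of the sketch.

def states_of(cycle):
    """States(ℓ): all states a cycle passes through."""
    return {q for (q, _c, _q2) in cycle} | {q2 for (_q, _c, q2) in cycle}

def escape_count(eta, x, edges):
    """|D_i(x)|: states of eta(x)'s cycle with an outgoing edge not in the cycle."""
    cycle, sts = eta[x], states_of(eta[x])
    return len({q for (q, c, q2) in edges
                if q in sts and (q, c, q2) not in cycle})

def order_children(T, eta, x, edges):
    """Children of x sorted by decreasing |D_i(child)|."""
    return sorted(children(T, x),
                  key=lambda y: escape_count(eta, y, edges),
                  reverse=True)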
6 Degeneralization of Generalized Büchi Automata
The transformation of generalized Büchi automata with n colors into Büchi automata (with a single color) is known as degeneralization and has been a very common processing step between algorithms that translate temporal-logic formulas into generalized Büchi automata, and model-checking algorithms that (used to) only work with Büchi automata. While it initially consisted in making 2^n copies of the GBA [30, Appendix B] to remember the set of colors that had yet to be seen, degeneralization to state-based Büchi acceptance can be done using only n + 1 copies once an arbitrary order of colors has been selected [13]. A similar construction to transition-based Büchi acceptance requires only n copies of the original automaton. Different orders of colors may lead to different numbers of reachable states in the Büchi automaton. Some tools even attempted to start the degeneralization in different copies to reduce the number of reachable states [14]. Nowadays, an implementation such as the degeneralization of Spot implements several SCC-based optimizations [2] to reduce the number of output states, but is still sensitive to the arbitrary order selected for colors. A sketch of the basic n-copy construction follows.
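For contrast with the ACD-based view of the next subsection, here is a Python sketch of the classical n-copy transition-based degeneralization under one fixed color order 0 < 1 < ... < n−1. The dictionary encoding of automata is our own; real implementations such as Spot's add the SCC-based optimizations mentioned above.

def degeneralize(delta, q0, n):
    """delta: state -> list of (letter, color_set, successor); colors are 0..n-1.
    Returns the transition map of a transition-based Büchi automaton whose
    states are pairs (original state, level), plus its initial state."""
    new_delta, todo, seen = {}, [(q0, 0)], set()
    while todo:
        q, lvl = todo.pop()
        if (q, lvl) in seen:
            continue
        seen.add((q, lvl))
        out = []
        for (a, C, q2) in delta[q]:
            lvl2 = lvl
            while lvl2 < n and lvl2 in C:   # advance past the awaited colors
                lvl2 += 1
            accepting = (lvl2 == n)         # all colors seen since the last reset
            lvl2 = 0 if accepting else lvl2
            out.append((a, accepting, (q2, lvl2)))
            todo.append((q2, lvl2))
        new_delta[(q, lvl)] = out
    return new_delta, (q0, 0)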
6.1 Transition-based Degeneralization
This order-sensitivity of the degeneralization, even in its transition-based variant, makes a striking difference with the ACD. When applied to a generalized Büchi automaton that has some accepting and rejecting paths, the ACD-transform produces an automaton with acceptance Inf(0) ∨ Fin(1). Since all transitions are either labeled by 0 or 1, color 1 is superfluous⁷ and the condition can be reduced to Inf(0). In this context, the ACD-transform therefore gives us a transition-based Büchi automaton by duplicating the fewest number of states (Theorem 1(2)).
It can be seen that the cycling around the different children of the ACD (whose ordering is arbitrary) performed during the ACD-transform is similar to the process used in traditional degeneralization. What makes the latter sensitive to color ordering is that it only "sees" one transition at a time, while the ACD provides a view of the cycles. For instance, a degeneralization would process the sequence x –0→ y –1→ z differently from the sequence x –1→ y –0→ z, depending on the order in which colors are expected to be encountered. However, if there is no other transition reaching or leaving y, the two colors will always be seen together, so their order should not matter: the two transitions belong to the same node of the ACD. The propagation of colors [28] is a related preprocessing step that can improve the degeneralization by propagating all colors common to the incoming transitions of a state to its outgoing transitions and vice-versa. It would turn the previous situation into x –0,1→ y –0,1→ z, making the color order selected by the degeneralization irrelevant (in this case).
A comparison of the output size of the traditional degeneralization imple-
mented in Spot (which includes several optimizations learned over the years)
⁷ In an automaton with "parity min" acceptance where all transitions are colored, the maximal color can always be omitted and replaced by the empty set.
Fig. 3: Two-dimensional histogram of the sizes of 1000 automata, degeneralized to transition-based Büchi automata, using Spot's degeneralization function (with or without propagation of colors), or using the ACD-transform. (Left, ACD vs. plain degeneralization: 0 cases above the diagonal, 419 below, 581 on it; right, ACD vs. degeneralization with propagation of colors: 0 above, 235 below, 765 on it.)
against that of the ACD-transform is given in the left plot of Figure 3. Unsurprisingly, because of the ACD-transform's optimality, there are no cases where the ACD loses to Spot's transition-based degeneralization. The use of the propagation of colors (right plot) is an improvement (the number of non-optimal cases dropped from 419 to 235) but not a cure.
Remark 5. The input automata used in this section and the next one are a set of 1000 randomly generated, minimal, deterministic, transition-based generalized Büchi automata, with 3 or 4 states and 2 or 3 colors. The reason for using such small minimal automata is to be able to use SAT-based minimization [1] on the degeneralized state-based output in the next section, to estimate how large the gap between an optimal automaton and our procedure is.
6.2 State-based Degeneralization
If the ACD is used to produce a state-based output, as explained in Subsection 5.3, the obtained automaton is not guaranteed to be minimal with respect to locally bijective morphisms. In this case we can obtain a weaker optimality result:

Proposition 2. Let A be a generalized Büchi automaton, and let B_sb-ACD(A) be the state-based Büchi automaton obtained by applying the ACD state-based transformation. If B is a state-based Büchi automaton admitting a locally bijective morphism to A, then |B_sb-ACD(A)| ≤ |B| + |A|.

Proof. Let B be a state-based Büchi automaton admitting a locally bijective morphism to A. We can transform it into a transition-based Büchi automaton B_trans by setting the transitions leaving accepting states to be accepting. This automaton has the same size as B and it also admits a locally bijective morphism to A. Therefore, by Theorem 1, we have that |B_ACD(A)| ≤ |B_trans| = |B|, where B_ACD(A) is the transition-based automaton obtained by applying the ACD transformation. We claim that |B_sb-ACD(A)| ≤ |B_ACD(A)| + |A| (therefore implying that |B_sb-ACD(A)| ≤ |B| + |A|). Indeed, the set of states of B_sb-ACD(A) is the union of the set of states of B_ACD(A) and a subset of nodes of the form (q, ε), where ε is the root of Tq. There are at most |A| nodes of this form. □
Fig. 4: Comparison of three ways to degeneralize to state-based Büchi: (acd, acd.heuristic) using the state-based version of the ACD-transform with or without the heuristic, and (degen) classical degeneralization. (Left, SBA.degen vs. SBA.acd: 402 cases above the diagonal, 96 below, 502 on it; right, SBA.degen vs. SBA.acd.heuristic: 498 above, 9 below, 493 on it.)
Fig. 5: Effect of the heuristic for ordering children of the ACD, and comparison to the minimal degeneralized automata (when known). (Left, SBA.acd.heuristic vs. SBA.acd: 3 cases above the diagonal, 241 below, 756 on it; right, SBA.acd.heuristic vs. SBA.minimal: 94 above, 0 below, 555 on it.)
Figure 4 compares three ways to perform state-based degeneralization. The ACD comes in two variants, with or without the heuristic of Section 5.3, and it is compared against the state-based degeneralization of Spot.
Figure 5 shows how the heuristic variant compares to the one without, and how it compares with the size of a minimal DBA, when that size could be computed in reasonable time (in 649 cases). Note that there might not be a locally bijective morphism between the input automaton and the minimal DBA computed this way; nonetheless these minimal-size automata can serve as a reference point to estimate the quality of a degeneralization. Compared to this subset of minimal DBA, the average number of additional states produced by the state-based ACD is 0.17 with the heuristic, and 0.33 without. Comparatively, Spot's degeneralization has an average of 1.21 extra states.
7 Deciding Typeness
We highlight now how the ACD can be used to decide typeness of deterministic TELA. This problem, first introduced by Krishnan and Brayton [19], consists of deciding whether we can replace the acceptance condition of a given automaton by another (hopefully simpler) one without changing the transition structure and preserving the language (see Table 1 for a list of common acceptance conditions).
Let A = (Q, Σ, Q0, ∆, Γ, α) be a TELA. We say that A is X-type, for X ∈ {B, C, GB, GC, P, R, S}, if there is an X-automaton over the same structure, A′ = (Q, Σ, Q0, ∆′, Γ′, β) (where ∆ and ∆′ only differ in the coloring of the transitions), such that L(A) = L(A′) and β belongs to X. We emphasize that we permit the use of a different set of colors Γ′ in A′. Some conditions can always be rewritten as conditions of other kinds (for example, Büchi conditions can be expressed as parity ones, so being B-type implies being P-type). We should not confuse this notion with the expressive power of deterministic automata using these conditions. For example, both deterministic parity automata and Rabin automata recognize all ω-regular languages, but there are Rabin automata that are not parity-type. Further, we say that an automaton A is weak if for every SCC S of A, all cycles in S are accepting or all of them are rejecting.
The following result shows that the ACD is a sufficient data structure for deciding typeness for many common acceptance conditions. We remark that the second item adds to the results of Casares et al. [7] (this statement only holds if transitions of automata are labeled with subsets of colors, which is not allowed in their model).
Proposition 3 ([7, Section 5.2]). Let A be a deterministic TELA such that all its states q ∈ Q are reachable, and let ACD(A) = ⟨T1, ..., Tk⟩ be its Alternating Cycle Decomposition. Then the following statements hold:
1. A is Rabin-type (resp. Streett-type) if and only if for every q ∈ Q, every round node (resp. square node) of Tq has at most one child in Tq. It is parity-type if and only if it is both Rabin-type and Streett-type.
2. A is generalized-Büchi-type (resp. generalized-co-Büchi-type) if and only if for every 1 ≤ i ≤ k, Height(Ti) ≤ 2 and, in case of equality, the root of Ti is a round node (resp. square node).
3. A is weak if and only if for every 1 ≤ i ≤ k, Height(Ti) = 1.
Also, the least number of colors used by a deterministic parity automaton recognizing L(A) is max_{1≤i≤k} Height(Ti) + ν, where ν = 0 if the roots of all trees of maximal height have the same shape (round or square), and ν = 1 otherwise.
If one of the previous conditions holds, then ACD(A) also provides an effective procedure to relabel A with the corresponding acceptance condition.
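The shape conditions of Proposition 3 are easy to state over the tuple-based trees of Section 2. The following Python sketch assumes the ACD is given as a list of (T, eta, round_root) triples, where round_root tells whether the root cycle is accepting; as before, the per-state restriction to the subtrees Tq is elided.

def is_round(x, round_root):
    """Roundness alternates with depth, starting from the root's shape."""
    return (depth(x) % 2 == 0) == round_root

def rabin_type(forest):
    return all(len(children(T, x)) <= 1
               for (T, eta, rr) in forest for x in T if is_round(x, rr))

def streett_type(forest):
    return all(len(children(T, x)) <= 1
               for (T, eta, rr) in forest for x in T if not is_round(x, rr))

def parity_type(forest):
    return rabin_type(forest) and streett_type(forest)

def generalized_buchi_type(forest):
    return all(height(T) <= 2 and (height(T) < 2 or rr)
               for (T, _eta, rr) in forest)

def weak(forest):
    return all(height(T) == 1 for (T, _eta, _rr) in forest)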
Remark 6. The ACD gives a typeness result for each SCC of the automaton, which allows us to simplify the acceptance condition of each of them independently. Further, the implications from right to left in Proposition 3 also hold for non-deterministic automata.
Proposition 3 provides an effective procedure to check typeness of TELA: we just have to build the ACD and verify that it has the appropriate shape. Spot's implementation of the ACD has options to abort the construction as soon as it detects that the shape is wrong. Moreover, if an automaton is parity-type, the ACD provides a method to relabel the automaton with a minimal number of colors. Finally, if the automaton already has parity acceptance, the ACD transformation boils down to the algorithm of Carton and Maceiras [5].
8 Availability
The ACD and the transformations based on it are currently implemented in two
open-source tools: Spot 2.10 [9] and Owl 21.0 [18]. (The original developments
were independent before the authors met and worked on this joint paper.)
In Spot 2.10, the ACD can be played with using the Python bindings. The acd
class implements the decomposition, and will render it as an interactive forest of
nodes that can be clicked to highlight the relevant cycles in the input automaton.
The acd transform() and acd transform sbacc() implements the transition-
based and state-based variant of the paritization procedure. Additionally, the
acd class has options to heuristically order the children to favor the state-based
construction, or to abort the construction as soon as it is clear that the ACD
does not have Rabin or Street shape (in case one wants to use it to establish
typeness of automata). All these features are illustrated at https://spot.lrde.ep
ita.fr/ipynb/zlktree.html. In the future, ACD will be used more by the rest of
Spot, and will be one option of the ltlsynt tool (for LTL synthesis).
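A hypothetical mini-session with these bindings, based on the names given above; the exact translate() options shown are incidental, and other option strings would serve as well.

import spot  # assumes Spot 2.10+ with its Python bindings installed

aut = spot.translate('GFa & GFb', 'deterministic', 'generic')  # a deterministic TELA
par = spot.acd_transform(aut)        # transition-based paritization via the ACD
print(par.get_acceptance())          # parity condition with a minimal number of colors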
In Owl, the ACD transformation is available through the aut2parity command. This command reads an automaton in the HOA format [3] using arbitrary acceptance, and produces a parity automaton in the same format. The tool Strix [23], which builds upon Owl, gained in version 21.0.0 the option to use the ACD-construction as an intermediate step.
Instructions to reproduce all experiments are included in the artifact [8].
9 Conclusion
We have shown that the ACD is more than a theoretically appealing construction: our two implementations show that the construction is very usable in practice, and provide a baseline for further improvements. We have also shown that the ACD is a Swiss-army knife for ω-automata, in the sense that it can generalize and replace several specific constructions (paritization, degeneralization, typeness checks).
References
1. Baarir, S., Duret-Lutz, A.: Mechanizing the minimization of deterministic generalized Büchi automata. In: Proceedings of the 34th IFIP International Conference on Formal Techniques for Distributed Objects, Components and Systems (FORTE'14), Lecture Notes in Computer Science, vol. 8461, pp. 266–283, Springer (Jun 2014), https://doi.org/10.1007/978-3-662-43613-4_17
2. Babiak, T., Badie, T., Duret-Lutz, A., Křetínský, M., Strejček, J.: Compositional
approach to suspension and other improvements to LTL translation. In: Proceed-
ings of the 20th International SPIN Symposium on Model Checking of Software
(SPIN’13), Lecture Notes in Computer Science, vol. 7976, pp. 81–98, Springer (Jul
2013), https://doi.org/10.1007/978-3-642-39176-7_6
3. Babiak, T., Blahoudek, F., Duret-Lutz, A., Klein, J., Křetínský, J., Müller, D., Parker, D., Strejček, J.: The Hanoi omega-automata format. In: Kroening, D., Păsăreanu, C.S. (eds.) Computer Aided Verification, pp. 479–486, Springer Inter-
national Publishing (2015)
4. Battiti, R., Protasi, M.: Handbook of Combinatorial Optimization: Volume 1–3, chap. Approximate Algorithms and Heuristics for MAX-SAT, pp. 77–148. Springer US (1998), ISBN 978-1-4613-0303-9, https://doi.org/10.1007/978-1-4613-0303-9_2
5. Carton, O., Maceiras, R.: Computing the Rabin index of a parity automaton. Informatique théorique et applications 33(6), 495–505 (1999), URL http://www.numdam.org/item/ITA_1999__33_6_495_0/
6. Casares, A., Colcombet, T., Fijalkow, N.: Optimal transformations of games and
automata using Muller conditions. In: Bansal, N., Merelli, E., Worrell, J. (eds.) Pro-
ceedings of the 48th International Colloquium on Automata, Languages, and Pro-
gramming (ICALP’21), Leibniz International Proceedings in Informatics (LIPIcs),
vol. 198, pp. 123:1–123:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik,
Dagstuhl, Germany (2021), https://doi.org/10.4230/LIPIcs.ICALP.2021.123
7. Casares, A., Colcombet, T., Fijalkow, N.: Optimal transformations of Muller conditions. Extended version of [6], on arXiv (2021), https://arxiv.org/abs/2011.13041
8. Casares, A., Duret-Lutz, A., Meyer, K.J., Renkin, F., Sickert, S.: Artifact for the
paper “Practical applications of the alternating cycle decomposition”. https://do
i.org/10.5281/zenodo.5572613 (2021)
9. Duret-Lutz, A., Lewkowicz, A., Fauchille, A., Michaud, T., Renault, E., Xu, L.:
Spot 2.0 a framework for LTL and ω-automata manipulation. In: Proceedings of
the 14th International Symposium on Automated Technology for Verification and
Analysis (ATVA’16), Lecture Notes in Computer Science, vol. 9938, pp. 122–129,
Springer (Oct 2016), https://doi.org/10.1007/978-3-319-46520-3 8
10. Emerson, E.A., Lei, C.L.: Modalities for model checking (extended abstract):
Branching time strikes back. In: Proceedings of the 12th ACM symposium on
Principles of Programming Languages (POPL’85), pp. 84–96, ACM (1985), https:
//doi.org/10.1145/318593.318620
11. Esparza, J., ret´ınsk´y, J., Raskin, J.F., Sickert, S.: From LTL and limit-
deterministic uchi automata to deterministic parity automata. In: Proceedings of
the 23rd International Conference on Tools and Algorithms for the Construction
and Analysis of Systems (TACAS’17), Lecture Notes in Computer Science, vol.
10205, pp. 426–442, Springer-Verlag (2017), https://doi.org/10.1007/978-3-662-
54577-5 25
12. Esparza, J., ret´ınsk´y, J., Sickert, S.: A unified translation of linear temporal logic
to ω-automata. J. ACM 67(6) (Oct 2020), https://doi.org/10.1145/3417995
13. Gastin, P., Oddoux, D.: Fast LTL to Büchi automata translation. In: Berry, G., Comon, H., Finkel, A. (eds.) Proceedings of the 13th International Conference on Computer Aided Verification (CAV'01), Lecture Notes in Computer Science, vol. 2102, pp. 53–65, Springer-Verlag (2001), https://doi.org/10.1007/3-540-44585-4_6
14. Giannakopoulou, D., Lerda, F.: From states to transitions: Improving translation of LTL formulæ to Büchi automata. In: Peled, D., Vardi, M. (eds.) Proceedings of the 22nd IFIP WG 6.1 International Conference on Formal Techniques for Networked and Distributed Systems (FORTE'02), Lecture Notes in Computer Science, vol. 2529, pp. 308–326, Springer-Verlag, Houston, Texas (Nov 2002)
15. Grädel, E., Thomas, W., Wilke, T. (eds.): Automata, Logics, and Infinite Games. Springer, Berlin, Heidelberg (2002), https://doi.org/10.1007/3-540-36387-4
16. Gurevich, Y., Harrington, L.: Trees, automata, and games. In: Proceedings of the 14th Annual ACM Symposium on Theory of Computing (STOC'82), pp. 60–65 (1982), https://doi.org/10.1145/800070.802177
17. Jacobs, S., Bloem, R., Colange, M., Faymonville, P., Finkbeiner, B., Khalimov, A., Klein, F., Luttenberger, M., Meyer, P.J., Michaud, T., Sakr, M., Sickert, S., Tentrup, L., Walker, A.: The 5th reactive synthesis competition (SYNTCOMP 2018): Benchmarks, participants & results. CoRR abs/1904.07736 (2019), URL http://arxiv.org/abs/1904.07736
18. Křetínský, J., Meggendorfer, T., Sickert, S.: Owl: A library for ω-words, automata, and LTL. In: Proceedings of the 16th International Symposium on Automated Technology for Verification and Analysis (ATVA'18), Lecture Notes in Computer Science, vol. 11138, pp. 543–550, Springer (2018), https://doi.org/10.1007/978-3-030-01090-4_34
19. Krishnan, S.C., Puri, A., Brayton, R.K.: Deterministic ω-automata vis-a-vis deterministic Büchi automata. In: Algorithms and Computation, pp. 378–386, Springer Berlin Heidelberg, Berlin, Heidelberg (1994)
20. Křetínský, J., Meggendorfer, T., Waldmann, C., Weininger, M.: Index appearance record for transforming Rabin automata into parity automata. In: Legay, A., Margaria, T. (eds.) Proceedings of the 23rd International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS'17), Lecture Notes in Computer Science, vol. 10205, pp. 443–460 (2017), https://doi.org/10.1007/978-3-662-54577-5_26
21. Křetínský, J., Meggendorfer, T., Waldmann, C., Weininger, M.: Index appearance record with preorders. Acta Informatica (2021), https://doi.org/10.1007/s00236-021-00412-y
22. Löding, C.: Optimal bounds for transformations of ω-automata. In: Proceedings of the 19th Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS'99), Lecture Notes in Computer Science, vol. 1738, pp. 97–109, Springer (1999), https://doi.org/10.1007/3-540-46691-6_8
23. Luttenberger, M., Meyer, P.J., Sickert, S.: Practical synthesis of reactive systems from LTL specifications via parity games. Acta Informatica pp. 3–36 (2020), https://doi.org/10.1007/s00236-019-00349-3
24. Löding, C.: Methods for the Transformation of ω-Automata: Complexity and Connection to Second Order Logic. Master's thesis, Institute of Computer Science and Applied Mathematics, Christian-Albrechts-University of Kiel (1998), URL https://old.automata.rwth-aachen.de/users/loeding/diploma_loeding.pdf
25. Meyer, P., Sickert, S.: On the optimal and practical conversion of Emerson-Lei automata into parity automata. Unpublished manuscript, obsoleted by the work of Casares et al. [6] (2021)
26. Michaud, T., Colange, M.: Reactive synthesis from LTL specification with Spot. In: Proceedings of the 7th Workshop on Synthesis, SYNT@CAV 2018, Electronic Proceedings in Theoretical Computer Science (2018)
27. Pnueli, A., Rosner, R.: On the synthesis of a reactive module. In: Proceedings of the 16th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL'89), pp. 179–190 (1989), https://doi.org/10.1145/75277.75293
28. Renkin, F., Duret-Lutz, A., Pommellet, A.: Practical "paritizing" of Emerson-Lei automata. In: Proceedings of the 18th International Symposium on Automated Technology for Verification and Analysis (ATVA'20), Lecture Notes in Computer Science, vol. 12302, pp. 127–143, Springer (Oct 2020), https://doi.org/10.1007/978-3-030-59152-6_7
29. Vardi, M.Y.: An automata-theoretic approach to linear temporal logic. In: Logics for Concurrency: Structure versus Automata, volume 1043 of Lecture Notes in Computer Science, pp. 238–266, Springer-Verlag (1996)
30. Vardi, M.Y., Wolper, P.: An automata-theoretic approach to automatic program verification. In: Proceedings of the 1st Symposium on Logic in Computer Science (LICS'86), pp. 332–344, IEEE Computer Society Press (Jun 1986)
31. Zielonka, W.: Infinite games on finitely coloured graphs with applications to automata on infinite trees. Theoretical Computer Science 200(1), 135–183 (1998), https://doi.org/10.1016/S0304-3975(98)00009-7
Sky Is Not the Limit
Tighter Rank Bounds for Elevator Automata in Büchi Automata Complementation
Vojtěch Havlena, Ondřej Lengál, and Barbora Šmahlíková
ihavlena@fit.vut.cz, lengal@vut.cz, xsmahl00@vut.cz
Faculty of Information Technology, Brno University of Technology, Brno, Czech Republic
Abstract. We propose several heuristics for mitigating one of the main causes of combinatorial explosion in rank-based complementation of Büchi automata (BAs): unnecessarily high bounds on the ranks of states. First, we identify elevator automata, a large class of BAs (generalizing semi-deterministic BAs) occurring often in practice, where ranks of states are bounded according to the structure of strongly connected components. The bounds for elevator automata also carry over to general BAs that contain elevator automata as a sub-structure. Second, we introduce two techniques for refining bounds on the ranks of BA states using data-flow analysis of the automaton. We implement our techniques as an extension of the tool Ranker for BA complementation and show that they indeed greatly prune the generated state space, obtaining significantly better results and outperforming other state-of-the-art tools on a large set of benchmarks.
1 Introduction
Büchi automata (BA) complementation has been a fundamental problem underlying many applications since it was introduced in 1962 by Büchi [8,17] as an essential part of a decision procedure for a fragment of second-order arithmetic. BA complementation has been used as a crucial part of, e.g., termination analysis of programs [13,20,10] or decision procedures for various logics, such as S1S [8], the first-order logic of Sturmian words [33], or the temporal logics ETL and QPTL [38]. Moreover, BA complementation also underlies BA inclusion and equivalence testing, which are essential instruments in the BA toolbox. Optimal algorithms, whose output asymptotically matches the lower bound of (0.76n)^n [43] (potentially modulo a polynomial factor), have been developed [37,1]. For a successful real-world use, asymptotic optimality is, however, not enough, and these algorithms need to be equipped with a range of optimizations to make them behave better than the worst case on BAs occurring in practice.
In this paper, we focus on the so-called rank-based approach to complementation, introduced by Kupferman and Vardi [24], further improved with the help of Friedgut [14], and finally made optimal by Schewe [37]. The construction stores in a macrostate partial information about all runs of a BA A over some word α. In addition to tracking states that A can be in (which is sufficient, e.g., in the determinization of NFAs), a macrostate also stores a guess of the rank of each of the tracked states in the run DAG that captures all these runs. The guessed ranks impose restrictions on how the future of a state might look (i.e., when A may accept). The number of macrostates in the complement depends combinatorially on the maximum rank that occurs in the macrostates. The constructions in [24,14,37] provide only coarse bounds on the maximum ranks.
A way of decreasing the maximum rank was suggested in [15] using a PSpace (and, therefore, not really practically applicable) algorithm (the problem of finding the optimal rank is PSpace-complete). In our previous paper [19], we identified several basic optimizations of the construction that can be used to refine the tight-rank upper bound (TRUB) on the maximum ranks of states. In this paper, we push the applicability of rank-based techniques much further by introducing two novel lightweight techniques for refining the TRUB, thus significantly reducing the generated state space.
Firstly, we introduce a new class of the so-called elevator automata, which occur quite often in practice (e.g., as outputs of natural algorithms for translating LTL to BAs). Intuitively, an elevator automaton is a BA whose strongly connected components (SCCs) are all either inherently weak¹ or deterministic. Clearly, the class substantially generalizes the popular inherently weak [6] and semi-deterministic BAs [11,3,4]. The structure of elevator automata allows us to provide tighter estimates of the TRUBs, not only for elevator automata per se, but also for BAs where elevator automata occur as a sub-structure (which is even more common). Secondly, we propose a lightweight technique, inspired by data flow analysis, allowing to propagate rank restrictions along the skeleton of the complemented automaton, obtaining even tighter TRUBs. We also extended the optimal rank-based algorithm to transition-based BAs (TBAs).
We implemented our optimizations within the Ranker tool [18] and evaluated our approach on thousands of hard automata from the literature (15 % of them were elevator automata that were not semi-deterministic, and many more contained an elevator sub-structure). Our techniques drastically reduce the generated state space; in many cases we even achieved an exponential improvement compared to the optimal procedure of Schewe and our previous heuristics. The new version of Ranker gives a smaller complement in the majority of cases of hard automata than other state-of-the-art tools.
2 Preliminaries
Words, functions. We fix a finite nonempty alphabet Σ and the first infinite ordinal ω = {0, 1, . . .}. For n ∈ ω, by [n] we denote the set {0, . . . , n}. For i ∈ ω we use ⌊⌊i⌋⌋ to denote the largest even number smaller than or equal to i, e.g., ⌊⌊42⌋⌋ = ⌊⌊43⌋⌋ = 42. An (infinite) word α is represented as a function α: ω → Σ where the i-th symbol is denoted as α_i. We abuse notation and sometimes also represent α as an infinite sequence α = α_0 α_1 . . . We use Σ^ω to denote the set of all infinite words over Σ. For a (partial) function f: X → Y and a set S ⊆ X, we define f(S) = {f(x) | x ∈ S}. Moreover, for x ∈ X and y ∈ Y, we use f ◁ {x ↦ y} to denote the function (f \ {x ↦ f(x)}) ∪ {x ↦ y}.
Büchi automata. A (nondeterministic transition/state-based) Büchi automaton (BA) over Σ is a quadruple A = (Q, δ, I, Q_F ∪ δ_F) where Q is a finite set of states, δ: Q × Σ → 2^Q is a transition function, I ⊆ Q is the set of initial states, and Q_F ⊆ Q and δ_F ⊆ δ are the sets of accepting states and accepting transitions, respectively. We sometimes treat δ as a set of transitions p →^a q; for instance, we use p →^a q ∈ δ to denote that q ∈ δ(p, a).
¹ An SCC is inherently weak if it either contains no accepting states or, on the other hand, all cycles of the SCC contain an accepting state.
Moreover, we extend δ to sets of states P ⊆ Q as δ(P, a) = ⋃_{p∈P} δ(p, a), and to sets of symbols Γ ⊆ Σ as δ(P, Γ) = ⋃_{a∈Γ} δ(P, a). We define the inverse transition function as δ^{-1} = {p →^a q | q →^a p ∈ δ}. The notation δ|_S for S ⊆ Q is used to denote the restriction of the transition function δ ∩ (S × Σ × S). Moreover, for q ∈ Q, we use A[q] to denote the BA (Q, δ, {q}, Q_F ∪ δ_F).
A run of A from q ∈ Q on an input word α is an infinite sequence ρ: ω → Q that starts in q and respects δ, i.e., ρ_0 = q and ∀i ≥ 0: ρ_i →^{α_i} ρ_{i+1} ∈ δ. Let inf_Q(ρ) denote the states occurring in ρ infinitely often and inf_δ(ρ) denote the transitions occurring in ρ infinitely often. The run ρ is called accepting iff inf_Q(ρ) ∩ Q_F ≠ ∅ or inf_δ(ρ) ∩ δ_F ≠ ∅.
A word α is accepted by A from a state q ∈ Q if there is an accepting run ρ of A from q, i.e., ρ_0 = q. The set L_A(q) = {α ∈ Σ^ω | A accepts α from q} is called the language of q (in A). Given a set of states R ⊆ Q, we define the language of R as L_A(R) = ⋃_{q∈R} L_A(q) and the language of A as L(A) = L_A(I). We say that a state q ∈ Q is useless iff L_A(q) = ∅. If δ_F = ∅, we call A state-based and if Q_F = ∅, we call A transition-based. In this paper, we fix a BA A = (Q, δ, I, Q_F ∪ δ_F).
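To make the definitions concrete (and to give the later sketches in this paper something to build on), the following minimal Python sketch models a transition/state-based BA as the quadruple above. All names (BA, succ) are our own illustration; this is not code from Ranker.

from dataclasses import dataclass

@dataclass
class BA:
    """A sketch of a transition/state-based BA (Q, delta, I, QF ∪ deltaF)."""
    states: set       # Q
    delta: dict       # (q, a) -> set of successor states
    initial: set      # I
    acc_states: set   # QF
    acc_trans: set    # deltaF, stored as triples (q, a, r)

    def succ(self, P, a):
        """delta extended to a set of states: delta(P, a)."""
        return set().union(*(self.delta.get((p, a), set()) for p in P))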
3 Complementing Büchi automata
In this section, we describe a generalization of the rank-based complementation of state-
based BAs presented by Schewe in [37] to our notion of transition/state-based BAs.
Proofs can be found in [16].
3.1 Run DAGs
First, we recall the terminology from [37] (which is a minor modification of the one in [24]), which we use in the paper. Let the run DAG of A over a word α be a DAG (directed acyclic graph) G_α = (V, E) containing vertices V and edges E such that
– V ⊆ Q × ω s.t. (q, i) ∈ V iff there is a run ρ of A from I over α with ρ_i = q,
– E ⊆ V × V s.t. ((q, i), (q′, i′)) ∈ E iff i′ = i + 1 and q′ ∈ δ(q, α_i).
Given G_α as above, we will write (p, i) ∈ G_α to denote that (p, i) ∈ V. A vertex (p, i) ∈ V is called accepting if p is an accepting state, and an edge ((q, i), (q′, i′)) ∈ E is called accepting if q →^{α_i} q′ is an accepting transition. A vertex v ∈ G_α is finite if the set of vertices reachable from v is finite, infinite if it is not finite, and endangered if it cannot reach an accepting vertex or an accepting edge.
We assign ranks to vertices of run DAGs as follows: Let G^0_α = G_α and j = 0. Repeat the following steps until the fixpoint or for at most 2n + 1 steps, where n = |Q|:
– Set rank_α(v) ← j for all finite vertices v of G^j_α and let G^{j+1}_α be G^j_α minus the vertices with rank j.
– Set rank_α(v) ← j + 1 for all endangered vertices v of G^{j+1}_α and let G^{j+2}_α be G^{j+1}_α minus the vertices with rank j + 1.
– Set j ← j + 2.
For all vertices v that have not been assigned a rank yet, we assign rank_α(v) ← ω. We define the rank of α, denoted as rank(α), as max{rank_α(v) | v ∈ G_α} and the rank of A, denoted as rank(A), as max{rank(w) | w ∈ Σ^ω \ L(A)}.
Lemma 1. If α ∉ L(A), then rank(α) ≤ 2|Q|.
3.2 Rank-Based Complementation
In this section, we describe a construction for complementing BAs developed in the work of Kupferman and Vardi [24], later improved by Friedgut, Kupferman, and Vardi [14] and by Schewe [37], and extended here to our definition of BAs with accepting states and transitions (see [19] for a step-by-step introduction). The construction is based on the notion of tight level rankings, storing information about levels in run DAGs. For a BA A and n = |Q|, a (level) ranking is a function f: Q → [2n] such that f(Q_F) ⊆ {0, 2, . . . , 2n}, i.e., f assigns even ranks to accepting states of A. For two rankings f and f′ we define f →^a_S f′ iff for each q ∈ S and q′ ∈ δ(q, a) we have f′(q′) ≤ f(q), and for each q′′ ∈ δ_F(q, a) it holds that f′(q′′) ≤ ⌊⌊f(q)⌋⌋. The set of all rankings is denoted by R. For a ranking f, the rank of f is defined as rank(f) = max{f(q) | q ∈ Q}. We write f ≤ f′ iff for every state q ∈ Q we have f(q) ≤ f′(q), and f < f′ iff f ≤ f′ and there is a state q ∈ Q with f(q) < f′(q). For a set of states S ⊆ Q, we call f S-tight if (i) it has an odd rank r, (ii) f(S) ⊇ {1, 3, . . . , r}, and (iii) f(Q \ S) = {0}. A ranking is tight if it is Q-tight; we use T to denote the set of all tight rankings.
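The following Python helpers, a sketch under our own naming, implement ⌊⌊·⌋⌋, rank(f), and the S-tightness test exactly as defined above (rankings are represented as dicts over Q):

def even_floor(i):
    """⌊⌊i⌋⌋: the largest even number smaller than or equal to i."""
    return i if i % 2 == 0 else i - 1

def rank_of(f):
    """rank(f) = max{f(q) | q ∈ Q} for a ranking f given as a dict."""
    return max(f.values())

def is_tight(f, S):
    """f is S-tight iff (i) rank(f) is odd, (ii) f(S) ⊇ {1, 3, ..., rank(f)},
    and (iii) f assigns 0 to every state outside S."""
    r = rank_of(f)
    on_S = {f[q] for q in S}
    off_S = {f[q] for q in f if q not in S}
    return r % 2 == 1 and set(range(1, r + 1, 2)) <= on_S and off_S <= {0}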
The original rank-based construction [24] uses macrostates of the form (S, O, f) to track all runs of A over α. The f-component contains guesses of the ranks of states in S (which is obtained by the classical subset construction) in the run DAG, and the O-set is used to check whether all runs contain only a finite number of accepting states. Friedgut, Kupferman, and Vardi [14] improved the construction by having f consider only tight rankings. Schewe's construction [37] extends the macrostates to (S, O, f, i) with i ∈ ω representing a particular even rank such that O tracks states with rank i. At the cut-point (a macrostate with O = ∅) the value of i is changed to i + 2 modulo the rank of f. Macrostates in an accepting run hence iterate over all possible values of i.
Formally, the complement of A = (Q, δ, I, Q_F ∪ δ_F) is given as the (state-based) BA Schewe(A) = (Q′, δ′, I′, Q′_F ∪ ∅), whose components are defined as follows:
– Q′ = Q_1 ∪ Q_2 where
  • Q_1 = 2^Q and
  • Q_2 = {(S, O, f, i) ∈ 2^Q × 2^Q × T × {0, 2, . . . , 2n − 2} | f is S-tight, O ⊆ S ∩ f^{-1}(i)},
– I′ = {I},
– δ′ = δ_1 ∪ δ_2 ∪ δ_3 where
  • δ_1: Q_1 × Σ → 2^{Q_1} such that δ_1(S, a) = {δ(S, a)},
  • δ_2: Q_1 × Σ → 2^{Q_2} such that δ_2(S, a) = {(S′, ∅, f, 0) | S′ = δ(S, a), f is S′-tight}, and
  • δ_3: Q_2 × Σ → 2^{Q_2} such that (S′, O′, f′, i′) ∈ δ_3((S, O, f, i), a) iff
    ∗ S′ = δ(S, a),
    ∗ f →^a_S f′,
    ∗ rank(f) = rank(f′), and
    ∗ if O = ∅ then i′ = (i + 2) mod (rank(f′) + 1) and O′ = f′^{-1}(i′), and
    ∗ if O ≠ ∅ then i′ = i and O′ = δ(O, a) ∩ f′^{-1}(i′); and
– Q′_F = {∅} ∪ ((2^Q × {∅} × T × ω) ∩ Q_2).
We call the part of the automaton with states from Q_1 the waiting part (denoted as Waiting), and the part corresponding to Q_2 the tight part (denoted as Tight).
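The distinctive part of δ_3 is the update of the O-set and of the even rank i at cut-points. The following sketch (our own illustration, building on the BA and rank_of sketches above; the enumeration of all admissible successor rankings f′ is deliberately omitted) shows this update for one fixed f′:

def delta3_update(ba, S, O, f, i, a, f_prime):
    """One delta_3 successor (S', O', f', i') of the macrostate (S, O, f, i)
    over letter a, assuming f ->a_S f_prime with equal rank."""
    S_prime = ba.succ(S, a)
    if not O:                                   # cut-point reached
        i_prime = (i + 2) % (rank_of(f_prime) + 1)
        O_prime = {q for q in S_prime if f_prime[q] == i_prime}
    else:
        i_prime = i
        O_prime = ba.succ(O, a) & {q for q in S_prime if f_prime[q] == i_prime}
    return S_prime, O_prime, f_prime, i_prime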
Theorem 2. Let A be a BA. Then L(Schewe(A)) = Σ^ω \ L(A).
The space complexity of Schewe's construction for BAs matches the theoretical lower bound O((0.76n)^n) given by Yan [43] modulo a quadratic factor O(n^2). Note that our extension to BAs with accepting transitions does not increase the space complexity of the construction.
[Fig. 1: Schewe's complementation: (a) a BA A over {a, b}; (b) a part of Schewe(A).]
Example 3. Consider the BA A over {a, b} given in Fig. 1a. A part of Schewe(A) is shown in Fig. 1b (we use ({s:0, t:1}, ∅) to denote the macrostate ({s, t}, ∅, {s ↦ 0, t ↦ 1}, 0)). We omit the i-part of each macrostate since the corresponding values are 0 for all macrostates in the figure. Useless states are covered by grey stripes. The full automaton contains even more transitions from {r} to useless macrostates of the form ({r:·, s:·, t:·}, ∅). ⊓⊔
From the construction of Schewe(A), we can see that the number of states is affected mainly by the sizes of macrostates and by the maximum rank of A. In particular, the upper bound on the number of states of the complement with the maximum rank r is given in the following lemma.
Lemma 4. For a BA A with sufficiently many states n such that rank(A) = r, the number of states of the complemented automaton is bounded by 2^n + (r + m)^n / (r + m)! where m = max{0, 3 − ⌈r/2⌉}.
From Lemma 1 we have that the rank of A is bounded by 2|Q|. Such a bound is often too coarse, and hence Schewe(A) may contain many redundant states. Decreasing the bound on the ranks is essential for a practical algorithm, but finding an optimal solution is PSpace-complete [15]. The rest of this paper therefore proposes a framework of lightweight techniques for decreasing the maximum rank bound and, in this way, significantly reducing the size of the complemented BA.
3.3 Tight Rank Upper Bounds
Let α ∉ L(A). For ℓ ∈ ω, we define the ℓ-th level of G_α as level_α(ℓ) = {q | (q, ℓ) ∈ G_α}. Furthermore, we use f^α_ℓ to denote the ranking of level ℓ of G_α. Formally,

    f^α_ℓ(q) = rank_α((q, ℓ)) if q ∈ level_α(ℓ), and f^α_ℓ(q) = 0 otherwise.    (1)

We say that the ℓ-th level of G_α is tight if for all k ≥ ℓ it holds that (i) f^α_k is tight, and (ii) rank(f^α_k) = rank(f^α_ℓ). Let ρ = S_0 S_1 . . . S_{ℓ−1} (S_ℓ, O_ℓ, f_ℓ, i_ℓ) . . . be a run on a word α in Schewe(A). We say that ρ is a super-tight run [19] if f_k = f^α_k for each k ≥ ℓ. Finally, we say that a mapping μ: 2^Q → R is a tight rank upper bound (TRUB) wrt α iff

    ∀ℓ ∈ ω: level_α(ℓ) is tight ⇒ (∀k ≥ ℓ: μ(level_α(k)) ≥ f^α_k).    (2)

Informally, a TRUB is a ranking that gives a conservative (i.e., larger) estimate on the necessary ranks of states in a super-tight run. We say that μ is a TRUB iff μ is a TRUB wrt all α ∉ L(A). We abuse notation and use the term TRUB also for a mapping μ′: 2^Q → ω if the mapping inner(μ′) is a TRUB, where inner(μ′)(S) = {q ↦ m | m = μ′(S) −̇ 1 if q ∈ Q_F, else m = μ′(S)} for all S ∈ 2^Q. (−̇ is the monus operator, i.e., minus with negative results saturated to zero.) Note that the mappings μ_t = {S ↦ (2|S \ Q_F| −̇ 1)}_{S ∈ 2^Q} and inner(μ_t) are trivial TRUBs.
The following lemma shows that we can remove from Schewe(A) macrostates whose ranking is not covered by a TRUB (in particular, we show that the reduced automaton preserves super-tight runs).
Lemma 5. Let μ be a TRUB and B be the BA obtained from Schewe(A) by replacing all occurrences of Q_2 by Q′_2 = {(S, O, f, i) | f ≤ μ(S)}. Then, L(B) = Σ^ω \ L(A).
4 Elevator Automata
In this section, we introduce elevator automata, which are BAs having a particular structure that can be exploited for complementation and semi-determinization; elevator automata can be complemented in O(16^n) space (cf. Lemma 10) instead of 2^{O(n log n)}, which is the lower bound for unrestricted BAs, and semi-determinized in O(2^n) instead of O(4^n) (cf. [16]). The class of elevator automata is quite general: it can be seen as a substantial generalization of semi-deterministic BAs (SDBAs) [11,5]. Intuitively, an elevator automaton is a BA whose strongly connected components are all either deterministic or inherently weak.
Let A = (Q, δ, I, Q_F ∪ δ_F). A set C ⊆ Q is a strongly connected component (SCC) of A if for any pair of states q, q′ ∈ C it holds that q is reachable from q′ and q′ is reachable from q. C is maximal (MSCC) if it is not a proper subset of another SCC. An MSCC C is trivial iff |C| = 1 and δ|_C = ∅. The condensation of A is the DAG cond(A) = (M, E) where M is the set of A's MSCCs and E = {(C_1, C_2) | ∃q_1 ∈ C_1, ∃q_2 ∈ C_2, ∃a ∈ Σ: q_1 →^a q_2 ∈ δ}. An MSCC is non-accepting if it contains no accepting state and no accepting transition, i.e., C ∩ Q_F = ∅ and δ|_C ∩ δ_F = ∅. The depth of (M, E) is defined as the number of MSCCs on the longest path in (M, E).
We say that an SCC C is inherently weak accepting (IWA) iff every cycle in the transition diagram of A restricted to C contains an accepting state or an accepting transition. C is inherently weak if it is either non-accepting or IWA, and A is inherently weak if all of its MSCCs are inherently weak. A is deterministic iff |I| ≤ 1 and |δ(q, a)| ≤ 1 for all q ∈ Q and a ∈ Σ. An SCC C ⊆ Q is deterministic iff (C, δ|_C, ∅, ∅) is deterministic. A is a semi-deterministic BA (SDBA) if A[q] is deterministic for every q ∈ Q_F ∪ {p ∈ Q | s →^a p ∈ δ_F, s ∈ Q, a ∈ Σ}, i.e., whenever a run in A reaches an accepting state or an accepting transition, it can only continue deterministically.
[Fig. 2: The BA for the LTL formula GF(a ∧ GF(b ∧ GFc)) is elevator; it consists of three connected deterministic components.]
A is an elevator (Büchi) automaton iff for every MSCC C of A it holds that C is (i) deterministic, (ii) IWA, or (iii) non-accepting. In other words, a BA is an elevator automaton iff every nondeterministic SCC of A that contains an accepting state or transition is inherently weak. An example of an elevator automaton obtained from the LTL formula GF(a ∧ GF(b ∧ GFc)) is shown in Fig. 2. The BA consists of three connected deterministic components. Note that the automaton is neither semi-deterministic nor unambiguous.
The rank of an elevator automaton A does not depend on the number of states (as in general BAs), but only on the number of MSCCs and the depth of cond(A). In the worst case, A consists of a chain of deterministic components, yielding the upper bound on the rank of elevator automata given in the following lemma.
Lemma 6. Let A be an elevator automaton such that its condensation has the depth d. Then rank(A) ≤ 2d.
4.1 Refined Ranks for Elevator Automata
Notice that the upper bound on ranks provided by Lemma 6 can still be too coarse. For instance, for an SDBA with three linearly ordered MSCCs such that the first two MSCCs are non-accepting and the last one is deterministic accepting, the lemma gives us an upper bound on the rank of 6, while it is known that every SDBA has rank at most 3 (cf. [5]). Another example might be two deterministic non-trivial MSCCs connected by a path of trivial MSCCs, which can be assigned the same rank.
Instead of refining the definition of elevator automata into some quite complex list of constraints, we rather provide an algorithm that performs a traversal through cond(A) and assigns each MSCC a label of the form type:rank that contains (i) a type and (ii) a bound on the maximum rank of states in the component. The types of MSCCs that we consider are the following:
– T: trivial components,
– IWA: inherently weak accepting components,
– D: deterministic (potentially accepting) components, and
– N: non-accepting components.
Note that the type of an MSCC is not given a priori but is determined by the algorithm (this is because for deterministic non-accepting components, it is sometimes better to treat them as D and sometimes as N, depending on their neighbourhood).
In the following, we assume that A is an elevator automaton without useless states and, moreover, all accepting conditions on states and transitions not inside non-trivial MSCCs are removed (any BA can be easily transformed into this form).
We start with terminal MSCCs C, i.e., MSCCs that cannot reach any other MSCC:
T1: If C is IWA, then we label it with IWA:0.
T2: Else if C is deterministic accepting, we label it with D:2.
[Fig. 3: Rules for assigning types and rank bounds to MSCCs, based on the maximum ranks D, N, and W of the already processed children of types D, N, and IWA, respectively:
(a) C is IWA: ℓ = max{D, N + 1, W};
(b) C is D: ℓ = max{D + ⟨2⟩, N + 1, W + ⟨2⟩, 2};
(c) C is N: ℓ = max{D + 1, N, W + 1}.
The symbols ⟨2⟩ are interpreted as 0 if all the corresponding edges from the components having rank D and W, respectively, are deterministic; otherwise they are interpreted as 2. Transitions between two components C_1 and C_2 are deterministic if the BA (C, δ|_C, ∅, ∅) is deterministic for C = δ(C_1, Σ) ∩ (C_1 ∪ C_2).]
[Fig. 4: Structure of the elevator ranking rules: a component C with aggregated children D:D, N:N, and IWA:W.]
(Note that the previous two options are complete due to our requirements on the structure of A.) When all terminal MSCCs are labelled, we proceed through cond(A), inductively on its structure, and label non-terminal components C based on the rules defined below.
The rules are of the form that uses the structure depicted in Fig. 4, where children nodes denote already processed MSCCs. In particular, a child node of the form k:ℓ_k denotes an aggregate node of all siblings of the type k with ℓ_k being the maximum rank of these siblings. Moreover, we use typemax{e_D, e_N, e_W} to denote the type j ∈ {D, N, IWA} for which e_j = max{e_D, e_N, e_W}, where e_i is an expression containing ℓ_i (if there are more such types, j is chosen arbitrarily). The rules for assigning a type t and a rank ℓ to C are the following:
I1: If C is trivial, we set t = typemax{D, N, W} and ℓ = max{D, N, W}.
I2: Else if C is IWA, we use the rule in Fig. 3a.
I3: Else if C is deterministic accepting, we use the rule in Fig. 3b.
I4: Else if C is deterministic and non-accepting, we try both rules from Figs. 3b and 3c and pick the rule that gives us the smaller rank.
I5: Else if C is nondeterministic and non-accepting, we use the rule in Fig. 3c.
[Fig. 5: A part of Schewe(A). The TRUB computed by the elevator rules is used to prune states outside the yellow area.]
Then, for every MSCC C of A, we assign each of its states the rank of C. We use χ: Q → ω to denote the rank bounds computed by the procedure above (a sketch of the traversal is given below).
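For concreteness, the following Python sketch implements rules T1–T2 and I1–I5 under our own encoding of the condensation: msccs is a list of MSCC identifiers, edges the condensation edges, and classify(C) returns one of 'trivial', 'iwa', 'det-acc', 'det-nonacc', or 'nondet-nonacc'. For simplicity, the sketch always interprets the ⟨2⟩ symbols of Fig. 3 as 2, which is sound but may overestimate the rank; it is an illustration, not the Ranker implementation.

import collections

def elevator_trub(msccs, edges, classify):
    """Label each MSCC of cond(A) with (type, rank), traversing from terminal
    MSCCs backwards; every state of an MSCC C then gets the rank of C."""
    succs = collections.defaultdict(set)
    for c1, c2 in edges:
        succs[c1].add(c2)
    label = {}

    def agg(cs, ty):
        # maximum rank among the already processed children of type ty (0 if none)
        return max([label[c][1] for c in cs if label[c][0] == ty] or [0])

    def process(c):
        if c in label:
            return
        for child in succs[c]:
            process(child)
        kind = classify(c)
        if not succs[c]:
            # terminal MSCC: T1/T2 (complete under the paper's assumptions)
            label[c] = ('IWA', 0) if kind == 'iwa' else ('D', 2)
            return
        D = agg(succs[c], 'D'); N = agg(succs[c], 'N'); W = agg(succs[c], 'IWA')
        if kind == 'trivial':                                # I1: typemax
            label[c] = max([('D', D), ('N', N), ('IWA', W)], key=lambda p: p[1])
        elif kind == 'iwa':                                  # I2 (Fig. 3a)
            label[c] = ('IWA', max(D, N + 1, W))
        else:
            det = ('D', max(D + 2, N + 1, W + 2, 2))         # Fig. 3b, <2> = 2
            non = ('N', max(D + 1, N, W + 1))                # Fig. 3c
            if kind == 'det-acc':                            # I3
                label[c] = det
            elif kind == 'det-nonacc':                       # I4: smaller rank
                label[c] = min(det, non, key=lambda p: p[1])
            else:                                            # I5
                label[c] = non

    for c in msccs:
        process(c)
    return label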
Lemma 7. χ is a TRUB.
Using Lemma 5, we can now use χ to prune states during the construction of Schewe(A), as shown in the following example.
Example 8. As an example, consider the BA A in Fig. 1a. The set of MSCCs with their types is given as {{r}: N, {s, t}: IWA}, showing that the BA A is an elevator automaton. Using the rules T1 and I4 we get the TRUB χ = {r:1, s:0, t:0}. The TRUB can be used to prune the generated states as shown in Fig. 5. ⊓⊔
4.2 Efficient Complementation of Elevator Automata
In Section 4.1 we proposed an algorithm for assigning ranks to MSCCs of an elevator automaton A. The drawback of the algorithm is that the maximum obtained rank is not bounded by a constant but by the depth of the condensation of A. We will, however, show that it is actually possible to change A by at most doubling the number of states and obtain an elevator BA with rank at most 3.
Intuitively, the construction copies every non-trivial MSCC C with an accepting state or transition into a component Ĉ, copies all transitions going into states in C to also go into the corresponding states in Ĉ, and, finally, removes all accepting conditions from C. Formally, let A = (Q, δ, I, Q_F ∪ δ_F) be a BA. For C ⊆ Q, we use Ĉ to denote a unique copy of C, i.e., Ĉ = {q̂ | q ∈ C} s.t. Ĉ ∩ Q = ∅. Let M be the set of MSCCs of A. Then, the deelevated BA DeElev(A) = (Q′, δ′, I′, Q′_F ∪ δ′_F) is given as follows:
– Q′ = Q ∪ Q̂,
– δ′: Q′ × Σ → 2^{Q′} where for q ∈ Q
  • δ′(q, a) = δ(q, a) ∪ {r̂ | r ∈ δ(q, a)} and
  • δ′(q̂, a) = {r̂ | r ∈ δ(q, a) ∩ C} for q ∈ C ∈ M;
– I′ = I, and
– Q′_F = Q̂_F and δ′_F = {q̂ →^a r̂ | q →^a r ∈ δ_F} ∩ δ′.
It is easy to see that the number of states of the deelevated automaton is bounded by 2|Q|. Moreover, if A is an elevator automaton, so is DeElev(A). The construction preserves the language of A, as shown by the following lemma.
Lemma 9. Let A be a BA. Then, L(A) = L(DeElev(A)).
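Under the assumptions of the BA sketch from Section 2, the deelevation translates into the following Python sketch, which follows the formal definition and relies on a later trimming of useless states (mscc_of is assumed to map each state to an identifier of its MSCC; this is an illustration, not Ranker code):

def deelevate(ba, mscc_of):
    """A sketch of DeElev(A): acceptance survives only inside the copies."""
    hat = lambda q: ('hat', q)                   # q^, the unique copy of q
    states = set(ba.states) | {hat(q) for q in ba.states}
    delta, acc_trans = {}, set()
    for (q, a), succs in ba.delta.items():
        # original part: keep the transition and branch into the copies as well
        delta[(q, a)] = set(succs) | {hat(r) for r in succs}
        # copied part: stay inside the copy of q's own MSCC
        inside = {r for r in succs if mscc_of[r] == mscc_of[q]}
        delta[(hat(q), a)] = {hat(r) for r in inside}
        # accepting transitions are kept only between copied states
        acc_trans |= {(hat(q), a, hat(r)) for r in inside
                      if (q, a, r) in ba.acc_trans}
    acc_states = {hat(q) for q in ba.acc_states}
    return BA(states, delta, set(ba.initial), acc_states, acc_trans)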
Moreover, for an elevator automaton A, the structure of DeElev(A) consists of (after trimming useless states) several non-accepting MSCCs with copied terminal deterministic or IWA MSCCs. Therefore, if we apply the algorithm from Section 4.1 to DeElev(A), we get that its rank is bounded by 3, which gives the following upper bound for the complementation of elevator automata.
Lemma 10. Let A be an elevator automaton with sufficiently many states n. Then the language Σ^ω \ L(A) can be represented by a BA with at most O(16^n) states.
The complementation through DeElev(A) gives a better upper bound than the rank refinement from Section 4.1 applied directly to A; however, based on our experience, complementation through DeElev(A) behaves worse on many real-world instances. This poor behaviour is caused by the fact that the complement of DeElev(A) can have a larger Waiting part, and macrostates in Tight can have larger S-components, which can yield more generated states (despite the rank bound 3). It seems that the most promising approach would be a combination of the two approaches, which we leave for future work.
[Fig. 6: Rules assigning types and rank bounds for non-elevator automata, extending Fig. 3 with the maximum rank G of the children of type G:
(a) C is IWA: ℓ = max{D, N + 1, W, G};
(b) C is D: ℓ = max{D + ⟨2⟩, N + 1, W + ⟨2⟩, G + 2, 2};
(c) C is N: ℓ = max{D + 1, N, W + 1, G + 1}.]
4.3 Refined Ranks for Non-Elevator Automata
The algorithm from Section 4.1 computing a TRUB for elevator automata can be extended to compute TRUBs even for general non-elevator automata (i.e., BAs with nondeterministic accepting components that are not inherently weak). To achieve this generalization, we extend the rules for assigning types and ranks to MSCCs of elevator automata from Section 4.1 to take into account general nondeterministic components. For this, we add into our collection of MSCC types general components (denoted as G). Further, we need to extend the rules for terminal components with the following rule:
T3: Otherwise, we label C with G:2|C \ Q_F|.
[Fig. 7: The rule for C of type G: ℓ = max{D, N + 1, W, G} + 2|C \ Q_F|.]
Moreover, we adjust the rules for assigning a type t and a rank ℓ to C as follows (rule I1 is the same as for the case of elevator automata):
I2–I5: We replace the corresponding rules by their counterparts including general components from Fig. 6.
I6: Otherwise, we use the rule in Fig. 7.
Then, for every MSCC C of a BA A, we assign each of its states the rank of C. Again, we use χ: Q → ω to denote the rank bounds computed by the adjusted procedure above.
Lemma 11. χ is a TRUB.
5 Rank Propagation
[Fig. 8: Rank propagation flow: the new estimate μ′(S) is computed from the estimates μ(R_1), . . . , μ(R_m) of the predecessors of S.]
In the previous section, we proposed a way to obtain a TRUB for elevator automata (with a generalization to general automata). In this section, we propose a way of using the structure of A to refine a TRUB by propagating values, and thus reduce the size of Tight. Our approach uses data flow analysis [32] to reason about how ranks and rankings of macrostates of Schewe(A) can be decreased based on the ranks and rankings of the local neighbourhood of the macrostates. In particular, we use a special case of forward analysis working on the skeleton of Schewe(A), which is defined as the BA K_A = (2^Q, δ′, ∅, ∅) where δ′ = {R →^a S | S = δ(R, a)} (note that we are only interested in the structure of K_A and not in its language; also notice the similarity of K_A with Waiting). Our analysis refines a rank/ranking estimate μ(S) for a macrostate S of K_A based on the estimates for its predecessors R_1, . . . , R_m (see Fig. 8). The new estimate is denoted as μ′(S).
More precisely, μ: 2^Q → V is a function giving each macrostate of K_A a value from the domain V. We will use the following two value domains: (i) V = ω, which is used for estimating ranks of macrostates (in the outer macrostate analysis), and (ii) V = R, which is used for estimating rankings within macrostates (in the inner macrostate analysis). For each of the analyses, we will give an update function up: (2^Q → V) × (2^Q)^{m+1} → V, which defines how the value of μ(S) is updated based on the values of μ(R_1), . . . , μ(R_m). We then construct a system with the following equation for every S ∈ 2^Q:

    μ(S) = up(μ, S, R_1, . . . , R_m) where {R_1, . . . , R_m} = δ′^{-1}(S, Σ).    (3)

We then solve the system of equations using standard algorithms for data flow analysis (see, e.g., [32, Chapter 2]) to obtain the fixpoint μ*. Our analyses have the important property that if they start with μ_0 being a TRUB, then μ* will also be a TRUB.
As the initial TRUB, we can use the trivial TRUB or any other TRUB (e.g., the output of the elevator state analysis from Section 4).
5.1 Outer Macrostate Analysis
We start with the simpler analysis, the outer macrostate analysis, which only looks at sizes of macrostates. Recall that the rank r of every super-tight run in Schewe(A) does not change, i.e., a super-tight run stays in Waiting as long as needed so that when it jumps to Tight, it takes the rank r and never needs to decrease it. We can use this fact to decrease the maximum rank of a macrostate S in K_A. In particular, let us consider all cycles going through S. For each of the cycles c, we can bound the maximum rank of a super-tight run going through c by 2m − 1 where m is the smallest number of non-accepting states occurring in any macrostate on c (from the definition, the rank of a tight ranking does not depend on accepting states). Then we can infer that the maximum rank of any super-tight run going through S is bounded by the maximum rank of any of the cycles going through S (since S can never assume a higher rank in any super-tight run). Moreover, the rank of each cycle can also be estimated in a more precise way, e.g., using our elevator analysis.
Since the number of cycles in K_A can be large², instead of enumerating them, we employ data flow analysis with the value domain V = ω (i.e., for every macrostate S of K_A, we remember a bound on the maximum rank of S) and the following update function:

    up_out(μ, S, R_1, . . . , R_m) = min{μ(S), max{μ(R_1), . . . , μ(R_m)}}.    (4)

Intuitively, the new bound on the maximum rank of S is taken as the smaller of the previous bound μ(S) and the largest of the bounds of all predecessors of S, and the new value is propagated forward by the data flow analysis.
² K_A can be exponentially larger than A and the number of cycles in K_A can be exponential in the size of K_A, so the total number of cycles can be double-exponential.
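A worklist-based sketch of the analysis (our own encoding: skeleton_preds maps each macrostate of K_A to its set of predecessors, mu0 gives the initial TRUB) may look as follows; termination is guaranteed since the bounds only decrease and are bounded from below by 0:

def outer_analysis(skeleton_preds, mu0):
    """Solve the equation system (3) with up_out (Eq. (4)) by chaotic iteration."""
    mu = dict(mu0)
    succs = {}
    for S, preds in skeleton_preds.items():
        for R in preds:
            succs.setdefault(R, set()).add(S)
    worklist = set(skeleton_preds)
    while worklist:
        S = worklist.pop()
        preds = skeleton_preds.get(S, set())
        if not preds:
            continue                                  # no predecessors: keep mu(S)
        new = min(mu[S], max(mu[R] for R in preds))   # up_out from Eq. (4)
        if new < mu[S]:
            mu[S] = new
            worklist |= succs.get(S, set())           # re-process successors
    return mu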
[Fig. 9: Example of outer macrostate analysis: (a) the BA A_ex (accepting transitions are marked); the initial TRUB μ_0 in (b) is refined to μ*_out in (c).]
Example 12. Consider the BA A_ex in Fig. 9a. When started from the initial TRUB μ_0 = {{p} ↦ 1, {p, q} ↦ 3, {p, q, r, s} ↦ 7} (Fig. 9b), the outer macrostate analysis decreases the maximum rank estimate for {p, q} to 1, since min{μ_0({p, q}), max{μ_0({p})}} = min{3, 1} = 1. The estimate for {p, q, r, s} is not affected, because min{7, max{1, 7}} = 7 (Fig. 9c). ⊓⊔
Lemma 13. If μ is a TRUB, then μ ◁ {S ↦ up_out(μ, S, R_1, . . . , R_m)} is a TRUB.
Corollary 14. When started with a TRUB μ_0, the outer macrostate analysis terminates and returns a TRUB μ*_out.
5.2 Inner Macrostate Analysis
Our second analysis, called inner macrostate analysis, looks deeper into super-tight runs in Schewe(A). In particular, compared with the outer macrostate analysis from the previous section (which only looks at the ranks, i.e., the bounds on the numbers in the rankings), inner macrostate analysis looks at how the rankings assign concrete values to the states of A inside the macrostates.
Inner macrostate analysis is based on the following. Let ρ be a super-tight run of Schewe(A) on α ∉ L(A) and (S, O, f, i) be a macrostate from Tight. Because ρ is super-tight, we know that the rank f(q) of a state q ∈ S is bounded by the ranks of the predecessors of q. This holds because in super-tight runs, the ranks are only as high as necessary; if the rank of q were higher than the ranks of its predecessors, this would mean that we may wait in Waiting longer and only jump to q with a lower rank later.
Let us introduce some necessary notation. Let f, f′ ∈ R be rankings (i.e., f, f′: Q → ω). We use f ⊔ f′ to denote the ranking {q ↦ max{f(q), f′(q)} | q ∈ Q}, and f ⊓ f′ to denote the ranking {q ↦ min{f(q), f′(q)} | q ∈ Q}. Moreover, we define max-succ-rank^a_S(f) = max{f′ ∈ R | f →^a_S f′} and a function dec: R → R such that dec(θ) is the ranking θ′ for which

    θ′(q) = θ(q) −̇ 1          if θ(q) = rank(θ) and q ∉ Q_F,
    θ′(q) = ⌊⌊θ(q) −̇ 1⌋⌋     if θ(q) = rank(θ) and q ∈ Q_F,
    θ′(q) = θ(q)               otherwise.                        (5)

Intuitively, max-succ-rank^a_S(f) is the (pointwise) maximum ranking that can be reached from macrostate S with ranking f over a (it is easy to see that there is a unique such maximum ranking), and dec(θ) decreases the maximum ranks in a ranking θ by one (or by two for even maximum ranks and accepting states).
The analysis uses the value domain V = R (i.e., each macrostate of K_A is assigned a ranking giving an upper bound on the rank of each state in the macrostate) and the update function up_in given below:

    1  up_in(μ, S, R_1, . . . , R_m):
    2    foreach 1 ≤ i ≤ m and a ∈ Σ do
    3      if δ(R_i, a) = S then
    4        g^a_i ← max-succ-rank^a_{R_i}(μ(R_i))
    5    θ ← μ(S) ⊓ ⨆{g^a_i | g^a_i is defined};
    6    if rank(θ) is even then θ ← dec(θ);
    7    return θ;

Intuitively, up_in updates μ(S) to hold, for every q ∈ S, the maximum rank compatible with the ranks of the predecessors of q. We note line 6, which makes use of the fact that we can only consider tight rankings (whose rank is odd), so we can decrease the estimate using the function dec defined above.
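The two ingredients dec and max-succ-rank translate into Python as follows (a sketch reusing monus and even_floor from the earlier sketches; not Ranker code):

def dec(theta, acc_states):
    """The function dec from Eq. (5): decrease every occurrence of the maximal
    rank by one, rounding down to the next even value for accepting states."""
    r = max(theta.values())
    return {q: (even_floor(monus(v, 1)) if q in acc_states else monus(v, 1))
               if v == r else v
            for q, v in theta.items()}

def max_succ_rank(ba, R, a, f):
    """max-succ-rank^a_R(f): the pointwise maximal ranking f' on S = delta(R, a)
    with f ->a_R f'; accepting states are capped to even ranks."""
    S = ba.succ(R, a)
    f_prime = {}
    for q2 in S:
        # every predecessor of q2 bounds f'(q2); accepting transitions
        # additionally cap the bound to the even floor
        caps = [even_floor(f[q]) if (q, a, q2) in ba.acc_trans else f[q]
                for q in R if q2 in ba.delta.get((q, a), set())]
        v = min(caps)
        f_prime[q2] = even_floor(v) if q2 in ba.acc_states else v
    return f_prime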
Example 15. Let us continue with the example from Section 5.1 and perform the inner macrostate analysis starting with the TRUB {{p:1}, {p:1, q:1}, {p:7, q:7, r:7, s:7}} obtained from μ*_out. The first three iterations of the algorithm for {p, q, r, s} proceed as follows (we do not show {p, q} after the first iteration since it does not affect the intermediate steps):

    {p:7, q:7, r:7, s:7} → {p:6, q:7, r:7, s:7} → {p:6, q:6, r:7, s:7} → {p:6, q:6, r:6, s:6}.

In these three iterations, the maximum rank estimate decreases to {p:6, q:6, r:6, s:6} due to the accepting transitions from r and s. In the last of the three iterations, when all states have the even rank 6, the condition on line 6 becomes true and the rank of all states is decremented to {p:5, q:5, r:5, s:5} using dec. Then, again, the accepting transitions from r and s decrease the rank of p to 4, which is propagated to q, and so on. Eventually, we arrive at the TRUB {p:1, q:1, r:1, s:1}, which cannot be decreased any more, since {p:1, q:1} forces the ranks of r and s to stay at 1. ⊓⊔
Lemma 16. If μ is a TRUB, then μ ◁ {S ↦ up_in(μ, S, R_1, . . . , R_m)} is a TRUB.
Corollary 17. When started with a TRUB μ_0, the inner macrostate analysis terminates and returns a TRUB μ*_in.
6 Experimental Evaluation
Used tools and evaluation environment. We implemented the techniques described in the previous sections as an extension of the tool Ranker [18] (written in C++). Using the terminology of [19], the heuristics were implemented on top of the RankerMaxR configuration (we refer to this previous version as RankerOld). We tested the correctness of our implementation using Spot's autcross on all BAs in our benchmark. We compared the modified Ranker with other state-of-the-art tools, namely, Goal [41] (implementing Piterman [34], Schewe [37], Safra [36], and Fribourg [1]), Spot 2.9.3 [12] (implementing Redziejowski's algorithm [35]), Seminator 2 [4], LTL2dstar 0.5.4 [23], and Roll [26]. All tools were set to the mode where they output an automaton with the standard state-based Büchi acceptance condition. The experimental evaluation was performed on a 64-bit GNU/Linux Debian workstation with an Intel(R) Xeon(R) CPU E5-2620 running at 2.40 GHz with 32 GiB of RAM, using a timeout of 5 minutes.
Datasets. As the source of our benchmark, we use the two following datasets: (i) random, containing 11,000 BAs over a two-letter alphabet used in [40], which were randomly generated via the Tabakov–Vardi approach [39], starting from 15 states and with various different parameters; and (ii) LTL, with 1,721 BAs over larger alphabets (up to 128 symbols) used in [4], which were obtained from LTL formulae from the literature (221) or randomly generated (1,500). We preprocessed the automata using Rabit [30] and Spot's autfilt (using the --high simplification level), transformed them to state-based acceptance BAs (if they were not already), and converted them to the HOA format [2]. From this set, we removed automata that were (i) semi-deterministic, (ii) inherently weak, (iii) unambiguous, or (iv) had an empty language, since for these automata types there exist more efficient complementation procedures than for unrestricted BAs [5,4,6,28]. In the end, we were left with 2,592 (random) and 414 (LTL) hard automata. We use all to denote their union (3,006 BAs). Of these hard automata, 458 were elevator automata.
[Fig. 10: Comparison of the state space generated by our optimizations and other rank-based procedures: (a) Ranker vs Schewe; (b) Ranker vs RankerOld. Horizontal and vertical dashed lines represent timeouts. Blue data points are from random and red data points are from LTL. Axes are logarithmic.]
6.1 Generated State Space
In our first experiment, we evaluated the effectiveness of our heuristics for pruning the generated state space by comparing the sizes of complemented BAs without postprocessing. This use case is directed towards applications where postprocessing is irrelevant, such as inclusion or equivalence checking of BAs.
We focused on a comparison with two less optimized versions of the rank-based complementation procedure: Schewe (the version "Reduced Average Outdegree" from [37] implemented in Goal under -m rank -tr -ro) and its optimization RankerOld. The scatter plots in Fig. 10 compare the numbers of states of automata generated by Ranker and the other algorithms, and the upper part of Table 1 gives summary statistics. Observe that our optimizations from this paper drastically reduced the generated search space compared with both Schewe and RankerOld (the mean for Schewe is lower than for RankerOld due to its much higher number of timeouts); from Fig. 10b we can see that the improvement was in many cases exponential, even when compared with our previous optimizations in RankerOld. The median (which is a more meaningful indicator in the presence of timeouts) decreased by 44 % w.r.t. RankerOld, and we also reduced the number of timeouts by 23 %.
Table 1: Statistics for our experiments. The upper part compares various optimizations of the rank-based procedure (no postprocessing). The lower part compares Ranker to other approaches (with postprocessing). The left-hand side compares sizes of complement BAs and the right-hand side runtimes of the tools. The wins and losses columns give the number of times when Ranker was strictly better and worse. The values are given for the three datasets as "all (random : LTL)". Approaches implemented in Goal are labelled with G.

method       | mean            | median       | wins            | losses        | mean runtime [s]   | median runtime [s] | timeouts
Ranker       | 3812 (4452:207) | 79 (93:26)   |                 |               | 7.83 (8.99:1.30)   | 0.51 (0.84:0.04)   | 279 (276:3)
RankerOld    | 7398 (8688:358) | 141 (197:29) | 2190 (2011:179) | 111 (107:4)   | 9.37 (10.73:1.99)  | 0.61 (1.04:0.04)   | 365 (360:5)
Schewe (G)   | 4550 (5495:665) | 439 (774:35) | 2640 (2315:325) | 55 (1:54)     | 21.05 (24.28:7.80) | 6.57 (7.39:5.21)   | 937 (928:9)

Ranker       | 47 (52:18)      | 22 (27:10)   |                 |               | 7.83 (8.99:1.30)   | 0.51 (0.84:0.04)   | 279 (276:3)
Piterman (G) | 73 (82:22)      | 28 (34:14)   | 1435 (1124:311) | 416 (360:56)  | 7.29 (7.39:6.65)   | 5.99 (6.04:5.62)   | 14 (12:2)
Safra (G)    | 83 (91:30)      | 29 (35:17)   | 1562 (1211:351) | 387 (350:37)  | 14.11 (15.05:8.37) | 6.71 (6.92:5.79)   | 172 (158:14)
Spot         | 75 (85:15)      | 24 (32:10)   | 1087 (936:151)  | 683 (501:182) | 0.86 (0.99:0.06)   | 0.02 (0.02:0.02)   | 13 (13:0)
Fribourg (G) | 91 (104:13)     | 23 (31:9)    | 1120 (1055:65)  | 601 (376:225) | 17.79 (19.53:7.22) | 9.25 (10.15:5.48)  | 81 (80:1)
LTL2dstar    | 73 (82:21)      | 28 (34:13)   | 1465 (1195:270) | 465 (383:82)  | 3.31 (3.84:0.11)   | 0.04 (0.05:0.02)   | 136 (130:6)
Seminator 2  | 79 (91:15)      | 21 (29:10)   | 1266 (1131:135) | 571 (367:204) | 9.51 (11.25:0.08)  | 0.22 (0.39:0.02)   | 363 (362:1)
Roll         | 18 (19:14)      | 10 (9:11)    | 2116 (1858:258) | 569 (443:126) | 31.23 (37.85:7.28) | 8.19 (12.23:2.74)  | 1109 (1106:3)
Notice that the numbers for the LTL dataset do not differ as much as for random, witnessing the easier structure of the BAs in LTL.
6.2 Comparison with Other Complementation Techniques
In our second experiment, we compared the improved Ranker with other state-of-the-art tools. We compared the sizes of output BAs; therefore, we postprocessed each output automaton with autfilt (simplification level --high). Scatter plots are given in Fig. 11, where we compare Ranker with Spot (which had the best results on average among the other tools except Roll) and with Roll; summary statistics are in the lower part of Table 1. Observe that Ranker has by far the lowest mean (except for Roll) and the third lowest median (after Seminator 2 and Roll, but with fewer timeouts). Moreover, comparing the numbers in the columns wins and losses, we can see that Ranker gives strictly better results than the other tools (wins) more often than the other way around (losses).
In Fig. 11a we see that indeed, in the majority of cases, Ranker gives a smaller BA than Spot, especially for harder BAs (Spot, however, behaves slightly better on the simpler BAs from LTL). The results in Fig. 11b do not seem so clear. Roll uses a learning-based approach, more heavyweight and completely orthogonal to any of the other tools, and can in some cases output a tiny automaton, but it does not scale, as witnessed by its number of timeouts, which is much higher than that of any other tool. It is, therefore, positively surprising that Ranker could in most of the cases still obtain a much smaller automaton than Roll.
Regarding runtimes, the prototype implementation in Ranker is comparable to Seminator 2, but slower than Spot and LTL2dstar (Spot is the fastest tool). Implementations of the other approaches clearly do not target speed. We note that the number of timeouts of Ranker is still higher than that of some other tools (in particular Piterman, Spot, and Fribourg); further state space reduction targeting this particular issue is our future work.
7 Related Work
BA complementation has remained in the interest of researchers since the first introduction of BAs by Büchi in [8]. Together with the hunt for efficient complementation techniques, effort has been put into establishing the lower bound. First, Michel showed that the lower bound is n! (approx. (0.36n)^n) [31], and later Yan refined the result to (0.76n)^n [43].
[Fig. 11: Comparison of the complement size obtained by Ranker and other state-of-the-art tools: (a) Ranker vs Spot; (b) Ranker vs Roll. Horizontal and vertical dashed lines represent timeouts. Axes are logarithmic.]
The complementation approaches can be roughly divided into several branches. Ramsey-based complementation, the very first complementation construction, where the language of an input automaton is decomposed into a finite number of equivalence classes, was proposed by Büchi and was further enhanced in [7]. Determinization-based complementation was presented by Safra in [36] and later improved by Piterman in [34] and Redziejowski in [35]. Various optimizations for the determinization of BAs were further proposed in [29]. The main idea of this approach is to convert an input BA into an equivalent deterministic automaton with a different acceptance condition that can be easily complemented (e.g., a Rabin automaton). The complemented automaton is then converted back into a BA (often for the price of some blow-up). Slice-based complementation tracks the acceptance condition using a reduced abstraction on a run tree [42,21]. A learning-based approach was introduced in [27,26]. Allred and Ultes-Nitsche then presented a novel optimal complementation algorithm in [1]. For some special types of BAs, e.g., deterministic [25], semi-deterministic [5], or unambiguous [28], there exist specific complementation algorithms. Semi-determinization-based complementation converts an input BA into a semi-deterministic BA [11], which is then complemented [4].
Rank-based complementation, studied in [24,15,14,37,22], extends the subset construction for determinization of finite automata by storing additional information in each macrostate to track the acceptance condition of all runs of the input automaton. Optimizations of an alternative (sub-optimal) rank-based construction from [24] going through alternating Büchi automata were presented in [15]. Furthermore, the work in [22] introduces an optimization of Schewe's construction, in some cases producing smaller automata (this construction is not compatible with our optimizations). As shown in [9], the rank-based construction can be optimized using simulation relations. We identified several heuristics that help to reduce the size of the complement in [19]; these are compatible with the heuristics in this paper.
Acknowledgements. We thank the anonymous reviewers for their useful remarks that helped us improve the quality of the paper. This work was supported by the Czech Science Foundation project 20-07487S and the FIT BUT internal project FIT-S-20-6427.
References
1. Allred, J.D., Ultes-Nitsche, U.: A simple and optimal complementation algorithm for Büchi automata. In: Proceedings of the Thirty-Third Annual IEEE Symposium on Logic in Computer Science (LICS 2018), pp. 46–55. IEEE Computer Society Press (July 2018)
2. Babiak, T., Blahoudek, F., Duret-Lutz, A., Klein, J., Křetínský, J., Müller, D., Parker, D.,
Strejček, J.: The Hanoi omega-automata format. In: Computer Aided Verification - 27th
International Conference, CAV 2015, San Francisco, CA, USA, July 18-24, 2015, Proceed-
ings, Part I. Lecture Notes in Computer Science, vol. 9206, pp. 479–486. Springer (2015).
https://doi.org/10.1007/978-3-319-21690-4_31
3. Blahoudek, F., Heizmann, M., Schewe, S., Strejček, J., Tsai, M.H.: Complementing semi-deterministic Büchi automata. In: Tools and Algorithms for the Construction and Analysis of Systems. pp. 770–787. Springer Berlin Heidelberg, Berlin, Heidelberg (2016)
4. Blahoudek, F., Duret-Lutz, A., Strejček, J.: Seminator 2 can complement generalized Büchi
automata via improved semi-determinization. In: Proceedings of the 32nd International Con-
ference on Computer-Aided Verification (CAV’20). Lecture Notes in Computer Science, vol.
12225, pp. 15–27. Springer (Jul 2020)
5. Blahoudek, F., Heizmann, M., Schewe, S., Strejček, J., Tsai, M.: Complementing semi-
deterministic Büchi automata. In: Tools and Algorithms for the Construction and Analysis of
Systems - 22nd International Conference, TACAS 2016, Held as Part of the European Joint
Conferences on Theory and Practice of Software, ETAPS 2016, Eindhoven, The Netherlands,
April 2-8, 2016, Proceedings. Lecture Notes in Computer Science, vol. 9636, pp. 770–787.
Springer (2016). https://doi.org/10.1007/978-3-662-49674-9_49
6. Boigelot, B., Jodogne, S., Wolper, P.: On the use of weak automata for deciding linear
arithmetic with integer and real variables. In: Automated Reasoning, First International Joint
Conference, IJCAR 2001, Siena, Italy, June 18-23, 2001, Proceedings. Lecture Notes in
Computer Science, vol. 2083, pp. 611–625. Springer (2001). https://doi.org/10.1007/3-540-
45744-5_50
7. Breuers, S., Löding, C., Olschewski, J.: Improved Ramsey-based Büchi complementation.
In: Proc. of FOSSACS’12. pp. 150–164. Springer (2012)
8. Büchi, J.R.: On a decision method in restricted second order arithmetic. In: Proc. of Inter-
national Congress on Logic, Method, and Philosophy of Science 1960. Stanford Univ. Press,
Stanford (1962)
9. Chen, Y., Havlena, V., Lengál, O.: Simulations in rank-based Büchi automata complementa-
tion. In: Programming Languages and Systems - 17th Asian Symposium, APLAS 2019, Nusa
Dua, Bali, Indonesia, December 1-4, 2019, Proceedings. Lecture Notes in Computer Science,
vol. 11893, pp. 447–467. Springer (2019). https://doi.org/10.1007/978-3-030-34175-6_23
10. Chen, Y., Heizmann, M., Lengál, O., Li, Y., Tsai, M., Turrini, A., Zhang, L.: Ad-
vanced automata-based algorithms for program termination checking. In: Proceedings of
the 39th ACM SIGPLAN Conference on Programming Language Design and Implemen-
tation, PLDI 2018, Philadelphia, PA, USA, June 18-22, 2018. pp. 135–150. ACM (2018).
https://doi.org/10.1145/3192366.3192405
11. Courcoubetis, C., Yannakakis, M.: Verifying temporal properties of finite-state probabilis-
tic programs. In: 29th Annual Symposium on Foundations of Computer Science, White
Plains, New York, USA, 24-26 October 1988. pp. 338–345. IEEE Computer Society (1988).
https://doi.org/10.1109/SFCS.1988.21950
12. Duret-Lutz, A., Lewkowicz, A., Fauchille, A., Michaud, T., Renault, É., Xu, L.: Spot 2.0: a
framework for LTL and 𝜔-automata manipulation. In: Automated Technology for Verification
and Analysis. pp. 122–129. Springer International Publishing, Cham (2016)
13. Fogarty, S., Vardi, M.Y.: Büchi complementation and size-change termination. In: Proc. of
TACAS’09. pp. 16–30. Springer (2009)
14. Friedgut, E., Kupferman, O., Vardi, M.: Büchi complementation made tighter. International
Journal of Foundations of Computer Science 17, 851–868 (2006)
15. Gurumurthy, S., Kupferman, O., Somenzi, F., Vardi, M.Y.: On complementing non-
deterministic Büchi automata. In: Correct Hardware Design and Verification Methods,
12th IFIP WG 10.5 Advanced Research Working Conference, CHARME 2003, L'Aquila,
Italy, October 21-24, 2003, Proceedings. LNCS, vol. 2860, pp. 96–110. Springer (2003).
https://doi.org/10.1007/978-3-540-39724-3_10
16. Havlena, V., Lengál, O., Smahlíková, B.: Sky is not the limit: Tighter rank bounds for elevator
automata in Büchi automata complementation (technical report). CoRR abs/2110.10187
(2021), https://arxiv.org/abs/2110.10187
17. Havlena, V., Lengál, O., Šmahlíková, B.: Deciding S1S: Down the rabbit hole and through
the looking glass. In: Proceedings of NETYS’21. pp. 215–222. No. 12754 in LNCS, Springer
Verlag (2021). https://doi.org/10.1007/978-3-030-91014-3_15
18. Havlena, V., Lengál, O., Šmahlíková, B.: Ranker (2021), https://github.com/vhavlena/ranker
19. Havlena, V., Lengál, O.: Reducing (To) the Ranks: Efficient Rank-Based Büchi Automata
Complementation. In: Proc. of CONCUR’21. LIPIcs, vol. 203, pp. 2:1–2:19. Schloss
Dagstuhl, Dagstuhl, Germany (2021). https://doi.org/10.4230/LIPIcs.CONCUR.2021.2,
ISSN: 1868-8969
20. Heizmann, M., Hoenicke, J., Podelski, A.: Termination analysis by learning terminating
programs. In: Proc. of CAV’14. pp. 797–813. Springer (2014)
21. Kähler, D., Wilke, T.: Complementation, disambiguation, and determinization of Büchi au-
tomata unified. In: Proc. of ICALP’08. pp. 724–735. Springer (2008)
22. Karmarkar, H., Chakraborty, S.: On minimal odd rankings for Büchi complemen-
tation. In: Proc. of ATVA’09. LNCS, vol. 5799, pp. 228–243. Springer (2009).
https://doi.org/10.1007/978-3-642-04761-9_18
23. Klein, J., Baier, C.: On-the-fly stuttering in the construction of deterministic omega-
automata. In: Proc. of CIAA'07. LNCS, vol. 4783, pp. 51–61. Springer (2007).
https://doi.org/10.1007/978-3-540-76336-9_7
24. Kupferman, O., Vardi, M.Y.: Weak alternating automata are not that weak. ACM Trans.
Comput. Log. 2(3), 408–429 (2001). https://doi.org/10.1145/377978.377993
25. Kurshan, R.P.: Complementing deterministic Büchi automata in polynomial time. J. Comput.
Syst. Sci. 35(1), 59–71 (1987). https://doi.org/10.1016/0022-0000(87)90036-5
26. Li, Y., Sun, X., Turrini, A., Chen, Y., Xu, J.: ROLL 1.0: 𝜔-regular language learn-
ing library. In: Proc. of TACAS’19. LNCS, vol. 11427, pp. 365–371. Springer (2019).
https://doi.org/10.1007/978-3-030-17462-0_23
27. Li, Y., Turrini, A., Zhang, L., Schewe, S.: Learning to complement Büchi automata. In: Proc.
of VMCAI’18. pp. 313–335. Springer (2018)
28. Li, Y., Vardi, M.Y., Zhang, L.: On the power of unambiguity in Büchi complementation. In:
Proc. of GandALF’20. EPTCS, vol. 326, pp. 182–198. Open Publishing Association (2020).
https://doi.org/10.4204/EPTCS.326.12
29. Löding, C., Pirogov, A.: New optimizations and heuristics for determinization of Büchi
automata. In: Automated Technology for Verification and Analysis. pp. 317–333. Springer
International Publishing, Cham (2019). https://doi.org/10.1007/978-3-030-31784-3_18
30. Mayr, R., Clemente, L.: Advanced automata minimization. In: Proc. of POPL’13. pp. 63–74
(2013)
31. Michel, M.: Complementation is more difficult with automata on infinite words. CNET, Paris
15 (1988)
32. Nielson, F., Nielson, H.R., Hankin, C.: Principles of program analysis. Springer (1999).
https://doi.org/10.1007/978-3-662-03811-6
33. Oei, R., Ma, D., Schulz, C., Hieronymi, P.: Pecan: An automated theorem prover for automatic
sequences using Büchi automata. CoRR abs/2102.01727 (2021), https://arxiv.org/abs/2102.01727
34. Piterman, N.: From nondeterministic Büchi and Streett automata to deterministic parity
automata. In: Proc. of LICS’06. pp. 255–264. IEEE (2006)
35. Redziejowski, R.R.: An improved construction of deterministic omega-automaton using
derivatives. Fundam. Informaticae 119(3-4), 393–406 (2012). https://doi.org/10.3233/FI-2012-744
36. Safra, S.: On the complexity of 𝜔-automata. In: Proc. of FOCS’88. pp. 319–327. IEEE (1988)
37. Schewe, S.: Büchi complementation made tight. In: Proc. of STACS’09. LIPIcs, vol. 3, pp.
661–672. Schloss Dagstuhl (2009). https://doi.org/10.4230/LIPIcs.STACS.2009.1854
38. Sistla, A.P., Vardi, M.Y., Wolper, P.: The Complementation Problem for Büchi Automata with
Applications to Temporal Logic. Theoretical Computer Science 49(2-3), 217–237 (1987)
39. Tabakov, D., Vardi, M.Y.: Experimental evaluation of classical automata constructions. In:
Proc. of LPAR’05. pp. 396–411. Springer (2005)
40. Tsai, M.H., Fogarty, S., Vardi, M.Y., Tsay, Y.K.: State of Büchi complementation. In: Imple-
mentation and Application of Automata. pp. 261–271. Springer Berlin Heidelberg, Berlin,
Heidelberg (2011)
41. Tsai, M.H., Tsay, Y.K., Hwang, Y.S.: GOAL for games, omega-automata, and logics. In:
Computer Aided Verification. pp. 883–889. Springer Berlin Heidelberg, Berlin, Heidelberg
(2013)
42. Vardi, M.Y., Wilke, T.: Automata: From logics to algorithms. Logic and Automata 2, 629–736
(2008)
43. Yan, Q.: Lower bounds for complementation of 𝜔-automata via the full automata technique.
In: Automata, Languages and Programming. pp. 589–600. Springer Berlin Heidelberg, Berlin,
Heidelberg (2006)
On-The-Fly Solving for Symbolic Parity Games
Maurice Laveaux1, Wieger Wesselink1, and Tim A.C. Willemse1,2
1Eindhoven University of Technology, Eindhoven, The Netherlands
2ESI (TNO), Eindhoven, The Netherlands
{m.laveaux, j.w.wesselink, t.a.c.willemse}@tue.nl
Abstract. Parity games can be used to represent many different kinds
of decision problems. In practice, tools that use parity games often rely
on a specification in a higher-order logic from which the actual game
can be obtained by means of an exploration. For many of these decision
problems we are only interested in the solution for a designated vertex in
the game. We formalise how to use on-the-fly solving techniques during
the exploration process, and show that this can help to decide the winner
of such a designated vertex in an incomplete game. Furthermore, we
define partial solving techniques for incomplete parity games and show
how these can be made resilient to work directly on the incomplete game,
rather than on a set of safe vertices. We implement our techniques for
symbolic parity games and study their effectiveness in practice, showing
that speed-ups of several orders of magnitude are feasible and overhead
(if unavoidable) is typically low.
1 Introduction
A parity game is a two-player game with an ω-regular winning condition, played
by players ◇ (‘even’) and □ (‘odd’) on a directed graph. The true complexity of
solving parity games is still a major open problem, with the most recent break-
throughs yielding algorithms running in quasi-polynomial time, see, e.g., [18,7].
Apart from their intriguing status, parity games pop up in various fundamental
results in computer science (e.g., in the proof of decidability of a monadic second-
order theory). In practice, parity games provide an elegant, uniform framework
to encode many relevant decision problems, which include model checking prob-
lems, synthesis problems and behavioural equivalence checking problems.
Often, a decision problem that is encoded as a parity game can be answered
by determining which of the two players wins a designated vertex in the game
graph. Depending on the characteristics of the game, it may be the case that
only a fraction of the game is relevant for deciding which player wins a vertex.
For instance, deciding whether a transition system satisfies an invariant can be
encoded by a simple, solitaire (i.e., single player) parity game. In such a game,
player □ wins all vertices that are sinks (i.e., have no successors), and all states
leading to such sinks, so checking whether sinks are reachable from a designated
vertex suffices to determine whether this vertex is won by □, too. Clearly, as soon
as a sink is detected, any further inspection of the game becomes irrelevant.
A complicating factor is that in practice, the parity games that encode deci-
sion problems are not given explicitly. Rather, they are specified in some higher-
order logic such as a parameterised Boolean equation system, see, e.g. [11]. Ex-
ploring the parity game from such a higher-order specification is, in general,
time- and memory-consuming. To counter this, symbolic exploration techniques
have been proposed, see e.g. [19]. These explore the game graph on-the-fly and
exploit efficient symbolic data structures such as LDDs [13] to represent sets of
vertices and edges. Many parity game solving algorithms can be implemented
quite effectively using such data structures [20,28,29], so that in the end, explor-
ing the game graph often remains the bottleneck.
In this paper, we study how to combine the exploration of a parity game
and the on-the-fly solving of the explored part, with the aim to speed-up the
overall solving process. The central problem when performing on-the-fly solving
during the exploration phase is that we have to deal with incomplete information
when determining the winner for a designated vertex. Moreover, in the symbolic
setting, the exploration order may be unpredictable when advanced strategies
such as chaining and saturation [9] are used.
To formally reason about all possible exploration strategies and the artefacts
they generate, we introduce the concept of an incomplete parity game, and an
ordering on these. Incomplete parity games are parity games where for some
vertices not all outgoing edges are necessarily known. In practice, these could be
identified by, e.g., the todo queue in a classical breadth-first search. The extra
information captured by an incomplete parity game allows us to characterise
the safe set for a given player α. This is a set of vertices for which it can be
established that if player α wins the vertex, then she cannot lose the vertex if
more information becomes available. We prove an optimality result for safe sets,
which, informally, states that a safe set for player α is also the largest set with
this property (see Theorem 1).
The vertices won by player α in an α-safe set can be determined using a
standard parity game solving algorithm such as, e.g., Zielonka’s recursive al-
gorithm [31] or Priority Promotion [2]. However, these algorithms may be less
efficient as on-the-fly solvers. For this reason, we study three symbolic partial
solvers: solitaire winning cycle detection, forced winning cycle detection and fa-
tal attractors [17]. In particular cases, first determining the safe set for a player
and only subsequently solving the game using one of these partial solvers will
incur an additional overhead. As a final result, we therefore prove that all these
solvers can be (modified to) run on the incomplete game as a whole, rather than
on the safe set of a player (see Propositions 1-3).
As a proof of concept, we have implemented an (open source) symbolic tool
for the mCRL2 toolset [6] that explores a parity game specified by a parame-
terised Boolean equation system and solves these games on-the-fly. We report
on the effectiveness of our implementation on typical parity games stemming
from, e.g., model checking and equivalence checking problems, showing that it
can speed up the process by several orders of magnitude, while adding low
overhead if the entire game is needed for solving.
Related Work. Our work is related to existing techniques for solving symbolic
parity games such as [20,19], as we extend these existing methods with on-the-
fly solving. Naturally, our work is also related to existing work for on-the-fly
model checking. This includes work for on-the-fly (explicit) model checking of
regular alternation-free modal mu-calculus formulas [23] and work for on-the-
fly symbolic model checking of RCTL [1]. Compared to these, our method is
more general as it can be applied to the full modal mu-calculus (with data),
which subsumes RCTL and the alternation-free subset. Optimisations such as
the observation that checking LTL formulas of type AG reduces to reachability
checks [14] are a special case of our methods and partial solvers. Furthermore, our
methods are not restricted to model checking problems only and can be applied
to any parity game, including decision problems such as equivalence checking [8].
Furthermore, our method is agnostic to the exploration strategy employed.
Structure of the paper. In Section 2 we recall parity games. In Section 3 we
introduce incomplete parity games and show how partial solving can be applied
correctly. In Section 4 we present several partial solvers that we employ for
on-the-fly solving. Finally, in Section 5 we discuss the implementation of these
techniques and apply them to several practical examples. The omitted proofs for
the supporting lemmas can be found in [22].
2 Preliminaries
A parity game is an infinite-duration, two-player game that is played on a finite
directed graph. The objective of the two players, called even (denoted by ◇) and
odd (denoted by □), is to win vertices in the graph.
Definition 1. A parity game is a directed graph G = (V, E, p, (V◇, V□)), where
– V is a finite set of vertices, partitioned in sets V◇ and V□ of vertices owned
by ◇ and □, respectively;
– E ⊆ V × V is the edge relation;
– p : V → N is a function that assigns a priority to each node.
Henceforth, let G = (V, E, p, (V◇, V□)) be an arbitrary parity game. Throughout
this paper, we use α to denote an arbitrary player and ᾱ denotes the opponent.
We write vE to denote the set of successors {w ∈ V | (v, w) ∈ E} of vertex
v. The set sinks(G) is defined as the largest set U ⊆ V satisfying for all v ∈ U
that vE = ∅; i.e., sinks(G) is the set of all sinks: vertices without successors.
If we are only concerned with the sinks of player α, we write sinksα(G); i.e.,
sinksα(G) = Vα ∩ sinks(G). We write G↾U, for U ⊆ V, to denote the subgame
(U, (U × U) ∩ E, p↾U, (V◇ ∩ U, V□ ∩ U)), where p↾U(v) = p(v) for all vertices
v ∈ U.
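To make these definitions concrete, the following is a minimal explicit-state sketch in Python; it is our own illustration, not the symbolic, LDD-based representation used by the tool discussed in Section 5, and the names `Game`, `EVEN` and `ODD` are introduced here for illustration only.

```python
# A minimal explicit-state rendering of Definition 1 and the notation above.
# EVEN plays the diamond vertices, ODD the box vertices.
EVEN, ODD = 0, 1

class Game:
    def __init__(self, owner, prio, succ):
        self.owner = owner          # vertex -> EVEN or ODD
        self.prio = prio            # vertex -> priority p(v)
        self.succ = succ            # vertex -> set of successors vE

    def vertices(self):
        return set(self.owner)

    def sinks(self):
        # sinks(G): vertices without successors
        return {v for v in self.owner if not self.succ[v]}

    def pre(self, U):
        # pre(G, U): vertices with at least one successor in U
        return {v for v in self.owner if self.succ[v] & U}
```

For instance, the game of Figure 1 below can be encoded by mapping each ui to its owner, priority and successor set.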
Example 1. Consider the graph depicted in Figure 1, representing a parity game.
Diamond-shaped vertices are owned by player ◇, whereas box-shaped vertices
are owned by player □. The priority of a vertex is written inside the vertex.
Vertex u1 is a sink owned by player ◇.
On-The-Fly Solving for Symbolic Parity Games 139
[Fig. 1. An example parity game over vertices u0, u1, u2, u3, u4 with priorities 2, 3, 0, 1, 2, respectively.]
Plays and strategies. The game is played as follows. Initially, a token is placed on
a vertex of the graph. The owner of a vertex on which the token resides gets to
decide the successor vertex (if any) that the token is moved to next. A maximal
sequence of vertices (i.e., an infinite sequence or a finite sequence ending in a
sink) visited by the token by following this simple rule is called a play. A finite
play π is won by player ◇ if the sink in which it ends is owned by player □, and
it is won by player □ if the sink is owned by player ◇. An infinite play π is won
by player ◇ if the minimal priority that occurs infinitely often along π is even,
and it is won by player □ otherwise.
A strategy σα : V∗Vα → V for player α is a partial function that prescribes
where player α moves the token next, given the sequence of vertices visited by the
token. A play v0v1 . . . is consistent with a strategy σ if and only if σ(v0 . . . vi) =
vi+1 for all i for which σ(v0 . . . vi) is defined. Strategy σα is winning for player
α in vertex v if all plays consistent with σα and starting in v are won by α.
Player α wins vertex v if and only if she has a winning strategy σα for vertex v.
The parity game solving problem asks to compute the set of vertices W◇, won
by player ◇, and the set W□, won by player □. Note that since parity games are
determined [31,24], every vertex is won by one of the two players. That is, the
sets W◇ and W□ partition the set V.
Example 2. Consider the parity game depicted in Figure 1. In this game, the
strategy σ◇, partially defined as σ◇(πu0) = u2 and σ◇(πu2) = u0, for arbitrary
π, is winning for player ◇ in u0 and u2. Player □ wins vertex u3 using strategy
σ□(πu3) = u4, for arbitrary π. Note that player ◇ is always forced to move the
token from u4 to u3. Vertex u1 is a sink, owned by player ◇, and hence, won by
player □.
Dominions. A strategy σα is said to be closed on a set of vertices U ⊆ V iff
every play, consistent with σα and starting in a vertex v ∈ U, remains in U. If
player α has a strategy that is closed on U, we say that the set U is α-closed.
A dominion for player α is a set of vertices U ⊆ V such that player α has a
strategy σα that is closed on U and which is winning for α. Note that the sets
W◇ and W□ are dominions for player ◇ and player □, respectively, and, hence,
every vertex won by player α must belong to an α-dominion.
Example 3. Reconsider the parity game of Figure 1. Observe that player □ has
a closed strategy on {u3, u4}, which is also winning for player □. Hence, the
set {u3, u4} is a □-dominion. Furthermore, the set {u2, u3, u4} is ◇-closed.
However, none of the strategies for which {u2, u3, u4} is closed for player ◇ is
winning for her; therefore {u2, u3, u4} is not a ◇-dominion.
Predecessors, control predecessors and attractors. Let U ⊆ V be a set of vertices.
We write pre(G, U) to denote the set of predecessors {v ∈ V | ∃u ∈ U : u ∈ vE}
of U in G. The control predecessor set of U for player α in G, denoted
cpreα(G, U), contains those vertices for which α is able to force entering U in
one step. It is defined as follows:
cpreα(G, U) = (Vα ∩ pre(G, U)) ∪ (Vᾱ \ (pre(G, V \ U) ∪ sinks(G)))
Note that both pre and cpre are monotone operators on the complete lattice
(2^V, ⊆). The α-attractor to U in G, denoted Attrα(G, U), is the set of vertices
from which player α can force play to reach a vertex in U:
Attrα(G, U) = µZ.(U ∪ cpreα(G, Z))
The α-attractor to U can be computed by means of a fixed point iteration,
starting at U and adding α-control predecessors in each iteration until a stable
set is reached. We note that the α-attractor to an α-dominion D is again an
α-dominion.
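The fixed point iteration just described is straightforward to spell out; the sketch below does so on the explicit `Game` representation introduced after Definition 1 (again our own illustration; the tool computes attractors symbolically, with chaining).

```python
# Control predecessor cpreα(G, Z): α-vertices with a successor in Z, plus
# ᾱ-vertices that are not sinks and cannot avoid Z.
def cpre(game, alpha, Z):
    sinks = game.sinks()
    own = {v for v in game.vertices()
           if game.owner[v] == alpha and game.succ[v] & Z}
    opp = {v for v in game.vertices()
           if game.owner[v] != alpha and v not in sinks
           and game.succ[v] <= Z}
    return own | opp

# Attrα(G, U) = µZ.(U ∪ cpreα(G, Z)), computed by iteration until stable.
def attractor(game, alpha, U):
    U = set(U)
    Z = set(U)
    while True:
        Z_new = U | cpre(game, alpha, Z)
        if Z_new == Z:
            return Z
        Z = Z_new
```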
Example 4. Consider the parity game G of Figure 1 once again. The set of
◇-control predecessors of {u2} is {u0}. Note that since player □ can avoid moving
to u2 from vertex u3 by moving to vertex u4, vertex u3 is not among the ◇-
control predecessors of {u2}. The ◇-attractor to {u2} is the set {u0, u2}, which
is the largest set of vertices for which player ◇ has a strategy to force play to
the set of vertices {u2}.
3 Incomplete Parity Games
In many practical applications that rely on parity game solving, the parity game
is gradually constructed by means of an exploration, often starting from an ‘ini-
tial’ vertex. This is, for instance, the case when using parity games in the context
of model checking or when deciding behavioural preorders or equivalences. For
such applications, it may be profitable to combine exploration and solving, so
that the costly exploration can be terminated when the winner of a particular
vertex of interest (often the initial vertex) has been determined. The example
below, however, illustrates that one cannot naively solve the parity game con-
structed so far.
Example 5. Consider the parity game G in Figure 2, consisting of all vertices
and only the solid edges. This game could, for example, be the result of an
exploration starting from u4. Then G↾{u0, u1, u2, u3, u4, u5} is a subgame for
which we can conclude that all vertices form a ◇-dominion. However, after
exploring the dotted edges, player □ can escape to vertex u4 from vertex u5.
Consequently, vertices u4 and u5 are no longer won by player ◇ in the extended
game. Furthermore, observe that the additional edge from u3 to u5 does not
affect the previously established fact that player ◇ wins this vertex.
[Fig. 2. A parity game where the dotted edges are not yet known; vertices u0, u1, u2, u3, u4, u5 carry priorities 2, 3, 0, 2, 1, 2, respectively.]
To facilitate reasoning about games with incomplete information, we first intro-
duce the notion of an incomplete parity game.
Definition 2. An incomplete parity game is a structure ⅁ = (G, I), where G is
a parity game (V, E, p, (V◇, V□)), and I ⊆ V is a set of vertices with potentially
unexplored successors. We refer to the set I as the set of incomplete vertices;
the set V \ I is the set of complete vertices.
Observe that (G, ∅) is a ‘standard’ parity game. We permit ourselves to use
the notation for parity game notions such as plays, strategies, dominions, etcetera
also in the context of incomplete parity games. In particular, for ⅁ = (G, I),
we will write pre(⅁, U) and Attrα(⅁, U) to indicate pre(G, U) and Attrα(G, U),
respectively. Furthermore, we define ⅁↾U as the structure (G↾U, I ∩ U).
Intuitively, while exploring a parity game, we extend the set of vertices and
edges by exploring the incomplete vertices. Doing so gives rise to potentially
new incomplete vertices. At each stage in the exploration, the incomplete parity
game extends incomplete parity games explored in earlier stages. We formalise
the relation between incomplete parity games, abstracting from any particular
order in which vertices and edges are explored.
Definition 3. Let ⅁ = ((V, E, p, (V◇, V□)), I) and ⅁′ = ((V′, E′, p′, (V′◇, V′□)), I′) be
incomplete parity games. We write ⅁ ⊑ ⅁′ iff the following conditions hold:
(1) V ⊆ V′, V◇ ⊆ V′◇ and V□ ⊆ V′□;
(2) E ⊆ E′ and ((V \ I) × V′) ∩ E′ ⊆ E;
(3) p = p′↾V;
(4) I′ ∩ V ⊆ I.
Conditions (1) and (3) are self-explanatory. Condition (2) states that, on the
one hand, no edges are lost, and, on the other hand, E′ can only add edges
from vertices that are incomplete: for complete vertices, E′ specifies no new
successors. Finally, condition (4) captures that the set of incomplete vertices I′
cannot contain vertices that were previously complete. We note that the ordering
⊑ is reflexive, anti-symmetric and transitive.
Example 6. Suppose that ⅁ = (G, I) is the incomplete parity game depicted in
Figure 2, where G is the game with all vertices and only the solid edges, and
I = {u3, u5}. Then ⅁ ⊑ ⅁′, where ⅁′ = (G′, I′) is the incomplete parity game
where G′ is the depicted game with all vertices and both the solid edges and
dotted edges, and I′ = ∅.
Let us briefly return to Example 5. We concluded that the winner of vertex
u4 (and also u5) changed when adding new information. The reason is that
player □ has a strategy to reach an incomplete vertex owned by her. Such an
incomplete vertex may present an opportunity to escape from plays that would
be non-winning otherwise. On the other hand, the incomplete vertex u3 has
already been sufficiently explored to allow for concluding that this vertex is
won by player ◇, even if more successors are added to u3. This suggests that
for some subset of vertices, we can decide their winner in an incomplete parity
game and preserve that winner in all future extensions of the game. We formally
characterise this set of vertices in the definition below.
Definition 4. Let ⅁ = (G, I), with G = (V, E, p, (V◇, V□)), be an incomplete
parity game. The α-safe vertices for ⅁, denoted by safeα(⅁), is the set V \
Attrᾱ(G, Vᾱ ∩ I).
Example 7. Consider the incomplete parity game ⅁ of Example 6 once more. We
have safe◇(⅁) = {u0, u1, u2, u3} and safe□(⅁) = {u0, u1, u2, u4, u5}.
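Definition 4 translates directly into code on top of the earlier sketches; the following is again our own explicit-state illustration, with `I` the set of incomplete vertices.

```python
# safeα(⅁) = V \ Attrᾱ(G, Vᾱ ∩ I): remove everything the opponent can
# attract towards her own incomplete (and hence unsafe) vertices.
def safe(game, alpha, I):
    opponent = 1 - alpha
    unsafe = {v for v in I if game.owner[v] == opponent}
    return game.vertices() - attractor(game, opponent, unsafe)
```

On the game of Figure 2 with I = {u3, u5}, this reproduces the two sets of Example 7.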
In the remainder of this section, we show that it is indeed the case that while
exploring a parity game, one can only safely determine the winners in the sets
safe◇(⅁) and safe□(⅁), respectively. More specifically, we claim (Lemma 1) that
all α-dominions found in ⅁↾safeα(⅁) are preserved in extensions of the game, and
(Lemma 2) that vertices not in safeα(⅁) are not necessarily won by the same
player in extensions of the game.
Lemma 1. Given two incomplete games ⅁ and ⅁′ such that ⅁ ⊑ ⅁′. Any α-
dominion in ⅁↾safeα(⅁) is also an α-dominion in ⅁′.
Example 8. Recall that in Example 7, we found that safe◇(⅁) = {u0, u1, u2, u3}.
Observe that in the incomplete parity game ⅁ of Example 6, restricted to vertices
{u0, u1, u2, u3}, all vertices are won by player ◇, and, hence, {u0, u1, u2, u3} is
a ◇-dominion. Following Lemma 1 we can indeed conclude that this remains a
◇-dominion in all extensions of ⅁, and, in particular, for the (complete) parity
game of Example 6.
Lemma 2. Let ⅁ be an incomplete parity game. Suppose that W is an α-
dominion in ⅁. If W ⊈ safeα(⅁), then there is an (incomplete) parity game ⅁′
such that ⅁ ⊑ ⅁′ and all vertices in W \ safeα(⅁) are won by ᾱ.
As a corollary of the above lemma, we find that α-dominions that contain
vertices outside of the α-safe set are not guaranteed to be dominions in all
extensions of the incomplete parity game.
Corollary 1. Let ⅁ be an incomplete parity game. Suppose that W is an α-
dominion in ⅁. If W ⊈ safeα(⅁), then there is an (incomplete) parity game ⅁′
such that ⅁ ⊑ ⅁′ and W is not an α-dominion in ⅁′.
The theorem below summarises the two previous results, claiming that the
sets safe◇(⅁) and safe□(⅁) are the optimal subsets that can be used safely when
combining solving and the exploration of a parity game.
Theorem 1. Let ⅁ = (G, I), with G = (V, E, p, (V◇, V□)), be an incomplete
parity game. Define Wα as the union of all α-dominions in ⅁↾safeα(⅁), and let
W? = V \ (W◇ ∪ W□). Then W? is the largest set of vertices v for which there
are incomplete parity games ⅁α and ⅁ᾱ such that ⅁ ⊑ ⅁α and ⅁ ⊑ ⅁ᾱ, and v is
won by α in ⅁α and v is won by ᾱ in ⅁ᾱ.
Proof. Let ⅁, with G = (V, E, p, (V◇, V□)), be an incomplete parity game. Pick
a vertex v ∈ W?. Suppose that in G, vertex v ∈ W? is won by player α. Let
⅁α = ⅁. Then ⅁ ⊑ ⅁α and v is also won by α in ⅁α.
Next, we argue that there must be a game ⅁ᾱ such that ⅁ ⊑ ⅁ᾱ and v is
won by ᾱ in ⅁ᾱ. Since v ∈ W? is won by player α in G, v must belong to an
α-dominion in G. Towards a contradiction, assume that v ∈ safeα(⅁). Then there
must also be an α-dominion containing v in G↾safeα(⅁), since ᾱ cannot escape
the set safeα(⅁). But then v ∈ Wα. Contradiction, so v ∉ safeα(⅁). So, v must
be part of an α-dominion D in G such that D ⊈ safeα(⅁). By Lemma 2, we find
that there is an incomplete parity game ⅁ᾱ such that ⅁ ⊑ ⅁ᾱ and all vertices in
D \ safeα(⅁), and vertex v ∈ D in particular, are won by ᾱ in ⅁ᾱ.
Finally, we argue that W? cannot be larger. Pick a vertex v ∉ W?. Then there
must be some player α such that v ∈ Wα, and, consequently, there must be an
α-dominion D ⊆ safeα(⅁) such that v ∈ D. But then by Lemma 1, we find
that v is won by α in all incomplete parity games ⅁′ such that ⅁ ⊑ ⅁′.
4 On-the-fly Solving
In the previous section we saw that for any solver solveα, which accepts a parity
game as input and returns an α-dominion Wα, a correct on-the-fly solving algo-
rithm can be obtained by computing Wα = solveα(⅁↾safeα(⅁)) while exploring
an (incomplete) parity game ⅁. While this approach is clearly sound, computing
the set of safe vertices can be expensive for large state spaces and potentially
wasteful when no dominions are found afterwards. We next introduce safe at-
tractors which, we show, can be used to search for specific dominions without
first computing the α-safe set of vertices.
4.1 Safe Attractors
We start by observing that the α-attractor to a set U in an incomplete parity
game does not make a distinction between the set of complete and incomplete
vertices. Consequently, it may wrongly conclude that α has a strategy to force
play to U when the attractor strategy involves incomplete vertices owned by ᾱ.
We thus need to make sure that such vertices are excluded from consideration.
This can be achieved by considering the set of unsafe vertices Vᾱ ∩ I as potential
vertices that can be used by the other player to escape. We define the safe α-
attractor as the least fixed point of the safe control predecessor. The latter is
defined as follows:
spreα(⅁, U) = (Vα ∩ pre(⅁, U)) ∪ (Vᾱ \ (pre(⅁, V \ U) ∪ sinks(⅁) ∪ I))
Lemma 3. Let ⅁ be an incomplete parity game. For all vertex sets X ⊆ safeα(⅁)
it holds that cpreα(⅁↾safeα(⅁), X) = spreα(⅁, X).
The safe α-attractor to U, denoted SAttrα(⅁, U), is the set of vertices from
which player α can force play to safely reach U in ⅁:
SAttrα(⅁, U) = µZ.(U ∪ spreα(⅁, Z))
Lemma 4. Let ⅁ be an incomplete parity game, and X ⊆ safeα(⅁). Then
Attrα(⅁↾safeα(⅁), X) = SAttrα(⅁, X).
In particular, we can conclude the following:
Corollary 2. Let ⅁ be an incomplete parity game, and let X ⊆ safeα(⅁) be an
α-dominion. Then SAttrα(⅁′, X) is an α-dominion for all ⅁′ satisfying ⅁ ⊑ ⅁′.
One application of the above corollary is the following: since on-the-fly solving is
typically performed repeatedly, previously found dominions can be expanded by
computing the safe α-attractor towards these already solved vertices. Another
corollary is the following, which states that complete sinks can safely be attracted
towards.
Corollary 3. Let ⅁ = (G, I) be an incomplete parity game and let ⅁′ be such
that ⅁ ⊑ ⅁′. Then SAttrα(⅁′, sinksᾱ(⅁) \ I) is an α-dominion in ⅁′.
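In code, the safe variants differ from cpre/attractor above in exactly one condition: opponent-owned incomplete vertices never qualify as safe control predecessors. The sketch below (again our own explicit-state illustration) makes this explicit.

```python
# Safe control predecessor spreα(⅁, Z): as cpre, but opponent vertices in I
# are treated as potential escapes and therefore excluded.
def spre(game, alpha, Z, I):
    sinks = game.sinks()
    own = {v for v in game.vertices()
           if game.owner[v] == alpha and game.succ[v] & Z}
    opp = {v for v in game.vertices()
           if game.owner[v] != alpha and v not in sinks
           and v not in I and game.succ[v] <= Z}
    return own | opp

# SAttrα(⅁, U) = µZ.(U ∪ spreα(⅁, Z))
def safe_attractor(game, alpha, U, I):
    U = set(U)
    Z = set(U)
    while True:
        Z_new = U | spre(game, alpha, Z, I)
        if Z_new == Z:
            return Z
        Z = Z_new
```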
4.2 Partial Solvers
In practice, a full-fledged solver, such as Zielonka’s algorithm [31] or one of
the Priority Promotion variants [2], may be costly to run often while exploring
a parity game. Instead, cheaper partial solvers may be used that search for
a dominion of a particular shape. We study three such partial solvers in this
section, with a particular focus on solvers that lend themselves to parity games
that are represented symbolically using, e.g., BDDs [5], MDDs [25] or LDDs [13].
For the remainder of this section, we fix an arbitrary incomplete parity game
⅁ = ((V, E, p, (V◇, V□)), I).
Winning solitaire cycles. A simple cycle in ⅁ can be represented by a finite
sequence of distinct vertices v0v1 . . . vn satisfying v0 ∈ vnE. Such a cycle is an
α-solitaire cycle whenever all vertices on that cycle are owned by player α.
Observe that if all vertices on an α-solitaire cycle have a priority that is of
the same parity as the owner α, then all vertices on that cycle are won by player
α. Formally, these are thus cycles through vertices in the set Pα ∩ Vα, where
P◇ = {v ∈ V \ sinks(⅁) | p(v) mod 2 = 0} and P□ = {v ∈ V \ sinks(⅁) | p(v)
mod 2 = 1}. Let Cαsol(⅁) represent the largest set of α-solitaire winning cycles.
Then Cαsol(⅁) = νZ.(Pα ∩ Vα ∩ pre(⅁, Z)).
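As a greatest fixed point, Cαsol(⅁) can be computed by shrinking a candidate set until it stabilises; a possible explicit-state rendering (our illustration, reusing the sketches above) is:

```python
# Cαsol(⅁) = νZ.(Pα ∩ Vα ∩ pre(⅁, Z)): start from the α-owned non-sink
# vertices with α's parity and drop vertices until every survivor has a
# successor that also survives, i.e. lies on an α-solitaire winning cycle.
def solitaire_cycles(game, alpha):
    sinks = game.sinks()
    Z = {v for v in game.vertices()
         if game.owner[v] == alpha and v not in sinks
         and game.prio[v] % 2 == alpha}
    while True:
        Z_new = {v for v in Z if game.succ[v] & Z}
        if Z_new == Z:
            return Z
        Z = Z_new
```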
Proposition 1. The set Cαsol(⅁) is an α-dominion and we have Cαsol(⅁) ⊆ safeα(⅁).
Proof. We first prove that Cαsol(⅁) ⊆ safeα(⅁). We show, by means of an induction
on the fixed point approximants Ai of the attractor, that Cαsol(⅁) ∩ Attrᾱ(⅁, Vᾱ ∩
I) = ∅. The base case follows immediately, as Cαsol(⅁) ∩ A0 = Cαsol(⅁) ∩ ∅ = ∅.
For the induction, we assume that Cαsol(⅁) ∩ Ai = ∅; we show that also Cαsol(⅁) ∩
((Vᾱ ∩ I) ∪ cpreᾱ(⅁, Ai)) = ∅. First, observe that Cαsol(⅁) ⊆ Vα; hence, it suffices
to prove that Cαsol(⅁) ∩ (Vα \ (pre(⅁, V \ Ai) ∪ sinks(⅁))) = ∅. But this follows
immediately from the fact that for every vertex v ∈ Cαsol(⅁), we have v ∈ Pα ∩
Vα ∩ pre(⅁, Cαsol(⅁)); more specifically, we have vE ∩ Cαsol(⅁) ≠ ∅ for all v ∈ Cαsol(⅁).
The fact that Cαsol(⅁) is an α-dominion follows from the fact that for every
vertex v ∈ Cαsol(⅁), there is some w ∈ vE ∩ Cαsol(⅁). This means that player α
must have a strategy that is closed on Cαsol(⅁). Since all vertices in Cαsol(⅁) are of
the priority that is beneficial to α, this closed strategy is also winning for α.
Observe that winning solitaire cycles can be computed without first computing
the α-safe set. Parity games that stand to profit from detecting winning solitaire
cycles are those originating from verifying safety properties.
Winning forced cycles. In general, a cycle in ⅁↾safe◇(⅁) through vertices in P◇
can contain vertices of both players, providing player □ an opportunity to break
the cycle if that is beneficial to her. Nevertheless, if breaking a cycle always
inadvertently leads to another cycle through P◇, then we may conclude that all
vertices on these cycles are won by player ◇. We call these cycles winning forced
cycles for player ◇. A dual argument applies to cycles through P□. Let Cαfor(⅁)
represent the largest set of vertices that are on winning forced cycles for player
α. More formally, we define Cαfor(⅁) = νZ.(Pα ∩ safeα(⅁) ∩ cpreα(⅁, Z)).
Lemma 5. The set Cαfor(⅁) is an α-dominion and we have Cαfor(⅁) ⊆ safeα(⅁).
A possible downside of the above construction is that it again requires to first
compute safeα(⅁), which, in particular cases, may incur an additional overhead.
Instead, we can compute the same set using the safe control predecessor. We
define Cαsfor(⅁) = νZ.(Pα ∩ spreα(⅁, Z)).
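Operationally this is the same shrinking iteration as for solitaire cycles, with the safe control predecessor in place of the plain predecessor; a sketch under the same assumptions as before:

```python
# Cαsfor(⅁) = νZ.(Pα ∩ spreα(⅁, Z)): forced winning cycle detection that,
# by Proposition 2 below, coincides with Cαfor(⅁) while avoiding the
# up-front computation of safeα(⅁).
def forced_cycles(game, alpha, I):
    sinks = game.sinks()
    P_alpha = {v for v in game.vertices()
               if v not in sinks and game.prio[v] % 2 == alpha}
    Z = set(P_alpha)
    while True:
        Z_new = P_alpha & spre(game, alpha, Z, I)
        if Z_new == Z:
            return Z
        Z = Z_new
```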
Proposition 2. We have Cαfor(⅁) = Cαsfor(⅁).
Proof. Let τ(Z) = Pα ∩ spreα(⅁, Z). We use set inclusion to show that Cαfor(⅁) is
indeed a fixed point of τ.
ad Cαfor(⅁) ⊆ τ(Cαfor(⅁)). Pick a vertex v ∈ Cαfor(⅁). By definition of Cαfor(⅁),
we have v ∈ Pα ∩ safeα(⅁) ∩ cpreα(⅁, Cαfor(⅁)). Observe that safeα(⅁) ∩
cpreα(⅁, Cαfor(⅁)) = safeα(⅁) ∩ cpreα(⅁↾safeα(⅁), Cαfor(⅁)). But then, since
Cαfor(⅁) ⊆ safeα(⅁), we find, by Lemma 3, that cpreα(⅁↾safeα(⅁), Cαfor(⅁)) =
spreα(⅁, Cαfor(⅁)). Hence, v ∈ Pα ∩ spreα(⅁, Cαfor(⅁)) = τ(Cαfor(⅁)).
ad Cαfor(⅁) ⊇ τ(Cαfor(⅁)). Again pick a vertex v ∈ τ(Cαfor(⅁)). Then v ∈
Pα ∩ spreα(⅁, Cαfor(⅁)). Since Cαfor(⅁) ⊆ safeα(⅁), by Lemma 3, we again have
spreα(⅁, Cαfor(⅁)) = cpreα(⅁↾safeα(⅁), Cαfor(⅁)). But then it must be the case
that v ∈ safeα(⅁). Moreover, cpreα(⅁↾safeα(⅁), Cαfor(⅁)) ⊆ cpreα(⅁, Cαfor(⅁)).
So v ∈ Pα ∩ safeα(⅁) ∩ cpreα(⅁, Cαfor(⅁)) = Cαfor(⅁).
We show next that for any Z = τ(Z), we have Z ⊆ Cαfor(⅁). Let Z be such. We first
show that for every v ∈ Z ∩ Vα, there is some w ∈ vE ∩ Z, and for every v ∈ Z ∩ Vᾱ,
we have v ∉ sinks(⅁), v ∉ I and vE ⊆ Z. Pick v ∈ Z ∩ Vα. Then v ∈ τ(Z) ∩ Vα =
Pα ∩ Vα ∩ spreα(⅁, Z) ⊆ pre(⅁, Z). But then vE ∩ Z ≠ ∅. Next, let v ∈ Z ∩ Vᾱ.
Then v ∈ τ(Z) ∩ Vᾱ = Pα ∩ Vᾱ ∩ spreα(⅁, Z) ⊆ Vᾱ \ (pre(⅁, V \ Z) ∪ sinks(⅁) ∪ I).
So v ∉ pre(⅁, V \ Z) ∪ sinks(⅁) ∪ I. Consequently, vE ⊆ Z, v ∉ sinks(⅁) and
v ∉ I.
Since for every v ∈ Z ∩ Vα, we have vE ∩ Z ≠ ∅, there must be a strategy
for player α to move to another vertex in Z. Let σ be this strategy. Moreover,
since for all v ∈ Z ∩ Vᾱ we have vE ⊆ Z, we find that σ is closed on Z and since
Z ∩ sinks(⅁) = ∅, strategy σ induces forced cycles. Moreover, since Z ⊆ Pα, we
can conclude that all vertices in Z are on winning forced cycles.
Finally, we must argue that Z ⊆ safeα(⅁). But this follows from the fact that
Z ∩ Vᾱ ∩ I = ∅, and, hence, also Z ∩ Attrᾱ(⅁, Vᾱ ∩ I) = ∅. Since Z is contained
within Pα ∩ safeα(⅁), we find that Z ⊆ Cαfor(⅁).
Fatal attractors. Both solitaire cycles and forced cycles utilise the fact that the
parity winning condition becomes trivial if the only priorities that occur on
a play are of the parity of a single player. Fatal attractors [17] were originally
conceived to solve parts of a game using algorithms that have an appealing worst-
case running time; for a detailed account, we refer to [17]. While ibid. investigates
several variants, the main idea behind a fatal attractor is that it identifies cycles
in which the priorities are non-decreasing until the dominating priority of the
attractor is (re)visited. We focus on a simplified (and cheaper) variant of the
psolB algorithm of [17], which is based on the concept of a monotone attractor,
which, in turn, relies on the monotone control predecessor defined below, where
P≥c = {v ∈ V | p(v) ≥ c}:
Mcpreα(⅁, Z, U, c) = P≥c ∩ cpreα(⅁, Z ∪ U)
The monotone attractor for a given priority is then defined as the least fixed point
of the monotone control predecessor for that priority, formally MAttrα(⅁, U, c) =
µZ.Mcpreα(⅁, Z, U, c). A fatal attractor for priority c is then the largest set of
vertices closed under the monotone attractor for priority c; i.e., Fα(⅁, c) =
νZ.(P=c ∩ safeα(⅁) ∩ MAttrα(⅁↾safeα(⅁), Z, c)), where P=c = P≥c \ P≥c+1.
Lemma 6 (See [17], Theorem 2). For even c, we have that MAttr◇(⅁↾
safe◇(⅁), F◇(⅁, c), c) ⊆ safe◇(⅁) and MAttr◇(⅁↾safe◇(⅁), F◇(⅁, c), c) is a ◇-
dominion. If c is odd then we have MAttr□(⅁↾safe□(⅁), F□(⅁, c), c) ⊆ safe□(⅁)
and MAttr□(⅁↾safe□(⅁), F□(⅁, c), c) is a □-dominion.
Our simplified version of the psolB algorithm, here dubbed solB, computes
fatal attractors for all priorities in descending order, accumulating ◇- and □-
dominions and extending these dominions using a standard ◇- or □-attractor.
This can be implemented using a simple loop over these priorities.
In line with the previous solvers, we can also modify this solver to employ
a safe monotone control predecessor, which uses a construction that is similar
in spirit to that of the safe control predecessor. Formally, we define the safe
monotone control predecessor as follows:
sMcpreα(⅁, Z, U, c) = P≥c ∩ spreα(⅁, Z ∪ U)
The corresponding safe monotone α-attractor, denoted sMAttrα(⅁, U, c), is de-
fined as follows: sMAttrα(⅁, U, c) = µZ.sMcpreα(⅁, Z, U, c). We define the safe
fatal attractor for priority c as the set Fαs(⅁, c) = νZ.(P=c ∩ sMAttrα(⅁, Z, c)).
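The nested fixed points spell out as an inner least fixed point (the safe monotone attractor) inside an outer greatest fixed point; a compact explicit-state sketch (our illustration, reusing spre from above):

```python
# sMAttrα(⅁, U, c) = µZ.(P≥c ∩ spreα(⅁, Z ∪ U)): attract towards U while
# only passing through vertices of priority at least c.
def smattr(game, alpha, U, c, I):
    P_geq_c = {v for v in game.vertices() if game.prio[v] >= c}
    Z = set()
    while True:
        Z_new = P_geq_c & spre(game, alpha, Z | U, I)
        if Z_new == Z:
            return Z
        Z = Z_new

# Fαs(⅁, c) = νZ.(P=c ∩ sMAttrα(⅁, Z, c)): the priority-c vertices whose
# monotone attractor keeps coming back to them.
def safe_fatal_attractor(game, alpha, c, I):
    P_eq_c = {v for v in game.vertices() if game.prio[v] == c}
    Z = set(P_eq_c)
    while True:
        Z_new = P_eq_c & smattr(game, alpha, Z, c, I)
        if Z_new == Z:
            return Z
        Z = Z_new
```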
Proposition 3. Let ⅁ be an incomplete parity game. We have F◇s(⅁, c) =
F◇(⅁, c) for even c, and for odd c we have F□s(⅁, c) = F□(⅁, c).
Similar to algorithm solB, the algorithm solBs computes safe fatal attrac-
tors for priorities in descending order and collects the safe-α-attractor extended
dominions obtained this way.
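A possible rendering of this loop is sketched below; it is our own reconstruction of the idea, not the tool's implementation, and it omits details such as removing solved vertices between iterations.

```python
# solBs, sketched: scan priorities in descending order, take the safe fatal
# attractor for the player winning that parity, extend the corresponding
# dominion (its safe monotone attractor) with the safe attractor, and
# accumulate the results per player.
def solBs(game, I):
    won = {EVEN: set(), ODD: set()}
    for c in sorted(set(game.prio.values()), reverse=True):
        alpha = c % 2                  # EVEN wins even c, ODD wins odd c
        F = safe_fatal_attractor(game, alpha, c, I)
        if F:
            D = smattr(game, alpha, F, c, I)      # dominion, cf. Lemma 6
            won[alpha] |= safe_attractor(game, alpha, D | won[alpha], I)
    return won
```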
5 Experimental Results
We experimentally evaluate the techniques of Section 4. For this, we use games
stemming from practical model checking and equivalence checking problems.
Our experiments are run, single-threaded, on an Intel Xeon 6136 CPU @ 3 GHz
PC. The sources for these experiments can be obtained from the downloadable
artefact [21].
5.1 Implementation
We have implemented a symbolic exploration technique for parity games in the
mCRL2 toolset [6]. Our tool exploits techniques such as read and write depen-
dencies [20,4], and uses sophisticated exploration strategies such as chaining and
saturation [9]. We use MDD-like data structures [25] called List Decision Dia-
grams (LDDs), and the corresponding Sylvan implementation [13], to represent
parity games symbolically. Sylvan also offers efficient implementations for set
operations and relational operations, such as predecessors, facilitating the im-
plementation of attractor computations, the described (partial) solvers, and a
full solver based on Zielonka’s recursive algorithm [31], which remains one of the
most competitive algorithms in practice, both explicitly and symbolically [28,12].
For the attractor set computation we have also implemented chaining to deter-
mine (multi-)step α-predecessors more efficiently.
For all three on-the-fly solving techniques of Section 4, we have implemented
1) a variant that runs the standard (partial) solver on the α-safe subgame and
removes the found dominion using the standard attractor (within that subgame),
and 2) a variant that uses (partial) solvers with the safe attractors. Moreover,
we also conduct experiments using the full solver running on an α-safe subgame.
An important design aspect is to decide how the exploration and the on-the-fly
solving should interleave. For this we have implemented a time-based heuristic
that keeps track of the time spent on solving and exploration steps. The time
measurements are used to ensure that (approximately) ten percent of total time
is spent on solving by delaying the next call to the solver. We do not terminate
the partial solver when it requires more time, and thus the ten percent target is only approximate.
As a result of this heuristic, cheap solvers will be called more frequently than
more expensive (and more powerful) ones, which may cause the latter to explore
larger parts of the game graph.
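In pseudo-Python, the heuristic amounts to the loop below; `explore_step` and `partial_solve` are placeholder callbacks, not the tool's actual API, and the bookkeeping is simplified.

```python
import time

# Interleave exploration and on-the-fly solving so that roughly `budget`
# (ten percent) of the measured time is spent solving; the next solver call
# is simply delayed until the ratio drops below the budget again.
def explore_and_solve(explore_step, partial_solve, budget=0.10):
    t_explore = t_solve = 0.0
    while True:
        t0 = time.monotonic()
        finished = explore_step()          # returns True when fully explored
        t_explore += time.monotonic() - t0
        if finished:
            break
        if t_solve <= budget * (t_explore + t_solve):
            t0 = time.monotonic()
            decided = partial_solve()      # True if the initial vertex is won
            t_solve += time.monotonic() - t0
            if decided:
                break
```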
5.2 Cases
Table 1 provides an overview of the models and a description of the property
that is being checked. The properties are written in the modal µ-calculus with
data [15]. For the equivalence checking case we have mutated the original model
to introduce a defect. For each property, we indicate the nesting depth (ND) and
alternation depth (AD) [10], and whether the parity game is solitaire (Yes/No). The
nesting depth indicates how many different priorities occur in the resulting game;
for our encoding this is at most ND+2 (the additional ones encode the constants
‘true’ and ‘false’). The alternation depth is an indication of a game’s complexity
due to alternating priorities.
Table 1. Models and formulas.
Model  Ref.  Prop.  Result  ND  AD  Sol.  Description
SWP    [30]  1      false   1   1   Y     No error transition
             2      false   3   3   N     Infinitely often enabled then infinitely often taken
WMS    [27]  1      false   1   1   Y     Job failed to be done
             2      false   1   1   Y     No zombie jobs
             3      true    3   2   Y     A job can become alive again infinitely often
             4      false   2   2   N     Branching bisimulation with a mutation
BKE    [3]   1      true    1   1   Y     No secret leaked
             2      false   2   1   N     No deadlock
CCP    [26]  1      false   2   1   N     No deadlock
             2      false   2   1   N     After access there is always accessover possible
PDI    n/a   1      true    2   1   N     Controller reaches state before it can connect again
             2      false   2   1   N     Connection impermissible can always happen or we establish a connection
             3      false   3   1   N     When connected, move to not ready for connection and do not establish a connection until it is allowed again
             4      true    2   1   N     The interlocking moves to the state connection closed before it is allowed to successfully establish a connection
We use MODEL-i to indicate the parity game belonging to model MODEL
and property i. Models SWP, BKE and CCP are protocol specifications. The
model PDI is a specification of an EULYNX SCI-LX SysML interface model that
is used for a train interlocking system. Finally, WMS is the specification of a
workload management system used at CERN. Using tools in mCRL2 [6], we have
converted each model and property combination into a so-called parameterised
Boolean equation system [16], a higher-level logic that can be used to represent
the underlying parity game.
Parity games SWP-1, WMS-1, WMS-2 and BKE-1 encode typical safety
properties where some action should not be possible. In terms of the alternation-
free modal mu-calculus with regular expressions, such properties are of the shape
[true*.a]false. These properties are violated exactly when the vertex encoding
‘false’ can be reached. Parity games SWP-2, WMS-3 and WMS-4 are more
complex properties with alternating priorities, where WMS-4 encodes branching
bisimulation using the theory presented in [8]. The parity games BKE-2 and
CCP-1 encode a ‘no deadlock’ property given by a formula which states that
after every path there is at least one outgoing transition. Finally, CCP-2 and
all PDI cases contain formulas with multiple fixed points that yield games with
multiple priorities but no (dependent) alternation.
Table 2. Experiments with parity games where on-the-fly solving cannot terminate
early. All run times are in seconds. The number of vertices is given in millions. Memory
is given in gigabytes. Bold-faced numbers indicate the lowest value.
Game Strategy Vertices (106) Explore (s) Solve (s) Total (s) Mem (GB)
BKE-1 full 40 640 65 705 14
solitaire 40/40 629/615 153/100 782/715 15/15
cycles 40/40 635/644 149/160 785/804 15/15
fatal 40/40 624/625 152/164 776/789 15/15
partial 40 651 147 798 15
PDI-1 full 114 27 0.1 28 2
solitaire 114/114 28/27 4/0 33/28 2/2
cycles 114/114 29/28 7/7 36/35 2/2
fatal 114/114 28/28 4/7 32/35 2/2
partial 114 28 9 37 2
PDI-4 full 474 286 0 287 2
solitaire 474/474 284/281 46/14 331/295 2/2
cycles 474/474 284/287 92/91 376/378 2/2
fatal 474/474 285/283 80/91 365/374 2/2
partial 474 286 64 350 2
5.3 Results
In Tables 2 and 3 we compare the on-the-fly solving strategies presented in
Section 4. In the ‘Strategy’ column we indicate the on-the-fly solving strategy
that is used. Here full refers to a complete exploration followed by solving with
the Zielonka recursive algorithm. We use solitaire to refer to solitaire winning
cycle detection, cycles for forced winning cycle detection, fatal to refer to fatal
attractors and finally partial for on-the-fly solving with a Zielonka solver on safe
regions. For solvers with a standard variant and a variant that utilises the safe
attractors, the first number indicates the result of applying the (standard) solver
on safe vertices, and the second number (following the slash ‘/’) indicates the
result when using the solver that utilises safe attractors.
The column ‘Vertices’ indicates the number of vertices explored in the game.
In the next columns we indicate the time spent on exploring and solving specif-
ically and the total time in seconds. We exclude the initialisation time that is
common to all experiments. Finally, the last column indicates memory used by
the tool in gigabytes. We report the average of 5 runs and have set a timeout
of 1200 seconds per run; timed-out runs are marked as such in the tables. Table 2
contains all benchmarks that
require a full exploration of the game graph, providing an indication of the over-
Table 3. Experiments with parity games in which at least one partial solver terminates
early. All run times are in seconds. The number of vertices is given in millions. For
solvers with two variants, the first number indicates the result of applying the solver
on safe vertices, and the number following the slash ‘/’ the result when using the solver
that uses safe attractors. Memory is given in gigabytes. Bold-faced numbers indicate the lowest
value.
Game Strategy Vertices (106) Explore (s) Solve (s) Total (s) Mem (GB)
SWP-1 full 13304 n/a
solitaire 15.1/0.4 8.5/1.4 27.3/0.1 35.8/1.5 2.8/1.5
cycles 25.2/0.9 12.3/1.8 42.7/1.0 55.0/2.8 3.2/1.5
fatal 15.1/0.4 9.0/1.3 29.4/0.4 38.4/1.7 3.1/1.5
partial 27.1 13.1 50.4 63.5 3.6
SWP-2 full 1987 n/a
solitaire 1631/1987 /163/11 / /
cycles 1774/1774 /154/91 / /
fatal 0.007/0.007 0.9/0.9 0.4/0.2 1.3/1.0 1.4/1.2
partial 0.007 0.9 0.4 1.3 1.4
WMS-1 full 270 2.8 0.4 3.3 0.2
solitaire 270/240 2.8/2.5 0.8/0.4 3.6/2.9 0.3/0.2
cycles 270/270 2.9/3.2 0.8/8.0 3.7/11.2 0.3/0.5
fatal 270/270 2.6/3.2 0.8/8.5 3.4/11.7 0.3/0.5
partial 270 2.7 0.8 3.5 0.3
WMS-2 full 317 3.3 0.3 3.6 0.2
solitaire 7/7 0.2/0.2 1.0/0.5 1.2/0.8 0.1/0.1
cycles 7/66 0.2/0.8 1.0/2.7 1.2/3.4 0.1/0.2
fatal 7/66 0.2/0.7 1.0/2.9 1.3/3.6 0.1/0.2
partial 7 0.2 1.1 1.3 0.1
WMS-3 full 317 2.6 0.1 2.7 0.2
solitaire 317/317 2.6/2.6 0.4/0.3 3.1/2.9 0.2/0.2
cycles 317/317 2.7/2.7 0.4/0.6 3.1/3.3 0.2/0.2
fatal 5/1 0.2/0.1 0.5/0.1 0.7/0.2 0.1/0.1
partial 5 0.2 0.3 0.5 0.1
WMS-4 full 366 n/a
solitaire 0.03/0.03 38/38 0.8/0.1 39/38 2/2
cycles 0.03/0.03 37/37 0.8/0.3 38/37 2/2
fatal 0.03/0.03 37/37 0.8/0.3 38/37 2/2
partial 0.03 37 0.7 38 2
BKE-2 full 119 942 36.5 979 28
solitaire 0.0007/0.0001 0.2/0.1 0.0/0.0 0.2/0.2 0.9/0.9
cycles 0.0007/0.0003 0.2/0.2 0.0/0.0 0.2/0.2 0.9/0.9
fatal 0.0007/0.0003 0.2/0.2 0.0/0.0 0.2/0.2 0.9/0.9
partial 0.0007 0.2 0.0 0.2 0.9
CCP-1 full 0.4 28 4.2 32 2
solitaire 0.003/0.003 1.0/1.0 0.1/0.1 1.1/1.1 2/2
cycles 0.003/0.003 1.0/1.0 0.1/0.1 1.1/1.1 2/2
fatal 0.006/0.003 1.3/1.1 0.1/0.1 1.4/1.2 1.5/1.5
partial 0.003 1.0 0.1 1.1 1.5
CCP-2 full 0.9 35 33 68 1.7
solitaire 0.02/0.007 1.6/1.1 0.2/0.0 1.8/1.1 1.5/1.5
cycles 0.02/0.007 1.9/1.1 0.2/0.1 2.1/1.2 1.5/1.5
fatal 0.02/0.007 1.6/1.2 0.2/0.1 1.8/1.3 1.5/1.5
partial 0.02 1.6 0.2 1.8 1.5
PDI-2 full 229 31 12 43 2
solitaire 229/229 33/32 34/12 67/45 2/2
cycles 30/30 15/14 3/5 17/19 2/2
fatal 30/30 15/15 3/5 18/19 2/2
partial 123 23 29 51 2
PDI-3 full 436 228 8 236 2
solitaire 436/436 230/228 36/32 266/260 2/2
cycles 78/162 65/102 19/64 84/166 2/2
fatal 75/84 64/67 19/23 83/90 2/2
partial 110 82 30 112 2
head in cases where this is unavoidable; Table 3 contains all benchmarks where
at least one of the partial solvers allows exploration to terminate early.
For games SWP-1, WMS-1, WMS-2 in Table 3 we find that solitaire, and in
particular its safe attractor variant, is able to determine the solution the fastest.
Also, for all entries in Table 2 this is the solver with the least overhead. Next, we
observe that for cases such as WMS-1 and PDI-3 using the safe attractor variants
of the solvers can be detrimental. Our observation is that first computing safe
sets (especially using chaining) can be quick when most vertices are owned by
one player and carry one priority, whereas the computation of the safe attractor,
which uses the more involved safe control predecessor, is more costly in such cases.
There are also cases (WMS-3, WMS-4, CCP-1 and CCP-2) where the safe attractor
variants are faster; these cases all have multiple priorities. In cases where
these solvers are slow (for example PDI-3) we also observe that more states are
explored before termination, because the earlier mentioned time-based heuristic
results in calling the solver significantly less frequently.
For parity games SWP-2 and WMS-3 only fatal and partial are able to find
a solution early, which shows that more powerful partial solvers can be useful.
From Table 2 and the cases in which the safe attractor variants perform poorly
we learn that the partial solvers can, as expected, cause overhead. In our benchmarks
this overhead is on average 30 percent, but when a partial solver terminates early it can
be very beneficial, achieving speed-ups of up to several orders of magnitude.
6 Conclusion
In this work we have developed the theory to reason about on-the-fly solving
of parity games, independent of the strategy that is used to explore games. We
have introduced the notion of safe vertices, shown their correctness, proven an
optimality result, and we have studied partial solvers and shown that these can
be made to run without determining the safe vertices first, which is useful
for on-the-fly solving. Finally, we have demonstrated the practical purpose of our
method and observed that solitaire winning cycle detection with safe attractors
is almost always beneficial with minimal overhead, but also that more powerful
partial solvers can be useful.
Based on our experiments, one can make an educated guess which partial
solver to select in particular cases; we believe that this selection could even be
steered by analysing the parameterised Boolean equation system representing the
parity game. It would furthermore be interesting to study (practical) improve-
ments for the safe attractors, and their use in Zielonka’s recursive algorithm.
Acknowledgements We would like to thank Jeroen Meijer and Tom van Dijk
for their help regarding the Sylvan library when implementing our prototype.
This work was supported by the TOP Grants research programme with project
number 612.001.751 (AVVA), which is (partly) financed by the Dutch Research
Council (NWO).
References
1. Beer, I., Ben-David, S., Landver, A.: On-the-fly model checking of RCTL formulas.
In: Hu, A., Vardi, M. (eds.) CAV. LNCS, vol. 1427, pp. 184–194. Springer (1998).
https://doi.org/10.1007/BFb0028744
2. Benerecetti, M., Dell’Erba, D., Mogavero, F.: Solving parity games via
priority promotion. Formal Methods Syst. Des. 52(2), 193–226 (2018).
https://doi.org/10.1007/s10703-018-0315-1
3. Blom, S., Groote, J.F., Mauw, S., Serebrenik, A.: Analysing the BKE-security
protocol with µCRL. Electron. Notes Theor. Comput. Sci. 139(1), 49–90 (2005).
https://doi.org/10.1016/j.entcs.2005.09.005
4. Blom, S., van de Pol, J., Weber, M.: LTSmin: Distributed and symbolic reachability.
In: Touili, T., Cook, B., Jackson, P.B. (eds.) CAV. LNCS, vol. 6174, pp. 354–359.
Springer (2010). https://doi.org/10.1007/978-3-642-14295-6_31
5. Bryant, R.E.: Symbolic Boolean manipulation with ordered binary-
decision diagrams. ACM Comput. Surv. 24(3), 293–318 (1992).
https://doi.org/10.1145/136035.136043
6. Bunte, O., Groote, J.F., Keiren, J.J.A., Laveaux, M., Neele, T., de Vink, E.P.,
Wesselink, W., Wijs, A., Willemse, T.A.C.: The mCRL2 toolset for analysing
concurrent systems - improvements in expressivity and usability. In: Vojnar,
T., Zhang, L. (eds.) TACAS. LNCS, vol. 11428, pp. 21–39. Springer (2019).
https://doi.org/10.1007/978-3-030-17465-1_2
7. Calude, C.S., Jain, S., Khoussainov, B., Li, W., Stephan, F.: Deciding parity games
in quasipolynomial time. In: Hatami, H., McKenzie, P., King, V. (eds.) STOC. pp.
252–263. ACM (2017). https://doi.org/10.1145/3055399.3055409
8. Chen, T., Ploeger, B., van de Pol, J., Willemse, T.A.C.: Equivalence checking
for infinite systems using parameterized Boolean equation systems. In: Caires, L.,
Vasconcelos, V.T. (eds.) CONCUR. LNCS, vol. 4703, pp. 120–135. Springer (2007).
https://doi.org/10.1007/978-3-540-74407-8_9
9. Ciardo, G., Marmorstein, R.M., Siminiceanu, R.: The saturation algorithm for
symbolic state-space exploration. Int. J. Softw. Tools Technol. Transf. 8(1), 4–25
(2006). https://doi.org/10.1007/s10009-005-0188-7
10. Cleaveland, R., Klein, M., Steffen, B.: Faster model checking for the modal mu-
calculus. In: von Bochmann, G., Probst, D.K. (eds.) CAV. LNCS, vol. 663, pp.
410–422. Springer (1992). https://doi.org/10.1007/3-540-56496-9_32
11. Cranen, S., Luttik, B., Willemse, T.A.C.: Proof graphs for parameterised Boolean
equation systems. In: D’Argenio, P.R., Melgratti, H.C. (eds.) CONCUR. LNCS,
vol. 8052, pp. 470–484. Springer (2013). https://doi.org/10.1007/978-3-642-40184-8_33
12. van Dijk, T.: Oink: An implementation and evaluation of modern parity game
solvers. In: Beyer, D., Huisman, M. (eds.) TACAS. LNCS, vol. 10805, pp. 291–308.
Springer (2018). https://doi.org/10.1007/978-3-319-89960-2_16
13. van Dijk, T., van de Pol, J.: Sylvan: multi-core framework for deci-
sion diagrams. Int. J. Softw. Tools Technol. Transf. 19(6), 675–696 (2017).
https://doi.org/10.1007/s10009-016-0433-2
14. Eiríksson, Á.T., McMillan, K.L.: Using formal verification/analysis methods on
the critical path in system design: A case study. In: Wolper, P. (ed.) CAV. LNCS,
vol. 939, pp. 367–380. Springer (1995). https://doi.org/10.1007/3-540-60045-0_63
15. Groote, J.F., Willemse, T.A.C.: Model-checking processes with data. Sci. Comput.
Program. 56(3), 251–273 (2005). https://doi.org/10.1016/j.scico.2004.08.002
On-The-Fly Solving for Symbolic Parity Games 153
16. Groote, J.F., Willemse, T.A.C.: Parameterised Boolean equation systems. Theor.
Comput. Sci. 343(3), 332–369 (2005). https://doi.org/10.1016/j.tcs.2005.06.016
17. Huth, M., Kuo, J.H., Piterman, N.: Fatal attractors in parity games. In:
Pfenning, F. (ed.) FOSSACS. LNCS, vol. 7794, pp. 34–49. Springer (2013).
https://doi.org/10.1007/978-3-642-37075-5_3
18. Jurdzi´nski, M., Lazi´c, R.: Succinct progress measures for solving
parity games. In: LICS. pp. 1–9. IEEE Computer Society (2017).
https://doi.org/10.1109/LICS.2017.8005092
19. Kant, G., van de Pol, J.: Efficient instantiation of parameterised
Boolean equation systems to parity games. In: Wijs, A., Bosnacki, D.,
Edelkamp, S. (eds.) GRAPHITE. EPTCS, vol. 99, pp. 50–65 (2012).
https://doi.org/10.4204/EPTCS.99.7
20. Kant, G., van de Pol, J.: Generating and solving symbolic parity games. In:
Bosnacki, D., Edelkamp, S., Lluch-Lafuente, A., Wijs, A. (eds.) GRAPHITE.
EPTCS, vol. 159, pp. 2–14 (2014). https://doi.org/10.4204/EPTCS.159.2
21. Laveaux, M.: Downloadable sources for the case study (2022).
https://doi.org/10.5281/zenodo.5896966
22. Laveaux, M., Wesselink, W., Willemse, T.A.C.: On-the-fly solving for symbolic
parity games. CoRR abs/2201.09607 (2022), https://arxiv.org/abs/2201.09607
23. Mateescu, R., Sighireanu, M.: Efficient on-the-fly model-checking for regular
alternation-free mu-calculus. Sci. Comput. Program. 46(3), 255–281 (2003).
https://doi.org/10.1016/S0167-6423(02)00094-1
24. McNaughton, R.: Infinite games played on finite graphs. Ann. Pure Appl. Logic
65(2), 149–184 (1993). https://doi.org/10.1016/0168-0072(93)90036-D
25. Miller, D.M.: Multiple-valued logic design tools. In: ISMVL. pp. 2–11. IEEE Com-
puter Society (1993). https://doi.org/10.1109/ISMVL.1993.289589
26. Pang, J., Fokkink, W.J., Hofman, R.F.H., Veldema, R.: Model checking a cache
coherence protocol of a java DSM implementation. J. Log. Algebraic Methods
Program. 71(1), 1–43 (2007). https://doi.org/10.1016/j.jlap.2006.08.007
27. Remenska, D., Willemse, T.A.C., Verstoep, K., Templon, J., Bal, H.E.:
Using model checking to analyze the system behavior of the LHC
production grid. Future Gener. Comput. Syst. 29(8), 2239–2251 (2013).
https://doi.org/10.1016/j.future.2013.06.004
28. Sanchez, L., Wesselink, W., Willemse, T.A.C.: A comparison of BDD-based par-
ity game solvers. In: Orlandini, A., Zimmermann, M. (eds.) GandALF. EPTCS,
vol. 277, pp. 103–117 (2018). https://doi.org/10.4204/EPTCS.277.8
29. Stasio, A.D., Murano, A., Vardi, M.Y.: Solving parity games: Explicit vs symbolic.
In: ampeanu, C. (ed.) CIAA. LNCS, vol. 10977, pp. 159–172. Springer (2018).
https://doi.org/10.1007/978-3-319-94812-6 14
30. Tanenbaum, A.S., Wetherall, D.: Computer networks, 5th Edition. Pearson (2011),
https://www.worldcat.org/oclc/698581231
31. Zielonka, W.: Infinite games on finitely coloured graphs with applications to
automata on infinite trees. Theor. Comput. Sci. 200(1-2), 135–183 (1998).
https://doi.org/10.1016/S0304-3975(98)00009-7
154 M. Laveaux, W. Wesselink and T.A.C. Willemse
Open Access This chapter is licensed under the terms of the Creative Commons
Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/),
which permits use, sharing, adaptation, distribution and reproduction in any medium
or format, as long as you give appropriate credit to the original author(s) and the
source, provide a link to the Creative Commons license and indicate if changes were
made.
The images or other third party material in this chapter are included in the chapter’s
Creative Commons license, unless indicated otherwise in a credit line to the material. If
material is not included in the chapter’s Creative Commons license and your intended
use is not permitted by statutory regulation or exceeds the permitted use, you will need
to obtain permission directly from the copyright holder.
On-The-Fly Solving for Symbolic Parity Games 155
Equivalence Checking
Distributed Coalgebraic Partition Refinement
Fabian Birkmann, Hans-Peter Deifel*, and Stefan Milius**
Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
{fabian.birkmann,hans-peter.deifel,stefan.milius}@fau.de
Abstract. Partition refinement is a method for minimizing automata and transition systems of various types. Recently we have developed a partition refinement algorithm and the tool CoPaR that is generic in the transition type of the input system and matches the theoretical run time of the best known algorithms for many concrete system types. Genericity is achieved by modelling transition types as functors on sets and systems as coalgebras. Experimentation has shown that memory consumption is a bottleneck for handling systems with a large state space, while running times are fast. We have therefore extended an algorithm due to Blom and Orzan, which is suitable for a distributed implementation, to the coalgebraic level of genericity, and implemented it in CoPaR. Experiments show that this allows us to handle much larger state spaces. Running times are low in most experiments, but there is a significant penalty for some.
1 Introduction
Minimization is an important and basic algorithmic task on state-based systems,
concerned with reducing the state space as much as possible while retaining
the system’s behaviour. It is used for equivalence checking of systems and as a
subtask in model checking tools in order to handle larger state spaces and thus
mitigate the state-explosion problem.
We focus on the task of identifying behaviourally equivalent states modulo bisimilarity. For classic labelled transition systems this notion obeys the principle ‘states s and t are bisimilar if for every transition s −a→ s′, there exists a transition t −a→ t′ with s′ and t′ bisimilar’, and symmetrically for transitions from t. Bisimilarity is a rather fine-grained branching-time notion of equivalence (cf. [17]); it is widely used and preserves all properties expressible as µ-calculus formulas. Moreover, it has been generalized to yield equivalence notions for many other types of state-based systems and automata.
Due to the above principle, bisimilarity is defined by a fixed point, to be understood as a greatest fixed point, and is hence approximable from above. This is used by partition refinement algorithms: the initial partition considers all states tentatively equivalent and is then iteratively refined using observations
* Supported by the Deutsche Forschungsgemeinschaft (DFG) within the Research and Training Group 2475 “Cybercrime and Forensic Computing” (393541319/GRK2475/1-2019)
** Supported by Deutsche Forschungsgemeinschaft (DFG) under project MI 717/7-1.
about the states until a fixed point is reached. Consequently, such procedures
run in polynomial time and can also be efficiently implemented, in contrast to
coarser system equivalences such as trace equivalence and language equivalence
of nondeterministic systems which are PSPACE-complete [23]. This makes mini-
mization under bisimilarity interesting even in cases where the main equivalence
is linear-time, such as for automata.
Efficient partition refinement algorithms exist for various systems: Kanellakis and Smolka provide a minimization algorithm with run time O(m·n) for labelled transition systems with n states and m transitions. Even faster algorithms have been developed over the past 50 years for many types of systems. For example, Hopcroft’s algorithm for minimizing deterministic automata has run time in O(n·log n) [21]; it was later generalized to variable input alphabets, with run time O(n·|A|·log n) [18,24]. The Paige-Tarjan algorithm minimizes transition systems in time O((m+n)·log n) [31], and generalizations to labelled transition systems have the same time complexity [13,22,36]. For the minimization of weighted systems (a.k.a. lumping), Valmari and Franceschinis [38] have developed a simple O((m+n)·log n) algorithm for systems with rational weights. Buchholz [10] gave an algorithm for weighted automata, and Högberg et al. [20] one for (bottom-up) weighted tree automata, both with run time in O(m·n).
In previous work [16,42], we have provided an efficient partition refinement algorithm, which is generic in the system type, captures all the above system types, and matches or, in some cases, even improves on the run time complexity of the respective specialized algorithms. Subsequently, we have shown how to extend the generic complexity analysis to weighted tree automata and implemented the algorithm in the tool CoPaR [11,41], again matching the previous best run time complexity and improving it in the case of weighted tree automata with weights from a non-cancellative monoid. The algorithm is based on ideas of Paige and Tarjan, which leads to its efficiency. Genericity is achieved by modelling state-based systems as coalgebras, following the paradigm of universal coalgebra [34], in which the transition structure of systems is encapsulated by a set functor. The algorithm and tool are modular in the sense that functors can be built from a preimplemented set of basic functors by standard set constructions such as cartesian product, disjoint union and functor composition. The tool then automatically derives a parser for input coalgebras of the composed type and provides a corresponding partition refinement implementation off the shelf. In addition, new basic functors F may easily be added to the set of basic functors by implementing a simple refinement interface for them plus a parser for encoded F-coalgebras. Our experiments with the tool have shown that run time scales well with the size of systems. However, memory usage becomes a bottleneck with growing system size, a problem that has previously also been observed by Valmari [37] for partition refinement. One strategy to address this is to distribute the algorithm across multiple computers, which store and process only a part of the state space and communicate via message passing. For ordinary labelled transition systems and Markov systems this has been investigated in a series of papers by Blom and Orzan [4–9], who were also motivated to mitigate the memory bottleneck of sequential partition refinement algorithms.
Our contribution in this paper is an extension of CoPaR by an efficient distributed partition refinement algorithm in coalgebraic generality. Like in Blom and Orzan’s work, our algorithm is a distributed version of a simple but effective algorithm called “the naive method” [23], or “the final chain algorithm” in coalgebraic generality [25,42]. We first generalize signature refinement introduced by Blom and Orzan to the level of coalgebras. We also combine generalized signatures (Section 3) with the previous encodings of set functors and their coalgebras [11,41] via the new notion of a signature interface (Definition 3.1). This is a key idea to make coalgebraic signature refinement and the final chain algorithm implementable in a tool like CoPaR. In addition, we demonstrate how signature interfaces of functors can be combined (Construction 3.3 and Proposition 3.4) along standard functor constructions. This yields a modularity principle similar to that of the previous sequential algorithm. However, this is a new feature for signature refinement and also, to our knowledge, for the final chain algorithm. Consequently, our distributed, modular and generic implementation of the final chain algorithm is new (already as a sequential algorithm).
We also provide experiments demonstrating its scalability and show that much larger state spaces can indeed be handled. Our benchmarks include weighted tree automata for non-cancellative monoids, a type of system for which our previous sequential implementation is heavily limited by its memory requirements. For those systems the running times of the distributed algorithm are even faster than those of the sequential algorithm. In a second set of benchmarks stemming from the PRISM benchmark suite [27] we again show that larger systems can now be handled; however, for some of these there is a penalty in run time.
Related work. Balcazar et al. [1] have proved that the problem of bisimilarity checking for labelled transition systems is P-complete, which implies that it is hard to parallelize efficiently. Nevertheless, parallel algorithms have been proposed by Rajasekaran and Lee [33]. These are designed for shared-memory machines and hence do not distribute RAM requirements over multiple machines.
Symbolic techniques are an orthogonal approach to reduce the memory usage of partition refinement algorithms and have been explored e.g. by Wimmer et al. [40] and van Dijk and van de Pol [15].
Two other orthogonal extensions of generic coalgebraic minimization and CoPaR have been presented in recent work. First, a non-trivial extension computes (1) reachable states and (2) the transition structure of the minimized systems [12]. Second, Wißmann et al. [43] have shown how to compute distinguishing formulas in a Hennessy-Milner style logic for a pair of behaviourally inequivalent states.
2 Preliminaries
Our algorithmic framework and the tool CoPaR [41,42] are based on modelling state-based systems abstractly as coalgebras for a (set) functor that encapsulates the transition type, following the paradigm of universal coalgebra [34]. We now recall some standard notations for sets and maps and basic notions and examples in coalgebra. We fix a singleton set 1 = {∗}; for every set X we have a unique map ! : X → 1 and the identity map idX : X → X. We denote composition of maps by (−)·(−), in applicative order. Given maps f : X → A, g : X → B, we define ⟨f, g⟩ : X → A×B by ⟨f, g⟩(x) = (f(x), g(x)). The type of transitions of states in a system is modelled by a set functor F. Informally, F assigns to every set X a set FX of structured collections of elements of X, and an F-coalgebra is a map c : S → FS which assigns to every state s ∈ S in a system a structured collection c(s) ∈ FS of successor states of s. The functor F also determines a canonical notion of behavioural equivalence of states of a coalgebra; this arises by stipulating that morphisms of coalgebras are behaviour-preserving maps.
Definition 2.1. A functor F : Set → Set assigns to each set X a set FX and to each map f : X → Y a map Ff : FX → FY, preserving identities and composition (F idX = idFX, F(g·f) = Fg·Ff). An F-coalgebra (S, c) consists of a set S of states and a transition structure c : S → FS. A morphism h : (S, c) → (S′, c′) of F-coalgebras is a map h : S → S′ that preserves the transition structure, i.e. Fh·c = c′·h. Two states s, t ∈ S of a coalgebra c : S → FS are behaviourally equivalent (s ∼ t) if there exists a coalgebra morphism h with h(s) = h(t).
Example 2.2. We mention several types of systems which are instances of the general notion of coalgebra and the ensuing notion of behavioural equivalence. All these are possible input systems for our tool CoPaR.
(1) Transition systems. The finite powerset functor Pω maps a set X to the set PωX of all finite subsets of X, and a map f : X → Y to the map Pωf = f[−] : PωX → PωY taking direct images. Coalgebras for Pω are finitely branching (unlabelled) transition systems. Two states are behaviourally equivalent iff they are (strongly) bisimilar in the sense of Milner [29,30] and Park [32]. Similarly, finitely branching labelled transition systems with label alphabet A are coalgebras for the functor FX = Pω(A×X).
(2) Deterministic automata. For an input alphabet A, the functor given by FX = 2×X^A, where 2 = {0,1}, sends a set X to the set of pairs of boolean values and functions A → X. An F-coalgebra (S, c) is a deterministic automaton (without an initial state). For each state s ∈ S, the first component of c(s) determines whether s is a final state, and the second component is the successor function A → S mapping each input letter a ∈ A to the successor state of s under input letter a. States s, t ∈ S are behaviourally equivalent iff they accept the same language in the usual sense.
(3) Weighted tree automata simultaneously generalize tree automata and weighted (word) automata. Inputs of such automata stem from a finite signature Σ, i.e. a finite set of input symbols, each with a prescribed natural number, its arity. Weights are taken from a commutative monoid (M, +, 0). A (bottom-up) weighted tree automaton (WTA) (over M with inputs from Σ) consists of a finite set S of states, an output map f : S → M, and for each k ≥ 0, a transition map µk : Σk → M^(S^k × S), where Σk denotes the set of k-ary input symbols in Σ; the maximum arity of symbols in Σ is called the rank.
Every signature Σ gives rise to its associated polynomial functor, also denoted Σ, which assigns to a set X the set ∐_{n∈N} Σn × X^n, where ∐ denotes disjoint union (coproduct). Further, for a given monoid (M, +, 0) the monoid-valued functor M^(−) sends a set X to the set of maps f : X → M that are finitely supported, i.e. f(x) = 0 for almost all x ∈ X. Given a map f : X → Y, M^(f) : M^(X) → M^(Y) sends a map v : X → M in M^(X) to the map y ↦ Σ_{x∈X, f(x)=y} v(x), corresponding to the standard image measure construction (a small Haskell sketch of this action is given after this example).
Weighted tree automata are coalgebras for the composite functor FX = M × M^(ΣX); indeed, given a coalgebra c = ⟨c1, c2⟩ : S → M × M^(ΣS), its first component c1 is the output map, and the second component c2 is equivalent to the family of transition maps µk described above.
As proven by Wißmann et al. [41, Prop. 6.6], the coalgebraic behavioural equivalence is precisely backward bisimulation of weighted tree automata as introduced by Högberg et al. [20, Def. 16].
(4) The bag functor B : Set → Set sends a set X to the set of all finite multisets (or bags) over X. This is the special case of the monoid-valued functor for the monoid (N, +, 0). Accordingly, B-coalgebras are weighted transition systems with positive integers as weights, or they may be regarded as finitely branching transition systems where multiple transitions between a pair of states are allowed. Behavioural equivalence coincides with weighted (or strong) bisimilarity.
(5) Markov chains. The finite distribution functor Dω is a subfunctor of the monoid-valued functor R^(−) for the usual monoid of addition on the real numbers. It maps a set X to the set of all finite probability distributions on X. That means that DωX is the set of all finitely supported maps d : X → [0,1] such that Σ_{x∈X} d(x) = 1. The action of Dω on maps is the same as that of R^(−).
As shown by Rutten and de Vink [35], coalgebras c : S → (DωS + 1)^A are precisely Larsen and Skou’s probabilistic transition systems [28] (a.k.a. labelled Markov chains [14]) with the label alphabet A. In fact, for each state s ∈ S and action label a ∈ A, that state either cannot perform an a-action (when c(s)(a) ∈ 1) or the distribution c(s)(a) determines for every state t ∈ S the probability with which s transitions to t with an a-action.
Coalgebraic behavioural equivalence is precisely probabilistic bisimilarity in the sense of Larsen and Skou, see Rutten and de Vink [35, Cor. 4.7].
(6) Markov decision processes are systems which feature both non-deterministic and probabilistic branching. They are coalgebras for composite functors such as Pω(A × Dω(−)) or Pω(Dω(A × (−))) (simple/general Segala systems); Bartels et al. [2] list further functors for various species of probabilistic systems.
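The image-measure action of the monoid-valued functor from Example 2.2(3) is easy to make concrete. The following Haskell sketch is our own illustration (not CoPaR code); it represents finitely supported maps as finite dictionaries and uses a Num constraint merely as a stand-in for an arbitrary commutative monoid:

import qualified Data.Map.Strict as Map

-- A finitely supported map X -> M; absent keys denote the monoid unit 0.
type MVal x m = Map.Map x m

-- The action M^(f) on a map f : X -> Y: the weight of y is the sum
-- of the weights of all its preimages, as in the formula above.
mMap :: (Ord y, Num m) => (x -> y) -> MVal x m -> MVal y m
mMap f = Map.fromListWith (+) . map (\(x, w) -> (f x, w)) . Map.toList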
Encodings. To supply coalgebras as inputs to CoPaR, and in order to speak about the size of a coalgebra in terms of states and transitions, we need
Definition 2.3 [12, Def. 3.1]. An encoding of a set functor F consists of a set A of labels and a family of maps ♭X : FX → B(A×X), one for every set X, such that the map ⟨F!, ♭X⟩ : FX → F1 × B(A×X) is injective.
The encoding of a coalgebra c : S → FS is ⟨F!, ♭S⟩·c : S → F1 × B(A×S). For s ∈ S we write s −a→ t whenever (a, t) is contained in the bag ♭S(c(s)). The number of states and edges of a given encoded input coalgebra are n = |S| and m = Σ_{s∈S} |♭S(c(s))|, respectively, where |b| = Σ_{x∈X} b(x) for a bag b : X → N.
An encoding of a set functor F specifies how F-coalgebras are represented as directed graphs, and the required injectivity ensures that different coalgebras have different encodings.
Example 2.4. We recall a few key examples of encodings used by CoPaR [42]; for the required injectivity, see [12, Prop. 3.3].
(1) For the finite powerset functor Pω one takes a singleton label set A = 1 and ♭X : PωX → B(1×X) is the obvious inclusion: ♭X(U)(∗, x) = 1 iff x ∈ U ⊆ X.
(2) For the monoid-valued functor M^(−) we take labels A = M, and the map ♭X : M^(X) → B(M×X) is given by ♭X(t)(m, x) = 1 if t(x) = m ≠ 0 and 0 else.
(3) As a special case, the bag functor B has labels A = N, and the map ♭X : BX → B(N×X) is given by ♭X(t)(n, x) = 1 if t(x) = n and 0 else (a Haskell sketch follows below).
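To make Definition 2.3 concrete, the first and third encodings of Example 2.4 can be sketched in Haskell as follows. This is our own illustration (the names are ours, not CoPaR's internal code), with bags represented simply as lists of label–successor pairs:

import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Powerset encoding (Example 2.4(1)): the label set A = 1 is (),
-- and the bag contains ((), x) once for each element x of the subset.
flatP :: Set.Set x -> [((), x)]
flatP u = [ ((), x) | x <- Set.toList u ]

-- Bag-functor encoding (Example 2.4(3)): labels are multiplicities;
-- there is an entry (n, x) whenever x has non-zero multiplicity n.
flatB :: Map.Map x Int -> [(Int, x)]
flatB t = [ (n, x) | (x, n) <- Map.toList t, n /= 0 ]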
Remark 2.5. (1) Readers familiar with category theory may wonder about the naturality of the encodings ♭X. It turns out [12] that in almost all instances, our encodings are not natural transformations, except for polynomial functors. As shown in op. cit., all our encodings satisfy a property called uniformity, which implies that they are subnatural transformations [12, Prop. 3.15].
(2) Having an encoding of a set functor F does not imply a reduction of the problem of minimizing F-coalgebras to that of coalgebras for B(A × (−)). In fact, the behavioural equivalence of F-coalgebras and coalgebras for B(A × (−)) may be very different unless ♭X is natural, which is not the case for most encodings.
Functors in CoPaR can be combined by product, coproduct or composition, leading to modularity. But in order to automatically handle combined functors, our tool crucially depends on the ability to form products and coproducts of encodings [41,42]. We refrain from going into technical details, but note for further use that given a pair of functors F1, F2 with encodings Ai, ♭X,i one obtains encodings for the functors F1×F2 (cartesian product) and F1+F2 (disjoint union) with the label set A = A1 + A2.
Input syntax and processing. We briefly recall the input format of CoPaR and how inputs are processed; for more details see [41, Sec. 3.1]. CoPaR accepts input files representing a finite F-coalgebra. The first line of an input file specifies the functor F, which is written as a term according to the following grammar:
    T ::= X | Pω T | B T | Dω T | M^(T) | Σ
    Σ ::= C | T + T | T × T | T^A        C ::= N | A        A ::= {s1, . . . , sn} | n        (1)
where n ∈ N denotes the set {0, . . . , n−1}, the sk are strings subject to the usual conventions for variable names (a letter or an underscore character followed by alphanumeric characters or underscores), exponents F^A are written F^A with a caret, and M is one of the monoids (Z,+,0), (R,+,0), (C,+,0), (Pω(64),∪,∅) (the monoid of 64-bit words with bitwise or), and (N,max,0) (the additive monoid of the tropical semiring). Note that C effectively ranges over at most countable sets, and A over finite sets. A term T determines a functor F : Set → Set in the evident way, with X interpreted as the argument.
The remaining lines of an input file specify a finite coalgebra c : S → FS. Each line has the form s: t for a state s ∈ S, where t represents the element c(s) ∈ FS. The syntax for t depends on the specified functor F and follows the structure of the term T defining F; the details are explained in [41, Sec. 3.1.2]. Fig. 1 from op. cit. shows two coalgebras and the corresponding input files.
Fig. 1: Examples of input files with encoded coalgebras [41].
(a) Markov chain (functor D X):
    q: {p: 0.5, r: 0.5}
    p: {q: 0.4, r: 0.6}
    r: {r: 1}
(b) Deterministic finite automaton (functor {f,n} x X^{a,b}):
    q: (n, {a: p, b: r})
    p: (n, {a: q, b: r})
    r: (f, {a: q, b: p})
[The original figure additionally draws both systems as labelled state graphs.]
After reading the functor term T, CoPaR builds a parser for the functor-specific input format and then parses the input coalgebra given in that format into an intermediate format which internally represents the encoding of the input coalgebra (Definition 2.3). For composite functors the parsed coalgebra then undergoes a substantial amount of preprocessing, which also affects how transitions are counted; see [41, Sec. 3.5] for more details.
3 Coalgebraic Partition Refinement
As mentioned in the introduction, the sequential partition refinement algorithm
previously implemented in CoPaR is based on ideas used in the Paige-Tarjan
algorithm [31] for transition systems. However, as has been mentioned by Blom
and Orzan [8], the Paige-Tarjan algorithm carefully selects the block of states to
split in each iteration, and the data structures used for this selection take a lot of
memory and require modification to allow a distributed implementation. Hence,
Blom and Orzan have built their distributed algorithm from a rather simple
sequential partition refinement algorithm based on what Kanellakis and Smolka
refer to as the naive method [23]. We now recall this algorithm and subsequently
show how it can be adapted to the coalgebraic level of generality.
Signature Refinement. Given a finite labelled transition system with state set S, a partition on S may be presented by a function π : S → N, i.e. two states s, t ∈ S lie in the same block of the partition iff π(s) = π(t). The signature of a state s ∈ S is the set of outgoing transitions to blocks of π:
    sigπ(s) = {(a, π(t)) | s −a→ t} ⊆ Pω(A×N).        (2)
A signature refinement step then refines π by putting s, t ∈ S into different blocks iff sigπ(s) ≠ sigπ(t). Concretely, we put πnew(s) = hash(sigπ(s)) using a perfect, deterministic hash function hash. The signature refinement algorithm (Fig. 2) starts with a trivial initial partition on S and repeats the refinement step until the partition stabilizes, i.e. until two subsequent partitions have the same size.
Coalgebraic Signature Refinement. Regarding a labelled transition system as a coalgebra c : S → Pω(A×S) (Example 2.2(1)), signatures are obtained by postcomposing the transition structure with the partition under the functor:
    sigπ = ( S −c→ Pω(A×S) −Pω(A×π)−→ Pω(A×N) ).        (3)
Variables: old and new partitions represented by π, πnew : S → N with sizes l, lnew, resp.; set H for counting block numbers
 1 foreach s ∈ S do
 2   πnew(s) ← 0;
 3 end
 4 lnew ← 1;
 5 while l ≠ lnew do
 6   π ← πnew; H ← ∅;
 7   foreach s ∈ S do
 8     πnew(s) ← hash(sigπ(s));
 9     H ← H ∪ {πnew(s)};
10   end
11   l ← lnew;
12   lnew ← |H|;
13 end
Fig. 2: Signature refinement for labelled transition systems
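The loop of Fig. 2 transcribes almost literally into Haskell. The following sketch is our own illustration, not CoPaR code: an LTS is given as a list of labelled edges (all edge targets are assumed to occur in the state list), and the signature itself is used in place of a perfect hash, which is semantically equivalent:

import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type State = Int
type Lab   = Char
type LTS   = [(State, Lab, State)]

-- sig_pi(s) = { (a, pi(t)) | s -a-> t }, cf. equation (2).
sigOf :: LTS -> Map.Map State Int -> State -> Set.Set (Lab, Int)
sigOf lts part s = Set.fromList [ (a, part Map.! t) | (s', a, t) <- lts, s' == s ]

-- Repeat the refinement step until two subsequent partitions
-- have the same number of blocks.
refine :: LTS -> [State] -> Map.Map State Int
refine lts states = go (Map.fromList [ (s, 0) | s <- states ])
  where
    blocksOf part = Set.size (Set.fromList (Map.elems part))
    go part =
      let sigs  = Map.fromList [ (s, sigOf lts part s) | s <- states ]
          ids   = Map.fromList (zip (Set.toList (Set.fromList (Map.elems sigs))) [0 ..])
          part' = Map.map (ids Map.!) sigs
      in if blocksOf part' == blocksOf part then part' else go part'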
The generalisation to coalgebras for arbitrary F is immediate: the signature of a state of an F-coalgebra c : S → FS w.r.t. a partition π is given by the function sigπ = Fπ·c. In the refinement step of the above algorithm two states are identified by the next partition if they have the same signatures currently:
    πnew(s) = πnew(t) ⟺ sigπ(s) = sigπ(t) ⟺ (Fπ)(c(s)) = (Fπ)(c(t)).        (4)
Hence, the algorithm in fact simply applies F(−)·c to the initial partition corresponding to the trivial quotient ! : S → 1 until stability is reached. Note that this is precisely the Final Chain Algorithm of König and Küpper [25, Alg. 3.2] computing behavioural equivalence of a given F-coalgebra. Its correctness thus proves correctness of coalgebraic signature refinement, which is the algorithm in Fig. 2 with sigπ = Fπ·c. Since we represent functors and their coalgebras by encodings, we use an interface to F to compute signatures based on encodings.
Definition 3.1. Given a functor F with encoding A, ♭X, a signature interface consists of a function sig : F1 × B(A×N) → FN such that for every finite set S and every partition π : S → N we have
    Fπ = ( FS −⟨F!, ♭S⟩−→ F1 × B(A×S) −F1×B(A×π)−→ F1 × B(A×N) −sig−→ FN ).        (5)
Given a coalgebra c : S → FS, a state s ∈ S and a partition π : S → N, the two arguments of sig should be understood as follows. The first argument is the value F!(c(s)) ∈ F1, which intuitively provides an observable output of the state s. The second argument is the bag B(A×π)(♭S(c(s))) formed by those pairs (a, n) of labels a and numbers n of blocks of the partition π to which s has an edge; that is, that bag contains one pair (a, n) for each edge s −a→ s′ where π(s′) = n. Thus, when supplied with these inputs, sig correctly computes the signature of s; indeed, to see this, precompose equation (5) with the coalgebra structure c.
Example 3.2. (1) The constant functor C has the label set A = ∅, so we have B(∅ × N) ≅ 1, and we define the function sig : C × B(∅×N) → C by sig(c, −) = c.
(2) The powerset functor Pω has the label set A = 1, and we define the function sig : Pω1 × B(1×N) → PωN by sig(z, b) = {n : b(∗, n) ≠ 0}.
(3) The monoid-valued functor R^(−) has the label set A = R, and we define the function sig : R × B(R×N) → R^(N) by sig(z, b)(n) = Σ{r | b(r, n) ≠ 0}.
Next we show how signature interfaces can be combined by products (×) and coproducts (+). This is the key to the modularity of the implementation (be it distributed or sequential) of coalgebraic signature refinement in CoPaR.
Construction 3.3. Given a pair of functors F1, F2 with encodings Ai, ♭X,i and signature interfaces sigi, we put A = A1 + A2 and define the following functions:
(1) for the product functor F = F1 × F2 we take sig : F1 × B(A×N) → F1N × F2N,
        sig(t, b) = ( sig1(pr1(t), filter1(b)), sig2(pr2(t), filter2(b)) ).
Here, pri : F1 → Fi1 is the projection map and filteri : B(A×N) → B(Ai×N) is given by filteri(b)(a, n) = b(ini a, n), where ini : Ai → A is the injection map.
(2) for the coproduct functor F = F1 + F2 we take
        sig : F1 × B(A×N) → F1N + F2N,    sig(ini t, b) = ini(sigi(t, filteri(b))).
Proposition 3.4. The functions sig defined in Construction 3.3 yield signature interfaces for the functors F1 × F2 and F1 + F2, respectively.
As a consequence of this result, it suffices to implement signature interfaces only for basic functors according to the grammar in (1), i.e. the trivial identity and constant functors as well as the functors Pω, B, Dω and the supported monoid-valued functors M^(−). Signature interfaces of products, coproducts and exponents, the latter being a special form of product, are derived using Construction 3.3. Functor composition can be reduced to these constructions by a technique called desorting [42, Sec. 8.2], which transforms a coalgebra of a composite functor into a coalgebra for a coproduct of basic functors whose signature interfaces can then be combined by + (see also [41, Sec. 3.5]). As for the previous Paige-Tarjan-style algorithm, this leads to the modularity in the functor of the coalgebraic signature refinement algorithm: signature interfaces for composed functors are automatically derived in CoPaR. Moreover, a new basic functor F may be added by implementing a signature interface for F, effectively extending the grammar of supported functors in (1) by a clause F T.
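For illustration, Construction 3.3(1) can be sketched in Haskell as a combinator on signature functions. This is our own simplified rendering (CoPaR's actual code is organized around the type class shown in Section 4), with Either a1 a2 playing the role of the label set A = A1 + A2:

import Data.Either (partitionEithers)

-- Combine signature functions for F1 and F2 into one for F1 x F2.
-- Splitting the edge list by label component implements filter_1, filter_2.
sigProd :: (f1 -> [(a1, Int)] -> s1)       -- signature interface of F1
        -> (f2 -> [(a2, Int)] -> s2)       -- signature interface of F2
        -> (f1, f2) -> [(Either a1 a2, Int)] -> (s1, s2)
sigProd sig1 sig2 (t1, t2) edges = (sig1 t1 es1, sig2 t2 es2)
  where
    (es1, es2) = partitionEithers
      [ case lab of
          Left  a -> Left  (a, n)
          Right a -> Right (a, n)
      | (lab, n) <- edges ]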
4 The Distributed Algorithm
Our distributed algorithm for coalgebraic signature refinement is a generalization of Blom and Orzan’s original algorithm [8] to coalgebras. We highlight differences to op. cit. at the end of this section.
We assume a distributed high-bandwidth cluster of W workers w1, . . . , wW that is failure-free, i.e. nodes do not crash, messages do not get lost, and between two nodes the order of messages is preserved. The communication is based on non-blocking send operations and blocking receive operations. Messages are triples of the form (from, to, data), where the data field may be structured and will often contain a tag to simplify interpretation.
Description. The distributed algorithm is based on the sequential algorithm presented in Fig. 2, using a distributed hashtable to keep track of the partition. As for the sequential algorithm, the input consists of an F-coalgebra (S, c) with |S| = n states. We split the state space evenly among the workers as a preprocessing step. We write Si with |Si| = n/W for the set of states of worker wi. The input for worker wi is the encoding of that part of the transition structure of the input coalgebra which is needed to compute the signatures of the states in Si. This information is presented to wi as the list of all outgoing edges of states of Si in the encoding of the coalgebra (S, c), i.e. the list of all s −a→ t with s ∈ Si (cf. Definition 2.3). We refer to the block number π(s) of a state s ∈ S as its ID.
After processing the input, the algorithm runs in two phases. In the Initialization Phase (Fig. 3) the workers exchange update demands about the IDs stored in the distributed hashtable. If wi has an edge s −a→ s′ into some state s′ of wj, then during refinement wi needs to be kept up to date about the ID of s′ and thus instructs wj to do so. Worker wj remembers this information by storing wi in the set Ins′ = {wi | ∃s ∈ Si, a ∈ A. s −a→ s′} of incoming edges of s′ (lines 14–16). Hence, for each edge s −a→ s′ with s ∈ Si and s′ ∈ Sj, worker wi sends a message to wj, informing wj to add wi to Ins′ (lines 5–8).
Variables: set V of visited states; process count d; for each s ∈ Si a list Ins of workers with an edge into s
 1 V ← ∅; d ← 0;
 2 foreach s ∈ Si do
 3   Ins ← [];
 4 end
 5 foreach edge s → s′ of wi with s′ ∉ V do
 6   V ← V ∪ {s′};
 7   send(wi, wj, s′);
 8 end
 9 foreach 1 ≤ j ≤ W do
10   send(wi, wj, DONE);
11 end
12 waitFor(d = W);
13 return [Ins | s ∈ Si];
14 on receive (wk, wi, s) do
15   Ins ← (wk :: Ins);
16 end
17 on receive (_, _, DONE) do
18   d ← d + 1;
19 end
Fig. 3: Initialization Phase of worker wi
The main phase is the Refinement Phase (Fig. 4), mimicking the refinement loop of the undistributed algorithm. In each iteration all workers compute their part of the new partition, i.e. the IDs hs = hash(sigπ(s)) for each of their states s ∈ Si (line 5). In addition, every worker wi is responsible for sending the computed ID of s ∈ Si to workers in Ins that need it for computation of their own signatures in the next iteration (lines 6–9). The IDs are also sent to a designated worker counterOf(hs) (lines 10–12). This ensures that IDs are counted precisely once at the end of the round when the partition size is computed after all messages have been received (lines 14–17). The actual counting (line 19) is a
Variables: old, respectively new partitions π, πnew with sizes l, lnew; finished workers d; ID-counting set H
 1 πnew ← 0!; l ← 1; lnew ← 0; H ← ∅;
 2 while l ≠ lnew do
 3   l ← lnew; π ← πnew;
 4   foreach s ∈ Si do
 5     πnew(s) ← hash(sigπ(s));
 6     foreach wj ∈ Ins do
 7       send(wi, wj,
 8         ⟨UPD, s, πnew(s)⟩);
 9     end
10     send(wi,
11       counterOf(πnew(s)),
12       ⟨COUNT, πnew(s)⟩);
13   end
14   foreach 1 ≤ j ≤ W do
15     send(wi, wj, DONE);
16   end
17   waitFor(d = W);
18   l ← lnew;
19   lnew ← distribSum(sizeOf(H));
20   synchronize;
21 end
22 on receive (wk, wi, (UPD, s, hs)) do
23   πnew(s) ← hs;
24 end
25 on receive (wk, wi, (COUNT, hs)) do
26   H ← H ∪ {hs};
27 end
28 on receive (_, wi, DONE) do
29   d ← d + 1;
30 end
Fig. 4: Refinement Phase of worker wi
primitive operation in the MPI library; for an explicit O(log W) algorithm using messages see e.g. Blom and Orzan [8, Fig. 6]. Finally, the workers synchronize before starting the next iteration (line 20). The Refinement Phase stops if two consecutive partitions have the same size (line 2).
Correctness. The Initialization Phase (Fig. 3) terminates since every worker reaches line 10, sends DONE to all workers and thus also receives it (lines 17–19) a total of W times, allowing it to progress past line 12. An analogous argument proves termination of every iteration of the Refinement Phase (Fig. 4). The sequential algorithm is correct; hence we know the loop of the Refinement Phase terminates when all IDs are computed and counted correctly, since then the distributed and the sequential algorithm compute precisely the same partitions.
To show that the signatures are computed correctly, we note that if all DONE messages have been received in a round, then, by order-preservation of messages, all messages sent previously in this round have also been received. This ensures that no workers are missing from the lists Ins computed in the Initialization Phase and that during the Refinement Phase new IDs are sent to all concerned workers (Fig. 4, lines 6–8). This establishes correctness of the signature computation, and the signatures coincide on all workers since we assume that the hash function is deterministic. Finally, the use of the counterOf function (line 11) ensures that each ID is included in the counting set of exactly one worker. Thus, the distributed sum of the sizes of all counting sets is equal to the size of the partition.
Complexity. Let us assume that not only states, but also outgoing transitions are distributed evenly among the workers, i.e. every worker has about m/W outgoing transitions. In the Initialization Phase, the loop sending messages runs in O(m/W) and receiving takes O(W · n/W) = O(n), since for worker wi every other worker wj might have an edge into every state in Si. Both are executed in parallel, so in total the phase runs in O(max(m/W, n)) = O(m/W + n). In the Refinement Phase, we assume the run time of computing signatures and their hashes is linear in the number of edges. Then the loop for computing and hashing (O(m/W)) and counting (O(n/W)) signatures runs in total in O((m+n)/W), since it is performed by all workers independently. Each worker receives at most m/W ID-updates each round and the partition size is computable in O(W), giving the complexity of one refinement step as O((m+n)/W). As many as n iterations might be needed, for a total complexity of O(m/W + n) + n · O((n+m)/W) = O((mn + n²)/W + n).
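For a rough sense of scale: under these assumptions, a coalgebra with n = 10^6 states and m = 5·10^7 edges distributed over W = 32 workers incurs on the order of (m+n)/W ≈ 1.6·10^6 operations per worker and refinement round, so the per-round work shrinks linearly in W, while the worst-case number n of rounds does not.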
Remark 4.1. The above analysis assumes that signature interfaces are implemented with a run time linear in their input bag. This could in fact be theoretically realized for all basic functors (whence also for their combinations) currently implemented in CoPaR, which would involve using bucket sort for the grouping of bag elements by the target block (second component), e.g. for monoid-valued functors. However, since the table used in bucket sort would be very large (the size of the last partition) and memory-consciousness is our main motivation, we opted for an implementation using a standard n log n sorting algorithm instead.
Implementation details. CoPaR is implemented in Haskell. We were able to reuse, with only minor adjustments, major parts of the code base of CoPaR dedicated to the representation and processing of coalgebras. This includes the implemented functors and their encodings together with the corresponding parser and preprocessing algorithms (see Section 2). As explained in Section 3, the sequential Paige-Tarjan-style algorithm of CoPaR was not used; we implemented an additional “algorithmic frontend” to our “coalgebraic backend”. To compute signatures during the Refinement Phase, each functor implements the signature interface (Definition 3.1), which is written in Haskell as follows:

class Hashable (Signature f) => SignatureInterface f where
  type Signature f :: Type
  sig :: F1 f -> [(Label f, Int)] -> Signature f
We require in the second line a type Signature f that serves as an implementation-specific datatype representation of FN. In the type of sig, the types f, Label f and F1 f correspond to the name of F, its label type and the set F1, respectively.
Example 4.2. The Haskell implementation of the signature interface for the finite powerset functor Pω from Example 3.2(2) is as follows:

data P x = P x                 -- already defined in CoPaR
type instance Label P = ()     -- also already defined

instance SignatureInterface P where
  type Signature P = Set Int
  sig :: F1 P -> [((), Int)] -> Set Int
  sig _ = setFromList . map snd
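Along the same lines, a signature interface for the bag functor B could be sketched as follows. This is our own hypothetical rendering (CoPaR's actual instance may differ in representation); it sums the edge weights into each target block, as required by the action of Bπ:

import qualified Data.Map.Strict as Map

-- Hypothetical instance sketch for the bag functor:
--   type instance Label B = Int         (labels are multiplicities)
--   type Signature B = Map.Map Int Int  (a bag of block numbers)
sigB :: f1 -> [(Int, Int)] -> Map.Map Int Int
sigB _ edges = Map.fromListWith (+) [ (blk, w) | (w, blk) <- edges ]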
Signature interfaces for the other basic functors according to the grammar in (1) are implemented similarly. For combined functors CoPaR automatically derives their signature interface based on Construction 3.3.
In the algorithm itself, each worker runs three threads in parallel: the first thread is for computing, the second one for sending, and the third one for receiving signatures. This allows us to keep calls to the MPI interface separated from (pure) signature computation, simplifying the logic and allowing the workers to scatter the ID of one state while simultaneously computing the signature of the next one, so that neither signature computation nor network traffic becomes a bottleneck. For inter-thread communication and synchronization we rely on Haskell’s software transactional memory [19] to ease concurrent programming, e.g. to avoid race conditions.
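As a rough illustration of this thread structure (our own sketch, not CoPaR's actual code), Haskell's software transactional memory makes such a hand-off between a computing and a sending thread straightforward, e.g. via a transactional queue:

import Control.Concurrent (forkIO)
import Control.Concurrent.STM

main :: IO ()
main = do
  -- queue of (state, new ID) pairs, filled by the computing thread
  queue <- newTQueueIO :: IO (TQueue (Int, Int))
  _ <- forkIO $                        -- computing thread (dummy IDs here)
    mapM_ (atomically . writeTQueue queue) [ (s, s `mod` 3) | s <- [0 .. 9] ]
  -- sending thread: drain the queue (here it just prints; CoPaR would
  -- hand the updates over to MPI instead)
  mapM_ (\_ -> do (s, i) <- atomically (readTQueue queue)
                  putStrLn ("state " ++ show s ++ " -> block " ++ show i))
        [0 .. 9 :: Int]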
Comparison to Blom and Orzan’s algorithm. We now discuss a few differences between our algorithm and Blom and Orzan’s original one [8].
In Blom and Orzan’s algorithm for LTSs the sets Ins of s ∈ Si are in fact lists and contain worker wk a total of r times if there exist r edges from states in Sk to s. This induces a redundancy in messages of ID updates, since wi sends r (instead of one) messages with the ID of s to wk. If the LTS has an average fanout of f, then each worker has t = n/W · f outgoing transitions; this is the number of ID updates received every round. Since there are only n states, at most n/t = W/f of those messages are necessary. In our scenario, we have W ≪ f for large coalgebras, hence the overhead becomes massive; e.g. for W = 10, f = 100 already 90% of all ID messages are redundant. We use sets instead of lists for Ins to avoid this redundancy.
Signature computation and communication do not proceed simultaneously in Blom and Orzan’s original algorithm. However, in their optimized version [9] and in Blom et al.’s algorithm for state-labelled continuous-time Markov chains [4] they do.
Another difference of our implementation is that we decided to hash the signatures directly on the workers of the respective states, while Blom and Orzan decided to first send the signatures to some dedicated hashing worker which is then (uniquely) responsible for hashing, i.e. computing a new ID. This method allows new IDs to be computed in constant time. However, for the more complex functors supported by CoPaR, sending signatures could result in very large messages, so we opted for minimizing network traffic at the cost of slower signature computation.
5 Evaluation
To illustrate the practical utility and scalability of the algorithm and its implementation in CoPaR, we report on a number of benchmarks performed on a selection of randomly generated and real-world data. In previous evaluations of sequential CoPaR [41], we were limited by the 16GB RAM of a standard workstation. Here we demonstrate that our distributed implementation fulfills its main objective of handling larger systems without lifting the memory restriction per process. All benchmarks were run on a high-performance computing cluster consisting of nodes with two Xeon 2660 v2 “Ivy Bridge” chips (10 cores per chip + SMT) with 2.2GHz clock rate and 64GB RAM. The nodes are connected by a fat-tree InfiniBand interconnect fabric with 40GBit/s bandwidth. Most execution runs were performed using 32 workers on 8 nodes, resulting in 4 worker processes per node. No process used more than 16GB RAM. Execution times of the sequential algorithm were taken using one node of the cluster. No times are given for executions that previously ran out of 16GB memory [41]; those were not run on the cluster.
Weighted Tree Automata. In previous work [41], we have determined the size of the largest weighted tree automata for different parameters that the sequential version of CoPaR could handle in 16GB of RAM. Here, we demonstrate that the distributed version can indeed overcome these memory constraints and process much larger inputs.
Recall from Example 2.2 that weighted tree automata are coalgebras for the functor FX = M × M^(ΣX). For these benchmarks, we use ΣX = 4 × X^r with rank r ∈ {1,...,5} and the monoids (2,∨,0) (available as the finite powerset functor in CoPaR), (N,max,0) and (Pω(64),∪,∅). To generate a random automaton with n states, we uniformly chose k = 50·n transitions from the set of all possible transitions (using an efficient sampling algorithm by Vitter [39]), resulting in a coalgebra encoding with n′ = 51·n states and m = (r+1)·k edges. We took care to restrict the state and transition weights to at most 50 different monoid elements in each example, to avoid the situation where all states are already distinguished in the first iteration of the algorithm.
[Figure: maximum memory per worker (MB, 2^8–2^11) and overall computation time (s, 50–200) plotted against the number of workers used (2^3–2^7).]
Table 1 lists results for both the sequential and distributed implementation when run on the same input. These are the largest WTAs for their respective rank and monoid that sequential CoPaR could handle using at most 16GB of RAM [41]. In contrast, the distributed implementation uses less than 1GB per worker for those examples and is thus able to handle much larger inputs. Incidentally, the distributed implementation is also faster despite the overhead incurred by network communication. This can partly be attributed to the input-parsing stage, which does not need inter-worker synchronization and is thus perfectly parallelizable.
To test the scaling properties of the distributed algorithm, we ran CoPaR with the same input WTA but a varying number of worker processes. For this we chose the WTA for the monoid (2,∨,0) with ΣX = 4 × X^5 having 86852 states with 4342600 transitions and file size 186MB. The figure above depicts the maximum memory usage per worker and the overall running time. The results show that both quantities scale nicely with up to 32 workers; and while the running time even increases when using up to 128 workers, the memory usage per worker (the main motivation for this work) continues to decrease significantly.
Monoid           r  k          n        Mem. (MB)  Time (s)  Seq. Time (s)
(Pω(64),∪,∅)     5  4630750    92615    849        61        511
                 4  4171550    83431    663        52        642
                 3  4721250    94425    639        59        528
                 2  6704100    134082   675        76        471
                 1  7605350    152107   642        79        566
                 3  47212500   944250   6786       675       –
(N,max,0)        5  4722550    94451    871        61        445
                 4  4643950    92879    754        56        463
                 3  5039950    100799   628        64        391
                 2  5904200    118084   633        74        403
                 1  7845650    156913   677        82        438
                 3  50399500   1007990  5644       645       –
(2,∨,0)          5  4342600    86852    701        71        537
                 4  4624550    92491    728        67        723
                 3  6710350    134207   825        113       689
                 2  6900000    138000   715        129       467
                 1  7743150    154863   621        160       449
                 3  65000000   1300000  7092       1377      –
Table 1: Maximally manageable WTAs for sequential CoPaR; “Mem.” and “Time” are the memory and time required for the distributed algorithm and are the maximum over all workers. “Seq. Time” is the time needed by sequential CoPaR.
PRISM Models. Finally, we show how our distributed partition refinement implementation performs on models from the benchmark suite [27] of the PRISM model checker [26]. These model (aspects of) real-world protocols and are thus a good fit to evaluate how CoPaR performs on inputs that arise in practice. Specifically, we use the fms and wlan_time_bounded families of systems. These are continuous-time Markov chains, regarded as coalgebras for FX = R^(X), and Markov decision processes, regarded as coalgebras for FX = N × Pω(N × DωX), respectively. Again, our translation to coalgebras took care to force a coarse initial partition in the algorithm.
The results in Table 2 show that the distributed implementation is again able to handle larger systems than sequential CoPaR in 16GB of RAM per process. For the fms benchmarks, the distributed implementation is again faster than the sequential one. However, this is not the case for the wlan examples. The larger run times might be explained by the much higher number of iterations of the refinement phase (i-column of the table). This means that only few states are distinguished in each phase, and thus signatures are re-computed more often and more network traffic is incurred.
Model           n        m         Mem. (MB)  Time (s)  i    Seq. Time (s)
fms (n=4)       35910    237120    13         2         4    4
fms (n=5)       152712   1111482   62         8         5    17
fms (n=6)       537768   4205670   163        26        5    68
fms (n=7)       1639440  13552968  514        84        5    232
fms (n=8)       4459455  38533968  1690       406       7    –
wlan_tb (K=0)   582327   771088    90         297       306  39
wlan_tb (K=1)   1408676  1963522   147        855       314  105
wlan_tb (K=2)   1632799  5456481   379        2960      374  –
Table 2: Benchmarks on PRISM models: n and m are the numbers of states and edges of the input coalgebra; i is the number of refinement steps (iterations). The other columns are analogous to Table 1.
6 Conclusions and Future Work
We have presented a new and simple partition refinement algorithm in coalgebraic
genericity which easily lends itself to a distributed implementation. Our algorithm
is based on König and Küpper’s final chain algorithm [25] and Blom and Orzan’s
signature refinement algorithm for labelled transition systems [8]. We have
provided a distributed implementation in the tool CoPaR. Like the previous
sequential Paige-Tarjan style partition refinement algorithm, our new algorithm
is modular in the system type. This is made possible by combining signature
interfaces by product and coproduct, which is used by CoPaR for handling
combined type functors. Experimentation has shown that with the distributed
algorithm CoPaR can handle larger state spaces in general. Run times stay low for
weighted tree automata, whereas we observed severe penalties on some models
from the PRISM benchmark suite.
An additional optimization of the coalgebraic signature refinement algorithm
should be possible using Blom and Orzan’s idea [9] to mark in each iteration
those states whose signatures can change in the next iteration and only recompute
signatures for those states in the next round. This might mitigate the run time
penalties we have seen in some of the PRISM benchmarks.
Further work on CoPaR concerns symbolic techniques: we have a prototype
sequential implementation of the coalgebraic signature refinement algorithm
where state spaces are represented using BDDs. In a subsequent step it could be
investigated whether this can be distributed. In another direction the distributed
algorithm might be extended to compute distinguishing formulas, as recently
achieved for the sequential algorithm [43], for which there is also an implemented
prototype. Finally, there is still work required to integrate all these new fea-
tures, i.e. distribution, distinguishing formulas, reachability and computation of
minimized systems, into one version of CoPaR.
Data Availability Statement The software CoPaR and the input files that
were used to produce the results in this paper are available for download [3]. The
latest version of CoPaR can be obtained at https://git8.cs.fau.de/software/copar.
References
1. Balcazar, J., Gabarro, J., Santha, M.: Deciding bisimilarity is P-complete. Form. Asp. Comput. 4(6A), 638–648 (1992)
2. Bartels, F., Sokolova, A., de Vink, E.: A hierarchy of probabilistic system types. In: Coalgebraic Methods in Computer Science, CMCS 2003. Electron. Notes Theor. Comput. Sci., vol. 82, pp. 57–75. Elsevier (2003)
3. Birkmann, F., Deifel, H.P., Milius, S.: Software and Benchmarks for Distributed Coalgebraic Partition Refinement (Jan 2022). https://doi.org/10.5281/zenodo.5907084
4. Blom, S., Haverkort, B.R., Kuntz, M., van de Pol, J.: Distributed Markovian bisimulation reduction aimed at CSL model checking. In: Proceedings of the 7th International Workshop on Parallel and Distributed Methods in verifiCation (PDMC 2008). Electron. Notes Theor. Comput. Sci., vol. 220, pp. 35–50. Elsevier (2008)
5. Blom, S., Orzan, S.: A distributed algorithm for strong bisimulation reduction of state spaces. In: Brim, L., Grumberg, O. (eds.) Proc. Parallel and Distributed Model Checking (PDMC). Electron. Notes Theor. Comput. Sci., vol. 68, pp. 523–538. Elsevier (2002)
6. Blom, S., Orzan, S.: Distributed branching bisimulation reduction of state spaces. In: Sokolsky, O., Viswanathan, M. (eds.) Proc. Parallel and Distributed Model Checking (PDMC). Electron. Notes Theor. Comput. Sci., vol. 89, pp. 99–113. Elsevier (2003)
7. Blom, S., Orzan, S.: Distributed state space minimization. In: Arts, T., Fokkink, W. (eds.) Proc. Eighth International Workshop on Formal Methods for Industrial Critical Systems (FMICS). Electron. Notes Theor. Comput. Sci., vol. 80, pp. 109–123. Elsevier (2003)
8. Blom, S., Orzan, S.: A distributed algorithm for strong bisimulation reduction of state spaces. International Journal on Software Tools for Technology Transfer 7(1), 74–86 (2005). https://doi.org/10.1007/s10009-004-0159-4
9. Blom, S., Orzan, S.: Distributed state space minimization. International Journal on Software Tools for Technology Transfer 7(3), 280–291 (Jun 2005). https://doi.org/10.1007/s10009-004-0185-2
10. Buchholz, P.: Bisimulation relations for weighted automata. Theoret. Comput. Sci. 393, 109–123 (2008)
11. Deifel, H.P., Milius, S., Schröder, L., Wißmann, T.: Generic partition refinement and weighted tree automata. In: ter Beek, M., et al. (eds.) Proc. International Symposium on Formal Methods (FM). Lecture Notes Comput. Sci., vol. 11800, pp. 280–297. Springer (2019)
12. Deifel, H.P., Milius, S., Wißmann, T.: Coalgebra encoding for efficient minimization. In: Kobayashi, N. (ed.) Proc. 6th International Conference on Formal Structures for Computation and Deduction (FSCD). LIPIcs, vol. 195, pp. 28:1–28:19. Schloss Dagstuhl (2021)
13. Derisavi, S., Hermanns, H., Sanders, W.: Optimal state-space lumping in Markov chains. Inf. Process. Lett. 87(6), 309–315 (2003)
14. Desharnais, J., Edalat, A., Panangaden, P.: Bisimulation for labelled Markov processes. Inform. Comput. 179(2), 163–193 (2002)
15. van Dijk, T., van de Pol, J.: Multi-core symbolic bisimulation minimisation. International Journal on Software Tools for Technology Transfer 20(2), 157–177 (Apr 2018). https://doi.org/10.1007/s10009-017-0468-z
16. Dorsch, U., Milius, S., Schröder, L., Wißmann, T.: Efficient coalgebraic partition refinement. In: Meyer, R., Nestmann, U. (eds.) Proc. 28th International Conference on Concurrency Theory (CONCUR). LIPIcs, vol. 85, pp. 28:1–28:16. Schloss Dagstuhl (2017)
17. van Glabbeek, R.: The linear time – branching time spectrum I; the semantics of concrete, sequential processes. In: Bergstra, J., Ponse, A., Smolka, S. (eds.) Handbook of Process Algebra, pp. 3–99. Elsevier (2001)
18. Gries, D.: Describing an algorithm by Hopcroft. Acta Informatica 2, 97–109 (1973)
19. Harris, T., Marlow, S., Peyton Jones, S.: Composable memory transactions. In: PPoPP '05: Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming. pp. 48–60. ACM Press (January 2005), https://www.microsoft.com/en-us/research/publication/composable-memory-transactions/
20. Högberg (Björklund), J., Maletti, A., May, J.: Bisimulation minimisation for weighted tree automata. In: Developments in Language Theory, 11th International Conference, DLT 2007, Turku, Finland, July 3–6, 2007, Proceedings. Lecture Notes Comput. Sci., vol. 4588, pp. 229–241. Springer (2007). https://doi.org/10.1007/978-3-540-73208-2
21. Hopcroft, J.: An n log n algorithm for minimizing states in a finite automaton. In: Theory of Machines and Computations. pp. 189–196. Academic Press (1971)
22. Huynh, D., Tian, L.: On some equivalence relations for probabilistic processes. Fund. Inform. 17, 211–234 (1992)
23. Kanellakis, P.C., Smolka, S.A.: CCS expressions, finite state processes, and three problems of equivalence. Inform. Comput. 86(1), 43–68 (1990). https://doi.org/10.1016/0890-5401(90)90025-D
24. Knuutila, T.: Re-describing an algorithm by Hopcroft. Theoret. Comput. Sci. 250, 333–363 (2001)
25. König, B., Küpper, S.: Generic partition refinement algorithms for coalgebras and an instantiation to weighted automata. In: Theoretical Computer Science, IFIP TCS 2014. Lecture Notes Comput. Sci., vol. 8705, pp. 311–325. Springer (2014)
26. Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: Verification of probabilistic real-time systems. In: Computer Aided Verification, CAV 2011. LNCS, vol. 6806, pp. 585–591. Springer (2011)
27. Kwiatkowska, M.Z., Norman, G., Parker, D.: The PRISM benchmark suite. In: Ninth International Conference on Quantitative Evaluation of Systems, QEST 2012, London, United Kingdom, September 17–20, 2012. pp. 203–204. IEEE Computer Society (2012). https://doi.org/10.1109/QEST.2012.14
28. Larsen, K.G., Skou, A.: Bisimulation through probabilistic testing. Inform. Comput. 94(1), 1–28 (1991)
29. Milner, R.: A Calculus of Communicating Systems, Lecture Notes Comput. Sci., vol. 92. Springer (1980)
30. Milner, R.: Communication and Concurrency. International Series in Computer Science, Prentice Hall (1989)
31. Paige, R., Tarjan, R.: Three partition refinement algorithms. SIAM J. Comput. 16(6), 973–989 (1987)
32. Park, D.: Concurrency on automata and infinite sequences. In: Deussen, P. (ed.) Proc. Conf. on Theoretical Computer Science. Lecture Notes Comput. Sci., vol. 104, pp. 167–183 (1981)
33. Rajasekaran, S., Lee, I.: Parallel algorithms for relational coarsest partition problems. IEEE Trans. Parallel Distributed Syst. 9(7), 687–699 (1998). https://doi.org/10.1109/71.707548
34. Rutten, J.: Universal coalgebra: a theory of systems. Theoret. Comput. Sci. 249, 3–80 (2000)
35. Rutten, J., de Vink, E.: Bisimulation for probabilistic transition systems: a coalgebraic approach. Theoret. Comput. Sci. 221, 271–293 (1999)
36. Valmari, A.: Bisimilarity minimization in O(m log n) time. In: Applications and Theory of Petri Nets, PETRI NETS 2009. Lecture Notes Comput. Sci., vol. 5606, pp. 123–142. Springer (2009)
37. Valmari, A.: Simple bisimilarity minimization in O(m log n) time. Fundam. Inform. 105(3), 319–339 (2010). https://doi.org/10.3233/FI-2010-369
38. Valmari, A., Franceschinis, G.: Simple O(m log n) time Markov chain lumping. In: Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2010. Lecture Notes Comput. Sci., vol. 6015, pp. 38–52. Springer (2010)
39. Vitter, J.S.: An efficient algorithm for sequential random sampling. ACM Trans. Math. Softw. 13(1), 58–67 (1987). https://doi.org/10.1145/23002.23003
40. Wimmer, R., Herbstritt, M., Hermanns, H., Strampp, K., Becker, B.: Sigref – A Symbolic Bisimulation Tool Box. In: Graf, S., Zhang, W. (eds.) Automated Technology for Verification and Analysis, vol. 4218, pp. 477–492. Springer, Berlin, Heidelberg (2006). https://doi.org/10.1007/11901914_35
41. Wißmann, T., Deifel, H.P., Milius, S., Schröder, L.: From generic partition refinement to weighted tree automata minimization. Form. Asp. Comput. 33, 695–727 (2021)
42. Wißmann, T., Dorsch, U., Milius, S., Schröder, L.: Efficient and modular coalgebraic partition refinement. Log. Methods Comput. Sci. 16(1), 8:1–8:63 (2020)
43. Wißmann, T., Milius, S., Schröder, L.: Explaining behavioural inequivalence generically in quasilinear time. In: Haddad, S., Varacca, D. (eds.) Proc. 32nd International Conference on Concurrency Theory (CONCUR). LIPIcs, vol. 203, pp. 32:1–32:18. Schloss Dagstuhl (2021)
Open Access This chapter is licensed under the terms of the Creative Commons
Attribution 4.0International License (http://creativecommons.org/licenses/by/4.0/),
which permits use, sharing, adaptation, distribution and reproduction in any medium or
format, as long as you give appropriate credit to the original author(s) and the source,
provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the
chapter’s Creative Commons license, unless indicated otherwise in a credit line to the
material. If material is not included in the chapter’s Creative Commons license and
your intended use is not permitted by statutory regulation or exceeds the permitted
use, you will need to obtain permission directly from the copyright holder.
Distributed Coalgebraic Partition Refinement 177
From Bounded Checking to Verification of
Equivalence via Symbolic Up-to Techniques
Vasileios Koutavas1, Yu-Yang Lin1(✉), and Nikos Tzevelekos2
1Trinity College Dublin, Dublin, Ireland {Vasileios.Koutavas,linhouy}@tcd.ie
2Queen Mary University of London, London, UK nikos.tzevelekos@qmul.ac.uk
Abstract.
We present a bounded equivalence verification technique
for higher-order programs with local state. This technique combines
fully abstract symbolic environmental bisimulations similar to symbolic
game semantics, novel up-to techniques, and lightweight state invariant
annotations. This yields an equivalence verification technique with no
false positives or negatives. The technique is bounded-complete, in that
all inequivalences are automatically detected given large enough bounds.
Moreover, several hard equivalences are proved automatically or after
being annotated with state invariants. We realise the technique in a tool
prototype called Hobbit and benchmark it with an extensive set of new
and existing examples. Hobbit can prove many classical equivalences
including all Meyer and Sieber examples.
Keywords: Contextual equivalence · bounded model checking · symbolic bisimulation · up-to techniques · operational game semantics.
1 Introduction
Contextual equivalence is a relation over program expressions which guaran-
tees that related expressions are interchangeable in any program context. It
encompasses verification properties like safety and termination. It has attracted
considerable attention from the semantics community (cf. the 2017 Alonzo Church
Award), and has found its main applications in the verification of cryptographic
protocols [4], compiler correctness [26] and regression verification [10,11,9,17].
In its full generality, contextual equivalence is hard as it requires reasoning
about the behaviour of all program contexts, and becomes even more difficult in
languages with higher-order features (e.g. callbacks) and local state. Advances in bisimulations [16,29,3], logical relations [1,13,15] and game semantics [18,25,8,20] have offered powerful theoretical techniques for hand-written proofs of contextual equivalence in higher-order languages with state. However, these advancements have yet to be fully integrated in verification tools for contextual equivalence in programming languages, especially in the case of bisimulation techniques. Existing tools [12,24,14] only tackle carefully delineated language fragments.
This publication has emanated from research supported in part by a grant from
Science Foundation Ireland under Grant number 13/RC/2094_2.
© The Author(s) 2022
D. Fisman and G. Rosu (Eds.): TACAS 2022, LNCS 13244, pp. 178–195, 2022.
https://doi.org/10.1007/978-3-030-99527-0_10
In this paper we aim to push the frontier further by proposing a bounded
model checking technique for contextual equivalence for the entirety of a higher-
order language with local state (Sec. 3). This technique, realised in a tool called Hobbit,³ automatically detects inequivalent program expressions given sufficient bounds, and proves hard equivalences automatically or semi-automatically.
Our technique uses a labelled transition system (LTS) for open expressions in order to express equivalence as a bisimulation. The LTS is symbolic both for higher-order arguments (Sec. 4), similarly to symbolic game models [8,20] and derived proof techniques [3,15], and first-order ones (Sec. 6), adopting established techniques (e.g. [6]) and tools such as Z3 [23]. This enables the definition of a fully abstract symbolic environmental bisimulation, the bounded exploration of which is the task of the Hobbit tool. Full abstraction guarantees that our tool finds all inequivalences given sufficient bounds, and only reports true inequivalences. As is corroborated by our experiments, this makes Hobbit a practical inequivalence detector, similar to traditional bounded model checking [2] which has been proved an effective bug detection technique in industrial-scale C code [6,7,30].
However, while proficient in bug finding, bounded model checking can rarely
prove the absence of errors, and in our setting prove an equivalence: a bound
is usually reached before all—potentially infinite—program runs are explored.
Inspired by hand-written equivalence proofs, we address this challenge by propos-
ing two key technologies: new bisimulation up-to techniques, and lightweight user
guidance in the form of state invariant annotations. Hence we increase signifi-
cantly the number of equivalences proven by Hobbit, including for example all
classical equivalences due to Meyer and Sieber [21].
Up-to techniques [28] are specific to bisimulation and concern the reduction of the size of bisimulation relations, oftentimes turning infinite transition systems into finite ones by focusing on a core part of the relation. Although extensively studied in the theory of bisimulation, up-to techniques have not been used in practice in an equivalence checker. We specifically propose three novel up-to techniques: up to separation and up to re-entry (Sec. 5), dealing with infinity in the LTS due to the higher-order nature of the language, and up to state invariants (Sec. 7), dealing with infinity due to state updates. Up to separation allows us to reduce the knowledge of the context the examined program expressions are running in, similar to a frame rule in separation logic. Up to re-entry removes the need of exploring unbounded nestings of higher-order function calls under specific conditions. Up to state invariants allows us to abstract parts of the state and make finite the number of explored configurations by introducing state invariant predicates in configurations.
State invariants are common in equivalence proofs of stateful programs, both in handwritten (e.g. [16]) and tool-based proofs. In the latter they are expressed manually in annotations (e.g. [9]) or automatically inferred (e.g. [14]). In Hobbit we follow the manual approach, leaving heuristics for automatic invariant inference for future work. An important feature of our annotations is the ability to express relations between the states of the two compared terms, enabled by the up to state invariants technique. This leads to finite bisimulation transition systems in examples where concrete value semantics are infinite state.
³ Higher Order Bounded BIsimulation Tool (Hobbit), https://github.com/LaifsV1/Hobbit.
The above technologies, combined with standard up-to techniques, transform Hobbit from a bounded checker into an equivalence prover able to reason about infinite behaviour in a finite manner in a range of examples, including classical example equivalences (e.g. all in [21]) and some that previous work on up-to techniques cannot algorithmically decide [3] (cf. Ex. 22). We have benchmarked Hobbit on examples from the literature and newly designed ones (Sec. 8). Due to the undecidable nature of contextual equivalence, up-to techniques are not exhaustive: no set of up-to techniques is guaranteed to finitise all examples. Indeed there are a number of examples where the bisimulation transition system is still infinite and Hobbit reaches the exploration bound. For instance, Hobbit is not able to prove examples with inner recursion and well-bracketing properties, which we leave to future work. Nevertheless, our approach provides a contextual equivalence tool for a higher-order language with state that can prove many equivalences and inequivalences which previous work could not handle due to syntactic restrictions and other limitations (Sec. 9).
Related work Our paper marries techniques from environmental bisimulations up-to [16,29,28,3] with the work on fully abstract game models for higher-order languages with state [18,8,20]. The closest to our technique is that of Biernacki et al. [3], which introduces up-to techniques for a symbolic LTS similar to ours, albeit with symbolic values restricted to higher-order types, resulting in infinite LTSs in examples such as Ex. 21, and with inequivalence decided outside the bisimulation by (non-)termination, precluding the use of up-to techniques in examples such as Ex. 22. Close in spirit is the line of research on logical relations [1,13,15], which provides a powerful tool for hand-written proofs of contextual equivalence. Also related are the tools Hector [12] and Coneqct [24], and SyTeCi [14], based on game semantics and step-indexed logical relations respectively (cf. Sec. 9).
2 High-Level Intuitions
Contextual equivalence requires that two program expressions lead to the same
observable result in any program context these may be fed in. Instead of working
directly with this definition, we can translate programs into a semantic model
that is fully abstract, reducing contextual equivalence to semantic equality.
The semantic model we use is that of Game Semantics [18]. We model programs as formal interactions between two players: a Proponent (corresponding to the program) and an Opponent (standing for any program context). Concretely, these interactions are sets of traces produced from a Labelled Transition System (LTS), the nodes and labels of which are called configurations and moves respectively. The LTS captures the interaction of the program with its environment, which is realised via function applications and returns: moves can be questions (i.e. function applications) or answers (returns), and belong to proponent or opponent. E.g. a program calling an external function will issue a proponent question, while the return of the external function will be an opponent answer. In the examples that follow, moves that correspond to the opponent shall be underlined.
[Figure: LTS diagrams for N (top), M (middle) and the pruned LTSs NC1 and MC1 (bottom), showing app/ret moves exchanged between opponent and proponent and the store contents (x1 : 0, x2 : 0, . . . ) at each configuration.]
Fig. 1. Sample LTS’s modelling expressions in Section 2.
Example 1. Consider the expression N = (fun f -> f (); 0) of type (unit → unit) → int. Evaluating N leads to a function g being returned (i.e. g is λf. f (); 0). When g is called with some input f1, it will always return 0 but in the process it may call the external function f1. The call to f1 may immediately return or it may call g again (i.e. reenter), and so on. The LTS for N is as in Fig. 1 (top).
Given two expressions M, N, checking their equivalence will amount to checking bisimulation equivalence of their (generally infinite) LTS's. Our checking routine performs a bounded analysis that aims to either find a finite counterexample and thus prove inequivalence, or build a bisimulation relation that shows the equivalence of the expressions. The former case is easier as it is relatively rapid to explore a bisimulation graph up to a given depth. The latter one is harder, as the target bisimulation can be infinite. To tackle part of this infinity, we use three novel up-to techniques for environmental bisimulation.
Up-to techniques roughly assert that if a core set of configurations in the
bisimulation graph explored can be proven to be part of a relation satisfying a
definition that is more permissive than standard bisimulation, then a superset
of configurations forms a proper bisimulation relation. This has the implication
that a bounded analysis can be used to explore a finite part of the bisimulation
graph to verify potentially infinitely many configurations. As there can be no
complete set of up-to techniques, the pertaining question is how useful they are
in practice. In the remainder of this section we present the first of our up-to
techniques, called up to separation, via an example equivalence. The intuition
behind this technique comes from Separation Logic and amounts to saying that
functions that access separate regions of the state can be explored independently.
As a corollary, a function that manipulates only its own local references may be
explored independently of itself, i.e. it suffices to call it once.
Loc: l, k     Var: x, y, z     Const: c
Type: T ::= bool | int | unit | T → T | T1 ∗ · · · ∗ Tn
Exp: e, M, N ::= v | (e̅) | op(e̅) | e e | if e then e else e | ref l = v in e | !l | l := e | let (x̅) = e in e
Val: u, v ::= c | x | fix f(x).e | (v̅)
ECxt: E ::= [·]T | (v̅, E, e̅) | op(v̅, E, e̅) | E e | v E | l := E | if E then e else e | let (x̅) = E in e
Cxt: D ::= [·]i,T | e | (D̅) | op(D̅) | D D | l := D | if D then D else D | fix f(x).D | ref l = D in D | let (x̅) = D in D
St: s, t ∈ Loc ⇀fin Val

s; op(c̅) ⇝ s; w   if oparith(c̅) = w
s; (fix f(x).e) v ⇝ s; e[v/x][fix f(x).e/f]
s; let (x̅) = (v̅) in e ⇝ s; e[v̅/x̅]
s; ref l = v in e ⇝ s[l ↦ v]; e   if l ∉ dom(s)
s; !l ⇝ s; v   if s(l) = v
s; l := v ⇝ s[l ↦ v]; ()
s; if c then e1 else e2 ⇝ s; ei   if (c, i) ∈ {(tt, 1), (ff, 2)}
s; E[e] → s′; E[e′]   if s; e ⇝ s′; e′
Fig. 2. Syntax and reduction semantics of the language λimp.
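To make the reduction semantics concrete, the following is a minimal executable sketch of Fig. 2 in OCaml (the ML-like syntax our examples already use). It covers only integer constants, recursive functions, sequencing and references; the names (expr, step, eval) are our own, not Hobbit's internals, and unit is modelled as Const 0 for brevity.

(* A minimal sketch of the small-step semantics of Fig. 2, restricted to
   integers, recursive functions and references. Locations are strings and
   the store is an association list; unit is modelled as Const 0. *)
type expr =
  | Const of int
  | Var of string
  | Fix of string * string * expr        (* fix f(x). e *)
  | App of expr * expr
  | Ref of string * expr * expr          (* ref l = v in e *)
  | Deref of string                      (* !l *)
  | Assign of string * expr              (* l := e *)
  | Seq of expr * expr                   (* e; e *)

type store = (string * expr) list        (* newest binding shadows older ones *)

let is_value = function Const _ | Fix _ -> true | _ -> false

(* Naive substitution: adequate for closed programs with distinct binders. *)
let rec subst x v e = match e with
  | Var y -> if x = y then v else e
  | Const _ | Deref _ -> e
  | Fix (f, y, b) -> if x = f || x = y then e else Fix (f, y, subst x v b)
  | App (e1, e2) -> App (subst x v e1, subst x v e2)
  | Ref (l, e1, e2) -> Ref (l, subst x v e1, subst x v e2)
  | Assign (l, e1) -> Assign (l, subst x v e1)
  | Seq (e1, e2) -> Seq (subst x v e1, subst x v e2)

(* One reduction step: the base relation plus evaluation-context search. *)
let rec step (s : store) (e : expr) : (store * expr) option =
  match e with
  | App ((Fix (f, x, b) as fn), v) when is_value v ->
    Some (s, subst x v (subst f fn b))
  | App (e1, e2) when not (is_value e1) ->
    Option.map (fun (s', e1') -> (s', App (e1', e2))) (step s e1)
  | App (v1, e2) -> Option.map (fun (s', e2') -> (s', App (v1, e2'))) (step s e2)
  | Ref (l, v, e2) when is_value v -> Some ((l, v) :: s, e2)
  | Deref l -> Option.map (fun v -> (s, v)) (List.assoc_opt l s)
  | Assign (l, v) when is_value v -> Some ((l, v) :: s, Const 0)
  | Assign (l, e1) -> Option.map (fun (s', e1') -> (s', Assign (l, e1'))) (step s e1)
  | Seq (v, e2) when is_value v -> Some (s, e2)
  | Seq (e1, e2) -> Option.map (fun (s', e1') -> (s', Seq (e1', e2))) (step s e1)
  | _ -> None

(* Iterate to a normal form: s; e ⇓ holds when this reaches a value. *)
let rec eval s e = match step s e with
  | Some (s', e') -> eval s' e'
  | None -> (s, e)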
Example 2. Consider M = (fun f -> ref x=0 in f (); !x) and N from Ex. 1. The LTSs corresponding to M and N are shown in Fig. 1 (middle and top). Regarding M, we can see that the opponent is always allowed to reenter the proponent function g, which creates a new reference xn each time. This makes each configuration unique, which prevents us from finding cycles and thus from finitising the bisimulation graph. Moreover, both the LTS for M and N are infinite because of the stack discipline they need to adhere to when O issues reentrant calls.

With separation, however, we could prune the two LTS's as in Fig. 1 (bottom). We denote the configurations after the first opponent call as C1. Any opponent call after C1 leads to a configuration which differs from C1 either by a state component that is not accessible anymore and can thus be separated, or by a stack component that can be similarly separated. Hence, the LTS's that we need to consider are finite and thus the expressions are proven equivalent.
3 Language and Semantics
We develop our technique for the language λimp, a simply typed lambda calculus with local state whose syntax and reduction semantics are shown in Fig. 2. Expressions (Exp) include the standard lambda expressions with recursive functions (fix f(x).e), together with location creation (ref l = v in e), dereferencing (!l), and assignment (l := e), as well as standard base type constants (c) and operations (op(e̅)). Locations are mapped to values, including function values, in a store (St). We write · for the empty store and let loc(χ) denote the set of free locations in χ.
The language λimp is simply-typed, with typing judgements of the form Δ; Σ ⊢ e : T, where Δ is a type environment (omitted when empty), Σ a store typing and T a value type (Type); Σs is the typing of store s. The rules of the type system are standard and omitted here. Values consist of boolean, integer, and unit constants,
functions and arbitrary length tuples of values. To keep the presentation of our
technique simple we do not include reference types as value types, effectively
keeping all locations local. Exchange of locations between expressions can be
encoded using get and set functions. In Ex. 22 we show the encoding of a classic
equivalence with location exchange between expressions and their context. Future
work extensions to our technique to handle location types can be informed from
previous work [18,14].
The reduction semantics is by small-step transitions between configurations containing a store and an expression, ⟨s; e⟩ → ⟨s′; e′⟩, defined using single-hole evaluation contexts (ECxt) over a base relation ⇝. Holes [·]T are annotated with the type T of closed values they accept, which we may omit to lighten notation. Beta substitution of x with v in e is written as e[v/x]. We write s; e ⇓ to denote s; e →∗ t; v for some t, v. We write χ̅ to mean a syntactic sequence, and assume standard syntactic sugar from the lambda calculus. In our examples we assume an ML-like syntax and implementation of the type system, which is also the concrete syntax of Hobbit.
We consider environments Γ ∈ N ⇀fin Val which map natural numbers to closed values. The concatenation of two such environments Γ1 and Γ2, written Γ1, Γ2, is defined when dom(Γ1) ∩ dom(Γ2) = ∅. We write (i1 ↦ v1, . . . , in ↦ vn) for a concrete environment mapping i1, . . . , in to v1, . . . , vn, respectively. When indices are unimportant we omit them and treat Γ environments as lists.
General contexts D contain multiple, non-uniquely indexed holes [·]i,T, where T is the type of value that can replace the hole. Notation D[Γ] denotes the context D with each hole [·]i,T replaced with Γ(i), provided that i ∈ dom(Γ) and Σ ⊢ Γ(i) : T, for some Σ. We omit hole types where possible, and indices when all holes in D are annotated with the same i. In the latter case we write D[v] instead of D[(i ↦ v)] and allow to replace all holes of D with a closed expression e, written D[e]. We assume the Barendregt convention for locations, thus replacing context holes avoids location capture. Standard contextual equivalence [22] follows.
Definition 3 (Contextual Equivalence). Expressions ⊢ e1 : T and ⊢ e2 : T are contextually equivalent, written as e1 ≡ e2, when for all contexts D such that ⊢ D[e1] : unit and ⊢ D[e2] : unit we have ⟨· ; D[e1]⟩ ⇓ iff ⟨· ; D[e2]⟩ ⇓.
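As a concrete illustration of Definition 3 (in OCaml syntax, mirroring λimp), the two expressions below agree on a single call, but a context that calls its hole twice observes the hidden state of the second and distinguishes them, so they are not contextually equivalent. The names e1, e2 and d are ours, chosen for illustration only.

let e1 = fun () -> 0
let e2 = let x = ref 0 in fun () -> x := !x + 1; !x - 1

(* The context D = let f = [·] in f (); f () *)
let d f = ignore (f ()); f ()

let () =
  assert (d e1 = 0);  (* both calls return 0 *)
  assert (d e2 = 1)   (* the second call observes the first call's update *)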
4 LTS with Symbolic Higher-Order Transitions
Our Labelled Transition System (LTS) has symbolic transitions for both higher-order and first-order transitions. For simplicity we first present our LTS with symbolic higher-order and concrete first-order transitions. We develop our theory and most up-to techniques on this simpler LTS. We then show its extension with symbolic first-order transitions and develop up to state invariants, which relies on this extension. We extend the syntax with abstract function names α:

Val: u, v, w ::= c | fix f(x).e | (v̅) | αT

Abstract function names αT are annotated with the type T of function they represent, omitted where possible; an(χ) is the set of abstract names in χ.
PropApp:  ⟨A; Γ; K; s; E[α v]⟩ −app(α,D)→ ⟨A; Γ, Γ′; E[·], K; s; ·⟩   if (D, Γ′) ∈ ulpatt(v)
PropRet:  ⟨A; Γ; K; s; v⟩ −ret(D)→ ⟨A; Γ, Γ′; K; s; ·⟩   if (D, Γ′) ∈ ulpatt(v)
OpApp:    ⟨A; Γ; K; s; ·⟩ −app(i,D[α̅])→ ⟨A ⊎ α̅; Γ; K; s; e⟩   if Σs ⊢ Γ(i) : T → T′, (D, α̅) ∈ ulpatt(T) and Γ(i) D[α̅] ⇛ e
OpRet:    ⟨A; Γ; E[·]T, K; s; ·⟩ −ret(D[α̅])→ ⟨A ⊎ α̅; Γ; K; s; E[D[α̅]]⟩   if (D, α̅) ∈ ulpatt(T)
Tau:      ⟨A; Γ; K; s; e⟩ −τ→ ⟨A; Γ; K; s′; e′⟩   if s; e → s′; e′
Response: C −η→ ⟨⊥⟩   if η ≠ ✓
Term:     ⟨A; Γ; ·; s; ·⟩ −✓→ ⟨⊥⟩
Fig. 3. The Labelled Transition System.
We define our LTS (shown in Fig. 3) by opponent and proponent call and return transitions, based on Game Semantics [18]. Proponent transitions are the moves of an expression interacting with its context. Opponent transitions are the moves of the context surrounding this expression. These transitions are over proponent and opponent configurations ⟨A; Γ; K; s; e⟩ and ⟨A; Γ; K; s; ·⟩, respectively. In these configurations:
– A is a set of abstract function names used so far in the interaction;
– Γ is an environment indexing proponent functions known to the opponent;⁴
– K is a stack of proponent continuations, created by nested proponent calls;
– s is the store containing proponent locations;
– e is the expression reduced in proponent configurations; ê denotes e or ·.
In addition, we introduce a special configuration ⟨⊥⟩ which is used in order to represent expressions that cannot perform given transitions (cf. Remark 6). We let a trace be a sequence of app and ret moves (i.e. labels), as defined in Fig. 3.

For the LTS to provide a fully abstract model of the language, it is necessary that functions which are passed as arguments or return values from proponent to opponent be abstracted away, as the actual syntax of functions is not directly observable in λimp. This is achieved by deconstructing such values v to:
– an ultimate pattern D (cf. [19]), which is a context obtained from v by replacing each function in v with a distinct numbered hole; together with
– an environment Γ mapping indices of these holes to values, and D[Γ] = v.
We let ulpatt(v) contain all such pairs (D, Γ) for v; e.g.: ulpatt((λx.e1, 5)) = { (([·]i, 5), [i ↦ λx.e1]) | for any i }.
We extend ulpatt to types through the use of symbolic function names: ulpatt(T) is the largest set of pairs (D, Γ) such that ⊢ D[Γ] : T, where rng(Γ) = α̅T̅, and D does not contain functions.
⁴ thus, Γ is encoding the environment of Environmental Bisimulations (e.g. [16])
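The following OCaml sketch shows the idea of ultimate-pattern decomposition: functions in a value are stripped out into an indexed environment, leaving numbered holes. The types and names here (value, patt, ulpatt) are our own simplification, not the paper's formal definition.

(* Decompose a value into an ultimate pattern and an environment Γ. *)
type value = VInt of int | VFun of (value -> value) | VPair of value * value
type patt  = PInt of int | PHole of int | PPair of patt * patt

let ulpatt (v : value) : patt * (int * value) list =
  let env = ref [] and next = ref 0 in
  let rec go = function
    | VInt n -> PInt n
    | VFun _ as f ->
      let i = !next in incr next; env := (i, f) :: !env; PHole i
    | VPair (a, b) -> let pa = go a in let pb = go b in PPair (pa, pb)
  in
  let p = go v in (p, List.rev !env)

(* ulpatt (λx.x, 5) = (([·]0, 5), [0 ↦ λx.x]), matching the example above. *)
let () =
  match ulpatt (VPair (VFun (fun x -> x), VInt 5)) with
  | (PPair (PHole 0, PInt 5), [ (0, VFun _) ]) -> ()
  | _ -> assert false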
In Fig. 3, proponent application and return transitions (PropApp, PropRet) use ultimate pattern matching for values and accumulate the functions generated by the proponent in the Γ environment of the configuration, leaving only their indices on the label of the transition itself. Opponent application and return transitions (OpApp, OpRet) use ultimate pattern matching for types to generate opponent-generated values, which can only contain abstract functions. This eliminates the need for quantifying over all functions in opponent transitions but still includes infinite quantification over all base values. Symbolic first-order values in Sec. 6 will obviate the latter.
At opponent application, the following preorder performs a beta reduction when the opponent applies a concrete function. This technicality is needed for soundness.

Definition 4 (⇛). For application v u we write v u ⇛ e to mean e = α u, when v = α; and e = e′[u/x][fix f(x).e′/f], when v = fix f(x).e′.
In our LTS, C ranges over configurations and η over transition labels; =η̂⇒ means −τ→∗ when η = τ, and −τ→∗ −η→ −τ→∗ otherwise. Standard weak (bi-)simulation follows.
Definition 5 (Weak Bisimulation). Binary relation R is a weak simulation when for all C1 R C2 and C1 −η→ C′1, there exists C′2 such that C2 =η̂⇒ C′2 and C′1 R C′2. If R, R⁻¹ are weak simulations then R is a weak bisimulation. Similarity (≲) and bisimilarity (≈) are the largest weak simulation and bisimulation, respectively.
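For intuition, here is a naive bounded weak-simulation check over an explicit finite LTS in OCaml. Hobbit's actual LTS is symbolic and generally infinite, so this is only a sketch of the bounded exploration; it also assumes the LTS has no τ-cycles, and all names are our own.

(* States are ints, labels are strings, "tau" is the silent label. *)
type lts = (int * string * int) list

let succ (l : lts) s =
  List.filter_map (fun (a, lab, b) -> if a = s then Some (lab, b) else None) l

(* All states reachable by tau-steps (assumes no tau-cycles). *)
let rec tau_closure l s =
  s :: List.concat_map
         (fun (lab, b) -> if lab = "tau" then tau_closure l b else [])
         (succ l s)

(* Weak successors of s under lab: tau* lab tau* (just tau* when lab = "tau"). *)
let weak_succ l s lab =
  let pre = tau_closure l s in
  let mid =
    List.concat_map
      (fun t ->
         List.filter_map
           (fun (lab', b) -> if lab' = lab then Some b else None)
           (succ l t))
      pre
  in
  let mid = if lab = "tau" then pre @ mid else mid in
  List.concat_map (tau_closure l) mid

(* Bounded check: every step of s1 in l1 is weakly matched from s2 in l2. *)
let rec simulates l1 l2 depth s1 s2 =
  depth = 0 ||
  List.for_all
    (fun (lab, s1') ->
       List.exists (fun s2' -> simulates l1 l2 (depth - 1) s1' s2')
                   (weak_succ l2 s2 lab))
    (succ l1 s1)

let bisimilar_up_to l1 l2 depth s1 s2 =
  simulates l1 l2 depth s1 s2 && simulates l2 l1 depth s2 s1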
Remark 6. Any proponent configuration that cannot match a standard bisimulation transition challenge can trivially respond to the challenge by transitioning into ⟨⊥⟩ by the Response rule in Fig. 3. By the same rule, this configuration can trivially perform all transitions except a special termination transition, labelled with ✓. However, regular configurations that have no pending proponent calls (K = ·) can perform the special termination transition (Term rule), signalling the end of a complete trace, i.e. a completed computation. This mechanism allows us to encode complete trace equivalence, which coincides with contextual equivalence [18], as bisimulation equivalence. In a bisimulation proof, if a proponent configuration is unable to match a bisimulation transition with a regular transition, it can still transition to ⟨⊥⟩ where it can simulate every transition of the other expression apart from ✓, which leads to a complete trace.
Our mechanism for treating unmatched transitions has the benefit of enabling us to use the standard definition of bisimulation over our LTS. This is in contrast to previous work [3,15], where termination/non-termination needed to be proven independently or baked into the simulation conditions. More importantly, our approach allows us to use bisimulation up-to techniques even when one of the related configurations diverges, which is not possible in previous symbolic LTSs [18,15,3], and is necessary in examples such as Ex. 22.
Definition 7 (Bisimilar Expressions). Expressions ⊢ e1 : T and ⊢ e2 : T are bisimilar, written e1 ≈ e2, when ⟨· ; · ; · ; · ; e1⟩ ≈ ⟨· ; · ; · ; · ; e2⟩.
Theorem 8 (Soundness and Completeness). e1 ≡ e2 iff e1 ≈ e2.
As a final remark, the LTS presented in this section is finite state only for a small
number of trivial equivalence examples. The following section addresses sources
of infinity in the transition systems through bisimulation up-to techniques.
5 Up-to Techniques
We start with the definition of a sound up-to technique.

Definition 9 (Weak Bisimulation up to f). R is a weak simulation up to f when for all C1 R C2 and C1 −η→ C′1, there is C′2 with C2 =η̂⇒ C′2 and C′1 f(R) C′2. If R, R⁻¹ are weak simulations up to f then R is a weak bisimulation up to f.

Definition 10 (Sound up-to technique). A function f is a sound up-to technique when for any R which is a simulation up to f we have R ⊆ (≲).

Hobbit employs the standard techniques: up to identity, up to garbage collection, up to beta reductions and up to name permutations. Here we present two novel up-to techniques: up to separation and up to re-entry.
Up to Separation Our experience with Hobbit has shown that one of the
most effective up-to techniques for finitising bisimulation transition systems is
the novel up to separation which we propose here. The intuition of this technique
is that if different functions operate on disjoint parts of the store, they can be
explored in disjoint parts of the bisimulation transition system. Taken to the
extreme, a function that does not contain free locations can be applied only
once in a bisimulation test as two copies of the function will not interfere with
each other, even if they allocate new locations after application. To define up to
separation we need to define a separating conjunction for configurations.
Definition 11 (Stack Interleaving). Let K1, K2 be lists of evaluation contexts from ECxt (Fig. 2); we define the interleaving operation K1 #k̅ K2 inductively, and write K1 # K2 to mean K1 #k̅ K2 for unspecified k̅. We let · #· · = · and:

E1, K1 #(1,k̅) K2 = E1, (K1 #k̅ K2)        K1 #(2,k̅) E2, K2 = E2, (K1 #k̅ K2).
Definition 12 (Separating Conjunction). Let C1 = ⟨A1; Γ1; K1; s1; ê1⟩ and C2 = ⟨A2; Γ2; K2; s2; ê2⟩ be well-formed configurations. We define:

C1 ⊕1k̅ C2 ≝ ⟨A1 ∪ A2; Γ1, Γ2; K1 #k̅ K2; s1, s2; ê1⟩   when ê2 = ·
C1 ⊕2k̅ C2 ≝ ⟨A1 ∪ A2; Γ1, Γ2; K1 #k̅ K2; s1, s2; ê2⟩   when ê1 = ·

provided dom(s1) ∩ dom(s2) = ∅. We let C1 ⊕ C2 denote ∃i, k̅. C1 ⊕ik̅ C2.
The function sep provides the up to separation technique; it is defined as:

UpTo⊕:   from C1 R C2 and C3 R C4, infer (C1 ⊕ik̅ C3) sep(R) (C2 ⊕ik̅ C4)
UpTo⊕⊥L: from C1 R ⟨⊥⟩ and C3 R C4, infer (C1 ⊕ C3) sep(R) ⟨⊥⟩
UpTo⊕⊥R: from C1 R C2 and C3 R ⟨⊥⟩, infer (C1 ⊕ C3) sep(R) ⟨⊥⟩

Soundness follows by extending [28,27] with a weaker, sufficient proof obligation.
Lemma 13. Function sep is a sound up-to technique.
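On stores, Definition 12 amounts to a disjoint merge; a tiny OCaml sketch (with stores as association lists, as in the interpreter sketch of Sec. 3, and names of our choosing):

(* Merge two stores only when their domains are disjoint, as Def. 12 requires. *)
let disjoint s1 s2 =
  List.for_all (fun (l, _) -> not (List.mem_assoc l s2)) s1

let sep_conj s1 s2 = if disjoint s1 s2 then Some (s1 @ s2) else None

let () =
  assert (sep_conj [("x", 0)] [("y", 1)] = Some [("x", 0); ("y", 1)]);
  assert (sep_conj [("x", 0)] [("x", 1)] = None)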
Many example equivalences have a finite transition system when using up to
separation in conjunction with the simple techniques of the preceding section.
Example 14. The following is a classic example equivalence from Meyer and Sieber [21]: the expressions below are equivalent at type (unit → unit) → unit.

M = fun f -> ref x = 0 in f ()        N = fun f -> f ()

For both functions, after the initial application of the function by the opponent, the proponent calls f, growing the stack K in the two configurations. At that point the opponent can apply the same functions again. The LTS of both M and N is thus infinite because K can grow indefinitely, and so is a bisimulation proving this equivalence. It is additionally infinite because the opponent can keep applying the initial functions even after these return. However, if we apply the up-to separation technique immediately after the first opponent application, the Γ environments become empty, and thus no second application of the same functions can happen. The LTS thus becomes trivially small. Note that no other up-to technique is needed here. Hobbit applies up-to separation after every opponent application transition and explores the configuration containing the application expression and the smallest possible Γ; this does not lead to false-negative (or false-positive) results.
Example 15. This example is due to Bohr and Birkedal [5] and includes a non-synchronised divergence.

M = fun f ->
      ref l1 = false in ref l2 = false in
      f (fun () -> if !l1 then _bot_ else l2 := true);
      if !l2 then _bot_ else l1 := true
N = fun f -> f (fun () -> _bot_)

Note that _bot_ is a diverging computation. This is a hard example to prove using environmental bisimulation even with up-to techniques, requiring quantification over contexts within the proof. However, with up-to separation, after the opponent applies the initial functions the Γ environments are emptied, thus leaving only one application of M and N that needs to be explored by the bisimulation. Applications of the inner function provided as argument to f only lead to a small number of reachable configurations. Hobbit can indeed prove this equivalence.
Up to Proponent Function Re-entry The higher-order nature of λimp and its LTS allows infinite nesting of opponent and proponent calls. Although up to separation avoids those in a number of examples, here we present a second novel up-to technique, which we call up to proponent function re-entry (or simply, up to re-entry). This technique has connections to the induction hypothesis in the definition of environmental bisimulations in [16]. However, up to re-entry is specifically aimed at avoiding nested calls to proponent functions, and it is designed to work with our symbolic LTS. In combination with other techniques this eliminates the need to consider configurations with unbounded stacks K in many classical equivalences, including those in [21].
UpToReentry:
  C1 = ⟨A; Γ1; K1; s1; ·⟩ R ⟨A; Γ2; K2; s2; ·⟩ = C2
  ∀η̅, C, A′, Γ′1, Γ′2, s′1, s′2. ( app(i, _) ∉ {η̅} and
    ⟨A; Γ1; ·; s1; ·⟩ −app(i,C)→ =η̅⇒ ≍ ⟨A′; Γ′1; ·; s′1; ·⟩ and
    ⟨A; Γ2; ·; s2; ·⟩ −app(i,C)→ =η̅⇒ ≍ ⟨A′; Γ′2; ·; s′2; ·⟩ )
  implies Γ′1 = Γ1 and Γ′2 = Γ2 and s1 = s′1 and s2 = s′2
  C1 −app(i,C)→ =η̅⇒ −app(i,C)→ ⟨A′; Γ1; K′1, K1; s1; e′1⟩
  C2 −app(i,C)→ =η̅⇒ −app(i,C)→ ⟨A′; Γ2; K′2, K2; s2; e′2⟩
  ──────────────────────────────────────────────────────────
  ⟨A′; Γ1; K′1, K1; s1; e′1⟩ reent(R) ⟨A′; Γ2; K′2, K2; s2; e′2⟩

Fig. 4. Up to Proponent Function Re-entry (omitting rules for ⊥-configurations).
Up to re-entry is realised by the function reent in Fig. 4. The intuition of this up-to technique is that if the application of related functions at index i in the Γ environments has no potential to change the local stores (up to garbage collection, encoded by (≍)) or increase the Γ environments, then there are no additional observations to be made by nested calls to the i-functions; thus configurations reached by such nested calls are added to the relation by this up-to technique. Soundness follows similarly to up-to separation.

In Hobbit we require the user to flag the functions to be considered for the up to re-entry technique. This annotation is later combined with state invariant annotations, as they are often used together. Inequivalences found while using the up to re-entry and state invariant annotations could be false negatives due to incorrect user annotations. Hobbit ensures that no such false negatives are reported by re-running discovered inequivalences with these two techniques off.

Below is an example where the state invariant needed is trivial, and up to separation together with up to re-entry are sufficient to prove the equivalence.
Example 16.

M = ref x = 0 in fun f -> f (); !x        N = fun f -> f (); 0

This is like Ex. 2 except the reference in M is created outside of the function body. The LTS for M is as follows, where labels ⟨•; !x1⟩ are continuations.
[Figure: the LTS for M, in which the opponent may reenter g indefinitely, pushing a continuation ⟨•; !x1⟩ for each nested call while the store stays x1 : 0.]
Again, the opponent is allowed to reenter g as before. With up-to re-entry, however, the opponent skips nested calls to g, as these do not modify the state.
[Figure: the pruned LTS for M under up-to re-entry, exploring a single application of g with the store remaining x1 : 0 throughout.]
N mirrors the above LTS without the x1 reference and with continuation ⟨•; 0⟩.
6 Symbolic First-Order Transitions
We extend λimp constants (Const) with a countable set of symbolic constants ranged over by κ. We define symbolic environments σ ::= · | (κ ⋈ e), σ, where ⋈ is either = or ≠ and e is an arithmetic expression over constants, and interpret them as conjunctions of (in-)equalities, with the empty environment interpreted as true.

Definition 17 (Satisfiability). Symbolic environment σ is satisfiable if there exists an assignment δ, mapping the symbolic constants of σ to actual constants, such that δσ is a tautology; we then write δ ⊨ σ.
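A small OCaml sketch of symbolic environments and Definition 17, with satisfiability checked against an explicit candidate assignment δ. Hobbit discharges such constraints with Z3; this brute-force checker, whose types and names are our own, is purely illustrative.

(* σ as a list of constraints over symbolic constants κ, read conjunctively. *)
type sym = string
type aexp = K of sym | C of int | Add of aexp * aexp
type constr = Eq of aexp * aexp | Neq of aexp * aexp
type senv = constr list

(* Evaluate an arithmetic expression under a (total) assignment δ. *)
let rec aeval (d : (sym * int) list) = function
  | K k -> List.assoc k d
  | C n -> n
  | Add (a, b) -> aeval d a + aeval d b

let satisfied_by d (sigma : senv) =
  List.for_all
    (function
      | Eq (a, b) -> aeval d a = aeval d b
      | Neq (a, b) -> aeval d a <> aeval d b)
    sigma

(* σ = (κ1 = κ2 + 1), (κ1 ≠ 0) is satisfied by δ = [κ1 ↦ 3; κ2 ↦ 2]. *)
let () =
  let sigma = [ Eq (K "k1", Add (K "k2", C 1)); Neq (K "k1", C 0) ] in
  assert (satisfied_by [ ("k1", 3); ("k2", 2) ] sigma)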
We extend reduction configurations with a symbolic environment σ, written as σ ⊢ ⟨s; e⟩. Symbolic constants are implicitly annotated with their type. We modify the reduction semantics from Fig. 2 to consider symbolic constants:

σ ⊢ ⟨s; op(c̅)⟩ → σ, (κ = op(c̅)) ⊢ ⟨s; κ⟩   if κ fresh
σ ⊢ ⟨s; if κ then e1 else e2⟩ → σ, (κ = tt) ⊢ ⟨s; e1⟩   if σ, (κ = tt) is sat.
σ ⊢ ⟨s; if κ then e1 else e2⟩ → σ, (κ = ff) ⊢ ⟨s; e2⟩   if σ, (κ = ff) is sat.

All other reduction semantics rules carry the σ. The LTS from Sec. 4 is modified to operate over configurations of the form σ ⊢ C or ⟨⊥⟩. We let C̃ range over both forms of configurations. All LTS rules for proponent transitions simply carry the σ; rule Tau may increase σ due to the inner reduction. Opponent transitions generate fresh symbolic constants instead of actual constants: labels app(i, D[α̅]) and ret(D[α̅]) in rules OpApp and OpRet of Fig. 3, respectively, contain D with symbolic, instead of concrete, constants. We adapt (bi-)simulation as follows.
Definition 18. Binary relation R on symbolic configurations is a weak simulation when for all C̃1 R C̃2 and C̃1 −η1→ C̃′1, there exist C̃′2, η2 such that C̃2 =η̂2⇒ C̃′2 and C̃′1 R C̃′2, where (C̃′1.σ, C̃′2.σ) is sat. and ∀δ. δ ⊨ (C̃′1.σ, C̃′2.σ) implies δη1 = δη2.

Lemma 19. (σ1 ⊢ C1) ≈ (σ2 ⊢ C2) iff for all δ ⊨ σ1, σ2 we have δC1 ≈ δC2.

Corollary 20 (Soundness, Completeness). (· ⊢ C1) ≈ (· ⊢ C2) iff C1 ≈ C2.
The up-to techniques we have developed in previous sections apply unmodified to the extended LTS, as the techniques do not involve symbolic constants, with the exception of up to beta, which requires adapting the definition of a beta move to consider all possible δ. The introduction of symbolic first-order transitions allows us to prove many interesting first-order examples, such as the equivalence of bubble sort and insertion sort, an example borrowed from Hector [12] (omitted here, see the Hobbit distribution). Below is a simpler example showing the equivalence of two integer swap functions which, by leveraging Z3 [23], Hobbit is able to prove.
Example 21.

M = let swap xy =
      let (x,y) = xy
      in (y, x)
    in swap
N = fun xy -> let (x,y) = xy in
    ref x = x in ref y = y in
    x := !x - !y; y := !x + !y;
    x := !y - !x; (!x, !y)
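For intuition, the two swap functions of Ex. 21 can be tested concretely in OCaml (Hobbit, of course, proves the equivalence symbolically for all integers; note the arithmetic identities hold exactly even under machine-integer wrap-around). The names swap_pure and swap_refs are ours.

let swap_pure (a, b) = (b, a)

let swap_refs (a, b) =
  let x = ref a and y = ref b in
  x := !x - !y;         (* x = a - b             *)
  y := !x + !y;         (* y = (a - b) + b = a   *)
  x := !y - !x;         (* x = a - (a - b) = b   *)
  (!x, !y)

let () =
  List.iter (fun p -> assert (swap_pure p = swap_refs p))
    [ (0, 0); (1, 2); (-5, 7); (max_int, min_int) ]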
7 Up to State Invariants
The addition of symbolic constants into λimp and the LTS not only allows us to consider all possible opponent-generated constants simultaneously in a symbolic execution of proponent expressions, but also allows us to define an additional powerful up-to technique: up to state invariants. We define this technique in two parts, up to abstraction and up to tautology, realised by abs and taut.⁵

UpToabs:  from (σ1 ⊢ C1) R (σ2 ⊢ C2), infer (σ1 ⊢ C1)[c̅/κ̅] abs(R) (σ2 ⊢ C2)[c̅/κ̅]
UpTotaut: from (σ1, σ′1 ⊢ C1) R (σ2, σ′2 ⊢ C2), where σ1, σ2, σ′1, σ′2 is sat. and σ1, σ2, ¬(σ′1, σ′2) is not sat., infer (σ1 ⊢ C1) taut(R) (σ2 ⊢ C2)
The first function, abs, allows us to derive the equivalence of configurations by abstracting constants with fresh symbolic constants (of the same type) and instead prove equivalent the more abstract configurations. The second function, taut, allows us to introduce tautologies into the symbolic environments. These are predicates which are valid; i.e., they hold for all instantiations of the abstract variables. Combining the two functions we can introduce a tautology I(c̅) into the symbolic environments, and then abstract the constants c̅, in the predicate but also in the configurations, with symbolic ones, obtaining I(κ̅), which encodes an invariant that always holds.

Currently in Hobbit, up to abstraction and tautology are combined and applied in a principled way. Functions can be annotated with the following syntax:

F = fun x {κ̅ | l1 as C1[κ̅], ..., ln as Cn[κ̅] | φ} -> e

The annotation instructs Hobbit to use the two techniques when the opponent applies related functions where at least one of them has such an annotation. If both functions contain annotations, then they are combined and the same κ̅ are used in both annotations. The techniques are used again when the proponent returns from the functions, and when the proponent calls the opponent from within the functions.⁶ As discussed in Sec. 5, the same annotation enables up to re-entry in Hobbit.
When Hobbit uses the above two up-to techniques it 1) pattern-matches the values currently in each location li with the value context Ci, where fresh symbolic constants κ̅ are in its holes, obtaining a substitution [c̅/κ̅]; 2) applies the up to tautology technique for the formula φ[c̅/κ̅]; and 3) applies the up to abstraction technique by replacing φ[c̅/κ̅] in the symbolic environment with φ, and the contents of locations li with Ci[κ̅].
⁵ Hobbit also implements an up to σ-normalisation and garbage collection technique.
⁶ Finer-grain control of application of these up-to techniques is left to future work.
Example 22. Following is an example by Meyer and Sieber [21] featuring location passing, adapted to λimp where locations are local.

M = let loc_eq loc1 loc2 = [. . . ] in
    fun q -> ref x = 0 in
    let locx = (fun () -> !x), (fun v -> x := v) in
    let almostadd_2 locz {w | x as w | w mod 2 == 0} =
      if loc_eq (locx, locz) then x := 1 else x := !x + 2
    in q almostadd_2; if !x mod 2 = 0 then _bot_ else ()
N = fun q -> _bot_
In this example we simulate general references as a pair of read-write functions. Function loc_eq implements a standard location equality test. The two higher-order expressions are equivalent because the opponent can only increase the contents of x through the function almostadd_2. As the number of times the opponent can call this function is unbounded, the LTS is infinite. However, the annotation of function almostadd_2 applies the up to state invariants technique when the function is called (and, less crucially, when it returns), replacing the concrete value of x with a symbolic integer constant w satisfying the invariant w mod 2 == 0. This makes the LTS finite, up to permutations of symbolic constants. Moreover, up to separation removes the outer functions from the Γ environments, thus preventing re-entrant calls to these functions. Note that the up-to techniques are applied even though one of the configurations is diverging (_bot_). This would not be possible with the LTS and bisimulation of [3].
8 Implementation and Evaluation
We implemented the LTS and up-to techniques for λimp in a tool prototype called Hobbit, which we ran on a test suite of 105 equivalences and 68 inequivalences (3338 and 2263 lines of code for equivalences and inequivalences, respectively). Hobbit is bounded in the total number of function calls it explores per path. We ran Hobbit with a default bound of 6 calls except where a larger bound was found to prove or disprove equivalence: 46 examples required a larger bound, and the largest bound used was 348. To illustrate the impact of up-to techniques, we checked all files (pairs of expressions to be checked for equivalence) in five configurations: default (all up-to techniques on), up to separation off, annotations (up to state invariants and re-entry) off, up to re-entry off, and everything off. The tool stops at the first trace that disproves equivalence, after enumerating all traces up to the bound, or after timing out at 150 seconds. Time taken and exit status (equivalent, inequivalent, inconclusive) were recorded for each file; an overview of the experiment can be seen in the following table. All experiments ran on an Ubuntu 18.04 machine with 32 GB RAM and an Intel Core i7 1.90 GHz CPU, with intermediate calls to Z3 4.8.10 to prune invalid internal symbolic branching and decide symbolic bisimulation conditions. All constraints passed to Z3 are propositional satisfiability queries in conjunctive normal form (CNF).
         default          sep. off           annot. off        ree. off          all off
eq.      72 | 0 [5.6s]    32 | 0 [1622.9s]   47 | 0 [178.3s]   57 | 0 [177.6s]   3 | 0 [2098.5s]
ineq.    0 | 68 [20.0s]   0 | 66 [312.8s]    0 | 68 [19.6s]    0 | 68 [20.1s]    0 | 65 [515.7s]

a | b [c] for a (out of 105) equivalences and b (out of 68) inequivalences reported, taking c seconds in total.
We can observe that Hobbit was sound and bounded-complete for our examples: there were no false reports, and all inequivalences were identified. Up-to techniques also had a significant impact on proving equivalence. With all techniques on, Hobbit proved 68.6% of our equivalences, a dramatic improvement over the 2.9% proven with none on. The most significant technique was up-to separation (necessary for 55.6% of the equivalences proven, and reducing time taken by 99.99%), which was useful when functions could be independently explored by the context. Next were annotations (necessary for 34.7% of equivalences, decreasing time by 96.9%) and up-to re-entry (20.8% of files, decreasing time by 96.8%). Although the latter two required manual annotation, they enabled equivalences where our annotation language was able to capture the proof conditions. Note that, since turning off invariant annotations also turns off re-entry, only 10 files needed up-to re-entry on top of invariant annotations. In contrast, inequivalences did not benefit as much. This was expected, as without up-to techniques Hobbit is still based on bounded model checking, which is theoretically sound and complete for inequivalences, and finds the shortest counterexample traces using breadth-first search. Nonetheless, with up-to techniques turned off, inequivalences were discovered in 515.7s (vs. 20s with techniques on) and three files timed out, due to the techniques reducing the size and branching factor of configurations. This suggests that the reduction in state space is still relevant when searching for counterexamples.
9 Comparison with Existing Tools
There are two main classes of tools for contextual equivalence checking. The first
one includes semantics-driven tools that tackle higher-order languages with state
like ours. In this class belong game-based tools Hector [
12
] and Coneqct [
24
],
which can only address carefully crafted fragments of the language, delineated by
type restrictions and bounded data types. The most advanced tool in this class
is SyTeCi [
14
], which is based on logical relations and removes a good part of
the language restrictions needed in the previous tools. The second class concerns
tools that focus on first-order languages, typically variants of C, with main tools
including Rêve [
9
], SymDiff [
17
] and RVT [
11
]. These are highly optimised
for handling internal loops, a problem orthogonal to handling the interactions
between higher-order functions and their environment, addressed by Hobbit and
related tools. We believe the techniques used in these tools may be useful when
adapted to Hobbit, which we leave for future work.
In the higher-order contextual equivalence setting, the most relevant tool to
compare with Hobbit is SyTeCi. This is because SyTeCi supersedes previous
tools by proving examples with fewer syntactical limitations. We ran the tools on
examples from both SyTeCi’s and our own benchmarks—7 and 15 equivalences,
and 2 and 7 inequivalences from SyTeCi and Hobbit respectively—with a
timeout of 150s and using Z3. Unfortunately, due to differences in parsing
and SyTeCi’s syntactical restrictions, the input languages were not entirely
compatible and only few manually translated programs were chosen.
                          SyTeCi               Hobbit
SyTeCi eq. examples       3 | 0 | 4 (0.03s)    1 | 0 | 6 (<0.01s)
Hobbit eq. examples       8 | 0 | 7 (0.4s)     15 | 0 | 0 (<0.01s)
SyTeCi ineq. examples     0 | 2 | 0 (0.06s)    0 | 2 | 0 (0.02s)
Hobbit ineq. examples     2 | 3 | 2 (0.52s)    0 | 7 | 0 (0.45s)

a | b | c (d) for a equivalences, b inequivalences and c inconclusive results reported, taking d seconds in total.
We were unable to translate many of our examples because of restrictions
in the input syntax supported by SyTeCi. Some of these restrictions were
inessential (e.g. absence of tuples) while others were substantial: the tool does not
support programs where references are allocated both inside and outside functions
(e.g. Ex. 15), or with non-synchroniseable recursive calls. Moreover, SyTeCi relies
on Constrained Horn Clause satisfiability which is undecidable. In our testing
SyTeCi sometimes timed out on examples; in private correspondence with its
creator this was attributed to Z3’s ability to solve Constrained Horn Clauses.
Finally, SyTeCi was sound for equivalences, but not always for inequivalences as
can be seen in the table above; the reason is unclear and may be due to bugs. On
the other hand, SyTeCi was able to solve equivalences we are not able to handle;
e.g. synchronisable recursive calls and examples with well-bracketing properties.
10 Conclusion
Our experience with Hobbit suggests that our technique provides a significant
contribution to verification of contextual equivalence. In the higher-order case,
Hobbit does not impose language restrictions as present in other tools. Our
tool is able to solve several examples that cannot be solved by SyTeCi, which
is the most advanced tool in this family. In the first-order case, the problem of
contextual equivalence differs significantly as the interactions that a first-order
expression can have with its context are limited; e.g. equivalence analyses do not
need to consider callbacks or re-entrant calls. Moreover, the distinction between
global and local state is only meaningful in higher-order languages where a
program phrase can invoke different calls of the same function, each with its own
state. Therefore, tools for first-order languages focus on what in our setting are
internal transitions and the complexities arising from e.g. unbounded datatypes
and recursion, whereas we focus on external interactions with the context.
As for limitations, Hobbit does not handle synchronised internal recursion
and well-bracketed state, which SyTeCi can often solve. More generally, Hobbit
is not optimised for internal recursion as first-order tools are. In this work we
have also disallowed reference types in
λimp
to simplify the technical development;
location exchange is encoded via function exchange (cf. Ex. 22). We intend to
address these limitations in future work and explore applications of Hobbit to
real-world examples.
References
1.
Ahmed, A., Dreyer, D., Rossberg, A.: State-dependent representation independence.
In: POPL. Association for Computing Machinery (2009)
2.
Biere, A., Cimatti, A., Clarke, E., Zhu, Y.: Symbolic model checking without BDDs.
In: TACAS. Springer Berlin Heidelberg (1999)
3.
Biernacki, D., Lenglet, S., Polesiuk, P.: A complete normal-form bisimilarity for
state. In: FOSSACS 2019, ETAPS 2019, Prague, Czech Republic. Springer (2019)
4.
Blanchet, B.: A computationally sound mechanized prover for security protocols.
In: IEEE Symposium on Security and Privacy (2006)
5.
Bohr, N., Birkedal, L.: Relational reasoning for recursive types and references. In:
Kobayashi, N. (ed.) APLAS. LNCS, vol. 4279, pp. 79–96. Springer (2006)
6.
Clarke, E., Kroening, D., Lerda, F.: A tool for checking ANSI-C programs. In:
TACAS. Springer Berlin Heidelberg (2004)
7.
Cordeiro, L., Kroening, D., Schrammel, P.: JBMC: Bounded model checking for
Java Bytecode. In: TACAS. Springer (2019)
8.
Dimovski, A.: Program verification using symbolic game semantics. TCS 560 (2014)
9.
Felsing, D., Grebing, S., Klebanov, V., Rümmer, P., Ulbrich, M.: Automating
regression verification. In: ACM/IEEE ASE ’14. ACM (2014)
10.
Godlin, B., Strichman, O.: Inference rules for proving the equivalence of recursive
procedures. Acta Informatica 45(6) (2008)
11. Godlin, B., Strichman, O.: Regression verification. In: DAC. ACM (2009)
12.
Hopkins, D., Murawski, A.S., Ong, C.L.: Hector: An equivalence checker for a
higher-order fragment of ML. In: CAV. LNCS, Springer (2012)
13.
Hur, C.K., Dreyer, D., Neis, G., Vafeiadis, V.: The marriage of bisimulations and
Kripke logical relations. SIGPLAN Not. (2012)
14.
Jaber, G.: SyTeCi: Automating contextual equivalence for higher-order programs
with references. Proc. ACM Program. Lang. 4(POPL) (2020)
15.
Jaber, G., Tabareau, N.: Kripke open bisimulation - A marriage of game semantics
and operational techniques. In: APLAS. Springer (2015)
16.
Koutavas, V., Wand, M.: Small bisimulations for reasoning about higher-order
imperative programs. In: POPL. ACM (2006)
17.
Lahiri, S.K., Hawblitzel, C., Kawaguchi, M., Rebêlo, H.: SYMDIFF: A language-
agnostic semantic diff tool for imperative programs. In: CAV. Springer (2012)
18.
Laird, J.: A fully abstract trace semantics for general references. In: ICALP, Wroclaw,
Poland. LNCS, Springer (2007)
19.
Lassen, S.B., Levy, P.B.: Typed normal form bisimulation. In: Computer Science
Logic. Springer Berlin Heidelberg (2007)
20.
Lin, Y., Tzevelekos, N.: Symbolic execution game semantics. In: FSCD. Schloss
Dagstuhl - Leibniz-Zentrum für Informatik (2020)
21.
Meyer, A.R., Sieber, K.: Towards fully abstract semantics for local variables. In:
POPL. Association for Computing Machinery (1988)
22.
Morris, Jr., J.H.: Lambda Calculus Models of Programming Languages. Ph.D.
thesis, MIT, Cambridge, MA (1968)
23.
de Moura, L., Bjørner, N.: Z3: An efficient SMT solver. In: Ramakrishnan, C.R.,
Rehof, J. (eds.) Tools and Algorithms for the Construction and Analysis of Systems.
pp. 337–340. Springer Berlin Heidelberg, Berlin, Heidelberg (2008)
24.
Murawski, A.S., Ramsay, S.J., Tzevelekos, N.: A contextual equivalence checker for
IMJ*. In: ATVA. Springer (2015)
25. Murawski, A.S., Tzevelekos, N.: Nominal game semantics. FTPL 2(4) (2016)
26.
Patterson, D., Ahmed, A.: The next 700 compiler correctness theorems (functional
pearl). Proc. ACM Program. Lang. 3(ICFP) (2019)
27. Pous, D.: Coinduction all the way up. In: ACM/IEEE LICS. ACM (2016)
28.
Pous, D., Sangiorgi, D.: Enhancements of the bisimulation proof method. In:
Advanced Topics in Bisimulation and Coinduction. CUP (2012)
29.
Sangiorgi, D., Kobayashi, N., Sumii, E.: Environmental bisimulations for higher-
order languages. In: LICS. IEEE Computer Society (2007)
30.
Schrammel, P., Kroening, D., Brain, M., Martins, R., Teige, T., Bienmüller, T.:
Successful use of incremental BMC in the automotive industry. In: FMICS (2015)
Open Access This chapter is licensed under the terms of the Creative Commons
Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/),
which permits use, sharing, adaptation, distribution and reproduction in any medium or
format, as long as you give appropriate credit to the original author(s) and the source,
provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the
chapter’s Creative Commons license, unless indicated otherwise in a credit line to the
material. If material is not included in the chapter’s Creative Commons license and
your intended use is not permitted by statutory regulation or exceeds the permitted
use, you will need to obtain permission directly from the copyright holder.
Equivalence Checking for Orthocomplemented
Bisemilattices in Log-Linear Time
Simon Guilloud(✉) and Viktor Kunčak
EPFL IC LARA, Station 14, CH-1015 Lausanne, Switzerland
{Simon.Guilloud,Viktor.Kuncak}@epfl.ch
Abstract. Motivated by proof checking, we consider the problem of efficiently establishing equivalence of propositional formulas by relaxing the completeness requirements while still providing certain guarantees. We present a quasilinear time algorithm to decide the word problem on a natural class of algebraic structures we call orthocomplemented bisemilattices, a subtheory of Boolean algebra. The starting point for our procedure is a variation of the Aho–Hopcroft–Ullman algorithm for isomorphism of trees, which we generalize to directed acyclic graphs. We combine this algorithm with a term rewriting system we introduce to decide equivalence of terms. We prove that our rewriting system is terminating and confluent, implying the existence of a normal form. We then show that our algorithm computes this normal form in log-linear (and thus sub-quadratic) time. We provide pseudocode and a minimal working implementation in Scala.
1 Introduction
Reasoning about propositional logic and its extensions is a basis of many verification
algorithms [19]. Propositional variables may correspond to, for example, sub-formulas
in first-order logic theories of SMT solvers [2,5,26], hypotheses and lemmas inside proof
assistants [13,27,32], or abstractions of sets of states. In particular, it is often of interest
to establish that two propositional formulas are equivalent. The equivalence problem
for propositional logic is coNP-complete as a negation of propositional satisfiability [8].
From proof complexity point of view [18] many known proof systems, including (non-
extended) resolution [31] and cutting planes [29] have exponential-sized shortest proofs
for certain propositional formulas. SAT and SMT solvers rely on DPLL-style algorithms
[9,10] and do not have polynomial run-time guarantees on equivalence checking, even if
formulas are syntactically close. Proof assistants implement such algorithms as tactics,
so they have similar difficulties. A consequence of this is that implemented systems may
take a very long time (or fail to acknowledge) that a large formula is equivalent to its
minor variant differing in, for example, reordering of internal conjuncts or disjuncts.
Similar situations also arise in program verifiers [12,21,30,34,35], where assertions act
as lemmas in a proof.
We acknowledge the financial support of the Swiss National Science Foundation project 200021_197288 “A Foundational Verifier”.
© The Author(s) 2022
D. Fisman and G. Rosu (Eds.): TACAS 2022, LNCS 13244, pp. 196–214, 2022.
https://doi.org/10.1007/978-3-030-99527-0_11
It is thus natural to ask for an approximation of the propositional equivalence prob-
lem: can we find an expressive theory supporting many of the algebraic laws of Boolean
algebra but for which we can still have a complete and efficient algorithm for formula
equivalence? By efficient, we mean about as fast, up to logarithmic factors, as the simple
linear-time syntactic comparison of formula trees.
We can use such an efficient equivalence algorithm to construct more flexible proof systems. Consider any sound proof system for propositional logic and replace the notion of identical sub-formulas with our notion of fast equivalence. For example, the axiom schema 𝑝 → (𝑞 → 𝑝) becomes 𝑝 → (𝑞 → 𝑝′) for all equivalent 𝑝 and 𝑝′. The new system remains sound. It accepts all the previously admissible inference steps, but also some new ones, which makes it more flexible.
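As a sketch of that relaxation (in OCaml, with hypothetical names; the paper's implementation is in Scala), a checker for instances of the schema 𝑝 → (𝑞 → 𝑝) can take the equivalence test as a parameter: syntactic equality gives the classical rule, while an OCBSL equivalence check gives the more flexible one.

type formula = Var of string | Implies of formula * formula  (* fragment only *)

(* Accept f as an instance of p -> (q -> p'), where p and p' need only be
   equivalent under the supplied test, not syntactically identical. *)
let matches_axiom_k (equiv : formula -> formula -> bool) (f : formula) =
  match f with
  | Implies (p, Implies (_, p')) -> equiv p p'
  | _ -> false

(* Instantiated with syntactic equality: *)
let _ = matches_axiom_k (=) (Implies (Var "a", Implies (Var "b", Var "a")))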
L1: 𝑥 ⊔ 𝑦 = 𝑦 ⊔ 𝑥                     L1′: 𝑥 ∧ 𝑦 = 𝑦 ∧ 𝑥
L2: 𝑥 ⊔ (𝑦 ⊔ 𝑧) = (𝑥 ⊔ 𝑦) ⊔ 𝑧         L2′: 𝑥 ∧ (𝑦 ∧ 𝑧) = (𝑥 ∧ 𝑦) ∧ 𝑧
L3: 𝑥 ⊔ 𝑥 = 𝑥                         L3′: 𝑥 ∧ 𝑥 = 𝑥
L4: 𝑥 ⊔ 1 = 1                         L4′: 𝑥 ∧ 0 = 0
L5: 𝑥 ⊔ 0 = 𝑥                         L5′: 𝑥 ∧ 1 = 𝑥
L6: ¬¬𝑥 = 𝑥                           L6′: same as L6
L7: 𝑥 ⊔ ¬𝑥 = 1                        L7′: 𝑥 ∧ ¬𝑥 = 0
L8: ¬(𝑥 ⊔ 𝑦) = ¬𝑥 ∧ ¬𝑦                L8′: ¬(𝑥 ∧ 𝑦) = ¬𝑥 ⊔ ¬𝑦
Table 1. Laws of the algebraic structures (𝑆, ∧, ⊔, 0, 1, ¬). Our algorithm is complete (and log-linear time) for structures that satisfy laws L1–L8 and L1′–L8′. We call these structures orthocomplemented bisemilattices (OCBSL).
L9: 𝑥 ⊔ (𝑥 ∧ 𝑦) = 𝑥                    L9′: 𝑥 ∧ (𝑥 ⊔ 𝑦) = 𝑥
L10: 𝑥 ⊔ (𝑦 ∧ 𝑧) = (𝑥 ⊔ 𝑦) ∧ (𝑥 ⊔ 𝑧)   L10′: 𝑥 ∧ (𝑦 ⊔ 𝑧) = (𝑥 ∧ 𝑦) ⊔ (𝑥 ∧ 𝑧)
Table 2. Neither the absorption law (L9, L9′) nor distributivity (L10, L10′) holds in OCBSL. Without
L9, L9′, the operations ∧ and ⊔ induce different partial orders. If an OCBSL satisfies L10, L10′,
then it also satisfies L9, L9′ and is precisely a Boolean algebra.
1.1 Problem Statement
This paper proposes to approximate propositional formula equivalence using a new al-
gorithm that solves exactly the word problem for structures we call orthocomplemented
bisemilattices (axiomatized in Table 1), in only log-linear time. In general, the word
problem for an algebraic theory with signature 𝑆 and axioms 𝐴 is the problem of de-
termining, given two terms 𝑡₁ and 𝑡₂ in the language of 𝑆 with free variables, whether
𝑡₁ = 𝑡₂ is a consequence of the axioms. Our main interest in the problem is that ortho-
complemented bisemilattices (OCBSL) are a generalisation of Boolean algebra. This
structure satisfies a weaker set of axioms that omits the distributivity law as well as its
weaker variant, the absorption law (Table 2). Hence, this problem is a relaxation “up
to distributivity” of the propositional formula equivalence. A positive answer implies
formulas are equivalent in all Boolean algebras, hence also in propositional logic.
Definition 1 (Word Problem for Orthocomplemented Bisemilattices). Consider the
signature with two binary operations ∧, ⊔, a unary operation ¬, and constants 0, 1. The
OCBSL word problem is the problem of determining, given two terms 𝑡₁ and 𝑡₂ in this
signature, containing free variables, whether 𝑡₁ = 𝑡₂ is a consequence (in the sense
of first-order logic with equality) of the universally quantified axioms L1–L8, L1′–L8′ in
Table 1.
Contribution. We present an 𝒪(𝑛 log²(𝑛)) algorithm for the word problem of orthocom-
plemented bisemilattices. In the process, we introduce a confluent and terminating rewriting
system for OCBSL on terms modulo commutativity. We analyze the algorithm to show
its correctness and complexity. We present its executable description and a Scala imple-
mentation at https://github.com/epfl-lara/OCBSL.
1.2 Related Work
The word problem on lattices has been studied in the past. The structure we consider
is, in general, not a lattice. Whitman [33] showed decidability of the word problem on
free lattices, essentially by showing that the natural order relation on lattices between two
words can be decided by an exhaustive search. The word problem on orthocomplemented
lattices has typically been solved by defining a suitable sequent calculus for the order
relation with a cut rule for transitivity [4,17]. Because a cut elimination theorem can be
proved similarly to the original from Gentzen [11], the proof space is finite and a proof
search procedure can decide validity of the implication in the logic, which translates to
the original word problem.
The word problem for free lattices was shown to be in PTIME by Hunt et al. [15],
and the word problem for orthocomplemented lattices was shown to be in PTIME by
Meinander [25]. Those algorithms essentially rely on proof-search methods similar to
the previous ones, but bound the search space. These results make no mention of a spe-
cific degree of the polynomial; our analysis suggests that, as described, these algorithms
run in 𝒪(𝑛⁴). Related techniques of locality have been applied more broadly and also
yield polynomial bounds, with the specific exponents depending on the local Horn clauses
that axiomatize the theory [3,24].
Aside from its use in equivalence checking, the problem is also of indepen-
dent interest because OCBSL are a natural weakening of Boolean Algebra and ortho-
complemented lattices. They are dual to complemented lattices in the sense illustrated
by Figure 1. A slight weakening of OCBSL, called de Morgan bisemilattice, has been
used to simulate electronic circuits [6,22]. OCBSL may be applicable in this scenario
as well. Moreover, our algorithm can also be adapted to decide, in log-linear time, the
word problem for this weaker theory.
To the best of our knowledge, no solution was presented in the past for the word
problem for orthocomplemented bisemilattices (OCBSL). Moreover, we are not aware
of log-linear algorithms for the related, previously studied theories either.
1.3 Overview of the Algorithm
It is common to represent a term, like a Boolean formula, as an abstract syntax tree.
In such a tree, a node corresponds to either a function symbol, a constant symbol or a
variable, and the children of a function node represent the arguments of the function. In
general, for a function symbol 𝑓, the trees 𝑓(𝑥, 𝑦) and 𝑓(𝑦, 𝑥) are distinct; the children of a
node are stored in a specific order. Commutativity of a function symbol 𝑓corresponds to
the fact that children of a node labelled by 𝑓are instead unordered. Our algorithm thus
uses as its starting point a variation of the algorithm of Aho, Hopcroft, and Ullman [14]
for tree isomorphism, as it corresponds to deciding equality of two terms modulo com-
mutativity. However, the theory we consider contains many more axioms than merely
commutativity. Our approach is to find an equivalent set of reduction rules, themselves
understood modulo commutativity, that is suitable to compute a normal form of a given
formula with respect to those axioms using the ideas of term rewriting [1]. The interest
of tree isomorphism in our approach is two-fold: first, it helps to find application cases
of our reduction rules, and second, it compares the two terms of our word problem. In
the final algorithm, both aspects are realized simultaneously.
[Figure 1: three diagrams relating the orders 𝑎 ⊑ 𝑏 and ¬𝑏 ⊑ ¬𝑎 in (a) complemented lattices, (b) orthocomplemented bisemilattices, and (c) orthocomplemented lattices.]
Fig. 1. Bisemilattices satisfying absorption or de Morgan laws.
2 Preliminaries
2.1 Lattices and Bisemilattices
To define and situate our problem, we present a collection of algebraic structures satis-
fying certain subsets of the laws in Tables 1 and 2.
A structure (𝑆, ∧) that is commutative (L1′), associative (L2′), and idempotent (L3′) is
a semilattice. A semilattice induces a partial order relation ⊑ on 𝑆 defined by 𝑎 ⊑ 𝑏 ⟺
(𝑎 ∧ 𝑏) = 𝑎. Indeed, one can verify that (∃𝑐. (𝑏 ∧ 𝑐) = 𝑎) ⟺ (𝑏 ∧ 𝑎) = 𝑎, from which tran-
sitivity follows. Antisymmetry is immediate. In such a partially ordered set (poset) 𝑆, two
elements 𝑎 and 𝑏 always have a greatest lower bound, or 𝑔𝑙𝑏, 𝑎 ∧ 𝑏. Conversely, a poset
such that any two elements have a 𝑔𝑙𝑏 is always a semilattice. A structure (𝑆, ∧, 0, 1) that
satisfies L1′, L2′, L3′, L4′, and L5′ is a bounded upper-semilattice. Equivalently, 1 is the
maximum element and 0 the minimum element in the corresponding poset. Similarly,
a structure (𝑆, ⊔, 0, 1) that satisfies L1 to L5 is a bounded lower-semilattice. In that
case, we write the corresponding ordering relation ⊑′. Note that it points in the direc-
tion opposite to ⊑, so that 1 is always the “maximum” element and 0 the “minimum”
element. A structure (𝑆, ∧, ⊔) is a bisemilattice if (𝑆, ∧) is an upper semilattice and
(𝑆, ⊔) a lower semilattice. There are in general no specific laws relating the two semi-
lattices of a bisemilattice. They can be the same semilattice or completely different. If
the bisemilattice satisfies the absorption law (L9), then the two semilattices are related
in such a way that (𝑎 ∧ 𝑏) = 𝑎 ⟺ (𝑎 ⊔ 𝑏) = 𝑏, i.e. the two orders ⊑ and ⊑′ are equal and the
structure is called a lattice. A bisemilattice is consistently bounded if both semilattices
are bounded and if 0_∧ = 0_⊔ = 0 and 1_∧ = 1_⊔ = 1, which will be the case in this
paper. A structure (𝑆, ∧, ⊔, ¬, 0, 1) that satisfies L1 to L7 and L1′ to L7′ is called a com-
plemented bisemilattice, with complement operation ¬. A complemented bisemilattice
satisfying de Morgan’s law (L8 and L8′) is an orthocomplemented bisemilattice; this
implies ¬0 = ¬(¬1 ∧ 0) = ¬¬1 ⊔ ¬0 = 1. A structure satisfying L1–L9 and L1′–L9′ is an
orthocomplemented lattice. Both de Morgan laws (L8, L8′) and absorption laws (L9
and L9′) relate the two semilattices, in a way summarised in Figure 1. In bisemilattices,
orthocomplementation is (merely) equivalent to 𝑎 ⊑ 𝑏 ⟺ ¬𝑏 ⊑′ ¬𝑎. Indeed, we have:

𝑎 ⊑ 𝑏  ⟺(def)  𝑎 ∧ 𝑏 = 𝑎  ⟺(L8)  ¬𝑎 ⊔ ¬𝑏 = ¬𝑎  ⟺(def)  ¬𝑏 ⊑′ ¬𝑎

In the presence of L1–L8, L1′–L8′, the law of absorption (L9 and L9′) is implied
by distributivity. In fact, an orthocomplemented bisemilattice with distributivity is a
lattice and even a Boolean algebra. In this sense, we can consider orthocomplemented
bisemilattices as “Boolean algebra without distributivity”.
2.2 Term Rewriting Systems
We next review the basics of term rewriting systems. For a more complete treatment, see [1].
Definition 2. A term rewriting system is a list of rewriting rules of the form 𝑒ₗ = 𝑒ᵣ,
with the meaning that an occurrence of 𝑒ₗ in a term 𝑡 can be replaced by 𝑒ᵣ. 𝑒ₗ and 𝑒ᵣ
can contain free variables. To apply the rule, 𝑒ₗ is unified with a subterm of 𝑡, and that
subterm is replaced by 𝑒ᵣ under the same unifier. If applying a rewriting rule to 𝑡₁ yields
𝑡₂, we say that 𝑡₁ reduces to 𝑡₂ and write 𝑡₁ → 𝑡₂. We denote by →* the transitive closure
of → and by ↔* its transitive symmetric closure.
An axiomatic system such as L1–L9, L1′–L9′ induces a term rewriting system, inter-
preting equalities from left to right. In that case, 𝑡₁ ↔* 𝑡₂ coincides with the validity of
the equality 𝑡₁ = 𝑡₂ in the theory given by the axioms [1, Theorem 3.1.12].
Definition 3. A term rewriting system is terminating if there exists no infinite chain of
reducing terms 𝑡₁ → 𝑡₂ → 𝑡₃ → ⋯.

Fact 1. If there is a well-founded order < (or, in particular, a measure 𝑚) on terms such
that 𝑡₁ → 𝑡₂ ⟹ 𝑡₂ < 𝑡₁ (or, in particular, 𝑚(𝑡₂) < 𝑚(𝑡₁)), then the term rewriting
system is terminating.
Definition 4. A term rewriting system is confluent iff for all 𝑡₁, 𝑡₂, 𝑡₃: 𝑡₁ →* 𝑡₂ ∧ 𝑡₁ →* 𝑡₃
implies ∃𝑡₄. 𝑡₂ →* 𝑡₄ ∧ 𝑡₃ →* 𝑡₄.

Theorem 1 (Church–Rosser Property) [1, Chapter 2]. A term rewriting system is
confluent if and only if ∀𝑡₁, 𝑡₂. (𝑡₁ ↔* 𝑡₂) ⟹ (∃𝑡₃. 𝑡₁ →* 𝑡₃ ∧ 𝑡₂ →* 𝑡₃).
A terminating and confluent term rewriting system directly implies decidability of
the word problem for the underlying structure, as it makes it possible to compute the
normal form of two terms to check if they are equivalent. Note that commutativity is not
a terminating rewriting rule, but similar results hold if we consider the set of all terms,
as well as rewrite rules, modulo commutativity [1, Chapter 11], [28]. To efficiently ma-
nipulate terms modulo commutativity and achieve log-linear time, we will employ an
algorithm for comparing trees with unordered children.
3 Directed Acyclic Graph Equivalence
The structure of formulas with commutative nodes corresponds to the usual mathematical
definition of a labelled rooted tree, i.e. an acyclic graph with one distinguished vertex
(root) where there is no order on the children of a node. For this reason, we use as our
starting point the algorithm of Hopcroft, Ullman and Aho for tree isomorphism [14, Page
84, Example 3.2], which has also been studied subsequently [7,23].
To account for structure sharing, we further generalize this representation to singly-
rooted, labeled, Directed Acyclic Graphs, which we simply call DAGs. Our DAGs gener-
alize rooted directed trees. Any DAG can be transformed into a rooted tree by duplicating
subgraphs corresponding to nodes with multiple parents, as in Figure 2. This transforma-
tion in general results in an exponential blowup in the number of nodes. Dually, using
DAGs instead of trees can exponentially shrink the space needed to represent certain terms.
Fig. 2. A DAG and the corresponding tree.
Fig. 3. Two equivalent DAGs with different numbers of nodes.
Checking for equality between ordered trees or DAGs is easy in linear time: we
simply recursively check equality between the children of two nodes.
Definition 5. Two ordered nodes 𝜏 and 𝜋 with children 𝜏₀, ..., 𝜏ₘ and 𝜋₀, ..., 𝜋ₙ are
equivalent (noted 𝜏 ∼ 𝜋) iff
𝑙𝑎𝑏𝑒𝑙(𝜏) = 𝑙𝑎𝑏𝑒𝑙(𝜋), 𝑚 = 𝑛, and ∀𝑖 < 𝑛. 𝜏ᵢ ∼ 𝜋ᵢ
For unordered trees or DAGs, equivalence checking is less trivial, as the naive al-
gorithm has exponential complexity due to the need to find an adequate permutation.

Definition 6. Two unordered nodes 𝜏 and 𝜋 with children 𝜏₀, ..., 𝜏ₘ and 𝜋₀, ..., 𝜋ₙ are
equivalent (noted 𝜏 ∼ 𝜋) iff
𝑙𝑎𝑏𝑒𝑙(𝜏) = 𝑙𝑎𝑏𝑒𝑙(𝜋), 𝑚 = 𝑛, and there exists a permutation 𝑝 s.t. ∀𝑖 < 𝑛. 𝜏_{𝑝(𝑖)} ∼ 𝜋ᵢ

For trees, note that this definition of equivalence corresponds exactly to isomor-
phism. It is known that DAG isomorphism is GI-complete, so it is conjectured to have
complexity greater than PTIME. Fortunately, this does not prevent our solution, because
our notion of equivalence on DAGs is not the same as isomorphism on DAGs. In partic-
ular, two DAGs can be equivalent without having the same number of nodes, i.e. without
being isomorphic, as Figure 3 illustrates.
Algorithm 1: Unordered DAG equivalence. The operator ++ is concatenation.
input : two unordered DAGs 𝜏 and 𝜋
output: True if 𝜏 and 𝜋 are equivalent, False otherwise.
1  codes ← HashMap[(String, List[Int]), Int]
2  map ← HashMap[Node, Int]
3  𝑠𝜏: List ← ReverseTopologicalOrder(𝜏)
4  𝑠𝜋: List ← ReverseTopologicalOrder(𝜋)
5  for (𝑛: Node in 𝑠𝜏 ++ 𝑠𝜋) do
6      𝑙ₙ ← [map(𝑐) for 𝑐 in children(𝑛)]
7      𝑟ₙ ← (label(𝑛), sort(𝑙ₙ))
8      if codes contains 𝑟ₙ then
9          map(𝑛) ← codes(𝑟ₙ)
10     else
11         codes(𝑟ₙ) ← codes.size
12         map(𝑛) ← codes(𝑟ₙ)
13     end
14 end
15 return map(𝜏) == map(𝜋)
Algorithm 1 is the generalization of Hopcroft, Ullman, and Aho’s algorithm. It de-
cides in log-linear time whether two labelled (unordered) DAGs are equivalent according
to Definition 6. The algorithm generalizes straightforwardly to DAGs with a mix of ordered
and unordered nodes: if a node is ordered, we skip the sorting operation in line 7.
The algorithm works bottom-up. We first sort the DAG in reverse topological
order using, for example, Kahn’s algorithm [16]. This way, we explore the DAG starting
from a leaf and finishing with the root. It is guaranteed that when we treat a node, all its
children have already been treated.
The algorithm recursively assigns codes to the nodes of both DAGs. In
the unlabelled case:
– The first node, necessarily a leaf, is assigned the integer 0.
– The second node is assigned 0 if it is a leaf, or 1 if it is a parent of the first node.
– For any node, the algorithm makes a list of the integers assigned to that node’s chil-
dren and sorts it (if the node is commutative). We call this the signature of the node.
Then it checks whether that signature has already been seen. If yes, it assigns to the node the
number that has been given to other nodes with the same signature. Otherwise, it
assigns a new integer to that node and its signature.
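To make this concrete, here is a minimal executable Scala sketch of Algorithm 1 (our illustration, not the paper’s implementation; the Node class, its string labels, and the recursive post-order traversal are assumptions made for brevity, whereas the paper uses Kahn’s algorithm [16] for the reverse topological order):

    import scala.collection.mutable

    // Sketch of Algorithm 1: assign integer codes bottom-up; two nodes receive
    // the same code iff they are equivalent in the sense of Definition 6.
    // Children are unordered, hence the sort of child codes. Nodes are memoized
    // by reference so that shared subgraphs of a DAG are coded only once.
    final class Node(val label: String, val children: List[Node])

    def equivalent(tau: Node, pi: Node): Boolean = {
      val codes = mutable.HashMap.empty[(String, List[Int]), Int] // signature -> code
      val memo  = new java.util.IdentityHashMap[Node, Integer]()  // node -> code
      def code(n: Node): Int = {
        val cached = memo.get(n)
        if (cached != null) cached.intValue
        else {
          // The label together with the sorted child codes is the node's signature.
          val sig = (n.label, n.children.map(code).sorted)
          val c = codes.getOrElseUpdate(sig, codes.size) // fresh code for new signatures
          memo.put(n, c)
          c
        }
      }
      code(tau) == code(pi)
    }

For instance, the sketch reports 𝑓(𝑥, 𝑦) and 𝑓(𝑦, 𝑥) as equivalent, since after sorting both yield the same signature.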
Lemma 1 (Algorithm 1 Correctness). The codes assigned to any two nodes 𝑛 and 𝑚
of 𝑠𝜏 ++ 𝑠𝜋 are equal if and only if 𝑛 ∼ 𝑚.

Proof. Let 𝑛 and 𝑚 denote any two DAG nodes. By induction on the height of 𝑛:
– In the case where 𝑛 is a leaf, we have 𝑟ₙ = (𝑙𝑎𝑏𝑒𝑙(𝑛), 𝑁𝑖𝑙). Note that for any node
𝑛, 𝑚𝑎𝑝(𝑛) = codes(𝑟ₙ). Since every time the map codes is updated, it is with a
completely new number, codes(𝑟ₙ) = codes(𝑟ₘ) if and only if 𝑟ₙ = 𝑟ₘ, i.e. iff
𝑙𝑎𝑏𝑒𝑙(𝑚) = 𝑙𝑎𝑏𝑒𝑙(𝑛) and 𝑚 has no children (like 𝑛).
– In the case where 𝑛 has children 𝑛ᵢ, again codes(𝑟ₙ) = codes(𝑟ₘ) if and only if
𝑟ₘ = 𝑟ₙ, which is equivalent to 𝑙𝑎𝑏𝑒𝑙(𝑚) = 𝑙𝑎𝑏𝑒𝑙(𝑛) and 𝑠𝑜𝑟𝑡(𝑙ₘ) = 𝑠𝑜𝑟𝑡(𝑙ₙ). This
means there is a permutation 𝑝 of the children of 𝑛 such that ∀𝑖. codes(𝑛_{𝑝(𝑖)}) =
codes(𝑚ᵢ). By the induction hypothesis, this is equivalent to ∀𝑖. 𝑛_{𝑝(𝑖)} ∼ 𝑚ᵢ. Hence we
find that 𝑚𝑎𝑝(𝑛) = 𝑚𝑎𝑝(𝑚) if and only if both:
1. their labels are equal, and
2. there exists a permutation 𝑝 s.t. ∀𝑖. 𝑛_{𝑝(𝑖)} ∼ 𝑚ᵢ,
i.e. 𝑛 and 𝑚 have the same code if and only if 𝑛 ∼ 𝑚.
Corollary 1. The algorithm returns True if and only if 𝜏 ∼ 𝜋.
Time Complexity. Using Kahn’s algorithm, sorting 𝜏 and 𝜋 is done in linear time. Then
the loop touches every node a single time. Inside the loop, the first line takes linear time
in the number of children of the node, and the second line takes log-linear
time in the number of children. Since we use HashMaps, the last instructions
take effectively constant time (because the hash code is computed from the address of the
node, not its content).
For a general DAG, the algorithm thus runs in time at most log-quadratic in the number
of nodes. Note however that for DAGs with a bounded number of children per node, as well
as for DAGs with a bounded number of parents per node, the algorithm is log-linear. In
fact, the algorithm is log-linear with respect to the total number of edges in the graph.
For this reason, the algorithm is still only log-linear in the input size. It also follows that
the algorithm is always at most log-linear with respect to the tree or formula underlying
the DAG, which may be much larger than the DAG itself. Moreover, there exist cases
where the algorithm is log-linear in the number of nodes but the underlying tree is
exponentially larger; the full binary symmetric graph is such an example.
4 Word Problem on Orthocomplemented Bisemilattices
We will use the previous algorithm for DAG equivalence, applied to a formula in the
language of bisemilattices (𝑆, ∧, ⊔), to account for commutativity (axioms L1, L1′), but
we need to combine it with the remaining axioms. From now on we work with axioms
L1–L8, L1′–L8′ in Table 1. The plan is to express those axioms as reduction rules. Of
the rules L2–L8 and L2′–L8′, all but L8 and L8′ reduce the size of the term when applied
from left to right, and hence seem suitable as rewrite rules.
It may seem that the simplest way to deal with the de Morgan law is to use it (along
with double negation elimination) to transform all terms into negation normal form. It
happens, however, that doing this causes trouble when trying to detect application cases
of rule L7 (complementation). Indeed, consider the following term:

𝑓 = (𝑎 ∧ 𝑏) ⊔ ¬(𝑎 ∧ 𝑏)

Using complementation it clearly reduces to 1, but pushing into negation normal form,
it would first be transformed to (𝑎 ∧ 𝑏) ⊔ (¬𝑎 ⊔ ¬𝑏). To detect that these two disjuncts
are actually opposite requires recursively verifying that ¬(𝑎 ∧ 𝑏) = (¬𝑎 ⊔ ¬𝑏).
It is actually simpler to apply the de Morgan law the following way:

𝑥 ∧ 𝑦 = ¬(¬𝑥 ⊔ ¬𝑦)

Instead of removing negations from the formula, we remove one of the binary semilattice
operators. (Which one we keep is arbitrary; we chose to keep ⊔.) Now, when we check if
rule L7 can be applied to a disjunction node (i.e. whether it has two children 𝑦 and 𝑧 such that 𝑦 = ¬𝑧),
there are two cases for each child 𝑥: if 𝑥 is not itself a negation, i.e. it starts with ⊔, we compute the code of ¬𝑥
from the code of 𝑥 in constant time. If 𝑥 = ¬𝑥′ then ¬𝑥 ∼ 𝑥′, so the code of ¬𝑥 is simply
the code of 𝑥′, in constant time as well. Hence we obtain the codes of all children and
their negations, and we can sort those codes to look for collisions, all of it in time linear
in the number of children.
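As a small illustration of this transformation (a sketch under our own term representation, mirroring the case classes of Figure 7 but with a hypothetical Conjunction constructor that the rest of the algorithm never sees):

    // Eliminate the second semilattice operator using x ∧ y = ¬(¬x ⊔ ¬y), keeping ⊔.
    sealed trait Term
    case class Variable(id: String)              extends Term
    case class Literal(b: Boolean)               extends Term
    case class Negation(child: Term)             extends Term
    case class Disjunction(children: List[Term]) extends Term
    case class Conjunction(children: List[Term]) extends Term // removed below

    def removeConj(t: Term): Term = t match {
      case Variable(_) | Literal(_) => t
      case Negation(c)              => Negation(removeConj(c))
      case Disjunction(cs)          => Disjunction(cs.map(removeConj))
      // ∧(x1, ..., xn) becomes ¬⊔(¬x1, ..., ¬xn)
      case Conjunction(cs)          => Negation(Disjunction(cs.map(c => Negation(removeConj(c)))))
    }

After this pass, every node is a variable, a constant, a negation, or a disjunction, i.e. a term of the language (𝑆, ⊔, ¬, 0, 1) over which the rules below operate.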
We now restate the axioms L1–L8, L1′–L8′ in this updated language in Table 3.

A1: ⊔(..., 𝑥ᵢ, 𝑥ⱼ, ...) = ⊔(..., 𝑥ⱼ, 𝑥ᵢ, ...)      A1′: ¬⊔(¬𝑥, ¬𝑦) = ¬⊔(¬𝑦, ¬𝑥)
A2: ⊔(𝑥, ⊔(𝑦)) = ⊔(𝑥, 𝑦)                          A2′: ¬⊔(¬𝑥, ¬¬⊔(¬𝑦)) = ¬⊔(¬𝑥, ¬𝑦)
    ⊔(𝑥) = 𝑥
A3: ⊔(𝑥, 𝑥, 𝑦) = ⊔(𝑥, 𝑦)                          A3′: ¬⊔(¬𝑥, ¬𝑥, ¬𝑦) = ¬⊔(¬𝑥, ¬𝑦)
A4: ⊔(1, 𝑥) = 1                                    A4′: ¬⊔(¬0, ¬𝑦) = 0
A5: ⊔(0, 𝑥) = ⊔(𝑥)                                 A5′: ¬⊔(¬1, ¬𝑥) = ¬⊔(¬𝑥)
A6: ¬¬𝑥 = 𝑥
A7: ⊔(𝑥, ¬𝑥, 𝑦) = 1                                A7′: ¬⊔(¬𝑥, ¬¬𝑥, ¬𝑦) = 0
A8: ¬⊔(𝑥₁, ..., 𝑥ᵢ) = ¬⊔(¬¬𝑥₁, ..., ¬¬𝑥ᵢ)          A8′: ¬¬⊔(¬𝑥₁, ..., ¬𝑥ᵢ) = ⊔(¬𝑥₁, ..., ¬𝑥ᵢ)
Table 3. Laws of algebraic structures (𝑆, ⊔, 0, 1, ¬), equivalent to L1–L8, L1′–L8′ under the de Morgan
transformation.
It is straightforward and not surprising that axiom A8, as well as A1′–A8′, all follow
from axioms A1–A7, so A1–A7 are actually complete for our theory.
4.1 Confluence of the Rewriting System
In our equivalence algorithm, A1 is taken care of by the arbitrary but consistent ordering
of the nodes. Axioms A2-A7 form a term rewriting system. Since all those rules reduce
the size of the term, the system is terminating in a number of steps linear in the size of
the term. We will next show that it is confluent. We will thus obtain the existence of
a normal form for every term, and will finally show how our algorithm computes that
normal form.
Definition 7. Consider a pair of reduction rules 𝑙₀ → 𝑟₀ and 𝑙₁ → 𝑟₁ with disjoint sets
of free variables, such that 𝑙₀ = 𝐷[𝑠], 𝑠 is not a variable, and 𝜎 is the most general unifier
with 𝜎𝑠 = 𝜎𝑙₁. Then (𝜎𝑟₀, (𝜎𝐷)[𝜎𝑟₁]) is called a critical pair.

Informally, a critical pair is a most general pair of terms (with respect to unification)
(𝑡₁, 𝑡₂) such that for some 𝑡₀, 𝑡₀ → 𝑡₁ and 𝑡₀ → 𝑡₂ via two “overlapping” rules. They are
found by matching the left-hand side of a rule with a non-variable subterm of the same
or another rule.
Example 1 (Critical Pairs).
1. Matching the left-hand side of A6 with the subterm ¬𝑥 of rule A7, we obtain the pair
(1, ⊔(¬𝑥, 𝑥, 𝑦))
which arises from reducing the term 𝑡₀ = ⊔(¬𝑥, ¬¬𝑥, 𝑦) in two different ways.
2. Matching the left-hand sides of A2 and A7 gives
(⊔(𝑥, 𝑦, ¬⊔(𝑦)), 1)
which arises from reducing ⊔(𝑥, ⊔(𝑦), ¬⊔(𝑦)) using A2 or A7.
3. Matching the left-hand sides of A5 and A7 gives
(¬0, 1)
which arises from reducing ⊔(0, ¬0) in two different ways.
Proposition 1 ([1, Chapter 6]). A terminating term rewriting system is confluent if and
only if all critical pairs (𝑡₁, 𝑡₂) are joinable, i.e. ∃𝑡₃. 𝑡₁ →* 𝑡₃ ∧ 𝑡₂ →* 𝑡₃.

In the first of the previous examples, the pair is clearly joinable by commutativity and
a single application of rule A7 itself. The second example is more interesting. Observe
that ⊔(𝑥, 𝑦, ¬⊔(𝑦)) = 1 is a consequence of our axioms, but the left part cannot be
reduced to 1 in general in our system. To solve this problem we need to add the rule A9:
⊔(𝑥, 𝑦, ¬⊔(𝑦)) = 1. Similarly, the third example forces us to add A10: ¬0 = 1 to our
set of rules. From A10 and A6 we then find the expected critical pair A11: ¬1 = 0.
A1: ⊔(..., 𝑥ᵢ, 𝑥ⱼ, ...) = ⊔(..., 𝑥ⱼ, 𝑥ᵢ, ...)
A2: ⊔(𝑥, ⊔(𝑦)) = ⊔(𝑥, 𝑦)
    ⊔(𝑥) = 𝑥
A3: ⊔(𝑥, 𝑥, 𝑦) = ⊔(𝑥, 𝑦)
A4: ⊔(1, 𝑥) = 1
A5: ⊔(0, 𝑥) = ⊔(𝑥)
A6: ¬¬𝑥 = 𝑥
A7: ⊔(𝑥, ¬𝑥, 𝑦) = 1
A9: ⊔(𝑥, 𝑦, ¬⊔(𝑦)) = 1
A10: ¬0 = 1
A11: ¬1 = 0
Table 4. Terminating and confluent set of rewrite rules equivalent to L1–L8, L1′–L8′.
4.2 Complete Terminating Confluent Rewrite System
The analysis of all possible pairs of rules to find all critical pairs is straightforward. It
turns out that A9, A10, and A11 are the only rules we need to add to our system to
obtain confluence. We have checked the complete list of critical pairs for rules A2–A11
(we omit the details due to lack of space). All those pairs are joinable, i.e. reduce to the
same term, which implies, by Proposition 1, that the system is confluent. Table 4 shows
the complete set of reduction rules (as well as commutativity).
Since the system A2–A11, considered over the language (𝑆, ⊔, ¬, 0, 1) modulo com-
mutativity of ⊔, is terminating and confluent, the existence of a normal form
reduction follows. For any term 𝑡, we denote its normal form by 𝑡↓. In particular, for any two terms
𝑡₁ and 𝑡₂, we have 𝑡₁ = 𝑡₂ in our theory iff 𝑡₁ ↔* 𝑡₂ iff 𝑡₁↓ and 𝑡₂↓ are equivalent terms
modulo commutativity. We finally reach our conclusion: an algorithm that computes
the normal form (modulo commutativity) of any term gives a decision procedure for the
word problem for orthocomplemented bisemilattices.
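For illustration (our own example, not one from the paper), the rules of Table 4 reduce the following term to its normal form:

⊔(𝑥, 0, ¬¬𝑦, ¬⊔(𝑦))  →A6  ⊔(𝑥, 0, 𝑦, ¬⊔(𝑦))  →A2  ⊔(𝑥, 0, 𝑦, ¬𝑦)  →A5  ⊔(𝑥, 𝑦, ¬𝑦)  →A7  1

By confluence, applying the rules in any other order (for example A5 first) reaches the same normal form 1.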
5 Algorithm and Complexity
The rewriting system readily gives us a quadratic algorithm. Indeed, using our base
algorithm for DAG equivalence, we can check, in linear time, for application cases of
any one of the rewriting rules A2–A11 of Table 4, modulo commutativity. Since a term can
only be reduced up to 𝑛 times, the total time spent before finding the normal form of a
term is at most quadratic. It is however possible to find the normal form of a term in a
single pass of our equivalence algorithm, resulting in a more efficient algorithm.
5.1 Combining Rewrite Rules and Tree Isomorphism
We give an overview of how to combine rules A2–A7, A9, A10, A11 within the tree
isomorphism algorithm, which we present using Scala-like 1 pseudocode in Figure 7.
1https://www.scala-lang.org/
For conciseness, we omit the dynamic programming optimizations allowed by struc-
ture sharing in DAGs (which would store the normal form and additionally check whether a
node was already processed). For each rule, we indicate the most relevant lines of the
algorithm in Figure 7.
A2 (Associativity, Lines 10, 20, 32, 42) When analysing a ⊔ node, after the recursive
call, find all children that are themselves ⊔ and replace them by their own children.
This is simple enough to implement, but there is actually a caveat with this in terms of
complexity. We will come back to it in Section 5.2.

A3 (Idempotence, Lines 8, 31, 35) This corresponds to the fact that we eliminate du-
plicate children in disjunctions. When reaching a ⊔ node, after having sorted the codes
of its children, remove all duplicates before computing its own code.

A4, A5 (Bounds, Lines 8, 31, 35, 11, 36) To account for those axioms, we reserve
special codes for the nodes 1 and 0. For A4, when we reach some ⊔ node, if it has 1 as
one of its children, we accordingly replace the whole node by 1. For A5, we just remove
nodes with the same code as 0 from the parent ⊔ node before computing its own code.

A6 (Involution, Lines 17, 22) When reaching a negation node, if its child is itself a
negation node, replace the parent node by its grandchild before assigning it a code.

A7 (Complement, Lines 11, 36) As explained earlier, our representation of nodes lets us
do the following to detect cases of A7: first remember that we have already applied double
negation elimination, so that two “opposite” nodes cannot both start with a negation.
Then we can simply separate the children into negated and non-negated ones (after the
recursive call), sort them using their assigned codes, and look for collisions.

A9 (Also Complement, Lines 11, 36) This rule is slightly more tricky to apply. When
analysing a ⊔ node 𝑥, after computing the codes of all children of 𝑥, find all children of
the form ¬⊔. For every such node, take the set of its own children and verify whether it is
a subset of the set of all children of 𝑥. If yes, then rule A9 applies. Said otherwise, we
look for collisions between grandchildren (through a negation) and children of every
⊔ node.

A10, A11 (Identities, Lines 17, 26) These rules are simple. In a ¬ node, if its child has
the same code as 0 (resp. 1), assign code 1 (resp. 0) to the negated node.
5.2 Case of Quadratic Runtime for the Basic Algorithm
All the rules we introduced in the previous section into Algorithm 1take time (log)linear
in the number of children of a node to apply, which is not more than the time we spent in
the DAG/tree isomorphism algorithm. For A3, checking for duplicates is done in linear
time in an ordered data structure. A4 and A5 (Bounds) consist in searching for specific
values, which take logarithmic time in the size of the list. A6 (Involution) takes constant
time. A7 (Complement) is detected by finding a collision between two separate ordered
Equivalence Checking for Orthocomplemented Bisemilattices in Log-Linear Time 207
lists, also easily done in (log) linear time. A9 (Also complement) consists in verifying
if grandchildren of a node are also children, and since children are sorted this takes log-
linear time in the number of grandchildren. Since a node is the grandchild of only one
other node, the same computation as in the original algorithm holds. A10 and A11 take
constant time. Hence, the total time complexity is (𝑛log(𝑛)), as in the algorithm for
tree isomorphism.
As stated in Section 3 regarding the algorithm for DAG equivalence, whose com-
plexity we aim to preserve, the time complexity analysis crucially relies on the fact that
in a tree, a node is never the child (or grandchild) of more than one node during the
execution. However, this is generally not true in the presence of associativity. Indeed,
consider the term represented in Figure 4. The 5th ⊔ has 2 children, but after applying
A2, the 4th ⊔ has 3 children, the 3rd ⊔ has 4 children, and so on. In the generalization of
such an example, since an 𝑥ᵢ is the child of all higher ⊔ nodes, our key property does not hold
and the algorithm runtime would be quadratic. Of course, such a simple counterexam-
ple is easily solved by applying a leading pass of associativity reduction before actually
running the whole algorithm. It turns out however that this is not sufficient, since cases of
associativity can appear after the application of the other A-rules.

Fig. 4. A term with quadratic runtime (nested disjunctions over 𝑥₁, ..., 𝑥₆)
In fact, there is only one rule that can create cases of rule A2, and this rule is A6
(Involution). The remaining rules whose right-hand side can start with a ⊔ have their
left-hand side already starting with ⊔. It may seem simple enough to also apply double
negation elimination in a leading pass, but unfortunately, cases of A6 can also be created
by other rules. It is easy to see, for similar reasons, that only the application of A2b
(⊔(𝑥) = 𝑥) can create such cases. And unfortunately, such cases of A2b can arise from
rules A3 and A5, which can only be detected using the full algorithm. To summarize,
the typical problematic case is depicted in Figure 5. This term is clearly equivalent to
⊔(𝑥₁, 𝑥₂, 𝑥₃, 𝑥₄), but to detect this we must first find that 𝑧₁ and 𝑧₂ are equivalent to 0, so
we cannot simply solve it with an early pass.
5.3 Final Log-Linear Time Algorithm
Fortunately, we can solve this problem at a logarithmic-only price. Observe that if we
were able to detect early the nodes which would cancel to 0, the problem would not exist:
when analysing a node, we would first call the algorithm on all subnodes equivalent to
0, remove them, and then, when there is a single child left, remove the trivial disjunct,
the double negation, and the successive disjunction (as in Figure 5) before doing the
recursive call on the unique nontrivial child. However, we of course cannot know in
advance which child will be equivalent to 0.

Fig. 5. A non-trivial term with quadratic runtime
Fig. 6. The term of Figure 5 during the algorithm’s execution

Moreover, note (still using Figure 5) that if the 𝑧-child is as large as the non-trivial
node, then even if we do the “useless” work, we at least obtain that the size of the tree is
divided by two, and hence the potential depth of the tree as well. By standard complexity
analysis, the time penalty would then only be a logarithmic factor.
The previous analysis suggests the following solution, reflected in Figure 7, lines
28–29. When analysing a node, make the recursive calls on its children in order of their size,
starting with the smallest and going up to the second biggest. If any of those children is non-zero,
proceed as normal. If all (but possibly the last) children are equivalent to zero, then
replace the current node by its biggest (and at this point non-analyzed) child, i.e. apply the
second half of rule A2 (associativity). If applicable, apply double negation elimination
and associativity as well before continuing the recursive call.
We illustrate this on the example of Figure 5. Consider the algorithm when reaching
the second node. There are two cases:
1. Suppose 𝑧₁ is a smaller tree than the non-trivial child. In this case the algorithm
will compute a code for 𝑧₁, find that it is 0, and delete it. Then the non-trivial node
is a single child, so the whole disjunction is removed. Hence, the double negation
can be removed and the two consecutive disjunctions of 𝑥₁ and 𝑥₂ merged, obtaining
the term illustrated in Figure 6. In particular, we did not compute a code for the two
deleted nodes, which is exactly what we wanted in our initial analysis.
2. Suppose 𝑧₁ is a larger tree than the non-trivial child. In this case, we would first re-
cursively compute the code of the non-trivial child and then detect that 𝑧₁ ∼ 0. We
indeed computed the code of the disjunction that contains 𝑥₂ when it was unnec-
essary, since we apply associativity anyway. This “useless” work consists in sorting
and applying axioms to the true children of the node (in this case 𝑥₂, 𝑥₃ and 𝑥₄) and
takes time quasilinear in the number of such children. In particular, it is bounded by
the size of the subtree itself, and we know it is the smaller of the two.
An analogous situation can arise from the use of rule A3 (idempotence), but here, triv-
ially, the two subtrees must have the same number of (real) subnodes, so the same
reasoning holds.
Denote by |𝑛| the size of a node 𝑛, i.e. the number of descendants of 𝑛. We compute
the penalty of useless work we incur by computing the children of a node 𝑛 in the wrong
order, i.e. by computing a non-0 child 𝑛𝑤 when all others are 0. 𝑛𝑤 cannot be the largest
child of 𝑛, for otherwise we would have found that all other children are 0 before needing
to compute 𝑛𝑤. Hence |𝑛𝑤| ≤ |𝑛|∕2. It follows that the total amount of useless work is
bounded by log(𝑛)·𝑊(𝑛), where

𝑊(𝑛) ≤ 𝑛∕2 + Σᵢ 𝑊(𝑛ᵢ)   for   Σᵢ 𝑛ᵢ < 𝑛.

It is clear that 𝑊(𝑛) is maximized when 𝑛 has exactly two children of equal size:

𝑊(𝑛) ≤ 𝑛∕2 + 2·𝑊(𝑛∕2)

By observing that we can divide 𝑛 by 2 only log(𝑛) times,

𝑊(𝑛) ≤ Σ_{𝑚=1}^{log(𝑛)} 2^𝑚 · 𝑛∕2^𝑚

so we obtain 𝑊(𝑛) = 𝒪(𝑛 log(𝑛)) and hence the total runtime is 𝒪(𝑛 (log 𝑛)²).
6 Conclusion
We have described a decision procedure with log-linear time complexity for the word
problem on orthocomplemented bisemilattices. This algorithm can also be simplified
to apply to weaker theories. Dually, we believe it can be generalized to decide some
stronger theories (still weaker than Boolean algebras) efficiently. While the word prob-
lem for orthocomplemented lattices was known to be in PTIME [15] and as such the
membership of orthocomplemented bisemilattices in PTIME may not come as a sur-
prise, this is, to the best of our knowledge, the first time that this result has been ex-
plicitly stated, and the first time that an algorithm with such low log-linear complexity
was proposed for this or a related problem. The algorithm has not only low complexity
but, according to our experience, is easy to implement. It can be used as an approxi-
mation for Boolean algebra equivalence, and we plan to use it as the basis of a kernel
for a proof assistant. We also envision possible uses of the algorithm in SMT and SAT
solvers. The algorithm is able to detect many natural and non-trivial cases of equiva-
lence even on formulas that may be too large for existing solvers to deal with, so it may
also complement an existing repertoire of subroutines used in more complex reasoning
tasks. For a minimal working implementation in Scala closely following Figure 7, see
https://github.com/epfl-lara/OCBSL.
1  def equivalentTrees(tau: Term, pi: Term): Boolean =
2    val codesSig: HashMap[(String, List[Int]), Int] = Empty
3    codesSig.update(("zero", Nil), 0); codesSig.update(("one", Nil), 1)
4    val codesNodes: HashMap[Term, Int] = Empty
5    def updateCodes(sig: (String, List[Int]), n: Node): Unit = ... // codesSig, codesNodes
6    def bool2const(b: Boolean): String = if b then "one" else "zero"
7    def rootCode(n: Term): Int =
8      val L = pDisj(n, Nil).map(codesNodes).sorted.filter(_ != 0).distinct
9      if L.isEmpty then updateCodes(("zero", Nil), n)
10     else if L.length == 1 then codesNodes.update(n, L.head)
11     else if L.contains(1) or checkForContradiction(L) then updateCodes(("one", Nil), n)
12     else updateCodes(("or", L), n)
13     codesNodes(n)
14   def pDisj(n: Node, acc: List[Node]): List[Node] = n match
15     case Variable(id) => updateCodes((id.toString, Nil), n); return n :: acc
16     case Literal(b) => updateCodes((bool2const(b), Nil), n); return n :: acc
17     case Negation(child) => pNeg(child, n, acc)
18     case Disjunction(children) => children.foldLeft(acc)(pDisj)
19   def pNeg(n: Node, parent: Node, acc: List[Node]): List[Node] = n match // under negation
20     case Negation(child) => pDisj(child, acc)
21     case Variable(id) => updateCodes((id.toString, Nil), n)
22       updateCodes(("neg", List(codesNodes(n))), parent)
23       parent :: acc
24     case Literal(b) => updateCodes((bool2const(b), Nil), n)
25       updateCodes((bool2const(!b), Nil), parent)
26       parent :: acc
27     case Disjunction(children) =>
28       val r0 = orderBySize(children)
29       val r1 = r0.tail.foldLeft(Nil)(pDisj)
30       val r2 = r1.map(codesNodes).sorted.filter(_ != 0).distinct
31       if isEmpty(r2) then pNeg(r0.head, parent, acc)
32       else val s1 = pDisj(r0.head, r1)
33       val s2 = s1 zip (s1 map codesNodes)
34       val s3 = s2.sorted.filter(_ != 0).distinct // all wrt. 2nd element
35       if s3.contains(1) or checkForContradiction(s3)
36       then updateCodes(("one", Nil), n); updateCodes(("zero", Nil), parent)
37         parent :: acc
38       else if isEmpty(s3) then updateCodes(("zero", Nil), n)
39         updateCodes(("one", Nil), parent)
40         parent :: acc
41       else if s3.length == 1 then pNeg(s3.head._1, parent, acc)
42       else updateCodes(("or", s3 map (_._2)), n)
43         updateCodes(("neg", List(codesNodes(n))), parent)
44         parent :: acc
45   return rootCode(tau) == rootCode(pi)
Fig. 7. Final algorithm. distinctBy runs in log-linear time. checkForContradiction detects appli-
cation cases of A7 and A9 (Complement). Maintenance of size field used by orderBySize elided.
References
1. Baader, F., Nipkow, T.: Term Rewriting and All That. Cambridge University Press, Cam-
bridge (1998). https://doi.org/10.1017/CBO9781139172752
2. Barrett, C., Conway, C.L., Deters, M., Hadarean, L., Jovanović, D., King, T., Reynolds,
A., Tinelli, C.: CVC4. In: Gopalakrishnan, G., Qadeer, S. (eds.) Computer Aided Verifica-
tion. pp. 171–177. Lecture Notes in Computer Science, Springer, Berlin, Heidelberg (2011).
https://doi.org/10.1007/978-3-642-22110-1_14
3. Basin, D.A., Ganzinger, H.: Automated complexity analysis based on ordered resolution. J.
ACM 48(1), 70–109 (2001). https://doi.org/10.1145/363647.363681
4. Bruns, G.: Free Ortholattices. Canadian Journal of Mathematics 28(5), 977–985 (Oct 1976).
https://doi.org/10.4153/CJM-1976-095-6
5. Bruttomesso, R., Pek, E., Sharygina, N., Tsitovich, A.: The OpenSMT Solver. In: Hutchi-
son, D., Kanade, T., Kittler, J., Kleinberg, J.M., Mattern, F., Mitchell, J.C., Naor, M., Nier-
strasz, O., Pandu Rangan, C., Steffen, B., Sudan, M., Terzopoulos, D., Tygar, D., Vardi, M.Y.,
Weikum, G., Esparza, J., Majumdar, R. (eds.) Tools and Algorithms for the Construction and
Analysis of Systems, vol. 6015, pp. 150–153. Springer Berlin Heidelberg, Berlin, Heidelberg
(2010). https://doi.org/10.1007/978-3-642-12002-2_12
6. Brzozowski, J.: De Morgan bisemilattices. In: Proceedings 30th IEEE International
Symposium on Multiple-Valued Logic (ISMVL 2000). pp. 173–178 (May 2000).
https://doi.org/10.1109/ISMVL.2000.848616
7. Buss, S.R.: Alogtime algorithms for tree isomorphism, comparison, and canonization. In:
Gottlob, G., Leitsch, A., Mundici, D. (eds.) Computational Logic and Proof Theory. pp. 18–
33. Springer Berlin Heidelberg, Berlin, Heidelberg (1997)
8. Cook, S.A.: The complexity of theorem-proving procedures. In: Proceedings of the Third
Annual ACM Symposium on Theory of Computing. p. 151–158. STOC ’71, Association for
Computing Machinery, New York, NY, USA (1971). https://doi.org/10.1145/800157.805047
9. Davis, M., Logemann, G., Loveland, D.: A machine program for theorem-proving. Commun.
ACM 5(7), 394–397 (Jul 1962). https://doi.org/10.1145/368273.368557
10. Ganzinger, H., Hagen, G., Nieuwenhuis, R., Oliveras, A., Tinelli, C.: DPLL(T): Fast De-
cision Procedures. In: Kanade, T., Kittler, J., Kleinberg, J.M., Mattern, F., Mitchell, J.C.,
Naor, M., Nierstrasz, O., Pandu Rangan, C., Steffen, B., Sudan, M., Terzopoulos, D., Ty-
gar, D., Vardi, M.Y., Weikum, G., Alur, R., Peled, D.A. (eds.) Computer Aided Verifi-
cation, vol. 3114, pp. 175–188. Springer Berlin Heidelberg, Berlin, Heidelberg (2004).
https://doi.org/10.1007/978-3-540-27813-9_14
11. Gentzen, G.: Untersuchungen über das logische Schließen. I. Mathematische Zeitschrift 39,
176–210 (1935)
12. Hamza, J., Voirol, N., Kunčak, V.: System FR: Formalized foundations
for the Stainless verifier. Proc. ACM Program. Lang. 3 (November 2019).
https://doi.org/10.1145/3360592
13. Harrison, J.: HOL Light: An Overview. In: Berghofer, S., Nipkow, T., Urban, C., Wenzel,
M. (eds.) Theorem Proving in Higher Order Logics, vol. 5674, pp. 60–66. Springer Berlin
Heidelberg, Berlin, Heidelberg (2009). https://doi.org/10.1007/978-3-642-03359-9_4
14. Hopcroft, J., Ullman, J., Aho, A.: The Design and Analysis of Computer Algorithms.
Addison-Wesley (1974)
15. Hunt III, H.B., Rosenkrantz, D.J., Bloniarz, P.A.: On the Computational Complex-
ity of Algebra on Lattices. SIAM Journal on Computing 16(1), 129–148 (Feb 1987).
https://doi.org/10.1137/0216011
16. Kahn, A.B.: Topological sorting of large networks. Communications of the ACM 5(11), 558–
562 (Nov 1962). https://doi.org/10.1145/368996.369025
17. Kalmbach, G.: Orthomodular Lattices. Academic Press Inc, London ; New York (Mar 1983)
18. Krajíček, J.: Proof Complexity. Encyclopedia of Mathematics and Its Applications, Vol. 170,
Cambridge University Press (2019)
19. Kroening, D., Strichman, O.: Decision Procedures - An Algorithmic Point of View. Springer
(2016)
20. Kuncak, V.: Modular Data Structure Verification. Ph.D. thesis, EECS Department, Mas-
sachusetts Institute of Technology (February 2007), http://hdl.handle.net/1721.1/38533
21. Leino, K.R.M., Polikarpova, N.: Verified calculations. In: Cohen, E., Rybalchenko, A. (eds.)
Verified Software: Theories, Tools, Experiments. pp. 170–190. Springer Berlin Heidelberg,
Berlin, Heidelberg (2014). https://doi.org/10.1007/978-3-642-54108-7_9
22. Lewis, D.W.: Hazard detection by a quinary simulation of logic devices with bounded
propagation delays. In: Proceedings of the 9th Design Automation Workshop. pp. 157–
164. DAC ’72, Association for Computing Machinery, New York, NY, USA (Jun 1972).
https://doi.org/10.1145/800153.804941
23. Lindell, S.: A logspace algorithm for tree canonization (extended abstract). In: Pro-
ceedings of the Twenty-Fourth Annual ACM Symposium on Theory of Computing. p.
400–404. STOC ’92, Association for Computing Machinery, New York, NY, USA (1992).
https://doi.org/10.1145/129712.129750
24. McAllester, D.A.: Automatic recognition of tractability in inference relations. Journal of the
ACM 40(2), 284–303 (1993). https://doi.org/10.1145/151261.151265
25. Meinander, A.: A solution of the uniform word problem for ortholattices.
Mathematical Structures in Computer Science 20(4), 625–638 (Aug 2010).
https://doi.org/10.1017/S0960129510000125
26. Merz, S., Vanzetto, H.: Automatic Verification of TLA+ Proof Obligations with SMT
Solvers. In: Bjørner, N., Voronkov, A. (eds.) Logic for Programming, Artificial Intelligence,
and Reasoning. pp. 289–303. Lecture Notes in Computer Science, Springer, Berlin, Heidel-
berg (2012). https://doi.org/10.1007/978-3-642-28717-6_23
27. Naumowicz, A., Korniłowicz, A.: A brief overview of Mizar. In: Berghofer, S., Nipkow, T.,
Urban, C., Wenzel, M. (eds.) Theorem Proving in Higher Order Logics. pp. 67–72. Springer
Berlin Heidelberg, Berlin, Heidelberg (2009). https://doi.org/10.1007/978-3-642-03359-9_5
28. Peterson, G.E., Stickel, M.E.: Complete sets of reductions for some equational theories. J.
ACM 28(2), 233–264 (Apr 1981). https://doi.org/10.1145/322248.322251
29. Pudlák, P.: The Lengths of Proofs. In: Studies in Logic and the Foundations of Mathematics,
vol. 137, pp. 547–637. Elsevier (1998). https://doi.org/10.1016/S0049-237X(98)80023-2
30. Tschannen, J., Furia, C.A., Nordio, M., Polikarpova, N.: Autoproof: Auto-active functional
verification of object-oriented programs. In: Baier, C., Tinelli, C. (eds.) Tools and Al-
gorithms for the Construction and Analysis of Systems. pp. 566–580. Springer (2015).
https://doi.org/10.1007/978-3-662-46681-0_53
31. Urquhart, A.: Hard examples for resolution. J. ACM 34(1), 209–219 (Jan 1987).
https://doi.org/10.1145/7531.8928
32. Wenzel, M., Paulson, L.C., Nipkow, T.: The Isabelle Framework. In: Theorem Proving in
Higher Order Logics. pp. 33–38. Lecture Notes in Computer Science, Springer, Berlin, Hei-
delberg (2008). https://doi.org/10.1007/978-3-540-71067-7_7
33. Whitman, P.M.: Free Lattices. Annals of Mathematics 42(1), 325–330 (1941).
https://doi.org/10.2307/1969001
34. Zee, K., Kuncak, V., Rinard, M.: Full functional verification of linked data structures. In:
ACM SIGPLAN Conf. Programming Language Design and Implementation (PLDI) (2008).
https://doi.org/10.1145/1375581.1375624, see also [20]
35. Zee, K., Kuncak, V., Rinard, M.: An integrated proof language for imperative programs. In:
ACM SIGPLAN Conf. Programming Language Design and Implementation (PLDI) (2009).
https://doi.org/10.1145/1543135.1542514
Monitoring and Analysis
A Theoretical Analysis of
Random Regression Test Prioritization
Pu Yi1, Hao Wang1, Tao Xie1(✉), Darko Marinov2, and Wing Lam3
1Peking University, Beijing, China
lukeyi@pku.edu.cn,tony.wanghao@stu.pku.edu.cn,taoxie@pku.edu.cn
2University of Illinois Urbana-Champaign, Urbana, IL, USA
marinov@illinois.edu
3George Mason University, Fairfax, VA, USA
winglam@gmu.edu
Abstract. Regression testing is an important activity to check software
changes by running the tests in a test suite to inform the developers
whether the changes lead to test failures. Regression test prioritization
(RTP) aims to inform the developers faster by ordering the test suite
so that tests likely to fail are run earlier. Many RTP techniques have
been proposed and are often compared with the random RTP baseline
by sampling some of the n! different test-suite orders for a test suite
with ntests. However, there is no theoretical analysis of random RTP.
We present such an analysis, deriving probability mass functions and ex-
pected values for metrics and scenarios commonly used in RTP research.
Using our analysis, we revisit some of the most highly cited RTP papers
and find that some presented results may be due to insufficient sampling.
Future RTP research can leverage our analysis and need not use random
sampling but can use our simple formulas or algorithms to more precisely
compare with random RTP.
Keywords: Regression Test Prioritization · Random · Analysis
1 Introduction
Software developers commonly check their code by running tests. Regression
testing [48] runs tests after code changes, to check whether the changes break
the existing functionality. A test that passes before the changes but fails after
indicates that the changes should be debugged (unless the test is flaky [25]).
Finding test failures faster enables the developers to start debugging earlier.
A popular regression testing approach is regression test prioritization (RTP) [12,
19,21,23,38,39,48], which runs the tests from a test suite in an order that aims
to find test failures sooner. For example, Google [14] and Microsoft [42] report on
using RTP in industry. More formally, a test suite T is an (unordered) set of tests,
and RTP techniques produce a test-suite order—a permutation of the tests in
the test suite—in which to run the tests. Various RTP techniques have been pro-
posed in the literature since the seminal papers from 20+ years ago [12,36,38,47]
that have garnered thousands of citations.
RTP techniques are often compared with random RTP. Our inspection [44]
of the 100 most cited papers on RTP shows that 56 papers use random RTP
as a comparison baseline. Although random RTP often performs worse than ad-
vanced techniques, recent papers still use random RTP, because it has a small
overhead and may perform well in certain scenarios. We additionally check pa-
pers published in the latest testing conferences (ICST and ISSTA 2020/2021)
and find that 50% (2/4) of the RTP papers [6,15,30,34] use random RTP. While
random RTP has been used as a baseline for 20+ years, all evaluations have
been empirical, performed by randomly sampling some of the n! orders for a test
suite with ntests. The selected sample size varies (20, 50, 100, 200, 1000), with
no clear correlation with n; some papers do not even report the sample size [44].
However, no prior work has presented a theoretical analysis of random RTP.
Before we summarize our analysis, we describe some metrics and scenarios
most commonly used in RTP research. We first introduce some terms: failure
is simply a failing test, fault is the root cause (bug in the code) for the failure,
and we say that a failure detects a fault if the failure is caused by the fault [36].
In general, many failures may detect the same fault, and one failure may detect
many faults. We capture the relationship between failures and faults by a failure-
to-fault matrix. To compare RTP techniques, researchers quantify how fast (test-
suite) orders find all faults (not failures because having many failures that detect
the same fault is not as valuable as having a few failures that detect many faults).
RTP evaluations involve three aspects: RTP metric, failure-to-fault matrix,
and allowed orders. The most widely used metric is Average Percentage of Faults
Detected (APFD) [38], denoted as αfor short. Another popular metric is Cost-
Cognizant APFD (APFDc) [11], denoted as γfor short. Section 2formally defines
these metrics based on the failure-to-fault matrix; each metric assigns to an order
a value between 0 and 1, with higher values indicating better orders. Traditional
RTP research used seeded faults, which allow fairly precisely deriving the failure-
to-fault matrix [10,22,37] that can arbitrarily map failures and faults. Recent
RTP research mostly uses real failures, e.g., analyzing real regression testing
runs from continuous integration systems [14,15,23,24,27,34], making it rather
difficult to precisely derive the failure-to-fault matrix. As a result, the increas-
ingly popular failure-to-fault matrices are all-to-one, where all failures map to
the same one fault, and one-to-one, where each failure maps to a distinct fault.
To describe allowed orders, we note that real test suites often partition tests,
e.g., in JUnit [20], each test method belongs to a test class. Traditional research
ignores this partitioning and allows all n! orders (a(T) for short) of n tests.
We introduced compatible⁴ orders [46] (c(T) for short) that consider the parti-
tioning and allow only orders that do not interleave tests from different classes.
We present the first theoretical analysis for the cases most commonly used
in RTP research. We introduce an algorithm for efficiently computing the ex-
act probability mass functions (PMFs) of αfor all failure-to-fault matrices and
a(T). We demonstrate the efficiency of our algorithm on the benchmarks from
⁴ Our original term was class-compatible [46] because we considered as tests only test
methods in test classes, but the concept easily generalizes to other kinds of tests.
Fig. 1: Example metrics for two orders (Com. is compatible) for n = 5, m = 3;
class C1 has 3 tests with costs 40, 20, 60; class C2 has 2 tests with costs 100, 80; C1.t1
detects fault F1; C1.t3 detects F2; C2.t1 detects F2 and F3; C2.t2 detects F3.
the largest RTP dataset for Java projects [34]. For the common all-to-one and
one-to-one cases, we further derive a closed-form formula and a good approxima-
tion, respectively. We also derive closed-form formulas for the expected values
for both α and γ for the general failure-to-fault matrix, for both a(T) and
c(T), and we compare these values in various scenarios. Interestingly, on aver-
age, a(T) can perform much better (up to 1/2) than c(T) for certain scenarios,
but cannot perform much worse (only up to 1/6) for any scenario; Section 5.1
presents this comparison, including two scenarios near the limits (1/2 and 1/6).
We finally derive two interesting properties of the α and γ metrics. Using
these properties, we revisit some of the highly cited papers on RTP and find that
some presented results may be biased due to insufficient sampling. Overall, our
theoretical analysis provides new insights into the random RTP widely used in
prior work but only via empirical sampling. Our results show that in many cases
researchers need not run sampling but can use simple formulas or algorithms to
obtain more precise statistics for the random RTP metrics.
2 Preliminaries
Our notation largely follows the prior work that introduced APFD (α) [38] and
APFDc(γ) [11], but we make explicit the failure-to-fault matrix. Let nbe the
number of tests and mbe the number of faults detected by (some of) these
tests. Let Mbe a failure-to-fault matrix, i.e., a n×mBoolean matrix such that
Mj,i = true iff (failure of) test jdetects fault i, and each fault has at least one
failure (i.e., i.j.Mj,i). Let Tbe the set of tests in the test suite. We denote
the set of tests that detect the fault ias Ti={j|Mj,i}. In general, Tiand Ti
for i=ineed not be disjoint because one failing test can detect multiple faults.
The total number of failures is k=|{j|∃i.Mj,i}|, and we use ki=|Ti|.
For an order o(a permutation of T), we use <oto compare the positions of
two tests tand tin the order: t <otdenotes that tprecedes tin o, and tot
denotes that t=tor t <ot. We denote the jth test in an order oas tj(o). Let
τi(o) = minjMtj(o),i be the position of the first test to detect the fault iin o.
Prior work [11,38] defined metrics αand γ(using the notation T F instead of τ).
We use α(o) and γ(o) to indicate α and γ, respectively, for a given order o. We
drop o from <_o, ≤_o, t_j(o), τ_i(o), α(o), and γ(o) when clear from the context.
The most popular RTP metric is α [38], defined for an order o as follows.

Definition 1 (α). APFD is defined as

    α = 1 − (Σ_{i=1}^{m} τ_i)∕(nm) + 1∕(2n)    (1)
Plotting the percentage of faults detected against the percentage of executed
tests, α represents the area under the curve, as shown by two examples in Fig. 1.
The diagonal lines interpolate the percentage of faults detected and lead to nice
properties of mean/median α values and symmetry (Section 6). α ranges between
0 and 1, more precisely between 1∕(2n) and 1 − 1∕(2n). A larger α indicates that
an order detects faults earlier, on average.
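As a concrete reading of Definition 1 (our sketch for illustration, not code from the paper), the following Scala function computes α from the positions τ_i:

    // APFD per Equation (1): n tests; tau(i) is the 1-based position of the
    // first test that detects fault i, so tau has one entry per fault (m total).
    def apfd(n: Int, tau: Seq[Int]): Double =
      1.0 - tau.sum.toDouble / (n.toDouble * tau.length) + 1.0 / (2.0 * n)

For order o1 of Fig. 1 (n = 5), the faults F1, F2, and F3 are first detected at positions 2, 3, and 1, so apfd(5, Seq(2, 3, 1)) = 1 − 6∕15 + 1∕10 = 0.7.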
While α effectively considers the number of tests, the “cost cognizant” metric
γ considers the cost of tests [11]. The cost can be measured in various ways, but
most work uses the test runtime. We use σ(t) to denote the cost (runtime) of a
test t; the total cost of a set of tests T is σ(T) = Σ_{t∈T} σ(t).
Definition 2 (γ). APFDc is defined as

    γ = (Σ_{i=1}^{m} [Σ_{j=τ_i}^{n} σ(t_j) − (1∕2)·σ(t_{τ_i})]) ∕ (m · σ(T))    (2)
Plotting the percentage of faults detected against the percentage of total
test-suite cost, γ represents the area under the curve, as shown in Fig. 1. Note
that α can be viewed as a special case of γ where ∀t, t′ ∈ T. σ(t) = σ(t′).
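The cost-cognizant variant can be transcribed the same way (again our sketch, with hypothetical parameter names): cost(j - 1) holds σ(t_j) for the j-th test in the order, and tau is as above.

    // APFDc per Equation (2): for each fault i, sum the costs of the tests from
    // position tau(i) to n, subtract half the cost of the detecting test itself,
    // and normalize by m times the total test-suite cost.
    def apfdc(cost: Seq[Double], tau: Seq[Int]): Double = {
      val m = tau.length
      val total = cost.sum
      tau.map(t => cost.drop(t - 1).sum - 0.5 * cost(t - 1)).sum / (m * total)
    }

Under our reading of order o1 in Fig. 1 (test costs 80, 40, 60, 20, 100 in execution order), apfdc(Seq(80, 40, 60, 20, 100), Seq(2, 3, 1)) ≈ 0.678.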
In practice, tests often belong to classes⁵—e.g., JUnit [20] test methods be-
long to test classes, Maven [28] test classes belong to modules, and pytest [35]
test functions belong to test files—and the tests from each class run together. Our
prior work [46] defined compatible orders as those where all tests from each class
are consecutive. We use T_C to denote the set of tests in a class C. An order o
is compatible iff ∀C, j ≤ j′ ≤ j′′. t_j(o) ∈ T_C ∧ t_{j′′}(o) ∈ T_C ⟹ t_{j′}(o) ∈ T_C. For
example, o2 in Fig. 1 is compatible, while o1 is not. To distinguish the cases for
all orders from the cases for only compatible orders, we use the subscripts a and
c, respectively, e.g., E_a[x] and E_c[x] represent the expected value of x for the
uniform selection of all orders and compatible orders, respectively, and P_a(A)
and P_c(A) represent the probability of event A for the uniform selection of all
orders and compatible orders, respectively. We denote the set of all orders and
all compatible orders for T as a(T) and c(T), respectively [46].
We analyze RTP techniques in scenarios, each of which consists of a test suite
with ntests, mfaults, the failure-to-fault matrix, the cost of each test, and for
c(T) the class of each test. To analyze compatible orders, we introduce some
new notation to indicate the class of tests. We use Ti,C =TiTCto denote the
5The term class for a set of tests that run together need not represent a test class.
set of tests in class C that detect fault i. Let 𝒞 be the set of all classes, and 𝒞_i be the set of classes that contain at least one test that detects fault i, i.e., 𝒞_i = {C ∈ 𝒞 | T_{i,C} ≠ ∅}. Let C(t) be the class that t belongs to, i.e., t ∈ T_{C(t)}. The number of compatible orders is |c(T)| = |𝒞|!·Π_{C∈𝒞} |T_C|!.
For a set of orders S, be it a(T) or c(T), the probability mass function (PMF) of a metric, α or γ, is a function p from the metric value to its probability: p(x) = P(metric = x) = |{o ∈ S | metric(o) = x}|/|S|. We next derive some PMFs, as all prior RTP work shows only sampled distributions of random RTP.
3 PMF of α
To analyze the PMF of the metric α, we first propose an algorithm to calculate the PMF of α for the general case of M. We then discuss two special cases, i.e., all-to-one and one-to-one, which are the most common in recent RTP research.
3.1 Algorithm to Calculate the PMF of α for the General Case
To calculate the PMF of α, a naïve algorithm would enumerate all n! orders and compute α for each order. In theory, α can take O(n!) different values: e.g., when m = Σ_{i=1}^n n^i and all n tests fail and detect n, n², …, nⁿ different faults, respectively, then each of the n! orders has a different α. In practice, however, the number of faults m and the number of failing tests k are usually small; e.g., in our evaluation dataset [34], 2906 out of 2980 (98%) scenarios have k ≤ 10. We present an algorithm that computes the exact PMF with O(n²mk·k!) time complexity. Despite the k! factor, the algorithm runs in reasonable time in practice, under 30 sec for any of the 2906 scenarios. When k > 10, one can resort to sampling.
We next describe the intuition for our algorithm. Σ_{i=1}^m τ_i is the only part of α that depends on the (test-suite) order, so we first calculate the PMF of this sum and then convert it to the PMF of α. Iterating over the faults does not lead to a nice recursive formulation. Our key insight is to instead iterate over the positions of all k failing tests. We view Σ_{i=1}^m τ_i as a weighted sum

    Σ_{i=1}^m τ_i = Σ_{j=1}^k w_j·ϕ_j    (3)

where ϕ_j is the position of the j-th failing test in the order, and w_j ≥ 0 is the weight, calculated as the number of faults detected first by the j-th failing test (Line 11 of Algorithm 1). For example, consider the order o1 in Fig. 1. The relative order of the k = 4 failing tests is ρ = ⟨C2.t2, C1.t1, C1.t3, C2.t1⟩; we use the metavariable ρ to distinguish this notation from o, the order of all n tests. For this relative order, w = ⟨1, 1, 1, 0⟩ because the m = 3 faults are detected first by C2.t2, C1.t1, and C1.t3. The positions for this relative order ρ are ϕ = ⟨1, 2, 3, 5⟩ because the 4 failing tests in ρ appear in these positions in the order o1.
We call a ϕ = ⟨ϕ_1, …, ϕ_k⟩ valid if 1 ≤ ϕ_1 < … < ϕ_k ≤ n. Both sequences ϕ and w = ⟨w_1, …, w_k⟩ can vary for different orders.
Algorithm 1: Calculate the PMF of α

 1  Input: n, m, M // the number of tests and faults, and the failure-to-fault matrix
 2  Output: p // the PMF of α: p(x) = P(α = x)
 3  Function PMF() // main function; return the PMF of α for all orders
 4    k = |{j | ∃i. M_{j,i}}| // number of failing tests in M; in practice k ≪ n
 5    q = PMF_sum() // compute the PMF of Σ_{i=1}^m τ_i
 6    return λx. q(mn − mnx + m/2) // convert that PMF to the PMF of α
 7  Function PMF_sum() // return the PMF of Σ_{i=1}^m τ_i for all orders
 8    P = ⟨PMF_rorder(ρ), ρ ∈ perms({j | ∃i. M_{j,i}})⟩ // enumerate all relative orders
 9    return λx. Σ_{p∈P} p(x)/|P| // average the PMFs of Σ_{i=1}^m τ_i over the relative orders
10  Function PMF_rorder(ρ) // return the PMF of Σ_{i=1}^m τ_i for a relative order ρ
11    w = ⟨|{i | M_{ρ_j,i} ∧ ∄j′ < j. M_{ρ_{j′},i}}|, j ∈ 1..k⟩ // w are the weights in formula (3)
12    return λs. f(w,k,n)(s)/(n choose k) // the total number of valid ϕ is (n choose k)
13  // the function should be memoized to reuse the results for repeated (w,g,h)
    Function f(w, g, h) // return f_{g,h} given weights w, calculated with formula (4)
14    if g > h then
15      return λs. 0
16    if g = 0 then
17      return λs. 1_{s=0}
18    return λs. f(w,g,h−1)(s) + f(w,g−1,h−1)(s − w_g·h)
While ϕ has (n choose k) valid possibilities, we note that w has at most k! possibilities (with k! ≪ (n choose k), as k ≪ n in practice) because w depends only on ρ. Therefore, we first fix w by enumerating the k! relative orders of the k failing tests. Then, for each relative order, the problem of calculating the PMF of Σ_{i=1}^m τ_i = Σ_{j=1}^k w_j·ϕ_j becomes “given w, count the number of valid ϕ such that Σ_{j=1}^k w_j·ϕ_j = s” for each s, which can be solved recursively as follows.
Let f_{g,h}(s) be the number of assignments of the values ϕ_1, …, ϕ_g such that 1 ≤ ϕ_1 < … < ϕ_g ≤ h and Σ_{j=1}^g w_j·ϕ_j = s. The problem is to find f_{k,n}(s). As the base cases: (1) f_{g,h}(s) = 0 for g > h, because ϕ_g ≥ g > h contradicts ϕ_g ≤ h; (2) f_{0,h}(s) = 1_{s=0}, where 1 is the indicator function, because only the empty sequence ⟨⟩ is valid and Σ_{j=1}^0 w_j·ϕ_j = 0. For all h ≥ g > 0, the number of assignments for f_{g,h}(s) splits into two cases: (1) if ϕ_g ≤ h−1, the number is equal to f_{g,h−1}(s) by definition; (2) if ϕ_g = h, the number for s is equal to the number of assignments of ϕ_1, …, ϕ_{g−1} such that ϕ_{g−1} ≤ ϕ_g − 1 = h−1 and Σ_{j=1}^{g−1} w_j·ϕ_j = (Σ_{j=1}^g w_j·ϕ_j) − w_g·ϕ_g = s − w_g·h, which is f_{g−1,h−1}(s − w_g·h). In total,

    f_{g,h}(s) = 0                                          if g > h
                 1_{s=0}                                    if g = 0
                 f_{g,h−1}(s) + f_{g−1,h−1}(s − w_g·h)      otherwise    (4)
After solving f_{k,n}, we get the PMF of Σ_{i=1}^m τ_i for each relative order of the k failing tests. Because each of the k! relative orders has the same probability by symmetry, we simply take the average of their PMFs to get the PMF of Σ_{i=1}^m τ_i for all orders. Finally, we convert the PMF of Σ_{i=1}^m τ_i to the PMF of α.
Table 1: Number of tests, failures, runtime (in ms), and Jensen-Shannon (JS) distance for the 10 largest scenarios [34] and one synthetic scenario (TSmax)

Test    #Tests  #Failures   Runtime [ms]            Jensen-Shannon
suite   (n)     (k)         all-to-one  one-to-one  distance (§3.2.2)
TS1     2118     1             513         505      0.0000
TS2     1986     2             563         629      0.0005
TS3     2080     3             617         871      0.0003
TS4     1929     4             680        1147      0.0004
TS5     1795     5             731        1408      0.0006
TS6      339     6             627         732      0.0040
TS7      465     7             678         756      0.0034
TS8      813     8             829        2009      0.0023
TS9       52     9            1496        1846      0.0442
TS10     161    10           10989       27095      0.0150
TSmax   2118    10           32801      242400      0.0011
We next describe Algorithm 1 in more detail. The input is the number of tests n, the number of faults m, and the failure-to-fault matrix M. The main function PMF invokes PMF_sum to get the PMF of Σ_{i=1}^m τ_i and converts it to the PMF of α. The function PMF_sum enumerates all relative orders ρ of the k failing tests, invokes PMF_rorder(ρ) to get the PMF of Σ_{i=1}^m τ_i for each relative order, and averages these PMFs to get the PMF of Σ_{i=1}^m τ_i for all (relative) orders. The function PMF_rorder(ρ) computes the weights w from formula (3), invokes f(w,k,n) to get f_{k,n} for w, and converts it to the PMF of Σ_{i=1}^m τ_i.
We finally discuss the time complexity and the empirical performance of Algorithm 1. The major cost comes from computing the function f. Because there are O(k!) different w, 0 ≤ g ≤ k, and g ≤ h ≤ n, there are O(nk·k!) different inputs for which to compute f. With memoization, f is computed only once for each input. Each computation takes O(nm) time because |support(f_{g,h})| = O(nm), as 1 ≤ τ_i ≤ n for 1 ≤ i ≤ m. Therefore, the cost of computing f for all inputs is O(n²mk·k!). The other costs in the algorithm are lower than the cost of f; hence, the overall time complexity of Algorithm 1 is O(n²mk·k!).
Implementation: While top-down recursion makes the algorithm easier to present, for better performance our implementation uses bottom-up dynamic programming to compute f. Our implementation fits in only 117 lines of C++.
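As an illustration of that implementation choice, here is a minimal bottom-up C++ sketch of computing f_{k,n} from formula (4); the array layout and names are our assumptions, not the 117-line artifact.

  #include <vector>

  // Returns f_{k,n}(s) for s = 0..maxS. w[1..k] are the weights from formula (3)
  // (index 0 unused); maxS bounds the support, e.g., n * (w_1 + ... + w_k).
  std::vector<double> f_bottom_up(const std::vector<long long>& w, int k, int n, int maxS) {
    std::vector<std::vector<double>> prev(k + 1, std::vector<double>(maxS + 1, 0.0)),
                                     cur = prev;
    prev[0][0] = 1.0;                          // f_{0,0}(0) = 1; f_{g,0} = 0 for g > 0
    for (int h = 1; h <= n; h++) {
      cur[0][0] = 1.0;                         // f_{0,h}(s) = 1_{s=0}
      for (int g = 1; g <= k && g <= h; g++)
        for (int s = 0; s <= maxS; s++) {
          double v = prev[g][s];               // case phi_g <= h-1
          long long d = s - w[g] * h;          // case phi_g = h
          cur[g][s] = (d >= 0) ? v + prev[g - 1][d] : v;
        }
      std::swap(prev, cur);                    // rows with g > h stay correctly zero
    }
    return prev[k];                            // f_{k,n}
  }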
Dataset: We use the RTP dataset with the most Java projects [34] for our evaluation. In this dataset, each test is a test class, and each class is a Maven module [28]. The dataset has 2980 scenarios, and 2906 (98%) of them have k ≤ 10. For each k ≤ 10, we select the scenario with the maximum number of tests (n) from the dataset. We also create a synthetic scenario with 2118 tests (the largest number of tests in the dataset) and 10 failures. We use both all-to-one and one-to-one failure-to-fault matrices on the selected scenarios.
Evaluation: As Table 1 shows, the code finishes in under 30 sec (on a common laptop) for all real scenarios; it takes more time on the synthetic one, but the runtime is still 33 sec for all-to-one and 4 min for one-to-one.
3.2 PMFs of α for Special Cases

As mentioned in Section 1, recent RTP research uses real failures and faults, with two kinds of failure-to-fault matrices: all-to-one and one-to-one. We discuss the PMFs of α for these two commonly used cases.
3.2.1 All-to-One: We first derive the PMF of α for all-to-one. In this case, m = 1, k ≥ 1, and w_1 = 1, ∀j > 1. w_j = 0 in formula (3). Therefore, the recursive formula (4) becomes f_{g,h}(s) = f_{g,h−1}(s) + f_{g−1,h−1}(s) for g > 1, which is similar to Pascal's triangle. This observation hints that the PMF of α for all-to-one may have a closed formula with binomial coefficients.
Theorem 3 (The PMF of α for the all-to-one failure-to-fault matrix).

    P(α = 1 − s/n + 1/(2n)) = (n−s choose k−1)/(n choose k),  s ∈ {1, 2, …, n−k+1}    (5)

Proof. For all-to-one, the α value depends solely on τ_1, which is essentially ϕ_1 in formula (3). For 1 ≤ s ≤ n−k+1, τ_1 = s holds as long as s = ϕ_1 < … < ϕ_k ≤ n. To satisfy the condition, we just need to choose the k−1 positions after position s. Therefore, (n−s choose k−1) out of the (n choose k) ways to choose k positions in n satisfy the condition, so P(τ_1 = s) = (n−s choose k−1)/(n choose k), and formula (5) directly follows.
With (5), we can compute the PMF of α for all-to-one in O(n) time. We can compute the needed binomial coefficients iteratively, starting from (k−1 choose k−1) = 1, with the recurrence (n+1 choose k−1) = ((n+1)/(n−k+2))·(n choose k−1) for n ≥ k−1, and get (n choose k) = (n/k)·(n−1 choose k−1).
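For instance, the following C++ sketch (ours, not the paper's artifact) performs this O(n) computation, using the iterative ratio (n−s−1 choose k−1)/(n−s choose k−1) = (n−s−k+1)/(n−s) and the starting value P(τ_1 = 1) = (n−1 choose k−1)/(n choose k) = k/n:

  #include <utility>
  #include <vector>

  // PMF of alpha for all-to-one via formula (5): returns (alpha, probability)
  // pairs for s = 1..n-k+1. Assumes 1 <= k <= n.
  std::vector<std::pair<double, double>> pmf_all_to_one(int n, int k) {
    std::vector<std::pair<double, double>> pmf;
    double p = double(k) / n;                                 // P(tau_1 = 1)
    for (int s = 1; s <= n - k + 1; s++) {
      pmf.push_back({1.0 - double(s) / n + 1.0 / (2.0 * n), p});
      if (s < n - k + 1) p *= double(n - s - k + 1) / (n - s);  // next binomial ratio
    }
    return pmf;
  }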
3.2.2 One-to-One: We next consider the PMF of α for one-to-one. In this case, m = k and each failing test finds a distinct fault, so for every relative order of the k failing tests, ∀j. w_j = 1 in formula (3). Therefore, running Algorithm 1 and memoizing on w, the complexity becomes O(n²k² + k!); the k! term arises because we still iterate through all the relative orders. We can avoid the k! term if we check in advance that the failure-to-fault matrix is one-to-one, so the complexity is O(n²k²).
Moreover, considering formula (4) when ∀j. w_j = 1, f_{k,n} essentially models the problem of “counting the number of partitions of s into k distinct summands from {1, 2, …, n}”. Specifically, f_{g,h}(s) can be viewed as the number of partitions of s into g distinct summands in {1, 2, …, h}, and f_{g,h}(s) = f_{g,h−1}(s) + f_{g−1,h−1}(s−h) holds because the summand g can be either less than h or exactly h, corresponding to f_{g,h−1}(s) and f_{g−1,h−1}(s−h), respectively. To the best of our knowledge, no closed formula is known for this problem. Considering that in our evaluation dataset 99.8% (2975/2980) of the scenarios have n²k² < 10⁹, the O(n²k²) algorithm is efficient enough for almost all practical cases.
Approximation: Furthermore, we can approximate the PMF by ignoring the distinct-summand constraint, i.e., “counting the number of partitions of s into k summands from {1, 2, …, n}”. This problem has a nice generating function, (x + x² + … + xⁿ)^k, where the coefficient of x^s is the number of partitions [43]:

    Σ_{i=0}^{⌊(s−k)/n⌋} (k choose i)·(−1)^i·(s−ni−1 choose k−1)    (6)
We can calculate these coefficients using two algorithms with different tradeoffs. The first algorithm pre-calculates the binomial coefficients with Pascal's triangle and then calculates all the coefficients with formula (6). The first step takes O(nk²) because s−ni−1 ≤ nk and i ≤ k. The second step also takes O(nk²) because each of the O(nk) coefficients takes O(k) to compute, as s−k ≤ nk. Thus, the overall time complexity of the first algorithm is O(nk²). The second algorithm calculates the generating function directly with the fast Fourier transform [4] by first converting x + x² + … + xⁿ to the point-value representation, raising each point value to the k-th power, and interpolating to get the coefficients. The second algorithm takes O(nk·log(nk)) because the length of the polynomial is O(nk). Comparing the complexities, the first algorithm is better when k is small compared to n (i.e., k·log k < log n), and the second is better otherwise.
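A C++ sketch of the first algorithm follows; it is our transcription of formula (6), and using double precision for the (potentially huge) counts is our simplifying assumption, not part of the paper:

  #include <vector>

  // Coefficient of x^s in (x + x^2 + ... + x^n)^k for s = 0..nk, via formula (6).
  std::vector<double> composition_counts(int n, int k) {
    int maxS = n * k;
    std::vector<double> col(maxS + 1, 0.0);      // col[a] = C(a, k-1)
    col[k - 1] = 1.0;
    for (int a = k - 1; a < maxS; a++) col[a + 1] = col[a] * (a + 1) / (a - k + 2);
    std::vector<double> bk(k + 1, 1.0);          // bk[i] = C(k, i)
    for (int i = 1; i <= k; i++) bk[i] = bk[i - 1] * (k - i + 1) / i;
    std::vector<double> coeff(maxS + 1, 0.0);
    for (int s = k; s <= maxS; s++)
      for (int i = 0; i * n <= s - k; i++)       // i <= floor((s-k)/n)
        coeff[s] += (i % 2 ? -1.0 : 1.0) * bk[i] * col[s - n * i - 1];
    return coeff;  // dividing by the total n^k yields the approximate PMF of the sum
  }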
To evaluate the approximation, we use the Jensen–Shannon (JS) distance [16] between the exact and the approximated PMFs. We check our approximation on the same real scenarios as in Section 3.1. As Table 1 shows, the approximation yields PMFs with a small JS distance, the largest being only 0.0442 for n = 52, k = 9.
3.3 PMF of γ

The PMF of γ is more complex than that of α because, even for the simplest all-to-one failure-to-fault matrix, the number of possible values of γ can be Ω(2ⁿ). For example, consider n tests with costs 1, 2, 4, …, 2^{n−1}, where only one test fails and detects the only fault. The γ value depends on the sum of the costs of the tests that precede the failure. 2^{n−1} different sets of tests can precede the failure, and every set has a distinct sum of costs. Even for the example in Fig. 1, the support of the PMF of γ (33 values) is much bigger than that of α (8 values).
4 Expected Values for All Orders a(T)

While some comparisons of RTP techniques use full samples of PMFs, many use just the arithmetic mean of the samples. We next derive formulas for expected values to obtain the mean faster and without the imprecision from sampling. In this section, we consider the case where the order o is uniformly selected from a(T), allowing all n! orders of the n tests. Because α is a special case of γ where ∀t, t′ ∈ T. σ(t) = σ(t′), we first derive γ.
To start with a simple example, consider a test suite with only one failing test (k = 1). For a random order, the test can be at any position with equal probability. Intuitively, the expected position across all of the orders is the middle of the sequence, hence α and γ should be about 1/2. In fact, we will show that they are exactly 1/2. Moreover, the expected values of both α and γ are 1/2 as long as each fault is detected by only one failing test (i.e., ∀i. k_i = |T_i| = 1, which includes one-to-one). In general, the failure-to-fault matrix can be more complex: many tests could detect the same fault, and a test could detect many faults. To compute the expected values of α and γ, we first prove a useful lemma.
Lemma 4. For every fault i,

    ∀t ∉ T_i. P_a(t < t_{τ_i}) = P_a(∀t′ ∈ T_i. t < t′) = 1/(k_i + 1)    (7)

Proof. Since τ_i is the position of the first test from T_i in the order, t precedes t_{τ_i} iff t precedes every t′ ∈ T_i. Consider the relative position of each t ∉ T_i with respect to all the tests from T_i in a random order. By symmetry, it is equally likely that t is in any of the k_i + 1 relative positions created by the relative order of the k_i tests from T_i. Therefore, the probability that t is in the relative position preceding all the k_i tests from T_i is 1/(k_i + 1).
We first use this lemma to compute E_a[γ].

Theorem 5 (The expected value of γ for a(T)).

    E_a[γ] = 1 − (Σ_{i=1}^m (σ(T∖T_i)/(k_i+1) + σ(T_i)/(2k_i))) / (m·σ(T))    (8)

Proof. From (2), the two key terms in γ are σ(t_{τ_i}) and Σ_{j=τ_i}^n σ(t_j). By symmetry, any test t ∈ T_i can be the first in the order, or equivalently t = t_{τ_i}, with probability 1/k_i. Thus

    E_a[σ(t_{τ_i})] = Σ_{t∈T_i} P(t = t_{τ_i})·σ(t) = σ(T_i)/k_i    (9)

Next, consider that Σ_{j=τ_i}^n σ(t_j) = Σ_{t∈T} σ(t)·1_{t_{τ_i} ≤ t} can also be calculated as Σ_{t∈T_i} σ(t)·1_{t_{τ_i} ≤ t} + Σ_{t∉T_i} σ(t)·1_{t_{τ_i} ≤ t}. For every test t ∈ T_i, t_{τ_i} ≤ t by definition, so ∀t ∈ T_i. E_a[1_{t_{τ_i} ≤ t}] = 1. For every test t ∉ T_i, E_a[1_{t_{τ_i} ≤ t}] = P_a(t_{τ_i} ≤ t) = 1 − P_a(t < t_{τ_i}) = k_i/(k_i+1). The last equality stems from Lemma 4. Therefore, by the linearity of expectation, we get

    E_a[Σ_{j=τ_i}^n σ(t_j)] = σ(T_i) + (k_i/(k_i+1))·σ(T∖T_i)    (10)

From (2), (9), and (10), we get (8).
Corollary 5.1 (The expected value of α for a(T)).

    E_a[α] = 1 − ((n+1)·Σ_{i=1}^m 1/(k_i+1)) / (nm) + 1/(2n)    (11)
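Formulas (8) and (11) are directly computable; the following small C++ sketch (ours, for illustration) evaluates both from the per-fault counts k_i, the costs σ(T_i), and the total cost σ(T):

  #include <vector>

  // E_a[alpha] via formula (11): kk[i] = k_i for each of the m faults.
  double expected_alpha_all(int n, const std::vector<int>& kk) {
    double s = 0.0;
    for (int ki : kk) s += 1.0 / (ki + 1);
    return 1.0 - (n + 1) * s / (double(n) * kk.size()) + 1.0 / (2.0 * n);
  }

  // E_a[gamma] via formula (8): total = sigma(T), sigmaTi[i] = sigma(T_i).
  double expected_gamma_all(double total, const std::vector<double>& sigmaTi,
                            const std::vector<int>& kk) {
    double s = 0.0;
    for (std::size_t i = 0; i < kk.size(); i++)
      s += (total - sigmaTi[i]) / (kk[i] + 1) + sigmaTi[i] / (2.0 * kk[i]);
    return 1.0 - s / (kk.size() * total);
  }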
Revisiting the case where each fault can be detected by only one failing test, setting ∀i. k_i = 1 in (8) or (11) gives exactly 1/2 = E_a[α] = E_a[γ]. In fact, even in the general case of an arbitrary failure-to-fault matrix, we find that the two expected values are similar if not the same, inspiring us to derive the following bound:
Theorem 6 (The expected difference of α and γ for a(T)).

    −1/12 < E_a[α] − E_a[γ] < 1/(2n)    (12)

Proof. From formulas (8) and (11), we have E_a[α] − E_a[γ] = ∆_γ − ∆_α + 1/(2n), where ∆_γ = (Σ_{i=1}^m (1/(2k_i) − 1/(k_i+1))·σ(T_i)) / (m·σ(T)) and ∆_α = (Σ_{i=1}^m 1/(k_i+1)) / (nm). Since k_i ≥ 1, we have −1/12 ≤ 1/(2k_i) − 1/(k_i+1) ≤ 0 (by basic calculus, the minimum is attained for k_i = 2 or k_i = 3), which, combined with σ(T_i) ≤ σ(T), gives −1/12 ≤ ∆_γ ≤ 0. Since k_i ≥ 1, we also have 0 < 1/(k_i+1) ≤ 1/2, which gives 0 < ∆_α ≤ 1/(2n). Thus, we have −1/12 ≤ ∆_γ − ∆_α + 1/(2n) < 1/(2n). However, ∆_γ − ∆_α + 1/(2n) = −1/12 would require ∆_α = 1/(2n) and thus ∀i. k_i = 1, in which case ∆_γ = 0 and ∆_γ − ∆_α + 1/(2n) = 0 ≠ −1/12. Therefore, the equality cannot hold, and −1/12 < E_a[α] − E_a[γ] < 1/(2n).
5 Expected Values for Compatible Orders c(T)

In this section, we consider the expected values of α and γ for c(T). Compatible orders do not interleave tests from different classes, as defined in Section 2. Similar to a(T), we first prove a useful lemma for c(T).
Lemma 7. For every fault i (note that if t ∉ T_i, the class C(t) may contain another t′ ∈ T_i),

    ∀t ∉ T_i. P_c(t < t_{τ_i}) = P_c(∀t′ ∈ T_i. t < t′) =
        1/(|𝒞_i|·(|T_{i,C(t)}|+1))   if C(t) ∈ 𝒞_i
        1/(|𝒞_i|+1)                  if C(t) ∉ 𝒞_i    (13)

Proof. For the case C(t) ∈ 𝒞_i, two conditions must hold for t ∉ T_{i,C(t)} to precede all tests that detect fault i. First, among all classes in 𝒞_i, C(t) must be the first in the order, and by symmetry, each class in 𝒞_i can be the first with the same probability 1/|𝒞_i|. Second, t must precede all tests from T_{i,C(t)}, which (similar to Lemma 4) holds with probability 1/(|T_{i,C(t)}|+1). The two conditions are independent because they are about the class order and the test order inside the class, respectively, and these orders are independent of each other. Therefore, the probability that t precedes the first test that detects fault i is 1/(|𝒞_i|·(|T_{i,C(t)}|+1)).
For the case C(t) ∉ 𝒞_i, only one condition—C(t) precedes all classes in 𝒞_i—must hold for t to precede the first test that detects fault i, which (similar to Lemma 4) happens with probability 1/(|𝒞_i|+1).
Theorem 8 (The expected value of γ for c(T)).

    E_c[γ] = 1 − (1/(m·σ(T)))·Σ_{i=1}^m ( Σ_{C∉𝒞_i} σ(T_C)/(|𝒞_i|+1) + (1/|𝒞_i|)·Σ_{C∈𝒞_i} ( σ(T_C∖T_{i,C})/(|T_{i,C}|+1) + σ(T_{i,C})/(2|T_{i,C}|) ) )    (14)
Proof. We first compute the two key terms, σ(t_{τ_i}) and Σ_{j=τ_i}^n σ(t_j), in γ. For a test t ∈ T_i to be the first, its class C(t) ∈ 𝒞_i must be the first among all classes in 𝒞_i, which happens with probability 1/|𝒞_i|, and t must be the first among all tests in T_{i,C(t)}, which happens with probability 1/|T_{i,C(t)}|. These two events are independent, so the joint probability is 1/(|𝒞_i|·|T_{i,C(t)}|). By σ(t_{τ_i}) = Σ_{t∈T_i} σ(t)·1_{t=t_{τ_i}}, we have

    E_c[σ(t_{τ_i})] = Σ_{t∈T_i} σ(t)/(|𝒞_i|·|T_{i,C(t)}|) = (1/|𝒞_i|)·Σ_{C∈𝒞_i} σ(T_{i,C})/|T_{i,C}|    (15)

Next, consider Σ_{j=τ_i}^n σ(t_j) = Σ_{t∈T} σ(t)·1_{t_{τ_i} ≤ t}. Each t is either (1) t ∈ T_i, where 1_{t_{τ_i} ≤ t} = 1 by the definition of τ_i; or (2) t ∉ T_i, where E_c[1_{t_{τ_i} ≤ t}] = E_c[1_{t_{τ_i} < t}] = P_c(t_{τ_i} < t) = 1 − P_c(t < t_{τ_i}) can be obtained from Lemma 7. Combining these cases, we have

    E_c[Σ_{j=τ_i}^n σ(t_j)] = σ(T_i) + (|𝒞_i|/(|𝒞_i|+1))·Σ_{C∉𝒞_i} σ(T_C) + Σ_{C∈𝒞_i} (1 − 1/(|𝒞_i|·(|T_{i,C}|+1)))·σ(T_C∖T_{i,C})    (16)

From (2), (15), and (16), we get (14).
Corollary 8.1 (The expected value of α for c(T)).

    E_c[α] = 1 − (1/(nm))·Σ_{i=1}^m ( Σ_{C∉𝒞_i} |T_C|/(|𝒞_i|+1) + (1/|𝒞_i|)·Σ_{C∈𝒞_i} (|T_C|+1)/(|T_{i,C}|+1) ) + 1/(2n)    (17)
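Formula (14) likewise admits a direct transcription. The following C++ sketch uses our own per-class encoding (σ(T_C), σ(T_{i,C}), and |T_{i,C}| per fault) and is an illustration, not the paper's code:

  #include <vector>

  struct ClassStat { double sigmaTC; double sigmaTiC; int cntTiC; };

  // E_c[gamma] via formula (14). For fault i, perFault[i][c] describes class c:
  // sigma(T_C), sigma(T_{i,C}), and |T_{i,C}| (zero iff C is not in C_i).
  double expected_gamma_compat(double total,  // total = sigma(T)
                               const std::vector<std::vector<ClassStat>>& perFault) {
    double acc = 0.0;
    for (const auto& stats : perFault) {
      int ci = 0;                                       // |C_i|
      for (const auto& st : stats) if (st.cntTiC > 0) ci++;
      double outside = 0.0, inside = 0.0;
      for (const auto& st : stats) {
        if (st.cntTiC == 0) outside += st.sigmaTC / (ci + 1);
        else inside += (st.sigmaTC - st.sigmaTiC) / (st.cntTiC + 1)
                       + st.sigmaTiC / (2.0 * st.cntTiC);
      }
      acc += outside + inside / ci;
    }
    return 1.0 - acc / (perFault.size() * total);
  }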
We next discuss the expected difference of E_c[α] and E_c[γ]. Unlike the case of a(T), where the difference has a rather small bound, we find that the difference can be rather large for c(T).
Theorem 9 (The expected difference of α and γ for c(T)).

    −1/2 < E_c[α] − E_c[γ] ≤ 1/2 − 1/(2n)    (18)

Proof. From (14) and (17), we get E_c[α] − E_c[γ] = ∆_γ − ∆_α + 1/(2n), where

    ∆_γ = (1/(m·σ(T)))·Σ_{i=1}^m ( Σ_{C∉𝒞_i} σ(T_C)/(|𝒞_i|+1) + (1/|𝒞_i|)·Σ_{C∈𝒞_i} ( σ(T_C∖T_{i,C})/(|T_{i,C}|+1) + σ(T_{i,C})/(2|T_{i,C}|) ) ),
    ∆_α = (1/(nm))·Σ_{i=1}^m ( Σ_{C∉𝒞_i} |T_C|/(|𝒞_i|+1) + (1/|𝒞_i|)·Σ_{C∈𝒞_i} (|T_C|+1)/(|T_{i,C}|+1) ).

∆_γ > 0 because all the terms in ∆_γ are positive. From ∀i, C ∈ 𝒞_i. |𝒞_i| ≥ 1 and |T_{i,C}| ≥ 1, we have

    ∆_γ ≤ (1/(m·σ(T)))·Σ_{i=1}^m ( Σ_{C∉𝒞_i} σ(T_C)/(1+1) + (1/1)·Σ_{C∈𝒞_i} ( σ(T_C∖T_{i,C})/(1+1) + σ(T_{i,C})/(2·1) ) ) = (1/(m·σ(T)))·(1/2)·Σ_{i=1}^m σ(T) = 1/2.

Similarly,

    ∆_α ≤ (1/(nm))·Σ_{i=1}^m ( Σ_{C∉𝒞_i} |T_C|/(|𝒞_i|+1) + (1/|𝒞_i|)·Σ_{C∈𝒞_i} (|T_C|+1)/(1+1) ) ≤ (n+1)/(2n),

where the last step can be checked by distinguishing |𝒞_i| = 1 (where the per-fault term equals (n+1)/2 exactly) from |𝒞_i| ≥ 2. From 0 ≤ |T_{i,C}| ≤ |T_C|, we also have ∆_α ≥ 1/n. Combining 0 < ∆_γ ≤ 1/2 and 1/n ≤ ∆_α ≤ (n+1)/(2n), we get −1/2 < ∆_γ − ∆_α + 1/(2n) ≤ 1/2 − 1/(2n).
Considering the many inequalities in the preceding proof, one may expect the bounds to be loose, but we show two scenarios where the bounds are close to tight. Both scenarios have only one fault. Scenario one has two classes: C1 has only one passing test t with cost q·N (q > 0 is arbitrary), and C2 has N failing tests, each with cost q/N. We assume N ≫ 1. t must be the first or the last test in any compatible order, each with probability 1/2 (when C1 is first or second). E_c[α] is close to 1, and E_c[γ] is only about 1/2. Precisely, E_c[α] − E_c[γ] = (N²−2N+2)/(2N²+2N) ≈ 1/2 when N ≫ 1. Scenario two has two classes: C2 has N failing tests with cost q/N, and C3 has N² passing tests, each with cost q/N³. The two classes have only two orders, each with probability 1/2. E_c[γ] is close to 1, and E_c[α] is only about 1/2. Precisely, E_c[α] − E_c[γ] = 1/(N+1) − (N²+2)/(2N²+2N) + 1/(2N) ≈ −1/2 when N ≫ 1.
5.1 Comparison of a(T) and c(T)

Compatible orders put more constraints on the PMF, which could increase or decrease the average α or γ values. To compare how orders in a(T) and c(T) perform on average, we compare E_a[α] with E_c[α] and E_a[γ] with E_c[γ].

Theorem 10 (Difference of E_c[γ] and E_a[γ]).

    1/(2n) − 1/2 ≤ E_c[γ] − E_a[γ] ≤ 1/6    (19)
Proof. From (8) and (14), we have

    E_c[γ] − E_a[γ] = (1/(m·σ(T)))·Σ_{i=1}^m ( σ(T_i)/(2k_i) + σ(T∖T_i)/(k_i+1) − Σ_{C∉𝒞_i} σ(T_C)/(|𝒞_i|+1) − (1/|𝒞_i|)·Σ_{C∈𝒞_i} ( σ(T_C∖T_{i,C})/(|T_{i,C}|+1) + σ(T_{i,C})/(2|T_{i,C}|) ) )    (20)

Because ∀i. 1 ≤ k_i ≤ n, |𝒞_i| ≥ 1, and |T_{i,C}| ≥ 1, the positive terms in (20) contribute at least σ(T)/(2n) per fault and the subtracted terms at most σ(T)/2 per fault, so

    E_c[γ] − E_a[γ] ≥ (1/(m·σ(T)))·Σ_{i=1}^m (1/(2n) − 1/2)·σ(T) = 1/(2n) − 1/2.

For the other side, because ∀i. |𝒞_i| ≤ k_i and |T_{i,C}| ≤ k_i, we have

    E_c[γ] − E_a[γ]
      ≤ (1/(m·σ(T)))·Σ_{i=1}^m ( σ(T_i)/(2k_i) + σ(T∖T_i)/(k_i+1) − Σ_{C∉𝒞_i} σ(T_C)/(k_i+1) − ((Σ_{C∈𝒞_i} σ(T_C)) − σ(T_i))/(|𝒞_i|·(k_i+1)) − σ(T_i)/(2|𝒞_i|·k_i) )
      = (1/(m·σ(T)))·Σ_{i=1}^m (1 − 1/|𝒞_i|)·( (Σ_{C∈𝒞_i} σ(T_C))/(k_i+1) − σ(T_i)·(1/(k_i+1) − 1/(2k_i)) )
      ≤ (1/(m·σ(T)))·Σ_{i=1}^m (1 − 1/|𝒞_i|)·(Σ_{C∈𝒞_i} σ(T_C))/(k_i+1)
      ≤ (1/(m·σ(T)))·Σ_{i=1}^m ((|𝒞_i|−1)/(|𝒞_i|·(|𝒞_i|+1)))·Σ_{C∈𝒞_i} σ(T_C)
      ≤ (1/(m·σ(T)))·Σ_{i=1}^m σ(T)/6 = 1/6
The third-to-last inequality holds because ∀k_i ≥ 1. 1/(k_i+1) − 1/(2k_i) ≥ 0. The last inequality holds because ∀|𝒞_i| ≥ 1. (|𝒞_i|−1)/(|𝒞_i|·(|𝒞_i|+1)) ≤ 1/6, which can be shown with simple calculus, and because Σ_{C∈𝒞_i} σ(T_C) ≤ σ(T).
Corollary 10.1 (Difference of E_c[α] and E_a[α]).

    1/(2n) − 1/2 ≤ E_c[α] − E_a[α] ≤ 1/6    (21)
We give two scenarios where the preceding bounds are close to tight. In both scenarios, we set ∀t, t′ ∈ T. σ(t) = σ(t′), so that α = γ and E_c[α] − E_a[α] = E_c[γ] − E_a[γ]. The first scenario has one fault F; each of the |𝒞| classes contains n/|𝒞| tests, tests from only one class detect F, and all tests in that class detect F. In this scenario, E_a[α] = 1 − |𝒞|(n+1)/(n(n+|𝒞|)) + 1/(2n), and E_c[α] = 1 − (|𝒞|−1)/(2|𝒞|) − 1/(2n). If we consider |𝒞| = √n, then when n ≫ 1, E_a[α] ≈ 1 but E_c[α] ≈ 1/2, hence E_c[α] − E_a[α] ≈ −1/2. The second scenario has one fault F and two classes with 1 and n−1 tests, and each class contains exactly one test that detects F. In this scenario, E_a[α] = 2/3 + 1/(6n) and E_c[α] = 3/4. When n ≫ 1, E_c[α] − E_a[α] ≈ 1/12, close to the upper bound of 1/6.
In brief, measured by α or γ, compatible orders can be much worse on average than all orders (up to 1/2) but cannot be much better (up to 1/6).
6 Properties of Metrics and Checking Prior RTP Work

Prior work on random RTP uses sampling and often visualizes α and γ values as boxplots that may show the median, mean, quartiles (25% and 75%), and “whiskers” (1.5 times the interquartile range) of the sampled distribution. For papers that show these boxplots, we identify two properties, focusing on a(T) because it is used in almost all prior work instead of c(T) [46]:

Mean/Median at Least Half: E_a[α], Med_a(α), E_a[γ], Med_a(γ) ≥ 1/2.
Symmetric PMF: E_a[α] = 1/2 ∨ Med_a(α) = 1/2 ∨ E_a[γ] = 1/2 ∨ Med_a(γ) = 1/2 ∨ ∀i. k_i = 1 ⟹ the PMFs of α and γ are symmetric around 1/2.
To check the boxplots from prior work, we search on Google Scholar for
papers related to “test prioritization” and keep only the papers that contain
both “test” and “prioriti” in the titles. We sort these papers based on their
citation count and check the top 100 papers with the highest citation count [44].
6.1 Mean/Median at Least Half

Lemma 11. For every order o ∈ a(T) and its reverse order o′ ∈ a(T),

    γ(o) + γ(o′) ≥ 1    (22)

The equality holds iff ∀i. k_i = 1.

Proof sketch. To give some intuition: when ∀i. k_i = 1, the test that first detects fault i does not change by reversing the order, so the “prefixes” of that test in o and o′ complement each other and form the entire test suite. In this case, γ(o) + γ(o′) = 1. If ∃i. k_i ≥ 2, the test that first detects fault i in o is not the same test in o′, and the “prefixes” of these two tests in o and o′ do not form the entire test suite, so γ(o) + γ(o′) > 1. We omit the details due to the space limit.
Theorem 12 (Measures of central tendency are at least half).

    min{E_a[α], Med_a(α), E_a[γ], Med_a(γ)} ≥ 1/2    (23)

The equality holds iff ∀i. k_i = 1.

Proof sketch. From (22), we get E_a[γ] = (1/2)·(Σ_{o∈a(T)} (γ(o) + γ(o′)))/n! ≥ 1/2, and the equality holds iff ∀i. k_i = 1. Because α can be viewed as a special case of γ, we also have the same result for E_a[α]. The same result for Med_a(α) and Med_a(γ) can also be derived from (22). We omit the details due to the space limit.
When we inspect the top 100 most cited RTP papers, we find at least five papers with boxplots clearly showing a mean or median below 1/2. These papers range from seminal papers [12, Figs. 2b, 2c, 2e] (year 2000) and [13, Fig. 3: schedule, tcas] (2002) to more recent ones [29, Fig. 4] (2007), [5, Fig. 2] (2016; a co-author of that paper is also a co-author of this paper), and [41, Fig. 5] (2017). Instead of sampling random orders an arbitrary number of times, future RTP research could use our formulas or algorithm to obtain correct mean and median values.
6.2 Symmetric PMF

We also prove that the α and γ PMFs are symmetric when (23)'s equality holds.

Theorem 13 (Symmetry of the α and γ PMFs). If E_a[α] = 1/2 ∨ Med_a(α) = 1/2 ∨ E_a[γ] = 1/2 ∨ Med_a(γ) = 1/2 ∨ ∀i. k_i = 1, then

    ∀δ. P(α = 1/2 − δ) = P(α = 1/2 + δ) ∧ P(γ = 1/2 − δ) = P(γ = 1/2 + δ)    (24)

Proof. From Theorem 12, min{E_a[α], Med_a(α), E_a[γ], Med_a(γ)} = 1/2 ⟺ ∀i. k_i = 1. Moreover, ∀i. k_i = 1 implies ∀o. α(o) + α(o′) = 1 ∧ γ(o) + γ(o′) = 1. Each order has exactly one reverse order, so the PMFs of α and γ are symmetric around 1/2.
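Theorem 13 can be sanity-checked by brute force on tiny suites. The following self-contained C++ program (our illustration; the scenario is made up) enumerates all n! orders of a one-to-one scenario and verifies that the PMF of Σ τ_i — and hence of α — is symmetric:

  #include <algorithm>
  #include <cstdio>
  #include <map>
  #include <vector>

  int main() {
    int n = 5, m = 2;                                 // tiny hypothetical scenario
    std::vector<std::vector<bool>> M(n, std::vector<bool>(m, false));
    M[1][0] = M[3][1] = true;                         // one-to-one: k_i = 1 for all i
    std::vector<int> order(n);
    for (int j = 0; j < n; j++) order[j] = j;
    std::map<long long, long long> pmf;               // counts of S = sum of tau_i
    do {
      long long S = 0;
      for (int i = 0; i < m; i++)
        for (int pos = 0; pos < n; pos++)
          if (M[order[pos]][i]) { S += pos + 1; break; }
      pmf[S]++;
    } while (std::next_permutation(order.begin(), order.end()));
    // alpha = 1 - S/(nm) + 1/(2n) equals 1/2 iff S = m(n+1)/2, so the alpha PMF
    // is symmetric around 1/2 iff the counts of S are symmetric around m(n+1)/2.
    long long c2 = (long long)m * (n + 1);            // twice the center of symmetry
    bool sym = true;
    for (const auto& [S, cnt] : pmf) {
      auto it = pmf.find(c2 - S);
      sym = sym && it != pmf.end() && it->second == cnt;
    }
    std::printf("PMF symmetric around 1/2: %s\n", sym ? "yes" : "no");
  }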
When we inspect the top 100 most cited RTP papers again, we find at least three papers relevant to this property. Based on the information in these papers, we believe that ∀i. k_i = 1 holds for them. Ideally, we would confirm each paper's failure-to-fault matrix, but papers often omit such details. On a positive note, the authors of one paper [38] released their dataset, which we analyzed and confirmed that ∀i. k_i = 1. The papers that violate this property include the most widely cited paper on RTP [38, Fig. 5: schedule, schedule2, tcas] (year 2001; 1563 citations per Google Scholar) as well as older [36, Fig. 4: schedule, schedule2, tcas] (1999) and newer [40, Fig. 2] (2015) papers.
Instead of randomly sampling orders to approximate PMFs, future RTP pa-
pers could use our algorithm to compute exact PMFs. While we find only five
and three papers that definitely violate Mean/Median at Least Half and Sym-
metric PMF, respectively, we suspect that many others may violate these or
similar properties. However, due to the lack of data in many papers (e.g., no
boxplot for random RTP), we cannot easily identify all violations.
7 Related Work

Some prior work [45,49] considers expected values of α and γ but in contexts different from ours. Random testing (but not random RTP) has been studied for a long time [7–9,17,18,31–33,50]. The most related are theoretical analyses of random test generation. Böhme and Paul [2,3] analyze how random sampling of test inputs compares to systematic generation: random sampling can be more efficient when the cost to systematically generate a test input exceeds the cost to randomly sample an input by some factor. Böhme et al. [1] analyze the connection between Shannon's entropy and the discovery rate of a fuzzer that randomly generates inputs. They provide the foundation for identifying random seeds for the fuzzer to improve the overall efficiency. Their analysis also enables future systematic approaches for test generation to be compared more efficiently with random ones. Similarly, our analysis can help future RTP work compare more efficiently against random RTP and avoid insufficient sampling. Beyond random test generation, Majumdar and Niksic [26] present a theoretical analysis of the effectiveness of randomly inserted partition faults for finding bugs in distributed systems. In contrast, our analysis is on test-suite orders for random RTP.
8 Conclusion

Regression test prioritization (RTP) is a popular regression testing approach. The majority of highly cited RTP papers have compared RTP techniques with random RTP. However, all evaluations have been empirical, with no prior theoretical analysis of random RTP. This paper has presented such an analysis by introducing an algorithm for efficiently computing the exact probability mass function of APFD, deriving closed-form formulas and approximations for various metrics and scenarios, and proving two interesting properties of APFD and APFDc. Overall, our analysis provides new insights into random RTP, and our results show that future RTP work often need not use random sampling but can use our simple formulas or algorithms to evaluate random RTP more precisely.

Acknowledgments. We thank Anjiang Wei, Dezhi Ran, and Sasa Misailovic for their help. This work was partially supported by US NSF grants CCF-1763788 and CCF-1956374, NSFC grant No. 62161146003, the Tencent Foundation, and the XPLORER PRIZE. We acknowledge support for research on regression testing from Dragon Testing, Microsoft, and Qualcomm. Tao Xie is the corresponding author, and he is also affiliated with the Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education, China.
References
1. Böhme, M., Manès, V.J.M., Cha, S.K.: Boosting fuzzer efficiency: An information theoretic perspective. In: ESEC/FSE (2020)
2. Böhme, M., Paul, S.: On the efficiency of automated testing. In: FSE (2014)
3. Böhme, M., Paul, S.: A probabilistic analysis of the efficiency of automated software testing. TSE (2016)
4. Brigham, E.O.: The fast Fourier transform and its applications. Prentice-Hall, Inc.
(1988)
5. Busjaeger, B., Xie, T.: Learning for test prioritization: An industrial case study.
In: FSE (2016)
6. Cheng, R., Zhang, L., Marinov, D., Xu, T.: Test-case prioritization for configuration
testing. In: ISSTA (2021)
7. Claessen, K., Hughes, J.: QuickCheck: A lightweight tool for random testing of
Haskell programs. In: ICFP (2000)
8. Csallner, C., Smaragdakis, Y., Xie, T.: DSD-Crasher: A hybrid analysis tool for
bug finding. TOSEM (2008)
9. Duran, J.W., Ntafos, S.C.: An evaluation of random testing. TSE (1984)
10. Elbaum, S., Kallakuri, P., Malishevsky, A., Rothermel, G., Kanduri, S.: Under-
standing the effects of changes on the cost-effectiveness of regression testing tech-
niques. STVR (2003)
11. Elbaum, S., Malishevsky, A., Rothermel, G.: Incorporating varying test costs and
fault severities into test case prioritization. In: ICSE (2001)
12. Elbaum, S., Malishevsky, A.G., Rothermel, G.: Prioritizing test cases for regression
testing. In: ISSTA (2000)
13. Elbaum, S., Malishevsky, A.G., Rothermel, G.: Test case prioritization: A family
of empirical studies. TSE (2002)
14. Elbaum, S., Rothermel, G., Penix, J.: Techniques for improving regression testing
in continuous integration development environments. In: FSE (2014)
15. Elsner, D., Hauer, F., Pretschner, A., Reimer, S.: Empirically evaluating readily
available information for regression test optimization in continuous integration. In:
ISSTA (2021)
16. Endres, D.M., Schindelin, J.E.: A new metric for probability distributions. Trans-
actions on Information Theory (2003)
17. Fraser, G., Zeller, A.: Generating parameterized unit tests. In: ISSTA (2011)
18. Hamlet, R.: Random testing. In: Encyclopedia of Software Engineering (1994)
19. Jiang, B., Zhang, Z., Chan, W.K., Tse, T.H.: Adaptive random test case prioriti-
zation. In: ASE (2009)
20. JUnit (2022), https://junit.org
21. Kim, J.M., Porter, A.: A history-based test prioritization technique for regression
testing in resource constrained environments. In: ICSE (2002)
22. Kim, J.M., Porter, A., Rothermel, G.: An empirical study of regression test appli-
cation frequency. STVR (2005)
23. Liang, J., Elbaum, S., Rothermel, G.: Redefining prioritization: Continuous prior-
itization for continuous integration. In: ICSE (2018)
24. Lu, Y., Lou, Y., Cheng, S., Zhang, L., Hao, D., Zhou, Y., Zhang, L.: How does
regression test prioritization perform in real-world software evolution? In: ICSE
(2016)
25. Luo, Q., Hariri, F., Eloussi, L., Marinov, D.: An empirical analysis of flaky tests.
In: FSE (2014)
26. Majumdar, R., Niksic, F.: Why is random testing effective for partition tolerance
bugs? In: POPL (2017)
27. Mattis, T., Rein, P., Dürsch, F., Hirschfeld, R.: RTPTorrent: An open-source dataset for evaluating regression test prioritization. In: MSR (2020)
28. Maven (2022), https://maven.apache.org
29. Mirarab, S., Tahvildari, L.: A prioritization approach for software test cases based
on Bayesian networks. In: FASE (2007)
30. Mondal, S., Nasre, R.: Summary of Hansie: Hybrid and consensus regression test
prioritization. In: ICST (2021)
31. Ntafos, S.: On random and partition testing. In: ISSTA (1998)
32. Ozkan, B.K., Majumdar, R., Oraee, S.: Trace aware random testing for distributed
systems. OOPSLA (2019)
33. Pacheco, C., Lahiri, S.K., Ernst, M.D., Ball, T.: Feedback-directed random test
generation. In: ICSE (2007)
34. Peng, Q., Shi, A., Zhang, L.: Empirically revisiting and enhancing IR-based test-
case prioritization. In: ISSTA (2020)
35. pytest (2022), https://docs.pytest.org
36. Rothermel, G., Untch, R., Chu, C., Harrold, M.: Test case prioritization: An em-
pirical study. In: ICSM (1999)
37. Rothermel, G., Elbaum, S., Malishevsky, A., Kallakuri, P., Davia, B.: The impact
of test suite granularity on the cost-effectiveness of regression testing. In: ICSE
(2002)
38. Rothermel, G., Untch, R.H., Chu, C., Harrold, M.J.: Prioritizing test cases for
regression testing. TSE (2001)
39. Rummel, M.J., Kapfhammer, G.M., Thall, A.: Towards the prioritization of re-
gression test suites with data flow information. In: SAC (2005)
40. Saha, R.K., Zhang, L., Khurshid, S., Perry, D.E.: An information retrieval approach
for regression test prioritization based on program changes. In: ICSE (2015)
41. Spieker, H., Gotlieb, A., Marijan, D., Mossige, M.: Reinforcement learning for
automatic test case prioritization and selection in continuous integration. In: ISSTA
(2017)
42. Srivastava, A., Thiagarajan, J.: Effectively prioritizing tests in development envi-
ronment. In: ISSTA (2002)
43. Stanley, R.P.: Enumerative Combinatorics, Volume 1. Cambridge University Press
(2011)
44. A Theoretical Analysis of Regression Test Prioritization website (2022), https://sites.google.com/view/theoretical-analysis-of-rtp
45. Wang, Z., Chen, L.: Improved metrics for non-classic test prioritization problems.
In: SEKE (2015)
46. Wei, A., Yi, P., Xie, T., Marinov, D., Lam, W.: Probabilistic and systematic cov-
erage of consecutive test-method pairs for detecting order-dependent flaky tests.
In: TACAS (2021)
47. Wong, W., Horgan, J., London, S., Agrawal, H.: A study of effective regression
testing in practice. In: ISSRE (1997)
48. Yoo, S., Harman, M.: Regression testing minimization, selection and prioritization:
A survey. STVR (2012)
49. Zhai, K., Jiang, B., Chan, W.: Prioritizing test cases for regression testing of
location-based services: Metrics, techniques, and case study. IEEE TSC (2012)
50. Zhang, S., Saff, D., Bu, Y., Ernst, M.D.: Combined static and dynamic automated
test generation. In: ISSTA (2011)
Verified First-Order Monitoring with Recursive Rules
Sheila Zingg¹, Srđan Krstić¹, Martin Raszyk¹, Joshua Schneider¹, and Dmitriy Traytel²

¹ Institute of Information Security, Department of Computer Science, ETH Zürich, Zurich, Switzerland, {srdan.krstic,martin.raszyk,joshua.schneider}@inf.ethz.ch
² Department of Computer Science, University of Copenhagen, Copenhagen, Denmark, traytel@di.ku.dk
Abstract.
First-order temporal logics and rule-based formalisms are two popular
families of specification languages for monitoring. Each family has its advantages, and only a few monitoring tools support their combination. We extend metric
first-order temporal logic (MFOTL) with a recursive let construct, which enables
interleaving rules with temporal logic formulas. We also extend VeriMon, an
MFOTL monitor whose correctness has been formally verified using the Isabelle
proof assistant, to support the new construct. The extended correctness proof
covers the interaction of the new construct with the existing verified algorithm,
which is subtle due to the presence of the bounded future temporal operators. We
demonstrate the recursive let’s usefulness on several example specifications and
evaluate our verified algorithm’s performance against the DejaVu monitoring tool.
Keywords: Rule-based specifications · Monitoring · Formal verification.
1 Introduction
In runtime verification, a monitor observes events generated by a running system and
analyzes the event streams for compliance with a given specification. Temporal spec-
ification languages for monitoring are often classified as operational or declarative [10].
Operational languages explicitly describe how the monitor’s input should be transformed
to obtain an output. Two important subclasses of operational languages are rule-based for-
malisms [2,13] and stream runtime verification (SRV) languages [6,8,11,20]. Both formu-
late the transformations as recursive equations. In contrast, declarative languages, such as
first-order temporal logics [4,15], describe the output by composing high-level operators.
Operational and declarative languages have complementary advantages: declarative
languages let specification authors focus on the “what” and not the “how”, whereas
operational languages offer the authors more control over the evaluation. Most runtime
verification tools do not support mixing the paradigms, especially when it comes to
parametric, i.e., first-order, specification languages. A notable exception is the recent
addition of recursive rules to past-time first-order temporal logic (PFLTL), implemented
in the DejaVu monitoring tool [14]. As another important benefit, recursive rules can
express operations like transitive closure that are not expressible in first-order logics.
In this paper, we introduce recursion in metric first-order temporal logic (MFOTL) [4]
in the form of a recursive let construct. We develop and implement an evaluation al-
gorithm for MFOTL with recursion in VeriMon [3,21], an MFOTL monitor whose
correctness has been formally verified in the Isabelle proof assistant. To this end, we
extend the formal correctness proof to cover the recursive let construct.
Unlike PFLTL, MFOTL supports bounded future temporal operators and aggrega-
tions (Section 2). The interaction of recursion with bounded future operators is subtle.
To avoid non-termination, DejaVu requires all recursive occurrences to be guarded by
a previous operator. We similarly require the recursive occurrences to be guarded in our
monitor, but we relax the requirement on the guard to other past-time operators which
ensure that their subformulas are evaluated strictly in the past. Moreover, we allow future
operators in the recursive let construct, as long as no recursion takes place in the future op-
erator’s arguments. These restrictions ensure that the fixpoint given by the recursive let op-
erator is well-defined. At the same time, they are permissive and allow us to formulate in-
teresting examples, several of which are beyond what PFLTL with recursion can express.
Consider a specification that aims to secure hosts in a network that communicate with
each other and with the outside world. A host is tainted by an address range iff there is a
chain of communication from the address to the host and all hosts on the chain trigger an
intrusion detection alert within one hour after communicating with the previous host. This
specification can be expressed directly using our recursive let construct (to model chains
of communication) and future temporal operators (to specify “within one hour after”).
We start by extending MFOTL with a non-recursive let operator (Section 3). This spe-
cial case is mainly of pedagogical value: aspects common to both let operators are easier
to explain on the simpler non-recursive variant. Yet, this construct is useful in practice
to structure complex formulas and improve monitoring performance by sharing common
subformulas. Thus we extend VeriMon’s algorithms and proofs with the non-recursive let.
We then introduce the recursive let operator (Section 4.1), exemplify its semantics
with several specifications (Section 4.2), and develop the monitoring algorithm and sketch
its correctness (Section 4.3). VeriMon’s repository [24] contains complete formal proofs.
This work is part of the long-term effort to develop a trustworthy monitor that
surpasses in expressiveness and efficiency other non-verified tools. In this work, our focus
is on expressiveness (and trustworthiness). Nonetheless, we evaluate our algorithmic
additions to VeriMon on a micro-benchmark and observe that even without further
optimizations it exhibits an incomparable performance to DejaVu (Section 5). Moreover,
we detected a problem in DejaVu’s handling of variable names in recursive subformulas.
In summary, our main contribution is the extension of MFOTL with a recursive let
operator and the design of an evaluation algorithm for it. Along the way, we introduce a
non-recursive let operator, which proved essential when writing complex specifications.
Our contributions are implemented as part of VeriMon and proved correct using Isabelle.
Related Work. Our work adds rule-based specification features [13] to a first-order spec-
ification language [16]. Above we describe our contribution’s relationship to DejaVu and
VeriMon, two monitors for first-order temporal specifications. VeriMon’s algorithm [21],
which we extend, is based on the algorithm used in the MonPoly monitor [5], although Ve-
riMon has optimizations that are not present in MonPoly and vice versa [3]. VeriMon sup-
ports a more expressive specification language than MonPoly, and our introduction of the
recursive let has increased the gap between the two. VeriMon’s and MonPoly’s algorithms
work with finite relations. These tools are thus restricted to MFOTL’s monitorable frag-
ment [4], which ensures that all subformulas evaluate to finite results. In contrast, DejaVu
finitely represents infinite relations using BDDs and thus supports the full PFLTL (but
only closed formulas). Both DejaVu and our work restrict the recursive let syntactically.
datatype data = Int int | Flt double | Str string
type_synonym db = string ⇀ data list set
datatype trm = V nat | C data | trm + trm | …
type_synonym ts = nat
typedef trace = {s :: (db × ts) stream. trace s}
typedef I = {(a :: nat, b :: enat). a ≤ b}
datatype frm = string(trm list) | trm ◦ trm | ¬frm | ∃frm | frm ∨ frm | frm ∧ frm
  | ●_I frm | ○_I frm | frm S_I frm | frm U_I frm | nat ← agg_op (trm; nat) frm

fun etrm :: data list ⇒ trm ⇒ data where
  etrm v (V x) = v!x | etrm v (C x) = x | etrm v (t1 + t2) = etrm v t1 + etrm v t2 | …

fun sat :: trace ⇒ data list ⇒ nat ⇒ frm ⇒ bool where
  sat σ v i (p(as)) = (map (etrm v) as ∈ Γσi p)
| sat σ v i (t1 ◦ t2) = (etrm v t1 ◦ etrm v t2)
| sat σ v i (¬ϕ) = (¬ sat σ v i ϕ)
| sat σ v i (∃ϕ) = (∃z. sat σ (z#v) i ϕ)
| sat σ v i (α ∨ β) = (sat σ v i α ∨ sat σ v i β)
| sat σ v i (α ∧ β) = (sat σ v i α ∧ sat σ v i β)
| sat σ v i (●_I ϕ) = (case i of 0 ⇒ False | j+1 ⇒ Tσi − Tσj ∈_I I ∧ sat σ v j ϕ)
| sat σ v i (○_I ϕ) = (Tσ(i+1) − Tσi ∈_I I ∧ sat σ v (i+1) ϕ)
| sat σ v i (α S_I β) = (∃j ≤ i. Tσi − Tσj ∈_I I ∧ sat σ v j β ∧ (∀k ∈ {j<..i}. sat σ v k α))
| sat σ v i (α U_I β) = (∃j ≥ i. Tσj − Tσi ∈_I I ∧ sat σ v j β ∧ (∀k ∈ {i..<j}. sat σ v k α))
| sat σ v i (y ←Ω (t; b) ϕ) = (let M = {(x, card Z) | x Z. Z = {z. length z = b ∧ sat σ (z@v) i ϕ ∧ etrm (z@v) t = x} ∧ Z ≠ {}}
    in (M = {} ⟶ fv ϕ ⊆ {0..<b}) ∧ v!y = eval_agg_op Ω M)

Fig. 1. Formal syntax and semantics of MFOTL with aggregations, where ◦ ∈ {=, <, ≤}
Other rule-based [2,13] and SRV-based monitors [6,8,11,20] can express the temporal
operators present in LTL, but struggle with extensions that introduce parameters. Even
for the operators they can express, specialized algorithms that are carefully tuned for the
operators tend to exhibit better performance. Instead of encoding temporal operators,
we take the opposite approach and enrich a monitor that uses specialized algorithms for
temporal operators with general-purpose recursion.
Datalog [1] adds recursion to first-order logic, similarly to our addition of recursion to
temporal logic. However, Datalog has no built-in notion of time and hence other measures
must be taken to ensure that the fixpoints are well-defined, e.g., by restricting negation.
Restricting the recursive occurrences to be strictly in the past is a natural and expressive
alternative for monitoring, as we do not restrict negation beyond of what the monitorable
fragment requires. Works on Datalog extensions with metric temporal operators [7,19,22]
mostly study the decidability and complexity of computational problems related to these
extensions, whereas we design, implement, and formally verify an executable algorithm.
2 Metric First-Order Temporal Logic
MFOTL extends linear temporal logic with first-order quantification, past-time operators,
and interval bounds on the temporal operators [4]. The VeriMon monitor [3] supports
a fragment of this logic. It also adds new features, specifically regular matching oper-
ators as in linear dynamic logic [9], which results in metric first-order dynamic logic
(MFODL), as well as aggregations. Our extension of VeriMon with recursive rules retains
the additional features of MFODL. However, the additional features are orthogonal to our
extension and hence we base our presentation in this paper on MFOTL with aggregations.
We summarize MFOTL’s syntax and semantics, as well as the monitorable fragment.
The presentation generally follows the Isabelle formalization; however, we sometimes
deviate from Isabelle's concrete syntax for simplicity. We begin by defining some auxiliary types (top of Fig. 1). The logic's universe (type data) is fixed and infinite: it is a disjoint sum of integers, 64-bit IEEE floats, and strings of 8-bit characters. Databases (type db) encode first-order structures as functions from predicate names to relations over data. Relations are represented as sets of lists. A trace is a stream (an infinite sequence) of time-stamped databases. Time-stamps (type ts) are modeled as natural numbers (type nat). We write Γσi for the i-th database in σ, and Tσi for its time-stamp. The predicate trace enforces monotone and eventually increasing time-stamps, i.e., ∀i ≤ j. Tσi ≤ Tσj and ∀x. ∃i. x < Tσi. Non-empty intervals (type I) are represented by their end-points. We write [a,b] for the unique interval satisfying n ∈_I [a,b] iff a ≤ n ≤ b, where n ∈_I I denotes that I contains the natural number n. The interval is unbounded from above if b = ∞, which the type enat adds to the natural numbers.
Terms (type trm) are constructed recursively from variables (represented by De Bruijn indices), constants, and arithmetic operators. We use named variables in examples and omit the V and C constructors. There are two kinds of atomic formulas (type frm): flexible predicates of the form p(as), where as is a list of terms, and rigid predicates t1 ◦ t2 for ◦ ∈ {=, <, ≤}, which have a fixed interpretation. Formally, the existential quantifier ∃ does not carry a variable name because of the De Bruijn encoding. We use fv α to denote the set of De Bruijn indices of α's free variables.
The semantics is given by the functions etrm and sat (Fig. 1). Both depend on a valuation, which is a data list assigning a value to each variable. The satisfaction function sat for formulas additionally depends on a trace σ and a time-point i, which is an index into the trace. Indexing into lists is denoted by v!x, the operation z#v prepends the value z to the list v, and @ concatenates two lists. The notation {x..<y} and {x<..y} is shorthand for the sets {x, x+1, …, y−1} and {x+1, x+2, …, y} of natural numbers, respectively.
An aggregation formula y ←Ω (t; b) ϕ binds b variables in the subformula ϕ; the remaining free variables of ϕ are used for grouping. Each group is assigned an aggregate value y, which is computed by first evaluating the term t on each valuation that matches the group and that satisfies ϕ, and then aggregating the results using the operator Ω (e.g., MIN for minimum). To this end, eval_agg_op Ω M (not shown) applies Ω to a set M of value–multiplicity pairs [3]; card Z is the cardinality of Z, or ∞ if Z is infinite. The conjunct M = {} ⟶ fv ϕ ⊆ {0..<b} ensures that the formula is satisfied by the aggregate value of an empty M only if there are no grouping variables. Otherwise, infinitely many groups would be labeled with that value, rendering such aggregations non-monitorable.
The decidable predicate mon :: frm ⇒ bool specifies the monitorable fragment. We omit its formal definition and refer to the earlier descriptions of VeriMon [3,21] for details. Intuitively, mon places restrictions on the formula's structure to ensure that all subformulas have finitely many satisfying valuations. Also, the interval I of every U_I operator must be bounded. A monitor for a monitorable formula can thus compute a finite set of satisfying valuations for every time-point after observing a sufficiently long trace prefix.
3 Non-Recursive Let Operator
We first introduce a non-recursive let operator Let string := frm in frm to the frm datatype. The formula Let p := α in β associates the formula α with the predicate named p, which may be used in the formula β. We call such a predicate let-bound. The operator is
non-recursive: p has the same meaning within α as in the surrounding context (unless it is bound by a nested let in α). Although the non-recursive let operator does not enhance MFOTL's expressiveness, it improves readability (by using descriptive let-bound predicate names), as well as modularity and evaluation efficiency (by sharing subformulas).
Intuitively, the meaning of Let p := α in β is the same as that of β after replacing all its predicates of the form p(as) with the formula α, whose free variables have been replaced with the terms as in a capture-avoiding way. The formal syntax does not specify explicitly how α's free variables map to p's arguments. The mapping is induced by the De Bruijn indices: the variable with index 0 becomes the first argument, and so forth. We list the arguments explicitly in examples that use named variables. For instance, the formula Let p(x) := p(x) ∨ ∃y. q(x,y) in ●_[0,2] p(y) should be equivalent to ●_[0,2] (p(y) ∨ ∃z. q(y,z)). We achieve this by defining Let's semantics as follows.

    sat σ v i (Let p := α in β) = sat (σ[p ↦ λj. satrel σ j α]) v i β
satrel σjα
as an abbreviation for
{v.sat σv j αlength v=nfv α}
, i.e.,
the relation containing the valuations that satisfy
α
. The function
nfv α
returns the
minimum length of
v
needed to cover all of
α
’s free variables, i.e.,
0
if
α
is closed and
Max (fv α) + 1
otherwise. The trace
σ[pVR]
is the same as the trace
σ
except that for
every time-point
i
, the database at
i
maps the predicate name
p
to
R i
, where
R
has type
nat data list set
and is called a temporal relation. Note that the subformula
α
is not
necessarily evaluated at time-point
i
. Instead, the choice of the time-point is deferred
until the predicate
p
is used within
β
, which we achieve by updating the entire trace.
This supports the intuition behind unfolding the let operator
Let p:=αin β
described
above, especially as subformulas p(as)may occur under temporal operators in β.
Implementation. To evaluate an MFOTL formula on a trace, VeriMon computes a finite set of satisfying valuations (represented by the type table) recursively for each subformula. It applies standard table operations such as the natural join (⋈) and union. Tables are sets of tuples, which are lists of optional data values (with missing values denoted by ⊥) and thus refine valuations. This representation allows us to use lists of the same length for subformulas with different free variables. As with valuations, the variables' De Bruijn indices are used to look up their values in a tuple.
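To illustrate the join on such tuples, here is a naive C++ sketch (ours; VeriMon's verified implementation is generated from Isabelle and is considerably more refined): two tuples join iff they agree on every position where both are defined, and the result keeps the defined value from either side.

  #include <optional>
  #include <set>
  #include <vector>

  using Tuple = std::vector<std::optional<int>>;  // data simplified to int; nullopt = missing
  using Table = std::set<Tuple>;

  Table join(const Table& a, const Table& b) {
    Table out;
    for (const Tuple& u : a)
      for (const Tuple& v : b) {
        if (u.size() != v.size()) continue;       // tables share one tuple length
        Tuple w(u.size());
        bool ok = true;
        for (std::size_t x = 0; ok && x < u.size(); x++) {
          if (u[x] && v[x] && *u[x] != *v[x]) ok = false;  // clash on variable x
          else w[x] = u[x] ? u[x] : v[x];
        }
        if (ok) out.insert(w);
      }
    return out;
  }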
VeriMon processes an unbounded trace incrementally. Its interface consists of two functions, init :: frm ⇒ state and step :: dbs × ts list ⇒ state ⇒ (nat × table) list × state. The function init initializes the monitor's state (type state), and step updates it with a batch of new time-stamped databases to produce a list of new satisfactions. Instead of db list, step uses the type dbs = (string ⇀ table list) (a partial mapping from string to table list) to efficiently retrieve all relations (encoded as tables) associated with a predicate name at once. Besides some auxiliary data, state stores an inductive state of type sfrm that mirrors the inductive representation of formulas, augmented with data structures for evaluating temporal operators and buffering intermediate results. Internally, step (dbs, tss) st calls eval j n tss dbs sϕ, where j is the combined length of the trace prefix including the new batch, n = nfv ϕ for the monitored formula ϕ, and sϕ is the inductive state, all stored in st. The function eval returns a list of tables with new satisfactions, as well as the updated inductive state. Satisfactions are reported for every time-point in order. They may be delayed if the formula contains future operators.
To evaluate Let p := α in β, we use the tables with α's satisfactions to evaluate p within β, which requires that the tuples in these tables have no missing values. Therefore, we require that let operators satisfy mon (Let p := α in β) = ({0..<nfv α} ⊆ fv α ∧ mon α ∧ mon β). Specifically, the (indices of) α's free variables must not have gaps. We add the constructor SLet p m sα sβ to the inductive state, which stores p, the number m = nfv α of free variables in α, and the states for the subformulas α and β. It is initialized by initializing sα and sβ recursively. The function eval evaluates it as follows.

    eval j n tss dbs (SLet p m sα sβ) =
      let (xs, s′α) = eval j m tss dbs sα; (ys, s′β) = eval j n tss (dbs[p ↦ xs]) sβ
      in (ys, SLet p m s′α s′β)

We write dbs[p ↦ xs] for the partial mapping dbs updated at p with xs. The recursive call of eval on sα may return multiple tables in the list xs. Note that step generalizes the original VeriMon interface [3] as it consumes multiple time-stamped databases at once. The generalized interface of eval allows us to pass all tables at once to the recursive call for sβ.
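A C++ rendering of the SLet case of eval follows; it is our simplification (the types are placeholders, and the returned inductive state is updated in place through sa and sb rather than returned), not the verified Isabelle code.

  #include <map>
  #include <memory>
  #include <string>
  #include <vector>

  using Table = std::vector<std::vector<int>>;            // stand-in for VeriMon's table type
  using Dbs = std::map<std::string, std::vector<Table>>;  // string -> table list

  struct SFrm {                                           // one state node per subformula
    virtual std::vector<Table> eval(std::size_t j, std::size_t n,
                                    const std::vector<long>& tss, const Dbs& dbs) = 0;
    virtual ~SFrm() = default;
  };

  struct SLet : SFrm {
    std::string p;                 // the let-bound predicate name
    std::size_t m;                 // m = nfv(alpha)
    std::unique_ptr<SFrm> sa, sb;  // states for alpha and beta
    std::vector<Table> eval(std::size_t j, std::size_t n,
                            const std::vector<long>& tss, const Dbs& dbs) override {
      std::vector<Table> xs = sa->eval(j, m, tss, dbs);   // alpha's new satisfactions
      Dbs dbs2 = dbs;
      dbs2[p] = xs;                                       // dbs[p |-> xs]
      return sb->eval(j, n, tss, dbs2);                   // beta sees p bound to xs
    }
  };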
Correctness. We relate the outputs of step and sat to prove our monitor correct. As mentioned earlier, the monitor may delay its output. We precisely characterize its progress for a given formula and trace prefix. Intuitively, the progress is the number of time-points that the monitor is able to evaluate given a trace prefix. Progress is a useful tool in the correctness proof as it helps us describe the output at every time-point. Moreover, we show below that progress can be made arbitrarily large, which is important for completeness.
Formally, prog σ P ϕ j is ϕ's progress i_ϕ after reading the first j databases of trace σ. We added the partial mapping P that assigns to every let-bound predicate its own progress, i.e., the progress of the formula defining the predicate. For example, the progress of a predicate p that is not let-bound is j. Otherwise, it is equal to the progress of the formula it is bound to (stored in P p). The progress of α U_[a,b] β is the smallest i such that τ σ i ≥ τ σ (Min {i_α, i_β, j−1}) − b. The progress of both α ∧ β and α ∨ β is Min {i_α, i_β}.
The invariant invar σ j P n sϕ ϕ relates an inductive state sϕ to the formula ϕ. The inductive state must reflect the monitor's state after processing the first j databases in the trace σ, assuming that P specifies the let-bound predicates' progress. The parameter n is the length of the tuples stored within sϕ. The invariant is defined inductively over sϕ; we reuse VeriMon's definition for the MFOTL operators and add a case for Let:

  invar σ j P m sα α
  invar (σ[p ↦ λi. satrel σ i α]) j (P[p ↦ prog σ P α j]) n sβ β
  m = nfv α      {0..<m} ⊆ fv α
  ──────────────────────────────────────────────────
  invar σ j P n (SLet p m sα sβ) (Let p := α in β)

The first two premises restrict the subformula states sα and sβ, where sβ reflects the evaluation of β on the modified trace, and p's progress is that of α. The premise m = nfv α enforces that m is equal to p's arity, and {0..<m} ⊆ fv α is the constraint from mon.
Our extensions preserve the monitor's correctness: we formally proved the theorem below, which characterizes the monitor's eval function. The theorem is stated here for the empty progress mapping ∅, which must be generalized in the proof (as P changes in the above rule). Let δ be a natural number and ϕ be a monitorable formula with n = nfv ϕ. The function the maps the optional value ⟨x⟩ to x and ⊥ to some unspecified value.
Theorem 1. (a) invar σ 0 ∅ n s⁰ϕ ϕ holds for the initial state s⁰ϕ. (b) Suppose that sϕ satisfies invar σ j ∅ n sϕ ϕ and that dbs contains all relations from σ for the indices in the list js = [j..<j+δ]. Then (xs, s′ϕ) = eval (j+δ) n (map (τ σ) js) dbs sϕ satisfies invar σ (j+δ) ∅ n s′ϕ ϕ, and the i-th table in the list xs, for prog σ ∅ ϕ j ≤ i < prog σ ∅ ϕ (j+δ), contains (only) all tuples v of length n satisfying sat σ (map the v) i ϕ.
Soundness follows immediately from Thm. 1, whereas completeness additionally requires the aforementioned property that any progress can be reached by making the trace prefix long enough, which we also proved for our modified progress function:

Theorem 2. If mon ϕ, then for all i there exists a j such that prog σ ∅ ϕ j ≥ i.
4 Past-Recursive Let Operator
It is well-known that first-order logic (FOL) cannot express certain queries, notably the transitive closure of a binary relation. This remains true when FOL is restricted to finite structures [18]. Although MFOTL is rather different from ordinary FOL, we conjecture that it cannot express transitive closure either. This hampers its ability to model hierarchies of unbounded depth. Moreover, recursive patterns are sometimes the most natural way to express certain specifications. We describe an extension of MFOTL that can encode a "temporally directed" form of transitive closure and other recursive patterns.
Specifically, we introduce another let operator in which the predicate may refer to itself recursively. The intended semantics is that of a fixpoint, i.e., the predicate p defined by a formula α should be interpreted by a temporal relation that is equal to the evaluation of α under that interpretation of p. The fixpoint might not always exist or it might not be unique. Therefore, different fixpoint operators have been studied in the context of nontemporal logics and query languages [1]. For instance, it is common to require that all recursive occurrences of p in its defining formula are positive, i.e., under an even number of negations. This ensures monotonicity and hence the existence of a least fixpoint.
MFOTL's future operators are interpreted over infinite traces. This poses a new challenge for monitoring recursively defined predicates, even if we restrict our attention to positive formulas. Consider the recursive definition of p by q ∨ ○[0,∞] p, where q is a predicate from the trace. Although q ∨ ○[0,∞] p is monitorable (at most one additional time-point must be known to evaluate it), the recursive definition of p is equivalent to ◇[0,∞] q under the least fixpoint semantics. However, ◇[0,∞] q is not monitorable, as one might need the entire, infinite trace to evaluate it. Therefore, we focus on a fragment where every recursive occurrence of p must be strictly in the past. This guarantees a unique fixpoint even if the defining formula is not monotone, so the predicate may occur negatively as well.
The syntax of our past-recursive let operator is similar to that of Let: we add the constructor LetPast string := frm in frm to the frm datatype. However, the semantics is different (Section 4.1). The restriction to strictly past recursion is enforced by a syntactic monitorability condition that is checked by mon. Consider the formula LetPast p := α in β. Intuitively, every recursive occurrence of p in α must be guarded by at least one strictly past operator, and there must be no future operator on the path from the occurrence to α's root. We do allow future operators in the other parts of α, though. We give examples of LetPast in Section 4.2. The evaluation of LetPast requires an extension of VeriMon's algorithm (Section 4.3), which we also formally prove correct.
datatype recSafety = U | P | NF | A

fun (∘) :: recSafety ⇒ recSafety ⇒ recSafety where
  U ∘ _ = U
| _ ∘ U = U
| A ∘ _ = A
| _ ∘ A = A
| P ∘ _ = P
| _ ∘ P = P
| NF ∘ NF = NF

fun slp :: string ⇒ frm ⇒ recSafety where
  slp p (q(as)) = (if p = q then NF else U)
| slp p (Let q := α in β) =
    (slp q β ∘ slp p α) ⊔ (if p = q then U else slp p β)
| slp p (LetPast q := α in β) =
    (if p = q then U else (slp q β ∘ slp p α) ⊔ slp p β)
| slp p (t1 ≈ t2) = U | slp p (y ← ω(t; b) ϕ) = slp p ϕ
| slp p (¬ϕ) = slp p ϕ | slp p (∃ϕ) = slp p ϕ
| slp p (α ∧ β) = slp p α ⊔ slp p β
| slp p (α ∨ β) = slp p α ⊔ slp p β
| slp p (●I ϕ) = P ∘ slp p ϕ | slp p (○I ϕ) = A ∘ slp p ϕ
| slp p (α SI β) = slp p α ⊔ ((if 0 ∈ I then NF else P) ∘ slp p β)
| slp p (α UI β) = A ∘ (slp p α ⊔ slp p β)

Fig. 2. Auxiliary definitions for the syntactic restriction on LetPast
4.1 Semantics
The semantics of the past-recursive let operator is defined by the equation

sat σ v i (LetPast p := α in β) = sat (σ[p ↦ recp (λR j. satrel (σ[p ↦ R]) j α)]) v i β

We evaluate β at the same time-point i as the recursive let operator using an appropriately updated trace. The temporal relation assigned to p is computed by the combinator recp:

fun recp :: ((nat ⇒ data list set) ⇒ nat ⇒ data list set) ⇒ nat ⇒ data list set where
  recp f i = f (λj. if j < i then recp f j else {}) i
The argument f is a function that transforms temporal relations, and recp f returns again a temporal relation. Intuitively, recp f evaluates to the fixpoint f (recp f), except that f R i can only access time-points of R before i. For all other time-points j ≥ i, the relation R j is empty. The combinator recp is well-defined because i is a natural number; the recursive call recp f j affects the result only if j < i, and hence we can prove termination using i as a variant. For the semantics of LetPast, we choose f R i = satrel (σ[p ↦ R]) i α, i.e., the satisfactions of α with p mapped to f's argument R, to which recp supplies the result of the recursive evaluation (up to but excluding i).
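The combinator translates almost verbatim into Python; the sketch below (ours, without the memoization a real implementation would use) makes the termination argument tangible: the recursive relation is queried only at strictly smaller time-points, so the recursion bottoms out at 0.

def recp(f, i):
    def earlier(j):                       # R, truncated at i
        return recp(f, j) if j < i else set()
    return f(earlier, i)

# Example: f encodes p(i) = q(i) union p(i-1), i.e., "once q".
q = {0: {(1,)}, 1: {(2,)}, 2: set()}
f = lambda R, i: q.get(i, set()) | (R(i - 1) if i > 0 else set())
print([recp(f, i) for i in range(3)])
# [{(1,)}, {(1,), (2,)}, {(1,), (2,)}]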
Our definition of sat is total: it gives meaning to every formula. This includes formulas LetPast p := α in β where p occurs in α without a past guard or under a future operator. However, the semantics behaves unexpectedly in such cases. For example, LetPast p := (q ∨ ○[0,∞] p) in p is equivalent to q. Our monitor therefore requires properly guarded formulas. Not only does this avoid confusion about the semantics, it also simplifies the implementation because the monitor need not eliminate unguarded occurrences.
Next, we describe the formalization of the syntactic restriction. The idea is to determine for every predicate whether it is used strictly in the past by analyzing the formula recursively. The datatype recSafety (Fig. 2) represents the possible outcomes. U(nused) means that a predicate does not occur in the formula. P(ast) means that it is evaluated at strictly earlier time-points, whereas NF (Non-Future) additionally allows the current time-point. A(ny) covers all remaining cases. The linear order < on recSafety is induced by U < P < NF < A. Its reflexive closure ≤ corresponds to implication. For example, if the predicate p is unused (U), it is clearly evaluated at earlier time-points only (P). The least upper bound x ⊔ y with respect to ≤ corresponds to logical disjunction.
The function slp p ϕ (Fig. 2) analyzes the past-guardedness of a predicate p in a formula ϕ. It uses a composition operator y ∘ x on recSafety. The patterns in the definition of ∘ should be matched sequentially from top to bottom; e.g., A ∘ U is equal to U. Intuitively, y ∘ x describes the guardedness of a predicate that is x-used in some subformula, which is then y-used. For example, slp p (●I ϕ) = P ∘ slp p ϕ because ϕ and all occurrences of p therein are evaluated at time-points that are strictly in the past relative to ●I ϕ. Note that we make a case distinction for α SI β: if the interval I excludes zero, β is always evaluated strictly in the past. Future operators always result in A if p is used in an operand.
Finally, we define the mon predicate for the recursive let operator:

mon (LetPast p := α in β) = (slp p α ≤ P ∧ {0..<nfv α} ⊆ fv α ∧ mon α ∧ mon β)

The only difference to Let is the restriction of p's occurrences in α via slp, which is generally an over-approximation. For example, slp p (●I ●I ○I p) = A even though p is evaluated at strictly earlier time-points. Therefore, some instances of LetPast that our algorithm could evaluate correctly are not considered to satisfy mon. We plan to replace recSafety with a more precise lattice in future work.
4.2 Examples
Temporal Operators. We first show that the non-metric S operator can be reduced to LetPast and ●. (We omit the interval subscripts if the interval is [0,∞].) Using the special ts(t) predicate, which is true iff t is the current time-stamp, we can also express the metric version. This example serves to gently illustrate the semantics of LetPast. In general, formulas are more readable if they are directly expressed in terms of S, and monitoring can be more efficient. Below we give further examples in which LetPast adds expressiveness.
Let α and β be two monitorable MFOTL formulas with free variables fv α and fv β, respectively. The formula α S β is monitorable only if fv α ⊆ fv β, so let us assume that, too. The following unfolding of S's semantics is well-known:
sat σ v i (α S β) ⟷ sat σ v i β ∨ (sat σ v i α ∧ i > 0 ∧ sat σ v (i−1) (α S β))    (1)
As the unfolding recursively evaluates the formula at the previous time-point, we can directly translate it into a recursive let operator: ϕS ≡ LetPast s(x) := ψ in s(x), where ψ ≡ β ∨ (α ∧ ● s(x)). The predicate name s must be fresh, i.e., it must not occur in α nor β. The variable list x enumerates fv β. The formula ϕS is monitorable because s(x) is clearly past-guarded, and hence slp s ψ = P. (We also need fv β = {0..<nfv β}, which can be achieved by renaming variables in α and β.) Let us analyze the semantics of ϕS:
sat σ v i ϕS ⟷ sat (σ[s ↦ recp fψ]) v i (s(x))      (abbreviating λR j. satrel (σ[s ↦ R]) j ψ as fψ)
  ⟷ v ∈ recp fψ i
  ⟷ sat (σ[s ↦ λj. if j < i then recp fψ j else {}]) v i ψ
  ⟷(∗) sat σ v i β ∨ (sat σ v i α ∧ i > 0 ∧ v ∈ (if i−1 < i then recp fψ (i−1) else {}))
  ⟷ sat σ v i β ∨ (sat σ v i α ∧ i > 0 ∧ sat σ v (i−1) ϕS)
These equations hold for all valuations v of length nfv β and if the variables x are ordered by their De Bruijn indices. Step (∗) exploits the freshness of s with respect to α and β, which allows us to replace σ[s ↦ ...] by σ. The equations result in the same unfolding as (1). Hence, we can prove the semantic equivalence of ϕS and α S β by induction on i.
The following SinceLet formula encodes α S[a,b] β. Other encodings exist, however.

LetPast s(x, t) := (β ∧ ts(t)) ∨ (α ∧ ● s(x, t)) in ∃t, u. s(x, t) ∧ ts(u) ∧ a ≤ u − t ∧ u − t ≤ b

Here, t and u are fresh variables, where t records the time-stamp of the past satisfaction of β, whereas u is the time-stamp at which we evaluate SinceLet. The subformula a ≤ u − t ∧ u − t ≤ b corresponds to τ σ i − τ σ j ∈ [a, b], which is part of S[a,b]'s semantics (Fig. 1).
Temporally-Directed Transitive Closure. We proceed by showing that LetPast can compute a temporally-directed transitive closure over events observed at a sequence of distinct time-points. Hence, we assume that the trace contains a single event at every time-point. The closure is directed in the sense that the transitive chains can only be extended by newer events. We consider the following two types of events from [14]: r(y, x, d) denotes that process y reports some data d to another process x, and s(x, y) denotes that process x spawns process y. The Spawn formula

LetPast p(u, v) := s(u, v) ∨ (● p(u, v)) ∨ (∃t. (● p(u, t)) ∧ s(t, v)) in r(y, x, d) ∧ ¬p(x, y)

encodes violations of the property that whenever process y sends some data d to a process x, denoted as r(y, x, d), then there was a chain of process spawns s(x, x1), s(x1, x2), ..., s(xk, y), occurring in this order in the trace. In other words, a process may only send data to its "ancestors". To check this property, a monitor needs to compute the (temporally-directed) transitive closure p(u, v) of the relation s. The definition of the closure has two recursive predicate instances with different arguments. The Spawn formula is inspired by a similar one used to evaluate the DejaVu monitor [14]. Unlike DejaVu, we do not require the formula to be closed and thus leave the variables x, y, and d free.
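To see the computation this specification induces, the following Python sketch (ours, not the monitor's actual algorithm) replays the temporally-directed closure on an explicit event list: chains are only ever extended by the newest spawn event, exactly as in the Spawn formula, and a report is a violation iff no chain connects the receiver to the sender.

def spawn_violations(trace):
    # trace: list of ("s", x, y) spawn events and ("r", y, x, d) reports,
    # one event per time-point
    p = set()                                    # closure of s so far
    violations = []
    for ev in trace:
        if ev[0] == "s":
            _, u, v = ev
            # extend existing chains only at their newer end
            p |= {(u, v)} | {(a, v) for (a, b) in p if b == u}
        else:
            _, y, x, d = ev
            if (x, y) not in p:                  # no spawn chain from x to y
                violations.append(ev)
    return violations

trace = [("s", 1, 2), ("s", 2, 3), ("r", 3, 1, "d"), ("r", 1, 3, "d")]
print(spawn_violations(trace))                   # [('r', 1, 3, 'd')]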
The Trans formula

LetPast p(u, v) := s(u, v) ∨ (● p(u, v)) ∨ (∃t. (● p(u, t)) ∧ s(t, v))
    ∨ (∃t. s(u, t) ∧ (● p(t, v)))
    ∨ (∃t, t′. (● p(u, t)) ∧ s(t, t′) ∧ (● p(t′, v))) in r(y, x, d) ∧ ¬p(x, y)

encodes violations of the same property as Spawn even if s(x, x1), s(x1, x2), ..., s(xk, y) are received by the monitor out of order, i.e., they do not occur in this order in the trace.
We can interpret the events s(x, y) as edges in a directed graph and the predicate p(x, y) in Trans as computing the reachability of vertices in the directed graph. We also extend the directed edges s(x, y) with a weight w to s+(x, y, w). Then the Trans+ formula

LetPast p(u, v, w) := s+(u, v, w) ∨ (● p(u, v, w))
    ∨ (∃t, w1, w2. (● p(u, t, w1)) ∧ s+(t, v, w2) ∧ w = w1 + w2)
    ∨ (∃t, w1, w2. s+(u, t, w1) ∧ (● p(t, v, w2)) ∧ w = w1 + w2)
    ∨ (∃t, t′, w1, w2, w3. (● p(u, t, w1)) ∧ s+(t, t′, w2) ∧ (● p(t′, v, w3))
        ∧ w = w1 + w2 + w3) in
Let m(u, v, w) := w ← MIN(w; u, v). p(u, v, w) in m(x, y, w) ∧ ¬(● m(x, y, w))

yields all pairs of vertices x, y and the length w of the shortest path from x to y whenever y becomes reachable from x or the length of the shortest path changes. The relation
s+(x, y, w) can itself be obtained by evaluating a more complex temporal formula, e.g., s+(x, y, w) ≡ e(x, y, w) ∧ ¬◇[0,10] d(x, y), with the following two types of events: e(x, y, w) denotes an edge from x to y with weight w; d(x, y) denotes deletion of the edge from x to y. The eventually operator ◇I ϕ abbreviates (∃x. x = x) UI ϕ. Such a relation s+(x, y, w) contains all edges that are not revoked within 10 time units after receiving e(x, y, w). We could use the non-recursive let operator Let s+(x, y, w) := e(x, y, w) ∧ ¬◇[0,10] d(x, y) to precompute the relation and use it when evaluating the recursive let operator in Trans+.
As another application of future operators under LetPast, recall our introductory example. Suppose that hosts in a network communicate with each other and with the outside world: comm(src, dest) indicates that host src sends a message to host dest; in(r, h) and out(h, r) indicate that the host h receives or sends traffic from or to an IP address in the range r, respectively. The hosts are equipped with an intrusion detection system (IDS), whose alerts are denoted by ids(h). We say that a host h is tainted by an address range r iff there is a chain of communication from r to h and all hosts on the chain (including h) trigger an IDS alert within one hour after communicating with the previous host. The formula

LetPast taint(r, h) := in(r, h) ∨ (∃h′. (● taint(r, h′)) ∧ comm(h′, h) ∧ ◇[0,1h] ids(h))
    ∨ (● taint(r, h)) in taint(r, h) ∧ out(h, r)

is true whenever a host communicates back to the IP range by which it was tainted.
Periodic Behavior. Suppose that we monitor a boolean signal b(x), parametrized by an integer parameter x, between the user's start(x) and stop(x) commands. An arbitrary amount of time may pass between these two commands. Our task is to detect periodic activations of b(x), with a fixed period t > 0 and error tolerance 0 ≤ ε < t. We shall ignore positive noise in b(x), i.e., additional activations besides the periodic ones.
Let us make the task more precise. An alarm must be raised at time-point i_n iff there exist time-points i_0 < i_1 < ··· < i_n such that start(x) holds at i_0, stop(x) holds at i_n, and b(x) holds at all i_k for 1 ≤ k ≤ n−1. Moreover, the difference of time-stamps for adjacent time-points i_k and i_{k+1}, where 1 ≤ k ≤ n−2, must be in the interval [t−ε, t+ε]; the differences for the pairs i_0, i_1 and i_{n−1}, i_n must each be at most t+ε.
Our first attempt PB to formalize the alarm condition without recursion is

stop(x) ∧ ◆I (start(x) ∨ b(x)) ∧ ((b(x) → (◆I start(x)) ∨ (◆J b(x))) S start(x))

where I = [0, t+ε], J = [t−ε, t+ε], and ◆K ϕ abbreviates (∃x. x = x) SK ϕ. This formula
follows an inductive approach: every b(x) between start(x) and stop(x) must be preceded by b(x) or start(x), with the appropriate time difference. However, PB does not ignore noise, as adding b(x) events to the trace may silence an alarm. For example, let t = 10, ε = 0, and σ be a trace starting with ({start(1)}, 0), ({b(1)}, 10), ({stop(1)}, 20). We write {p(1), p(2)} for the database where the predicate p holds for 1 and 2. On σ, PB is true at the third time-point. Inserting a database {b(1)} with time-stamp 15 falsifies PB at the now fourth time-point, although the trace still satisfies the natural language description.
The following PBLet formula expresses the intended condition using LetPast:

LetPast periodic(x) := start(x) ∨ b(x) ∧ ((◆I start(x)) ∨ (◆J periodic(x))) in
stop(x) ∧ ◆I periodic(x)

This example depends crucially on the flexible past guards we support: here, the recursion goes through ◆ with an interval constraint. Note that 0 ∉ J because we assumed ε < t.
As another example of periodic behavior, we analyze an integer-valued signal(y) between the (now non-parametric) commands start and stop. We aim to discover whether signal(y) is piecewise constant, with the constant segments being exactly t time units long. Moreover, the signal's values for subsequent segments must differ by at most δ. The next formula uses the general S operator as the recursion guard to capture this property.

LetPast segment(y) := ∃z. signal(y) ∧ ((● signal(z)) S[0,t] (signal(z) ∧ ● start)
    ∨ (● signal(z)) S[t,t] segment(z) ∧ −δ ≤ y − z ∧ y − z ≤ δ) in
stop ∧ ∃y. ((● signal(y)) S[0,t] segment(y))
Turing Machines. Every MFOTL formula can be viewed as a function on traces, where the function's output is the set of satisfying valuations, either at a fixed or at all time-points. VeriMon's monitorable fragment guarantees that one can compute the valuation at every time-point. Thus, monitorable formulas correspond to computable functions. If we give up on the requirement that the function's output must be available at a fixed time-point, the past-recursive let operator is expressive enough to simulate arbitrary Turing machines (TM). This is not a contradiction: we simulate a single TM step at every time-point, and there is an infinite supply of time-points. Running the monitor on a configuration that does not halt will never produce an output, i.e., a nonempty set of satisfying valuations.

Let M = ⟨Σ, b, Q, q0, qf, δ⟩ be a deterministic TM with tape alphabet Σ, blank symbol b ∈ Σ, control states Q, initial state q0 ∈ Q, final state qf ∈ Q, and transition function δ ∈ (Q × Σ → Q × Σ × {−1, 0, 1}). Whenever the machine is in state q1 and reads the symbol s1, it enters state q2, writes the symbol s2, and moves the head by m tape cells to the right, where δ(q1, s1) = ⟨q2, s2, m⟩. Without loss of generality, we assume that Σ and Q are finite subsets of the integers. We simulate M using the formula ϕM shown below.
LetPast cfg(q, i, s) :=
  Let cfg(q, i, s) := ● cfg(q, i, s) in
  Let head(q, s) := cfg(q, 0, s) ∨ ¬(∃x, z. cfg(x, 0, z)) ∧ (∃y, z. cfg(q, y, z)) ∧ s = b in
    input(i, s) ∧ q = q0
    ∨ ⋁_{δ(q1,s1)=⟨q2,s2,m⟩} (head(q1, s1) ∧ q = q2 ∧ ((i = −m ∧ s = s2)
        ∨ (∃j. cfg(q1, j, s) ∧ j ≠ 0 ∧ i = j − m)))
in cfg(qf, i, s)
The idea is that cfg represents the current configuration of the TM. Specifically, cfg(q, i, s) holds if the machine is in control state q and the tape contains the symbol s in the i-th cell to the right of the head (i may be negative). Note that we use nested, non-recursive let operators to abbreviate repeated subformulas. In the body of Let cfg(q, i, s) := ● cfg(q, i, s) in ..., the predicate cfg refers to the previous configuration. The predicate head provides the current state and the symbol under the head. Its definition extends the tape by a blank symbol if necessary. The simulation is started at time-point 0 by providing the tape's initial content in the predicate input, which must include the cell input(0, s0) with the symbol s0 under the head's initial position. If and only if M halts on this input, there exists a time-point i at which ϕM is satisfied by at least one valuation (i, s). Moreover, the satisfying valuations at i represent the final state of the tape.
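The encoding can be cross-checked against a direct simulation. The Python sketch below (ours; the toy transition function delta is an assumption, not from the paper) performs one TM step per iteration with the same head-relative tape addressing as cfg(q, i, s): cell 0 is under the head, the written symbol lands at coordinate −m after a move by m, and old cell j moves to j − m.

def simulate(delta, q0, qf, blank, tape):
    cfg = {i: s for i, s in enumerate(tape)}      # head-relative addressing
    q = q0
    while q != qf:                                # one step per "time-point"
        s = cfg.get(0, blank)                     # extend tape with blanks
        q, s2, m = delta[(q, s)]
        cfg[0] = s2                               # write under the head
        cfg = {j - m: c for j, c in cfg.items()}  # shift: i = j - m
    return q, cfg

delta = {("go", 1): ("go", 1, 1), ("go", 0): ("halt", 0, 0)}
print(simulate(delta, "go", "halt", 0, [1, 1, 0]))
# ('halt', {-2: 1, -1: 1, 0: 0})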
4.3 Algorithm
The restriction to past-guarded recursion allows for an efficient evaluation algorithm for LetPast formulas. It is efficient because no fixpoint iteration is required at individual time-points. To evaluate LetPast p := α in β, we first try to evaluate α for as many time-points as possible and then use the results to interpret p in β. This part is the same as for the non-recursive Let, but the evaluation of α itself differs. The syntactic monitorability condition guarantees that α at time-point i depends on the predicate p only for time-points strictly less than i. Specifically, we have defined mon (LetPast p := α in β) such that the progress of α's evaluation does not depend on p's progress beyond time-point i − 1. Therefore, we can evaluate α at time-point 0 without providing any table for p, then use the result to evaluate α at time-point 1, and so forth.
There are two cases that require care. First, if α contains future operators, multiple time-points may be evaluated at once. The above process must then be repeated within a single monitor step. Second, if α contains no future operators, α is evaluated at all time-points i < j, where j is the current trace prefix length. We could then attempt to evaluate α once more at time-point j using the table computed at j − 1 for p. However, this would not yield any further tables because all occurrences of p are below at least one past operator that tries to access the time-stamp at time-point j, which is not yet known. Therefore, this last evaluation attempt would needlessly traverse the formula state. We optimize this case and buffer α's result at time-point j − 1 until the next input database arrives.
It is crucial that the evaluation of a recursive let does not get stuck waiting for tables that it needs to produce itself. Therefore, all operators that are strictly past-guarding as defined by slp (Fig. 2) must be well-behaved: the evaluation algorithm must compute a result at time-point i < j even if the operands' results are available only for time-points i′ < i. In particular, SI without 0 in the interval is considered strictly past-guarding. We have modified VeriMon's evaluation algorithm for α SI β to achieve this behavior.
The inductive state SLetPast p m sα sβ i buf for a recursive let operator extends SLet with a counter i :: nat, which tracks the progress of p as observed by sα, and an optional buffer buf :: table option. The meaning of the other arguments is the same as for SLet. In the initial state, i is zero and buf is ⊥. Let the function list_opt map ⊥ to [] and ⟨x⟩ to [x], where ⟨x⟩ is the embedding of x into the option type. A single monitor step updates the state as follows (see Section 3 for a description of eval's interface):

eval j n tss dbs (SLetPast p m sα sβ i buf) =
  let (xs, s′α, i′, buf′) = evalLP j m tss dbs p [] sα i (list_opt buf);
      (ys, s′β) = eval j n tss (dbs[p ↦ xs]) sβ
  in (ys, SLetPast p m s′α s′β i′ buf′)
The heavy lifting is performed by evalLP, which is mutually recursive with eval. We forward relevant variables from eval. The accumulator xs :: table list collects sα's results.

evalLP j m tss dbs p xs sα i buf =
  let (xs′, s′α) = eval j m tss (dbs[p ↦ buf]) sα; i′ = i + length buf
  in case xs′ of
       [] ⇒ (xs, s′α, i′, ⊥)
     | x # _ ⇒ (if i′ + 1 ≥ j then (xs @ xs′, s′α, i′, ⟨x⟩)
                 else evalLP j m [] (clear_dbs dbs) p (xs @ xs′) s′α i′ xs′)
First, evalLP evaluates sα with dbs updated at p using the current buffer, which may be empty. Since i tracks p's progress, we obtain its new value i′ by increasing i by the length of buf. The evaluation results in a list xs′ of tables and a new state s′α. We continue to iterate evalLP only if two conditions are met: xs′ must be nonempty, as otherwise there is no new data to evaluate s′α on, and i′ + 1 must be less than the current input prefix length. The latter condition serves as an obvious termination criterion, although it is stricter than necessary. We could perform an additional iteration in the case that i′ + 1 = j. However, such an iteration would never produce new results because the past operators guarding p can only be evaluated further if there are new time-stamps. Therefore, we optimize this case by choosing the stricter condition. If we continue the iteration, we append xs′ to the accumulator xs. Moreover, we clear tss and dbs because all tables from the new input database have already been processed by the first call to eval. Specifically, the function clear_dbs dbs updates dbs at all points at which it is defined to an empty list.
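Abstracting from buffering and batched input, the feedback loop of evalLP can be sketched in Python as follows (ours; the evaluation of α is specialized to the recursive definition p(x) := q(x) ∨ ● p(x), so evaluating time-point i only needs p's table at i − 1):

def evalLP(q_tables, j):
    xs = []                                   # accumulated tables for p
    i = 0                                     # p's progress as seen by alpha
    while True:
        prev = xs[i - 1] if i > 0 else set()  # p's table at i - 1
        xs.append(q_tables[i] | prev)         # alpha at time-point i
        if i + 1 >= j:                        # the termination criterion
            return xs
        i += 1

print(evalLP([{(1,)}, {(2,)}], j=2))          # [{(1,)}, {(1,), (2,)}]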
We illustrate our algorithm with an example, tracing the computations of eval and evalLP. We evaluate LetPast p(x) := q(x) ∨ ● p(x) in p(x), which has the same semantics as ◆[0,∞] q(x), on a prefix with two time-points at time-stamps 0 and 3. We omit details about the subformulas' states, as well as brackets around singleton lists, i.e., [1] is displayed as 1. Let dbs0 = {q ↦ [{1}, {2}]} be the content of the trace prefix.

eval j:2 n:1 tss:[0,3] dbs:dbs0 sϕ:(SLetPast p 1 α0 β0 0 ⊥)
| evalLP j:2 m:1 tss:[0,3] dbs:dbs0 p:p xs:[] sα:α0 i:0 buf:[]
| | eval j:2 n:1 tss:[0,3] dbs:(dbs0[p ↦ []]) sϕ:α0 = ([{1}], α1)
| | evalLP j:2 m:1 tss:[] dbs:{q ↦ []} p:p xs:[{1}] sα:α1 i:0 buf:[{1}]
| | | eval j:2 n:1 tss:[] dbs:{p ↦ [{1}], q ↦ []} sϕ:α1 = ([{1,2}], α2)
| | | iteration stops because i′ = 1 and hence i′ + 1 = 2 ≥ j = 2
| | = ([{1}, {1,2}], α2, 1, ⟨{1,2}⟩)
| = ([{1}, {1,2}], α2, 1, ⟨{1,2}⟩)
| eval j:2 n:1 tss:[0,3] dbs:(dbs0[p ↦ [{1}, {1,2}]]) sϕ:β0 = ([{1}, {1,2}], β2)
= ([{1}, {1,2}], SLetPast p 1 α2 β2 1 ⟨{1,2}⟩)
Correctness. We extended the correctness proof of eval (Thm. 1) to cover the new state constructor SLetPast. The added case differs from the one for the non-recursive let in that evalLP is used to evaluate the first subformula. The proof also required additional invariants for the i and buf arguments of SLetPast, as well as a characterization of LetPast's progress. Recall that progress describes the number of time-points that the monitor is able to evaluate given a trace prefix of length j. We express the progress of the let-bound predicate p, which is defined in terms of α, as a least fixpoint:

progLP σ P p α j = ⨅ {i. i = prog σ (P[p ↦ i]) α j}
prog σ P (LetPast p := α in β) j = prog σ (P[p ↦ progLP σ P p α j]) β j
(We do not update σ in these definitions as progress depends only on the time-stamp sequence but not on the databases in σ.) The above characterization follows the iteration in evalLP: since prog is pointwise monotone in P and at most j (both facts we prove in the formalization), the fixpoint can be reached by iteratively computing prog σ (P[p ↦ i]) α j starting with i = 0. Similarly, evalLP starts by evaluating α with no data for p and feeds the results back into the evaluation until no further results can be obtained. Theorem 2 remains true after adding the above equation to prog.
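The fixpoint iteration is easy to state operationally; in the Python sketch below (ours), prog stands in for the map i ↦ prog σ (P[p ↦ i]) α j, which by the monotonicity and boundedness facts above makes the iteration from 0 terminate in the least fixpoint:

def progLP(prog):
    i = 0
    while True:
        i2 = prog(i)              # prog sigma (P[p |-> i]) alpha j
        if i2 == i:
            return i              # least fixpoint, reached from below
        i = i2

# e.g., alpha looks one time-point into p's past, with prefix length j = 5:
print(progLP(lambda i: min(i + 1, 5)))        # 5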
The state invariant for SLetPast is given by the rule

  invar (σ[p ↦ recp (λR k. satrel (σ[p ↦ R]) k α)]) j (P[p ↦ i]) m sα α
  invar (σ[p ↦ recp (λR k. satrel (σ[p ↦ R]) k α)]) j (P[p ↦ progLP σ P p α j]) n sβ β
  buf = ⊥ ⟶ i = progLP σ P p α j
  ∀Z. buf = ⟨Z⟩ ⟶ i + 1 = progLP σ P p α j ∧
      table m (fv α) (recp (λR k. satrel (σ[p ↦ R]) k α) i) Z
  m = nfv α      slp p α ≤ P      {0..<m} ⊆ fv α
  ──────────────────────────────────────────────────────────
  invar σ j P n (SLetPast p m sα sβ i buf) (LetPast p := α in β)

The first two premises use the same updated trace as in the semantics of LetPast (Section 4.1). The updated progress for p differs slightly between the premise for sα and that for sβ. For the latter it is given by progLP, as expected. The predicate p's progress within sα is equal to the state variable i, which is one less than progLP σ P p α j if the buffer buf is nonempty. This reflects the optimization discussed in Section 4.3. The predicate table m A R Z is true iff the table Z contains tuples of length m that assign values to the variables in A, and these are exactly the tuples v of this kind satisfying map the v ∈ R.
5 Evaluation
We have used Isabelle/HOL's code generator [12] to export a certified implementation of VeriMon's core init and step functions and every function they depend on (e.g., operations on red-black trees), which amounts to about 10 000 lines of OCaml code. VeriMon augments this generated code with unverified parsers and pretty-printers. We evaluate this implementation to answer the following research questions: (1) How does VeriMon perform when monitoring formulas with the recursive let operator? and (2) How does it compare to existing monitors for temporal first-order specifications with recursive rules?
To answer these questions, we run VeriMon and DejaVu and benchmark some of the example formulas introduced in Section 4.2. Instead of SinceLet, we opt for the simpler OnceLet ≡ LetPast o(u, v) := s(u, v) ∨ ● o(u, v) in filter(x, y) ∧ o(x, y), encoding the non-metric ◆ operator. We also include Once ≡ filter(x, y) ∧ ◆ s(x, y) for comparison. The predicate filter(x, y) keeps the output size small. The OnceLet formula uses only one recursive predicate instance, whose variable order matches the one in the predicate's definition. Other formulas have more than one instance with different variable orders.
For the PBLet formula, we use an existing random trace generator [17] configured to pick parameters from a small integer domain, which increases the probability of producing satisfactions. For the other formulas, we generate traces using a strategy similar to the one used in DejaVu's benchmarks on the Spawn formula [14]. Namely, edges of a tree of spawned processes with a configurable branching factor are linearized into a trace, level by level. In the final level all edges converge to a single node for the formulas Trans and Trans+. We define the edges by Let s+(x, y, w) := e(x, y, w) ∧ ¬◇[0,10] d(x, y) in the Trans+ formula and revoke one half of the edges on the second level of the branching.
We have executed our experiments on an Intel Core i5-4200U CPU using 8 GB
RAM. Initially, DejaVu crashed on the OnceLet and Spawn formulas. We investigated
the issue and found that its formula’s abstract syntax tree was disconnected in these cases.
We assume that this is caused by naming variables in the recursive rules’ definitions
Trace    Once            OnceLet         Spawn           Trans           Trans+   PBLet
length   VeriMon DejaVu  VeriMon DejaVu  VeriMon DejaVu  VeriMon DejaVu  VeriMon  VeriMon
100      0.0     1.1     0.0     1.1     0.6     1.5     1.3     3.7     5.6      0.0
200      0.0     1.2     0.0     1.2     3.1     2.1     6.1     8.1     25.9     0.0
400      0.0     1.3     0.0     1.3     14.0    3.4     28.3    23.6    117.4    0.0
800      0.0     1.5     0.0     1.4     64.8    8.2     TO      83.4    TO       0.0
4000     0.2     41.3    0.1     40.5    TO      TO      TO      TO      TO       0.1
8000     0.4     TO      0.2     TO      TO      TO      TO      TO      TO       0.1
10000    0.5     TO      0.3     TO      TO      TO      TO      TO      TO       0.2

Fig. 3. Execution times of the monitors in seconds (TO = timeout of 120 seconds)
differently from those in the rules’ usages. After renaming the variables in the let-bound
predicates of these two formulas, the issue was fixed and we restarted the experiments.
The evaluation results (Figure 3) show that DejaVu's performance is incomparable to VeriMon's. VeriMon outperforms DejaVu on the formulas Once and OnceLet and scales well on PBLet, which, together with the Trans+ formula, we could not express in PFLTL with recursion. DejaVu outperforms VeriMon on the Spawn and Trans formulas, for which VeriMon's time complexity of processing one event is linear in the trace length: the number N of valuations satisfying the recursive predicates grows linearly in the trace length, and the time complexity of updating the recursive predicate is linear in N. We conjecture, based on some preliminary experiments, that VeriMon's performance can be significantly improved by optimizing the representation of sets of tuples in two ways: (a) using tuples of a fixed length with a fixed assignment of variables to positions in a tuple (i.e., no De Bruijn indices); (b) using a collection of indices to optimize the computation of joins on various sets of shared columns. Nevertheless, it is unlikely that processing one event can be made trace-length independent: Trans encodes the incremental dynamic transitive closure graph problem, for which the best known algorithm processes every new edge in amortized linear time in the graph's maximum out-degree [23].
6 Conclusion
We have presented an extension of a monitor for MFOTL with non-recursive and past-recursive let operators. The presence of bounded future temporal operators complicates both the semantics and the evaluation algorithms for the new constructs, compared to earlier unverified extensions of past-only monitors [14]. Yet, the formal correctness proofs that we have carried out ensure the trustworthiness of our development.

As future work, we plan to improve the performance of evaluating expensive joins by introducing indices, as used in database management systems. Expressiveness-wise, we will consider further relaxing the requirements on the recursive let. We can omit the past guard if we define a Datalog-style fragment for which the fixpoint is well-defined. Beyond relaxing guards, we may want to allow recursion through future operators in certain situations. The main challenge is that this would make the progress notion data-dependent (unlike currently, where it depends only on the time-stamps).
Acknowledgments We thank David Basin for supporting this work and the anonymous
TACAS reviewers for their helpful comments. Dmitriy Traytel is supported by a Novo
Nordisk Fonden Start Package Grant (NNF20OC0063462).
Verified First-Order Monitoring with Recursive Rules 251
References

1. Abiteboul, S., Hull, R., Vianu, V.: Foundations of Databases. Addison-Wesley (1995)
2. Barringer, H., Goldberg, A., Havelund, K., Sen, K.: Rule-based runtime verification. In: Steffen, B., Levi, G. (eds.) VMCAI 2004. LNCS, vol. 2937, pp. 44–57. Springer (2004). https://doi.org/10.1007/978-3-540-24622-0_5
3. Basin, D., Dardinier, T., Heimes, L., Krstić, S., Raszyk, M., Schneider, J., Traytel, D.: A formally verified, optimized monitor for metric first-order dynamic logic. In: Peltier, N., Sofronie-Stokkermans, V. (eds.) IJCAR 2020. LNCS, vol. 12166, pp. 432–453. Springer (2020). https://doi.org/10.1007/978-3-030-51074-9_25
4. Basin, D., Klaedtke, F., Müller, S., Zălinescu, E.: Monitoring metric first-order temporal properties. J. ACM 62(2), 15:1–15:45 (2015). https://doi.org/10.1145/2699444
5. Basin, D., Klaedtke, F., Zălinescu, E.: The MonPoly monitoring tool. In: Reger, G., Havelund, K. (eds.) RV-CuBES 2017. Kalpa Publications in Computing, vol. 3, pp. 19–28. EasyChair (2017). https://doi.org/10.29007/89hs
6. Convent, L., Hungerecker, S., Leucker, M., Scheffel, T., Schmitz, M., Thoma, D.: TeSSLa: Temporal stream-based specification language. In: Massoni, T., Mousavi, M.R. (eds.) SBMF 2018. LNCS, vol. 11254, pp. 144–162. Springer (2018). https://doi.org/10.1007/978-3-030-03044-5_10
7. Cucala, D.J.T., Walega, P.A., Grau, B.C., Kostylev, E.V.: Stratified negation in Datalog with metric temporal operators. In: AAAI 2021. pp. 6488–6495. AAAI Press (2021)
8. D'Angelo, B., Sankaranarayanan, S., Sánchez, C., Robinson, W., Finkbeiner, B., Sipma, H.B., Mehrotra, S., Manna, Z.: LOLA: Runtime monitoring of synchronous systems. In: TIME 2005. pp. 166–174. IEEE Computer Society (2005). https://doi.org/10.1109/TIME.2005.26
9. De Giacomo, G., Vardi, M.Y.: Linear temporal logic and linear dynamic logic on finite traces. In: Rossi, F. (ed.) IJCAI 2013. pp. 854–860. IJCAI/AAAI (2013)
10. Falcone, Y., Krstić, S., Reger, G., Traytel, D.: A taxonomy for classifying runtime verification tools. Int. J. Softw. Tools Technol. Transf. 23(2), 255–284 (2021). https://doi.org/10.1007/s10009-021-00609-z
11. Gorostiaga, F., Sánchez, C.: Stream runtime verification of real-time event streams with the Striver language. Int. J. Softw. Tools Technol. Transf. 23(2), 157–183 (2021). https://doi.org/10.1007/s10009-021-00605-3
12. Haftmann, F.: Code generation from specifications in higher-order logic. Ph.D. thesis, Technical University Munich (2009)
13. Havelund, K.: Rule-based runtime verification revisited. Int. J. Softw. Tools Technol. Transf. 17(2), 143–170 (2015). https://doi.org/10.1007/s10009-014-0309-2
14. Havelund, K., Peled, D.: An extension of LTL with rules and its application to runtime verification. In: Finkbeiner, B., Mariani, L. (eds.) RV 2019. LNCS, vol. 11757, pp. 239–255. Springer (2019). https://doi.org/10.1007/978-3-030-32079-9_14
15. Havelund, K., Peled, D., Ulus, D.: First-order temporal logic monitoring with BDDs. Formal Methods Syst. Des. 56(1), 1–21 (2020). https://doi.org/10.1007/s10703-018-00327-4
16. Havelund, K., Reger, G., Thoma, D., Zălinescu, E.: Monitoring events that carry data. In: Bartocci, E., Falcone, Y. (eds.) Lectures on Runtime Verification: Introductory and Advanced Topics, LNCS, vol. 10457, pp. 61–102. Springer (2018). https://doi.org/10.1007/978-3-319-75632-5_3
17. Krstić, S., Schneider, J.: A benchmark generator for online first-order monitoring. In: Deshmukh, J., Ničković, D. (eds.) RV 2020. LNCS, vol. 12399, pp. 482–494. Springer (2020). https://doi.org/10.1007/978-3-030-60508-7_27
18. Libkin, L.: Elements of Finite Model Theory. Springer (2004)
19. Ronca, A., Kaminski, M., Grau, B.C., Motik, B., Horrocks, I.: Stream reasoning in temporal Datalog. In: McIlraith, S.A., Weinberger, K.Q. (eds.) AAAI 2018. pp. 1941–1948. AAAI Press (2018)
20. Sánchez, C.: Online and offline stream runtime verification of synchronous systems. In: Colombo, C., Leucker, M. (eds.) RV 2018. LNCS, vol. 11237, pp. 138–163. Springer (2018). https://doi.org/10.1007/978-3-030-03769-7_9
21. Schneider, J., Basin, D., Krstić, S., Traytel, D.: A formally verified monitor for metric first-order temporal logic. In: Finkbeiner, B., Mariani, L. (eds.) RV 2019. LNCS, vol. 11757, pp. 310–328. Springer (2019). https://doi.org/10.1007/978-3-030-32079-9_18
22. Walega, P.A., Kaminski, M., Grau, B.C.: Reasoning over streaming data in metric temporal Datalog. In: AAAI 2019. pp. 3092–3099. AAAI Press (2019). https://doi.org/10.1609/aaai.v33i01.33013092
23. Yellin, D.M.: Speeding up dynamic transitive closure for bounded degree graphs. Acta Informatica 30(4), 369–384 (1993). https://doi.org/10.1007/BF01209711
24. Zingg, S., Krstić, S., Raszyk, M., Schneider, J., Traytel, D.: VeriMon's development repository. https://bitbucket.org/jshs/monpoly/src/887b996966/thys/ (2021)
Maximizing Branch Coverage with Constrained Horn Clauses

Ilia Zlatkin, Grigory Fedyukovich
Florida State University, Tallahassee, FL, USA
iz20e@fsu.edu, grigory@cs.fsu.edu
Abstract. State-of-the-art solvers for constrained Horn clauses (CHC) are successfully used to generate reachability facts from symbolic encodings of programs. In this paper, we present a new application to test-case generation: if a block of code is provably unreachable, no test case can be generated for it, and the generator can safely move on to other blocks of code. Our new approach uses CHC to incrementally construct different program unrollings and extract test cases from models of satisfiable formulas. At the same time, a CHC solver keeps track of CHCs that represent unreachable blocks of code, which makes the unrolling process more efficient. In practice, this lets our approach terminate early while guaranteeing maximal coverage. Our implementation, called Horntinuum, exhibits promising performance: it generates high coverage in the majority of cases and spends less time on average than state-of-the-art tools.
1 Introduction
Branch coverage is a method for testing that aims to maximize the number of
program branches to be collectively visited by a set of test cases. Branches in the
code are commonly attributed to the conditional statements or loops. For testing
a loop-free program, possible test cases for all the branches can be identified by
symbolic execution, powered by efficient solvers for Boolean Satisfiability (SAT)
or Satisfiability Modulo Theories (SMT). If a conditional is placed inside or after
a loop, test-case generation immediately becomes challenging because the cost
of exploration of every next iteration grows exponentially in the worst case.
Many verification problems can be reduced to synthesizing interpretations of
predicates in systems of SMT formulas, also known as constrained Horn clauses
(CHC), that provide a modular encoding for programs with arbitrary control
flow. In this paper, we propose to use CHC also for test-case generation. Solutions to CHC, also called inductive invariants, carry reachability information
and are useful in pruning the search space explored by test-case generators. If an
invariant shows that a branch can never be taken, then it is guaranteed that no
test can ever reach the branch, and thus a test-case generator can safely proceed
to discovery of the next test case.
We contribute a new approach to test-case generation that aims at maximizing branch coverage using inductive invariants. In essence, our approach gradually enumerates different unrollings and uses an off-the-shelf SMT solver to get
values for program variables that represent test cases. Unrollings are constructed on-the-fly by exploring the CHC encoding of programs. Concurrently, an incremental CHC solver determines a subset of unreachable CHCs, which allows the algorithm to explore fewer unrollings in the next iterations. The algorithm terminates when test cases have been generated for all reachable branches and all the remaining branches are provably unreachable.
These features distinguish our approach from other white-box test generators [1,8,9], which consider reachability information only in a bounded context. That is, in the presence of unreachable branches and loops, they may continue iterating forever, even if all possible test cases have already been generated. Reliance on invariants lets our tool terminate early while still guaranteeing the maximal possible coverage.
The approach has been implemented on top of the FreqHorn CHC solver [14] and the Z3 SMT solver [27]. It enables test-case generation for C programs, converted to CHCs by the SeaHorn [21] tool. Experiments conducted on a range of public benchmarks demonstrate the strengths of our approach compared to state-of-the-art tools: SMT-based incremental test-case generation detects high-quality solutions in the majority of cases and is on average less expensive.
2 Related Work
Automated test generation has two main approaches: fuzzing (e.g., [7,20,25,26,29,31,33,34]) and symbolic/concolic execution (e.g., [3,8,11,22,23,28,32]). The former group uses user-given seed inputs and further mutates them based on various heuristics (sometimes using the source code as well). The latter group, which also includes our approach, proceeds by enumerating paths and generating test cases, often using constraint solvers. Recent algorithms, including FuSeBMC [1] and VeriFuzz [9], follow both approaches: they begin with symbolic execution (namely, some bounded model checking [10,19]) and then proceed to fuzzing.
The closest related work [22] suggests accelerating testing using interpolation. Although they aim at the same goal as us, i.e., pruning unreachable paths, they do not generate inductive invariants, which limits the generality of their method. Earlier attempts to combine static analysis techniques and testing [11] were tailored to particular frameworks and languages. With the rise of SMT solvers, approaches became more scalable, goal-oriented [3], and at the same time more agnostic to programming languages. Recent works, e.g. [33], offer great flexibility in applying static analyzers to test-case generation, e.g., to direct fuzzers to specific blocks of code. Following this trend, our approach continues bridging the gap between state-of-the-art in automated reasoning and testing.
While we are not aware of any specific applications of CHC solvers to test-case generation, we are largely inspired by the work in model checking, e.g., [6,21], which can both discover invariants and find counterexamples (from which a test case can be extracted). The main difference is in the application: model checkers often focus on a single property/bug, while our goal is to cover the maximal number of branches. Furthermore, while many practical approaches including [1,9]
 1  int x = 0;
 2  int y = nondet();
 3  int z = nondet();
 4  while (1) {
 5    if (x >= 5)
 6      y++;      // needs at least 6 iterations to reach
 7    else
 8      x++;      // x ∈ [0,5] always holds
 9    if (y <= 5)
10      z++;
11    else
12      if (x > y)
13        y++;    // this is unreachable
14      else
15        x = 0;
16    if (z == 0)
17      break;
18  }

Fig. 1: Loopy program with control-flow divergence and unreachable branches.
are based on existing model checkers (that typically use constraint solvers as a black box), the CHC formulation allows us to build tools modularly and directly on top of an SMT solver, thus allowing its incremental use for both counterexample finding and invariant generation.
3 Motivating Example
Fig. 1 gives a program with a single loop. It has three variables: x is assigned zero before the loop, which we cannot change, and the remaining y and z could be taken from the user. The loop has four if-then-elses (including one nested), and it terminates when the value of z at the end of an iteration equals zero. To completely cover all the branches, we need to consider seven cases, in particular:
line 6: In order to reach the first then-branch, the loop needs to iterate at least six times and must not reach lines 15 and 17 in the first five iterations. Thus, line 8 should be visited the first five times. A possible scenario for that would be if initially y = 0 and z = 0.
line 8: The loop always reaches the else-branch of the first conditional because initially x is zero, and the guard trivially does not hold.
line 10: The guard of the second conditional might hold even at the first iteration if y is sufficiently small. Since we know that the increment at line 6 does not happen at the first iteration, y might initially be 5 (and z arbitrary). Then the branch is reachable.
line 13: The branch is never reachable because 0 ≤ x ≤ 5 is a loop invariant (and thus holds at each iteration) and the path condition x > y ∧ y > 5 is unsatisfiable.
line 15: Because line 13 is unreachable, we know that line 15 is always reached if the guard of the second conditional does not hold, e.g., when y is initially greater than 5.
line 17: If initially z = 0 and y is greater than 5, the loop executes the break statement at the end of its only iteration.
line 19: We have already seen a test case (for line 6) that gives a possible condition for the loop to continue iterating. In fact, for any value of z greater than zero (and any value of y), the loop does not terminate at all.

All these make the program quite interesting and its analysis challenging.
4 Background
This paper approaches the problem of automated test-case generation by reduction to the Satisfiability Modulo Theories (SMT) problem. Automated SMT solvers determine the existence of a satisfying assignment to the variables (also called a model) of a first-order logic formula. Formula ϕ is logically stronger than formula ψ (denoted ϕ ⟹ ψ) if every model of ϕ also satisfies ψ. The unsatisfiability of formula ϕ is denoted ϕ ⟹ ⊥, and we also write ∄M to indicate that no model M of the formula (which is clear from the context) exists. By writing ψ(x), we denote a predicate over free variables x.

Constrained Horn clauses (CHC) are used as an intermediate verification language by both verification frontends and backend SMT solvers. This allows splitting efforts while designing a verification tool for a new language: while focusing on encoding programs to CHCs, researchers rely on advances in CHC solvers that will solve these CHCs. Thus, by demonstrating our algorithms at the level of CHCs, we allow for many particular instantiations of them for various programming languages (that support CHC encoding).
Definition 1. A linear constrained Horn clause (CHC) over a set of uninterpreted relation symbols R is a first-order logic formula having the form of either:

ϕ(x1) ⟹ inv1(x1)
inv_i(x_i) ∧ ϕ(x_i, x_j) ⟹ inv_j(x_j)
inv_n(x_n) ∧ ϕ(x_n) ⟹ ⊥

where all inv_i ∈ R are uninterpreted symbols, all x_i are vectors of variables, and ϕ is a fully interpreted formula called the constraint.
These types of implications are called, respectively, a fact, an inductive clause, and a query. Note that the constraint ϕ_C of each CHC C does not have applications of any predicates from R. Further, by body(C) we denote the premise of C, and by src(C) an application of inv ∈ R in body(C) (but if C is a fact, we write src(C) ≝ ⊤). Similarly, by head(C) we denote the conclusion of C, and by dst(C) an application of inv ∈ R in head(C) (and if C is a query, we write dst(C) ≝ ⊥).

Intuitively, CHCs allow generating program encodings with "holes" that represent unrollings of unknown lengths. Possible instantiations of these holes can then be used in the discovery of meaningful information about the program, such as loop invariants or function summaries.
Maximizing Branch Coverage with Constrained Horn Clauses 257
Definition 2. Given a set R of uninterpreted predicates and a set S of CHCs over R, we say that S is satisfiable if there exists an interpretation for every inv ∈ R that makes all implications in S valid.
CHCs are also useful when there is a need to access various pieces of a program encoding and pose reachability queries. In particular, it is straightforward to design a Bounded Model Checking (BMC) [5] tool on top of CHCs and use it for test-case generation. Specifically, by traversing the graph structure imposed on the CHCs, we can access all possible program traces and create the corresponding unrollings.
Definition 3. Given a system S of CHCs over R, an unrolling of S of length k is a conjunction π_{C0,...,Ck} ≝ ⋀_{0≤i≤k} ϕ_{Ci}(x_i, x_{i+1}), such that 1) C0 is a fact, 2) each Ci ∈ S, 3) for each pair Ci and Ci+1, rel(dst(Ci)) = rel(src(Ci+1)), and the variables of each x_i are shared only between ϕ_{Ci−1}(x_{i−1}, x_i) and ϕ_{Ci}(x_i, x_{i+1}).
For bug finding, it is essential to enumerate various unrollings and check their satisfiability. Once a satisfiable formula π_{C0,...,Ck} is found for some query Ck, a bug is found (and its counterexample can be obtained from the model), and thus no interpretation for the predicates in R exists.
Lemma 1. Given a system of CHCs S, let π_{C0,...,Ck} be one of its unrollings, such that C0 is a fact and Ck is the query. Then if π_{C0,...,Ck} is satisfiable, then S is unsatisfiable.
In the next section, we expand on the notions of CHCs and unrollings, give
examples, and present the application to test-case generation.
5 Test-case Generation for Branch Coverage
The concept of constrained Horn clauses is convenient for formulating the problem of constructing a maximal branch coverage (MBC) of a given program. At the highest level, the problem of finding an MBC is concerned with finding a set of program executions that visit all reachable program branches. Given the CHC encoding of the program, this can be reduced to the problem of finding a set of satisfiable unrollings that involve the maximal number of CHCs. However, to guarantee maximality, this requires a special property of the CHC encoding: the constraint ϕ in each CHC should represent a straight-line code sequence with no branches (a.k.a. a basic block). Technically, this can be formulated as the requirement for each CHC to have a conjunction of literals (a.k.a. a cube), i.e., no disjunctions, in its body.
Example 1. Fig. 2 gives a CHC encoding of the program in Fig. 1. There are eight CHCs over four uninterpreted predicates A, B, C, and D. The program entry is encoded in the first CHC (i.e., the only fact, with the dst-predicate A), and its exit in the last CHC (i.e., with the dst-predicate D). All other CHCs encode
(1) x = 0 ⟹ A(x, y, z)
(2) A(x, y, z) ∧ x ≥ 5 ∧ x′ = x ∧ y′ = y + 1 ∧ z′ = z ⟹ B(x′, y′, z′)
(3) A(x, y, z) ∧ x < 5 ∧ x′ = x + 1 ∧ y′ = y ∧ z′ = z ⟹ B(x′, y′, z′)
(4) B(x, y, z) ∧ y ≤ 5 ∧ x′ = x ∧ y′ = y ∧ z′ = z + 1 ⟹ C(x′, y′, z′)
(5) B(x, y, z) ∧ y > 5 ∧ x > y ∧ x′ = x ∧ y′ = y + 1 ∧ z′ = z ⟹ C(x′, y′, z′)
(6) B(x, y, z) ∧ y > 5 ∧ x ≤ y ∧ x′ = 0 ∧ y′ = y ∧ z′ = z ⟹ C(x′, y′, z′)
(7) C(x, y, z) ∧ z ≠ 0 ⟹ A(x, y, z)
(8) C(x, y, z) ∧ z = 0 ⟹ D(x, y, z)

[Graph over the nodes init, A, B, C, D]

Fig. 2: CHCs of the motivating example (left) and src/dst-dependency graph (right).
the loop, with a total of six symbolic paths, each following A → B → C → A (as can be seen from the graphical representation) but involving different CHCs. Each CHC has no disjunctions in its body: it contains the conjunction of the (possibly negated) guard and the encoding of the program instructions following the corresponding branch until either the next conditional or the join occurs. Note that there are no queries in this system since there are no assertions in the program.
To formulate the MBC problem at the level of CHCs, it is convenient to introduce the concept of a src/dst-dependency graph for a system of CHCs.

Definition 4. Given a system S of CHCs over a set of uninterpreted predicate symbols R, its src/dst-dependency graph ⟨R, E⟩ is a directed graph with edges labeled by CHCs from S:

E ≝ {⟨rel(src(C)), C, rel(dst(C))⟩ | C ∈ S}.
Because we are bound in this paper to use only disjunction-free CHCs, the points of control-flow divergence in a program encoded in these CHCs are captured by vertices in the src/dst-dependency graph that have more than one outgoing edge.¹ To generate a test case visiting some block of code encoded in a CHC Ck, it is enough to find an unrolling π_{C0,...,Ck} and show that this unrolling is satisfiable. In this case, the CHC is called reachable: i.e., the satisfying assignment naturally corresponds to a program trace beginning at the program entry point and reaching the code in that branch. Furthermore, if the execution depends on some input values, these values can also be extracted from the satisfying assignment.

¹ Thus, in this case, the src/dst-dependency graph can be seen as a control-flow graph (CFG) of the encoded program. In practice, many verification tools that are based on CHC do not generate CHCs in such a form but apply some generalization and compression to the CFG during preprocessing. This results in CHCs with disjunctive bodies that are unsuitable for our approach. In these cases, we explicitly convert the body of each CHC to disjunctive normal form (DNF) and clone the CHC for each cube in the DNF (see the sketch below). The CHCs after this transformation are still a correct encoding of the original program, and its src/dst-dependency graph is suitable for our approach, but it may not exactly match the CFG of the original program.
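The footnote's preprocessing step is straightforward; the following Python sketch (ours; the nested-tuple representation of CHC bodies is an assumption made for illustration) converts a body to DNF and clones the clause once per cube:

def dnf(phi):
    tag = phi[0]
    if tag == "lit":
        return [[phi[1]]]                        # one cube, one literal
    if tag == "or":
        return dnf(phi[1]) + dnf(phi[2])
    if tag == "and":                             # distribute and over or
        return [a + b for a in dnf(phi[1]) for b in dnf(phi[2])]
    raise ValueError(tag)

def clone_chc(body, head):
    return [(cube, head) for cube in dnf(body)]

body = ("and", ("lit", "A(x,y,z)"),
        ("or", ("lit", "x >= 5"), ("lit", "x < 5")))
for cube, head in clone_chc(body, "B(x',y',z')"):
    print(" & ".join(cube), "=>", head)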
Example 2. According to Fig. 2, the first point of control-flow divergence is predicate A. To show that CHC 3 is reachable, we create the following unrolling from the bodies of CHCs 1 and 3:

x = 0 ∧ x < 5 ∧ x′ = x + 1 ∧ y′ = y ∧ z′ = z.

This formula is satisfiable, and there exists a model M = {x ↦ 0, y ↦ 0, z ↦ 0, . . .}, thus giving us two values for the input variables y and z (both zeroes).
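The satisfiability check of Example 2 can be reproduced directly with an SMT solver; below is a minimal sketch using the z3 Python bindings (our own illustration; the names x1, y1, z1 stand for the primed variables x′, y′, z′):

from z3 import Ints, Solver, sat

x, y, z, x1, y1, z1 = Ints('x y z x1 y1 z1')
s = Solver()
# unrolling built from the bodies of CHCs (1) and (3)
s.add(x == 0, x < 5, x1 == x + 1, y1 == y, z1 == z)
assert s.check() == sat
m = s.model()
# extract input values for the test case (both default to 0 here)
print(m.eval(y, model_completion=True), m.eval(z, model_completion=True))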
It can also be seen that some CHCs cannot be visited by any trace. To find
them, we can pose additional safety verification queries and aim at generating
an appropriate invariant.
Lemma 2. Let S be a system of CHCs over some R, and let C be some CHC from S. If the extended CHC system S ∪ {src(C) ∧ φ_C ⟹ ⊥} is satisfiable, then C is unreachable.
The proof of the lemma follows directly from Lemma 1.
Example 3. In the CHC system in Fig. 2, CHC 5 is never reachable. We introduce a new query CHC Q as follows:

B(x, y, z) ∧ y > 5 ∧ x > y ∧ x′ = x ∧ y′ = y + 1 ∧ z′ = z ⟹ ⊥

The system S ∪ {Q} is satisfiable, with the following interpretation M:

M(A) = M(B) = M(C) = λx, y, z . x ≤ 5

Because x ≤ 5 ∧ y > 5 ∧ x > y is unsatisfiable, CHC 5 is unreachable.
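The same conclusion can be reached with an off-the-shelf CHC solver. The following sketch (our own illustration, not part of the tool described in this paper) encodes the eight CHCs, as reconstructed in Fig. 2, in Z3's Fixedpoint engine and queries the body of CHC 5; an unsat answer corresponds to the satisfiability of S ∪ {Q} and hence to the unreachability of CHC 5:

from z3 import Ints, Function, IntSort, BoolSort, Fixedpoint, And

x, y, z, x1, y1, z1 = Ints('x y z x1 y1 z1')
I = IntSort()
A, B, C, D = (Function(n, I, I, I, BoolSort()) for n in 'ABCD')

fp = Fixedpoint()
fp.set(engine='spacer')
for r in (A, B, C, D):
    fp.register_relation(r)
fp.declare_var(x, y, z, x1, y1, z1)

fp.rule(A(x, y, z), x == 0)                                        # (1)
fp.rule(B(x, y1, z), And(A(x, y, z), x >= 5, y1 == y + 1))         # (2)
fp.rule(B(x1, y, z), And(A(x, y, z), x < 5, x1 == x + 1))          # (3)
fp.rule(C(x, y, z1), And(B(x, y, z), y <= 5, z1 == z + 1))         # (4)
fp.rule(C(x, y1, z), And(B(x, y, z), y > 5, x > y, y1 == y + 1))   # (5)
fp.rule(C(x1, y, z), And(B(x, y, z), y > 5, x <= y, x1 == 0))      # (6)
fp.rule(A(x, y, z), And(C(x, y, z), z != 0))                       # (7)
fp.rule(D(x, y, z), And(C(x, y, z), z == 0))                       # (8)

# query the body of CHC (5): unsat means CHC (5) is unreachable,
# and Spacer's certificate contains an invariant such as x <= 5
print(fp.query(And(B(x, y, z), y > 5, x > y)))   # expected: unsat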
These ingredients let us state the MBC problem formally.
Definition 5 (MBC). Given a system S of CHCs over some R, the problem of maximizing branch coverage of S is concerned with 1) determining a subset S_u ⊆ S of CHCs which are provably unreachable (i.e., Lemma 2 applies), and 2) finding satisfiable unrollings for all CHCs from S ∖ S_u.
The practical significance of the MBC problem lies in allowing test-generation tools that are based on bounded model checking, e.g., [1], to terminate earlier. The invariants discovered while iteratively applying Lemma 2 can serve as annotations of various nodes of the program CFG, which further enables pruning the search space of the test cases. In particular, for our running example in Fig. 1, line 13 is provably unreachable, so it makes no sense to search for a test case for it.
Furthermore, with an invariant that blocks a branch at hand, the tools can explore fewer unrollings leading to other branches in the next iterations of the loop. Specifically, to reach line 6, five iterations of the loop will provably skip line 13, so instead of (2 · 3)⁵ = 7776 unrollings, the tool should only explore (2 · 2)⁵ = 1024 unrollings.
6 Solving the MBC problem
In this section, we introduce our novel approach to constructing maximal branch coverage using a system of disjunction-free CHCs. We begin by outlining the key ideas, which can be implemented on top of existing test-case generators and invariant generators, and then proceed to describe our efficient implementation.
6.1 Key Insights
The approach has a simple high-level structure. Because the number of CHCs
in a program encoding is always finite, we can pose a safety verification query
for each of them.
Existing CHC solvers are equipped with the functionality to generate both counterexamples and safety invariants. However, a recent evaluation [17] shows that bounded-model-checking implementations often outperform general-purpose solvers on unsatisfiable CHC instances (likely because they do not invest effort in generating invariants). This suggests that, for performance reasons, it makes sense to alternate between separate runs of a counterexample generator (via enumerating the unrollings) and an invariant generator. This allows for two main benefits, outlined in the next two paragraphs.
A counterexample generator, in the MBC setting, should handle a large number of unrollings. Many of the unrollings are unsatisfiable, since some sequentially aligned branches might be incompatible, and some other branches might be waiting for a certain loop iteration. It is thus essential to share information about conflicting path segments (e.g., unsatisfiable prefixes, as in our implementation) to accelerate the search. Dually, satisfiable unrollings can often be extended to unrollings for other reachable CHCs, and this information can be exploited in the enumerative search for the remaining branches.
An invariant generator, invoked multiple times throughout the process, deals with many largely similar safety verification instances (since all CHCs are the same, and only the queries differ). Thus, a lot of information can be reused between verification runs, opening opportunities for incremental verification [13]. Formally, all invariants that are discovered while proving the unreachability of a CHC remain valid after switching to another CHC. Moreover, solvers that target conjunctive invariant generation, e.g., [15,24], can output “partial” invariants (i.e., some lemmas) even for unsatisfiable CHC instances, which can then be reused/completed in the next runs of the solver.
These observations let us conclude that although using off-the-shelf tools for bounded model checking and invariant generation is possible, an MBC procedure will likely perform better with new algorithms designed around the aforementioned insights.
6.2 General Driver
The pseudocode of our approach is given in Alg. 1. The algorithm begins by identifying a subset cur of CHCs that need to be considered in its iterations.
Algorithm 1: CHC-based test-case generator.
Input: S: a CHC system over R
Output: T: a set of satisfying assignments to variables in S
Data: invs: mapping from R to invariants; G = ⟨R, E⟩: an edge-labeled graph; cur ⊆ S: a subset of CHCs to consider; length: counter representing the length of the current unrollings; traces: a (global) set of traces to consider
 1  ⟨R, E⟩ ← src/dst-dependency graph of S;
 2  cur ← {C | ⟨u, C, v_1⟩ ∈ E and ∃⟨u, ·, v_2⟩ ∈ E where v_1 ≠ v_2};
 3  if cur = ∅ then cur ← {C | src(C) = ⊤};
 4  length ← 1;
 5  while cur ≠ ∅ do
 6      for chc ∈ cur do
 7          ⟨res, invs, cex⟩ ← solveCHCs(S ∪ {body(chc) ⟹ ⊥}, invs);
 8          if res = sat then
 9              cur ← cur ∖ {chc};
10              E ← {⟨u, C, v⟩ | ⟨u, C, v⟩ ∈ E and C ≠ chc};
11          else if res = unsat then
12              cur ← cur ∖ {chc};
13              T ← T ∪ {cex};
14          else
15              traces ← ∅;
16              GetTraces(E, ⊤, chc, length, nil, prefixes, traces);
17              for t ∈ traces do
18                  ⟨res, M⟩ ← checkSAT(unroll(S, t));
19                  if res = sat then
20                      T ← T ∪ {M};
21                      cur ← cur ∖ {chc};
22                      break;
23                  else
24                      prefixes ← prefixes ∪ {t};
25      length ← length + 1;
We say that a CHC C opens a branch if the outdegree of rel(src(C)) in the src/dst-dependency graph is greater than one (line 2). Thus, to generate a test case visiting a branch, it is enough to find an unrolling π = C_0, . . . , C_k where C_k opens that branch and show that this unrolling is satisfiable. If, however, there are no branches in the given program at all, then cur gets all facts of the CHC system (line 3), and the remaining coverage generation is straightforward.
The rest of the algorithm is organized as a big loop that decides whether the CHCs from cur are (un)reachable and terminates when cur is empty. At each iteration of the loop, all CHCs from cur are enumerated, and the algorithm seeks to apply Lemma 2, i.e., it extends S with one query and solves the resulting CHC system (line 7). The algorithm can use any CHC solving algorithm that decides the satisfiability
Algorithm 2: GetTraces: trace enumerator.
Input: E ⊆ R × S × R: labeled edges; u ∈ R; chc ∈ S; length: length of trace; t: trace prefix; prefixes: prefixes to avoid
Output: traces: global set of traces of the given length beginning with relation u and ending with chc
 1  if ∃p ∈ prefixes . ∀i ∈ [0, |p|) . p_i = t_i then
 2      return;
 3  if length = 1 then
 4      if ⟨u, chc, ·⟩ ∈ E then
 5          traces ← traces ∪ {t@chc}
 6  else
 7      for ⟨u, C, v⟩ ∈ E do
 8          GetTraces(E, v, chc, length − 1, t@C, prefixes, traces);
of CHCs and returns inductive invariants (line 8) or (optionally²) a counterexample (line 11). In both cases, the CHC is excluded from cur. Additionally, if satisfiable, this CHC cannot be used in any unrolling, and it is also excluded from the auxiliary graph (line 10, to prune the search space of the remaining test cases). If a counterexample is returned, the branch is reachable, and the test case is extracted from this counterexample (line 13).
It is also possible (and in practice, very likely) that the CHC solver returns unknown (because the problem is undecidable, and invariant generators are often limited to either a fixed shape of invariants or a certain timeout). In this case (lines 16-22), the algorithm proceeds with an explicit enumeration of unrollings of a predetermined length (line 16). Each trace t = t_0, t_1, . . . , t_{length−1} has an associated unrolling π = C_{t_0}, . . . , C_{t_{length−1}}, which is checked for satisfiability (line 18) with an off-the-shelf SMT solver. If satisfiable (line 19), the branch opened by the current CHC is reachable, the test case is generated from the model, and the CHC is excluded from cur. If unsatisfiable (line 23), the algorithm registers this t as an unsatisfiable prefix to be avoided in the trace generation in the next iterations (see Alg. 2).
Theorem 1. When Alg. 1 terminates, the resulting set T contains all the variable assignments needed for maximal coverage.
In the next two paragraphs we discuss two important design choices that do
not affect the correctness of our implementation, but optimize it.
² In fact, the counterexample detection in some CHC solvers, e.g., [24], proceeds in a similar fashion as described in our algorithm; but if invoked multiple times throughout the algorithm, the CHC solver is likely to perform many redundant actions. We thus do not use this functionality in our experiments (and in our Alg. 3), but leave it in the pseudocode for the sake of completeness of presentation.
Algorithm 3: solveCHCs.
Input: S: a CHC system over R; invs: mapping from R to invariants
Output: res ∈ {sat, unsat}; invs: updated mapping; [cex: counterexample]
 1  S′ ← ∅;
 2  for chc ∈ S do
 3      S′ ← S′ ∪ {src(chc) ∧ body(chc)[R ↦ invs] ⟹ dst(chc)};
 4  if λx̄.⊤ is a solution for S′ then
 5      return ⟨sat, invs⟩;
 6  return ⟨res, invs⟩ ← FreqHorn(S′);
6.3 Incremental Trace Enumeration
Our algorithm allows for sharing the information obtained during its iterations using two global data structures: the set of unsatisfiable prefixes discovered during the trace enumeration and the graph structure ⟨R, E⟩ representing potentially reachable CHCs (line 10 of Alg. 1). Intuitively, the latter is constructed by an iterative removal of edges from the src/dst-dependency graph, thus allowing for a more focused search of suitable traces. Both data structures are used in Alg. 2, which is called at the next algorithm iteration.
Conceptually, Alg. 2 is a dynamic-programming implementation of a path finder in an arbitrary directed graph. Given a length of path and its starting and ending points, the algorithm recursively visits the graph edges and stores them in vectors³. In our setting, the algorithm is optimized in two ways. First, at line 1, it skips paths with unsatisfiable prefixes (because the corresponding unrollings will be unsatisfiable too). Second, at lines 4 and 7, it excludes all the unreachable CHCs that have been previously excluded from the graph.
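For concreteness, here is a direct Python rendering of Alg. 2, a sketch under our own assumptions (edges are triples (u, C, v) over predicate names, and traces and prefixes are tuples of CHC identifiers; these representations are not taken from the tool):

def get_traces(edges, u, chc, length, t, prefixes, traces):
    # line 1 of Alg. 2: skip any trace extending a known-unsatisfiable prefix
    if any(t[:len(p)] == p for p in prefixes):
        return
    if length == 1:
        # lines 3-5: the last step must be an edge from u labeled by chc
        if any(src == u and c == chc for (src, c, _) in edges):
            traces.add(t + (chc,))
    else:
        # lines 6-8: extend the prefix along every outgoing edge of u
        for (src, c, v) in edges:
            if src == u:
                get_traces(edges, v, chc, length - 1, t + (c,), prefixes, traces)

Since unreachable CHCs are removed from edges by Alg. 1 (line 10), the enumeration automatically avoids them, exactly as described above.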
Example 4. Recall our running example, the program encoded as CHCs in Fig. 2. For length = 2 and CHC (2), Alg. 2 constructs a single trace ⟨(1), (2)⟩, which corresponds to an unsatisfiable unrolling, found by Alg. 1, and is thus added to prefixes. Consequently, for length = 3, the traces ⟨(1), (2), (4)⟩ and ⟨(1), (2), (6)⟩ are not generated. Furthermore, because (5) is never reachable, the edge ⟨B, (5), C⟩ is excluded from E permanently.
6.4 Incremental Invariant Discovery
Alg. 3 gives the main idea of our CHC solver, which relies on the FreqHorn [15] algorithm to synthesize invariants (any other CHC solver could be used as well). In addition, however, it recycles the invariants invs generated in all previous runs. Specifically, it substitutes their interpretations for each r ∈ R in the body of each CHC (line 3). Because each such formula represents an over-approximation
³ We use the notation t@C to represent the “push back” operation over a vector t and an element C.
of the set of reachable states at a particular program location, this substitution
is sound.
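As a small illustration of this substitution (our own sketch, reusing the lemma x ≤ 5 from Example 3), replacing the uninterpreted call B(x, y, z) in the body of CHC 5 by the previously discovered invariant renders the body unsatisfiable without invoking the CHC solver again:

from z3 import Ints, And, Solver, unsat

x, y, z = Ints('x y z')
inv_B = x <= 5                     # invs[B], found in an earlier solver run
s = Solver()
s.add(And(inv_B, y > 5, x > y))    # body of CHC (5) with B(x,y,z) -> invs[B]
assert s.check() == unsat          # CHC (5) remains provably unreachable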
If it turns out that, after the substitution, the trivial interpretation λx̄.⊤ is a solution for the strengthened system (line 4), then invs is already a solution, and the CHC solver is not needed. Otherwise, the remaining invariants are generated by the external CHC solver (line 6).
While the pseudocode of FreqHorn is omitted from Alg. 3 for simplicity, we list its distinguishing features here. The approach is driven by Syntax-Guided Synthesis (SyGuS) [2], and it supports (possibly non-linear) arithmetic and arrays [16]. It automatically constructs formal grammars G(inv) for each inv ∈ R based on either the source code [14] or program behaviors [15,30]. Importantly, these grammars are conjunction-free, and they allow for only a finite number of candidates. FreqHorn iteratively attempts to apply production rules of each G(inv) to sample a candidate and checks it with an SMT solver (a successfully checked candidate is then called a lemma). The process continues either until a conjunction of lemmas is sufficient, or until the search space is exhausted. To make the process less dependent on the order in which candidates are considered, FreqHorn uses batching [12] (i.e., checks several candidates at the same time) and effectively filters them using the well-known Houdini algorithm [18].
These features make FreqHorn especially useful for test-case generation. Behaviors and counterexamples can be obtained from traces as outlined in Sect. 6.3. Each new counterexample potentially contributes a new data candidate to be considered in the next invocations of the algorithm. Then, following our incremental schema, new candidates are used in conjunction with previously generated invariants and are either added to invs or dropped. Note that even if FreqHorn returns unknown, indicating that it is unable to find a strong enough invariant, it almost always finds some lemmas that might be useful for the next iterations of our main algorithm.
7 Evaluation
We have implemented the approach in a tool called Horntinuum⁴. The backend of Horntinuum is developed on top of FreqHorn [14] and uses it for CHC solving. All the symbolic reasoning in our backend is performed by the Z3 [27] SMT solver, v4.8.10. For encoding C benchmarks to CHCs in our frontend, we use the SeaHorn [21] verification framework, v10.0.0-rc0, via its Docker image⁵.
Implementation details. The success of our approach largely depends on
the preprocessing performed by SeaHorn while producing the CHC encoding.
Since our algorithm works on disjunction-free CHCs (recall Sect. 6.1), we con-
figure SeaHorn to perform a small-step encoding, i.e., introducing a CHC per
⁴ The source code of the tool is publicly available at https://github.com/izlatkin/HornLauncher with the CHC-based backend at https://github.com/izlatkin/aeval/tree/tg.
⁵ https://hub.docker.com/r/seahorn/seahorn-llvm10.
each basic block (via the --step=small option). However, the encoder, based on LLVM, additionally performs several LLVM transformations⁶ and auxiliary SeaHorn passes that may introduce disjunctions to CHCs. Since this recipe is not configurable in SeaHorn yet, we additionally get rid of disjunctions by performing a DNF-ization over the CHCs received from SeaHorn.
We also had to overcome a relatively minor engineering obstacle to allow recognizing multiple nondet() function calls (see an example in Fig. 1). The CHC representation is in some sense declarative, i.e., it is not always possible to detect the order of function calls from the formulas that represent program unrollings. Thus, we rename each invocation of nondet() in each input C file, e.g., to nondet_i(), which lets Horntinuum associate each function invocation with a sequence of static-single-assignment (SSA) variables that encode the (possibly many, if nondet_i() is called in a loop) outputs of nondet_i() occurring in an unrolling. Further, it gives a sequence of concrete values obtained for each of the SSA variables from the SMT solver. In a generated test case, the sequences of SSA values of each nondet_i are stored in separate arrays (to capture values in each loop iteration) and accessed by an automatically generated body of the corresponding unique nondet_i() function.

In a sense, the final output of our tool is a set of context-specific implementations of the function nondet() written in different header files. To reproduce the detected test case, the initial C file should include a header from this set, and then be compiled and run.⁷
Experimental setup. To evaluate Horntinuum, we configured the gcov tool, v9.3.0, a code coverage analysis and profiling tool that tracks all statements visited in a single run of the program. Running gcov for each of our generated test cases and merging the statistics gives the final coverage: we ultimately aim to maximize the amount of code visited by at least one test case.⁸
We compared Horntinuum with the state-of-the-art tools FuSeBMC [1], Verifuzz [9], and KLEE [8]⁹, which exhibited decent performance in TestComp 2021.
Our experiments were run on a “Dell OptiPlex 7090 Tower” desktop computer
with 2.5 GHz Intel Core i7 8-Core (11th Gen), 16GB 3200 MHz DDR4 RAM,
and Ubuntu 20.04.1 LTS installed on it.
For the experimentation, we considered 316 benchmarks from TestComp (from the loop-* tracks, excluding the programs with floating points that our CHC solver
⁶ One transformation, for instance, removes redundant branches from the code, e.g., replaces if (nondet()) foo(); else foo(); by just foo(). Technically, the CHC encoding received by our tool then does not represent all branches of the original program, which thus leads to a smaller detected coverage. We have not seen many such examples in our benchmark set, however.
⁷ Note the difference with the TestComp format [4], which keeps all values in the same XML file. Our proposed format is more general and easily convertible to TestComp's.
⁸ The full logs and tables are available at https://www.cs.fsu.edu/grigory/horntinuum.zip.
⁹ All the binaries were downloaded from https://test-comp.sosy-lab.org/2021/systems.php.
[Three scatter plots comparing coverage: vs Verifuzz, vs FuSeBMC, vs KLEE; both axes range over 0–100%.]
Fig. 3: Coverage comparison: each point in a plot represents a pair of the coverages (% × %) of Horntinuum (x-axis) and a competitor (y-axis) for the same benchmarks.
does not support yet). The largest considered benchmark has >5K LoC. The performance of all three competitors (using the timeout of 15 minutes) on our machine was consistent with that exhibited in TestComp 2021: Verifuzz slightly outperforms FuSeBMC, and both outperform KLEE.
Expectations and results. We aim to answer two main questions:
Q1 Is it possible to develop a competitive test-case generator based purely on formal verification and SMT solving, i.e., not relying on dynamic analysis and fuzzing?
Q2 In the cases when a CHC-based test-case generator yields a similar (or better) coverage than a competitor, is it possible to achieve this result faster¹⁰?
The plots in Fig. 3 and Fig. 4 attempt to answer these questions, respectively. We first give a pairwise comparison between the coverage percentages reported by the tools (Fig. 3). If a tool was unable to analyze a program, the corresponding
10 We believe the ability to successfully terminate the test-case generation early is
of great interest to software engineers. However, unfortunately, it is not the main
determining factor in testing competitions.
[Three log-log scatter plots: vs Verifuzz, vs FuSeBMC, vs KLEE; both axes range over 10^-2 to 10^4 seconds.]
Fig. 4: Runtime needed to get 1% of coverage (sec × sec) for Horntinuum (x-axis) and a competitor (y-axis). Solid triangles represent runs (green: Horntinuum, orange: the competitor) in which the corresponding tool detected larger coverage and took less time. Blank triangles are the remaining (non-representative) runs. Triangles on the boundaries represent runs in which one of the tools detected zero coverage.
point is placed on the boundary. The experiments revealed that, given the same timeout, Horntinuum generates test cases with larger or equal coverage than KLEE on 241 programs, FuSeBMC on 178 programs, and Verifuzz on 177 programs. These numbers include cases when the competitor crashed or did not return any coverage, but exclude cases when Horntinuum did so.
A pairwise comparison of the “runtime/coverage” ratios of the tools is shown in Fig. 4. For this experiment, for every plot, we only considered benchmarks on which one of the tools generated test cases with larger coverage and terminated before the competitor. Specifically:
– 177 (resp. 44) on which Verifuzz (resp. Horntinuum) was outperformed.
– 128 (resp. 44) on which FuSeBMC (resp. Horntinuum) was outperformed.
– 124 (resp. 26) on which KLEE (resp. Horntinuum) was outperformed.
These numbers let us conclude that Horntinuum is much more likely to return larger coverage in a shorter amount of time than its competitors are.
[Log-log scatter plot of runtimes; legend: Horntinuum w/o invariants vs. Horntinuum with invariants.]
Fig. 5: Impact of invariants: pairs of the runtimes (sec × sec) of Horntinuum with and without invariants.
The remaining benchmarks (e.g., those on which Horntinuum generates more test cases but takes more time than a competitor) are still shown in the plots but are excluded from the statistics: in these cases it is impossible to draw a consistent conclusion on the tools’ performance.
Controlled experiment. Lastly, we present an interesting statistic on the effect of invariant generation on the runtime of test-case generation (Fig. 5). For the sake of the experiment, we modified Alg. 1 such that it skips invariant generation but still enumerates traces and exploits the unsatisfiable prefixes. It turns out that this negatively affects 184 benchmarks, on which the modified version takes more time. These include 12 benchmarks on which Horntinuum with invariants terminates before the timeout, but Horntinuum without invariants does not terminate (represented as points on the right boundary). These benchmarks demonstrate a possible scenario where programs under test have unreachable branches that can be identified by a CHC solver, allowing the test-case generator to terminate earlier.
8 Conclusion
We have shown that CHCs are a promising vehicle that test-case generators can use to improve both the quality of solutions and the runtime. Specifically, using CHC encodings of programs, various program unrollings are enumerated, and test cases are extracted from models of satisfiable formulas. Our novel CHC-based approach and its implementation in Horntinuum use SMT solvers incrementally. In the future, we are going to extend our support for data types and optimize the algorithm for searching deep counterexamples à la [6].
Acknowledgments. The work is supported in part by a gift from Amazon Web Services and a grant from FSU’s Council on Research & Creativity.
References
1. Alshmrany, K.M., Aldughaim, M., Bhayat, A., Cordeiro, L.C.: FuSeBMC: An
Energy-Efficient Test Generator for Finding Security Vulnerabilities in C Pro-
grams. In: TAP. Lecture Notes in Computer Science, vol. 12740, pp. 85–105.
Springer (2021)
2. Alur, R., Bodík, R., Juniwal, G., Martin, M.M.K., Raghothaman, M., Seshia, S.A.,
Singh, R., Solar-Lezama, A., Torlak, E., Udupa, A.: Syntax-Guided Synthesis. In:
FMCAD. pp. 1–17. IEEE (2013)
3. Anand, S., Godefroid, P., Tillmann, N.: Demand-driven compositional symbolic
execution. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS. Lecture Notes in
Computer Science, vol. 4963, pp. 367–381. Springer (2008)
4. Beyer, D., Lemberger, T.: Testcov: Robust test-suite execution and coverage mea-
surement. In: ASE. pp. 1074–1077. IEEE (2019)
5. Biere, A., Cimatti, A., Clarke, E.M., Zhu, Y.: Symbolic Model Checking without
BDDs. In: TACAS. LNCS, vol. 1579, pp. 193–207. Springer (1999)
6. Blicha, M., Fedyukovich, G., Hyvärinen, A.E.J., Sharygina, N.: Transition Power
Abstractions for Deep Counterexample Detection. In: Fisman, D., Rosu, G. (eds.)
Tools and Algorithms for the Construction and Analysis of Systems. Springer
Berlin Heidelberg (2022)
7. Böhme, M., Pham, V., Roychoudhury, A.: Coverage-based greybox fuzzing as
Markov chain. IEEE Trans. Software Eng. 45(5), 489–506 (2019)
8. Cadar, C., Dunbar, D., Engler, D.R.: KLEE: unassisted and automatic generation
of high-coverage tests for complex systems programs. In: Draves, R., van Renesse,
R. (eds.) OSDI. pp. 209–224. USENIX Association (2008)
9. Chowdhury, A.B., Medicherla, R.K., Venkatesh, R.: Verifuzz: Program aware
fuzzing - (competition contribution). In: Beyer, D., Huisman, M., Kordon, F., Stef-
fen, B. (eds.) TACAS, Part III. Lecture Notes in Computer Science, vol. 11429,
pp. 244–249. Springer (2019)
10. Clarke, E., Kroening, D., Lerda, F.: A tool for checking ANSI-C programs. In:
TACAS. LNCS, vol. 2988, pp. 168–176. Springer (2004)
11. Csallner, C., Smaragdakis, Y.: Check ’n’ crash: combining static checking and
testing. In: Roman, G., Griswold, W.G., Nuseibeh, B. (eds.) ICSE. pp. 422–431.
ACM (2005)
12. Fedyukovich, G., Bodík, R.: Accelerating Syntax-Guided Invariant Synthesis. In:
TACAS, Part I. LNCS, vol. 10805, pp. 251–269. Springer (2018)
13. Fedyukovich, G., Gurfinkel, A., Sharygina, N.: Property directed equivalence via
abstract simulation. In: CAV. LNCS, vol. 9780, Part II, pp. 433–453. Springer
(2016)
14. Fedyukovich, G., Kaufman, S., Bodík, R.: Sampling Invariants from Frequency
Distributions. In: FMCAD. pp. 100–107. IEEE (2017)
15. Fedyukovich, G., Prabhu, S., Madhukar, K., Gupta, A.: Solving Constrained Horn
Clauses Using Syntax and Data. In: FMCAD. pp. 170–178. IEEE (2018)
16. Fedyukovich, G., Prabhu, S., Madhukar, K., Gupta, A.: Quantified Invariants via
Syntax-Guided Synthesis. In: CAV, Part I. LNCS, vol. 11561, pp. 259–277. Springer
(2019)
17. Fedyukovich, G., Rümmer, P.: Competition report: CHC-COMP-21. In: Hojjat,
H., Kafle, B. (eds.) HCVS@ETAPS. EPTCS, vol. 344, pp. 91–108 (2021)
18. Flanagan, C., Leino, K.R.M.: Houdini: an Annotation Assistant for ESC/Java. In:
FME. LNCS, vol. 2021, pp. 500–517. Springer (2001)
270 I. Zlatkin and G. Fedyukovich
19. Gadelha, M.Y.R., Monteiro, F.R., Cordeiro, L.C., Nicole, D.A.: ESBMC v6.0: Ver-
ifying C programs using k-induction and invariant inference - (competition contri-
bution). In: Beyer, D., Huisman, M., Kordon, F., Steffen, B. (eds.) TACAS, Part
III. LNCS, vol. 11429, pp. 209–213. Springer (2019)
20. Godefroid, P., Kiezun, A., Levin, M.Y.: Grammar-based whitebox fuzzing. In:
Gupta, R., Amarasinghe, S.P. (eds.) PLDI. pp. 206–215. ACM (2008)
21. Gurfinkel, A., Kahsai, T., Komuravelli, A., Navas, J.A.: The SeaHorn Verification
Framework. In: CAV. LNCS, vol. 9206, pp. 343–361. Springer (2015)
22. Jaffar, J., Murali, V., Navas, J.A.: Boosting concolic testing via interpolation. In:
Meyer, B., Baresi, L., Mezini, M. (eds.) ESEC/FSE. pp. 48–58. ACM (2013)
23. King, J.C.: Symbolic execution and program testing. Commun. ACM 19(7), 385–
394 (1976)
24. Komuravelli, A., Gurfinkel, A., Chaki, S.: SMT-Based Model Checking for Recur-
sive Programs. In: CAV. LNCS, vol. 8559, pp. 17–34 (2014)
25. Le, H.M.: Llvm-based hybrid fuzzing with libkluzzer (competition contribution).
In: Wehrheim, H., Cabot, J. (eds.) FASE. LNCS, vol. 12076, pp. 535–539. Springer
(2020)
26. Mathis, B., Gopinath, R., Mera, M., Kampmann, A., Höschele, M., Zeller, A.:
Parser-directed fuzzing. In: McKinley, K.S., Fisher, K. (eds.) PLDI. pp. 548–560.
ACM (2019)
27. de Moura, L.M., Bjørner, N.: Z3: An Efficient SMT Solver. In: TACAS. LNCS,
vol. 4963, pp. 337–340. Springer (2008)
28. Sen, K., Marinov, D., Agha, G.: CUTE: a concolic unit testing engine for C. In:
Wermelinger, M., Gall, H.C. (eds.) FSE. pp. 263–272. ACM (2005)
29. Serebryany, K.: Continuous fuzzing with libfuzzer and addresssanitizer. In: SecDev.
p. 157. IEEE Computer Society (2016)
30. Sharma, R., Gupta, S., Hariharan, B., Aiken, A., Liang, P., Nori, A.V.: A data
driven approach for algebraic loop invariants. In: ESOP. LNCS, vol. 7792, pp.
574–592. Springer (2013)
31. Vikram, V., Padhye, R., Sen, K.: Growing A test corpus with bonsai fuzzing. In:
ICSE. pp. 723–735. IEEE (2021)
32. Visser, W., Pasareanu, C.S., Khurshid, S.: Test input generation with java
pathfinder. In: Avrunin, G.S., Rothermel, G. (eds.) ISSTA. pp. 97–107. ACM
(2004)
33. Wüstholz, V., Christakis, M.: Targeted greybox fuzzing with static lookahead anal-
ysis. In: Rothermel, G., Bae, D. (eds.) ICSE. pp. 789–800. ACM (2020)
34. Zalewski, M.: American Fuzzy Lop, https://lcamtuf.coredump.cx/afl/
Efficient Analysis of Cyclic Redundancy
Architectures via Boolean Fault Propagation
Marco Bozzano, Alessandro Cimatti,
Alberto Griggio, and Martin Jonáš (✉)
Fondazione Bruno Kessler, Trento, Italy
{cimatti,bozzano,griggio,mjonas}@fbk.eu
Abstract. Many safety critical systems guarantee fault-tolerance by us-
ing several redundant copies of their components. When designing such
redundancy architectures, it is crucial to analyze their fault trees, which
describe combinations of faults of individual components that may cause
malfunction of the system. State-of-the-art techniques for fault tree com-
putation use first-order formulas with uninterpreted functions to model
the transformations of signals performed by the redundancy system and
an AllSMT query for computation of the fault tree from this encoding.
Scalability of the analysis can be further improved by techniques such as
predicate abstraction, which reduces the problem to Boolean case.
In this paper, we show that as far as fault trees of redundancy archi-
tectures are concerned, signal transformation can be equivalently viewed
in a purely Boolean way as fault propagation. This alternative view has
important practical consequences. First, it applies also to general re-
dundancy architectures with cyclic dependencies among components, to
which the current state-of-the-art methods based on AllSMT are not
applicable, and which currently require expensive sequential reasoning.
Second, it allows for a simpler encoding of the problem and usage of
efficient algorithms for analysis of fault propagation, which can signif-
icantly improve the runtime of the analyses. A thorough experimental
evaluation demonstrates the superiority of the proposed techniques.
1 Introduction
Fault-tolerance is a fundamental property of safety critical systems that enables
their safe operation even in the presence of faults. There are many ways to
ensure fault-tolerance, often based on redundancy: spare parts are available for
backup and are ready to take over with different degrees of promptness (e.g.,
hot/warm/cold standby), or with multiple replicas running in parallel. The latter
is a common approach to fault-tolerance in computer-based control systems,
where the results computed by the independent replicas are combined together
by means of voters. The idea dates back to the pioneering space application in the Saturn Launch Vehicle [12], and has since been adopted in the Primary Flight Computer [19] of the Boeing 777. The idea is becoming prominent with the
advent of modern Integrated Modular Avionics [16], a cost-effective solution for
the management of highly intensive software control systems.
[Figure: (a) Reference non-redundant system with modules M1, M2, M3. (b) TMR redundant system with three replicas of modules M1, M2, whose results are combined by a voter.]
Fig. 1: Network of computational modules with cyclic dependencies, extended by
triple modular redundancy.
[Figure: five TMR wiring schemas for a single module M, labeled (a) V111, (b) V001, (c) V011, (d) V122, (e) V123.]
Fig. 2: Selected ways of extending a single reference module M with triple modular redundancy (using 1, 2, and 3 voters) [6].
One of the most used instances of the approach to redundancy via module replicas is the triple modular redundancy (tmr) schema, in which the computational modules are replaced by three redundant copies, whose results can be combined by one to three voters. An example of using tmr to add redundancy to a reference non-redundant architecture is shown in Figure 1. Note that there are multiple ways of combining the results of a single triplicated computational module by voters, some of which are shown in Figure 2 [6].
Assessing the actual degree of fault-tolerance of a redundant architecture
is directly related to the construction and analysis of the corresponding fault
tree [17]. A fault tree describes the combinations of failures of individual com-
ponents that may cause higher-level malfunction, e.g., bring the system into a
dangerous state. Such combinations are traditionally called cut sets. Given the
set of all cut sets of the system, a fault tree can be reconstructed. Subsequently,
from the fault tree expressed as a Binary Decision Diagram, it is possible to
compute the reliability of the system from the reliability measures of the com-
ponents, and to synthesize the analytical form of the reliability function [6].
In this paper, we tackle the problem of automatically analyzing the reliabil-
ity of redundancy architectures with parallel replicas and voting. We propose a
general framework that encompasses also redundancy architectures with cyclic
dependencies among components, such as the system from Figure 1, to which
current state-of-the-art approaches [6] are not applicable. The modeling is based
on symbolic transition systems over the quantifier-free theory of linear real arith-
metic and uninterpreted functions (UFLRA). In particular, real numbers are
used to represent the signals of the architecture and multiple instances of the
same uninterpreted function symbol are used to represent component replicas.
The modeling framework is a strict generalization of the combinational approach
proposed in [4,5], that only allows for acyclic architectures.
As the main contribution, we propose an analysis technique based on the
reduction to fault propagation graphs over Boolean structures [7]. We prove that
the reduction is correct: the signal transformation performed by a redundancy
architecture can be equivalently viewed in a Boolean way as fault propagation.
We carry out a systematic experimental evaluation on the set of redundancy
architectures with cyclic dependencies to evaluate scalability of the proposed so-
lution. Moreover, we perform evaluation on acyclic redundancy architectures to
compare the performance against the state-of-the-art approach based on pred-
icate abstraction [5,6], which can be applied only to redundancy architectures
without cycles. The proposed approach proves to be very scalable, being able to
analyze cyclic architectures with thousands of nodes, and is dramatically more
efficient than a direct reduction to model checking of symbolic transition systems
over UFLRA. In the restricted set of acyclic benchmarks, the proposed approach
provides better performance even over the optimized method proposed in [5] and
extended in [6] that adopts a structural form of predicate abstraction to improve
over basic AllSMT [14].
The paper is structured as follows. In Section 2, we present logical preliminar-
ies and basic notions of fault propagation graphs. In Section 3, we describe the
framework of redundancy architectures with cycles. In Section 4, we present the
reduction to fault propagation and prove its correctness. In Section 5, we discuss
the related work. The experiments are presented in Section 6. In Section 7, we
draw some conclusions and discuss some directions for future work.
2 Preliminaries
2.1 General Background
In this section, we explain the basic mathematical conventions that are used in
the paper. We assume that the reader is familiar with standard first-order logic
and the basic ideas of Satisfiability Modulo Theories (smt), as presented e.g.
in [1]. A theory in the smt sense is a pair (Σ, C), where Σ is a first-order signature and C is a class of models over Σ. We use the standard notions of interpretation, assignment, model, satisfiability, validity, and logical consequence. We refer to 0-arity predicates as Boolean variables, and to 0-arity uninterpreted functions as (theory) variables. We denote variables with x, y, . . . , formulas with φ, ψ, . . . , and uninterpreted functions with f, g, . . . , possibly with subscripts. We denote vectors with a bar (e.g., x̄), and individual components with subscripts (e.g., x_j). We denote the domain of Booleans with B = {⊤, ⊥}. If x_1, . . . , x_n are variables and
φ is a formula, we write φ(x_1, . . . , x_n) to indicate that all the variables occurring free in φ are in x_1, . . . , x_n. If φ is a formula without uninterpreted functions and µ is a function that maps each free variable of φ to a value of the corresponding sort, [[φ]]_µ denotes the result of the evaluation of φ under this assignment. A Boolean formula is called positive if it uses no logical connectives other than conjunctions and disjunctions.
In this paper, we shall use the theory of linear real arithmetic (LRA), in which the numeric constants and the arithmetic and relational operators have their standard meaning, extended with uninterpreted functions (UF), whose interpretation is not fixed in C, and with voters (V), which are k-ary functions whose interpretation is the majority function defined below. For simplicity, we consider only voters with odd arity, as even-arity voters are rarely used in practice. However, our approach can be extended to support even-arity voters.
Definition 1. The k-ary majority function majority : ℝ^k → ℝ for an odd k > 0 is defined by majority(x̄) = y if there is y such that y = x_j for at least ⌈k/2⌉ distinct j, and majority(x̄) = x_1 otherwise.
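A direct reading of Definition 1 (our own illustrative code, not part of any tool discussed here) is the following Python function, which returns the value shared by at least ⌈k/2⌉ of the k inputs and falls back to the first input when no such value exists:

from collections import Counter

def majority(xs):
    # the most frequent input value and its multiplicity
    value, count = Counter(xs).most_common(1)[0]
    # for odd k, ceil(k/2) == (k + 1) // 2
    return value if count >= (len(xs) + 1) // 2 else xs[0]

assert majority([3.0, 5.0, 3.0]) == 3.0   # two of three inputs agree
assert majority([1.0, 2.0, 3.0]) == 1.0   # no majority: first input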
Given a set of variables x̄, we denote with x̄′ the set {x′ | x ∈ x̄}. A symbolic transition system S is a triple (x̄, I(x̄), T(x̄, x̄′)), where x̄ is a set of variables, and I(x̄), T(x̄, x̄′) are formulae over some signature. An assignment to the variables in x̄ is a state of S. A state s is initial iff it is a model of I(x̄), i.e., s |= I(x̄). The states s, s′ denote a transition iff s ∪ s′ |= T(x̄, x̄′), also written T(s, s′). A trace is a sequence of states s_0, s_1, . . . such that s_0 is initial and T(s_i, s_{i+1}) for all i. We denote traces with π, and with π_j the j-th element of π. A state s is reachable in S iff there exists a trace π such that π_i = s for some i.
2.2 Fault Propagation Graphs
In this section we briefly introduce the necessary notions of fault propagation,
and in particular the formalism of symbolic fault propagation graphs. Intuitively,
fault propagation graphs can be used to describe how failures of some compo-
nents of a given system can cause the failure of other components of a system.
In an explicit (hyper)graph representation, components can be represented by nodes, and dependencies by edges among them, with the meaning that an edge from component c_1 to component c_2 states that the failure of c_1 can cause the failure (propagation) of c_2. In the symbolic representation adopted here, we model components as Boolean variables (where ⊥ means “not failed” and ⊤ means “failed”), and express the dependencies as Boolean formulae encoding the conditions that can lead to the failure of each component. The basic concepts are formalized in the following definitions. For more information, we refer to [7].
Definition 2 (Fault propagation graph). A symbolic fault propagation graph (fpg) is a pair (C, canFail), where C is a finite set of system components and canFail is a function that assigns to each component c a Boolean formula canFail(c) over the set of variables C.
Definition 3 (Trace of FPG). Let G be a fault propagation graph (C, canFail). A state of G is a function from C to B. A trace of G is a sequence of states π = π_0 π_1 . . . ∈ (B^C)^ω such that all i > 0 and c ∈ C satisfy (i) π_i(c) = π_{i−1}(c) or (ii) π_{i−1}(c) = ⊥ and π_i(c) = [[canFail(c)]]_{π_{i−1}}.
Example 1 ([7]). Consider a system with components control on ground (g), hydraulic control (h), and electric control (e) such that g can fail if both h and e have failed, h can fail if e has failed, and e can fail if h has failed. This system can be modeled by a fault propagation graph ({g, e, h}, canFail), where canFail(g) = h ∧ e, canFail(h) = e, and canFail(e) = h.
One of the traces of this system is {g ↦ ⊥, h ↦ ⊤, e ↦ ⊥} {g ↦ ⊥, h ↦ ⊤, e ↦ ⊤} {g ↦ ⊤, h ↦ ⊤, e ↦ ⊤}^ω, where h is failed initially, which causes the failure of e in the second step, and the failures of h and e together cause a failure of g in the third step.
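The trace above can be computed by iterating the step relation of Definition 3; the following sketch (our own code, not taken from [7]) implements the variant in which every component fails as soon as it can:

can_fail = {
    'g': lambda s: s['h'] and s['e'],
    'h': lambda s: s['e'],
    'e': lambda s: s['h'],
}

def max_step(state):
    # a failed component stays failed; a healthy one fails iff canFail holds
    return {c: state[c] or can_fail[c](state) for c in state}

s = {'g': False, 'h': True, 'e': False}   # only h has failed initially
s = max_step(s)   # e fails:  {'g': False, 'h': True, 'e': True}
s = max_step(s)   # g fails:  {'g': True,  'h': True, 'e': True}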
Fault propagation graphs are often used to identify sets of initial faults that
can lead the system to a dangerous or unwanted state (usually called a top level
event). Such sets of initial faults are called cut sets.
Definition 4 (Cut set). Let G be a fault propagation graph G = (C, canFail) and φ a positive Boolean formula, called a top level event. The assignment cs : C → B is called a cut set of G for φ if there is a trace π of G that starts in the state cs and there is some k ≥ 0 such that π_k |= φ. A cut set cs is called a minimal cut set if it is minimal with respect to the pointwise ordering of functions B^C, i.e., there is no other cut set cs′ such that {c ∈ C | cs′(c) = ⊤} ⊊ {c ∈ C | cs(c) = ⊤}.

For brevity, when talking about cut sets, we often mention only the components that are set to ⊤ by the cut set.
Example 2 ([7]). The minimal cut sets of the fpg from Example 1 for the top level event φ = g are {g}, {h}, and {e}. These three cut sets are witnessed by the following traces:
1. {g ↦ ⊤, h ↦ ⊥, e ↦ ⊥}^ω,
2. {g ↦ ⊥, h ↦ ⊤, e ↦ ⊥} {g ↦ ⊥, h ↦ ⊤, e ↦ ⊤} {g ↦ ⊤, h ↦ ⊤, e ↦ ⊤}^ω,
3. {g ↦ ⊥, h ↦ ⊥, e ↦ ⊤} {g ↦ ⊥, h ↦ ⊤, e ↦ ⊤} {g ↦ ⊤, h ↦ ⊤, e ↦ ⊤}^ω.
Note that the fpg also has other cut sets, such as {g, e}, {h, e}, and {g, h, e}, which are not minimal.
In the following, we work with fault propagation graphs all of whose canFail formulas are positive. Such fault propagation graphs are called monotone. Note that the definition of trace ensures that in each trace, if a component c is set to ⊤ in a state π_i, it is ⊤ in all the subsequent states π_j for j > i. This ensures that each trace eventually reaches a fixed point. Moreover, before reaching this fixed point, the trace can contain at most |C| distinct states.
For monotone fpgs, there is an efficient algorithm for minimal cut set enumeration [7]. This approach consists in enumerating the minimal models of a specific LRA formula, in which theory constraints are used only if the input fpg contains cycles (and which is therefore purely Boolean for acyclic fpgs).
3 Cyclic Redundancy Architectures
In this section, we describe the framework adopted to model redundancy architectures, in the form of a restricted class of symbolic transition systems modulo UFLRA. We call this restricted class transition systems with uninterpreted functions and voters (UF+V TS).¹ This modeling framework is more expressive than mere smt formulas modulo UFLRA, which were used in the previous works on the analysis of redundancy architectures [6], as it can express architectures that contain cyclic dependencies among the modules.
Definition 5 (UF+V transition system). A transition system with uninterpreted functions and voters is a tuple (V_S, V_in, V_init, T_next, T_init), where
– V_S is a finite set of real-valued signal variables;
– V_in with V_S ∩ V_in = ∅ is a finite set of real-valued input variables;
– V_init is a finite set of real-valued initial value variables;
– T_next : V_S → Expr is a transition function, where Expr is the set of all expressions of the form f(x_1, x_2, . . . , x_k) for k ≥ 0, x_i ∈ (V_S ∪ V_in), and where f is either an uninterpreted function symbol of arity k or the function symbol voter_k with an odd k > 0;
– T_init is an initial value mapping that assigns an initial value variable T_init(v) ∈ V_init to each signal v ∈ V_S for which T_next(v) = f(x̄) for an uninterpreted f.

A UF+V transition system is called well formed if it does not contain cyclic dependencies among voters, i.e., there is no sequence v_1 . . . v_n of signal variables such that v_1 = v_n and each v_i with i > 0 satisfies T_next(v_i) = voter_k(x_1, . . . , x_k) with x_j = v_{i−1} for some 1 ≤ j ≤ k. For a well formed UF+V TS, we can define the voter depth vd : V_S ∪ V_in → ℕ as the unique solution to the following set of equations: vd(in) = 0 for each in ∈ V_in, vd(v) = 0 for each v ∈ V_S such that T_next(v) = f(x_1, x_2, . . . , x_k) with an uninterpreted f, and vd(v) = max{vd(x_i) | 1 ≤ i ≤ k} + 1 for each v ∈ V_S such that T_next(v) = voter_k(x_1, x_2, . . . , x_k).
In the rest of the paper, we assume that all UF+V TS are well formed. In the rest of this section, let us fix an arbitrary well formed UF+V transition system S = (V_S, V_in, V_init, T_next, T_init).
We now give a formal definition of the behavior of a UF+V system in the presence of faults. Intuitively, we are given the set Faults of faulty signal-producing components of the system, which do not have to behave correctly: a faulty component neither has to start in its specified initial value nor has to respect its transition function.
Definition 6 (Trace of UF+V TS). A state of a UF+V transition system S is an arbitrary assignment of real numbers to signal and input variables, s : (V_S ∪ V_in) → ℝ.

¹ Note that although UF+V TS and the related concepts can be defined directly in terms of UFLRA symbolic transition systems, we chose to make the definition explicit to simplify the presentation and proofs.
The sequence of states π = π_0 π_1 . . . ∈ (ℝ^{V_S ∪ V_in})^ω is called a trace of the system S for the fault set Faults ⊆ V_S, input stream ι = ι_0 ι_1 . . . ∈ (ℝ^{V_in})^ω, initial value assignment Init : V_init → ℝ, and interpretation [[·]], which to each uninterpreted function symbol of arity k assigns a function [[f]] : ℝ^k → ℝ, if:
– π_i(in) = ι_i(in) for all i ≥ 0 and in ∈ V_in.
– For v ∈ V_S ∖ Faults such that T_next(v) = f(x_1, . . . , x_k) with an uninterpreted function symbol f, it is the case that π_0(v) = Init(T_init(v)) and all i > 0 satisfy π_i(v) = [[f]](π_{i−1}(x_1), . . . , π_{i−1}(x_k)).
– For all i ≥ 0 and v ∈ V_S ∖ Faults such that T_next(v) = voter_k(x_1, . . . , x_k), it is the case that π_i(v) = majority(π_i(x_1), . . . , π_i(x_k)).
Traces for the fault set Faults = ∅ are called nominal.
Note that each uninterpreted module needs one time step to compute its result, while the results of voters are instantaneous. The time delay for modules allows cyclic dependencies among modules, while the absence of delay for voters gives the expected semantics to architectures where some replicas of a module are guarded by a voter and others are not, such as in the schemas from Figures 2b and 2c.
Example 3. Consider the example from Figure 1, where the reference system with 3 modules M1, M2, and M3 is extended with tmr such that the modules M1 and M2 are replaced by three replicas whose results are combined by a voter.
We can represent the redundant version of the system as a UF+V TS as follows. The nominal behavior of the modules M1, M2, and M3 is represented by binary uninterpreted functions f_1, f_2, and f_3, respectively. Further, we represent the initial values of M1, M2, M3 by variables init_m1, init_m2, and init_m3, respectively. Finally, we represent the output of the i-th replica of each module Mj by a signal variable x_j^i and the output of the voter corresponding to the module Mj by a signal variable x_j^v.
This gives the UF+V transition system S = (V_S, {in_1, in_2}, V_init, T_next, T_init), with V_S = {x_1^1, x_1^2, x_1^3, x_1^v, x_2^1, x_2^2, x_2^3, x_2^v, x_3^1}, V_init = {init_mj | j ∈ {1, 2, 3}}, and
– T_next(x_1^i) = f_1(in_1, x_2^v) for 1 ≤ i ≤ 3, T_init(x_1^i) = init_m1 for 1 ≤ i ≤ 3,
– T_next(x_2^i) = f_2(in_2, x_1^v) for 1 ≤ i ≤ 3, T_init(x_2^i) = init_m2 for 1 ≤ i ≤ 3,
– T_next(x_3^1) = f_3(x_1^v, x_2^v), T_init(x_3^1) = init_m3,
– T_next(x_j^v) = voter_3(x_j^1, x_j^2, x_j^3) for j ∈ {1, 2}.
We define the class of redundancy transition systems, where the only pur-
pose of all voters is to recognize and repair outputs of failed components; more
specifically, if all components behave correctly, the voters are not necessary.
Definition 7 (Redundancy UF+V TS). We call the system S a redundancy UF+V transition system if in all its nominal traces, all inputs of each voter are always identical. Formally, if π is any nominal trace of S and if v is a variable for which T_next(v) = voter_k(x̄), then |{π_i(x_j) | 1 ≤ j ≤ k}| = 1 for all i ≥ 0.
Similarly to fpgs, a cut set is a set of faults that leads to undesired behavior of the system. In particular, given a set of signals that are considered as output signals (or outputs) of the system, a cut set of the given UF+V TS is a set of faults that can cause an incorrect value of at least one output.
Definition 8 ((Minimal) cut set). A fault set Faults ⊆ V_S is called a cut set of S for a set of output signals V_out ⊆ V_S if there exist an input stream, initial value assignment, and an interpretation such that the values of the output signals of some trace π for the fault set Faults differ from the outputs of the nominal trace π^nom with the same input stream, initial values, and interpretation, i.e., there is c ≥ 0 and o ∈ V_out for which π_c(o) ≠ π^nom_c(o). A cut set is called minimal (mcs) if it is minimal in terms of set inclusion.
Since the redundancy UF+V TS form a subclass of UFLRA transition systems,
there is a straightforward procedure for minimal cut set enumeration. As in
the case of combinational systems [6], one can construct a miter system, which
consists of two copies of the architecture: the first is allowed to fail and the second
is constrained to behave nominally. Minimal cut sets can then be obtained by
using a technique based on symbolic model checking [3] to enumerate all minimal
assignments to fault variables under which it is possible to reach some state in
which the outputs of the two copies differ.
4 Reducing Redundancy UF+V TS to Fault Propagation Graphs
In this section, we show the main result of the paper, which is that minimal
cut set enumeration of redundancy UF+V transition systems can be reduced to
minimal cut set enumeration of Boolean fault propagation graphs, which is more
efficient than mcs enumeration based on miter construction and model checking.
4.1 Reduction
For each UF+V system S, we define a corresponding fpg S_B. The components of S_B correspond to the signal variables of the original system S. With a slight abuse of notation, we use the same names for the original real-valued signal variables of S and the components of S_B, although they have different types. Intuitively, the reduction ensures that each component v of S_B can fail if and only if there is a trace of S in which the value of the signal variable v deviates from its nominal value.
Definition 9. Let S = (V_S, V_in, V_init, T_next, T_init) be a UF+V TS. We define a corresponding fpg S_B = (V_S, canFail), where canFail(v) = ⋁_{v′ ∈ x̄ ∩ V_S} v′ if T_next(v) = f(x̄) and canFail(v) = atLeast_{⌈k/2⌉}(x̄ ∩ V_S) if T_next(v) = voter_k(x̄), using the definition atLeast_m(X) = ⋁_{Y ⊆ X, |Y| = m} ⋀_{y ∈ Y} y.²

² Note that there are more efficient and compact encodings for the atLeast constraint [18]; we use the simplest one for presentation purposes.
Example 4. Consider the transition system S from Example 3. The corresponding fault propagation graph is S_B = ({x_1^1, x_1^2, x_1^3, x_1^v, x_2^1, x_2^2, x_2^3, x_2^v, x_3^1}, canFail), where
– canFail(x_1^i) = x_2^v for all 1 ≤ i ≤ 3, canFail(x_2^i) = x_1^v for all 1 ≤ i ≤ 3,
– canFail(x_3^1) = x_1^v ∨ x_2^v,
– canFail(x_1^v) = atLeast_2(x_1^1, x_1^2, x_1^3), canFail(x_2^v) = atLeast_2(x_2^1, x_2^2, x_2^3).
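The construction of Definition 9 is mechanical; the following sketch (our own illustration, with component names of our choosing) builds the canFail map of Example 4, representing each formula as a Python predicate over a state dictionary:

from itertools import combinations

def at_least(m, comps):
    # atLeast_m(X): some m-element subset of comps is entirely failed
    return lambda s: any(all(s[c] for c in Y)
                         for Y in combinations(comps, m))

def disj(comps):
    # plain disjunction over the signal inputs of an uninterpreted module
    return lambda s: any(s[c] for c in comps)

can_fail = {}
for i in (1, 2, 3):
    can_fail[f'x1_{i}'] = disj(['x2_v'])      # Tnext(x1_i) = f1(in1, x2_v)
    can_fail[f'x2_{i}'] = disj(['x1_v'])      # Tnext(x2_i) = f2(in2, x1_v)
can_fail['x3_1'] = disj(['x1_v', 'x2_v'])     # Tnext(x3_1) = f3(x1_v, x2_v)
can_fail['x1_v'] = at_least(2, ['x1_1', 'x1_2', 'x1_3'])
can_fail['x2_v'] = at_least(2, ['x2_1', 'x2_2', 'x2_3'])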
4.2 Correctness
We show that the reduction preserves cut sets. In the rest of the section, let S = (V_S, V_in, V_init, T_next, T_init) be an arbitrary redundancy UF+V TS, Faults ⊆ V_S an arbitrary fault set, and V_out ⊆ V_S an arbitrary set of output signals. First, we show that each cut set of S corresponds to a cut set of S_B.

Lemma 1. If Faults is a cut set of S for the set of outputs V_out, then cs defined as cs(v) = ⊤ iff v ∈ Faults is a cut set of S_B for the top level event ⋁_{o ∈ V_out} o.
Proof. Let Faults be a cut set of S for some trace π, for some ι, Init, and [[·]]. Let π^nom be the corresponding nominal trace. Define the trace π^B of S_B by π^B_0 = cs and for all i > 0 define π^B_i by π^B_i(v) = ⊤ if π^B_{i−1}(v) = ⊤ and π^B_i(v) = [[canFail(v)]]_{π^B_{i−1}} if π^B_{i−1}(v) = ⊥. In other words, π^B is the unique trace starting in cs in which all the components fail as soon as possible. By monotonicity, the trace π^B has a fixed point, i.e., there is n such that π^B_n = π^B_{n′} for all n′ > n.

We show that π^B satisfies π^B_n(o) = ⊤ for some o ∈ V_out and thus cs is a cut set for the top level event ⋁_{o ∈ V_out} o. To do this, we prove by induction on i and on the voter depth vd(v)³ that for all v ∈ V_S and i ≥ 0, π_i(v) ≠ π^nom_i(v) implies π^B_n(v) = ⊤. We distinguish three cases:

– If v ∈ Faults, then π^B_0(v) = ⊤. From the definition of π^B, this implies that π^B_l(v) = ⊤ for all l ≥ 0. In particular, π^B_n(v) = ⊤.
– If v ∉ Faults and T_next(v) = f(x_1, . . . , x_k), we distinguish two cases:
  • If i = 0: since π_0(v) ≠ π^nom_0(v), it must be the case that π_0(v) ≠ Init(T_init(v)), therefore v ∈ Faults. This is a contradiction.
  • If i > 0: then π_i(v) ≠ π^nom_i(v) by definition implies
    [[f]](π_{i−1}(x_1), . . . , π_{i−1}(x_k)) ≠ [[f]](π^nom_{i−1}(x_1), . . . , π^nom_{i−1}(x_k))
    and hence π_{i−1}(x_j) ≠ π^nom_{i−1}(x_j) for some 1 ≤ j ≤ k because [[f]] is a function. Since π_{i−1}(in) = π^nom_{i−1}(in) holds for all in ∈ V_in, we know that x_j ∈ V_S. Therefore the induction hypothesis implies π^B_n(x_j) = ⊤ and thus π^B_{n+1}(v) = ⊤ because π^B_n satisfies canFail(v). Since π^B_n was chosen as the fixed point of π^B, this implies π^B_n(v) = π^B_{n+1}(v) = ⊤.
³ Induction on the voter depth is employed because UF+V transition systems propagate results of voters instantaneously.
If v∈ Faults and Tnext(v) = voter k(x1, . . . , xk), then πi(v)=πnom
i(v) for
any i0 by definition implies
majority(πi(x1), . . . , πi(xk)) =majority(πnom
i(x1), . . . , πnom
i(xk)).(1)
Since Sis a redundancy TS, all πnom
i(xj) are equal and the disequality (1)
implies that πi(xj)=πnom
i(xj) for at least k/2of xj. All these xjare not in
Vin and must therefore be in VS. By definition of voter depth, vd(xj)< vd(v)
for all these xj. Therefore by the induction hypothesis πB
n(xj) = for at
least k/2of xjand thus πB
n+1(v) = because πB
nsatisfies canFail (v). This
again implies πB
n(v) = πB
n+1(v) = because πB
nis the fixed point of πB.
This finishes the proof: if Faults is a cut set, πc(o)=πnom
c(o) for some c0
and oVout, and thus πB
n(o) = . Therefore we know that πB
n|=WoVout oand
thus cs is a cut set of SB.
For the converse direction, for each fault set we devise a trace of the UF+V TS S that propagates all the possible deviations from the nominal values. We call this trace maximally fault-propagating. In this trace, all signal values are from the set {0, 1}; all nominal signal values are 0 and become 1 only as a result of a fault. Moreover, if there is a trace for the given fault set in which a signal deviates from its nominal value, the value of the corresponding signal in the maximally fault-propagating trace will be 1.
Definition 10 (Maximally fault-propagating trace). Let S be a UF+V TS. Define
– ι_i(v_in) = 0 for all i ≥ 0 and v_in ∈ V_in, i.e., ι is a stream of constant zero inputs;
– Init(v_init) = 0 for each v_init ∈ V_init; and
– [[f]](x_1, . . . , x_k) = 1 − ∏_{1 ≤ i ≤ k}(1 − x_i) for each uninterpreted f, i.e., the output is 0 if all inputs are 0; it is 1 if at least one input is 1.
The maximally fault-propagating trace of S for a fault set Faults, denoted π^fp, is the unique trace of S for the above input stream, initial values, interpretation, and the given fault set that for all i ≥ 0 and v satisfies π^fp_i(v) = 1 whenever v ∈ Faults.
Observe that the trace π^fp is monotone, i.e., once a signal gets set to 1, it stays set to 1 for the rest of the trace. This is formalized by the following lemma, which can be proven by induction on i, j − i, and the voter depth of v.

Lemma 2. Let S be a UF+V TS, Faults a fault set, and π^fp the corresponding maximally fault-propagating trace. Then π^fp_i(v) = 1 for each i ≥ 0 and v ∈ V_S implies π^fp_j(v) = 1 for all j > i.
We can now show that if a trace of the fpg version S_B of a UF+V TS S triggers the top level event for some initial fault assignment, there is a trace in the original system S for the corresponding fault set whose output deviates from the nominal one; namely, it is the trace π^fp.
282 M. Bozzano et al.
Lemma 3. If cs defined as cs(v) = iff vFaults is a cut set of SBfor the
top level event WoVout o, then Faults is a cut set of Sfor the set of outputs Vout.
Proof. Suppose that the trace π^B of S^B with the initial state cs satisfies π^B_c(o) = ⊥ for some c ≥ 0 and o ∈ V_out. We show that Faults is a cut set of S for the set of output signals V_out. Let π^fp be the maximally fault-propagating trace of S for Faults and π^nom the corresponding nominal trace.

We show that for each i ≥ 0 and v ∈ V_S, the condition π^B_i(v) = ⊥ implies π^fp_i(v) ≠ π^nom_i(v). We proceed by induction on i:
– For i = 0: If cs = π^B_0(v) = ⊥, then v ∈ Faults and thus π^fp_0(v) ≠ π^nom_0(v) because π^fp_0(v) = 1 and π^nom_0(v) = 0.
– For i > 0: Assume that π^B_i(v) = ⊥. We distinguish four cases:
  • If v ∈ Faults, then π^fp_i(v) = 1 and so π^fp_i(v) ≠ π^nom_i(v) = 0.
  • If π^B_{i−1}(v) = ⊥, then we get that π^fp_{i−1}(v) ≠ π^nom_{i−1}(v) from the induction hypothesis, and thus π^fp_i(v) ≠ π^nom_i(v) by Lemma 2.
  • If v ∉ Faults, π^B_{i−1}(v) ≠ ⊥, and T_next(v) = f(x_1, ..., x_k), then π^B_i(v) = ⊥ implies that π^B_{i−1}(x_j) = ⊥ for at least one x_j ∈ V_S. From the induction hypothesis, we get that π^fp_{i−1}(x_j) ≠ π^nom_{i−1}(x_j) and since π^nom_{i−1}(x_j) = 0, we know that π^fp_{i−1}(x_j) = 1. By the definition of [[f]] in π^fp, we know that also π^fp_i(v) = 1, which is not equal to π^nom_i(v) = 0.
  • If v ∉ Faults, π^B_{i−1}(v) ≠ ⊥, and T_next(v) = voter_k(x_1, ..., x_k), then π^B_i(v) = ⊥ implies that at least ⌈k/2⌉ of the x_j ∈ V_S satisfy π^B_{i−1}(x_j) = ⊥. From the induction hypothesis we get that π^fp_{i−1}(x_j) ≠ π^nom_{i−1}(x_j) for these x_j, and since π^nom_{i−1}(x_j) = 0, we know that π^fp_{i−1}(x_j) = 1 for at least ⌈k/2⌉ of the x_j. By the definition of the majority function, we know that also π^fp_{i−1}(v) = 1 and thus, by Lemma 2, also π^fp_i(v) = 1 ≠ 0 = π^nom_i(v).

Therefore π^B_c(o) = ⊥ implies π^fp_c(o) ≠ π^nom_c(o) and Faults is a cut set of S.
Theorem 1. For each fault set Faults, the following two claims are equivalent:
1. The set Faults is a cut set of S for the set of output signals V_out.
2. The assignment cs defined as cs(v) = ⊥ iff v ∈ Faults is a cut set of S^B for the top level event ⋁_{o ∈ V_out} o.
5 Related Work
Approaches to the analysis of redundant architectures include [6], which ad-
dresses the generation of the reliability function for a class of generic architec-
tures including tree- and dag-like structures. The computation of the reliability
is based on predicate abstraction and bdds. Our work extends and improves
the approach of [6] in several directions. First, it supports cyclic architectures,
to which predicate abstraction as defined in [6] cannot be applied. Second, it
does not require that the redundancy is localized within small blocks (manually
defined by the user or in a library), to which the predicate abstraction can be
applied. In contrast, our approach applies the abstraction directly on the level of
individual modules and voters. Moreover, the approach of [6] needs to compute
the abstracted versions of the specified blocks upfront by quantifier elimination.
Finally, our approach outperforms the approach of [6].
Other works on redundant architecture analysis are either based on ad-hoc
algorithms [13] which are not fully automated, and require discretization and
additional input data from the user, or use simulation techniques such as Monte
Carlo analysis [15], which do not examine the system behaviors exhaustively.
A classification of fault tolerant architectures is presented in [10]. The clas-
sification is based on three different patterns, namely comparison, voting, and
sparing, that can be composed to define generic and possibly cyclic architectures.
A follow-up work [11] builds upon these patterns and introduces strategies to
evaluate several architectures at once (family-based analysis of redundant ar-
chitectures) by reduction to Discrete Time Markov Chains. Our techniques are
orthogonal, and could be applied on top of the approach proposed in [11].
The concept of the maximally fault-propagating trace used to prove Lemma 3 is similar to the concept of maximally diverse interpretations [8], which can be used to efficiently reduce a formula in the positive fragment of EUF logic to a sat
formula. Both concepts restrict the interpretations of uninterpreted functions to
a specific subclass, which exhibits all the relevant behaviors.
6 Experimental Evaluation
We have performed an experimental evaluation of the proposed approach for
minimal cut set enumeration in order to answer the following research questions:
RQ1 How does the new approach scale on redundancy architectures with cycles?
RQ2 On redundancy architectures with cycles, how do the run-times compare
against the approach based on the enumeration of minimal cut sets of the
miter system by a model checker?
RQ3 On redundancy architectures without cycles, how do the run-times com-
pare against the approach based on predicate abstraction (pa) and bdd-
based enumeration [6]?
RQ4 On redundancy architectures without cycles, what part of the runtime
difference is caused by the different reduction to a Boolean problem (fpg vs
pa) and what part is caused by a different solving approach of the resulting
Boolean problem (sat-based vs bdd-based)?
6.1 Benchmarks and Setup
To answer these research questions, we used four sets of redundancy systems:
Scalable cyclic systems This benchmark set contains two kinds of bench-
marks.

Fig. 3: Scalable architectures used in the experimental evaluation: (a) Ladder, (b) Radiator, (c) Linear, and (d) Rectangular, each built from a sequence of modules M_1, M_2, ....

For evaluation on redundancy architectures with a linear number of cycles, we have generated ladder-shaped (Figure 3a) architectures of all
lengths between 1 and 100. For evaluation on redundancy architectures with
a large number of cycles, we have generated radiator-shaped (Figure 3b) ar-
chitectures of all lengths between 1 and 50. For each of the architectures, we
have generated its three redundant versions by replacing each module by a tmr block with one to three voters, using the schemas from Figures 2b, 2d, and 2e. This yields systems with 2 · length · (3 + numVoters) signals.
Random cyclic systems We have generated 250 random cyclic redundancy
UF+V systems with 1 to 150 modules of arity between 1 and 3, randomly
generated 1 to 6 replicas of each module, and 1 to 6 voters of arity 3 or 5,
randomly connected to the replicas.
Scalable acyclic systems This benchmark set contains linear-shaped (Fig-
ure 3c) and rectangular-shaped (Figure 3d) architectures of all lengths be-
tween 1 and 200 that were used for the evaluation of the predicate abstraction tech-
nique [6]. As in the original paper, we have used redundant versions of the
systems with the modules replaced by a tmr block with one to three voters.
Random acyclic systems We have used randomly generated acyclic architec-
tures composed of randomly chosen tmr blocks that were also used in [6].
We have evaluated the following approaches for minimal cut set enumeration:
For the systems with cycles, we have generated their fpg version as described in Section 4 and also the UFLRA transition system implementing the miter construction in the smv format. For enumeration of the minimal cut sets
of the fault propagation graphs, we have used the tool SMT-PGFDS [7]
(denoted as fpg in the experiments); for enumeration of the minimal cut
sets of miter systems, we have used the tool xSAP [2], which internally uses
an algorithm based on parametric IC3 [3] (denoted as ParamIC3).
For the systems without cycles, we have generated both their fpg version and
the description in the format of the tool OCRA [9] as used in [6]. Although
the fpgs could be solved by the tool SMT-PGFDS and the OCRA systems
can be solved by predicate abstraction, which is implemented in xSAP, and its bdd-based engine [6], this would compare not only the effect of the reduction to the Boolean case, but also the confounding factor of the underlying backend (sat-based in SMT-PGFDS and bdd-based in xSAP). To answer RQ4, we have thus performed a more fine-grained analysis as follows.

Fig. 4: Solving time on ladder-shaped benchmarks, divided according to the number of voters (1 to 3) per one reference module: solving time (s, up to T/O) against the size of the architecture, for fpg and ParamIC3.

Fig. 5: Solving time on radiator-shaped benchmarks, divided according to the number of voters (1 to 3) per one reference module: solving time (s, up to T/O) against the size of the architecture, for fpg and ParamIC3.
From each fpg, we generated the corresponding Boolean formula, which
is possible since the graph is acyclic [7]. We also generated the Boolean
formula obtained by predicate abstraction from each OCRA encoding. We
thus obtained two Boolean formulas for each system: one by reduction to
fault propagation (fp), and one by reduction by predicate abstraction (pa).
We have then used the sat-based enumeration algorithm of SMT-PGFDS and also the bdd-based enumeration algorithm of xSAP on both of these Boolean formulas. This gives four combinations: fp-sat, fp-bdd, pa-sat, and pa-bdd.
All experiments were executed on a cluster of 9 computational nodes, each with an Intel Xeon CPU X5650 @ 2.67 GHz, 12 CPU cores, and 96 GiB of RAM. We have used a time limit of 1 hour of wall-clock time and a memory limit of 16 GB for each benchmark–solver pair. The detailed experimental results can be found at https://es-static.fbk.eu/people/mjonas/papers/tacas22_redarchs/.
6.2 Results for Cyclic Benchmarks
The comparison of the running times of the fpg-based and model-checking-based approaches on the scalable cyclic benchmarks is shown in Figures 4 and 5. Figure 4 shows a significant benefit of the technique based on fault propagation on the ladder-shaped benchmarks: not only can it enumerate cut sets of all the used benchmarks, but its run-times are also dramatically better. However, as can be seen in Figure 5, the situation is different on the radiator-shaped benchmarks, which
contain a large number of cycles. Although the performance of the technique based on fault propagation is still superior to the model-checking-based technique, it scales poorly on the systems with 2 and 3 voters per one tmr block. The answer to RQ1 is thus that the proposed approach scales well if the number of cycles in the system is not too large; if the number of cycles is large, the technique scales worse, but nevertheless significantly better than the state-of-the-art technique based on miter construction and model checking [3].

Fig. 7: Solving time on scalable acyclic benchmarks: solving time (s, up to T/O) against the size of the architecture, divided by the architecture (linear, rectangular) and the number of voters per one reference module, for fp-bdd, fp-sat, pa-bdd, and pa-sat.
Fig. 6: Solving time on random cyclic benchmarks: scatter plot of the solving time (s, up to T/O) of ParamIC3 against that of fpg.
The run-times on random cyclic benchmarks are shown in Figure 6. The figure shows that the performance of the proposed technique is better by several orders of magnitude and that it can enumerate minimal cut sets of 59 random systems that are out of reach for the technique based on model checking. Note that some of the systems are hard for both approaches: both timed out on 66 of the 250 benchmarks. Together with the results for the ladder-shaped and radiator-shaped systems, this answers RQ2: the technique proposed in this paper has significantly better performance than the state-of-the-art technique based on model checking.
There are two reasons for the observed performance difference. The first is the reduction of the UFLRA transition system to a Boolean one, which has also been observed to bring a significant benefit on acyclic systems in the case of predicate abstraction [6]. The second is the underlying mcs-enumeration technique applied to the resulting fpg. This technique reduces the expensive sequential reasoning to an enumeration of minimal models of a single smt formula, which can significantly improve performance [7].
6.3 Results for Acyclic Benchmarks
The comparison of the performance on acyclic scalable benchmarks is shown in
Figure 7. The results are divided according to the method used to reduce the problem to the Boolean case (fp vs. pa) and the technique used to enumerate the minimal cut sets of the Boolean system (sat vs. bdd). Scatter plots of solving times on random acyclic benchmarks can be seen in Figure 8.

Fig. 8: Solving time on random acyclic benchmarks: scatter plots of the solving times (s) of pa-bdd and pa-sat against fp-sat.
The results show that reducing the problem to fault propagation and using an off-the-shelf solver for the enumeration of minimal cut sets of the resulting Boolean system (i.e., fp-sat) is clearly superior to the state-of-the-art approach based on predicate abstraction and bdd-based mcs enumeration (i.e., pa-bdd). The difference between these two approaches reaches several orders of magnitude on the scalable benchmarks and grows with the size and complexity of the system. The performance is also significantly better on the random benchmarks. This answers RQ3 in favor of the technique proposed in this paper.
As for RQ4, Figures 7 and 8 show that both the different reduction technique (fp vs. pa) and the solving technique (sat vs. bdd) play a role in this difference. However, the larger part of the runtime difference between the proposed approach (fp-sat) and the state-of-the-art approach (pa-bdd) [6] is due to the better performance of sat-based enumeration. This insight is an additional interesting outcome of our experiments. Nevertheless, for both of the enumeration approaches, the proposed reduction based on fault propagation provides better performance than the state-of-the-art reduction by predicate abstraction.
7 Conclusions and Future Work
We have presented a framework for modeling redundancy architectures with
possible cyclic dependencies among the computational modules and we have
developed an efficient approach for enumeration of minimal cut sets of such
architectures. The experimental evaluation has shown that this approach dra-
matically outperforms the state-of-the-art approach based on model checking on
cyclic redundancy architectures and has a better performance than the state-of-
the-art approach based on predicate abstraction on acyclic architectures.
In the future, we plan to extend the approach to a more general class of voters
than majority voters. We also plan to extend the approach to support common
cause analysis for different component faults and possibly to synthesize an opti-
mal distribution of the modules of the architecture between the computational
nodes of a system such as Integrated Modular Avionics.
References
1. Barrett, C.W., Sebastiani, R., Seshia, S.A., Tinelli, C.: Satisfiability modulo the-
ories. In: Biere, A., Heule, M., van Maaren, H., Walsh, T. (eds.) Handbook of
Satisfiability, Frontiers in Artificial Intelligence and Applications, vol. 185, pp.
825–885. IOS Press (2009). https://doi.org/10.3233/978-1-58603-929-5-825
2. Bittner, B., Bozzano, M., Cavada, R., Cimatti, A., Gario, M., Griggio, A., Mattarei,
C., Micheli, A., Zampedri, G.: The xSAP safety analysis platform. In: Chechik, M.,
Raskin, J. (eds.) Tools and Algorithms for the Construction and Analysis of Sys-
tems - 22nd International Conference, TACAS 2016, Held as Part of the European
Joint Conferences on Theory and Practice of Software, ETAPS 2016, Eindhoven,
The Netherlands, April 2-8, 2016, Proceedings. Lecture Notes in Computer Science,
vol. 9636, pp. 533–539. Springer (2016). https://doi.org/10.1007/978-3-662-49674-9_31
3. Bozzano, M., Cimatti, A., Griggio, A., Mattarei, C.: Efficient anytime techniques
for model-based safety analysis. In: Kroening, D., Pasareanu, C.S. (eds.) Computer
Aided Verification - 27th International Conference, CAV 2015, San Francisco, CA,
USA, July 18-24, 2015, Proceedings, Part I. Lecture Notes in Computer Science,
vol. 9206, pp. 603–621. Springer (2015). https://doi.org/10.1007/978-3-319-21690-4_41
4. Bozzano, M., Cimatti, A., Mattarei, C.: Automated analysis of reliability architec-
tures. In: 2013 18th International Conference on Engineering of Complex Computer
Systems, Singapore, July 17-19, 2013. pp. 198–207. IEEE Computer Society (2013).
https://doi.org/10.1109/ICECCS.2013.37
5. Bozzano, M., Cimatti, A., Mattarei, C.: Efficient analysis of reliability architectures
via predicate abstraction. In: Bertacco, V., Legay, A. (eds.) Hardware and Software:
Verification and Testing - 9th International Haifa Verification Conference, HVC
2013, Haifa, Israel, November 5-7, 2013, Proceedings. Lecture Notes in Computer
Science, vol. 8244, pp. 279–294. Springer (2013). https://doi.org/10.1007/978-3-319-03077-7_19
6. Bozzano, M., Cimatti, A., Mattarei, C.: Formal reliability analysis of
redundancy architectures. Formal Aspects Comput. 31(1), 59–94 (2019).
https://doi.org/10.1007/s00165-018-0475-1
7. Bozzano, M., Cimatti, A., Pires, A.F., Griggio, A., Jonáš, M., Kimberly, G.: Ef-
ficient SMT-Based Analysis of Failure Propagation. In: Silva, A., Leino, K.R.M.
(eds.) Computer Aided Verification - 33rd International Conference, CAV 2021,
Virtual Event, July 20-23, 2021, Proceedings, Part II. Lecture Notes in Computer
Science, vol. 12760, pp. 209–230. Springer (2021). https://doi.org/10.1007/978-3-030-81688-9_10
8. Bryant, R.E., German, S.M., Velev, M.N.: Exploiting positive equality in a logic
of equality with uninterpreted functions. In: Halbwachs, N., Peled, D.A. (eds.)
Computer Aided Verification, 11th International Conference, CAV ’99, Trento,
Italy, July 6-10, 1999, Proceedings. Lecture Notes in Computer Science, vol. 1633,
pp. 470–482. Springer (1999). https://doi.org/10.1007/3-540-48683-6_40
9. Cimatti, A., Dorigatti, M., Tonetta, S.: OCRA: A tool for checking the refine-
ment of temporal contracts. In: Denney, E., Bultan, T., Zeller, A. (eds.) 2013 28th
IEEE/ACM International Conference on Automated Software Engineering, ASE
2013, Silicon Valley, CA, USA, November 11-15, 2013. pp. 702–705. IEEE (2013).
https://doi.org/10.1109/ASE.2013.6693137
10. Ding, K., Morozov, A., Janschek, K.: Classification of hierarchical fault-tolerant
design patterns. In: 15th IEEE Intl Conf on Dependable, Autonomic and Se-
cure Computing, 15th Intl Conf on Pervasive Intelligence and Computing,
3rd Intl Conf on Big Data Intelligence and Computing and Cyber Science
and Technology Congress, DASC/PiCom/DataCom/CyberSciTech 2017, Orlando,
FL, USA, November 6-10, 2017. pp. 612–619. IEEE Computer Society (2017).
https://doi.org/10.1109/DASC-PICom-DataCom-CyberSciTec.2017.108
11. Dubslaff, C., Ding, K., Morozov, A., Baier, C., Janschek, K.: Breaking the limits
of redundancy systems analysis. CoRR abs/1912.05364 (2019), http://arxiv.org/abs/1912.05364
12. Haeussermann, W.: Description and Performance of the Saturn Launch Vehicle’s
Navigation, Guidance, and Control System. IFAC Proceedings Volumes 3(1), 275–
312 (1970). https://doi.org/10.1016/S1474-6670(17)68785-8, 3rd
International IFAC Conference on Automatic Control in Space, Toulouse, France,
March 2-6, 1970
13. Hamamatsu, M., Tsuchiya, T., Kikuno, T.: On the Reliability of Cascaded TMR
Systems. In: Ishikawa, Y., Tang, D., Nakamura, H. (eds.) 16th IEEE Pacific
Rim International Symposium on Dependable Computing, PRDC 2010, Tokyo,
Japan, December 13-15, 2010. pp. 184–190. IEEE Computer Society (2010).
https://doi.org/10.1109/PRDC.2010.45
14. Lahiri, S.K., Nieuwenhuis, R., Oliveras, A.: SMT techniques for fast predicate
abstraction. In: Ball, T., Jones, R.B. (eds.) Computer Aided Verification, 18th
International Conference, CAV 2006, Seattle, WA, USA, August 17-20, 2006, Pro-
ceedings. Lecture Notes in Computer Science, vol. 4144, pp. 424–437. Springer
(2006). https://doi.org/10.1007/11817963_39
15. Lee, S., Jung, J., Lee, I.: Voting structures for cascaded triple modu-
lar redundant modules. IEICE Electronic Express 4(21), 657–664 (2007).
https://doi.org/10.1587/elex.4.657
16. Prisaznuk, P.J.: Integrated modular avionics. In: Proceedings of the IEEE 1992
National Aerospace and Electronics Conference, NAECON 1992. pp. 39–45.
IEEE (1992)
17. Ruijters, E., Stoelinga, M.: Fault tree analysis: A survey of the state-of-the-
art in modeling, analysis and tools. Comput. Sci. Rev. 15, 29–62 (2015).
https://doi.org/10.1016/j.cosrev.2015.03.001
18. Wynn, E.: A comparison of encodings for cardinality constraints in a SAT solver.
CoRR abs/1810.12975 (2018), http://arxiv.org/abs/1810.12975
19. Yeh, Y.: Triple-triple redundant 777 primary flight computer. In: 1996 IEEE
Aerospace Applications Conference. Proceedings. vol. 1, pp. 293–307 vol.1 (1996).
https://doi.org/10.1109/AERO.1996.495891
Tools | Optimizations, Repair and Explainability
Adiar
Binary Decision Diagrams in External Memory
Steffan Christ Sølvsten, Jaco van de Pol, Anna Blume Jakobsen, and Mathias Weller Berg Thomasen
Aarhus University, Denmark {soelvsten,jaco}@cs.au.dk
Abstract. We follow up on the idea of Lars Arge to rephrase the Reduce
and Apply operations of Binary Decision Diagrams (BDDs) as iterative
I/O-efficient algorithms. We identify multiple avenues to simplify and
improve the performance of his proposed algorithms. Furthermore, we
extend the technique to other common BDD operations, many of which
are not derivable using Apply operations alone. We provide asymptotic
improvements to the few procedures that can be derived using Apply.
Our work has culminated in a BDD package named Adiar that is able
to efficiently manipulate BDDs that outgrow main memory. This makes
Adiar surpass the limits of conventional BDD packages that use recur-
sive depth-first algorithms. It is able to do so while still achieving a sat-
isfactory performance compared to other BDD packages: on instances larger than 9.5 GiB, Adiar, in parts using the disk, is only 1.47 to 3.69 times slower than CUDD and Sylvan, which exclusively use main memory.
Yet, Adiar is able to obtain this performance at a fraction of the main
memory needed by conventional BDD packages to function.
Keywords: Time-forward Processing · External Memory Algorithms ·
Binary Decision Diagrams
1 Introduction
A Binary Decision Diagram (BDD) provides a canonical and concise representa-
tion of a boolean function as an acyclic rooted graph. This turns manipulation
of boolean functions into manipulation of graphs [10,11].
Their ability to compress the representation of a boolean function has made
them widely used within the field of verification. BDDs have especially found use
in model checking, since they can efficiently represent both the set of states and
the state-transition function [11]. Examples are the symbolic model checkers
NuSMV [14,15], MCK [17], LTSmin [19], and MCMAS [24] and the recently
envisioned symbolic model checking algorithms for CTL* in [3] and for CTLK
in [18]. Hence, continuous research effort is devoted to improve the performance
of this data structure. For example, despite the fact that BDDs were initially
envisioned back in 1986, BDD manipulation was first parallelised in 2014 by
Velev and Gao [35] for the GPU and in 2016 by Van Dijk and Van de Pol [16]
for multi-core processors [12].
The most widely used implementations of decision diagrams make use of
recursive depth-first algorithms and a unique node table [16,23,34]. Lookup of
nodes in this table and following pointers in the data structure during recursion
both pause the entire computation while missing data is fetched [21,26]. For large
enough instances, data has to reside on disk and the resulting I/O-operations
that ensue become the bottleneck. So in practice, the limit of the computer's
main memory becomes the upper limit on the size of the BDDs.
Related Work. Prior work has been done to overcome the I/Os spent while
computing on BDDs. David Long [25] achieved a performance increase of a fac-
tor of two by blocking all nodes in the unique node table based on their time
of creation, i.e. with a depth-first blocking. But, in [6] this was shown to only
improve the worst-case behaviour by a constant. Ochi, Yasuoka, and Yajima [28]
designed in 1993 breadth-first BDD algorithms that exploit a levelwise locality
on disk. Their technique was improved by Ashar and Cheong [8] in 1994 and
by Sanghavi et al. [31] in 1996. The fruits of their labour was the BDD library
CAL capable of manipulating BDDs larger than available main memory. Kun-
kle, Slavici and Cooperman [22] extended in 2010 the breadth-first approach to
distributed BDD manipulation.
The breadth-first algorithms in [8,28,31] are not optimal in the I/O-model,
since they still use a single hash table for each level. This works well in practice,
as long as a single level of the BDD can fit into main memory. If not, they still
exhibit the same worst-case I/O behaviour as other algorithms [6].
In 1995, Arge [5,6] proposed optimal I/O algorithms for the basic BDD
operations Apply and Reduce. To this end, he dropped all use of hash tables.
Instead, he exploited a total and topological ordering of all nodes within the
graph. This is used to store all recursion requests in priority queues, so they
get synchronized with the iteration through the sorted input stream of nodes.
Martin Šmerek implemented these algorithms in 2009 as they were described,
but the performance was disappointing, since the intermediate unreduced BDD
grew too large to handle in practice [personal communication, Sep 2021].
Contributions. Our work directly follows up on the theoretical contributions
of Arge in [5,6]. We simplified and improved on his I/O-optimal Apply and
Reduce algorithms. In particular, we modified and pruned the intermediate rep-
resentation, to prevent data duplication and to save on the number of sorting
operations. We also provide I/O-efficient versions of several other standard BDD
operations, where we obtain asymptotic improvements for the operations that
are derivable from Apply.
Our proposed algorithms and data structures have been implemented to cre-
ate a new easy-to-use and open-source BDD package, named Adiar. Our experi-
mental evaluation shows that Adiar is able to manipulate BDDs larger than the
given main memory available, with only an acceptable slowdown compared to a
conventional BDD library running exclusively in main memory.
1.1 Overview
The rest of the paper is organised as follows. Section 2 covers preliminaries on the I/O-model and Binary Decision Diagrams. We present our algorithms for I/O-efficient BDD manipulation in Section 3. Section 4 provides an overview of the resulting BDD package, Adiar, and Section 5 contains an experimental evaluation of it. Our conclusions and future work are in Section 6.
2 Preliminaries
2.1 The I/O-Model
The I/O-model [1] allows one to reason about the number of data transfers be-
tween two levels of the memory hierarchy, while abstracting away from technical
details of the hardware, to make a theoretical analysis manageable.
An I/O-algorithm takes inputs of size N, residing on the higher level of the two, i.e. in external storage (e.g. on a disk). The algorithm can only do computations on data that reside on the lower level, i.e. in internal storage (e.g. main memory). This internal storage can only hold a smaller and finite number of M elements. Data is transferred between these two levels in blocks of B consecutive elements [1]. Here, B is a constant size not only encapsulating the page size or the size of a cache-line but more generally how expensive it is to transfer information between the two levels. The cost of an algorithm is the number of data transfers, i.e. the number of I/O-operations, or just I/Os, it uses.

For all realistic values of N, M, and B we have that N/B < sort(N) ≪ N, where sort(N) ≜ N/B · log_{M/B}(N/B) [1,7] is the sorting lower bound, i.e. it takes Ω(sort(N)) I/Os in the worst case to sort a list of N elements [1]. With an M/B-way merge sort algorithm, one can obtain an optimal O(sort(N)) I/O sorting algorithm [1], and with the addition of buffers to lazily update a tree structure, one can obtain an I/O-efficient priority queue capable of inserting and extracting N elements in O(sort(N)) I/Os [4].
TPIE. The TPIE library [36] provides an implementation of I/O-efficient algorithms and data structures such that the use of B-sized buffers is completely transparent to the programmer. Elements can be stored in files that act like lists. One can push new elements to the end of a file and read the next elements from the file in either direction, provided has_next returns true. One can also peek the next element without moving the read head. TPIE provides an optimal O(sort(N)) external memory merge sort algorithm for its files. Furthermore, it provides an implementation of the I/O-efficient priority queue of [30] as developed in [29], which supports the push, top and pop operations.
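As an illustration, here is a minimal sketch of streaming elements through a TPIE file; the calls mirror the operations named above (push, peek, read), but the exact names and signatures may differ between TPIE versions:

  #include <tpie/tpie.h>
  #include <tpie/file_stream.h>

  int main() {
    tpie::tpie_init();
    {
      tpie::file_stream<int> xs;
      xs.open("numbers.tpie");          // a file that acts like a list
      for (int i = 0; i < 1000; ++i)
        xs.push(i);                     // append to the end of the file
      xs.seek(0);                       // move the read head back to the front
      while (xs.can_read())             // the 'has_next' of the description
        static_cast<void>(xs.read());   // elements arrive in B-sized blocks
    }
    tpie::tpie_finish();
    return 0;
  }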
2.2 Binary Decision Diagrams
A Binary Decision Diagram (BDD) [10], as depicted in Fig. 1, is a rooted directed acyclic graph (DAG) that concisely represents a boolean function B^n → B, where B = {⊥, ⊤}.

Fig. 1: Examples of Reduced Ordered Binary Decision Diagrams: (a) x2, (b) x0 ∧ x1, (c) x0 ⊕ x1, (d) x1 ∨ x2. Leaves are drawn as boxes with the boolean value and internal nodes as circles with the decision variable. Low edges are drawn dashed while high edges are solid.

The leaves contain the boolean values ⊥ and ⊤ that define the output of the function. Each internal node contains the label i of the input variable x_i it represents, together with two outgoing arcs: a low arc for when x_i = ⊥ and a high arc for when x_i = ⊤. We only consider Ordered Binary Decision Diagrams (OBDD), where each unique label may only occur once and the labels must occur in sorted order on all paths. The set of all nodes with label j is said to belong to the j-th level in the DAG.
If one exhaustively (1) skips all nodes with identical children and (2) removes
any duplicate nodes, then one obtains the Reduced Ordered Binary Decision Di-
agram (ROBDD) of the given OBDD. If the variable order is fixed, this reduced
OBDD is a unique canonical form of the function it represents [10].
The two primary algorithms for BDD manipulation are called Apply and Reduce. The Apply computes the OBDD h = f ⊙ g, where f and g are OBDDs and ⊙ is a function B × B → B. This is essentially done by recursively computing the product construction of the two BDDs f and g and applying ⊙ when recursing to pairs of leaves. The Reduce applies the two reduction rules on an OBDD bottom-up to obtain the corresponding ROBDD [10].
Common implementations of BDDs use recursive depth-first procedures that
traverse the BDD and the unique nodes are managed through a hash table [9,
16,20,23,34]. The latter allows one to directly incorporate the Reduce algorithm
of [10] within each node lookup [9,27]. They also use a memoisation table to minimise the number of duplicate computations [16,23,34]. If the sizes N_f and N_g of two BDDs are considerably larger than the memory M available, each recursion request of the Apply algorithm will in the worst case result in an I/O, caused by looking up a node within the memoisation table and following the low and high arcs [6,21]. Since there are up to N_f · N_g recursion requests, this results in up to O(N_f · N_g) I/Os in the worst case. The Reduce operation transparently built into the unique node table with a find-or-insert function can also cause an I/O for each lookup within this table [21]. This adds yet another O(N) I/Os, where N is the number of nodes in the unreduced BDD.
Lars Arge provided in [5,6] a description of an Apply algorithm that is capable
of only using O(sort(N_f · N_g)) I/Os and a Reduce that uses O(sort(N)) I/Os
(see [6] for a detailed description). He also proved this to be optimal for both
algorithms, assuming a levelwise ordering of nodes on disk [6]. Our algorithms,
implemented in Adiar, differ from Arge’s in subtle non-trivial ways. We will not
elaborate further on his original proposal, since our algorithms are simpler and
better at conveying the time-forward processing technique he used. Instead, we
will mention where our Reduce and Apply algorithms differ from his.
3 BDD Manipulation by Time-forward Processing
Our algorithms exploit the total and topological ordering of the internal nodes in the BDD depicted in (1) below, where parents precede their children. It is topological by ordering a node by its label i : N, and total by secondly ordering on a node's identifier id : N. This identifier only needs to be unique on each level, as nodes are still uniquely identifiable by the combination of their label and identifier.

    (i_1, id_1) < (i_2, id_2)  ≡  i_1 < i_2 ∨ (i_1 = i_2 ∧ id_1 < id_2)    (1)

We write the unique identifier (i, id) : N × N for a node as x_{i,id}.

BDD nodes do not contain an explicit pointer to their children but instead the children's unique identifier. Following the same notion, leaf values are stored directly in the leaf's parents. This makes a node a triple (uid, low, high) where uid : N × N is its unique identifier and low and high : (N × N) + B are its children. The ordering in (1) is lifted to compare the uids of two nodes, and so a BDD is represented by a file with its nodes in sorted order. For example, the BDDs in Fig. 1 would be represented as the lists depicted in Fig. 2.
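The following is a minimal C++ sketch of this representation; the type and member names are illustrative and do not match Adiar's actual sources:

  #include <cstdint>
  #include <tuple>
  #include <variant>

  struct uid_t {
    uint32_t label; // i : the decision variable of the level
    uint64_t id;    // id: only unique within its level
  };

  // The ordering (1): topological on the label, made total by the identifier.
  inline bool operator<(const uid_t &a, const uid_t &b) {
    return std::tie(a.label, a.id) < std::tie(b.label, b.id);
  }

  // A child is either the uid of an internal node or a boolean leaf value,
  // i.e. an element of (N x N) + B.
  using ptr_t = std::variant<uid_t, bool>;

  struct node_t {
    uid_t uid;       // x_{i,id}
    ptr_t low, high; // children for x_i = false, resp. x_i = true
  };

  // Lifting (1) to nodes: a BDD is a file of node_t sorted by this order.
  inline bool operator<(const node_t &a, const node_t &b) { return a.uid < b.uid; }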
The Apply algorithm in [6] produces an unreduced OBDD, which is turned into an ROBDD with Reduce. The original algorithms of Arge solely work on a node-based representation. Arge briefly notes that with an arc-based representation, the Apply algorithm is able to output its arcs in the order needed by the following Reduce, and vice versa. Here, an arc is a triple (source, is_high, target), written as source −is_high→ target, where source : N × N, is_high : B, and target : (N × N) + B, i.e. source and target contain the level and identifier of internal nodes. We have further pursued this idea of an arc-based representation and can conclude that the algorithms indeed become simpler and more efficient with an arc-based output from Apply. On the other hand, we see no such benefit over the more compact node-based representation in the case of Reduce. Hence, as depicted in Fig. 3, our algorithms work in tandem by cycling between the node-based and arc-based representation.
1a: [ (x_{2,0}, ⊥, ⊤) ]
1b: [ (x_{0,0}, ⊥, x_{1,0}), (x_{1,0}, ⊥, ⊤) ]
1c: [ (x_{0,0}, x_{1,0}, x_{1,1}), (x_{1,0}, ⊥, ⊤), (x_{1,1}, ⊤, ⊥) ]
1d: [ (x_{1,0}, x_{2,0}, ⊤), (x_{2,0}, ⊥, ⊤) ]

Fig. 2: In-order representation of the BDDs of Fig. 1
Fig. 3: The Apply–Reduce pipeline of our proposed algorithms: Apply reads the nodes of f and g and outputs the internal arcs and leaf arcs of f ⊙ g, which Reduce turns back into the nodes of f ⊙ g.
Fig. 4: Unreduced output of Apply when computing x2 → (x0 ∧ x1): (a) the semi-transposed graph, where pairs indicate nodes of Fig. 1a and 1b, respectively; (b) its in-order arc-based representation, split into internal arcs and leaf arcs.
Notice that our Apply outputs two files containing arcs: arcs to internal nodes and arcs to leaves. Internal arcs are output at the time their targets are processed, and since nodes are processed in ascending order, internal arcs end up being sorted with respect to the unique identifier of their target. This groups all in-going arcs to the same node together and effectively reverses internal arcs. Arcs to leaves, on the other hand, are output when their source is processed, which groups all out-going arcs to leaves together. These two outputs of Apply represent a semi-transposed graph, which is exactly the form needed by the following Reduce. For example, the Apply on the node-based ROBDDs in Fig. 1a and 1b with logical implication as the operator will yield the arc-based unreduced OBDD depicted in Fig. 4.

For simplicity, we will ignore any cases of leaf-only BDDs in our presentation of the algorithms. They are easily extended to also deal with those cases.
3.1 Apply
Our Apply algorithm works by a single top-down sweep through the input DAGs. Internal arcs are reversed due to this top-down nature, since an arc between two internal nodes can first be resolved and output at the time of the arc's target. These arcs are placed in the file F_internal. Arcs from nodes to leaves are placed in the file F_leaf.

The algorithm itself essentially works like the standard Apply algorithm. Given a recursion request for a pair of input nodes v_f from f and v_g from g, a single node is created with label min(v_f.uid.label, v_g.uid.label), and recursion requests r_low and r_high are created for its two children. If the labels of v_f.uid and v_g.uid are equal, then r_low = (v_f.low, v_g.low) and r_high = (v_f.high, v_g.high). Otherwise, r_low, resp. r_high, contains the uid of the low child, resp. the high child, of min(v_f, v_g), whereas max(v_f.uid, v_g.uid) is kept as is.

 1  Apply(f, g, ⊙)
 2    F_internal ← [] ; F_leaf ← [] ; Q_app:1 ← ∅ ; Q_app:2 ← ∅
 3    v_f ← f.next() ; v_g ← g.next() ; id ← 0 ; label ← undefined
 4
 5    /* Insert request for root (v_f, v_g) */
 6    Q_app:1.push(NIL −undefined→ (v_f.uid, v_g.uid))
 7
 8    /* Process requests in topological order */
 9    while Q_app:1 ≠ ∅ ∨ Q_app:2 ≠ ∅ do
10      (s −is_high→ (t_f, t_g), low, high) ← TopOf(Q_app:1, Q_app:2)
11
12      t_seek ← if low, high = NIL then min(t_f, t_g) else max(t_f, t_g)
13      while v_f.uid < t_seek ∧ f.has_next() do v_f ← f.next() od
14      while v_g.uid < t_seek ∧ g.has_next() do v_g ← g.next() od
15
16      if low = NIL ∧ high = NIL ∧ t_f ∉ {⊥, ⊤} ∧ t_g ∉ {⊥, ⊤}
17           ∧ t_f.label = t_g.label ∧ t_f.id ≠ t_g.id
18      then /* Forward information of min(t_f, t_g) to max(t_f, t_g) */
19        v ← if t_seek = v_f.uid then v_f else v_g
20        while Q_app:1.top() matches (_ −_→ (t_f, t_g)) do
21          (s −is_high→ (t_f, t_g)) ← Q_app:1.pop()
22          Q_app:2.push(s −is_high→ (t_f, t_g), v.low, v.high)
23        od
24      else /* Process request (t_f, t_g) */
25        id ← if label ≠ t_seek.label then 0 else id + 1
26        label ← t_seek.label
27
28        /* Forward or output outgoing arcs */
29        r_low, r_high ← RequestsFor((t_f, t_g), v_f, v_g, low, high, ⊙)
30        (if r_low ∈ {⊥, ⊤} then F_leaf else Q_app:1).push(x_{label,id} −⊥→ r_low)
31        (if r_high ∈ {⊥, ⊤} then F_leaf else Q_app:1).push(x_{label,id} −⊤→ r_high)
32
33        /* Output ingoing arcs */
34        while Q_app:1 ≠ ∅ ∧ Q_app:1.top() matches (_ −_→ (t_f, t_g)) do
35          (s −is_high→ (t_f, t_g)) ← Q_app:1.pop()
36          if s ≠ NIL then F_internal.push(s −is_high→ x_{label,id})
37        od
38        while Q_app:2 ≠ ∅ ∧ Q_app:2.top() matches (_ −_→ (t_f, t_g), _, _) do
39          (s −is_high→ (t_f, t_g), _, _) ← Q_app:2.pop()
40          if s ≠ NIL then F_internal.push(s −is_high→ x_{label,id})
41        od
42    od
43    return F_internal, F_leaf

Fig. 5: The Apply algorithm
The pseudocode for the Apply procedure is shown in Fig. 5, where the RequestsFor function computes r_low and r_high for the pair of nodes (t_f, t_g). The goal of the rest of the algorithm is to obtain the information that RequestsFor needs in an I/O-efficient way. To this end, the two priority queues Q_app:1 and Q_app:2 are used to synchronise recursion requests for a pair of nodes (t_f, t_g) with the sequential order of reading nodes in f and g. Q_app:1 has elements of the form (s −is_high→ (t_f, t_g)) and Q_app:2 has elements (s −is_high→ (t_f, t_g), low, high). The boolean is_high and the unique identifier s, being the request's origin, are used on lines 33–41 to output all ingoing arcs when the request is resolved.

Elements in Q_app:1 are sorted in ascending order by min(t_f, t_g), i.e. the node encountered first from f and g. Requests to the same (t_f, t_g) are grouped together by secondarily sorting the tuple lexicographically. Q_app:2 is sorted in ascending order by max(t_f, t_g), i.e. the second of the two to be visited, and ties are again broken lexicographically. This second priority queue is used in the case where t_f.label = t_g.label but t_f.id ≠ t_g.id, i.e. when both nodes are needed to resolve the request but they are not necessarily available at the same time. To this end, the given request is moved from Q_app:1 into Q_app:2 on lines 19–23. Here, the request is extended with the unique identifiers low and high of min(v_f, v_g), which makes the children of min(v_f, v_g) available at max(v_f, v_g).

The next request to process from Q_app:1 or Q_app:2 is dictated by the TopOf function on line 10. In the case that both Q_app:1 and Q_app:2 are non-empty, let r_1 = (s_1 −is_high_1→ (t_f:1, t_g:1)) be the top element of Q_app:1 and let the top element of Q_app:2 be r_2 = (s_2 −is_high_2→ (t_f:2, t_g:2), low, high). TopOf(Q_app:1, Q_app:2) returns (r_1, Nil, Nil) if min(t_f:1, t_g:1) < max(t_f:2, t_g:2), and r_2 otherwise. If either queue is empty, it equivalently returns the top request of the other.

The arc-based output greatly simplifies the algorithm compared to the original proposal of Arge in [6]. Our algorithm only uses two priority queues rather than four. Arge's algorithm, like ours, resolves a node before its children, but due to the node-based output it has to output this entire node before its children. Hence, it has to identify all nodes by the tuple (t_f, t_g), doubling the space used. Instead, the arc-based output allows us to output the information at the time of the children and hence we are able to generate the label and its new identifier for both parent and child. Arge's algorithm also did not forward a request's source s, so repeated requests to the same pair of nodes were merely discarded upon retrieval from the priority queue, since they carried no relevant information. Our arc-based output, on the other hand, makes every element placed in the priority queue forward the source s, vital for the creation of the semi-transposed graph.
Proposition 1 (Following Arge 1996 [6]). The Apply algorithm in Fig. 5 has I/O complexity O(sort(N_f · N_g)) and time complexity O((N_f · N_g) · log(N_f · N_g)), where N_f and N_g are the respective sizes of the BDDs for f and g.
See the full paper [33] for the proof.
Pruning by shortcutting the operator. The Apply procedure above, like Arge's original algorithm, follows recursion requests until a pair of leaves is met. Yet, for example in Fig. 4, the node for the request (x_{2,0}, ⊤) is unnecessary to resolve, since all leaves of this subgraph will trivially be ⊤ due to the implication operator. The subsequent Reduce will remove this node and its children in favour of the ⊤ leaf. Hence, the RequestsFor function can instead immediately create a request for the leaf. We implemented this in Adiar, since it considerably decreases the size of Q_app:1, Q_app:2, and of the output.
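A minimal sketch of such a shortcut check for the implication operator is given below; std::nullopt stands for a target that is still an internal node, and the names are illustrative rather than Adiar's API:

  #include <optional>

  // Returns the leaf that the request trivially resolves to, or std::nullopt
  // if Apply still has to recurse on the pair of targets.
  std::optional<bool> implies_shortcut(std::optional<bool> f,
                                       std::optional<bool> g) {
    if (f.has_value() && !*f) return true;                // bot -> g  is top
    if (g.has_value() && *g)  return true;                // f -> top  is top
    if (f.has_value() && g.has_value()) return !*f || *g; // both are leaves
    return std::nullopt;                                  // cannot shortcut yet
  }

Whenever the shortcut resolves, RequestsFor can push an arc to F_leaf immediately instead of generating further product requests.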
3.2 Reduce
Our Reduce algorithm in Fig. 6 works like other explicit variants with a single bottom-up sweep through the OBDD. Since the nodes are resolved and output in a bottom-up descending order, the output is exactly in the reverse order of how it is needed for any following Apply. We have so far ignored this detail, but the only change necessary to the Apply algorithm in Section 3.1 is for it to read the lists of nodes of f and g in reverse.

The priority queue Q_red is used to forward the reduction result of a node v to its parents in an I/O-efficient way. Q_red contains arcs from unresolved sources s in the given unreduced OBDD to already resolved targets t′ in the ROBDD under construction. The bottom-up traversal corresponds to resolving all nodes in descending order. Hence, arcs s −is_high→ t′ in Q_red are first sorted on s and secondly on is_high; the latter simplifies retrieving the low and high arcs on lines 8 and 9. The base cases for the Reduce algorithm are the arcs to leaves in F_leaf, which follow the exact same ordering. Hence, on lines 8 and 9, arcs in Q_red and F_leaf are merged using the PopMax function that retrieves the arc that is maximal with respect to this ordering.

Since nodes are resolved in descending order, F_internal follows this ordering on the arcs' targets when its elements are read in reverse. The reversal of arcs in F_internal makes the parents of a node v, to which the reduction result is to be forwarded, readily available on lines 26–32.

The algorithm otherwise proceeds similarly to the standard Reduce algorithm [10]. For each level j, all nodes v of that level are created from their high and low arcs, e_high and e_low, taken out of Q_red and F_leaf. The nodes are split into the two temporary files F_j:1 and F_j:2 that contain the mapping [uid ↦ uid′] from a node in the given unreduced OBDD to its equivalent node in the output. F_j:1 contains the nodes v removed due to the first reduction rule and is populated on lines 7–12: if both children of v are the same, then [v.uid ↦ v.low] is pushed to this file. F_j:2 contains the mappings for the second rule and is populated on lines 15–24. Nodes not placed in F_j:1 are placed in an intermediate file F_j and sorted by their children. This makes duplicate nodes immediate successors. Every unique node encountered in F_j is output to F_out before mapping itself and all its duplicates to it in F_j:2. Since nodes are output out-of-order compared to the input and it is unknown how many will be output for said level, they are given new decreasing identifiers starting from the maximal possible value MAX_ID. Finally, F_j:2 is sorted back into the order of F_internal to forward the results
 1  Reduce(F_internal, F_leaf)
 2    F_out ← [] ; Q_red ← ∅
 3    while Q_red ≠ ∅ do
 4      j ← Q_red.top().source.label ; id ← MAX_ID
 5      F_j ← [] ; F_j:1 ← [] ; F_j:2 ← []
 6
 7      while Q_red.top().source.label = j do
 8        e_high ← PopMax(Q_red, F_leaf)
 9        e_low ← PopMax(Q_red, F_leaf)
10        if e_high.target = e_low.target
11        then F_j:1.push([e_low.source ↦ e_low.target])
12        else F_j.push((e_low.source, e_low.target, e_high.target))
13      od
14
15      sort v ∈ F_j by v.low and secondly by v.high
16      v′ ← undefined
17      for each v ∈ F_j do
18        if v′ is undefined or v.low ≠ v′.low or v.high ≠ v′.high
19        then
20          id ← id − 1
21          v′ ← (x_{j,id}, v.low, v.high)
22          F_out.push(v′)
23        F_j:2.push([v.uid ↦ v′.uid])
24      od
25
26      sort [uid ↦ uid′] ∈ F_j:2 by uid in descending order
27      for each [uid ↦ uid′] ∈ MergeMaxUid(F_j:1, F_j:2) do
28        while F_internal.peek() matches (_ −_→ uid) do
29          (s −is_high→ uid) ← F_internal.next()
30          Q_red.push(s −is_high→ uid′)
31        od
32      od
33    od
34    return F_out

Fig. 6: The Reduce algorithm
in both F_j:1 and F_j:2 to their parents on lines 26–32. Here, MergeMaxUid merges the mappings [uid ↦ uid′] in F_j:1 and F_j:2 by always taking the mapping with the largest uid from either file.

Since the original algorithm of Arge in [6] takes a node-based OBDD as an input and internally uses node-based auxiliary data structures, his Reduce algorithm had to create two copies of the input to reverse all internal arcs: one copy sorted by the nodes' low children and one sorted by their high children. Since F_internal already has its arcs reversed, our design eliminates two expensive sorting steps and more than halves the memory used.
Another consequence of Arge’s node-based representation is that his algo-
rithm had to move all arcs to leaves into Qred rather than merging requests from
Qred with the base-cases from Fleaf . The semi-transposed input allows us to de-
crease the number of I/Os due to Qred by Θ(sort(N`)) where N`are the number
of arcs to leaves (see [33] for the proof). In practice, together with pruning the
recursion during Apply, this can provide up to a factor 2 speedup [33].
Proposition 2 (Following Arge 1996 [6]). The Reduce algorithm in Fig. 6 has an O(sort(N)) I/O complexity and an O(N log N) time complexity.

See the full paper [33] for the proof. Arge proved in [6] that this O(sort(N)) I/O complexity is optimal for the input, assuming a levelwise ordering of nodes.
3.3 Other BDD Algorithms
By applying the above algorithmic techniques, one can obtain all other singly-recursive BDD algorithms; see [33] for the details. We now design asymptotically better variants of Negation and Equality Checking than what is possible by deriving them using Apply.

Negation. A BDD is negated by inverting the value in its nodes' leaf children. This is an O(1) I/O-operation if a negation flag is used to mark whether the nodes should be negated on-the-fly as they are read from the stream.

Proposition 3. Negation has I/O, space, and time complexity O(1).

This is an improvement over the O(sort(N)) I/Os spent by Apply to compute f ⊕ ⊤, where ⊕ is exclusive or. Furthermore, disk space is shared between BDDs.
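A minimal sketch of such a negation flag, reusing the illustrative node_t and ptr_t types from the sketch in Section 3 (again, not Adiar's actual classes):

  #include <memory>
  #include <variant>
  #include <vector>

  struct bdd_handle {
    std::shared_ptr<const std::vector<node_t>> nodes; // shared node file
    bool negated = false;
  };

  // O(1) in I/Os, space, and time: no node is read or written.
  inline bdd_handle bdd_negate(bdd_handle f) {
    f.negated = !f.negated;
    return f;
  }

  // While streaming the nodes, a leaf child is flipped on-the-fly iff the
  // handle's flag is set; internal children are untouched.
  inline ptr_t resolve_child(const bdd_handle &f, const ptr_t &child) {
    if (f.negated && std::holds_alternative<bool>(child))
      return ptr_t{!std::get<bool>(child)};
    return child;
  }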
Equality Checking. To check whether f ≡ g, one has to check that the DAG of f is isomorphic to the one for g [10]. This makes f and g trivially inequivalent when the number of nodes, the number of levels, or the label or size of each of the L levels do not match. This can be checked in O(1) and O(L/B) I/Os if the Reduce algorithm in Fig. 6 is made to also output the relevant meta-information.

If f ≡ g, the isomorphism relates the roots of the BDDs for f and g. For any node v_f of f and v_g of g, if (v_f, v_g) is uniquely related by the isomorphism, then so should (v_f.low, v_g.low) and (v_f.high, v_g.high) be. Hence, one can check for equality by traversing the product of both BDDs (as in Apply) and checking for one of the following two conditions being violated:
– The children of the given recursion request (t_f, t_g) should either both be the same leaf or internal nodes with the same label.
– On level i, exactly N_i unique recursion requests should be retrieved from the priority queues, where N_i is the number of nodes on level i.
If the first condition is never violated, it is guaranteed that f ≡ g, and so ⊤ is output. The second condition ensures that the algorithm terminates earlier on negative cases and lowers the provable complexity bound; see [33] for the proof.

Proposition 4. Equality Checking has I/O complexity O(sort(N)) and time complexity O(N log N), where N = min(N_f, N_g) is the minimum of the respective sizes of the BDDs for f and g.
If the ordering (1) is extended such that ⊥ and ⊤ succeed all unique identifiers and ⊥ < ⊤, then the Reduce in Fig. 6 actually enforces a much stricter ordering; it outputs nodes in an order purely based on their label and the unique identifiers of their children.

Proposition 5. If G_f and G_g are outputs of the Reduce in Fig. 6, then f ≡ g if and only if the i-th nodes of G_f and G_g match numerically.

See the full paper [33] for the proof. The negation operation breaks this property by changing the leaf values without changing their order. So, in the case where f or g, but not both, have their negation flag set, one still has to use the O(sort(N)) algorithm above; otherwise a simple linear scan of both BDDs suffices.

Corollary 1. If the negation flags of the BDDs for f and g are equal, then Equality Checking can be done in 2 · N/B I/Os and O(N) time, where N = min(N_f, N_g) is the minimum of the respective sizes of the BDDs for f and g.

Both Proposition 4 and Corollary 1 are an asymptotic improvement on the O(sort(N^2)) equality checking algorithm that computes f ↔ g with Apply and Reduce and then tests whether the output is the ⊤ leaf.
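Under the conditions of Corollary 1, the check degenerates to one synchronized linear scan over the two sorted node files. A minimal sketch, again over the illustrative node_t type from Section 3:

  #include <variant>
  #include <vector>

  inline bool uid_eq(const uid_t &a, const uid_t &b) {
    return a.label == b.label && a.id == b.id;
  }

  inline bool ptr_eq(const ptr_t &a, const ptr_t &b) {
    if (a.index() != b.index()) return false;
    return std::holds_alternative<bool>(a)
             ? std::get<bool>(a) == std::get<bool>(b)
             : uid_eq(std::get<uid_t>(a), std::get<uid_t>(b));
  }

  // By Proposition 5, equivalent functions yield numerically identical node
  // sequences; streamed from disk this costs 2 * N/B I/Os and O(N) time.
  bool bdd_equal_linear(const std::vector<node_t> &f,
                        const std::vector<node_t> &g) {
    if (f.size() != g.size()) return false;
    for (std::size_t i = 0; i < f.size(); ++i)
      if (!uid_eq(f[i].uid, g[i].uid) || !ptr_eq(f[i].low, g[i].low) ||
          !ptr_eq(f[i].high, g[i].high))
        return false;
    return true;
  }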
4 Adiar: An Implementation
The algorithms and data structures described in Section 3 have been implemented in a new BDD package, named Adiar^1,^2. The most important operations are shown in Table 1. Interaction with the BDD package is done through C++ programs that include the <adiar/adiar.h> header file and are built and linked with CMake. Its two dependencies are the Boost library and the TPIE library; the latter is included as a submodule of the Adiar repository, leaving it to CMake to build TPIE and link it to Adiar.

Adiar is initialised with the adiar_init(memory, temp_dir) function, where memory is the memory (in bytes) dedicated to Adiar and temp_dir is the directory where temporary files will be placed, e.g. a dedicated harddisk. The BDD package is deinitialised by calling the adiar_deinit() function.

The bdd object in Adiar is a container for the underlying files of each BDD, while a __bdd object is used for possibly unreduced arc-based OBDDs. Reference counting on the underlying files is used to reuse the same files and to immediately delete them when the reference count decrements to 0. Files are deleted as early as possible by use of implicit conversions between the bdd and __bdd objects and an overloaded assignment operator, making the concurrently occupied space on disk minimal.

^1 adiar ⟨Portuguese⟩ (verb): to defer, to postpone
^2 Source code is publicly available at github.com/ssoelvsten/adiar
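A minimal usage sketch based on the interface described above follows; the function names are those of Adiar 1.x to the best of our knowledge and may differ in other versions:

  #include <adiar/adiar.h>

  int main() {
    adiar::adiar_init(1024 * 1024 * 1024, "/tmp"); // 1 GiB for Adiar, temp files in /tmp
    {
      adiar::bdd x0 = adiar::bdd_ithvar(0);
      adiar::bdd x1 = adiar::bdd_ithvar(1);
      adiar::bdd f  = adiar::bdd_and(x0, x1);      // underlying files are reference counted
    } // the files behind x0, x1, and f are deleted here
    adiar::adiar_deinit();
    return 0;
  }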
Adiar function         Operation          I/O complexity              Justification
bdd_apply(f, g, ⊙)     f ⊙ g              O(sort(N_f · N_g))          Prop. 1, 2
bdd_ite(f, g, h)       f ? g : h          O(sort(N_f · N_g · N_h))    [33], Prop. 2
bdd_restrict(f, i, v)  f|_{x_i = v}       O(sort(N_f))                [33], Prop. 2
bdd_exists(f, i)       ∃v : f|_{x_i = v}  O(sort(N_f^2))              [33], Prop. 2
bdd_forall(f, i)       ∀v : f|_{x_i = v}  O(sort(N_f^2))              [33], Prop. 2
bdd_not(f)             ¬f                 O(1)                        Prop. 3
bdd_satcount(f)        #x : f(x)          O(sort(N_f))                [33]
bdd_nodecount(f)       N_f                O(1)                        Section 3.3
f == g                 f ≡ g              O(sort(min(N_f, N_g)))      Prop. 4

Table 1: Some of the operations supported by Adiar and their I/O-complexity.
5 Experimental Evaluation
While time-forwarding may be an asymptotic improvement over the recursive approach in the I/O-model, its usability in practice is another question entirely. We have compared Adiar 1.0.1 to the recursive BDD packages CUDD 3.0.0 [34] and Sylvan 1.5.0 [16] (in single-core mode). We constructed BDDs for some benchmarks in all tools in a similar manner, ensuring the same variable ordering.

The experimental results^3 were obtained on server nodes of the Grendel cluster at the Centre for Scientific Computing Aarhus. Each node has two 48-core 3.0 GHz Intel Xeon Gold 6248R processors, 384 GiB of RAM, and 3.5 TiB of available SSD disk, runs CentOS Linux, and compiles code with GCC 10.1.0. We report the minimum measured running time, since it minimises any error caused by the CPU, memory, and disk [13]; using the average or median does not significantly change any of our results. For comparability, all compute nodes are set to use 350 GiB of the available RAM, while each BDD package is given 300 GiB of it.

Sylvan was set to not use any parallelisation, given a ratio between the node table and the cache of 64:1, and set to start its data structures 2^12 times smaller than the final 262 GiB they may occupy, i.e. at first with a table and cache that occupy 66 MiB. The size of the CUDD cache was set such that it would have the same node table to cache ratio when reaching 300 GiB.
5.1 Queens
The solution to the Queens problem is the number of arrangements of N queens on an N × N board, such that no queen is threatened by another. Our benchmark follows the description in [22]: the variable x_ij represents whether a queen is placed on the i-th row and the j-th column, and the solution to the problem then corresponds to the number of satisfying assignments of the formula

    ⋀_{i=0}^{N−1} ⋁_{j=0}^{N−1} (x_ij ∧ ¬has_threat(i, j)),

where has_threat(i, j) is true if a queen is placed on a tile (k, l) that would be in conflict with a queen placed on (i, j).
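As a sketch, this formula can be built directly with the operations from Table 1; has_threat_bdd is a hypothetical helper constructing the BDD of has_threat(i, j), and variable x_ij is mapped to the label i·N + j:

  #include <adiar/adiar.h>

  adiar::bdd has_threat_bdd(int i, int j, int N); // assumed given

  adiar::bdd queens(int N) {
    adiar::bdd board = adiar::bdd_true();
    for (int i = 0; i < N; ++i) {
      // row_i = OR_j (x_ij AND NOT has_threat(i, j))
      adiar::bdd row = adiar::bdd_false();
      for (int j = 0; j < N; ++j) {
        adiar::bdd cell = adiar::bdd_and(adiar::bdd_ithvar(i * N + j),
                                         adiar::bdd_not(has_threat_bdd(i, j, N)));
        row = adiar::bdd_or(row, cell);
      }
      board = adiar::bdd_and(board, row); // AND over all rows
    }
    return board;
  }

The number of solutions is then the satisfying-assignment count of the returned BDD (bdd_satcount in Table 1).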
^3 Available at Zenodo [32] and at github.com/ssoelvsten/bdd-benchmark
Fig. 7: Running time solving N-Queens for N = 12, ..., 17 (lower is better): total time (s, log scale) and µs per processed BDD node, for Adiar, CUDD, and Sylvan.
The ROBDD of the innermost conjunction can be directly constructed, without
any BDD operations.
The current version of Adiar is implemented purely using external memory
algorithms. These perform poorly when given small amounts of data. Hence, it
is not meaningful to compare performance for N < 12, where the BDDs involved
are 23.5 MiB or smaller. For N ≥ 12, Fig. 7 shows how the gap in running time
between Adiar and the other BDD packages shrinks as instances grow. At N = 15,
which is the largest instance solved by Sylvan and CUDD, Adiar is 1.47 times
slower than CUDD and 2.15 times slower than Sylvan.
The largest instance solved by Adiar is N = 17, where the largest BDD
constructed is 719 GiB in size. In contrast, Sylvan only constructed a 12.9 GiB
sized BDD for N = 15. Even though Adiar has to use the disk, it only becomes
1.8 times slower per processed node compared to its highest performance at
N = 13. Conversely, Adiar is able to solve the N = 15 problem with much less
main memory than both Sylvan and CUDD. Fig. 8 shows the running time on
the same machine with its memory, including its file system cache, limited with
cgroups to be 1 GiB more than given to the BDD package. Yet, Adiar is only 1.39
times slower when decreasing its memory down to 2 GiB, while Sylvan cannot
function with less than 56 GiB of memory available.
Fig. 8: Running time of 15-Queens with variable memory (lower is better); running
time in seconds over the memory available (0-250 GiB) for Adiar, CUDD and
Sylvan.
We also ran experiments on counting the number of draw positions in a 3D-
version of Tic-Tac-Toe, derived from [22]. Our results [33] paint a similar picture:
Adiar is only 2.50 times slower than Sylvan for Sylvan’s largest solved instance;
Sylvan only creates BDDs of up to 34.4 GiB in size, whereas Adiar constructs
a 902 GiB sized BDD; Adiar only slows down by a factor of 2.49 per processed
node when using the disk extensively to solve the larger instances.
5.2 Combinatorial Circuit Verification
The EPFL Combinational Benchmark Suite [2] consists of 23 combinatorial cir-
cuits designed for logic optimisation and synthesis. 20 of these are split into the
two categories random/control and arithmetic, and each of these original cir-
cuits C_o is distributed together with one circuit optimised for size C_s and one
circuit optimised for depth C_d. The last three are the More than a Million Gates
benchmarks, which we will ignore as they come without optimised versions.
Based on the approach of the Nanotrav program as distributed with CUDD,
we verify the functional equivalence between each output gate of C_o and the
corresponding gate in each optimised circuit C_d and C_s. The BDDs are com-
puted by representing every input gate by a decision variable and computing the
BDD of all other gates from the BDDs of their input wires. Finally, the BDDs
for every pair of corresponding output gates are tested for equality. Memoisation
ensures that the same gate is not computed twice, while a reference counter is
maintained for each gate such that dead BDDs in the memoisation table may
be garbage collected. Recall that Adiar stores each BDD in a separate file, while
Sylvan and CUDD share nodes between different BDDs in a forest.
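The following sketch outlines this gate-by-gate construction with memoisation, under the same assumptions about Adiar's API as before; the netlist representation is illustrative, and the per-gate reference counting for garbage collection is omitted.

#include <adiar/adiar.h>
#include <unordered_map>
#include <vector>

// Illustrative netlist: 'i' = input (in0 is the decision variable),
// '!' = NOT (in0), '&' = AND and '|' = OR (in0 and in1).
struct Gate { char op; int in0; int in1; };

adiar::bdd gate_bdd(int g, const std::vector<Gate> &circuit,
                    std::unordered_map<int, adiar::bdd> &memo) {
  if (auto it = memo.find(g); it != memo.end()) { return it->second; }
  const Gate &gate = circuit[g];
  const adiar::bdd res = [&]() -> adiar::bdd {
    switch (gate.op) {
      case 'i': return adiar::bdd_ithvar(gate.in0);
      case '!': return ~gate_bdd(gate.in0, circuit, memo);
      case '&': return gate_bdd(gate.in0, circuit, memo)
                     & gate_bdd(gate.in1, circuit, memo);
      default : return gate_bdd(gate.in0, circuit, memo)
                     | gate_bdd(gate.in1, circuit, memo);
    }
  }();
  memo.insert({g, res}); // memoisation: each gate is computed only once
  return res;
}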
Table 2 shows the number of verified instances with each BDD package within
a 15-day time limit. Adiar is able to verify three more benchmarks than both
other BDD packages. This is despite the fact that most instances include hun-
dreds of concurrent BDDs, while the disk is only 12 times larger than main
memory. For example, the largest verified benchmark, mem_ctrl, has up to 1231
BDDs existing at the same time.
Table 3 shows the time it took Adiar to verify equality between the original
and each of the optimised circuits, for the three largest cases verified. The table
also shows the sum of the sizes of the output BDDs that represent each circuit.
Throughout all solved benchmarks, equality checking took less than 1.47% of
the total construction time, and the O(N/B) algorithm could be used in 71.6%
of all BDD comparisons. The voter benchmark with its single output shows that
the O(N/B) algorithm is about 10 times faster than the O(sort(N)) algorithm
and can compare at least 2 · 5.75 MiB / 0.006 s = 1.86 GiB/s.

         # solved   # out-of-space   # time-out
Adiar       23             6             11
CUDD        20            19              1
Sylvan      20            13              7

Table 2: Number of verified arithmetic and random/control circuits from [2].

(a) mem_ctrl
             depth    size
Time (s)      5862    5868
O(sort(N))     496     476
O(N/B)         735     755
N (MiB)         614313

(b) sin
             depth    size
Time (s)      3.89    3.27
O(sort(N))      22      22
O(N/B)           3       3
N (MiB)           3589

(c) voter
             depth    size
Time (s)     0.058   0.006
O(sort(N))       1       0
O(N/B)           0       1
N (MiB)           5.74

Table 3: Running time for equivalence testing. O(sort(N)) and O(N/B) give the
number of times the respective algorithm from Section 3.3 was used.
6 Conclusions and Future Work
Adiar provides an I/O-efficient implementation of BDDs. The iterative BDD
algorithms exploit a topological ordering of the BDD nodes in external memory,
by use of priority queues and sorting algorithms. All recursion requests for a
single node are processed together, eliminating the need for a memoisation table.
The performance of Adiar is very promising in practice for instances larger
than a few hundred MiB. As the size of the BDDs increases, the performance of
Adiar gets closer to that of conventional recursive BDD implementations: for
BDDs larger than a few GiB, the use of Adiar has at most resulted in a 3.69 factor
slowdown. Simultaneously, the design of our algorithms allows us to compute on
BDDs that outgrow main memory with only a 2.49 factor slowdown, which is
negligible compared to the use of swap memory with conventional BDD packages.
This performance comes at the cost of Adiar not being able to share nodes
between BDDs. Yet, this increase in space usage is not a problem in practice,
and it makes garbage collection a trivial and cheap deletion of files on disk. On
the other hand, the lack of sharing makes it impossible to check for functional
equivalence with a mere pointer comparison. Instead, one has to explicitly check
whether the two DAGs are isomorphic. We have improved the asymptotic and
practical performance of equality checking such that it is negligible in practice.
This lays the foundation on which we intend to develop external memory ver-
sions of the BDD algorithms that are still missing for symbolic model checking.
Specifically, we intend to improve the performance of quantifying multiple vari-
ables and to design a relational product operation. Furthermore, we will improve
performance for small instances that fit entirely into internal memory.
Acknowledgements
Thanks to the late Lars Arge, to Gerth S. Brodal, and to Mathias Rav for
their inputs. Furthermore, thanks to the Centre for Scientific Computing Aarhus
(phys.au.dk/forskning/cscaa/) for running our experiments on their cluster.
References
1. Aggarwal, A., Vitter, J.S.: The input/output complexity of sorting
and related problems. Communications of the ACM 31(9), 1116–1127 (1988).
https://doi.org/10.1145/48529.48535
2. Amarú, L., Gaillardon, P.E., De Micheli, G.: The EPFL combinational benchmark
suite. In: 24th International Workshop on Logic & Synthesis (2015)
3. Amparore, E., Donatelli, S., Gallà, F.: A CTL* model checker for Petri nets. In:
Application and Theory of Petri Nets and Concurrency. Lecture Notes in Computer
Science, vol. 12152, pp. 403–413. Springer (2020). https://doi.org/10.1007/978-3-030-51831-8_21
4. Arge, L.: The buffer tree: A new technique for optimal I/O-algorithms. In:
Workshop on Algorithms and Data Structures (WADS). Lecture Notes in
Computer Science, vol. 955, pp. 334–345. Springer, Berlin, Heidelberg (1995).
https://doi.org/10.1007/3-540-60220-8_74
5. Arge, L.: The I/O-complexity of ordered binary-decision diagram manipu-
lation. In: 6th International Symposium on Algorithms and Computations
(ISAAC). Lecture Notes in Computer Science, vol. 1004, pp. 82–91 (1995).
https://doi.org/10.1007/BFb0015411
6. Arge, L.: The I/O-complexity of ordered binary-decision diagram manipulation. In: BRICS RS
preprint series. vol. 29. Department of Computer Science, University of Aarhus
(1996). https://doi.org/10.7146/brics.v3i29.20010
7. Arge, L.: External geometric data structures. In: 10th International Computing
and Combinatorics Conference (COCOON). Lecture Notes in Computer Science,
vol. 3106 (2004). https://doi.org/10.1007/978-3-540-27798-9_1
8. Ashar, P., Cheong, M.: Efficient breadth-first manipulation of binary de-
cision diagrams. In: IEEE/ACM International Conference on Computer-
Aided Design (ICCAD). pp. 622–627. IEEE Computer Society Press (1994).
https://doi.org/10.1109/ICCAD.1994.629886
9. Brace, K.S., Rudell, R.L., Bryant, R.E.: Efficient implementation of a BDD pack-
age. In: 27th Design Automation Conference (DAC). pp. 40–45. Association for
Computing Machinery (1990). https://doi.org/10.1109/DAC.1990.114826
10. Bryant, R.E.: Graph-based algorithms for Boolean function manip-
ulation. IEEE Transactions on Computers C-35(8), 677–691 (1986).
https://doi.org/10.1109/TC.1986.1676819
11. Bryant, R.E.: Symbolic Boolean manipulation with ordered binary-
decision diagrams. ACM Computing Surveys 24(3), 293–318 (1992).
https://doi.org/10.1145/136035.136043
12. Bryant, R.E.: Binary decision diagrams. In: Clarke, E.M., Henzinger, T.A., Veith,
H., Bloem, R. (eds.) Handbook of Model Checking, pp. 191–217. Springer Inter-
national Publishing, Cham (2018). https://doi.org/10.1007/978-3-319-10575-8
13. Chen, J., Revels, J.: Robust benchmarking in noisy environments. arXiv (2016),
https://arxiv.org/abs/1608.04295
14. Cimatti, A., Clarke, E., Giunchiglia, E., Giunchiglia, F., Pistore, M., Roveri, M.,
Sebastiani, R., Tacchella, A.: NuSMV 2: An opensource tool for symbolic model
checking. In: International Conference on Computer Aided Verification (CAV).
Lecture Notes in Computer Science, vol. 2404, pp. 359–364. Springer, Berlin,
Heidelberg (2002). https://doi.org/10.1007/3-540-45657-0_29
15. Cimatti, A., Clarke, E., Giunchiglia, F., Roveri, M.: NuSMV: a new symbolic model
checker. International Journal on Software Tools for Technology Transfer 2, 410–
425 (2000). https://doi.org/10.1007/s100090050046
16. Van Dijk, T., Van de Pol, J.: Sylvan: multi-core framework for decision diagrams.
International Journal on Software Tools for Technology Transfer 19, 675–696
(2016). https://doi.org/10.1007/s10009-016-0433-2
17. Gammie, P., Van der Meyden, R.: MCK: Model checking the logic of knowledge.
In: Computer Aided Verification. Lecture Notes in Computer Science, vol. 3114,
pp. 479–483. Springer, Berlin, Heidelberg (2004). https://doi.org/10.1007/978-3-540-27813-9_41
18. He, L., Liu, G.: Petri net based symbolic model checking for computation tree logic
of knowledge. arXiv (2020), https://arxiv.org/abs/2012.10126
19. Kant, G., Laarman, A., Meijer, J., Van de Pol, J., Blom, S., Van Dijk, T.:
LTSmin: High-performance language-independent model checking. In: Tools and
Algorithms for the Construction and Analysis of Systems (TACAS). Lecture Notes
in Computer Science, vol. 9035, pp. 692–707. Springer, Berlin, Heidelberg (2015).
https://doi.org/10.1007/978-3-662-46681-0_61
20. Karplus, K.: Representing Boolean functions with if-then-else DAGs. Tech. rep.,
University of California at Santa Cruz, USA (1988)
21. Klarlund, N., Rauhe, T.: BDD algorithms and cache misses. In: BRICS Report
Series. vol. 26 (1996). https://doi.org/10.7146/brics.v3i26.20007
22. Kunkle, D., Slavici, V., Cooperman, G.: Parallel disk-based computa-
tion for large, monolithic binary decision diagrams. In: 4th International
Workshop on Parallel Symbolic Computation (PASCO). pp. 63–72 (2010).
https://doi.org/10.1145/1837210.1837222
23. Lind-Nielsen, J.: BuDDy: A binary decision diagram package. Tech. rep., Depart-
ment of Information Technology, Technical University of Denmark (1999)
24. Lomuscio, A., Qu, H., Raimondi, F.: MCMAS: an open-source model checker for
the verification of multi-agent systems. International Journal on Software Tools for
Technology Transfer 19, 9–30 (2017). https://doi.org/10.1007/s10009-015-0378-x
25. Long, D.E.: The design of a cache-friendly BDD library. In: Proceedings of the
1998 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
pp. 639–645. Association for Computing Machinery (1998)
26. Minato, S.i., Ishihara, S.: Streaming BDD manipulation for large-scale combinato-
rial problems. In: Design, Automation and Test in Europe Conference and Exhi-
bition. pp. 702–707 (2001). https://doi.org/10.1109/DATE.2001.915104
27. Minato, S.i., Ishiura, N., Yajima, S.: Shared binary decision diagram with at-
tributed edges for efficient Boolean function manipulation. In: 27th Design Au-
tomation Conference (DAC). pp. 52–57. Association for Computing Machinery
(1990). https://doi.org/10.1145/123186.123225
28. Ochi, H., Yasuoka, K., Yajima, S.: Breadth-first manipulation of very
large binary-decision diagrams. In: International Conference on Computer
Aided Design (ICCAD). pp. 48–55. IEEE Computer Society Press (1993).
https://doi.org/10.1109/ICCAD.1993.580030
29. Petersen, L.H.: External Priority Queues in Practice. Master’s thesis, Department
of Computer Science, University of Aarhus (2007)
30. Sanders, P.: Fast priority queues for cached memory. ACM Journal of Experimental
Algorithmics 5, 7–32 (2000). https://doi.org/10.1145/351827.384249
31. Sanghavi, J.V., Ranjan, R.K., Brayton, R.K., Sangiovanni-Vincentelli, A.: High
performance BDD package by exploiting memory hierarchy. In: 33rd Design Au-
tomation Conference (DAC). pp. 635–640. Association for Computing Machinery
(1996). https://doi.org/10.1145/240518.240638
32. Sølvsten, S.C., Van de Pol, J.: Adiar v1.0.1 : TACAS 2022 artifact. Zenodo (2021).
https://doi.org/10.5281/zenodo.5638335
33. Sølvsten, S.C., Van de Pol, J., Jakobsen, A.B., Thomasen, M.W.B.: Efficient binary
decision diagram manipulation in external memory. arXiv (2021), https://arxiv.
org/abs/2104.12101
34. Somenzi, F.: CUDD: CU decision diagram package, 3.0. Tech. rep., University of
Colorado at Boulder (2015)
35. Velev, M.N., Gao, P.: Efficient parallel GPU algorithms for BDD ma-
nipulation. In: 19th Asia and South Pacific Design Automation Con-
ference (ASP-DAC). pp. 750–755. IEEE Computer Society Press (2014).
https://doi.org/10.1109/ASPDAC.2014.6742980
36. Vengroff, D.E.: A Transparent Parallel I/O Environment. In: DAGS Symposium
on Parallel Computation. pp. 117–134 (1994)
Open Access This chapter is licensed under the terms of the Creative Commons
Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/),
which permits use, sharing, adaptation, distribution and reproduction in any medium
or format, as long as you give appropriate credit to the original author(s) and the
source, provide a link to the Creative Commons license and indicate if changes were
made.
The images or other third party material in this chapter are included in the
chapter’s Creative Commons license, unless indicated otherwise in a credit line to the
material. If material is not included in the chapter’s Creative Commons license and
your intended use is not permitted by statutory regulation or exceeds the permitted
use, you will need to obtain permission directly from the copyright holder.
Forest GUMP: A Tool for Explanation
Alnis Murtovi, Alexander Bainczyk, and Bernhard Steffen
Chair for Programming Systems, TU Dortmund University, Dortmund, Germany
{alnis.murtovi,alexander.bainczyk,bernhard.steffen}@tu-dortmund.de
Abstract. In this paper, we present Forest GUMP (for Generalized,
Unifying Merge Process), a tool for providing tangible experience with
three concepts of explanation. Besides the well-known model explanation
and outcome explanation, Forest GUMP also supports class character-
ization, i.e., the precise characterization of all samples with the same
classification. Key technology to achieve these results is algebraic aggre-
gation, i.e., the transformation of a Random Forest into a semantically
equivalent, concise white-box representation in terms of Algebraic Deci-
sion Diagrams (ADDs). The paper sketches the method and illustrates
the use of Forest GUMP along an illustrative example taken from the
literature. This way readers should acquire an intuition about the tool
and the way it should be used to increase the understanding not only
of the considered dataset, but also of the character of Random Forests
and the ADD technology, here enriched to comprise infeasible path
elimination.
Keywords: Random Forest, Binary/Algebraic Decision Diagram, Ag-
gregation, Infeasible Paths, Explainability, Random Seed
1 Introduction
Random Forests are one of the most widely known classifiers in machine learn-
ing [3,17]. The method is easy to understand and implement, and at the same
time achieves impressive classification accuracies in many applications. Com-
pared to other methods, Random Forests are fast to train and they are clearly
more suitable for smaller datasets. In contrast to a single decision tree, Random
Forests, a collection of many trees, do not overfit as easily on a dataset and their
variance decreases with their size. On the other hand, Random Forests are con-
sidered black-box models because of their highly parallel nature: following the
execution of Random Forests means, in particular, following the execution in all
the involved trees. Such black-box executions are hard to explain to a human
user even for very small examples.
In contrast, decision trees are considered white-box models because of their
sequential evaluation nature. Even if a tree is large in size, a human can easily
follow its computation step by step by evaluating (simple) decisions at each node
from the root to a leaf. Indeed, the set of decisions along such an execution path
precisely explains why a certain choice has been taken.
Popular methods towards explainability try to establish some user intuition.
For example, they may hint at the most influential input data, like highlighting
or framing the area of a picture where a face has been identified. Such informa-
tion is very helpful, and it helps in particular to reveal some of the “popular”
drastic mismatches incurred by neural networks: if the framed area of the image
does not contain the “tagged” object, the identification is clearly questionable.
However, even in a correct classification, the tag by itself gives no reason why
the identification is indeed correct.
More ambitious are methods that try to turn black-box models into white-
box models, ideally preserving the semantics of the classification function. For
Random Forests this has been achieved for the first time in [10,14] using the ‘ag-
gregating power’ of Algebraic Decision Diagrams (ADDs) and Binary Decision
Diagrams (BDDs). ADDs are essentially decision trees whose leaves are labelled
with elements of some algebra, whereas BDDs are the special case for the al-
gebra of Boolean values. Lifting the algebraic operations from the leaves to the
entire ADDs/BDDs allows one to aggregate entire Random Forests into single
semantically equivalent ADDs, the precondition for solving three explainability
problems:
– The Model Explanation Problem [15], i.e. the problem of making the model
as a whole interpretable, is solved in terms of an ADD that specifies pre-
cisely the same classification function as the original Random Forest (cf.
Section 6.2).
– The Class Characterization Problem, i.e. the problem of characterizing, for
a given class c, the set of all samples that are classified by the Random Forest
as c. This problem is solved in terms of a BDD which precisely characterizes
this set of samples (cf. Section 6.3).
– The Outcome Explanation Problem [15], i.e. the problem of explaining a con-
crete classification, is solved in terms of a minimal conjunction of (negated)
decisions that are sufficient to guide the sample into the considered class (cf.
Section 6.4).
In this paper, we present Forest GUMP (for Generalized, Unifying Merge Pro-
cess), a tool for providing a tangible experience with the described concepts of
explanation. Experimentation with Forest GUMP not only yields semantically
equivalent, concise white-box representations for a given Random Forest
which reveal characteristics of the underlying datasets, but it also allows one to
experience, e.g., the impact of random seeds on both the quality of prediction
and the size of the explaining models (cf. Section 6). Our implementation relies
on the standard Random Forest implementation in Weka [28] and on the ADD
implementation of the ADD-Lib [9,12,26]. For a more detailed description of the
transformations and a quantitative analysis we refer the reader to [10,11,14].
Related Work: Various methods for making Random Forests interpretable exist,
such as extracting decision rules from the considered black-box model [6], meth-
ods that are agnostic to the black-box model under consideration [20,24], or
deriving a single decision tree from the black-box model [5,7,16,27,29]. In this
context, single decision trees are considered key to a solution of both the model
explanation and the outcome explanation problem. State-of-the-art solutions to de-
rive a single decision tree from a Random Forest are approximative [5,7,16,27,29].
Thus, their derived explanations are not fully faithful to the original semantics
of the considered Random Forest. This is in contrast to our ADD-based aggre-
gation, which precisely reflects the semantics of the original Random Forest.
After a short introduction to Random Forests in Section 2, we present our ap-
proach to their aggregation in Section 3, which is followed by an elimination
of redundant predicates from the decision diagrams in Section 4 and a non-
compositional abstraction in Section 5. Section 6 introduces Forest GUMP and
solutions to the three explainability problems. In the end, we summarize the
lessons we have learned using Forest GUMP in Section 7, which is followed by a
conclusion and directions for future work in Section 8.
2 Random Forests
Learning Random Forests is a quite popular and algorithmically relatively sim-
ple classification technique that yields good results for many real-world appli-
cations. Its decision model generalises a training dataset that holds examples
of input data labelled with the desired output, also called class. As its name
suggests, an ensemble of decision trees constitutes a Random Forest. Each of
these trees is itself a classifier that was learned from a random sample of the
training dataset. Consequently, all trees are different in structure, they represent
different decision functions, and can yield different decisions for the very same
input data.
To apply a Random Forest to previously unseen input data, every decision
tree is evaluated separately: Tracing the trees from their root down to one of the
leaves yields one decision per tree, i.e. the predicted class. The overall decision
of the Random Forest is then derived as the most frequently chosen class, an
aggregation commonly referred to as majority vote. The key advantage of this
approach is, compared to single decision trees, the reduced variance. A detailed
introduction to Random Forests, decision trees, and their learning procedures
can be found in [3,17,23].
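As a minimal sketch of this evaluation scheme (with an illustrative flattened tree representation, not Weka's):

#include <vector>

struct Node { int feature; double threshold; int low; int high; int label; };
using Tree = std::vector<Node>; // node 0 is the root; label >= 0 marks a leaf

int evaluate(const Tree &t, const std::vector<double> &sample) {
  int n = 0;
  while (t[n].label < 0) { // trace the tree from its root down to a leaf
    n = sample[t[n].feature] < t[n].threshold ? t[n].low : t[n].high;
  }
  return t[n].label; // the predicted class of this tree
}

int majority_vote(const std::vector<Tree> &forest,
                  const std::vector<double> &sample, int num_classes) {
  std::vector<int> votes(num_classes, 0);
  for (const Tree &t : forest) { ++votes[evaluate(t, sample)]; }
  int best = 0;
  for (int c = 1; c < num_classes; ++c) {
    if (votes[c] > votes[best]) { best = c; }
  }
  return best; // the most frequently chosen class
}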
In this paper, we use Weka [28] as our reference implementation of Random
Forests. However, our approach does not depend on implementation details and
can be easily adapted to other implementations.
Figure 1 shows a small Random Forest that was learned from the popular
Iris dataset [8]. The dataset lists dimensions of Iris flowers’ sepals and petals
for three different species. Using this forest to decide the species on the basis
of given measurements requires one to first evaluate the three trees individually
and to subsequently determine the majority vote. This effort clearly grows linearly
with the size of the forest. In the following we use this example to illustrate our
approach of forest aggregation for explainability.
Fig. 1. Random Forest learned from the Iris dataset [8] (39 nodes): three decision
trees branching on thresholds over petallength, petalwidth, sepallength and
sepalwidth, with leaves Iris-setosa, Iris-versicolor and Iris-virginica.
The key idea behind our approach is to partially evaluate the Random Forest
at construction time, which, in particular, eliminates redundancies between the
individual trees of a Random Forest. E.g., in our accompanying Iris flower exam-
ple (cf. Fig. 1) the predicate petalwidth < 1.65 is used in all three trees. This
can easily lead to cases where the same predicate is evaluated many times in the
classification process. The partial evaluation proposed in this paper transforms
Random Forests into decision structures where such redundancies are totally
eliminated.
An adequate data structure to achieve this goal for binary decisions is the
Binary Decision Diagram [1,4,19] (BDD): For a given predicate ordering, BDDs
constitute a normal form where each predicate is evaluated at most once, and
only if required to determine the final outcome.
Algebraic Decision Diagrams (ADDs) [2] generalise BDDs to capture func-
tions of the type B^P → C^n, which are exactly what we need to specify the
semantics of Random Forests for a classification domain C. Moreover, in analogy
to BDDs, which inherit the algebraic structure of their co-domain B, ADDs also
inherit the algebraic structure of their co-domains if available.
We exploit this property during the partial evaluation of Random Forests by
considering the class vector co-domain (cf. Sect. 3). The aggregation to achieve
the corresponding optimised decision structures is then a straightforward conse-
quence of the used ADD technology.
3 Class Vector Aggregation
Class vectors faithfully represent the information about how many trees of the
original Random Forest voted for a certain outcome. Obviously, this informa-
tion is sufficient to obtain the precise results of a corresponding majority vote.
Formally, the domain of class vectors forms a monoid

V := (ℕ^|C|, +, 0)

where addition + is defined component-wise and 0 is the neutral element.
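In code, this monoid can be sketched as follows (types and names are illustrative, not Forest GUMP's actual implementation):

#include <cstddef>
#include <vector>

using ClassVector = std::vector<unsigned>; // one counter per class in C

// The monoid operation '+': component-wise addition of class vectors.
ClassVector operator+(const ClassVector &a, const ClassVector &b) {
  ClassVector sum(a.size());
  for (std::size_t c = 0; c < a.size(); ++c) { sum[c] = a[c] + b[c]; }
  return sum;
}

// The neutral element 0: the all-zero class vector.
ClassVector zero(std::size_t num_classes) { return ClassVector(num_classes, 0); }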
Fig. 2. Class vector aggregation of the Random Forest (83 nodes): an ADD over
the forest’s threshold predicates whose terminals are class vectors such as
(Iris-setosa=2; Iris-versicolor=0; Iris-virginica=1).
Fig. 3. Class vector aggregation of the Random Forest without semantically redundant
nodes (43 nodes).
With the compositionality of the algebraic structure V and the corresponding
ADDs D_V, we can transform any Random Forest incrementally into a semanti-
cally equivalent ADD. Starting with the empty Random Forest, i.e. the neutral
element 0, we consider one tree after the other, aggregating a growing sequence
of decision trees until the entire forest is entailed in the new decision diagram.
The details of this transformation are described in [14]. Figure 2 shows the result
of this transformation for our running example.
4 Infeasible Path Elimination
When aggregating the trees of a Random Forest, one has to account for the
fact that they all use varying sets of predicates. In contrast to simple Boolean
variables, predicates are not independent of one another, i.e. the evaluation of one
predicate may yield some degree of knowledge about other predicates. E.g., the
predicate petallength < 2.45 induces knowledge about other predicates that
reason about petallength: When the petal length is smaller than 2.45, it cannot
possibly be greater than or equal to 2.7 at the same time. This is not taken care
of by the symbolic treatment of predicates we followed until now. In fact,
predicates are typically considered independent in the ADD/BDD community.
Infeasible path elimination, as illustrated by the difference between Figure 2
and Figure 3 for our running example, leverages the potential of a semantic
treatment of predicates, with significant effect on the size of the resulting ADDs.
In fact, the experiments with thousands of trees reported in [14] would not have
been successful without infeasible path elimination.
Please note that infeasible path elimination
– is only required after aggregation: The trees in the original Random Forest
have no infeasible paths by construction. They are introduced in the course
of our symbolic aggregation, which is insensitive to semantic properties.
– is compositional and can therefore be applied during the stepwise transfor-
mation, before the final most frequent label abstraction (cf. Sect. 5), and at
the very end.
– does not support normal forms: Whereas class vector abstraction is canon-
ical for a given variable ordering, infeasible path elimination is not! Thus
our approach may yield different decision diagrams depending on the order
of tree aggregation. It is guaranteed, however, that the resulting decision
diagrams are minimal.
Infeasible path elimination is a hard problem in general.1 Our corresponding
implementation uses SMT solving [21] to eliminate all infeasible paths. An in-
depth discussion of infeasible path elimination is a topic in its own right and
beyond the scope of this paper.
Class vector aggregation and infeasible path elimination are both compositional
and can therefore be applied in arbitrary order without changing the seman-
tics. The majority vote at compile time described in the next section is not
compositional and must therefore be applied at the very end.
5 Majority Vote at Compile Time
As mentioned above, maintaining the information about the result of the major-
ity votes is not compositional. In fact, knowing the result of the majority votes
for two Random Forests gives no clue about the majority vote of the combined
forest. Thus the majority vote abstraction can only be applied at the very end,
after the entire aggregation has been computed compositionally.
The result of the compositional aggregation process, including infeasible path
elimination, is a decision diagram d ∈ D_V with class vectors in its terminal nodes.
The majority vote abstraction ∆_C : D_V → D_C can now be defined as the lifted
version of the majority vote abstraction δ_C on class vectors v ∈ ℕ^|C| (cf. [14]):

δ_C(v) := arg max_{c ∈ C} v_c.

Note that δ_C does not project into the same carrier set but rather from one
algebraic structure V into another, C. However, these transformations can be
applied to the corresponding decision diagrams in the very same way. Fig. 4
shows the result of the most frequent class abstraction for our running example.
1 For the cases considered here it is polynomial, but there are of course theories for
which it becomes exponentially hard or even undecidable.
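In code, δ_C is a simple argmax over a class vector; the following illustrative sketch applies it to a single vector, and applying it to every terminal of d yields ∆_C(d):

#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// delta_C: the index of the most frequent class in a class vector.
// std::max_element breaks ties towards the smaller class index.
std::size_t delta_C(const std::vector<unsigned> &v) {
  return std::distance(v.begin(), std::max_element(v.begin(), v.end()));
}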
Fig. 4. Most frequent label abstraction of the aggregated Random Forest (majority
vote) without semantically redundant nodes (18 nodes).
6 Forest GUMP and Three Problems of Explainability
Forest GUMP2 (Generalized, Unifying Merge Process) is a tool we developed to
illustrate the power of algebraic aggregation for the optimization and explanation
of Random Forests. It is designed to allow everyone, in particular people without
IT or machine learning knowledge, to experience the nature of Random Forests.
To avoid unnecessary entry hurdles, we decided to implement Forest GUMP as
a simple-to-use web application. It allows the user to experience the methods
described in the previous sections and the proposed solutions to the explainability
problems, which will be illustrated in the following sections. We will first give a
brief overview of Forest GUMP and then showcase its potential in the following
sections.
Forest GUMP’s user interface (see Figure 5) is essentially divided into two
parts. On the left side the user can input the necessary data to learn a Random
Forest and subsequently visualize it, while the currently chosen representation
is visualized on the right side. First, the user has to upload a dataset or
choose one of six datasets that we provide (cf. (1) in Fig. 5) on which the Random
Forest will be learned. Next, the hyperparameters necessary for the learning
procedure have to be selected, such as the number of trees to be learned (cf.
(2) in Fig. 5). Then, one can choose different aggregation methods, i.e. the ones
mentioned in the previous sections and further ones which will be explained in
the following sections (cf. (3) in Fig. 5). It is also possible to input a sample,
classify it with the ADD, and highlight the path from the root to the leaf (satisfied
predicates are highlighted in green, unsatisfied predicates are highlighted in red).
In the end, the currently visualized ADD can be exported, as Forest GUMP
provides code generators for Java, C++, Python and GraphViz’s dot format (cf.
(4) in Fig. 5). Additionally, the currently visualized ADD can be exported as an
SVG to be viewed locally (cf. (4) in Fig. 5).
The grey rectangle (cf. (6) in Fig. 5) points to the root of the currently
visualized ADD. One can zoom in and out, which can be helpful when the ADDs
are rather large (cf. (6) in Fig. 5). On the top left, the number of nodes and
the length of the currently highlighted path are displayed (cf. (7) in Fig. 5). On
the bottom right, one can open a history of all the representations one chose to
visualize (cf. (8) in Fig. 5).

2 A link to a running instance of Forest GUMP is available at https://gitlab.com/scce/forest-gump.

Fig. 5. Overview of Forest GUMP. The visualized ADD is our solution to the class
characterization problem (cf. Sect. 6.3) for the class Iris-Setosa.

Fig. 6. The execution history in Forest GUMP.
Figure 6 shows the expanded execution history. For each visualized ADD, the
execution history lists the aggregation variant, the hyperparameters used to learn
the Random Forest, the size (i.e. the number of nodes), and the maximum
depth, which is the longest path from root to leaf. The execution history also
allows one to replay an experiment by clicking on the button on the right side of
a row, which makes it possible to compare different ADD variants. One can also
delete individual entries or the whole history, and export the history to a CSV.

Fig. 7. The user can either choose to upload their own dataset or select one of six
exemplary datasets.
6.1 A Walkthrough of Forest GUMP
In the following we will see how hard it is to understand how a Random Forest
comes to its decision, and provide methods for solving the three explainability
problems with absolute precision.
Learning a Random Forest To begin, we need a Random Forest, which re-
quires a dataset on which it will be learned. In Forest GUMP, the user can
upload their own dataset in the Attribute-Relation File Format (ARFF) [28].
Alternatively, we provide six exemplary datasets from which a user can select
one to directly start using the tool. Figure 7 illustrates how this looks in
Forest GUMP. Having chosen a dataset, the hyperparameters necessary for
the learning procedure of the Random Forest have to be specified next (see Figure 8).
The inputs are the following:
– the number of trees to be learned,
– the bagging size, i.e. the fraction of samples to be used to learn each tree, and
– a seed to be able to reproduce the setting.3
Additionally, the user can decide to eliminate the infeasible paths, as this can
strongly reduce the size of the ADDs (see Section 4). While the predicate order is
fixed by default, the user can decide to let Forest GUMP optimize the predicate
order, as the order can also greatly impact the size of the ADDs. A more in-
depth discussion on the interplay between the infeasible path elimination and
the predicate order will follow. Figure 9 shows a Random Forest that was learned
on the Iris dataset, consisting of 20 trees4, a bagging size of 100% and 58 as the
seed. If we now want to classify a given input, for each tree we would have to
traverse from the root to the leaf and receive one predicted class per tree. The
class which was predicted most often is the final result. Trying to understand
why the Random Forest predicted this specific class is seemingly impossible. In
the following we will show how we can do better.

3 One can generate a random seed by clicking on the button next to the input field.

Fig. 8. The user has to specify the necessary hyperparameters to be able to learn
a Random Forest. While the first three hyperparameters are needed for the learning
procedure, the elimination of the infeasible paths and the optimization of the predicate
order are specific to our aggregation method.
6.2 Model Explanation Problem
The canonical white-box model corresponding to the Random Forest of Figure 9
can be constructed through the most frequent label abstraction (see Sect. 5) of
the aggregated Random Forest (see Sect. 3), whose infeasible paths are elimi-
nated (see Sect. 4). This solves the Model Explanation Problem.
Figure 10 sketches the result of this construction: a canonical white-box
model with 310 nodes. Admittedly, this model is still frightening, but given a
sample, it allows one to easily follow the corresponding classification process,
evaluating along a single path at most 19 decisions on the petal and sepal
characteristics. This decision set is our set of predicates.

4 Note that each decision tree is represented as an ADD.

Fig. 9. A Random Forest consisting of 20 individual decision trees (191 nodes; the
longest path consists of 9 nodes). Note that each decision tree is represented as
an ADD and that all ADDs share common subfunctions, i.e. it is essentially a shared
ADD forest. The actual Random Forest, where nothing is shared, contains 284 nodes.
Fig. 10. An extract of the model explanation. The ADD is constructed from the most
frequent label abstraction of the aggregated Random Forest following an elimination
of all infeasible paths (310 nodes, longest path of length 19; the highlighted path
has length 9).

The conjunction of these predicates is a solution to the Outcome Explanation Problem.
However, more concise explanations are derived from the class characterization
BDD discussed in the following section.
Given the sample petallength = 2.4, petalwidth = 1.8, sepallength = 5.9,
sepalwidth = 2.5, the outcome explanation given by the model explanation con-
sists of the following 9 predicates (in Figure 10 satisfied predicates are highlighted
in green, unsatisfied predicates are highlighted in red):

¬(petalwidth < 0.75) ∧ ¬(petalwidth < 1.7) ∧ (petallength < 4.95) ∧
(sepalwidth < 2.65) ∧ (petallength < 4.85) ∧ (sepallength < 5.95) ∧
¬(petalwidth < 1.75) ∧ (petallength < 2.6) ∧ (petallength < 2.45)
Fig. 11. The class characterization for the class Iris-Setosa (10 nodes, the highlighted
path is also the longest path with length 5). The leaf corresponding to Iris-Setosa
is highlighted in green, the leaf representing all other classes (i.e. Iris-Virginica and
Iris-Versicolor) is highlighted in red.
While this is already an improvement compared to the Random Forest, where
one would have to traverse all 20 decision trees, we will see in the following how
we can improve even more.
6.3 Class Characterization Problem
The class characterization problem is particularly interesting because it allows
one to ‘reverse’ the classification process. While the direct problem is ‘given
a sample, provide its classification’, the reverse problem reads ‘given a class,
what are the characteristics of all the samples belonging to this class?’
BDD-based Class Characterisation can be defined via the following simple
transformation function: Given a class c ∈ C, we define a corresponding projec-
tion function δ_B(c) : C → B on the co-domain as

δ_B(c)(c′) := 1 if c′ = c, and 0 otherwise,

for c′ ∈ C. Again, the function δ_B(c) can be lifted to operate on ADDs, yielding
∆_B(c) : D_C → D_B.
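Sketched in code, with classes represented as indices (again illustrative, not Forest GUMP's actual implementation):

#include <cstddef>

// delta_B(c): the projection C -> B that marks exactly the class c. Applied
// to every terminal of the class-labelled ADD it yields the characterizing BDD.
bool delta_B(std::size_t c, std::size_t c_prime) { return c_prime == c; }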
The BDD shown in Figure 11 is a minimal characterization of the set of all
the samples that are guaranteed to be classified as Iris-Setosa.
Fig. 12. The outcome explanation for the input petallength = 2.4, petalwidth = 1.8,
sepallength = 5.9, sepalwidth = 2.5 (10 nodes, highlighted path of length 5).
Being able to reverse a learned classification function has major practi-
cal importance. Think, e.g., of a marketing research scenario where data have
been collected with the aim to propose best-fitting product offers to customers
according to their user profile. This scenario can be considered as a classification
problem where the offered product plays the role of the class. Now, being able to
reverse the customer-to-product classification function provides the marketing
team with a tailored product-to-customer promotion process: for a given prod-
uct, it addresses all customers considered to favor this very product, as in the
corresponding patent [18].
The path highlighted in Figure 11 is the path from the root to the leaf
for the same sample petallength = 2.4, petalwidth = 1.8, sepallength = 5.9,
sepalwidth = 2.5. Compared to the path of length 9 in the model explanation,
we now have a path of length 5 with the following predicates:

¬(petalwidth < 0.75) ∧ (petallength < 4.95) ∧ (petallength < 4.85) ∧
(petallength < 2.6) ∧ (petallength < 2.45)
6.4 Outcome Explanation Problem
The previous classification formula expresses the collection of ‘conditions’ that
this sample satisfies, and it therefore provides a precise justification why it is
classified in this class. Despite the fact that the class characterization BDD is
canonical, it is easy to see that there are some redundancies in the formula. For
example, a petallength < 2.45 is also inherently smaller than 2.6, 4.85 and 4.95;
therefore, for this specific sample those three predicates are redundant. This is
the result of the imposed predicate ordering in BDDs: all the BDD predicates are
listed, and they are listed in a fixed order. After eliminating these redundancies,
we are left with the following precise minimal outcome explanation: this sample
is recognized as belonging to the class Iris-Setosa because it has the properties

¬(petalwidth < 0.75) ∧ (petallength < 2.45).
In Forest GUMP we make these redundant predicates explicit by highlighting
them in blue (see Figure 12). From 9 predicates in the model explanation and 5
predicates in the class characterization, we have now arrived at an explanation
that consists of only 2 predicates.
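This suppression of redundant predicates is a per-feature tightest-bound computation; the following illustrative sketch (names and types are ours, not Forest GUMP's) keeps, for each feature, only the smallest satisfied threshold and the largest violated one:

#include <map>
#include <optional>
#include <string>
#include <vector>

// A threshold predicate "feature < threshold" on a path, and whether the
// considered sample satisfies it.
struct Pred { std::string feature; double threshold; bool satisfied; };

std::vector<Pred> minimise(const std::vector<Pred> &path) {
  std::map<std::string, std::optional<double>> upper, lower;
  for (const Pred &p : path) {
    // Satisfied "x < t": keep the smallest t (tightest upper bound).
    // Violated "x < t", i.e. "x >= t": keep the largest t (tightest lower bound).
    auto &bound = p.satisfied ? upper[p.feature] : lower[p.feature];
    const bool tighter =
        !bound || (p.satisfied ? p.threshold < *bound : p.threshold > *bound);
    if (tighter) { bound = p.threshold; }
  }
  std::vector<Pred> result;
  for (const auto &[f, t] : upper) { result.push_back({f, *t, true}); }
  for (const auto &[f, t] : lower) { result.push_back({f, *t, false}); }
  return result;
}

For the path of Section 6.3 this yields exactly ¬(petalwidth < 0.75) ∧ (petallength < 2.45).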
7 Lessons Learned
Playing with Forest GUMP led to interesting observations not only concerning
the analyzed data domains but also concerning Random Forest Learning and
the applied ADD technology.
Random Forest Learning. Changing the random seed for the learning process
had a significant impact on the size of the explanation models and the class
characterizations. The observed sizes of the explanation models ranged from 138
to 519. Interestingly, the larger sizes did not necessarily imply a better
prediction quality. The same also applied to the class characterizations. In fact,
we observed a 100% prediction quality for a class characterization of only 3
nodes, while a class characterization for the same species with 40 nodes scored
only 33%.
Analyzed Data Domain. The class characterizations for the three iris species
differed quite a bit. For two species the observed sizes were much bigger than
those of the third species, independently of the chosen random seed and bagging
size. In fact, for Iris-Setosa we observed a class characterization with only 3
nodes, implying an outcome explanation for our chosen sample with only one
predicate. Figure 13 serves for the corresponding explanation. Put differently,
class characterizations seem to be good indications for ‘tightness’: the closer the
samples lie, the more criteria are required for separation.
ADD Technology. ADDs are canonical as soon as one has chosen a predi-
cate/variable ordering. Although we could observe the effect of corresponding
optimization heuristics5, the impact was moderate and helpful mainly for model
explanation and class characterization. Figure 14 shows the outcome ex-
planation for the same problem, but where the ADD representing the class
characterization for the class Iris-Setosa is reordered.6

5 CUDD [25] provides a number of heuristics for optimizing variable orders.
6 The used reordering method is CUDD_REORDER_GROUP_SIFT_CONV, as it was
both fast and effective in our experiments.
Fig. 13. Visualization of the iris dataset using only the petal length and petal width:
a scatter plot of petal length (cm) against petal width (cm) for Iris-Setosa,
Iris-Versicolor and Iris-Virginica.
While the reordering reduces the class characterization size from 10 to 8 nodes,
the length of the outcome explanation is unchanged. For the model explanation
of Figure 10, the size can be reduced from 310 nodes to 196 nodes, while the
path for the sample petallength = 2.4, petalwidth = 1.8, sepallength = 5.9,
sepalwidth = 2.5 actually increased by 1 (from 9 to 10). Thus the outcome
explanation may even be impaired. This is not too surprising, as these
optimizations aim at a size reduction and not a depth reduction of the considered
ADDs. We are currently investigating good heuristics for depth reduction.
More striking was the impact of infeasible path elimination. In fact, this opti-
mization can be regarded as key for scalability when increasing the forest size. [14]
reports results about forests with 10,000 trees. Without infeasible path reduction,
already 100 trees are problematic.
Standard ADD frameworks work on Boolean variables rather than predicates.
Thus in their setting infeasible paths do not occur. The problem of infeasible
path reduction in ADDs was first discussed in [13,14]. Our current corresponding
solution is still basic. We are currently generalizing our solution using more
involved SMT technology.
Of course, these observations were made on rather small datasets, and it has
to be seen how well they transfer to more complex scenarios. We believe, how-
ever, that they indicate general phenomena whose essence remains true in larger
settings.
Fig. 14. The outcome explanation for the input petallength = 2.4, petalwidth = 1.8,
sepallength = 5.9, sepalwidth = 2.5 (8 nodes, highlighted path of length 5), where the
class characterization from Figure 11 is reordered.

8 Conclusion and Perspectives

We have presented Forest GUMP (for Generalized, Unifying Merge Process), a
tool for providing tangible experience with three concepts of explanation: model
explanation, outcome explanation, and class characterization. Key technology to
achieve model explanation is algebraic aggregation, i.e. the transformation of a
Random Forest into a semantically equivalent, concise white-box representation
in terms of Algebraic Decision Diagrams. Class characterization is then achieved
in terms of BDDs where the structure unnecessary to distinguish the considered
class is collapsed. This abstraction is not only interesting in itself to better
understand how easily the classes can be separated, but it also leads to highly
optimized outcome explanations. Together with infeasible path elimination and
the suppression of redundant predicates on a path, we observe reductions of
outcome explanations by more than an order of magnitude. Forest GUMP allows
even newcomers to easily experience these phenomena without much training.
Of course, these are first steps in a very ambitious new direction and it remains
to be seen how far the approach carries. Scalability will probably require decom-
position methods, perhaps in a similar fashion as illustrated by the difference
between model explanation and the considerably smaller class characterization.
More work is also needed on techniques that aim at limiting the number of
involved predicates.
Data Availability Statement: The artifact is available in the Zenodo
repository [22].
References
1. Akers, S.B.: Binary decision diagrams. IEEE Trans. Comput. 27(6), 509–516 (1978)
2. Bahar, R., Frohm, E., Gaona, C., Hachtel, G., Macii, E., Pardo, A., Somenzi, F.:
Algebraic decision diagrams and their applications. In: Proceedings of 1993 Inter-
national Conference on Computer Aided Design (ICCAD). pp. 188–191 (1993).
https://doi.org/10.1109/ICCAD.1993.580054
3. Breiman, L.: Random forests. Machine Learning 45(1), 5–32 (Oct 2001).
https://doi.org/10.1023/A:1010933404324
4. Bryant, R.E.: Graph-based algorithms for boolean function manipulation. IEEE
Trans. Comput. 35(8), 677–691 (1986). https://doi.org/10.1109/TC.1986.1676819
5. Chipman, H.A., George, E.I., McCulloch, R.E.: Making sense of a forest of trees
(1999)
6. Deng, H.: Interpreting tree ensembles with intrees. Int. J. Data Sci. Anal. 7(4),
277–287 (2019). https://doi.org/10.1007/s41060-018-0144-8
7. Domingos, P.M.: Knowledge discovery via multiple models. Intell. Data Anal. 2(1-
4), 187–202 (1998). https://doi.org/10.1016/S1088-467X(98)00023-7
8. Fisher, R.A.: The use of multiple measurements in taxonomic problems. Annals of
eugenics 7(2) (1936)
9. Gossen, F., Margaria, T., Murtovi, A., Naujokat, S., Steffen, B.: Dsls for decision
services: A tutorial introduction to language-driven engineering. In: Margaria, T.,
Steffen, B. (eds.) Leveraging Applications of Formal Methods, Verification and Val-
idation. Modeling - 8th International Symposium, ISoLA 2018, Limassol, Cyprus,
November 5-9, 2018, Proceedings, Part I. Lecture Notes in Computer Science,
vol. 11244, pp. 546–564. Springer (2018). https://doi.org/10.1007/978-3-030-03418-4_33
10. Gossen, F., Margaria, T., Steffen, B.: Towards explainability in ma-
chine learning: The formal methods way. IT Prof. 22(4), 8–12 (2020).
https://doi.org/10.1109/MITP.2020.3005640
11. Gossen, F., Margaria, T., Steffen, B.: Formal methods boost experi-
mental performance for explainable AI. IT Prof. 23(6), 8–12 (2021).
https://doi.org/10.1109/MITP.2021.3123495
12. Gossen, F., Murtovi, A., Linden, J., Steffen, B.: The Java library for algebraic
decision diagrams. https://add-lib.scce.info, accessed: 2022-01-13
13. Gossen, F., Steffen, B.: Large random forests: Optimisation for rapid evaluation.
CoRR abs/1912.10934 (2019), http://arxiv.org/abs/1912.10934
14. Gossen, F., Steffen, B.: Algebraic aggregation of random forests: towards explain-
ability and rapid evaluation. International Journal on Software Tools for Technol-
ogy Transfer (Sep 2021). https://doi.org/10.1007/s10009-021-00635-x
15. Guidotti, R., Monreale, A., Ruggieri, S., Turini, F., Giannotti, F., Pedreschi, D.:
A survey of methods for explaining black box models. ACM Comput. Surv. 51(5),
93:1–93:42 (2019). https://doi.org/10.1145/3236009
16. Hara, S., Hayashi, K.: Making tree ensembles interpretable: A Bayesian model selec-
tion approach. In: Storkey, A.J., Pérez-Cruz, F. (eds.) International Conference on
Artificial Intelligence and Statistics, AISTATS 2018, 9-11 April 2018, Playa Blanca,
Lanzarote, Canary Islands, Spain. Proceedings of Machine Learning Research,
vol. 84, pp. 77–85. PMLR (2018), http://proceedings.mlr.press/v84/hara18a.html
17. Ho, T.K.: Random decision forests. In: Proceedings of 3rd International Confer-
ence on Document Analysis and Recognition. vol. 1, pp. 278–282 vol.1 (1995).
https://doi.org/10.1109/ICDAR.1995.598994
18. Hungar, H., Steffen, B., Margaria, T.: Methods for generating selection structures,
for making selections according to selection structures and for creating selection de-
scriptions. https://patents.justia.com/patent/9141708 (Sep 2015), USPTO Patent
number: 9141708
19. Lee, C.Y.: Representation of switching circuits by binary-decision programs. Bell
System Technical Journal 38(4), 985–999 (1959)
20. Lou, Y., Caruana, R., Gehrke, J.: Intelligible models for classification and
regression. In: Yang, Q., Agarwal, D., Pei, J. (eds.) The 18th ACM
SIGKDD International Conference on Knowledge Discovery and Data Min-
ing, KDD ’12, Beijing, China, August 12-16, 2012. pp. 150–158. ACM (2012).
https://doi.org/10.1145/2339530.2339556
21. de Moura, L., Bjørner, N.: Z3: An efficient SMT solver. In: Ramakrishnan, C.R.,
Rehof, J. (eds.) Tools and Algorithms for the Construction and Analysis of
Systems. pp. 337–340. Springer Berlin Heidelberg, Berlin, Heidelberg (2008).
https://doi.org/10.1007/978-3-540-78800-3_24
22. Murtovi, A., Bainczyk, A., Steffen, B.: Forest GUMP: A tool for explanation (TACAS
2022 artifact) (Nov 2021). https://doi.org/10.5281/zenodo.5733107
23. Quinlan, J.R.: Induction of decision trees. Mach. Learn. 1(1), 81–106 (1986)
24. Ribeiro, M.T., Singh, S., Guestrin, C.: "Why should I trust you?": Explaining
the predictions of any classifier. In: Krishnapuram, B., Shah, M., Smola, A.J.,
Aggarwal, C.C., Shen, D., Rastogi, R. (eds.) Proceedings of the 22nd ACM
SIGKDD International Conference on Knowledge Discovery and Data Mining,
San Francisco, CA, USA, August 13-17, 2016. pp. 1135–1144. ACM (2016).
https://doi.org/10.1145/2939672.2939778
25. Somenzi, F.: CUDD: CU Decision Diagram package release 3.0 (2015)
26. Steffen, B., Gossen, F., Naujokat, S., Margaria, T.: Language-Driven Engineering:
From General-Purpose to Purpose-Specific Languages, pp. 311–344. Springer Inter-
national Publishing, Cham (2019). https://doi.org/10.1007/978-3-319-91908-9_17
27. Van Assche, A., Blockeel, H.: Seeing the forest through the trees: Learning a
comprehensible model from an ensemble. In: Kok, J.N., Koronacki, J., Mantaras,
R.L.d., Matwin, S., Mladenič, D., Skowron, A. (eds.) Machine Learning: ECML
2007. pp. 418–429. Springer Berlin Heidelberg, Berlin, Heidelberg (2007)
28. Witten, I.H., Frank, E., Hall, M.A., Pal, C.J.: Data Mining, Fourth Edition: Prac-
tical Machine Learning Tools and Techniques. Morgan Kaufmann Publishers Inc.,
San Francisco, CA, USA, 4th edn. (2016)
29. Zhou, Y., Hooker, G.: Interpreting models via single tree approximation (2016)
Alpinist: an Annotation-Aware GPU Program Optimizer⋆
Ömer Şakar1(✉), Mohsen Safari1, Marieke Huisman1, and Anton Wijs2
1Formal Methods and Tools, University of Twente, Enschede, The Netherlands
{o.f.o.sakar,m.safari,m.huisman}@utwente.nl
2Software Engineering & Technology, Eindhoven University of Technology,
Eindhoven, The Netherlands
a.j.wijs@tue.nl
Abstract. GPU programs are widely used in industry. To obtain the
best performance, a typical development process involves the manual or
semi-automatic application of optimizations prior to compiling the code.
To avoid the introduction of errors, we can augment GPU programs
with (pre- and postcondition-style) annotations to capture functional
properties. However, keeping these annotations correct when optimizing
GPU programs is labor-intensive and error-prone.
This paper introduces Alpinist, an annotation-aware GPU program op-
timizer. It applies frequently-used GPU optimizations, but besides trans-
forming code, it also transforms the annotations. We evaluate Alpinist,
in combination with the VerCors program verifier, to automatically op-
timize a collection of verified programs and reverify them.
Keywords: GPU · Optimization · Deductive verification · Annotation-aware ·
Program transformation
1 Introduction
Over the course of roughly a decade, graphics processing units (GPUs) have
been pushing the computational limits in fields as diverse as computational biol-
ogy [64], statistics [35], physics [7], astronomy [24], deep learning [29], and formal
methods [17,43,44,65,67]. Dedicated programming languages such as CUDA [34]
and OpenCL [42] can be used to write GPU source code. To achieve the best
performance on GPUs, developers should apply incremental optimizations,
tailored to the GPU architecture. Unfortunately, this is to a large extent a man-
ual activity. The fact that for different GPU devices, the same code tends to
require a different sequence of transformations [21] makes this procedure even
more time consuming and error-prone. Recently, automating this has received
some attention, for instance by applying machine learning [3].
⋆ This work is supported by NWO grant 639.023.710 for the Mercedes project and by
NWO TTW grant 17249 for the ChEOPS project.
© The Author(s) 2022
D. Fisman and G. Rosu (Eds.): TACAS 2022, LNCS 13244, pp. 332–352, 2022.
https://doi.org/10.1007/978-3-030-99527-0_18
[TACAS 2022 Artifact Evaluation badge: Artifacts Available, v1.1]
[Figure: an Annotated Program, together with User-Selected Transformations, enters the Annotation-Aware Program Transformer; the resulting Transformed Annotated Program is passed to a Deductive Program Verifier.]
Fig. 1: Annotation-Aware Program Transformation.
Reasoning about the correctness of GPU software is hard, but necessary. Mul-
tiple verification techniques and tools have been developed to aid in this task,
aimed at detecting data races [8,10,14,32,33]; for a recent overview, see [22].
Some of these techniques apply deductive program verification, which
requires a program to be manually augmented with pre- and postcondition an-
notations. However, annotating a program is time consuming. The more complex
a program is, the more challenging it becomes to annotate it. In particular, as a
program is being optimized repeatedly, its annotations tend to change frequently.
This paper presents Alpinist, a tool that can apply annotation-aware trans-
formations [26] on annotated GPU programs. It can be used with the deductive
program verifier VerCors [9]. VerCors can verify the functional correctness of
GPU programs [10]. It allows the verification of many typical GPU computa-
tions, see e.g., [48,50,51]. The purpose of Alpinist is twofold (see Fig. 1): First, it
automates the optimization of GPU code, to the extent that the developer only
needs to indicate which optimization is to be applied where, and the tool performs
the transformation. Interestingly, Alpinist exploits the presence of annotations
to determine whether an optimization is actually applicable, and in
doing so, it can sometimes apply an optimization where a compiler cannot. Second,
as it applies a code transformation, it also transforms the related annotations,
which means that once the developer has annotated the unoptimized, simpler
code, any further optimized version of that code is automatically annotated with
updated pre- and postconditions, making it reverifiable. This avoids having to
re-annotate the program every time it is optimized for a specific GPU device.
Alpinist supports GPU code optimizations that are used frequently in prac-
tice, namely loop unrolling, tiling, kernel fusion, iteration merging, matrix lin-
earization and data prefetching. In the current paper, we discuss how Alpinist
has been implemented, how it can be applied on annotated GPU code, and how
some of the more complex optimizations work. In addition, we evaluate the ef-
fect of applying several of these optimizations, both in terms of annotation size
and time needed to verify a program, to a collection of examples including the
verified case studies in [48,49,51].
Outline. Section 2 demonstrates how Alpinist optimizes a verified GPU pro-
gram while preserving its provability. Section 3 discusses the architecture of
Alpinist. Section 4 discusses the most complex optimizations supported by
1/*@ context_everywhere N > 0 && N < a.length;
2req (\forall* int i; 0 <= i < a.length; Perm(a[i], 1));
3ens (\forall* int i; 0 <= i < a.length; i != a.length-1 ==> Perm(a[i+1], 1));
4ens (\forall* int i; 0 <= i < a.length; i == a.length-1 ==> Perm(a[0], 1));
5ens (\forall int i; 0 <= i < a.length-1; a[i+1] == N*i);
6ens a[0] == N*(a.length-1); @*/
7void Host(int[] a, int size, int N) {
8par Kernel1 (int tid = 0 .. a.length)
9/*@ context Perm(a[tid], 1);
10 ens a[tid] == 0; @*/
11 { a[tid] = 0; }
12 par Kernel2 (int tid = 0 .. a.length)
13 /*@ context tid != a.length-1 ? Perm(a[tid+1], 1) : Perm(a[0], 1);
14 req tid != a.length-1 ? a[tid+1] == 0 : a[0] == 0;
15 ens tid != a.length-1 ? a[tid+1] == N*tid : a[0] == N*tid; @*/
16 {/*@ inv k >= 0 && k <= N;
17 inv tid != a.length-1 ? Perm(a[tid+1], 1) : Perm(a[0], 1);
18 inv tid != a.length-1 ? a[tid+1] == k*tid : a[0] == k*tid;@*/
19 for(int k = 0; k < N; k++) {
20 if (tid != a.length-1) { a[tid+1] = a[tid+1] + tid; }
21 else { a[0] = a[0] + tid; }
22 }}}
Fig. 2: A verified GPU-style program
Alpinist in detail, namely loop unrolling, tiling and kernel fusion, and briefly
discusses the remaining three. Section 5 presents the results of experiments in
which the tool has been applied on a collection of programs. Section 6 discusses
related work, and Section 7 concludes the paper and discusses future work.
2 Annotation-Aware Optimization using Alpinist
This section illustrates how Alpinist can optimize a verified GPU program while
preserving its provability. Fig. 2 shows a GPU program with annotations [10] that
is verified by VerCors. The example is written in a simplified version of VerCors’
own language PVL. The program initializes an array a, and subsequently updates
the values in a, N times. The workflow of a GPU program in general is that the
host (i.e., CPU) invokes a kernel, i.e., a GPU function, executed by a specified
number of GPU threads. These threads are organized in one or more thread
blocks. In this program, there are two kernels, both executed by one thread
block of a.length threads (l.8 and l.12)3. Each thread has a unique
identifier, in the example called tid. In the first kernel (l.8-l.11), each thread
initializes a[tid] to 0. In the second kernel (l.12-l.22), each thread updates
a[tid+1] (modulo a.length) N times, by adding tid to it. In the main Host
function, Kernel1 is called, followed by Kernel2.
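For orientation, the following sketch (ours; the names and the explicit bounds check are illustrative, and all annotations are omitted) shows roughly how this program would look in plain CUDA, with the implicit global synchronisation between the two kernel launches:

// Hypothetical CUDA rendering of the program of Fig. 2 (annotations omitted).
__global__ void kernel1(int *a, int len) {
  int tid = threadIdx.x;                 // one thread block of `len` threads
  if (tid < len) a[tid] = 0;
}

__global__ void kernel2(int *a, int len, int N) {
  int tid = threadIdx.x;
  if (tid < len) {
    int idx = (tid != len - 1) ? tid + 1 : 0;  // a[tid+1] modulo len
    for (int k = 0; k < N; k++) a[idx] += tid;
  }
}

void host(int *a, int len, int N) {      // a: device pointer
  kernel1<<<1, len>>>(a, len);           // implicit global sync between launches
  kernel2<<<1, len>>>(a, len, N);
}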
The kernels, the for-loop and the host function are annotated for verification
(in blue), using permission-based separation logic [6,11,12]. Permissions capture
which memory locations may be accessed by which threads; they are fractional
values in the interval (0, 1] (cf. Boyland [12]): any fraction in the interval (0,
3 In practice, the size of a block cannot exceed a specific upper bound, but for this
example, we assume that a.length is sufficiently small.
1/*@ context_everywhere N > 0 && N < a.length;
2req (\forall* int i; 0 <= i < a.length; Perm(a[i], 1));
3ens (\forall* int i; 0 <= i < a.length; i != a.length-1 ==> Perm(a[i+1], 1));
4ens (\forall* int i; 0 <= i < a.length; i == a.length-1 ==> Perm(a[0], 1));
5ens (\forall int i; 0 <= i < a.length-1; a[i+1] == N*i);
6ens a[0] == N*(a.length-1); @*/
7void Host(int[] a,int size,int N){
8par Fused_Kernel(int tid = 0 .. a.length)
9/*@ req Perm(a[tid], 1);
10 ens tid != a.length-1 ? Perm(a[tid+1], 1) : Perm(a[0], 1);
11 ens tid != a.length-1 ? a[tid+1] == N*tid : a[0] == N*tid; @*/
12 {
13 a[tid] = 0;
14 /*@ req Perm(a[tid], 1);
15 req a[tid] == 0;
16 ens tid != a.length-1 ? Perm(a[tid+1], 1) : Perm(a[0], 1);
17 ens tid != a.length-1 ? a[tid+1] == 0 : a[0] == 0; @*/
18 barrier(Fused_Kernel)
19
20 int a_reg_0, a_reg_1;
21 if (tid != a.length-1) { a_reg_1 = a[tid+1] } else { a_reg_0 = a[0] }
22 int k = 0;
23 if (tid != a.length-1) { a_reg_1 = a_reg_1 + tid; }
24 else { a_reg_0 = a_reg_0 + tid; }
25 k ++;
26 /*@ inv k >= 0 + 1 && k <= N;
27 inv tid != a.length-1 ? Perm(a[tid+1], 1) : Perm(a[0], 1);
28 inv tid != a.length-1 ? a_reg_1 == k*tid : a_reg_0 == k*tid; @*/
29 for(k; k < N; k++) {
30 if (tid != a.length-1) { a_reg_1 = a_reg_1 + tid; }
31 else { a_reg_0 = a_reg_0 + tid; }
32 }
33 if (tid != a.length-1) { a[tid+1] = a_reg_1 } else { a[0] = a_reg_0 };
34 } }
Fig. 3: An optimized GPU-style program, annotated for verification
1) indicates a read permission, while 1 indicates a write permission. A write
permission can be split into multiple read permissions and read permissions can
be added up, and transformed into a write permission if they add up to 1. The
soundness of the logic ensures that for each memory location, the total number
of permissions among all threads does not exceed 1.
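As a minimal sketch of this accounting (our toy model, not VerCors' implementation), permissions can be viewed as rationals that are halved on splitting and summed on merging:

// Toy model (ours) of fractional-permission accounting: a permission is a
// rational p with 0 < p <= 1; p == 1 encodes write permission.
struct Perm { int num, den; };                    // p = num/den

bool is_write(Perm p) { return p.num == p.den; }

Perm split(Perm p)         { return {p.num, 2 * p.den}; }  // halve p
Perm merge(Perm a, Perm b) {                               // a + b
  return {a.num * b.den + b.num * a.den, a.den * b.den};
}

// Example: splitting a write permission {1,1} yields two read halves {1,2};
// merging them gives {4,4}, i.e., write again. Soundness requires that, per
// location, the permissions held by all threads never sum to more than 1.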
To specify permissions, predicates are used of the form Perm(L, π), where L
is a heap location and π a fractional value in the interval (0, 1] (e.g., 1\3). Pre-
and postconditions, denoted by keywords req and ens, should hold at the begin-
ning and the end of an annotated function, respectively. The keyword context
abbreviates both req and ens (l.9, l.13). The keyword context_everywhere is
used to specify a property that must hold throughout the function (l.1). Note
that \forall* is used to express a universal separating conjunction over permis-
sion predicates (l.2-l.4) and \forall is used as standard universal conjunction
over logical predicates (l.5). For logical conjunction, && is used, and ** is used
as the separating conjunction of separation logic.
In the example, write permissions are required for all locations in a (l.2).
The pre- and postconditions of the first kernel specify that each thread needs
write permission for a[tid] (l.9). The postcondition states that a[tid] is set
to 0 (l.10). In the second kernel, all threads have write permission for a[tid+1],
except thread a.length-1 which has write permission for a[0] (l.13). Moreover,
it is required that a[tid+1] (modulo a.length) is 0 (l.14). For the for-loop (l.19-
l.22), loop invariants are specified: k is in the range [0,N] (l.16), each thread has
write permission for a[tid+1] (modulo a.length) (l.17) and this location always
has the value k*tid (l.18). The postconditions of the second kernel and the host
function are similar to this latter invariant.
Fig. 3 shows an optimized version of the program, with updated annotations
to make it verifiable. Alpinist has applied three optimizations (their combined
effect is sketched in plain CUDA after the list):
1. Fusing the two kernels: in GPU programs, the only global synchronisation
points (used, for instance, to avoid data races) exist implicitly between ker-
nel launches. However, if such a global synchronisation point is not really
needed between two specific kernels, then fusing them gives several benefits,
in particular the ability to store intermediate results in (fast) thread-local
register memory as opposed to (slow) GPU global memory, and it has a
positive effect on power consumption [62]. In the example, the kernels are
combined into Fused Kernel, and a thread block-local barrier is introduced
(l.18) to avoid data races within the single thread block executing the code.
2. Using register memory: register variables can be used to reduce the number
of global memory accesses. Here, the use of a_reg_0 and a_reg_1 has been
enabled by kernel fusion.
3. Unrolling the for-loop; the for-loop has been unrolled once here (l.20-l.25).
Since GPU threads are very light-weight, compared to CPU threads, any
checking of conditions that can be avoided benefits performance. When un-
rolling a loop, this means that fewer checks of the loop-condition are needed.
Note that here, Alpinist benefits from the knowledge that N>0 (l.1), so it
knows that the for-loop can be unrolled at least once.
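Stripping the annotations, the combined effect of the three optimizations corresponds roughly to the following CUDA sketch (ours; it assumes a launch of exactly a.length threads in a single block, e.g. fused_kernel<<<1, len>>>(a, len, N), so no bounds check is needed before the barrier):

// Hypothetical CUDA counterpart of Fig. 3 (annotations omitted).
__global__ void fused_kernel(int *a, int len, int N) {
  int tid = threadIdx.x;
  a[tid] = 0;
  __syncthreads();                      // replaces the inter-kernel sync point
  int idx = (tid != len - 1) ? tid + 1 : 0;
  int reg = a[idx];                     // register instead of global memory
  reg += tid;                           // first iteration unrolled (N > 0)
  for (int k = 1; k < N; k++)
    reg += tid;
  a[idx] = reg;                         // single write back to global memory
}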
To preserve provability of the optimized program, Alpinist changed the
annotations, in particular the pre- and postcondition of the fused kernel and
the loop invariants (highlighted in Fig. 3). Moreover, Alpinist introduced an
annotated barrier (l.14-l.18). Since threads synchronize at a barrier, it is possible
to redistribute the permissions. In the rest of the paper, we discuss how Alpinist
performs these annotation-aware transformations.
3 The Design of Alpinist
This section gives a high-level overview of the design of Alpinist. The opti-
mizations supported by Alpinist are discussed in Section 4. To understand the
design of Alpinist, we first explain the architecture of the VerCors verifier.
3.1 VerCors’ Architecture
VerCors is a deductive program verifier, which is designed to work for different in-
put languages (e.g., Java and OpenCL). It takes as input an annotated program,
which is then transformed in several steps into an annotated Silver program. Sil-
ver is an intermediate verification language, used as input for Viper [37,60].
Viper then generates proof obligations, which can be discharged by an auto-
mated theorem prover, such as Z3 [36].
The internal transformations in VerCors are defined over our internal AST
representation (written in the Common Object Language or COL [52]), which
captures the features of all input languages. Some of the transformations are
generic (e.g., splitting composite variable declarations) and others are specific
to verification (e.g., transforming contracts). The transformations implemented
as part of Alpinist are also applied on the COL AST, but they are developed
with a different goal in mind; in particular, several of the transformations are
specific to the supported optimizations.
Using VerCors and its architecture to implement Alpinist gives us some ben-
efits. First, existing helper functions can be reused, which simplifies tasks such
as gathering information regarding specific AST nodes. Second, some generic
transformations of VerCors can be reused, such as splitting composite variable
declarations or simplifying expressions. This helps to simplify the implementa-
tion of the optimizations. Third, using the architecture of VerCors allows us to
prove the assertions we generate relatively easily, by invoking VerCors internally.
3.2 Alpinist's Architecture
Alpinist takes a verified file as its input, annotated with special optimiza-
tion annotations that indicate where specific optimizations should be applied.
Alpinist is written in Java and Scala and runs on Windows, Linux and macOS.
Fig. 4gives a high-level overview of the internal design of Alpinist. The input
program goes through four phases: the parsing phase, the applicability checking
phase, the transformation phase and the output phase.
The parsing phase transforms the input file into a COL AST, after which
the applicability checking phase checks if the optimization can be applied. Some
optimizations, such as tiling (see Section 4.2), are always applicable, hence their
applicability check always passes. For other optimizations, prerequisites must be
established. Sometimes, a syntactical analysis of the AST suffices, e.g., kernel
fusion (see Section 4.3). For this optimization, it must be determined whether
there is any data dependency between two selected kernels. When analysis of the
AST is not enough, VerCors can be used to perform more complex reasoning.
An example of this is loop unrolling (see Section 4.1). Its prerequisite is that, for
the loop to be unrollable k times, it is guaranteed that the loop executes at least
k times. This prerequisite is encoded as an assertion to be proven by VerCors.
The applicability checking phase is one of the strengths of Alpinist. It ex-
ploits the fact that the input program is annotated to determine whether an
optimization is applicable, and relies on the fact that VerCors can perform com-
plex reasoning. Moreover, this approach allows us to distinguish failures due to
unsatisfied prerequisites from failures due to mistakes in the transformation procedure.
[Figure: Input File → Parsing Phase → Applicability Checking Phase → Transformation Phase → Output Phase → Output File.]
Fig. 4: The internal design of Alpinist.
If the applicability check passes (i.e., the optimization is applicable), the
transformation phase is next; otherwise, a message is generated stating that the
prerequisites could not be proven.
The transformation phase applies the optimizations to the input AST. The
output phase either prints the optimized program in the same language as the
input program, or prints a message signifying either a failure during optimization
or a verification failure in the applicability checking phase.
4 GPU Optimizations
Alpinist supports six frequently-used GPU optimizations, namely loop un-
rolling, tiling, kernel fusion, iteration merging, matrix linearization and data
prefetching. This section discusses loop unrolling, tiling, and kernel fusion in
detail. The other optimizations follow the same approach in spirit and are only
discussed briefly; their details can be found in the Alpinist implementation [16].
Each optimization is introduced in the context of GPU programs. Then, we
discuss how to apply them. Interesting insights are discussed where relevant.
4.1 Loop Unrolling
Loop unrolling is a frequently-used optimization technique that is applicable
to both GPU and CPU programs. It unrolls some iterations of a loop, which
increases the code size, but can have a positive impact on program performance;
e.g., see [21,38,46,59,63] for its impact, specifically on GPU programs. Fig. 5
shows an example of unrolling an (annotated) loop twice: the body of the loop is
duplicated twice before the loop. This has the following effect on the annotations:
the loop invariant bounding the loop variable (l.5) changes in the optimized
program (l.14). Note that the other loop invariants (i.e., Inv(i)) remain the
same. Moreover, after each unrolling part, we add all invariants as assertions
(l.8-l.10) except after the last unroll. This captures that the code produced by
unrolling the loop should still satisfy the original loop invariants.
Our approach to loop unrolling is more general than optimization techniques
applied during compilation. For instance, the unroll pragma in CUDA [55] and the
unroll function in Halide [56] unroll loops by calculating the number of iterations
to see if unrolling is possible, i.e., the count should be computable at compile time.
This difference is illustrated in Fig. 5, where N (i.e., the number of iterations)
is unknown at compile time. Their approach cannot automatically handle this
1/*@ context_everywhere N > 1; @*/
2void Host(int[] arr, int size, int N){
3par Kernel(tid=0..size){
4int i = 0;
5/*@ inv i >= 0 && i <= N;
6inv N > 1;
7inv Inv(i); @*/
8loop (i < N){
9int newInt = i;
10 arr[tid] = arr[tid] + newInt;
11 i=i+1;}
12 } }
1/*@ context_everywhere N > 1; @*/
2void Host(int[] arr, int size, int N){
3par Kernel(tid=0..size){
4int i = 0;
5int newInt = i;
6arr[tid] = arr[tid] + newInt;
7i = i + 1;
8//@ assert i >= 1 && i <= N;
9//@ assert N > 1;
10 //@ assert Inv(i);
11 newInt = i;
12 arr[tid] = arr[tid] + newInt;
13 i = i + 1;
14 /*@ inv i >= 2 && i <= N;
15 inv N > 1;
16 inv Inv(i); @*/
17 loop (i < N){
18 newInt = i;
19 arr[tid] = arr[tid] + newInt;
20 i=i+1;}
21 } }
Fig. 5: An example of unrolling a loop 2 times.
1void Host(int[] array, int size){
2par Kernel(tid=0..size){
3int i = init; // The loop variable
4    ...
5//@ assert (i == a) || (i == b); // Depending on initialization of i only one
6// of the conditions is specified
7/*@ inv i >= a && i <= b; // The lowerbound of i (a), The upperbound of i (b)
8inv Inv(i); @*/ // Additional loop invariants
9loop (cond(i)) { // The loop condition
10 body(i); // The loop body, a sequence of statements in the ith iteration.
11 i = upd(i); } // The update function of i, restricted to (i+c), (i−c),
12 } }           // (i×c) or (i/c), where c is a positive integer constant4.
Fig. 6: A general template of a loop inside a kernel.
case, while our approach can automatically unroll the loop, since the annotations
(l.1, l.6) specify the lower bound of N (provided by the programmer, who knows
that this is a valid lower bound). VerCors verifies that the unrolling is valid.
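For comparison, the compile-time route looks as follows in plain CUDA (sketch ours); the pragma requests an unroll factor of two, but with N unknown at compile time the compiler must keep the loop guard or may ignore the hint altogether:

__global__ void add_iota(int *arr, int size, int N) {
  int tid = blockIdx.x * blockDim.x + threadIdx.x;
  if (tid < size) {
    #pragma unroll 2                // compiler hint, no annotation reasoning
    for (int i = 0; i < N; i++)
      arr[tid] = arr[tid] + i;
  }
}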
Fig. 6 shows a loop template in a verified GPU program. We would like
to automatically unroll the loop k times and preserve the provability of the
program. To accomplish this, we follow a procedure consisting of three parts:
the main, checking and updating parts. In the main part, an annotated (verified)
GPU program and a positive k are given as input. Next, we go to the checking
part, to see if it is possible to unroll the loop k times. This part corresponds
to the applicability checking phase. Thus, we statically calculate the number
of loop iterations, by counting how many times the condition (cond(i)) holds
starting from either a (the lower bound of i) or b (the upper bound of i),
depending on the operation of upd(i). If k is greater than the total number of
loop iterations at the end of the checking part, then we report an error. Otherwise,
4 If c were negative, then for multiplication and division, i would oscillate between
positive and negative values and hence would not always be useful as an array index.
Hence we consider c to be positive.
[Figure: thread-to-cell assignments — baseline t0 t1 ... t11 (one thread per location); inter-tiling t0 t1 t2 t3 | t0 t1 t2 t3 | t0 t1 t2 t3; intra-tiling t0 t0 t0 t0 | t1 t1 t1 t1 | t2 t2 t2 t2.]
Fig. 7: Inter- and intra-tiling of an array with T = 12, N = 4 and ⌈T/N⌉ = 3.
1void Host(int[] a, int T){
2par Kernel(tid = 0..T)
3/*@ // Preconditions related to permissions and functional correctness
4req prePerm(a[tid]) ** preFunc(a[tid]);
5// Postconditions related to permissions and functional correctness
6ens postPerm(a[tid]) ** postFunc(a[tid]); @*/
7{ body(a[tid]); } }
Fig. 8: A general unoptimized GPU program to apply for tiling.
we go to the updating part, in which we update either a or b according to the
operation in upd(i). If the operation is addition or multiplication, then the loop
variable i (in the unoptimized program) goes from a to b. That means that, after
unrolling, a should be updated according to the constant c from the update
expression and k. If the operation is subtraction or division, i goes from b to a.
Thus, after unrolling, b should be updated. After the updating part, we return
to the main part to unroll the loop k times.
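A host-side sketch of this checking logic, under the template of Fig. 6 and with our own naming (not Alpinist's actual code), could look as follows:

enum Upd { ADD, SUB, MUL, DIV };   // upd(i) in { i+c, i-c, i*c, i/c }, c > 0

// Count loop iterations, starting from a (lower bound) or b (upper bound)
// depending on the update operation; return -1 if the count is unclear.
long iterations(long a, long b, Upd op, long c) {
  long n = 0;
  long i = (op == ADD || op == MUL) ? a : b;       // starting point
  while ((op == ADD || op == MUL) ? i < b : i > a) {
    switch (op) {
      case ADD: i += c; break;
      case SUB: i -= c; break;
      case MUL: if (i <= 0 || c <= 1) return -1; i *= c; break;
      case DIV: if (i <= 0 || c <= 1) return -1; i /= c; break;
    }
    n++;
  }
  return n;
}

// Unrolling k times is admissible only if k <= iterations(a, b, op, c);
// otherwise an error is reported.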
4.2 Tiling
Tiling is another well-known optimization technique for GPU programs. It in-
creases the workload of the threads to fully utilize GPU resources by assigning
more data to each thread. Concretely, we assume there are T threads and a one-
dimensional array of size T in the unoptimized GPU program, where each thread
is responsible for one location in that array (Fig. 8). To apply the optimization,
we first divide the array into ⌈T/N⌉ chunks, each of size N (1 ≤ N ≤ T)5. There
are two different ways to create and assign threads to array cells (as in Fig. 7):
Inter-Tiling We define N threads and assign each of them to one specific location
in each chunk. That means each thread serially iterates over all chunks and is
responsible for a specific location in each chunk.
Intra-Tiling We define ⌈T/N⌉ threads and assign one thread to one chunk
(i.e., a 1-to-1 mapping) to serially iterate over all cells in that chunk.
Both forms of tiling can have a positive impact on GPU program performance;
e.g., see [25,28,47,69] for the impact of this optimization.
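In plain CUDA, and ignoring annotations, the two schemes correspond roughly to the following sketch (ours; body stands in for the per-cell work):

__device__ void body(int *cell) { *cell += 1; }   // stands in for body(a[...])

// Inter-tiling: N threads; thread tid handles a[tid], a[tid+N], a[tid+2N], ...
__global__ void inter_tiled(int *a, int T) {      // launched with N threads
  int tid = threadIdx.x, N = blockDim.x;
  for (int j = 0; tid + j * N < T; j++)           // loop condition tid + j*N < T
    body(&a[tid + j * N]);
}

// Intra-tiling: ceil(T/N) threads; thread tid handles the contiguous chunk
// a[tid*N .. tid*N+N-1] (the last chunk may be shorter).
__global__ void intra_tiled(int *a, int T, int N) {
  int tid = threadIdx.x;                          // launched with ceil(T/N) threads
  for (int j = tid * N; j < (tid + 1) * N && j < T; j++)
    body(&a[j]);
}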
Fig. 9 shows the optimized version of Fig. 8 after applying inter-tiling. Regard-
ing program optimization, two major changes happen: 1) the total number of
threads has been reduced (l.2), and 2) the body is encapsulated inside a loop (l.16-
l.18). As mentioned, in inter-tiling, we define N threads instead of T. The number
5 Since N is in the range 1 ≤ N ≤ T, the last chunk might have fewer cells.
1void Host(int[] a, int T){
2par Kernel(tid = 0..N)
3/*@ req (\forall* int i; 0 <= i && i < ceiling(T, N) && tid+i×N < T;
4pre(a[tid+i×N]));
5ens (\forall* int i; 0 <= i && i < ceiling(T, N) && tid+i×N < T;
6post(a[tid+i×N])); @*/
7{
8int j = 0;
9/*@ inv j >= 0 && j <= ceiling(T, N);
10 inv (\forall* int i; 0 <= i && i < ceiling(T, N) && tid+i×N < T;
11 prePerm(a[tid+i×N]));
12 inv (\forall int i; j <= i && i < ceiling(T, N) && tid+i×N < T;
13 preFunc(a[tid+i×N]));
14 inv (\forall* int i; 0 <= i && i < j && tid+i×N < T;
15 postFunc(a[tid+i×N])); @*/
16 loop (tid+j×N < T){
17 body(a[tid+j×N]);
18 j=j+1;}
19 } }
Fig. 9: Optimized version of the GPU program of Fig. 8 after applying inter-tiling.
of chunks is indicated by the function ceiling(T, N). Each thread in the newly
added loop iterates over all chunks (in the range 0 to ceiling(T, N)-1) and is
responsible for a specific location. This happens via the loop variable j and the
loop condition tid+j×N < T. This means that each thread tid can access its own
location at index tid in each chunk. To preserve verifiability, we add invariants
to the loop (l.9-l.17). Therefore, we specify:
– the boundaries of the loop variable j, which iterates over all chunks.
– a permission-related invariant for each thread in each chunk (l.10). This
comes from the precondition of the kernel and is quantified over all chunks.
– an invariant to indicate functional properties of the locations that have not
yet been updated by threads in the body of the loop (l.12). This comes from
the functional precondition of the kernel and is quantified over all chunks.
– an invariant to specify how each thread updates the array in each chunk
(l.14). This comes from the functional property in the postcondition of the
kernel and is quantified over all chunks.
Moreover, we modify the specification of the kernel (l.3-l.6). Note that we have
the condition tid+j×N < T in all universally quantified invariants, because the
last chunk might have fewer cells than N. We quantify the pre- and postcondi-
tion of the kernel over the chunks in the same way as the invariants.
Intra-tiling is in essence similar to inter-tiling, with two major differences: 1)
the total number of threads is ceiling(T, N), and 2) each thread in the loop
iterates over the cells within its own chunk. Therefore, we have different conditions
in the loop and the quantified invariants. Alpinist also supports this.
Above, each thread is assigned to one cell. This can easily be generalized
to have each thread assigned to one or more consecutive cells (i.e., a task). A
similar procedure can be applied as long as the tasks do not overlap, i.e., each
cell is assigned to at most one thread.
4.3 Kernel Fusion
Kernel fusion is a GPU optimization where we merge two or more consecutive
kernels into one. It increases the potential to use thread-local registers to store
intermediate results (see Section 2) and can lead to less power consumption.
See [2,19,61,62,68] for the impact of kernel fusion on GPU programs. We pro-
vide a generalized procedure to fuse an arbitrary number of consecutive kernels
while considering data dependencies between them. The idea is to fuse them by
repeatedly fusing the first two kernels (i.e., kernel reduction). In each iteration,
if there is no data dependency between the two kernels, we safely fuse them.
Otherwise, if there is only one thread block, we fuse the two kernels by inserting
a barrier between their bodies; else, fusion fails.
A benefit of this approach is that it only considers two kernels at a time.
In this way, it can be determined whether a barrier is necessary between two
specific kernels, and we do not miss any possible fusion optimization. Another
benefit of this approach is that when a data dependency between two kernels P
and P+ 1 (1 < P < #kernels1) is detected, the output of the approach is the
fusion of the first Pkernels, and the remaining unfused kernels after P. This
allows the user to not only find out that there is a data dependency between P
and P+ 1, but also to obtain fused kernels where possible.
There are multiple challenges in this transformation: (1) how to detect data
dependency between two kernels? (2) how to collect the pre- and postconditions
for the fused kernel? and (3) how to deal with permissions so that in the fused
kernel the permission for a location does not exceed 1? The main difficulty in
addressing these challenges is that we have to consider many different possible
scenarios. Fortunately, we can use the information from the contract of the two
kernels. The permission patterns in the contract indicate for each thread which
locations it reads from and writes to. We provide procedures to separately collect
pre- and postconditions related to permissions and to functional correctness. Due
to space limitations, we only discuss the essential steps to collect the precondition
related to permissions for array accesses of the fused kernel in Alg. 1. Collecting
the rest of the contract uses a similar procedure.
Alg. 1 requires kernels k1 and k2 not to lose any permissions, but only possibly
redistribute them (using a barrier). Furthermore, for ease of presentation, we
assume that in both k1 and k2, each thread accesses at most one cell of array a,
and that the expressions used to compute array indices only combine constants
and thread ID variables, using standard arithmetic operators.
We compare the postcondition of k1 and the precondition of k2 (l.2) to
understand how to add permissions of the preconditions of k1 and k2 to the
precondition of the fused kernel. Note that prePerm and postPerm correspond
to a permission-related pre- and postcondition, respectively. We use the post-
condition of k1 for this comparison since the permission at the end of k1 needs
to be sufficient to satisfy the precondition of k2. If the index expressions e1 and
e2 to access an array aare syntactically the same, then they refer to the same
array cell. In that case, we first add to the precondition of the fused kernel the
original permission from the precondition of k1 that corresponds to the permis-
Algorithm 1 Kernel fusion procedure for collecting precondition permissions.
1: Add all precondition permissions related to non-shared arrays (i.e., accessed by only one of the
   two kernels) into the contract of the fused kernel kf.
2: for each shared array a with a permission postPerm(a[e1], p1) in the postcondition of the first
   kernel k1 and a permission prePerm(a[e2], p2) in the precondition of the second kernel k2 do
3:   if patterns e1 and e2 are syntactically the same then
4:     Add pre. of k1 corresponding to postPerm(a[e1], p1) as pre. to kf
5:     if p1 < p2 then
6:       Add prePerm(a[e2], p2 − p1) as pre. to kf
7:   else if patterns e1 and e2 are not syntactically the same then
8:     if p1 + p2 ≤ 1 then
9:       Add pre. of k1 corresp. to postPerm(a[e1], p1) and prePerm(a[e2], p2) as pre. in kf
10:    else if p1 + p2 > 1 && p1 < 1 && p2 < 1 then
11:      Add pre. of k1 corresp. to postPerm(a[e1], p1) with permission p3 and prePerm(a[e2],
12:      p4) as pre. s.t. p3 + p4 == 1
13:    else if p1 == 1 (i.e., write) then           ▷ Data dependency, add barrier
14:      Add pre. of k1 corresponding to postPerm(a[e1], p1) as pre. to kf
15:    else (p2 == 1)                               ▷ Data dependency, add barrier
16:      Add pre. of k1 corresponding to postPerm(a[e1], p1) as pre. to kf
17:      Add prePerm(a[e2], 1 − p1) as pre. to kf
sion for a[e1] in the postcondition of k1 (remember that the latter permission
may have been obtained in k1 after permission redistribution). Second, if p1 is
not sufficient for the precondition of k2 (l.5), we add additional permission to
the precondition of the fused kernel to satisfy the precondition of k2 (l.6).
The remaining cases in the algorithm correspond to the different edge cases
that we should consider when e1 and e2 are not syntactically the same. In
particular, a data dependency occurs when the accumulated permission (in both
kernels) for one location is greater than 1, and there is at least one write
permission. Therefore, we have to distinguish multiple cases: 1) p1 + p2 does not
exceed 1 (l.8), 2) p1 + p2 exceeds 1, but no write permission is involved (l.10),
or 3) and 4) at least one write is involved (l.13 and l.15). In the latter two cases,
a barrier must be introduced to take care of distributing permissions from the
access in k1 to the access in k2, and possibly additional permission for the latter
must be added to the precondition of the fused kernel (l.17). After constructing
the contract of the fused kernel, we check for data dependency.
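The case split of Alg. 1 can be summarized by the following host-side sketch (ours; permissions are modelled as doubles and the "Add ..." steps are abbreviated to comments):

// Returns whether a barrier must be inserted between the two kernels,
// for one shared array with index patterns e1, e2 and permissions p1, p2
// (1.0 encodes write permission).
bool collect_pre(bool same_pattern, double p1, double p2) {
  if (same_pattern) {                        // e1 and e2 name the same cell
    // add k1's pre. corresponding to postPerm(a[e1], p1)      (Alg. 1, l.4)
    if (p1 < p2) { /* add prePerm(a[e2], p2 - p1)              (l.5-l.6) */ }
    return false;
  }
  if (p1 + p2 <= 1.0) { /* add both preconditions              (l.8-l.9) */
    return false;
  }
  if (p1 < 1.0 && p2 < 1.0) { /* split shares s.t. p3+p4 == 1  (l.10-l.12) */
    return false;
  }
  // at least one write permission: data dependency            (l.13-l.17)
  // add k1's pre.; if p2 == 1, also add prePerm(a[e2], 1 - p1)
  return true;                               // a barrier must be inserted
}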
Fig. 10 shows an example of fusing two kernels. We only present the per-
mission precondition expressions, which are collected with Alg. 1. There are two
shared arrays a and b. To collect permission preconditions in the fused kernel,
we follow steps {l.2 → l.3 → l.4} for array a and steps {l.2 → l.3 → l.4 → l.5 → l.6} for
array b. As there is no data dependency, we can safely fuse the two kernels.
Implementing Data Dependency Detection. One of the implementation chal-
lenges of kernel fusion is to check data dependency in the applicability checking
phase. Our idea for detecting kernel dependencies is similar to detecting loop
iteration dependencies, see [1]. To detect a data dependency for a specific shared
array, the function SV is used. Fig. 11 shows an example of the output of SV. The
kernel has 1\2 permission for a[tid+1], and 1\3 permission for a[0] if tid+1 is
out of bounds. SV takes an array name and the pre- and postconditions of a ker-
nel (of the form cond(tid) => Perm(a[patt(tid)], p)) on l.3-l.6, and returns
a mapping from indices patt(tid) to the permissions p (Fig. 11, right).
1void Host(...){
2par Kernel1(tid1 = 0..T)
3/*@ context Perm(a[tid1], 1);
4context Perm(b[tid1], 1\2);@*/
5{ a[tid1] = 2*b[tid1]; }
6par Kernel2(tid2 = 0..T)
7/*@ context Perm(a[tid2], 1\2);
8context Perm(b[tid2], 1);@*/
9{ b[tid2] = a[tid2]+1; } }
⇒
1void Host(...){
2par Fused_Kernel(tid = 0..T)
3/*@ req Perm(a[tid], 1);
4req Perm(b[tid], 1\2);
5req Perm(b[tid], 1\2);@*/
6{ a[tid] = 2*b[tid];
7b[tid] = a[tid]+1; } }
Fig. 10: An example of collecting preconditions in fusing two kernels.
1void Host(...){
2par Kernel1(tid1 = 0..T)
3/*@ context (tid != a.length-1 =>
4Perm(a[(tid + 1)], 1\2));
5context (tid == a.length-1 =>
6Perm(a[0], 1\3)); @*/
7{... } }
⇒
Output SV(a, spec kernel):
index        0    1    2    3    4
permission   1/3  1/2  1/2  1/2  1/2
Fig. 11: Example output of the SV function for array a.
If the function SV is executed for two kernels to fuse with the same shared
array a, the results SV1(a) and SV2(a) can be compared to determine whether
there is data dependency between the two kernels. This comparison is described
generally at l.8-l.16 in Algorithm 1. For each corresponding location in SV1(a)
and SV2(a), we can determine, for example, whether both permissions combined
do not exceed 1 (l.8) or whether the location in k1 has write permission (l.12).
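A minimal sketch of this comparison (ours; SV results are modelled as maps from indices to permissions) could look as follows:

#include <map>

// SV maps each array index to the permission a kernel's contract assigns
// to it (cf. Fig. 11); 1.0 encodes write permission.
using SV = std::map<int, double>;

// A barrier is needed iff some shared index accumulates more than 1
// permission with at least one write involved (cf. Alg. 1, l.8-l.16).
bool needs_barrier(const SV &sv1, const SV &sv2) {
  for (const auto &entry : sv1) {
    auto it = sv2.find(entry.first);
    if (it == sv2.end()) continue;           // index not shared
    double p1 = entry.second, p2 = it->second;
    if (p1 + p2 > 1.0 && (p1 == 1.0 || p2 == 1.0))
      return true;                           // data dependency detected
  }
  return false;
}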
4.4 Other Optimizations
We briefly discuss the three remaining optimizations supported by Alpinist.
Iteration merging is an optimization technique related to loop unrolling that
is applicable to both GPU and CPU programs6. Iteration merging reduces the
number of loop iterations by extending the loop body with multiple copies of it,
as opposed to creating copies of it outside the loop, as is done in loop unrolling.
Iteration merging can have a positive performance impact; see [38,46,53] for the
effectiveness of this optimization on GPU programs.
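As a sketch (ours), merging two iterations keeps the copies of the body inside the loop; here we assume the annotations guarantee an even iteration count, so no remainder iterations are left over:

__device__ void body(int i) { /* original loop body */ }

__global__ void merged(int N) {              // assumes N % 2 == 0
  for (int i = 0; i < N; i += 2) {           // half as many loop iterations
    body(i);
    body(i + 1);
  }
}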
Matrix linearization is an optimization where we transform two-dimensional
arrays into one-dimensional ones. This optimization can result in better memory
access patterns, thereby improving caching. See [5,13,54] for the impact of matrix
linearization on GPU programs.
The last optimization implemented in Alpinist is data prefetching. Suppose
there is a verified GPU program where each thread accesses an array location
in global memory multiple times. In this optimization, we prefetch the values
of those locations in global memory into registers, which are local to
each thread. A similar optimization, in which intermediate results are stored in
register memory, is applied in Section 2. Therefore, instead of multiple accesses
to high-latency global memory, we benefit from low-latency registers. Data
prefetching can have a positive performance impact; see [4,58,70].
6 Iteration merging is also referred to as loop unrolling/vectorization in the literature.
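In plain CUDA, the last two optimizations correspond roughly to the following sketches (ours; names are illustrative):

// Matrix linearization: m[i][j] becomes m[i * cols + j].
__global__ void scale(float *m, int rows, int cols, float c) {
  int i = blockIdx.x, j = threadIdx.x;
  if (i < rows && j < cols)
    m[i * cols + j] = m[i * cols + j] * c;   // one flat, cache-friendlier index

}

// Data prefetching: pull a repeatedly accessed global-memory cell into a
// register, update the register, and write back once.
__global__ void prefetched(int *a, int size, int N) {
  int tid = blockIdx.x * blockDim.x + threadIdx.x;
  if (tid < size) {
    int reg = a[tid];                        // single read from global memory
    for (int k = 0; k < N; k++)
      reg += tid;                            // register-only updates
    a[tid] = reg;                            // single write back
  }
}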
Table 1: A summary of the optimization and verification times for all optimizations.
Optimization Optim. time (s) Verif. time (orig.) (s) Verif. time (opt.) (s)
min. max. avg. med. min. max. avg. med. min. max. avg. med.
Loop unrolling 0.067 0.238 0.116 0.098 7.6 50.7 18.2 14.3 7.6 57.5 20.8 17.3
Tiling 0.044 0.052 0.048 0.047 16.7 21.5 18.7 18.1 19.3 31.4 24.7 20.8
Kernel fusion 0.099 0.338 0.173 0.137 16.7 54.5 24.6 20.0 14.9 22.3 19.0 19.5
Iteration merging 0.042 0.592 0.152 0.097 6.9 51 17.0 12.7 7.3 64 20.0 13.8
Matrix linearization 0.011 0.044 0.022 0.017 11.6 16 14.3 14.1 11.5 16.8 14.4 15.1
Data prefetching 0.010 0.068 0.051 0.053 9.7 23 14.0 13.4 10.4 23 13.5 12.7
5 Evaluation
This section describes the evaluation of Alpinist. The goal is to
Q1 test whether Alpinist works on GPU programs.
Q2 investigate how long it takes for Alpinist to transform GPU programs and
how this affects the verification time.
Q3 investigate the usability of Alpinist on real-world complex examples.
5.1 Experiment Setup
Alpinist is evaluated on examples from three different sources. The first source
consists of hand-made examples that cover different scenarios for each optimiza-
tion. The second source is a collection of verified programs from VerCors’ ex-
ample repository7. The third source consists of complex case studies that are
already verified in VerCors: two parallel prefix sum algorithms [51], parallel
stream compaction and summed-area table algorithms [48], a variety of sort-
ing algorithms [49], a solution [27] to the VerifyThis 2019 challenge 1 [18] and a
Tic-Tac-Toe example [57] based on [23]. In total, we applied the optimizations
30 times in the first category, 23 times in the second category and 17 times in
the third category (in total 70 experiments). All the examples are annotated
with special optimization annotations such that Alpinist can apply those op-
timizations automatically. All these examples are publicly available at [15]. All
the experiments were conducted on a MacBook Pro 2020 (macOS 11.3.1) with
a 2.0GHz Intel Core i5 CPU. Each experiment was performed ten times, af-
ter which the average times, i.e., optimization and verification times, of those
executions were recorded for the experiment.
5.2 Results & Discussion
Q1 To test whether Alpinist works on GPU programs, we applied the six
optimizations in all 70 experiments and used VerCors to reverify all the resulting
programs. All these tests were successful.
Q2 To investigate how long it takes for Alpinist to transform GPU programs,
we recorded the transformation time for each optimization applied to all the
7 The example repository of VerCors is available at https://github.com/utwente-fmt/
vercors/tree/dev/examples.
Alpinist: an Annotation-Aware GPU Program Optimizer 345
Table 2: An overview of optimizing case studies, where # is the unroll factor (for
loop unrolling) or the merge factor (for iteration merging), OT the time it takes to
optimize, VB the original verification time (Verification Before) and VA the optimized
verification time (Verification After). All times are in seconds.
Case Loop unrolling Iter. merging Matrix lin. Data pref.
# OT VB VA # OT VB VA OT VB VA OT VB VA
BubbleSort [49] 1 0.101 25.4 27.3 4 0.170 29.8 34.1 N/A N/A N/A N/A N/A N/A
InsertionSort [49] 1 0.134 25.6 25.8 3 0.225 24.1 28.0 N/A N/A N/A N/A N/A N/A
SelectionSort [49] 1 0.107 23.5 25.7 2 0.592 22.8 27.7 N/A N/A N/A N/A N/A N/A
TimSort [49] 2 0.216 29.3 38.5 3 0.182 29.1 37.9 N/A N/A N/A N/A N/A N/A
Blelloch [51] 1 0.129 50.7 57.5 3 0.355 51.0 64.0 N/A N/A N/A N/A N/A N/A
Kogge-Stone [51] 1 0.238 23.0 25.6 2 0.082 21.8 25.6 N/A N/A N/A 0.103 23.0 23.0
TicTacToe [57] 3 0.106 19.8 21.0 2 0.076 17.3 19.6 N/A N/A N/A N/A N/A N/A
VerifyThis [27] 1 0.144 26.2 28.7 N/A N/A N/A N/A N/A N/A N/A N/A N/A N/A
Transpose [48] N/A N/A N/A N/A N/A N/A N/A N/A 0.022 16.0 16.0 N/A N/A N/A
examples. Table 1 summarizes the best and worst optimization times for the
six optimizations (as reported by Alpinist). To investigate the impact on the
verification time, the table also shows the (best and worst) verification times of
the original and optimized programs (as reported by VerCors). The table shows
the minimum, maximum, average and median times over all examples. It can be
observed that Alpinist takes negligible time to apply each optimization to
all the examples. Moreover, the verification time after optimizing generally in-
creases. For loop unrolling, tiling and iteration merging, the verification time
increases, which can be attributed to the additional code that is generated. For
kernel fusion, the verification time decreases, due to verifying fewer ker-
nels. For matrix linearization and data prefetching, the verification time slightly
increases, which can be attributed to the linear index expressions in matrix
linearization and the extra statements to read from/write to the registers in
data prefetching.
Q3 To investigate the usability of Alpinist on real-world examples, we suc-
cessfully applied it on the third category with the complex case studies. Table 2
shows the optimization and verification times of applying loop unrolling, iter-
ation merging, matrix linearization and data prefetching to these case studies.
Note that in the case studies only these four optimizations could be applied. In
the table, N/A indicates that the optimization is not applicable to the example.
6 Related Work
To the best of our knowledge, this is the first paper to showcase a tool that
implements annotation-aware transformations. We categorize the related work
into three parts, covering both tools and optimizations.
Automatic Optimizations without Correctness. There is a large body of related
work, see e.g., [2,4,19,25,28,47,61,62,68–70], that shows the impact of auto-
mated optimizations on GPU programs, but does not consider correctness, or
the preservation of it. Our tool can potentially complement these approaches by
preserving the provability of the optimized programs.
Correctness Proofs for Transformations. Another body of related work focuses
on different approaches to preserve provability not specific to GPU programs.
CompCert [30,31] is a formally verified C compiler, which preserves semantic
equivalence of the source and compiled program, by proving correctness of each
transformation in the compilation process. Wijs and Engelen [66] and De Putter
and Wijs [45] prove the preservation of functional properties over transformations
on models of concurrent systems. They prove preservation of model-independent
properties. This approach differs from ours as they work on models instead of
concrete programs.
Compiler Optimization Correctness. Finally, there is related work that focuses
on the compilation of sequential programs, performing transformations from
high-level source code to lower-level machine code while preserving the seman-
tics. These approaches neither consider parallelization nor target different ar-
chitectures. In GPU programming, the optimizations often need to be applied
manually rather than during the compilation process.
Namjoshi and Xu [41] use a proof checker to show equivalence between an
original WebAssembly program and optimized program. An equivalence proof is
generated based on the transformations. Namjoshi and Singhania [40] created a
semi-automatic loop optimizer driven by user directives. The loops are verified
during compilation. For each transformation, semantics are defined to guarantee
semantic equivalence to the original program. Namjoshi and Pavlinovic [39] focus on
recovering from precision loss due to semantics-preserving program transforma-
tions and propose systematic approaches to simplify analysis of the transformed
program. Finally, Gjomemo et al. [20] help compiler optimizations by supplying
high-level information gathered by external static analysis (e.g., Frama-C). This
information is used by the compiler for better reasoning.
7 Conclusion
In this paper, we presented Alpinist, an annotation-aware GPU program opti-
mizer. Given an unoptimized, annotated GPU program, we showed how Alpin-
ist transforms both the code and the annotations, with the goal of preserving the
provability of the optimized GPU program. Alpinist supports loop unrolling,
tiling, kernel fusion, iteration merging, matrix linearization and data prefetch-
ing, of which the first three are discussed in detail. We discussed the design and
implementation of Alpinist, and we validated it by verifying a set of examples
and reverifying their optimized counterparts.
For future work, there are other optimizations that could be supported, such
as data prefetching for all memory patterns as mentioned by Ayers et al. [4].
Another open question is if and how this approach can be used in program
compilation. We also plan to extend this approach to preserve the provability
of transpiled code, e.g., CUDA to OpenCL conversions. Moreover, we plan to
investigate how Alpinist can be combined with techniques such as autotuning
that automatically detect the potential for applying specific optimizations and
identify optimal parameter configurations [3,63].
References
1. Allen, R., Kennedy, K.: Automatic translation of Fortran programs to vector form.
ACM Transactions on Programming Languages and Systems (TOPLAS) 9(4), 491–
542 (1987)
2. Ashari, A., Tatikonda, S., Boehm, M., Reinwald, B., Campbell, K., Keenleyside,
J., Sadayappan, P.: On optimizing machine learning workloads via kernel fusion.
ACM SIGPLAN Notices 50(8), 173–182 (2015)
3. Ashouri, A., Killian, W., Cavazos, J., Palermo, G., Silvano, C.: A Survey on Com-
piler Autotuning using Machine Learning. ACM Computing Surveys 51(5), 96:1–
96:42 (2018)
4. Ayers, G., Litz, H., Kozyrakis, C., Ranganathan, P.: Classifying memory access pat-
terns for prefetching. In: Proceedings of the Twenty-Fifth International Conference
on Architectural Support for Programming Languages and Operating Systems. pp.
513–526 (2020)
5. Bell, N., Garland, M.: Efficient sparse matrix-vector multiplication on CUDA.
Tech. rep., Citeseer (2008)
6. Berdine, J., Calcagno, C., O’Hearn, P.: Smallfoot: Modular Automatic Asser-
tion Checking with Separation Logic. In: de Boer, F., Bonsangue, M., Graf, S.,
de Roever, W. (eds.) FMCO. LNCS, vol. 4111, pp. 115–137. Springer (2005)
7. Bertolli, C., Betts, A., Mudalige, G., Giles, M., Kelly, P.: Design and Perfor-
mance of the OP2 Library for Unstructured Mesh Applications. In: Proceed-
ings of the 1st Workshop on Grids, Clouds and P2P Programming (CGWS).
Lecture Notes in Computer Science, vol. 7155, pp. 191–200. Springer (2011).
https://doi.org/10.1007/978-3-642-29737-3_22
8. Betts, A., Chong, N., Donaldson, A., Qadeer, S., Thomson, P.: GPUVerify: a ver-
ifier for GPU kernels. In: OOPSLA. pp. 113–132. ACM (2012)
9. Blom, S., Darabi, S., Huisman, M., Oortwijn, W.: The VerCors Tool Set: Verifi-
cation of Parallel and Concurrent Software. In: iFM. LNCS, vol. 10510, pp. 102–110.
Springer (2017)
10. Blom, S., Huisman, M., Mihelčić, M.: Specification and Verification of GPGPU
programs. Science of Computer Programming 95, 376–388 (2014)
11. Bornat, R., Calcagno, C., O’Hearn, P., Parkinson, M.: Permission accounting in
separation logic. In: Proceedings of the 32nd ACM SIGPLAN-SIGACT symposium
on Principles of programming languages (POPL). pp. 259–270 (2005)
12. Boyland, J.: Checking Interference with Fractional Permissions. In: SAS. LNCS,
vol. 2694, pp. 55–72. Springer (2003)
13. Catanzaro, B., Keller, A., Garland, M.: A decomposition for in-place matrix trans-
position. ACM SIGPLAN Notices 49(8), 193–206 (2014)
14. Collingbourne, P., Cadar, C., Kelly, P.H.: Symbolic testing of OpenCL code. In:
Haifa Verification Conference. pp. 203–218. Springer (2011)
15. Şakar, Ö., Safari, M., Huisman, M., Wijs, A.: The repository for the examples used
in Alpinist, https://github.com/OmerSakar/Alpinist-Examples.git
16. Şakar, Ö., Safari, M., Huisman, M., Wijs, A.: The repository for the
implementations of Alpinist, https://github.com/utwente-fmt/vercors/tree/
gpgpu-optimizations/src/main/java/vct/col/rewrite/gpgpuoptimizations
17. DeFrancisco, R., Cho, S., Ferdman, M., Smolka, S.: Swarm Model Checking on
the GPU. International Journal on Software Tools for Technology Transfer 22,
583–599 (2020). https://doi.org/10.1007/s10009-020-00576-x
18. Dross, C., Furia, C.A., Huisman, M., Monahan, R., Müller, P.: VerifyThis 2019:
a program verification competition. International Journal on Software Tools for
Technology Transfer pp. 1–11 (2021)
19. Filipovič, J., Madzin, M., Fousek, J., Matyska, L.: Optimizing CUDA code by
kernel fusion: application on BLAS. The Journal of Supercomputing 71(10), 3934–
3957 (2015)
20. Gjomemo, R., Namjoshi, K.S., Phung, P.H., Venkatakrishnan, V., Zuck, L.D.: From
verification to optimizations. In: International Workshop on Verification, Model
Checking, and Abstract Interpretation. pp. 300–317. Springer (2015)
21. Grauer-Gray, S., Xu, L., Searles, R., Ayalasomayajula, S., Cavazos, J.:
Auto-tuning a High-Level Language Targeted to GPU Codes. In: Proc.
2012 Innovative Parallel Computing (InPar). pp. 1–10. IEEE (2012).
https://doi.org/10.1109/InPar.2012.6339595
22. van den Haak, L., Wijs, A., van den Brand, M.G.J., Huisman, M.: Formal
Methods for GPGPU Programming: Is The Demand Met? In: Proceedings of
the 16th International Conference on Integrated Formal Methods (IFM 2020).
Lecture Notes in Computer Science, vol. 12546, pp. 160–177. Springer (2020).
https://doi.org/10.1007/978-3-030-63461-2_9
23. Hamers, R., Jongmans, S.S.: Safe sessions of channel actions in Clojure: a tour of
the discourje project. In: International Symposium on Leveraging Applications of
Formal Methods. pp. 489–508. Springer (2020)
24. Herrmann, F., Silberholz, J., Tiglio, M.: Black Hole Simulations with CUDA. In:
GPU Computing Gems Emerald Edition, chap. 8, pp. 103–111. Morgan Kaufmann
(2011)
25. Hong, C., Sukumaran-Rajam, A., Nisa, I., Singh, K., Sadayappan, P.: Adaptive
sparse tiling for sparse matrix multiplication. In: Proceedings of the 24th Sympo-
sium on Principles and Practice of Parallel Programming. pp. 300–314 (2019)
26. Huisman, M., Blom, S., Darabi, S., Safari, M.: Program correctness by transfor-
mation. In: 8th International Symposium On Leveraging Applications of Formal
Methods, Verification and Validation (ISoLA). LNCS, vol. 11244. Springer (2018)
27. Huisman, M., Joosten, S.: A solution to VerifyThis 2019
challenge 1, https://github.com/utwente-fmt/vercors/blob/
97c49d6dc1097ded47a5ed53143695ace6904865/examples/verifythis/2019/
challenge1.pvl
28. Konstantinidis, A., Kelly, P.H., Ramanujam, J., Sadayappan, P.: Parametric GPU
code generation for affine loop programs. In: International Workshop on Languages
and Compilers for Parallel Computing. pp. 136–151. Springer (2013)
29. Le, Q., Ngiam, J., Coates, A., Lahiri, A., Prochnow, B., Ng, A.: On Optimization
Methods for Deep Learning. In: Proceedings of the 28th International Conference
on Machine Learning (ICML). pp. 265–272. Omnipress (2011)
30. Leroy, X.: Formal certification of a compiler back-end or: programming a compiler
with a proof assistant. In: Conference record of the 33rd ACM SIGPLAN-SIGACT
Symposium on Principles of Programming Languages. pp. 42–54 (2006)
31. Leroy, X.: A formally verified compiler back-end. Journal of Automated Reasoning
43(4), 363–446 (2009)
32. Li, G., Gopalakrishnan, G.: Scalable SMT-based verification of GPU kernel func-
tions. In: SIGSOFT FSE 2010, Santa Fe, NM, USA. pp. 187–196. ACM (2010)
33. Li, G., Li, P., Sawaya, G., Gopalakrishnan, G., Ghosh, I., Rajan, S.P.: GKLEE:
concolic verification and test generation for GPUs. In: ACM SIGPLAN Notices.
vol. 47, pp. 215–224. ACM (2012)
34. Lindholm, L., Nickolls, J., Oberman, S., Montrym, J.: NVIDIA Tesla: A Uni-
fied Graphics and Computing Architecture. IEEE Micro 28(2), 39–55 (2008).
https://doi.org/10.1109/MM.2008.31
35. Liu, X., Tan, S., Wang, H.: Parallel Statistical Analysis of Analog Circuits by
GPU-Accelerated Graph-Based Approach. In: Proceedings of the 2012 Conference
and Exhibition on Design, Automation & Test in Europe (DATE). pp. 852–857.
IEEE Computer Society (2012). https://doi.org/10.1109/DATE.2012.6176615
36. de Moura, L.M., Bjørner, N.: Z3: An efficient SMT solver. In: Ramakrishnan, C.,
Rehof, J. (eds.) TACAS. LNCS, vol. 4963, pp. 337–340. Springer (2008)
37. Müller, P., Schwerhoff, M., Summers, A.: Viper - a verification infrastructure for
permission-based reasoning. In: VMCAI (2016)
38. Murthy, G.S., Ravishankar, M., Baskaran, M.M., Sadayappan, P.: Optimal loop
unrolling for GPGPU programs. In: 2010 IEEE International Symposium on Par-
allel & Distributed Processing (IPDPS). pp. 1–11. IEEE (2010)
39. Namjoshi, K.S., Pavlinovic, Z.: The impact of program transformations on static
program analysis. In: International Static Analysis Symposium. pp. 306–325.
Springer (2018)
40. Namjoshi, K.S., Singhania, N.: Loopy: Programmable and formally verified
loop transformations. In: International Static Analysis Symposium. pp. 383–402.
Springer (2016)
41. Namjoshi, K.S., Xue, A.: A Self-certifying Compilation Framework for WebAssem-
bly. In: International Conference on Verification, Model Checking, and Abstract
Interpretation. pp. 127–148. Springer (2021)
42. The OpenCL 1.2 specification (2011)
43. Osama, M., Wijs, A.: Parallel SAT Simplification on GPU Architectures. In:
TACAS, Part I. LNCS, vol. 11427, pp. 21–40. Springer (2019)
44. Osama, M., Wijs, A., Biere, A.: SAT Solving with GPU Accelerated Inprocess-
ing. In: Proceedings of the 27th International Conference on Tools and Algo-
rithms for the Construction and Analysis of Systems (TACAS), Part I. Lec-
ture Notes in Computer Science, vol. 12651, pp. 133–151. Springer (2021).
https://doi.org/10.1007/978-3-030-72016-2_8
45. de Putter, S., Wijs, A.: Verifying a verifier: on the formal correctness of an LTS
transformation verification technique. In: International Conference on Fundamen-
tal Approaches to Software Engineering. pp. 383–400. Springer (2016)
46. Ragan-Kelley, J., Barnes, C., Adams, A., Paris, S., Durand, F., Amarasinghe, S.:
Halide: a language and compiler for optimizing parallelism, locality, and recompu-
tation in image processing pipelines. ACM SIGPLAN Notices 48(6), 519–530 (2013)
47. Rocha, R.C., Pereira, A.D., Ramos, L., Góes, L.F.: Toast: Automatic tiling for
iterative stencil computations on GPUs. Concurrency and Computation: Practice
and Experience 29(8), e4053 (2017)
48. Safari, M., Huisman, M.: Formal verification of parallel stream compaction and
summed-area table algorithms. In: International Colloquium on Theoretical As-
pects of Computing. pp. 181–199. Springer (2020)
49. Safari, M., Huisman, M.: A generic approach to the verification of the permutation
property of sequential and parallel swap-based sorting algorithms. In: International
Conference on Integrated Formal Methods. pp. 257–275. Springer (2020)
50. Safari, M., Oortwijn, W., Huisman, M.: Automated verification of the parallel
Bellman–Ford algorithm. In: Dr˘agoi, C., Mukherjee, S., Namjoshi, K. (eds.) Static
Analysis. pp. 346–358. Springer International Publishing, Cham (2021)
51. Safari, M., Oortwijn, W., Joosten, S., Huisman, M.: Formal verification of parallel
prefix sum. In: NASA Formal Methods Symposium. pp. 170–186. Springer (2020)
52. Şakar, Ö.: Extending support for axiomatic data types in VerCors (April 2020),
http://essay.utwente.nl/80892/
53. Shimobaba, T., Ito, T., Masuda, N., Ichihashi, Y., Takada, N.: Fast calculation of
computer-generated-hologram on AMD HD5000 series GPU and OpenCL. Optics
express 18(10), 9955–9960 (2010)
54. Sundfeld, D., Havgaard, J.H., Gorodkin, J., De Melo, A.C.: CUDA-Sankoff: using
GPU to accelerate the pairwise structural RNA alignment. In: 2017 25th Euromicro
International Conference on Parallel, Distributed and Network-based Processing
(PDP). pp. 295–302. IEEE (2017)
55. The CUDA team: Documentation of the CUDA unroll pragma (Accessed Oct
6, 2021), https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#
pragma-unroll
56. The Halide team: Documentation of the Halide unroll function (Accessed Oct
6, 2021), https://halide-lang.org/docs/class_halide_1_1_func.html#
a05935caceb6efb8badd85f306dd33034
57. The verification of tictactoe program, https://github.com/utwente-fmt/vercors/
blob/0a2fdc24419466c2d3b7a853a2908c37e7a8daa7/examples/session-generate/
MatrixGrid.pvl
58. Unkule, S., Shaltz, C., Qasem, A.: Automatic restructuring of GPU kernels for ex-
ploiting inter-thread data locality. In: International Conference on Compiler Con-
struction. pp. 21–40. Springer (2012)
59. Van Werkhoven, B., Maassen, J., Bal, H.E., Seinstra, F.J.: Optimizing convolution
operations on GPUs using adaptive tiling. Future Generation Computer Systems
30, 14–26 (2014)
60. Viper project website: (2016), http://www.pm.inf.ethz.ch/research/viper
61. Wahib, M., Maruyama, N.: Scalable kernel fusion for memory-bound GPU applica-
tions. In: SC’14: Proceedings of the International Conference for High Performance
Computing, Networking, Storage and Analysis. pp. 191–202. IEEE (2014)
62. Wang, G., Lin, Y., Yi, W.: Kernel fusion: An effective method for better power
efficiency on multithreaded GPU. In: 2010 IEEE/ACM Int’l Conference on Green
Computing and Communications & Int’l Conference on Cyber, Physical and Social
Computing. pp. 344–350. IEEE (2010)
63. Werkhoven, B.v.: Kernel Tuner: A search-optimizing GPU code auto-tuner. Future
Generation Computer Systems 90, 347–358 (2019)
64. Wienke, S., Springer, P., Terboven, C., Mey, D.: OpenACC - First Experiences
with Real-World Applications. In: Proceedings of the 18th European Conference
on Parallel and Distributed Computing (EuroPar). Lecture Notes in Computer
Science, vol. 7484, pp. 859–870. Springer (2012). https://doi.org/10.1007/978-3-642-32820-6_85
65. Wijs, A.: BFS-Based Model Checking of Linear-Time Properties With An Appli-
cation on GPUs. In: CAV, Part II. LNCS, vol. 9780, pp. 472–493. Springer (2016)
66. Wijs, A., Engelen, L.: REFINER: Towards Formal Verification of Model Transfor-
mations. In: NFM. LNCS, vol. 8430, pp. 258–263. Springer (2014)
67. Wijs, A., Neele, T., Boˇsnaˇcki, D.: GPUexplore 2.0: Unleashing GPU Explicit-State
Model Checking. In: Proceedings of the 21st International Symposium on Formal
Methods. Lecture Notes in Computer Science, vol. 9995, pp. 694–701. Springer
(2016). https://doi.org/10.1007/978-3-319-48989-6_42
68. Wu, H., Diamos, G., Wang, J., Cadambi, S., Yalamanchili, S., Chakradhar, S.:
Optimizing data warehousing applications for GPUs using kernel fusion/fission.
In: 2012 IEEE 26th International Parallel and Distributed Processing Symposium
Workshops & PhD Forum. pp. 2433–2442. IEEE (2012)
69. Xu, C., Kirk, S.R., Jenkins, S.: Tiling for performance tuning on different models
of GPUs. In: 2009 Second International Symposium on Information Science and
Engineering. pp. 500–504. IEEE (2009)
70. Yang, Y., Xiang, P., Kong, J., Zhou, H.: A GPGPU compiler for memory opti-
mization and parallelism management. ACM Sigplan Notices 45(6), 86–97 (2010)
Automatic Repair for Network Programs
Lei Shi1, Yuepeng Wang2, Rajeev Alur1, and Boon Thau Loo1
1University of Pennsylvania, Philadelphia, USA
2Simon Fraser University, Burnaby, Canada
{shilei,alur,boonloo}@seas.upenn.edu yuepeng@sfu.ca
Abstract. Debugging imperative network programs is a difficult task
for operators as it requires understanding various network modules and
complicated data structures. For this purpose, this paper presents an au-
tomated technique for repairing network programs with respect to unit
tests. Given as input a faulty network program and a set of unit tests,
our approach localizes the fault through symbolic reasoning, and synthe-
sizes a patch ensuring that the repaired program passes all unit tests. It
applies domain-specific abstraction to simplify network data structures
and exploits function summary reuse for modular symbolic analysis. We
have implemented the proposed techniques in a tool called NetRep and
evaluated it on 10 benchmarks adapted from real-world software-defined
network controllers. The evaluation results demonstrate the effectiveness
and efficiency of NetRep for repairing network programs.
1 Introduction
Emerging tools for program synthesis and repair facilitate automation of pro-
gramming tasks in various domains. For example, in the domain of end-user
programming, synthesis techniques allow users without any programming expe-
rience to generate scripts from examples for extracting, wrangling, and manip-
ulating data in spreadsheets [13,40]. In computer-aided education, repair tech-
niques are capable of providing feedback on programming assignments to novice
programmers and help them improve programming skills [49,14]. In software
development, synthesis and repair techniques aim to reduce the manual efforts
in various tasks, including code completion [43,10], application refactoring [42],
program parallelization [8], bug detection [11,41], and patch generation [11,32].
As an emerging domain, Software-Defined Networking (SDN) offers the in-
frastructure for monitoring network status and managing network resources
based on programmable software, replacing traditional specialized hardware in
communication devices. Since SDN provides an opportunity to dynamically mod-
ify the traffic handling policies on programmable routers, this technology has
witnessed growing industrial adoption. However, using SDNs involves many pro-
gramming tasks that are inevitably susceptible to programmer errors leading to
bugs [3,23]. For example, a device with incorrect routing policies could forward a
packet to undesired destinations, and a buggy firewall rule may make the entire
network system vulnerable to security threats.
In the SDN framework, a logically centralized control plane generates rules
that are installed into data planes, which in turn decide the routing of packets
throughout the network. While network verification is a well-studied field in which
operators can be alerted to incorrectly installed rules [3,4,22], little prior work
has explored the problem of automatically repairing the corresponding bugs in the
control plane, especially for control planes written in widely used general-purpose
languages such as Java or Python. Existing work mostly restricts the target to control
plane programs written in domain-specific languages such as Datalog [51,17].
Since networks cannot tolerate even small mistakes, and most network operators
are not trained programmers, debugging and repair tools in this domain should
prioritize accuracy and automation. This means that many existing techniques for
general program repair are not suitable for this domain, as they trade accuracy for
heuristics that scale with the size of the analyzed programs and the number of
discovered potential bugs.
Motivated by the demand for automated repair and the limitations of ex-
isting techniques, we develop a precise and scalable program repair technique
for network programs. Specifically, our repair technique takes as input a net-
work program and a set of unit tests, reveals the program location that causes
the test failure, and automatically generates a patch to fix the program. In the
setting of SDN, a unit test corresponds to an incorrectly installed routing rule
generated by the control plane from a reported packet. Such unit tests can be
discovered by a separate network verification procedure [3,4,22].
Our main idea is to perform accurate repair through symbolic reasoning over
constraints that capture the semantics of the program, and to use modular analysis
to improve efficiency. We extended the encoding techniques from prior work [21,12]
to support object-oriented features in Java. We also developed a new approach that
focuses the analysis on one function at a time and gradually narrows down the range
of faulty statements along with the specification of the expected behavior.
The proposed technique is implemented in an automatic network program
repair tool called NetRep. To evaluate NetRep, we adapt 10 benchmarks from
real-world faulty network programs in Floodlight that require changing up to 3
lines of code to fix, and apply NetRep to repair the benchmarks automatically.
The experimental results show that NetRep is able to find a repair that passes
all unit tests for faulty programs up to 738 lines of code for 8 benchmarks using
2 or 3 test cases, outperforming a state-of-the-art repair tool for general Java
programs. Furthermore, NetRep is efficient in terms of repair time, requiring
only an average running time of 744 seconds across all benchmarks.
Contributions. We make the following main contributions in this paper:
– We present an automated program repair technique that aims to help network
operators debug and fix network controller programs automatically.
– We describe a bug localization approach based on symbolic execution and
constraint solving for programs with imperative object-oriented features such
as virtual function calls.
– We propose novel modular analysis techniques to effectively scale up the
symbolic reasoning for automatic repair.
1  @network public class MacAddr {
2    private long value;
3    private MacAddr(long v) { value = v; }
4    public static MacAddr NONE = new MacAddr(0);
5    public static MacAddr of(long v) { return new MacAddr(v); }
6    ... }
7  public class FirewallRule {
8    public MacAddr dl_dst; public boolean any_dl_dst;
9    public FirewallRule() {
10     dl_dst = MacAddr.NONE; any_dl_dst = true; ... }
11   public boolean isSameAs(FirewallRule r) {
12     if (... || any_dl_dst != r.any_dl_dst
13         || (any_dl_dst == false &&
14             dl_dst != r.dl_dst)) {
15       return false; }
16     return true; }
17   ... }
Fig. 1: Code snippet about a bug in Floodlight.
1  public boolean test(long mac1, long mac2) {
2    FirewallRule r1 = new FirewallRule();
3    r1.dl_dst = MacAddr.of(mac1); r1.any_dl_dst = false;
4    FirewallRule r2 = new FirewallRule();
5    r2.dl_dst = MacAddr.of(mac2); r2.any_dl_dst = false;
6    return r1.isSameAs(r2); }
Fig. 2: Unit test that reveals the bug in FirewallRule.
We develop a tool called NetRep based on the proposed techniques and
evaluate it using 10 benchmarks adapted from real-world network programs.
The evaluation results demonstrate that NetRep is effective for bug local-
ization and able to generate correct patches for realistic network programs.
2 Overview
In this section, we give a high-level overview of our repair techniques and
walk through the NetRep tool using an example adapted from the Floodlight
SDN controller [9].
Figure 1 shows a simplified code snippet about firewall rules in Floodlight.
Specifically, the program consists of two classes FirewallRule and MacAddr.
The FirewallRule class describes rules enforced by the firewall, including
information about source and destination MAC addresses. The MacAddr class is an
auxiliary data structure that stores the raw value of MAC addresses.³
The network program shown in Figure 1 is problematic because the isSameAs
function compares two MAC addresses using the != operator rather than a negation
of the equals function. The != operator only compares two objects based on their
memory addresses, whereas the intent of the developer is to check whether two
MAC addresses have the same raw value. The bug is revealed by the unit test
in Figure 2, and was confirmed and fixed by the Floodlight developers.⁴ Next, let
³ A unique 48-bit number that identifies each network device.
⁴ https://github.com/floodlight/floodlight/commit/4d528e4bf5f02c59347bb9c0beb1b875ba2c821e
us illustrate how NetRep localizes this bug based on unit tests test(1, 2) =
false and test(1, 1) = true and automatically synthesizes a patch to fix it.
At a high level, NetRep enters a loop that iteratively attempts to find the
fault location and synthesize the patch. Since our repair technique works in a
modular fashion, NetRep first selects a function F in the program and tries
to repair one possible fault location at a time. If NetRep cannot synthesize a
patch consistent with the provided unit tests for any potential fault location in
F, it backtracks and selects the next function and repeats the same process until
all possible functions are checked. We now describe the experience of running
NetRep on our illustrative example.
Iteration 1. NetRep selects the constructor of FirewallRule as the target func-
tion. Fault localization determines that the fault is located at the dl_dst =
MacAddr.NONE part of Line 10, because it is related to the equality checking in
the unit test. However, it is not the actual fault location. NetRep tries to synthesize
a patch that passes all unit tests to replace this statement, but fails.
Iteration 2. NetRep selects the same function, the constructor of FirewallRule,
but the fault localization switches to a different statement, any_dl_dst = true
at Line 10. Similar to Iteration 1, the synthesizer cannot generate a correct patch
by replacing this statement.
Iteration 3. Since none of the statements in the constructor is the fault location,
NetRep now selects a different function: isSameAs. The fault localization
determines that any_dl_dst = false at Line 13 may be the fault location, as it
may affect the testing results. However, having tried to replace the statement
with many other candidate statements, e.g., r.any_dl_dst = false and any_dl_dst
= true, the synthesizer still fails to generate the correct patch.
Last iteration. Finally, after several attempts to localize the fault, NetRep
identifies that the fault lies in dl_dst != r.dl_dst at Line 14, which is indeed the
reported bug location. At this point, the synthesizer manages to generate a correct
patch: !dl_dst.equals(r.dl_dst). Replacing the original condition at Line 14
with this patch results in a program that can pass all the provided test cases, so
NetRep has successfully repaired the original faulty program.
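To make the fix concrete, the following is a sketch of the repaired isSameAs under
the simplifications of Figure 1 (our rendering, not from the paper); the leading
ellipsis of the original condition and the remaining rule fields are elided, and we
assume MacAddr provides a value-based equals:

  public boolean isSameAs(FirewallRule r) {
      // Patched condition: compare MAC addresses by value via equals,
      // instead of the reference comparison dl_dst != r.dl_dst.
      if (any_dl_dst != r.any_dl_dst
              || (any_dl_dst == false && !dl_dst.equals(r.dl_dst))) {
          return false;
      }
      return true;
  }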
3 Preliminaries
In this section, we present the language of network programs and describe a
program formalism that is used in the rest of the paper. We also define the program
repair problem that we want to solve.
3.1 Language of Network Programs
The language of network programs considered in this paper is summarized in
Figure 3. A network program consists of a set of classes, where each class has an
optional annotation @network to denote that the class can benefit from network
domain-specific abstraction.
Prog   P ::= C+
Class  C ::= @network? class C { a+ F+ }
Func   F ::= function f(x1, . . . , xn) (L : s)+
Stmt   s ::= l := e | jmp (e) L | ret v | x := new C
           | x := C.f(v1, . . . , vn) | x := y.f(v1, . . . , vn)
Expr   e ::= l | c | op(e1, . . . , en)
LValue l ::= x | x.a | x[v]
Imm    v ::= x | c
x, y ∈ Variable   c ∈ Constant   L ∈ LineID
C ∈ ClassName   f, f′ ∈ FuncName   a ∈ FieldName
Fig. 3: Syntax of network programs.
Each class in the program consists of a list of fields and functions. Each func-
tion has a name, a parameter list, and a function body. The function body is a
list of statements, where each statement is labeled with its line number. Various
kinds of statements are included in our language of network programs. Specifically,
the assign statement l := e assigns expression e to left value l. The conditional
jump statement jmp (e) L first evaluates predicate e. If the result is true, then
the control flow jumps to line L; otherwise, it performs no operation. Note that
our language does not have traditional if statements or loop statements, but
those statements can be expressed using conditional jumps.⁵
The return statement ret v exits the current function with return value v. The new
statement x := new C creates an object of class C and assigns the object address
to variable x. The static call x := C.f(v1, . . . , vn) invokes the static function f in class
C with arguments v1, . . . , vn and assigns the return value to variable x. Similarly, the
virtual call x := y.f(v1, . . . , vn) invokes the virtual function f on receiver object
y with arguments v1, . . . , vn and assigns the return value to variable x. Different
kinds of expressions are supported, including constants, variable accesses, field
accesses, array accesses, arithmetic operations, and logical operations. Since the
semantics of network programs is similar to that of traditional programs written
in object-oriented languages, we omit the formal description of the semantics.
In addition, we assume each statement in the program is labeled with a
globally unique line number, and line numbers are consecutive within a function.
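For illustration (our example, not from the paper), an if-then-else such as
if (x > y) r := x else r := y can be expressed with conditional jumps in this
language as follows:

  1: jmp (x > y) 4    // if the condition holds, jump to the then branch
  2: r := y           // else branch
  3: jmp (true) 5     // unconditionally skip over the then branch
  4: r := x           // then branch
  5: ret r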
3.2 Problem Statement
We assume a unit test t is written in the form of a pair (I, O), where I is the
input and O is the expected output. Given a network program P and a unit
test t = (I, O), we say P passes the test t if executing P on input I yields the
expected output O, denoted by ⟦P⟧(I) = O. Otherwise, if ⟦P⟧(I) ≠ O, we say P
fails the test t. In general, given a network program P and a set of unit tests E,
program P is faulty modulo E if there exists a test t ∈ E such that P fails on t.
Now let us turn the attention to the meaning of fault locations and patches.
Definition 1 (Fault location and patch). Let P be a program that is faulty
modulo tests E. Line L is called the fault location of P if there exists a statement
s such that replacing line L of P with s yields a new program that can pass all
tests in E. Here, the statement s is called a patch to P.
⁵ Our repair techniques only handle bounded loops. If there are unbounded loops in
the network program, we need to perform loop unrolling.
Algorithm 1 Modular Program Repair
1: procedure Repair(P, E)
   Input: Program P, examples E
   Output: Repaired program P′, or ⊥ to indicate failure
2:   P ← Abstraction(P);
3:   V ← {L ↦ false | L ∈ Lines(P)}; P′ ← ⊥;
4:   while P′ = ⊥ do
5:     F ← SelectFunction(P, V);
6:     if F = ⊥ then return ⊥;
7:     V, P′ ← RepairFunction(P, F, E, V);
8:   return P′;
9: procedure RepairFunction(P, F, E, V)
   Input: Program P, function F, examples E, visited map V
   Output: Updated visited map V, repaired program P′ or ⊥
10:   P′ ← ⊥;
11:   while P′ = ⊥ do
12:     L ← LocalizeFault(P, F, E, V);
13:     if L ≠ ⊥ then
14:       V ← V[L ↦ true];
15:     else
16:       V ← V[L′ ↦ true | L′ ∈ TransInFunc(P, F)];
17:     if L = ⊥ or IsCallStmt(P, L) then return V, ⊥;
18:     P′ ← SynthesizePatch(P, E, F, L);
19:   return V, P′;
Problem statement. Given a network program P that is faulty modulo tests
E, our goal is to find a fault location L in P and generate the corresponding
patch s, such that for any unit test t ∈ E, the patched program P′ can always
pass the test t.
4 Modular Program Repair
In this section, we present our algorithm for automatically repairing network
programs from a set of unit tests.
4.1 Algorithm Overview
The top-level repair algorithm is described in Algorithm 1. The Repair proce-
dure takes as input a faulty network program Pand unit tests Eand produces
as output a repaired program Por to indicate repair failure.
At a high level, the Repair procedure maintains a visited map Vfrom line
numbers to boolean values, representing whether each line of Pis checked or not.
The Repair procedure first applies the domain-specific abstraction to program
P (Line 2) and initializes the visited map V by setting every line in P as not
checked (Line 3). Next, it tries to iteratively repair P in a modular way until it
finds a program P′ that is not faulty modulo tests E (Lines 4–8). In particular,
the Repair procedure invokes SelectFunction to choose a function F as the target
of repair (Line 5). If none of the functions in P can be repaired, it returns ⊥
to indicate that the repair procedure failed (Line 6). Otherwise, it invokes the
RepairFunction procedure (Line 7) to enter the localization-synthesis loop
inside the target function F.
In addition to the program P and the tests E, the RepairFunction procedure
takes as input a target function F and the current visited map V. It produces as
output the updated version of the visited map V, as well as a repaired program
P′, or ⊥ to indicate that the function F cannot be repaired. As shown in Lines
11–18 of Algorithm 1, RepairFunction alternately invokes the sub-procedures
LocalizeFault and SynthesizePatch to repair the target function. In particular, the
goal of LocalizeFault is to identify a fault location in function F. If LocalizeFault
manages to find a fault location L in F, then line L is marked as visited (Line
14). Otherwise, if LocalizeFault returns ⊥, it means function F and all functions
transitively invoked in F are correct or not repairable. In this case, all lines in
F and its transitive callees are marked as checked (Line 16). Furthermore, if
the identified fault location L corresponds to a statement that invokes some
function F′, it means the fault location is inside F′. Thus, RepairFunction
directly returns ⊥ (Line 17) and SelectFunction will choose F′ as the target
function in the next iteration. On the other hand, the goal of the sub-procedure
SynthesizePatch is to generate a patch for function F given the fault location L.
If SynthesizePatch successfully synthesizes a patch and produces a non-faulty
program P′, then the entire procedure succeeds with the repaired program P′.
Otherwise, RepairFunction backtracks with a new program location and
repeats the same process.
In the rest of this section, we explain fault localization, modular analysis,
and patch synthesis in more detail.
4.2 Fault Localization
Next, we give a high-level description of our fault localization technique that
aims to find the fault location in a given program. This corresponds to the
LocalizeFault procedure in Algorithm 1. We will first show how to encode the
problem on an entire program, and then explain how the analysis can be made
modular to boost the performance.
At a high level, our fault localization technique uses a symbolic approach
by reducing the fault localization problem into a constraint solving problem. In
particular, we introduce a boolean variable for each line L, denoted by B[L], and
encode the fault localization problem as an SMT formula, such that the value
of the variable B[L] indicates whether line L is correct or not.
Checking faulty programs. To understand how to encode the fault localiza-
tion problem, let us first explain how to encode the consistency check given a
program P and a test case t = (I, O). Specifically, the encoded SMT formula
Φ(t) consists of three components:
1. Semantic constraints. For each line Li : si, we generate a formula Φi(S, S′) to
describe the semantics of the statement si. Specifically, given a state S that
holds before statement si, Φi(S, S′) is valid if S′ is the state after executing
si. There are two parts of the constraint: the memory contents that are
changed, and the memory contents that are preserved. For example, in the case
of an assignment statement, the constraint will state that 1) the evaluation
result of the right-hand side in state S equals the left value in state S′, and
2) all values except for the left value are the same in S and S′.
2. Control flow integrity constraints. In order to ensure all traces satisfying the
constraint faithfully follow the control flow structure of a given program P,
we generate another set of formulae Φf. Specifically, we require that any line
of code that is executed must have exactly one predecessor and one successor
that are executed, and the branch condition in the code must be respected
when picking the successor. This guarantees that there is exactly one valid
execution trace corresponding to each test case.
3. Consistency between program and test. For the provided test case t = (I, O),
we also generate formulas Φin(S0, I) and Φout(Sn, O) to ensure the program
behavior is consistent with the test. In particular, Φin(S0, I) binds input I to
the initial state S0, and Φout(Sn, O) describes the connection between output
O and the final state Sn.
The satisfiability of formula Φ(t) indicates the result of the consistency check.
If Φ(t) is satisfiable, the solver generates a feasible execution trace and an
assignment of all intermediate states along this trace. In this case, program P can
pass the test t because there exists a valid trace following the control flow, and
every pair of adjacent states in the trace is consistent with the semantics of the
corresponding statement. Otherwise, if Φ(t) is unsatisfiable, P fails the test t.
Now, to check P against a set of unit tests E, we can conjoin the formula Φ(tj)
for each unit test tj ∈ E and obtain the conjunction Φ = ⋀_{tj∈E} Φ(tj).
The satisfiability of formula Φ indicates whether P is faulty modulo tests E.⁶
Methodology of fault localization. Let P be a faulty program modulo E;
we know the corresponding formula Φ for the consistency check is unsatisfiable.
Suppose the fault location is line Li; one key insight is that replacing the semantic
constraint Φi(S, S′) with true yields a satisfiable formula. This is because true
does not enforce any constraint between the pre-state S and post-state S′, so a
previously invalid trace caused by the bug at Li becomes valid now.
Based on this insight, we develop a methodology to find the fault location
using symbolic reasoning. Specifically, given a consistency check formula Φ, we
can obtain a fault localization formula Φ′ by replacing the semantic constraint
Φi(S, S′) with B[Li] → Φi(S, S′) for every line Li, i ∈ [1, n]. Here, variable B[Li]
decides whether or not it turns the semantic constraint of Li into true. Thus,
B[Li] = false indicates Li is a fault location.
⁶ The encoding is described in more detail in the extended version [46].
One hiccup here is that formula Φ′ is always satisfiable, and a model of Φ′
can simply assign B[Li] = false for all Li. It means all lines in the program are
fault locations, which is not useful for fault localization. To address this issue,
we can add a cardinality constraint stating that there are exactly K variables in map
B that can be assigned to false, which forces the constraint solver to find exactly
K fault locations in program P.
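In the same simplified notation as before (ours, consistent with the description
above but not the paper's verbatim encoding), the fault localization query guards
each semantic constraint with its line variable and bounds the number of disabled
lines:

  Φ′ = ⋀_{t∈E} ( Φin ∧ Φf ∧ Φout ∧ ⋀_{i=1}^{n} (B[Li] → Φi(S_{i−1}, S_i)) )
       ∧ |{ i : B[Li] = false }| = K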
Modular analysis. The method above can precisely compute a potential fault
location, but an obvious shortcoming is that it is hard to scale. Encoding a long
program involves 1) a large number of semantic constraints, 2) many fault location
choices, as well as 3) many intermediate states to be assigned.
Notice that although a program can be arbitrarily long, developers usually
follow the design practice that every function is of limited size. Focusing the
analysis on one function at a time and recursively searching for the final fault
location can be far more efficient than solving one NP-hard problem at the scale
of the entire program.
To facilitate modular analysis of a function, we need to summarize the behavior
of its sub-modules (callee functions) and infer an external specification from
its higher-level module (caller function).
The encoding method introduced above treats one line of code as a constraint
on its pre-state and post-state. To summarize the behavior of a callee function,
we aim to turn it into a similar constraint on the pre-state and post-state of the
calling statement. The inner states of this callee function should be skipped in
the encoding. We can compute such summaries of the target function's callees
by symbolic execution. We start with a symbolic representation of the pre-state
and execute the callee function until it returns, and assert that the output state
equals the post-state. In this way, we can entirely eliminate all bug location
choices and inner state assignments in the callee function, as well as greatly
simplify the semantic constraint.
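As a concrete illustration (ours, reusing MacAddr.of from Figure 1), the summary
of a call x := MacAddr.of(v) relates the call's pre-state S and post-state S′ in a
single constraint, with no per-line choices or inner states for the callee:

  summary_of(S, S′) ≡ ∃a. S′ = S[x ↦ a][heap(a).value ↦ v],
  where a is a fresh heap address.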
There are two ways to infer the specification of the target function. The first way
is to encode only the calling stack of the target function up to the top-level
function, where we can use the test case as the specification. All function calls
made by the target's caller and transitive callers that are not in the stack can be
replaced by the automatically computed summary. We can also disable all fault
location choices except for lines in the target function. Another way is to infer a
possible pre-condition and post-condition of the target function. From the
perspective of the caller, the target function is a line of code that puts an incorrect
constraint on its pre-state and post-state. After the analysis, the constraint solver
will infer a feasible pre-state and post-state assuming this incorrect constraint is
removed. This assignment can be used as the pre-condition and post-condition,
which eliminates the need to encode any caller function. Since the second
approach may introduce incompleteness into the analysis, we use it only to
infer a specification for synthesizing the final patch, and use the first one for every
function's analysis.
Domain-specific abstraction. A domain-specific abstraction is essentially a
function summary as discussed above. But for those repeatedly used network
classes (identified by the @network annotation), we can pre-define some more
succinct abstractions based on domain knowledge to make the analysis easier.
The abstraction A[F] of a function F is an over-approximation of F that is
precise enough to characterize the behavior of F.
The abstraction is useful due to two observations. First, source code for
network programs may only be partially available due to the use of high-level
interfaces and native implementations. For example, when comparing the equality
of two network addresses, the getClass function is frequently used, but
its implementation depends on the runtime and is not available. To make the
analysis easier, we can instead use the following abstraction for such comparisons:
A[equals] : λx. λy. (x.dtype = y.dtype ∧ x.value = y.value),
where x.dtype denotes the dynamic type of the object x.
Second, network programs have complex operations that are challenging for
symbolic reasoning. For instance, bit manipulations are heavily used in network
data structures. While bit manipulations can improve the performance of net-
work programs, they present significant challenges for symbolic analysis due to
the encoding in the theory of bitvectors. We can give an abstraction that is
equivalent in correctness but simpler in behavior, e.g., using the identity function
instead of a hash code computation.
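A minimal Java rendering of the equals abstraction above (our sketch; AbsObj,
dtype, and value are hypothetical names for the abstract object representation
used during symbolic analysis):

  // Abstract stand-in for @network objects during symbolic analysis:
  // only the dynamic type and the raw value are tracked.
  class AbsObj {
      int dtype;   // dynamic type tag of the object
      long value;  // raw value, e.g., the 48-bit MAC address

      // A[equals]: two abstract objects are equal iff their dynamic
      // types and raw values coincide, mirroring the lambda above.
      boolean absEquals(AbsObj other) {
          return this.dtype == other.dtype && this.value == other.value;
      }
  }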
4.3 Patch Synthesis
The last step of our repair algorithm is to generate a patch that fixes the faulty
program. This corresponds to the SynthesizePatch procedure in Algorithm 1. It
can be reduced to a sketch-finishing problem in program synthesis, where we
replace the existing faulty line with a hole.
Our general idea is to use plain enumerative search with a depth bound in
the space of candidate patches, but with two significant optimizations.
First, we reduce the search space with heuristics. On one hand, we only replace
the core expression in the faulty statement with a hole, to focus on the most
expressive part. To be specific, we consider changing the right-hand-side
expressions of assignments, conditional expressions of jump statements, return values
of return statements, and the functions and arguments of function invocations. On
the other hand, we use a limited grammar to guide the search. We parameterize
all constants, variables, fields, functions, and operators over the sketch and only
instantiate constructs that are in scope. For example, given a particular sketch
with a hole, we only populate the variable set with the local and global variables
that are in scope of the hole. Also, if the hole corresponds to the conditional
expression of an if statement, we only add logical operators to the grammar.
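A minimal sketch of such a depth-bounded enumeration over expression strings
(our illustration; the names Enumerator, atoms, and ops are hypothetical, and
NetRep performs the actual search symbolically on top of Rosette):

  import java.util.ArrayList;
  import java.util.List;

  // Depth-bounded enumerative search over a restricted grammar.
  class Enumerator {
      private final List<String> atoms; // variables/constants in scope of the hole
      private final List<String> ops;   // operators allowed for this hole kind

      Enumerator(List<String> atoms, List<String> ops) {
          this.atoms = atoms;
          this.ops = ops;
      }

      // All candidate expressions whose nesting depth is at most 'depth'.
      List<String> enumerate(int depth) {
          List<String> result = new ArrayList<>(atoms);
          if (depth == 0) return result;
          List<String> sub = enumerate(depth - 1);
          for (String op : ops)
              for (String left : sub)
                  for (String right : sub)
                      result.add("(" + left + " " + op + " " + right + ")");
          return result;
      }
  }

For the hole at Line 14 of Figure 1, atoms would contain terms such as dl_dst,
r.dl_dst, and !dl_dst.equals(r.dl_dst), while ops would be restricted to
logical operators; each candidate is then checked against the local specification.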
Second, we use the local specification to guide the synthesis. Sketch completion
is different from synthesizing a complete program in that the specification
is defined for the entire program: we would have to repeatedly waste time
executing the correct parts of the program to verify a candidate patch. We use the
technique described in the modular analysis section to generate a pre-condition and
post-condition for only the faulty line. In this way, only the generated patch
needs to be executed and verified against the specification, which greatly saves time
as the program grows larger.
5 Implementation
We have implemented the proposed repair technique in a tool called NetRep.
NetRep leverages the Soot static analysis framework [26] to convert Java pro-
grams into Jimple code, which provides a succinct yet expressive set of instruc-
tions for analysis. In addition, NetRep utilizes the Rosette tool [48] to perform
symbolic reasoning for fault localization and patch synthesis. While our imple-
mentation closely follows the algorithm presented in Section 4, we also performed
several optimizations that are important for the performance of NetRep.
Memories for different types. Since the conversion between bitvectors and in-
tegers imposes significant overhead on running time, NetRep divides the mem-
ory into one part for integers and another for bitvectors. In this design, NetRep
automatically selects the memory chunk based on the variable types. The type
checking guarantees that no such conversion will occur.
Stack and heap. In order to reduce the number of memory operations, NetRep
also divides the memory into a stack and a heap. As is standard, the stack only
stores static data and its layout is deterministic. Therefore, stacks are implemented
using fixed-size vectors, and thus can be efficiently accessed for read and
write operations. On the other hand, the heap stores dynamic data that are usually
not known at compile time, such as allocated objects. Since the heap size cannot
be determined beforehand, NetRep uses an uninterpreted function f(x) to
represent heaps, where x is the address and f(x) is the value stored at x.
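This corresponds to the standard select/store treatment of arrays (our
formulation; the paper does not spell out the axioms): a heap write yields a new
function f′ constrained pointwise,

  f′ = write(f, a, v)  ⟺  f′(a) = v ∧ ∀x. x ≠ a → f′(x) = f(x).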
String values. Since reasoning over string values is a challenging task and not
always necessary for repairing network programs, we simplify the representation
of strings with integer values. Specifically, NetRep maps each string literal
to a unique integer and represents all string operations (e.g., concatenation) with
uninterpreted functions.
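A minimal Java sketch of such interning (ours; in the implementation this
happens inside the Rosette-based encoding):

  import java.util.HashMap;
  import java.util.Map;

  // Map each distinct string literal to a unique integer id, so the
  // solver reasons over integers instead of the theory of strings.
  class StringInterner {
      private final Map<String, Integer> ids = new HashMap<>();

      int intern(String literal) {
          // First occurrence gets the next free id (0, 1, 2, ...).
          return ids.computeIfAbsent(literal, k -> ids.size());
      }
  }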
Bounded program analysis. In order to improve the repair time, NetRep
only performs bounded program analysis for fault localization and patch syn-
thesis. Namely, we unroll loops and inline functions up to K times, where K is a
predefined hyper-parameter. In this way, function summaries can be easily and
efficiently computed using symbolic execution.
6 Evaluation
To evaluate the proposed techniques, we perform experiments that are designed
to answer the following research questions:
RQ1 Is NetRep effective to repair realistic network programs?
RQ2 How efficient are the fault localization and repair techniques in NetRep?
RQ3 How helpful are modular analysis and domain-specific abstraction for repair-
ing network programs?
ID  Module           LOC  # Funcs  # Tests  Succ  Exp  Loc Time (s)  Synth Time (s)  Total Time (s)
 1  DHCP             212    17       2      Yes   Yes       40            117             157
 2  Load Balancer    336    28       2      No    No         -              -               -
 3  Firewall         262    13       2      Yes   Yes      893            197            1090
 4  DHCP             431    32       2      Yes   Yes       95             39             134
 5  Utility          809    65       2      No    No         -              -               -
 6  Routing          605    44       3      Yes   Yes      271            179             450
 7  Utility          454    45       2      Yes   Yes       39             46              85
 8  Learning Switch  738    34       2      Yes   No       571            595            1166
 9  Database         442    17       2      Yes   No       310           2139            2449
10  Link Discovery   671    46       2      Yes   No       268            158             426
Table 1: Experimental results of NetRep.
RQ4 How does NetRep compare to other repair tools for Java programs?
Benchmark collection. To obtain realistic benchmarks, we crawl the commit
history of Floodlight [9], a representative open-source SDN controller in Java
that supports the OpenFlow protocol and a rich set of network functions. To
distinguish commits caused by bug repairs from those generated for non-repair
scenarios, we identify commits based on the following criteria: 1) The commit
message contains keywords about repairing bugs, e.g., “bug”, “error”, “fix”; 2)
The commit changes no more than three lines of code.
Following these criteria, we have collected 10 commits from the Floodlight
repository and adapted them into our benchmarks. Specifically, given a commit
in the repository, we take the code before the commit as the faulty network
program and the version after the commit as the ground-truth repaired program.
The code is post-processed and the parts irrelevant to the bug of interest are
removed. We also identify corresponding unit tests and modify them to directly
reveal the bug as appropriate. Each benchmark in our evaluation consists of a
faulty network program and its corresponding unit tests.
Experimental setup. All experiments are conducted on a computer with a 4-core
2.80 GHz CPU and 16 GB of physical memory, running the Arch Linux operating
system. We use Racket v7.7 as the compiler and runtime system of NetRep and
set a time limit of 1 hour for each benchmark.
6.1 Main Results
Our main experimental results are summarized in Table 1. The column labeled
“Module” describes the network module to which the benchmark belongs. The
next two columns labeled “LOC” and “# Funcs” show the number of lines
of source code (in Jimple) and the number of functions, respectively. The “#
Tests” column presents the number of unit tests used for fault localization and
patch synthesis. Next, the “Succ” and “Exp” columns show whether NetRep
can successfully repair the program and whether the generated patch is exactly the
same as the ground truth. Since NetRep returns the first fix that can pass all
provided test cases, the repaired programs are not necessarily the same as those
expected in the ground-truth. In this case, the table will show a “Yes” in the
“Succ” column and a “No” in the “Exp” column. Finally, the last three columns
in Table 1 denote the fault localization time, patch synthesis time and the total
running time of NetRep.
As shown in Table 1, the number of functions per benchmark ranges from 13 to 65,
with an average of 34 across all benchmarks. Each benchmark has 212–809 lines
of Jimple code, with the average being 496. NetRep
succeeds in repairing 8 out of 10 benchmarks. Furthermore, for 5 benchmarks
that can be successfully repaired, NetRep is able to generate exactly the same
fix as the ground truth. Given that our benchmarks cover programs from a variety
of modules of Floodlight, such as DHCP Server, Firewall, etc., we believe that
NetRep is effective at repairing realistic network programs (RQ1).
We inspected the reasons why NetRep fails to repair benchmarks 2 and 5.
NetRep is not able to localize the fault in benchmark 2 due to its incomplete
support for unbounded data structures with dynamic allocation, such as hash
maps. For benchmark 5, NetRep is able to localize the fault but not able to
synthesize the correct patch. This is because the expected function to be invoked
has side effects shared with another function, which requires some improvements
in the specification checking to verify.
Regarding the efficiency, NetRep can repair 8 benchmarks in an average of
744 seconds with only 2 to 3 test cases. The fault localization time ranges from
39 seconds to 893 seconds, with 50% of the benchmarks within five minutes. The
patch synthesis time ranges from 39 seconds to 2139 seconds, with 60% of the
benchmarks within five minutes. In summary, the evaluation results show that
NetRep only takes minutes to localize bugs in a faulty program and synthesize
a correct patch based on two to three unit tests (RQ2).
6.2 Ablation Study
To explore the impact of modular analysis and domain-specific abstraction on
the proposed repair technique, we develop three variants of NetRep:
– NetRep-NoMod is a variant of NetRep without modular analysis. Specifically,
NetRep-NoMod inlines the functions in a given program but still uses
abstractions for network data structures for fault localization and patch
synthesis.
– NetRep-NoAbs is a variant of NetRep without domain-specific abstraction.
In particular, NetRep-NoAbs uses the original concrete implementation of
network functions for symbolic reasoning. If the implementation is written in
a different language, we manually translate the implementation to Java.
– NetRep-NoModAbs is a variant of NetRep without modular analysis or
domain-specific abstraction. NetRep-NoModAbs simply inlines all functions
in the faulty program, including those in the network data structures,
and performs symbolic analysis for fault localization and patch synthesis.
To understand the impact of modular analysis and domain-specific abstrac-
tion, we run all variants on the 10 collected benchmarks. For each variant, we
Automatic Repair for Network Programs 365
[Figure 4 plots the running time in seconds (y-axis, 0 to 2,000) against the number
of solved benchmarks (x-axis, 0 to 10) for NetRep, NetRep-NoAbs, NetRep-NoMod,
and NetRep-NoModAbs.]
Fig. 4: Comparing NetRep against three variants.
measure the total running time (including time for fault localization and time
for patch synthesis) on each benchmark, and order the results by running time
in increasing order. The results for all variants are depicted in Figure 4. All lines
stop at the last benchmark that the corresponding variant can solve within the
1-hour time limit.
As shown in Figure 4, both NetRep-NoAbs and NetRep-NoMod can only
solve 4 out of 10 benchmarks in the evaluation, with average running times
of 569 seconds and 610 seconds, respectively. NetRep-NoModAbs solves
the least number of benchmarks: 3 out of 10. For the ones that it can solve,
the average running time is 1165 seconds. This experiment shows that modular
analysis and domain-specific abstraction each provide a great boost to NetRep's
efficiency in repairing network programs (RQ3).
6.3 Comparison with the Baseline
To understand how NetRep performs compared to other Java program repair
tools, we compare NetRep against a state-of-the-art tool called Jaid [5] on our
benchmarks. Specifically, Jaid takes as input a faulty Java program, a set of
unit tests, and a function signature for fault localization and patch synthesis,
a setting closest to NetRep among a variety of tools. Note that Jaid solves a
simpler repair problem than NetRep, because it requires the user to specify a
function that is potentially incorrect in the program, whereas NetRep does not
need input other than the faulty program and unit tests. In order to run Jaid
on our benchmarks, we adjust their formats to fit Jaid’s and provide the faulty
function (known from the ground truth) as input for Jaid.
Jaid indefinitely enumerates all possible patches, rather than recommending
a single most likely one. We consider it successful if the expected patch can be
found among the results. In practice, human assistance is needed to pick out this
patch from the thousands of candidates.
As a result, Jaid is able to finish on 8 out of 10 benchmarks. The expected
patches are found for 2 of them, whereas NetRep gives the expected
result for 5 benchmarks on the first recommendation. Jaid is unable to fix one of
the remaining benchmarks, and it runs out of memory on the other.
We argue that NetRep is better suited for automatically repairing network
programs than Jaid. First, it only requires network operators to provide
unit test cases. As discussed above, these can be automatically discovered by
another verification or testing procedure. In comparison, Jaid requires users to
be skilled in programming network controllers in order to identify the buggy
function and pick the correct patch from the results, which is beyond the ability
of most network operators and starts to require an expert team. Second, NetRep
has higher repair accuracy. As discussed above, networks are sensitive to small
mistakes, and high accuracy is crucial for a network to function correctly.
In summary, NetRep is more effective in automatically fixing bugs in network
programs than state-of-the-art repair tools for Java programs, especially with
respect to repair accuracy and automation (RQ4).
7 Related Work
Automated program repair. Automated program repair is an active re-
search area that aims to automatically fix the mistakes in programs based on
specifications of correctness criteria [11,28,39,18], with a variety of applications
such as aiding software development [34], finding security vulnerabilities [37],
and teaching novice programmers [49,14]. Different techniques have been pro-
posed to solve the automated program repair problem, including heuristics-based
techniques [16,31], semantics-based techniques [37,27], and learning-based tech-
niques [45,30,32,47]. NetRep is a semantics-based automated repair tool. Dif-
ferent from prior work, NetRep is specialized to repair network programs based
on modular analysis and network data structure abstractions.
Fault localization. Researchers have developed various approaches to fault lo-
calization, including spectrum-based, learning-based, and constraint-based tech-
niques. Specifically, spectrum-based techniques [27,1,2,7,44,6,19] perform
fault localization by identifying which parts of the program are active during a run
through execution profiles (called program spectra). Learning-based techniques
[29,53,54] typically train machine learning models to predict and rank possible
fault locations. By contrast, constraint-based techniques [21,20,12] encode the
semantics of problems as logical constraints and reduce the fault localization
problem into constraint satisfaction problem. In spirit, NetRep uses a similar
idea for fault localization. However, NetRep performs modular analysis and
enables debugging programs involving object-oriented features, whereas prior
work only analyzes the entire program in a C-like language. Besides, NetRep
reuses the fault localization result to speed up the patch synthesis, while prior
work mainly focuses on the fault localization step.
Patch synthesis. Many synthesis algorithms have been developed for generat-
ing patches, including enumerative search [27], constraint-based techniques [37],
statistical models [52], machine learning [15], hints from existing code [25], and
so on. In terms of patch synthesis, NetRep generates a context-free grammar
from the context of fault locations and performs enumerative search based on the
grammar to synthesize patches. It does not require a machine learning model or
statistical information for ranking all possible patches. However, it is conceivable
that NetRep will benefit from the guidance of such ranking techniques.
Automatic Repair for Network Programs 367
Verification and synthesis for SDN. In the networking domain, several ver-
ification tools [3,33,23,24] have been proposed based on either model checking
or theorem proving. For example, VeriCon [3] performs deductive verification
to verify the correctness of SDN programs specified by network-wide invari-
ants on all admissible topologies. In addition to verification, synthesis tech-
niques [36,35,38] have also been proposed to aid software-defined networking.
NetRep aims to repair network programs automatically, which is a different
problem than SDN verification or synthesis.
Repair for network programs. Our work is most related to automated re-
pair of network programs in the SDN domain [50,51,17]. Prior work on auto-repair
[50,51] relies on using Datalog to capture the operational semantics of the
target language to be repaired. The repair techniques work for domain-specific
languages (e.g. Datalog or Ruby on Rails) with simple structure. Similarly, Ho-
jjat et al. [17] propose a framework based on horn clause repair problem to
help network operators fix faulty configurations. However, NetRep targets Java
network programs with object-oriented features and more complex constructs,
which cannot be handled by existing techniques.
8 Limitations and Future Work
We discuss several limitations of NetRep that we plan to improve in future
work. First, NetRep repairs the faulty network program with the first patch
that passes all tests. If this patch is not the one intended by the user, user
interaction that resumes the synthesis, or a more formal specification, could be
introduced.
Second, patches that require complicated changes, e.g., those involving
control-flow structures, are beyond NetRep's ability. Such patches make up 44% of our
collection of bug-fixing commits. We envision that this challenge can be addressed
by introducing more sophisticated patch synthesis techniques, such as searching
over a domain-specific language for edits.
Third, in order to force symbolic execution to terminate in finite time, NetRep
currently unrolls all loops in the network program, which may result in
missing potential bugs. Loop invariant inference techniques could be leveraged to
overcome this challenge while still guaranteeing termination.
9 Conclusion
In this paper, we have proposed an automated repair technique for network
controller programs with unit tests as specifications. Our technique internally
performs symbolic reasoning for bug localization and patch synthesis, optimized
by network domain-specific abstractions and modular analysis to reduce encoding
size. We have implemented a tool called NetRep and evaluated it on 10
benchmarks adapted from the Floodlight framework. The experimental results
demonstrate that NetRep is effective for repairing realistic network programs
with moderate change sizes.
References
1. Abreu, R., Zoeteweij, P., van Gemund, A.J.C.: Spectrum-based multiple fault local-
ization. In: Proceedings of the IEEE/ACM International Conference on Automated
Software Engineering (ASE). pp. 88–99. IEEE Computer Society (2009)
2. Abreu, R., Zoeteweij, P., van Gemund, A.J.: On the accuracy of spectrum-based
fault localization. In: Testing: Academic and Industrial Conference Practice and
Research Techniques - MUTATION. pp. 89–98 (2007)
3. Ball, T., Bjørner, N., Gember, A., Itzhaky, S., Karbyshev, A., Sagiv, M., Schapira,
M., Valadarsky, A.: Vericon: towards verifying controller programs in software-
defined networks. In: Proceedings of the ACM SIGPLAN Conference on Program-
ming Language Design and Implementation (PLDI). pp. 282–293. ACM (2014)
4. Beckett, R., Gupta, A., Mahajan, R., Walker, D.: A general approach to network
configuration verification. In: Proceedings of the Conference of the ACM Special
Interest Group on Data Communication. pp. 155–168 (2017)
5. Chen, L., Pei, Y., Furia, C.A.: Contract-based program repair without the con-
tracts. In: Proceedings of the IEEE/ACM International Conference on Automated
Software Engineering (ASE). pp. 637–647. IEEE Computer Society (2017)
6. Chen, M.Y., Kiciman, E., Fratkin, E., Fox, A., Brewer, E.A.: Pinpoint: Problem
determination in large, dynamic internet services. In: Proceedings of the Inter-
national Conference on Dependable Systems and Networks (DSN). pp. 595–604.
IEEE Computer Society (2002)
7. Dallmeier, V., Lindig, C., Zeller, A.: Lightweight defect localization for Java.
In: Proceedings of the European Conference on Object-Oriented Programming
(ECOOP). Lecture Notes in Computer Science, vol. 3586, pp. 528–550. Springer
(2005)
8. Fedyukovich, G., Ahmad, M.B.S., Bodík, R.: Gradual synthesis for static paral-
lelization of single-pass array-processing programs. In: Proceedings of the 38th
ACM SIGPLAN Conference on Programming Language Design and Implementa-
tion, PLDI 2017, Barcelona, Spain, June 18-23, 2017. pp. 572–585. ACM (2017)
9. Floodlight: https://github.com/floodlight/floodlight (2021)
10. Galenson, J., Reames, P., Bodík, R., Hartmann, B., Sen, K.: Codehint: dynamic
and interactive synthesis of code snippets. In: Jalote, P., Briand, L.C., van der
Hoek, A. (eds.) Proceedings of the International Conference on Software Engineer-
ing (ICSE). pp. 653–663. ACM (2014)
11. Goues, C.L., Pradel, M., Roychoudhury, A.: Automated program repair. Commun.
ACM 62(12), 56–65 (2019)
12. Griesmayer, A., Bloem, R., Cook, B.: Repair of boolean programs with an ap-
plication to c. In: International Conference on Computer Aided Verification. pp.
358–371. Springer (2006)
13. Gulwani, S.: Automating string processing in spreadsheets using input-output ex-
amples. In: Ball, T., Sagiv, M. (eds.) Proceedings of the ACM SIGPLAN-SIGACT
Symposium on Principles of Programming Languages (POPL). pp. 317–330. ACM
(2011)
14. Gulwani, S., Radicek, I., Zuleger, F.: Automated clustering and program repair for
introductory programming assignments. In: Proceedings of the ACM Conference on
Programming Language Design and Implementation (PLDI). pp. 465–480. ACM
(2018)
15. Gupta, R., Pal, S., Kanade, A., Shevade, S.K.: Deepfix: Fixing common C lan-
guage errors by deep learning. In: Singh, S.P., Markovitch, S. (eds.) Proceedings of
the Thirty-First AAAI Conference on Artificial Intelligence. pp. 1345–1351. AAAI
Press (2017)
16. Harman, M.: Automated patching techniques: the fix is in: technical perspective.
Commun. ACM 53(5), 108 (2010)
17. Hojjat, H., Rümmer, P., McClurg, J., Černý, P., Foster, N.: Optimizing Horn solvers
for network repair. In: Piskac, R., Talupur, M. (eds.) Proceedings of the Formal
Methods in Computer-Aided Design (FMCAD). pp. 73–80. IEEE (2016)
18. Hong, S., Lee, J., Lee, J., Oh, H.: SAVER: scalable, precise, and safe memory-error
repair. In: Proceedings of the International Conference on Software Engineering
(ICSE). pp. 271–283. ACM (2020)
19. Jones, J.A., Harrold, M.J., Stasko, J.T.: Visualization of test information to assist
fault localization. In: Proceedings of the 24th International Conference on Software
Engineering, ICSE 2002, 19-25 May 2002, Orlando, Florida, USA. pp. 467–477.
ACM (2002)
20. Jose, M., Majumdar, R.: Bug-assist: Assisting fault localization in ANSI-C pro-
grams. In: Proceedings of International Conference on Computer Aided Verification
(CAV). LNCS, vol. 6806, pp. 504–509. Springer (2011)
21. Jose, M., Majumdar, R.: Cause clue clauses: error localization using maximum
satisfiability. In: Proceedings of the ACM Conference on Programming Language
Design and Implementation (PLDI). pp. 437–446. ACM (2011)
22. Kazemian, P., Varghese, G., McKeown, N.: Header space analysis: Static check-
ing for networks. In: 9th USENIX Symposium on Networked Systems Design and
Implementation (NSDI 12). pp. 113–126 (2012)
23. Khurshid, A., Zou, X., Zhou, W., Caesar, M., Godfrey, P.B.: Veriflow: Verifying
network-wide invariants in real time. In: Proceedings of the USENIX Symposium
on Networked Systems Design and Implementation (NSDI). pp. 15–27. USENIX
Association (2013)
24. Kim, H., Reich, J., Gupta, A., Shahbaz, M., Feamster, N., Clark, R.J.: Kinetic:
Verifiable dynamic network control. In: Proceedings of the USENIX Symposium
on Networked Systems Design and Implementation (NSDI). pp. 59–72. USENIX
Association (2015)
25. Kneuss, E., Koukoutos, M., Kuncak, V.: Deductive program repair. In: Interna-
tional Conference on Computer Aided Verification. pp. 217–233. Springer (2015)
26. Lam, P., Bodden, E., Lhoták, O., Hendren, L.: The Soot framework for Java pro-
gram analysis: a retrospective. In: Cetus Users and Compiler Infrastructure Work-
shop. vol. 15 (2011)
27. Le, X.D., Chu, D., Lo, D., Goues, C.L., Visser, W.: S3: syntax- and semantic-guided
repair synthesis via programming by examples. In: Bodden, E., Schäfer, W., van
Deursen, A., Zisman, A. (eds.) Proceedings of the Joint Meeting on Foundations
of Software Engineering, (ESEC/FSE). pp. 593–604. ACM (2017)
28. Li, G., Liu, H., Chen, X., Gunawi, H.S., Lu, S.: Dfix: automatically fixing timing
bugs in distributed systems. In: Proceedings of the ACM Conference on Program-
ming Language Design and Implementation (PLDI). pp. 994–1009. ACM (2019)
29. Li, X., Li, W., Zhang, Y., Zhang, L.: Deepfl: integrating multiple fault diagnosis
dimensions for deep fault localization. In: Zhang, D., Møller, A. (eds.) Proceed-
ings of the SIGSOFT International Symposium on Software Testing and Analysis
(ISSTA). pp. 169–180. ACM (2019)
30. Li, Y., Wang, S., Nguyen, T.N.: Dlfix: context-based code transformation learn-
ing for automated program repair. In: Proceedings of International Conference on
Software Engineering (ICSE). pp. 602–614. ACM (2020)
31. Long, F., Rinard, M.: Staged program repair with condition synthesis. In: Proceed-
ings of the Joint Meeting on Foundations of Software Engineering (ESEC/FSE).
pp. 166–178. ACM (2015)
32. Long, F., Rinard, M.: Automatic patch generation by learning correct code. In:
Proceedings of the Symposium on Principles of Programming Languages (POPL).
pp. 298–312. ACM (2016)
33. Lopes, N.P., Bjørner, N., Godefroid, P., Jayaraman, K., Varghese, G.: Checking
beliefs in dynamic networks. In: Proceedings of the USENIX Symposium on Net-
worked Systems Design and Implementation (NSDI). pp. 499–512. USENIX Asso-
ciation (2015)
34. Marginean, A., Bader, J., Chandra, S., Harman, M., Jia, Y., Mao, K., Mols, A.,
Scott, A.: Sapfix: automated end-to-end repair at scale. In: Proceedings of the In-
ternational Conference on Software Engineering: Software Engineering in Practice,
ICSE (SEIP). pp. 269–278. IEEE / ACM (2019)
35. McClurg, J., Hojjat, H., Černý, P.: Synchronization synthesis for network pro-
grams. In: Majumdar, R., Kuncak, V. (eds.) Proceedings of the International
conference on Computer Aided Verification (CAV). Lecture Notes in Computer
Science, vol. 10427, pp. 301–321. Springer (2017)
36. McClurg, J., Hojjat, H., Černý, P., Foster, N.: Efficient synthesis of network up-
dates. In: Grove, D., Blackburn, S.M. (eds.) Proceedings of the ACM SIGPLAN
Conference on Programming Language Design and Implementation (PLDI). pp.
196–207. ACM (2015)
37. Mechtaev, S., Yi, J., Roychoudhury, A.: Angelix: scalable multiline program patch
synthesis via symbolic analysis. In: Proceedings of the International Conference on
Software Engineering (ICSE). pp. 691–701. ACM (2016)
38. Padon, O., Immerman, N., Karbyshev, A., Lahav, O., Sagiv, M., Shoham, S.: De-
centralizing SDN policies. In: Proceedings of the ACM SIGPLAN-SIGACT Sympo-
sium on Principles of Programming Languages (POPL). pp. 663–676. ACM (2015)
39. Perry, D.M., Kim, D., Samanta, R., Zhang, X.: Semcluster: clustering of imperative
programming assignments based on quantitative semantic features. In: Proceedings
of the ACM Conference on Programming Language Design and Implementation
(PLDI). pp. 860–873. ACM (2019)
40. Polozov, O., Gulwani, S.: Flashmeta: a framework for inductive program synthesis.
In: Aldrich, J., Eugster, P. (eds.) Proceedings of the ACM SIGPLAN International
Conference on Object-Oriented Programming, Systems, Languages, and Applica-
tions, (OOPSLA). pp. 107–126. ACM (2015)
41. Pradel, M., Sen, K.: Deepbugs: a learning approach to name-based bug detection.
Proc. ACM Program. Lang. 2(OOPSLA), 147:1–147:25 (2018)
42. Raychev, V., Schäfer, M., Sridharan, M., Vechev, M.T.: Refactoring with synthe-
sis. In: Hosking, A.L., Eugster, P.T., Lopes, C.V. (eds.) Proceedings of the ACM
SIGPLAN International Conference on Object Oriented Programming Systems
Languages & Applications, (OOPSLA). pp. 339–354. ACM (2013)
43. Raychev, V., Vechev, M.T., Yahav, E.: Code completion with statistical language
models. In: O’Boyle, M.F.P., Pingali, K. (eds.) Proceedings of the ACM SIGPLAN
Conference on Programming Language Design and Implementation (PLDI). pp.
419–428. ACM (2014)
44. Renieris, M., Reiss, S.P.: Fault localization with nearest neighbor queries. In: Pro-
ceedings of the IEEE International Conference on Automated Software Engineering
(ASE). pp. 30–39. IEEE Computer Society (2003)
45. Sakkas, G., Endres, M., Cosman, B., Weimer, W., Jhala, R.: Type error feed-
back via analytic program repair. In: Proceedings of the International Conference
on Programming Language Design and Implementation (PLDI). pp. 16–30. ACM
(2020)
46. Shi, L., Wang, Y., Alur, R., Loo, B.T.: NetRep: Automatic repair for network
programs. https://arxiv.org/abs/2110.06303 (2021)
47. Sidiroglou-Douskos, S., Lahtinen, E., Long, F., Rinard, M.: Automatic error elim-
ination by horizontal code transfer across multiple applications. In: Proceedings
of the ACM Conference on Programming Language Design and Implementation
(PLDI). pp. 43–54. ACM (2015)
48. Torlak, E., Bodík, R.: A lightweight symbolic virtual machine for solver-aided host
languages. In: Proceedings of the ACM Conference on Programming Language
Design and Implementation (PLDI). pp. 530–541. ACM (2014)
49. Wang, K., Singh, R., Su, Z.: Search, align, and repair: data-driven feedback gener-
ation for introductory programming exercises. In: Proceedings of the ACM Confer-
ence on Programming Language Design and Implementation (PLDI). pp. 481–495.
ACM (2018)
50. Wu, Y., Chen, A., Haeberlen, A., Zhou, W., Loo, B.T.: Automated network repair
with meta provenance. In: Proceedings of the ACM Workshop on Hot Topics in
Networks (HotNets). pp. 26:1–26:7. ACM (2015)
51. Wu, Y., Chen, A., Haeberlen, A., Zhou, W., Loo, B.T.: Automated bug removal
for software-defined networks. In: Proceedings of the USENIX Symposium on Net-
worked Systems Design and Implementation (NSDI). pp. 719–733. USENIX Asso-
ciation (2017)
52. Xiong, Y., Wang, J., Yan, R., Zhang, J., Han, S., Huang, G., Zhang, L.: Precise
condition synthesis for program repair. In: Proceedings of the International Con-
ference on Software Engineering (ICSE). pp. 416–426. IEEE / ACM (2017)
53. Xuan, J., Monperrus, M.: Learning to combine multiple ranking metrics for fault
localization. In: Proceedings of the IEEE International Conference on Software
Maintenance and Evolution (ICSME). pp. 191–200. IEEE Computer Society (2014)
54. Zhang, Z., Lei, Y., Tan, Q., Mao, X., Zeng, P., Chang, X.: Deep learning-based
fault localization with contextual information. IEICE Trans. Inf. Syst. 100-D(12),
3027–3031 (2017)
Open Access This chapter is licensed under the terms of the Creative Commons
Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/),
which permits use, sharing, adaptation, distribution and reproduction in any medium
or format, as long as you give appropriate credit to the original author(s) and the
source, provide a link to the Creative Commons license and indicate if changes were
made.
The images or other third party material in this chapter are included in the chapter’s
Creative Commons license, unless indicated otherwise in a credit line to the material. If
material is not included in the chapter’s Creative Commons license and your intended
use is not permitted by statutory regulation or exceeds the permitted use, you will need
to obtain permission directly from the copyright holder.
Progress on Software Verification: SV-COMP 2022
Dirk Beyer
LMU Munich, Munich, Germany
Abstract. The 11th edition of the Competition on Software Verification
(SV-COMP 2022) provides the largest ever overview of tools for software
verification. The competition is an annual comparative evaluation of fully
automatic software verifiers for C and Java programs. The objective is
to provide an overview of the state of the art in terms of effectiveness
and efficiency of software verification, establish standards, provide a
platform for exchange to developers of such tools, educate PhD students
on reproducibility approaches and benchmarking, and provide computing
resources to developers that do not have access to compute clusters. The
competition consisted of 15 648 verification tasks for C programs and
586 verification tasks for Java programs. Each verification task consisted
of a program and a property (reachability, memory safety, overflows,
termination). The new category on data-race detection was introduced as
a demonstration category. SV-COMP 2022 had 47 participating verification
systems from 33 teams from 11 countries.
Keywords: Formal Verification · Program Analysis · Competition ·
Software Verification · Verification Tasks · Benchmark · C Language ·
Java Language · SV-Benchmarks · BenchExec · CoVeriTeam
1 Introduction
This report is the 2022 edition of the series of competition reports that
accompanies the competition; it explains the process and rules, gives insights
into some aspects of the competition (this time the focus is on trouble shooting
and reproducing results on a small scale), and, most importantly, reports the
results of the comparative evaluation. This report extends previous reports on
SV-COMP [10,11,12,13,14,15,16,17,18]; reproduction packages are available on
Zenodo (see Table 4). The 11th Competition on Software Verification (SV-COMP,
https://sv-comp.sosy-lab.org/2022) is the largest comparative evaluation ever
in this area. The objectives of the competition were discussed earlier
(objectives 1–4 [16]) and extended over the years (objectives 5–6 [17]):
1. provide an overview of the state of the art in software-verification technology
and increase visibility of the most recent software verifiers,
2. establish a repository of software-verification tasks that is publicly available
for free use as standard benchmark suite for evaluating verification software,
3. establish standards that make it possible to compare different verification
tools, including a property language and formats for the results,
4. accelerate the transfer of new verification technology to industrial practice
by identifying the strengths of the various verifiers on a diverse set of tasks,
5. educate PhD students and others on performing reproducible benchmarking,
packaging tools, and running robust and accurate research experiments, and
6. provide research teams that do not have sufficient computing resources with
the opportunity to obtain experimental results on large benchmark sets.
The SV-COMP 2020 report [17] discusses the achievements of the SV-COMP
competition so far with respect to these objectives.
Related Competitions. There are many competitions in the area of formal
methods [9], because it is well-understood that competitions are a fair and
accurate means to execute a comparative evaluation with involvement of the
developing teams. We refer to a previous report [17] for a more detailed discussion
and give here only the references to the most related competitions [20,53,67].
Quick Summary of Changes. While we try to keep the setup of the compe-
tition stable, there are always improvements and developments. For the 2022
edition, the following changes were made:
– A demonstration category on data-race detection was added. Due to several
participating verification tools, this category will become a normal main
category in SV-COMP 2023. The results are outlined in Sect. 5.
– New verification tasks were added, with an increase in C from 15 201 in 2021
to 15 648 in 2022 and in Java from 473 in 2021 to 586 in 2022, combined
with ongoing efforts on quality improvement.
2 Organization, Definitions, Formats, and Rules
Procedure. The overall organization of the competition did not change in
comparison to the earlier editions [10,11,12,13,14,15,16,17,18]. SV-COMP is
an open competition (also known as comparative evaluation), where all verification
tasks are known before the submission of the participating verifiers, which is
necessary due to the complexity of the C language. The procedure is partitioned
into the benchmark submission phase, the training phase, and the evaluation
phase. The participants received the results of their verifier continuously via
e-mail (for preruns and the final competition run), and the results were publicly
announced on the competition web site after the teams inspected them.
Competition Jury. Traditionally, the competition jury consists of the chair and
one member of each participating team; the team-representing members circulate
every year after the candidate-submission deadline. This committee reviews
the competition contribution papers and helps the organizer with resolving any
disputes that might occur (from competition report of SV-COMP 2013 [11]).
In more detail, the tasks of the jury consist of the following:
– The jury oversees the process and ensures transparency, fairness, and
community involvement.
– Each jury member who participates in the competition is assigned a number
of (3 or 4) submissions (papers and systems) to review.
– Participating systems are reviewed to determine whether they fulfill the
requirements for verifier archives, based on the archives submitted to the
repository.
– Teams and paper submissions are reviewed to verify the requirements for
qualification, based on the submission data and paper in EasyChair and the
results of the qualification runs.
– Some qualified competition candidates are selected to publish (in the LNCS
proceedings of TACAS) a contribution paper that gives an overview of the
participating system.
– The jury helps the organizer with discussing and resolving any disputes that
might occur.
– Jury members adhere to the deadlines with all the duties.
The team representatives of the competition jury are listed in Table 5.
License Requirements. Starting 2018, SV-COMP required that the verifier
must be publicly available for download and has a license that
(i) allows reproduction and evaluation by anybody (incl. results publication),
(ii) does not restrict the usage of the verifier output (log files, witnesses), and
(iii) allows any kind of (re-)distribution of the unmodified verifier archive.
Two exceptions were made to allow minor incompatibilities for commercial
participants. The jury felt that the rule “allows any kind of (re-)distribution of the
unmodified verifier archive” is too broad.
the possibilities for reproduction. Starting with SV-COMP 2023, this license
requirement shall be changed to “allows (re-)distribution of the unmodified
verifier archive via SV-COMP repositories and archives”.
Validation of Results. The validation of the verification results was done
by eleven validation tools, which are listed in Table 1, including references to
literature. Four new validators support the competition:
– There are two new validators for the C language: Dartagnan supports
result validation for violation witnesses in category ConcurrencySafety-Main.
Symbiotic-Witch supports result validation for violation witnesses in
categories ReachSafety, MemSafety, and NoOverflows.
– For the first time, there are validators for the Java language: GWit and
Wit4Java support result validation for violation witnesses in category
ReachSafety-Java.
Table 1: Tools for witness-based result validation (validators) and witness linter
Validator Reference Representative Affiliation
CPAchecker [25,26,28] Thomas Bunk LMU Munich, Germany
CPA-w2t [27] Thomas Lemberger LMU Munich, Germany
Dartagnan new [89] Hernán Ponce de León Bundeswehr U., Germany
CProver-w2t [27] Michael Tautschnig Queen Mary U. London, UK
GWit new [68] Falk Howar TU Dortmund U., Germany
MetaVal [33] Martin Spiessl LMU Munich, Germany
NitWit [105] Jana (Philipp) Berger RWTH Aachen, Germany
Symbiotic-Witch new [6] Paulína Ayaziová Masaryk U., Brno, Czechia
UAutomizer [25,26] Daniel Dietsch U. of Freiburg, Germany
Wit4Java new [108] Tong Wu U. of Manchester, UK
WitnessLint Sven Umbricht LMU Munich, Germany
Table 2: Scoring schema for SV-COMP 2022 (unchanged from 2021 [18])
Reported result Points Description
Unknown 0 Failure to compute verification result
False correct +1 Violation of property in program was correctly found
and a validator confirmed the result based on a witness
False incorrect −16 Violation reported but property holds (false alarm)
True correct +2 Program correctly reported to satisfy property
and a validator confirmed the result based on a witness
True incorrect −32 Incorrect program reported as correct (wrong proof)
Task-Definition Format 2.0. SV-COMP 2022 used the task-definition format
in version 2.0. More details can be found in the report for Test-Comp 2021 [19].
Properties. Please see the 2015 competition report [13] for the definition of the
properties and the property format. All specifications used in SV-COMP 2022
are available in the directory c/properties/ of the benchmark repository.
Categories. The (updated) category structure of SV-COMP 2022 is illustrated by
Fig. 1. The categories are also listed in Tables 8, 9, and 10, and described in detail
on the competition web site (https://sv-comp.sosy-lab.org/2022/benchmarks.php).
Compared to the category structure for SV-COMP 2021, we added the sub-category
Termination-BitVectors to category Termination and the sub-category
SoftwareSystems-BusyBox-ReachSafety to category SoftwareSystems.
Scoring Schema and Ranking. The scoring schema of SV-COMP 2022 was the
same as for SV-COMP 2021. Table 2 provides an overview and Fig. 2 visually
illustrates the score assignment for the reachability property as an example. As
before, the rank of a verifier was decided based on the sum of points (normalized
for meta categories). In case of a tie, the rank was decided based on success run
time, which is the total CPU time over all verification tasks for which the verifier
reported a correct verification result.
[Figure: category-structure tree. Sub-categories of ReachSafety: Arrays, BitVectors,
ControlFlow, ECA, Floats, Heap, Loops, ProductLines, Recursive, Sequentialized,
XCSP, Combinations; of MemSafety: Arrays, Heap, LinkedList, Other, MemCleanup,
Juliet; of ConcurrencySafety: Main; of NoOverflows: BitVectors, Other; of
Termination: BitVectors, MainControlFlow, MainHeap, Other; of SoftwareSystems:
AWS-C-Common ReachSafety, BusyBox ReachSafety, BusyBox MemSafety, BusyBox
NoOverflows, DeviceDriversLinux64 ReachSafety, DeviceDriversLinux64Large
ReachSafety, OpenBSD MemSafety, uthash MemSafety, uthash NoOverflows, uthash
ReachSafety; meta categories: C-Overall, C-FalsificationOverall, Java-Overall.]
Fig. 1: Category structure for SV-COMP 2022; category C-FalsificationOverall
contains all verification tasks of C-Overall without Termination; Java-Overall
contains all Java verification tasks; compared to SV-COMP 2021, there is one
new sub-category in Termination and one new sub-category in SoftwareSystems
[Figure: decision tree combining the verifier answer (true, false, unknown) with the
witness-validator outcome for tasks with expected result true-unreach or
false-unreach: unknown yields 0; a false alarm yields −16; a wrong proof yields −32;
a correct true confirmed by a validator yields 2; a correct false confirmed by a
validator yields 1; correct but unconfirmed results yield 0.]
Fig. 2: Visualization of the scoring schema for the reachability property
(unchanged from 2021 [18])
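For illustration, the following sketch (ours, not part of the competition infrastructure) shows how the schema of Table 2 and Fig. 2 combines a verifier's answer with the witness-validation outcome; the function name and the string encodings are assumptions of this example.

```python
def score_run(property_holds: bool, answer: str, confirmed: bool) -> int:
    """Score one verification run per the SV-COMP 2022 schema (Table 2).

    property_holds: whether the property actually holds in the program
    answer:         the verifier's result: "true", "false", or "unknown"
    confirmed:      whether a witness validator confirmed the result
    """
    if answer == "unknown":
        return 0  # failure to compute a verification result
    correct = (answer == "true") == property_holds
    if not correct:
        # Wrong results are penalized: a wrong proof is worse than a false alarm.
        return -32 if answer == "true" else -16
    # Correct results only earn points if a validator confirmed the witness.
    if not confirmed:
        return 0
    return 2 if answer == "true" else 1
```

For example, a confirmed correct True answer earns 2 points, while a false alarm on a correct program costs 16 points regardless of the witness.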
[Figure: benchmarking components and execution flow: (a) verification task,
(b) benchmark definition, (c) tool-info module, (d) verifier archive, (e) verification
run with outcomes FALSE / UNKNOWN / TRUE, (f) violation witness or
correctness witness.]
Fig. 3: Benchmarking components of SV-COMP and competition’s execution flow
(same as for SV-COMP 2020)
Opt-out from Categories and Score Normalization
for Meta Categories was done as described previously [11] (page 597).
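As a rough sketch of our reading of that normalization (each subcategory contributes its score divided by its task count, scaled by the average number of tasks per subcategory; see [11] for the normative definition):

```python
# Sketch of score normalization for meta categories; this follows our
# reading of the definition in [11] and is illustrative only.
def meta_category_score(subcategories):
    """subcategories: list of (score, number_of_tasks) pairs."""
    avg_tasks = sum(n for _, n in subcategories) / len(subcategories)
    # Score per task makes small and large subcategories count equally.
    return sum(score / n for score, n in subcategories) * avg_tasks
```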
Reproducibility. SV-COMP results must be reproducible, and consequently,
all major components are maintained in public version-control repositories. The
overview of the components is provided in Fig. 3, and the details are given
in Table 3. We refer to the SV-COMP 2016 report [14] for a description of all
components of the SV-COMP organization. There are competition artifacts at
Zenodo (see Table 4) to guarantee their long-term availability and immutability.
Competition Workflow. The workflow of the competition is described in the re-
port for Test-Comp 2021 [19] (SV-COMP and Test-Comp use a similar workflow).
Table 3: Publicly available components for reproducing SV-COMP 2022
Component Fig. 3 Repository Version
Verification Tasks (a) gitlab.com/sosy-lab/benchmarking/sv-benchmarks svcomp22
Benchmark Definitions (b) gitlab.com/sosy-lab/sv-comp/bench-defs svcomp22
Tool-Info Modules (c) github.com/sosy-lab/benchexec 3.10
Verifier Archives (d) gitlab.com/sosy-lab/sv-comp/archives-2022 svcomp22
Benchmarking (e) github.com/sosy-lab/benchexec 3.10
Witness Format (f) github.com/sosy-lab/sv-witnesses svcomp22
Table 4: Artifacts published for SV-COMP 2022
Content DOI Reference
Verification Tasks 10.5281/zenodo.5831003 [22]
Competition Results 10.5281/zenodo.5831008 [21]
Verifiers and Validators 10.5281/zenodo.5959149 [24]
Verification Witnesses 10.5281/zenodo.5838498 [23]
BenchExec 10.5281/zenodo.5720267 [106]
3 Reproducing a Verification Run and Trouble-Shooting Guide
In the following, we explain a few steps that are useful for reproducing individual
results and for trouble shooting. This section is written from the perspective of a participant.
Step 1: Make Verifier Archive Available. The first action item for a partici-
pant is to submit a merge request to the repository that contains all the verifier
archives (see list of merge requests at GitLab). Typical problems include:
– The fork is not public. This means that the continuous integration (CI)
pipeline results are not visible and the merge request cannot be merged.
– The shared runners are not enabled. This means that the CI pipeline cannot
run and no results will be available.
– The verifier does not provide a version string (which should not include the
verifier name itself). This means that it is not possible to later determine
which version of the verifier was used for the experiments. Therefore, version
strings are mandatory and are checked by the CI.
The interface between the execution (with BenchExec) and the verification
tool can be checked using the procedure described in the BenchExec
documentation.1
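For orientation, a tool-info module is a small Python class; the sketch below follows the BaseTool2 interface of BenchExec 3.10 as described in the linked documentation, for a hypothetical verifier binary myverifier (the tool name and output patterns are placeholders, not a real participant).

```python
# Sketch of a tool-info module for a hypothetical verifier "myverifier";
# it follows the BaseTool2 interface of BenchExec 3.10.
import benchexec.result as result
import benchexec.tools.template


class Tool(benchexec.tools.template.BaseTool2):
    def executable(self, tool_locator):
        return tool_locator.find_executable("myverifier")

    def name(self):
        return "MyVerifier"

    def version(self, executable):
        # Mandatory for SV-COMP; must not contain the verifier name itself.
        return self._version_from_tool(executable)

    def cmdline(self, executable, options, task, rlimits):
        return [executable] + options + [task.single_input_file]

    def determine_result(self, run):
        # Map the (hypothetical) tool output to BenchExec result constants.
        for line in run.output:
            if "VERIFICATION TRUE" in line:
                return result.RESULT_TRUE_PROP
            if "VERIFICATION FALSE" in line:
                return result.RESULT_FALSE_PROP
        return result.RESULT_UNKNOWN
```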
Step 2: Ensure That Verifier Works on Competition Machines. Once the
CI checks passed and the archive is merged into the official competition repository,
the verifier can be executed on the competition machines on a few verification tasks.
1 https://github.com/sosy-lab/benchexec/blob/3.10/doc/tool-integration.md
The competition uses the infrastructure VerifierCloud, and remote
execution in this compute cloud is possible using CoVeriTeam [29]. CoVeriTeam
is a tool for constructing cooperative verification tools from existing components
and has supported the competition since SV-COMP 2021. Among
its many capabilities, it enables remote execution of verification runs directly
on the competition machines, which was found to be a valuable service for
trouble shooting. A description and example invocation for each participating
verifier is available in the CoVeriTeam documentation (see file doc/competition-
help.md in the CoVeriTeam repository). Competition participants are asked
to execute their tool locally using CoVeriTeam and then remotely on the
competition machines. Typical problems include:
– Verifiers sometimes have insufficient log output, such that it is not possible
to observe what the verifier was executing. The first step towards trouble
shooting is always to ensure some minimal log output.
– The verifier assumes software that is not installed yet. Each verifier states its
dependencies in its documentation. For example, the verifier Cbmc specifies
under required-ubuntu-packages that it relies on the Ubuntu packages gcc
and libc6-dev-i386 in file benchmark-defs/category-structure.yml in
the repository with the benchmark definitions. This is easy to fix by adding
the dependency in the definition file and getting it installed.
– The verifier makes assumptions about the hardware of the machine, e.g.,
selecting a specific processing unit. This can be investigated by running the
verifier in the Docker container and remotely on the competition machines.
For the above-mentioned purpose, the competition offers a Docker image
that can be used to try out if all required dependencies are available.2
The competition also provides a list of installed packages, which is important
for ensuring reproducibility.
Step 3: Check Prerun Results. So far, we considered executing individual
verification runs in the Docker container or remotely on the competition
machines. As a service to the participating teams, the competition offers training
runs and provides the results to the teams. Typical checks that teams
perform on the prerun results include:
– Inspect the verification results (solution to the verification task, like True,
False, Unknown, etc.) and log files.
– Inspect the validation results (was the verification result confirmed by a
validator) and the produced verification witnesses.
– Inspect the result of the witness linter. All witnesses should be syntactically
correct according to the witness specification.
– In case the verification result does not match the expected result, investigate
the verifier and the verification task; in case of problems with the verification
task, report to the jury by creating a merge request with a fix or an issue for
discussion in the SV-Benchmarks repository.
2 https://gitlab.com/sosy-lab/benchmarking/competition-scripts/-/tree/svcomp22
Table 5: Competition candidates with tool references and representing jury members;
new for first-time participants, (hors concours) for hors-concours participation
Participant Ref. Jury member Affiliation
2ls [36,81] Viktor Malík BUT, Brno, Czechia
AProVE [65,100] Jera Hensel RWTH Aachen, Germany
Brick [37] Lei Bu Nanjing U., China
Cbmc [75] Michael Tautschnig Queen Mary U. of London, UK
Coastal [102] (hors concours)
CVT-AlgoSel new [29,30] (hors concours)
CVT-ParPort new [29,30] (hors concours)
CPA-BAM-BnB [3,104] (hors concours)
CPA-BAM-SMG new Anton Vasilyev ISP RAS, Russia
CPAchecker [31,49] Thomas Bunk LMU Munich, Germany
CPALockator [4,5] (hors concours)
Crux new [52,96] Ryan Scott Galois, USA
CSeq [47,71] Emerson Sales Gran Sasso Science Institute, Italy
Dartagnan [58,88] Hernán Ponce de León U. Bundeswehr Munich, Germany
Deagle new [62] Fei He Tsinghua U., China
Divine [8,76] (hors concours)
Ebf new Fatimah Aljaafari U. of Manchester, UK
Esbmc-incr [43,46] (hors concours)
Esbmc-kind [56,57] Rafael Menezes U. of Manchester, UK
Frama-C-SV [34,48] Martin Spiessl LMU Munich, Germany
Gazer-Theta [1,60] (hors concours)
GDart new [84] Falk Howar TU Dortmund, Germany
Goblint [95,103] Simmo Saan U. of Tartu, Estonia
Graves-CPA new [79] Will Leeson U. of Virginia, USA
Infer new [38,73] (hors concours)
Java-Ranger [98,99] Soha Hussein U. of Minnesota, USA
JayHorn [72,97] Ali Shamakhi Tehran Inst. Adv. Studies, Iran
Jbmc [44,45] Peter Schrammel U. of Sussex / Diffblue, UK
JDart [80,83] Falk Howar TU Dortmund, Germany
Korn [55] Gidon Ernst LMU Munich, Germany
Lart new [77,78] Henrich Lauko Masaryk U., Brno, Czechia
Lazy-CSeq [69,70] (hors concours)
Locksmith new [90] Vesal Vojdani U. of Tartu, Estonia
PeSCo [93,94] Cedric Richter U. of Oldenburg, Germany
Pinaka [41] (hors concours)
PredatorHP [66,87] (hors concours)
Sesl new Xie Li Academy of Sciences, China
Smack [61,92] (hors concours)
Spf [85,91] (hors concours)
Symbiotic [39,40] Marek Chalupa Masaryk U., Brno, Czechia
Theta new [101,109] Vince Molnár BME Budapest, Hungary
UAutomizer [63,64] Matthias Heizmann U. of Freiburg, Germany
UGemCutter new [74] Dominik Klumpp U. of Freiburg, Germany
UKojak [54,86] Frank Schüssele U. of Freiburg, Germany
UTaipan [51,59] Daniel Dietsch U. of Freiburg, Germany
VeriAbs [2,50] Priyanka Darke Tata Consultancy Services, India
VeriFuzz [42,82] Raveendra Kumar M. Tata Consultancy Services, India
Table 6: Algorithms and techniques that the participating verification systems used;
new for first-time participants, (hors concours) for hors-concours participation
Verifier
CEGAR
Predicate Abstraction
Symbolic Execution
Bounded Model Checking
k-Induction
Property-Directed Reach.
Explicit-Value Analysis
Numeric. Interval Analysis
Shape Analysis
Separation Logic
Bit-Precise Analysis
ARG-Based Analysis
Lazy Abstraction
Interpolation
Automata-Based Analysis
Concurrency Support
Ranking Functions
Evolutionary Algorithms
Algorithm Selection
Portfolio
2ls 3 3 3 3 3 3
AProVE 3 3 3 3 3 3
Brick 3 3 3 3 3
Cbmc 3 3 3
Coastal3
CVT-AlgoSel new3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
CVT-ParPortnew 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
CPA-BAM-BnB3 3 3 3 3 3 3
CPA-BAM-SMG new
CPALockator3 3 3 3 3 3 3 3
CPAchecker 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
Crux new 3
CSeq 3 3 3
Dartagnan 3 3 3
Deagle new
Divine3 3 3 3 3 3
Ebf new 3
Esbmc-incr3 3 3 3
Esbmc-kind 3 3 3 3 3
Frama-C-SV 3
Gazer-Theta3 3 3 3 3 3 3 3 3
GDart new 3 3 3
Goblint 3 3
Graves-CPA new 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
Infer new3 3 3 3
Java-Ranger 3 3
JayHorn 3 3 3 3 3 3
Jbmc 3 3 3
JDart 3 3 3
Korn 3 3 3 3
Lart new 3 3 3 3
Lazy-CSeq3 3 3
(continues on next page)
Verifier
CEGAR
Predicate Abstraction
Symbolic Execution
Bounded Model Checking
k-Induction
Property-Directed Reach.
Explicit-Value Analysis
Numeric. Interval Analysis
Shape Analysis
Separation Logic
Bit-Precise Analysis
ARG-Based Analysis
Lazy Abstraction
Interpolation
Automata-Based Analysis
Concurrency Support
Ranking Functions
Evolutionary Algorithms
Algorithm Selection
Portfolio
Locksmith new 3
PeSCo 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
Pinaka3 3 3
PredatorHP3
Sesl new 3 3
Smack3 3 3 3
Spf3 3 3
Symbiotic 3 3 3 3 3 3 3
Theta new 3 3 3 3 3 3 3 3 3
UAutomizer 3 3 3 3 3 3 3 3 3 3
UGemCutter new 3 3 3 3 3 3 3 3 3
UKojak 3 3 3 3 3
UTaipan 3 3 3 3 3 3 3 3 3 3 3
VeriAbs 3 3 3 3 3 3 3 3
VeriFuzz 3 3 3
4 Participating Verifiers
The participating verification systems are listed in Table 5. The table contains
the verifier name (with hyperlink), references to papers that describe the systems,
the representing jury member, and the affiliation. The listing is also available on
the competition web site at https://sv-comp.sosy-lab.org/2022/systems.php. Table 6
lists the algorithms and techniques that are used by the verification tools, and
Table 7 gives an overview of commonly used solver libraries and frameworks.
Hors-Concours Participation. There are verification tools that participated
in the comparative evaluation but did not participate in the rankings. We call
this kind of participation hors concours, as these participants cannot participate
in rankings and cannot “win” the competition. Those are either passive or active
participants. Passive participation means that the tools are taken from previous
years of the competition, in order to show progress and compare new tools against
them (Coastal, CPA-BAM-BnB, CPALockator, Divine, Esbmc-incr,
Gazer-Theta, Lazy-CSeq, Pinaka, PredatorHP, Smack, Spf). Active
participation means that there are teams actively developing the tools, but there
are reasons why those tools should not occur in the rankings. For example, a
tool might use other tools that participate in the competition on their own,
and comparing such a tool in the ranking could be considered unfair
(CVT-AlgoSel, CVT-ParPort). Also, a tool might produce uncertain results
and the team was not sure if the full potential of the tool can be shown in the
SV-COMP experiments (Infer). Those participations are marked as ‘hors
concours’ in Table 5 and the other tables.
Table 7: Solver libraries and frameworks that are used as components in the
participating verification systems (component is mentioned if used more than three
times; new for first-time participants, (hors concours) for hors-concours participation)
Verifier
CPAchecker
CProver
Esbmc
Jpf
Ultimate
JavaSMT
MathSAT
Cvc4
SMTinterpol
z3
MiniSAT
2ls 3 3
AProVE 3 3
Brick 3 3
Cbmc 3 3
Coastal3
CVT-AlgoSel new3 3 3 3 3 3 3
CVT-ParPortnew 3 3 3 3 3 3 3
CPA-BAM-BnB3 3 3
CPA-BAM-SMG new 3 3 3
CPALockator3 3 3
CPAchecker 3 3 3
Crux new 3
CSeq 3 3
Dartagnan 3
Deagle new
Divine
Ebf new 3 3
Esbmc-incr3 3
Esbmc-kind 3 3
Frama-C-SV
Gazer-Theta
GDart new 3
Goblint
Graves-CPA new 3 3 3
Infer new
Java-Ranger 3
JayHorn
Jbmc 3 3
JDart 3
Korn
Lart new 3
Lazy-CSeq3 3
Locksmith new
PeSCo 3 3 3
(continues on next page)
Verifier
CPAchecker
CProver
Esbmc
Jpf
Ultimate
JavaSMT
MathSAT
Cvc4
SMTinterpol
z3
MiniSAT
Pinaka
PredatorHP
Sesl new
Smack
Spf3
Symbiotic 3
Theta new
UAutomizer 3 3 3 3 3
UGemCutter new 3 3 3 3 3
UKojak 3 3 3
UTaipan 3 3 3 3 3
VeriAbs 3 3 3 3
VeriFuzz 3
5 Results and Discussion
The results of the competition represent the state of the art of what can
be achieved with fully automatic software-verification tools on the given
benchmark set. We report the effectiveness (number of verification tasks that can
be solved and correctness of the results, as accumulated in the score) and
the efficiency (resource consumption in terms of CPU time and CPU energy).
The results are presented in the same way as in previous years, such that the
improvements compared to last year are easy to identify, except that, due to the
number of tools, we have to split the table and put the hors-concours verifiers
into a second results table. The results presented in this report were inspected
and approved by the participating teams.
Computing Resources. The resource limits were the same as in the previous
competitions [14]: Each verification run was limited to 8 processing units (cores),
15 GB of memory, and 15 min of CPU time. Witness validation was limited
to 2 processing units, 7 GB of memory, and 1.5 min of CPU time for violation
witnesses and 15 min of CPU time for correctness witnesses.
Table 8: Quantitative overview over all regular results; empty cells are used for opt-outs,
new for first-time participants
Columns: Verifier, then ReachSafety (8631 points, 5400 tasks), MemSafety (5003
points, 3321 tasks), ConcurrencySafety (1160 points, 763 tasks), NoOverflows (685
points, 454 tasks), Termination (4144 points, 2293 tasks), SoftwareSystems (5898
points, 3417 tasks), FalsificationOverall (5718 points, 13355 tasks), Overall (25209
points, 15648 tasks), JavaOverall (828 points, 586 tasks)
2ls 3585 810 0 428 2178 83 1462 7366
AProVE 2305
Brick
Cbmc 3808 -262 460 284 1800 -198 2024 6733
CPA-BAM-SMG new 3101 776
CPAchecker 5572 3057 498 531 1270 809 3835 11904
Crux new 1408 290
CSeq 655
Dartagnan 481
Deagle new 757
Ebf new 496
Esbmc-kind 4959 2162 -74 318 1389 633 1841 7727
Frama-C-SV 213
Goblint 858 106 159 340 1951
Graves-CPA new 4520 802 2400 9218
Korn
Lart new 3034 573
Locksmith new
PeSCo 5080 -273 3683 10515
Sesl new 345
Symbiotic 4571 4051 105 370 2030 2704 3274 12249
Theta new 1132 -14
UAutomizer 3969 2350 493 506 2552 712 3089 11802
UGemCutter new 612
UKojak 2058 1573 0 445 0 382 2056 5078
UTaipan 3634 2336 535 501 0 486 3049 8666
VeriAbs 6923
VeriFuzz 1518 -777 -32 136 -129 0 817
GDart new 640
Java-Ranger 670
JayHorn 376
Jbmc 700
JDart 714
Table 9: Quantitative overview over all hors-concours results; empty cells represent
opt-outs, new for first-time participants; all entries are hors-concours participations
Columns: Verifier, then ReachSafety (8631 points, 5400 tasks), MemSafety (5003
points, 3321 tasks), ConcurrencySafety (1160 points, 763 tasks), NoOverflows (685
points, 454 tasks), Termination (4144 points, 2293 tasks), SoftwareSystems (5898
points, 3417 tasks), FalsificationOverall (5718 points, 13355 tasks), Overall (25209
points, 15648 tasks), JavaOverall (828 points, 586 tasks)
CVT-AlgoSel new 5438 314
CVT-ParPort new 5904 3700 -551 553 2351 1282 1087 10704
CPA-BAM-BnB 504
CPALockator -1154
Divine 110 99 -136 0 0 112 -1253 -248
Esbmc-incr -74
Gazer-Theta
Infer new -50415 -5890 -5982 -29566
Lazy-CSeq 571
Pinaka 3710 -200 1259
PredatorHP 2205
Smack 1181
Coastal -2541
Spf 430
The machines for running the experiments are part of a compute cluster that consists of
167 machines; each verification run was executed on an otherwise completely
unloaded, dedicated machine, in order to achieve precise measurements. Each
machine had one Intel Xeon E3-1230 v5 CPU, with 8 processing units each,
a frequency of 3.4 GHz, 33 GB of RAM, and a GNU/Linux operating system
(x86_64-linux, Ubuntu 20.04 with Linux kernel 5.4). We used BenchExec [32]
to measure and control computing resources (CPU time, memory, CPU energy)
and VerifierCloud to distribute, install, run, and clean-up verification runs,
and to collect the results. The values for time and energy are accumulated
over all cores of the CPU. To measure the CPU energy, we used CPU Energy
Meter [35] (integrated in BenchExec [32]).
One complete verification execution of the competition consisted of 309 081
verification runs (each verifier on each verification task of the selected categories
according to the opt-outs), consuming 937 days of CPU time and 249 kWh
of CPU energy (without validation). Witness-based result validation required
1.43 million validation runs (each validator on each verification task for categories
with witness validation, and for each verifier), consuming 708 days of CPU time.
Each tool was executed several times, in order to make sure no installation issues
occur during the execution. Including preruns, the infrastructure managed a
total of 2.85 million verification runs consuming 19 years of CPU time, and
16.3 million validation runs consuming 11 years of CPU time.
Table 10: Overview of the top-three verifiers for each category; new for first-time
participants, measurements for CPU time and energy rounded to two significant digits
(‘–’ indicates a missing energy value due to a configuration bug)
Rank Verifier Score CPU Time (h) CPU Energy (kWh) Solved Tasks Unconf. Tasks False Alarms Wrong Proofs
ReachSafety
1 VeriAbs 6923 170 1.8 4117 359
2 CPAchecker 5572 130 1.5 3245 228 4
3 PeSCo 5080 63 0.57 3033 314 7
MemSafety
1 Symbiotic 4051 2.6 0.034 2167 1097
2 CPA-BAM-SMG new 3101 7.3 0.064 2975 17
3 CPAchecker 3057 7.8 0.069 3119 0
ConcurrencySafety
1 Deagle new 757 0.50 0.0059 517 42
2 CSeq 655 5.1 0.059 454 50
3 UGemCutter new 612 4.9 – 445 21
NoOverflows
1 CPAchecker 531 1.2 0.012 366 3
2 UAutomizer 506 2.0 0.019 356 2
3 UTaipan 501 2.2 0.023 355 1
Termination
1 UAutomizer 2552 13 0.12 1581 8
2 AProVE 2305 38 0.43 1114 37
3 2ls 2178 2.9 0.025 1163 203
SoftwareSystems
1 Symbiotic 2704 1.2 0.016 1188 73
2 CPAchecker 809 52 0.60 1660 169 1
3 Graves-CPA new 802 19 0.17 1582 95 2 3
FalsificationOverall
1 CPAchecker 3835 81 0.90 3626 95 5
2 PeSCo 3683 45 0.41 3552 110 9
3 Symbiotic 3274 14 0.18 2295 1191 3
Overall
1 Symbiotic 12249 34 0.44 7430 1529 3
2 CPAchecker 11904 210 2.3 9773 408 14
3 UAutomizer 11802 170 1.7 7948 311 2 2
JavaOverall
1 JDart 714 1.2 0.012 522 0
2 Jbmc 700 0.42 0.0039 506 0
3 Java-Ranger 670 4.4 0.052 466 0
[Figure: quantile functions of the 13 verifiers in category C-Overall (2LS, CBMC,
CVT-ParPort, CPAchecker, DIVINE, ESBMC-kind, Goblint, Graves-CPA, PeSCo,
Symbiotic, UAutomizer, UKojak, UTaipan); x-axis: cumulative score, y-axis: min.
time in s.]
Fig. 4: Quantile functions for category C-Overall. Each quantile function illustrates
the quantile (x-coordinate) of the scores obtained by correct verification runs
below a certain run time (y-coordinate). More details were given previously [11].
A logarithmic scale is used for the time range from 1 s to 1000 s, and a linear
scale is used for the time range between 0 s and 1 s.
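To sketch how such a plot can be derived from raw results (our own illustration, not the competition's plotting code): sort the correctly solved tasks by CPU time and accumulate their scores, starting from the offset contributed by wrong results, which is why some graphs begin at a negative cumulative score.

```python
# Sketch: data points of a score-based quantile function, computed from
# a list of (points, cpu_time_s) pairs for one verifier.
def quantile_points(results):
    offset = sum(p for p, _ in results if p < 0)  # penalty from wrong results
    correct = sorted((t, p) for p, t in results if p > 0)  # sort by CPU time
    points, x = [], offset
    for cpu_time_s, p in correct:
        x += p  # cumulative score of all correct runs at most this slow
        points.append((x, cpu_time_s))
    return points
```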
Quantitative Results. Tables 8 and 9 present the quantitative overview of
all tools and all categories. Due to the large number of tools, we need to split
the presentation into two tables, one for the verifiers that participate in the
rankings (Table 8), and one for the hors-concours verifiers (Table 9). The head
row mentions the category, the maximal score for the category, and the number
of verification tasks. The tools are listed in alphabetical order; every table
row lists the scores of one verifier. We indicate the top three candidates by
formatting their scores in bold face and in larger font size. An empty table cell
means that the verifier opted-out from the respective main category (perhaps
participating in subcategories only, restricting the evaluation to a specific topic).
More information (including interactive tables, quantile plots for every category,
and also the raw data in XML format) is available on the competition web site
(https://sv-comp.sosy-lab.org/2022/results) and in the results artifact (see Table 4).
Table 10 reports the top three verifiers for each category. The run time (column
‘CPU Time’) and energy (column ‘CPU Energy’) refer to successfully solved
verification tasks (column ‘Solved Tasks’). We also report the number of tasks for
which no witness validator was able to confirm the result (column ‘Unconf. Tasks’).
The columns ‘False Alarms’ and ‘Wrong Proofs’ report the number of verification
tasks for which the verifier reported wrong results, i.e., reporting a counterexample
when the property holds (incorrect False) and claiming that the program fulfills
the property although it actually contains a bug (incorrect True), respectively.
Table 11: Results of verifiers in demonstration category NoDataRace
Verifier Score Correct true Correct false Incorrect true Incorrect false
CSeq 39 37 61 0 6
Dartagnan 299 47 23 13 0
Goblint 124 62 0 0 0
Locksmith new 34 17 0 0 0
UAutomizer 120 49 54 1 0
UGemCutter new 151 57 69 1 0
UKojak 0 0 0 0 0
UTaipan 139 56 59 1 0
Score-Based Quantile Functions for Quality Assessment. We use score-based
quantile functions [11,32] because these visualizations make it easier
to understand the results of the comparative evaluation. The results archive
(see Table 4) and the web site (https://sv-comp.sosy-lab.org/2022/results) include
such a plot for each (sub-)category. As an example, we show the plot for category
C-Overall (all verification tasks) in Fig. 4. A total of 13 verifiers participated in
category C-Overall, for which the quantile plot shows the overall performance over
all categories (scores for meta categories are normalized [11]). A more detailed
discussion of score-based quantile plots, including examples of what insights one
can obtain from the plots, is provided in previous competition reports [11,14].
The winner of the competition, Symbiotic, not only achieves the best
cumulative score (the graph for Symbiotic has the longest width from x = 0 to its right
end), but is also extremely efficient (the area below the graph is very small). Verifiers
whose graphs start with a negative cumulative score produced wrong results.
Several verifiers whose graphs start with a minimal CPU time larger than 3 s
are based on Java, and the time is consumed by starting the JVM.
Demo Category NoDataRace. SV-COMP 2022 had a new category on
data-race detection, and we report the results in Table 11. The benchmark
set contained a total of 162 verification tasks. The category was defined as
a demonstration category because it was not clear how many verifiers would
participate. Eight verifiers specified the execution for this sub-category in their
benchmark definition (see footnote 3) and participated in this demonstration. A
detailed table was generated by BenchExec’s table-generator together with all
other results and is available on the competition web site and in the artifact
(see Table 4). The results are presented as a means to show that such a category
is useful; the results do not represent the full potential of the verifiers, as they
were not fully tuned by their developers but were handed in for demonstrating
abilities only.
Alternative Rankings. The community suggested reporting a couple of alternative
rankings that honor different aspects of the verification process as a complement
to the official SV-COMP ranking.
3 https://gitlab.com/sosy-lab/sv-comp/bench-defs/-/tree/svcomp22/benchmark-defs
Table 12: Alternative rankings for category Overall; quality is given in score
points (sp), CPU time in hours (h), CPU energy in kilowatt-hours (kWh), wrong
results in errors (E), rank measures in errors per score point (E/sp), joule per score
point (J/sp), and score points (sp)
Rank Verifier Quality (sp) CPU Time (h) CPU Energy (kWh) Solved Tasks Wrong Results (E) Rank Measure
Correct Verifiers (E/sp)
1 Goblint 1951 4.9 0.070 1574 0 0
2 UKojak 5078 66 0.71 3988 1 0.00020
3 Symbiotic 12249 34 0.44 7430 3 0.00024
worst (with pos. score) 282 0.042
Green Verifiers (J/sp)
1 Goblint 1951 4.9 0.070 1574 0 120
2 Symbiotic 12249 34 0.44 7430 3 130
3 Cbmc 6733 25 0.27 6479 282 140
worst (with pos. score) 690
Table 12 is similar to Table 10, but contains the alternative ranking categories
Correct and Green Verifiers. Column ‘Quality’
gives the score in score points, column ‘CPU Time’ the CPU usage of successful
runs in hours, column ‘CPU Energy’ the CPU usage of successful runs in kWh,
column ‘Solved Tasks’ the number of correct results, column ‘Wrong Results’
the sum of false alarms and wrong proofs in number of errors, and column
‘Rank Measure’ gives the measure to determine the alternative rank.
Correct Verifiers: Low Failure Rate. The right-most columns of Table 10
report that the verifiers achieve a high degree of correctness (all top three
verifiers in C-Overall have fewer than 2 wrong results). The winners of
category Java-Overall produced not a single wrong answer. The first category in
Table 12 uses the failure rate as rank measure: number of incorrect results /
max(total score, 1), i.e., the number of errors per score point (E/sp). We use E
as the unit for the number of incorrect results and sp as the unit for the total
score. The worst result was 0.023 E/sp in SV-COMP 2021 and is now at
0.042 E/sp. Goblint is the best verifier regarding this measure.
Green Verifiers: Low Energy Consumption. Since a large part of the cost of
verification is given by the energy consumption, it might be important to also
consider the energy efficiency. The second category in Table 12 uses the energy
consumption per score point as rank measure: total CPU energy /
max(total score, 1), with the unit J/sp. The worst result from SV-COMP 2021 was
630 J/sp and is now at 690 J/sp. Also here, Goblint is the best verifier regarding
this measure.
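Both rank measures reduce to one-line computations; the following sketch (ours) makes the units explicit.

```python
# Sketch of the two alternative rank measures of Table 12.
def failure_rate(wrong_results: int, total_score: float) -> float:
    """Errors per score point (E/sp); lower is better."""
    return wrong_results / max(total_score, 1)

def energy_per_score(cpu_energy_kwh: float, total_score: float) -> float:
    """Joule per score point (J/sp); lower is better."""
    return cpu_energy_kwh * 3.6e6 / max(total_score, 1)  # 1 kWh = 3.6 MJ
```

Recomputing Goblint's Overall row from the rounded inputs in Table 12 gives failure_rate(0, 1951) = 0 and energy_per_score(0.070, 1951) of roughly 129 J/sp; the table reports 120 J/sp, the difference being due to rounding of the displayed inputs.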
New Verifiers. To acknowledge the verification systems that participate for
the first or second time in SV-COMP, Table 13 lists the new verifiers (in
SV-COMP 2021 or SV-COMP 2022).
Table 13: New verifiers in SV-COMP 2021 and SV-COMP 2022; column ‘Sub-categories’
gives the number of executed categories (including demo category NoDataRace);
new for first-time participants, (hors concours) for hors-concours participation
Verifier Language First Year Sub-categories
CVT-AlgoSel new C 2022 18
CVT-ParPort new C 2022 35
CPA-BAM-SMG new C 2022 16
Crux new C 2022 20
Deagle new C 2022 1
Ebf new C 2022 1
Graves-CPA new C 2022 35
Infer new C 2022 25
Lart new C 2022 22
Locksmith new C 2022 1
Sesl new C 2022 6
Theta new C 2022 13
UGemCutter new C 2022 2
Frama-C-SV C 2021 4
Gazer-Theta C 2021 9
Goblint C 2021 25
Korn C 2021 4
GDart new Java 2022 1
Table 14: Confirmation rate of verification witnesses during the evaluation in
SV-COMP 2022; new for first-time participants, (hors concours) for hors-concours
participation
Verifier | True: Total / Confirmed / Unconf. | False: Total / Confirmed / Unconf.
2ls 2394 / 2388 (99.7 %) / 6 | 1648 / 1363 (82.7 %) / 285
Cbmc 3837 / 3493 (91.0 %) / 344 | 3536 / 2986 (84.4 %) / 550
CVT-ParPort new 7440 / 7083 (95.2 %) / 357 | 4754 / 4332 (91.1 %) / 422
CPAchecker 6006 / 5701 (94.9 %) / 305 | 4175 / 4072 (97.5 %) / 103
Divine 1692 / 1672 (98.8 %) / 20 | 1040 / 870 (83.7 %) / 170
Esbmc-kind 5542 / 5483 (98.9 %) / 59 | 3034 / 2556 (84.2 %) / 478
Goblint 1657 / 1574 (95.0 %) / 83 | 0 / 0 / 0
Graves-CPA new 5651 / 5458 (96.6 %) / 193 | 3723 / 3576 (96.1 %) / 147
PeSCo 6155 / 5734 (93.2 %) / 421 | 4116 / 3934 (95.6 %) / 182
Symbiotic 4878 / 4798 (98.4 %) / 80 | 4081 / 2632 (64.5 %) / 1449
UAutomizer 5751 / 5591 (97.2 %) / 160 | 2508 / 2357 (94.0 %) / 151
UKojak 2875 / 2863 (99.6 %) / 12 | 1144 / 1125 (98.3 %) / 19
UTaipan 4567 / 4513 (98.8 %) / 54 | 1719 / 1576 (91.7 %) / 143
Verifiable Witnesses. Results validation is of primary importance in the
competition. All SV-COMP verifiers are required to justify the result (True
or False) by producing a verification witness (except for those categories for
which no result validator is available). We used ten independently developed
witness-based result validators and one witness linter (see Table 1).
[Figure: bar chart over the years 2012–2022.]
Fig. 5: Number of evaluated verifiers for each year (first-time participants on top)
Table 14 shows the confirmed versus unconfirmed results: the first column
lists the verifiers of category C-Overall; the three columns for result True
report the total, confirmed, and unconfirmed number of verification tasks for
which the verifier answered True, respectively; and the three columns for result
False report the same for answer False. More information (for all verifiers) is
given in the detailed tables on the competition web site and in the results
artifact; all verification witnesses are also contained in the witnesses
artifact (see Table 4). The verifiers 2ls and UKojak are the winners in terms of
confirmed results for expected results True and False, respectively. The overall
interpretation is similar to SV-COMP 2020 and 2021 [17, 18].
6 Conclusion
The 11th edition of the Competition on Software Verification (SV-COMP 2022)
was the largest ever, with 47 participating verification systems (incl. 14
hors-concours and 14 new verifiers); see Fig. 5 for the participation numbers and
Table 5 for the details. The number of result validators was increased from 6 in
2021 to 11 in 2022, to validate the results (Table 1). The number of verification
tasks was increased to 15 648 in the C category and to 586 in the Java category,
and a new category on data-race detection was demonstrated. A new section
in this report (Sect. 3) explains steps to reproduce verification results and to
investigate problems during execution, and a new table attempts to give an overview
of the usage of common solver libraries and frameworks. The high quality standards
of the TACAS conference, in particular with respect to the important principles
of fairness, community support, and transparency, are ensured by a competition
jury in which each participating team had a member. We hope that this broad
overview of verification tools stimulates their further application by an
ever-growing user community of formal methods.
Data-Availability Statement. The verification tasks and results of the competition
are published at Zenodo, as described in Table 4. All components and data
that are necessary for reproducing the competition are available in public version
repositories, as specified in Table 3. For easy access, the results are also
presented online on the competition web site: https://sv-comp.sosy-lab.org/2022/results.
Funding Statement. This project was funded in part by the Deutsche
Forschungsgemeinschaft (DFG) 418257054 (Coop).
References
1. Ádám, Zs., Sallai, Gy., Hajdu, Á.: Gazer-Theta: LLVM-based verifier portfolio with BMC/CEGAR (competition contribution). In: Proc. TACAS (2). pp. 433–437. LNCS 12652, Springer (2021). https://doi.org/10.1007/978-3-030-72013-1_27
2. Afzal, M., Asia, A., Chauhan, A., Chimdyalwar, B., Darke, P., Datar, A., Kumar, S., Venkatesh, R.: VeriAbs: Verification by abstraction and test generation. In: Proc. ASE. pp. 1138–1141 (2019). https://doi.org/10.1109/ASE.2019.00121
3. Andrianov, P., Friedberger, K., Mandrykin, M.U., Mutilin, V.S., Volkov, A.: CPA-BAM-BnB: Block-abstraction memoization and region-based memory models for predicate abstractions (competition contribution). In: Proc. TACAS. pp. 355–359. LNCS 10206, Springer (2017). https://doi.org/10.1007/978-3-662-54580-5_22
4. Andrianov, P., Mutilin, V., Khoroshilov, A.: CPALockator: Thread-modular approach with projections (competition contribution). In: Proc. TACAS (2). pp. 423–427. LNCS 12652, Springer (2021). https://doi.org/10.1007/978-3-030-72013-1_25
5. Andrianov, P.S.: Analysis of correct synchronization of operating system components. Program. Comput. Softw. 46, 712–730 (2020). https://doi.org/10.1134/S0361768820080022
6. Ayaziová, P., Chalupa, M., Strejček, J.: Symbiotic-Witch: A Klee-based violation witness checker (competition contribution). In: Proc. TACAS. LNCS 13244, Springer (2022)
7. Balyo, T., Heule, M.J.H., Järvisalo, M.: SAT Competition 2016: Recent developments. In: Proc. AAAI. pp. 5061–5063. AAAI Press (2017)
8. Baranová, Z., Barnat, J., Kejstová, K., Kučera, T., Lauko, H., Mrázek, J., Ročkai, P., Štill, V.: Model checking of C and C++ with Divine 4. In: Proc. ATVA. pp. 201–207. LNCS 10482, Springer (2017). https://doi.org/10.1007/978-3-319-68167-2_14
9. Bartocci, E., Beyer, D., Black, P.E., Fedyukovich, G., Garavel, H., Hartmanns, A., Huisman, M., Kordon, F., Nagele, J., Sighireanu, M., Steffen, B., Suda, M., Sutcliffe, G., Weber, T., Yamada, A.: TOOLympics 2019: An overview of competitions in formal methods. In: Proc. TACAS (3). pp. 3–24. LNCS 11429, Springer (2019). https://doi.org/10.1007/978-3-030-17502-3_1
10. Beyer, D.: Competition on software verification (SV-COMP). In: Proc. TACAS. pp. 504–524. LNCS 7214, Springer (2012). https://doi.org/10.1007/978-3-642-28756-5_38
11. Beyer, D.: Second competition on software verification (Summary of SV-COMP 2013). In: Proc. TACAS. pp. 594–609. LNCS 7795, Springer (2013). https://doi.org/10.1007/978-3-642-36742-7_43
12. Beyer, D.: Status report on software verification (Competition summary SV-COMP 2014). In: Proc. TACAS. pp. 373–388. LNCS 8413, Springer (2014). https://doi.org/10.1007/978-3-642-54862-8_25
13. Beyer, D.: Software verification and verifiable witnesses (Report on SV-COMP 2015). In: Proc. TACAS. pp. 401–416. LNCS 9035, Springer (2015). https://doi.org/10.1007/978-3-662-46681-0_31
14. Beyer, D.: Reliable and reproducible competition results with BenchExec and witnesses (Report on SV-COMP 2016). In: Proc. TACAS. pp. 887–904. LNCS 9636, Springer (2016). https://doi.org/10.1007/978-3-662-49674-9_55
15. Beyer, D.: Software verification with validation of results (Report on SV-COMP 2017). In: Proc. TACAS. pp. 331–349. LNCS 10206, Springer (2017). https://doi.org/10.1007/978-3-662-54580-5_20
16. Beyer, D.: Automatic verification of C and Java programs: SV-COMP 2019. In: Proc. TACAS (3). pp. 133–155. LNCS 11429, Springer (2019). https://doi.org/10.1007/978-3-030-17502-3_9
17. Beyer, D.: Advances in automatic software verification: SV-COMP 2020. In: Proc. TACAS (2). pp. 347–367. LNCS 12079, Springer (2020). https://doi.org/10.1007/978-3-030-45237-7_21
18. Beyer, D.: Software verification: 10th comparative evaluation (SV-COMP 2021). In: Proc. TACAS (2). pp. 401–422. LNCS 12652, Springer (2021). https://doi.org/10.1007/978-3-030-72013-1_24
19. Beyer, D.: Status report on software testing: Test-Comp 2021. In: Proc. FASE. pp. 341–357. LNCS 12649, Springer (2021). https://doi.org/10.1007/978-3-030-71500-7_17
20. Beyer, D.: Advances in automatic software testing: Test-Comp 2022. In: Proc. FASE. LNCS 13241, Springer (2022)
21. Beyer, D.: Results of the 11th Intl. Competition on Software Verification (SV-COMP 2022). Zenodo (2022). https://doi.org/10.5281/zenodo.5831008
22. Beyer, D.: SV-Benchmarks: Benchmark set for software verification and testing (SV-COMP 2022 and Test-Comp 2022). Zenodo (2022). https://doi.org/10.5281/zenodo.5831003
23. Beyer, D.: Verification witnesses from verification tools (SV-COMP 2022). Zenodo (2022). https://doi.org/10.5281/zenodo.5838498
24. Beyer, D.: Verifiers and validators of the 11th Intl. Competition on Software Verification (SV-COMP 2022). Zenodo (2022). https://doi.org/10.5281/zenodo.5959149
25. Beyer, D., Dangl, M., Dietsch, D., Heizmann, M.: Correctness witnesses: Exchanging verification results between verifiers. In: Proc. FSE. pp. 326–337. ACM (2016). https://doi.org/10.1145/2950290.2950351
26. Beyer, D., Dangl, M., Dietsch, D., Heizmann, M., Stahlbauer, A.: Witness validation and stepwise testification across software verifiers. In: Proc. FSE. pp. 721–733. ACM (2015). https://doi.org/10.1145/2786805.2786867
27. Beyer, D., Dangl, M., Lemberger, T., Tautschnig, M.: Tests from witnesses: Execution-based validation of verification results. In: Proc. TAP. pp. 3–23. LNCS 10889, Springer (2018). https://doi.org/10.1007/978-3-319-92994-1_1
28. Beyer, D., Friedberger, K.: Violation witnesses and result validation for multi-threaded programs. In: Proc. ISoLA (1). pp. 449–470. LNCS 12476, Springer (2020). https://doi.org/10.1007/978-3-030-61362-4_26
29. Beyer, D., Kanav, S.: CoVeriTeam: On-demand composition of cooperative verification systems. In: Proc. TACAS. Springer (2022)
30. Beyer, D., Kanav, S., Richter, C.: Construction of verifier combinations based on off-the-shelf verifiers. In: Proc. FASE. Springer (2022)
31. Beyer, D., Keremoglu, M.E.: CPAchecker: A tool for configurable software verification. In: Proc. CAV. pp. 184–190. LNCS 6806, Springer (2011). https://doi.org/10.1007/978-3-642-22110-1_16
32. Beyer, D., Löwe, S., Wendler, P.: Reliable benchmarking: Requirements and solutions. Int. J. Softw. Tools Technol. Transfer 21(1), 1–29 (2019). https://doi.org/10.1007/s10009-017-0469-y
33. Beyer, D., Spiessl, M.: MetaVal: Witness validation via verification. In: Proc. CAV. pp. 165–177. LNCS 12225, Springer (2020). https://doi.org/10.1007/978-3-030-53291-8_10
34. Beyer, D., Spiessl, M.: The static analyzer Frama-C in SV-COMP (competition contribution). In: Proc. TACAS. LNCS 13244, Springer (2022)
35. Beyer, D., Wendler, P.: CPU Energy Meter: A tool for energy-aware algorithms engineering. In: Proc. TACAS (2). pp. 126–133. LNCS 12079, Springer (2020). https://doi.org/10.1007/978-3-030-45237-7_8
36. Brain, M., Joshi, S., Kröning, D., Schrammel, P.: Safety verification and refutation by k-invariants and k-induction. In: Proc. SAS. pp. 145–161. LNCS 9291, Springer (2015). https://doi.org/10.1007/978-3-662-48288-9_9
37. Bu, L., Xie, Z., Lyu, L., Li, Y., Guo, X., Zhao, J., Li, X.: Brick: Path enumeration-based bounded reachability checking of C programs (competition contribution). In: Proc. TACAS. LNCS 13244, Springer (2022)
38. Calcagno, C., Distefano, D., O'Hearn, P.W., Yang, H.: Compositional shape analysis by means of bi-abduction. J. ACM 58(6), 26:1–26:66 (2011). https://doi.org/10.1145/2049697.2049700
39. Chalupa, M., Strejček, J., Vitovská, M.: Joint forces for memory safety checking. In: Proc. SPIN. pp. 115–132. Springer (2018). https://doi.org/10.1007/978-3-319-94111-0_7
40. Chalupa, M., Řechtáčková, A., Mihalkovič, V., Zaoral, L., Strejček, J.: Symbiotic 9: String analysis and backward symbolic execution with loop folding (competition contribution). In: Proc. TACAS. LNCS 13244, Springer (2022)
41. Chaudhary, E., Joshi, S.: Pinaka: Symbolic execution meets incremental solving (competition contribution). In: Proc. TACAS (3). pp. 234–238. LNCS 11429, Springer (2019). https://doi.org/10.1007/978-3-030-17502-3_20
42. Chowdhury, A.B., Medicherla, R.K., Venkatesh, R.: VeriFuzz: Program-aware fuzzing (competition contribution). In: Proc. TACAS (3). pp. 244–249. LNCS 11429, Springer (2019). https://doi.org/10.1007/978-3-030-17502-3_22
43. Cordeiro, L.C., Fischer, B.: Verifying multi-threaded software using SMT-based context-bounded model checking. In: Proc. ICSE. pp. 331–340. ACM (2011). https://doi.org/10.1145/1985793.1985839
44. Cordeiro, L.C., Kesseli, P., Kröning, D., Schrammel, P., Trtík, M.: JBmc: A bounded model checking tool for verifying Java bytecode. In: Proc. CAV. pp. 183–190. LNCS 10981, Springer (2018). https://doi.org/10.1007/978-3-319-96145-3_10
45. Cordeiro, L.C., Kröning, D., Schrammel, P.: Jbmc: Bounded model checking for Java bytecode (competition contribution). In: Proc. TACAS (3). pp. 219–223. LNCS 11429, Springer (2019). https://doi.org/10.1007/978-3-030-17502-3_17
46. Cordeiro, L.C., Morse, J., Nicole, D., Fischer, B.: Context-bounded model checking with Esbmc 1.17 (competition contribution). In: Proc. TACAS. pp. 534–537. LNCS 7214, Springer (2012). https://doi.org/10.1007/978-3-642-28756-5_42
47. Coto, A., Inverso, O., Sales, E., Tuosto, E.: A prototype for data race detection in CSeq 3 (competition contribution). In: Proc. TACAS. LNCS 13244, Springer (2022)
48. Cuoq, P., Kirchner, F., Kosmatov, N., Prevosto, V., Signoles, J., Yakobowski, B.: Frama-C. In: Proc. SEFM. pp. 233–247. Springer (2012). https://doi.org/10.1007/978-3-642-33826-7_16
49. Dangl, M., Löwe, S., Wendler, P.: CPAchecker with support for recursive programs and floating-point arithmetic (competition contribution). In: Proc. TACAS. pp. 423–425. LNCS 9035, Springer (2015). https://doi.org/10.1007/978-3-662-46681-0_34
50. Darke, P., Agrawal, S., Venkatesh, R.: VeriAbs: A tool for scalable verification by abstraction (competition contribution). In: Proc. TACAS (2). pp. 458–462. LNCS 12652, Springer (2021). https://doi.org/10.1007/978-3-030-72013-1_32
51. Dietsch, D., Heizmann, M., Nutz, A., Schätzle, C., Schüssele, F.: Ultimate Taipan with symbolic interpretation and fluid abstractions (competition contribution). In: Proc. TACAS (2). pp. 418–422. LNCS 12079, Springer (2020). https://doi.org/10.1007/978-3-030-45237-7_32
52. Dockins, R., Foltzer, A., Hendrix, J., Huffman, B., McNamee, D., Tomb, A.: Constructing semantic models of programs with the software analysis workbench. In: Proc. VSTTE. pp. 56–72. LNCS 9971, Springer (2016). https://doi.org/10.1007/978-3-319-48869-1_5
53. Dross, C., Furia, C.A., Huisman, M., Monahan, R., Müller, P.: VerifyThis 2019: A program-verification competition. Int. J. Softw. Tools Technol. Transf. 23(6), 883–893 (2021). https://doi.org/10.1007/s10009-021-00619-x
54. Ermis, E., Hoenicke, J., Podelski, A.: Splitting via interpolants. In: Proc. VMCAI. pp. 186–201. LNCS 7148, Springer (2012). https://doi.org/10.1007/978-3-642-27940-9_13
55. Ernst, G.: A complete approach to loop verification with invariants and summaries. Tech. Rep. arXiv:2010.05812v2, arXiv (January 2020)
56. Gadelha, M.Y.R., Monteiro, F.R., Cordeiro, L.C., Nicole, D.A.: Esbmc v6.0: Verifying C programs using k-induction and invariant inference (competition contribution). In: Proc. TACAS (3). pp. 209–213. LNCS 11429, Springer (2019). https://doi.org/10.1007/978-3-030-17502-3_15
57. Gadelha, M.Y., Ismail, H.I., Cordeiro, L.C.: Handling loops in bounded model checking of C programs via k-induction. Int. J. Softw. Tools Technol. Transf. 19(1), 97–114 (February 2017). https://doi.org/10.1007/s10009-015-0407-9
58. Gavrilenko, N., Ponce de León, H., Furbach, F., Heljanko, K., Meyer, R.: BMC for weak memory models: Relation analysis for compact SMT encodings. In: Proc. CAV. pp. 355–365. LNCS 11561, Springer (2019). https://doi.org/10.1007/978-3-030-25540-4_19
59. Greitschus, M., Dietsch, D., Podelski, A.: Loop invariants from counterexamples. In: Proc. SAS. pp. 128–147. LNCS 10422, Springer (2017). https://doi.org/10.1007/978-3-319-66706-5_7
60. Hajdu, Á., Micskei, Z.: Efficient strategies for CEGAR-based model checking. J. Autom. Reasoning 64(6), 1051–1091 (2020). https://doi.org/10.1007/s10817-019-09535-x
61. Haran, A., Carter, M., Emmi, M., Lal, A., Qadeer, S., Rakamarić, Z.: Smack+Corral: A modular verifier (competition contribution). In: Proc. TACAS. pp. 451–454. LNCS 9035, Springer (2015). https://doi.org/10.1007/978-3-662-46681-0_42
62. He, F., Sun, Z., Fan, H.: Deagle: An SMT-based verifier for multi-threaded programs (competition contribution). In: Proc. TACAS. LNCS 13244, Springer (2022)
63. Heizmann, M., Chen, Y.F., Dietsch, D., Greitschus, M., Hoenicke, J., Li, Y., Nutz, A., Musa, B., Schilling, C., Schindler, T., Podelski, A.: Ultimate Automizer and the search for perfect interpolants (competition contribution). In: Proc. TACAS (2). pp. 447–451. LNCS 10806, Springer (2018). https://doi.org/10.1007/978-3-319-89963-3_30
64. Heizmann, M., Hoenicke, J., Podelski, A.: Software model checking for people who love automata. In: Proc. CAV. pp. 36–52. LNCS 8044, Springer (2013). https://doi.org/10.1007/978-3-642-39799-8_2
65. Hensel, J., Mensendiek, C., Giesl, J.: AProVE: Non-termination witnesses for C programs (competition contribution). In: Proc. TACAS. LNCS 13244, Springer (2022)
66. Holík, L., Kotoun, M., Peringer, P., Šoková, V., Trtík, M., Vojnar, T.: Predator shape analysis tool suite. In: Hardware and Software: Verification and Testing. pp. 202–209. LNCS 10028, Springer (2016). https://doi.org/10.1007/978-3-319-49052-6
67. Howar, F., Jasper, M., Mues, M., Schmidt, D.A., Steffen, B.: The RERS challenge: Towards controllable and scalable benchmark synthesis. Int. J. Softw. Tools Technol. Transf. 23(6), 917–930 (2021). https://doi.org/10.1007/s10009-021-00617-z
68. Howar, F., Mues, M.: GWit (competition contribution). In: Proc. TACAS. LNCS 13244, Springer (2022)
69. Inverso, O., Tomasco, E., Fischer, B., La Torre, S., Parlato, G.: Lazy-CSeq: A lazy sequentialization tool for C (competition contribution). In: Proc. TACAS. pp. 398–401. LNCS 8413, Springer (2014). https://doi.org/10.1007/978-3-642-54862-8_29
70. Inverso, O., Tomasco, E., Fischer, B., Torre, S.L., Parlato, G.: Bounded verification of multi-threaded programs via lazy sequentialization. ACM Trans. Program. Lang. Syst. 44(1), 1:1–1:50 (2022). https://doi.org/10.1145/3478536
71. Inverso, O., Trubiani, C.: Parallel and distributed bounded model checking of multi-threaded programs. In: Proc. PPoPP. pp. 202–216. ACM (2020). https://doi.org/10.1145/3332466.3374529
72. Kahsai, T., Rümmer, P., Sanchez, H., Schäf, M.: JayHorn: A framework for verifying Java programs. In: Proc. CAV. pp. 352–358. LNCS 9779, Springer (2016). https://doi.org/10.1007/978-3-319-41528-4_19
73. Kettl, M., Lemberger, T.: The static analyzer Infer in SV-COMP (competition contribution). In: Proc. TACAS. LNCS 13244, Springer (2022)
74. Klumpp, D., Dietsch, D., Heizmann, M., Schüssele, F., Ebbinghaus, M., Farzan, A., Podelski, A.: Ultimate GemCutter and the axes of generalization (competition contribution). In: Proc. TACAS. LNCS 13244, Springer (2022)
75. Kröning, D., Tautschnig, M.: Cbmc: C bounded model checker (competition contribution). In: Proc. TACAS. pp. 389–391. LNCS 8413, Springer (2014). https://doi.org/10.1007/978-3-642-54862-8_26
76. Lauko, H., Ročkai, P., Barnat, J.: Symbolic computation via program transformation. In: Proc. ICTAC. pp. 313–332. Springer (2018). https://doi.org/10.1007/978-3-030-02508-3_17
77. Lauko, H., Ročkai, P.: Lart: Compiled abstract execution (competition contribution). In: Proc. TACAS. LNCS 13244, Springer (2022)
78. Lauko, H., Ročkai, P., Barnat, J.: Symbolic computation via program transformation. In: Proc. ICTAC. pp. 313–332. LNCS 11187, Springer (2018). https://doi.org/10.1007/978-3-030-02508-3_17
79. Leeson, W., Dwyer, M.: Graves-CPA: A graph-attention verifier selector (competition contribution). In: Proc. TACAS. LNCS 13244, Springer (2022)
80. Luckow, K.S., Dimjasevic, M., Giannakopoulou, D., Howar, F., Isberner, M., Kahsai, T., Rakamaric, Z., Raman, V.: JDart: A dynamic symbolic analysis framework. In: Proc. TACAS. pp. 442–459. LNCS 9636, Springer (2016). https://doi.org/10.1007/978-3-662-49674-9_26
81. Malík, V., Schrammel, P., Vojnar, T.: 2ls: Heap analysis and memory safety (competition contribution). In: Proc. TACAS (2). pp. 368–372. LNCS 12079, Springer (2020). https://doi.org/10.1007/978-3-030-45237-7_22
82. Metta, R., Medicherla, R.K., Chakraborty, S.: BMC+Fuzz: Efficient and effective test generation. In: Proc. DATE. IEEE (2022)
83. Mues, M., Howar, F.: JDart: Portfolio solving, breadth-first search and SMT-Lib strings (competition contribution). In: Proc. TACAS (2). pp. 448–452. LNCS 12652, Springer (2021). https://doi.org/10.1007/978-3-030-72013-1_30
84. Mues, M., Howar, F.: GDart (competition contribution). In: Proc. TACAS. LNCS 13244, Springer (2022)
85. Noller, Y., Păsăreanu, C.S., Le, X.B.D., Visser, W., Fromherz, A.: Symbolic Pathfinder for SV-COMP (competition contribution). In: Proc. TACAS (3). pp. 239–243. LNCS 11429, Springer (2019). https://doi.org/10.1007/978-3-030-17502-3_21
86. Nutz, A., Dietsch, D., Mohamed, M.M., Podelski, A.: Ultimate Kojak with memory safety checks (competition contribution). In: Proc. TACAS. pp. 458–460. LNCS 9035, Springer (2015). https://doi.org/10.1007/978-3-662-46681-0_44
87. Peringer, P., Šoková, V., Vojnar, T.: PredatorHP revamped (not only) for interval-sized memory regions and memory reallocation (competition contribution). In: Proc. TACAS (2). pp. 408–412. LNCS 12079, Springer (2020). https://doi.org/10.1007/978-3-030-45237-7_30
88. Ponce-De-Leon, H., Haas, T., Meyer, R.: Dartagnan: Leveraging compiler optimizations and the price of precision (competition contribution). In: Proc. TACAS (2). pp. 428–432. LNCS 12652, Springer (2021). https://doi.org/10.1007/978-3-030-72013-1_26
89. Ponce-De-Leon, H., Haas, T., Meyer, R.: Dartagnan: Smt-based violation witness validation (competition contribution). In: Proc. TACAS. LNCS 13244, Springer (2022)
90. Pratikakis, P., Foster, J.S., Hicks, M.: Locksmith: Practical static race detection for C. ACM Trans. Program. Lang. Syst. 33(1) (January 2011). https://doi.org/10.1145/1889997.1890000
91. Păsăreanu, C.S., Visser, W., Bushnell, D.H., Geldenhuys, J., Mehlitz, P.C., Rungta, N.: Symbolic PathFinder: Integrating symbolic execution with model checking for Java bytecode analysis. Autom. Software Eng. 20(3), 391–425 (2013). https://doi.org/10.1007/s10515-013-0122-2
92. Rakamarić, Z., Emmi, M.: SMACK: Decoupling source language details from verifier implementations. In: Proc. CAV. pp. 106–113. LNCS 8559, Springer (2014). https://doi.org/10.1007/978-3-319-08867-9_7
93. Richter, C., Hüllermeier, E., Jakobs, M.C., Wehrheim, H.: Algorithm selection for software validation based on graph kernels. Autom. Softw. Eng. 27(1), 153–186 (2020). https://doi.org/10.1007/s10515-020-00270-x
94. Richter, C., Wehrheim, H.: PeSCo: Predicting sequential combinations of verifiers (competition contribution). In: Proc. TACAS (3). pp. 229–233. LNCS 11429, Springer (2019). https://doi.org/10.1007/978-3-030-17502-3_19
95. Saan, S., Schwarz, M., Apinis, K., Erhard, J., Seidl, H., Vogler, R., Vojdani, V.: Goblint: Thread-modular abstract interpretation using side-effecting constraints (competition contribution). In: Proc. TACAS (2). pp. 438–442. LNCS 12652, Springer (2021). https://doi.org/10.1007/978-3-030-72013-1_28
96. Scott, R., Dockins, R., Ravitch, T., Tomb, A.: Crux: Symbolic execution meets SMT-based verification (competition contribution). Zenodo (February 2022). https://doi.org/10.5281/zenodo.6147218
97. Shamakhi, A., Hojjat, H., Rümmer, P.: Towards string support in JayHorn (competition contribution). In: Proc. TACAS (2). pp. 443–447. LNCS 12652, Springer (2021). https://doi.org/10.1007/978-3-030-72013-1_29
98. Sharma, V., Hussein, S., Whalen, M.W., McCamant, S.A., Visser, W.: Java Ranger at SV-COMP 2020 (competition contribution). In: Proc. TACAS (2). pp. 393–397. LNCS 12079, Springer (2020). https://doi.org/10.1007/978-3-030-45237-7_27
99. Sharma, V., Hussein, S., Whalen, M.W., McCamant, S.A., Visser, W.: Java Ranger: Statically summarizing regions for efficient symbolic execution of Java. In: Proc. ESEC/FSE. pp. 123–134. ACM (2020). https://doi.org/10.1145/3368089.3409734
100. Ströder, T., Giesl, J., Brockschmidt, M., Frohn, F., Fuhs, C., Hensel, J., Schneider-Kamp, P., Aschermann, C.: Automatically proving termination and memory safety for programs with pointer arithmetic. J. Autom. Reasoning 58(1), 33–65 (2017). https://doi.org/10.1007/s10817-016-9389-x
101. Tóth, T., Hajdu, A., Vörös, A., Micskei, Z., Majzik, I.: Theta: A framework for abstraction refinement-based model checking. In: Proc. FMCAD. pp. 176–179 (2017). https://doi.org/10.23919/FMCAD.2017.8102257
102. Visser, W., Geldenhuys, J.: Coastal: Combining concolic and fuzzing for Java (competition contribution). In: Proc. TACAS (2). pp. 373–377. LNCS 12079, Springer (2020). https://doi.org/10.1007/978-3-030-45237-7_23
103. Vojdani, V., Apinis, K., Rõtov, V., Seidl, H., Vene, V., Vogler, R.: Static race detection for device drivers: The Goblint approach. In: Proc. ASE. pp. 391–402. ACM (2016). https://doi.org/10.1145/2970276.2970337
104. Volkov, A.R., Mandrykin, M.U.: Predicate abstractions memory modeling method with separation into disjoint regions. Proceedings of the Institute for System Programming (ISPRAS) 29, 203–216 (2017). https://doi.org/10.15514/ISPRAS-2017-29(4)-13
105. Švejda, J., Berger, P., Katoen, J.P.: Interpretation-based violation witness validation for C: NitWit. In: Proc. TACAS. pp. 40–57. LNCS 12078, Springer (2020). https://doi.org/10.1007/978-3-030-45190-5_3
106. Wendler, P., Beyer, D.: sosy-lab/benchexec: Release 3.10. Zenodo (2022). https://doi.org/10.5281/zenodo.5720267
107. Wetzler, N., Heule, M.J.H., Hunt, W.A., Jr.: Drat-trim: Efficient checking and trimming using expressive clausal proofs. In: Proc. SAT. pp. 422–429. LNCS 8561, Springer (2014). https://doi.org/10.1007/978-3-319-09284-3_31
108. Wu, T., Schrammel, P., Cordeiro, L.: Wit4Java: A violation-witness validator for Java verifiers (competition contribution). In: Proc. TACAS. LNCS 13244, Springer (2022)
109. Ádám, Z., Bajczi, L., Dobos-Kovács, M., Hajdu, A., Molnár, V.: Theta: Portfolio of CEGAR-based analyses with dynamic algorithm selection (competition contribution). In: Proc. TACAS. LNCS 13244, Springer (2022)
Open Access. This chapter is licensed under the terms of the Creative Commons
Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/),
which permits use, sharing, adaptation, distribution, and reproduction in any medium
or format, as long as you give appropriate credit to the original author(s) and the source,
provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s
Creative Commons license, unless indicated otherwise in a credit line to the material. If
material is not included in the chapter’s Creative Commons license and your intended
use is not permitted by statutory regulation or exceeds the permitted use, you will need
to obtain permission directly from the copyright holder.
AProVE: Non-Termination Witnesses for C Programs
(Competition Contribution)
Jera Hensel, Constantin Mensendiek, and Jürgen Giesl
LuFG Informatik 2, RWTH Aachen University, Germany
Abstract. To (dis)prove termination of C programs, AProVE uses symbolic
execution to transform the program's LLVM code into an integer transition
system, which is then analyzed by several backends. The transformation steps
in AProVE and the tools in the backend only produce sub-proofs in their
respective domains. Hence, we have now developed new techniques to
automatically combine the essence of these proofs. If non-termination is
proved, then they yield an overall witness, which identifies a
non-terminating path in the original C program.
1 Verification Approach and Software Architecture
To prove (non-)termination of a C program, AProVE uses the Clang compiler [7]
to translate it to the intermediate representation of the LLVM framework [15].
Then AProVE symbolically executes the LLVM program and uses abstraction to
obtain a finite symbolic execution graph (SEG) containing all possible program
runs. We refer to [14, 17] for further details on our approach to prove termination.

To prove non-termination, AProVE runs three approaches in parallel, see Fig. 1.
The first two approaches transform the lassos of the SEG to integer transition
systems (ITSs), which are then passed to the tools T2 [6] and LoAT [11]. If one
of the tools returns a proof of non-termination, AProVE uses it to construct a
non-terminating path through the C program. The path of the first succeeding
approach is returned to the user while all other computations are stopped.
T2's proof consists of a recurrent set characterizing those variable assignments
that lead to a non-terminating ITS run. Here, AProVE uses an SMT solver to
identify a corresponding concrete assignment of the variables in the ITS (which
correspond to the variables in the (abstract) program states of the SEG). The
third approach transforms the lassos of the SEG directly to SMT formulas which
are only satisfiable if there is a non-terminating path; in this case, we can
deduce a variable assignment from the model of the formulas returned by the
solver. While the first and the third approach were already available in AProVE
before [13], we have now extended them by the generation of non-termination
witnesses. To this end, the variable assignment obtained from these approaches
is used by AProVE to step through the corresponding lasso of the SEG in order
to obtain a concrete execution path that witnesses non-termination.
Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) - 235950644 (Project GI 274/6-2)
© The Author(s) 2022. D. Fisman and G. Rosu (Eds.): TACAS 2022, LNCS 13244, pp. 403–407, 2022. https://doi.org/10.1007/978-3-030-99527-0_21
Fig. 1: AProVE's Workflow for Non-Termination Analysis (C program → LLVM
program → symbolic execution graph → lassos → ITSs and SMT formulas; the
backends LoAT, T2, and the SMT solvers return non-termination proofs, recurrent
sets, or variable assignments, from which a concrete execution path is derived
first in the SEG, then in the LLVM program, and finally in the C program)
To ensure that the generation of the path terminates, AProVE stops as soon as a
program state of the SEG is visited twice. Thus, this approach only succeeds if
the first loop on the path whose body is executed several times is already the
non-terminating loop. However, it does not find non-termination witnesses for
programs with several loops, where the non-terminating path first leads through
several iterations of other loops before it ends in a non-terminating loop.
    void f(x, y) {
      y = 0;
      while (x > 0) {
        x = x - 1;
        y = y + 1;
      }
      while (y > 1)
        y = y;
    }

Fig. 2: Example C Function
To handle such programs as well, we have now developed a novel second approach
for proving non-termination which uses our tool LoAT in the backend. To
understand how LoAT finds non-termination proofs, consider the function f in
Fig. 2. The first loop decrements x as long as x is positive and increments y
by the same amount. Afterwards, the second loop does not terminate if y is
greater than 1. Hence, the function f does not terminate if the initial value
of the parameter x is greater than 1. LoAT can detect such interdependencies in
the corresponding ITS (Fig. 3a) generated by AProVE. To this end, LoAT uses
different forms of loop acceleration:
r0: f(x, y) → ℓ1(x, 0)
r1: ℓ1(x, y) → ℓ1(x − 1, y + 1) [x > 0]
r2: ℓ1(x, y) → ℓ2(x, y) [x ≤ 0]
r3: ℓ2(x, y) → ℓ2(x, y) [y > 1]

Fig. 3a: Corresponding ITS

r4: ℓ1(x, y) → ℓ1(0, y + x) [x > 0]
r5: ℓ2(x, y) → ⊥ [y > 1]
r6: f(x, y) → ℓ1(0, x) [x > 0]
r7: f(x, y) → ℓ2(0, x) [x > 0]
r8: f(x, y) → ⊥ [x > 1]

Fig. 3b: Simplified Rules (⊥ marks direct non-termination)
Finite acceleration combines several iterations of a looping rule into a new
rule. LoAT applies this simplification to the rule r1 representing the first
loop, resulting in the new rule r4 in Fig. 3b. In the second looping rule r3,
the guard is invariant w.r.t. the update of the variables in this rule. In such
a case, LoAT applies non-terminating acceleration, transforming r3 to r5.
Finally, chaining represents the successive execution of two rules as a single
rule. For example, the rule r6 is the result of chaining r0 and r4. The exact
simplification steps performed by LoAT in this example are shown in Fig. 3c.
Note that the final rule r8 starts from the initial function symbol and directly
goes to non-termination. Every variable assignment satisfying the respective
final guard x > 1 results in a non-terminating run.
Fig. 3c: Simplification Tree (r4 results from r1 by finite acceleration, r5
from r3 by non-terminating acceleration, r6 from chaining r0 and r4, r7 from
chaining r6 and r2, and r8 from chaining r7 and r5)
The simplification tree in Fig. 3c is also the starting point for our new
technique to generate non-termination witnesses. AProVE constructs this tree
from LoAT's proof output. Then, by processing the leaves of the simplification
tree from left to right, a path through the SEG can be derived. To determine
how often one has to traverse earlier loops on the path to the non-terminating
loop, AProVE uses an SMT solver to find a concrete variable assignment that
satisfies the final guard. In our example, the final guard x > 1 would be
satisfied by {x = 2, y = 0}, for example. Consequently, the corresponding
concrete execution path includes two iterations of the first loop before
reaching the non-terminating second loop.
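To make the witness concrete, the following self-contained C version of the function from Fig. 2 (with explicit int types and a driver added here purely for illustration; not part of the original benchmark) shows the non-terminating run for the witness value x = 2:

    /* Fig. 2 with explicit types, started with the witness value x = 2 */
    void f(int x, int y) {
      y = 0;
      while (x > 0) {   /* two iterations: (x,y) = (2,0) -> (1,1) -> (0,2) */
        x = x - 1;
        y = y + 1;
      }
      while (y > 1)     /* y == 2 and the guard never changes: runs forever */
        y = y;
    }

    int main(void) {
      f(2, 0);          /* witnesses non-termination: this call never returns */
      return 0;
    }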
Once the path is constructed, AProVE extracts the LLVM program positions from
the states, obtaining a non-terminating path through the LLVM program in the
form of a lasso. Using Clang's debug information output, AProVE then matches
the LLVM lines to the lines in the C program. The resulting C witness can be
validated by the tools CPAchecker [5] and Ultimate Automizer [12].
2 Discussion of Strengths and Weaknesses
In general, AProVE is especially powerful on programs where a precise modeling
of the values of program variables and memory contents is needed to (dis)prove
termination. However, on large programs containing many variables which are
not relevant for termination, tools with CEGAR-based approaches are often
faster. The reason is that AProVE does not implement any techniques to decide
which variables are relevant for (non-)termination.
Furthermore, one of AProVE's most crucial weaknesses when proving
non-termination in past editions of SV-COMP was producing a meaningful witness.
Therefore, in the two approaches for proving non-termination in AProVE that
are based on T2 or on the direct analysis of lassos of the SEG, we added the
novel techniques presented in the current paper to generate non-termination
witnesses from the obtained variable assignments. Here, the problem is that
when computing a concrete execution path, we cannot be sure when to stop the
computation: whenever we visit a program position repeatedly, we do not know
if this position is part of the non-terminating loop of the lasso, or if it is
still part of the finite path to the non-terminating loop.
In contrast, in our new approach based on LoAT, the simplification tree allows
us to infer the order in which the loops of the program are traversed, and this
tree also contains the information which loop is the non-terminating one.
Thus, this approach extends AProVE's power substantially, since it can find
non-termination witnesses for programs where all non-terminating paths lead
through several iterations of more than one loop. On the other hand, there are
also examples where the other two approaches outperform the approach based on
LoAT, e.g., if T2 finds a non-termination proof and LoAT does not. Our
observation is that especially for small programs containing only a single
loop, the other approaches are often faster. This is also confirmed by our
results in the Termination category of SV-COMP 2022: while in the
sub-categories MainControlFlow and MainHeap, 83% of the non-termination proofs
are found using T2 or the direct SMT approach, in Termination-Other, 95% of the
non-termination proofs result from the LoAT approach. This set consists of
especially large programs, which often contain more than one loop.
More information about SV-COMP 2022 including the competition results
can be found in the competition report [3].
3 Setup and Configuration
AProVE is developed in the Programming Languages and Verification group
headed by J. Giesl at RWTH Aachen University. On the web site [2], AProVE
can be downloaded or accessed via a web interface. Moreover, [2] also contains a
list of external tools used by AProVE and a list of present and past contributors.
In SV-COMP 2022, AProVE only participates in the category "Termination".
All files from the submitted archive must be extracted into one folder. AProVE
is implemented in Java and needs a Java 11 Runtime Environment. Moreover,
AProVE requires the Clang compiler [7] to translate C to LLVM. To analyze the
resulting ITSs in the backend, AProVE uses LoAT [11] and T2 [6]. Furthermore,
it applies the satisfiability checkers Z3 [8], Yices [9], and MiniSAT [10] in parallel
(our archive contains all these tools). As a dependency of T2, Mono [16] (version
4.0) needs to be installed. Extending the path environment is necessary so that
AProVE can find these programs. Using the wrapper script aprove.py in the
BenchExec repository, AProVE can be invoked, e.g., on the benchmarks defined
in aprove.xml in the SV-COMP repository. The most recent version of AProVE
with the improved witness generation can be downloaded at [1].
Data Availability Statement. All data of SV-COMP 2022 are archived as described
in the competition report [3] and available on the competition web site. This includes
the verification tasks, results, witnesses, scripts, and instructions for reproduction.
The version of our verifier as used in the competition is archived together with other
participating tools [4].
References
1. AProVE: https://github.com/aprove-developers/aprove-releases/releases
2. AProVE Website: https://aprove.informatik.rwth-aachen.de/
3. Beyer, D.: Progress on software verification: SV-COMP 2022. In: Proc. TACAS ’22. LNCS (2022)
4. Beyer, D.: Verifiers and validators of the 11th Intl. Competition on Software Verification (SV-COMP 2022). Zenodo (2022), https://doi.org/10.5281/zenodo.5959149
5. Beyer, D., Keremoglu, M.E.: CPAchecker: A tool for configurable software verification. In: Proc. CAV ’11. pp. 184–190. LNCS 6806 (2011), https://doi.org/10.1007/978-3-642-22110-1_16
6. Brockschmidt, M., Cook, B., Ishtiaq, S., Khlaaf, H., Piterman, N.: T2: Temporal property verification. In: Proc. TACAS ’16. pp. 387–393. LNCS 9636 (2016), https://doi.org/10.1007/978-3-662-49674-9_22
7. Clang: https://clang.llvm.org
8. de Moura, L., Bjørner, N.: Z3: An efficient SMT solver. In: Proc. TACAS ’08. pp. 337–340. LNCS 4963 (2008), https://doi.org/10.1007/978-3-540-78800-3_24
9. Dutertre, B., de Moura, L.: System description: Yices 1.0 (2006), https://yices.csl.sri.com/papers/yices-smtcomp06.pdf
10. Eén, N., Sörensson, N.: An extensible SAT-solver. In: Proc. SAT ’03. pp. 502–518. LNCS 2919 (2003), https://doi.org/10.1007/978-3-540-24605-3_37
11. Frohn, F., Giesl, J.: Proving non-termination via loop acceleration. In: Proc. FMCAD ’19. pp. 221–230 (2019), https://doi.org/10.23919/FMCAD.2019.8894271
12. Heizmann, M., Dietsch, D., Leike, J., Musa, B., Podelski, A.: Ultimate Automizer with array interpolation. In: Proc. TACAS ’15. pp. 455–457. LNCS 9035 (2015), https://doi.org/10.1007/978-3-662-46681-0_43
13. Hensel, J., Emrich, F., Frohn, F., Ströder, T., Giesl, J.: AProVE: Proving and disproving termination of memory-manipulating C programs (competition contribution). In: Proc. TACAS ’17. pp. 350–354. LNCS 10206 (2017), https://doi.org/10.1007/978-3-662-54580-5_21
14. Hensel, J., Giesl, J., Frohn, F., Ströder, T.: Termination and complexity analysis for programs with bitvector arithmetic by symbolic execution. Journal of Logical and Algebraic Methods in Programming 97, 105–130 (2018), https://doi.org/10.1016/j.jlamp.2018.02.004
15. Lattner, C., Adve, V.S.: LLVM: A compilation framework for lifelong program analysis & transformation. In: Proc. CGO ’04. pp. 75–88 (2004), https://doi.org/10.1109/CGO.2004.1281665
16. Mono: https://www.mono-project.com/
17. Ströder, T., Giesl, J., Brockschmidt, M., Frohn, F., Fuhs, C., Hensel, J., Schneider-Kamp, P., Aschermann, C.: Automatically proving termination and memory safety for programs with pointer arithmetic. J. of Aut. Reasoning 58(1), 33–65 (2017), https://doi.org/10.1007/s10817-016-9389-x
Open Access This chapter is licensed under the terms of the Creative Commons
Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/),
which permits use, sharing, adaptation, distribution and reproduction in any medium
or format, as long as you give appropriate credit to the original author(s) and the
source, provide a link to the Creative Commons license and indicate if changes were
made.
The images or other third party material in this chapter are included in the chapter’s
Creative Commons license, unless indicated otherwise in a credit line to the material. If
material is not included in the chapter’s Creative Commons license and your intended
use is not permitted by statutory regulation or exceeds the permitted use, you will need
to obtain permission directly from the copyright holder.
BRICK: Path Enumeration Based Bounded Reachability Checking of C Programs
(Competition Contribution)
Lei Bu, Zhunyi Xie, Lecheng Lyu, Yichao Li, Xiao Guo, Jianhua Zhao, and Xuandong Li
State Key Laboratory for Novel Software Technology, Nanjing University, China
bulei@nju.edu.cn
Abstract. BRICK is a bounded reachability checker for embedded C programs.
BRICK conducts a path-oriented checking of the bounded state space of the
program: it enumerates and checks all the possible paths of the program within
the threshold one by one. To alleviate the path explosion problem, BRICK
locates and records unsatisfiable-core path segments during the checking of
each path and uses them to prune the search space. Furthermore, derivative-free
optimization based falsification and loop induction are introduced to handle
complex program features like nonlinear path conditions and loops efficiently.
1 Verification Approach
Existing bounded software checkers usually encode the bounded state space of
the program into one constraint-solving problem directly. However, in this
manner, when the size of the program or the bound of the checking increases,
the corresponding constraint-solving problem explodes quickly and becomes
difficult for existing SAT/SMT solvers to solve.

To address this problem, BRICK conducts a path-oriented checking of the
bounded state space of the program: it enumerates and checks all the possible
paths within the threshold one by one [1, 2]. The main merit of this approach
is that the size of the problem that needs to be solved by the constraint
solver is well controlled and can be easily handled. The main features of
BRICK's solving are reported below:
1.1 Flexible Path Enumeration
BRICK enumerates potential paths from the control flow graph (CFG) of the
given program up to the user-defined step bound. Two path enumeration
strategies are applied in BRICK, each with its own advantages.
This work is supported in part by the National Key Research and Development Plan (No. 2017YFA0700604), the Leading-Edge Technology Program of Jiangsu Natural Science Foundation (No. BK20202001), and the National Natural Science Foundation of China (No. 62172200, No. 61632015).
© The Author(s) 2022. D. Fisman and G. Rosu (Eds.): TACAS 2022, LNCS 13244, pp. 408–412, 2022. https://doi.org/10.1007/978-3-030-99527-0_22
First, we can simply conduct a classical depth-first search (DFS) to enumerate
program paths. The benefit of this approach is that, if the DFS stops without
touching the given bound, we can conclude that the target state is not
reachable in general, not only in the bounded state space.

We have also implemented a special method to encode the jump-to relation
between different code blocks into a SAT formula and obtain the potential
path by SAT solving. The benefit is that if the potential path is confirmed to
be infeasible by the subsequent path-condition solving, the infeasible path
segment in the path can be located and encoded back into the SAT formula to
prune all future paths containing this infeasible segment.
1.2 Infeasible Path Segment Pool Guided State Space Pruning
BRICK conducts lazy solving of each path by encoding the path condition of the
potential path into a feasibility problem. BRICK asks a constraint solver,
i.e., an SMT solver (Z3 [6]), interval analysis (dReal [4]), or derivative-free
optimization-based solving (Section 1.3), to solve the problem. If the path is
decided to be infeasible by the solver, BRICK tries to extract the
unsatisfiable core (UC) of the feasibility problem of this path, and maps the
UC constraints to an infeasible path segment in the path, which is added to the
infeasible path pool. After that, all paths that contain any infeasible segment
from the pool are reported as unreachable directly during the following path
enumeration.
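As a hypothetical illustration of this pruning (nondet_int stands for a nondeterministic input, similar to the SV-COMP convention), consider:

    extern int nondet_int(void);

    int main(void) {
      int x = nondet_int();
      if (x > 0)        /* segment A: path condition contains x > 0    */
        x = x + 1;      /* after A, x > 1 holds                        */
      if (x < 0) {      /* segment B: path condition also needs x < 0  */
        if (x > 10)
          return 1;     /* target */
      }
      return 0;
    }

Any path that takes both branch A and branch B has an unsatisfiable path condition, and the UC maps to the segment "A followed by B". Once this segment is in the pool, every remaining path through A and B (taking either branch of x > 10) is reported as unreachable without invoking the solver again.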
1.3 Derivative-free Optimization Based Constraint Falsification
We can see that constraint solving plays an important role in BRICK. However,
complex path conditions, like nonlinear constraints, which widely appear in
programs, are hard for the existing solvers to handle efficiently. In BRICK, a
classification-model-based derivative-free optimization (DFO) approach is used
to alleviate this difficulty by conducting a sample-feedback-learn style of DFO
solving [8].

More specifically, the underlying solver guesses a sample solution for the
feasibility problem. Then, we evaluate whether the sampled solution satisfies
the path constraint or not, and calculate the distance between the sampled
solution and a correct one if it does not. This distance is used as the
feedback metric in the classification-based DFO learning, to guide the solver
to converge to values that fit the path constraint, as sketched below. In
practice, this approach works very well in nonlinear problem solving. However,
this DFO-based approach cannot tell that the target is not reachable if it
fails to find a solution.
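As a minimal sketch (not BRICK's actual metric), a distance feedback for the nonlinear path constraint x*x + y*y == 25 could be:

    #include <math.h>

    /* The distance is zero exactly when a sample satisfies the
       constraint; its magnitude tells the DFO learner how far a
       rejected sample is from the feasible region, steering the
       next round of sampling toward it. */
    double sample_distance(double x, double y) {
      return fabs(x * x + y * y - 25.0);
    }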
1.4 Induction-based Loop Handling
If the target program contains loops, the number of potential paths may explode.
To alleviate this problem, we conduct an induction-based proof to try to handle
the loop before starting the BMC.

First of all, we collect the constraints from the assertions and generate the
corresponding weakest preconditions. Then, we conduct a standard
induction-based proof to see whether these constraints are satisfied in every
iteration; a hypothetical illustration follows below. If no counterexamples are
returned, we know that the assertions will not be violated in the loop.
Furthermore, we are also working on the integration of loop invariant
generation to further refine the CFG under checking.
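As a sketch of what such an induction-based proof discharges, consider the following loop; the assertion holds for every iteration count without any unrolling:

    #include <assert.h>

    void count_up(int n) {
      int i = 0;
      while (i < n) {
        /* Base case: i = 0 satisfies i >= 0 on loop entry.
           Induction step: if i >= 0 holds at the start of an iteration,
           then i + 1 >= 0 holds after the update, so the assertion
           cannot be violated in any iteration. */
        assert(i >= 0);
        i = i + 1;
      }
    }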
2 Software Architecture
The architecture of BRICK is shown in Fig. 1. It consists of a loop processing
module, a path enumerating module, and a constraint solving module, all
implemented in C++.

In the loop processing module, if the program contains an assertion-related
loop, BRICK first conducts loop-induction-based verification. If the induction
works, BRICK reports unreachable; otherwise, it builds the program CFG and
performs the subsequent path-enumeration-based checking.

In the path enumerating module, BRICK employs SAT-based and DFS-based path
enumerating methods to extract the program path and its corresponding path
condition. The constraint solving module accepts the path condition and
performs constraint solving accordingly. All these techniques were described in
Section 1. The solvers used in BRICK include the SAT solver MiniSAT [3], the
SMT solver Z3 [6], the interval analysis solver dReal [4], and our
implementation of the DFO method RACOS [9].
Fig. 1. Architecture of BRICK
3 Strengths and Weaknesses
Most bounded reachability checkers, e.g., CBMC [5], encode the bounded state
space into a huge SMT formula consisting of both conjunctions and disjunctions
of different kinds of formulas, which is difficult for existing solvers to
handle and can easily cause memory explosion. Instead, BRICK conducts the
verification in a path-oriented way:
- BRICK enumerates and checks all the potential paths one by one. In this
  manner, the computational complexity is well controlled.
- Meanwhile, as only the ongoing path is kept in memory and the corresponding
  path constraints are disjunction-free, the solving problem is much easier to
  handle.
- For the sake of processing capability, UC-guided backtracking and path
  pruning is used to prune the search space substantially, and DFO-based
  solving is conducted to handle complex nonlinear constraints efficiently.

BRICK participated in the ReachSafety/Floats category of SV-COMP 2022 [10].
BRICK successfully verified 439 of all the 469 tasks, ranking 1st in this
sub-category. Furthermore, for these 439 solved cases, BRICK only used 1 000
seconds in total. In comparison, CoveriTeam and VeriAbs [7], which won the 2nd
and 3rd places in this category, spent 9 300 and 18 000 seconds, respectively,
i.e., 9 and 18 times more than BRICK.
As for weaknesses, like all other bounded checkers, BRICK may not be able to
give a proof of correctness of a program if it cannot finish the search within
the given step bound. In this case, BRICK can only report bounded true. For
example, on the tasks of SV-COMP 2022, besides the 439 cases that are proved by
BRICK, there are also several programs for which BRICK can only give a bounded
result or simply times out. Therefore, as future work, we are implementing
techniques including loop summaries and k-induction to try to abstract the
loops and give a proof of correctness in more cases.
4 Tool Setup and Configuration
The binary file of BRICK for Ubuntu 20.04 is available at
https://github.com/brick-tool-dev/BRICK-2.0. To install the tool, please clone
this repository and follow the instructions in README.md. A tailored version of
BRICK took part in the ReachSafety/Floats category in SV-COMP 2022 [10]. This
version [11] supports checking the reachability of the error function. The
BenchExec wrapper script for the tool is brick.py, and brick.xml is the
benchmark description file.
5 Software Project and Contributors
BRICK is available under the MIT License. The BRICK team is from the Software
Engineering Group, Nanjing University. We would like to thank Sicun Gao for his
kind help with the usage of dReal.
Data Availability Statement. All data of SV-COMP 2022 are archived as described
in the competition report [10] and available on the competition web site. This includes
the verification tasks, results, witnesses, scripts, and instructions for reproduction.
The version of our verifier as used in the competition is archived together with other
participating tools [11].
References
1. L. Bu, et al.: BACH: Bounded Reachability Checker for Linear Hybrid Automata. In FMCAD'08, pp. 65-68.
2. D. Xie, et al.: SAT-LP-IIS Joint-Directed Path-Oriented Bounded Reachability Analysis of Linear Hybrid Automata. In FMSD, 45(1): 42-62, 2014.
3. N. Eén and N. Sörensson: An extensible SAT-solver. In SAT'03, 502-518.
4. S. Gao, et al.: dReal: An SMT solver for nonlinear theories over the reals. In CADE'13, 208-214.
5. D. Kroening, et al.: CBMC - C Bounded Model Checker. In TACAS'14: 389-391.
6. L. De Moura and N. Bjørner: Z3: An Efficient SMT Solver. In TACAS'08: 337-340.
7. P. Darke, et al.: VeriAbs: Verification by Abstraction and Test Generation (Competition Contribution). In TACAS'18: 457-462.
8. L. Bu, et al.: Machine learning steered symbolic execution framework for complex software code. In Formal Aspects of Computing, 33:3, 301-323, 2021.
9. Y. Yu, et al.: Derivative-free optimization via classification. In AAAI'16, 2286-2292.
10. D. Beyer: Progress on Software Verification: SV-COMP 2022. In TACAS'22.
11. D. Beyer: Verifiers and Validators of the 11th Intl. Competition on Software Verification (SV-COMP 2022). Zenodo, DOI: 10.5281/zenodo.5959149, 2022.
Open Access This chapter is licensed under the terms of the Creative Commons
Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/),
which permits use, sharing, adaptation, distribution and reproduction in any medium
or format, as long as you give appropriate credit to the original author(s) and the
source, provide a link to the Creative Commons license and indicate if changes were
made.
The images or other third party material in this chapter are included in the
chapter’s Creative Commons license, unless indicated otherwise in a credit line to the
material. If material is not included in the chapter’s Creative Commons license and
your intended use is not permitted by statutory regulation or exceeds the permitted
use, you will need to obtain permission directly from the copyright holder.
A Prototype for Data Race Detection in CSeq 3
(Competition Contribution)
Alex Coto, Omar Inverso, Emerson Sales, and Emilio Tuosto
Gran Sasso Science Institute, L’Aquila, Italy
{alex.coto,omar.inverso,emerson.sales,emilio.tuosto}@gssi.it
Abstract. We sketch a sequentialization-based technique for bounded
detection of data races under sequential consistency, and summarise the
major improvements to our verification framework over the last years.
Keywords: Bounded model checking · Context-bounded analysis · Sequentialization · Data races · Reachability · Concurrency · Threads
1 Verification Approach
Our approach is based on lazy sequentialization [7]. The idea is to convert the
concurrent program P of interest into a non-deterministic sequential program
Qu,k that preserves all feasible executions of P up to unwinding bound u and
k rounds (or execution contexts [8]). Among different techniques [6], we choose
bounded model checking [3] to analyse Qu,k. In this section, we briefly overview
lazy sequentialisation, and sketch a novel extension to detect data races. Further
elements of novelty w.r.t. the engineering of our tool are discussed in the next
section.

Lazy Sequentialization. We unwind all loops and inline all functions in P,
except the main function and those from which a thread is spawned, obtaining a
bounded program Pu that preserves all feasible executions of P up to the
unwinding bound u. We then transform each function of Pu into a thread
simulation function where each visible statement is assigned a numerical label
and a guard, and each call to a concurrency-specific function is replaced by a
call to a function that models the same intended semantics; for each simulation
function, we add a global variable to represent the program counter, initially
set to zero.

A thread's execution context of Pu is simulated by invoking the corresponding
thread simulation function of Qu,k, which executes from the first statement to a
non-deterministically selected label, updates the program counter, and returns.
Further execution contexts are simulated by re-invoking the simulation function,
where the guards ensure that the control is repositioned to the correct numerical
label via a sequence of jumps, and so on. To retain consistency of the local
state of the thread across different invocations of the simulation functions,
static storage is enforced for all local variables. We drive the overall
simulation of Pu from the main function of Qu,k, by invoking the thread
simulation functions appropriately.

This work has been partially funded by MIUR project PRIN 2017FTXR7S IT-MATTERS and MUR project FISR2020IP 05310 MVM-Adapt.
© The Author(s) 2022. D. Fisman and G. Rosu (Eds.): TACAS 2022, LNCS 13244, pp. 413–417, 2022. https://doi.org/10.1007/978-3-030-99527-0_23
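The following C sketch shows the shape of such a thread simulation function; the names (pc1, next, thread1_sim) and the two-statement thread body are illustrative assumptions, not CSeq's actual output. A statement with label k executes in the current context exactly when pc1 <= k < next:

    int shared = 0;
    int pc1 = 0;                   /* program counter of thread 1 (persists) */

    /* Simulates one execution context of thread 1, running the visible
       statements with labels in [pc1, next) and then context-switching. */
    void thread1_sim(int next) {
      static int tmp;              /* static storage: local survives calls */
      if (pc1 <= 0 && 0 < next)    /* label 0 */
        tmp = shared;
      if (pc1 <= 1 && 1 < next)    /* label 1 */
        shared = tmp + 1;
      pc1 = next;                  /* remember where this context stopped */
    }

A main driver (not shown) would repeatedly invoke the simulation functions of all threads with nondeterministically chosen values of next, simulating up to k rounds.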
Data Race Detection. A program contains a data race if it can execute two
conflicting actions (i.e., one thread modifies a memory location and another
one reads or modifies the same location), at least one of which is not atomic,
and neither happens before the other [9]. Consider two threads performing the
operation v = v + 1 on a shared variable initialised to zero. Both threads try
to modify the data at the memory location reserved for v, but the necessary
sequences of memory accesses are not synchronised, and thus may interleave. If
a context-switch happens between the memory read and write operations in the
thread that runs first, both threads will read 0, and at the end of the execution
the value of v will be 1. To detect such a situation, we alter the encoding from P_u
to Q_{u,k} by (i) adding a shared array w_addrs that stores, for each thread, a
pointer to the memory location targeted by a write operation, (ii) injecting
additional control code at each visible statement, and (iii) splitting the modified
sequentialised encoding of the visible statement into two separate sequentialised
statements to allow in-between context switching. The following code fragment
shows the modified sequentialised encoding (guards omitted for simplicity; every
line except v = v + 1 is injected code) for the statement v = v + 1 of the first
thread of the program described above:

    k:   void *w_addr = &v;
         assert(w_addrs[1] != w_addr);
         w_addrs[0] = w_addr;
         v = v + 1;
    k+1: w_addrs[0] = 0;

We store in w_addr the address of the variable being written, and then assert
that the other thread is not writing to the same location; in the same (simulated)
execution context, we store w_addr in w_addrs, so that the assertion can be
checked within the other thread too. We reset w_addrs right after the statement
under consideration. Note the label k+1, which allows thread pre-emption. Now,
one of the threads can execute the simulated statement at label k and context-switch
at label k+1 while w_addrs still points to v; this makes it possible to
schedule the other thread and fail the assertion there.
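For reference, the two-thread program discussed above can be written with POSIX threads as follows; this is a plain illustration of the race, independent of the sequentialised encoding:

    #include <pthread.h>

    int v = 0;   // shared variable, initialised to zero

    // Both threads perform a non-atomic read-modify-write on v: the load
    // and the store may interleave, so the final value can be 1 instead of 2.
    void *incr(void *arg) {
        v = v + 1;
        return 0;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, 0, incr, 0);
        pthread_create(&t2, 0, incr, 0);
        pthread_join(t1, 0);
        pthread_join(t2, 0);
        return 0;
    }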
In the general case, handling multiple memory write accesses for a single
statement requires a slightly different tracking mechanism for write addresses,
or decomposition into simpler statements. Statements with read-only shared
memory access are handled without updating w_addrs. Programs with more
than two threads require multiple assertions.
2 Software Architecture
CSeq is a framework for quick development of static analysis and program trans-
formation prototypes. For parsing the input program CSeq relies on pycparserext
(pypi.org/project/pycparserext), an extension of pycparser (github.com/
eliben/pycparser), which in turn is built on top of PLY (www.dabeaz.com/
ply), a Python implementation of Lex and Yacc. All the mentioned components
as well as CSeq are entirely written in Python.
We combined several groups of modules in CSeq, namely (i) program sim-
plification, (ii) program unfolding, (iii) sequentialization, (iv) instrumentation,
and (v) backend invocation and counterexample generation. For the analysis of
the sequentialised program we rely on CBMC (www.cprover.org/cbmc), which in
turn embeds the DPLL-style MiniSat SAT solver (minisat.se).
CSeq 3.0 incorporates a significant number of enhancements. At an architec-
tural level, the main element of novelty is in the modularity between the general-
purpose functionalities of the framework and the specific lazy sequentialization,
which opens up the possibility of prototyping different static analysers for
other applications (e.g., [11,10]) as well as improving older sequentialization-
based prototypes (e.g., [4,12,13] and variations thereof). The enhancements to
the framework include: Python 3 support, support for GNU C compiler exten-
sions, a fully re-implemented symbol table, revised general-purpose modules such
as constant propagation, function inlining, and loop unrolling, and a custom-
built version of CBMC (not used in the competition) for SAT-solving under
assumptions. For the competition we include (experimental) enhanced constant
propagation, and simplified function inlining. Besides the data race checking
extension, the sequentialization modules include improvements from earlier
implementations [5,8,6] and from different editions of SV-COMP to date, in
particular: extended pthread API support (conditional waiting, barriers, and
thread-specific data management), context-bounded analysis, and a major code
overhaul.
3 Strengths and Weaknesses
The table below summarises the performance of our tool on the 764 cases of the
Concurrency category and the 162 cases of the data race demo category.
                         Concurrency   Data race demo
Overall instances             764            162
Correct    safe               202             37
           unsafe             320             61
Unknown    reject               9             19
           internal error      18             17
           out of time        159             20
           out of memory       56              2
Incorrect  safe                 0              0
           unsafe               0              6
Our technique excels at hunting bugs, as shown by the number of correct unsafe
results (incl. 17 malformed witnesses and 50 unconfirmed witnesses), but it
quickly becomes expensive with larger bounds, hitting the resource limits. The
additional context-switch points and the use of pointers for data race detection
introduce further overhead. The other failures are due to limiting assumptions
or glitches in the implementation. All the false positives are due to corner cases
in the encoding.
4 Setup and Configuration
We competed in the ConcurrencySafety category and in the data race detection
demo category. CSeq 3.0 is available at https://github.com/omainv/cseq/
releases.
Installation instructions are in the README file within the package. A wrapper
script (lazy-cseq.py) invokes CSeq up to three times, with the options
-l lazy for lazy sequentialisation, --sv-comp to enable the required violation-witness
format, --atomic-parameters to assume atomic passing of function
arguments, --nondet-condvar-wakeups for non-deterministic spurious condition-variable
wake-ups, --deep-propagation for experimental constant folding and
propagation, --32 for 32-bit architectures, --threads 100 to limit the overall
number of threads, --data-race-check when required, and --backend cbmc to
use CBMC 5.4 for sequential analysis.
For reachability checking, on different invocations the script adds different
parameters: -r2 -w2 -f2, -r4 -w3 -f5, and -r20 -w1 -f11, where r is the number
of rounds, and f and w are the unwind bounds for for loops (i.e., potentially
bounded) and while loops (i.e., potentially unbounded), respectively; on the last
invocation, --softunwindbound and --unwind-for-max 10000 are also added to
fully unfold for loops if a static bound can be found, up to the given hard bound.
For data race detection, the above parameters are replaced with -c4 -u2,
-c10 -u10, and -c50 -w20 -f20 with --unwind-for-max 100. Note that in this
case the bound is on the number of execution contexts rather than rounds
(-c vs. -r), and -u is used as a shorthand for -f and -w.
We let each analysis run to completion. When the result is TRUE, the script
restarts the analysis with the next set of parameters. As soon as the script gets
FALSE, it returns FALSE. Only if the analysis with the last set of parameters
finishes and the result is TRUE does the script return TRUE.
Data Availability Statement. All data of SV-COMP 2022 are archived as described
in the competition report [1] and available on the competition web site. This includes
the verification tasks, results, witnesses, scripts, and instructions for reproduction.
The version of our verifier as used in the competition is archived together with other
participating tools [2].
References
1. Beyer, D.: Progress on software verification: SV-COMP 2022. In: Proc. TACAS. Springer (2022)
2. Beyer, D.: Verifiers and validators of the 11th Intl. Competition on Software Verification (SV-COMP 2022). Zenodo (2022). https://doi.org/10.5281/zenodo.5959149
3. Clarke, E.M., Kroening, D., Lerda, F.: A tool for checking ANSI-C programs. In: TACAS. Lecture Notes in Computer Science, vol. 2988, pp. 168–176. Springer (2004). https://doi.org/10.1007/978-3-540-24730-2_15
4. Fischer, B., Inverso, O., Parlato, G.: CSeq: A concurrency pre-processor for sequential C verification tools. In: ASE. pp. 710–713. IEEE (2013). https://doi.org/10.1109/ASE.2013.6693139
5. Inverso, O., Nguyen, T.L., Fischer, B., Torre, S.L., Parlato, G.: Lazy-CSeq: A context-bounded model checking tool for multi-threaded C programs. In: ASE. pp. 807–812. IEEE Computer Society (2015). https://doi.org/10.1109/ASE.2015.108
6. Inverso, O., Tomasco, E., Fischer, B., La Torre, S., Parlato, G.: Bounded verification of multi-threaded programs via lazy sequentialization. ACM Trans. Program. Lang. Syst. 44(1) (Dec 2021). https://doi.org/10.1145/3478536
7. Inverso, O., Tomasco, E., Fischer, B., Torre, S.L., Parlato, G.: Bounded model checking of multi-threaded C programs via lazy sequentialization. In: CAV. Lecture Notes in Computer Science, vol. 8559, pp. 585–602. Springer (2014). https://doi.org/10.1007/978-3-319-08867-9_39
8. Inverso, O., Trubiani, C.: Parallel and distributed bounded model checking of multi-threaded programs. In: PPoPP. pp. 202–216. ACM (2020). https://doi.org/10.1145/3332466.3374529
9. ISO/IEC: ISO/IEC 9899:2018: Information technology – Programming languages – C (Jun 2018)
10. Simic, S., Bemporad, A., Inverso, O., Tribastone, M.: Tight error analysis in fixed-point arithmetic. In: IFM. Lecture Notes in Computer Science, vol. 12546, pp. 318–336. Springer (2020). https://doi.org/10.1007/978-3-030-63461-2_17
11. Simic, S., Inverso, O., Tribastone, M.: Bit-precise verification of discontinuity errors under fixed-point arithmetic. In: SEFM. Lecture Notes in Computer Science, vol. 13085, pp. 443–460. Springer (2021). https://doi.org/10.1007/978-3-030-92124-8_25
12. Tomasco, E., Inverso, O., Fischer, B., Torre, S.L., Parlato, G.: Verifying concurrent programs by memory unwinding. In: TACAS. Lecture Notes in Computer Science, vol. 9035, pp. 551–565. Springer (2015). https://doi.org/10.1007/978-3-662-46681-0_52
13. Tomasco, E., Nguyen, T.L., Inverso, O., Fischer, B., Torre, S.L., Parlato, G.: Lazy sequentialization for TSO and PSO via shared memory abstractions. In: FMCAD. pp. 193–200. IEEE (2016). https://doi.org/10.1109/FMCAD.2016.7886679
Dartagnan: SMT-based Violation Witness
Validation (Competition Contribution)
Hernán Ponce-de-León1⋆, Thomas Haas2, and Roland Meyer2
1 Bundeswehr University Munich, Munich, Germany
2 TU Braunschweig, Braunschweig, Germany
hernan.ponce@unibw.de, t.haas@tu-braunschweig.de, roland.meyer@tu-bs.de
Abstract. The validation of violation witnesses is an important step
during software verification. It hides false alarms raised by verifiers from
engineers, which in turn helps them concentrate on critical issues and
improves the verification experience. Until the 2021 edition of the Com-
petition on Software Verification (SV-COMP), CPAchecker was the
only witness validator for the ConcurrencySafety category. This article
describes how we extended the Dartagnan verifier to support the valida-
tion of violation witnesses. The results of the 2022 edition of the competi-
tion show that, for witnesses generated by different verifiers, Dartagnan
succeeds in the validation of witnesses where CPAchecker does not.
Our extension thus improves the validation possibilities for the overall
competition. We discuss Dartagnan’s strengths and weaknesses as a
validation tool and describe possible ways to improve it in the future.
1 Introduction
Most software verification tools report witnesses to property violations. Since
SV-COMP 2015, there is a common format in which witnesses are represented
by automata [4]. Each edge of such an automaton is annotated with data that
can be used to match program executions. A data annotation can be, e.g.,
"assumption", specifying constraints on values of variables in a given state;
"control", specifying the outcome of a branch condition; or "startline", specifying
a concrete line in the source code. More details about data annotations and their semantics
can be found in the exchange format documentation [1].
A witness validator checks that a violation can be reproduced using the
information provided by the witness. Automata-based verifiers can easily be
converted into validators by analyzing the synchronized product of the program
with the witness automaton. In this setting, the witness automaton guides the
verifier. If none of the outgoing edges on the program state match the next
edge of the witness automaton, then the verifier cannot explore the current path
further. If the edge on the program state matches, then the witness automaton
and the program proceed to the next state, eventually leading to a violation.
⋆ Jury member.
While this idea allows one to easily convert any automata-based verifier into a
validator, not all verifiers are automata-based.
Dartagnan is an SMT-based verifier. In the next section, we explain how to
convert it into a validator. The idea is to extract information from the witness
and use it to reduce the search space explored by the backend SMT solver.
2 Validation Approach
Given a concurrent program and a specification in the form of assertions, Dartagnan
generates an SMT formula ϕ_Ver = ϕ_Cf ∧ ϕ_Df ∧ ϕ_Sc ∧ ϕ_¬P which is satisfiable
if and only if some assertion fails [17,16]. The formulas ϕ_Cf and ϕ_Df encode
(respectively) the control flow and the data flow of the program. Formula ϕ_Sc
encodes scheduling constraints. Finally, ϕ_¬P expresses that at least one assertion
must fail. If the formula is satisfiable, then a violation exists. The goal of
Dartagnan (as a verifier) is to find such a violation. This amounts to finding
an appropriate scheduling among the threads. Such a scheduling is encoded as
a happens-before relation between the instructions. Dartagnan thus searches
the space of all viable happens-before relations to find a violation or prove that
none exists.
We now explain how to extend Dartagnan into a violation witness validator.
The idea is to extract from the violation witness a formula ϕ_Y that we conjoin
to the rest of Dartagnan's encoding, resulting in ϕ_Val = ϕ_Ver ∧ ϕ_Y. The
extra constraints in ϕ_Y reduce the search space for the SMT solver. For the
verification of concurrent programs taking inputs from the environment, there
are two sources of non-determinism: the data coming from the input (which
might influence the control flow) and the scheduling. The purpose of ϕ_Y is to
reduce this non-determinism. Extending the SMT encoding as described in ϕ_Val
is conceptually easy. The interesting question is: what information from the
witness shall we use? The less information we use, the more we move from
pure validation to full verification.
pure validation to full verification.
While automata-based validators can use some information in a straight-
forward manner, this is not the case for Dartagnan.
1. A violation witness can contain cycles to represent infinitely many execu-
tions. However, SMT-based tools unroll cycles and perform bounded verifi-
cation, thus only part of this information is helpful.
2. Since Dartagnan (as many other BMC tools) does not keep an explicit
notion of state, using state information is not trivial.
The exchange format for violation witnesses allows for expressing information
about state assumptions, the control flow, and the scheduling. We abstract
away from the former two and only use scheduling information. We assume that
witness automata represent a single path and that the edges contain "startline"
data corresponding to read or write instructions¹. Those are the only instructions
that can affect our happens-before relation. While we do not explicitly encode
the outcome of control-flow instructions, certain control-flow information is
implicitly encoded based on which instructions are executed. We explain the reasons
behind these design decisions and assumptions, discuss their limitations, and describe
how we plan to improve this in the future in Section 3. Despite these
limitations, and as we show in Section 4, our validator performs well in practice.
¹ Our validator accepts witnesses that do not satisfy the second assumption, but it
filters out the corresponding edges.
Let (S, E) be a witness automaton with states S and edges E. For each
e ∈ E, the function e2i(e) returns the set of read or write instructions coming from
the "startline" in the C file that corresponds to the given edge. Since witnesses
represent single paths, they can be seen as a word over S. Let w ∈ S* be a
witness; we define the witness-to-formula function, which constructs ϕ_Y, as

\[
w2f(w) =
\begin{cases}
\mathit{true} & \text{if } w = \varepsilon \\
w2f(w') \wedge \bigvee_{i_1 \in e2i((\cdot,s)),\; i_2 \in e2i((s,\cdot))} \text{happens-before}(i_1, i_2) & \text{if } w = s \cdot w'
\end{cases}
\]
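As a hypothetical illustration (the witness, startlines, and instruction names are invented here): if w = s_1 · s_2, the edge entering s_1 maps to a write instruction i_w of thread 1, and the edge leaving s_1 maps to a read instruction i_r of thread 2, then

\[
w2f(s_1 \cdot s_2) = w2f(s_2) \wedge \text{happens-before}(i_w, i_r),
\]

so the SMT solver only needs to consider schedules in which thread 1's write is ordered before thread 2's read.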
3 Strengths and Weaknesses
The main strengths of our validation approach are simplicity and modularity.
The approach only requires adding a new sub-formula to the SMT encoding
used for verification. The validator is modular in the sense that using more or
different information from the witness does not change the validation approach.
For example, adding information from the witness about the control flow just
requires adding more constraints to ϕ_Y.
Our validation approach assumes that witness automata represent single
paths. This is a limitation not imposed by the exchange format. However, veri-
fiers tend to stop as soon as they find one violation and thus generate witnesses
representing a single violation path. A second limitation is that we do not ex-
plicitly consider control-flow information. This might impact the performance
of the validation since not all non-determinism is removed and the search space
might still be large. Converting such control-flow information into SMT is simple
in principle. However, since Dartagnan internally converts the C program into
Boogie [15], matching conditionals with the corresponding assembly-like jumps
requires some work. A second consequence of not extracting control-flow infor-
mation from the witness is that we might validate witnesses that do not lead
to a violation. This is because we over-approximate the paths of the program
represented by the witness and thus our approximation might include the path
leading to the violation even if the witness did not.
4 Validation Results
We inspected the results of SV-COMP 2022 [5] to answer the following questions
RQ1: What percentage of the witnesses can Dartagnan validate?
RQ2: What percentage can Dartagnan not validate and why?
RQ3: Can Dartagnan validate witnesses that CPAchecker cannot?
RQ4: Can CPAchecker validate witnesses that Dartagnan cannot?
From the 20 verifiers in ConcurrencySafety, we selected five tools implementing
different verification approaches. We consider them good representatives
of the whole category: (i) CBMC [13] (used as a backend by Deagle [9]
and Lazy-CSeq [11]), (ii) CPAchecker [7] (used as a backend by CPALockator [3]
and GraVeS [14]), (iii) EBF [2] (combines BMC with fuzzing, a
very effective technique to find bugs), (iv) Dartagnan [17] (the only tool where the
memory model, here sequential consistency, is taken as an input), and (v) GemCutter [12]
(shares the codebase with UTaipan [8] and UAutomizer [10]).
Table 1 presents the results of the validation in SV-COMP 2022. We report
the number of witnesses generated by each verifier ("Witnesses"). For each
of the validators (columns "Dartagnan" and "CPAchecker"), we report the
number of cases where the validation conclusively finished (i.e., it returned True
or False), whether the violation was confirmed (left of "/") or not (right of "/"),
and the number of correct validations by one tool where the other did not report
a result (columns "Dart \ CPA" and "CPA \ Dart", respectively).
Tool         Witnesses   Dartagnan   CPAchecker   Dart \ CPA   CPA \ Dart
CBMC            305        193/0        95/0          117           19
CPAchecker      256          0/0       256/0            0          256
Dartagnan       273        245/1        35/6          204            0
EBF             290        219/0        57/0          177           15
GemCutter       299       18/237       262/1           15           28

Table 1. Results of the validation in SV-COMP 2022.
For the SMT-based verifiers CBMC and EBF, Dartagnan has a 63.28%
resp. 75.52% success rate in the validation (against a 31.15% resp. 19.66% success
rate for CPAchecker). Unfortunately, it did not validate any of the witnesses
generated by CPAchecker. This was due to a bug in the witness parser that
was identified and fixed after the competition. CPAchecker validated all
the witnesses that it generated as a verifier. Dartagnan validated 89.74% of its
own witnesses, while CPAchecker only validated 12.82%. For GemCutter, the
validation success of Dartagnan is only 6.02%. This is because, due to another
bug, it wrongly marked 237 witnesses as not validated. The fixed version of
Dartagnan is able to validate all such cases. Despite this, of the 18 witnesses
that Dartagnan validated, 15 were not validated by CPAchecker,
thus improving the validation possibilities for the overall competition.
5 Software Project and Configuration
The project home page is https://github.com/hernanponcedeleon/Dat3M. To
run Dartagnan as a validator, use the following command:
$ Dartagnan-SVCOMP.sh -witness <witness> <property> <program>
Data Availability Statement. All data of SV-COMP 2022 are archived as described
in the competition report [5] and available on the competition web site. This includes
the verification tasks, results, witnesses, scripts, and instructions for reproduction.
The version of our verifier as used in the competition is archived together with other
participating tools [6].
References
1. Exchange Format for Violation Witnesses and Correctness Witnesses. https://github.com/sosy-lab/sv-witnesses.
2. Fatimah Aljaafari, Lucas C. Cordeiro, Mustafa A. Mustafa, and Rafael Menezes. EBF: A hybrid verification tool for finding software vulnerabilities in IoT cryptographic protocols. CoRR, abs/2103.11363, 2021.
3. Pavel S. Andrianov, Vadim S. Mutilin, and Alexey V. Khoroshilov. CPALockator: Thread-modular analysis with projections (Competition Contribution). In TACAS (2), volume 12652 of Lecture Notes in Computer Science, pages 423–427. Springer, 2021. doi:10.1007/978-3-030-72013-1_25.
4. Dirk Beyer. Software verification and verifiable witnesses (report on SV-COMP 2015). In TACAS, volume 9035 of Lecture Notes in Computer Science, pages 401–416. Springer, 2015. doi:10.1007/978-3-662-46681-0_31.
5. Dirk Beyer. Progress on software verification: SV-COMP 2022. In TACAS (2). Springer, 2022.
6. Dirk Beyer. Verifiers and validators of the 11th Intl. Competition on Software Verification (SV-COMP 2022). Zenodo, 2022. doi:10.5281/zenodo.5959149.
7. Dirk Beyer and M. Erkan Keremoglu. CPAchecker: A tool for configurable software verification. In CAV, volume 6806 of Lecture Notes in Computer Science, pages 184–190. Springer, 2011. doi:10.1007/978-3-642-22110-1_16.
8. Daniel Dietsch, Matthias Heizmann, Alexander Nutz, Claus Schätzle, and Frank Schüssele. Ultimate Taipan with symbolic interpretation and fluid abstractions (Competition Contribution). In TACAS (2), volume 12079 of Lecture Notes in Computer Science, pages 418–422. Springer, 2020. doi:10.1007/978-3-030-45237-7_32.
9. Fei He, Zhihang Sun, and Hongyu Fan. Deagle: An SMT-based verifier for multi-threaded programs (Competition Contribution). In TACAS (2). Springer, 2022.
10. Matthias Heizmann, Yu-Fang Chen, Daniel Dietsch, Marius Greitschus, Jochen Hoenicke, Yong Li, Alexander Nutz, Betim Musa, Christian Schilling, Tanja Schindler, and Andreas Podelski. Ultimate Automizer and the search for perfect interpolants (Competition Contribution). In TACAS (2), volume 10806 of Lecture Notes in Computer Science, pages 447–451. Springer, 2018. doi:10.1007/978-3-319-89963-3_30.
11. Omar Inverso, Ermenegildo Tomasco, Bernd Fischer, Salvatore La Torre, and Gennaro Parlato. Lazy-CSeq: A lazy sequentialization tool for C (Competition Contribution). In TACAS, volume 8413 of Lecture Notes in Computer Science, pages 398–401. Springer, 2014. doi:10.1007/978-3-642-36742-7_46.
12. Dominik Klumpp, Daniel Dietsch, Matthias Heizmann, Frank Schüssele, Marcel Ebbinghaus, Azadeh Farzan, and Andreas Podelski. Ultimate GemCutter and the axes of generalization (Competition Contribution). In TACAS (2). Springer, 2022.
13. Daniel Kroening and Michael Tautschnig. CBMC – C bounded model checker (Competition Contribution). In TACAS, volume 8413 of Lecture Notes in Computer Science, pages 389–391. Springer, 2014. doi:10.1007/978-3-642-54862-8_26.
14. William Leeson and Matthew Dwyer. GraVeS: Graph-based verifier selector (Competition Contribution). In TACAS (2). Springer, 2022.
15. K. Rustan M. Leino. This is Boogie 2. 2008. URL: https://www.microsoft.com/en-us/research/publication/this-is-boogie-2-2/.
16. Hernán Ponce de León, Florian Furbach, Keijo Heljanko, and Roland Meyer. Portability analysis for weak memory models. PORTHOS: One tool for all models. In SAS, volume 10422 of LNCS, pages 299–320. Springer, 2017. doi:10.1007/978-3-319-66706-5_15.
17. Hernán Ponce de León, Florian Furbach, Keijo Heljanko, and Roland Meyer. Dartagnan: Bounded model checking for weak memory models (Competition Contribution). In TACAS (2), volume 12079 of LNCS, pages 378–382. Springer, 2020. doi:10.1007/978-3-030-45237-7_24.
Deagle: An SMT-based Verifier for
Multi-threaded Programs
(Competition Contribution)⋆
Fei He1,2,3, Zhihang Sun1,2,3, and Hongyu Fan1,2,3
1School of Software, Tsinghua University, Beijing, China
2Key Laboratory for Information System Security, MoE, Beijing, China
3Beijing National Research Center for Information Science and Technology,
Beijing, China
Abstract. Deagle is an SMT-based multi-threaded program verification
tool. It is built on top of CBMC (front-end) and MiniSAT (back-end). The
basic idea of Deagle is to integrate into the SMT solver an ordering con-
sistency theory that handles ordering relations over the shared variable
accesses in the program. The front-end encodes the input program into
an extended propositional formula that contains ordering constraints.
The back-end is reinforced with a solver for the ordering consistency
theory. This paper presents the basic idea, architecture, installation, and
usage of Deagle.
Keywords: Program verification ·Satisfiability modulo theories ·Con-
currency.
1 Verification Approach
Given a multi-threaded program, the thread communication behaviors can be
modeled using the happens-before relations over memory access (read/write)
events [1]. There are various kinds of happens-before relations: program order
(PO), read-from order (RF), write serialization order (WS), and from-read order
(FR). A happens-before ordering formula (abbreviated as ordering formula) is
a logical formula that involves only memory access events and happens-before
relations.
Deagle is an SMT-based multi-threaded program verifier, which consists of
– a front-end that encodes the intra-threaded behaviors (e.g., the control and
data flow per thread) into propositional formulas, and the inter-threaded
behaviors (i.e., the communication between threads) into ordering formulas;
– a back-end that extends MiniSAT with an ordering consistency theory solver
[8] by following the DPLL(T) framework [7], and is able to solve propositional
formulas and ordering formulas mixed together.
⋆ This work was supported in part by the National Key Research and Development
Program of China (No. 2018YFB1308601) and the National Natural Science
Foundation of China (No. 62072267 and No. 62021002).
Compared with [8]: the theory solver in [8] uses a from-read axiom to derive
FR orders. Besides the from-read axiom, Deagle also implements a write-serialization
axiom [11], with which WS orders can also be derived. As a result,
the front-end of Deagle need not encode FR and WS orders explicitly.
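For intuition, the from-read axiom can be rendered schematically as follows (our notation, not Deagle's exact formulation): if a read r takes its value from a write w, and another write w' to the same location is serialized after w, then r is from-read-ordered before w':

\[
\mathit{rf}(w, r) \wedge \mathit{ws}(w, w') \Rightarrow \mathit{fr}(r, w')
\]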
2 Software Architecture
Deagle is developed on top of CBMC [9] and MiniSAT [6] using C++. Addition-
ally, for ease of usage and debugging, Deagle reuses some modules developed
in Yogar-CBMC [10,11]. Deagle is not a strategy selection-based verifier. Deagle
runs the following procedures successively to verify a given C program:
Preprocessing (from Yogar-CBMC). For each global structure variable in the
C program, the preprocessing procedure unfolds it by creating a fresh variable
for each member. Note that arrays need no preprocessing; CBMC is able to handle
each array as an entity.
Parsing and Goto-Program Generation (originally in CBMC). CBMC employs
Flex and Bison to transform the preprocessed C program into an abstract
syntax tree (AST). Then CBMC builds a goto program, where all branching
statements and loop statements are represented with (conditional) goto statements.
Library Function Modeling (extended from CBMC). CBMC models each
multithreading-related library function (e.g., pthread_cond_wait). For example,
mutex m contains a Boolean variable m_locked indicating whether m is locked;
pthread_mutex_lock(&m) assumes m_locked to be originally false and sets
m_locked to true. Based on CBMC, we extend Deagle to support the modeling of
more library functions.
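A minimal sketch of this style of modeling is shown below; the names model_mutex_t and model_mutex_lock are illustrative stand-ins, not the actual CBMC/Deagle internals, and __VERIFIER_assume follows the usual SV-COMP convention:

    extern void __VERIFIER_assume(int cond);  // prunes executions where cond is false

    typedef struct { _Bool locked; } model_mutex_t;

    // Model of pthread_mutex_lock(&m): only executions in which m is
    // currently free continue; the lock is then taken in one atomic step.
    void model_mutex_lock(model_mutex_t *m) {
        __VERIFIER_assume(!m->locked);
        m->locked = 1;
    }

    void model_mutex_unlock(model_mutex_t *m) {
        m->locked = 0;
    }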
Unwinding. We employ bounded model checking (BMC) [3,4,5] to handle loops.
If the program contains loops, we determine an unwinding limit and unwind the
program to a loop-free bounded program (see the sketch after this list):
– If the maximal loop time of the program can be determined through static
analysis, e.g.,
    for (i = 0; i < 10; i++)
we set the unwinding limit to this maximal loop time;
– If the maximal loop time depends on non-determinism, e.g.,
    for (i = 0; i < n; i++)
where n is obtained from the function __VERIFIER_nondet_int, we report
UNKNOWN, since such loops cannot be fully unwound;
– Otherwise, we set the unwinding limit to 2.
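The following is a generic BMC-style unwinding sketch, not Deagle's actual output; body() stands for an arbitrary loop body:

    extern void body(void);

    // Unwinding `for (i = 0; i < 3; i++) body();` with limit 3 yields a
    // loop-free program; each `if` simulates one potential iteration.
    void unwound(void) {
        int i = 0;
        if (i < 3) { body(); i++; }   // iteration 1
        if (i < 3) { body(); i++; }   // iteration 2
        if (i < 3) { body(); i++; }   // iteration 3
        // with --no-unwinding-assertions (Sect. 4.1), no assert(!(i < 3))
        // is emitted here to check that the limit sufficed
    }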
Formula Generation (extended from CBMC). After unwinding, the loop-free
program is represented in static single assignment (SSA) form, where each
thread is a chain of assignments. These assignments can be directly modeled
as first-order logic formulas (for ease of solving, we further convert them into
propositional logic formulas). Additionally, an assignment may contain global
memory access events; we model program orders and read-from orders of these
events into the formulas (please refer to [8] for more information).
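As a rough illustration of the SSA renaming (the _0/_1 naming scheme is illustrative, not CBMC's exact convention):

    int x_0, x_1;   // SSA versions of shared variable x
    int y_1;        // SSA version of local variable y

    void thread_ssa(void) {
        // original thread code:  x = x + 1;  y = x;
        x_1 = x_0 + 1;   // read event on x (value x_0), write event (value x_1)
        y_1 = x_1;       // under a different read-from choice, this read could
                         // instead observe another thread's write to x
    }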
Constraint Solving (extended from MiniSAT). We develop an ordering
consistency theory solver and integrate it into the DPLL(T) framework [8]. For
efficiency, we extend MiniSAT, a SAT solver, to run our theory solver exclusively.
Please refer to [8] for the detailed algorithms of our decision procedure.
Witness Generation (adapted from Yogar-CBMC). If the back-end solver
returns satisfiable (i.e., finds a counterexample violating the property), our
ordering consistency theory solver reports a sequence (total order) of these events,
which can be used for generating the witness of the counterexample.
3 Strengths and Weaknesses
Compared to the traditional method [1] which explicitly converts ordering for-
mulas into propositional formulas, Deagle employs a dedicated theory solver to
handle ordering formulas, which improves both time and space efficiency. Ignoring
some tasks in goblint-regression that require unwinding 10000 times, Deagle
reports TIMEOUT in only 9 tasks and OUT OF MEMORY in only 7 tasks,
fewer than most ConcurrencySafety competitors.
In most weaver tasks (117 out of 169), the number of loop iterations is non-deterministic.
As mentioned in the previous section, Deagle reports UNKNOWN for these.
Since such tasks are common in real-world programs, we are exploring approaches
to dealing with such programs in future work.
4 Tool Setup and Configuration
The source code of Deagle 1.3 (the version submitted to SV-COMP 2022 [2]) is
publicly accessible⁴. Please refer to the README for installation instructions.
In SV-COMP 2022, Deagle participates in the ConcurrencySafety category and only
checks the property Unreach-Call⁵. By setting the parameters
    --32 --no-unwinding-assertions --closure
one can reproduce Deagle's results of SV-COMP 2022.
⁴ Deagle repository: https://github.com/thufv/Deagle
⁵ The benchmark definition of Deagle: https://gitlab.com/sosy-lab/sv-comp/bench-defs/-/blob/main/benchmark-defs/deagle.xml
4.1 Parameter Definition
Deagle inherits many parameters from CBMC. Due to the page limit, we only
describe parameters related to the competition or newly added in Deagle:
* --32 / --64: sets the width of integers to 32/64 bits.
* --no-unwinding-assertions: does not generate unwinding assertions into
the formula. Assuming a loop is unwound n times, its unwinding assertion
asserts the loop condition to be false after n iterations. Since unwinding
assertions can lead to false counterexamples, we disable their generation.
* --closure / --icd (new in Deagle): uses our proposed approach. If
the parameter --closure is enabled, Deagle employs a transitive closure-based
theory solver (recommended); if --icd is enabled, Deagle employs
an incremental cycle detection-based solver. In SV-COMP 2022 [2], Deagle
solves all tasks with the parameter --closure.
5 Software Project
Deagle is developed by Fei He, Zhihang Sun, and Hongyu Fan from the Formal
Verification Lab (https://thufv.github.io/team) at Tsinghua University. Deagle is
licensed under GPLv3. Since Deagle is developed on top of CBMC and MiniSAT,
and reuses some modules from Yogar-CBMC, it also contains code covered by the
copyrights of those tools.
6 Acknowledgement
We thank the SV-COMP hosts for holding the competition and giving advice on
participating. We are also grateful to the developers, maintainers, and contributors
of CBMC, MiniSAT, and Yogar-CBMC, on which Deagle is based.
References
1. Alglave, J., Kroening, D., Tautschnig, M.: Partial orders for efficient bounded model checking of concurrent software. In: Sharygina, N., Veith, H. (eds.) Computer Aided Verification. pp. 141–157. Springer, Berlin, Heidelberg (2013). https://doi.org/10.5555/2958031.2958083
2. Beyer, D.: Progress on software verification: SV-COMP 2022. In: Proc. TACAS. Springer (2022)
3. Biere, A., Cimatti, A., Clarke, E.M., Fujita, M., Zhu, Y.: Symbolic model checking using SAT procedures instead of BDDs. In: Proceedings of the 36th Annual ACM/IEEE Design Automation Conference. pp. 317–320. DAC '99, Association for Computing Machinery, New York, NY, USA (1999). https://doi.org/10.1145/309847.309942
4. Biere, A., Cimatti, A., Clarke, E., Zhu, Y.: Symbolic model checking without BDDs. In: Cleaveland, W.R. (ed.) Tools and Algorithms for the Construction and Analysis of Systems. pp. 193–207. Springer, Berlin, Heidelberg (1999). https://doi.org/10.1007/3-540-49059-0_14
5. Clarke, E., Biere, A., Raimi, R., Zhu, Y.: Bounded model checking using satisfiability solving. Form. Methods Syst. Des. 19(1), 7–34 (Jul 2001). https://doi.org/10.1023/A:1011276507260
6. Eén, N., Sörensson, N.: An extensible SAT-solver. In: Giunchiglia, E., Tacchella, A. (eds.) Theory and Applications of Satisfiability Testing. pp. 502–518. Springer, Berlin, Heidelberg (2004)
7. Ganzinger, H., Hagen, G., Nieuwenhuis, R., Oliveras, A., Tinelli, C.: DPLL(T): Fast decision procedures. In: CAV (2004). https://doi.org/10.1007/978-3-540-27813-9_14
8. He, F., Sun, Z., Fan, H.: Satisfiability modulo ordering consistency theory for multi-threaded program verification. In: Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation. pp. 1264–1279. PLDI 2021, Association for Computing Machinery, New York, NY, USA (2021). https://doi.org/10.1145/3453483.3454108
9. Kroening, D., Tautschnig, M.: CBMC – C bounded model checker. In: International Conference on Tools and Algorithms for the Construction and Analysis of Systems. pp. 389–391. Springer (2014). https://doi.org/10.1007/978-3-642-54862-8_26
10. Yin, L., Dong, W., Liu, W., Li, Y., Wang, J.: Yogar-CBMC: CBMC with scheduling constraint based abstraction refinement. In: Beyer, D., Huisman, M. (eds.) Tools and Algorithms for the Construction and Analysis of Systems. pp. 422–426. Springer International Publishing, Cham (2018). https://doi.org/10.1007/978-3-319-89963-3_25
11. Yin, L., Dong, W., Liu, W., Wang, J.: Scheduling constraint based abstraction refinement for multi-threaded program verification. IEEE Transactions on Software Engineering PP (08 2017). https://doi.org/10.1109/TSE.2018.2864122
The Static Analyzer Frama-C in SV-COMP
(Competition Contribution)
Dirk Beyer and Martin Spiessl
LMU Munich, Munich, Germany
Abstract. Frama-C is a well-known platform for source-code analysis of
programs written in C. It can be extended via its plug-in architecture by
various analysis backends and features an extensive annotation language
called ACSL. So far it was hard to compare Frama-C to other software
verifiers. Our competition participation contributes an adapter named
Frama-C-SV, which makes it possible to evaluate Frama-C against other
software verifiers. The adapter transforms standard verification tasks
(from the well-known SV-Benchmarks collection) in a way that can be
understood by Frama-C and produces a verification witness as output.
While Frama-C provides many different analyses, we focus on the Evolved
Value Analysis (EVA), which uses a combination of different domains to
over-approximate the behavior of the analyzed program.
Keywords: Software verification ·Program analysis ·Formal methods ·Compe-
tition on Software Verification ·Comparative Evaluation ·SV-COMP ·Frama-C
1 Approach
This competition contribution is based on Frama-C [12], a program-analysis
platform for C programs. The purpose of the participation in the comparative
evaluation SV-COMP is to show the strengths of Frama-C when applied to
the problem of verifying C programs from the SV-Benchmarks [4] collection of
verification tasks.
2 Architecture
Although Frama-C has a large configuration space, it does not support standard
specifications as used in SV-COMP, and it does not produce verification witnesses
by default. In order to overcome this obstacle we implemented an adapter for
Frama-C using input and output transformers; the adapter architecture
is illustrated in Fig. 1. In the following, we describe the artifacts and actors of
the participating verifier: in Sect. 2.1 we describe all the components that are
developed as part of the adapter, while in Sect. 2.2 we describe in more detail
how the used EVA analysis of Frama-C works.
Fig. 1: Architecture of Frama-C-SV: the inputs and outputs of Frama-C are
translated to interface with the established standards as used by SV-COMP; the
components that are necessary to adapt Frama-C for comparison with other
verifiers amount to 678 lines of code mostly written in Python. (Inputs: Program
and Specification; adapter components: Input Transformer, Frama-C with
Configuration Options and Harness, Output Transformer; outputs: verdict
TRUE/FALSE/UNKNOWN and Witness.)
2.1 Frama-C-SV
Input Transformer. The input transformer takes the program p and
specification s and creates a new program p' in which the specification s has been
expressed as Frama-C-specific annotations. Frama-C uses ACSL [1] as the language
to specify annotations. The input transformer also selects configuration
parameters for Frama-C that are best suited for the verification task. Currently we
encode reachability tasks into signed integer overflows by adding an artificial
overflow to the body of the function reach_error. This works well in practice
and is also sound, since if there were any other overflows, the task would contain
undefined behavior and would not be a valid reachability task in the first place.
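A minimal sketch of this encoding idea (the exact code emitted by Frama-C-SV may differ):

    #include <limits.h>

    // Reaching reach_error() now triggers a signed-overflow alarm, which
    // EVA reports; thus "reach_error is called" reduces to "an overflow occurs".
    void reach_error(void) {
        int x = INT_MAX;
        x = x + 1;   // artificial signed integer overflow (undefined behavior)
    }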
Configuration Options. Depending on the input program and specification, we
can choose different options that are passed to Frama-C. In essence, this acts like
algorithm selection [14] and, e.g., allows us to choose a different configuration
of Frama-C depending on the specified property.
Harness. Some programs in the SV-Benchmarks collection use specific functions
to model non-determinism. We provide implementations for those functions
(__VERIFIER_*) in a separate C program such that the semantics of those
functions can be understood by Frama-C. This separate C program is passed to
Frama-C together with the transformed program p'.
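As an illustration, a harness implementation for one such function might look as follows; this is a guess at the general shape, not the actual Frama-C-SV harness, and it assumes Frama-C's built-in Frama_C_interval:

    #include <limits.h>
    #include "__fc_builtin.h"   // declares Frama_C_interval (Frama-C builtins)

    // Models an SV-COMP nondeterministic int: EVA treats the result as
    // any value in [INT_MIN, INT_MAX].
    int __VERIFIER_nondet_int(void) {
        return Frama_C_interval(INT_MIN, INT_MAX);
    }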
Output Transformer. The output of Frama-C needs to be interpreted with
respect to the original specification, and depending on the outcome, a verification
witness needs to be generated. Thus, we need an output transformer for (a) providing a
verdict for the verification task and (b) providing a verification witness. Regarding (a),
the output transformer interprets the CSV report that can be generated
by Frama-C to determine whether the program was proven to be safe (verdict
TRUE), whether a specification violation occurred (verdict FALSE), or whether
no such statement can be made (verdict UNKNOWN). We also generate a minimal
correctness or violation witness for the verdicts TRUE and FALSE, respectively.
The witness automata consist of only one node, which for violation witnesses is
marked as a violation node. In the future we plan to augment these witnesses with
information such as invariants that have been found by Frama-C.
2.2 Frama-C
One of the strengths of Frama-C is its modular architecture [10], which allows
a configuration of the best possible analysis backends for a certain verification
problem. We choose the plug-in EVA [9], which is well suited for an automatic
analysis. Other plug-ins, such as the Weakest-Preconditions (WP) plug-in, require
hints from the user in order to be effective. In the following we briefly describe
the most important aspects of the EVA analysis configuration that we use. For a
more detailed description, we refer the reader to the relevant literature [7,8,9].
Frama-C provides a meta-option called -eva-precision for the EVA plug-in
with possible values ranging from 0 to 11. With higher values for this option, more
precise domains and thresholds are used, at the cost of increased computation
time. We currently use the maximum value of 11 in order to make the best use
of the 900 s CPU time limit. In the future we might want to iteratively increase
this value, starting at lower precisions.
Domains. The EVA analysis always uses the domain cvalue, which tracks values
of variables either as constant values, sets, or intervals of possible values (including
modular congruence constraints). Pointer addresses are tracked either as
addresses with offsets or as so-called garbled mix, which overapproximates the
set of possible memory locations. In addition, depending on the precision level,
various other domains are used, which we describe in the following. The domain
symbolic-locations tracks a map of symbolic locations to values, which is, e.g.,
helpful for analyzing expressions containing array accesses such as a[i]<a[j].
The equality domain tracks equalities of C expressions found in the code, whereas
the gauges domain tracks relations between variables in a loop with the goal
of discovering linear inequality invariants [16]. Lastly, the octagon domain tracks
certain linear constraints between pairs of variables [13]. As we use the highest
precision level, all of these domains are used in our contribution.
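Illustrative snippets (invented for this description, not from the benchmark set) of relations the described domains can capture:

    extern int a[10];   // an array with unknown contents

    void domains_example(int i, int j) {
        if (0 <= i && i < 10 && 0 <= j && j < 10 && a[i] < a[j]) {
            // symbolic-locations can relate the values of a[i] and a[j]
            // even though i and j are only known by their intervals
        }
        if (i <= j) {
            // the octagon domain can track constraints such as i - j <= 0;
            // the gauges domain similarly relates variables across loop iterations
        }
    }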
Precision of the State-Space Exploration. Apart from the domains, the
precision of state-space exploration in Frama-C is affected by various options. We
describe some of these in the following; a complete list of affected settings and
values is always printed by Frama-C when the option -eva-precision is specified
by the user. Option slevel (set to 5000) determines how many separate states
are kept before new states are joined into existing ones. Option ilevel (set
to 256) determines how many different values are tracked per variable before
overapproximating the value range. Option plevel (set to 2000) affects the size
up to which arrays are tracked. The option auto-loop-unroll (set to 1024)
determines up to which bound a loop is considered for unrolling.
3 Strengths and Weaknesses
The competition contribution shows the strengths of Frama-C in checking C
programs for overflows and also, in the currently supported sub-categories¹, for
reachability. Here we are able to show that our results are comparable to and often
surpass those of other tools based on abstract interpretation [11], such as
Goblint [15]. While the EVA analysis of Frama-C that we use is based on abstract
interpretation, the precision options described in Sect. 2.2 allow for a more precise
state-space exploration, which behaves more like model checking. More details
about the results can be found in the competition report [2] and artifact [3].
The approach that we describe in this paper creates a compatibility
layer between the abilities used by Frama-C and the standards used in the
SV-Benchmarks collection. While still a work in progress, we have shown that
it is possible to bridge this gap while preserving overall soundness. It is also
interesting to consider the results on verification tasks from the SV-Benchmarks
collection for a tool that did not participate before.
Although our approach is sound in general, we are likely not showcasing the full
potential of Frama-C. One aspect to consider here is the large configuration space,
which means there might be ways to verify more tasks with a better heuristic
for selecting the configuration options. The other aspect is that Frama-C also
provides different plug-ins such as the WP plug-in, which requires more (manual)
annotations, but can also potentially solve more tasks than the more automatic
EVA plug-in.
4 Software Project and Contributors
The software project Frama-C is developed at https://git.frama-c.com/pub/frama-c/
and our adapter Frama-C-SV is developed at https://gitlab.com/sosy-lab/software/frama-c-sv,
both being released under open-source licenses. The exact version of the adapter
that participated in SV-COMP 2022 is also archived in the competition's
tool-archive repository² [6]. Frama-C was funded by the European Commission in
program Horizon 2020. The adapter Frama-C-SV was funded by the DFG. We thank
the Frama-C authors³ for their contribution to the software-verification community.
Data Availability Statement. All data of SV-COMP 2022 are archived as described
in the competition report [2] and available on the competition web site. This includes
the verification tasks [4], competition results [3], verification witnesses [5], scripts, and
instructions for reproduction. The version of Frama-C-SV as used in the competition is
archived together with other participating tools [6].
Funding Statement. This work was funded in part by the Deutsche
Forschungsgemeinschaft (DFG) – 378803395 (ConVeY).
¹ We opted out of subcategories with unsound results caused by Frama-C making
assumptions that are different from the conventions of SV-COMP.
² https://gitlab.com/sosy-lab/sv-comp/archives-2022/blob/svcomp22/2022/frama-c-sv.zip
³ https://frama-c.com/html/authors.html
References
1. Baudin, P., Cuoq, P., Filliâtre, J.C., Marché, C., Monate, B., Moy, Y., Prevosto, V.: ACSL: ANSI/ISO C specification language version 1.17 (2021), available at https://frama-c.com/download/acsl-1.17.pdf
2. Beyer, D.: Progress on software verification: SV-COMP 2022. In: Proc. TACAS (2). Springer (2022)
3. Beyer, D.: Results of the 11th Intl. Competition on Software Verification (SV-COMP 2022). Zenodo (2022). https://doi.org/10.5281/zenodo.5831008
4. Beyer, D.: SV-Benchmarks: Benchmark set for software verification and testing (SV-COMP 2022 and Test-Comp 2022). Zenodo (2022). https://doi.org/10.5281/zenodo.5831003
5. Beyer, D.: Verification witnesses from verification tools (SV-COMP 2022). Zenodo (2022). https://doi.org/10.5281/zenodo.5838498
6. Beyer, D.: Verifiers and validators of the 11th Intl. Competition on Software Verification (SV-COMP 2022). Zenodo (2022). https://doi.org/10.5281/zenodo.5959149
7. Blazy, S., Bühler, D., Yakobowski, B.: Structuring abstract interpreters through state and value abstractions. In: Proc. VMCAI. pp. 112–130. LNCS 10145, Springer (2017). https://doi.org/10.1007/978-3-319-52234-0_7
8. Bühler, D.: Structuring an Abstract Interpreter through Value and State Abstractions: EVA, an Evolved Value Analysis for Frama-C. Ph.D. thesis, University of Rennes 1, France (2017), available at https://tel.archives-ouvertes.fr/tel-01664726
9. Bühler, D., Cuoq, P., Yakobowski, B., Lemerre, M., Maroneze, A., Perelle, V., Prevosto, V.: Eva: The Evolved Value Analysis plug-in (2020), available at https://frama-c.com/download/frama-c-eva-manual.pdf
10. Correnson, L., Cuoq, P., Kirchner, F., Maroneze, A., Prevosto, V., Puccetti, A., Signoles, J., Yakobowski, B.: Frama-C user manual (2020), available at https://frama-c.com/download/frama-c-user-manual.pdf
11. Cousot, P., Cousot, R.: Abstract interpretation: A unified lattice model for the static analysis of programs by construction or approximation of fixpoints. In: Proc. POPL. pp. 238–252. ACM (1977)
12. Cuoq, P., Kirchner, F., Kosmatov, N., Prevosto, V., Signoles, J., Yakobowski, B.: Frama-C. In: Proc. SEFM. pp. 233–247. Springer (2012). https://doi.org/10.1007/978-3-642-33826-7_16
13. Miné, A.: The octagon abstract domain. Higher-Order and Symbolic Computation 19(1), 31–100 (2006). https://doi.org/10.1007/s10990-006-8609-1
14. Rice, J.R.: The algorithm selection problem. Advances in Computers 15, 65–118 (1976). https://doi.org/10.1016/S0065-2458(08)60520-3
15. Saan, S., Schwarz, M., Apinis, K., Erhard, J., Seidl, H., Vogler, R., Vojdani, V.: Goblint: Thread-modular abstract interpretation using side-effecting constraints (Competition Contribution). In: Proc. TACAS (2). pp. 438–442. LNCS 12652, Springer (2021). https://doi.org/10.1007/978-3-030-72013-1_28
16. Venet, A.: The gauge domain: Scalable analysis of linear inequality invariants. In: Proc. CAV. pp. 139–154. LNCS 7358, Springer (2012). https://doi.org/10.1007/978-3-642-31424-7_15
GDart: An Ensemble of Tools for Dynamic
Symbolic Execution on the Java Virtual
Machine (Competition Contribution)⋆
Malte Mues1 and Falk Howar1,2
1 TU Dortmund University, Dortmund, Germany
{malte.mues, falk.howar}@tu-dortmund.de
2 Fraunhofer ISST, Dortmund, Germany
Abstract. GDart is an ensemble of tools allowing dynamic symbolic
execution of JVM programs. The dynamic symbolic execution engine is
decomposed into three different components: a symbolic decision engine
(DSE), a concolic executor (SPouT), and an SMT solver backend allowing
meta-strategy solving of SMT problems (JConstraints). The symbolic
decision component is loosely coupled with the executor by a newly
introduced communication protocol. At SV-COMP 2022, GDart solved
471 of 586 tasks, finding more correct false results (302) than correct true
results (169). It scored fourth place.
1 Verification Approach
This paper presents the GDart ensemble tool, a dynamic symbolic execution
engine for the JVM. Dynamic symbolic execution is a well-established technique
for software testing (cf. DART [6]), and there were already two contestants
at SV-COMP 2021 using this technique (cf. JDart [7,9] and COASTAL³).
It is a search algorithm for systematic exploration of a program's state space
in search of a property violation, which stops either after exhausting the resource
limits, after exploring the complete symbolic state space, or upon encountering an
error. The end of the search is fully configurable in GDart.
In SV-COMP 2022 [3], a dynamic symbolic execution tool (JDart, 714
points) won the Java track for the first time, beating JBMC (700 points) [4],
a bounded model checker for Java, and Java Ranger (670 points) [11], a
symbolic execution engine extended by veritesting [1] for Java. JDart's result
underlines the potential of dynamic symbolic execution for the verification of
Java programs in general. The concrete implementation of JDart is closely
coupled to the Java PathFinder VM (JPF-VM) [12], running the complete
analysis within one virtual machine. The advantage of the JPF-VM is that it runs
⋆ This work has been partially funded by an Amazon Research Award
³ https://github.com/DeepseaPlatform/coastal
as a guest JVM on top of a host JVM. The analysis might mock parts of the
guest JVM and use the host JVM for running side computations required to
compute results used in the mock. The downside of the JPF-VM is its research-tool
status and that it is costly to maintain, given Java's fast pace in releasing
new features.
COASTAL demonstrated for the first time what a loosely coupled architecture
between the symbolic exploration engine and a concolic execution engine
might look like. It instruments the bytecode with ASM⁴, a Java bytecode
manipulation framework, to obtain symbolic traces. This makes the analysis
independent of the JPF-VM. The downside is that bytecode manipulation offers less
flexibility than hooking directly into the JVM.
2 Software Architecture
Fig. 1: GDart's ensemble architecture and the interplay between the components:
Symbolic Exploration (DSE/JConstraints) sends SMT problems to
Constraint Solving (CVC4, Z3, ...) and receives models; it sends concrete
values to Concolic Execution (SPouT) and receives symbolic traces.
GDart takes the strength of JDart's mocking flexibility and combines it
with COASTAL's modular design. Figure 1 shows the architecture of the
GDart ensemble tool. The main analysis component is the symbolic explorer.
It orchestrates the concolic executor and requests solutions for SMT problems
from the constraint solvers powering the symbolic exploration.
Symbolic Exploration. We name the symbolic explorer the DSE component, as it performs the two main tasks of a dynamic symbolic execution engine: it manages the constraint tree and guides its exploration, and it starts the concolic executor. To explore a path, it computes a set of concrete values that drives the concolic executor down the path of interest and seeds the executor with these values. After the executor terminates, it parses the obtained symbolic trace and integrates it into the symbolic tree. Next, it constructs from the symbolic tree an SMT problem that describes the next path to explore and starts a constraint solver to get either a model suitable to drive the execution down this path or an unsatisfiable verdict implying that the path is unreachable. The
4https://asm.ow2.io
search behavior of GDart is configured in the DSE. Once the search terminates,
DSE generates a verification witness from the constraint tree.
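The interplay of these steps can be condensed into a driver loop. The following is a minimal Python sketch, not GDart's actual code: run_executor stands in for SPouT, solver for the JConstraints backend, and tree for the constraint tree managed by DSE.

    def explore(program, solver, tree):
        # Minimal dynamic-symbolic-execution driver loop (illustrative sketch).
        values = {}                                    # first run: arbitrary seed
        while values is not None:
            trace = run_executor(program, values)      # concolic run (SPouT's role)
            tree.integrate(trace)                      # extend the constraint tree
            values = None
            while (target := tree.next_unexplored_path()) is not None:
                model = solver.solve(target.path_condition)
                if model is not None:                  # sat: seeds for the next run
                    values = model
                    break
                target.mark_unreachable()              # unsat: prune and try the next path
        return tree                                    # symbolic state space exhausted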
Concolic Executor. One of the core contributions of GDart is the new concolic executor SPouT, implemented as part of the Espresso guest language running on top of the GraalVM [13]5. The GraalVM is an industrial-grade JVM maintained by Oracle, allowing the use of most of the architectural benefits the JPF-VM offered, apart from state tracking. Concolic execution, however, does not require the JPF-VM's state-tracking feature. SPouT can be seeded with concrete values that drive the execution along a concrete path. In addition, it can introduce new symbolic variables for previously unknown inputs. During execution, it records manipulations of and constraint checks on symbolic variables, and on termination of the path exploration it reports a symbolic execution trace together with the concrete execution result. Decisions on the symbolic variables are encoded in the SMT-Lib format. As SPouT maintains the two VM layers, it can mock behavior in the Espresso VM running the analysis with a substitute executed on the host GraalVM during concolic execution, the same way JDart mocks the environment if needed. This feature is also used for intercepting invocations of the Java string library and encoding them symbolically.
Constraint Solving. The third component is constraint solving. DSE uses the JConstraints library to model SMT-Lib constraints internally and to interact with the solvers. GDart is backed by CVC4 [2] and Z3 [5]. We combine these two SMT solvers in a portfolio approach according to the CvcSeqEval strategy presented in our previous work [8].
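As a rough illustration, a sequential portfolio in the spirit of this strategy can be sketched as follows; run_solver is a hypothetical helper, not the JConstraints API, and the solver order and timeout are illustrative.

    def portfolio_solve(problem, solvers=("cvc4", "z3"), timeout_s=60):
        # Try solvers in a fixed order and accept the first conclusive verdict
        # (a sequential portfolio in the spirit of the CvcSeqEval strategy).
        for name in solvers:
            verdict, model = run_solver(name, problem, timeout_s)  # hypothetical helper
            if verdict in ("sat", "unsat"):
                return verdict, model
        return "unknown", None      # all solvers timed out or gave up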
3 Strengths and Weaknesses
GDart placed fourth with 640 points, behind JDart (714 points), JBMC (700 points), and Java Ranger (670 points). Dynamic symbolic execution tools tend to be stronger at finding property violations than at confirming the absence of property violations on the SV-COMP benchmark. This is partially by design, as some of the problems (e.g., those in the jayhorn-recursive subgroup) aim at testing the handling of tremendously large and hard-to-explore state spaces. GDart disproves the property in 302 cases and confirms it in 169 cases. In total, GDart answered 471 of 586 tasks correctly and none incorrectly. That is 40 more correct false verdicts than Java Ranger produced (262 correct false tasks out of 466 solved tasks). In total, GDart solved five more tasks than Java Ranger and 35 fewer than JBMC.
In direct comparison with GDart, JDart solved 192 (+23) correct true tasks and 330 (+28) correct false tasks. Three factors contribute to the gap between GDart and JDart: the performance overhead of spinning up one JVM per executor run (we do not have exact numbers, but spinning up a JVM costs at least 500 ms, affecting especially tasks with huge exploration trees), the technical maturity of the implementation, since JDart has been around longer, and a value-tracing heuristic built into JDart, but not into GDart, for tracking the origin of numerical values parsed from serialized string representations. The performance overhead of spinning up multiple JVMs is the only drawback caused by the modular design of GDart, and it will not go away in the future. In the score-based quantile plots for CPU time, JDart's time per task after achieving 600 points is close to five seconds of CPU time, while GDart's time per task reaches close to 50 seconds of CPU time for the same score.
5https://www.graalvm.org
The weakness of dynamic symbolic execution is state-space explosion, which also affects GDart. Slowing down each executor run by spinning up new VMs is a disadvantage given the resource constraints of SV-COMP. On the bright side, with more relaxed resource limits it is possible to run the concolic executions in parallel with the symbolic exploration of the constraints tree; enabling such parallel breadth-first search on multi-core machines is future work for the DSE component. At the moment, all paths are explored sequentially.
4 Tool Setup
GDart is run with various configuration options hard-coded into the SV-COMP
run scripts. More precisely, we enabled witness generation, used the described
solver strategy in the constraint backend, chose a breadth-first search on the
constraint tree, and used the same bounded solving as JDart. The search is configured to terminate on the first assertion error encountered.
5 Software Project
The components are currently all developed at TU Dortmund by the group led
by Falk Howar. DSE6 is available under the Apache 2.0 license, JConstraints7
as well, and SPouT8is available under the GPL v2 license. We also provide the
run scripts for SV-COMP on GitHub9.
6 Data Availability Statement
The GDart archive used for SV-COMP 2022 is available at Zenodo [10].
References
1. Avgerinos, T., Rebert, A., Cha, S.K., Brumley, D.: Enhancing symbolic ex-
ecution with veritesting. In: Proc. ICSE. pp. 1083–1094 (2014). https://doi.org/10.1145/2568225.2568293
6https://github.com/tudo-aqua/dse
7https://github.com/tudo-aqua/jconstraints
8https://github.com/tudo-aqua/spout
9https://github.com/tudo-aqua/gdart-svcomp
2. Barrett, C., Conway, C.L., Deters, M., Hadarean, L., Jovanović, D., King, T.,
Reynolds, A., Tinelli, C.: CVC4. In: Gopalakrishnan, G., Qadeer, S. (eds.) Proc.
CAV. pp. 171–177. Springer (2011). https://doi.org/10.1007/978-3-642-22110-1_14
3. Beyer, D.: Progress on software verification: SV-COMP 2022. In: Proc. TACAS (2).
Springer (2022)
4. Cordeiro, L., Kroening, D., Schrammel, P.: JBMC: Bounded model checking for
Java bytecode. In: Beyer, D., Huisman, M., Kordon, F., Steffen, B. (eds.) Proc.
TACAS. pp. 219–223. Springer (2019). https://doi.org/10.1007/978-3-030-17502-3_17
5. De Moura, L., Bjørner, N.: Z3: An efficient SMT solver. In: Proc. TACAS. pp.
337–340. Springer (2008). https://doi.org/10.1007/978-3-540-78800-3_24
6. Godefroid, P., Klarlund, N., Sen, K.: Dart: Directed automated random testing.
In: Proceedings of the 2005 ACM SIGPLAN Conference on Programming Lan-
guage Design and Implementation. pp. 213–223. PLDI '05, ACM (2005). https://doi.org/10.1145/1065010.1065036
7. Luckow, K., Dimjašević, M., Giannakopoulou, D., Howar, F., Isberner, M., Kahsai, T., Rakamarić, Z., Raman, V.: JDart: A dynamic symbolic analysis framework. In:
TACAS 2016 (2016). https://doi.org/10.1007/978-3-662-49674-9_26
8. Mues, M., Howar, F.: Data-driven design and evaluation of SMT meta-solving
strategies: Balancing performance, accuracy, and cost. In: Proc. ASE. pp. 179–190
(2021). https://doi.org/10.1109/ASE51524.2021.9678881
9. Mues, M., Howar, F.: JDart: Portfolio solving, breadth-first search and SMT-Lib strings. In: Proc. TACAS (2021). https://doi.org/10.1007/978-3-030-72013-1_30
10. Mues, M., Howar, F.: GDart artifact for SV-COMP 2022 (Feb 2022). https://doi.org/10.5281/zenodo.5957294
11. Sharma, V., Hussein, S., Whalen, M.W., McCamant, S., Visser, W.: Java Ranger:
Statically summarizing regions for efficient symbolic execution of Java. In: Proc.
ESEC/FSE 2020. pp. 123–134 (2020). https://doi.org/10.1145/3368089.3409734
12. Visser, W., Havelund, K., Brat, G., Park, S., Lerda, F.: Model checking pro-
grams. Automated Software Engineering 10(2), 203–232 (Apr 2003). https://doi.org/10.1023/A:1022920129859
13. Würthinger, T., Wimmer, C., Wöß, A., Stadler, L., Duboscq, G., Humer, C.,
Richards, G., Simon, D., Wolczko, M.: One VM to rule them all. In: Proc. SPLASH.
pp. 187–204 (2013)
Graves-CPA: A Graph-Attention Verifier
Selector (Competition Contribution)
Will Leeson and Matthew B. Dwyer
University of Virginia, Charlottesville VA 22903, USA
{will-leeson,matthewbdwyer}@virginia.edu
Abstract. Graves-CPA is a verification tool which uses algorithm se-
lection to decide an ordering of underlying verifiers to most effectively
verify a given program. Graves-CPA represents programs using an
amalgam of traditional program graph representations and uses state-
of-the-art graph neural network techniques to dynamically decide how
to run a set of verification techniques. The Graves technique is implementation-agnostic, but its competition submission, Graves-CPA, is built using several CPAchecker configurations as its underlying verifiers.
Keywords: Software Verification · Graph Attention Networks · Graph Neural Networks · Algorithm Selection
1 Verification Approach
Graves-CPA is an algorithm selector for software verification based on graph
neural network techniques. As the tool PeSCo [14] has shown, dynamic ordering of verification techniques can result in faster and more accurate verification. Computing an ordering on techniques dynamically incurs some runtime, but an effective ordering often makes this overhead insignificant in comparison to the time saved by using a more appropriate technique. Like most algorithm
selectors, Graves-CPA uses machine learning to make its selections. However,
it uses graph neural networks (GNNs) so it can represent programs using tra-
ditional program abstractions, such as abstract syntax trees (ASTs). Graves-
CPA uses a variant of GNNs called Graph Attention Networks (GATs) [16].
GATs use a learned attention mechanism which is trained to learn the impor-
tance of edges in a given graph.
GNNs are an emerging area of machine learning. Traditional neural networks accept input vectors, which have a fixed size and a natural ordering on elements, but graphs, in general, have neither. GNNs avoid these issues by operating on individual nodes in the graph instead of the graph as a whole [15]. Typically, the input to a GNN is the current representation of a node v and a collation of the representations of its neighboring nodes. The output is then a new representation for v. This process is repeated independently for all nodes in the graph. Thus, the number of nodes in the graph and the order in which they are processed are irrelevant.
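In pseudocode, one such round could look as follows; this is an illustrative sketch of generic message passing, not of a particular GNN variant, and update is a stand-in for the learned update function.

    def gnn_round(node_repr, neighbors, update):
        # One message-passing round: each node v gets a new representation from
        # its own vector and a collation (here: element-wise sum) of its neighbors'.
        new_repr = {}
        for v, h_v in node_repr.items():
            collated = [0.0] * len(h_v)
            for u in neighbors[v]:                     # collate neighbor vectors
                collated = [c + x for c, x in zip(collated, node_repr[u])]
            new_repr[v] = update(h_v, collated)        # learned update function
        return new_repr                                # independent of node order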
The Graves technique is tool agnostic [11], meaning it can be trained to
select from any set of verifiers. Our competition contribution selects an ordering
from the techniques utilized by CPAchecker [3], similar to PeSCo in previous
competitions.
To form its selection, Graves-CPA produces a graph representation G of a given program, which is based on its AST with control-flow, data-flow, and function call and return edges added between the tree's nodes. The AST's nodes and edges ensure the semantics of the statements in the program are maintained. Control-flow edges maintain the branching and order of execution between these statements. Data-flow edges explicitly relate the definitions, uses, and interactions of values in the program. G is passed to a GNN, consisting of a series of GATs, which outputs a graph feature vector. This feature vector is finally passed to a fully connected neural network which decides the sequence in which Graves-CPA's suite of verification techniques are run.
2 System Architecture
2.1 Graph Generation
To generate a graph from a program, Graves-CPA relies on the AST produced by the C compiler Clang [10]. Using a visitor pattern [9], Graves-CPA walks the AST to generate data-flow edges and the edges of the program's Interprocedural Control Flow Graph (ICFG). Function call and return edges in the ICFG are those which can be determined purely syntactically. Using the ICFG and data-flow edges, Graves-CPA produces additional data-flow edges using the worklist reaching-definitions algorithm [1]. We limit the number of iterations of the reaching-definitions algorithm, making our data edges an under-approximation of the possible data-flow edges. Once this graph is generated, it is parsed into a list of nodes and several edge sets. Each node represents its corresponding AST token using a one-hot encoding. These nodes and edges are used as input to the GNN.
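A bounded worklist computation of this kind might look as follows; this is a sketch under the usual dataflow conventions (per-node gen, kill, and preds sets), with the iteration cap producing the under-approximation described above, not Graves-CPA's actual implementation.

    def bounded_reaching_definitions(nodes, preds, gen, kill, max_iters=100):
        # Worklist reaching definitions, stopped after a fixed number of steps,
        # yielding an under-approximation of the full fixpoint.
        reach_in = {n: set() for n in nodes}
        reach_out = {n: set(gen[n]) for n in nodes}
        worklist, steps = list(nodes), 0
        while worklist and steps < max_iters:
            n, steps = worklist.pop(), steps + 1
            reach_in[n] = set().union(*(reach_out[p] for p in preds[n])) if preds[n] else set()
            out = gen[n] | (reach_in[n] - kill[n])
            if out != reach_out[n]:                    # transfer function changed the facts
                reach_out[n] = out
                worklist.extend(s for s in nodes if n in preds[s])  # revisit successors
        return reach_in                                # definitions reaching each node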
2.2 Prediction
To form a prediction, Graves-CPA uses a GNN, visualized in Figure 1, which consists of two GAT layers, a jumping-knowledge layer [17], and an attention-based pooling layer [12]. The GAT layers are crucial to our technique. When propagating data through the graph, the attention mechanism in each layer weights edges so that information important to predictions is more prominent than superfluous data.
The jumping knowledge layer concatenates intermediate graph representa-
tions, denoted by A, B, and C, allowing the model to learn from each represen-
tation. The attention-based pooling layer calculates an attention value for each
node in the graph. All nodes are weighted by their respective attention values
and then summed together to form a graph feature vector. The combination of
Fig. 1. Graves uses a GNN comprising two GAT layers, a jumping-knowledge layer, and an attention-pooling layer. These layers produce a graph feature vector which a three-layer prediction network uses to order verifiers for sequential execution. An in-depth description of this architecture can be found in Leeson et al. [11].
GAT layers and the attention-based pooling allows the network to weigh the importance of both edges and nodes when forming the graph feature vector. This feature vector is fed to a three-layer neural network which decides the sequence of tool execution.
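A minimal PyTorch-Geometric sketch of this architecture is shown below; the class name, layer sizes, and ranking head are illustrative assumptions, not Graves-CPA's actual code.

    import torch
    from torch import nn
    from torch_geometric.nn import GATConv, GlobalAttention, JumpingKnowledge

    class VerifierRanker(nn.Module):
        # Two GAT layers, jumping knowledge over A, B, C, attention pooling,
        # and a three-layer prediction network (cf. Fig. 1).
        def __init__(self, num_tokens, hidden, num_verifiers):
            super().__init__()
            self.gat1 = GATConv(num_tokens, hidden)
            self.gat2 = GATConv(hidden, hidden)
            self.jump = JumpingKnowledge('cat')            # concatenate A, B, C
            feats = num_tokens + 2 * hidden
            self.pool = GlobalAttention(gate_nn=nn.Linear(feats, 1))
            self.rank = nn.Sequential(
                nn.Linear(feats, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, num_verifiers))          # one score per verifier

        def forward(self, x, edge_index, batch):
            a = x                                          # one-hot token encodings (A)
            b = self.gat1(a, edge_index).relu()            # intermediate representation B
            c = self.gat2(b, edge_index).relu()            # intermediate representation C
            g = self.pool(self.jump([a, b, c]), batch)     # graph feature vector
            return self.rank(g)

Sorting the returned scores in descending order would yield the order in which the configurations are executed.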
Graves-CPA was trained using data collected from running 5 configurations
of the CPAchecker framework on the verification tasks from SV-COMP 2021.
Labels for each configuration come from the SV-COMP score the configuration
would receive for a given program minus a time penalty. Similar to CPAchecker’s
competition contribution, these configurations are symbolic execution [6], value
analysis [7], value analysis with CEGAR [7], predicate analysis [5], and bounded
model checking with k-induction [4]. To prevent Graves-CPA from overfitting
to the SV-COMP benchmarks, we train on a subset of the dataset, only utilizing
20% of it. Like previous iterations of PeSCo, the network is trained to rank the
configurations in the order in which they should be executed.
Graves-CPA uses the machine learning libraries PyTorch [13] and PyTorch-
Geometric [8], an extension of PyTorch for graphs and other irregularly shaped
data, to implement its machine learning components. Graves-CPA is imple-
mented using a combination of Python, C++, and Java.
2.3 Execution
Using the ordering produced by the previous step, CPAchecker is run sequentially with each verification configuration. If a technique exceeds a given time limit or fails to produce a result, the next technique is executed.
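In outline, this fallback chain might be scripted as follows; a sketch only, which assumes a per-configuration flag for cpa.sh and a verdict printed on standard output.

    import subprocess

    def run_ordered(configs, program, prop, per_config_timeout):
        # Run CPAchecker configurations in the predicted order; fall through to
        # the next one on timeout or an inconclusive result (illustrative sketch).
        for config in configs:
            try:
                r = subprocess.run(
                    ["scripts/cpa.sh", config, "-spec", prop, program],
                    capture_output=True, text=True, timeout=per_config_timeout)
            except subprocess.TimeoutExpired:
                continue                           # next technique in the ordering
            if "TRUE" in r.stdout or "FALSE" in r.stdout:
                return r.stdout                    # conclusive verdict
        return "UNKNOWN"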
3 Strengths and Weaknesses
Graves-CPA operates on program graphs which are an abstraction of the pro-
gram. Its underlying model uses this abstraction to learn what software patterns
a particular verification technique excels at handling. This allows Graves-CPA to produce a dynamic ordering which runs techniques better equipped for the given problem first, reducing run time. In [11], the authors perform a qualitative study which suggests the network learns to rank verification techniques using program features an expert would use to decide between techniques.
In SV-COMP 2022 [2], there were 4,548 problems for which both Graves-CPA and CPAchecker reported the correct result. Graves-CPA's dynamic ordering, compared to CPAchecker's static configuration ordering, allowed it to solve these problems 37 hours faster. Further, Graves-CPA was able to solve 142 problems that CPAchecker could not, due to resource constraints or other issues.
Machine learning relies on the training data being representative of the real world. If this is not the case, the model can easily make poor predictions. These poor decisions can be seen in the competition in the 559 instances where Graves-CPA chooses an ordering that does not produce the correct result, but CPAchecker does. In most of these instances, Graves-CPA runs out of resources or incorrectly predicts that the remaining techniques will not produce a correct result.
4 Tool Setup and Configuration
Graves-CPA is built on the PeSCo codebase, which in turn is built on the
CPAchecker codebase, and participates in the ReachSafety and Overall cate-
gories. It can be downloaded as a fork: https://github.com/will-leeson/cpachecker.
Graves-CPA requires CMake, LLVM, either make or ninja, and Ant (a CPAchecker dependency) to be built, and the Python libraries PyTorch and PyTorch-Geometric to be executed. To build the project, simply run the shell script setup.sh and add our graph generation tool, graph-builder, to your path. Now, you may
verify a program with Graves-CPA using the command:
scripts/cpa.sh -svcomp22-graves -spec [prop.prp] [file.c]
5 Software Project and Contributions
Graves-CPA is an open source project developed by the authors at the Uni-
versity of Virginia. We would like to thank the team behind the PeSCo and CPAchecker tools for allowing us to build on their work.
Acknowledgements
We would like to thank Hongning Wang for his advice on graph neural networks
and prediction systems. This material is based in part upon work supported by
the U.S. Army Research Office under grant number W911NF-19-1-0054 and by
the DARPA ARCOS program under contract FA8750-20-C-0507.
References
1. Aho, A.V., Sethi, R., Ullman, J.D.: Compilers: Principles, Techniques, and Tools. Addison-Wesley (1986)
2. Beyer, D.: Progress on software verification: SV-COMP 2022. In: Proc. TACAS (2).
Springer (2022)
3. Beyer, D., Dangl, M.: Strategy selection for software verification based on boolean
features. In: Margaria, T., Steffen, B. (eds.) Leveraging Applications of Formal
Methods, Verification and Validation. Verification. pp. 144–159. Springer Interna-
tional Publishing, Cham (2018)
4. Beyer, D., Dangl, M., Wendler, P.: Boosting k-induction with continuously-refined
invariants. In: International Conference on Computer Aided Verification. pp. 622–
640. Springer (2015)
5. Beyer, D., Keremoglu, M.E., Wendler, P.: Predicate abstraction with adjustable-
block encoding. In: Formal Methods in Computer Aided Design. pp. 189–197. IEEE
(2010)
6. Beyer, D., Lemberger, T.: Cpa-symexec: efficient symbolic execution in cpachecker.
In: Proceedings of the 33rd ACM/IEEE International Conference on Automated
Software Engineering. pp. 900–903 (2018)
7. Beyer, D., Löwe, S.: Explicit-state software model checking based on CEGAR and in-
terpolation. In: International Conference on Fundamental Approaches to Software
Engineering. pp. 146–162. Springer (2013)
8. Fey, M., Lenssen, J.E.: Fast graph representation learning with PyTorch Geometric.
In: ICLR Workshop on Representation Learning on Graphs and Manifolds (2019)
9. Gamma, E., Helm, R., Johnson, R., Vlissides, J.: Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, Reading (1995)
10. Lattner, C.: Clang: a c language family frontend for llvm, https://clang.llvm.org/
11. Leeson, W., Dwyer, M.B.: Algorithm selection for software verification using graph
attention networks (2022), https://arxiv.org/abs/2201.11711
12. Li, Y., Tarlow, D., Brockschmidt, M., Zemel, R.: Gated graph sequence neural
networks (2017)
13. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T.,
Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Kopf, A., Yang, E., DeVito,
Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., Bai, J., Chin-
tala, S.: Pytorch: An imperative style, high-performance deep learning library. In:
Wallach, H., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E., Garnett,
R. (eds.) Advances in Neural Information Processing Systems 32, pp. 8024–8035.
Curran Associates, Inc. (2019), https://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf
14. Richter, C., Wehrheim, H.: Pesco: Predicting sequential combinations of verifiers.
In: International Conference on Tools and Algorithms for the Construction and
Analysis of Systems. pp. 229–233. Springer (2019)
15. Scarselli, F., Gori, M., Tsoi, A.C., Hagenbuchner, M., Monfardini, G.: The graph
neural network model. IEEE transactions on neural networks 20(1), 61–80 (2008)
16. Veličković, P., Cucurull, G., Casanova, A., Romero, A., Liò, P., Bengio, Y.: Graph
attention networks. arXiv preprint arXiv:1710.10903 (2017)
17. Xu, K., Li, C., Tian, Y., Sonobe, T., Kawarabayashi, K.-i., Jegelka, S.: Repre-
sentation learning on graphs with jumping knowledge networks (2018)
GWIT: A Witness Validator for Java based on
GraalVM (Competition Contribution)?
Falk Howar(B)1,2 and Malte Mues1
1TU Dortmund University, Dortmund, Germany
{falk.howar, malte.mues}@tu-dortmund.de
2Fraunhofer ISST, Dortmund, Germany
Abstract. GWIT is a validator for violation witnesses produced by
Java verifiers in the SV-COMP software verification competition. GWIT
weaves assumptions documented in a witness into the source code of a
program, effectively restricting the part of the program that is explored
by a program analysis. It then uses the GDart tool (dynamic symbolic
execution) to search for reachable errors in the modified program.
1 Introduction
Software verification tools, like any other software, can contain bugs. Given their intended use, i.e., proving the absence of errors in programs, bugs in verification tools are particularly problematic. On the other hand, verification tools can generate certificates for computed verdicts (e.g., counterexamples) that can be used to validate verification results. In the SV-COMP competition on software verification, violation witnesses and correctness witnesses, based on annotated abstract control-flow automata, have been established as a standardized representation of such certificates [1,2]. Participating verifiers are expected to produce witnesses for verdicts, and witness validators are used for confirming verdicts based on these witnesses.
In this paper, we present GWIT (as in "Guess What I'm Thinking" or as in GDart-based witness validator), a validator of violation witnesses for Java programs, based on the GDart tool ensemble [6]. GWIT validates violation witnesses by weaving the assumptions documented in a witness into the original program under analysis and checking the restricted program with dynamic symbolic execution.
2 Witness Validation in GWIT
We illustrate the operation of GWIT on the small example shown in Figure 1: In the program, a String value is created nondeterministically before asserting that this value is not "whoopsy". The program contains a reachable error: in case the value "whoopsy" is returned by the call to Verifier.nondetString(), an assertion violation will be triggered.
?This work has been partially funded by an Amazon Research Award
1 public static void main(String[] args) {
2     String s = Verifier.nondetString();
3     assert !s.equals("whoopsy");
4 }
Fig. 1: Small program with reachable error.
Java verifiers will generate a violation witness in such a case. In SV-COMP, witnesses are produced in a standardized format, conceptually based on control-flow automata and technically realized as models in the GraphML format [2]. Figure 2 shows an excerpt of such a witness for the above example. The witness makes an assumption on the state of the program when executing line 2 of the example program, namely that variable s has the value "whoopsy". As discussed, execution paths on which this assumption holds will lead to an error.
GWIT weaves the assumptions from the witness into the original program, restricting the number of program paths that have to be explored for finding the error. Figure 3 shows the result for our example: a call to Witness.assume(...) is generated from the assumption in the witness in Figure 2. The assume method wraps potentially many calls to the Verifier.assume(...) method, enabling multiple assumptions on the same line of code (e.g., due to execution of that line in a loop). The counters array keeps statistics on assumptions per line. The Verifier.assume(...) method is used by GDart to stop the analysis on paths that violate the corresponding assumption.
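A simplified sketch of this weaving step is shown below; it covers only the GraphML parsing and the textual insertion, and its structure is an assumption made for illustration, not GWIT's actual implementation.

    import xml.etree.ElementTree as ET

    GRAPHML = 'http://graphml.graphdrawing.org/xmlns'

    def collect_assumptions(witness_path):
        # Map source line -> assumption expressions found in the violation witness.
        per_line = {}
        for edge in ET.parse(witness_path).getroot().iter('{%s}edge' % GRAPHML):
            data = {d.get('key'): d.text for d in edge.findall('{%s}data' % GRAPHML)}
            if 'assumption' in data and 'startline' in data:
                per_line.setdefault(int(data['startline']), []).append(data['assumption'])
        return per_line

    def weave(source_lines, per_line):
        # Insert a Witness.assume(site_id, ...) call after each constrained line
        # (cf. Fig. 3; one counter slot per weaving site).
        out, site = [], 0
        for no, line in enumerate(source_lines, start=1):
            out.append(line)
            if no in per_line:
                indent = line[:len(line) - len(line.lstrip())]
                out.append('%sWitness.assume(%d, %s);\n'
                           % (indent, site, ', '.join(per_line[no])))
                site += 1
        return out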
Figure 4, finally, shows the effect of weaving the witness into the code on the
obtained constraints-trees. In the left of the figure, the tree computed by GDart
for the original program is shown. The tree has two satisfiable paths, branching
on the condition of the assert statement. The right of the figure shows the tree
for the modified program. This tree contains a node for the assumption, one path
that is not executed after the violation of the assumption, one path that is not
feasible after the assumption for the assert statement, and one path leading to an
error (i.e., assertion violation). In this small example, the tree for the modified
program is more complex than the tree for the original program, but it has fewer
complete execution paths. In more complex programs, assumptions will typically
remove multiple execution paths, making the validation task significantly easier
than the original verification task.
<edge source="n0" target="n1">
<data key="originfile">Main.java</data>
<data key="startline">2</data>
<data key="threadId">0</data>
<data key="assumption">s.equals("whoopsy")</data>
<data key="assumption.scope">...</data>
</edge>
Fig. 2: Excerpt of violation witness produced by GDart or JBMC.
1  static int[] counters = new int[] {0};
2  public static void assume(int id, boolean... assumptions) {
3      int idx = counters[id];
4      counters[id]++;
5      Verifier.assume(assumptions[idx]);
6  }
7
8  public static void main(String[] args) {
9      String s = Verifier.nondetString();
10     Witness.assume(0, s.equals("whoopsy"));
11     assert !s.equals("whoopsy");
12 }
Fig. 3: Program with the assumption from the witness woven into the code.
Fig. 4: Constraints-tree for the original program (left) and the modified program (right).
3 Performance and Limitations
While the approach of GWIT is sound for violation witnesses, the current imple-
mentation still has limitations, validating roughly half of the witnesses provided
by verifiers.
Soundness. GWIT is sound: weaving a witness into the code adds additional decision nodes to the constraints-tree. In the sub-tree rooted at such a new node, some paths become unsatisfiable and will not be explored. Every complete path ψ in the modified tree has an equivalent path φ in the original constraints-tree such that ψ = φ. If an error is reached in the modified tree, it is also reachable in the original program.
Performance. For programs with few decisions, the modified program may actually be more complex than the original program, but GDart only explores more paths than in the original program in cases where the initial value along some path does not satisfy an assumption. Comparing the CPU times of GDart used as a verifier and used through GWIT, with almost identical configuration options (only difference: GWIT does not produce witnesses), complexity is reduced for most benchmark instances that do not fail due to syntactic errors during weaving (see below).
Two extreme examples are BellmanFord-FunSat02, for which weaving a witness with 13 assumptions more than doubles the CPU time, leading to a timeout during validation, and the nanoxml_eqchk/prop2 instance, for which the CPU time required for validation is less than 14% of the CPU time needed for the original verification task.
Overall, GWIT successfully validates 301 of 614 witnesses provided by GDart and JBMC [3] (the only Java verifiers that currently produce witnesses). In 286 cases, validation failed with inconclusive verdicts due to currently unsupported witness features. In 15 cases, incorrect weaving (see below) prevented validation of the witnesses. For 12 witnesses, validation attempts exhaust the resource limits.
Limitations. First, GWIT currently only supports violation witnesses. In principle, it should be possible to validate correctness witnesses by weaving assertions into the program code, but it is not obvious that such an approach makes the validation of witnesses a simpler problem than the original verification task. Second, since weaving witnesses is done on the source code, it only works correctly on proper blocks, delimited with braces, and with one statement per line. While this does not affect soundness, it makes the validation of witnesses impossible in some cases.
4 Tool Setup
GWIT is shipped as a git repository with sub-projects delivering all required components. Checking out the repository and initializing all sub-projects pulls in all required source code. For building the SPouT component, the mx build system3 maintained by the GraalVM [7] team is required. Other components are built with Maven. Once all build systems are available, the ./compile-all.sh script builds GWIT. The ./run-gwit.sh script is used to validate witnesses, taking the witness file and source folders of a benchmark instance as parameters. GWIT currently does not expose any other configuration parameters.
5 Software Project
The GWIT tool is available on GitHub4. GWIT's scripts are licensed under the Apache 2.0 license. The sub-projects bring their own licenses as follows: DSE5 is available under the Apache 2.0 license, JConstraints6 [4] as well, and SPouT7 is available under the GPL v2 license. The components of GWIT and GWIT itself are currently developed at TU Dortmund by the group led by Falk Howar.
3https://github.com/graalvm/mx
4https://github.com/tudo-aqua/gwit
5https://github.com/tudo-aqua/dse
6https://github.com/tudo-aqua/jconstraints
7https://github.com/tudo-aqua/spout
6 Data Availability Statement
The GWIT archive used for SV-COMP 2022 is available at Zenodo [5].
References
1. Beyer, D., Dangl, M., Dietsch, D., Heizmann, M.: Correctness witnesses: Exchang-
ing verification results between verifiers. In: Proc. FSE. pp. 326–337. FSE 2016, Association for Computing Machinery, New York, NY, USA (2016). https://doi.org/10.1145/2950290.2950351
2. Beyer, D., Dangl, M., Dietsch, D., Heizmann, M., Stahlbauer, A.: Witness validation
and stepwise testification across software verifiers. In: Proc. FSE. pp. 721–733. ESEC/FSE 2015, Association for Computing Machinery, New York, NY, USA (2015). https://doi.org/10.1145/2786805.2786867
3. Cordeiro, L., Kroening, D., Schrammel, P.: JBMC: Bounded model checking for
Java bytecode. In: Beyer, D., Huisman, M., Kordon, F., Steffen, B. (eds.) Proc.
TACAS. pp. 219–223. Springer International Publishing, Cham (2019). https://doi.org/10.1007/978-3-030-17502-3_17
4. Howar, F., Jabbour, F., Mues, M.: JConstraints: A library for working with logic
expressions in Java. In: Models, Mindsets, Meta: The What, the How, and the Why
Not?, pp. 310–325. Springer (2019). https://doi.org/10.1007/978-3-030-22348-9_19
5. Howar, F., Mues, M.: GWIT artifact for SV-COMP 2022 (Feb 2022). https://doi.org/10.5281/zenodo.5956885
6. Mues, M., Howar, F.: GDart: An ensemble of tools for dynamic symbolic execu-
tion on the java virtual machine (competition contribution). In: Proc. TACAS (2).
Springer (2022)
7. Würthinger, T., Wimmer, C., Wöß, A., Stadler, L., Duboscq, G., Humer, C.,
Richards, G., Simon, D., Wolczko, M.: One VM to rule them all. In: Proc. SPLASH.
pp. 187–204 (2013)
The Static Analyzer Infer in SV-COMP
(Competition Contribution)
Matthias Kettl and Thomas Lemberger
LMU Munich, Germany
Abstract. We present Infer-sv, a wrapper that adapts Infer for SV-COMP. Infer is a static-analysis tool for C and other languages, developed by Facebook and used by multiple large companies. It is strongly aimed at industry and the internal use at Facebook. Despite its popularity, there are no reported numbers on its precision and efficiency. With Infer-sv, we take a first step towards an objective comparison of Infer with other SV-COMP participants from academia and industry.
1 Facebook Infer
Infer [6] is a compositional and incremental static-analysis tool developed at
Facebook. Infer supports a wide array of analyses, including memory safety, buffer overruns, performance constraints, and different reachability analyses for C, C++, Objective-C, Java, C#, and .NET. For memory analysis, Infer uses
bi-abduction [7] with separation logic [14]. Infer supports the integration of
new abstract domains through the abstract-interpretation framework Infer:AI.
Infer analyzes programs compositionally (building method summaries) and
incrementally (only analyzing changed program parts). In contrast to most other
tools that participate in SV-COMP, Infer is not an academic verifier. Instead, it is aimed at practical use during software development. This has direct implications for the development focus: When Infer is told to incrementally analyze software,
it outputs only newly discovered bugs and does not re-report bugs found in
previous analyses. This allows developers to ignore warnings not deemed relevant
and reduces the cognitive burden on developers due to false alarms. Multiple
large companies use Infer, among others: Amazon Web Services, Facebook, Microsoft, Mozilla, and Spotify. At the time of this writing, Infer has more than 12,000 stars on GitHub and was forked over 1,500 times. Despite its popularity,
there are no reported numbers on Infer’s precision and soundness. With the
participation of Infer in the C language track of SV-COMP ’22, we hope to take
a first step towards an objective comparison of Infer with other verifiers.
The following other commercial verifiers participate in SV-COMP '22: 2ls [16], Cbmc [10], Crux1, Frama-C [5], VeriAbs [12], and VeriFuzz [9].
1https://crux.galois.com/
2 Infer in SV-COMP
2.1 Infer-SV
Verification. We provide the wrapper Infer-sv to adapt Infer to the SV-COMP specification format for program properties. Infer-sv parses the property to analyze, adjusts the program under analysis for Infer, runs Infer with fitting analyses, and reports a verification verdict based on the feedback produced by Infer. Infer-sv supports the following SV-COMP program properties:
no-overflow. The aim is to check for arithmetic overflows on signed-integer types. Infer-sv runs Infer's buffer-overrun analysis2 to detect these.
unreach-call. The aim is to check for reachable calls to the function reach_error. Infer provides a function-call reachability analysis3, but this analysis proved very imprecise. To mitigate this, Infer-sv performs a program transformation4: It replaces each call to the function reach_error with the overflow-provoking statement int __reach_error_x = 0x7fffffff + 1. No task with property unreach-call contains a signed-integer overflow, so a call to reach_error is reachable in the original program if and only if one of the introduced overflows is reachable. Infer-sv runs Infer's buffer-overrun analysis on the transformed program to check this.
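As a sketch, the textual transformation could be as simple as the following; this is illustrative only, and Infer-sv's actual rewriting may differ.

    import re

    OVERFLOW = 'int __reach_error_x = 0x7fffffff + 1;'  # provokes a signed overflow

    def transform(c_source):
        # Replace every call to reach_error() with the overflow-provoking
        # statement, so the buffer-overrun analysis flags exactly those calls.
        return re.sub(r'\breach_error\s*\(\s*\)\s*;', OVERFLOW, c_source)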
valid-memsafety. The aim is to check for invalid pointer dereferences, invalid frees of memory, and memory leaks. To analyze memory safety, Infer-sv uses two analyses: bi-abduction5 and Infer:Pulse6. SV-COMP requires verifiers to report the concrete type of violation detected: valid-deref, valid-memtrack, or valid-free. Infer-sv analyzes the error codes reported by Infer to determine the exact violation found. If Infer reports multiple fitting warnings, we take the first.
Witnesses. SV-COMP requires participants to report GraphML verification-result witnesses [3,4] in tandem with each result, and these witnesses must be successfully validated by at least one participating witness validator. Natively, Infer does not support the generation of GraphML witnesses. To mitigate this, Infer-sv creates generic witnesses: When reporting a violation, it generates a violation witness [4] that represents all possible program paths. When reporting a program safe, it generates a correctness witness [3] that only contains the trivial invariant 'true'. These witnesses do not helpfully guide towards a violation or proof, but are valid according to the SV-COMP rules.
Participation. Infer-sv participates hors concours in the categories ReachSafety, ConcurrencySafety, NoOverflows, and SoftwareSystems. Because of missing support, we exclude Infer-sv from categories aimed at float handling, as well as from the category MemSafety-MemCleanup.
2https://fbinfer.com/docs/checker-bufferoverrun
3https://fbinfer.com/docs/checker-annotation-reachability
4https://github.com/facebook/infer/issues/763
5https://fbinfer.com/docs/checker-biabduction
6https://fbinfer.com/docs/checker-pulse
Fig. 1: Comparison of the run time (in CPU-time seconds) of three SV-COMP '22 medalists and Infer, across all tasks correctly solved by the respective pair. Three scatter plots compare the CPU time of Infer (y-axis, 0.1 s to 900 s) against CPAchecker, Symbiotic, and VeriAbs (x-axes, same scale).
1 int main() {
2     if (0) {
3         int x = 0x7fffffff + 1;
4     }
5 }
(a) Infer correctly reports safety

1 void reach_error() {
2     int x = 0x7fffffff + 1;
3 }
4 int main() {
5     if (0) {
6         reach_error();
7     }
8 }
(b) Infer incorrectly reports an alarm

1 int main() {
2     int x = 0x7fffffff;
3     int y = -1;
4     while (x > 0) {
5         x = x - 2*y;
6     }
7 }
(c) Infer correctly reports an alarm

1 int main() {
2     int x = 0x7fffffff;
3     int y = -1;
4     while (x > 0) {
5         x = x - 2*y;
6         y = y + 2;
7     }
8 }
(d) Infer incorrectly reports safety

Fig. 2: Examples of Infer's inconsistent results
2.2 Strengths of Infer
Infer scales well [6]. This shows in the SV-COMP results: For 6,000 out of 8,000 tasks with a verification verdict, Infer finishes the analysis in less than one second of CPU time. The remaining 2,000 tasks each take less than 100 s of CPU time. This means that Infer stays significantly below the time limit of 900 s per task. Figure 1 compares the run time of Infer (in CPU-time seconds) to the best SV-COMP '22 tools in the categories that Infer participated in: CPAchecker [11], Symbiotic [8], and VeriAbs [12]. Each plot shows the run time for all tasks that are correctly solved by both Infer and the respective other verifier (independent of result validation). It is visible that Infer (y-axis) is significantly faster than the other tools (x-axis) for almost all tasks. This speed makes Infer integrate well into continuous-integration development systems [13,15].
2.3 Weaknesses of Infer
Infer demonstrates low analysis precision. Figures 2a and 2b illustrate a low precision across function calls (intraprocedural analysis): Both programs contain an unreachable signed-integer overflow. The only difference is the indirection in Fig. 2b due to the additional function call. Infer correctly reports Fig. 2a safe, but incorrectly reports an alarm for Fig. 2b. We assume that the intraprocedural analysis of Infer does not check whether reach_error is reachable from the program entry. Infer-sv mitigates this issue for the property unreach-call through the mentioned program transformation, but this imprecision still leads Infer to report wrong alarms across all program properties.
Infer can also show imprecision within a single function. Consider Figs. 2c and 2d: The only change from Fig. 2c to Fig. 2d is the additional statement y = y + 2 in line 6. This has no influence on the integer overflow in line 5, so both programs contain an overflow. Infer correctly reports the overflow for Fig. 2c, but wrongly reports Fig. 2d safe.
These imprecisions strongly reflect in the SV-COMP results of Infer, leading to many incorrect proofs and alarms.
3 Usage
Infer-sv requires Python 3.6 or later. The script setup.sh downloads and extracts version 1.1.0 of Infer. From the tool's directory, Infer-sv can be run with the following command:

    ./infer-wrapper.py \
        --data-model {ILP32 or LP64} \
        --property path/to/property.prp \
        --program path/to/program.c

Setting the data model is optional. Infer-sv will print the recognized property and the command line it uses to call Infer. Infer-sv prints the full output of Infer, including all warnings, and the final verification verdict on the last line. The verification verdict can be true, false, unknown, or error.
4 Conclusion
The participation of Infer in SV-COMP allows an objective comparison with
other verifiers for C. This shows that the selected analyses of Infer are very
efficient, but suffer from strong imprecision on the considered benchmark tasks.
Contributors. Infer7 is developed by Facebook and the open-source community under the MIT license, and Infer-sv8 is developed under the Apache 2.0 license at the Software and Computational Systems Lab at LMU Munich, led by Dirk Beyer.
7https://github.com/facebook/infer
8https://gitlab.com/sosy-lab/software/infer-sv
Funding Statement. This work was funded in part by the Deutsche Forschungsgemeinschaft (DFG) 418257054 (Coop).
Data Availability Statement. All data of SV-COMP 2022 are archived as described in the competition report [1] and available on the competition web site. This includes the verification tasks, results, witnesses, scripts, and instructions for reproduction. The version of our verifier as used in the competition is archived together with other participating tools [2].
References
1. Beyer, D.: Progress on software verification: SV-COMP 2022. In: Proc. TACAS. Springer (2022)
2. Beyer, D.: Verifiers and validators of the 11th Intl. Competition on Software Verification (SV-COMP 2022). Zenodo (2022). https://doi.org/10.5281/zenodo.5959149
3. Beyer, D., Dangl, M., Dietsch, D., Heizmann, M.: Correctness witnesses: Exchanging verification results between verifiers. In: Proc. FSE. pp. 326–337. ACM (2016). https://doi.org/10.1145/2950290.2950351
4. Beyer, D., Dangl, M., Dietsch, D., Heizmann, M., Stahlbauer, A.: Witness validation and stepwise testification across software verifiers. In: Proc. FSE. pp. 721–733. ACM (2015). https://doi.org/10.1145/2786805.2786867
5. Beyer, D., Spiessl, M.: The static analyzer Frama-C in SV-COMP (competition contribution). In: Proc. TACAS (2). Springer (2022)
6. Calcagno, C., Distefano, D., Dubreil, J., Gabi, D., Hooimeijer, P., Luca, M., O'Hearn, P.W., Papakonstantinou, I., Purbrick, J., Rodriguez, D.: Moving fast with software verification. In: Proc. NFM. pp. 3–11. LNCS 9058, Springer (2015). https://doi.org/10.1007/978-3-319-17524-9_1
7. Calcagno, C., Distefano, D., O'Hearn, P.W., Yang, H.: Compositional shape analysis by means of bi-abduction. J. ACM 58(6), 26:1–26:66 (2011). https://doi.org/10.1145/2049697.2049700
8. Chalupa, M., Řechtáčková, A., Mihalkovič, V., Zaoral, L., Strejček, J.: Symbiotic 9: Parallelism and invariants (competition contribution). In: Proc. TACAS (2). Springer (2022)
9. Chowdhury, A.B., Medicherla, R.K., Venkatesh, R.: VeriFuzz: Program aware fuzzing (competition contribution). In: Proc. TACAS, part 3. pp. 244–249. LNCS 11429, Springer (2019). https://doi.org/10.1007/978-3-030-17502-3_22
10. Clarke, E.M., Kroening, D., Lerda, F.: A tool for checking ANSI-C programs. In: Proc. TACAS. pp. 168–176. LNCS 2988, Springer (2004). https://doi.org/10.1007/978-3-540-24730-2_15
11. Dangl, M., Löwe, S., Wendler, P.: CPAchecker with support for recursive programs and floating-point arithmetic (competition contribution). In: Proc. TACAS. pp. 423–425. LNCS 9035, Springer (2015). https://doi.org/10.1007/978-3-662-46681-0_34
12. Darke, P., Agrawal, S., Venkatesh, R.: VeriAbs: A tool for scalable verification by abstraction (competition contribution). In: Proc. TACAS (2). pp. 458–462. LNCS 12652, Springer (2021). https://doi.org/10.1007/978-3-030-72013-1_32
13. Distefano, D., Fähndrich, M., Logozzo, F., O'Hearn, P.W.: Scaling static analyses at Facebook. Commun. ACM 62(8), 62–70 (2019). https://doi.org/10.1145/3338112
14. Distefano, D., O'Hearn, P.W., Yang, H.: A local shape analysis based on separation logic. In: Proc. TACAS. LNCS, vol. 3920, pp. 287–302. Springer (2006). https://doi.org/10.1007/11691372_19
15. Harman, M., O'Hearn, P.W.: From start-ups to scale-ups: Opportunities and open problems for static and dynamic program analysis. In: Proc. SCAM. pp. 1–23. IEEE (2018). https://doi.org/10.1109/SCAM.2018.00009
16. Malík, V., Schrammel, P., Vojnar, T.: 2ls: Heap analysis and memory safety (competition contribution). In: Proc. TACAS (2). pp. 368–372. LNCS 12079, Springer (2020). https://doi.org/10.1007/978-3-030-45237-7_22
LART: Compiled Abstract Execution
(Competition Contribution)
Henrich Lauko⋆⋆ and Petr Ročkai
Faculty of Informatics, Masaryk University, Brno, Czech Republic
xlauko@mail.muni.cz
Abstract. lart (llvm abstraction and refinement tool) originates from the divine model checker [5,7], in which it was employed as an abstraction toolchain for the llvm interpreter. In this contribution, we present a stand-alone tool that does not need a verification backend but performs the verification natively. The core idea is to instrument abstract semantics directly into the program and compile it into a native binary that performs the program analysis. This approach provides the performance gain of native execution over interpreted analysis and allows compiler optimizations to be employed on the abstracted code, further improving the analysis efficiency. Compilation-based abstraction introduces new challenges solved by lart, such as the domain interaction of concrete and abstract values, the simulation of a nondeterministic runtime, and constraint propagation.
Keywords: Abstract interpretation · Compilation-based abstraction · llvm · lart · divine · Formal verification · Symbolic execution.
1 Verification Approach and Software Architecture
As with many tasks in computer science, verification can be approached in multiple ways. In general, tools approach program analysis using interpretation, which gives them complete control over the program state and program execution but pays a cost in performance. Our tool lart tackles the task with the toolset from the opposite side of the spectrum, compilation, using a technique of so-called compilation-based abstraction. The main idea of this approach is to compile nondeterministic execution directly into the executable and perform reachability analysis by its native execution. This approach is most similar to the one presented in symcc [6], which compiles symbolic execution into the native binary. In contrast, we present a more general approach that allows arbitrary abstractions. The Spin model checker [4] also provides a mode where the model is compiled together with a verifier into a single executable.
During compilation, lart performs an llvm-to-llvm transformation to augment instructions that can manipulate nondeterministic values. This is
This work has been partially supported by Red Hat, Inc.
⋆⋆ Jury member representing lart at sv-comp 2022.
a purely syntactic abstraction of a program: e.g., an add instruction is replaced by a call to lart add. Additionally, lart provides a set of semantic libraries (abstract domains) to give meaning to the abstract instructions. Each abstract domain defines the native representation of abstract values and implements the abstract instructions and the transformations to and from concrete values and other domains. The tool provides multiple domains that allow analyses with various precisions, e.g., interval analysis, nullity analysis, or symbolic analysis. Finally, to allow native execution, the domains are provided as static libraries linked to the instrumented programs under test.
In comparison to concrete programs, abstracted programs also exhibit nondeterministic control flow. To explore all possible execution paths, lart provides a configurable runtime library. The overall architecture of compilation-based abstraction is depicted in Figure 1.
The configuration used in the competition contribution employs an iterative deepening search of program paths. At each branching point of a program, the execution forks to explore all possibilities. Finally, the main process of the analysis gathers the results from the explored paths and notifies the user if an error is reachable. This approach potentially suffers from infinite loops and the path-explosion problem. However, it is sufficient for bug hunting, or even for verification when an overapproximative abstraction is employed, which widens the effect of infinite loops. Also, in many simple cases, a compiler can summarize the effects of program loops, minimizing the impact of path explosion.
Fig. 1. lart architecture overview: the C program is compiled (with alias and dataflow analysis guiding the instrumentation) into an abstract binary, which is linked against the domain, Z3, and runtime libraries (fault handler, shadow memory) into a native binary; executing this binary performs the analysis and reports correct/error.
In order to obtain a performant result, we strive to minimize the amount of syntactic abstraction. The instrumentation achieves this by combining forward dataflow analysis with Andersen's alias analysis [1], tainting only those instructions that might encounter nondeterministic values and abstracting only the tainted instructions. This analysis is entirely overapproximative and quickly detects all possible candidates for abstraction. The actual abstract computation is resolved later during execution.
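This forward taint propagation can be sketched as follows; the operands/result attributes and the may_alias predicate are stand-ins for the llvm-level analysis, not lart's actual code.

    def select_abstract_instructions(instructions, nondet_sources, may_alias):
        # Forward dataflow tainting: mark every instruction that may see a
        # nondeterministic value, directly or through a may-alias (sketch).
        tainted_vals = set(nondet_sources)
        tainted_instrs, changed = set(), True
        while changed:                                 # iterate to a fixpoint
            changed = False
            for instr in instructions:
                hit = any(op in tainted_vals or
                          any(may_alias(op, t) for t in tainted_vals)
                          for op in instr.operands)
                if hit and instr not in tainted_instrs:
                    tainted_instrs.add(instr)
                    tainted_vals.add(instr.result)
                    changed = True
        return tainted_instrs                          # only these are abstracted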
However, we do not want to perform expensive abstract computation when tainted instructions do not receive nondeterministic operands. This might occur when a C function at one point receives concrete arguments and at another call
site some abstract arguments. In the former case, we would like to execute it fully concretely, while in the latter, we want to execute only the necessary amount of tainted instructions abstractly. Therefore, lart synthesizes simple dispatch routines that pick a concrete or abstract instruction depending on the operands. The dispatch routine also handles the possibility of mixing concrete and abstract operands, lifting concrete values to an abstract domain if necessary. We require that all operands of abstract instructions are in the same domain. See an example of a dispatch routine in Figure 2.
__lart_value __lart_dispatch_add(__lart_value a, __lart_value b) {
    if (is_abstract(a) || is_abstract(b)) {
        if (!is_abstract(a))
            a.abstract = lift(a.concrete);
        else if (!is_abstract(b))
            b.abstract = lift(b.concrete);
        return domain::add(a.abstract, b.abstract);
    }
    return a.concrete + b.concrete;
}
Fig. 2. Syntactically abstracted values in lart are represented as a union of an abstract or a concrete type (__lart_value). The dispatch routine lifts operands to an abstract domain and resolves in which domain the instruction should be executed. Since the abstraction dispatch is purely syntactic, it can be inlined into the abstracted source code and further optimized. This gives the compiler the possibility to optimize repeated checks in dispatch routines.
The runtime for native execution takes care of multiple responsibilities. First of all, it implements an execution fork whenever a branch is conditioned on an abstract value for which both outcomes are possible, e.g., when a branch is conditioned on the symbolic term x < 5. Furthermore, the runtime takes care of the memory management of the abstraction. To not disrupt the original program's memory layout, lart keeps all abstract data in a shadow memory. Therefore, the union values presented in Figure 2 are split into two separately addressed regions: concrete program memory and abstract shadow memory. The information on whether a variable holds an abstract value is also kept in the shadow memory.
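The fork-based exploration can be sketched as follows; a minimal Python illustration of the idea, where negate and the solver methods are hypothetical stand-ins for the runtime's SMT interface.

    import os, sys

    def abstract_branch(condition, solver):
        # Fork the process to explore both outcomes of a branch whose condition
        # is abstract, in the spirit of lart's runtime (illustrative sketch).
        for outcome, constraint in ((True, condition), (False, negate(condition))):
            if not solver.check(constraint):           # prune infeasible outcomes
                continue
            if os.fork() == 0:                         # child continues this outcome
                solver.add(constraint)
                return outcome
            os.wait()                                  # parent waits for the child
        sys.exit(0)                                    # parent delegated all feasible paths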
2 Strengths and Weaknesses
The main strength of compilation-based abstraction is the utilization of the native runtime and of compiler optimizations on the abstracted code. In theory, native execution should consistently outperform the same interpreted analysis. However, it comes at the cost of a more complex source transformation that is harder to relate to its origin. Furthermore, the overapproximative nature of the syntactic analysis produces unnecessary executions of dispatch functions when they are not needed. In contrast, an interpreter can compute in a specific domain without additional dispatches. Another advantage of the approach is the reusable result of the syntactic abstraction, which can be linked with various domains to perform analyses concurrently without repeated llvm instrumentation.
lart is best compared with the divine model checker, which uses lart's transformation and domain libraries internally but, instead of compiling to a native executable, interprets the abstracted llvm ir. The competition results support the hypothesis that the compilation-based approach of lart outperforms divine: it is faster in all reachability subcategories except one, where the longer times are caused by a different state-space exploration order.
Given the simplistic runtime, abstracted binaries produced by lart lack further analysis optimizations and verification capabilities. Presently, the exploration algorithm supports only reachability analysis of single-threaded programs. However, we plan to support memory-safety and overflow checking using a sanitizer-like approach.
Another goal of lart's compilation-based approach is to provide a reusable abstraction component for verification tools. This concept has been demonstrated with divine and now with the native mode, whose binaries can be analyzed by the standard programmer's toolset, such as debuggers or sanitizers.
3 Tool Setup and Configuration
The verifier archive can be found on the sv-comp 2022 [2] page under the name lart. In case the binary distribution does not work on your system, we also provide a source distribution and build instructions at https://github.com/xlauko/lart/tree/svcomp-2022. It is sufficient to run lart using the compiler wrapper script as follows: lartcc <domain> testcase.c -o abstract, and then to execute the abstract binary to perform the analysis.
For the sv-comp contribution, the lart wrapper handles additional settings and the setup of the workflow presented in Figure 1. The wrapper sets lart options based on the property file and the benchmark. In particular, lart enables the symbolic mode if any nondeterminism is found, and it sets which errors should be reported based on the property file. It also generates witness files. More details can be found on the aforementioned distribution page. Due to support limitations, lart participates only in the ReachSafety and DeviceDrivers categories.
4 Software Project and Contributors
The project home page is https://github.com/xlauko/lart. lart is open-source software distributed under the MIT license. Active contributors to the tool are listed as authors of this paper.
Data Availability Statement. All data of SV-COMP 2022 are archived as described
in the competition report [2] and available on the competition web site. This includes
the verification tasks, results, witnesses, scripts, and instructions for reproduction.
The version of our verifier as used in the competition is archived together with other
participating tools [3].
References
1. Andersen, L.O.: Program analysis and specialization for the C programming lan-
guage. Ph.D. thesis, Citeseer (1994)
2. Beyer, D.: Progress on software verification: SV-COMP 2022. In: Proc. TACAS.
Springer (2022)
3. Beyer, D.: Verifiers and validators of the 11th Intl. Competition on Software Verifi-
cation (SV-COMP 2022). Zenodo (2022). https://doi.org/10.5281/zenodo.5959149
4. Holzmann, G., Najm, E., Serhrouchni, A.: Spin model checking: An introduction. STTT 2, 321–327 (2000). https://doi.org/10.1007/s100090050039
5. Lauko, H., Ročkai, P., Barnat, J.: Symbolic computation via program transformation. In: Theoretical Aspects of Computing – ICTAC 2018 (2018). https://doi.org/10.1007/978-3-030-02508-3_17
6. Poeplau, S., Francillon, A.: Symbolic execution with SymCC: Don't interpret, compile! In: 29th USENIX Security Symposium (USENIX Security 20). pp. 181–198. USENIX Association (2020), https://www.usenix.org/conference/usenixsecurity20/presentation/poeplau
7. Ročkai, P., Štill, V., Černá, I., Barnat, J.: DiVM: Model checking with LLVM and graph memory. Journal of Systems and Software 143, 1–13 (2018). https://doi.org/10.1016/j.jss.2018.04.026
Open Access This chapter is licensed under the terms of the Creative Commons
Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/),
which permits use, sharing, adaptation, distribution and reproduction in any medium
or format, as long as you give appropriate credit to the original author(s) and the
source, provide a link to the Creative Commons license and indicate if changes were
made.
The images or other third party material in this chapter are included in the
chapter’s Creative Commons license, unless indicated otherwise in a credit line to the
material. If material is not included in the chapter’s Creative Commons license and
your intended use is not permitted by statutory regulation or exceeds the permitted
use, you will need to obtain permission directly from the copyright holder.
Symbiotic 9: String Analysis and Backward
Symbolic Execution with Loop Folding
(Competition Contribution)
Marek Chalupa (✉), Vincent Mihalkovič, Anna Řechtáčková,
Lukáš Zaoral, and Jan Strejček
Masaryk University, Brno, Czech Republic
Abstract. The development of Symbiotic 9 focused mainly on two components. One is the symbolic executor Slowbeast, which newly supports backward symbolic execution, including its extension called loop folding. This technique can infer inductive invariants from backward symbolic execution states. Thanks to these invariants, Symbiotic 9 is able to produce non-trivial correctness witnesses, a feature missing in previous versions of Symbiotic. We have also extended forward symbolic execution in Slowbeast with basic support for parallel programs. The second component with significant improvements is the instrumentation module. In particular, we have extended the static analysis of accesses to arrays with features designed for programs that manipulate C strings.
Symbiotic 9 is the Overall winner of SV-COMP 2022. Moreover, it also won the categories MemSafety and SoftwareSystems, and placed third in FalsificationOverall.
1 Verification Approach
Symbiotic 9 combines fast static analyses with code instrumentation and program slicing [13] to speed up code verification. In the SV-COMP configuration of Symbiotic 9, the code verification is performed by symbolic executors, namely by Slowbeast [8] and our fork of Klee [4].
As Symbiotic works internally with llvm [10], it first compiles the given C program into llvm bitcode. The following steps depend on the verified property.
Verification of the Property unreach-call For this property, Symbiotic 9 directly slices the llvm bitcode to remove instructions that have no influence on the reachability of error calls and then runs Klee with a time limit of 333 seconds. Klee is very efficient and often decides the task within this time limit. If Klee fails to decide, we parse its output and proceed according to the cause of the failure. If Klee failed because the program contains threads, we
This work has been supported by the Czech Science Foundation grant GA19-24397S.
✉ Jury member and the corresponding author: chalupa@fi.muni.cz
© The Author(s) 2022
D. Fisman and G. Rosu (Eds.): TACAS 2022, LNCS 13244, pp. 462–467, 2022.
https://doi.org/10.1007/978-3-030-99527-0_32
Table 1. The comparison of supported features of Klee (our fork and the upstream) and Slowbeast (SV-COMP 2022 and SV-COMP 2021 versions). The marks ✓/(✓)/✗ mean supported/partially supported/unsupported.

                               Klee       Klee      Slowbeast     Slowbeast
                               upstream   our fork  SV-COMP 2021  SV-COMP 2022
  Backward SE                     ✗          ✗           ✗             ✓
  Loop folding                    ✗          ✗           ✓             ✓
  Invariant generation            ✗          ✗           ✓             ✓
  Symbolic floats                 ✗          ✗           ✓             ✓
  Symbolic pointers               ✓          ✓           ✗             ✓
  Symbolic-sized allocations      ✗          ✓           ✗             ✓
  Symbolic addresses              ✗          ✓           ✗             ✓
  Parallel programs               ✗          ✗           ✗             ✓
  Incremental solving             ✗          ✗           ✓             ✓
  Caching solver calls            ✓          ✓           ✗             ✗
  Lazy memory                     ✗          ✗           ✓             ✓
run Slowbeast with forward symbolic execution (SE) and the thread support turned on. If Klee failed for any other reason, we run Slowbeast with backward symbolic execution with loop folding (BSELF) [8], described later. If BSELF also fails (the current implementation supports only selected program features), we run Slowbeast with forward symbolic execution.
Note that running forward symbolic execution first with Klee and then with Slowbeast if Klee fails makes good sense, as Klee and Slowbeast support different sets of features. The main differences between these tools (and the upstream Klee and the version of Slowbeast used in Symbiotic 8) are summarized in Table 1. Row symbolic addresses indicates whether the tools model the non-determinism in the placement of allocated objects (this is useful, e.g., when comparing addresses of such objects). Row incremental solving indicates whether the tools can associate the state of an SMT solver with every symbolic execution state and incrementally add constraints instead of always solving formulas from scratch. Row caching solver calls indicates whether the tools can remember the results of solver calls and use them later to quickly decide some other solver calls. Finally, row lazy memory indicates whether the tool can create memory objects on demand when they are first accessed, without their previous allocation (it assumes that the accesses to memory are valid). This feature is crucial when we want to execute a program by parts, without starting from the entry point. The meaning of the remaining rows should be clear or is explained later.
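For instance, the caching of solver calls amounts to a mechanism of roughly the following shape (a minimal sketch with a hypothetical canonical-formula key; Klee's actual caches are considerably more elaborate and can, e.g., exploit subsumption between queries):

#include <functional>
#include <map>
#include <string>

enum class Verdict { Sat, Unsat, Unknown };

// Memoize solver verdicts keyed by a canonical rendering of the query, so a
// repeated query is answered without calling the underlying solver again.
Verdict cachedSolve(const std::string& canonicalQuery,
                    const std::function<Verdict(const std::string&)>& solver,
                    std::map<std::string, Verdict>& cache) {
    auto it = cache.find(canonicalQuery);
    if (it != cache.end()) return it->second;  // cache hit
    Verdict v = solver(canonicalQuery);
    cache.emplace(canonicalQuery, v);
    return v;
}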
If an error is found by either tool, it is replayed on the unsliced code. If the
replay succeeds, we generate a violation witness. If no error is found and the anal-
ysis was complete, we generate a correctness witness. If the program correctness
was proved by Slowbeast with BSELF, we generate a witness containing the
computed invariants, otherwise we generate a trivial correctness witness as we
have no invariants at hand. In all other cases, Symbiotic 9 answers unknown.
Verification of Other Properties For verification of properties other than unreach-call, Symbiotic 9 uses the same workflow as Symbiotic 8 [7]. In
brief, the instrumentation module marks program instructions that can po-
tentially violate the considered property. The module employs suitable fast
static analyses to identify these instructions (e.g., when checking the property
no-overflow, it uses a range analysis to discover the instructions that may per-
form a signed integer overflow). The bitcode with marked instructions is sliced
such that the arguments and the reachability of these instructions are preserved.
The sliced bitcode is passed to Klee. If it discovers a property violation and
then replays it on the unsliced code, we produce a violation witness. If Klee
completes its analysis without any property violation found, we produce a trivial
correctness witness. In all other cases, Symbiotic 9 returns unknown.
Backward Symbolic Execution with Loop Folding (BSELF) [8] Slowbeast newly implements backward symbolic execution (BSE) [9], which explores
the program backward from target locations towards the initial location and
incrementally computes weakest preconditions for the explored program paths.
BSE is a valuable technique on its own as it precisely corresponds to k-induction
on control-flow paths [8]. Loop folding is a technique that aims to infer induc-
tive invariants during BSE. Roughly speaking, when BSE starts from an error
location and reaches a loop header, loop folding creates an initial invariant can-
didate that is disjoint with the current weakest precondition (i.e., the states that
can reach the error location). If the invariant candidate is actually an invariant,
we know that the error location is not reachable via the explored path. Oth-
erwise, a pre-image of the invariant candidate along a loop path is computed,
over-approximated, and added to the candidate. This process is repeated until an
invariant is found or until it fails for some reason, e.g., when it discovers that the
error location is actually reachable. Loop folding can infer complex disjunctive
invariants and since it uses the error states, it is also property-driven.
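Schematically, and greatly simplified (hypothetical predicate type and callbacks; the real algorithm [8] works per loop path and interacts with BSE), the refinement loop just described has the following shape:

#include <functional>

enum class FoldResult { Invariant, ErrorReachable, GaveUp };

// Loop folding, schematically: extend a candidate until it is inductive,
// the error location is found reachable, or the budget is exhausted.
template <typename Pred>
FoldResult foldLoop(Pred candidate,
                    const std::function<bool(const Pred&)>& isInductive,
                    const std::function<bool(const Pred&)>& errorReachable,
                    const std::function<Pred(const Pred&)>& overapproxPreImage,
                    const std::function<Pred(const Pred&, const Pred&)>& join,
                    int budget) {
    for (int i = 0; i < budget; ++i) {
        if (isInductive(candidate)) return FoldResult::Invariant;
        if (errorReachable(candidate)) return FoldResult::ErrorReachable;
        // Add an over-approximated pre-image of the candidate along a loop path.
        candidate = join(candidate, overapproxPreImage(candidate));
    }
    return FoldResult::GaveUp;
}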
String Analysis and Other Improvements The second major improvement in Symbiotic 9 is in the instrumentation for the property valid-memsafety. We have improved the analysis that identifies out-of-bounds array accesses. In Symbiotic 8, this analysis only determined whether an array access done via an index variable is in bounds [14]. The analysis in Symbiotic 9 also handles more general patterns where the array contains a concrete value (0 in the case of C strings) and the index pointer is incremented by one until it points to this concrete value, as well as patterns where the pointer is incremented a fixed number of times; see the examples below.
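For illustration, the following C fragments (hypothetical examples in the spirit of the benchmarks, not taken from them) show the two newly handled patterns:

#include <stddef.h>

/* Pattern 1: the index pointer is incremented by one until it points to a
 * concrete value contained in the array (0 in the case of C strings). */
size_t scan_to_terminator(const char *s) {
    const char *p = s;
    while (*p != 0)
        ++p;
    return (size_t)(p - s);
}

/* Pattern 2: the pointer is incremented a fixed number of times. */
const char *skip3(const char *s) {
    for (int k = 0; k < 3; ++k)
        ++s;
    return s;
}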
Further, we have extended the forward symbolic execution in Slowbeast to handle parallel programs. For now, this symbolic execution is highly inefficient, as it examines every interleaving of globally visible events; we plan to implement some reductions in the future. Slowbeast has also been extended to generate witnesses, as this functionality was missing. Notably, it can generate non-trivial correctness witnesses using the invariants computed by BSELF, whereas previous versions of Symbiotic generate only trivial correctness witnesses.
Slicing has also been improved. It now applies a fast and coarse slicing before the main slicing. The coarse slicing detects all basic blocks from which no slicing criterion (i.e., an instruction whose reachability and arguments should be preserved) is syntactically reachable and replaces them by calls to abort, as illustrated below.
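Schematically (a hypothetical fragment; reach_error stands for an instruction chosen as a slicing criterion):

#include <stdlib.h>

extern int  compute(int);
extern void log_statistics(void);
extern void reach_error(void);      /* the slicing criterion */

/* before coarse slicing */
void f_before(int c, int x) {
    if (c) {
        x = compute(x);
        if (x < 0) reach_error();   /* criterion syntactically reachable */
    } else {
        log_statistics();           /* cannot reach the criterion */
    }
}

/* after coarse slicing: the irrelevant block is replaced by abort() */
void f_after(int c, int x) {
    if (c) {
        x = compute(x);
        if (x < 0) reach_error();
    } else {
        abort();
    }
}

The symbolic executor therefore never explores the replaced block.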
2 Strengths and Weaknesses
Forward symbolic execution is unable to fully analyze unbounded loops or in-
finite execution paths. Hence, unless program slicing removes the unbounded
computation from the program, forward symbolic execution cannot verify it.
However, backward symbolic execution and BSELF can fully analyze at least
some unbounded programs [8]. Still, both these methods are computationally
complex as the number of paths they must search may be enormous and their
exploration may involve many non-trivial calls to the SMT solver. Therefore,
these methods do not scale to real-world programs.
A strong aspect of Symbiotic is the very interplay of fast static analyses
in the instrumentation, program slicing, and forward and backward symbolic
execution. Fast static analyses are able to deem correct many parts of the code
(with respect to the verified property). These parts of the code are then usu-
ally removed by slicing and only the possibly unsafe parts of the program (and
their dependencies) get into a symbolic executor. In this sense, Symbiotic does
incremental or conditional [3] verification.
Results of Symbiotic 9 in SV-COMP 2022 In SV-COMP 2022 [1], Symbiotic 9 won the categories MemSafety, SoftwareSystems, and Overall, and got the 3rd place in FalsificationOverall. Moreover, it produced 1529 correct answers that were not confirmed, which is the highest number in SV-COMP 2022; 1073 of these unconfirmed answers are in MemSafety-Juliet, where we produced some incorrect witnesses due to a bug, and another 258 are in Termination. Symbiotic 9 produced only 3 incorrect answers, caused by a bug in the replay mode of Slowbeast.
3 Software Project and Contributors
All components of Symbiotic 9 use llvm 10 [10]. The slicer and the instrumentation module are written in C++ and extensively use the library DG [5]. Klee is implemented in C++ and Slowbeast [12] is written in Python. Both symbolic executors use Z3 [11] as the SMT solver. Control scripts are written in Python.
Symbiotic 9 and all its components and external libraries are available under open-source licenses that comply with SV-COMP's policy for the reproduction of results. Symbiotic 9 participated in all categories of SV-COMP 2022 except the categories with Java programs.
Symbiotic 9 has been developed by Marek Chalupa, Vincent Mihalkovič, Anna Řechtáčková, and Lukáš Zaoral under the supervision of Jan Strejček.
Data Availability Statement. All data of SV-COMP 2022 are archived as described
in the competition report [1] and available on the competition web site. This includes
the verification tasks, results, witnesses, scripts, and instructions for reproduction.
The version of Symbiotic used in the competition is archived together with other
participating tools [2] and also in its own artifact [6] at Zenodo.
References
1. Beyer, D.: Progress on software verification: SV-COMP 2022. In: Proc. TACAS.
Springer (2022)
2. Beyer, D.: Verifiers and validators of the 11th Intl. Competition on Software Verifi-
cation (SV-COMP 2022). Zenodo (2022). https://doi.org/10.5281/zenodo.5959149
3. Beyer, D., Jakobs, M.: FRed: Conditional model checking via reducers and folders. In: SEFM 2020. LNCS, vol. 12310, pp. 113–132. Springer (2020). https://doi.org/10.1007/978-3-030-58768-0_7
4. Cadar, C., Dunbar, D., Engler, D.R.: KLEE: Unassisted and automatic generation of high-coverage tests for complex systems programs. In: OSDI. pp. 209–224. USENIX Association (2008), http://www.usenix.org/events/osdi08/tech/full_papers/cadar/cadar.pdf
5. Chalupa, M.: DG: Analysis and slicing of LLVM bitcode. In: ATVA 2020. LNCS, vol. 12302, pp. 557–563. Springer (2020), https://doi.org/10.1007/978-3-030-59152-6_33
6. Chalupa, M.: Symbiotic 9: String analysis and backward symbolic execution with
loop folding (artifact). Zenodo (2022). https://doi.org/10.5281/zenodo.5947909
7. Chalupa, M., Jašek, T., Novák, J., Řechtáčková, A., Šoková, V., Strejček, J.: Symbiotic 8: Beyond symbolic execution (competition contribution). In: TACAS 2021. LNCS, vol. 12652, pp. 453–457. Springer (2021). https://doi.org/10.1007/978-3-030-72013-1_31
8. Chalupa, M., Strejček, J.: Backward symbolic execution with loop folding. In: SAS 2021. LNCS, vol. 12913, pp. 49–76. Springer (2021). https://doi.org/10.1007/978-3-030-88806-0_3
9. Chandra, S., Fink, S.J., Sridharan, M.: Snugglebug: a powerful approach
to weakest preconditions. In: PLDI 2009. pp. 363–374. ACM (2009).
https://doi.org/10.1145/1542476.1542517
10. Lattner, C., Adve, V.S.: LLVM: A compilation framework for lifelong program
analysis & transformation. In: CGO 2004. pp. 75–88. IEEE Computer Society
(2004), https://doi.org/10.1109/CGO.2004.1281665
11. de Moura, L.M., Bjørner, N.: Z3: An efficient SMT solver. In: TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer (2008), https://doi.org/10.1007/978-3-540-78800-3_24
12. Slowbeast repository. https://gitlab.com/mchalupa/slowbeast (2021)
13. Weiser, M.: Program slicing. In: Proceedings of ICSE. pp. 439–449. IEEE (1981)
14. Řechtáčková, A.: Improving out-of-bound access checking in Symbiotic. Bachelor's thesis (2020), https://is.muni.cz/th/tmq7m/, accessed 2022-02-02
Symbiotic-Witch: A Klee-Based
Violation Witness Checker⋆
(Competition Contribution)
Paulína Ayaziová, Marek Chalupa, and Jan Strejček
Faculty of Informatics, Masaryk University, Brno, Czech Republic
{xayaziov,chalupa,strejcek}@fi.muni.cz
Abstract. Symbiotic-Witch is a new tool for checking violation wit-
nesses in the GraphML-based format used at SV-COMP since 2015.
Roughly speaking, Symbiotic-Witch symbolically executes a given pro-
gram with Klee and simultaneously tracks the set of nodes the witness
automaton can be in. Moreover, it reads the return values of nondeter-
ministic functions specified in the witness and uses them to prune the
symbolic execution. The violation witness is confirmed if the symbolic
execution reaches an error and the current set of witness nodes contains
a matching violation node.
Symbiotic-Witch currently supports violation witnesses of reachability safety, memory safety, memory cleanup, and overflow properties.
1 Verification Approach
We present a new checker of violation witnesses called Symbiotic-Witch. The
checker first loads a given violation witness in the GraphML format [5] and a
given program. Then it performs symbolic execution [11] of the program and
simultaneously tracks the progress of the execution in the witness automaton.
More precisely, every state of symbolic execution is accompanied by the set of
witness automaton nodes that can be reached under the executed program path.
If the symbolic execution detects a violation of the considered property and the
tracked set of witness automata nodes contains a violation node, the witness is
confirmed.
Note that the original description of the witness format [5] does not provide
any formal semantics of the format. We interpret it in the way that if an edge
in a witness automaton matches an executed program instruction, then we can
follow the edge but we can also stay in its starting node. Hence, if we have the
set of witness automaton nodes reached under a certain program path, then
prolongation of this path can add some nodes to this set, but it never removes
any node from the set. A brief reading of an upcoming detailed description of
the format [4] reveals that it can be the case that an edge matching an executed
program instruction has to be taken. If this is indeed the case, we will adjust
⋆ This work has been supported by the Czech Science Foundation grant GA19-24397S.
© The Author(s) 2022
D. Fisman and G. Rosu (Eds.): TACAS 2022, LNCS 13244, pp. 468–473, 2022.
https://doi.org/10.1007/978-3-030-99527-0_33
our tool, but the current implementation and the following text consider the former semantics.
Before Symbiotic-Witch starts the symbolic execution, we remove from
the witness automaton all nodes that are not on any path from the entry node
to a violation node. In general, witness automata are related to program exe-
cutions using node and edge attributes. Symbiotic-Witch currently supports
only some attributes of witness edges to map a program execution to a given
witness automaton. Namely, it uses the line number of executed instructions, the
information whether true or false branch is taken, and the information about
entering a function or returning from a function. Additionally, if the witness au-
tomaton contains a single path from the entry node to a violation node and there
is some information about return values of the __VERIFIER_nondet_* functions
on this path, then we use these values in the symbolic execution of the program.
Return values not provided in the witness are treated as symbolic values.
A more precise description of the approach can be found in the bachelor’s
thesis of P. Ayaziová [1].
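A minimal sketch of the node-set tracking described above (with a hypothetical automaton representation that matches edges on source lines only; the real matcher also uses branching and function entry/exit information):

#include <map>
#include <set>
#include <vector>

struct Edge {
    int target;
    int line;  // the edge matches instructions on this source line
};
using WitnessAutomaton = std::map<int, std::vector<Edge>>;  // node -> edges

// Extend the tracked node set on an executed instruction: matching edges may
// be followed, but each node may also be kept ("stay in the starting node"),
// so the set never shrinks along a program path.
void step(const WitnessAutomaton& aut, std::set<int>& nodes, int execLine) {
    std::set<int> reached;
    for (int n : nodes) {
        auto it = aut.find(n);
        if (it == aut.end()) continue;
        for (const Edge& e : it->second)
            if (e.line == execLine) reached.insert(e.target);
    }
    nodes.insert(reached.begin(), reached.end());
}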
2 Software Architecture
The approach has been implemented in a tool called Symbiotic-Witch, which
is basically a modification of the symbolic executor Klee [8]. More precisely,
it is derived from the clone of Klee used in Symbiotic, which employs the
SMT solver Z3 [13] and supports symbolic pointers, memory blocks of symbolic
sizes etc. For parsing of witnesses in the GraphML format, we use the library
RapidXML.
As Klee executes programs in llvm [12], a given C program has to be
translated to llvm first. We use Clang for this translation as explained in
Section 4.
The current version of Symbiotic-Witch runs on llvm version 10.0.0.
3 Strengths and Weaknesses
Existing violation witness checkers (excluding Dartagnan [10] designed for con-
current programs) can be roughly divided into two categories.
CPA-witness2test [6], FShell-witness2test [6], and Nitwit [14] per-
form one program execution based on the information in the witness. If this
execution violates the specification, the witness is confirmed. This approach
is very efficient for witnesses fully describing one program execution that vi-
olates the property. However, if a witness describes more program executions
and only some of them violate the property, these tools can easily miss the
violating executions. In particular, if a witness does not specify some return
value of a __VERIFIER_nondet_* function, FShell-witness2test uses the
default value 0, Nitwit picks a random value, and CPA-witness2test
fails the witness confirmation.
CPAchecker [5], UltimateAutomizer [5], and MetaVal [7] create a
product of a given witness automaton and the original program and analyze
it. As a result, some execution paths of the original program can be ana-
lyzed repeatedly for different paths in the witness automaton. To suppress
this effect, these checkers usually ignore the possibility of staying in a witness automaton node whenever there is a matching transition leaving the node. Unfortunately, a valid witness can remain unconfirmed due to this strategy.
We believe that our approach to checking violation witnesses removes all
mentioned disadvantages. Symbolic execution allows us to efficiently examine
many program executions corresponding to a given witness automaton, and pro-
gram executions are not analyzed repeatedly. The approach can easily handle
witnesses based on return values from the __VERIFIER_nondet_* functions as
well as those based on descriptions of branching.
There is only one principal case in which a valid witness is not confirmed by Symbiotic-Witch (ignoring the cases when Symbiotic-Witch simply runs out of resources). This case can arise when Symbiotic-Witch uses the information about return values of __VERIFIER_nondet_* functions stored in the witness. Symbiotic-Witch uses this information immediately when the symbolic execution calls such a function and there is a matching edge in the witness with a return value that has not been used yet (i.e., the starting node of the edge is in the set of tracked witness nodes and the target node is not). This “eager approach” usually works very well, especially for witnesses containing return values for all calls of __VERIFIER_nondet_* functions. However, there can be witnesses where some return values are missing and a particular provided return value should not be used for the first matching call of the __VERIFIER_nondet_* function. Such witnesses can be valid, but Symbiotic-Witch can fail to confirm them. As far as we know, such witnesses do not appear in SV-COMP, and other witness checkers would probably fail to confirm them as well.
On the negative side, our approach inherits the disadvantages and limitations of symbolic execution and Klee. In particular, it can suffer from the path explosion problem on witnesses that do not provide return values of __VERIFIER_nondet_* functions. Further, Symbiotic-Witch does not support parallel programs, as Klee does not support them.
Our current approach is suitable for cases when a witness can be checked based on a finite program execution. That is why our tool supports violation witnesses of safety properties. Table 1 shows the numbers of violation witnesses confirmed in SV-COMP 2022 [2] by individual witness checkers in the categories supported by Symbiotic-Witch.
We believe that symbolic execution can also be used for checking termination violation witnesses and for checking correctness witnesses. We plan to extend
Symbiotic-Witch in these directions. We also plan to add a witness refinement
mode [5] already provided by CPAchecker and UltimateAutomizer. In this
mode, when a witness is confirmed, Symbiotic-Witch would produce another
witness describing a single program execution (by specifying return values for all
calls of __VERIFIER_nondet_* functions) that exhibits the property violation.
Table 1. The numbers of confirmed witnesses in relevant SV-COMP 2022 categories

                       ReachSafety  MemSafety  NoOverflows  SoftwareSystems
  number of witnesses       26 797     16 984        2 808            2 102
  CPAchecker                14 908     12 594        2 334              621
  CPA-witness2test           8 628        231          887                6
  FShell-witness2test       14 168        954        1 436               33
  MetaVal                        0        116        1 982                0
  Nitwit                    15 507          –            –                0
  Symbiotic-Witch           11 176      8 394        2 609              179
  UltimateAutomizer          8 592      4 197        2 468               26
4 Tool Setup and Configuration
For the use in SV-COMP 2022, we have integrated our witness checker (originally called Witch-Klee) with Symbiotic [9], which takes care of the translation of a given C program into llvm using Clang and then slightly modifies the llvm program to improve the efficiency of witness checking.
The archive with Symbiotic-Witch can be downloaded from the SV-COMP archives. The witness checking process is invoked by
./symbiotic [--prp <prop>] [--32] --witness-check <wit.graphml> <prog.c>
where <wit.graphml> is the violation witness to be checked and <prog.c> is the corresponding program. By default, the tool considers the reachability safety property and a 64-bit architecture. The considered property can be changed by the --prp option, with <prop> instantiated to memsafety, memcleanup, or no-overflow. The 32-bit architecture is set by --32.
Our witness checker can also be downloaded directly from its repository mentioned below. The version used in SV-COMP 2022 is marked with the tag SV-COMP22. It can be executed without Symbiotic via a shell script as
./witch.sh <prog.c> <wit.graphml>
which calls Clang to translate <prog.c> to llvm and then passes the llvm program and the witness <wit.graphml> to the witness checker.
5 Software Project and Contributors
Symbiotic-Witch has been developed at the Faculty of Informatics, Masaryk University, by Paulína Ayaziová under the guidance of Marek Chalupa and Jan Strejček. The tool is available under the MIT license, and all used tools and libraries (llvm, Klee, Z3, RapidXML, Symbiotic) are also available under
open-source licenses that comply with SV-COMP’s policy for the reproduction
of results. The source code of our witness checker can be found at:
https://github.com/ayazip/witch-klee
Data Availability Statement. All data of SV-COMP 2022 are archived as described
in the competition report [2] and available on the competition web site. This includes
the verification tasks, results, witnesses, scripts, and instructions for reproduction. The
version of Symbiotic-Witch used in the competition is archived together with other
participating tools [3].
References
1. Ayaziová, P.: Klee-based error witness checker. Bachelor’s thesis, Masaryk Univer-
sity (2021), https://is.muni.cz/th/rnv19/?lang=en
2. Beyer, D.: Progress on software verification: SV-COMP 2022. In: Proc. TACAS.
Springer (2022)
3. Beyer, D.: Verifiers and validators of the 11th Intl. Competition on Software Verifi-
cation (SV-COMP 2022). Zenodo (2022). https://doi.org/10.5281/zenodo.5959149
4. Beyer, D., Dangl, M., Dietsch, D., Heizmann, M., Lemberger, T., Tautschnig, M.:
Verification witnesses. ACM Trans. Softw. Eng. Methodol. (2022), to appear.
5. Beyer, D., Dangl, M., Dietsch, D., Heizmann, M., Stahlbauer, A.: Witness valida-
tion and stepwise testification across software verifiers. In: Nitto, E.D., Harman,
M., Heymans, P. (eds.) Proceedings of the 2015 10th Joint Meeting on Foundations
of Software Engineering, ESEC/FSE 2015, Bergamo, Italy, August 30 - September
4, 2015. pp. 721–733. ACM (2015), https://doi.org/10.1145/2786805.2786867
6. Beyer, D., Dangl, M., Lemberger, T., Tautschnig, M.: Tests from witnesses -
execution-based validation of verification results. In: Dubois, C., Wolff, B. (eds.)
Tests and Proofs - 12th International Conference, TAP@STAF 2018, Toulouse,
France, June 27-29, 2018, Proceedings. Lecture Notes in Computer Science, vol.
10889, pp. 3–23. Springer (2018), https://doi.org/10.1007/978-3-319-92994-1_1
7. Beyer, D., Spiessl, M.: Metaval: Witness validation via verification. In: Lahiri, S.K.,
Wang, C. (eds.) Computer Aided Verification - 32nd International Conference,
CAV 2020, Los Angeles, CA, USA, July 21-24, 2020, Proceedings, Part II. Lecture
Notes in Computer Science, vol. 12225, pp. 165–177. Springer (2020), https://doi.
org/10.1007/978-3-030-53291-8_10
8. Cadar, C., Dunbar, D., Engler, D.R.: KLEE: Unassisted and automatic gener-
ation of high-coverage tests for complex systems programs. In: OSDI. pp. 209–
224. USENIX Association (2008), http://www.usenix.org/events/osdi08/tech/
full_papers/cadar/cadar.pdf
9. Chalupa, M., Řechtáčková, A., Mihalkovič, V., Zaoral, L., Strejček, J.: Symbiotic
9: Parallelism and invariants (competition contribution). In: Proc. TACAS (2).
Springer (2022)
10. Haas, T., Meyer, R., de León, H.P.: Dartagnan: SMT-based violation witness
validation (competition contribution). In: Proc. TACAS (2). Springer (2022)
11. King, J.C.: Symbolic execution and program testing. Communications of ACM
19(7), 385–394 (1976), https://doi.org/10.1145/360248.360252
12. Lattner, C., Adve, V.S.: LLVM: A compilation framework for lifelong program
analysis & transformation. In: CGO 2004. pp. 75–88. IEEE Computer Society
(2004), https://doi.org/10.1109/CGO.2004.1281665
13. de Moura, L.M., Bjørner, N.: Z3: an efficient SMT solver. In: TACAS
2008. LNCS, vol. 4963, pp. 337–340. Springer (2008), https://doi.org/10.1007/
978-3-540-78800-3_24
14. Švejda, J., Berger, P., Katoen, J.: Interpretation-based violation witness validation
for C: NITWIT. In: Biere, A., Parker, D. (eds.) Tools and Algorithms for the
Construction and Analysis of Systems - 26th International Conference, TACAS
2020, Held as Part of the European Joint Conferences on Theory and Practice
of Software, ETAPS 2020, Dublin, Ireland, April 25-30, 2020, Proceedings, Part
I. Lecture Notes in Computer Science, vol. 12078, pp. 40–57. Springer (2020),
https://doi.org/10.1007/978-3-030-45190-5_3
Theta: portfolio of CEGAR-based analyses with
dynamic algorithm selection (Competition
Contribution)
Zsófia Ádám¹, Levente Bajczi¹, Mihály Dobos-Kovács¹, Ákos Hajdu²,
and Vince Molnár¹ (✉)
¹ Department of Measurement and Information Systems,
Budapest University of Technology and Economics, Budapest, Hungary
molnarv@mit.bme.hu
² Meta Platforms Inc., London, United Kingdom
Abstract. Theta is a model checking framework based on abstraction
refinement algorithms. In SV-COMP 2022, we introduce: 1) reasoning at the source level via a direct translation from C programs; 2) support for
concurrent programs with interleaving semantics; 3) mitigation for non-
progressing refinement loops; 4) support for SMT-LIB-compliant solvers.
We combine all of the aforementioned techniques into a portfolio with
dynamic algorithm selection.
1 Verification Approach and Software Architecture
Theta [10] is a generic and configurable model checking framework written in
Java 11. A simplified version of the architecture (focusing on software verification
aspects) can be seen in Figure 1.
Fig. 1. Architecture of Theta. [Figure: C code is translated by an ANTLR parser into an XCFA, to which simplification passes are applied; the CEGAR analysis runs on top of an SMT interface (MathSAT, CVC4, Z3); result processing uses metadata from the translation to produce the verdict (safe/unsafe/unknown) and a witness.]
The input is a C program that is first translated to extended control-flow
automata (XCFA). Previously, Theta used LLVM [3], which had various advan-
tages, but its static single assignment (SSA) form proved overall disadvantageous
for abstraction-based algorithms. This year we use a new, direct translation (no
Jury member representing Theta at SV-COMP 2022.
© The Author(s) 2022
D. Fisman and G. Rosu (Eds.): TACAS 2022, LNCS 13244, pp. 474–478, 2022.
https://doi.org/10.1007/978-3-030-99527-0_34
intermediate language, no SSA form) via an ANTLR parser. Furthermore, the CFA is “extended” in the sense that, as of this year, we support concurrent programs via an analysis with interleaving semantics. After parsing, we apply various passes to the XCFA (e.g., large-block encoding or partial order reduction). The core of Theta is a CEGAR-based analysis framework targeting reachability properties via predicate and explicit analyses [8], along with interpolation- and Newton-based refinements [7]. This year, Theta added generic support for SMT solvers (including interpolation) via the SMT-LIB interface. At SV-COMP'22 we use CVC4 [4], MathSAT [6], and Z3 [9], where the latter is still used via its Java API as before. Finally, a verdict (safe, unsafe, or unknown) and a witness corresponding to the C program are produced (using metadata from the translation).
Fig. 2. Overview of the dynamic portfolio of Theta. [Figure: a decision tree. Decision points: “has floats”, “has bitvectors”, “has loops and cyclomatic complexity over 30”, “havocs over 5 and variables over 10”. Configurations (solver/domain/refinement) with internal timeouts: for floats, Mf/E/N (300 s) and Mf/PC/N, with C/E/N (300 s) and C/PC/N on solver issues; for bitvectors, M/E/S (300 s) and M/PC/B, with Z/E/N (300 s) and Z/PC/N on solver issues; for integers, Z/PC/B (30 s), Z/EA/S (400 s), Z/E/S (500 s), Z/PC/B, and Z/PB/B. Arrows marked “?” denote inconclusive results leading to the next configuration.]
Verification portfolio. Based on preliminary experiments and domain knowledge, we manually constructed a dynamic algorithm-selection portfolio [1] for SV-COMP'22, illustrated by Figure 2. Rounded white boxes correspond to decision points. We start by branching on the arithmetic (floats, bitvectors, integers). Under integers, there are further decision points based on the cyclomatic complexity and the number of havocs and variables. Grey boxes represent configurations, defining the solver / domain / refinement, in this order; lighter and darker grey represent explicit and predicate domains, respectively. Internal timeouts are written below the boxes; an unspecified timeout means that the configuration can use all the remaining time. The solver can be CVC4 (C) [4], MathSAT (M), MathSAT with floats (Mf) [6], or Z3 (Z) [9]. Abstract domains are explicit values (E), explicit values with all variables tracked (EA), Cartesian predicate abstraction (PC), or Boolean predicate abstraction (PB) [8]. Finally, the refinement can be Newton with weakest preconditions (N) [7], sequence interpolation (S), or backward binary interpolation (B) [8]. Arrows marked with a question mark (?) indicate an inconclusive result, which can happen due to timeouts or unknown results. Furthermore, this year's portfolio also includes a novel dynamic (run-time) check for refinement progress between iterations that can shut down potential infinite loops (by treating them as an unknown result) [1]. Note also that for solver issues (e.g., exceptions from the solver) we have different paths in some cases.
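Schematically, the selection amounts to a small decision procedure of the following shape (hypothetical types, in C++ for brevity although Theta itself is written in Java; the branching and timeouts only approximate Figure 2):

#include <string>
#include <utility>
#include <vector>

struct TaskFeatures {
    bool hasFloats = false;
    bool hasBitvectors = false;
    bool hasLoops = false;
    int cyclomaticComplexity = 0;
    int havocs = 0;
    int variables = 0;
};

// Each entry: configuration (solver/domain/refinement) and its internal
// timeout in seconds (0 = use all remaining time); later entries are
// fallbacks tried after inconclusive results.
using Portfolio = std::vector<std::pair<std::string, int>>;

Portfolio select(const TaskFeatures& f) {
    if (f.hasFloats)
        return {{"Mf/E/N", 300}, {"Mf/PC/N", 0}, {"C/E/N", 300}, {"C/PC/N", 0}};
    if (f.hasBitvectors)
        return {{"M/E/S", 300}, {"M/PC/B", 0}, {"Z/E/N", 300}, {"Z/PC/N", 0}};
    if (f.hasLoops && f.cyclomaticComplexity > 30)
        return {{"Z/PC/B", 30}, {"Z/E/S", 500}, {"Z/PB/B", 0}};
    if (f.havocs > 5 && f.variables > 10)
        return {{"Z/EA/S", 400}, {"Z/PC/B", 0}};
    return {{"Z/E/S", 500}, {"Z/PC/B", 0}};
}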
2 Strengths and Weaknesses
Theta currently targets ReachSafety and ConcurrencySafety with limited support for structs, arrays, and pointers, and with no support for dynamic memory allocation, mutexes, and recursion. Due to this, Theta fails for most tasks in ProductLines, Recursive, Heap, and Arrays. Out of the 6163 tasks, roughly two thirds can be translated, and there are 888 confirmed correct (541 safe, 347 unsafe), 116 unconfirmed correct, and only 15 incorrect (11 false positive, 4 false negative) results [5]. Note that almost all unsupported cases are detected and reported as an error, and we only have a few incorrect results due to subtle issues.
The main strength of the tool is the combination of algorithm selection (picking an algorithm based on the input) and portfolios (trying multiple algorithms until one succeeds). Out of the 1004 correct results, 315 could not be solved by the first configuration that the portfolio tried: before the eventual success, dynamic checks intervened for 181 internal timeouts, 72 solver issues (e.g., wrong models), 19 non-progressing refinements, and 74 other (unknown) faults.
Having a diverse portfolio also paid off. Bitvector and float arithmetic tasks were either solved by explicit analyses (with a mixture of interpolation- and Newton-based refinements) before predicate configurations were even tried, or, if the explicit analyses failed, the predicate configurations were unsuccessful too. The integer arithmetic required a more diverse configuration set: predicate abstraction solved roughly 48% of the tasks (45% Cartesian, 3% Boolean) and explicit analysis solved 52% (33% with empty precision, 19% with all variables tracked).
The SMT-LIB support provided a great improvement: previously we only had Z3, which still dominates the integer cases. However, all of the bitvector tasks were solved by MathSAT, making Z3 an unused backup. With floats, roughly half of the tasks were solved by MathSAT, while the other half needed CVC4 as a backup. Since floats are reduced to bitvectors, we did not rely on Z3, owing to its poor performance in our preliminary experiments.
The most successful subcategories are BitVectors, ControlFlow, Loops, and XCSP (38–45% correct), mostly because they use features of C that our frontend supports well. We plan to mitigate the high number of timeouts in the future with approximations (e.g., mixing integers and bitvectors) and further analyses (e.g., inferring loop invariants). We also have a significant number of unconfirmed results; we believe this can be improved by generating more compact witnesses.
This year Theta added support for sequential concurrency via a preprocessing step: it yields an encoding in which exploring all interleavings preserves inter-thread behaviors. The analyses treat consecutive non-global memory accesses as one atomic block, reducing the exploration of unnecessary total orders. A drawback of using preprocessing for partial order reduction instead of an on-line algorithm is the superfluous exploration of certain total orders; e.g., all interleavings of independent global memory accesses will also be explored. This is because such accesses might overlap with non-independent memory accesses at other times, and the preprocessing step is not aware of such details.
Using a wrapper, Theta integrates concurrency seamlessly with the existing framework (abstract domains, refinements), except for the error-location-based search [8] (used for non-concurrent cases), because the required distance metric is not well defined for concurrent programs. Instead, we opted to use a breadth-first search, which had outperformed depth-first strategies in preliminary tests. We theorize that this is because bugs are reachable within the first few instructions most of the time, but only via a specific total order. The performance for concurrent programs is still limited, though, and we plan to integrate a declarative approach in the future, which could be used for weakly ordered programs as well.
3 Tool Setup and Configuration
The competition contribution is based on Theta 3.0.0-svcomp22-v1³. Additionally, Theta uses CVC4 v1.9, MathSAT v5.6.6, and Z3 v4.5.0. The project's repository contains build instructions, but an archive with pre-built binaries for Ubuntu 20.04 (LTS) can be found at the SV-COMP repository⁴ and at Zenodo [2]. The toolchain requires the packages openjdk-11-jre-headless, libgomp1, and libmpfr-dev to be installed. The entry point of the toolchain is the script theta/theta-start.sh, which takes the verification task (a C program) as its only mandatory input and runs the portfolio. As additional arguments we use --portfolio COMPLEX --witness-only --loglevel RESULT. Further arguments are described in the readme included with the binaries.
4 Software Project
Theta is maintained by the Critical Systems Research Group⁵ of the Budapest University of Technology and Economics, with various contributors. The project is available open source on GitHub³ under an Apache 2.0 license.
Data Availability. The version of Theta used in this paper is available at [2].
Acknowledgment and Funding. The authors would like to thank Tamás Tóth, Milán Mondok, István Majzik, Zoltán Micskei, and András Vörös for their contributions to the project; and the competition organizers, especially Dirk Beyer, for their help during the preparation for SV-COMP. The research contributions of the authors from the Budapest Univ. of Tech. and Econ. were funded by the EC and NKFIH through the Arrowhead Tools project (EU grant No. 826452, NKFIH grant 2019-2.1.3-NEMZ ECSEL-2019-00003), and by the ÚNKP-21-2 New National Excellence Program of ITM from the NRDI Fund.
³ https://github.com/ftsrg/theta/releases/tag/svcomp22-v1
⁴ https://gitlab.com/sosy-lab/sv-comp/archives-2022/-/blob/main/2022/theta.zip
⁵ https://ftsrg.mit.bme.hu
References
1. Ádám, Zs.: Efficient techniques for formal verification of C programs. Bachelor's thesis, Budapest University of Technology and Economics (2021)
2. Ádám, Zs., Bajczi, L., Dobos-Kovács, M., Hajdu, Á., Molnár, V.: Theta: portfolio of CEGAR-based analyses with dynamic algorithm selection (competition contribution): Tool archive (data set) (2022). https://doi.org/10.5281/zenodo.5956737
3. Ádám, Zs., Sallai, Gy., Hajdu, Á.: Gazer-Theta: LLVM-based verifier portfolio with BMC/CEGAR (competition contribution). In: TACAS 2021, LNCS, vol. 12652, pp. 435–439. Springer (2021). https://doi.org/10.1007/978-3-030-72013-1_27
4. Barrett, C., Conway, C.L., Deters, M., Hadarean, L., Jovanović, D., King, T., Reynolds, A., Tinelli, C.: CVC4. In: CAV 2011, LNCS, vol. 6806, pp. 171–177. Springer (2011). https://doi.org/10.1007/978-3-642-22110-1_14
5. Beyer, D.: Progress on software verification: SV-COMP 2022. In: Proc. TACAS. Springer (2022)
6. Cimatti, A., Griggio, A., Schaafsma, B., Sebastiani, R.: The MathSAT5 SMT solver. In: TACAS 2013, LNCS, vol. 7795, pp. 93–107. Springer (2013). https://doi.org/10.1007/978-3-642-36742-7_7
7. Dobos-Kovács, M., Hajdu, Á., Vörös, A.: Bitvector support in the Theta formal verification framework. In: Proceedings of the 2nd Workshop on Validation and Verification of Future Cyber-Physical Systems (2021), in press.
8. Hajdu, Á., Micskei, Z.: Efficient strategies for CEGAR-based model checking. Journal of Automated Reasoning 64(6), 1051–1091 (2020). https://doi.org/10.1007/s10817-019-09535-x
9. de Moura, L., Bjørner, N.: Z3: An efficient SMT solver. In: TACAS 2008, LNCS, vol. 4963, pp. 337–340. Springer (2008). https://doi.org/10.1007/978-3-540-78800-3_24
10. Tóth, T., Hajdu, Á., Vörös, A., Micskei, Z., Majzik, I.: Theta: a framework for abstraction refinement-based model checking. In: FMCAD 2017. pp. 176–179 (2017). https://doi.org/10.23919/FMCAD.2017.8102257
Ultimate GemCutter
and the Axes of Generalization
(Competition Contribution)
Dominik Klumpp⋆¹, Daniel Dietsch¹, Matthias Heizmann¹,
Frank Schüssele¹, Marcel Ebbinghaus¹, Azadeh Farzan², and
Andreas Podelski¹
¹ University of Freiburg, Freiburg im Breisgau, Germany
klumpp@informatik.uni-freiburg.de
² University of Toronto, Toronto, Canada
Abstract. Ultimate GemCutter verifies concurrent programs using
the CEGAR paradigm, by generalizing from spurious counterexample
traces to larger sets of correct traces. We integrate classical CEGAR gen-
eralization with orthogonal generalization across interleavings. Thereby,
we are able to prove correctness of programs otherwise out-of-reach for
interpolation-based verification. The competition results show significant
advantages over other concurrency approaches in the Ultimate family.
1 Verification Approach
Ultimate GemCutter is a verification tool for concurrent programs based on
the CEGAR paradigm: It (1) picks a trace from the set of all program inter-
leavings (a possible “counterexample”), (2) proves correctness of this trace (the
counterexample is “spurious”), and (3) generalizes the proof to conclude that a
larger (usually infinite) set of traces is correct. Classically, CEGAR focuses on
generalization across traces with varying numbers of loop iterations, by finding
inductive loop invariants. GemCutter proposes additional generalization along
an orthogonal axis: across interleavings.
[Figure: the two axes of generalization. Along the iteration axis, the trace τ = a₁a₂b generalizes to the language L = (a₁a₂)*b, covering (a₁a₂)²b, (a₁a₂)³b, and so on; along the interleaving axis lies the equivalence class [τ], containing, e.g., a₁ba₂ and ba₁a₂; the closure cl(L) spans both.]
Concurrent programs contain many redundant interleavings of actions from different threads, i.e., interleavings with the same (input/output) behaviour. A naïve application of CEGAR requires explicit proofs of correctness for all these interleavings: intermediate states during the execution of redundant interleavings differ, and different interleavings hence often require different correctness proofs. GemCutter addresses this as illustrated in the figure above: We prove correctness of a trace τ, here τ = a₁a₂b, where a₁, a₂ are actions of the first thread,
⋆ Jury Member: Dominik Klumpp
© The Author(s) 2022
D. Fisman and G. Rosu (Eds.): TACAS 2022, LNCS 13244, pp. 479–483, 2022.
https://doi.org/10.1007/978-3-030-99527-0_35
and b is an action of the second thread. The proof of correctness is generated using Craig interpolation or similar techniques. We generalize this proof into a Floyd-Hoare automaton [8] to show that a regular language L (green area in the figure) of traces is correct. The new contribution is the subsequent generalization step: If a trace τ₁ differs from a (correct) trace τ₂ in L only by the ordering of independent statements, these traces are (Mazurkiewicz-)equivalent [3]. We conclude that τ₁ is also correct. Hence, the set of all such traces, denoted cl(L) (pink area), contains only correct traces. If the set of all program interleavings P is a subset of cl(L), we conclude that the program is correct.
To soundly make this conclusion, we need a suitable notion of independence between statements, which guarantees that the order of execution of two independent statements does not matter for program correctness. An intuitive sufficient condition is that neither statement writes to a memory location read or written by the other statement. If we cannot establish this condition syntactically, we use an SMT solver to check if executing the statements in either order is guaranteed to give the same result. We use information from the Floyd-Hoare automaton to refine this check in the style of conditional independence [5]. Such information can, for instance, express (but is not limited to) non-aliasing of pointers.
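The syntactic part of this check can be sketched as follows (a hypothetical statement representation; the SMT-based commutativity check and the conditional refinement are omitted):

#include <set>
#include <string>

// Hypothetical statement footprint: the memory locations it reads and writes.
struct Stmt {
    std::set<std::string> reads;
    std::set<std::string> writes;
};

static bool disjoint(const std::set<std::string>& a,
                     const std::set<std::string>& b) {
    for (const auto& x : a)
        if (b.count(x)) return false;
    return true;
}

// Sufficient syntactic condition: neither statement writes a location that
// the other statement reads or writes, so their execution order is irrelevant.
bool syntacticallyIndependent(const Stmt& s, const Stmt& t) {
    return disjoint(s.writes, t.writes) && disjoint(s.writes, t.reads) &&
           disjoint(t.writes, s.reads);
}

Only when this syntactic check fails does the more expensive SMT-based commutativity check come into play.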
However, the inclusion P ⊆ cl(L) is in general undecidable [3], as cl(L) may not be regular. We reverse our viewpoint to obtain a sufficient condition that can be effectively checked: Rather than adding all equivalent traces to L – thus obtaining cl(L) – we instead remove all but one trace of each equivalence class from P – yielding a reduction P′ of P (formally, cl(P′) = P). We use the sleep set technique [5] to remove transitions from an automaton for P to get an automaton that recognizes one such reduction P′. We then check whether the (regular) reduction P′ is included in the (regular) language L. If this inclusion P′ ⊆ L holds, it implies that P ⊆ cl(L) also holds, and the program is correct. If the inclusion does not (yet) hold, GemCutter picks another program trace and repeats the process, iteratively building up the language L of correct traces by taking the union of the Floyd-Hoare automata computed in all iterations.
A key feature of the reduction-based approach is that the generalization along the iteration and interleaving axes is combined not just additively but multiplicatively: In the geometrical intuition of the figure above, we do not just take the union of L (green area) with the equivalence class [τ] of τ (blue area), but consider all traces in cl(L) (the pink area spanned by both). Further, we heuristically try to pick a set of representatives in a way that harmonizes
with CEGAR generalization, i.e., a reduction P′ with simple loop invariants. To this end, we prefer representatives with context switches at all loop boundaries. Ideally, each thread performs one complete loop iteration and then hands control over to the next thread (the last thread hands back control to the first thread). Consider the following example program, with the postcondition x = y:

// Thread 1:
int x = 0;
for (int i = 0; i < N; ++i) {
    x += A[i];
}

// Thread 2:
int y = 0;
for (int j = 0; j < N; ++j) {
    y += A[j];
}

Here, a proof for the
set of all interleavings P, or for some inopportunely chosen reduction, needs invariants that capture the fact that x = Σ_{k=0}^{i} A[k], and similarly for y. Such invariants are usually not found by Craig interpolation. However, the loop invariant i = j ∧ x = y suffices for the reduction that places context switches at all loop boundaries. The general idea is that, for this kind of reduction, the proof often needs to summarize only the effect of a single loop iteration rather than of unboundedly many iterations (which may require quantifiers or non-linear arithmetic). Similar observations were first made by Farzan and Vandikas [4].
GemCutter furthermore aims to improve the efficiency of the proof check, i.e., the check whether a reduction P′ is a subset of the set of proven traces L. The state explosion problem of concurrent programs makes the computation of an automaton recognizing a reduction P′, as well as the subsequent inclusion check, prohibitively expensive. To address this, we implemented a form of persistent set reduction [5], which allows us to compute a more compact automaton recognizing P′. This results in a more time- and memory-efficient inclusion check.
Reductions that interact harmoniously with CEGAR generalization do not always allow for an efficient proof check, nor vice versa. In the ConcurrencySafety category, where correctness proofs may become complicated, we prioritize generalization by computing reductions that typically allow for simpler proofs (described above), even though proof checking for such reductions is often more expensive. By contrast, in the NoDataRace category we found proof assertions to be usually quite simple (often only expressing non-aliasing of pointers), so we prioritize faster proof checks (and postpone context switches as far as possible).
Implementation. GemCutter uses the libraries and the front-end of the Ultimate framework, and extends Ultimate with a new CEGAR loop implementation and new algorithms operating on finite automata. We represent programs P, reductions P′, and sets of proven traces L as finite automata. Ultimate constructs Floyd-Hoare automata (for L) only on-demand [7]. Due to the state explosion problem, GemCutter extends this approach to the program and the reduction. The necessary parts of the automata are constructed just-in-time during traversal by automata algorithms. Various techniques are implemented as instances of a few generic interfaces (on-demand automata, and visitors that monitor and guide automaton traversal) for flexibility: radically different algorithms can be created by configuring, exchanging and stacking interface implementations. The following techniques and optimizations (all used in SV-COMP) can be combined with each other independently: (i) sleep set reduction (see the sketch below); (ii) persistent set reduction; (iii) discovery and pruning of states that cannot reach accepting states; (iv) guidance towards representatives of a specific form, e.g. with context-switches at loop boundaries; and (v) inclusion check between automata.
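As an illustration of item (i), a textbook-style sleep set exploration (after Godefroid [5]) can be sketched as follows; enabled, succ, and independent are assumptions standing in for the program model, and, unlike GemCutter's on-demand automata, this simplistic recursive version assumes an acyclic (e.g., unrolled) state space:

def explore(state, sleep=frozenset()):
    done = []                             # actions already explored at state
    for a in enabled(state):
        if a in sleep:
            continue                      # a is provably redundant from here
        # inherited and previously explored actions stay asleep in the
        # successor only if they are independent of the chosen action a
        child_sleep = frozenset(b for b in set(sleep) | set(done)
                                if independent(a, b))
        explore(succ(state, a), child_sleep)
        done.append(a)

Every equivalence class of interleavings is still represented, but commuting reorderings of independent actions are pruned from the search.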
2 Strengths and Weaknesses
The main advantage over other concurrency approaches in Ultimate (in Automizer and Taipan) lies in the generalization across interleavings: Automizer
and Taipan typically require more complex proofs possibly out-of-reach for Craig interpolation and similar techniques. GemCutter performs significantly better, winning 3rd place in the ConcurrencySafety category (behind the bounded model checkers Deagle [6] and CSeq [10]) and 1st place in the NoDataRace demo category. For details, refer to the competition report [1].
Since our proof check decides a stronger condition (P′ ⊆ L), it might miss some cases in which the proof is actually sufficient, i.e., P ⊆ cl(L) holds. This is because P′ and L might contain different representatives for the same equivalence class of interleavings. This weakness cannot be resolved completely due to the undecidability of the inclusion P ⊆ cl(L). It can however be attenuated by considering other choices of representatives (other than preferring context-switches at loop boundaries) and exploring the effect. This choice is currently given as an input parameter; an approach that heuristically chooses a reduction based on the program structure might perform better. Our notion of independence between statements is currently ignorant of the specification being verified. We hope to extend our approach to take this into account. Finally, our approach (and implementation) can be easily extended with other reduction methods that correspond to more aggressive generalization along the interleaving axis.
Our approach only verifies programs with a bounded number of threads. GemCutter runs out of time or memory if it is unable to establish such an upper bound, e.g. for many benchmarks in pthread-ext/ or goblint-regression/.
3 Architecture, Setup, Configuration, and Project
GemCutter is part of the program analysis framework Ultimate³, written in Java and licensed under LGPLv3⁴. GemCutter version 0.2.2-839c364b requires Java 11 and Python 3.6. Its Linux version, binaries of the required SMT solvers⁵, and a Python wrapper script were submitted as a .zip archive. GemCutter is invoked with

./Ultimate.py --spec <p> --file <f> --architecture <a> --full-output

where <p> is an SV-COMP property file, <f> is an input C file, <a> is the architecture (32bit or 64bit), and --full-output enables verbose output to stdout. A violation witness may be written to the file witness.graphml. The benchmarking tool BenchExec [2] supports GemCutter through the tool-info module ultimategemcutter.py⁶. GemCutter participates in the ConcurrencySafety and NoDataRace categories, as declared in its SV-COMP benchmark definition file ugemcutter.xml⁷.
Data Availability. Our .zip archive is available online⁸ and on Zenodo [9].
³ ultimate.informatik.uni-freiburg.de and github.com/ultimate-pa/ultimate
⁴ www.gnu.org/licenses/lgpl-3.0.en.html
⁵ Z3 (github.com/Z3Prover/z3), CVC4 (cvc4.github.io) and MathSAT (mathsat.fbk.eu)
⁶ github.com/sosy-lab/benchexec/blob/main/benchexec/tools/ultimategemcutter.py
⁷ gitlab.com/sosy-lab/sv-comp/bench-defs/-/blob/main/benchmark-defs/ugemcutter.xml
⁸ gitlab.com/sosy-lab/sv-comp/archives-2022/-/blob/main/2022/ugemcutter.zip and git.io/JM69B
References
1. Beyer, D.: Progress on software verification: SV-COMP 2022. In: Proc. TACAS (2).
Springer (2022)
2. Beyer, D., Löwe, S., Wendler, P.: Reliable benchmarking: requirements and solutions. Int. J. Softw. Tools Technol. Transf. 21(1), 1–29 (2019). https://doi.org/10.1007/s10009-017-0469-y
3. Diekert, V., Rozenberg, G. (eds.): The Book of Traces. World Scientific (1995).
https://doi.org/10.1142/2563
4. Farzan, A., Vandikas, A.: Automated hypersafety verification. In: CAV (1). Lecture Notes in Computer Science, vol. 11561, pp. 200–218. Springer (2019). https://doi.org/10.1007/978-3-030-25540-4_11
5. Godefroid, P.: Partial-Order Methods for the Verification of Concurrent Systems -
An Approach to the State-Explosion Problem, Lecture Notes in Computer Science,
vol. 1032. Springer (1996). https://doi.org/10.1007/3-540-60761-7
6. He, F., Sun, Z., Fan, H.: Deagle: An SMT-based verifier for multi-threaded pro-
grams (competition contribution). In: Proc. TACAS (2). Springer (2022)
7. Heizmann, M., Chen, Y., Dietsch, D., Greitschus, M., Nutz, A., Musa, B., Schätzle, C., Schilling, C., Schüssele, F., Podelski, A.: Ultimate Automizer with an on-demand construction of Floyd-Hoare automata (competition contribution). In: TACAS (2). Lecture Notes in Computer Science, vol. 10206, pp. 394–398 (2017). https://doi.org/10.1007/978-3-662-54580-5_30
8. Heizmann, M., Hoenicke, J., Podelski, A.: Refinement of trace abstraction. In: SAS. Lecture Notes in Computer Science, vol. 5673, pp. 69–85. Springer (2009). https://doi.org/10.1007/978-3-642-03237-0_7
9. Klumpp, D., Dietsch, D., Heizmann, M., Schüssele, F., Ebbinghaus, M., Farzan, A., Podelski, A.: Ultimate GemCutter SV-COMP 2022 Competition Contribution (Nov 2021). https://doi.org/10.5281/zenodo.5956945
10. Sales, E., Coto, A., Inverso, O., Tuosto, E.: A prototype for data race detection in
CSeq 3 (competition contribution). In: Proc. TACAS (2). Springer (2022)
Open Access This chapter is licensed under the terms of the Creative Commons
Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/),
which permits use, sharing, adaptation, distribution and reproduction in any medium
or format, as long as you give appropriate credit to the original author(s) and the
source, provide a link to the Creative Commons license and indicate if changes were
made.
The images or other third party material in this chapter are included in the chapter’s
Creative Commons license, unless indicated otherwise in a credit line to the material. If
material is not included in the chapter’s Creative Commons license and your intended
use is not permitted by statutory regulation or exceeds the permitted use, you will need
to obtain permission directly from the copyright holder.
Wit4Java: A Violation-Witness Validator for Java
Verifiers (Competition Contribution)
Tong Wu1, Peter Schrammel2, and Lucas C. Cordeiro1
1University of Manchester, Manchester, United Kingdom
2University of Sussex, Brighton, and Diffblue Ltd, Oxford, United Kingdom
lucas.cordeiro@manchester.ac.uk
Abstract. We describe and evaluate a violation-witness validator for Java verifiers called Wit4Java. It takes a Java program with a safety property and the respective violation-witness output by a Java verifier to generate a new Java program whose execution deterministically violates the property. We extract the values of the program variables from the counterexample represented by the violation-witness and feed this information back into the original program. In addition, we provide two implementations for instantiating source programs by injecting counterexamples. Experimental results show that Wit4Java can correctly validate the violation-witnesses produced by JBMC and GDart in a few seconds.
Keywords: Witness Validation · Software Verification · Java Bytecode.
1 Overview
Witness validation is the process of checking whether the same results can be
reproduced independently according to the given program, specification, verifi-
cation result, and the generated witness, improving the trust level of the software
verifiers [2].
Here, we describe and evaluate a new violation-witness validator for Java programs called Wit4Java. We take an approach similar to Rocha et al. [5] and Beyer et al. [1] for C programs and apply it to Java programs. As a result, we implement Wit4Java as a Python script that creates a new Java program or a unit test case using Mockito with the program variable values extracted from the counterexample. As input, Wit4Java uses the violation-witness in the GraphML format to extract the values of the non-deterministic variables in Java programs. Lastly, Wit4Java runs the newly created program on the Java Virtual Machine (JVM) to check the assert statements.
There are some validators for C programs in the literature [6,12]. For example, NitWit is an interpretation-based witness validator that can execute each statement step-by-step without compiling the entire program [12]. The concept of MetaVal is to generate a new program based on the input and then use any checker to check for specifications [6]. CPA-witness2test and FShell-witness2test are execution-based validators for C programs that can process the witness in
GraphML format and generate a test harness that drives the program to the specification violation [1]. Rocha et al. focus on the counterexample produced by ESBMC [4], while CPA-witness2test and FShell-witness2test can process GraphML files. However, witness validation for SV-COMP's Java track [7] is still at an early stage. GWIT is another validator that uses assumptions to prune the search space for dynamic symbolic execution, limiting the analysis to paths where a given assumption holds [10,11].
Fig. 1. Wit4Java Architecture. The grey boxes represent the inputs and outputs, and
the white boxes represent the validation process.
2 Validation Approach
The architecture of Wit4Java is illustrated in Fig. 1. First, Wit4Java takes the Java program and the witness as input. Then, it uses the Python package NetworkX to read the graph content of the witness, extracts the counterexample values of the variables corresponding to the source program from the violation-witness, and saves them. After that, it generates new programs that contain the witness's assumptions. Finally, the validation process is performed by the JVM (using the -ea option) to check whether the execution of the generated program exhibits the detected assertion failure.
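The extraction step can be pictured with the following short Python sketch (our illustration rather than the exact wit4java.py code; the GraphML data keys assumption and startline follow the witness format shown in Listing 1.3 below):

import networkx as nx

def extract_assumptions(witness_path):
    g = nx.read_graphml(witness_path)
    assumptions = []                      # (startline, assumption) pairs
    for _, _, data in g.edges(data=True):
        if "assumption" in data:
            line = int(data.get("startline", -1))
            assumptions.append((line, data["assumption"]))
    return sorted(assumptions)  # e.g. [(13, 'v1 = 1;'), (14, 'v2 = 0;')]

These (line, assumption) pairs are exactly what the two implementations described next consume.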
There are two implementations (Wit4Java 1.0 and Wit4Java 2.0) to extract and use counterexamples. The first version saves them as tuples (linenum, counterexample). It then reads the source program and replaces the variables of the program statements with counterexamples whenever the line number and variable in the program match a tuple, thus generating a newly created Java program. In comparison, the second version records the data types and values of the counterexamples and saves them sequentially into two lists. Moreover, only the assumptions made in the witness for the non-deterministic variables (as determined by Verifier.nondet) are recorded. Then, it builds a unit test case and employs the Mockito framework to mock the Verifier.nondet calls in the source program, making them return deterministic counterexample values from the lists. This makes the execution of the source program follow the path described in the witness and eventually reach the violated property.
Listing 1.1. Analyzed program
int v1 = Verifier.nondetInt();
int v2 = Verifier.nondetInt();
assert v1 == v2;

Listing 1.2. Output of Wit4Java 1.0
int v1 = 1;
int v2 = 0;
assert v1 == v2;
We show examples for both implementations in Listings 1.1 to 1.4. Wit4Java 1.0 (the naive version) saves the counterexamples in the witness in line-number order. It directly replaces the variable values in the source program, thus generating a new program (cf. Listing 1.2). Wit4Java 2.0 (the Mockito version) generates a test case that returns the counterexample value when the mocked function is called (cf. Listing 1.4).
Listing 1.3. Violation witness
<edge source="203.167" target="207.186">
  <data key="originfile">Main.java</data>
  <data key="startline">13</data>
  <data key="assumption">v1 = 1;</data>
</edge>
<edge source="207.186" target="252.201">
  <data key="originfile">Main.java</data>
  <data key="startline">14</data>
  <data key="assumption">v2 = 0;</data>
</edge>
Listing 1.4. Output of Wit4Java 2.0
String[] List_type = {"int", "int"};
String[] List_value = {"1", "0"};
Mockito.mockStatic(Verifier.class);
int n = List_type.length;
OngoingStubbing<Integer> stubbing_int =
    Mockito.when(Verifier.nondetInt());
for (int i = 0; i < n; i++) {
  if ("int".equals(List_type[i])) {
    stubbing_int = stubbing_int
        .thenReturn(Integer.parseInt(List_value[i]));
  }
}
Main.main(new String[0]);
3 Discussion of Strengths and Weaknesses
Fig. 2 on the left compares the validation results of the two validation tools Wit4Java and GWIT. The former is based on version 1.0 (the naive version); the latter is based on violation-witnesses produced by GDart. The results indicate that Wit4Java successfully validated 140 out of 302 witnesses, while GWIT correctly validates 150 results. Version 2.0 handles counterexamples with different values for each iteration within a loop better than version 1.0: version 1.0 skips the counterexamples before the last iteration, whereas version 2.0 can fully use the counterexamples generated by each iteration. Fig. 2 on the right compares the validation results of the two versions of Wit4Java, which shows that version 2.0 (the Mockito version) has a better validation ability (168 out of 302), thereby outperforming both version 1.0 and GWIT. However, the tool can only handle witnesses with concrete counterexamples. There are two main reasons why Wit4Java reports unknown: JBMC [3,8] produces an empty witness, or the witness does not contain a counterexample for a non-deterministic value. Besides, validation for strings is not supported yet; strings occur in almost half of the witnesses, but JBMC does not yet output counterexample values for strings, so we were not able to test this. Generally, there are not enough
witnesses of high quality for testing the witness validator yet because JBMC
sometimes correctly terminates without producing a witness in SV-COMP. The
witness support in the Java verifiers requires further development work so that
they are able to produce complete violation witnesses whenever they terminate
with verdict false.
Fig. 2. Validation results based on 302 witnesses. The x-axis represents the names
of the two tools and the y-axis represents the number of witnesses. A green “false”
indicates a confirmed correct result.
4 Tool Setup and Configuration
The competition submission is based on Wit4Java version 1.0 (the naive version).³ For the competition [9], Wit4Java is called by executing the script wit4java.py. It reads .java source files and corresponding witnesses in the given benchmark directories. The answer is false if an assertion failure is found. As an example, we can validate a witness by executing the following command:

./wit4java.py -witness <path-to-sv-witnesses>/witness.graphml <path-to-sv-benchmarks>/java/jbmc-regression/return2

where witness.graphml indicates the witness to be validated, and return2 indicates the benchmark name. The BenchExec tool-info module is called wit4java.py, and the benchmark definition file is wit4java-validate-violation-witnesses.xml. NetworkX must be installed separately on the SV-COMP machines. If a validation task does not find a property violation, it returns unknown.
5 Software Project and Contributors
Tong Wu maintains Wit4Java. It is publicly available under a BSD-style license.
The source code is available at https://github.com/Anthonysdu/wit4java, and
instructions for running the tool are given in the README file.
³ https://github.com/Anthonysdu/wit4java
Acknowledgment
The work in this paper is partially funded by the EPSRC grants EP/T026995/1 and EP/V000497/1, the EU H2020 ELEGANT project (957286), and the Soteria project awarded by UK Research and Innovation under the Digital Security by Design (DSbD) Programme.
References
1. Beyer et al. “Tests from Witnesses - Execution-Based Validation of Verification Results”. In: Tests and Proofs - 12th International Conference, TAP@STAF. Vol. 10889. Lecture Notes in Computer Science. 2018, pp. 3–23. https://doi.org/10.1007/978-3-319-92994-1_1.
2. Beyer et al. “Witness validation and stepwise testification across software verifiers”. In: ESEC/FSE. 2015, pp. 721–733. https://doi.org/10.1145/2786805.2786867.
3. Cordeiro et al. “JBMC: A Bounded Model Checking Tool for Verifying Java Bytecode”. In: CAV. Vol. 10981. LNCS. 2018, pp. 183–190. https://doi.org/10.1007/978-3-319-96145-3_10.
4. Gadelha et al. “ESBMC v6.0: Verifying C Programs Using k-Induction and Invariant Inference - (Competition Contribution)”. In: TACAS. Vol. 11429. LNCS. 2019, pp. 209–213. https://doi.org/10.1007/978-3-030-17502-3_15.
5. Rocha et al. “Understanding Programming Bugs in ANSI-C Software Using Bounded Model Checking Counter-Examples”. In: IFM. Vol. 7321. LNCS. 2012, pp. 128–142. https://doi.org/10.1007/978-3-642-30729-4_10.
6. Dirk Beyer and Martin Spiessl. “MetaVal: Witness Validation via Verification”. In: CAV Part II. Vol. 12225. LNCS. 2020, pp. 165–177. https://doi.org/10.1007/978-3-030-53291-8_10.
7. Lucas C. Cordeiro, Daniel Kroening, and Peter Schrammel. “Benchmarking of Java Verification Tools at the Software Verification Competition (SV-COMP)”. In: ACM SIGSOFT Softw. Eng. Notes 43.4 (2018), p. 56. https://doi.org/10.1145/3282517.3282529.
8. Lucas C. Cordeiro, Daniel Kroening, and Peter Schrammel. “JBMC: Bounded Model Checking for Java Bytecode - (Competition Contribution)”. In: TACAS. Vol. 11429. LNCS. 2019, pp. 219–223. https://doi.org/10.1007/978-3-030-17502-3_17.
9. D. Beyer. “Progress on Software Verification: SV-COMP 2022”. In: Proc. TACAS. Springer, 2022.
10. Falk Howar and Malte Mues. “GWIT (Competition Contribution)”. In: Proc. TACAS (2). Springer, 2022.
11. Falk Howar and Malte Mues. Tudo-Aqua/gwit. https://github.com/tudo-aqua/gwit.
12. Jan Svejda, Philipp Berger, and Joost-Pieter Katoen. “Interpretation-Based Violation Witness Validation for C: NITWIT”. In: TACAS. Vol. 12078. LNCS. 2020, pp. 40–57. https://doi.org/10.1007/978-3-030-45190-5_3.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Author Index
Ádám, Zsófia II-474
Aiken, Alex I-338
Aizawa, Akiko I-87
Albert, Elvira I-201
Alur, Rajeev II-353
Amat, Nicolas I-505
Amendola, Arturo I-125
Asgaonkar, Aditya I-167
Ayaziová, Paulína II-468
Bainczyk, Alexander II-314
Bajczi, Levente II-474
Banerjee, Tamajit II-81
Barbosa, Haniel I-415
Barrett, Clark I-143,I-415
Becchi, Anna I-125
Beyer, Dirk I-561,II-375,II-429
Biere, Armin I-443
Birkmann, Fabian II-159
Blatter, Lionel I-303
Blicha, Martin I-524
Bork, Alexander II-22
Bortolussi, Luca I-281
Bozzano, Marco I-543,II-273
Brain, Martin I-415
Bromberger, Martin I-480
Bruyère, Véronique I-244
Bryant, Randal E. I-443,I-462
Bu, Lei II-408
Casares, Antonio II-99
Cassez, Franck I-167
Castro, Pablo F. I-396
Cavada, Roberto I-125
Chakarov, Aleksandar I-404
Chalupa, Marek II-462,II-468
Cimatti, Alessandro I-125,I-543,II-273
Cohl, Howard S. I-87
Cordeiro, Lucas C. II-484
Coto, Alex II-413
D’Argenio, Pedro R. I-396
Darulova, Eva I-303
de Pol, Jaco van II-295
Deifel, Hans-Peter II-159
Demasi, Ramiro I-396
Dey, Rajen I-87
Dietsch, Daniel II-479
Dill, David I-183
Dobos-Kovács, Mihály II-474
Dragoste, Irina I-480
Duret-Lutz, Alexandre II-99
Dwyer, Matthew B. II-440
Ebbinghaus, Marcel II-479
Fan, Hongyu II-424
Faqeh, Rasha I-480
Farzan, Azadeh II-479
Fedchin, Aleksandr I-404
Fedyukovich, Grigory I-524,II-254
Ferrando, Andrea I-125
Fetzer, Christof I-480
Fijalkow, Nathanaël I-263
Fuller, Joanne I-167
Gallo, Giuseppe Maria I-281
Garhewal, Bharat I-223
Giannakopoulou, Dimitra I-387
Giesl, Jürgen II-403
Gipp, Bela I-87
González, Larry I-480
Goodloe, Alwyn I-387
Gordillo, Pablo I-201
Greiner-Petter, André I-87
Grieskamp, Wolfgang I-183
Griggio, Alberto II-273
Guan, Ji II-3
Guilloud, Simon II-196
Guo, Xiao II-408
Haas, Thomas II-418
Hajdu, Ákos II-474
Hartmanns, Arnd II-41
Havlena, Vojtěch II-118
He, Fei II-424
Heizmann, Matthias II-479
Hensel, Jera II-403
Hernández-Cerezo, Alejandro I-201
Heule, Marijn J. H. I-443,I-462
Hovland, Paul D. I-106
Howar, Falk II-435,II-446
Hückelheim, Jan I-106
Huisman, Marieke II-332
Hujsa, Thomas I-505
Hyvärinen, Antti E. J. I-524
Imai, Keigo I-379
Inverso, Omar II-413
Jakobsen, Anna Blume II-295
Jonáš, Martin II-273
Kanav, Sudeep I-561
Karri, Ramesh I-3
Katoen, Joost-Pieter II-22
Katz, Guy I-143
Kettl, Matthias II-451
Klumpp, Dominik II-479
Koenig, Jason R. I-338
Koutavas, Vasileios II-178
Krämer, Jonas I-303
Kremer, Gereon I-415
Křetínský, Jan I-281
Krötzsch, Markus I-480
Krstić, Srđan II-236
Kunčak, Viktor II-196
Kupferman, Orna I-25
Kwiatkowska, Marta II-60
Lachnitt, Hanna I-415
Lam, Wing II-217
Lange, Julien I-379
Lauko, Henrich II-457
Laveaux, Maurice II-137
Leeson, Will II-440
Lemberger, Thomas II-451
Lengál, Ondřej II-118
Li, Xuandong II-408
Li, Yichao II-408
Lin, Yi I-64
Lin, Yu-Yang II-178
Loo, Boon Thau II-353
Lyu, Lecheng II-408
Majumdar, Rupak II-81
Mallik, Kaushik II-81
Mann, Makai I-415
Marinov, Darko II-217
Marx, Maximilian I-480
Mavridou, Anastasia I-387
Mensendiek, Constantin II-403
Meyer, Klara J. II-99
Meyer, Roland II-418
Mihalkovič, Vincent II-462
Milius, Stefan II-159
Mitra, Sayan I-322
Mohamed, Abdalrhman I-415
Mohamed, Mudathir I-415
Molnár, Vince II-474
Mues, Malte II-435,II-446
Murali, Harish K I-480
Murtovi, Alnis II-314
Namjoshi, Kedar S. I-46
Neider, Daniel I-263
Nenzi, Laura I-281
Neykova, Rumyana I-379
Niemetz, Aina I-415
Norman, Gethin II-60
Nötzli, Andres I-415
Ozdemir, Alex I-415
Padon, Oded I-338
Park, Junkil I-183
Parker, David II-60
Patel, Nisarg I-46
Paulsen, Brandon I-357
Pérez, Guillermo A. I-244
Perez, Ivan I-387
Pilati, Lorenzo I-125
Pilato, Christian I-3
Podelski, Andreas II-479
Ponce-de-León, Hernán II-418
Preiner, Mathias I-415
Pressburger, Tom I-387
Putruele, Luciano I-396
Qadeer, Shaz I-183
Quatmann, Tim II-22
Raha, Ritam I-263
Rakamarić, Zvonimir I-404
Raszyk, Martin II-236
Řechtáčková, Anna II-462
Reeves, Joseph E. I-462
Renkin, Florian II-99
Reynolds, Andrew I-415
Ročkai, Petr II-457
Rot, Jurriaan I-223
Roy, Rajarshi I-263
Roy, Subhajit I-3
Rubio, Albert I-201
Rungta, Neha I-404
Safari, Mohsen II-332
Şakar, Ömer II-332
Sales, Emerson II-413
Santos, Gabriel II-60
Scaglione, Giuseppe I-125
Schmuck, Anne-Kathrin II-81
Schneider, Joshua II-236
Schrammel, Peter II-484
Schubotz, Moritz I-87
Schüssele, Frank II-479
Sharygina, Natasha I-524
Sheng, Ying I-415
Shenwald, Noam I-25
Shi, Lei II-353
Shoham, Sharon I-338
Sickert, Salomon II-99
Siegel, Stephen F. I-106
Šmahlíková, Barbora II-118
Sølvsten, Steffan Christ II-295
Soudjani, Sadegh II-81
Spiessl, Martin II-429
Staquet, Gaëtan I-244
Steffen, Bernhard II-314
Strejček, Jan II-462, II-468
Sun, Dawei I-322
Sun, Zhihang II-424
Tabajara, Lucas M. I-64
Tacchella, Alberto I-125
Takhar, Gourav I-3
Thomasen, Mathias Weller Berg II-295
Tinelli, Cesare I-415
Tonetta, Stefano I-543
Traytel, Dmitriy II-236
Trost, Avi I-87
Tuosto, Emilio II-413
Tzevelekos, Nikos II-178
Ulbrich, Mattias I-303
Vaandrager, Frits I-223
Vardi, Moshe Y. I-64
Vozarova, Viktoria I-543
Wang, Chao I-357
Wang, Hao II-217
Wang, Yuepeng II-353
Weidenbach, Christoph I-480
Wesselink, Wieger II-137
Wijs, Anton II-332
Willemse, Tim A. C. II-137
Wißmann, Thorsten I-223
Wu, Haoze I-143
Wu, Tong II-484
Wu, Wenhao I-106
Xie, Tao II-217
Xie, Zhunyi II-408
Xu, Meng I-183
Yi, Pu II-217
Youssef, Abdou I-87
Yu, Nengkun II-3
Zamboni, Marco I-125
Zaoral, Lukáš II-462
Zeljić, Aleksandar I-143
Zhao, Jianhua II-408
Zhong, Emma I-183
Zilio, Silvano Dal I-505
Zingg, Sheila II-236
Zlatkin, Ilia II-254
Zohar, Yoni I-415