Article

Distributed Snapshots: Determining Global States of Distributed Systems


Abstract

This paper presents an algorithm by which a process in a distributed system determines a global state of the system during a computation. Many problems in distributed systems can be cast in terms of the problem of detecting global states. For instance, the global state detection algorithm helps to solve an important class of problems: stable property detection. A stable property is one that persists: once a stable property becomes true, it remains true thereafter. Examples of stable properties are “computation has terminated,” “the system is deadlocked,” and “all tokens in a token ring have disappeared.” The stable property detection problem is that of devising algorithms to detect a given stable property. Global state detection can also be used for checkpointing.
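The marker protocol that this abstract describes admits a compact simulation. The sketch below is illustrative only: the `Process` class, the additive-counter state, and the two-process demo are our own assumptions, not the paper's presentation. Channels are modeled as FIFO queues; a process records its own state on the first marker it sees, then records each incoming channel until that channel's marker arrives.

```python
from collections import deque

MARKER = "MARKER"

class Process:
    def __init__(self, pid, peers):
        self.pid = pid
        self.peers = peers           # processes we share a channel with
        self.state = 0               # toy local state: a running sum
        self.recorded_state = None   # local state captured for the snapshot
        self.recording = {}          # incoming channel -> in-flight messages
        self.done_channels = set()   # channels whose marker has arrived

    def _record_and_flood(self, send):
        self.recorded_state = self.state
        self.recording = {p: [] for p in self.peers}
        for p in self.peers:
            send(self.pid, p, MARKER)

    def start_snapshot(self, send):
        self._record_and_flood(send)

    def receive(self, src, msg, send):
        if msg == MARKER:
            if self.recorded_state is None:
                self._record_and_flood(send)  # first marker seen
            self.done_channels.add(src)       # channel from src is now closed
        else:
            self.state += msg                 # ordinary application message
            if self.recorded_state is not None and src not in self.done_channels:
                self.recording[src].append(msg)  # in flight at snapshot time

def run_demo():
    # Two processes connected by FIFO channels in both directions.
    chans = {(0, 1): deque(), (1, 0): deque()}
    send = lambda s, d, m: chans[(s, d)].append(m)
    procs = {0: Process(0, [1]), 1: Process(1, [0])}
    send(0, 1, 5)               # sent before P0's marker: pre-snapshot
    procs[0].start_snapshot(send)
    send(0, 1, 7)               # sent after P0's marker: post-snapshot
    while any(chans.values()):  # deliver until the network is quiet
        for (s, d), q in list(chans.items()):
            if q:
                procs[d].receive(s, q.popleft(), send)
    return procs
```

In `run_demo`, message 5 is sent before P0's marker and message 7 after it; the resulting snapshot records P0 at state 0, P1 at state 5, and both channels empty, which is a consistent global state.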


... The rest of this note is organized as follows. In Sect. 2 we review the snapshot algorithm of Chandy and Lamport [8], emphasizing the reasoning-as-if nature of this algorithm. In Sect. 3 we review the local determinism method of Soloveichik and Winfree [9], again emphasizing the reasoning-as-if nature of the method. ...
... The snapshot algorithm was designed by Chandy and Lamport [8], and a colorful description of it, which we follow here, was provided by Dijkstra [10]. In it the algorithm assembles a "snapshot" of a possible but unlikely global system state in order to reason about the properties of the system. ...
... Chandy and Lamport use the analogy of photographers watching a sky filled with migrating birds to explain the algorithm [8]. The scene is vast, and the birds are in constant motion; no single photo suffices. ...
Chapter
Full-text available
It is occasionally useful to reason as if something were true, even when we know that it is almost certainly not true. We discuss two instances, one in distributed computing and one in tile self-assembly, and suggest directions for further investigation of this method.
... For example, when a message m from P_l that brings the most recent checkpoint interval of P_k through a causal path is received by P_i, we need to update Pre_i^i[k][l][0] to the newer checkpoint interval of P_k. We also need to update Pre_i^i[k][l][1] to the most recent interval of P_k through a c-causal path that is known to P_i, since m does not bring a newer checkpoint interval of P_k through a c-causal path. When a message m from P_l that brings the most recent checkpoint interval of P_k through a c-causal path is received by P_i, we need to update Pre_i^i[k][l][1] with the newer checkpoint interval of P_k. ...
... We also need to update Pre_i^i[k][l][1] to the most recent interval of P_k through a c-causal path that is known to P_i, since m does not bring a newer checkpoint interval of P_k through a c-causal path. When a message m from P_l that brings the most recent checkpoint interval of P_k through a c-causal path is received by P_i, we need to update Pre_i^i[k][l][1] with the newer checkpoint interval of P_k. We also need to update Pre_i^i[k][l][0] to the most recent interval of P_k through a causal path that is known to P_i, since m does not bring a newer checkpoint interval of P_k through a causal path. ...
... Pre_i^j[k][l][2] : 1 ≤ k, l ≤ n and Pre_i^j[k][l][3] : 1 ≤ k, l ≤ n are the checkpoint intervals of P_l and P_j, respectively, when the message m is sent from P_l and arrives at P_j. Similarly, we have Pre … [1] in its causal past. We suppose the message not from the most recent interval of P_λ to P_h is m_1 and the one from the most recent interval of P_λ to P_h is m_2. ...
... Chandy and Lamport [15] proposed a global snapshot-taking algorithm for distributed systems. Their work is one of the fundamental protocols in this regard. ...
... Thus, they conclude with complete and proven evidence of what the resulting distributed algorithms will obtain. The researchers also contributed to clarifying the semantic relationships between the different snapshot algorithms, in addition to refining the models of these algorithms [15,16,25]. ...
Article
Full-text available
Distributed global snapshot (DGS) is one of the fundamental protocols in distributed systems. It is used for different applications, such as collecting information from a distributed system and taking checkpoints for process rollback. The Chandy–Lamport protocol (CLP) is the best-known protocol for taking a DGS. The main aim of this protocol was to generate consistent cuts without interrupting the regular operation of the distributed system, and it inspired many later protocols. The first aim of this paper is to propose a novel formal hierarchical parametric colored Petri net model of CLP, in which the number of constituting processes is a parameter. The second aim is to automatically generate a novel message sequence chart (MSC) showing the detailed steps of each simulation run of the snapshot protocol. The third aim is model checking of the proposed formal model to verify the correctness of CLP and of our colored Petri net model. Such tools greatly help in testing the correct operation of newly proposed distributed snapshot protocols: the model of CLP can be used for visually testing the correct operation of future under-development DGS protocols, and it permits formal verification of their correct operation. The model can also serve as a simple, powerful, and visual tool for step-by-step runs of CLP, for model checking, and for teaching the protocol to postgraduate students. The same approach applies to similarly complicated distributed protocols.
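The consistent-cut criterion that CLP is designed to satisfy can be stated executably: a cut is consistent iff it contains no receive whose matching send lies outside the cut. The checker below is a hypothetical sketch with its own event-indexing convention; it is not part of CLP or of the Petri net model.

```python
def is_consistent(cut, messages):
    """cut: {pid: number of initial events included for that process}.
    messages: (send_pid, send_idx, recv_pid, recv_idx) tuples, where idx is
    the 0-based position of the send/receive event on its own process.
    A cut is consistent iff every receive it contains has its send inside too.
    """
    for s_pid, s_idx, r_pid, r_idx in messages:
        received_inside = r_idx < cut[r_pid]
        sent_inside = s_idx < cut[s_pid]
        if received_inside and not sent_inside:
            return False  # an "orphan" receive makes the cut inconsistent
    return True
```

For example, if P0's second event (index 1) sends a message received as P1's first event (index 0), a cut that includes the receive but not the send is inconsistent, while enlarging the cut to include the send restores consistency.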
... Concretely, if event e1 potentially causes event e2 to happen, a logical clock ensures the clock value of e1 is smaller than the clock value of e2. Logical clocks have been applied to a wide range of distributed applications, including mutual exclusion [22], consistent distributed snapshots [9], eventual consistency [15], and causally consistent data stores [27,46,54]. ...
... Lamport showed in his original paper how Lamport clocks can be used to implement distributed mutual exclusion [22]. Vector clocks are applied to realize consistent distributed snapshots [9]. Amazon leverages version vectors, a type of vector clocks, to build eventually consistent data store Dynamo [15]. ...
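As a minimal illustration of how vector clocks capture the causality needed for consistent snapshots, here is a sketch; the function names and the two-process demo are our own, not from any cited system.

```python
def vc_new(n):
    # Fresh clock for a system of n processes.
    return [0] * n

def vc_tick(vc, i):
    # Local or send event at process i: advance own entry.
    out = vc[:]
    out[i] += 1
    return out

def vc_merge(local, received, i):
    # On message receipt: component-wise max, then tick the receiver's entry.
    return vc_tick([max(a, b) for a, b in zip(local, received)], i)

def happened_before(a, b):
    # a -> b iff a <= b component-wise and a != b; otherwise concurrent/equal.
    return all(x <= y for x, y in zip(a, b)) and a != b
```

`vc_tick` is applied on local and send events, `vc_merge` on receipt, and `happened_before` is the component-wise causality test; two timestamps for which it fails in both directions label concurrent events.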
Preprint
Full-text available
Logical clocks are a fundamental tool to establish causal ordering of events in a distributed system. They have been applied in weakly consistent storage systems, causally ordered broadcast, distributed snapshots, deadlock detection, and distributed system debugging. However, prior logical clock constructs fail to work in an open network with Byzantine participants. In this work, we present Chrono, a novel logical clock system that targets such challenging environment. We first redefine causality properties among distributed processes under the Byzantine failure model. To enforce these properties, Chrono defines a new validator abstraction for building fault-tolerant logical clocks. Furthermore, our validator abstraction is customizable: Chrono includes multiple backend implementations for the abstraction, each with different security-performance trade-offs. We have applied Chrono to build two decentralized applications, a mutual exclusive service and a weakly consistent key-value store. Chrono adds only marginal overhead compared to systems that tolerate no Byzantine faults. It also out-performs state-of-the-art BFT total order protocols by significant margins.
... The self-organizing and self-regulating patterns sense and counteract fluctuations to maintain their stability [3]. On the other hand, current digital computing structures, when deployed as a distributed system with asynchronous communication between them, are not stable under large fluctuations in their resource availability or demand [6][7][8]. An application is often executed by several distributed software components using computing resources often owned and managed by different providers, and the assurance of end-to-end process sustenance with adequate resources, its stability, safety, security, and compliance with global requirements requires a complex layer of additional processes that increases complexity, leading to the 'who manages the managers' conundrum. ...
... In this example of an event-driven transaction transformation we used the data file from the text "Data Science Projects with Python" [32] provided through the GitHub repository [33]. This data contains credit card customer information over six months. Over this time, several variables are correlated to predict the likelihood of whether any given customer will default on the credit card payment for the seventh month. ...
Preprint
Full-text available
Digital machine intelligence has evolved from its inception in the form of computation of numbers to AI, which is centered around performing cognitive tasks that humans can perform, such as predictive reasoning or complex calculations. The state of the art includes tasks that are easily described by a list of formal, mathematical rules or a sequence of event-driven actions, such as modeling, simulation, business workflows, and interaction with devices, as well as tasks that are easy to do “intuitively” but hard to describe formally or as a sequence of event-driven actions, such as recognizing spoken words or faces. While these tasks are impressive, they fall short in applying common-sense reasoning to new situations, filling in information gaps, or understanding and applying unspoken rules or norms. Human intelligence uses both associative memory and event-driven transaction history to make sense of what it is observing fast enough to do something about it while still observing it. In addition to this cognitive ability, all biological systems exhibit autopoiesis and self-regulation. In this paper, we demonstrate how machine intelligence can be enhanced to include both associative memory and event-driven transaction history to create a new class of knowledge-based assistants that augment human intelligence. The digital assistants use global knowledge derived from Large Language Models to bridge the knowledge gap between the various participants interacting with each other. We use the general theory of information and schema-based knowledge representation to create the memory and history of the various transactions involved in the interactions.
... These streaming operators need to process critical events as barriers (e.g., watermark [13], punctuation [27], etc.) between their causally dependent past events (bearing timestamps earlier than the critical events) and causally pending future events. Ensuring these orders is important to (i) produce the correct result (Figure 3, left) and (ii) carry out critical functionalities, including checkpointing [21,29] and reconfigurations [66,67] (Figure 3, right). Modeling a streaming operator as a serverless function poses a significant restriction on autoscaling to existing serverless frameworks, making them unable to satisfy R3. ...
... SYNC_ONE corresponds to the global processing barrier (Section 3) that needs to synchronize across all upstream actors. Scheduling policies can also chain SYNC_ONE between each pair of upstream/downstream actors to implement distributed snapshots (e.g., checkpointing [29], reconfiguration [66], etc.). If a SYNC_ONE barrier was formed by critical messages CM_i from upstream actors U_i, 0 ≤ i ≤ N, we define the dependency set of the barrier as follows: ...
Preprint
Full-text available
We propose Dirigo, a distributed stream processing service built atop virtual actors. Dirigo achieves both a high level of resource efficiency and performance isolation driven by user intent (SLO). To improve resource efficiency, Dirigo adopts a serverless architecture that enables time-sharing of compute resources among streaming operators, both within and across applications. Meanwhile, Dirigo improves performance isolation by inheriting the property of function autoscaling from serverless architecture. Specifically, Dirigo proposes (i) dual-mode actor, an actor abstraction that dynamically provides orderliness guarantee for streaming operator during autoscaling and (ii) a data plane scheduling mechanism, along with its API, that allows scheduling and scaling at the message-level granularity.
... Coordinated checkpointing involves synchronizing all processes and taking a checkpoint of the entire system state to enable recovery in case of failures [Janakiraman and Tamir 1994, Chandy and Lamport 1985, Tamir and Sequin 1984]. In coordinated checkpointing, all processes must agree on a safe execution point in order to establish consistent checkpointing for all participants. ...
... Other research also addresses scenarios where interacting processes in a distributed environment must reach a stable and consistent state. For instance, in [Chandy and Lamport 1985], the authors propose a protocol by which processes can determine a global state of the system during a distributed computation. Important problems can be cast in terms of the problem of detecting global states, for example, computation termination, deadlock detection, and, of special interest for this paper, the definition of a global and consistent state among processes. ...
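To make the stable-property example concrete: once a global state has been recorded, deadlock can be detected by looking for a cycle in the recorded wait-for graph. The checker below is an illustrative sketch (the single-resource `wait_for` encoding is our assumption); it is sound precisely because deadlock is stable, so a cycle present in a recorded consistent snapshot still exists in the actual state.

```python
def has_deadlock(wait_for):
    """wait_for: {pid: pid it is blocked on, or None if not blocked},
    as recorded in a consistent global snapshot. With single-resource
    waits, deadlock is exactly a cycle in this graph."""
    for start in wait_for:
        seen = set()
        node = start
        while node is not None and node not in seen:
            seen.add(node)
            node = wait_for.get(node)  # follow the wait-for edge
        if node is not None:           # the chain returned to a seen node
            return True
    return False
```

A chain that ends at an unblocked process (`None`) is harmless; only a chain that loops back on itself reports deadlock.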
Conference Paper
This paper concisely reviews checkpointing techniques in distributed systems, focusing on various aspects such as coordinated and uncoordinated checkpointing, incremental checkpoints, fuzzy checkpoints, adaptive checkpoint intervals, and kernel-based and user-space checkpoints. The review highlights interesting points, outlines how each checkpoint approach works, and discusses their advantages and drawbacks. It also provides a brief overview of the adoption of checkpoints in different contexts in distributed computing, including Database Management Systems (DBMS), State Machine Replication (SMR), and High-Performance Computing (HPC) environments. Additionally, the paper briefly explores the application of checkpointing strategies in modern cloud and container environments, discussing their role in live migration and application state management. The review offers valuable insights into their adoption and application across various distributed computing contexts by summarizing the historical development, advances, and challenges in checkpointing techniques.
... • Chandy-Lamport Snapshot Algorithm [6]: It operates by taking snapshots of the local states of processes and the states of communication channels. ...
Preprint
Full-text available
Given an undirected graph, the $k$-core is a subgraph in which each node has at least $k$ connections; it is widely used in graph analytics to identify core subgraphs within a larger graph. The sequential $k$-core decomposition algorithm faces limitations due to memory constraints, and data graphs can be inherently distributed. A distributed approach is proposed to overcome these limitations by allowing each vertex to compute independently using only local information. This paper explores the experimental evaluation of a distributed $k$-core decomposition algorithm. By treating each vertex as a client, i.e., a single computing unit, we simulate the process using Golang, leveraging its Goroutines and message passing. Because real-world data graphs can be large, with millions of vertices, it is expensive to build such a distributed environment with millions of clients in a real-life scenario. Therefore, our experimental simulation can effectively evaluate the running time and message passing of the distributed $k$-core decomposition.
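The vertex-local estimation idea (each vertex repeatedly lowers its core estimate using only its neighbors' current estimates) can be conveyed with a small sequential simulation. This is an illustrative sketch, not the paper's Golang implementation: message passing is replaced by a shared estimate table, and the H-index-style update rule is a known locality-based scheme rather than the paper's exact protocol.

```python
def core_numbers(adj):
    """adj: {v: set of neighbors}. Each pass updates a vertex using only
    its neighbors' current estimates, mimicking the message-passing model;
    estimates only decrease, and the fixpoint gives the core numbers."""
    est = {v: len(adj[v]) for v in adj}   # degree is an upper bound
    changed = True
    while changed:
        changed = False
        for v in adj:
            # Largest k such that at least k neighbors have estimate >= k.
            vals = sorted((est[u] for u in adj[v]), reverse=True)
            k = 0
            for i, x in enumerate(vals, start=1):
                if x >= i:
                    k = i
            k = min(k, est[v])
            if k < est[v]:
                est[v] = k
                changed = True
    return est
```

On a triangle every vertex ends at estimate 2 (the 2-core), while on a path every vertex converges to 1.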
... Checkpointing [41] has been studied in distributed systems with various goals [14,17,24,30,31,35,51]. We summarize the related works into the following three categories. ...
Preprint
Full-text available
Model checkpoints are critical Deep Learning (DL) artifacts that enable fault tolerance for training and downstream applications, such as inference. However, writing checkpoints to persistent storage, and other I/O aspects of DL training, are mostly ignored by compute-focused optimization efforts for faster training of rapidly growing models and datasets. Towards addressing this imbalance, we propose FastPersist to accelerate checkpoint creation in DL training. FastPersist combines three novel techniques: (i) NVMe optimizations for faster checkpoint writes to SSDs, (ii) efficient write parallelism using the available SSDs in training environments, and (iii) overlapping checkpointing with independent training computations. Our evaluation using real world dense and sparse DL models shows that FastPersist creates checkpoints in persistent storage up to 116x faster than baseline, and enables per-iteration checkpointing with negligible overhead.
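The overlap technique in (iii) can be illustrated with a generic pattern: snapshot the state in memory, then persist it on a background thread while computation continues. This sketch is not FastPersist's implementation; the function name and the pickle-based serialization are our assumptions.

```python
import os
import pickle
import threading

def checkpoint_async(state, path):
    """Freeze the state in memory, then persist it on a background thread
    so the caller (e.g., a training loop) can continue immediately."""
    frozen = pickle.dumps(state)     # cheap in-memory snapshot of the state
    def write():
        tmp = path + ".tmp"
        with open(tmp, "wb") as f:
            f.write(frozen)
            f.flush()
            os.fsync(f.fileno())     # make the bytes durable on disk
        os.replace(tmp, path)        # atomically publish the checkpoint
    t = threading.Thread(target=write)
    t.start()
    return t                         # join() before relying on the file
```

Writing to a temporary file and atomically renaming it ensures a crash mid-write never leaves a corrupt checkpoint at the published path.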
... It has already been shown in the literature that obtaining global states in asynchronous distributed systems (such as computational grids) has no solution (CHANDY; LAMPORT, 1985). Therefore, scheduling based on information provided by an information service is ...
Article
Full-text available
JGridTS: a Java framework for executing tasks in computational grid environments based on tuple spaces
... According to the checkpoint protocol of Mekha, a VC checkpoint operation creates a set of checkpoint files, which contain the state information of each active instance in the VC. The protocol ensures global consistency by preventing a send event in the "FUTURE" set from sending a frame to a receive event in the "PAST" set in Figure 4. Based on the global consistency principle [30], a set of checkpoints is globally consistent if and only if no inconsistent cut exists. According to the checkpoint protocol in Figure 2, two barrier synchronizations are used to ensure the following: the first barrier synchronization at Step 4 ensures that every event in the "PAST" set has occurred before all active instances are suspended to perform MTD's stop-and-copy stage. ...
Article
Full-text available
Transparent hypervisor-level checkpoint-restart mechanisms for virtual clusters (VCs) or clusters of virtual machines (VMs) offer an attractive fault tolerance capability for cloud data centers. However, existing mechanisms have suffered from high checkpoint downtimes and overheads. This paper introduces Mekha, a novel hypervisor-level, in-memory coordinated checkpoint-restart mechanism for VCs that leverages precopy live migration. During a VC checkpoint event, Mekha creates a shadow VM for each VM and employs a novel memory-bound timed-multiplex data (MTD) transfer mechanism to replicate the state of each VM to its corresponding shadow VM. We also propose a global ending condition that enables the checkpoint coordinator to control the termination of the MTD algorithm for every VM in a VC, thereby reducing overall checkpoint latency. Furthermore, the checkpoint protocols of Mekha are designed based on barrier synchronizations and virtual time, ensuring the global consistency of checkpoints and utilizing existing data retransmission capabilities to handle message loss. We conducted several experiments to evaluate Mekha using a message passing interface (MPI) application from the NASA advanced supercomputing (NAS) parallel benchmark. The results demonstrate that Mekha significantly reduces checkpoint downtime compared to traditional checkpoint mechanisms. Consequently, Mekha effectively decreases checkpoint overheads while offering efficiency and practicality, making it a viable solution for cloud computing environments.
... The notion of consistent cut and its frontier, defined next, capture possible global states, that is, states that could be valid global states. We borrow the notion of consistent cuts (and by extension, frontiers) from [7], and modify it to fit continuous signals. ...
Article
Full-text available
This paper solves the problem of detecting violations of predicates over distributed continuous-time and continuous-valued signals in cyber-physical systems (CPS). CPS often operate in a safety-critical context, where their correctness is crucially important. Large CPS that consist of many autonomous and communicating components distributed across a geographical area must maintain global correctness and safety. We assume a partially synchronous setting, where a clock synchronization algorithm guarantees a bound on clock drifts among all signals. We introduce a novel retiming method that allows reasoning about the correctness of predicates among continuous-time signals that do not share a global view of time. The resulting problem is encoded as an SMT problem and we introduce techniques to solve the SMT encoding efficiently. Leveraging simple knowledge of physical dynamics allows further runtime reductions. We fully implement our approach on three distributed CPS applications: monitoring of a network of autonomous ground vehicles, a network of aerial vehicles, and a water distribution system. The results show that in some cases it is even possible to monitor a distributed CPS sufficiently fast for online deployment on fleets of autonomous vehicles.
... Second, we could have drawn different temporal boundaries (different consistent cuts) and found a different decomposition. Consistent cuts [Mattern 1989; Chandy and Lamport 1985] are of fundamental importance to the analysis of concurrent systems, as they model the realizable global states of a system. Thus, the formal representation for a diagram will embed a choice of consistent cuts; and as we will find in Sections 5 and 6, working with global information from the start enables simpler proof methods for reasoning about concurrent systems. ...
Preprint
Full-text available
The Lamport diagram is a pervasive and intuitive tool for informal reasoning about causality in a concurrent system. However, traditional axiomatic formalizations of Lamport diagrams can be painful to work with in a mechanized setting like Agda, whereas inductively-defined data would enjoy structural induction and automatic normalization. We propose an alternative, inductive formalization -- the causal separation diagram (CSD) -- that takes inspiration from string diagrams and concurrent separation logic. CSDs enjoy a graphical syntax similar to Lamport diagrams, and can be given compositional semantics in a variety of domains. We demonstrate the utility of CSDs by applying them to logical clocks -- widely-used mechanisms for reifying causal relationships as data -- yielding a generic proof of Lamport's clock condition that is parametric in a choice of clock. We instantiate this proof on Lamport's scalar clock, on Mattern's vector clock, and on the matrix clocks of Raynal et al. and of Wuu and Bernstein, yielding verified implementations of each. Our results and general framework are mechanized in the Agda proof assistant.
... When a failure is detected, the last available checkpoint can be reloaded and the execution of jobs may restart from that state. Different workers may take independent checkpoints or they may coordinate, for instance by using the distributed snapshot protocol [31] to periodically save a consistent view of the entire system state. A third alternative (per-activation checkpoint) is sometimes used for continuous jobs to save task state: at each activation, a task stores its task state together with the output data. ...
Article
Full-text available
Data is a precious resource in today’s society, and is generated at an unprecedented and constantly growing pace. The need to store, analyze, and make data promptly available to a multitude of users introduces formidable challenges in modern software platforms. These challenges radically impacted the research fields that gravitate around data management and processing, with the introduction of distributed data-intensive systems that offer innovative programming models and implementation strategies to handle data characteristics such as its volume, the rate at which it is produced, its heterogeneity, and its distribution. Each data-intensive system brings its specific choices in terms of data model, usage assumptions, synchronization, processing strategy, deployment, guarantees in terms of consistency, fault tolerance, ordering. Yet, the problems data-intensive systems face and the solutions they propose are frequently overlapping. This paper proposes a unifying model that dissects the core functionalities of data-intensive systems, and discusses alternative design and implementation strategies, pointing out their assumptions and implications. The model offers a common ground to understand and compare highly heterogeneous solutions, with the potential of fostering cross-fertilization across research communities. We apply our model by classifying tens of systems: an exercise that brings to interesting observations on the current trends in the domain of data-intensive systems and suggests open research directions.
Chapter
Similarly to the injunction “Know yourself” engraved on the frontispiece of Delphi’s temple more than two millennia ago, the sentence “Make it as simple as possible, but not simpler” (attributed to Einstein) should be engraved on the entrance door of all research laboratories. In the long run, what remains of our work? Mathematicians and physicists have formulas. We have algorithms! The main issue with simplicity is that it is often confused with triviality, but as stated by A. J. Perlis, the recipient of the first Turing Award, “Simplicity does not precede complexity, but follows it”. Considering my research domain, namely distributed computing, this chapter surveys topics I was interested in during my career and presents a few results, approaches, and algorithms I designed (with colleagues). The design of these algorithms strove to consider (maybe unsuccessfully) concision, generality, simplicity, and elegance as first-class properties. In other words, this chapter does not claim objectivity.
Article
The Lamport diagram is a pervasive and intuitive tool for informal reasoning about “happens-before” relationships in a concurrent system. However, traditional axiomatic formalizations of Lamport diagrams can be painful to work with in a mechanized setting like Agda. We propose an alternative, inductive formalization — the causal separation diagram (CSD) — that takes inspiration from string diagrams and concurrent separation logic, but enjoys a graphical syntax similar to Lamport diagrams. Critically, CSDs are based on the idea that causal relationships between events are witnessed by the paths that information follows between them. To that end, we model “happens-before” as a dependent type of paths between events. The inductive formulation of CSDs enables their interpretation into a variety of semantic domains. We demonstrate the interpretability of CSDs with a case study on properties of logical clocks , widely-used mechanisms for reifying causal relationships as data. We carry out this study by implementing a series of interpreters for CSDs, culminating in a generic proof of Lamport’s clock condition that is parametric in a choice of clock. We instantiate this proof on Lamport’s scalar clock, on Mattern’s vector clock, and on the matrix clocks of Raynal et al. and of Wuu and Bernstein, yielding verified implementations of each. The CSD formalism and our case study are mechanized in the Agda proof assistant.
Chapter
Serverless computing promises to significantly simplify cloud computing by providing Functions-as-a-Service where invocations of functions, triggered by events, are automatically scheduled for execution on compute nodes. Notably, the serverless computing model does not require the manual provisioning of virtual machines; instead, FaaS enables load-based billing and auto-scaling according to the workload, reducing costs and making scheduling more efficient. While early serverless programming models only supported stateless functions and severely restricted program composition, recently proposed systems offer greater flexibility by adopting ideas from actor and dataflow programming. This paper presents a survey of actor-like programming abstractions for stateful serverless computing, and provides a characterization of their properties and highlights their origin.
Article
Network telemetry is essential for administrators to monitor massive data traffic in a network-wide manner. Existing telemetry solutions often face a dilemma between resource efficiency (i.e., low CPU, memory, and bandwidth overhead) and full accuracy (i.e., error-free and holistic measurement). We break this dilemma via a network-wide architectural design that simultaneously achieves resource efficiency and full accuracy in flow-level telemetry for large-scale data centers. Our design carefully coordinates the collaboration among different types of entities in the whole network to execute telemetry operations, such that the resource constraints of each entity are satisfied without compromising full accuracy. It further addresses consistency in network-wide epoch synchronization and accountability in error-free packet loss inference. We prototype our design in DPDK and P4. Testbed experiments on commodity servers and Tofino switches demonstrate its effectiveness over state-of-the-art solutions.
Article
Full-text available
Stream processing has been an active research field for more than 20 years, but it is now witnessing its prime time due to recent successful efforts by the research community and numerous worldwide open-source communities. This survey provides a comprehensive overview of fundamental aspects of stream processing systems and their evolution in the functional areas of out-of-order data management, state management, fault tolerance, high availability, load management, elasticity, and reconfiguration. We review noteworthy past research findings, outline the similarities and differences between the first (’00–’10) and second (’11–’23) generation of stream processing systems, and discuss future trends and open problems.
Article
A multi‐tenant microservice architecture involving components with asynchronous interactions and batch jobs requires efficient strategies for managing asynchronous workloads. This article addresses this issue in the context of a leading company developing tax software solutions for many national and multi‐national corporations in Brazil. A critical process provided by the company's cloud‐based solutions encompasses tax integration, which includes coordinating complex tax calculation tasks and needs to be supported by asynchronous operations using a message broker to guarantee order correctness. We explored and implemented two approaches for managing asynchronous workloads related to tax integration within a multi‐tenant microservice architecture in the company's context: (i) a polling‐based approach that employs a queue as a distributed lock (DL) and (ii) a push‐based approach named single active consumer (SAC) that relies on the message broker's logic to deliver messages. These approaches aim to achieve efficient resource allocation when dealing with a growing number of container replicas and tenants. In this article, we evaluate the correctness and performance of the DL and SAC approaches to shed light on how asynchronous workloads impact the management of multi‐tenant microservice architectures from delivery and deployment perspectives.
Article
This paper studies checkpointing strategies for parallel applications subject to failures. The optimal strategy to minimize total execution time, or makespan, is well known when failure inter-arrival times obey an Exponential distribution, but it is unknown for non-memoryless failure distributions. We explain why the latter fact is misunderstood in recent literature. We propose a general strategy that maximizes the expected efficiency until the next failure, and we show that this strategy achieves an asymptotically optimal makespan, thereby establishing the first optimality result for arbitrary failure distributions. Through extensive simulations, we show that the new strategy is always at least as good as the Young/Daly strategy for various failure distributions. For distributions with high infant mortality (such as LogNormal with shape parameter k = 2.51 or Weibull with shape parameter 0.5), the execution time is divided by a factor of 1.9 on average, and by up to a factor of 4.2 for recently deployed platforms.
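The Young/Daly baseline used in the comparison above has a simple closed form: for checkpoint cost C and mean time between failures MTBF, the first-order optimal period between checkpoints is W = sqrt(2 · C · MTBF). A one-liner with illustrative numbers:

```python
import math

def young_daly_interval(checkpoint_cost, mtbf):
    """First-order optimal period between checkpoints for exponentially
    distributed failures: W = sqrt(2 * C * MTBF), both inputs in seconds."""
    return math.sqrt(2 * checkpoint_cost * mtbf)

# e.g., C = 60 s and MTBF = 24 h give a period of roughly 54 minutes
```

The formula is a first-order approximation, valid when the checkpoint cost is small relative to the MTBF; for heavy-tailed or infant-mortality distributions it is only a heuristic, which is precisely the gap the paper addresses.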
Article
A single-master database has limited update capacity because a single node handles all updates. A multi-master database potentially has higher update capacity because the load is spread across multiple nodes. However, the need to coordinate updates and ensure durability can generate high network traffic. Reducing network load is particularly important in a cloud environment where the network infrastructure is shared among thousands of tenants. In this paper, we present Taurus MM, a shared-storage multi-master database optimized for cloud environments. It implements two novel algorithms aimed at reducing network traffic plus a number of additional optimizations. The first algorithm is a new type of distributed clock that combines the small size of Lamport clocks with the effective support of distributed snapshots of vector clocks. The second algorithm is a new hybrid page and row locking protocol that significantly reduces the number of lock requests sent over the network. Experimental results on a cluster with up to eight masters demonstrate superior performance compared to Aurora multi-master and CockroachDB.
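The clock trade-off this abstract refers to can be made concrete: a Lamport clock is a single integer but cannot decide whether two states are causally ordered, whereas a vector clock (one counter per master) can. A minimal vector-clock sketch for illustration, not the Taurus MM hybrid itself:

```python
def vc_merge(a: dict, b: dict) -> dict:
    """Element-wise max of two vector clocks (one entry per node id)."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}

def vc_happened_before(a: dict, b: dict) -> bool:
    """True iff a causally precedes b: a <= b in every component and
    strictly less in at least one."""
    nodes = a.keys() | b.keys()
    return all(a.get(n, 0) <= b.get(n, 0) for n in nodes) and any(
        a.get(n, 0) < b.get(n, 0) for n in nodes)

v1 = {"m1": 3, "m2": 1}
v2 = {"m1": 3, "m2": 2}
print(vc_happened_before(v1, v2))  # True: v1 causally precedes v2
```

When neither vector precedes the other, the two states are concurrent; that is the distinction a scalar Lamport clock cannot express, and what makes vector clocks suited to distributed snapshots.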
Chapter
By integrating Internet of Things (IoT) capabilities to sense real-time conditions in the physical environment, traditional Business Process Management (BPM) has the potential to become more flexible and adaptive. However, the integration of BPM and IoT faces challenges such as programming mechanism mismatches, resource management mechanism mismatches, and adaptive mechanism mismatches. This research considers IoT service-based technology as an effective approach to integrate BPM and IoT. The IoT service must be calculable, composable, bindable, and fault-tolerant. When IoT services run on Apache Flink, the native fault tolerance mechanism may not meet the fault tolerance needs of IoT services due to high-speed fluctuation characteristics of IoT service data sources. Additionally, traditional static checkpoint fault-tolerant mechanisms may not balance runtime overhead and recovery delay optimally. This paper proposes an on-demand dynamic checkpoint fault-tolerant method that calculates the recovery delay in real-time based on data fluctuation rates and actively triggers the checkpoint operation when the user threshold is reached. Experiments show that the proposed method improves system efficiency by up to 11.9% compared to the static checkpoint mechanism. Keywords: IoT service, fault tolerance, checkpoint interval, on-demand dynamic checkpoint mechanism
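The on-demand idea can be sketched as: estimate the recovery delay from the current input rate and the time since the last checkpoint, and trigger a checkpoint once that estimate crosses the user threshold. A deliberately naive sketch; the estimator and names are assumptions for illustration, not the paper's implementation:

```python
def estimated_recovery_delay(seconds_since_checkpoint, input_rate, replay_rate):
    """Naive estimate: records accumulated since the last checkpoint,
    divided by the rate at which they can be replayed after a failure."""
    return seconds_since_checkpoint * input_rate / replay_rate

def should_checkpoint(seconds_since_checkpoint, input_rate, replay_rate, threshold_s):
    """Trigger a checkpoint once the estimated recovery delay would exceed
    the user-supplied threshold (in seconds)."""
    delay = estimated_recovery_delay(seconds_since_checkpoint, input_rate, replay_rate)
    return delay > threshold_s

# 120 s since the last checkpoint, 10k records/s arriving, replay at 40k records/s:
print(should_checkpoint(120, 10_000, 40_000, threshold_s=20))  # True (delay = 30 s)
```

Under a bursty source the input rate term grows quickly, so checkpoints fire more often during bursts and less often when the stream is quiet, which is the runtime-overhead/recovery-delay balance the abstract describes.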
Conference Paper
Full-text available
Condition monitoring of machinery is becoming more and more popular, particularly in process plants, where a sudden breakdown may prove costly or even fatal. Uncertainties faced by rotating machinery, such as misalignment, looseness, bearing defects, and electrical faults, are corrected by familiar and ordinary maintenance procedures. Replacing faulty bearings, gears, drive belts, couplings, and other machine components is also a rather straightforward process. However, correcting unbalance requires some special knowledge and understanding. An unbalanced rotor always causes more vibration, generates excessive forces on the bearing areas, and reduces the life of the machine. In this work a vector balancing method is presented, which minimizes the number of trial runs by eliminating all guesswork. In the case of non-stationary signals, the use of spectrum analysis based on the Fourier transform has some limitations. Hence, the vibration signal is evaluated using the continuous wavelet transform method. It is one of the most important tools for signal processing, particularly for non-stationary signals, and is capable of giving a time-frequency representation of the signal.
Chapter
Full-text available
We describe social DNA nanorobots, which are autonomous mobile DNA devices that execute a series of pair-wise interactions between simple individual DNA nanorobots, causing a desired overall outcome behavior for the group of nanorobots that can be relatively complex. We present various designs for social DNA nanorobots that walk over a 2D nanotrack and collectively exhibit various programmed behaviors. These employ only hybridization and strand-displacement reactions, without use of enzymes. The novel behaviors of social DNA nanorobots designed here include: (i) self-avoiding random walking, where a group of DNA nanorobots randomly walk on a 2D nanotrack and avoid the locations visited by themselves or any other DNA nanorobots; (ii) flocking, where a group of DNA nanorobots follow the movements of a designated leader DNA nanorobot; and (iii) voting by assassination, a process where there are originally two unequal-size groups of DNA nanorobots; when pairs of DNA nanorobots from distinct groups collide, one or the other is assassinated (by getting detached from the 2D nanotrack and diffusing into the solution away from it); eventually all members of the smaller group of DNA nanorobots are assassinated with high likelihood. To simulate our social DNA nanorobots, we used a surface-based CRN simulator.
Chapter
Full-text available
It was 40 years ago today, when Ned taught DNA to play [32]. When Ned Seeman began laying the theoretical foundations of what is now DNA nanotechnology, he likely did not imagine the entire diversity and scale of molecular structures, machines, and computing devices that would be enabled by his work. While there are many reasons for the success of the field, not least the creativity shown by Ned and the community he helped build, such progress would not have been possible without breakthroughs in DNA synthesis and molecular analysis technology. Here, we argue that the technologies that will enable the next generation of DNA nanotechnology have already arrived but that we have not yet fully taken advantage of them. Specifically, we believe that it will become possible, in the near future, to dramatically scale up DNA nanotechnology through the use of array-synthesized DNA and high-throughput DNA sequencing. In this article, we provide an example of how DNA logic gates and circuits can be produced through enzymatic processing of array-synthesized DNA and can be read out by sequencing in a massively parallel format. We experimentally demonstrate processing and readout of 380 molecular gates in a single reaction. We further speculate that in the longer term, very large-scale DNA computing will find applications in the context of molecular diagnostics and, in particular, DNA data storage.
Chapter
Full-text available
In this essay, the evolution of DNA nanotechnology research in Japan to date will be reviewed. The expansion of the research community in Japan and the trends in regard to the selection of project themes will be elucidated, along with the identification of the researchers who participated in these projects. Some aspects of the research history of the author, who entered from the field of robotics, are introduced, as this information may be of interest to young students and researchers.
Chapter
Full-text available
Over the past 40 years, significant progress has been made on the design and implementation of nucleic acid circuits, which represent the computational core of dynamic DNA nanotechnology. This progress has been enabled primarily by substantial advances in experimental techniques, but also by parallel advances in computational methods for nucleic acid circuit design. In this perspective, we look back at the evolution of these computational design methods through the lens of the Visual DSD system, which has been developed over the past decade for the design and analysis of nucleic acid circuits. We trace the evolution of Visual DSD over time in relation to computational design methods more broadly, and outline how these computational design methods have tried to keep pace with rapid progress in experimental techniques. Along the way, we summarize the key theoretical concepts from computer science and mathematics that underpin these design methods, weaving them together using a common running example of a simple Join circuit. On the occasion of the 40th anniversary of DNA nanotechnology, we also offer some thoughts on possible future directions for the computational design of nucleic acid circuits and how this may influence, and be influenced by, experimental developments.
Chapter
Full-text available
We summarize our work on gellular automata, which are cellular automata we intend to implement with gel materials. If cellular automata are implemented as materials, it will become possible to realize smart materials with abilities such as self-organization, pattern formation, and self-repair. Furthermore, it may be possible to make a material that can detect the environment and adapt to it. In this article, we present three models of gellular automata, among which the first two have been proposed previously and the third one is proposed here for the first time. Before presenting the models, we briefly discuss why cellular automata are a research target in DNA computing, a field which aims to extract computational power from DNA molecules. Then, we briefly describe the first model. It is based on gel walls with holes that can open and exchange the solutions that surround them. The second model is also based on gel walls but differs in that the walls allow small molecules to diffuse. In presenting the second model, we focus on self-stability, which is an important property of distributed systems, related to the ability to self-repair. Finally, we report our recent attempt, in the third model, to design gellular automata that learn Boolean circuits from input–output sets, i.e., examples of input signals and their expected output signals.
Chapter
Full-text available
Spatial organization on the atomic scale is one of the key objectives of nanotechnology. The development of DNA nanotechnology is a hallmark of material programmability in 2D and 3D, in which the large variety of available DNA modifications allows it to be interfaced with a number of inorganic and organic materials. Nature’s solution to spatiotemporal control has been the evolution of self-organizing protein systems capable of pattern formation through energy dissipation. Here, we show that combining DNA origami with a minimal micron-scale pattern-forming system vastly expands the applicability of DNA nanotechnology, whether for the development of biocompatible materials or as an essential step toward building synthetic cells from the bottom up. We first describe the interaction of DNA origami nanostructures with model lipid membranes and introduce the self-organizing MinDE protein system from Escherichia coli. We then outline how we used DNA origami to elucidate diffusiophoresis on membranes through MinDE protein pattern formation. We describe how this novel biological transport mechanism can, in turn, be harnessed to pattern DNA origami nanostructures on the micron scale on lipid membranes. Finally, we discuss how our approach could be used to create the next generation of hybrid materials, through cargo delivery and multiscale molecular patterning capabilities.
Chapter
Full-text available
Research toward the use of nucleic acids as a medium in which to encode non-biological information or as a structural material with which to build novel constructs has now been going on for 40 years. I have been participating in this field for approximately 24 years. I will use my space within this dedicated volume to relate some of my personal experiences and observations throughout my own DNA nanotech journey.
Chapter
Full-text available
A diverse array of theoretical models of DNA-based self-assembling systems have been proposed and studied. Beyond providing simplified abstractions in which to develop designs for molecular implementation, these models provide platforms to explore powers and limitations of self-assembling systems “in the limit” and to compare the relative strengths and weaknesses of systems and components of varying capabilities and constraints. As these models often intentionally overlook many types of errors encountered in physical implementations, the constructions can provide a road map for the possibilities of systems in which errors are controlled with ever greater precision. In this article, we discuss several such models, current work toward physical implementations, and potential future work that could help lead engineered systems further down the road to the full potential of self-assembling systems based on DNA nanotechnology.
Chapter
Full-text available
Oritatami is a formal model of RNA co-transcriptional folding, in which an RNA sequence (transcript) folds upon itself while being synthesized (transcribed) out of its DNA template. This model is simple enough for further extension and also strong enough to study computational aspects of this phenomenon. Some of the structural motifs designed for Turing universal computations in oritatami have been demonstrated approximately in vitro recently. This model has yet to take a significant aspect of co-transcriptional folding into full account, that is, reconfiguration of molecules. Here we propose a kinetic extension of this model called the oritatami kinetic (Ok) model, similar to what the kinetic tile assembly model (kTAM) is to the abstract tile assembly model (aTAM). In this extension, local rerouting of the transcript inside a randomly chosen area of parameterized radius competes with the transcription and the folding of the nascent beads (beads are abstract monomers which are the transcription units in oritatami). We compare this extension to a simulation of oritatami in the nubot model, another reconfiguration-based molecular folding model. We show that this new extension matches a reconfiguration model better and is also faster to simulate than passing through a nubot simulation.
Chapter
Full-text available
Dynamic DNA nanotechnology aims at the realization of molecular machines, devices, and dynamic chemical systems using DNA molecules. DNA is used to assemble the components of these systems, define the interactions between the components, and in many cases also as a chemical fuel that drives them using hybridization energy. Except for biosensing, applications of dynamic DNA devices have so far been limited to proof-of-concept demonstrations, partly because the systems are operating rather slowly, and because it is difficult to operate them continuously for extended periods of time. It is argued that one of the major challenges for the future development of dynamic DNA systems is the identification of driving mechanisms that will allow faster and continuous operation far from chemical equilibrium. Such mechanisms will be required to realize active molecular machinery that can perform useful tasks in nanotechnology and molecular robotics.
Chapter
Full-text available
With recent high-throughput technology, we can synthesize large heterogeneous collections of DNA structures and also read them all out precisely in a single procedure. Can we use these tools, not only to do things faster, but also to devise new techniques and algorithms? In this paper, we examine some DNA algorithms that assume high-throughput synthesis and sequencing. We record the order in which N events occur, using N² redundant detectors but only N distinct DNA domains, and (after sequencing) reconstruct the order by transitive reduction.
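In centralized form, the reconstruction step can be sketched simply: given all pairwise "i occurred before j" detections of a total order, the first event precedes N-1 others, the second N-2, and so on, so sorting by out-degree recovers the order (the transitive reduction of the relation is exactly that chain). An illustrative sketch, not the paper's DNA protocol:

```python
def reconstruct_order(before_pairs, events):
    """Recover a total order from pairwise before-relations.
    before_pairs: list of (earlier, later) detections covering all pairs.
    Sort events by how many others each one precedes, descending."""
    out_degree = {e: 0 for e in events}
    for earlier, _later in before_pairs:
        out_degree[earlier] += 1
    return sorted(events, key=lambda e: -out_degree[e])

# Events b, a, c occurred in that order; detectors report every pair.
pairs = [("b", "a"), ("b", "c"), ("a", "c")]
print(reconstruct_order(pairs, ["a", "b", "c"]))  # ['b', 'a', 'c']
```

The N² redundancy in the abstract corresponds to having one detector per ordered pair, which is what makes this out-degree counting well defined.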
Chapter
Full-text available
DNA-based self-assembly enables the programmable arrangement of matter on a molecular scale. It holds promise as a means with which to fabricate high technology products. DNA-based self-assembly has been used to arrange chromophores (dye molecules) covalently linked to DNA to form Förster resonant energy transfer and exciton-based devices. Here we explore the possibility of making coherent exciton information processing devices, including quantum computers. The focus will be on describing the chromophore arrangements needed to implement a complete set of gates that would enable universal quantum computation.
Chapter
Full-text available
The first demonstration of DNA computing was realized by Adleman in 1994, aiming to solve hard combinatorial problems with DNA molecules. This pioneering work initiated the evolution of the field of DNA computing over the last three decades. To date, the implemented functions of DNA computing have been expanded to logic operations, neural network computations, time-domain oscillator circuits, distributed computing, etc. Herein, the history of DNA computing is briefly reviewed, followed by discussions on opportunities and challenges of DNA-based molecular computing, especially from the perspective of algorithm design. Future directions and design strategies for next-generation DNA computing are also discussed.
Chapter
Full-text available
The origins of DNA nanotechnology can be traced back to 1982, when Dr. Ned Seeman proposed assembling branched junctions as 3D lattices to facilitate protein crystallization. Over the past four decades, this concept has evolved into a multidisciplinary research field with vast potential for applications. In this mini review, we present a brief introduction to selected topics in nucleic acid nanotechnology, focusing on scaling up DNA assembly, achieving higher resolutions, and transferring to RNA structural design. We discuss the advantages and challenges of each topic, aiming to shed light on the enormous potential of nucleic acid nanotechnology.
Chapter
Full-text available
To celebrate the 40th anniversary of bottom-up DNA nanotechnology we highlight the interaction of the field with mathematics. DNA self-assembly as a method to construct nanostructures gave impetus to an emerging branch of mathematics, called here ‘DNA mathematics’. DNA mathematics models and analyzes structures obtained as bottom-up assembly, as well as the process of self-assembly. Here we survey some of the new tools from DNA mathematics that can help advance the science of DNA self-assembly. The theory needed to develop these tools is now driving the field of mathematics in new and exciting directions. We describe some of these rich questions, focusing particularly on those related to knot theory, graph theory, and algebra.
Article
Full-text available
We use the paradigm of diffusing computation, introduced by Dijkstra and Scholten, to solve a class of graph problems. We present a detailed solution to the problem of computing shortest paths from a single vertex to all other vertices, in the presence of negative cycles.
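In sequential form the underlying problem is classic Bellman-Ford: relax every edge n-1 times, and any further improvement afterwards witnesses a negative cycle. A centralized sketch of that baseline (the paper's contribution is the distributed, diffusing-computation version, which this does not reproduce):

```python
def bellman_ford(n, edges, source):
    """Single-source shortest paths with negative-cycle detection.
    edges: list of (u, v, weight); vertices are 0..n-1.
    Returns (distances, has_negative_cycle)."""
    INF = float("inf")
    dist = [INF] * n
    dist[source] = 0
    for _ in range(n - 1):  # n-1 rounds of relaxation suffice
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    # One more pass: any remaining improvement implies a negative cycle.
    has_cycle = any(dist[u] + w < dist[v] for u, v, w in edges)
    return dist, has_cycle

dist, cyc = bellman_ford(4, [(0, 1, 4), (0, 2, 1), (2, 1, -2), (1, 3, 3)], 0)
print(dist, cyc)  # [0, -1, 1, 2] False
```

The diffusing-computation version distributes exactly these relaxation steps as messages along edges, with the Dijkstra-Scholten machinery detecting when no more relaxations are in flight.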
Article
Control of a distributed computing network requires an efficient method of allocating file resources, together with the capability of avoiding or recovering from deadlock. This paper describes alternative approaches to the deadlock problem, and examines two control schemes, one centralized and one distributed. Simulation shows that when perfect reliability is assumed, the centralized scheme is more efficient, except when traffic is primarily local in nature at each node. In practice, however, a centralized controller is often undesirable, since failure of the central node causes system paralysis. A modified control scheme is proposed, in which the centre of control can shift its location to adapt to failure of network elements. It is expected that such an adaptive controller will prove superior in performance to either of the previous alternatives.
Article
Distributed deadlock models are presented for resource and communication deadlocks. Simple distributed algorithms for detection of these deadlocks are given. We show that all true deadlocks are detected and that no false deadlocks are reported. In our algorithms, no process maintains global information; all messages have an identical short length. The algorithms can be applied in distributed database and other message communication systems.
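Centrally, resource deadlock corresponds to a cycle in the wait-for graph, where an edge p → q means process p is blocked waiting on q. A minimal cycle-detection sketch of that reduction; the cited algorithms notably avoid keeping this global graph at any single process:

```python
def has_deadlock(wait_for):
    """Cycle detection in a wait-for graph given as {process: iterable of
    processes it is blocked on}. A cycle means those processes are deadlocked."""
    visiting, done = set(), set()

    def dfs(p):
        if p in visiting:
            return True          # back edge: cycle found
        if p in done:
            return False
        visiting.add(p)
        for q in wait_for.get(p, ()):
            if dfs(q):
                return True
        visiting.remove(p)
        done.add(p)
        return False

    return any(dfs(p) for p in wait_for)

print(has_deadlock({"p1": ["p2"], "p2": ["p3"], "p3": ["p1"]}))  # True
print(has_deadlock({"p1": ["p2"], "p2": ["p3"], "p3": []}))      # False
```

The difficulty the abstract addresses is that in a distributed system this graph is spread over many nodes and changes while being examined, which is why naive protocols report false deadlocks.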
Article
In this paper it is shown how the Dijkstra-Scholten scheme for termination detection in a diffusing computation can be adapted to detect termination or deadlock in a network of communicating sequential processes as defined by Hoare.
Article
We propose an algorithm for detecting deadlocks among transactions running concurrently in a distributed processing network (i.e., a distributed database system). The proposed algorithm is a distributed deadlock detection algorithm. A proof of the correctness of the distributed portion of the algorithm is given, followed by an example of the algorithm in operation. The performance characteristics of the algorithm are also presented.
Article
The concept of one event happening before another in a distributed system is examined, and is shown to define a partial ordering of the events. A distributed algorithm is given for synchronizing a system of logical clocks which can be used to totally order the events. The use of the total ordering is illustrated with a method for solving synchronization problems. The algorithm is then specialized for synchronizing physical clocks, and a bound is derived on how far out of synchrony the clocks can become.
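The logical clocks described there follow three rules: increment before each local event, attach the clock value to every message sent, and on receipt advance to max(local, received) + 1, so timestamps respect the happened-before order. A minimal sketch:

```python
class LamportClock:
    """Scalar logical clock: local events and sends tick the counter;
    a receive jumps to max(local, message timestamp) + 1."""

    def __init__(self):
        self.time = 0

    def local_event(self):
        self.time += 1
        return self.time

    def send(self):
        self.time += 1
        return self.time          # timestamp carried on the message

    def receive(self, msg_time):
        self.time = max(self.time, msg_time) + 1
        return self.time

a, b = LamportClock(), LamportClock()
t = a.send()          # a's clock becomes 1; message carries 1
b.local_event()       # b's clock becomes 1
print(b.receive(t))   # max(1, 1) + 1 = 2
```

Ties between concurrent events can be broken by process id to obtain the total order the abstract uses for solving synchronization problems.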
Article
A hierarchically organized and a distributed protocol for deadlock detection in distributed databases are presented in [1]. In this paper we show that the distributed protocol is incorrect, and present possible remedies. However, the distributed protocol remains impractical because "condensations" of "transaction-wait-for" graphs make graph updates difficult to perform. Delayed graph updates cause the occurrence of false deadlocks in this as well as in some other deadlock detection protocols for distributed systems. The performance degradation that results from false deadlocks depends on the characteristics of each protocol.
Article
This paper describes two protocols for the detection of deadlocks in distributed databases: a hierarchically organized one and a distributed one. A graph model which depicts the state of execution of all transactions in the system is used by both protocols. A cycle in this graph is a necessary and sufficient condition for a deadlock to exist. Nevertheless, neither protocol requires that the global graph be built and maintained in order for deadlocks to be detected. In the case of the hierarchical protocol, the communications cost can be optimized if the topology of the hierarchy is appropriately chosen.
LAMPORT, L., AND CHANDY, K. M. On partially-ordered event models of distributed computations. Submitted for publication.