Figure 1 - uploaded by Peter Van Roy
Structured overlay network using ring topology 


Source publication
Conference Paper
There is no doubt about the increase in popularity of decentralised systems over the classical client-server architecture in distributed applications. These systems are developed mainly as peer-to-peer networks where it is possible to observe many strategies to organise the peers. The most popular one for structured networks is the ring topology. D...

Context in source publication

Context 1
... of resources are crucial properties that a peer-to-peer system must provide. Other desirable properties, such as efficient routing, scalability and full reachability, have moved randomly connected peer-to-peer networks towards structured overlay networks. Many of these structured networks implement a Distributed Hash Table (DHT). Among the many such networks (Pastry [16], Tapestry [20], Kademlia [12], HyperCup [17], P-Grid [1]), we focus on Chord [18], because it is quite representative and it introduces a ring topology that has influenced many other networks. In Chord, peers are organised in a ring, each having a set of pointers to efficiently find any other peer in the network. The resources of the system are distributed among the peers, each of which is responsible for a set of them. Performing a lookup for a resource must give a consistent answer, finding the right responsible peer. To add or remove a peer from the network, the peer only needs to synchronise with its direct neighbours. The network is self-organising, meaning that peers organise themselves in the ring topology without needing manual reconfiguration.

Despite the self-organising nature of the ring architecture, its maintenance presents several challenges in providing lookup consistency at any time. Chord itself presents temporary inconsistency when several peers join the network concurrently. This problem occurs even in fault-free scenarios. To fix these inconsistencies, a stabilisation protocol must be run periodically. The system must also deal with peers gracefully leaving the network, which can occur massively and concurrently with other join events. The most challenging issue, though, is failure handling, where peers just leave the network, breaking the ring without following any protocol.
As we can see, ironically, the advantages of decentralised systems with respect to the classical client-server architecture come with the drawback of higher complexity, due to the lack of a single point of control and synchronisation. Increasing the self-management of decentralised systems can help us to reduce this new complexity. By self-management we mean the ability of a system to maintain its functionality despite changes in its environment. The system constantly monitors itself, triggering corrective actions when the current state deviates from the desired one. In order to achieve self-management, the use of feedback loops in the design of the system appears as a straightforward approach. We use feedback loops to model the ring maintenance of our peer-to-peer system, called P2PS [13], which also uses a ring topology. As a result of this new design, we introduce a novel relaxed-ring topology that simplifies the “join” algorithm and greatly improves failure recovery. Since failures can be handled, there is no need for a “leave” algorithm, because this case is already covered by failure recovery.

The main contribution of this work is the design of a peer-to-peer network as a self-managing system, introducing a relaxed-ring topology that is able to provide fault-tolerance with realistic assumptions concerning failure detection. The use of feedback loops for modelling the system can be reused not only in other decentralised systems, but also in software design in general.

Section 2 gives a more detailed introduction to peer-to-peer networks using ring topology, describing some existing solutions for ring maintenance. Section 3 briefly introduces feedback loops for self-managing systems and how they can be applied to software design. The result of applying feedback loops to the ring maintenance is given in section 4 with a detailed description of the relaxed-ring. After a deep analysis of failure handling, the paper provides conclusions for this work.
Peer-to-peer networks appear as the evident framework for working with decentralised systems. Looking at the history of peer-to-peer systems, we find Napster [15] as the icon of the first generation. Napster uses a hybrid architecture with a centralised directory storing the location of the resources of the system. Thus, the client-server strategy was needed in order to find other peers.

A second generation, characterised by Gnutella [8] and FreeNet [6], removed the servers from the topology, becoming the first real peer-to-peer networks. Peers build an overlay network on top of the Internet and are able to route using its own topology. No structure is used for the network because peers are connected randomly to other peers. Therefore, no strong guarantees can be provided with respect to reachability, availability and time to find items. Unfortunately, this kind of network has limited scalability and induces a huge amount of traffic [11].

Structured overlay networks (see the introduction for references) appear as the third generation of peer-to-peer systems, claiming self-organisation of the network with fault-tolerance in addition to the guarantees that cannot be found in the second generation. Figure 1 depicts a structured overlay network using ring topology and providing a DHT with election of fingers based on the Tango [3] algorithm. This structure was first introduced by Chord [18]. Every peer is identified with a hash key, and it is connected to a successor and a predecessor respecting the order of the keys in clockwise direction. The DHT is used for storing and finding items in the network using basically two operations: put(key, value) to store a value with a certain key, and get(key) to recover the value. Every peer is responsible for all keys between its predecessor’s identifier and itself, excluding the predecessor to avoid overlapping. When a lookup for a key is triggered from any point of the ring, consistency must be guaranteed.
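The responsibility rule and the two DHT operations described above can be illustrated with a minimal sketch. It assumes a global, centralised view of the ring (a real peer only knows its neighbours and fingers), a small 16-bit identifier space, and SHA-1 as the hash function. The names `Ring`, `hash_key`, `in_interval` and `responsible` are hypothetical and are not taken from the source publication.

```python
import hashlib
from bisect import bisect_left

RING_BITS = 16                 # assumption: small identifier space for illustration
RING_SIZE = 2 ** RING_BITS


def hash_key(name: str) -> int:
    """Map a name onto the identifier space (SHA-1 is an assumption)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def in_interval(key: int, pred_id: int, peer_id: int) -> bool:
    """True if key lies in (pred_id, peer_id] going clockwise on the ring."""
    if pred_id < peer_id:
        return pred_id < key <= peer_id
    # The interval wraps around zero (this also covers a single-peer ring).
    return key > pred_id or key <= peer_id


class Ring:
    """Global view of a ring DHT; hypothetical helper, not a distributed peer."""

    def __init__(self, peer_ids):
        self.ids = sorted(set(peer_ids))
        self.store = {pid: {} for pid in self.ids}

    def responsible(self, key: int) -> int:
        # The first peer identifier >= key, wrapping to the smallest identifier.
        i = bisect_left(self.ids, key)
        return self.ids[i % len(self.ids)]

    def put(self, name: str, value) -> None:
        k = hash_key(name)
        self.store[self.responsible(k)][k] = value

    def get(self, name: str):
        k = hash_key(name)
        return self.store[self.responsible(k)].get(k)
```

In this sketch a lookup is consistent by construction, because `responsible` is computed from one global list of identifiers. The difficulty discussed in the text arises precisely because no such global view exists in a real network.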
We define this as follows:

As we mentioned already, ring maintenance is costly and it is not trivial to guarantee correctness. This is because the state of the ring is updated concurrently without any centralised point of synchronisation. Chord’s algorithms for ring maintenance, handling joins and leaves, present well-known problems of temporary inconsistency, where more than one peer appears to be responsible for a key. For this reason, Chord needs to trigger periodic stabilisation in order to fix the inconsistencies. Existing analyses [7] conclude that the problem comes from the fact that joins and leaves are not atomic operations. We also raise the issue that these operations always need the synchronisation of three peers, which is hard to guarantee with the asynchronous communication inherent to distributed programming.

Existing solutions [9, 10] introduce locks in the algorithms in order to provide atomicity to join and leave operations. Locks are also hard to manage in asynchronous systems, and that is why these solutions only work on fault-free systems, which is not realistic. A better solution is provided by DKS [7], simplifying the locking mechanism and proving correctness of the algorithms in the absence of failures. Even though this approach offers strong guarantees, we consider locks extremely restrictive for a dynamic network based on asynchronous communication. Every lookup request involving the locked peers must be suspended in the presence of a join or leave in order to guarantee consistency. Leaving peers are not allowed to leave the network until they are granted the relevant locks. Given that, crashing peers can be seen as peers that simply leave the network without respecting the locking protocol, breaking the guarantees of the system. Another critical problem for performance arises when a peer crashes while some joining or leaving peer is holding its lock.
The situation is worse when the peer holding the relevant lock is the one that crashes. Under these considerations, we can observe that locks in distributed systems can hardly provide an efficient fault-tolerant solution.

Taken from system theory, feedback loops can be observed not only in existing automated systems, but also in self-managing systems in nature. Several examples of this can be found in [19], where feedback loops are introduced as a design model for self-managing software. The loop consists of three main concurrent components interacting with the subsystem. There is at least one agent in charge of monitoring the subsystem, passing the monitored information to another component in charge of deciding a corrective action if needed. An actuating agent is used to perform this action in the subsystem. Figure 2 depicts the interaction of these three concurrent components in a feedback loop. These three components together with the subsystem form the entire system. The loop has properties similar to PID controllers, with the difference that the evolution of a running software application is measured discretely.

The goal of the feedback loop is to keep a global property of the system stable. In the simplest cases, this property is represented by the value of a parameter. This parameter is constantly monitored. When a perturbation is detected, a corrective action is triggered. Negative feedback makes the system react in the opposite direction to the perturbation. Positive feedback increases the perturbation.

Taking an air-conditioning system as an example, we can see the room where the system is installed as the subsystem. A thermometer constantly monitors the temperature in the room, giving this information to a thermostat. The thermostat is the component in charge of computing the corrective action. If the monitored temperature is higher than the desired temperature, the thermostat will decide to run the air-conditioning to cool the room down.
That action corresponds to the actuating agent. Since every component executes concurrently, the model is well suited to modelling distributed systems. There are many alternatives for implementing every component and the way they interact. They can represent active objects, actors, functions, ...
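The air-conditioning example can be sketched as a discrete negative feedback loop. This is a minimal illustration under stated assumptions: the three components are plain functions called in sequence rather than concurrent agents, and the room dynamics (cooling by 1.0 degree per step when the unit runs, warming by 0.5 otherwise) are invented for the example. The names `Room`, `monitor`, `decide`, `actuate` and `run_loop` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Room:
    """The subsystem under control (dynamics are an assumption)."""
    temperature: float
    cooling_on: bool = False

    def step(self) -> None:
        # Assumed dynamics: cool by 1.0 when the unit runs, else warm by 0.5.
        self.temperature += -1.0 if self.cooling_on else 0.5


def monitor(room: Room) -> float:
    """Thermometer: observe the monitored parameter."""
    return room.temperature


def decide(measured: float, desired: float) -> bool:
    """Thermostat: choose the corrective action (True means run cooling)."""
    return measured > desired


def actuate(room: Room, cooling_on: bool) -> None:
    """Actuating agent: apply the decision to the subsystem."""
    room.cooling_on = cooling_on


def run_loop(room: Room, desired: float, steps: int) -> list:
    """Run monitor -> decide -> actuate once per discrete time step."""
    history = []
    for _ in range(steps):
        measured = monitor(room)
        action = decide(measured, desired)
        actuate(room, action)
        room.step()
        history.append(room.temperature)
    return history
```

Starting from a room warmer than the desired temperature, the loop drives the temperature down and then keeps it oscillating in a narrow band around the target, which is the behaviour expected from negative feedback with an on/off actuator.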

Similar publications

Conference Paper
In this paper, we present an approach for software rejuvenation based on automated self-healing techniques that can be easily applied to off-the-shelf Application Servers and Internet sites. Software aging and transient failures are detected through continuous monitoring of system data and performability metrics of the application server. If some a...

Citations

... This is because every peer is connected to the rest of the network. In case of the Relaxed Ring topology, disconnections and failure recovery are considered as part of its main features and they are explained in [12]. ...
... As shown in other domains of distributed computing, this configuration has dramatic consequences on the scalability of the networks. In this work, we explore advanced peer-to-peer (P2P) reconfiguration algorithms [17] to deal with the dynamicity and scalability of AmI networks. These algorithms implement efficient network configurations – known as network topologies – and message rerouting techniques that enable each network participant – known as peer (e.g. a device) – to have complete access to the network by having a small number of direct connections to other peers. ...
... This algorithm takes advantage of the best features of a fully connected network when the number of peers is small enough to allow the devices to manage this kind of topology. When the network becomes too large to maintain a fully connected topology, the algorithm will automatically adapt the network configuration to become a Relaxed Ring [17], which can handle a large number of peers by executing more complex algorithms for self-managing the distributed network. We consider different aspects concerning the transition between networks: adaptation of the base algorithms, maintaining the network's coherence and self-healing from inefficient configurations. ...
... As we state in the motivation section, we look for an algorithm that can take full advantage of the current network size. First we will review two existing network topologies: fully connected and Relaxed Ring [17]. We discuss why these topologies are only partially suitable for the requirements shown in the previous section before introducing our approach called PALTA, that uses a combination of the fully connected and Relaxed Ring topologies, in Section 4. ...
Conference Paper
Many ambient intelligence (AmI) scenarios fit perfectly for auto-generated distributed networks, but they assume the existence of a good enough network topology organizing the connected devices. AmI scenarios need to handle an unanticipated number of participants, and inappropriate distributed network topologies can affect the network's efficiency by making it unstable and hard to manage. This paper introduces PALTA, a self-adapting hybrid topology capable of dynamically adjusting its configuration by using a combination of existing topologies. PALTA allows the incremental construction of self-maintained distributed networks which take advantage of the current network state.
... We propose PALTA: Peer-to-peer AdaptabLe Topology for Ambient intelligence, a dynamic topology intended for highly variable networks like in AmI scenarios. It combines already developed distributed topologies: fully connected networks and the Relaxed-ring [6]. PALTA manages the network configuration in order to optimize the communication between peers depending on the network size. ...
... The Relaxed-Ring topology [6] is a Chord-like [8] ring, where every peer has a successor (succ) and a predecessor (pred). It is a structured overlay network providing a Distributed Hash Table (DHT) where every peer is responsible for a certain range of hash-keys, which is delimited by its own key and the key of its predecessor, pred. ...
... Following the strategy of [6], where the relaxed-ring is modeled as a feedback loop, we can also model PALTA as shown in Figure 3. The monitors, actuators and the component that decides the corrective actions are placed at every node. ...
Conference Paper
Ambient Intelligence scenarios can be deployed even when the environment lacks an underlying network infrastructure. This can be done using distributed ad-hoc networks. Ambient Intelligence applications can be highly variable and networks can have an unanticipated number of members. Inappropriate distributed network topologies can lead to unstable and inefficient communication. We propose PALTA, a decentralized and self-adaptable network topology. We use feedback loops to model its self-adaptable behaviour and we evaluate its performance using different simulations and measurements. PALTA allows the construction of distributed networks using self-management techniques while maintaining good overall performance of the network communication.
... We have devised algorithms for handling imperfect failure detection (false suspicions) [19], which vastly reduces the probability of lookup inconsistency. Imperfect failure detection is handled by relaxing the ring invariant to obtain a so-called "relaxed ring," which maintains ...
... In between these two extremes we conjecture that there is a liquid phase, the relaxed ring, where the ring is connected but each node does not have a fixed set of neighbors. When a node is subject to a failure suspicion then its set of neighbors changes [19]. We conjecture that for properly designed SONs phase transitions can occur for changing values of the failure rate. Figure 5 shows the kind of behavior we expect for the relaxed ring. ...
Conference Paper
Programs are fragile for many reasons, including software errors, partial failures, and network problems. One way to make software more robust is to design it from the start as a set of interacting feedback loops. Studying and using feedback loops is an old idea that dates back at least to Norbert Wiener's work on Cybernetics. Up to now almost all work in this area has focused on how to optimize single feedback loops. We show that it is important to design software with multiple interacting feedback loops. We present examples taken from both biology and software to substantiate this. We are realizing these ideas in the SELFMAN project: extending structured overlay networks (a generalization of peer-to-peer networks) for large-scale distributed applications. Structured overlay networks are a good example of systems designed with interacting feedback loops. Using ideas from physics, we postulate that these systems can potentially handle extremely hostile environments. If the system is properly designed, it will perform a reversible phase transition when the node failure rate increases beyond a critical point. The structured overlay network will make a transition from a single connected ring to a set of disjoint rings and back again when the failure rate decreases. We are exploring how to expose this phase transition to the application so that it can continue to provide a service. For validation we are building three realistic applications taken from industrial case studies, using a distributed transaction layer built on top of the overlay. Finally, we propose a research agenda to create a practical design methodology for building systems based on the use of interacting feedback loops and reversible phase transitions.
... Chord itself presents temporary inconsistencies with peers massively join- ... the ring, because the Relaxed-Ring remains correct after every step, reducing the cost of maintenance. Part of this contribution has been published in [11], where the Relaxed-Ring was presented from the point of view of its design as a self-managing system. This work is focused on the correctness of the algorithm through analytical results, and an empirical validation with simulations. ...
... When the message arrives to node t, it is sent backwards to the branch, until it reaches the real responsible. Forwarding the request to the responsible is a conclusion we have already presented in [11], and it has been recently confirmed by Shafaat [14]. Introducing branches into the lookup mechanism modifies the guarantees about proximity offered by Chord. ...
Article
Fault-tolerance and lookup consistency are considered crucial properties for building applications on top of structured overlay networks. Many of these networks use the ring topology for the organization of their peers. The network must handle multiple joins, leaves and failures of peers while keeping the connection between every pair of successor-predecessor correct. This property makes the maintenance of the ring very costly and temporarily impossible to achieve, requiring periodic stabilization for fixing the ring. We introduce the relaxed-ring topology, which does not rely on a perfect successor-predecessor relationship and does not need any periodic maintenance. Leaves and failures are considered as the same type of event, providing a fault-tolerant and self-organizing maintenance of the ring. The relaxed-ring's limitations with respect to failure handling are formally identified, providing strong guarantees to develop applications on top of the architecture. Besides permanent failures, the paper analyses temporary failures and false suspicions caused by broken links, which are often ignored.
... We believe that the correct solution to this problem is to move towards a decentralized architecture using principles from peer-to-peer networks. We intend to refactor our system in two phases to achieve a P2P structure: in the first phase we will use a hybrid architecture with a centralized directory storing the locations of the services of the system, and in the second phase we will move to a relaxed-ring structure [11] linking the helper and administrator agents to achieve self-healing properties [5]. ...
Conference Paper
We describe the commercial application of agents to the handling of catalogue and stock-control for the selling of books on the internet. The primary characteristic of the target market is (very) low volumes over a (very) large number of items, thus agility and extremely low overheads are the essential factors for a viable business model. Being a new company (established 2004), without legacy software and with the freedom to make new choices, it was decided that the agent abstraction offered both short-term software engineering and longer-term business advantages. This expectation has been borne out in practice, in that it has been possible to construct an e-trading platform, using a 4-person team over a period of a few months, and that is now part of a live business operation handling just over 12,000 transactions daily. In this paper we explain how agents helped focus attention on the responsibilities of key software functions, how different functions should interact with one another and how to identify and propagate key performance indicator information through the system to detect unexpected behaviour. Agent technology has many potential benefits for dynamic fast-moving businesses where software requirements change quickly and business needs grow rapidly, all within a dynamic environment that has entirely different rules across the axes of geography, market, customer and competitor. Using autonomous agents allowed The Book Depository to build quickly a complex network of P2P relationships with a large number of suppliers and publishers of very different sizes who each utilize a variety of different trading and data interchange standards.
... In both cases, the ring structure must be maintained. This can be handled through the relaxed ring algorithm [26]. This algorithm maintains the invariant that every peer is in the same ring as its successor. ...
Conference Paper
As Internet applications become larger and more complex, the task of managing them becomes overwhelming. "Abnormal" events such as software updates, failures, attacks, and hotspots become frequent. The SELFMAN project will show how to handle these events automatically by making the application self managing. SELFMAN combines two technologies, namely structured overlay networks and advanced component models. Structured overlay networks (SONs) developed out of peer-to-peer systems and provide robustness, scalability, communication guarantees, and efficiency. Component models provide the framework to extend the self-managing properties of SONs over the whole application. SELFMAN is building a self-managing transactional storage and using it for three application demonstrators: a machine-to-machine messaging service, a distributed Wiki, and an on-demand media streaming service. This paper provides an introduction and motivation to the ideas underlying SELFMAN and a snapshot of its contributions midway through the project. We explain our methodology for building self-managing systems as networks of interacting feedback loops. We then summarize the work we have done to make SONs a practical basis for our architecture: using an advanced component model, handling network partitions, handling failure suspicions, and doing range queries with load balancing. Finally, we show the design of a self-managing transactional storage on a SON.
... We have reimplemented them using a concurrent component model and we are extending them with the hooks and abilities needed for building services and applications. We have devised algorithms for handling imperfect failure detection (false suspicions) [12] and network partitioning (detecting and merging partitions) [15]. Implementing transactions over a structured overlay network is challenging because of churn (the rate of node leaves, joins, and failures and the subsequent reorganizations of the overlay) and because of the Internet's failure model (crash stop with imperfect failure detection). ...
Article
Programs are fragile for many reasons, including software errors, partial failures, and network problems. One way to make software more robust is to design it from the start as a set of interacting feedback loops. Studying and using feedback loops is an old idea that dates back at least to Norbert Wiener's work on Cybernetics. But almost all work in this area has focused on single feedback loops. We show that it is important to design software with multiple interacting feedback loops. We present examples taken from both biology and software to substantiate this. To make this idea practical, a necessary condition is good support for concurrent programming. We find that a message-passing model without shared state works well. Our own work focuses on extending structured overlay networks (a generalization of peer-to-peer networks) for large-scale distributed applications. Structured overlay networks are a good example of systems designed from the start as interacting feedback loops. We show how to extend them with a distributed transaction layer that keeps their good self-organization properties. We are using this system to build three realistic application scenarios taken from industrial case studies.
Article
The widespread of interconnectable computers gives systems the chance to operate more efficiently, by better utilizing the cooperation between individual components. User-centric solutions address the devices themselves and, since there is no network infrastructure and a device powerful enough to assume the role of a coordinator, adopting a peer-to-peer model tends to be the best solution. In this paper we propose AFT, an overlay that adapts to a changing number of nodes, is resilient to faults and is the foundation for an efficient implementation of a reputation based trust system. The AFT overlay is designed to be a solution for systems that need to share transient information, performing a synchronization between various components, like in mobile ad-hoc networks, M2M networks, urban networks, and wireless sensor networks. The operations supported by the overlay, like joining, leaving, unicast transmission, broadcast sharing and maintenance can be accomplished in a duration belonging to , where is the number of nodes which are part of the structure. We proved these properties and we evaluate the time performance related to overlay creation and node joining.
Chapter
When multiple users work collaboratively, coherence is not an easy feature to guarantee. It requires exclusive access to some part of the User Interface (UI) and needs to give some feedback to other users. This synchronization needs a true concurrency control algorithm. One of the most common solutions is to use a server as a transactional manager. Unfortunately, a central point of control is also a single point of failure. This paper proposes a decentralized architecture based on a peer-to-peer network providing decentralized transactional support with replicated storage. As a consequence, there is a gain in fault-tolerance, and the transactional protocol eliminates the problem of network delay, improving the overall usability. The addition of a feedback mechanism allows the users to better understand the behavior of the system.