Figure 6 - uploaded by Natalia Chechina
Hierarchical clustering  

Source publication
Conference Paper
We consider the problem of adapting distributed Erlang applications to large or heterogeneous architectures to achieve good performance in a portable way. In many architectures, and especially large architectures, the communication latency between pairs of virtual machines (nodes) is no longer uniform. We propose two language-level methods that ena...

Contexts in source publication

Context 1
... gives rise to a system of nested clusters, as illustrated in Figure 6 for a set of points in the plane (with the usual Euclidean metric). A question arises here: we know the distance between two points (that is our basic data), but how do we measure the distance between two clusters? ...
Context 2
... is a tree which has one node for each cluster, with the children of a cluster being its subclusters. Figure 7 shows the dendrogram corresponding to Figure 6. The dendrogram was obtained using the hclust command in the R system for statistical analysis and visualisation [23], using distances between points measured directly from Figure 6. ...
Context 3
... 7 shows the dendrogram corresponding to Figure 6. The dendrogram was obtained using the hclust command in the R system for statistical analysis and visualisation [23], using distances between points measured directly from Figure 6. The dendrogram describes the hierarchical nested structure seen in Figure 6, with the height of the internal nodes of the dendrogram reflecting the distances between the corresponding subclusters. ...
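The question raised in Context 1, how to measure the distance between two clusters, is usually answered by choosing a linkage rule. The following is a minimal Python sketch of agglomerative clustering with two common rules (single and complete linkage). It is not the authors' method: the source used the hclust command in R, and the point coordinates below are invented for illustration rather than taken from Figure 6. The function names are hypothetical.

```python
import math
from itertools import combinations


def single_linkage(a, b):
    """Cluster distance = smallest distance between any pair of points."""
    return min(math.dist(p, q) for p in a for q in b)


def complete_linkage(a, b):
    """Cluster distance = largest distance between any pair of points."""
    return max(math.dist(p, q) for p in a for q in b)


def agglomerate(points, linkage):
    """Repeatedly merge the two closest clusters.

    Returns the merge history as (left_cluster, right_cluster, height)
    tuples, which is the information a dendrogram displays.
    """
    clusters = [(p,) for p in points]  # start with one cluster per point
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest linkage distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        height = linkage(clusters[i], clusters[j])
        merges.append((clusters[i], clusters[j], height))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Illustrative points only (assumed, not measured from Figure 6).
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0), (20.0, 0.0)]

single_heights = [h for _, _, h in agglomerate(points, single_linkage)]
complete_heights = [h for _, _, h in agglomerate(points, complete_linkage)]
print(single_heights)
print(complete_heights)
```

The merge heights play the role of the internal node heights in the dendrogram: both rules produce the same nesting for these points, but complete linkage reports larger heights for the later merges because it uses the farthest pair of points rather than the nearest.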

Similar publications

Article
During the software lifecycle, a program can evolve several times for different reasons such as the optimisation of a bottle-neck, the refactoring of an obscure function, etc. These code changes often involve several functions or modules, so it can be difficult to know whether the correct behaviour of the previous releases has been preserved in the...

Citations

... At the language level the design, implementation and validation of the new libraries (Section V) have been reported piecemeal [21,60], and are included here for completeness. ...
... We have implemented two Erlang libraries to support semi-explicit placement [60]. The first deals with node attributes, and describes properties of individual Erlang VMs and associated hosts, such as total and currently available RAM, installed software, hardware configuration, etc. ...
Article
Distributed actor languages are an effective means of constructing scalable reliable systems, and the Erlang programming language has a well-established and influential model. While the Erlang model conceptually provides reliable scalability, it has some inherent scalability limits and these force developers to depart from the model at scale. This article establishes the scalability limits of Erlang systems and reports the work of the EU RELEASE project to improve the scalability and understandability of the Erlang reliable distributed actor model. We systematically study the scalability limits of Erlang and then address the issues at the virtual machine, language, and tool levels. More specifically: (1) We have evolved the Erlang virtual machine so that it can work effectively in large-scale single-host multicore and NUMA architectures. We have made important changes and architectural improvements to the widely used Erlang/OTP release. (2) We have designed and implemented Scalable Distributed (SD) Erlang libraries to address language-level scalability issues and provided and validated a set of semantics for the new language constructs. (3) To make large Erlang systems easier to deploy, monitor, and debug, we have developed and made open source releases of five complementary tools, some specific to SD Erlang. Throughout the article we use two case studies to investigate the capabilities of our new technologies and tools: a distributed hash table based Orbit calculation and Ant Colony Optimisation (ACO). Chaos Monkey experiments show that two versions of ACO survive random process failure and hence that SD Erlang preserves the Erlang reliability model. While we report measurements on a range of NUMA and cluster architectures, the key scalability experiments are conducted on the Athos cluster with 256 hosts (6,144 cores). 
Even for programs with no global recovery data to maintain, SD Erlang partitions the network to reduce network traffic and hence improves performance of the Orbit and ACO benchmarks above 80 hosts. ACO measurements show that maintaining global recovery data dramatically limits scalability; however, scalability is recovered by partitioning the recovery data. We exceed the established scalability limits of distributed Erlang, and do not reach the limits of SD Erlang for these benchmarks at this scale (256 hosts, 6,144 cores).
... It was introduced to provide a reusable solution that overcomes the scalability limitations posed by transitive connectivity, a global namespace, and a lack of resource awareness, while preserving the fault tolerance mechanisms of distributed Erlang. This was achieved by introducing two new libraries: (1) attribute, which provides semi-explicit process placement [23], and (2) s_group, which partitions the node connection graph into s_groups [5]. SD Erlang has been available with several releases of Erlang/OTP, and is likely to remain available in the medium term, as the Erlang/OTP group at Ericsson indicates no near-future plans to change the mechanisms that the s_group libraries rely on. ...
... Nodes in the same s_group maintain all-to-all connections and a common namespace. So we might want to put nodes in the same group because of, e.g., communication distances or frequency of communication between the nodes, or a common node attribute, such as available hardware [23]. To assist this decision, it is possible to use Devo, which shows nodes' affinity (Section 6.6), and Percept2, which shows the communication between nodes (Section 6.7). ...
Article
Large scale servers with hundreds of hosts and tens of thousands of cores are becoming common. To exploit these platforms software must be both scalable and reliable, and distributed actor languages like Erlang are a proven technology in this area. While distributed Erlang conceptually supports the engineering of large scale reliable systems, in practice it has some scalability limits that force developers to depart from the standard language mechanisms at scale. In earlier work we have explored these scalability limitations, and addressed them by providing a Scalable Distributed (SD) Erlang library that partitions the network of Erlang Virtual Machines (VMs) into scalable groups (s_groups). This paper presents the first systematic evaluation of SD Erlang s_groups and associated tools, and how they can be used. We present a comprehensive evaluation of the scalability and reliability of SD Erlang using three typical benchmarks and a case study. We demonstrate that s_groups improve the scalability of reliable and unreliable Erlang applications on up to 256 hosts (6144 cores). We show that SD Erlang preserves the class-leading distributed Erlang reliability model, but scales far better than the standard model. We present a novel, systematic, and tool-supported approach for refactoring distributed Erlang applications into SD Erlang. We outline the new and improved monitoring, debugging and deployment tools for large scale SD Erlang applications. We demonstrate the scaling characteristics of key tools on systems comprising up to 10K Erlang VMs.
... Scalable Distributed Erlang (SD Erlang) was designed to preserve reliability of distributed Erlang while enabling scalability by partitioning the node connection graph into s_groups [4], and by introducing semi-explicit process placement [8]. ...
... In this experiment we analyse the impact of SD Erlang s_groups when no failure occurs. For that we vary the number of server nodes (3, 4, 6, 8, 12, 16) while maintaining just a single router node. Since RSD-IM has only one s_group, this set-up results in identical architectures for both IM versions, where s_group operations in the RSD-IM are identical to the global operations in the RD-IM. ...
... In this experiment we analyse the impact of the size of s_groups on the IM performance. For that we again increase the number of servers (6, 8, 12), but this time we fix the number of routers to two. In the case of RSD-IM this results in two s_groups where, depending on the total number of servers, each s_group has either 3, 4, or 6 server nodes. ...
Conference Paper
Erlang has world-leading reliability capabilities, but while it scales extremely well within a single node, distributed Erlang has some scalability issues. The Scalable Distributed (SD) Erlang libraries have been designed to address the scalability limitations while preserving the reliability model, and have been shown to deliver significant performance benefits above 40 hosts using some relatively simple benchmarks. This paper compares the reliability and scalability of SD Erlang and distributed Erlang using an Instant Messaging (IM) server benchmark that is a far more typical Erlang application; is a relatively large and sophisticated benchmark; has throughput as the key performance metric; and uses non-trivial reliability mechanisms. We provide a careful reliability evaluation using chaos monkey. The key performance results consider scenarios with and without failures on up to 17 server hosts (272 cores). We show that SD Erlang adds no performance overhead when all nodes are grouped in a single s_group. However, either adding redundant router nodes in distributed Erlang applications, or dividing a set of nodes into small s_groups in SD Erlang applications, has a small negative impact. Both the distributed Erlang and SD Erlang IM tolerate failures and, up to the failure rates measured, the failures have no impact on throughput. The IM implementations show that SD Erlang preserves the distributed Erlang reliability properties and mechanisms.
... In this paper we only cover research related to the s_group part of SD Erlang: essentially we address the question of how to scale a network of Erlang nodes by reducing the number of connections between the nodes. A discussion of semi-explicit placement can be found in (17). ...
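The connection-count argument can be made concrete with a little arithmetic. The sketch below is an illustration under stated assumptions, not a model taken from the cited papers: it assumes a fully connected network of n nodes needs n(n-1)/2 connections, that groups are disjoint, and that only connections inside each group are counted (any connections between groups, for example through gateway nodes, are ignored). The function names and node counts are hypothetical.

```python
def full_mesh_connections(n):
    """Connections needed when every one of n nodes connects to every other."""
    return n * (n - 1) // 2


def grouped_connections(group_sizes):
    """Connections when nodes are fully connected only within their own group.

    Assumption: groups are disjoint and inter-group links are not counted.
    """
    return sum(full_mesh_connections(size) for size in group_sizes)


# Hypothetical example: 12 nodes in one mesh versus two groups of 6.
one_mesh = full_mesh_connections(12)
two_groups = grouped_connections([6, 6])
print(one_mesh, two_groups)
```

Under these assumptions the all-to-all count grows quadratically with the number of nodes, while the grouped count grows quadratically only in the group size, which is the intuition behind partitioning the node connection graph.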
Article
As the number of cores grows in commodity architectures so does the likelihood of failures. A distributed actor model potentially facilitates the development of reliable and scalable software on these architectures. Key components include lightweight processes which ‘share nothing’ and hence can fail independently. Erlang is not only increasingly widely used, but the underlying actor model has been a beacon for programming language design, influencing for example Scala, Clojure and Cloud Haskell.