Hierarchical clustering

Source publication

Performance portability through semi-explicit placement in distributed Erlang

Conference Paper

Aug 2015

We consider the problem of adapting distributed Erlang applications to large or heterogeneous architectures to achieve good performance in a portable way. In many architectures, and especially large architectures, the communication latency between pairs of virtual machines (nodes) is no longer uniform. We propose two language-level methods that ena...

Context 1

... gives rise to a system of nested clusters, as illustrated in Figure 6 for a set of points in the plane (with the usual Euclidean metric). A question arises here: we know the distance between two points (that is our basic data), but how do we measure the distance between two clusters? ...
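The question of how to measure the distance between two clusters has several standard answers, usually called linkage criteria. The sketch below shows three common ones (single, complete, and average linkage) for points in the plane with the Euclidean metric. The excerpt does not say which criterion the paper uses, so the function names and sample points here are illustrative assumptions only.

```python
from itertools import product
from math import dist  # Euclidean distance between two points (Python 3.8+)


def single_linkage(a, b):
    """Distance between the closest pair of points, one from each cluster."""
    return min(dist(p, q) for p, q in product(a, b))


def complete_linkage(a, b):
    """Distance between the farthest pair of points, one from each cluster."""
    return max(dist(p, q) for p, q in product(a, b))


def average_linkage(a, b):
    """Mean distance over all cross-cluster pairs of points."""
    return sum(dist(p, q) for p, q in product(a, b)) / (len(a) * len(b))


# Hypothetical sample clusters, not taken from the paper's Figure 6.
left = [(0, 0), (0, 1)]
right = [(3, 0), (4, 0)]

print(single_linkage(left, right))
print(complete_linkage(left, right))
print(average_linkage(left, right))
```

For any two non-empty clusters the single-linkage distance is at most the average-linkage distance, which in turn is at most the complete-linkage distance, so the choice of criterion affects the heights at which clusters merge.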

Context 2

... is a tree which has one node for each cluster, with the children of a cluster being its subclusters. Figure 7 shows the dendrogram corresponding to Figure 6. The dendrogram was obtained using the hclust command in the R system for statistical analysis and visualisation [23], using distances between points measured directly from Figure 6. ...

Context 3

... 7 shows the dendrogram corresponding to Figure 6. The dendrogram was obtained using the hclust command in the R system for statistical analysis and visualisation [23], using distances between points measured directly from Figure 6. The dendrogram describes the hierarchical nested structure seen in Figure 6, with the height of the internal nodes of the dendrogram reflecting the distances between the corresponding subclusters. ...
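The relationship between nested clusters and dendrogram heights can be sketched with a naive agglomerative procedure: start with one cluster per point, repeatedly merge the two closest clusters, and record the merge distance as the height of the new internal node. This is a minimal illustration, not the paper's method. It assumes single linkage and at least one input point, and the tuple-based tree layout and sample points are hypothetical. R's hclust uses complete linkage by default, and the excerpt does not state which method was chosen.

```python
from itertools import combinations
from math import dist  # Euclidean distance between two points (Python 3.8+)


def leaves(node):
    """Return all points stored under a dendrogram node."""
    if node[0] == "leaf":
        return [node[1]]
    return leaves(node[2]) + leaves(node[3])


def cluster_distance(a, b):
    """Single-linkage distance between two dendrogram nodes (an assumption)."""
    return min(dist(p, q) for p in leaves(a) for q in leaves(b))


def build_dendrogram(points):
    """Merge the closest pair of clusters until a single root remains.

    Leaves are ("leaf", point); internal nodes are
    ("node", height, left_child, right_child), where height is the
    distance between the two subclusters at the moment they were merged.
    """
    clusters = [("leaf", p) for p in points]
    while len(clusters) > 1:
        # Pick the indices of the two clusters that are currently closest.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        height = cluster_distance(clusters[i], clusters[j])
        merged = ("node", height, clusters[i], clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0]


# Hypothetical points: two tight pairs that are far apart from each other.
points = [(0, 0), (0, 1), (5, 0), (5, 1.5)]
root = build_dendrogram(points)
print(root[1])                # height of the root node
print(root[2][1], root[3][1]) # heights of the two subclusters
```

In this sample the two tight pairs merge first at small heights, and the root joins them at a much larger height, which mirrors how the height of an internal node reflects the distance between its subclusters.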

Erlang Code Evolution Control

Article

Sep 2017

During the software lifecycle, a program can evolve several times for different reasons such as the optimisation of a bottleneck, the refactoring of an obscure function, etc. These code changes often involve several functions or modules, so it can be difficult to know whether the correct behaviour of the previous releases has been preserved in the...

Scaling Reliably: Improving the Scalability of the Erlang Distributed Actor Platform

Article

Apr 2017
ACM T PROGR LANG SYS

Distributed actor languages are an effective means of constructing scalable reliable systems, and the Erlang programming language has a well-established and influential model. While the Erlang model conceptually provides reliable scalability, it has some inherent scalability limits and these force developers to depart from the model at scale. This article establishes the scalability limits of Erlang systems and reports the work of the EU RELEASE project to improve the scalability and understandability of the Erlang reliable distributed actor model. We systematically study the scalability limits of Erlang and then address the issues at the virtual machine, language, and tool levels. More specifically: (1) We have evolved the Erlang virtual machine so that it can work effectively in large-scale single-host multicore and NUMA architectures. We have made important changes and architectural improvements to the widely used Erlang/OTP release. (2) We have designed and implemented Scalable Distributed (SD) Erlang libraries to address language-level scalability issues and provided and validated a set of semantics for the new language constructs. (3) To make large Erlang systems easier to deploy, monitor, and debug, we have developed and made open source releases of five complementary tools, some specific to SD Erlang. Throughout the article we use two case studies to investigate the capabilities of our new technologies and tools: a distributed hash table based Orbit calculation and Ant Colony Optimisation (ACO). Chaos Monkey experiments show that two versions of ACO survive random process failure and hence that SD Erlang preserves the Erlang reliability model. While we report measurements on a range of NUMA and cluster architectures, the key scalability experiments are conducted on the Athos cluster with 256 hosts (6,144 cores). 
Even for programs with no global recovery data to maintain, SD Erlang partitions the network to reduce network traffic and hence improves performance of the Orbit and ACO benchmarks above 80 hosts. ACO measurements show that maintaining global recovery data dramatically limits scalability; however, scalability is recovered by partitioning the recovery data. We exceed the established scalability limits of distributed Erlang, and do not reach the limits of SD Erlang for these benchmarks at this scale (256 hosts, 6,144 cores).

Evaluating Scalable Distributed Erlang for Scalability and Reliability

Article

Jan 2017
IEEE T PARALL DISTR

Large-scale servers with hundreds of hosts and tens of thousands of cores are becoming common. To exploit these platforms, software must be both scalable and reliable, and distributed actor languages like Erlang are a proven technology in this area. While distributed Erlang conceptually supports the engineering of large-scale reliable systems, in practice it has some scalability limits that force developers to depart from the standard language mechanisms at scale. In earlier work we have explored these scalability limitations, and addressed them by providing a Scalable Distributed (SD) Erlang library that partitions the network of Erlang Virtual Machines (VMs) into scalable groups (s_groups). This paper presents the first systematic evaluation of SD Erlang s_groups and associated tools, and how they can be used. We present a comprehensive evaluation of the scalability and reliability of SD Erlang using three typical benchmarks and a case study. We demonstrate that s_groups improve the scalability of reliable and unreliable Erlang applications on up to 256 hosts (6,144 cores). We show that SD Erlang preserves the class-leading distributed Erlang reliability model, but scales far better than the standard model. We present a novel, systematic, and tool-supported approach for refactoring distributed Erlang applications into SD Erlang. We outline the new and improved monitoring, debugging and deployment tools for large-scale SD Erlang applications. We demonstrate the scaling characteristics of key tools on systems comprising up to 10K Erlang VMs.

A scalable reliable instant messenger using the SD Erlang libraries

Conference Paper

Sep 2016

Erlang has world-leading reliability capabilities, but while it scales extremely well within a single node, distributed Erlang has some scalability issues. The Scalable Distributed (SD) Erlang libraries have been designed to address the scalability limitations while preserving the reliability model, and have been shown to deliver significant performance benefits above 40 hosts using some relatively simple benchmarks. This paper compares the reliability and scalability of SD Erlang and distributed Erlang using an Instant Messaging (IM) server benchmark that is a far more typical Erlang application; is a relatively large and sophisticated benchmark; has throughput as the key performance metric; and uses non-trivial reliability mechanisms. We provide a careful reliability evaluation using Chaos Monkey. The key performance results consider scenarios with and without failures on up to 17 server hosts (272 cores). We show that SD Erlang adds no performance overhead when all nodes are grouped in a single s_group. However, either adding redundant router nodes in distributed Erlang applications, or dividing a set of nodes into small s_groups in SD Erlang applications, has a small negative impact. Both the distributed Erlang and SD Erlang IM tolerate failures and, up to the failure rates measured, the failures have no impact on throughput. The IM implementations show that SD Erlang preserves the distributed Erlang reliability properties and mechanisms.

Improving the network scalability of Erlang

Article

Feb 2016
J PARALLEL DISTR COM

As the number of cores grows in commodity architectures so does the likelihood of failures. A distributed actor model potentially facilitates the development of reliable and scalable software on these architectures. Key components include lightweight processes which ‘share nothing’ and hence can fail independently. Erlang is not only increasingly widely used, but the underlying actor model has been a beacon for programming language design, influencing for example Scala, Clojure and Cloud Haskell.
