Docker cluster management for the cloud
survey results and own solution
René Peinl1, Florian Holzschuher1, Florian Pfitzer2
1Institute of Information Systems, Hof University, Alfons-Goppel-Platz 1, Hof, Germany
rene.peinl@iisys.de, florian.holzschuher2@iisys.de, florian@pfitzer.me
Keywords: Cloud Computing, Management Tools, Microservices, System Integration, Docker, Container
Abstract: Docker provides a good basis to run composite applications in the cloud, especially if those are not
cloud-aware or cloud-native. However, Docker concentrates on managing containers on one host, but SaaS
providers need a container management solution for multiple hosts. Therefore, a number of tools emerged
that claim to solve the problem. This paper classifies the solutions, maps them to requirements from a case
study and identifies gaps and integration requirements. We close some of these gaps with our own integration
components and tool enhancements, resulting in the currently most complete management suite.
1. INTRODUCTION
Virtual machines as the core virtualization
construct of the cloud have been improved
successively by addressing scheduling, packaging
and resource access problems [1]. VM instances use
isolated large files on their host to store their entire
file system and typically run as a single process on the
host. A full guest OS image for each VM is needed in
addition to the binaries and libraries necessary for the
applications. That leads to high RAM and disk
storage requirements and slow startup (ibid.).
Containers are a more lightweight virtualization
concept, i.e. less resource and time consuming [2][3].
They can be seen as more flexible tools for packaging,
delivering and orchestrating both software services
and applications. Containers build on recent advances
in virtualization and therefore allow for better
portability and interoperability while still utilizing
operating system virtualization techniques [4].
Although lightweight operating system (OS)
virtualization techniques like Solaris Zones and
OpenVZ are long established, it was the release of
Docker in March 2013 [5] that led to mass adoption
and even a hype around containerization [6]. Docker
originally built on LXC, but Docker, Inc. later wrote
its own runtime called libcontainer in order to ease
Go language bindings. When Docker, Inc. started to
build its own ecosystem around Docker, CoreOS
positioned rkt (formerly Rocket) against Docker,
arguing that Docker had stopped concentrating on
building the best possible container runtime and did
not address the security problems as much as needed
[7]. With LXD from Ubuntu there is a third alternative
for running containers, although Ubuntu claims to
concentrate on providing a lightweight approach for
virtualizing a whole operating system, whereas
Docker concentrates on packaging applications
(ibid.).
Meanwhile, the so-called Open Container
Initiative tries to bring all players together again and
is working on a container format specification and a
runtime implementation called runC [8].
Docker aims at making container technologies
easy to use and among other things encourages a
service-oriented architecture (SOA) and especially
the microservice architecture style [9, 10]. In a
Software as a Service (SaaS) scenario, you therefore
cannot guarantee that the activities of one customer
won't negatively affect other customers if you are
using containers only.
The need for a kind of “application package
format” as a basis for composite SaaS offerings
following the SOA principles [11] was already
discussed in [12]. Originating from a Platform as a
Service (PaaS) use case, Docker should be a good
basis to handle the components of a composite
application offered in the cloud. It provides an easy
and convenient way to create, deploy and configure
containers [5], including links to dependent
containers on the same host. Microservices can be
briefly summarized as a “loosely coupled service
oriented architecture with bounded contexts” [13],
with loosely coupled denoting that each service
should be independently deployable and bounded
contexts meaning that the service does not have to
know anything about its surroundings, but can
discover them on its own [cf. 14]. Enterprise applications are
often complex composite applications, which consist
of multiple individual components [15] and therefore
match microservices well. However, the Docker tools
soon reach their limits when it comes to managing
containers in a cluster or creating links across
multiple hosts [6]. To overcome those, a myriad of
tools is currently in development. We found over 60
tools in “The Docker Book” [9], a special issue of the
German developer magazine dedicated to Docker
[16] and the “Docker ecosystem” mindmap [17] with
relevance for building an automated Docker cluster
management solution similar to OpenStack on the
IaaS level [see 18 for a full list]. Docker, Inc. even
counts 50,000 third-party projects on GitHub that are
using Docker [19]. Recent publications like [20-22]
show the ongoing interest in this topic. Our
hypothesis is that, despite the benefits of competition,
the time has come to work together on a common
cluster project similar to OpenStack, forming a
comprehensive integrated solution with stable
interfaces for the required components in order to
make them interchangeable, instead of building yet
another tool that solves some of the challenges, but
not all of them, and is not integrated with others.
2. METHODOLOGY
We followed a similar procedure to [23]. The goal
in our project is to move existing open source systems
to the cloud in order to offer them as SaaS. They form
a complex composite application with lots of
dependencies. Therefore, an infrastructure solution is
needed that allows administrators to specify the
dependencies and takes care of running the systems.
The first steps were an analysis of the use case and
the derivation of the relevant requirements. Then, a
list of tools was collected and analysed based on the
available documentation. The most promising
candidates were installed and tested in more detail.
Special attention was given to the interplay between
different tools that cover parts of the requirements.
Following this approach, two solution stacks were
identified. One of the stacks was enhanced with our
own glue code to improve the integration between
the tools. This solution is finally evaluated against
the requirements.
Although this procedure was project specific, the
results can be transferred to all companies running
Docker in a cloud for themselves or other companies
and running at least a medium number of containers
(several dozen). Docker itself targets different user
groups, from single developers who want to easily
test their applications to large cloud providers like
Google launching millions of containers per month.
The solution described here targets the large
spectrum of users between those two extremes. As
soon as you talk about a true cloud environment with
elasticity and on-demand provisioning, you need
automation, and then all the requirements discussed
below need to be fulfilled.
The rest of the paper is structured as follows. We
briefly describe our SaaS project that serves as a case
study to derive requirements. We continue listing and
explaining the requirements and then compare them
to the functionality of existing tools. We categorize
those tools and elaborate on consistent definitions for
those categories. We present results of an empirical
study that tested the most promising candidates for a
comprehensive solution and propose our own
solution based on a number of existing tools. We
conclude with remaining challenges and an outlook.
3. CASE DESCRIPTION
The goal of the SCHub project (Social
Collaboration Hub, funded by the BMBF as part of
the FHprofUnt funding, https://www.sc-hub.de) is to
develop a distribution-like collaboration solution
based on open source software (OSS) that provides
end-users with a consistent experience across all
systems while using a modular microservice
approach [13]. It therefore represents a composite
application [24]. The solution will be available as
Software as a Service (SaaS) in the cloud as well as
for on premise installation. In order to achieve that, a
number of well-known OSS systems have to be
migrated to the cloud and Docker is an obvious choice
for supporting that. Since not all systems are capable
of handling multiple tenants and customization
possibilities are better that way, SCHub uses
individual instances of all frontend systems (portal,
groupware, …) per tenant and only shares backend
systems across tenants (database, mail server, …).
Each instance is packaged into a Docker container. To
guarantee isolation between instances of different
tenants, dedicated virtual machines (VM) per
customer are used additionally. These VMs become
the Docker hosts in this case. OpenStack serves as the
basis [25]. Initially, there is only one VM per tenant.
When resource limits of this VM are reached,
additional VMs are allocated and some containers are
migrated to a new host. Storage is provided by Ceph,
a software defined storage solution [26] as either
block-level or object storage, depending on the
requirements of the service.
Since off-the-shelf systems are used that are
integrated with our add-ons, we wanted to change
those systems as little as possible in order to stay
upwards compatible and benefit from future releases.
Therefore, the usage of a PaaS platform was not
feasible as it would require adapting the systems to
that platform. However, many components of a PaaS
solution are still needed, e.g. a load balancer, a central
authentication system [27], database as a service and
so on, so that typical Container as a Service offerings
[28] do not provide enough. It turned out that a new
category of cloud offering would be ideal for this
case, a kind of runtime environment for SaaS
applications (RaaS). Where PaaS targets developers,
RaaS targets application administrators. The
following chapter lists the requirements for such a
solution.
4. REQUIREMENTS
From a provider’s perspective, automating the
management of the offered services is of vital
importance, because management and operation of IT
is one of the biggest cost factors today [15]. Many of
the required features are simply a transfer of IaaS
management features to Docker. There should be a
central list of containers (r01) with an overview of
resource usage, IP address, open ports, dependencies
and so on. You need a detail view of a container (r02)
including a way to change the configuration using the
Web UI concerning networking, storage and
dependencies. Since the application is built from multiple
services and therefore containers, it would be helpful
to be able to centrally define a kind of blueprint (r04)
that includes all the dependencies and to instantiate
the whole solution instead of single containers (r05).
For doing so, the management solution should
monitor resource usage of hosts (r07) and
automatically choose one with free resources, based
on an editable placement strategy (r06). Monitoring
should include CPU, RAM, storage and networking,
as well as application health. You should be able to
configure thresholds so that high CPU utilization over
a specified timeframe or low available memory
triggers alerts (r08), which in turn can trigger actions
like migrating a container to another host. Migration
(r10) could be performed by stopping the container,
unmounting the storage, starting an identical
container on a new host, updating service references
(r11) and mounting the storage (r12) there. Besides
storage, there should also be an easy way to pass
configuration data to the application inside the
container (r13). This data has to be stored in a
distributed key/value store (r14).
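As an illustration of r13 and r14, the following Python sketch reads a configuration value from a distributed key/value store (here etcd, one of the stores discussed in section 5.6); the key name and etcd address are illustrative assumptions, not part of our solution.

# Hypothetical sketch: read a configuration value from an etcd v2 key/value
# store inside a container (the key and etcd address are illustrative only).
import requests

ETCD = "http://etcd.cluster.local:2379"  # assumed address of the etcd cluster

def read_config(key):
    resp = requests.get(f"{ETCD}/v2/keys/{key}")
    resp.raise_for_status()
    return resp.json()["node"]["value"]

if __name__ == "__main__":
    db_host = read_config("tenant1/wordpress/db_host")
    print("Database host:", db_host)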
For communication between containers across
hosts (r15), you ideally need an overlay network or a
software defined network (SDN) [29]. Its
configuration should be accessible directly from the
Docker management UI, e.g., for defining IP address
ranges (r16). The Web UI of the SDN could be simply
integrated. You need routing of external requests with
URLs to tenant-specific container IPs (r17). This
routing should include load balancing if multiple
container instances are available (r18). There should
be a way to review the list of available images (r19)
including versions and ideally an association to the
containers running that image. If an image is updated,
the admin should be able to trigger a mechanism that
propagates the updates to the running instances (r20),
e.g. analogous to the migration described above.
There should be a way to access the container’s
console or open an SSH shell respectively (r21) and
review the log files (r09), both using the Web UI.
Table 1: List of requirements for a container mgmt. solution

No.   Requirement description
r01   Container list
r02   Container details view/edit
r03   Create new Docker hosts (VMs) from Web
r04   Blueprint for composite applications
r05   Launch (composite) application
r06   Editable container placement strategy
r07   Monitor container & host resource usage
r08   Trigger alerts based on monitoring events
r09   Review log files from Web UI
r10   Migration of containers to different hosts
r11   Update service references
r12   Mount (external) storage volumes
r13   Pass configuration data to containers
r14   Store data in a distributed database
r15   Communication of containers across hosts
r16   Web-based configuration for network
r17   Routing of external URLs to internal IPs
r18   Load balancing between containers
r19   List available images and instances
r20   Update containers with new image
r21   Access container command line from Web
r22   Customer/tenant management solution
It would also be desirable to have an integration
of the underlying IaaS solution, so that you can create
new hosts (VMs) from within the Docker Web UI
(r03), since container solutions are often built on IaaS
platforms [28]. Finally, there should be an integration
with a tenant / customer management solution (r22)
where both administrators and customers can review
information like the list of Docker containers per
tenant, the resulting resource usage, the number of
total and monthly active users as well as respective
billing information. It could be further argued that an
authentication solution is needed, but we are skipping
this requirement, since Docker itself currently has no
working mechanism for authenticating an
administrator anyway.
5. EXISTING SOLUTIONS
We’ve concentrated our analysis on open source
components, although there are a few impressive
commercial tools available like StackEngine. The
descriptions of the tools' capabilities are based on the
projects' websites. We have installed and tested only
the most promising systems. Figure 1 gives an
overview of tools in the Docker ecosystem, which are
described in more detail in the next sections. The
ellipses represent tools. The thick lines demarcate
areas of functionality that are labeled in the
rectangles. A position closer to the center of the figure
within one category indicates that the tool has more
functionality than others in the outer areas. Solid
arrows represent dependencies between tools. Dashed
arrows indicate that the tool is directly supported. It
becomes obvious that there already are some
dependencies and interactions between tools.
However, the situation is far from ideal and the most
promising candidates of different categories are often
developed side-by-side instead of hand-in-hand. Each
subsection starts with a general introduction to the
respective category of tools and what to expect from
a representative in this area and then continues with
the description of concrete tools. At the end of each
subsection we point out similarities to OpenStack,
where applicable.
5.1 Host Operating System
In principle, Docker can run on any modern Linux
system. However, a few specialized Linux
distributions have emerged that propose to bring
exactly what is needed to smoothly run Docker
containers and nothing more.
CoreOS is the most prominent one and was
launched briefly after Docker. Redhat has reacted
quickly and initiated project Atomic, which is
developed in close cooperation with Redhat’s own
PaaS solution OpenShift. Canonical has only recently
announced its own solution in this field called snappy
Ubuntu core. It abandons traditional package
managers and uses snappy, a new tool tailored for
containerized apps. Even VMware (Photon) and Intel
(Clear Linux), both not really well-known for open
source solutions, offer their own optimized container
host Linux distributions. Boot2Docker is based on Tiny
Core Linux and seems to address developers more
than cloud hosters as it provides Windows and Mac
OS X integration.
OpenStack ships with CirrOS as a minimal image
for virtual machines. However, CirrOS brings no
Docker integration by default.
5.2 Image Registry
Docker uses layered images as a package format.
Similar to a disk image of a virtual machine, the
Docker image contains all the files necessary to run
the container and do something meaningful. The
image registry stores them and can be used to retrieve
an image, if it is not already present on the host (r19).

Figure 1: Docker ecosystem with dependencies (own illustration)
Docker Inc. provides a public image registry
called Docker Hub (https://hub.Docker.com) and an
open source implementation for running a private
registry. It is not a service registry (see service
discovery). Recently, version 2.0 of Docker Hub and
the image registry were released
(http://bit.ly/1GOk5N0) with a new architecture and
additional features like verifiable images, resumable
downloads and pluggable storage backend with S3
support [30]. Dogestry is an alternative
implementation using Amazon S3 compatible storage
as a backend. Google also offers a hosted container
registry for its own cloud platform, as do Amazon
(EC2 Container Registry) and CoreOS (quay.io) [31].
The OpenStack counterpart of this category is
Glance.
5.3 Container Management
Docker itself only provides a command-line
interface (CLI) and a RESTful API for managing
containers. This is fine for scripting and automating
things, but there still is a need for a Web UI (r01, r02),
e.g. for self-service administration by a customer.
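As a brief illustration, the following sketch lists containers via the Docker SDK for Python, which wraps the RESTful API mentioned above; it is a minimal example, not part of any of the reviewed tools.

# Sketch (not part of any reviewed tool): listing containers via the
# Docker SDK for Python, which wraps the Docker RESTful API.
import docker

client = docker.from_env()  # talks to the local Docker daemon

for container in client.containers.list(all=True):
    print(container.short_id, container.name, container.status,
          container.image.tags)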
As the name implies, DockerUI provides exactly
that missing WebUI for Docker, while the other
candidates in this category provide additional
functionality like management of composite
applications (Panamax, r04) or broader management
of containers and VMs (mist.io and Cockpit, r03).
Direct terminal access to the containers via Web UI
(r21) is currently under development by mist.io and
already implemented by Rancher (see section 5.11).
The OpenStack counterpart is Horizon.
5.4 Cluster Scheduler
While Docker itself can only list and manage
containers of a single host, a cluster scheduler should
allow the management of a cluster of Docker hosts
and all containers on them, including the resource-
aware placement of new containers (r06) and
automatic failover and migration of containers due to
resource bottlenecks [32]. The need for such a
solution first arose in large companies running
hundreds of thousands of containers, like Google
[33]. Some of the solutions in this category are
inspired by Google Borg or Omega [34].
We found four solutions providing parts of this
functionality incl. a CLI (Apache Brooklyn, Citadel,
CoreOS fleet and Docker Swarm). Decking is similar,
but has additional orchestration capabilities (r04).
Apache Mesos was originally dedicated to hosting
solutions like Hadoop and Spark. Since version 0.20
it also supports running Docker containers. Other
solutions like Shipyard build on them and provide a
Web UI (r01, r02). Clocker additionally provides
some orchestration (r04, r05) and networking
functionality (r15), so that it is getting close to the
management suites (see section 5.11). Flocker
doesn’t provide a Web UI but also has additional
functionality like basic orchestration and networking.
It stands out due to its unique solution of linking
storage to containers in a portable way (r12).
In OpenStack, this cluster management job is
fulfilled by Nova, especially the Nova scheduler.
5.5 Orchestration
Service orchestration is an important feature for
composite applications in a SaaS offering. When
different components are deployed on different hosts
to meet the scalability requirements, those separate
deployments should appear as a single coherent
subsystem to other components [23]. BPEL and
WSCI are examples of orchestration languages in
SoA [35]. Caballer et al. propose a Resource and
Application Description Language (RADL) for
dynamically deploying virtual machine images to
different cloud providers [36]. Docker orchestration
solutions mainly use YAML instead. Orchestration
tools should be able to add links between Docker
containers that are distributed across multiple hosts
(r04).
Some tools found in literature like Crane, Fig and
Maestro (formerly Dockermix) are not able to do that
and concentrate on single hosts. The developers of
Helios, Maestro NG and Shipper all decided not to
build upon cluster management solutions and instead
connect to the different hosts on their own. All three
come without a Web UI. Shipper seems to be the least
mature of the three. Wire is an interesting tool, as it
builds on Fig as well as Open vSwitch and dnsmasq
to configure interdependent containers across hosts.
The OpenStack counterpart of this category is
Heat.
5.6 Service Discovery
Service discovery has always been an issue in
SOA and has never been solved satisfactorily in
practice [37]. Recently, a new proposal was made for
service discovery in a cloud context based on
OpenTosca, an Enterprise Service Bus and Chef [38].
Within the Docker ecosystem, the proposed tools
often represent more of a service registry and leave it
up to the application developer to use the provided
lookup mechanism (r14). Docker’s own mechanism
of container linking (only on the same host) uses
environment variables to pass the IP addresses of
linked containers. A third alternative is to use reverse
proxies to transparently route IP requests to the
respective services. An advantage of that solution is
that you can achieve load balancing at the same time.
The downside is that the proxies route on layer 7 of
the network stack, whereas routing on a lower layer
(e.g., layer 3) is considered faster. Proxy solutions are
discussed in section 5.9.
Etcd and Consul are two well-known
representatives of the service registry category. They
provide a distributed key-value store in order to store
ports, IP addresses or other characteristics of services
running inside Docker containers. Zookeeper and
DoozerD work in similar ways, but are less dedicated
to Docker. Other tools like SkyDNS, dnsmasq and
WeaveDNS try to solve the problem by reusing DNS
for which there are discovery implementations in
every OS. Eureka from Netflix and Synapse from
AirBnB are two examples of tools that were open
sourced by large cloud service providers. Eureka is
specially designed to run in the AWS cloud. Synapse
leverages HAproxy to route requests to services and
watchers to get updated about changes in service
addresses [39]. Tools like SkyDock or registrator
automate the registration process (r11) by monitoring
Docker events and publishing information in the
service registries. Confd stands out from the rest of
the candidates, as it makes it easier for applications to
use the configuration data from those service
registries (r13). It reads data from service registries or
environment variables and updates configuration files
accordingly.
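To illustrate the registry character of these tools, the following sketch registers a service in Consul and looks up its healthy instances via Consul's HTTP API; the service name, address and health check are illustrative assumptions.

# Hypothetical sketch: registering a service in Consul and looking it up
# via the HTTP API (names, addresses and ports are illustrative only).
import requests

CONSUL = "http://localhost:8500"

# Register a service with a simple HTTP health check.
requests.put(f"{CONSUL}/v1/agent/service/register", json={
    "Name": "wordpress",
    "ID": "wordpress-tenant1",
    "Address": "10.0.1.12",
    "Port": 80,
    "Check": {"HTTP": "http://10.0.1.12:80/", "Interval": "10s"},
})

# Look up all healthy instances of the service.
for entry in requests.get(f"{CONSUL}/v1/health/service/wordpress",
                          params={"passing": "true"}).json():
    svc = entry["Service"]
    print(svc["Address"], svc["Port"])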
In OpenStack, there is no dedicated service
discovery tool, since it is focused on IaaS.
5.7 Storage Volumes
Storage is a critical topic in Docker applications.
Due to the layered file system (AUFS or OverlayFS),
write performance inside Docker containers is far
worse than usual. Therefore, Docker suggests using
volumes mounted from the host for writing or
attaching specialized storage containers that are not
using a layered file system. However, both solutions
don’t cover the requirement (r12) of mounting
external storage like Amazon EBS (elastic block
storage) or Ceph directly to the container instead of
mounting them to the host and afterwards to the
container.
Flocker was the only solution for that kind of
problem for quite some time. It is leveraging the
capabilities of ZFS and synchronizes data between
hosts [40]. Starting with version 1.7 in July 2015,
Docker provides a new plug-in mechanism for both
storage volumes and networking. So called volume
drivers allow Docker to mount different kinds of
storage volumes. ClusterHQ, the company behind
Flocker, partnered with EMC [41] to bring the EMC
storage solutions ScaleIO and XtremIO to Docker as
well as Amazon EBS, Rackspace Cloud Block
Storage and OpenStack Cinder compatible storage
providers like Ceph. Other solutions quickly followed
this way, so that four months after the public
announcement, there are already some competing
solutions. EMC provides its own open source solution
called REX-Ray. The Ceph RBD Docker volume
driver by Yp engineering is one of three Ceph-based
solutions [42]. Rancher Convoy can use NFS or the
Linux Device Mapper as a storage backend. It runs on
individual hosts and requires a central object store in
order to backup a volume to that object store and
restore it on a different host. This requires two copy
operations over the network and is therefore not
suitable for migrating containers from one host to
another. However, even the Ceph-based solutions
have open issues like how to determine the “right”
size of the assigned block storage [43], since it
usually cannot be resized online, so that auto-scaling
means a few seconds of downtime.
The OpenStack counterpart of this category is
Cinder.
5.8 Software Defined Network
Within Docker, every container gets a private IP
only visible on the same host. Ideally, an SDN is used
to connect containers between multiple hosts [43,
r15]. Furthermore, isolation is beneficial, so that
every customer (tenant) of the SaaS solution gets their
own virtual network [45]. In an SDN, a logically
centralized controller manages the collection of
switches through a standard interface, letting the
software control virtual and physical switches from
different vendors (ibid.).
Open vSwitch is a popular SDN solution that is
also used by default in OpenStack’s Neutron. It is also
a central part of the larger OpenDaylight initiative.
Despite that, there is a growing number of experts that
point to the limitations of Open vSwitch’s scalability,
since the number of tunnels it establishes is growing
with the square of the number of hosts. Socketplane
and Pipework are overlay networks that make use of
Open vSwitch and are tailored for Docker. They
manage IP assignment (r16) and routing of messages
between networks of multiple hosts. Socketplane
itself runs in a Docker container and uses Consul to
make the configuration available to every host. The
downside is that the Consul server has to run
internally as part of Socketplane which is using an
outdated version. It is not possible to use an existing
Consul cluster.
Flannel and Weave promise to do the same, but
without support for Open vSwitch. Flannel can
leverage VXLAN (virtual extensible local area
network) to tunnel packets or rewrite transport routes
on the host or in an Amazon virtual private cloud
(VPC). It uses an etcd cluster to share configuration
data and IP addresses. Weave is the only SDN
solution that can encrypt the network, which may be
relevant in some setups. It runs in a Docker container,
which leads to a strange dependency situation: if
Docker should use the Weave bridge, Weave has to
be started first to create it and Weave needs the
Docker daemon to start its container. Therefore, the
bridge has to be created manually. Weave can handle
the IP address assignment on its own and provide a
single subnet spanning all hosts or let the Docker
daemon do this and assign individual subnets to all
hosts. Weave encapsulates network packets in its own
UDP-based protocol.
Socketplane was bought by Docker, Inc. in March
2015 and has not been updated since then. Instead, Docker
introduced a pluggable network layer in Docker 1.7
and is working on libnetwork, a host-to-host
communication solution that uses network drivers to
accomplish the goal. This attracted new solutions like
project Calico which uses layer 3 routing together
with BGP (border gateway protocol) and therefore
does not require an overlay network and, using
suitable hardware, no tunneling or network address
translation. It can be used both for OpenStack and for
Docker.
MidoNet by Midokura is somehow similar, but
even more OpenStack focused. Additionally, it has a
commercial branch and uses ZooKeeper as well as
Cassandra. Similar to Calico, it uses BGP to route
between hosts.
Another recent alternative is Ubuntu Fan. Instead
of sharing routes and IP addresses in the cluster, it
uses a deterministic algorithm to create container IP
addresses out of the host IP address. IP packets are
encapsulated into other IP packets so that they can be
routed to the target host, which can remove the
surrounding packet and deliver the original packet to
the right container. The drawback is that containers
will always get a host-specific IP address, which
cannot be moved together with a container.
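The general idea of such a deterministic mapping can be illustrated with the following simplified Python sketch; it is not Fan's actual addressing scheme, only an analogy.

# Simplified illustration of a deterministic host-to-container IP mapping in
# the spirit of Ubuntu Fan; this is NOT Fan's actual addressing scheme.
import ipaddress

def container_ip(host_ip, overlay_prefix, container_index):
    """Derive a container address from the host address: the overlay /8
    prefix plus the last two octets of the host IP form a /24 per host."""
    host = ipaddress.IPv4Address(host_ip)
    o = host.packed
    # e.g. host 10.5.0.23 with overlay 250.0.0.0/8 -> containers in 250.0.23.0/24
    return ipaddress.IPv4Address(bytes([overlay_prefix, o[2], o[3], container_index]))

print(container_ip("10.5.0.23", 250, 7))   # -> 250.0.23.7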
The OpenStack counterpart is Neutron.
5.9 Load Balancer
“Load Balancing is an integral part of Cloud
Computing and elastic scalability” [46]. The cloud
can limit the scalability of a software load balancer
due to security requirements. Amazon EC2 for
example disabled many layer 2 capabilities, such as
promiscuous mode and IP spoofing so that traditional
techniques to scale software load balancers will not
work [47].
HAProxy and Nginx forward traffic on layer 7,
which limits scalability due to SSL termination
(ibid.). However, they are commonly used and fulfill
our requirements (r17, r18). More specialized tools
for this purpose include hipache and pound, which
both work as reverse proxies and HTTP load
balancers. Hipache is also able to deal with
WebSocket connections and can be easily scaled
horizontally.
Load Balancing as a Service is an advanced
service of the OpenStack Neutron component.
5.10 Monitoring
Monitoring is an essential part of cloud
computing [48]. It should provide both historical and
timely information about resource usage from
hardware to virtual machine and container level.
Monitoring application level key performance
indicators regarding conformance to service level
agreements and as a basis for pay-per-use would be
desirable as well (ibid.), but is neglected here.
Typically, solutions are split into tools that collect and
store data about resource consumption and Web-
based front-ends for visualizing them. For monitoring
Docker containers, you can use common solutions
like Nagios that provide extensions for cloud
scenarios, or some specialized tools like Sensu that
are built for scalability from the ground up [48].
Within the Docker ecosystem, the most
specialized solution is Google’s cAdvisor, as it is
tailored for monitoring containers. It brings its own
Web UI. FluentD, Graylog 2 and Logstash are general
purpose tools for log file management (r09). The
latter is often used in conjunction with Elasticsearch
as a NoSQL database and Kibana as a Web UI (ELK
Stack) [49]. Grafana is similar to Kibana and uses
InfluxDB or other time series databases as data stores.
It can be used in conjunction with cAdvisor since the
latter can export data to InfluxDB and, more recently,
Prometheus. This seems advisable, since the Web UI
of cAdvisor is limited to the latest data of a single host
and does not show historical data. The combination
can be further enhanced with Google Heapster, which
directly supports Kubernetes clusters. The container
management solution mist.io also includes
monitoring and seems to be the only one to support
alerts based on thresholds (r08).
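As an illustration, the following sketch pulls the latest resource statistics for Docker containers from cAdvisor's REST API; the endpoint path and address are assumptions and may differ between cAdvisor versions.

# Hedged sketch: fetching the latest resource statistics for Docker containers
# from cAdvisor's REST API (endpoint paths may differ between versions).
import requests

CADVISOR = "http://localhost:8080"  # assumed cAdvisor address

containers = requests.get(f"{CADVISOR}/api/v1.3/docker").json()
for name, info in containers.items():
    latest = info["stats"][-1]          # most recent sample
    mem_mb = latest["memory"]["usage"] / (1024 * 1024)
    print(f"{name}: {mem_mb:.1f} MiB RAM")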
Nagios is the default monitoring tool in
OpenStack.
5.11 Management Suites
Suites are the most comprehensive tools in our
review and should at least include cluster scheduler
and orchestration capabilities (r04-r06). They either
build on multiple other solutions in order to cover the
required functionality (e.g. Kubernetes, which relies
on the CoreOS tools etcd, fleet and flannel) or are
large monolithic solutions from the microservice
perspective (e.g. BOSH). They aim at what Kratzke
calls a lightweight virtualization cluster [6].
Kubernetes is a popular choice in this category,
driven by Google and CoreOS. It is the suite
which fulfills the most requirements of all tools (see
Table 2). The OpenStack Docker driver allows
managing Docker containers just like KVM VMs in
OpenStack and therefore reusing a large part of the
OpenStack modules. In principle, this is the right way
to go (r03). However, it does not match our use case
very well (Docker inside KVM-based VMs), since
you have to decide per host which hypervisor to run
(KVM, Xen or Docker). Furthermore, not all modules
are already updated to be used with Docker.

Table 2: Overview of Docker software tools with fulfilled requirements (light grey means partly fulfilled). [The per-requirement matrix (r01-r22) could not be recovered from the extracted text; only the categories and tool names survive: Image Registry, Container Mgmt, Cluster Scheduler, Orchestration, Service Discovery, Storage (Convoy, REX-Ray, Yp Ceph), SDN (Flannel, MidoNet, Open vSwitch, Pipework, Project Calico, Socketplane, Ubuntu Fan, Weave), Load Balancer (HAProxy, hipache, Nginx, pound, Vulcan), Monitoring (cAdvisor, Grafana, Heapster, Kibana, Fluentd, Graylog2, logstash, Prometheus, InfluxDB), Management Suites (CF BOSH, Flocker, Kubernetes, OpenStack w. Docker Driver, Magnum, Rancher, Triton).]
A promising but still premature (alpha) candidate
is Rancher.io. It aims at solving the multi-host
problems of Docker by providing a Web-based UI,
storage and networking capabilities. Version 0.3 from
January 2015 allows starting and stopping containers
on multiple hosts, linking containers across hosts and
assigning storage (r12). They are also dedicated to
support Docker Swarm and have a terminal agent that
offers Web-based terminal sessions with containers
(r21). The development of Rancher is quite active
with version 0.42 being published on 20th of October
2015. Interesting new features include the integration
of Docker machine in order to create new Docker
hosts from the same UI (r03). BOSH is part of
Pivotal’s PaaS solution Cloud Foundry. It is able to
start and stop containers on multiple hosts, but seems
to be missing an overlay network component.
Joyent’s Triton is going a completely different
way and brings Docker containers to the SmartOS
operating system which is an enhanced version of
Sun’s Solaris. They managed to run Docker
containers with Linux guests inside this Unix system
with strong isolation, making VMs around containers
superfluous. On the downside, they had to rewrite the
Docker engine on SmartOS and are therefore not
100% compatible (http://bit.ly/1RzO6pr) and maybe
never will be, given the development speed of
Docker. Table 2 summarizes our findings in a formal
way.
5.12 Customer management
A missing piece that is seldom discussed as part
of the Docker ecosystem is the customer side of the
story, although it is a common requirement that a
cloud service provider needs a way to register
customers, upload users (e.g., via LDIF or SCIM) and
create bills. A customer administration center should
therefore provide a rating, charging and billing (RCB)
solution as well as a limited access to monitoring in
order to observe compliance with SLAs. JBilling is an
example of a billing solution for cloud services.
CloudKitty is an RCB solution specifically tailored
for OpenStack and integrates with Ceilometer and
Horizon, whereas Cyclops is suitable for different
cloud platforms as well as SaaS models.
6. IN DEPTH ANALYSIS
The choice of candidates for the in-depth analysis
is based on the criteria “number of supported
features” and “active development and community”.
Therefore, Clocker and CF BOSH were not analyzed
in detail, since they have not shown active
development in 2015. Rancher provides a convenient
Web interface, but lacks cluster functionality beyond
host-spanning networking. It is also under heavy
development, with a warning from the developers
that upgrades are not supported and that breaking
changes are possible before version 1.0 is reached.
The Docker driver for OpenStack would have been a
favorite solution, but it does not support the use case
with containers inside VMs. It allows only Docker
containers instead of VMs and the administrator has
to choose per host whether Docker or KVM is used.
The OpenStack Magnum project seems promising,
but is still very immature at the moment. We did not
succeed in getting it fully operational.
Both Docker, Inc. and CoreOS provide a number
of integrated tools that allow basic management of a
cluster solution. Docker, Inc. has Docker machine to
prepare hosts to serve as Docker hosts. Docker
compose can be used to describe composite
applications consisting of multiple containers,
resources and links between them. With Docker
swarm, these composite applications can be started on
multiple hosts. Libnetwork, the default
implementation for the network plugins and the
successor of Socketplane, allows for cluster-wide
communication between containers. However, the
suite lacks a central Web UI, monitoring and service
discovery features. Some of that is provided by
Docker Datacenter, which was announced literally at
the last possible minute to be included in this paper
[50].
CoreOS has fleet as a cluster scheduler, flannel as
an overlay network and etcd as a distributed service
registry. They even provide an alternative to the
Docker runtime called rkt, which was recently
published in version 1.0 and claims to be more secure
than Docker [51]. However, a real cluster solution
like Kubernetes is still missing, which CoreOS
acknowledges by offering Tectonic, a commercial-
grade package of Kubernetes, the CoreOS tools and
enterprise support.
That leaves two solutions that are both relatively
mature and already well integrated with a number of
other Docker tools: Kubernetes and Mesos. While
Kubernetes comes pretty much pre-integrated with
the CoreOS tools flannel, etcd and fleet, Mesos does
not prescribe components for service discovery,
multi-host networking and orchestration. Our
proposal for the Mesos stack is based on functional
evaluation and the idea of building on well-
established components. It cannot be denied that
forming a counterpart to the Kubernetes stack also
guided the selection. This was due to the fear that
further development of the CoreOS tools like etcd
would concentrate on playing well together with
Kubernetes and neglect the integration with other
tools like Mesos.
6.1 Kubernetes
Kubernetes not only uses its own terminology, but
also additional concepts. A Kubernetes cluster
consists of at least one master. Multi-master operation
and therefore fail-safety was introduced in version 1.0
in July 2015 (current version is 1.1.7 from January
2016). Its service called podmaster is responsible for
leader election (http://bit.ly/1LLyQWy). The Docker
hosts inside a Kubernetes cluster are called nodes
(formerly minions). Each node can run pods, which
represent an application and consist of one or multiple
Docker containers. A pod is the smallest manageable
unit. Pods are automatically assigned to nodes. This
assignment can be influenced with labels, so that
applications with high IOPS (input/output operations
per second) can be assigned to a node with SSDs for
example. Pods are started using Kubelets. A Kubelet
is an agent running on every node and can restart a
pod if it fails. If a whole node fails, Kubernetes needs
an additional replication controller to restart the pods
from the failed node on a different node. Kubernetes
accesses pods via services, which can be seen as
proxies with their own virtual IP address. They can
also be used as a load balancer, to e.g. use three pods
together as a single load-balanced application. A
service can be discovered using DNS or environment
variables. Pods communicate with each other using
flannel. Each node gets its own subnet (/24 by
default) so that it can run 254 pods at most. The IP
address of a pod never changes. The pods are defined
using JSON (JavaScript Object Notation) or YAML
("YAML Ain't Markup Language") files. Regarding
storage, Kubernetes allows the use of block devices
from Google Compute Engine instead of local disks
on the node. Since version 1.0 it additionally supports
NFS (network file system), iSCSI, GlusterFS, Ceph,
and Amazon EBS (elastic block storage).
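A minimal pod definition, expressed here as a Python dictionary serialized to JSON, might look as follows; the names, labels and image are illustrative and not taken from our project.

# Minimal illustration of a Kubernetes pod definition (v1 API), built as a
# Python dict and serialized to JSON; names and labels are examples only.
import json

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "wordpress", "labels": {"disk": "ssd"}},
    "spec": {
        "containers": [{
            "name": "wordpress",
            "image": "wordpress:4.4",
            "ports": [{"containerPort": 80}],
        }],
        # labels on nodes can be matched to influence placement
        "nodeSelector": {"disk": "ssd"},
    },
}
print(json.dumps(pod, indent=2))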
6.2 Mesos
Apache Mesos is not focused on Docker, but is a
cluster management platform that abstracts resources
from single hosts and provides them as a flexible
resource pool to so-called frameworks running on top
of Mesos. It has its roots in the big data area and plays
well together with Hadoop and Spark, but also with
Cassandra, Elasticsearch and Docker. For Docker
management, both Aurora and Marathon can be used
as frameworks for scheduling Docker-based
workloads. However, Aurora introduces a
dependency on Python, which renders most of the
official Docker images useless. Therefore, Marathon
was the obvious choice here. Mesos uses ZooKeeper
for leader election among the master nodes. Compute
nodes are called slaves in Mesos. The framework
(Marathon in our case) is responsible for placing
Docker containers on slaves. Mesos offers free
resources from the slaves to the framework.
Communication between the Mesos master and the
framework uses Google's Protocol Buffers.
Mesos uses a bin-packing strategy to utilize slaves
which already have workloads as completely as
possible before placing workloads on a completely
free slave. The placement can be influenced with
constraints that allow more sophisticated strategies
than Kubernetes’ labels. The constraints unique (e.g.,
only once per slave or data center), group_by (e.g.,
equally distributed over all racks), cluster (e.g. only
on slaves with SSDs) as well as like and unlike are
available. The latter two evaluate regular expressions
for placement.
Mesos also allows the use of different
frameworks that jointly work on a single resource
pool. The workloads are started by an executor
service running on each slave. Marathon uses JSON
files to describe applications. Such a file contains all
Docker-specific parameters except the name of the
container,
which is auto-generated by Mesos. Composite
applications can be defined as groups of applications.
These groups can contain dependencies, which
Marathon considers when starting the containers.
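A minimal Marathon application definition for a Docker workload, again expressed as a Python dictionary serialized to JSON, might look as follows; all values are illustrative examples, not our production configuration.

# Illustrative Marathon application definition for a Docker container,
# including a placement constraint; values are examples only.
import json

app = {
    "id": "/tenant1/wordpress",
    "cpus": 0.5,
    "mem": 512,
    "instances": 1,
    "container": {
        "type": "DOCKER",
        "docker": {
            "image": "wordpress:4.4",
            "network": "BRIDGE",
            "portMappings": [{"containerPort": 80, "hostPort": 0}],
        },
    },
    # place at most one instance per slave
    "constraints": [["hostname", "UNIQUE"]],
}
print(json.dumps(app, indent=2))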
Service discovery is not solved in Marathon itself.
Two solutions are recommended, which both have
drawbacks. Marathon can update an HAProxy instance
once a minute or use Mesos DNS for service
discovery. Both solutions lead to a considerable delay
in service availability.
Marathon also supports a basic form of
rolling updates. If a container image is updated, a
configurable share of the old container instances
(0-100%) is stopped before new instances with the
updated container are launched. Then a health check
is performed. If the new instances are running well,
the remaining instances are stopped and replaced step
by step. A drawback of the current Marathon version
is that errors during deployment, like a missing
container image or a container that terminates with
errors during startup, do not lead to proper error
reports in Marathon and Mesos. Therefore, there is
also no automatic error handling or notification. The
administrator has to manually recover the system
from the failure.
6.3 Discussion
Although Kubernetes has a few more pre-configured
pieces directly built-in, none of the solutions fulfill all
the requirements. We chose Mesos as a basis for our
own integration developments since it seems more
open to cooperation with different tools than
Kubernetes and also less commercially oriented than
CoreOS with Tectonic, although it is sad to notice that
the extended version of Mesos, the Data Center
Operating System, is only available for free on
Amazon AWS but not for on-premise installations.
The additional components needed were chosen
based on interoperability with Docker's latest plug-in
model, maturity and feature richness. Figure 2
provides an overview including the control flow.
For service discovery, the choice was between
etcd and Consul. Zookeeper is already running for
master election in the Mesos cluster, but only on the
master nodes and we didn’t want to mix things up.
Consul has advantages by providing a dedicated
service catalog while etcd only has a schema-less
key-value store. Registrator is used to register
services in Consul together with health checks. It
listens to events from the Docker daemon and updates
entries in Consul upon container startup and
shutdown.
For networking, we chose Weave since it was one
of the first to support the network plugin model of
Docker and did also work well in our tests. We used
the PingPong benchmark of Kratzke [52] to test the
candidates and Weave worked best when having
enough processing power (multiple CPUs). Project
Calico looked very promising from an architectural
point of view, but in our test installation it failed to
deliver larger RESTful messages (> 1 kB) in a stable
manner, without any error messages in the logs.
Many packets got lost without any notice. MidoNet is
open source on the one hand, but lacks documentation
to run it properly on an existing OpenStack cluster. It
seems very commercially oriented with an open
source offer only to attract attention. Flannel has
dependencies on etcd and therefore was not an option.
Weave is used in the “one subnet per host” mode,
since the advanced modes are not yet supported in the
stable version. This led to the need for a reverse proxy
to forward requests to the container in case it was
migrated to a different host. Nginx was used here due
to better experiences of the project team with it than
with HAProxy. No formal evaluation of performance
was made, since recent benchmarks showed HAProxy
and Nginx to be pretty much on par [53]. Starting with
version 1.9, Nginx offers TCP load balancing in
addition to HTTP load balancing, which was
previously an advantage of HAProxy and only
available in the commercial version of Nginx.
Therefore, both seem pretty much on par. Nginx is
also used for
routing external requests to internal services of the
front-end systems.
We didn’t find a storage solution that really met
our needs. While Flocker is the most mature solution,
it brings its own small ecosystem of dependencies
that didn’t work well with other choices. EMC’s
REX-Ray reminded us of MidoNet since it is also
poorly documented and very commercially oriented
so we didn’t manage to run it in an open source
version. Rancher’s Convoy worked as described, but
requires two copy operations: one to backup the
container volume to a central object store and one to
restore it on the new host. That leads to unacceptably
long downtimes. The Ceph driver from Yp
engineering would suit our purpose best, since it is a
small tool without larger dependencies and could be
easily integrated. It allows direct usage of block
storage from Ceph as a Docker volume, so in case of
a container migration it can be unmounted from the
old container and remounted on the new container on
another host without any copy operation in between.
In our scenario, however, this would still mean a
considerable downtime of one to two minutes, since
starting Nuxeo or Liferay in a container takes that
long, although it is frequently said that containers
start in a few seconds [54, 55]. This is only true for the
container itself, but not the application running inside.
A suitable solution would be to use snapshots of a
volume and mount the snapshot on the new container,
while allowing read/write access to the layer above
the snapshot on the old container. Once the new
container is online, the old container could be shut
down and the volume could be remounted with the
updated data. However, we have not managed to
implement such a solution up to now.
Regarding monitoring, cAdvisor is used for
collecting information about container resource
usage. This information is stored in a Prometheus
database afterwards and can be visualized using a
Grafana dashboard. There is no real alternative for
cAdvisor. Prometheus was preferred over
InfluxDB for storing this data, since it claims to store
data more efficiently. It stores a name and additional
labels for every metric only once, whereas InfluxDB
stores them redundantly for every timestamp. Both
solutions are supported by cAdvisor as an export
target. Prometheus also supports alerts (r08).
Grafana was chosen over Kibana for visualization
since it seems more open towards different purposes,
whereas Kibana is mainly tailored to the ELK stack.
Table 3 provides an overview of the different
components in the Kubernetes and Mesos stacks.
Differences in fulfilled requirements compared to
Table 2 result from additional integration code in
Kubernetes or from our own solution.
Figure 2: Overview of cluster management components
Table 3: Comparison of solution stacks

Category            Stack 1 (Kubernetes)    Stack 2 (Mesos)           Reqs.
Orchestration       Kubernetes              Marathon                  r04-r06
Scheduler           Fleet                   Mesos                     r10
Master election     Podmaster               Zookeeper                 -
Service discovery   Etcd                    Consul, Registrator       r14; r11, r13
SDN                 Flannel                 Weave                     r15, r16
Storage             Kubernetes              Yp Ceph                   r12
Load balancer       Kube-proxy              Nginx                     r17, r18
Monitoring          cAdvisor, Heapster      cAdvisor, Prometheus,     r07 (r09)
                    (fluentd)               Grafana
Management          Dashboard               Dashboard                 r01-r03
6.4 Evaluation
To highlight the deficiencies of an out-of-the-box
solution with the selected components, we performed
a number of tests.
1. Create a new virtual machine
2. Migrate a container to a new host
3. Identify a container with high CPU load
4. Compare resource usage for a single
application container across customers
5. Vertical scaling of an application based on
resource usage
We used a simple application instead of our
composite application in order to make the tests
simpler and therefore easier to reproduce. The
application is a WordPress blog which depends on a
MySQL database.
ad 1) Test: the VM should automatically contain
all necessary components and join the Mesos cluster
as a slave. Procedure: an OpenStack Heat template or
a configuration management system can be used.
Result: ok. Discussion: the solution is external to the
Docker management suite and not integrated. It is still
an acceptable solution.
ad 2) Test: the container should be stopped and
restarted on a host with more resources. Procedure:
Increase the resource requirements of the application
in Marathon. Result: a new deployment is created, but
it is not carried out. No error is visible. Discussion: a
bug in Marathon prevents it from shutting down the
existing instance. The minimumHealthCapacity
setting is one by default. If set to zero, the shutdown
works
and the container is restarted on a new host. However,
it also means that Marathon does not restart the
application if it fails.
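The workaround can be applied through Marathon's REST API, roughly as in the following sketch; the Marathon URL and application id are assumptions for illustration.

# Sketch of the workaround via Marathon's REST API: set minimumHealthCapacity
# to zero and raise the resource requirements so the app is redeployed.
# The Marathon URL and application id are assumptions for illustration.
import requests

MARATHON = "http://marathon.local:8080"
APP_ID = "/tenant1/wordpress"

requests.put(f"{MARATHON}/v2/apps{APP_ID}", json={
    "cpus": 1.0,
    "mem": 1024,
    "upgradeStrategy": {"minimumHealthCapacity": 0},
})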
ad 3) Test: the administrator should be able to
easily identify a container with high CPU load.
Procedure: using the Grafana dashboard, look at the
CPU graph of the containers. Result: identifying the
container is easy, but finding out which application it
is running and to what customer it belongs is hard.
Discussion: Mesos assigns the name of the Docker
container automatically using the schema mesos<id>.
The administrator has to log on to the host and use the
Docker commands to find out which application is
running.
ad 4) Test: a container should be identifiable in
the monitoring solution using filters like application
name and customer name. Procedure: using Grafana,
the administrator is filtering the container list. Result:
not possible. Discussion: cAdvisor doesn’t deliver
any tags which could be used for filtering.
ad 5) Test: an additional application instance
should be started and load balanced as soon as a high
resource usage of the existing instance is detected. If
resource usage is low for some time, the additional
instance should be shut down. Procedure: generate
high load using Apache jMeter. Result: no auto
scaling possible. Discussion: Marathon does not have
that feature. A respective feature request was declined
with the remark that it would be very customer-
specific [56].
6.5 Own enhancements
To fix these issues and create a better integrated
cluster management solution for Docker, author
Florian Pfitzer created a number of small applications
and extensions to some of the components used.
Docker controller: syncs container metadata from
the Docker daemon to the Consul service registry.
This fixes the problem that the information in Docker
and Consul can run out of sync when changes happen
on the Docker host while Registrator is not running.
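The basic idea of such a synchronisation can be outlined as follows; this is a simplified illustration, not the actual implementation of the Docker controller.

# Illustrative outline of syncing running containers into Consul; a
# simplified sketch, not the actual Docker controller implementation.
import docker
import requests

CONSUL = "http://localhost:8500"
client = docker.from_env()

def sync_once():
    for container in client.containers.list():
        ip = container.attrs["NetworkSettings"]["IPAddress"]
        requests.put(f"{CONSUL}/v1/agent/service/register", json={
            "Name": container.name,
            "ID": container.id,
            "Address": ip,
        })

sync_once()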
Marathon scaler: gets statistics for a Marathon
application from Prometheus and scales it when
needed. It can either create additional application
instances with the same resource requirements, or
increase the resource requirements of the app and
restart it on a host with enough free resources. The
blueprint requirement was solved by storing a
Marathon JSON file in Consul and pushing it into
Marathon with the scaler application
(http://bit.ly/1Q8HTiO).
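The core loop of such a scaler could look roughly like the following sketch, assuming that cAdvisor metrics are available in Prometheus; the metric name, label, thresholds and URLs are illustrative assumptions, not the actual Marathon scaler code.

# Rough sketch of a Prometheus-driven Marathon scaler; the metric name,
# label, thresholds and URLs are assumptions, not the actual scaler code.
import requests

PROMETHEUS = "http://prometheus.local:9090"
MARATHON = "http://marathon.local:8080"
APP_ID = "/tenant1/wordpress"

def avg_cpu(app_id):
    # Average CPU usage rate over all containers of the application.
    query = f'avg(rate(container_cpu_usage_seconds_total{{marathon_app="{app_id}"}}[5m]))'
    result = requests.get(f"{PROMETHEUS}/api/v1/query",
                          params={"query": query}).json()
    return float(result["data"]["result"][0]["value"][1])

def scale_to(app_id, instances):
    requests.put(f"{MARATHON}/v2/apps{app_id}", json={"instances": instances})

if avg_cpu(APP_ID) > 0.8:       # high load: add an instance
    scale_to(APP_ID, 2)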
cAdvisor Pull Request #780: Export image name
and environment variables as Prometheus labels.
Since Marathon stores the name of the Docker image
that is run in the container in an environment variable,
exporting it to Prometheus allows filtering for the
image name. The same is true for host and Marathon
application name.
Prometheus Pull Request #812: Use Consul
ServiceAddress instead of Address when set. This
fixes an issue when Registrator is run in internal mode
and pushes the IP address of the Docker container to
the ServiceAddress field instead of Address. Since
version 0.15 of Prometheus this can also be done
using a mapping of ServiceAddress to Address in
Prometheus.
7. FINAL CONSIDERATIONS
Despite the fact that the Docker ecosystem is
huge, there still are requirements not fulfilled by any
of the tools (r20, r22) and some are only fulfilled by
a single tool (r08, r11). Many tools emerged quite
recently and therefore must be considered premature.
Managing customer data (r22) is maybe the most
important missing part. [57] argue that there should
be a complete supply chain for the cloud, starting with
deployment and monitoring and ending with
accounting and billing. The economic part of this
supply chain is currently not present in the Docker
ecosystem and also missing in the OpenStack project,
although the latter is about to change with CloudKitty
and Cyclops.
Updating a container (r20) can be emulated with
a couple of Docker commands replacing it, since
containers should be immutable. Still there should be
a standard way to automate this and also a solution
for possibly necessary one-time tasks that do a
database schema update or similar things associated
with the update. The rolling-update functionality of
Kubernetes and our own solution is not sufficient.
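Such an emulated update could, for example, be scripted with the Docker SDK for Python roughly as follows; the names and the volume handling are illustrative assumptions and would have to match the concrete setup.

# Sketch of emulating a container update (r20) with basic Docker operations:
# pull the new image, replace the container, and reattach its volumes.
# Names and the volume handling are illustrative assumptions.
import docker

client = docker.from_env()

def update_container(name, image):
    repo, _, tag = image.partition(":")
    client.images.pull(repo, tag=tag or "latest")
    old = client.containers.get(name)
    volumes = old.attrs["HostConfig"]["Binds"] or []   # keep data volumes
    old.stop()
    old.remove()
    client.containers.run(image, name=name, detach=True, volumes=volumes)

update_container("wordpress-tenant1", "wordpress:4.5")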
Registering a service in the registry is also a neglected requirement (r11). Some tools do it, but our impression is that it is a better idea to use IP addresses and an SDN for routing instead of relying on one of the service discovery solutions when containers are migrated to another host. Both storage and networking tools have flourished since Docker published its plugin system. Storage is handled quite well by Flocker, but it seems somewhat superfluous with the advent of Docker volume plugins. EMC is already one step further and has shown a working demo of migrating a stateful application container inside a Mesos installation. It is a pity that we did not manage to get REX-Ray running; EMC really has to work on the documentation.
Within the SDN area, Weave worked best in our tests, although we would prefer a solution that directly replaces OpenStack's SDN Open vSwitch and provides a unified solution for both OpenStack and Docker running inside VMs on OpenStack. Project Calico and MidoNet qualify for that. The latter has the same problems as REX-Ray: missing or incomplete documentation. Calico did work for us, but showed problems with delivering larger packets in the PingPong test. Maybe Cisco's engagement in the project will help here, although it is currently unclear whether they want to cooperate as announced or create a fork called Contiv, which also contains a volume driver with Ceph support. Another candidate that recently got our attention is OpenContrail, an SDN which targets replacing Open vSwitch in OpenStack and recently started supporting Kubernetes [58].
7.1 Outlook
In contrast to calls for consolidation [59], the ecosystem is still growing, both in existing tool categories and in new ones. CRIU (checkpoint/restore in userspace) allows live migration of containers, with support for Docker 1.5. Unfortunately, it currently has issues with Docker v1.8 and still requires a modified Docker and a modified Linux kernel to work. Although full support in vanilla Docker was announced for version 1.9 [60], it is still not present in 1.10.
Additionally, tools come and go, and sometimes developers throw away old solutions to build something new, as ActiveState did with CloudFoundry (CF) BOSH, which is replaced by CF Diego and CF Lattice; both support Docker, but come with some limitations [61]. Even the underlying Docker engine is challenged by multiple alternatives such as rkt from CoreOS, LXD from Canonical and Hyper, which claim to offer better isolation, more security or other benefits.
Although it currently looks like many people focus on either Kubernetes or Mesos/Marathon, there are still new competitors in this area as well. The Nomad cluster manager was announced by Hashicorp at the end of September 2015 [62]. It is specialized in microservices. Hashicorp claims that it is more generally applicable than Kubernetes, since it also supports directly running VMs and Java applications in addition to Docker containers, and architecturally simpler than Mesos with Marathon, since it is a single binary with far fewer external dependencies [63]. It uses Consul and other tools by Hashicorp and targets large clusters across multiple datacenters.
Another recent development is the microservice infrastructure called Mantl, developed at Cisco [64]. It happens to use many of the same components as our own solution, building on Mesos, Marathon and Consul. It makes different choices in other areas, replacing Nginx with HAProxy, Weave with Calico, Prometheus/Grafana with the ELK stack and cAdvisor with CollectD, so that hosts are monitored instead of single containers. They seem to use similar glue code for integrating Mesos/Marathon with Consul as well as HAProxy with Consul. This can be seen as an indication that our approach is heading in the right direction.
Hyper has recently announced the Hypernetes project, a combination of OpenStack, the Hyper engine instead of Docker, and Kubernetes. They claim to have built a true multi-tenant version of Kubernetes that is at the same time more efficient due to the elimination of the VM layer [65].
7.2 Conclusions
A Docker-based open source cloud environment to easily run composite applications as SaaS offerings would be a good basis for initiatives like the Open Cloud Alliance [66] that aim at simplifying the process of bringing applications to the cloud while preserving the freedom of choice and the openness of the offering. In our paper, we have shown that many components are needed to fulfill the requirements for such a solution, which we dubbed a runtime environment for SaaS applications (RaaS). It is similar to an IaaS environment, as we have shown with OpenStack, and includes some components from PaaS, like load balancing and logging, but also has unique features like service orchestration and discovery. Our solution is in line with the reference architecture for lightweight virtualization clusters by Kratzke [67].
Not all requirements are currently fulfilled and, despite first integration approaches, there is a need for closer cooperation within the Docker ecosystem. We envision an overarching ecosystem project that bundles the forces of Mesos proponents to jointly work on a holistic solution for the Docker management challenge. With Kubernetes and the upcoming competitors Triton, Nomad and Mantl, there would still be enough competition for a healthy market.
REFERENCES
1. Pahl, C.: Containerization and the PaaS Cloud.
IEEE Cloud Comput. 24–31 (2015).
2. Scheepers, M.J.: Virtualization and
Containerization of Application Infrastructure:
A Comparison. Presented at the 21st Twente
Student Conference on IT, Twente, The Netherlands, June 23 (2014).
3. Pahl, C., Lee, B.: Containers and clusters for
edge cloud architectures-a technology review.
(2015).
4. Ranjan, R.: The Cloud Interoperability
Challenge. Cloud Comput. IEEE. 1, 20–24 (2014).
5. Rosen, R.: Linux containers and the future cloud.
Linux J. 2014, 3 (2014).
6. Kratzke, N.: Lightweight Virtualization Cluster - How to Overcome Cloud Vendor Lock-In. J. Comput. Commun. 2, 1–7 (2014).
7. Rubens, P.: Docker No Longer the Only
Container Game in Town, http://bit.ly/1IlkI0s,
(2015).
8. Hecht, L.: How Open Source Communities
Power Docker and the Container Ecosystem,
http://bit.ly/1LSIoLW, (2015).
9. Turnbull, J.: The Docker Book: Containerization
is the new virtualization. James Turnbull (2014).
10. Lewis, J., Fowler, M.: Microservices,
http://bit.ly/1dI7ZJQ, (2014).
11. Papazoglou, M.P.: Service-oriented computing:
Concepts, characteristics and directions. In: Web
Information Systems Engineering (WISE 2003).
4th Int. Conf. on. pp. 3–12. IEEE (2003).
12. Mietzner, R., Leymann, F., Papazoglou, M.P.:
Defining composite configurable SaaS
application packages using SCA, variability
descriptors and multi-tenancy patterns. In: ICIW
2008. IEEE (2008).
13. Cockcroft, A.: State of the Art in Microservices.
Presented at the DockerCon Europe 14, Amsterdam, Netherlands, December 4 (2014).
14. Evans, E.: Domain driven design: Tackling
Complexity in the Heart of Software. Addison-
Wesley, Boston (2003).
15. Binz, T., Breitenbücher, U., Kopp, O., Leymann,
F.: TOSCA: Portable Automated Deployment
and Management of Cloud Applications. In:
Advanced Web Services. pp. 527–549. Springer
(2014).
16. Roßbach, P.: Docker Poster. Entwickler Mag.
Docker Spez. (2014).
17. Docker Ecosystem Mindmap,
http://bit.ly/1BjDgtW.
18. Peinl, R.: Docker ecosystem on Google Docs,
http://bit.ly/1DJ0eS4.
19. Docker, Inc.: About, http://bit.ly/1OjEBLl.
20. Crane, C.: The Container Ecosystem Project,
http://bit.ly/1RkyBTu.
21. Wallner, R.: A breakdown of layers and tools
within the container and microservices
ecosystem, http://bit.ly/21cttZN, (2015).
22. Williams, A. ed: The Docker & Container
Ecosystem. The New Stack (2015).
23. Chauhan, M.A., Babar, M.A.: Migrating service-
oriented system to cloud computing: An
experience report. In: Cloud Computing
(CLOUD) 2011, IEEE Int. Conf. on. pp. 404–411. IEEE (2011).
24. Coffey, J., White, L., Wilde, N., Simmons, S.:
Locating software features in a SOA composite
application. In: 8th European Conf. on Web
Services (ECOWS 2010). pp. 99–106. IEEE
(2010).
25. Sefraoui, O., Aissaoui, M., Eleuldj, M.:
OpenStack: toward an open-source solution for
cloud computing. Int. J. Comput. Appl. 55, 38–42 (2012).
26. Koukis, V.: Flexible storage for HPC clouds
with Archipelago and Ceph. In: 8th Workshop
on Virtualization in High-Performance Cloud
Computing. ACM (2013).
27. Chadwick, D.W., Siu, K., Lee, C., Fouillat, Y.,
Germonville, D.: Adding federated identity
management to OpenStack. J. Grid Comput. 12, 3–27 (2014).
28. Piraghaj, S.F., Dastjerdi, A.V., Calheiros, R.N.,
Buyya, R.: Efficient virtual machine sizing for
hosting containers as a service. In: IEEE World
Congress on Services. pp. 31–38. IEEE (2015).
29. Jain, R., Paul, S.: Network virtualization and
software defined networking for cloud
computing: a survey. Commun. Mag. IEEE. 51, 24–31 (2013).
30. Day, S.: Docker Registry V2 - A new model for
image distribution. In: DockerCon 2015, San Francisco (2015).
31. Hausenblas, M.: Docker Registries: the Good,
the Bad & the Ugly, http://bit.ly/1OsrnIu.
32. Mills, K., Filliben, J., Dabrowski, C.: Comparing
VM placement algorithms for on-demand
clouds. In: 3rd Int. Conf. on Cloud Computing
Technology and Science (CloudCom). pp. 91–98. IEEE (2011).
33. Verma, A., Pedrosa, L., Korupolu, M., Tune,
D.O.E., Wilkes, J.: Large-scale cluster
management at Google with Borg. In: 10th
European Conference on Computer Systems. p.
18. ACM (2015).
34. Schwarzkopf, M., Konwinski, A., Abd-El-
Malek, M., Wilkes, J.: Omega: flexible, scalable
schedulers for large compute clusters. In: 8th
ACM European Conf. on Computer Systems. pp. 351–364. ACM (2013).
35. Bucchiarone, A., Gnesi, S.: A survey on services
composition languages and models. In: Intl.
Workshop on Web Services - Modeling and
Testing (WS-MaTe 2006). p. 51 (2006).
36. Caballer, M., Blanquer, I., Moltó, G., de
Alfonso, C.: Dynamic management of virtual
infrastructures. J. Grid Comput. 13, 53–70
(2015).
37. Bachlechner, D., Siorpaes, K., Fensel, D., Toma,
I.: Web service discovery-a reality check. In: 3rd
European Semantic Web Conference (2006).
38. Vukojevic-Haupt, K., Haupt, F., Karastoyanova,
D., Leymann, F.: Service Selection for On-
demand Provisioned Services. In: 18th Intl.
Enterprise Distributed Object Computing
Conference (EDOC’14). pp. 120–127. IEEE
(2014).
39. Serebryany, I., Rhoads, M.: SmartStack: Service
Discovery in the Cloud, http://bit.ly/1bRcjo2,
(2013).
40. Swan, C.: ClusterHQ Launches Flocker to
Facilitate Robust Stateful Docker Containers,
http://bit.ly/1KH3zG3, (2014).
41. Hall, S.: Five Storage Companies That Speak To
Docker’s Next Wave, http://bit.ly/1VBHYzp,
(2015).
42. Han, S.: Getting Started With the Docker RBD
Volume Plugin, http://bit.ly/1RYVONi, (2015).
43. Lorido-Botran, T., Miguel-Alonso, J., Lozano,
J.A.: A review of auto-scaling techniques for
elastic applications in cloud environments. J.
Grid Comput. 12, 559–592 (2014).
44. Costache, C., Machidon, O., Mladin, A., Sandu,
F., Bocu, R.: Software-defined networking of
Linux containers. In: 13th RoEduNet
Conference. IEEE (2014).
45. Drutskoy, D., Keller, E., Rexford, J.: Scalable
network virtualization in software-defined
networks. IEEE Internet Comput. 17, 20–27 (2013).
46. Rimal, B.P., Jukan, A., Katsaros, D., Goeleven,
Y.: Architectural requirements for cloud
computing systems: an enterprise cloud
approach. J. Grid Comput. 9, 3–26 (2011).
47. Liu, H., Wee, S.: Web server farm in the cloud:
Performance evaluation and dynamic
architecture. In: Cloud Computing. pp. 369–380.
Springer (2009).
48. Aceto, G., Botta, A., De Donato, W., Pescapè,
A.: Cloud monitoring: A survey. Comput. Netw.
57, 2093–2115 (2013).
49. Ward, J.S., Barker, A.: Observing the clouds: a
survey and taxonomy of cloud monitoring. J.
Cloud Comput. Adv. Syst. Appl. 3, 40 (2014).
50. Yegulalp, S.: Docker Datacenter promises end-
to-end container control for enterprises,
http://bit.ly/1QYXoK1, (2016).
51. Polvi, A.: The Security-minded Container
Engine by CoreOS: rkt Hits 1.0,
http://bit.ly/1S3iyw5, (2016).
52. Kratzke, N.: About Microservices, Containers
and their Underestimated Impact on Network
Performance. CLOUD Comput. 2015. 180
(2015).
53. Kazemier, A.: BalancerBattle,
https://github.com/observing/balancerbattle.
54. Felter, W., Ferreira, A., Rajamony, R., Rubio, J.:
An updated performance comparison of virtual
machines and linux containers. Research Report
RC25482, IBM, Almaden (2014).
55. Seo, K.-T., Hwang, H.-S., Moon, I.-Y., Kwon,
O.-Y., Kim, B.-J.: Performance Comparison
Analysis of Linux Container and Virtual
Machine for Building Cloud. Adv. Sci. Technol.
Lett. Netw. Commun. 66, 105–107 (2014).
56. Kabhal: Introduce multiple scaling strategies ·
mesosphere/marathon · Issue #1477,
http://bit.ly/1QAETBl, (2015).
57. Lindner, M., Galán, F., Chapman, C., Clayman,
S., Henriksson, D., Elmroth, E.: The cloud
supply chain: A framework for information,
monitoring, accounting and billing. In: 2nd Int.
Conf. on Cloud Comp. (2010).
58. Sreelakshmi, S.: OpenContrail Kubernetes
Integration, http://bit.ly/1Q91Jun, (2015).
59. Peinl, R., Holzschuher, F.: The Docker
Ecosystem Needs Consolidation. In: 5th Intl.
Conf. on Cloud Computing and Services Science
(CLOSER 2015). pp. 535–542. SCITEPRESS,
Lisbon, Portugal (2015).
60. Kazemi, S.: CRIU Support in Docker for Native
Checkpoint and Restore. In: Linux Plumbers
Conference 2015, Seattle, Washington, USA (2015).
61. Berman, L.: Are Diego and Docker Really Good
Friends?, http://bit.ly/1WGpAW4, (2015).
62. Dadgar, A.: Nomad, http://bit.ly/1MV4bYB,
(2015).
63. Hashicorp: Nomad vs. Other Software,
http://bit.ly/1OsqLCG, (2015).
64. Owens, K.: Building Cisco’s IoE PaaS with
Mantl, http://bit.ly/1KFP9Ck, (2015).
65. Yegulalp, S.: Hypernetes unites Kubernetes,
OpenStack for multitenant container
management, http://bit.ly/1QvVrUS, (2015).
66. Crisp Research: Open Cloud Alliance -
Openness as an Imperative. Crisp Research
(2014).
67. Kratzke, N.: A lightweight virtualization cluster
reference architecture derived from open source
PaaS platforms. Open J. Mob. Comput. Cloud
Comput. 1, 17–30 (2014).