Bandwidth on Demand for Inter-Data Center
Communication
Ajay Mahimkar, Angela Chiu, Robert Doverspike, Mark D. Feuer, Peter Magill,
Emmanuil Mavrogiorgis, Jorge Pastor, Sheryl L. Woodward, Jennifer Yates
AT&T Labs Research
{mahimkar,chiu,rdd,mdfeuer,pete,emaurog,jorel,sheri,jyates}@research.att.com
ABSTRACT
Cloud service providers use replication across geographically distributed data centers to improve end-to-end performance as well as to offer high reliability under failures. Content replication often involves the transfer of huge data sets over the wide area network and demands high backbone transport capacity. In this paper, we discuss how a Globally Reconfigurable Intelligent Photonic Network (GRIPhoN) between data centers could improve operational flexibility for cloud service providers. The proposed GRIPhoN architecture is an extension of earlier work [34] and can provide a bandwidth-on-demand service ranging from low data rates (e.g., 1 Gbps) to high data rates (e.g., 10-40 Gbps). The inter-data center communication network, which is currently statically provisioned, could be dynamically configured based on demand. Today's backbone optical networks can take several weeks to provision a customer's private line connection. GRIPhoN would enable cloud operators to dynamically set up and tear down their connections (at sub-wavelength or wavelength rates) within a few minutes. GRIPhoN also offers cost-effective restoration capabilities at wavelength rates and automated bridge-and-roll of private line connections to minimize the impact of planned maintenance activities.
Categories and Subject Descriptors
C.2.1 [Computer-Communication Networks]: Network Architecture and Design - Network communications
General Terms
Design, Performance, Reliability
Keywords
Inter-data center communication, ROADM, OTN
The views expressed are those of the authors and do not reflect the official
policy or position of the Department of Defense or the U.S. Government
and are classified under distribution statement A (Approved for Public
Release, Distribution Unlimited). Permission to make digital or hard copies
of all or part of this work for personal or classroom use is granted without
fee provided that copies are not made or distributed for profit or commercial
advantage and that copies bear this notice and the full citation on the first
page. To copy otherwise, to republish, to post on servers or to redistribute
to lists, requires prior specific permission and/or a fee.
HotNets '11, November 14-15, 2011, Cambridge, MA, USA.
Copyright 2011 ACM 978-1-4503-1059-8/11/11 ...$10.00.
1. INTRODUCTION
In the past few years, we have seen the rapid growth of cloud service offerings from companies such as Amazon [4], IBM [21], Yahoo [32], Apple [2], Microsoft [5], Google [15] and Facebook [10]. These cloud service providers (CSPs) use multiple geographically distributed data centers to improve end-to-end performance as well as to offer high availability under failures. Massive amounts of content are being collected by the data centers, and the CSPs often replicate this content on a regular basis across multiple data centers. Inter-data center replication and redundancy impose high bandwidth requirements on the inter-data center wide area network.
Traditionally, a CSP leases or owns a dedicated line between its data centers. Greenberg et al. [16] report that wide area transport is expensive and costs more than the internal network of a data center. This is also why some CSPs do not operate multiple geographically distributed data centers [20]. The peak traffic volumes between data centers are dominated by background, non-interactive, bulk data transfers (as also observed by Chen et al. [6]). The CSP runs backup and replication applications to transfer bulk data between its data centers. The scale of this data can range from several terabytes (e.g., emerging scientific and industrial applications) to petabytes (e.g., Google's Distributed Peta-Scale Data Transfer [36]). A recent survey conducted by Forrester, Inc. [14] further highlights that a majority of CSPs perform bulk data transfers among three or more data centers, and projects that inter-data-center transport requirements will double or triple in the next two to four years.
There is a great deal of research literature on achieving full bisection bandwidth within a data center with improved network performance (e.g., VL2 [17], DCell [19], BCube [18], MDCube [31], PortLand [25], c-Through [29], Helios [11], Proteus [28]). However, only a few recent studies address inter-data center bulk transfers [1, 6, 8, 22, 23]. Chen et al. [6] characterize inter-data center traffic using Yahoo! datasets. NetStitcher [22] takes the interesting approach of stitching together unutilized bandwidth across different data centers by using multi-path and multi-hop store-and-forward scheduling, effectively achieving inter-data center bulk transfers with existing capacity.
Our Approach. In this paper, we take a completely different approach to achieving dynamic inter-data center communication. We propose GRIPhoN - a Globally Reconfigurable Intelligent Photonic Network that would offer a Bandwidth on Demand (BoD) service in the core network for efficient inter-data center communication. We believe we are the first to address the inter-data center capacity issue from the carrier's perspective. The motivation behind BoD comes from the variability in traffic demands for communication across data centers. Non-interactive bulk data transfers between data centers are typically performed by the cloud operators and have different patterns than interactive, end-user-driven traffic. This gives us the opportunity to explore the use of different data rates at different times - for example, a high data rate (10-40 Gbps) between data centers for non-interactive data transfers and a low rate (1-10 Gbps) for supporting interactive sessions. GRIPhoN provides a platform for offering such dynamic connectivity. The inter-data center communication network, which was previously statically provisioned, can now be viewed as adjustable. GRIPhoN offers flexibility to the CSP in dynamically adjusting the bandwidth between its geographically distributed data centers based on demand. The carrier also benefits from the intelligent re-use of the pool of resources across multiple customers.
BoD Service Vision and Today’s Reality. We now outline
the dynamic service vision of GRIPhoN and compare it to
today’s reality.
1. Dynamic configurable-rate services. The vision behind GRIPhoN is to offer dynamic multi-rate services for communication between geographically distributed data centers. Having a choice among multiple data rates offers flexibility to the CSPs in dynamically selecting the right bandwidth based on demand. Today, carriers offer BoD private-line services only in limited architectures and usually at rates of 622 Mbps or less.
2. Rapid establishment of new connections. Dynamic bandwidth adjustments require rapid connection provisioning. This is achievable today at low data rates by re-configuring electronic circuit switches [9]. However, provisioning times for connections which require a full wavelength in the backbone are orders of magnitude slower than needed. This is primarily because there has been no call for faster times, and hence neither the Element Management Systems (EMS) nor the optical hardware is optimized for speed.
3. Reduced outage time. Following any network failure, it is important to quickly restore the service. For low-data-rate services, restoration times are on the order of milliseconds. However, no restoration is usually available today for full-wavelength services. There are two alternatives for private-line customers: either buy expensive 1+1 protection, where traffic is re-routed to a backup if the primary connection fails, or wait for the carrier to manually restore connections, which means long outage times (typically 4 to 12 hours).
4. Minimal impact during maintenance. Maintenance is a significant aspect of managing and operating large networks. Carriers would like to ensure minimal or no impact of maintenance on performance. Since wavelength connection management is handled manually today, there is a non-negligible impact on service.
GRIPhoN Contributions. GRIPhoN aims at bridging the gap between the dynamic service vision and today's reality, as shown in Table 1. By offering dynamic configurable-rate services, GRIPhoN enables CSPs to actively adjust their inter-data center connections. Such a BoD service is not new to large carriers, at least for lower data rates; lower-data-rate services are already offered, for example the Optical Mesh Service (OMS) [9, 26, 27]. GRIPhoN scales these concepts to very high data rates and offers the first BoD service demonstration that can select data rates from sub-wavelength connections (e.g., 1 Gbps) to full wavelength connections (e.g., 10-40 Gbps). The sub-wavelength connections are provided by OTN (Optical Transport Network) switches in the network's OTN layer. Full wavelength connections are established in the photonic layer by using colorless and non-directional reconfigurable optical add/drop multiplexers (ROADMs). A CSP leases dedicated optical access to the GRIPhoN core network at multiple data center locations and dynamically sets up optical connections between them. GRIPhoN enables dynamic and rapid connection management capabilities with the automated control of fiber cross-connects (FXC) to route signals to either the photonic or OTN layer. This enables a CSP to utilize wavelength and/or sub-wavelength resources.

GRIPhoN also offers cost-effective restoration capabilities at wavelength rates via automatic fault identification and dynamic re-establishment of connections. This reinstates customer connections far faster than repair of the underlying fault. Though not as fast as 1+1 protection, this approach is also far less expensive. Finally, by using automated bridge-and-roll [34] of private line connections, GRIPhoN minimizes the impact of planned maintenance.
Comparison to prior work on dynamic optical networks. In contrast to CANARIE [3], CHEETAH [35], DRAGON [24], DWDM-RAM [13] and Lambda Grid [33], which are initiatives of research and education networks that serve universities and national laboratories, GRIPhoN is intended for the backbone network of a major carrier. Providing dynamic wavelength services on an inter-city commercial network presents challenges not only in the eventual scale that must be managed, but also in the transition phase from today's static network. Efficient network implementation across multiple layers and multiple customers, cost-effective service restoration, and conformance with commercial operational practices have received less attention in the research and education initiatives, whereas these issues are the primary focus in GRIPhoN.
2. BANDWIDTH ON DEMAND SERVICE
In this section, we first present a simplified view of the services and network layers offered by the carrier. We then describe the design of the BoD service offered by GRIPhoN that can be utilized by cloud service providers to dynamically adjust the bandwidth available between their data centers.
BoD service vision | Today's reality | GRIPhoN proposal
Dynamic configurable-rate | Maximum rate well below full wavelength rate | Rate configurable over wide range; integrated services using OTN, FXC and wavelength switching
Rapid establishment of new connections | Takes several weeks for highest data rates | Automated Fiber Cross-connect (FXC) and ROADMs enable full wavelength connections in minutes
Reduced outage times | None (unless 1+1) for full wavelength rates | Automated outage detection and dynamic re-provisioning of impacted connections
Minimal impact during maintenance | Non-negligible impact on service | Automated bridge-and-roll [34]

Table 1: Bandwidth on Demand (BoD) service vision, today's reality and GRIPhoN proposal.
Figure 1: Carrier's view of current services & network layers.
2.1 Carrier's view of services & network layers
Fig. 1 provides a simplified representation of how today's technology layers are interrelated and how the service categories map to them. Most large carriers' current core transport networks consist of a Wideband Digital Cross-connect System (W-DCS) Layer, SONET Layer, DWDM Layer, and Fiber Layer.
Consider the network layers from the bottom up. At the very base is the fiber-optic layer. This layer consists of fiber-optic cables connecting the various nodes in the network. Laying these cables between cities is a huge capital investment, and hence this layer is very static. Built upon this fiber base is the transport layer. Dense wavelength-division multiplexing (DWDM) is utilized in the core network because of its huge capacity compared with all other technologies. A modern DWDM system utilizes anywhere from 40 to 100 wavelengths, each carrying signals at rates ranging from 10 to 100 Gbps. Sub-wavelength channels at 2.5 Gbps or 10 Gbps can be provided via muxponders. These wavelength connections are bidirectional and multiplexed together onto a fiber pair. Hence the transport layer is known as the DWDM layer. DWDM systems were initially point-to-point systems, with all traffic terminating at the two end nodes. If some connections were destined to travel further down the line, they would be electronically regenerated before transmission on the next leg of their path. In recent years, ROADM technologies for DWDM transport networks have been deployed due to their capital and operational savings. A ROADM network typically includes a set of multi-degree ROADM nodes connected via fibers to form a mesh topology. Traffic may be added or dropped, regenerated, or expressed through at each ROADM. Optical transponders (OTs) are connected to the ports of the ROADM to transmit and receive line-side optical signals and convert them to standard client-side optical signals. Optical-to-Electrical-to-Optical (OEO) regeneration is needed when the distance between terminating nodes exceeds the limit for adequate signal quality, known as the optical reach. When that happens, optical regenerators (REGENs) are used at one or more intermediate nodes. ROADMs are now being deployed with add/drop ports which are both "colorless" (so that any OT can be tuned to provide a signal at any wavelength) and "non-directional" (any OT's signal can be used on any of the ROADM's inter-node fiber pairs; this is also referred to as "steerable" or "directionless").
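To make the colorless and non-directional properties concrete, the sketch below models a multi-degree ROADM node in Python. It is an illustrative abstraction under our own naming (RoadmNode, Transponder and the example wavelength label are not from this paper or any vendor API): any add/drop transponder can be tuned to any wavelength and steered onto any of the node's inter-node fiber pairs.

```python
# Minimal sketch of a multi-degree ROADM with colorless, non-directional
# add/drop ports. Class and attribute names are illustrative assumptions.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Transponder:
    name: str
    wavelength: Optional[str] = None   # colorless: tunable to any channel
    degree: Optional[str] = None       # non-directional: steerable to any degree

@dataclass
class RoadmNode:
    name: str
    degrees: List[str]                                  # inter-node fiber pairs
    add_drop: List[Transponder] = field(default_factory=list)

    def provision(self, ot: Transponder, wavelength: str, degree: str) -> None:
        """Tune an add/drop transponder to a wavelength and steer it to a degree."""
        if degree not in self.degrees:
            raise ValueError(f"{self.name} has no degree {degree}")
        ot.wavelength, ot.degree = wavelength, degree
        self.add_drop.append(ot)

# Example: a 3-degree ROADM steering one OT onto the fiber pair toward node II.
node_i = RoadmNode("ROADM-I", degrees=["to-II", "to-III", "to-IV"])
node_i.provision(Transponder("OT-1"), wavelength="ch-34", degree="to-II")
print(node_i.add_drop[0])
```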
The SONET (Synchronous Optical Network) layer rides on top of the DWDM layer, with Broadband DCSs that cross-connect at the STS-1 rate as its most common network element. The Add-Drop Multiplexer (ADM) is a special case of a DCS with 2 degrees that is used to form SONET rings. The SONET layer provides connections at rates from STS-1 (52 Mbps) to OC-192 (10 Gbps). It carries both TDM and data traffic and provides an automatic protection/restoration mechanism to switch traffic from working circuits to backup circuits in less than a second. The Wideband Digital Cross-connect System (W-DCS) layer is above the SONET layer and consists of DCS-3/1s and other DCSs that cross-connect at rates greater than DS0 but below DS3. It provides n x DS1 (1.5 Mbps) TDM connections. Ethernet Virtual Circuits (EVCs) provide virtual links with guaranteed bandwidth. Ethernet private lines are links between customer routers or Ethernet switches, usually consisting of Gigabit Ethernet interfaces at the customer ends, encapsulated and rate-limited into pipes consisting of virtually concatenated SONET STS-1s. Circuit-based BoD services use virtual concatenation of channels fed from a dedicated access or metro pipe to the customer. With current services and network layers, the carrier offers BoD only at the SONET layer, not at the DWDM layer. With the GRIPhoN vision using future services & network layers, BoD at high data rates would be offered at the OTN layer as well as the DWDM layer.
Figure 2: Carrier's view of future services & network layers.

Fig. 2 provides a view of such future services and network layers from the carrier's perspective. One of the key assumptions of this service evolution model is that the transport of Guaranteed Bandwidth connections can be categorized by bandwidth: below 1 Gbps is transported via the IP layer as EVCs; 1 Gbps up to the core wavelength rate is transported by the sub-wavelength layer as Ethernet Private Lines, most likely encapsulated into concatenated TDM pipes; high-rate private-line services (TDM connections at wavelength rate) are carried directly over the DWDM (Dense Wavelength Division Multiplexing) layer. The OTN layer is introduced as the sub-wavelength layer that provides higher switching capacity and better scalability than today's SONET/Broadband DCS layer. The OTN switches cross-connect at the ODU0 rate (1.25 Gbps) and can support both TDM and Ethernet packet-based client signals. Using ITU-standardized digitally framed signals with digital overhead, the OTN layer supports connection management as well as Forward Error Correction for enhanced system performance. Compared to using muxponders in the DWDM layer to provide sub-wavelength connections, the OTN layer with its switching capability can achieve more efficient packing of wavelengths in the transport network. Moreover, it can provide automatic sub-second shared-mesh restoration similar to today's SONET layer.
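As a rough illustration of this rate-based layering, the sketch below maps a requested guaranteed-bandwidth rate to the layer that would carry it. The 40 Gbps wavelength rate and the rounding of sub-wavelength requests up to whole ODU0 slots are our own illustrative assumptions based on the figures quoted in the text, not a carrier provisioning rule.

```python
# Illustrative mapping of a guaranteed-bandwidth request to a carrying layer,
# following the categorization described in the text. Thresholds are assumed.
import math

WAVELENGTH_RATE_GBPS = 40.0   # core wavelength rate assumed in this sketch
ODU0_GBPS = 1.25              # OTN switching granularity (ODU0)

def select_layer(rate_gbps: float) -> str:
    if rate_gbps < 1.0:
        return "IP layer (Ethernet Virtual Circuit)"
    if rate_gbps < WAVELENGTH_RATE_GBPS:
        slots = math.ceil(rate_gbps / ODU0_GBPS)
        return f"OTN sub-wavelength layer ({slots} x ODU0)"
    return "DWDM layer (full-wavelength private line)"

for rate in (0.5, 1.0, 10.0, 40.0):
    print(f"{rate:5.1f} Gbps -> {select_layer(rate)}")
```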
2.2 GRIPhoN Design
Fig. 3 shows an overview of the GRIPhoN target service architecture that enables BoD service for dynamic inter-data center communication. The data center premises connect to the carrier's network via a fixed, dedicated access pipe. In order to allow for better grooming of the provided bandwidth, we partition the carrier's network into two separate layers: (i) the Optical Transport Network (OTN) layer, which provides low data rate connections (e.g., 1 Gbps), and (ii) the Dense Wavelength Division Multiplexing (DWDM) layer, which provides high data rate connections (e.g., 40 Gbps). This allows a CSP to adjust the bandwidth according to its exact needs. For example, it can use lower-speed circuits to augment a high-speed circuit, combining two 1G OTN circuits and one 10G DWDM wavelength to achieve a total bandwidth of 12G instead of consuming a second 10G wavelength.
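The 12G example above can be read as a simple grooming calculation; the sketch below reproduces it under the same circuit rates used in the example (10G DWDM wavelengths, 1G OTN circuits), rounding only the sub-wavelength remainder up rather than the whole demand.

```python
# Sketch of the grooming example from the text: cover a demand with full
# wavelengths plus 1G OTN circuits for the remainder, instead of rounding the
# whole demand up to the next 10G wavelength. Rates follow the text's example.
WAVELENGTH_GBPS = 10
OTN_CIRCUIT_GBPS = 1

def decompose(demand_gbps: int) -> dict:
    wavelengths = demand_gbps // WAVELENGTH_GBPS
    remainder = demand_gbps - wavelengths * WAVELENGTH_GBPS
    otn_circuits = -(-remainder // OTN_CIRCUIT_GBPS)   # ceiling division
    return {"10G DWDM wavelengths": wavelengths, "1G OTN circuits": otn_circuits}

# A 12 Gbps demand: one 10G wavelength plus two 1G OTN circuits.
print(decompose(12))   # {'10G DWDM wavelengths': 1, '1G OTN circuits': 2}
```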
Figure 3: BoD for inter-data center communication using GRIPhoN.

Reconfigurable Fiber Cross-Connect (FXC). In order to efficiently provide BoD services at wavelength rates, it is necessary to have a switch on the client side of the OT [12, 30]. A client-side switch allows for dynamic sharing of transponders, which is useful in keeping costs low. While this switch could be electronic, the low cost, small footprint, and low power consumption of a fiber cross-connect (FXC) make it an attractive technology. Unfortunately, an FXC is incapable of grooming traffic. Therefore, to provide BoD services at rates below the data rate of a single wavelength, electronic switching is necessary. This is provided by the OTN switch, a part of the OTN layer of the GRIPhoN network. This layer rides on top of the DWDM layer. When a connection is requested, the FXC, under the control of the GRIPhoN controller, directs the signal either to an OT, to be carried directly on the DWDM layer, or to a port on the OTN switch, where it can be combined with other OTN signals before transmission over the DWDM layer.
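A minimal, self-contained sketch of this routing decision is shown below. The FXC model, port names and the 10 Gbps wavelength-rate threshold are our own assumptions for illustration; they are not a vendor API or the GRIPhoN controller's actual interface.

```python
# Sketch of the FXC decision: cross-connect a client signal either to an
# optical transponder port (carried directly on the DWDM layer) or to an OTN
# switch port (groomed with other signals first). Names/rates are assumptions.
class FiberCrossConnect:
    def __init__(self, ot_ports, otn_ports):
        self.ot_ports = list(ot_ports)       # free transponder-facing ports
        self.otn_ports = list(otn_ports)     # free OTN-switch-facing ports
        self.connections = {}                # client port -> network-side port

    def cross_connect(self, client_port, rate_gbps, wavelength_rate_gbps=10.0):
        pool = self.ot_ports if rate_gbps >= wavelength_rate_gbps else self.otn_ports
        if not pool:
            raise RuntimeError("no free port for the requested service")
        network_port = pool.pop(0)
        self.connections[client_port] = network_port
        return network_port

fxc = FiberCrossConnect(ot_ports=["OT-1", "OT-2"], otn_ports=["OTN-1", "OTN-2"])
print(fxc.cross_connect("client-A", rate_gbps=10))   # -> OT-1 (full wavelength)
print(fxc.cross_connect("client-B", rate_gbps=1))    # -> OTN-1 (sub-wavelength)
```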
GRIPhoN controller. Connection establishment and release based on requests from the CSP are handled by the GRIPhoN controller. The GRIPhoN controller communicates with the network elements via the appropriate vendor-supplied EMS. The controller is responsible for keeping track of the available network resources in its database; communicating with the network elements (FXC controllers, OTN switch EMS, ROADM EMS and NTE controllers) in order to create or tear down the connections ordered by the CSPs; capacity and resource management; inventory database management; and failure detection, localization and automated restoration. To minimize service interruption during network re-configurations due to restoration, the GRIPhoN controller executes a bridge-and-roll operation [7, 34] that first creates a full new wavelength path (the "bridge") while the original connection is still in use and then quickly "rolls" the traffic onto the new path when ready. The bridge-and-roll results in an almost hitless movement of traffic prior to scheduled maintenance or during reversion following a failure restoration (moving traffic from backup paths to repaired primaries). One constraint of the bridge-and-roll operation, however, is that the new wavelength path has to be resource-disjoint from the old path.
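The sketch below captures the bridge-and-roll sequence in a few lines of Python. The controller class and its bookkeeping are hypothetical stand-ins (the real controller drives FXCs, ROADMs and OTN switches through their EMSs); the point is the ordering of steps and the resource-disjointness check described above.

```python
# Sketch of bridge-and-roll: fully provision a new, resource-disjoint path (the
# "bridge"), switch traffic onto it (the "roll"), then release the old path.
# The controller below only records state; names are illustrative assumptions.
class BridgeAndRollController:
    def __init__(self):
        self.paths = {}                      # connection id -> list of links

    def provision(self, conn_id, path):
        self.paths[conn_id] = list(path)

    def bridge_and_roll(self, conn_id, new_path):
        old_path = self.paths[conn_id]
        # Constraint from the text: the new path must not share resources
        # (links) with the path it replaces.
        if set(old_path) & set(new_path):
            raise ValueError("new path must be resource-disjoint from old path")
        self.provision(conn_id + "#bridge", new_path)   # bridge: build new path
        self.paths[conn_id] = new_path                  # roll: move the traffic
        del self.paths[conn_id + "#bridge"]             # bridge becomes primary
        return old_path                                 # old resources to release

ctrl = BridgeAndRollController()
ctrl.provision("DC1-DC2", ["I-II", "II-III"])
released = ctrl.bridge_and_roll("DC1-DC2", ["I-IV", "IV-III"])
print(ctrl.paths["DC1-DC2"], "released:", released)
```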
Customer Graphical User Interface (GUI). Each customer has a graphical user interface to GRIPhoN to visualize and manage its connections. The customer only sees the channelized or un-channelized interfaces (for sub-wavelength or wavelength connections, respectively) of the NTE on its premises. The GUI provides capabilities for connection management (setting up or tearing down connections on demand) and simple fault management from the customer viewpoint, such as showing the status of connections affected by outages, localizing the fault, and indicating when restoration is performed. The complexity of the GRIPhoN network (access pipes, carrier equipment, network layers, GRIPhoN controller) is hidden from the customer.

Figure 4: GRIPhoN Testbed.
3. TESTBED
In this section, we describe our laboratory prototype implementation of GRIPhoN and present preliminary results on wavelength connection management. Fig. 4 shows our GRIPhoN testbed with three customer premises and the core GRIPhoN network with DWDM and OTN layers. The DWDM layer consists of Reconfigurable Optical Add/Drop Multiplexers (ROADMs) to provide wavelength switching (currently at 10 Gbps, with plans to go to 40 Gbps). In our prototype, we use two 3-degree ROADMs and two 2-degree ROADMs. Wavelength-tunable optical transponders (OTs) are installed at the ROADM add/drop ports and are used to set up end-to-end wavelength connections. Client-side FXCs allow for dynamic sharing of OTs and REGENs. The OTN layer is in the process of installation. Each of the three customer premises sites that could host a data center facility includes servers, Ethernet switches, low-speed multiplexers (1 Gbps/10 Gbps), and a 10 Gbps/40 Gbps Muxponder (10/40 MXP). The servers provide video-on-demand (VoD) content across multiple facilities. The 1/10 Gbps multiplexer aggregates traffic from multiple Ethernet switches and transmits it over a high-speed (10 Gbps) channelized line. The 10/40 Gbps Muxponder emulates Network Terminating Equipment (NTE) and has four 10 Gbps ports on the client side and a 40 Gbps transmission rate on the line side (towards the carrier). The line side is the "fat pipe" shown in Fig. 3, and it emulates a metro network which brings customer traffic to the core network. Central Office terminals (COTs) would receive the customer data; in our prototype this is emulated by another 10/40 MXP.
Path length (hops)                      | 1 (I-IV) | 2 (I-III-IV) | 3 (I-II-III-IV)
Connection establishment time (seconds) | 62.48    | 65.67        | 70.94

Table 2: Dependence of wavelength connection establishment times on the path length in the ROADM layer.
We have constructed a customer GUI that has capabilities for dynamically setting up and tearing down connections at chosen rates. It shows four 10 Gbps ports at each customer premises. In this paper, we focus on DWDM layer experiments. The 10 Gbps connection is established from the customer to the Core PoP (Core Point-of-Presence) over the customer's fat pipe, controlled through the EMS of the 40 Gbps link. The wavelength connection that will be used to traverse the backbone network is set up between a pair of OTs installed at the source and destination ROADM nodes (in this case, in their respective core PoPs). The establishment of a wavelength connection takes 60 to 70 seconds. There are two contributions to this time: (i) ROADM Element Management System (EMS) configuration steps, and (ii) optical tasks, such as ROADM reconfiguration, laser tuning, power balancing and link equalization. The times associated with both components at present are not constrained by any fundamental limitations; rather, they represent a lack of current carrier requirements for speed. We are now working with equipment suppliers to further understand the setup times and ways to reduce them. The 60-70 seconds for wavelength connection establishment is orders of magnitude better than today's provisioning time in the DWDM layer. This is primarily achievable due to the automated reconfiguration of fiber cross-connects and ROADMs by the GRIPhoN controller. Tearing down a wavelength connection takes around 10 seconds. We also performed a preliminary analysis of the dependence of the connection provisioning times on the path lengths (number of hops) in the ROADM (or DWDM) layer. Table 2 summarizes the results over ten iterations. As the path length increases, the connection provisioning time increases.
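As a back-of-the-envelope reading of Table 2, the sketch below fits a line to the three measured averages; the roughly 4 s per additional ROADM hop and roughly 58 s fixed overhead are our own illustrative fit, not an analysis reported in the paper.

```python
# Least-squares line through the Table 2 averages (measured values from the
# text); the per-hop / fixed-overhead split is only an illustration.
hops = [1, 2, 3]
times = [62.48, 65.67, 70.94]        # wavelength setup time in seconds

n = len(hops)
mean_h, mean_t = sum(hops) / n, sum(times) / n
slope = sum((h - mean_h) * (t - mean_t) for h, t in zip(hops, times)) \
        / sum((h - mean_h) ** 2 for h in hops)
intercept = mean_t - slope * mean_h
print(f"~{slope:.1f} s per additional hop, ~{intercept:.1f} s fixed overhead")
```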
4. RESEARCH CHALLENGES
The BoD services offered by GRIPhoN introduce an en-
tirely new set o f research and operational challenges. An
effective, integrated network design or restoration process
across IP, OTN and DWDM layers necessitates cross-layer
management. The dynamic services, the intelligent and au-
tonomous network, and the integration of multiple network
layers together present several challenges:
Network resource planning. Ensuring adequate network resources to support anticipated demand from the CSPs is made more difficult by the existence of dynamic services. In order to support rapid connection provisioning and faster restoration, the carrier must plan ahead for where and when to deploy spare resources (especially OTs). Obviously, it would be very expensive for the carrier to provision for all possible usage scenarios. Thus, the carrier needs to forecast demand and carefully manage the pool of GRIPhoN resources. The carrier should also ensure isolation of services across different CSPs. At first glance, this resource planning may seem similar to the planning performed in providing plain old telephony service (POTS), with resources (phone circuits) statistically shared by multiple users. However, in this network the number of users is smaller and the cost of a line is far greater, making accurate planning far more critical.
Network re-grooming. One attractive application of GRIPhoN that is tolerant of the connection times demonstrated in this work is network grooming. As the GRIPhoN network grows, additional routes between nodes will be added. This will make paths that were previously unavailable more appropriate for some connections than the originally established paths. The carrier may then want to re-provision the inter-data center communication network with better paths (reducing latency and/or off-loading the original paths). The process of re-provisioning connections to achieve an improved network configuration is called re-grooming. In order to perform re-grooming with minimal impact to the CSP, the GRIPhoN bridge-and-roll can be used to migrate the wavelength connections [34].
DWDM layer management. The connection establishment times we have demonstrated are far slower than any fundamental limitations of the DWDM layer. Reducing the connection establishment time will place additional requirements on both the physical hardware and the software control used in the DWDM layer. The optical transport system must be able to turn wavelengths on and off and route them appropriately without affecting other connections. This has implications for the entire DWDM layer, from how quickly a new wavelength is turned on to the power-transient tolerance of the optical line (including both amplifiers and receivers). The latter requirement is already being addressed by carriers requiring that a cable cut in one part of a mesh network not affect traffic in another part of the network. Achieving a DWDM layer with dramatically faster end-to-end connection times in a cost-effective manner requires that the entire system's dynamics be considered.
5. SUMMARY
In this paper, we presented the design of the Globally Reconfigurable Intelligent Photonic Network (GRIPhoN) between data centers, which can provide a BoD service ranging from low data rates (e.g., 1 Gbps) to wavelength rates (e.g., 40 Gbps). GRIPhoN provides flexibility to cloud service providers to dynamically set up and tear down wavelength connections between their geographically distributed data centers when performing tasks such as content replication or non-interactive bulk data transfers.
Acknowledgement
We thank Adel Saleh, the DARPA Program Manager of the CORONET Program, for his inception of the program and for his guidance. We appreciate the support of the DARPA CORONET Program, Contract N00173-08-C-2011, and the U.S. Army RDE Contracting Center, Adelphi Contracting Division, 2800 Powder Mill Rd., Adelphi, MD, under contract W911QX-10-C00094. We thank Amin Vahdat (our shepherd), Rakesh Sinha and the HotNets anonymous reviewers for their insightful feedback. We also thank Fujitsu and Ciena for their equipment and technical support.
6. REFERENCES
[1] S. Agarwal, J. Dunagan, N. Jain, S. Saroiu, A. Wolman, and H. Bhogan. Volley: Automated data placement for geo-distributed cloud services. In NSDI, 2010.
[2] Apple iCloud. http://www.apple.com/icloud/.
[3] B. S. Arnaud, J. Wu, and B. Kalali. Customer-controlled and -managed optical networks. Journal of Lightwave Technology, 2003.
[4] Amazon Simple Storage Service. aws.amazon.com/s3/.
[5] Windows Azure. http://www.microsoft.com/windowsazure/.
[6] Y. Chen, S. Jain, V. K. Adhikari, Z.-L. Zhang, and K. Xu. A first look at inter-data center traffic characteristics via Yahoo! datasets. In IEEE INFOCOM, 2011.
[7] A. L. Chiu, G. Choudhury, G. Clapp, R. Doverspike, J. W. Gannett, J. G. Klincewicz, G. Li, R. A. Skoog, J. Strand, A. von Lehmen, and D. Xu. Network design and architectures for highly dynamic next-generation IP-over-optical long distance networks. Journal of Lightwave Technology, 2009.
[8] M. Chowdhury, M. Zaharia, J. Ma, M. I. Jordan, and I. Stoica. Managing data transfers in computer clusters with Orchestra. In ACM SIGCOMM, 2011.
[9] R. Doverspike. Practical aspects of bandwidth-on-demand in optical networks. In Panel on Emerging Networks, Service Provider Summit, OFC, 2007.
[10] Facebook statistics. www.facebook.com/press/info.php?statistics.
[11] N. Farrington, G. Porter, S. Radhakrishnan, H. H. Bazzaz, V. Subramanya, Y. Fainman, G. Papen, and A. Vahdat. Helios: A hybrid electrical/optical switch architecture for modular data centers. In ACM SIGCOMM, 2010.
[12] M. D. Feuer, D. C. Kilper, and S. L. Woodward. ROADMs and their system applications. In Optical Fiber Telecommunications VB. Academic Press, New York, 2008.
[13] S. Figueira, S. Naiksata, H. Cohen, D. Cutrell, P. Daspit, D. Gutierrez, and D. B. Hoang. DWDM-RAM: Enabling Grid services with dynamic optical networks. In IEEE International Symposium on Cluster Computing and the Grid, 2004.
[14] Forrester Research. http://info.infineta.com/l/5622/2011-01-27/Y26.
[15] Google. http://www.google.com/corporate/datacenter/index.html.
[16] A. Greenberg, J. Hamilton, D. A. Maltz, and P. Patel. The cost of a cloud: Research problems in data center networks. ACM SIGCOMM CCR, 2009.
[17] A. Greenberg, J. R. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D. A. Maltz, P. Patel, and S. Sengupta. VL2: A scalable and flexible data center network. In ACM SIGCOMM, 2009.
[18] C. Guo, G. Lu, D. Li, H. Wu, X. Zhang, Y. Shi, C. Tian, Y. Zhang, and S. Lu. BCube: A high performance, server-centric network architecture for modular data centers. In ACM SIGCOMM, 2009.
[19] C. Guo, H. Wu, K. Tan, L. Shi, Y. Zhang, and S. Lu. DCell: A scalable and fault-tolerant network structure for data centers. In ACM SIGCOMM, 2008.
[20] Perspectives - James Hamilton's blog: Inter-datacenter replication & geo-redundancy. http://perspectives.mvdirona.com/2010/05/10/InterDatacenterReplicationGeoRedundancy.aspx.
[21] IBM SmartCloud. http://www.ibm.com/cloud-computing/us/en/.
[22] N. Laoutaris, M. Sirivianos, X. Yang, and P. Rodriguez. Inter-datacenter bulk transfers with NetStitcher. In ACM SIGCOMM, 2011.
[23] N. Laoutaris, G. Smaragdakis, P. Rodriguez, and R. Sundaram. Delay tolerant bulk data transfers on the Internet. In ACM SIGMETRICS, 2009.
[24] T. Lehman, J. Sobieski, and B. Jabbari. DRAGON: A framework for service provisioning in heterogeneous grid networks. IEEE Communications Magazine, 2006.
[25] R. Niranjan Mysore, A. Pamboris, N. Farrington, N. Huang, P. Miri, S. Radhakrishnan, V. Subramanya, and A. Vahdat. PortLand: A scalable fault-tolerant layer 2 data center network fabric. In ACM SIGCOMM, 2009.
[26] K. Oikonomou and R. Sinha. Network design and cost analysis of optical VPNs. In OFC, 2006.
[27] Optical Mesh Service (OMS). http://www.business.att.com/wholesale/Service/data-networking-wholesale/long-haul-access-wholesale/optical-mesh-service-wholesale/.
[28] A. Singla, A. Singh, K. Ramachandran, L. Xu, and Y. Zhang. Proteus: A topology malleable data center network. In ACM HotNets, 2010.
[29] G. Wang, D. G. Andersen, M. Kaminsky, M. Kozuch, T. S. E. Ng, K. Papagiannaki, and M. Ryan. c-Through: Part-time optics in data centers. In ACM SIGCOMM, 2010.
[30] S. L. Woodward, M. D. Feuer, J. L. Jackel, and A. Agarwal. Massively-scaleable highly-dynamic optical node design. In OFC/NFOEC, 2010.
[31] H. Wu, G. Lu, D. Li, C. Guo, and Y. Zhang. MDCube: A high performance network structure for modular data center interconnection. In ACM CoNEXT, 2009.
[32] Yahoo! http://www.yahoo.com/.
[33] O. Yu, A. Li, Y. Cao, L. Yin, M. Liao, and H. Xu. Multi-domain lambda grid data portal for collaborative grid applications. Future Generation Computer Systems, 2006.
[34] X. J. Zhang, M. Birk, A. Chiu, R. Doverspike, M. D. Feuer, P. Magill, E. Mavrogiorgis, J. Pastor, S. L. Woodward, and J. Yates. Bridge-and-roll demonstration in GRIPhoN (Globally Reconfigurable Intelligent Photonic Network). In OFC, 2010.
[35] X. Zheng, M. Veeraraghavan, N. S. V. Rao, Q. Wu, and M. Zhu. CHEETAH: Circuit-switched high-speed end-to-end transport architecture testbed. IEEE Communications Magazine, 2005.
[36] D. Ziegler. Distributed peta-scale data transfer. http://www.cs.huji.ac.il/~dhay/IND2011.html.