IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 62, NO. 1, JANUARY 2014 235
Joint Design on DCN Placement and
Survivable Cloud Service Provision over
All-Optical Mesh Networks
Jie Xiao, Hong Wen, Bin Wu, Xiaohong Jiang, Pin-Han Ho, and Lei Zhang
Abstract—Cloud services based on data center networks
(DCNs) require a transmission infrastructure with high capacity,
low latency, low cost and high availability, which can be offered
by survivable optical networks. DCN placement is a fundamental
issue in supporting cloud services in optical networks. It concerns
not only the cost of providing cloud services, but also the service
availability against failures via proper service replicas. In this
paper, we jointly optimize DCN placement with service routing
and protection to minimize the network cost, while ensuring fast
protection of all services against any single link failure or service
failure at a particular DCN. An ILP (Integer Linear Program) is
first formulated to achieve an optimal joint design. It integrates
p-cycles (preconfigured protection cycles) for fast protection against
a single link failure, and DCN replicas with fast service rerouting
against a service failure. To make the design more scalable, a
two-step heuristic is then proposed for large-size network scenarios.
The first step separately solves the DCN placement and service
routing problem in the failure-free scenario, and the second step
takes fast service protection into account. The proposed design
is validated by extensive numerical experiments.
Index Terms—Cloud services, data center networks (DCNs),
optical networks, routing, survivability.
I. INTRODUCTION
THE rapid growth of broadband communications has led
to many new web applications such as online interactive
maps, social networks, video streaming, cloud computing and
CDN (Content Distribution Network) services. Most of those
applications are provided by data center networks (DCNs) [1-2].
A DCN is a warehouse-scale, massively parallel computing
and storage resource. It generally consists of thousands
of clustered servers. DCN-based applications are reshaping
the network landscape by pushing the traditional hierarchical
and connectivity-oriented Internet towards a flatter and more
service-oriented infrastructure [3]. Accordingly, networks provide
more direct connections from content/service providers to
customers, with services delivered through a network of DCNs
(referred to as the cloud [4-6]).
Manuscript received March 31, 2013; revised August 20 and November 9,
2013. The editor coordinating the review of this paper and approving it for
publication was C. Assi.
J. Xiao is with the School of Computer Science and Technology, Tianjin
University, Tianjin, 300072, P. R. China, and with the National Key Lab on
Communications, University of Electronic Science and Technology of China,
Chengdu, 611731, P. R. China (e-mail: jiexiao001@gmail.com).
H. Wen is with the National Key Laboratory on Communications,
University of Electronic Science and Technology of China, Chengdu, Sichuan,
611731, P. R. China (e-mail: sunlike@uestc.edu.cn).
B. Wu and L. Zhang, corresponding author, are with the School of
Computer Science and Technology, Tianjin University, Tianjin, 300072, P.
R. China (e-mail: binwu.tju@gmail.com, lzhang@tju.edu.cn).
X. Jiang is with the School of Systems Information Science, Future
University Hakodate, Hakodate, Japan (e-mail: jiang@fun.ac.jp).
P.-H. Ho is with the ECE Department, University of Waterloo, Waterloo,
ON, N2L 3G1, Canada (e-mail: p4ho@uwaterloo.ca).
Digital Object Identifier 10.1109/TCOMM.2013.121313.130240
As cloud services grow rapidly, network planning is necessary
to address both the service deployment and the reliability
issues. Cloud services can be supported by the anycast service
mode [7-8], where heterogeneous contents or services are
replicated across multiple DCNs located at different nodes,
and a user demand can be served by any DCN that supports
the desired service. Such a distributed service mode brings two
benefits: 1) a demand can be served by a nearby DCN rather
than a remote centralized one, thereby reducing the service
transmission cost and latency; and 2) it improves service
availability, as the demands can still be served by other DCN
replicas upon a specific service failure at a particular DCN.
In fact, a failed service can be simultaneously protected by
multiple nearby replicas, as long as the sum capacity of those
replicas can satisfy the original demand.
Under the anycast mode, DCN placement is important in
balancing between network cost and service latency, as well
as ensuring cloud service availability. On one hand, cloud
service providers wish to direct user demands to nearby DCNs
with the smallest latencies and the minimum transmission
costs. This can be easily achieved by densely distributing the
DCNs in the area of interest. On the other hand, deploying a
DCN incurs not only a basic investment (e.g. warehouse rent
and power supply, etc.), but also the service capacity related
cost (e.g. servers and switches). This requires the number of
DCNs to be minimized. The tradeoff between network cost
and service latency makes DCN placement a fundamental
optimization problem.
In addition to the above tradeoff, service availability is another
important issue. Fast service protection against a component
or service failure is desirable. For cloud services, the anycast
mode enhances service availability. Since cloud services are
generally deployed over a nationwide area, wavelength division
multiplexing (WDM) optical networks [9-10] are ideal for service
transmission. Although survivability has been extensively
studied in traditional optical networks [11-20], the anycast mode
has been taken into account only in a few recent works [8] to provide
highly available cloud services.
Given a backbone network topology with a certain amount
of service demands at each node, the DCN placement problem
under the anycast mode should be addressed in three aspects:
1) how many DCNs should be placed; 2) where should the
DCNs be placed; and 3) what are the service and replica
capacities of the DCNs for satisfying all demands and ensuring
service availability against failures. Those problems should be
solved under the objective of minimizing the total network
cost, which includes not only the DCN construction and
the service transmission costs, but also the cost for service
protection. In particular, service transmission cost in optical
networks can be defined as distance related, and thus is also
related to the average service latency. By adjusting the
different cost metrics, the tradeoff between network cost and
service latency can be balanced as well.
In this paper, we study an integrated joint design under
the anycast mode, which takes both DCN placement and fast
cloud service protection into account. An ILP (Integer Linear
Program) is first formulated to achieve optimal joint design. In
terms of service protection, we consider any single link failure
and service failure at a particular DCN. To make the design
more scalable to large-size networks, a two-step heuristic is
proposed, which separately solves the DCN placement and
service routing problem before service protection is consid-
ered.
Compared with the classic node placement problem [21-
23], DCN placement in our proposed joint design is much
more complex. It differs from the former in three key aspects:
1) the classic node placement generally places nodes with
exactly the same configuration [21-23], whereas the service
and replica capacities of the DCNs are to be optimized and
thus different in our work; 2) cloud service brings about the
anycast mode which is not considered in the classic node
placement; and 3) in addition to service and replica capacity
optimization, DCN placement in the joint design is also
coupled with service routing and fast protection. To the best of
our knowledge, the proposed joint design provides the first complete and
integrated network-planning solution under the anycast mode
for reliable cloud services, where the classic node placement
can be taken as a basic building block overlaid by multiple
interplaying factors.
The rest of the paper is organized as follows. We describe
the system model and the problem in Section II, and formulate
the optimal ILP for joint design in Section III. Section IV
presents the heuristic. Section V gives numerical results and
we conclude the paper in Section VI.
II. SYSTEM MODEL AND PROBLEM DESCRIPTION
A. System Model
Consider a WDM optical backbone network with topology $G(V, E)$, where $V$ is the set of all nodes (vertices) and $E$ is the set of all fiber links (edges). DCNs can be placed at a selected subset of network nodes. Assume that a sufficient number of wavelengths can be multiplexed onto each link for high-speed optical transmissions. A set $S = \{s_1, s_2, \ldots, s_{|S|}\}$ denotes the $|S|$ service types provided in the network. Every node has a specific amount of demand for each service type, which is counted in full wavelength granularity. The services in $S$ are hosted at the DCNs and are delivered to the destination nodes through all-optical service paths, which can be achieved by using OXCs (optical cross-connects) at the intermediate nodes for transparent optical connections.
Network cost consists of the DCN construction cost, the service transmission cost and the service protection cost. The DCN construction cost is the sum of the costs of all individual DCNs deployed in the network. The cost of a particular DCN includes a basic investment and a capacity related cost measured by $P_s$ per unit of service $s$. The service transmission cost accounts for the cost of all working wavelength capacity for establishing the service paths to satisfy the network-wide demands. The service protection cost accounts for the cost of all spare wavelength capacity for protecting against failures, where at most a single service or link failure is assumed in the network. Note that the cost of DCN service replicas is counted in the DCN construction cost rather than in the service protection cost.
In the failure-free scenario, service demands at each node
are served by nearby DCNs to reduce the service transmission
cost. Upon a service failure, the disrupted services must be
recovered using the service replicas hosted by other DCNs at
different nodes (i.e., the anycast service mode), whereas all
unaffected services keep their original paths without the need
of rerouting. Define protection segments as the preconfigured
spare wavelengths between a replicated service and the failed
one. To achieve fast service recovery against a service failure,
the rerouted service paths are set up by connecting the precon-
figured protection segments to the existing original paths of
the disrupted service. As such, only a single optical switching
operation is needed at the node with the failed service, and
reconfigurations at other intermediate nodes of the path can
be avoided for fast protection.
On the other hand, the classic p-cycle [14, 20] is adopted for fast service protection against any single link failure. A p-cycle is a preconfigured ring-like structure consisting of a spare wavelength on each on-cycle link. If both end nodes of a link are on a p-cycle but the link is not an on-cycle link, it is defined as a straddling link. A p-cycle can protect one unit of traffic on each on-cycle link, and two units on each straddling link. Note that pre-cross-connection of spare capacity is the key to achieving fast protection, and the p-cycle is not the only choice. Our work can be tailored to adopt other fast protection schemes such as CFP [15] and pre-cross-connected trail (p-trail) [24-25].
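The on-cycle and straddling cases can be illustrated with a short Python sketch. It is only an illustration of the counting rule stated above, not part of the proposed design; the function name `pcycle_protection_units` and the representation of a p-cycle as a list of nodes in ring order are assumptions made here, and the link passed in is assumed to exist in the topology.

```python
def pcycle_protection_units(cycle, link):
    """Units of traffic that one p-cycle can protect on `link`.

    cycle: nodes in ring order, e.g. [0, 1, 2, 3] stands for 0-1-2-3-0
           (hypothetical representation chosen for this sketch).
    link:  pair (u, v) of end nodes of an existing fiber link.
    """
    u, v = link
    if u not in cycle or v not in cycle:
        return 0  # at least one end node is off the cycle: no protection
    n = len(cycle)
    # Undirected links that the ring itself traverses.
    on_cycle = {frozenset((cycle[i], cycle[(i + 1) % n])) for i in range(n)}
    if frozenset((u, v)) in on_cycle:
        return 1  # on-cycle link: restored over the remainder of the ring
    return 2      # straddling link: restored over both arcs of the ring


ring = [0, 1, 2, 3]
print(pcycle_protection_units(ring, (0, 1)))  # on-cycle link
print(pcycle_protection_units(ring, (0, 2)))  # straddling link
```

The factor of two for a straddling link is what appears as the coefficient of the protection variable in the p-cycle constraints of the ILP.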
We also assume that service and link failures will not occur
at the same time, though simultaneous service and link failures
can be allowed by slightly modifying our proposed algorithms
as discussed later.
B. Problem Description
Our objective is to minimize the total network cost as
detailed in Section II.A, subject to all service demands being
satisfied with fast protection against any single link failure
(p-cycle based fast protection) or service failure.
As introduced in Section I, the above problem entails a
complex joint optimization on DCN placement, service routing
and fast protection. In our work, DCN placement differs
from the classic node placement problem [21-23] due to
the (service and replica) capacity optimization of the DCNs
and the anycast mode of cloud services. The problem is
further complicated by the joint optimization on routing and
protection.
Notably, the anycast mode is the pivot that couples all
those interplaying factors into an integrated joint design. Each
service at a DCN is subject to a possible service failure.
In the anycast mode, a failed service can be protected by
multiple nearby DCNs via service replicas, as long as the
sum capacity of those replicas can satisfy the original demand.
Therefore, DCN placement concerns not only the locations to
place the service and replica capacities, but also the amount
and the protection relationship among them. In other words,
DCN placement is constrained by all possible service failure
scenarios and the protection scheme under the anycast mode,
as well as the routing scheme in both failure-free and failure
scenarios. This is much more complex than the traditional
node placement problem [21-23]. The basic investment
for constructing a DCN also favors a smaller number of DCNs and
promotes the co-location of service and replica capacities at
the same DCN. Besides, DCN placement and the routing scheme
determine the traffic load on each link, which interplays with
p-cycle placement against a possible link failure. In our work,
all the above factors are jointly considered under the anycast
mode to generate a cost efficient solution for highly available
cloud services.
III. ILP FOR OPTIMAL JOINT DESIGN
We define the notations used in the ILP in Section III.A, and
formulate the ILP in Section III.B. Our ILP imports a modular
Cycle Exclusion technique from [14] for optimal p-cycle
design without candidate cycle enumeration. It involves some
notions including cycle set, root node and voltage [14], which
will be further explained later in Fig. 1 and the corresponding
context.
A. Notation List
$V$: The set of all nodes in the network.
$E$: The set of all bidirectional links in the network.
$S$: The set of all service types in the network.
$J$: Predefined constant. It is the maximum allowed number of p-cycles for link failure protection. Cycles are indexed by $j \in \{1, 2, \ldots, J\}$.
$B_u$: Predefined constant. It is the basic investment cost for constructing a DCN at a particular node $u$. Note that $B_u$ can take different values at different nodes.
$P_s$: Predefined constant. It is the cost for a unit amount of service $s \in S$ provided by a DCN server.
$C_{uv}$: Predefined constant. It is the cost of a wavelength on link $(u, v)$. We assume $C_{uv} = C_{vu}$ for bidirectional links and it can be either distance-related or hop-count based.
$\beta$: Predefined constant greater than $|S| \times |V| \times \max\{d^s_u \mid \forall u \in V, \forall s \in S\}$.
$d^s_u$: Predefined constant. It is the amount of demands on service type $s$ at node $u$.
$D_u$: Binary variable. It takes 1 if a DCN is placed at node $u$ and 0 otherwise.
$W_{uv}$: Non-negative integer variable. It is the total number of working wavelengths on link $(u, v)$ for service transmissions plus spare wavelengths for service rerouting against a service failure. Note that the spare wavelengths for p-cycles are not included in $W_{uv}$.
$c^s_u$: Non-negative integer variable. It is the sum of working capacity (when no service fails) and replicated spare capacity of service $s$ at a DCN server placed at node $u$.
$t^s_u$: Non-negative integer variable. It is the working capacity of service $s$ provided by a DCN at node $u$ when no DCN service fails.
$t^s_{u|n}$: Non-negative integer variable. It is the capacity replica of service $s$ provided by a DCN at node $u$ if a DCN at node $n$ fails to provide the same type of service.
$t^s_{uv}$: Non-negative integer variable. It is the wavelength capacity on link $(u, v)$ for transmitting service $s$ from node $u$ to node $v$ when no DCN service fails.
$t^s_{uv|n}$: Non-negative integer variable. It is the spare wavelength capacity of service $s$ on link $(u, v)$ from node $u$ to node $v$ if service $s$ fails at node $n$.
$\alpha$: Predefined fractional constant where $1/|V| \ge \alpha > 0$.
$\theta^j_{uv}$: Binary variable. It takes 1 if link $(u, v)$ is traversed by cycle set $j$ from node $u$ to node $v$, and 0 otherwise.
$z^j_u$: Binary variable. It takes 1 if node $u$ is on cycle set $j$, and 0 otherwise.
$r^j_u$: Binary variable. It takes 1 if node $u$ is a root node on cycle set $j$, and 0 otherwise.
$p^j_u$: Fractional variable. It is the voltage value of node $u$ when constructing cycle set $j$.
$x^j_{uv}$: Binary variable. It takes 1 if link $(u, v)$ can be protected by cycle set $j$, and 0 otherwise.
B. ILP Formulation
The ILP in (1)-(15) carries out the joint design of DCN placement, service routing and protection (against a link or service failure) to minimize the total network cost, with the set of parameters $\{G(V, E), S, B, P_s, C_{uv}, d^s_u, \alpha, \beta, J\}$ as the input.
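$$\text{minimize} \quad \sum_{(u,v) \in E} C_{uv} W_{uv} + \sum_{j} \sum_{(u,v) \in E} C_{uv} \left( \theta^j_{uv} + \theta^j_{vu} \right) + \sum_{u \in V} \Big( B_u D_u + \sum_{s \in S} P_s c^s_u \Big) \tag{1}$$
Subject to
$$D_u \ge \frac{1}{\beta} \sum_{s \in S} c^s_u, \quad \forall u \in V; \tag{2}$$
$$t^s_u + \sum_{(u,v) \in E} \left( t^s_{vu} - t^s_{uv} \right) = d^s_u, \quad \forall u \in V, \forall s \in S; \tag{3}$$
$$t^s_{n|n} = 0, \quad \forall n \in V, \forall s \in S; \tag{4}$$
$$\sum_{(n,v) \in E} \left( t^s_{vn|n} - t^s_{nv|n} \right) = t^s_n, \quad \forall n \in V, \forall s \in S; \tag{5}$$
$$t^s_{u|n} = \sum_{(u,v) \in E} \left( t^s_{uv|n} - t^s_{vu|n} \right), \quad \forall u, n \in V: u \ne n, \forall s \in S; \tag{6}$$
$$c^s_u \ge t^s_u + t^s_{u|n}, \quad \forall u, n \in V, \forall s \in S; \tag{7}$$
$$W_{uv} \ge \left( t^s_{uv|n} + t^s_{vu|n} \right) + \sum_{s \in S} \left( t^s_{uv} + t^s_{vu} \right), \quad \forall n \in V, \forall s \in S, \forall (u,v) \in E; \tag{8}$$
$$\theta^j_{uv} + \theta^j_{vu} \le 1, \quad \forall (u,v) \in E, \forall j; \tag{9}$$
$$\sum_{(u,v) \in E} \left( \theta^j_{uv} + \theta^j_{vu} \right) = 2 z^j_u, \quad \forall u \in V, \forall j; \tag{10}$$
$$\sum_{u \in V} r^j_u \le 1, \quad \forall j; \tag{11}$$
$$\sum_{(u,v) \in E} \theta^j_{uv} \le 1 + r^j_u, \quad \forall u \in V, \forall j; \tag{12}$$
$$p^j_v - p^j_u \ge \alpha \theta^j_{uv} - \left( 1 - \theta^j_{uv} \right), \quad \forall (u,v) \in E, \forall j; \tag{13}$$
$$x^j_{uv} \le \frac{1}{2} \left( z^j_u + z^j_v \right), \quad \forall (u,v) \in E, \forall j; \tag{14}$$
$$\sum_{j} \left( 2 x^j_{uv} - \theta^j_{uv} - \theta^j_{vu} \right) \ge \sum_{s \in S} \left( t^s_{uv} + t^s_{vu} \right), \quad \forall (u,v) \in E. \tag{15}$$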
Objective (1) minimizes the network cost, which consists of
three terms. The first term accounts for the capacity of working
wavelengths (i.e., the service transmission cost) plus the spare
wavelength capacity for service rerouting against a service
failure. The second term is the cost of the spare wavelength capacity
dedicated to p-cycle protection against a link failure. The third
term accounts for the DCN construction cost, which includes the
basic investment and the capacity related costs of all DCNs.
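As a rough illustration of how the three terms of objective (1) combine, the following Python sketch evaluates the cost of a given candidate solution. It does not solve the ILP. The function name `network_cost` and the dictionary-based data layout are assumptions made for this sketch: undirected links are keyed once as `(u, v)`, while each cycle's `theta` dictionary uses directed keys.

```python
def network_cost(C, W, theta, B, D, P, c):
    """Evaluate objective (1) for a candidate solution (illustrative layout).

    C[(u, v)]        wavelength cost of undirected link (u, v)
    W[(u, v)]        working plus rerouting wavelengths on link (u, v)
    theta[j][(u, v)] 1 if cycle j traverses the link from u to v (directed key)
    B[u], D[u]       basic investment and placement indicator at node u
    P[s], c[u][s]    unit service cost and DCN capacity of service s at node u
    """
    links = list(C)
    # First term: working wavelengths plus spare wavelengths for rerouting.
    transmission = sum(C[e] * W[e] for e in links)
    # Second term: spare wavelengths reserved on p-cycles.
    pcycle = sum(
        C[(u, v)] * (th.get((u, v), 0) + th.get((v, u), 0))
        for th in theta
        for (u, v) in links
    )
    # Third term: DCN construction (basic investment plus capacity cost).
    dcn = sum(B[u] * D[u] + sum(P[s] * c[u][s] for s in P) for u in B)
    return transmission + pcycle + dcn


# Triangle network with one DCN at node 0 and one p-cycle 0 -> 1 -> 2 -> 0.
C = {(0, 1): 2, (1, 2): 3, (0, 2): 4}
W = {(0, 1): 1, (1, 2): 0, (0, 2): 2}
theta = [{(0, 1): 1, (1, 2): 1, (2, 0): 1}]
B = {0: 10, 1: 10, 2: 10}
D = {0: 1, 1: 0, 2: 0}
P = {"a": 1, "b": 2}
c = {0: {"a": 3, "b": 1}, 1: {"a": 0, "b": 0}, 2: {"a": 0, "b": 0}}
print(network_cost(C, W, theta, B, D, P, c))  # 10 + 9 + 15 = 34
```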
Constraints (2)-(8) formulate DCN placement and service
protection against a service failure. In particular, constraint
(2) means that if any type of DCN service is provided at
a particular node, then we must place a DCN at this node.
Constraint (3) formulates the flow conservation property of
an arbitrary service at each node in the failure-free scenario.
It also ensures that all demands in the network can be served.
Constraints (4)-(8) assume that an arbitrary service $s$ fails at node $n$. In particular, constraint (4) means that the replicated capacity of $s$ at node $n$ should be zero. Constraint (5) gives the flow conservation property of service $s$ at node $n$. It says that the net amount of service $s$ emanating from node $n$ must remain the same as if the service had never failed. This makes the service failure transparent to the affected demands, and ensures that the original paths of the disrupted services can be reused by being connected to the preconfigured protection segments. Constraint (6) formulates the flow conservation property at the other nodes, where no service failure occurs. It says that the replicated capacity of service $s$ at those nodes equals the net traffic load of $s$ upon the service failure at node $n$. This supports the anycast mode. Constraint (7) specifies that the DCN capacity of service $s$ at a node must be sufficient to satisfy all demands on $s$ no matter whether a service fails or not. Finally, the required wavelength capacity on each link must be sufficient to support the service transmissions as formulated in (8). Constraint (8) also ensures that all service paths in the failure-free scenario, including those of the disrupted services which are reused upon the failure, remain unchanged.
Fig. 1. Cycle Exclusion mechanism [14]. A cycle set is shown with the underlying network topology omitted; the two cycles that do not traverse the root node are excluded to avoid voltage value conflicts.
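To make the flow conservation property in (3) concrete, the following Python sketch checks it for a single service on a small network. It is a minimal checker under an assumed data layout (the function name `check_flow_conservation`, undirected links stored once, directed flows stored in a dictionary), not the ILP itself.

```python
def check_flow_conservation(nodes, links, t_node, t_link, d):
    """Check constraint (3) for one service s at every node u:
    t_u + sum over neighbours v of (t_vu - t_uv) == d_u.

    links:  undirected links, each stored once as (a, b)
    t_node: working capacity served locally by a DCN at u (0 if absent)
    t_link: directed flow, t_link[(u, v)] is the flow from u to v
    d:      demand at each node
    """
    for u in nodes:
        net_inflow = 0
        for a, b in links:
            if u == a:
                v = b
            elif u == b:
                v = a
            else:
                continue
            net_inflow += t_link.get((v, u), 0) - t_link.get((u, v), 0)
        if t_node.get(u, 0) + net_inflow != d.get(u, 0):
            return False
    return True


# Line network 0 - 1 - 2 with a single DCN at node 1 serving all demands.
nodes = [0, 1, 2]
links = [(0, 1), (1, 2)]
demand = {0: 2, 1: 1, 2: 3}
print(check_flow_conservation(nodes, links, {1: 6}, {(1, 0): 2, (1, 2): 3}, demand))
```

In the example, the DCN at node 1 provides six units, of which two flow to node 0 and three to node 2, so the balance holds at all three nodes. The failure-scenario constraints (5) and (6) have the same structure, with the conditional variables in place of the failure-free ones.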
Constraints (9)-(15) are dedicated to p-cycle protection of
the services against a link failure, which can be taken as a
separate module imported from [14]. It removes the candidate
cycle enumeration process by using the Cycle Exclusion algo-
rithm [14] (constraints (9)-(13)) to formulate a single cycle at
a time. Constraint (14) says that a link can be protected if its
two end nodes are on the p-cycle. Constraint (15) says that all
services must be protected, where one unit of traffic can be
protected if the failed link is an on-cycle link and two units
can be protected if it is a straddling one.
To make the paper self-contained, we briefly explain the
Cycle Exclusion algorithm [14] by referring to Fig. 1. In Cycle
Exclusion, a cycle can traverse a link only once in either
direction as specified by (9). Constraint (10) requires each
node to be incident on either two or zero on-cycle links. When
formulating a single p-cycle, constraint (10) may generate
multiple disjoint cycles referred to as a Cycle Set. To get a
single cycle by excluding all other redundant ones, a unique
root node is defined on each cycle set as in (11). By constraint
(12), we logically allow the root node to have two outgoing
on-cycle links, whereas all other nodes can have at most one.
Next, a voltage value is defined for each node, and it must
keep increasing at the nodes along the logically directed on-
cycle links as required by (13). Due to the cyclic structure
of the cycles in a cycle set, if a cycle does not traverse the
root node, the voltage values along the cycle will encounter
a conflict and thus violate (13). Consequently, all redundant
cycles will be excluded from the cycle set, and only the one
traversing the unique root node remains as a single p-cycle.
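The voltage conflict can be written out explicitly. The short derivation below is a sketch of the argument, assuming a redundant cycle with $k$ on-cycle links that does not contain the root node; the node labels $u_1, \ldots, u_k$ are introduced here only for illustration. Since no node on this cycle is a root, (12) allows each of its $k$ nodes at most one outgoing on-cycle link, and the $k$ links need $k$ tails, so the links are consistently oriented as $u_1 \to u_2 \to \cdots \to u_k \to u_1$.

```latex
% Constraint (13) on each directed on-cycle link (theta = 1), with u_{k+1} = u_1:
\begin{align*}
p^j_{u_{i+1}} - p^j_{u_i} &\ge \alpha, \qquad i = 1, \ldots, k. \\
% Summing over the k links, the left-hand side telescopes to zero:
0 = \sum_{i=1}^{k} \left( p^j_{u_{i+1}} - p^j_{u_i} \right) &\ge k\alpha > 0.
\end{align*}
```

The contradiction shows that such a cycle cannot satisfy (13). The cycle through the root node escapes this argument because the root may have two outgoing links, so its links are not oriented as one directed loop.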
Recall that in Section II.A, we have assumed that service
and link failures will not occur at the same time. Nevertheless,
by changing constraint (15) to the following (16), a simulta-
neous link and service failure can be allowed.
$$\sum_{j} \left( 2 x^j_{uv} - \theta^j_{uv} - \theta^j_{vu} \right) \ge W_{uv}, \quad \forall (u,v) \in E. \tag{16}$$
IV. HEURISTIC
Solving the ILP in Section III is not an easy task. Although some large-scale optimization techniques such as column generation [26-28] can be applied, they are still ILP-based or exhaustive search algorithms and thus not fully scalable. To make the design more scalable for large-size networks, in this section we propose a two-step heuristic DPP (DCN Placement and Service Protection) by dividing the problem into two subproblems. The first step separately solves the DCN placement and service routing problem, and the second step considers service protection against a service or link failure by taking the result of the first step as the input. In what follows, we first present the DPP algorithm in Section IV.A, and then give a more detailed theoretical analysis in Section IV.B to validate the DCN placement process in DPP. Finally, Section IV.C discusses the complexity of DPP and how fast protection is achieved in the solutions.
DPP Algorithm: DCN Placement and Service Protection
Input:
  $G(V, E)$, $S$, $B$, $P_s$ for $\forall s \in S$, $C_{uv}$ for $\forall (u, v) \in E$, $d^s_u$ for $\forall u \in V$ and $\forall s \in S$, $J$ and $\alpha$. Define $\emptyset$ as a null set.
Output:
  Location of each DCN, service and replicated capacity of each service type, service transmission capacity and paths for all demands, and protection capacity and paths against a single link or service failure.
1. DCN placement and service routing:
  1.1) Bipartite graph construction:
    Construct a bipartite graph consisting of a left set $L = V$ and a right set $R = V$ of vertices. Use an edge to connect each vertex pair $\{u, v\}$ with $u \in L$ and $v \in R$. Calculate the shortest path $P_{uv} \subseteq E$ between $\{u, v\}$ in $G(V, E)$ and assign the path cost $\mathcal{P}_{uv} = \sum_{(s,t) \in P_{uv}} C_{st}$ as the weight to the edge.
  1.2) DCN placement:
    while ($L \ne \emptyset$)
    {
      $v = \arg_{v \in R} \min A_{\min}$ = Service_Matching($L$, $v$, $C_v$).
      Place DCN$_v$ at node $v$ in $G(V, E)$.
      Set $R - \{v\} \to R$ and $L - C_v \to L$.
    }
    Subroutine Service_Matching($L$, $v$, $C_v$):
    Service_Matching($L$, $v$, $C_v$)
    {
      Set $B_v \to C$, $0 \to D$, $\emptyset \to C_v$ and $+\infty \to A_{\min}$;
      while ($C_v \ne L$)
      {
        $u = \arg_{u \in (L - C_v)} \min A_u = \dfrac{C + \sum_{s \in S} d^s_u (P_s + \mathcal{P}_{uv})}{D + \sum_{s \in S} d^s_u}$;
        if ($A_u \le A_{\min}$)
        {
          Set $A_u \to A_{\min}$ and $C_v \cup \{u\} \to C_v$;
          Set $C + \sum_{s \in S} d^s_u (P_s + \mathcal{P}_{uv}) \to C$;
          Set $D + \sum_{s \in S} d^s_u \to D$;
        }
        else break;
      }
      return $A_{\min}$;
    }
  1.3) Service routing:
    Based on shortest paths, each node in network $G(V, E)$ finds its closest DCN to serve its demands (a tie is broken by random choice). In the failure-free scenario, the required capacity $t^s_u$ ($\forall u \in V$, $\forall s \in S$) of service $s$ at DCN$_u$ can thus be determined, and the wavelength capacity required on each link is $\sum_{s \in S} (t^s_{uv} + t^s_{vu})$, where $t^s_{uv} + t^s_{vu}$ is the amount of traffic $s$ on link $(u, v)$.
2. Service protection:
  2.1) Protection against a service failure:
    2.1.1) Protection rule:
      Let DCN$_u$ be the closest DCN of DCN$_v$. If service $s$ at DCN$_v$ fails, merge the shortest path $P_{uv}$ (i.e., the protection segment) with the original service paths of $s$ emanating from DCN$_v$. The failed service is protected by DCN$_u$ using the merged paths.
    2.1.2) Spare capacity on the protection segment:
      Under the assumptions in 2.1.1), the required number of spare wavelengths on each link along $P_{uv}$ is $\max_{s \in S} t^s_v$.
    2.1.3) DCN capacity and replicas:
      Consider the set of nearby DCNs protected by DCN$_u$. The total capacity $c^s_u$ of service $s$ at DCN$_u$ (which includes $t^s_u$ and the replicated capacity of $s$) is determined by $c^s_u = t^s_u + \max_{v} t^s_v$, where the maximum is taken over all DCN$_v$ in that set.
  2.2) Protection against a link failure:
    Use the ILP module in constraints (9)-(15) to minimize the spare wavelength capacity of p-cycles for link failure protection. p-cycles can thus be placed in the network.
Fig. 2. Pseudo code of the proposed heuristic.
A. Algorithm Description
The pseudo code of the proposed heuristic DPP is given in Fig. 2. In Step 1.1, we first construct a bipartite graph as illustrated by the 4-node example in Fig. 3, where initially both the left and right sets of vertices ($L$ and $R$) include all the nodes in the network. An edge $\langle u, v \rangle$ connects every node pair $\{u, v\}$ between the two sets, and is weighted by the cost $\mathcal{P}_{uv}$ of the shortest path between $u$ and $v$ in $G(V, E)$. In our analysis, a vertex $v \in R$ gives a possible location of a DCN (denoted by DCN$_v$). It matches a set of vertices $u \in L$, which denotes the set of nodes that DCN$_v$ should serve, provided that DCN$_v$ is finally placed at node $v$.
Step 1.2 is dedicated to DCN placement. It checks each vertex $v \in R$ by assuming a DCN$_v$ placed at node $v$, and finds the corresponding set of nodes $C_v \subseteq L$ to be served using a subroutine Service_Matching($L$, $v$, $C_v$). The subroutine takes $L$ and $v$ as inputs and $C_v$ as the output. If adding a node $u \in L$ to $C_v$ can further reduce the amortized service cost $A_u$ in (17), then $u$ is added to $C_v$ (i.e., to be served by DCN$_v$) and the total cost $C$ and demands $D$ are updated accordingly (see the pseudo code for the subroutine in Fig. 2).
$$A_u = \frac{C + \sum_{s \in S} d^s_u \left( P_s + \mathcal{P}_{uv} \right)}{D + \sum_{s \in S} d^s_u} \tag{17}$$
By checking all vertices in the right set $R$ using the subroutine Service_Matching($L$, $v$, $C_v$), a node $v$ with the minimum amortized service cost $A_{\min}$ is identified. Then, DCN$_v$ is placed at node $v$, and the sets $R$ and $L$ are updated by $R - \{v\} \to R$ and $L - C_v \to L$, respectively. Step 1.2 is then repeated on the updated $R$ and $L$ until all vertices $u \in L$ have been matched to their serving DCNs. At that point, the number of DCNs and their locations are determined.
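The greedy matching and placement loop of Step 1.2 can be sketched in Python as follows. This is a minimal sketch rather than the authors' implementation. It assumes that the running cost $C$ starts from the basic investment $B_v$ of the candidate node, that every node has a positive total demand, and that ties are broken by the smallest node index; the function names and dictionary layout are also choices made here.

```python
import math


def service_matching(L, v, B, P, d, dist):
    """Return (A_min, C_v) for candidate DCN location v, following (17).

    L: set of unserved nodes; B[v]: basic investment (ASSUMED initial cost);
    P[s]: unit service cost; d[u][s]: demand; dist[u][v]: shortest-path cost.
    """
    def added_cost(u):
        return sum(d[u][s] * (P[s] + dist[u][v]) for s in P)

    def added_demand(u):
        return sum(d[u][s] for s in P)

    C, D = B[v], 0
    Cv, A_min = set(), math.inf
    while Cv != L:
        # Pick the unmatched node that gives the smallest amortized cost.
        best_u, best_A = None, math.inf
        for u in sorted(L - Cv):
            demand = D + added_demand(u)
            if demand == 0:
                continue  # guard against division by zero
            A_u = (C + added_cost(u)) / demand
            if A_u < best_A:
                best_u, best_A = u, A_u
        if best_u is None or best_A > A_min:
            break  # adding any further node no longer reduces the cost
        A_min = best_A
        Cv.add(best_u)
        C += added_cost(best_u)
        D += added_demand(best_u)
    return A_min, Cv


def place_dcns(V, B, P, d, dist):
    """Repeat the matching over all candidates until every node is served."""
    L, R, placement = set(V), set(V), {}
    while L and R:
        A_best, v_best, C_best = math.inf, None, set()
        for v in sorted(R):
            A, Cv = service_matching(L, v, B, P, d, dist)
            if A < A_best:
                A_best, v_best, C_best = A, v, Cv
        if v_best is None:
            break  # no candidate can serve the remaining nodes
        placement[v_best] = C_best
        R.discard(v_best)
        L -= C_best
    return placement


# Line network 0 - 1 - 2 with unit link costs and one unit of demand per node.
dist = {0: {0: 0, 1: 1, 2: 2}, 1: {0: 1, 1: 0, 2: 1}, 2: {0: 2, 1: 1, 2: 0}}
P = {"s": 1}
d = {u: {"s": 1} for u in range(3)}
print(place_dcns([0, 1, 2], {0: 100, 1: 100, 2: 100}, P, d, dist))
print(place_dcns([0, 1, 2], {0: 0, 1: 0, 2: 0}, P, d, dist))
```

With a large basic investment the sketch places a single DCN at the central node serving all three nodes, whereas with zero basic investment it places one DCN per node, which reflects the tradeoff between construction cost and transmission cost.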
Based on the DCN locations, Step 1.3 simply uses shortest
paths to route the services between a demanding node and its
closest DCN, and ensures that all demands in the network can
be satisfied in the failure-free scenario. Then, service routing
and the required server capacity for every service type can be
determined at each DCN.
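Step 1.3 can be sketched along the same lines. The Python fragment below assigns each node to its closest DCN and accumulates the working capacity per service at each DCN. It assumes the shortest-path costs are already available as a nested dictionary `dist`, and it breaks ties by the smallest DCN index rather than randomly so that the result is reproducible; both choices, and the function name, are assumptions of this sketch. Link-level wavelength counts would additionally require the actual shortest paths and are omitted.

```python
def route_to_closest_dcn(V, dcns, d, dist):
    """Return (serving, t): serving[u] is the DCN chosen for node u and
    t[v][s] is the working capacity of service s required at DCN v."""
    services = sorted({s for u in V for s in d[u]})
    t = {v: {s: 0 for s in services} for v in dcns}
    serving = {}
    for u in V:
        # Closest DCN by shortest-path cost; min() keeps the first minimum,
        # so sorting makes the smallest index win a tie.
        v = min(sorted(dcns), key=lambda x: dist[u][x])
        serving[u] = v
        for s, amount in d[u].items():
            t[v][s] += amount
    return serving, t


# Line network 0 - 1 - 2 - 3 with unit link costs and DCNs at nodes 0 and 3.
V = [0, 1, 2, 3]
dist = {u: {v: abs(u - v) for v in V} for u in V}
d = {u: {"a": u + 1, "b": 1} for u in V}
serving, t = route_to_closest_dcn(V, [0, 3], d, dist)
print(serving)
print(t)
```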
Step 2 is for service protection against a single link or
service failure. The process described in Step 2 of Fig. 2 is
quite straightforward and thus not further explained here.
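For readers who prefer a concrete view of Step 2.1, the following Python sketch gives one possible reading of it: each DCN is protected by its closest other DCN, and a protecting DCN sizes its capacity for the worst single service failure among the DCNs it protects. The function name, the data layout and the deterministic tie-breaking are assumptions made here, and at least two DCNs are assumed to exist.

```python
def dcn_total_capacity(t, dcns, dist):
    """Return (protector, c) under the assumed reading of Step 2.1.

    t[v][s]:      working capacity of service s at DCN v (from Step 1.3)
    protector[v]: closest other DCN, which protects the services of DCN v
    c[u][s]:      working capacity plus replica capacity of s at DCN u
    """
    protector = {}
    for v in dcns:
        others = [u for u in sorted(dcns) if u != v]
        protector[v] = min(others, key=lambda u: dist[v][u])
    c = {}
    for u in dcns:
        protected = [v for v in dcns if protector[v] == u]
        # Only one service fails at a time, so the replica must cover the
        # largest working capacity among the protected DCNs, not their sum.
        c[u] = {
            s: t[u][s] + max((t[v][s] for v in protected), default=0)
            for s in t[u]
        }
    return protector, c


# Three DCNs on a line at positions 0, 3 and 5, with a single service "a".
dcns = [0, 3, 5]
dist = {a: {b: abs(a - b) for b in dcns} for a in dcns}
t = {0: {"a": 3}, 3: {"a": 7}, 5: {"a": 2}}
protector, c = dcn_total_capacity(t, dcns, dist)
print(protector)
print(c)
```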
B. Theoretical Analysis on DCN Placement in DPP
DCN placement (Step 1.2 in Fig. 2) invokes a subroutine Service_Matching($L$, $v$, $C_v$), which takes $L$ and $v$ as inputs and $C_v$ as the output. For a given node $v \in R$ and a set $L$, it returns a set $C_v \subseteq L$ with the minimum amortized service cost $A_{\min}$ by assuming that the nodes in $C_v$ are served by DCN$_v$ at node $v$. As discussed in Section IV.A, nodes $u \in L$ are sequentially added to $C_v$ if they can continuously reduce $A_u$ in (17). As indicated by the two circled parts in Fig. 2, the subroutine breaks and returns once this trend changes (i.e., whenever $A_u$ is found larger than $A_{\min} = \min\{A_u\}$ recorded in all previous rounds). In this subsection, we carry out some theoretical analysis to validate Step 1.2 in DPP.
Fig. 3. Illustration of the bipartite graph with $|V| = 4$. The vertices $u \in L$ (left) and $v \in R$ (right) are each numbered 1 to 4. Each edge $\langle u, v \rangle$ matches a shortest path $P_{uv} \subseteq E$ between nodes $\{u, v\}$ in $G(V, E)$, and is weighted by the path cost $\mathcal{P}_{uv} = \sum_{(s,t) \in P_{uv}} C_{st}$.
Fig. 4. How $A_{\min}$ changes with $|C_v|$ in Service_Matching*($L$, $v$, $C_v$). The plot shows $A_{\min}$ against $|C_v|$ from 1 to $|L|$, with $k_{\min}$ marked on the $|C_v|$ axis.
We now prove that $A_{\min}$ returned by the subroutine is globally minimal for any $|C_v| \in \{1, 2, \ldots, |L|\}$. Assuming that Service_Matching($L$, $v$, $C_v$) breaks and returns at $|C_v| = k_{\min}$, the subroutine has ensured a decreasing series of $\{A_{\min}\}$ recorded as $|C_v|$ increases up to $k_{\min}$, as shown in Fig. 4. Therefore, we only need to prove that $A_{\min}$ cannot be smaller if $|C_v|$ is allowed to go beyond $k_{\min}$, which can be achieved by defining a Service_Matching*($L$, $v$, $C_v$) without the two circled parts in Service_Matching($L$, $v$, $C_v$) (see Fig. 2). To this end, we have the following Theorem 1 (proved in the Appendix), which ensures the global minimum of $A_{\min}$ at $|C_v| = k_{\min}$.
Theorem 1: In Service_Matching*($L$, $v$, $C_v$), $A_{\min}$ always keeps increasing with $|C_v|$ for $|C_v| > k_{\min}$.
Note that Service_Matching($L$, $v$, $C_v$) calculates the minimum amortized service cost only for a single given node $v \in R$. By repeatedly invoking this subroutine in Step 1.2 of Fig. 2 to minimize $A_{\min}$ over all $v \in R$, DPP identifies a node $v$ with the minimum $A_{\min}$ to place DCN$_v$. Theorem 2 below then follows directly. It implies that DPP intrinsically adopts a greedy approach for DCN placement.
Theorem 2: When DPP finds a matching $\{v, C_v\}$ in Step 1.2 to place a DCN$_v$, there is no other matching $\{v' \in R, C_{v'} \subseteq L\}$ which can achieve a smaller $A_{\min}$ than that achieved by $\{v, C_v\}$.
Theorems 1 and 2 provide the theoretical support to validate Step 1.2 for DCN placement in DPP. In particular, Theorem 1 ensures that Service_Matching($L$, $v$, $C_v$) always returns a proper set $C_v \subseteq L$ that minimizes the amortized service cost $A_{\min}$ for a given node $v \in R$. By checking all nodes $v \in R$ in Step 1.2, Theorem 2 ensures that the specific node $v \in R$ identified for placing DCN$_v$ achieves the minimum $A_{\min}$ among all nodes $v \in R$.
C. Discussions
We now analyze the complexity of DPP by excluding the ILP based Step 2.2 for p-cycle placement. Without Step 2.2, the complexity of DPP is dominated by two parallel operations: DCN placement in Step 1.2 and the calculation of all-pairs shortest paths used in other steps. In DCN placement, a DCN$_v$ with the minimum amortized service cost can be placed at a node $v$ after all $v \in R$ are checked, which incurs a complexity of $O(|V|)$. Meanwhile, subroutine Service Matching$(L, v, C_v)$ is invoked for checking each $v \in R$, where each node $u \in L$ is sequentially tested to check whether the amortized service cost can be further reduced or not, and at most $|V|$ nodes $u \in L$ can be tested. As a result, placing a single DCN requires a complexity of $O(|V|^2)$. Since at most $|V|$ DCNs can be placed, the total complexity of DCN placement is $O(|V|^3)$. On the other hand, it is well known that the complexity of all-pairs shortest path calculation is $O(|V|^3)$, and it is in parallel with DCN placement. Hence, the total complexity of DPP is $O(|V|^3)$ without considering Step 2.2.
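As an illustration of the $O(|V|^3)$ all-pairs shortest path term, the following Python sketch uses the Floyd-Warshall algorithm. The text does not say which all-pairs algorithm DPP uses, so this choice is an assumption, and the four-node graph is a made-up example.

```python
import math

def all_pairs_shortest_paths(n, links):
    """Floyd-Warshall on an undirected graph with nodes 0..n-1.

    links maps (u, v) -> non-negative link cost. The three nested loops
    over the node set give the O(|V|^3) running time.
    """
    dist = [[0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for (u, v), cost in links.items():
        dist[u][v] = dist[v][u] = min(dist[u][v], cost)
    for k in range(n):              # intermediate node
        for i in range(n):          # source
            for j in range(n):      # destination
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

# Hypothetical four-node example.
toy_links = {(0, 1): 4, (1, 2): 1, (0, 2): 7, (2, 3): 2}
toy_dist = all_pairs_shortest_paths(4, toy_links)
```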
Step 2.2 adopts the ILP module in constraints (9)-(15) for p-cycle design. It is imported from [14], which adopts a Cycle Exclusion algorithm to remove the traditional candidate cycle enumeration process, and thus can generate p-cycle solutions quickly. Even when the network size is relatively large, the ILP running time is tolerable (e.g., only a few hours for the network in Fig. 7 with 30 nodes and 62 links). On the other hand, Step 2.2 can be replaced by some existing heuristics for p-cycle placement if necessary. In this case, DPP runs entirely in polynomial time. Besides, if p-cycles are designed based on (16) rather than (15), a simultaneous link and service failure can be allowed.
We emphasize that both the proposed ILP and the heuristic
focus on fast protection against either a service or a link fail-
ure. For a service failure, preconfigured protection segments
are connected to the original paths of the disrupted services (as
specified in Step 2.1.1 in Fig. 2). Since only a single operation
of local failure detection and optical switching is needed, the
optical recovery speed can be very fast. On the other hand,
it is well known that p-cycles can achieve fast protection
against a link failure. Fast protection is desired in providing cloud services over all-optical networks to minimize service interruption time, and our proposed joint design meets this essential requirement.
(a) Pan-European COST 239 network. DCN capacity labels $\{c_u^{s_1}, c_u^{s_2}, c_u^{s_3}\}$, each written as x+y:
DCN4: 8+4, 8+3, 8+4
DCN6: 7+5, 7+4, 7+5
DCN8: 9+3, 7+4, 9+3

(b) Link cost in kilometers
Link      Cost    Link      Cost    Link      Cost
(0, 1)    1310    (0, 2)    760     (0, 3)    390
(0, 6)    740     (1, 2)    550     (1, 4)    390
(1, 7)    450     (2, 3)    660     (2, 4)    210
(2, 5)    390     (3, 6)    340     (3, 7)    1090
(3, 9)    660     (4, 5)    220     (4, 7)    300
(4, 10)   930     (5, 6)    730     (5, 7)    400
(5, 8)    350     (6, 8)    565     (6, 9)    320
(7, 8)    600     (7, 10)   820     (8, 9)    730
(8, 10)   320     (9, 10)   820

(c) Simulation parameters.
B = 16000, α = 0.01, β = 1000, S = {s1, s2, s3}, P_s1 = 200, P_s2 = 250, P_s3 = 300, J = 7

(d) Service demands at each node and routing in the failure-free scenario. In the paths, a →(n) b denotes the arrow from a to b with the number n shown above it.
Node (u)          d_u^s1 d_u^s2 d_u^s3   Service paths in the failure-free scenario
0: Copenhagen     1 4 1    s1: 6 →(1) 0;  s2: 6 →(2) 0, 4 →(2) 2 →(2) 0;  s3: 6 →(1) 0
1: London         1 2 2    s1: 4 →(1) 1;  s2: 4 →(2) 1;  s3: 4 →(2) 1
2: Amsterdam      2 1 0    s1: 4 →(2) 2;  s2: 4 →(1) 2
3: Berlin         1 3 1    s1: 6 →(1) 3;  s2: 6 →(3) 3;  s3: 6 →(1) 3
4: Brussels       2 1 2    s1: 4 →(2) 4;  s2: 4 →(1) 4;  s3: 4 →(2) 4
5: Luxembourg     3 2 3    s1: 4 →(3) 5;  s2: 4 →(1) 5, 8 →(1) 5;  s3: 8 →(3) 5
6: Prague         4 1 3    s1: 6 →(4) 6;  s2: 6 →(1) 6;  s3: 6 →(3) 6
7: Paris          2 1 5    s1: 8 →(2) 7;  s2: 4 →(1) 7;  s3: 4 →(4) 7, 8 →(1) 7
8: Zürich         3 0 4    s1: 8 →(3) 8;  s3: 8 →(4) 8
9: Vienna         5 3 2    s1: 6 →(1) 9, 8 →(4) 9;  s2: 6 →(1) 9, 8 →(2) 9;  s3: 6 →(2) 9
10: Milan         0 4 1    s2: 8 →(4) 10;  s3: 8 →(1) 10

(e) p-cycle placement.
0-2-1-7-5-4-10-8-6-9-3-0
0-2-1-7-10-4-5-8-6-9-3-0
1-4-2-5-8-6-3-9-10-7-1

(f) Protection segments against a service failure.
s1 fails at DCN4: 6 →(3) 5 →(3) 4;  6 →(2) 8 →(2) 5 →(2) 4;  8 →(3) 5 →(3) 4
s2 fails at DCN4: 6 →(3) 5 →(3) 4;  6 →(1) 8 →(1) 5 →(1) 4;  8 →(4) 5 →(4) 4
s3 fails at DCN4: 6 →(3) 5 →(3) 4;  6 →(2) 8 →(2) 5 →(2) 4;  8 →(3) 5 →(3) 4
s1 fails at DCN6: 4 →(3) 5 →(3) 6;  4 →(1) 5 →(1) 8 →(1) 6;  8 →(3) 6
s2 fails at DCN6: 4 →(3) 5 →(3) 6;  8 →(4) 6
s3 fails at DCN6: 4 →(3) 5 →(3) 6;  4 →(1) 5 →(1) 8 →(1) 6;  8 →(3) 6
s1 fails at DCN8: 4 →(4) 5 →(4) 8;  6 →(1) 5 →(1) 8;  6 →(4) 8
s2 fails at DCN8: 4 →(3) 5 →(3) 8;  6 →(4) 8
s3 fails at DCN8: 4 →(4) 5 →(4) 8;  6 →(1) 5 →(1) 8;  6 →(4) 8
Fig. 5. ILP based optimal solution for the joint design with a total network cost of 120805.
(a) DCN placement and service routing. DCN capacity labels, each written as x+y:
DCN3: 11+13, 11+11, 7+17
DCN5: 13+11, 11+11, 17+7

(b) Service protection and the total network cost.
p-cycles (J=10):
0-1-7-5-2-4-10-8-9-6-3-0
0-3-2-5-7-4-10-8-9-6-0
0-1-7-5-2-4-10-8-9-6-3-0
0-2-5-4-1-7-8-10-9-6-3-0
2-4-10-8-7-5-2
0-3-2-5-4-7-8-9-6-0
0-3-6-0
DCN protection segment: 3 ↔ 2 ↔ 5 (with the label 1 above each arrow)
Total cost: 141260
Gap to optimal in Fig. 5: 16.93%

Fig. 6. DPP heuristic solution for the COST 239 network with a total network cost of 141260 (16.93% above the optimal solution in Fig. 5).
V. NUMERICAL RESULTS
In this section, we carry out numerical experiments to check the proposed ILP and the heuristic DPP. For simplicity, we assume that the basic investment costs $B_u$ take the same value $B$ for all nodes $u \in V$, although they may be different in practical networks. Since the ILP approach is not scalable, we can optimally solve the ILP only in small-size networks. This provides a benchmark to gauge the DPP performance. For large-size network scenarios, we focus on checking the feasibility of the DPP solutions.
A. ILP Based Optimal Joint Design in COST 239
CPLEX 11.0 is adopted to solve the ILP in (1)-(15). We consider the typical pan-European COST 239 network with 11 nodes and 26 links as shown in Fig. 5a. Fig. 5b defines the distance-related link cost in kilometers. In the simulation, we set $B = 16000$, $\alpha = 0.01$ and $\beta = 1000$. Three types of services $\{s_1, s_2, s_3\}$ are assumed in the network. The per-unit-capacity DCN costs for $\{s_1, s_2, s_3\}$ are assumed to differ from each other, with $\{P_{s_1}, P_{s_2}, P_{s_3}\} = \{200, 250, 300\}$. Besides, the maximum allowed number of p-cycles is set to
$J = 7$. Fig. 5c summarizes the simulation parameters. Service demands $d_u^s$ at each node are listed in the first two columns in Fig. 5d.
The optimal DCN placement and the capacity configuration of each service are shown in Fig. 5a. DCNs are placed at nodes 4, 6 and 8. The capacity of the services at each DCN is $\{c_u^{s_1}, c_u^{s_2}, c_u^{s_3}\}$. Each term $c_u^s$ is expressed as $x+y$, where $x$ denotes the service working capacity in the failure-free scenario and $y$ denotes the capacity of service replicas. The service paths in the failure-free scenario are listed in the last three columns in Fig. 5d, with a number above each arrow indicating the traffic load on the link. Besides, Fig. 5e gives the p-cycles against a link failure, and Fig. 5f lists the preconfigured protection segments against a service failure. The total network cost is 120805.
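The x+y capacity labels can be checked against the service demands with a short Python sketch. The demands are those of Fig. 5d and the labels are those of Fig. 5a. Assigning the three label rows to DCN4, DCN6 and DCN8 in that order is an assumption of this sketch, although both checks give the same result for any assignment that keeps each row intact. The function name `replicas_cover` is hypothetical.

```python
# Demands (s1, s2, s3) at nodes 0..10, as listed in Fig. 5d.
demands = [(1, 4, 1), (1, 2, 2), (2, 1, 0), (1, 3, 1), (2, 1, 2), (3, 2, 3),
           (4, 1, 3), (2, 1, 5), (3, 0, 4), (5, 3, 2), (0, 4, 1)]

# Capacity labels (working x, replica y) per service, as given in Fig. 5a.
# The mapping of rows to DCN4, DCN6 and DCN8 is an assumption.
capacity = {
    4: [(8, 4), (8, 3), (8, 4)],
    6: [(7, 5), (7, 4), (7, 5)],
    8: [(9, 3), (7, 4), (9, 3)],
}

# Total demand and total working capacity per service.
total_demand = [sum(row[s] for row in demands) for s in range(3)]
total_working = [sum(cap[s][0] for cap in capacity.values()) for s in range(3)]

def replicas_cover(failed):
    """True if, for every service, the replica capacity at the other DCNs
    is at least the working capacity lost at the failed DCN."""
    return all(
        sum(capacity[d][s][1] for d in capacity if d != failed)
        >= capacity[failed][s][0]
        for s in range(3)
    )
```

With these numbers the working capacities add up exactly to the demands of each service, and each single-DCN service failure is covered by the replicas at the other two DCNs.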
B. DPP Design in Small-Size Networks
Fig. 6 gives the heuristic solution generated by the DPP
algorithm in Fig. 2 for COST 239, with the same set of
system parameters as in Fig. 5c and service demands as in
Fig. 5d. In particular, Fig. 6a shows DCN placement and
service routing, where the shortest-path based service paths
are drawn in the network topology using the bold arrows.
Fig. 6b shows the required p-cycles against a link failure and
the preconfigured DCN protection segment against a service
failure, as well as the total network cost and the gap to
the ILP based optimal solution in Fig. 5. The small gap-to-optimality of 16.93% confirms the superior performance of DPP. In addition to COST 239, we have carried out many other experiments with similar network sizes. The results show that the gap-to-optimality of DPP remains stable at almost the same level.
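For reference, the 16.93% figure follows directly from the two total costs reported in Fig. 5 and Fig. 6:

```latex
\[
\text{gap} = \frac{141260 - 120805}{120805} = \frac{20455}{120805} \approx 0.1693 = 16.93\%.
\]
```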
Despite the excellent performance of DPP, some major differences between the solutions in Fig. 5 and Fig. 6 can be observed: 1) The DCN placement schemes are different. There are 3 DCNs in Fig. 5a but 2 in Fig. 6a. In addition, both the DCN locations and capacities are different; 2) Due to the differences in DCN placement and service routing, p-cycle placement is also quite different in the two solutions. In particular, the optimal solution requires 3 p-cycles as in Fig. 5e, whereas the DPP solution requires 7 as in Fig. 6b; 3) Service transmissions are based on shortest paths in DPP but may take detours in the optimal solution due to the more intelligent joint optimization; and 4) In both failure-free and service failure scenarios, the anycast service mode is better supported in the optimal solution. In contrast, DPP always serves the demands using the closest DCN in the failure-free scenario, or resorts to the closest DCN replica for service protection upon a service failure. Take the circled part in Fig. 5d as an example for the failure-free scenario. In Fig. 5d, the demands at node 7 on service $s_3$ are served simultaneously by nodes 4 and 8 (i.e., anycast), with a distance-related link cost of 300 for $4 \to 7$ and 600 for $8 \to 7$ (i.e., detour). In Fig. 6a, the demands at node 7 are served by the closest DCN$_5$. Moreover, it is obvious from Fig. 5f that the anycast mode is well supported in the optimal solution upon any service failure, with multiple DCN replicas providing fast service protection at the same time. In contrast, DPP only uses the closest DCN replica (see Step 2.1.1 in Fig. 2).
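The closest-DCN rule can be reproduced with a small Python sketch that uses the link costs of Fig. 5b and the demands of Fig. 5d. It is an illustration, not the DPP implementation: it assumes the DCN locations 3 and 5 of Fig. 6a as given, uses Dijkstra's algorithm for the shortest paths, and lets `min` break any distance tie by list order. All function names are hypothetical.

```python
import heapq

# Link costs in kilometers, as listed in Fig. 5b.
LINKS = {
    (0, 1): 1310, (0, 2): 760, (0, 3): 390, (0, 6): 740, (1, 2): 550,
    (1, 4): 390, (1, 7): 450, (2, 3): 660, (2, 4): 210, (2, 5): 390,
    (3, 6): 340, (3, 7): 1090, (3, 9): 660, (4, 5): 220, (4, 7): 300,
    (4, 10): 930, (5, 6): 730, (5, 7): 400, (5, 8): 350, (6, 8): 565,
    (6, 9): 320, (7, 8): 600, (7, 10): 820, (8, 9): 730, (8, 10): 320,
    (9, 10): 820,
}
# Demands (s1, s2, s3) at nodes 0..10, as listed in Fig. 5d.
DEMANDS = [(1, 4, 1), (1, 2, 2), (2, 1, 0), (1, 3, 1), (2, 1, 2), (3, 2, 3),
           (4, 1, 3), (2, 1, 5), (3, 0, 4), (5, 3, 2), (0, 4, 1)]

def shortest_costs(source):
    """Dijkstra from `source` over the undirected link set."""
    adj = {}
    for (u, v), c in LINKS.items():
        adj.setdefault(u, []).append((v, c))
        adj.setdefault(v, []).append((u, c))
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, c in adj[u]:
            if d + c < dist.get(v, float("inf")):
                dist[v] = d + c
                heapq.heappush(heap, (d + c, v))
    return dist

def closest_dcn_assignment(dcns):
    """Map every node to the DCN with the smallest shortest-path cost."""
    dist = {d: shortest_costs(d) for d in dcns}
    return {u: min(dcns, key=lambda d: dist[d][u]) for u in range(len(DEMANDS))}

def working_capacity(assignment):
    """Sum the demands (s1, s2, s3) of the nodes served by each DCN."""
    cap = {}
    for u, d in assignment.items():
        prev = cap.get(d, (0, 0, 0))
        cap[d] = tuple(p + q for p, q in zip(prev, DEMANDS[u]))
    return cap

assignment = closest_dcn_assignment([3, 5])
capacity = working_capacity(assignment)
```

Under these assumptions node 7 is assigned to DCN5, and the summed demands per DCN equal the working (x) parts of the capacity labels in Fig. 6a.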
C. DPP Design in Large-Size Networks
We now apply DPP to the network topology in Fig. 7a with
30 nodes and 62 links, which is taken from [14]. Simulation parameters as shown in Fig. 7b are almost the same as those in Fig. 5 and Fig. 6, but the link and path costs are based on a constant cost of 600 for each link. Service demands at each
node are listed in Fig. 7c.
Fig. 7a shows the DCN placement and service routing
result. In Fig. 7a, each of the five shadowed zones covers
the nodes with demands served by the corresponding DCN in
the zone, and the bold arrows indicate the service transmission
paths in the failure-free scenario. Fig. 7d lists the capacity con-
figuration and the preconfigured protection segments (against
a service failure) for each DCN. As shown in Fig. 7e, 7 p-
cycles are required for fast protection against a link failure.
The total network cost is 278850.
D. Relationship Between B and the Number of DCNs
Fig. 8 shows how the number of DCNs placed in the
network changes with the basic investment B. In particular,
Fig. 8a is obtained using the optimal ILP based on the COST
239 network with the same sets of simulation parameters and
service demands as in Fig. 5. Fig. 8b is obtained using DPP
based on the same network and simulation settings as in Fig.
7. Obviously, the number of DCNs decreases as $B$ increases, because constructing a DCN becomes more expensive and thus fewer DCNs should be deployed. Fig. 8 confirms this trend in both the optimal ILP and the DPP solutions.
VI. CONCLUSION
We studied the joint design of DCN (Data Center Network)
placement, service routing and service protection for provid-
ing cloud services in all-optical WDM (Wavelength Division
Multiplexing) networks. An optimal ILP (Integer Linear Pro-
gram) was first formulated to achieve joint optimization for
minimizing the total network cost. Then, a heuristic algorithm
DPP (DCN Placement and Service Protection) was proposed
to make the design more scalable in large-size networks.
The solutions generated by our algorithms can work under the anycast service mode to satisfy all service demands in the network, while minimizing the network cost by balancing the costs of DCNs and optical wavelengths. With preconfigured protection segments and p-cycles adopted in the proposed scheme, fast service protection can be achieved against a service or link failure. Numerical results validated the correctness of the ILP and demonstrated the superior performance of the proposed DPP heuristic.
ACKNOWLEDGMENT
This work is supported by the Major State Basic Re-
search Program of China (973 project No. 2013CB329301 and
2010CB327806), the Natural Science Fund of China (NSFC
project No. 61372085, 61032003 and 61271165), the Research
Fund for the Doctoral Program of Higher Education of China
(RFDP project No. 20120185110025 and 20120185110030),
and the Fundamental Research Funds for the Central Uni-
versities. It is also supported by Tianjin Key Laboratory of
Cognitive Computing and Application, School of Computer Science and Technology, Tianjin University, Tianjin, P. R. China.

(a) DCN placement and service routing.

(b) Simulation parameters.
B = 16000, α = 0.01, β = 1000, S = {s1, s2, s3}, P_s1 = 200, P_s2 = 250, P_s3 = 300, J = 20, C_uv = 600 for every link (u, v)

(c) Service demands at each node (columns s1 s2 s3).
Node (u)  Demands    Node (u)  Demands    Node (u)  Demands
0         1 2 1      10        1 1 1      20        1 2 1
1         2 1 1      11        2 0 2      21        1 0 2
2         1 0 1      12        1 1 1      22        3 1 1
3         1 1 1      13        2 2 1      23        1 0 1
4         1 2 1      14        0 1 1      24        0 3 1
5         1 0 2      15        1 1 2      25        2 1 1
6         2 1 1      16        1 2 1      26        0 2 2
7         3 1 1      17        1 0 1      27        1 1 1
8         1 1 1      18        0 1 1      28        1 0 1
9         1 1 0      19        2 1 1      29        1 1 2

(d) DCN placement and preconfigured protection segments.
DCN: DCN0; DCN25; DCN15; DCN21; DCN5
Working capacity (s1 s2 s3): 9 9 6; 6 5 7; 6 9 8; 7 5 8; 8 3 6
Replica capacity (s1 s2 s3): 0 0 0; 0 0 0; 9 9 8; 0 0 0; 8 5 7
Protection segment: 5-3-0; 15-14-5; 5-14-15; 5-7-17-21; 15-13-25

(e) 7 p-cycles required for fast link protection.
0-1-2-6-19-20-29-21-18-17-22-23-15-24-25-13-11-10-12-14-16-7-5-4-3-0
0-1-2-6-19-20-29-21-18-17-22-23-15-24-25-13-11-10-9-14-16-7-8-5-4-3-0
0-1-2-5-14-8-7-16-23-15-24-26-27-25-13-11-10-12-4-3-0
11-13-15-23-22-21-20-29-28-27-26-24-25-11
0-1-3-4-0
0-3-4-0
0-3-4-0

Fig. 7. DPP design in large-size network taken from [14] with 30 nodes and 62 links. The total network cost is 278850.

(a) Optimal ILP solution for COST 239. (b) DPP solution for the network in Fig. 7a. Axis label: Number of DCNs (ticks 0 to 5).
Fig. 8. The relationship between B and the number of DCNs in the optimal ILP (for COST 239) and the DPP (for the network in Fig. 7a) solutions.
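APPENDIX
PROOF OF THEOREM 1
Theorem 1: In Service Matching$^*(L, v, C_v)$, $A_{\min}$ always keeps increasing with $|C_v|$ for $|C_v| > k_{\min}$.
Proof: In Service Matching$^*(L, v, C_v)$, assume that a node $u$ is added to $C_v$ at a particular stage and $|C_v| = k > k_{\min}$, but $C$ and $D$ are not updated yet. For ease of analysis, at this point we redefine some parameters as follows, where $\tilde{C}_k$ and $\tilde{D}_k$ denote the cost and demand increments contributed by node $u$.
\begin{align}
A_{\min}^{k} &= A_{\min}; \tag{18}\\
C_k &= C; \tag{19}\\
\tilde{C}_k &= \sum_{s \in S} d_u^s (P_s + P_{uv}); \tag{20}\\
D_k &= D; \tag{21}\\
\tilde{D}_k &= \sum_{s \in S} d_u^s. \tag{22}
\end{align}
Then, based on (17) we have
\begin{equation}
A_{\min}^{k} = \frac{C_k + \tilde{C}_k}{D_k + \tilde{D}_k}. \tag{23}
\end{equation}
When $C_k$ and $D_k$ are updated, we have
\begin{align}
C_{k+1} &= C_k + \tilde{C}_k; \tag{24}\\
D_{k+1} &= D_k + \tilde{D}_k. \tag{25}
\end{align}
We now prove Theorem 1 by induction. Because the subroutine Service Matching$(L, v, C_v)$ breaks and returns with $|C_v| = k_{\min}$, we have $A_{\min}^{k_{\min}} < A_{\min}^{k_{\min}+1}$ (see the circled parts in Fig. 2).
Assume that $A_{\min}^{k} < A_{\min}^{k+1}$ holds for a specific $k \geq k_{\min}$. According to (23), we have
\begin{equation}
\frac{C_k + \tilde{C}_k}{D_k + \tilde{D}_k} < \frac{C_{k+1} + \tilde{C}_{k+1}}{D_{k+1} + \tilde{D}_{k+1}}. \tag{26}
\end{equation}
From (24)-(26), we get
\begin{equation}
\tilde{C}_{k+1} D_{k+1} - C_{k+1} \tilde{D}_{k+1} > 0. \tag{27}
\end{equation}
On the other hand, since $A_{\min}^{k+1}$ is minimized by adding a specific node $u$ to $C_v$, which makes $|C_v| = k+1$, we have
\begin{equation}
\frac{C_{k+2}}{D_{k+2}} = \frac{C_{k+1} + \tilde{C}_{k+1}}{D_{k+1} + \tilde{D}_{k+1}} < \frac{C_{k+1} + \tilde{C}_{k+2}}{D_{k+1} + \tilde{D}_{k+2}}, \tag{28}
\end{equation}
where $\tilde{C}_{k+2}$ and $\tilde{D}_{k+2}$ match another node (i.e., the $(k+2)$th node) added to $C_v$ (after node $u$). In other words, the inequality in (28) must hold because that is why node $u$ is chosen as the $(k+1)$th node added to $C_v$. From (27)-(28), we get
\begin{equation}
\tilde{C}_{k+2} D_{k+2} > C_{k+2} \tilde{D}_{k+2}, \tag{29}
\end{equation}
and thus
\begin{equation}
\frac{C_{k+2}}{D_{k+2}} < \frac{C_{k+2} + \tilde{C}_{k+2}}{D_{k+2} + \tilde{D}_{k+2}}, \tag{30}
\end{equation}
which is equivalent to
\begin{equation}
A_{\min}^{k+1} < A_{\min}^{k+2}. \tag{31}
\end{equation}
This proves Theorem 1 by closing the induction.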
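The step from (27) and (28) to (29) can be spelled out as follows. This is a sketch in notation assumed here: $\tilde{C}_{k+2}$ and $\tilde{D}_{k+2}$ stand for the cost and demand increments of the $(k+2)$th node, all demand quantities are taken to be positive, and the shorthand $a$, $b$, $m$, $r$ is introduced only for this derivation.

```latex
\begin{align*}
&\text{Let } a = \frac{C_{k+1}}{D_{k+1}},\quad b = \frac{C_{k+2}}{D_{k+2}},\quad
 m = \frac{C_{k+1} + \tilde{C}_{k+2}}{D_{k+1} + \tilde{D}_{k+2}},\quad
 r = \frac{\tilde{C}_{k+2}}{\tilde{D}_{k+2}}. \\
&\text{Induction hypothesis, equivalently (27): } a < b. \qquad \text{Inequality (28): } b < m. \\
&a < m \;\Longrightarrow\; C_{k+1}\tilde{D}_{k+2} < D_{k+1}\tilde{C}_{k+2} \;\Longrightarrow\; a < r. \\
&m \text{ is the mediant of } a \text{ and } r, \text{ so } a < m < r. \\
&\text{Hence } b < m < r, \text{ i.e. } \tilde{C}_{k+2} D_{k+2} > C_{k+2}\tilde{D}_{k+2}, \text{ which is (29).}
\end{align*}
```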
REFERENCES
[1] K. Chen, C. Guo, H. Wu, J. Yuan, Z. Feng, Y. Chen, S. Lu, and W.
Wu, “DAC: generic and automatic address configuration for data center
networks,” IEEE/ACM Trans. Networking, vol. 20, no. 1, pp. 84–99,
2012.
[2] M. Bari, R. Boutaba, R. Esteves, L. Granville, M. Podlesny, M. Rabbani,
Q. Zhang, and M. Zhani, “Data center network virtualization: a survey,”
IEEE Commun. Surveys & Tutorials, pre-published, pp. 1–20, 2012.
[3] C. Lam, H. Liu, B. Koley, X. Zhao, V. Kamalov, and V. Gill, “Fiber optic
communication technologies: what’s needed for datacenter network
operations,” IEEE Commun. Mag., vol. 48, no. 7, pp. 32–39, July 2010.
[4] Z. Zheng, T. Zhou, M. Lyu, and I. King, “Component ranking for
fault-tolerant cloud applications,” IEEE Trans. Services Computing,
pre-published, 2011.
[5] Y. Simmhan, C. Ingen, G. Subramanian, and J. Li, “Bridging the gap
between desktop and the cloud for escience applications,” in Proc. 2010
IEEE International Conference on Cloud Computing, pp. 474–481.
[6] P. Wright, T. Harmer, J. Hawkins, and Y. L. Sun, “A commodity-focused
multi-cloud marketplace exemplar application,” in Proc. 2011 IEEE
International Conference on Cloud Computing, pp. 590–597.
[7] J. Abley, A. Canada, and K. Lindqvist, RFC 4786: Operation of Anycast
Services, Dec. 2006.
[8] M. F. Habib, M. Tornatore, M. D. Leenheer, F. Dikbiyik, and B.
Mukherjee, “Design of disaster-resilient optical datacenter networks,”
IEEE/OSA J. Lightw. Technol., vol. 30, no. 16, pp. 2563–2573, Aug.
2012.
[9] L. Guo, J. Cao, H. Yu, and L. Li, “Path-based routing provisioning with
mixed shared protection in WDM mesh networks,” IEEE/OSA J. Lightw.
Technol., vol. 24, no. 3, pp. 1129–1141, 2006.
[10] L. Guo, “LSSP: a novel local segment-shared protection for multi-
domain optical mesh networks,” Computer Commun., vol. 30, no. 8,
pp. 1794–1801, 2007.
[11] B. Wu, K. L. Yeung, and P.-H. Ho, “Monitoring cycle design for fast
link failure localization in all-optical networks,” IEEE/OSA J. Lightw.
Technol., vol. 27, no. 10, pp. 1392–1401, May 2009.
[12] B. Wu, P.-H. Ho, K. L. Yeung, J. Tapolcai, and H. T. Mouftah, “Optical
layer monitoring schemes for fast link failure localization in all-optical
networks,” IEEE Commun. Surveys and Tutorials, vol. 13, no. 1, pp.
114–125, 2011.
[13] B. Wu, P.-H. Ho, and K. L. Yeung, “Monitoring trail: on fast link failure
localization in WDM mesh networks,” IEEE/OSA J. Lightw. Technol.,
vol. 27, no. 18, pp. 4175–4185, Sept. 2009.
[14] B. Wu, K. L. Yeung, and P.-H. Ho, “ILP formulations for p-cycle design
without candidate cycle enumeration,” IEEE/ACM Trans. Networking,
vol. 18, no. 1, pp. 284–295, Feb. 2010.
[15] B. Wu, P.-H. Ho, K. L. Yeung, J. Tapolcai, and H. T. Mouftah, “CFP:
cooperative fast protection,” IEEE/OSA J. Lightw. Technol., vol. 28, no.
7, pp. 1102–1113, Apr. 2010.
[16] L. Guo and L. Li, “A novel survivable routing algorithm with partial
shared-risk link groups (SRLG)-disjoint protection based on differenti-
ated reliability constraints in WDM optical mesh networks,” IEEE/OSA
J. Lightw. Technol., vol. 25, no. 6, pp. 1410–1415, Jun. 2007.
[17] S. S. Ahuja, S. Ramasubramanian, and M. Krunz, “SRLG failure
localization in optical networks,” IEEE/ACM Trans. Networking, vol.
19, no. 4, pp. 989–999, 2011.
[18] J. Liu, X. Jiang, H. Nishiyama, and N. Kato, “Reliability assessment
for wireless mesh networks under probabilistic region failure model,”
IEEE Trans. Vehicular Technol., vol. 60, no. 5, pp. 2253–2264, 2011.
[19] X. Wang, X. Jiang, and A. Pattavina, “Assessing network vulnerability
under probabilistic region failure model,” in Proc. 2011 IEEE Inter-
national Conference on High Performance Switching and Routing, pp.
164–170.
[20] M. S. Kiaei, C. Assi, and B. Jaumard, “A survey on the p-cycle
protection method,” IEEE Commun. Surveys and Tutorials, vol. 11, no.
3, pp. 53–70, 2009.
[21] K. L. Yeung and T.-S. P. Yum, “Node placement optimization in
ShuffleNets,” IEEE/ACM Trans. Networking, vol. 6, no. 3, pp. 319–324,
1998.
[22] B. Wang, H. Xu, W. Liu, and H. Liang, “A novel node placement for
long belt coverage in wireless networks,” IEEE Trans. Computers,
pre-published, 2012.
[23] P. Cheng, C.-N. Chuah, and X. Liu, “Energy-aware node placement in
wireless sensor networks,” in Proc. 2004 IEEE GlobeCom, vol. 5, pp.
3210–3214.
[24] B. Wu, K. L. Yeung, and P.-H. Ho, “ILP formulations for non-simple
p-cycle and p-trail design in WDM mesh networks,” Elsevier Computer
Networks, vol. 54, no. 5, pp. 716–725, Apr. 2010.
[25] T. Y. Chow, F. Chudak, and A. M. Ffrench, “Fast optical layer mesh
protection using pre-cross-connected trails,” IEEE/ACM Trans. Networking,
vol. 12, no. 3, pp. 539–548, Jun. 2004.
[26] C. Barnhart, E. L. Johnson, G. L. Nemhauser, M. W. Savelsbergh, and
P. H. Vance, “Branch-and-price: column generation for solving huge
integer programs,” Operations Research, vol. 46, no. 3, pp. 316–329,
1998.
[27] J. El-Najjar, C. Assi, and B. Jaumard, “Joint routing and scheduling in
WiMAX-based mesh networks,” IEEE Trans. Wireless Commun., vol.
9, no. 7, pp. 2371–2381, 2010.
[28] B. Jaumard and H. A. Hoang, “Design and dimensioning of logical
survivable topologies against multiple failures,” IEEE/OSA J. Optical
Commun. and Networking, vol. 5, no. 1, pp. 23–36, 2013.
Jie Xiao received the B.Eng. degree in Electronic
Information Science and Technology from Central
China Normal University (Wuhan, P. R. China)
in 2010. He is now pursuing his Ph.D degree in
Computer Systems and Networking at Tianjin Uni-
versity (Tianjin, P. R. China) under the supervision
of professor Bin Wu. His research interests in-
clude computer systems and networking, optical and
wireless communications and networking, network
survivability and security issues.
Hong Wen was born in Chengdu, P. R. China. She
received the M.Sc. degree in Electrical Engineering
from Sichuan Union University of Sichuan, P. R.
China, in 1997. She pursued her Ph.D. degree in
Communication and Computer Engineering Dept. at
the Southwest Jiaotong University (Chengdu, P. R.
China). Then she worked as an associate professor
in the National Key Laboratory of Science and
Technology on Communications at UESTC, P. R.
China. From January 2008 to August 2009, she was
a visiting scholar and postdoctoral fellow in the ECE
Dept. at University of Waterloo. Now she holds the professor position at
UESTC, P. R. China. Her major interests focus on wireless communication
systems.
Bin Wu (S'04-M'07) received his Ph.D. degree
in Electrical and Electronic Engineering from The
University of Hong Kong (Pokfulam, Hong Kong)
in 2007. He worked as a postdoctoral research fellow
from 2007-2012 in the ECE Dept. at University of
Waterloo (Waterloo, Canada). He is now a professor
in the School of Computer Science and Technology
at Tianjin University (Tianjin, P. R. China). His
research interests include computer systems and
networking, optical and wireless communications and
networking, and network survivability.
Xiaohong Jiang (M'03) received his B.S., M.S. and
Ph.D. degrees all from Xidian University, Xi'an, P.
R. China. He is currently a full professor at Future
University Hakodate, Hakodate, Japan. Dr. Jiang
was an Associate professor at Tohoku University,
Japan from Feb. 2005 to Mar. 2010, an assistant
professor in Japan Advanced Institute of Science and
Technology (JAIST) from Oct. 2001 to Jan. 2005.
He was a JSPS research fellow at JAIST from Oct.
1999 to Oct. 2001, and a research associate in the
University of Edinburgh from Mar. 1999 to Oct.
1999. His research interests include computer communications and networks,
mainly on wireless networks, optical networks, etc. He has published over 190
technical papers at premium international journals and conferences, which
include over 20 papers published in IEEE journals such as IEEE/ACM
TRANSACTIONS ON NETWORKING, IEEE JOURNAL ON SELECTED AREAS
IN COMMUNICATIONS, etc. Dr. Jiang was the winner of the Best Paper
Award and Outstanding Paper Award of IEEE WCNC 2008, IEEE ICC 2005
Optical Networking Symposium, and IEEE/IEICE HPSR 2002. He is a Senior
Member of IEEE.
Pin-Han Ho received his B.Sc. and M.Sc. degrees
from the Department of Electrical and Computer Engineering,
National Taiwan University in 1993
and 1995, respectively. He started his Ph.D. studies
in 2000 at Queen's University, Kingston, Ontario,
Canada, focusing on optical communications sys-
tems, survivable networking, and QoS routing prob-
lems. He finished his Ph.D. in 2002, and joined the
Electrical and Computer Engineering Department at
the University of Waterloo as an assistant professor
in the same year, where he is currently an associate
professor. He is the author/co-author of more than 100 refereed technical
papers and book chapters, and the co-author of a book on optical networking
and survivability. He is the recipient of the Distinguished Research Excellence
Award in the ECE Department at the University of Waterloo, the Early
Researcher Award in 2005, the Best Paper Award at SPECTS'02 and the
ICC'05 Optical Networking Symposium, and the Outstanding Paper Award
in HPSR'02.
Lei Zhang (S'03-M'08) received her Ph.D. de-
gree in Computer Science from Auburn University
(Auburn, AL, USA) in 2008. She worked as an
assistant professor from 2008-2011 in the Computer
Science Dept. at Frostburg State University (Frost-
burg, MD, USA). She is now an assistant professor
in the School of Computer Science and Technology
at Tianjin University (Tianjin, P. R. China). Her re-
search interests include computer networks, wireless
communications, distributed algorithms and network
security.