All content in this area was uploaded by Masaru Katayama on Feb 08, 2017
Dynamic Path Bandwidth Allocation for 1000x10-Scale
Optical Layer-2 Switch Network based on Hierarchical Timeslot
Allocation Algorithm and Timeslot Converter
Kyota HATTORI, Masahiro NAKAGAWA, Naoki KIMISHIMA, Masaru KATAYAMA, and Akira MISAWA
NTT Network Service Systems Laboratories, NTT Corporation, hattori.kyota@lab.ntt.co.jp
Abstract We are developing an optical layer-2 switch network that achieves dynamic path bandwidth
allocation (DPBA) for efficient aggregation in metro NWs. We show experimental results for the DPBA
cycle under traffic variations on a 1000x10-scale NW.
Introduction
Network (NW) traffic is increasing exponentially,
and it is therefore becoming increasingly
necessary to develop NWs that are cost-effective
in terms of equipment cost and power
consumption. To address this trend, we are
developing an optical layer-2 switch network [1]
(OL2SW-NW) that can efficiently aggregate traffic
in a large-scale metro NW. The OL2SW-NW is
based on a WDM/TDM ring NW with L WDM
channels. This allows the bandwidth of the
wavelengths in the NW to be shared among the
ground paths between 1000 aggregation switches
(SWs) on the access NW and 10 IP routers (RRs)
on the core NW. The OL2SW-NW shares NW
bandwidth effectively, as a TDM-PON [2] does, by
allocating timeslots (TSs) to each ground path
and performing ADD/DROP of data according to
the allocated TSs (Fig. 1). The OL2SW-NW can
dynamically change the bandwidth of each ground
path by changing the number of allocated TSs at
fixed intervals according to the amount of traffic
on that path. To achieve this bandwidth
allocation, traffic information on all ground paths
is collected at a central TS scheduler (SCH) for
TS allocation (TSA). Consequently, collecting
traffic information from all ground paths and
processing the TSA at the SCH become
bottlenecks in large-scale NWs.
In this paper, we show experimental results of
dynamic path bandwidth allocation (DPBA) in a
1000x10-scale NW using a novel TSA algorithm
and TS converters (TSCs).
Architecture of OL2SW-NW and NW model
The OL2SW-NW achieves DPBA for ground
paths between 1000 SWs and 10 RRs by
deploying a SCH (Fig. 2). We call each node
that constitutes the OL2SW-NW an OL2SW; an
OL2SW connected to SWs is called an A-OL2SW,
and one connected to RRs is called a C-OL2SW.
We assume that the RRs balance load across
the 10 nodes by applying virtualization
technology [3]. Therefore, a SW communicates
with a maximum of 10 RRs. Each OL2SW has a
TSC that converts variable-length Ethernet
frames (Ethers) from SWs and RRs into burst
signals (Bursts) according to the allocated TSs.
The SCH executes the TSA for all paths among
the TSCs. The NW model used to estimate the
number of paths and the number of wavelengths
required for the TSA computation is as follows.
First, we assume that a SW has one 10G
Ethernet interface (10GE) that accommodates
1000 users, and that the average traffic from a
user is 1 Mbps, i.e., 1 Gbps per SW. To
accommodate 1000 SWs, the required NW
capacity is therefore 1 Tbps, and the OL2SW-NW
needs 100 (: L) wavelengths if the capacity of a
wavelength is 10 Gbps. We also assume that
each RR aggregates the traffic from the 1000
SWs up to a maximum of 100 Gbps, which means
that each RR requires ten 10GEs. A TSC is
connected one-to-one to each 10GE of a SW or
RR and transmits Bursts from a 10G burst
interface (10GBT) according to the allocated
TSs. To reduce the computation time of the TSA
at the SCH, we assume that a TSC in an
A-OL2SW does not communicate simultaneously
with more than one TSC at the same C-OL2SW,
and that a TSC in a C-OL2SW accommodates
100 paths at maximum. In addition, there is no
loopback communication from one SW to another
or from one RR to another. In this case, the SCH
has to compute the TSA for 20k paths in total
for the upstream and downstream directions.
Fig. 1: Concept of optical Layer-2 SW network
Fig. 2: Logical network of optical Layer-2 SW network
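The arithmetic of this NW model can be checked with a short sketch. The constants are the figures quoted in the text; the function and key names are illustrative only and do not come from the paper.

```python
import math

# Figures quoted in the NW model; all identifiers are illustrative.
NUM_SW = 1000          # aggregation switches on the access NW
NUM_RR = 10            # IP routers on the core NW
USERS_PER_SW = 1000    # users accommodated by one 10GE
AVG_USER_MBPS = 1      # average traffic per user
WAVELENGTH_GBPS = 10   # capacity of one wavelength
PORT_GBPS = 10         # capacity of one 10GE
RR_MAX_GBPS = 100      # maximum traffic aggregated by one RR


def network_model():
    """Derive wavelength, TSC, and path counts from the stated assumptions."""
    sw_gbps = USERS_PER_SW * AVG_USER_MBPS / 1000          # 1 Gbps per SW
    total_gbps = NUM_SW * sw_gbps                          # 1 Tbps in total
    wavelengths = math.ceil(total_gbps / WAVELENGTH_GBPS)  # L
    tscs_per_rr = math.ceil(RR_MAX_GBPS / PORT_GBPS)       # one TSC per 10GE
    c_tscs = NUM_RR * tscs_per_rr                          # TSCs on the core side
    paths_one_way = NUM_SW * NUM_RR                        # every SW to every RR
    return {
        "wavelengths": wavelengths,
        "tscs_per_rr": tscs_per_rr,
        "c_tscs": c_tscs,
        "paths_per_c_tsc": paths_one_way // c_tscs,
        "paths_total": 2 * paths_one_way,                  # upstream + downstream
    }


if __name__ == "__main__":
    print(network_model())
```

Under these assumptions the sketch gives 100 wavelengths, 100 paths per core-side TSC, and 20k paths in total.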
[Fig. 1 (diagram): SWs #1 to #1000 on the access NW and IP routers #1 to #10 on the core NW are connected through OL2SWs #1 to #1010 on a WDM/TDM fiber ring with L wavelengths; each path number (#1 to #20000, upstream and downstream) is mapped to allocated timeslots (TS1 to TS100) on a wavelength.]
[Fig. 2 (diagram): C-plane: TS scheduler performing upstream and downstream TSA for 20k paths. D-plane: A-OL2SWs with TSCs #1 to #1000, one per SW (10GE, 1 Gbps at 1000 users), and C-OL2SWs with TSCs #1001 to #1100, ten per IP router (10GE x 10 IFs); TSCs transmit on 10GBT; 100 paths x 100 TSCs => 10k paths.]
Th.2.E.2.pdf
Fig. 3: Proposed DPBA based on TSC and SCH
Fig. 4: Schematic of Hierarchical TSA algorithm
Tab. 1: Formats of QL and TS-table
Operation of DPBA for OL2SW-NW
At every fixed interval T, all TSCs are allocated
TSs for all destinations according to the amount
of traffic, so that the allocation follows traffic
variations. A TSC prepares a virtual queue (VQ)
for each destination to achieve burst transmission
per destination; the number of VQs is M (10 for a
TSC in an A-OL2SW, 100 for a TSC in a
C-OL2SW). When a TSC receives Ethers from a
SW or RR through its 10GE, it identifies the
destination from the MAC address of each Ether
and inserts the Ether into the VQ for that
destination. The TSC then measures the queue
lengths (QLs) of the VQs and notifies the SCH of
them. The SCH calculates the number of TSs to
assign to each TSC based on the QLs and sends
the TS information (TS-table) to the TSCs. All
TSCs can then change their allocated TSs at the
same time by updating the TS-table at every
cycle T.
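The per-destination queueing step can be illustrated with a minimal sketch. It assumes a static lookup from destination MAC address to destination index; the class name, the lookup table, and the byte-based QL measure are hypothetical and are not taken from the prototype.

```python
from collections import deque


class TimeslotConverterSketch:
    """Toy model of a TSC: one virtual queue (VQ) per destination."""

    def __init__(self, mac_to_dest):
        # mac_to_dest is an assumed static table: destination MAC -> destination id.
        self.mac_to_dest = dict(mac_to_dest)
        self.vqs = {dest: deque() for dest in set(self.mac_to_dest.values())}

    def enqueue(self, dst_mac, frame):
        """Classify a frame by destination MAC and append it to the matching VQ."""
        dest = self.mac_to_dest[dst_mac]
        self.vqs[dest].append(frame)

    def queue_lengths(self):
        """Return the QL report: queued bytes per destination."""
        return {dest: sum(len(f) for f in q) for dest, q in self.vqs.items()}


if __name__ == "__main__":
    tsc = TimeslotConverterSketch({"00:00:00:00:00:01": 1, "00:00:00:00:00:02": 2})
    tsc.enqueue("00:00:00:00:00:01", bytes(64))
    tsc.enqueue("00:00:00:00:00:02", bytes(1500))
    print(tsc.queue_lengths())
```

The dictionary returned by `queue_lengths()` plays the role of the QL report that a TSC would send to the SCH in each cycle.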
Requirements of DPBA cycle
The traffic variation in the metro NW considered
in this paper is assumed to be a change in traffic
rate of up to 100 Mbps over 1 sec to 1 min [4]. To
achieve DPBA, T must be set to the same order
as the traffic variation. Therefore, we assume that
a suitable value of T is 1 sec. This means that the
OL2SW-NW must complete a DPBA cycle for 20k
paths within 1 sec. Here, we define Ta as the time
from sending the QL to setting the TS-table at the
TSC (Fig. 3). Ta must be within 1 sec for all TSCs
in order to set T to 1 sec. Therefore, we evaluate
whether Ta is within 1 sec, so that DPBA can be
achieved every 1 sec for 20k paths.
Hierarchical TSA algorithm
We introduce a hierarchical TSA that divides the
large-scale NW-wide TSA into small-scale local
TSAs (Fig. 4). The small-scale TSAs are then
calculated independently, which reduces the
calculation time. First, we define a set of
adjacent OL2SWs as a “node group (N-G)” to
form the TSA hierarchy; each OL2SW belongs to
exactly one N-G. We call a pair of N-Gs an N-G
pair (GP). We then compute a TSA for the paths
in each GP (in-TSA) and a TSA for the GPs
(gp-TSA). Finally, we obtain a TS schedule by
matching the results of the in-TSA to those of
the gp-TSA. In this process, the number of
intended paths and links in each TSA is smaller
than in the original TSA, which saves
computation time. Let G be the number of N-Gs
set up among N OL2SWs. The product of the
number of intended paths and the number of
intended links is $O((N/G)^3)$ for the in-TSA and
$O(G^3)$ for the gp-TSA. The number of GPs is
$O(G^2)$, so the calculation time of this
algorithm depends on $O(G^2 (N/G)^3 + G^3)$.
Thus, G should be set to an integer around
$N^{3/4}$ to minimize the calculation time.
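The choice of G can be explored numerically. The sketch below treats the order expression $G^2 (N/G)^3 + G^3$ as an exact cost with unit constants, which is an assumption; the real constants of the in-TSA and gp-TSA are not given in the text. The function names are illustrative.

```python
def tsa_cost(n, g):
    """Order-of-magnitude cost model: G^2 in-TSAs of size (N/G)^3 plus one gp-TSA of size G^3."""
    return g ** 2 * (n / g) ** 3 + g ** 3


def best_group_count(n):
    """Integer G in [1, n] that minimizes the unit-constant cost model."""
    return min(range(1, n + 1), key=lambda g: tsa_cost(n, g))


if __name__ == "__main__":
    n = 1010
    print("N^(3/4)           :", round(n ** 0.75, 1))
    print("model minimum at G:", best_group_count(n))
    print("cost ratio flat/hierarchical at G=180:", tsa_cost(n, 1) / tsa_cost(n, 180))
```

With unit constants the model simplifies to $N^3/G + G^3$, whose minimum lies at $G = (N^3/3)^{1/4}$, the same order as $N^{3/4}$. For N = 1010, $N^{3/4}$ is about 179, in line with the value G = 180 used in the experiment; the exact optimum depends on the constants hidden in the order notation.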
Control TS for QL and TS-table
The QLs must be collected from 1010 OL2SWs,
and the TS-tables must be set to 1010 OL2SWs.
However, if all OL2SWs send their QLs at arbitrary
times, collisions between QLs from different OL2SWs
[Fig. 3 (diagram): C-plane: QLs and TS-tables are exchanged between the TSCs and the TS scheduler in 1010 C-TSs, and the scheduler processes the TSA for 20k paths; Ta is marked within each cycle T. D-plane: Ether frames arriving on the 10GE of an OL2SW are stored in VQs #1 to #M of the TSC and sent from the 10GBT in TS1 to TS100.]
[Fig. 4 (diagram): node layer (A-OL2SWs #1 to #1000, C-OL2SWs #1 to #10) and node-group layer (node groups up to #G). TSA for paths in each node-group pair (in-TSA): paths C#1 -> A#1, C#1 -> A#2, and C#1 -> A#3 placed on links over time. TSA for node-group pairs (gp-TSA): GP#1 -> GP#11 and GP#1 -> GP#13 placed on links over time.]
        Field         Size (bit)   Required number (A-OL2SW)                Required number (C-OL2SW)
QL      TSC-ID        4            1 (TSC)                                  10 (TSCs)
        VQ-ID         7            10 (VQs) x 1 (TSC)                       100 (VQs) x 10 (TSCs)
        Q-size        24           10 (VQs) x 1 (TSC)                       100 (VQs) x 10 (TSCs)
        Sum                        Less than 0.1 KB                         3.9 KB
TS      Input port    7            101 (WDM: 100 + 10GBT: 1)                110 (WDM: 100 + 10GBT: 10)
table   Output port   7            101 (Input port) x 100 (number of TSs)   110 (Input port) x 100 (number of TSs)
        Sum                        8.9 KB                                   9.9 KB
Fig. 5: (A) Experimental setup, (B) Prototype of TSC, (C) QL for 10 destinations according to allocated TS, and (D) D-plane’s signals from a TSC-D according to traffic variations at every T (: 500 ms)
[Fig. 5 (diagram and plots): (A) TSC-Ds #1 to #3 (A-OL2SW #1, A-OL2SW #2, C-OL2SW #1) and TSC-Cs #1 to #5 connected with the SCH, traffic generators (TG, Tx and Rx), MUXes, a coupler, and wavelength filters (λ1 to λ3) over 1GE, 10GE, and 10GBT links; an emulator provides the QLs from 1007 A-/C-OL2SWs; 1010 C-TSs are received and sent. Traffic conditions: traffic variation of 100 Mbps at maximum (Poisson traffic), rate range 100 Mbps to 1 Gbps. Specifications of TSC-C and TSC-D: one 10GE, one 10GBT, and four 1GE interfaces. (B) Photographs of TSC-D and TSC-C. (C) QL (MB) versus time (ms) for destinations 1 to 10 over successive cycles T. (D) 10GE traffic and 10GBT bursts per destination, showing reallocation of TSs (1 TS: 10 μs, 2 TSs: 20 μs) after a traffic increase.]
Fig. 6: (A) Delay among TSC-Cs for number of A-OL2SWs and (B) TSA time for number of paths
will occur. Therefore, every OL2SW is allocated a
control TS (C-TS) that is separate from the
data-plane TSs. The OL2SW sends and receives
the QL and TS-table by using its allocated C-TS.
To achieve this, the TSC consists of a TSC-D
(TSC for data), which converts Ethers carrying
data into Bursts, and a TSC-C (TSC for control),
which converts Ethers carrying the QL and
TS-table into Bursts. Furthermore, to reduce the
bandwidth used for the C-TSs, one TSC-C is
deployed per OL2SW. Thus, 1010 OL2SWs can
send and receive QLs and TS-tables by allocating
C-TSs to all OL2SWs.
Size of QL and TS-table
We define the QL and TS-table needed for the
DPBA in order to show the time required to
collect the QLs from, and set the TS-tables to,
1010 OL2SWs. The QL and TS-table formats are
given in Tab. 1. The QL message contains a
TSC-ID and a VQ-ID, which identify the TSC-D
and the VQ, and a Q-size, which represents the
queue length. A 4-bit field for the TSC-ID and a
7-bit field for the VQ-ID are necessary to identify
10 TSC-Ds per OL2SW and 100 VQs per TSC-D,
respectively. Assuming the QL is sent every 10
ms, the maximum QL accumulated in 10 ms for
10-Gbps traffic is 12.5 MB. Therefore, a 24-bit
field is necessary to express the Q-size in bytes.
From the above, the sizes of the QL from an
A-OL2SW and a C-OL2SW are less than 0.1 KB
and 3.9 KB, respectively. If we set the length of
the C-TS to 10 μs (: 12.5 KB), the QLs of 1010
OL2SWs can be sent in 10.1 ms. In contrast, the
TS-table must be set for every OL2SW input port
(Inport); the number of Inports is the sum of the
number of 10GBTs and the number of
wavelengths [1]. There are 101 Inports per
A-OL2SW and 110 Inports per C-OL2SW if the
OL2SW accommodates 100 (: L) wavelengths.
Therefore, a 7-bit field for the Inport is
necessary. If we assume there are 100 TSs, each
of which has to be set for every Inport and
specify an output port, the sizes of the TS-table
for an A-OL2SW and a C-OL2SW are 8.9 KB and
9.9 KB, respectively. Therefore, if we allocate
one C-TS per OL2SW, the TS-tables of 1010
OL2SWs can be sent in 10.1 ms.
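The message sizes and the C-TS budget can be reproduced with a short calculation. This is a sketch under a simple field-count model (field widths from Tab. 1, no headers or padding, 1 KB = 1000 bytes); the function names are illustrative.

```python
LINE_RATE_BPS = 10_000_000_000   # 10-Gbps burst interface
C_TS_US = 10                     # length of one control timeslot
NUM_OL2SW = 1010

TSC_ID_BITS, VQ_ID_BITS, Q_SIZE_BITS = 4, 7, 24
PORT_BITS = 7                    # input-port and output-port fields


def ql_bytes(num_tsc, vqs_per_tsc):
    """QL message: one TSC-ID per TSC, one VQ-ID and one Q-size per VQ."""
    bits = num_tsc * TSC_ID_BITS + num_tsc * vqs_per_tsc * (VQ_ID_BITS + Q_SIZE_BITS)
    return bits / 8


def ts_table_bytes(inports, num_ts=100):
    """TS-table: one input-port field per Inport plus one output-port field per Inport and TS."""
    bits = inports * PORT_BITS + inports * num_ts * PORT_BITS
    return bits / 8


# Capacity of one C-TS and time to serve every OL2SW once.
C_TS_BYTES = LINE_RATE_BPS * C_TS_US // 1_000_000 // 8
COLLECT_MS = NUM_OL2SW * C_TS_US / 1000

if __name__ == "__main__":
    print("QL A-OL2SW (B):      ", ql_bytes(1, 10))
    print("QL C-OL2SW (B):      ", ql_bytes(10, 100))
    print("TS-table A-OL2SW (B):", ts_table_bytes(101))
    print("TS-table C-OL2SW (B):", ts_table_bytes(110))
    print("C-TS capacity (B):   ", C_TS_BYTES)
    print("1010 C-TSs (ms):     ", COLLECT_MS)
```

Under this model the QL sizes (about 39 B and 3.9 KB) and the A-OL2SW TS-table (about 8.9 KB) agree with Tab. 1, and every message fits in one 12.5-KB C-TS. The C-OL2SW TS-table comes out at about 9.7 KB, slightly below the 9.9 KB listed, so the table value may include overhead that this sketch does not model.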
DPBA experiment for 1000x10-scale NW
We explain the experimental results of DPBA for
a 1000x10-scale NW. The experimental setup is
shown in Fig. 5(A). We set up three TSC-Ds, for
A-OL2SW #1, A-OL2SW #2, and C-OL2SW #1,
which were connected to TSC-C #1, #2, and #3,
respectively. The prototypes of a TSC-C and a
TSC-D are shown in Fig. 5(B). We allocated one
C-TS to each TSC-C. To emulate the QLs from
998 A-OL2SWs and 9 C-OL2SWs, we connected
a traffic generator (TG) to TSC-C #4 and
allocated 1007 C-TSs. The SCH was implemented
on a personal computer with a single-core
2.3-GHz CPU. In the proposed algorithm, G is set
to 180 based on the size of N (: 1010) to obtain
the minimum computation time. For both the
in-TSA and the gp-TSA, we use the first-fit TS
assignment algorithm.
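As an illustration of first-fit TS assignment in general, the sketch below gives each path the lowest-numbered TSs that are still free on every link the path traverses. It is a generic first-fit, not the authors' implementation; the data layout (demand per path, links per path) and all names are assumptions made for the example.

```python
def first_fit_tsa(demands, path_links, num_ts):
    """Assign timeslots first-fit.

    demands:    {path: number of TSs requested}
    path_links: {path: iterable of link ids the path traverses}
    num_ts:     number of TSs per frame on every link
    Returns {path: list of assigned TS indices}.
    """
    busy = {}       # link id -> set of TS indices already in use
    schedule = {}
    for path, need in demands.items():
        links = list(path_links[path])
        for link in links:
            busy.setdefault(link, set())
        chosen = []
        for ts in range(num_ts):
            if len(chosen) == need:
                break
            # A TS is usable only if it is free on every link of the path.
            if all(ts not in busy[link] for link in links):
                chosen.append(ts)
        if len(chosen) < need:
            raise ValueError(f"not enough free timeslots for path {path!r}")
        for link in links:
            busy[link].update(chosen)
        schedule[path] = chosen
    return schedule


if __name__ == "__main__":
    demands = {"p1": 2, "p2": 1, "p3": 2}
    path_links = {"p1": ["a", "b"], "p2": ["b", "c"], "p3": ["c"]}
    print(first_fit_tsa(demands, path_links, num_ts=4))
```

In the example, p2 shares link b with p1 and is pushed to TS 2, while p3 shares only link c with p2 and can reuse TSs 0 and 1.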
Fig. 6(A) plots the maximum delay time (D)
between sending a QL from point (a) in Fig. 5(A)
and receiving it at point (b) in Fig. 5(A) when the
number of A-OL2SWs is varied with the number
of C-OL2SWs at the emulator fixed at 10. Fig.
6(A) indicates that D increases in proportion to
the number of A-OL2SWs because each
A-OL2SW requires a C-TS of 10 μs. D was 10.2
ms at maximum even when controlling 1000
A-OL2SWs. Also, the time to send the TS-table
was 10.2 ms at maximum for 1000 A-OL2SWs
(not shown in the graph). Therefore, we found
that a round trip between all TSC-Cs can be
completed within 20.4 ms.
To evaluate the TSA time in the SCH, we
measured the time from the arrival of all QLs at
the SCH to the sending of the TS-tables for all
OL2SWs. Fig. 6(B) plots the TSA time as the
number of paths is varied, where each case was
executed 1000 times. The TSA time increased in
proportion to the number of paths [5]. The
maximum TSA time was within 450 ms even
when computed for 20k paths.
The above results indicate that the maximum Ta
was 470 ms. Therefore, we found that it was
possible to achieve a DPBA cycle every 1 sec on
a metro NW scale of 1000x10 against the defined
traffic variations. In addition, with T set to 500
ms, we confirmed the functions that send the
QLs for 10 destinations according to the
allocated TSs (Fig. 5(C)) and change the
bandwidth at a TSC-D according to traffic
variations at every T (Fig. 5(D)).
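The Ta figure follows from adding the three measured maxima. The sketch below simply restates that sum and compares it with the two cycle lengths discussed; the constant and function names are illustrative.

```python
# Measured maxima reported in the text (milliseconds).
QL_COLLECT_MS = 10.2      # QL delivery for 1000 A-OL2SWs
TS_TABLE_SEND_MS = 10.2   # TS-table delivery for 1000 A-OL2SWs
TSA_COMPUTE_MS = 450.0    # TSA computation for 20k paths


def control_latency_ms():
    """Worst-case Ta: collect QLs, compute the TSA, distribute TS-tables."""
    return QL_COLLECT_MS + TSA_COMPUTE_MS + TS_TABLE_SEND_MS


def fits_cycle(ta_ms, cycle_ms):
    """True if the control latency fits within one DPBA cycle T."""
    return ta_ms <= cycle_ms


if __name__ == "__main__":
    ta = control_latency_ms()
    print(f"Ta = {ta:.1f} ms, fits T = 1 s: {fits_cycle(ta, 1000)}, "
          f"fits T = 500 ms: {fits_cycle(ta, 500)}")
```

The sum is about 470 ms, which leaves margin for T = 1 sec and still fits within T = 500 ms.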
Conclusions
We evaluated DPBA in the OL2SW-NW. We
verified experimentally that a DPBA cycle of 500
ms was achieved on a 1000x10-scale NW.
References
[1] K. Hattori, et al., ECOC2012, We.3.D.5.
[2] ITU-T, G.984.2 (2006).
[3] Y. Wang, et al., SIGCOMM’08, pp. 231–242.
[4] G. Xie, et al., ITC2007, pp. 666–677.
[5] S. Subramaniam, et al., SPIE 3843, 2 (1999).
[Fig. 6 (plots): (A) D (ms), axis 0 to 12, versus number of A-OL2SWs, axis 0 to 1000, with a maximum of 10.2 ms; (B) TSA time (ms), axis 0 to 500, versus number of paths (k), axis 4 to 20, with a maximum of 450 ms.]