IEEE/ACM TRANSACTIONS ON NETWORKING, VOLUME 16, NO. 2, APRIL 2008 1
A Stochastic Foundation of Available Bandwidth
Estimation: Multi-Hop Analysis
Xiliang Liu, Member, IEEE, Kaliappa Ravindran, and Dmitri Loguinov, Member, IEEE
Abstract—This paper analyzes the asymptotic behavior of packet-train probing over a multi-hop network path P carrying arbitrarily routed bursty cross-traffic flows. We examine the statistical mean of the packet-train output dispersions and its relationship to the input dispersion. We call this relationship the response curve of path P. We show that the real response curve Z is tightly lower-bounded by its multi-hop fluid counterpart F, obtained when every cross-traffic flow on P is hypothetically replaced with a constant-rate fluid flow of the same average intensity and routing pattern. The real curve Z asymptotically approaches its fluid counterpart F as probing packet size or packet-train length increases. Most existing measurement techniques are based upon the single-hop fluid curve S associated with the bottleneck link in P. We note that the curve S coincides with F in a certain large-dispersion input range, but falls below F in the remaining small-dispersion input ranges. As an implication of these findings, we show that bursty cross-traffic in multi-hop paths causes negative bias (asymptotic underestimation) to most existing techniques. This bias can be mitigated by reducing the deviation of Z from S using large packet size or long packet-trains. However, the bias is not completely removable for the techniques that use the portion of S that falls below F.
I. INTRODUCTION
END-TO-END estimation of the spare capacity along a
network path using packet-train probing has recently
become an important Internet measurement research area.
Several measurement techniques such as TOPP [13], Pathload
[5], IGI/PTR [4], Pathchirp [15], and Spruce [16] have been
developed. Most of the current proposals use a single-hop path with constant-rate fluid cross-traffic to justify their methods. The behavior and performance of these techniques in a multi-hop path with general bursty cross-traffic have so far been examined only through experimental evaluations. Recent work [8] initiated the
effort of developing an analytical foundation for bandwidth
measurement techniques. Such a foundation is important in
that it helps achieve a clear understanding of both the validity
and the inadequacy of current techniques and provides a
guideline to improve them. However, the analysis in [8] is
restricted to single-hop paths. There is still a void to fill in
understanding packet-train bandwidth estimation over a multi-
hop network path.
Supported by NSF grants CCR-0306246, ANI-0312461, CNS-0434940, and
CNS-0519442.
Xiliang Liu is with Bloomberg L.P., New York, NY 10022 USA (e-mail:
liuxiliang@gmail.com).
Kaliappa Ravindran is with the Computer Science Department, City College
of New York, New York, NY 10031 USA (e-mail: ravi@cs.ccny.cuny.edu).
Dmitri Loguinov is with the Computer Science Department, Texas A&M
University, College Station, TX 77843 USA (e-mail: dmitri@cs.tamu.edu).
Recall that the available bandwidth of a network hop is its residual capacity after transmitting cross-traffic within a certain time interval. This metric varies over time as well as across a wide range of observation time intervals. However, in this paper, we explicitly target the measurement of a long-term average available bandwidth, which is a stable metric independent of observation time instants and observation time intervals [8]. Consider an N-hop network path P = (L_1, L_2, ..., L_N), where the capacity of link L_i is denoted by C_i and the long-term average of the cross-traffic arrival rate at L_i is given by λ_i, which is assumed to be less than C_i. The hop available bandwidth of L_i is A_i = C_i − λ_i. The path available bandwidth A_P is given by

A_P = min_{1 ≤ i ≤ N} (C_i − λ_i).    (1)

The hop L_b, which carries the minimum available bandwidth, is called the tight link or the bottleneck link¹. That is,

b = arg min_{1 ≤ i ≤ N} (C_i − λ_i).    (2)
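As a concrete sketch of (1) and (2), the short Python fragment below computes the hop available bandwidths, the path available bandwidth, and the tight-link index; the capacities and cross-traffic rates are hypothetical, chosen only for illustration.

```python
# Hop available bandwidths A_i = C_i - lambda_i, path available
# bandwidth (1), and tight-link index (2). Values are hypothetical.
C = [100.0, 40.0, 60.0]    # link capacities C_i (mb/s)
lam = [20.0, 15.0, 45.0]   # long-term cross-traffic rates lambda_i (mb/s)

A = [c - l for c, l in zip(C, lam)]   # [80.0, 25.0, 15.0]
A_P = min(A)                          # path available bandwidth (1)
b = A.index(A_P)                      # tight-link index (2), 0-based

print(A_P, b)   # 15.0 2
```

Note that in this example the tight link (index 2) differs from the narrow link (index 1, the smallest capacity), illustrating the distinction drawn in footnote 1.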
The main idea of packet-train bandwidth estimation is to infer A_P from the relationship between the inter-packet dispersions of the output packet-trains and those of the input packet-trains. Due to the complexity of this relationship in arbitrary network paths with bursty cross-traffic flows, previous work simplifies the analysis using a single-hop path with fluid² cross-traffic, while making the following two assumptions without formal justification: first, cross-traffic burstiness only causes measurement variability that can be smoothed out by averaging multiple probing samples; and second, non-bottleneck links have negligible impact on the proposed techniques.

The validity of the first assumption is partially addressed in [8], where the authors use a single-hop path with bursty cross-traffic to derive the statistical mean of the packet-train output dispersions as a function of the input probing dispersion, referred to as the single-hop response curve. The analysis shows that besides measurement variability, cross-traffic burstiness can also cause measurement bias to the techniques that are based on fluid analysis. This measurement bias cannot be reduced even when an infinite number of probing samples is used, but can be mitigated using long packet-trains and/or a large probing packet size.

¹In general, the tight link can be different from the link with the minimum capacity, which we refer to as the narrow link of P.
²We use the terms “fluid” and “constant-rate fluid” interchangeably.
This paper further addresses the two assumptions on which current techniques are based. To this end, we extend the asymptotic analysis in [8] to arbitrary network paths and uncover the nature of the measurement bias caused by bursty cross-traffic flows in a multi-hop network path. This problem differs significantly from the previous single-hop analysis for the following reasons. First, unlike single-hop measurements, where the input packet-trains have deterministic and equal inter-packet separation formed by the probing source, the input packet-trains at any hop (except the first one) along a multi-link path are output from the previous hop and have random structure. Second and more importantly, the multi-hop probing asymptotics are strongly related to the routing pattern of cross-traffic flows. This issue never arises in a single-hop path and has received little attention in prior investigations. However, as we show in this paper, it is one of the most significant factors that affect the accuracy of bandwidth measurement in multi-hop paths.
To characterize packet-train bandwidth estimation in its most general settings, we analyze several important properties of the probing response curve Z assuming a multi-hop path P with arbitrarily routed bursty cross-traffic flows. We compare Z with its multi-hop fluid counterpart F, which is a response curve obtained when every cross-traffic flow in P is hypothetically replaced with a fluid flow of the same average intensity and routing pattern. We show, under an ergodic stationarity approximation of the cross-traffic at each link, that the real curve Z is tightly lower-bounded by its fluid counterpart F and that the curve Z asymptotically approaches its fluid bound F in the entire input range as probing packet size or packet-train length increases.
Most of the existing techniques are based on the single-hop fluid response curve S associated with the bottleneck link in P. Therefore, any deviation of the real curve Z from the single-hop curve S can potentially cause measurement bias in bandwidth estimation. Note that the deviation Z − S can be decomposed as

Z − S = (Z − F) + (F − S).    (3)

The first term Z − F is always positive and causes asymptotic underestimation of A_P for most of the existing techniques. This deviation term and its resulting measurement bias are “elastic” in the sense that they can be reduced to a negligible level using packet-trains of sufficient length³. For the second deviation term F − S, we note that both S and F are piece-wise linear curves. The first two linear segments in F, associated with large input dispersions, coincide with S (i.e., F − S = 0). The rest of the linear segments in F, associated with small input dispersions, appear above S (i.e., F − S > 0). The amount of deviation and the additional negative measurement bias it causes depend on the routing patterns of cross-traffic flows, and are maximized when every flow traverses only one hop along the path (which is often called one-hop persistent cross-traffic routing [2]). Furthermore, the curve deviation F − S is “non-elastic” and stays constant with respect to probing packet size and packet-train length at any given input rate.

³The analysis assumes infinite buffer space at each router.

Therefore, the measurement bias it causes cannot be overcome by adjusting the input packet-train parameters.
Among current measurement techniques, Pathload and PTR operate in the input probing range where F coincides with S, and consequently are only subject to the measurement bias caused by the first deviation term Z − F. Spruce may use the probing range where F − S > 0. Hence, it is subject to both elastic and non-elastic negative measurement biases. The amount of bias can be substantially more than the actual available bandwidth in certain common scenarios, leading to negative results by the measurement algorithm and a final estimate of zero by the tool.
The rest of the paper is organized as follows. Section II derives the multi-hop response curve F assuming arbitrarily routed fluid cross-traffic flows and examines the deviation term F − S. In Sections III and IV, we analyze the deviation phenomena and convergence properties of the real response curve Z of a multi-hop path with respect to its fluid counterpart F. We provide practical evidence for our theoretical results using testbed experiments and real Internet measurements in Section V. We examine the impact of these results on existing techniques in Section VI and summarize related work in Section VII. Finally, we briefly discuss future work and conclude in Section VIII.

An earlier version of this paper appeared in [9]. Interested readers can also refer to [10] for the technical proofs that are omitted in this paper.
II. MULTI-HOP FLUID ANALYSIS
It is important to first thoroughly understand the response curve F of a network path carrying fluid cross-traffic flows, since, as we show later, the fluid curve F is an approachable bound of the real response curve Z. Initial investigation of the fluid curves is due to Melander et al. [12] and Dovrolis et al. [1]. However, prior work only considers two special cross-traffic routing cases (one-hop persistent routing and path-persistent routing). In this section, we formulate and solve the problem for arbitrary cross-traffic routing patterns, based on which we discuss several important properties of the fluid response curves that allow us to obtain the path available bandwidth information.
A. Formulating A Multi-Hop Path
We first introduce the necessary notation to formulate a multi-hop path and the cross-traffic flows that traverse it.

An N-hop network path P = (L_1, L_2, ..., L_N) is a sequence of N interconnected First-Come First-Served (FCFS) store-and-forward hops. For each forwarding hop L_i in P, we denote its link capacity by C_i, and assume that it has infinite buffer space and a work-conserving queuing discipline. Suppose that there are M fluid cross-traffic flows traversing path P. The rate of flow j is denoted by x_j and the flow rate vector is given by x = (x_1, x_2, ..., x_M).
We impose two routing constraints on cross-traffic flows to simplify the discussion. The first constraint requires every flow to have a different routing pattern; otherwise, the flows with the same routing pattern are aggregated into one single flow. The second routing constraint requires every flow to have only one link where it enters the path and only one (downstream) link where it exits from the path; otherwise, the flow is decomposed into several separate flows that meet this routing constraint.
Definition 1: A flow aggregation is a set of flows, represented by a “selection vector” p = (p_1, p_2, ..., p_M)^T, where p_j = 1 if flow j belongs to the aggregation and p_j = 0 otherwise. We use f_j to represent the selection vector of the aggregation that contains flow j alone.

There are several operations between flow aggregations. First, the flows common to aggregations p and q form another aggregation, whose selection vector is given by p ⊙ q, where the operator ⊙ represents element-wise multiplication. Second, the aggregation that contains the flows in p but not in q is given by p − p ⊙ q. Finally, note that the traffic intensity of aggregation p can be computed from the inner product x·p.
We now define several types of flow aggregation frequently used in this paper. First, the traversing flow aggregation at link L_i, denoted by its selection vector r_i, includes all fluid flows that pass through L_i. The M × N matrix R = (r_1, r_2, ..., r_N) becomes the routing matrix of path P. For convenience, we define an auxiliary selection vector r_0 = 0.

The second type of flow aggregation, denoted by e_i, includes all flows entering the path at link L_i, which can be expressed as e_i = r_i − r_i ⊙ r_{i−1} given the second routing constraint stated previously. The third type of flow aggregation, which includes flows that enter the path at link L_k and traverse the downstream link L_i, is denoted as Γ_{k,i} = e_k ⊙ r_i, where k ≤ i.
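These selection-vector operations are simple to state in code. The sketch below uses an illustrative three-flow, three-link routing (not the paper's later example) and computes e_i = r_i − r_i ⊙ r_{i−1}, Γ_{k,i} = e_k ⊙ r_i, and the traffic intensity x·r_i:

```python
# Selection-vector algebra for flow aggregations (illustrative values).
def emul(p, q):               # element-wise multiplication p ⊙ q
    return [a * b for a, b in zip(p, q)]

def dot(x, p):                # inner product x · p (traffic intensity)
    return sum(a * b for a, b in zip(x, p))

# r[i] = flows traversing link L_i; flow 1 on L_1-L_2, flow 2 on
# L_2-L_3, flow 3 on L_3 (each flow enters and exits exactly once).
r = {0: [0, 0, 0],            # auxiliary r_0 = 0
     1: [1, 0, 0],
     2: [1, 1, 0],
     3: [0, 1, 1]}

# e_i = r_i - r_i ⊙ r_{i-1}: flows entering the path at L_i
e = {i: [a - b for a, b in zip(r[i], emul(r[i], r[i - 1]))]
     for i in (1, 2, 3)}
# Γ_{k,i} = e_k ⊙ r_i: flows entering at L_k that still traverse L_i
gamma_2_3 = emul(e[2], r[3])

x = (4.0, 3.0, 5.0)           # illustrative flow rates
print(e[1], e[2], e[3], gamma_2_3, dot(x, r[3]))
```

Here flow 2 enters at L_2 and still traverses L_3, so Γ_{2,3} selects exactly that flow, and x·r_3 gives the cross-traffic intensity λ_3 at the last link.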
The cross-traffic intensity at link L_i is denoted by λ_i. We assume λ_i < C_i for 1 ≤ i ≤ N. Since none of the links in P is congested, the arrival rate of flow j at any link it traverses is x_j. Consequently, we have

λ_i = x·r_i < C_i,  1 ≤ i ≤ N.    (4)

We further define the path configuration of P as the following 2 × N matrix

H = ( C_1  C_2  ...  C_N
      λ_1  λ_2  ...  λ_N ).    (5)

The hop available bandwidth of L_i is given by A_i = C_i − λ_i. We assume that every hop has a different available bandwidth, and consequently that the tight link is unique. Sometimes, we also need to refer to the second minimum hop available bandwidth and the associated link, which we denote as A_{b2} = C_{b2} − λ_{b2} and L_{b2}, respectively. That is,

b2 = arg min_{1 ≤ i ≤ N, i ≠ b} (C_i − λ_i),    (6)

where b is the index of the tight hop.
B. Fluid Response Curves
We now consider a packet-train of input dispersion (i.e., inter-packet spacing) g_I and packet size s that is used to probe path P. We are interested in computing the output dispersion of the packet-train and examining its relation to g_I. Such a relation is called the gap response curve of path P. It is easy to verify that under fluid conditions, the response curve does not depend on the packet-train length n. Hence, we only consider the case of packet-pair probing. We denote the output dispersion at link L_i as γ_i(g_I, s), or γ_i for short, and again for notational convenience we let γ_0 = g_I. Note that γ_N(g_I, s) corresponds to the notation F we have used previously.

Based on our formulation, the gap response curve of path P has the recursive representation given below.
Theorem 1: When a packet-pair with input dispersion g_I and packet size s is used to probe an N-hop fluid path with routing matrix R and flow rate vector x, the output dispersion at link L_i can be recursively expressed as

γ_i = { g_I                              i = 0
      { max( γ_{i−1}, (s + Ω_i)/C_i )    i > 0,    (7)

where Ω_i is

Ω_i = Σ_{k=1}^{i} [ γ_{k−1} x·Γ_{k,i} ].    (8)

Proof: Assume that the first probing packet arrives at link L_i at time instance a_1. It gets immediate transmission service and departs at a_1 + s/C_i. The second packet arrives at a_1 + γ_{i−1}. The server of L_i needs to transmit s + Ω_i amount of data before it can serve the second packet. If this is done before time instant a_1 + γ_{i−1}, the second packet also gets immediate service and γ_i = γ_{i−1}. Otherwise, the server undergoes a busy period between the departures of the two packets, meaning that γ_i = (s + Ω_i)/C_i. Therefore, we have

γ_i = max( γ_{i−1}, (s + Ω_i)/C_i ).    (9)

This completes the proof of the theorem.
As a quick sanity check, we verify the compatibility between Theorem 1 and the special one-hop persistent routing case, where every flow that enters the path at link L_i will exit the path at link L_{i+1}. For this routing pattern, we have

Γ_{k,i} = { 0     i ≠ k
          { r_i   i = k.    (10)

Therefore, equation (8) can be simplified as

Ω_i = γ_{i−1} x·r_i = γ_{i−1} λ_i,    (11)

which agrees with previous results [1], [12].
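The recursion (7)-(8) translates directly into a short routine. The sketch below evaluates it for an arbitrary routing matrix, encoded as R[j][i] = 1 if flow j traverses link L_{i+1} (0-based indices); the sample values use a one-hop persistent setting with hypothetical parameters, consistent with (10)-(11). Units are seconds and megabits.

```python
# Fluid gap response curve via Theorem 1: recursion (7)-(8).
def gap_response(g_I, s, C, R, x):
    """Output dispersion gamma_N; C: capacities, R: M x N routing
    matrix (R[j][i] = 1 iff flow j traverses L_{i+1}), x: flow rates."""
    M, N = len(R), len(C)
    # column vectors r_0 = 0, r_1, ..., r_N of the routing matrix
    r = [[0] * M] + [[R[j][i] for j in range(M)] for i in range(N)]
    # e_k = r_k - r_k ⊙ r_{k-1}: flows entering the path at L_k
    e = [None] + [[r[k][j] - r[k][j] * r[k - 1][j] for j in range(M)]
                  for k in range(1, N + 1)]
    gamma = [g_I]                                   # gamma_0 = g_I
    for i in range(1, N + 1):
        # Omega_i = sum_{k<=i} gamma_{k-1} * (x · (e_k ⊙ r_i))    (8)
        omega = sum(gamma[k - 1] *
                    sum(x[j] * e[k][j] * r[i][j] for j in range(M))
                    for k in range(1, i + 1))
        gamma.append(max(gamma[i - 1], (s + omega) / C[i - 1]))   # (7)
    return gamma[N]

# One-hop persistent routing, C_i = 10 mb/s, x = (4, 7, 8),
# s = 0.012 mb (1500 bytes); g_I = 10 ms lies on the first segment.
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(gap_response(0.010, 0.012, [10.0] * 3, I3, [4.0, 7.0, 8.0]))
```

For g_I on the first linear segment the routine returns the input dispersion unchanged, as Property 2 later predicts; smaller inputs produce expanded dispersions.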
C. Properties of Fluid Response Curves
Theorem 1 leads to several important properties of the fluid response curve F, which we discuss next. These properties tell us how bandwidth information can be extracted from the curve F, and also show the deviation of F from the single-hop fluid curve S of the tight link, of which one should be aware.
Property 1: The output dispersion γ_N(g_I, s) is a continuous piece-wise linear function of the input dispersion g_I in the input dispersion range (0, ∞).

Let 0 = α_{K+1} < α_K < ... < α_1 < α_0 = ∞ be the input dispersion turning points that split the gap response curve into K + 1 linear segments⁴. Our next result discusses the turning points and linear segments that are of major importance in bandwidth estimation.
Property 2: The first turning point α_1 corresponds to the path available bandwidth in the sense that A_P = s/α_1. The first linear segment, in the input dispersion range (α_1 = s/A_P, ∞), has slope 1 and intercept 0. The second linear segment, in the input dispersion range (α_2, α_1), has slope λ_b/C_b and intercept s/C_b, where b is the index of the tight link:

γ_N(g_I, s) = { g_I                   α_1 ≤ g_I ≤ ∞
              { (g_I λ_b + s)/C_b     α_2 ≤ g_I ≤ α_1.    (12)

These facts are irrespective of the routing matrix.
It helps to find the expression for the turning point α_2, so that we can identify the exact range of the second linear segment. However, unlike α_1, the turning point α_2 depends on the routing matrix. In fact, all other turning points also depend on the routing matrix and cannot be computed from the path configuration matrix alone. Therefore, we only provide a bound for α_2.

Property 3: For any routing matrix, the term s/α_2 is no less than A_{b2}, the second minimum hop available bandwidth of path P.
The slopes and intercepts of all but the first two linear segments are related to the routing matrix. We skip the derivation of their expressions, and instead provide both a lower bound and an upper bound for the entire response curve.

Property 4: For a given path configuration matrix, the gap response curve associated with any routing matrix is lower-bounded by the single-hop gap response curve of the tight link:

S(g_I, s) = { g_I                   g_I > s/A_P
            { (s + g_I λ_b)/C_b     0 < g_I < s/A_P.    (13)

It is upper-bounded by the gap response curve associated with one-hop persistent routing.
We now make several observations regarding the deviation of γ_N(g_I, s) (i.e., F) from S(g_I, s). Combining (12) and (13), we see that γ_N(g_I, s) − S(g_I, s) = 0 when g_I ≥ α_2. That is, the first two linear segments of F coincide with S. When g_I < α_2, Property 4 implies that the deviation γ_N(g_I, s) − S(g_I, s) is positive. The exact value depends on cross-traffic routing, and it is maximized under one-hop persistent routing for any given path configuration matrix.

Also note that there are three pieces of path information that we can extract from the gap response curve F without knowing the routing matrix. By locating the first turning point α_1, we can compute the path available bandwidth. From the second linear segment, we can obtain the tight link capacity and cross-traffic intensity (and consequently, the bottleneck link utilization). Other parts of the response curve F are less readily usable due to their dependence on cross-traffic routing.
⁴Note that the turning points in F are indexed in decreasing order of their values. The reason will become clear shortly when we discuss the rate response curve.
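The second linear segment is directly usable in code: two points on it determine its slope λ_b/C_b and intercept s/C_b, from which the tight-link capacity and cross-traffic intensity follow. A minimal sketch; the two sample points are hypothetical, taken from a segment with C_b = 10 mb/s and λ_b = 8 mb/s (units: seconds and megabits).

```python
# Recover tight-link parameters from the second segment of F
# (Property 2): gamma_N = (g_I * lambda_b + s) / C_b.
def tight_link_params(p1, p2, s):
    """p1, p2: (g_I, gamma_N) points on the second linear segment."""
    (g1, y1), (g2, y2) = p1, p2
    slope = (y2 - y1) / (g2 - g1)     # = lambda_b / C_b
    intercept = y1 - slope * g1       # = s / C_b
    C_b = s / intercept
    return C_b, slope * C_b           # (C_b, lambda_b)

s = 0.012                             # 1500-byte probes, in mb
C_b, lam_b = tight_link_params((0.004, 0.0044), (0.005, 0.0052), s)
print(C_b, lam_b)
```

From these two values, the bottleneck utilization λ_b/C_b and the tight-link available bandwidth C_b − λ_b follow immediately.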
D. Rate Response Curves
To extract bandwidth information from the output dispersion γ_N, it is often more helpful to look at the rate response curve, i.e., the functional relation between the output rate r_O = s/γ_N and the input rate r_I = s/g_I. However, since this relation is not linear, we adopt a transformed version first proposed by Melander et al. [13], which depicts the relation between the ratio r_I/r_O and r_I. Denoting this rate response curve by F̃(r_I), we have

F̃(r_I) = r_I/r_O = γ_N(g_I, s)/g_I.    (14)

This transformed version of the rate response curve is also piece-wise linear. It is easy to see that the first turning point in the rate curve is s/α_1 = A_P and that the rate curve in the input rate range (0, s/α_2) can be expressed as

F̃(r_I) = { 1                   r_I ≤ A_P
          { (λ_b + r_I)/C_b     A_P ≤ r_I ≤ s/α_2.    (15)

Finally, it is also important to notice that the rate response curve F̃(r_I) does not depend on the probing packet size s. This is because, for any given input rate r_I, both γ_N(g_I, s) and g_I are proportional to s. Consequently, the ratio between these two terms remains constant for any s.
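The s-independence is easy to check numerically: on the second segment, substituting g_I = s/r_I into (12) gives r_I/r_O = (λ_b + r_I)/C_b, in which s cancels. A sketch with illustrative tight-link parameters:

```python
# Transformed rate curve (14) evaluated on the second segment of F:
# gamma_N = (g_I * lambda_b + s)/C_b with g_I = s/r_I, so the ratio
# gamma_N/g_I = (lambda_b + r_I)/C_b does not depend on s.
def rate_ratio(r_I, s, C_b, lam_b):
    g_I = s / r_I
    gamma_N = (g_I * lam_b + s) / C_b
    return gamma_N / g_I              # = r_I / r_O

C_b, lam_b = 10.0, 8.0                # illustrative tight link
for s in (0.004, 0.012):              # two probe sizes (mb)
    print(rate_ratio(2.5, s, C_b, lam_b))   # ≈ 1.05 for both
```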
E. Examples
We use a simple example to illustrate the properties of the fluid response curves. Suppose that we have a 3-hop path with equal capacities C_i = 10 mb/s, i = 1, 2, 3. We consider three routing matrices and flow rate settings that lead to the same link load at each hop.

In the first setting, which we call one-hop (or one-hop persistent) routing, the flow rate vector is x = (4, 7, 8) and the routing matrix is R = diag(1, 1, 1). Each cross-traffic flow traverses only one hop along the path. In the second setting, each flow traverses no more than two hops. Hence, we call it two-hop routing. The flow rate vector is x = (4, 3, 5) and the routing matrix R is given by

R = ( 1 1 0
      0 1 1
      0 0 1 ).    (16)

In the third, three-hop (or path-persistent) routing case, a flow can traverse up to three links. The flow rate vector is x = (4, 3, 1) and the routing matrix R is given by

R = ( 1 1 1
      0 1 1
      0 0 1 ).    (17)

All three settings result in the same path configuration

H = ( 10 10 10
       4  7  8 ).    (18)
The probing packet size s is 1500 bytes. The fluid gap response curves for the three routing patterns are plotted in Fig. 1(a).

[Fig. 1. An example of multi-hop response curves: (a) the gap response curve, output dispersion γ_N (ms) versus input dispersion g_I (ms); (b) the rate response curve, ratio r_I/r_O versus input rate r_I (mb/s). Each panel shows the one-hop, two-hop, and three-hop routing curves together with the lower bound.]

In this example, all multi-hop fluid curves have 4 linear segments separated by the turning points α_1 = 6 ms, α_2 = 4 ms, and α_3 = 2 ms. The lower bound S identified in Property 4 is also plotted in the figure. This lower bound is the gap response curve of the single-hop path comprising only the tight link L_3.
The rate response curves are given in Fig. 1(b), where the three turning points are 2 mb/s, 3 mb/s, and 6 mb/s, respectively. Due to the transformation we adopted, the rate curve for one-hop routing still remains an upper bound for the rate curves associated with the other routing patterns. From Fig. 1(b), we also see that, similar to the gap curves, the three multi-hop rate response curves and their lower bound S̃(r_I) (i.e., the transformed rate version of S(g_I, s)) share the same first and second linear segments.
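The example's first turning point and second-segment values can be checked against Property 2 with a few lines (units: seconds and megabits; s = 1500 bytes = 0.012 mb):

```python
# Check the example: alpha_1 = s/A_P and the second-segment formula.
s = 1500 * 8 / 1e6                    # probe size: 0.012 mb
C = [10.0, 10.0, 10.0]
lam = [4.0, 7.0, 8.0]                 # same load in all three settings

A = [c - l for c, l in zip(C, lam)]
A_P = min(A)                          # 2 mb/s, at the tight link L_3
alpha_1 = s / A_P                     # 0.006 s = 6 ms, as in Fig. 1(a)
# second segment, e.g. at g_I = 5 ms: gamma_N = (g_I*lam_b + s)/C_b
gamma = (0.005 * lam[2] + s) / C[2]   # 0.0052 s = 5.2 ms
print(alpha_1, gamma)
```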
F. Discussion
We conclude this section by discussing several major challenges in extending the response curve analysis to a multi-hop path carrying bursty cross-traffic flows. First, notice that with bursty cross-traffic, even when the input dispersion and packet-train parameters remain constant, the output dispersion becomes random, rather than deterministic as in fluid cross-traffic. The gap response curve Z, defined as the functional relation between the statistical mean of the output dispersion and the input dispersion, is much more difficult to penetrate than the fluid curve F. Second, unlike in the fluid case, where both the packet-train length n and the probing packet size s have no impact on the rate response curve F̃(r_I), the response curves in bursty cross-traffic are strongly related to these two packet-train parameters. Finally, a full characterization of a fluid flow only requires one parameter, its arrival rate, while a full characterization of a bursty flow requires several stochastic processes. In what follows, we address these problems and extend our analysis to multi-hop paths with bursty cross-traffic.
III. BASICS OF NON-FLUID ANALYSIS
In this section, we present a stochastic formulation of the multi-hop bandwidth measurement problem and derive a recursive expression for the output dispersion random variable. This expression is a fundamental result upon which the asymptotic analysis in Section IV is based.
A. Formulating Bursty Flows
We keep most of the notation the same as in the previous section, although some terms are extended to have a different meaning, which we explain shortly. Since cross-traffic flows now become bursty flows of data packets, we adopt the definitions of several random processes (Definitions 1-6) in [8] to characterize them. However, these definitions need to be refined to be specific to a given router and flow aggregation. In what follows, we only give the definitions of two random processes and skip the others. The notations for all six random processes are given in Table I.
Definition 2: The cumulative traffic arrival process of flow aggregation p at link L_i, denoted as {V_i(p, t), 0 ≤ t < ∞}, is a random process counting the total amount of data (in bits) received by hop L_i from flow aggregation p up to time instant t.

Definition 3: The hop workload process of L_i with respect to flow aggregation p, denoted as {W_i(p, t), 0 ≤ t < ∞}, indicates, at time instance t, the sum of the service times of all packets in the queue and the remaining service time of the packet in service, assuming that flow aggregation p is the only traffic passing through link L_i.
To simplify the analysis, we adopt the following “stationarity approximation” for cross-traffic flows.

Assumption 1: For any cross-traffic aggregation p that traverses link L_i, the cumulative traffic arrival process {V_i(p, t)} has ergodic stationary increments. That is, for any δ > 0, the δ-interval traffic intensity process {Y_{i,δ}(p, t)} is a mean-square ergodic process with time-invariant distribution and ensemble mean x·p.
We explain this assumption in more detail. First, the stationary increment assumption implies that the increment process of {V_i(p, t)} for any given time interval δ, namely {V_i(p, t + δ) − V_i(p, t) = δ Y_{i,δ}(p, t)}, has a time-invariant distribution. This further implies that the δ-interval traffic intensity process {Y_{i,δ}(p, t)} is identically distributed, whose marginal distribution at any time instance t can be described by the same random variable Y_{i,δ}(p). Second, the mean-square ergodicity implies that, as the observation interval δ increases, the random variable Y_{i,δ}(p) converges to x·p in the mean-square sense. In other words, the variance of Y_{i,δ}(p) decays to 0 as δ → ∞, i.e.,

lim_{δ→∞} E[ (Y_{i,δ}(p) − x·p)² ] = 0.    (19)

By making Assumption 1, we approximate both the arrival and the departure processes of any cross-traffic flow aggregation at any link as stationary processes. This sets us free from the complexity of analyzing tandem-queue departure processes and greatly simplifies later discussions. The major results we obtained, however, still hold for non-stationary cross-traffic. We address this issue further in Section V-C and leave its formal discussion as future work.
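The decay in (19) can be illustrated with a toy simulation. The sketch below generates cross-traffic whose per-unit-time arrivals are i.i.d. exponential (a simple stand-in for a process with ergodic stationary increments, not the paper's traffic model) and shows that the sample variance of the δ-interval intensity Y_{i,δ} shrinks as δ grows:

```python
import random
import statistics

random.seed(7)

def intensity_samples(delta, n, rate=5.0):
    """n samples of Y_delta: average arrival rate over an interval of
    length delta, with i.i.d. exponential per-unit-time arrivals."""
    out = []
    for _ in range(n):
        data = sum(random.expovariate(1.0 / rate) for _ in range(delta))
        out.append(data / delta)      # intensity over the interval
    return out

v4 = statistics.variance(intensity_samples(4, 400))
v64 = statistics.variance(intensity_samples(64, 400))
print(v4 > v64)                       # Var[Y_delta] decays with delta
```

For this source, Var[Y_δ] falls roughly as 1/δ, so the δ = 64 samples cluster much more tightly around the ensemble mean than the δ = 4 samples.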
Our next assumption is that the queuing system at link L_i has evolved for a sufficiently long period of time and has bypassed its transient state. Therefore, for any flow aggregation p, the workload process {W_i(p, t)} “inherits” the ergodic stationarity property from the traffic arrival process {V_i(p, t)}. This property is further carried over to the δ-interval workload-difference process {D_{i,δ}(p, t)} and the available bandwidth process {B_{i,δ}(p, t)}. This distributional stationarity allows
TABLE I
RANDOM PROCESS NOTATIONS

{V_i(p, t)}      Cumulative arrival process at L_i w.r.t. p
{Y_{i,δ}(p, t)}  Cross-traffic intensity process at L_i w.r.t. p
{W_i(p, t)}      Hop workload process at L_i w.r.t. p
{D_{i,δ}(p, t)}  Workload-difference process at L_i w.r.t. p
{U_i(p, t)}      Hop utilization process at L_i w.r.t. p
{B_{i,δ}(p, t)}  Available bandwidth process at L_i w.r.t. p
us to focus on the corresponding random variables W_i(p), D_{i,δ}(p), and B_{i,δ}(p). It is easy to see from their definitions that the statistical means of D_{i,δ}(p) and B_{i,δ}(p) are 0 and C_i − x·p, respectively⁵. Further, the ergodicity property leads to the following result.

Lemma 1: For any flow aggregation p that traverses the path at link L_i, the random variable B_{i,δ}(p) converges in the mean-square sense to C_i − x·p as δ → ∞, i.e.,

lim_{δ→∞} E[ (B_{i,δ}(p) − (C_i − x·p))² ] = 0.    (20)

On the other hand, notice that unlike {Y_{i,δ}(p, t)} and {B_{i,δ}(p, t)}, the workload-difference process {D_{i,δ}(p, t)} is not a moving-average process by nature. Consequently, the mean-square ergodicity of {D_{i,δ}(p, t)} does not cause the variance of D_{i,δ}(p) to decay as δ increases. Instead, we have the following lemma.

Lemma 2: The variance of the random variable D_{i,δ}(p) converges to 2 Var[W_i(p)] as δ increases:

lim_{δ→∞} E[ (D_{i,δ}(p) − 0)² ] = 2 Var[W_i(p)].    (21)
To obtain our later results, not only do we need to know the asymptotic variance of Y_{i,δ}(p), D_{i,δ}(p), and B_{i,δ}(p) as δ approaches infinity, but we also often rely on their variances being uniformly bounded (for any δ) by some constant. This condition can be easily justified from a practical standpoint. First note that the cross-traffic arrival rate is bounded by the capacities of the incoming links at a given router. Suppose that the sum of all incoming link capacities at hop L_i is C_+; then Y_{i,δ}(p) is distributed in the finite interval [0, C_+] and its variance is uniformly bounded by the constant C_+² for any observation interval δ. Similarly, the variance of B_{i,δ}(p) is uniformly bounded by the constant C_i². The variance of D_{i,δ}(p) is uniformly bounded by the constant 4 Var[W_i(p)] for any δ, which directly follows from the definition of D_{i,δ}(p).
Finally, we note that some of the notations introduced in Section II-A are now used with a different meaning. The rate of the bursty cross-traffic flow j, denoted by x_j, is the probabilistic mean of the traffic intensity random variable Y_{i,δ}(f_j), which is also the long-term average arrival rate of flow j at any link it traverses. The term λ_i = x·r_i becomes the long-term average arrival rate of the aggregated cross-traffic at link L_i. The term A_i = C_i − λ_i is the long-term average hop available bandwidth at link L_i. Again, recall that we explicitly target the measurement of the long-term averages of available bandwidth and/or cross-traffic intensity, instead of the corresponding metrics in a certain time interval.

⁵Note that the hop available bandwidth of link L_i that is of measurement interest, given by A_i = C_i − x·r_i, can be less than C_i − x·p.
B. Formulating Packet Train Probing
We now consider an infinite series of packet-trains with input inter-packet dispersion $g_I$, packet size $s$, and packet-train length $n$. The arrival of this packet-train series at path $P$ is described by a point process $\Lambda(t) = \max\{m \geq 0 : T_m \leq t\}$, which has a sufficiently large inter-probing separation. Let $d_1(m,i)$ and $d_n(m,i)$ be the departure time instances from link $L_i$ of the first and last probing packets in the $m$th packet-train. We define the sampling interval of the packet-train as the total spacing $\Delta = d_n(m,i) - d_1(m,i)$, and the output dispersion as the average spacing $G = \Delta/(n-1)$ of the packet-train. Both $\Delta$ and $G$ are random variables, whose statistics might depend on several factors such as the input dispersion $g_I$, the packet-train parameters $s$ and $n$, the packet-train index $m$ in the probing series, and the hop $L_i$ that the output dispersion $G$ is associated with. Therefore, a full version of $G$ is written as $G_i(g_I, s, n, m)$. However, for notational brevity, we often omit the parameters that have little relevance to the topic under discussion.
We now formally state the questions we address in this paper. Note that a realization of the stochastic process $\{G_N(g_I, s, n, m), 1 \leq m < \infty\}$ is just a packet-train probing experiment. We examine the sample-path time average of this process and its relationship to $g_I$ when keeping $s$ and $n$ constant. This relationship, previously denoted by $Z$, is called the gap response curve of path $P$.

Notice that the ergodic stationarity of cross-traffic arrivals, as we assumed previously, reduces our response curve analysis to the investigation of a single random variable. This is because each packet-train sees a multi-hop system of the same stochastic nature, and the output dispersion process $\{G_N(m), 1 \leq m < \infty\}$ is an identically distributed random sequence, which can be described by the output dispersion random variable $G_N$. The sample-path time average of the output dispersion process coincides with the mean of the random variable $G_N$$^6$. Therefore, in the rest of the paper, we focus on the statistics of $G_N$ and drop the index $m$.

In our later analysis, we compare the gap response curve of $P$ with that of the fluid counterpart of $P$ and prove that the former is lower-bounded by the latter.
Definition 4: Suppose that path $P$ has a routing matrix $R$ and a flow rate vector $x$, and that path $\tilde P$ has a routing matrix $\tilde R$ and a flow rate vector $\tilde x$. $\tilde P$ is called the fluid counterpart of $P$ if 1) all cross-traffic flows traversing $\tilde P$ are constant-rate fluid; 2) the two paths $\tilde P$ and $P$ have the same configuration matrix; and 3) there exists a row-exchange matrix $T$ such that $TR = \tilde R$ and $Tx = \tilde x$.
From this definition, we see that for every flow $j$ in $P$, there is a corresponding fluid flow $j'$ in the fluid counterpart of $P$ such that flow $j'$ has the same average intensity and routing pattern as those of flow $j$. Note that the third condition in Definition 4 is made to allow the two flows to have different indices, i.e., to allow $j \neq j'$.

$^6$Note that the output dispersion process can be correlated. However, this does not affect the sample-path time average of the process.
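As a small concrete illustration of condition 3 (the matrices below are hypothetical, not from the paper), a row-exchange matrix $T$ is simply a permutation matrix that relabels the flows while preserving each flow's rate and routing row:

```python
import numpy as np

# Toy routing matrix R (rows = flows, columns = links) and rate vector x.
R = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 0, 0]])
x = np.array([20.0, 40.0, 60.0])

# Row-exchange matrix T: a permutation swapping flows 1 and 3 (rows 0 and 2).
T = np.array([[0, 0, 1],
              [0, 1, 0],
              [1, 0, 0]])

R_tilde = T @ R          # routing matrix of the fluid counterpart
x_tilde = T @ x          # its flow rate vector

# Every flow j in P has a counterpart j' with the same rate and routing row:
for j in range(len(x)):
    jp = int(np.argmax(T[:, j]))   # index of flow j after the row exchange
    assert x_tilde[jp] == x[j]
    assert (R_tilde[jp] == R[j]).all()
print(R_tilde, x_tilde)
```

The loop checks exactly what Definition 4 requires: the fluid counterpart carries the same flows, possibly listed in a different order.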
A second focus of this paper is to study the impact of the packet-train parameters $s$ and $n$ on the response curves. That is, for any given input rate $r_I$ with the other parameters fixed, we examine the convergence properties of the output dispersion random variable $G_N(s/r_I, s, n)$ as $s$ or $n$ tends to infinity.
C. Recursive Expression of $G_N$

We keep the input packet-train parameters $g_I$, $s$, and $n$ constant and next obtain a basic expression for the output dispersion random variable $G_N$.
Lemma 3: Letting $G_0 = g_I$, the random variable $G_i$ has the following recursive expression:

$$G_i = \sum_{k=1}^{i} \frac{Y_{k,\Delta_{k-1}}(\Gamma_{k,i})\, G_{k-1}}{C_i} + \frac{s}{C_i} + \frac{\tilde I_i}{n-1} = G_{i-1} + \frac{D_{i,\Delta_{i-1}}(r_i)}{n-1} + \frac{R_i}{n-1}, \qquad (22)$$

where the term $R_i$ is a random variable representing the extra queuing delay (besides the queuing delay caused by the workload process $\{W_i(r_i,t)\}$) experienced at $L_i$ by the last probing packet in the train. The term $\tilde I_i$ is another random variable indicating the hop idle time of $L_i$ during the sampling interval of the packet train.
Note that the two terms $\tilde I_i$ and $R_i$ did not appear in Table I. Interested readers can refer to Section 3.2 of our single-hop analysis [8] for a detailed discussion of these two terms. Both random variables can be computed given the inter-arrival structure of the probing train and the available bandwidth process at link $i$. We next present the computation formulas while omitting their derivations.
To compute $R_i$, we denote by $a_{i,k}$ the arrival time instant of the $k$th packet in the train at link $i$. We use $R_{i,k}$ to denote the extra queuing delay (besides the queuing delay $W_i(r_i, a_{i,k})$) experienced by the $k$th packet in the train at link $i$. Then $R_i = R_{i,n}$, which can be computed recursively as follows, where $\delta_k = a_{i,k} - a_{i,k-1}$:

$$R_{i,k} = \begin{cases} 0 & k = 1 \\ \max\!\left(0,\ \dfrac{s - B_{i,\delta_k}(r_i, a_{i,k-1})\,\delta_k}{C_i} + R_{i,k-1}\right) & k > 1. \end{cases} \qquad (23)$$
The other random variable $\tilde I_i$ is related to $R_i$ and $B_{i,\Delta_{i-1}}$ as follows$^7$:

$$\tilde I_i = R_i + \frac{B_{i,\Delta_{i-1}}(r_i)\,\Delta_{i-1} - (n-1)s}{C_i}. \qquad (24)$$
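To make the recursion in (23) concrete, the sketch below evaluates it for hypothetical available-bandwidth samples (the values of $s$, $C_i$, the gaps $\delta_k$, and the $B$ samples are invented for illustration, and the sketch follows the reading of (23) in which $R_{i,k-1}$ accumulates inside the max):

```python
def extra_queuing_delay(s, C, deltas, bw_samples):
    """Recursion (23): extra queuing delay R_{i,k} of each packet in a train.

    s          -- probing packet size (bits)
    C          -- hop capacity C_i (bit/s)
    deltas     -- inter-arrival gaps delta_k = a_{i,k} - a_{i,k-1}, k = 2..n
    bw_samples -- hypothetical samples of B_{i,delta_k}(r_i, a_{i,k-1})
    Returns [R_{i,1}, ..., R_{i,n}]; the train's R_i is the last entry.
    """
    R = [0.0]                                  # R_{i,1} = 0
    for delta_k, B in zip(deltas, bw_samples):
        R.append(max(0.0, (s - B * delta_k) / C + R[-1]))
    return R

# Toy numbers: 1500-byte packets (12000 bits) on a 10 mb/s hop, 1 ms gaps.
s, C = 12000.0, 10e6
deltas = [0.001, 0.001, 0.001]
bw_samples = [4e6, 2e6, 9e6]                   # hypothetical B values (bit/s)
R = extra_queuing_delay(s, C, deltas, bw_samples)
print(R[-1])                                   # R_i for this 4-packet train
```

Each step adds the shortfall between the packet's own transmission requirement and the spare capacity seen since the previous packet; a negative shortfall can cancel accumulated delay but never drive it below zero.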
Even though in theory all the terms in (22) can be computed, computing $R_i$ and $\tilde I_i$ requires information about the inter-arrival structure of the packet train, which is hard to obtain in practice. Note, however, that our goal in this paper is not to obtain a computation procedure for the response curve $Z$. Instead, we focus on analyzing the deviation phenomena and convergence properties of $Z$. Lemma 3 is important and also sufficient for this purpose.
$^7$Please refer to Section 3.3 of [8] for details.
Also note that, due to the random input packet-train structure at $L_i$, all terms in (22) except $s/C_i$ become random variables. Some terms, such as $D_{i,\Delta_{i-1}}(r_i)$ and $Y_{k,\Delta_{k-1}}(\Gamma_{k,i})$, even have two dimensions of randomness. To understand the behavior of probing response curves, we need to investigate the statistical properties of each term in (22).
IV. RESPONSE CURVES IN BURSTY CROSS-TRAFFIC

In this section, we first show that the gap response curve $Z = E[G_N(g_I, s, n)]$ of a multi-hop path $P$ is lower-bounded by its fluid counterpart $F = \gamma_N(g_I, s)$. We then investigate the impact of the packet-train parameters on $Z$.
A. Deviation Phenomena of $Z$

Our next lemma shows that passing through a link can only increase the dispersion random variable in mean, due to the zero mean of $D_{i,\Delta_{i-1}}(r_i)$ and the non-negative mean of $R_i$, which immediately follows from (23).

Lemma 4: For $1 \leq i \leq N$, $E[G_i] - E[G_{i-1}] = E[R_i]/(n-1) \geq 0$.
Using the first part of (22), our next lemma shows that for any link $L_i$, the output dispersion random variable $G_i$ is lower-bounded in mean by a linear combination of the output dispersion random variables $G_k$, $k < i$.

Lemma 5: For $1 \leq i \leq N$, the output dispersion random variable $G_i$ satisfies the following inequality:

$$E[G_i] - \frac{1}{C_i}\left(\sum_{k=1}^{i} x\Gamma_{k,i}\, E[G_{k-1}] + s\right) = \frac{E[\tilde I_i]}{n-1} \geq 0. \qquad (25)$$

From Lemma 4 and Lemma 5, we get

$$E[G_i] \geq \max\!\left(E[G_{i-1}],\ \frac{\sum_{k=1}^{i} x\Gamma_{k,i}\, E[G_{k-1}] + s}{C_i}\right). \qquad (26)$$

This leads to the following theorem.
Theorem 2: For any input dispersion $g_I$ and packet-train parameters $s$ and $n$, the output dispersion random variable $G_N$ of path $P$ is lower-bounded in mean by the output dispersion $\gamma_N(g_I, s)$ of the fluid counterpart of $P$:

$$E[G_N(g_I, s, n)] \geq \gamma_N(g_I, s). \qquad (27)$$

Proof: We apply mathematical induction to $i$. When $i = 0$, $E[G_0] = \gamma_0 = g_I$. Assuming that (27) holds for $0 \leq i < N$, we next prove that it also holds for $i = N$. Recalling (26), we have

$$E[G_N] \geq \max\!\left(E[G_{N-1}],\ \frac{\sum_{k=1}^{N} x\Gamma_{k,N}\, E[G_{k-1}] + s}{C_N}\right) \geq \max\!\left(\gamma_{N-1},\ \frac{\sum_{k=1}^{N} x\Gamma_{k,N}\, \gamma_{k-1} + s}{C_N}\right) = \gamma_N,$$

where the second inequality is due to the induction hypothesis, and the last equality follows from Theorem 1.
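For intuition, the fluid bound $\gamma_N$ appearing in Theorem 2 can be computed by iterating the relation $\gamma_i = \max\big(\gamma_{i-1}, (\sum_k x\Gamma_{k,i}\gamma_{k-1} + s)/C_i\big)$ used in the proof. The minimal sketch below (ours, not the paper's code) handles the special case of one-hop persistent routing, where only the $k = i$ term of the sum survives; the numeric configuration imitates the three-hop testbed setup of Section V:

```python
def fluid_gap_curve(g_in, s, capacities, cross_rates):
    """Fluid output dispersion gamma_N for one-hop persistent cross-traffic.

    With one-hop persistent routing only the k = i term of the sum survives:
    gamma_i = max(gamma_{i-1}, (x_i * gamma_{i-1} + s) / C_i).
    """
    gamma = g_in
    for C_i, x_i in zip(capacities, cross_rates):
        gamma = max(gamma, (x_i * gamma + s) / C_i)
    return gamma

s = 12000.0                      # 1500-byte probe, in bits
caps = [96e6, 96e6, 96e6]        # link capacities (bit/s)
rates = [20e6, 40e6, 60e6]       # one-hop persistent cross-traffic rates
for r_in in (30e6, 50e6, 90e6):  # probing input rates
    g_in = s / r_in
    g_out = fluid_gap_curve(g_in, s, caps, rates)
    print(r_in, r_in / (s / g_out))   # fluid rate response at this input rate
```

Below the smallest hop available bandwidth the dispersion passes through unchanged; above it, each congested hop stretches the train per the fluid model.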
Theorem 2 shows that in the entire input gap range, the piecewise-linear fluid gap response curve $F$ discussed in Section II is a lower bound of the real gap curve $Z$. Further, combining Theorem 1, Lemma 4, and Lemma 5, we can obtain the deviation between the real curve $Z$ and its fluid lower bound $F$. This deviation, denoted by $\beta_N(g_I, s, n)$ or $\beta_N$ for short, can be expressed recursively as follows, where we let $\beta_0 = 0$:

$$\beta_i = \begin{cases} \beta_{i-1} + \dfrac{E[R_i]}{n-1} & \gamma_i = \gamma_{i-1} \\[2mm] \dfrac{1}{C_i}\displaystyle\sum_{k=1}^{i} x\Gamma_{k,i}\,\beta_{k-1} + \dfrac{E[\tilde I_i]}{n-1} & \gamma_i > \gamma_{i-1}. \end{cases} \qquad (28)$$
In what follows, we study the asymptotics of the curve deviation $\beta_N$ as the input packet-train parameter $s$ or $n$ becomes large, and show that the fluid lower bound $F$ is in fact a tight bound of the real response curve $Z$.
B. Convergence Properties of $Z$ for Long Trains

We now show that when the packet size $s$ is kept constant, as the packet-train length $n \to \infty$, the output dispersion random variable $G_N(g_I, s, n)$ of path $P$ converges in the mean-square sense to its fluid lower bound $\gamma_N(g_I, s)$, for any $g_I$ and any $s$. This means that not only does $E[G_N]$ converge to $\gamma_N$, but also the variance of $G_N$ decays to 0 as $n$ increases. We first prove this result for a single-hop path. We then apply mathematical induction to extend the conclusion to any multi-hop path with arbitrary cross-traffic routing under the stationarity approximation.

Theorem 3: Under the stationarity approximation of this paper, for a single-hop path $P$ with capacity $C$ and cross-traffic intensity $\lambda < C$, for any input dispersion $g_I \in (0, \infty)$ and probing packet size $s$, the output dispersion random variable $G$ converges to its fluid lower bound $\gamma$ in the mean-square sense as $n \to \infty$:

$$\lim_{n\to\infty} E\!\left[\left(G(g_I, s, n) - \max\!\left(g_I, \frac{\lambda g_I + s}{C}\right)\right)^2\right] = 0. \qquad (29)$$
Proof: First consider the case $s/g_I < C - \lambda$. We examine the output sampling interval random variable $\Delta = (n-1)G$. The key is to view the first and last packets in the input packet-train as a packet-pair, and to view the other packets in between as if they were from another cross-traffic flow $f$. The real cross-traffic and $f$ together form a flow aggregation denoted by $p$. Obviously, the packet arrivals in $p$ are still ergodic stationary. The long-term arrival rate of $p$ is $\lambda + s/g_I < C$. The workload-difference process $D_\delta(p)$ is a zero-mean process. According to Lemma 3, $\Delta$ can be expressed as

$$\Delta = (n-1)g_I + D_\delta(p) + R, \qquad (30)$$

where $\delta = (n-1)g_I$ is the sampling interval of the input packet-train and $R = \max\left(0, (s - B_\delta(p)\delta)/C\right)$ is the extra queuing delay (besides the queuing delay imposed by flow aggregation $p$) imposed on the last probing packet by the first probing packet in the train. The output dispersion $G = \Delta/(n-1)$ can be expressed as

$$G = g_I + \frac{D_\delta(p)}{n-1} + \max\!\left(0, \frac{s - B_\delta(p)\delta}{C(n-1)}\right). \qquad (31)$$

Notice that, as $n$ increases, the second additive term converges to 0 in the mean-square sense. That is,

$$\lim_{n\to\infty} E\!\left[\left(\frac{D_\delta(p)}{n-1}\right)^2\right] = \lim_{n\to\infty} \frac{2\,Var[W(p)]}{(n-1)^2} = 0, \qquad (32)$$

where the first equality is due to Lemma 2. The third term on the right-hand side of (31) also converges to 0 in the mean-square sense:

$$\lim_{n\to\infty} E\!\left[\left(\frac{\max(0,\ s - B_\delta(p)\delta)}{C(n-1)}\right)^2\right] \leq \lim_{n\to\infty} \frac{s^2}{C^2(n-1)^2} = 0. \qquad (33)$$

Combining (31), (32), and (33), we get

$$\lim_{n\to\infty} E\!\left[\left(G(g_I, s, n) - g_I\right)^2\right] = 0. \qquad (34)$$
Now consider the case $s/g_I > C - \lambda$. We again examine the sampling interval $\Delta$; according to Lemma 3, we have

$$\Delta = \frac{Y_\delta(p)\,\delta}{C} + \frac{s}{C} + \tilde I. \qquad (35)$$

The last term on the right side of (35) is the hop idle time during the sampling interval of the packet-train, and can be computed as $\tilde I = \max\left(0, B_\delta(p)\delta - s\right)/C$. The output dispersion $G = \Delta/(n-1)$ can be expressed as

$$G = \frac{Y_\delta(p)\,\delta}{(n-1)C} + \frac{s}{(n-1)C} + \frac{\max\left(0, B_\delta(p)\delta - s\right)}{C(n-1)}. \qquad (36)$$

The first additive term in (36) converges in the mean-square sense to $(\lambda g_I + s)/C$, as shown in the following:

$$\lim_{n\to\infty} E\!\left[\left(\frac{Y_\delta(p)\delta - (n-1)(\lambda g_I + s)}{(n-1)C}\right)^2\right] = \frac{g_I^2}{C^2} \lim_{\delta\to\infty} E\!\left[\left(Y_\delta(p) - \left(\lambda + \frac{s}{g_I}\right)\right)^2\right] = 0, \qquad (37)$$

where the second equality is due to the mean-square ergodicity of the flow aggregation $p$. The second term in (36) is deterministic, and its square converges to 0 as $n \to \infty$. The third term in (36) converges in the mean-square sense to 0 as $n$ increases. To show this, first notice that since the arrival rate of $p$ is greater than the hop capacity $C$, we have

$$\lim_{\delta\to\infty} E[B_\delta(p)] = 0. \qquad (38)$$

Further notice that $B_\delta(p)$ is distributed in the finite interval $[0, C]$ and is always non-negative. Hence, (38) implies that the second moment of $B_\delta(p)$ also converges to 0 as $\delta$ increases:

$$0 \leq \lim_{\delta\to\infty} E\!\left[\left(B_\delta(p)\right)^2\right] \leq \lim_{\delta\to\infty} E\!\left[C B_\delta(p)\right] = 0. \qquad (39)$$

This leads to the following:

$$0 \leq \lim_{n\to\infty} E\!\left[\left(\frac{\max\left(0, B_\delta(p)\delta - s\right)}{C(n-1)}\right)^2\right] \leq \lim_{n\to\infty} E\!\left[\left(\frac{B_\delta(p)\,\delta}{C(n-1)}\right)^2\right] = \lim_{\delta\to\infty} \left(\frac{g_I}{C}\right)^2 E\!\left[\left(B_\delta(p)\right)^2\right] = 0. \qquad (40)$$

Combining (36), (37), and (40), we get

$$\lim_{n\to\infty} E\!\left[\left(G(g_I, s, n) - \frac{\lambda g_I + s}{C}\right)^2\right] = 0. \qquad (41)$$

Combining (34) and (41), the theorem follows.
Our next theorem extends this result to multi-hop paths with arbitrary cross-traffic routing.
Theorem 4: Under the stationarity approximation, for any $N$-hop path $P$ with arbitrary cross-traffic routing, for any input dispersion $g_I \in (0,\infty)$ and any probing packet size $s$, the random variable $G_N$ converges to its fluid lower bound $\gamma_N$ in the mean-square sense as $n \to \infty$:

$$\lim_{n\to\infty} E\!\left[\left(G_N(g_I, s, n) - \gamma_N(g_I, s)\right)^2\right] = 0. \qquad (42)$$
Proof: We apply induction to $i$. When $i = 1$, the conclusion holds due to Theorem 3. Assuming that (42) holds for all $i < N$, we next show that it also holds for $i = N$. We apply the same method as in the proof of Theorem 3. We view the first and last probing packets $p_1$ and $p_n$ as a packet-pair, and view the rest of the probing packets in the train as if they were from another cross-traffic flow $f$. We denote the aggregation of $r_N$ and $f$ by $p$. Due to the stationarity approximation, the traffic arrival in $p$ can be viewed as an ergodic stationary flow when $n$ is sufficiently large. We now examine the average arrival rate of $p$ at link $L_N$. That is, we compute

$$\lambda_p = \lim_{n\to\infty} \frac{E[\Omega_N]}{(n-1)\,E[G_{N-1}(g_I, s, n)]}, \qquad (43)$$

where $\Omega_N$ is the random variable indicating the volume of traffic buffered between $p_1$ and $p_n$ in the outgoing queue of $L_N$. Notice that

$$E[\Omega_N] = E\!\left[\sum_{k=1}^{N} Y_{k,\Delta_{k-1}}(\Gamma_{k,N})\,\Delta_{k-1}\right] + (n-1)s, \qquad (44)$$

where $\Delta_{k-1} = (n-1)G_{k-1}$ is the sampling interval of the input packet-pair $p_1$ and $p_n$ at $L_k$. Substituting (44) back into (43), we get the following due to the induction hypothesis:

$$\lambda_p = \lim_{n\to\infty} \frac{\sum_{k=1}^{N} E\!\left[Y_{k,\Delta_{k-1}}(\Gamma_{k,N})\,G_{k-1}\right] + s}{E[G_{N-1}(g_I, s, n)]} = \frac{\sum_{k=1}^{N} x\Gamma_{k,N}\,\gamma_{k-1} + s}{\gamma_{N-1}}. \qquad (45)$$
We now consider the case $\lambda_p < C_N$, which implies $\gamma_N = \gamma_{N-1}$ due to Theorem 1. Further, from Lemma 3, we have

$$\Delta_N = \Delta_{N-1} + D_{N,\Delta_{N-1}}(p) + R_N, \qquad (46)$$

where $R_N = \max\left(0,\ s - B_{N,\Delta_{N-1}}(p)\,\Delta_{N-1}\right)/C_N$ is the extra queuing delay at link $N$ (besides the queuing delay imposed by aggregation $p$) imposed by $p_1$ on $p_n$. Dividing both sides of (46) by $n-1$, we get the following expression for $G_N$:

$$G_N = G_{N-1} + \frac{D_{N,\Delta_{N-1}}(p)}{n-1} + \frac{\max\left(0,\ s - B_{N,\Delta_{N-1}}(p)\,\Delta_{N-1}\right)}{C_N(n-1)}. \qquad (47)$$

As $n \to \infty$, the first additive term $G_{N-1}$ in (47) converges to $\gamma_{N-1}$ in the mean-square sense due to the induction hypothesis. The other two terms converge to 0 in the mean-square sense; the proofs are similar to those of (32) and (33), and we omit the details. Hence, $G_N$ converges to $\gamma_N = \gamma_{N-1}$ in the mean-square sense:

$$\lim_{n\to\infty} E\!\left[\left(G_N - \gamma_N\right)^2\right] = 0. \qquad (48)$$
Now consider the case $\lambda_p > C_N$. From Theorem 1, we have

$$\gamma_N = \frac{\sum_{k=1}^{N} x\Gamma_{k,N}\,\gamma_{k-1} + s}{C_N}. \qquad (49)$$

Further, according to Lemma 3, we have

$$\Delta_N = \frac{Y_{N,\Delta_{N-1}}(p)\,\Delta_{N-1}}{C_N} + \frac{s}{C_N} + \tilde I_N, \qquad (50)$$

where $\tilde I_N$ is the hop idle time of $L_N$ during the sampling interval of the packet train, which can be expressed as

$$\tilde I_N = \max\!\left(0, \frac{B_{N,\Delta_{N-1}}(p)\,\Delta_{N-1} - s}{C_N}\right). \qquad (51)$$

Dividing both sides of (50) by $n-1$, we get

$$G_N = \frac{Y_{N,\Delta_{N-1}}(p)\,G_{N-1}}{C_N} + \frac{s}{(n-1)C_N} + \frac{\tilde I_N}{n-1}. \qquad (52)$$

The first additive term of (52) converges in the mean-square sense to $\lambda_p \gamma_{N-1}/C_N$. We omit the proof details, but point out that this step requires the variance of $Y_{N,\delta}(p)$ to be uniformly bounded by some constant for all $\delta$, which we have justified previously. The second term is deterministic, and its square converges to 0 as $n \to \infty$. The third term converges to 0 in the mean-square sense as $n$ increases. To prove this, we first show that $B_{N,\Delta_{N-1}}(p)$ converges in mean-square to 0. Letting $P(x)$ be the distribution function of $G_{N-1}$, we have

$$\lim_{n\to\infty} E\!\left[B_{N,\Delta_{N-1}}(p)^2\right] = \lim_{n\to\infty} \int_0^\infty E\!\left[B_{N,(n-1)x}(p)^2\right] dP(x) = \int_0^\infty \lim_{n\to\infty} E\!\left[B_{N,(n-1)x}(p)^2\right] dP(x) = \int_0^\infty 0\, dP(x) = 0, \qquad (53)$$

where the interchange between the limit and the integration is valid because the second-order moment of $B_{N,\delta}(p)$ is uniformly bounded by $C_N^2$ for all $\delta$. Next, recalling (51) and using an argument similar to (40), we easily get

$$\lim_{n\to\infty} E\!\left[\left(\frac{\tilde I_N}{n-1}\right)^2\right] = 0. \qquad (54)$$

Combining the results for all three additive terms in (52), we conclude that when $\lambda_p > C_N$, $G_N$ converges in mean-square to $\lambda_p \gamma_{N-1}/C_N$, which equals $\gamma_N$ due to (45) and Theorem 1. Combining the two cases, we complete the inductive step, and the theorem follows.
Let us make several comments on the conditions of these results. First note that in the multi-hop case, the stationarity approximation is needed even when the cross-traffic routing is one-hop persistent. The reason is that when $n$ is large, the probing packet-train is itself viewed as a flow, whose arrival characteristics at all but the first hop are addressed by the stationarity approximation. Second, we point out again that the key step in these two proofs is to view the first and last packets in the train as a packet-pair and the other probing packets as a cross-traffic flow. This makes tractable the analysis of the extra queuing delay term $R_i$ and the hop idle time term $\tilde I_i$ in Lemma 3.
Theorem 4 shows that as the packet-train length $n$ increases while $s$ is kept constant, not only does $E[G_N]$ converge to its fluid bound $\gamma_N$, but the variance of $G_N$ also decays to 0. This means that we can expect almost the same output dispersion in different probings.
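The single-hop convergence behavior behind these results can be reproduced with a toy FIFO simulation (our own sketch, not the paper's experiment: Poisson cross-traffic with fixed-size packets, and all numeric parameters are invented). The sample mean of $G$ approaches the fluid bound $(\lambda g_I + s)/C$ and its variance shrinks as $n$ grows:

```python
import random

def fifo_departures(pkts, C):
    """FIFO departure times for (arrival_time, size) pairs sorted by arrival."""
    d, out = 0.0, []
    for a, size in pkts:
        d = max(d, a) + size / C      # wait for the queue, then transmit
        out.append(d)
    return out

def probe_once(rng, C, lam, xsize, gI, s, n, warmup=0.5):
    """Output dispersion G of one n-packet train through a single busy hop."""
    horizon = warmup + n * gI + 1.0
    cross, t = [], 0.0
    while t < horizon:
        t += rng.expovariate(lam / xsize)   # Poisson cross-traffic packets
        cross.append((t, xsize))
    train = [(warmup + k * gI, s) for k in range(n)]
    train_set = set(train)
    merged = sorted(cross + train)
    dep = fifo_departures(merged, C)
    probe_dep = [dep[i] for i, p in enumerate(merged) if p in train_set]
    return (probe_dep[-1] - probe_dep[0]) / (n - 1)

rng = random.Random(7)
C, lam, xsize, s = 10e6, 4e6, 8000.0, 12000.0   # bit/s and bits
gI = s / 8e6                                    # input rate 8 mb/s > C - lam
fluid = (lam * gI + s) / C                      # fluid bound gamma
stats = {}
for n in (2, 16, 64):
    G = [probe_once(rng, C, lam, xsize, gI, s, n) for _ in range(200)]
    mean = sum(G) / len(G)
    var = sum((g - mean) ** 2 for g in G) / len(G)
    stats[n] = (mean, var)
    print(n, mean / fluid, var)     # mean ratio toward 1, variance decays
```

With pairs ($n = 2$) the output dispersion sits visibly above the fluid bound and fluctuates; with long trains it concentrates around the bound, matching the deviation-and-convergence picture of this section.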
C. Convergence Properties of $Z$ for Large Packet Size

We now state without proof the condition under which the curve deviation $\beta_N(s/r_I, s, n)$ vanishes as the probing packet size $s$ approaches infinity while $n$ and $r_I$ are kept constant.

Assumption 2: For any flow aggregation $p$ at link $L_i$, denote by $P_{i,\delta}(x)$ the distribution function of the $\delta$-interval available bandwidth process $\{B_{i,\delta}(p,t)\}$. We assume that for all $1 \leq i \leq N$, the following holds:

$$P_{i,\delta}(r) = o\!\left(\frac{1}{\delta^2}\right) \quad r < C_i - x_p, \qquad P_{i,\delta}(r) = 1 - o\!\left(\frac{1}{\delta^2}\right) \quad r > C_i - x_p. \qquad (55)$$

Recall that the mean-square ergodicity assumption we made earlier implies that, as the observation interval $\delta$ gets large, the random variable $B_{i,\delta}(p)$ converges in distribution to $C_i - x_p$. Assumption 2 further ensures that this convergence is fast in the sense of (55).
Our next theorem formally states the convergence property of the output dispersion random variable $G_N(s/r_I, s, n)$ as $s$ increases.

Theorem 5: Given the stationarity approximation and Assumption 2, for any $N$-hop path $P$ with arbitrary cross-traffic routing and any input rate $r_I$, the output dispersion random variable $G_N$ of path $P$ converges in mean to its fluid lower bound $\gamma_N$:

$$\lim_{s\to\infty}\left( E\!\left[G_N\!\left(\frac{s}{r_I}, s, n\right)\right] - \gamma_N\!\left(\frac{s}{r_I}, s\right)\right) = 0. \qquad (56)$$

The asymptotic variance of $G_N$ as $s$ increases is upper-bounded by some constant $K_N$:

$$\lim_{s\to\infty} E\!\left[\left(G_N\!\left(\frac{s}{r_I}, s, n\right) - \gamma_N\!\left(\frac{s}{r_I}, s\right)\right)^2\right] \leq K_N. \qquad (57)$$
In [10], we proved this result in the special case of packet-pair probing and one-hop cross-traffic routing. The proof, however, can be easily extended to the general setting stated in the above theorem. Note that the bounded variance stated in (57) is an inseparable part of the whole theorem. This is because, in a mathematical induction proof, the mean convergence of $G_N$ to $\gamma_N$ can be obtained only when the mean of $G_{N-1}$ converges to $\gamma_{N-1}$ and the variance of $G_{N-1}$ remains bounded as the probing packet size $s \to \infty$.

Even though in practice the packet size is limited by the path MTU, Theorem 5 is still important because it justifies the use of large probing packet sizes in bandwidth estimation.
D. Discussion

Among the assumptions in this paper, some are critical in leading to our results, while others are only meant to simplify the discussion. We point out that the distributional stationarity assumption on cross-traffic arrivals can be greatly relaxed without harming our major results. This can be intuitively understood by noting that the response curve deviation phenomena are caused by cross-traffic burstiness, and that the curve convergence properties result from the diminishing burstiness of cross-traffic over asymptotically long observation time scales. Neither fact depends on traffic stationarity. Hence, the same results can be obtained without the stationarity approximation, but at the expense of much more intricate notation and derivations. This is because, when cross-traffic arrivals are allowed to be only second-order stationary or even non-stationary, the output dispersion process $\{G_N(m)\}$ is no longer identically distributed. Consequently, the analysis of probing response curves cannot be reduced to the investigation of a single output dispersion random variable. Moreover, we would also have to rely on an ASTA assumption on packet-train probing [8] to derive the results in this paper, which we have avoided in the present setting.
On the other hand, mean-square ergodicity plays a central role in the proofs of Theorem 4 and Theorem 5. A cross-traffic flow with mean-square ergodicity, when observed over a large timescale, has an almost constant arrival rate. This "asymptotically fluid-like" property is very common among the vast majority of traffic models in the stochastic literature, and can be decoupled from any type of traffic stationarity. Consequently, our results have broad applicability in practice.

Next, we provide experimental evidence for our theoretical results using testbed experiments and real Internet measurement data.
V. EXPERIMENTAL VERIFICATION

In this section, we measure the response curves in both testbed and real Internet environments. The results not only provide experimental evidence for our theory, but also give a quantitative idea of the curve deviation in (28). To obtain the statistical mean of the probing output dispersions, we rely on direct measurements using a number of probing samples. Even though this approach can hardly produce a smooth response curve, the bright side is that it allows us to observe the output dispersion variance, reflected in the degree of smoothness of the measured response curve.
A. Testbed Experiments

In our first experiment, we measure in the Emulab testbed [3] the response curves of a three-hop path with one-hop persistent cross-traffic routing and the following configuration matrix (all in mb/s):

$$H = \begin{pmatrix} 96 & 96 & 96 \\ 20 & 40 & 60 \end{pmatrix}. \qquad (58)$$
We generate cross-traffic using three NLANR [14] traces. All
inter-packet delays in each trace are scaled by a common factor
[Fig. 2. Measured response curves using different packet-train lengths in the Emulab testbed: (a) one-hop persistent routing; (b) path-persistent routing. Each panel plots $r_I/(s/E[G_N])$ against the probing input rate $r_I$ (mb/s) for $n = 2, 9, 33, 65$, together with the multi-hop fluid (m-fluid) and single-hop fluid (s-fluid) curves.]
so that the average rate over the trace duration becomes the desired value. The trace durations after scaling are 1-2 minutes. We measure the average output dispersions at 100 input rates, from 1mb/s to 100mb/s in 1mb/s steps. For each input rate, we use 500 packet-trains with packet size 1500 bytes. The packet-train length $n$ is 65. The inter-probing delay is controlled by a random variable with sufficiently large mean. The whole experiment lasts about 73 minutes. All three traffic traces are replayed at random starting points once the previous round finishes. By recycling the same traces in this fashion, we make the cross-traffic last until the experiment ends without creating periodicity. Also note that the injection of the packet-trains is so arranged that the 500 trains for each input rate are evenly separated over the whole testing period.
This experiment allows us to measure the response curve not only for $n = 65$, but also for any packet-train length $k$ with $2 \leq k < 65$, by simply taking the dispersions of the first $k$ packets in each train. Fig. 2(a) shows the rate response curve $\tilde Z(r_I, s, n)$ for $k = 2, 9, 33$, and $65$, respectively. For comparison purposes, we also plot in the figure the multi-hop fluid curve $\tilde F(r_I)$, computed from Theorem 1, and the single-hop fluid curve $\tilde S(r_I)$ of the tight link $L_3$. The rate response curve $\tilde Z(r_I, s, n)$ is defined as follows:

$$\tilde Z(r_I, s, n) = \frac{r_I}{s/E[G_N(s/r_I, s, n)]}. \qquad (59)$$
First note that the multi-hop fluid rate curve comprises four linear segments separated by the turning points 36mb/s, 56mb/s, and 76mb/s. The last two linear segments have very close slopes and are not easily distinguishable from each other in the figure. We also clearly see that the rate curve asymptotically approaches its fluid lower bound as the packet-train length $n$ increases; the curves for $n = 33$ and $n = 65$ almost coincide with the fluid bound. Also note that the smoothness of a measured curve reflects the variance of the output dispersion random variables: as the packet-train length increases, the measured curve becomes smoother, indicating that the variance of the output dispersions is decaying. These observations all agree with Theorem 4.
Unlike single-hop response curves, which show no deviation from the fluid bound when the input rate $r_I$ is greater than the link capacity, multi-hop response curves usually deviate from their fluid counterparts over the entire input range. As we see from Fig. 2(a), even when the input rate is larger than 96mb/s, the measured curves still appear above $\tilde F$. Also observe that the single-hop fluid curve $\tilde S$ of the tight link $L_3$ coincides with the multi-hop fluid curve $\tilde F$ within the input rate range $(0, 56)$ but falls below $\tilde F$ in the input rate range $(56, \infty)$.
Finally, we explain why we chose the link capacities to be 96mb/s instead of the fast-ethernet capacity of 100mb/s. In fact, we initially did set the link capacity to 100mb/s. However, we noticed that the measured curves could not get arbitrarily close to their fluid bound $\tilde F$ computed based on the fast-ethernet capacity. Using pathload to examine the true capacity of each Emulab link, we found that the IP-layer capacities are in fact 96mb/s, not the nominal 100mb/s.
In our second experiment, we change the cross-traffic routing to path-persistent while keeping the path configuration matrix as given by (58). The flow rate vector therefore becomes $(20, 20, 20)$.
We repeat the same packet-train probing experiment, and the results are plotted in Fig. 2(b). The multi-hop fluid rate curve $\tilde F$ still coincides with $\tilde S$ in the input rate range $(0, 56)$. When the input rate is larger than 56mb/s, the curve $\tilde F$ positively deviates from $\tilde S$; however, the amount of deviation is smaller than under one-hop persistent routing. The measured curve approaches the fluid lower bound $\tilde F$ with decaying variance as the packet-train length increases. For $n = 33$ and $n = 65$, the measured curves become hardly distinguishable from $\tilde F$. Also notice that the measured curves exhibit more variance, probably because this routing pattern introduces more inter-flow correlation.
We have conducted experiments using paths with more hops, more complicated cross-traffic routing patterns, and various path configurations. Furthermore, we examined the impact of the probing packet size using ns2 simulations, where the packet size can be set to arbitrarily large values. The results obtained (not shown for brevity) all support our theory very well.
B. Real Internet Measurements

We conducted an extensive packet-train probing experiment covering more than 270 Internet paths in the RON testbed to verify our analysis in real networks. A detailed discussion of this measurement study will be reported in a separate paper; in what follows, we show the results for only two Internet paths. Since neither the path configuration nor the cross-traffic routing information is available for these paths, we are unable to provide the fluid bounds. Therefore, we verify our theory by observing the convergence of the measured curves to a piecewise-linear curve as the packet-train length increases.
In the first experiment, we measure the rate response curve of the path from the RON node lulea in Sweden to the RON node at CMU. The path has 19 hops and a fast-ethernet minimum capacity, as we found using traceroute and pathrate. We probe the path at 29 different input rates, from 10mb/s to 150mb/s in 5mb/s steps. For each input rate, we use 200 packet-trains of 33 packets each to estimate the output probing rate $s/E[G_N]$. The whole experiment takes about 24 minutes. Again, the 200 packet-trains for each of the 29 input rates are so arranged that
[Fig. 3. Measured response curves of two Internet paths in the RON testbed: (a) lulea → CMU, measured on Jan. 16, 2005; (b) ana1-gblx → Cornell, measured on April 29, 2005. Each panel plots $r_I/(s/E[G_N])$ against the probing input rate $r_I$ (mb/s) for increasing packet-train lengths $n$.]
they are approximately evenly separated during the 24-minute testing period. The measured rate response curves for packet-train lengths 2, 3, 5, 9, 17, and 33 are plotted in Fig. 3(a), where we see that the response curve approaches a piecewise-linear bound as the packet-train length increases. At the same time, response curves measured using long trains are smoother than those measured using short trains, indicating the decaying variance of the output dispersions. In this experiment, the curve measured using probing trains of 33 packets exhibits sufficient smoothness and clear piecewise linearity. We observe two linear segments in the figure; a further investigation shows that the fluid bound of this 19-hop path has only two linear segments.
Based on (15), we apply linear regression to the second linear segment to compute the capacity $C_b$ and the cross-traffic intensity $\lambda_b$ of the tight link, obtaining $C_b = 96$mb/s and $\lambda_b = 2$mb/s. Using these results, we retroactively plot the single-hop fluid bound and observe that it almost overlaps with the measured curve obtained using packet-trains of 33 packets. Notice that the bottleneck link is very lightly utilized during our 24-minute measurement period. We can also infer from our measurements that the available bandwidth of the path is constrained mainly by the capacity of the bottleneck link, and that the probing packet-trains have undergone significant interaction with cross-traffic at non-bottleneck links. Otherwise, according to Theorem 3 in [8], the response curves measured using short train lengths would not have appeared above the single-hop fluid bound when the input rate is larger than the tight-link capacity of 96mb/s. We believe that the tight link of the path is one of the last-mile, lightly utilized fast-ethernet links, and that the backbone links carry a significant amount of cross-traffic even though their available bandwidth remains well above the fast-ethernet capacity. Also notice that, similar to our testbed experiments, the fast-ethernet links have only 96mb/s of IP-layer capacity.
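The regression step can be sketched as follows: on the second linear segment, the single-hop fluid rate curve is $\tilde S(r_I) = (r_I + \lambda_b)/C_b$, so a least-squares line fit yields $C_b$ from the slope and $\lambda_b$ from the intercept. The data points below are synthetic stand-ins for the measured curve (the real measurements are not reproduced here):

```python
import numpy as np

def tight_link_from_segment(r_in, z_vals):
    """Fit Z = r/C_b + lambda_b/C_b on an (assumed) linear segment."""
    slope, intercept = np.polyfit(r_in, z_vals, 1)
    C_b = 1.0 / slope
    lam_b = intercept * C_b
    return C_b, lam_b

# Synthetic segment consistent with the reported fit: C_b = 96 mb/s,
# lambda_b = 2 mb/s, i.e. Z = (r + 2) / 96 over the segment.
r = np.arange(100.0, 150.0, 5.0)     # input rates on the segment (mb/s)
z = (r + 2.0) / 96.0
C_b, lam_b = tight_link_from_segment(r, z)
print(C_b, lam_b)
```

On noisy measured points the same fit averages out the residual burstiness, which is why long trains (smoother curves) give more reliable segment fits.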
We repeat the same experiment on another path from the
RON node ana1-gblx in Anaheim California to the Cornell
RON node. This path has 21 hops and a fast-ethernet minimum
capacity. Due to substantial cross-traffic burstiness along the
path, we use packet-trains of 129-packet length in our probing
experiment. The other parameters such as the input rates and
the number of trains used for each rate are the same as in
the previous experiment. The whole measurement duration is
about 20 minutes. The measured response curves are plotted
in Fig. 3(b). As we see, the results exhibit more measurement
variability than on the lulea→CMU path. However,
as packet-train length increases, the variability is gradually
smoothed out and the response curve converges to a piece-
wise linear bound, where we can observe three linear segments
this time. The second linear segment roughly falls into the
input rate range from 50mb/s to 100mb/s. We again apply
linear regression to the second segment of the response curve
measured using packet-train length 129 to obtain the tight
link information. We get Cb = 165mb/s and λb = 117mb/s,
suggesting a heavily utilized (70%) non-access tight link,
which differs from the fast-ethernet narrow link.
VI. IMPLICATIONS
We now discuss the implications of our results for existing
measurement proposals. Except for pathChirp, all of the other
techniques, such as TOPP, pathload, PTR, and Spruce, relate
directly to our analysis.
A. TOPP
TOPP is based on the multi-hop fluid rate response curve
$\tilde{F}$ with one-hop persistent cross-traffic routing. TOPP uses
packet-pairs to measure the real rate response curve $\tilde{Z}$, and
assumes that the measured curve is the same as $\tilde{F}$ when a
large number of packet-pairs are used. However, our analysis
shows that the real curve $\tilde{Z}$ differs from $\tilde{F}$, especially
when packet-trains of short length are used (e.g., packet-pairs).
Note that there is not much path information in $\tilde{Z}$ that is
readily extractable unless it is sufficiently close to its fluid
counterpart $\tilde{F}$. Hence, to put TOPP to work in practice, one
must use long packet-trains instead of packet-pairs.
B. Spruce
Using the notation of this paper, we can write spruce's
available bandwidth estimator as
$$C_b\left(1 - \frac{G_N(s/C_b, s, n) - s/C_b}{s/C_b}\right), \qquad (60)$$
where the probing packet size s is set to 1500 bytes, the packet-
train length n = 2, and the bottleneck link capacity Cb is
assumed known.
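As a concrete reading of (60), the following minimal sketch evaluates the estimator; the function name and units are our own, not spruce's actual code:

```python
def spruce_estimate(C_b, gap_out, s=1500 * 8):
    """Spruce estimator (60): C_b * (1 - (g_out - g_in)/g_in),
    where g_in = s/C_b is the input gap of a packet pair sent at
    the bottleneck capacity.  s in bits, C_b in bits/s, gaps in s."""
    g_in = s / C_b
    return C_b * (1.0 - (gap_out - g_in) / g_in)

# single-hop fluid sanity check: cross traffic of rate lam stretches
# the gap to g_in * (1 + lam/C_b), so the estimate recovers C_b - lam
C_b, lam = 96e6, 20e6
g_out = (1500 * 8 / C_b) * (1.0 + lam / C_b)
est = spruce_estimate(C_b, g_out)     # ~76e6, the available bandwidth
```

Under the single-hop fluid model the estimate equals Cb − λ exactly, which is the unbiasedness property discussed next.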
It is shown in [8] that the spruce estimator is unbiased in
single-hop paths regardless of the packet-train parameters s
and n. This means that the statistical mean of (60) is equal to
A_P for any s > 0 and any n ≥ 2. In a multi-hop path P, a
necessary condition for maintaining the unbiasedness property of
the spruce estimator is
$$\tilde{Z}(C_b, s, n) = \frac{\lambda_b + C_b}{C_b} = \tilde{S}(C_b). \qquad (61)$$
This means that at the input rate point Cb, the real rate
response of path P must be equal to the single-hop fluid rate
response at the tight link of P.
[Figure 4 plots the rate response rI/rO versus the input rate rI for the three curves $\tilde{Z}$, $\tilde{F}$, and $\tilde{S}$, marking the elastic deviation of $\tilde{Z}$ from $\tilde{F}$ and the non-elastic deviation of $\tilde{F}$ from $\tilde{S}$ between the rates A_P, Cb, and s/α2.]
Fig. 4. Illustration of two types of curve deviations.
This condition is usually not satisfied. Instead, due to
Theorem 2 and Property 4, we have
$$\tilde{Z}(C_b, s, n) \;\geq\; \tilde{F}(C_b) \;\geq\; \tilde{S}(C_b). \qquad (62)$$
This implies that (60) is a negatively biased estimator of A_P.
The amount of bias is given by
$$C_b\left(\tilde{Z}(C_b, s, n) - \tilde{F}(C_b)\right) + C_b\left(\tilde{F}(C_b) - \tilde{S}(C_b)\right). \qquad (63)$$
The first additive term in (63) is the measurement bias caused
by the deviation of $\tilde{Z}$ from $\tilde{F}$ at input rate Cb, which
vanishes as n → ∞ due to Theorem 4. Hence we call it elastic
bias. The second additive term is the portion of the measurement
bias caused by the deviation of $\tilde{F}$ from $\tilde{S}$ at input
rate Cb, which remains constant with respect to the packet-
train parameters s and n. Therefore it is non-elastic. We
illustrate the two types of curve deviations in Fig. 4. Note
that when Cb < s/α2, the non-elastic bias is 0. Further recall
that s/α2 ≥ Ab2, as stated in Property 3. Hence, a sufficient
condition for zero non-elastic bias is Cb < Ab2. Conceptually,
elastic deviation stems from cross-traffic burstiness, while non-
elastic deviation is a consequence of multi-hop effects.
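The decomposition in (63) is simple arithmetic once the curve deviations are measured. A sketch using the Emulab-1 deviations reported in Table II (the deviations 0.56 and 0.315 come from our measurements; the function name is ours):

```python
def spruce_bias_terms(C_b, dev_Z_F, dev_F_S):
    """The two additive terms of (63): the elastic bias
    C_b*(Z - F), which vanishes as the train length n grows,
    and the non-elastic bias C_b*(F - S), which does not."""
    elastic = C_b * dev_Z_F
    non_elastic = C_b * dev_F_S
    return elastic, non_elastic

# Emulab-1 row of Table II: Z - F = 0.56, F - S = 0.315, C_b = 96 mb/s
elastic, non_elastic = spruce_bias_terms(96.0, 0.56, 0.315)
```

The two terms evaluate to roughly 53.8mb/s and 30.2mb/s, which together account for the total Emulab-1 bias listed in Table II.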
In Table II, we give the amount of measurement bias caused by
the two types of curve deviations in both the Emulab testbed
experiments and the real Internet probing measurement on the
path from lulea to CMU. Note that in the testbed experiment
using a 3-hop path with one-hop persistent routing, spruce
suffers about 84mb/s of measurement bias, more than twice
the actual path available bandwidth of 36mb/s. In the
second Emulab experiment, using path-persistent cross-traffic,
the measurement bias is reduced to 38.8mb/s, which however
is still more than the actual available bandwidth. In both cases,
the spruce estimator converges to negative values. We used spruce
to estimate the two paths and it did in fact give 0mb/s results
in both cases. For the Internet path from lulea to CMU, spruce
suffers 24mb/s of negative bias and produces a measurement
result below 70mb/s, while the real value is around 94mb/s.
We also used pathload to measure the three paths and observed
that it produces fairly accurate results.
The way to reduce elastic bias is to use long packet-
trains instead of packet-pairs. In the lulea→CMU experiment,
using packet-trains of 33 packets, spruce can almost completely
overcome the 24mb/s bias and produce an accurate result.
However, there are two problems with using long packet-trains.
TABLE II
SPRUCE BIAS IN EMULAB AND INTERNET EXPERIMENTS (IN MB/S).

experiment   elastic bias   non-elastic bias   total bias
Emulab-1     0.56 × 96      0.315 × 96         84.4
Emulab-2     0.28 × 96      0.125 × 96         38.8
lulea-cmu    0.25 × 96      0                  24
First, there is no deterministic train length that guarantees
negligible measurement bias on every network path. Second,
when router buffer space is limited and the packet-train length
is too large, the later probing packets in each train may
experience frequent loss, making it impossible to accurately
measure $\tilde{F}(C_b)$. After all, spruce uses input rate Cb, which
can be too high for the bottleneck router to accommodate long
packet-trains. On the other hand, note that non-elastic bias is
an inherent problem for spruce: there is no way to overcome
it by adjusting the packet-train parameters.
C. PTR and pathload
PTR searches the response curve $\tilde{Z}(r_I, s, n)$ for the first
turning point and takes the input rate at the turning point as
the path available bandwidth A_P. This method can produce an
accurate result when the real response curve $\tilde{Z}$ is close to $\tilde{F}$,
which requires the packet-train length n to be sufficiently large.
Otherwise, PTR is also negatively biased and underestimates
A_P. The minimum packet-train length needed depends
on the path conditions. The current version of PTR uses packet-
train length n = 60, which is probably insufficient for the
Internet path from pwh to CMU examined in this paper.
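The turning-point search that PTR performs can be sketched as follows. The curve samples and the tolerance eps are illustrative assumptions, not PTR's actual parameters:

```python
def turning_point(rates, responses, eps=0.02):
    """Scan a measured rate response curve Z(rI), given as parallel
    lists of input rates and rI/rO ratios, for the first input rate
    whose response exceeds 1 + eps; the last rate before that point
    is taken as the available bandwidth estimate."""
    estimate = rates[0]
    for r, z in zip(rates, responses):
        if z > 1.0 + eps:
            break
        estimate = r
    return estimate

# illustrative single-hop fluid curve: C = 96 mb/s, lam = 20 mb/s,
# so the true available bandwidth is 76 mb/s
rates = [float(r) for r in range(10, 101, 2)]
responses = [max(1.0, (r + 20.0) / 96.0) for r in rates]
A_hat = turning_point(rates, responses)   # -> 76.0
```

On a bursty real curve $\tilde{Z}$ that sits above $\tilde{F}$, the response exceeds 1 + eps before rI reaches A_P, which is exactly the negative bias described above.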
Pathload is similar in spirit to PTR. However, it searches
for the available bandwidth region by detecting a one-way-delay
increasing trend within a packet-train, which is different from
examining whether the rate response $\tilde{Z}(r_I, s, n)$ is greater than
one [6]. However, since there is a strong statistical correlation
between a high rate response $\tilde{Z}(r_I, s, n)$ and a one-way-
delay increasing trend within packet-trains, our analysis can
explain the behavior of pathload to a certain extent. Recall that,
as reported in [5], pathload underestimates available bandwidth
when there are multiple tight links along the path. Our results
demonstrate that the deviation of $\tilde{Z}(r_I, s, n)$ from $\tilde{F}$ in the
input rate range (0, A_P) gives rise to a potential underestimation
in pathload. The underestimation is maximized, and becomes
clearly noticeable, when non-bottleneck links have the same
available bandwidth as A_P, all other factors being kept the same.
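Pathload's trend detection can be illustrated with the pairwise comparison test (PCT) described in [5]; the 0.66 threshold follows the published description, while the delay samples below are made up:

```python
def pct_increasing(owds, threshold=0.66):
    """Pairwise Comparison Test: the fraction of consecutive
    one-way-delay samples that increase.  A fraction above the
    threshold signals an increasing trend, i.e., the probing rate
    is judged to exceed the available bandwidth."""
    pairs = list(zip(owds, owds[1:]))
    fraction = sum(1 for a, b in pairs if b > a) / len(pairs)
    return fraction > threshold

rising = pct_increasing([1.0, 1.2, 1.1, 1.4, 1.6, 1.9, 2.0])  # True
flat = pct_increasing([1.0, 1.1, 0.9, 1.2, 0.8, 1.0, 0.9])    # False
```

A transient delay build-up from residual probing intrusion can push this fraction above the threshold even when rI < A_P, which is the underestimation mechanism discussed next.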
Even though multiple tight links cause a one-way-delay
increasing trend for packet-trains with input rate less than
A_P, this is not an indication that the network cannot sustain
such an input rate. Rather, the increasing trend is a transient
phenomenon resulting from residual probing intrusion, and it
disappears when the input packet-train is sufficiently long.
Hence, it is our new observation that by further increasing
the packet-train length, the underestimation in pathload can
be mitigated.
VII. RELATED WORK
Besides the measurement techniques discussed earlier,
Melander et al. [12] first discussed the rate response curve
of a multi-hop network path carrying fluid cross-traffic with
a one-hop persistent routing pattern. Dovrolis et al. [1], [2]
considered the impact of cross-traffic routing on the output
dispersion rate of a packet-train. It was also pointed out that
the output rate of a back-to-back input packet-train (input rate
r_I = C_1, the capacity of the first hop L_1) converges to a point
they call the "asymptotic dispersion rate (ADR)" as the packet-train
length increases. The authors provided an informal justification
as to why ADR can be computed using fluid cross-traffic, and they
demonstrated the computation of ADR for several special path
conditions. Note that using the notation of this paper, ADR
can be expressed as
$$\lim_{n\to\infty} \frac{s}{G_N(s/C_1, s, n)} = \frac{s}{\gamma_N(s/C_1, s)}. \qquad (64)$$
Our work not only formally explains these previous findings, but
also generalizes them to arbitrary input rates and path conditions.
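Under the fluid model, (64) can be evaluated hop by hop with the standard fluid dispersion recursion: each link stretches the per-packet gap to (s + λδ)/C whenever probe plus cross-traffic exceed the capacity. A sketch under these assumptions; the two-hop parameters are invented for illustration:

```python
def fluid_hop(delta, s, C, lam):
    """One hop of the fluid dispersion recursion: the gap grows to
    (s + lam*delta)/C when probe plus cross traffic exceed C,
    and is otherwise preserved."""
    return max(delta, (s + lam * delta) / C)

def adr(s, caps, lams):
    """Asymptotic dispersion rate of a back-to-back train (input
    rate C_1), computed with fluid cross traffic.  s in bits,
    capacities and cross-traffic rates in bits/s."""
    delta = s / caps[0]                    # back-to-back input gap
    for C, lam in zip(caps, lams):
        delta = fluid_hop(delta, s, C, lam)
    return s / delta

# invented two-hop path: 100 mb/s, then 50 mb/s with 25 mb/s cross traffic
rate = adr(s=12000.0, caps=[100e6, 50e6], lams=[0.0, 25e6])   # -> 40e6
```

Here the second hop stretches the gap to (12000 + 25e6·δ)/50e6, giving an ADR of 40mb/s, below both the 50mb/s capacity and the 25mb/s residual bandwidth gap would suggest in isolation.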
Kang et al. [7] analyzed the gap response of a single-hop
path with bursty cross-traffic using packet-pairs, with a focus
on large input probing rates. Liu et al. extended the single-hop
analysis of packet-pairs [11] and packet-trains [8] to arbitrary
input rates and discussed the impact of packet-train parameters.
VIII. CONCLUSION
This paper provides a stochastic characterization of packet-
train bandwidth estimation in a multi-hop path with arbitrarily
routed cross-traffic flows. Our main contributions include the
derivation of the multi-hop fluid response curve as well as
the real response curve, and an investigation of the convergence
properties of the real response curve with respect to the packet-
train parameters. The insights provided in this paper not only
help understand and improve existing techniques, but may also
lead to a new technique that measures tight link capacity.
There are a few unaddressed issues in our theoretical
framework. In our future work, we will identify how various
factors, such as path configuration and cross-traffic routing,
affect the amount of deviation between Z and F. We are also
interested in investigating new approaches that help detect and
eliminate the measurement bias caused by bursty cross-traffic
in multi-hop paths.
REFERENCES
[1] C. Dovrolis, P. Ramanathan, and D. Moore, “What Do Packet Dispersion
Techniques Measure?” in Proc. IEEE INFOCOM, Apr. 2001, pp. 905–
914.
[2] C. Dovrolis, P. Ramanathan, and D. Moore, “Packet Dispersion Tech-
niques and a Capacity Estimation Methodology,” IEEE/ACM Transactions
on Networking, Mar. 2004.
[3] Emulab. [Online]. Available: http://www.emulab.net.
[4] N. Hu and P. Steenkiste, “Evaluation and Characterization of Available
Bandwidth Probing Techniques,” IEEE J. Sel. Areas Commun., vol. 21,
no. 6, pp. 879–894, Aug. 2003.
[5] M. Jain and C. Dovrolis, “End-to-end Available Bandwidth: Measure-
ment Methodology, Dynamics, and Relation with TCP Throughput,” in
Proc. ACM SIGCOMM, Aug. 2002, pp. 295–308.
[6] M. Jain and C. Dovrolis, “Ten Fallacies and Pitfalls in End-to-End
Available Bandwidth Estimation,” in Proc. ACM IMC, Oct. 2004, pp.
272–277.
[7] S. Kang, X. Liu, M. Dai, and D. Loguinov, “Packet-pair Bandwidth
Estimation: Stochastic Analysis of a Single Congested Node,” in Proc.
IEEE ICNP, Oct. 2004.
[8] X. Liu, K. Ravindran, B. Liu, and D. Loguinov, “Single-Hop Probing
Asymptotics in Available Bandwidth Estimation: Sample-Path Analy-
sis,” in Proc. ACM IMC, Oct. 2004, pp. 300–313.
[9] X. Liu, K. Ravindran, and D. Loguinov, “Multi-Hop Probing Asymp-
totics in Available Bandwidth Estimation: Stochastic Analysis,” in Proc.
ACM IMC, Oct. 2005, pp. 173–186.
[10] X. Liu, K. Ravindran, and D. Loguinov, “Multi-Hop Probing Asymp-
totics in Available Bandwidth Estimation: Stochastic Analysis,” City
University of New York, Tech. Rep., 2005.
[11] X. Liu, K. Ravindran, and D. Loguinov, “What Signals Do Packet-pair
Dispersions Carry?” in Proc. IEEE INFOCOM, Mar. 2005, pp. 281–292.
[12] B. Melander, M. Bjorkman, and P. Gunningberg, “A New End-to-End
Probing and Analysis Method for Estimating Bandwidth Bottlenecks,”
in Proc. IEEE Globecom Global Internet Symposium, Nov. 2000, pp.
415–420.
[13] B. Melander, M. Bjorkman, and P. Gunningberg, “Regression-Based
Available Bandwidth Measurements,” in Proc. SPECTS, Jul. 2002.
[14] National Laboratory for Applied Network Research. [Online]. Available:
http://www.nlanr.net.
[15] V. Ribeiro, R. Riedi, R. Baraniuk, J. Navratil, and L. Cottrell,
“PathChirp: Efficient Available Bandwidth Estimation for Network
Paths,” in Proc. Passive and Active Measurement Workshop, Apr. 2003.
[16] J. Strauss, D. Katabi, and F. Kaashoek, “A measurement study of
available bandwidth estimation tools,” in Proc. ACM IMC, Oct. 2003,
pp. 39–44.
Xiliang Liu (ACM’02) received the B.S. degree (with honors) in
computer science from Zhejiang University, Hangzhou, China, in
1994, the M.S. degree in information science from the Institute of
Automation, Chinese Academy of Sciences, Beijing, China, in 1997,
and the Ph.D. degree in computer science from the City University
of New York, New York, in 2005. He currently works for Bloomberg
L.P. His research interests include Internet measurement and
monitoring, overlay networks, bandwidth estimation, and stochastic
modeling and analysis of networked systems.
Kaliappa Ravindran is a faculty member in Computer Science at the
City University of New York, located on the City College campus.
Earlier, he held faculty positions at Kansas State University,
Manhattan, and at the Indian Institute of Science, Bangalore. He
received his Ph.D. in Computer Science from the University of
British Columbia, Canada, and worked in the Canadian communications
industry for a short period before moving to the USA. His research
interests span the areas of service-level management of distributed
networks, compositional design of network protocols, system-level
support for information assurance, distributed collaborative
systems, and Internet architectures. His recent project
relationships with industry include IBM, AT&T, Philips, ITT, and HP.
Besides industry, some of his research has been supported by grants
and contracts from federal government agencies.
Dmitri Loguinov (S’99–M’03) received the B.S. degree (with honors)
in computer science from Moscow State University, Moscow, Russia,
in 1995 and the Ph.D. degree in computer science from the City
University of New York, New York, in 2002. Since September 2002,
he has been an Assistant Professor of computer science with Texas
A&M University, College Station. His research interests include
peer-to-peer networks, Internet video streaming, congestion control,
image and video coding, and Internet traffic measurement and
modeling.
... Further, in the case of random cross traffic, there may not be a single tight link, but the tight link may vary randomly. The consequence is an underestimation of the available bandwidth [9,10] which is analyzed in [11,12]. ...
... • Active available bandwidth estimation: To alleviate the variability of noise-afflicted packet gaps, we use standard packet train probes, which are reported to be more robust to random fluctuations. Although packet trains help to reduce the variability in available bandwidth estimations, they do not take care of the systematic deviations from the deterministic fluid-flow model that cause biased estimates [8,11,21]. Therefore, to reduce the bias in bandwidth estimates, we investigate how to benefit from machine learning techniques while using standard packet train probes for available bandwidth estimation. We propose three methods for active available bandwidth estimation; the first method is based on regression and the other two methods are based on multi-class classification. ...
... Moreover, in the case of random cross traffic, there may not be a single tight link, but the tight link may vary randomly. The consequence is an underestimation of the available bandwidth [9,10] that is analyzed in [11,12]. ...
Thesis
Full-text available
Today’s Internet Protocol (IP), the Internet’s network-layer protocol, provides a best-effort service to all users without any guaranteed bandwidth. However, for certain applications that have stringent network performance requirements in terms of bandwidth, it is significantly important to provide Quality of Service (QoS) guarantees in IP networks. The end-to-end available bandwidth of a network path, i.e., the residual capacity that is left over by other traffic, is determined by its tight link, that is the link that has the minimal available bandwidth. The tight link may differ from the bottleneck link, i.e., the link with the minimal capacity. Passive and active measurements are the two fundamental approaches used to estimate the available bandwidth in IP networks. Unlike passive measurement tools that are based on the non-intrusive monitoring of traffic, active tools are based on the concept of self-induced congestion. The dispersion, which arises when packets traverse a network, carries information that can reveal relevant network characteristics. Using a fluid-flow probe gap model of a tight link with First-in, First-out (FIFO) multiplexing, accepted probing tools measure the packet dispersion to estimate the available bandwidth. Difficulties arise, however, if the dispersion is distorted compared to the model, e.g., by non-fluid traffic, multiple tight links, clustering of packets due to interrupt coalescing and inaccurate time-stamping in general. It is recognized that modeling these effects is cumbersome if not intractable. To alleviate the variability of noise-afflicted packet gaps, the state-of-the-art bandwidth estimation techniques use post-processing of the measurement results, e.g., averaging over several packet pairs or packet trains, linear regression, or a Kalman filter. These techniques, however, do not overcome the basic assumptions of the deterministic fluid model. 
While packet trains and statistical post-processing help to reduce the variability of available bandwidth estimates, these cannot resolve systematic deviations such as the underestimation bias in case of random cross traffic and multiple tight links. The limitations of the state-of-the-art methods motivate us to explore the use of machine learning in end-to-end active and passive available bandwidth estimation. We investigate how to benefit from machine learning while using standard packet train probes for active available bandwidth estimation. To reduce the amount of required training data, we propose a regression-based scaleinvariant method that is applicable without prior calibration to networks of arbitrary capacity. To reduce the amount of probe traffic further, we implemen a neural network that acts as a recommender and can effectively select the probe rates that reduce the estimation error most quickly. We also evaluate our method with other regression-based supervised machine learning techniques. Furthermore, we propose two different multi-class classification-based methods for available bandwidth estimation. The first method employs reinforcement learning that learns through the network path’s observations withou having a training phase. We formulate the available bandwidth estimation as a single-state Markov Decision Process (MDP) multi-armed bandit problem and implement the "-greedy algorithm to find the available bandwidth, where " is a parameter that controls the exploration vs. exploitation trade-off. We propose another supervised learning-based classification method to obtain reliable available bandwidth estimates with a reduced amount of network overhead in networks, where available bandwidth changes very frequently. In such networks, reinforcement learning-based method may take longer to converge as it has no training phase and learns in an online manner. 
We also evaluate our method with different classification-based supervised machine learning techniques. Furthermore, considering the correlated changes in a network’s traffic through time, we apply filtering techniques on the estimation results in order to track the available bandwidth changes. Active probing techniques provide flexibility in designing the input structure. In contrast, the vast majority of Internet traffic is Transmission Contro Protocol (TCP) flows that exhibit a rather chaotic traffic pattern. We investigate how the theory of active probing can be used to extract relevant information from passive TCP measurements. We extend our method to perform the estimation using only sender-side measurements of TCP data and acknowledgmen packets. However, non-fluid cross traffic, multiple tight links, and packet loss in the reverse path may alter the spacing of acknowledgments and hence increase the measurement noise. To obtain reliable available bandwidth estimates from noise-afflicted acknowledgment gaps we propose a neural network-based method. We conduct a comprehensive measurement study in a controlled network testbed at Leibniz University Hannover. We evaluate our proposed methods under a variety of notoriously difficult network conditions that have not been included in the training such as randomly generated networks with multiple tight links, heavy cross traffic burstiness, delays, and packet loss. Our testing results reveal that our proposed machine learning-based techniques are able to identify the available bandwidth with high precision from active and passive measurements. Furthermore, our reinforcement learning-based method without any training phase shows accurate and fast convergence to available bandwidth estimates.
... In the field of available bandwidth estimation, as defined in [7], [8] as the unused capacity of a path, often active measurements are used to infer the network capabilities from measurements, since passive measurement exhibit some drawbacks, see [9]. The use of active measurements allows for a better control of the arrival process to infer more information from the network path. ...
... We use here the term attainable rate, since the available bandwidth is defined as the unused capacity in a time interval [7], [8]. The attainable rate may defer in dependence of the scheduling discipline. ...
Article
Full-text available
Selection of the optimal transmission rate in packet-switched best-effort networks is challenging. Typically, senders do not have any information about the end-to-end path and should not congest the connection but at once fully utilize it. The accomplishment of these goals lead to congestion control protocols such as TCP Reno, TCP Cubic, or TCP BBR that adapt the sending rate according to extensive measurements of the path characteristics by monitoring packets and related acknowledgments. To improve and speed up this adaptation, we propose and evaluate a machine learning approach for the prediction of sending rates from measurements of metrics provided by the TCP stack. For the prediction a neural network is trained and evaluated. The prediction is implemented in the TCP stack to speed up TCP slow start. For a customizable and performant implementation the extended Berkeley packet filter is used to extract relevant data from the kernel space TCP stack, to forward the monitoring data to a user space data rate prediction, and to feed the prediction result back to the stack. Results from a online experiment show improvement in flow completion time of up to 30%.
... Following the assumptions of the fluid-flow model, we can see that the expected value of ξ is zero in the absence of congestion, and it grows proportional to the probe traffic when the probing rate exceeds the available bandwidth. As seen in (11), the model is piece-wise linear due to the sharp bend at r in = C − λ which inhibits the direct application of the Kalman filter. In order to overcome the problem, we feed only the measurements that satisfy r in > to the filter, where is the recent estimate of the available bandwidth. ...
Preprint
An accurate and fast estimation of the available bandwidth in a network with varying cross-traffic is a challenging task. The accepted probing tools, based on the fluid-flow model of a bottleneck link with first-in, first-out multiplexing, estimate the available bandwidth by measuring packet dispersions. The estimation becomes more difficult if packet dispersions deviate from the assumptions of the fluid-flow model in the presence of non-fluid bursty cross-traffic, multiple bottleneck links, and inaccurate time-stamping. This motivates us to explore the use of machine learning tools for available bandwidth estimation. Hence, we consider reinforcement learning and implement the single-state multi-armed bandit technique, which follows the $\epsilon$-greedy algorithm to find the available bandwidth. Our measurements and tests reveal that our proposed method identifies the available bandwidth with high precision. Furthermore, our method converges to the available bandwidth under a variety of notoriously difficult conditions, such as heavy traffic burstiness, different cross-traffic intensities, multiple bottleneck links, and in networks where the tight link and the bottleneck link are not same. Compared to the piece-wise linear network a model-based direct probing technique that employs a Kalman filter, our method shows more accurate estimates and faster convergence in certain network scenarios and does not require measurement noise statistics.
Article
In order to answer how much bandwidth is available to an application from one end to another in a network, state-of-the-art estimation techniques, based on active probing, inject artificial traffic with a known structure into the network. At the receiving end, the available bandwidth is estimated by measuring the structural changes in the injected traffic, which are caused by the network path. However, bandwidth estimation becomes difficult when packet distributions are distorted by non-fluid bursty cross traffic and multiple links. This eventually leads to an estimation bias. One known approach to reduce the bias in bandwidth estimations is to probe a network with constant-rate packet trains and measure the average structural changes in them. However, one cannot increase the number of packet trains in a designated time period as much as needed because high probing intensity overloads the network and results in packet losses in probe and cross traffic, which distorts probe packet gaps and inflicts more bias. In this work, we propose a machine learning-based, particularly classification-based, method that provides reliable estimates utilizing fewer packet trains. Then, we implement supervised learning techniques. Furthermore, considering the correlated changes over time in traffic in a network, we apply filtering techniques on estimation results in order to track the changes in the available bandwidth. We set up an experimental testbed using the Emulab software and a dumbbell topology in order to create training and testing data for performance analysis. Our results reveal that our proposed method identifies the available bandwidth significantly well in single-link networks as well as networks with heavy cross traffic burstiness and multiple links. It is also able to estimate the available bandwidth in randomly generated networks where the network capacity and the cross traffic intensity vary substantially. 
We also compare our technique with the others that use direct probing and regression approaches, and show that ours has better performance in terms of standard deviation around the actual bandwidth values.
Conference Paper
Full-text available
The information that short-lived TCP flows provide on bandwidth estimation may benefit adaptive video streaming applications or may contribute towards the success of new TCP versions. To estimate the available bandwidth active probing techniques may be used, but they cause an intrusive effect on the network affecting TCP performance. Therefore, passive measurements are favored, though capturing traffic traces comes with its own challenges. In this paper, we use the feedback provided by TCP acknowledgments and perform the estimation from sender-side measurements only. However, difficulties arise when in the presence of discrete random cross traffic, multiple tight links, packet losses, and inaccurate timestamping in general, the acknowledgment packet gaps get distorted. To deal with noiseafflicted packet gaps, we consider a machine-learning approach, specifically the neural network, for estimating the available bandwidth. We also apply the neural network under a variety of notoriously difficult conditions that have not been included in the training, such as multiple tight links, heavy cross traffic burstiness and packet losses. We compare the performance of our proposed method with a state-of-the-art model-based technique, where our neural network approach shows improved performance.
Article
The dispersion that arises when packets traverse a network carries information that can reveal relevant network characteristics. Using a fluid-flow model of a bottleneck link with first-in first-out multiplexing, accepted probing tools measure the packet dispersion to estimate the available bandwidth, i.e., the residual capacity that is left over by other traffic. Difficulties arise, however, if the dispersion is distorted compared to the model, e.g., by non-fluid traffic, multiple bottlenecks, clustering of packets due to interrupt coalescing, and inaccurate time-stamping in general. It is recognized that modeling these effects is cumbersome if not intractable. This motivates us to explore the use of machine learning in bandwidth estimation. We train a neural network using vectors of the packet dispersion that is characteristic of the available bandwidth. Our testing results reveal that even a shallow neural network identifies the available bandwidth with high precision. We also apply the neural network under a variety of notoriously difficult conditions that have not been included in the training, such as randomly generated networks with the multiple bottleneck links and heavy cross traffic burstiness. Compared to two state-of-the-art model-based techniques as well as a recent machine learning-based technique (Yin et al., 2016), our neural network approach shows improved performance. Further, our neural network can effectively control the estimation procedure in an iterative implementation. We also evaluate our method with other supervised machine learning techniques.
Conference Paper
Full-text available
In this paper, we take the sample-path approach in analyzing the asymptotic behavior of single-hop bandwidth estimation under bursty cross-traffic and show that these results are provably different from those observed under fluid models of prior work. This difference, which we call the probing bias, is one of the previously unknown factors that can cause measurement inaccuracies in available bandwidth estimation. We present an analytical formulation of "packet probing," based on which we derive several major properties of the probing bias. We then experimentally observe the probing bias and investigate its quantitative relationship to several deciding factors such as probing packet size, probing train length, and cross-traffic burstiness. Both our analytical and experimental results show that the probing bias vanishes as the packet-train length or packet size increases. The vanishing rate is decided by the burstiness of cross-traffic.
Conference Paper
Full-text available
The area of available bandwidth (avail-bw) estimation has attracted significant interest recently, with several estimation techniques and tools developed during the last 2-3 years. Unfortunately, some key issues regarding the avail-bw definition, estimation, and validation remain vague or misinterpreted. In this note, we first review the previous work in the area and classify the existing techniques in two classes: direct probing and iterative probing. We then identify ten misconceptions, in the form of fallacies or pitfalls, that we consider as most important. Some misconceptions relate to basic statistics, such as the impact of the population variance on the sample mean, the variability of the avail-bw in different time scales, and the effect of the probing duration. Other misconceptions relate to the queueing model underlying these estimation techniques. For instance, ignoring that traffic burstiness or the presence of multiple bottlenecks can cause significant underestimation errors. Our objective is not to debunk previous work or to claim that some estimation techniques are better than others, but to clarify a number of important issues that cover the entire area of avail-bw estimation so that this important metric can be better understood and put in practical use.
Conference Paper
Full-text available
This paper analyzes the asymptotic behavior of packet-train probing over a multi-hop network path P carrying arbitrarily routed bursty cross-traffic flows. We examine the statistical mean of the packet-train output dispersions and its relationship to the input dispersion. We call this relationship the response curve of path P. We show that the real response curve Z is tightly lower-bounded by its multi-hop fluid counterpart F, obtained when every cross-traffic flow on P is hypothetically replaced with a constant-rate fluid flow of the same average intensity and routing pattern. The real curve Z asymptotically approaches its fluid counterpart F as probing packet size or packet train length increases. Most existing measurement techniques are based upon the single-hop fluid curve S associated with the bottleneck link in P. We note that the curve S coincides with F in a certain large-dispersion input range, but falls below F in the remaining small-dispersion input ranges. As an implication of these findings, we show that bursty cross-traffic in multi-hop paths causes negative bias (asymptotic underestimation) to most existing techniques. This bias can be mitigated by reducing the deviation of Z from S using large packet size or long packet-trains. However, the bias is not completely removable for the techniques that use the portion of S that falls below F.
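The single-hop fluid curve S referenced above has a simple closed form under the standard fluid model. A minimal sketch follows; the symbols (input dispersion g_in, probe size s, capacity C, cross-traffic rate lam) follow that model, and the concrete numbers are illustrative:

```python
def fluid_dispersion(g_in, s, C, lam):
    """Single-hop fluid response curve S: expected output dispersion for
    input dispersion g_in (s), probe size s (bits), link capacity C
    (bits/s), and constant-rate fluid cross-traffic lam (bits/s)."""
    if g_in >= s / (C - lam):        # large-dispersion range: no queuing
        return g_in
    return (s + lam * g_in) / C      # small-dispersion range: link saturated

# 1500-byte probes over a 10 Mb/s link with 2 Mb/s fluid cross-traffic
s, C, lam = 1500 * 8, 10e6, 2e6
turn = s / (C - lam)                 # turning point of the curve: 1.5 ms
```

Below the turning point the output dispersion is expanded by the queue, which is exactly the region where the real curve Z deviates most from this fluid prediction.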
Conference Paper
Full-text available
Although packet-pair probing has been used as one of the primary mechanisms to measure bottleneck capacity, cross-traffic intensity, and available bandwidth of end-to-end Internet paths, there is still no conclusive answer as to what information about the path is contained in the output packet-pair dispersions and how it is encoded. In this paper, we address this issue by deriving closed-form expression of packet-pair dispersion in the context of a single-hop path and general bursty cross-traffic arrival. Under the assumptions of cross-traffic stationarity and ASTA sampling, we examine the statistical properties of the information encoded in inter-packet spacings and derive the asymptotic average of the output packet-pair dispersions as a closed-form function of the input dispersion. We show that this result is different from what was obtained in prior work using fluid cross-traffic models and that this discrepancy has a significant impact on the accuracy of packet-pair bandwidth estimation.
Conference Paper
Full-text available
The packet pair technique estimates the capacity of a path (bottleneck bandwidth) from the dispersion (spacing) experienced by two back-to-back packets. We demonstrate that the dispersion of packet pairs in loaded paths follows a multimodal distribution, and discuss the queueing effects that cause the multiple modes. We show that the path capacity is often not the global mode, and so it cannot be estimated using standard statistical procedures. The effect of the size of the probing packets is also investigated, showing that the conventional wisdom of using maximum-sized packet pairs is not optimal. We then study the dispersion of long packet trains. Increasing the length of the packet train reduces the measurement variance, but the estimates converge to a value, referred to as the asymptotic dispersion rate (ADR), that is lower than the capacity. We derive the effect of the cross traffic on the dispersion of long packet trains, showing that the ADR is not the available bandwidth in the path, as was assumed in previous work. Putting all the pieces together, we present a capacity estimation methodology that has been implemented in a tool called pathrate.
Article
Full-text available
The packet-pair technique aims to estimate the capacity of a path (bottleneck bandwidth) from the dispersion of two equal-sized probing packets sent back to back. It has been also argued that the dispersion of longer packet bursts (packet trains) can estimate the available bandwidth of a path. This paper examines such packet-pair and packet-train dispersion techniques in depth. We first demonstrate that, in general, packet-pair bandwidth measurements follow a multimodal distribution and explain the causes of multiple local modes. The path capacity is a local mode, often different from the global mode of this distribution. We illustrate the effects of network load, cross-traffic packet-size variability, and probing packet size on the bandwidth distribution of packet pairs. We then switch to the dispersion of long packet trains. The mean of the packet-train dispersion distribution corresponds to a bandwidth metric that we refer to as average dispersion rate (ADR). We show that the ADR is a lower bound of the capacity and an upper bound of the available bandwidth of a path. Putting all of the pieces together, we present a capacity-estimation methodology that has been implemented in a tool called pathrate. We report on our experiences with pathrate after having measured hundreds of Internet paths over the last three years.
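The multimodal structure described above can be illustrated with a small histogram sketch. This is not pathrate's actual mode-detection algorithm; the bin width and sample dispersions are made up:

```python
from collections import Counter

def capacity_modes(dispersions, probe_bytes, bin_mbps=5.0):
    """Turn per-pair dispersions (seconds) into capacity estimates
    (probe_bytes * 8 / dispersion), bin them into a coarse histogram,
    and return the bins sorted by frequency; the local modes are the
    candidate capacity values."""
    rates = [probe_bytes * 8 / d / 1e6 for d in dispersions]      # Mb/s
    hist = Counter(round(r / bin_mbps) * bin_mbps for r in rates)
    return hist.most_common()

# pairs on a 100 Mb/s link: most spacings are ~120 us, but one pair was
# expanded by queued cross-traffic to 240 us, creating a spurious
# lower mode at 50 Mb/s
modes = capacity_modes([120e-6, 121e-6, 119e-6, 240e-6], 1500)
```

As the abstract notes, in loaded paths the capacity mode need not be the most frequent one, which is why simply taking the global mode fails.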
Article
Full-text available
The available bandwidth (avail-bw) in a network path is of major importance in congestion control, streaming applications, quality-of-service verification, server selection, and overlay networks. We describe an end-to-end methodology, called self-loading periodic streams (SLoPS), for measuring avail-bw. The basic idea in SLoPS is that the one-way delays of a periodic packet stream show an increasing trend when the stream's rate is higher than the avail-bw. We have implemented SLoPS in a tool called pathload. The accuracy of the tool has been evaluated with both simulations and experiments over real-world Internet paths. Pathload is nonintrusive, meaning that it does not cause significant increases in the network utilization, delays, or losses. We used pathload to evaluate the variability ("dynamics") of the avail-bw in Internet paths. The avail-bw becomes significantly more variable in heavily utilized paths, as well as in paths with limited capacity (probably due to a lower degree of statistical multiplexing). We finally examine the relation between avail-bw and TCP throughput. A persistent TCP connection can be used to roughly measure the avail-bw in a path, but TCP saturates the path and significantly increases the path delays and jitter.
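The increasing-trend test at the core of SLoPS can be sketched as a simple pairwise comparison over the measured one-way delays. The 0.55 threshold below is illustrative, not pathload's actual parameter:

```python
def increasing_trend(owds, threshold=0.55):
    """SLoPS-style check: compute the fraction of consecutive one-way-delay
    increases in a probing stream. A fraction above `threshold` suggests
    the stream's rate exceeds the avail-bw (queue building up)."""
    ups = sum(1 for a, b in zip(owds, owds[1:]) if b > a)
    return ups / (len(owds) - 1) > threshold

# delays in ms: steadily climbing vs. fluctuating around a level
increasing_trend([1.0, 1.1, 1.2, 1.3, 1.45])      # rate above avail-bw
increasing_trend([1.0, 0.9, 1.1, 0.95, 1.0])      # rate below avail-bw
```

An iterative-probing tool then binary-searches the probing rate around the point where this test flips, converging on the avail-bw.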
Article
In this paper we present a method for estimating the available bandwidth of a network path. It is an extension and enhancement of the bandwidth measurement method TOPP. TOPP actively probes a network path by sending probe packets in a predetermined time pattern. Our enhancement involves a formalized estimation algorithm based on constrained linear regression. Using the algorithm, bandwidth measurements can be fully automated requiring no assistance from the user. We show that our method is able to estimate bottlenecks that cannot be detected by packet train methods such as C-probe. In addition to inferring the available bandwidth, the method gives an estimate of the link bandwidth of the most congested link on the network path. The link bandwidth estimates are not limited to the rate at which we can inject probe packets into the network.
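The regression step can be sketched as follows. Under the single-congested-link fluid model, the ratio o/m of offered to measured rate is linear in o above the available bandwidth, with slope 1/C and intercept lam/C; plain least squares stands in here for the paper's constrained regression, and all numbers are synthetic:

```python
def topp_fit(offered, measured):
    """Fit o/m = o/C + lam/C over the congested points (o/m > 1) and
    recover the capacity C and available bandwidth C - lam of the
    congested link (single congested hop assumed)."""
    pts = [(o, o / m) for o, m in zip(offered, measured) if o / m > 1.001]
    mx = sum(o for o, _ in pts) / len(pts)
    my = sum(y for _, y in pts) / len(pts)
    slope = (sum((o - mx) * (y - my) for o, y in pts)
             / sum((o - mx) ** 2 for o, _ in pts))
    C = 1 / slope
    lam = (my - slope * mx) * C
    return C, C - lam                      # capacity, avail-bw (Mb/s)

# synthetic single-hop data: C = 10 Mb/s, cross-traffic 4 Mb/s, so the
# fluid model gives measured rate m = o * C / (o + lam) when o > 6 Mb/s
offered = [7.0, 8.0, 9.0, 10.0]
measured = [o * 10 / (o + 4) for o in offered]
```

Because the fit yields both the slope and the intercept, the method recovers the congested link's capacity as well as the avail-bw, matching the abstract's claim that the link-bandwidth estimate is not limited by the achievable probing rate.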
Conference Paper
We examine the problem of estimating the capacity of bottleneck links and available bandwidth of end-to-end paths under non-negligible cross-traffic conditions. We present a simple stochastic analysis of the problem in the context of a single congested node and derive several results that allow the construction of asymptotically-accurate bandwidth estimators. We first develop a generic queuing model of an Internet router and solve the estimation problem assuming renewal cross-traffic at the bottleneck link. Noticing that the renewal assumption on Internet flows is too strong, we investigate an alternative filtering solution that asymptotically converges to the desired values of the bottleneck capacity and available bandwidth under arbitrary (including non-stationary) cross-traffic. This is one of the first methods that simultaneously estimates both types of bandwidth and is provably accurate. We finish the paper by discussing the impossibility of a similar estimator for paths with two or more congested routers.
Conference Paper
We present a network-friendly bandwidth measurement method, TOPP, that is based on active probing and includes analysis by segmented regression. This method can estimate two complementing available bandwidth metrics in addition to the link bandwidth of the congested link. Contrary to traditional packet pair estimates of the bottleneck link bandwidth, our estimate is not limited by the rate at which we can inject probe packets into the network. We also show that our method is able to detect bottlenecks that are invisible to methods such as the C-probe. Furthermore, we describe scenarios where our analysis method is able to calculate bandwidth estimates for several congested hops based on a single end-to-end probe session.