A Game Theoretical Approach to Gateway Selections in Multi-domain Wireless Networks

Yang Song, Starsky H.Y. Wong and Kang-Won Lee
IBM Research, Hawthorne, NY
Email: {yangsong, hwong, kangwon}@us.ibm.com
Abstract—We consider a coalition network where multiple
groups are interconnected via wireless links. Gateway nodes
are designated by each domain to achieve a network-wide
interoperability. Due to the inter-domain communication cost, the
optimal gateway selection for one single domain depends on the
gateway selections of other domains and vice versa. In this paper,
we investigate the interactions of gateway selections by multiple
domains from a potential game perspective. The equilibrium
inefficiency in terms of price of stability is characterized under
various conditions. In addition, we examine the well-established
equilibrium selective learning algorithm B-logit and show that
B-logit is a special case of a general family of algorithms,
denoted by Γ collectively. A novel learning algorithm named
MAX-logit is proposed, which retains the favorable equilibrium
selection property and provably has the fastest convergence speed
among all algorithms in Γ, and which can be applied to many
other applications of potential games. Simulation results show
that MAX-logit can improve the convergence speed of B-logit
by up to 33.85%.
I. INTRODUCTION
We investigate the interoperability issue in coalition networks where multiple groups of nodes (or domains) are connected via wireless links (e.g., a mixture of IEEE 802.11, WiMAX, satellite links, Unmanned Aerial Vehicles (UAVs), 3G, and 4G). For example, in military operations (or disaster recovery scenarios), troops of multiple countries (or police departments and fire rescue teams) need to form a wireless communication backbone to facilitate mutual information exchange and dissemination. While a global link such as a satellite or UAV link is usually deployed to achieve network-wide connectivity for mission-critical tasks, one key obstacle that hinders the interoperability of coalition networks lies in the heterogeneity of the domains in terms of communication technologies, protocols, policies, and incompatible packet formats, which prevents two nodes in different domains from communicating directly even in close geographic proximity. Therefore, in order to enable inter-domain communications among heterogeneous domains, a common paradigm is to designate a gateway node for each administrative domain. Each gateway serves as a translator, and the gateways collectively establish a connected inter-domain communication backbone to facilitate (secure and compatible) information exchange, as proposed in [1], [2]. Designating gateway nodes also enhances the manageability and controllability of coalition networks by enforcing security and routing policies for inter-domain communications. With this hierarchical structure, inter-domain packets are first delivered to the designated gateway node in the source node's domain, then forwarded to the destination domain via the established inter-domain wireless backbone, and finally reach the destination node, as illustrated in Figure 1.

This work will appear in The 17th Annual International Conference on Mobile Computing and Networking, Las Vegas, Nevada, 2011.
This research was sponsored by the U.S. Army Research Laboratory and the U.K. Ministry of Defence and was accomplished under Agreement Number W911NF-06-3-0001. The views and conclusions contained in this document are those of the author(s) and should not be interpreted as representing the official policies, either expressed or implied, of the U.S. Army Research Laboratory, the U.S. Government, the U.K. Ministry of Defence or the U.K. Government. The U.S. and U.K. Governments are authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.
Fig. 1. A coalition network with multiple autonomous domains. (Figure labels: Gateways, Domain A, Domain B, S1, S2, D1, D2.)
Finding the optimal set of gateway nodes in coalition networks is challenging for several reasons. First, due to the absence of a trusted central authority in coalition networks, decentralized algorithms that rely only on local information and observations are desired. Second, although the domains collaboratively form a communication backbone, each domain is inclined to designate the gateway node for its own unilateral benefit, regardless of the potential adverse impact on the overall network performance. Therefore, it is imperative to examine whether a unanimous agreement on gateway selections exists. In addition, we are interested in quantifying the performance degradation of such equilibrium solutions, compared with the globally optimal gateway selection, in order to understand the impact of autonomy and lack of coordination in coalition networks. Third, each administrative domain in the coalition network may be reluctant to share its private information, e.g., intra-domain node topology, with other domains due to security and privacy concerns. It is thus favorable to design distributed mechanisms that can achieve an efficient solution without revealing the private intra-domain structure of each domain. The objective of this work is to address the aforementioned challenges.
II. SYSTEM MODEL
Let $\mathcal{M}$ be the set of domains in the coalition network. For domain $m \in \mathcal{M}$, denote the set of nodes in the domain as $\mathcal{N}_m$. We assume that $|\mathcal{M}| \ge 2$ and $|\mathcal{N}_m| \ge 2$ to avoid triviality. Denote $g_m^i = 1$ if node $i$ is selected as the gateway node for domain $m$ and $g_m^i = 0$ otherwise. Let $\hat{i}_m = \arg\max_{i \in \mathcal{N}_m} g_m^i$ be the selected gateway node in domain $m$. For ease of exposition, we consider the scenario where each domain will select one of its nodes as the gateway. Denote $\mathbf{g}_m = \{g_m^1, g_m^2, \cdots, g_m^{|\mathcal{N}_m|}\}$ as the gateway selection strategy of domain $m$. Let $\mathbf{s} = \{\mathbf{g}_1, \mathbf{g}_2, \cdots, \mathbf{g}_{|\mathcal{M}|}\}$ be the joint gateway selection profile of the whole network, i.e., the collection of gateway selection strategies of all domains. For notational brevity, we will also use $\hat{i}_m(\mathbf{g}_m)$ or $\hat{i}_m(\mathbf{s})$ to denote the gateway node designated by gateway selection strategy $\mathbf{g}_m$ or gateway selection profile $\mathbf{s}$, respectively. Denote $\mathbf{g}_{-m}$ as the gateway selection strategies of all domains other than domain $m$. Therefore, we will also use the notation $\mathbf{s} = \{\mathbf{g}_m, \mathbf{g}_{-m}\}$ to denote the gateway selection profile of the network when domain $m$ is of interest. We use $\mathcal{S}$ to represent the set of all possible gateway selection profiles of the network, where $|\mathcal{S}| = \prod_{m=1}^{|\mathcal{M}|} |\mathcal{N}_m|$, i.e., all nodes can be designated as the gateway node¹.
Denote $c(i, j) \ge 0$ as the associated symmetric link cost for a pair of nodes $i$ and $j$, e.g., Euclidean distance. For mission-critical applications where always-on network-wide connectivity is stringently required, a global link, e.g., UAV, satellite, 3G/4G, is deployed in the network to facilitate reliable inter-domain communications with a fixed yet possibly expensive cost, denoted by $\eta$. Note that if nodes $i, j$ are out of the communication range, we have $c(i, j) = \eta$ due to the availability of the global link. It should be noted that if node $i$ transmits packets to $j$ via the global link, the transmission is still viewed as one hop. We assume that each domain has a decision module to select its gateway node based on its local information and observation. More specifically, we consider a practical scenario where each domain only knows its intra-domain information in terms of topology and costs, as well as its one-hop link costs to other neighboring domains, yet the global network topology and link costs are unknown due to the lack of observability. Define
$$\bar{c}(i, j) \triangleq \min\left(c(i, j), \eta\right) \tag{1}$$
where $i, j$ are the gateway nodes of two domains. Therefore, for each autonomous domain, say $m$, the gateway node $\hat{i}_m$ should be selected to minimize
$$U_m(\mathbf{g}_m, \mathbf{g}_{-m}) = \sum_{i \ne \hat{i}_m,\, i \in \mathcal{N}_m} c\left(i, \hat{i}_m\right) + \sum_{n \ne m,\, n \in \mathcal{M}} c\left(\hat{i}_m, \hat{i}_n\right), \tag{2}$$
where $c(\hat{i}_m, \hat{i}_n) = \eta$ is assumed by domain $m$ if no direct link from $\hat{i}_m$ to $\hat{i}_n$ is observed², except the global link. The first part of (2) is defined as the intra-domain cost for domain $m$, and the second part is the locally observed inter-domain cost from domain $m$'s viewpoint.

¹ It can be verified that our analysis applies to scenarios where each domain is restricted to designate the gateway from a subset of its nodes.
² A multi-hop path may exist which has a smaller aggregate cost than $\eta$. However, domain $m$ is only aware of its one-hop neighbors due to the lack of global information.
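As an illustration of the cost in (2), the following minimal Python sketch evaluates the locally observed cost of one domain. It assumes Euclidean link costs, node coordinates chosen only for the example, and $\eta = 10$; the names `link_cost` and `domain_cost` are hypothetical and not part of the paper.

```python
import math

ETA = 10.0  # assumed global-link cost (eta); illustrative value only


def link_cost(p, q, eta=ETA):
    """Inter-gateway cost capped by the global link: min(distance, eta)."""
    return min(math.dist(p, q), eta)


def domain_cost(m, gateways, domains, eta=ETA):
    """Locally observed cost of domain m, following the structure of (2).

    domains  : dict mapping a domain name to a list of (x, y) node positions
    gateways : dict mapping a domain name to the index of its gateway node
    """
    gw = domains[m][gateways[m]]
    # Intra-domain part: every non-gateway node reaches its own gateway.
    intra = sum(math.dist(p, gw)
                for i, p in enumerate(domains[m]) if i != gateways[m])
    # Inter-domain part: one-hop cost to the gateway of every other domain.
    inter = sum(link_cost(gw, domains[n][gateways[n]], eta)
                for n in domains if n != m)
    return intra + inter


# Hypothetical two-domain layout.
domains = {"A": [(0, 0), (3, 0)], "B": [(3, 4), (6, 4)]}
cost_gw0 = domain_cost("A", {"A": 0, "B": 0}, domains)  # intra 3, inter 5
cost_gw1 = domain_cost("A", {"A": 1, "B": 0}, domains)  # intra 3, inter 4
print(cost_gw0, cost_gw1)
```

In this toy layout domain A prefers its second node as the gateway, because the inter-domain term shrinks while the intra-domain term stays the same.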
Associated with each gateway selection profile $\mathbf{s} \in \mathcal{S}$, a physical communication graph, denoted by PCG($\mathbf{s}$), can be obtained (at the domain level), where the vertices are the set of gateways specified by $\mathbf{s}$, and the weighted edges are the communication links with costs, including the global link. In Figure 2, we illustrate the PCG of a coalition network with three domains where nodes $a, b, c$ are gateway nodes and all the possible communication links among gateway nodes are labeled. We assume throughout the paper that the network topology varies at a slower time scale than the gateway selection process.
Fig. 2. The physical communication graph (PCG) of a coalition network with three domains. (Figure labels: gateways a, b, c of Domains A, B, C; direct links c(a,b), c(a,c), c(b,c); a global link of cost η between each pair.)
Note that multiple paths might exist between a pair of gateway nodes in PCG($\mathbf{s}$); however, the network's backbone communication cost should only account for those links that are actually traversed by inter-domain traffic. In this paper, we assume that the underlying backbone inter-domain routing algorithm satisfies the following: (i) the path with minimum overall cost is selected; (ii) if multiple paths with the same cost exist, the one with the minimum number of hops is selected; and (iii) if two paths have the same cost and the same number of hops, the tie is broken randomly. We can thus establish the desired cost-efficient communication backbone by finding the minimum cost path for each pair of gateway nodes $\hat{i}_m$ and $\hat{i}_n$. In other words, the desired minimum cost backbone is an undirected graph (also a spanning subgraph of PCG), denoted by MCG($\mathbf{s}$), where the set of edges is the union of minimum cost paths for all pairs of gateway nodes. Note that MCG($\mathbf{s}$) may not be a tree in general. For example, Figure 3 illustrates two different MCG realizations for the same structure of PCG in Figure 2, with different values of $c(a, c)$.
Fig. 3. Two different MCG of Figure 2 when c(a, b) = c(b, c) = 1, η = 10: (a) c(a, c) = 1.5; (b) c(a, c) = 3.
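The construction of the minimum cost backbone can be sketched in a few lines of Python. This is an illustrative implementation, not the authors' code: it runs an all-pairs shortest-path pass in which paths are compared by (cost, hop count), and it keeps the first path found when both are equal, whereas rule (iii) breaks such ties randomly. The function names are hypothetical; the numbers are those of Figure 3.

```python
from itertools import combinations


def mcg_edges(nodes, cost):
    """Return the union of minimum-cost paths between all gateway pairs.

    cost maps frozenset({u, v}) to the (already capped) link cost on the
    complete graph over the gateways.
    """
    # best[u][v] = (path cost, hop count); nxt[u][v] = next node from u to v
    best = {u: {v: (0.0, 0) if u == v else (cost[frozenset((u, v))], 1)
                for v in nodes} for u in nodes}
    nxt = {u: {v: v for v in nodes} for u in nodes}
    for k in nodes:
        for u in nodes:
            for v in nodes:
                cand = (best[u][k][0] + best[k][v][0],
                        best[u][k][1] + best[k][v][1])
                if cand < best[u][v]:  # lower cost first, then fewer hops
                    best[u][v] = cand
                    nxt[u][v] = nxt[u][k]
    edges = set()
    for u, v in combinations(nodes, 2):
        while u != v:  # walk the selected path and collect its links
            step = nxt[u][v]
            edges.add(frozenset((u, step)))
            u = step
    return edges


def fig3_costs(c_ac, eta=10.0):
    raw = {("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): c_ac}
    return {frozenset(k): min(v, eta) for k, v in raw.items()}


mcg_a = mcg_edges(["a", "b", "c"], fig3_costs(1.5))
mcg_b = mcg_edges(["a", "b", "c"], fig3_costs(3.0))
print(sorted("".join(sorted(e)) for e in mcg_a))
print(sorted("".join(sorted(e)) for e in mcg_b))
```

With c(a, c) = 1.5 the direct link a-c is cheaper than the two-hop path through b, so all three links are kept; with c(a, c) = 3 the two-hop path wins and the link a-c drops out of the backbone.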
From the network's perspective, it is desirable to obtain the optimum gateway selection profile $\mathbf{s}^*$ which minimizes the network cost function given by
$$R(\mathbf{s}) = \sum_{m} \sum_{i \ne \hat{i}_m,\, i \in \mathcal{N}_m} c\left(i, \hat{i}_m\right) + \sum_{(\hat{i}_m, \hat{i}_n) \in \mathrm{MCG}(\mathbf{s})} c\left(\hat{i}_m, \hat{i}_n\right). \tag{3}$$
The first part of (3) is the aggregate intra-domain cost of all domains, and the second part of (3) is the cost of backbone communication links in the associated graph MCG($\mathbf{s}$).
III. GATEWAY SELECTION GAME OF MULTIPLE AUTONOMOUS DOMAINS
A. Game Formulation
Note that the cost of a single domain $m$, given by (2), depends not only on its own gateway selection, but also on the decisions of other domains. Therefore, since the domains are coupled by the inter-domain communication cost, we formulate the interactions of gateway selections by multiple domains as a gateway selection game, where each autonomous domain is a player with the objective function (2), and the strategy space for domain $m$ is $\mathcal{N}_m$. Note that when a domain $m$ selects its gateway node unilaterally, it may increase the cost of other domains, as indicated by (2), which will trigger a new gateway update. This procedure iterates until a common agreement is reached. An important question is whether this iterative gateway selection process will eventually reach a steady state. In other words, we are interested in whether the gateway selection game has a Nash Equilibrium³, since an absence of equilibrium states indicates that the network will oscillate and a common agreement can never be expected. In addition, we are interested in the performance of such equilibrium points, if they exist, in terms of the overall network performance, in order to characterize the performance deterioration due to the autonomy of multiple domains. We will address these two questions in the rest of this section.
B. Game Analysis
Let us first introduce the concept of the constructed complete graph, which is a crucial component in our equilibrium analysis. Given a set of gateway nodes $\mathbf{s}$, PCG($\mathbf{s}$) depicts the domain-level structure consisting of all available communication links among gateway nodes. For every pair of gateways that possesses multiple links in PCG($\mathbf{s}$), we eliminate redundant links by keeping only the link with the smallest cost. See Figure 4 for an example. The constructed graph is hence a complete graph with link cost $\bar{c}(a, b)$ defined in (1). We denote such a graph as the constructed complete graph, a.k.a. CCG($\mathbf{s}$), for the given gateway selection profile $\mathbf{s}$. Next, we show that the gateway selection game falls into the category of potential games, where the existence of a Nash equilibrium can be established.
Theorem 1 The gateway selection game has a Nash equilibrium, which minimizes, either locally or globally, the following function
$$F(\mathbf{s}) = \sum_{m} \sum_{i \ne \hat{i}_m,\, i \in \mathcal{N}_m} c\left(i, \hat{i}_m\right) + \sum_{(\hat{i}_m, \hat{i}_n) \in \mathrm{CCG}(\mathbf{s})} c\left(\hat{i}_m, \hat{i}_n\right). \tag{4}$$

³ From an engineering perspective, we only focus on pure Nash equilibrium in this paper.

Fig. 4. Obtain the constructed complete graph (CCG) from the physical communication graph (PCG). (Panels: Physical Communication Graph (PCG); Constructed Complete Graph (CCG). Figure labels: gateways a, b, c, d of Domains A, B, C, D; link costs 2 and η = 10.)
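Proof: Without loss of generality, let us assume domain $m$ is updating its gateway selection unilaterally, given the gateway selections of other domains, i.e., $\mathbf{g}_{-m}$. We calculate the difference of function $F$ with two strategies $\mathbf{g}'_m$, $\mathbf{g}''_m$, and obtain
$$\begin{aligned}
F(\mathbf{g}'_m, \mathbf{g}_{-m}) - F(\mathbf{g}''_m, \mathbf{g}_{-m})
&= \sum_{i \ne \hat{i}'_m,\, i \in \mathcal{N}_m} c\left(i, \hat{i}'_m\right) + \sum_{n \ne m} \sum_{i \ne \hat{i}_n,\, i \in \mathcal{N}_n} c\left(i, \hat{i}_n\right) \\
&\quad - \sum_{i \ne \hat{i}''_m,\, i \in \mathcal{N}_m} c\left(i, \hat{i}''_m\right) - \sum_{n \ne m} \sum_{i \ne \hat{i}_n,\, i \in \mathcal{N}_n} c\left(i, \hat{i}_n\right) \\
&\quad + \sum_{n \ne m,\, n \in \mathcal{M}} \left[ c\left(\hat{i}'_m, \hat{i}_n\right) - c\left(\hat{i}''_m, \hat{i}_n\right) \right] \\
&= U_m(\mathbf{g}'_m, \mathbf{g}_{-m}) - U_m(\mathbf{g}''_m, \mathbf{g}_{-m}).
\end{aligned} \tag{5}$$
Here $\hat{i}'_m$ and $\hat{i}''_m$ denote the gateways designated by $\mathbf{g}'_m$ and $\mathbf{g}''_m$, respectively. Note that we utilize the property that when a single domain $m$ switches its gateway strategy from $\mathbf{g}'_m$ to $\mathbf{g}''_m$, the intra-domain costs of other domains, as well as the links in CCG($\mathbf{s}$) that are not incident to domain $m$, are unchanged. We stress that (5) is valid for any $m$, $\mathbf{g}'_m$, and $\mathbf{g}''_m$. Therefore, the gateway selection game is an exact potential game with a potential function given by (4). It is worth noting that every Nash Equilibrium in the gateway selection game, where no domain can improve its own performance by deviating unilaterally, corresponds to a local or global minimizer of the potential function $F(\mathbf{s})$. The existence of a Nash Equilibrium follows from the results of [3].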
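The exact-potential identity can also be checked numerically. The sketch below is only an illustration under stated assumptions: it builds a small hypothetical game with three domains, random intra-domain costs per gateway choice, and random symmetric inter-gateway costs, then compares the change in the potential with the change in the deviating domain's own cost over every unilateral deviation.

```python
import itertools
import random

random.seed(0)
sizes = [2, 3, 2]  # hypothetical |N_m| for three domains
M = len(sizes)
# intra[m][i]: intra-domain cost of domain m when node i is its gateway
intra = [[random.uniform(1, 5) for _ in range(n)] for n in sizes]
# Symmetric inter-gateway cost, stored once per pair of domains m < n.
inter = {((m, i), (n, j)): random.uniform(1, 10)
         for m in range(M) for n in range(m + 1, M)
         for i in range(sizes[m]) for j in range(sizes[n])}


def c(m, i, n, j):
    return inter[((m, i), (n, j))] if m < n else inter[((n, j), (m, i))]


def U(m, s):
    """Cost of domain m: intra-domain part plus links to all other gateways."""
    return intra[m][s[m]] + sum(c(m, s[m], n, s[n])
                                for n in range(M) if n != m)


def F(s):
    """Potential: all intra-domain costs plus every gateway pair once."""
    return (sum(intra[m][s[m]] for m in range(M))
            + sum(c(m, s[m], n, s[n])
                  for m in range(M) for n in range(m + 1, M)))


max_gap = 0.0
for s in itertools.product(*(range(n) for n in sizes)):
    for m in range(M):
        for alt in range(sizes[m]):
            t = s[:m] + (alt,) + s[m + 1:]  # unilateral deviation of m
            gap = abs((F(s) - F(t)) - (U(m, s) - U(m, t)))
            max_gap = max(max_gap, gap)
print(max_gap)  # zero up to floating-point rounding
```

Because the two differences agree for every deviation, any profile that minimizes the potential is one from which no single domain can lower its own cost.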
However, in multiple domain gateway selection games, the
stable Nash equilibrium solution may not be unique. In other
words, depending on the initial configuration, the gateway
selection game may reach multiple equilibria which yield
significantly different performance in terms of overall network
cost. To capture the efficiency loss in games with rational
players, the concepts of price of anarchy [4] and price of
stability [5] are introduced in the literature, which are defined
as the performance ratio of the worst Nash Equilibrium to
the global optimal solution, and the best Nash Equilibrium
to the global optimal solution, respectively. Since our goal is
to design policies and mechanisms to improve the equilibrium
performance for multiple autonomous domains in the coalition
network, we will focus on the price of stability of gateway
selection games in this paper.
Theorem 2 For any gateway selection game with two players,
the best Nash Equilibrium is the global network optimum
solution, i.e., the price of stability is 1.
Proof: The result can be shown by contradiction and is
omitted due to space constraint.
We next show that the result in Theorem 2 also applies to multiple domain gateway selection games ($|\mathcal{M}| \ge 3$), if the following condition is satisfied.
Condition 1 (Triangle Inequality) We say the link cost metric $c(a, b) \ge 0$ satisfies the triangle inequality if $c(a, b) \le c(a, c) + c(c, b)$.
Note that several link cost metrics are known to satisfy
triangle inequality, e.g., hop count and Euclidean distance. In
addition, network embedding techniques have been proposed
to convert general routing metrics into simple Euclidean
distances in the new metric space, e.g., [6], [7], [8].
Theorem 3 If the link cost metric $c(a, b)$ satisfies the triangle inequality, the price of stability is always 1.
Proof: When we obtain CCG from PCG, only the links which will not be utilized in MCG are removed. Moreover, CCG can be viewed as a constructed graph with new link cost $\bar{c}(a, b) = \min\left(c(a, b), \eta\right)$. It can be easily verified that if the original link cost metric $c(a, b)$ satisfies the triangle inequality, so does the new link cost metric $\bar{c}(a, b)$. Therefore, the induced MCG is the same as CCG, i.e., the potential function of (4) is identical to the network cost function of (3), which completes the proof since the best Nash equilibrium corresponds to the global minimizer of the potential function.
Theorem 3 reveals that for multiple domain gateway se-
lection games, the global network optimum solution is one
of the Nash equilibria if the triangle inequality is satisfied.
Unfortunately, this result does not hold otherwise, as shown
in the following Theorem.
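Theorem 4 If the triangle inequality does not hold, the price of stability of an $|\mathcal{M}|$-player gateway selection game is at most $(1 + \delta)$, where
$$\delta = \frac{\eta \left( \frac{|\mathcal{M}|}{2} + \frac{1}{|\mathcal{M}|} - \frac{3}{2} \right)}{\min_{m \in \mathcal{M}} \min_{\mathbf{g}_m} \sum_{i \ne \hat{i}_m(\mathbf{g}_m),\, i \in \mathcal{N}_m} c\left(i, \hat{i}_m(\mathbf{g}_m)\right)}. \tag{6}$$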
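Proof: Denote $\bar{\mathbf{s}}$ as the global minimizer of the potential function of (4). Note that for any feasible gateway selection profile $\mathbf{s} \in \mathcal{S}$, we have MCG($\mathbf{s}$) $\subseteq$ CCG($\mathbf{s}$). Therefore, we obtain $F(\bar{\mathbf{s}}) \ge R(\bar{\mathbf{s}})$. Denote $\mathbf{s}^*$ as the global optimum gateway selection profile which minimizes (3). Since $\bar{\mathbf{s}}$ is the Nash equilibrium minimizing (4), we have
$$R(\bar{\mathbf{s}}) \le F(\bar{\mathbf{s}}) \le F(\mathbf{s}^*). \tag{7}$$
We observe that
$$\begin{aligned}
F(\mathbf{s}^*) &= \sum_{m} \sum_{i \ne \hat{i}_m(\mathbf{s}^*),\, i \in \mathcal{N}_m} c\left(i, \hat{i}_m(\mathbf{s}^*)\right)
+ \sum_{(\hat{i}_m(\mathbf{s}^*), \hat{i}_n(\mathbf{s}^*)) \in \mathrm{MCG}(\mathbf{s}^*)} c\left(\hat{i}_m(\mathbf{s}^*), \hat{i}_n(\mathbf{s}^*)\right) \\
&\quad + \sum_{\substack{(\hat{i}_m(\mathbf{s}^*), \hat{i}_n(\mathbf{s}^*)) \notin \mathrm{MCG}(\mathbf{s}^*) \\ (\hat{i}_m(\mathbf{s}^*), \hat{i}_n(\mathbf{s}^*)) \in \mathrm{CCG}(\mathbf{s}^*)}} c\left(\hat{i}_m(\mathbf{s}^*), \hat{i}_n(\mathbf{s}^*)\right)
\end{aligned}$$
where the last term is the cost of links that are removed from CCG($\mathbf{s}^*$) when generating MCG($\mathbf{s}^*$). Since CCG($\mathbf{s}^*$) is a complete graph, the number of links is given by $\frac{|\mathcal{M}|(|\mathcal{M}|-1)}{2}$. Moreover, MCG($\mathbf{s}^*$) is a connected graph and hence the number of links is at least $|\mathcal{M}| - 1$. Since each link in CCG costs at most $\eta$, we obtain
$$\begin{aligned}
F(\mathbf{s}^*) &\le R(\mathbf{s}^*) + \left( \frac{|\mathcal{M}|(|\mathcal{M}|-1)}{2} - (|\mathcal{M}| - 1) \right) \eta \\
&\le R(\mathbf{s}^*) \left[ 1 + \frac{\left( \frac{|\mathcal{M}|(|\mathcal{M}|-1)}{2} - (|\mathcal{M}| - 1) \right) \eta}{|\mathcal{M}| \min_m \sum_{i \ne \hat{i}_m(\mathbf{s}^*),\, i \in \mathcal{N}_m} c\left(i, \hat{i}_m(\mathbf{s}^*)\right)} \right] \\
&= R(\mathbf{s}^*) \left[ 1 + \frac{\left( \frac{|\mathcal{M}|}{2} + \frac{1}{|\mathcal{M}|} - \frac{3}{2} \right) \eta}{\min_m \sum_{i \ne \hat{i}_m(\mathbf{s}^*),\, i \in \mathcal{N}_m} c\left(i, \hat{i}_m(\mathbf{s}^*)\right)} \right].
\end{aligned}$$
In tandem with (7), we have
$$R(\bar{\mathbf{s}}) \le R(\mathbf{s}^*) \left[ 1 + \frac{\left( \frac{|\mathcal{M}|}{2} + \frac{1}{|\mathcal{M}|} - \frac{3}{2} \right) \eta}{\min_m \sum_{i \ne \hat{i}_m(\mathbf{s}^*),\, i \in \mathcal{N}_m} c\left(i, \hat{i}_m(\mathbf{s}^*)\right)} \right]$$
which completes the proof.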
The denominator of (6) reflects the minimum intra-domain cost of all domains in the network. Intuitively, if all domains are dense, the dominant component in the network cost is the intra-domain cost and hence distributed local gateway selection can lead to a near-optimal solution. If the number of domains, i.e., $|\mathcal{M}|$, increases, the bound on the performance gap becomes larger due to the impact of autonomy of multiple domains. It is also worth noting that when $|\mathcal{M}| = 2$, we have $\delta = 0$ and the price of stability is 1, which agrees with our previous result in Theorem 2.
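To see how the bound in (6) scales, the following short Python sketch evaluates $\delta$ for a few values of $|\mathcal{M}|$. The global link cost $\eta = 500$ matches the simulation setting of Section V, while the minimum intra-domain cost of 1000 is an arbitrary value assumed only for this illustration; `delta_bound` is a hypothetical helper name.

```python
def delta_bound(num_domains, eta, min_intra_cost):
    """delta from (6): eta * (|M|/2 + 1/|M| - 3/2) / (minimum intra-domain cost)."""
    m = num_domains
    return eta * (m / 2 + 1 / m - 1.5) / min_intra_cost


for m in (2, 3, 4, 5):
    # min_intra_cost=1000.0 is an assumed value, not taken from the paper
    print(m, delta_bound(m, eta=500.0, min_intra_cost=1000.0))
```

The factor $\frac{|\mathcal{M}|}{2} + \frac{1}{|\mathcal{M}|} - \frac{3}{2}$ equals $\frac{(|\mathcal{M}|-1)(|\mathcal{M}|-2)}{2|\mathcal{M}|}$, so it vanishes for two domains and grows roughly linearly in $|\mathcal{M}|$ afterwards.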
IV. EQUILIBRIUM SELECTIVE LEARNING IN GATEWAY SELECTION GAMES
A. γ-logit Learning Algorithms
In the previous section, we have shown that multiple Nash equilibria may exist in the gateway selection game, which can differ remarkably in network performance, and the one with the smallest network cost is desired. Recently, a simple learning algorithm named binary logit, or B-logit for short, has attracted significant attention in the potential game theory and networking communities, e.g., [9], [10], [11], [12], due to its favorable property of equilibrium selection. The procedure of the B-logit algorithm [9] is summarized as follows.
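B-logit:
For every time slot $t$:
• Randomly select one of the players, say $m$, to update its gateway selection while other domains remain unchanged.
• Denote the current gateway selection of domain $m$ as $\mathbf{g}_m(t)$. Domain $m$ randomly selects a node in its domain as the gateway candidate. Denote the candidate gateway selection strategy by $\tilde{\mathbf{g}}_m$. Domain $m$ updates as
$$\Pr\left(\mathbf{g}_m(t+1) = \tilde{\mathbf{g}}_m\right) = \frac{\exp\left(-U_m(\tilde{\mathbf{g}}_m, \mathbf{g}_{-m}(t))/\tau\right)}{\exp\left(-U_m(\tilde{\mathbf{g}}_m, \mathbf{g}_{-m}(t))/\tau\right) + \exp\left(-U_m(\mathbf{g}_m(t), \mathbf{g}_{-m}(t))/\tau\right)} \tag{8}$$
and
$$\Pr\left(\mathbf{g}_m(t+1) = \mathbf{g}_m(t)\right) = 1 - \Pr\left(\mathbf{g}_m(t+1) = \tilde{\mathbf{g}}_m\right) \tag{9}$$
where $\tau$ is a small positive constant, a.k.a. the smoothing factor of the algorithm.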
It has been shown that as $\tau \to 0$, the B-logit algorithm concentrates on the global minimizer of the potential function in any potential game with arbitrarily high probability [9], [10]. At each step, B-logit computes the value of (2) at most once, with local information only. This reduced complexity, in tandem with the desirable equilibrium selection property, has promoted the deployment of B-logit in networking areas such as network coding [13], channel and power allocation in wireless mesh networks [10], and MIMO interference networks [14], among many others.
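A single B-logit update can be sketched in Python as follows. This is an illustrative sketch rather than the authors' implementation: it assumes that the updating domain and the candidate node are both drawn uniformly at random, it rewrites (8) as a logistic function of the cost difference to avoid numerical underflow for small $\tau$, and the two-domain cost tables are hypothetical.

```python
import math
import random


def switch_prob(u_new, u_old, tau):
    """Probability (8) of adopting the candidate, as 1 / (1 + exp(dU / tau))."""
    x = (u_new - u_old) / tau
    return 0.0 if x > 700 else 1.0 / (1.0 + math.exp(x))  # guard exp overflow


def b_logit_step(s, sizes, U, tau, rng=random):
    """One time slot: returns the next gateway selection profile (a tuple)."""
    m = rng.randrange(len(sizes))   # domain chosen to update (assumed uniform)
    cand = rng.randrange(sizes[m])  # candidate gateway (assumed uniform)
    t = s[:m] + (cand,) + s[m + 1:]
    return t if rng.random() < switch_prob(U(m, t), U(m, s), tau) else s


# Hypothetical two-domain game: each domain has two candidate gateways.
INTRA = [[1.0, 2.0], [2.0, 1.0]]  # intra-domain cost per gateway choice
INTER = {(0, 0): 5.0, (0, 1): 1.0, (1, 0): 4.0, (1, 1): 3.0}


def U(m, s):
    return INTRA[m][s[m]] + INTER[s]


rng = random.Random(0)
s = (1, 0)
for _ in range(500):
    s = b_logit_step(s, [2, 2], U, tau=0.05, rng=rng)
print(s)
```

In this toy game the potential is smallest at profile (0, 1), so for a small smoothing factor the chain is expected to spend almost all of its time there.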
In this paper, we will investigate an important yet unanswered aspect of B-logit, i.e., the convergence speed of B-logit to the desired best Nash equilibrium. In the following, we first show that B-logit is essentially a special case of a general family of learning algorithms, denoted by γ-logit (or Γ collectively), parameterized by γ. Next, we propose a novel learning algorithm MAX-logit in Γ which also retains the favorable property of equilibrium selection. More importantly, we prove that MAX-logit possesses the fastest convergence rate compared with any other γ-logit algorithm, including B-logit. Our key observation is that all γ-logit learning algorithms achieve the best Nash equilibrium asymptotically by generating aperiodic, irreducible, reversible Markov chains with the same steady state distribution yet different stochastic kernels. The optimality in convergence rate of MAX-logit is proven by investigating the mixing rate of the underlying Markov chain. We first provide the general structure of a γ-logit algorithm, parameterized by γ, as follows.
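γ-logit:
γ-logit shares the same structure as B-logit except in (8), where the probability is calculated as
$$\Pr\left(\mathbf{g}_m(t+1) = \tilde{\mathbf{g}}_m\right) = \frac{\exp\left(-U_m(\tilde{\mathbf{g}}_m, \mathbf{g}_{-m}(t))/\tau\right)}{\gamma(\mathbf{s}', \mathbf{s}'')} \tag{10}$$
where $\mathbf{s}' = \{\mathbf{g}_m(t), \mathbf{g}_{-m}(t)\}$ and $\mathbf{s}'' = \{\tilde{\mathbf{g}}_m, \mathbf{g}_{-m}(t)\}$ are two gateway selection profiles in $\mathcal{S}$, and γ satisfies
1) Symmetry
$$\gamma(\mathbf{s}', \mathbf{s}'') = \gamma(\mathbf{s}'', \mathbf{s}'), \quad \forall \mathbf{s}' \in \mathcal{S},\ \mathbf{s}'' \in \mathcal{S},$$
2) Feasibility
$$\gamma(\mathbf{s}', \mathbf{s}'') \ge \max\left\{ \exp\left(-U_m(\mathbf{s}')/\tau\right),\ \exp\left(-U_m(\mathbf{s}'')/\tau\right) \right\}.$$
Denote the collection of all γ-logit algorithms as Γ. It is straightforward to observe that B-logit is a special case in Γ with
$$\gamma(\mathbf{s}', \mathbf{s}'') = \gamma(\mathbf{s}'', \mathbf{s}') = \exp\left(-U_m(\mathbf{s}')/\tau\right) + \exp\left(-U_m(\mathbf{s}'')/\tau\right).$$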
Lemma 1 Every γ-logit algorithm in Γ is equilibrium selective, i.e., it converges to the global minimizer of the potential function asymptotically.
Proof: The proof is straightforward by verifying that $\pi(\mathbf{s}) = \frac{\exp\left(-F(\mathbf{s})/\tau\right)}{\sum_{\mathbf{s}' \in \mathcal{S}} \exp\left(-F(\mathbf{s}')/\tau\right)}$ satisfies the detailed balance equation, and is omitted.
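The omitted verification is short. As a sketch, assume that the updating domain and the candidate node are chosen uniformly, so that a move between two profiles $\mathbf{s}'$ and $\mathbf{s}''$ that differ only in domain $m$ is proposed with probability $\frac{1}{|\mathcal{M}||\mathcal{N}_m|}$ in either direction, and write $Z$ for the normalizing sum in $\pi$. Then:

```latex
\begin{align*}
\pi(\mathbf{s}')\Pr(\mathbf{s}' \to \mathbf{s}'')
  &= \frac{e^{-F(\mathbf{s}')/\tau}}{Z}\cdot\frac{1}{|\mathcal{M}||\mathcal{N}_m|}\cdot
     \frac{e^{-U_m(\mathbf{s}'')/\tau}}{\gamma(\mathbf{s}',\mathbf{s}'')}, \\
\pi(\mathbf{s}'')\Pr(\mathbf{s}'' \to \mathbf{s}')
  &= \frac{e^{-F(\mathbf{s}'')/\tau}}{Z}\cdot\frac{1}{|\mathcal{M}||\mathcal{N}_m|}\cdot
     \frac{e^{-U_m(\mathbf{s}')/\tau}}{\gamma(\mathbf{s}'',\mathbf{s}')}.
\end{align*}
% The two right-hand sides coincide because gamma is symmetric and, by the
% exact potential property (5),
%   F(s') - F(s'') = U_m(s') - U_m(s''),
% i.e. F(s') + U_m(s'') = F(s'') + U_m(s').
```

The feasibility condition on γ only ensures that each ratio is a valid probability; the stationary distribution itself does not depend on the particular choice of γ.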
Conceptually, the state space of a γ-logit algorithm is the Cartesian product of complete graphs $K_{|\mathcal{N}_m|}$, $m = 1, \cdots, |\mathcal{M}|$. While all algorithms in Γ share the identical state structure and steady state distribution, the underlying transition probability matrices, denoted by $P(\gamma)$, are noticeably different for two realizations of γ. In the next subsection, we will investigate the convergence rate of B-logit, or γ-logit in general, by examining the transition probability matrix $P(\gamma)$ from a mixing time perspective.
B. Mixing Time Analysis of γ-logit Learning Algorithms
For an arbitrary γ-logit algorithm, the associated probability transition matrix $P(\gamma)$ is an $|\mathcal{S}|$-by-$|\mathcal{S}|$ matrix, and each element can be written as
$$P_{i,j}(\gamma) \triangleq \Pr\left(\mathbf{s}_i \to \mathbf{s}_j\right) = \frac{1}{|\mathcal{M}|} \frac{1}{|\mathcal{N}_m|} \frac{\exp\left(-U_m(\mathbf{s}_j)/\tau\right)}{\gamma(\mathbf{s}_i, \mathbf{s}_j)}$$
if $\mathbf{s}_i \in \mathcal{S}$ and $\mathbf{s}_j \in \mathcal{S}$ differ at only player $m$, i.e., only the gateway selections of domain $m$ are different. Otherwise
$$P_{i,j}(\gamma) = 0, \quad \forall \mathbf{s}_i \ne \mathbf{s}_j,$$
and
$$P_{i,i}(\gamma) = 1 - \sum_{\mathbf{s}_j \ne \mathbf{s}_i} P_{i,j}(\gamma).$$
Denote the eigenvalues of $P(\gamma)$ in decreasing order as $\lambda_k(P(\gamma))$, $k = 1, \cdots, |\mathcal{S}|$. By the Perron-Frobenius Theorem [15], we have
$$1 = \lambda_1(P(\gamma)) > \lambda_2(P(\gamma)) \ge \cdots \ge \lambda_{|\mathcal{S}|}(P(\gamma)) > -1.$$
It is well understood in the literature that the mixing rate of a Markov chain to its steady state distribution is determined by the second largest eigenvalue modulus, denoted by $\mu(P(\gamma))$, of the transition matrix $P(\gamma)$, i.e.,
$$\mu(P(\gamma)) = \max\left\{ |\lambda_2(P(\gamma))|,\ |\lambda_{|\mathcal{S}|}(P(\gamma))| \right\}.$$
The smaller $\mu(P(\gamma))$ is, the faster the Markov chain converges to its steady state distribution [16], [17]. Therefore, we are interested in finding the optimal values of γ which retain the desired property of equilibrium selection while enjoying a provably faster convergence speed compared with any other algorithm in Γ.
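Next, we present a new learning algorithm in Γ, denoted by MAX-logit, as follows.
MAX-logit:
MAX-logit is a γ-logit algorithm in Γ where
$$\gamma(\mathbf{s}', \mathbf{s}'') = \max\left\{ \exp\left(-U_m(\mathbf{s}')/\tau\right),\ \exp\left(-U_m(\mathbf{s}'')/\tau\right) \right\}. \tag{11}$$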
Denote $\mu_{\mathrm{MAX}}$ as the second largest eigenvalue modulus associated with the MAX-logit algorithm. We next present a key result of our paper.
Theorem 5 Denote $\mu(P(\gamma))$ as the second largest eigenvalue modulus induced by an arbitrary γ-logit algorithm in Γ. We have $\mu_{\mathrm{MAX}} \le \mu(P(\gamma))$.
Before we prove Theorem 5, we need to establish an important lemma which is crucial in our mixing rate analysis.
Lemma 2 For any γ-logit learning algorithm in Γ, we have $\lambda_2(P(\gamma)) \ge 0$ and $|\lambda_2(P(\gamma))| \ge |\lambda_{|\mathcal{S}|}(P(\gamma))|$.
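Proof: For an arbitrary γ-logit learning algorithm in Γ, we have
$$P_{i,i}(\gamma) \ge \sum_{m=1}^{|\mathcal{M}|} \frac{1}{|\mathcal{M}||\mathcal{N}_m|} \ge \frac{1}{\max_m |\mathcal{N}_m|} \triangleq \alpha. \tag{12}$$
Define the conductance [18] of the state space $\mathcal{S}$ as
$$h \triangleq \min_{\pi(A) \le \frac{1}{2}} \frac{\sum_{\mathbf{s}_i \in A,\, \mathbf{s}_j \notin A} \pi(\mathbf{s}_i) P_{i,j}(\gamma)}{\pi(A)} \tag{13}$$
where $A \subset \mathcal{S}$ is a subset of the state space $\mathcal{S}$ and $\pi(A) \triangleq \sum_{\mathbf{s}_i \in A} \pi(\mathbf{s}_i)$. We know that when the smoothing factor $\tau$ is small, the γ-logit learning algorithm will concentrate arbitrarily close to the best Nash equilibrium, denoted by $\bar{\mathbf{s}}$, in the steady state. Therefore, $A = \{\mathbf{s}_i : \mathbf{s}_i \ne \bar{\mathbf{s}}\}$ is a feasible partition, i.e., $\pi(A) \le \frac{1}{2}$. We denote $\pi(\mathbf{s}_i) = \epsilon$, $\mathbf{s}_i \ne \bar{\mathbf{s}}$, for notational brevity. By the definition in (13), we have
$$\begin{aligned}
h &\le \frac{\sum_{\mathbf{s}_i \in A} \pi(\mathbf{s}_i) \Pr\left(\mathbf{s}_i \to \bar{\mathbf{s}}\right)}{\pi(A)} \\
&\le \frac{(|\mathcal{N}_1| - 1) \frac{1}{|\mathcal{M}||\mathcal{N}_1|} \epsilon + \cdots + (|\mathcal{N}_{|\mathcal{M}|}| - 1) \frac{1}{|\mathcal{M}||\mathcal{N}_{|\mathcal{M}|}|} \epsilon}{(|\mathcal{S}| - 1)\, \epsilon} \\
&= \frac{\sum_{m=1}^{|\mathcal{M}|} \left( \frac{1}{|\mathcal{M}|} - \frac{1}{|\mathcal{M}||\mathcal{N}_m|} \right)}{|\mathcal{S}| - 1}
= \frac{1 - \frac{1}{|\mathcal{M}|} \sum_{m=1}^{|\mathcal{M}|} \frac{1}{|\mathcal{N}_m|}}{\prod_{m=1}^{|\mathcal{M}|} |\mathcal{N}_m| - 1} \\
&\le \frac{1 - \frac{1}{\max_{m=1,\cdots,|\mathcal{M}|} |\mathcal{N}_m|}}{\prod_{m=1}^{|\mathcal{M}|} |\mathcal{N}_m| - 1}
\le \frac{1 - \frac{1}{\max_{m=1,\cdots,|\mathcal{M}|} |\mathcal{N}_m|}}{2 \max_{m=1,\cdots,|\mathcal{M}|} |\mathcal{N}_m| - 1}
\end{aligned}$$
where we utilize the fact that $|\mathcal{M}| \ge 2$ and $|\mathcal{N}_m| \ge 2$, $\forall m$, since otherwise the solution is trivial. In light of (12), we attain
$$h \le \alpha \frac{1 - \alpha}{2 - \alpha} < \alpha \le \frac{1}{2}.$$
By invoking Cheeger's Inequality [19], we have
$$\lambda_2(P(\gamma)) \ge 1 - 2h > 1 - 2\alpha \ge 0 \ \Rightarrow\ |\lambda_2(P(\gamma))| > |1 - 2\alpha|. \tag{14}$$
Next, we proceed to investigate the smallest eigenvalue, $\lambda_{|\mathcal{S}|}(P(\gamma))$. Since $P_{i,i}(\gamma) \ge \alpha$, we have that
$$W \triangleq P(\gamma) - \alpha I$$
is a nonnegative matrix with a row sum of $1 - \alpha$, where $I$ denotes the identity matrix with dimension $|\mathcal{S}|$-by-$|\mathcal{S}|$. Define $\rho(W)$ as the spectral radius of matrix $W$ and $\lambda_k(W)$, $k = 1, \cdots, |\mathcal{S}|$, as the eigenvalues of $W$ in decreasing order. By Theorem 8.1.22 in [15], we have
$$|\lambda_k(W)| \le \rho(W) = 1 - \alpha, \quad \forall k = 1, \cdots, |\mathcal{S}|. \tag{15}$$
Note that the transition matrix $P(\gamma)$ is not Hermitian in general. To facilitate our analysis, we define $\pi = \{\pi(\mathbf{s}_i), \mathbf{s}_i \in \mathcal{S}\}$ as the vector containing all steady state distributions, and $\Pi = \mathrm{diag}(\pi)$. Define
$$\tilde{P}(\gamma) = \Pi^{1/2} P(\gamma) \Pi^{-1/2}.$$
We can see that $\tilde{P}(\gamma)$ is Hermitian. In addition, since $\Pi^{1/2}$ is nonsingular, $\tilde{P}(\gamma)$ and $P(\gamma)$ are similar and hence share the same spectrum [15], i.e.,
$$\lambda_k\left(\tilde{P}(\gamma)\right) = \lambda_k(P(\gamma)), \quad \forall k = 1, \cdots, |\mathcal{S}|.$$
Similarly, we define
$$\tilde{W} = \Pi^{1/2} W \Pi^{-1/2} = \tilde{P}(\gamma) - \alpha I$$
which is also Hermitian and shares the same spectrum as $W$. Since $\tilde{P}(\gamma)$ and $\alpha I$ are both Hermitian and commutative, we have
$$\lambda_k(W) = \lambda_k\left(\tilde{W}\right) = \lambda_k\left(\tilde{P}(\gamma)\right) - \alpha = \lambda_k(P(\gamma)) - \alpha, \quad \forall k$$
by the definition of eigenvalues. Therefore, by (15), we have
$$\lambda_{|\mathcal{S}|}(P(\gamma)) - \alpha = \lambda_{|\mathcal{S}|}(W) \ge -(1 - \alpha).$$
Finally, we have
$$\lambda_{|\mathcal{S}|}(P(\gamma)) \ge -1 + 2\alpha. \tag{16}$$
Therefore, if $\lambda_{|\mathcal{S}|}(P(\gamma)) \ge 0$, we have $|\lambda_2(P(\gamma))| \ge \lambda_{|\mathcal{S}|}(P(\gamma))$. If $\lambda_{|\mathcal{S}|}(P(\gamma)) < 0$, in light of (14) and (16), we have $|\lambda_2(P(\gamma))| > |\lambda_{|\mathcal{S}|}(P(\gamma))|$, which completes the proof of Lemma 2.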
Lemma 2 suggests that when comparing $\mu(P(\gamma))$, we only need to focus on the second largest eigenvalue $\lambda_2(P(\gamma))$. Next, we proceed to provide the proof of Theorem 5.
Proof of Theorem 5:
Denote $P^*$ and $P(\gamma)$ as the probability transition matrices induced by MAX-logit and an arbitrary γ-logit learning algorithm in Γ, respectively. Define
$$\tilde{P}^* = \Pi^{1/2} P^* \Pi^{-1/2}$$
and
$$\tilde{\Delta} = \tilde{P}(\gamma) - \tilde{P}^* = \Pi^{1/2} \Delta \Pi^{-1/2}$$
where $\Delta = P(\gamma) - P^*$. It is worth noting that for each off-diagonal element in $\Delta$, we have $\Delta_{i,j} = P_{i,j}(\gamma) - P^*_{i,j} \le 0$ and
$$\Delta_{i,i} = \sum_{j \ne i} \left( P^*_{i,j} - P_{i,j}(\gamma) \right) \ge 0.$$
Therefore, $\Delta$ is a diagonally dominant matrix with nonnegative real eigenvalues [15], implying that $\tilde{\Delta}$ is a positive semi-definite (PSD) matrix since $\tilde{\Delta}$ is Hermitian. By utilizing Theorem 4.3.3 in [15], we have
$$\lambda_2(P^*) = \lambda_2\left(\tilde{P}^*\right) \le \lambda_2\left(\tilde{P}(\gamma)\right) = \lambda_2(P(\gamma)),$$
which completes the proof by invoking Lemma 2.
Therefore, the proposed MAX-logit algorithm provably enjoys the fastest convergence rate among all algorithms in Γ, including B-logit. Our theoretical results will be validated in the next section.
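The ingredients of this argument can be checked on a toy instance. The Python sketch below is an illustration under assumptions, not the paper's simulation code: it builds the transition matrices of B-logit and MAX-logit for a hypothetical game with two domains and random costs, then verifies that both satisfy detailed balance with respect to the same distribution $\pi$ and that every off-diagonal entry under MAX-logit is at least as large as under B-logit. It also prints the total variation distance to $\pi$ after a fixed number of steps as an informal indication of mixing speed.

```python
import itertools
import math
import random

random.seed(1)
sizes = [2, 3]  # hypothetical |N_m| for two domains
M, tau = len(sizes), 1.0
states = list(itertools.product(*(range(n) for n in sizes)))
idx = {s: k for k, s in enumerate(states)}
intra = [[random.uniform(0, 2) for _ in range(n)] for n in sizes]
inter = {s: random.uniform(0, 2) for s in states}  # cost between the gateways


def U(m, s):
    return intra[m][s[m]] + inter[s]


def F(s):
    return intra[0][s[0]] + intra[1][s[1]] + inter[s]


def transition_matrix(gamma):
    n = len(states)
    P = [[0.0] * n for _ in range(n)]
    for s in states:
        i = idx[s]
        for m in range(M):
            for cand in range(sizes[m]):
                if cand != s[m]:
                    t = s[:m] + (cand,) + s[m + 1:]
                    w_s = math.exp(-U(m, s) / tau)
                    w_t = math.exp(-U(m, t) / tau)
                    P[i][idx[t]] = w_t / gamma(w_s, w_t) / (M * sizes[m])
        P[i][i] = 1.0 - sum(P[i])  # remaining mass stays on the current state
    return P


P_b = transition_matrix(lambda a, b: a + b)  # B-logit
P_max = transition_matrix(max)               # MAX-logit, (11)
Z = sum(math.exp(-F(s) / tau) for s in states)
pi = [math.exp(-F(s) / tau) / Z for s in states]
n = len(states)


def balance_error(P):
    return max(abs(pi[i] * P[i][j] - pi[j] * P[j][i])
               for i in range(n) for j in range(n))


def tv_after(P, steps, start=0):
    dist = [1.0 if k == start else 0.0 for k in range(n)]
    for _ in range(steps):
        dist = [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]
    return 0.5 * sum(abs(d - p) for d, p in zip(dist, pi))


print(balance_error(P_b), balance_error(P_max))
print(tv_after(P_b, 10), tv_after(P_max, 10))
```

The entrywise ordering of the off-diagonal elements is exactly what makes $\Delta$ diagonally dominant in the proof; the printed distances are only a single-instance illustration and do not replace the eigenvalue argument.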
V. PERFORMANCE EVALUATION
A. Simulation Setting
To illustrate our theoretical results, we consider the following scenario for our simulation. The coalition network consists of $|\mathcal{M}|$ domains where each domain has $|\mathcal{N}|$ nodes. For each domain, we randomly deploy its nodes in a round area with radius 125 m, centered at a random point within the square field of 1000 × 1000 m². To demonstrate our price of stability results in Theorem 3 and Theorem 4, we consider two types of link cost in the simulation. The first is Euclidean distance cost, which is a representative metric satisfying the triangle inequality, and has been utilized extensively in geographic routing and network embedding schemes. In this case, both B-logit and MAX-logit will converge to the best Nash equilibrium which coincides with the optimum solution. Next, we consider random link costs where the triangle inequality is violated. More specifically, we randomly select $p\%$ of the links in the network and add a random error which is uniformly distributed between 0% and 5% of their original Euclidean distance link cost. This enables us to compare different scenarios in a unified setting, i.e., by setting $p = 0$, the triangle inequality is satisfied. The global link cost is set to $\eta = 500$ for all network scenarios. We consider $|\mathcal{M}| = 2, 3, 4$ in our evaluation with a varying number of nodes in each domain, where the global network optimum solution is attained via exhaustive search and serves as the performance benchmark.
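A topology of this kind can be generated with a few lines of Python. The sketch below is one possible reading of the setup, with several assumptions of its own: nodes are placed uniformly over each disk, each link is perturbed independently with probability $p/100$ rather than by selecting exactly $p\%$ of the links, and the function names are hypothetical.

```python
import math
import random


def deploy(num_domains, nodes_per_domain, field=1000.0, radius=125.0, rng=random):
    """Place each domain's nodes in a disk around a random center in the field."""
    centers, domains = [], []
    for _ in range(num_domains):
        cx, cy = rng.uniform(0, field), rng.uniform(0, field)
        nodes = []
        for _ in range(nodes_per_domain):
            r = radius * math.sqrt(rng.random())  # sqrt: uniform over the area
            a = rng.uniform(0, 2 * math.pi)
            nodes.append((cx + r * math.cos(a), cy + r * math.sin(a)))
        centers.append((cx, cy))
        domains.append(nodes)
    return centers, domains


def link_cost(p, q, p_percent, rng=random):
    """Euclidean cost, inflated by 0% to 5% with probability p_percent / 100."""
    d = math.dist(p, q)
    if rng.random() < p_percent / 100.0:
        d *= 1.0 + rng.uniform(0.0, 0.05)
    return d


rng = random.Random(42)
centers, domains = deploy(num_domains=3, nodes_per_domain=5, rng=rng)
print(len(domains), len(domains[0]))
```

Because the cost is meant to be symmetric, a caller would draw `link_cost` once per unordered node pair and reuse the value in both directions.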
B. Euclidean Distance Cost
We first consider the Euclidean distance cost scenarios. The global optimum gateway selection profile is denoted by OPT. MAX-logit and B-logit are executed to iteratively update gateway selections with only local observation and information, from the same initial configuration. For both algorithms, we set the smoothing factor $\tau = 0.0001$ to ensure that convergence to the best Nash equilibrium is achieved with sufficiently high probability.
Figures 5, 6, and 7 depict sample runs for the 2-domain, 3-domain, and 4-domain scenarios where each domain contains 20 nodes. We observe that both MAX-logit and B-logit converge to the network optimum solution gradually in all three cases. Moreover, our proposed MAX-logit algorithm converges significantly faster than B-logit to the global network optimum solution. To further investigate the convergence rate improvement by MAX-logit, we compare the average convergence speed of MAX-logit and B-logit over 5000 sample runs for a given number of domains and nodes.
TABLE I
CONVERGENCE RATE IMPROVEMENT BY MAX-LOGIT WHEN p = 0.
Nodes per domain    2 domains    3 domains    4 domains
5 nodes             16.06%       24.52%       33.85%
10 nodes            25.00%       29.81%       28.55%
20 nodes            11.96%       20.19%       20.36%
30 nodes            5.87%        16.46%       17.60%
Table I presents the average reduction in the number of
iterations needed to reach the network optimum solution,
comparing MAX-logit and B-logit, over 5000 sample
runs. It can be observed that MAX-logit reaches the
network optimum solution up to 33.85% faster than
B-logit. It is also interesting to note that the improvement
generally diminishes as the number of nodes in each domain
grows, most visibly beyond 10 nodes per domain. The reason
is that, in such scenarios, the intra-domain
cost becomes more dominant in the overall network cost and
both B-logit and MAX-logit only need to concentrate on
a few combinations of nodes that minimize each domain's
intra-domain cost. Therefore, with the reduced feasible solution
set, the B-logit algorithm performs reasonably well in finding
the optimum solution and the convergence speed improvement by
MAX-logit becomes smaller.
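As a rough illustration of how an entry of Table I could be computed from per-run data, the sketch below measures the iteration at which a cost trajectory first reaches OPT and then reports the relative reduction in the mean iteration count. The helper names `first_hit` and `mean_reduction_percent` are hypothetical, and the text does not say whether the average is a ratio of means or a mean of per-run ratios; the ratio of means is assumed here.

```python
def first_hit(costs, target, tol=1e-9):
    """Zero-based index of the first iteration whose global cost reaches `target`."""
    for t, c in enumerate(costs):
        if c <= target + tol:
            return t
    return None  # the run never reached the target


def mean_reduction_percent(iters_b_logit, iters_max_logit):
    """Relative reduction (in percent) of the mean iteration count.

    Assumption: ratio of means over the sample runs, with B-logit as baseline.
    """
    mean_b = sum(iters_b_logit) / len(iters_b_logit)
    mean_max = sum(iters_max_logit) / len(iters_max_logit)
    return 100.0 * (mean_b - mean_max) / mean_b
```

For instance, if B-logit needs 100 iterations in each of two runs and MAX-logit needs 75 and 85, the mean drops from 100 to 80, a 20% reduction.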
C. Random Cost
Next, we consider the scenarios where triangle inequality
is violated by setting p = 50, i.e., 50% of the links in the
network are associated with random link cost. Figure 8, Figure
9, and Figure 10 illustrate the trajectories of MAX-logit and
B-logit in sample runs for the 2-domain, 3-domain, and
4-domain scenarios, where violations of triangle inequality are
observed. In the 2-domain scenario, as suggested in Theorem 2,
MAX-logit and B-logit still converge to the global optimum
solution as the iterations proceed. In the 3-domain and 4-domain
scenarios, however, both MAX-logit and B-logit
converge to the best Nash equilibrium, which is different from
OPT (since the optimum solution is not a Nash equilibrium
and is thus unstable). We also numerically calculate the price of
stability upper bound in (6), labeled as BOUND in Figure 9
and Figure 10. In both scenarios, the proposed MAX-logit
algorithm converges noticeably faster than B-logit.
We compare the average percentage reduction in the
number of iterations needed to reach the best Nash equilibrium
for MAX-logit relative to B-logit, over 5000 sample
runs. Since the best Nash equilibrium is not the network
optimum solution in general, we use the following rule
as the criterion for convergence. For each sample run, we set
a sufficiently large number, i.e., 2000 in our simulations, as
the maximum number of iterations each algorithm executes.
We average the global network costs of the last 200 iteration
steps as the convergence point, and denote this value by
κ. We consider that the algorithm converges to the best Nash
equilibrium at iteration t if, for the consecutive iteration
steps in (t, t + 100), the global network cost consistently
remains in a small neighborhood of κ, namely κ ± 2. Table II
presents the percentage reduction in the number of iterations
required to converge, comparing MAX-logit and B-logit, where
a similar trend of diminishing improvement as the number of
nodes increases is observed, as in the Euclidean distance scenarios.

Fig. 5. Sample run for 2 domains with 40 nodes. (Global network cost vs. iteration steps; curves: MAX-logit, B-logit, OPT.)
Fig. 6. Sample run for 3 domains with 60 nodes. (Global network cost vs. iteration steps; curves: MAX-logit, B-logit, OPT.)
Fig. 7. Sample run for 4 domains with 80 nodes. (Global network cost vs. iteration steps; curves: MAX-logit, B-logit, OPT.)
Fig. 8. Sample run for 2 domains with 40 nodes and random link cost. (Global network cost vs. iteration steps; curves: MAX-logit, B-logit, OPT.)
Fig. 9. Sample run for 3 domains with 60 nodes and random link cost. (Global network cost vs. iteration steps; curves: MAX-logit, B-logit, OPT, BOUND.)
Fig. 10. Sample run for 4 domains with 80 nodes and random link cost. (Global network cost vs. iteration steps; curves: MAX-logit, B-logit, OPT, BOUND.)

TABLE II
CONVERGENCE RATE IMPROVEMENT BY MAX-LOGIT WHEN p = 50.
Nodes per domain    2 domains    3 domains    4 domains
5 nodes             21.84%       24.46%       27.38%
10 nodes            21.00%       21.44%       21.56%
20 nodes            9.54%        9.13%        5.47%
30 nodes            1.90%        1.93%        2.24%
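The convergence rule described in this subsection (κ taken as the mean cost over the last 200 steps, then a run of 100 steps within κ ± 2) can be sketched as follows. The function name `convergence_iteration` is hypothetical, iterations are indexed from zero, and the window is assumed to cover the 100 steps starting at t, since the text leaves the exact endpoints of (t, t + 100) open.

```python
def convergence_iteration(costs, tail=200, window=100, tol=2.0):
    """Return the first step t that starts `window` consecutive costs within kappa +/- tol.

    `costs` is the global network cost per iteration (2000 entries in the paper's runs).
    Returns None if no such run exists.
    """
    last = costs[-tail:]
    kappa = sum(last) / len(last)  # convergence point: mean of the final `tail` steps
    run_start = None
    for t, c in enumerate(costs):
        if abs(c - kappa) <= tol:
            if run_start is None:
                run_start = t  # a new in-band run begins here
            if t - run_start + 1 >= window:
                return run_start
        else:
            run_start = None  # the cost left the band, so the run is broken
    return None
```

A trajectory that decreases for 50 steps and then stays flat would be reported as converging at step 50 under these assumptions.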
VI. CONCLUSIONS
In this work, we investigate the interactions of gateway
selection by multiple domains in a coalition network. Within
a potential game framework, the existence and inefficiency of
Nash equilibrium in the gateway selection game are analyzed
and quantified. In order to achieve the best Nash equilibrium,
equilibrium selective learning algorithms are studied. We show
that the well-established B-logit algorithm is a special
case of a general family of algorithms denoted by γ-logit,
or Γ collectively. In addition, we propose a novel learning
algorithm named MAX-logit, which retains the favorable
property of equilibrium selection while enjoying a provably
faster convergence rate than any other algorithm in Γ. Our
results are substantiated via simulations.
REFERENCES
[1] C.-K. Chau, J. Crowcroft, K.-W. Lee, and S. H. Y. Wong, “Inter-domain
routing for mobile ad hoc networks,” ACM MobiArch, 2008.
[2] A. Durresi, M. Durresi, and L. Barolli, “Heterogeneous multi domain
network architecture for military communications,” Proceedings of the
Third International Conference on Complex, Intelligent and Software
Intensive Systems, 2009.
[3] D. Monderer and L. Shapley, “Potential games,” Games and
Economic Behavior, vol. 14, pp. 124–143, 1996.
[4] E. Koutsoupias and C. H. Papadimitriou, “Worst-case equilibria,”
STACS, 1999.
[5] E. Anshelevich, A. Dasgupta, J. Kleinberg, E. Tardos, T. Wexler, and
T. Roughgarden, “The price of stability for network design with fair
cost allocation,” IEEE FOCS, 2004.
[6] R. Kleinberg, “Geographic routing using hyperbolic space,” IEEE
INFOCOM, 2007.
[7] A. Cvetkovski and M. Crovella, “Hyperbolic embedding and routing for
dynamic graphs,” IEEE INFOCOM, 2009.
[8] F. Papadopoulos, D. Krioukov, M. Boguna, and A. Vahdat, “Greedy
forwarding in dynamic scale-free networks embedded in hyperbolic
metric spaces,” IEEE INFOCOM, 2010.
[9] G. Arslan, J. Marden, and J. Shamma, “Autonomous vehicle-target
assignment: A game theoretical formulation,” ASME Journal of Dynamic
Systems, Measurement and Control, pp. 584–596, 2007.
[10] Y. Song, C. Zhang, and Y. Fang, “Throughput maximization in
multi-channel wireless mesh access networks,” IEEE ICNP, 2007.
[11] J. Marden and J. Shamma, “Revisiting log-linear learning: Asynchrony,
completeness and payoff-based implementation,” in submission,
http://ecee.colorado.edu/marden/publications.html.
[12] J. Marden, G. Arslan, and J. Shamma, “Cooperative control and potential
games,” IEEE Transactions on Systems, Man and Cybernetics. Part B:
Cybernetics, 2009.
[13] J. Marden and M. Effros, “The price of selfishness in network coding,”
NetCod, 2009.
[14] G. Arslan, F. Demirkol, and Y. Song, “Equilibrium efficiency improvement
in MIMO interference systems: A decentralized stream control
approach,” IEEE Transactions on Wireless Communications, 2007.
[15] R. Horn and C. Johnson, Matrix Analysis. Cambridge University Press,
1986.
[16] S. Boyd, P. Diaconis, and L. Xiao, “Fastest mixing Markov chain on a
graph,” SIAM Review, vol. 46, pp. 667–689, 2004.
[17] R. Montenegro and P. Tetali, Mathematical Aspects of Mixing Times in
Markov Chains. Now Publishers, 2005.
[18] F. Chung, Spectral Graph Theory. CBMS Regional Conference Series
in Mathematics, 1997.
[19] J. Cheeger, “A lower bound for the smallest eigenvalue of the Laplacian,”
Problems in Analysis, pp. 195–199, 1970.