Fig 4 - uploaded by S. Lakshmivarahan
Saddle point game versus nonsaddle point game.  

Source publication
Article
Full-text available
Multilevel games are abstractions of situations where decision makers are distributed in a network environment. In Part I of this paper, the authors present several of the challenging problems that arise in the analysis of multilevel games. In this paper a specific set up is considered where the two games being played are zero-sum games and where t...

Similar publications

Conference Paper
Full-text available
This paper presents preliminary work done on simulation-based optimization of a stochastic material-dispatching system in a retailer network. The problem we consider is one of determining the optimal number of trucks and quantities to be dispatched in such a system. Theoretical solution models for versions of this problem can be found in the litera...

Citations

... A learning automaton [11] [12] [13] [14] [15] [16] [17] [18] is an adaptive decision-making unit that improves its performance by learning how to choose the optimal action from a finite set of allowed actions through repeated interactions with a random environment. Learning automata can be classified into two main families: fixed structure learning automata and variable structure learning automata. ...
Conference Paper
Full-text available
In this paper, we propose some learning automata-based algorithms to solve the minimum spanning tree problem in stochastic graphs when the probability distribution function of the edge weight is unknown. In these algorithms, at each stage a set of learning automata determines which edges are to be sampled. This sampling method may reduce the number of unnecessary samples and hence the running time of the algorithms. The proposed algorithms reduce the number of samples that a sample average approximation method needs to take from the edges of the stochastic graph. It is shown that, by a proper choice of the parameter of the proposed algorithms, the probability that the algorithms find the optimal solution can be made as close to unity as desired.
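The citing text above describes a variable structure learning automaton only in general terms. As a rough illustration, the sketch below shows one common update rule of that family, a linear reward-inaction style scheme. The choice of this rule, the class name `LinearRewardInaction`, the `learning_rate` value, and the `run` helper are all assumptions made for illustration and are not taken from the cited papers.

```python
import random


class LinearRewardInaction:
    """Illustrative variable structure automaton (assumed L_R-I style update)."""

    def __init__(self, n_actions, learning_rate=0.05, rng=None):
        # Start from a uniform action probability vector.
        self.p = [1.0 / n_actions] * n_actions
        self.a = learning_rate
        self.rng = rng or random.Random()

    def choose(self):
        # Sample an action index according to the current probabilities.
        return self.rng.choices(range(len(self.p)), weights=self.p)[0]

    def update(self, action, rewarded):
        # Inaction on penalty: probabilities stay unchanged.
        if not rewarded:
            return
        # On reward, move probability mass toward the chosen action.
        for j in range(len(self.p)):
            if j == action:
                self.p[j] += self.a * (1.0 - self.p[j])
            else:
                self.p[j] *= 1.0 - self.a


def run(reward_probs, steps=5000, seed=0):
    """Let one automaton interact with a hypothetical random environment."""
    rng = random.Random(seed)
    la = LinearRewardInaction(len(reward_probs), rng=rng)
    for _ in range(steps):
        action = la.choose()
        rewarded = rng.random() < reward_probs[action]
        la.update(action, rewarded)
    return la.p
```

Calling `run([0.9, 0.2])` returns the final action probability vector for a two-action environment whose reward probabilities are made up for this example. The update keeps the vector normalized, since the mass added to the rewarded action equals the mass removed from the others.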
... Similarly, in an efficient DBE market, the supply of services will adjust immediately to any arising information about the underlying needs. Cooperation, symbiosis [16], [27] as well as the efficiency [37], [40] of adaptive multi-agent systems have been studied in the context of simple games. In [40] no verifiable definition of efficiency is given, whereas in [37] the system is considered to be in an efficient market phase when all information that can be used by the agents' strategies is traded away, and no agent can accumulate more points than an agent making random guesses would. ...
Article
Full-text available
We investigate knowledge exchange among commercial organisations, the rationale behind it and its effects on the market. Knowledge exchange is known to be beneficial for industry, but in order to explain it, authors have used high-level concepts like network effects, reputation and trust. We attempt to formalise a plausible and elegant explanation of how and why companies adopt information exchange and why it benefits the market as a whole when this happens. This explanation is based on a multi-agent model that simulates a market of software providers. Even though the model does not include any high-level concepts, information exchange naturally emerges during simulations as a successful, profitable behaviour. The conclusions reached by this agent-based analysis are twofold: (1) A straightforward set of assumptions is enough to give rise to exchange in a software market. (2) Knowledge exchange is shown to increase the efficiency of the market.
... By applying the periodical policy technique to a hierarchy of learning agents, we try to create a solution technique that solves conflicting-interest single-stage as well as multi-stage games. For now, our algorithm is designed for stochastic tree-structured multi-stage games as defined by E. Billard (Zhou, Billard, & Lakshmivarahan 1999). He views multi-stage (or multilevel) games as games where decision makers at the top level decide which game to play at the lower levels. ...
... However, the MMDP formalization leaves out the games where the agents have different or competing interests. E. Billard introduced a multi-stage game where the agents have pure conflicting interests (Zhou, Billard, & Lakshmivarahan 1999). Since the pay-offs of the agents aren't shared in conflicting games, for each joint-action, we have an element in the game matrix of the form (p_h, p_k), where p_h is the probability for Agent h to receive a positive reward of +1 and p_k is the probability for Agent k. ...
... The hierarchical setup Billard used in (Zhou, Billard, & Lakshmivarahan 1999) consists of 2 levels and is depicted in Figure 4. In this setting of Billard there are 2 agents both with 2 separate learning automata (Agent 1 consists of LA 1 1 at the high level and LA 2 1 at the low level). All of the learning automata have 2 actions. ...
Article
Full-text available
Coordination to some equilibrium point is an interesting problem in multi-agent reinforcement learning. In common interest single-stage settings this problem has been studied thoroughly and efficient solution techniques have been found. For particular multi-stage games, some experiments also show good results. However, for a large class of problems the agents do not share a common pay-off function. Again, for single-stage problems, a solution technique exists that finds a fair solution for all agents. In this paper we report on a technique that is based on learning automata theory and periodical policies. Letting pseudo-independent agents play periodical policies enables them to behave socially in pure conflicting multi-stage games as defined by E. Billard (Billard & Lakshmivarahan 1999; Zhou, Billard, & Lakshmivarahan 1999). We experimented with this technique on games where simple learning automata have the tendency not to cooperate or to show oscillating behavior resulting in a suboptimal pay-off. Simulation results illustrate that our technique overcomes these problems and our agents find a fair solution for both agents.
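The citing text for this paper describes conflicting-interest game matrices whose entries have the form (p_h, p_k), the probabilities that Agent h and Agent k each receive a reward of +1. The sketch below is one hedged reading of that description. The matrix values in `GAME`, the function names, the reward of 0 when no +1 is received, and the independent sampling of the two rewards are assumptions made for illustration, not details from the source.

```python
import random

# Hypothetical 2x2 game matrix: GAME[i][j] = (p_h, p_k) for the joint action
# in which Agent h plays i and Agent k plays j. Values are made up.
GAME = [
    [(0.6, 0.4), (0.2, 0.8)],
    [(0.7, 0.3), (0.5, 0.5)],
]


def sample_rewards(game, action_h, action_k, rng):
    """Draw one (reward_h, reward_k) pair for a joint action.

    Assumption: each agent receives +1 with its own probability and 0
    otherwise, and the two draws are independent.
    """
    p_h, p_k = game[action_h][action_k]
    reward_h = 1 if rng.random() < p_h else 0
    reward_k = 1 if rng.random() < p_k else 0
    return reward_h, reward_k


def average_rewards(game, action_h, action_k, n_plays=10000, seed=0):
    """Estimate both agents' mean rewards for a fixed joint action."""
    rng = random.Random(seed)
    total_h = total_k = 0
    for _ in range(n_plays):
        r_h, r_k = sample_rewards(game, action_h, action_k, rng)
        total_h += r_h
        total_k += r_k
    return total_h / n_plays, total_k / n_plays
```

With enough plays, `average_rewards(GAME, 0, 1)` should land near the matrix entry for that joint action, which is a quick way to check a hand-written game matrix before attaching learning agents to it.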
... The alternative is to set to some very small value to remove most of the effects of the randomness and, in the current study, we show some three-dimensional results. Also, simulations are used in [17] to illustrate the convergence proofs. ...
... The -scheme is optimal for zero-sum games with saddle points [18]. The concurrent study [17] shows that -converges to noncooperation in multilevel zero-sum games with saddle points. Fig. 1. ...
... Likewise for systems which do not cooperate, except game B has a more favorable saddle point. The concurrent study [17] ...
Article
Full-text available
A model is presented of learning automata playing stochastic games at two levels. The high level represents the choice of the game environment and corresponds to a group decision. The low level represents the choice of action within the selected game environment. Both of these decision processes are affected by delays in the information state due to inherent latencies or to the delayed broadcast of state changes. Analysis of the intrinsic properties of this Markov process is presented along with simulated iterative behavior and expected iterative behavior. The results show that simulation agrees with expected behavior for small step lengths in the iterative map. A Feigenbaum diagram and numerical computation of the Lyapunov exponents show that, for very small penalty parameters, the system exhibits chaotic behavior.
Article
Due to the hardness of solving the minimum spanning tree (MST) problem in stochastic environments, the stochastic MST (SMST) problem has not received the attention it merits, specifically when the probability distribution function (PDF) of the edge weight is not known a priori. In this paper, we first propose a learning automata-based sampling algorithm (Algorithm 1) to solve the MST problem in stochastic graphs where the PDF of the edge weight is assumed to be unknown. At each stage of the proposed algorithm, a set of learning automata is randomly activated and determines the graph edges that must be sampled in that stage. As the proposed algorithm proceeds, the sampling process focuses on the spanning tree with the minimum expected weight. Therefore, the proposed sampling method is capable of decreasing the rate of unnecessary samplings and shortening the time required for finding the SMST. The convergence of this algorithm is theoretically proved and it is shown that by a proper choice of the learning rate the spanning tree with the minimum expected weight can be found with a probability close enough to unity. Numerical results show that Algorithm 1 outperforms the standard sampling method. Selecting a proper learning rate is the most challenging issue in learning automata theory, as it determines the trade-off between the cost and the efficiency of the algorithm. To improve the efficiency (i.e., the convergence speed and convergence rate) of Algorithm 1, we also propose four methods to adjust the learning rate in Algorithm 1; the resultant algorithms are called Algorithm 2 through Algorithm 5. In these algorithms, the probabilistic distribution parameters of the edge weight are taken into consideration for adjusting the learning rate. Simulation experiments show the superiority of Algorithm 5 over the others. To show the efficiency of Algorithm 5, its results are compared with those of the multiple edge sensitivity method (MESM). The obtained results show that Algorithm 5 performs better than MESM in terms of both the running time and the sampling rate.
Conference Paper
Full-text available
This review surveys a few major questions in the field of decision theory. It is argued that a re-examination of some of the fundamental concepts in decision theory may have important implications for theoretical and even empirical research in economics and related fields.
Conference Paper
The analysis of the collective behavior of agents in a distributed multi-agent environment received a lot of attention in the past decade. More accurately, coordination was studied intensely because it enables agents to converge to Pareto optimal solutions and Nash equilibria. Most of these studies focussed on team games. In this paper we report on a technique for finding fair solutions in conflicting interest multi-stage games. Our hierarchical periodic policies algorithm is based on the characteristics of a homo egualis society in which the players also care about the proportional distribution of the pay-off in relation to the pay-off of the other players. This feature is built into a hierarchy of learning automata which is suited for playing sequential decision problems.
Conference Paper
To achieve synergy, it is important for agents to form cooperative groups such that shared resources, strategies and information can be fully utilized. A game-theoretic model is presented in which agents decide whether it is beneficial to form groups and what actions to take within the chosen context. Learning automata are used to model this multi-level decision-making process. The results show that asymmetries in initialization and equilibria do not affect this process. With delayed information, both symmetric and asymmetric penalties lead to chaos but with different Lyapunov exponents.
Article
A new learning algorithm for the hierarchical structure learning automata (HSLA) operating in the nonstationary multiteacher environment (NME) is proposed. The proposed algorithm is derived by extending the original relative reward-strength algorithm so that it can be used in the HSLA operating in the general NME. It is shown that the proposed algorithm ensures convergence with probability 1 to the optimal path under a certain type of NME. Several computer simulations, carried out to compare the performance of the proposed algorithm in some NMEs against that of two of the fastest algorithms today, confirm the effectiveness of the proposed algorithm.
Article
The paper proposes a network routing method based on a computational ecology model. The computational ecology model is a mathematical model proposed by B.A. Huberman and T. Hogg (1988), which represents a macro action of multi-agent systems. We formulate routing on a computer network as a resource allocation problem, where packets and links are regarded as agents and resources, respectively. Then, we apply an extended computational ecology model to this problem. Agents compete so as to get more payoffs from links. As a result, they get the same payoffs, and a good resource allocation is achieved. In each node, each packet selects a link according to the selection rate decided through these conflicts, and routing is accomplished autonomously and adaptively on the computer network. Moreover, we improve the fault tolerance of the system by local information exchanges. Finally, we examine the efficiency of the proposed method by computer simulation.