Fig. 2
Cooperation with a pure strategy (saddle point). Games A and B have coincident saddle points, but the payoff in A is greater than in B, so the agents support full cooperation. One simulation result (parameter = 0.1) is initialized to very low cooperation but reaches the same conclusion as three other expected iterative solutions (dashed).
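To make the caption concrete: a pure-strategy saddle point of a zero-sum payoff matrix is an entry that is simultaneously the minimum of its row and the maximum of its column, so the row player's maximin equals the column player's minimax. Below is a minimal sketch with invented 2x2 matrices standing in for games A and B (the paper's actual payoffs are not reproduced); both matrices have their saddle point in the same cell, with A paying more there, matching the situation the figure describes.

```python
import numpy as np

def pure_saddle_point(M):
    """Return (row, col) of a pure-strategy saddle point of the zero-sum
    payoff matrix M (row player maximizes), or None if none exists.
    A saddle entry is the minimum of its row and the maximum of its column."""
    for i in range(M.shape[0]):
        for j in range(M.shape[1]):
            if M[i, j] == M[i, :].min() and M[i, j] == M[:, j].max():
                return i, j
    return None

# Invented payoff matrices: both have a saddle point at (0, 0),
# but game A pays more there, as in the figure.
A = np.array([[0.9, 1.0],
              [0.2, 0.3]])
B = np.array([[0.6, 0.8],
              [0.1, 0.4]])
print(pure_saddle_point(A), pure_saddle_point(B))  # (0, 0) (0, 0)
```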

Source publication
Article
A model is presented of learning automata playing stochastic games at two levels. The high level represents the choice of the game environment and corresponds to a group decision. The low level represents the choice of action within the selected game environment. Both of these decision processes are affected by delays in the information state due t...

Similar publications

Article
For a learning automaton, properly configuring the learning parameters, which are crucial to the automaton's performance, is relatively difficult because of the manual parameter tuning required before real applications. To ensure stable and reliable performance in stochastic environments, parameter tuning can be a time-consuming and inter...

Citations

... A learning automaton tries to select the optimal action from its action set so as to minimize the average penalty received from the environment. Learning automata are beneficial in systems where information about the environment is incomplete [28]. In addition, learning automata perform very well in complex, dynamic, and random environments with many uncertainties. ...
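As a concrete illustration of the snippet above, here is a minimal sketch of a two-action linear reward-penalty (L_R-P) automaton in a stationary random environment. The penalty probabilities are invented for the example, and the rate 0.1 echoes the value in the figure caption, though that identification is our assumption.

```python
import random

def linear_reward_penalty(penalty_probs, rate=0.1, steps=5000):
    """Two-action linear reward-penalty (L_R-P) automaton: p[i] is the
    probability of choosing action i. A reward moves the chosen action's
    probability toward 1; a penalty shifts probability mass to the other
    action."""
    p = [0.5, 0.5]
    for _ in range(steps):
        a = 0 if random.random() < p[0] else 1
        penalized = random.random() < penalty_probs[a]
        if not penalized:
            p[a] += rate * (1 - p[a])      # reward: reinforce action a
        else:
            p[a] *= (1 - rate)             # penalty: move away from a
        p[1 - a] = 1 - p[a]
    return p

# Action 0 is penalized less often, so its probability should dominate.
print(linear_reward_penalty([0.2, 0.8]))
```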
Article
Personalized recommender systems provide favored services based on user preferences and interests. Because a user's interests change over time, the recommender system must track these changes automatically. To address this research gap and the cold-start problem, in the current study we suggest a framework for creating an adaptive user profile for a personalized recommender system using learning automata. We cluster items based on their features. In this technique, the learning automaton adjusts the amount of user interest in each cluster based on user feedback and then recommends the best items to the user based on the user's demographic information and preferences. Several experiments are conducted on three movie datasets to show the performance of the proposed algorithm. The obtained results demonstrate that the proposed algorithm outperforms several existing approaches in terms of precision, recall, MAE, and RMSE.
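A rough sketch of the adaptive-profiling idea described in the abstract, under assumptions of our own: the cluster names are hypothetical, and a simple reward-inaction style update stands in for the paper's actual learning scheme.

```python
import random

def update_interest(p, cluster, liked, rate=0.05):
    """On positive feedback, reinforce the recommended cluster and scale
    the others down (the distribution stays normalized); on negative
    feedback, leave the profile unchanged (reward-inaction)."""
    if liked:
        for c in p:
            p[c] = p[c] + rate * (1 - p[c]) if c == cluster else p[c] * (1 - rate)
    return p

# Hypothetical genre clusters with an initially uniform interest profile.
profile = {"action": 1/3, "drama": 1/3, "comedy": 1/3}
for _ in range(200):
    rec = random.choices(list(profile), weights=profile.values())[0]
    liked = rec == "drama" and random.random() < 0.8   # simulated feedback
    profile = update_interest(profile, rec, liked)
print(profile)   # interest should concentrate on "drama"
```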
... For instance, in Team Q-learning [37], agents learn common Q-values independently. In addition, the use of learning automata to find optimal policies in Markov games has been presented in [38][39][40]. ...
Preprint
Learning-based techniques could be an alternative approach for solving Dynamic Distributed Constraint Optimization Problems (DDCOPs) and are computationally cheaper than sequential DCOP solvers. This paper proposes a learning-based solution for DDCOPs in which the environment is stochastic due to the presence of multiple agents. In our approach, the problem is modelled as a multi-agent Markov Decision Process, and a learning automaton, which is a relatively simple method requiring little qualitative data, is then employed to learn how to assign values to variables. The proposed method considers two very important issues, namely time-step dependency and uncertainty about the future events upon which we allocate values to variables. Experimental results reveal that the employed method converges and satisfies the constraints of the optimization problems, in comparison to well-known methods.
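A toy sketch of the general approach, not the preprint's actual formulation: one learning automaton per variable chooses a value, and all automata are reinforced when the (single, invented) constraint is satisfied.

```python
import random

# Toy constraint: the two variables must take different values
# (a two-node graph-colouring flavour of a DCOP; entirely invented).
p = {"x": [1/3] * 3, "y": [1/3] * 3}   # one automaton per variable
rate = 0.1

for _ in range(2000):
    choice = {v: random.choices(range(3), weights=p[v])[0] for v in p}
    satisfied = choice["x"] != choice["y"]
    if satisfied:                      # reward-inaction: update only on success
        for v, a in choice.items():
            p[v] = [q + rate * (1 - q) if i == a else q * (1 - rate)
                    for i, q in enumerate(p[v])]

print({v: [round(q, 2) for q in p[v]] for v in p})   # near-deterministic, x != y
```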
... In fact, John Harsanyi's seminal work on incomplete information games [8] was one of the first significant contributions to this field. The incomplete information model has been used to study multi-agent systems in a variety of contexts, including network games [9], team decision making [10] and pursuit/evasion [11]. While this body of important work is relevant to our problem setting, our analysis departs from the incomplete information framework as, in our setting, agents do not possess probabilistic models of the system state and have only limited knowledge on the other agents' beliefs. ...
Preprint
In recent years, a significant research effort has been devoted to the design of distributed protocols for the control of multi-agent systems, as the scale and limited communication bandwidth characteristic of such systems render centralized control impossible. Given the strict operating conditions, it is unlikely that every agent in a multi-agent system will have local information that is consistent with the true system state. Yet, the majority of works in the literature assume that agents share perfect knowledge of their environment. This paper focuses on understanding the impact that inconsistencies in agents' local information can have on the performance of multi-agent systems. More specifically, we consider the design of multi-agent operations under a game theoretic lens where individual agents are assigned utilities that guide their local decision making. We provide a tractable procedure for designing utilities that optimize the efficiency of the resulting collective behavior (i.e., price of anarchy) for classes of set covering games where the extent of the information inconsistencies is known. In the setting where the extent of the informational inconsistencies is not known, we show -- perhaps surprisingly -- that underestimating the level of uncertainty leads to better price of anarchy than overestimating it.
... The objective of a learning automaton is to find the optimal action from the action set so that the average penalty received from the environment is minimized. Learning automata have been found useful in systems where incomplete information about the environment exists [60]. Learning automata have also been shown to perform well in complex, dynamic, and random environments with a large amount of uncertainty. ...
Article
During the last decades, a host of efficient algorithms have been developed for solving the minimum spanning tree problem in deterministic graphs, where the weight associated with each graph edge is assumed to be fixed. Though the edge weights clearly vary with time in realistic applications, making such an assumption wrong, finding the minimum spanning tree of a stochastic graph has not received the attention it merits. This is due to the fact that the minimum spanning tree problem becomes incredibly hard to solve when the edge weight is assumed to be a random variable. It becomes more difficult still if we assume that the probability distribution function of the edge weight is unknown. In this paper, we propose a learning automata-based heuristic algorithm to solve the minimum spanning tree problem in stochastic graphs wherein the probability distribution function of the edge weight is unknown. The proposed algorithm, taking advantage of learning automata, determines which edges must be sampled at each stage. As the presented algorithm proceeds, the sampling process is concentrated on the edges that constitute the spanning tree with the minimum expected weight. The proposed learning automata-based sampling method decreases the number of samples that need to be taken from the graph by reducing the rate of unnecessary samples. Experimental results show the superiority of the proposed algorithm over well-known existing methods, both in terms of the number of samples and the running time of the algorithm.
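A bare-bones sketch of the problem setting, with invented edge-weight means: sample noisy edge weights, maintain running means, and run Kruskal's algorithm on the estimates. The cited paper's contribution is to replace the naive uniform sampling below with learning automata that concentrate samples on promising edges.

```python
import random

def kruskal(n, edges):
    """Minimum spanning tree by Kruskal's algorithm; edges are (w, u, v)."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    tree = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v))
    return tree

# Invented stochastic graph: each edge weight is a random variable with an
# unknown mean; we only ever observe samples.
true_mean = {(0, 1): 1.0, (1, 2): 2.0, (0, 2): 2.5, (2, 3): 1.5, (1, 3): 3.0}
est = {e: 0.0 for e in true_mean}
count = {e: 0 for e in true_mean}

for _ in range(500):
    e = random.choice(list(true_mean))        # naive uniform sampling; the
    count[e] += 1                             # paper's automata would instead
    sample = random.gauss(true_mean[e], 0.3)  # concentrate on promising edges
    est[e] += (sample - est[e]) / count[e]    # incremental running mean

print(kruskal(4, [(est[(u, v)], u, v) for u, v in est]))
```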
... In a bilevel game such as that introduced in [7], the upper level represents the choice of the game environment, whereas the lower level represents the choice of action. A possible consequence of the work presented in this paper is that the upper level players are in a position to recognize the game played at the lower level. ...
Article
Many complex systems, whether biological, sociological, or physical, can be represented using networks. In these networks, a node represents an entity, and an arc represents a relationship/constraint between two entities. In discrete dynamics, one can construct a series of networks, each representing a time snapshot of the interaction among the different components in the system. Understanding these networks is a key to understanding the dynamics of real and artificial systems. Network motifs are small graphs, usually three to four nodes, representing local structures. They have been widely used in studying complex systems and in characterizing features at the system level by analyzing locally how the substructures are formed. Frequencies of different network motifs have been shown in the literature to vary from one network to another, and it has been hypothesized that these variations are due to the evolution/dynamics of the system. In this paper, we show for the first time that in strategy games, each game (i.e., type of dynamism) has its own signature of motifs and that this signature is maintained during the evolution of the game. We reveal that deterministic strategy games have unique footprints (motifs' counts) that can be used to recognize and classify a game's type and that these footprints are consistent along the evolutionary path of the game. The findings of this paper have significance for a wide range of fields in cybernetics.
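A simplified illustration of a motif signature: a census of connected three-node subgraphs (open paths versus triangles) in an undirected graph. The paper works with richer motif sets over evolving game networks; the graph below is invented.

```python
from itertools import combinations

def three_node_census(nodes, edges):
    """Count connected 3-node motifs in an undirected graph:
    open paths (2 internal edges) versus triangles (3 internal edges)."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    census = {"path": 0, "triangle": 0}
    for trio in combinations(nodes, 3):
        k = sum(1 for u, v in combinations(trio, 2) if v in adj[u])
        if k == 2:
            census["path"] += 1
        elif k == 3:
            census["triangle"] += 1
    return census

# Invented snapshot of an interaction network: a triangle plus a tail.
print(three_node_census(range(5), [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)]))
# {'path': 3, 'triangle': 1} -- this count vector is the motif signature
```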
... Cooperation and symbiosis [16], [27], as well as the efficiency [37], [40], of adaptive multi-agent systems have been studied in the context of simple games. In [40] no verifiable definition of efficiency is given, whereas in [37] the system is considered to be in an efficient market phase when all information that can be used by the agents' strategies has been traded away, and no agent can accumulate more points than an agent making random guesses would. ...
Article
We investigate knowledge exchange among commercial organisations, the rationale behind it, and its effects on the market. Knowledge exchange is known to be beneficial for industry, but in order to explain it, authors have used high-level concepts like network effects, reputation, and trust. We attempt to formalise a plausible and elegant explanation of how and why companies adopt information exchange and why it benefits the market as a whole when this happens. This explanation is based on a multi-agent model that simulates a market of software providers. Even though the model does not include any high-level concepts, information exchange naturally emerges during simulations as a successful, profitable behaviour. The conclusions reached by this agent-based analysis are twofold: (1) a straightforward set of assumptions is enough to give rise to exchange in a software market; (2) knowledge exchange is shown to increase the efficiency of the market.
... Autonomous, interacting agents have widespread applicability, for example, in distributed artificial intelligence [4], knowledge acquisition [5], and conflict detection [6]. Learning automata have been used to model distributed agents in environments with delays in communication and with decisions regarding group formation [7]–[9]. A variation of design patterns in robotics has been examined [10]; these few examples give a flavor of the large literature on autonomous agents. ...
... A UML presentation of user interfaces very similar to the MVC pattern is presented in [34], along with a hierarchical building application resembling the chain-of-responsibility example in Section III. Finally, a study by this author [7] might have benefited from the use of the MVC and observable patterns. The expected value of game actions is the "model", which is observed by decision-making agents. ...
Article
A use case map (UCM) presents, in general, an abstract description of a complex system and, as such, is a good candidate for representing scenarios of autonomous agents interacting with other autonomous agents. The "gang of four" design patterns are intended for object-oriented software development but at least eight of the patterns illustrate structure, or architecture, that is appropriate for interacting agents, independent of software development. This study presents these particular patterns in the form of UCMs to describe abstract scenarios of agent interaction. Seven of the patterns attempt to balance the decentralized nature of interacting agents with an organized structure that makes for better, cleaner interactions. An example performance analysis is provided for one of the patterns, illustrating the benefit of an early abstraction of complex agent behavior. The original contribution here is a UCM presentation of the causal paths in agent behavior as suggested by software design patterns.
... By applying the periodical-policy technique to a hierarchy of learning agents, we try to create a solution technique that solves both conflicting-interest single-stage and multi-stage games. For now, our algorithm is designed for stochastic tree-structured multi-stage games as defined by E. Billard (Billard & Lakshmivarahan 1999; Zhou, Billard, & Lakshmivarahan 1999). He views multi-stage (or multilevel) games as games where decision makers at the top level decide which game to play at the lower levels. ...
... However, the MMDP formalization leaves out games where the agents have different or competing interests. E. Billard introduced a multi-stage game where the agents have purely conflicting interests (Billard & Lakshmivarahan 1999; Zhou, Billard, & Lakshmivarahan 1999). Since the pay-offs of the agents are not shared in conflicting games, for each joint action we have an element in the game matrix of the form (p_h, p_k), where p_h is the probability for Agent h to receive a positive reward of +1 and p_k is the corresponding probability for Agent k. ...
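To make the (p_h, p_k) notation concrete, here is a small sketch with invented reward probabilities: each cell of the game matrix stores the two agents' independent chances of receiving a +1 reward for that joint action.

```python
import random

# Each cell holds (p_h, p_k): the probabilities that Agent h and Agent k,
# respectively, receive a reward of +1 for that joint action. Values are
# invented; in a purely conflicting game the agents' preferred cells differ.
game = [[(0.9, 0.1), (0.2, 0.5)],
        [(0.5, 0.2), (0.1, 0.9)]]

def play(action_h, action_k):
    """Sample one round: each agent's binary reward is drawn independently."""
    p_h, p_k = game[action_h][action_k]
    return (1 if random.random() < p_h else 0,
            1 if random.random() < p_k else 0)

print(play(0, 0))  # e.g. (1, 0): h is usually rewarded here, k rarely
```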
Article
Coordination to some equilibrium point is an interesting problem in multi-agent reinforcement learning. In common-interest single-stage settings this problem has been studied profoundly and efficient solution techniques have been found. For particular multi-stage games, some experiments also show good results. However, for a large class of problems the agents do not share a common pay-off function. Again, for single-stage problems, a solution technique exists that finds a fair solution for all agents. In this paper we report on a technique that is based on learning automata theory and periodical policies. Letting pseudo-independent agents play periodical policies enables them to behave socially in purely conflicting multi-stage games as defined by E. Billard (Billard & Lakshmivarahan 1999; Zhou, Billard, & Lakshmivarahan 1999). We experimented with this technique on games where simple learning automata have the tendency not to cooperate or to show oscillating behavior, resulting in a suboptimal pay-off. Simulation results illustrate that our technique overcomes these problems and our agents find a fair solution for both agents.
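A schematic sketch of the periodical-policy idea as described above: agents alternate, period by period, between the joint actions each agent prefers, so the long-run payoffs even out. The equilibria and payoff probabilities here are invented; the paper couples this alternation with learning automata rather than fixing it by hand.

```python
# Joint payoffs (reward probability for agent h, for agent k) at the two
# pure equilibria of a hypothetical conflicting-interest game.
equilibria = [((0, 0), (0.9, 0.1)),   # h's preferred joint action
              ((1, 1), (0.1, 0.9))]   # k's preferred joint action

period, horizon = 100, 1000
totals = [0.0, 0.0]
for t in range(horizon):
    # Switch the agreed-upon equilibrium every `period` steps.
    _, (r_h, r_k) = equilibria[(t // period) % 2]
    totals[0] += r_h
    totals[1] += r_k
print([tot / horizon for tot in totals])   # ~[0.5, 0.5]: a fair split
```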
... However, with sufficient communication, the agents adapt to a coordinated policy. In [22], LAs are used for playing stochastic games at multiple levels. To conclude, we believe that the similarities between ACO and LAs mean that the theory of LA can serve as a good theoretical tool for analyzing ACO algorithms and learning in MAS in general. ...
... Analysis of the collective behavior of agents distributed in a network has received considerable attention in the literature (refer to [1], [2], and the references therein). In [1] and [2], distributed decision makers are modeled as players in a two-level game. High-level decisions concern the game environment and determine the willingness of the players to form a coalition. ...
... There are several choices for the games A and B (zero-sum games or nonzero-sum games) and there are several choices for the learning algorithms used by the decision makers. In [1] and [2], the decision makers exclusively use the classical linear reward-penalty algorithm (L_R-P). The model considered in these papers is a natural extension of the one-level games analyzed in [5]. ...
Article
Multilevel games are abstractions of situations where decision makers are distributed in a network environment. In Part I of this paper, the authors present several of the challenging problems that arise in the analysis of multilevel games. In this paper, a specific setup is considered where the two games being played are zero-sum games and where the decision makers use the linear reward-inaction algorithm of stochastic learning automata. It is shown that the effective game matrix is decided by the willingness and the ability to cooperate, and that it is a convex combination of two zero-sum game matrices. Analysis of the properties of this effective game matrix and the convergence of the decision process shows that players tend toward noncooperation in these specific environments. Simulation results illustrate this noncooperative behavior.
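The convex-combination statement can be illustrated directly. The matrices below reuse the invented A and B from the sketch under the figure caption, and the mixing weight lam stands in for the cooperation level that the paper derives from the players' willingness and ability to cooperate.

```python
import numpy as np

# Zero-sum payoff matrices for the two candidate games (illustrative values,
# the same invented A and B as in the saddle-point sketch above).
A = np.array([[0.9, 1.0],
              [0.2, 0.3]])
B = np.array([[0.6, 0.8],
              [0.1, 0.4]])

lam = 0.7                      # stands in for the derived cooperation level
C = lam * A + (1 - lam) * B    # effective game matrix

# With a pure saddle point, the row player's maximin equals the column
# player's minimax, so the effective game has a well-defined value.
print(C)
print(C.min(axis=1).max(), C.max(axis=0).min())   # both 0.81: saddle at (0, 0)
```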