Conference Paper

Fuzzy Q-learning in a nondeterministic environment: Developing an intelligent Ms. Pac-Man agent

Authors: DeLooze and Viner

Abstract

This paper reports results from training an intelligent agent to play the Ms. Pac-Man video game using variations of a fuzzy Q-learning algorithm. This approach allows us to address the nondeterministic aspects of the game as well as to find a successful self-learning, or adaptive, playing strategy. The strategy presented is a table-based learning strategy in which the intelligent agent analyzes the current game situation, stores membership values for each of several contributors to that situation (distance to the closest pill, distance to the closest power pill, and distance to the closest ghost), and makes decisions based on these values.
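The paper itself is not reproduced here, but the abstract's description maps onto a small, self-contained sketch: fuzzy membership functions turn the three distances into linguistic labels, fuzzy state aggregation collapses those labels into a tiny state index, and a tabular Q-learning rule selects and updates actions. Everything below (thresholds, label names, learning rates) is illustrative rather than the authors' actual design.

import random
from collections import defaultdict

ACTIONS = ["up", "down", "left", "right"]

def memberships(distance, near=4.0, far=12.0):
    # Illustrative fuzzy labels for a maze distance; thresholds are made up.
    if distance <= near:
        mu_near = 1.0
    elif distance >= far:
        mu_near = 0.0
    else:
        mu_near = (far - distance) / (far - near)
    mu_far = 1.0 - mu_near
    mu_medium = 1.0 - abs(mu_near - mu_far)   # peaks between 'near' and 'far'
    return {"near": mu_near, "medium": mu_medium, "far": mu_far}

def aggregate_state(d_pill, d_power_pill, d_ghost):
    # Fuzzy state aggregation: keep the dominant label of each input,
    # so only 3 * 3 * 3 = 27 aggregated states can ever occur.
    labels = []
    for d in (d_pill, d_power_pill, d_ghost):
        mu = memberships(d)
        labels.append(max(mu, key=mu.get))
    return tuple(labels)                       # e.g. ('near', 'far', 'medium')

Q = defaultdict(float)                         # table: (state, action) -> value
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

def choose_action(state):
    if random.random() < EPSILON:              # occasional exploration
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

With three labels per input, as assumed here, the table never grows beyond 27 states and four actions, which is what keeps plain Q-learning workable despite the nondeterministic ghost behaviour.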

... References:
Original (Screen-Capture): [9], [10], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26], [27], [28], [29], [30], [31], [32], [33], [34], [35]
Public Variant: [36], [37], [38], [39], [40], [41], [42], [43]
Ms Pac-Man vs Ghosts engine: [12], [44], [20], [45], [46], [47], [13], [48], [49], [50], [51], [52], [53], [54], [55], [56], [57], [58], [59], [60], [61], [62], [63], [64], [65], [66], [67]
Ms Pac-Man vs Ghost Team engine: [14]
Own implementation: [68], [69], [70], [71], [72], [73], [74], [75], [76], [77], [78], [79], [80], [81], [82], [83], [84], [85], [86], [87], [88], [89], [90], [91], [92]
... in the most publications. Prior to the competitions described above, papers were largely fragmented, with each using their own, often much simplified version of the game. ...
... Rule-based & Finite State Machines: [71], [16], [15], [72], [18], [23], [24], [9], [52], [10], [65] 67
Tree Search & Monte Carlo: [20], [25], [26], [74], [13], [29], [30], [49], [51], [59], [56], [61]
Evolutionary Algorithms: [68], [69], [47], [45], [46], [48], [53], [50], [58], [57], [59], [60], [63]
Neural Networks: [70], [38], [75]
Neuro-evolutionary: [12], [36], [37], [44], [28], [31], [32], [33], [77], [62], [67], [64], [43]
Reinforcement Learning: [73], [21], [19], [22], [78], [41], [42], [34], [82], [92], [35]
Other: [27], [17], [54], [79], [90], [91]
Game psychology: [93], [94], [95], [96], [97], [98], [99] 7
Psychology: [100], [101], [81] 3
Robotics: [102], [103] 2
Sociology: [104], [105] 2
Brain Computer Interfaces: [83], [84], [85] 3
Biology and Animals: [106] 1
Education: [102], [107], [103], [80] 4
Other: [108], [39], [109], [40] 4 ...
... DeLooze and Viner [19] make use of Fuzzy Q-Learning to develop a controller for the screen-capture version of Ms Pac-Man. Fuzzy Q-Learning combines fuzzy state aggregation with Q-learning: fuzzy state aggregation builds states from multiple fuzzy sets, reducing the total number of states that need to be considered and thereby making Q-learning an applicable technique. ...
Article
Full-text available
Pac-Man and its equally popular successor Ms Pac-Man are often attributed to being the frontrunners of the golden age of arcade video games. Their impact goes well beyond the commercial world of video games and both games have featured in numerous academic research projects over the last two decades. In fact, scientific interest is on the rise and many avenues of research have been pursued, including studies in robotics, biology, sociology and psychology. The most active field of research is computational intelligence, not least because of popular academic gaming competitions that feature Ms Pac-Man. This paper summarises peer-reviewed research that focuses on either game (or close variants thereof) with particular emphasis on the field of computational intelligence. The potential usefulness of games like Pac-Man for higher education is also discussed and the paper concludes with a discussion of prospects for future work.
... 1. The existing methodologies addressing this problem are diverse, but all have fallen far short of expert human players [7]–[12]. The previous approaches use fairly short-term greedy strategies and fail to effectively consider future game states, which is an essential capability needed for success. ...
... There are initially 220 dots in the maze, so note that the presented approach was capable of clearing at least one maze on most of its attempts before being caught by the ghosts. This level of success is significantly greater than what has been accomplished with most existing automated players that are not hand coded [7]–[10]. One of the major strengths of the method is its ability to plan optimal paths relatively far into the future, which is a common shortfall shared by the currently highest-scoring programs. The methods were also tested with a slightly different game format to focus on the skills that may be more relevant to applications outside of Ms. Pac-Man. ...
Article
This paper presents a model-based approximate λ-policy iteration approach using temporal differences for optimizing paths online for a pursuit-evasion problem, where an agent must visit several target positions within a region of interest while simultaneously avoiding one or more actively pursuing adversaries. This method is relevant to applications such as robotic path planning, mobile-sensor applications, and path exposure. The methodology described utilizes cell decomposition to construct a decision tree and implements a temporal difference-based approximate λ-policy iteration to combine online learning with prior knowledge through modeling, to achieve the objectives of minimizing the risk of being caught by an adversary and maximizing a reward associated with visiting target locations. Online learning and frequent decision tree updates allow the algorithm to quickly adapt to unexpected movements by the adversaries or dynamic environments. The approach is illustrated through a modified version of the video game Ms. Pac-Man, which is shown to be a benchmark example of the pursuit-evasion problem. The results show that the approach presented in this paper outperforms several other methods as well as most human players. Keywords: Approximate dynamic programming, Reinforcement learning, Path planning, Pursuit-evasion games
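The abstract above gives no equations; the temporal-difference machinery that λ-policy iteration builds on is the familiar TD(λ) backup with eligibility traces, sketched below over a generic discretised state space. This is background only, not the paper's cell-decomposition and decision-tree model.

from collections import defaultdict

GAMMA, ALPHA, LAMBDA = 0.95, 0.1, 0.8

V = defaultdict(float)      # value of each (discretised) cell/state
E = defaultdict(float)      # eligibility traces

def td_lambda_step(state, reward, next_state):
    # One TD(lambda) backup: the TD error is propagated to every recently
    # visited state in proportion to its eligibility trace.
    delta = reward + GAMMA * V[next_state] - V[state]
    E[state] += 1.0                       # accumulating trace
    for s in list(E):
        V[s] += ALPHA * delta * E[s]
        E[s] *= GAMMA * LAMBDA            # traces decay geometrically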
... The same group recently compared Temporal Difference Learning and Evolution approaches for this task in [6]. Fuzzy techniques were also exploited together with Evolutionary Strategy and Q-Learning in [7] and [8], respectively. Other recent work includes research focusing on map circumstances in [9] and [10], a controller based on simple tree-search in [11], and an attempt to learn low-complexity play policies in [12]. ...
Conference Paper
Full-text available
This paper describes an application of Evolutionary Strategy to optimizing ten distance parameters and seven cost parameters in our Ms Pac-Man controller, ICE Pambush 3, which was the winner of the IEEE CIG 2009 competition. Targeting the first game level, we report our results from 14 possible optimization schemes, arising from combinations of which initial values to choose (those originally used in ICE Pambush 3 or those randomly assigned) and which parameter types to optimize first (the distance parameters, the cost parameters, or both). We have found that the best optimization scheme is to first optimize the distance parameters, with their initial values set to random values, and then the cost parameters, again with random initial values. The optimized ICE Pambush 3 using the parameters from this scheme shows a 17% improvement in performance on the first game level compared to the original ICE Pambush 3.
... [2010], DeLooze and Viner [2009], Galván-López et al. [2010], Wirth and Gallagher [2008]. The Bell et. ...
Thesis
Full-text available
The merit of Evolutionary Algorithms (EAs) as a means of automatic problem solving has been demonstrated numerous times on a diverse set of problem types across a range of different domains. The central hypothesis of this thesis is that by improving the expressiveness of EAs we can better support their deployment in domains in which context-sensitive decision making is useful. After describing the principal structures and operations which allow EAs to operate effectively as a general problem-solving technique, we describe a sample problem and outline how two EA types, Genetic Programming (GP) and Grammatical Evolution (GE), might be configured to solve it. After some foundational elements of the discipline of game design are presented, we highlight how a move towards more formal specifications of design elements presents new opportunities for the deployment of EAs as a means of Procedural Content Generation (PCG). Subsequently, a set of experiments is described in which a system, designed to support the encoding of data-type information using a variant of GP called Strongly Typed Genetic Programming (STGP), is used to generate Player Character (PC) controllers for the digital video game Ms. Pac-Man. Following this, an overview of Formal Grammars (FGs) is presented and the principal structures and operations of a third EA type, GE, are described, after which a number of FGs more expressive than the Context Free Grammar (CFG), the grammar traditionally used with GE, are outlined. Finally, we outline a new GE variant designed to support the use of Attribute Grammars (AGs), a means of specifying solution semantics in addition to syntax, and describe a set of experiments conducted with it. After highlighting the gains that can be made by using this GE variant in traditional problem domains such as symbolic regression, we discuss its potential as a means of PCG in digital video games.
... Other agents make use of much more complex behaviour to achieve success playing Ms PacMan. The strongest one at the time of writing is based on a hybrid reward architecture [6], with strong showings from Q-learning [7], Deep Q-Networks [8] and Monte-Carlo Tree Search [9]. Success has also been had with neural networks [10] [11] [12]. ...
... There are many AI agents available for Ms PacMan, from simple rules-based agents to strong Q-learning [13], neural network [17] or MCTS agents [21]. For this experiment, to see if this method is applicable to the game, the quick rules-based agent mentioned above is used, as it can represent a lower-level human player and many more games can be simulated in a short amount of time. ...
Conference Paper
Games, particularly online games, have an ongoing requirement to exhibit the ability to react to player behaviour and change their mechanics and available tools to keep their audience both entertained and feeling that their strategic choices and in-game decisions have value. Game designers invest time both gathering data and analysing it to introduce minor changes that bring their game closer to a state of balance, a task with a lot of potential that has recently come to the attention of researchers. This paper first provides a method for automating the process of finding the best game parameters to reduce the difficulty of Ms PacMan through the use of evolutionary algorithms and then applies the same method to a much more complex and commercially successful PC game, StarCraft, to curb the prowess of a dominant strategy. Results show both significant promise and several avenues for future improvement that may lead to a useful balancing tool for the games industry.
... Artificial intelligence techniques have also been utilized to create Pac-Man agents. Reinforcement learning is one such example [17][18][19][20][21][22]. In [17], Burrow and Lucas compared the performance of temporal-difference learning and evolutionary algorithms. ...
Article
Full-text available
Conventional reinforcement learning methods for Markov decision processes rely on weakly-guided, stochastic searches to drive the learning process. It can therefore be difficult to predict what agent behaviors might emerge. In this paper, we consider an information-theoretic approach for performing constrained stochastic searches that promote the formation of risk-averse to risk-favoring behaviors. Our approach is based on the value of information, a criterion that provides an optimal trade-off between the expected return of a policy and the policy's complexity. As the policy complexity is reduced, there is a high chance that the agents will eschew risky actions that increase the long-term rewards. The agents instead focus on simply completing their main objective in an expeditious fashion. As the policy complexity increases, the agents will take actions, regardless of the risk, that seek to decrease the long-term costs. A minimal-cost policy is sought in either case; the obtainable cost depends on a single, tunable parameter that regulates the degree of policy complexity. We evaluate the performance of value-of-information-based policies on a stochastic version of Ms. Pac-Man. A major component of this paper is demonstrating that ranges of policy complexity values yield different game-play styles and analyzing why this occurs. We show that low-complexity policies aim to only clear the environment of pellets while avoiding invulnerable ghosts. Higher-complexity policies implement multi-modal strategies that compel the agent to seek power-ups and chase after vulnerable ghosts, both of which reduce the long-term costs.
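Value-of-information policies of this kind are commonly computed with a Blahut–Arimoto-style fixed-point iteration in which a single inverse-temperature parameter trades expected return against policy complexity. The sketch below is that generic iteration, assuming a known Q table and state distribution; it only illustrates how one tunable parameter moves the policy from a state-independent action mix to near-greedy behaviour, and is not the authors' exact algorithm.

import numpy as np

def voi_policy(Q, p_s, beta, iters=50):
    # Q: (n_states, n_actions) expected returns; p_s: state distribution.
    # Small beta -> policy collapses toward one state-independent action mix
    # (low complexity); large beta -> near-greedy in every state.
    n_s, n_a = Q.shape
    pi = np.full((n_s, n_a), 1.0 / n_a)          # start from a uniform policy
    for _ in range(iters):
        p_a = p_s @ pi                           # marginal action distribution
        logits = np.log(p_a + 1e-12) + beta * Q  # trade off prior vs return
        logits -= logits.max(axis=1, keepdims=True)
        pi = np.exp(logits)
        pi /= pi.sum(axis=1, keepdims=True)      # renormalise per state
    return pi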
... Overall, the average score of the evolved controller was around 5500 points. DeLooze and Viner (2009) reported an autonomous agent trained to control Ms. Pac-Man in a non-deterministic environment by using Fuzzy State Aggregation (FSA) and Q-Learning (QL). This combined system is known as Fuzzy Q-Learning (FQL). ...
Article
Full-text available
Abstract: This paper discusses the function of cartoons as a medium for spreading propaganda during the three years and eight months of the Japanese occupation of Malaya. Propaganda cartoons sympathetic to the Japanese administration (Dai Nippon) have not yet been critically analysed for their contribution as an effective method of indoctrination, alongside propaganda delivered through radio broadcasts, written pamphlets and propaganda films. Cartoons published in newspapers such as Malai Sinpo, The Malay Mail, Syonan Times and Penang Daily News gave the population of Malaya at the time an impression of the romantic liberation doctrines of 'Asia for Asians' and the 'Greater East Asia Co-Prosperity Sphere' (the Dai Toa Senso campaign) proudly promoted by the Dai Nippon government. Propaganda cartoons also appeared in major magazines such as Semangat Asia, Fajar Asia and Suara Timor. The contribution of cartoons as a source for recording the history of the Japanese occupation of Malaya is regarded as being as significant as written records and reports. Furthermore, matters that receive little attention in the nation's historical documentation, namely propaganda supportive of the Japanese administration, form the core of this research. More notably, that documentation used the power of the visual (the cartoon) as the narrative appeal of its historiography. Keywords: Cartoons, Propaganda, Newspapers, Magazines, Dai Nippon
... It seems that as the mutation probabilities increased, the performance of ESNet increased as well. Furthermore, it was shown to produce relatively better results compared with other previous studies that utilized a computational intelligence approach (Lucas, 2005; Handa, 2010; DeLooze and Viner, 2009). Hence, we have shown empirical evidence that ESNet can improve the learning capability of the ANN by evolving its weights and biases in a dynamic video game setting. ...
Article
Full-text available
Problem statement: The retail sales of computer and video games have grown enormously during the last few years, not just in the United States (US) but all over the world. This is the reason a lot of game developers and academic researchers have focused on game-related technologies, such as graphics, audio, physics and Artificial Intelligence (AI), with the goal of creating newer and more fun games. In recent years, there has been an increasing interest in game AI for producing intelligent game objects and characters that can carry out their tasks autonomously. Approach: The aim of this study is to create an autonomous intelligent controller to play the game with no human intervention. Our approach is to use a simple but powerful evolutionary algorithm called Evolution Strategies (ES) to evolve the connection weights and biases of feed-forward Artificial Neural Networks (ANN) and to examine its learning ability through computational experiments in a non-deterministic and dynamic environment, the well-known arcade game Ms. Pac-Man. The resulting algorithm is referred to as an Evolution Strategies Neural Network, or ESNet. Results: The comparison of ESNet with two random systems, Random Direction (RandDir) and Random Neural Network (RandNet), yields promising results. The contribution of this work also focuses on the comparison between ESNet configurations with different mutation probabilities. The results show that ESNet with a high mutation probability recorded higher mean scores than RandDir, RandNet and ESNet with a low mutation probability. Conclusion: Overall, the proposed algorithm has very good performance, with a high probability of automatically generating successful game AI controllers for the video game.
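A minimal sketch of the ESNet idea, assuming the neural network is reduced to a flat weight vector and the game is hidden behind a stand-in fitness function: a simple (mu, lambda) Evolution Strategy with Gaussian mutation of the weights. Population sizes and the mutation step below are placeholders, not the paper's settings.

import numpy as np

def play_game(weights):
    # Stand-in fitness: in ESNet this would run Ms. Pac-Man with a
    # feed-forward net parameterised by `weights` and return the average score.
    return float(np.random.rand())

def evolve(n_weights, mu=5, lam=20, sigma=0.1, generations=100):
    parents = [np.random.randn(n_weights) for _ in range(mu)]
    for _ in range(generations):
        offspring = []
        for _ in range(lam):
            parent = parents[np.random.randint(mu)]
            child = parent + sigma * np.random.randn(n_weights)  # Gaussian mutation
            offspring.append(child)
        # (mu, lambda) selection: keep the best offspring as the next parents
        offspring.sort(key=play_game, reverse=True)
        parents = offspring[:mu]
    return parents[0]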
... Another Fuzzy controller for a car racing championship is discussed in (Perez et al. 2009), and also a Fuzzy-based architecture has been tested in The Open Car Racing Simulator (TORCS) in (Onieva et al. 2010). There is a Fuzzy Q-learning method which has been implemented in the game of Pac-Man (DeLooze and Viner 2009). An agent-based Fuzzy system has been applied in the Battle City game into the bargain (Li et al. 2004). ...
Conference Paper
Full-text available
Nowadays computer games have become a billion dollar industry. One of the important factors in success of a game is its similarity to the real world. As a result, many AI approaches have been exploited to make game characters more believable and natural. One of these approaches which has received great attention is Fuzzy Logic. In this paper a Fuzzy Rule-Based System is employed in a fighting game to reach higher levels of realism. Furthermore, behavior of two fighter bots, one based on the proposed Fuzzy logic and the other one based on a scripted AI, have been compared. It is observed that the results of the proposed method have less behavioral repetition than the scripted AI, which boosts human players' enjoyment during the game.
... For instance, it has been used for the design of the behavior of the enemy ghosts in a Pac-Man clone (Namco, 1980;Shaout, King, & Reisner, 2006); however, heavy tuning was needed to achieve a reasonable behavior. Fuzzy Q-learning, borrowed from the fields of robotics, was used in a Ms. Pac-Man clone (DeLooze & Viner, 2009;Midway, 1982). Li et al. (2004) mention fuzzy control as a practical method for generating subtle behavior and use it in a Belief-Desire-Intention (BDI) framework as part of decision making for a BattleCity (Namco, 1995) clone. ...
Article
Artificial intelligence (AI) plays a major role in modern video games by making them feel both more realistic and more fun to play. Game intelligence usually works alongside the game logic, in the background, invisible to the players who enjoy the resulting character behaviors, the adaptive gameplay, and the procedurally generated content. However, artificial intelligence can also have a central role and become a major component of the overall gameplay (as, for instance, in the video game Black & White). In this paper, we define the genre of scripting video games and introduce Fuzzy Tactics, a video game we developed that has an innovative gameplay based on fuzzy logic and uses fuzzy rules as its core game mechanic and user-interaction mechanism. In Fuzzy Tactics, players lead their troops into battle by specifying a set of fuzzy rules that determines the battle behavior of the units. Fuzzy logic is the only means that players have to interact with the game and to command their troops. Thus, it becomes the main game mechanic that allows us to (i) extend the depth of the game, (ii) keep the interaction intuitive, and (iii) increase the replayability and the educational value of the game.
... This level of success is greater than what has been accomplished with automated players that are not hand-coded [10]–[13], and the results would be very difficult for most human players to match. However, the approach has flaws that make it weaker than the average human player in some situations. ...
Article
This paper presents an approach for optimizing paths online for a pursuit-evasion problem where an agent must visit several target positions within a region of interest while simultaneously avoiding one or more actively-pursuing adversaries. This is relevant to applications such as robotic path planning, mobile-sensor applications, and path exposure. The methodology described utilizes cell decomposition to construct a modified decision tree to achieve the objective of minimizing the risk of being caught by an adversary and maximizing a reward associated with visiting the target locations. By computing paths online, the algorithm can quickly adapt to unexpected movements by the adversaries or dynamic environments. The approach is illustrated through a modified version of the video game Ms. Pac-Man which is shown to be a benchmark example of the pursuit-evasion problem. The results show that the approach presented in this paper runs in real-time and outperforms several other methods as well as most human players.
... Moreover, in [28] the performance of learning to play using evolution and temporal difference learning was compared; the results showed that evolving a multi-layer perceptron performed better. A fuzzy Q-learning algorithm was proposed in [29]. The work in [30] demonstrated the importance of a look-ahead strategy in agent design by using a simple tree search method to evaluate the best path for Ms. Pac-Man. ...
Article
The video game industry is an emerging market which continues to expand. From its early beginnings, developers have focused mainly on sound and graphical applications, paying less attention to developing game bots or other kinds of non-player characters (NPCs). However, recent advances in artificial intelligence offer the possibility of developing game bots which are dynamically adjustable to several difficulty levels as well as variable game environments. Previous works reveal a lack of swarm intelligence approaches for developing these kinds of agents. Considering the potential of particle swarm optimization, due to its emergent properties and self-adaptation to dynamic environments, further investigation into this field must be undertaken. This research focuses on developing a generic framework based on swarm intelligence, and in particular on ant colony optimization, such that it allows the general implementation of real-time bots that work in dynamic game environments. The framework has been adapted to allow the implementation of intelligent agents for the classic game Ms. Pac-Man. These were trialed at the Ms. Pac-Man competitions held during the 2011 International Congress on Evolutionary Computation.
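The framework is described only at a high level above; the ant-colony mechanic it rests on is pheromone-biased move selection plus evaporation and reinforcement, sketched generically below. The pheromone and heuristic tables, and all constants, are illustrative assumptions rather than the paper's design.

import random

ALPHA, BETA, RHO = 1.0, 2.0, 0.1   # pheromone weight, heuristic weight, evaporation

def choose_next(current, neighbours, pheromone, heuristic):
    # Classic ACO transition rule: pick the next maze node with probability
    # proportional to pheromone^alpha * heuristic^beta.
    weights = [(pheromone[(current, n)] ** ALPHA) * (heuristic[(current, n)] ** BETA)
               for n in neighbours]
    total = sum(weights)
    r, acc = random.uniform(0, total), 0.0
    for n, w in zip(neighbours, weights):
        acc += w
        if r <= acc:
            return n
    return neighbours[-1]

def evaporate_and_deposit(pheromone, path, reward):
    # Global update: all trails evaporate, edges on a rewarding path are reinforced.
    for edge in pheromone:
        pheromone[edge] *= (1.0 - RHO)
    for edge in path:
        pheromone[edge] += reward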
... DeLooze and Viner [17] proposed using fuzzy Q-learning algorithm to train agent playing Ms. Pac-Man. The optimized agent averaged between 3,000 and 5,000 points. ...
Conference Paper
Full-text available
Ms. Pac-Man is a challenging, classic arcade game that provides an interesting platform for Artificial Intelligence (AI) research. This paper reports the first Monte-Carlo approach to develop a ghost avoidance module of an intelligent agent that plays the game. Our experimental results show that the look-ahead ability of Monte-Carlo simulation often prevents Ms. Pac-Man being trapped by ghosts and reduces the chance of losing Ms. Pac-Man's life significantly. Our intelligent agent has achieved a high score of around 21,000. It is sometimes capable of clearing the first three stages and playing at the level of a novice human player.
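A hedged sketch of the look-ahead idea: for each legal move, run many short random playouts against a ghost model and keep the move with the best estimated survival rate. The forward-model interface (simulate_step, is_caught, legal_moves) is hypothetical and stands in for whatever game simulator is available.

import random

def survival_rate(game_state, first_move, playouts=100, horizon=20):
    # Estimate the probability of surviving `horizon` steps after `first_move`
    # by Monte-Carlo simulation with random follow-up moves.
    survived = 0
    for _ in range(playouts):
        state = game_state.copy()
        move = first_move
        for _ in range(horizon):
            state = simulate_step(state, move)     # hypothetical forward model
            if is_caught(state):                   # hypothetical terminal test
                break
            move = random.choice(state.legal_moves())
        else:
            survived += 1
    return survived / playouts

def ghost_avoidance_move(game_state):
    # Pick the first move whose playouts die least often.
    return max(game_state.legal_moves(),
               key=lambda m: survival_rate(game_state, m))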
... Ohno and Ogasawara [14] aimed to provide a cognitive model that could estimate human performance in a computer operation task while emphasising bidirectional interactive (BDI) tasks, i.e., tasks in which a computer interacts with the user by changing the environment visually and audibly. DeLooze and Viner [15] combined fuzzy logic with reinforcement learning in a method known as fuzzy Q-learning to develop an intelligent agent in a nondeterministic environment, represented by Ms. Pac-Man. Genetic programming has also been used with many other games. ...
Conference Paper
Full-text available
This paper uses genetic programming (GP) to evolve a variety of reactive agents for a simulated version of the classic arcade game Ms. Pac-Man. A diverse set of behaviours were evolved using the same GP setup in three different versions of the game. The results show that GP is able to evolve controllers that are well-matched to the game used for evolution and, in some cases, also generalise well to previously unseen mazes. For comparison purposes, we also designed a controller manually using the same function set as GP. GP was able to significantly outperform this hand-designed controller. The best evolved controllers are competitive with the best reactive controllers reported for this problem.
Article
Full-text available
Abstract: Artificial Intelligence (AI) techniques are successfully used and applied in a wide range of areas, including manufacturing, engineering, economics, medicine and the military. In recent years, there has been an increasing interest in Game Artificial Intelligence, or Game AI. Game AI refers to techniques applied in computer and video games, such as learning, pathfinding, planning, and many others, for creating intelligent and autonomous behaviour in game characters. The main objective of this paper is to highlight several of the most common AI techniques used for designing and controlling computer-based characters to play the Ms. Pac-Man game between the years 2005-2012. Ms. Pac-Man is one of the games used as a benchmark for comparing autonomous controllers in a series of international Game AI competitions. An extensive content analysis was conducted through a critical review of previous literature in the field. The findings highlight that, although various and unique techniques are available, the major limitation of previous studies in creating Ms. Pac-Man game characters is a lack of generalization capability across different game characters. The findings could provide a future direction for researchers to improve the generalization AI capability of game characters in the Game AI market.
Conference Paper
Computer games are most engaging when their difficulty is well matched to the player’s ability, thereby providing an experience in which the player is neither overwhelmed nor bored. In games where the player interacts with computer-controlled opponents, the difficulty of the game can be adjusted not only by changing the distribution of opponents or game resources, but also through modifying the skill of the opponents. Applying evolutionary algorithms to evolve the artificial intelligence that controls opponent agents is one established method for adjusting opponent difficulty. Less-evolved agents (i.e., agents subject to fewer generations of evolution) make for easier opponents, while highly-evolved agents are more challenging to overcome. In this publication we test a new approach for difficulty adjustment in games: orthogonally evolved AI, where the player receives support from collaborating agents that are co-evolved with opponent agents (where collaborators and opponents have orthogonal incentives). The advantage is that game difficulty can be adjusted more granularly by manipulating two independent axes: by having more or less adept collaborators, and by having more or less adept opponents. Furthermore, human interaction can modulate (and be informed by) the performance and behavior of collaborating agents. In this way, orthogonally evolved AI both facilitates smoother difficulty adjustment and enables new game experiences.
Conference Paper
Artificial Intelligence constitutes a continuum of attempts to model adaptive, learning and cognitive abilities in all the varying degrees of complexity we know from biology and psychology. The purpose of the present research paper is to design a cognitive cellular automata agent with conflict-level spatial problem-solving abilities. Such an agent will have the capability to reason, learn and plan in a manner similar to a human being. The agent architecture has a fuzzy inference system to implement the "perceive-reason-act" decision cycle of a mobile cellular automata reflex agent. In essence, the agent is expected to execute an Observe-Orient-Decide-Act (OODA) loop. A cognitive model is developed to compute the best next move at each time instant for the goal-oriented, rational and utility-driven mobile cellular automata agent. Experiments are to be planned and conducted to evaluate the problem-solving abilities of such an agent when immersed in a conflict situation.
Article
This paper presents a model-based approach for computing real-time optimal decision strategies in the pursuit-evasion game of Ms. Pac-Man. The game of Ms. Pac-Man is an excellent benchmark problem for pursuit-evasion games with multiple, active adversaries that adapt their pursuit policies based on Ms. Pac-Man's state and decisions. In addition to evading the adversaries, the agent must pursue multiple fixed and moving targets in an obstacle-populated environment. This paper presents a novel approach by which a decision-tree representation of all possible strategies is derived from the maze geometry and the dynamic equations of the adversaries, or ghosts. The proposed models of ghost dynamics and decisions are validated through extensive numerical simulations. During the game, the decision tree is updated and used to determine optimal strategies in real time, based on state estimates and game predictions obtained iteratively over time. The results show that the artificial player obtained by this approach is able to achieve high game scores and to handle high game levels, in which the characters' speeds and maze complexity become challenging even for human players.
Conference Paper
Ms. Pac-Man, one of the classic arcade games has recently gained attention in the field of game AI through the yearly competitions of various kinds held at e.g. CIG. We have implemented an Influence Map-based controller for Ms. Pac-Man as well as for the ghosts within the game. We show that it is able to handle a number of various situations through the interesting behaviors emerging through the interplay of the different maps. It is also significantly better than the previous implementations based on similar techniques, such as potential fields.
Conference Paper
In this work, we develop a game controller called HillClimbingNet (Hill-Climbing Neural Network) for playing Ms. Pac-Man that combines the hill-climbing concept with a simple feed-forward neural network. Computational experiments have been conducted to evaluate and compare the proposed algorithm against the Random Direction (RandDir) and Random Neural Network (RandNet) systems. According to the simulation results, HillClimbingNet achieved an average score of 6290, compared with only 439 for RandDir and 735 for RandNet. HillClimbingNet thus shows very good performance.
Conference Paper
This paper explores the idea of combining the hill-climbing concept with feed-forward artificial neural networks (ANN) to develop intelligent controllers to play the Ms. Pac-Man game. The resulting algorithm is referred to as HillClimbingNet. A comparison with a random system, called RandNet, is conducted on the same problem. We also present a survey of the effects of the two most popular probability density functions, uniform and Gaussian distributions/mutators, on the introduced algorithm. The results clearly indicate the strong potential of the hill-climbing strategy as a direct search method, in tandem with a Gaussian-based mutator, to optimize the ANN for playing Ms. Pac-Man.
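The hill-climbing strategy amounts to an accept-if-better loop over the network's weight vector with a Gaussian mutator; a minimal sketch follows, with the fitness function standing in for the actual game score and the step size chosen arbitrarily.

import numpy as np

def fitness(weights):
    # Stand-in: would return the Ms. Pac-Man score achieved by a
    # feed-forward network using these weights.
    return float(np.random.rand())

def hill_climb(n_weights, sigma=0.05, steps=10000):
    best = np.random.randn(n_weights)
    best_score = fitness(best)
    for _ in range(steps):
        candidate = best + sigma * np.random.randn(n_weights)  # Gaussian mutator
        score = fitness(candidate)
        if score >= best_score:          # accept only non-worsening moves
            best, best_score = candidate, score
    return best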
Conference Paper
Recently, there has been an increasing interest in game artificial intelligence (AI). Game AI is a system that makes game characters behave like human beings, able to make smart decisions to achieve a target in a computer or video game. This study therefore focuses on an automated method of generating an artificial neural network (ANN) controller that displays good playing behaviour for a commercial video game. We create a neural-based game controller for screen-capture Ms. Pac-Man using a multi-objective evolutionary algorithm (MOEA) for training, or evolving, the architectures and connection weights (including biases) of the ANN, corresponding to the conflicting goals of minimizing ANN complexity and maximizing the Ms. Pac-Man game score. In particular, we have chosen the commonly used Pareto Archived Evolution Strategy (PAES) algorithm for this purpose. After the entire training process is completed, the controller is tested for generalization using the optimized networks in single-network (single-net) and neural-network-ensemble (multi-net) settings. The multi-net model is compared to the single-net model, and the results reveal that the neural network ensemble is able to learn to play with good strategies in a complex, dynamic and difficult game environment, which is not achievable by an individual neural network.
Conference Paper
Full-text available
RAMP is a rule-based agent for playing Ms. Pac-Man according to the rules stipulated in the 2008 World Congress on Computational Intelligence Ms. Pac-Man Competition. During the competition, our highest score was 15,970, outscoring the eleven other entrants. In runs reported here, RAMP achieves an average score over 10,000 and a high score of 18,560 across 100 runs; the highest score RAMP has achieved to date is 19,000. These scores are better than those of typical human novice players, including the paper authors themselves. The system was designed to have an evolutionary component; however, this was not developed in time for the competition, which instead used hand-coded rules. We have found the process of tuning the rule sets and accompanying parameters to be a time-consuming and inexact process that is expected to benefit from an evolutionary computation approach. This paper describes our initial implementation as well as our progress towards adding an evolutionary computation component to enable the agent to learn to play the game.
Article
Full-text available
Reinforcement learning is one of the more attractive machine learning technologies, due to its unsupervised learning structure and ability to continually learn even as the operating environment changes. Additionally, by applying reinforcement learning to multiple cooperative software agents (a multi-agent system) not only allows each individual agent to learn from its own experience, but also opens up the opportunity for the individual agents to learn from the other agents in the system, thus accelerating the rate of learning. This research presents the novel use of fuzzy state aggregation, as the means of function approximation, combined with the fastest policy hill climbing methods of Win or Lose Fast (WoLF) and policy-dynamics based WoLF (PD-WoLF). The combination of fast policy hill climbing and fuzzy state aggregation function approximation is tested in two stochastic environments: Tileworld and the simulated robot soccer domain, RoboCup. The Tileworld results demonstrate that a single agent using the combination of FSA and PHC learns quicker and performs better than combined fuzzy state aggregation and Q-learning reinforcement learning alone. Results from the multi-agent RoboCup domain again illustrate that the policy hill climbing algorithms perform better than Q-learning alone in a multi-agent environment. The learning is further enhanced by allowing the agents to share their experience through a weighted strategy sharing.
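For readers unfamiliar with policy hill climbing, the sketch below shows the generic WoLF-PHC update for a single agent: learn Q as usual, then move the policy toward the greedy action with a small step when "winning" and a larger step when "losing", judged against a running average policy. It is a simplified illustration of the underlying algorithm, not the paper's FSA-based function-approximation version, and all constants are placeholders.

import random
from collections import defaultdict

ACTIONS = [0, 1, 2, 3]
ALPHA, GAMMA = 0.1, 0.9
DELTA_WIN, DELTA_LOSE = 0.01, 0.04                 # lose fast: bigger policy step

Q = defaultdict(float)
pi = defaultdict(lambda: 1.0 / len(ACTIONS))       # current policy
pi_avg = defaultdict(lambda: 1.0 / len(ACTIONS))   # running average policy
counts = defaultdict(int)

def wolf_phc_update(s, a, r, s2):
    # Standard Q-learning backup
    Q[(s, a)] += ALPHA * (r + GAMMA * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
    # Update the running average policy for this state
    counts[s] += 1
    for b in ACTIONS:
        pi_avg[(s, b)] += (pi[(s, b)] - pi_avg[(s, b)]) / counts[s]
    # "Winning" if the current policy beats the average policy in expectation
    winning = (sum(pi[(s, b)] * Q[(s, b)] for b in ACTIONS)
               > sum(pi_avg[(s, b)] * Q[(s, b)] for b in ACTIONS))
    delta = DELTA_WIN if winning else DELTA_LOSE
    best = max(ACTIONS, key=lambda b: Q[(s, b)])
    # Move probability mass toward the greedy action, then renormalise
    for b in ACTIONS:
        step = delta if b == best else -delta / (len(ACTIONS) - 1)
        pi[(s, b)] = min(1.0, max(0.0, pi[(s, b)] + step))
    total = sum(pi[(s, b)] for b in ACTIONS)
    for b in ACTIONS:
        pi[(s, b)] /= total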
Article
Full-text available
Q-learning (Watkins, 1989) is a simple way for agents to learn how to act optimally in controlled Markovian domains. It amounts to an incremental method for dynamic programming which imposes limited computational demands. It works by successively improving its evaluations of the quality of particular actions at particular states. This paper presents and proves in detail a convergence theorem for Q-learning based on that outlined in Watkins (1989). We show that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action-values are represented discretely. We also sketch extensions to the cases of non-discounted, but absorbing, Markov environments, and where many Q values can be changed each iteration, rather than just one.
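For reference, the tabular update analysed in this paper is

Q_{t+1}(s_t, a_t) = Q_t(s_t, a_t) + \alpha_t \left[ r_t + \gamma \max_{a'} Q_t(s_{t+1}, a') - Q_t(s_t, a_t) \right],

and convergence to the optimal action-values with probability 1 holds provided every state-action pair is sampled infinitely often and the step sizes satisfy \sum_t \alpha_t = \infty and \sum_t \alpha_t^2 < \infty.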
Article
Full-text available
It is widely accepted that the use of more compact representations than lookup tables is crucial to scaling reinforcement learning (RL) algorithms to real-world problems. Unfortunately, almost all of the theory of reinforcement learning assumes lookup table representations. In this paper we address the pressing issue of combining function approximation and RL, and present 1) a function approximator based on a simple extension to state aggregation (a commonly used form of compact representation), namely soft state aggregation, 2) a theory of convergence for RL with arbitrary, but fixed, soft state aggregation, 3) a novel intuitive understanding of the effect of state aggregation on online RL, and 4) a new heuristic adaptive state aggregation algorithm that finds improved compact representations by exploiting the non-discrete nature of soft state aggregation. Preliminary empirical results are also presented.
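A minimal sketch of the soft-state-aggregation approximator described above: each state is represented by a probability distribution over a handful of clusters, the Q estimate is the membership-weighted sum of per-cluster action values, and the TD error is credited to clusters in proportion to those memberships. The cluster probabilities are assumed to be given, and the update form is a generic illustration rather than the paper's exact algorithm.

import numpy as np

def q_value(p_clusters, theta, action):
    # Q(s, a) = sum_c P(c | s) * theta[c, a]: soft cluster memberships
    # weight a small table of per-cluster action values.
    return float(p_clusters @ theta[:, action])

def td_update(theta, p_clusters, action, reward, p_next, alpha=0.1, gamma=0.9):
    n_actions = theta.shape[1]
    target = reward + gamma * max(float(p_next @ theta[:, b]) for b in range(n_actions))
    delta = target - q_value(p_clusters, theta, action)
    theta[:, action] += alpha * delta * p_clusters   # credit spread over clusters
    return theta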
Article
IEEE WCCI 2008 in Hong Kong played host to the latest Ms Pac-Man competition, organised by Simon Lucas as an activity of the IEEE CIS Games Technical Committee. The competition attracted 11 entries from teams all around the world, with the winning entry by Alan Fitzgerald, Peter Kemeraitis, and Clare Bates Congdon from the University of Southern Maine (USM) achieving a high-score of 15,970.
Conference Paper
In this paper we develop a Ms. Pac-Man playing agent based on an influence map model. The proposed model is as simple as possible while capturing the essentials of the game. Our model has three main parameters that have an intuitive relationship to the agent's behavior. Experimental results are presented exploring the model's performance over its parameter space using random and systematic global exploration and a greedy algorithm. The model parameters can be optimized without difficulty despite the noisy fitness function used. The performance of the optimized agents is comparable to the best published results for a Ms. Pac-Man playing agent. Nevertheless, some difficulties were observed in terms of the model and the software system.
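An influence map of this kind can be built by letting each object spread a distance-decayed contribution over the maze and then steering toward the most attractive neighbouring cell. The grid-based sketch below is illustrative; the weights and decay rate here are stand-ins and not necessarily the paper's three model parameters.

import numpy as np

W_PILL, W_GHOST, DECAY = 1.0, -5.0, 0.8    # illustrative parameter values

def influence_map(shape, pills, ghosts):
    # Each pill adds positive, each ghost negative influence, decaying
    # exponentially with Manhattan distance from its source cell.
    grid = np.zeros(shape)
    ys, xs = np.indices(shape)
    for (py, px) in pills:
        grid += W_PILL * DECAY ** (np.abs(ys - py) + np.abs(xs - px))
    for (gy, gx) in ghosts:
        grid += W_GHOST * DECAY ** (np.abs(ys - gy) + np.abs(xs - gx))
    return grid

def best_move(pos, grid):
    # Step toward the neighbouring cell with the highest influence.
    y, x = pos
    moves = {"up": (y - 1, x), "down": (y + 1, x), "left": (y, x - 1), "right": (y, x + 1)}
    legal = {m: p for m, p in moves.items()
             if 0 <= p[0] < grid.shape[0] and 0 <= p[1] < grid.shape[1]}
    return max(legal, key=lambda m: grid[legal[m]])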
Article
A fuzzy set is a class of objects with a continuum of grades of membership. Such a set is characterized by a membership (characteristic) function which assigns to each object a grade of membership ranging between zero and one. The notions of inclusion, union, intersection, complement, relation, convexity, etc., are extended to such sets, and various properties of these notions in the context of fuzzy sets are established. In particular, a separation theorem for convex fuzzy sets is proved without requiring that the fuzzy sets be disjoint.
Article
We have previously proposed evolutionary fuzzy systems for playing Ms. Pac-Man in the competitions. As a consequence of the evolution, reflective action rules are acquired, such that Pac-Man tries to eat pills effectively until ghosts come close. Such rules work well; however, they are sometimes too reflective, so that Pac-Man moves toward ghosts of her own accord in longer corridors. In this paper, a critical-situation learning module is combined with the evolved fuzzy systems, i.e., the reflective action module. The critical-situation learning module is composed of Q-learning with CMAC. Location information about the surrounding ghosts and the existence of power pills is given to Pac-Man as the state. The module issues a punishment whenever Pac-Man is caught by the ghosts, and therefore learns which (state, action) pairs cause her death. Using the learnt Q-values, Pac-Man tries to survive much longer. Experimental results on Ms. Pac-Man show that the proposed method is promising, since it captures critical situations well. However, because of the large amount of memory required by CMAC, real-time responses tend to be lost.
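Q-learning with CMAC, as used in the critical-situation module, approximates Q with several overlapping tilings of the state variables. The generic tile-coding sketch below illustrates the idea; the feature design and constants are assumptions, and the size of the weight table is exactly the memory cost the abstract warns about.

from collections import defaultdict

N_TILINGS, TILE_SIZE = 8, 4.0
ALPHA, GAMMA = 0.1 / N_TILINGS, 0.9
weights = defaultdict(float)               # (tiling, tile coords, action) -> weight

def active_tiles(state):
    # Map a small continuous state vector to one offset grid cell per tiling.
    tiles = []
    for t in range(N_TILINGS):
        offset = t * TILE_SIZE / N_TILINGS
        coords = tuple(int((x + offset) // TILE_SIZE) for x in state)
        tiles.append((t, coords))
    return tiles

def q(state, action):
    return sum(weights[(t, c, action)] for (t, c) in active_tiles(state))

def cmac_q_update(state, action, reward, next_state, actions):
    target = reward + GAMMA * max(q(next_state, b) for b in actions)
    delta = target - q(state, action)
    for (t, c) in active_tiles(state):
        weights[(t, c, action)] += ALPHA * delta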
Conference Paper
Fuzzy logic plays an important role in the design of reactive robot behaviours. This paper presents a learning approach to the development of a fuzzy logic controller based on the delayed rewards from the real world. The delayed rewards are apportioned to the individual fuzzy rules by using reinforcement Q-learning. The efficient exploration of a solution space is one of the key issues in the reinforcement learning. A specific genetic algorithm is developed in this paper to trade off the exploration of learning spaces and the exploitation of learned experience. The proposed approach is evaluated on some reactive behaviour of the football-playing robots.
Conference Paper
In this paper, we propose a reinforcement learning method called a fuzzy Q-learning where an agent determines its action based on the inference result by a fuzzy rule-based system. We apply the proposed method to a soccer agent that tries to learn to intercept a passed ball, i.e., it tries to catch up with a passed ball by another agent. In the proposed method, the state space is represented by internal information that the learning agent maintains such as the relative velocity and the relative position of the ball to the learning agent. We divide the state space into several fuzzy subspaces. We define each fuzzy subspace by specifying the fuzzy partition of each axis of the state space. A reward is given to the learning agent if the distance between the ball and the agent becomes smaller or if the agent catches up with the ball. It is expected that the learning agent finally obtains the efficient positioning skill through trial-and-error.
Conference Paper
Presents the first results in understanding the reasons for cooperative advantage between reinforcement learning agents. We consider a cooperation method which consists of using and updating a common policy. We tested this method on a complex fuzzy reinforcement learning problem and found that cooperation brings larger than expected benefits. More precisely, we found that K cooperative agents each learning for N time steps outperform K independent agents each learning in a separate world for K*N time steps. We explain the observed phenomenon and determine the necessary conditions for its presence in a wide class of reinforcement learning problems
Conference Paper
We consider a pseudo-realistic world in which one or more opportunities appear and disappear in random locations. Agents use fuzzy reinforcement learning to learn which opportunities are most worthy of pursuing based on their promised rewards, expected lifetimes, path lengths and expected path costs. We show that this world is partially observable because the history of an agent influences the distribution of its future states. We implement a coordination mechanism for allocating opportunities to different agents in the same world. Our results show that optimal team performance results when agents behave in a partially selfish way. We also implement a cooperation mechanism in which agents share experience by using and updating one joint behavior policy. Our results demonstrate that K cooperative agents each learning in a separate world for N time steps outperform K independent agents each learning in a separate world for K*N time steps, with this result becoming more pronounced as the degree of partial observability in the environment increases
Article
Fuzzy logic is a natural basis for modelling and solving problems involving imprecise knowledge and continuous systems. Unfortunately, fuzzy logic systems are invariably static (once created they do not change) and subjective (the creator imparts their beliefs on the system). In this paper we address the question of whether systems based on fuzzy logic can effectively adapt themselves to dynamic situations.
Article
This paper presents a fuzzy logic controller (FLC) for the implementation of some behaviours of Sony legged robots. Adaptive Heuristic Critic (AHC) reinforcement learning is employed to refine the FLC. The actor part of the AHC is a conventional FLC in which the parameters of the input membership functions are learned from an immediate internal reinforcement signal. This internal reinforcement signal comes from a prediction of the evaluation value of a policy together with the external reinforcement signal. The evaluation value of a policy is learned by temporal difference (TD) learning in the critic part, which is also represented by an FLC. A genetic algorithm (GA) is employed for learning the internal reinforcement of the actor part because it is more efficient in searching than other trial-and-error search approaches.