Reinforcement learning process [5]

Context in source publication

Context 1
... chooses actions to interact with the environment by observing its states, and then it receives a reward or a punishment (r t ). Because of the actions performed by the agent, the environment may change, so the agent must choose actions again according to the new state of the environment (s t+1 ), receiving a new reward or punishment (r t+1 ). This process is shown in Fig. 2 [5]. Many reinforcement learning situations can be described as a Markov Decision Process, a quintuple consisting of S, A, P, R, γ. S is the set of states extracted from the environment. For example, the positions of the pieces on the board can be the state when playing the game of Go. A is the set of actions that an agent can select. P is the ...
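To make the loop described above concrete, the following minimal sketch (not from the cited paper) shows how s_t, a_t, r_t and s_{t+1} flow through one episode of a toy MDP. The `ToyEnvironment` class, its integer states, and the random agent are illustrative assumptions chosen only to demonstrate the interaction cycle.

```python
import random

class ToyEnvironment:
    """A tiny MDP with integer states 0..4; reaching state 4 ends the episode with reward +1."""
    def __init__(self):
        self.state = 0

    def step(self, action):
        # P: transition to a new state depending on the chosen action (-1 or +1).
        self.state = max(0, min(4, self.state + action))
        # R: reward of +1 only in the goal state, 0 elsewhere.
        reward = 1.0 if self.state == 4 else 0.0
        done = self.state == 4
        return self.state, reward, done

def random_agent(state):
    # A: the agent observes s_t and selects an action from {-1, +1}.
    return random.choice([-1, 1])

env = ToyEnvironment()
state, done = env.state, False
while not done:
    action = random_agent(state)                  # agent acts on s_t
    next_state, reward, done = env.step(action)   # environment returns r_t and s_{t+1}
    print(f"s_t={state}, a_t={action}, r_t={reward}, s_t+1={next_state}")
    state = next_state                            # the loop continues from the new state
```

In a full MDP formulation the discount factor γ would weight future rewards when estimating returns; it is omitted here because the sketch only illustrates the observe-act-reward cycle.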

Similar publications

Preprint
Full-text available
This paper studies how a domain-independent planner and combinatorial search can be employed to play Angry Birds, a well-established AI challenge problem. To model the game, we use PDDL+, a planning language for mixed discrete/continuous domains that supports durative processes and exogenous events. The paper describes the model and identifies key...
Article
Full-text available
In this paper, reinforcement learning is applied to the game Flappy Bird with two methods, DQN and Q-learning. Then, we compare their performance through data visualization. Furthermore, results from other games are summarized to analyze the corresponding advantages and disadvantages. Finally, we discuss and compare these two reinfor...

Citations

Article
Full-text available
A transformer neural network is employed in the present study to predict Q-values in a simulated environment using reinforcement learning techniques. The goal is to teach an agent to navigate and excel in the Flappy Bird game, which has become a popular benchmark for control in machine learning approaches. Unlike most top existing approaches that use the game’s rendered image as input, our main contribution lies in using sensory input from LIDAR, represented by the ray casting method. Specifically, we focus on understanding the temporal context of measurements from a ray casting perspective and on optimizing potentially risky behavior by considering the degree of approach to objects identified as obstacles. The agent learned to use the ray casting measurements to avoid collisions with obstacles. Our model substantially outperforms related approaches. Going forward, we aim to apply this approach in real-world scenarios.