Fig 6. Learning curve for BreakAway with bad user advice.

Source publication
Conference Paper
Full-text available
We describe a reinforcement learning system that transfers skills from a previously learned source task to a related target task. The system uses inductive logic programming to analyze experience in the source task, and transfers rules for when to take actions. The target task learner accepts these rules through an advice-taking algorithm, which al...

Context in source publication

Context 1
... its inequalities reversed, this bad advice instructs the learner to pass backwards, shoot when far away from the goal and at a narrow angle, and move when close to the goal. Figure 6 shows the results of this advice, both in AI 2 transfer and alone. This experiment shows that while bad advice can decrease the positive effect of transfer, it does not cause the AI 2 system to impact learning negatively. ...
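
To make the reversed-inequality advice concrete, here is a minimal sketch of what such "bad advice" rules could look like when written out as simple predicates. The feature names (dist_to_goal, angle_to_goal, teammate_behind) and thresholds are illustrative assumptions; the actual advice in the paper is expressed as logical rules consumed by an advice-taking RL learner, not Python code.

```python
# Illustrative sketch only: feature names and thresholds are hypothetical,
# and the paper's advice is given as first-order rules, not Python predicates.

def bad_advice(state):
    """Return the action suggested by the (deliberately bad) advice, or None."""
    if state["teammate_behind"]:                              # pass backwards
        return "pass_to_teammate_behind"
    if state["dist_to_goal"] > 20 and state["angle_to_goal"] < 15:
        return "shoot"                                        # shoot when far away, narrow angle
    if state["dist_to_goal"] < 10:
        return "move_ahead"                                    # move when already close to the goal
    return None                                                # no advice applies; learner acts normally

# Hypothetical state encoding, just to show the rule firing:
state = {"teammate_behind": False, "dist_to_goal": 25.0, "angle_to_goal": 10.0}
print(bad_advice(state))  # -> "shoot"
```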

Similar publications

Conference Paper
Full-text available
We describe an application of inductive logic programming to transfer learning. Transfer learning is the use of knowledge learned in a source task to improve learning in a related target task. The tasks we work with are in reinforcement learning domains. Our approach transfers relational macros, which are finite-state machines in which the transition...
Conference Paper
Full-text available
Repeatability of head-related transfer function (HRTF) measurements is a critical issue in intra- and inter-laboratory setups. In this paper, simulated perceptual variabilities of HRTFs are computed as an attempt to understand if different acquisition methods achieve similar results in terms of psychoacoustic features. We consider 12 HRTF indepen...
Conference Paper
Full-text available
Transfer learning could speed up reinforcement learning in many applications. Toward the fully autonomous reinforcement learning transfer agent, the mapping between the source task and target task should be learned instead of human-designed. To this end, this paper proposes an autonomous inter-task mapping learning method via artificial neural netw...
Article
Full-text available
The aim of this study was to investigate if making the skill acquisition phase more difficult or easier would enhance performance in soccer juggling, and if this practice has a positive inter-task transfer effect to ball reception performance. Twenty-two adolescent soccer players were tested in juggling a soccer ball and in the control of an approa...
Article
Full-text available
This report is an overview of our work on transfer in reinforcement learning using advice-taking mechanisms. The goal in transfer learning is to speed up learning in a target task by transferring knowledge from a related, previously learned source task. Our methods are designed to do so robustly, so that positive transfer will speed up learning b...

Citations

... This field has continued to receive attention over the years, including early attempts at leveraging action advising for knowledge transfer between simple RoboCup tasks (Torrey et al., 2005). Subsequent work (Torrey et al., 2006; 2010) leveraged inductive logic programming for similar purposes in order to accommodate imperfect advice; however, these methods were heavily constrained by the rigid incorporation of domain knowledge. ...
Preprint
Transfer learning can be applied in deep reinforcement learning to accelerate the training of a policy in a target task by transferring knowledge from a policy learned in a related source task. This is commonly achieved by copying pretrained weights from the source policy to the target policy prior to training, under the constraint that they use the same model architecture. However, not only does this require a robust representation learned over a wide distribution of states -- often failing to transfer between specialist models trained over single tasks -- but it is largely uninterpretable and provides little indication of what knowledge is transferred. In this work, we propose an alternative approach to transfer learning between tasks based on action advising, in which a teacher trained in a source task actively guides a student's exploration in a target task. Through introspection, the teacher is capable of identifying when advice is beneficial to the student and should be given, and when it is not. Our approach allows knowledge transfer between policies agnostic of the underlying representations, and we empirically show that this leads to improved convergence rates in Gridworld and Atari environments while providing insight into what knowledge is transferred.
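
The abstract above describes a teacher-student advising loop in which the teacher only intervenes when it judges its advice useful. The sketch below illustrates that general shape, assuming the teacher exposes some confidence estimate for the current state; the class names, the confidence criterion, and the threshold are placeholders, not the preprint's actual introspection mechanism.

```python
import random

# Minimal sketch of introspective action advising. The Teacher/Student interfaces
# and the confidence threshold are assumptions made for illustration.

class Teacher:
    def confidence(self, state):
        # Placeholder: e.g. margin between the top-2 Q-values of the source policy.
        return random.random()

    def advise(self, state):
        return 0  # placeholder action from the source policy

class Student:
    def act(self, state):
        return random.randrange(4)  # placeholder exploratory action

    def update(self, state, action, reward, next_state):
        pass  # e.g. a Q-learning or DQN update

def advised_step(env_step, state, teacher, student, threshold=0.8):
    """Take one environment step, following the teacher only when it is confident."""
    if teacher.confidence(state) >= threshold:
        action = teacher.advise(state)      # advice is given
    else:
        action = student.act(state)         # student explores on its own
    next_state, reward = env_step(state, action)
    student.update(state, action, reward, next_state)
    return next_state
```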
... A significant amount of importance is given to developing deep networks that are able to learn strong heuristics (Ernandes and Gori 2004) and policies (Torrey et al. 2006). This is particularly common in Reinforcement Learning, which uses positive and negative feedback to learn the correct sequence of next actions. ...
Preprint
Full-text available
Learning a well-informed heuristic function for hard task planning domains is an elusive problem. Although there are known neural network architectures to represent such heuristic knowledge, it is not obvious what concrete information is learned and whether techniques aimed at understanding the structure help in improving the quality of the heuristics. This paper presents a network model to learn a heuristic capable of relating distant parts of the state space via optimal plan imitation using the attention mechanism, which drastically improves the learning of a good heuristic function. To counter the limitation of the method in the creation of problems of increasing difficulty, we demonstrate the use of curriculum learning, where newly solved problem instances are added to the training set, which, in turn, helps to solve problems of higher complexities and far exceeds the performances of all existing baselines including classical planning heuristics. We demonstrate its effectiveness for grid-type PDDL domains.
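
The curriculum idea described in this abstract (newly solved problem instances are fed back into the training set so harder instances become solvable) can be summarized with a short schematic loop. The function names below are placeholders and the sketch does not reproduce the paper's network or planner.

```python
# Schematic sketch of the curriculum-learning loop described above: instances the
# current heuristic manages to solve are added to the training set, the model is
# retrained, and harder instances become reachable. All names are placeholders.

def curriculum_train(model, instances, train, try_solve, rounds=10):
    training_set = []
    unsolved = list(instances)            # roughly ordered by increasing difficulty
    for _ in range(rounds):
        newly_solved = []
        for problem in unsolved:
            plan = try_solve(model, problem)      # search guided by the current heuristic
            if plan is not None:
                newly_solved.append((problem, plan))
        if not newly_solved:
            break                                  # no progress; stop the curriculum
        training_set.extend(newly_solved)          # imitate the newly found plans
        model = train(model, training_set)
        solved_set = {prob for prob, _ in newly_solved}
        unsolved = [p for p in unsolved if p not in solved_set]
    return model
```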
... For example, if the agent were to control an airplane, changes in gravity, weight, or friction would count as transition dynamic differences. The second to fourth places go to navigation-related tasks, namely changes in the goal (95) [s_f] or start (91) [s_i] position, or the level layout (56). Values that the agent receives are also well suited to accelerate the transition. ...
... We also looked at what kind of problem TRL is mostly applied to. The most often recurring applications are navigation (122), robotics (56), classic control (42), and games (26). The navigation domain is the most diverse. ...
... Many of the experiments focus on increasing the number of players involved from 3v2 to 4v3 [54] or 3v2 to 6v5 [55] in KeepAway. Others explore multi-task experiments such as MoveDownfield to BreakAway [56]. ...
Preprint
Full-text available
The idea of transfer in reinforcement learning (TRL) is intriguing: being able to transfer knowledge from one problem to another problem without learning everything from scratch. This promises quicker learning and learning more complex methods. To gain an insight into the field and to detect emerging trends, we performed a database search. We note a surprisingly late adoption of deep learning that starts in 2018. The introduction of deep learning has not yet solved the greatest challenge of TRL: generalization. Transfer between different domains works well when domains have strong similarities (e.g. MountainCar to Cartpole), and most TRL publications focus on different tasks within the same domain that have few differences. Most TRL applications we encountered compare their improvements against self-defined baselines, and the field is still missing unified benchmarks. We consider this to be a disappointing situation. For the future, we note that: (1) A clear measure of task similarity is needed. (2) Generalization needs to improve. Promising approaches merge deep learning with planning via MCTS or introduce memory through LSTMs. (3) The lack of benchmarking tools will be remedied to enable meaningful comparison and measure progress. Already Alchemy and Meta-World are emerging as interesting benchmark suites. We note that another development, the increase in procedural content generation (PCG), can improve both benchmarking and generalization in TRL.
... In the second algorithm, called PLPR, the Policy Library is created when learning new policies and reusing past policies. Torrey et al. (2006) introduced inductive logic programming for analyzing the previous experience of the source task and transferring rules for when to take actions. Through an advice-taking algorithm, the target task learner could benefit from outside imperfect guidance. ...
Article
Full-text available
Collision avoidance for robots and vehicles in unpredictable environments is a challenging task. Various control strategies have been developed for the agent (i.e., robots or vehicles) to sense the environment, assess the situation, and select the optimal actions to avoid collision and accomplish its mission. In our research on autonomous ships, we take a machine learning approach to collision avoidance. The lack of available ship steering data of human ship masters has made it necessary to acquire collision avoidance knowledge through reinforcement learning (RL). Given that the learned neural network tends to be a black box, it is desirable that a method is available which can be used to design an agent's behavior so that the desired knowledge can be captured. Furthermore, RL with complex tasks can be either time consuming or unfeasible. A multi-stage learning method is needed in which agents can learn from simple tasks and then transfer their learned knowledge to closely related but more complex tasks. In this paper, we explore the ways of designing agent behaviors through tuning reward functions and devise a transfer RL method for multi-stage knowledge acquisition. The computer simulation-based agent training results have shown that it is important to understand the roles of each component in a reward function and the various design parameters in transfer RL. The settings of these parameters are all dependent on the complexity of the tasks and the similarities between them.
... However, this requires extensive prior knowledge about the task and agents to generate the appropriate mappings. Since there is a cost of human involvement to generate the mappings for each application, transfer learning is most applicable for tasks and agents that are not frequently subject to change, such as RoboCup competitions where it is commonly applied [22]. ...
... Equation (21) reaches a maximum of 1 only when the preference levels are equal, is zero when either agent has zero preference across all actions, and can be negative when the preference levels oppose each other. The similarity measure is updated as an exponential moving average each time the adviser is polled for its advice, given by (22), where ρ is a constant decay rate. During each advice round, advisers are then selected in the order of most to least similar: ...
... As all robots are homogeneous, the advice mechanism must evaluate advisers based on their skill at the task. The relevance of each adviser is indicated by the similarity measure in (22). The appropriate behaviour of the mechanism is expected to attribute a greater similarity to the experts with more experience as time proceeds. ...
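
Since equations (21) and (22) are not reproduced in these excerpts, the following is only a sketch of their general shape: an exponential-moving-average update of the similarity measure with a constant decay rate ρ, followed by ordering advisers from most to least similar. The instantaneous similarity value and the exact placement of ρ in the update are assumptions.

```python
# Sketch of the EMA similarity update and adviser ordering described above.
# `instant_similarity` stands in for equation (21); the update only mirrors
# the general exponential-moving-average form of equation (22).

def update_similarity(prev_similarity, instant_similarity, rho):
    """EMA update with a constant decay rate rho in (0, 1) (assumed convention)."""
    return (1.0 - rho) * prev_similarity + rho * instant_similarity

def rank_advisers(similarities):
    """Return adviser ids ordered from most to least similar."""
    return sorted(similarities, key=similarities.get, reverse=True)

# Example: adviser 'B' is polled and its similarity is refreshed.
sims = {"A": 0.4, "B": 0.1, "C": -0.2}
sims["B"] = update_similarity(sims["B"], instant_similarity=0.9, rho=0.3)  # -> 0.34
print(rank_advisers(sims))  # -> ['A', 'B', 'C']
```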
... Each RoboCup-2D game pits two teams against each other in a game of soccer on a 2-D virtual soccer field, with game mechanics of the agent participants and the game controlled and communicated through a host server [3]. Reinforcement-based machine learning has shown promise in developing strategies, tactics, and set-play policies for autonomous teams playing in RoboCup-2D that could scale up the domain and transfer to larger teams and playing field sizes [5], [19], [20]. Verbancsics has further demonstrated the advantages of exploiting and encoding domain spatial regularities that compress the representation of the learned task Keepaway using Hypercube-based NeuroEvolution of Augmenting Topologies (HyperNEAT) [7], [13], [22]. ...
... In the second algorithm, called PLPR, the policy library is created when learning new policies and reusing past policies. Torrey (2006) introduced inductive logic programming for analyzing the previous experience of the source task and transferring rules for when to take actions. Through an advice-taking algorithm, the target task learner could benefit from outside imperfect guidance. ...
Chapter
Full-text available
It is often hard for a reinforcement learning (RL) agent to utilize previous experience to solve new, similar but more complex tasks. In this research, we combine transfer learning with reinforcement learning and investigate how the hyperparameters of both transfer learning and reinforcement learning impact the learning effectiveness and task performance in the context of autonomous robotic collision avoidance. A deep reinforcement learning algorithm was first implemented for a robot to learn, from its experience, how to avoid randomly generated single obstacles. After that, the effect of transferring previously learned experience was studied by introducing two important concepts: transfer belief, i.e., how much a robot should believe in its previous experience, and transfer period, i.e., how long the previous experience should be applied in the new context. The proposed approach has been tested for collision avoidance problems by altering the transfer period. It is shown that transfer learning on average gave a ~50% speed increase at ~30% competence levels, and there exists an optimal transfer period where the variance is the lowest and learning speed is the fastest.
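
To illustrate how the two hyperparameters above could enter an agent's decision rule, here is a rough sketch that blends source and target Q-values, weighting the source by the transfer belief and only during the transfer period. This blending rule is an assumed formulation for illustration, not the chapter's exact method.

```python
import numpy as np

# Rough sketch: blend the source policy's Q-values with the target learner's,
# weighted by the transfer belief and applied only during the transfer period.
# The blending rule itself is an assumption made for illustration.

def select_action(q_target, q_source, step, transfer_belief=0.5, transfer_period=10_000):
    """Pick an action from a belief-weighted mixture of target and source Q-values."""
    if step < transfer_period:
        q_mix = transfer_belief * q_source + (1.0 - transfer_belief) * q_target
    else:
        q_mix = q_target                 # past the transfer period: trust only the new task
    return int(np.argmax(q_mix))

q_target = np.array([0.1, 0.3, 0.2])
q_source = np.array([0.9, 0.0, 0.1])
print(select_action(q_target, q_source, step=100))     # early: source experience still steers the choice
print(select_action(q_target, q_source, step=20_000))  # later: purely the target estimates
```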
... In [41], the authors have addressed this problem using a transfer hierarchy. Approaches such as a humanprovided mapping [45] and a statistical relational model [44] to assess similarities between a source and a target task have also been proposed to mitigate negative transfer. Other techniques to learn source to target task mappings efficiently include an experience-based learning framework called MASTER [42] and an experts algorithm which is used to select a candidate policy for solving an unknown Markov Decision Process task [40]. ...
Article
Full-text available
In this paper we propose a machine learning technique for real-time robot path planning for an autonomous robot in a planar environment with obstacles where the robot possesses no a priori map of its environment. Our main insight in this paper is that a robot’s path planning times can be significantly reduced if it can refer to previous maneuvers it used to avoid obstacles during earlier missions, and adapt that information to avoid obstacles during its current navigation. We propose an online path planning algorithm called LearnerRRT that utilizes a pattern matching technique called Sample Consensus Initial Alignment (SAC-IA) in combination with an experience-based learning technique to adapt obstacle boundary patterns encountered in previous environments to the current scenario, followed by corresponding adaptations in the obstacle-avoidance paths. Our proposed algorithm LearnerRRT works as a learning-based reactive path planning technique which enables robots to improve their overall path planning performance by locally improving maneuvers around commonly encountered obstacle patterns by accessing previously accumulated environmental information. We have conducted several experiments in simulations and hardware to verify the performance of the LearnerRRT algorithm and compared it with a state-of-the-art sampling-based planner. LearnerRRT on average takes approximately 10% of the planning time and 14% of the total time taken by the sampling-based planner to solve the same navigation task based on simulation results, and takes only 33% of the planning time, 46% of the total time and 95% of the total distance compared to the sampling-based planner based on our hardware results.
... The workflow we have followed for our number plate detection is a transfer learning [17] approach, which is quite popular in deep learning applications [23,24]. The pre-trained CNN has already learned image features that can be transferred by fine-tuning the network to our number plate detection task. ...
Conference Paper
Full-text available
In this paper, we propose a method to detect number plates of vehicles registered in Bangladesh. Our approach was to pre-train a deep Convolutional Neural Network with CIFAR-10 data, then fine-tune the network by training it further with our dataset to create the Regions with Convolutional Neural Network (R-CNN) object detector. For training the R-CNN, Region of Interest (ROI) labelled data was required. We have observed that using training data with a bigger ROI, which encapsulates the entire number plate within it, enables the R-CNN to detect number plates more accurately. The proposed method can detect number plates with more than 99% accuracy.
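
As a generic illustration of the pretrain-then-fine-tune workflow referred to above, the sketch below freezes an ImageNet-pretrained torchvision backbone and trains a fresh classification head on target data. This is a standard fine-tuning pattern, not the paper's CIFAR-10-pretrained network or its R-CNN detector; the class count and optimizer settings are assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

# Generic fine-tuning sketch in the spirit of the approach above: reuse learned
# image features and retrain only a new head on the target (number plate) data.
# Not the paper's pipeline; backbone choice and hyperparameters are illustrative.

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False                     # freeze the transferred features

num_classes = 2                                     # e.g. plate vs. background (assumed)
model.fc = nn.Linear(model.fc.in_features, num_classes)  # new head, trained from scratch

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    """One fine-tuning step on a batch from the target dataset."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```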
... The essence of TL is that generalization of knowledge can come about across tasks (Taylor and Stone, 2009); that is to say, the knowledge acquired in previous work is beneficial for subsequent learning. It is an excellent idea to integrate RL and TL, but not until recent years has transfer in RL domains received a great deal of attention (Konidaris et al., 2012; Lazaric et al., 2008; Taylor and Stone, 2009; Torrey et al., 2006). In order to devise outstanding transfer methods in RL domains, three points should be considered (Brys et al., 2015; Taylor and Stone, 2009): first and foremost, when to transfer, that is, the selection of source and target task; secondly, what to transfer, that is, the form of knowledge obtained in the previous task; and last but not least, how to transfer, that is, the way in which the agent reuses the obtained knowledge. ...
... All in all, it is feasible and meaningful to conduct research on transferring knowledge from demonstration trajectories to RL in different tasks, as we propose in this paper, and as far as we know, similar research is scarce (Guofang et al., 2015). As for what to transfer, the value function (Taylor and Stone, 2009), optimal policy (Fernández and Veloso, 2006), option (Konidaris et al., 2012), skill (Torrey et al., 2006) and so on have been regularly chosen as the acquired knowledge in previous relevant research, but in LfD, it is difficult to form knowledge for transfer. In this paper, two forms, namely the k-nearest neighbour (k-NN) of the current state in the source samples (Harrington, 2012) and the visit frequency of homologous states (McGovern and Barto, 2001), are adopted as knowledge extracted from demonstration trajectories for transfer. ...
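
The first form of transferred knowledge mentioned above, the k-nearest neighbours of the current state among demonstration samples, can be sketched with a standard nearest-neighbour lookup. How the retrieved neighbours are then used (here, a simple majority vote over demonstrated actions) is an assumption for illustration; the cited work defines its own transfer mechanism.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Sketch of k-NN lookup over demonstration states. The majority-vote use of the
# neighbours is an assumption; the cited work defines its own transfer mechanism.

demo_states = np.random.rand(500, 4)         # states visited in demonstration trajectories
demo_actions = np.random.randint(0, 3, 500)  # actions taken in those states

knn = NearestNeighbors(n_neighbors=5).fit(demo_states)

def suggest_action(current_state):
    """Majority-vote action among the k nearest demonstration states."""
    _, idx = knn.kneighbors(current_state.reshape(1, -1))
    votes = demo_actions[idx[0]]
    return int(np.bincount(votes).argmax())

print(suggest_action(np.random.rand(4)))
```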