ArticlePDF Available

User preferred behaviors for robot navigation exploiting previous experiences

Authors:

Abstract and Figures

Industry demands flexible robots that are able to accomplish different tasks at different locations such as navigation and mobile manipulation. Operators often require mobile robots operating on factory floors to follow definite and predictable behaviors. This becomes particularly important when a robot shares the workspace with other moving entities. In this paper, we present a system for robot navigation that exploits previous experiences to generate predictable behaviors that meet user's preferences. Preferences are not explicitly formulated but implicitly extracted from robot experiences and automatically considered to plan paths for the successive tasks without requiring experts to hard-code rules or strategies. Our system aims at accomplishing navigation behaviors that follow user's preferences also to avoid dynamic obstacles. We achieve this by considering a probabilistic approach for modeling uncertain trajectories of the moving entities that share the workspace with the robot. We implemented and thoroughly tested our system both in simulation and on a real mobile robot. The extensive experiments presented in this paper demonstrate that our approach allows a robot for successfully navigating while performing predictable behaviors and meeting user's preferences.
Content may be subject to copyright.
User Preferred Behaviors for Robot Navigation Exploiting Previous Experiences
Lorenzo Nardia,, Cyrill Stachnissa
aUniversity of Bonn, Institute for Geodesy and Geoinformation, Nussallee 15, 53115 Bonn, Germany
Abstract
Industry demands flexible robots that are able to accomplish different tasks at different locations such as navigation
and mobile manipulation. Operators often require mobile robots operating on factory floors to follow definite and
predictable behaviors. This becomes particularly important when a robot shares the workspace with other moving
entities. In this paper, we present a system for robot navigation that exploits previous experiences to generate predictable
behaviors that meet user’s preferences. Preferences are not explicitly formulated but implicitly extracted from robot
experiences and automatically considered to plan paths for the successive tasks without requiring experts to hard-code
rules or strategies. Our system aims at accomplishing navigation behaviors that follow user’s preferences also to avoid
dynamic obstacles. We achieve this by considering a probabilistic approach for modeling uncertain trajectories of the
moving entities that share the workspace with the robot. We implemented and thoroughly tested our system both in
simulation and on a real mobile robot. The extensive experiments presented in this paper demonstrate that our approach
allows a robot for successfully navigating while performing predictable behaviors and meeting user’s preferences.
Keywords: Navigation, Path planning, Collision avoidance, Trajectory prediction
1. Introduction
Over the last decades, robots have reshaped the man-
ufacturing industry. Traditional industrial robots are ma-
nipulators designed to perform specific tasks at fixed loca-
tions. Recently, the demand for flexible robotic solutions
that are able to perform different tasks at different loca-
tions has grown. These tasks include mobile manipulation
and navigation to transport materials and tools from one
location to another. To operate on factory floors, mobile
robots are usually requested to fulfill some requirements.
Often, it is desirable for robots to navigate only in certain
areas of the environment and to follow definite patterns
while navigating. Furthermore, robots frequently operate
in environments populated by other moving entities such
as human workers, forklifts, other robots, etc.. In this con-
text, the reproducibility and the predictability of the nav-
igation become essential. Especially in shared workspaces,
these properties are important to avoid collisions without
limiting the flexibility of the robot navigation.
In this paper, we propose an approach that allows a
robot to navigate in such environments considering the
preferences of the user. It implicitly collects preferences
from robot’s previous experiences and exploits them to
perform new tasks. This is generally more convenient
than hard-coding control strategies. Formalizing prefer-
ences can be difficult and hard-coded rules are often com-
plex to maintain and update. In the same way, using cost
Corresponding author
Email addresses: lorenzo.nardi@uni-bonn.de (Lorenzo
Nardi), cyrill.stachniss@igg.uni-bonn.de (Cyrill Stachniss)
functions to define behaviors can be problematic and usu-
ally requires experts to handle them. On the contrary,
even non-expert users can easily teach behaviors to the
robot by demonstration, e.g. joysticking the robot along
desired routes in the environment.
Our approach aims at reproducing and generalizing
previous successful behaviors to meet user’s preferences.
There exist several motion planning approaches that aim
at reusing previous experiences, but most of them do not
focus on the resulting robot behavior. Teach-and-repeat
allows a robot for repeating trajectories, however it is
hard to generalize them to different situations. This pa-
per builds upon and extends our recent conference pa-
per [24] in which we introduce a planning approach to
reuse robot’s previous experiences to meet user’s prefer-
ences while maintaining the planning flexible. We extend
this approach to consider preferences for robot naviga-
tion also in dynamic environments. Our objective is to
avoid obstacles by computing local deviations from the
path that satisfy user’s preferences, while maintaining the
global behavior. We achieve this by considering a proba-
bilistic method that takes the uncertainty of the obstacle
motion into account. We furthermore present an extended
experimental evaluation in simulated and real world envi-
ronments.
The main contribution of this paper is a system for
robot navigation that allows the robot (i) to meet the pref-
erences of the user exploiting the previous experiences of
the robot, (ii) to reproduce behaviors over different envi-
ronments and situations, (iii) to perform predictable be-
haviors providing similar solutions for similar situations,
Preprint submitted to Elsevier July 7, 2017
and (iv) to navigate in environments populated by other
moving objects while still accomplishing foreseeable be-
haviors. We achieve this by allowing the user to demon-
strate desired behaviors to the robot or to provide feedback
about its experiences. We store user’s favorite behaviors
and reuse them to guide the planning for a new but sim-
ilar navigation task. We define a similarity relationship
among tasks and a path representation for transferring ex-
periences to new situations. Our planning approach takes
advantages from theses concepts and considers previous
experiences while maintaining at the same time the flex-
ibility of a general planner. We implemented our system
using C++ and ROS, and we realized our planner in the
Open Motion Planning Library (OMPL) framework [33].
2. Related Work
Motion planning algorithms for robot navigation have
been widely studied [19]. A class of commonly used ap-
proaches are sampling-based planners such as Rapidly-
exploring random tree (RRT) [20] and its variants as RRT-
Connect [18]. These approaches are fast to discover the
connectivity of a configuration space but it is difficult to
predict the resulting paths due to their random nature.
Bruce and Veloso [3] extend traditional RRT to reuse
cached plans and bias a new search towards their way-
points. Zucker et al. [36] propose to adapt the sampling
distribution of sampling-based motion planners consider-
ing the features of workspace. Jiang and Kallmann [13]
propose the Attractor Guided Planner (AGP) that stores
every successful path and biases a new search to repro-
duce the structure of an experienced path according to
a similarity function. Our planning approach reuses and
generalizes the idea of experience-guided planner.
Other motion planning approaches reuse experiences:
Lien and Yu [21] construct roadmaps around geometric
models and reused them to plan for avoiding obstacles;
Fraichard and Delsart [5] deform previous trajectories in
the configuration-time space to fit new situations; Phillips
et al. [27] build a graph of experiences and exploit it for
performing repetitive tasks. The Lightning framework [2]
repairs previously generated trajectories to fit new plan-
ning problems. Coleman et al. [4] extend this idea by stor-
ing generated paths in a sparse roadmap. Such approaches
take advantage from previous experiences to speed up the
planning time and to find a path in high-dimensional con-
figuration spaces. Our approach does not rely on any
complex model or cost function and aims at reproducing
robot’s previous experiences to meet user’s preferences.
Jetchev and Toussaint [11] learn a mapping between
situations and trajectories and use a descriptor of the sit-
uation to transfer and optimize a previous trajectory in
a new situation. They extend this work in [12], using a
voxel representation of the environment to generalize tra-
jectories to a wider range of situations. Our approach also
uses the concept of situation descriptor to identify previous
paths fitting new situations.
Many teach-and-repeat approaches exist to reproduce
previously taught experiences. Sprunk et al. [32] reproduce
trajectories with high accuracy by matching laser scans.
Furgale et al. [9] propose a vision-based approach based
on topologically connected submaps. Furgale et al. [10]
extend standard teach-and-repeat by adding a local mo-
tion planner to account for dynamic environments. Perea
Str¨om et al. [25] use a similar approach for guiding a robot
home in case the mapping system fails during an explo-
ration mission. Mazuran et al. [22] optimize the demon-
strated trajectories within constraints defined according
to user’s preferences. In contrast to teach-and-repeat, our
objective is to maintain the flexibility of a real planner
while reproducing behaviors.
Case-based reasoning is another approach related to
our work. It considers robot experiences to build exper-
imental models and that are stored as cases and used in
the future tasks. Meri¸cli et al. [23] adopt this approach to
mobile push-manipulation, while Ros et al. [30] use it for
action selection in the robot soccer domain.
There exist several approaches for robot navigation in
dynamic environments. Many of them rely on reactive
strategies as Dynamic Window Approach [7] that repeat-
edly selects a velocity yielding to a safe trajectory within a
short time interval. Elastic Bands [28] combines a reactive
strategy with global planning to avoid obstacles. osmann
et al. [31] and successively Keller et al. [14] extend this ap-
proach to consider explicitly time information for optimal
trajectory planning. These approaches enable robots to
avoid dynamic obstacles, but do not allow for expressing
preferences about their behaviors.
Other researchers study human motion to allow robots
for compliant navigation in populated environments. Kruse
et al. [16] conduct a survey about human-aware naviga-
tion. Ziebart et al. [35] learn a cost function of features
to predict people trajectories and plan paths that avoid
hindrances. Bennewitz et al. [1] learn collections of tra-
jectories that characterize typical human motion patterns.
Kuderer et al. [17] as well as Kretzschmar et al. [15] model
cooperative navigation behaviors of humans to let the robot
interact with people in a socially compliant way. Pfeiffer
et al. [26] propose a navigation model trained on human-
human interaction. Trautman et al. [34] introduce Inter-
acting Gaussian Processes (IGPs) for navigating through
dense human crowds. IGPs describe a probabilistic in-
teraction between multiple navigating entities to accom-
plish cooperative collision avoidance. Fulgenzi et al. [8]
use Gaussian Processes to model typical obstacle trajec-
tories and planned with a variant of RRT that explicitly
consider the probability of collision. Ellis et al. [6] show
that Gaussian Processes allow to explicitly represent the
uncertainty of the human motion. We consider a similar
probabilistic approach to model the uncertain trajectories
of the entities with whom a robot shares the workspace on
a factory floor.
2
3. Our Approach
3.1. Use Case
We consider a use case throughout this paper to explain
and motivate our work: a non-expert operator requests
a mobile robot to perform navigation tasks on a factory
floor. In this environment, other moving entities such as
human workers, forklifts, other robots, etc. may share the
workspace with the robot. We assume that both the robot
and the operator know the environment. The operator
asks the robot to perform a navigation task by specify-
ing the start and goal poses. This scenario stems from
the EU-funded H2020 project RobDREAM that focuses
on automatically adapting and improving the behaviors of
a robot over its previous experiences. Our work enables
the robot to accomplish such tasks by adapting its navi-
gation behaviors according to the user’s preferences. The
operator can express preferences by rating behaviors expe-
rienced by the robot as good or bad, for example through a
simple GUI. The operator can also demonstrate good be-
haviors to the robot by joysticking the platform. We store
the good experiences into a database that grows over time.
Our system exploits such a database of examples to cap-
ture and reproduce the preferences of the user.
3.2. Overview of the Navigation System
Our system works on two levels: the global level and
the local level. We aim at providing behaviors that repro-
duce previous successful experiences of the robot at both
levels. Given a navigation task, the global level computes
a path from the start to the goal pose in the static map of
the environment. The global level can exploit previous ex-
periences to perform a task in the same environment, but
does not generalize the exploitation of experiences across
different environments. The local level handles collision
avoidance with unforeseen obstacles planning deviations
from the global path. It considers only the local situa-
tion. Therefore, the local level is largely independent from
the map of the environment and generalizes well previous
experiences across different scenes and environments.
We consider four key concepts to realize a flexible nav-
igation system that can exploit previous experiences. The
remainder of this section briefly introduces these concepts,
which we will describe in detail in the subsequent sections.
Path Representation. We describe an experienced path as
an ordered list of its poses that captures its structure. We
call these poses attractors. Fig. 1 depicts an example of
attractors, details are in Sec. 4.
Situation Description. We define a similarity relationship
among tasks to exploit previous experiences using situa-
tion descriptors. We define different descriptors at global
and local level to capture their functions and properties.
See Sec. 5 for further details.
Figure 1: Attractor representation of a local path P=
{qs, a1, a2, a3, qg}. (δ, φ, γ) identify a local attractor ai.
Planning Exploiting Experiences. We employ a planner in-
spired by the Attractor Guided Planner [13]. It guides the
planning process to prefer trajectories that match previ-
ous experiences. As described in detail in Sec. 6, distinct
planning instances run in parallel at global and local level.
Planning in Dynamic Environments. We consider a prob-
abilistic approach based on Gaussian Processes [29] to pre-
dict obstacle trajectories. Our system define virtual con-
straints based on these predictions and plans deviations to
avoid dynamic obstacles. We provide details in Sec. 7.
4. Path Representation
We use a path representation that effectively stores and
retrieves previously experienced paths. We represent a
path Pas P={qs, a1, . . . , an, qg}, where qsand qgare
the start and goal poses and a1, . . . , anare the poses along
the path that allow to capture its structure. We call these
poses attractors, recalling the concept introduced by Jiang
and Kallmann [13]. Given an experienced path, we com-
pute its attractors and store them in a database, so that
they can be exploited for planning for a new similar task.
In Sec. 6, we will describe how our planner takes advantage
from this representation.
We compute the attractors of a path by considering
iteratively a window of path points. We initialize the win-
dow with the first two path points. If a line fits through
the points in the window, we insert the successive point.
Otherwise, we identify the last inserted point as candidate
attractor and check whether a straight motion from the
previous attractor is valid and collision free. If this is the
case, we include the candidate attractor to the list of at-
tractors, and reinitialize the window. Otherwise, we select
the previous point in the path as candidate attractor and
perform a new check. We iterate this procedure until it
processes all path points. The result is an ordered list of
poses describing the path.
Global and local levels of our system rely on different
assumptions, so it is necessary to define a different way to
store and to reuse the attractors. We maintain the expe-
riences in two distinct databases: DGfor the global paths
and DLfor the local ones. As global experiences do not
generalize across different environments, we represent the
3
attractors of the global paths in (x, y, θ) map coordinates
and store them in a distinct database for each environ-
ment. On the contrary, to make local experiences available
across different scenes, the local level should not depend
on the environment. To this end, we apply a coordinate
transformation to the attractors that makes them inde-
pendent from the world frame and that can be inverted
in different situations for local replanning. We introduce a
local coordinate system based on the local path and on the
corresponding obstacle. It is the polar coordinate system
identified by the axis υin Fig. 1. It has the pole in the cen-
ter of the obstacle O, and the orientation of the vector υ0
that connects the start to the goal. In this frame, the co-
ordinates (ρ, φ, γ) identify unambiguously the attractors,
where ρis the distance of the attractor to O,φand γare
the angles that the line passing by Oand the center of
the robot Rform with υand the robot axis. As this rep-
resentation relies only on local information, it allows for
transferring experiences across different environments. To
further generalize over different obstacles, we consider δ
instead of ρ, where δis the distance of the attractor from
the obstacle surface along ρ. The coordinates (δ, φ, γ) al-
low for reusing an experienced local behavior even in the
cases in which the obstacle encountered by the robot has
different dimension, while keeping a safe distance from it.
To exploit an experienced local path for a new task, we
transform local attractors in the global frame of the cur-
rent environment and use them to guide the new plan.
5. Situation Description
To reproduce previous experiences, it is important to
identify which of these fits a new situation. We compare
navigation tasks using situation descriptors. A situation
descriptor is an array of vectors that describes the task
and the scene in which the robot will perform it. For each
experienced path, we compute the corresponding situation
descriptor and store it in the database together with the
attractors. We measure the similarity between two tasks
by computing the sum of the Euclidean distances of the
attributes of their situation descriptor.
As stated by Jetchev and Toussaint [11], the ability to
generalize experiences to new tasks depends on how we
describe the situation. We specify two distinct situation
descriptors to describe global and local tasks that fit the
objectives and the assumptions considered at each level.
As the global level depends on the environment, we
define a situation descriptor as the array composed by
the start and goal poses of the task expressed in the map
frame. Given the task to navigate from qs= (xs, ys, θs) to
qg= (xg, yg, θg), the corresponding situation descriptor is
dG= [xs, ys, θs, xg, yg, θg].(1)
This descriptor fits the requirements at global level: it al-
lows for multiple queries in one environment and provides
no information across different environments. Jiang and
Figure 2: Local situation descriptor defined by the local start
(dark blue) and goal (light blue), the extent of the obstacle
(yellow) and of free space around it (red).
Kallmann [13] show that this descriptor is effective in prac-
tice for static environments. Combining this definition of
situation descriptor with the representation of paths as at-
tractors, we can consider at global level every sub-path as
a distinct example. We store each attractor individually
in the database with a reference to the path it belongs to.
Given a new task, we search the database for the pair of
robot poses (qi, qj) such that qiand qjbelong to the same
path Pand the sum of the distances to the start and goal
configurations of the new task is the smallest. In this way,
a new plan can be guided by any of the sub-paths of P.
At local level, we want to generalize experiences across
different environments and obstacles. As a result of that,
we define a situation descriptor that relies only on local
information, as illustrated in Fig. 2. First, we consider
the local task expressed in the local coordinates (ρ, φ, γ)
introduced in Sec. 4:
dLtask = [ρs, φs, γs, ρg, φg, γg].(2)
We consider the distance from the obstacle center γinstead
of the distance from the surface of the obstacle δused
for representing local attractors, as γis straightforward to
compute and δdoes not provide any further information to
describe a task. This descriptor represents the task with
respect to the obstacle, but provides no information about
the obstacle itself and the space around it. The geometry
of the obstacle and of the space around it are fundamental
to plan a path to avoid it. Therefore, we consider two
additional components. The first one describes the shape
and dimension of the obstacle by computing the extent of
the obstacle from its center Oin the 8 directions defined
by the υaxis, each rotated by 45 degrees, see Fig. 2:
dLobstacle = [e1, e2, e3, e4, e5, e6, e7, e8].(3)
Considering this component, we favor the robot to accom-
plish similar behaviors at obstacles with similar shapes and
dimensions. In fact, these obstacles are likely to corre-
spond to similar objects and the user may want the robot
to accomplish a specific behavior to avoid a class of ob-
jects. For example, to avoid moving or potentially moving
obstacles the user may prefer a behavior that does not
cross their front or their way. The second additional com-
ponent describes the free space around the obstacle as the
4
extent of free space from the obstacle surface to the next
obstacle along the same directions considered in dLobstacle:
dLfreespace = [f1, f2, f3, f4, f5, f6, f7, f8],(4)
see Fig. 2 for an illustration. The geometry around an
obstacle is a good indicator for deciding on which side of
the obstacle the robot should pass to avoid it. Therefore,
considering this component will favor paths in the same
homotopy in tasks with similar geometry around the ob-
stacle. The resulting local descriptor is
dL= [dLtask , dLobstacle , dLfreespace ] (5)
that allows for comparing local tasks across different sit-
uations, obstacles and environments. We combine differ-
ent measure of similarity to keep the local descriptor as
general as possible. Nevertheless, each component can be
parametrized to obtain a descriptor biased towards a spe-
cific aspect of a local task, for example, to favor similar
behaviors at similar obstacles.
6. Planning Exploiting Experiences
We introduce a path planning algorithm that exploits
robot’s previous experiences to plan for new tasks. It takes
advantage of the notions of attractors to represent the ex-
perienced paths, and situation descriptor to identify simi-
lar tasks. Consider the use case introduced in Sec. 3.1 and
assume that the operator requires the robot to perform
the task illustrated in Fig. 3: navigate from qsto qgin the
known environment M, where the robot has already suc-
cessfully performed some tasks (dotted lines). Our system
takes as input: the database of global paths experienced
in M,DG,M , the database of local paths DL, the start qs
and the goal qgposes.
6.1. Main
Alg. 1 describes the main procedure of our system. The
method PlanGlobal computes a path in the static map
(line 1). Once a path is available, the robot starts nav-
igating along it. At each step of the path, we check the
remaining poses for invalid configurations (line 4). If an
unforeseen obstacle blocks the path at some point (line 5),
PlanLocal plans a deviation that enables the robot to
avoid the blocking obstacle and to get back to the global
path (line 6). We update the path with the local devia-
tion (line 7) and the robot safely navigates towards the
next path pose (line 8). We repeat this procedure until
the robot reaches the goal. Once the robot completed the
task, we ask the user to rate the robot behaviors (line 10).
In the following, we describe in detail how local and global
planning work and how we collect user feedbacks.
(a) New task and previous experiences (dotted lines).
(b) Global planning guided by the attractors of a similar experience.
(c) Local planning guided by the attractors of a similar experience.
(d) Exploration by relaxing the attractors of a previous experience.
Figure 3: Our planning approach exploits similar experiences
at global and local level to plan for a new task.
6.2. Global Planning
The method PlanGlobal (Alg. 2) implements the
global level of our system. Given the task to navigate
from qsto qgand the static map of the environment M, it
computes a path from qsto qgin M. First, it computes the
global situation descriptor dGcorresponding to the task
(line 1), as described in Sec. 5. We compare the descrip-
tor dGwith the ones in DG,M to find a previous similar
experience. If no similar experience is available, we plan a
new path from scratch using standard bi-directional RRT
5
Alg 1 Main(qs, qg, DG,M , DL)
1: pathglobal PlanGlobal(qs, qg, DG,M )
2: pathactual pathglobal
3: while pipathactual pi6=qgdo
4: obstacle, invalid poses ={pj, ..., pk} ←
checkPath([pi, pg])
5: if invalid poses 6= 0 then
6: pathlocal PlanLocal(pj1, pk+1, obstacle, DL)
7: pathactual updatePath(pathactual, pathlocal )
8: navigateTo(pi)
9: ii+ 1
10: Feedback(qs, qg, pathactual, pathglobal , DG,M , DL)
Alg 2 PlanGlobal(qs, qg, DG,M )
1: dGglobalDescriptor(qs, qg)
2: if attrgattractorsSimilarTask(dG, DG,M )6= 0 then
3: pathglobal bi-RRT(qs, qg, guided sampling, attrg)
4: else
5: pathglobal bi-RRT(qs, qg, uniform sampling)
6: return pathglobal
Alg 3 PlanLocal(qs, qg, obstacle, DL)
1: dLlocalDescriptor(qs, qg, obstacle)
2: if attrlattractorsSimilarTask(dL, DL)6= 0 then
3: attrltoMapCoordinates(attrl, obstacle)
4: pathlocal bi-RRT(qs, qg, guided sampling, attrl)
5: else
6: pathlocal bi-RRT(qs, qg, unif orm sampling)
7: return pathlocal
Alg 4 Feedback(qs, qg, pathactual, pathglobal , DG,M , DL)
1: if feedback(pathglobal) = good then
2: dGglobalDescriptor(qs, qg)
3: attrgextractAttractors(pathglobal)
4: storePath(dG, attrg, DG,M )
5: deviations getDeviations(pathactual , pathglobal )
6: for each {obstacle, path}in deviations do
7: if feedback(path) = good then
8: dLlocalDescriptor(qs, qg, obstacle)
9: attrlm extractAttractors(path)
10: attrltoLocalCoordinates(attrlm , obstacle)
11: storePath(dL, attrl, DL)
(bi-RRT) [18] (line 5). Otherwise, if the robot experienced
a similar task as illustrated in Fig. 3a (green dotted line),
we retrieve its attractors attrg(line 2) and exploit them
to compute the new path (line 3). In this case, we em-
ploy as planner a tailored version of the Attractor Guided
Planner (AGP) [13]. It is based on bi-directional RRT
and biases its search trees towards a previous experience.
The attractors attrg={a1, a2, a3, ..., an}guide the sam-
pling process. We iteratively select one of the attractors
aiattrgin an ordered fashion and sample a pose. For
the start tree, we select the attractors from the closest
to the start location to the closest to the goal location.
For the goal tree, in the opposite order. As a search tree
reaches an attractor ai, we select the next one ai+1. If
changes occurred in the environment with respect to the
previous experience, one or more attractors might not be
reachable. In this case, we sample a new pose according
to a dynamically updated Gaussian distribution centered
in the unreachable attractor and with covariance growing
proportionally to the number of non-valid states sampled
around it. We repeat this procedure until we sample a
valid state or we reach a maximum number of iterations.
If we find a valid state, the new path will exploit a previ-
ous experience even if the environment slightly changed.
Otherwise, the search continues sampling poses uniformly
in the same way as standard bi-RRT. If the planner ex-
ploited a previous experience, the new path will reproduce
its structure, as shown in Fig. 3b. Furthermore, in this case
the planning time decreases as we will show in Sec. 8.1.
6.3. Local Planning
The method PlanLocal (Alg. 3) implements the lo-
cal replanning procedure of our system. We trigger it
if the robot encounters an obstacle on its way. It con-
siders the blocking obstacle to compute the descriptor of
the local situation dL(line 1). We use the descriptor to
query the database of local paths DLfor a similar experi-
ence (line 2). The planning scheme reproduces the one at
global level. If no similar previous experience is available
to be exploited, we search a new local path using stan-
dard bi-directional RRT (line 6). Otherwise, we compute
the attractors attrlcorresponding to the previous similar
experience and transform them into the current situation
(line 3) to guide the sampling process (line 4). If a previ-
ous experience guides the planning, the resulting path will
reproduce it in the new situation. In the example shown
in Fig. 3c, an obstacle blocks the path and we need to plan
a deviation. Along its path, the robot has already avoided
an obstacle in a locally similar situation. Therefore, we
transform the attractors of that experience to the new sit-
uation and use them to guide the new plan. As a result,
the robot avoids the new obstacle accomplishing a similar
behavior to the experienced one.
6.4. User Feedback
The method Feedback enables the user to rate the
experienced robot behaviors (Alg. 4). First, we ask him
to provide a feedback for the global path (line 1). If the
user rates it as good, we compute the corresponding de-
scriptor (line 2) and attractors (line 3), and store them in
DG,M (line 4). Considering the deviations from the global
path generated by local replanning (line 5), the user can
also provide a feedback for robot’s local behaviors (line 7).
If he rates a local path as good, we compute the corre-
sponding local descriptor (line 8) and attractors(line 9).
At local level, we transform the attractors to local coordi-
nates (line 10) before storing them in DL(line 11). The
experiences stored in this process are then made available
to guide the planning for new tasks.
6
Our system does not need any initial data to work.
Without examples, we automatically fall back to bi-RRT
performance. We built the databases by storing the good
experiences online while the robot performs its tasks. The
higher the number of feedbacks provided by the user, the
better the system fit his preferences. If the environment
changes so that some experiences cannot be exploited for
planning anymore, we remove them from the database. In
this way, we keep track only of the experiences still valid,
and we bound the growth of the database. If required,
we can easily define distinct databases for different users
such that each database contains only the preferences of a
specific user.
6.5. Exploration vs. Exploitation of Experiences
Our planning approach focuses the search for a new
path as close as possible to a similar previous experience.
This allows for reproducing behaviors and generating paths
that are predictable to the user. It comes however at the
cost of exploring only a limited portion of the configuration
space if a similar path exists.
In case an exploration towards new path is desired,
we can give to the user the possibility to switch to an
exploration mode. Two options are available to explore
the space. First, full exploration does not consider any
previous experience but plans new paths using standard
bi-RRT that probabilistically covers the whole configura-
tion space. Second, attractor relaxation relaxes the con-
straints imposed by the attractor by increasing the co-
variance around one, some, or all attractors during the
sampling process in planning. One example in which we
relaxed all of the attractors of an experience is illustrated
in Fig. 3d. Similar paths can be generated by relaxing the
attractors that still maintain some form of similarity to
the original experience.
The user can evaluate such paths in simulation and,
once he finds a path that satisfies his preferences, we re-
place the experience in the database with the new path.
Attractor relaxation does not explore the whole configura-
tion space but allows the user to find an optimal behavior
according to his preferences starting from an experienced
path.
7. Planning in Dynamic Environments
In the scenario we envisioned in Sec. 3.1, the robot
might share the workspace with other moving entities such
as persons, other robots, etc. To deal with dynamic obsta-
cles, some path planners explicitly consider time informa-
tion [5, 14]. If the state space of the planner contains a
time dimension, planning becomes more complex and the
planning time grows. This often results in slow reactions
of the robot in the presence of an obstacle and, in the
worst case, in a collision. We introduce a strategy to use
our planning approach to avoid dynamic obstacles by re-
placing the notion of time by spatial constraints.
In the situation depicted in Fig. 4a, the robot Rnav-
igates along the blue dotted path. At the same time, an-
other agent His moving nearby. To ensure that Hwill
not block the robot’s path at some point, we first need
to check whether a collision between the two agents may
occur. We know robot’s path and motion model, so we
easily compute its future poses. On the other side, an
entity moving on a factory floor can be a human worker,
another robot, a forklift, etc. so we cannot make any spe-
cific assumption about the trajectory and motion model
of H. As H’s future motion presents high uncertainty, we
employ a probabilistic approach to predict its trajectory.
We rely on Gaussian Processes (GPs) [29] to model the
trajectory out of the current observation data.
7.1. Gaussian Processes for Modeling Trajectories
A Gaussian Process is a collection of Gaussian random
variables defined by a mean function mand a covariance
function c. A function f(x) following this distribution is
denoted as
f(x)GP (m(x), c(x, x0)).(6)
Given nobservations uof ffor the input xX, we can
learn the model GP (m(x), c(x, x0)) that encodes all of our
prior knowledge of it. For a new input x, GPs allow for
computing the predictive mean µand variance σof f(x)
as follows:
µ=k>K1u(7)
σ=c(x, x)k>K1k(8)
K=
c(x1, x1). . . c(x1, xn)
. . . . . . . . .
c(xn, x1). . . c(xn, xn)
, k =
c(x1, x)
. . .
c(xn, x)
.(9)
Many works as [6],[8] and [34] demonstrate that GPs
work well to model trajectories thanks to their property to
provide Gaussian probability distribution over trajectory
poses. Furthermore, approaches using GPs are more flex-
ible than for example an Extended Kalman Filter (EKF)
for which an explicit motion model needs to be defined.
We consider a pair of GPs to model a trajectory. They
take as input the time t= 1,2,3, ..., T and provide as
output the changes along the xand yaxes.
xt=xtxt1
yt=ytyt1
(10)
This representation assumes the movements along x
and yto be independent. As noise along xand yis corre-
lated, correlation is also induced in the posterior processes.
For both GPs, we consider a zero mean function. This is a
reasonable assumption as they represent functions of incre-
ments. Furthermore, this assumption allows for reducing
the number of hyperparameters that define our model. We
employ as covariance function the sum of a Matern kernel
7
(a) Robot Rnavigates along the blue dotted path while
another agent Hmoves nearby.
(b) Collision detected at t0by integrating R’s future footprints over
H’s predicted uncertainty areas.
(c) Local replanning considering the gray obstacle defined by the
uncertainty areas for t0tT.
(d) Deviation update to smoothly converge to the original path.
Figure 4: Planning a local deviation to avoid a collision with a dynamic obstacle.
with ν= 5/2 and a noise kernel:
c(x, x0) = σf1 + 5(xx0)
l+5(xx0)2
3l2
exp 5(xx0)
l+σ2
nδ(xx0)
(11)
where the hyperparameters are the length scale l, the ex-
pected variance of the output σfand the noise term σn.
Matern kernel includes a large class of kernels and it is
often used in real applications thanks to its flexibility. We
preferred it over the squared exponential kernel as the lat-
ter assumes high smoothness of the function, which usually
does not hold for noisy observations of real trajectories.
We train the hyperparameters using the robot’s previous
experiences. Thus, we exploit general information about
moving entities without the need of training parameters
of an explicit parametric motion model. As a result, the
trajectory model improves over time as the robot performs
more and more experiences.
7.2. Future Collision Detection
We use this tra jectory model to predict the future
poses of agents moving nearby the robot while navigat-
ing. In Fig. 4b, we predict the trajectory of Hwithin a
time horizon Tinto the future. The red dotted line is
the predicted mean trajectory. We visualize the predicted
2·standard deviations confidence intervals in x, y coor-
dinates as a continuous sequence of ellipsoids, which we
call uncertainty areas. In the figure, the red ellipsoids rep-
resent the uncertainty areas generated by the prediction
of H’s trajectory. To determine whether a collision may
occur, we integrate the future footprints of Rover the un-
certainty areas of Hcorresponding in time. If there is a
significant probability that the robot and the moving agent
are in the same area at some time t0, we detect a future
collision. This is similar to check if the future footprints
of Roverlaps with the uncertainty area of H. In Fig. 4b,
we detect a future collision at t=t0.
7.3. Replanning
If we detected a future collision, our system needs to
plan a deviation from the path to allow the robot to avoid
the other moving agent. We achieve this by defining an
artificial obstacle composed by the uncertainty areas. As
illustrated in Fig. 4c, it is centered in the center of the
uncertainty area corresponding to the time of the future
collision t0, and has area delineated by the union of the
uncertainty areas of Hfor t0tT. We exploit such
obstacle for local replanning as described in the previous
section. So, if a similar experience is available, we trans-
form its attractors in the current situation and use them
to guide the planning. The resulting path (blue dotted
line) avoids passing by the uncertainty areas correspond-
ing to t0tT. This may seem a conservative strategy,
however it ensures safety in the time horizon T. Notice
that an obstacle defined in this way expands in the di-
rection in which His moving, so the replanning favors
deviations passing on the opposite direction. In Sec. 8, we
will demonstrate that this strategy works well in practice.
While avoiding a moving obstacle, we update the ob-
stacle and plan the deviation from the global path every
time a new observation of its trajectory is received. As the
8
(a) bi-RRT.
(b) Our planner given the green path as example.
(c) Area covered by a circular robot to navigate along the paths
generated by bi-RRT (red) and by our approach (blue).
Figure 5: Global paths generated for a set of similar tasks.
time goes, the uncertainty of the obstacle’s trajectory pre-
diction at t0decreases and the corresponding uncertainty
area becomes smaller (Fig. 4d). Accordingly, also the de-
viation from the global path decreases, so that the robot
smoothly converges as soon as possible to it.
We apply the same strategy to avoid static obstacles.
If an obstacle is not moving, the predicted mean is always
equal to the current pose and the variance is zero. There-
fore, we can exploit indistinctly local experiences to avoid
static and dynamic obstacles.
8. Experiments
This section illustrates and discusses the capabilities
and performances of our system. We designed experiments
to support the claims made in the introduction. We evalu-
ate our planning approach in Sec. 8.1. Sec. 8.2 focuses on
our trajectory prediction model. We test our navigation
system in simulation in Sec. 8.3, and we present some tasks
performed running our system on a real KUKA Youbot
mobile robot in Sec. 8.4.
(a) bi-RRT.
(b) Our planner given the green path as example.
D
E
(c) Our planner given the green path in Fig. 6b as example.
Figure 6: Local paths generated to avoid unforeseen obstacles.
8.1. Planning
In the first set of experiments, we evaluate the capa-
bilities of our planning approach. Throughout this sec-
tion, we consider the scenario of the use case introduced
in Sec. 3.1 and use the implementation of bi-directional
RRT (bi-RRT) available in OMPL as baseline.
To show the ability of our planner to reproduce experi-
ences meeting user preferences, we assume an user requires
the robot to perform the 10 navigation tasks represented
in Fig. 5. These tasks are similar in a global sense, i.e. they
have nearby start and goal poses. Fig. 5a shows the paths
generated by bi-RRT for these tasks. Due to the random
nature of bi-RRT, they reveal substantial differences to
each other: some pass by the top central room, others by
the bottom central room. Thus, the user can make only
limited predictions about the resulting path for a similar
task. He also cannot express any preference about where
and how the robot should navigate.
Assume the user prefers that the robot navigates pass-
ing by the central top room to perform these tasks. Our
system allows him to express this preference by rating the
green path in Fig. 5b as good. Using this path as exam-
9
Figure 7: Performance comparison for planning using bi-RRT and our approach with 10, 20, 50, 100 and 200 examples.
ple, our planner generates for the same tasks of Fig. 5a
the blue paths in the figure. These paths reproduce the
structure of the example meeting the preferences of the
user. Therefore, our planner allows for generating similar
solutions for similar tasks and, thus, for making the robot
behaviors predictable.
Our planner generates paths with such properties also
for planning local deviations. Assume the robot is moving
along the orange path in Fig. 6a. Along the path, the robot
encounters two unforeseen static obstacles Aand Cthat
block the path at two different locations. The robot needs
to replan to avoid them and to reach its goal. In Fig. 6a, we
planned a local deviation 10 times for each obstacle using
bi-RRT. The resulting paths cause the robot to behave
differently for distinct runs of the same task. Our planner
prevents this and it further enables the user to express
preferences.
Assume the user rates the green path in Fig. 6b as
agood example to avoid obstacle A. Our planner gener-
ates the blue paths exploiting this path. These paths suc-
cessfully avoid obstacle Areproducing the example and so
meeting the user’s preferences. As the local situation at
obstacle Cis similar to the one at obstacle A, our planner
transform the experience at obstacle Ain the new situ-
ation and generates deviations to avoid obstacle Cthat
reproduce it. This shows that our system can reproduce
behaviors and user preferences across similar local situa-
tions. This holds also across different environments. Con-
sider the scenario illustrated in Fig. 6c where the orange
path is blocked by two obstacles. The situation at obsta-
cle Dis similar to the one for which the user provided an
example. Thus, our system generates paths to avoid the
new obstacle that reproduce the experience in the new sit-
uation. The local situation at obstacle Eis different and
so the example cannot be exploited. In this case, our plan-
ner generates new paths from scratch in the same way as
bi-RRT.
To illustrate the performances of our planning approach,
we considered 20 sets of 10 similar global navigation tasks
and 20 sets of 10 similar local tasks. First, we plan for
these tasks by using bi-RRT and then by using our planner
with 10, 20, 50, 100 and 200 examples randomly selected
among the ones generated by bi-RRT. We run this pro-
cedure 10 times for a total of 20,000 planning instances.
We consider 3 measures to evaluate the performance of the
planners: planning time, number of sampled states while
planning, area of the environment covered by a circular
robot to perform a set of similar tasks. The latter gives a
measure of the similarity of the paths generated for similar
tasks. The smaller the area covered, the more similar are
the paths. Fig. 5c illustrates an example of the area cov-
ered by a robot when using bi-RRT (red) and our approach
(blue) as planner.
Fig. 7 shows the performance of each planner setting
at global and local level. The first chart shows the average
planning time. As few examples are available, our system
outperforms bi-RRT. The planning time decreases with in-
creasing the number of examples up to approx. the 50%
in the local case and the 60% in the global case. When
the number of examples becomes large, the planning time
tends to marginally increase due to time needed to query
the database. The second chart shows the average number
of states sampled during planning. If a similar experience
is available, our planner tends to sample the correspond-
ing attractors instead of attempting to explore the whole
environment as bi-RRT. Thus, the larger the number of
examples provided, the smaller is the number of sampled
states. In the third chart, we compare the average percent-
age of area of the environment occupied by a circular robot
to perform a set of similar tasks. Even with few examples,
the area covered by our system is approx. 25% smaller
than bi-RRT at global level and 35% at local level. The
results at local level are explained by the ability of our ap-
proach to generalize experiences across different obstacles
and situations. When the number of examples available for
each task increases, the covered area grows accordingly.
8.2. Trajectory prediction
The second set of experiments aims at evaluating our
approach to predict the trajectory of moving objects. We
want to demonstrate that within a short time it provides
good trajectory predictions, and that uncertainty areas
captures well the uncertainty of the motion. To this end,
we created a setup in our lab in which people could walk
through and recorded the trajectories of 6 people walk-
ing in it for approx. 120 s each using a motion capture
system. We considered 30 s of trajectory of each person
to train the hyperparameters of our model. We used the
resulting model to predict the future trajectory of the re-
maining data within the next 5 s every 0.25 s for a total
of approx. 2500 predictions. The performance of our tra-
jectory prediction approach are illustrated in Fig. 8. The
10
Figure 8: Performance of our trajectory prediction model.
first graph shows the root mean square error of the mean
trajectory predicted by our approach with respect to the
time predicted. We compare it with the trajectory pre-
dicted by a linear constant-velocity model. Our approach
always outperforms the linear model and presents half the
error for predictions up to 3 s. The second graph illustrates
in percentage the number of times that the real trajectory
belongs to the corresponding uncertainty area. The per-
centage is above 90% for predictions up to 3s. Afterwards,
it decreases linearly and at 5 s the real trajectory is in
the corresponding uncertainty area around the 30% of the
cases. This is partially due to the zero mean assumption
made for the Gaussian Processes in our model. It means
in practice that for long predicted time the increments ∆x
and ∆ywill tend to zero. Still, our approach provides
good results for short-time predictions. Furthermore, 3 s
is usually a reasonable time to react to an obstacle moving
in an indoor environment. We also contrast this effect by
continuously updating the prediction and the path while
avoiding an obstacle. In the next section, we will show
that our approach works well in practice for enabling the
robot to avoid dynamic obstacles with foreseeable and safe
behaviors.
8.3. Navigation in Simulation
We implemented our system using C++ and ROS, and
tested it using V-REP robot simulator. We required a sim-
ulated robot to perform navigation tasks in some simulated
environments. We illustrate and discuss the performance
of our navigation system comparing it to other common
approaches. We want to show specially that our system
can generate similar robot behaviors for similar tasks.
First, we require the robot to perform 10 sets of 10
similar navigation tasks in some static environments to
test our system for robot navigation at global level. We
included in the database of global paths one example per
set randomly generated using bi-RRT, so that our system
Figure 9: Performance of navigation in static environments.
Replanning
Frequency [Hz]
Obstacle Velocity
0.4 m/s 0.8 m/s 1.0 m/s
coll. fail. coll. fail. coll. fail.
DWA 20.0 0.0 0.0 0.375 0.05 0.25 0.0
TEB 5.0 0.0 0.0 0.075 0.0 0.15 0.025
bi-RRT 20.0 0.0 0.0 0.0 0.0 0.075 0.0
Our appr. 20.0 0.0 0.0 0.0 0.0 0.0 0.0
Table 1: Statistics for dynamic obstacle avoidance.
had some experiences available to exploit while planning.
We compared the performances of our system with bi-
RRT and navfn by ROS. navfn is a standard approach in
ROS for robot navigation based on Dijkstra’s algorithm,
while we implemented bi-RRT by considering our system
without any example. We analyze 3 measures to evaluate
their performances: executed path length, average clear-
ance during navigation and area covered by the robot to
perform a set of similar tasks. Fig. 9 shows the results of
our experiments. navfn generates in average the shortest
paths, while bi-RRT presents the largest clearance. Us-
ing our system, these measures depend on the examples
exploited while planning. A first indicator that our ap-
proach allows for providing similar solutions for similar
tasks is that it presents the lowest standard deviation for
both measures. To further support this, our system covers
almost half of the area covered using bi-RRT.navfn also
scores good. It attempts to minimize the path length and,
by doing this, the paths for similar tasks tend to pass by
the same locations. However, it does not allow to express
any preference about which these locations should be.
We evaluate robot navigation at local level by intro-
ducing moving obstacles in the environments. We defined
six situations that differ from each other for: task, obstacle
velocity and trajectory, type of collision that could occur
(frontal, lateral, etc.). We compared our system to a reac-
tive approach as Dynamic Window Approach (DWA) [7],
an approach that consider explicitly time as Time Elas-
tic Bands (TEB) [14], and our system with no examples
(bi-RRT ). We used the available ROS implementation of
DWA and TEB. For simplicity, we provided only two ex-
amples to our system: one avoiding an obstacle on the left
and one on the right. We fixed the global path and per-
formed each task 20 times for every navigation algorithm.
We considered five measures to evaluate their performance:
executed path length, deviation from the global path, ex-
ecution time, clearance, and area covered to perform one
task multiple times. Fig. 10 shows the results of our ex-
11
Figure 10: Performance of navigation in dynamic environments.
Figure 11: KUKA Youbot.
periments. Our system provides in average the shortest
path length. It continuously updates the local path while
avoiding an obstacle, and so it converges as soon as pos-
sible to the global path as shown also in the analysis of
the deviation from the global path. bi-RRT also provides
good results in average as it is implemented in our system,
but it presents large standard deviation due to its random
nature. The results for the execution time reproduce the
ones for the path length except for DWA. The last chart
shows that our system allows the robot to cover the small-
est area to perform a task 20 times. This demonstrates its
capability to generate similar behaviors for similar situa-
tions also at local level.
Table 1 reports for each algorithm the replanning fre-
quency, the collision and failure rates with respect to the
velocity of the obstacle that we experienced in our exper-
iments. A collision occurs if the robot collides with an
obstacle, but it is then able to reach the goal. A failure
takes place if the robot is no able to reach the goal within
a timeout. As TEB plans optimal trajectories consider-
ing explicitly time, it plans at a lower frequency than the
other approaches. This is reflected in collision and failure
rates: when the obstacle is fast, TEB does not always re-
act fast enough. Even if replanning at 20 Hz, DWA also
can cause collisions and failures when the obstacle velocity
increases. Our system is safer. If no example is available
(bi-RRT ), it provoked collisions only in one case. Sam-
pling states uniformly may generate completely different
paths for successive planning instances. In this case, the
robot oscillates and wastes time to react to the obstacle.
Instead, we experienced no collision or providing examples
to our system.
(a) Navigation in a hallway keeping on the right.
(b) Navigation in a hallway without obstructing the passage.
(c) Navigation without passing under the table.
Figure 12: Examples of navigation tasks in real world.
8.4. Real Robot Navigation
We ran our system on a real robot to demonstrate that
our approach works for robot navigation in real world. The
platform used is the KUKA Youbot illustrated in Fig. 11.
It is an omni-directional mobile robot that we equipped
with two Hokuyo laser range finders. Using these sen-
sors, the robot localizes itself in the environment through
a Monte Carlo localization approach.
12
(a) Robot navigates along the planned path. (b) Human walks nearby.
(c) Collision detection and planning deviation to avoid the obstacle. (d) Deviation update (I).
(e) Deviation update (II). (f) Deviation update (III).
(g) Deviation update and convergence to the original path. (h) Robot resumes navigation along the original path.
Figure 13: Robot navigates along a given path avoiding a human blocking its way.
We want to show that our system allows the robot for
easily performing behaviors that may be hard to encode
in typical navigation systems. To this end, we defined
3 simple navigation tasks that an user may require the
robot to perform. The first task is illustrated in Fig. 12a
and consists of navigating from one side to the other of
the hallway by keeping on the right side. If no example is
available, the robot just navigates towards the goal (red
paths). So, we joysticked the robot along the green path
that passes on the right side of the hallway, and gave it
as example to our system. Exploiting such example, our
system navigates from one side to the other of the hallway
by keeping on the right (blue paths). The second task is
illustrated in Fig. 12b. The robot has to navigate through
13
Figure 14: Trajectories of robot (blue) and human (red) for
similar situations in which the robot replans to avoid collisions.
the hallway without to obstructing the passage. Without
any example, the robot just moves towards the goal (red
path) with the risk to block the passage for some time. As
we demonstrate the green example, the robot navigates
along the blue path to reach the other side of the hallway.
In the third task, the robot has to navigate from one side to
the other of a table avoid passing under it. Fig. 12c depicts
in red the path that the robot navigates if no example is
provided, and in blue the path executed when the green
path becomes available as example.
We also tested the capabilities of our system to avoid
walking people who may block robot’s path. We designed
different setups in which the robot and a human could
move through, and used a motion capture system to ob-
serve their trajectories. Fig. 13 illustrates an example in
which the robot avoids a human crossing its way using our
system. The robot navigates along the blue dotted path
(Fig. 13a). The yellow line shows the observed robot’s tra-
jectory. In Fig. 13b, a human walks into the scene. The
observations of his trajectory are depicted in red, while
the orange dotted line represents the predicted mean of
his future trajectory. Our system detects a future collision
in Fig. 13c that generates the gray artificial obstacle. So,
it plans the green local deviation that avoids the human
exploiting a previous experience. In Fig. 13(d-f), our sys-
tem updates the local deviation according to the current
the prediction of the trajectory of the human. As the hu-
man does not block the original path anymore (Fig. 13g),
our system plans paths that allow the robot to converge
to it. In Fig. 13h, the robot gets back to the global path
and continues its navigation along it. Fig. 14 illustrates
the trajectories of the robot and the human recorded for
running the same task 10 times. It shows that our system
reproduce similar behaviors for similar tasks.
9. Conclusion
In this paper, we presented a robot navigation system
that leads to navigation behaviors that are predictable and
meet user’s preferences. This is especially important for
mobile robots operating on factory floors where the oper-
ator often requires a robot to follow some criteria. Pref-
erences are implicitly extracted collecting demonstrated
examples or feedbacks about robot’s previous experiences.
Our planning approach reproduces experiences over sit-
uations and environments. This allows for accomplish-
ing similar behaviors for similar tasks according to user’s
preferences. We also introduced a probabilistic approach
to predict the trajectories of moving agents that allows a
robot for avoiding dynamic obstacles by applying the same
planning scheme. We implemented and evaluated our ap-
proach over an extensive set of experiments comparing it
with different common approaches for robot navigation.
We performed tests both in simulation and on a real mo-
bile robot operating in real world. The experiments sug-
gest that our system can reproduce and generalize previous
behaviors for robot navigation meeting user’s preferences
both at global and local level. Furthermore, it allows for
easily performing common tasks that may be hard to en-
code in typical navigation systems.
10. Acknowledgments
This work has partly been supported by the European
Commission under the H2020 framework programme un-
der grant agreement no. 645403 (RobDREAM).
References
[1] Bennewitz, M., Burgard, W., Cielniak, G., Thrun, S., 2005.
Learning motion patterns of people for compliant robot motion.
Int. Journal of Robotics Research 24, 31–48.
[2] Berenson, D., Abbeel, P., Goldberg, K., 2012. A robot path
planning framework that learns from experience, in: Proc. of
the IEEE Int. Conf. on Robotics & Automation (ICRA), pp.
3671–3678.
[3] Bruce, J., Veloso, M., 2002. Real-time randomized path
planning for robot navigation, in: Proc. of the IEEE/RSJ
Int. Conf. on Intelligent Robots and Systems (IROS), pp. 2383–
2388.
[4] Coleman, D., S¸ucan, I.A., Moll, M., Okada, K., Correll, N.,
2015. Experience-based planning with sparse roadmap span-
ners, in: Proc. of the IEEE Int. Conf. on Robotics & Automa-
tion (ICRA), pp. 900–905.
[5] Delsart, V., Fraichard, T., 2009. Navigating dynamic environ-
ments with trajectory deformation. Journal of Computing and
Information Technology 17, 27–36.
[6] Ellis, D., Sommerlade, E., Reid, I., 2009. Modelling pedestrian
trajectory patterns with gaussian processes, in: Proc. of the
Int. Conf. on Computer Vision (ICCV Workshops), pp. 1229–
1234.
[7] Fox, D., Burgard, W., Thrun, S., 1997. The dynamic window
approach to collision avoidance. IEEE Robotics & Automation
Magazine 4, 23–33.
[8] Fulgenzi, C., Tay, C., Spalanzani, A., Laugier, C., 2008.
Probabilistic navigation in dynamic environment using rapidly-
exploring random trees and gaussian processes, in: Proc. of
the IEEE/RSJ Int. Conf. on Intelligent Robots and Systems
(IROS), pp. 1056–1062.
[9] Furgale, P., Barfoot, T., 2010. Visual teach and repeat for long-
range rover autonomy. Journal on Field Robotics 27, 534–560.
[10] Furgale, P., Kr¨usi, P., Pomerleau, F., Schwesinger, U., Colas, F.,
Siegwart, R., 2014. There and back again-dealing with highly-
dynamic scenes and long-term change during topological/metric
route following, in: ICRA14 Workshop on Modelling, Estima-
tion, Perception, and Control of All Terrain Mobile Robots.
[11] Jetchev, N., Toussaint, M., 2009. Trajectory prediction: learn-
ing to map situations to robot trajectories, pp. 449–456.
14
[12] Jetchev, N., Toussaint, M., 2010. Tra jectory prediction in clut-
tered voxel environments, in: Proc. of the IEEE Int. Conf. on
Robotics & Automation (ICRA), pp. 2523–2528.
[13] Jiang, X., Kallmann, M., 2007. Learning humanoid reaching
tasks in dynamic environments, in: Proc. of the IEEE/RSJ
Int. Conf. on Intelligent Robots and Systems (IROS), pp. 1148–
1153.
[14] Keller, M., Hoffmann, F., Hass, C., Bertram, T., Seewald, A.,
2014. Planning of optimal collision avoidance trajectories with
timed elastic bands. IFAC Proceedings Volumes 47, 9822–9827.
[15] Kretzschmar, H., Spies, M., Sprunk, C., Burgard, W., 2016. So-
cially compliant mobile robot navigation via inverse reinforce-
ment learning. Int. Journal of Robotics Research 35, 1289–1307.
[16] Kruse, T., Pandey, A.K., Alami, R., Kirsch, A., 2013. Human-
aware robot navigation: A survey. Journ. of Rob. & Aut. Sys-
tems 61, 1726–1743.
[17] Kuderer, M., Kretzschmar, H., Burgard, W., 2013. Teaching
mobile robots to cooperatively navigate in populated environ-
ments, in: Proc. of the IEEE/RSJ Int. Conf. on Intelligent
Robots and Systems (IROS), pp. 3138–3143.
[18] Kuffner, J.J., LaValle, S.M., 2000. Rrt-connect: An efficient
approach to single-query path planning, in: Proc. of the IEEE
Int. Conf. on Robotics & Automation (ICRA), pp. 995–1001.
[19] La Valle, S.M., 2011. Motion planning. IEEE Robotics & Au-
tomation Magazine 18, 108–118.
[20] LaValle, S.M., 1998. Rapidly-exploring random trees: A new
tool for path planning. Technical Report No. 98-11 .
[21] Lien, J.M., Lu, Y., 2009. Planning motion in environments with
similar obstacles, in: Proc. of Robotics: Science and Systems
(RSS).
[22] Mazuran, M., Sprunk, C., Burgard, W., Tipaldi, G.D., 2015.
Lextor: Lexicographic teach optimize and repeat based on user
preferences, in: Proc. of the IEEE Int. Conf. on Robotics &
Automation (ICRA), pp. 2780–2786.
[23] Meri¸cli, T., Veloso, M., Akın, H.L., 2015. A case-based approach
to mobile push-manipulation. Journal of Intelligent & Robotic
Systems 80, 189–203.
[24] Nardi, L., Stachniss, C., 2016. Experience-based path plan-
ning for mobile robots exploiting user preferences, in: Proc. of
the IEEE/RSJ Int. Conf. on Intelligent Robots and Systems
(IROS), pp. 1170–1176.
[25] Perea Str¨om, D., Bogoslavskyi, I., Stachniss, C., 2016. Ro-
bust exploration and homing for autonomous robots. Journ. of
Rob. & Aut. Systems .
[26] Pfeiffer, M., Schwesinger, U., Sommer, H., Galceran, E., Sieg-
wart, R., 2016. Predicting actions to act predictably: Cooper-
ative partial motion planning with maximum entropy models,
in: Proc. of the IEEE/RSJ Int. Conf. on Intelligent Robots and
Systems (IROS), pp. 2096–2101.
[27] Phillips, M., Cohen, B.J., Chitta, S., Likhachev, M., 2012. E-
graphs: Bootstrapping planning with experience graphs, in:
Proc. of Robotics: Science and Systems (RSS).
[28] Quinlan, S., Khatib, O., 1993. Elastic bands: Connecting path
planning and control, in: Proc. of the IEEE Int. Conf. on
Robotics & Automation (ICRA), pp. 802–807.
[29] Rasmussen, C.E., Williams, C.K.I., 2005. Gaussian Processes
for Machine Learning (Adaptive Computation and Machine
Learning). The MIT Press.
[30] Ros, R., De M`antaras, R.L., Arcos, J.L., Veloso, M., 2007. Team
playing behavior in robot soccer: A case-based reasoning ap-
proach, in: Proc. of the Int. Conf. on Case-Based Reasoning,
pp. 46–60.
[31] osmann, C., Feiten, W., W¨osch, T., Hoffmann, F., Bertram,
T., 2012. Trajectory modification considering dynamic con-
straints of autonomous robots, in: Proc. German Conf. on
Robotics (ROBOTIK 2012), pp. 1–6.
[32] Sprunk, C., Tipaldi, G.D., Cherubini, A., Burgard, W., 2013.
Lidar-based teach-and-repeat of mobile robot trajectories, in:
Proc. of the IEEE/RSJ Int. Conf. on Intelligent Robots and
Systems (IROS), pp. 3144–3149.
[33] S¸ ucan, I.A., Moll, M., Kavraki, L.E., 2012. The Open Motion
Planning Library. IEEE Robotics & Automation Magazine 19,
72–82. "http://ompl.kavrakilab.org".
[34] Trautman, P., Ma, J., Murray, R.M., Krause, A., 2013. Robot
navigation in dense human crowds: the case for cooperation,
in: Proc. of the IEEE Int. Conf. on Robotics & Automation
(ICRA), pp. 2153–2160.
[35] Ziebart, B.D., Ratliff, N., Gallagher, G., Mertz, C., Peter-
son, K., Bagnell, J.A., Hebert, M., Dey, A.K., Srinivasa, S.,
2009. Planning-based prediction for pedestrians, in: Proc. of
the IEEE/RSJ Int. Conf. on Intelligent Robots and Systems
(IROS), pp. 3931–3936.
[36] Zucker, M., Kuffner, J., Bagnell, J.A., 2008. Adaptive
workspace biasing for sampling-based planners, in: Proc. of the
IEEE Int. Conf. on Robotics & Automation (ICRA), pp. 3757–
3762.
15
... When unexpected and uncertain obstacles occur in the environment, researchers used various approaches to deal with, such as calculating the probability of the uncertain obstacles [7], [18], [25], [28], predicting the unexpected and uncertain behavior [13], and neural network [3], etc. However, probabilistic prediction methods require a large number of statistics data, and some of the data are difficult to obtain in the actual site, resulting in the accuracy of behavior prediction results is difficult to guarantee. ...
Preprint
Full-text available
In the environment cognition, except the xpected obstacles, the mobile robots might encounter unexpected obstacles. How to effectively deal with the unexpected obstacles is a challenge for the mobile robots. Unfortunately, researches about the unexpected obstacles in the environment cognition of the mobile robots are relatively scarce. Even though there are some works attempts to address it, these methods generally lack the ability of autonomous learning, which makes it difficult for the mobile robots to respond to the changed environments quickly and flexibly. Simulating the neuromodulatory mechanism of acetylcholine and norepinephrine in meditating the attention network, collision risk, which not only considers the influence of the obstacle distance on the movement of the mobile robots just like in the traditional methods, but also that of the obstacle speed, is designed to switch the attention network of the mobile robot between the dorsal attention network, which directs attention to anticipated stimuli (expected obstacles), and ventral attention network, which directs attention to novel stimuli (unexpected obstacles). Mimicking the neuromodulatory of acetylcholine and norepinephrine on dopamine and 5-HT, a novel learning rate is designed to accelerate the learning of the neurons in hidden layer of the motivatd developmental network, making the robots response to the changed environment quickly and flexibly. Moreover, weight updating rules are modified to improve the accuracy of the action selection of the mobile robots. Results in four simulation scenarios and the physical experiment demonstrate the feasibility of the proposed methodology in dealing with the unexpected obstacles in the environment cognition of the mobile robots.
... But this solution may be inflexible. Alternatively, Nardi and Stachniss (2017) presented a robot navigation system by learning previous experiences to meet user preferences. Kretzschmar et al. (2016) modeled the navigation preferences of users to allow the mobile robot to navigate in a socially compliant way. ...
Article
Human-aware robot navigation is the key to facilitate the deployment of mobile robots into human–robot shared environments. Although many efforts have been made in this regard, almost all of the existing works focus on the constraints such as user comfort and social rules, while ignoring to consider user preferences. And it is expected that the mobile robot is able to navigate in the user-desired areas according to user preferences, e.g., navigating at low speed in the bedroom. To this end, this paper studies the problem of allowing users to control the robot’s working areas to ensure robot aware navigation based on their preferences. To address the intrusiveness of additional facilities or devices to users, we propose a non-intrusive solution by deploying the virtual areas that are non-physical areas but are respected by the mobile robot while performing navigation tasks. Farther, an interactive interface is developed to support the common (non-expert) users to flexibly define the desired areas and specify the navigation behavior of the robot according to their preferences. The proposed solution is fully evaluated through extensive experiments. Experimental results are presented to demonstrate the validity of our approach, and show that the mobile robot can change its navigation behavior in user-defined areas according to user preferences. Supplementary video can be available at https://youtu.be/oXtbNavLmMk.
... Human-aware navigation has been applied successfully, by explicitly taking into account user preferences [11], planning a robot's motion according to a moving crowd [12], or navigating side-by-side with a user while not having knowledge of the end goal [13]. Some techniques focus on humanhuman interaction to synthesize their behaviour [14]. ...
... In (Malone et al., 2017), stochastic reachable sets are used to generate accurate artificial potential field for dynamic obstacles for online path planning. In (Nardi and Stachniss, 2017), a probabilistic approach is developed for modeling uncertain trajectories of the moving entities that share workspace with robot. ...
... The complexity of the object matching task rises for similar objects [7]. Previous experiences during the robot navigation are helpful to predict the path with obstacle avoidance of static and dynamic obstacles [8]. Tool based learning for robots is presented by [9]. ...
Article
Full-text available
p>Robot navigation in unknown and dynamic environments may result in aimless wandering, corner traps and repetitive path loops. To address these issues, this paper presents the solution by comparing the standard deviation of the distance ranges of the obstacles appeared in the robot navigation path. For the similar obstacles, The standard deviations of distance range vectors, obtained from the laser range finder sensor of the robot at similar pose, are very close to each other. Therefore, the measurements of odometer sensor are also combined with the standard deviation to recognize the location of the obstacles. A novel algorithm, with obstacle detection feature, is presented for robot navigation in unknown and dynamic environments. The algorithm checks the similarity of the distance range vectors of the obstacles in the path and uses this information in combination with the odometer measurements to identify the obstacles and their locations. The experimental work is carried out using Gazebo simulator.</p
... By means of the graph updating mechanism, next queries in the VM will return less costly visual paths, i.e. as straight as possible and without obstacles if possible. This simple strategy is a way to take into account previous experiences of the robotics system to exploit them in future navigation tasks [42]. ...
Article
Full-text available
We present a complete humanoid navigation scheme based on a topological map known as visual memory (VM), which is composed by a set of key images acquired offline by means of a supervised teaching phase (human-guided). Our autonomous navigation scheme integrates the humanoid localization in the VM, a visual path planner and a path follower with obstacle avoidance. We propose a pure vision-based localization algorithm that takes advantage of the topological structure of the VM to find the key image that best fits the current image in terms of common visual information. In addition, the visual path planner benefits obstacle-free paths. The VM is updated when a new obstacle is detected with an RGB-D camera mounted on the humanoid's head. The visual path following and obstacle avoidance problems are formulated in a unified sensor-based framework in which, a hierarchy of tasks is defined, and the transitions of consecutive and hierarchical tasks are performed smoothly to avoid instability of the humanoid. An extensive experimental evaluation using the NAO platform shows the good performance of the navigation scheme.
... Even though LbD is most popularly used for object manipulation tasks, the use of GPs for navigation has been explored by different researchers. Nardi and Stachniss [19] present a GP-based navigation system, designed to be robust against dynamic environments. They propose a planning approach to predict the trajectories of moving agents and adjust local plans to avoid collisions based on previous navigation experience. ...
Article
Full-text available
In this paper, we present a human-in-the-loop learning framework for mobile robots to generate effective local policies in order to recover from navigation failures in long-term autonomy. We present an analysis of failure and recovery cases derived from long-term autonomous operation of a mobile robot, and propose a two-layer learning framework that allows to detect and recover from such navigation failures. Employing a learning by demonstration (LbD) approach, our framework can incrementally learn to autonomously recover from situations it initially needs humans to help with. The learning framework allows for both real-time failure detection and regression using Gaussian processes (GPs). Our empirical results on two different failure scenarios indicate that given 40 failure state observations, the true positive rate of the failure detection model exceeds 90%, ending with successful recovery actions in more than 90% of all detected cases.
... Therefore, the robot may trap in a repetitive navigation path loop in some situations. Using preceding occurrences, user desired behaviour can be achieved during robot navigation [9]. Priority actions and early predictions based robot navigation algorithm is presented in [10]. ...
Article
Full-text available
In this paper, firstly, a model for robot navigation in unknown environment is presented as a Simulink model. This model is applicable for obstacles avoidance during the robot navigation. However, the first model is unable to recognize the re-occurrences of the obstacles during the navigation. Secondly, an advanced algorithm, based on the standard deviations of laser scan range vectors, is proposed and implemented for robot navigation. The standard deviations of the lasers scans, robot positions and the time of scans with similar standard deviations are used to obtain the obstacle recognition feature. In addition to the obstacle avoidance, the second algorithm recognizes the re-appearances of the obstacles in the navigation path. Further, the obstacle recognition feature is used to break the repetitive path loop in the robot navigation. The experimental work is carried out on the simulated Turtlebot robot model using the Gazebo simulator.
Article
During the movement of the mobile robot, except the common scenario, sometimes, it has to face the unexpected uncertainty in the environment. Traditional methods to address this problem generally use task-specific method from the engineering perspective, lacking flexibility in the changing environment and difficult to responde to the environmental challenges. Because of the function of the brain’s neuromodulatory system, human beings have ability to response to the ever-changing environment quickly. To simulate the working mechanism of the human brain in responding to the unexpected uncertainty in the environment, this work presents a motivated developmental network (MDN) to offer a control configuration for an artificial agent to face the unexpected uncertainty in the environment, through introducing the acetylcholine/norepinephrine (ACh/NE) systems to the MDN. Taking the regulation role of the ACh/NE to the serotonin and dopamine into account, a novel learning rate for the hidden layer neurons of the MDN is proposed. Moreover, a novel composite mechanism is presented to decide the moving direction of the agent. Under the modulation of dopamine, serotonin, acetylcholine and norepinephrine, the agent can perform specific functions effectively, e.g., to chase a target and elude the obstacle, especially the sudden obstacle. Goal-directed pursuing behavior in three simulation cases illustrate the effect of the presented neural modulatory systems, for instance, dealing with the the unexpected uncertainty, realizing the attentional effort, reinforcement learning, etc. To the best of our knowledge, this work is the first endeavor to address the unexpected uncertainty with the MDN.
Conference Paper
Full-text available
This paper reports on a data-driven motion planning approach for interaction-aware, socially-compliant robot navigation among human agents. Autonomous mobile robots navigating in workspaces shared with human agents require motion planning techniques providing seamless integration and smooth navigation in such. Smooth integration in mixed scenarios calls for two abilities of the robot: predicting actions of others and acting predictably for them. The former requirement requests trainable models of agent behaviors in order to accurately forecast their actions in the future, taking into account their reaction on the robot’s decisions. A human-like navigation style of the robot facilitates other agents—most likely not aware of the underlying planning technique applied—to predict the robot motion vice versa, resulting in smoother joint navigation. The approach presented in this paper is based on a feature-based maximum entropy model and is able to guide a robot in an unstructured, real-world environment. The model is trained to predict joint behavior of heterogeneous groups of agents from onboard data of a mobile platform. We evaluate the benefit of interaction-aware motion planning in a realistic public setting with a total distance traveled of over 4 km. Interestingly the motion models learned from human-human interaction did not hold for robot-human interaction, due to the high attention and interest of pedestrians in testing basic braking functionality of the robot.
Conference Paper
Full-text available
The demand for flexible industrial robotic solu- tions that are able to accomplish tasks at different locations in a factory is growing more and more. When deploying mobile robots in a factory environment, the predictability and reproducibility of their behaviors become important and are often requested. In this paper, we propose an easy-to-use motion planning scheme that can take into account user preferences for robot navigation. The preferences are extracted implicitly from the previous experiences or from demonstrations and are automatically considered in the subsequent planning steps. This leads to reproducible and thus better to predict navigation behaviors of the robot, without requiring experts to hard-coding control strategies or cost functions within a planner. Our system has been implemented and evaluated on a simulated KUKA mobile robot in different environments.
Article
Full-text available
The ability to explore an unknown environment is an important prerequisite for building truly autonomous robots. Two central capabilities for autonomous exploration are the selection of the next view point(s) for gathering new observations and robust navigation. In this paper, we propose a novel exploration strategy that exploits background knowledge by considering previously seen environments to make better exploration decisions. We furthermore combine this approach with robust homing so that the robot can navigate back to its starting location even if the mapping system fails and does not produce a consistent map. We implemented the proposed approach in ROS and thoroughly evaluated it. The experiments indicate that our method improves the ability of a robot to explore challenging environments as well as the quality of the resulting maps. Furthermore, the robot is able to navigate back home, even if it cannot rely on its map.
Article
Full-text available
Path deformation is a technique that was introduced to generate robot motion wherein a nominal path, that has been computed beforehand, is continuously deformed on-line in response to unforeseen obstacles. In an effort to improve path deformation, this paper presents a trajectory deformation scheme. The main idea is that by incorporating the time dimension and hence information on the obstacles' future behaviour, quite a number of situations where path deformation would fail can be handled. The trajectory represented as a discrete space-time curve is subject to deformation forces both external (to avoid collision with the obstacles) and internal (to maintain trajectory feasibility and connectivity). The trajectory deformation scheme has been tested successfully on a planar robot with double integrator dynamics moving in dynamic environments.
Article
Full-text available
We present an experienced-based planning framework called Thunder that learns to reduce computation time required to solve high-dimensional planning problems in varying environments. The approach is especially suited for large configuration spaces that include many invariant constraints, such as those found with whole body humanoid motion planning. Experiences are generated using probabilistic sampling and stored in a sparse roadmap spanner (SPARS), which provides asymptotically near-optimal coverage of the configuration space, making storing, retrieving, and repairing past experiences very efficient with respect to memory and time. The Thunder framework improves upon past experience-based planners by storing experiences in a graph rather than in individual paths, eliminating redundant information, providing more opportunities for path reuse, and providing a theoretical limit to the size of the experience graph. These properties also lead to improved handling of dynamically changing environments, reasoning about optimal paths, and reducing query resolution time. The approach is demonstrated on a 30 degrees of freedom humanoid robot and compared with the Lightning framework, an experience-based planner that uses individual paths to store past experiences. In environments with variable obstacles and stability constraints, experiments show that Thunder is on average an order of magnitude faster than Lightning and planning from scratch. Thunder also uses 98.8% less memory to store its experiences after 10,000 trials when compared to Lightning. Our framework is implemented and freely available in the Open Motion Planning Library.
Article
This paper is concerned with the planning of optimal trajectories for vehicle collision avoidance with a Timed Elastic Band (TEB) framework. The avoidance trajectory is represented by a TEB which is optimized with respect to multiple partially conflicting objectives. The resulting trajectory constitutes the optimal compromise between a mere braking and a lane change maneuver that avoids the collision with the smoothest feasible path. The approach is applicable to general critical traffic situations as the TEB considers the constraints imposed by the vehicle dynamics, road boundaries, static obstacles and moving vehicles.
Article
Mobile robots are increasingly populating our human environments. To interact with humans in a socially compliant way, these robots need to understand and comply with mutually accepted rules. In this paper, we present a novel approach to model the cooperative navigation behavior of humans. We model their behavior in terms of a mixture distribution that captures both the discrete navigation decisions, such as going left or going right, as well as the natural variance of human trajectories. Our approach learns the model parameters of this distribution that match, in expectation, the observed behavior in terms of user-defined features. To compute the feature expectations over the resulting high-dimensional continuous distributions, we use Hamiltonian Markov chain Monte Carlo sampling. Furthermore, we rely on a Voronoi graph of the environment to efficiently explore the space of trajectories from the robot’s current position to its target position. Using the proposed model, our method is able to imitate the behavior of pedestrians or, alternatively, to replicate a specific behavior that was taught by tele-operation in the target environment of the robot. We implemented our approach on a real mobile robot and demonstrated that it is able to successfully navigate in an office environment in the presence of humans. An extensive set of experiments suggests that our technique outperforms state-of-the-art methods to model the behavior of pedestrians, which also makes it applicable to fields such as behavioral science or computer graphics.
Article
In the last years, many researchers started to consider teach-and-repeat approaches for reliable autonomous navigation. The paradigm, in all its proposed forms, is deeply rooted in the idea that the robot should autonomously follow a route that has been demonstrated by a human during a teach phase. However, human demonstrations are often inefficient in terms of execution time or may cause premature wear of the robot components due to jittery behavior or strong accelerations. In this paper, we propose the concept of teach, optimize and repeat, which introduces a trajectory optimization step between the teach and repeat phases. To address this problem, we further propose LexTOR, a constrained trajectory optimization method for teach and repeat problems, where the constraints are defined according to user preferences. At its core, LexTOR optimizes both the execution time and the trajectory smoothness in a lexicographic sense. The experiments show that LexTOR is very effective, both qualitatively and quantitatively, in terms of execution time, smoothness, accuracy and bound satisfaction.
Article
The complexity of the potential physical interactions between the robot, each of the pushable objects, and the environment is vast in realistic mobile push-manipulation scenarios. This makes it non-trivial to write generic analytical motion and interaction models or tune the parameters of physics engines for every robot, object, and environment combination. We present a case-based approach to push-manipulation that allows the robot to acquire, through interaction and observation, a set of discrete, experimental, probabilistic motion models (i.e. probabilistic cases) for pushable passively-mobile real world objects. These probabilistic cases are then used as building blocks for constructing achievable push plans to navigate the object of interest to the desired goal pose as well as to potentially push the movable obstacles out of the way in cluttered task environments. Additionally, incremental acquisition and updating of the probabilistic cases allows the robot to adapt to the changes in the environment, such as increased mass due to loading of the object of interest for transportation purposes. The purely interaction and observation driven nature of our method makes it robot, object, and environment (real or simulated) independent, as we demonstrate through validation tests in a real world setup in addition to extensive experimentation in simulation.