Figure 2 - uploaded by Laurent Jeanpierre
Operator interface: map, video stream. Lines on the map represent operator advising: desired (green), undesired (orange), and forbidden (red) advising. The target shape is the goal of the robot. Numbers are navigation waypoints.


Source publication
Article
Full-text available
This paper introduces Advice-MDPs, an extension of Markov Decision Processes for generating policies that take into consideration advice on the desirability, undesirability, and prohibition of certain states and actions. Advice-MDPs enable the design of semi-autonomous systems (systems that require operator support for at least handling...
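The abstract describes advice as biasing or constraining a standard MDP. A minimal sketch of how this could work, assuming (this is an illustration, not the paper's actual formulation) that desired/undesired advice adjusts rewards while forbidden advice hard-prunes transitions before value iteration:

```python
# Illustrative sketch: value iteration on a tiny deterministic grid MDP where
# operator advice biases rewards (desired/undesired) and prunes forbidden
# moves. State names, reward magnitudes, and the pruning scheme are
# assumptions for illustration, not the Advice-MDP definition from the paper.

GAMMA = 0.9
states = [(r, c) for r in range(3) for c in range(3)]
goal = (2, 2)
advice = {"desired": {(1, 2)}, "undesired": {(1, 1)}, "forbidden": {(0, 2)}}

def reward(s):
    if s == goal:
        return 10.0
    if s in advice["desired"]:
        return 1.0      # nudge the policy toward advised areas
    if s in advice["undesired"]:
        return -1.0     # discourage, but do not prohibit
    return -0.1         # small step cost elsewhere

def successors(s):
    r, c = s
    moves = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    # Forbidden advice acts as a hard constraint: prune those transitions.
    return [m for m in moves if m in states and m not in advice["forbidden"]]

V = {s: 0.0 for s in states}
for _ in range(100):  # value iteration to (approximate) convergence
    V = {s: reward(s)
            + (GAMMA * max((V[m] for m in successors(s)), default=0.0)
               if s != goal else 0.0)
         for s in states}

# Greedy policy: forbidden cells can never be chosen, since they were pruned.
policy = {s: max(successors(s), key=lambda m: V[m])
          for s in states if s != goal and successors(s)}
```

The key distinction the abstract draws is visible here: undesirability only lowers value, so the planner may still cross such a state if necessary, whereas prohibition removes the option entirely.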

Contexts in source publication

Context 1
... operator availability is highly limited and unpredictable: while operators can generally watch and support the system for a few minutes after deployment, they have little time for actively training it and, suddenly, they may become overloaded due to arising threats. Illustrations of the implementation of the ADRS are presented in Figure 1 and Figure 2. ...
Context 2
... ADRS generates a map of the surroundings via Simultaneous Localization and Mapping (SLAM) using a laser scanner. This map is then displayed in a Graphical User Interface (GUI), presented in Figure 2, on a tablet that operators wear on their wrist. This GUI enables operators to advise on goals and on desirable, undesirable, and forbidden areas, simply by clicking an "advice-type" button and drawing on the map (points, lines, shapes). ...
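The excerpt says drawn shapes on the map become advice, but not how drawings map onto the underlying grid. A hypothetical helper for that step, where the cell size and the simple segment-sampling scheme are assumptions of this sketch:

```python
# Hypothetical helper: turn a polyline drawn on the map GUI into a set of
# advised grid cells. The cell size and the naive line-sampling scheme are
# illustrative assumptions; the paper does not specify this mapping.

def line_cells(p0, p1, cell=1.0, steps=20):
    """Sample a drawn segment and return the map cells it crosses."""
    (x0, y0), (x1, y1) = p0, p1
    cells = set()
    for i in range(steps + 1):
        t = i / steps
        x, y = x0 + t * (x1 - x0), y0 + t * (y1 - y0)
        cells.add((int(x // cell), int(y // cell)))
    return cells

def apply_advice(advice_map, polyline, advice_type):
    """Tag every cell under the drawn polyline with the chosen advice type."""
    for p0, p1 in zip(polyline, polyline[1:]):
        for c in line_cells(p0, p1):
            advice_map[c] = advice_type  # "desired" / "undesired" / "forbidden"

# Example: the operator draws a horizontal red (forbidden) line on the map.
advice_map = {}
apply_advice(advice_map, [(0.5, 0.5), (3.5, 0.5)], "forbidden")
```

Once the drawn shapes are rasterised into per-cell labels like this, they can feed directly into whatever reward shaping or action pruning the planner applies.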

Similar publications

Conference Paper
Full-text available
This paper describes our work to assure safe autonomy in soft fruit production. The first step was hazard analysis, where all the possible hazards in representative scenarios were identified. Following this analysis, a three-layer safety architecture was identified that will minimise the occurrence of the identified hazards. Most of the hazards are...

Citations

... Despite the rapid evolution of planning and robotics, almost all existing systems require at least occasional human support for achieving their goals and meeting ethical requirements [2], [9], [17], [20]. All approaches agree that restricting the level of autonomy of the system is a straightforward way to ensure that sufficient operator support is obtained before engaging in sensitive actions. ...
... including reinforcement learning [19] and operator advising [20]. Such solutions can be applied in SA-RAL-MDPs, e.g., for the system to learn or be advised about different risks of operator disconnection depending on the state (e.g., when reaching the border of a Wi-Fi network), or for avoiding states for context-specific reasons that are not part of the model (e.g. ...
... They proposed an advising framework where a teacher augments a student's knowledge by providing not only the advised action but also the expected long-term reward of following that action. Vanhee et al. [17] introduced Advice-MDPs which extend Markov decision processes for generating policies that take into account advising on the desirability, undesirability and prohibition of certain states and actions. ...
Preprint
Agent advising is one of the main approaches to improve agent learning performance by enabling agents to share advice. Existing advising methods have a common limitation that an adviser agent can offer advice to an advisee agent only if the advice is created in the same state as the advisee's concerned state. However, in complex environments, it is a very strong requirement that two states are the same, because a state may consist of multiple dimensions and two states being the same means that all these dimensions in the two states are correspondingly identical. Therefore, this requirement may limit the applicability of existing advising methods to complex environments. In this paper, inspired by the differential privacy scheme, we propose a differential advising method which relaxes this requirement by enabling agents to use advice in a state even if the advice is created in a slightly different state. Compared with existing methods, agents using the proposed method have more opportunity to take advice from others. This paper is the first to adopt the concept of differential privacy on advising to improve agent learning performance instead of addressing security issues. The experimental results demonstrate that the proposed method is more efficient in complex environments than existing methods.
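The core relaxation described in this abstract, reusing advice created in a slightly different state, can be sketched as a nearest-neighbour lookup with a distance threshold. The state encoding, L1 metric, and threshold below are illustrative assumptions, not the method's actual mechanism:

```python
# Illustrative sketch of the differential-advising idea: an advisee reuses
# advice recorded in a nearby state when none exists for its exact state.
# The state tuples, L1 distance, and threshold are assumptions of this sketch.

def l1_distance(s1, s2):
    return sum(abs(a - b) for a, b in zip(s1, s2))

advice_store = {  # adviser's recorded (state -> recommended action) pairs
    (4, 4, 0): "turn_left",
    (0, 1, 1): "go_straight",
}

def fetch_advice(state, max_distance=1):
    """Return advice for `state`, or the closest advice within the threshold."""
    if state in advice_store:
        return advice_store[state]          # exact match, as in prior methods
    nearest = min(advice_store, key=lambda s: l1_distance(s, state))
    if l1_distance(nearest, state) <= max_distance:
        return advice_store[nearest]        # advice from a slightly different state
    return None                             # fall back to the agent's own policy
```

Under an exact-match scheme, a query like `(4, 3, 0)` would return nothing; here it inherits the advice recorded at `(4, 4, 0)`, which is the extra reuse opportunity the abstract claims.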