Fig 1 - uploaded by Joseph DelPreto
A person supervises a robot while it autonomously learns a grasping task, and remotely controls it via virtual reality when the robot requests help. This master-apprentice model allows the robot to quickly learn the task while efficiently using the human's time and expertise.

Source publication
Conference Paper
Full-text available
As artificial intelligence becomes an increasingly prevalent method of enhancing robotic capabilities, it is important to consider effective ways to train these learning pipelines and to leverage human expertise. Working towards these goals, a master-apprentice model is presented and evaluated during a grasping task for effectiveness and human p...

Contexts in source publication

Context 1
... aspects are explored via experiments in which subjects use VR to supervise and control an autonomous robot learning a grasping task online, as seen in Figure 1. The robot uses self-supervised learning to generate positive and negative grasping examples, and requests human intervention when it cannot find a solution; a minimal sketch of this loop appears after the contexts below. ...
Context 2
... each trial, a moisturizer bottle was randomly placed on a table in front of the robot as seen in Figure 1. A successful grasp required the robot to lift the object and hold it for a fixed duration. ...
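As a rough illustration of this master-apprentice loop (the citing works below add that the policy outputs confidence scores for four gripper orientations and queries the teacher after too many consecutive failures or low confidence), the following Python sketch uses toy stand-ins; every name, threshold, and the success model are assumptions for illustration, not the paper's implementation.

```python
import random

# All names and numeric values below are illustrative assumptions,
# not the paper's implementation.
CONF_THRESHOLD = 0.5        # assumed confidence threshold for requesting help
MAX_FAILURES_IN_A_ROW = 3   # assumed failure limit before requesting help
N_ORIENTATIONS = 4          # the policy scores four gripper orientations

class ToyGraspPolicy:
    """Stand-in for the learned grasping policy."""
    def __init__(self):
        self.examples = []

    def predict(self, obs):
        # Toy behavior: confidence grows as more examples are collected.
        base = min(0.9, 0.1 + 0.02 * len(self.examples))
        return [base * random.random() for _ in range(N_ORIENTATIONS)]

    def update(self, example):
        self.examples.append(example)

def run_trials(n_trials=20):
    policy = ToyGraspPolicy()
    failures_in_a_row = 0
    for trial in range(n_trials):
        obs = random.random()  # stand-in for the camera observation
        conf = policy.predict(obs)
        if max(conf) < CONF_THRESHOLD or failures_in_a_row >= MAX_FAILURES_IN_A_ROW:
            # The robot requests help: the human grasps via VR teleoperation,
            # which is assumed here to yield a positive example.
            action, success = random.randrange(N_ORIENTATIONS), True
            failures_in_a_row = 0
        else:
            action = conf.index(max(conf))         # most confident orientation
            success = random.random() < max(conf)  # toy success model
            failures_in_a_row = 0 if success else failures_in_a_row + 1
        # Self-supervised labeling: success means lifting and holding the object.
        policy.update((obs, action, success))

run_trials()
```

The key structure is that every attempt, autonomous or human-provided, becomes a labeled example, so human time is spent only when the learner is stuck.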

Similar publications

Preprint
Full-text available
Autonomous driving vehicles (ADVs) are on the road at large scale. For safe and efficient operation, ADVs must be able to predict future states of and interact with road entities in complex, real-world driving scenarios. How to migrate a well-trained prediction model from one geo-fenced area to another is essential in scaling ADV operations and...

Citations

... However, few articles have investigated the opposite: people helping robots. For example, DelPreto et al. (2020) focused on improving a robot arm's grasping behavior by human VR intervention. Vanzo and colleagues (2020) focused on measuring collaboration attitude as the number of times a participant agrees to hypothetically help a robot. ...
Preprint
Full-text available
Robots are increasingly present in our society. Their successful integration depends, however, on understanding and fostering pro-social behavior towards robots, in this case, helping. To better understand people's reported willingness to help robots across different contexts (delivery, medical, service, and security), we conducted two preregistered studies on a German-speaking population (N = 415 and N = 542, representative of age and gender). We assessed attitudes, knowledge about robots, and anthropomorphism, and investigated their effect on reported willingness to help. Results show that positive attitudes significantly predicted higher reported willingness to help. Contrary to our hypothesis, having more knowledge about robots increased reported willingness to help. Additionally, we found no effect of anthropomorphism, neither in the form of robot appearance nor in participants' own views about robots, on reported willingness to help. Furthermore, results point to a context dependency in willingness to help, with participants preferring to help robots in a medical context compared to a security one, for example. Our findings thus highlight the relevance of knowledge and attitudes in understanding helping behavior towards robots. Additionally, our results raise questions about the relevance of anthropomorphism in pro-sociality towards robots.
... HTC Vive controllers estimate their positions from infrared signals emitted by so-called base stations, which have to be carefully placed and calibrated. In a similar vein, the work in [19] uses an Oculus Quest device for upper-body tracking. However, Oculus controllers are tracked via cameras within the VR headset [20]. ...
Conference Paper
Full-text available
This work devises an optimized machine learning approach for human arm pose estimation from a single smartwatch. Our approach results in a distribution of possible wrist and elbow positions, which allows for a measure of uncertainty and the detection of multiple possible arm posture solutions, i.e., multimodal pose distributions. Combining estimated arm postures with speech recognition, we turn the smartwatch into a ubiquitous, low-cost and versatile robot control interface. We demonstrate in two use-cases that this intuitive control interface enables users to swiftly intervene in robot behavior, to temporarily adjust their goal, or to train completely new control policies by imitation. Extensive experiments show that the approach results in a 40% reduction in prediction error over the current state-of-the-art and achieves a mean error of 2.56 cm for wrist and elbow positions.
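The cited work's model is not reproduced here; as a minimal sketch of how a multimodal pose distribution can be detected and summarized, one can fit a Gaussian mixture to samples drawn from the predictive distribution (the synthetic samples and the fixed two-component choice below are assumptions for illustration):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical example: 200 sampled wrist positions (in meters) from a
# model's predictive distribution, forming two plausible arm postures.
rng = np.random.default_rng(0)
samples = np.vstack([
    rng.normal([0.30, 0.10, 0.20], 0.02, size=(100, 3)),   # posture hypothesis A
    rng.normal([0.25, -0.15, 0.35], 0.02, size=(100, 3)),  # posture hypothesis B
])

# Fit a two-component Gaussian mixture to detect multimodality and to
# quantify per-mode uncertainty.
gmm = GaussianMixture(n_components=2, random_state=0).fit(samples)
for w, mu, cov in zip(gmm.weights_, gmm.means_, gmm.covariances_):
    std = np.sqrt(np.trace(cov) / 3.0)   # scalar uncertainty summary per mode
    print(f"mode weight={w:.2f}, wrist at {np.round(mu, 2)}, ~{std*100:.1f} cm std")
```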
... Through carefully designed teleoperation systems, humans can control robots to perform human-like motions and complete tasks otherwise not possible without human input [16]. Moreover, in recent years, teleoperation systems have been widely applied to learning-based manipulation, in which robots learn human manipulation skills [17], [18]. ...
Preprint
Full-text available
In-hand pivoting is an important manipulation skill that leverages a robot gripper's extrinsic dexterity to perform repositioning tasks, compensating for environmental uncertainties and imprecise motion execution. Although many researchers have tried to solve pivoting problems using mathematical modeling or learning-based approaches, the problem remains an open challenge. On the other hand, humans perform in-hand manipulation with remarkable precision and speed. Hence, a solution could be provided by making full use of this intrinsic human skill through dexterous teleoperation. For dexterous teleoperation to be successful, interfaces that enhance and complement haptic feedback are a great necessity. In this paper, we propose a cutaneous feedback interface that complements the somatosensory information humans rely on when performing dexterous skills. The interface is designed around five-bar link mechanisms and provides two contact points, on the index finger and thumb, for cutaneous feedback. By integrating the interface with a commercially available haptic device, the system can display information such as grasping force, shear force, friction, and the grasped object's pose. Passive pivoting tasks are conducted in the numerical simulator Isaac Sim to evaluate the effect of the proposed cutaneous feedback interface.
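Since the interface is designed around five-bar link mechanisms, a short worked example of planar five-bar forward kinematics may clarify how a contact point's position follows from the two actuated joint angles; the link lengths below are illustrative assumptions, not the paper's dimensions.

```python
import math

def five_bar_fk(theta1, theta2, d=0.06, l1=0.04, l2=0.06, l3=0.06, l4=0.04):
    """Forward kinematics of a planar five-bar linkage (illustrative link
    lengths in meters). The two base joints sit at (0, 0) and (d, 0);
    theta1 and theta2 are the actuated joint angles. Returns the point
    where the distal links meet, i.e., a candidate contact point."""
    ax, ay = l1 * math.cos(theta1), l1 * math.sin(theta1)        # elbow A
    bx, by = d + l4 * math.cos(theta2), l4 * math.sin(theta2)    # elbow B
    r = math.hypot(bx - ax, by - ay)                             # distance A-B
    if r > l2 + l3 or r < abs(l2 - l3):
        raise ValueError("no valid configuration for these angles")
    # Intersect circle(A, l2) with circle(B, l3); take the upper solution.
    a = (l2**2 - l3**2 + r**2) / (2 * r)
    h = math.sqrt(max(l2**2 - a**2, 0.0))
    mx, my = ax + a * (bx - ax) / r, ay + a * (by - ay) / r
    return mx - h * (by - ay) / r, my + h * (bx - ax) / r

print(five_bar_fk(math.radians(100), math.radians(80)))
```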
... One example of this approach is presented in DelPreto et al. (2020), where the policy outputs a discrete vector of confidence scores for four different gripper orientations, and the one with the highest confidence is picked. An apprenticeship model is developed, which queries the teacher intervention in case of too many failures in a row or if the output confidence is lower than a certain threshold. ...
... These studies measure the learning efficiency of an IIL method and its relation to human performance. Jauhri et al. (2021), DelPreto et al. (2020), Biyik et al. (2020), Cui et al. (2019), Palan et al. (2019), Chisari et al. (2022), He et al. (2020), and Hoque et al. (2021) employ robot learning metrics (task accuracy, success rate, training time, reward maximization) as well as human performance metrics (workload assessment and model perception) to compare different learning methods. ...
... Therefore, a proper selection of study participants should be taken into consideration. Most of the user studies in IIL report either very low ML expertise or none at all (Biyik et al., 2020; Palan et al., 2019; Chisari et al., 2022; He et al., 2020; Franzese et al., 2021b; Mészáros et al., 2022; Pérez-Dattari et al., 2019; Jauhri et al., 2021; DelPreto et al., 2020; Bajcsy et al., 2018; Bajcsy et al., 2017). On the other hand, Jain et al. (2015), Bajcsy et al. (2017), and Palan et al. (2019) require that participants have medium task-domain expertise in robot manipulation tasks. ...
Preprint
Full-text available
Interactive Imitation Learning (IIL) is a branch of Imitation Learning (IL) where human feedback is provided intermittently during robot execution, allowing online improvement of the robot's behavior. In recent years, IIL has increasingly started to carve out its own space as a promising data-driven alternative for solving complex robotic tasks. The advantages of IIL are its data efficiency, as human feedback guides the robot directly towards improved behavior, and its robustness, as the distribution mismatch between teacher and learner trajectories is minimized by providing feedback directly over the learner's trajectories. Nevertheless, despite the opportunities IIL presents, its terminology, structure, and applicability are neither clear nor unified in the literature, slowing down its development and, therefore, research into innovative formulations and discoveries. In this article, we attempt to facilitate research in IIL and lower entry barriers for new practitioners by providing a survey that unifies and structures the field. In addition, we aim to raise awareness of its potential, what has been accomplished, and which open research questions remain. We organize the most relevant works in IIL in terms of human-robot interaction (i.e., types of feedback), interfaces (i.e., means of providing feedback), learning (i.e., models learned from feedback and function approximators), user experience (i.e., human perception of the learning process), applications, and benchmarks. Furthermore, we analyze similarities and differences between IIL and RL, providing a discussion on how the concepts of offline, online, off-policy, and on-policy learning should be transferred to IIL from the RL literature. We particularly focus on robotic applications in the real world and discuss their implications, limitations, and promising future areas of research.
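As a minimal, generic sketch of the intervention-style feedback loop that the survey covers (not any single surveyed method; the toy environment, teacher rule, and nearest-neighbor learner are assumptions for illustration):

```python
import random

# Minimal intervention-style IIL loop: the human watches the learner act
# and intermittently overrides; only overridden steps enter the dataset.

def human_feedback(state, learner_action):
    """Stand-in for the teacher: intervenes when the learner's action is
    poor (here: wrong sign) and returns a corrective action."""
    good_action = 1.0 if state > 0 else -1.0
    return good_action if learner_action * good_action < 0 else None

def learner_policy(state, dataset):
    # Nearest-neighbor imitation of the corrections collected so far.
    if not dataset:
        return random.choice([-1.0, 1.0])
    _, action = min(dataset, key=lambda sa: abs(sa[0] - state))
    return action

dataset = []
for episode in range(50):
    state = random.uniform(-1, 1)
    action = learner_policy(state, dataset)
    correction = human_feedback(state, action)   # None = no intervention
    if correction is not None:
        dataset.append((state, correction))      # learn online from feedback
print(f"collected {len(dataset)} corrective labels")
```

Because corrections are collected on the learner's own trajectories, the distribution mismatch between teacher and learner data stays small, which is exactly the robustness argument made above.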
... training process in case of sparse rewards [6]-[8], and while human demonstrations specifically are commonly used in the literature, providing these demonstrations can be quite costly. In addition, providing human demonstrations typically requires hardware equipment or virtual reality setups [6], [9]-[11]. In this work, we therefore propose to use MPC as an experience source for RL in the case of sparse rewards. ...
Preprint
Full-text available
Reinforcement learning (RL) has recently seen great success in various domains. Yet, the design of the reward function requires detailed domain expertise and tedious fine-tuning to ensure that agents are able to learn the desired behaviour. Using a sparse reward conveniently mitigates these challenges. However, the sparse reward represents a challenge on its own, often resulting in unsuccessful training of the agent. In this paper, we therefore address the sparse reward problem in RL. Our goal is to find an effective alternative to reward shaping, without using costly human demonstrations, that would also be applicable to a wide range of domains. Hence, we propose to use model predictive control (MPC) as an experience source for training RL agents in sparse reward environments. Without the need for reward shaping, we successfully apply our approach in the field of mobile robot navigation, both in simulation and in real-world experiments with a Kobuki TurtleBot 2. We furthermore demonstrate great improvement over pure RL algorithms in terms of success rate as well as the number of collisions and timeouts. Our experiments show that MPC as an experience source improves the agent's learning process for a given task in the case of sparse rewards.
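A minimal sketch of the idea, assuming a toy 1-D navigation task with a known model and random-shooting MPC (none of which reflect the paper's actual setup), shows how MPC rollouts can fill a replay buffer with rewarded transitions that an off-policy RL agent could then learn from:

```python
import random

# Toy illustration: a 1-D point must reach a goal; the reward is sparse
# (1 only near the goal). Random-shooting MPC with a known model generates
# successful transitions to replay instead of costly human demonstrations.
GOAL, HORIZON, N_CANDIDATES = 5.0, 8, 64

def step(x, u):
    x_next = x + u                                       # trivial known dynamics
    reward = 1.0 if abs(x_next - GOAL) < 0.5 else 0.0    # sparse reward
    return x_next, reward

def mpc_action(x):
    """Random-shooting MPC: sample action sequences, return the first
    action of the best-scoring rollout under the model."""
    best_score, best_u0 = -1.0, 0.0
    for _ in range(N_CANDIDATES):
        us = [random.uniform(-1, 1) for _ in range(HORIZON)]
        xi, score = x, 0.0
        for u in us:
            xi, r = step(xi, u)
            score += r
        if score > best_score:
            best_score, best_u0 = score, us[0]
    return best_u0

replay_buffer = []        # experience source for an off-policy RL agent
x = 0.0
for t in range(30):
    u = mpc_action(x)
    x_next, r = step(x, u)
    replay_buffer.append((x, u, r, x_next))
    x = x_next
print(f"{sum(r for _, _, r, _ in replay_buffer):.0f} rewarded transitions collected")
```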
... [2] demonstrated that commercially available VR systems could be used to provide demonstrations of complex manipulation tasks to support effective imitation learning with deep neural networks. [3] used an apprenticeship model to efficiently use a human's time when teleoperating in VR to teach a grasping task. ...
Preprint
Full-text available
Mixed Reality (MR) has recently shown great success as an intuitive interface for enabling end-users to teach robots. Related works have used MR interfaces to communicate robot intents and beliefs to a co-located human, as well as developed algorithms for taking multi-modal human input and learning complex motor behaviors. Even with these successes, enabling end-users to teach robots complex motor tasks still poses a challenge because end-user communication is highly task dependent and world knowledge is highly varied. We propose a learning framework where end-users teach robots a) motion demonstrations, b) task constraints, c) planning representations, and d) object information, all of which are integrated into a single motor skill learning framework based on Dynamic Movement Primitives (DMPs). We hypothesize that conveying this world knowledge will be intuitive with an MR interface, and that a sample-efficient motor skill learning framework which incorporates varied modalities of world knowledge will enable robots to effectively solve complex tasks.
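Since the proposed framework builds on Dynamic Movement Primitives, a minimal one-dimensional DMP rollout in the standard Ijspeert-style formulation may help; the gains, basis functions, and zero weights below are illustrative, and in practice the weights are fit to a demonstration:

```python
import numpy as np

# Minimal 1-D discrete DMP rollout (standard formulation; parameters are
# illustrative, not the paper's).
alpha_z, beta_z, alpha_x, tau = 25.0, 25.0 / 4.0, 3.0, 1.0
centers = np.exp(-alpha_x * np.linspace(0, 1, 10))       # basis centers in x
widths = 1.0 / np.diff(centers, append=centers[-1] / 2) ** 2
weights = np.zeros(10)        # learned from a demonstration in practice

def rollout(y0=0.0, g=1.0, dt=0.001, T=1.5):
    y, z, x = y0, 0.0, 1.0
    traj = []
    for _ in range(int(T / dt)):
        psi = np.exp(-widths * (x - centers) ** 2)
        f = (psi @ weights) / (psi.sum() + 1e-10) * x * (g - y0)  # forcing term
        z += dt / tau * (alpha_z * (beta_z * (g - y) - z) + f)
        y += dt / tau * z
        x += dt / tau * (-alpha_x * x)    # canonical system drives f to zero
        traj.append(y)
    return traj

traj = rollout()
print(f"final position: {traj[-1]:.3f}  (goal = 1.0)")   # converges to the goal
```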
... This group of metrics has a strong connection to the level of autonomy that the swarm component possesses. In a shared control situation where there is a possibility for negotiation between human and automation, it is essential to identify extra measures such as the percentage of requests for assistance created by controlling agents (DelPreto et al., 2020;Kerzel et al., 2020), the percentage of requests for assistance created by the human operator, and the number of insignificant interventions by the human operator (Steinfeld et al., 2006). ...
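An illustrative computation of the shared-control measures named in this context, over a hypothetical interaction log (the event names are assumptions):

```python
# Hypothetical interaction log; event names are illustrative assumptions.
log = [
    "agent_help_request", "human_intervention_significant",
    "agent_help_request", "human_help_request",
    "human_intervention_insignificant", "agent_action", "agent_action",
]

requests = [e for e in log if e.endswith("help_request")]
agent_pct = 100 * sum(e.startswith("agent") for e in requests) / len(requests)
human_pct = 100 * sum(e.startswith("human") for e in requests) / len(requests)
insignificant = sum(e == "human_intervention_insignificant" for e in log)

print(f"assistance requests by agents: {agent_pct:.0f}%")
print(f"assistance requests by operator: {human_pct:.0f}%")
print(f"insignificant operator interventions: {insignificant}")
```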
Article
Full-text available
Swarm systems consist of large numbers of agents that collaborate autonomously. With an appropriate level of human control, swarm systems could be applied in a variety of contexts, ranging from urban search-and-rescue situations to cyber defence. However, the successful deployment of the swarm in such applications is contingent on effective coupling between human and swarm. While adaptive autonomy promises to provide enhanced performance in human-machine interaction, distinct factors must be considered for its implementation within human-swarm interaction. This paper reviews the multidisciplinary literature on different aspects contributing to the facilitation of adaptive autonomy in human-swarm interaction. Specifically, five aspects that are necessary for an adaptive agent to operate properly are considered and discussed: mission objectives, interaction, mission complexity, automation levels, and human states. We distill the corresponding indicators for each of the five aspects, and propose a framework, named MICAH (i.e., Mission-Interaction-Complexity-Automation-Human), which maps the primitive state indicators needed for adaptive human-swarm teaming.
Book
Full-text available
Existing robotics technology is still mostly limited to use by expert programmers who can adapt systems to new required conditions; it is not flexible and adaptable by non-expert workers or end-users. Imitation Learning (IL) has obtained considerable attention as a potential direction for enabling all kinds of users to easily program the behavior of robots or virtual agents. Interactive Imitation Learning (IIL) is a branch of IL where human feedback is provided intermittently during robot execution, allowing online improvement of the robot's behavior. This monograph presents research in IIL and lowers entry barriers for new practitioners by providing a survey that unifies and structures the field. In addition, it raises awareness of IIL's potential, covering what has been accomplished and which open research questions remain. It highlights the most relevant works in IIL in terms of human-robot interaction (i.e., types of feedback), interfaces (i.e., means of providing feedback), learning (i.e., models learned from feedback and function approximators), user experience (i.e., human perception of the learning process), applications, and benchmarks. Furthermore, similarities and differences between IIL and Reinforcement Learning (RL) are analyzed, with a discussion on how the concepts of offline, online, off-policy, and on-policy learning should be transferred to IIL from the RL literature. Particular focus is given to robotic applications in the real world; their implications are discussed, and limitations and promising future areas of research are identified.
... Teleoperation for MM is not easy to implement, as it requires enabling the user to control both navigation and manipulation, possibly simultaneously. Previous work on this problem has explored online click-through interfaces [18,19,20], muscle signals [21,22], joysticks [23,24,25], tablets [26], and virtual reality interfaces [27,28,29,30,31,32,33,34]. These approaches are either scalable or easy to use, but not both: web-based tools are widely available but are not well suited to demonstrating dexterous continuous control, while VR and other interfaces are intuitive to use but are not widely available. ...
Preprint
In mobile manipulation (MM), robots can both navigate within and interact with their environment and are thus able to complete many more tasks than robots only capable of navigation or manipulation. In this work, we explore how to apply imitation learning (IL) to learn continuous visuo-motor policies for MM tasks. Much prior work has shown that IL can train visuo-motor policies for either manipulation or navigation domains, but few works have applied IL to the MM domain. Doing this is challenging for two reasons: on the data side, current interfaces make collecting high-quality human demonstrations difficult, and on the learning side, policies trained on limited data can suffer from covariate shift when deployed. To address these problems, we first propose Mobile Manipulation RoboTurk (MoMaRT), a novel teleoperation framework allowing simultaneous navigation and manipulation of mobile manipulators, and collect a first-of-its-kind large-scale dataset in a realistic simulated kitchen setting. We then propose a learned error detection system to address the covariate shift by detecting when an agent is in a potential failure state. We train performant IL policies and error detectors from this data, and achieve over 45% task success rate and 85% error detection success rate across multiple multi-stage tasks when trained on expert data. Codebase, datasets, visualization, and more available at https://sites.google.com/view/il-for-mm/home.
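The paper's detector architecture is not reproduced here; as a minimal sketch of a learned error detector, a binary classifier over state features can flag potential failure states during rollout and trigger a request for help (the synthetic features and threshold are assumptions for illustration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Illustrative sketch of a learned error detector: a binary classifier
# over state features flags potential failure states during rollout.
rng = np.random.default_rng(0)
nominal = rng.normal(0.0, 1.0, size=(500, 8))    # features from successful runs
failing = rng.normal(2.5, 1.0, size=(100, 8))    # features near failure states
X = np.vstack([nominal, failing])
y = np.hstack([np.zeros(500), np.ones(100)])
detector = LogisticRegression(max_iter=1000).fit(X, y)

def safe_step(policy_action, state_features, threshold=0.8):
    """Veto or flag the agent's action when failure probability is high."""
    p_fail = detector.predict_proba(state_features.reshape(1, -1))[0, 1]
    return ("request_help", p_fail) if p_fail > threshold else (policy_action, p_fail)

print(safe_step("move_arm", rng.normal(2.5, 1.0, size=8)))
```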