Conference Paper

Detecting, Opening and Navigating through Doors: A Unified Framework for Human Service Robots

Authors:
... These methods typically assume perfectly known geometry, with objects pre-fixed to the robot. Recent works have demonstrated successful systems that combine learned vision methods with robotic planning to manipulate 3D articulated objects using the learned visual knowledge [29], [30], [31]. These works typically start with vision methods such as pose estimation, object detection, and part segmentation to acquire visual knowledge of the environment, and then compute a motion trajectory. ...
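A minimal sketch of this perception-then-planning pattern is given below: vision modules output object/part poses, and a planner turns them into a motion trajectory. All names are hypothetical placeholders, and the straight-line "planner" is purely illustrative, not the cited works' actual methods.

# Hedged sketch of the perception-then-planning pipeline described above.
# The linear-interpolation planner and all names are illustrative only.
from dataclasses import dataclass
import numpy as np

@dataclass
class PartDetection:
    label: str            # e.g. "handle", "door_panel"
    position: np.ndarray  # estimated 3D position in the robot frame

def plan_reach_trajectory(ee_position: np.ndarray,
                          target: PartDetection,
                          n_waypoints: int = 10) -> np.ndarray:
    """Straight-line Cartesian waypoints from the end-effector to the part."""
    alphas = np.linspace(0.0, 1.0, n_waypoints)[:, None]
    return ee_position + alphas * (target.position - ee_position)

# Example: perception output (hard-coded here) feeding the planner.
handle = PartDetection("handle", np.array([0.6, -0.1, 1.0]))
ee_pose = np.array([0.2, 0.0, 0.9])
waypoints = plan_reach_trajectory(ee_pose, handle)
print(waypoints.shape)  # (10, 3)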
Preprint
Generalizable object manipulation skills are critical for intelligent and multi-functional robots to work in real-world complex scenes. Despite the recent progress in reinforcement learning, it is still very challenging to learn a generalizable manipulation policy that can handle a category of geometrically diverse articulated objects. In this work, we tackle this category-level object manipulation policy learning problem via imitation learning in a task-agnostic manner, where we assume no handcrafted dense rewards but only a terminal reward. Given this novel and challenging generalizable policy learning problem, we identify several key issues that can fail the previous imitation learning algorithms and hinder generalization to unseen instances. We then propose several general but critical techniques, including generative adversarial self-imitation learning from demonstrations, progressive growing of the discriminator, and instance-balancing for the expert buffer, that accurately pinpoint and tackle these issues and can benefit category-level manipulation policy learning regardless of the task. Our experiments on the ManiSkill benchmarks demonstrate a remarkable improvement on all tasks, and our ablation studies further validate the contribution of each proposed technique.
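One of the named techniques, instance-balancing for the expert buffer, can be pictured as sampling first a uniformly random object instance and then a demonstration transition from that instance, so instances with many recorded transitions do not dominate training. The sketch below is an assumption about how such a buffer could look; the preprint's actual implementation may differ.

# Hedged sketch of an instance-balanced expert buffer: sampling is uniform
# over object instances first, then over that instance's transitions.
# Illustrative only; not the cited preprint's actual code.
import random
from collections import defaultdict

class InstanceBalancedBuffer:
    def __init__(self):
        self.by_instance = defaultdict(list)

    def add(self, instance_id, transition):
        self.by_instance[instance_id].append(transition)

    def sample(self, batch_size):
        instances = list(self.by_instance)
        batch = []
        for _ in range(batch_size):
            inst = random.choice(instances)  # uniform over instances
            batch.append(random.choice(self.by_instance[inst]))
        return batch

buf = InstanceBalancedBuffer()
buf.add("door_01", ("obs_a", "act_a"))
buf.add("door_01", ("obs_b", "act_b"))
buf.add("door_02", ("obs_c", "act_c"))
print(buf.sample(4))  # door_02 transitions are not crowded out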
... In case (d), the end-effector is occluded after interactions. While a human is able to change the viewpoint for better observation, our agent uses a fixed camera position and is therefore not robust to occlusion. Both cases (c) and (d) could be addressed by better modeling the agent's hardware embodiment, including end-effector shape and camera placement. ...
Preprint
We introduce the Universal Manipulation Policy Network (UMPNet) -- a single image-based policy network that infers closed-loop action sequences for manipulating arbitrary articulated objects. To infer a wide range of action trajectories, the policy supports 6DoF action representation and varying trajectory length. To handle a diverse set of objects, the policy learns from objects with different articulation structures and generalizes to unseen objects or categories. The policy is trained with self-guided exploration without any human demonstrations, scripted policy, or pre-defined goal conditions. To support effective multi-step interaction, we introduce a novel Arrow-of-Time action attribute that indicates whether an action will change the object state back to the past or forward into the future. With the Arrow-of-Time inference at each interaction step, the learned policy is able to select actions that consistently lead towards or away from a given state, thereby enabling both effective state exploration and goal-conditioned manipulation. Video is available at https://youtu.be/KqlvcL9RqKM
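The Arrow-of-Time idea lends itself to a simple selection rule: score each candidate action by whether it moves the object state forward or back, then take the extremum depending on whether the agent is exploring or returning to a state. The scorer below is a stand-in (UMPNet learns it from interaction), and the selection logic is a sketch of the abstract's description, not the paper's code.

# Hedged sketch of Arrow-of-Time (AoT) action selection: positive score
# means "forward in time" for the object state, negative means "back
# toward a past state". The dot-product scorer is a dummy placeholder.
import numpy as np

def aot_score(state: np.ndarray, action: np.ndarray) -> float:
    """Placeholder AoT scorer; the real one is a learned network."""
    return float(np.dot(action, state))

def select_action(state, candidate_actions, explore=True):
    scores = [aot_score(state, a) for a in candidate_actions]
    # Exploration: push the state forward (max AoT); returning toward a
    # past/goal state would instead take the minimum.
    idx = int(np.argmax(scores)) if explore else int(np.argmin(scores))
    return candidate_actions[idx]

state = np.array([1.0, -0.5])
candidates = [np.array([0.1, 0.0]), np.array([-0.2, 0.3]), np.array([0.0, -0.4])]
print(select_action(state, candidates))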
Chapter
For an autonomous robotic system, detecting, opening, and navigating through doors remains a very challenging problem. It involves several hard-to-solve sub-tasks, such as recognizing the door frame and the handle, discriminating between different types of doors and their status, and opening and moving through the doorway. Previous works often tackle single individual sub-problems, assuming that the robot is moving in a well-known static environment or is already facing the door handle. However, ignoring navigation issues, using specialized robots, or restricting the analysis to specific types of doors or handles reduces the applicability of the proposed approach. In this paper, we present a unified framework for the door opening problem, taking a navigation scenario as a reference. We implement specific algorithms to solve each sub-task, and we describe the hierarchical automaton which integrates the control of the robot during the entire process. We build a publicly available data-set consisting of 780 images of doors and handles crawled from Google Images. Using this data-set, we train a deep neural network, based on the Single Shot MultiBox Detector, to recognize doors and handles. We implement error-recovery mechanisms to add robustness and reliability to our robot and to guarantee a high success rate in every task. We carry out experiments in a realistic scenario, the “Help Me Carry” task of RoboCup 2018, using a standard service robot, the Toyota Human Support Robot. Our experiments demonstrate that our framework can reliably detect, open, and navigate through doors, with low error rates, and without adapting the environment to the robot.
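The integration idea, staged sub-tasks under an automaton with error recovery, can be sketched as a flat state machine in which each stage retries on failure instead of aborting the whole task. The states, retry policy, and stub executor below are assumptions for illustration; the paper's hierarchical automata and recovery mechanisms are richer.

# Hedged sketch of staged door-opening control with per-stage error
# recovery. Illustrative only; not the paper's actual automaton.
from enum import Enum, auto

class Stage(Enum):
    DETECT_DOOR = auto()
    APPROACH_HANDLE = auto()
    OPEN_DOOR = auto()
    TRAVERSE = auto()

def run_door_pipeline(try_stage, max_retries=3):
    """try_stage(stage) -> bool, True on success; retries each stage."""
    for stage in Stage:
        for attempt in range(max_retries):
            if try_stage(stage):
                break  # stage succeeded, advance to the next one
            print(f"{stage.name}: attempt {attempt + 1} failed, recovering")
        else:
            return False  # stage exhausted its retries, abort
    return True

# Example with a stub executor that always succeeds.
print(run_door_pipeline(lambda stage: True))  # True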