Nick Hawes's research while affiliated with University of Oxford and other places

Publications (155)

Article
The metareasoning framework aims to enable autonomous agents to factor in planning costs when making decisions. In this work, we develop the first non-myopic metareasoning algorithm for planning with Markov decision processes. Our method learns the behaviour of anytime probabilistic planning algorithms from performance data. Specifically, we propos...
Article
Full-text available
Grasslands, which encompass 40% of terrestrial ecosystems, hold global significance for food production, carbon storage and other ecosystem services. However, grasslands across the biosphere are becoming increasingly exposed to both wet and dry precipitation extremes resulting from climate change. Therefore, understanding how grasslands will respon...
Article
For many multi-robot problems, tasks are announced during execution, where task announcement times and locations are uncertain. To synthesise multi-robot behaviour that is robust to early announcements and unexpected delays, multi-robot task allocation methods must explicitly model the stochastic processes that govern task announcement. In this pap...
Chapter
Full-text available
We consider the challenging scenario of contextual bandits with continuous actions and large context spaces. This is an increasingly important application area in personalised healthcare where an agent is requested to make dosing decisions based on a patient’s single image scan. In this paper, we first adapt a reinforcement learning (RL) algorithm...
Preprint
Full-text available
Grasslands comprise 40% of terrestrial ecosystems and are globally important for food production, carbon storage, and other ecosystem services. However, grasslands in many areas are becoming increasingly exposed to extreme wet and dry periods resulting from global temperature increases. Therefore, understanding how grasslands will respond to climat...
Preprint
Full-text available
The nature of explanations provided by an explainable AI algorithm has been a topic of interest in the explainable AI and human-computer interaction community. In this paper, we investigate the effects of natural language explanations' specificity on passengers in autonomous driving. We extended an existing data-driven tree-based explainer algorith...
Article
Sharing scarce resources is a key challenge in multi-agent interaction, especially when individual agents are uncertain about their future consumption. We present a new auction mechanism for preallocating multi-unit resources among agents, while limiting the chance of resource violations. By planning for a chance constraint, we strike a balance bet...
Article
For many applications of Markov Decision Processes (MDPs), the transition function cannot be specified exactly. Bayes-Adaptive MDPs (BAMDPs) extend MDPs to consider transition probabilities governed by latent parameters. To act optimally in BAMDPs, one must maintain a belief distribution over the latent parameters. Typically, this distribution is d...
Preprint
We consider robot learning in the context of shared autonomy, where control of the system can switch between a human teleoperator and autonomous control. In this setting we address reinforcement learning, and learning from demonstration, where there is a cost associated with human time. This cost represents the human time required to teleoperate th...
Preprint
We propose DITTO, an offline imitation learning algorithm which uses world models and on-policy reinforcement learning to address the problem of covariate shift, without access to an oracle or any additional online interactions. We discuss how world models enable offline, on-policy imitation learning, and propose a simple intrinsic reward defined...
Article
Full-text available
Image sensing technologies are rapidly increasing the cost-effectiveness of biodiversity monitoring efforts. Species differences in the reflectance of electromagnetic radiation can be used as a surrogate to estimate plant biodiversity using multispectral image data. However, these efforts are often hampered by logistical difficulties in broad-scale im...
Preprint
Offline reinforcement learning (RL) is suitable for safety-critical domains where online exploration is too costly or dangerous. In safety-critical settings, decision-making should take into consideration the risk of catastrophic outcomes. In other words, decision-making should be risk-sensitive. Previous works on risk in offline RL combine togethe...
Article
Full-text available
Consider a mobile robot exploring an office building with the aim of observing as much human activity as possible over several days. It must learn where and when people are to be found, count the observed activities, and revisit popular places at the right time. In this paper we present a series of Bayesian estimators for the levels of human activi...
Preprint
Full-text available
Achieving reactive robot behavior in complex dynamic environments is still challenging as it relies on being able to solve trajectory optimization problems quickly enough, such that we can replan the future motion at frequencies which are sufficiently high for the task at hand. We argue that current limitations in Model Predictive Control (MPC) for...
Article
In this overview paper, we present the work of the Goal-Oriented Long-Lived Systems Lab on multi-robot systems. We address multi-robot systems from a decision-making under uncertainty perspective, proposing approaches that explicitly reason about the inherent uncertainty of action execution, and how such stochasticity affects multi-robot coordinati...
Preprint
Full-text available
Active inference is a mathematical framework that originated in computational neuroscience. Recently, it has been demonstrated as a promising approach for constructing goal-driven behavior in robotics. Specifically, the active inference controller (AIC) has been successful on several continuous control and state-estimation tasks. Despite its relati...
Conference Paper
Full-text available
We consider shared autonomy systems where multiple operators (AI and human) can interact with the environment, e.g. by controlling a robot. The decision problem for the shared autonomy system is to select which operator takes control at each timestep, such that a reward specifying the intended system behaviour is maximised. The performance of the...
Article
Planning in Markov decision processes (MDPs) typically optimises the expected cost. However, optimising the expectation does not consider the risk that for any given run of the MDP, the total cost received may be unacceptably high. An alternative approach is to find a policy which optimises a risk-averse objective such as conditional value at risk (...
Preprint
Offline reinforcement learning (RL) aims to find near-optimal policies from logged data without further environment interaction. Model-based algorithms, which learn a model of the environment from the dataset and perform conservative policy optimisation within that model, have emerged as a promising approach to this problem. In this work, we presen...
Preprint
Full-text available
Model-based fault-tolerant control (FTC) often consists of two distinct steps: fault detection & isolation (FDI), and fault accommodation. In this work we investigate posing fault-tolerant control as a single Bayesian inference problem. Previous work showed that precision learning allows for stochastic FTC without an explicit fault detection step....
Preprint
Full-text available
Image sensing technologies are rapidly increasing the cost-effectiveness of biodiversity monitoring efforts. Species differences in the reflectance of electromagnetic radiation have recently been highlighted as a promising target to estimate plant biodiversity using multispectral image data. However, these efforts are currently hampered by logistic...
Article
Model-based fault-tolerant control (FTC) often consists of two distinct steps: fault detection & isolation (FDI), and fault accommodation. In this work we investigate posing fault-tolerant control as a single Bayesian inference problem. Previous work showed that precision learning allows for stochastic FTC without an explicit fault detection step....
Article
This article presents an Expert-guided Mixed-initiative Control Switcher (EMICS) for remotely operated mobile robots. The EMICS enables switching between different levels of autonomy during task execution initiated by either the human operator and/or the EMICS. The EMICS is evaluated in two disaster-response-inspired experiments, one with a simulat...
Preprint
Planning in Markov decision processes (MDPs) typically optimises the expected cost. However, optimising the expectation does not consider the risk that for any given run of the MDP, the total cost received may be unacceptably high. An alternative approach is to find a policy which optimises a risk-averse objective such as conditional value at risk...
Preprint
Recent trends envisage robots being deployed in areas deemed dangerous to humans, such as buildings with gas and radiation leaks. In such situations, the model of the underlying hazardous process might be unknown to the agent a priori, giving rise to the problem of planning for safe behaviour in partially known environments. We employ Gaussian proc...
Preprint
Previous work on planning as active inference addresses finite horizon problems and solutions valid for online planning. We propose solving the general Stochastic Shortest-Path Markov Decision Process (SSP MDP) as probabilistic inference. Furthermore, we discuss online and offline methods for planning under uncertainty. In an SSP MDP, the horizon i...
Preprint
Full-text available
This work presents a fault-tolerant control scheme for sensory faults in robotic manipulators based on active inference. In the majority of existing schemes, a binary decision of whether a sensor is healthy (functional) or faulty is made based on measured data. The decision boundary is called a threshold and it is usually deterministic. Following a...
Article
Multirobot systems must be able to maintain performance when robots get delayed during execution. For mobile robots, one source of delays is congestion. Congestion occurs when robots deployed in shared physical spaces interact, as robots present in the same area simultaneously must maneuver to avoid each other. Congestion can adversely affect na...
Article
Full-text available
The utilisation of robots in hazardous nuclear environments has potential to reduce risk to humans. However, historical use has been largely limited to specific missions rather than broader industry-wide adoption. Testing and verification of robotics in realistic scenarios is key to gaining stakeholder confidence but hindered by limited access to f...
Article
The parameters for a Markov Decision Process (MDP) often cannot be specified exactly. Uncertain MDPs (UMDPs) capture this model ambiguity by defining sets which the parameters belong to. Minimax regret has been proposed as an objective for planning in UMDPs to find robust policies which are not overly conservative. In this work, we focus on plannin...
Preprint
Full-text available
This work presents a novel fault-tolerant control scheme based on active inference. Specifically, a new formulation of active inference which, unlike previous solutions, provides unbiased state estimation and simplifies the definition of probabilistically robust thresholds for fault-tolerant control of robotic systems using the free-energy. The pro...
Preprint
In this work, we address risk-averse Bayes-adaptive reinforcement learning. We pose the problem of optimising the conditional value at risk (CVaR) of the total return in Bayes-adaptive Markov decision processes (MDPs). We show that a policy optimising CVaR in this setting is risk-averse to both the parametric uncertainty due to the prior distributio...
Article
This paper presents an approach to planning under uncertainty in resource-constrained environments. We describe our novel method for online plan modification and execution monitoring, which augments an existing plan with pre-computed plan fragments in response to observed resource availability. Our plan merging algorithm uses causal structure to in...
Chapter
This work presents a fault-tolerant control scheme for sensory faults in robotic manipulators based on active inference. In the majority of existing schemes a binary decision of whether a sensor is healthy (functional) or faulty is made based on measured data. The decision boundary is called a threshold and it is usually deterministic. Following a...
Chapter
Previous work on planning as active inference addresses finite horizon problems and solutions valid for online planning. We propose solving the general Stochastic Shortest-Path Markov Decision Process (SSP MDP) as probabilistic inference. Furthermore, we discuss online and offline methods for planning under uncertainty. In an SSP MDP, the horizon i...
Chapter
Full-text available
We present a fault tolerant control scheme for robot manipulators based on active inference. The proposed solution makes use of the sensory prediction errors in the free-energy to simplify the residuals and thresholds generation for fault detection and isolation and does not require additional controllers for fault recovery. Results validating the...
Preprint
Full-text available
The parameters for a Markov Decision Process (MDP) often cannot be specified exactly. Uncertain MDPs (UMDPs) capture this model ambiguity by defining sets which the parameters belong to. Minimax regret has been proposed as an objective for planning in UMDPs to find robust policies which are not overly conservative. In this work, we focus on plannin...
Article
The daily working hours of mobile robots are limited primarily by battery life. Most systems use a combination of thresholds and fixed periods to decide when to charge. This produces charging behaviour that ignores high-value tasks that must be performed within time-windows or by deadlines. Instead the robot should schedule charging adaptively, tak...
Article
This work investigates Monte-Carlo planning for agents in stochastic environments, with multiple objectives. We propose the Convex Hull Monte-Carlo Tree-Search (CHMCTS) framework, which builds upon Trial Based Heuristic Tree Search and Convex Hull Value Iteration (CHVI), as a solution to multi-objective planning in large environments. Moreover, we...
Preprint
Full-text available
This work presents an approach for control, state-estimation and learning model (hyper)parameters for robotic manipulators. It is based on the active inference framework, prominent in computational neuroscience as a theory of the brain, where behaviour arises from minimizing variational free-energy. The robotic manipulator shows adaptive and robust...
Preprint
This work investigates Monte-Carlo planning for agents in stochastic environments, with multiple objectives. We propose the Convex Hull Monte-Carlo Tree-Search (CHMCTS) framework, which builds upon Trial Based Heuristic Tree Search and Convex Hull Value Iteration (CHVI), as a solution to multi-objective planning in large environments. Moreover, we...
Article
We consider robot learning in the context of shared autonomy, where control of the system can switch between a human teleoperator and autonomous control. In this setting we address reinforcement learning, and learning from demonstration, where there is a cost associated with human time. This cost represents the human time required to teleoperate th...
Preprint
This paper presents an expert-guided Mixed-Initiative (MI) variable-autonomy controller for remotely operated mobile robots. The controller enables switching between different Level(s) of Autonomy (LOA) during task execution initiated by either the human operator and/or the robot. The controller is evaluated in two Search and Rescue (SAR) inspired...
Conference Paper
Full-text available
We present a novel modelling and planning approach for multi-robot systems under uncertain travel times. The approach uses generalised stochastic Petri nets (GSPNs) to model desired team behaviour, and allows safety constraints and rewards to be specified. The GSPN is interpreted as a Markov decision process (MDP) for which we can generate policies that...
Article
We present a framework for mobile service robot task planning and execution, based on the use of probabilistic verification techniques for the generation of optimal policies with attached formal performance guarantees. Our approach is based on a Markov decision process model of the robot in its environment, encompassing a topological map where node...
Article
The papers in this special section focus on the use of artificial intelligence (AI) for long term autonomy. Autonomous systems have a long history in the fields of AI and robotics. However, only through recent advances in technology has it been possible to create autonomous systems capable of operating in long-term, real-world scenarios. Examples i...
Article
Full-text available
Autonomous systems will play an essential role in many applications across diverse domains including space, marine, air, field, road, and service robotics. They will assist us in our daily routines and perform dangerous, dirty and dull tasks. However, enabling robotic systems to perform autonomously in complex, real-world scenarios over extended ti...
Preprint
Full-text available
Autonomous systems will play an essential role in many applications across diverse domains including space, marine, air, field, road, and service robotics. They will assist us in our daily routines and perform dangerous, dirty and dull tasks. However, enabling robotic systems to perform autonomously in complex, real-world scenarios over extended ti...
Article
We propose novel techniques for task allocation and planning in multi-robot systems operating in uncertain environments. Task allocation is performed simultaneously with planning, which provides more detailed information about individual robot behaviour, but also exploits the independence between tasks to do so efficiently. We use Markov decision p...
Conference Paper
Full-text available
Deep networks thrive when trained on large scale data collections. This has given ImageNet a central role in the development of deep architectures for visual object classification. However, ImageNet was created during a specific period in time, and as such it is prone to aging, as well as dataset bias issues. Moving beyond fixed training datasets w...
Article
We present a methodology for the generation of mobile robot controllers which offer probabilistic time-bounded guarantees on successful task completion, whilst also trying to satisfy soft goals. The approach is based on a stochastic model of the robot’s environment and action execution times, a set of soft goals, and a formal task specification in...
Conference Paper
Full-text available
Intelligent Autonomous Robots deployed in human environments must have an understanding of the wide range of possible semantic identities associated with the spaces they inhabit: kitchens, living rooms, bathrooms, offices, garages, etc. We believe robots should learn this information through their own exploration and situated perception in order to unc...
Conference Paper
Full-text available
The ability to refer to entities such as objects, locations, and people is an important capability for robots designed to interact with humans. For example, a referring expression (RE) such as "Do you mean the box on the left?" might be used by a robot seeking to disambiguate between objects. In this paper, we present and evaluate algorithms for...
Conference Paper
Full-text available
Recognising objects in everyday human environments is a challenging task for autonomous mobile robots. However, actively planning the views from which an object might be perceived can significantly improve the overall task performance. In this paper we have designed, developed, and evaluated an approach for next best view planning. Our view plannin...
Article
Full-text available
Thanks to the efforts of the robotics and autonomous systems community, robots are becoming ever more capable. There is also an increasing demand from end-users for autonomous service robots that can operate in real environments for extended periods. In the STRANDS project we are tackling this demand head-on by integrating state-of-the-art artific...
Conference Paper
The success of mobile robots, in daily living environments, depends on their capabilities to understand human movements and interact in a safe manner. This paper presents a novel unsupervised qualitative-relational framework for learning human motion patterns using a single mobile robot platform. It is capable of learning human motion patterns in r...
Article
Full-text available
Thanks to the efforts of our community, autonomous robots are becoming capable of ever more complex and impressive feats. There is also an increasing demand for, perhaps even an expectation of, autonomous capabilities from end-users. However, much research into autonomous robots rarely makes it past the stage of a demonstration or experimental syst...
Article
Full-text available
This article presents an integrated robot system capable of interactive learning in dialogue with a human. Such a system needs to have several competencies and must be able to process different types of representations. In this article, we describe a collection of mechanisms that enable integration of heterogeneous competencies in a principled way....
Article
Full-text available
In this article, we present and evaluate a system, which allows a mobile robot to autonomously detect, model, and re-recognize objects in everyday environments. While other systems have demonstrated one of these elements, to our knowledge, we present the first system, which is capable of doing all of these things, all without human interaction, in...
Conference Paper
Full-text available
This paper presents an experimental analysis of the Human-Robot Interaction (HRI) between human operators and a Human-Initiative (HI) variable-autonomy mobile robot during navigation tasks. In our HI system the human operator is able to switch the Level of Autonomy (LOA) on-the-fly between teleoperation (joystick control) and autonomous control (ro...
Article
Full-text available
A long-standing goal of AI is to enable robots to plan in the face of uncertain and incomplete information, and to handle task failure intelligently. This paper shows how to achieve this. There are two central ideas. The first idea is to organize the robot's knowledge into three layers: instance knowledge at the bottom, commonsense knowled...
Article
Full-text available
In planning for deliberation or navigation in real-world robotic systems, one of the big challenges is to cope with change. It lies in the nature of planning that it has to make assumptions about the future state of the world, and the robot's chances of successively accomplishing actions in this future. Hence, a robot's plan can only be as good as...
Article
We present a novel task scheduling algorithm for use on mobile robots in real environments. The scheduling problem is formalised as a mixed integer program, which is a standard approach in the scheduling community. Our contribution is the use of Allen's interval algebra to prune the search to be performed by the mixed integer program. This significan...
Article
Object recognition systems can be unreliable when run in isolation depending on only image based features, but their performance can be improved when taking scene context into account. In this paper, we present techniques to model and infer object labels in real scenes based on a variety of spatial relations — geometric features which capture how o...
Conference Paper
Full-text available
Object recognition systems can be unreliable when run in isolation depending on only image based features, but their performance can be improved when taking scene context into account. In this paper, we present techniques to model and infer object labels in real scenes based on a variety of spatial relations – geometric features which capture how o...
Article
We present an approach to the problem of learning by observation in spatially-situated tasks, whereby an agent learns to imitate the behaviour of an observed expert with no interaction and limited observations. The form of knowledge representation used for these observations is crucial, and we apply Qualitative Spatial-Relational representations to...
Article
We present a method to specify tasks and synthesise cost-optimal policies for Markov decision processes using co-safe linear temporal logic. Our approach incorporates a dynamic task handling procedure which allows for the addition of new tasks during execution and provides the ability to re-plan an optimal policy on-the-fly. This new policy minimis...
Conference Paper
Rapidly exploring randomised trees (RRTs) are a useful tool for generating maps for use by agents to navigate. A disadvantage to using RRTs is the length of time required to generate the map. In large scale environments, or those with narrow corridors, the time needed to create the map can be prohibitive. This paper explores a new method for improving...
Article
Full-text available
The Association for the Advancement of Artificial Intelligence was pleased to present the AAAI 2014 Spring Symposium Series, held Monday through Wednesday, March 24-26, 2014. The titles of the eight symposia were Applied Computational Game Theory, Big Data Becomes Personal: Knowledge into Meaning, Formal Verification and Modeling in Human-Machine...
Conference Paper
Full-text available
Many robot perception systems are built to only consider intrinsic object features to recognise the class of an object. By integrating both top-down spatial relational reasoning and bottom-up object class recognition the overall performance of a perception system can be improved. In this paper we present a unified framework that combines a 3D objec...
Article
In this article I explore two ideas. The first is that the idea of architectures for intelligent systems is ripe for exploitation given the current state of component technologies and available software. The second idea is that in order to encourage progress in architecture research, we must concentrate on research methodologies that prevent us fro...
Conference Paper
Full-text available
Finding objects in human environments requires autonomous mobile robots to reason about potential object locations and to plan to perceive them accordingly. By using information about the 3D structure of the environment, knowledge about landmark objects and their spatial relationship to the sought object, search can be improved by directing the rob...
Article
We investigate the problem of learning the control of small groups of units in combat situations in Real Time Strategy (RTS) games. AI systems may acquire such skills by observing and learning from expert players, or other AI systems performing those tasks. However, access to training data may be limited, and representations based on metric informa...
Article
Understanding of behaviour is a crucial skill for Artificial Intelligence systems expected to interact with external agents - whether other AI systems, or humans, in scenarios involving co-operation, such as domestic robots capable of helping out with household jobs, or disaster relief robots expected to collaborate and lend assistance to others. I...
Article
Full-text available
In many real world applications, autonomous mobile robots are required to observe or retrieve objects in their environment, despite not having accurate estimates of the objects' locations. Finding objects in real-world settings is a non-trivial task, given the complexity and the dynamics of human environments. However, by understanding and exploiti...
Conference Paper
Intelligent virtual agents are increasingly faced with very large scale, unstructured environments. In the case of user generated worlds, it is not always possible to give an agent the opportunity to pre-process the map. These agents are required to build a map of their environment and use it to plan routes in a very short period of time. We look a...

Citations

... It is also seen that the 670 nm trough for this sample is not as deep as for the control and drought samples, which may indicate reduced chlorophyll absorption consistent with the known effects of this treatment at the examined site. 44,45 Given the small number of samples tested here, it is not possible to draw conclusions as to the cause of the differences between these spectra, or to relate these to their growth environments. However, hyperspectral data cubes collected in this way are rich in information and ripe for further exploration. ...
... The prevailing focus on SDFs in robotics centers on task space, where configuration space actions are typically computed independently through mappings between the two spaces [33,29]. Existing approaches often model the configuration space using binary maps denoting collision status of joint configurations [32,42], to support sample-based motion planning algorithms [43,3,11]. Despite significant progress, these control and planning strategies are computationally expensive in high dimensional space due to the lack of gradient information. ...
... CMMDP approaches typically exploit the fact that only the resource constraint couples the agents to scale to larger problems. Planning for CMMDPs has considered a range of constraints over resource consumption, such as bounding its worst-case [67], considering a chance-constraint [68,71], and bounding its conditional value at risk [72]. ...
... Other studies use instead the forward error (which is simpler to compute) as the main attractive force [48]. Finally, an alternative strategy consists of including control costs in the Free Energy expression, to remove estimation biases and afford optimal action [104]. The pros and cons of the different approaches and their biological plausibility remain to be systematically investigated. ...
... Second, the spatial scale of data acquisition is highly significant when assessing the spectral variation hypothesis (Wang et al., 2018). Scaling up (sampling at coarser scales) enables wider coverage in a shorter time, which was based on the decreasing of spatial resolution (Jackson et al., 2022). While increasing pixel size will include more objects and reduce the discrimination of objects smaller than pixels, which will affect the spectral heterogeneity (Wang et al., 2018). ...
... On the other hand, passively acquiring sufficient amount of useful data usually comes with a time cost. However, mobile robots can actively explore the environment for useful observations, thus providing strong technical support for effectively acquiring high-quality and sufficient data (Santos et al. 2016;Santos, Krajník, and Duckett 2017;Jovan et al. 2022;Molina, Cielniak, and Duckett 2021). For example, an end-to-end method for active object classification with RGB-D data is proposed in (Patten et al. 2016), which plans a robot's future observations to identify uncertain objects in clutter. ...
... Planning scenarios in various application areas [51] have different resource constraints. Typical examples are energy consumption and time [11], or optimal expected revenue and time [42] in robotics, and monetary cost and available capacity in logistics [17]. ...
... Both NLU-MCTS and DMCTS overcome the issues present when making decisions solely with the expected return [11,38,53,70,80]. As we will show, computing the utility of the returns of a policy is useful when optimising for risk-aware RL and under the MORL ESR criterion, given utility of the returns of a policy contains more information about the range of potential negative and positive outcomes during planning and at decision time. ...
... Regret Optimization in MDPs. Measuring and optimizing a regret value to improve the robustness has been studied previously in uncertain Markov Decision Processes (MDPs) [1,28]. In RL, [13] established Advantage-Like Regret Minimization (ARM) as a policy gradient solution for agents robust to partially observable environments. ...
... Meanwhile, the cooperative multiplayer MAB focuses on a group of M players collaboratively solving challenges in a distributed decision-making environment, enhancing learning through shared information. This approach finds applications in fields like multi-robot systems [19] and distributed recommender systems [27]. ...