Nick Hawes's research while affiliated with University of Oxford and other places
What is this page?
This page lists the scientific contributions of an author, who either does not have a ResearchGate profile, or has not yet added these contributions to their profile.
It was automatically created by ResearchGate to create a record of this author's body of work. We create such pages to advance our goal of creating and maintaining the most comprehensive scientific repository possible. In doing so, we process publicly available (personal) data relating to the author as a member of the scientific community.
If you're a ResearchGate member, you can follow this page to keep up with this author's work.
If you are this author, and you don't want us to display this page anymore, please let us know.
Publications (155)
The metareasoning framework aims to enable autonomous agents to factor in planning costs when making decisions. In this work, we develop the first non-myopic metareasoning algorithm for planning with Markov decision processes. Our method learns the behaviour of anytime probabilistic planning algorithms from performance data. Specifically, we propos...
Grasslands, which encompass 40% of terrestrial ecosystems, hold global significance for food production, carbon storage and other ecosystem services. However, grasslands across the biosphere are becoming increasingly exposed to both wet and dry precipitation extremes resulting from climate change. Therefore, understanding how grasslands will respon...
For many multi-robot problems, tasks are announced during execution, where task announcement times and locations are uncertain. To synthesise multi-robot behaviour that is robust to early announcements and unexpected delays, multi-robot task allocation methods must explicitly model the stochastic processes that govern task announcement. In this pap...
We consider the challenging scenario of contextual bandits with continuous actions and large context spaces. This is an increasingly important application area in personalised healthcare where an agent is requested to make dosing decisions based on a patient’s single image scan. In this paper, we first adapt a reinforcement learning (RL) algorithm...
Grasslands comprise 40% of terrestrial ecosystems and are globally important for food production, carbon storage, and other ecosystem services. However, grasslands in many areas are becoming increasingly exposed to extreme wet and dry periods resulting from global temperature increases. Therefore, understanding how grasslands will respond to climat...
The nature of explanations provided by an explainable AI algorithm has been a topic of interest in the explainable AI and human-computer interaction community. In this paper, we investigate the effects of natural language explanations' specificity on passengers in autonomous driving. We extended an existing data-driven tree-based explainer algorith...
Sharing scarce resources is a key challenge in multi-agent interaction, especially when individual agents are uncertain about their future consumption. We present a new auction mechanism for preallocating multi-unit resources among agents, while limiting the chance of resource violations. By planning for a chance constraint, we strike a balance bet...
For many applications of Markov Decision Processes (MDPs), the transition function cannot be specified exactly. Bayes-Adaptive MDPs (BAMDPs) extend MDPs to consider transition probabilities governed by latent parameters. To act optimally in BAMDPs, one must maintain a belief distribution over the latent parameters. Typically, this distribution is d...
We consider robot learning in the context of shared autonomy, where control of the system can switch between a human teleoperator and autonomous control. In this setting we address reinforcement learning, and learning from demonstration, where there is a cost associated with human time. This cost represents the human time required to teleoperate th...
We propose DITTO, an offline imitation learning algorithm which uses world models and on-policy reinforcement learning to address the problem of covariate shift, without access to an oracle or any additional online interactions. We discuss how world models enable offline, on-policy imitation learning, and propose a simple intrinsic reward defined...
Image sensing technologies are rapidly increasing the cost-effectiveness of biodiversity monitoring efforts. Species differences in the reflectance of electromagnetic radiation can be used as a surrogate to estimate plant biodiversity using multispectral image data. However, these efforts are often hampered by logistical difficulties in broad-scale im...
Offline reinforcement learning (RL) is suitable for safety-critical domains where online exploration is too costly or dangerous. In safety-critical settings, decision-making should take into consideration the risk of catastrophic outcomes. In other words, decision-making should be risk-sensitive. Previous works on risk in offline RL combine togethe...
Consider a mobile robot exploring an office building with the aim of observing as much human activity as possible over several days. It must learn where and when people are to be found, count the observed activities, and revisit popular places at the right time. In this paper we present a series of Bayesian estimators for the levels of human activi...
Achieving reactive robot behavior in complex dynamic environments is still challenging as it relies on being able to solve trajectory optimization problems quickly enough, such that we can replan the future motion at frequencies which are sufficiently high for the task at hand. We argue that current limitations in Model Predictive Control (MPC) for...
In this overview paper, we present the work of the Goal-Oriented Long-Lived Systems Lab on multi-robot systems. We address multi-robot systems from a decision-making under uncertainty perspective, proposing approaches that explicitly reason about the inherent uncertainty of action execution, and how such stochasticity affects multi-robot coordinati...
Active inference is a mathematical framework that originated in computational neuroscience. Recently, it has been demonstrated as a promising approach for constructing goal-driven behavior in robotics. Specifically, the active inference controller (AIC) has been successful on several continuous control and state-estimation tasks. Despite its relati...
We consider shared autonomy systems where multiple operators (AI and human) can interact with the environment, e.g. by controlling a robot. The decision problem for the shared autonomy system is to select which operator takes control at each timestep, such that a reward specifying the intended system behaviour is maximised. The performance of the...
Planning in Markov decision processes (MDPs) typically optimises the expected cost. However, optimising the expectation does not consider the risk that for any given run of the MDP, the total cost received may be unacceptably high. An alternative approach is to find a policy which optimises a risk-averse objective such as conditional value at risk (...
Offline reinforcement learning (RL) aims to find near-optimal policies from logged data without further environment interaction. Model-based algorithms, which learn a model of the environment from the dataset and perform conservative policy optimisation within that model, have emerged as a promising approach to this problem. In this work, we presen...
Model-based fault-tolerant control (FTC) often consists of two distinct steps: fault detection & isolation (FDI), and fault accommodation. In this work we investigate posing fault-tolerant control as a single Bayesian inference problem. Previous work showed that precision learning allows for stochastic FTC without an explicit fault detection step....
Image sensing technologies are rapidly increasing the cost-effectiveness of biodiversity monitoring efforts. Species differences in the reflectance of electromagnetic radiation have recently been highlighted as a promising target to estimate plant biodiversity using multispectral image data. However, these efforts are currently hampered by logistic...
This article presents an Expert-guided Mixed-initiative Control Switcher (EMICS) for remotely operated mobile robots. The EMICS enables switching between different levels of autonomy during task execution initiated by either the human operator and/or the EMICS. The EMICS is evaluated in two disaster-response-inspired experiments, one with a simulat...
Planning in Markov decision processes (MDPs) typically optimises the expected cost. However, optimising the expectation does not consider the risk that for any given run of the MDP, the total cost received may be unacceptably high. An alternative approach is to find a policy which optimises a risk-averse objective such as conditional value at risk...
Recent trends envisage robots being deployed in areas deemed dangerous to humans, such as buildings with gas and radiation leaks. In such situations, the model of the underlying hazardous process might be unknown to the agent a priori, giving rise to the problem of planning for safe behaviour in partially known environments. We employ Gaussian proc...
Previous work on planning as active inference addresses finite horizon problems and solutions valid for online planning. We propose solving the general Stochastic Shortest-Path Markov Decision Process (SSP MDP) as probabilistic inference. Furthermore, we discuss online and offline methods for planning under uncertainty. In an SSP MDP, the horizon i...
This work presents a fault-tolerant control scheme for sensory faults in robotic manipulators based on active inference. In the majority of existing schemes, a binary decision of whether a sensor is healthy (functional) or faulty is made based on measured data. The decision boundary is called a threshold and it is usually deterministic. Following a...
Multirobot systems must be able to maintain performance when robots get delayed during execution. For mobile robots, one source of delays is congestion. Congestion occurs when robots deployed in shared physical spaces interact, as robots present in the same area simultaneously must maneuver to avoid each other. Congestion can adversely affect na...
The utilisation of robots in hazardous nuclear environments has potential to reduce risk to humans. However, historical use has been largely limited to specific missions rather than broader industry-wide adoption. Testing and verification of robotics in realistic scenarios is key to gaining stakeholder confidence but hindered by limited access to f...
The parameters for a Markov Decision Process (MDP) often cannot be specified exactly. Uncertain MDPs (UMDPs) capture this model ambiguity by defining sets which the parameters belong to. Minimax regret has been proposed as an objective for planning in UMDPs to find robust policies which are not overly conservative. In this work, we focus on plannin...
This work presents a novel fault-tolerant control scheme based on active inference. Specifically, a new formulation of active inference which, unlike previous solutions, provides unbiased state estimation and simplifies the definition of probabilistically robust thresholds for fault-tolerant control of robotic systems using the free-energy. The pro...
In this work, we address risk-averse Bayes-adaptive reinforcement learning. We pose the problem of optimising the conditional value at risk (CVaR) of the total return in Bayes-adaptive Markov decision processes (MDPs). We show that a policy optimising CVaR in this setting is risk-averse to both the parametric uncertainty due to the prior distributio...
This paper presents an approach to planning under uncertainty in resource-constrained environments. We describe our novel method for online plan modification and execution monitoring, which augments an existing plan with pre-computed plan fragments in response to observed resource availability. Our plan merging algorithm uses causal structure to in...
This work presents a fault-tolerant control scheme for sensory faults in robotic manipulators based on active inference. In the majority of existing schemes a binary decision of whether a sensor is healthy (functional) or faulty is made based on measured data. The decision boundary is called a threshold and it is usually deterministic. Following a...
We present a fault tolerant control scheme for robot manipulators based on active inference. The proposed solution makes use of the sensory prediction errors in the free-energy to simplify the residuals and thresholds generation for fault detection and isolation and does not require additional controllers for fault recovery. Results validating the...
The daily working hours of mobile robots are limited primarily by battery life. Most systems use a combination of thresholds and fixed periods to decide when to charge. This produces charging behaviour that ignores high-value tasks that must be performed within time-windows or by deadlines. Instead the robot should schedule charging adaptively, tak...
This work investigates Monte-Carlo planning for agents in stochastic environments, with multiple objectives. We propose the Convex Hull Monte-Carlo Tree-Search (CHMCTS) framework, which builds upon Trial Based Heuristic Tree Search and Convex Hull Value Iteration (CHVI), as a solution to multi-objective planning in large environments. Moreover, we...
This work presents an approach for control, state-estimation and learning model (hyper)parameters for robotic manipulators. It is based on the active inference framework, prominent in computational neuroscience as a theory of the brain, where behaviour arises from minimizing variational free-energy. The robotic manipulator shows adaptive and robust...
This paper presents an expert-guided Mixed-Initiative (MI) variable-autonomy controller for remotely operated mobile robots. The controller enables switching between different Levels of Autonomy (LOA) during task execution, initiated by the human operator and/or the robot. The controller is evaluated in two Search and Rescue (SAR) inspired...
We present a novel modelling and planning approach for multi-robot systems under uncertain travel times. The approach uses generalised stochastic Petri nets (GSPNs) to model desired team behaviour, and allows safety constraints and rewards to be specified. The GSPN is interpreted as a Markov decision process (MDP) for which we can generate policies that...
We present a framework for mobile service robot task planning and execution, based on the use of probabilistic verification techniques for the generation of optimal policies with attached formal performance guarantees. Our approach is based on a Markov decision process model of the robot in its environment, encompassing a topological map where node...
The papers in this special section focus on the use of artificial intelligence (AI) for long term autonomy. Autonomous systems have a long history in the fields of AI and robotics. However, only through recent advances in technology has it been possible to create autonomous systems capable of operating in long-term, real-world scenarios. Examples i...
Autonomous systems will play an essential role in many applications across diverse domains including space, marine, air, field, road, and service robotics. They will assist us in our daily routines and perform dangerous, dirty and dull tasks. However, enabling robotic systems to perform autonomously in complex, real-world scenarios over extended ti...
We propose novel techniques for task allocation and planning in multi-robot systems operating in uncertain environments. Task allocation is performed simultaneously with planning, which provides more detailed information about individual robot behaviour, but also exploits the independence between tasks to do so efficiently. We use Markov decision p...
Deep networks thrive when trained on large scale data collections. This has given ImageNet a central role in the development of deep architectures for visual object classification. However, ImageNet was created during a specific period in time, and as such it is prone to aging, as well as dataset bias issues. Moving beyond fixed training datasets w...
We present a methodology for the generation of mobile robot controllers which offer probabilistic time-bounded guarantees on successful task completion, whilst also trying to satisfy soft goals. The approach is based on a stochastic model of the robot’s environment and action execution times, a set of soft goals, and a formal task specification in...
Intelligent Autonomous Robots deployed in human environments must have an understanding of the wide range of possible semantic identities associated with the spaces they inhabit: kitchens, living rooms, bathrooms, offices, garages, etc. We believe robots should learn this information through their own exploration and situated perception in order to unc...
The ability to refer to entities such as objects, locations, and people is an important capability for robots designed to interact with humans. For example, a referring expression (RE) such as "Do you mean the box on the left?" might be used by a robot seeking to disambiguate between objects. In this paper, we present and evaluate algorithms for...
Recognising objects in everyday human environments is a challenging task for autonomous mobile robots. However, actively planning the views from which an object might be perceived can significantly improve the overall task performance. In this paper we have designed, developed, and evaluated an approach for next best view planning. Our view plannin...
Thanks to the efforts of the robotics and autonomous systems community, robots are becoming ever more capable. There is also an increasing demand from end-users for autonomous service robots that can operate in real environments for extended periods. In the STRANDS project we are tackling this demand head-on by integrating state-of-the-art artific...
The success of mobile robots, in daily living environments, depends on their capabilities to understand human movements and interact in a safe manner. This paper presents a novel unsupervised qualitative-relational framework for learning human motion patterns using a single mobile robot platform. It is capable of learning human motion patterns in r...
Thanks to the efforts of our community, autonomous robots are becoming capable of ever more complex and impressive feats. There is also an increasing demand for, perhaps even an expectation of, autonomous capabilities from end-users. However, much research into autonomous robots rarely makes it past the stage of a demonstration or experimental syst...
This article presents an integrated robot system capable of interactive learning in dialogue with a human. Such a system needs to have several competencies and must be able to process different types of representations. In this article, we describe a collection of mechanisms that enable integration of heterogeneous competencies in a principled way....
In this article, we present and evaluate a system, which allows a mobile robot to autonomously detect, model, and re-recognize objects in everyday environments. While other systems have demonstrated one of these elements, to our knowledge, we present the first system, which is capable of doing all of these things, all without human interaction, in...
This paper presents an experimental analysis of the Human-Robot Interaction (HRI) between human operators and a Human-Initiative (HI) variable-autonomy mobile robot during navigation tasks. In our HI system the human operator is able to switch the Level of Autonomy (LOA) on-the-fly between teleoperation (joystick control) and autonomous control (ro...
A long-standing goal of AI is to enable robots to plan in the face of uncertain and incomplete information, and to handle task failure intelligently. This paper shows how to achieve this. There are two central ideas. The first idea is to organize the robot's knowledge into three layers: instance knowledge at the bottom, commonsense knowled...
In planning for deliberation or navigation in real-world robotic systems, one of the big challenges is to cope with change. It lies in the nature of planning that it has to make assumptions about the future state of the world, and the robot's chances of successively accomplishing actions in this future. Hence, a robot's plan can only be as good as...
We present a novel task scheduling algorithm for use on mobile robots in real environments. The scheduling problem is formalised as a mixed integer program, which is a standard approach in the scheduling community. Our contribution is the use of Allen's interval algebra to prune the search to be performed by the mixed integer program. This significan...
Object recognition systems can be unreliable when run in isolation depending on only image based features, but their performance can be improved when taking scene context into account. In this paper, we present techniques to model and infer object labels in real scenes based on a variety of spatial relations — geometric features which capture how o...
We present an approach to the problem of learning by observation in spatially-situated tasks, whereby an agent learns to imitate the behaviour of an observed expert with no interaction and limited observations. The form of knowledge representation used for these observations is crucial, and we apply Qualitative Spatial-Relational representations to...
We present a method to specify tasks and synthesise cost-optimal policies for Markov decision processes using co-safe linear temporal logic. Our approach incorporates a dynamic task handling procedure which allows for the addition of new tasks during execution and provides the ability to re-plan an optimal policy on-the-fly. This new policy minimis...
Rapidly exploring randomised trees (RRTs) are a useful tool for generating maps that agents use to navigate. A disadvantage of using RRTs is the length of time required to generate the map. In large scale environments, or those with narrow corridors, the time needed to create the map can be prohibitive. This paper explores a new method for improving...
The Association for the Advancement of Artificial Intelligence was pleased to present the AAAI 2014 Spring Symposium Series, held Monday through Wednesday, March 24-26, 2014. The titles of the eight symposia were Applied Computational Game Theory, Big Data Becomes Personal: Knowledge into Meaning, Formal Verification and Modeling in Human-Machine...
Many robot perception systems are built to only consider intrinsic object features to recognise the class of an object. By integrating both top-down spatial relational reasoning and bottom-up object class recognition the overall performance of a perception system can be improved. In this paper we present a unified framework that combines a 3D objec...
In this article I explore two ideas. The first is that the idea of architectures for intelligent systems is ripe for exploitation given the current state of component technologies and available software. The second idea is that in order to encourage progress in architecture research, we must concentrate on research methodologies that prevent us fro...
Finding objects in human environments requires autonomous mobile robots to reason about potential object locations and to plan to perceive them accordingly. By using information about the 3D structure of the environment, knowledge about landmark objects and their spatial relationship to the sought object, search can be improved by directing the rob...
We investigate the problem of learning the control of small groups of units in combat situations in Real Time Strategy (RTS) games. AI systems may acquire such skills by observing and learning from expert players, or other AI systems performing those tasks. However, access to training data may be limited, and representations based on metric informa...
Understanding of behaviour is a crucial skill for Artificial Intelligence systems expected to interact with external agents - whether other AI systems, or humans, in scenarios involving co-operation, such as domestic robots capable of helping out with household jobs, or disaster relief robots expected to collaborate and lend assistance to others. I...
In many real world applications, autonomous mobile robots are required to observe or retrieve objects in their environment, despite not having accurate estimates of the objects' locations. Finding objects in real-world settings is a non-trivial task, given the complexity and the dynamics of human environments. However, by understanding and exploiti...
Intelligent virtual agents are increasingly faced with very large scale, unstructured environments. In the case of user generated worlds, it is not always possible to give an agent the opportunity to pre-process the map. These agents are required to build a map of their environment and use it to plan routes in a very short period of time. We look a...
Citations
... It is also seen that the 670 nm trough for this sample is not as deep as for the control and drought samples, which may indicate reduced chlorophyll absorption consistent with the known effects of this treatment at the examined site. 44,45 Given the small number of samples tested here, it is not possible to draw conclusions as to the cause of the differences between these spectra, or to relate these to their growth environments. However, hyperspectral data cubes collected in this way are rich in information and ripe for further exploration. ...
... The prevailing focus on SDFs in robotics centers on task space, where configuration space actions are typically computed independently through mappings between the two spaces [33,29]. Existing approaches often model the configuration space using binary maps denoting collision status of joint configurations [32,42], to support sample-based motion planning algorithms [43,3,11]. Despite significant progress, these control and planning strategies are computationally expensive in high dimensional space due to the lack of gradient information. ...
... CMMDP approaches typically exploit the fact that only the resource constraint couples the agents to scale to larger problems. Planning for CMMDPs has considered a range of constraints over resource consumption, such as bounding its worst-case [67], considering a chance-constraint [68,71], and bounding its conditional value at risk [72]. ...
... Other studies use instead the forward error (which is simpler to compute) as the main attractive force [48]. Finally, an alternative strategy consists of including control costs in the Free Energy expression, to remove estimation biases and afford optimal action [104]. The pros and cons of the different approaches and their biological plausibility remain to be systematically investigated. ...
... Second, the spatial scale of data acquisition is highly significant when assessing the spectral variation hypothesis (Wang et al., 2018). Scaling up (sampling at coarser scales) enables wider coverage in a shorter time, which was based on the decreasing of spatial resolution (Jackson et al., 2022). While increasing pixel size will include more objects and reduce the discrimination of objects smaller than pixels, which will affect the spectral heterogeneity (Wang et al., 2018). ...
... On the other hand, passively acquiring sufficient amount of useful data usually comes with a time cost. However, mobile robots can actively explore the environment for useful observations, thus providing strong technical support for effectively acquiring high-quality and sufficient data (Santos et al. 2016; Santos, Krajník, and Duckett 2017; Jovan et al. 2022; Molina, Cielniak, and Duckett 2021). For example, an end-to-end method for active object classification with RGB-D data is proposed in (Patten et al. 2016), which plans a robot's future observations to identify uncertain objects in clutter. ...
... Planning scenarios in various application areas [51] have different resource constraints. Typical examples are energy consumption and time [11], or optimal expected revenue and time [42] in robotics, and monetary cost and available capacity in logistics [17]. ...
Reference: Multi-cost Bounded Tradeoff Analysis in MDP
... Both NLU-MCTS and DMCTS overcome the issues present when making decisions solely with the expected return [11,38,53,70,80]. As we will show, computing the utility of the returns of a policy is useful when optimising for risk-aware RL and under the MORL ESR criterion, given utility of the returns of a policy contains more information about the range of potential negative and positive outcomes during planning and at decision time. ...
... Regret Optimization in MDPs. Measuring and optimizing a regret value to improve the robustness has been studied previously in uncertain Markov Decision Processes (MDPs) [1,28]. In RL, [13] established Advantage-Like Regret Minimization (ARM) as a policy gradient solution for agents robust to partially observable environments. ...
... Meanwhile, the cooperative multiplayer MAB focuses on a group of M players collaboratively solving challenges in a distributed decision-making environment, enhancing learning through shared information. This approach finds applications in fields like multi-robot systems [19] and distributed recommender systems [27]. ...
Reference: Multi-Player Approaches for Dueling Bandits