Multi-Agent Path Finding – an Overview⋆
Roni Stern [0000-0003-0043-8179]
Ben Gurion University of the Negev, Be’er Sheva, Israel
sternron@post.bgu.ac.il
Abstract. Multi-Agent Pathfinding (MAPF) is the problem of finding paths for multiple agents such that every agent reaches its goal and the agents do not collide. In recent years, there has been a growing interest in MAPF in the Artificial Intelligence (AI) research community. This interest is partially because real-world MAPF applications, such as warehouse management, multi-robot teams, and aircraft management, are becoming more prevalent. In this overview, we discuss several possible definitions of the MAPF problem. Then, we survey MAPF algorithms, starting with fast but incomplete algorithms, then fast, complete but not optimal algorithms, and finally optimal algorithms. Then, we describe approximately optimal algorithms and conclude with non-classical MAPF and pointers for future reading and future work.
Keywords: Multi-Agent Pathfinding · Heuristic Search
1 Introduction
MAPF is the problem of finding paths for multiple agents such that every
agent reaches its desired destination and the agents do not conflict. MAPF has
real-world applications in warehouse management [50], airport towing [27], au-
tonomous vehicles, robotics [45], and digital entertainment [26].
Research on MAPF has been developing rapidly in the past decade. In this
paper, we provide an overview of MAPF research in the Artificial Intelligence
(AI) community. The purpose of this overview is to help researchers and practitioners who are less familiar with MAPF research better understand the problem and current approaches for solving it. It is not intended to serve as a comprehensive survey of MAPF research.
This overview paper is structured as follows. In Section 2, we define the
problem formally, and discuss several of its notable variants. Then, a simple
analysis of the problem is given to illustrate its difficulty. Section 3 starts by
describing prioritized planning [34], which is still the most common approach in
practice to solve MAPF problems. We discuss the limitations of this approach, in particular its lack of completeness and optimality. Then, we mention several
MAPF algorithms that are fast and complete, but may return solutions that
are not optimal. Section 4 surveys several families of MAPF algorithms that
are guaranteed to return an optimal solution. Section 5 covers approximately
⋆ Supported by ISF grant 210/17 to Roni Stern.
optimal algorithms, i.e., algorithms that guarantee the solution they return is at
most a constant factor more costly than an optimal solution. Finally, the paper
concludes with a partial list of MAPF extensions (Section 6), and pointers to
further reading and resources (Section 7). In addition, throughout this paper,
we point to interesting directions for future work.
2 Problem Definition
The literature includes multiple definitions of the MAPF problem. In this paper,
we mostly focus on what is called classical MAPF [37]. Section 6 discusses other
variants of MAPF. A classical MAPF problem with k agents is defined by a tuple ⟨G, s, t⟩ where:

– G = (V, E) is an undirected graph whose vertices are the possible locations agents may occupy, and every edge (n, n') ∈ E represents that an agent can move from n to n' without passing through any other vertex.
– s is a function that maps an agent to its initial location.
– t is a function that maps an agent to its desired destination location.
Time is discretized into time steps. In every time step, each agent can perform a single action. There are two types of actions: wait and move. An agent performing a wait action stays in its current location for one time step. A move action moves an agent from its location to some other location. A move action takes exactly one time step, and can only move an agent from its current location to one of its adjacent locations. A valid solution to a MAPF problem is a joint plan that moves all agents to their goals, in a way that the agents do not collide. Next, we define the terms valid solution, joint plan, and collision in a formal way.
Fig. 1. Illustration of different types of conflicts, taken from Stern et al. [37]: (a) a
vertex conflict, (b) a swapping conflict, (c) a following conflict, and (d) a cycle conflict.
A single-agent plan for an agent i is a sequence of actions that, if agent i performs these actions starting in location s(i), it will end up in location t(i). Formally, a single-agent plan for agent i is a sequence of actions π = (a_1, . . . , a_n) such that

a_n(· · · a_2(a_1(s(i))) · · ·) = t(i)    (1)
A joint plan is a set of single-agent plans, one for each of the k agents. For a joint plan Π, we denote by Π_i its constituent single-agent plan for agent i. A pair of agents i and j have a vertex conflict in a joint plan Π if, according to their respective single-agent plans Π_i and Π_j, both agents are planned to occupy the same vertex at the same time. Similarly, agents have a swapping conflict in a joint plan if they are planned to swap locations over the same edge at the same time. A valid solution to a MAPF problem is a joint plan that has none of these conflicts.
Some MAPF applications have stricter requirements from a valid solution, prohibiting other types of conflicts. Two notable types of conflicts are following conflicts and cycle conflicts. A following conflict occurs if an agent plans to occupy at time step t+1 a location that was occupied by some other agent at time step t. A cycle conflict occurs if a set of agents i, i+1, . . . , j plan to move in the same time step in a circular pattern, i.e., agent i plans to move at time step t+1 to the location of agent i+1 at time step t, agent i+1 plans to move at time step t+1 to the location of agent i+2 at time step t, and so on, while agent j plans to move at time step t+1 to the location of agent i at time step t. Figure 1 illustrates all these different types of conflicts. See Stern et al. [37] for a comprehensive discussion on different types of conflicts and the relationships between them.
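To make the conflict definitions concrete, the following is a minimal Python sketch (our own illustration, not from the original paper) that checks a pair of single-agent plans for vertex and swapping conflicts. It assumes each plan is represented simply as the sequence of vertices the agent occupies, one entry per time step.

```python
def has_conflict(path_i, path_j):
    """Check for vertex and swapping conflicts between two single-agent plans.

    Each plan is the sequence of vertices the agent occupies, one per time
    step; the shorter plan is padded by waiting at its last vertex (the goal).
    """
    def at(path, t):
        return path[t] if t < len(path) else path[-1]

    horizon = max(len(path_i), len(path_j))
    for t in range(horizon):
        # Vertex conflict: both agents occupy the same vertex at time t.
        if at(path_i, t) == at(path_j, t):
            return True
        # Swapping conflict: the agents traverse the same edge in opposite
        # directions between time steps t and t+1.
        if t + 1 < horizon and \
           at(path_i, t) == at(path_j, t + 1) and \
           at(path_i, t + 1) == at(path_j, t):
            return True
    return False
```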
2.1 Optimization
MAPF problems can have more than one valid solution. In many MAPF applications, one would like to find a valid solution that optimizes some objective function. The two most common objective functions used for evaluating a MAPF solution are makespan and sum of costs. The makespan of a joint plan Π, denoted M(Π), is the number of time steps until all agents reach their goal:

M(Π) = max_{1 ≤ i ≤ k} |π_i|    (2)

The sum of costs of a joint plan Π, denoted SOC(Π), is the sum of actions performed until all agents reach their goal:

SOC(Π) = Σ_{1 ≤ i ≤ k} |π_i|    (3)

Following most prior work, we assume that when an agent waits in its destination it also increases the SOC of the overall joint plan, unless that agent is not planned to move later from its destination location. For example, consider the case where agent i reaches its destination at time step t, leaves it at time step t', arrives back at its destination at time step t'', and stays there until all agents reach their destinations. Then this single-agent plan contributes t'' to the SOC of the corresponding joint plan.
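As a small illustration of these objectives, the following sketch (ours, not from the paper) computes the cost of a single-agent plan, the makespan, and the sum of costs under the convention above that trailing waits at the goal are free.

```python
def plan_cost(path):
    """Cost of a single-agent plan given as a vertex sequence, one per time step.

    Wait actions performed after the agent settles at its goal for good are not
    counted; any action before that point (including waits) is counted.
    """
    goal = path[-1]
    last_away = max((t for t, v in enumerate(path) if v != goal), default=-1)
    return last_away + 1

def makespan(paths):
    return max(plan_cost(p) for p in paths)

def sum_of_costs(paths):
    return sum(plan_cost(p) for p in paths)
```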
2.2 From Single-Agent Pathfinding To MAPF
A single-agent shortest-path problem (SPP) is the problem of finding the shortest path in a graph G = (V, E) from a given source vertex s ∈ V to a given target vertex t ∈ V. MAPF can be reduced to a shortest-path problem in a graph known as the k-agent search space. This graph, denoted G^k, is different from the single-agent graph G. A vertex in G represents a location that an agent may occupy in a particular time step. A vertex in G^k represents a set of locations, one per agent, that the agents can occupy in a particular time step. Thus, a vertex in G^k is a vector of k vertices in G. An edge in G^k represents a joint action of all agents, that is, a set of k actions, one per agent, that the agents can perform simultaneously in a particular time step. Joint actions that result in a conflict will not have a corresponding edge in G^k. The cost of an edge in G^k corresponds to the cost of the corresponding joint action.

Observation 1 A lowest-cost path in G^k from (s(1), . . . , s(k)) to (t(1), . . . , t(k)) is an optimal solution to the MAPF problem ⟨G, s, t⟩ and vice versa.
Heuristic Search and the A* Algorithm Heuristic search in general and the A* algorithm in particular [18] are commonly used to solve shortest-path problems. For completeness, we provide a brief background on A*.

A* is a best-first search algorithm. It maintains a list of vertices called Open. Initially, Open contains the source vertex. In every iteration, a single vertex is removed from Open and expanded. To expand a vertex means to go over each of its neighbors and generate it. To generate a vertex means creating it and adding it to Open, unless it has already been generated before. For every generated vertex n, A* maintains several values.

– g(n) is the cost of the lowest-cost path found so far from the source vertex to n.
– parent(n) is the vertex before n on that path.
– h(n) is a heuristic estimate of the cost of the lowest-cost path from n to the target vertex.

Let h*(n) be a perfect heuristic estimate for n, that is, the cost of the lowest-cost path from n to a goal. If h*(n) is known for all nodes, then one can find the shortest path from the source vertex to the target by always moving to the vertex with the smallest h* value. A heuristic function h is called admissible iff for every vertex n it holds that h(n) ≤ h*(n). The A* algorithm chooses to expand the vertex n in Open that has the smallest g(n) + h(n) value.
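For concreteness, the following is a minimal Python sketch of A* over an abstract graph. The callback names (`neighbors`, `cost`, `h`) are illustrative assumptions, not part of the original presentation.

```python
import heapq
import itertools

def a_star(source, target, neighbors, cost, h):
    """A minimal A* sketch.

    `neighbors(n)` yields the successors of n, `cost(n, m)` is the edge cost,
    and `h(n)` is an admissible estimate of the cost from n to the target.
    """
    tie = itertools.count()            # tie-breaker so the heap never compares vertices
    open_list = [(h(source), next(tie), source)]
    g = {source: 0}
    parent = {source: None}
    closed = set()
    while open_list:
        _, _, n = heapq.heappop(open_list)
        if n == target:
            path = []                  # reconstruct the path via parent pointers
            while n is not None:
                path.append(n)
                n = parent[n]
            return list(reversed(path))
        if n in closed:
            continue
        closed.add(n)
        for m in neighbors(n):         # expand n: generate each of its neighbors
            new_g = g[n] + cost(n, m)
            if m not in g or new_g < g[m]:
                g[m] = new_g
                parent[m] = n
                heapq.heappush(open_list, (new_g + h(m), next(tie), m))
    return None                        # the target is unreachable
```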
Theorem 1 (Optimality of A* [18]). Given an admissible heuristic, A* is guaranteed to return an optimal solution, i.e., a shortest path from the source vertex to its target.
Observation 1 and Theorem 1 mean that one can solve a given MAPF problem by running A* on the k-agent search space. A simple way to obtain an admissible heuristic for the k-agent search space is by considering the cost of the shortest path in G from every vertex v ∈ V to every target vertex t(1), . . . , t(k). This is done as follows. Let d(v, t(i)) be the cost of the shortest path from v to t(i). Computing d(v, t(i)) for every v ∈ V and i ∈ {1, . . . , k} can be done in time that is polynomial in |V| and k, at the beginning of the search. Then, the following is an admissible heuristic when optimizing for sum of costs

h(v_1, . . . , v_k) = Σ_{i ∈ {1,...,k}} d(v_i, t(i))    (4)

and the following is an admissible heuristic when optimizing makespan

h(v_1, . . . , v_k) = max_{i ∈ {1,...,k}} d(v_i, t(i))    (5)
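These heuristics can be precomputed with one backward breadth-first search per target, as in the following sketch (ours), which assumes unit-cost moves and a graph given as an adjacency dictionary in which every vertex can reach every target.

```python
from collections import deque

def distances_from(graph, target):
    """Shortest-path distances (in number of moves) from every vertex to `target`.

    `graph` maps each vertex to a list of its neighbors; a single BFS from the
    target suffices because every move action costs one time step.
    """
    dist = {target: 0}
    queue = deque([target])
    while queue:
        v = queue.popleft()
        for u in graph[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist

def build_heuristics(graph, targets):
    """Precompute d(v, t(i)) for every agent i and return the SOC and makespan heuristics."""
    dists = [distances_from(graph, t) for t in targets]
    h_soc = lambda vs: sum(dists[i][v] for i, v in enumerate(vs))
    h_makespan = lambda vs: max(dists[i][v] for i, v in enumerate(vs))
    return h_soc, h_makespan
```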
Challenges in Solving MAPF with A* A very rough way to estimate the hardness of solving a shortest-path problem, with A* and other algorithms, is by considering the size of the search space and its branching factor, which in our case correspond to the number of vertices in G^k and its average outgoing degree. Thus, in the worst case, the size of the search space is |V|^k and the branching factor is (|E|/|V|)^k. As can be seen, both values are exponential in the number of agents.

To get an estimate of these numbers, consider a MAPF problem with 20 agents on a 4-connected grid with 500 × 500 cells, i.e., |V| = 250,000. In this case, the size of the search space is 250,000^20 ≈ 9.1 · 10^107 and the branching factor is 4^20 ≈ 1.1 · 10^12, which is much larger than |V|. The exponential branching factor is especially problematic for A*, since A* must at least expand all vertices along an optimal path. The computational cost of expanding a vertex, however, is at least linear in the branching factor. Thus, textbook A* cannot be used to solve a MAPF problem with a large number of agents, even with a perfect heuristic function.
3 Fast MAPF Algorithms
A fundamental approach to address this combinatorial explosion is to try to decouple the MAPF task into k single-agent pathfinding problems with as little interaction as possible. Perhaps one of the most popular approaches to do so is prioritized planning.
3.1 Prioritized Planning
The first step in prioritized planning is to assign each agent a unique number from {1, . . . , k}. Then, a single-agent plan is found for each agent in order of their priority. When an agent searches for a plan, it is required to find a plan that avoids creating a conflict with the plans already found for agents with higher priority.

A fundamental difference between a textbook shortest-path problem and the problem of finding a plan for the agent with the i-th priority is that in the latter an optimal solution may require an agent to wait in its location. Thus, finding a plan for the i-th agent is, in fact, a shortest-path problem in a time-expansion graph [34]. In a time-expansion graph, every vertex represents a pair (v, t), where v is a vertex in the underlying graph G and t is a time step. There is an edge between vertices (v, t) and (v', t') in the time-expansion graph iff t' = t + 1 and v' is either equal to v or is one of its neighbors. The size and branching factor of the corresponding search space are manageable: the number of vertices is |V| × T, where T is an upper bound on the solution makespan, and the branching factor is |E|/|V| + 1. For example, in a MAPF problem with 20 agents on a 4-connected grid with 500 × 500 cells, assuming T = 1,000, we have a search space of 250,000,000 vertices and a branching factor of 5. A* has been successfully applied to much larger search spaces.
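The following Python sketch illustrates prioritized planning with a reservation table over the time-expansion graph. It is a simplified illustration rather than the algorithm of [34]: the graph is assumed to be an adjacency dictionary, the horizon bound is an arbitrary assumption, and a plain breadth-first search stands in for A*.

```python
from collections import deque

def plan_single_agent(graph, start, goal, reserved, horizon):
    """Breadth-first search in the time-expansion graph for one agent.

    `reserved` holds (vertex, time) pairs and (edge, time) pairs already claimed
    by higher-priority agents; `horizon` bounds the makespan of the plan.
    """
    queue = deque([(start, 0)])
    parents = {(start, 0): None}
    while queue:
        v, t = queue.popleft()
        # The agent may stop only if it can stay at its goal until the horizon.
        if v == goal and all((goal, t2) not in reserved for t2 in range(t, horizon + 1)):
            path, node = [], (v, t)
            while node is not None:
                path.append(node[0])
                node = parents[node]
            return list(reversed(path))
        if t == horizon:
            continue
        for u in [v] + graph[v]:                       # wait or move to a neighbor
            if (u, t + 1) in reserved:                 # vertex conflict
                continue
            if ((u, v), t) in reserved:                # swapping conflict
                continue
            if (u, t + 1) not in parents:
                parents[(u, t + 1)] = (v, t)
                queue.append((u, t + 1))
    return None

def prioritized_planning(graph, starts, goals, horizon=100):
    """Plan agents one by one in index order; neither complete nor optimal."""
    reserved, paths = set(), []
    for start, goal in zip(starts, goals):
        path = plan_single_agent(graph, start, goal, reserved, horizon)
        if path is None:
            return None                                # prioritized planning failed
        paths.append(path)
        for t, v in enumerate(path):
            reserved.add((v, t))                       # claim the visited vertices
        for t in range(len(path) - 1):
            reserved.add(((path[t], path[t + 1]), t))  # block the reverse traversal
        for t in range(len(path), horizon + 1):
            reserved.add((path[-1], t))                # the agent stays at its goal
    return paths
```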
The computational efficiency and simplicity of prioritized planning algorithms are the main reasons for their widespread adoption by practitioners. Implementing prioritized planning includes many design choices. For example, several methods have been proposed for setting the agents' priorities [7, 1]. The Windowed Hierarchical Cooperative A* algorithm (WHCA*) [34] also allows interleaving planning and execution in a prioritized planning framework. In WHCA*, each agent plans to avoid conflicts only for the next X time steps (the "window"). After performing these X steps, the agents re-plan the next X steps in the same manner.
Prioritized planning is a sound approach for MAPF, in the sense that it returns valid solutions. However, it is neither complete nor optimal. That is,

– Not complete. A prioritized planning algorithm may not find any solution to a solvable MAPF problem.
– Not optimal. The solution returned by a prioritized planning algorithm may not be optimal w.r.t. a given objective function (e.g., sum of costs or makespan).
Fig. 2. A MAPF problem in which prioritized planners will not find any solution.
As an example of these prioritized planning limitations, see the MAPF prob-
lem depicted in Figure 2. In this example, any prioritized planning algorithm
will fail to find a solution, regardless of which agent has a higher priority. The
problem, however, is clearly solvable, by having agent 1 move to the middle grid
cell in the upper row, allowing agent 2 to move to its target (t(2)), and then
moving to its own target (t(1)).
3.2 Complete MAPF Solvers
We say that a MAPF algorithm is fast if its worst-case time complexity is poly-
nomial in the size of the graph G, and not exponential in the number of agents.
Surprisingly, there are fast and complete algorithms for solving MAPF problems.
The most general of these is Kornhauser's algorithm [20], which is complete and runs in a worst-case time complexity of O(|V|^3). This algorithm is regarded as
complicated to implement. Thus, a variety of algorithms have been proposed that
are also fast and complete, at least for some restricted classes of MAPF prob-
lems. Below, we provide a partial list of such algorithms and classes of MAPF
problems.
The Push-and-Swap algorithm [24] and its extensions Parallel Push-and-
Swap [31] and Push-and-Rotate [11], are fast MAPF algorithms that are com-
plete for any MAPF problem in which there are at least two unoccupied vertices
in the graph. Very roughly, these algorithms work by executing a set of macro-
operators that move an agent towards its goal (push) and swap the locations of two agents (swap).
A MAPF problem is well-formed if, for any pair of agents i and j, there exists a path from s(i) to t(i) that does not pass through s(j) or t(j). Čáp et al. [9] proved that prioritized planning algorithms that are forced to avoid the other agents' start locations are complete for well-formed MAPF problems.
A MAPF problem is slidable if for any triple of locations v1, v2, and v3, there exists a path from v1 to v3 that does not go through v2. (The exact definition of slidable is slightly more involved; the interested reader can find it in Wang and Botea's paper [49].) Wang and Botea [49] proposed a fast algorithm called MAPP that is complete for slidable MAPF problems. The BIBOX algorithm is also fast and complete under these conditions [38].
While all the above algorithms are fast and, under certain conditions, com-
plete, they do not provide any guarantee regarding the quality of the solution
they return. In particular, they do not guarantee that the resulting solution is
optimal, either w.r.t. sum of costs or makespan. In fact, finding a solution that has the smallest makespan or the smallest sum of costs is NP-hard [39, 53]. Nevertheless, solution quality is important in many applications, e.g., saving operational costs in an automated warehouse. Also, modern MAPF algorithms can find provably optimal solutions in a few minutes to problems with more than a hundred agents [32, 21, 14].
In the next section, we present the state-of-the-art in MAPF algorithms
that are guaranteed to return a solution that is optimal with respect to a given
objective function. Such algorithms are referred to as optimal MAPF algorithms.
4 Optimal MAPF Solvers
It is possible to classify optimal MAPF algorithms into four high-level approaches:
– Extensions of A*. These are algorithms that search the k-agent search space using a variant of the A* algorithm.
– The Increasing Cost Tree Search [33]. This algorithm splits the MAPF problem into two problems: finding the cost added by each agent, and finding a valid solution with these costs.
– Conflict-Based Search [32]. This algorithmic family solves MAPF by solving multiple single-agent pathfinding problems. To achieve coordination, specific constraints are added incrementally to the single-agent pathfinding problems, in a way that verifies soundness, completeness, and optimality.
– Constraint programming [39, 6]. This approach compiles MAPF to a set of constraints and solves them with a general-purpose constraint solver.
4.1 Extensions of A*
Standley [36] proposed two very effective extensions to A* for solving MAPF problems.
Operator Decomposition The first extension is called Operator Decomposition (OD). OD is designed to cope with the exponential branching factor of the k-agent search space. In OD, the agents are sorted according to some arbitrary order. When expanding the source vertex (s(1), . . . , s(k)), only the actions of one agent are considered. This generates a set of vertices that represent a possible location for the first agent in time step 1, and the locations of all other agents at time step 0. These vertices are added to Open. When expanding one of these vertices, only the actions of the second agent are considered, generating a new set of vertices. These vertices represent possible locations for the first and second agents in time step 1, and the locations of all other agents at time step 0. The search continues in this way. Only the k-th descendant of the start vertex is a vertex that represents a possible location of all agents at time step 1. Vertices that represent the locations of all agents at the same time step are called full vertices, while all other vertices are called intermediate vertices. The search continues until reaching a full vertex that represents the target (t(1), . . . , t(k)).
The obvious advantage of A* with OD compared to A* without OD is the branching factor. With OD, the branching factor is that of a single agent, while without OD, it is exponential in the number of agents. However, the solution is k times deeper when using OD, since there are k vertices between any pair of full vertices. In the case of MAPF, this tradeoff is usually beneficial due to the heuristic function. A high heuristic value for an intermediate vertex can help avoid expanding the entire subtree beneath that vertex.
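A minimal sketch of OD-style successor generation is shown below. It is illustrative only: it branches on a single agent per expansion and, for brevity, checks only vertex conflicts against agents that have already committed to their next location; a full implementation would also check swapping conflicts against the agents' previous locations.

```python
def od_successors(state, graph, next_agent):
    """Operator Decomposition successor generation (a sketch).

    `state` is a tuple with one location per agent: agents 0..next_agent-1 have
    already committed to their locations for the next time step, while the
    remaining agents still refer to the current time step. Only agent
    `next_agent` is branched on here.
    """
    v = state[next_agent]
    for u in [v] + graph[v]:                 # wait or move along an edge
        if u in state[:next_agent]:          # vertex conflict with a committed agent
            continue
        yield state[:next_agent] + (u,) + state[next_agent + 1:]
```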
OD can be viewed as a special case of the Enhanced Partial Expansion A* (EPEA*) algorithm [17]. EPEA* is a variant of A* that can avoid generating some of the vertices A* would generate when expanding a vertex. For details on EPEA* and how it relates to OD, see Goldenberg et al. [17].
Independence Detection The second A* extension proposed by Standley [36] is called Independence Detection (ID). ID attempts to decouple a MAPF problem with k agents into smaller MAPF problems with fewer agents. It works as follows. First, each agent finds an optimal single-agent plan for itself while ignoring all other agents. If there is a conflict between the plans of a pair of agents, these agents are merged into a single meta-agent. Then, A*+OD is used to find an optimal solution for the two agents in this meta-agent, ignoring all other agents. This process continues iteratively: in every iteration a single conflict is detected, the conflicting (meta-)agents are merged, and the merged meta-agent is then solved optimally with A*+OD. The process stops when there are no conflicts between the agents' plans.²

In the worst case, ID will end up merging all agents into a single meta-agent and solving the resulting k-agent MAPF problem. However, in other cases, an optimal solution can be returned and guaranteed by only solving smaller MAPF problems with fewer agents. This can have a dramatic impact on runtime. ID is a very general framework for MAPF solvers, as one can replace A*+OD with any other complete and sound MAPF solver.
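The following sketch illustrates the simple ID loop. The `solve_group` and `find_conflict` callbacks are hypothetical placeholders standing in for an optimal group solver (e.g., A*+OD) and a conflict detector over the merged plans.

```python
def independence_detection(agents, solve_group, find_conflict):
    """Simple Independence Detection (a sketch).

    `solve_group(group)` returns an optimal joint plan for the agents in `group`,
    and `find_conflict(plans)` returns a pair of conflicting groups (keys of
    `plans`), or None if the combined plans are conflict-free.
    """
    plans = {(a,): solve_group([a]) for a in agents}   # start with singleton groups
    while True:
        conflict = find_conflict(plans)
        if conflict is None:
            return plans                               # conflict-free joint plan
        g1, g2 = conflict
        merged = tuple(g1) + tuple(g2)                 # merge the two meta-agents
        del plans[tuple(g1)], plans[tuple(g2)]
        plans[merged] = solve_group(list(merged))      # re-solve the merged group
```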
M* The M* algorithm [47] also searches the k-agent search space, like A*. To handle the exponential branching factor, M* dynamically changes the branching factor of the search space, as follows. Initially, whenever a vertex is expanded, it generates only a single vertex that corresponds to all agents moving one step along their own, individual, optimal paths. This generates a single path in the k-agent search space. Since the agents are following their individual optimal paths, a vertex n may be generated that represents a conflict between a pair of agents i and j. If this occurs, all the vertices along the path from the start vertex to n are re-expanded, this time generating vertices for all combinations of actions agents i and j may perform. In general, a vertex in M* stores a conflict set, which is a set of agents for which it will generate all combinations of actions. For agents not in the conflict set, M* only considers a single action – the one on their individual optimal path. Recursive M* (rM*) is a notable improved version of M*. rM* attempts to identify sets of agents in the conflict set that can be solved in a decoupled manner.

M* is similar to OD in that it limits the branching factor of some vertices. rM* also bears some similarity to ID, in that it attempts to identify which sets of agents can be solved separately. Nevertheless, rM*, OD, and ID can be used together: rM* can be used by ID to find optimal solutions to conflicting meta-agents, and rM* can search the k-agent search space with A* with OD instead of plain A*. The latter is referred to as ODrM* and was shown to be effective in some scenarios [47].
² This is actually a description of the simple ID algorithm. In the full ID algorithm, the conflicting agents attempt to individually avoid the conflict while maintaining their original solution cost.
4.2 The Increasing Cost Tree Search (ICTS)
The Increasing Cost Tree Search (ICTS) [33] algorithm does not search the k-agent search space directly. Instead, it interleaves two search processes. The first, referred to as the high-level search, aims to find the sizes of the agents' single-agent plans in an optimal solution to the given MAPF problem. The second, referred to as the low-level search, accepts a vector of plan sizes (c_1, . . . , c_k), and verifies whether there exists a valid solution (π_1, . . . , π_k) to the given MAPF problem in which the size of every single-agent plan π_i is exactly c_i.

The high-level search of ICTS is implemented as a search over the increasing cost tree (ICT). The ICT is a tree in which each node is a k-dimensional vector of non-negative values. The root of the ICT is a vector (c_1, . . . , c_k) where for every agent i, the value c_i is the size of its individual optimal path. The children of a node n in this tree are all vectors that result from adding one to one of the k elements in n. The high level of ICTS searches the ICT in a breadth-first manner. This is done to verify that the first valid solution found by the low-level search is an optimal solution.
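A sketch of this high-level search is shown below. The `low_level` callback is a hypothetical placeholder for the MDD-based low-level search described next; it returns a valid solution with exactly the given per-agent plan costs, or None.

```python
from collections import deque

def icts_high_level(optimal_costs, low_level):
    """Breadth-first search over the increasing cost tree (a sketch).

    `optimal_costs` is the vector of individual optimal plan costs (the ICT root),
    and `low_level(costs)` verifies whether a valid solution with exactly these
    per-agent plan costs exists.
    """
    root = tuple(optimal_costs)
    queue, seen = deque([root]), {root}
    while queue:
        node = queue.popleft()
        solution = low_level(node)
        if solution is not None:
            return solution                 # first hit in BFS order is optimal
        for i in range(len(node)):          # children: increase one agent's cost by 1
            child = node[:i] + (node[i] + 1,) + node[i + 1:]
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return None
```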
As mentioned above, the low-level search of ICTS accepts an ICT node (c_1, . . . , c_k) from the high-level search, and searches for a valid solution (π_1, . . . , π_k) in which ∀i : |π_i| = c_i. To do so efficiently, ICTS computes for each agent i all single-agent plans of size c_i. Generating this set of plans is done with a simple breadth-first search, and the plans are stored compactly in a Multi-valued Decision Diagram (MDD) [35]. The cross product of the agents' MDDs is a subgraph of the k-agent search space that contains all joint plans that correspond to the given ICT node. ICTS searches this cross product of MDDs for a valid solution. Since this search solves a satisfaction problem and not an optimization problem, a simple depth-first branch-and-bound is commonly used.

An effective way to speed up ICTS is to prune the ICT by quickly identifying subsets of single-agent plan costs for which there is no valid solution [33]. For example, assume an ICT node (c_1, . . . , c_k) is given to the low-level search. One can check whether there is a pair of single-agent plans for agents 1 and 2 such that their costs are c_1 and c_2, respectively, and they do not conflict. If no such pair of plans exists, then the low-level search can safely return that there is no valid solution for the corresponding ICT node. While this technique for pruning the ICT is highly effective in practice, there is no current theory about how to choose which subsets of costs to check. This is an open question for future research.
4.3 Conflict-Based Search
Conflict-Based Search (CBS) [32] is an optimal MAPF algorithm. It is unique in that it solves a MAPF problem by solving a sequence of single-agent pathfinding problems.

In more detail, CBS, similar to ICTS, runs two interleaving search processes: a low-level search and a high-level search. The CBS low-level search accepts as input an agent i and a set of constraints of the form ⟨i, v, t⟩, representing that agent i must not be at vertex v in time step t. The task of the CBS low-level search is to find the lowest-cost single-agent plan for agent i that does not violate the given set of constraints. Existing single-agent pathfinding algorithms, such as A*, can easily be adapted to serve as the CBS low-level search.

The CBS high-level search looks for a set of constraints to impose on the low-level search so that the resulting joint plan is a cost-optimal valid solution. This search is performed over the Constraint Tree (CT). The CT is a binary tree in which each node n is a pair (n.cont, n.Π) where n.cont is a set of CBS constraints and n.Π is a joint plan consistent with these constraints. A CT node n is generated by first setting its constraints and then using the CBS low-level search to find a single-agent plan for each agent that satisfies its constraints. The root of the CT is a CT node with an empty set of constraints. The objective of the high-level search is to find a node n in the CT in which n.Π is a cost-optimal valid solution.

The high-level search achieves this objective by searching the CT as follows. First, the root of the CT is generated. If the joint plan of the root has no conflict, meaning it is a valid solution, then the search returns it. Otherwise, one of the conflicts in the joint plan is chosen. Let i, j, x, and t be the pair of agents, location, and time step for which this conflict has occurred. Two new CT nodes, n_i and n_j, are generated and added as children to the root node. The CT node n_i is generated with the constraint ⟨i, x, t⟩ and the CT node n_j is generated with the constraint ⟨j, x, t⟩. The cost of a CT node is the cost of the joint plan it represents. The high-level search continues to search the CT in a best-first manner, choosing in every iteration to expand a CT node with the lowest cost. Expanding a CT node means choosing one of its conflicts and resolving it by generating two new CT nodes, each with an additional constraint, as shown above. The search halts when a CT node n is found in which n.Π has no conflicts. Then, n.Π is returned, and it is guaranteed to be optimal.
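The following sketch illustrates the structure of the CBS high level. It is an illustration rather than a faithful implementation of [32]: it uses plan length as the plan cost, assumes every agent can reach its goal without constraints, and relies on hypothetical `low_level` and `find_conflict` callbacks.

```python
import heapq
import itertools

def cbs(agents, low_level, find_conflict):
    """Conflict-Based Search high-level search (a sketch).

    `low_level(agent, constraints)` returns a lowest-cost plan for `agent` that
    respects constraints of the form (agent, vertex, time step), or None, and
    `find_conflict(plans)` returns a conflict (i, j, x, t), or None.
    """
    tie = itertools.count()                      # tie-breaker for the priority queue
    root_constraints = frozenset()
    root_plans = {a: low_level(a, root_constraints) for a in agents}
    root_cost = sum(len(p) for p in root_plans.values())
    open_list = [(root_cost, next(tie), root_constraints, root_plans)]
    while open_list:
        _, _, constraints, plans = heapq.heappop(open_list)
        conflict = find_conflict(plans)
        if conflict is None:
            return plans                         # conflict-free, hence a valid solution
        i, j, x, t = conflict
        for agent in (i, j):                     # resolve by constraining either agent
            child_constraints = constraints | {(agent, x, t)}
            child_plans = dict(plans)
            child_plans[agent] = low_level(agent, child_constraints)
            if child_plans[agent] is None:
                continue                         # no plan satisfies these constraints
            child_cost = sum(len(p) for p in child_plans.values())
            heapq.heappush(open_list,
                           (child_cost, next(tie), child_constraints, child_plans))
    return None
```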
CBS has many extensions and improvements. Meta-agent CBS [32] is a generalization of CBS in which, instead of adding new constraints to resolve a conflict between two agents, the algorithm may choose to merge the conflicting agents into a single meta-agent. Improved CBS [8] attempts to reduce the size of the CT by intelligently choosing which conflict to resolve in every iteration. HCBS [14] adds an admissible heuristic to the high-level search to prune more nodes from the CT. Recent work suggested a different scheme for resolving conflicts. For a conflict in location x at time t between agents i and j, they proposed to generate three CT nodes: one with a constraint that agent i must occupy x at time t, one with a constraint that agent j must occupy x at time t, and one with a constraint that neither agent i nor agent j can occupy x at time t. The benefit of this three-way split is that the sets of solutions that satisfy these constraints are disjoint.
4.4 Constraint Programming
Constraint Programming (CP) is a problem-solving paradigm in which one models a given problem as a Constraint Satisfaction Problem (CSP) or a Constraint Optimization Problem (COP), and then uses a general-purpose constraint solver to find a solution. A notable special case of CP is to model a problem as a Boolean Satisfiability (SAT) problem, which is a special case of CSP, and use a general-purpose SAT solver.

CP is a very general paradigm because many problems, including MAPF, can be modeled as a CSP or a COP. The major benefit of using CP is that current general-purpose constraint solvers are very efficient and are constantly getting better. In particular, modern SAT solvers are extremely efficient, solving SAT problems with over a million variables.

A common approach for finding a solution with optimal makespan to a given MAPF problem with CP is by splitting the problem into two problems: (1) finding a valid solution whose makespan is equal to or smaller than a given bound T, and (2) finding a value of T that is equal to the optimal makespan. Next, we provide a brief description of this approach.
Finding a valid solution for a given makespan bound For every triplet of agent a, vertex v ∈ V, and time step t, we define a Boolean variable X_{a,v,t}. Setting X_{a,v,t} to true means that a is planned to occupy v at time t. The constraints imposed on these variables ensure that:

1. Each agent occupies one vertex in each time step. For every time step and agent there is exactly one variable X_{a,v,t} that is assigned true.
2. No conflicts. For every time step and location, there is at most one variable X_{a,v,t} that is assigned true.³
3. Agents start and end in the desired locations. For every agent i, X_{i,s(i),1} and X_{i,t(i),T} are true.
4. Agents move along edges. For every time step t before T, agent i, and pair of vertices v and v', if the variables X_{i,v,t} and X_{i,v',t+1} are both true then either v = v' or there is an edge (v, v') ∈ E.

Any assignment of values to the variables X_{a,v,t} that satisfies these constraints corresponds to a valid solution to our MAPF problem whose makespan is at most T.
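A minimal sketch of such an encoding is shown below. It is our own illustration: it emits DIMACS-style clauses (lists of signed integers), uses 0-indexed time steps, and omits the swapping-conflict constraints discussed in the footnote. In practice the resulting clauses would be handed to an off-the-shelf SAT solver.

```python
import itertools

def encode_mapf_sat(vertices, edges, starts, goals, T):
    """Build clauses for the makespan-bounded MAPF decision problem (a sketch).

    A clause is a list of integers: a positive literal asserts X_{a,v,t} and a
    negative literal negates it. Only vertex conflicts are forbidden here.
    """
    agents = range(len(starts))
    times = range(T + 1)
    var = {}
    def x(a, v, t):                                  # variable id for X_{a,v,t}
        return var.setdefault((a, v, t), len(var) + 1)

    neighbors = {v: {v} for v in vertices}           # waiting is always allowed
    for (u, w) in edges:
        neighbors[u].add(w)
        neighbors[w].add(u)

    clauses = []
    for a in agents:
        for t in times:
            clauses.append([x(a, v, t) for v in vertices])      # at least one vertex
            for v, w in itertools.combinations(vertices, 2):    # at most one vertex
                clauses.append([-x(a, v, t), -x(a, w, t)])
    for t in times:
        for v in vertices:                                      # no vertex conflicts
            for a, b in itertools.combinations(agents, 2):
                clauses.append([-x(a, v, t), -x(b, v, t)])
    for a in agents:
        clauses.append([x(a, starts[a], 0)])                    # start location
        clauses.append([x(a, goals[a], T)])                     # goal location
    for a in agents:
        for t in range(T):
            for v in vertices:                                  # move along edges or wait
                clauses.append([-x(a, v, t)] +
                               [x(a, w, t + 1) for w in neighbors[v]])
    return var, clauses
```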
Finding the optimal makespan To find the optimal makespan, we start by setting T to be a lower bound on the optimal makespan. Such a lower bound can be easily obtained by taking the maximum over the agents' individual shortest-path costs to their goals. Then, a constraint solver is used to search for a solution to the CSP defined above. If a solution has been found, we have found an optimal solution. If not, T is incremented by one, and the constraint solver is used again to solve the new CSP. This process continues until an optimal solution is found.
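The incremental search over T can be sketched as follows; `encode` and `solve_sat` are hypothetical callbacks, e.g., the encoder sketched above together with any off-the-shelf SAT solver.

```python
def optimal_makespan_via_sat(encode, solve_sat, lower_bound):
    """Find the optimal makespan by growing the bound T (a sketch).

    `encode(T)` builds the clauses for makespan bound T and `solve_sat(clauses)`
    returns a satisfying assignment or None. The first satisfiable T is optimal.
    """
    T = lower_bound
    while True:
        _, clauses = encode(T)
        assignment = solve_sat(clauses)
        if assignment is not None:
            return T, assignment
        T += 1
```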
Finding a solution with optimal sum of costs is also possible with CP, but it requires some additional constraints and changes to the process [42, 6].
³ Actually, this constraint only prevents vertex conflicts. To prevent swapping conflicts, an additional constraint is needed: for every time step t before T, pair of agents a and a', and pair of adjacent locations v and v', if the variables X_{a,v,t} and X_{a,v',t+1} are both true then the variables X_{a',v',t} and X_{a',v,t+1} must not both be true.
It is important to note that the above is not the only way to solve MAPF with CP. Surynek explored five different ways to model MAPF using SAT, showing how different modeling choices impact the SAT solver's runtime [40]. Barták et al. [6] modeled several variants of MAPF using Picat [54], a higher-level CP language. A CP model written in Picat can be automatically compiled and solved with either a SAT solver, a CP solver, or a Mixed Integer Linear Programming (MILP) solver [44]. They showed that different models and solvers are effective for different MAPF variants and problems. Still, how to choose the best model and solver for a given MAPF problem is, to date, an open question.

It is worth noting that solving MAPF with CP is, in itself, a special case of a more general approach for solving MAPF in which one compiles MAPF to a different problem and solves it with an algorithm designed for that problem. Prominent examples are MAPF compilation to Answer Set Programming (ASP) [13], to SAT Modulo Theories (SMT) [41], and to multi-commodity network flow [52]. Such MAPF algorithms are sometimes referred to as reduction-based MAPF solvers [15].
4.5 Summary of Optimal Solvers
Unfortunately, there are no clear guidelines to predict which of the MAPF algorithms detailed above would work best for a given MAPF problem. Prior work suggested the following rules of thumb:

– A*-based and CP approaches are effective for small graphs that are dense with agents.
– CBS and ICTS are effective for large graphs.

However, these rules of thumb have not been grounded theoretically and their empirical support is weak. We expect that future work will explore automated methods to select the best solver to use for a given problem. Another appealing direction for future work is to create hybrid algorithms that enjoy the complementary benefits of different MAPF solvers.
5 Approximately Optimal Solvers
While modern optimal MAPF algorithms have pushed the state of the art impressively, there are still many MAPF problems that current algorithms cannot solve optimally in reasonable time. In such cases, one can always use one of the fast MAPF algorithms described in Section 3, but that would mean the solution returned may be very costly.

Approximately optimal MAPF algorithms, also known as bounded-suboptimal algorithms, lie in the range between these algorithms and optimal algorithms. An approximately optimal algorithm is an algorithm that accepts a parameter ε > 0 and returns a solution whose cost is at most 1 + ε times the cost of an optimal solution. Ideally, an approximately optimal algorithm would return solutions faster when increasing ε, thus providing a controlled trade-off between
runtime and solution quality. Approximately optimal MAPF algorithms have
been proposed based on each of the optimal MAPF approaches described in the
previous section. We describe them briefly below.
5.1 A*-based
Creating an approximately optimal version of an A*-based MAPF algorithm is straightforward, since there are many approximately optimal A*-based algorithms in the heuristic search literature. Perhaps the most well-known approximately optimal A*-based algorithm is Weighted A* [30], which is a best-first search that uses the g + (1 + ε) · h evaluation function to choose which node to expand in every iteration. All A*-based MAPF algorithms can use the same evaluation function and obtain the same guarantee: that the solution cost is at most 1 + ε times the cost of an optimal solution. Such a variant was mentioned explicitly for M* [47], under the name inflated M*.
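In terms of the A* sketch given earlier (Section 2.2), Weighted A* only changes the priority used when pushing nodes onto Open; a minimal illustration (our own):

```python
def weighted_priority(g_value, h_value, epsilon):
    """Weighted A* node priority: expanding by g + (1 + epsilon) * h yields a
    solution whose cost is at most (1 + epsilon) times the optimal cost."""
    return g_value + (1 + epsilon) * h_value

# In the earlier A* sketch, replacing the pushed priority `new_g + h(m)` with
# `weighted_priority(new_g, h(m), epsilon)` gives the bounded-suboptimal variant.
```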
An interesting direction for future work is to use more modern A*-based approximately optimal algorithms to improve the performance of approximately optimal A*-based MAPF algorithms. Explicit Estimation Search (EES) [43] and Dynamic Potential Search (DPS) [16] are examples of such approximately optimal A*-based algorithms.
5.2 ICTS
To the best of our knowledge, there is no approximately optimal ICTS-based algorithm for classical MAPF. The challenge in creating such an algorithm is that the ICTS high-level search is done in a breadth-first manner. Thus, there is no heuristic to inflate, preventing the straightforward application of Weighted A* and other approximately optimal search algorithms.

However, there is an approximately optimal variant of ICTS for MAPF problems in which moving an agent across different edges can have different costs. This algorithm is based on the Extended ICTS (eICTS) algorithm [48], which is an ICTS-based algorithm designed for this type of MAPF problem. In eICTS, each ICT node is associated with a lower and an upper bound. The high-level search in this case becomes a best-first search on the lower bound, and the low-level search looks for optimal solutions within these bounds. This allows creating an approximately optimal version of eICTS called wICTS, in which suboptimality is added to both the high-level and the low-level search.
5.3 CBS
Enhanced CBS (ECBS) [4] is an approximately optimal MAPF algorithm that is based on CBS. It introduces suboptimality in the low-level search and in the high-level search. The low-level search in CBS can be any optimal shortest-path algorithm, such as A*. As noted above, there are several approximately optimal algorithms that are based on A*, including Weighted A* [30], EES [43], and DPS [16]. Thus, introducing suboptimality to the low-level search can be done by simply using one of these approximately optimal algorithms.

Introducing suboptimality to the high-level search is slightly more involved. To do so, ECBS uses a focal search framework for its high-level search. Focal search is a heuristic search framework introduced by Pearl and Kim [28] in which the node expanded in every iteration is chosen from a subset of nodes called FOCAL. FOCAL contains all nodes in Open that may lead to a solution that may be approximately optimal. To choose which node to expand from FOCAL, a secondary heuristic can be used. Importantly, this heuristic can be inadmissible and domain-dependent. ECBS uses the focal search framework, and uses a MAPF-specific secondary heuristic that prioritizes CT nodes with fewer conflicts. For details, see Barer et al. [4]. Later work proposed an extension to ECBS in which user-defined paths called highways are prioritized to further improve runtime [10].
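The FOCAL selection rule can be sketched as follows (an illustration, not the ECBS implementation); `f` is the admissible cost estimate of a node and `secondary` is the possibly inadmissible tie-breaking heuristic, e.g., the number of conflicts in a CT node.

```python
def choose_from_focal(open_nodes, f, epsilon, secondary):
    """Focal-search node selection (a sketch).

    FOCAL contains every node whose f-value is within a factor (1 + epsilon) of
    the best f-value in Open; among those, the node minimizing the secondary
    heuristic is expanded next.
    """
    f_min = min(f(n) for n in open_nodes)
    focal = [n for n in open_nodes if f(n) <= (1 + epsilon) * f_min]
    return min(focal, key=secondary)
```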
5.4 Constraint Programming
eMDD-SAT is a recently proposed approximately optimal MAPF algorithm from the CP family. This algorithm models MAPF as a SAT problem. It follows the high-level approach we described in Section 4.4, except that it is designed for (approximately) optimizing SOC rather than makespan.

In a very high-level manner, eMDD-SAT works by creating a SAT model that allows solutions with longer makespan and larger SOC. The suboptimality is controlled by how much larger the SOC may be compared to a computed SOC lower bound. To the best of our knowledge, there is no approximately optimal MAPF algorithm from this family that is designed for finding solutions with approximately optimal makespan.

In general, significantly less effort has been dedicated, to date, to developing approximately optimal MAPF algorithms. However, existing approximately optimal MAPF algorithms demonstrate that adding even a very small amount of suboptimality can allow solving much larger problems. For example, ECBS with at most 1% suboptimality is able to solve MAPF problems with 250 agents on large maps [4].
6 Beyond Classical MAPF
The scope of this overview is mostly limited to what is referred to as classical MAPF [37]. Classical MAPF assumes that (1) every action takes exactly one time step, (2) time is discretized into time steps, as opposed to being continuous, and (3) each agent occupies exactly one vertex. These assumptions do not necessarily hold in real-world MAPF applications. With the maturity of classical MAPF algorithms, recent years have also seen work exploring MAPF problems that relax these assumptions. Below, we provide a partial overview of these efforts.
6.1 Beyond One-Time Step Actions
The eICTS algorithm [48] mentioned above is designed for actions that may require more than one time step. Such a setting is sometimes called MAPF with non-unit edge costs. Adapting the CBS algorithm to non-unit edge cost settings is straightforward, as it only requires changing the conflict-detection step.

Barták et al. [5] proposed a CP-based algorithm for MAPF with non-unit edge costs. Their model uses scheduling constraints to support actions with different durations.
6.2 Beyond Discrete Time Steps
Time is continuous, and thus every time-step discretization is, by definition, an abstraction of the real world. In the context of MAPF, this abstraction may lead to suboptimality and even incompleteness.

As long as the agents do not need to wait, there is no need to directly deal with this problem: the duration of a move action depends on the time required to traverse the corresponding edge. However, when an agent needs to wait and time is not discretized, then each agent has an infinite number of possible wait actions in each vertex.
The key technique used so far to address this problem is the Safe Interval Path Planning (SIPP) [29] algorithm. SIPP is a single-agent pathfinding algorithm that is designed to avoid moving obstacles. Since the obstacles are moving, the single agent may choose to wait in its location, which raises again the challenge of dealing with continuous time. SIPP addresses this challenge by identifying safe intervals in which the agent can occupy each vertex, and running an A* search over (vertex, safe interval) pairs. Andreychuk et al. showed how to use SIPP to solve MAPF problems with continuous time, in a prioritized planning framework [51] and in a CBS framework [2].
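The core idea of computing safe intervals can be sketched as follows. This is an illustration only: the interval representation is our own assumption, and SIPP itself does considerably more than this.

```python
def safe_intervals(occupied_intervals, horizon):
    """Safe intervals of a single vertex (a sketch).

    `occupied_intervals` is a sorted list of (start, end) times during which the
    vertex is occupied by a moving obstacle or a higher-priority agent; the
    result is the list of maximal gaps in which an agent may safely stay there.
    """
    intervals, t = [], 0.0
    for start, end in occupied_intervals:
        if start > t:
            intervals.append((t, start))
        t = max(t, end)
    if t < horizon:
        intervals.append((t, horizon))
    return intervals

# Example: safe_intervals([(2.0, 3.5), (6.0, 7.0)], 10.0)
# returns [(0.0, 2.0), (3.5, 6.0), (7.0, 10.0)]
```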
Surynek [41] recently proposed to use a CP-related approach for continuous time. Instead of modeling the problem as a CSP or SAT problem, Surynek proposed to model it as a SAT Modulo Theories (SMT) problem, and then apply an SMT solver.
6.3 Beyond One-Agent per Vertex
The graph G of possible locations in classical MAPF is an abstraction of the real world the agents are moving in. Arguably, in most real-world MAPF applications the agents are moving in Euclidean space and have some geometric shape. Thus, two agents may conflict even if they stop in different locations, because their geometric shapes overlap. Li et al. [22] referred to this as MAPF with large agents. In such settings, an agent may "occupy" multiple vertices, and a move action may create a conflict with agents occupying multiple vertices.

Li et al. [22] proposed a CBS-based algorithm for addressing this setting. They showed how to design suitable constraints for large agents and proposed an admissible heuristic to speed up the search. They also described an A*-based algorithm and a SAT-based algorithm for this setting. Atzmon et al. [12] proposed another CBS-based algorithm that can consider agents of arbitrary shape, even without a reference point that is stable under rotations.
Robustness and Kinematic Constraints Even if an agent only occupies a single vertex, it is still desirable in many scenarios to add a buffer around each agent to further minimize the chance of collisions. Such a buffer can be either spatial or temporal. A prime motivation for having such a buffer is to account for the inherent uncertainty during the execution of the solution. That is, to have the agents' joint plan be valid and executable even if some agents do not fully follow it.

The MAPF-POST [19] algorithm was designed to address such requirements. MAPF-POST accepts as input a solution to a classical MAPF problem and adapts it to consider safety and kinematic constraints. A limitation of MAPF-POST is that it does not retain any guarantee on solution quality. For adding robustness to temporal delays during execution, Atzmon et al. [3] proposed an optimal CBS-based algorithm and a CP-based algorithm.
6.4 Beyond One-Shot MAPF
In addition, classical MAPF is a one-shot, offline problem. In some MAPF applications, there is a sequence of related MAPF problems that are solved sequentially. Some recent work also addresses several types of online MAPF settings. This includes settings where there is a fixed set of agents and a stream of pathfinding tasks [25], as well as settings where new agents appear over time but each agent has a single navigation task [46]. The former setting is referred to as the MAPF warehouse model and the latter as the MAPF intersection model.

Also, so far we have assumed the allocation of agents to goals is given. In the Multi-Agent Pickup-and-Delivery (MAPD) problem, this is not the case [23]. In MAPD, there is a fixed set of agents that need to solve a batch of pickup-and-delivery tasks. A MAPD algorithm needs to plan paths without conflicts, and also to allocate which agent should go to which destination.
7 Conclusion
This paper provides an overview of the current research on Multi-Agent Path Finding (MAPF). After providing several definitions of MAPF, we presented polynomial-time algorithms for solving the problem. Then, a range of algorithms was described that return optimal solutions. These algorithms can be split into four families: A*-based, ICTS, CBS, and CP. Following that, we described how to transform several of these optimal algorithms into approximately optimal algorithms, allowing trading solution quality for runtime. Finally, we presented some extensions of classical MAPF, including non-unit edge costs, continuous time, large agents, and online MAPF. Throughout this paper, we suggested several directions for future work.

It is our hope that this paper will be useful to both researchers and practitioners looking for a brief introduction to MAPF. For formal definitions of MAPF variants and benchmarks, see [37]. For additional MAPF-related resources, including pointers to publications and additional tutorials, see the http://mapf.info web site, created by Sven Koenig's group.
References
1. Andreychuk, A., Yakovlev, K.: Two techniques that enhance the performance of
multi-robot prioritized path planning. In: International Conference on Autonomous
Agents and MultiAgent Systems (AAMAS). pp. 2177–2179 (2018)
2. Andreychuk, A., Yakovlev, K., Atzmon, D., Stern, R.: Multi-agent pathfinding
with continuous time. In: International Joint Conference on Artificial Intelligence
(IJCAI). pp. 39–45 (2019)
3. Atzmon, D., Stern, R., Felner, A., Wagner, G., Barták, R., Zhou, N.F.: Robust
multi-agent path finding. In: International Conference on Autonomous Agents and
Multi Agent Systems (AAMAS). pp. 1862–1864 (2018)
4. Barer, M., Sharon, G., Stern, R., Felner, A.: Suboptimal variants of the conflict-
based search algorithm for the multi-agent pathfinding problem. In: Symposium
on Combinatorial Search (SoCS) (2014)
5. Barták, R., Švancara, J., Vlk, M., et al.: A scheduling-based approach to multi-
agent path finding with weighted and capacitated arcs. In: International Conference
on Autonomous Agents and MultiAgent Systems (AAMAS). pp. 748–756. Inter-
national Foundation for Autonomous Agents and Multiagent Systems (AAMAS)
(2018)
6. Barták, R., Zhou, N., Stern, R., Boyarski, E., Surynek, P.: Modeling and solving
the multi-agent pathfinding problem in picat. In: IEEE International Conference
on Tools with Artificial Intelligence (ICTAI). pp. 959–966 (2017)
7. Bnaya, Z., Felner, A.: Conflict-oriented windowed hierarchical cooperative A*. In:
IEEE International Conference on Robotics and Automation (ICRA). pp. 3743–
3748 (2014)
8. Boyarski, E., Felner, A., Stern, R., Sharon, G., Tolpin, D., Betzalel, O., Shimony,
E.: ICBS: improved conflict-based search algorithm for multi-agent pathfinding.
In: International Joint Conference on Artificial Intelligence (IJCAI) (2015)
9. Čáp, M., Vokřínek, J., Kleiner, A.: Complete decentralized method for on-line
multi-robot trajectory planning in well-formed infrastructures. In: International
Conference on Automated Planning and Scheduling (ICAPS) (2015)
10. Cohen, L., Uras, T., Koenig, S.: Feasibility study: Using highways for bounded-
suboptimal multi-agent path finding. In: Symposium on Combinatorial Search
(SoCS) (2015)
11. De Wilde, B., Ter Mors, A.W., Witteveen, C.: Push and rotate: a complete multi-
agent pathfinding algorithm. Journal of Artificial Intelligence Research 51, 443–492
(2014)
12. Atzmon, D., Diei, A., Rave, D.: Multi-train path finding. In: Symposium on Com-
binatorial Search (SoCS) (2019)
13. Erdem, E., Kisa, D.G., Oztok, U., Schüller, P.: A general formal framework for
pathfinding problems with multiple agents. In: AAAI Conference on Artificial In-
telligence (2013)
14. Felner, A., Li, J., Boyarski, E., Ma, H., Cohen, L., Kumar, T.S., Koenig, S.: Adding
heuristics to conflict-based search for multi-agent path finding. In: International
Conference on Automated Planning and Scheduling (ICAPS) (2018)
15. Felner, A., Stern, R., Shimony, S.E., Boyarski, E., Goldenberg, M., Sharon, G.,
Sturtevant, N.R., Wagner, G., Surynek, P.: Search-based optimal solvers for the
multi-agent pathfinding problem: Summary and challenges. In: Symposium on
Combinatorial Search (SoCS). pp. 29–37 (2017)
16. Gilon, D., Felner, A., Stern, R.: Dynamic potential search – a new bounded suboptimal search. In: Symposium on Combinatorial Search (SoCS) (2016)
17. Goldenberg, M., Felner, A., Sturtevant, N.R., Holte, R.C., Schaeffer, J.: Optimal-generation variants of EPEA*. In: SoCS (2013)
18. Hart, P.E., Nilsson, N.J., Raphael, B.: A formal basis for the heuristic determina-
tion of minimum cost paths. IEEE Transactions on Systems Science and Cyber-
netics SSC-4(2), 100–107 (1968)
19. Hönig, W., Kumar, T., Cohen, L., Ma, H., Xu, H., Ayanian, N., Koenig, S.: Sum-
mary: multi-agent path finding with kinematic constraints. In: International Joint
Conference on Artificial Intelligence (IJCAI). pp. 4869–4873 (2017)
20. Kornhauser, D., Miller, G., Spirakis, P.: Coordinating pebble motion on graphs, the
diameter of permutation groups, and applications. In: Symposium on Foundations
of Computer Science. pp. 241–250. IEEE (1984)
21. Li, J., Harabor, D., Stuckey, P., Felner, A., Ma, H., Koenig, S.: Disjoint splitting for
multi-agent path finding with conflict-based search. In: International Conference
on Automated Planning and Scheduling (ICAPS) (2019)
22. Li, J., Surynek, P., Felner, A., Ma., H., Kumar, T.K.S., Koenig, S.: Multi-agent
path finding for large agents. In: AAAI Conference on Artificial Intelligence (2019)
23. Liu, M., Ma, H., Li, J., Koenig, S.: Task and path planning for multi-agent pickup
and delivery. In: International Conference on Autonomous Agents and MultiAgent
Systems (AAMAS). pp. 1152–1160 (2019)
24. Luna, R., Bekris, K.E.: Efficient and complete centralized multi-robot path plan-
ning. In: IEEE/RSJ International Conference on Intelligent Robots and Systems.
pp. 3268–3275 (2011)
25. Ma, H., Li, J., Kumar, T., Koenig, S.: Lifelong multi-agent path finding for online
pickup and delivery tasks. In: Conference on Autonomous Agents and MultiAgent
Systems (AAMAS). pp. 837–845 (2017)
26. Ma, H., Yang, J., Cohen, L., Kumar, T.K.S., Koenig, S.: Feasibility study: Moving
non-homogeneous teams in congested video game environments. In: Conference on
Artificial Intelligence and Interactive Digital Entertainment (AIIDE). pp. 270–272
(2017)
27. Morris, R., Pasareanu, C.S., Luckow, K.S., Malik, W., Ma, H., Kumar, T.S.,
Koenig, S.: Planning, scheduling and monitoring for airport surface operations.
In: AAAI Workshop: Planning for Hybrid Systems (2016)
28. Pearl, J., Kim, J.H.: Studies in semi-admissible heuristics. IEEE transactions on
pattern analysis and machine intelligence PAMI-4, 392–399 (1982)
29. Phillips, M., Likhachev, M.: SIPP: Safe interval path planning for dynamic environ-
ments. In: IEEE International Conference on Robotics and Automation (ICRA).
pp. 5628–5635 (2011)
30. Pohl, I.: Heuristic search viewed as path finding in a graph. Artificial intelligence
1(3-4), 193–204 (1970)
31. Sajid, Q., Luna, R., Bekris, K.E.: Multi-agent pathfinding with simultaneous exe-
cution of single-agent primitives. In: SoCS (2012)
32. Sharon, G., Stern, R., Felner, A., Sturtevant, N.R.: Conflict-based search for opti-
mal multi-agent pathfinding. Artificial Intelligence 219, 40–66 (2015)
33. Sharon, G., Stern, R., Goldenberg, M., Felner, A.: The increasing cost tree search
for optimal multi-agent pathfinding. Artificial Intelligence 195, 470–495 (2013)
34. Silver, D.: Cooperative pathfinding. In: AIIDE. vol. 1, pp. 117–122 (2005)
35. Srinivasan, A., Ham, T., Malik, S., Brayton, R.K.: Algorithms for discrete function
manipulation. In: IEEE International conference on computer-aided design. pp.
92–95 (1990)
36. Standley, T.S.: Finding optimal solutions to cooperative pathfinding problems. In:
AAAI Conference on Artificial Intelligence. pp. 173–178 (2010)
37. Stern, R., Sturtevant, N.R., Felner, A., Koenig, S., Ma, H., Walker, T.T., Li,
J., Atzmon, D., Cohen, L., Kumar, T.S., Boyarski, E., Barták, R.: Multi-agent
pathfinding: Definitions, variants, and benchmarks. In: Symposium on Combina-
torial Search (SoCS) (2019)
38. Surynek, P.: A novel approach to path planning for multiple robots in bi-connected
graphs. In: IEEE International Conference on Robotics and Automation (ICRA).
pp. 3613–3619 (2009)
39. Surynek, P.: An optimization variant of multi-robot path planning is intractable.
In: AAAI (2010)
40. Surynek, P.: Makespan optimal solving of cooperative path-finding via reductions
to propositional satisfiability. arXiv preprint arXiv:1610.05452 (2016)
41. Surynek, P.: Multi-agent path finding with continuous time viewed through satis-
fiability modulo theories (smt). arXiv preprint arXiv:1903.09820 (2019)
42. Surynek, P., Felner, A., Stern, R., Boyarski, E.: Efficient sat approach to multi-
agent path finding under the sum of costs objective. In: European Conference on
Artificial Intelligence (ECAI). pp. 810–818 (2016)
43. Thayer, J.T., Ruml, W.: Bounded suboptimal search: A direct approach using
inadmissible estimates. In: International Joint Conference on Artificial Intelligence
(IJCAI) (2011)
44. Van Roy, T.J., Wolsey, L.A.: Solving mixed integer programming problems using
automatic reformulation. Operations Research 35(1), 45–57 (1987)
45. Veloso, M.M., Biswas, J., Coltin, B., Rosenthal, S.: Cobots: Robust symbiotic au-
tonomous mobile service robots. In: IJCAI. p. 4423 (2015)
46. Švancara, J., Vlk, M., Stern, R., Atzmon, D., Barták, R.: Online multi-agent
pathfinding. In: AAAI Conference on Artificial Intelligence (2019)
47. Wagner, G., Choset, H.: Subdimensional expansion for multirobot path planning.
Artificial Intelligence 219, 1–24 (2015)
48. Walker, T.T., Sturtevant, N.R., Felner, A.: Extended increasing cost tree search for
non-unit cost domains. In: International Joint Conference on Artificial Intelligence
(IJCAI). pp. 534–540 (2018)
49. Wang, K.H.C., Botea, A.: MAPP: a scalable multi-agent path planning algorithm
with tractability and completeness guarantees. Journal of Artificial Intelligence
Research 42, 55–90 (2011)
50. Wurman, P.R., D’Andrea, R., Mountz, M.: Coordinating hundreds of cooperative,
autonomous vehicles in warehouses. AI Magazine 29(1), 9 (2008)
51. Yakovlev, K., Andreychuk, A.: Any-angle pathfinding for multiple agents based on
SIPP algorithm. In: International Conference on Automated Planning and Schedul-
ing (ICAPS). pp. 586–593 (2017)
52. Yu, J., LaValle, S.M.: Multi-agent path planning and network flow. In: Algorithmic
Foundations of Robotics X. pp. 157–173 (2013)
53. Yu, J., LaValle, S.M.: Structure and intractability of optimal multi-robot path
planning on graphs. In: AAAI (2013)
54. Zhou, N., Kjellerstrand, H., Fruhman, J.: Constraint Solving and Planning with
Picat. Springer Briefs in Intelligent Systems, Springer (2015)