Multi-Agent Path Finding – an Overview⋆
Roni Stern [0000-0003-0043-8179]
Ben Gurion University of the Negev, Be’er Sheva, Israel
sternron@post.bgu.ac.il
Abstract. Multi-Agent Pathfinding (MAPF) is the problem of finding paths for multiple agents such that every agent reaches its goal and the agents do not collide. In recent years, there has been a growing interest in MAPF in the Artificial Intelligence (AI) research community. This interest is partially because real-world MAPF applications, such as warehouse management, multi-robot teams, and aircraft management, are becoming more prevalent. In this overview, we discuss several possible definitions of the MAPF problem. Then, we survey MAPF algorithms, starting with fast but incomplete algorithms, then fast, complete but not optimal algorithms, and finally optimal algorithms. Then, we describe approximately optimal algorithms and conclude with non-classical MAPF and pointers for future reading and future work.
Keywords: Multi-Agent Pathfinding · Heuristic Search
1 Introduction
MAPF is the problem of finding paths for multiple agents such that every
agent reaches its desired destination and the agents do not conflict. MAPF has
real-world applications in warehouse management [50], airport towing [27], au-
tonomous vehicles, robotics [45], and digital entertainment [26].
Research on MAPF has been developing rapidly in the past decade. In this
paper, we provide an overview of MAPF research in the Artificial Intelligence
(AI) community. The purpose of this overview is to help researchers and practitioners who are less familiar with MAPF research better understand the problem and current approaches for solving it. It is not intended to serve as a comprehensive survey of MAPF research.
This overview paper is structured as follows. In Section 2, we define the
problem formally, and discuss several of its notable variants. Then, a simple
analysis of the problem is given to illustrate its difficulty. Section 3 starts by
describing prioritized planning [34], which is still the most common approach in
practice to solve MAPF problems. We discuss the limitations of this approach, in particular its lack of completeness and optimality. Then, we mention several
MAPF algorithms that are fast and complete, but may return solutions that
are not optimal. Section 4 surveys several families of MAPF algorithms that
are guaranteed to return an optimal solution. Section 5 covers approximately
⋆ Supported by ISF grant 210/17 to Roni Stern.
optimal algorithms, i.e., algorithms that guarantee the solution they return is at
most a constant factor more costly than an optimal solution. Finally, the paper
concludes with a partial list of MAPF extensions (Section 6), and pointers to
further reading and resources (Section 7). In addition, throughout this paper,
we point to interesting directions for future work.
2 Problem Definition
The literature includes multiple definitions of the MAPF problem. In this paper,
we mostly focus on what is called classical MAPF [37]. Section 6 discusses other
variants of MAPF. A classical MAPF problem with k agents is defined by a tuple ⟨G, s, t⟩ where:

– G = (V, E) is an undirected graph whose vertices are the possible locations agents may occupy, and every edge (n, n') ∈ E represents that an agent can move from n to n' without passing through any other vertex.
– s is a function that maps an agent to its initial location.
– t is a function that maps an agent to its desired destination location.
Time is discretized into time steps. In every time step, each agent can perform a single action. There are two types of actions: wait and move. An agent performing a wait action stays in its current location for one time step. A move action moves an agent from its location to some other location. A move action takes exactly one time step, and can only move an agent from its current location to one of its adjacent locations. A valid solution to a MAPF problem is a joint plan that moves all agents to their goals, in a way that the agents do not collide. Next, we define the terms valid solution, joint plan, and collision in a formal way.
Fig. 1. Illustration of different types of conflicts, taken from Stern et al. [37]: (a) a
vertex conflict, (b) a swapping conflict, (c) a following conflict, and (d) a cycle conflict.
A single-agent plan for an agent i is a sequence of actions that, if agent i performs these actions starting in location s(i), it will end up in location t(i). Formally, a single-agent plan for agent i is a sequence of actions π = (a_1, . . . , a_n) such that

a_n(· · · a_2(a_1(s(i))) · · ·) = t(i)    (1)
A joint plan is a set of single-agent plans, one for each of the k agents. For a joint plan Π, we denote by Π_i its constituent single-agent plan for agent i. A pair of agents i and j have a vertex conflict in a joint plan Π if, according to their respective single-agent plans Π_i and Π_j, both agents are planned to occupy the same vertex at the same time. Similarly, agents have a swapping conflict in a joint plan if they are planned to swap locations over the same edge at the same time. A valid solution to a MAPF problem is a joint plan that has none of these conflicts.
Some MAPF applications have stricter requirements from a valid solution, prohibiting other types of conflicts. Two notable types of conflicts are following conflicts and cycle conflicts. A following conflict occurs if an agent plans to occupy at time step t+1 a location that was occupied by some other agent at time step t. A cycle conflict occurs if a set of agents i, i+1, . . . , j plan to move in the same time step in a circular pattern, i.e., agent i plans to move at time step t+1 to the location of agent i+1 at time step t, agent i+1 plans to move at time step t+1 to the location of agent i+2 at time step t, and so on, while agent j plans to move at time step t+1 to the location of agent i at time step t. Figure 1 illustrates all these different types of conflicts. See Stern et al. [37] for a comprehensive discussion on different types of conflicts and the relationships between them.
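To make the conflict definitions concrete, the following is a minimal Python sketch (our own illustration, not from the original paper) that checks a pair of single-agent plans for vertex and swapping conflicts. It assumes each plan is represented simply as the sequence of vertices the agent occupies, one entry per time step.

```python
def has_conflict(path_i, path_j):
    """Check for vertex and swapping conflicts between two single-agent plans.

    Each plan is the sequence of vertices the agent occupies, one per time
    step; the shorter plan is padded by waiting at its last vertex (the goal).
    """
    def at(path, t):
        return path[t] if t < len(path) else path[-1]

    horizon = max(len(path_i), len(path_j))
    for t in range(horizon):
        # Vertex conflict: both agents occupy the same vertex at time t.
        if at(path_i, t) == at(path_j, t):
            return True
        # Swapping conflict: the agents traverse the same edge in opposite
        # directions between time steps t and t+1.
        if t + 1 < horizon and \
           at(path_i, t) == at(path_j, t + 1) and \
           at(path_i, t + 1) == at(path_j, t):
            return True
    return False
```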
2.1 Optimization
MAPF problems can have more than one valid solution. In many MAPF applications, one would like to find a valid solution that optimizes some objective function. The two most common objective functions used for evaluating a MAPF solution are makespan and sum of costs. The makespan of a joint plan Π, denoted M(Π), is the number of time steps until all agents reach their goal:

M(Π) = max_{1 ≤ i ≤ k} |π_i|    (2)

The sum of costs of a joint plan Π, denoted SOC(Π), is the sum of actions performed until all agents reach their goal:

SOC(Π) = Σ_{1 ≤ i ≤ k} |π_i|    (3)

Following most prior work, we assume that when an agent waits in its destination it also increases the SOC of the overall joint plan, unless that agent is not planned to move later from its destination location. For example, consider the case where agent i reaches its destination at time step t, leaves it at time step t', arrives back at its destination at time step t'', and stays there until all agents reach their destinations. Then this single-agent plan contributes t'' to the SOC of the corresponding joint plan.
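As a small illustration of these objectives, the following sketch (ours, not from the paper) computes the cost of a single-agent plan, the makespan, and the sum of costs under the convention above that trailing waits at the goal are free.

```python
def plan_cost(path):
    """Cost of a single-agent plan given as a vertex sequence, one per time step.

    Wait actions performed after the agent settles at its goal for good are not
    counted; any action before that point (including waits) is counted.
    """
    goal = path[-1]
    last_away = max((t for t, v in enumerate(path) if v != goal), default=-1)
    return last_away + 1

def makespan(paths):
    return max(plan_cost(p) for p in paths)

def sum_of_costs(paths):
    return sum(plan_cost(p) for p in paths)
```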
2.2 From Single-Agent Pathfinding To MAPF
A single-agent shortest-path problem (SPP) is the problem of finding the shortest path in a graph G = (V, E) from a given source vertex s ∈ V to a given target vertex t ∈ V. MAPF can be reduced to a shortest-path problem in a graph known as the k-agent search space. This graph, denoted G^k, is different from the single-agent graph G. A vertex in G represents a location that an agent may occupy in a particular time step. A vertex in G^k represents a set of locations, one per agent, that the agents can occupy in a particular time step. Thus, a vertex in G^k is a vector of k vertices in G. An edge in G^k represents a joint action of all agents, that is, a set of k actions, one per agent, that the agents can perform simultaneously in a particular time step. Joint actions that result in a conflict will not have a corresponding edge in G^k. The cost of an edge in G^k corresponds to the cost of the corresponding joint action.

Observation 1 A lowest-cost path in G^k from (s(1), . . . , s(k)) to (t(1), . . . , t(k)) is an optimal solution to the MAPF problem ⟨G, s, t⟩ and vice versa.
Heuristic Search and the A* Algorithm Heuristic search in general and the A* algorithm in particular [18] are commonly used to solve shortest-path problems. For completeness, we provide a brief background on A*.

A* is a best-first search algorithm. It maintains a list of vertices called Open. Initially, Open contains the source vertex. In every iteration, a single vertex is removed from Open and expanded. To expand a vertex means to go over each of its neighbors and generate it. To generate a vertex means creating it and adding it to Open, unless it has already been generated before. For every generated vertex n, A* maintains several values.

– g(n) is the cost of the lowest-cost path found so far from the source vertex to n.
– parent(n) is the vertex before n on that path.
– h(n) is a heuristic estimate of the cost of the lowest-cost path from n to the target vertex.

Let h*(n) be a perfect heuristic estimate for n, that is, the cost of the lowest-cost path from n to a goal. If h*(n) is known for all nodes, then one can find the shortest path from the source vertex to the target by always moving to the vertex with the smallest h* value. A heuristic function h is called admissible iff for every vertex n it holds that h(n) ≤ h*(n). The A* algorithm chooses to expand the vertex n in Open that has the smallest g(n) + h(n) value.
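For concreteness, the following is a minimal Python sketch of A* over an abstract graph. The callback names (`neighbors`, `cost`, `h`) are illustrative assumptions, not part of the original presentation.

```python
import heapq
import itertools

def a_star(source, target, neighbors, cost, h):
    """A minimal A* sketch.

    `neighbors(n)` yields the successors of n, `cost(n, m)` is the edge cost,
    and `h(n)` is an admissible estimate of the cost from n to the target.
    """
    tie = itertools.count()            # tie-breaker so the heap never compares vertices
    open_list = [(h(source), next(tie), source)]
    g = {source: 0}
    parent = {source: None}
    closed = set()
    while open_list:
        _, _, n = heapq.heappop(open_list)
        if n == target:
            path = []                  # reconstruct the path via parent pointers
            while n is not None:
                path.append(n)
                n = parent[n]
            return list(reversed(path))
        if n in closed:
            continue
        closed.add(n)
        for m in neighbors(n):         # expand n: generate each of its neighbors
            new_g = g[n] + cost(n, m)
            if m not in g or new_g < g[m]:
                g[m] = new_g
                parent[m] = n
                heapq.heappush(open_list, (new_g + h(m), next(tie), m))
    return None                        # the target is unreachable
```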
Theorem 1 (Optimality of A* [18]). Given an admissible heuristic, A* is guaranteed to return an optimal solution, i.e., a shortest path from the source vertex to its target.
Observation 1 and Theorem 1 mean that one can solve a given MAPF problem by running A* on the k-agent search space. A simple way to obtain an admissible heuristic for the k-agent search space is by considering the cost of the shortest path in G from every vertex v ∈ V to every target vertex t(1), . . . , t(k). This is done as follows. Let d(v, t(i)) be the cost of the shortest path from v to t(i). Computing d(v, t(i)) for every v ∈ V and i ∈ {1, . . . , k} can be done in time that is polynomial in |V| and k, at the beginning of the search. Then, the following is an admissible heuristic when optimizing for sum of costs

h(v_1, . . . , v_k) = Σ_{i ∈ {1,...,k}} d(v_i, t(i))    (4)

and the following is an admissible heuristic when optimizing makespan

h(v_1, . . . , v_k) = max_{i ∈ {1,...,k}} d(v_i, t(i))    (5)
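These heuristics can be precomputed with one backward breadth-first search per target, as in the following sketch (ours), which assumes unit-cost moves and a graph given as an adjacency dictionary in which every vertex can reach every target.

```python
from collections import deque

def distances_from(graph, target):
    """Shortest-path distances (in number of moves) from every vertex to `target`.

    `graph` maps each vertex to a list of its neighbors; a single BFS from the
    target suffices because every move action costs one time step.
    """
    dist = {target: 0}
    queue = deque([target])
    while queue:
        v = queue.popleft()
        for u in graph[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist

def build_heuristics(graph, targets):
    """Precompute d(v, t(i)) for every agent i and return the SOC and makespan heuristics."""
    dists = [distances_from(graph, t) for t in targets]
    h_soc = lambda vs: sum(dists[i][v] for i, v in enumerate(vs))
    h_makespan = lambda vs: max(dists[i][v] for i, v in enumerate(vs))
    return h_soc, h_makespan
```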
Challenges in Solving MAPF with A* A very rough way to estimate the hardness of solving a shortest-path problem, with A* and other algorithms, is by considering the size of the search space and its branching factor, which in our case correspond to the number of vertices in G^k and its average outgoing degree. Thus, in the worst case, the size of the search space is |V|^k and the branching factor is (|E|/|V|)^k. As can be seen, both values are exponential in the number of agents.

To get an estimate of these numbers, consider a MAPF problem with 20 agents on a 4-connected grid with 500 × 500 cells, i.e., |V| = 250,000. In this case, the size of the search space is 250,000^20 ≈ 9.1 · 10^107 and the branching factor is 4^20 ≈ 1.1 · 10^12, which is much larger than |V|. The exponential branching factor is especially problematic for A*, since A* must at least expand all vertices along an optimal path. The computational cost of expanding a vertex, however, is at least linear in the branching factor. Thus, textbook A* cannot be used to solve a MAPF problem with a large number of agents, even with a perfect heuristic function.
3 Fast MAPF Algorithms
A fundamental approach to address this combinatorial explosion is to try to decouple the MAPF task into k single-agent pathfinding problems with as little interaction as possible. Perhaps one of the most popular approaches to do so is prioritized planning.
3.1 Prioritized Planning
The first step in prioritized planning is to assign each agent a unique number from {1, . . . , k}. Then, a single-agent plan is found for each agent in order of their priority. When an agent searches for a plan, it is required to find a plan that avoids creating a conflict with the plans already found for agents with higher priority.

A fundamental difference between a textbook shortest-path problem and the problem of finding a plan for the agent with the i-th priority is that in the latter an optimal solution may require an agent to wait in its location. Thus, finding a plan for the i-th agent is, in fact, a shortest-path problem in a time-expansion graph [34]. In a time-expansion graph, every vertex represents a pair (v, t), where v is a vertex in the underlying graph G and t is a time step. There is an edge between vertices (v, t) and (v', t') in the time-expansion graph iff t' = t + 1 and v' is either equal to v or is one of its neighbors. The size and branching factor of the corresponding search space are manageable: the number of vertices is |V| × T, where T is an upper bound on the solution makespan, and the branching factor is |E|/|V| + 1. For example, in a MAPF problem with 20 agents on a 4-connected grid with 500 × 500 cells, assuming T = 1,000, we have a search space of 250,000,000 vertices and a branching factor of 5. A* has been successfully applied to much larger search spaces.
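The following Python sketch illustrates prioritized planning with a reservation table over the time-expansion graph. It is a simplified illustration rather than the algorithm of [34]: the graph is assumed to be an adjacency dictionary, the horizon bound is an arbitrary assumption, and a plain breadth-first search stands in for A*.

```python
from collections import deque

def plan_single_agent(graph, start, goal, reserved, horizon):
    """Breadth-first search in the time-expansion graph for one agent.

    `reserved` holds (vertex, time) pairs and (edge, time) pairs already claimed
    by higher-priority agents; `horizon` bounds the makespan of the plan.
    """
    queue = deque([(start, 0)])
    parents = {(start, 0): None}
    while queue:
        v, t = queue.popleft()
        # The agent may stop only if it can stay at its goal until the horizon.
        if v == goal and all((goal, t2) not in reserved for t2 in range(t, horizon + 1)):
            path, node = [], (v, t)
            while node is not None:
                path.append(node[0])
                node = parents[node]
            return list(reversed(path))
        if t == horizon:
            continue
        for u in [v] + graph[v]:                       # wait or move to a neighbor
            if (u, t + 1) in reserved:                 # vertex conflict
                continue
            if ((u, v), t) in reserved:                # swapping conflict
                continue
            if (u, t + 1) not in parents:
                parents[(u, t + 1)] = (v, t)
                queue.append((u, t + 1))
    return None

def prioritized_planning(graph, starts, goals, horizon=100):
    """Plan agents one by one in index order; neither complete nor optimal."""
    reserved, paths = set(), []
    for start, goal in zip(starts, goals):
        path = plan_single_agent(graph, start, goal, reserved, horizon)
        if path is None:
            return None                                # prioritized planning failed
        paths.append(path)
        for t, v in enumerate(path):
            reserved.add((v, t))                       # claim the visited vertices
        for t in range(len(path) - 1):
            reserved.add(((path[t], path[t + 1]), t))  # block the reverse traversal
        for t in range(len(path), horizon + 1):
            reserved.add((path[-1], t))                # the agent stays at its goal
    return paths
```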
The computational efficiency and simplicity of prioritized planning algorithms are the main reasons for their widespread adoption by practitioners. Implementing prioritized planning includes many design choices. For example, several methods have been proposed for setting the agents' priorities [7, 1]. The Windowed Hierarchical Cooperative A* algorithm (WHCA*) [34] also allows interleaving planning and execution in a prioritized planning framework. In WHCA*, each agent plans to avoid conflicts only for the next X time steps (the "window"). After performing these X steps, the agents re-plan the next X steps in the same manner.
Prioritized planning is a sound approach for MAPF, in the sense that it returns valid solutions. However, it is neither complete nor optimal. That is,

– Not complete. A prioritized planning algorithm may not find any solution to a solvable MAPF problem.
– Not optimal. The solution returned by a prioritized planning algorithm may not be optimal w.r.t. a given objective function (e.g., sum of costs or makespan).
Fig. 2. A MAPF problem in which prioritized planners will not find any solution.
As an example of these prioritized planning limitations, see the MAPF prob-
lem depicted in Figure 2. In this example, any prioritized planning algorithm
will fail to find a solution, regardless of which agent has a higher priority. The
problem, however, is clearly solvable, by having agent 1 move to the middle grid
cell in the upper row, allowing agent 2 to move to its target (t(2)), and then
moving to its own target (t(1)).
3.2 Complete MAPF Solvers
We say that a MAPF algorithm is fast if its worst-case time complexity is poly-
nomial in the size of the graph G, and not exponential in the number of agents.
Surprisingly, there are fast and complete algorithms for solving MAPF problems.
The most general of these is Kornhauser's algorithm [20], which is complete and runs in a worst-case time complexity of O(|V|^3). This algorithm is regarded as
complicated to implement. Thus, a variety of algorithms have been proposed that
are also fast and complete, at least for some restricted classes of MAPF prob-
lems. Below, we provide a partial list of such algorithms and classes of MAPF
problems.
The Push-and-Swap algorithm [24] and its extensions Parallel Push-and-
Swap [31] and Push-and-Rotate [11], are fast MAPF algorithms that are com-
plete for any MAPF problem in which there are at least two unoccupied vertices
in the graph. Very roughly, these algorithms work by executing a set of macro-
operators that move an agent towards its goal (push) and swap the locations of two agents (swap).
A MAPF problem is well-formed if, for any pair of agents i and j, there exists a path from s(i) to t(i) that does not pass through s(j) or t(j). Čáp et al. [9] proved that prioritized planning algorithms that are forced to avoid the other agents' start locations are complete for well-formed MAPF problems.
A MAPF problem is slidable if for any triple of locations v1, v2, and v3, there exists a path from v1 to v3 that does not go through v2. (The exact definition of slidable is slightly more involved; the interested reader can find it in Wang and Botea's paper [49].) Wang and Botea [49] proposed a fast algorithm called MAPP that is complete for slidable MAPF problems. The BIBOX algorithm is also fast and complete under these conditions [38].
While all the above algorithms are fast and, under certain conditions, com-
plete, they do not provide any guarantee regarding the quality of the solution
they return. In particular, they do not guarantee that the resulting solution is
optimal, either w.r.t. sum of costs or makespan. In fact, finding a solution that has the smallest makespan or the smallest sum of costs is NP-hard [39, 53]. Nevertheless, solution quality is important in many applications, e.g., saving operational costs in an automated warehouse. Also, modern MAPF algorithms can find provably optimal solutions in a few minutes to problems with more than a hundred agents [32, 21, 14].
In the next section, we present the state-of-the-art in MAPF algorithms
that are guaranteed to return a solution that is optimal with respect to a given
objective function. Such algorithms are referred to as optimal MAPF algorithms.
4 Optimal MAPF Solvers
It is possible to classify optimal MAPF algorithms into four high-level approaches:
– Extensions of A*. These are algorithms that search the k-agent search space using a variant of the A* algorithm.
– The Increasing Cost Tree Search [33]. This algorithm splits the MAPF problem into two problems: finding the cost added by each agent, and finding a valid solution with these costs.
– Conflict-Based Search [32]. This algorithmic family solves MAPF by solving multiple single-agent pathfinding problems. To achieve coordination, specific constraints are added incrementally to the single-agent pathfinding problems, in a way that verifies soundness, completeness, and optimality.
– Constraint programming [39, 6]. This approach compiles MAPF to a set of constraints and solves them with a general-purpose constraint solver.
4.1 Extensions of A*
Standley [36] proposed two very effective extensions to A* for solving MAPF problems.
Operator Decomposition The first extension is called Operator Decomposition (OD). OD is designed to cope with the exponential branching factor of the k-agent search space. In OD, the agents are sorted according to some arbitrary order. When expanding the source vertex (s(1), . . . , s(k)), only the actions of one agent are considered. This generates a set of vertices that represent a possible location for the first agent in time step 1, and the locations of all other agents at time step 0. These vertices are added to Open. When expanding one of these vertices, only the actions of the second agent are considered, generating a new set of vertices. These vertices represent possible locations for the first and second agents in time step 1, and the locations of all other agents at time step 0. The search continues in this way. Only the k-th descendant of the start vertex is a vertex that represents a possible location of all agents at time step 1. Vertices that represent the locations of all agents at the same time step are called full vertices, while all other vertices are called intermediate vertices. The search continues until reaching a full vertex that represents the target (t(1), . . . , t(k)).
The obvious advantage of A* with OD compared to A* without OD is the branching factor. With OD, the branching factor is that of a single agent, while without OD, it is exponential in the number of agents. However, the solution is k times deeper when using OD, since there are k vertices between any pair of full vertices. In the case of MAPF, this tradeoff is usually beneficial due to the heuristic function. A high heuristic value for an intermediate vertex can help avoid expanding the entire subtree beneath that vertex.
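A minimal sketch of OD-style successor generation is shown below. It is illustrative only: it branches on a single agent per expansion and, for brevity, checks only vertex conflicts against agents that have already committed to their next location; a full implementation would also check swapping conflicts against the agents' previous locations.

```python
def od_successors(state, graph, next_agent):
    """Operator Decomposition successor generation (a sketch).

    `state` is a tuple with one location per agent: agents 0..next_agent-1 have
    already committed to their locations for the next time step, while the
    remaining agents still refer to the current time step. Only agent
    `next_agent` is branched on here.
    """
    v = state[next_agent]
    for u in [v] + graph[v]:                 # wait or move along an edge
        if u in state[:next_agent]:          # vertex conflict with a committed agent
            continue
        yield state[:next_agent] + (u,) + state[next_agent + 1:]
```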
OD can be viewed as a special case of the Enhanced Partial Expansion A* (EPEA*) algorithm [17]. EPEA* is a variant of A* that can avoid generating some of the vertices A* would generate when expanding a vertex. For details on EPEA* and how it relates to OD, see Goldenberg et al. [17].
Independence Detection The second A* extension proposed by Standley [36] is called Independence Detection (ID). ID attempts to decouple a MAPF problem with k agents into smaller MAPF problems with fewer agents. It works as follows. First, each agent finds an optimal single-agent plan for itself while ignoring all other agents. If there is a conflict between the plans of a pair of agents, these agents are merged into a single meta-agent. Then, A*+OD is used to find an optimal solution for the two agents in this meta-agent, ignoring all other agents. This process continues iteratively: in every iteration a single conflict is detected, the conflicting (meta-)agents are merged, and the merged meta-agent is then solved optimally with A*+OD. The process stops when there are no conflicts between the agents' plans.²

In the worst case, ID will end up merging all agents into a single meta-agent and solving the resulting k-agent MAPF problem. However, in other cases, an optimal solution can be returned and guaranteed by only solving smaller MAPF problems with fewer agents. This can have a dramatic impact on runtime. ID is a very general framework for MAPF solvers, as one can replace A*+OD with any other complete and sound MAPF solver.
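The following sketch illustrates the simple ID loop. The `solve_group` and `find_conflict` callbacks are hypothetical placeholders standing in for an optimal group solver (e.g., A*+OD) and a conflict detector over the merged plans.

```python
def independence_detection(agents, solve_group, find_conflict):
    """Simple Independence Detection (a sketch).

    `solve_group(group)` returns an optimal joint plan for the agents in `group`,
    and `find_conflict(plans)` returns a pair of conflicting groups (keys of
    `plans`), or None if the combined plans are conflict-free.
    """
    plans = {(a,): solve_group([a]) for a in agents}   # start with singleton groups
    while True:
        conflict = find_conflict(plans)
        if conflict is None:
            return plans                               # conflict-free joint plan
        g1, g2 = conflict
        merged = tuple(g1) + tuple(g2)                 # merge the two meta-agents
        del plans[tuple(g1)], plans[tuple(g2)]
        plans[merged] = solve_group(list(merged))      # re-solve the merged group
```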
M* The M* algorithm [47] also searches the k-agent search space, like A*. To handle the exponential branching factor, M* dynamically changes the branching factor of the search space, as follows. Initially, whenever a vertex is expanded, it generates only a single vertex that corresponds to all agents moving one step along their own, individual, optimal paths. This generates a single path in the k-agent search space. Since the agents are following their individual optimal paths, a vertex n may be generated that represents a conflict between a pair of agents i and j. If this occurs, all the vertices along the path from the start vertex to n are re-expanded, this time generating vertices for all combinations of actions agents i and j may perform. In general, a vertex in M* stores a conflict set, which is a set of agents for which it will generate all combinations of actions. For agents not in the conflict set, M* only considers a single action – the one on their individual optimal path. Recursive M* (rM*) is a notable improved version of M*. rM* attempts to identify sets of agents in the conflict set that can be solved in a decoupled manner.

M* is similar to OD in that it limits the branching factor of some vertices. rM* also bears some similarity to ID, in that it attempts to identify which sets of agents can be solved separately. Nevertheless, rM*, OD, and ID can be used together: rM* can be used by ID to find optimal solutions to conflicting meta-agents, and rM* can search the k-agent search space with A* with OD instead of plain A*. The latter is referred to as ODrM* and was shown to be effective in some scenarios [47].
² This is actually a description of the simple ID algorithm. In the full ID algorithm, the conflicting agents attempt to individually avoid the conflict while maintaining their original solution cost.
4.2 The Increasing Cost Tree Search (ICTS)
The Increasing Cost Tree Search (ICTS) [33] algorithm does not search the k-agent search space directly. Instead, it interleaves two search processes. The first, referred to as the high-level search, aims to find the sizes of the agents' single-agent plans in an optimal solution to the given MAPF problem. The second, referred to as the low-level search, accepts a vector of plan sizes (c_1, . . . , c_k), and verifies whether there exists a valid solution (π_1, . . . , π_k) to the given MAPF problem in which the size of every single-agent plan π_i is exactly c_i.

The high-level search of ICTS is implemented as a search over the increasing cost tree (ICT). The ICT is a tree in which each node is a k-dimensional vector of non-negative values. The root of the ICT is a vector (c_1, . . . , c_k) where for every agent i, the value c_i is the size of its individual optimal path. The children of a node n in this tree are all vectors that result from adding one to one of the k elements in n. The high level of ICTS searches the ICT in a breadth-first manner. This is done to verify that the first valid solution found by the low-level search is an optimal solution.
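A sketch of this high-level search is shown below. The `low_level` callback is a hypothetical placeholder for the MDD-based low-level search described next; it returns a valid solution with exactly the given per-agent plan costs, or None.

```python
from collections import deque

def icts_high_level(optimal_costs, low_level):
    """Breadth-first search over the increasing cost tree (a sketch).

    `optimal_costs` is the vector of individual optimal plan costs (the ICT root),
    and `low_level(costs)` verifies whether a valid solution with exactly these
    per-agent plan costs exists.
    """
    root = tuple(optimal_costs)
    queue, seen = deque([root]), {root}
    while queue:
        node = queue.popleft()
        solution = low_level(node)
        if solution is not None:
            return solution                 # first hit in BFS order is optimal
        for i in range(len(node)):          # children: increase one agent's cost by 1
            child = node[:i] + (node[i] + 1,) + node[i + 1:]
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return None
```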
As mentioned above, the low-level search of ICTS accepts an ICT node (c_1, . . . , c_k) from the high-level search, and searches for a valid solution (π_1, . . . , π_k) in which ∀i : |π_i| = c_i. To do so efficiently, ICTS computes for each agent i all single-agent plans of size c_i. Generating this set of plans is done with a simple breadth-first search, and the plans are stored compactly in a Multi-valued Decision Diagram (MDD) [35]. The cross product of the agents' MDDs is a subgraph of the k-agent search space that contains all joint plans that correspond to the given ICT node. ICTS searches this cross product of MDDs for a valid solution. Since this search solves a satisfaction problem and not an optimization problem, a simple depth-first branch-and-bound is commonly used.

An effective way to speed up ICTS is to prune the ICT by quickly identifying subsets of single-agent plan costs for which there is no valid solution [33]. For example, assume an ICT node (c_1, . . . , c_k) is given to the low-level search. One can check whether there is a pair of single-agent plans for agents 1 and 2 such that their costs are c_1 and c_2, respectively, and they do not conflict. If no such pair of plans exists, then the low-level search can safely return that there is no valid solution for the corresponding ICT node. While this technique for pruning the ICT is highly effective in practice, there is no current theory about how to choose which subsets of costs to check. This is an open question for future research.
4.3 Conflict-Based Search
Conflict-Based Search (CBS) [32] is an optimal MAPF algorithm. It is unique in that it solves a MAPF problem by solving a sequence of single-agent pathfinding problems.

In more detail, CBS, similar to ICTS, runs two interleaving search processes: a low-level search and a high-level search. The CBS low-level search accepts as input an agent i and a set of constraints of the form ⟨i, v, t⟩, representing that agent i must not be at vertex v in time step t. The task of the CBS low-level search is to find the lowest-cost single-agent plan for agent i that does not violate the given set of constraints. Existing single-agent pathfinding algorithms, such as A*, can easily be adapted to serve as the CBS low-level search.

The CBS high-level search looks for a set of constraints to impose on the low-level search so that the resulting joint plan is a cost-optimal valid solution. This search is performed over the Constraint Tree (CT). The CT is a binary tree in which each node n is a pair (n.cont, n.Π) where n.cont is a set of CBS constraints and n.Π is a joint plan consistent with these constraints. A CT node n is generated by first setting its constraints and then using the CBS low-level search to find a single-agent plan for each agent that satisfies its constraints. The root of the CT is a CT node with an empty set of constraints. The objective of the high-level search is to find a node n in the CT in which n.Π is a cost-optimal valid solution.

The high-level search achieves this objective by searching the CT as follows. First, the root of the CT is generated. If the joint plan of the root has no conflict, meaning it is a valid solution, then the search returns it. Otherwise, one of the conflicts in the joint plan is chosen. Let i, j, x, and t be the pair of agents, location, and time step for which this conflict has occurred. Two new CT nodes, n_i and n_j, are generated and added as children to the root node. The CT node n_i is generated with the constraint ⟨i, x, t⟩ and the CT node n_j is generated with the constraint ⟨j, x, t⟩. The cost of a CT node is the cost of the joint plan it represents. The high-level search continues to search the CT in a best-first manner, choosing in every iteration to expand a CT node with the lowest cost. Expanding a CT node means choosing one of its conflicts and resolving it by generating two new CT nodes, each with an additional constraint, as shown above. The search halts when a CT node n is found in which n.Π has no conflicts. Then, n.Π is returned, and it is guaranteed to be optimal.
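The following sketch illustrates the structure of the CBS high level. It is an illustration rather than a faithful implementation of [32]: it uses plan length as the plan cost, assumes every agent can reach its goal without constraints, and relies on hypothetical `low_level` and `find_conflict` callbacks.

```python
import heapq
import itertools

def cbs(agents, low_level, find_conflict):
    """Conflict-Based Search high-level search (a sketch).

    `low_level(agent, constraints)` returns a lowest-cost plan for `agent` that
    respects constraints of the form (agent, vertex, time step), or None, and
    `find_conflict(plans)` returns a conflict (i, j, x, t), or None.
    """
    tie = itertools.count()                      # tie-breaker for the priority queue
    root_constraints = frozenset()
    root_plans = {a: low_level(a, root_constraints) for a in agents}
    root_cost = sum(len(p) for p in root_plans.values())
    open_list = [(root_cost, next(tie), root_constraints, root_plans)]
    while open_list:
        _, _, constraints, plans = heapq.heappop(open_list)
        conflict = find_conflict(plans)
        if conflict is None:
            return plans                         # conflict-free, hence a valid solution
        i, j, x, t = conflict
        for agent in (i, j):                     # resolve by constraining either agent
            child_constraints = constraints | {(agent, x, t)}
            child_plans = dict(plans)
            child_plans[agent] = low_level(agent, child_constraints)
            if child_plans[agent] is None:
                continue                         # no plan satisfies these constraints
            child_cost = sum(len(p) for p in child_plans.values())
            heapq.heappush(open_list,
                           (child_cost, next(tie), child_constraints, child_plans))
    return None
```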
CBS has many extensions and improvements. Meta-agent CBS [32] is a generalization of CBS in which, instead of adding new constraints to resolve a conflict between two agents, the algorithm may choose to merge the conflicting agents into a single meta-agent. Improved CBS [8] attempts to reduce the size of the CT by intelligently choosing which conflict to resolve in every iteration. HCBS [14] adds an admissible heuristic to the high-level search to prune more nodes from the CT. Recent work suggested a different scheme for resolving conflicts. For a conflict in location x at time t between agents i and j, they proposed to generate three CT nodes: one with a constraint that agent i must occupy x at time t, one with a constraint that agent j must occupy x at time t, and one with a constraint that neither agent i nor agent j can occupy x at time t. The benefit of this three-way split is that the sets of solutions that satisfy these constraints are disjoint.
4.4 Constraint Programming
Constraint Programming (CP) is a problem-solving paradigm in which one models a given problem as a Constraint Satisfaction Problem (CSP) or a Constraint Optimization Problem (COP), and then uses a general-purpose constraint solver to find a solution. A notable special case of CP is to model a problem as a Boolean Satisfiability (SAT) problem, which is a special case of CSP, and use a general-purpose SAT solver.

CP is a very general paradigm because many problems, including MAPF, can be modeled as a CSP or a COP. The major benefit of using CP is that current general-purpose constraint solvers are very efficient and are constantly getting better. In particular, modern SAT solvers are extremely efficient, solving SAT problems with over a million variables.

A common approach for finding a solution with optimal makespan to a given MAPF problem with CP is by splitting the problem into two problems: (1) finding a valid solution whose makespan is equal to or smaller than a given bound T, and (2) finding a value of T that is equal to the optimal makespan. Next, we provide a brief description of this approach.
Finding a valid solution for a given makespan bound For every triplet of agent a, vertex v ∈ V, and time step t, we define a Boolean variable X_{a,v,t}. Setting X_{a,v,t} to true means that a is planned to occupy v at time t. The constraints imposed on these variables ensure that:

1. Each agent occupies one vertex in each time step. For every time step and agent there is exactly one variable X_{a,v,t} that is assigned true.
2. No conflicts. For every time step and location, there is at most one variable X_{a,v,t} that is assigned true.³
3. Agents start and end in the desired locations. For every agent i, X_{i,s(i),1} and X_{i,t(i),T} are true.
4. Agents move along edges. For every time step t before T, agent i, and pair of vertices v and v', if the variables X_{i,v,t} and X_{i,v',t+1} are both true then either v = v' or there is an edge (v, v') ∈ E.

Any assignment of values to the variables X_{a,v,t} that satisfies these constraints corresponds to a valid solution to our MAPF problem whose makespan is at most T.
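A minimal sketch of such an encoding is shown below. It is our own illustration: it emits DIMACS-style clauses (lists of signed integers), uses 0-indexed time steps, and omits the swapping-conflict constraints discussed in the footnote. In practice the resulting clauses would be handed to an off-the-shelf SAT solver.

```python
import itertools

def encode_mapf_sat(vertices, edges, starts, goals, T):
    """Build clauses for the makespan-bounded MAPF decision problem (a sketch).

    A clause is a list of integers: a positive literal asserts X_{a,v,t} and a
    negative literal negates it. Only vertex conflicts are forbidden here.
    """
    agents = range(len(starts))
    times = range(T + 1)
    var = {}
    def x(a, v, t):                                  # variable id for X_{a,v,t}
        return var.setdefault((a, v, t), len(var) + 1)

    neighbors = {v: {v} for v in vertices}           # waiting is always allowed
    for (u, w) in edges:
        neighbors[u].add(w)
        neighbors[w].add(u)

    clauses = []
    for a in agents:
        for t in times:
            clauses.append([x(a, v, t) for v in vertices])      # at least one vertex
            for v, w in itertools.combinations(vertices, 2):    # at most one vertex
                clauses.append([-x(a, v, t), -x(a, w, t)])
    for t in times:
        for v in vertices:                                      # no vertex conflicts
            for a, b in itertools.combinations(agents, 2):
                clauses.append([-x(a, v, t), -x(b, v, t)])
    for a in agents:
        clauses.append([x(a, starts[a], 0)])                    # start location
        clauses.append([x(a, goals[a], T)])                     # goal location
    for a in agents:
        for t in range(T):
            for v in vertices:                                  # move along edges or wait
                clauses.append([-x(a, v, t)] +
                               [x(a, w, t + 1) for w in neighbors[v]])
    return var, clauses
```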
Finding the optimal makespan To find the optimal makespan, we start by setting T to be a lower bound on the optimal makespan. Such a lower bound can be easily obtained by taking the maximum over the agents' individual shortest-path costs to their goals. Then, a constraint solver is used to search for a solution to the CSP defined above. If a solution has been found, we have found an optimal solution. If not, T is incremented by one, and the constraint solver is used again to solve the new CSP. This process continues until an optimal solution is found.
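The incremental search over T can be sketched as follows; `encode` and `solve_sat` are hypothetical callbacks, e.g., the encoder sketched above together with any off-the-shelf SAT solver.

```python
def optimal_makespan_via_sat(encode, solve_sat, lower_bound):
    """Find the optimal makespan by growing the bound T (a sketch).

    `encode(T)` builds the clauses for makespan bound T and `solve_sat(clauses)`
    returns a satisfying assignment or None. The first satisfiable T is optimal.
    """
    T = lower_bound
    while True:
        _, clauses = encode(T)
        assignment = solve_sat(clauses)
        if assignment is not None:
            return T, assignment
        T += 1
```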
Finding a solution with optimal sum of costs is also possible with CP, but it requires some additional constraints and changes to the process [42, 6].
³ Actually, this constraint only prevents vertex conflicts. To prevent swapping conflicts, an additional constraint is needed: for every time step t before T, pair of agents a and a', and pair of adjacent locations v and v', if the variables X_{a,v,t} and X_{a,v',t+1} are both true then the variables X_{a',v',t} and X_{a',v,t+1} must not both be true.
It is important to note that the above is not the only way to solve MAPF with CP. Surynek explored five different ways to model MAPF using SAT, showing how different modeling choices impact the SAT solver's runtime [40]. Barták et al. [6] modeled several variants of MAPF using Picat [54], a higher-level CP language. A CP model written in Picat can be automatically compiled and solved with either a SAT solver, a CP solver, or a Mixed Integer Linear Programming (MILP) solver [44]. They showed that different models and solvers are effective for different MAPF variants and problems. Still, how to choose the best model and solver for a given MAPF problem is, to date, an open question.

It is worth noting that solving MAPF with CP is, in itself, a special case of a more general approach for solving MAPF in which one compiles MAPF to a different problem and solves it with an algorithm designed for that problem. Prominent examples are MAPF compilation to Answer Set Programming (ASP) [13], to SAT Modulo Theories (SMT) [41], and to multi-commodity network flow [52]. Such MAPF algorithms are sometimes referred to as reduction-based MAPF solvers [15].
4.5 Summary of Optimal Solvers
Unfortunately, there are no clear guidelines to predict which of the MAPF algorithms detailed above would work best for a given MAPF problem. Prior work suggested the following rules of thumb:

– A*-based and CP approaches are effective for small graphs that are dense with agents.
– CBS and ICTS are effective for large graphs.

However, these rules of thumb have not been grounded theoretically and their empirical support is weak. We expect that future work will explore automated methods to select the best solver to use for a given problem. Another appealing direction for future work is to create hybrid algorithms that enjoy the complementary benefits of different MAPF solvers.
5 Approximately Optimal Solvers
While modern optimal MAPF algorithms have pushed the state of the art impressively, there are still many MAPF problems that current algorithms cannot solve optimally in reasonable time. In such cases, one can always use one of the fast MAPF algorithms described in Section 3, but that would mean the solution returned may be very costly.

Approximately optimal MAPF algorithms, also known as bounded-suboptimal algorithms, lie in the range between these algorithms and optimal algorithms. An approximately optimal algorithm is an algorithm that accepts a parameter ε > 0 and returns a solution whose cost is at most 1 + ε times the cost of an optimal solution. Ideally, an approximately optimal algorithm would return solutions faster when increasing ε, thus providing a controlled trade-off between
runtime and solution quality. Approximately optimal MAPF algorithms have
been proposed based on each of the optimal MAPF approaches described in the
previous section. We describe them briefly below.
5.1 A*-based
Creating an approximately optimal version of an A*-based MAPF algorithm is straightforward, since there are many approximately optimal A*-based algorithms in the heuristic search literature. Perhaps the most well-known approximately optimal A*-based algorithm is Weighted A* [30], which is a best-first search that uses the g + (1 + ε) · h evaluation function to choose which node to expand in every iteration. All A*-based MAPF algorithms can use the same evaluation function and obtain the same guarantee: that the solution cost is at most 1 + ε times the cost of an optimal solution. Such a variant was mentioned explicitly for M* [47], under the name inflated M*.
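In terms of the A* sketch given earlier (Section 2.2), Weighted A* only changes the priority used when pushing nodes onto Open; a minimal illustration (our own):

```python
def weighted_priority(g_value, h_value, epsilon):
    """Weighted A* node priority: expanding by g + (1 + epsilon) * h yields a
    solution whose cost is at most (1 + epsilon) times the optimal cost."""
    return g_value + (1 + epsilon) * h_value

# In the earlier A* sketch, replacing the pushed priority `new_g + h(m)` with
# `weighted_priority(new_g, h(m), epsilon)` gives the bounded-suboptimal variant.
```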
An interesting direction for future work is to use more modern A*-based approximately optimal algorithms to improve the performance of approximately optimal A*-based MAPF algorithms. Explicit Estimation Search (EES) [43] and Dynamic Potential Search (DPS) [16] are examples of such approximately optimal A*-based algorithms.
5.2 ICTS
To the best of our knowledge, there is no approximately optimal ICTS-based algorithm for classical MAPF. The challenge in creating such an algorithm is that the ICTS high-level search is done in a breadth-first manner. Thus, there is no heuristic to inflate, preventing the straightforward application of Weighted A* and other approximately optimal search algorithms.

However, there is an approximately optimal variant of ICTS for MAPF problems in which moving an agent across different edges can have different costs. This algorithm is based on the Extended ICTS (eICTS) algorithm [48], which is an ICTS-based algorithm designed for this type of MAPF problem. In eICTS, each ICT node is associated with a lower and an upper bound. The high-level search in this case becomes a best-first search on the lower bound, and the low-level search looks for optimal solutions within these bounds. This allows creating an approximately optimal version of eICTS called wICTS, in which suboptimality is added to both the high-level and the low-level search.
5.3 CBS
Enhanced CBS (ECBS) [4] is an approximately optimal MAPF algorithm that is based on CBS. It introduces suboptimality in the low-level search and in the high-level search. The low-level search in CBS can be any optimal shortest-path algorithm, such as A*. As noted above, there are several approximately optimal algorithms that are based on A*, including Weighted A* [30], EES [43], and DPS [16]. Thus, introducing suboptimality to the low-level search can be done by simply using one of these approximately optimal algorithms.

Introducing suboptimality to the high-level search is slightly more involved. To do so, ECBS uses a focal search framework for its high-level search. Focal search is a heuristic search framework introduced by Pearl and Kim [28] in which the node expanded in every iteration is chosen from a subset of nodes called FOCAL. FOCAL contains all nodes in Open that may lead to a solution that may be approximately optimal. To choose which node to expand from FOCAL, a secondary heuristic can be used. Importantly, this heuristic can be inadmissible and domain-dependent. ECBS uses the focal search framework, and uses a MAPF-specific secondary heuristic that prioritizes CT nodes with fewer conflicts. For details, see Barer et al. [4]. Later work proposed an extension to ECBS in which user-defined paths called highways are prioritized to further improve runtime [10].
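The FOCAL selection rule can be sketched as follows (an illustration, not the ECBS implementation); `f` is the admissible cost estimate of a node and `secondary` is the possibly inadmissible tie-breaking heuristic, e.g., the number of conflicts in a CT node.

```python
def choose_from_focal(open_nodes, f, epsilon, secondary):
    """Focal-search node selection (a sketch).

    FOCAL contains every node whose f-value is within a factor (1 + epsilon) of
    the best f-value in Open; among those, the node minimizing the secondary
    heuristic is expanded next.
    """
    f_min = min(f(n) for n in open_nodes)
    focal = [n for n in open_nodes if f(n) <= (1 + epsilon) * f_min]
    return min(focal, key=secondary)
```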
5.4 Constraint Programming
eMDD-SAT is a recently proposed approximately optimal MAPF algorithm from the CP family. This algorithm models MAPF as a SAT problem. It follows the high-level approach we described in Section 4.4, except that it is designed for (approximately) optimizing SOC rather than makespan.

In a very high-level manner, eMDD-SAT works by creating a SAT model that allows solutions with longer makespan and larger SOC. The suboptimality is controlled by how much larger the SOC may be compared to a computed SOC lower bound. To the best of our knowledge, there is no approximately optimal MAPF algorithm from this family that is designed for finding solutions with approximately optimal makespan.

In general, significantly less effort has been dedicated, to date, to developing approximately optimal MAPF algorithms. However, existing approximately optimal MAPF algorithms demonstrate that adding even a very small amount of suboptimality can allow solving much larger problems. For example, ECBS with at most 1% suboptimality is able to solve MAPF problems with 250 agents on large maps [4].
6 Beyond Classical MAPF
The scope of this overview is mostly limited to what is referred to as classical MAPF [37]. Classical MAPF assumes that (1) every action takes exactly one time step, (2) time is discretized into time steps, as opposed to being continuous, and (3) each agent occupies exactly one vertex. These assumptions do not necessarily hold in real-world MAPF applications. With the maturity of classical MAPF algorithms, recent years have also seen work exploring MAPF problems that relax these assumptions. Below, we provide a partial overview of these efforts.
6.1 Beyond One-Time Step Actions
The eICTS algorithm [48] mentioned above is designed for actions that may require more than one time step. Such a setting is sometimes called MAPF with non-unit edge costs. Adapting the CBS algorithm to non-unit edge cost settings is straightforward, as it only requires changing the conflict-detection step.

Barták et al. [5] proposed a CP-based algorithm for MAPF with non-unit edge costs. Their model uses scheduling constraints to support actions with different durations.
6.2 Beyond Discrete Time Steps
Time is continuous, and thus every time-step discretization is, by definition, an abstraction of the real world. In the context of MAPF, this abstraction may lead to suboptimality and even incompleteness.

As long as the agents do not need to wait, there is no need to directly deal with this problem: the duration of a move action depends on the time required to traverse the corresponding edge. However, when an agent needs to wait and time is not discretized, then each agent has an infinite number of possible wait actions in each vertex.
The key technique used so far to address this problem is the Safe Interval Path Planning (SIPP) [29] algorithm. SIPP is a single-agent pathfinding algorithm that is designed to avoid moving obstacles. Since the obstacles are moving, the single agent may choose to wait in its location, which raises again the challenge of dealing with continuous time. SIPP addresses this challenge by identifying safe intervals in which the agent can occupy each vertex, and running an A* search over (vertex, safe interval) pairs. Andreychuk et al. showed how to use SIPP to solve MAPF problems with continuous time, in a prioritized planning framework [51] and in a CBS framework [2].
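The core idea of computing safe intervals can be sketched as follows. This is an illustration only: the interval representation is our own assumption, and SIPP itself does considerably more than this.

```python
def safe_intervals(occupied_intervals, horizon):
    """Safe intervals of a single vertex (a sketch).

    `occupied_intervals` is a sorted list of (start, end) times during which the
    vertex is occupied by a moving obstacle or a higher-priority agent; the
    result is the list of maximal gaps in which an agent may safely stay there.
    """
    intervals, t = [], 0.0
    for start, end in occupied_intervals:
        if start > t:
            intervals.append((t, start))
        t = max(t, end)
    if t < horizon:
        intervals.append((t, horizon))
    return intervals

# Example: safe_intervals([(2.0, 3.5), (6.0, 7.0)], 10.0)
# returns [(0.0, 2.0), (3.5, 6.0), (7.0, 10.0)]
```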
Surynek [41] recently proposed to use a CP-related approach for continuous time. Instead of modeling the problem as a CSP or SAT problem, Surynek proposed to model it as a SAT Modulo Theories (SMT) problem, and then apply an SMT solver.
6.3 Beyond One-Agent per Vertex
The graph G of possible locations in classical MAPF is an abstraction of the real world the agents are moving in. Arguably, in most real-world MAPF applications the agents are moving in Euclidean space and have some geometric shape. Thus, two agents may conflict even if they stop in different locations, because their geometric shapes overlap. Li et al. [22] referred to this as MAPF with large agents. In such settings, an agent may "occupy" multiple vertices, and a move action may create a conflict with agents occupying multiple vertices.

Li et al. [22] proposed a CBS-based algorithm for addressing this setting. They showed how to design suitable constraints for large agents and proposed an admissible heuristic to speed up the search. They also described an A*-based algorithm and a SAT-based algorithm for this setting. Atzmon et al. [12] proposed another CBS-based algorithm that can consider agents of arbitrary shape, even without a reference point that is stable under rotations.
Robustness and Kinematic Constraints Even if an agent only occupies a single vertex, it is still desirable in many scenarios to add a buffer around each agent to further minimize the chance of collisions. Such a buffer can be either spatial or temporal. A prime motivation for having such a buffer is to account for the inherent uncertainty during the execution of the solution. That is, to have the agents' joint plan be valid and executable even if some agents do not fully follow it.

The MAPF-POST [19] algorithm was designed to address such requirements. MAPF-POST accepts as input a solution to a classical MAPF problem and adapts it to consider safety and kinematic constraints. A limitation of MAPF-POST is that it does not retain any guarantee on solution quality. For adding robustness to temporal delays during execution, Atzmon et al. [3] proposed an optimal CBS-based algorithm and a CP-based algorithm.
6.4 Beyond One-Shot MAPF
In addition, classical MAPF is a one-shot, offline problem. In some MAPF applications, there is a sequence of related MAPF problems that are solved sequentially. Some recent work also addresses several types of online MAPF settings. This includes settings where there is a fixed set of agents and a stream of pathfinding tasks [25], as well as settings where new agents appear over time but each agent has a single navigation task [46]. The former setting is referred to as the MAPF warehouse model and the latter as the MAPF intersection model.

Also, so far we have assumed the allocation of agents to goals is given. In the Multi-Agent Pickup-and-Delivery (MAPD) problem, this is not the case [23]. In MAPD, there is a fixed set of agents that need to solve a batch of pickup-and-delivery tasks. A MAPD algorithm needs to plan paths without conflicts, and also to allocate which agent should go to which destination.
7 Conclusion
This paper provides an overview of the current research on Multi-Agent Path Finding (MAPF). After providing several definitions of MAPF, we presented polynomial-time algorithms for solving the problem. Then, a range of algorithms was described that return optimal solutions. These algorithms can be split into four families: A*-based, ICTS, CBS, and CP. Following that, we described how to transform several of these optimal algorithms into approximately optimal algorithms, allowing trading solution quality for runtime. Finally, we presented some extensions of classical MAPF, including non-unit edge costs, continuous time, large agents, and online MAPF. Throughout this paper, we suggested several directions for future work.

It is our hope that this paper will be useful to both researchers and practitioners looking for a brief introduction to MAPF. For formal definitions of MAPF variants and benchmarks, see [37]. For additional MAPF-related resources, including pointers to publications and additional tutorials, see the http://mapf.info web site, created by Sven Koenig's group.
References
1. Andreychuk, A., Yakovlev, K.: Two techniques that enhance the performance of
multi-robot prioritized path planning. In: International Conference on Autonomous
Agents and MultiAgent Systems (AAMAS). pp. 2177–2179 (2018)
2. Andreychuk, A., Yakovlev, K., Atzmon, D., Stern, R.: Multi-agent pathfinding
with continuous time. In: International Joint Conference on Artificial Intelligence
(IJCAI). pp. 39–45 (2019)
3. Atzmon, D., Stern, R., Felner, A., Wagner, G., Barták, R., Zhou, N.F.: Robust
multi-agent path finding. In: International Conference on Autonomous Agents and
Multi Agent Systems (AAMAS). pp. 1862–1864 (2018)
4. Barer, M., Sharon, G., Stern, R., Felner, A.: Suboptimal variants of the conflict-
based search algorithm for the multi-agent pathfinding problem. In: Symposium
on Combinatorial Search (SoCS) (2014)
5. Barták, R., Švancara, J., Vlk, M., et al.: A scheduling-based approach to multi-
agent path finding with weighted and capacitated arcs. In: International Conference
on Autonomous Agents and MultiAgent Systems (AAMAS). pp. 748–756. Inter-
national Foundation for Autonomous Agents and Multiagent Systems (AAMAS)
(2018)
6. Barták, R., Zhou, N., Stern, R., Boyarski, E., Surynek, P.: Modeling and solving
the multi-agent pathfinding problem in picat. In: IEEE International Conference
on Tools with Artificial Intelligence (ICTAI). pp. 959–966 (2017)
7. Bnaya, Z., Felner, A.: Conflict-oriented windowed hierarchical cooperative A*. In:
IEEE International Conference on Robotics and Automation (ICRA). pp. 3743–
3748 (2014)
8. Boyarski, E., Felner, A., Stern, R., Sharon, G., Tolpin, D., Betzalel, O., Shimony,
E.: ICBS: improved conflict-based search algorithm for multi-agent pathfinding.
In: International Joint Conference on Artificial Intelligence (IJCAI) (2015)
9. Čáp, M., Vokřínek, J., Kleiner, A.: Complete decentralized method for on-line
multi-robot trajectory planning in well-formed infrastructures. In: International
Conference on Automated Planning and Scheduling (ICAPS) (2015)
10. Cohen, L., Uras, T., Koenig, S.: Feasibility study: Using highways for bounded-
suboptimal multi-agent path finding. In: Symposium on Combinatorial Search
(SoCS) (2015)
11. De Wilde, B., Ter Mors, A.W., Witteveen, C.: Push and rotate: a complete multi-
agent pathfinding algorithm. Journal of Artificial Intelligence Research 51, 443–492
(2014)
12. Atzmon, D., Diei, A., Rave, D.: Multi-train path finding. In: Symposium on Com-
binatorial Search (SoCS) (2019)
13. Erdem, E., Kisa, D.G., Oztok, U., Schüller, P.: A general formal framework for
pathfinding problems with multiple agents. In: AAAI Conference on Artificial In-
telligence (2013)
14. Felner, A., Li, J., Boyarski, E., Ma, H., Cohen, L., Kumar, T.S., Koenig, S.: Adding
heuristics to conflict-based search for multi-agent path finding. In: International
Conference on Automated Planning and Scheduling (ICAPS) (2018)
15. Felner, A., Stern, R., Shimony, S.E., Boyarski, E., Goldenberg, M., Sharon, G.,
Sturtevant, N.R., Wagner, G., Surynek, P.: Search-based optimal solvers for the
multi-agent pathfinding problem: Summary and challenges. In: Symposium on
Combinatorial Search (SoCS). pp. 29–37 (2017)
16. Gilon, D., Felner, A., Stern, R.: Dynamic potential search – a new bounded suboptimal search. In: Symposium on Combinatorial Search (SoCS) (2016)
17. Goldenberg, M., Felner, A., Sturtevant, N.R., Holte, R.C., Schaeffer, J.: Optimal-generation variants of EPEA*. In: SoCS (2013)
18. Hart, P.E., Nilsson, N.J., Raphael, B.: A formal basis for the heuristic determina-
tion of minimum cost paths. IEEE Transactions on Systems Science and Cyber-
netics SSC-4(2), 100–107 (1968)
19. Hönig, W., Kumar, T., Cohen, L., Ma, H., Xu, H., Ayanian, N., Koenig, S.: Sum-
mary: multi-agent path finding with kinematic constraints. In: International Joint
Conference on Artificial Intelligence (IJCAI). pp. 4869–4873 (2017)
20. Kornhauser, D., Miller, G., Spirakis, P.: Coordinating pebble motion on graphs, the
diameter of permutation groups, and applications. In: Symposium on Foundations
of Computer Science. pp. 241–250. IEEE (1984)
21. Li, J., Harabor, D., Stuckey, P., Felner, A., Ma, H., Koenig, S.: Disjoint splitting for
multi-agent path finding with conflict-based search. In: International Conference
on Automated Planning and Scheduling (ICAPS) (2019)
22. Li, J., Surynek, P., Felner, A., Ma., H., Kumar, T.K.S., Koenig, S.: Multi-agent
path finding for large agents. In: AAAI Conference on Artificial Intelligence (2019)
23. Liu, M., Ma, H., Li, J., Koenig, S.: Task and path planning for multi-agent pickup
and delivery. In: International Conference on Autonomous Agents and MultiAgent
Systems (AAMAS). pp. 1152–1160 (2019)
24. Luna, R., Bekris, K.E.: Efficient and complete centralized multi-robot path plan-
ning. In: IEEE/RSJ International Conference on Intelligent Robots and Systems.
pp. 3268–3275 (2011)
25. Ma, H., Li, J., Kumar, T., Koenig, S.: Lifelong multi-agent path finding for online
pickup and delivery tasks. In: Conference on Autonomous Agents and MultiAgent
Systems (AAMAS). pp. 837–845 (2017)
26. Ma, H., Yang, J., Cohen, L., Kumar, T.K.S., Koenig, S.: Feasibility study: Moving
non-homogeneous teams in congested video game environments. In: Conference on
Artificial Intelligence and Interactive Digital Entertainment (AIIDE). pp. 270–272
(2017)
27. Morris, R., Pasareanu, C.S., Luckow, K.S., Malik, W., Ma, H., Kumar, T.S.,
Koenig, S.: Planning, scheduling and monitoring for airport surface operations.
In: AAAI Workshop: Planning for Hybrid Systems (2016)
28. Pearl, J., Kim, J.H.: Studies in semi-admissible heuristics. IEEE transactions on
pattern analysis and machine intelligence PAMI-4, 392–399 (1982)
29. Phillips, M., Likhachev, M.: SIPP: Safe interval path planning for dynamic environ-
ments. In: IEEE International Conference on Robotics and Automation (ICRA).
pp. 5628–5635 (2011)
30. Pohl, I.: Heuristic search viewed as path finding in a graph. Artificial intelligence
1(3-4), 193–204 (1970)
31. Sajid, Q., Luna, R., Bekris, K.E.: Multi-agent pathfinding with simultaneous exe-
cution of single-agent primitives. In: SoCS (2012)
32. Sharon, G., Stern, R., Felner, A., Sturtevant, N.R.: Conflict-based search for opti-
mal multi-agent pathfinding. Artificial Intelligence 219, 40–66 (2015)
33. Sharon, G., Stern, R., Goldenberg, M., Felner, A.: The increasing cost tree search
for optimal multi-agent pathfinding. Artificial Intelligence 195, 470–495 (2013)
34. Silver, D.: Cooperative pathfinding. In: AIIDE. vol. 1, pp. 117–122 (2005)
35. Srinivasan, A., Ham, T., Malik, S., Brayton, R.K.: Algorithms for discrete function
manipulation. In: IEEE International conference on computer-aided design. pp.
92–95 (1990)
36. Standley, T.S.: Finding optimal solutions to cooperative pathfinding problems. In:
AAAI Conference on Artificial Intelligence. pp. 173–178 (2010)
37. Stern, R., Sturtevant, N.R., Felner, A., Koenig, S., Ma, H., Walker, T.T., Li,
J., Atzmon, D., Cohen, L., Kumar, T.S., Boyarski, E., Barták, R.: Multi-agent
pathfinding: Definitions, variants, and benchmarks. In: Symposium on Combina-
torial Search (SoCS) (2019)
38. Surynek, P.: A novel approach to path planning for multiple robots in bi-connected
graphs. In: IEEE International Conference on Robotics and Automation (ICRA).
pp. 3613–3619 (2009)
39. Surynek, P.: An optimization variant of multi-robot path planning is intractable.
In: AAAI (2010)
40. Surynek, P.: Makespan optimal solving of cooperative path-finding via reductions
to propositional satisfiability. arXiv preprint arXiv:1610.05452 (2016)
41. Surynek, P.: Multi-agent path finding with continuous time viewed through satis-
fiability modulo theories (smt). arXiv preprint arXiv:1903.09820 (2019)
42. Surynek, P., Felner, A., Stern, R., Boyarski, E.: Efficient sat approach to multi-
agent path finding under the sum of costs objective. In: European Conference on
Artificial Intelligence (ECAI). pp. 810–818 (2016)
43. Thayer, J.T., Ruml, W.: Bounded suboptimal search: A direct approach using
inadmissible estimates. In: International Joint Conference on Artificial Intelligence
(IJCAI) (2011)
44. Van Roy, T.J., Wolsey, L.A.: Solving mixed integer programming problems using
automatic reformulation. Operations Research 35(1), 45–57 (1987)
45. Veloso, M.M., Biswas, J., Coltin, B., Rosenthal, S.: Cobots: Robust symbiotic au-
tonomous mobile service robots. In: IJCAI. p. 4423 (2015)
46. Švancara, J., Vlk, M., Stern, R., Atzmon, D., Barták, R.: Online multi-agent
pathfinding. In: AAAI Conference on Artificial Intelligence (2019)
47. Wagner, G., Choset, H.: Subdimensional expansion for multirobot path planning.
Artificial Intelligence 219, 1–24 (2015)
48. Walker, T.T., Sturtevant, N.R., Felner, A.: Extended increasing cost tree search for
non-unit cost domains. In: International Joint Conference on Artificial Intelligence
(IJCAI). pp. 534–540 (2018)
49. Wang, K.H.C., Botea, A.: MAPP: a scalable multi-agent path planning algorithm
with tractability and completeness guarantees. Journal of Artificial Intelligence
Research 42, 55–90 (2011)
50. Wurman, P.R., D’Andrea, R., Mountz, M.: Coordinating hundreds of cooperative,
autonomous vehicles in warehouses. AI Magazine 29(1), 9 (2008)
51. Yakovlev, K., Andreychuk, A.: Any-angle pathfinding for multiple agents based on
SIPP algorithm. In: International Conference on Automated Planning and Schedul-
ing (ICAPS). pp. 586–593 (2017)
52. Yu, J., LaValle, S.M.: Multi-agent path planning and network flow. In: Algorithmic
Foundations of Robotics X. pp. 157–173 (2013)
53. Yu, J., LaValle, S.M.: Structure and intractability of optimal multi-robot path
planning on graphs. In: AAAI (2013)
54. Zhou, N., Kjellerstrand, H., Fruhman, J.: Constraint Solving and Planning with
Picat. Springer Briefs in Intelligent Systems, Springer (2015)