Open-Endedness is Essential for Artificial Superhuman Intelligence
Edward Hughes*1, Michael Dennis*1, Jack Parker-Holder1, Feryal Behbahani1, Aditi Mavalankar1, Yuge Shi1, Tom Schaul1, Tim Rocktäschel1

*Equal contribution. 1Google DeepMind, London, UK. Correspondence to: Edward Hughes <edwardhughes@google.com>, Michael Dennis <dennismi@google.com>.

Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024. Copyright 2024 by the author(s).
Abstract

In recent years there has been a tremendous surge in the general capabilities of AI systems, mainly fuelled by training foundation models on internet-scale data. Nevertheless, the creation of open-ended, ever self-improving AI remains elusive. In this position paper, we argue that the ingredients are now in place to achieve open-endedness in AI systems with respect to a human observer. Furthermore, we claim that such open-endedness is an essential property of any artificial superhuman intelligence (ASI). We begin by providing a concrete formal definition of open-endedness through the lens of novelty and learnability. We then illustrate a path towards ASI via open-ended systems built on top of foundation models, capable of making novel, human-relevant discoveries. We conclude by examining the safety implications of generally-capable open-ended AI. We expect that open-ended foundation models will prove to be an increasingly fertile and safety-critical area of research in the near future.
1. Introduction
Recent years have seen impressive progress in AI, mainly
driven by foundation models (Bommasani et al.,2021).
These models are increasingly used as agents in various
applications (e.g., Wang et al.,2023a;Wu et al.,2023;
Lifshitz et al.,2023;Wang et al.,2023c;Liu et al.,2023b;
Zheng et al.,2024;Ahn et al.,2022). This represents signif-
icant progress towards artificial general intelligence (AGI),
in the sense of reaching human-level performance on a wide
range of tasks (Legg and Hutter,2007). However, we are
still missing a formal description of what it would take for
an autonomous system to self-improve towards increasingly
creative and diverse discoveries without end—a Cambrian
explosion of emergent capabilities, behaviors, and artifacts.
This kind of open-ended invention is the mechanism by
which human individuals and society at large accumulate
new knowledge and technology. Therefore, open-endedness
must be a property of an artificial superhuman intelligence
(ASI, Morris et al.,2023) that can, by definition, accomplish
a wide range of tasks at a level which no human can match.
By the very nature of superhuman intelligence, open-ended
discovery of innovative solutions is essential to empower hu-
manity to manage its risks, just as society evolves norms and
institutions to govern increasingly capable humans across
generations (Richerson et al.,2001).
Foundation models such as large language models (LLMs)
have scaled learning to large, static datasets scraped from
the internet. Extrapolating, we may soon be running out
of high-quality textual and visual data for training such
models (Villalobos et al.,2022). Thus, open-endedness is
unlikely to arise for free by training on ever-larger datasets.
Rather, a system endowed with the open-endedness neces-
sary for ASI will eventually have to create, refute and refine
its own explanatory knowledge, in interaction with a source
of evidence (Deutsch,2011), as well as learning what data to
learn from (Jiang et al.,2022). Moreover, for ASI to be use-
ful and safe, it is important that open-endedness be guided
towards knowledge that is understandable by and beneficial
for humanity. Foundation models and open-endedness are
orthogonal dimensions, whose combination is particularly
powerful (cf. Lehman et al.,2022;Huang et al.,2022;Chen
et al.,2023a;Meyerson et al.,2023;Zhang et al.,2023;Wu
et al.,2023;Wang et al.,2023a). Open-ended algorithms
endow foundation models with the ability to uncover new
knowledge, while foundation models guide the search space
for open-ended AI towards discovering human-relevant arti-
facts efficiently (Liu et al.,2023a;Ma et al.,2023;Romera-
Paredes et al.,2024). A formal definition of open-endedness
can catalyze progress in this direction, offering clarity and
focus to galvanize the research community.
We provide a new and precise definition of open-endedness
in Section 2, inspired by the open-ended systems in nature
that have created life, the human brain, culture, and tech-
nology, as well as open-ended systems in silico that, for
instance, have achieved superhuman level at the game of
Go (Silver et al.,2016), generated human-level adaptation
to novel 3D tasks (Bauer et al.,2023), self-improved lan-
guage models (Fernando et al.,2023;Yang et al.,2023a),
unlocked the tech tree in Minecraft (Wang et al.,2023a),
and discovered new results in pure mathematics (Romera-
Paredes et al.,2024). Open-endedness has been understood
in a wide variety of ways (Earle et al.,2021) ever since
it gained prominence as a term in the study of artificial
life (Bedau,1992;Bedau et al.,1998) and biological evo-
lution (Holland,1992;McShea,1996;Waddington,2008).
Contrary to Stepney and Hickinbotham (2023), we believe
quantifying open-endedness is both possible and important
going forward, and, akin to Sigaud et al. (2023), we be-
lieve it can be achieved via the help of an observer external
to the system. Our definition makes formal the aphorism
of Lisa B. Soros that, as observers of an open-ended sys-
tem, “we’ll be surprised but we’ll be surprised in a way that
makes sense in retrospect”. Concretely, open-ended systems
produce increasingly novel and surprising artifacts that are
hard to predict, even for an observer who has learned to
better predict by examining past artifacts. Once a system
exhibits these characteristics, i.e. producing learnable but
novel artifacts, we call it an open-ended system. This allows
us to pinpoint the sense in which open-endedness is essen-
tial for ASI, to provide examples illustrating how existing
open-ended AI systems lack generality, and to argue that
present-day foundation models are not yet open-ended.
Historically, the field of open-endedness has faced numer-
ous challenges. Principal among these has been the problem
of structuring the search space so as to regularly produce
artifacts which are both novel and interesting to humans (Ma
et al.,2023). When humans make discoveries, they do so by
“standing on the shoulders of giant human datasets” (Clune,
2022); that is to say, utilising prior world, domain and com-
monsense knowledge, which they have acquired biologically
or culturally. Since foundation models have been trained on
vast amounts of human data, they capture human notions of
interestingness (Zhang et al.,2023). Furthermore, they are
general sequence modellers (Mirchandani et al.,2023) and
can generate variations from existing examples (Meyerson
et al.,2023), thus serving as general mutation operators.
This is compelling since with more advanced foundation
models, practical implementations of open-ended systems
become increasingly feasible. Taken together, open-ended
foundation models can both vary (i.e., mutate) data and as-
sess novelty and interestingness of real and generated data
to decide what data to further explore (i.e., select) (Jiang
et al.,2022).
In Section 3 we provide some concrete research directions for this marriage between open-endedness and foundation models, for example leveraging evolutionary algorithms and reinforcement learning. Generally capable open-ended systems may be both extremely powerful and increasingly prevalent, prompting pressing safety considerations (Ecoffet et al., 2020). In Section 4, we argue that research into open-ended systems will be essential to safely and beneficially deploy any increasingly general and autonomous AI.

[Figure 1 diagram: a SYSTEM produces ARTIFACTS, which an OBSERVER judges for novelty and learnability.]

Figure 1. Illustration of the open-endedness definition. The definition of open-endedness hinges on a system's ability to continuously generate artifacts that are both novel and learnable to an observer. Consider a system that designs various aircraft: a mouse (left) might find these designs novel but lack the capacity to comprehend the principles behind them; for a human studying aerospace engineering (middle), the system offers both novelty and the potential for learning, making it open-ended. However, a superintelligent alien (right) with vast aerospace knowledge might not find the designs novel, but would still be able to analyze and understand them. This highlights that open-endedness is observer-dependent and that novelty or learnability alone is not enough.
2. Defining Open-Endedness
2.1. Formal Definition
The notion of an open-ended system has received many
colloquial definitions (Soros and Stanley,2014;Stanley
and Lehman,2015;Stanley et al.,2017;Stanley,2019).
More formal approaches have often focused on the case
of evolutionary systems, quantifying the increasing com-
plexity (McShea,1996;Waddington,2008) and perpetual
novelty (Holland,1992) of biological evolution. Intuitively,
an open-ended system endlessly produces novel and interest-
ing artifacts. But novelty and interestingness have generally
been characterised without sufficient precision, or in an
overly narrow way. We provide a general-purpose, formal
definition of open-endedness, as follows.
Definition: From the perspective of an observer, a
system is open-ended if and only if the sequence of
artifacts it produces is both novel and learnable.
More formally, a system $S$ produces a sequence of artifacts $X_t$, indexed by time $t$. An observer $O$ processes a new artifact $X_T$ to determine its predictability given a history $X_{1:t}$ of past ones. $O$ possesses a statistical model $\hat{X}_t$ which predicts an arbitrary future artifact based on its observations of the artifacts it has seen up to time $t$. The observer judges the quality of their prediction based on a loss metric $\ell(\hat{X}_t, X_T)$, or $\ell(t, T)$ for short. A natural implementation of $\hat{X}_t$ is as a learning algorithm.

A system displays novelty if artifacts become increasingly unpredictable with respect to the observer's model at any fixed time $t$, namely:

$$\forall t,\; \forall T > t,\; \exists T' > T : \mathbb{E}[\ell(t, T')] > \mathbb{E}[\ell(t, T)].$$

In other words, there is always a less predictable artifact coming further in the future.¹

The system is learnable whenever conditioning on a longer history makes artifacts more predictable, namely:

$$\forall T,\; \forall t < T,\; \exists t' : T > t' > t : \mathbb{E}[\ell(t', T)] < \mathbb{E}[\ell(t, T)].$$

Finally, a system is open-ended from the perspective of the observer $O$ if and only if it generates sequences of artifacts
that are both novel and learnable (see Figure 1). The novelty
aspect ensures the presence of information gain within the
system, while learnability guarantees that this information
gain holds meaning and is “interesting” to the observer.
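To make these conditions concrete, the following is a minimal sketch, assuming a finite horizon and hypothetical `fit_observer` and `loss_fn` implementations (neither is from the paper), of how one might test novelty and learnability empirically. We approximate $\mathbb{E}[\ell(t, T)]$ with a matrix of empirical losses, where entry $(t, T)$ is the loss of the observer's model fitted on $X_{1:t}$ when predicting $X_T$.

```python
import numpy as np

def loss_matrix(artifacts, fit_observer, loss_fn):
    """Empirical estimate of E[l(t, T)]: entry (t, T) is the loss of an
    observer model fitted on artifacts[:t] when predicting artifacts[T]."""
    n = len(artifacts)
    L = np.full((n, n), np.nan)
    for t in range(1, n):
        model = fit_observer(artifacts[:t])  # observer's statistical model
        for T in range(t + 1, n):
            L[t, T] = loss_fn(model.predict(T), artifacts[T])
    return L

def is_novel(L):
    """Novelty: for every t and T > t, some later T' > T has higher loss."""
    n = L.shape[0]
    return all(
        any(L[t, Tp] > L[t, T] for Tp in range(T + 1, n))
        for t in range(1, n - 1)
        for T in range(t + 1, n - 1)
    )

def is_learnable(L):
    """Learnability: for every T and t < T, some longer history t' with
    t < t' < T yields lower loss on artifact T."""
    n = L.shape[0]
    return all(
        any(L[tp, T] < L[t, T] for tp in range(t + 1, T))
        for T in range(3, n)
        for t in range(1, T - 1)
    )
```

On a finite horizon $\tau$ such checks can at best assess finite open-endedness up to $\tau$ (see Section 2.3), and on noisy samples the strict quantifiers should be read as holding in expectation, per footnote 1.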
For example, imagine that the system is a noisy TV pro-
ducing uniform random noise (Burda et al.,2018). A noisy
TV is learnable, allowing the observer to learn a statistical
model that approximates the uniform distribution increas-
ingly well; however, once the observer’s model converges
to uniform the system loses its novelty: all that is left is
aleatoric uncertainty, which is collapsed by the expectation.
Now imagine that the system is a noisy TV switched period-
ically by a remote control to a random, arbitrary distribution.
Every time the channel is changed, the observer may expe-
rience novelty; however, the system is now not learnable,
because the history of artifacts (previous TV channels) is
not correlated with the distribution of the next channel, so
the model loss will not decrease in general. We provide an
informal positive example in Appendix A.1.
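As an informal illustration of the two negative cases (our own sketch, building on the checker above; the `MeanObserver` is a deliberately crude stand-in for a learned model), one can simulate both TVs and inspect the resulting loss matrices:

```python
import numpy as np

rng = np.random.default_rng(0)

class MeanObserver:
    """Trivial observer: predicts any future artifact as the mean of history."""
    def __init__(self, history):
        self.mean = np.mean(history, axis=0)
    def predict(self, T):
        return self.mean

def mse(pred, target):
    return float(np.mean((pred - target) ** 2))

# Fixed noisy TV: every artifact is uniform noise from one distribution.
# Learnable (the running mean converges), but not novel: expected loss
# flattens at the aleatoric floor rather than rising without end.
fixed_tv = [rng.uniform(size=16) for _ in range(40)]

# Channel-switching TV: each artifact comes from a freshly drawn
# distribution, so a longer history does not improve prediction:
# novel but not learnable.
switching_tv = [rng.normal(loc=rng.uniform(-5, 5), size=16) for _ in range(40)]

for name, tv in [("fixed", fixed_tv), ("switching", switching_tv)]:
    L = loss_matrix(tv, MeanObserver, mse)  # from the sketch above
    print(name, "novel:", is_novel(L), "learnable:", is_learnable(L))
```

On a single finite sample these boolean checks are noisy; the definition's expectations would be estimated by averaging loss matrices over identical copies of each system.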
Our definition makes no explicit mention of “interesting-
ness”. More precisely, interestingness is represented in our
definition by the observer’s choice of loss function
. Thus,
for us, the interesting parts of artifacts are precisely those
features which the observer decides are useful to learn about.
Different observers can, and do, find different artifacts inter-
esting, by virtue of the different parts of the feature space
they choose to learn with their statistical model.
We hope that our definition will serve as a useful grounding
for future work. On the theoretical side, it provides a basis
for proving whether a system is open-ended. On a practical note, it raises the prospect of searching for open-ended systems. In this paper, we shall use it to underpin the argument that open-endedness lies on the critical path towards ASI, and in particular that the combination of open-ended algorithms and foundation models is ripe to yield significant progress towards that aim. We examine some subtleties of our definition in Appendix A.2.

¹We take the expectation over any stochasticity in the artifacts; practically speaking, were the observer to make observations from identical copies of the system $S$, the expectation of $\ell$ would be approximated by the empirical mean.
2.2. Related Definitions
In the interests of space, we review the definitions of open-
endedness most closely related to ours, covering more dis-
tantly related work in Appendix C. Soros and Stanley (2014)
provided four necessary conditions for an evolutionary pro-
cess to be open-ended, namely (1) that individuals must
meet a minimal criterion in order to reproduce, (2) that
evolution of individuals should create novel opportunities
to meet the minimal criterion, (3) that individuals them-
selves should make decisions about how to interact with the
world, and (4) that the potential complexity of the phenotype
should not be limited by its representation. Our definition
overlaps with these necessary conditions, but relaxes the
constraint that the open-ended system is evolutionary. Our
requirement that learnability is increasing can be seen as
a generalisation of the minimal criterion in condition (1).
Our requirement that the observer cannot intervene on the
system is analogous to condition (3). Our requirement that
novelty is increasing is analogous to conditions (2) and (4).
Indeed, conditions (2) and (4) suggest that an open-ended
system cannot be learned from a fixed data distribution.
To our knowledge, the most recent paper offering a defini-
tion of open-endedness is Sigaud et al. (2023). The authors
write: "an observer considers a process as open-ended if, for any time $t$, there exists a time $t' > t$ at which the process generates a token that is new according to this observer's perspective". This definition has considerable overlap with
ours. Like us, Sigaud et al. define open-endedness with re-
spect to an observer. They consider the observer examining
a sequence of tokens from a process, while we equivalently
have the observer consider a sequence of artifacts from a
system. Our requirement of novelty and learnability is com-
patible with their statement that the process should generate
a token that is “new according to the observer’s perspec-
tive”. Our definition differs by being more precise about
what this phrase means. In particular, we specify that what
an observer considers “new” should be artifacts that are
unpredictable according to their current statistical model of
the system under consideration. Moreover, we specify that
the observer’s “perspective” is generated by learning that
statistical model on the history of artifacts thus far presented
by the system. In particular, our definition can rule out
systems that display continual “novelty” but are otherwise
uninteresting, like white noise on a TV screen, for instance.
2.3. Types of Observer
The choice of observer is a free parameter of great impor-
tance for our definition. From the perspective of AI research,
there is a pre-eminent class of observers, namely humans.
In other words, we wish to generate artifacts that are valu-
able to individual humans and to society. This provides a
level of grounding for the open-ended system which nar-
rows the search space considerably, as we shall argue in
Section 3. Nevertheless, our definition deliberately admits
arbitrary observers, for several reasons. Firstly, it allows our
definition to encompass open-ended systems which are not
anthropocentric, such as biological evolution. Secondly, it
allows us to reason about open-ended systems which might
exceed human capabilities, so-called ASI. Thirdly, it allows
us to determine whether systems can be open-ended with
respect to any observer, as we did with the noisy TV.²

²There is one constraint on an observer which must be adhered to for our definition to make sense. The loss function $\ell$ must treat artifacts $X$ and predictions $\hat{X}$ on an equal footing. In particular, it must be fixed in advance without any knowledge of the system $S$. Otherwise, an observer $O$ could find a system $S$ to be open-ended purely by discarding the artifacts from $S$ and constructing its own artifacts that it finds to be both novel and learnable.
Practically speaking, any given observer will have some time horizon $\tau$ which bounds their observations of a system, i.e. $t, T < \tau$. This concept allows us to distinguish between systems which are open-ended on different timescales. We say that a system is infinitely open-ended with respect to an observer $O$ if it remains open-ended on any timescale $\tau$. We say that a system is finitely open-ended with time horizon $\tau$ with respect to an observer $O$ if it is open-ended for $t, T < \tau$. Consider, for example, an agent trained in simulation with an automatic curriculum over tasks. In principle, a human observer might find observations of the agent behaviour to be infinitely open-ended, for the agent may accrue the ability to solve ever more diverse and surprising tasks. In practice (cf. AdA, Bauer et al., 2023), novelty starts to plateau after about 1 month of training, due to limitations in the richness of the task space and in the size of the agent's neural network. Thus AdA is finitely open-ended with time horizon 1 month.
Similarly, an observer’s judgement will be influenced by
the limitations of their cognitive abilities relative to the
breadth of the domain. For example, a human observer who
reads a curriculum of ever more complex articles from a
current snapshot of Wikipedia may find such a system open-
ended, but only until they reach the limit of their memory.
A suitable ordering of Wikipedia articles will present novel
information, in the sense that every now and then an article
will be more unpredictable than we have hitherto seen. We
might also expect that this information will be learnable,
because human knowledge is interlinked, in the sense that
knowing more about one topic makes it easier to understand
other topics that may crop up later. However, once human
memory capacity is saturated, the human observer will start
to forget previous articles. This violates learnability: in
calculus, for instance, once one has forgotten the definition
of a derivative, one will find it harder to understand an article
about the chain rule. Therefore, conditioning on a history
longer than an observer’s recall doesn’t necessarily make
the current artifact more predictable.
This example brings to light three interesting threads. Firstly,
the open-endedness of human technology, as observed by
humans, relies on our ability to compress knowledge into
a form that can be maintained within our collective mem-
ory: indeed, we present an alternative definition of open-
endedness in the language of compression in Appendix B.
Secondly, an artificial superhuman intelligence may have
less stringent memory constraints than humans, and there-
fore may judge itself to be open-ended beyond the point at
which humans assess it to be so, re-emphasising that human
observers must be considered pre-eminent for the purposes
of safety, as we explore further in Section 4. Thirdly, the
open-endedness in this example is a function of the breadth
of the domain. In a narrower domain, elliptic curve cryp-
tography say, the set of relevant Wikipedia articles would
be much smaller, so a human observer would find this open-
ended only until they had understood every article, at which
point novelty would be violated. Nevertheless, humans can,
and frequently do, make new discoveries in narrow domains
via experimentation and reasoning; amassing a vast, static
trove of data is not the be all and end all of open-endedness.
2.4. Examples
In this section, we discuss some popular systems that are
open-ended but not general, or that are general but not open-
ended, with respect to a human observer. This serves two
purposes. Firstly, it demonstrates that our definition is not
so restrictive as to rule out systems that are intuitively open-
ended, and is not so loose as to include systems that intu-
itively lack open-endedness. Secondly, it motivates the ben-
efits that foundation models can provide in addressing the
limitations of current open-ended systems and vice versa.
Our first archetypal open-ended system is AlphaGo (Silver
et al.,2016). Consider as artifacts the sequence of policies
produced across training by AlphaGo. After sufficient train-
ing, AlphaGo produces policies which are novel to human
expert players, in the sense that they play moves which
would be low probability for human professionals but which
nevertheless are winning against the best humans. Further-
more, humans can improve their win rate against AlphaGo
by learning from AlphaGo’s behavior (Shin et al.,2023).
Yet, AlphaGo keeps discovering new policies that can beat
even a human who has learned from previous AlphaGo ar-
tifacts. Thus, so far as a human is concerned, AlphaGo is
both novel and learnable. AlphaGo is just one representative
from a class of open-ended algorithms that augment rein-
forcement learning with self-play (Samuel,1959), achieving
or exceeding human-level play in Go, Chess, Shogi (Silver
et al., 2017), StarCraft II (Vinyals et al., 2019), Stratego (Per-
olat et al.,2022), DotA (Berner et al.,2019), and Diplomacy
(Bakhtin et al.,2022).
AlphaGo is an example of an open-ended system that
achieves narrow superhuman intelligence (Morris et al.,
2023). This limits its utility: self-play of this kind can-
not by itself help us to discover new science or technology
that requires combining insight from disparate fields, or
taking actions across a range of modalities, timescales and
contexts. The constraints of the game rules make the search
for novel and learnable artifacts tractable, and these artifacts
are found to be novel and learnable by human observers
largely because it was humans who invented the game.
Our second archetypal open-ended system is AdA (Bauer
et al.,2023;OEL Team et al.,2021). AdA is a large-scale
agent that learns to solve tasks in a 3D environment called
XLand2. In XLand2 there are 25B possible task variants,
corresponding to different world topologies and a variety of
possible games within each world, that are prioritized for
learning potential (Jiang et al.,2021). Checkpoints of the
AdA agent across training are open-ended with respect to a
human observer who attempts to predict what capabilities
the agent might show. Across training, the agent gradu-
ally accumulates zero-shot and few-shot capabilities over
an ever wider set of held-out environments, requiring ever
more complex skills. Thus the human continually observes
novel capabilities in the agent. Furthermore, the prioritiza-
tion of task variants provides an interpretable ordering to the
accumulation of skills in the agent, rendering this learnable
by a human. AdA represents a wider class of open-ended al-
gorithms driven by unsupervised environment design (UED,
Dennis et al.,2020;Justesen et al.,2018), which establish
an automatic curriculum (Leibo et al.,2019;Baker et al.,
2020) of environments in the zone of proximal development
for agent learning (Vygotsky and Cole,1978).
It is natural to ask whether AdA would continue to be judged
as open-ended by a human observer should training be con-
tinued indefinitely. Results in Bauer et al. (2023) suggest
that novelty starts to plateau, implying that with an order
of magnitude more compute AdA would almost certainly
not be open-ended. Indeed, the authors show that both in-
creasing the size of the agent and increasing the number
of tasks allow the agent to generalize to a wider range of
environments. Thus, in order for this system to be open-
ended on longer timescales, one would need an even richer
environment and an even more capable agent to sustain the
agent-environment co-evolution inherent in UED.
Our third archetypal open-ended system is POET (Wang
et al.,2019;2020). POET trains a population of agents,
each of which is paired with an environment that is evolving
over the course of training. These paired agent-environment
artifacts are open-ended with respect to a human observer
seeking to model the features of the environments that arise,
or equivalently the skills the paired agents possess. A Qual-
ity Diversity algorithm (QD, Pugh et al.,2016;Mouret and
Clune,2015) is deployed with respect to the environments,
hunting for challenging problems that lead to diverging per-
formance across the population. QD is an example of a
wider class of open-ended algorithms, namely evolutionary
algorithms, which we encounter again in Section 3.4.
Crucially, POET periodically transfers agents from one en-
vironment to another, which results in an empirical example
of the stepping stone phenomenon (Stanley and Lehman,
2015): agents can eventually solve incredibly challenging
environments that are not possible to solve with direct opti-
mization. As a result of training for billions of environment
steps, POET produces a diverse population of highly capa-
ble specialist agents, which can solve novel environments
that are created through coevolution with the population
(Brant and Stanley,2017). Novelty arises because of the
mutation operator in the QD algorithm, which yields new
and unpredictable environments. Learnability arises because
each mutation is small, so the past lineage of an environ-
ment is a good guide to its current features. Just as for AdA,
the key limitation on open-endedness is the environment
parameterization itself: eventually POET will plateau once
the agent can solve all possible terrains.
Our final example is contemporary foundation models.
These are a negative example; they are not open-ended by
our definition with respect to any observer who can model
their training dataset. The justification for this follows im-
mediately from our consideration of the noisy TV in Section
2.1. Contemporary foundation models are typically trained
on fixed datasets. If the distribution of this data is learnable,
which it must be, for the foundation model learned it in
the first place, then it cannot be endlessly novel, because
eventually the observer will have modelled the epistemic
uncertainty. As we saw in Section 2.3, foundation models
may appear open-ended to human observers if the domain of
enquiry is sufficiently broad, by virtue of the memory limita-
tions of the human brain. However, if the focus is narrowed,
for instance to tasks that require planning (Momennejad
et al.,2024;Pallagani et al.,2023;Valmeekam et al.,2023),
the limitations of the foundation model in generating novel,
correct solutions are exposed.
Since foundation models are periodically retrained on new
data, including data generated by their own interactions
with humans and the real world, one could argue that the
data distribution is not really fixed. In some quarters, this
kind of distributional shift is seen as an annoyance, even
one which threatens “model collapse” (Shumailov et al.,
2023). We flip this argument on its head, and contend
that augmenting foundation models with open-endedness
offers a path towards ASI. Similarly, the fact that foundation
models are typically conditional on context breaks the logic
that they cannot be open-ended. In principle, the context of a
foundation model can be recruited to recombine concepts in
an open-ended way by leveraging some external measure of
validity. This brings us neatly to some concrete suggestions
for how to build open-ended foundation models.
3. Open-Ended Foundation Models
We have defined open-endedness and discussed why the
current foundation model training paradigm is not open-
ended. We believe that the trend of improving foundation
models trained on passive data by scaling alone will soon
plateau, and it will not be enough to reach ASI. Our position
is that open-endedness is a property of any ASI, and that
foundation models provide the missing ingredient required
for domain-general open-endedness. Further, we believe
that there may be only a few remaining steps required to
achieve open-endedness with foundation models. In the
following subsections, we sketch four overlapping paths
towards open-ended foundation models that lend credence to
this belief. The paths are neither intended to be prescriptive
nor exhaustive. Indeed, recent publications such as (Wong
et al.,2023b;Sharma et al.,2023) point to other paths.
Before proceeding, we must justify our claim that a future
foundation model trained passively on some large corpus
of human data is unlikely to spontaneously acquire open-
endedness. In principle, should we reach ASI, there will
be some sum total of data which the model has consumed
during its training, possibly via several intermediate stages.
Therefore, our claim is not about the impossibility of assem-
bling such a dataset. Rather, we suggest that it is unlikely
that this dataset can be pre-collected offline in an efficient
way. The reason is that open-endedness is fundamentally an
experiential process: producing novelty and learnability in
the eyes of an observer requires continual online adaptation
on the basis of the artifacts already produced, in the context
of that observer’s evolving prior beliefs.
What would it take to collect offline a static dataset from
which such an experiential skill could be learned? Such
a dataset must contain a treasure trove of artifacts which
themselves crisply show novelty and learnability. Yet the
process by which culture evolves, ideas develop, inventions
arise and technologies proliferate is seldom recorded neatly
and comprehensively. The alternative paradigm, in which
experience is “built in” to the open-ended system, is well il-
lustrated by the scientific method. Since the Enlightenment,
the simple process of making hypotheses on the basis of
current knowledge, falsifying them with experiments based
on a source of evidence, and codifying the results into new
knowledge has yielded unprecedented progress in science
and technology (Deutsch,2011). In our view, the fastest
path to ASI will take inspiration from the scientific method,
compiling a dataset online by the explicit combination of
foundation models and open-ended algorithms.
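As a cartoon of this experiential paradigm, consider the following sketch; `llm`, `run_experiment`, and `knowledge_base` are hypothetical stand-ins for a foundation model, a source of evidence, and an online-compiled dataset, not a real API.

```python
def open_ended_discovery(llm, run_experiment, knowledge_base, steps=100):
    """Sketch of a scientific-method-style loop: hypothesize from current
    knowledge, attempt falsification against evidence, codify the outcome."""
    for _ in range(steps):
        # 1. Make a hypothesis on the basis of current knowledge.
        hypothesis = llm.propose(context=knowledge_base.summary())
        # 2. Try to falsify it with an experiment grounded in evidence.
        evidence = run_experiment(hypothesis)
        # 3. Codify the result: the training corpus is compiled online,
        #    rather than pre-collected offline.
        knowledge_base.add(hypothesis, evidence)
    return knowledge_base
```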
3.1. Reinforcement Learning
The framework of Reinforcement Learning (RL) has been at
the forefront of achieving superhuman performance in nar-
row domains, such as AlphaGo’s groundbreaking strategies
that have enriched the human understanding of the game of
Go. RL agents act deliberately so as to shape their stream of
experience for both accumulating reward (exploitation) and
learning about how to increase expected reward in the future
(exploration). A nuanced extension considers agents that set their
own goals to (learn to) pursue; and generating the sequence
of these goals can itself be an open-ended process, which
drives open-ended experience generation (Colas et al.,2022).
Voyager (Wang et al.,2023a) provides an early example of
how RL-like self-improvement can be built on top of founda-
tion models, without the need for explicit parameter updates
or established RL algorithms. Instead, Voyager assembles
an LLM-powered curriculum, uses iterative prompting as
an improvement operator, and assembles verified skills into
a library for hierarchical reuse.
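A minimal sketch of this pattern is given below; it is our paraphrase of the loop just described, with hypothetical `llm` and `env` interfaces, not the actual Voyager code.

```python
def voyager_style_loop(llm, env, skill_library, iterations=50):
    """Sketch: LLM-proposed curriculum, iterative prompting as the
    improvement operator, and a growing library of verified skills."""
    for _ in range(iterations):
        # The LLM proposes the next task given the skills acquired so far.
        task = llm.propose_task(known_skills=list(skill_library))
        code = llm.write_skill(task, library=skill_library)
        for _ in range(4):  # iterative prompting as an improvement operator
            result = env.execute(code)
            if result.success:
                skill_library[task] = code  # verified skill, reusable later
                break
            code = llm.refine(code, error=result.feedback)
    return skill_library
```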
A key problem in RL is how to shape exploration towards
novel and learnable behaviors in high-dimensional domains,
as discussed in Jiang et al. (2022). Exploration can be
guided, for instance, by pseudo-rewards (Bellemare et al.,
2016;Burda et al.,2018;Du et al.,2023b), modulation
(Schaul et al.,2019) or an automated curriculum that selects
relevant tasks (Jiang et al.,2021;Parker-Holder et al.,2022;
Samvelyan et al.,2023). To generalize this, a useful abstrac-
tion may be the notion of a proxy observer, which sits within
the system and proactively guides it to generate novel and
learnable content for the true external observer. In the past
this guidance was provided on the basis of simple metrics
such as TD-error, but now we can leverage foundation mod-
els to guide exploration towards artifacts that more closely
align with what a human observer deems to be novel and
interesting (Jiang et al.,2022). There is already evidence
that this approach may be effective, with LLMs providing
agent rewards from text in an environment (Klissarov et al.,
2023) and compiling a curriculum of tasks based on their
interestingness (Zhang et al.,2023;Faldor et al.,2024).
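A toy sketch of such a proxy observer follows; `llm_judge.score` and `agent_stats.success_rate` are hypothetical interfaces, and the learnability proxy (success rate near one half) is our simplification.

```python
def select_next_task(candidate_tasks, llm_judge, agent_stats):
    """Sketch: rank candidate tasks by LLM-judged interestingness
    (a stand-in for human notions of novelty) times a crude learnability
    proxy that peaks for tasks the agent solves about half the time."""
    def learnability(task):
        p = agent_stats.success_rate(task)  # in [0, 1]
        return p * (1.0 - p)                # maximal at p = 0.5

    def interestingness(task):
        prompt = f"Rate 0-1 how novel and interesting this task is: {task}"
        return llm_judge.score(prompt)      # hypothetical LLM call

    return max(candidate_tasks,
               key=lambda t: interestingness(t) * learnability(t))
```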
While RL considers the first-person perspective of an agent
interacting with an environment, a different perspective cen-
ters on multi-agent dynamics, and the additional richness
arising from all the ways that different (possibly heteroge-
neous) agents can interact with each other, adapt to each
other, or learn from each other. The presence of multiple
learning agents provides a source of non-stationarity, such
that the optimal strategy for each individual will change over
time, potentially in an open-ended manner. Non-stationary
dynamics have been used to achieve or exceed human-level per-
formance in games like StarCraft, DotA and Stratego. There
is early evidence that multi-agent systems may help to im-
prove factuality and reasoning in LLMs via debate (Du et al.,
2023c;Tang et al.,2023), although there is much more re-
search needed before superhuman capability is reached.
3.2. Self-Improvement
To achieve open-endedness, a model must not only con-
sume knowledge from pre-collected feedback as in, for
example, RLHF (Ziegler et al.,2019), but also generate
new knowledge, in the form of hypotheses, insights or creative
outputs beyond the human curated training data. A self-
improvement loop should allow the agent to actively engage
in tasks that push the boundary of its knowledge and ca-
pabilities, for example via leveraging tools such as search
engines, simulated environments, calculators or interpreters
and interacting with other agents (Jiang et al.,2022;Schick
et al.,2024). This requires the model to have a scalable
mechanism to evaluate its own performance, identify areas
for improvement, and adapt its learning process accordingly.
There is growing evidence that foundation models can be
leveraged for feedback in place of humans, and can signifi-
cantly amplify data generated by humans. Examples include
self-critique and revision for training harmless assistants
(Bai et al.,2022) and guiding human evaluators (Saunders
et al.,2022), self-correction for tool-use (Gou et al.,2023),
self-instruction for instruction following (Wang et al.,2022),
self-debugging for code generation (Chen et al.,2023b), self-
rewarding for instruction following (Yuan et al.,2024), and
leveraging VLMs as reward functions for control (Baumli
et al.,2023). These works hint at the possibility of founda-
tion models generating their own samples and refining them
in an open-ended way.
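The sketch below illustrates one generic step of such a loop, assuming hypothetical `model`, `verifier`, and `task_pool` interfaces; the verifier could be a tool such as an interpreter or a search engine.

```python
def self_improvement_step(model, task_pool, verifier, dataset):
    """Sketch: the model generates a sample, critiques and revises it,
    and only verified revisions are added back to the training data."""
    task = task_pool.sample()
    draft = model.generate(task)
    critique = model.generate(f"Critique this answer:\n{draft}")
    revision = model.generate(
        f"Revise the answer given the critique:\n{draft}\n{critique}")
    if verifier.check(task, revision):    # scalable self-evaluation via tools
        dataset.append((task, revision))  # amplified, model-generated data
    return dataset
```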
3.3. Task Generation
Closely related to both RL and self-improvement is the prob-
lem of task generation, also known as the “problem problem”
(Leibo et al.,2019). One great candidate approach for open-
endedness is to keep adapting the difficulty of tasks to an
agent’s capability so that they remain forever challenging
yet learnable. Past examples of this type of system include
setter-solvers (Schmidhuber,1991b) and unsupervised envi-
ronment design (Dennis et al.,2020;Justesen et al.,2018;
Wang et al.,2019). With the advent of foundation models,
it has become feasible to use the Internet itself as an envi-
ronment (Jiang et al.,2022;Gur et al.,2021) via web-based
APIs, affording agents an incredibly rich, ever-growing
and human-relevant task domain (Zhou et al.,2023).
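A minimal sketch of the difficulty-adaptation idea, in the spirit of setter-solver methods (our simplification; `setter` and `solver` are hypothetical objects):

```python
def adapt_task_difficulty(setter, solver, history, target_success=0.5):
    """Sketch: steer generated tasks towards a success rate near
    target_success, keeping them forever challenging yet learnable."""
    task = setter.generate(difficulty=setter.difficulty)
    history.append(solver.attempt(task))  # True on success
    recent = sum(history[-20:]) / min(len(history), 20)
    # Too-easy tasks are no longer novel; near-impossible tasks are no
    # longer learnable, so we track the boundary between the two.
    setter.difficulty += 0.1 * (recent - target_success)
    return task
```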
Another possibility is to instead learn world models—
predictive simulators that can generate future outputs condi-
tioned on text or actions. A promising approach is to con-
sider a foundation model to be a world model itself, since
it is capable of predicting the future (Wong et al.,2023a;
Gurnee and Tegmark,2023;Park et al.,2023). Learned
world models like Genie (Bruce et al.,2024), and text-to-
video generation models like Sora (Brooks et al.,2024)
demonstrate that foundation video models can be used as
learned simulators, including in real-world settings like
robotics (Yang et al.,2023b) and autonomous driving (Hu
et al.,2023). If these works combine with learned multi-
modal reward models (Chan et al.,2023;Du et al.,2023a),
they could be used to generate an open-ended curriculum of
tasks, scaling to task spaces far larger and more photorealis-
tic than can currently be achieved. At sufficient scale, this
may provide a path to generating AI agents with superhu-
man adaptability across a wide range of previously unseen
tasks, which can be deployed in the real world across the
rapidly closing Sim-to-Real gap (Huang et al.,2023).
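As a speculative sketch of how these pieces might compose (our own construction; `world_model.rollout` and `reward_model.score` are hypothetical interfaces, with scores assumed normalized to [0, 1]):

```python
def generate_curriculum(world_model, reward_model, agent, prompts,
                        n_rollouts=8):
    """Sketch: use a learned world model as a simulator and a multimodal
    reward model as a grader to assemble an open-ended task curriculum."""
    curriculum = []
    for prompt in prompts:  # e.g. text descriptions of candidate tasks
        scores = [
            reward_model.score(prompt, world_model.rollout(prompt, agent.act))
            for _ in range(n_rollouts)
        ]
        mean_score = sum(scores) / n_rollouts
        # Keep tasks in the zone of proximal development: neither already
        # solved nor hopeless under the current agent.
        if 0.2 < mean_score < 0.8:
            curriculum.append(prompt)
    return curriculum
```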
3.4. Evolutionary Algorithms
Evolutionary methods offer a promising path to generate
open-ended systems with foundation models (Wu et al.,
2024). LLMs are well-placed to act as selection and muta-
tion operators, as they have been trained on vast datasets of
human knowledge, culture and preferences. For example,
LLMs offer a mechanism through which to make semanti-
cally meaningful mutations via text (Lehman et al.,2022;
Meyerson et al.,2023;Chen et al.,2023a). The simplest
such approach may be via prompts, which already allow
foundation models to further improve their performance.
Recent works have shown it is possible to far surpass human-
designed prompts, leading to stronger models (Fernando
et al.,2023;Yang et al.,2023a;Guo et al.,2023). More
recently, Bradley et al. (2023) and Samvelyan et al. (2024)
went further, using an evolutionary algorithm and LLMs to
both generate variation and evaluate the quality and diversity
of candidate text, making it possible to guide the search for
creative and novel outputs. In the future it may be possible
to further refine a model on these outputs, or use them for
planning (Gandhi et al.,2023), to achieve self-improvement.
Another angle of attack for evolutionary methods is in the
space of code (also known as genetic programming). Foun-
dation models have proven to be competent at producing
diverse and novel programs, providing a means of iterat-
ing upon an archive of candidate solutions. For example,
Eureka (Ma et al.,2023) evolves code-based reward func-
tions to learn complex control behaviors. Similarly, Fun-
Search (Romera-Paredes et al.,2024) evolves programs that
represent new mathematical knowledge. These examples
are focused on specific domains, and it remains an open
problem to scale code evolution to a more general setting.
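A minimal sketch of this family of methods, in the spirit of (but not reproducing) Eureka or FunSearch; `llm.mutate` and the domain-specific `evaluate` function are hypothetical:

```python
import random

def evolve_programs(llm, evaluate, seed_program, generations=1000,
                    archive_size=50):
    """Sketch: the LLM acts as a mutation operator over an archive of
    candidate programs, while a domain-specific evaluator (e.g. code
    execution against a verifier) supplies the external measure of
    validity and fitness that selects what survives."""
    archive = [(evaluate(seed_program), seed_program)]  # seed assumed valid
    for _ in range(generations):
        _, parent = random.choice(archive)
        child = llm.mutate(parent)   # semantically meaningful variation
        score = evaluate(child)      # None for invalid programs
        if score is not None:
            archive.append((score, child))
            archive = sorted(archive)[-archive_size:]  # keep the fittest
    return max(archive)[1]           # best program discovered
```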
4. Achieving ASI Responsibly
Now that we have foundation models, designing a truly gen-
eral open-ended learning system may be within our grasp.
However, the power of open-endedness comes with a swathe
of notable safety risks—beyond existing safety considera-
tions facing foundation models (Ecoffet et al.,2020). Find-
ing solutions to these challenges poses interesting and important core problems in open-endedness research. Because
the solutions to these problems may well depend on the
design of the open-ended system, it is critical that safety
and open-endedness are pursued in tandem. We cover them
here not to hold them separate from other directions in
open-endedness—in fact many of these problems are cur-
rent practical limitations of artificial open-ended systems.
Rather, this section is intended to draw specific attention
to these problems as some of the most fundamental and
exciting directions for research in the field. Of course, this
short section cannot do justice to the breadth of concerns.
Hence, where possible, we provide references to the wealth
of knowledge in the ASI safety community.
We organize our understanding of these risks similarly to Critch and Krueger (2020), by focusing on the ways knowl-
edge is created and transmitted through the joint human-AI
open-ended process in Figure 2. A powerful open-ended
system which has the problems listed in this section is not a
beneficial open-ended system, and we believe it is not one
we should be striving to build. Solving these problems is
not just making open-ended systems safer, but also making
them usable by humans. As such, addressing these prob-
lems should be thought of as minimum specifications of an
open-ended system that we would want to build.
4.1. AI Creation and Agency
AI systems powering the open-ended creation of new knowl-
edge could lead to powerful new affordances. Without di-
rection, these creations could be the source of dual-use
dangers (Urbina et al.,2022). The danger is magnified when
the open-ended systems take immediate action in an envi-
ronment. Current state-of-the-art systems operate in narrow,
simulated environments (Wang et al.,2023a;OEL Team
et al.,2021;Bauer et al.,2023). However, as AI is trained in
broader, more diverse simulations or is even deployed (and
continues to learn) in the real world, it becomes critical to
understand the dangers. The agency of open-ended AI poses
several safety risks, such as goal misgeneralization (di Lan-
gosco et al.,2022;Shah et al.,2022) and specification gam-
ing (Clark and Amodei,2016). Open-ended search can be
seen as an ambitiously aggressive form of exploration; thus
one could hope to use similar approaches to mitigate the dan-
gers of exploration as in RL, like safe exploration (García and Fernández, 2015) and impact regularization (Krakovna et al., 2018; Turner et al., 2020).
Figure 2. Knowledge accumulation and transfer in a human-AI open-ended system. We depict AI building on AI knowledge, humans understanding AI knowledge, AI understanding human knowledge, humans building on human knowledge, and emergent knowledge created by the process as a whole. Every process in this diagram offers an opportunity to embed safety methods that guide the system towards achieving ASI responsibly.
4.2. Humans Understanding AI Creations
In order to provide informed oversight and direction when
guiding an open-ended system, human observers need to
at least partially understand the significance of the new
artifacts that the system produces. This becomes increas-
ingly challenging as the complexity of these artifacts grows,
leading to the inability to give informed oversight and guid-
ance. Such a system may not only be unsafe, but would no
longer be open-ended for human observers, since it would
no longer be learnable. As such, any open-ended system
we want to build should have the ability to bring human ob-
servers along with it—understanding and interpreting these
systems is not only a core problem to make them safe, it is
also a core problem to make them useful.
One approach would be to try to understand the policy gen-
erated by open-ended systems through interpretability. With
current approaches this would require a formidable inter-
pretability effort for each domain of interest. However,
with the advent of automated interpretability (Bills et al.,
2023), one may hope to build increasingly good explana-
tions of the systems’ behaviors which match the increasing
complexity of the open-ended system. This presents a
sizeable challenge, as such a system would be a universal
explainer (Deutsch,2011), by definition.
An alternative approach is to prefer designs for open-ended
systems which promote interpretability and explainability,
or whose goal is to teach human observers. Already, there
are efforts to train systems which directly inform the user of
implicit knowledge (Christiano et al.,2021). One might aim
to design systems that at least maintain informed oversight
(Amodei et al.,2016;Bowman et al.,2022). This approach
8
Open-Endedness is Essential for Artificial Superhuman Intelligence
may be especially effective if the design of the open-ended
system automatically facilitates understanding and control
by human users (Irving et al.,2018).
4.3. Humans Guiding AI Creation
Even if we assume that human observers can understand
enough of the behavior of an open-ended system to be in a
position to give informed feedback, we arrive at the question
of how a human designer could meaningfully guide an open-
ended system. This challenge goes beyond the difficulties of
directing individual RL agents, as not only do open-ended
systems often lack well-defined objectives that could be
modified, but they are increasingly unpredictable by design.
One possibility would be to use humans in the loop to drive
open-endedness (Secretan et al.,2008), a kind of open-
endedness from human feedback (Zhang et al.,2023). A
complete solution to this problem not only needs to be
directable, but must actively raise unexpected and possibly
important artifacts to the user’s attention.
If open-ended systems could be made as directable as in-
dividual RL agents, then work defining objectives which
preserve controllability (Hadfield-Menell et al.,2016;2017;
Carey and Everitt,2023) might be a promising path towards
more controllable open-ended systems. However, direct-
ing an open-ended system towards any objective effectively
while maintaining the open-endedness is an open problem.
This problem is not only important for safety, but is impor-
tant for open-ended systems to be useful. In sufficiently
broad domains—such as all of mathematics, all proteins, or
all behaviors on a computer—an open-ended system may
rabbit-hole into the obscure theorems, useless proteins, or
only certain computer applications. Thus, building mech-
anisms that allow us to direct open-ended systems to not
just the safe artifacts, but the interesting and useful artifacts,
is a fruitful avenue for collaboration between safety and
open-endedness researchers.
4.4. Human Society Adapting
There are significant non-technical concerns in ensuring that
society can understand, prepare for, and appropriately react
to new technological capabilities emerging from open-ended
foundation models. Indeed, the impact of AI systems is not
just felt at the individual level, but also at the level of the
collectives that structure our society—communities, organ-
isations, markets and nation states, to name a few. Since
the artifacts arising from open-ended foundation models
will by definition appear novel, we must devote prospective
attention to the ways in which these could harm or benefit
the cooperative infrastructure of society (Dafoe et al.,2020).
Likewise, we must develop mechanisms to avoid tipping
points driven by feedback loops, like flash crashes (Aldrich
et al.,2017). Decision-makers should be prepared to adapt
governance rapidly and retrospectively in response to open-
ended artifacts, finding a good balance between collecting
information and avoiding entrenchment of undesirable arti-
facts (Collingridge,1980).
4.5. Emergent Risks of Open-Ended Systems
Even if each subcomponent of Figure 2 can be made safe,
it may still be the case that the aggregate joint human-AI
open-ended system leads to unforeseen problems. For in-
stance, two systems that are open-ended in isolation could
negatively interact to cause neither to be open-ended. This
would mean a cessation of progress and an inability to col-
lectively respond to new challenges. While such emergent
effects have been studied in multi-agent systems (Johanson
et al., 2022) and ASI safety (Critch and Krueger, 2020), solu-
tions are still elusive, and an understanding of these effects
is critical to the safe deployment of open-ended systems.
If such problems are inevitable and unpredictable, we would
need our human-AI open-ended systems to adapt to solve
novel ASI safety failures as they arise. Due to the in-
herent unpredictability of knowledge creation, these prob-
lems may be both unavoidable and solvable as they
arise (Deutsch,2011). We should be building an open-ended
system whose safety is anti-fragile (Taleb,2014), adapting
to emerging safety risks and getting stronger for it. This
entails designing techniques for understanding, monitoring,
and rapidly coordinating responses to emerging risks.
5. Conclusion and Outlook
Foundation models have led to a rapid increase in the gen-
erality of current AI systems. However, current foundation
models are limited in their capability to discover new knowl-
edge. In this paper, our position is that to further advance
in levels of AGI towards ASI, we require systems that are
open-ended—endowed with the ability to generate novel
and learnable artifacts for a human observer. There has
never been a more exciting time to build such systems, with
foundation models already exhibiting general human-like
knowledge that both accelerates further learning and guides
this learning towards human-relevant artifacts.
As we develop and deploy more generally-capable open-
ended systems, novel safety concerns arise that will be criti-
cal to address. In order to realise the benefits of such sys-
tems, it is important that the human observer remains able
to learn from the novel artifacts, bringing fields such as ex-
plainability to the forefront of open-endedness research. If
these endeavors are successful, then we believe open-ended
foundation models could lead to advances that drastically
enhance modern society.
Impact Statement
Our work provides a formal definition of open-endedness,
and provides a discussion on its significance for the pursuit
of ASI. We explore current research directions in the field,
emphasising the potential of combining open-endedness
with foundation models as a pre-eminent path towards
achieving ASI. Developed responsibly, we believe that such
open-ended foundation models can have tremendous posi-
tive impact on society, accelerating scientific and techno-
logical breakthroughs, enhancing human creativity through
a collaborative feedback loop, and acting as an engine for
general knowledge expansion across many fields. Recognis-
ing the profound implications of this concept, we dedicate
the entirety of Section 4 to an initial analysis of potential
risks and societal impacts, offering frameworks for the re-
sponsible and ethical development of ASI. We hope that
highlighting these issues early will help to promote safety,
responsibility and accountability as the field grows.
Acknowledgements
We gratefully acknowledge Dave Abel for providing valu-
able feedback on an early draft of this paper. We are thankful
to the designers at the Noun Project, from which we sourced
graphics under the CC BY 3.0 licence as follows: tick
icon by kareemovic, Delete icon by kareemovic, alien
icon by Artem Yurov, girl icon by Teewara soontorn, year
of rat icon by DailyPM, aircraft icon by mikicon, con-
corde icon by mikicon, Plane icon by CAMB, humans
icon by Ifanicon, and Robot icon by Deemak Daksina.
References
D. Abel, A. Barreto, B. Van Roy, D. Precup, H. van Hasselt,
and S. Singh. A definition of continual reinforcement
learning. ArXiv preprint, abs/2307.11046, 2023. URL
https://arxiv.org/abs/2307.11046.
M. Ahn, A. Brohan, N. Brown, Y. Chebotar, O. Cortes,
B. David, C. Finn, C. Fu, K. Gopalakrishnan, K. Haus-
man, A. Herzog, D. Ho, J. Hsu, J. Ibarz, B. Ichter,
A. Irpan, E. Jang, R. J. Ruano, K. Jeffrey, S. Jesmonth,
N. J. Joshi, R. Julian, D. Kalashnikov, Y. Kuang, K.-
H. Lee, S. Levine, Y. Lu, L. Luu, C. Parada, P. Pastor,
J. Quiambao, K. Rao, J. Rettinghouse, D. Reyes, P. Ser-
manet, N. Sievers, C. Tan, A. Toshev, V. Vanhoucke,
F. Xia, T. Xiao, P. Xu, S. Xu, M. Yan, and A. Zeng. Do
As I Can, Not As I Say: Grounding Language in Robotic
Affordances, Aug. 2022.
E. M. Aldrich, J. Grundfest, and G. Laughlin. The flash
crash: A new deconstruction. Available at SSRN 2721922,
2017.
D. Amodei, C. Olah, J. Steinhardt, P. Christiano, J. Schul-
man, and D. Mané. Concrete problems in AI safety.
ArXiv preprint, abs/1606.06565, 2016. URL https://arxiv.org/abs/1606.06565.
Y. Bai, S. Kadavath, S. Kundu, A. Askell, J. Kernion,
A. Jones, A. Chen, A. Goldie, A. Mirhoseini, C. McK-
innon, C. Chen, C. Olsson, C. Olah, D. Hernan-
dez, D. Drain, D. Ganguli, D. Li, E. Tran-Johnson,
E. Perez, J. Kerr, J. Mueller, J. Ladish, J. Landau,
K. Ndousse, K. Lukosuite, L. Lovitt, M. Sellitto, N. El-
hage, N. Schiefer, N. Mercado, N. DasSarma, R. Lasenby,
R. Larson, S. Ringer, S. Johnston, S. Kravec, S. E.
Showk, S. Fort, T. Lanham, T. Telleen-Lawton, T. Con-
erly, T. Henighan, T. Hume, S. R. Bowman, Z. Hatfield-
Dodds, B. Mann, D. Amodei, N. Joseph, S. McCandlish,
T. Brown, and J. Kaplan. Constitutional AI: Harmlessness
from AI Feedback, Dec. 2022.
B. Baker, I. Kanitscheider, T. M. Markov, Y. Wu, G. Powell,
B. McGrew, and I. Mordatch. Emergent tool use from
multi-agent autocurricula. In 8th International Confer-
ence on Learning Representations, ICLR 2020, Addis
Ababa, Ethiopia, April 26-30, 2020. OpenReview.net,
2020. URL https://openreview.net/forum?id=SkxpxJBKwS.
A. Bakhtin, N. Brown, E. Dinan, G. Farina, C. Flaherty,
D. Fried, A. Goff, J. Gray, H. Hu, et al. Human-level play
in the game of diplomacy by combining language models
with strategic reasoning. Science, 378(6624):1067–1074,
2022.
J. Bauer, K. Baumli, F. Behbahani, A. Bhoopchand,
N. Bradley-Schmieg, M. Chang, N. Clay, A. Collister,
V. Dasagi, L. Gonzalez, K. Gregor, E. Hughes, S. Kashem,
M. Loks-Thompson, H. Openshaw, J. Parker-Holder,
S. Pathak, N. Perez-Nieves, N. Rakicevic, T. Rocktäschel,
Y. Schroecker, S. Singh, J. Sygnowski, K. Tuyls, S. York,
A. Zacherl, and L. M. Zhang. Human-timescale adapta-
tion in an open-ended task space. In A. Krause, E. Brun-
skill, K. Cho, B. Engelhardt, S. Sabato, and J. Scarlett,
editors, Proceedings of the 40th International Conference
on Machine Learning, volume 202 of Proceedings of
Machine Learning Research, pages 1887–1935. PMLR,
2023.
K. Baumli, S. Baveja, F. Behbahani, H. Chan, G. Comanici,
S. Flennerhag, M. Gazeau, K. Holsheimer, D. Horgan,
M. Laskin, et al. Vision-language models as a source of
rewards. ArXiv preprint, abs/2312.09187, 2023. URL
https://arxiv.org/abs/2312.09187.
M. Bedau. Measurement of evolutionary activity, teleology,
and life. 1992.
M. A. Bedau, E. Snyder, C. T. Brown, N. H. Packard, et al. A
comparison of evolutionary activity in artificial evolving
systems and in the biosphere. In Proceedings of the fourth
European conference on artificial life, pages 125–134.
MIT Press, Cambridge, 1997.
M. A. Bedau, E. Snyder, and N. H. Packard. A classification
of long-term evolutionary dynamics. Artificial Life: The
Proceedings..., page 228, 1998.
M. Bellemare, S. Srinivasan, G. Ostrovski, T. Schaul,
D. Saxton, and R. Munos. Unifying count-based ex-
ploration and intrinsic motivation. Advances in neural
information processing systems, 29, 2016.
C. Berner, G. Brockman, B. Chan, V. Cheung, P. Debiak,
C. Dennison, D. Farhi, Q. Fischer, S. Hashme, C. Hesse,
et al. Dota 2 with large scale deep reinforcement learning.
ArXiv preprint, abs/1912.06680, 2019. URL https://arxiv.org/abs/1912.06680.
S. Bills, N. Cammarata, D. Mossing, H. Tillman, L. Gao,
G. Goh, I. Sutskever, J. Leike, J. Wu, and W. Saun-
ders. Language models can explain neurons in language
models, 2023. URL https://openaipublic.blob.core.windows.net/neuron-explainer/paper/index.html (date accessed: 14.05.2023).
R. Bommasani, D. A. Hudson, E. Adeli, R. Altman,
S. Arora, S. von Arx, M. S. Bernstein, J. Bohg, A. Bosse-
lut, E. Brunskill, et al. On the opportunities and risks
of foundation models. arXiv preprint arXiv:2108.07258,
2021.
S. R. Bowman, J. Hyun, E. Perez, E. Chen, C. Pettit,
S. Heiner, K. Lukošiūtė, A. Askell, A. Jones, A. Chen,
et al. Measuring progress on scalable oversight for large
language models. ArXiv preprint, abs/2211.03540, 2022.
URL https://arxiv.org/abs/2211.03540.
H. Bradley, A. Dai, H. Teufel, J. Zhang, K. Oostermei-
jer, M. Bellagente, J. Clune, K. Stanley, G. Schott, and
J. Lehman. Quality-Diversity through AI Feedback, Oct.
2023.
J. C. Brant and K. O. Stanley. Minimal criterion coevolution:
a new approach to open-ended search. In Proceedings of
the Genetic and Evolutionary Computation Conference,
pages 67–74, 2017.
T. Brooks, B. Peebles, C. Holmes, W. DePue, Y. Guo,
L. Jing, D. Schnurr, J. Taylor, T. Luhman, E. Luh-
man, C. Ng, R. Wang, and A. Ramesh. Video
generation models as world simulators. 2024.
URL https://openai.com/research/video-generation-models-as-world-simulators.
J. Bruce, M. Dennis, A. Edwards, J. Parker-Holder, Y. Shi,
E. Hughes, M. Lai, A. Mavalankar, R. Steigerwald,
C. Apps, Y. Aytar, S. Bechtle, F. Behbahani, S. Chan,
N. Heess, L. Gonzalez, S. Osindero, S. Ozair, S. Reed,
J. Zhang, K. Zolna, J. Clune, N. de Freitas, S. Singh, and
T. Rocktäschel. Genie: Generative Interactive Environ-
ments, Feb. 2024.
Y. Burda, H. Edwards, A. Storkey, and O. Klimov. Explo-
ration by Random Network Distillation, Oct. 2018.
M. C. Campi and S. Garatti. Compression, generalization
and learning. ArXiv preprint, abs/2301.12767, 2023. URL
https://arxiv.org/abs/2301.12767.
R. Carey and T. Everitt. Human control: Definitions and
algorithms. ArXiv preprint, abs/2305.19861, 2023. URL
https://arxiv.org/abs/2305.19861.
H. Chan, V. Mnih, F. Behbahani, M. Laskin, L. Wang,
F. Pardo, M. Gazeau, H. Sahni, D. Horgan, K. Baumli,
Y. Schroecker, S. Spencer, R. Steigerwald, J. Quan, G. Co-
manici, S. Flennerhag, A. Neitz, L. M. Zhang, T. Schaul,
S. Singh, C. Lyle, T. Rocktäschel, J. Parker-Holder, and
K. Holsheimer. Vision-language models as a source of
rewards. In Second Agent Learning in Open-Endedness
Workshop, 2023.
A. Chen, D. M. Dohan, and D. R. So. EvoPrompting:
Language Models for Code-Level Neural Architecture
Search, Feb. 2023a.
X. Chen, M. Lin, N. Schärli, and D. Zhou. Teaching Large
Language Models to Self-Debug, Apr. 2023b.
P. Christiano, A. Cotra, and M. Xu. Eliciting latent knowl-
edge: How to tell if your eyes deceive you, 2021.
J. Clark and D. Amodei. Faulty reward functions in the
wild. Internet: https://blog.openai.com/faulty-reward-functions, 2016.
J. Clune. AI-GAs: AI-generating algorithms, an alternate
paradigm for producing general artificial intelligence, Jan.
2020.
J. Clune. AI will go farther if it stands on the shoulders of giant human data sets. Dec. 2022.
C. Colas, T. Karch, O. Sigaud, and P.-Y. Oudeyer. Autotelic
agents with intrinsically motivated goal-conditioned rein-
forcement learning: a short survey. Journal of Artificial
Intelligence Research, 74:1159–1199, 2022.
D. Collingridge. The Social Control of Technology.
St. Martin’s Press, 1980. ISBN 9780312731687.
URL https://books.google.co.uk/books?id=hCSdAQAACAAJ.
A. Critch and D. Krueger. AI research considerations for human existential safety (ARCHES). ArXiv preprint, abs/2006.04948, 2020. URL https://arxiv.org/abs/2006.04948.
A. Dafoe, E. Hughes, Y. Bachrach, T. Collins, K. R. McKee,
J. Z. Leibo, K. Larson, and T. Graepel. Open Problems
in Cooperative AI, Dec. 2020.
O. David, S. Moran, and A. Yehudayoff. On statistical
learning via the lens of compression. ArXiv preprint,
abs/1610.03592, 2016. URL https://arxiv.org/abs/1610.03592.
G. Delétang, A. Ruoss, P.-A. Duquenne, E. Catt, T. Ge-
newein, C. Mattern, J. Grau-Moya, L. K. Wenliang,
M. Aitchison, L. Orseau, M. Hutter, and J. Veness. Lan-
guage Modeling Is Compression, Sept. 2023.
M. Dennis, N. Jaques, E. Vinitsky, A. M. Bayen, S. Rus-
sell, A. Critch, and S. Levine. Emergent complexity and
zero-shot transfer via unsupervised environment design.
In H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, and
H. Lin, editors, Advances in Neural Information Process-
ing Systems 33: Annual Conference on Neural Informa-
tion Processing Systems 2020, NeurIPS 2020, December
6-12, 2020, virtual, 2020.
J. Derbyshire. Potential surprise theory as a theoretical foun-
dation for scenario planning. Technological Forecasting
and Social Change, 124:77–87, 2017.
D. Deutsch. The beginning of infinity: Explanations that
transform the world. Penguin UK, 2011.
L. L. di Langosco, J. Koch, L. D. Sharkey, J. Pfau, and
D. Krueger. Goal misgeneralization in deep reinforce-
ment learning. In K. Chaudhuri, S. Jegelka, L. Song,
C. Szepesvári, G. Niu, and S. Sabato, editors, Interna-
tional Conference on Machine Learning, ICML 2022, 17-
23 July 2022, Baltimore, Maryland, USA, volume 162 of
Proceedings of Machine Learning Research, pages 12004–
12019. PMLR, 2022. URL https://proceedings.mlr.press/v162/langosco22a.html.
E. L. Dolson, A. E. Vostinar, M. J. Wiser, and C. Ofria. The
modes toolbox: Measurements of open-ended dynamics
in evolving systems. Artificial life, 25(1):50–73, 2019.
Y. Du, K. Konyushkova, M. Denil, A. Raju, J. Landon,
F. Hill, N. de Freitas, and S. Cabi. Vision-language mod-
els as success detectors. In Proceedings of The 2nd Con-
ference on Lifelong Learning Agents, pages 120–136,
2023a.
Y. Du, E. Kosoy, A. Dayan, M. Rufova, P. Abbeel, and
A. Gopnik. What can AI learn from human exploration?
intrinsically-motivated humans and agents in open-world
exploration. In NeurIPS 2023 workshop: Information-
Theoretic Principles in Cognitive Systems, 2023b.
Y. Du, S. Li, A. Torralba, J. B. Tenenbaum, and I. Mor-
datch. Improving factuality and reasoning in language
models through multiagent debate. arXiv preprint
arXiv:2305.14325, 2023c.
S. Earle, J. Togelius, and L. B. Soros. Video games as
a testbed for open-ended phenomena. In 2021 IEEE
Conference on Games (CoG), pages 1–9. IEEE, 2021.
A. Ecoffet, J. Clune, and J. Lehman. Open Questions in
Creating Safe Open-ended AI: Tensions Between Control
and Creativity, June 2020.
M. Faldor, J. Zhang, A. Cully, and J. Clune. Omni-epic:
Open-endedness via models of human notions of interest-
ingness with environments programmed in code. arXiv
preprint arXiv:2405.15568, 2024.
C. Fernando, D. Banarse, H. Michalewski, S. Osindero, and
T. Rocktäschel. Promptbreeder: Self-Referential Self-
Improvement Via Prompt Evolution, Sept. 2023.
K. Gandhi, D. Sadigh, and N. D. Goodman. Strategic Rea-
soning with Language Models, May 2023.
J. García and F. Fernández. A comprehensive survey on safe
reinforcement learning. Journal of Machine Learning
Research, 16(1):1437–1480, 2015.
Z. Gou, Z. Shao, Y. Gong, Y. Shen, Y. Yang, N. Duan,
and W. Chen. Critic: Large language models can self-
correct with tool-interactive critiquing. ArXiv preprint,
abs/2305.11738, 2023. URL https://arxiv.org/abs/2305.11738.
Q. Guo, R. Wang, J. Guo, B. Li, K. Song, X. Tan, G. Liu,
J. Bian, and Y. Yang. Connecting Large Language Models
with Evolutionary Algorithms Yields Powerful Prompt
Optimizers, Sept. 2023.
I. Gur, N. Jaques, Y. Miao, J. Choi, M. Tiwari, H. Lee, and
A. Faust. Environment generation for zero-shot com-
positional reinforcement learning. Advances in Neural
Information Processing Systems, 34:4157–4169, 2021.
W. Gurnee and M. Tegmark. Language Models Represent
Space and Time, Oct. 2023.
D. Hadfield-Menell, S. J. Russell, P. Abbeel, and A. D.
Dragan. Cooperative inverse reinforcement learning. In
D. D. Lee, M. Sugiyama, U. von Luxburg, I. Guyon,
and R. Garnett, editors, Advances in Neural Information
Processing Systems 29: Annual Conference on Neural
Information Processing Systems 2016, December 5-10,
2016, Barcelona, Spain, pages 3909–3917, 2016.
D. Hadfield-Menell, A. D. Dragan, P. Abbeel, and S. J.
Russell. The off-switch game. In C. Sierra, editor, Pro-
ceedings of the Twenty-Sixth International Joint Confer-
ence on Artificial Intelligence, IJCAI 2017, Melbourne,
Australia, August 19-25, 2017, pages 220–227. ijcai.org,
2017. doi: 10.24963/ijcai.2017/32. URL https://doi.org/10.24963/ijcai.2017/32.
E. Hazan and S. Kale. Extracting certainty from uncertainty:
Regret bounded by variation in costs. Machine learning,
80:165–188, 2010.
M. Henaff, R. Raileanu, M. Jiang, and T. Rocktäschel. Ex-
ploration via Elliptical Episodic Bonuses, Jan. 2023.
J. H. Holland. Adaptation in natural and artificial systems:
an introductory analysis with applications to biology,
control, and artificial intelligence. MIT press, 1992.
A. Hu, L. Russell, H. Yeo, Z. Murez, G. Fedoseev,
A. Kendall, J. Shotton, and G. Corrado. GAIA-1: A
Generative World Model for Autonomous Driving, Sept.
2023.
J. Huang, S. S. Gu, L. Hou, Y. Wu, X. Wang, H. Yu, and
J. Han. Large Language Models Can Self-Improve, Oct.
2022.
P. Huang, X. Zhang, Z. Cao, S. Liu, M. Xu, W. Ding, J. Fran-
cis, B. Chen, and D. Zhao. What went wrong? closing
the sim-to-real gap via differentiable causal discovery. In
Conference on Robot Learning, pages 734–760. PMLR,
2023.
M. Hutter. Universal artificial intelligence: Sequential
decisions based on algorithmic probability. Springer
Science & Business Media, 2004.
G. Irving, P. Christiano, and D. Amodei. AI safety via
debate, Oct. 2018.
M. Jiang, E. Grefenstette, and T. Rocktäschel. Prioritized
Level Replay, June 2021.
M. Jiang, T. Rocktäschel, and E. Grefenstette. General
Intelligence Requires Rethinking Exploration, Nov. 2022.
M. B. Johanson, E. Hughes, F. Timbers, and J. Z. Leibo.
Emergent bartering behaviour in multi-agent reinforce-
ment learning. ArXiv preprint, abs/2205.06760, 2022.
URL https://arxiv.org/abs/2205.06760.
N. Justesen, R. R. Torrado, P. Bontrager, A. Khalifa, J. To-
gelius, and S. Risi. Illuminating generalization in deep
reinforcement learning through procedural level genera-
tion. arXiv preprint arXiv:1806.10729, 2018.
M. Klissarov, P. D’Oro, S. Sodhani, R. Raileanu, P.-L. Ba-
con, P. Vincent, A. Zhang, and M. Henaff. Motif: Intrinsic
Motivation from Artificial Intelligence Feedback, Sept.
2023.
F. H. Knight. Risk, uncertainty and profit, volume 31.
Houghton Mifflin, 1921.
V. Krakovna, L. Orseau, R. Kumar, M. Martic, and S. Legg.
Penalizing side effects using stepwise relative reachability.
ArXiv preprint, abs/1806.01186, 2018. URL https://arxiv.org/abs/1806.01186.
S. Legg and M. Hutter. Universal Intelligence: A Definition
of Machine Intelligence, Dec. 2007.
J. Lehman and K. O. Stanley. Abandoning Objectives: Evo-
lution Through the Search for Novelty Alone. Evolu-
tionary Computation, 19(2):189–223, June 2011. ISSN
1063-6560. doi: 10.1162/EVCO_a_00025.
J. Lehman, J. Gordon, S. Jain, K. Ndousse, C. Yeh, and
K. O. Stanley. Evolution through Large Models, June
2022.
J. Z. Leibo, E. Hughes, M. Lanctot, and T. Graepel. Au-
tocurricula and the Emergence of Innovation from Social
Interaction: A Manifesto for Multi-Agent Intelligence
Research, Mar. 2019.
S. Lifshitz, K. Paster, H. Chan, J. Ba, and S. McIlraith.
STEVE-1: A Generative Model for Text-to-Behavior in
Minecraft, June 2023.
S. Liu, C. Chen, X. Qu, K. Tang, and Y.-S. Ong. Large lan-
guage models as evolutionary optimizers. arXiv preprint
arXiv:2310.19046, 2023a.
X. Liu, H. Yu, H. Zhang, Y. Xu, X. Lei, H. Lai, Y. Gu,
H. Ding, K. Men, K. Yang, S. Zhang, X. Deng, A. Zeng,
Z. Du, C. Zhang, S. Shen, T. Zhang, Y. Su, H. Sun,
M. Huang, Y. Dong, and J. Tang. AgentBench: Eval-
uating LLMs as Agents, Aug. 2023b.
Y. J. Ma, W. Liang, G. Wang, D.-A. Huang, O. Bastani,
D. Jayaraman, Y. Zhu, L. Fan, and A. Anandkumar. Eu-
reka: Human-Level Reward Design via Coding Large
Language Models, Oct. 2023.
A. N. Mavor-Parker, K. A. Young, C. Barry, and L. D.
Griffin. How to stay curious while avoiding noisy tvs
using aleatoric uncertainty estimation. In K. Chaud-
huri, S. Jegelka, L. Song, C. Szepesvári, G. Niu, and
S. Sabato, editors, International Conference on Ma-
chine Learning, ICML 2022, 17-23 July 2022, Balti-
more, Maryland, USA, volume 162 of Proceedings of
Machine Learning Research, pages 15220–15240. PMLR,
2022. URL https://proceedings.mlr.press/v162/mavor-parker22a.html.
D. W. McShea. Perspective: metazoan complexity and evo-
lution: is there a trend? Evolution, 50(2):477–492, 1996.
E. Meyerson, M. J. Nelson, H. Bradley, A. Moradi, A. K.
Hoover, and J. Lehman. Language Model Crossover:
Variation through Few-Shot Prompting, Feb. 2023.
S. Mirchandani, F. Xia, P. Florence, B. Ichter, D. Driess,
M. G. Arenas, K. Rao, D. Sadigh, and A. Zeng. Large
Language Models as General Pattern Machines, July
2023.
I. Momennejad, H. Hasanbeig, F. Vieira Frujeri, H. Sharma,
N. Jojic, H. Palangi, R. Ness, and J. Larson. Evaluating
cognitive maps and planning in large language models
with cogeval. Advances in Neural Information Processing
Systems, 36, 2024.
M. R. Morris, J. Sohl-dickstein, N. Fiedel, T. Warkentin,
A. Dafoe, A. Faust, C. Farabet, and S. Legg. Levels of
AGI: Operationalizing Progress on the Path to AGI, Nov.
2023.
J.-B. Mouret and J. Clune. Illuminating search spaces by
mapping elites. ArXiv preprint, abs/1504.04909, 2015.
URL https://arxiv.org/abs/1504.04909.
OEL Team, A. Stooke, A. Mahajan, C. Barros, C. Deck,
J. Bauer, J. Sygnowski, M. Trebacz, M. Jader-
berg, M. Mathieu, N. McAleese, N. Bradley-Schmieg,
N. Wong, N. Porcel, R. Raileanu, S. Hughes-Fitt, V. Dal-
ibard, and W. M. Czarnecki. Open-ended learning
leads to generally capable agents. ArXiv preprint,
abs/2107.12808, 2021. URL https://arxiv.org/abs/2107.12808.
L. Ouyang, J. Wu, X. Jiang, D. Almeida, C. L. Wain-
wright, P. Mishkin, C. Zhang, S. Agarwal, K. Slama,
A. Ray, J. Schulman, J. Hilton, F. Kelton, L. Miller,
M. Simens, A. Askell, P. Welinder, P. Christiano, J. Leike,
and R. Lowe. Training language models to follow instruc-
tions with human feedback, Mar. 2022.
V. Pallagani, B. Muppasani, K. Murugesan, F. Rossi, B. Sri-
vastava, L. Horesh, F. Fabiano, and A. Loreggia. Un-
derstanding the capabilities of large language models for
automated planning. arXiv preprint arXiv:2305.16151,
2023.
J. S. Park, J. C. O’Brien, C. J. Cai, M. R. Morris, P. Liang,
and M. S. Bernstein. Generative Agents: Interactive
Simulacra of Human Behavior, Apr. 2023.
J. Parker-Holder, M. Jiang, M. Dennis, M. Samvelyan, J. N.
Foerster, E. Grefenstette, and T. Rocktäschel. Evolv-
ing curricula with regret-based environment design. In
K. Chaudhuri, S. Jegelka, L. Song, C. Szepesvári, G. Niu,
and S. Sabato, editors, International Conference on Ma-
chine Learning, ICML 2022, 17-23 July 2022, Balti-
more, Maryland, USA, volume 162 of Proceedings of
Machine Learning Research, pages 17473–17498. PMLR,
2022. URL https://proceedings.mlr.press/v162/parker-holder22a.html.
D. Pathak, P. Agrawal, A. A. Efros, and T. Darrell.
Curiosity-driven exploration by self-supervised predic-
tion. In D. Precup and Y. W. Teh, editors, Pro-
ceedings of the 34th International Conference on Ma-
chine Learning, ICML 2017, Sydney, NSW, Australia,
6-11 August 2017, volume 70 of Proceedings of Ma-
chine Learning Research, pages 2778–2787. PMLR,
2017. URL http://proceedings.mlr.press/v70/pathak17a.html.
J. Perolat, B. De Vylder, D. Hennes, E. Tarassov, F. Strub,
V. de Boer, P. Muller, J. T. Connor, N. Burch, T. Anthony,
et al. Mastering the game of stratego with model-free
multiagent reinforcement learning. Science, 378(6623):
990–996, 2022.
J. K. Pugh, L. B. Soros, and K. O. Stanley. Quality diversity:
A new frontier for evolutionary computation. Frontiers
in Robotics and AI, 3:40, 2016. ISSN 2296-9144. doi:
10.3389/frobt.2016.00040.
R. Raileanu and T. Rocktäschel. RIDE: Rewarding Impact-
Driven Exploration for Procedurally-Generated Environ-
ments, Feb. 2020.
P. J. Richerson, R. Boyd, et al. Institutional evolution in the
holocene: the rise of complex societies. In Proceedings-
British Academy, volume 110, pages 197–234. Oxford
University Press Inc., 2001.
B. Romera-Paredes, M. Barekatain, A. Novikov, M. Balog,
M. P. Kumar, E. Dupont, F. J. R. Ruiz, J. S. Ellenberg,
P. Wang, O. Fawzi, P. Kohli, and A. Fawzi. Mathematical
discoveries from program search with large language
models. Nature, 625(7995):468–475, Jan. 2024. ISSN
1476-4687. doi: 10.1038/s41586-023-06924-6.
A. L. Samuel. Some studies in machine learning using
the game of checkers. IBM Journal of research and
development, 3(3):210–229, 1959.
M. Samvelyan, A. Khan, M. Dennis, M. Jiang, J. Parker-
Holder, J. Foerster, R. Raileanu, and T. Rocktäschel.
MAESTRO: Open-Ended Environment Design for Multi-
Agent Reinforcement Learning, Mar. 2023.
M. Samvelyan, S. C. Raparthy, A. Lupu, E. Hambro, A. H.
Markosyan, M. Bhatt, Y. Mao, M. Jiang, J. Parker-Holder,
J. Foerster, T. Rocktäschel, and R. Raileanu. Rainbow
teaming: Open-ended generation of diverse adversarial
prompts, 2024.
W. Saunders, C. Yeh, J. Wu, S. Bills, L. Ouyang, J. Ward,
and J. Leike. Self-critiquing models for assisting human
evaluators. ArXiv preprint, abs/2206.05802, 2022. URL
https://arxiv.org/abs/2206.05802.
T. Schaul, D. Borsa, D. Ding, D. Szepesvari, G. Ostro-
vski, W. Dabney, and S. Osindero. Adapting behaviour
for learning progress. arXiv preprint arXiv:1912.06910,
2019.
T. Schick, J. Dwivedi-Yu, R. Dessì, R. Raileanu, M. Lomeli,
E. Hambro, L. Zettlemoyer, N. Cancedda, and T. Scialom.
Toolformer: Language models can teach themselves to
use tools. Advances in Neural Information Processing
Systems, 36, 2024.
J. Schmidhuber. Adaptive confidence and adaptive curiosity.
Inst. für Informatik, 1991a.
J. Schmidhuber. A possibility for implementing curiosity
and boredom in model-building neural controllers. In
Proc. of the international conference on simulation of
adaptive behavior: From animals to animats, pages 222–
227, 1991b.
J. Secretan, N. Beato, D. B. D'Ambrosio, A. Rodriguez,
A. Campbell, and K. O. Stanley. Picbreeder: Evolv-
ing pictures collaboratively online. In Proceedings of the
SIGCHI Conference on Human Factors in Computing Sys-
tems, CHI ’08, pages 1759–1768, New York, NY, USA,
Apr. 2008. Association for Computing Machinery. ISBN
978-1-60558-011-1. doi: 10.1145/1357054.1357328.
G. Shackle. Expectation in Economics. Cambridge
University Press, 1949. ISBN 9781107629141.
URL https://books.google.co.uk/books?id=zEb47udAsOcC.
R. Shah, V. Varma, R. Kumar, M. Phuong, V. Krakovna,
J. Uesato, and Z. Kenton. Goal misgeneralization: Why
correct specifications aren’t enough for correct goals.
ArXiv preprint, abs/2210.01790, 2022. URL https://arxiv.org/abs/2210.01790.
A. Sharma, D. Czégel, M. Lachmann, C. P. Kempes, S. I.
Walker, and L. Cronin. Assembly theory explains and
quantifies selection and evolution. Nature, 622(7982):
321–328, Oct. 2023. ISSN 1476-4687. doi: 10.1038/
s41586-023-06600-9.
M. Shin, J. Kim, B. van Opheusden, and T. L. Griffiths.
Superhuman Artificial Intelligence Can Improve Human
Decision Making by Increasing Novelty. Proceedings of
the National Academy of Sciences, 120(12):e2214840120,
Mar. 2023. ISSN 0027-8424, 1091-6490. doi: 10.1073/
pnas.2214840120.
I. Shumailov, Z. Shumaylov, Y. Zhao, Y. Gal, N. Papernot,
and R. Anderson. Model dementia: Generated data makes
models forget. arXiv e-prints, pages arXiv–2305, 2023.
P. Shyam, W. Jaskowski, and F. Gomez. Model-based active
exploration. In K. Chaudhuri and R. Salakhutdinov, ed-
itors, Proceedings of the 36th International Conference
on Machine Learning, ICML 2019, 9-15 June 2019, Long
Beach, California, USA, volume 97 of Proceedings of
Machine Learning Research, pages 5779–5788. PMLR,
2019. URL http://proceedings.mlr.press/v97/shyam19a.html.
O. Sigaud, G. Baldassarre, C. Colas, S. Doncieux, R. Duro,
N. Perrin-Gilbert, and V.-G. Santucci. A definition
of open-ended learning problems for goal-conditioned
agents. ArXiv preprint, abs/2311.00344, 2023. URL
https://arxiv.org/abs/2311.00344.
D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre,
G. van den Driessche, J. Schrittwieser, I. Antonoglou,
V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe,
J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap,
M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis.
Mastering the game of Go with deep neural networks and
tree search. Nature, 529(7587):484–489, Jan. 2016. ISSN
1476-4687. doi: 10.1038/nature16961.
D. Silver, T. Hubert, J. Schrittwieser, I. Antonoglou, M. Lai,
A. Guez, M. Lanctot, L. Sifre, D. Kumaran, T. Graepel,
T. Lillicrap, K. Simonyan, and D. Hassabis. Mastering
Chess and Shogi by Self-Play with a General Reinforce-
ment Learning Algorithm, Dec. 2017.
R. J. Solomonoff. A preliminary report on a general theory
of inductive inference. Citeseer, 1960.
L. Soros and K. Stanley. Identifying necessary condi-
tions for open-ended evolution through the artificial
life world of chromaria. In ALIFE 14: The Four-
teenth International Conference on the Synthesis and
Simulation of Living Systems, ALIFE 2023: Ghost
in the Machine: Proceedings of the 2023 Artificial
Life Conference, pages 793–800, 2014. doi: 10.1162/978-0-262-32621-6-ch128. URL https://doi.org/10.1162/978-0-262-32621-6-ch128.
K. Stanley and J. Lehman. Why Greatness Cannot Be
Planned: The Myth of the Objective. Springer In-
ternational Publishing, 2015. ISBN 9783319155241.
URL https://books.google.co.uk/books?id=Llb1CAAAQBAJ.
K. O. Stanley. Why open-endedness matters. Artificial Life,
25(3):232–235, 2019. ISSN 1064-5462. doi: 10.1162/artl_a_00294. URL https://doi.org/10.1162/artl_a_00294.
K. O. Stanley and L. Soros. The role of subjectivity in
the evaluation of open-endedness. In Presentation deliv-
ered in OEE2: The Second Workshop on Open-Ended
Evolution, at ALIFE 2016, 2016.
K. O. Stanley, J. Lehman, and L. Soros. Open-endedness:
The last grand challenge you’ve never heard of. While
open-endedness could be a force for discovering intelli-
gence, it could also be a component of AI itself, 2017.
S. Stepney and S. Hickinbotham. On the open-endedness
of detecting open-endedness. Artificial Life, pages 1–26,
2023.
R. J. Sternberg and J. E. Davidson. The nature of insight.
The MIT Press, 1995.
N. N. Taleb. Antifragile: Things that gain from disorder,
volume 3. Random House Trade Paperbacks, 2014.
X. Tang, A. Zou, Z. Zhang, Y. Zhao, X. Zhang, A. Cohan,
and M. Gerstein. Medagents: Large language models
as collaborators for zero-shot medical reasoning. arXiv
preprint arXiv:2311.10537, 2023.
T. Taylor. Requirements for open-ended evolution in natural
and artificial systems. arXiv preprint arXiv:1507.07403,
2015.
T. Taylor. Routes to open-endedness in evolutionary systems.
arXiv preprint arXiv:1806.01883, 2018.
A. M. Turner, D. Hadfield-Menell, and P. Tadepalli. Con-
servative agency via attainable utility preservation. In
Proceedings of the AAAI/ACM Conference on AI, Ethics,
and Society, pages 385–391, 2020.
F. Urbina, F. Lentzos, C. Invernizzi, and S. Ekins. Dual use
of artificial-intelligence-powered drug discovery. Nature
Machine Intelligence, 4(3):189–191, 2022.
K. Valmeekam, M. Marquez, S. Sreedharan, and S. Kamb-
hampati. On the planning abilities of large language
models-a critical investigation. Advances in Neural Infor-
mation Processing Systems, 36:75993–76005, 2023.
P. Villalobos, J. Sevilla, L. Heim, T. Besiroglu, M. Hobb-
hahn, and A. Ho. Will we run out of data? An analysis of
the limits of scaling datasets in Machine Learning, Oct.
2022.
O. Vinyals, I. Babuschkin, W. M. Czarnecki, M. Mathieu,
A. Dudzik, J. Chung, D. H. Choi, R. Powell, T. Ewalds,
P. Georgiev, J. Oh, D. Horgan, M. Kroiss, I. Danihelka,
A. Huang, L. Sifre, T. Cai, J. P. Agapiou, M. Jader-
berg, A. S. Vezhnevets, R. Leblond, T. Pohlen, V. Dal-
ibard, D. Budden, Y. Sulsky, J. Molloy, T. L. Paine,
C. Gulcehre, Z. Wang, T. Pfaff, Y. Wu, R. Ring, D. Yo-
gatama, D. Wünsch, K. McKinney, O. Smith, T. Schaul,
T. Lillicrap, K. Kavukcuoglu, D. Hassabis, C. Apps,
and D. Silver. Grandmaster level in StarCraft II us-
ing multi-agent reinforcement learning. Nature, 575
(7782):350–354, Nov. 2019. ISSN 1476-4687. doi:
10.1038/s41586-019-1724-z.
L. S. Vygotsky and M. Cole. Mind in society: Development
of higher psychological processes. Harvard university
press, 1978.
C. H. Waddington. Paradigm for an evolutionary process.
Biological Theory, 3:258–266, 2008.
G. Wang, Y. Xie, Y. Jiang, A. Mandlekar, C. Xiao, Y. Zhu,
L. Fan, and A. Anandkumar. Voyager: An Open-Ended
Embodied Agent with Large Language Models, May
2023a.
R. Wang, J. Lehman, J. Clune, and K. O. Stanley. Paired
open-ended trailblazer (POET): endlessly generating in-
creasingly complex and diverse learning environments
and their solutions. ArXiv preprint, abs/1901.01753, 2019.
URL https://arxiv.org/abs/1901.01753.
R. Wang, J. Lehman, A. Rawal, J. Zhi, Y. Li, J. Clune, and
K. O. Stanley. Enhanced POET: open-ended reinforce-
ment learning through unbounded invention of learning
challenges and their solutions. In Proceedings of the 37th
International Conference on Machine Learning, ICML
2020, 13-18 July 2020, Virtual Event, volume 119 of
Proceedings of Machine Learning Research, pages 9940–
9951. PMLR, 2020. URL http://proceedings.mlr.press/v119/wang20l.html.
T. T. Wang, A. Gleave, T. Tseng, K. Pelrine, N. Bel-
rose, J. Miller, M. D. Dennis, Y. Duan, V. Pogrebniak,
S. Levine, et al. Adversarial policies beat superhuman Go AIs. In International Conference on Machine Learning,
pages 35655–35739. PMLR, 2023b.
Y. Wang, Y. Kordi, S. Mishra, A. Liu, N. A. Smith,
D. Khashabi, and H. Hajishirzi. Self-instruct: Aligning language models with self-generated instructions. ArXiv preprint, abs/2212.10560, 2022. URL https://arxiv.org/abs/2212.10560.
Z. Wang, S. Cai, A. Liu, Y. Jin, J. Hou, B. Zhang,
H. Lin, Z. He, Z. Zheng, Y. Yang, X. Ma, and Y. Liang.
JARVIS-1: Open-World Multi-task Agents with Memory-
Augmented Multimodal Language Models, Nov. 2023c.
L. Wong, G. Grand, A. K. Lew, N. D. Goodman, V. K.
Mansinghka, J. Andreas, and J. B. Tenenbaum. From
Word Models to World Models: Translating from Natural
Language to the Probabilistic Language of Thought, June
2023a.
M. L. Wong, C. E. Cleland, D. Arend Jr, S. Bartlett, H. J.
Cleaves, H. Demarest, A. Prabhu, J. I. Lunine, and R. M.
Hazen. On the roles of function and selection in evolv-
ing systems. Proceedings of the National Academy of
Sciences, 120(43):e2310223120, 2023b.
X. Wu, S.-h. Wu, J. Wu, L. Feng, and K. C. Tan. Evolu-
tionary computation in the era of large language model:
Survey and roadmap. arXiv preprint arXiv:2401.10034,
2024.
Y. Wu, S. Prabhumoye, S. Y. Min, Y. Bisk, R. Salakhutdi-
nov, A. Azaria, T. Mitchell, and Y. Li. SPRING: GPT-4
Out-performs RL Algorithms by Studying Papers and
Reasoning, May 2023.
C. Yang, X. Wang, Y. Lu, H. Liu, Q. V. Le, D. Zhou, and
X. Chen. Large Language Models as Optimizers, Sept.
2023a.
M. Yang, Y. Du, K. Ghasemipour, J. Tompson, D. Schuur-
mans, and P. Abbeel. Learning Interactive Real-World
Simulators, Oct. 2023b.
W. Yuan, R. Y. Pang, K. Cho, S. Sukhbaatar, J. Xu, and
J. Weston. Self-rewarding language models. arXiv
preprint arXiv:2401.10020, 2024.
J. Zhang, J. Lehman, K. Stanley, and J. Clune. OMNI:
Open-endedness via Models of human Notions of Inter-
estingness, June 2023.
B. Zheng, B. Gou, J. Kil, H. Sun, and Y. Su. GPT-4V(ision)
is a Generalist Web Agent, if Grounded, Jan. 2024.
S. Zhou, F. F. Xu, H. Zhu, X. Zhou, R. Lo, A. Srid-
har, X. Cheng, T. Ou, Y. Bisk, D. Fried, U. Alon, and
G. Neubig. Webarena: A realistic web environment for
building autonomous agents. In Second Agent Learn-
ing in Open-Endedness Workshop, 2023. URL https://openreview.net/forum?id=rmiwIL98uQ.
D. M. Ziegler, N. Stiennon, J. Wu, T. B. Brown, A. Radford,
D. Amodei, P. Christiano, and G. Irving. Fine-tuning
language models from human preferences. ArXiv preprint,
abs/1909.08593, 2019. URL https://arxiv.org/abs/1909.08593.
A. Illustrating Open-Endedness
A.1. An Informal Example
To illustrate our definition informally, we provide a relatable real-world example. Let $S$ be a research lab and the $x_t$ be academic papers published by the lab. A natural choice of observer $O$ is a research student in the field at a different lab. Roughly speaking, a research student sees novelty in a line of work if, based on their knowledge of the literature up to time $t$, given any subsequent paper $x_T$ they can always find a later paper $x_{T'}$ that is more surprising than $x_T$. This is intuitively sensible: a putative student with knowledge of Newtonian mechanics will find Maxwell's equations hard to predict, quantum mechanics even more surprising, and contemporary particle physics very far outside their current level of comprehension. A research student sees learnability in a line of work if they find that reading the previous papers helps them better to predict the contents of the current paper.
Again, this appeals to our intuition: part of the purpose of
citations, for instance, is to point new researchers at previous
works that will help to deepen their understanding of the
current work.
Our interpretation of “interestingness” as learnability also
makes sense from the perspective of a research student. A
research student may choose to ignore a paper’s choice of
font, but will likely pay close attention to the details of a
novel method that yields state-of-the-art results. Thus the
student finds interesting the parts of the paper from which
they can learn the most. Similarly, the requirement that
the loss metric $\ell$ be chosen without knowledge of $S$ finds a
natural interpretation here. A research student cannot judge
the open-endedness of a stack of papers by choosing to
never read the papers and instead inventing their own line
of research with no reference to previous works.
A.2. Definitional Subtleties
Self-play illustrates some subtleties in our definition. The
first subtlety is the dependence of open-endedness on the
choice of observer. Suppose that
O
is an oracle who knows
the Nash strategy to play in Go. Assuming that the oracle
is modelling the win-rate of AlphaZero’s artifacts against
its own policy, it will never find any AlphaZero policy to
be novel. Therefore the oracle does not find AlphaZero to
be open-ended. The second subtlety is the dependence of
open-endedness on the learning limitations of the observer.
To an average human Go player, as opposed to an expert,
AlphaZero becomes novel earlier in training, and at some
point ceases to be learnable, because the average player
cannot figure out how to improve their own play with reference to the very unusual style of a superhuman policy. Thus,
open-ended systems only remain open-ended while they
can “educate” their observers. We posit that superhuman
intelligence will be interesting to humans only as far as
humans can learn to understand it. The third subtlety is
that open-ended systems need not explore a problem space
fully to qualify as open-ended. Recently, adversarial search
was shown to yield policies that beat reimplementations
of AlphaZero and which are so simple that even amateur
humans can learn them (Wang et al.,2023b). Novelty and
learnability give no guarantee of coverage.
Because our definition is based on the perspective of an
external observer, one could worry that this makes it impos-
sible to make any sort of objective claims about the open-
endedness of any particular system, in harmony with the
arguments of Stanley and Soros (2016); Stepney and Hick-
inbotham (2023). There are two factors which mitigate this
concern. Firstly, the definition of open-endedness becomes
objective given any fixed observer, and so it becomes a mea-
surable claim, in the sense that theorems can be written and
experiments conducted. For instance, if we care about open-
endedness with respect to humans, open-endedness can be
measured experimentally by how well humans can predict
the system. By having observer-dependence explicit in our
definition, we make precise the intuition that different ob-
servers, with different prior knowledge, different cognitive
capabilities and different timescales, are likely to judge the
same system in different ways. Thus our definition grace-
fully encompasses the diversity in perspectives of human
individuals and groups (such as companies or governments),
as well as the possibility that AI systems themselves could
be observers.
Secondly, while our definition of open-endedness depends
on an external observer, it is an open question as to whether
all “reasonable” observers would judge the same systems to
be open-ended. Since our definition rests on a notion of pre-
dictability with respect to the observer, our definition will
be as subjective as the underlying notion of predictability.
One may believe that predictability can be accurately and
objectively modeled as Solomonoff induction (Solomonoff,
1960). Thus if reasonable observers are taken to be those
whose predictions eventually follow something approximat-
ing Solomonoff induction, then any observer in this class
would eventually agree on which systems are open-ended.
Practically speaking, there are various existing methods in
the literature which can immediately be adapted to assess
the open-endedness of a system. First, one might elicit direct
human feedback on learnability and novelty of artifacts, in
the same spirit as RLHF (Ouyang et al.,2022) or PicBreeder
(Secretan et al.,2008). Second, one can use large language
models themselves as judges of novelty and learnability, as
argued for in OMNI (Zhang et al.,2023). Finally, one could
explicitly learn a model of the artifacts with an online learn-
ing method like Follow-the-Regularized-Leader (Hazan and
Kale,2010).
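To make the third option concrete, below is a minimal sketch of such an observer, under illustrative assumptions: artifacts are strings, and a Laplace-smoothed unigram model stands in for the online learner (Follow-the-Regularized-Leader would be a more principled choice). All names (`Observer`, `loss_matrix`) are hypothetical, not a prescribed implementation.

```python
import math
from collections import Counter


class Observer:
    """A toy observer: a Laplace-smoothed unigram model over characters,
    updated online as it reads each artifact in sequence."""

    def __init__(self, vocab_size: int = 256):
        self.counts = Counter()
        self.total = 0
        self.vocab_size = vocab_size  # assume a byte-sized alphabet

    def update(self, artifact: str) -> None:
        self.counts.update(artifact)
        self.total += len(artifact)

    def loss(self, artifact: str) -> float:
        # Average negative log-likelihood under the current model:
        # our stand-in for the loss ell(t, T) in the definition.
        nll = 0.0
        for ch in artifact:
            p = (self.counts[ch] + 1) / (self.total + self.vocab_size)
            nll -= math.log(p)
        return nll / max(len(artifact), 1)


def loss_matrix(artifacts):
    """ell[t][T]: loss of the observer with history x_1..x_t on artifact x_T
    (0-indexed; only entries with T > t are defined)."""
    n = len(artifacts)
    ell = [[None] * n for _ in range(n)]
    obs = Observer()
    for t in range(n):
        obs.update(artifacts[t])
        for T in range(t + 1, n):
            ell[t][T] = obs.loss(artifacts[T])
    return ell


def novel(ell, t):
    # Novelty at row t, checked in a strict monotone form for simplicity
    # (the definition only requires some later artifact to be more surprising).
    vals = [v for v in ell[t] if v is not None]
    return all(later > earlier for earlier, later in zip(vals, vals[1:]))


def learnable(ell, T):
    # Learnability at column T: the loss keeps shrinking as t increases.
    vals = [ell[t][T] for t in range(T)]
    return all(later < earlier for earlier, later in zip(vals, vals[1:]))
```

Under this sketch, a system would be judged open-ended with respect to the observer when `novel(ell, t)` holds for every row and `learnable(ell, T)` holds for every column, directly mirroring the definition.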
Can an open-ended system be its own observer? In prin-
ciple, there is nothing in our definition that rules out self-
observing open-ended systems. For example, an individual
self-improving agent could generate a series of artifacts,
each one of which is novel (surprising compared to the pre-
vious artifacts) and learnable (increasingly predictable given more of the history of past artifacts). When the feedback
from self-observation is used to improve the system itself,
we call the observer a proxy observer, for it no longer sits
outside the system.
For example, AlphaGo can be seen as an example of a self-
observing system, in that the agent trains in self-play, i.e. it
observes its own policy as an opponent, is challenged by
the novel discoveries of search, and learns from them to im-
prove the policy. Likewise, humans can experience “Eureka
moments”, when an individual suddenly reconceptualizes
a problem in a way that yields a solution (Sternberg and
Davidson,1995). A series of Eureka moments, each build-
ing on the last, is a self-observing open-ended system: the
human generates discoveries which are novel to themselves,
but which are also predictive of the next discovery.
Our notion of learnability is rather strict, in that it requires that the loss be decreasing for all $t' > t$. A weaker and more practical notion of learnability might state that it should be probabilistically unlikely that the loss will increase as a function of $t$:

$$\forall T,\; \forall t < T,\; \forall t' : T > t' > t \;:\quad P\big(\ell(t', T) \geq \ell(t, T)\big) < \delta \,.$$
It would be interesting to compare the consequences of $\delta$ being a constant with the situation in which $\delta$ has some appropriate dependence on the variables $(t, t', T)$. Similarly, one could weaken the notion of novelty to state that it should be probabilistically unlikely that the loss will decrease as a function of $T$. We believe that there may be several related
and differently useful variants on our definition that would
be interesting to independently study, in a similar way that
there are many notions of convergence which are interesting,
related, and differently useful.
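As a companion to the sketch in the previous subsection, the weakened criterion could be checked empirically by estimating the violation probability across repeated runs. Again a hedged sketch: `loss_samples` is assumed to hold one loss matrix per independent realisation of the system-observer pair (e.g. from `loss_matrix` above), and $\delta$ is taken to be a constant.

```python
def probably_learnable(loss_samples, delta=0.05):
    """Empirical check of the weakened learnability criterion: for every T
    and every t < t' < T, the fraction of runs in which the loss fails to
    decrease, i.e. ell(t', T) >= ell(t, T), must stay below delta."""
    n = len(loss_samples[0])
    for T in range(n):
        for t in range(T):
            for t_prime in range(t + 1, T):
                violations = sum(
                    1 for ell in loss_samples if ell[t_prime][T] >= ell[t][T]
                )
                if violations / len(loss_samples) >= delta:
                    return False
    return True
```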
B. Alternative Definition
In Section 2.1 we provided a formal definition of open-
endedness in the language of statistical learning. Here we
give an alternative definition which we conjecture is equiva-
lent under appropriate conditions. The alternative definition
is phrased in the language of compression, a topic with
known formal connections to statistical learning (Hutter,
2004; David et al., 2016; Campi and Garatti, 2023; Delétang et al., 2023).
A system $S$ produces a sequence of artifacts $X_t \in \mathcal{X}$, indexed by time $t$. An observer $O$ processes a new artifact $X_T$ to determine its information content given a history $h_t = X_{1:t}$ of past ones. $O$ possesses a history-dependent compression map $C_{h_t} : \mathcal{X} \to \{0,1\}^*$ which encodes $X_T$ into a binary string of length $|C_{h_t}(X_T)|$.
The system displays novelty if the information content increases, namely:

$$\forall t,\; \forall T > t,\; \forall T' > T :\quad |C_{h_t}(X_{T'})| > |C_{h_t}(X_T)| \,.$$
In other words, the complexity of the artifacts grows, ac-
cording to the observer.
The system is learnable if conditioning on a longer history increases compressibility, namely:

$$\forall T,\; \forall t < T,\; \forall t' : T > t' > t \;:\quad |C_{h_{t'}}(X_T)| < |C_{h_t}(X_T)| \,.$$
In other words, as its history grows, the observer must be
able to keep extracting additional patterns that help it com-
press future artifacts.
Finally, a system is open-ended from the perspective of
O
if and only if it generates sequences of artifacts that are both
novel and learnable.
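This compression view suggests a direct, if crude, empirical test. The sketch below approximates $|C_{h_t}(X_T)|$ with an off-the-shelf lossless compressor, by measuring the extra compressed bytes needed for $X_T$ once the history has been absorbed. This is an assumption-laden stand-in: zlib plays the role of the observer's compression map, and its small window bounds how much history it can actually exploit.

```python
import zlib


def cond_len(history: bytes, artifact: bytes) -> int:
    # Approximate |C_{h_t}(X_T)|: the extra compressed bytes needed for
    # the artifact once the compressor has already seen the history.
    return len(zlib.compress(history + artifact, 9)) - len(zlib.compress(history, 9))


def code_lengths(artifacts):
    """L[t][T]: approximate code length of artifact X_T given history X_{1:t}
    (0-indexed; defined only for T > t)."""
    n = len(artifacts)
    L = [[None] * n for _ in range(n)]
    for t in range(n):
        history = b"".join(artifacts[: t + 1])
        for T in range(t + 1, n):
            L[t][T] = cond_len(history, artifacts[T])
    return L


# Novelty: within each row t, L[t][T] should grow with T.
# Learnability: within each column T, L[t][T] should shrink as t grows.
```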
We allow for the compression map $C_{h_t}$ to be lossy. Hence, $O$ also possesses a decompression map $D_{h_t} : \{0,1\}^* \to \mathcal{X}$, a symmetric loss function $\ell : \mathcal{X} \times \mathcal{X} \to \mathbb{R}^+$, and a threshold $\epsilon \in \mathbb{R}^+$ that upper-bounds the error made by $C_{h_t}$:

$$\forall T,\; \forall t < T :\quad \ell\big(D_{h_t}(C_{h_t}(X_T)), X_T\big) < \epsilon \,.$$
We can strengthen the definition to be independent of $\epsilon$ by appealing to rate-distortion theory. A rate-distortion curve plots the minimum information content $|C_h(X)|$ such that $\ell(D_h(C_h(X)), X) < \epsilon$ against $\epsilon$, where the minimum is over the maps $C_h$ and $D_h$. The information content is referred to as the rate and $\epsilon$ is referred to as the distortion.
Picture a grid of rate-distortion curves $G_{tT}$ indexed by (discretized) $t$ and $T$, as in Figure 3. Remember that $T > t$, so $G_{tT}$ is strictly upper triangular, with other entries being undefined. Then broad novelty is the requirement that the curves get “fatter” as you move across the columns $T$ on the grid, for every row $t$. Similarly, broad learnability is the requirement that the curves get “flatter” as you move down the rows $t$ on the grid, for every column $T$.
Broad open-endedness is the requirement that both broad
novelty and broad learnability hold. This notion of broad
open-endedness is vague in the same way the notion of
“convergence” is vague in that it can be made precise in
many subtly different but connected ways. For instance,
one could say a system is “uniformly” open-ended if dis-
tortion increases across the rows and decreases down the
columns at every fixed rate. Alternatively, one could define
“average” open-endedness by requiring that the integral of
the rate-distortion curve get larger as you move across the
columns and smaller as you move down the rows. We hope
that future work will elucidate these subtleties in defining
broad open-endedness and determine which variants have
theoretical or practical merit.
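To ground this picture, here is a minimal sketch of how a single curve in such a grid could be traced for a discrete toy source, using the standard Blahut-Arimoto algorithm (not a construction from this paper). The source distribution `p` and distortion matrix `d` are illustrative assumptions; in the setting above they would be induced by the observer's model after $t$ artifacts and the artifact being compressed.

```python
import numpy as np


def blahut_arimoto(p, d, beta, iters=200):
    """One point on a rate-distortion curve. p: source distribution over
    symbols, shape [n]; d: distortion matrix d[x, xhat], shape [n, m];
    beta: trade-off parameter (larger beta -> higher rate, lower distortion)."""
    m = d.shape[1]
    q = np.full(m, 1.0 / m)                # marginal over reconstructions
    for _ in range(iters):
        Q = q * np.exp(-beta * d)          # unnormalised channel, shape [n, m]
        Q /= Q.sum(axis=1, keepdims=True)  # optimal channel for current q
        q = p @ Q                          # re-optimised marginal
    rate = float(np.sum(p[:, None] * Q * np.log(Q / q)))  # in nats
    distortion = float(np.sum(p[:, None] * Q * d))
    return rate, distortion


# Sweep beta to trace one curve, e.g. for a fair binary source under
# Hamming distortion; a grid G_{tT} would hold one curve per (t, T) pair.
p = np.array([0.5, 0.5])
d = np.array([[0.0, 1.0], [1.0, 0.0]])
curve = [blahut_arimoto(p, d, beta) for beta in (0.5, 1.0, 2.0, 4.0, 8.0)]
```

A “fatter” curve sits further from the origin, so at any given rate the achievable distortion is higher; comparing such curves across the grid is what broad novelty and broad learnability quantify.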
Figure 3. Open-endedness through the lens of rate-distortion curves. We depict part of the upper triangular matrix of rate-distortion curves $G_{tT}$ induced by an observer after seeing the first $t$ artifacts, aiming to lossily compress future artifact $T$. Here $t = 2, 3, 4$ and $T = 5, 6, 7$. Broad novelty is the property that, as you move from left to right in any fixed row, the rate-distortion curves become fatter. Broad learnability is the property that, as you move from top to bottom in any fixed column, the curves become flatter. For the system to be broadly open-ended, both properties must hold.
C. Further Related Work
Open-endedness as a term emerged from the Artificial Life (ALife) community when trying to quantify and replicate the increasing complexity and perpetual novelty of biological evolution. This is a rich field with a significant degree of disagreement (Earle et al., 2021). As such, there is a wide range of metrics proposed within the context of evolutionary systems which aim to quantify this behavior: for instance, persistence filtering, which measures how many generations an organism has persisted for (Dolson et al., 2019), and evolutionary activity statistics (Bedau et al., 1997; 1998). The closely
related question around the necessary conditions to produce
open-ended evolution has also been deeply studied (Taylor,
2015; 2018).
logical evolution, we focus the remainder of our discussion
on the more recent definitions which aim to define open-
ended systems in a way that applies to current ML systems
and systems more broadly.
Our definition of open-endedness is closely related to the
concept of potential surprise in economics (Shackle,1949).
To measure potential surprise, an individual should ask:
“how surprised would I be if this outcome actually occurred,
if, at the time it occurred, I were still looking at the world in
the way I look at it right now?” (Derbyshire,2017). Inter-
preting surprise as unpredictability under a statistical model,
an open-ended system
S
is precisely one which produces
ever increasing “Shackle surprise” in an observer which is
learning. The concept of potential surprise is itself based
on the century-old idea of Knightian uncertainty (Knight,
1921). Knightian uncertainty is a lack of any quantifiable
knowledge about some possible occurrence, as opposed
to the presence of quantifiable risk. Thus, somewhat im-
precisely, an open-ended system
S
is one which induces
Knightian uncertainty in an observer who is learning.
In Stanley and Lehman (2015), the authors argue that local
search for novel and interesting artifacts can be advanta-
geous over optimization for a global objective. This is be-
cause stepping stones towards a solution that optimizes the
global objective may well not resemble the solution itself.
Hence it is hard to translate the global objective into a local
improvement operator that reliably accumulates improve-
ments without getting stuck in local optima. To address this
deceptiveness, they suggest that novelty search (Lehman
and Stanley,2011), guided by a notion of interestingness,
can uncover stepping stones that advance knowledge and
capability. We take inspiration from this blueprint and turn
it into a definition. In order to clarify the notions of nov-
elty and interestingness, we formalize them with respect
to an external observer. Novelty becomes unpredictability
according to the observer’s history-conditional model, and
interestingness becomes learnability of that model across
the history of observations.
Our definition naturally relates to the notion of curiosity. Cu-
riosity, implemented as prediction error of a world model,
has long been mooted as an intrinsic motivation that can lead
to open-ended discovery in RL agents given a sufficiently
rich environment space (Schmidhuber,1991b;Pathak et al.,
2017; Raileanu and Rocktäschel, 2020; Henaff et al., 2023).
Our definition of novelty is effectively a generalisation of
curiosity, without requiring an overarching RL framework.
Our requirement of learnability ensures that the observer
attempts to capture all the epistemic uncertainty about the ar-
tifacts produced by a system. One challenge is that curiosity
based on novelty alone leads to “stochastic traps”, whereby
an agent will seek out sources of random noise with which
to sate its curiosity (Schmidhuber,1991a;Burda et al.,2018;
Shyam et al.,2019). In principle, our definition of novelty
collapses such aleatoric uncertainty by taking the expecta-
tion. In practice, we can only estimate the expectation, so it
may be useful to subtract from the loss an estimate of the
aleatoric uncertainty as in Mavor-Parker et al. (2022). We
hope that future work will examine such subtleties required
for an algorithmic implementation of our definition.
The synergies between foundation models and open-
endedness have previously been discussed by Jiang et al.
(2022). The authors propose a general notion of exploration
and detail how open-endedness can be used to solve explo-
ration problems when training foundation models. Our work
follows in this line of thinking, providing a formal definition
of open-endedness to make the discussion precise, and fur-
ther developing the connections between open-endedness
and ASI. A construction of a particular open-ended learn-
ing system is provided in (Jiang et al.,2022), which may
or may not fit our proposed definition of an open-ended
system depending on how it is instantiated. The system
generates Turing machine descriptions of MDPs, explicitly
optimizing for an objective containing terms for learning
potential, diversity, and grounding. These terms have some
high-level relation to our notions of learnability and novelty,
but they are quite distinct in the details. For instance, learn-
ing potential is divided into three sub-critia, improbability,
learnability, and consistency, which are not made entirely
formal. More crucially, the learnability discussed by (Jiang
et al.,2022) is a property of a single MDP, whereas the
learnability we define is a property of a sequence of artifacts.
Similarly, in (Jiang et al.,2022) diversity is defined as a
distance measure between MDPs, whereas novelty, as we
define it, is a property of the learning of the observer with no
necessary relationship to distances in the space of artifacts.
It would be an interesting direction for future research to
understand under what conditions the system described in
(Jiang et al.,2022) would be open-ended by our definition,
and, more generally, whether one can directly optimize for
open-endedness in some circumstances.
Open-endedness is related to, but separate from, the no-
tion of an AI-generating algorithm (AIGA, Clune,2020).
An AIGA automatically learns how to build a general AI,
based on meta-learning model architectures, meta-learning
learning algorithms, and automatically generating data from
which to learn. Adapting the logic of Clune (2020), an
AIGA need not be open-ended by our definition; if an
AIGA had the objective of passing a Turing test, it need
not produce any further novelty once this objective had
been achieved. Likewise, an open-ended system need not
be an AIGA; as we shall see in Section 2.4, there exist
open-ended systems with narrow scope that match or ex-
ceed human ability without full domain-generality. Our idea
of an Open-Ended Foundation Model in Section 3 lives at
the intersection between open-endedness and AIGAs.
Similarly, open-endedness is related to, but distinct from,
continual RL (Abel et al.,2023). A continual RL problem is
one in which the best agents never stop learning. However,
as observed by Sigaud et al. (2023), this does not neces-
sarily imply that the agent policies accumulate increasing
novelty. Rather, a continual RL agent could cycle among
some set of strategies. In the case where continual RL does
produce policies which are open-ended according to some
observer, this open-endedness will have a scope that is re-
stricted by the environment.