arXiv:2307.03346v1 [math.PR] 7 Jul 2023
Limit theorems for the site frequency spectrum
of neutral mutations in an
exponentially growing population
Einar Bjarki Gunnarsson¹, Kevin Leder², Xuanming Zhang²
¹School of Mathematics, University of Minnesota, Twin Cities, MN 55455, USA.
²Department of Industrial and Systems Engineering, University of Minnesota, Twin Cities, MN 55455, USA.
Abstract
The site frequency spectrum (SFS) is a widely used summary statistic of genomic data, offering a simple means of inferring the evolutionary history of a population. Motivated by recent evidence for the role of neutral evolution in cancer, we examine the SFS of neutral mutations in an exponentially growing population. Whereas recent work has focused on the mean behavior of the SFS in this scenario, here, we investigate the first-order asymptotics of the underlying stochastic process. Using branching process techniques, we show that the SFS of a Galton-Watson process evaluated at a fixed time converges almost surely to a random limit. We also show that the SFS evaluated at the stochastic time at which the population first reaches a certain size converges in probability to a constant. Finally, we illustrate how our results can be used to construct consistent estimators for the extinction probability and the effective mutation rate of a birth-death process.
Keywords: Site frequency spectrum; Neutral evolution; Infinite sites model; Branching processes; Convergence of stochastic processes.
MSC2020 Classification: 60J85, 60F15, 92D25, 92B05.
1 Introduction
The site frequency spectrum (SFS) is a popular summary statistic of genomic data, recording the frequencies of mutations within a given population or population sample. For the case of a large constant-sized population and selectively neutral mutations, the SFS has given rise to several estimators of the rate of mutation accumulation within the population, and these estimators have formed the basis of many statistical tests of neutral evolution vs. evolution under selection [1, 2]. In this way, the SFS has provided a simple means of understanding the rate and mode of evolution in a population using genomic data.
Motivated by the uncontrolled growth of cancer cell populations, and the mounting
evidence for the role of neutral evolution in cancer [3, 4, 5, 6, 7], several authors have
recently studied the SFS of neutral mutations in an exponentially growing population.
Durrett [8, 9] considered a supercritical birth-death process, in which cells live for an exponentially distributed time and then divide or die. He showed that in the large-time limit, the expected number of mutations found at a frequency ≥ f amongst cells with infinite lineage follows a 1/f power law for 0 < f < 1. Similar results were
obtained by Bozic et al. [10] and in a deterministic setting by Williams et al. [5]. In the
aforementioned work, Durrett also derived an approximation for the expected SFS of a
small random sample taken from the population [8, 9]. Further small sample results have
been derived using both branching process and coalescent techniques, and these have been compared with Durrett's result in [11, 12]. In [13], we derived exact expressions for the
SFS of neutral mutations in a supercritical birth-death process, both for cells with infinite
lineage and for the total cell population, evaluated either at a fixed time (fixed-time SFS)
or at the stochastic time at which the population first reaches a given size (fixed-size SFS).
More recently, the effect of selective mutations on the expected SFS has been investigated
by Tung and Durrett [14] and Bonnet and Leman [15]. The latter work considers the
setting of a drug-sensitive tumor which decays exponentially under treatment, with cells
randomly acquiring resistance which enables them to grow exponentially under treatment.
Whereas the aforementioned works have focused on the mean behavior of the SFS,
here, we are interested in the asymptotic behavior of the underlying stochastic process.
Using the framework of coalescent point processes, Lambert [16] derived a strong law
of large numbers for the SFS of neutral mutations in a population sample, where the
sample is ranked in such a way that coalescence times between consecutive individuals
are i.i.d. Later works by Lambert [17], Johnston [18] and Harris et al. [19] characterized
the joint distribution of coalescence times for a uniformly drawn sample from a continuous-
time Galton-Watson process. Building on these works, Johnson et al. [20] derived limit
distributions for the total lengths of internal and external branches in the genealogical tree
of a birth-death process. Schweinsberg and Shuai [21] extended this analysis to branches
supporting exactly k leaves, which under a constant mutation rate characterizes the SFS of
a uniformly drawn sample. For a supercritical birth-death process, the authors established
both a weak law of large numbers and the asymptotic normality of branch lengths in the
limit of a large sample, assuming that the sample is sufficiently small compared to the
expected population size at the sampling time.
In this work, instead of considering a sample from the population using coalescence
techniques, we will investigate the first-order asymptotics for the SFS of the total population using branching process techniques. We establish results both for the fixed-time
and fixed-size SFS under the infinite sites model of mutation, where each new mutation is
assumed to be unique [22]. Cheek and Antal recently studied a finite sites model in [23]
(see also [24]), where each genetic site is allowed to mutate back and forth between the
four nucleotides A, C, G, T. With the understanding that a site is mutated if its nucleotide
differs from the nucleotide of the initial individual, the authors investigated the SFS of
a birth-death process stopped at a certain size, both for mutations observed in a certain
number and in a certain fraction of individuals. They used a limiting regime where the
population size is sent to infinity, mutation rate is sent to 0, and the number of genetic
sites is sent to infinity. In contrast, we will assume a constant mutation rate under the
infinite sites model (with no back mutations), and send either the fixed time or the fixed
size at which the population is observed to infinity.
Our results are derived for a supercritical Galton-Watson process in continuous time, where each individual acquires neutral mutations at a constant rate ν > 0. Let Z0(t) denote the size of the population at time t, λ > 0 the net growth rate of the population, τN the time at which the population first reaches size N, and Sj(t) the number of mutations found in j ≥ 1 individuals at time t. Our main result, Theorem 1, characterizes the first-order behavior of e^{−λt}Sj(t) as t → ∞ (fixed-time result) and N^{−1}Sj(τN) as N → ∞ (fixed-size result). To prove the fixed-time result, the key idea is to decompose (Sj(t))_{t≥0} into a difference of two increasing processes (Sj,+(t))_{t≥0} and (Sj,−(t))_{t≥0}. These processes count the total number of instances that a mutation reaches and leaves frequency j, respectively, up until time t. Using the limiting behavior of Z0(t) as t → ∞, we construct large-time approximations for the two processes (Sj,+(t))_{t≥0} and (Sj,−(t))_{t≥0}. We then establish exponential L1 error bounds on these approximations, which imply convergence in probability. Finally, by adapting an argument of Harris (Theorem 21.1 of [25]), we use the exponential error bounds and the fact that (Sj,+(t))_{t≥0} and (Sj,−(t))_{t≥0} are increasing processes to show that e^{−λt}Sj,+(t) and e^{−λt}Sj,−(t) converge almost surely to their approximations. This in turn gives almost sure convergence of e^{−λt}Sj(t) as t → ∞. The fixed-size result is obtained by combining the fixed-time result with an approximation result for τN, given by Proposition 1. Since we are only able to establish the approximation for τN in probability, the result for N^{−1}Sj(τN) as N → ∞ is given in probability. Finally, we establish analogous fixed-time and fixed-size convergence results for M(t) = Σ_{j=1}^∞ Sj(t), the total number of mutations present at time t, in Proposition 2. All results are given conditional on nonextinction of the population.
The rest of the paper is organized as follows. Section 2 introduces our branching process model and establishes the relevant notation. Section 3 presents our results, including explicit expressions for the birth-death process. Section 4 outlines the proof of the main result, Theorem 1. Section 5 constructs consistent estimators for the extinction probability and effective mutation rate of the birth-death process. Finally, the proofs of the remaining results can be found in Section 6.
2 Model
2.1 Branching process model with neutral mutations
We consider a Galton-Watson branching process (Z0(t))_{t≥0}, started with a single individual at time 0, Z0(0) = 1, where the lifetimes of individuals are exponentially distributed with mean 1/a > 0. At the end of an individual's lifetime, it produces offspring according to the distribution (uk)_{k≥0}, where uk is the probability that k offspring are produced. We define m := Σ_{k=0}^∞ k·uk as the mean number of offspring per death event and assume that the offspring distribution has a finite third moment, Σ_{k=0}^∞ k³·uk < ∞. Each individual, over its lifetime, accumulates neutral mutations at (exponential) rate ν > 0. We assume the infinite sites model of mutation, where each new mutation is assumed to be unique.

Throughout, we consider the case m > 1 of a supercritical process. The net growth rate of the population is then λ = a(m − 1) > 0, with E[Z0(t)] = e^{λt} for t ≥ 0.
We will be primarily interested in analyzing the process conditional on long-term survival of the population. We define the event of nonextinction of the population as

    Ω∞ := {Z0(t) > 0 for all t > 0}.

We also define the probability of eventual extinction as

    p := P(Ω∞^c) = P(Z0(t) = 0 for some t > 0),  (1)

and the corresponding survival probability as q := P(Ω∞). For N ≥ 1, we define τN as the time at which the population first reaches size N,

    τN := inf{t ≥ 0 : Z0(t) ≥ N},  (2)

with the convention that inf ∅ = ∞. Note that on Ω∞, τN < ∞ almost surely. Also note that if uk > 0 for some k > 2, it is possible that Z0(τN) > N. We finally define

    pi,j(t) := P(Z0(t) = j | Z0(0) = i)

as the probability of transitioning from i to j individuals in t time units. For the baseline case Z0(0) = 1, we simplify the notation to pj(t) := p1,j(t).
2.2 Special case: Birth-death process
An important special case is that of the birth-death process, where u2 > u0 ≥ 0 and u0 + u2 = 1. In this process, an individual at the end of its lifetime either dies without producing offspring or produces two offspring. At each death event, the population therefore either decreases or increases in size by one individual. The birth-death process is relevant, for example, to the population dynamics of cancer cell populations (tumors) and bacteria. In this case, the probability of eventual extinction can be computed explicitly as p = u0/u2, and the survival probability as q = 1 − u0/u2 [9]. Furthermore, the probability mass function j ↦ pj(t) has an explicit expression for each t ≥ 0, which is given by expression (64) in Section 6.8. This will enable us to derive explicit limits for the site frequency spectrum of the birth-death process; see Corollary 1 in Section 3.2.
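As a quick numerical illustration of these relations, the following sketch computes m, λ, p and q from the birth-death parameters a and u0. The rate values are our own illustrative choices, not taken from the paper:

```python
def birth_death_params(a, u0):
    """Return (m, lambda, p, q) for a birth-death process with lifetime
    rate a and death probability u0 (so u2 = 1 - u0)."""
    u2 = 1.0 - u0
    m = 2.0 * u2            # mean offspring number per death event
    lam = a * (m - 1.0)     # net growth rate lambda = a(m - 1)
    p = u0 / u2             # extinction probability
    q = 1.0 - p             # survival probability
    return m, lam, p, q

# Illustrative (made-up) parameters: a = 1, u0 = 0.25, hence u2 = 0.75.
m, lam, p, q = birth_death_params(a=1.0, u0=0.25)
```

With these made-up parameters, the process is supercritical (m = 1.5 > 1) and p = u0/u2 = 1/3.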
2.3 Asymptotic behavior
We note that (e^{−λt}Z0(t))_{t≥0} is a nonnegative martingale with respect to the natural filtration Ft := σ(Z0(s); s ≤ t). Thus, there exists a random variable Y such that e^{−λt}Z0(t) → Y almost surely as t → ∞. By Theorem 2 in Section III.7 of [26],

    Y =_d p·δ0 + q·ξ,  (3)

where =_d denotes equality in distribution, p and q are the extinction and survival probabilities of the population, respectively, δ0 is a point mass at 0, and ξ is a random variable on (0, ∞) with a strictly positive continuous density function and mean 1/q. Since we assume that the offspring distribution has a finite second moment, we know that E[(Z0(t))²] = O(e^{2λt}) by Chapter III.4 of [27] or Lemma 5 of [28]; hence (e^{−λt}Z0(t))_{t≥0} is uniformly integrable and E[Y | Ft] = e^{−λt}Z0(t).

Based on the large-time approximation Z0(t) ≈ Y e^{λt}, for N ≥ 1, we define an approximation to the hitting time τN defined in (2) as follows:

    tN := inf{t ≥ 0 : Y e^{λt} = N}.  (4)

In Proposition 1, we show that conditional on Ω∞, τN − tN → 0 in probability as N → ∞.
2.4 Site frequency spectrum
In the model, each individual accumulates neutral mutations at rate ν > 0. For t > 0, enumerate the mutations that occur up until time t as 1, ..., Nt, and define Mt := {1, ..., Nt} as the set of mutations generated up until time t. For i ∈ Mt and s ≤ t, let Ci(s) denote the number of individuals at time s that carry mutation i, with Ci(s) = 0 before mutation i occurs. The number of mutations present in j individuals at time t is then given by

    Sj(t) := Σ_{i∈Mt} 1{Ci(t) = j}.

The vector (Sj(t))_{j≥1} is the site frequency spectrum (SFS) of the neutral mutations at time t. We also define the total number of mutations present at time t as

    M(t) := Σ_{j=1}^∞ Sj(t).

The goal of this paper is to establish first-order limit theorems for Sj(t) and M(t), evaluated either at the fixed time t as t → ∞ or at the random time τN as N → ∞.
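To make the model concrete, here is a minimal Gillespie-style simulation of the birth-death special case with neutral mutations under the infinite sites model, computing the SFS at a fixed time. All parameter values and function names are our own illustration, not from the paper:

```python
import random
from collections import Counter

def simulate_sfs(a=1.0, u0=0.25, nu=0.3, t_max=6.0, seed=1):
    """Simulate a birth-death process with neutral mutations (infinite
    sites). Returns the living population (a list of mutation sets) and
    the SFS at time t_max, where sfs[j] is the number of mutations
    carried by exactly j living individuals. Illustrative parameters."""
    rng = random.Random(seed)
    u2 = 1.0 - u0
    pop = [set()]                 # single founder carrying no mutations
    next_id = 0                   # infinite sites: every mutation is new
    t = 0.0
    while pop:
        n = len(pop)
        t += rng.expovariate(n * (a + nu))   # time to the next event
        if t > t_max:
            break
        i = rng.randrange(n)
        if rng.random() < nu / (a + nu):     # mutation event
            pop[i] = pop[i] | {next_id}
            next_id += 1
        else:                                # death event
            cell = pop.pop(i)
            if rng.random() < u2:            # two offspring, each
                pop.append(set(cell))        # inheriting the parent's
                pop.append(set(cell))        # mutations
    counts = Counter()
    for cell in pop:
        for mut in cell:
            counts[mut] += 1
    sfs = Counter(counts.values())           # sfs[j] = S_j(t_max)
    return pop, sfs
```

Summing sfs[j] over j gives M(t_max), while Σ_j j·sfs[j] equals the total number of (mutation, carrier) pairs in the population.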
3 Results
3.1 General case
Our main result, Theorem 1, provides large-time and large-size first-order asymptotics for
the SFS conditional on nonextinction. For the fixed-time SFS, we establish almost sure
convergence, while for the fixed-size SFS, we establish convergence in probability. A proof
sketch is given in Section 4 and the proof details are carried out in Sections 6.1–6.5.
Theorem 1. (1) Conditional on Ω∞,

    lim_{t→∞} e^{−λt}Sj(t) = νY ∫_0^∞ e^{−λs} pj(s) ds,  j ≥ 1,  (5)

almost surely. Equivalently, with rN := (1/λ) log(qN), X := qY and E[X | Ω∞] = 1,

    lim_{N→∞} N^{−1}Sj(rN) = νX ∫_0^∞ e^{−λs} pj(s) ds,  j ≥ 1,  (6)

almost surely.

(2) Conditional on Ω∞,

    lim_{N→∞} N^{−1}Sj(τN) = ν ∫_0^∞ e^{−λs} pj(s) ds,  j ≥ 1,  (7)

in probability.

Proof. Section 4 and Sections 6.1–6.5.
The main difference between the fixed-time result (5) and the fixed-size result (7) is that the limit in (5) is a random variable, while it is constant in (7). The reason is that the population size at a large, fixed time t depends on the limiting random variable Y in e^{−λt}Z0(t) → Y, while the population size at time τN is always approximately N. In expression (6), the fixed-time result is viewed at the time rN, defined so that

    lim_{N→∞} N^{−1} E[Z0(rN) | Ω∞] = 1.

The point is to show that when the result in (5) is viewed at a fixed time comparable to τN, the mean of the limiting random variable becomes equal to the fixed-size limit in (7).

To establish the fixed-size result (7), we prove a secondary approximation result for the hitting time τN defined in (2). The result, stated as Proposition 1, shows that conditional on Ω∞, τN equals the approximation tN defined in (4) up to an error that vanishes in probability. The proof involves relatively simple calculations, given in Section 6.6.
Proposition 1. For any ε > 0,

    lim_{N→∞} P(|τN − tN| > ε | Ω∞) = 0.  (8)

Proof. Section 6.6.
The proof of the fixed-size result (7) combines the fixed-time result (5) with Proposition 1, as is discussed in Section 4.5. Since we are only able to establish the approximation for τN in probability, the fixed-size result (7) is given in probability. An almost sure version of Proposition 1 would immediately imply an almost sure version of (7).

Finally, a simpler version of the argument used to prove Theorem 1 can be used to prove analogous limit theorems for the total number of mutations at time t, M(t).
Proposition 2. (1) Conditional on Ω∞,

    lim_{t→∞} e^{−λt}M(t) = νY ∫_0^∞ e^{−λs}(1 − p0(s)) ds,  (9)

almost surely.

(2) Conditional on Ω∞,

    lim_{N→∞} N^{−1}M(τN) = ν ∫_0^∞ e^{−λs}(1 − p0(s)) ds,  (10)

in probability.

Proof. Section 6.7.
By combining the results of Theorem 1 and Proposition 2, we obtain the following limits for the proportion of mutations found in j ≥ 1 individuals:

    lim_{t→∞} Sj(t)/M(t) = lim_{N→∞} Sj(τN)/M(τN) = (∫_0^∞ e^{−λs} pj(s) ds) / (∫_0^∞ e^{−λs}(1 − p0(s)) ds),  j ≥ 1,  (11)

where the fixed-time limit applies almost surely and the fixed-size limit in probability. In the application in Section 5, we will also be interested in the proportion of mutations found in j ≥ 1 individuals out of all mutations found in ≥ j individuals. If we define

    Mj(t) := Σ_{k≥j} Sk(t),  j ≥ 1, t ≥ 0,

as the total number of mutations found in ≥ j individuals, this proportion is given by

    lim_{t→∞} Sj(t)/Mj(t) = lim_{N→∞} Sj(τN)/Mj(τN) = (∫_0^∞ e^{−λs} pj(s) ds) / (∫_0^∞ e^{−λs} Σ_{k=j}^∞ pk(s) ds),  j ≥ 1,  (12)

since limit theorems for Mj(t) follow from Theorem 1 and Proposition 2 by writing Mj(t) = M(t) − Σ_{k=1}^{j−1} Sk(t). Note that for both proportions, the fixed-time and fixed-size limits are the same, as the variability in population size at a fixed time has been removed. Also note that both proportions are independent of the mutation rate ν. In Section 5, we show that for the birth-death process, these properties enable us to define a consistent estimator for the extinction probability p which applies both to the fixed-time and fixed-size SFS.
3.2 Special case: Birth-death process
For the special case of the birth-death process, we are able to derive explicit expressions for
the limits in Theorem 1 and Proposition 2, as we demonstrate in the following corollary.
Corollary 1. For the birth-death process, conditional on Ω∞,

(1) the random variable Y in Theorem 1 has the exponential distribution with mean 1/q, and the fixed-time result (5) can be written explicitly as

    lim_{t→∞} e^{−λt}Sj(t) = (νqY/λ) ∫_0^1 (1 − py)^{−1}(1 − y) y^{j−1} dy
                          = (νqY/λ) Σ_{k=0}^∞ p^k / ((j + k)(j + k + 1)),  j ≥ 1.  (13)

For the special case p = 0 of a pure-birth or Yule process,

    lim_{t→∞} e^{−λt}Sj(t) = (νY/λ) · 1/(j(j + 1)).

(2) the fixed-size result (7) can be written explicitly as

    lim_{N→∞} N^{−1}Sj(τN) = (νq/λ) ∫_0^1 (1 − py)^{−1}(1 − y) y^{j−1} dy
                           = (νq/λ) Σ_{k=0}^∞ p^k / ((j + k)(j + k + 1)),  j ≥ 1.  (14)

For the pure-birth or Yule process,

    lim_{N→∞} N^{−1}Sj(τN) = (ν/λ) · 1/(j(j + 1)).  (15)

(3) the fixed-time result (9) can be written explicitly as

    lim_{t→∞} e^{−λt}M(t) = νY/λ for p = 0, and −νq log(q)Y/(λp) for 0 < p < 1.  (16)

(4) the fixed-size result (10) can be written explicitly as

    lim_{N→∞} N^{−1}M(τN) = ν/λ for p = 0, and −νq log(q)/(λp) for 0 < p < 1.  (17)

Proof. Section 6.8.
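As a numerical sanity check on the equivalence of the integral and series forms appearing in (13)–(14), the following sketch (our own, not from the paper) evaluates both representations of ∫_0^1 (1 − py)^{−1}(1 − y) y^{j−1} dy and confirms they agree; for p = 0 both reduce to 1/(j(j + 1)):

```python
def sfs_limit_series(p, j, terms=2000):
    """Series form: sum_{k>=0} p^k / ((j+k)(j+k+1)), truncated."""
    return sum(p ** k / ((j + k) * (j + k + 1)) for k in range(terms))

def sfs_limit_integral(p, j, n=100000):
    """Midpoint-rule evaluation of int_0^1 (1 - p*y)^(-1) (1 - y) y^(j-1) dy,
    obtained by expanding (1 - p*y)^(-1) as a geometric series in the
    series form above."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        y = (i + 0.5) * h
        total += (1.0 - y) * y ** (j - 1) / (1.0 - p * y)
    return total * h
```

The agreement follows from the geometric expansion (1 − py)^{−1} = Σ_{k≥0} p^k y^k together with ∫_0^1 (1 − y) y^{j+k−1} dy = 1/((j + k)(j + k + 1)).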
Similarly, the proportion of mutations found in j ≥ 1 individuals, appearing in expression (11), can be written explicitly as

    (∫_0^∞ e^{−λs} pj(s) ds) / (∫_0^∞ e^{−λs}(1 − p0(s)) ds)
        = 1/(j(j + 1)) for p = 0, and
          (−p/log(q)) ∫_0^1 (1 − py)^{−1}(1 − y) y^{j−1} dy for 0 < p < 1,  (18)

and the proportion of mutations in j individuals out of all mutations in ≥ j individuals, appearing in expression (12), can be written as

    ϕj(p) := (∫_0^∞ e^{−λs} pj(s) ds) / (∫_0^∞ e^{−λs} Σ_{k=j}^∞ pk(s) ds)
        = 1/(j + 1) for p = 0, and
          1 − (∫_0^1 (1 − py)^{−1} y^j dy) / (∫_0^1 (1 − py)^{−1} y^{j−1} dy) for 0 < p < 1,  (19)

see Section 6.9. Note that expressions (18) and (19) give the same proportion for j = 1. It can be shown that for any j ≥ 1, ϕj(p) is strictly decreasing in p (Section 6.10). In Section 5, we use this fact to develop an estimator for the extinction probability p.
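The monotonicity of ϕj(p) can be checked numerically from the integral form in (19). The sketch below (our own illustration) evaluates ϕj by midpoint quadrature, returning the limiting value 1/(j + 1) at p = 0:

```python
def phi_j(p, j, n=100000):
    """phi_j(p) from expression (19):
    1 - (int_0^1 y^j/(1-p*y) dy) / (int_0^1 y^(j-1)/(1-p*y) dy)
    for 0 < p < 1, and 1/(j+1) at p = 0, via the midpoint rule."""
    if p == 0.0:
        return 1.0 / (j + 1)
    h = 1.0 / n
    num = den = 0.0
    for i in range(n):
        y = (i + 0.5) * h
        w = 1.0 / (1.0 - p * y)      # weight (1 - p*y)^(-1)
        num += w * y ** j
        den += w * y ** (j - 1)
    return 1.0 - num / den
```

Evaluating ϕj at a few values of p illustrates the strict decrease in p used by the estimator in Section 5.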
We showed in expression (C.1) of [13] that for p = 0,

    E[Sj(τN)] = (νN/λ) · 1/(j(j + 1)),  j = 2, ..., N − 1.

In other words, the fixed-size result (15) holds in the mean even for finite values of N, excluding boundary effects at j = 1 and j = N.
4 Proof of Theorem 1
In this section, we sketch the proof of the main result, Theorem 1. Proving the fixed-time result (5) represents most of the work, which is discussed in Sections 4.1 to 4.4. The main idea is to write the site frequency spectrum process (Sj(t))_{t≥0} as a difference of two processes that are increasing in time, and to prove limit theorems for the increasing processes. The fixed-size result (7) follows easily from the fixed-time result (5) and Proposition 1 via the continuous mapping theorem, as is discussed in Section 4.5.
4.1 Decomposition into increasing processes Sj,+(t) and Sj,−(t)
Fix j ≥ 1. The key idea of the proof of the fixed-time result (5) is to decompose the process (Sj(t))_{t≥0} into a difference of two increasing processes (Sj,+(t))_{t≥0} and (Sj,−(t))_{t≥0}. To describe these processes, we first need to establish some notation.

Recall that for mutation i ∈ Mt and s ≤ t, Ci(s) is the size of the clone containing mutation i at time s, meaning the number of individuals carrying mutation i at time s. Set τ^i_{j,−}(0) := 0 and define recursively for k ≥ 1,

    τ^i_{j,+}(k) := inf{s > τ^i_{j,−}(k − 1) : Ci(s) = j},
    τ^i_{j,−}(k) := inf{s > τ^i_{j,+}(k) : Ci(s) ≠ j}.

Note that τ^i_{j,+}(k) is the k-th time at which the clone containing mutation i reaches or "enters" size j, and τ^i_{j,−}(k) is the k-th time at which it leaves or "exits" size j. Next, define

    I^i_{j,+}(t) := Σ_{ℓ=1}^∞ 1{τ^i_{j,+}(ℓ) ≤ t},  I^i_{j,−}(t) := Σ_{ℓ=1}^∞ 1{τ^i_{j,−}(ℓ) ≤ t},  (20)

as the number of times the clone containing mutation i enters and exits size j, respectively, up until time t. Then, for each k ≥ 1, define the increasing processes (S^k_{j,+}(t))_{t≥0} and (S^k_{j,−}(t))_{t≥0} by

    S^k_{j,+}(t) := Σ_{i∈Mt} 1{I^i_{j,+}(t) ≥ k},  S^k_{j,−}(t) := Σ_{i∈Mt} 1{I^i_{j,−}(t) ≥ k}.  (21)

These processes keep track of the number of mutations in Mt whose clones enter and exit size j, respectively, at least k times up until time t. We can now finally define the increasing processes (Sj,+(t))_{t≥0} and (Sj,−(t))_{t≥0} as

    Sj,+(t) := Σ_{k=1}^∞ S^k_{j,+}(t),  Sj,−(t) := Σ_{k=1}^∞ S^k_{j,−}(t).
A key observation is that these processes count the total number of instances that a mutation enters and exits size j, respectively, up until time t. To see why, note that

    Σ_{k=1}^∞ S^k_{j,+}(t) = Σ_{i∈Mt} Σ_{k=1}^∞ 1{I^i_{j,+}(t) ≥ k} = Σ_{i∈Mt} Σ_{k=1}^∞ Σ_{ℓ=k}^∞ 1{I^i_{j,+}(t) = ℓ}
                          = Σ_{i∈Mt} Σ_{ℓ=1}^∞ Σ_{k=1}^ℓ 1{I^i_{j,+}(t) = ℓ} = Σ_{i∈Mt} Σ_{ℓ=1}^∞ ℓ·1{I^i_{j,+}(t) = ℓ}
                          = Σ_{i∈Mt} I^i_{j,+}(t).

Similar calculations hold for Σ_{k=1}^∞ S^k_{j,−}(t). Note that I^i_{j,+}(t) − I^i_{j,−}(t) = 1 if and only if Ci(t) = j, and I^i_{j,+}(t) − I^i_{j,−}(t) = 0 otherwise. It follows that

    Sj(t) = Sj,+(t) − Sj,−(t).  (22)

The fixed-time result (5) will follow from limit theorems for Sj,+(t) and Sj,−(t), which in turn follow from approximation results for the subprocesses S^k_{j,+}(t) and S^k_{j,−}(t) for k ≥ 1.
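The entry/exit bookkeeping behind the identity in (22) can be checked on any single clone-size path. This small sketch (ours, for illustration) counts entries into and exits from size j along a path and verifies that their difference is the indicator of currently being at size j:

```python
def entry_exit_counts(path, j):
    """Count how many times a clone-size path enters and exits size j.
    The path lists successive sizes, starting at 0 (before the mutation
    occurs), so the clone is not at size j >= 1 initially."""
    entries = exits = 0
    prev = path[0]
    for cur in path[1:]:
        if cur == j and prev != j:
            entries += 1        # the clone "enters" size j
        if prev == j and cur != j:
            exits += 1          # the clone "exits" size j
        prev = cur
    return entries, exits

# For any such path, entries - exits equals 1 if the final size is j and
# 0 otherwise; summing over all mutations gives S_j = S_{j,+} - S_{j,-}.
```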
4.2 Approximation results for S^k_{j,+}(t) and S^k_{j,−}(t)
We begin by establishing approximation results for S^k_{j,+}(t) and S^k_{j,−}(t) for each k ≥ 1. First, for the branching process (Z0(t))_{t≥0} with Z0(0) = 1, set τ^−_j(0) := 0 and define recursively

    τ^+_j(k) := inf{s > τ^−_j(k − 1) : Z0(s) = j},
    τ^−_j(k) := inf{s > τ^+_j(k) : Z0(s) ≠ j},  k ≥ 1.  (23)

Set

    p^k_{j,+}(t) := P(τ^+_j(k) ≤ t),  p^k_{j,−}(t) := P(τ^−_j(k) ≤ t),  (24)

which are the probabilities that the branching process enters and exits size j, respectively, at least k times up until time t. A key observation is that

    pj(t) = P(Z0(t) = j) = Σ_{k=1}^∞ (p^k_{j,+}(t) − p^k_{j,−}(t)),  (25)

which follows from the fact that

    {Z0(t) = j} = ∪_{k≥1} {τ^+_j(k) ≤ t, τ^−_j(k) > t} = ∪_{k≥1} ({τ^+_j(k) ≤ t} \ {τ^−_j(k) ≤ t}).

In addition, we note that since almost surely, Z0(t) → 0 or Z0(t) → ∞ as t → ∞, there exists 0 < θ < 1 so that for each t ≥ 0,

    p^k_{j,−}(t) ≤ p^k_{j,+}(t) ≤ P(τ^+_j(k) < ∞) ≤ θ^k.  (26)
The approximation results for S^k_{j,+}(t) and S^k_{j,−}(t) can be established using almost identical arguments, so it suffices to analyze S^k_{j,+}(t). Recall that S^k_{j,+}(t) is the number of mutations whose clones enter size j at least k times up until time t. At any time s ≤ t, a mutation occurs at rate νZ0(s), and with probability p^k_{j,+}(t − s), its clone enters size j at least k times up until time t. This suggests the approximation

    S^k_{j,+}(t) ≈ ν ∫_0^t Z0(s) p^k_{j,+}(t − s) ds =: S̄^k_{j,+}(t).  (27)

Since e^{−λt}Z0(t) → Y as t → ∞, we can further approximate for large t,

    S̄^k_{j,+}(t) ≈ ν ∫_0^t Y e^{λs} p^k_{j,+}(t − s) ds =: Ŝ^k_{j,+}(t).  (28)

For the remainder of the section, our goal is to establish bounds on the L1-error associated with the approximations S^k_{j,+}(t) ≈ S̄^k_{j,+}(t) ≈ Ŝ^k_{j,+}(t).
We first consider the approximation (27). For ∆ > 0, define the Riemann sum

    S̄^k_{j,+,∆}(t) := ν∆ Σ_{ℓ=0}^{⌊t/∆⌋} Z0(ℓ∆) p^k_{j,+}(t − ℓ∆).  (29)

Clearly, lim_{∆→0} S̄^k_{j,+,∆}(t) = S̄^k_{j,+}(t) almost surely. In addition, for some C > 0,

    S̄^k_{j,+,∆}(t) ≤ Ct max_{s≤t} Z0(s).

Since (Z0(s))_{s≥0} is a nonnegative submartingale, we can use Doob's inequality to show that Ct E[max_{s≤t} Z0(s)] < ∞ for each t ≥ 0. Therefore, by dominated convergence,

    lim_{∆→0} E|S̄^k_{j,+,∆}(t) − S̄^k_{j,+}(t)| = 0,  t ≥ 0.

It then follows from the triangle inequality that

    E|S^k_{j,+}(t) − S̄^k_{j,+}(t)| ≤ lim_{∆→0} E|S^k_{j,+}(t) − S̄^k_{j,+,∆}(t)|,  t ≥ 0.  (30)

To bound the L1-error of the approximation (27), it therefore suffices to bound the right-hand side of (30). We accomplish this in the following lemma.

Lemma 1. Let t > 0 and ∆ > 0. There exist constants C1 > 0 and C2 > 0, independent of t, ∆ and k, such that

    E[(S^k_{j,+}(t) − S̄^k_{j,+,∆}(t))²] ≤ C1 θ^k t² e^{λt} + C2 ∆ e^{3λt}.  (31)

Proof. Section 6.1.
We next turn to the approximation (28). By the triangle inequality and the Cauchy-Schwarz inequality, we can write

    E|S̄^k_{j,+}(t) − Ŝ^k_{j,+}(t)| ≤ ν ∫_0^t E|Y e^{λs} − Z0(s)| p^k_{j,+}(t − s) ds
                                   ≤ ν ∫_0^t (E[(Y e^{λs} − Z0(s))²])^{1/2} p^k_{j,+}(t − s) ds.

By showing that E[(Y e^{λs} − Z0(s))²] = C e^{λs} for some C > 0 and applying (26), we can obtain the following bound on the L1-error of the approximation (28).

Lemma 2.

    E|S̄^k_{j,+}(t) − Ŝ^k_{j,+}(t)| = O(θ^k e^{λt/2}).  (32)

Proof. Section 6.2.

Finally, from (30), (31) and (32), it is straightforward to obtain a bound on the L1-error of the approximation S^k_{j,+}(t) ≈ Ŝ^k_{j,+}(t), which we state as Proposition 3.

Proposition 3.

    E|S^k_{j,+}(t) − Ŝ^k_{j,+}(t)| = O(θ^{k/2} t e^{λt/2}).  (33)
4.3 Limit theorems for Sj,+(t) and Sj,−(t)

To establish limit theorems for Sj,+(t) and Sj,−(t), we define the approximations

    Ŝj,+(t) := Σ_{k=1}^∞ Ŝ^k_{j,+}(t),  Ŝj,−(t) := Σ_{k=1}^∞ Ŝ^k_{j,−}(t).

Focusing on the former approximation, we first argue that lim_{t→∞} e^{−λt} Ŝj,+(t) exists. Indeed, consider the following calculations for k ≥ 1 and t ≥ 0, where we use (26):

    e^{−λt} Ŝ^k_{j,+}(t) = ν e^{−λt} ∫_0^t Y e^{λs} p^k_{j,+}(t − s) ds
                         = νY ∫_0^t e^{−λs} p^k_{j,+}(s) ds
                         ≤ (νY/λ) θ^k.

The second equality shows that t ↦ e^{−λt} Ŝ^k_{j,+}(t) is an increasing function, and the inequality shows that the function is bounded above by the summable sequence (νY/λ)θ^k. Therefore, t ↦ e^{−λt} Ŝj,+(t) is increasing and bounded above, which implies that lim_{t→∞} e^{−λt} Ŝj,+(t) exists. The limit is given by

    lim_{t→∞} e^{−λt} Ŝj,+(t) = νY ∫_0^∞ e^{−λs} (Σ_{k=1}^∞ p^k_{j,+}(s)) ds.  (34)
We next note that by the triangle inequality and Proposition 3,

    E|Sj,+(t) − Ŝj,+(t)| ≤ Σ_{k=1}^∞ E|S^k_{j,+}(t) − Ŝ^k_{j,+}(t)| = O(t e^{λt/2}),

which implies that

    ∫_0^∞ e^{−λt} E|Sj,+(t) − Ŝj,+(t)| dt < ∞.  (35)

Combining (35) with the fact that (Sj,+(t))_{t≥0} and (Sj,−(t))_{t≥0} are increasing processes, we can establish almost sure convergence results for e^{−λt}Sj,+(t) and e^{−λt}Sj,−(t). In the proof, we adapt an argument of Harris (Theorem 21.1 of [25]), with the L1 condition (35) replacing an analogous L2 condition used by Harris.

Proposition 4. Conditional on Ω∞,

    lim_{t→∞} e^{−λt}Sj,+(t) = νY ∫_0^∞ e^{−λs} (Σ_{k=1}^∞ p^k_{j,+}(s)) ds,
    lim_{t→∞} e^{−λt}Sj,−(t) = νY ∫_0^∞ e^{−λs} (Σ_{k=1}^∞ p^k_{j,−}(s)) ds,

almost surely.

Proof. Section 6.3.
4.4 Proof of the fixed-time result (5)

To finish the proof of the fixed-time result (5), it suffices to note that by (25) and Proposition 4,

    lim_{t→∞} e^{−λt}(Sj,+(t) − Sj,−(t)) = νY ∫_0^∞ e^{−λs} pj(s) ds.

Since Sj(t) = Sj,+(t) − Sj,−(t) by (22), the result follows.

4.5 Proof of the fixed-size result (7)

To prove the fixed-size result (7), we note that by (5), conditional on Ω∞,

    lim_{N→∞} e^{−λτN} Sj(τN) = νY ∫_0^∞ e^{−λs} pj(s) ds,

almost surely. Since N e^{−λtN} = Y by (4), we also have

    lim_{N→∞} e^{−λ(τN − tN)} · N^{−1} Sj(τN) = Y^{−1} lim_{N→∞} e^{−λτN} Sj(τN) = ν ∫_0^∞ e^{−λs} pj(s) ds,

almost surely. By Proposition 1 and the continuous mapping theorem, conditional on Ω∞,

    lim_{N→∞} e^{−λ(τN − tN)} = 1,

in probability. We can therefore conclude that conditional on Ω∞,

    lim_{N→∞} N^{−1} Sj(τN) = ν ∫_0^∞ e^{−λs} pj(s) ds,

in probability, which is the desired result.
5 Application: Estimation of extinction probability and effective mutation rate for birth-death process

We conclude by briefly discussing how, for the birth-death process, our results imply consistent estimators for the extinction probability p and the effective mutation rate ν/λ, given data on the SFS of all mutations found in the population. The estimator for p is based on the long-run proportion of mutations found in one individual. Recall that by (12), this proportion is the same for the fixed-time and fixed-size SFS. By setting j = 1 in (18), the proportion can be written explicitly as (Section 6.11)

    ϕ1(p) = 1/2 for p = 0, and −(p + q log(q))/(p log(q)) for 0 < p < 1,  (36)

where we recall that q = 1 − p. The function ϕ1(p) is strictly decreasing in p, and it takes values in (0, 1/2]. If in a given population the proportion of mutations found in one individual is observed to be x, we define an estimator for p by applying the inverse function of ϕ1:

    p̂ = p̂(x) := ϕ1^{−1}(x).  (37)

Technically, ϕ1^{−1} is only defined on (0, 1/2], whereas the random number x may take any value in [0, 1]. This can be addressed by extending the definition of ϕ1^{−1} so that ϕ1^{−1}(x) := ϕ1^{−1}(1/2) = 0 for x > 1/2 and ϕ1^{−1}(0) := lim_{x→0+} ϕ1^{−1}(x) = 1. Since ϕ1^{−1} so defined is continuous, we can combine (11) and (18) with the continuous mapping theorem to see that whether the SFS is observed at a fixed time or a fixed size, the estimator in (37) is consistent in the sense that p̂ → p in probability as t → ∞ or N → ∞. In other words, if the population is sufficiently large, its site frequency spectrum can be used to obtain an arbitrarily accurate estimate of p. Then, using the total number of mutations and the current size of the population, an estimate for ν/λ can be derived from (16) or (17). We refer to Section 5 of [13] for a more detailed discussion of this estimator, which includes an application of the estimator to simulated data.
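A direct implementation of this estimator is straightforward. The sketch below (our own, including the domain extension described above) evaluates ϕ1 from (36) and inverts it by bisection, which is valid since ϕ1 is strictly decreasing:

```python
import math

def phi1(p):
    """Long-run proportion of mutations found in exactly one individual,
    as a function of the extinction probability p (expression (36))."""
    if p == 0.0:
        return 0.5
    q = 1.0 - p
    return -(p + q * math.log(q)) / (p * math.log(q))

def estimate_p(x, tol=1e-10):
    """Invert phi1 by bisection: given an observed singleton proportion x,
    return the estimate of p. Observations outside (0, 1/2] are mapped to
    the boundary values, as in the extension of phi1^{-1} in the text."""
    if x >= 0.5:
        return 0.0
    if x <= 0.0:
        return 1.0
    lo, hi = 0.0, 1.0 - 1e-15    # phi1 is strictly decreasing on [0, 1)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi1(mid) > x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Round-tripping p ↦ ϕ1(p) ↦ p̂ recovers p to bisection tolerance, mirroring the consistency statement for the estimator in (37).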
In the preceding discussion, we focused on the proportion of mutations found in one individual for illustration purposes. The point was to show that it is possible to define consistent estimators for p and ν/λ using the SFS. If it is difficult to measure the number of mutations found in one individual, one can instead focus on the proportion of mutations found in j cells out of all mutations found in ≥ j cells for some j > 1, denoted by ϕj(p) in (19). As noted in Section 3.2, ϕj(p) is strictly decreasing in p for any j ≥ 1, and it takes values in (0, 1/(j + 1)]. We can therefore define a consistent estimator for p using the inverse function ϕj^{−1}. However, it should be noted that the range of ϕj(p) becomes narrower as j increases, which will likely affect the standard deviation of the estimator.
6 Proofs
6.1 Proof of Lemma 1
Proof. Before considering the quantity of interest E[(S^k_{j,+}(t) − S̄^k_{j,+,∆}(t))²], we perform some preliminary calculations. Recall that Mt is the set of mutations generated up until time t. For ∆ > 0 and any non-negative integer ℓ with ℓ∆ < t, define Aℓ,∆ to be the set of mutations created in the time interval [ℓ∆, min{(ℓ + 1)∆, t}), and note that

    Mt = ∪_{ℓ=0}^{⌊t/∆⌋} Aℓ,∆.

Define Xℓ,∆ := |Aℓ,∆| as the number of mutations created in [ℓ∆, min{(ℓ + 1)∆, t}). Note that conditional on F(ℓ+1)∆ = σ(Z0(s); s ≤ (ℓ + 1)∆),

    Xℓ,∆ ∼ Pois(ν ∫_{ℓ∆}^{(ℓ+1)∆} Z0(s) ds).

Using this fact, it is easy to see that

    E[Xℓ,∆ | F(ℓ+1)∆] = ν ∫_{ℓ∆}^{(ℓ+1)∆} Z0(s) ds = ∆νZ0(ℓ∆)(1 + O(∆))  (38)

and

    E[X²ℓ,∆ | F(ℓ+1)∆] − E[Xℓ,∆ | F(ℓ+1)∆] = (E[Xℓ,∆ | F(ℓ+1)∆])²,  (39)

which implies

    E[X²ℓ,∆] − E[Xℓ,∆] = ∆²ν² E[Z0(ℓ∆)²](1 + O(∆)).  (40)

For ease of presentation, we will for the remainder of the proof drop 1 + O(∆) multiplicative factors in calculations, as they will not affect the final result.
Recall that for a mutation i ∈ Mt, I^i_{j,+}(t) is the number of times the clone containing mutation i reaches size j up until time t; see (20). Define

    W^k_{ℓ∆,t}(j) := Σ_{i∈Aℓ,∆} 1{I^i_{j,+}(t) ≥ k}

as the number of mutations in Aℓ,∆ whose clone reaches size j at least k times up until time t. Note that by the definition of S^k_{j,+}(t) in (21),

    S^k_{j,+}(t) = Σ_{ℓ=0}^{⌊t/∆⌋} W^k_{ℓ∆,t}(j).  (41)

For i ∈ Aℓ,∆, P(I^i_{j,+}(t) ≥ k) = p^k_{j,+}(t − ∆ℓ) + O(∆), where p^k_{j,+}(t) is defined as in (24). Therefore, conditional on Xℓ,∆, W^k_{ℓ∆,t}(j) is a binomial random variable with parameters Xℓ,∆ and p^k_{j,+}(t − ℓ∆) + O(∆). Dropping 1 + O(∆) factors, this implies by (38),

    E[W^k_{ℓ∆,t}(j) | F(ℓ+1)∆] = E[E[W^k_{ℓ∆,t}(j) | Xℓ,∆, F(ℓ+1)∆] | F(ℓ+1)∆]
                              = p^k_{j,+}(t − ℓ∆) E[Xℓ,∆ | F(ℓ+1)∆]
                              = ∆ν p^k_{j,+}(t − ℓ∆) Z0(ℓ∆),  (42)

and by (40) and (38),

    E[W^k_{ℓ∆,t}(j)²] = p^k_{j,+}(t − ℓ∆)² E[X²ℓ,∆] + p^k_{j,+}(t − ℓ∆)(1 − p^k_{j,+}(t − ℓ∆)) E[Xℓ,∆]
                      = p^k_{j,+}(t − ℓ∆)² (E[X²ℓ,∆] − E[Xℓ,∆]) + p^k_{j,+}(t − ℓ∆) E[Xℓ,∆]
                      = p^k_{j,+}(t − ℓ∆)² ∆²ν² E[Z0(ℓ∆)²] + p^k_{j,+}(t − ℓ∆) ∆ν E[Z0(ℓ∆)].  (43)
We are now ready to begin the main calculations. First, note that by (29) and (41),
\[
\begin{aligned}
&E\Big[\big(S_{j,+}^k(t)-\bar S_{j,+,\Delta}^k(t)\big)^2\Big]\\
&=E\Bigg[\Bigg(\sum_{\ell=0}^{\lfloor t/\Delta\rfloor}\Big(\nu\Delta Z_0(\ell\Delta)p_{j,+}^k(t-\ell\Delta)-W_{\ell\Delta,t}^k(j)\Big)\Bigg)^2\Bigg]\\
&=\sum_{\ell_2=0}^{\lfloor t/\Delta\rfloor}\sum_{\ell_1=0}^{\lfloor t/\Delta\rfloor}E\Big[\Big(\nu\Delta Z_0(\Delta\ell_2)p_{j,+}^k(t-\Delta\ell_2)-W_{\ell_2\Delta,t}^k(j)\Big)\Big(\nu\Delta Z_0(\Delta\ell_1)p_{j,+}^k(t-\Delta\ell_1)-W_{\ell_1\Delta,t}^k(j)\Big)\Big].
\end{aligned}
\tag{44}
\]
We first consider the diagonal terms in the double sum. Note first that by (42),
\[
E[Z_0(\ell\Delta)W_{\ell\Delta,t}^k(j)]=\Delta\nu p_{j,+}^k(t-\ell\Delta)E[Z_0(\ell\Delta)^2],
\]
which implies by (43),
\[
\begin{aligned}
&E\Big[\big(\nu\Delta Z_0(\ell\Delta)p_{j,+}^k(t-\Delta\ell)-W_{\ell\Delta,t}^k(j)\big)^2\Big]\\
&=\nu^2\Delta^2p_{j,+}^k(t-\ell\Delta)^2E[Z_0(\ell\Delta)^2]-2\nu\Delta p_{j,+}^k(t-\Delta\ell)E[Z_0(\ell\Delta)W_{\ell\Delta,t}^k(j)]+E[W_{\ell\Delta,t}^k(j)^2]\\
&=E\big[W_{\ell\Delta,t}^k(j)^2\big]-\nu^2\Delta^2p_{j,+}^k(t-\ell\Delta)^2E[Z_0(\ell\Delta)^2]\\
&=\nu\Delta p_{j,+}^k(t-\ell\Delta)E[Z_0(\ell\Delta)].
\end{aligned}
\]
Next, we consider the cross terms for $\ell_1<\ell_2$:
\[
\begin{aligned}
&E\Big[\big(\nu\Delta Z_0(\Delta\ell_2)p_{j,+}^k(t-\Delta\ell_2)-W_{\ell_2\Delta,t}^k(j)\big)\big(\nu\Delta Z_0(\Delta\ell_1)p_{j,+}^k(t-\Delta\ell_1)-W_{\ell_1\Delta,t}^k(j)\big)\Big]\\
&=\nu\Delta p_{j,+}^k(t-\Delta\ell_1)E\Big[Z_0(\Delta\ell_1)\big(\nu\Delta Z_0(\Delta\ell_2)p_{j,+}^k(t-\Delta\ell_2)-W_{\ell_2\Delta,t}^k(j)\big)\Big]\\
&\quad-E\Big[W_{\ell_1\Delta,t}^k(j)\big(\nu\Delta Z_0(\Delta\ell_2)p_{j,+}^k(t-\Delta\ell_2)-W_{\ell_2\Delta,t}^k(j)\big)\Big]\\
&=E\Big[W_{\ell_1\Delta,t}^k(j)\big(W_{\ell_2\Delta,t}^k(j)-\nu\Delta Z_0(\Delta\ell_2)p_{j,+}^k(t-\Delta\ell_2)\big)\Big],
\end{aligned}
\]
where the final equality follows by combining (42) with the fact that
\[
E\Big[Z_0(\Delta\ell_1)\big(\nu\Delta Z_0(\Delta\ell_2)p_{j,+}^k(t-\Delta\ell_2)-W_{\ell_2\Delta,t}^k(j)\big)\Big]
=E\Big[E\Big[Z_0(\Delta\ell_1)\Big(\nu\Delta Z_0(\Delta\ell_2)p_{j,+}^k(t-\Delta\ell_2)-E\big[W_{\ell_2\Delta,t}^k(j)\big|\mathcal{F}_{(\ell_2+1)\Delta}\big]\Big)\Big|\mathcal{F}_{(\ell_1+1)\Delta}\Big]\Big].
\]
We can now rewrite (44) as
\[
\begin{aligned}
E\Big[\big(S_{j,+}^k(t)-\bar S_{j,+,\Delta}^k(t)\big)^2\Big]
&=\nu\Delta\sum_{\ell=0}^{\lfloor t/\Delta\rfloor}p_{j,+}^k(t-\ell\Delta)E[Z_0(\ell\Delta)]\\
&\quad+2\sum_{\ell_1<\ell_2}E\Big[W_{\ell_1\Delta,t}^k(j)\big(W_{\ell_2\Delta,t}^k(j)-\nu\Delta Z_0(\Delta\ell_2)p_{j,+}^k(t-\Delta\ell_2)\big)\Big].
\end{aligned}
\tag{45}
\]
The remainder of the proof will focus on bounding the off-diagonal terms
\[
E\Big[W_{\ell_1\Delta,t}^k(j)\big(W_{\ell_2\Delta,t}^k(j)-\nu\Delta Z_0(\Delta\ell_2)p_{j,+}^k(t-\Delta\ell_2)\big)\Big]. \tag{46}
\]
We begin with the following lemma, which shows that in the limit as $\Delta\to0$, we can ignore the possibility of multiple mutations in time intervals of length $\Delta$.
Lemma 3. For $\ell_1<\ell_2$, $\Delta>0$ and $t>0$,
\[
\begin{aligned}
E[W_{\ell_2\Delta,t}^k(j)W_{\ell_1\Delta,t}^k(j)]&=P(W_{\ell_2\Delta,t}^k(j)=1,W_{\ell_1\Delta,t}^k(j)=1)+O\big(e^{\lambda\Delta\ell_1}e^{2\lambda\Delta\ell_2}\Delta^3\big),\\
E[Z_0(\ell_2\Delta)W_{\ell_1\Delta,t}^k(j)]&=E[Z_0(\ell_2\Delta);W_{\ell_1\Delta,t}^k(j)=1]+O\big(e^{\lambda\Delta\ell_1}e^{2\lambda\Delta\ell_2}\Delta^2\big).
\end{aligned}
\]
Proof. Section 6.4.
By Lemma 3, instead of (46) we can study the simpler difference
\[
P(W_{\ell_1\Delta,t}^k(j)=1,W_{\ell_2\Delta,t}^k(j)=1)-\nu\Delta p_{j,+}^k(t-\Delta\ell_2)E[Z_0(\Delta\ell_2);W_{\ell_1\Delta,t}^k(j)=1]. \tag{47}
\]
For ease of notation, define
\[
\begin{aligned}
I_1(\ell_1,\ell_2)&:=P(W_{\ell_1\Delta,t}^k(j)=1,W_{\ell_2\Delta,t}^k(j)=1),\\
I_2(\ell_1,\ell_2)&:=\nu\Delta p_{j,+}^k(t-\Delta\ell_2)E[Z_0(\Delta\ell_2);W_{\ell_1\Delta,t}^k(j)=1].
\end{aligned}
\]
In the following calculations, we will use twice that
\[
P(W_{\ell_1\Delta,t}^k(j)=1|Z_0(\Delta\ell_1)=n)=n\nu\Delta p_{j,+}^k(t-\Delta\ell_1).
\]
First consider the $I_2(\ell_1,\ell_2)$ term,
\[
\begin{aligned}
\frac{I_2(\ell_1,\ell_2)}{\nu\Delta p_{j,+}^k(t-\Delta\ell_2)}
&=E\big[Z_0(\Delta\ell_2);W_{\ell_1\Delta,t}^k(j)=1\big]\\
&=\sum_{m=1}^\infty mP\big(Z_0(\Delta\ell_2)=m,W_{\ell_1\Delta,t}^k(j)=1\big)\\
&=\sum_{m=1}^\infty\sum_{n=1}^\infty mP\big(Z_0(\Delta\ell_2)=m,W_{\ell_1\Delta,t}^k(j)=1,Z_0(\Delta\ell_1)=n\big)\\
&=\sum_{m=1}^\infty\sum_{n=1}^\infty mP\big(Z_0(\Delta\ell_2)=m|W_{\ell_1\Delta,t}^k(j)=1,Z_0(\Delta\ell_1)=n\big)\\
&\qquad\cdot P(W_{\ell_1\Delta,t}^k(j)=1|Z_0(\Delta\ell_1)=n)P(Z_0(\Delta\ell_1)=n)\\
&=\nu\Delta p_{j,+}^k(t-\Delta\ell_1)\sum_{n=1}^\infty nP(Z_0(\Delta\ell_1)=n)\sum_{m=1}^\infty mP\big(Z_0(\Delta\ell_2)=m|W_{\ell_1\Delta,t}^k(j)=1,Z_0(\Delta\ell_1)=n\big).
\end{aligned}
\]
Next we consider the $I_1(\ell_1,\ell_2)$ term,
\[
\begin{aligned}
I_1(\ell_1,\ell_2)&=P(W_{\ell_1\Delta,t}^k(j)=1,W_{\ell_2\Delta,t}^k(j)=1)\\
&=\sum_{n=1}^\infty P(W_{\ell_2\Delta,t}^k(j)=1|Z_0(\Delta\ell_1)=n,W_{\ell_1\Delta,t}^k(j)=1)\\
&\qquad\cdot P(W_{\ell_1\Delta,t}^k(j)=1|Z_0(\Delta\ell_1)=n)P(Z_0(\Delta\ell_1)=n)\\
&=\nu\Delta p_{j,+}^k(t-\Delta\ell_1)\sum_{n=1}^\infty nP(Z_0(\Delta\ell_1)=n)P(W_{\ell_2\Delta,t}^k(j)=1|Z_0(\Delta\ell_1)=n,W_{\ell_1\Delta,t}^k(j)=1)\\
&=\nu\Delta p_{j,+}^k(t-\Delta\ell_1)\sum_{n=1}^\infty nP(Z_0(\Delta\ell_1)=n)\\
&\qquad\cdot\sum_{m=1}^\infty P(W_{\ell_2\Delta,t}^k(j)=1|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,W_{\ell_1\Delta,t}^k(j)=1)\\
&\qquad\qquad\cdot P(Z_0(\Delta\ell_2)=m|W_{\ell_1\Delta,t}^k(j)=1,Z_0(\Delta\ell_1)=n).
\end{aligned}
\]
We can therefore write
\[
\begin{aligned}
&I_1(\ell_1,\ell_2)-I_2(\ell_1,\ell_2)\\
&=\nu\Delta p_{j,+}^k(t-\Delta\ell_1)\sum_{n=1}^\infty nP(Z_0(\Delta\ell_1)=n)\sum_{m=1}^\infty P(Z_0(\Delta\ell_2)=m|W_{\ell_1\Delta,t}^k(j)=1,Z_0(\Delta\ell_1)=n)\\
&\qquad\cdot\Big(P(W_{\ell_2\Delta,t}^k(j)=1|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,W_{\ell_1\Delta,t}^k(j)=1)-m\nu\Delta p_{j,+}^k(t-\Delta\ell_2)\Big).
\end{aligned}
\tag{48}
\]
We can use (48) to show that there exists a constant $C>0$ so that
\[
I_1(\ell_1,\ell_2)-I_2(\ell_1,\ell_2)\le C\Delta^2\theta^ke^{\lambda\Delta\ell_2}, \tag{49}
\]
where $\theta$ is obtained from (26). The proof is deferred to the following lemma.

Lemma 4. For $\ell_1<\ell_2$, $\Delta>0$ and $t>0$, (49) holds.

Proof. Section 6.5.
Returning to (45), we can finally use Lemmas 3 and 4 to conclude that there exist positive constants $C_1$, $C_2$ and $C_3$ such that
\[
\begin{aligned}
E\Big[\big(S_{j,+}^k(t)-\bar S_{j,+,\Delta}^k(t)\big)^2\Big]
&=\nu\Delta\sum_{\ell=0}^{\lfloor t/\Delta\rfloor}p_{j,+}^k(t-\ell\Delta)E[Z_0(\ell\Delta)]+2\sum_{\ell_1<\ell_2}\big(I_1(\ell_1,\ell_2)-I_2(\ell_1,\ell_2)\big)+C_3\Delta e^{3\lambda t}\\
&\le C_1\theta^kte^{\lambda t}+C_2\theta^kt^2e^{\lambda t}+C_3\Delta e^{3\lambda t}.
\end{aligned}
\]
This concludes the proof.
6.2 Proof of Lemma 2
Proof. Using that $E[Y|\mathcal{F}_s]=e^{-\lambda s}Z_0(s)$, see Section 2.3, we begin by writing
\[
\begin{aligned}
E\Big[\big(Ye^{\lambda s}-Z_0(s)\big)^2\Big]
&=E[Z_0(s)^2]-2e^{\lambda s}E[YZ_0(s)]+e^{2\lambda s}E[Y^2]\\
&=e^{2\lambda s}E[Y^2]-E[Z_0(s)^2].
\end{aligned}
\]
From expression (5) of Chapter III.4 of [26], we know there exist positive constants $c_1$ and $c_2$ such that
\[
E[Z_0(s)^2]=c_1e^{2\lambda s}-c_2e^{\lambda s}. \tag{50}
\]
If we establish that $E[Y^2]=c_1$, then it will follow that
\[
E\Big[\big(Ye^{\lambda t}-Z_0(t)\big)^2\Big]=c_2e^{\lambda t}, \tag{51}
\]
which is what we need to prove Lemma 2. To this end, note that Theorem 1 of IV.11 in [26] implies that $E[(Z_0(t)e^{-\lambda t})^2]\to E[Y^2]$ as $t\to\infty$. And from (50), we know that
\[
\lim_{t\to\infty}e^{-2\lambda t}E[Z_0(t)^2]=c_1.
\]
Therefore, $E[Y^2]=c_1$, which concludes the proof.
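In the special case of a pure-birth (Yule) process, $Z_0(t)$ is geometrically distributed on $\{1,2,\dots\}$ with success probability $e^{-\lambda t}$, and (50) holds with $c_1=2$, $c_2=1$. A quick numerical check of this special case (not part of the proof; $\lambda$ and $t$ below are arbitrary toy values):

```python
import math

lam, t = 1.0, 2.0
p = math.exp(-lam * t)  # Yule process: Z0(t) ~ Geometric(p) on {1, 2, ...}
m2 = sum(k**2 * p * (1 - p)**(k - 1) for k in range(1, 20000))
# (50) in the Yule case: E[Z0(t)^2] = 2 e^{2 lam t} - e^{lam t}
assert abs(m2 - (2 * math.exp(2 * lam * t) - math.exp(lam * t))) < 1e-6
```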
6.3 Proof of Proposition 4
Proof. Since $S_{j,+}(t)$ is increasing in $t$,
\[
e^{-\lambda(t+\tau)}S_{j,+}(t+\tau)\ge e^{-\lambda\tau}e^{-\lambda t}S_{j,+}(t),\qquad t,\tau\ge0.
\]
In Section 4.3, it is shown that $\hat S:=\lim_{t\to\infty}e^{-\lambda t}\hat S_{j,+}(t)$ exists, and the limit is positive on $\Omega_\infty$ since $Y>0$, see (34). Suppose there is an $\omega\in\Omega_\infty$ such that
\[
\limsup_{t\to\infty}e^{-\lambda t}S_{j,+}(t,\omega)>\hat S(\omega). \tag{52}
\]
For notational convenience, we will drop the $\omega$ in what follows. If (52) is true, there is a $\delta>0$ and a sequence of real numbers $t_1<t_2<\dots$ such that $t_{i+1}-t_i>\delta/(\lambda(2+2\delta))$ and $e^{-\lambda t_i}S_{j,+}(t_i)>\hat S(1+\delta)$ for $i=1,2,\dots$. Then
\[
e^{-\lambda(t_i+\tau)}S_{j,+}(t_i+\tau)\ge e^{-\lambda\tau}e^{-\lambda t_i}S_{j,+}(t_i)\ge(1-\lambda\tau)\hat S(1+\delta). \tag{53}
\]
Also, there exists $t_0$ so that for $t>t_0$,
\[
e^{-\lambda t}\hat S_{j,+}(t)<\hat S(1+\delta/2).
\]
Therefore, for $t_i>t_0$,
\[
\begin{aligned}
\int_{t_i}^{t_{i+1}}\big|e^{-\lambda t}S_{j,+}(t)-e^{-\lambda t}\hat S_{j,+}(t)\big|\,dt
&\ge\int_{t_i}^{t_i+\delta/(\lambda(2+2\delta))}\big|e^{-\lambda t}S_{j,+}(t)-e^{-\lambda t}\hat S_{j,+}(t)\big|\,dt\\
&\ge\int_{t_i}^{t_i+\delta/(\lambda(2+2\delta))}\big(e^{-\lambda t}S_{j,+}(t)-e^{-\lambda t}\hat S_{j,+}(t)\big)\,dt\\
&\ge\hat S\int_0^{\delta/(\lambda(2+2\delta))}\big((1-\lambda\tau)(1+\delta)-(1+\delta/2)\big)\,d\tau\\
&=\hat S\cdot\frac{\delta^2}{8\lambda(1+\delta)},
\end{aligned}
\]
from which it follows that
\[
\int_0^\infty\big|e^{-\lambda t}S_{j,+}(t)-e^{-\lambda t}\hat S_{j,+}(t)\big|\,dt=\infty.
\]
By (35), we see that the inequality (52) cannot hold on a set of positive probability.
Now suppose that
\[
\liminf_{t\to\infty}e^{-\lambda t}S_{j,+}(t,\omega)<\hat S(\omega) \tag{54}
\]
for some $\omega\in\Omega_\infty$. Then there is a real number $0<\delta<1$ and a sequence of real numbers $t_1<t_2<\dots$ with $t_{i+1}-t_i>\delta/(\lambda(2-\delta))$ such that $e^{-\lambda t_i}S_{j,+}(t_i)<(1-\delta)\hat S$. Therefore,
\[
e^{-\lambda(t_i-\tau)}S_{j,+}(t_i-\tau)\le(1-\delta)\hat Se^{\lambda\tau}\le\frac{(1-\delta)\hat S}{1-\lambda\tau},\qquad 0\le\tau<1/\lambda. \tag{55}
\]
Also, there exists $t_0$ so that for $t>t_0$,
\[
e^{-\lambda t}\hat S_{j,+}(t)>(1-\delta/2)\hat S.
\]
Therefore,
\[
\begin{aligned}
\int_{t_i}^{t_{i+1}}\big|e^{-\lambda t}S_{j,+}(t)-e^{-\lambda t}\hat S_{j,+}(t)\big|\,dt
&\ge\int_{t_{i+1}-\delta/(\lambda(2-\delta))}^{t_{i+1}}\big|e^{-\lambda t}S_{j,+}(t)-e^{-\lambda t}\hat S_{j,+}(t)\big|\,dt\\
&\ge\int_{t_{i+1}-\delta/(\lambda(2-\delta))}^{t_{i+1}}\big(e^{-\lambda t}\hat S_{j,+}(t)-e^{-\lambda t}S_{j,+}(t)\big)\,dt\\
&\ge\hat S\int_0^{\delta/(\lambda(2-\delta))}\big((1-\delta/2)-(1-\delta)/(1-\lambda\tau)\big)\,d\tau\\
&=\hat S\left(\frac{\delta}{2\lambda}+\frac{1-\delta}{\lambda}\log\frac{2-2\delta}{2-\delta}\right),
\end{aligned}
\]
where we can verify that $\frac{\delta}{2\lambda}+\frac{1-\delta}{\lambda}\log\frac{2-2\delta}{2-\delta}>0$ when $\delta<1$. Hence
\[
\int_0^\infty\big|e^{-\lambda t}S_{j,+}(t)-e^{-\lambda t}\hat S_{j,+}(t)\big|\,dt=\infty,
\]
which allows us to conclude that (54) cannot hold on a set of positive probability.
We can now conclude that on $\Omega_\infty$,
\[
\lim_{t\to\infty}e^{-\lambda t}S_{j,+}(t)=\hat S
\]
almost surely, which is the desired result.
6.4 Proof of Lemma 3
Proof. We will only prove the first statement, the proof of the second statement being largely the same. To that end, it suffices to show that
\[
E[W_{\ell_2\Delta,t}^k(j)W_{\ell_1\Delta,t}^k(j);W_{\ell_2\Delta,t}^k(j)>1]+E[W_{\ell_2\Delta,t}^k(j)W_{\ell_1\Delta,t}^k(j);W_{\ell_1\Delta,t}^k(j)>1]
=O\big(e^{\lambda\Delta\ell_1}e^{2\lambda\Delta\ell_2}\Delta^3\big),
\]
with $\ell_1<\ell_2$. Again, we will only show that the first term satisfies the bound, the proof for the second term being largely the same. We first note that since $W_{\ell_1\Delta,t}^k(j)\le X_{\ell_1,\Delta}$,
\[
\begin{aligned}
E[W_{\ell_2\Delta,t}^k(j)W_{\ell_1\Delta,t}^k(j)1_{\{W_{\ell_2\Delta,t}^k(j)>1\}}]
&=E\Big[E[W_{\ell_2\Delta,t}^k(j)W_{\ell_1\Delta,t}^k(j)1_{\{W_{\ell_2\Delta,t}^k(j)>1\}}|\mathcal{F}_{\Delta(\ell_1+1)}]\Big]\\
&\le E\Big[E[X_{\ell_1,\Delta}W_{\ell_2\Delta,t}^k(j)1_{\{W_{\ell_2\Delta,t}^k(j)>1\}}|\mathcal{F}_{\Delta(\ell_1+1)}]\Big]\\
&=E\Big[E\big[X_{\ell_1,\Delta}\big|\mathcal{F}_{\Delta(\ell_1+1)}\big]E\big[W_{\ell_2\Delta,t}^k(j)1_{\{W_{\ell_2\Delta,t}^k(j)>1\}}\big|\mathcal{F}_{\Delta(\ell_1+1)}\big]\Big].
\end{aligned}
\]
The final equality follows because the number of mutations created in the interval $[\Delta\ell_1,\Delta\ell_1+\Delta)$ is independent of the number of mutations created in $[\Delta\ell_2,\Delta\ell_2+\Delta)$ and their fate, given the population size up until time $\Delta(\ell_1+1)$. Therefore, using (38), $W_{\ell_2\Delta,t}^k(j)\le X_{\ell_2,\Delta}$ and (39),
\[
\begin{aligned}
E[W_{\ell_2\Delta,t}^k(j)W_{\ell_1\Delta,t}^k(j)1_{\{W_{\ell_2\Delta,t}^k(j)>1\}}]
&\le\nu\Delta E\Big[Z_0(\Delta\ell_1)E\Big[E\big[W_{\ell_2\Delta,t}^k(j)1_{\{W_{\ell_2\Delta,t}^k(j)>1\}}\big|\mathcal{F}_{\Delta(\ell_2+1)}\big]\Big|\mathcal{F}_{\Delta(\ell_1+1)}\Big]\Big]\\
&\le\nu\Delta E\Big[Z_0(\Delta\ell_1)E\Big[E\big[X_{\ell_2,\Delta}1_{\{X_{\ell_2,\Delta}>1\}}\big|\mathcal{F}_{\Delta(\ell_2+1)}\big]\Big|\mathcal{F}_{\Delta(\ell_1+1)}\Big]\Big]\\
&\le\nu\Delta E\Big[Z_0(\Delta\ell_1)E\Big[E\big[X_{\ell_2,\Delta}(X_{\ell_2,\Delta}-1)\big|\mathcal{F}_{\Delta(\ell_2+1)}\big]\Big|\mathcal{F}_{\Delta(\ell_1+1)}\Big]\Big]\\
&=\nu^3\Delta^3E\Big[Z_0(\Delta\ell_1)E\big[Z_0(\Delta\ell_2)^2\big|\mathcal{F}_{\Delta(\ell_1+1)}\big]\Big].
\end{aligned}
\]
We then use that for $s\le t$,
\[
E[Z_0(t)^2|\mathcal{F}_s]=e^{2\lambda(t-s)}Z_0(s)^2+\mathrm{Var}(Z_0(t-s))Z_0(s),
\]
to conclude that
\[
\begin{aligned}
E[W_{\ell_2\Delta,t}^k(j)W_{\ell_1\Delta,t}^k(j)1_{\{W_{\ell_2\Delta,t}^k(j)>1\}}]
&\le\nu^3\Delta^3e^{2\lambda\Delta(\ell_2-\ell_1-1)}E[Z_0(\Delta\ell_1)Z_0(\Delta(\ell_1+1))^2]\\
&\quad+\nu^3\Delta^3\mathrm{Var}(Z_0(\Delta(\ell_2-\ell_1-1)))E[Z_0(\Delta\ell_1)Z_0(\Delta(\ell_1+1))]\\
&=\nu^3\Delta^3e^{2\lambda\Delta(\ell_2-\ell_1)}E[Z_0(\Delta\ell_1)^3]\\
&\quad+\nu^3\Delta^3e^{2\lambda\Delta(\ell_2-\ell_1-1)}\mathrm{Var}(Z_0(\Delta))E[Z_0(\Delta\ell_1)^2]\\
&\quad+\nu^3\Delta^3\mathrm{Var}(Z_0(\Delta(\ell_2-\ell_1-1)))e^{\lambda\Delta}E[Z_0(\Delta\ell_1)^2].
\end{aligned}
\]
The desired result now follows from the assumption that the offspring distribution has a finite third moment, and thus $E[Z_0(t)^3]=O\big(e^{3\lambda t}\big)$ by Lemma 5 of [28].
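The conditional second-moment identity used above reflects the branching property: given $Z_0(s)=n$, $Z_0(t)$ is a sum of $n$ i.i.d. copies of $Z_0(t-s)$. A numerical sanity check in the Yule special case (not part of the proof; $n$ and $u=t-s$ below are toy values; clone sizes are geometric, and the pmf of their sum is computed by convolution):

```python
import math

lam, u, n = 1.0, 0.7, 3        # toy values: u = t - s, Z0(s) = n
p = math.exp(-lam * u)         # Yule clone size at time u is Geometric(p) on {1, 2, ...}
K = 600
geom = [0.0] + [p * (1 - p)**(k - 1) for k in range(1, K)]
pmf = [1.0] + [0.0] * (K - 1)  # point mass at 0 before adding any clone
for _ in range(n):             # convolve n independent clone-size distributions
    new = [0.0] * K
    for a in range(K):
        if pmf[a] > 0.0:
            for b in range(1, K - a):
                new[a + b] += pmf[a] * geom[b]
    pmf = new
m2 = sum(k**2 * pmf[k] for k in range(K))
var_u = math.exp(2 * lam * u) - math.exp(lam * u)   # Var(Z0(u)) for the Yule process
# E[Z0(t)^2 | Z0(s) = n] = e^{2 lam (t-s)} n^2 + Var(Z0(t-s)) n
assert abs(m2 - (math.exp(2 * lam * u) * n**2 + var_u * n)) < 1e-6
```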
6.5 Proof of Lemma 4
Proof. Let $\ell$ be a positive integer and let $s>0$ be such that $\ell\Delta+\Delta<s$. On the event $\{X_{\ell,\Delta}=1\}$, define $D_{\ell\Delta}^j(s)$ to be the number of disjoint time intervals in $[0,s]$ during which the mutation created at time $\ell\Delta$ is present in $j$ individuals, and let $B_{\ell\Delta}(s)$ be the number of individuals alive at time $s$ descended from the mutation at time $\ell\Delta$. Note that
\[
P(W_{\ell\Delta,t}^k(j)=1)=P(X_{\ell,\Delta}=1,D_{\ell\Delta}^j(t)\ge k)(1+O(\Delta)).
\]
On $\{X_{\ell_1,\Delta}=1,X_{\ell_2,\Delta}=1\}$ with $\ell_1<\ell_2$, let $A$ denote the event that the mutation at time $\ell_2\Delta$ occurs in the clone started by the mutation at time $\ell_1\Delta$.

We now consider the first term inside the parenthesis in (48), and break it up based on the value of $B_{\ell_1\Delta}(\ell_2\Delta)$ and whether $A$ occurs or not. Once again, we refrain from writing $1+O(\Delta)$ multiplicative factors.
\[
\begin{aligned}
&P(W_{\ell_2\Delta,t}^k(j)=1|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,W_{\ell_1\Delta,t}^k(j)=1)\\
&=\sum_{i=1}^mP(W_{\ell_2\Delta,t}^k(j)=1,B_{\ell_1\Delta}(\ell_2\Delta)=i|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,W_{\ell_1\Delta,t}^k(j)=1)\\
&=\sum_{i=1}^mP(W_{\ell_2\Delta,t}^k(j)=1,B_{\ell_1\Delta}(\ell_2\Delta)=i,A|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,W_{\ell_1\Delta,t}^k(j)=1)\\
&\quad+\sum_{i=1}^mP(W_{\ell_2\Delta,t}^k(j)=1,B_{\ell_1\Delta}(\ell_2\Delta)=i,A^c|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,W_{\ell_1\Delta,t}^k(j)=1).
\end{aligned}
\]
Note that
\[
\begin{aligned}
&P(W_{\ell_2\Delta,t}^k(j)=1,B_{\ell_1\Delta}(\ell_2\Delta)=i,A|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,W_{\ell_1\Delta,t}^k(j)=1)\\
&=P(W_{\ell_2\Delta,t}^k(j)=1,A|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,W_{\ell_1\Delta,t}^k(j)=1,B_{\ell_1\Delta}(\ell_2\Delta)=i)\\
&\quad\cdot P(B_{\ell_1\Delta}(\ell_2\Delta)=i|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,W_{\ell_1\Delta,t}^k(j)=1)\\
&=P(W_{\ell_2\Delta,t}^k(j)=1,A,D_{\ell_1\Delta}^j(t)\ge k|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,X_{\ell_1,\Delta}=1,B_{\ell_1\Delta}(\ell_2\Delta)=i)\\
&\quad\cdot\frac{P(B_{\ell_1\Delta}(\ell_2\Delta)=i|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,W_{\ell_1\Delta,t}^k(j)=1)}{P(D_{\ell_1\Delta}^j(t)\ge k|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,X_{\ell_1,\Delta}=1,B_{\ell_1\Delta}(\ell_2\Delta)=i)}\\
&\le P(W_{\ell_2\Delta,t}^k(j)=1,A|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,X_{\ell_1,\Delta}=1,B_{\ell_1\Delta}(\ell_2\Delta)=i)\\
&\quad\cdot\frac{P(B_{\ell_1\Delta}(\ell_2\Delta)=i|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,W_{\ell_1\Delta,t}^k(j)=1)}{P(D_{\ell_1\Delta}^j(t)\ge k|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,X_{\ell_1,\Delta}=1,B_{\ell_1\Delta}(\ell_2\Delta)=i)},
\end{aligned}
\]
and
\[
P(W_{\ell_2\Delta,t}^k(j)=1,A|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,X_{\ell_1,\Delta}=1,B_{\ell_1\Delta}(\ell_2\Delta)=i)=i\nu\Delta p_{j,+}^k(t-\Delta\ell_2).
\]
Also note that
\[
\begin{aligned}
&P(W_{\ell_2\Delta,t}^k(j)=1,B_{\ell_1\Delta}(\ell_2\Delta)=i,A^c|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,W_{\ell_1\Delta,t}^k(j)=1)\\
&=P(W_{\ell_2\Delta,t}^k(j)=1,A^c|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,W_{\ell_1\Delta,t}^k(j)=1,B_{\ell_1\Delta}(\ell_2\Delta)=i)\\
&\quad\cdot P(B_{\ell_1\Delta}(\ell_2\Delta)=i|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,W_{\ell_1\Delta,t}^k(j)=1)\\
&=(m-i)\nu\Delta p_{j,+}^k(t-\Delta\ell_2)\cdot P(B_{\ell_1\Delta}(\ell_2\Delta)=i|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,W_{\ell_1\Delta,t}^k(j)=1).
\end{aligned}
\]
It follows that
\[
\begin{aligned}
&P(W_{\ell_2\Delta,t}^k(j)=1|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,W_{\ell_1\Delta,t}^k(j)=1)\\
&\le\nu p_{j,+}^k(t-\Delta\ell_2)\Bigg(\sum_{i=1}^mi\Delta\frac{P(B_{\ell_1\Delta}(\ell_2\Delta)=i|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,W_{\ell_1\Delta,t}^k(j)=1)}{P(D_{\ell_1\Delta}^j(t)\ge k|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,X_{\ell_1,\Delta}=1,B_{\ell_1\Delta}(\ell_2\Delta)=i)}\\
&\qquad+\sum_{i=1}^m(m-i)\Delta P(B_{\ell_1\Delta}(\ell_2\Delta)=i|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,W_{\ell_1\Delta,t}^k(j)=1)\Bigg)\\
&\le\nu\Delta p_{j,+}^k(t-\Delta\ell_2)\Bigg(m+\sum_{i=1}^mi\frac{P(B_{\ell_1\Delta}(\ell_2\Delta)=i|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,W_{\ell_1\Delta,t}^k(j)=1)}{P(D_{\ell_1\Delta}^j(t)\ge k|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,X_{\ell_1,\Delta}=1,B_{\ell_1\Delta}(\ell_2\Delta)=i)}\Bigg).
\end{aligned}
\]
Going back to (48), we can then derive the upper bound
\[
\begin{aligned}
&I_1(\ell_1,\ell_2)-I_2(\ell_1,\ell_2)\\
&\le\nu^2\Delta^2p_{j,+}^k(t-\Delta\ell_1)p_{j,+}^k(t-\Delta\ell_2)\sum_{n=1}^\infty nP(Z_0(\Delta\ell_1)=n)\\
&\quad\cdot\sum_{m=1}^\infty P(Z_0(\Delta\ell_2)=m|W_{\ell_1\Delta,t}^k(j)=1,Z_0(\Delta\ell_1)=n)\\
&\quad\cdot\sum_{i=1}^mi\frac{P(B_{\ell_1\Delta}(\ell_2\Delta)=i|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,W_{\ell_1\Delta,t}^k(j)=1)}{P(D_{\ell_1\Delta}^j(t)\ge k|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,X_{\ell_1,\Delta}=1,B_{\ell_1\Delta}(\ell_2\Delta)=i)}.
\end{aligned}
\]
Note that
\[
\begin{aligned}
&P(Z_0(\Delta\ell_2)=m|W_{\ell_1\Delta,t}^k(j)=1,Z_0(\Delta\ell_1)=n)\\
&\qquad\cdot P(B_{\ell_1\Delta}(\ell_2\Delta)=i|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,W_{\ell_1\Delta,t}^k(j)=1)\\
&=\frac{P(B_{\ell_1\Delta}(\ell_2\Delta)=i,Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,W_{\ell_1\Delta,t}^k(j)=1)}{P(W_{\ell_1\Delta,t}^k(j)=1,Z_0(\Delta\ell_1)=n)}
\end{aligned}
\]
and
\[
\begin{aligned}
&\frac{P(B_{\ell_1\Delta}(\ell_2\Delta)=i,Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,W_{\ell_1\Delta,t}^k(j)=1)}{P(D_{\ell_1\Delta}^j(t)\ge k|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,X_{\ell_1,\Delta}=1,B_{\ell_1\Delta}(\ell_2\Delta)=i)}\\
&=\frac{P(B_{\ell_1\Delta}(\ell_2\Delta)=i,Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,X_{\ell_1,\Delta}=1,D_{\ell_1\Delta}^j(t)\ge k)}{P(D_{\ell_1\Delta}^j(t)\ge k|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,X_{\ell_1,\Delta}=1,B_{\ell_1\Delta}(\ell_2\Delta)=i)}\\
&=P(Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,X_{\ell_1,\Delta}=1,B_{\ell_1\Delta}(\Delta\ell_2)=i).
\end{aligned}
\]
It follows that
\[
\begin{aligned}
&P(Z_0(\Delta\ell_2)=m|W_{\ell_1\Delta,t}^k(j)=1,Z_0(\Delta\ell_1)=n)\\
&\qquad\cdot\frac{P(B_{\ell_1\Delta}(\ell_2\Delta)=i|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,W_{\ell_1\Delta,t}^k(j)=1)}{P(D_{\ell_1\Delta}^j(t)\ge k|Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,X_{\ell_1,\Delta}=1,B_{\ell_1\Delta}(\ell_2\Delta)=i)}\\
&=\frac{P(Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,X_{\ell_1,\Delta}=1,B_{\ell_1\Delta}(\Delta\ell_2)=i)}{P(W_{\ell_1\Delta,t}^k(j)=1,Z_0(\Delta\ell_1)=n)}.
\end{aligned}
\]
Since
\[
\frac{P(Z_0(\Delta\ell_1)=n)}{P(W_{\ell_1\Delta,t}^k(j)=1,Z_0(\Delta\ell_1)=n)}=\frac{1}{P(W_{\ell_1\Delta,t}^k(j)=1|Z_0(\Delta\ell_1)=n)}=\frac{1}{n\nu\Delta p_{j,+}^k(t-\Delta\ell_1)},
\]
we can write
\[
I_1(\ell_1,\ell_2)-I_2(\ell_1,\ell_2)\le\Delta\nu p_{j,+}^k(t-\Delta\ell_2)\sum_{n=1}^\infty\sum_{m=1}^\infty\sum_{i=1}^miP\big(Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,X_{\ell_1,\Delta}=1,B_{\ell_1\Delta}(\Delta\ell_2)=i\big).
\]
Now,
\[
\begin{aligned}
&P(Z_0(\Delta\ell_2)=m,Z_0(\Delta\ell_1)=n,X_{\ell_1,\Delta}=1,B_{\ell_1\Delta}(\Delta\ell_2)=i)\\
&=P(Z_0(\Delta\ell_2)=m,B_{\ell_1\Delta}(\Delta\ell_2)=i|Z_0(\Delta\ell_1)=n,X_{\ell_1,\Delta}=1)\\
&\qquad\cdot P(X_{\ell_1,\Delta}=1|Z_0(\Delta\ell_1)=n)P(Z_0(\Delta\ell_1)=n)\\
&=n\nu\Delta P(Z_0(\Delta\ell_1)=n)p_i(\Delta(\ell_2-\ell_1))p_{n-1,m-i}(\Delta(\ell_2-\ell_1)),
\end{aligned}
\]
where we recall that $p_{n,m}(t)=P(Z_0(t)=m|Z_0(0)=n)$ and $p_m(t)=p_{1,m}(t)$. It follows that
\[
\begin{aligned}
I_1(\ell_1,\ell_2)-I_2(\ell_1,\ell_2)
&\le\nu^2\Delta^2p_{j,+}^k(t-\Delta\ell_2)\sum_{n=1}^\infty nP(Z_0(\Delta\ell_1)=n)\sum_{m=1}^\infty\sum_{i=1}^mip_{n-1,m-i}(\Delta(\ell_2-\ell_1))p_i(\Delta(\ell_2-\ell_1))\\
&=\nu^2\Delta^2p_{j,+}^k(t-\Delta\ell_2)\sum_{n=1}^\infty nP(Z_0(\Delta\ell_1)=n)\sum_{i=1}^\infty ip_i(\Delta(\ell_2-\ell_1))\sum_{m=i}^\infty p_{n-1,m-i}(\Delta(\ell_2-\ell_1))\\
&=\nu^2\Delta^2p_{j,+}^k(t-\Delta\ell_2)\sum_{n=1}^\infty nP(Z_0(\Delta\ell_1)=n)\sum_{i=1}^\infty ip_i(\Delta(\ell_2-\ell_1))\\
&=\nu^2\Delta^2p_{j,+}^k(t-\Delta\ell_2)e^{\lambda\Delta\ell_2}\le\nu^2\Delta^2\theta^ke^{\lambda\Delta\ell_2}.
\end{aligned}
\]
This is the desired result.
6.6 Proof of Proposition 1
Proof. To begin with, define the extinction time of the branching process with $Z_0(0)=1$ as
\[
\tau_0=\inf\{t>0:Z_0(t)=0\},
\]
and note that the extinction probability $p=P(\tau_0<\infty)$ satisfies $p\in[0,1)$ by the assumption $m>1$. We want to prove that for any $\varepsilon>0$,
\[
\lim_{N\to\infty}P(|\tau_N-t_N|>\varepsilon|\Omega_\infty)=0,
\]
where $\tau_N$ and $t_N$ are defined by (2) and (4), respectively. We begin by establishing a simple lower bound on $\tau_N$ for large $N$.

Lemma 5. For $\rho\in(0,1)$, define $s_N(\rho):=\frac{\rho}{\lambda}\log(N)$. Then
\[
P(\tau_N<s_N(\rho))=O\big(N^{2(\rho-1)}\big).
\]
Proof. Since $m>1$, we know that $(Z_0(t))_{t\ge0}$ is a submartingale. Therefore, by Doob's maximal inequality,
\[
P(\tau_N<s_N(\rho))=P\Big(\sup_{t\le s_N(\rho)}Z_0(t)\ge N\Big)\le\frac{1}{N^2}E\big[Z_0(s_N(\rho))^2\big]=O\big(N^{2(\rho-1)}\big).
\]
We next establish a simple result about the rate of convergence of $e^{-\lambda t}Z_0(t)\to Y$.

Lemma 6. For $z>0$,
\[
\lim_{a\to\infty}P\Big(\sup_{t\ge a}|Z_0(t)e^{-\lambda t}-Y|\ge zY,\Omega_\infty\Big)=0.
\]
Proof. Fix $a>0$ and $\delta>0$. On the event $\Omega_\infty$, $Y$ is a random variable on $(0,\infty)$ with a strictly positive continuous density function, see (3). Thus, there exists $\eta>0$ such that $P(Y<\eta,\Omega_\infty)\le\delta$. We can therefore write
\[
P\Big(\sup_{t\ge a}|Z_0(t)e^{-\lambda t}-Y|\ge zY,\Omega_\infty\Big)\le\delta+P\Big(\sup_{t\ge a}|Z_0(t)e^{-\lambda t}-Y|\ge z\eta,\Omega_\infty\Big).
\]
For arbitrary $b>a$, we see from the triangle inequality that
\[
\sup_{a\le t\le b}\big|Z_0(t)e^{-\lambda t}-Y\big|\le\big|Z_0(a)e^{-\lambda a}-Y\big|+\sup_{a\le t\le b}\big|Z_0(t)e^{-\lambda t}-Z_0(a)e^{-\lambda a}\big|.
\]
Thus, for $z>0$,
\[
\begin{aligned}
&P\Big(\sup_{a\le t\le b}\big|Z_0(t)e^{-\lambda t}-Y\big|\ge z\eta,\Omega_\infty\Big)\\
&\le P\Big(\big|Z_0(a)e^{-\lambda a}-Y\big|\ge z\eta/2\Big)+P\Big(\sup_{a\le t\le b}\big|Z_0(t)e^{-\lambda t}-Z_0(a)e^{-\lambda a}\big|\ge z\eta/2\Big).
\end{aligned}
\tag{56}
\]
Since by (51),
\[
\begin{aligned}
E\Big[\big(Z_0(a)e^{-\lambda a}-Y\big)^2\Big]&=O\big(e^{-\lambda a}\big),\\
E\Big[\big(Z_0(a)e^{-\lambda a}-Z_0(b)e^{-\lambda b}\big)^2\Big]&=O\big(e^{-\lambda a}\big),
\end{aligned}
\]
Markov's and Doob's inequalities can be applied to (56) to see that
\[
P\Big(\sup_{a\le t\le b}\big|Z_0(t)e^{-\lambda t}-Y\big|\ge z\eta,\Omega_\infty\Big)=O\big(e^{-\lambda a}/\eta^2z^2\big).
\]
Since
\[
P\Big(\sup_{t\ge a}\big|Z_0(t)e^{-\lambda t}-Y\big|\ge z\eta,\Omega_\infty\Big)=\lim_{b\to\infty}P\Big(\sup_{a\le t\le b}\big|Z_0(t)e^{-\lambda t}-Y\big|\ge z\eta,\Omega_\infty\Big),
\]
it follows that
\[
\limsup_{a\to\infty}P\Big(\sup_{t\ge a}|Z_0(t)e^{-\lambda t}-Y|\ge zY,\Omega_\infty\Big)\le\delta,
\]
and because $\delta$ is arbitrary the desired result follows.
We are now ready to analyze the difference $\tau_N-t_N$ on $\Omega_\infty$. We first consider the case $\tau_N<t_N-\varepsilon$. Define the difference function
\[
\omega_0(t)=Z_0(t)-Ye^{\lambda t}.
\]
On $\Omega_\infty$, by the definition of $t_N$ in (4),
\[
Z_0(\tau_N)=Ye^{\lambda\tau_N}+\omega_0(\tau_N)=Ne^{\lambda(\tau_N-t_N)}+\omega_0(\tau_N),
\]
which implies for $\tau_N<t_N-\varepsilon$,
\[
\omega_0(\tau_N)=N\big(1-e^{\lambda(\tau_N-t_N)}\big)+(Z_0(\tau_N)-N)\ge N\big(1-e^{\lambda(\tau_N-t_N)}\big)\ge N\big(1-e^{-\lambda\varepsilon}\big).
\]
Take $0<\rho<1$. Applying Lemma 5,
\[
\begin{aligned}
P(\tau_N<t_N-\varepsilon,\Omega_\infty)
&\le P\big(\omega_0(\tau_N)\ge N(1-e^{-\lambda\varepsilon}),\tau_N<t_N-\varepsilon,\Omega_\infty\big)\\
&\le P(\tau_N\le s_N(\rho))+P\big(\omega_0(\tau_N)\ge N(1-e^{-\lambda\varepsilon}),s_N(\rho)<\tau_N<t_N-\varepsilon,\Omega_\infty\big)\\
&=O\big(N^{2(\rho-1)}\big)+P\big(\omega_0(\tau_N)\ge N(1-e^{-\lambda\varepsilon}),s_N(\rho)<\tau_N<t_N-\varepsilon,\Omega_\infty\big).
\end{aligned}
\]
Thus we consider
\[
\begin{aligned}
&P\big(\omega_0(\tau_N)\ge N(1-e^{-\lambda\varepsilon}),s_N(\rho)<\tau_N<t_N-\varepsilon,\Omega_\infty\big)\\
&\le P\Big(\sup_{s_N(\rho)<t<t_N-\varepsilon}\big(Z_0(t)-Ye^{\lambda t}\big)\ge N\big(1-e^{-\lambda\varepsilon}\big),\Omega_\infty\Big)\\
&\le P\Big(\sup_{s_N(\rho)<t<t_N-\varepsilon}\big(Z_0(t)e^{-\lambda t}-Y\big)e^{\lambda(t_N-\varepsilon)}\ge N\big(1-e^{-\lambda\varepsilon}\big),\Omega_\infty\Big)\\
&\le P\Big(\sup_{s_N(\rho)<t}\big(Z_0(t)e^{-\lambda t}-Y\big)\ge Y\big(e^{\lambda\varepsilon}-1\big),\Omega_\infty\Big),
\end{aligned}
\]
where in the last step, we use the definition of $t_N$. We can now apply Lemma 6 to get
\[
\lim_{N\to\infty}P(\tau_N<t_N-\varepsilon,\Omega_\infty)=0.
\]
We next consider $\tau_N>t_N+\varepsilon$. Note that on the event $\{\tau_N>t_N+\varepsilon\}\cap\Omega_\infty$,
\[
-\omega_0(t_N+\varepsilon)=Ye^{\lambda(t_N+\varepsilon)}-Z_0(t_N+\varepsilon)=Ne^{\lambda\varepsilon}-Z_0(t_N+\varepsilon)\ge N\big(e^{\lambda\varepsilon}-1\big).
\]
Therefore,
\[
P(\tau_N>t_N+\varepsilon,\Omega_\infty)\le P\big(Ye^{\lambda(t_N+\varepsilon)}-Z_0(t_N+\varepsilon)\ge N(e^{\lambda\varepsilon}-1),\Omega_\infty\big)
=P\big(Y-Z_0(t_N+\varepsilon)e^{-\lambda(t_N+\varepsilon)}\ge Y(1-e^{-\lambda\varepsilon}),\Omega_\infty\big).
\]
Since $P(t_N\le\frac{1}{2\lambda}\log(N),\Omega_\infty)=P(Y\ge\sqrt N,\Omega_\infty)\to0$ as $N\to\infty$, we can write
\[
P\big(Y-Z_0(t_N+\varepsilon)e^{-\lambda(t_N+\varepsilon)}\ge Y(1-e^{-\lambda\varepsilon}),\Omega_\infty\big)
\le P(Y\ge\sqrt N)+P\Big(\sup_{t>\frac{1}{2\lambda}\log(N)}\big(Y-e^{-\lambda t}Z_0(t)\big)\ge Y\big(1-e^{-\lambda\varepsilon}\big),\Omega_\infty\Big).
\]
We can then apply Lemma 6 to get
\[
\lim_{N\to\infty}P(\tau_N>t_N+\varepsilon,\Omega_\infty)=0,
\]
which concludes the proof.
6.7 Proof of Proposition 2
Proof. We use a similar argument to the proof of Theorem 1. First, we break the total number of mutations $M(t)$ into
\[
M(t)=M_+(t)-M_-(t),
\]
where $M_+(t)$ represents the total number of mutations generated up until time $t$, and $M_-(t)$ represents the number of mutations which belong to $M_+(t)$ but die out before time $t$. Both of these processes are increasing in time. The limit theorems for $M(t)$ will follow from limit theorems for $M_+(t)$ and $M_-(t)$. Because the arguments are almost identical, we will focus on the analysis of $M_+(t)$.
As in the proof of Theorem 1, we define the approximations
\[
\hat M_+(t):=\nu\int_0^tYe^{\lambda s}\,ds \tag{57}
\]
and
\[
\bar M_+(t):=\nu\int_0^tZ_0(s)\,ds, \tag{58}
\]
as well as the Riemann sum approximation
\[
\bar M_{+,\Delta}(t):=\nu\Delta\sum_{\ell=0}^{\lfloor t/\Delta\rfloor}Z_0(\ell\Delta). \tag{59}
\]
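To illustrate how the Riemann sum (59) approximates the integral (58) as $\Delta\to0$, here is a small numerical check with a deterministic stand-in path $Z_0(s)=e^{\lambda s}$ (toy parameter values; `round(t / delta)` mirrors $\lfloor t/\Delta\rfloor$ for these values, and the left-endpoint error shrinks linearly in $\Delta$):

```python
import math

lam, nu, t = 1.0, 0.1, 5.0
z0 = lambda s: math.exp(lam * s)            # deterministic stand-in for a path of Z0
exact = nu * (math.exp(lam * t) - 1) / lam  # nu * int_0^t e^{lam s} ds
for delta in (0.1, 0.01, 0.001):
    # left-endpoint sum as in (59), ell = 0, ..., floor(t / delta)
    riemann = nu * delta * sum(z0(l * delta) for l in range(round(t / delta) + 1))
    assert abs(riemann - exact) < nu * delta * math.exp(lam * t)
```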
Note that the only difference between (28) and (57) is the probability $p_{j,+}^k(t-s)$, which does not appear in (57). Therefore, we can simply follow the proofs of Lemmas 1 and 2 by replacing $S_{j,+}^k(t)$, $\hat S_{j,+}^k(t)$, $\bar S_{j,+}^k(t)$, $\bar S_{j,+,\Delta}^k(t)$ and $\theta$ with $M_+(t)$, $\hat M_+(t)$, $\bar M_+(t)$, $\bar M_{+,\Delta}(t)$ and $1$, respectively, and we will get
\[
E\big|M_+(t)-\hat M_+(t)\big|=O\big(te^{\lambda t/2}\big), \tag{60}
\]
which implies
\[
\int_0^\infty e^{-\lambda t}E\big|M_+(t)-\hat M_+(t)\big|\,dt<\infty. \tag{61}
\]
Note that $\lim_{t\to\infty}e^{-\lambda t}\hat M_+(t)=\nu Y/\lambda$ exists and $M_+(t)$ is an increasing process. By replacing the corresponding terms in the proof of Proposition 4, we can get
\[
\lim_{t\to\infty}e^{-\lambda t}M_+(t)=\nu Y\int_0^\infty e^{-\lambda s}\,ds=\nu Y/\lambda \tag{62}
\]
almost surely. Similarly,
\[
\lim_{t\to\infty}e^{-\lambda t}M_-(t)=\nu Y\int_0^\infty e^{-\lambda s}p_0(s)\,ds \tag{63}
\]
almost surely. The fixed-time result (9) follows immediately from (62) and (63).

Then, by following the proof in Section 4.5, we can get the fixed-size result (10) for the total number of mutations,
\[
\lim_{N\to\infty}N^{-1}M(\tau_N)=\nu\int_0^\infty e^{-\lambda s}(1-p_0(s))\,ds,
\]
in probability.
6.8 Proof of Corollary 1
Proof. (1) For the birth-death process, we can write
\[
p_0(t)=\frac{p(e^{\lambda t}-1)}{e^{\lambda t}-p},\qquad
p_j(t)=\frac{q^2e^{\lambda t}}{(e^{\lambda t}-p)^2}\left(\frac{e^{\lambda t}-1}{e^{\lambda t}-p}\right)^{j-1},\quad j\ge1, \tag{64}
\]
see expression (B.1) in [13]. Therefore, for $j\ge1$,
\[
\int_0^\infty e^{-\lambda s}p_j(s)\,ds=\frac{1}{\lambda}\int_0^\infty\frac{q^2e^{-\lambda s}}{(1-pe^{-\lambda s})^2}\left(\frac{1-e^{-\lambda s}}{1-pe^{-\lambda s}}\right)^{j-1}\lambda e^{-\lambda s}\,ds.
\]
Using the substitution $x:=e^{-\lambda s}$, $dx=-\lambda e^{-\lambda s}\,ds$, we obtain
\[
\int_0^\infty e^{-\lambda s}p_j(s)\,ds=\frac{q^2}{\lambda}\int_0^1\frac{x}{(1-px)^2}\left(\frac{1-x}{1-px}\right)^{j-1}dx.
\]
We again change variables, this time $y:=(1-x)/(1-px)$, in which case
\[
x=\frac{1-y}{1-py},\qquad dx=-\frac{q}{(1-py)^2}\,dy,\qquad 1-px=\frac{q}{1-py}.
\]
In addition, $y=1$ for $x=0$ and $y=0$ for $x=1$, which implies
\[
\int_0^\infty e^{-\lambda s}p_j(s)\,ds=\frac{q}{\lambda}\int_0^1(1-py)^{-1}(1-y)y^{j-1}\,dy. \tag{65}
\]
To get the sum representation in (13), it suffices to note that
\[
\int_0^1(1-py)^{-1}(1-y)y^{j-1}\,dy=\sum_{k=0}^\infty p^k\int_0^1(1-y)y^{j+k-1}\,dy=\sum_{k=0}^\infty\frac{p^k}{(j+k)(j+k+1)}.
\]
To get the pure-birth process result, it suffices to note that $p=0$, $q=1$ and
\[
\int_0^1(1-y)y^{j-1}\,dy=\frac{1}{j(j+1)}.
\]
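The formulas above lend themselves to a quick numerical sanity check (illustrative only; the parameter values are arbitrary, and a midpoint rule stands in for exact integration): the distribution (64) should sum to one with mean $e^{\lambda t}$, and the integral in (65) should match the series form in (13).

```python
import math

lam, p, t, j = 1.0, 0.3, 1.5, 3
q = 1 - p
e = math.exp(lam * t)
# (64): birth-death distribution at time t; check it is a pmf with mean e^{lam t}
p0 = p * (e - 1) / (e - p)
pj = lambda i: q**2 * e / (e - p)**2 * ((e - 1) / (e - p))**(i - 1)
assert abs(p0 + sum(pj(i) for i in range(1, 5000)) - 1.0) < 1e-10
assert abs(sum(i * pj(i) for i in range(1, 5000)) - e) < 1e-8

# (65)/(13): the integral over (0, 1) vs. the series over k
n = 200000
h = 1.0 / n
mids = [(k + 0.5) * h for k in range(n)]
integral = h * sum((1 - p * y)**-1 * (1 - y) * y**(j - 1) for y in mids)
series = sum(p**k / ((j + k) * (j + k + 1)) for k in range(200))
assert abs(integral - series) < 1e-8
# pure-birth case p = 0: the integral reduces to 1/(j(j+1))
assert abs(h * sum((1 - y) * y**(j - 1) for y in mids) - 1 / (j * (j + 1))) < 1e-8
```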
(2) Follows from the same calculations as in (1).
(3) By (64), for the birth-death process,
\[
1-p_0(t)=\frac{(1-p)e^{\lambda t}}{e^{\lambda t}-p}=\frac{qe^{\lambda t}}{e^{\lambda t}-p}.
\]
Therefore,
\[
\int_0^\infty e^{-\lambda s}(1-p_0(s))\,ds=\frac{1}{\lambda}\int_0^\infty\frac{q}{1-pe^{-\lambda s}}\,\lambda e^{-\lambda s}\,ds.
\]
Using the substitution $x:=e^{-\lambda s}$, $dx=-\lambda e^{-\lambda s}\,ds$, we obtain
\[
\int_0^\infty e^{-\lambda s}(1-p_0(s))\,ds=\frac{1}{\lambda}\int_0^1\frac{q}{1-px}\,dx=
\begin{cases}
\dfrac{1}{\lambda}, & p=0,\\[2mm]
-\dfrac{q\log(q)}{\lambda p}, & 0<p<1.
\end{cases}
\tag{66}
\]
(4) Follows from the same calculations as in (3).
6.9 Derivation of expression (19)
By writing $M_j(t)=M(t)-\sum_{k=0}^{j-1}S_k(t)$, it follows from Corollary 1 that conditional on $\Omega_\infty$,
\[
\lim_{t\to\infty}e^{-\lambda t}M_j(t)=\frac{\nu qY}{\lambda}\int_0^1(1-py)^{-1}(1-y)\sum_{k=j}^\infty y^{k-1}\,dy=\frac{\nu qY}{\lambda}\int_0^1(1-py)^{-1}y^{j-1}\,dy.
\]
Similarly,
\[
\lim_{N\to\infty}N^{-1}M_j(\tau_N)=\frac{\nu q}{\lambda}\int_0^1(1-py)^{-1}y^{j-1}\,dy.
\]
It follows that
\[
\lim_{t\to\infty}\frac{S_j(t)}{M_j(t)}=\lim_{N\to\infty}\frac{S_j(\tau_N)}{M_j(\tau_N)}=\frac{\int_0^1(1-py)^{-1}(1-y)y^{j-1}\,dy}{\int_0^1(1-py)^{-1}y^{j-1}\,dy}=1-\frac{\int_0^1(1-py)^{-1}y^j\,dy}{\int_0^1(1-py)^{-1}y^{j-1}\,dy}=:\varphi_j(p).
\]
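Both integrals in $\varphi_j(p)$ can also be expanded as series, $\int_0^1(1-py)^{-1}y^{j-1}\,dy=\sum_{k\ge0}p^k/(j+k)$, which gives a quick numerical cross-check of the two representations (toy values of $p$ and $j$; midpoint rule for the integrals):

```python
p, j = 0.6, 2
n = 200000
h = 1.0 / n
mids = [(k + 0.5) * h for k in range(n)]
num = h * sum((1 - p * y)**-1 * (1 - y) * y**(j - 1) for y in mids)
den = h * sum((1 - p * y)**-1 * y**(j - 1) for y in mids)
phi_integral = num / den
# series forms: int_0^1 (1 - p y)^{-1} y^{j-1} dy = sum_k p^k / (j + k)
phi_series = (sum(p**k / ((j + k) * (j + k + 1)) for k in range(400))
              / sum(p**k / (j + k) for k in range(400)))
assert abs(phi_integral - phi_series) < 1e-8
```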
6.10 Proof that ϕj(p)is strictly decreasing
Here, we show that for each $j\ge1$, $\varphi_j(p)$ given by the last expression in Section 6.9 is strictly decreasing in $p$. Set
\[
\begin{aligned}
a&:=\int_0^1(1-py)^{-2}y^{j+1}\,dy\int_0^1(1-py)^{-1}y^{j-1}\,dy,\\
b&:=\int_0^1(1-py)^{-2}y^j\,dy\int_0^1(1-py)^{-1}y^j\,dy.
\end{aligned}
\]
It suffices to show that $a>b$ for each $p\in(0,1)$: differentiating the ratio $\int_0^1(1-py)^{-1}y^j\,dy\big/\int_0^1(1-py)^{-1}y^{j-1}\,dy$ with respect to $p$ under the integral sign, the derivative has the sign of $a-b$, and $\varphi_j(p)$ is one minus this ratio. First, note that we can write
\[
a=\int_0^1\int_0^1(1-py)^{-2}y^{j+1}(1-px)^{-1}x^{j-1}\,dy\,dx
\]
and
\[
b=\int_0^1\int_0^1(1-py)^{-2}y^j(1-px)^{-1}x^j\,dy\,dx,
\]
which implies
\[
\begin{aligned}
a-b&=\int_0^1\int_0^1(1-py)^{-2}(1-px)^{-1}y^jx^{j-1}(y-x)\,dy\,dx\\
&=\int_0^1\int_0^x(1-py)^{-2}(1-px)^{-1}y^jx^{j-1}(y-x)\,dy\,dx\\
&\quad+\int_0^1\int_x^1(1-py)^{-2}(1-px)^{-1}y^jx^{j-1}(y-x)\,dy\,dx.
\end{aligned}
\]
The latter integral can be rewritten as follows:
\[
\begin{aligned}
\int_0^1\int_x^1(1-py)^{-2}(1-px)^{-1}y^jx^{j-1}(y-x)\,dy\,dx
&=\int_0^1\int_0^y(1-py)^{-2}(1-px)^{-1}y^jx^{j-1}(y-x)\,dx\,dy\\
&=-\int_0^1\int_0^x(1-px)^{-2}(1-py)^{-1}x^jy^{j-1}(y-x)\,dy\,dx,
\end{aligned}
\]
which implies
\[
a-b=\int_0^1\int_0^x(1-py)^{-1}(1-px)^{-1}y^{j-1}x^{j-1}(y-x)\big((1-py)^{-1}y-(1-px)^{-1}x\big)\,dy\,dx.
\]
Since
\[
\frac{y}{1-py}-\frac{x}{1-px}=\frac{y-x}{(1-py)(1-px)},
\]
we can finally conclude that
\[
a-b=\int_0^1\int_0^x(1-py)^{-2}(1-px)^{-2}y^{j-1}x^{j-1}(y-x)^2\,dy\,dx>0
\]
for each $p\in(0,1)$.
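The strict monotonicity just established can also be observed numerically on a grid, using the series representations of the two integrals from Section 6.9 (a sanity check under toy truncations, not a proof):

```python
def phi(j, p, kmax=2000):
    # phi_j(p) via the series forms of the two integrals in Section 6.9
    num = sum(p**k / ((j + k) * (j + k + 1)) for k in range(kmax))
    den = sum(p**k / (j + k) for k in range(kmax))
    return num / den

for j in (1, 2, 5):
    vals = [phi(j, i / 100) for i in range(96)]  # p = 0.00, 0.01, ..., 0.95
    # strictly decreasing in p, as shown above
    assert all(a > b for a, b in zip(vals, vals[1:]))
    assert abs(vals[0] - 1 / (j + 1)) < 1e-12    # phi_j(0) = 1/(j+1)
```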
6.11 Derivation of expression (36)
To derive expression (36) in the main text, we note that $(1-py)^{-1}=\sum_{k=0}^\infty(py)^k$ for $0<p<1$ and $0\le y\le1$, which implies
\[
\int_0^1(1-py)^{-1}(1-y)\,dy=\sum_{k=0}^\infty p^k\int_0^1y^k(1-y)\,dy=\sum_{k=0}^\infty\frac{p^k}{k+1}-\sum_{k=0}^\infty\frac{p^k}{k+2}.
\]
Since $\sum_{k=1}^\infty\frac{x^k}{k}=-\log(1-x)$, we obtain
\[
\int_0^1(1-py)^{-1}(1-y)\,dy=-\frac{\log(q)}{p}-\frac{1}{p^2}\big(-\log(q)-p\big)=\frac{q}{p^2}\log(q)+\frac{1}{p}.
\]
Therefore, applying expression (18), we can write for $0<p<1$,
\[
\varphi_1(p)=-\frac{p}{\log(q)}\int_0^1(1-py)^{-1}(1-y)\,dy=-\frac{p+q\log(q)}{p\log(q)}.
\]
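A numerical check of the closed-form expressions derived above (midpoint rule; the value of $p$ is an arbitrary toy choice):

```python
import math

p = 0.35
q = 1 - p
n = 200000
h = 1.0 / n
mids = [(k + 0.5) * h for k in range(n)]
integral = h * sum((1 - p * y)**-1 * (1 - y) for y in mids)
# int_0^1 (1 - p y)^{-1} (1 - y) dy = (q / p^2) log(q) + 1/p
assert abs(integral - (q / p**2 * math.log(q) + 1 / p)) < 1e-9
# int_0^1 (1 - p y)^{-1} dy = -log(q)/p, so phi_1(p) = integral / (-log(q)/p)
phi1 = integral / (-math.log(q) / p)
assert abs(phi1 - (-(p + q * math.log(q)) / (p * math.log(q)))) < 1e-8
```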
Acknowledgments
EBG was supported in part by NSF grant CMMI-1552764, NIH grant R01 CA241137,
funds from the Norwegian Centennial Chair grant and the Doctoral Dissertation Fellow-
ship from the University of Minnesota. K. Leder was supported in part with funds from
NSF award CMMI 2228034 and Research Council of Norway Grant 309273.
References
[1] K. Zeng, Y.-X. Fu, S. Shi, and C.-I. Wu, “Statistical tests for detecting positive
selection by utilizing high-frequency variants,” Genetics, vol. 174, no. 3, pp. 1431–
1439, 2006.
[2] G. Achaz, “Frequency spectrum neutrality tests: one for all and all for one,” Genetics,
vol. 183, no. 1, pp. 249–258, 2009.
[3] A. Sottoriva, H. Kang, Z. Ma, T. A. Graham, M. P. Salomon, J. Zhao, P. Marjoram,
K. Siegmund, M. F. Press, D. Shibata, et al., “A big bang model of human colorectal
tumor growth,” Nat. Genet., vol. 47, no. 3, pp. 209–216, 2015.
[4] S. Ling, Z. Hu, Z. Yang, F. Yang, Y. Li, P. Lin, K. Chen, L. Dong, L. Cao, Y. Tao,
et al., “Extremely high genetic diversity in a single tumor points to prevalence of non-Darwinian cell evolution,” Proc. Natl. Acad. Sci. USA, vol. 112, no. 47, pp. E6496–
E6505, 2015.
[5] M. J. Williams, B. Werner, C. P. Barnes, T. A. Graham, and A. Sottoriva, “Identi-
fication of neutral tumor evolution across cancer types,” Nat. Genet., vol. 48, no. 3,
p. 238, 2016.
[6] S. Venkatesan and C. Swanton, “Tumor evolutionary principles: how intratu-
mor heterogeneity influences cancer treatment and outcome,” Am. Soc. Clin. On-
col. Educ. Book, vol. 36, pp. e141–e149, 2016.
[7] A. Davis, R. Gao, and N. Navin, “Tumor evolution: Linear, branching, neutral or
punctuated?,” Biochim. Biophys. Acta Rev. Cancer, vol. 1867, no. 2, pp. 151–161,
2017.
[8] R. Durrett, “Population genetics of neutral mutations in exponentially growing can-
cer cell populations,” Ann. Appl. Probab., vol. 23, no. 1, p. 230, 2013.
[9] R. Durrett, “Branching process models of cancer,” in Branching Process Models of
Cancer, pp. 1–63, Springer, 2015.
[10] I. Bozic, J. M. Gerold, and M. A. Nowak, “Quantifying clonal and subclonal passenger
mutations in cancer evolution,” PLoS Comput. Biol., vol. 12, no. 2, p. e1004731, 2016.
[11] H. Ohtsuki and H. Innan, “Forward and backward evolutionary processes and allele
frequency spectrum in a cancer cell population,” Theor. Popul. Biol., vol. 117, pp. 43–
50, 2017.
[12] K. N. Dinh, R. Jaksik, M. Kimmel, A. Lambert, S. Tavar´e, et al., “Statistical in-
ference for the evolutionary history of cancer genomes,” Stat. Sci., vol. 35, no. 1,
pp. 129–144, 2020.
[13] E. B. Gunnarsson, K. Leder, and J. Foo, “Exact site frequency spectra of neutrally
evolving tumors: A transition between power laws reveals a signature of cell viabil-
ity,” Theoretical Population Biology, vol. 142, pp. 67–90, 2021.
[14] H.-R. Tung and R. Durrett, “Signatures of neutral evolution in exponentially growing
tumors: A theoretical perspective,” PLOS Computational Biology, vol. 17, no. 2,
p. e1008701, 2021.
[15] C. Bonnet and H. Leman, “Site frequency spectrum of a rescued population under
rare resistant mutations,” arXiv preprint arXiv:2303.04069, 2023.
[16] A. Lambert, “The allelic partition for coalescent point processes,” Markov Pro-
cess. Relat. Fields, vol. 15, no. 3, pp. 359–386, 2009.
[17] A. Lambert, “The coalescent of a sample from a binary branching process,” Theo-
retical population biology, vol. 122, pp. 30–35, 2018.
[18] S. G. Johnston, “The genealogy of Galton–Watson trees,” 2019.
[19] S. C. Harris, S. G. G. Johnston, and M. I. Roberts, “The coalescent structure of
continuous-time Galton–Watson trees,” The Annals of Applied Probability, vol. 30,
no. 3, pp. 1368–1414, 2020.
[20] B. Johnson, Y. Shuai, J. Schweinsberg, and K. Curtius, “Estimating single cell clonal
dynamics in human blood using coalescent theory,” bioRxiv, pp. 2023–02, 2023.
[21] J. Schweinsberg and Y. Shuai, “Asymptotics for the site frequency spectrum
associated with the genealogy of a birth and death process,” arXiv preprint
arXiv:2304.13851, 2023.
[22] R. Durrett, Probability models for DNA sequence evolution. Springer Science & Busi-
ness Media, 2008.
[23] D. Cheek and T. Antal, “Genetic composition of an exponentially growing cell pop-
ulation,” Stochastic Processes and their Applications, 2020.
[24] D. Cheek and T. Antal, “Mutation frequencies in a birth–death branching process,”
Ann. Appl. Probab., vol. 28, no. 6, pp. 3922–3947, 2018.
[25] T. E. Harris, The Theory of Branching Processes, 1964.
[26] K. B. Athreya and P. E. Ney, Branching processes. Courier Corporation, 2004.
[27] K. Athreya and P. Ney, Branching Processes. New York: Springer-Verlag, 1972.
[28] J. Foo, K. Leder, and J. Zhu, “Escape times for branching processes with random
mutational fitness effects,” Stochastic Processes and Their Applications, vol. 124,
no. 11, pp. 3661–3697, 2014.