Explicit and combined estimators for parameters
of stable distributions

Jacques Lévy Véhel¹, Anne Philippe², Caroline Robet²

¹ INRIA Rennes Bretagne Atlantique & Case Law Analytics
² Université de Nantes, Laboratoire de Mathématiques Jean Leray, UMR CNRS 6629
Abstract: This article focuses on the estimation of the stability index and scale parameter of
stable random variables. While there is a sizable literature on this topic, no precise theoretical
results seem available. We study an estimator based on log-moments, which always exist for
such random variables. The main advantage of this estimator is that it has a simple closed
form expression. This allows us to prove an almost sure convergence result as well as a central
limit theorem. We show how to improve the accuracy of this estimator by combining it with
previously defined ones. The closed form also enables us to consider the case of non-identically
distributed data, and we show that our results still hold provided deviations from stationarity are
"small". Using a centro-symmetrization, we expand the previous estimators to skewed stable
variables and we construct a test to check the skewness of the data. As applications, we show
numerically that the stability index of multistable Lévy motion may be estimated accurately and
consider a financial log, namely the S&P 500, where we find that the stability index evolves in
time in a way that reflects major financial events.
Keywords: averaging estimates; misspecified model; moment estimate; Monte Carlo approximation; stable distribution
1 Introduction
The class of α-stable distributions is ubiquitous in probability: such distributions appear as the
limit of normalized sums of independent and identically distributed random variables.
A random variable X is said to have an α-stable distribution with α ∈ (0, 2] if, for any n ≥ 2,
there is a real number D_n such that n^{1/α} X + D_n has the same distribution as X_1 + ··· + X_n,
the sum of n independent copies of X (see Samorodnitsky and Taqqu (1994) for equivalent definitions
and properties). This probability distribution admits a continuous probability density, which is
not known in closed form, except for the Gaussian, Cauchy and Lévy distributions and for
constants. Non-Gaussian stable distributions are a model of choice for real world
phenomena exhibiting jumps. Indeed, for α < 2, their densities exhibit "heavy tails", resulting
in a power-law decay of the probability of extreme events. They have been used extensively
in recent years for modeling in domains such as biomedicine (see Salas-Gonzalez et al., 2013),
geophysics (see Yang et al., 2009), economy and finance (see Mandelbrot, 1997), Internet traffic
(see Dimitriadis et al., 2011) and more. A stable distribution is characterized by four parameters:

- a stability parameter, denoted α ∈ (0, 2]. The value α = 2 corresponds to the Gaussian
  distribution. For a non-Gaussian stable distribution, α ∈ (0, 2), and it governs the heaviness
  of the tail: the density decreases as the power function |x|^{−α−1} when |x| tends to infinity
  (see (6.1) in Appendix or Property 1.2.15 in Samorodnitsky and Taqqu (1994) for details);
- a scale parameter, usually denoted σ (proportional to the standard deviation in the
  Gaussian case);
- a location parameter µ, similar to the mean in the case of Gaussian distributions;
- a skewness parameter β, ranging in [−1, 1].
Hereafter, we write X ∼ Sα(σ, β, µ) to indicate that X has a stable distribution. The characteristic function φ is given by (see Samorodnitsky and Taqqu, 1994, for details):

    φ(t) = exp( −σ^α |t|^α (1 − iβ sign(t) tan(πα/2)) + iµt ),            if α ≠ 1,
    φ(t) = exp( −σ |t| (1 + iβ sign(t) (2/π) log |t|) + iµt ),            if α = 1,

where sign(t) = 1 if t > 0, 0 if t = 0, and −1 if t < 0.
Our main aim in this work is to estimate these parameters. This is an important, and non-trivial,
step in using stable distributions to model real world phenomena. The maximum likelihood
estimate (MLE), which is the natural estimate for a parametric problem, is difficult to use
(except in a few cases, for example the Gaussian and Cauchy distributions). Bergström (1952)
gives a series representation of the density function. From this representation, DuMouchel (1973)
establishes the asymptotic theory of the MLE. Under conditions ensuring the existence of the
MLE, he proves its consistency and asymptotic normality. See also DuMouchel (1975) for results
on the Fisher information. Despite this optimality property, the MLE remains difficult to
calculate: in practice, finding it requires numerical approximations. Different methods have been
proposed, for instance through Fourier inversion (see Nolan (2001)). However, such procedures
entail approximation errors that cannot be easily assessed.
Another difficulty is that, except in the Gaussian case, stable random variables have infinite
moments of order greater than or equal to the stability index. More precisely, if X is an
Sα(σ, β, µ) stable random variable with 0 < α < 2, then E[|X|^p] < ∞ if and only if 0 < p < α
(see Samorodnitsky and Taqqu, 1994, Prop 1.2.16). This property implies that non-Gaussian
stable random variables do not possess a finite variance, nor, in some cases, a well-defined mean.
Therefore, the standard method of moments cannot be used.
A number of estimators are in common use, such as the ones proposed by Fama and Roll
(1971), McCulloch (1986) and Koutrouvelis (1980, 1981). A difficulty with these estimators is
that they do not possess a simple closed form expression. As a consequence, and to the best of
our knowledge, no theoretical results are known about them, such as almost sure convergence
and central limit theorems. Their asymptotic distributions, as well as asymptotic variances, are
thus only accessible through numerical simulations. Another drawback of not having explicit
and simple closed forms is that it is difficult to assess their performance theoretically in
situations that depart slightly from the classical assumption of independent and identically
distributed samples. This is nevertheless desirable when one wishes to deal with real world data,
which will often not satisfy these ideal hypotheses.
The parameter α can also be interpreted as a tail index. Indeed, the asymptotic tail behavior
of the stable distribution is Pareto when α ≠ 2, i.e. there exists C > 0 such that
P(X > x) ∼ Cx^{−α} as x tends to infinity (see Samorodnitsky and Taqqu (1994)). Therefore,
all estimators of the index of regularly varying distributions can be applied to α. See for
instance Hill (1975); Hall (1982); De Haan (2006); Resnick (2007) for a review, and
McCulloch (1997); DuMouchel (1983); Fofack and Nolan (1999) for applications to stable
distributions.
The construction of estimators for the parameters of stable distributions is also related to more
recent studies on Lévy processes. An important challenge there is to characterize the jump
activity. In the more general context of semi-martingales, Aït-Sahalia and Jacod (2009) and
Jing et al. (2012) propose estimates of the jump activity index based on discrete high-frequency
observations. For a Lévy process, this jump activity index corresponds to the Blumenthal–Getoor
index, which can be estimated in different ways (see for instance Belomestny (2010) for a
spectral approach). In the particular case of a stable Lévy process, this index is just the
stability index α.

Recently, Falconer and Lévy Véhel (2018a,b) constructed a new class of processes called
self-stabilizing processes, for which the stability index at time t depends on the value of the
process at time t: for a self-stabilizing process (Z_t)_{t∈R+}, the limit distribution after
scaling around t is α(Z(t))-stable. The estimation of the function α is a difficult issue that
requires an easy-to-calculate estimator with good properties for small samples.
Our main aim in this work is to investigate the theoretical properties of a generalized method
of moments with log-moments. This idea is not new, as it has long been remarked that log-
moments always exist for stable random variables and that it is convenient to work with them.
Ma and Nikias (1995) consider the same estimator as the one we study in the symmetric case.
They apply it to blind channel identification while Wang et al. (2015) use this estimator for
α-stable noise in a laser gyroscope’s random error. Kuruoglu (2001) considers the general
case with four parameters based on a symmetrization of the observations. In these articles, the
asymptotic properties are not addressed. Owing to its simple expression, we are able to prove
almost sure convergence and a central limit theorem, both in an independent and identically
distributed framework and in a case of slight deviation from stationarity. We compare the performance
of our estimator with the Koutrouvelis regression method (see Koutrouvelis, 1980, 1981). The
results depend on the value of αand on the size of the sample. We then combine these two
estimators using a technique recently developed in Lavancier and Rochet (2016) to enhance
their performance, especially in the case of small samples. As applications, we show numerical
experiments both on synthetic data (symmetric Lévy multistable motion) and on a financial
log (S&P 500), which confirm our theoretical results that the estimator is able to track smooth
enough variations of the stability index in time.
In Section 2, we study estimators of α and σ for symmetric (that is, when µ = β = 0) stable
random variables: the log-moment estimators and a combined estimator built from the Koutrouvelis
one. In Section 3, we expand the log-moment, Koutrouvelis and combined estimators to the
skewed case, studying two ways of adapting the log-moment estimator. The properties of the
log-moment estimators also allow us to propose a method for testing the skewness of the data.
In Section 4, we investigate the case of non-identically distributed observations, and we prove
robustness of the log-moment estimators under some conditions on the perturbations. In
Section 5, we perform numerical experiments involving multistable Lévy motion and real data,
with the study of a financial index.
2 Estimation methods
After the theoretical study of the log-moment estimate, we apply the procedure described in
Lavancier and Rochet (2016) to provide a combined estimator for the parameters α ∈ (0, 2) and
σ ∈ (0, +∞).
2.1 Symmetric case for log-moments
For a symmetric stable distribution, closed form expressions are available for absolute
log-moments (Le Guével, 2013), which allow one to derive expressions for estimating α and σ.
First, note the following property:

Proposition 2.1. Let Z ∼ Sα(1, 0, 0) with 0 < α < 2. We have E[|log |Z||^p] < ∞ for all
p > 0.
Proof. See Appendix.
These expectations may be computed explicitly by remarking that

    E[(log |Z|)^p] = d^p E[|Z|^t] / dt^p |_{t=0},                                (2.1)

and by using the following result:

Proposition 2.2. Let Z ∼ Sα(1, 0, 0) with 0 < α < 2. For all 0 < t < min(α, 1), we have

    E[|Z|^t] = Γ(1 − t/α) / ( Γ(1 − t) cos(πt/2) ).                              (2.2)

Proof. See Appendix.

We deduce that E[log |Z|] = (1/α − 1)γ and Var(log |Z|) = π²/(6α²) + π²/12, where γ is the
Euler constant.
Theorem 2.3. Let (X_1, ..., X_n) be a sequence of independent and identically distributed
standard symmetric stable random variables Sα(1, 0, 0) with 0 < α < 2. Define

    α̂_n(X_1, ..., X_n) = γ / ( γ + (1/n) Σ_{i=1}^n log |X_i| ).

Then α̂_n → α almost surely when n → +∞. Moreover, with f(x) = π²/(6x²) + π²/12,

    √n (α̂_n − α) γ / ( α̂_n² √(f(α̂_n)) )  →d  N(0, 1).                         (2.3)

Proof. The proof of this result may be found in Appendix.
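As a quick illustration, the estimator of Theorem 2.3 is a one-liner. The sketch below (plain Python; the Cauchy sampler, seed and sample size are our own choices, not from the paper) checks it on a standard Cauchy sample, i.e. S_1(1, 0, 0), for which E[log |X|] = 0 and the estimator should therefore return a value close to α = 1.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler's constant gamma

def alpha_hat(xs):
    """Log-moment estimator of Theorem 2.3 (unit scale, symmetric case):
    alpha_hat = gamma / (gamma + (1/n) sum_i log|X_i|)."""
    m = sum(math.log(abs(x)) for x in xs) / len(xs)
    return EULER_GAMMA / (EULER_GAMMA + m)

# Standard Cauchy variables are S_1(1, 0, 0); simulate by inverse transform.
rng = random.Random(0)
sample = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(100_000)]
alpha_est = alpha_hat(sample)
```

By the central limit theorem (2.3), the standard deviation of this estimate is roughly α̂² √f(α̂) / (γ √n), which is about 0.009 for this sample size, so the returned value should sit very close to 1.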
In general, σ is unknown and we must estimate both parameters jointly. Let W be a stable
variable with distribution Sα(σ, 0, 0). Taking Z = W/σ, we have Z ∼ Sα(1, 0, 0) and we deduce
the log-moments of W from those of Z. We get

    E[log |W|] = (1/α − 1)γ + log σ

and

    Var(log |W|) = π²/(6α²) + π²/12.                                             (2.4)
Theorem 2.4. Let (X_1, ..., X_n) be a sequence of independent and identically distributed
symmetric stable random variables Sα(σ, 0, 0) with 0 < α < 2 and σ > 0. Define the estimators
α̂_LOG^(n) = α̂_LOG^(n)(X_1, ..., X_n) and σ̂_LOG^(n) = σ̂_LOG^(n)(X_1, ..., X_n) by

    α̂_LOG^(n) = [ max( (6/(π²n)) Σ_{i=1}^n ( log |X_i| − (1/n) Σ_{k=1}^n log |X_k| )² − 1/2 , 1/4 ) ]^{−1/2},

    σ̂_LOG^(n) = exp( (1/n) Σ_{i=1}^n log |X_i| − (1/α̂_LOG^(n) − 1) γ ).        (2.5)

Then

    (σ̂_LOG^(n), α̂_LOG^(n)) → (σ, α) almost surely when n → +∞.                 (2.6)

Moreover,

    √n [ (σ̂_LOG^(n), α̂_LOG^(n))ᵀ − (σ, α)ᵀ ]  →d  N(0, F_{α,σ} G_{α,σ} Σ_{α,σ} G_{α,σ}ᵀ F_{α,σ}ᵀ),   (2.7)

where

    F_{α,σ} = [ σ   γσ/α² ]        G_{α,σ} = [ 1                               0        ]
              [ 0     1   ],                 [ 6((1/α − 1)γ + log σ)α³/π²   −3α³/π²    ],   (2.8)

    Σ_{α,σ} = [ Var(log |X_1|)                     Cov(log |X_1|, (log |X_1|)²) ]
              [ Cov(log |X_1|, (log |X_1|)²)       Var((log |X_1|)²)            ].          (2.9)

Proof. The proof of this result may be found in the appendix.
For the parameter of interest α, we have an explicit form of the limiting distribution, and its
variance can be consistently estimated.

Corollary 2.5. Under the same assumptions as Theorem 2.4,

1. the asymptotic variance in (2.7) for α̂_LOG^(n) only depends on α and is equal to

       τ²_α := (9/π⁴) · (µ4 − µ2²) / |6µ2/π² − 1/2|³,

   where µ2, µ4 are the central moments of log |X_1| (see (2.4) and (6.2)):

       µ2 = π²/(6α²) + π²/12   and   µ4 = π⁴ ( 3/(20α⁴) + 1/(12α²) + 19/240 ).

2. Moreover, we have

       T_n = (√n / τ_n) (α̂_LOG^(n) − α)  →d  N(0, 1),                           (2.10)

   where τ²_n = τ²_{α̂_LOG^(n)} is the plug-in estimate of τ²_α.

Proof. This is an immediate consequence of Theorem 2.4.
The interest of (2.10) is to provide asymptotic confidence intervals for the parameter α. It is
also possible to apply this result to test null hypotheses of the form α ∈ A, where A is a
subset of (0, 2). For example, the choice A = [1, 2) allows one to test the existence of the first
moment. The critical region of the test is of the form {α̂_LOG^(n) < 1 + (τ_n/√n) q_w}, where
q_w is the standard Gaussian quantile of order w. This test has asymptotic significance level w
and is consistent under the alternative hypothesis "X_1 is not integrable".
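To make the formulas concrete, here is a minimal sketch (plain Python; the function names and the scaled-Cauchy test case are ours, not the paper's) of the joint estimators (2.5) and of the asymptotic confidence interval derived from (2.10), with the plug-in variance τ² = 9α̂⁶(µ̂4 − µ̂2²)/π⁴ obtained by the delta method.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329

def log_moment_estimators(xs):
    """Joint log-moment estimators (2.5) for a symmetric stable sample."""
    n = len(xs)
    logs = [math.log(abs(x)) for x in xs]
    m = sum(logs) / n                          # empirical mean of log|X_i|
    v = sum((l - m) ** 2 for l in logs) / n    # empirical variance of log|X_i|
    alpha = max(6.0 * v / math.pi ** 2 - 0.5, 0.25) ** -0.5  # capped so alpha <= 2
    sigma = math.exp(m - (1.0 / alpha - 1.0) * EULER_GAMMA)
    return alpha, sigma

def alpha_confidence_interval(xs, q=1.959964):
    """Asymptotic 95% confidence interval for alpha, based on (2.10)."""
    n = len(xs)
    a, _ = log_moment_estimators(xs)
    mu2 = math.pi ** 2 * (1.0 / (6 * a ** 2) + 1.0 / 12)
    mu4 = math.pi ** 4 * (3.0 / (20 * a ** 4) + 1.0 / (12 * a ** 2) + 19.0 / 240)
    tau = math.sqrt(9.0 * a ** 6 / math.pi ** 4 * (mu4 - mu2 ** 2))
    return a - q * tau / math.sqrt(n), a + q * tau / math.sqrt(n)

# Check on a scaled Cauchy sample, i.e. S_1(2, 0, 0).
rng = random.Random(1)
sample = [2.0 * math.tan(math.pi * (rng.random() - 0.5)) for _ in range(100_000)]
alpha_est, sigma_est = log_moment_estimators(sample)
ci_low, ci_high = alpha_confidence_interval(sample)
```

For a Cauchy sample, E[log |X|] = log σ and Var(log |X|) = π²/4, so the estimates should be close to (α, σ) = (1, 2), with a confidence interval of width roughly 0.02 at this sample size.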
2.2 Combined estimator
A way to improve the performance of the log-moment estimate is to aggregate different
estimates. We want to construct an estimator of α which is, for each α, at least as good as the
best individual estimator, even for small samples. We build a new estimator for the parameters
α and σ using a combined estimator whose general construction is described in Lavancier and
Rochet (2016). In our special case, θ = (α, σ)ᵀ is the parameter to estimate, and we have
access to p estimators of α and q estimators of σ.

Let α̂^(p) (resp. σ̂^(q)) be the collection of p (resp. q) estimates of α (resp. σ). We consider
averaging estimators of θ of the form

    θ̂_λ = λᵀ (α̂^(p), σ̂^(q))ᵀ,   λ ∈ Λ,                                        (2.11)

where λᵀ denotes the transpose of λ and Λ ⊆ R^{(p+q)×2} is a subset of (p+q)×2 matrices.

A convenient way to measure the performance of θ̂_λ is to compare it to θ̂*, defined as the
best linear combination θ̂_λ obtained for a non-random matrix λ ∈ Λ. Specifically, θ̂* is the
linear combination λ*ᵀ (α̂^(p), σ̂^(q))ᵀ minimizing the mean squared error (MSE), i.e.

    λ* = argmin_{λ∈Λ} E[ ‖θ̂_λ − θ‖² ].

Clearly, the larger the set Λ, the better the oracle θ̂*. However, choosing the whole space
Λ = R^{(p+q)×2} is generally not exploitable. We must impose some conditions on the set Λ in
order to have an explicit form for λ*.

Define J = [ 1_p  0_p ; 0_q  1_q ], where 1_k is the vector composed of k ones and 0_k is the
vector composed of k zeros. We consider the maximal constraint set

    Λ_max = { λ ∈ R^{(p+q)×2} : λᵀ J = I_2 }

with I_2 the identity matrix. The mean squared error E[‖θ̂_λ − θ‖²] is minimized on the set
Λ_max at the unique solution λ* = Σ^{−1} J (Jᵀ Σ^{−1} J)^{−1}, where Σ is the Gram matrix

    Σ = E[ ( (α̂^(p) − α)ᵀ, (σ̂^(q) − σ)ᵀ )ᵀ ( (α̂^(p) − α)ᵀ, (σ̂^(q) − σ)ᵀ ) ].   (2.12)

This result is proved in Lavancier and Rochet (2016) (page 178). Since the matrix Σ is unknown,
the averaging estimator θ̂_max is obtained by replacing Σ by an estimate Σ̂:

    λ̂_max = Σ̂^{−1} J (Jᵀ Σ̂^{−1} J)^{−1},   θ̂_max = λ̂_maxᵀ (α̂^(p), σ̂^(q))ᵀ.   (2.13)

Different strategies are described in Lavancier and Rochet (2016) to estimate Σ, depending on
the information available on the combined estimates.
In Section 2.3, we combine the log-moment estimate with the well-known Koutrouvelis
estimator. In the absence of an explicit or asymptotic form for the variance of the Koutrouvelis
estimator, we estimate Σ using a parametric bootstrap.
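The oracle weights have the closed form λ* = Σ⁻¹J(JᵀΣ⁻¹J)⁻¹ given above. The sketch below (plain Python; the Gram matrix values are made up for illustration, with p = 2 estimators of α and q = 1 estimator of σ, as used later in Section 2.3) computes these weights and verifies the constraint λᵀJ = I₂, which forces the two α-weights to sum to one and gives the σ estimator zero weight in the α combination.

```python
def mat_inv(A):
    """Invert a small matrix by Gauss-Jordan elimination (sufficient here)."""
    n = len(A)
    M = [row[:] + [1.0 if i == j else 0.0 for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

# Toy Gram matrix for the errors (alpha1_hat - alpha, alpha2_hat - alpha, sigma_hat - sigma).
Sigma = [[0.040, 0.012, 0.002],
         [0.012, 0.025, 0.001],
         [0.002, 0.001, 0.090]]
J = [[1.0, 0.0],
     [1.0, 0.0],
     [0.0, 1.0]]

Si = mat_inv(Sigma)
JtSiJ = mat_mul(transpose(J), mat_mul(Si, J))
lam = mat_mul(Si, mat_mul(J, mat_inv(JtSiJ)))  # lambda* = Sigma^-1 J (J^T Sigma^-1 J)^-1
constraint = mat_mul(transpose(lam), J)        # should equal the 2x2 identity
```

The constraint holds exactly by construction: λᵀJ = (JᵀΣ⁻¹J)⁻¹(JᵀΣ⁻¹J) = I₂ since Σ is symmetric.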
2.3 Numerical performance of the individual and combined estimators
In this section, we provide a numerical comparison of the log-moment estimate and the
Koutrouvelis estimate (see Koutrouvelis, 1980, 1981). The choice of this particular estimator is
motivated by the results reported in Weron (1995), showing that it usually performs better than
other methods, such as the Fama-Roll and McCulloch ones.

Then, to improve the performance of the Koutrouvelis and log-moment estimators, we
implement the aggregation method introduced in Section 2.2.
Definition of the Koutrouvelis estimate. The Koutrouvelis (1980, 1981) estimator exploits the
explicit expression of the iterated logarithm of the characteristic function φ. In the symmetric
case, it takes the particularly simple form

    log( −log( |φ(t)|² ) ) = log(2σ^α) + α log |t|.                              (2.14)

The empirical characteristic function φ̂_n(t) = (1/n) Σ_{j=1}^n e^{itX_j}, based on i.i.d.
observations (X_j), is a consistent estimator of φ. We estimate the parameters by regressing
y = log(−log(|φ̂_n(t)|²)) on w = log |t| in the model y_k = m + α w_k + ε_k, where
m = log(2σ^α), t_k = πk/25 for k ∈ {1, ..., K}, with K depending on the parameter α and on
the sample size, and ε_k denotes an error term. In our simulations, we use a simpler version of
the Koutrouvelis regression method, better adapted to the symmetric case (see Weron, 1995).
We describe the algorithm for an observed sample of size n:
Stopping parameters. Fix the admissible error tol and the maximum number of iterations
itermax performed if the admissible error is not reached. In all simulations, we take tol = 0.05
and itermax = 10.

Initialization. A regression applied to the McCulloch (1986) quantile method provides initial
estimates α̂ and σ̂. We fix a first estimate ŝ of the scale parameter of the scaled sample
(X_j/σ̂)_{j∈{1,...,n}} at the deterministic value ŝ = 2.
Recursive loop. While the number of iterations is less than itermax and |ŝ − 1| > tol:

- find the number K of points in the regression, depending on α̂, as in the classical
  Koutrouvelis regression;
- define w = (w_k)_{k∈{1,...,K}} and y = (y_k)_{k∈{1,...,K}} by

      w_k = log |t_k|   and   y_k = log( −log( |φ̂_n(t_k/σ̂)|² ) ),

  where t_k = πk/25 for k ∈ {1, ..., K};
- compute w̄ and ȳ, the empirical means of the samples (w_k)_{k∈{1,...,K}} and (y_k)_{k∈{1,...,K}};
- compute the new α̂, given by

      α̂ = min( Σ_{k=1}^K (w_k − w̄)(y_k − ȳ) / Σ_{k=1}^K (w_k − w̄)² , 2 );

- set the new ŝ: ŝ = exp( (ȳ − α̂ w̄ − log 2) / α̂ );
- set the new σ̂: σ̂ = σ̂ ŝ.
This modified version of the Koutrouvelis estimator achieves performance (in terms of mean
squared error) similar to the original. However, it is much faster, because it does not require
estimating the parameters β and µ, which necessitates the numerical inversion of matrices of
size n × n.
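For intuition, a single (non-iterated) pass of the regression (2.14) can be sketched as follows (plain Python; the choice K = 10, the sample size and the Cauchy test case are ours, and the McCulloch initialization and the rescaling loop are omitted). For a standard Cauchy sample, |φ(t)|² = e^{−2|t|}, so the regression slope should be close to α = 1 and the intercept close to log 2.

```python
import math
import random

def ecf_sq_modulus(xs, t):
    """Squared modulus of the empirical characteristic function at t."""
    n = len(xs)
    re = sum(math.cos(t * x) for x in xs) / n
    im = sum(math.sin(t * x) for x in xs) / n
    return re * re + im * im

rng = random.Random(2)
sample = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(50_000)]  # S_1(1,0,0)

K = 10
ts = [math.pi * k / 25 for k in range(1, K + 1)]
w = [math.log(t) for t in ts]
y = [math.log(-math.log(ecf_sq_modulus(sample, t))) for t in ts]

# Least-squares slope and, from the intercept, the scale (m = log(2 sigma^alpha)).
w_bar, y_bar = sum(w) / K, sum(y) / K
slope = sum((wi - w_bar) * (yi - y_bar) for wi, yi in zip(w, y)) \
        / sum((wi - w_bar) ** 2 for wi in w)
alpha_hat = min(slope, 2.0)
sigma_hat = math.exp((y_bar - alpha_hat * w_bar - math.log(2.0)) / alpha_hat)
```

The full method iterates this step on the rescaled sample until the scale correction ŝ is within tol of 1.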
Remark 1. As already pointed out by Weron (1995), simulation studies show that the recursive
scheme stabilizes very quickly. With high probability, the algorithm stops before itermax
iterations, and so the admissible error tol is reached. Moreover, we do not observe a significant
effect on the estimates when we reduce the value of tol. To support these remarks, we ran the
following numerical experiment. We simulate 10⁵ independent copies of stable samples with
parameters (α, σ, β, µ) = (1.8, 1, 0, 0) and sample size n = 500. The Koutrouvelis algorithm is
executed for two values of the admissible error, tol = 0.01 and 0.05. For all replications, the
algorithm stops before reaching the number of iterations itermax. Table 1 gives the estimated
probability distribution of the number of iterations. When the tol parameter decreases, the
average computing time increases without any significant improvement in approximation
quality. The mean squared error for the α estimate is 4.3×10⁻³ (with a precision of 10⁻⁴) for
both values tol = 0.01, 0.05. Moreover, the L²-norm of the difference between the estimations is
of the order of 10⁻⁵.
Nb of iterations      1      2      3     > 3
tol = 0.01           25%    74%     1%     0%
tol = 0.05           88%    12%     0%     0%

Table 1: Estimated probability distributions of the number of iterations of the Koutrouvelis
algorithm. Estimations are done on r = 100000 independent copies. The samples are simulated
with the following parameters: n = 500, α = 1.8, σ = 1 and β = µ = 0.
Comparison of individual estimates. For each pair of values (α, σ), r independent samples of
size n of independent stable random variables are generated. The empirical mean squared errors
of the sampling distributions of α̂ and σ̂ are given by

    MSE_α = (1/r) Σ_{i=1}^r (α̂_i − α)²,   MSE_σ = (1/r) Σ_{i=1}^r (σ̂_i − σ)²,

where α̂ (resp. σ̂) is an estimator of α (resp. σ).
In the sequel, we use the abbreviations "KOUT" and "LOG" to refer respectively to the
Koutrouvelis and log-moment estimators. For each α, the behaviors of α̂_KOUT and α̂_LOG are
similar for all values of σ (Table 2), whereas, for each value of σ, σ̂_KOUT and σ̂_LOG improve
as α increases (Table 3). Besides, when α is fixed, σ̂_KOUT and σ̂_LOG behave similarly for
all σ.

With a simulation study, we compare the empirical mean squared errors of α̂ and σ̂ for the
methods introduced earlier. Tables 2 and 3 show that the log-moment estimator performs better
than the Koutrouvelis one when α < 1, while the converse is true for α > 1, with the difference
in performance increasing for extreme values of α.
Combined estimator. The Koutrouvelis regression estimator and the log-moment estimator are
complementary, in the sense that the Koutrouvelis regression estimator is preferable when
α > 1, whereas the log-moment estimate becomes better when α < 1. Applying the method
described in Section 2.2 with these estimators, we hope to get an estimate which will be at least
              α = 0.2    α = 0.6    α = 1      α = 1.4    α = 1.8
σ = 10   LOG  9.06e-05   9.06e-04   4.67e-03   1.97e-02   3.10e-02
         KOUT 4.70e-04   2.35e-03   3.75e-03   8.20e-03   4.17e-03
σ = 1    LOG  8.07e-05   1.06e-03   4.47e-03   2.20e-02   3.19e-02
         KOUT 4.27e-04   2.11e-03   3.91e-03   7.58e-03   4.28e-03
σ = 0.1  LOG  8.93e-05   9.59e-04   4.43e-03   2.20e-02   2.97e-02
         KOUT 4.66e-04   1.96e-03   4.22e-03   7.58e-03   4.29e-03

Table 2: Mean squared error for α̂_LOG and α̂_KOUT (r = 500 and n = 500).
              α = 0.2    α = 0.6    α = 1      α = 1.4    α = 1.8
σ = 10   LOG  7.10       9.86e-01   6.08e-01   6.07e-01   5.43e-01
         KOUT 11.5       9.92e-01   4.82e-01   3.56e-01   1.72e-01
σ = 1    LOG  8.20e-02   9.45e-03   6.52e-03   6.98e-03   5.77e-03
         KOUT 1.18e-01   9.82e-03   4.44e-03   3.83e-03   1.56e-03
σ = 0.1  LOG  6.73e-04   8.77e-05   6.71e-05   6.54e-05   5.89e-05
         KOUT 8.63e-04   8.60e-05   4.75e-05   3.35e-05   1.82e-05

Table 3: Mean squared error for σ̂_LOG and σ̂_KOUT (r = 500 and n = 500).
as good as the best estimator, for each α. The estimate σ̂_KOUT is better than σ̂_LOG, except
for small values of α where σ̂_LOG slightly outperforms it. Therefore we use only σ̂_KOUT in
the combination. We will see later that, for technical reasons, it is not relevant to include too
many estimators in the combination.
We consider a combined estimate of the form

    λ̂ᵀ (α̂_KOUT, α̂_LOG, σ̂_KOUT)ᵀ,

where

    λ̂ = Σ̂^{−1} J (Jᵀ Σ̂^{−1} J)^{−1},   J = [ 1 0 ; 1 0 ; 0 1 ],

and where Σ̂ is the parametric bootstrap estimate of the Gram matrix. Note that the parametric
approaches proposed by Lavancier and Rochet (2016) cannot be applied in our context, since no
explicit or asymptotic form for the variance of the Koutrouvelis estimator is known. The
bootstrap procedure is the following. We compute a first estimation of the parameters by

    α̂_0 = (α̂_KOUT + α̂_LOG) / 2,   σ̂_0 = σ̂_KOUT.

We simulate B samples of size n of a symmetric stable distribution with parameters α̂_0 and
σ̂_0. Then, the three estimators are computed, which gives α̂_KOUT^(b), α̂_LOG^(b) and
σ̂_KOUT^(b) for b = 1, ..., B, and the matrix Σ is estimated by the empirical covariance matrix
of the sample

    ( α̂_KOUT^(b), α̂_LOG^(b), σ̂_KOUT^(b) )_{b=1,...,B}.

The consistency of the parameter estimates justifies this approach (see Theorem 2.4 and
Koutrouvelis (1980)). Moreover, by construction, Σ̂ is a positive definite matrix, so its
inversion is always possible.
Experiments show that the errors entailed by the estimation of Σ are negligible compared to the
advantage of having several estimators for small samples. Note that similar estimators could be
built by combining more than two estimators of α: for example, it would be possible to add the
McCulloch quantile estimator, and we could also add σ̂_LOG for σ. However, this would
increase the size of the covariance matrix, whose estimation would be worse, and entail the risk
of constructing
Figure 1: Average weight of the log-moment estimator α̂_LOG in the combined estimator, as a
function of α. The upper (resp. lower) bound of the interval corresponds to the 95% (resp. 5%)
quantile over r = 500 replications of the combination, with n = 100 and B = 1000.
α        0.2       0.3       0.4       0.5       0.6       0.7
KOUT   2.4e-03   2.5e-03   4.6e-03   8.3e-03   1.3e-02   1.5e-02
LOG    5.0e-04   1.2e-03   2.0e-03   3.4e-03   5.6e-03   7.6e-03
COMB   4.0e-04   8.9e-04   1.6e-03   2.9e-03   5.2e-03   6.8e-03

α        0.8       0.9       1         1.1       1.2       1.3
KOUT   1.4e-02   1.5e-02   1.8e-02   1.9e-02   2.8e-02   3.4e-02
LOG    1.2e-02   2.0e-02   3.3e-02   4.6e-02   5.9e-02   7.8e-02
COMB   8.3e-03   1.1e-02   1.5e-02   1.8e-02   2.6e-02   3.1e-02

α        1.4       1.5       1.6       1.7       1.8       1.9
KOUT   4.0e-02   4.1e-02   4.1e-02   2.8e-02   1.9e-02   1.1e-02
LOG    8.5e-02   8.8e-02   9.9e-02   8.7e-02   7.4e-02   7.6e-02
COMB   3.3e-02   3.4e-02   3.5e-02   2.5e-02   1.9e-02   1.1e-02

Table 4: Mean squared errors for the Koutrouvelis regression (KOUT), log-moment (LOG) and
combined (COMB) estimators of α, for r = 500, n = 100, B = 1000 and σ = 1.
a combined estimator that is never better than each individual one. The average weight of the
log estimator in the combination is represented in Figure 1. Table 4 reports the mean squared
errors for several values of α. For each value, we remark that the combination of the
Koutrouvelis and log estimators is better than each estimator taken separately. This is
confirmed by the plots in Figure 2, which compare the empirical distributions of the estimators.
[Figure 2: one panel for each α ∈ {0.4, 0.6, 0.8, 1, 1.2, 1.4, 1.6, 1.8}.]

Figure 2: Empirical density functions for the log-moment (LOG), Koutrouvelis regression
(KOUT) and combined (COMB) estimators of α, for r = 500, n = 100, B = 1000 and σ = 1.
3 Skewed stable distributions
3.1 Adaptation of estimators for the skewed case
In the case X ∼ Sα(σ, β, 0), we have

    E[log |X|] = (1/α − 1)γ + log σ − (log |cos θ|)/α,

    E[(log |X| − E[log |X|])²] = Var(log |X|) = π²/(6α²) + π²/12 − θ²/α²,

where γ is the Euler constant and θ = arctan( β tan(απ/2) ) (see Kuruoglu (2001), Prop. 4, and
Kateregga et al. (2017) for an application).
Let (X_1, ..., X_{2n}) be a sequence of 2n independent and identically distributed stable
random variables Sα(σ, β, 0). We apply the centro-symmetrization introduced in Kuruoglu
(2001) to the observed data to obtain n independent symmetric stable random variables
Sα(2^{1/α}σ, 0, 0): (X_{2k} − X_{2k−1})_{k∈{1,...,n}}. Then, we estimate α by
α̂_LOG^(n)(X_2 − X_1, X_4 − X_3, ..., X_{2n} − X_{2n−1}), where α̂_LOG^(n) is introduced in
Theorem 2.4. In Kuruoglu (2001), the parameter β is also estimated using Var[log |X|], and a
numerical comparison of β estimates is provided.

Another way to estimate α is to use the (2n − 1) random variables (X_k − X_{k−1})_{k∈{2,...,2n}},
by taking α̂_LOG^(2n−1)(X_2 − X_1, X_3 − X_2, ..., X_{2n} − X_{2n−1}). The interest is to
preserve the sample size. However, not enough information is available on the dependence
structure of the process (log(|X_k − X_{k−1}|))_{k∈N} to establish consistency and a central
limit theorem for this estimate. Given the absence of theoretical results and the numerical
comparison provided in Section 3.3, we only consider the first estimate (based on independent
increments) in the rest of the study.
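The symmetrization step can be sketched as follows (plain Python; the α = 1/2 totally skewed test case is ours, generated via the classical fact that 1/Z² is a standard Lévy variable, i.e. S_{1/2}(1, 1, 0), when Z is standard normal). The pairwise-difference estimate recovers α, while the log-moment estimator applied directly to the skewed sample is biased, consistent with the limit πα/√(π² − 6θ²) appearing in Section 3.2.

```python
import math
import random

def alpha_log(xs):
    """alpha component of the log-moment estimator (2.5)."""
    n = len(xs)
    logs = [math.log(abs(x)) for x in xs]
    m = sum(logs) / n
    v = sum((l - m) ** 2 for l in logs) / n
    return max(6.0 * v / math.pi ** 2 - 0.5, 0.25) ** -0.5

rng = random.Random(3)
# 1/Z^2 with Z ~ N(0,1) is a standard Levy variable, S_{1/2}(1, 1, 0).
skewed = [1.0 / rng.gauss(0.0, 1.0) ** 2 for _ in range(100_000)]

# Centro-symmetrization: pairwise differences are symmetric, S_{1/2}(2^{1/alpha}, 0, 0).
diffs = [skewed[2 * k + 1] - skewed[2 * k] for k in range(len(skewed) // 2)]

alpha_sym = alpha_log(diffs)     # consistent estimate of alpha = 0.5
alpha_naive = alpha_log(skewed)  # inconsistent when beta != 0
```

Here θ = arctan(tan(π/4)) = π/4, so the naive estimate should drift toward πα/√(π² − 6θ²) ≈ 0.632 instead of 0.5.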
3.2 Test of symmetry
We propose a test for checking the skewness of a dataset. We want to test H0: "β = 0" against
H1: "β ≠ 0", using the properties of the estimators studied in the previous sections.

Let (X_1, ..., X_{2n}) be a sequence of 2n independent and identically distributed stable
random variables Sα(σ, β, 0). Under the null hypothesis H0, both estimates α̂_LOG((X_{2k})_k)
and α̂_LOG((X_{2k} − X_{2k−1})_k) are consistent, and so the difference between these
estimators tends to zero. For skewed variables, this convergence does not occur, since
α̂_LOG((X_{2k})_k) no longer converges to α. These facts suggest constructing a test based on
the difference of these
estimators. Denote

    L1 := E[log |Z|] = (1/α − 1)γ + log σ,

    L2 := E[(log |Z| − E[log |Z|])²] = π²/(6α²) + π²/12,

    L3 := E[(log |Z| − E[log |Z|])³] = 2ζ(3) (1/α³ − 1),

    L4 := E[(log |Z| − E[log |Z|])⁴] = π⁴ ( 3/(20α⁴) + 1/(12α²) + 19/240 ),

    C  := Cov( (log |X_2| − E[log |X_2|])², (log |X_2 − X_1| − E[log |X_2 − X_1|])² )
        = E[ (log |X_2| − E log |X_2|)² (log |X_2 − X_1| − E log |X_2 − X_1|)² ] − L2²,

where Z is an Sα(σ, 0, 0) random variable and ζ is the Riemann zeta function
ζ(s) = Σ_{n=1}^∞ 1/n^s, with ζ(3) = 1.2020569...
Proposition 3.1. Denote by α̂_LOG({X_{2k}}) and α̂_LOG({X_{2k} − X_{2k−1}}) the log-moment
estimates calculated respectively on the samples (X_{2k})_{k=1,...,n} and
(X_{2k} − X_{2k−1})_{k=1,...,n}. For w ∈ (0, 1), define the critical region

    R_w = { (x_1, ..., x_{2n}) ∈ R^{2n} :
            n [ α̂_LOG({x_{2k}}) − α̂_LOG({x_{2k} − x_{2k−1}}) ]²
            / ( (18 α̂_LOG⁶({x_{2k} − x_{2k−1}}) / π⁴) (L̂4 − L̂2² − Ĉ) ) > t_w },

where t_w is the 1 − w quantile of the chi-squared distribution with 1 degree of freedom, and
where L̂4, L̂2 and Ĉ are respectively the empirical counterparts of L4, L2 and C. We reject the
null hypothesis if (X_1, ..., X_{2n}) ∈ R_w. The test has asymptotic significance level w and is
consistent under H1.
Proof. We denote Y_k = log |X_{2k}| and Z_k = log |X_{2k} − X_{2k−1}| for k = 1, ..., n. Under
the null hypothesis, we have

    √n [ ( (1/n) Σ_{k=1}^n (Y_k − Ȳ_n)² , (1/n) Σ_{k=1}^n (Z_k − Z̄_n)² )ᵀ − (L2, L2)ᵀ ]
        →d  N( (0, 0)ᵀ , [ L4 − L2²   C ; C   L4 − L2² ] ).

Then, by the multidimensional delta method, we get

    √n [ ( α̂_LOG((X_{2k})_k) , α̂_LOG((X_{2k} − X_{2k−1})_k) )ᵀ − (α, α)ᵀ ]
        →d  N( (0, 0)ᵀ , (9α⁶/π⁴) [ L4 − L2²   C ; C   L4 − L2² ] )

and

    √n [ α̂_LOG((X_{2k})_k) − α̂_LOG((X_{2k} − X_{2k−1})_k) ]
        / √( (18α⁶/π⁴) (L4 − L2² − C) )  →d  N(0, 1).

Finally, applying the Slutsky theorem and the consistency of L̂4, L̂2 and Ĉ, we obtain that the
asymptotic significance level is equal to w.

Under H1, we have

    α̂_LOG((X_{2k})_k) − α̂_LOG((X_{2k} − X_{2k−1})_k)  →a.s.  πα/√(π² − 6θ²) − α ≠ 0,

with θ = arctan( β tan(απ/2) ). Then, under the alternative β ≠ 0, the test is consistent:
P(R_w) → 1 as n → ∞.
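The test of Proposition 3.1 can be sketched as follows (plain Python; the helper names, the Cauchy and Lévy test samples, and the use of erfc for the χ²(1) tail are our own choices). The p-value uses P(χ²₁ > s) = erfc(√(s/2)).

```python
import math
import random

def alpha_from_logs(logs):
    """alpha component of the log-moment estimator (2.5), from log-absolute values."""
    n = len(logs)
    m = sum(logs) / n
    v = sum((l - m) ** 2 for l in logs) / n
    return max(6.0 * v / math.pi ** 2 - 0.5, 0.25) ** -0.5

def symmetry_test(xs):
    """Statistic and chi2(1) p-value of the symmetry test of Proposition 3.1."""
    n = len(xs) // 2
    Y = [math.log(abs(xs[2 * k + 1])) for k in range(n)]              # log|X_{2k}|
    Z = [math.log(abs(xs[2 * k + 1] - xs[2 * k])) for k in range(n)]  # log|X_{2k} - X_{2k-1}|
    a1, a2 = alpha_from_logs(Y), alpha_from_logs(Z)
    y_bar, z_bar = sum(Y) / n, sum(Z) / n
    cy = [(y - y_bar) ** 2 for y in Y]
    cz = [(z - z_bar) ** 2 for z in Z]
    L2_hat = sum(cz) / n
    L4_hat = sum(c ** 2 for c in cz) / n        # empirical 4th central moment of Z
    cy_bar = sum(cy) / n
    C_hat = sum((u - cy_bar) * (w - L2_hat) for u, w in zip(cy, cz)) / n
    denom = 18.0 * a2 ** 6 / math.pi ** 4 * (L4_hat - L2_hat ** 2 - C_hat)
    stat = n * (a1 - a2) ** 2 / denom
    pval = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))  # P(chi2_1 > stat)
    return stat, pval

rng = random.Random(4)
symmetric = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(40_000)]  # S_1(1,0,0)
skewed = [1.0 / rng.gauss(0.0, 1.0) ** 2 for _ in range(40_000)]               # S_{1/2}(1,1,0)

stat_sym, p_sym = symmetry_test(symmetric)
stat_skw, p_skw = symmetry_test(skewed)
```

On the symmetric sample the statistic behaves like a χ²(1) draw, while on the totally skewed Lévy sample the two α estimates differ persistently and the test rejects decisively.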
3.3 Numerical performance

Estimation. We compare the performance of both log-moment estimates obtained after
symmetrization. The mean squared error of the first estimate, calculated on
(X_{2k} − X_{2k−1})_{k∈{1,...,n}}, does not depend on β, since the (X_{2m} − X_{2m−1}) are
symmetric and independent. We compare this estimate with the second one, calculated on
(X_k − X_{k−1})_{k∈{2,...,2n}}. The joint distribution of (X_k − X_{k−1})_{k∈{2,...,2n}} is
unknown and could depend on β; however, we observe numerically that its mean squared error
does not depend on β. Table 5 provides a comparison of the two estimates in terms of mean
squared error. For α < 1, both estimates have similar performance despite the twofold difference
in sample size, which shows that the dependence degrades precision. This impact of dependence
is less pronounced for α ≥ 1: there, precision is improved by taking the estimate on the
dependent sample. However, the Koutrouvelis method still outperforms it when α > 1 (see
Tables 2 and 4).

The comparison can be made on these tables for β = 0, since the Koutrouvelis method does not
vary with skewed distributions. Indeed, the modulus of the characteristic function depends only
on α and σ.
In light of these numerical results, we decide to combine the log-moment estimate after
symmetrization, α̂_LOG({X_{2k} − X_{2k−1}}_{k=1,...,n}) (denoted LOG sym. hereafter), with
the Koutrouvelis estimator, in the same way as for symmetric variables (see Section 2.2), to
obtain a new estimator whose numerical performance is reported in Table 6.

For skewed data (see Table 6), the combined estimator still performs well, but we lose in terms
of mean squared error compared to the symmetric case.
Testing procedure We use Monte Carlo experiments to evaluate the empirical significance level, that is, the probability of rejecting the null hypothesis $H_0$ under $H_0$. Table 7 gives the performance of our testing procedure under the null hypothesis. For small values of $\alpha$, the empirical significance level converges slowly to $w$: this is due to the form of the density, which is concentrated around zero, and to the poor quality of the estimation of the coefficient $L_4 - L_2^2 - C$. The estimate converges rather slowly to this coefficient, which increases when $\alpha$ decreases.
To evaluate the performance of the test, we also examine the distribution of the p-values.
Under the null hypothesis, the p-value converges in distribution to the uniform distribution on
    2n                      α=0.2      α=0.4     α=0.6     α=0.8     α=1
    100    n obs. i.i.d.    1.01e-3    4.44e-3   1.27e-2   3.30e-2   10.7e-2
           2n−1 obs. dep.   0.970e-3   4.14e-3   1.08e-2   2.44e-2   6.08e-2
    500    n obs. i.i.d.    1.80e-4    7.90e-4   2.10e-3   4.68e-3   1.01e-2
           2n−1 obs. dep.   1.75e-4    7.58e-4   1.86e-3   3.81e-3   7.33e-3
    1000   n obs. i.i.d.    9.12e-5    3.86e-4   9.90e-4   2.23e-3   4.79e-3
           2n−1 obs. dep.   8.94e-5    3.69e-4   8.88e-4   1.84e-3   3.48e-3

    2n                      α=1.2      α=1.4     α=1.6     α=1.8
    100    n obs. i.i.d.    12.6e-2    14.4e-2   12.0e-2   9.76e-2
           2n−1 obs. dep.   8.78e-2    11.2e-2   9.59e-2   7.47e-2
    500    n obs. i.i.d.    2.14e-2    4.32e-2   5.46e-2   4.76e-2
           2n−1 obs. dep.   1.35e-2    2.51e-2   3.61e-2   3.31e-2
    1000   n obs. i.i.d.    9.98e-3    2.02e-2   3.35e-2   3.20e-2
           2n−1 obs. dep.   6.51e-3    1.17e-2   2.02e-2   2.18e-2

Table 5: Mean squared errors for $\alpha$ using log-moments for $2n$ i.i.d. random variables $S_\alpha(1, \beta, 0)$.
    α           0.2       0.4       0.6       0.8       1
    COMB        2.39e-3   2.21e-3   5.88e-3   9.08e-3   1.67e-2
    KOUT        6.14e-3   4.92e-3   1.10e-2   1.36e-2   1.72e-2
    LOG sym.    1.90e-3   4.36e-3   1.34e-2   2.67e-2   7.29e-2

    α           1.2       1.4       1.6       1.8
    COMB        2.70e-2   3.53e-2   3.54e-2   2.20e-2
    KOUT        2.71e-2   4.15e-2   4.16e-2   2.41e-2
    LOG sym.    1.24e-1   1.38e-1   1.29e-1   1.10e-1

Table 6: Mean squared errors for the combined (COMB), Koutrouvelis regression (KOUT) and log-moment (LOG sym.) estimators of $\alpha$, for $r = 500$, $n = 100$, $B = 1000$, $\beta = 0.6$ and $\sigma = 1$.
$[0, 1]$. Under the alternative, the p-value converges in probability to zero, and the convergence rate indicates the power of the test. In Figure 3, we can thus see that the power increases when $\beta$ moves away from $0$ or when the sample size increases. This convergence under the alternative hypothesis depends on the values of $\alpha$ and $\beta$. The case $\beta = 0$ confirms that the empirical significance level converges quickly to the nominal level $w$.
We observe that the p-value takes the value $1$ with non-null probability for small sample sizes. This jump is due to the truncation in the log-moment estimator defined in Theorem 2.4. Indeed, the truncation forces the estimate of $\alpha$ to be equal to $2$; the distribution is then treated as symmetric and we always accept the null hypothesis $H_0$. This phenomenon disappears asymptotically since the estimate is consistent.
    α             0.6     0.8     1       1.2     1.4     1.6     1.8
    2n = 200      0.084   0.087   0.088   0.097   0.088   0.06    0.042
    2n = 1000     0.097   0.089   0.074   0.065   0.067   0.061   0.035
    2n = 10^4     0.10    0.071   0.051   0.050   0.048   0.051   0.053
    2n = 5·10^4   0.081   0.065   0.050   0.050   0.049   0.050   0.051

Table 7: Probabilities of rejecting the null hypothesis under $H_0$ for several sample sizes and different values of $\alpha$. The significance level is $w = 5\%$.
[Figure 3: empirical c.d.f. of the p-values; panels (a) $2n = 200$, $\alpha = 1.2$; (b) $2n = 200$, $\alpha = 0.8$; (c) $2n = 1000$, $\alpha = 1.2$; (d) $2n = 1000$, $\alpha = 0.8$.]
Figure 3: Empirical cumulative distribution function of the p-values for different values of $\beta$: $\beta = 0$ (black), $\beta = 0.2$ (red), $\beta = 0.4$ (dark blue), $\beta = 0.6$ (green), $\beta = 0.8$ (pink) and $\beta = 1$ (light blue). In (a), we add in dotted lines the curves for $\beta \in \{-1, -0.8, -0.6, -0.4, -0.2\}$, which coincide exactly with those for the corresponding positive values.
4 Case of non-identically distributed stable variables
In applications, it may be the case that one needs to analyze non-stationary phenomena. For
instance, it seems plausible that financial logs which display jumps will see the intensity of
these jumps depend on external events, such as crises (see next section for an illustration on
the S&P 500). Sometimes, the variation of αwill be slow, and it is of interest to investigate
under which conditions our estimator still behaves correctly in situations where the data at hand
deviate slightly from the assumption that the random variables have the same distribution. In
the sequel, we examine two cases: deterministic and random small perturbations of α, leading
to random variables which are not identically distributed. We do not dispense here with the
independence assumption, although this would be a desirable extension. This generalization
will be useful to address the estimation of the stability function of self-stabilizing processes
(see Falconer and Lévy Véhel (2018a,b)).
4.1 Deterministic perturbations
Let $(X_i)_i$ be a sequence of independent variables and $X$ a random variable independent of $(X_i)_i$ such that $X_i \sim S_{\alpha_i}(\sigma_i, 0, 0)$ and $X \sim S_\alpha(\sigma, 0, 0)$. We denote $Y_i = \log|X_i|$, $Y = \log|X|$. Assume there are constants $(c_\alpha, c_\sigma) \in (0, 1)^2$ such that, for each integer $i$,
$$\alpha_i = \alpha + \varepsilon_i \in (0, 2] \quad \text{and} \quad \sigma_i = \sigma + \eta_i,$$
with $\varepsilon_i$ and $\eta_i$ deterministic and satisfying
$$\frac{|\varepsilon_i|}{\alpha} \leq c_\alpha < 1 \quad \text{and} \quad \frac{|\eta_i|}{\sigma} \leq c_\sigma < 1.$$
Proposition 4.1. Under the conditions
$$\frac{1}{n}\sum_{i=1}^n |\varepsilon_i| \xrightarrow[n\to\infty]{} 0 \quad \text{and} \quad \frac{1}{n}\sum_{i=1}^n |\eta_i| \xrightarrow[n\to\infty]{} 0,$$
one has
$$\widehat{\alpha}^{(n)}_{LOG} \xrightarrow[n\to\infty]{a.s.} \alpha,$$
where $\widehat{\alpha}^{(n)}_{LOG}$ is defined in Theorem 2.4.
Proof. See Appendix.
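As a quick numerical check of Proposition 4.1, one can draw independent $S_{\alpha_i}(1, 0, 0)$ variables with perturbations whose Cesàro mean vanishes and verify that the truncated log-moment estimate still approaches $\alpha$. The sketch below uses the Chambers-Mallows-Stuck sampler for symmetric stable draws and the arbitrary choice $\varepsilon_i = 0.3/i$, $\eta_i = 0$; these are our own illustrative assumptions, not choices from the paper.

```python
import math
import random

def sym_stable(alpha):
    """One Chambers-Mallows-Stuck draw from the symmetric stable law S_alpha(1, 0, 0)."""
    u = math.pi * (random.random() - 0.5)   # uniform on (-pi/2, pi/2)
    w = random.expovariate(1.0)             # standard exponential
    if abs(alpha - 1.0) < 1e-12:
        return math.tan(u)                  # alpha = 1: standard Cauchy
    return (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
            * (math.cos(u - alpha * u) / w) ** ((1.0 - alpha) / alpha))

def alpha_log_truncated(x):
    """Truncated log-moment estimator of Theorem 2.4."""
    logs = [math.log(abs(v)) for v in x]
    m = sum(logs) / len(logs)
    v = sum((l - m) ** 2 for l in logs) / len(logs)
    return math.pi / math.sqrt(max(6.0 * v - math.pi ** 2 / 2.0, math.pi ** 2 / 4.0))

def perturbed_sample(alpha, n, eps=lambda i: 0.3 / i):
    """Independent draws X_i ~ S_{alpha + eps(i)}(1, 0, 0) with vanishing Cesaro mean of |eps|."""
    return [sym_stable(alpha + eps(i)) for i in range(1, n + 1)]
```

With $\alpha = 1.5$ and a few tens of thousands of draws, the estimate lands close to $1.5$ despite the perturbed early observations.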
Proposition 4.2. Let $\Sigma_{\alpha,\sigma}$ be the covariance matrix of $(Y, Y^2)$:
$$\Sigma_{\alpha,\sigma} = \begin{pmatrix} \mathrm{Var}(Y) & \mathrm{Cov}(Y, Y^2) \\ \mathrm{Cov}(Y, Y^2) & \mathrm{Var}(Y^2) \end{pmatrix} \qquad (4.1)$$
and set
$$H_{\alpha,\sigma} = \begin{pmatrix} \dfrac{6\left(\left(\frac{1}{\alpha} - 1\right)\gamma + \log\sigma\right)\alpha^3}{\pi^2} \\[2mm] -\dfrac{3\alpha^3}{\pi^2} \end{pmatrix}. \qquad (4.2)$$
Under the conditions
$$\frac{1}{\sqrt{n}}\sum_{i=1}^n |\varepsilon_i| \xrightarrow[n\to\infty]{} 0 \quad \text{and} \quad \frac{1}{\sqrt{n}}\sum_{i=1}^n |\eta_i| \xrightarrow[n\to\infty]{} 0,$$
the following central limit theorem holds for $\widehat{\alpha}^{(n)}_{LOG}$:
$$\sqrt{n}\left(\widehat{\alpha}^{(n)}_{LOG} - \alpha\right) \xrightarrow{d} \mathcal{N}\left(0,\, H_{\alpha,\sigma}^\top \Sigma_{\alpha,\sigma} H_{\alpha,\sigma}\right). \qquad (4.3)$$
Proof. See Appendix.
4.2 Random perturbations
Let $X$ be a random variable with a stable distribution $S_\alpha(\sigma, 0, 0)$. For each integer $i$, denote $\alpha_i = \alpha + \varepsilon_i$, where $\varepsilon_i$ is a random variable. Suppose there is a constant $c_\alpha \in (0, 1)$ such that
$$P\left(\frac{|\varepsilon_i|}{\alpha} \leq \min\left(c_\alpha, \frac{2}{\alpha} - 1\right)\right) = 1.$$
Let $(X_i)_i$ be a sequence of independent variables, independent of $X$, such that $X_i \sim S_{\alpha_i}(\sigma, 0, 0)$ (given $\alpha_i$). We denote $Y = \log|X|$ and $Y_i = \log|X_i|$ for $i \in \mathbb{N}$.
Proposition 4.3. Under the condition
$$\frac{1}{n}\sum_{i=1}^n E[|\varepsilon_i|] \xrightarrow[n\to\infty]{} 0,$$
we have
$$\widehat{\alpha}^{(n)}_{LOG} \xrightarrow[n\to\infty]{a.s.} \alpha.$$
If, in addition,
$$\frac{1}{\sqrt{n}}\sum_{i=1}^n E[|\varepsilon_i|] \xrightarrow[n\to\infty]{} 0,$$
then the following central limit theorem holds:
$$\sqrt{n}\left(\widehat{\alpha}^{(n)}_{LOG} - \alpha\right) \xrightarrow{d} \mathcal{N}\left(0,\, H_{\alpha,\sigma}^\top \Sigma_{\alpha,\sigma} H_{\alpha,\sigma}\right),$$
where $\Sigma_{\alpha,\sigma}$ and $H_{\alpha,\sigma}$ are defined in (4.1) and (4.2).
Proof. See Appendix.
5 Some applications of the combined estimator
In this section, we apply the combined estimator to processes with a varying stability index. The estimate is calculated on windows that are small with respect to the number of observations, in order to capture the variations of $\alpha$. This empirical study shows that the performance of the combined estimator on small samples makes the local estimation of $\alpha$ possible.
5.1 Numerical results on synthetic data: multistable Lévy motion
We now use our log-moment and combined estimators in the case of the multistable Lévy motion defined in Falconer and Lévy Véhel (2009) (see also Le Guével and Lévy Véhel (2012) for further properties of this process). The basic idea is to let the stability index evolve with time, so that the jump intensity, which is governed by $\alpha$, varies along a trajectory. Such a feature is commonly encountered in time series observed in fields such as finance or biomedicine (see for instance Corlay et al. (2014); Frezza (2018); Fischer et al. (2003); Bianchi et al. (2013)). Let us briefly recall the definition of such processes.
Let $\alpha: [0,1] \to (0,2)$ be continuously differentiable. We denote $r^{<s>} = \mathrm{sign}(r)|r|^s$ for $r \in \mathbb{R}$ and $s \in \mathbb{R}$. Symmetric multistable Lévy motion is defined by
$$M_\alpha(t) = C_{\alpha(t)} \sum_{(\mathsf{X}, \mathsf{Y}) \in \Pi} \mathbf{1}_{(0,t]}(\mathsf{X})\, \mathsf{Y}^{<-1/\alpha(t)>} \qquad (5.1)$$
where $C_\theta = \left(\int_0^\infty u^{-\theta}\sin(u)\, du\right)^{-1}$ and $\Pi$ is a Poisson point process on $\mathbb{R}^+ \times \mathbb{R}$ with plane Lebesgue measure $\mathcal{L}^2$ as mean measure. This process is simulated by using the field
$$X(t, u) = C_{\alpha(u)} \sum_{(\mathsf{X}, \mathsf{Y}) \in \Pi} \mathbf{1}_{(0,t]}(\mathsf{X})\, \mathsf{Y}^{<-1/\alpha(u)>}.$$
For each $u \in (0, 1)$, $X(\cdot, u)$ is an $\alpha(u)$-stable process with independent increments, which can be implemented using the RSTAB program available in Stoev and Taqqu (2004) or in Samorodnitsky and Taqqu (1994). The interval $[0, 1]$ is discretized into $N$ equal parts, and $X(\cdot, u)$ is implemented as the cumulative sum of $N$ independent stable random variables with $\alpha(u)$ as characteristic exponent.
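The diagonal of the field admits a much simpler sketch: discretize $[0, 1]$ into $N$ steps and draw the increment over $[k/N, (k+1)/N]$ from the $\alpha(k/N)$-stable law with scale $N^{-1/\alpha(k/N)}$. This bypasses the RSTAB field construction and is only a rough illustration of the simulation scheme described above; the stable draws use the Chambers-Mallows-Stuck method.

```python
import math
import random

def sym_stable(alpha):
    """One Chambers-Mallows-Stuck draw from the symmetric stable law S_alpha(1, 0, 0)."""
    u = math.pi * (random.random() - 0.5)   # uniform on (-pi/2, pi/2)
    w = random.expovariate(1.0)             # standard exponential
    if abs(alpha - 1.0) < 1e-12:
        return math.tan(u)                  # alpha = 1: standard Cauchy
    return (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
            * (math.cos(u - alpha * u) / w) ** ((1.0 - alpha) / alpha))

def multistable_levy_path(alpha_fn, N):
    """Approximate multistable Levy motion on [0, 1]: the increment over
    [k/N, (k+1)/N] is alpha(k/N)-stable with scale N**(-1/alpha(k/N))."""
    path = [0.0]
    for k in range(N):
        a = alpha_fn(k / N)
        path.append(path[-1] + N ** (-1.0 / a) * sym_stable(a))
    return path
```

The $N^{-1/\alpha}$ scaling mirrors the self-similarity of $\alpha$-stable Lévy motion, whose increments over an interval of length $\Delta t$ have scale $\Delta t^{1/\alpha}$.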
In Figure 4, we display sample paths of multistable processes for several functions $\alpha$. We then estimate these functions at each point $t_0$ using the combined estimator with a window of $n$ observations around $t_0$. Therefore, from a realization $(M_\alpha(k/N))_{k=0,\ldots,N}$ of a multistable process, the function $\alpha$ can only be estimated on the interval $[n/N,\, 1 - n/N]$.
In Figure 5, we repeat the simulation and the estimation 100 times for a multistable process with $\alpha(t) = 1.5 - 0.48\sin(2\pi(t + 1/4))$. For each point where the function $\alpha$ is estimated, we obtain the empirical distribution of the combined estimator. This procedure is repeated for several window sizes ($100$, $200$, $1000$ and $2000$). We observe that the standard error, which corresponds to the standard deviation of the combined estimator, decreases when the window size $n$ increases, whereas the bias increases for large $n$. Consequently, the mean squared error decreases when $n$ increases up to $n = 1000$ and then increases for larger values. Figure 6 represents the bias, standard error and mean squared error as functions of time $t$ for various values of the window size.
The mean squared error as a function of $\alpha$ is reported in Figure 7. When $n$ is fixed, the mean squared error does not vary much with the value of $\alpha$ in $[1, 2]$.
As these figures show, reasonable estimates are obtained in these experiments, because the variations of $\alpha$ are "slow" compared to the sampling frequency: this feature ensures that centering a window around any given $t_0$ and treating all points inside this window as having
Figure 4: Trajectories of multistable processes on $(0, 1)$ with $N = 20000$ points (first column), and the functions $\alpha(t)$ (red) and $\widehat{\alpha}_{COMB}(t)$ (black) with $n = 2000$ (second column), for $\alpha(t) = 1.98 - 0.96t$, $\alpha(t) = 1.98 - \frac{0.96}{1 + \exp(20 - 40t)}$ and $\alpha(t) = 1.5 - 0.48\sin(2\pi t)$.
Figure 5: Box-plots of the estimator $\widehat{\alpha}^{(n)}_{COMB}$ for 100 replications of a multistable process with characteristic exponent $\alpha(t) = 1.5 - 0.48\sin(2\pi(t + 1/4))$, represented in red. The box-plots show the behavior of the estimator for several window sizes: (a) $n = 100$, (b) $n = 200$, (c) $n = 1000$, (d) $n = 2000$.
the same αvalue is an acceptable approximation as far as estimation is concerned.
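The windowing scheme just described can be sketched as follows, with the log-moment estimator standing in for the combined one (the combined estimator additionally involves the Koutrouvelis regression, which is omitted here); window centers closer than $n/2$ increments to either end of the path are skipped, as in the text.

```python
import math

def alpha_log_window(logs):
    """Truncated log-moment estimate from a list of log|increment| values."""
    m = sum(logs) / len(logs)
    v = sum((l - m) ** 2 for l in logs) / len(logs)
    return math.pi / math.sqrt(max(6.0 * v - math.pi ** 2 / 2.0, math.pi ** 2 / 4.0))

def sliding_alpha(path, n):
    """Estimate t -> alpha(t) from a sampled path by applying the log-moment
    estimator to a centered window of n increments (sketch of the procedure)."""
    incs = [b - a for a, b in zip(path, path[1:])]
    logs = [math.log(abs(d)) for d in incs]
    half = n // 2
    return [alpha_log_window(logs[k - half:k + half])
            for k in range(half, len(logs) - half)]
```

On a path with i.i.d. Cauchy increments ($\alpha \equiv 1$), every window estimate should hover around $1$.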
Figure 6: Representation of the bias (a), standard error (b) and mean squared error (c) as functions of $t$, for $n = 100$ (black solid line), $n = 200$ (red dashed), $n = 1000$ (green dotted) and $n = 2000$ (blue dash-dotted). The statistics are evaluated on the same trajectories as in Figure 5.

Figure 7: Mean squared error of the combined estimator as a function of the value of $\alpha$, for a multistable process with $\alpha(t) = 1.5 - 0.48\sin(2\pi(t + 1/4))$. The statistics are evaluated on the same trajectories as in Figure 5.
5.2 Application to financial logs
This last section deals with real data. We apply the combined estimator to estimate the characteristic exponent of the financial index Standard & Poor's 500 (abbreviated as the S&P 500, see Figure 8). This is a stock market index based on the 500 companies with the largest capitalization in the United States. The stock market returns of the S&P 500, which correspond to the renormalized growth rate $((Y_{t+1} - Y_t)/Y_t)_t$, are assumed to be independent stable random variables. We analyze the behavior of the index around the Wall Street Crash of 1929, and during the period 1996-2017. For both periods, we test the symmetry and we estimate the exponent $\alpha$ on sliding windows.
Period 1996-2017 In Figure 9(a), we first test the symmetry of the data in sliding windows of size 1000 (using the test defined in Proposition 3.1). We represent the empirical distribution function of the p-values calculated on the sliding windows since 1996. The cumulative distribution function is very close to the uniform one, which corresponds to the distribution of the p-values under the null hypothesis. The symmetry hypothesis is therefore not rejected for the S&P 500 returns since 1996. As a consequence, the parameter is estimated by applying the estimator defined in Theorem 2.4 in sliding windows of several sizes over the period 1996-2017 (see Figure 10). We observe the effect of the window size on the estimation of the function $\alpha$. The overall pattern of the three estimates is the same; however, the regularity of the estimated function varies. This is a classical bias-variance trade-off. When the sample size is fixed, we do not have theoretical results on the bias and variance for non-identically distributed models. Such results would allow us to propose a choice of window satisfying the bias-variance trade-off. In practice, estimating the bias and the variance is a difficult problem that generally requires prior information on the function $\alpha$. If the variations of $\alpha$ are small over small intervals, the bias and the variance could be evaluated, for example, using a bootstrap method.
Wall Street Crash of 1929 In Figure 9(b), the distribution of the p-values allows us to reject the hypothesis of symmetry between 1929 and 1936. In Figure 11, the estimation of the characteristic exponent between 1929 and 1936 is performed using the skewed combined estimator defined in Section 3. A sudden drop is observed at the end of 1929; this change corresponds to the Wall Street financial crash of 1929. The estimate for symmetric data (in red) is added to the figure to show the difference between the two estimates, particularly during the crisis.
The empirical comparison of these two periods highlights the effect of a financial crisis on the two parameters $\alpha$ and $\beta$. Indeed, the crisis has an impact on the jump activity, but it also creates an asymmetric situation.
Figure 8: Evolution of the financial index S&P 500 as a function of $t$ (a) and of its returns (b), between 1996 and 2017.
Figure 9: Empirical cumulative distribution function of the p-values for S&P 500 returns between 1996 and 2017 (a) and between 1928 and 1936 (b), with sliding windows of size 1000.
Figure 10: Values of the combined estimator $\widehat{\alpha}^{(n)}_{COMB}$ for the S&P 500 characteristic exponent $\alpha$ in sliding windows of several sizes ((a) $n = 50$, (b) $n = 100$, (c) $n = 200$) for working days between 1996 and 2017.
Figure 11: Values of the skewed combined estimator $\widehat{\alpha}^{(n)}_{COMB}$ (in black) and of the symmetric combined estimator (in red) for the S&P 500 characteristic exponent $\alpha$ around the Wall Street financial crash of 1929 (sliding window of size $n = 200$ observations).
6 Appendix
Proof of Proposition 2.1
Let $Z$ be a stable random variable $S_\alpha(1, 0, 0)$. It satisfies the following properties:
- $Z$ has a bounded density;
- for $\alpha \in (0, 2)$, the asymptotic behavior of the tail probabilities is
$$\lim_{\lambda \to +\infty} \lambda^\alpha P(Z > \lambda) = \lim_{\lambda \to +\infty} \lambda^\alpha P(Z < -\lambda) = \frac{1}{2} C_\alpha, \qquad (6.1)$$
where $C_\alpha = \left(\int_0^\infty u^{-\alpha}\sin(u)\, du\right)^{-1}$ (see Samorodnitsky and Taqqu (1994), Property 1.2.15).
Thus we deduce, for all $p > 0$,
$$E\left[|\log|Z||^p\right] = \int_0^\infty P\left(|\log|Z||^p > x\right) dx < \infty.$$
For the Gaussian case $\alpha = 2$, this result is immediate.
Proof of Proposition 2.2
We use the formulas of Properties 1.2.17 and 1.2.15 in Samorodnitsky and Taqqu (1994) to compute $E[|Z|^t]$ for $0 < t < \alpha$:
$$E[|Z|^t] = \frac{2^{t-1}\Gamma(1 - t/\alpha)}{t\int_0^{+\infty} \frac{\sin^2(u)}{u^{t+1}}\, du} = \frac{2^{t-1}\Gamma(1 - t/\alpha)}{\int_0^{+\infty} \frac{\sin(2u)}{u^t}\, du} = \frac{(1 - t)\Gamma(1 - t/\alpha)}{\Gamma(2 - t)\cos(\pi t/2)}.$$
Furthermore, if $0 < t < 1$, we have:
$$E[|Z|^t] = \frac{\Gamma(1 - t/\alpha)}{\Gamma(1 - t)\cos(\pi t/2)}.$$
Proof of Theorem 2.3
By the strong law of large numbers for the random variables $(\log|X_i|)_i$, which have finite expectation (Proposition 2.1), and the continuous mapping theorem for $g(x) = \frac{\gamma}{\gamma + x}$, we get $\widehat{\alpha}_n \xrightarrow[n\to\infty]{a.s.} \alpha$. Then we apply the central limit theorem
$$\sqrt{n}\left(\frac{1}{n}\sum_{i=1}^n \log|X_i| - (\alpha^{-1} - 1)\gamma\right) \xrightarrow{d} \mathcal{N}(0, f(\alpha)),$$
where $f(x) = \frac{\pi^2}{6x^2} + \frac{\pi^2}{12}$. Using the delta method with the function $g$, we obtain
$$\sqrt{n}\,(\widehat{\alpha}_n - \alpha) \xrightarrow{d} \mathcal{N}\left(0, f(\alpha)\alpha^4\gamma^{-2}\right).$$
Finally, Slutsky's theorem leads to the following result:
$$\sqrt{n}\,(\widehat{\alpha}_n - \alpha)\,\frac{\gamma}{\widehat{\alpha}_n^2\sqrt{f(\widehat{\alpha}_n)}} \xrightarrow{d} \mathcal{N}(0, 1).$$
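For $\sigma = 1$, the studentized limit translates directly into an asymptotic confidence interval for $\alpha$. A minimal sketch ($\gamma$ is Euler's constant; no truncation is applied, so the returned point estimate may fall outside $(0, 2]$ on atypical samples):

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, gamma

def alpha_first_moment_ci(x, z=1.96):
    """First-log-moment estimate alpha_hat = gamma / (gamma + mean(log|X_i|))
    (sigma = 1 assumed) with the asymptotic interval
    alpha_hat +/- z * alpha_hat^2 * sqrt(f(alpha_hat)) / (gamma * sqrt(n))."""
    n = len(x)
    m = sum(math.log(abs(v)) for v in x) / n
    a = EULER_GAMMA / (EULER_GAMMA + m)
    f = math.pi ** 2 / (6.0 * a * a) + math.pi ** 2 / 12.0
    half = z * a * a * math.sqrt(f) / (EULER_GAMMA * math.sqrt(n))
    return a, (a - half, a + half)
```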
Proof of Theorem 2.4
The proof is based on the same arguments as Theorem 2.3.
The random variables $(\log|X_i|)_i$ have a finite second moment, so we can apply the continuous mapping theorem with $x \mapsto \pi/\sqrt{6x - \pi^2/2}$ to the empirical variance of $(\log|X_i|)_i$. We get the strong consistency of $\widehat{\alpha}^{(n)}_{LOG}$. For $\widehat{\sigma}^{(n)}_{LOG}$, the argument is similar, with the function $(x, y) \mapsto e^{x - (y^{-1} - 1)\gamma}$ applied to $\left(\frac{1}{n}\sum_{i=1}^n \log|X_i|;\, \widehat{\alpha}^{(n)}_{LOG}\right)$.
Then, we need the third and fourth log-moments of a stable law $Z \sim S_\alpha(1, 0, 0)$ for the covariance matrix. We have
$$E\left[(\log|Z| - E[\log|Z|])^3\right] = 2\zeta(3)\left(\frac{1}{\alpha^3} - 1\right)$$
and we get
$$E\left[(\log|Z|)^3\right] = \frac{4\gamma^3 + 2\gamma\pi^2 + 8\zeta(3)}{4\alpha^3} - \frac{12\gamma^3 + 2\gamma\pi^2}{4\alpha^2} + \frac{12\gamma^3 + \gamma\pi^2}{4\alpha} - \frac{4\gamma^3 + \gamma\pi^2 + 8\zeta(3)}{4}$$
and
$$E\left[(\log|Z| - E[\log|Z|])^4\right] = \pi^4\left(\frac{3}{20\alpha^4} + \frac{1}{12\alpha^2} + \frac{19}{240}\right) \qquad (6.2)$$
$$\begin{aligned} E\left[(\log|Z|)^4\right] = {} & \frac{240\gamma^4 + 240\gamma^2\pi^2 + 36\pi^4 + 1920\zeta(3)\gamma}{240\alpha^4} - \frac{960\gamma^4 + 480\gamma^2\pi^2 + 1920\zeta(3)\gamma}{240\alpha^3} \\ & + \frac{1440\gamma^4 + 360\gamma^2\pi^2 + 20\pi^4}{240\alpha^2} - \frac{960\gamma^4 + 240\gamma^2\pi^2 + 1920\zeta(3)\gamma}{240\alpha} \\ & + \frac{240\gamma^4 + 120\gamma^2\pi^2 + 19\pi^4 + 1920\zeta(3)\gamma}{240}, \end{aligned}$$
where $\zeta$ is the Riemann zeta function, $\zeta(s) = \sum_{n=1}^\infty \frac{1}{n^s}$, and $\zeta(3) = 1.2020569\ldots$
According to the multivariate central limit theorem, we have
$$\sqrt{n}\left(\begin{pmatrix} \frac{1}{n}\sum_{i=1}^n \log|X_i| \\ \frac{1}{n}\sum_{i=1}^n (\log|X_i|)^2 \end{pmatrix} - \begin{pmatrix} E[\log|X_1|] \\ E[(\log|X_1|)^2] \end{pmatrix}\right) \xrightarrow{d} \mathcal{N}(0, \Sigma_{\alpha,\sigma}),$$
where $\Sigma_{\alpha,\sigma}$ is defined in (2.9). Then, we apply the delta method twice. First, with the function $(x, y) \mapsto \left(x;\, \pi/\sqrt{6(y - x^2) - \pi^2/2}\right)$, we get
$$\sqrt{n}\left(\begin{pmatrix} \frac{1}{n}\sum_{i=1}^n \log|X_i| \\ \widehat{\alpha}^{(n)}_{LOG} \end{pmatrix} - \begin{pmatrix} E[\log|X_1|] \\ \alpha \end{pmatrix}\right) \xrightarrow{d} \mathcal{N}\left(0, G_{\alpha,\sigma}\Sigma_{\alpha,\sigma}G_{\alpha,\sigma}^\top\right),$$
where $G_{\alpha,\sigma}$ is defined in (2.8). Then, with the function $(x, y) \mapsto \left(e^{x - (y^{-1} - 1)\gamma};\, y\right)$, we obtain
$$\sqrt{n}\left(\begin{pmatrix} \widehat{\sigma}^{(n)}_{LOG} \\ \widehat{\alpha}^{(n)}_{LOG} \end{pmatrix} - \begin{pmatrix} \sigma \\ \alpha \end{pmatrix}\right) \xrightarrow{d} \mathcal{N}\left(0, F_{\alpha,\sigma}G_{\alpha,\sigma}\Sigma_{\alpha,\sigma}G_{\alpha,\sigma}^\top F_{\alpha,\sigma}^\top\right),$$
where $F_{\alpha,\sigma}$ is defined in (2.8). This concludes the proof.
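The two maps used in the proof make the joint estimator fully explicit. A direct transcription follows (a sketch; the $\max(\cdot, \pi^2/4)$ truncation is the one of Theorem 2.4 and keeps $\widehat{\alpha} \leq 2$):

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, gamma

def log_moment_estimates(x):
    """Joint log-moment estimates (alpha_hat, sigma_hat):
    alpha_hat = pi / sqrt(max(6 * Var(log|X|) - pi^2/2, pi^2/4)),
    sigma_hat = exp(mean(log|X|) - (1/alpha_hat - 1) * gamma)."""
    n = len(x)
    logs = [math.log(abs(v)) for v in x]
    m1 = sum(logs) / n
    var = sum((l - m1) ** 2 for l in logs) / n
    alpha = math.pi / math.sqrt(max(6.0 * var - math.pi ** 2 / 2.0, math.pi ** 2 / 4.0))
    sigma = math.exp(m1 - (1.0 / alpha - 1.0) * EULER_GAMMA)
    return alpha, sigma
```

For instance, a standard normal sample is $S_2(1/\sqrt{2}, 0, 0)$ in this parametrization, so the estimates should be close to $(2, 0.707)$.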
Proof of Proposition 4.1
By the Kolmogorov strong law of large numbers for non-identically distributed random variables (Theorem 2.3.10 in Sen and Singer (1994)), we get
$$\frac{1}{n}\sum_{i=1}^n Y_i^2 - \left(\frac{1}{n}\sum_{i=1}^n Y_i\right)^2 - \left[\frac{1}{n}\sum_{i=1}^n E[Y_i^2] - \left(\frac{1}{n}\sum_{i=1}^n E[Y_i]\right)^2\right] \xrightarrow{a.s.} 0.$$
Moreover, by the mean value theorem,
$$\frac{1}{n}\sum_{i=1}^n E[Y_i] - E[Y] = \frac{\gamma}{n}\sum_{i=1}^n \left(\frac{1}{\alpha_i} - \frac{1}{\alpha}\right) + \frac{1}{n}\sum_{i=1}^n \left(\log\sigma_i - \log\sigma\right) = -\frac{\gamma}{n}\sum_{i=1}^n \frac{\varepsilon_i}{(\alpha_i^\star)^2} + \frac{1}{n}\sum_{i=1}^n \frac{\eta_i}{\sigma_i^\star}$$
and
$$\begin{aligned} \frac{1}{n}\sum_{i=1}^n E[Y_i^2] - E[Y^2] = {} & \frac{\gamma^2 + \pi^2/6}{n}\sum_{i=1}^n \left(\frac{1}{\alpha_i^2} - \frac{1}{\alpha^2}\right) - \frac{2\gamma^2}{n}\sum_{i=1}^n \left(\frac{1}{\alpha_i} - \frac{1}{\alpha}\right) \\ & + \frac{2\gamma}{n}\sum_{i=1}^n \left[\left(\frac{1}{\alpha_i} - 1\right)\log\sigma_i - \left(\frac{1}{\alpha} - 1\right)\log\sigma\right] + \frac{1}{n}\sum_{i=1}^n \left[(\log\sigma_i)^2 - (\log\sigma)^2\right] \\ = {} & -\frac{\gamma^2 + \pi^2/6}{n}\sum_{i=1}^n \frac{2\varepsilon_i}{\tilde{\alpha}_i^3} + \frac{2\gamma^2}{n}\sum_{i=1}^n \frac{\varepsilon_i}{(\alpha_i^\star)^2} + \frac{2\gamma}{n}\sum_{i=1}^n \left[\frac{1}{\check{\sigma}_i}\left(\frac{1}{\check{\alpha}_i} - 1\right)\eta_i - \frac{\log\check{\sigma}_i}{\check{\alpha}_i^2}\varepsilon_i\right] + \frac{1}{n}\sum_{i=1}^n \frac{2\log\tilde{\sigma}_i}{\tilde{\sigma}_i}\eta_i, \end{aligned}$$
where $\tilde{\alpha}_i$, $\alpha_i^\star$, $\check{\alpha}_i$ (respectively $\tilde{\sigma}_i$, $\sigma_i^\star$, $\check{\sigma}_i$) lie between $\alpha$ and $\alpha_i$ (respectively between $\sigma$ and $\sigma_i$). It follows that
$$\left|\frac{1}{n}\sum_{i=1}^n E[Y_i] - E[Y]\right| \leq \frac{\gamma}{\alpha^2(1 - c_\alpha)^2}\,\frac{1}{n}\sum_{i=1}^n |\varepsilon_i| + \frac{1}{\sigma(1 - c_\sigma)}\,\frac{1}{n}\sum_{i=1}^n |\eta_i|$$
and
$$\begin{aligned} \left|\frac{1}{n}\sum_{i=1}^n E[Y_i^2] - E[Y^2]\right| \leq {} & \left(\frac{2\gamma^2 + \pi^2/3}{\alpha^3(1 - c_\alpha)^3} + \frac{2\gamma^2}{\alpha^2(1 - c_\alpha)^2} + \frac{2\gamma\max\left(|\log(\sigma - \sigma c_\sigma)|, |\log(\sigma + \sigma c_\sigma)|\right)}{\alpha^2(1 - c_\alpha)^2}\right) \frac{1}{n}\sum_{i=1}^n |\varepsilon_i| \\ & + \left(\frac{2\gamma}{\sigma(1 - c_\sigma)}\left(\frac{1}{\alpha(1 - c_\alpha)} + 1\right) + \frac{2\max\left(|\log(\sigma - \sigma c_\sigma)|, |\log(\sigma + \sigma c_\sigma)|\right)}{\sigma(1 - c_\sigma)}\right) \frac{1}{n}\sum_{i=1}^n |\eta_i|. \end{aligned}$$
Under the conditions $\frac{1}{n}\sum_{i=1}^n |\varepsilon_i| \to 0$ and $\frac{1}{n}\sum_{i=1}^n |\eta_i| \to 0$, we thus have
$$\frac{1}{n}\sum_{i=1}^n E[Y_i^2] \xrightarrow[n\to\infty]{} E[Y^2] \quad \text{and} \quad \frac{1}{n}\sum_{i=1}^n E[Y_i] \xrightarrow[n\to\infty]{} E[Y].$$
By the continuous mapping theorem with $g(x, y) = \pi\Big/\sqrt{\max\left(6(y - x^2) - \frac{\pi^2}{2},\, \frac{\pi^2}{4}\right)}$, we obtain
$$\widehat{\alpha}^{(n)}_{LOG} - \alpha \xrightarrow{a.s.} 0.$$
Proof of Proposition 4.2
Define the covariance matrix $\Sigma^{(n)}_{\alpha,\sigma}$ as follows:
$$\Sigma^{(n)}_{\alpha,\sigma} := \begin{pmatrix} \frac{1}{n}\sum_{i=1}^n \mathrm{Var}(Y_i) & \frac{1}{n}\sum_{i=1}^n \mathrm{Cov}(Y_i, Y_i^2) \\ \frac{1}{n}\sum_{i=1}^n \mathrm{Cov}(Y_i, Y_i^2) & \frac{1}{n}\sum_{i=1}^n \mathrm{Var}(Y_i^2) \end{pmatrix}.$$
Under the conditions of Proposition 4.1, we have
$$\Sigma^{(n)}_{\alpha,\sigma} \xrightarrow[n\to\infty]{} \Sigma_{\alpha,\sigma} := \begin{pmatrix} \mathrm{Var}(Y) & \mathrm{Cov}(Y, Y^2) \\ \mathrm{Cov}(Y, Y^2) & \mathrm{Var}(Y^2) \end{pmatrix}.$$
By the central limit theorem for non-identically distributed random variables (Theorem 3.3.9 in Sen and Singer (1994)), since $\sup_k E\left\|\begin{pmatrix} Y_k \\ Y_k^2 \end{pmatrix}\right\|^4 < \infty$, we have
$$\sqrt{n}\left(\begin{pmatrix} \frac{1}{n}\sum_{i=1}^n Y_i \\ \frac{1}{n}\sum_{i=1}^n Y_i^2 \end{pmatrix} - \begin{pmatrix} \frac{1}{n}\sum_{i=1}^n E[Y_i] \\ \frac{1}{n}\sum_{i=1}^n E[Y_i^2] \end{pmatrix}\right) \xrightarrow{d} \mathcal{N}(0, \Sigma_{\alpha,\sigma}).$$
Under the conditions of Proposition 4.2, we get $\sqrt{n}\left(\frac{1}{n}\sum_{i=1}^n E[Y_i^2] - E[Y^2]\right) \xrightarrow[n\to\infty]{} 0$ and $\sqrt{n}\left(\frac{1}{n}\sum_{i=1}^n E[Y_i] - E[Y]\right) \xrightarrow[n\to\infty]{} 0$, and then
$$\sqrt{n}\left(\begin{pmatrix} \frac{1}{n}\sum_{i=1}^n Y_i \\ \frac{1}{n}\sum_{i=1}^n Y_i^2 \end{pmatrix} - \begin{pmatrix} E[Y] \\ E[Y^2] \end{pmatrix}\right) \xrightarrow{d} \mathcal{N}(0, \Sigma_{\alpha,\sigma}).$$
By applying the delta method with $g(x, y) = \pi\Big/\sqrt{\max\left(6(y - x^2) - \frac{\pi^2}{2},\, \frac{\pi^2}{4}\right)}$, we obtain the result.
Proof of Proposition 4.3
By the strong law of large numbers for non-identically distributed random variables (Theorem 2.3.10 in Sen and Singer (1994)), we get
$$\frac{1}{n}\sum_{i=1}^n Y_i^2 - \left(\frac{1}{n}\sum_{i=1}^n Y_i\right)^2 - \left[\frac{1}{n}\sum_{i=1}^n E[Y_i^2] - \left(\frac{1}{n}\sum_{i=1}^n E[Y_i]\right)^2\right] \xrightarrow{a.s.} 0.$$
With the same calculation as before, we have
$$\frac{1}{n}\sum_{i=1}^n E[Y_i^2] - E[Y^2] = \frac{\gamma^2 + \pi^2/6}{n}\sum_{i=1}^n E\left[\frac{1}{\alpha_i^2} - \frac{1}{\alpha^2}\right] + \frac{2\gamma\log\sigma - 2\gamma^2}{n}\sum_{i=1}^n E\left[\frac{1}{\alpha_i} - \frac{1}{\alpha}\right] = -\frac{\gamma^2 + \pi^2/6}{n}\sum_{i=1}^n E\left[\frac{2\varepsilon_i}{\tilde{\alpha}_i^3}\right] - \frac{2\gamma\log\sigma - 2\gamma^2}{n}\sum_{i=1}^n E\left[\frac{\varepsilon_i}{(\alpha_i^\star)^2}\right]$$
and
$$\frac{1}{n}\sum_{i=1}^n E[Y_i] - E[Y] = \frac{\gamma}{n}\sum_{i=1}^n E\left[\frac{1}{\alpha_i} - \frac{1}{\alpha}\right] = -\frac{\gamma}{n}\sum_{i=1}^n E\left[\frac{\varepsilon_i}{(\alpha_i^\star)^2}\right],$$
with $\alpha_i^\star$ and $\tilde{\alpha}_i \in (\min(\alpha, \alpha_i), \max(\alpha, \alpha_i))$.
As $\alpha_i^\star$ and $\tilde{\alpha}_i$ are almost surely included in $[\alpha(1 - c_\alpha), \alpha(1 + c_\alpha)]$, we get
$$\left|\frac{1}{n}\sum_{i=1}^n E[Y_i^2] - E[Y^2]\right| \leq \left(\frac{2\gamma^2 + \pi^2/3}{(\alpha(1 - c_\alpha))^3} + \frac{|2\gamma\log\sigma - 2\gamma^2|}{(\alpha(1 - c_\alpha))^2}\right) \frac{1}{n}\sum_{i=1}^n E[|\varepsilon_i|]$$
and
$$\left|\frac{1}{n}\sum_{i=1}^n E[Y_i] - E[Y]\right| \leq \frac{\gamma}{(\alpha(1 - c_\alpha))^2}\,\frac{1}{n}\sum_{i=1}^n E[|\varepsilon_i|].$$
If we suppose $\frac{1}{n}\sum_{i=1}^n E[|\varepsilon_i|] \to 0$, we get
$$\frac{1}{n}\sum_{i=1}^n E[Y_i^2] \xrightarrow[n\to\infty]{} E[Y^2] \quad \text{and} \quad \frac{1}{n}\sum_{i=1}^n E[Y_i] \xrightarrow[n\to\infty]{} E[Y].$$
The central limit theorem is proved as in Proposition 4.2.
Acknowledgment
We thank the referees for their valuable comments which helped to improve this article.
J. Lévy Véhel gratefully acknowledges financial support from SMABTP.
Bibliography
Aït-Sahalia, Y. and Jacod, J. (2009). Estimating the degree of activity of jumps in high frequency
data. The Annals of Statistics, 37.
Belomestny, D. (2010). Spectral estimation of the fractional order of a Lévy process. Ann. Statist., 38(1):317–351.
Bergström, H. (1952). On some expansions of stable distribution functions. Arkiv för Matem-
atik, 2.
Bianchi, S., Pantanella, A., and Pianese, A. (2013). Modeling stock prices by multifractional Brownian motion: an improved estimation of the pointwise regularity. Quantitative Finance, 13(8):1317–1330.
Corlay, S., Lebovits, J., and Véhel, J. L. (2014). Multifractional stochastic volatility models.
Mathematical Finance, 24(2):364–402.
de Haan, L. and Ferreira, A. (2006). Extreme Value Theory: An Introduction. Springer, 1st edition.
Dimitriadis, I. A., Alberola-López, C., Martín-Fernández, M., de-la Higuera, P. C., Simmross-
Wattenberg, F., and Asensio-Pérez, J. I. (2011). Anomaly detection in network traffic based
on statistical inference and α-stable modeling. IEEE Transactions on Dependable and Secure
Computing, 8:494–509.
DuMouchel, W. (1983). Estimating the stable index αin order to measure tail thickness: A
critique. Annals of Statistics, 11.
DuMouchel, W. H. (1973). On the asymptotic normality of the maximum-likelihood estimate
when sampling from a stable distribution. The Annals of Statistics, 1.
DuMouchel, W. H. (1975). Stable distributions in statistical inference: 2. information from
stably distributed samples. Journal of the American Statistical Association, 70.
Falconer, K. J. and Lévy Véhel, J. (2009). Multifractional, multistable, and other processes with
prescribed local form. Journal of Theoretical Probability, 22(2):375–401.
Falconer, K. J. and Lévy Véhel, J. (2018a). Self-stabilizing processes. Stochastic Models.
Falconer, K. J. and Lévy Véhel, J. (2018b). Self-stabilizing processes based on random signs.
Journal of Theoretical Probability.
Fama, E. F. and Roll, R. (1971). Parameter estimates for symmetric stable distributions. Journal of the American Statistical Association, 66:331–338.
33
Fischer, R., Akay, M., Castiglioni, P., and Di Rienzo, M. (2003). Multi-and monofractal indices
of short-term heart rate variability. Medical and Biological Engineering and Computing,
41(5):543–549.
Fofack, H. and Nolan, J. P. (1999). Tail behavior, modes and other characteristics of stable
distributions. Extremes, 2(1):39–58.
Frezza, M. (2018). A fractal-based approach for modeling stock price variations. Chaos: An
Interdisciplinary Journal of Nonlinear Science, 28(9):091102.
Hall, P. (1982). On some simple estimates of an exponent of regular variation. Journal of the
Royal Statistical Society: Series B (Methodological), 44(1):37–42.
Hill, B. M. (1975). A simple general approach to inference about the tail of a distribution. The
Annals of Statistics, 3.
Jing, B.-Y., Kong, X.-B., Liu, Z., and Mykland, P. (2012). On the jump activity index for
semimartingales. Journal of Econometrics, 166(2):213 – 223.
Kateregga, M., Mataramvura, S., Taylor, D., and Zhang, X. (2017). Parameter estimation for
stable distributions with application to commodity futures log-returns. Cogent Economics &
Finance, 5.
Koutrouvelis, I. A. (1980). Regression-type estimation of the parameters of stable laws. J.
Amer. Statist. Assoc., 75(372):918–928.
Koutrouvelis, I. A. (1981). An iterative procedure for the estimation of the parameters of stable
laws. Comm. Statist. B—Simulation Comput., 10(1):17–28.
Kuruoglu, E. E. (2001). Density parameter estimation of skewed alpha-stable distributions.
IEEE Transactions on Signal Processing, 49(10):2192–2201.
Lavancier, F. and Rochet, P. (2016). A general procedure to combine estimators. Comput.
Statist. Data Anal., 94:175–192.
Le Guével, R. (2013). An estimation of the stability and the localisability functions of multi-
stable processes. Electron. J. Stat., 7:1129–1166.
Le Guével, R. and Lévy Véhel, J. (2012). A Ferguson - Klass - LePage series representation of
multistable multifractional processes and related processes. Bernoulli, 18(4):1099–1127.
Ma, X. and Nikias, C. L. (1995). Parameter estimation and blind channel identification in
impulsive signal environments. IEEE transactions on signal processing, 43(12):2884–2897.
Mandelbrot, B. B. (1997). Fractals and Scaling In Finance: Discontinuity, Concentration, Risk.
Springer Publishing Company, Incorporated, 1st edition.
34
McCulloch, J. (1986). Simple consistent estimators of stable distribution parameters. Comm.
Statist. B—Simulation Comput., 15(4):1109–1136.
McCulloch, J. (1997). Measuring tail thickness to estimate the stable index alpha: A critique.
Journal of Business & Economic Statistics, 15:74–81.
Nolan, J. P. (2001). Maximum likelihood estimation and diagnostics for stable distributions. In
Lévy processes, pages 379–400. Springer.
Resnick, S. I. (2007). Heavy-tail phenomena: probabilistic and statistical modeling. Springer
series in operations research and financial engineering. Springer.
Salas-Gonzalez, D., Górriz, J. M., Ramírez, J., Illán, I. A., and Lang, E. W. (2013). Linear intensity normalization of FP-CIT SPECT brain images using the α-stable distribution. NeuroImage, 65:449–455.
Samorodnitsky, G. and Taqqu, M. S. (1994). Stable non-Gaussian random processes. Stochastic
Modeling. Chapman & Hall, New York. Stochastic models with infinite variance.
Sen, P. and Singer, J. (1994). Large Sample Methods in Statistics: An Introduction with Appli-
cations. Chapman & Hall/CRC Texts in Statistical Science. Taylor & Francis.
Stoev, S. and Taqqu, M. (2004). Simulation methods for linear fractional stable motion and FARIMA using the fast Fourier transform. Fractals, 12(1):95–121.
Wang, X., Li, K., Gao, P., and Meng, S. (2015). Research on parameter estimation methods for
alpha stable noise in a laser gyroscope’s random error. Sensors, 15(8):18550–18564.
Weron, R. (1995). Performance of the estimators of stable law parameters. HSC Research
Reports HSC/95/01, Hugo Steinhaus Center, Wroclaw University of Technology.
Yang, C., Hsu, K., and Chen, K. (2009). The use of the Lévy-stable distribution for geophysical data analysis. Hydrogeology Journal, 17(5):1265–1273.