
The Laplace Distribution and Generalizations


Abstract

In this part we discuss current results on multivariate Laplace distributions and their generalizations. The field is relatively unexplored, and the subject matter is quite fresh and somewhat fragmented; thus our account is intentionally concise. In our opinion, some period of digestion is required to put these results in a proper perspective. Hopefully, a separate monograph will be available on this burgeoning area of statistical distributions in the not-too-distant future.
The Laplace Distribution
and Generalizations
A Revisit with New Applications
Samuel Kotz
Department of Engineering Management & System Engineering
The George Washington University, Washington, DC 20052
Tomasz J. Kozubowski
Department of Mathematics
University of Nevada, Reno, NV 89557
Krzysztof Podgórski
Department of Mathematical Sciences
Indiana University–Purdue University, Indianapolis, IN 46202
January 24, 2001
To Rosalie and our children and grandchildren
S.K.
To Ania, Joseph, and Kamil
T.J.K.
To my Parents
K.P.
Contents
Preface ix
Abbreviations and Notation xiii
I Univariate Distributions 1
1 Historical background 3
2 Classical symmetric Laplace distribution 17
2.1 Definition and basic properties . . . . . . . . . . . . . . . . 19
2.1.1 Density and distribution functions . . . . . . . . . . 19
2.1.2 Characteristic and moment generating functions . . 21
2.1.3 Moments and related parameters . . . . . . . . . . . 23
2.2 Representations and characterizations . . . . . . . . . . . . 26
2.2.1 Mixture of normal distributions . . . . . . . . . . . 26
2.2.2 Relation to exponential distribution . . . . . . . . . 28
2.2.3 Relation to Pareto distribution . . . . . . . . . . . . 29
2.2.4 Relation to 2 × 2 unit normal determinants . . . . . 30
2.2.5 An orthogonal representation . . . . . . . . . . . . . 31
2.2.6 Stability with respect to geometric summation . . . 32
2.2.7 Distributional limits of geometric sums . . . . . . . . 36
2.2.8 Stability with respect to the ordinary summation . . 39
2.2.9 Distributional limits of deterministic sums . . . . . . 42
2.3 Functions of Laplace random variables . . . . . . . . . . . . 43
2.3.1 The distribution of the sum of independent Laplace
variates . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.3.2 The distribution of the product of two independent
Laplace variates . . . . . . . . . . . . . . . . . . . . 49
2.3.3 The distribution of the ratio of two independent Laplace
variates . . . . . . . . . . . . . . . . . . . . . . . . . 50
2.3.4 The t-statistic for a double exponential (Laplace) dis-
tribution . . . . . . . . . . . . . . . . . . . . . . . . 53
2.4 Further properties . . . . . . . . . . . . . . . . . . . . . . . 56
2.4.1 Infinite divisibility . . . . . . . . . . . . . . . . . . . 56
2.4.2 Geometric infinite divisibility . . . . . . . . . . . . . 59
2.4.3 Self-decomposability . . . . . . . . . . . . . . . . . . 59
2.4.4 Complete monotonicity . . . . . . . . . . . . . . . . 61
2.4.5 Maximum entropy property . . . . . . . . . . . . . . 62
2.5 Order statistics . . . . . . . . . . . . . . . . . . . . . . . . . 65
2.5.1 Distribution of a single order statistic . . . . . . . . 65
2.5.2 Joint distributions of order statistics . . . . . . . . . 68
2.5.3 Moments of order statistics . . . . . . . . . . . . . . 74
2.5.4 Representation of order statistics via sums of expo-
nentials . . . . . . . . . . . . . . . . . . . . . . . . . 77
2.6 Statistical inference . . . . . . . . . . . . . . . . . . . . . . . 78
2.6.1 Point estimation . . . . . . . . . . . . . . . . . . . . 81
2.6.2 Interval estimation . . . . . . . . . . . . . . . . . . . 112
2.6.3 Tolerance intervals . . . . . . . . . . . . . . . . . . . 121
2.6.4 Testing hypothesis . . . . . . . . . . . . . . . . . . . 126
2.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
3 Asymmetric Laplace distributions 163
3.1 Definition and basic properties . . . . . . . . . . . . . . . . 167
3.1.1 An alternative parameterization and special cases . . 167
3.1.2 Standardization . . . . . . . . . . . . . . . . . . . . . 169
3.1.3 Densities and their properties . . . . . . . . . . . . . 169
3.1.4 Moment and cumulant generating functions . . . . . 173
3.1.5 Moments and related parameters . . . . . . . . . . . 174
3.2 Representations . . . . . . . . . . . . . . . . . . . . . . . . . 178
3.2.1 Mixture of normal distributions . . . . . . . . . . . . 178
3.2.2 Convolution of exponential distributions . . . . . . . 180
3.2.3 Self-decomposability . . . . . . . . . . . . . . . . . . 181
3.2.4 Relation to 2 × 2 normal determinants . . . . . . . . 182
3.3 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
3.4 Characterizations and further properties . . . . . . . . . . . 185
3.4.1 Infinite divisibility . . . . . . . . . . . . . . . . . . . 185
3.4.2 Geometric infinite divisibility . . . . . . . . . . . . . 187
3.4.3 Distributional limits of geometric sums . . . . . . . . 188
3.4.4 Stability with respect to geometric summation . . . 191
3.4.5 Maximum entropy property . . . . . . . . . . . . . . 192
3.5 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
3.5.1 Maximum likelihood estimation . . . . . . . . . . . . 196
3.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
4 Related distributions 223
4.1 Bessel function distribution . . . . . . . . . . . . . . . . . . 224
4.1.1 Definition and parameterizations . . . . . . . . . . . 224
4.1.2 Representations . . . . . . . . . . . . . . . . . . . . . 226
4.1.3 Self-decomposability . . . . . . . . . . . . . . . . . . 230
4.1.4 Densities . . . . . . . . . . . . . . . . . . . . . . . . 235
4.1.5 Moments . . . . . . . . . . . . . . . . . . . . . . . . 239
4.2 Laplace motion . . . . . . . . . . . . . . . . . . . . . . . . . 241
4.2.1 Symmetric Laplace motion . . . . . . . . . . . . . . 241
4.2.2 Representations . . . . . . . . . . . . . . . . . . . . . 243
4.2.3 Asymmetric Laplace motion . . . . . . . . . . . . . . 247
4.3 Linnik distribution . . . . . . . . . . . . . . . . . . . . . . . 249
4.3.1 Characterizations . . . . . . . . . . . . . . . . . . . . 251
4.3.2 Representations . . . . . . . . . . . . . . . . . . . . . 256
4.3.3 Densities and distribution functions . . . . . . . . . 259
4.3.4 Moments and tail behavior . . . . . . . . . . . . . . 267
4.3.5 Properties . . . . . . . . . . . . . . . . . . . . . . . . 268
4.3.6 Simulation . . . . . . . . . . . . . . . . . . . . . . . 270
4.3.7 Estimation . . . . . . . . . . . . . . . . . . . . . . . 271
4.3.8 Extensions . . . . . . . . . . . . . . . . . . . . . . . 275
4.4 Other cases . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
4.4.1 Log-Laplace distribution . . . . . . . . . . . . . . . . 276
4.4.2 Generalized Laplace distribution . . . . . . . . . . . 277
4.4.3 Sargan distribution . . . . . . . . . . . . . . . . . . . 277
4.4.4 Geometric stable laws . . . . . . . . . . . . . . . . . 278
4.4.5 ν-stable laws . . . . . . . . . . . . . . . . . . . . . . 279
4.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280
II Multivariate distributions 285
5 Symmetric multivariate Laplace distribution 289
5.1 Bivariate case . . . . . . . . . . . . . . . . . . . . . . . . . . 289
5.1.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . 289
5.1.2 Moments . . . . . . . . . . . . . . . . . . . . . . . . 290
5.1.3 Densities . . . . . . . . . . . . . . . . . . . . . . . . 291
5.1.4 Simulation of bivariate Laplace variates . . . . . . . 292
5.2 General symmetric multivariate case . . . . . . . . . . . . . 295
5.2.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . 295
5.2.2 Moments and densities . . . . . . . . . . . . . . . . . 296
5.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
6 Asymmetric multivariate Laplace distribution 299
6.1 Bivariate case: definition and basic properties . . . . . . . 301
6.1.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . 301
6.1.2 Moments . . . . . . . . . . . . . . . . . . . . . . . . 301
6.1.3 Densities . . . . . . . . . . . . . . . . . . . . . . . . 302
6.1.4 Simulation of bivariate asymmetric Laplace variates . 303
6.2 General multivariate asymmetric case . . . . . . . . . . . . 306
6.2.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . 306
6.2.2 Special cases . . . . . . . . . . . . . . . . . . . . . . 307
6.3 Representations . . . . . . . . . . . . . . . . . . . . . . . . . 308
6.3.1 Basic representation . . . . . . . . . . . . . . . . . . 308
6.3.2 Polar representation . . . . . . . . . . . . . . . . . . 309
6.3.3 Subordinated Brownian motion . . . . . . . . . . . . 311
6.4 Simulation algorithm . . . . . . . . . . . . . . . . . . . . . . 311
6.5 Moments and densities . . . . . . . . . . . . . . . . . . . . . 312
6.5.1 Mean vector and covariance matrix . . . . . . . . . . 312
6.5.2 Densities in the general case . . . . . . . . . . . . . . 312
6.5.3 Densities in the symmetric case . . . . . . . . . . . . 314
6.5.4 Densities in the one dimensional case . . . . . . . . . 314
6.5.5 Densities in the case of odd dimension . . . . . . . . 314
6.6 Unimodality . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
6.6.1 Unimodality . . . . . . . . . . . . . . . . . . . . . . 314
6.6.2 A related representation . . . . . . . . . . . . . . . . 316
6.7 Conditional distributions . . . . . . . . . . . . . . . . . . . . 317
6.7.1 Conditional distributions . . . . . . . . . . . . . . . 317
6.7.2 Conditional mean and covariance matrix . . . . . . . 318
6.8 Linear transformations . . . . . . . . . . . . . . . . . . . . . 319
6.8.1 Linear combinations . . . . . . . . . . . . . . . . . . 319
6.8.2 Linear regression . . . . . . . . . . . . . . . . . . . . 321
6.9 Infinite divisibility properties . . . . . . . . . . . . . . . . . 322
6.9.1 Infinite divisibility . . . . . . . . . . . . . . . . . . . 322
6.9.2 Asymmetric Laplace motion . . . . . . . . . . . . . . 323
6.9.3 Geometric infinite divisibility . . . . . . . . . . . . . 324
6.10 Stability properties . . . . . . . . . . . . . . . . . . . . . . . 325
6.10.1 Limits of random sums . . . . . . . . . . . . . . . . 325
6.10.2 Stability under random summation . . . . . . . . . . 326
6.10.3 Stability of deterministic sums . . . . . . . . . . . . 328
6.11 Linear regression with Laplace errors . . . . . . . . . . . . 329
6.11.1 Least-squares estimation . . . . . . . . . . . . . . . . 329
6.11.2 Estimation of σ² . . . . . . . . . . . . . . . . . . . . 330
6.11.3 The distributions of standard t and F statistics . . . 332
6.11.4 Inference from the estimated regression function . . 333
6.11.5 Maximum likelihood estimation . . . . . . . . . . . . 334
6.11.6 Bayesian estimation . . . . . . . . . . . . . . . . . . 336
6.12 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
III Applications 343
7 Engineering sciences 347
7.1 Detection in the presence of Laplace noise . . . . . . . . . . 347
7.2 Encoding and decoding of analog signals . . . . . . . . . . . 351
7.3 Optimal quantizer in image and speech compression . . . . 352
7.4 Fracture problems . . . . . . . . . . . . . . . . . . . . . . . 356
7.5 Wind shear data . . . . . . . . . . . . . . . . . . . . . . . . 358
7.6 Error distributions in navigation . . . . . . . . . . . . . . . 359
8 Financial data 363
8.1 Underreported data . . . . . . . . . . . . . . . . . . . . . 364
8.2 Interest rates data . . . . . . . . . . . . . . . . . . . . . . . 365
8.3 Currency exchange rates . . . . . . . . . . . . . . . . . . . 367
8.4 Share market return models . . . . . . . . . . . . . . . . . . 369
8.4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . 369
8.4.2 Stock market returns . . . . . . . . . . . . . . . . . . 370
8.5 Option pricing . . . . . . . . . . . . . . . . . . . . . . . . . 372
8.6 Stochastic variance Value-at-Risk models . . . . . . . . . . 374
8.7 A jump diffusion model for asset pricing with Laplace dis-
tributed jump-sizes . . . . . . . . . . . . . . . . . . . . . . . 379
8.8 Price changes modeled by Laplace-Weibull mixtures . . . . 380
9 Inventory management and quality control 381
9.1 Demand during lead time . . . . . . . . . . . . . . . . . . . 381
9.2 Acceptance sampling for Laplace distributed quality charac-
teristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383
9.3 Steam generator inspection . . . . . . . . . . . . . . . . . . 385
9.4 Adjustment of statistical process control . . . . . . . . . . . 386
9.5 Duplicate check-sampling of the metallic content . . . . . . 387
10 Astronomy, biological and environmental sciences 391
10.1 Sizes of beans, sand particles, and diamonds . . . . . . . . . 391
10.2 Pulses in long bright gamma-ray bursts . . . . . . . . . . . 393
10.3 Random fluctuations of response rate . . . . . . . . . . . . . 393
10.4 Modeling low dose responses . . . . . . . . . . . . . . . . . . 395
10.5 Multivariate elliptically contoured distributions for repeated
measurements . . . . . . . . . . . . . . . . . . . . . . . . . . 395
10.6 ARMA models with Laplace noise in the environmental time
series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 396
A Bessel functions 399
References 403
Index 433
Preface
The aim of this monograph is quite modest: It attempts to be a systematic
exposition of all that appeared in the literature and was known to us by
the end of the 20th century on the Laplace distribution and its numer-
ous generalizations and extensions. We have tried to cover both theoretical
developments and applications. There were two main reasons for writing
this book. The first was our conviction that the number of areas and sit-
uations in which the Laplace distribution naturally occurs is so extensive
that tracking the original sources becomes unfeasible. The second was our
observation of the growing demand for statistical distributions having prop-
erties tangent to those exhibited by the Laplace laws. We feel that these
two “necessary” conditions for the existence of a monograph on a given subject
justified our efforts, which led to this book.
There are many details which are arranged primarily for reference work,
such as the inclusion of the most commonly used terminology and notation. In
several cases, we have also proposed unification to overcome the ambiguity
of notions so often present in this area. An unavoidable coloring of choices
by personal taste may have done some injustice to the subject matter by
omitting or emphasizing certain topics due to space limitations. We trust
that this feature does not constitute a serious drawback; in our literature
search we tried to leave no stone unturned (we have collected over 400
references).
Because we view this monograph as a textbook, the exposition in the ear-
lier chapters proceeds at a rather pedestrian pace and each part of the book
presupposes all earlier developments. A slightly more advanced approach is
taken in the second part of the book, where quite a few results obtained by
the authors appear in print for the first time.
The exercises are supposed to be an integral part of the discussion but a
number of them are intended simply to aid in understanding the concepts
employed.
The monograph should be read (and studied!) with a constant reminder
that it aims to provide an alternative to the dominance of the “normal”
law (the eponymous “Gaussian distribution”), which reigned almost without
opposition in statistical theory and applications for almost two centuries.
We have tried to make sufficiently precise statements while striving to
keep the mathematical level of the book appealing to the widest pos-
sible readership, including users of distribution theory in various applied
sciences. We hope, however, that we did not overplay the simplicity card,
which has been so popular among expositors of probabilistic and statistical
concepts over the last two decades or so. The prerequisites are calculus,
matrix algebra, and familiarity with the basic concepts of probability theory
and statistical inference. As always, the most desirable prerequisites for
books of this kind are those ill-defined qualities of mathematical sophisti-
cation and an understanding of the intricate nature of the somewhat elusive
probabilistic reasoning.
Since so much of this book is a synthesis of other people’s work, the text
and the extensive bibliography (which reflects the rich diversity of sources)
themselves must stand as an expression of our intellectual gratitude to the
pioneers and contributors to the subject matter of the monograph. Spe-
cial thanks are also due to the librarians at the George Washington University
(first and foremost to Mrs. Debra Bensazon), Indiana University–Purdue
University, Indianapolis, the University of California at Santa Barbara, and
the University of Nevada at Reno, who generously assisted us in digging
out sources related to the Laplace distributions. Modern communication
technology tremendously facilitated overcoming the problem of the
“academic geography” among authors located at opposite corners
of the United States and at its geographical midpoint. We tender our very
warm thanks to Ms. Ann Kostant and to Mr. Tom Grasso, our edi-
tors at Birkhäuser in Boston, for their efficient, expeditious, and
meticulous handling of the production of this monograph.
We hope that the monograph will trigger additional theoretical research
and provide tools which will generate further fruitful applications of the
presented distributions in various branches of the life and behavioral sciences.
It is the applications that provide the special vitality to probabilistic laws,
which in our opinion are of permanent interest in their own right, from both
mathematical and conceptual standpoints. We wish our readers a most
pleasant and instructive journey when sailing (leisurely or rapidly) through
the text.
S.K.
Washington, D.C.
T.J.K.
Reno, Nevada
K.P.
Indianapolis, Indiana
July, 2000
Abbreviations and Notation
AAI average adjustment interval
ABLUE asymptotically best linear unbiased estimator
AD Anderson-Darling (test)
AL asymmetric Laplace
AMP asymptotically most powerful
ARE asymptotic relative efficiency
BAL bivariate asymmetric Laplace
BLUE best linear unbiased estimator
c.d.f. cumulative distribution function
ch.f. characteristic function
CLT central limit theorem
CvM Cramér-von Mises (test)
EC elliptically contoured
GAL generalized asymmetric Laplace
GGC generalized gamma convolution
GIG generalized inverse Gaussian
GS geometric stable
i.i.d. independent, identically distributed
LLN law of large numbers
LMP locally most powerful
LSE least-squares estimator
MLE maximum likelihood estimator
MME method of moments estimator
MR midrange
MSD mean squared deviation
MSE mean squared error
OC operating characteristic (curve)
p.d.f. probability density function
r.v. random variable (vector)
UMVU uniformly minimum variance unbiased
UMP uniformly most powerful (test)
VaR Value-at-Risk
A′  the transpose of a matrix A
|A|  the determinant of a square matrix A
AL(θ, µ, σ)  univariate AL law with mode at θ, mean θ + µ, and variance µ² + σ²
AL(µ, σ)  univariate AL law with mode at zero, mean µ, and variance µ² + σ²
AL(µ)  standard univariate AL law with mode at 0, mean µ, and variance µ² + 1
AL*(θ, κ, σ)  univariate AL law with mode at θ, skewness parameter κ, and scale parameter σ
AL*(κ, σ)  univariate AL law with mode at 0, skewness parameter κ, and scale parameter σ
AL*(κ)  standard univariate AL law with mode at 0, skewness parameter κ, and scale parameter 1
AL_d(m, Σ)  d-dimensional asymmetric Laplace distribution with mean m and variance-covariance matrix Σ + mm′
ALM(µ, σ, ν)  asymmetric Laplace motion
BAL(m₁, m₂, σ₁, σ₂, ρ)  bivariate asymmetric Laplace distribution
Beta(α, β)  Beta distribution with parameters α and β
BSL(σ₁, σ₂, ρ)  bivariate symmetric Laplace distribution
CL(θ, s)  classical Laplace distribution with mean θ and scale parameter s
D_n  the Kolmogorov statistic
D_n^±  the Smirnov one-sided statistic
EX  the expected value of a random variable X
E₁(x)  the exponential integral function, E₁(x) = ∫_x^∞ (e^{−t}/t) dt, x > 0
EC_d(m, Σ, g)  elliptically contoured distribution
G(α, β)  gamma distribution with shape parameter α and scale parameter β
G(α)  standard gamma distribution with scale parameter 1
GAL(θ, µ, σ, τ)  generalized asymmetric Laplace distribution (Bessel K-function distribution, variance-gamma distribution) with parameters θ, µ, σ, τ
GAL(µ, τ)  standard generalized asymmetric Laplace distribution (the GAL(θ, µ, σ, τ) distribution with θ = 0 and σ = 1)
GAL*(θ, κ, σ, τ)  generalized asymmetric Laplace distribution (Bessel K-function distribution, variance-gamma distribution) with parameters θ, κ, σ, τ
GAL*(κ, τ)  standard generalized asymmetric Laplace distribution (the GAL*(θ, κ, σ, τ) distribution with θ = 0 and σ = 1)
GAL_d(m, Σ, s)  d-dimensional generalized Laplace distribution
GIG(λ, χ, ψ)  generalized inverse Gaussian distribution
GS_α(σ, β, µ)  geometric stable distribution with index α, scale parameter σ, skewness parameter β, and location parameter µ; in particular, GS_α(σ, 0, 0) = L_{α,σ}, GS₂(s, 0, 0) = CL(0, s), GS₂(σ/√2, β, µ) = AL(0, µ, σ)
H_d(λ, α, β, δ, µ, Σ)  d-dimensional generalized hyperbolic distribution
I_d  d-dimensional identity matrix
I(θ)  the Fisher information about θ
J_λ  the Bessel function of the first kind of order λ
K_λ  the modified Bessel function of the third kind with index λ
L(θ, σ)  Laplace distribution with mean θ and variance σ²
L_{α,σ}  Linnik distribution with index α and scale parameter σ
LM(σ, ν)  symmetric Laplace motion with space-scale parameter σ and time-scale parameter ν
log  natural logarithm
N  the set of natural numbers
N(µ, σ²)  normal distribution with mean µ and variance σ²
N_d(m, Σ)  d-dimensional normal distribution with mean vector m and variance-covariance matrix Σ
o(g(x))  f(x) = o(g(x)) as x → x₀ means that f(x)/g(x) converges to zero as x → x₀
o(1)  f(x) = o(1) if the function f converges to zero
O(g(x))  f(x) = O(g(x)) as x → x₀ means that |f(x)/g(x)| is bounded for x close to x₀
O(1)  f(x) = O(1) if the function f is bounded
R  the set of real numbers
R^d  d-dimensional Euclidean space
Re(z)  the real part of z
SL_d(Σ)  d-dimensional symmetric Laplace distribution with mean zero and variance-covariance matrix Σ
s′t  the inner product of the vectors s and t
S_d  the unit sphere in R^d: {s ∈ R^d : ||s|| = 1}
sign(x)  1 for x > 0, −1 for x < 0, 0 for x = 0
||t||  (t′t)^{1/2}, the Euclidean norm of t ∈ R^d
t′  the transpose of a column vector t
U^(d)  uniform distribution on S_d
Var(X)  the variance of a random variable X
X ~ CL(θ, s)  X has the distribution CL(θ, s), etc.
[[x]]  the greatest integer less than or equal to x
x_{k:n}  the kth smallest of x₁, x₂, . . . , x_n
x⁺  x if x ≥ 0, 0 if x < 0
x⁻  x if x ≤ 0, 0 if x > 0
I_A  the indicator function of the set A
→^{a.s.}  convergence with probability one
→^{p}  convergence in probability
→^{d}  convergence in distribution
=^{d}  equality of distributions
γ₁  the coefficient of skewness
γ₂  the coefficient of kurtosis (excess kurtosis)
Γ(α)  the gamma function, Γ(α) = ∫₀^∞ x^{α−1} e^{−x} dx
κ_n(X)  the nth cumulant of the random variable X
χ²  chi-square distribution
µ_n(X)  the nth central moment of the random variable X
ν_p  geometric random variable with mean 1/p
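As a side note (ours, not part of the book), the moment conventions above are easy to check numerically: a Laplace variate can be generated as a scaled difference of two i.i.d. standard exponential variables, and scaling by σ/√2 makes the variance come out to σ², matching the L(θ, σ) entry. A minimal Python sketch:

```python
import math
import random

def laplace_sample(theta, sigma, rng):
    """Draw from L(theta, sigma): mean theta, variance sigma^2.

    Uses the representation of a Laplace variate as a scaled difference
    of two i.i.d. standard exponentials; Var(E1 - E2) = 2, so scaling
    by sigma / sqrt(2) gives variance sigma^2.
    """
    e1 = rng.expovariate(1.0)
    e2 = rng.expovariate(1.0)
    return theta + (sigma / math.sqrt(2.0)) * (e1 - e2)

rng = random.Random(2001)
theta, sigma = 1.5, 2.0
xs = [laplace_sample(theta, sigma, rng) for _ in range(200_000)]

mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)
print(round(mean, 2), round(var, 1))  # close to theta = 1.5 and sigma^2 = 4.0
```

The exponential-difference representation used here is the "relation to exponential distribution" listed in the table of contents (Section 2.2.2).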
Part I
Univariate Distributions

1
Historical background
Over 75 years ago, in a paper which appeared in the 1923 issue of the Jour-
nal of the American Statistical Association (pp. 841-852) entitled First and
Second Laws of Error, the late Professor and Head of Vital Statistics at
the Harvard School of Public Health, Edwin Bidwell Wilson (1879-1964)¹,
concurs with the economics Professor W.L. Crum’s conclusions expressed
in a paper published in the same journal in March 1923, entitled The Use
of the Median in Determining Seasonal Variation (pp. 607-614), that “good
many series of data from economic sources probably may be better treated
by the median than by the mean...” These remarks may be viewed as revolu-
tionary in that period of unquestionable dominance of the arithmetic mean
and normal distribution in statistical theory. E.B. Wilson reminds us that
the first two laws of error both originated with P.S. Laplace. The first
law, presented in 1774, states that the frequency of an error could be ex-
pressed as an exponential function of the numerical magnitude of the error,
disregarding sign, or equivalently that the logarithm of the frequency of an
error (without regard to sign) is a linear function of the error.
The second law (proposed 4 years later, in 1778) states that the fre-
quency of the error is an exponential function of the square of the error, or
equivalently that the logarithm of the frequency is a quadratic (parabolic)
function of the error. See Figure 1.1.
¹ Wilson’s name is known to many statisticians in view of the so-called Wilson-Hilferty
transformation (see Wilson, E.B. and Hilferty, M.M., Proc. Nat. Acad. Sci., 17, pp. 684-
688), a device that allows the use of a normal approximation for chi-square probabilities.
Figure 1.1: On the left, Laplace’s first frequency curve F = (k/2)e^{−kx}. On the
right, Laplace’s second (Gauss’s) frequency curve F = (1/(σ√(2π)))e^{−x²/(2σ²)}.
Each curve should be reproduced symmetrically on the other side of the central
vertical line. (The figure is taken from Wilson’s 1923 paper.) Reprinted
with permission from the Journal of the American Statistical Association.
Copyright 1923 by the American Statistical Association. All rights reserved.
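As a quick numerical aside (our illustration, not part of Wilson’s paper), both frequency curves, once reflected about the central vertical line as the caption instructs, are proper densities on the whole line. A crude trapezoidal check in Python, with k = 1 and σ = 1 chosen arbitrarily:

```python
import math

def laplace_first_law(x, k=1.0):
    # Laplace's first law on the whole line: F(x) = (k/2) e^{-k|x|}
    return (k / 2.0) * math.exp(-k * abs(x))

def gauss_second_law(x, sigma=1.0):
    # Laplace's second (Gauss's) law: F(x) = e^{-x^2/(2 sigma^2)} / (sigma sqrt(2 pi))
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))

def trapezoid(f, a, b, n=100_000):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return total * h

area_laplace = trapezoid(laplace_first_law, -40, 40)
area_gauss = trapezoid(gauss_second_law, -40, 40)
print(round(area_laplace, 6), round(area_gauss, 6))  # both ~1.0
```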
The second Laplace law is usually called the normal distribution or the
Gauss law. Wilson, among several other scholars, doubts the attribution
of that law to Gauss and remarks that Gauss “in spite of his well-known
precocity had probably not made his discovery before he was two years
old.” He notes that there are excellent mathematical reasons for the far
greater attention which has been paid to the second law, since it involves
the variable x² (if x be the error) and this is “subject to all the laws of ele-
mentary mathematical analysis,” while the first law, involving the absolute
value of the error x, is not an analytic function and presents considerable
mathematical difficulty in its manipulation.
Next, however, E.B. Wilson states that the frequencies which we actually
meet in everyday work in economics, in biometrics, or in vital statistics
very often fail to conform at all closely to the “so-called” normal distribu-
tion. He points out that the fact that in extraordinarily precise measure-
ments of astronomy of position the errors are dispersed about the mean
in accordance with the Gauss law, and that the dispersion of shots in ar-
tillery and small arms practice are covered very well by the generalization
of this law, is “no justification for attempting to force the (normal) law
with its various generalizations upon the data for which it is not fitted.”
Wilson emphasizes that it is important to examine the data themselves for
the purpose of determining the proper statistical treatment and it is by
no means safe to rush ahead and apply the second law of Laplace or the
various extensions of it developed by the Scandinavian School on the one
hand (Gram, Charlier) or the (British) Biometric School (Pearson, Yule)
on the other. He analyzes the example provided by Crum (Table 1.1).

Deviation  Frequency    Deviation  Frequency    Deviation  Frequency
Over -30*      2           -11         6             6         13
 -30           1           -10         3             7          8
 -29           1            -9         5             8          6
 -28           1            -8        11             9          5
 -24           1            -7         6            10          2
 -23           1            -6        23            11          4
 -22           1            -5        10            12          3
 -21           2            -4        13            13          1
 -20           1            -3        19            14          2
 -19           2            -2         9            15          1
 -18           2            -1        11            16          1
 -17           2             0        28            17          1
 -16           2             1        22            18          2
 -15           1             2        22            23          1
 -14           3             3        13            24          1
 -13           6             4        19            28          1
 -12           3             5        13         Over 30*       7

* Over -30: -32, -37. Over 30: 34, 35, 35, 41, 41, 42, 45.

Table 1.1: Crum’s data: frequencies of deviations from the medians (N =
324 total frequency).
He also notes that for the normal distribution, if eᵢ denotes a deviation²
from a mean and S₁ denotes the mean deviation, S₂ the mean square devia-
tion, etc. (namely

nS₁ = Σ eᵢ,  nS₂² = Σ eᵢ²,  nS₃³ = Σ eᵢ³,  nS₄⁴ = Σ eᵢ⁴ ),

the ratios Sᵢ ought to satisfy

S₁ : S₂ : S₃ : S₄ = 1.000 : 1.253 : 1.465 : 1.645.

² A deviation here means absolute deviation.
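These theoretical ratios follow from the absolute moments of the normal law: for a N(0, σ²) error, E|e|^m = σ^m 2^{m/2} Γ((m+1)/2)/√π, and S_m = (E|e|^m)^{1/m}. The short Python check below (our own sketch) reproduces 1.253 and 1.465 exactly; the fourth ratio computes to about 1.650, slightly different from the printed 1.645.

```python
import math

def abs_moment_normal(m, sigma=1.0):
    """E|e|^m for a N(0, sigma^2) error: sigma^m 2^(m/2) Gamma((m+1)/2) / sqrt(pi)."""
    return sigma ** m * 2 ** (m / 2.0) * math.gamma((m + 1) / 2.0) / math.sqrt(math.pi)

# S_m is the m-th root of the m-th absolute moment
S = {m: abs_moment_normal(m) ** (1.0 / m) for m in (1, 2, 3, 4)}
ratios = [S[m] / S[1] for m in (1, 2, 3, 4)]
print([round(r, 3) for r in ratios])  # 1.0, 1.253, 1.465, and about 1.650
```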
Commenting on these ratios, Wilson is echoing and modifying Bertrand’s
famous dictum (“if these equalities are not satisfied, someone has re-
touched and altered the immediate results of experiment”) and asserts that
“when confronted with data that do not satisfy this continued proportion
it is very obvious that the data are not distributed in frequency accord-
ing to the second law (with some latitude of departure from the straight
proportion must be permitted).” Now for the data supplied by Crum, we
have approximately

S₁ = 7.0,  S₂ = 10.3,  S₃ = 13.8,  S₄ = 17.0;

thus the ratios are

1 : 1.5 : 2.0 : 2.4,

a far cry from those to be obeyed based on the normal distribution. The
spread is just too wide, and no reasonable allowance for the behavior of
probable errors can produce so great a spread.
On the other hand, applying the first law of Laplace, where the frequency
varies as e^{−kd} (d is the numerical value of the deviation), we obtain,
after some “annoying” calculations involving calculus, the theoretical values

S₁ : S₂ : S₃ : S₄ = 1.000 : 1.414 : 1.817 : 2.213.
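The “annoying” calculations reduce to the absolute moments of the first law: if the frequency of a deviation d varies as e^{−kd}, then E d^m = m!/k^m, so S_m = (m!)^{1/m}/k and the ratios S_m/S_1 = (m!)^{1/m} do not depend on k. A short sketch of ours verifying this, together with the ratios implied by Crum’s values quoted above:

```python
import math

# Theoretical ratios under Laplace's first law: S_m / S_1 = (m!)^(1/m)
theory = [math.factorial(m) ** (1.0 / m) for m in (1, 2, 3, 4)]
print([round(r, 3) for r in theory])  # [1.0, 1.414, 1.817, 2.213]

# Crum's empirical values of S_1..S_4 as quoted by Wilson
S = [7.0, 10.3, 13.8, 17.0]
empirical = [s / S[0] for s in S]
print([round(r, 2) for r in empirical])  # roughly 1 : 1.5 : 2.0 : 2.4
```

The empirical ratios are indeed much closer to 1 : 1.414 : 1.817 : 2.213 than to the normal law’s 1 : 1.253 : 1.465 : 1.645.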
Wilson justifiably asserts that the distribution in frequency of the data is
much nearer to Laplace’s first law than to the second, and it is no longer re-
ally reasonable to maintain that the differences are within the presumptive
errors due partly to the scarcity and irregularity of material.
However, there is “a little evidence” that the observations are more dis-
persed than they would be even under the first law. To account for possible
asymmetry, Wilson suggests the classical graphical method representing the
frequency law as

f = (1/2)Nκe^{−κx},

where N = 324, the deviation is x, and the number n of deviations beyond
a given value x is

n = ∫_x^∞ f dx = (1/2)Ne^{−κx}.
Hence,

log₁₀ n = log₁₀((1/2)N) − (κ log₁₀ e)x
plots as a straight line on so-called arith-log paper with x as the abscissa
and n as the ordinate. Since for the first law of Laplace κ = 1/θ, where
θ is the mean deviation, it is reasonable to choose θ = S₁ = 7.0. (The
values of θ calculated from the four S’s are 7.0, 7.3, 7.5, and 7.7.) A fair
representation of the distribution of the data is given by f = 23e^{−x/7}
(recall that we are using the absolute value of x, the numerical value of the
deviation), and the arith-log chart constructed for the first Laplacian law,
like the probability chart for the Gaussian law, was also based on the total
integrated frequency outside a certain limit. Figure 1.2 presents a probability
plot: a chart in which the ordinates are the percentage of deviations which
are less than (left scale) or greater than (right scale) a given deviation
plotted as an abscissa under the assumption of the Gaussian law; namely,
if the Gaussian law were followed, the line would be straight.

Figure 1.2: Reproduced probability plot for the Crum’s data discussed in
Wilson’s article. Reprinted with permission from the Journal of the Amer-
ican Statistical Association. Copyright 1923 by the American Statistical
Association. All rights reserved.

Figure 1.3: Arith-log chart for the first Laplace law using Crum’s data.
Reprinted with permission from the Journal of the American Statistical
Association. Copyright 1923 by the American Statistical Association. All
rights reserved.
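Wilson’s construction is easy to imitate: with N = 324 and θ = 7 (so κ = 1/7), the expected number of deviations beyond x is n(x) = (N/2)e^{−κx}, the density at the origin is Nκ/2 ≈ 23.1 (Wilson’s fitted constant 23), and log₁₀ n(x) decreases linearly in x with slope −κ log₁₀ e. A Python sketch of ours in place of chart paper:

```python
import math

N, theta = 324, 7.0
kappa = 1.0 / theta

def n_exceeding(x):
    """Expected count of deviations beyond x under the first law."""
    return 0.5 * N * math.exp(-kappa * x)

# Density at the origin: f(0) = N * kappa / 2, close to Wilson's fitted 23
print(round(0.5 * N * kappa, 2))  # ~23.14

# log10 n(x) falls on a straight line with slope -kappa * log10(e)
xs = [0, 7, 14, 21]
logs = [math.log10(n_exceeding(x)) for x in xs]
slopes = [(logs[i + 1] - logs[i]) / 7.0 for i in range(3)]
print([round(s, 4) for s in slopes])  # all equal, about -0.062
```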
Evidently the Gaussian fit is inadequate. The straight line fitted to the 4 central points results in no deviation in the observations greater than +20 and smaller than -19. For comparison, the arith-log chart constructed for the first Laplacian law is presented in Figure 1.3. On this chart, the points $(\hat{n}, x)$ are represented for the number of empirical deviations $\hat{n}$ beyond $x$ and compared to the graph of $\log_{10} n = \log_{10}(N/2) - (\kappa \log_{10} e)x$. Examining the chart, Wilson asserts that "This chart shows on arith-log paper the number of deviations as ordinates greater than the values given as abscissae. If Laplace's first law holds, the points should lie on a straight line. The lowest set of points and the lowest line are for the negative deviations (left scale), and for them the law holds as well as could be desired. The top line and set of points are for the positive deviations; the fit to the straight dotted line is bad (right scale). The middle line and set of points are for positive and negative deviations taken together (left scale) without regard to sign, and the fit is fair, better than for the (Gaussian) curve (in Figure 1.2)."
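A small numerical sketch of Wilson's procedure (the values N = 324 and θ = 7.0, hence κ = 1/7, are taken from the text; everything else is our illustration):

```python
import math

N = 324            # total number of observations (from the text)
theta = 7.0        # Wilson's choice of the mean deviation, so kappa = 1/theta
kappa = 1.0 / theta

def predicted_count_beyond(x):
    """Predicted number n of deviations beyond x under Laplace's first law:
    n = (N/2) * exp(-kappa * x)."""
    return 0.5 * N * math.exp(-kappa * x)

def log10_line(x):
    """The same prediction on arith-log paper:
    log10 n = log10(N/2) - (kappa * log10 e) * x, linear in x."""
    return math.log10(0.5 * N) - kappa * math.log10(math.e) * x

# The two formulas agree, which is why the plotted points fall on a line.
for x in [0.0, 7.0, 14.0, 21.0]:
    assert abs(math.log10(predicted_count_beyond(x)) - log10_line(x)) < 1e-12
```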
Wilson concludes by stating that these data give internal evidence of following Laplace's first law instead of his second law and should be fitted to that law.
In spite of the prestige of the journal in which the paper appeared and the prominence of the author, Wilson's plea remained a call in the wilderness for over five decades, and only recently has attention shifted to the Laplace first law, known as the Laplace distribution or occasionally the double exponential distribution, as a candidate for fitting data in economics and health sciences.
For many years the Laplace distribution was a popular topic in probability theory, due to the simplicity of its characteristic function and density, the curious phenomenon that a random variable with an only slightly different characteristic function loses the simplicity of the density function, and numerous other attractive probabilistic features enjoyed by this distribution.
Perhaps one of the earliest sources in which the Laplace distribution is discussed as a law of errors in the English language is the 1911 paper by the famous economist and probabilist J.M. Keynes in the Journal of the Royal Statistical Society, Vol. 74, New Series (pp. 322-331).
With his usual lucidity, Keynes discusses the probability of a measurement $x_q$, assuming the real (actual) value to be $a_s$, as an algebraic function $f(x_q, a_s)$, the same function for all values of $x_q$ and $a_s$ "within the limits of the problem." The task is to find the value of $a_s$, namely $x$, which maximizes
\[ \prod_{q=1}^{m} f(x_q, x). \]
This is equivalent to solving
\[ \sum_{q=1}^{m} \frac{f'(x_q, x)}{f(x_q, x)} = 0, \]
or $\sum f'_q / f_q = 0$ for brevity. Now, the law of errors determines the form of $f(x_q, x)$, and the form of $f(x_q, x)$ determines the algebraic relation $\sum f'_q / f_q = 0$ between the measurements and the most probable value. Keynes analyzes several situations.
1. If the most probable value of the quantity is equal to the arithmetic mean of measurements $\frac{1}{m}\sum_{q=1}^{m} x_q$, then $\sum f'_q / f_q = 0$ is equivalent to $\sum (x - x_q) = 0$. Thus, $f'_q / f_q$ can be written as $\Phi''(x)(x - x_q)$, where $\Phi''(x)$ is a non-zero function independent of $x_q$. Integrating, we get
\[ \log f_q = \Phi'(x)(x - x_q) - \Phi(x) + \Psi(x_q), \]
where $\Psi(x_q)$ is a function independent of $x$. Thus,
\[ f_q = e^{\Phi'(x)(x - x_q) - \Phi(x) + \Psi(x_q)}. \]
Setting $\Phi(x) = -\kappa^2 x^2$ and $\Psi(x_q) = -\kappa^2 x_q^2 + \log A$ we obtain
\[ f_q = A e^{-\kappa^2 (x - x_q)^2} = A e^{-\kappa^2 y_q^2} \]
(where $y_q$ is the absolute magnitude of the error in the measurement $x_q$), the so-called normal law.
Keynes emphasizes that this is only one "amongst a number of possible solutions," but notes that with one additional assumption this is the only law of error leading to the arithmetic mean. The assumption is that negative and positive errors of the same absolute amount are equally likely.
Indeed, in that case $f_q$ will be of the form $B e^{\theta([x - x_q]^2)}$, where $\theta([x - x_q]^2)$ is the value of a certain real function $\theta$ evaluated at $(x - x_q)^2$. We have
\[ \Phi'(x)(x - x_q) - \Phi(x) + \Psi(x_q) = \theta([x - x_q]^2), \]
or
\[ \Phi''(x) = 2 \frac{d}{d(x - x_q)^2}\, \theta([x - x_q]^2), \]
and
\[ \frac{d}{d(x - x_q)^2}\, \theta([x - x_q]^2) = -\kappa^2, \]
where $\kappa$ is a constant, since $\Phi''(x)$ is independent of $x_q$. Thus,
\[ \theta([x - x_q]^2) = -\kappa^2 (x - x_q)^2 + \log C \]
and
\[ f_q = A e^{-\kappa^2 (x - x_q)^2}, \]
with $A = BC$.
2. Next, Keynes discusses in detail the case of the law of error when the geometric mean of the measurements leads to the most probable value of the quantity. This yields
\[ f_q = A \left( \frac{x}{x_q} \right)^{\kappa x} e^{-\kappa x}. \]
Keynes then compares it with the earlier derivation by D. McAlister in the Proceedings of the Royal Society 29 (1879), p. 365, who obtained
\[ f_q = A e^{-\kappa^2 \log^2 (x_q / x)}, \]
the well-known log-normal law.
He also notes that J.C. Kapteyn, in his monograph Skew Frequency Curves, Astronomical Laboratory, Groningen (1903), obtained a similar result.
3. Next, he discusses the law of errors implied by the harmonic mean, leading to
\[ f_q = A e^{-\kappa^2 y_q^2 / x_q}. \]
Here positive and negative errors of the same absolute magnitude are not equally likely.
4. Keynes now poses the question:
If the most probable value of the quantity is equal to the median of measurements, what is the law of error?
For this purpose he defines the median of observations and notes its property, originally proved by G.T. Fechner (1801-1887)³, who first introduced the median into use: "If x is the median of a number of magnitudes, the sum of the absolute differences (i.e., the difference always reckoned positive) between x and each of the magnitudes is a minimum." Now write $|x - x_q| = y_q$. Since $\sum_{q=1}^{m} y_q$ is to be a minimum, we must have $\sum_{q=1}^{m} \frac{x - x_q}{y_q} = 0$. Whence, proceeding as before, we have
\[ f_q = A e^{\int \frac{x - x_q}{y_q} \Phi''(x)\,dx + \Psi(x_q)}. \]
The simplest case of this is obtained by putting
\[ \Phi''(x) = -k^2, \qquad \Psi(x_q) = \frac{x - x_q}{y_q}\, k^2 x_q, \]

³ In his book Kollektivmasslehre, Leipzig, W. Engelmann (1897).
whence
\[ f_q = A e^{-k^2 |x - x_q|} = A e^{-k^2 y_q}. \]
This satisfies the additional condition that positive and negative errors of equal magnitude are equally likely. Thus, in this important respect, the median is as satisfactory as the arithmetic mean, and the law of error which leads to it is as simple. It also resembles the normal law in that it is a function of the error only, and not of the magnitude of the measurement as well.
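The derivation above can be illustrated numerically: under the median law $f_q = Ae^{-k^2 y_q}$, the log-likelihood is, up to an additive constant, $-k^2 \sum_q |x - x_q|$, so it is maximized at the sample median. A sketch (the data values are made up for illustration):

```python
import statistics

def sum_abs_dev(center, data):
    """Sum of absolute deviations; the Laplace log-likelihood is
    -k^2 * sum_abs_dev(center, data) plus a constant."""
    return sum(abs(center - x) for x in data)

data = [1.0, 2.0, 3.5, 7.0, 10.0]   # hypothetical measurements
med = statistics.median(data)

# Scan a grid of candidate centers: none beats the median.
candidates = [i / 10 for i in range(0, 121)]
best = min(candidates, key=lambda c: sum_abs_dev(c, data))
assert sum_abs_dev(med, data) <= sum_abs_dev(best, data) + 1e-12
```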
Keynes's (1911) analysis of Laplace's contribution to the first law of error is worth reproducing verbatim.
"The median law of error, $f_q = Ae^{-k^2 y_q}$, where $y_q$ is the absolute amount of the error always reckoned positive, is of some historical interest, because it was the earliest law of error to be formulated. The first attempt to bring the doctrine of averages into definite relation with the theory of probability and with laws of error was published by Laplace in 1774 in a memoir 'Sur la probabilité des causes par les événemens.'⁴ This memoir was not subsequently incorporated in his Théorie Analytique, and does not represent his more mature view. In the Théorie he drops altogether the law tentatively adopted in the memoir, and lays down the main lines of the investigation for the next hundred years by the introduction of the normal law of error. The popularity of the normal law, with the arithmetic mean and the method of least squares as its corollaries, has been very largely due to its overwhelming advantages, in comparison with all other laws of error, for the purposes of mathematical development and manipulation. And in addition to these technical advantages, it is probably applicable as a first approximation to a larger and more manageable group of phenomena than any other single law.⁵ So powerful a hold indeed did the normal law obtain on the minds of statisticians, that until quite recent times only a few pioneers have seriously considered the possibility of preferring in certain circumstances other means to the arithmetic and other laws of error to the normal. Laplace's earlier memoir fell, therefore, out of remembrance. But it remains interesting, if only for the fact that a law of error there makes its appearance for the first time."
Laplace (1774) sets himself the problem in a somewhat simplified form:
"Déterminer le milieu que l'on doit prendre entre trois observations données d'un même phénomène." He begins by assuming a law $y = \varphi(x)$ for an error, where $y$ is the probability of an error $x$; and finally, by means of a number of somewhat arbitrary assumptions (our emphasis), arrives at the result $\varphi(x) = (m/2)e^{-mx}$. If this formula is to follow from his arguments, $x$ must denote the absolute error, always taken positive. It is unlikely that Laplace was led to this result by considerations other than those by which he attempts to justify it."
"Laplace, however, did not notice that his law of error led to the median. For instead of finding the most probable value, which would have led him straight to it, he seeks the 'mean of error', the value, that is to say, which the true value is as likely to fall short of as to exceed. This value is, for the median law, laborious to find and awkward in the result. Laplace works it out correctly for the case where the observations are no more than three."

⁴ Mémoires présentés à l'Académie des Sciences, Paris, vol. vi, pp. 227-332.
⁵ We would add that the Central Limit Theorem should also be credited for this popularity.
5. Finally, Keynes deals with the case where the law of errors leads to a mode, without providing an explicit solution, and concludes with a discussion of the most general form of the law of errors when it is assumed that positive and negative errors of the same magnitude are equally probable.
He emphasizes that the most general form leading to the median is
\[ f_q = A e^{\Phi'(x) \frac{x - x_q}{y_q} + \Psi(x_q)}, \]
where $f_q$ is the probability of a measurement $x_q$ given that the true value is $x$.
Stigler (1986a) provides a somewhat different assessment of Laplace's 1774 memoir. He presents an English translation of the memoir (whose English title is Probability of the Causes of Events) and points out that Laplace was just 25 years old when the memoir appeared and that it was his first substantial work in mathematical statistics.
For our readers interested in history, it is worthwhile to reproduce Laplace's elegant and ingenious derivation of what is now referred to as the Laplace distribution. We reproduce his illustrative Figure 2 depicting his error distribution (our Figure 1.4). Here V represents the true value of the location parameter (in modern terminology). Denoting by $\varphi(x)$ the probability density of the deviation $x$ of an observation from V, in his attempt to determine this function Laplace argues as follows:
"But of an infinite number of possible functions, which choice is to be preferred? The following considerations can determine a choice. It is true (Figure 1.4) that if we have no reason to suppose the point p more probable than the point p′, we should take $\varphi(x)$ to be constant, and the curve ORM′ will be a straight line infinitely near the axis Kp. But this supposition must be rejected, because if we suppose there existed a very large number of observations of the phenomenon, it is presumed that they become rarer

Figure 1.4: An illustration (Figure 2) from Laplace's 1774 memoir.

the farther they are spread from the truth. We can also easily see that this diminution cannot be constant, that it must become less as the observations deviate more from the truth. Thus not only the ordinates of the curve RMM′, but also the differences of these ordinates, must decrease as they become further from the point V, which in this Figure we always suppose to be the true instant of the phenomenon. Now, as we have no reason to suppose a different law for the ordinates than for their differences⁶, it follows that we must, subject to the rules of probabilities, suppose the ratio of two infinitely small consecutive differences to be equal to that of the corresponding ordinates. We thus will have
\[ \frac{d\varphi(x + dx)}{d\varphi(x)} = \frac{\varphi(x + dx)}{\varphi(x)}. \]
Therefore
\[ \frac{d\varphi(x)}{dx} = -m\,\varphi(x), \]
which gives $\varphi(x) = C e^{-mx}$. Thus, this is the value that we should choose for $\varphi(x)$. The constant C should be determined from the supposition that the area of the curve ORM equals unity, which represents certainty; this gives $C = m/2$. Therefore $\varphi(x) = (m/2)e^{-mx}$, e being the number whose hyperbolic logarithm is unity.
"One can object that this law is repugnant in that if $x$ is supposed extremely large, $\varphi(x)$ will not be zero; but to this I reply that while $e^{-mx}$ indeed has a real value for all $x$, this value is so small for $x$ extremely large that it can be regarded as zero."

⁶ It is important to note that Laplace is talking here about the differences of the probability density function, not of the observations; i.e., this crucial assumption does not impose that the difference of observations should be distributed in the same way as the observations themselves (which is not true for the Laplace distribution).
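Laplace's conclusion is easy to check numerically (a sketch, not part of the memoir; the rate m below is arbitrary): the curve $\varphi(x) = (m/2)e^{-m|x|}$ has unit area, and the ratio of two consecutive small differences of ordinates equals the ratio of the ordinates themselves:

```python
import math

m = 1.5  # an arbitrary positive rate, chosen for illustration

def phi(x):
    """Laplace's error curve phi(x) = (m/2) * exp(-m|x|)."""
    return 0.5 * m * math.exp(-m * abs(x))

# 1. Total area: a simple Riemann sum over a wide symmetric range.
dx = 1e-4
area = sum(phi(-40 + i * dx) * dx for i in range(int(80 / dx)))
assert abs(area - 1.0) < 1e-3

# 2. Laplace's defining property, for x > 0:
#    [phi(x+h) - phi(x)] / [phi(x) - phi(x-h)] equals phi(x+h) / phi(x)
x, h = 2.0, 1e-6
lhs = (phi(x + h) - phi(x)) / (phi(x) - phi(x - h))
rhs = phi(x + h) / phi(x)
assert abs(lhs - rhs) < 1e-6
```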
Keynes quite justifiably mentions "a number of somewhat arbitrary assumptions" in Laplace's argument. Nevertheless, the argument involves several potent ideas. The books by Stigler (1986b) and Hald (1995), and also the article by Eisenhart (1983), contain more rigorous derivations as well as valuable, revealing comments.
An interesting "applied" genesis of the Laplace distribution was presented in Mantel and Pasternack (1966) [see also Rohatgi (1984), Example 4, p. 482]. We present it together with a representation of Laplace random variables as the determinant of a random matrix.
Let $X_1$ and $X_2$ represent the lifetimes of two identical independent components, an original and its replacement. Suppose that we require the probability that the replacement outlasts the original component. Thus
\[ P(X_2 > X_1) = P(X_2 - X_1 > 0) = 1/2. \]
Let us assume that the lifetimes are distributed exponentially with common mean $\lambda$, and compute the density of $Z = X_2 - X_1$. Since $Z$ is a symmetric random variable, it is enough to compute the density for $z > 0$. For $z > 0$, the density of the difference of $X_2$ and $X_1$ is given by
\[ f_Z(z) = \int_0^{\infty} \left(\lambda^{-1} e^{-x_1/\lambda}\right)\left(\lambda^{-1} e^{-(z + x_1)/\lambda}\right) dx_1 = (2\lambda)^{-1} e^{-z/\lambda}, \]
and thus for $z \in \mathbb{R}$:
\[ f_Z(z) = (2\lambda)^{-1} e^{-|z|/\lambda}. \]
We also have a verbal proof of this result. Consider two "idealized" light bulbs in use simultaneously. We are interested in the distribution of the difference in their failure times. Once one bulb fails, the remaining bulb, being as good as new, will have a remaining lifetime given by the standard waiting time distribution (exponential). With probability 1/2, the first failure will correspond either to the first or the second lifetime distribution (exponentials), so that the difference in failure times will be positive or negative with equal probabilities, and in each case with absolute value following the standard waiting time distribution, i.e. the exponential.
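The representation of the Laplace law as a difference of two exponential lifetimes can be checked by simulation (a sketch using only the Python standard library):

```python
import random
import statistics

random.seed(7)
lam = 2.0          # common mean of the two exponential lifetimes
n = 200_000

# Z = X2 - X1, difference of two independent exponentials with mean lam
# (random.expovariate takes the rate, i.e. 1/mean)
z = [random.expovariate(1 / lam) - random.expovariate(1 / lam)
     for _ in range(n)]

# symmetry: P(Z > 0) should be 1/2
p_pos = sum(1 for v in z if v > 0) / n
assert abs(p_pos - 0.5) < 0.01

# |Z| should again be exponential with mean lam
mean_abs = statistics.fmean(abs(v) for v in z)
assert abs(mean_abs - lam) < 0.05

# variance of a Laplace with scale lam is 2 * lam^2
assert abs(statistics.pvariance(z) - 2 * lam * lam) < 0.3
```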
Since a standard exponential random variable multiplied by two has the chi-square distribution with two degrees of freedom, the arguments above show that, for $\lambda = 1$, $Z$ is distributed as one half of the difference of two independent chi-square random variables, each with two degrees of freedom.
On the other hand, if $Z_1$, $Z_2$, $Z_3$, and $Z_4$ are independent standard normal random variables, it is easy to see that the distribution of $Z$ is the same as that of $Z_1 Z_2 + Z_3 Z_4$. Indeed, $U_1 = (Z_1 + Z_2)/2$, $U_2 = (Z_1 - Z_2)/2$, $U_3 = (Z_3 + Z_4)/2$, $U_4 = (Z_3 - Z_4)/2$ are all independent normal with variance 1/2. Thus,
\[ Z_1 Z_2 + Z_3 Z_4 = (U_1^2 + U_3^2) - (U_2^2 + U_4^2), \]
which has the same distribution as a difference of two independent $\chi^2$ (chi-square) random variables with two degrees of freedom each.
In general, sums or differences of $n$ normal products, each of two factors, will be distributed like 1/2 of the difference of two independent $\chi^2$, each with $n$ degrees of freedom, and if $n$ is even this is an $n/2$-fold convolution of the Laplace distribution. These sums of $n$ products correspond to the sample covariance for bivariate normal samples when the correlation is zero.
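The product representation $Z_1 Z_2 + Z_3 Z_4$ can likewise be checked by simulation (a sketch; both samples should show mean 0, variance 2, and the Laplace excess kurtosis of 3 derived in Section 2.1.3):

```python
import random
import statistics

random.seed(42)
n = 200_000

# product-sum of standard normals: Z1*Z2 + Z3*Z4
w = [random.gauss(0, 1) * random.gauss(0, 1) +
     random.gauss(0, 1) * random.gauss(0, 1) for _ in range(n)]

# standard classical Laplace: difference of two standard exponentials
z = [random.expovariate(1) - random.expovariate(1) for _ in range(n)]

# both samples should have mean 0, variance 2, excess kurtosis 3
for sample in (w, z):
    m = statistics.fmean(sample)
    v = statistics.pvariance(sample)
    kurt = statistics.fmean((x - m) ** 4 for x in sample) / v ** 2 - 3
    assert abs(m) < 0.05
    assert abs(v - 2.0) < 0.1
    assert abs(kurt - 3.0) < 0.5
```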
To recapitulate, the error distribution nowadays referred to as the Laplace distribution or the double exponential distribution originated in Laplace's 1774 memoir. Historically, it was the first continuous distribution of unbounded support. Although since its introduction the distribution has occasionally been recommended as a better fit to certain data, its popularity is, unjustifiably, by far lesser than that of its four years older "sibling", Laplace's second law of error, better known in the English language literature as the Gaussian (normal) law.
This monograph is devoted to collecting and presenting properties, generalizations, and applications of the Laplace distribution, with a tacit aim to demonstrate that it is a natural and sometimes superior alternative to the normal law. We hope to convince our readers that this class of distributions deserves more attention than it has received until very recently.
2 Classical symmetric Laplace distribution

In the course of our study of the Laplace distribution and its generalizations, we have noticed that quite often in the statistical literature this distribution is used not on its own merits but as a source of counterexamples for other (mainly normal) distributions. It would seem that it has been created solely to provide examples of curiosity, non-regularity, and pathological behavior. In studies with probabilistic content, the distribution serves as a tool for limiting theorems and representations, with the emphasis on analyzing its differences from the classical theory based on the "sound" foundations of normality. One gets the impression that the "sharp needle" at the origin of the Laplace distribution, where the bulk of the density is concentrated, generates a ripple effect which affects the behavior over its whole support, including the tails¹. These observations prompted us to initiate a detailed study of the Laplace distribution on its own merits, without constant intruding comparisons and analogs.
In Table 2.1 and Figure 2.1, reproduced from Chew (1968), we present definitions and graphs of six classes of symmetric about zero, single-parameter distributions: uniform, triangular, cosine, logistic, Laplace, and normal. Values of the distribution functions are given in Table 2.2. The graphs of their densities for the cases of unit variance convincingly demonstrate the basic features and, in particular, the special position of the Laplace distribution with its towering peak and heavy tails.

¹ Tails of a random variable X are the probabilities P(X < -x) and P(X > x), x > 0. The asymptotic behavior of these functions of x is often referred to as the tail behavior of X or its distribution.

Name | Density function | Distribution function | Variance
Uniform | $\frac{1}{2a}$ for $x \in (-a, a)$; 0 elsewhere | 0 for $x \le -a$; $\frac{a+x}{2a}$ for $x \in (-a, a)$; 1 for $x \ge a$ | $\frac{a^2}{3}$
Triangular | $\frac{b+x}{b^2}$ for $x \in [-b, 0]$; $\frac{b-x}{b^2}$ for $x \in (0, b]$; 0 elsewhere | 0 for $x < -b$; $\frac{(b+x)^2}{2b^2}$ for $x \in [-b, 0]$; $1 - \frac{(b-x)^2}{2b^2}$ for $x \in (0, b]$; 1 for $x > b$ | $\frac{b^2}{6}$
Cosine | $\frac{1+\cos x}{2\pi}$ for $x \in [-\pi, \pi]$; 0 elsewhere | 0 for $x < -\pi$; $\frac{\pi + x + \sin x}{2\pi}$ for $x \in [-\pi, \pi]$; 1 for $x \ge \pi$ | $\frac{\pi^2 - 6}{3}$
Logistic | $\frac{\operatorname{sech}^2(x/d)}{2d}$ | $\frac{1}{1 + e^{-2x/d}}$ | $\frac{(\pi d)^2}{12}$
Laplace | $c\, e^{-2c|x|}$ | $e^{2cx}/2$ for $x < 0$; $1 - e^{-2cx}/2$ for $x \ge 0$ | $\frac{1}{2c^2}$
Normal | $\frac{1}{\sqrt{2\pi}} e^{-x^2/2}$ | $\int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} e^{-u^2/2}\,du$ | 1

Table 2.1: Densities and distribution functions of some symmetrical probability distributions [reproduced from Chew (1968)]. Reprinted with permission from The American Statistician. Copyright 1968 by the American Statistical Association. All rights reserved.

Leptokurtic tendencies (see Section 2.1.3 for more details) are frequently found among measurements of superior quality and homogeneity. A leptokurtic Laplace curve presents a clearly visible "peak": in the vicinity of the center there is a certain excess of (small) elements. As the area under the curve is the same as under the normal curve, the peak is counterbalanced by a corresponding diminution of frequencies in the intermediate regions further from the center (tails). Generally, there is an "overcompensation", so that the leptokurtic curve crosses the normal curve four times, first about the peak and then again at the tails, and tends toward the x-axis staying slightly above the normal curve.
Figure 2.1: Graphs of density functions of several symmetrical populations [reproduced from Chew (1968)]. Reprinted with permission from The American Statistician. Copyright 1968 by the American Statistical Association. All rights reserved.
2.1 Definition and basic properties

2.1.1 Density and distribution functions

The classical Laplace distribution (also known as the first law of Laplace) is a probability distribution on $(-\infty, \infty)$, given by the density function
\[ f(x; \theta, s) = \frac{1}{2s} e^{-|x - \theta|/s}, \quad -\infty < x < \infty, \tag{2.1.1} \]
where $\theta \in (-\infty, \infty)$ and $s > 0$ are location and scale parameters, respectively [see, e.g., Ord (1983), Johnson et al. (1995)]. As discussed in some detail in Chapter 1, it was named after Pierre-Simon Laplace (1749-1827),
x | Normal | Logistic | Laplace | Cosine | Triangular
0.0 | 0.5000 | 0.5000 | 0.5000 | 0.5000 | 0.5000
0.2 | 0.5793 | 0.5897 | 0.6238 | 0.5720 | 0.5785
0.4 | 0.6554 | 0.6738 | 0.7160 | 0.6422 | 0.6501
0.6 | 0.7257 | 0.7480 | 0.7860 | 0.7088 | 0.7151
0.8 | 0.7881 | 0.8102 | 0.8387 | 0.7702 | 0.7734
1.0 | 0.8413 | 0.8598 | 0.8784 | 0.8252 | 0.8250
1.2 | 0.8849 | 0.8981 | 0.9084 | 0.8728 | 0.8699
1.4 | 0.9192 | 0.9269 | 0.9310 | 0.9122 | 0.9082
1.6 | 0.9452 | 0.9480 | 0.9480 | 0.9436 | 0.9399
1.8 | 0.9641 | 0.9632 | 0.9608 | 0.9670 | 0.9649
2.0 | 0.9772 | 0.9741 | 0.9704 | 0.9832 | 0.9832
2.2 | 0.9861 | 0.9818 | 0.9777 | 0.9931 | 0.9948
2.4 | 0.9918 | 0.9873 | 0.9832 | 0.9982 | 0.9998
2.6 | 0.9953 | 0.9911 | 0.9873 | 0.9998 | 1.0000
2.8 | 0.9974 | 0.9938 | 0.9906 | 1.0000 |
3.0 | 0.9987 | 0.9957 | 0.9928 | |
3.2 | 0.9993 | 0.9970 | 0.9946 | |
3.4 | 0.9997 | 0.9979 | 0.9959 | |
3.6 | 0.9998 | 0.9985 | 0.9969 | |
3.8 | 0.9999 | 0.9990 | 0.9977 | |
4.0 | 1.0000 | 0.9993 | 0.9983 | |

Table 2.2: Values of distribution functions of selected distributions. The values of x are in multiples of the standard deviation.
who in 1774 obtained (2.1.1) as the distribution whose likelihood is maximized when the location parameter is set to the median. As was already alluded to in Chapter 1, and will be discussed further in Section 2.2, the Laplace distribution arises also as the law of the difference between two exponential random variables. Consequently, it is also known as the double exponential distribution², as well as the two-tailed exponential distribution [see, e.g., Greenwood et al. (1962)] and the bilateral exponential law [see, e.g., Feller (1971)].

² Note that this name is also used for the extreme value distribution with density exp(-x - exp(-x)), as well as for a distribution from the exponential family studied by Efron (1986). The term double exponential fitness function for the probabilities p = exp(-exp(α₀ + α₁x₁ + ··· + αₙxₙ)) is common in the biostatistics literature [see, e.g., Manly (1976)]. Johnson et al. (1995) recommend calling the extreme value distribution the doubly exponential law.
It is easy to verify that the variance of (2.1.1) is equal to $2s^2$. Thus, the standard classical Laplace distribution, which has the density
\[ f(x; 0, 1) = \frac{1}{2} e^{-|x|}, \quad -\infty < x < \infty, \tag{2.1.2} \]
has variance equal to 2. For various derivations it would seem convenient to consider a reparameterization of Laplace densities,
\[ g(x; \theta, \sigma) = \frac{1}{\sqrt{2}\,\sigma} e^{-\sqrt{2}\,|x - \theta|/\sigma}, \quad -\infty < x < \infty. \tag{2.1.3} \]
In this case the standard Laplace distribution is given by setting $\theta = 0$ and $\sigma = 1$. It has variance equal to one, and the density is of the form
\[ g(x; 0, 1) = \frac{1}{\sqrt{2}} e^{-\sqrt{2}\,|x|}, \quad -\infty < x < \infty. \tag{2.1.4} \]
To distinguish between these two parameterizations, we shall be referring to the classical Laplace CL(θ, s) and standard classical Laplace CL(0, 1) distributions in the cases given by (2.1.1) and (2.1.2), and to the Laplace L(θ, σ) and standard (actually standardized) Laplace L(0, 1) distributions in the cases represented by (2.1.3) and (2.1.4), respectively. We shall also retain the difference in notation for the scale parameter by reserving s for classical Laplace distributions and σ for those given by (2.1.3). Therefore, reformulating any result from one parameterization to the other is a matter of replacing $s$ by $\sigma/\sqrt{2}$ or $\sigma$ by $\sqrt{2}\,s$. In Figure 2.2 we present graphs of the standard classical and the standard Laplace densities.
The cumulative distribution function (c.d.f.) corresponding to density (2.1.1) is
\[ F(x; \theta, s) = \begin{cases} \frac{1}{2} e^{-|x - \theta|/s} & \text{if } x \le \theta, \\ 1 - \frac{1}{2} e^{-|x - \theta|/s} & \text{if } x \ge \theta. \end{cases} \tag{2.1.5} \]
The distribution is symmetric about $\theta$, i.e., for any real $x$ we have
\[ f(\theta - x; \theta, \sigma) = f(\theta + x; \theta, \sigma) \quad \text{and} \quad F(\theta - x; \theta, \sigma) = 1 - F(\theta + x; \theta, \sigma). \tag{2.1.6} \]
Consequently, the mean, median, and mode of this distribution are all equal to $\theta$.
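A direct implementation of the density (2.1.1) and c.d.f. (2.1.5), together with a check of the symmetry property (2.1.6) (a sketch; the function names are ours):

```python
import math

def laplace_pdf(x, theta=0.0, s=1.0):
    """Classical Laplace density (2.1.1)."""
    return math.exp(-abs(x - theta) / s) / (2 * s)

def laplace_cdf(x, theta=0.0, s=1.0):
    """Classical Laplace c.d.f. (2.1.5)."""
    if x <= theta:
        return 0.5 * math.exp(-(theta - x) / s)
    return 1.0 - 0.5 * math.exp(-(x - theta) / s)

theta, s = 1.0, 2.0
# symmetry about theta, property (2.1.6)
for x in [0.1, 0.5, 3.0]:
    assert math.isclose(laplace_pdf(theta - x, theta, s),
                        laplace_pdf(theta + x, theta, s))
    assert math.isclose(laplace_cdf(theta - x, theta, s),
                        1 - laplace_cdf(theta + x, theta, s))
assert laplace_cdf(theta, theta, s) == 0.5  # the median equals theta
```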
2.1.2 Characteristic and moment generating functions

The characteristic function (ch.f.) corresponding to the standard classical Laplace CL(0, 1) random variable (r.v.) X with density (2.1.2) is
\[ \psi_X(t) = E[e^{itX}] = \int_{-\infty}^{\infty} e^{itx}\, \frac{1}{2} e^{-|x|}\,dx = (1 + t^2)^{-1}, \quad -\infty < t < \infty. \tag{2.1.7} \]
Figure 2.2: Standard classical Laplace [equation (2.1.2)] and standard Laplace [equation (2.1.4)] density functions.
For the general classical Laplace r.v. Y with the distribution CL(θ, s) we have $Y \stackrel{d}{=} sX + \theta$. Thus,
\[ \psi_Y(t) = E[e^{it(sX + \theta)}] = e^{it\theta}\, \psi_X(st) = \frac{e^{it\theta}}{1 + s^2 t^2}, \quad -\infty < t < \infty. \tag{2.1.8} \]
It is a well-known but nevertheless curious fact that the pair of Fourier transforms (2.1.2) and (2.1.7) occur in reverse order for the Cauchy distribution. Namely, the standard Cauchy distribution with density
\[ f_c(x) = \frac{1}{\pi(1 + x^2)}, \quad -\infty < x < \infty, \]
has characteristic function given by
\[ \phi_c(t) = e^{-|t|}, \quad -\infty < t < \infty. \]
The moment generating function of the standard classical Laplace r.v. X with density (2.1.2) is
\[ M_X(t) = E[e^{tX}] = \int_{-\infty}^{\infty} e^{tx}\, \frac{1}{2} e^{-|x|}\,dx = (1 - t^2)^{-1}, \quad -1 < t < 1. \tag{2.1.9} \]
For the general classical Laplace r.v. Y with density (2.1.1) we have
\[ M_Y(t) = e^{t\theta} M_X(st) = \frac{e^{t\theta}}{1 - s^2 t^2}, \quad -\frac{1}{s} < t < \frac{1}{s}. \tag{2.1.10} \]
Consequently, the cumulant generating functions $\log M_Y(t)$ and $\log M_X(t)$, corresponding to (2.1.1) and (2.1.2), are
\[ t\theta - \log(1 - s^2 t^2) \quad \text{and} \quad -\log(1 - t^2), \tag{2.1.11} \]
respectively.
2.1.3 Moments and related parameters

Cumulants

The nth cumulant of a classical Laplace r.v. X, denoted $\kappa_n$, is defined as the coefficient of $t^n/n!$ in the Taylor expansion (about $t = 0$) of the cumulant generating function of X. Formulas (2.1.11) for the cumulant generating function generate the cumulants of Laplace distributions in a straightforward manner. Indeed, using the Taylor expansion of $-\log(1 - z)$ about $z = 0$, we have
\[ -\log(1 - t^2) = \sum_{k=1}^{\infty} \frac{t^{2k}}{k}. \]
Thus, for the standard classical Laplace r.v. X given by (2.1.2), we have
\[ \kappa_n(X) = \begin{cases} 0 & \text{if } n \text{ is odd}, \\ 2(n-1)! & \text{if } n \text{ is even}. \end{cases} \tag{2.1.12} \]
Hence, for a general classical Laplace r.v. Y with CL(θ, s) distribution,
\[ \kappa_n(Y) = \begin{cases} \theta & \text{if } n = 1, \\ 0 & \text{if } n > 1 \text{ is odd}, \\ 2 s^n (n-1)! & \text{if } n \text{ is even}, \end{cases} \tag{2.1.13} \]
since $\kappa_n(Y) = \kappa_n(\theta + sX) = s^n \kappa_n(X)$ for $n \ge 2$.
Moments

By writing the Taylor expansion of the moment generating function (2.1.10) with $\theta = 0$,
\[ M_Y(t) = \sum_{k=0}^{\infty} s^{2k} (2k)!\, \frac{t^{2k}}{(2k)!}, \]
we obtain the nth central moment of a general classical Laplace r.v. Y with density (2.1.1):
\[ \mu_n(Y) = E(Y - \theta)^n = \begin{cases} 0 & \text{if } n \text{ is odd}, \\ s^n\, n! & \text{if } n \text{ is even}. \end{cases} \tag{2.1.14} \]
One can obtain the central absolute moment of a classical Laplace distribution by observing that it is equal to the raw moment of the exponential distribution with parameter $\lambda = 1/s$, or, more directly,
\[ \nu_a(Y) = E|Y - \theta|^a = \int_0^{\infty} x^a\, \frac{1}{s} e^{-x/s}\,dx = s^a\, \Gamma(a + 1). \tag{2.1.15} \]
In particular, we have
\[ \text{Mean} = \theta, \qquad \text{Variance} = 2s^2, \tag{2.1.16} \]
so that for $\theta \ne 0$, the coefficient of variation of Y is
\[ \frac{\sqrt{E(Y - EY)^2}}{|EY|} = \frac{\sqrt{2}\,s}{|\theta|}. \tag{2.1.17} \]
Note that the mean and variance involve different parameters (as is the case for the normal distribution, but unlike the binomial, Poisson, and gamma distributions).
The nth moment about zero of the classical Laplace r.v. Y with density (2.1.1) is given by [see, e.g., Farison (1965), Kacki (1965a)]
\[ \alpha_n(Y) = EY^n = n! \sum_{j=0}^{n} \frac{1 + (-1)^{j+n}}{2\,j!}\, \theta^j s^{n-j} = n! \sum_{i=0}^{[[n/2]]} \frac{\theta^{n-2i}}{(n-2i)!}\, s^{2i}, \tag{2.1.18} \]
where $[[x]]$ denotes the greatest integer less than or equal to $x$.
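Formula (2.1.18) can be checked against the direct binomial expansion $EY^n = \sum_j \binom{n}{j} \theta^j s^{n-j} E X^{n-j}$, with $E X^m = m!$ for even $m$ and 0 for odd $m$ (a sketch; the function names are ours):

```python
import math

def alpha_direct(n, theta, s):
    """E Y^n via the binomial expansion of (theta + s X)^n,
    using E X^m = m! for even m and 0 for odd m."""
    return sum(math.comb(n, j) * theta ** j * s ** (n - j)
               * math.factorial(n - j)
               for j in range(n + 1) if (n - j) % 2 == 0)

def alpha_formula(n, theta, s):
    """Formula (2.1.18)."""
    return math.factorial(n) * sum(
        theta ** (n - 2 * i) / math.factorial(n - 2 * i) * s ** (2 * i)
        for i in range(n // 2 + 1))

# the two expressions agree for all small n
for n in range(7):
    assert math.isclose(alpha_direct(n, 1.3, 0.8), alpha_formula(n, 1.3, 0.8))
```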
Mean deviation

By (2.1.15), the mean deviation of a classical Laplace r.v. Y with density (2.1.1) is equal to
\[ E|Y - E[Y]| = E|Y - \theta| = s. \tag{2.1.19} \]
Furthermore, we have
\[ \frac{\text{Mean deviation}}{\text{Standard deviation}} = \frac{s}{\sqrt{2}\,s} = \frac{1}{\sqrt{2}} \approx 0.707. \tag{2.1.20} \]
Recall that for all normal distributions, the above ratio is given by $\sqrt{2/\pi} \approx 0.798$.
Coefficients of skewness and kurtosis

For a distribution of a r.v. X with a finite third moment and standard deviation greater than zero, the coefficient of skewness is a measure of symmetry defined by
\[ \gamma_1 = \frac{E(X - EX)^3}{(E(X - EX)^2)^{3/2}}. \tag{2.1.21} \]
By (2.1.14), the coefficient of skewness of the Laplace distribution (2.1.1) is equal to zero (as is the case for any symmetric distribution with a finite third moment).
For a r.v. X with a finite fourth moment, the excess kurtosis³ is defined as
\[ \gamma_2 = \frac{E(X - EX)^4}{(\operatorname{Var}(X))^2} - 3. \tag{2.1.22} \]
It is a measure of peakedness and of heaviness of the tails (properly adjusted, so that $\gamma_2 = 0$ for a normal distribution), and is independent of the scale. If $\gamma_2 > 0$, the distribution is said to be leptokurtic, and it is platykurtic otherwise. In view of (2.1.14),
\[ \gamma_2 = \frac{s^4\, 4!}{(2s^2)^2} - 3 = 3. \tag{2.1.23} \]
Thus, the Laplace distribution is a leptokurtic one, indicating a large degree of peakedness as compared to the normal distributions. See Balanda (1987) and Horn (1983) for more details.
Entropy

The entropy of a classical Laplace variable Y is easy to compute:
\[ H(Y) = E[-\log f(Y)] = \int_{-\infty}^{\infty} \left( \log(2s) + \frac{|x - \theta|}{s} \right) \frac{1}{2s} e^{-|x - \theta|/s}\,dx = \log(2s) + \frac{\nu_1(Y)}{s} = \log(2s) + 1. \]
As will be shown in Section 2.4.5, this entropy is maximal within the class of continuous distributions on $\mathbb{R}$ with a given absolute moment [see Kagan et al. (1973)], as well as within the class of conditionally Gaussian distributions [see Levin and Tchernitser (1999) or Levin and Albanese (1998)]. These results provide additional arguments for applications of Laplace laws to various practical problems [see Chapter III].
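A numerical confirmation of $H(Y) = \log(2s) + 1$ (a sketch using simple Riemann summation):

```python
import math

def laplace_entropy_numeric(s, theta=0.0):
    """Numerically approximate H(Y) = -E[log f(Y)] for CL(theta, s)."""
    dx = 1e-4
    total, x = 0.0, theta - 30 * s
    while x < theta + 30 * s:
        f = math.exp(-abs(x - theta) / s) / (2 * s)
        total += -f * math.log(f) * dx
        x += dx
    return total

# compare with the closed form log(2s) + 1 for several scales
for s in (0.5, 1.0, 2.0):
    assert abs(laplace_entropy_numeric(s) - (math.log(2 * s) + 1)) < 1e-3
```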
Quartiles and quantiles

Because of the availability of an explicit form of the cumulative distribution function, the quantiles $\xi_q$ of a classical Laplace distribution can be written explicitly as follows:
\[ \xi_q = \begin{cases} \theta + s \ln(2q), & q \in (0, 1/2], \\ \theta - s \ln[2(1 - q)], & q \in (1/2, 1). \end{cases} \tag{2.1.24} \]

³ Without centering by 3 it is simply called kurtosis.

In particular, the first and the third quartiles are given by
\[ Q_1 = \xi_{1/4} = \theta - s \ln 2, \qquad Q_3 = \xi_{3/4} = \theta + s \ln 2. \]
Evidently, the second quartile $Q_2$, the median, is equal to $\theta$.
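The quantile function (2.1.24) in code, checked against the c.d.f. (2.1.5) (a sketch; the function names are ours):

```python
import math

def laplace_quantile(q, theta=0.0, s=1.0):
    """Quantile function (2.1.24) of the classical Laplace distribution."""
    if q <= 0.5:
        return theta + s * math.log(2 * q)
    return theta - s * math.log(2 * (1 - q))

def laplace_cdf(x, theta=0.0, s=1.0):
    """C.d.f. (2.1.5)."""
    if x <= theta:
        return 0.5 * math.exp(-(theta - x) / s)
    return 1.0 - 0.5 * math.exp(-(x - theta) / s)

theta, s = 2.0, 0.5
# F(xi_q) = q for a range of probability levels
for q in [0.05, 0.25, 0.5, 0.75, 0.95]:
    assert math.isclose(laplace_cdf(laplace_quantile(q, theta, s), theta, s), q)

# quartiles: Q1 = theta - s ln 2, Q3 = theta + s ln 2
assert math.isclose(laplace_quantile(0.25, theta, s), theta - s * math.log(2))
assert math.isclose(laplace_quantile(0.75, theta, s), theta + s * math.log(2))
```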
2.2 Representatio ns and chara cterizat ions
In the first part of this section we present various representations of Laplace r.v.'s in terms of other well-known random variables. These representations are also listed in Table 2.3. We shall focus on the standard classical Laplace r.v. $X$ with density (2.1.2) and ch.f. (2.1.7). As already mentioned, for a general Laplace r.v. $Y$ with density (2.1.1) the corresponding representations of $Y$ follow from the relation $Y \stackrel{d}{=} \theta + sX$. When writing equalities in distribution we shall follow the standard convention that random variables appearing on the same side of the equation are independent.

Characterization of distributions is a popular and well-developed topic of modern probability theory. It provides additional insight into the structure of distributions, especially those which, like the Laplace distribution, are defined by a simple density and characteristic function. The simplicity of a formula does not always convey obvious features, and may mask surprises built into a particular distribution. In the case of the Laplace distribution, its characterizations unveil quite intriguing properties which one would not suspect based solely on its "modest" density function.
In the second part of this section we describe some characterizations of Laplace distributions, in particular those connected with the geometric summation
\[
S_p = X_1 + \cdots + X_{\nu_p}, \tag{2.2.1}
\]
where $\nu_p$ is a geometric random variable with mean $1/p$ and probability function
\[
P(\nu_p = k) = (1-p)^{k-1}p, \quad k = 1, 2, 3, \dots, \tag{2.2.2}
\]
while the $X_i$, $i \geq 1$, are i.i.d. r.v.'s independent of $\nu_p$. It turns out that under the geometric summation (2.2.1), the Laplace distribution plays a role analogous to that of the Gaussian distribution under ordinary summation. As discussed in Kalashnikov (1997), geometric sums (2.2.1) arise naturally in diverse fields of application such as risk theory, modeling of financial asset returns, insurance mathematics, and others; consequently, the Laplace distribution is applicable for stochastic modeling in these fields.
2.2.1 Mixture of normal distributions
Any Laplace r.v. can be thought of as a Gaussian r.v. with mean zero and a stochastic variance that has an exponential distribution. More formally, a Laplace r.v. has the same distribution as the product of a normal random variable and the square root of an independent exponentially distributed random variable, as sketched in

Proposition 2.2.1 A standard classical Laplace r.v. $X$ has the representation
\[
X \stackrel{d}{=} \sqrt{2W}\,Z, \tag{2.2.3}
\]
where the random variables $W$ and $Z$ have the standard exponential and normal distributions, respectively.
Proof. Let $W$ be a standard exponential r.v. with the density $f_W(w) = e^{-w}$, $w > 0$, and the moment generating function $M_W(t) = E[e^{tW}] = (1-t)^{-1}$, $t < 1$. Let $Z$ be a standard normal random variable with the density $f_Z(z) = \frac{1}{\sqrt{2\pi}}e^{-z^2/2}$, $-\infty < z < \infty$, and the characteristic function $\phi_Z(t) = e^{-t^2/2}$, $-\infty < t < \infty$. The ch.f. of the product $\sqrt{2W}\,Z$ coincides with the standard classical Laplace ch.f. (2.1.7). Indeed, conditioning on $W$, we obtain
\[
E[e^{it\sqrt{2W}Z}] = E\big[E[e^{it\sqrt{2W}Z}\mid W]\big] = E[\phi_Z(t\sqrt{2W})] = E[e^{-t^2W}] = M_W(-t^2) = (1+t^2)^{-1}.
\]
The proposition is thus proved.
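The representation (2.2.3) is easy to check by simulation. The sketch below is illustrative (not from the book): it generates $\sqrt{2W}Z$ from stdlib samplers and compares two moments against their theoretical values for the standard classical Laplace law, $\mathrm{Var}\,X = 2$ and $E|X| = 1$; the seed and sample size are arbitrary choices of ours.

```python
import math
import random

random.seed(12345)
n = 200_000

# X = sqrt(2W) * Z, with W standard exponential and Z standard normal
sample = [math.sqrt(2.0 * random.expovariate(1.0)) * random.gauss(0.0, 1.0)
          for _ in range(n)]

var_est = sum(x * x for x in sample) / n        # theoretical value: 2
abs_mean_est = sum(abs(x) for x in sample) / n  # theoretical value: 1
```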
Remark 2.2.1 An alternative proof of Proposition 2.2.1, utilizing the densities of $W$ and $Z$, is outlined in Exercise 2.7.10. Relation (2.2.3) written in terms of the densities becomes
\[
\frac{1}{2}e^{-|x|} = \int_0^{\infty} f_Z\!\left(\frac{x}{\sqrt{2w}}\right)\frac{1}{\sqrt{2w}}\,f_W(w)\,dw = \int_0^{\infty}\frac{1}{2\sqrt{\pi w}}\,e^{-\frac{1}{2}\left(\frac{x^2}{2w}+2w\right)}\,dw. \tag{2.2.4}
\]
Remark 2.2.2 For a general Laplace r.v. $Y$ with density (2.1.1) we have the representation $Y \stackrel{d}{=} \theta + s\sqrt{2W}\,Z$.
Remark 2.2.3 Representation (2.2.3) can be written as
\[
X \stackrel{d}{=} RZ, \tag{2.2.5}
\]
where $Z$ is as before, and the random variable $R = \sqrt{2W}$ has a Rayleigh distribution with density $f_R(x) = xe^{-x^2/2}$, $x > 0$.
Remark 2.2.4 Another related representation, discussed in Loh (1984), is obtained by setting $T = 1/\sqrt{W}$. Then,
\[
X \stackrel{d}{=} \sqrt{2}\,\frac{Z}{T}. \tag{2.2.6}
\]
Here, the r.v. $T$ has a brittle fracture distribution with density $f_T(x) = 2x^{-3}e^{-1/x^2}$, $x > 0$ [such a $T$ is used to model breaking stress or strength; see, e.g., Black et al. (1989) or Johnson et al. (1994, p. 694)]. A proof of the result is left as an exercise.
2.2.2 Relation to exponential distribution

The ch.f. (2.1.7) of a standard classical Laplace distribution can be factored as follows:
\[
\frac{1}{1+t^2} = \frac{1}{1-it}\cdot\frac{1}{1+it}. \tag{2.2.7}
\]
Note that the first factor is the ch.f. of a standard exponential r.v. $W$ with the density $f_W(w) = e^{-w}$, $w \geq 0$, while the second one is the ch.f. of $-W$. Since for independent random variables the product of ch.f.'s corresponds to their sum, we arrive at a representation of a standard classical Laplace r.v. in terms of two independent exponential random variables. The following proposition is thus valid.
Proposition 2.2.2 A classical standard Laplace r.v. $X$ admits the representation
\[
X \stackrel{d}{=} W_1 - W_2, \tag{2.2.8}
\]
where $W_1$ and $W_2$ are i.i.d. standard exponential random variables.
Remark 2.2.5 For a general Laplace r.v. $Y$ with density (2.1.1) we have $Y \stackrel{d}{=} \theta + s(W_1 - W_2)$.

Remark 2.2.6 Denoting $H_i = 2W_i$, $i = 1, 2$, we obtain
\[
Y \stackrel{d}{=} \theta + \frac{s}{2}(H_1 - H_2),
\]
where $H_1$ and $H_2$ are i.i.d. with the $\chi^2$ distribution with two degrees of freedom (having the density $f(x) = \frac{1}{2}e^{-x/2}$).
Remark 2.2.7 Note that the following relation for an $X$ distributed according to the standard classical Laplace law follows immediately from (2.2.8):
\[
X \stackrel{d}{=} \log(U_1/U_2),
\]
where $U_1$ and $U_2$ are independent random variables distributed uniformly on $[0,1]$ [see, e.g., Lukacs and Laha (1964, p. 61)].
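The representation in Remark 2.2.7 gives one of the simplest ways to simulate a standard classical Laplace variate: take the logarithm of a ratio of two independent uniforms. A small illustrative check (our code, not from the book) is to compare the sample quartiles against the values $Q_1 = -\ln 2$ and $Q_3 = \ln 2$ derived earlier.

```python
import math
import random

random.seed(7)
n = 200_000

# X = log(U1 / U2), with U1, U2 independent uniforms on (0, 1)
sample = sorted(math.log(random.random() / random.random()) for _ in range(n))

q1_est = sample[n // 4]       # theory: -ln 2
med_est = sample[n // 2]      # theory: 0 (the median)
q3_est = sample[3 * n // 4]   # theory: +ln 2
```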
The standard classical Laplace ch.f. (2.1.7) can also be decomposed as follows:
\[
\frac{1}{1+t^2} = \frac{1}{2}\cdot\frac{1}{1-it} + \frac{1}{2}\cdot\frac{1}{1+it}. \tag{2.2.9}
\]
The right-hand side of (2.2.9) is the ch.f. of the product $IW$, where the discrete symmetric variable $I$ takes on the values $\pm 1$ with probabilities $1/2$, while $W$ is a standard exponential r.v. independent of $I$ (see Exercise 2.7.12). Thus, the standard classical Laplace distribution is a simple exponential mixture. This is stated in the following

Proposition 2.2.3 A standard classical Laplace r.v. $X$ admits the representation
\[
X \stackrel{d}{=} IW, \tag{2.2.10}
\]
where $W$ is standard exponential while $I$ takes on the values $\pm 1$ with probabilities $1/2$.
Remark 2.2.8 For a general Laplace r.v. $Y$ with the density (2.1.1) we have $Y \stackrel{d}{=} \theta + sIW$.

Remark 2.2.9 It follows directly from (2.2.10) that if $X$ is a standard classical Laplace r.v., then $|X|$ is a standard exponential r.v. $W$. Thus, as already noted by Johnson et al. (1995, p. 190), if $X_1, X_2, \dots, X_n$ are i.i.d. standard Laplace r.v.'s, then any statistic depending only on the absolute values $|X_1|, |X_2|, \dots, |X_n|$ can be represented in terms of $\chi^2$ random variables (since, as already stated above, $2W$ is a $\chi^2$ r.v. with two degrees of freedom).
2.2.3 Relation to Pareto distribution

A standard exponential r.v. $W$ is related to a Pareto Type I r.v. $P$ with the density $f(x) = 1/x^2$, $x \geq 1$, as follows:
\[
W \stackrel{d}{=} \log P. \tag{2.2.11}
\]
Consequently, representation (2.2.8) can be restated in terms of two independent Pareto random variables. We have

Proposition 2.2.4 A standard classical Laplace r.v. $X$ admits the representation
\[
X \stackrel{d}{=} \log\frac{P_1}{P_2}, \tag{2.2.12}
\]
where $P_1$ and $P_2$ are i.i.d. Pareto Type I random variables with the density $1/x^2$, $x \geq 1$.
Proof. Note that $W_1 = \log P_1$ has the standard exponential distribution with density $e^{-x}$, $x \geq 0$. The result now follows directly from Proposition 2.2.2.

Remark 2.2.10 For a general classical Laplace r.v. $Y$ with density (2.1.1) we have
\[
Y \stackrel{d}{=} \log\left[e^{\theta}\left(\frac{P_1}{P_2}\right)^{s}\right].
\]
Hence, the log-Laplace random variable $e^{(Y-\theta)/s}$ has the same distribution as the ratio of two independent Pareto Type I random variables.
2.2.4 Relation to 2 × 2 unit normal determinants

The following connection between Laplace and normal distributions, already mentioned in Chapter 1, was established by Nyquist et al. (1954) almost fifty years ago and has been the subject of a number of letters to the editor of The American Statistician during the last decades.
Proposition 2.2.5 A standard classical Laplace r.v. $X$ admits the representation
\[
X \stackrel{d}{=} \begin{vmatrix} U_1 & U_2\\ U_3 & U_4 \end{vmatrix} = U_1U_4 - U_2U_3, \tag{2.2.13}
\]
where the $U_i$'s are i.i.d. standard normal random variables.
The proof presented below is based on Proposition 2.2.2 and follows a heuristic derivation due to Mantel and Pasternak (1966). For an alternative formal proof using characteristic functions see Exercise 2.7.13. For additional comments on this problem see Nicholson (1958), Mantel (1973), Missiakoulis and Darton (1985), Mantel (1987), and Johnson et al. (1995, p. 191), among others.

Proof. In view of Proposition 2.2.2 and the remark following it, we have $X \stackrel{d}{=} (H_1 - H_2)/2$, where $H_1$ and $H_2$ are i.i.d. with the $\chi^2$ distribution with two degrees of freedom. Recall that $H_1 \stackrel{d}{=} W_1 + W_2$, where $W_1$ and $W_2$ are i.i.d. with the $\chi^2$ distribution with one degree of freedom. (An analogous representation holds for $H_2$.) Furthermore, $W_1 \stackrel{d}{=} Z_1^2$, where $Z_1$ is a standard normal variable. Consequently, we have
\[
X \stackrel{d}{=} \frac{1}{2}\left(Z_1^2 + Z_2^2 - Z_3^2 - Z_4^2\right),
\]
where the $Z_i$'s are i.i.d. standard normal variables. Equivalently,
\[
X \stackrel{d}{=} \frac{Z_1 - Z_3}{\sqrt{2}}\cdot\frac{Z_1 + Z_3}{\sqrt{2}} - \frac{Z_4 - Z_2}{\sqrt{2}}\cdot\frac{Z_4 + Z_2}{\sqrt{2}}.
\]
Note that the two normal random variables $Z_1 - Z_3$ and $Z_1 + Z_3$ are independent, and so are $Z_4 - Z_2$ and $Z_4 + Z_2$. Thus,
\[
U_1 = \frac{Z_1 - Z_3}{\sqrt{2}}, \quad U_2 = \frac{Z_4 - Z_2}{\sqrt{2}}, \quad U_3 = \frac{Z_4 + Z_2}{\sqrt{2}}, \quad U_4 = \frac{Z_1 + Z_3}{\sqrt{2}}
\]
are i.i.d. standard normal and (2.2.13) is indeed valid.

Attempts to generalize this result to determinants of larger size have so far not been successful (see Exercise 2.7.14). All the representations cited above are summarized in Table 2.3.
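Proposition 2.2.5 can also be probed numerically. The sketch below is illustrative (our code, not from the book): it simulates the determinant $U_1U_4 - U_2U_3$ of a $2\times 2$ matrix of independent standard normals and compares $\mathrm{Var} = 2$ and $E|X| = 1$, the values for a standard classical Laplace variate; seed and sample size are arbitrary.

```python
import random

random.seed(2021)
n = 200_000

# X = U1*U4 - U2*U3, the determinant of a 2x2 matrix of i.i.d. N(0,1) entries
sample = [random.gauss(0, 1) * random.gauss(0, 1)
          - random.gauss(0, 1) * random.gauss(0, 1)
          for _ in range(n)]

var_est = sum(x * x for x in sample) / n        # theory: 2
abs_mean_est = sum(abs(x) for x in sample) / n  # theory: 1
```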
2.2.5 An orthogonal representation
Younes (2000) shows that a classical Laplace r.v. $X$ admits an orthogonal representation of the form
\[
X = \sum_{n=1}^{\infty} b_nX_n, \tag{2.2.14}
\]
where $\{X_n, n \geq 1\}$ is a sequence of uncorrelated random variables (orthogonality here means lack of correlation). The convergence in (2.2.14) is in the mean square, i.e.,
\[
\lim_{n\to\infty} E\left(X - \sum_{k=1}^{n} b_kX_k\right)^{\!2} = 0. \tag{2.2.15}
\]
Proposition 2.2.6 A standard classical Laplace $\mathcal{CL}(0,1)$ r.v. $X$ admits the representation (2.2.14) with
\[
b_n = \frac{\xi_n}{2J_0(\xi_n)}\int_0^{\infty} xe^{-x}J_0(\xi_ne^{-x/2})\,dx \tag{2.2.16}
\]
and
\[
X_n = \frac{2}{\xi_nJ_0(\xi_n)}\,J_0(\xi_ne^{-|X|/2}), \tag{2.2.17}
\]
where $J_0$ and $J_1$ are the Bessel functions of the first kind of order 0 and 1, respectively (see Appendix A), and $\xi_n$ is the $n$th root of $J_1$.

Proof. See Younes (2000) for a derivation.

Orthogonal representations play an important role in statistics. For example, they appear in factor analysis, where each of the $d$ observable variables is expressed as the sum of $p < d$ uncorrelated common factors and one unique factor. See, e.g., Younes (2000) for further information on orthogonal representations and their applications in statistics.
Representation / Variables:

$\sqrt{2W}\cdot Z$:  $Z$ standard normal r.v.; $W$ standard exponential r.v.
$R\cdot Z$:  $R$ Rayleigh r.v. (p.d.f. $f(w) = we^{-w^2/2}$); $Z$ standard normal r.v.
$\sqrt{2}\,Z/T$:  $T$ "brittle fracture" r.v. (p.d.f. $f(t) = 2t^{-3}e^{-1/t^2}$); $Z$ standard normal r.v.
$W_1 - W_2$:  $W_1$, $W_2$ standard exponential r.v.'s
$(H_1 - H_2)/2$:  $H_1$, $H_2$ chi-square r.v.'s with two d.f.
$I\cdot W$:  $I$ random sign taking $\pm 1$ with equal probabilities; $W$ standard exponential r.v.
$\log(P_1/P_2)$:  $P_1$, $P_2$ Pareto Type I r.v.'s (p.d.f. $f(p) = 1/p^2$, $p > 1$)
$\log(U_1/U_2)$:  $U_1$, $U_2$ r.v.'s uniformly distributed on $[0,1]$
$U_1\cdot U_4 - U_2\cdot U_3$:  $U_1$, $U_2$, $U_3$, $U_4$ standard normal r.v.'s
$Y = \sum_{i=1}^{n}Y_{1i}^{(n)}Y_{2i}^{(n)}$:  $Y_{1i}^{(n)}$, $Y_{2i}^{(n)}$ gamma distributed r.v.'s with the density given by (2.4.3); see Proposition 2.4.1

Table 2.3: Summary of the representations of the standard classical Laplace distribution presented in this section. All variables in each representation are mutually independent.
2.2.6 Stability with respect to geometric summation
Stability, which is related to infinite divisibility, is a well-known property of the normal distribution. A formal definition is: if $X, X_1, X_2, \dots$ are i.i.d. normal, then for every positive integer $n$ there exist an $a_n > 0$ and a $b_n \in \mathbb{R}$ such that
\[
X \stackrel{d}{=} a_n(X_1 + \cdots + X_n) + b_n. \tag{2.2.18}
\]
In fact, the normal law is the only non-degenerate one with finite variance having this property.⁴ Under the geometric summation (2.2.1), the best-known property analogous to (2.2.18) is perhaps the following characterization of the exponential distribution: if $Y, Y_1, Y_2, \dots$ are positive and non-degenerate i.i.d. random variables with finite variance, then
\[
a_p\sum_{i=1}^{\nu_p}Y_i \stackrel{d}{=} Y_1 \quad \text{for all } p \in (0,1) \tag{2.2.19}
\]
if and only if $Y_1$ has an exponential distribution [see, e.g., Arnold (1973), Kakosyan et al. (1984), Milne and Yeo (1989)]. If, however, the $Y_i$'s are symmetric, then (2.2.19) characterizes the class of Laplace distributions. This is not surprising if one notes that, as already mentioned, the Laplace distribution is simply a symmetric extension of the standard exponential distribution.
We shall start the proof with the following lemma.

Lemma 2.2.1 Let $X_1, X_2, \dots$ be i.i.d. random variables with ch.f. $\psi$, and let $N$ be a positive, integer-valued random variable, independent of the $X_i$'s, with the generating function $G(z) = E(z^N)$. Then, the ch.f. of the r.v. $\sum_{i=1}^{N}X_i$ is $G(\psi(t))$.

Proof. Conditioning on $N$, we obtain directly:
\[
E\,e^{it\sum_{k=1}^{N}X_k} = \sum_{n=1}^{\infty}\psi^n(t)\,P(N = n) = E\,\psi^N(t).
\]
Proposition 2.2.7 Let $Y, Y_1, Y_2, \dots$ be non-degenerate and symmetric i.i.d. random variables with finite variance $\sigma^2 > 0$, and let $\nu_p$ be a geometric random variable with mean $1/p$, independent of the $Y_i$'s. Then, the following statements are equivalent:

(i) $Y$ is stable with respect to geometric summation, i.e., there exist constants $a_p > 0$ and $b_p \in \mathbb{R}$ such that
\[
a_p\sum_{i=1}^{\nu_p}(Y_i + b_p) \stackrel{d}{=} Y \quad \text{for all } p \in (0,1). \tag{2.2.20}
\]

(ii) $Y$ possesses the Laplace distribution with mean zero and variance $\sigma^2$.

Moreover, the constants $a_p$ and $b_p$ must be of the form $a_p = p^{1/2}$, $b_p = 0$.
⁴If the finite variance assumption is dropped, then the distributions satisfying (2.2.18) are called stable (Paretian stable, $\alpha$-stable) laws [see, e.g., Zolotarev (1986), Janicki and Weron (1994), Samorodnitsky and Taqqu (1994), and Nikias and Shao (1995)], of which the normal distribution is a special case.
Proof. We shall first establish the form of the normalizing constants in (2.2.20). Taking the expected value of both sides of (2.2.20) and exploiting independence, we arrive at
\[
0 = E[Y] = E[\nu_p]\,E[a_p(Y_i + b_p)].
\]
Since $E[\nu_p] = 1/p \neq 0$ and $a_p > 0$, in view of the symmetry of $Y_i$ we have $b_p = -E[Y_i] = 0$. Next we equate the variances of both sides of (2.2.20). Denoting by $S_p$ the left-hand side of (2.2.20), we can write the following well-known decomposition based on conditional variances:
\[
\mathrm{Var}[S_p] = \mathrm{Var}[E[S_p \mid \nu_p]] + E[\mathrm{Var}[S_p \mid \nu_p]].
\]
In the above expression the first term is zero, since
\[
E[S_p \mid \nu_p] = \nu_pa_pE[Y_i]
\]
and, as shown above, $E[Y_i] = 0$. Now, $E[\nu_p] = 1/p$, and the second term becomes
\[
E[\mathrm{Var}[S_p \mid \nu_p]] = E[\nu_p]\,a_p^2\sigma^2 = \left(\frac{a_p}{p^{1/2}}\right)^{\!2}\sigma^2.
\]
However, since the variance of the right-hand side of (2.2.20) is $\sigma^2$, we have
\[
\left(\frac{a_p}{p^{1/2}}\right)^{\!2} = 1,
\]
so that $a_p = p^{1/2}$.
We now turn to the equivalence between (i) and (ii) with $a_p = p^{1/2}$ and $b_p = 0$. By Lemma 2.2.1, in terms of ch.f.'s, relation (2.2.20) is expressed as
\[
\frac{p\,\psi(p^{1/2}t)}{1-(1-p)\,\psi(p^{1/2}t)} = \psi(t) \quad \text{for all } p \in (0,1) \text{ and all } t \in \mathbb{R}, \tag{2.2.21}
\]
where $\psi$ is the ch.f. of $Y$. [Note that $E(z^{\nu_p}) = pz/(1-(1-p)z)$.] Relation (2.2.21) will often be utilized in the sequel. Consequently, we also have, for all $t \in \mathbb{R}$,
\[
\frac{p\,\psi(p^{1/2}t)}{1-(1-p)\,\psi(p^{1/2}t)} \to \psi(t), \quad \text{as } p \to 0. \tag{2.2.22}
\]
Since $\psi(p^{1/2}t) \to \psi(0) = 1$, we obtain
\[
\frac{p}{1-(1-p)\,\psi(p^{1/2}t)} \to \psi(t), \quad \text{as } p \to 0, \tag{2.2.23}
\]
or, equivalently,
\[
\frac{1}{\frac{1}{p}\left[1-(1-p)\psi(p^{1/2}t)\right]} \to \psi(t), \quad \text{as } p \to 0, \tag{2.2.24}
\]
for all $t \in \mathbb{R}$. Now, since $Y$ possesses the first two moments, its ch.f. can be written as
\[
\psi(u) = 1 + iuE[Y] + \frac{(iu)^2}{2}\left(E[Y^2] + \delta\right) = 1 - \frac{u^2}{2}(\sigma^2 + \delta), \tag{2.2.25}
\]
where $\delta = \delta(u)$ denotes a bounded function of $u$ such that $\lim_{u\to 0}\delta(u) = 0$ [see, e.g., Theorem 8.44 in Breiman (1993)]. Utilizing (2.2.25) with $u = p^{1/2}t$, we can write the denominator in (2.2.24) as
\[
\frac{t^2}{2}(\sigma^2 + \delta) + 1 - \frac{pt^2}{2}(\sigma^2 + \delta), \tag{2.2.26}
\]
which converges to $\frac{1}{2}t^2\sigma^2 + 1$ as $p \to 0$, since $u = p^{1/2}t \to 0$ as $p \to 0$. Consequently,
\[
\frac{1}{\frac{1}{2}t^2\sigma^2 + 1} = \psi(t), \tag{2.2.27}
\]
so that $Y$ has the Laplace distribution with mean zero and variance $\sigma^2$. We have thus established the implication (i) $\Rightarrow$ (ii). To verify the reverse implication, all that is needed is to verify that the Laplace ch.f. (2.2.27) satisfies (2.2.21).
Proposition 2.2.7 is perhaps the first theorem in this book which requires somewhat delicate arguments. The result is due to Kakosyan et al. (1984), but the proof presented here differs from the original one.
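Proposition 2.2.7 can be illustrated by simulation: with $a_p = p^{1/2}$ and $b_p = 0$, a geometric sum of i.i.d. Laplace variates should again be Laplace with the same variance. The sketch below is illustrative (our code, not from the book): Laplace terms are generated as differences of exponentials (Proposition 2.2.2, variance 2), and the rescaled geometric sum is checked to keep variance 2 and fourth moment $6\cdot 2^2 = 24$, the Laplace values; the choice $p = 0.05$ and the seed are ours.

```python
import math
import random

random.seed(3)
p = 0.05
m = 20_000  # number of geometric sums simulated

def laplace():
    # standard classical Laplace via Proposition 2.2.2 (variance 2)
    return random.expovariate(1.0) - random.expovariate(1.0)

def geometric(p):
    # P(nu = k) = (1-p)^(k-1) * p, k = 1, 2, ... via inverse transform
    u = random.random()
    return max(1, int(math.ceil(math.log(1.0 - u) / math.log(1.0 - p))))

sums = []
for _ in range(m):
    nu = geometric(p)
    sums.append(math.sqrt(p) * sum(laplace() for _ in range(nu)))

var_est = sum(x * x for x in sums) / m   # theory: 2 (exact for every p)
m4_est = sum(x ** 4 for x in sums) / m   # theory: 24 (Laplace fourth moment)
```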
Remark 2.2.11 If the $Y_i$'s are positive r.v.'s but the assumption of finite variance is dropped, (2.2.19) characterizes the Mittag-Leffler distributions [see, e.g., Gnedenko (1970), Pillai (1990)]. These are distributions of positive r.v.'s with the Laplace transform
\[
E[e^{-sX}] = \frac{1}{1 + \sigma^{\alpha}s^{\alpha}},
\]
where $0 < \alpha \leq 1$; they reduce to an exponential distribution for $\alpha = 1$.
Remark 2.2.12 If the $Y_i$'s are symmetric but the assumption of finite variance is dropped, (2.2.19) characterizes the Linnik distributions [see Lin (1994), Kozubowski (1994b)]. Linnik distributions possess the ch.f.
\[
\psi(t) = \frac{1}{1 + \sigma^{\alpha}|t|^{\alpha}},
\]
where $0 < \alpha \leq 2$; they reduce to Laplace distributions for $\alpha = 2$. We shall study this class in Section 4.3 of Chapter 4.
Remark 2.2.13 If no assumptions on the distribution of the $Y_i$'s are imposed, relation (2.2.19) characterizes the so-called strictly geometric stable distributions [see, e.g., Klebanov et al. (1984), Janković (1992), Kozubowski (1994a)]. Further studies dealing with the stability relation (2.2.19) and its generalizations include Janjić (1984), Gnedenko and Janjić (1983), Janković (1993ab), Bunge (1993), Bunge (1996), Baringhaus and Grübel (1997), and Bouzar (1999).
Incidentally, relation (2.2.21) is equivalent to the following relation among random variables:
\[
Y \stackrel{d}{=} p^{1/2}IY_1 + (1-I)(Y_2 + p^{1/2}Y_3), \tag{2.2.28}
\]
where $Y, Y_1, Y_2, Y_3$ are i.i.d., while $I$ is an indicator (Bernoulli) random variable, independent of $Y, Y_1, Y_2, Y_3$, with $P(I = 1) = p$ and $P(I = 0) = 1 - p$.

Another relation among random variables that is also equivalent to (2.2.21) is
\[
Y \stackrel{d}{=} p^{1/2}Y_1 + (1-I)Y_2. \tag{2.2.29}
\]
The above relation is simply a restatement of representation (2.4.9) for symmetric Laplace r.v.'s with mean zero, to be discussed below.

Consequently, we have yet two more characterizations of the Laplace distribution, which can be obtained by computing the ch.f.'s of the right-hand sides of (2.2.28) and (2.2.29) and comparing them with relation (2.2.21).
Proposition 2.2.8 Let $Y, Y_1, Y_2, Y_3$ be non-degenerate, symmetric i.i.d. random variables with finite variance $\sigma^2 > 0$. Let $I$ be an indicator random variable with $P(I = 1) = p$ and $P(I = 0) = 1 - p$, independent of $Y_1, Y_2, Y_3$. Then, the following statements are equivalent:

(i) $Y$ satisfies relation (2.2.28) for all $p \in [0,1]$.

(ii) $Y$ satisfies relation (2.2.29) for all $p \in [0,1]$.

(iii) $Y$ has the Laplace distribution with mean zero and variance $\sigma^2$.
2.2.7 Distributional limits of geometric sums
An exponential distribution is not only stable with respect to geometric summation, but also appears to be the only possible non-degenerate limiting distribution of normalized geometric sums (2.2.1) with i.i.d. positive terms possessing finite expectations. If the $X_i$'s are i.i.d. non-negative r.v.'s with $\mu = E[X_1] < \infty$, then $pS_p$, where $S_p$ is given by (2.2.1), converges in distribution (as $p \to 0$) to an exponential r.v. with mean $\mu$. This result is due to Rényi (1956), obtained more than 40 years ago. In Kalashnikov's (1997) opinion, Rényi's theorem may explain the popularity of the exponential distribution among researchers in reliability, risk theory, and other fields where geometric sums (2.2.1) frequently arise. The connection between geometric sums, rarefactions of renewal processes, geometric compounding, and damage models was emphasized some 20 years later by Galambos and Kotz (1978).
Similarly, the Laplace distribution arises as a limit of $S_p$ when the $X_i$'s are symmetric with finite variance. Specifically, we have

Proposition 2.2.9 Let $S_p$ be given by (2.2.1), where $X_1, X_2, \dots$ are non-degenerate and symmetric i.i.d. r.v.'s with a finite variance, and let $\nu_p$ be a geometric r.v. with mean $1/p$, independent of the $X_i$'s. Then, the class of Laplace distributions with zero mean coincides with the class of non-degenerate distributional limits of $a_pS_p$ as $p \to 0$, where $a_p > 0$. Moreover, if $\mathrm{Var}[X_1] = \sigma^2$ and
\[
a_p\sum_{i=1}^{\nu_p}X_i \stackrel{d}{\longrightarrow} Y \quad \text{as } p \to 0, \tag{2.2.30}
\]
then there exists $\gamma > 0$ such that $a_p = p^{1/2}\gamma + o(p^{1/2})$, and $Y$ has a Laplace distribution with mean zero and variance $\sigma^2\gamma^2$.
Proof. Evidently, if $Y$ has a Laplace distribution, then in view of (2.2.20) the convergence (2.2.30) holds with $X_i \stackrel{d}{=} Y$ and $a_p = p^{1/2}$. It is therefore sufficient to show that if (2.2.30) holds with $\mathrm{Var}[X_1] = \sigma^2$, then for some $\gamma > 0$ the limit must have the Laplace distribution with mean zero and variance $\sigma^2\gamma^2$, where $a_p = p^{1/2}\gamma(1 + o(1))$.

Assume that (2.2.30) holds, the $X_i$'s being symmetric with $\mathrm{Var}[X_1] = \sigma^2$ and $Y$ being non-degenerate. In terms of ch.f.'s, by Lemma 2.2.1, we have
\[
\frac{p\,\phi(a_pt)}{1-(1-p)\,\phi(a_pt)} \to \psi(t), \quad \text{as } p \to 0, \text{ for all } t, \tag{2.2.31}
\]
where $\phi$ and $\psi$ are the ch.f.'s of $X_1$ and $Y$, respectively. First, note that for all $t$ we must have the convergence
\[
\phi(a_pt) \to 1, \quad \text{as } p \to 0. \tag{2.2.32}
\]
Indeed, by continuity of $\psi$ and the property $\psi(0) = 1$, we must have $\psi(t) \neq 0$ for all $t$ in an interval $(-\epsilon, \epsilon)$, where $\epsilon > 0$. Then, for such a $t$ the limit in (2.2.31) is non-zero, while the limit of the numerator in (2.2.31) is zero. Consequently, the denominator in (2.2.31) ought to converge to zero, so that (2.2.32) will hold for such a $t$. Take now any $t$ in the interval $(-2\epsilon, 2\epsilon)$ and use the inequality
\[
0 \leq 1 - \mathrm{Re}\,\phi(s) \leq 4\left(1 - \mathrm{Re}\,\phi(s/2)\right) \tag{2.2.33}
\]
with $s = a_pt$ to conclude that (2.2.32) holds for such a $t$. Inequality (2.2.33) follows directly from the trigonometric relation
\[
1 - \cos 2tx = 2(1 - \cos^2 tx) \leq 4(1 - \cos tx),
\]
since $\mathrm{Re}\,\phi(s)$ is the expected value of $\cos(sX)$. [The last inequality follows directly from $0 \leq (\cos tx - 1)^2$.] This implies that (2.2.32) holds for all $t$. Next, utilizing (2.2.32), we rewrite (2.2.31) in the form
\[
\frac{1}{\frac{1}{p}\left[1-(1-p)\phi(a_pt)\right]} \to \psi(t), \quad \text{as } p \to 0, \tag{2.2.34}
\]
for all $t \in \mathbb{R}$. Now, since (2.2.32) holds for all $t$ and $\phi$ is the ch.f. of a non-degenerate distribution, we must have
\[
a_p \to 0, \quad \text{as } p \to 0. \tag{2.2.35}
\]
Indeed, if (2.2.35) were not valid, we would have $a_{p_n} \to c$ for some sequence $p_n \to 0$, where $0 < c \leq \infty$, so that we would have
\[
\phi(a_{p_n}t) \to \phi(ct) = 1 \tag{2.2.36}
\]
for all $t$. But (2.2.36) implies that the distribution of $X_1$ is degenerate, contradicting our assumption. Thus, (2.2.35) must be valid.
Now, we proceed as in the proof of Proposition 2.2.7 and write the denominator of (2.2.34) in the form
\[
\left(\frac{a_p}{p^{1/2}}\right)^{\!2}\frac{t^2}{2}(\sigma^2 + \delta) + 1 - \frac{a_p^2t^2}{2}(\sigma^2 + \delta), \tag{2.2.37}
\]
where, as above, $\delta = \delta(u)$ denotes a bounded function of $u$ such that $\lim_{u\to 0}\delta(u) = 0$. Since as $p \to 0$ the expression (2.2.37) converges to a limit and, moreover, in view of (2.2.35),
\[
\frac{t^2}{2}(\sigma^2 + \delta) \to \frac{t^2\sigma^2}{2}, \qquad \frac{a_p^2t^2}{2}(\sigma^2 + \delta) \to 0, \tag{2.2.38}
\]
the term $a_p/p^{1/2}$ must converge to some limit $\gamma > 0$ (if the limit were zero, the expression (2.2.37) would converge to 1, implying that $\psi(t) \equiv 1$ and that $Y$ has a degenerate distribution). Consequently, we have verified the convergence in (2.2.34), where the limiting ch.f. is of the form
\[
\psi(t) = \frac{1}{1 + \frac{1}{2}\sigma^2\gamma^2t^2} \tag{2.2.39}
\]
and $a_p/p^{1/2} \to \gamma$, so that $a_p = p^{1/2}\gamma(1 + o(1))$. This completes the proof.
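Proposition 2.2.9 says the summands need not be Laplace: any symmetric distribution with finite variance yields a Laplace limit. The sketch below is illustrative (our code, not from the book): it uses uniform$(-1,1)$ summands ($\sigma^2 = 1/3$) and checks that $\sqrt{p}\,S_p$ has variance $\sigma^2$ and, for small $p$, a fourth moment near the Laplace value $6\sigma^4 = 2/3$ rather than the Gaussian value $3\sigma^4 = 1/3$; the choices $p = 0.01$, the sample size, and the seed are ours.

```python
import math
import random

random.seed(11)
p = 0.01
m = 20_000  # number of geometric sums simulated

def geometric(p):
    # P(nu = k) = (1-p)^(k-1) * p, k = 1, 2, ... via inverse transform
    u = random.random()
    return max(1, int(math.ceil(math.log(1.0 - u) / math.log(1.0 - p))))

sums = []
for _ in range(m):
    nu = geometric(p)
    s = sum(random.uniform(-1.0, 1.0) for _ in range(nu))
    sums.append(math.sqrt(p) * s)

var_est = sum(x * x for x in sums) / m   # theory: sigma^2 = 1/3 (exact for all p)
m4_est = sum(x ** 4 for x in sums) / m   # theory: -> 6*sigma^4 = 2/3 as p -> 0
```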
Remark 2.2.14 If no assumptions on the distribution of the $X_i$'s are imposed, then, as $p \to 0$, the weak limits of
\[
a_p\sum_{i=1}^{\nu_p}(X_i + b_p), \tag{2.2.40}
\]
where $a_p > 0$ and $b_p \in \mathbb{R}$, result in geometric stable (GS) laws [see, e.g., Mittnik and Rachev (1991)].
2.2.8 Stability with respect to the ordinary summation
We have seen in Section 2.2.6 that symmetric Laplace distributions are stable with respect to random summation (Proposition 2.2.7). When the summation is "deterministic," the Laplace distribution has the stability property (2.2.18) under a random normalization.

Before stating the main result of this subsection, we shall establish some auxiliary properties, in which we use the following notation for gamma densities with parameters $\alpha$ and $\beta$:
\[
f_{\alpha,\beta}(x) = \frac{x^{\alpha-1}e^{-x/\beta}}{\beta^{\alpha}\Gamma(\alpha)}.
\]
A non-random sum of i.i.d. Laplace random variables is no longer a Laplace variable. Instead, the sum admits the representation given below, which is a generalization of the representation (2.2.3) for a single Laplace random variable.
Proposition 2.2.10 Let $Y_1, Y_2, \dots$ be i.i.d. $\mathcal{L}(0,1)$ random variables. Then
\[
Y_1 + \cdots + Y_n \stackrel{d}{=} \sqrt{G_n}\,Z, \tag{2.2.41}
\]
where $G_n$ has a gamma distribution with parameters $\alpha = n$ and $\beta = 1$, and $Z$ is a standard normal r.v. independent of $G_n$.

Proof. Let the $Y_i$'s have the Laplace distribution $\mathcal{L}(0,1)$, in which case their ch.f. is
\[
\psi(t) = \frac{1}{1 + \frac{1}{2}t^2}. \tag{2.2.42}
\]
Thus, the ch.f. of the sum of $n$ i.i.d. copies of $Y_i$ is
\[
\left(\frac{1}{1 + \frac{1}{2}t^2}\right)^{\!n}. \tag{2.2.43}
\]
Note that the ch.f. of the product $\sqrt{G_n}\,Z$ of two independent r.v.'s, where $Z$ is standard normal and $G_n$ has a gamma distribution, is of the form
\[
\phi(t) = M_{G_n}(-t^2/2),
\]
where $M_{G_n}$ is the moment generating function of $G_n$ (this relation is evidently true if $G_n$ is replaced by an arbitrary random variable independent of $Z$). To conclude the proof, recall that the moment generating function of a gamma r.v. is of the form
\[
M_{G_n}(t) = \left(\frac{1}{1 - t}\right)^{\!n}. \tag{2.2.44}
\]
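Representation (2.2.41) can be verified by matching moments of the two sides. Below is an illustrative sketch (our code, not from the book): for $n = 3$, both the sum of three $\mathcal{L}(0,1)$ variates and $\sqrt{G_3}\,Z$ should have variance $n = 3$ and fourth moment $3n(n+1) = 36$; the helper names, sample size, and seed are our choices.

```python
import math
import random

random.seed(5)
n_terms = 3
m = 100_000

def laplace01():
    # L(0,1): symmetric Laplace with variance 1, ch.f. 1/(1 + t^2/2)
    return (random.expovariate(1.0) - random.expovariate(1.0)) / math.sqrt(2.0)

def gamma_n(k):
    # gamma(alpha=k, beta=1) as a sum of k standard exponentials
    return sum(random.expovariate(1.0) for _ in range(k))

lhs = [sum(laplace01() for _ in range(n_terms)) for _ in range(m)]
rhs = [math.sqrt(gamma_n(n_terms)) * random.gauss(0.0, 1.0) for _ in range(m)]

m2_lhs = sum(x * x for x in lhs) / m    # theory: n = 3
m2_rhs = sum(x * x for x in rhs) / m    # theory: n = 3
m4_lhs = sum(x ** 4 for x in lhs) / m   # theory: 3n(n+1) = 36
m4_rhs = sum(x ** 4 for x in rhs) / m   # theory: 3n(n+1) = 36
```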
In what follows let $B_n$ denote a beta distributed r.v. with parameters 1 and $n$, given by the density
\[
f(x) = n(1 - x)^{n-1}, \quad 0 < x < 1. \tag{2.2.45}
\]
The following result will be needed:

Lemma 2.2.2 Let $B_{n-1}$ and $G_n$ be independent r.v.'s having the beta distribution with parameters 1 and $n - 1$ and the gamma distribution with parameters $n$ and 1, respectively. Let $W$ be a standard exponential variable. Then, the representation
\[
W \stackrel{d}{=} G_nB_{n-1}
\]
is valid.
Proof. Let $G(\alpha)$ denote the gamma distribution with density
\[
f_{\alpha}(x) = \frac{x^{\alpha-1}e^{-x}}{\Gamma(\alpha)}.
\]
If $X_{\alpha_1} \sim G(\alpha_1)$ and $X_{\alpha_2} \sim G(\alpha_2)$ are independent, then it is well known that the two random variables
\[
X_{\alpha_1} + X_{\alpha_2} \quad \text{and} \quad \frac{X_{\alpha_1}}{X_{\alpha_1} + X_{\alpha_2}}
\]
are mutually independent, and their distributions are, respectively, $G(\alpha_1 + \alpha_2)$ and standard beta with parameters $\alpha_1$ and $\alpha_2$ [see also pp. 349-350 in Johnson et al. (1994)]. The independence of these two random variables is actually a characterization of the gamma distribution, as established by Lukacs (1955).

Take now $\alpha_1 = 1$ and $\alpha_2 = n - 1$ and observe that the standard exponential r.v. $X_{\alpha_1}$ can be expressed as the product of two independent variables,
\[
X_{\alpha_1} = (X_{\alpha_1} + X_{\alpha_2})\cdot\frac{X_{\alpha_1}}{X_{\alpha_1} + X_{\alpha_2}},
\]
where the first one is a $G(n)$ variable while the second is a beta variable with parameters 1 and $n - 1$.
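Lemma 2.2.2 is also easy to probe numerically. The sketch below is illustrative (our code, not from the book): it samples $G_nB_{n-1}$ for $n = 4$, using inverse-transform sampling for the beta$(1, n-1)$ factor, and checks the first two moments of a standard exponential r.v., $E[W] = 1$ and $E[W^2] = 2$; the parameter choice, sample size, and seed are ours.

```python
import random

random.seed(9)
n = 4
m = 200_000

def gamma_n(k):
    # gamma(alpha=k, beta=1) as a sum of k standard exponentials
    return sum(random.expovariate(1.0) for _ in range(k))

def beta_1(k):
    # beta(1, k): CDF is 1 - (1-x)^k; inverse transform at a uniform u
    return 1.0 - random.random() ** (1.0 / k)

sample = [gamma_n(n) * beta_1(n - 1) for _ in range(m)]

mean_est = sum(sample) / m               # theory: E[W] = 1
m2_est = sum(x * x for x in sample) / m  # theory: E[W^2] = 2
```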
We now state the main result.

Proposition 2.2.11 Let $Y, Y_1, Y_2, \dots$ be i.i.d. random variables with finite variance $\sigma^2 > 0$, and let $B_n$ be a r.v. independent of the $Y_i$'s, with density (2.2.45). Then, the following statements are equivalent:

(i) For all integers $n$ greater than 1,
\[
B_{n-1}^{1/2}\sum_{i=1}^{n}Y_i \stackrel{d}{=} Y. \tag{2.2.46}
\]

(ii) $Y$ has a symmetric Laplace distribution.
Proof. We shall first deal with the implication (i) $\Rightarrow$ (ii). Taking the expected value on both sides of (2.2.46), we have
\[
E[Y] = E[B_{n-1}^{1/2}]\left(E[Y_1] + \cdots + E[Y_n]\right) = nE[\sqrt{B_{n-1}}]\,E[Y]. \tag{2.2.47}
\]
This implies that $E[Y] = 0$, since $nE[\sqrt{B_{n-1}}] \neq 1$ (for example, $E[\sqrt{B_1}] = 2/3$, since $B_1$ is uniformly distributed on $[0,1]$).

Next, write the left-hand side of (2.2.46) in the form $\sqrt{U_n}\,V_n$, where
\[
U_n = nB_{n-1} \quad \text{and} \quad V_n = \frac{\sum_{i=1}^{n}Y_i}{n^{1/2}}, \tag{2.2.48}
\]
and let $n \to \infty$. Then $U_n$ converges in distribution to a random variable $W$ with the standard exponential distribution. Indeed, $P(U_n \leq u) = 1 - (1 - u/n)^{n-1}$, $u \in (0, n)$, which converges to $1 - e^{-u}$, $u \geq 0$. By the Central Limit Theorem, $V_n$ converges to a normal r.v. with mean zero and variance $\sigma^2$. Since, by assumption, $U_n$ is independent of $V_n$, the limit of the product $\sqrt{U_n}\,V_n$ is the product of the limits, so that
\[
\sqrt{U_n}\,V_n \stackrel{d}{\longrightarrow} W^{1/2}\sigma Z. \tag{2.2.49}
\]
This is, however, a representation of a Laplace r.v. with mean zero and variance $\sigma^2$ [see Proposition 2.2.1 and the remarks following it]. To complete the proof of the implication (i) $\Rightarrow$ (ii), observe that $Y$ must have the same distribution as the limit in (2.2.49), since by (i), (2.2.46) holds for all $n > 1$.

We now turn to the proof of the implication (ii) $\Rightarrow$ (i). Multiply both sides of (2.2.41) by $B_{n-1}^{1/2}$ (which is independent of the other r.v.'s) to obtain
\[
B_{n-1}^{1/2}(Y_1 + \cdots + Y_n) \stackrel{d}{=} (G_nB_{n-1})^{1/2}\,\sigma Z. \tag{2.2.50}
\]
By Lemma 2.2.2, the product $G_nB_{n-1}$ has the same distribution as a standard exponential r.v. $W$, so that the right-hand side of (2.2.50) has the Laplace distribution (with variance $\sigma^2$) by the representation (2.2.41) with $n = 1$. The proof is thus completed.
Remark 2.2.15 Relation (2.2.46) characterizes the Laplace distribution even if the assumption of finite variance of the $Y_i$'s is dropped. The proof of this result available so far is highly technical; see Pakes (1992ab).

Remark 2.2.16 Proceeding in the same manner as in the proof of Proposition 2.2.11, one can show that within the class of positive r.v.'s the stability relation
\[
B_{n-1}\sum_{i=1}^{n}Y_i \stackrel{d}{=} Y, \quad n \geq 2,
\]
characterizes the exponential distributions [see, e.g., Kotz and Steutel (1988), Yeo and Milne (1989), Huang and Chen (1989)]. Similarly, for any $0 < \alpha < 1$, the relation
\[
B_{n-1}^{1/\alpha}\sum_{i=1}^{n}Y_i \stackrel{d}{=} Y, \quad n \geq 2, \tag{2.2.51}
\]
characterizes the Mittag-Leffler distributions mentioned above, which follows from the results of Pakes (1992ab) and Alamatsaz (1993).

Remark 2.2.17 If the $Y_i$'s are symmetric, then for any $0 < \alpha \leq 2$ relation (2.2.51) characterizes Linnik distributions with index $\alpha$ [see Chapter 4, Section 4.3]. If no assumptions on the distribution of the $Y_i$'s are imposed, then for any $0 < \alpha \leq 2$, relation (2.2.51) characterizes strictly geometric stable distributions, which follows from the results of Pakes (1992ab) and Alamatsaz (1993).
2.2.9 Distributional limits of deterministic sums

One of the basic versions of the central limit theorem (CLT) states that whenever $X_1, X_2, \dots$ is a sequence of i.i.d. random variables with mean $\mu$ and variance $\sigma^2 < \infty$, the sequence of partial sums
\[
a_n\sum_{i=1}^{n}(X_i - \mu), \tag{2.2.52}
\]
where $a_n = n^{-1/2}$, converges in distribution to a normal r.v. with mean zero and variance $\sigma^2$. As we have seen in Section 2.2.7, the limit may not have a normal distribution if the number of terms in the summation is a random variable. Similarly, we may arrive at a non-normal limit of (2.2.52) if the normalizing sequence $a_n$ is random. The following result shows that under beta-distributed $a_n$'s we obtain in the limit a Laplace distribution. We thus have an additional characterization of this class.
Proposition 2.2.12 Let $X_1, X_2, \dots$ be non-degenerate i.i.d. r.v.'s with mean $\mu$ and finite variance, and for each $n > 1$ let the r.v. $B_n$ be independent of the $X_i$'s and have a beta distribution with density (2.2.45). Then, as $n \to \infty$, the class of non-degenerate distributional limits of (2.2.52) with $a_n = B_{n-1}^{1/2}$ coincides with the class of Laplace distributions with zero mean.

Proof. Evidently, if $Y$ has a Laplace distribution, then in view of (2.2.46), $Y$ is the limit of (2.2.52) with $X_i \stackrel{d}{=} Y$. Thus, it is sufficient to show that the sums (2.2.52) with $a_n = B_{n-1}^{1/2}$ converge to a Laplace distribution. To this end, we proceed as in the proof of Proposition 2.2.11, writing (2.2.52) as $U_nV_n$, where
\[
U_n = (nB_{n-1})^{1/2} \quad \text{and} \quad V_n = \frac{\sum_{i=1}^{n}(X_i - \mu)}{n^{1/2}}, \tag{2.2.53}
\]
and analogously showing that the limit of the product has indeed a Laplace distribution.
2.3 Functions of Laplace random variables

In this section we discuss distributions of certain standard functions of independent Laplace random variables, including their sum, product, and ratio.

2.3.1 The distribution of the sum of independent Laplace variates

Let us first consider two independent classical Laplace random variables $X_1$ and $X_2$ with densities
\[
f_i(x) = \frac{1}{2s_i}\,e^{-|x|/s_i}, \quad i = 1, 2, \quad x \in \mathbb{R}. \tag{2.3.1}
\]
Our goal is to find the probability distribution of the sum
\[
Y = X_1 + X_2. \tag{2.3.2}
\]
By symmetry, the difference $X_1 - X_2$ has the same distribution as the sum (2.3.2). Using Proposition 2.2.2, one can write each $X_i$ as a difference of exponential random variables, so that the sum of two independent Laplace r.v.'s is a linear combination of four independent standard exponential variables, denoted below by $Z_i$'s:
\[
Y \stackrel{d}{=} s_1(Z_1 - Z_2) + s_2(Z_3 - Z_4). \tag{2.3.3}
\]
(This lack of closure is in contrast with the normal case, where the sum of independent normal variables is normal.) Rearranging the terms, we have
\[
Y \stackrel{d}{=} (s_1Z_1 - s_2Z_4) - (s_1Z_2 - s_2Z_3) = \sqrt{s_1s_2}\,(W_1 - W_2), \tag{2.3.4}
\]
where
\[
W_1 = \frac{1}{\kappa}Z_1 - \kappa Z_4 \quad \text{and} \quad W_2 = \frac{1}{\kappa}Z_2 - \kappa Z_3 \tag{2.3.5}
\]
are independent and identically distributed random variables, and
\[
\kappa = \sqrt{\frac{s_2}{s_1}} \tag{2.3.6}
\]
is a positive constant.
We proceed by first finding the distribution of the W_i's and then the
distribution of their difference. To accomplish the first step, we shall use
the following result.
Lemma 2.3.1 Let G_1 and G_2 be i.i.d. random variables with standard
gamma distribution given by the density

    g(x) = \frac{1}{Γ(ν)} x^{ν−1} e^{−x},  ν > 0,  x > 0.    (2.3.7)

Let κ be a positive constant. Then, the probability density of the random
variable

    W = \frac{1}{κ} G_1 − κ G_2    (2.3.8)

is

    h(x) = \frac{1}{Γ(ν)\sqrt{π}} \left( \frac{|x|}{κ + 1/κ} \right)^{ν−1/2} e^{\frac{1}{2}(1/κ − κ)x} K_{ν−1/2}\!\left( \tfrac{1}{2}(1/κ + κ)|x| \right),  x ≠ 0,    (2.3.9)

where K_λ is the modified Bessel function of the third kind with index λ,
given in Appendix A.
Remark 2.3.1 The distribution with density (2.3.9) is for obvious reasons
known as the Bessel function distribution [see, e.g., Pearson et al. (1929)].
We shall study this class of distributions in Section 4.1 of Chapter 4.
Proof. First, note that the densities of X_1 = \frac{1}{κ} G_1 and X_2 = κ G_2 are κg(κx)
and \frac{1}{κ} g(x/κ), respectively, where g is the density of G_1 (and G_2) given by
(2.3.7). Next, by independence, the joint density of X_1 and X_2 is

    f(x_1, x_2) = κg(κx_1)\,\frac{1}{κ} g\!\left(\frac{x_2}{κ}\right) = \frac{1}{[Γ(ν)]^2} (x_1 x_2)^{ν−1} e^{−κx_1 − \frac{1}{κ}x_2},  x_1, x_2 > 0.    (2.3.10)
Consider the one-to-one transformation W = X_1 − X_2, Z = X_2. The inverse
transformation, X_1 = W + Z, X_2 = Z, has Jacobian equal to one, so
that the joint density of W and Z is

    p(w, z) = f(w + z, z),  z, w + z > 0.    (2.3.11)

The marginal density of W = X_1 − X_2 can be found by integrating the
joint density (2.3.11) with respect to z:

    h(w) = \int_{−∞}^{∞} f(w + z, z)\,dz.    (2.3.12)

Combining (2.3.10) and (2.3.12), for w < 0 we obtain

    h(w) = \frac{1}{[Γ(ν)]^2} e^{−κw} \int_{−w}^{∞} z^{ν−1}(z + w)^{ν−1} e^{−(κ + \frac{1}{κ})z}\,dz.    (2.3.13)

Now, an application of the integration formula (A.0.14) for Bessel functions
(see Appendix A), with μ = ν, u = −w, and β = κ + κ^{−1}, leads to (2.3.9).
Similarly, for w > 0, we have

    h(w) = \frac{1}{[Γ(ν)]^2} e^{−κw} \int_{0}^{∞} z^{ν−1}(z + w)^{ν−1} e^{−(κ + \frac{1}{κ})z}\,dz.    (2.3.14)

The change of variable x = w + z results in

    h(w) = \frac{1}{[Γ(ν)]^2} e^{\frac{1}{κ}w} \int_{w}^{∞} x^{ν−1}(x − w)^{ν−1} e^{−(κ + \frac{1}{κ})x}\,dx.    (2.3.15)

Another application of (A.0.14), this time with u = w, produces (2.3.9).
The result follows.
To find the density of the W_i's given by (2.3.5), we apply Lemma 2.3.1
with ν = 1. Here, the Bessel function with index 1/2 has a closed form
given by (A.0.11) in Appendix A, and the density of W_1 takes the form

    h(x) = \frac{1}{Γ(1)\sqrt{π}} \left( \frac{|x|}{κ + 1/κ} \right)^{1/2} e^{\frac{1}{2}(1/κ − κ)x} K_{1/2}\!\left( \tfrac{1}{2}(1/κ + κ)|x| \right)
         = \frac{1}{\sqrt{π}} \frac{|x|^{1/2}}{(κ + 1/κ)^{1/2}} e^{\frac{1}{2}(1/κ − κ)x} \frac{\sqrt{π}}{((1/κ + κ)|x|)^{1/2}} e^{−\frac{1}{2}(1/κ + κ)|x|}
         = \frac{1}{κ + 1/κ} e^{\frac{1}{2}(1/κ − κ)x − \frac{1}{2}(1/κ + κ)|x|},

which can be written as

    h(x) = \frac{1}{κ + 1/κ} \begin{cases} e^{−κ|x|}, & \text{for } x ≥ 0, \\ e^{−|x|/κ}, & \text{for } x < 0. \end{cases}    (2.3.16)
Remark 2.3.2 For κ ≠ 1 we obtain an asymmetric Laplace distribution,
to be studied in detail in Chapter 3.
Next, we shall derive the distribution of the difference W_1 − W_2, where the
W_i's are i.i.d. variables defined by (2.3.5) with densities given by (2.3.16).
Proposition 2.3.1 Let W_1 and W_2 be i.i.d. r.v.'s with density (2.3.16).
Then, the density of V = W_1 − W_2 is

    h(x) = \begin{cases} \frac{1}{4}(1 + |x|)\,e^{−|x|}, & x ∈ ℝ, \text{ for } κ = 1, \\[4pt] \frac{1}{2}\,\frac{κ}{1 − κ^4} \left( e^{−κ|x|} − κ^2 e^{−\frac{1}{κ}|x|} \right), & x ∈ ℝ, \text{ for } κ ∈ (0, 1) ∪ (1, ∞). \end{cases}    (2.3.17)
Proof. The density of V = W_1 − W_2 is related to the common density of
W_1 and W_2 as follows:

    f_V(x) = \int_{−∞}^{∞} h(x + y)\,h(y)\,dy.    (2.3.18)

Since the probability density of the difference of two i.i.d. random variables
is symmetric, it is sufficient to consider x > 0. Splitting the region of
integration according to the positivity and negativity of the arguments of
h(x + y) and h(y), we obtain f_V(x) = I_1 + I_2 + I_3, where

    I_1 = \int_{−∞}^{−x} h(x + y)h(y)\,dy,  I_2 = \int_{−x}^{0} h(x + y)h(y)\,dy,  I_3 = \int_{0}^{∞} h(x + y)h(y)\,dy.    (2.3.19)

We evaluate the above integrals utilizing (2.3.16):

    I_1 = \left( \frac{1}{κ + 1/κ} \right)^2 \int_{−∞}^{−x} e^{\frac{1}{κ}(x+y)} e^{\frac{1}{κ}y}\,dy = \left( \frac{1}{κ + 1/κ} \right)^2 \frac{κ}{2}\, e^{−\frac{1}{κ}x},    (2.3.20)

    I_2 = \left( \frac{1}{κ + 1/κ} \right)^2 \int_{−x}^{0} e^{−κ(x+y)} e^{\frac{1}{κ}y}\,dy = \left( \frac{1}{κ + 1/κ} \right)^2 \begin{cases} \frac{1}{1/κ − κ} \left( e^{−κx} − e^{−\frac{1}{κ}x} \right), & \text{for } κ ≠ 1, \\ x e^{−x}, & \text{for } κ = 1, \end{cases}    (2.3.21)

    I_3 = \left( \frac{1}{κ + 1/κ} \right)^2 \int_{0}^{∞} e^{−κ(x+y)} e^{−κy}\,dy = \left( \frac{1}{κ + 1/κ} \right)^2 \frac{1}{2κ}\, e^{−κx}.    (2.3.22)

Combining (2.3.20)–(2.3.22) and simplifying, we obtain the density (2.3.17)
of V.
We now return to the representation (2.3.4) of Y. Using Proposition 2.3.1
along with (2.3.6), we obtain the following density of the sum X_1 + X_2 [and
of the difference X_1 − X_2]:

    f_{X_1+X_2}(x) = \begin{cases} \frac{1}{4s} \left( 1 + \frac{|x|}{s} \right) e^{−|x|/s}, & \text{for } s_1 = s_2 = s, \\[4pt] \frac{1}{2(s_1^2 − s_2^2)} \left( s_1 e^{−|x|/s_1} − s_2 e^{−|x|/s_2} \right), & \text{for } s_1 ≠ s_2. \end{cases}    (2.3.23)

Remark 2.3.3 Note that the distribution of the sum of two independent
Laplace r.v.'s with the same scale parameter is of a different type, and much
simpler, than that arising when the scale parameters are different.
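The two cases of (2.3.23) can be checked numerically, since the density of the sum must coincide with the convolution of the two Laplace densities (2.3.1). The following sketch (in Python; the helper names and quadrature settings are ours, not part of the text) performs this comparison by brute force.

```python
import math

def laplace_pdf(x, s):
    # classical Laplace density (2.3.1)
    return math.exp(-abs(x) / s) / (2.0 * s)

def sum_pdf(x, s1, s2):
    # density (2.3.23) of X1 + X2, covering both the equal- and
    # unequal-scale cases
    if s1 == s2:
        s = s1
        return (1.0 + abs(x) / s) * math.exp(-abs(x) / s) / (4.0 * s)
    return (s1 * math.exp(-abs(x) / s1) - s2 * math.exp(-abs(x) / s2)) \
        / (2.0 * (s1 ** 2 - s2 ** 2))

def convolve_at(x, s1, s2, lo=-60.0, hi=60.0, n=24000):
    # brute-force convolution integral (f1 * f2)(x) by the trapezoidal rule
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        y = lo + k * h
        w = 0.5 if k in (0, n) else 1.0
        total += w * laplace_pdf(y, s1) * laplace_pdf(x - y, s2)
    return total * h
```

For instance, sum_pdf(0.7, 1.0, 2.0) and convolve_at(0.7, 1.0, 2.0) agree to within the quadrature error, and sum_pdf is symmetric in x, as it must be.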
Remark 2.3.4 Weida (1935) obtained the distribution of the difference
X_1 − X_2 by inverting the relevant characteristic function. His derivation,
however, seems to be not quite correct.
Next, we consider the case of more than two identically distributed and
independent standard classical Laplace r.v.'s with a common density given
by (2.3.1) with the scale parameter equal to 1. Recall that the sum of n
such variables has a representation in terms of gamma and standard normal
random variables (Proposition 2.2.10). Now, Lemma 2.3.1 can be used for
the derivation of the density of the sum T of these i.i.d. random variables
(as well as the density of the corresponding arithmetic mean). Indeed, since
for each i = 1, ..., n we have

    X_i =^d Z_i − Z'_i,

where Z_i and Z'_i are i.i.d. standard exponential variables (Proposition
2.2.2), it follows that

    T = n\bar{X}_n = \sum_{i=1}^{n} X_i =^d \sum_{i=1}^{n} Z_i − \sum_{i=1}^{n} Z'_i = G_1 − G_2,    (2.3.24)

where G_1 and G_2 are i.i.d. standard gamma r.v.'s with density (2.3.7) with
the shape parameter ν = n. Thus, the density of the sum T is given by
(2.3.9) with ν = n and κ = 1. Since the Bessel function K_{ν−1/2} admits
the closed form (A.0.10) for ν = n, we obtain the following formula for the
density of T:

    f_T(x) = \frac{e^{−|x|}}{(n−1)!\,2^n} \sum_{j=0}^{n−1} \frac{(n−1+j)!}{(n−1−j)!\,j!} \frac{|x|^{n−1−j}}{2^j},  x ∈ ℝ.    (2.3.25)

For the arithmetic mean \bar{X}_n = T/n we have the density

    f_{\bar{X}_n}(x) = n f_T(nx),  x ∈ ℝ.    (2.3.26)
In the following result we present a useful representation of T derived in
Kou (2000) (see Exercise 2.7.18).

Proposition 2.3.2 Let X_1, ..., X_n be i.i.d. standard classical Laplace
variables. Then,

    T = X_1 + ··· + X_n =^d I · \sum_{j=1}^{M_n} Z_j,    (2.3.27)

where the Z_j's are i.i.d. standard exponential variables, I takes on the
values ±1 with probabilities 1/2, and M_n is an integer-valued r.v. given by
the probability function

    P(M_n = j) = \frac{2^j}{2^{2n−1}} \binom{2n−j−1}{n−1},  j = 1, 2, ..., n.    (2.3.28)

[The Z_j's, I, and M_n are mutually independent, and 0^0 is defined as 1.]
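Kou's representation (2.3.27)–(2.3.28) says that the density (2.3.25) is a symmetrized mixture of gamma densities with integer shape parameters 1, ..., n. This identity is easy to confirm numerically; the sketch below (Python; function names are ours) compares the two expressions pointwise.

```python
import math

def p_M(n, j):
    # probability function (2.3.28)
    return 2 ** j / 2 ** (2 * n - 1) * math.comb(2 * n - j - 1, n - 1)

def f_T(x, n):
    # density (2.3.25) of the sum of n i.i.d. standard classical Laplace r.v.'s
    a = abs(x)
    s = sum(math.factorial(n - 1 + j)
            / (math.factorial(n - 1 - j) * math.factorial(j))
            * a ** (n - 1 - j) / 2 ** j for j in range(n))
    return math.exp(-a) / (math.factorial(n - 1) * 2 ** n) * s

def f_T_mixture(x, n):
    # density implied by (2.3.27): a random sign I times a Gamma(M_n, 1)
    # variable, i.e. (1/2) * sum_j P(M_n=j) |x|^(j-1) e^{-|x|} / (j-1)!
    # (Python's 0.0 ** 0 == 1.0 matches the convention 0^0 = 1)
    a = abs(x)
    g = sum(p_M(n, j) * a ** (j - 1) * math.exp(-a) / math.factorial(j - 1)
            for j in range(1, n + 1))
    return 0.5 * g
```

For n = 2 the mixture reduces to (1/4)(1 + |x|)e^{−|x|}, in agreement with (2.3.23) at s = 1.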
Table 2.4 below contains the densities of \bar{X}_n for sample sizes n = 1, 2, 3, 4,
which were worked out in Craig (1932)^5 [see also Edwards (1948)]. Weida
(1935), in one of the early papers devoted to the Laplace distribution,
obtained an expression for the density of \bar{X}_n by inverting the relevant
characteristic function. However, his formula is not as simple as ours and
involves the derivative of order n−1 (with respect to t) of the function
e^{itnx}(1 + it)^{−n}.

Remark 2.3.5 As noted by Johnson et al. (1995), many authors considered
sums or arithmetic means and related statistics under an underlying
Laplace model, including Hausdorff (1901), Craig (1932), Weida (1935), and
Sassa (1968). In particular, Balakrishnan and Kocherlakota (1986) utilized
the density (2.3.26) in studying the effects of nonnormality on \bar{X}-charts.
They showed that the probabilities α (false alarm) and 1 − β (true alarm)
remain almost unchanged when the underlying normal distribution is
replaced by the Laplace distribution, and concluded that no modification of
the control charts was necessary in this case.

^5 In Craig (1932) the coefficient of |x|^2 for n = 4 contains a printing error (98 instead
of 96).
n    Density of \bar{X}_n

1    f(x) = \frac{1}{2} e^{−|x|},  x ∈ ℝ

2    f(x) = \frac{1}{2}(1 + 2|x|)\,e^{−2|x|},  x ∈ ℝ

3    f(x) = \frac{9}{16}(1 + 3|x| + 3|x|^2)\,e^{−3|x|},  x ∈ ℝ

4    f(x) = \frac{1}{24}(15 + 60|x| + 96|x|^2 + 64|x|^3)\,e^{−4|x|},  x ∈ ℝ

Table 2.4: Densities of the sample mean \bar{X}_n for samples of selected sizes n
from a standard classical Laplace distribution with ch.f. ψ(t) = (1 + t^2)^{−1}.
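Each row of Table 2.4 follows from (2.3.25)–(2.3.26); the sketch below (Python; helper names are ours) re-derives the table entries from the general formula and compares them with the closed forms.

```python
import math

def f_T(x, n):
    # density (2.3.25) of the sum T of n i.i.d. standard classical
    # Laplace variables
    a = abs(x)
    s = sum(math.factorial(n - 1 + j)
            / (math.factorial(n - 1 - j) * math.factorial(j))
            * a ** (n - 1 - j) / 2 ** j for j in range(n))
    return math.exp(-a) / (math.factorial(n - 1) * 2 ** n) * s

def f_mean(x, n):
    # density (2.3.26) of the sample mean: n * f_T(n x)
    return n * f_T(n * x, n)

def table_row(x, n):
    # closed forms from Table 2.4
    a = abs(x)
    if n == 1:
        return 0.5 * math.exp(-a)
    if n == 2:
        return 0.5 * (1 + 2 * a) * math.exp(-2 * a)
    if n == 3:
        return 9.0 / 16.0 * (1 + 3 * a + 3 * a ** 2) * math.exp(-3 * a)
    if n == 4:
        return (15 + 60 * a + 96 * a ** 2 + 64 * a ** 3) / 24.0 * math.exp(-4 * a)
```

The agreement (to rounding error) also confirms the corrected coefficient 96 for n = 4 mentioned in footnote 5.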
2.3.2 The distribution of the product of two independent Laplace variates

Consider two independent classical Laplace random variables X_1 and X_2
with densities (2.3.1). We shall find the probability distribution of the
random variable

    Y = X_1 X_2.    (2.3.29)

Since X_i =^d s_i I_i W_i, where for i = 1, 2, W_i is standard exponential while
I_i is independent of W_i and takes on the values ±1 with probabilities 1/2
(Proposition 2.2.3), we have

    Y =^d s_1 s_2 (I_1 I_2) W_1 W_2 = s_1 s_2\, I\, W_1 W_2,    (2.3.30)

where I = I_1 I_2 is independent of the W_i's and has the same distribution
as each of the I_i's. Consequently, we need to find the distribution of the
product of two independent standard exponential random variables. For
x > 0 we have
    P(W_1 W_2 ≤ x) = \int_0^{∞} P\!\left( W_1 ≤ \frac{x}{z} \right) e^{−z}\,dz = 1 − \int_0^{∞} e^{−(xz^{−1} + z)}\,dz,

as P(W_1 < u) = 1 − e^{−u}. We now utilize the definition (A.0.4) of Bessel
functions (see Appendix A) with λ = 1 and u = 2\sqrt{x} (noting that K_{−λ} = K_λ)
to obtain

    \int_0^{∞} e^{−(xz^{−1} + z)}\,dz = 2\sqrt{x}\,K_1(2\sqrt{x}),

so that the distribution function of W_1 W_2 takes the form

    F_{W_1 W_2}(x) = 1 − 2\sqrt{x}\,K_1(2\sqrt{x}).    (2.3.31)

Next, we take the derivative, using the relations (A.0.8) and (A.0.9) for
Bessel functions (see Appendix A), to obtain an expression for the
probability density of W_1 W_2:

    f_{W_1 W_2}(x) = 2K_0(2\sqrt{x}).    (2.3.32)
Thus, in view of (2.3.30), the density of Y is

    f_Y(x) = \frac{1}{s_1 s_2}\, K_0\!\left( 2\sqrt{\frac{|x|}{s_1 s_2}} \right),  x ∈ ℝ.    (2.3.33)

It is interesting to compare (2.3.33) with the density of the product of
two independent normal variables with means equal to zero and the same
variances as those of X_1 and X_2,

    g(x) = \frac{1}{2π s_1 s_2}\, K_0\!\left( \frac{|x|}{2 s_1 s_2} \right),    (2.3.34)

see, e.g., Craig (1936). In both cases the density of the product depends on
x through the same Bessel function K_0, and the argument in the Laplace
case is essentially the square root of the argument in the normal case (thus,
in a sense, the product retains the original structure of these distributions).
Graphs of these two densities are presented in Figure 2.3 (left).
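Both product densities can be evaluated from the integral representation K_0(z) = ∫_0^∞ e^{−z cosh t} dt, which avoids any special-function library. The sketch below (Python; quadrature settings are ours) also re-derives (2.3.33) directly by conditioning on W_2, as a cross-check.

```python
import math

def K0(z):
    # modified Bessel function of the third kind, index 0, computed from
    # K_0(z) = integral_0^inf exp(-z cosh t) dt by the trapezoidal rule
    n, hi = 4000, 12.0
    h = hi / n
    total = 0.0
    for k in range(n + 1):
        w = 0.5 if k in (0, n) else 1.0
        total += w * math.exp(-z * math.cosh(k * h))
    return total * h

def product_pdf(x, s1, s2):
    # density (2.3.33) of Y = X1 * X2
    c = s1 * s2
    return K0(2.0 * math.sqrt(abs(x) / c)) / c

def product_pdf_direct(x, s1, s2):
    # same density derived directly from Y = s1 s2 I W1 W2 by conditioning
    # on W2 = z:  f(x) = (1/(2c)) * integral_0^inf (1/z) exp(-z - u/z) dz,
    # with u = |x|/c and c = s1 s2
    c = s1 * s2
    u = abs(x) / c
    n, hi = 8000, 40.0
    h = hi / n
    total = 0.0
    for k in range(1, n + 1):
        z = k * h
        total += (0.5 if k == n else 1.0) * math.exp(-z - u / z) / z
    return 0.5 * total * h / c
```

The two routes agree to quadrature accuracy, and the density is symmetric in x, as it must be.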
2.3.3 The distribution of the ratio of two independent Laplace variates

Let X_1 and X_2 be two independent classical Laplace random variables
with densities (2.3.1). We seek the probability distribution of the random
variable

    Y = \frac{X_1}{X_2}.    (2.3.35)

Using the representation X_i =^d s_i I_i W_i given in Proposition 2.2.3, we have

    Y =^d \frac{s_1}{s_2} \frac{I_1}{I_2} \frac{W_1}{W_2} = \frac{s_1}{s_2}\, I\, \frac{W_1}{W_2},    (2.3.36)
Figure 2.3: Densities of the product (left) and the ratio (right) of two i.i.d.
standard Laplace random variables (dashed lines) vs. two i.i.d. standard
Gaussian random variables (solid lines).
where I = I_1/I_2 takes the values ±1 with equal probabilities and is
independent of the standard exponential r.v.'s W_i. We are thus required to
find the distribution of the ratio of two independent standard exponential
random variables.

First, we find the distribution function by conditioning. For x > 0 we
have

    P\!\left( \frac{W_1}{W_2} ≤ x \right) = \int_0^{∞} P(W_1 ≤ xz)\, e^{−z}\,dz = 1 − \int_0^{∞} e^{−z(x+1)}\,dz = 1 − \frac{1}{1 + x}.

Hence, the ratio W_1/W_2 has a standard Pareto distribution of the second
kind [the so-called Lomax distribution; see, e.g., Johnson et al. (1994), p.
575, or Springer (1979), p. 161] with density

    f_{W_1/W_2}(x) = \frac{1}{(1 + x)^2},  x ≥ 0.    (2.3.37)

Consequently, the distribution of Y is a "double" Pareto distribution with
density^6

    f_Y(x) = \frac{1}{2} \frac{s_2}{s_1} \frac{1}{\left(1 + (s_2/s_1)|x|\right)^2},  x ∈ ℝ.    (2.3.38)

^6 It should be noted that our result does not fully agree with Weida (1935).
Note that, as in the normal case, where the ratio of two mean-zero normal
random variables has a Cauchy distribution, the distribution with density
(2.3.38) has infinite mean and variance. (However, the fractional moments
E|Y|^α do exist for 0 < α < 1.) In the i.i.d. case the densities of the ratio
of two mean-zero Laplace and of two mean-zero normal variables are

    \frac{1}{2} \frac{1}{(1 + |x|)^2}  and  \frac{1}{π} \frac{1}{1 + x^2},  x ∈ ℝ,

respectively. Graphs of these two densities are shown in Figure 2.3 (right).

Remark 2.3.6 Note that the same distribution arises under an appropriate
randomization of the scale parameter s of the classical Laplace
distribution CL(0, s) (see Exercise 2.7.48).
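The Lomax form of W_1/W_2 and the resulting double Pareto density (2.3.38) can be confirmed by direct quadrature; the following Python sketch (names and settings are ours) reproduces the conditioning integral above and checks the half-line mass of f_Y against the Lomax c.d.f.

```python
import math

def ratio_pdf(x, s1, s2):
    # "double Pareto" density (2.3.38)
    r = s2 / s1
    return 0.5 * r / (1.0 + r * abs(x)) ** 2

def exp_ratio_cdf_numeric(x, n=20000, hi=40.0):
    # P(W1/W2 <= x) = integral_0^inf (1 - exp(-x z)) exp(-z) dz,
    # computed by the trapezoidal rule; should equal 1 - 1/(1+x)
    h = hi / n
    total = 0.0
    for k in range(n + 1):
        z = k * h
        w = 0.5 if k in (0, n) else 1.0
        total += w * (1.0 - math.exp(-x * z)) * math.exp(-z)
    return total * h

def ratio_cdf_half(m, s1, s2, n=20000):
    # integral_0^m of ratio_pdf, which should equal
    # (1/2) * (1 - 1/(1 + (s2/s1) m))
    h = m / n
    total = 0.0
    for k in range(n + 1):
        w = 0.5 if k in (0, n) else 1.0
        total += w * ratio_pdf(k * h, s1, s2)
    return total * h
```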
Function         Distribution

Y_1 + ··· + Y_n   f(x) = \frac{e^{−|x|}}{(n−1)!\,2^n} \sum_{j=0}^{n−1} \frac{(n−1+j)!}{(n−1−j)!\,j!} \frac{|x|^{n−1−j}}{2^j},  x ∈ ℝ.

X_1 ± X_2        f(x) = \begin{cases} \frac{1}{4s}\left(1 + \frac{|x|}{s}\right) e^{−|x|/s}, & s_1 = s_2 = s, \\[4pt] \frac{1}{2(s_1^2 − s_2^2)}\left( s_1 e^{−|x|/s_1} − s_2 e^{−|x|/s_2} \right), & s_1 ≠ s_2. \end{cases}

X_1 · X_2        f(x) = \frac{1}{s_1 s_2}\, K_0\!\left( 2\sqrt{\frac{|x|}{s_1 s_2}} \right),  x ∈ ℝ.

X_1 / X_2        f(x) = \frac{1}{2} \frac{s_2}{s_1} \frac{1}{\left(1 + (s_2/s_1)|x|\right)^2},  x ∈ ℝ.

Table 2.5: Densities and distributions of sums, products, and ratios of
independent Laplace random variables. Here the Y_i, i = 1, ..., n, are i.i.d.
standard classical Laplace CL(0, 1) r.v.'s, while X_1 and X_2 are independent
CL(0, s_1) and CL(0, s_2) r.v.'s.
2.3.4 The t-statistic for a double exponential (Laplace) distribution

Let X_1, ..., X_n be i.i.d. variables with common density f, where f(x) > 0
for all x. Define

    \bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i,  S_n^2 = \sum_{i=1}^{n} (X_i − \bar{X}_n)^2.    (2.3.39)

The independence of \bar{X}_n and S_n^2 is a unique property of the normal
distribution, and plays an important role in the derivation of the probability
distribution of the t-statistic,

    T_n = \frac{\bar{X}_n − θ}{σ/\sqrt{n}} \Big/ \sqrt{\frac{1}{n−1} \frac{S_n^2}{σ^2}} = \frac{\sqrt{n}\,(\bar{X}_n − θ)}{S_n/\sqrt{n−1}},    (2.3.40)
where θ and σ^2 are the mean and the variance of X_1. In this section, we
shall follow Sansing (1976) and discuss the distribution of T_n defined above
when the parent population is classical Laplace (with the mean equal to
zero). Let

    f_{\bar{X}_n, S_n^2}(x, y),  −∞ < x < ∞,  y > 0,    (2.3.41)

be the joint density of \bar{X}_n and S_n^2 based on a sample of size n ≥ 2. Sansing
and Owen (1974) derived a recursive relation for f_{\bar{X}_n, S_n^2}, presented below.
Lemma 2.3.2 Let X_1, X_2, ... be i.i.d. variables with common density f,
where f(x) > 0 for all x. Then, for any n ≥ 2, −∞ < x < ∞, and y > 0,
we have

    f_{\bar{X}_{n+1}, S_{n+1}^2}(x, y) = \sqrt{\frac{n+1}{n}\, y}\, \int_{−1}^{1} w(u)\,du,    (2.3.42)

where

    w(u) = f_{\bar{X}_n, S_n^2}\!\left( x + \frac{u\sqrt{y}}{\sqrt{n(n+1)}},\; y(1 − u^2) \right) f\!\left( x − u\sqrt{\frac{n}{n+1}\, y} \right).    (2.3.43)
Proof. Note that

    \bar{X}_{n+1} = \frac{n}{n+1} \bar{X}_n + \frac{1}{n+1} X_{n+1}    (2.3.44)

and

    S_{n+1}^2 = S_n^2 + \frac{n}{n+1} \left( \bar{X}_n − X_{n+1} \right)^2.    (2.3.45)

Since X_{n+1} is independent of \bar{X}_n and S_n^2, the joint density of \bar{X}_n, S_n^2,
and X_{n+1} is

    f_{\bar{X}_n, S_n^2}(x, y)\, f(z).    (2.3.46)

Using the auxiliary variable

    U = \sqrt{\frac{n}{n+1}}\, \frac{1}{S_{n+1}} \left( \bar{X}_n − X_{n+1} \right),    (2.3.47)

we obtain the relation (2.3.42); see Sansing and Owen (1974) for details.
For n = 2, we get directly

    f_{\bar{X}_2, S_2^2}(x, y) = \sqrt{\frac{2}{y}}\, f\!\left( x + \sqrt{\frac{y}{2}} \right) f\!\left( x − \sqrt{\frac{y}{2}} \right),  −∞ < x < ∞,  y > 0,    (2.3.48)

while for n = 3 the relation (2.3.42) produces

    f_{\bar{X}_3, S_3^2}(x, y) = \sqrt{3} \int_{−1}^{1} (1 − u^2)^{−1/2} \prod_{i=1}^{3} f\!\left( x + \sqrt{y}\, a_{i3}(u) \right) du,    (2.3.49)

where

    a_{13}(u) = \frac{u}{\sqrt{6}} + \sqrt{\frac{1 − u^2}{2}},
    a_{23}(u) = \frac{u}{\sqrt{6}} − \sqrt{\frac{1 − u^2}{2}},
    a_{33}(u) = −u\sqrt{\frac{2}{3}},

so that

    \sum_{i=1}^{3} a_{i3}(u) = 0,  \sum_{i=1}^{3} a_{i3}^2(u) = 1.
Assume now that f is the density (2.1.1) of the classical Laplace
distribution with mean θ = 0 and scale parameter s > 0. Then, for a random
sample of size n = 2, the joint density of \bar{X}_2 and S_2^2 is

    f_{\bar{X}_2, S_2^2}(x, y) = \frac{1}{4s^2} \sqrt{\frac{2}{y}} · \begin{cases} e^{−2|x|/s}, & \text{if } |x| ≥ \sqrt{y/2}, \\ e^{−\sqrt{2y}/s}, & \text{if } |x| < \sqrt{y/2}. \end{cases}    (2.3.50)

[Note that here S_2^2 is simply (X_1 − X_2)^2/2.] Thus, the density of the t-
statistic when n = 2 is (Exercise 2.7.23)

    f_{T_2}(t) = \begin{cases} \frac{1}{4}, & \text{if } |t| < 1, \\[4pt] \frac{1}{4t^2}, & \text{if } |t| ≥ 1. \end{cases}    (2.3.51)
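Formula (2.3.51) implies, in particular, P(|T_2| < 1) = 1/2 and P(|T_2| > 2) = 1/4; indeed, for n = 2 the statistic reduces to T_2 = (X_1 + X_2)/|X_1 − X_2|, so |T_2| < 1 exactly when X_1 and X_2 have opposite signs. A Monte Carlo sketch (Python; the seed and sample size are arbitrary choices of ours):

```python
import math
import random

def laplace_sample(rng):
    # standard classical Laplace as a difference of two i.i.d. standard
    # exponentials (Proposition 2.2.2)
    return rng.expovariate(1.0) - rng.expovariate(1.0)

def t2(rng):
    # the t-statistic (2.3.40) with n = 2 and theta = 0 (sigma cancels)
    x1, x2 = laplace_sample(rng), laplace_sample(rng)
    xbar = 0.5 * (x1 + x2)
    s2 = (x1 - xbar) ** 2 + (x2 - xbar) ** 2   # S_2^2 = (x1 - x2)^2 / 2
    return math.sqrt(2.0) * xbar / math.sqrt(s2)

rng = random.Random(12345)
N = 100000
vals = [t2(rng) for _ in range(N)]
inside = sum(abs(v) < 1.0 for v in vals) / N   # target: 1/2
tail = sum(abs(v) > 2.0 for v in vals) / N     # target: 1/4
```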
For n = 3, we obtain from (2.3.49)

    f_{\bar{X}_3, S_3^2}(x, y) = \frac{\sqrt{3}}{8s^3} \int_{−1}^{1} (1 − u^2)^{−1/2} e^{−\frac{\sqrt{y}}{s} \sum_{i=1}^{3} \left| \frac{x}{\sqrt{y}} + a_{i3}(u) \right|}\,du,    (2.3.52)

with the functions a_{i3}(u) as before. As noted by Sansing (1976), in the
region |x/\sqrt{y}| ≥ \sqrt{2/3} we can express (2.3.52) as follows:

    f_{\bar{X}_3, S_3^2}(x, y) = \frac{\sqrt{3}\,π}{8s^3}\, e^{−3|x|/s},  \left| \frac{x}{\sqrt{y}} \right| ≥ \sqrt{\frac{2}{3}}.    (2.3.53)

Further, a similar relation holds for other sample sizes as well [see Sansing
(1976)]:

    f_{\bar{X}_n, S_n^2}(x, y) = \frac{\sqrt{n}\, π^{(n−1)/2}\, y^{\frac{n−3}{2}}}{2^n s^n\, Γ\!\left( \frac{n−1}{2} \right)}\, e^{−n|x|/s},  \left| \frac{x}{\sqrt{y}} \right| ≥ \sqrt{\frac{n−1}{n}}.    (2.3.54)
Using (2.3.54), we follow Sansing (1976) to derive the distribution function
of the t-statistic (2.3.40) for t > n − 1:

    F_{T_n}(t) = 1 − \frac{π^{(n−1)/2}\, Γ(n−1)}{\sqrt{n}\, 2^{n−1}\, Γ\!\left( \frac{n−1}{2} \right)} \left( \frac{n−1}{n} \right)^{(n−1)/2} t^{−(n−1)}.    (2.3.55)

Finally, differentiating (2.3.55), we obtain the p.d.f. of T_n:

    f_{T_n}(t) = \frac{π^{(n−1)/2}\, Γ(n)}{\sqrt{n}\, 2^{n−1}\, Γ\!\left( \frac{n−1}{2} \right)} \left( \frac{n−1}{n} \right)^{(n−1)/2} |t|^{−n},  |t| > n − 1.    (2.3.56)
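For n = 2, the tail 1 − F_{T_2}(t) = 1/(4t) implied by (2.3.55) matches the tail of (2.3.51) exactly; for larger n the tail can be checked by simulation. A Monte Carlo sketch (Python; seed, sample size, and tolerances are arbitrary choices of ours):

```python
import math
import random

def laplace(rng):
    # standard classical Laplace via a difference of exponentials
    return rng.expovariate(1.0) - rng.expovariate(1.0)

def t_stat(rng, n):
    # the t-statistic (2.3.40) with theta = 0 (sigma cancels)
    xs = [laplace(rng) for _ in range(n)]
    xbar = sum(xs) / n
    s2 = sum((x - xbar) ** 2 for x in xs)
    return math.sqrt(n) * xbar / math.sqrt(s2 / (n - 1))

def tail_formula(t, n):
    # 1 - F_{T_n}(t) from (2.3.55), valid for t > n - 1
    const = (math.pi ** ((n - 1) / 2) * math.gamma(n - 1)
             / (math.sqrt(n) * 2 ** (n - 1) * math.gamma((n - 1) / 2)))
    return const * ((n - 1) / n) ** ((n - 1) / 2) * t ** (-(n - 1))

rng = random.Random(2001)
n, t0, N = 3, 3.0, 200000
est = sum(t_stat(rng, n) > t0 for _ in range(N)) / N
```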
Remark 2.3.7 Note that the tails of the density (2.3.56) are heavier than
those of the corresponding t-distribution with n degrees of freedom
(Exercise 2.7.25).

Remark 2.3.8 As noted by Sansing (1976), the evaluation of the joint
density of \bar{X}_n and S_n^2 in the region where |x|/\sqrt{y} < \sqrt{(n−1)/n} is quite
complicated. For this case Sansing (1976) derived upper and lower bounds
for the joint density, leading to the corresponding bounds for the density
f_{T_n} of the t-statistic in the region |t| ≤ n − 1, where the exact formula
(2.3.56) is not valid.
Remark 2.3.9 Gallo (1979) considered an analog of the t-statistic defined
as

    \tilde{T}_n = \frac{U_n}{V_n},    (2.3.57)

where

    U_n = \sum_{i=1}^{n} (X_i − θ)  and  V_n = \sum_{i=1}^{n} |X_i − θ|    (2.3.58)

[and X_1, ..., X_n is a random sample from the CL(θ, s) distribution]. The
joint distribution of U_n and V_n [derived in Gallo (1979)] consists of a
continuous part supported on the region

    I = \{(u, v) : v ≥ 0,\; −v < u < v\}    (2.3.59)

and a singular part concentrated on the boundary of I. The corresponding
statistic \tilde{T}_n defined in (2.3.57) has support in the interval [−1, 1] (see
Exercise 2.7.24). The distribution function of \tilde{T}_n is

    \tilde{F}_n(x) = \begin{cases} 0, & \text{for } x < −1, \\[4pt] \frac{1}{2^n} \left\{ 1 + \sum_{i=1}^{n−1} a_i \int_0^{∞} Γ\!\left( i, \frac{1−x}{1+x}\, z \right) z^{n−i−1} e^{−z}\,dz \right\}, & \text{for } −1 ≤ x < 1, \\[4pt] 1, & \text{for } x ≥ 1, \end{cases}    (2.3.60)

where

    a_i = \binom{n}{i} \frac{1}{Γ(i)} \frac{1}{Γ(n−i)}

and

    Γ(a, y) = \int_y^{∞} t^{a−1} e^{−t}\,dt    (2.3.61)

is the incomplete gamma function. Note that the distribution of \tilde{T}_n is a
mixture of point masses at ±1 (each with probability 1/2^n) and a continuous
part (occurring with probability 1 − 2/2^n) with density

    \tilde{f}_n(x) = \frac{2^n}{2^n − 2} \frac{Γ(n)}{2^{2n−1}} \sum_{i=1}^{n−1} a_i (1 − x)^{i−1} (1 + x)^{n−i−1},  −1 < x < 1,    (2.3.62)

see Gallo (1979).^7

^7 Note that the c.d.f. and the p.d.f. of \tilde{T}_n derived in Gallo (1979) may contain some
misprints.
2.4 Further properties

2.4.1 Infinite divisibility

The notion of infinite divisibility plays a fundamental role in the study of
central limit theorems and Lévy processes. A probability distribution with
ch.f. ψ is infinitely divisible if for any integer n ≥ 1 we have ψ = φ_n^n, where
φ_n is another characteristic function. In other words, a r.v. Y with ch.f. ψ
has the representation

    Y =^d \sum_{i=1}^{n} X_i    (2.4.1)

for some i.i.d. random variables X_i. The importance of the class of infinitely
divisible distributions follows from the fact that they are the limits of the
sums of the rows of triangular arrays (X_{n,i})_{n ∈ ℕ, i=1,...,n}, where the terms in
each row are i.i.d. (Here, ℕ denotes the set of natural numbers.) Thus,
roughly speaking, if we have a large number of independent and similar
random effects which add together, the resulting distribution will be
approximately infinitely divisible.
According to (2.2.7), the ch.f. (2.1.8) of a classical Laplace distribution
CL(θ, s) can be factored as follows:

    \frac{e^{iθt}}{(1 − ist)(1 + ist)} = \left[ e^{iθt/n} \left( \frac{1}{1 − ist} \right)^{1/n} \left( \frac{1}{1 + ist} \right)^{1/n} \right]^n = φ_n^n(t).    (2.4.2)

For each integer n ≥ 1, the function φ_n is the ch.f. of θ/n + Y_{1n} − Y_{2n},
where Y_{1n} and Y_{2n} are i.i.d. with the ch.f. (1 − ist)^{−1/n}. The latter is the
ch.f. of a gamma distribution with density

    \frac{(1/s)^{1/n}}{Γ(1/n)}\, x^{\frac{1}{n} − 1} e^{−x/s},  x ≥ 0.    (2.4.3)

Consequently, Laplace distributions are infinitely divisible,^8 and we state
the result formally in

Proposition 2.4.1 Let Y have a Laplace distribution with ch.f. (2.1.8).
Then, the distribution of Y is infinitely divisible. Furthermore, for every
integer n ≥ 1, representation (2.4.1) holds, where each X_i is distributed as
θ/n + Y_{1n} − Y_{2n}, with Y_{1n} and Y_{2n} i.i.d. with the gamma density (2.4.3).
The ch.f. of every infinitely divisible distribution admits a unique canonical
Lévy-Khinchine representation. Several variations of this representation,
using different spectral measures, are known. Here we consider the
representation which states that the ch.f. of an infinitely divisible distribution
can be written uniquely in the form

    ψ(t) = \exp\left\{ iat − \frac{1}{2} b^2 t^2 + \int_{−∞}^{∞} \left( e^{itx} − 1 − it \sin x \right) dΛ(x) \right\},    (2.4.4)

where −∞ < a < ∞, b ≥ 0, and Λ is a Lévy measure on (−∞, ∞),
characterized by the properties Λ(\{0\}) = 0 and \int_{−∞}^{∞} \min(1, x^2)\,dΛ(x) < ∞.
Below, we present the Lévy-Khinchine representation of a Laplace
distribution [see Takano (1988) for a detailed treatment of the d-dimensional
density Ce^{−‖x‖}, where ‖x‖ is the length of the vector x, including the one-
dimensional case d = 1].

^8 Dugué (1951) raised the question of the existence of a probability law which is not
infinitely divisible but still can be written as a sum of two independent random variables
with distributions parameterized by a continuous parameter. Mistakenly, the Laplace
distribution was used as an example. As pointed out by Lukacs (1957), the example
is not valid, as the Laplace distribution is infinitely divisible. Lukacs (1957) also
constructed another example, which answers the question originally raised by Dugué (1951)
in the affirmative.
Proposition 2.4.2 The ch.f. (2.1.8) of a general classical Laplace
distribution CL(θ, s) admits the Lévy-Khinchine representation (2.4.4) with

    a = θ,  b = 0,  dΛ(x) = \frac{1}{|x|} e^{−|x|/s}\,dx.    (2.4.5)

Proof. It is sufficient to prove the result for the standard classical Laplace
distribution. We need to show that

    \frac{1}{1 + t^2} = e^{2 \int_0^{∞} (\cos(xt) − 1) e^{−x} x^{−1}\,dx},

or, equivalently,

    −\ln(1 + t^2) = 2 \int_0^{∞} (\cos(xt) − 1)\, e^{−x} x^{−1}\,dx.    (2.4.6)

Since both sides of (2.4.6) have well-defined Taylor series representations
about zero, it is enough to demonstrate that the coefficients in these
representations coincide.

The left-hand side has the coefficients

    a_n = \begin{cases} (−1)^{n/2}\, 2(n−1)!, & \text{if } n \text{ is even}, \\ 0, & \text{if } n \text{ is odd}. \end{cases}

We now compute the coefficients of the right-hand side. Denoting c(t, x) =
\cos(xt) − 1, we have for n ≥ 1:

    \frac{∂^n c(t, x)}{∂t^n} = \begin{cases} (−1)^{n/2}\, x^n \cos(tx), & \text{if } n \text{ is even}, \\ (−1)^{(n+1)/2}\, x^n \sin(tx), & \text{if } n \text{ is odd}. \end{cases}

Consequently, the nth coefficient of the Taylor representation is zero for
odd n, while for even n it is given by

    2 \int_0^{∞} (−1)^{n/2} x^{n−1} e^{−x}\,dx = 2(−1)^{n/2} (n−1)!.

This completes the proof.
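Identity (2.4.6) can also be confirmed numerically; since the integrand behaves like −xt²/2 near zero, the integral is harmless to compute by quadrature. A Python sketch (step sizes and cutoff are ours):

```python
import math

def lk_integral(t, n_steps=100000, hi=50.0):
    # right-hand side of (2.4.6): 2 * integral_0^inf (cos(xt) - 1) e^{-x} / x dx.
    # The integrand tends to 0 as x -> 0, so the x = 0 endpoint contributes
    # nothing to the trapezoidal sum.
    h = hi / n_steps
    total = 0.0
    for k in range(1, n_steps + 1):
        x = k * h
        w = 0.5 if k == n_steps else 1.0
        total += w * (math.cos(x * t) - 1.0) * math.exp(-x) / x
    return 2.0 * h * total
```

The result should agree with −ln(1 + t²) for every t.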
Remark 2.4.1 For comparison, the Lévy-Khinchine representation of the
normal distribution with mean μ and variance σ^2 is simply

    ψ(t) = e^{iμt − \frac{σ^2 t^2}{2}},

and the Lévy measure Λ is zero in this case.
2.4.2 Geometric infinite divisibility

A r.v. Y (and its probability distribution) is said to be geometrically
infinitely divisible if for any p ∈ (0, 1) it satisfies the relation

    Y =^d \sum_{i=1}^{ν_p} Y_p^{(i)},    (2.4.7)

where ν_p is a geometric r.v. with mean 1/p, the random variables Y_p^{(i)} are
i.i.d. for each p, and ν_p and (Y_p^{(i)}) are independent [see, e.g., Klebanov et
al. (1984)]. It can be shown that geometrically infinitely divisible laws are
the limits of sums of the form \sum_{i=1}^{ν_p} X_{ν_p, i}, where the terms in each row
are i.i.d. conditionally on ν_p, and their number ν_p is random, geometrically
distributed, and independent of the X_{n,i}. Thus, if we have a large random,
geometrically distributed number of independent and similar random
effects (but depending on the number of effects) which add up together, the
observed distribution will be approximately geometrically infinitely divisible.
This property justifies the interest in, and the importance of, this class of
distributions for probabilistic model construction and analysis. The following
proposition, which is a direct consequence of Proposition 2.2.7, establishes
the geometric infinite divisibility of Laplace distributions.

Proposition 2.4.3 Let Y possess a classical Laplace distribution CL(0, s).
Then, Y is geometrically infinitely divisible, and for any p ∈ (0, 1) relation
(2.4.7) holds with Y_p^{(i)} ∼ CL(0, s√p).
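Proposition 2.4.3 amounts to the ch.f. identity p φ_p(t) / (1 − (1 − p) φ_p(t)) = 1/(1 + s²t²), where φ_p is the ch.f. of CL(0, s√p) and the left-hand side is the ch.f. of a Geometric(p) random sum. The following sketch checks the identity on a grid (Python; names are ours):

```python
import math

def laplace_chf(t, s):
    # ch.f. of CL(0, s): 1 / (1 + s^2 t^2)
    return 1.0 / (1.0 + (s * t) ** 2)

def geometric_sum_chf(t, s, p):
    # ch.f. of a Geometric(p) random sum (support 1, 2, ...) of i.i.d.
    # CL(0, s*sqrt(p)) terms: E[phi(t)^{nu_p}] = p*phi / (1 - (1-p)*phi)
    phi = laplace_chf(t, s * math.sqrt(p))
    return p * phi / (1.0 - (1.0 - p) * phi)
```

The agreement is exact (up to rounding), for every t, every p ∈ (0, 1), and every scale s.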
2.4.3 Self-decomposability

A random variable Y (and its probability distribution) is self-decomposable
if for each c ∈ (0, 1) it has the representation

    Y =^d cY + X,    (2.4.8)

where X and Y are independent (the distribution of X may depend on c).
In terms of ch.f.'s this means that the function ψ(t)/ψ(ct), where ψ is the
ch.f. of Y, is a ch.f. for each c ∈ (0, 1). Evidently, normal distributions are
self-decomposable, as the corresponding ratio is the ch.f. of a normal
distribution. Laplace distributions are also self-decomposable, as was shown by
Ramachandran (1997). Below, we present explicitly the corresponding
representation (2.4.8).
where δ
1
and δ
2
are dependent r.v.’s taking on values of either zero or one
with the probabilities
P (δ
1
= 0, δ
2
= 0) = c
2
, P (δ
1
= 1, δ
2
= 1) = 0
P (δ
1
= 1, δ
2
= 0) = P(δ
1
= 0, δ
2
= 1) =
1
2
(1 c
2
).
The r.v.’s W
1
and W
2
are standard exponential and Y , W
1
, W
2
, (δ
1
, δ
2
)
are mu tually independent.
Proof. Write Y = θ+sX, where X is the standard classical Laplace variable.
Note that the ch.f. of X given by (2.1.7) can be factored as follows:
1
(1 + ict)(1 ict)
c
2
+
1
2
(1 c
2
)
1
1 it
+
1
2
(1 c
2
)
1
1 + it
,
(2.4.10)
where the first factor is the ch. f. of cX while the second one is the ch.f. of
δ
1
W
1
δ
2
W
2
. Consequently, we obtain the represe ntation
X
d
= cX + δ
1
W
1
δ
2
W
2
. (2.4.11)
To arrive at (2.4.9), combine (2.4.11) with
Y = θ + sX.
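The factorization (2.4.10) can be verified mechanically with complex arithmetic; a Python sketch (names are ours):

```python
def chf_standard_laplace(t):
    # ch.f. (2.1.7) of the standard classical Laplace distribution
    return 1.0 / (1.0 + t * t)

def chf_cX(t, c):
    # first factor in (2.4.10): the ch.f. of cX
    return 1.0 / (1.0 + (c * t) ** 2)

def chf_residual(t, c):
    # second factor in (2.4.10): the ch.f. of delta1*W1 - delta2*W2,
    # mixing the constants c^2 and (1 - c^2)/2 with the ch.f.'s of W1 and -W2
    z1 = 1.0 / (1.0 - 1j * t)   # ch.f. of a standard exponential W1
    z2 = 1.0 / (1.0 + 1j * t)   # ch.f. of -W2
    return c ** 2 + 0.5 * (1 - c ** 2) * z1 + 0.5 * (1 - c ** 2) * z2
```

The product chf_cX(t, c) * chf_residual(t, c) reproduces chf_standard_laplace(t) for every t and every c ∈ (0, 1).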
We summarize the stability properties of the Laplace distribution in Table
2.6 below. In the second part of Section 2.2 and throughout most of Section
2.4, we have studied various distributional relations involving Laplace
distributions. In these relations, unlike those presented in the first part
of Section 2.2, random variables distributed according to Laplace
distributions appear on both sides of the distributional equalities. For this reason,
we term them stability properties of Laplace distributions.

Stability property                                              Variables

Y =^d \sqrt{p} \sum_{i=1}^{ν_p} Y_i =^d \sum_{i=1}^{ν_p} Y_p^{(i)}      ν_p — geometric r.v. with parameter p;
                                                                Y_p^{(i)} — i.i.d. CL(0, \sqrt{p}·s) r.v.'s

Y =^d \sqrt{p}\, I Y_1 + (1 − I)(Y_2 + \sqrt{p}\, Y_3)          I — 0-1 r.v. with P(I = 1) = p

Y =^d \sqrt{p}\, Y_1 + (1 − I) Y_2                              I — 0-1 r.v. with P(I = 1) = p

Y =^d \sqrt{B_{n−1}}\, (Y_1 + ··· + Y_n)                        B_{n−1} — beta r.v. with parameters
                                                                1 and n − 1

Y =^d cY + s(δ_1 W_1 − δ_2 W_2)                                 W_1, W_2 — standard exponential r.v.'s;
                                                                δ_1, δ_2 — 0-1 r.v.'s given in
                                                                Proposition 2.4.4

Table 2.6: Summary of stability properties of the classical Laplace
distribution. The variables Y and Y_i's are CL(0, s). All the variables in each
representation are mutually independent.
2.4.4 Complete monotonicity

A function f defined on an interval I ⊂ ℝ is called completely monotone
(respectively, absolutely monotone) if it is infinitely differentiable on
I and (−1)^k f^{(k)}(x) ≥ 0 (respectively, f^{(k)}(x) ≥ 0) for any x ∈ I and any
k = 0, 1, 2, .... Since the derivatives of the Laplace density are
straightforward to calculate, it is easy to see that the p.d.f. of the classical Laplace
distribution with mean zero is completely monotone on (0, ∞) [and
absolutely monotone on (−∞, 0)]. As noted by Dreier (1999), every symmetric
density on (−∞, ∞) which is completely monotone on (0, ∞) is a scale
mixture of Laplace densities.

Proposition 2.4.5 Let f be a symmetric (about zero) probability density
on (−∞, ∞) which is completely monotone on (0, ∞). Then, there exists a
distribution function G on (0, ∞) such that

    f(x) = \int_0^{∞} \frac{1}{2}\, y\, e^{−y|x|}\,dG(y),  x ≠ 0,    (2.4.12)

while the ch.f. corresponding to f is

    ψ(t) = \int_0^{∞} \frac{1}{1 + t^2/y^2}\,dG(y),  −∞ < t < ∞.    (2.4.13)

Proof. The result follows from the fact that every completely monotone
density on (0, ∞) is a scale mixture of exponential densities on (0, ∞); see
Steutel (1970).

Remark 2.4.2 The converse of Proposition 2.4.5 clearly holds as well:
every density of the form (2.4.12) with some c.d.f. G on (0, ∞) is a symmetric
density on (−∞, ∞) which is completely monotone on (0, ∞).
Remark 2.4.3 The central moment

    μ_{2m} = E[X^{2m}]    (2.4.14)

of the CL(0, s) random variable X is equal to (2m)!\,s^{2m} [cf. (2.1.14)].
Consequently, for every 1 ≤ l ≤ r we have

    \left( \frac{μ_{2l}}{(2l)!} \right)^{\frac{1}{2l}} = \left( \frac{μ_{2r}}{(2r)!} \right)^{\frac{1}{2r}},    (2.4.15)

since each side of (2.4.15) is equal to s. Actually, the Laplace distribution
is the only symmetric distribution on (−∞, ∞) with completely monotone
density on (0, ∞) for which the equality in (2.4.15) holds; for all other
symmetric random variables X on (−∞, ∞) with completely monotone
density on (0, ∞) and finite 2mth moment (2.4.14), we have the inequality

    \left( \frac{μ_{2l}}{(2l)!} \right)^{\frac{1}{2l}} ≤ \left( \frac{μ_{2r}}{(2r)!} \right)^{\frac{1}{2r}},  1 ≤ l ≤ r ≤ m,    (2.4.16)

see Dreier (1999).
2.4.5 Maximum entropy property

One of the basic concepts of information theory is the notion of entropy,
which is a measure of uncertainty associated with a probability
distribution. The maximum entropy principle states that, of all distributions that
satisfy certain constraints, one should select the one with the largest
entropy. A maximum entropy distribution is believed not to incorporate any
extraneous information other than that which is specified by the relevant
constraints. Thus, finding the maximum entropy distribution can be
considered a general inference procedure, and indeed it was initially proposed
by Jaynes (1957) in this manner. It has been successfully applied in a great
variety of fields, including statistical mechanics, statistics, stock market
analysis, queuing theory, image analysis, and reliability estimation [see,
e.g., Kapur (1993)].

For a one-dimensional r.v. X with density (or probability function) f,
the entropy of X is defined by

    H(X) = E[−\log f(X)].    (2.4.17)
It is well known that among all continuous r.v.'s with mean zero and given
variance, the Gaussian (normal) distribution has the largest entropy
[see, e.g., Reza (1961)]. Similarly, the Laplace distribution maximizes the
entropy among all continuous distributions with a given first absolute
moment, as noted by Kagan et al. (1973). Both results easily follow from the
following proposition, proved in Kagan et al. (1973).

Proposition 2.4.6 [Kagan, Linnik, and Rao]. Let X be a r.v. with density

    p(x) > 0 for x ∈ (a, b) and p(x) = 0 otherwise.    (2.4.18)

Let h_1, h_2, ... be integrable functions on (a, b) satisfying, for given constants
g_1, g_2, ..., the conditions

    \int_a^b h_i(x)\, p(x)\,dx = g_i,  i = 1, 2, ....    (2.4.19)

Then, the maximum entropy is attained for the distributions with density
of the form

    p(x) = e^{a_0 + a_1 h_1(x) + ···}    (2.4.20)

(and only by them), provided there exist constants a_0, a_1, ... such that the
above density satisfies the conditions (2.4.18) and (2.4.19).
How can we deduce the entropy maximization property of the Laplace distribution from the above proposition? Consider continuous random variables with density $p$ satisfying (2.4.18) with $a = -\infty$, $b = \infty$, and such that
$$\int_{-\infty}^{\infty} |x|\,p(x)\,dx = c > 0. \qquad (2.4.21)$$
Then, according to Proposition 2.4.6, the maximum entropy is attained by the density
$$p(x) = e^{a_0 + a_1 |x|}, \quad x \in (-\infty, \infty), \qquad (2.4.22)$$
for some constants $a_0$ and $a_1$. Let us find the constants so that the function (2.4.22) integrates to 1 on $(-\infty, \infty)$ and satisfies the condition (2.4.21). First, note that $a_1 < 0$ to ensure the integrability of $p$. Then, write
$$1 = \int_{-\infty}^{\infty} e^{a_0} e^{a_1 |x|}\,dx = \frac{2e^{a_0}}{|a_1|}, \qquad (2.4.23)$$
so that
$$e^{a_0} = \frac{|a_1|}{2}. \qquad (2.4.24)$$
Finally, by (2.4.21), we have
$$c = \int_{-\infty}^{\infty} |x|\,\frac{|a_1|}{2}\,e^{-|a_1 x|}\,dx = \int_0^{\infty} x\,|a_1|\,e^{-|a_1| x}\,dx = \frac{1}{|a_1|}, \qquad (2.4.25)$$
so that $a_1 = -1/c$ and the density (2.4.22) takes the form
$$p(x) = \frac{1}{2c}\,e^{-|x|/c}, \quad x \in (-\infty, \infty). \qquad (2.4.26)$$
The following result summarizes our discussion.
Proposition 2.4.7 Consider the class $\mathcal{C}$ of all continuous random variables with non-vanishing densities on $(-\infty, \infty)$ and such that
$$E|X| = c > 0 \quad \text{for } X \in \mathcal{C}. \qquad (2.4.27)$$
Then, the maximum entropy is attained for the Laplace r.v. $X_c$ with density (2.4.26), and
$$\max_{X \in \mathcal{C}} H(X) = H(X_c) = \log(2c) + 1.$$
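As a quick numerical sketch (our own illustration, not part of the original text), Proposition 2.4.7 can be checked by quadrature: a normal density calibrated to the same mean absolute deviation $c$ (which requires $\sigma = c\sqrt{\pi/2}$) has strictly smaller entropy than the Laplace density (2.4.26), whose entropy is $\log(2c) + 1$. The quadrature helper and truncation range are ad hoc choices.

```python
import math

def entropy(pdf, lo, hi, n=200_000):
    """Differential entropy -integral of f log f, by midpoint quadrature."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        fx = pdf(lo + (i + 0.5) * h)
        if fx > 0:
            total -= fx * math.log(fx) * h
    return total

c = 1.0  # the prescribed mean absolute deviation E|X|

# Laplace density with E|X| = c, as in (2.4.26)
laplace = lambda x, c=c: math.exp(-abs(x) / c) / (2 * c)

# Normal density calibrated to the same E|X| = c, i.e. sigma = c * sqrt(pi/2)
sigma = c * math.sqrt(math.pi / 2)
normal = lambda x, s=sigma: math.exp(-x * x / (2 * s * s)) / (s * math.sqrt(2 * math.pi))

H_lap = entropy(laplace, -40, 40)
H_nor = entropy(normal, -40, 40)
```

The Laplace entropy should match the closed form $\log(2c)+1$ and dominate the calibrated normal.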
Remark 2.4.4 If the mean deviation about some fixed point $\theta$ is prescribed instead of $E|X|$, then the entropy is maximized by the density
$$\frac{1}{2c}\,e^{-|x-\theta|/c},$$
where $c = E|X - \theta|$ (Exercise 2.7.30).
Remark 2.4.5 If in addition to (2.4.27) we add the condition that $EX = c_1$, where $|c_1| < c$, then the entropy is maximized by the skewed Laplace distribution studied in Chapter 3 (see Proposition 3.4.7, Chapter 3). On the other hand, if the mean along with the absolute deviation about the mean are prescribed (instead of $EX$ and $E|X|$), then the entropy is maximized by the symmetric Laplace distribution (Exercise 3.6.18, Chapter 3).
Recall that the Laplace distribution $\mathcal{L}(0, \sigma)$ (with mean zero and variance $\sigma^2$) can be regarded as Gaussian with a stochastic variance $V = \sigma^2 W$, where $W$ has the standard exponential distribution (see Proposition 2.2.1). As noted recently by Levin and Tchernitser (1999), among all zero-mean Gaussian r.v.'s with stochastic variance $V$ (independent of the Gaussian term), for any given value of $EV$, the Laplace distribution maximizes the entropy of $V$. This follows from the fact that among all distributions with given mean and $(0, \infty)$ support, the maximum entropy corresponds to the exponential distribution [see Gokhale (1975)], which can be established via Proposition 2.4.6. Here is the exact formulation of this result.
Proposition 2.4.8 Consider the class $\mathcal{M}$ of random variables of the form $\sqrt{D}\,Z$, where $Z$ and $D$ are independent, $Z$ is standard normal, while $D$ has a continuous distribution on $(0, \infty)$ with mean $\sigma^2$. Then, the maximum entropy of $D$,
$$\max_{Y \overset{d}{=} \sqrt{D}Z \in \mathcal{M}} H(D) = \log(\sigma^2) + 1,$$
is attained for the Laplace r.v. $Y \overset{d}{=} \sigma\sqrt{W}\,Z$, where $W$ is standard exponential.
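The mixture representation behind Proposition 2.4.8 is easy to exercise numerically. The sketch below (our own, with an assumed seed and sample size) simulates $Y = \sigma\sqrt{W}Z$ and checks it against a known Laplace moment: for $\mathcal{L}(0,\sigma)$ one has $E|Y| = \sigma/\sqrt{2}$.

```python
import math
import random

random.seed(7)

sigma = 1.0
N = 200_000

# Y = sigma * sqrt(W) * Z, W standard exponential, Z standard normal:
# this is the Gaussian-with-stochastic-variance representation of L(0, sigma).
mean_abs = 0.0
for _ in range(N):
    w = random.expovariate(1.0)                      # V = sigma^2 * W
    y = sigma * math.sqrt(w) * random.gauss(0.0, 1.0)
    mean_abs += abs(y)
mean_abs /= N
```

The Monte Carlo average of $|Y|$ should be close to $\sigma/\sqrt{2} \approx 0.7071$.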
2.5 Order statistics
In this section we shall discuss order statistics of random variables having a Laplace distribution.
Let the measurements obtained from a sample of size $n$ be represented by random variables $X_1, \dots, X_n$. The $X_i$'s are mutually independent, and each one has the same cumulative distribution function (and probability density function, if it exists).
We now introduce $n$ new random variables
$$X_{1:n}, X_{2:n}, \dots, X_{n:n},$$
which are the original random variables arranged in ascending order of magnitude, so that
$$X_{1:n} \le X_{2:n} \le \cdots \le X_{n:n}.$$
The random variables $X_{r:n}$, where $1 \le r \le n$, are called order statistics (to be distinguished from the rank order statistics, equal to $1, 2, 3, \dots, n$ for $X_{1:n}, X_{2:n}, X_{3:n}, \dots, X_{n:n}$, which are occasionally also referred to as order statistics).
In particular, $X_{1:n}$ is the minimum of the $X_i$'s, and $X_{n:n}$ is the maximum. Another common order statistic is $X_{k+1:2k+1}$, which coincides with the sample median when the sample size is odd ($n = 2k+1$). For the last fifty years order statistics have been playing an increasingly important role in statistical inference, and they have appeared in many areas of statistical theory and practice. We shall encounter them in later chapters as well.
2.5.1 Distribution of a single order statistic
Given the parent distribution of $X_1$ (or, equivalently, of any one of the $X_i$, $i = 1, \dots, n$), it is an elementary exercise in probability theory to find the distribution of any order statistic. For instance, if $F$ denotes the c.d.f. of $X_1$, then the c.d.f. of $X_{n:n}$ is obtained as follows:
$$F_{n:n}(x) = P(X_{n:n} \le x) = P(\text{all } X_i \le x) = [F(x)]^n.$$
Similarly, for a general order statistic, we have
$$F_{r:n}(x) = P(X_{r:n} \le x) = P\left(\sum_{i=1}^{n} I_i \ge r\right),$$
where the $I_i$'s are i.i.d. indicator r.v.'s defined as
$$I_i = \begin{cases} 1 & \text{if } X_i \le x, \\ 0 & \text{if } X_i > x. \end{cases}$$
The sum $\sum_{i=1}^{n} I_i$ is a binomial r.v. with probability of success $p = P(X_i \le x) = F(x)$, so that
$$F_{r:n}(x) = \sum_{i=r}^{n} \binom{n}{i} [F(x)]^i [1 - F(x)]^{n-i}. \qquad (2.5.1)$$
In the continuous case, the corresponding p.d.f. (obtained by differentiation) is
$$f_{r:n}(x) = r \binom{n}{r} [F(x)]^{r-1} [1 - F(x)]^{n-r} f(x), \qquad (2.5.2)$$
where $f$ is the density corresponding to $F$. We shall now assume that $X_1, \dots, X_n$ are i.i.d. from the classical Laplace distribution $\mathcal{CL}(\theta, s)$. Denote the c.d.f. and p.d.f. of the $r$th order statistic by $F_{r:n}(\cdot\,; \theta, s)$ and $f_{r:n}(\cdot\,; \theta, s)$, respectively. For the standard distribution $\mathcal{CL}(0, 1)$, we shall omit the parameters and simply write $F_{r:n}(\cdot)$ and $f_{r:n}(\cdot)$. Below we shall derive the distributions of order statistics connected with the standard classical Laplace distribution. To obtain the corresponding distribution in the case of a general Laplace distribution, use the relations
$$F_{r:n}(x; \theta, s) = F_{r:n}\left(\frac{x - \theta}{s}\right) \quad \text{and} \quad f_{r:n}(x; \theta, s) = \frac{1}{s}\, f_{r:n}\left(\frac{x - \theta}{s}\right).$$
The following result is obtained by direct application of formulas (2.5.1) - (2.5.2).
Proposition 2.5.1 Let $X_{r:n}$ be the $r$th order statistic connected with a sample of size $n$ from the standard classical Laplace distribution $\mathcal{CL}(0, 1)$. Then, the c.d.f. and p.d.f. of $X_{r:n}$ are
$$F_{r:n}(x) = \frac{1}{2^n} \sum_{i=r}^{n} \binom{n}{i} \begin{cases} e^{ix}(2 - e^{x})^{n-i} & \text{if } x \le 0, \\ e^{-(n-i)x}(2 - e^{-x})^{i} & \text{if } x \ge 0, \end{cases} \qquad (2.5.3)$$
and
$$f_{r:n}(x) = r\,\frac{1}{2^n} \binom{n}{r} \begin{cases} e^{rx}(2 - e^{x})^{n-r} & \text{if } x \le 0, \\ e^{-(n-r+1)x}(2 - e^{-x})^{r-1} & \text{if } x \ge 0, \end{cases} \qquad (2.5.4)$$
respectively.
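Formulas (2.5.1) and (2.5.3) can be cross-checked deterministically: plugging the Laplace c.d.f. into the generic binomial-tail formula must reproduce the closed form. A small sketch (our own, not from the book):

```python
from math import comb, exp

def F_laplace(x):
    """Standard classical Laplace CL(0,1) c.d.f."""
    return 0.5 * exp(x) if x <= 0 else 1 - 0.5 * exp(-x)

def F_rn_generic(r, n, x):
    """General formula (2.5.1): binomial tail evaluated at F(x)."""
    F = F_laplace(x)
    return sum(comb(n, i) * F**i * (1 - F) ** (n - i) for i in range(r, n + 1))

def F_rn_closed(r, n, x):
    """Closed form (2.5.3) for CL(0,1)."""
    if x <= 0:
        s = sum(comb(n, i) * exp(i * x) * (2 - exp(x)) ** (n - i)
                for i in range(r, n + 1))
    else:
        s = sum(comb(n, i) * exp(-(n - i) * x) * (2 - exp(-x)) ** i
                for i in range(r, n + 1))
    return s / 2**n
```

The two functions should agree to machine precision for all $r \le n$ and all $x$.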
Remark 2.5.1 For the classical Laplace distribution $\mathcal{CL}(\theta, s)$ we have the density
$$f_{r:n}(x; \theta, s) = \frac{r}{s}\,\frac{1}{2^n} \binom{n}{r} \cdot \begin{cases} e^{r(x-\theta)/s}\left(2 - e^{(x-\theta)/s}\right)^{n-r} & \text{if } x \le \theta, \\ e^{(n-r+1)(\theta-x)/s}\left(2 - e^{(\theta-x)/s}\right)^{r-1} & \text{if } x \ge \theta. \end{cases} \qquad (2.5.5)$$
In particular, we have the following special cases.
The minimum
The 1st order statistic connected with a sample of size $n$ from the $\mathcal{CL}(\theta, s)$ distribution has the following c.d.f. and p.d.f.:
$$F_{1:n}(x; \theta, s) = \begin{cases} 1 - \left(1 - \frac{1}{2}\, e^{(x-\theta)/s}\right)^n & \text{if } x \le \theta, \\ 1 - \frac{1}{2^n}\, e^{n(\theta-x)/s} & \text{if } x \ge \theta, \end{cases} \qquad (2.5.6)$$
and
$$f_{1:n}(x; \theta, s) = \frac{n}{2^n s} \begin{cases} e^{(x-\theta)/s}\left(2 - e^{(x-\theta)/s}\right)^{n-1} & \text{if } x \le \theta, \\ e^{n(\theta-x)/s} & \text{if } x \ge \theta. \end{cases} \qquad (2.5.7)$$
The maximum
The $n$th order statistic connected with a sample of size $n$ from the $\mathcal{CL}(\theta, s)$ distribution has the following c.d.f. and p.d.f.:
$$F_{n:n}(x; \theta, s) = \frac{1}{2^n} \begin{cases} e^{n(x-\theta)/s} & \text{if } x \le \theta, \\ \left(2 - e^{(\theta-x)/s}\right)^n & \text{if } x \ge \theta, \end{cases} \qquad (2.5.8)$$
and
$$f_{n:n}(x; \theta, s) = \frac{n}{2^n s} \begin{cases} e^{n(x-\theta)/s} & \text{if } x \le \theta, \\ e^{(\theta-x)/s}\left(2 - e^{(\theta-x)/s}\right)^{n-1} & \text{if } x \ge \theta. \end{cases} \qquad (2.5.9)$$
The symmetry in the expressions for $f_{1:n}$ and $f_{n:n}$ results from the relation
$$X_{1:n} \overset{d}{=} 2\theta - X_{n:n}.$$
The median
Let $n = 2k+1$, $k = 0, 1, 2, \dots$, and let $X_{k+1:n}$ be the sample median $\tilde{X}$ of $X_1, X_2, \dots, X_n$. Then, the p.d.f. of $X_{k+1:n}$ is as follows:
$$f_{k+1:n}(x) = \frac{n!}{(k!)^2} \left(\frac{1}{2}\right)^{2k+1} \frac{1}{s}\, e^{-(k+1)|x-\theta|/s} \left(2 - e^{-|x-\theta|/s}\right)^k, \qquad (2.5.10)$$
and the distribution is symmetric about $\theta$. The above distribution was derived in Fisher (1934); see also Karst and Polowy (1963).
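As a sanity check on (2.5.10) (our own sketch, with ad hoc truncation and grid), the density can be integrated numerically; it should integrate to 1 for every $k$ and be symmetric about $\theta$:

```python
import math

def median_pdf(x, k, s=1.0, theta=0.0):
    """Sample-median density (2.5.10) for n = 2k+1 draws from CL(theta, s)."""
    n = 2 * k + 1
    u = abs(x - theta) / s
    c = math.factorial(n) / (math.factorial(k) ** 2) * 0.5 ** (2 * k + 1) / s
    return c * math.exp(-(k + 1) * u) * (2 - math.exp(-u)) ** k

def integral(pdf, lo, hi, n=100_000):
    """Midpoint-rule quadrature."""
    h = (hi - lo) / n
    return sum(pdf(lo + (i + 0.5) * h) for i in range(n)) * h
```

For $k = 0$ the formula reduces to the Laplace density itself.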
2.5.2 Joint distributions of order statistics
Proceeding as in Section 2.5.1, we can find the joint distributions of two or more order statistics. Consider a random sample $X_1, \dots, X_n$ from a continuous distribution with c.d.f. $F$ and p.d.f. $f$. Let
$$1 \le n_1 < n_2 < \cdots < n_k \le n,$$
where $1 \le k \le n$. Then, the joint p.d.f. of $X_{n_1:n}, X_{n_2:n}, \dots, X_{n_k:n}$ is non-zero at $x = (x_1, \dots, x_k)$ only if $x_1 \le x_2 \le \cdots \le x_k$, in which case it is equal to
$$f_{n_1,\dots,n_k:n}(x) = n! \prod_{j=1}^{k} f(x_j) \prod_{j=0}^{k} \frac{[F(x_{j+1}) - F(x_j)]^{n_{j+1} - n_j - 1}}{(n_{j+1} - n_j - 1)!}, \qquad (2.5.11)$$
with $x_0 = -\infty$, $x_{k+1} = +\infty$, $n_0 = 0$, and $n_{k+1} = n + 1$ [see, e.g., David (1981, p. 10)]. In particular, the joint distribution of two order statistics, $X_{r:n}$ and $X_{r':n}$, where $1 \le r < r' \le n$, has density
$$f_{r,r':n}(x, y) = C(n, r, r')\,F^{r-1}(x)\,f(x)\,[F(y) - F(x)]^{r'-r-1}\,f(y)\,[1 - F(y)]^{n-r'} \qquad (2.5.12)$$
for $x \le y$ [and $f_{r,r':n}(x, y) = 0$ for $x > y$], where
$$C(n, r, r') = \frac{n!}{(r-1)!\,(r'-r-1)!\,(n-r')!}. \qquad (2.5.13)$$
An application of the above to order statistics associated with the Laplace distribution leads immediately to the following result.
Proposition 2.5.2 Let $X_1, \dots, X_n$ be a random sample from the standard classical Laplace distribution $\mathcal{CL}(0, 1)$. Then, for any $1 \le r < r' \le n$, the joint distribution of $X_{r:n}$ and $X_{r':n}$ has the density
$$f_{r,r':n}(x, y) = \frac{1}{2^n}\, C(n, r, r')\, u(x, y), \qquad (2.5.14)$$
where the constant $C(n, r, r')$ is given by (2.5.13) and
$$u(x, y) = \begin{cases} e^{rx+y}\,[e^{y} - e^{x}]^{r'-r-1}\,[2 - e^{y}]^{n-r'} & \text{if } x \le y \le 0, \\ e^{rx-(n-r'+1)y}\,[2 - e^{-y} - e^{x}]^{r'-r-1} & \text{if } x \le 0 \le y, \\ e^{-(x+(n-r'+1)y)}\,[e^{-x} - e^{-y}]^{r'-r-1}\,[2 - e^{-x}]^{r-1} & \text{if } 0 \le x \le y, \\ 0 & \text{if } x > y. \end{cases} \qquad (2.5.15)$$
Remark 2.5.2 The joint distribution of the minimum and the maximum is thus given by
$$f(x, y) = \frac{n(n-1)}{2^n} \begin{cases} e^{x+y}\,(e^{y} - e^{x})^{n-2} & \text{if } x \le y \le 0, \\ e^{x-y}\,(2 - e^{-y} - e^{x})^{n-2} & \text{if } x \le 0 \le y, \\ e^{-x-y}\,(e^{-x} - e^{-y})^{n-2} & \text{if } 0 \le x \le y, \\ 0 & \text{if } x > y. \end{cases} \qquad (2.5.16)$$
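One consequence of (2.5.16) that is easy to verify numerically: integrating the middle branch over $x \le 0 \le y$ gives $P(X_{1:n} \le 0 \le X_{n:n}) = 1 - 2^{1-n}$, since the sample straddles zero unless all $n$ signs agree. A quadrature sketch (our own, with an ad hoc truncation level $L$ and grid size $m$):

```python
import math

def prob_straddle(n, L=35.0, m=800):
    """P(X_{1:n} <= 0 <= X_{n:n}) by 2-D midpoint quadrature of the
    middle branch of (2.5.16): (n(n-1)/2^n) e^{x-y} (2 - e^{-y} - e^x)^{n-2}."""
    h = L / m
    eys = [math.exp(-(j + 0.5) * h) for j in range(m)]   # e^{-y} on the grid
    total = 0.0
    for i in range(m):
        ex = math.exp(-L + (i + 0.5) * h)                # e^{x}, x in (-L, 0)
        for j in range(m):
            ey = eys[j]
            total += (2 - ey - ex) ** (n - 2) * ex * ey
    return (n * (n - 1) / 2**n) * total * h * h
```

The result should be close to $1 - 2^{1-n}$ for each $n \ge 2$.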
Remark 2.5.3 When the sample is drawn from a general $\mathcal{CL}(\theta, s)$ distribution, the joint density of $X_{r:n}$ and $X_{r':n}$ is
$$f_{r,r':n}(x, y; \theta, s) = \frac{1}{s^2}\, f_{r,r':n}\left(\frac{x - \theta}{s}, \frac{y - \theta}{s}\right), \qquad (2.5.17)$$
with $f_{r,r':n}(x, y)$ given by (2.5.14) - (2.5.15).
The joint distributions of order statistics play an important role in statistical applications. Many common statistics utilized in statistical inference are functions of order statistics, and we can obtain their distributions via (2.5.11) coupled with standard transformation methods. Below we present several examples of such derivations for the Laplace distribution.
Range, midrange, sample median
The three commonly used statistics which are functions of just two order statistics are:
$$R = X_{n:n} - X_{1:n} \qquad \text{(the range of the } X_i\text{'s)},$$
$$MR = \frac{X_{n:n} + X_{1:n}}{2} \qquad \text{(the midrange of the } X_i\text{'s)},$$
$$\tilde{X} = \frac{X_{k:2k} + X_{k+1:2k}}{2} \qquad \text{(the sample median when } n = 2k \text{ is even)}.$$
In the next proposition we derive the distribution of $R$ [see, e.g., Edwards (1948)].
Proposition 2.5.3 Let $X_1, \dots, X_n$, $n > 1$, be a random sample from the standard classical Laplace distribution $\mathcal{CL}(0, 1)$. Then the range $R$ has the density function
$$f_R(z) = \frac{n-1}{2^{n-1}}\, e^{-z} \left[(1 - e^{-z})^{n-2} + \frac{n}{2}\, I_n(z)\right], \quad z > 0,$$
where $I_n(z) = \int_{-z}^{0} (2 - e^{-x-z} - e^{x})^{n-2}\,dx$ can be computed from the following recurrent relations:
$$I_2 = z, \quad A_2 = 1 - e^{-z}, \quad B_2 = e^{z} - 1,$$
$$I_n = 2 I_{n-1} - A_{n-1} - e^{-z} B_{n-1},$$
$$A_n = \frac{2(n-2)}{n-1}\left(A_{n-1} - e^{-z} I_{n-1}\right) + \frac{(1 - e^{-z})^{n-1}}{n-1},$$
$$B_n = \frac{2(n-2)}{n-1}\left(B_{n-1} - I_{n-1}\right) + \frac{(1 - e^{-z})^{n-2}\,(e^{z} - 1)}{n-1},$$
where $A_n = \int_{-z}^{0} e^{x}\,(2 - e^{-x-z} - e^{x})^{n-2}\,dx$ and $B_n = \int_{-z}^{0} e^{-x}\,(2 - e^{-x-z} - e^{x})^{n-2}\,dx$.
Proof. Let $f(x, y)$ denote the density of $(X_{1:n}, X_{n:n})$ given by (2.5.16). The density of $R$ can be written as the following sum of three integrals:
$$f_R(z) = \int_{-\infty}^{-z} f(x, z + x)\,dx + \int_{-z}^{0} f(x, x + z)\,dx + \int_{0}^{\infty} f(x, z + x)\,dx.$$
The first and the third integrals are equal to each other, each being equal to
$$\frac{n-1}{2^n}\,(1 - e^{-z})^{n-2}\, e^{-z},$$
while the middle integral is equal to
$$\frac{n(n-1)}{2^n}\, e^{-z}\, I_n(z).$$
Thus, it remains to prove the recurrent relations.
First note that $I_2(z) = \int_{-z}^{0} 1\,dx = z$, $A_2(z) = \int_{-z}^{0} e^{x}\,dx = 1 - e^{-z}$, and $B_2(z) = \int_{-z}^{0} e^{-x}\,dx = e^{z} - 1$. Next we have
$$I_{n+1}(z) = \int_{-z}^{0} (2 - e^{-x-z} - e^{x})^{n-1}\,dx = 2 I_n(z) - A_n(z) - e^{-z} B_n(z),$$
$$A_{n+1}(z) = \int_{-z}^{0} e^{x}(2 - e^{-x-z} - e^{x})^{n-1}\,dx = 2 A_n(z) - e^{-z} I_n(z) - \int_{-z}^{0} e^{2x}(2 - e^{-x-z} - e^{x})^{n-2}\,dx,$$
$$B_{n+1}(z) = \int_{-z}^{0} e^{-x}(2 - e^{-x-z} - e^{x})^{n-1}\,dx = 2 B_n(z) - I_n(z) - e^{-z}\int_{-z}^{0} e^{-2x}(2 - e^{-x-z} - e^{x})^{n-2}\,dx.$$
Integration by parts of $A_{n+1}(z)$ and $B_{n+1}(z)$ leads to
$$A_{n+1}(z) = (1 - e^{-z})^{n} - (n-1)\,e^{-z} I_n(z) + (n-1)\int_{-z}^{0} e^{2x}(2 - e^{-x-z} - e^{x})^{n-2}\,dx,$$
$$B_{n+1}(z) = (1 - e^{-z})^{n-1}(e^{z} - 1) - (n-1)\, I_n(z) + (n-1)\, e^{-z}\int_{-z}^{0} e^{-2x}(2 - e^{-x-z} - e^{x})^{n-2}\,dx,$$
and after some elementary algebra we arrive at the recurrent relations stated in the theorem.
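The recurrences of Proposition 2.5.3 can be validated against direct quadrature of the defining integrals $I_n$, $A_n$, $B_n$. The sketch below (our own, with an arbitrary fixed $z$) does exactly this:

```python
import math

def quad(g, lo, hi, m=40_000):
    """Midpoint-rule quadrature."""
    h = (hi - lo) / m
    return sum(g(lo + (i + 0.5) * h) for i in range(m)) * h

def range_tables(z, nmax):
    """I_n, A_n, B_n of Proposition 2.5.3 via the recurrences, n = 2..nmax."""
    ez = math.exp(-z)
    I = {2: z}
    A = {2: 1 - ez}
    B = {2: math.exp(z) - 1}
    for n in range(3, nmax + 1):
        I[n] = 2 * I[n - 1] - A[n - 1] - ez * B[n - 1]
        A[n] = (2 * (n - 2) / (n - 1)) * (A[n - 1] - ez * I[n - 1]) \
               + (1 - ez) ** (n - 1) / (n - 1)
        B[n] = (2 * (n - 2) / (n - 1)) * (B[n - 1] - I[n - 1]) \
               + (1 - ez) ** (n - 2) * (math.exp(z) - 1) / (n - 1)
    return I, A, B

z = 1.3
I, A, B = range_tables(z, 6)
```

Each tabulated value should match the corresponding integral over $(-z, 0)$.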
It is interesting to see how the distribution of $R$ differs from the case when the sample is from a Gaussian population. Unfortunately, for the latter case, to the best of our knowledge, the exact distributions can be computed explicitly only in special cases. McKay and Pearson (1933) studied the case $n = 3$, obtaining the density
$$\tilde{f}_R(z) = \frac{6}{\sqrt{\pi}}\, e^{-z^2/4}\left[\Phi\!\left(\frac{z}{\sqrt{6}}\right) - \frac{1}{2}\right], \quad z > 0,$$
where $\Phi$ is the c.d.f. of the standard normal distribution. Larger values of $n$ would require numerical computation of certain integrals [the elaborate computations of the cumulative distribution functions for $n = 2$ to $20$ in the pre-computer era are given in Pearson and Hartley (1942)]. The case of a Laplace population is thus computationally easier, since our recursive formulas allow for an explicit form of the densities for an arbitrary $n$.
The Laplace variable $\mathcal{L}(0, 1) = \mathcal{CL}(0, \sqrt{2}/2)$ has mean equal to zero and variance equal to one, so it is appropriate for comparisons. For this random variable the density of the range for sample size equal to three is given by
$$f_R(z) = e^{-\sqrt{2}\, z}\left(3z - \sqrt{2} + \sqrt{2}\, e^{-\sqrt{2}\, z}\right), \quad z > 0.$$
The graphs of these two densities are presented in Figure 2.4. The heavier tails of the Laplace distribution are evident.
Consider now another function of the maximal and minimal order statistics, the midrange $MR$. Using a similar technique we obtain the following result.
Proposition 2.5.4 Let $X_1, \dots, X_n$, $n > 1$, be a random sample from the standard classical Laplace distribution $\mathcal{CL}(0, 1)$. Then, the midrange $MR$ has the density $f_{MR}(z) = 2h(2z)$, where $h$, the density of $X_{1:n} + X_{n:n}$, is given by
$$h(z) = \frac{e^{-|z|}}{(1 + e^{-|z|})^2}\left[1 - \frac{n-1}{2^n}\,(1 - e^{-|z|})^{n-1}\left(\frac{n+1}{n-1} + e^{-|z|}\right)\right] + \frac{n(n-1)}{2^n}\, e^{-|z|}\, J_n(|z|). \qquad (2.5.18)$$
Here,
$$J_n(z) = \int_{0}^{z/2} \left(e^{-x} - e^{-z+x}\right)^{n-2}\,dx, \quad z > 0,$$
and it can be computed from the following recurrent relations:
$$J_2 = z/2, \qquad J_3 = \left(1 - e^{-z/2}\right)^2,$$
Figure 2.4: Comparison of the p.d.f. of the range for sample size $n = 3$: normal (dotted line) vs. Laplace (solid line).
$$J_n = \frac{2}{n-2}\left[\frac{1}{2}\,(1 + e^{-z})\,(1 - e^{-z})^{n-3} - 2(n-3)\, e^{-z}\, J_{n-2}\right]. \qquad (2.5.19)$$
Proof. Since the distribution of $MR$ is symmetric around zero, it is sufficient to compute the density $f_{MR}(z)$ for positive $z$. As in the previous proof, let $f(x, y)$, given by (2.5.16), be the joint density of $X_{1:n}$ and $X_{n:n}$ (the minimal and maximal order statistics). Then, the density $h$ of the sum $X_{1:n} + X_{n:n}$ is
$$h(z) = \int_{-\infty}^{\infty} f(x, z - x)\,dx = \frac{n(n-1)}{2^n}\, e^{-z}\left[\int_{-\infty}^{0} e^{2x}\left(2 - e^{x-z} - e^{x}\right)^{n-2}dx + \int_{0}^{z/2} \left(e^{-x} - e^{-z+x}\right)^{n-2}dx\right].$$
The first integral can be computed directly by substitution, and it leads to (2.5.18).
The recursive relation (2.5.19) for computing the second integral can be obtained as follows. First,
$$J_n = \int_{0}^{z/2} e^{-x}\left(e^{-x} - e^{-z+x}\right)^{n-3}dx - e^{-z}\int_{0}^{z/2} e^{x}\left(e^{-x} - e^{-z+x}\right)^{n-3}dx.$$
For the two integrals in the above equation, denoted by $I_1$ and $I_2$, respectively, we have:
$$I_1 = \int_{0}^{z/2} e^{-2x}\left(e^{-x} - e^{-z+x}\right)^{n-4}dx - e^{-z}\, J_{n-2}, \qquad (2.5.20)$$
$$I_2 = J_{n-2} - e^{-z}\int_{0}^{z/2} e^{2x}\left(e^{-x} - e^{-z+x}\right)^{n-4}dx. \qquad (2.5.21)$$
In order to compute $\int_{0}^{z/2} e^{-2x}(e^{-x} - e^{-z+x})^{n-4}dx$ and $\int_{0}^{z/2} e^{2x}(e^{-x} - e^{-z+x})^{n-4}dx$, let us apply the integration by parts technique to the integrals on the left-hand sides of (2.5.20) and (2.5.21). We get
$$\int_{0}^{z/2} e^{-x}\left(e^{-x} - e^{-z+x}\right)^{n-3}dx = (1 - e^{-z})^{n-3} - (n-3)\left\{e^{-z} J_{n-2} + \int_{0}^{z/2} e^{-2x}\left(e^{-x} - e^{-z+x}\right)^{n-4}dx\right\},$$
$$\int_{0}^{z/2} e^{x}\left(e^{-x} - e^{-z+x}\right)^{n-3}dx = -(1 - e^{-z})^{n-3} + (n-3)\left\{J_{n-2} + e^{-z}\int_{0}^{z/2} e^{2x}\left(e^{-x} - e^{-z+x}\right)^{n-4}dx\right\}.$$
Thus,
$$\int_{0}^{z/2} e^{-2x}\left(e^{-x} - e^{-z+x}\right)^{n-4}dx = \frac{(1 - e^{-z})^{n-3}}{n-2} - \frac{n-4}{n-2}\, e^{-z}\, J_{n-2}$$
and
$$e^{-z}\int_{0}^{z/2} e^{2x}\left(e^{-x} - e^{-z+x}\right)^{n-4}dx = \frac{(1 - e^{-z})^{n-3}}{n-2} - \frac{n-4}{n-2}\, J_{n-2}.$$
Substituting these integrals into (2.5.20) and (2.5.21) leads to the recursive formula (2.5.19).
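As with the range, the two-step recurrence (2.5.19) for $J_n$ can be checked against direct quadrature of its defining integral. A sketch (our own, with an arbitrary fixed $z$):

```python
import math

def quad(g, lo, hi, m=40_000):
    """Midpoint-rule quadrature."""
    h = (hi - lo) / m
    return sum(g(lo + (i + 0.5) * h) for i in range(m)) * h

def J(n, z):
    """J_n of Proposition 2.5.4 via the two-step recurrence (2.5.19)."""
    if n == 2:
        return z / 2
    if n == 3:
        return (1 - math.exp(-z / 2)) ** 2
    ez = math.exp(-z)
    return ((1 + ez) * (1 - ez) ** (n - 3) - 4 * (n - 3) * ez * J(n - 2, z)) / (n - 2)

z = 0.9
```

The recurrence should reproduce $\int_0^{z/2}(e^{-x}-e^{-z+x})^{n-2}\,dx$ for every $n \ge 2$.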
It is well known [see, e.g., Gumbel (1944)] that the distribution of the midrange converges (when appropriately normalized) to the logistic distribution given by the density
$$f(z) = \frac{e^{-|z|}}{(1 + e^{-|z|})^2}.$$
This limiting density is the first factor in the expression (2.5.18) for the density $h$ of the sum of the two extremal order statistics. Clearly, no normalization (scaling) is required for the sum $X_{1:n} + X_{n:n}$ to converge to this logistic variable as $n$ increases to infinity. Consequently, we see that for the Laplace distribution a simple multiplication of the midrange by 2 is required to achieve the limiting standard logistic distribution.
The distribution of $\tilde{X}$ for $n = 2k+1$ was given in (2.5.10). In our next result, we present the density of $\tilde{X}$ in the case of an even sample size, as derived by Asrabadi (1985), omitting the details of its technical derivation.
Proposition 2.5.5 The distribution of the sample median $\tilde{X}$ for $n = 2k$ is given by the density
$$f_{\tilde{X}}(z) = \frac{n!}{2^k\, [(k-1)!]^2}\left[\sum_{i=0}^{k-2} \frac{(-1)^i \binom{k-1}{i}}{2^i\,(k-1-i)}\, e^{-(k+1+i)|z|}\left(1 - e^{-(k-1-i)|z|}\right) - \frac{(-1)^k}{2^{k-1}}\, |z|\, e^{-2k|z|} + \frac{1}{k\, 2^k}\, e^{-2k|z|}\right].$$
2.5.3 Moments of order statistics
The computation of central moments of order statistics connected with a general classical Laplace distribution is straightforward. Using the explicit density (2.5.5) of the $r$th order statistic $X_{r:n}$, we obtain
$$E[(X_{r:n} - \theta)^k] = s^k\,\frac{n!\,\Gamma(k+1)}{(r-1)!\,(n-r)!}\left[(-1)^k \sum_{j=0}^{n-r} a_j + \sum_{j=0}^{r-1} b_j\right], \qquad (2.5.22)$$
where
$$a_j = (-1)^j\,\frac{(n-r)!}{j!\,(n-r-j)!}\,2^{-(r+j)}\,(r+j)^{-(k+1)} \qquad (2.5.23)$$
and
$$b_j = (-1)^j\,\frac{(r-1)!}{j!\,(r-1-j)!}\,2^{-(n-r+1+j)}\,(n-r+1+j)^{-(k+1)}. \qquad (2.5.24)$$
In particular, for odd $n$, the mean of the sample median $X_{(n+1)/2:n}$ is equal to $\theta$, while the variance of the sample median is
$$E[(X_{(n+1)/2:n} - \theta)^2] = \frac{4 s^2\, n!}{[(n-1)/2]!}\sum_{j=0}^{(n-1)/2} c_j, \qquad (2.5.25)$$
where
$$c_j = (-1)^j\left[j!\left(\frac{n-1}{2} - j\right)!\;2^{\,j + (n+1)/2}\left(\frac{n+1}{2} + j\right)^{3}\right]^{-1}. \qquad (2.5.26)$$
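Formula (2.5.25)-(2.5.26) can be cross-checked against direct numerical integration of $x^2$ under the median density (2.5.10). The sketch below (our own, with $s = 1$, $\theta = 0$) computes both:

```python
import math

def median_var(n, s=1.0):
    """Variance (2.5.25)-(2.5.26) of the sample median of n = 2m+1 CL(0,s) draws."""
    m = (n - 1) // 2
    total = 0.0
    for j in range(m + 1):
        denom = (math.factorial(j) * math.factorial(m - j)
                 * 2 ** (j + m + 1) * (m + 1 + j) ** 3)
        total += (-1) ** j / denom
    return 4 * s * s * math.factorial(n) / math.factorial(m) * total

def median_var_quad(n, pts=200_000, L=40.0):
    """Cross-check: integrate x^2 against the median density (2.5.10), s=1, theta=0."""
    k = (n - 1) // 2
    c = math.factorial(n) / math.factorial(k) ** 2 * 0.5 ** (2 * k + 1)
    h = L / pts
    total = 0.0
    for i in range(pts):
        x = (i + 0.5) * h
        total += x * x * c * math.exp(-(k + 1) * x) * (2 - math.exp(-x)) ** k
    return 2 * total * h
```

For $n = 1$ the formula gives $2s^2$, the variance of $\mathcal{CL}(0, s)$ itself, and for $n = 3$ it gives $23/36$.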
When the sample size $n = 2k$ is even, the mean of the sample median is still equal to $\theta$, while the variance of the sample median was derived in Asrabadi (1985). Its value for the standard classical Laplace distribution is
$$\frac{n!}{[(k-1)!]^2\, 2^{2k}}\left[\sum_{j=0}^{k-2} d_j + \frac{1 - 3(-1)^k}{2 k^4}\right], \qquad (2.5.27)$$
where
$$d_j = \frac{(k-1)!}{j!\,(k-1-j)!}\cdot\frac{(-1)^j\, 2^{\,k+2-j}}{k-1-j}\left\{(k+1+j)^{-3} - (2k)^{-3}\right\}. \qquad (2.5.28)$$
Govindarajulu (1966) obtained expressions for the means, variances, and covariances of order statistics from the standard classical Laplace distribution in terms of those from the standard exponential distribution. His method applies to any distribution which is symmetric about the origin [see also Balakrishnan et al. (1993)]. Let $X_{1:n}, \dots, X_{n:n}$ denote the order statistics corresponding to a random sample of size $n$ from a symmetric distribution with c.d.f. $F_X$, and let $Y_{1:n}, \dots, Y_{n:n}$ be the order statistics obtained from a similar sample from the corresponding folded distribution with c.d.f. $F_Y(y) = 2F_X(y) - 1$, $y \ge 0$ (so that $Y \overset{d}{=} |X|$). Then, we have the relations:
$$E[X_{r:n}^k] = \frac{1}{2^n}\left\{\sum_{i=0}^{r-1}\binom{n}{i}\, E[Y_{r-i:n-i}^k] + (-1)^k \sum_{i=r}^{n}\binom{n}{i}\, E[Y_{i-r+1:i}^k]\right\}, \quad 1 \le r \le n, \qquad (2.5.29)$$
and for $1 \le r < s \le n$:
$$E[X_{r:n} X_{s:n}] = \frac{1}{2^n}\left\{\sum_{i=0}^{r-1}\binom{n}{i}\, E[Y_{r-i:n-i}\, Y_{s-i:n-i}] - \sum_{i=r}^{s-1}\binom{n}{i}\, E[Y_{i-r+1:i}]\, E[Y_{s-i:n-i}] + \sum_{i=s}^{n}\binom{n}{i}\, E[Y_{i-s+1:i}\, Y_{i-r+1:i}]\right\}, \qquad (2.5.30)$$
see Govindarajulu (1963). Recalling that if $X$ is a standard classical Laplace variable then $Y = |X|$ is a standard exponential variable, Govindarajulu (1966) used the well-known explicit expressions for the moments of exponential order statistics in (2.5.29) - (2.5.30) to obtain the following moments of order statistics connected with the $\mathcal{CL}(0, 1)$ distribution:
$$E[X_{r:n}] = \frac{1}{2^n}\left\{\sum_{i=0}^{r-1}\binom{n}{i}\, S_1(r-i, n-i) - \sum_{i=r}^{n}\binom{n}{i}\, S_1(i-r+1, i)\right\}, \quad 1 \le r \le n, \qquad (2.5.31)$$
$$E[X_{r:n}^2] = \frac{1}{2^n}\left\{\sum_{i=0}^{r-1}\binom{n}{i}\, S_2(r-i, n-i) + \sum_{i=r}^{n}\binom{n}{i}\, S_2(i-r+1, i)\right\}, \quad 1 \le r \le n, \qquad (2.5.32)$$
and for $1 \le r < s \le n$,
$$E[X_{r:n} X_{s:n}] = \frac{1}{2^n}\left\{\sum_{i=0}^{r-1}\binom{n}{i}\, S_3(r-i, s-i, n-i) - \sum_{i=r}^{s-1}\binom{n}{i}\, S_1(i-r+1, i)\, S_1(s-i, n-i) + \sum_{i=s}^{n}\binom{n}{i}\, S_3(i-s+1, i-r+1, i)\right\}. \qquad (2.5.33)$$
Here, for $1 \le r \le n$,
$$S_1(r, n) = \sum_{i=n-r+1}^{n} \frac{1}{i}, \qquad S_2(r, n) = \sum_{i=n-r+1}^{n} \frac{1}{i^2} + [S_1(r, n)]^2, \qquad (2.5.34)$$
and for $1 \le r < s \le n$,
$$S_3(r, s, n) = \sum_{i=n-r+1}^{n} \frac{1}{i^2} + S_1(r, n)\cdot S_1(s, n). \qquad (2.5.35)$$
Utilizing the relations (2.5.31) - (2.5.33), Govindarajulu (1966) calculated
the means, variances, and covariances of order statistics connected with the
standard classical Laplace distribution for sample sizes up to 20.
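The relations (2.5.31)-(2.5.33) are convenient to program. The sketch below (our own) implements them together with (2.5.34)-(2.5.35); for example, they give $E[X_{2:2}] = 3/4$ and $E[X_{1:1}^2] = 2$ (the variance of $\mathcal{CL}(0, 1)$):

```python
from math import comb

def S1(r, n):
    return sum(1 / i for i in range(n - r + 1, n + 1))

def S2(r, n):
    return sum(1 / i**2 for i in range(n - r + 1, n + 1)) + S1(r, n) ** 2

def S3(r, s, n):
    return sum(1 / i**2 for i in range(n - r + 1, n + 1)) + S1(r, n) * S1(s, n)

def EX(r, n):
    """E[X_{r:n}] for CL(0,1) via (2.5.31)."""
    a = sum(comb(n, i) * S1(r - i, n - i) for i in range(r))
    b = sum(comb(n, i) * S1(i - r + 1, i) for i in range(r, n + 1))
    return (a - b) / 2**n

def EX2(r, n):
    """E[X_{r:n}^2] via (2.5.32)."""
    a = sum(comb(n, i) * S2(r - i, n - i) for i in range(r))
    b = sum(comb(n, i) * S2(i - r + 1, i) for i in range(r, n + 1))
    return (a + b) / 2**n

def EXY(r, s, n):
    """E[X_{r:n} X_{s:n}] via (2.5.33), for r < s."""
    a = sum(comb(n, i) * S3(r - i, s - i, n - i) for i in range(r))
    b = sum(comb(n, i) * S1(i - r + 1, i) * S1(s - i, n - i) for i in range(r, s))
    c = sum(comb(n, i) * S3(i - s + 1, i - r + 1, i) for i in range(s, n + 1))
    return (a - b + c) / 2**n
```

By symmetry, $E[X_{r:n}] = -E[X_{n-r+1:n}]$, which the implementation also reproduces.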
Remark 2.5.4 Balakrishnan (1988) extended the relations (2.5.29) - (2.5.30) to the case of a single-scale outlier model (when the random sample consists of $n-1$ i.i.d. symmetric variables and one symmetric scale outlier). Balakrishnan and Ambagaspitiya (1988) used this extension in studying the robustness of various linear estimators of the location and scale parameters of the classical Laplace distribution. The results have also been extended by Balakrishnan (1989) to the case of independent but not necessarily identically distributed observations from the Laplace distribution.
Remark 2.5.5 Akahira and Takeuchi (1990) studied the loss of information associated with order statistics and related estimators for the Laplace distribution.
Remark 2.5.6 Lien et al. (1992) derived moments of order statistics and the related best linear unbiased estimators of the location and scale parameters connected with the standard doubly truncated Laplace distribution with density
$$f(x) = \frac{1}{2(1 - P - Q)}\, e^{-|x|}, \quad \log(2Q) \le x \le -\log(2P), \qquad (2.5.36)$$
where $Q$ and $P$ represent the proportions of truncation on the left and on the right, respectively, of the standard classical Laplace density. Khan and Khan (1987) obtained recurrence relations for the moments of order statistics connected with the doubly truncated Laplace distribution (2.5.36).
2.5.4 Representation of order statistics via sums of exponentials
In many considerations we have found it useful to represent order statistics in the form of sums of independent exponential random variables [see, for example, Subsection 2.6.1].
It follows from (2.2.10) that a vector $(X_1, \dots, X_n)$ of i.i.d. standard classical Laplace random variables has a distributional representation of the form
$$(X_1, \dots, X_n) \overset{d}{=} (\delta_1 W_1, \dots, \delta_n W_n), \qquad (2.5.37)$$
where $(\delta_1, \dots, \delta_n)$ are i.i.d. Rademacher r.v.'s (random signs taking pluses and minuses with equal probabilities) and $(W_1, \dots, W_n)$ are i.i.d. standard exponential variables independent of the $\delta_i$'s.
Let $B_n$ be the binomial random variable counting the number of "pluses" among the $\delta_i$'s. The number of "minuses" is denoted by $\bar{B}_n = n - B_n$.
Proposition 2.5.6 Let $(X_1, \dots, X_n)$ be a vector of i.i.d. $\mathcal{CL}(0, 1)$ random variables, and let $B_n$ be a binomial random variable with parameters $n$ and $p = 1/2$, independent of two independent sequences $(\bar{W}_i)_{i=1}^{\infty}$, $(W_i)_{i=1}^{\infty}$ of i.i.d. standard exponential random variables.
Then, the order statistics of $(X_1, \dots, X_n)$ have the following distributional representations:
$$(X_{1:n}, \dots, X_{n:n}) \overset{d}{=} \left(-\bar{W}_{\bar{B}_n:\bar{B}_n}, \dots, -\bar{W}_{1:\bar{B}_n},\; W_{1:B_n}, \dots, W_{B_n:B_n}\right)$$
$$\overset{d}{=} \left(\left(-\sum_{l=1}^{i} \frac{\bar{W}_l}{\bar{B}_n - l + 1}\right)_{i=\bar{B}_n}^{i=1},\; \left(\sum_{l=1}^{i} \frac{W_l}{B_n - l + 1}\right)_{i=1}^{i=B_n}\right).$$
Proof. It is enough to notice that, conditionally on the $\delta_i$'s, $\{W_i : \delta_i = -1\}$ are independent of $\{W_i : \delta_i = 1\}$. Thus, we can represent $\{W_i : \delta_i = -1\}$ by $\{\bar{W}_i,\ i = 1, \dots, \bar{B}_n\}$, and $\{W_i : \delta_i = 1\}$ by $\{W_i,\ i = 1, \dots, B_n\}$. The first representation then follows by appropriate ordering of these two sequences. The second representation follows from the well-known representation of the exponential order statistics:
$$(W_{i:n})_{i=1}^{n} \overset{d}{=} \left(\sum_{l=1}^{i} \frac{W_l}{n - l + 1}\right)_{i=1}^{n}. \qquad (2.5.38)$$
[See, e.g., Balakrishnan and Cohen (1991), p. 34.]
Remark 2.5.7 Consider $n = 2k+1$. Let $K_n = \max(B_n, \bar{B}_n)$ and $\delta_n = \mathrm{sign}(B_n - k - 1/2)$, where $B_n$ is as in the above representation. We then have the following representations for the median:
$$X_{k+1:n} \overset{d}{=} \delta_n \sum_{l=1}^{K_n - k} \frac{W_l}{K_n - l + 1} \overset{d}{=} \delta_n \sum_{l=k+1}^{K_n} \frac{W_l}{l}.$$
Here, $K_n$ and $\delta_n$ are dependent but jointly independent of the $W_i$'s.
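The median representation in Remark 2.5.7 is directly simulable. The sketch below (our own, with an assumed seed and replication count) checks the second moment of the median of $n = 3$ against the exact value $23/36$ from (2.5.25):

```python
import random

random.seed(11)

def median_rep(n):
    """Sample median of n = 2k+1 CL(0,1) variables via Remark 2.5.7."""
    k = (n - 1) // 2
    B = sum(random.random() < 0.5 for _ in range(n))  # binomial count of "pluses"
    K = max(B, n - B)
    sign = 1.0 if B > k else -1.0                     # sign(B_n - k - 1/2)
    return sign * sum(random.expovariate(1.0) / l for l in range(k + 1, K + 1))

N = 200_000
m2 = sum(median_rep(3) ** 2 for _ in range(N)) / N
```

The Monte Carlo second moment should be close to $23/36 \approx 0.639$.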
2.6 Statistical inference
In this rather lengthy section we shall discuss basic statistical theory and methodology for the Laplace distribution. We warn the reader that some of the proofs presented herein may be a tough climb, but in our opinion quite a rewarding experience. When collecting material for this section we were pleasantly surprised by the abundance of available results scattered in the literature. Before proceeding with results on estimation and testing, let us make some remarks concerning the classical Laplace location-scale family of distributions with density
$$f(x; \theta, s) = \frac{1}{s}\, f\left(\frac{x - \theta}{s}\right), \quad -\infty < \theta < \infty,\ 0 < s < \infty,\ -\infty < x < \infty, \qquad (2.6.1)$$
where $f$ is the standard classical Laplace density (2.1.2). We start with the observation that our class is not a member of the exponential family of distributions; i.e., the density (2.6.1) cannot be written as
$$a(\theta, s)\, b(x)\, e^{\sum_{i=1}^{k} c_i(\theta, s)\, d_i(x)}, \quad -\infty < \theta < \infty,\ 0 < s < \infty,\ -\infty < x < \infty, \qquad (2.6.2)$$
where $a(\theta, s)$ and $c_i(\theta, s)$, $1 \le i \le k$, are some functions of the vector parameter $(\theta, s)$, and $b(x)$ and $d_i(x)$, $1 \le i \le k$, are some functions of $x$. Consequently, many standard results which are valid for exponential families of distributions are not available for the Laplace distribution.
Let $X_1, \dots, X_n$ be i.i.d., each with density (2.6.1). If the density were of the form (2.6.2), then the data could be reduced to the set of $k$ sufficient statistics $(T_1, \dots, T_k)$, where
$$T_i = T_i(X_1, \dots, X_n) = \sum_{j=1}^{n} d_i(X_j). \qquad (2.6.3)$$
Since we are not dealing with an exponential family, this is not the case. Clearly, the set of all order statistics,
$$T = (X_{1:n}, \dots, X_{n:n}), \qquad (2.6.4)$$
is sufficient, as it is for any i.i.d. observations. Moreover, a greater reduction of the data is not possible here, since the statistic $T$ given above is also minimal sufficient [see, e.g., Lehmann and Casella (1998)].
Proposition 2.6.1 Let $\mathcal{P}$ be the family of densities (2.6.1), and let the variables $X_1, \dots, X_n$ be i.i.d., each with density $f(\cdot\,; \theta, s) \in \mathcal{P}$. Then, the statistic $T$ given by (2.6.4) is minimal sufficient for $\mathcal{P}$.
The proof of Proposition 2.6.1 hinges on the following lemma presented in Lehmann and Casella (1998).
Lemma 2.6.1 If $\mathcal{P}$ is a family of distributions with common support and $\mathcal{P}_0 \subset \mathcal{P}$, and if $T$ is minimal sufficient for $\mathcal{P}_0$ and sufficient for $\mathcal{P}$, then it is minimal sufficient for $\mathcal{P}$.
Proof. To establish Proposition 2.6.1, note that the statistic $T$ is sufficient for $\mathcal{P}$ by the Factorization Criterion [see, e.g., Lehmann and Casella (1998), Theorem 6.5]. It remains to show that $T$ is also minimal sufficient. Let $\mathcal{P}_0$ be the subset of $\mathcal{P}$ of those densities (2.6.1) with $s = 1$. In view of Lemma 2.6.1, it is enough to show that $T$ is minimal sufficient for $\mathcal{P}_0$. Consider a subset $\mathcal{P}_1$ of $\mathcal{P}_0$ consisting of the densities with a rational value of $\theta$. Since the family $\mathcal{P}_1$ is countable, the set of statistics of the form
$$S_j(X_1, \dots, X_n) = \frac{\prod_{i=1}^{n} f(X_i; \theta_j, s)}{\prod_{i=1}^{n} f(X_i; 0, s)}, \qquad (2.6.5)$$
where $\theta_j$ is the $j$th rational number different from zero (since there are countably many rational numbers, they can be enumerated), is minimal sufficient for $\mathcal{P}_1$ [see Lehmann and Casella (1998), Theorem 6.12]. Since for the Laplace distribution
$$S_j(X_1, \dots, X_n) = e^{-\sum_{i=1}^{n} |X_i - \theta_j| + \sum_{i=1}^{n} |X_i|}, \qquad (2.6.6)$$
it is clear that the set of statistics (2.6.6) is equivalent to the set of order statistics; that is,
$$S_j(X_1, \dots, X_n) = S_j(Y_1, \dots, Y_n), \quad j = 1, 2, \dots, \qquad (2.6.7)$$
if and only if $(X_1, \dots, X_n)$ and $(Y_1, \dots, Y_n)$ have the same order statistics. Thus, the set of order statistics $T$ is minimal sufficient for $\mathcal{P}_1$, and also for $\mathcal{P}_0$ via another application of Lemma 2.6.1.
We now turn to a study of the amount of Fisher information contained in a random sample from the distribution with density (2.6.1). For the location-scale family with density (2.6.1), the entries of the Fisher information matrix,
$$I(\theta, s) = \begin{pmatrix} I_{11} & I_{12} \\ I_{21} & I_{22} \end{pmatrix}, \qquad (2.6.8)$$
are given by
$$I_{11} = \frac{1}{s^2}\int \left(\frac{f'(y)}{f(y)}\right)^2 f(y)\,dy, \qquad (2.6.9)$$
$$I_{22} = \frac{1}{s^2}\int \left(\frac{y f'(y)}{f(y)} + 1\right)^2 f(y)\,dy, \qquad (2.6.10)$$
and
$$I_{12} = I_{21} = \frac{1}{s^2}\int y \left(\frac{f'(y)}{f(y)}\right)^2 f(y)\,dy, \qquad (2.6.11)$$
see, e.g., Lehmann and Casella (1998). After routine calculations (see Exercise 2.7.31) we obtain
$$I(\theta, s) = \begin{pmatrix} 1/s^2 & 0 \\ 0 & 1/s^2 \end{pmatrix}. \qquad (2.6.12)$$
It is worth noting that $\int [f'(y)/f(y)]^2 f(y)\,dy$ equals 1 for both the Laplace and normal densities, but has a different value for other symmetric distributions such as the logistic and Cauchy.
Remark 2.6.1 Note that the Laplace density does not satisfy the standard differentiability assumptions required for the computation of the Fisher information matrix, since $f$ is not differentiable at zero. However, the relations (2.6.9) - (2.6.11) are valid under the weaker assumption that $f$ is absolutely continuous, which is the case for the Laplace density [see, e.g., Huber (1981), Section 4.4].
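The entries (2.6.9)-(2.6.11) are easy to confirm by quadrature for the standard Laplace density, where $f'(y)/f(y) = -\mathrm{sign}(y)$ for $y \ne 0$. A sketch (our own, with $s = 1$ and ad hoc truncation):

```python
import math

def quad(g, lo, hi, m=200_000):
    """Midpoint-rule quadrature."""
    h = (hi - lo) / m
    return sum(g(lo + (i + 0.5) * h) for i in range(m)) * h

f = lambda y: 0.5 * math.exp(-abs(y))       # standard classical Laplace density
score = lambda y: -math.copysign(1.0, y)    # f'(y)/f(y) = -sign(y), y != 0

I11 = quad(lambda y: score(y) ** 2 * f(y), -40, 40)
I22 = quad(lambda y: (y * score(y) + 1) ** 2 * f(y), -40, 40)
I12 = quad(lambda y: y * score(y) ** 2 * f(y), -40, 40)
```

Both diagonal entries should come out as 1 and the off-diagonal entry as 0, matching (2.6.12) for $s = 1$.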
2.6.1 Point estimation
We start with the problem of estimating the parameters of the Laplace distribution. Since the theory of estimation for the classical Laplace distribution is well developed, we shall stick to the $\mathcal{CL}(\theta, s)$ parameterization. Below we shall assume that $X_1, \dots, X_n$ are $n$ mutually independent random variables with probability density function (2.1.1), while $x_1, \dots, x_n$ are their particular realizations.
Maximum likelihood estimation
The likelihood function based on a sample of size $n$ from the classical Laplace distribution with scale $s$ and location $\theta$ is
$$f_n(x_1, \dots, x_n; \theta, s) = \prod_{i=1}^{n} f(x_i; \theta, s) = \left(\frac{1}{2s}\right)^n e^{-\frac{1}{s}\sum_{i=1}^{n} |x_i - \theta|}. \qquad (2.6.13)$$
Let us consider three cases: two where one of the parameters is known, and one where both parameters are unknown.
Let us consider three cases, two where one of the parameters is known, and
one where both parameters are unknown.
Case 1: The value of s is known. Clearly, to find the maximum value of f
n
with respec t to θ, is the same as to minimize the expression
1
n
n
X
i=1
|x
i
θ| (2.6.14)
with respect to θ. Note that (2.6.14) is the expected value E|Y θ|, where
Y is a discrete random variable taking each of the values x
1
, . . . , x
n
with
probability 1/n. Consequently, the value of θ that minimizes (2.6.14 ) is
the median of Y , which here coincides with the sample median of the
observations x
1
, . . . , x
n
[see Hombas (1986)]. Norton (1984) established
this result by using calculus (see Exercise 2.7.34).
Thus, for n odd, the max imum likelihood estimator (MLE) of θ, denoted
ˆ
θ
n
, is uniquely defined as the middle observation X
(n+1)/2:n
. For n even,
ˆ
θ
n
can be chosen as any value between the two middle observations. For
convenience, in this case the canonical median, which is the ar ithmetic
mean of the two middle values, is usually used in practice.
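A small sketch (our own, with made-up data values) illustrates Case 1: scanning the objective (2.6.14), up to the factor $1/n$, on a fine grid recovers the sample median as the minimizer:

```python
# A toy sample (hypothetical values chosen for illustration)
data = [0.8, -1.3, 2.1, 0.05, -0.4, 1.7, -2.2]

def neg_loglik_theta(theta, xs, s=1.0):
    """Up to constants, the negative log-likelihood (2.6.13) in theta."""
    return sum(abs(x - theta) for x in xs) / s

# Scan a fine grid; the minimizer should be the sample median
grid = [i / 1000 for i in range(-3000, 3001)]
best = min(grid, key=lambda t: neg_loglik_theta(t, data))

sample_median = sorted(data)[len(data) // 2]
```

Here $n = 7$ is odd, so the minimizer is unique and equals the middle observation, $0.05$.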
Proposition 2.6.2 Let $X_1, \dots, X_n$ be i.i.d. with the $\mathcal{CL}(\theta, s)$ distribution (2.1.1), where $s$ is known and $\theta \in \mathbb{R}$ is unknown. Then, the MLE of $\theta$,
$$\hat{\theta}_n = \begin{cases} X_{k+1:n} & \text{for } n = 2k+1, \\ \frac{1}{2}\{X_{k:n} + X_{k+1:n}\} & \text{for } n = 2k, \end{cases} \qquad (2.6.15)$$
where $X_{r:n}$ denotes the $r$th order statistic, is
(i) unbiased;
(ii) consistent;
(iii) asymptotically normal, i.e., $\sqrt{n}(\hat{\theta}_n - \theta)$ converges in distribution to a normal distribution with mean zero and variance $s^2$.
Proof. The result can be established by using the explicit form of the density and moments of the sample median, derived in Section 2.5.
(i) Using the formulas for the moments of order statistics (see Section 2.5), we find that the mean of the sample median defined by (2.6.15) is equal to $\theta$.
(ii) The consistency of $\hat\theta_n$ follows from part (i) and the fact that the variance of $\hat\theta_n$ converges to zero as $n\to\infty$ (Exercise 2.7.39).
(iii) The standard regularity conditions usually stated in theorems on asymptotic normality of MLE's do not hold for the Laplace distribution. To establish the asymptotic normality, use Theorem 3.2, Chapter 5 of Lehmann (1983), which asserts that the sequence $\sqrt{n}\,(\hat\theta_n-\theta)$ converges to the normal distribution with mean zero and variance $1/[4f^2(0)]$, where $f$ is the p.d.f. of $X_1$.
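As a quick numerical check of Proposition 2.6.2(iii), the following Python sketch (our illustration; all names are ours) simulates Laplace samples as the difference of two standard exponentials and estimates the variance of $\sqrt{n}\,\hat\theta_n$, which should be close to $s^2$:

```python
import math
import random
import statistics

def laplace_sample(n, theta=0.0, s=1.0, rng=random):
    # X = theta + s*(E1 - E2), with E1, E2 i.i.d. standard exponential,
    # is a draw from the classical Laplace CL(theta, s) law.
    return [theta + s * (rng.expovariate(1.0) - rng.expovariate(1.0))
            for _ in range(n)]

rng = random.Random(12345)
n, s, reps = 201, 1.0, 4000
# Scaled MLE errors sqrt(n)*(median - theta), with theta = 0.
scaled = [math.sqrt(n) * statistics.median(laplace_sample(n, 0.0, s, rng))
          for _ in range(reps)]
var_hat = statistics.pvariance(scaled)  # should be near s^2 = 1
```

For $n = 201$ and a few thousand replications, the estimated variance typically lands within a few percent of $s^2$, in line with the stated asymptotics.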
Remark 2.6.2 The median may not be the best estimator to use for the $CL(\theta,s)$ distribution, since there are other unbiased estimators of $\theta$ with smaller variances. For example, Rosenberger and Gasko (1983) found that the variances of both the midmean$^9$ and the broadened median$^{10}$ are less than that of the median. However, the median has a desirable property of robustness (as do most other trimmed means), as it performs well (in terms of efficiency) if the assumed model departs from the Laplace distribution; Rosenberger and Gasko (1983) recommend the median as an estimator of location based on samples of size $n\leq 6$ from a symmetric, possibly heavy-tailed distribution.
Keynes (1911) conjectured that the property that the sample median is a MLE of the location parameter is a characterization of the Laplace distribution. This indeed is the case, as shown in Kagan et al. (1973) for the case of $n=4$ and under the assumption that the density function of the considered distribution is lower semicontinuous. Recall that the normal distribution admits a similar characterization, where the MLE of the shift parameter is the sample mean for sample sizes $n=2,3$ [see Teicher (1961)]. It is interesting to note that the result for Laplace fails for sample sizes $n=2,3$; see Rao and Ghosh (1971) and Exercise 2.7.35.
$^9$The midmean is the average of the central half of the order statistics (the 25% trimmed mean).
$^{10}$For $n$ odd, the broadened median is the average of the three middle order statistics for $5\leq n\leq 12$ and the five middle order statistics for $n\geq 13$. For $n$ even, it is a weighted average of the four middle order statistics for $1\leq n\leq 12$ with weights 1/6, 1/3, 1/3, and 1/6, while for $n\geq 13$ it is a weighted average of six middle order statistics with the weights 1/10, 1/5, 1/5, 1/5, 1/5, and 1/10 [see, e.g., Rosenberger and Gasko (1983)].
2.6 Statistical inference 83
The above characterization problem of the Laplace distribution has been thoroughly studied in Findeisen (1982), who showed that the following conditions imply that $f$ is a Laplace density (with the mode at zero), where $X_1,\dots,X_n$ are i.i.d. with density $f(x-\theta)$, $-\infty < x, \theta < \infty$.
(i) For all $n$, every median of the random sample of size $n$ is the MLE of $\theta$.
(ii) There is at least one even $n$, such that every median of the random sample of size $n$ is the MLE of $\theta$.
(iii) There are infinitely many $n$'s such that for every random sample of size $n$ at least one median is the MLE of $\theta$.
(iv) For sufficiently large $n$, the canonical median given by (2.6.15) is always a MLE of $\theta$.
In addition, Findeisen (1982) demonstrated that the conditions (v) and (vi) given below are not sufficient to conclude that $f$ is a Laplace density (see Exercises 2.7.36 and 2.7.37):
(v) There exists at least one $n$ such that every median of a random sample of size $n$ is the MLE of $\theta$.
(vi) There exists an even $n$ such that the two particular medians, $X_{n/2:n}$ and $X_{n/2+1:n}$, are the MLE's of $\theta$.
Buczolich and Székely (1989) improved these results by showing that the above characterization of the Laplace distribution of Kagan et al. (1973) holds for an arbitrary even sample size $n\geq 4$ and without any regularity conditions on the density, and by replacing "every median" with "some median" in the condition (ii) of Findeisen (1982) given above. Thus, we have the following characterization of the Laplace distribution.
Proposition 2.6.3 Let $\{F(x-\theta),\ \theta\in\mathbb{R}\}$ be a family of absolutely continuous distribution functions on $\mathbb{R}$ depending on a shift parameter $\theta$. If the canonical sample median given by (2.6.15) is the MLE of $\theta$ for some even sample size $n\geq 4$, then $F$ must be a Laplace distribution function, so that
$$F'(x) = f(x) = \frac{a}{2}\,e^{-a|x|}, \quad x\neq 0.$$
We refer the reader to Buczolich and Székely (1989) for a fairly advanced proof of the result.
Remark 2.6.3 More generally, if for some $i\in\{1,2,\dots,n-1\}$ a linear combination of two consecutive order statistics of the form
$$W = a_i X_{i:n} + a_{i+1} X_{i+1:n} \tag{2.6.16}$$
is the MLE of $\theta$, where $n\geq 3$ and
$$a_i + a_{i+1} = 1, \qquad a_i, a_{i+1} > 0, \tag{2.6.17}$$
then $F$ must be a skewed Laplace distribution function corresponding to the density
$$f(x) = \begin{cases} c\,e^{-b_1|x|} & \text{if } x\geq 0,\\ c\,e^{-b_2|x|} & \text{if } x\leq 0, \end{cases} \tag{2.6.18}$$
where $b_1$ is some positive constant, $b_2 = \frac{i}{n-i}\,b_1$, and $c$ is chosen so that the density (2.6.18) integrates to 1 [Buczolich and Székely (1989)]. In particular, Proposition 2.6.3 still holds if the canonical sample median is replaced by an arbitrary median [of the form (2.6.16) with $i = n/2$].
Remark 2.6.4 We see that the MLE of the location parameter when sampling from a Laplace distribution is the sample median (the empirical 0.5-quantile). A question arises whether there are any distributions for which the MLE's of the location parameters are given by other empirical quantiles. It turns out that this is generally true for skewed Laplace distributions (2.6.18) [see Section 3.5 of Chapter 3]. One family of skewed Laplace distributions is given by the p.d.f.
$$f(x) = \alpha(1-\alpha)\begin{cases} e^{-(1-\alpha)|x-\theta|}, & \text{for } x<\theta,\\ e^{-\alpha|x-\theta|}, & \text{for } x\geq\theta, \end{cases} \tag{2.6.19}$$
where $\theta\in(-\infty,\infty)$ and $\alpha\in(0,1)$ [see Poiraud-Casanova and Thomas-Agnan (2000)]. Here, given i.i.d. observations from the density (2.6.19) (with a given value of $\alpha$), the MLE of $\theta$ is the empirical $\alpha$-quantile (see Exercise 2.7.38). For $\alpha = 1/2$, the density (2.6.19) reduces to a symmetric Laplace density and the MLE of $\theta$ is the empirical 0.5-quantile (the median).
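Remark 2.6.4 is easy to check numerically: for the density (2.6.19), the log-likelihood, viewed as a function of $\theta$, is maximized at an empirical $\alpha$-quantile of the data (the maximizer is always attained at a data point). The following Python sketch (our own illustration) compares the lower empirical $\alpha$-quantile with a brute-force maximization over all data points:

```python
import math
import random

def skew_laplace_loglik(theta, xs, alpha):
    # Log-likelihood for the skewed Laplace density (2.6.19).
    ll = len(xs) * math.log(alpha * (1.0 - alpha))
    for x in xs:
        ll += -(1.0 - alpha) * (theta - x) if x < theta else -alpha * (x - theta)
    return ll

def empirical_quantile(xs, alpha):
    # Lower empirical alpha-quantile (an order statistic).
    ys = sorted(xs)
    return ys[max(0, math.ceil(alpha * len(ys)) - 1)]

rng = random.Random(7)
alpha = 0.3
xs = [rng.gauss(0.0, 1.0) for _ in range(101)]   # any data set will do
q = empirical_quantile(xs, alpha)
best = max(xs, key=lambda t: skew_laplace_loglik(t, xs, alpha))
# q and best coincide: the empirical alpha-quantile is the MLE of theta.
```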
The two-tailed power distribution with the c.d.f.
$$F(x) = \begin{cases} x^n/\theta^{n-1} & \text{for } 0\leq x\leq\theta,\\ 1-(1-x)^n/(1-\theta)^{n-1} & \text{for } \theta\leq x\leq 1, \end{cases}$$
and density
$$f(x) = \begin{cases} n x^{n-1}/\theta^{n-1} & \text{for } 0\leq x\leq\theta,\\ n(1-x)^{n-1}/(1-\theta)^{n-1} & \text{for } \theta\leq x\leq 1, \end{cases}$$
has a similar property: the MLE of the parameter $\theta$ [which is not actually a location parameter as described in (2.6.1)] is given by an order statistic. (The above distribution serves as an alternative to beta distributions. For $n=2$, we have the triangular distribution.)
Remark 2.6.5 Marshall and Olkin (1993) extended the above maximum likelihood characterization of the Laplace distribution to the multivariate case. They showed that if $x_1, x_2, x_3, x_4$ is a random sample of size $n=4$ from a location family $\{F(x-\theta),\ \theta\in\mathbb{R}^d\}$ of distributions in $\mathbb{R}^d$, where $f = F'$ is lower semicontinuous at $x=0$, and the vector of sample medians is a MLE of $\theta$, then $f$ must be the product of univariate Laplace densities.
Case 2: The value of $\theta$ is known. Here the likelihood function is maximized by the sample first absolute moment.
Proposition 2.6.4 Let $X_1,\dots,X_n$ be i.i.d. with the $CL(\theta,s)$ distribution (2.1.1), where $\theta$ is known and $s>0$ is unknown. Then, the MLE of $s$,
$$\hat s_n = \frac{1}{n}\sum_{i=1}^{n}|X_i-\theta|, \tag{2.6.20}$$
is
(i) Unbiased;
(ii) Strongly consistent;
(iii) Asymptotically normal, i.e., $\sqrt{n}\,(\hat s_n - s)$ converges in distribution to a normal distribution with mean zero and variance $s^2$;
(iv) Efficient.
Proof. To establish (2.6.20), write the log-likelihood,
$$\log f_n(x_1,\dots,x_n;\theta,s) = -n\log 2 - n\log s - \frac{1}{s}\sum_{i=1}^{n}|x_i-\theta|, \tag{2.6.21}$$
and note that its derivative with respect to $s$,
$$\frac{1}{s}\left(\frac{1}{s}\sum_{i=1}^{n}|X_i-\theta| - n\right),$$
is positive for $s<\hat s_n$ and negative for $s>\hat s_n$, so that the log-likelihood is maximized at $s=\hat s_n$.
(i) The unbiasedness of $\hat s_n$ follows from the representation
$$|X_i - \theta| \stackrel{d}{=} sW, \tag{2.6.22}$$
where $W$ is standard exponential with mean and variance equal to one.
(ii) The strong consistency of $\hat s_n$ follows from the Strong Law of Large Numbers, since the random variables (2.6.22) are i.i.d. with mean $s$.
(iii) The asymptotic normality follows from the classical version of the Central Limit Theorem, as the random variables (2.6.22) are i.i.d. with mean and standard deviation both equal to $s$.
(iv) The efficiency of $\hat s_n$ follows from the fact that the variance of $\hat s_n$ coincides with the Cramér–Rao lower bound (for the variance of any unbiased estimator of $s$). Indeed, the Cramér–Rao lower bound is $[nI(s)]^{-1}$, where
$$I(s) = -E\left[\frac{\partial^2}{\partial s^2}\log f(x;\theta,s)\right] \tag{2.6.23}$$
is the Fisher information in one observation from $f(x;\theta,s)$. The second derivative of $\log f(x;\theta,s)$ with respect to $s$ is
$$\frac{\partial^2}{\partial s^2}\log f(x;\theta,s) = \frac{1}{s^2} - \frac{2|x-\theta|}{s^3}, \tag{2.6.24}$$
so that $I(s) = 1/s^2$ and the Cramér–Rao lower bound is $s^2/n$.
Note that since $\hat s_n$ is unbiased and efficient, it is a uniformly minimum variance unbiased estimator (UMVU) of $s$.
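The unbiasedness and efficiency asserted in Proposition 2.6.4 can be probed by simulation; in the sketch below (illustrative; the names are ours), the Monte Carlo mean of $\hat s_n$ should be close to $s$ and its variance close to the Cramér–Rao bound $s^2/n$:

```python
import random
import statistics

def mle_scale(theta, xs):
    # MLE of s for known theta: the first absolute sample moment (2.6.20).
    return sum(abs(x - theta) for x in xs) / len(xs)

rng = random.Random(99)
n, s, reps = 50, 2.0, 5000
est = [mle_scale(0.0, [s * (rng.expovariate(1.0) - rng.expovariate(1.0))
                       for _ in range(n)])
       for _ in range(reps)]
mean_hat = statistics.mean(est)      # ~ s = 2.0 (unbiasedness)
var_hat = statistics.pvariance(est)  # ~ s^2/n = 0.08 (efficiency)
```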
We have shown that for a scale parameter family of Laplace distributions, a MLE of the scale parameter is the first absolute moment given by (2.6.20). Is the converse true? Recall that for the corresponding scale parameter family of normal distributions, a MLE of the scale parameter is $\sqrt{\frac{1}{n}\sum_{i=1}^{n}X_i^2}$, which actually is a characterization of the normal distribution [see Teicher (1961)]. For the Laplace distribution, such a characterization holds as well.
Proposition 2.6.5 Let $\{F(x/s),\ s>0\}$ be a family of absolutely continuous distributions on $\mathbb{R}$, depending on a scale parameter $s$. Suppose that the density $f(x) = F'(x)$ satisfies the following conditions:
(i) $f$ is continuous on $(-\infty,\infty)$;
(ii)
$$\lim_{y\to 0}\frac{f(\lambda y)}{f(y)} = 1 \quad\text{for all } \lambda>0. \tag{2.6.25}$$
If for all sample sizes $n$, a MLE of $s$ is given by $\frac{1}{n}\sum_{i=1}^{n}|X_i|$, then $F$ is Laplace and $f(x) = \frac{1}{2}e^{-|x|}$.
Proof. Suppose that $\hat s_n = \frac{1}{n}\sum_{i=1}^{n}|X_i|$ is a MLE of $s$ for all sample sizes $n$. Then, $\hat s_n$ maximizes the likelihood function, so that we have the inequality
$$\frac{1}{\hat s_n^{\,n}}\prod_{i=1}^{n} f\!\left(\frac{x_i}{\hat s_n}\right) \;\geq\; \frac{1}{s^n}\prod_{i=1}^{n} f\!\left(\frac{x_i}{s}\right) \tag{2.6.26}$$
for all $s>0$ and $x_i\in\mathbb{R}$, $i=1,\dots,n$. Let $y_i = x_i/\hat s_n$ and $\lambda = \hat s_n/s$. Then, we can write (2.6.26) as
$$\prod_{i=1}^{n} f(y_i) \;\geq\; \lambda^n \prod_{i=1}^{n} f(\lambda y_i), \tag{2.6.27}$$
where $\lambda>0$ and $y_1,\dots,y_n$ satisfy the condition
$$\sum_{i=1}^{n}|y_i| = n. \tag{2.6.28}$$
Consider the function $f$ for $x>0$. With positive $y_i$'s satisfying (2.6.28) and arbitrary $\lambda>0$, the condition (2.6.26) leads to an exponential function,
$$f(x) = c_1 e^{-x}, \quad x>0, \tag{2.6.29}$$
see Teicher (1961, Theorem 2). Similarly, for $x>0$, denote $g(x) = f(-x)$ and write (2.6.27) as
$$\prod_{i=1}^{n} g(y_i) \;\geq\; \lambda^n \prod_{i=1}^{n} g(\lambda y_i), \tag{2.6.30}$$
where $\lambda>0$, and $y_i>0$ satisfy $\sum_{i=1}^{n} y_i = n$. Proceeding as above, we arrive at the conclusion that
$$f(x) = g(-x) = c_2 e^{x}, \quad x<0. \tag{2.6.31}$$
Since $f$ is a probability density on $(-\infty,\infty)$, we must have $c_1 + c_2 = 1$. To conclude the proof, note that only the choice $c_1 = c_2 = \frac{1}{2}$ leads to a MLE given by the sample first absolute moment.
Remark 2.6.6 Cifarelli and Regazzini (1976) considered the problem of characterization of probability distributions for which the mean absolute deviation (2.6.20) is an unbiased and efficient estimator of the scale parameter. Suppose that $X_1,\dots,X_n$ are i.i.d. with the density
$$g(x) = \frac{1}{s}\,f\!\left(\frac{x}{s}\right), \tag{2.6.32}$$
where $f$ is positive for all real $x$ and $s>0$, continuous at $x=0$, and satisfies some technical conditions. Cifarelli and Regazzini (1976) showed that if the statistic (2.6.20) (with $\theta=0$) is unbiased and efficient for the scale parameter $s$ of (2.6.32), then $f$ is the standard classical Laplace density. Cifarelli and Regazzini (1976) also obtained a generalization, showing that if for some $\gamma>0$ the statistic
$$\hat s_{n,\gamma} = \frac{1}{n}\sum_{i=1}^{n}|X_i|^{\gamma} \tag{2.6.33}$$
is an unbiased and efficient estimator for the parameter $s^{\gamma}$ [under the model (2.6.32)], then $g$ must be the exponential power density
$$g(x) = \frac{\gamma^{1-\gamma^{-1}}}{2s\,\Gamma(\gamma^{-1})}\,e^{-(\gamma s^{\gamma})^{-1}|x|^{\gamma}},$$
which we shall (briefly) consider in Section 4.4.2 of Chapter 4.
Case 3: Both $s$ and $\theta$ are unknown. Similarly as above, here the MLE of $\theta$ is the sample median $\hat\theta_n$ given by (2.6.15), while the MLE of the scale parameter $s$ is equal to the mean absolute deviation
$$\hat s_n = \frac{1}{n}\sum_{j=1}^{n}|X_j - \hat\theta_n|. \tag{2.6.34}$$
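In code, the joint MLE of Case 3 is simply the sample median together with the mean absolute deviation about it; a minimal Python sketch (ours):

```python
import statistics

def laplace_mle(xs):
    # Joint MLE for CL(theta, s): the sample median (2.6.15) and the
    # mean absolute deviation about it (2.6.34).
    theta_hat = statistics.median(xs)
    s_hat = sum(abs(x - theta_hat) for x in xs) / len(xs)
    return theta_hat, s_hat

theta_hat, s_hat = laplace_mle([1.0, 2.0, 2.0, 4.0, 10.0])
# Here the median is 2.0 and the MAD about it is (1+0+0+2+8)/5 = 2.2.
```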
We shall demonstrate that these estimators are consistent and asymptotically normal. To prove these results one could use the general theory of maximum likelihood estimation and its asymptotics. Instead, we have decided to give more explicit derivations using the specific structure of maximum likelihood estimators for Laplace distributions. We restrict ourselves to the case of an odd sample size, i.e., $n = 2k+1$. The case of an even sample size can be derived in an analogous way with some minor adjustments to account for the different form of the median. Thus we shall assume that $n = 2k+1$.
Let us start with an interesting representation of the median and the mean absolute deviation for Laplace distributions. First, note the following general relations for the mean absolute deviation:
$$\frac{1}{n}\sum_{i=1}^{n}|X_i - X_{k+1:n}| = \frac{1}{n}\sum_{i=1}^{n}|X_{i:n} - X_{k+1:n}| = \frac{1}{n}\left[\sum_{i=1}^{k}(X_{k+1:n}-X_{i:n}) + \sum_{i=k+2}^{n}(X_{i:n}-X_{k+1:n})\right] = \frac{1}{n}\left[\sum_{i=k+2}^{n}X_{i:n} - \sum_{i=1}^{k}X_{i:n}\right]. \tag{2.6.35}$$
Now, let us consider $X_i$'s being i.i.d. from the standard classical Laplace distribution. We use the representation of their order statistics given in Proposition 2.5.6 to obtain the following result.
Proposition 2.6.6 Let $(X_1,\dots,X_n)$ be a vector of i.i.d. $CL(0,1)$ random variables, $n = 2k+1$, and let $B_n$ be a binomial random variable with parameters $n$ and $p = 1/2$, independent of two independent sequences $(\bar W_i)_{i=1}^{\infty}$, $(W_i)_{i=1}^{\infty}$ of i.i.d. standard exponential random variables. Define $\bar B_n = n - B_n$, $K_n = \max(B_n, \bar B_n)$, $\bar K_n = n - K_n$, and $\delta_n = \mathrm{sign}(B_n - k - 1/2)$.
Then we have the following three joint representations of $\hat\theta_n$ and $\hat s_n$:
$$\hat\theta_n \stackrel{d}{=} \delta_n W_{K_n-k:K_n} \stackrel{d}{=} \delta_n \sum_{l=1}^{K_n-k}\frac{W_l}{K_n-l+1} \stackrel{d}{=} \delta_n \sum_{l=k+1}^{K_n}\frac{W_l}{l},$$
$$\hat s_n \stackrel{d}{=} \frac{1}{n}\left[\sum_{i=1}^{\bar K_n}\bar W_{i:\bar K_n} + \sum_{i=K_n-k+1}^{K_n} W_{i:K_n} - \sum_{i=1}^{K_n-k-1} W_{i:K_n}\right]$$
$$\stackrel{d}{=} \frac{1}{n}\left[\sum_{l=1}^{\bar K_n}\bar W_l + \sum_{l=K_n-k+1}^{K_n} W_l + \frac{k}{k+1}\,W_{K_n-k} + \sum_{l=1}^{K_n-k-1}\frac{2k-K_n+l}{K_n-l+1}\,W_l\right]$$
$$\stackrel{d}{=} \frac{1}{n}\left[\sum_{l=1}^{\bar K_n}\bar W_l + \sum_{l=1}^{k} W_l + \frac{k}{k+1}\,W_{k+1} + \sum_{l=k+2}^{K_n}\left(\frac{2k+1}{l}-1\right)W_l\right].$$
Here and below, if the upper limit of summation is smaller than the lower limit, then the sum is assumed to be zero.
Proof. The representation for the median was explained in Remark 2.5.7. For the mean absolute deviation let us consider two cases.
First, let $B_n \geq k+1$, i.e., $K_n = B_n$. We have
$$\sum_{i=k+2}^{n} X_{i:n} \stackrel{d}{=} \sum_{i=B_n-k+1}^{B_n} W_{i:B_n} \quad\text{and}\quad \sum_{i=1}^{k} X_{i:n} \stackrel{d}{=} -\sum_{i=1}^{\bar B_n}\bar W_{i:\bar B_n} + \sum_{i=1}^{B_n-k-1} W_{i:B_n}.$$
Thus in this case the first representation for $\hat s_n$ follows from the relation (2.6.35).
The second case of $B_n \leq k$, i.e., $K_n = \bar B_n$, can be treated similarly. We obtain
$$\hat s_n \stackrel{d}{=} \frac{1}{n}\left[\sum_{i=1}^{B_n} W_{i:B_n} + \sum_{i=\bar B_n-k+1}^{\bar B_n}\bar W_{i:\bar B_n} - \sum_{i=1}^{\bar B_n-k-1}\bar W_{i:\bar B_n}\right].$$
The first representation of $\hat s_n$ follows from the fact that $B_n$ is independent of the $W_i$'s and $\bar W_i$'s, which allows for the replacement of $W_i$'s by $\bar W_i$'s (and vice versa) in the last equation.
To prove the second representation, we apply the representation of order statistics of exponential random variables given in (2.5.38). Let us consider only the case of $B_n \geq k+1$, the other case being symmetric. Since the representation for the median was discussed in Remark 2.5.7, here we consider the mean absolute deviation.
We have (for $B_n \geq k+1$)
$$\hat s_n \stackrel{d}{=} \frac{1}{n}\left[\sum_{i=B_n-k+1}^{B_n} W_{i:B_n} - \sum_{i=1}^{B_n-k-1} W_{i:B_n} + \sum_{i=1}^{\bar B_n}\bar W_{i:\bar B_n}\right]. \tag{2.6.36}$$
By representation (2.5.38), the distribution of the first term in the above equation is the same as that of
$$\sum_{i=B_n-k+1}^{B_n}\left(\sum_{l=1}^{B_n-k}\frac{W_l}{B_n-l+1} + \sum_{l=B_n-k+1}^{i}\frac{W_l}{B_n-l+1}\right) = k\sum_{l=1}^{B_n-k}\frac{W_l}{B_n-l+1} + \sum_{l=B_n-k+1}^{B_n}\sum_{i=l}^{B_n}\frac{W_l}{B_n-l+1} = k\sum_{l=1}^{B_n-k}\frac{W_l}{B_n-l+1} + \sum_{l=B_n-k+1}^{B_n} W_l.$$
The second and the third terms in (2.6.36) can be written as follows:
$$\sum_{i=1}^{B_n-k-1}\sum_{l=1}^{i}\frac{W_l}{B_n-l+1} = \sum_{l=1}^{B_n-k-1}\sum_{i=l}^{B_n-k-1}\frac{W_l}{B_n-l+1} = \sum_{l=1}^{B_n-k-1}\frac{B_n-k-l}{B_n-l+1}\,W_l,$$
$$\sum_{i=1}^{\bar B_n}\sum_{l=1}^{i}\frac{\bar W_l}{\bar B_n-l+1} = \sum_{l=1}^{\bar B_n}\sum_{i=l}^{\bar B_n}\frac{\bar W_l}{\bar B_n-l+1} = \sum_{l=1}^{\bar B_n}\bar W_l.$$
Combining these three distributional relations results in the second representation of the mean absolute deviation.
Finally, the third representation is obtained by replacing the sequence $(W_1,\dots,W_{B_n})$ by $(W_{B_n},\dots,W_1)$ and $(\bar W_1,\dots,\bar W_{\bar B_n})$ by $(\bar W_{\bar B_n},\dots,\bar W_1)$.
Now, we prove the main theorem about consistency and asymptotic efficiency of $\hat\theta_n$ and $\hat s_n$ as estimators of $\theta$ and $s$. The proof is rather involved. We hope that our readers will communicate to us a simplified proof. Note, however, that consistency, asymptotic normality, and efficiency of MLE's for various distributions is a challenging problem, and a number of most prominent mathematical statisticians struggled with it in the last 30 years.
Theorem 2.6.1 Let $(X_i)_{i=1}^{\infty}$ be a sequence of i.i.d. random variables having the $CL(\theta,s)$ distribution. Then the pair of maximum likelihood estimators $(\hat\theta_n, \hat s_n)$ of $(\theta,s)$ is consistent, asymptotically normal, and efficient. The asymptotic covariance matrix has the form
$$\Sigma = \begin{bmatrix} s^2 & 0\\ 0 & s^2 \end{bmatrix}.$$
(See also Fisher's information matrix at the beginning of this section.)
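Theorem 2.6.1 can be probed by a Monte Carlo experiment: the scaled errors $\sqrt{n}\,\hat\theta_n$ and $\sqrt{n}(\hat s_n - s)$ (here with $\theta = 0$, $s = 1$) should each have variance near $s^2$ and be nearly uncorrelated. A Python sketch (our illustration):

```python
import math
import random
import statistics

def laplace_mle(xs):
    # Sample median and mean absolute deviation about it.
    theta_hat = statistics.median(xs)
    return theta_hat, sum(abs(x - theta_hat) for x in xs) / len(xs)

rng = random.Random(2021)
n, reps = 201, 3000
t_scaled, s_scaled = [], []
for _ in range(reps):
    xs = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(n)]
    th, sh = laplace_mle(xs)
    t_scaled.append(math.sqrt(n) * th)
    s_scaled.append(math.sqrt(n) * (sh - 1.0))
vt = statistics.pvariance(t_scaled)   # ~ s^2 = 1
vs = statistics.pvariance(s_scaled)   # ~ s^2 = 1
mt, ms = statistics.mean(t_scaled), statistics.mean(s_scaled)
corr = (statistics.mean(a * b for a, b in zip(t_scaled, s_scaled)) - mt * ms) \
       / math.sqrt(vt * vs)           # ~ 0: asymptotic independence
```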
Proof. It is sufficient to assume that $\theta = 0$ and $s = 1$ and show that
$$\sqrt{n}\,\big(\hat\theta_n - E\hat\theta_n,\ \hat s_n - E\hat s_n\big)$$
converges in distribution to the standard bivariate normal distribution, while $E(\hat\theta_n)$ and $E(\hat s_n)$ converge to zero and one, respectively.
We shall use the representation of the estimators given in Proposition 2.6.6. By the Central Limit Theorem and Skorohod's representation theorem we can assume that
$$(B_n - n/2)\big/\sqrt{n/4}$$
converges almost surely to a standard normal random variable $Z$ which is independent of the $W_i$'s and $\bar W_i$'s.
Let us first consider the median $\hat\theta_n$. By Proposition 2.6.6, we need to find the limiting distribution of the variable $A_n$, equal to the middle expression in the following inequalities multiplied by $\delta_n$:
$$\sqrt{n}\,\frac{1}{k+1}\sum_{l=k+1}^{K_n} W_l \;\geq\; \sqrt{n}\sum_{l=k+1}^{K_n}\frac{W_l}{l} \;\geq\; \sqrt{n}\,\frac{1}{K_n}\sum_{l=k+1}^{K_n} W_l. \tag{2.6.37}$$
Consider the right-hand side expression, say $R_n$, and take its characteristic function with respect to the conditional distribution given $B_n$:
$$\phi_{R_n}(t\,|\,B_n) = E\left[\exp\left(it\sqrt{n}\,\frac{1}{K_n}\,\delta_n\sum_{l=k+1}^{K_n} W_l\right)\Bigg|\,B_n\right] = \frac{1}{\left(1 - it\delta_n\sqrt{n}/K_n\right)^{K_n-k}} = \left[\frac{1}{\left(1 - it\delta_n\sqrt{n}/K_n\right)^{K_n/(it\delta_n\sqrt{n})}}\right]^{it\delta_n\sqrt{n}\,(K_n-k)/K_n}.$$
Note that $\sqrt{n}/K_n$ converges in absolute value to zero, and $\delta_n\sqrt{n}\,(K_n-k)/K_n$ converges by the assumption to $Z$ a.e. Consequently, the considered characteristic function converges (a.e. with respect to $K_n$) to $e^{itZ}$. Thus, the conditional distribution of the right-hand side of (2.6.37), $R_n$, converges to a degenerate distribution at $Z$, so the convergence is in probability. Exactly the same arguments can be repeated for the left-hand side of (2.6.37). This implies that $A_n$, conditionally on $B_n$, converges in probability to $Z$. To obtain the unconditional limiting distribution of $A_n$, note that
$$\phi_{A_n}(t) = E\big(\phi_{A_n}(t\,|\,B_n)\big).$$
Since $\phi_{A_n}(t\,|\,B_n)$ is bounded and convergent almost everywhere, it follows from the Dominated Convergence Theorem that $\phi_{A_n}(t)$ converges to $E(e^{itZ}) = e^{-t^2/2}$.
Now, we consider the mean absolute deviation. We again consider the distribution of $\hat s_n$ conditionally on $B_n$. Set
$$C_n = \sqrt{n}\,\big(\hat s_n - E(\hat s_n\,|\,B_n)\big)$$
and note the following representation:
$$C_n \stackrel{d}{=} \frac{\sum_{l=1}^{\bar K_n}(\bar W_l - E(\bar W_l))}{\sqrt{n}} + \frac{\sum_{l=1}^{k}(W_l - E(W_l))}{\sqrt{n}} + \frac{1}{\sqrt{n}}\,\frac{k}{k+1}\,(W_{k+1} - E(W_{k+1})) + \sqrt{n}\sum_{l=k+2}^{K_n}\left(\frac{1}{l}-\frac{1}{n}\right)(W_l - E(W_l)).$$
Note that the four terms in the above representation are mutually independent. Also, the first two terms are independent of the median. It follows from the Central Limit Theorem that each of the first two terms converges in distribution to the standard normal distribution multiplied by $\sqrt{2}/2$ (we also need to invoke the Law of Large Numbers to get that $\bar K_n/n$ converges almost surely to $1/2$). Thus their sum converges to the standard normal distribution. Clearly,
$$\frac{1}{\sqrt{n}}\,\frac{k}{k+1}\,(W_{k+1} - E(W_{k+1}))$$
converges to zero.
It remains to consider the distributional limit of the last term,
$$\sqrt{n}\sum_{l=k+2}^{K_n}\left(\frac{1}{l}-\frac{1}{n}\right)(W_l - E(W_l)).$$
Note the following inequalities ($E(W_1) = 1$):
$$\sqrt{n}\,(K_n-k-2)\left(\frac{1}{K_n}-\frac{1}{n}\right) \;\leq\; \sqrt{n}\sum_{l=k+2}^{K_n}\left(\frac{1}{l}-\frac{1}{n}\right)E(W_l) \;\leq\; \sqrt{n}\,(K_n-k-2)\left(\frac{1}{k+2}-\frac{1}{n}\right)$$
and
$$\sqrt{n}\left(\sum_{l=k+2}^{K_n} W_l\right)\left(\frac{1}{K_n}-\frac{1}{n}\right) \;\leq\; \sqrt{n}\sum_{l=k+2}^{K_n}\left(\frac{1}{l}-\frac{1}{n}\right)W_l \;\leq\; \sqrt{n}\left(\sum_{l=k+2}^{K_n} W_l\right)\left(\frac{1}{k+2}-\frac{1}{n}\right).$$
Since $K_n/n$ converges in probability to $1/2$, and $(K_n-k-1/2)/\sqrt{n}$ converges almost surely to $|Z|/2$, we conclude that
$$\sqrt{n}\,(K_n-k-2)\left(\frac{1}{K_n}-\frac{1}{n}\right) \quad\text{and}\quad \sqrt{n}\,(K_n-k-2)\left(\frac{1}{k+2}-\frac{1}{n}\right)$$
converge in probability to $|Z|/2$ (conditionally on $B_n$). Observe that
$$\sqrt{n}\left(\sum_{l=k+2}^{K_n} W_l\right)\left(\frac{1}{K_n}-\frac{1}{n}\right) \quad\text{and}\quad \sqrt{n}\left(\sum_{l=k+2}^{K_n} W_l\right)\left(\frac{1}{k+2}-\frac{1}{n}\right)$$
have the same limit (conditionally on $B_n$) since
$$\frac{1/K_n - 1/n}{1/(k+2) - 1/n}$$
converges in probability to one. In addition,
$$\sqrt{n}\left(\sum_{l=k+2}^{K_n} W_l\right)\left(\frac{1}{k+2}-\frac{1}{n}\right) = \frac{k}{k+2}\sum_{l=k+2}^{K_n}\frac{W_l}{\sqrt{n}}.$$
The characteristic function (conditionally on $B_n$) of $\sum_{l=k+2}^{K_n} W_l/\sqrt{n}$ converges to $e^{it|Z|/2}$. This shows that, in probability (conditionally on $B_n$),
$$\lim_{n\to\infty}\sqrt{n}\sum_{l=k+2}^{K_n}\left(\frac{1}{l}-\frac{1}{n}\right)(W_l - E(W_l)) = \frac{|Z|}{2} - \frac{|Z|}{2} = 0.$$
Consequently, $C_n$ converges to the standard normal distribution and is asymptotically independent of $\hat\theta_n$ (the only terms in the representation of $\hat s_n$ which are dependent on $\hat\theta_n$ converge in probability to zero).
To conclude the proof, we need to show that
$$\lim_{n\to\infty} E(\hat s_n) = 1.$$
We have
$$E(\hat s_n) = \frac{E(K_n)}{n} + \frac{k}{n} + \frac{1}{n}\,\frac{k}{k+1} + E\!\left(\sum_{l=k+2}^{K_n}\frac{1}{l}\right) - \frac{E(K_n)-k-1}{n} = \frac{1}{2} + \frac{1}{2+1/k} + \frac{1}{n}\,\frac{k}{k+1} + E\!\left(\sum_{l=k+2}^{K_n}\frac{1}{l}\right) - \frac{n/2-k-1}{n}.$$
We see that except for the first two terms (which converge to $1/2$ each), all the remaining terms converge to zero. This concludes the proof.
Remark 2.6.7 Harter et al. (1979) discuss adaptive MLE's of the location and scale parameters ($\theta$ and $s$, respectively) of a symmetric population, where a sample is first classified as having come from a uniform, normal, or Laplace distribution, and then the MLE's of $\theta$ and $s$, appropriate for the chosen population, are computed. See Harter et al. (1979) and references therein for further information, including the classification criteria.
Maximum likelihood estimation under censoring
Let $X_1,\dots,X_n$ be an i.i.d. sample from the classical Laplace distribution with density $f(\cdot\,;\theta,s)$ given by (2.1.1) and distribution function $F(\cdot\,;\theta,s)$ given by (2.1.5). When the smallest $r$ and the largest $r$ observations are censored, we obtain a Type-II (symmetrically) censored sample
$$X_{r+1:n} \leq \cdots \leq X_{n-r:n}. \tag{2.6.38}$$
If $x_{r+1:n} \leq \cdots \leq x_{n-r:n}$ is a particular realization of (2.6.38), then the likelihood function is
$$L(\theta,s) = \frac{n!}{(r!)^2}\,\{F(x_{r+1:n};\theta,s)\,[1-F(x_{n-r:n};\theta,s)]\}^{r}\prod_{i=r+1}^{n-r} f(x_{i:n};\theta,s). \tag{2.6.39}$$
Utilizing (2.1.1) and (2.1.5), we obtain
$$L(\theta,s) = \frac{n!}{2^n (r!)^2\, s^{n-2r}} \times \begin{cases}
\left[e^{-(x_{n-r:n}-\theta)/s}\left(2 - e^{-(x_{r+1:n}-\theta)/s}\right)\right]^{r}\exp\!\left\{-\sum_{i=r+1}^{n-r}(x_{i:n}-\theta)/s\right\}, & \theta < x_{r+1:n},\\[6pt]
\exp\!\left\{-\dfrac{r}{s}\,(x_{n-r:n}-x_{r+1:n}) - \sum_{i=r+1}^{n-r}\dfrac{|x_{i:n}-\theta|}{s}\right\}, & \theta\in[x_{r+1:n}, x_{n-r:n}],\\[6pt]
\left[e^{-(\theta-x_{r+1:n})/s}\left(2 - e^{-(\theta-x_{n-r:n})/s}\right)\right]^{r}\exp\!\left\{-\sum_{i=r+1}^{n-r}(\theta-x_{i:n})/s\right\}, & \theta > x_{n-r:n}. \end{cases} \tag{2.6.40}$$
We now fix $s>0$ and maximize the function $L$ with respect to $\theta$. By (2.6.40), the likelihood function is monotonically increasing in $\theta$ on $(-\infty, x_{r+1:n})$ and monotonically decreasing in $\theta$ on $(x_{n-r:n}, \infty)$, so that the maximum value of $L$ must occur for some $\theta$ in $[x_{r+1:n}, x_{n-r:n}]$; see Exercise 2.7.44. But on the latter interval, the function $L$ is maximized if the sum
$$\sum_{i=r+1}^{n-r}\frac{|x_{i:n}-\theta|}{s}$$
is minimal, so that the MLE of $\theta$ is the sample median of the censored sample (which is the same as that of the original sample). Substituting the sample median $\hat\theta_n$ given by (2.6.15) into the likelihood function (2.6.40) results in the following function of $s$ to be maximized:
$$g(s) = L(\hat\theta_n, s) = \frac{n!}{2^n (r!)^2}\,s^{-(n-2r)}\,e^{-C/s}, \tag{2.6.41}$$
where
$$C = r(x_{n-r:n} - x_{r+1:n}) + \sum_{i=r+1}^{n-r}|x_{i:n} - \hat\theta_n| > 0. \tag{2.6.42}$$
Since the function $g$ is maximized at $s = C/(n-2r)$ [Exercise 2.7.44], we obtain the following MLE of $s$ [see Balakrishnan and Cutler (1994)]:
$$\hat s_n = \frac{1}{n-2r}\left[\sum_{i=[[(n+1)/2]]+1}^{n-r} x_{i:n} - \sum_{i=r+1}^{[[n/2]]} x_{i:n} + r(x_{n-r:n} - x_{r+1:n})\right]. \tag{2.6.43}$$
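The censored-sample estimators are computable directly from the observed middle order statistics; the following Python sketch (ours) implements the sample median of the censored sample and $\hat s_n = C/(n-2r)$ with $C$ from (2.6.42):

```python
import statistics

def censored_laplace_mle(observed, r, n):
    # `observed` holds the order statistics x_{r+1:n} <= ... <= x_{n-r:n}
    # of a Type-II symmetrically censored Laplace sample of size n.
    assert len(observed) == n - 2 * r
    theta_hat = statistics.median(observed)   # same as the full-sample median
    c = r * (observed[-1] - observed[0]) \
        + sum(abs(x - theta_hat) for x in observed)
    return theta_hat, c / (n - 2 * r)

# n = 7 with r = 1 observation censored at each end:
theta_hat, s_hat = censored_laplace_mle([-2.0, -1.0, 0.0, 1.0, 3.0], r=1, n=7)
# theta_hat = 0.0 and C = 1*(3-(-2)) + (2+1+0+1+3) = 12, so s_hat = 12/5 = 2.4.
```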
Remark 2.6.8 Balakrishnan and Cutler (1994) derived the bias and the efficiencies of the above estimators (compared to the BLUE's discussed below); see also Childs and Balakrishnan (1997a) for the derivation of the mean square error of these estimators. Balakrishnan and Cutler (1994) obtained similar explicit estimators of $\theta$ and $s$ under Type-II right censoring, while Childs and Balakrishnan (1997b) extended the results to general Type-II censored samples.
Maximum likelihood estimation of monotone location parameters
Let, for each $i = 1,2,\dots,k$, $f(x;\theta_i)$ be the density (2.1.1) of the classical Laplace $CL(\theta_i, s)$ distribution with the location parameter $\theta_i$ and the scale parameter $s = 1$. Assume that $n_i$ items,
$$X_{i1}, X_{i2}, \dots, X_{in_i}, \tag{2.6.44}$$
are chosen from the distribution with density $f(x;\theta_i)$, and the resulting $k$ samples are independent. Our goal is to find estimates $\hat\theta_1, \hat\theta_2, \dots, \hat\theta_k$ of $\theta_1, \theta_2, \dots, \theta_k$ such that
$$\hat\theta_1 \leq \hat\theta_2 \leq \cdots \leq \hat\theta_k. \tag{2.6.45}$$
Brunk (1955) considered problems of this type when $f(x;\theta)$ is a member of an exponential family of distributions (which includes the normal distribution with either unknown mean or unknown standard deviation, but does not include the Laplace distribution), while Robertson and Waltman (1968) developed a procedure for finding restricted estimates (2.6.45) for a class of distributions containing the classical Laplace law. More information on the early history of such problems is given in Brunk (1965).
A procedure for obtaining restricted maximum likelihood estimates developed by Robertson and Waltman (1968) assumes that the family of functions $\{f(x;\theta),\ \theta\in\Theta\}$, where $\Theta$ is a connected set of real numbers, satisfies the following four conditions:
(A1) $f(x;\theta)$ has support $S$ which is the same for all $\theta\in\Theta$;
(A2) for each $x\in S$ the function $f(x;\theta)$ is continuous in $\theta$;
(A3) if $x_1,\dots,x_n\in S$, then the likelihood function
$$L(\theta; x_1,\dots,x_n) = \prod_{i=1}^{n} f(x_i;\theta) \tag{2.6.46}$$
is unimodal with mode $M$ (not necessarily unique);
(A4) if $x_1,\dots,x_n\in S$ and $y_1,\dots,y_m\in S$, and $M_x$, $M_y$ are the modes of the likelihood functions $L(\theta; x_1,\dots,x_n)$ and $L(\theta; y_1,\dots,y_m)$, respectively, then $M_{xy}$ is between $M_x$ and $M_y$, where $M_{xy}$ is the mode of $L(\theta; x_1,\dots,x_n, y_1,\dots,y_m)$.
The conditions A3 and A4 do not assume that the mode be unique
[similar earlier results by van Eeden (1957) did assume the uniqueness of
the mode], although the condition A4 requires the existence of a certain
rule by which the mode is to be selected.
In the above setting, let $M_i$ be the mode of the likelihood function of the $i$th sample (2.6.44), and for $1\leq R\leq S\leq k$ let $M(R,S)$ denote the mode of the likelihood function
$$\prod_{i=R}^{S}\prod_{j=1}^{n_i} f(x_{ij};\theta) \tag{2.6.47}$$
of the combined observations of the $R$th through $S$th samples. The objective is to find a point $(\hat\theta_1, \hat\theta_2, \dots, \hat\theta_k)$ in the set
$$S_k = \{(\alpha_1,\dots,\alpha_k): \alpha_i\in\Theta,\ \alpha_1\leq\alpha_2\leq\cdots\leq\alpha_k\} \tag{2.6.48}$$
for which the likelihood function
$$L(\alpha_1,\dots,\alpha_k) = \prod_{i=1}^{k}\prod_{j=1}^{n_i} f(x_{ij};\alpha_i) \tag{2.6.49}$$
is maximized. The main result of Robertson and Waltman (1968) asserts that under the conditions A1–A4 there exists a point in $S_k$ maximizing the likelihood function (2.6.49), and it admits the max–min representation
$$\hat\theta_j = \max_{1\leq R\leq j}\,\min_{R\leq S\leq k} M(R,S) = \min_{j\leq S\leq k}\,\max_{1\leq R\leq S} M(R,S). \tag{2.6.50}$$
In addition, if $\theta_1\leq\theta_2\leq\cdots\leq\theta_k$ and if
$$\lim_{m\to\infty}\sum_{i=1}^{k}|M_i - \theta_i| = 0, \tag{2.6.51}$$
then with probability one
$$\lim_{m\to\infty}\sum_{i=1}^{k}|\hat\theta_i - \theta_i| = 0, \tag{2.6.52}$$
where $m = \min(n_1,\dots,n_k)$ [see Robertson and Waltman (1968)].
Evidently, the family of Laplace densities with location $\theta\in\Theta = (-\infty,\infty)$ and a given scale parameter $s$ (for convenience assumed to be one) satisfies the conditions A1–A3 above. Here, the mode of the likelihood function (the MLE of $\theta$) is the sample median. Further, if in the case of an even sample size the median is chosen as in (2.6.15) to be the average of the two middle values, then the condition A4 is satisfied as well (see Exercise 2.7.40). Consequently, we have the following result [see Robertson and Waltman (1968)].
Proposition 2.6.7 Assume that we have $k$ independent random samples, where the $i$th sample, given in (2.6.44), is from the classical Laplace distribution with the location parameter $\theta_i$ and the scale parameter $s = 1$. Then, $(\hat\theta_1, \hat\theta_2, \dots, \hat\theta_k)$, where $\hat\theta_j$ is given by (2.6.50), is the MLE of $(\theta_1, \theta_2, \dots, \theta_k)$ subject to the condition (2.6.45).
Further, as noted by Robertson and Waltman (1968), the sample median of the $i$th sample, $M_i$, converges almost surely to $\theta_i$ by the Glivenko–Cantelli Theorem, so that by (2.6.51) we have the almost sure convergence (2.6.52) of the restricted MLE's.
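The restricted estimates can be computed by brute force from the pooled-block modes; a small Python sketch (ours), using the canonical sample median as the mode $M(R,S)$ and the max–min form of the representation (2.6.50):

```python
import statistics

def restricted_medians(samples):
    # theta_j = max over R <= j of (min over S in [R, k] of M(R, S)),
    # where M(R, S) is the canonical median of the pooled samples R..S.
    k = len(samples)
    def m(r, s):  # 0-based, inclusive block [r, s]
        pooled = [x for i in range(r, s + 1) for x in samples[i]]
        return statistics.median(pooled)
    return [max(min(m(r, s) for s in range(r, k)) for r in range(j + 1))
            for j in range(k)]

# Unrestricted medians 3, 1, 5 violate the ordering; the restricted MLE
# pools the first two samples (pooled median 2.5) and keeps the third.
est = restricted_medians([[3.0, 3.0, 4.0], [1.0, 1.0, 2.0], [5.0, 5.0, 6.0]])
# est == [2.5, 2.5, 5.0]
```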
The method of moments
Let $X_1,\dots,X_n$ be a random sample from the classical Laplace distribution with density (2.1.1). As in the case of MLE's, we shall consider three cases: two when one of the parameters is known, and one when both are unknown.
Case 1: The value of $s$ is known. Since the mean of the $CL(\theta,s)$ random variable is equal to $\theta$, the method of moments estimator (MME) of $\theta$ is the sample mean,
$$\tilde\theta_n = \frac{1}{n}\sum_{i=1}^{n} X_i. \tag{2.6.53}$$
Clearly, the estimator (2.6.53) is unbiased for $\theta$. Further, by the Strong Law of Large Numbers and the Central Limit Theorem, it is consistent and asymptotically normal.
Proposition 2.6.8 Let $X_1,\dots,X_n$ be i.i.d. with the $CL(\theta,s)$ distribution (2.1.1), where $s$ is known and $\theta\in\mathbb{R}$ is unknown. Then, the MME of $\theta$ given by (2.6.53) is
(i) Unbiased;
(ii) Strongly consistent;
(iii) Asymptotically normal, i.e., $\sqrt{n}\,(\tilde\theta_n - \theta)$ converges in distribution to a normal distribution with mean zero and variance $2s^2$.
Note that the asymptotic variance of the MME of $\theta$ is twice as large as that of the MLE of $\theta$, so that for the Laplace distribution the asymptotic relative efficiency (ARE) of the sample median $\hat\theta_n$ relative to the sample mean $\tilde\theta_n$ is
$$ARE(\hat\theta_n) = \frac{2s^2}{s^2} = 2.$$
For any finite sample size $n$, the variance of the MME is
$$Var(\tilde\theta_n) = \frac{Var(X_1)}{n} = \frac{2s^2}{n}, \tag{2.6.54}$$
while the variance of the MLE (the canonical median) is given in Section 2.5 [see also the relations (2.7.24)–(2.7.25), Exercise 2.7.39]. Table 2.7 contains the variances of $\hat\theta_n$ and $\tilde\theta_n$ for sample sizes $n = 1(1)7$. We see that
$$Var(\hat\theta_n) \leq Var(\tilde\theta_n) \tag{2.6.55}$$
when the sample size $n$ is between 3 and 7 (the difference being rather substantial). Chu and Hotelling (1955) established the relation
$$B_k\left(1 - \frac{1}{2k+2}\right)^{3/2} \leq \frac{Var(\hat\theta_{2k+1})}{1/(2k+1)} \leq 1.51\,B_k\left(1 + \frac{1}{2k}\right)^{3/2}, \quad k\geq 1, \tag{2.6.56}$$
where
$$B_k = \frac{(2k+1)!}{(k!)^2}\left(\frac{1}{2}\right)^{2k+1}\sqrt{\frac{2\pi}{2k+1}}, \tag{2.6.57}$$
and concluded that if $n = 2k+1 \geq 7$, then the relation (2.6.55) holds as well (Exercise 2.7.42).
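The finite-sample comparison in Table 2.7 is easy to reproduce by simulation; for $n = 5$ the Monte Carlo variances should be near 0.400 for the sample mean and 0.351 for the sample median. A Python sketch (ours):

```python
import random
import statistics

rng = random.Random(314)
n, reps = 5, 20000
meds, means = [], []
for _ in range(reps):
    # Standard classical Laplace draws as differences of exponentials.
    xs = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(n)]
    meds.append(statistics.median(xs))
    means.append(statistics.mean(xs))
var_med = statistics.pvariance(meds)    # Table 2.7: ~0.351
var_mean = statistics.pvariance(means)  # 2 s^2 / n = 0.400
```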
Case 2: The value of $\theta$ is known. Since the r.v. $X_i - \theta$ has the $CL(0,s)$ distribution, without loss of generality we shall assume that $\theta = 0$. By the moment relation (2.1.14), we have $EX_i^2 = 2s^2$, so that the MME of $s$ is
$$\tilde s_n = \sqrt{\frac{1}{2n}\sum_{i=1}^{n} X_i^2}. \tag{2.6.58}$$
The following result summarizes the asymptotic properties of $\tilde s_n$.
$n$                        1      2      3      4      5      6      7
$Var(\tilde\theta_n)$      2      1      0.667  0.500  0.400  0.333  0.286
$Var(\hat\theta_n)$        2      1      0.639  0.406  0.351  0.261  0.236

Table 2.7: The variances of $\tilde\theta_n$ (the sample mean) and $\hat\theta_n$ (the sample median) for samples of size $n$ from the standard classical Laplace distribution.
Proposition 2.6.9 Let $X_1,\dots,X_n$ be i.i.d. with the $CL(0,s)$ distribution. Then, the MME of $s$ given by (2.6.58) is
(i) Strongly consistent;
(ii) Asymptotically normal, i.e., $\sqrt{n}\,(\tilde s_n - s)$ converges in distribution to a normal distribution with mean zero and variance $1.25\,s^2$.
Proof. To establish (i), note that by the Strong Law of Large Numbers,
$$\frac{1}{n}\sum_{i=1}^{n} X_i^2 \xrightarrow{a.s.} E[X_1^2] = 2s^2. \tag{2.6.59}$$
Thus,
$$\tilde s_n = g\left(\frac{1}{n}\sum_{i=1}^{n} X_i^2\right) \xrightarrow{a.s.} g(2s^2) = s, \tag{2.6.60}$$
where
$$g(x) = \sqrt{x/2}. \tag{2.6.61}$$
Similarly, Part (ii) can be established via the Central Limit Theorem. Since
the $X_i^2$, $i = 1, 2, \ldots$, are i.i.d. with
$$E[X_1^2] = 2s^2 \quad \text{and} \quad Var[X_i^2] = E[X_i^4] - (E[X_i^2])^2 = 20s^4$$
[see the moment formula (2.1.14)], the sequence
$$n^{1/2}\left(\frac{1}{n}\sum_{i=1}^{n} X_i^2 - 2s^2\right) \qquad (2.6.62)$$
converges in distribution to a normal distribution with mean zero and variance $20s^4$. Thus, by standard arguments of the large sample theory [see,
e.g., Rao (1965)], the sequence
$$n^{1/2}\left[g\left(\frac{1}{n}\sum_{i=1}^{n} X_i^2\right) - g(2s^2)\right] = n^{1/2}(\tilde{s}_n - s) \qquad (2.6.63)$$
converges in distribution to a normal distribution with mean zero and variance
$$[g'(2s^2)]^2 (20s^4) = \frac{5}{4}s^2. \qquad (2.6.64)$$
Remark 2.6.9 Note that the asymptotic variance of $\tilde{s}_n$ is larger than
that of the MLE $\hat{s}_n$. The relation between the variances for a finite sample
size $n$ is investigated in Exercise 2.7.43.
Case 3: Both $\theta$ and $s$ are unknown. Let
$$\hat{m}_{1n} = \frac{1}{n}\sum_{i=1}^{n} X_i \quad \text{and} \quad \hat{m}_{2n} = \frac{1}{n}\sum_{i=1}^{n} X_i^2 \qquad (2.6.65)$$
be the first and second sample moments for the random sample $X_1, \ldots, X_n$
from the $CL(\theta, s)$ distribution. Since the first two moments of $X_1$ are
$$E[X_1] = \theta, \qquad E[X_1^2] = \theta^2 + 2s^2 \qquad (2.6.66)$$
[see (2.1.18)], solving equations (2.6.66) for $\theta$ and $s$ in terms of the first two
moments and substituting the sample moments (2.6.65), we arrive at the
following MME's of $\theta$ and $s$:
$$\tilde{\theta}_n = \hat{m}_{1n} = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad \tilde{s}_n = \sqrt{\frac{\hat{m}_{2n} - \hat{m}_{1n}^2}{2}} = \sqrt{\frac{1}{2n}\sum_{i=1}^{n}\left(X_i - \bar{X}_n\right)^2}. \qquad (2.6.67)$$
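Computationally the moment estimators (2.6.67) are immediate; the following sketch (illustrative Python, not code from the text) evaluates both estimates for an arbitrary sample:

```python
import math

def laplace_mme(xs):
    """Method-of-moments estimates (2.6.67) for CL(theta, s):
    theta_tilde is the sample mean; s_tilde is the square root of half
    the (biased) sample variance."""
    n = len(xs)
    m1 = sum(xs) / n
    m2 = sum(x * x for x in xs) / n
    return m1, math.sqrt((m2 - m1 * m1) / 2.0)

theta_t, s_t = laplace_mme([1.0, 2.0, 3.0, 4.0])
print(theta_t, round(s_t, 6))  # 2.5 0.790569 (= sqrt(0.625))
```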
As before, the consistency and asymptotic normality of the estimators
(2.6.67) follow from standard arguments of the large sample theory [see,
e.g., Rao (1965)].
Proposition 2.6.10 Let $X_1, \ldots, X_n$ be i.i.d. from the $CL(\theta, s)$ distribution, where $\theta \in \mathbb{R}$ and $s > 0$. Let
$$\tilde{\xi}_n = \begin{pmatrix} \tilde{\theta}_n \\ \tilde{s}_n \end{pmatrix}, \qquad (2.6.68)$$
where $\tilde{\theta}_n$ and $\tilde{s}_n$ are given by (2.6.67), be the MME of the vector parameter
$$\xi = \begin{pmatrix} \theta \\ s \end{pmatrix}. \qquad (2.6.69)$$
Then, the estimator $\tilde{\xi}_n$ is
(i) strongly consistent;
(ii) asymptotically normal, i.e., the sequence $\sqrt{n}(\tilde{\xi}_n - \xi)$ converges in distribution to a bivariate normal distribution with the (vector) mean zero and
the covariance matrix
$$\Sigma_{MME} = \begin{pmatrix} 2s^2 & 0 \\ 0 & \frac{5}{4}s^2 \end{pmatrix}. \qquad (2.6.70)$$
Proof. Consider an auxiliary sequence of i.i.d. bivariate random vectors
$$Y_i = \begin{pmatrix} X_i \\ X_i^2 \end{pmatrix}, \qquad i = 1, 2, \ldots. \qquad (2.6.71)$$
The vector mean and the covariance matrix of $Y_i$ are as follows:
$$m_Y = \begin{pmatrix} \theta \\ \theta^2 + 2s^2 \end{pmatrix}, \qquad \Sigma_Y = \begin{pmatrix} 2s^2 & 4\theta s^2 \\ 4\theta s^2 & 8\theta^2 s^2 + 20s^4 \end{pmatrix}. \qquad (2.6.72)$$
[We have used the moment formulas (2.1.18).] Clearly, the Strong Law of
Large Numbers (SLLN) and the Central Limit Theorem (CLT) apply to the
sequence $(Y_i)$, so that
$$\frac{1}{n}\sum_{i=1}^{n} Y_i \xrightarrow{a.s.} m_Y \qquad (2.6.73)$$
and
$$\sqrt{n}\left(\frac{1}{n}\sum_{i=1}^{n} Y_i - m_Y\right) \xrightarrow{d} N_2(0, \Sigma_Y). \qquad (2.6.74)$$
[The notation $N_d(m, \Sigma)$ denotes the $d$-dimensional normal distribution
with mean vector $m$ and covariance matrix $\Sigma$.] Observe that the estimator (2.6.68) can be expressed in terms of the $Y_i$'s as
$$\tilde{\xi}_n = g\left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right), \qquad (2.6.75)$$
where
$$g(x_1, x_2) = \left(x_1,\ \sqrt{\frac{x_2 - x_1^2}{2}}\right). \qquad (2.6.76)$$
To prove the strong consistency, use (2.6.73) together with the continuity
of $g$ defined above to conclude that
$$\lim_{n\to\infty} g\left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right) = \lim_{n\to\infty} \tilde{\xi}_n \overset{a.s.}{=} g(m_Y) = \xi. \qquad (2.6.77)$$
Similarly, we establish the asymptotic normality of $\tilde{\xi}_n$ by standard results
from the large sample theory [see, e.g., Rao (1965)]. Since the function $g$
has a non-singular matrix of partial derivatives at the point $m_Y$,
$$D = \left[\left.\frac{\partial g_i}{\partial x_j}\right|_{x = m_Y}\right] = \frac{1}{s}\begin{pmatrix} s & 0 \\ -\theta/2 & 1/4 \end{pmatrix}, \qquad (2.6.78)$$
the convergence (2.6.74) produces
$$\sqrt{n}\left[g\left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right) - g(m_Y)\right] \xrightarrow{d} N_2(0, D\Sigma_Y D'), \qquad (2.6.79)$$
or
$$\sqrt{n}\left[\tilde{\xi}_n - \xi\right] \xrightarrow{d} N_2(0, \Sigma_{MME}), \qquad (2.6.80)$$
since
$$g\left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right) = \tilde{\xi}_n, \qquad g(m_Y) = \xi, \quad \text{and} \quad D\Sigma_Y D' = \Sigma_{MME}.$$
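Proposition 2.6.10 is easy to illustrate by simulation; the sketch below (illustrative Python with an arbitrary seed, not from the text) checks the two diagonal entries of $\Sigma_{MME}$ for $\theta = 0$ and $s = 1$:

```python
import math, random

random.seed(7)

def laplace_sample(n):
    """CL(0, 1) variates: a difference of two standard exponentials."""
    return [random.expovariate(1.0) - random.expovariate(1.0) for _ in range(n)]

def mme(xs):
    """MME's (2.6.67) of (theta, s)."""
    n = len(xs)
    m1 = sum(xs) / n
    m2 = sum(x * x for x in xs) / n
    return m1, math.sqrt((m2 - m1 * m1) / 2.0)

n, reps = 400, 4000
est = [mme(laplace_sample(n)) for _ in range(reps)]
v_theta = sum(n * t * t for t, _ in est) / reps        # Var of sqrt(n)*theta_tilde
v_s = sum(n * (s - 1.0) ** 2 for _, s in est) / reps   # Var of sqrt(n)*(s_tilde - 1)
print(round(v_theta, 2), round(v_s, 2))  # should be close to 2 and 5/4
```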
Remark 2.6.10 For $0 < p < 1$, the function
$$f(x) = p\,\frac{1}{2s_1}e^{-|x-\theta_1|/s_1} + (1-p)\,\frac{1}{2s_2}e^{-|x-\theta_2|/s_2}, \qquad -\infty < x < \infty, \qquad (2.6.81)$$
is the density of the mixture of the two Laplace distributions $CL(\theta_1, s_1)$ and
$CL(\theta_2, s_2)$. Such distributions may no longer be unimodal [see Exercise
2.7.46]. The method of moments estimation of the parameters of (2.6.81)
is considered in Kacki (1965b), Krysicki (1966ab), and Kacki and Krysicki
(1967).
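The possible lack of unimodality is easy to exhibit numerically; in the sketch below (illustrative Python, not from the text) the two well-separated component centers each carry more density than the midpoint between them:

```python
import math

def laplace_mixture_pdf(x, p, t1, s1, t2, s2):
    """Two-component Laplace mixture density (2.6.81)."""
    return (p / (2 * s1) * math.exp(-abs(x - t1) / s1)
            + (1 - p) / (2 * s2) * math.exp(-abs(x - t2) / s2))

# Equal-weight mixture of CL(-3, 1) and CL(3, 1): bimodal, since the
# density at each center exceeds the density at the midpoint x = 0.
f = lambda x: laplace_mixture_pdf(x, 0.5, -3.0, 1.0, 3.0, 1.0)
print(f(-3.0) > f(0.0) and f(3.0) > f(0.0))  # True -> not unimodal
```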
Linear estimation

In this section we consider the so-called L-estimators of the parameters $\theta$
and $s$ of the classical Laplace distribution, which are linear combinations
of order statistics.
Best linear unbiased estimation. Let $X_1, \ldots, X_n$ be a random sample from
the $CL(\theta, s)$ distribution, and let
$$X_{k+1:n} \leq \cdots \leq X_{n-m:n} \qquad (2.6.82)$$
be the corresponding Type-II censored sample. For $i = 1, \ldots, n$, let
$$\mu_i = E\left[\frac{X_{i:n}-\theta}{s}\right], \quad \sigma_{ii} = Var\left[\frac{X_{i:n}-\theta}{s}\right], \quad \sigma_{ij} = Cov\left[\frac{X_{i:n}-\theta}{s},\ \frac{X_{j:n}-\theta}{s}\right] \qquad (2.6.83)$$
be the means, variances, and covariances of the order statistics from the
standard classical Laplace distribution, with values given in (2.5.31), (2.5.32),
and (2.5.33), respectively. Then, the best linear unbiased estimators (BLUE's, i.e., unbiased estimators of minimum variance in the class of linear unbiased
estimators) of $\theta$ and $s$ based on (2.6.82) are [see, e.g., Sarhan (1954, 1955),
Govindarajulu (1966), David (1981), Balakrishnan and Cohen (1991)]
$$\theta^*_n = \frac{m'\Sigma^{-1}m\,\mathbf{1}'\Sigma^{-1} - m'\Sigma^{-1}\mathbf{1}\,m'\Sigma^{-1}}{(m'\Sigma^{-1}m)(\mathbf{1}'\Sigma^{-1}\mathbf{1}) - (m'\Sigma^{-1}\mathbf{1})^2}\,X = \sum_{i=k+1}^{n-m} a_i X_{i:n} \qquad (2.6.84)$$
and
$$s^*_n = \frac{\mathbf{1}'\Sigma^{-1}\mathbf{1}\,m'\Sigma^{-1} - \mathbf{1}'\Sigma^{-1}m\,\mathbf{1}'\Sigma^{-1}}{(m'\Sigma^{-1}m)(\mathbf{1}'\Sigma^{-1}\mathbf{1}) - (m'\Sigma^{-1}\mathbf{1})^2}\,X = \sum_{i=k+1}^{n-m} b_i X_{i:n}, \qquad (2.6.85)$$
where
$$X = (X_{k+1:n}, \ldots, X_{n-m:n})', \qquad m = (\mu_{k+1}, \ldots, \mu_{n-m})', \qquad \mathbf{1} = (1, \ldots, 1)' \qquad (2.6.86)$$
are $(n-k-m)$-dimensional vectors and
$$\Sigma = [\sigma_{ij}]_{i,j = k+1, \ldots, n-m} \qquad (2.6.87)$$
is an $(n-k-m) \times (n-k-m)$ covariance matrix. The variances and covariances
of the estimators (2.6.84) and (2.6.85) are
$$Var(\theta^*_n) = s^2\,\frac{m'\Sigma^{-1}m}{(m'\Sigma^{-1}m)(\mathbf{1}'\Sigma^{-1}\mathbf{1}) - (m'\Sigma^{-1}\mathbf{1})^2}, \qquad (2.6.88)$$
$$Var(s^*_n) = s^2\,\frac{\mathbf{1}'\Sigma^{-1}\mathbf{1}}{(m'\Sigma^{-1}m)(\mathbf{1}'\Sigma^{-1}\mathbf{1}) - (m'\Sigma^{-1}\mathbf{1})^2}, \qquad (2.6.89)$$
$$Cov(\theta^*_n, s^*_n) = -s^2\,\frac{m'\Sigma^{-1}\mathbf{1}}{(m'\Sigma^{-1}m)(\mathbf{1}'\Sigma^{-1}\mathbf{1}) - (m'\Sigma^{-1}\mathbf{1})^2}. \qquad (2.6.90)$$
Note that under symmetric censoring ($k = m$) the covariance (2.6.90) is
equal to 0 (since in this case $m'\Sigma^{-1}\mathbf{1} = 0$); the coefficients of $X_{i:n}$ and
$X_{n-i+1:n}$ in $\theta^*_n$ in (2.6.84) are equal, and those in $s^*_n$ in (2.6.85) are equal in
absolute value and opposite in sign.
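Formulas (2.6.84)-(2.6.90) are generalized least squares algebra and are mechanical to evaluate once the moment vector $m$ and covariance matrix $\Sigma$ of the standardized order statistics are available from (2.5.31)-(2.5.33). The sketch below (illustrative Python; the demonstration inputs are synthetic toy values, not Laplace order-statistic moments) implements the two estimators. Since they are linear and unbiased, data lying exactly on $X_{i:n} = \theta + s\mu_i$ are recovered exactly, which serves as a self-check:

```python
def mat_inv(A):
    """Gauss-Jordan inverse of a small square matrix (list of rows)."""
    n = len(A)
    M = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

def quad(u, B, v):
    """The bilinear form u' B v for plain-list vectors and matrix."""
    return sum(u[i] * B[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))

def blue(x, m, Sigma):
    """BLUE's of theta and s, formulas (2.6.84) and (2.6.85)."""
    Si = mat_inv(Sigma)
    one = [1.0] * len(x)
    mSm, oSo, mSo = quad(m, Si, m), quad(one, Si, one), quad(m, Si, one)
    den = mSm * oSo - mSo ** 2
    theta = (mSm * quad(one, Si, x) - mSo * quad(m, Si, x)) / den
    s = (oSo * quad(m, Si, x) - mSo * quad(one, Si, x)) / den
    return theta, s

# Synthetic self-check: data exactly on X_i = theta + s*mu_i are recovered
# exactly, for any positive definite Sigma (here a toy tridiagonal one).
mu = [-1.0, 0.0, 1.0]
Sigma = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]
x = [2.0 + 3.0 * u for u in mu]
print(blue(x, mu, Sigma))  # (2.0, 3.0) up to rounding
```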
The coefficients $a_i$ and $b_i$ in (2.6.84) and (2.6.85) were tabulated by
Sarhan (1954, 1955) for sample sizes up to 5, and by Govindarajulu (1966)
for sample sizes up to 20 (and all choices of symmetric censoring). Balakrishnan, Chandramouleeswaran, and Ambagaspitiya (1994) give tables
of $a_i$ and $b_i$ for the case of Type-II right censored samples of sizes up to
20 [with $k = 0$ and $m = 0(1)(n-2)$]. In Table 2.8 below one can find the
coefficients $a_i$ and $b_i$ of $\theta^*_n$ and $s^*_n$ based on complete samples for sample
sizes $n = 2(1)10$ [calculated by Govindarajulu (1966)].
 n          X_{n:n}   X_{n-1:n}  X_{n-2:n}  X_{n-3:n}  X_{n-4:n}  Variances
 2  θ*_n    0.5000                                                1.000
    s*_n    0.6667                                                0.7778
 3  θ*_n    0.1481    0.7037                                      0.5895
    s*_n    0.4444    0.0000                                      0.4321
 4  θ*_n    0.0473    0.4527                                      0.4155
    s*_n    0.3077    0.2145                                      0.2986
 5  θ*_n    0.0166    0.2213    0.5241                            0.3169
    s*_n    0.2331    0.2264    0.0000                            0.2290
 6  θ*_n    0.0063    0.1006    0.3931                            0.2548
    s*_n    0.1876    0.1943    0.1132                            0.1858
 7  θ*_n    0.0025    0.0455    0.2386    0.4267                  0.2122
    s*_n    0.1572    0.1631    0.1439    0.0000                  0.1565
 8  θ*_n    0.0010    0.0208    0.1316    0.3465                  0.1814
    s*_n    0.1355    0.1391    0.1391    0.0718                  0.1351
 9  θ*_n    0.0004    0.0097    0.0698    0.2374    0.3654        0.1581
    s*_n    0.1191    0.1211    0.1251    0.1013    0.0000        0.1190
10  θ*_n    0.0002    0.0046    0.0364    0.1478    0.3110        0.1399
    s*_n    0.1063    0.1074    0.1110    0.1061    0.0504        0.1062

Table 2.8: Coefficients of the BLUE's of the parameters θ and s of the classical Laplace distribution. The last column gives the values of $Var(\theta^*_n)/s^2$ and $Var(s^*_n)/s^2$.
By definition, the variance of the BLUE of θ is smaller than that of the
MLE (the sample median) and the MME (the mean), as these are also linear
combinations of order statistics and unbiased for θ. Sarhan (1954) compared
the efficiencies¹¹ of the latter two estimators, as well as of the midrange
$(X_{1:n} + X_{n:n})/2$ (which is also unbiased), relative to the BLUE of θ. The
efficiencies are presented in Table 2.9, and also graphically in Figure 2.5
[taken from Sarhan (1954)]. As noted by Sarhan (1954), the MLE (the
median) is more efficient than the MME (the mean) and the midrange
(and less efficient than the BLUE).

¹¹ The efficiency of an estimator $\hat{\theta}_1$ relative to another estimator $\hat{\theta}_2$ is the ratio $Var(\hat{\theta}_2)/Var(\hat{\theta}_1)$ expressed as a percentage.

Figure 2.5: Percentage efficiencies of the three estimators of the location parameter θ (the sample mean, the midrange, and the median) relative to the BLUE of θ, in different populations. [Republished with permission of Institute of Mathematical Statistics, from Sarhan, A.E., Annals of Mathematical Statistics, 25, Copyright 1954.]
Sample size n      2        3        4        5
Mean             100.00    88.43    82.80    79.21
Midrange         100.00    67.90    49.65    38.29
Median           100.00    92.27    98.90    90.23

Table 2.9: Efficiencies of various estimators of the location parameter θ of the classical Laplace distribution, relative to the BLUE of θ.
Remark 2.6.11 Chan and Chan (1969) derived the BLUE's of θ and s
based on k selected order statistics (k-optimum BLUE's) connected with
a random sample of size n from the classical Laplace distribution $CL(\theta, s)$.
In Chan and Chan (1969), the authors provided tables containing the optimum ranks, the coefficients, biases, variances, and efficiencies (relative to
the corresponding BLUE's based on all order statistics for complete samples) of the k-optimum BLUE's for $k = 1, 2, 3, 4$ and $n = k(1)20$.
Remark 2.6.12 Rao et al. (1991) derived an optimum linear (in absolute
values of order statistics) unbiased estimator of the scale parameter s in
complete and censored samples. The estimator reduces to the sample mean
absolute deviation (the MLE of s when θ is known) for complete samples
and is generally more efficient than the BLUE of s.
Remark 2.6.13 Ahsanullah and Rahim (1973) noted some practical situations where a number of observations somewhere in the middle of an
ordered sample may be missing [see, e.g., Sarhan and Greenberg (1967)].
For a given sample size $n$, $1 \leq R_1 < R_2 \leq n$, and $k = k_1 + k_2$, where
$k_1 < R_1$ and $k_2 < n - (R_2 - 1)$, Ahsanullah and Rahim (1973) determined
the optimum ranks
$$1 \leq n^0_1 < n^0_2 < \cdots < n^0_{k_1} \leq R_1 \quad \text{and} \quad R_2 \leq n^0_{k_1+1} < \cdots < n^0_{k_1+k_2} \leq n$$
and derived the BLUE's of $\theta$ and $s$ based on the order statistics
$$X_{n^0_1:n},\ X_{n^0_2:n},\ \ldots,\ X_{n^0_{k_1}:n},\ X_{n^0_{k_1+1}:n},\ \ldots,\ X_{n^0_{k_1+k_2}:n},$$
observing that the efficiency of their estimates (relative to the BLUE's
based on a complete sample) was quite high.
Remark 2.6.14 Let $X_{1:n}, \ldots, X_{n:n}$ be the order statistics corresponding
to a random sample of size $n = 2k+1$ from the classical Laplace distribution
with an unknown $\theta$ and the scale parameter $s = 1$. Akahira (1986) showed
that the variance of the linear estimator
$$\hat{\theta}_{AK} = \frac{1}{2}\left(X_{k+1-r\sqrt{k}:n} + X_{k+1+r\sqrt{k}:n}\right) \qquad (2.6.91)$$
with the optimal choice of $r = 0.48$ is asymptotically smaller than that of
the MLE of $\theta$ (the sample median $\hat{\theta}_n$):
$$Var(\hat{\theta}_n) = \frac{1}{n}\left(1 + \frac{1.13}{\sqrt{k}} + O\left(\frac{1}{n}\right)\right) \qquad (2.6.92)$$
while
$$Var(\hat{\theta}_{AK}) = \frac{1}{n}\left(1 + \frac{0.90}{\sqrt{k}} + O\left(\frac{1}{n}\right)\right). \qquad (2.6.93)$$
Generalizing, Sugiura and Naing (1989) showed that an appropriate linear
estimator of $\theta$ of the form
$$\hat{\theta}_{SN,m} = \sum_{i=1}^{m} a_i\left[X_{k+1-r_i\sqrt{k}:n} + X_{k+1+r_i\sqrt{k}:n}\right] + bX_{k+1:n}, \qquad (2.6.94)$$
where $0 < r_m < \cdots < r_2 < r_1$ (and with $r_i\sqrt{k}$ assumed to be an integer),
has smaller asymptotic variance than the estimator $\hat{\theta}_{AK}$ defined in (2.6.91),
as the constant 0.90 in (2.6.93) is reduced to $\sqrt{2/\pi} \approx 0.80$ [see also Akahira
(1987, 1990) and Akahira and Takeuchi (1993)]. Sugiura and Naing (1989)
observed that the variance of their estimator admits the same asymptotic
expansion [given by (2.6.93) with 0.90 replaced by $\sqrt{2/\pi}$] as the Bayes risk
with respect to a prior having finite interval support (and satisfying some
technical conditions) derived by Joshi (1984).
Remark 2.6.15 Let
$$X_{1:n} \leq \cdots \leq X_{n-s:n} \qquad (2.6.95)$$
be a Type-II right-censored sample associated with a random sample of size
$n$ from the $CL(\theta, s)$ distribution. Balakrishnan and Chandramouleeswaran
(1994b) utilized the pivotal variables
$$Q_1 = \frac{X_{n-s+1:n} - X_{n-s:n}}{s^*_n} \quad \text{and} \quad Q_2 = \frac{X_{n:n} - X_{n-s:n}}{s^*_n} \qquad (2.6.96)$$
in prediction of $X_{n-s+1:n}$ and $X_{n:n}$ (the percentage points of $Q_1$ and $Q_2$
were determined by Monte Carlo simulations). The quantity $s^*_n$ in (2.6.96)
denotes the BLUE of the scale parameter $s$ based on the censored sample
(2.6.95). In addition, these authors derived prediction intervals for the extreme
order statistics $Y_{1:m}$ and $Y_{m:m}$ connected with a future sample of size $m$
from the Laplace distribution. The prediction intervals utilize the (simulated) percentage points of the pivotal quantities
$$Q_3 = \frac{Y_{1:m} - \theta^*_n}{s^*_n} \quad \text{and} \quad Q_4 = \frac{Y_{m:m} - \theta^*_n}{s^*_n}, \qquad (2.6.97)$$
where $\theta^*_n$ and $s^*_n$ are the BLUE's of $\theta$ and $s$, respectively, based on the
censored sample (2.6.95). Ling (1977) and Ling and Lim (1978) approached
these prediction problems from the Bayesian perspective.
Simplified linear estimation. Let
$$W_i = X_{n-i+1:n} - X_{i:n} \qquad (2.6.98)$$
and
$$V_i = \frac{1}{2}\left(X_{n-i+1:n} + X_{i:n}\right) \qquad (2.6.99)$$
be the $i$th quasi-range and the $i$th quasi-midrange, respectively, connected
with the random sample $X_1, \ldots, X_n$ from the classical Laplace distribution $CL(\theta, s)$. Raghunandanan and Srinivasan (1971) considered simplified
linear estimators of $\theta$ and $s$ based on $V_i$ and linear combinations of the $W_i$'s,
for complete as well as symmetrically censored samples. Similar estimators for the parameters of a normal distribution were obtained in Dixon
(1957, 1960).
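The quasi-ranges (2.6.98) and quasi-midranges (2.6.99) are simple to compute from a sorted sample; a minimal sketch (illustrative Python, not code from the text):

```python
def quasi_stats(xs):
    """Quasi-ranges W_i (2.6.98) and quasi-midranges V_i (2.6.99)
    for i = 1, ..., floor(n/2), computed from an (unsorted) sample."""
    o = sorted(xs)
    n = len(o)
    W = [o[n - i] - o[i - 1] for i in range(1, n // 2 + 1)]
    V = [(o[n - i] + o[i - 1]) / 2.0 for i in range(1, n // 2 + 1)]
    return W, V

W, V = quasi_stats([8.0, 1.0, 4.0, 2.0])
print(W, V)  # [7.0, 2.0] [4.5, 3.0]
```

Here $W_1 = X_{4:4} - X_{1:4} = 7$ is the ordinary sample range and $V_1$ the midrange.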
When the $k$ largest and $k$ smallest observations are censored, where $k \geq 0$,
the simplified estimator of $\theta$ is that $V_i$ (with $i \geq k+1$) which has the
smallest variance. Under the same censoring, the simplified estimator of $s$,
denoted by $\hat{s}_{k,n}$, is the estimator with minimum variance among estimators
of the form
$$C\sum_{i=k+1}^{[[n/2]]} c_i W_i, \qquad (2.6.100)$$
where the $W_i$'s are given by (2.6.98), the $c_i$'s take the values 0 or 1, and
$C$ is a normalizing constant that makes the estimator (2.6.100) unbiased.
Table 2.10 contains the values of the index $i$ corresponding to the simplified
estimator of $\theta$, $V_i$, based on complete samples with $n = 3(1)20$. The
efficiency of this estimator relative to the BLUE of $\theta$ is also included in
Table 2.10 (note that when $n = 3$ and 5 the estimator coincides with the
MLE of $\theta$, the sample median). Table 2.11 contains the values of the
 n    i   Var(V_i)/s²   Eff(V_i)
 3    2   0.638890       92.3
 4    2   0.420135       98.9
 5    3   0.351180       90.2
 6    3   0.260905       97.7
 7    3   0.225805       94.0
 8    4   0.187310       96.8
 9    4   0.164795       95.9
10    5   0.145225       96.3
11    5   0.129605       96.7
12    6   0.118125       96.0
13    6   0.106670       97.0
14    7   0.099285       95.9
15    7   0.090540       97.2
16    7   0.085190       96.0
17    8   0.078575       97.2
18    8   0.074175       96.7
19    9   0.069350       97.3
20    9   0.065670       97.0

Table 2.10: Simplified linear estimator of θ, V_i, its variance, and its relative (percent) efficiency with respect to the BLUE of θ, based on a complete random sample of size n from the classical Laplace distribution CL(θ, s).
simplified estimator $\hat{s}_{k,n}$ of the form (2.6.100), along with its efficiency
relative to the BLUE $s^*_n$ of $s$, defined as
$$Eff(\hat{s}_{k,n}) = Var(s^*_n)/Var(\hat{s}_{k,n}) \times 100\%.$$
More extensive tables can be found in Raghunandanan and Srinivasan
(1971).
 n   k   ŝ_{k,n}                      Var(ŝ_{k,n})/s²   Eff(ŝ_{k,n})
 4   0   0.289157(W₁+W₂)              0.300624           99.3
 5   0   0.231325(W₁+W₂)              0.229000          100.0
 6   0   0.183486(W₁+W₂+W₃)           0.186515           99.6
 6   1   0.666667 W₂                  0.304009           98.5
 7   0   0.157274(W₁+W₂+W₃)           0.156500          100.0
 7   1   0.390721(W₂+W₃)              0.234731           97.5
 8   0   0.134254(W₁+W₂+W₃+W₄)        0.135438           99.7
 8   1   0.324571(W₂+W₃)              0.188570           98.4
 8   2   0.967133 W₃                  0.303726           99.4
 9   0   0.119337(W₁+W₂+W₃+W₄)        0.119000          100.0
 9   1   0.282882(W₂+W₃)              0.158812           98.3
 9   2   0.790855 W₃                  0.233068           98.5
10   0   0.108696(W₁+W₂+W₃+W₄)        0.106392           99.8
10   1   0.238741(W₂+W₃+W₅)           0.137784           98.0
10   2   0.681084 W₃                  0.190810           97.3
10   3   1.267536 W₄                  0.305295           99.7

Table 2.11: Simplified linear estimator of s, ŝ_{k,n}, its variance, and its relative (percent) efficiency with respect to the BLUE of s, based on a random sample of size n from the classical Laplace distribution CL(θ, s), where k observations are censored from each end.
Remark 2.6.16 Iliescu and Vodă (1973) considered asymptotically unbiased estimators of $s$ of the form
$$\alpha(n)\sum_{i=1}^{[[n/2]]} W_i, \qquad (2.6.101)$$
which have the same structure as the simplified estimator (2.6.100) of the
scale parameter.
Asymptotic best linear unbiased estimation. Cheng (1978) remarked that
for a large sample size $n$, the BLUE's of $\theta$ and $s$ are too tedious to calculate. Consequently, using the theory of asymptotically best linear unbiased
estimates (ABLUE) developed by Ogawa (1951), he derived a method for
an optimal selection of the order statistics from complete as well as singly
or doubly censored large samples to estimate the parameters of the Laplace
distribution. The method utilizes the sample quantiles
$$X_{[[n\lambda_1]]+1:n} < \cdots < X_{[[n\lambda_k]]+1:n}, \qquad (2.6.102)$$
where the real numbers
$$0 = \lambda_0 < \lambda_1 < \cdots < \lambda_k < \lambda_{k+1} = 1 \qquad (2.6.103)$$
are called the spacings and the $u_i$'s defined by
$$\lambda_i = \int_{-\infty}^{u_i} f(x)\,dx = F(u_i) \qquad (2.6.104)$$
are the population quantiles of the standard classical Laplace distribution
with density $f$ and distribution function $F$. Under the above setting, the
ABLUE of $\theta$ (when $s$ is known) is
$$\theta^{**}_n = \sum_{i=1}^{k} a_i X_{[[n\lambda_i]]+1:n} - \frac{K_3}{K_1}\,s, \qquad (2.6.105)$$
the ABLUE of $s$ (when $\theta$ is known) is
$$s^{**}_n = \sum_{i=1}^{k} b_i X_{[[n\lambda_i]]+1:n} - \frac{K_3}{K_2}\,\theta, \qquad (2.6.106)$$
and their asymptotic variances are
$$Var_{ASY}(\theta^{**}_n) = \frac{s^2}{nK_1}, \qquad Var_{ASY}(s^{**}_n) = \frac{s^2}{nK_2}, \qquad (2.6.107)$$
where
$$K_1 = \sum_{i=1}^{k+1}\frac{(f_i - f_{i-1})^2}{\lambda_i - \lambda_{i-1}}, \qquad K_2 = \sum_{i=1}^{k+1}\frac{(f_i u_i - f_{i-1}u_{i-1})^2}{\lambda_i - \lambda_{i-1}}, \qquad K_3 = \sum_{i=1}^{k+1}\frac{(f_i - f_{i-1})(f_i u_i - f_{i-1}u_{i-1})}{\lambda_i - \lambda_{i-1}}, \qquad (2.6.108)$$
and
$$a_i = \frac{f_i}{K_1}\left[\frac{f_i - f_{i-1}}{\lambda_i - \lambda_{i-1}} - \frac{f_{i+1} - f_i}{\lambda_{i+1} - \lambda_i}\right], \qquad b_i = \frac{f_i}{K_2}\left[\frac{f_i u_i - f_{i-1}u_{i-1}}{\lambda_i - \lambda_{i-1}} - \frac{f_{i+1}u_{i+1} - f_i u_i}{\lambda_{i+1} - \lambda_i}\right],$$
$$f_i = f(u_i), \quad i = 1, 2, \ldots, k, \qquad f_0 = f_{k+1} = f_0 u_0 = f_{k+1}u_{k+1} = 0. \qquad (2.6.109)$$
The asymptotic efficiencies (ARE) of $\theta^{**}_n$ and $s^{**}_n$ relative to the Cramér-Rao lower bound are
$$ARE(\theta^{**}_n) = K_1, \qquad ARE(s^{**}_n) = K_2. \qquad (2.6.110)$$
The estimates based on the optimal spacings (2.6.103) are those that maximize the ARE's (2.6.110) and are referred to as the $\{\lambda_i\}$-ABLUE [see Chan
(1970)].

As shown in Cheng (1978), the coefficients $a_i$ in (2.6.105) for the $\{\lambda_i\}$-ABLUE of $\theta$ are all zero except for a single coefficient of 1 corresponding to
the single-point spacing $\{1/2\}$.
Proposition 2.6.11 Let $X_1, \ldots, X_n$ be a random sample of size $n$ from
the classical Laplace distribution $CL(\theta, s)$ with known value of $s$. The optimum spacing for the $\{\lambda_i\}$-ABLUE of $\theta$, $\theta^{**}_n$, is a single-point spacing $\{1/2\}$,
which is independent of the number of order statistics $k$. The ARE of $\theta^{**}_n$
is 1.

Thus, in large samples, we can uniquely estimate the location parameter
$\theta$ of the $CL(\theta, s)$ distribution (with known value of $s$) by $\theta^{**}_n$, either from
a full sample or a censored one, as long as the middle observation is not
missing.
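The efficiency $K_1$ in (2.6.108) is a short computation for any choice of spacings, which gives a quick numerical check of Proposition 2.6.11. The sketch below (illustrative Python, not code from the text) evaluates $K_1$ for the standard classical Laplace distribution:

```python
import math

def f(u):
    """Standard classical Laplace density."""
    return 0.5 * math.exp(-abs(u))

def quantile(lam):
    """Quantile function of the standard classical Laplace distribution."""
    return math.log(2 * lam) if lam <= 0.5 else -math.log(2 * (1 - lam))

def K1(lams):
    """K_1 of (2.6.108), the ARE of the {lambda_i}-ABLUE of theta,
    for spacings 0 < lams[0] < ... < lams[-1] < 1."""
    lam = [0.0] + list(lams) + [1.0]
    fv = [0.0] + [f(quantile(l)) for l in lams] + [0.0]  # f_0 = f_{k+1} = 0
    return sum((fv[i] - fv[i - 1]) ** 2 / (lam[i] - lam[i - 1])
               for i in range(1, len(lam)))

print(K1([0.5]))         # single-point spacing {1/2}: ARE = 1.0
print(K1([0.25, 0.5, 0.75]), K1([0.3]))
```

Numerically, any spacing set containing the point $1/2$ attains the full efficiency $K_1 = 1$, while spacings omitting it (such as $\{0.3\}$) fall short, in line with the proposition.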
The estimation of the parameter $s$ is more complicated. Here, maximizing
$K_2$ (the ARE of $s^{**}_n$) with respect to the spacings (2.6.103) leads to a system
of equations [see Cheng (1978)]:
$$\left[\frac{f_{i+1}u_{i+1} - f_i u_i}{\lambda_{i+1} - \lambda_i} + \frac{f_i u_i - f_{i-1}u_{i-1}}{\lambda_i - \lambda_{i-1}}\right]f(u_i) - 2\,\frac{d(f_i u_i)}{du_i} = 0.$$
Cheng (1978) noted that in this case the optimal spacings may not be
unique, and they may be symmetric about the point 1/2 only when the
number $k$ is even. We refer the reader to Cheng (1978) for further information and an extensive set of tables containing the optimal spacings
$\{\lambda_i\}$ and the corresponding coefficients $b_i$ for the $\{\lambda_i\}$-ABLUE of $s$ given
by (2.6.106), as well as the asymptotic efficiencies of $s^{**}_n$ relative to the
Cramér-Rao lower bound.
Ali et al. (1981) derived estimators for the $\xi$-quantiles, $x_\xi$, of the classical
Laplace $CL(\theta, s)$ distribution. Their estimators are
$$\tilde{x}_\xi = a_l X_{l:n} + a_m X_{m:n}, \qquad 1 \leq l \leq m \leq n,$$
where the ranks $l, m$ and the coefficients $a_l, a_m$ are chosen so that $\tilde{x}_\xi$ is
the asymptotically best (minimum variance) linear unbiased estimator (ABLUE)
of $x_\xi$. The procedure does not involve estimation of the location and scale
parameters and does not require the use of tables, since the estimator admits the following explicit form:
$$\tilde{x}_\xi = \begin{cases}
0.255\,X_{[[0.30506\xi n]]+1:n} + 0.745\,X_{[[1.50134\xi n]]+1:n}, & 0.0352 \leq \xi \leq 0.3330,\\[4pt]
-\dfrac{z_\xi}{1.59362}\,X_{[[0.10159n]]+1:n} + \left(1 + \dfrac{z_\xi}{1.59362}\right)X_{[[n/2]]+1:n}, & \xi < 0.0352 \ \text{and} \ 0.3330 < \xi < 0.5,\\[4pt]
X_{[[n/2]]+1:n}, & \xi = 0.5,\\[4pt]
\left(1 - \dfrac{z_\xi}{1.59362}\right)X_{[[n/2]]+1:n} + \dfrac{z_\xi}{1.59362}\,X_{[[0.89841n]]+1:n}, & 0.5 < \xi < 0.6670 \ \text{and} \ \xi > 0.9648,\\[4pt]
0.745\,X_{[[(1.50134\xi-0.50134)n]]+1:n} + 0.255\,X_{[[(0.30506\xi+0.69494)n]]+1:n}, & 0.6670 \leq \xi \leq 0.9648,
\end{cases} \qquad (2.6.111)$$
where $z_\xi$ is the $\xi$-quantile of the standard classical Laplace distribution.
They compared the asymptotic variance of their estimator with that of the
standard quantile estimator $X_{[[n\xi]]+1:n}$, concluding that $\tilde{x}_\xi$ performs much
better. Table 2.12 contains the asymptotic relative efficiencies (ARE) of
$\tilde{x}_\xi$ relative to $X_{[[n\xi]]+1:n}$, computed by Ali et al. (1981). See Saleh et al.
(1983) for further discussion on quantile estimation for the double exponential
distribution, and Umbach et al. (1984) for applications of ABLUE's based
on optimal spacings in testing hypotheses.
ξ       0.1    0.2    0.333    0.4    0.5
ARE     122    128    191      147    100

Table 2.12: Asymptotic relative (percent) efficiencies (ARE) of $\tilde{x}_\xi$ relative to $X_{[[n\xi]]+1:n}$ for the Laplace distribution.
2.6.2 Interval estimation
We shall now discuss confidence intervals for parameters of the classical
Laplace distribution. Let $X_1, \ldots, X_n$ be a random sample from the $CL(\theta, s)$
distribution. If the scale parameter $s$ is known, then a confidence interval for
$\theta$ may be constructed utilizing the distribution of the sample median given
in (2.5.10) and Proposition 2.5.5. If the location parameter $\theta$ is known, then
since the r.v.'s $|X_i - \theta|/s$ are i.i.d. standard exponential (see Proposition
2.2.3), the MLE of $s$ given by (2.6.20) is distributed as $(2n)^{-1}sV$, where
$V$ has a $\chi^2$ distribution with $2n$ degrees of freedom. Consequently, the
$100(1-\alpha)\%$ confidence interval for $s$ is given by
$$\left(\frac{2\sum_{j=1}^{n}|X_j - \theta|}{\chi^2_{2n,1-\alpha/2}},\ \ \frac{2\sum_{j=1}^{n}|X_j - \theta|}{\chi^2_{2n,\alpha/2}}\right), \qquad (2.6.112)$$
where $\chi^2_{2n,p}$ denotes the $p$th quantile of the $\chi^2$ distribution with $2n$ degrees
of freedom. If both $\theta$ and $s$ are unknown, confidence intervals for $\theta$ and $s$
can be obtained via the distributions of the pivotal quantities
$$V_n = \frac{1}{s}\sum_{j=1}^{n}|X_j - \hat{\theta}_n| \quad \text{and} \quad W_n = \frac{\hat{\theta}_n - \theta}{\sum_{j=1}^{n}|X_j - \hat{\theta}_n|}, \qquad (2.6.113)$$
where $\hat{\theta}_n$ is the MLE of $\theta$ given by (2.6.15), as $V_n$ and $W_n$ are distributed
independently of the parameters [see Bain and Engelhardt (1973)]. The
distributions of $V_n$ and $W_n$ can be derived exactly for small values of $n$,
but the calculations become quite tedious as the value of $n$ increases [cf. Bain
and Engelhardt (1973)]. For $n = 3$, we have
$$V_3 \overset{d}{=} Y_{3:3} - Y_{1:3} \quad \text{and} \quad W_3 \overset{d}{=} \frac{Y_{2:3}}{Y_{3:3} - Y_{1:3}}, \qquad (2.6.114)$$
where $Y_{1:3} \leq Y_{2:3} \leq Y_{3:3}$ are the order statistics connected with a random
sample of size three from the standard classical Laplace distribution. Since
$V_3$ coincides with the range, its p.d.f. follows from Proposition 2.5.3 in
Section 2.5,
$$f_{V_3}(x) = e^{-x}\left(e^{-x} + 1.5x - 1\right), \qquad x > 0. \qquad (2.6.115)$$
The p.d.f. of $W_3$ can be derived from the joint p.d.f. of the order statistics
given in (2.5.11),
$$f_{W_3}(x) = \begin{cases}
\dfrac{9}{2}\,|x|\left(1 - 9|x|^2\right)^{-2}, & |x| > 1,\\[6pt]
\dfrac{3}{8}\left[\dfrac{8}{(1+|x|)^3} - \dfrac{3}{(1+|x|)^2} - \dfrac{1}{(1+3|x|)^2}\right], & \text{otherwise},
\end{cases} \qquad (2.6.116)$$
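Both pivotal densities integrate to 1, which is easy to confirm numerically; the sketch below (illustrative Python, not code from the text) does so with a plain midpoint rule:

```python
import math

def f_v3(x):
    """Density (2.6.115) of the pivot V_3 (the range of three observations)."""
    return math.exp(-x) * (math.exp(-x) + 1.5 * x - 1.0)

def f_w3(x):
    """Density (2.6.116) of the pivot W_3."""
    a = abs(x)
    if a > 1.0:
        return 4.5 * a / (1.0 - 9.0 * a * a) ** 2
    return 0.375 * (8.0 / (1.0 + a) ** 3 - 3.0 / (1.0 + a) ** 2
                    - 1.0 / (1.0 + 3.0 * a) ** 2)

def integrate(f, lo, hi, steps=400_000):
    """Plain midpoint rule."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

print(round(integrate(f_v3, 0.0, 60.0), 4))      # total mass of V_3
print(round(integrate(f_w3, -200.0, 200.0), 4))  # mass of W_3 (heavy tails)
```

Both values come out as 1.0 to four decimals; the tail of $f_{W_3}$ beyond $|x| = 200$ contributes only about $10^{-6}$.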
see Bain and Engelhardt (1973). For $n > 3$ one can use either the asymptotic
distributions of $V_n$ and $W_n$ [see, e.g., Bain and Engelhardt (1973)] or Monte
Carlo approximations to derive the confidence intervals. Using the latter
approach, one would first approximate the value $w_{\alpha/2}$ such that
$$P(W_n > w_{\alpha/2}) = \frac{\alpha}{2} \qquad (2.6.117)$$
from the empirical distribution of $W_n$ obtained by Monte Carlo simulations.
Then, an approximate $(1-\alpha)100\%$ confidence interval for $\theta$ is
$$\left(\hat{\theta}_n - w_{\alpha/2}\sum_{j=1}^{n}|X_j - \hat{\theta}_n|,\ \ \hat{\theta}_n + w_{\alpha/2}\sum_{j=1}^{n}|X_j - \hat{\theta}_n|\right). \qquad (2.6.118)$$
Similarly, an approximate $(1-\alpha)100\%$ confidence interval for $s$ would be
$$\left(\frac{\sum_{j=1}^{n}|X_j - \hat{\theta}_n|}{v_{1-\alpha/2}},\ \ \frac{\sum_{j=1}^{n}|X_j - \hat{\theta}_n|}{v_{\alpha/2}}\right), \qquad (2.6.119)$$
where $v_\beta$ denotes an estimate of the $\beta$th quantile of $V_n$ obtained by Monte
Carlo simulations. More details can be found in Bain and Engelhardt (1973).
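The Monte Carlo procedure for (2.6.118) is straightforward to sketch in code. The fragment below is an illustration (arbitrary seed and parameters, not the authors' code); it exploits the symmetry of $W_n$ about zero and, for even $n$, takes the sample median as the midpoint of the two central order statistics:

```python
import random, statistics

random.seed(11)

def laplace_sample(n, theta=0.0, s=1.0):
    """CL(theta, s) variates via a difference of standard exponentials."""
    return [theta + s * (random.expovariate(1.0) - random.expovariate(1.0))
            for _ in range(n)]

def abs_w(xs):
    """|W_n| of (2.6.113), evaluated at the true theta = 0."""
    med = statistics.median(xs)
    return abs(med) / sum(abs(x - med) for x in xs)

n, alpha = 20, 0.05
# Step 1: simulate w_{alpha/2} from samples with theta = 0 (W_n is pivotal,
# so the choice of theta and s is immaterial here).
ws = sorted(abs_w(laplace_sample(n)) for _ in range(4000))
w_half = ws[int((1 - alpha) * len(ws))]

# Step 2: empirical coverage of the interval (2.6.118) on fresh samples.
hits, trials, theta0 = 0, 1000, 3.0
for _ in range(trials):
    xs = laplace_sample(n, theta=theta0, s=2.0)
    med = statistics.median(xs)
    half = w_half * sum(abs(x - med) for x in xs)
    hits += med - half <= theta0 <= med + half
print(hits / trials)  # should be near 1 - alpha = 0.95
```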
Remark 2.6.17 Balakrishnan, Chandramouleeswaran, and Ambagaspitiya
(1994) studied the inference on $\theta$ when $s$ is assumed either known or unknown, and on $s$ when $\theta$ is unknown, for complete as well as Type-II censored samples, through the three pivotal quantities
$$\frac{\theta^*_n - \theta}{s\sqrt{V_1}}, \qquad \frac{\theta^*_n - \theta}{s^*_n\sqrt{V_1}}, \qquad \frac{s^*_n/s - 1}{\sqrt{V_2}}, \qquad (2.6.120)$$
where $\theta^*_n$ and $s^*_n$ are the BLUE's of $\theta$ and $s$, and $s^2V_1$ and $s^2V_2$ are the variances of $\theta^*_n$ and $s^*_n$. See Balakrishnan, Chandramouleeswaran, and Ambagaspitiya (1994) for the percentage points of the pivotal quantities (2.6.120),
and also Balakrishnan, Chandramouleeswaran, and Govindarajulu (1994)
for further results on the approximations of the distributions of (2.6.120)
and their accuracy.
Confidence bands for the Laplace c.d.f.

Let $F(\cdot; \theta, s)$ be the c.d.f. of the classical Laplace distribution given by
(2.1.5). Srinivasan and Wharton (1982) constructed one-sided and two-sided confidence bands on $F(\cdot; \theta, s)$ using the Kolmogorov-Smirnov-type
statistics
$$L_n = \sup_{-\infty<x<\infty}\left|F(x; \theta, s) - F(x; \theta^*_n, s^*_n)\right| \qquad (2.6.121)$$
and
$$L^+_n = \sup_{x}\left\{F(x; \theta, s) - F(x; \theta^*_n, s^*_n)\right\}, \qquad (2.6.122)$$
where $\theta^*_n$ and $s^*_n$ are the BLUE's of $\theta$ and $s$. For any $0 < \alpha < 1$, let the $\alpha$th
quantile of $L_n$ be $l_\alpha$ (so that $P(L_n \leq l_\alpha) = \alpha$). Then, a two-sided $\alpha100\%$
confidence band for $F(\cdot; \theta, s)$ is given by
$$\left(\max\{F(x; \theta^*_n, s^*_n) - l_\alpha,\ 0\},\ \min\{F(x; \theta^*_n, s^*_n) + l_\alpha,\ 1\}\right), \qquad (2.6.123)$$
with a similar one-sided confidence band based on $L^+_n$. Tables 2.13 and 2.14
below present simulated percentage points of $L_n$ and $L^+_n$ for $n$ up to 20, derived by Srinivasan and Wharton (1982). For larger values of $n$, Srinivasan
and Wharton (1982) recommended certain large-sample approximations for
the percentage points of $L_n$ and $L^+_n$. For example, the quantiles of $L_n$ may
be approximated through the limiting distribution of $\sqrt{n}L_n$, which is the
same as that of the random variable $\sup|X_0(y)|$, where $X_0(y)$ is a Gaussian
process with the representation
$$X_0(y) = \frac{1}{2}e^{-|y|}(U + Vy), \qquad -\infty < y < \infty. \qquad (2.6.124)$$
In (2.6.124), the variables $U$ and $V$ are i.i.d. standard normal. We refer
the reader to Srinivasan and Wharton (1982) for more technical details
regarding this problem.
n \ α   0.80   0.85   0.90   0.95   0.99
  5     0.31   0.35   0.39   0.45   0.56
  6     0.29   0.32   0.35   0.41   0.52
  7     0.26   0.29   0.33   0.38   0.48
  8     0.25   0.27   0.31   0.36   0.46
  9     0.23   0.26   0.29   0.34   0.44
 10     0.22   0.24   0.27   0.32   0.41
 11     0.21   0.23   0.26   0.31   0.39
 12     0.20   0.22   0.25   0.30   0.38
 13     0.19   0.22   0.24   0.28   0.36
 14     0.18   0.21   0.23   0.27   0.34
 15     0.18   0.20   0.22   0.26   0.33
 16     0.17   0.19   0.22   0.25   0.32
 17     0.16   0.18   0.21   0.24   0.31
 18     0.16   0.18   0.20   0.24   0.31
 19     0.16   0.18   0.20   0.23   0.31
 20     0.15   0.17   0.19   0.23   0.29

Table 2.13: Simulated percentage points $l_\alpha$ of the statistic $L_n$.
Conditional inference

The confidence intervals discussed in Section 2.6.2 are based on the MLE's
$\hat{\theta}_n$ and $\hat{s}_n$ of the parameters $\theta$ and $s$ of the classical Laplace distribution
$CL(\theta, s)$. As noted by Kappenman (1975), these estimators are not sufficient statistics, so that inference about $\theta$ and $s$ based on these statistics
leads to some loss of information contained in the random sample. It is
generally accepted that the lost information may be recovered (on the average) by conditioning on the ancillary statistics, which was first suggested
by Fisher (1934) [see also remarks by Edwards (1974)]. Kappenman (1975)
followed the conditional approach and obtained conditional confidence intervals for the Laplace parameters, based on the conditional distributions
n \ α   0.80   0.85   0.90   0.95   0.99
  5     0.23   0.27   0.31   0.38   0.51
  6     0.21   0.24   0.29   0.35   0.47
  7     0.19   0.22   0.26   0.32   0.44
  8     0.18   0.21   0.25   0.31   0.42
  9     0.16   0.19   0.23   0.38   0.39
 10     0.16   0.18   0.22   0.27   0.38
 11     0.15   0.17   0.21   0.26   0.36
 12     0.14   0.17   0.20   0.25   0.34
 13     0.13   0.16   0.19   0.24   0.34
 14     0.13   0.15   0.18   0.23   0.32
 15     0.12   0.14   0.18   0.22   0.30
 16     0.12   0.14   0.17   0.21   0.29
 17     0.12   0.14   0.17   0.21   0.28
 18     0.12   0.14   0.17   0.21   0.28
 19     0.11   0.13   0.16   0.20   0.27
 20     0.11   0.13   0.15   0.19   0.26

Table 2.14: Simulated percentage points $l^+_\alpha$ of the statistic $L^+_n$.
of the pivotal quantities (2.6.113) given the ancillary statistics. Here, we
shall first examine the loss of information associated with the median and
related estimators in the Laplace case, and then discuss the conditional
inference.
Loss of information. The loss of information associated with the median
when estimating the location parameter of the classical Laplace distribution
was discussed by Fisher (1922, 1925, 1934). We shall consider the location
family given by the density
$$f(x; \theta) = f(x - \theta) = \frac{1}{2}e^{-|x-\theta|}, \qquad -\infty < x, \theta < \infty, \qquad (2.6.125)$$
where $f$ is the standard classical Laplace density. Let $X_1, \ldots, X_n$ be a
random sample of size $n = 2k+1$ from the distribution given by the density
(2.6.125). Then, by (2.6.12), the Fisher information supplied by the sample
is $n = 2k+1$. On the other hand, when we use the MLE for estimating
the location parameter $\theta$, which by Proposition 2.6.2 is the sample median
$\hat{\theta}_n = X_{k+1:n}$, we are replacing $n = 2k+1$ observations from the distribution
(2.6.125) by a single observation from the distribution with the density
$f_{k+1:n}(x)$ of the median given by (2.5.10). Since the latter distribution is
also a location family,
$$f_{k+1:n}(x) = g(x - \theta), \qquad -\infty < x, \theta < \infty, \qquad (2.6.126)$$
where
$$g(x) = \frac{(2k+1)!}{(k!)^2}\left(\frac{1}{2}\right)^{2k+1}e^{-(k+1)|x|}\left(2 - e^{-|x|}\right)^k, \qquad -\infty < x < \infty, \qquad (2.6.127)$$
is an absolutely continuous density function, the Fisher information contained in the median is
$$I(\theta) = \int_{-\infty}^{\infty}\left[\frac{g'(y)}{g(y)}\right]^2 g(y)\,dy, \qquad (2.6.128)$$
with $g$ given by (2.6.127) [see Huber (1981), Lehmann and Casella (1998),
and also Exercise 2.7.31]. After a lengthy calculation we obtain (Exercise
2.7.32)
$$I(\theta) = \begin{cases}
12\left[\log 2 - 0.5\right] & \text{if } k = 1,\\[6pt]
\dfrac{(k+1)(2k+1)}{k-1}\left[1 - \dfrac{(2k)!}{(k!)^2}\left(\dfrac{1}{2}\right)^{2k-1}\right] & \text{if } k > 1,
\end{cases} \qquad (2.6.129)$$
cf. Fisher (1934). As noted by Fisher (1934), although the median is asymptotically efficient [the ratio of $2k+1$ to $I(\theta)$ given by (2.6.129) tends to 1
as $k \to \infty$], the amount lost,
$$2k + 1 - I(\theta) = \frac{2(2k+1)}{k-1}\left\{(k+1)\frac{(2k)!}{(k!)^2}\left(\frac{1}{2}\right)^{2k} - 1\right\}, \qquad k > 1, \qquad (2.6.130)$$
increases to infinity. As $k \to \infty$, we obtain an asymptotic approximation
of the loss,
$$2k + 1 - I(\theta) \sim 4\left(\sqrt{k/\pi} - 1\right), \qquad k \to \infty, \qquad (2.6.131)$$
using Stirling's Formula (Exercise 2.7.32). Fisher (1934) noted that
with the sample size $n = 2k+1 = 629$, this loss is about 36.
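The closed form (2.6.129) is easy to confirm by integrating (2.6.128) numerically. The sketch below (illustrative Python, not code from the text) does so using the score $g'/g$ computed analytically for $x > 0$:

```python
import math

def I_median(k, upper=40.0, steps=100_000):
    """Fisher information (2.6.128) contained in the median of a sample
    of size n = 2k+1, by numerical integration of [g'/g]^2 g over x > 0
    (the integrand is symmetric about zero, so the result is doubled)."""
    n = 2 * k + 1
    c = math.factorial(n) / math.factorial(k) ** 2 * 0.5 ** n
    h = upper / steps
    total = 0.0
    for i in range(1, steps + 1):
        x = i * h
        e = math.exp(-x)
        g = c * math.exp(-(k + 1) * x) * (2.0 - e) ** k
        score = -(k + 1) + k * e / (2.0 - e)   # g'(x)/g(x) for x > 0
        total += score ** 2 * g
    return 2.0 * h * total

print(round(I_median(1), 3), round(I_median(2), 3))
```

For $k = 1$ this returns approximately $12[\log 2 - 0.5] \approx 2.318$ and for $k = 2$ approximately $3.75$, in agreement with (2.6.129); the corresponding losses $2k+1-I(\theta)$ are about 0.68 and 1.25.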
More generally, we can calculate the loss of information associated with
the statistic
$$T_l = (X_{k-l+1:n}, \ldots, X_{k+l+1:n}), \qquad (2.6.132)$$
which is the set of the central $2l+1$ order statistics obtained from a sample
of size $n = 2k+1$ from the Laplace distribution (2.6.125). It is well known
[see, e.g., Fisher (1925), Rao (1961)] that the loss of information associated
with an arbitrary statistic $T$ obtained from a sample of size $n$ from the
population with density $f(\cdot; \theta)$ is
$$E_\theta\left\{Var_\theta\left(\sum_{i=1}^{n}\frac{\partial}{\partial\theta}\log f(X_i; \theta)\,\Big|\,T\right)\right\}, \qquad (2.6.133)$$
118 2. Classical symmetric Laplace distribution
where Var_θ(·|T) is the conditional variance given T and E_θ is an unconditional expectation. In the case of the Laplace distribution (2.6.125), we have
$$ \frac{\partial}{\partial\theta}\log f(X_i;\theta) = \mathrm{sign}(X_i - \theta), \qquad (2.6.134) $$
and the conditional variance takes the form
$$ \mathrm{Var}\left(\sum_{i=1}^{2k+1}\mathrm{sign}(X_i-\theta)\,\Big|\,T_l\right) = (k-l)(V_1 + V_2), \qquad (2.6.135) $$
where
$$ V_1 = \begin{cases} 0 & \text{for } X_{k-l+1:n} \le \theta,\\ (2u-1)/u^2 & \text{for } X_{k-l+1:n} > \theta, \end{cases} \qquad (2.6.136) $$
$$ V_2 = \begin{cases} 0 & \text{for } X_{k+l+1:n} \ge \theta,\\ (2v-1)/v^2 & \text{for } X_{k+l+1:n} < \theta, \end{cases} \qquad (2.6.137) $$
and
$$ u = F(X_{k-l+1:n}), \qquad v = 1 - F(X_{k+l+1:n}), \qquad (2.6.138) $$
with F being the distribution function of the standard classical Laplace distribution [see Akahira and Takeuchi (1990) for details]. Hence, the loss of information associated with T_l is
$$ L_l = (k-l)\bigl(E(V_1) + E(V_2)\bigr) = \frac{2(2k+1)!}{(k-l-1)!(k+l)!}\int_{1/2}^{1}\frac{2u-1}{u^2}\,u^{k-l}(1-u)^{k+l}\,du, \qquad (2.6.139) $$
since both V_1 and V_2 have support on [1/2, 1] (F(x) > 1/2 if x > θ) and
$$ E(V_1) = E(V_2) = \frac{(2k+1)!}{(k-l)!(k+l)!}\int_{1/2}^{1}\frac{2u-1}{u^2}\,u^{k-l}(1-u)^{k+l}\,du. \qquad (2.6.140) $$
Relating the integral in (2.6.139) to an incomplete beta function, Akahira and Takeuchi (1990) obtained the following result for the loss of information.
Proposition 2.6.12 For each integer 0 ≤ l ≤ k−2, the loss of information L_l associated with the statistic T_l given by (2.6.132) is
$$ \frac{2^{2k}L_l}{2(2k+1)} = \frac{(2k)!}{(k!)^2} - \frac{(l+1)2^{2k}}{k-l-1} + \sum_{j=0}^{l}\frac{2(l-j+1)}{k-l-1}\,\frac{(2k)!}{(k-l)!(k+l)!}. \qquad (2.6.141) $$
Note that for l = 0, in which case T_l is the median X_{k+1:n}, the relation (2.6.141) reduces to (2.6.130). Asymptotically, for fixed l and large k, the loss of information (2.6.141) is given by
$$ L_l = 4\sqrt{k/\pi}\,(1+o(1)) - 4(l+1) + O\left(\frac{l^2}{k}\right), \qquad (2.6.142) $$
and coincides with (2.6.131) for l = 0 [see Akahira and Takeuchi (1990)]. We refer the interested reader to Akahira (1987, 1990), Akahira and Takeuchi (1990, 1993), and Takeuchi and Akahira (1976) for more information on loss of information and second order asymptotic results for order statistics and related estimators of the location parameter in the case of the Laplace distribution.
Conditional confidence intervals. Let X_1, …, X_n be i.i.d. random variables with the common classical Laplace distribution with density (2.1.1), and let X_{1:n} ≤ ··· ≤ X_{n:n} be the corresponding order statistics. Define the statistic
$$ \mathbf{a} = (a_1,\ldots,a_n)', \qquad (2.6.143) $$
where
$$ a_i = \frac{X_{i:n} - \hat\theta_n}{\hat s_n}, \quad i = 1,\ldots,n, \qquad (2.6.144) $$
and θ̂_n and ŝ_n are the MLE's of the location and scale parameters given by (2.6.15) and (2.6.34), respectively. Note that for n = 2m+1 we have a_{m+1} = 0, while for n = 2m we have a_m = −a_{m+1}. In addition,
$$ \sum_{i=1}^{n}|a_i| = n, \qquad (2.6.145) $$
so that only n−2 of the components of a are independent. Further, since the pivotal quantities
$$ U_n = \frac{\hat\theta_n - \theta}{\hat s_n} \quad\text{and}\quad V_n = \frac{\hat s_n}{s} \qquad (2.6.146) $$
have distributions that do not depend on the parameters θ and s [see Antle and Bain (1969)], it follows that a is an ancillary statistic for θ and s [cf. Kappenman (1975)]. The joint conditional density function of θ̂_n and ŝ_n, given the value of the ancillary statistic a, is proportional to
$$ \frac{1}{s^2}\left(\frac{\hat s_n}{s}\right)^{n-2}\exp\left\{-\frac{\hat s_n}{s}\sum_{i=1}^{n}\left|\frac{\hat\theta_n - \theta}{\hat s_n} + a_i\right|\right\}. \qquad (2.6.147) $$
Note that the Jacobian of (ŝ_n, θ̂_n) as a function of U_n and V_n is s²V_n, so the conditional joint density of U_n and V_n, given the value of the ancillary statistic a, is equal to
$$ p_{U_n,V_n}(u,v\,|\,\mathbf{a}) = K v^{n-1} e^{-v\sum_{i=1}^{n}|u+a_i|}. \qquad (2.6.148) $$
The normalizing constant in (2.6.148) is equal to
$$ K = \frac{1}{2\Gamma(n-1)}\left[B_n(\mathbf{a})\,c(\hat\theta_n)\right]^{n-1}, \qquad (2.6.149) $$
where
$$ c(t) = \sum_{i=1}^{n}|a_i - t| = \begin{cases} \sum_{i=1}^{n} a_i - nt & \text{for } t \le a_1,\\[4pt] (2i-n)t + \sum_{j=i+1}^{n} a_j - \sum_{j=1}^{i} a_j & \text{for } a_i \le t \le a_{i+1},\\[4pt] nt - \sum_{i=1}^{n} a_i & \text{for } t \ge a_n, \end{cases} \qquad (2.6.150) $$
and B_n(a) is equal to
$$ \left\{\sum_{i=1}^{n}\frac{[c(\hat\theta_n)/c(a_i)]^{n-1}}{(2i-n)(n+2-2i)}\right\}^{-1/(n-1)} \qquad (2.6.151) $$
if n is odd, and to
$$ \left\{\frac{(n-1)(a_{n/2+1} - a_{n/2})}{2c(\hat\theta_n)} + \frac{1}{2}\sum_{\substack{i=1\\ i\ne n/2,\,n/2+1}}^{n}\frac{[c(\hat\theta_n)/c(a_i)]^{n-1}}{(2i-n)(n+2-2i)}\right\}^{-1/(n-1)} \qquad (2.6.152) $$
if n is even; see Kappenman (1975) and Uthoff (1973). Utilizing (2.6.148), one can now derive the marginal conditional density of U_n,
$$ p_{U_n}(u\,|\,\mathbf{a}) = K\,\Gamma(n)\left\{\sum_{i=1}^{n}|u + a_i|\right\}^{-n}, \qquad (2.6.153) $$
and use it to produce the conditional 100(1−α)% confidence interval for θ,
$$ \left(\hat\theta_n - u_2\hat s_n,\ \hat\theta_n - u_1\hat s_n\right), \qquad (2.6.154) $$
where the constants u_1 and u_2 satisfy the conditions
$$ P(U_n \le u_1\,|\,\mathbf{a}) = P(U_n \ge u_2\,|\,\mathbf{a}) = \alpha/2. \qquad (2.6.155) $$
Similarly, we can derive the marginal conditional density of V_n, and consequently obtain the expression
$$ K\left[\frac{\gamma(n-1;v_2c(a_1)) - \gamma(n-1;v_1c(a_1))}{n(c(a_1))^{n-1}} + \sum_{i=1}^{n-1}\frac{\gamma(n-1;v_2c(a_i)) - \gamma(n-1;v_1c(a_i))}{(2i-n)(c(a_i))^{n-1}}\right. $$
$$ \left. -\ \sum_{i=1}^{n-1}\frac{\gamma(n-1;v_2c(a_{i+1})) - \gamma(n-1;v_1c(a_{i+1}))}{(2i-n)(c(a_{i+1}))^{n-1}} + \frac{\gamma(n-1;v_2c(a_n)) - \gamma(n-1;v_1c(a_n))}{n(c(a_n))^{n-1}}\right] \qquad (2.6.156) $$
for the probability
$$ P(v_1 < V_n < v_2\,|\,\mathbf{a}) = P\left(\frac{\hat s_n}{v_2} < s < \frac{\hat s_n}{v_1}\,\Big|\,\mathbf{a}\right), \qquad (2.6.157) $$
where
$$ \gamma(n-1;x) = \int_0^x e^{-t}t^{n-2}\,dt, \quad 0 < x < \infty, \qquad (2.6.158) $$
is the incomplete gamma function; see Kappenman (1975). Thus, the conditional 100(1−α)% confidence interval for s is
$$ (\hat s_n/v_2,\ \hat s_n/v_1), \qquad (2.6.159) $$
where the constants v_1 and v_2 are chosen so that the conditional probability (2.6.157) given by (2.6.156) is equal to 1 − α.
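The piecewise-linear form of c(t) in (2.6.150), which drives both (2.6.156) and the tolerance factors below, can be sanity-checked against the direct sum ∑|a_i − t|. A minimal sketch (the values a_i are hypothetical stand-ins for the ancillary statistic, not data from the book):

```python
import random

def c_direct(t, a):
    """c(t) = sum_i |a_i - t|, evaluated directly."""
    return sum(abs(ai - t) for ai in a)

def c_piecewise(t, a):
    """c(t) via the piecewise-linear form (2.6.150); a must be sorted."""
    n = len(a)
    if t <= a[0]:
        return sum(a) - n*t
    if t >= a[-1]:
        return n*t - sum(a)
    # locate i (1-based, as in the text) with a_i <= t <= a_{i+1}
    i = max(j for j in range(n) if a[j] <= t) + 1
    return (2*i - n)*t + sum(a[i:]) - sum(a[:i])

random.seed(0)
a = sorted(random.uniform(-2, 2) for _ in range(7))   # hypothetical ancillary values
for t in [-3.0, a[2], 0.3, 1.9, 3.5]:
    assert abs(c_direct(t, a) - c_piecewise(t, a)) < 1e-12
```

The agreement at the knots t = a_i confirms that the three branches of (2.6.150) glue together continuously.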
Grice et al. (1978) compared the conditional confidence intervals for θ given by (2.6.154) with the unconditional ones given by (2.6.118), in terms of their expected lengths. Using Monte Carlo techniques they concluded that the conditional approach yields slightly narrower intervals on average, and that the two methods are essentially in agreement for large sample sizes. Table 2.15 below, taken from Grice et al. (1978), contains the expected lengths of the conditional and unconditional confidence intervals for selected sample sizes.
Remark 2.6.18 Conditional inference for the Laplace distribution under Type-II right censoring is discussed in Childs and Balakrishnan (1996).
2.6.3 Tolerance intervals
Let X_1, …, X_n be a random sample of size n from a distribution with density f, and let
$$ U = U(X_1,\ldots,X_n) \quad\text{and}\quad L = L(X_1,\ldots,X_n) $$
           1 − α = 0.90       1 − α = 0.95       1 − α = 0.98
  n      Cond.   Uncond.    Cond.   Uncond.    Cond.   Uncond.
  3      3.352   3.641      4.740   4.975      7.495   7.649
  5      2.113   2.273      2.575   2.912      3.542   3.787
  9      1.375   1.498      1.698   1.949      2.119   2.316
  15     0.997   1.061      1.214   1.326      1.484   1.525
  33     0.631   0.682      0.761   0.830      0.917   0.942

Table 2.15: Expected lengths of conditional and unconditional 100(1−α)% confidence intervals for θ based on random samples of selected size n from the CL(θ, 1) distribution.
be two statistics such that
$$ P\left(\int_L^{\infty} f(x)\,dx \ge \beta\right) = \gamma \qquad (2.6.160) $$
and
$$ P\left(\int_{-\infty}^{U} f(x)\,dx \ge \beta\right) = \gamma. \qquad (2.6.161) $$
Then, L and U are said to be lower and upper (β, γ) tolerance limits, while the intervals (L, ∞) and (−∞, U) are, respectively, lower and upper γ probability tolerance intervals for proportion β (β-content tolerance intervals at level γ). Similarly, for L < U, the interval (L, U) is a two-sided γ probability tolerance interval for proportion β (β-content tolerance interval at level γ) if
$$ P\left(\int_L^{U} f(x)\,dx \ge \beta\right) = \gamma. \qquad (2.6.162) $$
Below, we shall discuss tolerance intervals when the random sample is from the two-parameter classical Laplace distribution with density (2.1.1). Let us first consider the lower tolerance interval of the form
$$ (L, \infty) = (\hat\theta_n - b\hat s_n,\ \infty), \qquad (2.6.163) $$
where θ̂_n and ŝ_n are the MLE's of the parameters θ and s given by (2.6.15) and (2.6.34), respectively. Thus, the problem is to determine the tolerance factor b in (2.6.163). Upon substituting the Laplace density (2.1.1) and L given by (2.6.163) into (2.6.160), and changing the variable u = (x − θ)/s, we obtain the following equation for b:
$$ P\left(\int_{\frac{\hat\theta_n-\theta}{s} - b\frac{\hat s_n}{s}}^{\infty}\frac{1}{2}e^{-|u|}\,du \ge \beta\right) = \gamma. \qquad (2.6.164) $$
Restricting β to β ≥ 1/2 (in practice, the proportion β is close to one) we can write equivalently
$$ P\left(\frac{\hat\theta_n-\theta}{s} - b\,\frac{\hat s_n}{s} \le -k_\beta\right) = \gamma, \qquad (2.6.165) $$
where
$$ k_\beta = -\log[2(1-\beta)] \ge 0. \qquad (2.6.166) $$
Bain and Engelhardt (1973) expressed (2.6.165) as
$$ P\left(U_n(b) \le -\sqrt{n}\,k_\beta\right) = \gamma, \qquad (2.6.167) $$
where
$$ U_n(c) = \sqrt{n}\left(\frac{\hat\theta_n-\theta}{s} - c\,\frac{\hat s_n}{s}\right), \qquad (2.6.168) $$
and used the approximation
$$ P\left(U_n(b) \le -\sqrt{n}\,k_\beta\right) \approx \Phi\left(\frac{\sqrt{n}\,(b - k_\beta)}{\sqrt{1+b^2}}\right) \qquad (2.6.169) $$
to obtain an approximate value of the tolerance factor:
$$ b \approx \frac{1}{n - z_\gamma^2}\left\{n k_\beta + z_\gamma\sqrt{n(1 + k_\beta^2) - z_\gamma^2}\right\}. \qquad (2.6.170) $$
[Here, Φ and z_γ are the standard normal c.d.f. and its γth quantile, respectively.]
Note that by symmetry, the interval
$$ (-\infty, U) = (-\infty,\ \hat\theta_n + b\hat s_n), \qquad (2.6.171) $$
with b as in (2.6.170), is an approximate upper γ probability tolerance interval.
Kappenman (1977) derived conditional tolerance intervals following the conditional approach presented in Section 2.6.2. Here, the interval of the form (2.6.163) is a lower γ probability conditional tolerance interval for proportion β if
$$ P\left(\int_{\hat\theta_n - b\hat s_n}^{\infty} f(x;\theta,s)\,dx \ge \beta\ \Big|\ \mathbf{a}\right) = \gamma, \qquad (2.6.172) $$
where f(x; θ, s) is the Laplace p.d.f. (2.1.1) and a is the vector of ancillary statistics given by (2.6.143)-(2.6.144) in Section 2.6.2. [The upper and the two-sided conditional tolerance intervals are defined similarly.] Using the
conditional joint distribution of (θ̂_n − θ)/s and ŝ_n/s, Kappenman (1977) obtained the following value for the tolerance factor b:
$$ b = a_h + \frac{c(a_h)}{n-2h} + \frac{1}{n-2h}\left[e^{-k_\beta(n-2h)}(c(a_h))^{1-n} + \frac{p(n-2h)}{K\Gamma(n-1)}\right]^{-1/(n-1)}, \qquad (2.6.173) $$
where k_β is given by (2.6.166), a is as before, c(t) is given by (2.6.150), K is the normalizing constant (2.6.149), h is the largest integer (h ≥ 2) such that
$$ Q(h) = K\Gamma(n-1)\left\{\frac{1}{n(c(a_1))^{n-1}} + \sum_{i=1}^{h-1}\frac{1}{n-2i}\left[\frac{1}{(c(a_{i+1}))^{n-1}} - \frac{1}{(c(a_i))^{n-1}}\right]\right\} \le 1 - \gamma, \qquad (2.6.174) $$
and p = 1 − γ − Q(h). To actually calculate b, one must first find h, usually by trying h = 2, 3, … in (2.6.174).
By symmetry, the upper γ probability conditional tolerance interval for proportion β is
$$ (-\infty,\ \hat\theta_n - b\hat s_n), \qquad (2.6.175) $$
where b is obtained from (2.6.173)-(2.6.174) with k_β replaced by −k_β and with p equal to γ − Q(h), where now h is the largest integer (h ≥ 2) such that Q(h) < γ.
Shyu and Owen (1986a) remarked that the approximate tolerance intervals (2.6.163), which are based on the approximation (2.6.170), can miss the exact values significantly in some applications, while the conditional tolerance factors (2.6.173) are not easy to compute even for small sample sizes. They proposed a method based on Monte Carlo simulations, sketched below, leading to useful tables for the tolerance factor b. Denoting
$$ W_n = \frac{(\hat\theta_n - \theta)/s + k_\beta}{\hat s_n/s}, \qquad (2.6.176) $$
we see that the relation (2.6.165) is equivalent to
$$ P(W_n \le b) = \gamma. \qquad (2.6.177) $$
Since the distribution of (θ̂_n − θ)/s and ŝ_n/s is independent of the parameters θ and s [see Antle and Bain (1969)], the same property is shared by the statistic W_n defined in (2.6.176). Consequently, the tolerance factor b can be determined from the relation (2.6.177) for any given values of β, γ, and n.
For n = 2, the p.d.f. of W_2 takes the following form for x ≠ 0:
$$ g(x) = \begin{cases} \frac{1}{4}\left\{[u(x)-1]\,e^{-2k_\beta/(x-1)} + [1-u(x)]\,e^{-2k_\beta/(x+1)} + \frac{1}{x^2}\,e^{-2k_\beta}\right\} & \text{for } x > 1,\\[6pt] \frac{1}{4}\left\{[1-u(x)]\,e^{-2k_\beta/(x+1)} + \frac{1}{x^2}\,e^{-2k_\beta}\right\} & \text{for } -1 < x \le 1,\\[6pt] \frac{1}{4}\,\frac{1}{x^2}\,e^{-2k_\beta} & \text{for } x \le -1, \end{cases} \qquad (2.6.178) $$
where
$$ u(x) = \frac{1}{x^2} - \frac{2k_\beta}{x}, $$
see Exercise 2.7.41.
Thus, the exact value of b can be obtained by solving (2.6.177) (numerically, since the relevant distribution function does not admit a closed form). Shyu and Owen (1986a) provide a table for the resulting values of b, for n = 2 and
β = 0.750, 0.900, 0.950, 0.990, 0.995, 0.999,
γ = 0.500, 0.750, 0.900, 0.950, 0.975, 0.990, 0.995.
They also note that when n > 2 the exact distribution of W_n is difficult to obtain, and hence they derive approximations based on simulations. The values of the tolerance factor b for sample sizes n = 3(1)11, 50, 100 and the same values of β and γ as those for n = 2 above can be found in Shyu and Owen (1986a).
Similarly, Shyu and Owen (1986b) developed analogous procedures for obtaining the two-sided tolerance intervals of the form
$$ (L, U) = (\hat\theta_n - b\hat s_n,\ \hat\theta_n + b\hat s_n), \qquad (2.6.179) $$
where θ̂_n and ŝ_n are as before, and presented useful tables for the tolerance factor b, for the same values of n, β, and γ as those used in Shyu and Owen (1986a) for the one-sided tolerance limits.
In Shyu and Owen (1987), the authors consider β-expectation tolerance intervals of the form (2.6.179) defined by the condition
$$ E\left[\int_L^{U} f(x;\theta,s)\,dx\right] = \beta, \qquad (2.6.180) $$
where f(·; θ, s) is the double exponential density (2.1.1). Shyu and Owen (1987) note that (2.6.180) is equivalent to
$$ P(-b < Y_n < b) = \beta, \qquad (2.6.181) $$
where
$$ Y_n = \frac{X - \hat\theta_n}{\hat s_n}, \qquad (2.6.182) $$
the variable X has a standard classical Laplace distribution, and θ̂_n and ŝ_n are as before and are independent of X. Subsequently, by simulations, they developed useful tables for the tolerance factor b, with the same values of n, β, and γ as those used in Shyu and Owen (1986a,b).
Remark 2.6.19 Balakrishnan and Chandramouleeswaran (1994a) developed upper and lower tolerance intervals based on Type-II censored samples from the Laplace distribution. Their intervals are of the form
$$ (-\infty, U) = (-\infty,\ \theta_n + b s_n) \quad\text{and}\quad (L, \infty) = (\theta_n - b s_n,\ \infty), $$
where θ_n and s_n are the BLUE's of θ and s. They developed tables of the tolerance factor b for sample sizes n = 5(1)10, 12, 15, 20, right-censoring level s = 0(1)[[n/2]], and
β = 0.500(0.025)0.975,
γ = 0.750, 0.850, 0.900, 0.950, 0.980, 0.990, 0.995.
In addition, Balakrishnan and Chandramouleeswaran (1994a) proposed an estimator of the reliability
$$ R_X(t) = P(X > t) = 1 - F(t;\theta,s) \qquad (2.6.183) $$
of the CL(θ, s) r.v. X at time t of the form
$$ \hat R_X(t) = \begin{cases} 1 - \frac{1}{2}\,e^{(t-\theta_n)/s_n} & \text{for } t \le \theta_n,\\[4pt] \frac{1}{2}\,e^{-(t-\theta_n)/s_n} & \text{for } t \ge \theta_n, \end{cases} \qquad (2.6.184) $$
and described how to use their tables of the tolerance factor b to obtain confidence intervals for the reliability (2.6.183).
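The plug-in reliability estimator (2.6.184) is a two-branch formula; a minimal sketch (the parameter names are ours, and the BLUE computation itself is not shown — any location/scale estimates can be plugged in):

```python
import math

def laplace_reliability(t, theta, s):
    """Plug-in estimate of R_X(t) = P(X > t) for a CL(theta, s) law, as in (2.6.184);
    theta and s stand for estimates (e.g., the BLUE's from a censored sample)."""
    if t <= theta:
        return 1.0 - 0.5*math.exp((t - theta)/s)
    return 0.5*math.exp(-(t - theta)/s)

# sanity checks: R equals 1/2 at t = theta, and the survival function is
# symmetric in the sense R(theta - d) + R(theta + d) = 1
assert laplace_reliability(0.0, 0.0, 1.0) == 0.5
assert abs(laplace_reliability(-1.0, 0.0, 1.0)
           + laplace_reliability(1.0, 0.0, 1.0) - 1.0) < 1e-12
```

The two branches meet continuously at t = θ, mirroring the cusp of the Laplace density there.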
2.6.4 Testing hypotheses
Testing the normal versus the Laplace
Let X_1, …, X_n be i.i.d. with the common density
$$ \frac{1}{\sigma}\,f\left(\frac{x-\theta}{\sigma}\right), \qquad (2.6.185) $$
where the function f is symmetric about zero, and consider the problem of testing
$$ H_0: f = f_0 \quad\text{against}\quad H_1: f = f_1, \qquad (2.6.186) $$
where f_0 and f_1 are the standard normal and the standard Laplace densities, respectively. Let us derive the likelihood ratio test for this problem.
Writing the density (2.6.185) in the form
$$ f(x;\theta,\sigma,\alpha) = \frac{c_\alpha}{\sigma}\,e^{-b_\alpha\left|\frac{x-\theta}{\sigma}\right|^{\alpha}}, \qquad (2.6.187) $$
and choosing the parameter space to be
$$ \Omega = \{(\theta,\sigma,\alpha):\ \theta \in \mathbb{R},\ 0 < \sigma,\ \alpha = 1, 2\} = \Omega_0 \cup \Omega_1, \qquad (2.6.188) $$
we are testing whether the vector parameter belongs to
$$ \Omega_0 = \{(\theta,\sigma,\alpha):\ \theta \in \mathbb{R},\ 0 < \sigma,\ \alpha = 2\} $$
(the normal distribution) or to
$$ \Omega_1 = \{(\theta,\sigma,\alpha):\ \theta \in \mathbb{R},\ 0 < \sigma,\ \alpha = 1\} $$
(the Laplace distribution). The likelihood ratio criterion rejects H_0 if the ratio
$$ \frac{\sup_{(\theta,\sigma,\alpha)\in\Omega_0}\prod_{i=1}^{n}f(x_i;\theta,\sigma,\alpha)}{\sup_{(\theta,\sigma,\alpha)\in\Omega}\prod_{i=1}^{n}f(x_i;\theta,\sigma,\alpha)} \qquad (2.6.189) $$
is less than some constant c. Clearly, on Ω_0 the supremum is attained by the MLE's of the mean and the standard deviation under the normal model:
$$ \hat\theta_n^N = \frac{1}{n}\sum_{i=1}^{n}x_i = \bar x_n, \qquad (2.6.190) $$
$$ \hat\sigma_n^N = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar x_n)^2}. \qquad (2.6.191) $$
Similarly, the supremum of the joint density over the set Ω_1 is attained when the parameters are the MLE's under the Laplace model:
$$ \hat\theta_n^L = \tilde x_n \quad\text{(the sample median)}, \qquad (2.6.192) $$
$$ \hat\sigma_n^L = \frac{\sqrt{2}}{n}\sum_{i=1}^{n}|x_i - \tilde x_n|. \qquad (2.6.193) $$
Thus, the likelihood ratio (2.6.189) becomes
$$ \frac{\prod_{i=1}^{n}f(x_i;\hat\theta_n^N,\hat\sigma_n^N,2)}{\max\left\{\prod_{i=1}^{n}f(x_i;\hat\theta_n^N,\hat\sigma_n^N,2),\ \prod_{i=1}^{n}f(x_i;\hat\theta_n^L,\hat\sigma_n^L,1)\right\}}. \qquad (2.6.194) $$
The substitution of the density (2.6.187) (where c_2 = 1/√(2π), b_2 = 1/2 for the normal and c_1 = 1/√2, b_1 = √2 for the Laplace) and the statistics (2.6.190), (2.6.191), (2.6.192), and (2.6.193) into (2.6.194) results in the following expression for the likelihood ratio:
$$ \frac{1}{\max\left\{1,\ \left[\dfrac{\pi n}{2e}\,\dfrac{\sum(x_i-\bar x_n)^2}{\left(\sum|x_i-\tilde x_n|\right)^2}\right]^{n/2}\right\}}. \qquad (2.6.195) $$
Thus, the likelihood ratio test rejects H_0 if
$$ V_n = \frac{\frac{1}{n}\sum|x_i-\tilde x_n|}{\sqrt{\frac{1}{n-1}\sum(x_i-\bar x_n)^2}} < C, \qquad (2.6.196) $$
where C is chosen to produce the required size of the test.
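The statistic V_n of (2.6.196) is straightforward to compute; the sketch below also illustrates the separation the test exploits, since V_n concentrates near √(2/π) ≈ 0.798 under normality and near 1/√2 ≈ 0.707 under the Laplace law (simulated data, not from the book):

```python
import math, random

def v_statistic(x):
    """V_n of (2.6.196): mean absolute deviation about the median divided
    by the (n-1)-denominator standard deviation about the mean."""
    n = len(x)
    xs = sorted(x)
    med = xs[n // 2] if n % 2 == 1 else 0.5*(xs[n//2 - 1] + xs[n//2])
    mean = sum(x)/n
    mad = sum(abs(xi - med) for xi in x)/n
    sd = math.sqrt(sum((xi - mean)**2 for xi in x)/(n - 1))
    return mad/sd

rng = random.Random(3)
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(2000)]
laplace_sample = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(2000)]
vN = v_statistic(normal_sample)
vL = v_statistic(laplace_sample)
print(vN, vL)   # small values of V_n favor the Laplace alternative
```

The critical constant C sits between these two limits, which is why the test rejects normality for small V_n.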
Remark 2.6.20 A similar test for normality, based on the ratio
$$ \frac{\sum|x_i-\bar x_n|}{\sqrt{\sum(x_i-\bar x_n)^2}}, \qquad (2.6.197) $$
was proposed by Geary (1935) and investigated by Pearson (1935). (Note that here the sample mean is used when calculating the mean deviation.)
The test (2.6.196) is not a uniformly most powerful (UMP) test [unless n = 1, see Rohatgi (1984)]. However, as shown by Uthoff (1973), there exists a most powerful scale and location invariant test for (2.6.185), which is asymptotically equivalent to but different from the likelihood ratio test (2.6.196). This test rejects H_0 if
$$ B_n V_n < k, $$
where V_n is given in (2.6.196) and B_n is a certain function of the order statistics [see Uthoff (1973) for details]. On the other hand, in case θ is known (and for convenience set to zero), the likelihood ratio and the most powerful scale and location invariant tests are both equivalent to rejecting H_0 when
$$ \frac{\sum|x_i|}{\sqrt{\sum x_i^2}} < C, \qquad (2.6.198) $$
see Hogg (1972).
The approximate critical region of the test (2.6.196) may be based on the asymptotic distribution of the test statistic in (2.6.196). It was shown in Uthoff (1973) that if the underlying probability distribution is symmetric and absolutely continuous with a finite fourth moment and with a density f continuous in the neighborhood of the median, then the statistic V_n (as well as B_nV_n) is asymptotically normal with the mean ν_1ν_2^{−1/2} and the variance
$$ \frac{1}{n}\left[1 - \frac{\nu_1\nu_3}{\nu_2^2} + \frac{1}{4}\,\frac{\nu_1^2}{\nu_2}\left(\frac{\nu_4}{\nu_2^2} - 1\right)\right], \qquad (2.6.199) $$
where ν_i = E|X − m|^i and m is the median of f. Thus, under H_0, where the distribution is normal, the distribution of V_n is approximately normal with a mean of 0.798 and a variance of 0.045/n [Uthoff (1973)].
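The normal-theory values 0.798 and 0.045/n quoted above follow from (2.6.199) with the absolute central moments ν_i of the standard normal law; a quick check:

```python
import math

# absolute central moments nu_i = E|X - m|^i of the standard normal law
nu1 = math.sqrt(2.0/math.pi)          # E|Z|
nu2 = 1.0                             # E|Z|^2
nu3 = 2.0*math.sqrt(2.0/math.pi)      # E|Z|^3
nu4 = 3.0                             # E|Z|^4

mean = nu1/math.sqrt(nu2)

def variance(n):
    # asymptotic variance of V_n from (2.6.199)
    return (1.0/n)*(1.0 - nu1*nu3/nu2**2 + 0.25*(nu1**2/nu2)*(nu4/nu2**2 - 1.0))

print(mean, variance(1))   # approximately 0.798 and 0.045
```

The variance simplifies here to (1 − 3/π)/n ≈ 0.0451/n, matching the quoted 0.045/n.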
Goodness of fit tests
In this section we follow Yen and Moore (1988) and discuss two nonparametric goodness-of-fit tests for the Laplace distribution. The tests are used to determine whether, for a given random sample X_1, …, X_n, the underlying probability distribution is a CL(θ, s) distribution (with some unknown values of the parameters).
The Anderson-Darling test. The test statistic for the (modified) Anderson-Darling (AD) test is
$$ A_n^2 = -n - \frac{1}{n}\sum_{j=1}^{n}(2j-1)\left[\log F(X_{j:n};\theta,s) + \log\left(1 - F(X_{n-j+1:n};\theta,s)\right)\right], \qquad (2.6.200) $$
where F(·; θ, s) is the classical Laplace distribution function (2.1.5) and X_{j:n} is the jth order statistic connected with the given random sample [see Yen and Moore (1988)]. The values of the parameters θ and s are usually not known, and must be estimated before the test statistic (2.6.200) can be computed. Yen and Moore (1988) obtained the critical values for the above test by Monte Carlo simulations. For each n = 5(5)50, a random sample of size n was generated from the Laplace distribution, and the MLE's (2.6.15) and (2.6.34) of the parameters were substituted into (2.6.200) to obtain a value of the test statistic. The procedure was repeated 5000 times, producing an empirical distribution of the test statistic (2.6.200), from which sample quantiles approximating the critical values were obtained. Table 2.16 below, taken from Yen and Moore (1988), contains the critical values of the test statistic (2.6.200) for selected sample sizes and significance levels α.
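A direct implementation of (2.6.200) is short; the sketch below uses the classical Laplace c.d.f. and illustrative data with user-supplied parameter values (in practice the MLE's (2.6.15) and (2.6.34) would be plugged in):

```python
import math

def laplace_cdf(x, theta, s):
    # classical Laplace distribution function (2.1.5)
    if x < theta:
        return 0.5*math.exp((x - theta)/s)
    return 1.0 - 0.5*math.exp(-(x - theta)/s)

def anderson_darling(x, theta, s):
    """Modified AD statistic (2.6.200) for the CL(theta, s) hypothesis."""
    n = len(x)
    z = sorted(laplace_cdf(xi, theta, s) for xi in x)
    acc = 0.0
    for j in range(1, n + 1):
        acc += (2*j - 1)*(math.log(z[j - 1]) + math.log(1.0 - z[n - j]))
    return -n - acc/n

x = [-1.3, -0.4, -0.1, 0.2, 0.5, 0.9, 2.1]      # illustrative data
A2 = anderson_darling(x, theta=0.1, s=0.8)
print(A2)
```

The value of A_n² would then be compared with the simulated critical values of Table 2.16.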
The Cramér-von Mises test. The test statistic for the (modified) Cramér-von Mises (CvM) test is
$$ W_n^2 = \frac{1}{12n} + \sum_{j=1}^{n}\left[F(X_{j:n};\theta,s) - \frac{2j-1}{2n}\right]^2, \qquad (2.6.201) $$
where F(·; θ, s) and X_{j:n} are as before [see Yen and Moore (1988)]. As in the former test, the values of the parameters θ and s must be estimated
  n \ α   0.20    0.15    0.10    0.05    0.01
  5       0.607   0.682   0.789   0.948   1.256
  10      0.558   0.618   0.707   0.854   1.224
  15      0.611   0.686   0.801   0.989   1.409
  20      0.592   0.658   0.758   0.919   1.264
  25      0.622   0.691   0.793   0.999   1.435
  30      0.599   0.667   0.773   0.949   1.416
  35      0.628   0.698   0.800   0.975   1.457
  40      0.639   0.706   0.817   1.012   1.461
  45      0.619   0.692   0.807   0.980   1.441
  50      0.607   0.673   0.783   0.967   1.393

Table 2.16: Critical values for the modified Anderson-Darling test for the Laplace distribution, for selected values of the sample size n and significance level α.
before the test statistic (2.6.201) can be computed. Yen and Moore (1988) obtained the critical values for the above test by Monte Carlo simulations similar to those for the case of the AD test. Table 2.17 below [taken from Yen and Moore (1988)] contains the critical values of the test statistic (2.6.201) for selected sample sizes and significance levels α. Yen and Moore
  n \ α   0.20    0.15    0.10    0.05    0.01
  5       0.080   0.090   0.105   0.131   0.193
  10      0.076   0.084   0.096   0.116   0.172
  15      0.085   0.096   0.112   0.142   0.205
  20      0.082   0.092   0.104   0.128   0.186
  25      0.088   0.100   0.114   0.145   0.220
  30      0.084   0.095   0.109   0.137   0.207
  35      0.089   0.101   0.116   0.146   0.213
  40      0.092   0.104   0.121   0.148   0.222
  45      0.088   0.099   0.116   0.145   0.215
  50      0.085   0.096   0.113   0.142   0.212

Table 2.17: Critical values for the modified Cramér-von Mises test for the Laplace distribution, for selected values of the sample size n and significance level α.
(1988) tabulated the power of the two tests discussed above (at levels α = 0.01 and α = 0.05) under six different alternative hypotheses with normal, Weibull, uniform, Cauchy, gamma, and exponential distributions. The power function of the AD test was higher than that of the CvM test under the uniform, Cauchy, gamma, and exponential alternatives across all sample sizes and significance levels considered. Under the normal and Weibull alternatives, the power functions were comparable.
Neyman-Pearson test for location
In this section we shall consider two simple hypotheses about the location of the Laplace distribution when the scale is known. Namely, let X_1, …, X_n be an i.i.d. sample from the Laplace distribution CL(θ, s). We want to test
$$ H_0: \theta = \theta_1 \quad\text{against}\quad H_1: \theta = \theta_2, $$
where θ_1 and θ_2 are some known prescribed numbers.
It follows from the Neyman-Pearson Lemma that the optimal test (i.e., the most powerful test) of significance level α rejects H_0 if
$$ \frac{\prod_{i=1}^{n}f(X_i;\theta_1,s)}{\prod_{i=1}^{n}f(X_i;\theta_2,s)} < k_\alpha, $$
where k_α satisfies the equation
$$ P\left(\frac{\prod_{i=1}^{n}f(X_i;\theta_1,s)}{\prod_{i=1}^{n}f(X_i;\theta_2,s)} < k_\alpha\ \Big|\ \theta = \theta_1\right) = \alpha, \qquad (2.6.202) $$
where f(x; θ, s) is the density function of CL(θ, s).
where f(x; θ, s) is the density function of CL(θ, s).
We shall consider the case θ
2
> θ
1
, since otherwise we would rewrite the
sample as (X
1
, . . . , X
n
) replacing θ
1
and θ
2
by θ
1
and θ
2
, respec-
tively. Substituting the density f(x; θ, s) into (2.6.202) it is easy to observe
that the above testing procedure is equivalent to rejecting H
0
provided
n
X
i=1
g(X
i
) > t
α
,
where
g(x) =
(θ
2
θ
1
)/s for x < θ
1
,
2x/s (θ
2
+ θ
1
)/s for θ
1
x θ
2
,
(θ
2
θ
1
)/s for x > θ
2
.
(2.6.203)
The graph of the function g is sketched in Figure 2.6.
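The score function (2.6.203) and the resulting test statistic can be sketched as follows (the assertions just confirm the censoring of g outside [θ_1, θ_2]):

```python
def g(x, theta1, theta2, s):
    """Score function (2.6.203) of the Neyman-Pearson test, theta2 > theta1;
    it equals the log-likelihood ratio log f(x;theta2,s) - log f(x;theta1,s)."""
    if x < theta1:
        return -(theta2 - theta1)/s
    if x > theta2:
        return (theta2 - theta1)/s
    return (2*x - theta1 - theta2)/s

def T(sample, theta1, theta2, s):
    # test statistic T_n = sum_i g(X_i); H_0 is rejected for large values
    return sum(g(x, theta1, theta2, s) for x in sample)

# g is constant outside [theta1, theta2] and linear (zero at the midpoint) inside
assert g(-5.0, 0.0, 1.0, 1.0) == -1.0
assert g(5.0, 0.0, 1.0, 1.0) == 1.0
assert g(0.5, 0.0, 1.0, 1.0) == 0.0
```

The bounded shape of g is what makes the exact distribution of T_n tractable in Theorem 2.6.2 below: each summand is a truncated Laplace variable.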
In order to determine the value of t_α, we are required to solve the equation
$$ P\left(\sum_{i=1}^{n}g(X_i) > t_\alpha\ \Big|\ \theta = \theta_1\right) = \alpha. $$
This requires the knowledge of the distribution of the test statistic ∑_{i=1}^{n} g(X_i) under the H_0 hypothesis. This distribution is given in Marks et al. (1978). Below we present this result and its proof.
Figure 2.6: Function g(x) used in the Neyman-Pearson test.
Theorem 2.6.2 Let X_1, …, X_n be a random sample from the CL(θ, s) distribution. Then, under the null hypothesis θ = θ_1, the distribution of
$$ T_n = \sum_{i=1}^{n}g(X_i), $$
where g(x) is defined in (2.6.203) and θ_2 > θ_1, is given by the following c.d.f.:
$$ F_n^{(0)}(x) = \frac{1}{2^n}\left\{\sum_{k=1}^{n}\sum_{l=0}^{n-k}\sum_{r=0}^{k}\binom{n}{k}\binom{n-k}{l}(-1)^r\binom{k}{r}\,e^{-(r+l)(\theta_2-\theta_1)/s}\ \times\right. $$
$$ \times\ \left[1 - e^{-v(x)/2}\,e_{k-1}(v(x)/2)\right]u(v(x))\ + $$
$$ \left.+\ \sum_{m=0}^{n}\binom{n}{m}e^{-m(\theta_2-\theta_1)/s}\,u\bigl(x + (n-2m)(\theta_2-\theta_1)/s\bigr)\right\}, $$
where v(x) = x + (n − 2l − 2r)(θ_2 − θ_1)/s, e_k(·) is the incomplete exponential function, i.e.,
$$ e_k(z) = \sum_{i=0}^{k}\frac{z^i}{i!}, $$
and
$$ u(z) = \begin{cases} 0 & \text{for } z < 0,\\ 1 & \text{for } z \ge 0. \end{cases} $$
The expected value and the variance of T_n under H_0 are
$$ E^{(0)}(T_n) = n\left(1 - e^{-(\theta_2-\theta_1)/s} - \frac{\theta_2-\theta_1}{s}\right) $$
and
$$ \mathrm{Var}^{(0)}(T_n) = n\left(3 - 2e^{-(\theta_2-\theta_1)/s} - e^{-2(\theta_2-\theta_1)/s} - \frac{4(\theta_2-\theta_1)}{s}\,e^{-(\theta_2-\theta_1)/s}\right). $$
If θ = θ_2 (the H_1 hypothesis), the distribution of T_n is given by the c.d.f.
$$ F_n^{(1)}(x) = 1 - F_n^{(0)}(-x), $$
and in this case the expected value and the variance are given by
$$ E^{(1)}[T_n] = -E^{(0)}[T_n], \qquad \mathrm{Var}^{(1)}[T_n] = \mathrm{Var}^{(0)}[T_n]. $$
The statistic T_n is asymptotically normal, i.e.,
$$ \lim_{n\to\infty}\frac{T_n - E[T_n]}{\sqrt{\mathrm{Var}[T_n]}} \stackrel{d}{=} N(0,1). $$
Proof. Consider first the distribution of T_n under H_0. Since g(X_i) is a truncated Laplace random variable, its distribution is given by
$$ F(x) = \begin{cases} 0 & \text{for } x < -(\theta_2-\theta_1)/s,\\ F(x;\,(\theta_1-\theta_2)/s,\,2) & \text{for } -(\theta_2-\theta_1)/s \le x \le (\theta_2-\theta_1)/s,\\ 1 & \text{for } x > (\theta_2-\theta_1)/s, \end{cases} $$
where F(x; θ, s) is the c.d.f. of the CL(θ, s) distribution.
Straightforward calculations yield the following characteristic function for this truncated distribution:
$$ \phi(t) = e^{-\frac{\theta_2-\theta_1}{2s}}\left\{\cosh\left[\left(\frac{1}{2}-it\right)\frac{\theta_2-\theta_1}{s}\right] + \frac{\sinh\left[\left(\frac{1}{2}-it\right)(\theta_2-\theta_1)/s\right]}{1-2it}\right\}. $$
Consequently, the characteristic function of T_n, φ^{(0)}(t), becomes
$$ e^{-n\frac{\theta_2-\theta_1}{2s}}\left\{\cosh\left[\left(\frac{1}{2}-it\right)\frac{\theta_2-\theta_1}{s}\right] + \frac{\sinh\left[\left(\frac{1}{2}-it\right)(\theta_2-\theta_1)/s\right]}{1-2it}\right\}^{n}. $$
Expressing the hyperbolic sine and cosine in terms of complex exponentials, and using the binomial expansion of the nth power of the sum, we obtain (after rather tedious but straightforward simplifications)
$$ \phi^{(0)}(t) = \frac{1}{2^n}\left\{\sum_{k=1}^{n}\sum_{l=0}^{n-k}\sum_{r=0}^{k}\binom{n}{k}\binom{n-k}{l}(-1)^r\binom{k}{r}\,e^{-(r+l)(\theta_2-\theta_1)/s}\,\frac{e^{-it(n-2r-2l)(\theta_2-\theta_1)/s}}{(1-2it)^k}\right. $$
$$ \left.+\ \sum_{m=0}^{n}\binom{n}{m}e^{-m(\theta_2-\theta_1)/s}\,e^{-it(n-2m)(\theta_2-\theta_1)/s}\right\}. $$
Note that
$$ \psi_1(t) = \frac{e^{-it(n-2r-2l)(\theta_2-\theta_1)/s}}{(1-2it)^k} \quad\text{and}\quad \psi_2(t) = e^{-it(n-2m)(\theta_2-\theta_1)/s} $$
are, respectively, the characteristic functions of the χ² r.v. with 2k degrees of freedom (shifted by (2r+2l−n)(θ_2−θ_1)/s to the right) and the constant random variable equal to (2m−n)(θ_2−θ_1)/s. The final formula for the c.d.f. F_n^{(0)} follows from the forms of the c.d.f.'s for these two distributions. The formulas for the expected value and variance can be obtained easily by integration of the truncated Laplace random variable g(X).
The corresponding results under H_1 follow from the symmetry of the Laplace distribution. First, note the relation
$$ g(x) = -g(-(x-\theta_2)+\theta_1). $$
Thus,
$$ P(T_n \le x \mid \theta=\theta_2) = P\left(\sum_{i=1}^{n}-g(-(X_i-\theta_2)+\theta_1) \le x \,\Big|\, \theta=\theta_2\right) = P\left(\sum_{i=1}^{n}g(X_i) \ge -x \,\Big|\, \theta=\theta_1\right) = 1 - P(T_n \le -x \mid \theta=\theta_1). $$
[The second to last equality above follows from the fact that if X has the CL(θ_2, s) distribution, then Y = −(X − θ_2) + θ_1 has the CL(θ_1, s) distribution.]
The asymptotic normality is a direct consequence of the Central Limit Theorem.
The importance of the explicit formula for the test statistic in the above problem is due to the fact that the asymptotic Gaussian approximation is usually not very accurate for small and moderate sample sizes. For example, it was shown in Dadi and Marks (1987) that for sample sizes in the range from 5 to 50 the Gaussian approximation can be quite conservative, sometimes yielding a t_α-value substantially larger than its exact value (see the above-mentioned paper for numerical results).
Asymptotic optimality of the Kolmogorov-Smirnov test
The asymptotic optimality of the Kolmogorov goodness-of-fit test for the location Laplace family was studied in Nikitin (1995), who derived the following characterization of the Laplace distribution: the Kolmogorov goodness-of-fit test is locally asymptotically optimal in the Bahadur sense if and only if the underlying family of distributions consists of symmetric Laplace laws. To state this result more precisely, let us recall some basic notions from the theory of asymptotic efficiency for statistical tests.
Let us consider a location family given by the densities f_θ, θ ∈ ℝ, and let F(x; θ) be the corresponding cumulative distribution functions. Let K(θ, θ_0) be the information number, i.e., K(θ, θ_0) = E_θ log(f_θ/f_{θ_0}). The Smirnov one-sided statistics are defined as follows:
$$ D_n^{\pm} = \sup_{x\in\mathbb{R}}\pm\,[F_n(x) - F(x;0)], $$
while the Kolmogorov statistic is
$$ D_n = \sup_{x\in\mathbb{R}}|F_n(x) - F(x;0)|. $$
The statistics D_n^{±} (or D_n) are locally optimal in the Bahadur sense if and only if
$$ \lim_{n\to\infty}-\frac{1}{n}\log P_{\theta,n} = K(\theta, 0), $$
where P_{θ,n} is the observed P-value based on D_n^{±} (or D_n) under the assumption that the sample is obtained from the distribution given by f_θ.
Let G be the class of absolutely continuous densities on the real line such that for g ∈ G we have
$$ 0 < \lim_{\theta\to 0}\theta^{-2}\int\log\left[\frac{g(x+\theta)}{g(x)}\right]g(x+\theta)\,dx = \frac{1}{2}\int\frac{\left(g'(x)\right)^2}{g(x)}\,dx < \infty. $$
The following theorem was proved in Nikitin (1995, Theorem 6.3.1).
Theorem 2.6.3 Consider a location testing problem with f_θ(x) = g(x + θ). Then, the sequences of statistics D_n and D_n^+ are locally asymptotically optimal in the Bahadur sense within the class G only for the Laplace distribution, i.e., for g(x) = (1/2)e^{−|x|}. The sequence of statistics D_n^− is never optimal in the Bahadur sense in the class G.
Comparison of nonparametric tests of location
Ramsey (1971) examines eight nonparametric tests of location in a small-sample setting and investigates power functions for samples drawn from the Laplace distribution. His main conclusion is that the Mood median test, which is the asymptotically most powerful (AMP) rank test, performs poorly for alternatives that are not close to the null hypothesis.
Consider a rank sum statistic. Let X_1, X_2, …, X_m and Y_1, …, Y_n be independent random samples from populations F(x) and G(y), respectively. We test H_0: G(x) ≡ F(x) versus the location shift alternative H_A: G(x) = F(x − θ) for some θ > 0. Let δ_i (i = 1, 2, …, N = m+n) be a zero/one random variable indicating whether the ith smallest value in the combined sample is a Y.
A rank sum statistic is a linear combination
$$ T_N = \sum_{i=1}^{N}a_{N,i}\,\delta_i, $$
where the a_{N,i} (i = 1, …, N) are the so-called "scores".
When F(x) is known to belong to a family (for example, normal) that admits a UMP test, the choice of a test is clear and unique. When nothing is known about F(·) except for the information provided by the samples, one should select a nonparametric procedure with good efficiency in a wide class of distributional families.
For an intermediate situation when partial knowledge about F(x) is available, Ramsey (1971) proposes to use the Laplace distribution for the null hypothesis. It is not quite clear why this is an appropriate assumption (presumably the idea is that the data are long-tailed); however, the behavior of eight standard nonparametric tests of location under the assumption that the null distribution is Laplace is, of course, of interest on its own.
The eight nonparametric tests for the Laplace distribution investigated in Ramsey (1971) and in Conover et al. (1978) (a follow-up to the first paper) are:
1. The locally most powerful (LMP) rank test. Under the Laplace distribution the LMP scores are
$$ a_{N,i} = 2\,Pr(Z_N \le i-1) - 1, $$
where Z_N is a binomial variable with the parameters N and p = 1/2.
2. The Mood median test (M), where
$$ a_{N,i} = \mathrm{sign}(2i - N - 1), $$
and the statistic is an AMP rank test [see, e.g., Hájek (1969)].
3. The normal scores (F) test, where a_{N,i} is the expected value of the ith order statistic in a random sample of N observations from the standard normal distribution.
4. The Wilcoxon test (W) [see Wilcoxon (1945)], where a_{N,i} = i, the rank itself. Note that indicators of the Y-ranks (rather than the X-ranks) form the test statistic, so the null hypothesis is favored by the large values of the test statistic.
5. The van der Waerden (V) test [see van der Waerden (1952)], which uses quantiles of the standard normal distribution as scores.
Figure 2.7: Power functions of 8 nonparametric tests of location in the Laplace family for various values of the significance level α (0.1: top-left and middle-right; 0.05: top-right and bottom-left; 0.025: middle-left and bottom-right) and sample sizes m = 5 and n = 5 (first three graphs) and n = 4 (last three graphs). Reproduced from Conover et al. (1978). Reprinted with permission from the Journal of the American Statistical Association. Copyright 1978 by the American Statistical Association. All rights reserved.
6. The Tukey quick test (T), which counts the number of Y's exceeding
the largest X and the number of X's which are less than the smallest
Y. (If in the combined sample the largest and smallest observations
come from the same sample, then T = 0.)
7. The Neave-Tukey quick test (N) statistic, which maximizes the Tukey
statistic over subsamples in which one observation is omitted.
8. The Kolmogorov-Smirnov test (K): If $F_m(x)$ and $G_n(y)$ are the sample c.d.f.'s of the X's and the Y's, respectively, then
$$KS = \sup_x |F_m(x) - G_n(x)|.$$
Ramsey chooses sample sizes n = m = 5 and calculates the power function
of each test as a function of the location shift θ. The power functions are
of the form
$$p(\theta) = 1 + \frac{1}{a_{00}} \sum_{i=1}^{5} e^{-i\theta} \sum_{j} a_{j,i}\,\theta^{j}.$$
The results are presented in Figure 2.7. Here the power of the LMP test
($p_{LMP}$) is used as a standard, and thus the LMP test is represented by the
zero line. For the other tests, the diagrams show the differences
$$p_{\cdot}(\theta) - p_{LMP}(\theta),$$
where $p_{\cdot}(\theta)$ is the power function of the test in question.
It is quite surprising that the Mood median test (which is AMP) performs
very poorly except for a small local region in which it is an approximation
to the LMP test. Note also that the F, W, and V tests behave almost as
well as the LMP test.
This example with the Laplace distribution shows that with an unfamiliar
distributional family the cost of deriving the LMP test may sometimes not
be justified, and it serves as a warning to those who "purchase a shred of
optimality (i.e., the use of an asymptotically most powerful test) at the expense
of a large sample assumption" [Ramsey (1971)].
2.7 Exercises
In this section we present some 60 exercises of various degrees of difficulty
related to the material discussed in Chapter 2. We urge our readers to
at least skim this section, since it contains information which will enhance
their understanding of the properties of the classical symmetric Laplace
distribution.
Exercise 2.7.1 Show that the nth moment about zero of the classical
Laplace r.v. Y with density (2.1.1) is given by (2.1.18). Compare with the
corresponding result for a normal r.v. with mean θ and variance σ².
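Formula (2.1.18) is not restated here, but the moments are easy to check numerically. The sketch below (an illustration, not part of the exercise) uses the standard facts that the central moments of CL(θ, s) are $k!\,s^k$ for even $k$ and 0 for odd $k$, and compares the resulting binomial-expansion formula for $E[Y^n]$ with direct numerical integration; `moment_formula` and `moment_numeric` are our own helper names, not the book's notation.

```python
import math

def laplace_pdf(x, theta=0.0, s=1.0):
    # classical Laplace density (2.1.1): (1/2s) exp(-|x - theta|/s)
    return math.exp(-abs(x - theta) / s) / (2.0 * s)

def moment_numeric(n, theta=0.0, s=1.0, lo=-60.0, hi=60.0, steps=200_001):
    # trapezoidal rule; tails beyond |x| ~ 60 are negligible for s <= 2
    h = (hi - lo) / (steps - 1)
    total = 0.0
    for i in range(steps):
        x = lo + i * h
        w = 0.5 if i in (0, steps - 1) else 1.0
        total += w * x ** n * laplace_pdf(x, theta, s)
    return total * h

def moment_formula(n, theta=0.0, s=1.0):
    # E[Y^n] = sum_k C(n,k) theta^(n-k) E[(Y-theta)^k],
    # with central moments k! s^k (k even) and 0 (k odd)
    return sum(math.comb(n, k) * theta ** (n - k) * math.factorial(k) * s ** k
               for k in range(0, n + 1, 2))

for n in range(1, 7):
    print(n, moment_numeric(n, 0.5, 2.0), moment_formula(n, 0.5, 2.0))
```

The two columns agree to several decimal places for each n.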
Exercise 2.7.2 Show that the density function f(x; θ, s) given by (2.1.1)
has derivatives of any order, except at x = θ, where there is a cusp. Demonstrate the following explicit form of these derivatives:
$$\left(\frac{d}{dx}\right)^{n} f(x;\theta,s) = \begin{cases} (-1)^{n}\,\dfrac{1}{2}\,\dfrac{1}{s^{n+1}}\, e^{-|x-\theta|/s} & \text{if } x > \theta,\\[4pt] \dfrac{1}{2}\,\dfrac{1}{s^{n+1}}\, e^{-|x-\theta|/s} & \text{if } x < \theta. \end{cases} \quad (2.7.1)$$
Exercise 2.7.3 The Gini mean difference for the distribution of a r.v. X is
defined as
$$\gamma(X) = E|X_1 - X_2|,$$
where $X_1, X_2$ are i.i.d. copies of X. Show that if $X \sim CL(\theta, s)$ then $\gamma(X) = \frac{3}{2}s$.
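A quick Monte Carlo sanity check of the value $\frac{3}{2}s$, using the representation of a CL(0, s) variable as a scaled difference of two i.i.d. standard exponentials (Proposition 2.2.2); an illustrative sketch, not part of the exercise:

```python
import random

random.seed(0)

def laplace(s):
    # CL(0, s) as s * (E1 - E2), with E1, E2 i.i.d. standard exponential
    return s * (random.expovariate(1.0) - random.expovariate(1.0))

s = 2.0
n = 200_000
gini = sum(abs(laplace(s) - laplace(s)) for _ in range(n)) / n
print(gini)  # should be close to (3/2) * s = 3.0
```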
Exercise 2.7.4 Let X be a classical Laplace r.v. with the density f(x) =
f(x; θ, s) as in (2.1.1).
(a) Show that for θ < 0 the geometric mean of X, defined as
$$\lambda = \exp\left\{\int_0^{\infty} \log x\, f(x)\,dx\right\},$$
is
$$\lambda = \exp\left\{-\frac{1}{2}(\gamma - \log s)\,e^{\theta/s}\right\},$$
where
$$\gamma = -\int_0^{\infty} e^{-y} \log y\, dy \approx 0.5772156\ldots$$
is Euler's constant [Christensen (2000)]. What is the value of λ when
θ > 0?
(b) Calculate the harmonic mean of X, defined as
$$\eta = \left(\int_{-\infty}^{\infty} \frac{1}{x}\, f(x)\,dx\right)^{-1},$$
where the integral is understood in the Cauchy principal value sense.
Exercise 2.7.5 Let Y have a classical Laplace CL(0, s) distribution with
density f(x) = f(x; 0, s) given by (2.1.1).
(a) Verify that
$$\int_{-\infty}^{\infty} \frac{-\log f(x)}{1 + x^2}\,dx = \infty \quad (2.7.2)$$
and
$$-x f'(x)/f(x) \text{ is increasing without bound as } x \to \infty. \quad (2.7.3)$$
Recall that for a real r.v. Y whose c.d.f. is absolutely continuous with
density f, the conditions (2.7.2) (the so-called Krein condition) and (2.7.3) (the so-called Lin condition) are sufficient for the moments
$$\alpha_n = E[Y^n] = \int_{-\infty}^{\infty} x^n f(x)\,dx \quad (2.7.4)$$
to determine the distribution of Y uniquely [see Krein (1944), Stoyanov (2000)].
Thus, the CL(0, s) distribution is uniquely determined by the sequence $\{\alpha_n\}$
of its moments.
(b) Another sufficient condition for the moments (2.7.4) to determine the
distribution uniquely is the so-called Carleman condition:
$$\sum_{n=1}^{\infty} \alpha_{2n}^{-\frac{1}{2n}} = \infty, \quad (2.7.5)$$
see, e.g., Harris (1966). Does the Laplace distribution CL(0, s) satisfy the
Carleman condition?
(c) Is the general classical Laplace distribution CL(θ, s) determined uniquely
by the sequence $\{\alpha_n\}$ of its moments?
Exercise 2.7.6 Let X be a random variable with the coefficients of skewness and kurtosis $\gamma_1$ and $\gamma_2$, respectively. The quantities
$$\gamma_3 = \frac{E[(X - EX)^5]}{[E(X - EX)^2]^{5/2}} - 10\gamma_1$$
and
$$\gamma_4 = \frac{E[(X - EX)^6]}{[E(X - EX)^2]^{3}} - 15\gamma_2 - 10\gamma_1^2 - 15$$
may be viewed as generalizations of $\gamma_1$ and $\gamma_2$. Compute these quantities
for the standard Laplace and the standard normal distributions.
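Assuming the standardizing denominators above are the usual powers of the variance, the values follow from the known central moments ($k!$ for the standard classical Laplace, $(k-1)!!$ for the standard normal). A sketch (the helper names are ours):

```python
import math

def mu_laplace(k):
    # central moments of the standard classical Laplace CL(0,1): k! (k even), 0 (k odd)
    return math.factorial(k) if k % 2 == 0 else 0

def mu_normal(k):
    # central moments of N(0,1): (k-1)!! (k even), 0 (k odd)
    return math.prod(range(1, k, 2)) if k % 2 == 0 else 0

def gammas(mu):
    var = mu(2)
    g1 = mu(3) / var ** 1.5
    g2 = mu(4) / var ** 2 - 3
    g3 = mu(5) / var ** 2.5 - 10 * g1
    g4 = mu(6) / var ** 3 - 15 * g2 - 10 * g1 ** 2 - 15
    return g1, g2, g3, g4

print(gammas(mu_laplace))  # (0.0, 3.0, 0.0, 30.0)
print(gammas(mu_normal))   # (0.0, 0.0, 0.0, 0.0)
```

All four generalized coefficients vanish for the normal distribution, as they are designed to do.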
Exercise 2.7.7 In this exercise you will study the effect of rounding on
the mean and the variance of the Laplace distribution. If the values of a
continuous r.v. X are rounded into intervals of width ω, where the center of
the interval containing zero is ∆, then the values of the resulting discrete
r.v. $\tilde X$ are
$$\Delta,\ \Delta \pm \omega,\ \Delta \pm 2\omega,\ \ldots.$$
Moreover, $\tilde X = \Delta + n\omega$, $n = 0, \pm 1, \ldots$, whenever
$$\Delta + n\omega - \omega/2 \le x < \Delta + n\omega + \omega/2.$$
(a) Let X have the CL(0, s) distribution, so that E[X] = 0 and Var[X] =
σ² = 2s². Show that the probability function of the r.v. $\tilde X$ admits the
following explicit form:
$$P(\tilde X = \Delta + n\omega) = \begin{cases} \frac{1}{2}\left(e^{\omega/2s} - e^{-\omega/2s}\right) e^{-(\Delta + n\omega)/s} & \text{for } n \ge 1,\\[4pt] 1 - \frac{1}{2}\, e^{-\omega/2s}\left(e^{-\Delta/s} + e^{\Delta/s}\right) & \text{for } n = 0,\\[4pt] \frac{1}{2}\left(e^{\omega/2s} - e^{-\omega/2s}\right) e^{(\Delta + n\omega)/s} & \text{for } n \le -1. \end{cases} \quad (2.7.6)$$
(b) Derive closed form expressions for the mean and the variance of the
r.v. $\tilde X$ given by (2.7.6). Discuss the effects of rounding on the mean and
variance. You may want to follow Tricker (1984), writing ω = rσ and ∆ = aω, and
considering the behavior of the bias
$$\frac{E[\tilde X] - E[X]}{\omega}$$
and the ratio
$$V = \frac{Var[\tilde X]}{Var[X]}$$
for various values of a and r.
(c) Repeat the above for the normal distribution with mean zero and variance σ². Does the probability function of $\tilde X$ admit an explicit form in this
case? What about $E[\tilde X]$ and $Var[\tilde X]$? In which case is the effect of rounding
more severe?
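Rather than working from the closed form (2.7.6), one can also tabulate the rounded distribution directly from c.d.f. increments and inspect the mean and variance numerically; the function names below are illustrative, not the book's.

```python
import math

def laplace_cdf(x, s=1.0):
    # c.d.f. of CL(0, s)
    return 0.5 * math.exp(x / s) if x < 0 else 1.0 - 0.5 * math.exp(-x / s)

def rounded_pmf(delta, omega, s=1.0, nmax=200):
    # P(X~ = delta + n*omega) as the probability of the n-th rounding interval
    return {n: laplace_cdf(delta + n * omega + omega / 2, s)
               - laplace_cdf(delta + n * omega - omega / 2, s)
            for n in range(-nmax, nmax + 1)}

pmf = rounded_pmf(delta=0.1, omega=0.5, s=1.0)
mean = sum((0.1 + n * 0.5) * p for n, p in pmf.items())
var = sum((0.1 + n * 0.5 - mean) ** 2 * p for n, p in pmf.items())
print(sum(pmf.values()), mean, var)  # total ~ 1; compare var with Var[X] = 2s^2 = 2
```

The rounded variance exceeds 2 by roughly Sheppard's correction ω²/12 plus small oscillatory terms.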
Exercise 2.7.8 Let F and G be the d.f.'s of two continuous distributions
symmetric about $\theta_F$ and $\theta_G$, respectively. We say that F is lighter tailed
than G, denoted
$$F <_s G,$$
if the function $G^{-1}[F(x)]$ is convex for $x > \theta_F$ [see van Zwet (1964)].
(a) Show that the s-ordering defined above is location and scale invariant.
(b) Assume that $\theta_F = \theta_G = 0$ and show that if $F <_s G$ then $G(x) \le F(x)$ for x > 0. Thus, G has more probability in the tail than F does
[Hettmansperger and Keenan (1975)].
(c) Show that
$$\text{uniform} <_s \text{normal} <_s \text{logistic} <_s \text{Laplace}.$$
(d) Further, show that although we have logistic $<_s$ Cauchy, the Laplace
and the Cauchy distributions are not comparable with respect to the $<_s$
ordering [see Latta (1979), and also Balanda (1987)]. In practice, the uniform is usually referred to as light tailed, the normal and logistic as medium
tailed, and the Laplace and the Cauchy as heavy tailed, so in a sense the s-ordering corresponds to a common perception of tail heaviness. See,
e.g., Hettmansperger and Keenan (1975) for more information on ordering
of distributions by tail heaviness.
Exercise 2.7.9 Let X have the standard classical Laplace distribution
with density $p(x) = \frac{1}{2}e^{-|x|}$ ($-\infty < x < \infty$). Show that the ordinate p(X),
considered as a random variable [the so-called vertical density function, see
Troutt (1991)], has uniform distribution on (0, 1/2). Note that the same
is true for the ordinate p(X) when X has the standard exponential density $p(x) = e^{-x}$ (x > 0) (in which case we obtain the standard uniform
distribution). Investigate the corresponding case of the standard normal
distribution: derive the density of the ordinate p(X) when X is standard
normal with p.d.f. $p(x) = \frac{1}{\sqrt{2\pi}}e^{-x^2/2}$ ($-\infty < x < \infty$), which is not uniform!
Exercise 2.7.10 Let W be a standard exponential r.v. with the density
$f_W(w) = e^{-w}$, $w \ge 0$, and let Z be a standard normal random variable, independent of W, with the density
$$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}, \quad -\infty < x < \infty.$$
Show that the density of the product $X = \sqrt{2W}\,Z$ is given by the right
hand side of relation (2.2.4).
Hint: Consider the transformation $Y_1 = W$, $Y_2 = \sqrt{2W}\,Z$ and derive the
joint density of $Y_1$ and $Y_2$. Then, integrate the joint density with respect
to $y_1$ to obtain the marginal density of $Y_2$.
Exercise 2.7.11 Let W have a standard exponential distribution with
density $f_W(w) = e^{-w}$, $w \ge 0$. Show that the random variable $T = 1/\sqrt{W}$
has the density $f_T(x) = 2x^{-3}e^{-1/x^2}$, $x > 0$.
Exercise 2.7.12 Let W have a standard exponential distribution with
density $f_W(w) = e^{-w}$, $w \ge 0$. Let I be a r.v. taking values ±1 with probabilities 1/2 each, independent of W. Show that the ch.f. of $I \cdot W$ is given
by the right hand side of (2.2.9).
Exercise 2.7.13 Let $U_1, U_2, U_3, U_4$ be i.i.d. standard normal random
variables. By computing relevant characteristic functions, show that the
r.v. $X = U_1U_4 - U_2U_3$ has the standard Laplace distribution.
Hint: First, show that the ch.f. of X is $\left(E[e^{itU_1U_4}]\right)^2$ and compute this
expectation by conditioning on $U_4$.
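A simulation sketch of the determinant representation: for a standard Laplace variable, $E|X| = 1$ and $EX^2 = 2$, which the sample moments below should approximate (illustrative only).

```python
import random

random.seed(3)
g = lambda: random.gauss(0.0, 1.0)
n = 200_000
xs = [g() * g() - g() * g() for _ in range(n)]  # U1*U4 - U2*U3
mean_abs = sum(abs(x) for x in xs) / n
m2 = sum(x * x for x in xs) / n
print(mean_abs, m2)  # standard Laplace: E|X| = 1, E[X^2] = 2
```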
Exercise 2.7.14 Explain why a three dimensional extension of (2.2.13)
given by a 3 × 3 matrix does not result in a Laplace distribution or its
modifications. Investigate an n-dimensional extension.
Exercise 2.7.15 Let $\delta_1$ and $\delta_2$ be r.v.'s taking values of either zero or one
with probabilities given in Proposition 2.4.4. Let $W_1, W_2$ be i.i.d. standard
exponential r.v.'s, independent of $(\delta_1, \delta_2)$. Let X have a standard Laplace
distribution with ch.f. (2.1.7).
(a) Show that the ch.f. of cX, where $c \in (0, 1)$, is given by the first factor
of (2.4.10).
(b) Show that the ch.f. of $\delta_1 W_1 - \delta_2 W_2$ is given by the second factor of
(2.4.10).
(c) Show that the product (2.4.10) is equal to the ch.f. of X.
Exercise 2.7.16 Show that if $X_1$ and $X_2$ are i.i.d. CL(0, s) random variables, then $Y = |X_1/X_2|$ has the F-distribution with $\nu_1 = 2$ and $\nu_2 = 2$ degrees
of freedom.
Exercise 2.7.17 Show that if $Z_i$, $i = 1, 2, \ldots, 6$, are i.i.d. standard normal
r.v.'s, then
$$Y = |Z_1|\sqrt{Z_2^2 + Z_3^2} - |Z_4|\sqrt{Z_5^2 + Z_6^2}$$
has the standard classical Laplace distribution.
Exercise 2.7.18 Let $X_1, \ldots, X_n$ be i.i.d. standard classical Laplace r.v.'s.
Show that the sum $T = \sum_{j=1}^{n} X_j$ admits the random sum representation
(2.3.27) of Proposition 2.3.2.
Hint: Write the ch.f. φ(t) of the right hand side of (2.3.27) by conditioning
on I and $M_n$ to obtain
$$\varphi(t) = \frac{1}{2}\sum_{j=1}^{n}\left[\left(\frac{1}{1-it}\right)^{j} + \left(\frac{1}{1+it}\right)^{j}\right] \frac{2^{j}}{2^{2n-1}}\binom{2n-j-1}{n-1}. \quad (2.7.7)$$
Then, show that (2.7.7) coincides with $[1 + t^2]^{-n}$, which is the ch.f. of T.
Exercise 2.7.19 Let $X_1$ and $X_2$ be i.i.d. random variables with density
$f(x) = px^{p-1}$, $p > 0$, $x \in (0, 1)$ [the standard power function distribution
with parameter p, see, e.g., Johnson et al. (1994, p. 607)]. Show that the
r.v.
$$Y = p \log\frac{X_1}{X_2}$$
has the standard classical Laplace distribution.
Hint: Relate $X_1$ to the standard Pareto Type I r.v. with p.d.f. $1/x^2$, $x > 1$,
and use Proposition 2.2.4.
Exercise 2.7.20 Recall that the standard classical Laplace r.v. X has
the same distribution as the difference of two i.i.d. standard exponential
variables (see Proposition 2.2.2). Investigate whether there are any other
i.i.d. r.v.'s $V_1$ and $V_2$ such that
$$X \stackrel{d}{=} V_1 - V_2. \quad (2.7.8)$$
Proceed by writing the relation (2.7.8) in terms of ch.f.'s,
$$\frac{1}{1 + t^2} = \psi_{V_1}(t)\,\psi_{V_1}(-t), \quad (2.7.9)$$
where $\psi_{V_1}$ is the ch.f. of $V_1$, and note that the ch.f.
$$\psi_{V_1}(t) = (1 - it)^{-\alpha}(1 + it)^{\alpha - 1}, \quad 0 \le \alpha \le 1,$$
is a solution of (2.7.9). What is the corresponding r.v. $V_1$? Are there any
other solutions to (2.7.9)? [cf. Problem 64-13, SIAM Review 8(1) (1966),
108-110].
Exercise 2.7.21 Let $X_1, X_2, \ldots$ be i.i.d. random variables with a finite
mean µ, and let N be a positive and integer valued random variable with
a finite mean E[N]. Show that if N and the $X_i$'s are independent, then the
mean of the random sum $\sum_{i=1}^{N} X_i$ is equal to the product µE[N].
Exercise 2.7.22 Define $f_n(t) = [\varphi(\sqrt{t}\,)]^{1/n}$ for t > 0 and $n = 1, 2, \ldots$,
where φ is a real-valued characteristic function. If the function $f_n$ is completely monotone on $(0, \infty)$ for each n [that is, $(-1)^k f_n^{(k)}(t) \ge 0$ for t > 0,
$k = 0, 1, \ldots$], then the ch.f. φ is infinitely divisible [Kelker (1971)]. Apply
the above result to the ch.f. of the standard classical Laplace distribution
to establish its infinite divisibility.
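For the Laplace ch.f. $\varphi(t) = (1+t^2)^{-1}$ the verification can be written out directly (a short sketch):
\[
f_n(t) = [\varphi(\sqrt{t}\,)]^{1/n} = (1+t)^{-1/n},
\qquad
(-1)^k f_n^{(k)}(t) = \frac{1}{n}\left(\frac{1}{n}+1\right)\cdots\left(\frac{1}{n}+k-1\right)(1+t)^{-1/n-k} \ge 0,
\]
so each $f_n$ is completely monotone on $(0,\infty)$ and Kelker's criterion applies.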
Exercise 2.7.23 Suppose that $X_1$ and $X_2$ are i.i.d. classical Laplace r.v.'s
with p.d.f. (2.1.1), where θ = 0 and s > 0. Let
$$\overline{X}_2 = \frac{1}{2}(X_1 + X_2) \quad \text{and} \quad S_2^2 = (X_1 - \overline{X}_2)^2 + (X_2 - \overline{X}_2)^2.$$
Show that the p.d.f. of the t-statistic (2.3.40) with n = 2 is given by (2.3.51).
Exercise 2.7.24 Let $X_1, \ldots, X_n$ be a random sample from the classical
Laplace distribution CL(θ, s).
(a) Show that the distribution of the t-type statistic $\tilde T_n$ given by (2.3.57) is
concentrated on the interval [−1, 1] and does not depend on the parameters
θ and s.
(b) Show that the distribution function of the statistic $\tilde T_n$ is given by
(2.3.60).
(c) Investigate the distribution of another analog of the t-distribution, the
statistic
$$\frac{\sum_{i=1}^{n}(X_i - \theta)}{\sum_{i=1}^{n}|X_i - \hat\theta_n|},$$
where $\hat\theta_n$ is the sample median of the $X_i$'s.
Exercise 2.7.25 Let
$$g_n(t) = \frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\,\Gamma\left(\frac{n}{2}\right)}\left(1 + \frac{t^2}{n}\right)^{-(n+1)/2}, \quad -\infty < t < \infty,$$
be the density of the t-distribution with n degrees of freedom, and let $f_{T_n}$
be the density (2.3.56) of the t-statistic (2.3.40) based on a random sample
of size n from the classical Laplace distribution with density (2.1.1) with
θ = 0. Investigate the behavior of the ratio
$$\gamma_n(t) = f_{T_n}(t)/g_n(t)$$
as $t \to \infty$. Specifically, show that $\gamma_n(t)$ is monotonically increasing to infinity for $t \in (t_0, \infty)$ for some $t_0 > 0$. Conclude that the tails of the density $f_{T_n}$
are heavier than those of the Student t-density $g_n$. What are the implications when one uses the critical points of the t-distribution when calculating
the Type I error probabilities, the power function, or the confidence levels
connected with samples from the Laplace distribution?
Exercise 2.7.26 Compare products and ratios of two independent Laplace
random variables with products and ratios of two independent normal random variables.
Exercise 2.7.27 Let $X_1, X_2, X_3, X_4$ be independent standard classical Laplace
random variables. Find the p.d.f.'s of the following functions of them:
$$\frac{X_3}{\sqrt{(X_1^2 + X_2^2)/2}}, \qquad \frac{2X_3^2}{X_1^2 + X_2^2}, \qquad \frac{X_1^2 + X_2^2}{X_3^2 + X_4^2}.$$
Exercise 2.7.28 If $X_1$ has the density
$$f_1(x) = \begin{cases} \frac{1}{2a}, & -a < x < a,\\ 0, & \text{otherwise},\end{cases}$$
and $X_2$ has the density
$$f_2(x_2) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\frac{x_2^2}{\sigma^2}}, \quad -\infty < x_2 < \infty,$$
and $X_1$ and $X_2$ are independent, then $Y = X_1X_2$ has the density
$$h(y) = \frac{1}{2a\sigma\sqrt{2\pi}}\, E_1\!\left(\frac{y^2}{2a^2\sigma^2}\right), \quad -\infty < y < \infty,$$
where
$$E_1(x) = \int_x^{\infty} \frac{e^{-t}}{t}\,dt, \quad x > 0,$$
is the exponential integral. What is the corresponding result when $X_2$ is
replaced by a Laplace r.v. with mean zero and scale parameter σ?
Exercise 2.7.29 Let $B_n$ have the beta distribution with parameters 1 and n,
with density given by (2.2.45). Show that as $n \to \infty$, the sequence $nB_{n-1}$
converges in distribution to a standard exponential random variable.
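A one-line sketch (here $B_{n-1}$ has density $(n-1)(1-x)^{n-2}$ on (0, 1), the Beta(1, n−1) case of (2.2.45)): for fixed x > 0,
\[
P(nB_{n-1} > x) = \left(1 - \frac{x}{n}\right)^{n-1} \longrightarrow e^{-x},
\]
which is the survival function of the standard exponential distribution.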
Exercise 2.7.30 Show that if in Proposition 2.4.7 the condition (2.4.27)
is replaced by
$$E|X - \theta| = c > 0 \quad \text{for } X \in \mathcal{C},$$
then the maximum entropy is attained by the classical Laplace distribution
with density $f(x) = \frac{1}{2c}e^{-|x-\theta|/c}$ [Kapur (1993)].
Exercise 2.7.31 (a) Consider a location family with density
$$f(x - \theta), \quad -\infty < x, \theta < \infty, \quad (2.7.10)$$
where f is the standard classical Laplace density $f(x) = \frac{1}{2}e^{-|x|}$. Show that
the Fisher information I(θ), given by
$$I(\theta) = \int_{-\infty}^{\infty} \frac{[f'(y)]^2}{f(y)}\,dy, \quad (2.7.11)$$
is equal to one. Compare it with the corresponding values of I(θ) when f
is the standard normal, standard logistic, and standard Cauchy density.
(b) Now consider a location-scale family with density (2.6.1). Using the
relations (2.6.9) - (2.6.11), show that the Fisher information matrix is given
by (2.6.12).
(c) Show that for a location-scale family of L(θ, σ) distributions given by
the density (2.1.3), the Fisher information matrix is
$$\begin{pmatrix} \dfrac{2}{\sigma^2} & 0\\[4pt] 0 & \dfrac{1}{\sigma^2} \end{pmatrix}.$$
(d) What is the corresponding Fisher information matrix when f in (2.6.1)
is the standard normal density $f(x) = \frac{1}{\sqrt{2\pi}}e^{-x^2/2}$?
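The comparison asked for in part (a) can be sketched numerically (the known values are 1 for the Laplace and normal, 1/3 for the logistic, 1/2 for the Cauchy); the densities and derivatives below are standard, but the function names are ours.

```python
import math

def fisher_info(f, df, lo=-40.0, hi=40.0, steps=100_001):
    # I = integral of f'(x)^2 / f(x) via the trapezoidal rule
    h = (hi - lo) / (steps - 1)
    total = 0.0
    for i in range(steps):
        x = lo + i * h
        w = 0.5 if i in (0, steps - 1) else 1.0
        fx = f(x)
        if fx > 0.0:  # guard against underflow of far tails
            total += w * df(x) ** 2 / fx
    return total * h

laplace = (lambda x: 0.5 * math.exp(-abs(x)),
           lambda x: -math.copysign(1.0, x) * 0.5 * math.exp(-abs(x)))
normal = (lambda x: math.exp(-x * x / 2) / math.sqrt(2 * math.pi),
          lambda x: -x * math.exp(-x * x / 2) / math.sqrt(2 * math.pi))
logistic = (lambda x: math.exp(-x) / (1 + math.exp(-x)) ** 2,
            lambda x: (math.exp(-x) / (1 + math.exp(-x)) ** 2)
                      * (math.exp(-x) - 1) / (math.exp(-x) + 1))
cauchy = (lambda x: 1 / (math.pi * (1 + x * x)),
          lambda x: -2 * x / (math.pi * (1 + x * x) ** 2))

for name, (f, df) in [("Laplace", laplace), ("normal", normal),
                      ("logistic", logistic), ("Cauchy", cauchy)]:
    print(name, fisher_info(f, df))
```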
Exercise 2.7.32 Let $X_1, \ldots, X_n$ be a random sample of size n = 2k + 1
from the classical Laplace location family with density (2.6.125), and let
$\hat\theta_n = X_{k+1:n}$ be the sample median with the density given by (2.6.126) -
(2.6.127).
(a) Following (2.6.128), show that the Fisher information about θ contained
in $\hat\theta_n$ is given by (2.6.129).
(b) Show that the amount of Fisher information lost when using $\hat\theta_n$ is given
by (2.6.130).
(c) Show that the loss (2.6.130) converges to infinity as $k \to \infty$.
(d) Establish the asymptotic relation (2.6.131).
Exercise 2.7.33 Given a random sample $X_1, \ldots, X_n$ (from a continuous
distribution with density f and distribution function F) and a score function J(u), 0 < u < 1 (corresponding to a one-sample linear rank test of
symmetry), the R-estimator of the location parameter θ is defined as the
solution of
$$\sum_{i=1}^{n} \mathrm{sign}(X_i - \theta)\, J^{+}\!\left(\frac{R(|X_i - \theta|)}{n + 1}\right) = 0, \quad (2.7.12)$$
where
$$J^{+}(u) = J(1/2 + u/2)$$
and R(w) is the rank of w [see, e.g., Hall and Joiner (1983)]. Under some
regularity conditions, the efficient score function [corresponding to the
asymptotically most powerful rank test, see, e.g., Hájek (1969)] is
$$J(u) = -\frac{f'(F^{-1}(u))}{f(F^{-1}(u))}. \quad (2.7.13)$$
(a) Show that if the sample is from the CL(θ, 1) distribution, then the
efficient score function (2.7.13) is
$$J(u) = \mathrm{sign}(u - 1/2) \quad (2.7.14)$$
(so that the corresponding asymptotically most powerful rank test is the
sign test).
(b) Show that if the score function is given by (2.7.14), then the R-estimator
of location given by (2.7.12) is the sample median.
(c) What is the most efficient score function (and the corresponding asymptotically most powerful rank test) if the underlying distribution is normal?
(d) What is the most efficient score function (and the corresponding asymptotically most powerful rank test) under the underlying logistic distribution?
Exercise 2.7.34 Let $X_1, \ldots, X_n$ be a random sample from the density
$$f(x; \theta) = \frac{1}{2}\, e^{-|x-\theta|}, \quad -\infty < x < \infty, \quad -\infty < \theta < \infty. \quad (2.7.15)$$
Use calculus to show that the MLE of θ is the sample median. Proceed by
writing the log-likelihood as
$$\psi(\theta) = -n \log 2 - \sum_{i=1}^{n} \{(x_i - \theta)^2\}^{1/2}, \quad (2.7.16)$$
and then by taking the derivative with respect to θ to find the intervals
where ψ is increasing and decreasing.
Exercise 2.7.35 The following example was derived in Rao and Ghosh
(1971). Consider the location family $\{f(x - \theta),\ \theta \in R\}$, where
$$f(y) = \begin{cases} e^{k_1 - \alpha_1|y|}, & \text{for } 0 \le |y| \le c_1,\\ e^{k_2 - \alpha_2|y|}, & \text{for } c_1 \le |y| < \infty, \end{cases} \quad (2.7.17)$$
where, for continuity, we have $(\alpha_2 - \alpha_1)c_1 + k_1 = k_2$, and
$$0 < \alpha_1 < \alpha_2 < 2\alpha_1. \quad (2.7.18)$$
The constants $k_1, k_2, \alpha_1, \alpha_2$, and $c_1$ are such that f is a valid probability
density on $(-\infty, \infty)$.
(a) Show that if g is a convex and symmetric function on R, then for any
$x_1, x_2 \in R$, the function
$$h(\theta) = g(x_1 - \theta) + g(x_2 - \theta) \quad (2.7.19)$$
is minimized for $\theta^* = \frac{x_1 + x_2}{2}$.
(b) Let $X_1, X_2$ be a random sample of size n = 2 from the density $f(x - \theta)$.
Apply part (a) to the function
$$g(x) = -\log f(x) \quad (2.7.20)$$
to show that the likelihood function is maximized when θ is set to the
sample median.
(c) Let $y_1 < y_0 < y_2$ be a random sample of size n = 3 from $f(x - \theta)$.
Assuming that $y_0 = 0$, write the negative log-likelihood function and show
that it is convex. Further, show that for θ near zero the negative of the
log-likelihood function is minimized by θ = 0 (the sample median). Argue that
the global minimum exists, and is also attained at θ = 0. Thus, we have a
non-Laplace distribution such that the MLE of the location parameter for
sample sizes n = 2, 3 is the sample median.
Exercise 2.7.36 Let f be the skewed Laplace density given by (2.6.18).
(a) Show that the function f is a probability density on $(-\infty, \infty)$ if $c = (1/b_1 + 1/b_2)^{-1}$.
(b) Let n be odd, and let $X_1, \ldots, X_n$ be a random sample of size n from
the distribution with density $f(x - \theta)$, where f is the density (2.6.18) with
the constants $b_1$ and $b_2$ such that
$$b_1 \ge \frac{n - 1}{n + 1}\, b_2 \quad \text{and} \quad b_2 \ge \frac{n - 1}{n + 1}\, b_1. \quad (2.7.21)$$
Show that every median of $X_1, \ldots, X_n$ is the MLE of θ.
(c) Let n be odd and let $b_1 > 0$ and $b_2 = \frac{n+2}{n}\, b_1$. Show that the above $b_1$
and $b_2$ satisfy the conditions (2.7.21).
(d) In view of the above results, show that the condition (v) preceding
Proposition 2.6.3 (see Section 2.6.1) is not enough to conclude that the
population is Laplace [Findeisen (1982)].
Exercise 2.7.37 Consider the function
$$f(x) = c\,(2 + |x|)^{-1}\, e^{-|x|}, \quad -\infty < x < \infty. \quad (2.7.22)$$
(a) Argue that f with an appropriate c > 0 is a probability density function
on $(-\infty, \infty)$.
(b) Show that for every $-\infty < x, y, \theta < \infty$ we have
$$\log f(x - \theta) + \log f(y - \theta) \le \log f(0) + \log f(y - x). \quad (2.7.23)$$
(c) Using part (b), show that if $X_1$ and $X_2$ are i.i.d. with density $f(x - \theta)$,
where f is given by (2.7.22), then both $X_1$ and $X_2$ are MLE's of θ.
(d) In view of the above results, show that the condition (vi) preceding
Proposition 2.6.3 (see Section 2.6.1) is not sufficient to conclude that the
population is Laplace [Findeisen (1982)].
Exercise 2.7.38 Let $X_1, \ldots, X_n$ be a random sample from the density
(2.6.19) with a given value of α and an unknown value of θ. Show that the
MLE of θ is the empirical α-quantile of the sample (defined to be a number
$\hat\xi_\alpha$ such that at least α × 100% of the observations are less than or equal
to $\hat\xi_\alpha$, and at least (1 − α) × 100% of the observations are greater than or equal
to $\hat\xi_\alpha$).
Exercise 2.7.39 Let $X_{1:n}, \ldots, X_{n:n}$ denote the order statistics from a standard classical Laplace distribution CL(0, 1). Then, the variance of the sample median (2.6.15) is given by
$$\sigma_n^2 = \begin{cases} \dfrac{n!}{(k!)^2}\, 2^{1-k} \displaystyle\sum_{i=0}^{k} \dfrac{k!}{i!(k-i)!}\, (-2)^{-i}\, (k + 1 + i)^{-3}, & \text{for } n = 2k + 1,\\[10pt] \dfrac{n!}{[(k-1)!]^2}\, 2^{-2k} \left[\displaystyle\sum_{i=0}^{k-2} a(i, k) + \dfrac{3(-1)^{k-1} + 1}{2^{k+3}\, k^4}\right], & \text{for } n = 2k \end{cases} \quad (2.7.24)$$
(in case n = 2 the sum $\sum_{i=0}^{-1}$ should be set to zero), where
$$a(i, k) = \frac{(k-1)!}{i!(k-1-i)!}\, (-2)^{-i}\, (k - 1 - i)^{-1} \left\{(k + 1 + i)^{-3} - (2k)^{-3}\right\}. \quad (2.7.25)$$
Show that $\sigma_n^2 \to 0$ as $n \to \infty$.
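For odd n the formula can be checked against direct numerical integration of the density of the median $X_{k+1:2k+1}$ (an illustrative sketch; `median_var_formula` implements the odd-n case of (2.7.24) and the helper names are ours):

```python
import math

def F(x):
    # c.d.f. of CL(0, 1)
    return 0.5 * math.exp(x) if x < 0 else 1.0 - 0.5 * math.exp(-x)

def f(x):
    return 0.5 * math.exp(-abs(x))

def median_var_numeric(k, lo=-50.0, hi=50.0, steps=200_001):
    # Var of X_{k+1:2k+1}: integrate x^2 * C * [F(1-F)]^k * f (trapezoidal rule)
    c = math.factorial(2 * k + 1) / math.factorial(k) ** 2
    h = (hi - lo) / (steps - 1)
    total = 0.0
    for i in range(steps):
        x = lo + i * h
        w = 0.5 if i in (0, steps - 1) else 1.0
        total += w * x * x * c * (F(x) * (1.0 - F(x))) ** k * f(x)
    return total * h

def median_var_formula(k):
    # odd-n case of (2.7.24), with n = 2k + 1
    n = 2 * k + 1
    s = sum(math.comb(k, i) * (-2.0) ** (-i) * (k + 1 + i) ** (-3)
            for i in range(k + 1))
    return math.factorial(n) / math.factorial(k) ** 2 * 2.0 ** (1 - k) * s

for k in range(4):
    print(2 * k + 1, median_var_formula(k), median_var_numeric(k))
```

For k = 0, 1, 2 the formula gives 2, 23/36 ≈ 0.6389, and ≈ 0.3512, in agreement with the integration, and the values decrease toward 0.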
Exercise 2.7.40 Let $M_x$, $M_y$, and $M_{xy}$ be the sample medians (2.6.15) of
$x_1, \ldots, x_n$; $y_1, \ldots, y_m$; and $x_1, \ldots, x_n, y_1, \ldots, y_m$, respectively. Show that
$M_{xy}$ is between $M_x$ and $M_y$.
Exercise 2.7.41 Let $X_1, \ldots, X_n$ be a random sample from the standard
classical Laplace distribution, and let
$$W_n = \hat\theta_n - \log[2(1 - \beta)]\, \hat s_n,$$
where 0.5 < β < 1 and the statistics $\hat\theta_n$ and $\hat s_n$ are, respectively, the
(canonical) sample median (2.6.15) and the sample mean absolute deviation
(2.6.34) (the MLE's of the Laplace parameters). Show that if n = 2, then
the p.d.f. of $W_2$ is given by (2.6.178) with $k_\beta = \log[2(1 - \beta)]$ [Shyu and
Owen (1986a)].
Exercise 2.7.42 Let $X_1, \ldots, X_n$ be i.i.d. from the CL(θ, s) distribution,
and let $\hat\theta_n$ and $\tilde\theta_n$ be the MLE and MME of θ given by (2.6.15) and (2.6.53),
respectively.
(a) Show that if s = 1 and n = 2k + 1, then for any integer $k \ge 3$ the right
hand side of (2.6.56) satisfies the relation
$$1.51\; \frac{(2k+1)!}{(k!)^2}\, \frac{1}{2^{2k+1}}\, \sqrt{\frac{2\pi}{2k+1}}\, \left(1 + \frac{1}{2k}\right)^{3/2} \le 2. \quad (2.7.26)$$
Conclude that for $n = 2k + 1 \ge 7$ the variance of $\hat\theta_n$ is less than the variance
of $\tilde\theta_n$ (which is 2/n).
(b) Investigate the corresponding case when the sample size is even.
Exercise 2.7.43 Let $X_1, \ldots, X_n$ be i.i.d. from the CL(0, s) distribution,
and let $\hat s_n$ and $\tilde s_n$ be the MLE and MME of s given by (2.6.20) and (2.6.58),
respectively.
(a) We saw in Proposition 2.6.4 that $\hat s_n$ is unbiased for s. Investigate
whether this property is shared by $\tilde s_n$.
(b) Asymptotically, the variance of $\hat s_n$ is smaller than that of $\tilde s_n$. For $n \ge 1$,
derive the variances of $\hat s_n$ and $\tilde s_n$ and examine which one is larger.
Exercise 2.7.44 Consider a Type-II censored sample (2.6.38) from the
classical Laplace distribution and the corresponding likelihood function
(2.6.40).
(a) Show that the likelihood function is continuous in θ for any fixed s > 0.
(b) Show that for any fixed s > 0 the likelihood function is monotonically
increasing in θ for $\theta \in (-\infty, x_{r+1:n})$ and monotonically decreasing in θ for
$\theta \in (x_{n-r:n}, \infty)$.
(c) Show that for $\theta \in [x_{r+1:n}, x_{n-r:n}]$ and for any fixed s > 0 the likelihood
function is maximized by the sample median of $x_{r+1:n}, \ldots, x_{n-r:n}$.
(d) Show that the MLE of θ is the sample median.
(e) Show that when we substitute the sample median $\hat\theta_n$ into the likelihood
function (2.6.40) we obtain the function g given by (2.6.41) - (2.6.42).
(f) Show that the function g is maximized by s = C/(n − 2r) and deduce
that the MLE of s is given by (2.6.43).
(g) Investigate the case of Type-II right censored samples and general Type-II censored samples.
Exercise 2.7.45 Let $X_1, \ldots, X_n$ be i.i.d. with the CL(0, s) distribution.
(a) Show that
$$\delta_1 = \frac{1}{2n} \sum_{i=1}^{n} X_i^2 \quad (2.7.27)$$
is an unbiased and consistent but not efficient estimator of the parameter
s².
(b) Show that under the loss function of the form
$$L(\delta, s^2) = f(s^2)\,(\delta - s^2)^2, \quad (2.7.28)$$
where f is an arbitrary positive function, the risk of the estimators of the
form
$$\delta_\alpha = \frac{\alpha}{n} \sum_{i=1}^{n} X_i^2 \quad (2.7.29)$$
is minimized for
$$\alpha^* = \frac{n}{2(5 + n)} \quad (2.7.30)$$
[Jakuszenkow (1978)]. Is the resulting estimator consistent for s²? Compare
the variances of $\delta_1$ and $\delta_{\alpha^*}$.
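A simulation sketch comparing $\delta_1$ (which is $\delta_\alpha$ with α = 1/2) and $\delta_{\alpha^*}$; a routine computation from $EX^2 = 2s^2$ and $EX^4 = 24s^4$ gives squared-error risks $5s^4/n$ and $5s^4/(n+5)$, respectively, which the sample mean squared errors below should approximate (illustrative only).

```python
import random

random.seed(4)
s, n, reps = 1.0, 5, 100_000
alpha_star = n / (2.0 * (5 + n))  # = 0.25 for n = 5

def laplace(s):
    # CL(0, s) as a scaled difference of standard exponentials
    return s * (random.expovariate(1.0) - random.expovariate(1.0))

mse1 = mse_star = 0.0
for _ in range(reps):
    t = sum(laplace(s) ** 2 for _ in range(n))
    mse1 += (t / (2.0 * n) - s ** 2) ** 2
    mse_star += (alpha_star * t / n - s ** 2) ** 2
print(mse1 / reps, mse_star / reps)  # theoretical: 5/n = 1.0 and 5/(n+5) = 0.5
```

Shrinking by $\alpha^*$ trades a little bias for a substantially smaller mean squared error.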
Exercise 2.7.46 Consider the mixture of two Laplace distributions with
density (2.6.81). Show that
(a) If $\theta_1 = \theta_2 = \theta$, then for any 0 < p < 1, the distribution is unimodal
with the mode at θ.
(b) If $\theta_1 < \theta_2$ and
$$\frac{s_1^2}{s_1^2 + s_2^2}\, e^{(\theta_1 - \theta_2)/s_2} < p < \frac{s_1^2}{s_1^2 + s_2^2}\, e^{(\theta_1 - \theta_2)/s_1},$$
then the distribution is bimodal with the modes at $\theta_1$ and $\theta_2$.
(c) If $\theta_1 < \theta_2$ and
$$0 < p < \frac{s_1^2}{s_1^2 + s_2^2}\, e^{(\theta_1 - \theta_2)/s_2},$$
then the distribution is unimodal with the mode at $\theta_2$.
(d) If $\theta_1 < \theta_2$ and
$$\frac{s_1^2}{s_1^2 + s_2^2}\, e^{(\theta_1 - \theta_2)/s_1} < p < 1,$$
then the distribution is unimodal with the mode at $\theta_1$. [Kacki and Krysicki
(1967).]
Exercise 2.7.47 Let Y have a classical Laplace distribution with density
(2.1.1), so that
$$Y \stackrel{d}{=} \theta + sX, \quad (2.7.31)$$
where X has the CL(0, 1) distribution. Then, the mixture on θ of the distribution of Y is the type I compound Laplace distribution with parameters
µ, σ, and s, if θ in (2.7.31) has the normal distribution with mean µ and
variance σ² [see, e.g., Johnson et al. (1995)]. Show that the p.d.f. of this
distribution is
$$f(x) = C\left[\Phi\left(\frac{x - \mu}{\sigma} - \frac{\sigma}{s}\right) e^{-(x-\mu)/s} + \Phi\left(-\frac{x - \mu}{\sigma} - \frac{\sigma}{s}\right) e^{(x-\mu)/s}\right],$$
where Φ is the c.d.f. of the standard normal distribution,
$$C = \frac{1}{2s}\, e^{\frac{1}{2}\left(\frac{\sigma}{s}\right)^2},$$
and $-\infty < x < \infty$, $-\infty < \mu < \infty$, σ > 0, and s > 0.
Exercise 2.7.48 Let Y have a classical Laplace distribution with density
(2.1.1) and representation (2.7.31). The mixture on 1/s of the distribution
of Y is the type II compound Laplace distribution with parameters θ, α,
and β if 1/s in (2.7.31) has the Γ(α, β) distribution with density
$$f_{\alpha,\beta}(x) = \frac{x^{\alpha-1}\, e^{-x/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)}, \quad \alpha > 0,\ \beta > 0,\ x > 0,$$
[see, e.g., Johnson et al. (1995)].
(a) Show that the p.d.f. and the c.d.f. of this distribution are
$$f(x) = \frac{1}{2}\, \alpha\beta\, [1 + |x - \theta|\beta]^{-(\alpha+1)}, \quad \alpha > 0,\ \beta > 0,\ -\infty < x < \infty, \quad (2.7.32)$$
and
$$F(x) = \begin{cases} \frac{1}{2}\, [1 + |x - \theta|\beta]^{-\alpha}, & \text{for } x < \theta,\\[4pt] 1 - \frac{1}{2}\, [1 + |x - \theta|\beta]^{-\alpha}, & \text{for } x \ge \theta, \end{cases}$$
respectively. Note that for θ = 0, α = 1, and $\beta = s_2/s_1$, the density (2.7.32)
coincides with that of the ratio of two independent, mean zero, classical
Laplace r.v.'s with scale parameters $s_1 > 0$ and $s_2 > 0$, respectively (see
Section 2.3.3).
(b) Further, show that as $\alpha \to \infty$ and $\beta \to 0$ with $\alpha\beta = 1/s > 0$,
f(x) in (2.7.32) converges to the classical Laplace density (2.1.1). [The relation between Laplace distributions and distributions with densities given
by (2.7.32) is analogous to that between normal and Pearson Type VII
distributions; see, e.g., Johnson et al. (1995).]
Exercise 2.7.49 Let Y have the type II compound Laplace distribution
with density (2.7.32).
(a) Show that for α > 1 the mean of Y is equal to θ, and for α > 2 the
variance of Y is
$$\sigma^2 = \frac{2\beta^{-2}}{(\alpha - 1)(\alpha - 2)}, \quad \alpha > 2.$$
Note that the distribution is symmetric about θ, so that (for α > 1) we
have: median = mean = mode.
(b) More generally, show that the moments of order α or greater do not
exist, while for 0 < r < α we have
$$E(X - \theta)^r = \begin{cases} \alpha\beta^{-r} \displaystyle\sum_{j=0}^{r} (-1)^j \frac{r!}{j!(r-j)!}\, (\alpha + j - r)^{-1}, & \text{for } r \text{ even},\\[8pt] 0, & \text{for } r \text{ odd}. \end{cases}$$
(c) Show that the mean deviation is $\beta^{-1}/(\alpha - 1)$ (for α > 1). Derive an
expression for the ratio Mean deviation/Standard deviation and compare with
the corresponding value for the Laplace distribution.
(d) Show that the coefficient of kurtosis, defined in (2.1.22), is given by
$$\gamma_2 = \frac{6(\alpha - 1)(\alpha - 2)}{(\alpha - 3)(\alpha - 4)} - 3, \quad \alpha > 4.$$
What is the range of $\gamma_2$? How does the $\gamma_2$ above compare with the corresponding value for the Laplace distribution? Is the type II compound Laplace
distribution leptokurtic ($\gamma_2 > 0$) or platykurtic ($\gamma_2 < 0$)?
Exercise 2.7.50 Let $Y_1, \ldots, Y_n$ be i.i.d. normal variables with mean µ
and variance σ². Assume that the variance is a constant, while the mean
is a random variable with the Laplace L(θ, η) prior distribution (so that µ
has the mean and variance equal to θ and η², respectively). Let $\overline Y$ be the
corresponding sample mean and let f be the marginal density of $\overline Y$.
(a) Show that
$$f(y) = \frac{1}{\sqrt{2}\,\eta}\, e^{(\sigma/\eta)^2/n}\, \{F(z) + F(-z)\}, \quad (2.7.33)$$
where
$$z = \frac{\sqrt{n}}{\sigma}(y - \theta), \quad F(z) = e^{-b^* z}\, \Phi(z - b^*), \quad b^* = \frac{\sigma}{\eta}\sqrt{\frac{2}{n}},$$
and Φ is the c.d.f. of the standard normal distribution.
(b) Determine the posterior p.d.f. of µ given $\overline Y = y$. Show that the posterior
mean and variance are
$$E(\mu\,|\,\overline Y = y) = w(z)\left(y - \frac{\sigma}{\sqrt{n}}\, b^*\right) + (1 - w(z))\left(y + \frac{\sigma}{\sqrt{n}}\, b^*\right)$$
and
$$Var(\mu\,|\,\overline Y = y) = \frac{\sigma^2}{n} - \frac{4\sigma^4}{n^2\eta^2}\, H(z),$$
respectively, where
$$w(z) = \frac{F(z)}{F(z) + F(-z)}, \qquad H(z) = \frac{[F(z) + F(-z)]\, g(z) - 2F(z)F(-z)}{[F(z) + F(-z)]^2},$$
and
$$g(z) = e^{-b^* z}\, \phi(z - b^*).$$
(The function φ above denotes the standard normal p.d.f.)
(c) Investigate the dependence of the posterior mean and variance on y.
How do they change as y varies from $-\infty$ to $\infty$? Does the posterior variance
attain a minimum value for some y? Is the posterior distribution symmetric
or a skewed one? What happens to the posterior distribution as $y \to \infty$?
[Mitchell (1994)].
Exercise 2.7.51 Let X have a normal distribution with variance equal to
one and with a random mean µ having the Laplace distribution CL(0, η)
(the Laplace prior).
(a) Using the previous exercise, show that
$$E(\mu|X) = X - h(X)\,\eta,$$
where
$$h(x) = \frac{1 - e^{2cx}\,\psi(x)}{1 + e^{2cx}\,\psi(x)}, \qquad \psi(x) = \frac{\Phi(-x - c)}{\Phi(x - c)},$$
and Φ is the standard normal distribution function.
(b) Show that h is a monotonically increasing and odd function from
$(-\infty, \infty)$ onto (−1, 1) with h(0) = 0.
(c) A prior for µ is said to be neutral if the median of µ is 0 while the median
of $\mu^2$ is 1. Show that the above Laplace prior is neutral for η = log 2.
(d) Show that the risk of $\hat\mu(X)$, defined as
$$E[(\hat\mu(X) - \mu)^2\,|\,\mu],$$
is a bounded function of µ.
Magnus (2000) refers to $\hat\mu(X) = X - h(X) \log 2$ as the neutral Laplace
estimator of the mean µ. Further properties of $\hat\mu(X)$ can be found in the
above paper.
Exercise 2.7.52 Let $X_1, X_2, X_3$ be i.i.d. logistic random variables with
the distribution function
$$F(x) = (1 + e^{-x})^{-1}, \quad -\infty < x < \infty, \quad (2.7.34)$$
and let Y be a standard classical Laplace variable with p.d.f. (2.1.2).
(a) Show that
$$X_{2:3} + Y \stackrel{d}{=} X_1, \quad (2.7.35)$$
where $X_{2:3}$ is the 2nd order statistic (the sample median) of the $X_i$'s.
The above result involving the Laplace distribution is actually a characterization of the logistic distribution [see George and Mudholkar (1981)].
If $Y \sim CL(0, 1)$ and the relation (2.7.35) holds, then, under some technical conditions on the distribution of $X_1$, the c.d.f. of $X_1$ is given by
(2.7.34). George and Mudholkar (1981) provide an interesting interpretation of (2.7.35) utilizing the decomposition of the Laplace r.v. into a
difference of two i.i.d. exponential variables $W_1$ and $W_2$: if adding and subtracting $W_1$ and $W_2$ to and from the median $X_{2:3}$ produces the distribution
of $X_1$, then $X_1$ must have a logistic distribution.
(b) Under the above conditions, establish the relation

(X_{1:3} + X_{3:3})/2 + Y =_d X₁. (2.7.36)

Deduce from (2.7.35) and (2.7.36) that for a random sample of size n = 3 from the standard logistic distribution the sample median has the same distribution as the midrange [George and Rousseau (1987)]. Investigate whether this property is actually a characterization of the logistic distribution.
(c) Generalize Part (a) by showing that if X₁, X₂, . . . are i.i.d. with c.d.f. (2.7.34) while Y₁, Y₂, . . . are i.i.d. Laplace CL(0, 1) random variables, then

X_{k+1:2k+1} + Σ_{j=1}^k Y_j/j =_d X₁, k ≥ 1. (2.7.37)

[George and Rousseau (1987)]
(d) Generalize Part (b) by showing that under the conditions of Part (c) we have

(X_{1:2k+1} + X_{2k+1:2k+1})/2 + Σ_{j=1}^k Y_j/(2j − 1) =_d X₁, k ≥ 1. (2.7.38)

Further, show that when the midrange is based on an even number of i.i.d. logistic random variables, then

(X_{1:2k} + X_{2k:2k})/2 + (1/2) Σ_{j=1}^{k−1} Y_j/j =_d (X₁ + X₂)/2, k ≥ 1. (2.7.39)

[George and Rousseau (1987)]
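The identity (2.7.35) lends itself to a quick Monte Carlo sanity check. The sketch below (stdlib Python; seed and sample size are arbitrary) samples the logistic law by inversion of (2.7.34), the Laplace law via the exponential-difference representation (Proposition 2.2.2), and compares the empirical distribution of X_{2:3} + Y with the logistic law:

```python
import math, random

random.seed(12345)

def logistic():
    # inverse-CDF sampling from F(x) = 1/(1 + e^{-x})
    u = random.random()
    return math.log(u / (1.0 - u))

n = 200_000
lhs = []   # sample median of three logistics plus independent Laplace noise
for _ in range(n):
    med = sorted(logistic() for _ in range(3))[1]
    y = random.expovariate(1.0) - random.expovariate(1.0)  # CL(0,1)
    lhs.append(med + y)

mean = sum(lhs) / n
var = sum(v * v for v in lhs) / n - mean * mean   # should be near pi^2/3

def ecdf(x):
    return sum(v <= x for v in lhs) / n

# distance between the empirical c.d.f. and the logistic c.d.f.
gap = max(abs(ecdf(x) - 1.0 / (1.0 + math.exp(-x))) for x in (-2.0, 0.0, 2.0))
```

With the seed fixed, `var` stays within Monte Carlo error of π²/3 and `gap` is of order 10⁻³, consistent with (2.7.35).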
Exercise 2.7.53 Let X₁ and X₂ be i.i.d. standard normal random variables, and let W be an exponential random variable with mean two and independent of X₁ and X₂. Then, by Proposition 2.2.1, the r.v. Y = √W X₂ has the standard classical Laplace distribution CL(0, 1).
(a) Show that for any positive constants σ and η, the density of the r.v.

σX₁ + ηY = σX₁ + η√W X₂ (2.7.40)

(which is the sum of zero mean and independent normal and Laplace variables) is given by

g(x) = (1/η) e^{σ²/(2η²)} [ (1/2) e^{−x/η} Φ((ηx − σ²)/(ησ)) + (1/2) e^{x/η} Φ(−(ηx + σ²)/(ησ)) ],

where Φ is the distribution function of X₁ [Kou (2000)].
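Since g has a closed form, it can be checked against a brute-force numerical convolution of the normal density with the density of ηY, which is e^{−|y|/η}/(2η). The sketch below does this at a few points, with the arbitrary choice σ = η = 1 and stdlib-only quadrature:

```python
import math

def Phi(z):
    # standard normal c.d.f. via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def g(x, sigma, eta):
    # closed-form density of sigma*X1 + eta*Y (normal plus Laplace)
    c = math.exp(sigma**2 / (2.0 * eta**2)) / eta
    return c * (0.5 * math.exp(-x / eta) * Phi((eta * x - sigma**2) / (eta * sigma))
                + 0.5 * math.exp(x / eta) * Phi(-(eta * x + sigma**2) / (eta * sigma)))

def conv(x, sigma, eta, half_width=40.0, steps=80_000):
    # trapezoidal convolution of N(0, sigma^2) with the Laplace(eta) density
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        y = -half_width + i * h
        w = 0.5 if i in (0, steps) else 1.0
        phi = math.exp(-(x - y)**2 / (2.0 * sigma**2)) / (sigma * math.sqrt(2.0 * math.pi))
        total += w * phi * math.exp(-abs(y) / eta) / (2.0 * eta)
    return total * h

err = max(abs(g(x, 1.0, 1.0) - conv(x, 1.0, 1.0)) for x in (-2.0, 0.0, 0.5, 3.0))
```

The agreement is to quadrature accuracy (err of order 10⁻⁶ here).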
(b) Show that if (2.7.40) is divided by √(σ² + η²W), then the resulting r.v.,

U₁ = (σX₁ + η√W X₂) / √(σ² + η²W),

has the standard normal distribution. Further, show that this result remains valid for an arbitrary positive r.v. W [Sarabia (1993)].
(c) Generalize, by showing that if X₁, X₂, and X₃ are i.i.d. standard normal r.v.'s and V is an arbitrary r.v., then the r.v.

U₂ = (X₁ + V X₂ + V²X₃) / √(1 + V² + V⁴)

is standard normal [Sarabia (1993)]. Investigate an extension with more than three normal variables.
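The point of (b)-(c) is that U₂ is exactly N(0, 1) whatever the distribution of V: conditionally on V, the numerator is normal with precisely the variance appearing in the denominator. A seeded Monte Carlo sketch, with the arbitrary choice of a lognormal V:

```python
import math, random

random.seed(2718)

n = 200_000
u2 = []
for _ in range(n):
    x1, x2, x3 = (random.gauss(0.0, 1.0) for _ in range(3))
    v = random.lognormvariate(0.0, 1.0)   # any distribution for V will do
    u2.append((x1 + v * x2 + v**2 * x3) / math.sqrt(1.0 + v**2 + v**4))

mean = sum(u2) / n
var = sum(x * x for x in u2) / n - mean**2
p_below_1 = sum(x <= 1.0 for x in u2) / n
Phi1 = 0.5 * (1.0 + math.erf(1.0 / math.sqrt(2.0)))   # Phi(1)
```

The sample mean, variance, and c.d.f. value at 1 all match the standard normal law up to Monte Carlo error.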
Exercise 2.7.54 Extend Parts (b) and (c) of Exercise 2.7.53 by showing that if X₁, X₂, and X₃ are i.i.d. symmetric stable r.v.'s with ch.f. φ(t) = e^{−|t|^α}, where 0 < α ≤ 2, then the r.v.'s

U₁ = (X₁ + V X₂) / (1 + V^α)^{1/α}

and

U₂ = (X₁ + V X₂ + V^α X₃) / (1 + V^α + V^{α²})^{1/α},

where V is an arbitrary non-negative r.v. independent of the X_i's, have the same distribution as X₁ [Sarabia (1994)]. Investigate an extension where the number of X_i's is more than three.
Exercise 2.7.55 Let Y₁, Y₂, . . . be an i.i.d. sequence of CL(0, 1) random variables.
(a) Show that the r.v.

X = Σ_{j=1}^∞ Y_j/j

has the standard logistic distribution with c.d.f. (2.7.34) and ch.f.

ϕ_X(t) = πt cosech(πt)

[see Pakes (1997) for further discussion and generalizations].
(b) Using the above representation deduce that the logistic distribution is infinitely divisible.
Hint: Note the following infinite product representation of the hyperbolic cosecant function:

cosech(z) = (1/z) Π_{j=1}^∞ (1 + z²/(j²π²))^{−1},

see, e.g., Abramowitz and Stegun (1965).
(c) Using Part (a) and Part (d) of Exercise 2.7.52 deduce the limiting distribution of the logistic midrange (X_{1:2k} + X_{2k:2k})/2 as k → ∞.
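Part (a) can be verified numerically: each term Y_j/j contributes the ch.f. factor 1/(1 + t²/j²), so the partial products should approach πt cosech(πt). A small stdlib sketch (truncation level arbitrary):

```python
import math

def chf_partial_sum(t, n_terms):
    # ch.f. of sum_{j=1}^{N} Y_j/j for i.i.d. CL(0,1) terms:
    # product of the factors 1/(1 + t^2/j^2)
    prod = 1.0
    for j in range(1, n_terms + 1):
        prod /= 1.0 + t * t / (j * j)
    return prod

def chf_logistic(t):
    # pi*t*cosech(pi*t), the logistic characteristic function
    return math.pi * t / math.sinh(math.pi * t) if t != 0.0 else 1.0

err = max(abs(chf_partial_sum(t, 100_000) - chf_logistic(t))
          for t in (0.3, 1.0, 2.0))
```

The truncation error of the product is of order t²/N, so with N = 10⁵ the two sides agree to several decimal places.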
Exercise 2.7.56 Let X_{1:n} ≤ ··· ≤ X_{n:n} be the order statistics connected with a random sample from a uniform distribution on the interval (−1, 1).
(a) Derive the joint distribution of the statistics

U_n = (X_{n:n} − X_{1:n})/2 and V_n = (X_{n:n} + X_{1:n})/2.
(b) Show that the marginal p.d.f. of V_n is

g_n(x) = (n/2)(1 − |x|)^{n−1}, |x| ≤ 1, (2.7.41)

and that the variance of V_n is

σ_n² = 2/((n + 1)(n + 2)) (2.7.42)

[see Neyman and Pearson (1928); Carlton (1946)].
(c) Show that as n → ∞, the p.d.f. of the standardized variable W_n = V_n/σ_n, which is given by

(1/s_n) g_n(x/s_n) (2.7.43)

with

s_n = 1/σ_n = √((n + 1)(n + 2)/2), (2.7.44)

converges to the standard Laplace density (2.1.4).
(d) Note that in Part (c), the limit

lim_{n→∞} s_n/n (2.7.45)

is equal to s = 1/√2. Generalize Part (c) by showing that if for a positive sequence {s_n} the limit (2.7.45) is equal to s, where 0 < s < ∞, then the p.d.f.'s (2.7.43) converge to the Laplace distribution with mean zero and scale parameter s with density (2.1.1) [Dreier (1999)]. What happens if the limit (2.7.45) is equal to zero? What if it is equal to ∞?
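Both the variance formula (2.7.42) and the convergence in Parts (c)-(d) are easy to confirm numerically; the sketch below checks the moments of g_n by trapezoidal quadrature for n = 10 and evaluates the standardized density (2.7.43) against the Laplace limit for a large n (grid sizes arbitrary):

```python
import math

def g(n, x):
    # p.d.f. (2.7.41) of the uniform midrange V_n
    return 0.5 * n * (1.0 - abs(x))**(n - 1) if abs(x) <= 1.0 else 0.0

# (i) numerical moments of g_n for n = 10 versus the exact variance (2.7.42)
n, steps = 10, 50_000
h = 1.0 / steps
mass = second = 0.0
for i in range(steps + 1):
    x = i * h
    w = 0.5 if i in (0, steps) else 1.0
    mass += w * g(n, x)
    second += w * x * x * g(n, x)
mass *= 2.0 * h      # g_n is symmetric: integrate over [0,1] and double
second *= 2.0 * h
var_exact = 2.0 / ((n + 1) * (n + 2))

# (ii) the standardized p.d.f. (2.7.43) versus the Laplace density (2.1.4)
def scaled_pdf(n, x):
    s_n = math.sqrt((n + 1) * (n + 2) / 2.0)
    return g(n, x / s_n) / s_n

big = 5_000
conv_err = max(abs(scaled_pdf(big, x)
                   - math.exp(-math.sqrt(2.0) * abs(x)) / math.sqrt(2.0))
               for x in (0.0, 0.5, 1.0, 2.0))
```

At n = 5000 the pointwise gap from the Laplace density is already of order 10⁻⁴, in line with the O(1/n) convergence rate of Exercise 2.7.57.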
(e) Now, let the sample be from the uniform distribution on the interval (0, a) with some a > 0. By considering an appropriate linear transformation, derive the p.d.f. of V_n, show that V_n is unbiased for the population mean a/2, and find the variance of V_n. Further, show that the standardized random variable

W_n = (V_n − E(V_n)) / √(Var(V_n))

still converges in distribution to the standard Laplace distribution with density (2.1.4).
(f) Under the conditions of Part (e), show that the standardized sample mean,

Z_n = (X̄_n − E(X̄_n)) / √(Var(X̄_n)),

converges in distribution to the standard normal distribution. In view of these results, discuss the use of W_n and X̄_n as estimates of the mean of the uniform distribution on the interval (0, a) with some a > 0 [Biswas and Sehgal (1991)].
Exercise 2.7.57 Let g_n be the density (2.7.41). Show that for every x > 0 there exists an n₀ ∈ N such that

| (y/2) e^{−yx} − (y/n) g_n(xy/n) | ≤ 1/(2nx)

for all n ≥ n₀ and all y ≥ 0. Conclude that the convergence to the Laplace density,

lim_{n→∞} (y/n) g_n(xy/n) = (y/2) e^{−y|x|}, −∞ < x < ∞,

is uniform in y for every x ≠ 0 [Dreier (1999)].
Exercise 2.7.58 Navarro and Ruiz (2000) define a discrete Laplace distribution by the probability function

f(k) = c(s) e^{−|k−θ|/s}, k = 0, ±1, ±2, . . . , (2.7.46)

where θ is an integer, s is a positive real number, and c(s) is a norming constant (the authors also mention a possible extension where θ is a real number and the support of the distribution is a countable set of real numbers).
(a) Show that in order for the function (2.7.46) to be a genuine probability function we must have

c(s) = (1 − e^{−1/s}) / (1 + e^{−1/s}). (2.7.47)
(b) Show that a r.v. Y with the probability function (2.7.46) admits the representation

Y =_d θ + X₁ − X₂, (2.7.48)

where X₁ and X₂ are i.i.d. geometric variables given by the probability function

P(X₁ = k) = (1 − p)^k p, k = 0, 1, 2, . . . , (2.7.49)

with

p = 1 − e^{−1/s}. (2.7.50)

(c) Show that if a geometric distribution (2.7.49) with p as in (2.7.50) is extended symmetrically to the set of negative integers, then we obtain the distribution (2.7.46) with θ = 0. Thus, analogously to the Laplace case, we might call this distribution a double geometric distribution.
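The representation (2.7.48) is easily confirmed numerically by convolving two i.i.d. geometric p.m.f.'s (2.7.49) and comparing with c(s)e^{−|k|/s}. A sketch with the arbitrary choice s = 1.7:

```python
import math

s = 1.7                                               # arbitrary scale
p = 1.0 - math.exp(-1.0 / s)                          # (2.7.50)
c = (1.0 - math.exp(-1.0 / s)) / (1.0 + math.exp(-1.0 / s))   # (2.7.47)

def geom(k):
    # P(X1 = k) = (1-p)^k p for k = 0, 1, 2, ...
    return (1.0 - p)**k * p

def diff_pmf(k, n_terms=2_000):
    # P(X1 - X2 = k) by direct (truncated) convolution
    return sum(geom(k + j) * geom(j) for j in range(max(0, -k), n_terms))

err = max(abs(diff_pmf(k) - c * math.exp(-abs(k) / s)) for k in range(-10, 11))
```

The truncated convolution matches the discrete Laplace p.m.f. (with θ = 0) to machine precision.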
Exercise 2.7.59 If F is a distribution function with the corresponding cumulants κ_i, then the Edgeworth expansion of F is given by

F(x) = Φ(x) − (κ₃/6)(x² − 1)φ(x) − (κ₄/24)(x³ − 3x)φ(x) − (κ₃²/72)(x⁵ − 10x³ + 15x)φ(x) + ··· ,

where Φ and φ are the c.d.f. and the p.d.f. of the standard normal distribution [see, e.g., Kotz and Johnson (1982)].
(a) Let X₁, . . . , X_n be i.i.d. from the CL(θ, s) distribution, and consider the standardized sample mean

T_n = (1/(s√(2n))) Σ_{j=1}^n (X_j − θ).

Show that the jth cumulant of T_n is given by

n^{1−j/2} (√2 s)^{−j} κ_j,

where κ_j is the jth cumulant of X₁ − θ.
(b) Using the expression (2.1.13) for the cumulants of the Laplace distribution, derive the following (Edgeworth) approximation of the c.d.f. of T_n:

F_n(x) = Φ(x) − (1/(8n)) φ(x)(x³ − 3x) + O(n^{−2})

[Pace and Salvan (1997).]
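A seeded Monte Carlo sketch of Part (b): for n = 10 Laplace summands, the empirical c.d.f. of T_n at x = 1 should be closer to the Edgeworth approximation than to the plain normal approximation (seed, sample size, and evaluation point are arbitrary, and the comparison is statistical rather than exact):

```python
import math, random

random.seed(99)

def Phi(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def phi(z):
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)

n, reps, x, s = 10, 200_000, 1.0, 1.0     # CL(0, 1) summands
hits = 0
for _ in range(reps):
    t = sum(random.expovariate(1.0) - random.expovariate(1.0)
            for _ in range(n)) / (s * math.sqrt(2.0 * n))
    hits += t <= x
emp = hits / reps

plain = Phi(x)                                           # normal approximation
edge = Phi(x) - phi(x) * (x**3 - 3.0 * x) / (8.0 * n)    # with 1/n correction
```

At x = 1 the 1/n correction is about +0.006, roughly an order of magnitude above the Monte Carlo noise here, so the improvement is visible.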
Exercise 2.7.60 Let X₁, X₂, . . . be i.i.d. standard Laplace L(0, 1) random variables. Then the sequence {X_n, n ≥ 1} obeys the law of the iterated logarithm,

lim sup_{n→∞} Σ_{k=1}^n X_k / √(2n log(log n)) = 1 a.s., (2.7.51)

since (2.7.51) holds for any i.i.d. sequence of standardized random variables [see, e.g., Breiman (1993), Theorem 13.25]. Generalize (2.7.51) by showing that for any α ≥ 0 the sequence {X_n, n ≥ 1} satisfies

lim sup_{n→∞} Σ_{k=1}^n k^α X_k / (n^α √(2n log(log n))) = 1/√(2α + 1) a.s. (2.7.52)

[Tomkins (1972)].
Hint: Denote c_n = n^{−1/4} and show that for large n the double inequality

e^{t²(1 − c_n|t|)/(2n)} ≤ E[e^{tX_k/√n}] ≤ e^{t²(1 + c_n|t|/2)/(2n)} (2.7.53)

holds for each positive integer k ≤ n and any t such that |t| ≤ 1/c_n. Then use the fact that the condition (2.7.53) is sufficient for (2.7.52) [Tomkins (1972)].
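The double inequality (2.7.53) can be verified on a grid. For the standard Laplace L(0, 1) law the m.g.f. is E e^{sX} = 1/(1 − s²/2) for s² < 2, and it is even in s, so only positive t is scanned in the sketch below (the choice of n values and grid is arbitrary):

```python
import math

def laplace_mgf(s):
    # m.g.f. of the standard (unit-variance) Laplace L(0,1) law
    return 1.0 / (1.0 - s * s / 2.0)

ok = True
for n in (16, 256, 4096):
    c_n = n**-0.25
    for i in range(1, 200):
        t = (i / 200.0) / c_n              # grid over 0 < t <= 1/c_n
        m = laplace_mgf(t / math.sqrt(n))
        lower = math.exp(t * t * (1.0 - c_n * t) / (2.0 * n))
        upper = math.exp(t * t * (1.0 + c_n * t / 2.0) / (2.0 * n))
        ok = ok and (lower <= m <= upper)
```

For these n the inequality holds across the whole admissible range of t.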
Exercise 2.7.61 A random variable X on [0, ∞) with the Laplace transform η(s) = Ee^{−sX} is called a generalized gamma convolution (GGC) if

η(s) = exp( −as − ∫₀^∞ log(1 + s/w) dU(w) ), a ≥ 0, Re(s) ≥ 0, (2.7.54)

where U is a non-negative measure on (0, ∞) such that

∫₀¹ |log w| dU(w) < ∞ and ∫₁^∞ (1/w) dU(w) < ∞,

see, e.g., Bondesson (1992).
(a) Show that the standard exponential distribution belongs to the class of GGC laws and the measure U is a unit mass at u = 1. Consequently, symmetric Laplace distributions, as well as their asymmetric and multivariate generalizations studied in this book, are mean-variance mixtures of normal laws by generalized gamma convolutions.
(b) Similarly, show that every gamma distribution is a GGC. What is the measure U in this case?
3
Asymmetric Laplace distributions
Chapter 3 is devoted to asymmetric Laplace distributions - a skewed family of distributions which in our opinion is the most appropriate skewed generalization of the classical Laplace law. In the last several decades, various forms of skewed Laplace distributions have sporadically appeared in the literature. One of the earliest is due to McGill (1962), who considers distributions with p.d.f.

f(x) = (φ₁/2) e^{−φ₁|x−θ|}, x ≤ θ,
       (φ₂/2) e^{−φ₂|x−θ|}, x > θ,
(3.0.1)
while Holla and Bhattacharya (1968) study the distribution with the p.d.f.

f(x) = pφ e^{−φ|x−θ|}, x ≤ θ,
       (1 − p)φ e^{−φ|x−θ|}, θ < x,
(3.0.2)

where 0 < p < 1. Lingappaiah (1988) derived some properties of (3.0.1), terming the distribution two-piece double exponential. Poiraud-Casanova and Thomas-Agnan (2000) exploited a skewed Laplace distribution with p.d.f.

f(x) = α(1 − α) e^{−(1−α)|x−θ|}, for x < θ,
       α(1 − α) e^{−α|x−θ|}, for x ≥ θ,
(3.0.3)

where θ ∈ (−∞, ∞) and α ∈ (0, 1), to show the equivalence of certain quantile estimators.
Azzalini (1985) noted that if X and Y are symmetric (about zero) and independent r.v.'s with densities f_X, f_Y and distribution functions F_X, F_Y, respectively, then for any λ,

1/2 = P(X − λY < 0) = ∫_{−∞}^∞ f_Y(y) F_X(λy) dy. (3.0.4)

Consequently, the function

g(y) = 2 f_Y(y) F_X(λy) (3.0.5)

is a p.d.f. for any λ. If we take X and Y to be i.i.d. standard normal variables, then (3.0.5) gives the density of the skew-normal distribution, extensively studied since its introduction in O'Hagan and Leonard (1976), mainly by Azzalini and his associates [see Azzalini (1985, 1986), Henze (1986), Liseo (1990), Azzalini and Dalla Valle (1996), Azzalini and Capitanio (1999)]. Similarly, if X and Y are i.i.d. standard Laplace r.v.'s, utilizing (3.0.5) we obtain a skewed Laplace distribution with the density

g(x) = (1/2) e^{(1+λ)x}, −∞ < x ≤ 0,
       e^{−x} − (1/2) e^{−(1+λ)x}, 0 < x < ∞,
(3.0.6)

studied by Balakrishnan and Ambagaspitiya (1994) in an unpublished technical report.
Another manner of introducing skewness into a symmetric distribution has been proposed by Fernández and Steel (1998) [see also Fernández et al. (1995)]. Here, the idea is to convert a symmetric p.d.f. into a skewed one by postulating inverse scale factors in the positive and negative orthants. Thus, a symmetric density f generates the following class of skewed distributions, indexed by κ > 0:

f(x|κ) = (2κ/(1 + κ²)) f(κx), x ≥ 0,
         (2κ/(1 + κ²)) f(κ^{−1}x), x < 0.
(3.0.7)

When f is the standard classical Laplace density (2.1.2), then (3.0.7), with the addition of location and scale parameters, leads to a three-parameter family with the density

p(x) = (1/σ) (κ/(1 + κ²)) exp(−(κ/σ)(x − θ)), for x ≥ θ,
       (1/σ) (κ/(1 + κ²)) exp((1/(σκ))(x − θ)), for x < θ,
(3.0.8)

introduced by Hinkley and Revankar (1977). These distributions, termed asymmetric Laplace (AL) laws by Kozubowski and Podgórski (2000), show promise in financial modeling (see Part III of the monograph devoted to applications and references therein). It is our opinion that members of this particular class deserve to be called the asymmetric Laplace (AL) distributions. There are at least three reasons why these laws warrant a special treatment.
Firstly, these distributions arise naturally as limiting distributions in a random summation scheme. Recall that symmetric Laplace laws are the only possible limiting distributions for (normalized) sums of i.i.d. symmetric random variables with a finite variance, when the number of terms in the summation has a geometric distribution with the mean converging to infinity (see Proposition 2.2.9). Similarly, if the assumption of symmetry of the summands is omitted, we obtain AL laws as the limiting distributions (see Proposition 3.4.4).
Secondly, the AL laws extend naturally all the basic properties of symmetric Laplace distributions.
Mixtures of normal distributions. A classical symmetric Laplace r.v. may be viewed as a normal r.v. with mean zero and a stochastic variance (see Proposition 2.2.1). Analogously, an AL r.v. has a similar interpretation, where the mean of the normal distribution is now stochastic (see Proposition 3.2.1). This fact is of particular importance for applications in finance, where stochastic variance models are being used [see, e.g., Madan et al. (1998), Levin and Tchernitser (1999)].
Stability with respect to geometric summation. A symmetric Laplace r.v. Y has the same distribution as an (appropriately scaled) sum of a geometric number of i.i.d. copies of Y (see Proposition 2.2.7). More generally, we obtain a similar characterization of an AL r.v., when the equality of distributions is replaced by the weak convergence (see Proposition 3.4.5).
Distributions with maximal entropy. As we have seen in Proposition 2.4.7, among all continuous distributions on (−∞, ∞) with a given first absolute moment, the one with a maximal entropy is provided by a symmetric Laplace distribution. As we shall show in the present chapter, under an additional restriction on the value of the mean, the entropy is maximized by an AL law.
Convolution of exponential distributions. A classical Laplace r.v. can be represented as a difference of two i.i.d. exponential random variables (see Proposition 2.2.2). If the two exponential r.v.'s are independent but no longer identically distributed, their difference has an AL law (see Proposition 3.2.2).
Finally, it is the properties and features of AL distributions which are similar in nature to those of the normal distribution that make them particularly attractive in applications.
Infinite divisibility. Variables appearing in many applications in various sciences can often be represented as sums of a large number of tiny variables, often independent and identically distributed. This is a practical interpretation of the notion of infinite divisibility. Thus, when dealing with such a phenomenon, a "proper" model ought to be infinitely divisible. It is well known that all normal distributions are infinitely divisible, and so are the AL laws.
Limiting laws. The normal distribution arises as a limit of a deterministic sum of i.i.d. random variables with a finite variance, where the number of terms in the summation tends to infinity. Consequently, if a variable of interest can be viewed as a result of a large number of independent increments (with a finite variance), then its distribution may be approximated by the normal law. Similarly, a random sum of i.i.d. random variables with finite variance converges to an AL r.v. when the average number of terms in the summation tends to infinity. Thus, in practice we could use an AL approximation for a variable resulting from a random number (a geometric variable with a large mean) of independent innovations (with a finite variance).
Maximum entropy property. The principle of maximum entropy, which states that out of all the distributions satisfying a given set of constraints one should choose the one with the largest entropy, is regarded as a general inference procedure and has been applied successfully in a wide variety of fields, including statistical mechanics, statistics, economics, queuing theory, and image analysis; see, e.g., Kapur (1993). Thus, distributions maximizing the entropy under suitable constraints provide useful models in applications. It is well known that among all continuous distributions on (−∞, ∞) with a given mean and variance, the Gaussian (normal) distribution provides the largest entropy. Analogously, the entropy is maximized by the AL distribution when the mean and the first absolute moment are specified (Proposition 3.4.7).
Finiteness of moments. It is often argued that most variables appearing in applications should have finite moments of all orders (or at least the mean and the variance). This holds for the normal as well as for the AL laws.
Symmetry. Probability distributions of variables arising in the real world are often symmetric. The normal distribution is symmetric and, as such, is often used as a model in practice. An AL distribution can also be symmetric (in which case it reduces to the classical Laplace distribution), but the AL model actually provides more flexibility, allowing for asymmetry.
Simplicity. The distributions applied in practice ought to be handled easily. It is highly advantageous if their densities, distribution functions, and other characteristics allow for straightforward calculations, and estimation procedures should also preferably be implemented with ease. Ideally, the c.d.f. and the p.d.f. should have closed-form expressions, which substantially facilitates the derivation and implementation of estimation and simulation procedures. This is indeed the case with the normal distribution, although the distribution function here lacks an explicit form and requires a numerical approximation. We shall see that the corresponding formulas and procedures for the AL laws are at least as simple, if not simpler, than their normal counterparts.
Extensions. An appropriate model should allow for various extensions, particularly to the multivariate setting. This is the case with both the normal and the AL laws. The multivariate extensions of a univariate AL law are quite natural (and are discussed in Part II of this text).
3.1 Definition and basic properties

A formal definition of the class of asymmetric Laplace distributions is as follows.

Definition 3.1.1 A random variable Y is said to have an asymmetric Laplace (AL) distribution if there exist parameters θ ∈ R, µ ∈ R and σ ≥ 0 such that the characteristic function of Y has the form

ψ(t) = e^{iθt} / (1 + σ²t²/2 − iµt). (3.1.1)

We denote the distribution of Y by AL(θ, µ, σ), and write Y ~ AL(θ, µ, σ).

Remark 3.1.1 Asymmetric Laplace laws with θ = 0 constitute a subclass of GS distributions defined in Subsection 4.4.4. Namely,

AL(0, µ, σ) = GS₂(σ/√2, β, µ), β = sign(µ), (3.1.2)

where GS_α(σ, β, µ) denotes the distribution given by ch.f. (4.4.7); see Exercise 3.6.15.
3.1.1 An alternative parameterization and special cases

While the distribution is properly defined for every θ ∈ R, µ ∈ R, and σ ≥ 0, we shall note specifically the following special cases:

If θ = µ = σ = 0, then ψ(t) = 1 for every t ∈ R and the distribution is degenerate at 0;
For θ = σ = 0 and µ ≠ 0, we have an exponential r.v. with mean µ [concentrated on (0, ∞) for µ > 0 and on (−∞, 0) for µ < 0];
For µ = 0 and σ ≠ 0, we have a symmetric Laplace distribution with mean θ and variance σ².
The ch.f. (3.1.1) with σ > 0 can be expressed in the following manner:

ψ(t) = e^{iθt} (1/(1 + i(σκ/√2)t)) (1/(1 − i(σ/(√2κ))t)) = e^{iθt} / (1 + σ²t²/2 − i(σ/√2)(1/κ − κ)t), (3.1.3)

where the additional parameter κ > 0 is related to µ and σ as follows:

κ = √2σ / (µ + √(2σ² + µ²)) = (√(2σ² + µ²) − µ) / (√2σ), (3.1.4)

while

µ = (σ/√2)(1/κ − κ). (3.1.5)

Note that for each fixed σ > 0 the expression (3.1.4), considered as a function of µ and written κ = κ(µ), is decreasing on (−∞, ∞) with κ(0) = 1 and

lim_{µ→−∞} κ(µ) = ∞, lim_{µ→∞} κ(µ) = 0. (3.1.6)
We shall use the abbreviation AL to denote all distributions with ch.f. given either by (3.1.1) or by (3.1.3), including those with µ = 0 (symmetric ones) and σ = 0.
We shall find it convenient to express certain properties of the asymmetric Laplace distributions in the (θ, κ, σ) parameterization, using the notation AL*(θ, κ, σ) for the distribution given by (3.1.3). The parameter κ is scale invariant, so that the random variables Y and cY have the same κ parameter whenever Y is AL*(θ, κ, σ) distributed and c > 0. Note also that in the (θ, κ, σ) parameterization, σ is a bona fide scale parameter.
The following relations will often be used in the sequel:

1/κ − κ = √2µ/σ,   1/κ + κ = √(4 + 2µ²/σ²),   1/κ² + κ² = 2(µ²/σ² + 1). (3.1.7)
The following result follows easily from the form of the AL characteristic function.

Proposition 3.1.1 Let X ~ AL*(θ, κ, σ) and let c be a non-zero real constant. Then,
(i) c + X ~ AL*(c + θ, κ, σ);
(ii) cX ~ AL*(cθ, κ_c, |c|σ), where κ_c = κ^{sign(c)}.

Remark 3.1.2 Note that in particular, if X ~ AL*(θ, κ, σ) then −X ~ AL*(−θ, 1/κ, σ).
3.1.2 Standardization

Since θ is simply a location parameter, we shall often assume θ = 0. To simplify the notation in this case, we shall write AL(µ, σ) and AL*(κ, σ) for the distributions AL(0, µ, σ) and AL*(0, κ, σ), respectively. Further, for θ = 0 and σ = 1 we shall say that the distribution is standard, and write AL(µ) and AL*(κ), respectively [for the distributions AL(0, µ, 1) and AL*(0, κ, 1)]. Many properties of AL laws will be stated in terms of standard variables.
Tables 3.1 and 3.2 below contain a summary of our notation and the special cases in the two parameterizations.
3.1.3 Densities and their properties

Using the factorization (3.1.3), we can represent an asymmetric Laplace r.v. Y as follows:

Y =_d θ + Y₁ − Y₂, (3.1.8)

where the two variables on the right-hand side are independent and exponentially distributed with means σ/(√2κ) and σκ/√2, respectively. Equivalently, we have

Y =_d θ + (σ/√2) (W₁/κ − κW₂), (3.1.9)

where W₁ and W₂ are two i.i.d. standard exponential random variables. This representation leads to explicit formulas for the corresponding density and distribution function [cf. formula (2.3.16) and the computations preceding it].
Proposition 3.1.2 Let f_{θ,κ,σ} and F_{θ,κ,σ} denote the p.d.f. and c.d.f. of an AL*(θ, κ, σ) distribution, respectively. Then,

f_{θ,κ,σ}(x) = (√2/σ) (κ/(1 + κ²)) exp(−(√2κ/σ)|x − θ|), if x ≥ θ,
               (√2/σ) (κ/(1 + κ²)) exp(−(√2/(σκ))|x − θ|), if x < θ,
(3.1.10)

and

F_{θ,κ,σ}(x) = 1 − (1/(1 + κ²)) exp(−(√2κ/σ)|x − θ|), if x ≥ θ,
               (κ²/(1 + κ²)) exp(−(√2/(σκ))|x − θ|), if x < θ.
(3.1.11)
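The formulas (3.1.10)-(3.1.11) transcribe directly into code; the checks below confirm that the two branches of F meet at θ with value κ²/(1 + κ²), that F tends to 1, and that F differentiates back to f (parameter values arbitrary):

```python
import math

SQ2 = math.sqrt(2.0)

def al_pdf(x, theta, kappa, sigma):
    # density (3.1.10) of the AL*(theta, kappa, sigma) law
    c = (SQ2 / sigma) * kappa / (1.0 + kappa**2)
    if x >= theta:
        return c * math.exp(-SQ2 * kappa * (x - theta) / sigma)
    return c * math.exp(-SQ2 * (theta - x) / (sigma * kappa))

def al_cdf(x, theta, kappa, sigma):
    # distribution function (3.1.11)
    if x >= theta:
        return 1.0 - math.exp(-SQ2 * kappa * (x - theta) / sigma) / (1.0 + kappa**2)
    return kappa**2 * math.exp(-SQ2 * (theta - x) / (sigma * kappa)) / (1.0 + kappa**2)

theta, kappa, sigma = 0.5, 0.75, 1.3

# both branches give F(theta) = kappa^2/(1 + kappa^2)
at_mode = al_cdf(theta, theta, kappa, sigma)

# central differences of F recover f away from the kink at theta
h = 1e-6
deriv_err = max(
    abs((al_cdf(x + h, theta, kappa, sigma)
         - al_cdf(x - h, theta, kappa, sigma)) / (2.0 * h)
        - al_pdf(x, theta, kappa, sigma))
    for x in (-2.0, -0.3, 1.1, 3.0))
```
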
Case                     Name             Notation                 Char. funct.
θ ∈ R, σ ≥ 0, µ ∈ R      Asymm. Laplace   AL(θ, µ, σ)              e^{iθt}/(1 + σ²t²/2 − iµt)
θ = 0, σ ≥ 0, µ ∈ R      Asymm. Laplace   AL(0, µ, σ), AL(µ, σ)    1/(1 + σ²t²/2 − iµt)
θ ∈ R, σ ≥ 0, µ = 0      Symm. Laplace    AL(θ, 0, σ), L(θ, σ)     e^{iθt}/(1 + σ²t²/2)
θ = 0, σ = 1, µ ∈ R      Standard AL      AL(0, µ, 1), AL(µ)       1/(1 + t²/2 − iµt)
θ = 0, σ = 0, µ ≠ 0      Exponential      AL(0, µ, 0), E(µ)        1/(1 − iµt)
θ ∈ R, σ = 0, µ = 0      Degenerate       —                        e^{iθt}

Table 3.1: Special cases and notation for an asymmetric Laplace distribution in the AL(θ, µ, σ) parameterization.
Figure 3.1 shows AL densities for various values of the parameters.
Remark 3.1.3 Note that for κ = 1 we obtain the p.d.f. and the c.d.f. of the symmetric Laplace distribution given by (2.1.3).

Remark 3.1.4 To obtain expressions of the AL p.d.f. and c.d.f. in the AL(θ, µ, σ) parameterization, substitute in (3.1.10)-(3.1.11) the expression for κ given by (3.1.4).

Remark 3.1.5 If Y is an AL random variable given by (3.1.10)-(3.1.11), then

P(Y ≤ θ) = F_{θ,κ,σ}(θ) = κ²/(1 + κ²) = q_κ (3.1.12)
Case                     Name             Notation                   Char. funct.
θ ∈ R, σ ≥ 0, κ > 0      Asymm. Laplace   AL*(θ, κ, σ)               e^{iθt}/(1 + σ²t²/2 − i(σ/√2)(1/κ − κ)t)
θ = 0, σ ≥ 0, κ > 0      Asymm. Laplace   AL*(0, κ, σ), AL*(κ, σ)    1/(1 + σ²t²/2 − i(σ/√2)(1/κ − κ)t)
θ ∈ R, σ ≥ 0, κ = 1      Symm. Laplace    AL*(θ, 1, σ), L(θ, σ)      e^{iθt}/(1 + σ²t²/2)
θ = 0, σ = 1, κ > 0      Standard AL      AL*(0, κ, 1), AL*(κ)       1/(1 + t²/2 − i(1/√2)(1/κ − κ)t)
θ ∈ R, σ = 0, κ = 1      Degenerate       —                          e^{iθt}

Table 3.2: Special cases and notation for an asymmetric Laplace distribution in the AL*(θ, κ, σ) parameterization.
and

P(Y > θ) = 1 − F_{θ,κ,σ}(θ) = 1/(1 + κ²) = p_κ. (3.1.13)

Consequently, the parameter κ controls the probability assigned to each side of θ. Clearly, for κ = 1, the two probabilities are equal and the distribution is symmetric about θ.

Remark 3.1.6 Our skewed Laplace distribution with density (3.1.10), which is defined by its characteristic function, may be obtained formally by following a general procedure of obtaining a skewed distribution from a symmetric one, which has been proposed recently by Fernández and Steel (1998). Let f be any p.d.f. which is unimodal (say about zero) and symmetric. The method of transforming the symmetric distribution given by f into a skewed one consists of introducing inverse scale factors for the positive and negative parts of the distribution, leading to the density (3.0.7) discussed in the introduction. The Laplace distribution demonstrates that such distributions may appear quite naturally.
Figure 3.1: Standard asymmetric Laplace densities p(x) on (−10, 10) with µ = 0, 0.8, 1.5, 2, 3, 4, 6, 8, 10, which correspond to κ ≈ 1.0, 0.68, 0.50, 0.41, 0.30, 0.24, 0.16, 0.12, 0.1.
Remark 3.1.7 Every AL density can be written as a mixture of two exponential densities with means µ₁ = σ/(κ√2) and µ₂ = −σκ/√2:

f_{θ,κ,σ}(x) = p_κ (1/µ₁) e^{−|x−θ|/µ₁} I_{[θ,∞)}(x) + q_κ (1/|µ₂|) e^{(x−θ)/|µ₂|} I_{(−∞,θ)}(x), (3.1.14)

with q_κ and p_κ defined by (3.1.12) and (3.1.13), respectively (I_A(x) is the indicator function equal to 1 if x belongs to the set A and equal to zero otherwise).
Remark 3.1.8 Since the AL density is increasing on (−∞, θ) and decreasing on (θ, ∞), the distribution is unimodal with the mode equal to θ. The value of the density at the mode is

f_{θ,κ,σ}(θ) = (√2/σ) κ/(1 + κ²)

in the AL*(θ, κ, σ) parameterization and

f_{θ,µ,σ}(θ) = 1/√(µ² + 2σ²)

in the AL(θ, µ, σ) parameterization. This value can be located anywhere in the interval (0, ∞). Further, we have

lim_{µ→0} f_{θ,µ,σ}(θ) = 1/(√2σ), lim_{σ→0+} f_{θ,µ,σ}(θ) = 1/|µ|, lim_{µ,σ→0} f_{θ,µ,σ}(θ) = ∞. (3.1.15)

Further properties of AL densities are discussed in the exercises.
3.1.4 Moment and cumulant generating functions

We can obtain the moment generating function of an AL distribution either by a straightforward integration utilizing the AL density (3.1.10) or from the representation (3.1.9).

Proposition 3.1.3 If Y ~ AL*(θ, κ, σ), then the moment generating function of Y is

M_{θ,κ,σ}(t) = E[e^{tY}] = e^{θt} / (1 − σ²t²/2 − (σ/√2)(1/κ − κ)t), −√2/(σκ) < t < √2κ/σ. (3.1.16)
Proof. By the representation (3.1.9) we have

M_{θ,κ,σ}(t) = E[e^{tY}] = e^{θt} E[e^{(σ/√2)(t/κ)W₁}] E[e^{−(σ/√2)κtW₂}],

where W₁ and W₂ are i.i.d. standard exponential variables with moment generating function

M_{W_i}(s) = E[e^{sW_i}] = 1/(1 − s), s < 1.

Thus, we have

M_{θ,κ,σ}(t) = e^{θt} / [(1 − (σ/√2)(t/κ))(1 + (σ/√2)κt)], (3.1.17)

where we must have

(σ/√2)(t/κ) < 1 and −(σ/√2)κt < 1. (3.1.18)

Now, (3.1.17) and (3.1.18) produce (3.1.16), concluding the proof.
Remark 3.1.9 In the AL(θ, µ, σ) parameterization the moment generating function is

M_{θ,µ,σ}(t) = e^{θt} / (1 − σ²t²/2 − µt), −2/(√(2σ² + µ²) − µ) < t < 2/(√(2σ² + µ²) + µ). (3.1.19)

In case µ = 0 we obtain the moment generating function (2.1.10) of the classical Laplace distribution CL(θ, s) with s = σ/√2 [the L(θ, σ) distribution].

By Proposition 3.1.3 we can now write the cumulant generating function, log M_{θ,κ,σ}(t), corresponding to the AL*(θ, κ, σ) distribution:

log M_{θ,κ,σ}(t) = θt − log(1 − (σ/√2)(t/κ)) − log(1 + (σ/√2)κt), −√2/(σκ) < t < √2κ/σ. (3.1.20)

Note that in the symmetric case (κ = 1) we obtain the cumulant generating function (2.1.11) of the classical Laplace distribution CL(θ, s) with s = σ/√2.
3.1.5 Moments and related parameters

Cumulants

The cumulants of a general AL*(θ, κ, σ) r.v. Y are the coefficients of t^n/n! in the Taylor series (about t = 0) of the corresponding cumulant generating function (3.1.20). Thus, the nth cumulant κ_n is equal to the nth derivative of the cumulant generating function at t = 0. The calculation of the derivatives is straightforward. For n = 1 we have

(d/dt) log M_{θ,κ,σ}(t) = θ + (σ/√2) { (1/κ)/(1 − (σ/√2)(t/κ)) − κ/(1 + (σ/√2)κt) }, (3.1.21)

while for n > 1 we obtain

(d^n/dt^n) log M_{θ,κ,σ}(t) = (n − 1)! (σ/√2)^n { ((1/κ)/(1 − (σ/√2)(t/κ)))^n + (−κ/(1 + (σ/√2)κt))^n }. (3.1.22)

Now, substituting t = 0 into (3.1.21) and (3.1.22), we obtain the following expressions for the cumulants of an AL*(θ, κ, σ) r.v. Y:

κ_n(Y) = θ + (σ/√2)(1/κ − κ), if n = 1,
         (n − 1)! (σ/√2)^n (κ^{−n} − κ^n), if n > 1 is odd,
         (n − 1)! (σ/√2)^n (κ^{−n} + κ^n), if n is even. (3.1.23)
Note that in the symmetric case (κ = 1) the cumulants of odd order greater than one vanish, and we obtain the cumulants (2.1.13) of the classical Laplace distribution CL(θ, s) with s = σ/√2. Observe also that the mean and variance of Y, which coincide with the first and second cumulants, respectively, are

E[Y] = θ + (σ/√2)(1/κ − κ) = θ + µ,   Var[Y] = (σ²/2)(1/κ² + κ²) = µ² + σ². (3.1.24)
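The expressions in (3.1.24) can be confirmed by integrating the density (3.1.10) numerically; a stdlib sketch with arbitrary parameter values:

```python
import math

SQ2 = math.sqrt(2.0)
theta, kappa, sigma = 0.5, 0.7, 1.0
mu = (sigma / SQ2) * (1.0 / kappa - kappa)        # (3.1.5)

def pdf(x):
    # AL*(theta, kappa, sigma) density (3.1.10)
    c = (SQ2 / sigma) * kappa / (1.0 + kappa**2)
    rate = SQ2 * kappa / sigma if x >= theta else SQ2 / (sigma * kappa)
    return c * math.exp(-rate * abs(x - theta))

# trapezoidal moments of the density over a wide interval
lo, hi, steps = -60.0, 60.0, 240_000
h = (hi - lo) / steps
m0 = m1 = m2 = 0.0
for i in range(steps + 1):
    x = lo + i * h
    w = 0.5 if i in (0, steps) else 1.0
    p = w * pdf(x)
    m0 += p; m1 += p * x; m2 += p * x * x
m0 *= h; m1 *= h; m2 *= h

mean_err = abs(m1 - (theta + mu))                      # E[Y] = theta + mu
var_err = abs((m2 - m1 * m1) - (mu**2 + sigma**2))     # Var[Y] = mu^2 + sigma^2
```
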
Moments

Let Y ~ AL*(θ, κ, σ). For any integer n > 0, the nth moment of Y about θ, E(Y − θ)^n, is

∫_{−∞}^θ (y − θ)^n (√2/σ)(κ/(1 + κ²)) e^{(√2/(σκ))(y−θ)} dy + ∫_θ^∞ (y − θ)^n (√2/σ)(κ/(1 + κ²)) e^{−(√2κ/σ)(y−θ)} dy.

The substitution of x = θ − y in the first integral and x = y − θ in the second integral leads to

(−1)^n (κ²/(1 + κ²)) ∫₀^∞ x^n (√2/(σκ)) e^{−(√2/(σκ))x} dx + (1/(1 + κ²)) ∫₀^∞ x^n (√2κ/σ) e^{−(√2κ/σ)x} dx.

Thus,

E(Y − θ)^n = n! (σ/(√2κ))^n (1 + (−1)^n κ^{2(n+1)}) / (1 + κ²), (3.1.25)

since for any u > 0 and a > −1 we have

∫₀^∞ x^a u e^{−ux} dx = (1/u^a) ∫₀^∞ x^a e^{−x} dx = Γ(a + 1)/u^a.
In the symmetric case (κ = 1) we obtain the moments (2.1.14) of the classical Laplace distribution with s = σ/√2.

Absolute moments

To obtain absolute moments of an AL distribution, we follow essentially the calculation leading to the moment formula (3.1.25), obtaining

E[|Y − θ|^a] = (σ/(√2κ))^a Γ(a + 1) (1 + κ^{2(a+1)}) / (1 + κ²), a > −1. (3.1.26)
Mean deviation

Let Y have an AL*(θ, κ, σ) distribution with density f_{θ,κ,σ} given by (3.1.10). Then, by (3.1.24), the mean deviation of Y is

E|Y − E[Y]| = ∫_{−∞}^∞ |y − θ − (σ/√2)(1/κ − κ)| f_{θ,κ,σ}(y) dy.

After a straightforward but tedious integration we obtain

E|Y − E[Y]| = (√2σ / (κ(1 + κ²))) e^{κ²−1}, (3.1.27)

which equals σ/√2 for the symmetric case with µ = 0, cf. (2.1.19). Further, since the standard deviation of Y is

√(Var(Y)) = √(σ² + (σ²/2)(1/κ − κ)²) = σ√(1 + κ⁴) / (√2κ),

we have

Mean deviation / Standard deviation = 2e^{κ²−1} / ((1 + κ²)√(1 + κ⁴)).

For the symmetric Laplace distribution (κ = 1), the above ratio is equal to 1/√2, as previously derived in (2.1.20).
Coefficient of variation

For a r.v. X with the mean not equal to zero, the coefficient of variation is defined as

√(Var(X)) / |EX|.

For Y ~ AL(θ, µ, σ) with θ ≠ −µ, the mean of Y is non-zero and the coefficient of variation is equal to

√(µ² + σ²) / |θ + µ|. (3.1.28)

For θ = 0 and µ ≠ 0, we obtain

√(σ²/µ² + 1) = √(1/κ² + κ²) / |1/κ − κ|. (3.1.29)

Note that in this case the absolute value of the mean is less than or equal to the standard deviation, and thus the coefficient of variation is always greater than or equal to one.
Coefficients of skewness and kurtosis

The coefficient of skewness, defined in (2.1.21), is a measure of symmetry which is independent of scale. For the symmetric Laplace distribution its value is zero, as it is for any symmetric distribution with finite third moment and standard deviation greater than zero. For an AL*(θ, κ, σ) distribution, the coefficient of skewness is non-zero unless κ = 1 (µ = 0). In terms of κ, its value is as follows:
\[
\gamma_{1} = \frac{2\left(1/\kappa^{3}-\kappa^{3}\right)}{\left(1/\kappa^{2}+\kappa^{2}\right)^{3/2}}. \qquad (3.1.30)
\]
It follows from (3.1.30) that the absolute value of γ₁ is bounded by two, and as κ increases within the interval (0, ∞), the corresponding value of γ₁ decreases monotonically from 2 to −2.
Let us now study the peakedness of AL distributions. We saw in Section 2.1 that a symmetric Laplace distribution is leptokurtic, as its coefficient of kurtosis (adjusted), defined in (2.1.22), is equal to three. For an AL*(θ, κ, σ) distribution, we have
\[
\gamma_{2} = 6 - \frac{12}{\left(1/\kappa^{2}+\kappa^{2}\right)^{2}}. \qquad (3.1.31)
\]
Thus, the distribution is leptokurtic and γ₂ varies from 3 [the least value, for the symmetric Laplace distribution with κ = 1, see (2.1.23)] to 6 (the greatest value, attained for the limiting exponential distribution when κ → 0).
Quantiles

Since the distribution function of an asymmetric Laplace distribution is given in closed form, calculation of quantiles, including the median, is quite straightforward. Let ξ_q be the qth quantile of an AL r.v. with distribution function given by (3.1.11). Then, we have
\[
\xi_{q} =
\begin{cases}
\theta + \dfrac{\sigma\kappa}{\sqrt{2}}\log\left\{\dfrac{1+\kappa^{2}}{\kappa^{2}}\,q\right\}
& \text{for } q\in\left(0,\dfrac{\kappa^{2}}{1+\kappa^{2}}\right],\\[3mm]
\theta - \dfrac{\sigma}{\sqrt{2}\,\kappa}\log\left\{(1+\kappa^{2})(1-q)\right\}
& \text{for } q\in\left(\dfrac{\kappa^{2}}{1+\kappa^{2}},1\right).
\end{cases}
\qquad (3.1.32)
\]
Note that for κ = 1 we obtain the quantiles (2.1.24) of the symmetric Laplace distribution. Setting q = 1/2, we obtain the median m:
\[
m = \xi_{1/2} =
\begin{cases}
\theta + \dfrac{\sigma}{\sqrt{2}\,\kappa}\log\left\{\dfrac{2}{1+\kappa^{2}}\right\}
& \text{for } \kappa\le 1,\\[3mm]
\theta - \dfrac{\sigma\kappa}{\sqrt{2}}\log\left\{\dfrac{2\kappa^{2}}{1+\kappa^{2}}\right\}
& \text{for } \kappa>1.
\end{cases}
\qquad (3.1.33)
\]
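Formulas (3.1.32)-(3.1.33) translate directly into code. In the sketch below (Python; the function names are ours), `al_cdf` is our transcription of the distribution function (3.1.11), written in the form obtained by inverting (3.1.32):

```python
import math

SQRT2 = math.sqrt(2.0)

def al_cdf(y, theta, kappa, sigma):
    # distribution function of AL*(theta, kappa, sigma); our transcription
    # of (3.1.11), in the form implied by inverting (3.1.32)
    k2 = kappa ** 2
    if y <= theta:
        return k2 / (1.0 + k2) * math.exp(SQRT2 / (sigma * kappa) * (y - theta))
    return 1.0 - math.exp(-SQRT2 * kappa / sigma * (y - theta)) / (1.0 + k2)

def al_quantile(q, theta, kappa, sigma):
    # formula (3.1.32); q = 1/2 reproduces the median (3.1.33)
    k2 = kappa ** 2
    if q <= k2 / (1.0 + k2):
        return theta + sigma * kappa / SQRT2 * math.log((1.0 + k2) / k2 * q)
    return theta - sigma / (SQRT2 * kappa) * math.log((1.0 + k2) * (1.0 - q))
```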
By setting q = 1/4 and q = 3/4 we obtain the first and third quartiles, Q₁ and Q₃, as well as the interquartile range, equal to
\[
Q_{3}-Q_{1} =
\begin{cases}
\dfrac{\sigma\log 3}{\sqrt{2}\,\kappa}
& \text{for } \kappa\le 1/\sqrt{3},\\[3mm]
\dfrac{\sigma}{\sqrt{2}\,\kappa}\log\left\{\dfrac{4}{1+\kappa^{2}}\right\}
- \dfrac{\sigma\kappa}{\sqrt{2}}\log\left\{\dfrac{1+\kappa^{2}}{4\kappa^{2}}\right\}
& \text{for } 1/\sqrt{3}<\kappa<\sqrt{3},\\[3mm]
\dfrac{\sigma\kappa\log 3}{\sqrt{2}}
& \text{for } \kappa\ge\sqrt{3}.
\end{cases}
\qquad (3.1.34)
\]
In particular, we have
\[
Q_{1}=\theta \quad\text{and}\quad Q_{3}=\theta+\sigma\sqrt{\tfrac{3}{2}}\log 3
\quad\text{for } \kappa=\tfrac{1}{\sqrt{3}},
\]
and
\[
Q_{1}=\theta-\sigma\sqrt{\tfrac{3}{2}}\log 3 \quad\text{and}\quad Q_{3}=\theta
\quad\text{for } \kappa=\sqrt{3}.
\]
Remark 3.1.10 If κ = 1 (µ = 0), the relation (3.1.33) yields m = θ, which is the median of a symmetric Laplace distribution. Similarly, for σ = θ = 0, we get m = µ log 2, which is the median of an exponential distribution with mean µ (to which the asymmetric Laplace law simplifies in this case).

Remark 3.1.11 One can show that for κ ≠ 1, the mode, median, and mean of an AL distribution satisfy the following inequalities:
\[
\begin{aligned}
&\text{If } \kappa<1 \text{ then Mode}<\text{Median}<\text{Mean},\\
&\text{If } \kappa>1 \text{ then Mode}>\text{Median}>\text{Mean}.
\end{aligned}
\qquad (3.1.35)
\]
All three measures of location are equal to θ when κ = 1 (µ = 0), in which case we obtain the symmetric Laplace distribution.
In Table 3.3 below we summarize the moments and related parameters of
AL r.v.’s.
3.2 Representations

In this section we present the representations and characterizations of AL distributions that are generalizations of the corresponding properties of the symmetric Laplace distributions, as presented in Section 2.2.
3.2.1 Mixture of normal distributions

A symmetric Laplace r.v. can be regarded (informally) as a normal r.v. with mean zero and variance that is an exponentially distributed random variable (see Proposition 2.2.1). AL r.v.'s admit a similar interpretation, where the mean is a random variable as well. We state it more formally in the result below.
\[
\begin{array}{lll}
\text{Parameter} & \text{Definition} & \text{Value}\\[2mm]
\text{Absolute moment} & E|Y|^{a},\ a>-1 &
\left(\frac{\sigma}{\sqrt{2}\,\kappa}\right)^{a}\Gamma(a+1)\,\frac{1+\kappa^{2(a+1)}}{1+\kappa^{2}}\\[2mm]
n\text{th moment} & EY^{n} &
n!\left(\frac{\sigma}{\sqrt{2}\,\kappa}\right)^{n}\frac{1+(-1)^{n}\kappa^{2(n+1)}}{1+\kappa^{2}}\\[2mm]
n\text{th cumulant} & \kappa_{n} &
(n-1)!\left(\frac{\sigma}{\sqrt{2}\,\kappa}\right)^{n}\left(1+(-1)^{n}\kappa^{2n}\right)\\[2mm]
\text{Mean} & EY &
\frac{\sigma}{\sqrt{2}}\left(\frac{1}{\kappa}-\kappa\right)=\mu\\[2mm]
\text{Variance} & E(Y-EY)^{2} & \mu^{2}+\sigma^{2}\\[2mm]
\text{Mean deviation} & E|Y-EY| &
\frac{\sqrt{2}\,\sigma\,e^{\kappa^{2}-1}}{\kappa(1+\kappa^{2})}\\[2mm]
\text{Coeff. of variation} & \frac{\sqrt{Var(Y)}}{|EY|} &
\sqrt{\frac{\sigma^{2}}{\mu^{2}}+1}=\frac{\sqrt{\kappa^{-2}+\kappa^{2}}}{|\kappa^{-1}-\kappa|}\\[2mm]
\text{Coeff. of skewness} & \gamma_{1}=\frac{E(Y-EY)^{3}}{(E(Y-EY)^{2})^{3/2}} &
\frac{2\left(1/\kappa^{3}-\kappa^{3}\right)}{\left(1/\kappa^{2}+\kappa^{2}\right)^{3/2}}\\[2mm]
\text{Kurtosis (adjusted)} & \gamma_{2}=\frac{E(Y-EY)^{4}}{(Var(Y))^{2}}-3 &
6-\frac{12}{\left(1/\kappa^{2}+\kappa^{2}\right)^{2}}\\[2mm]
\text{Median} & m=F^{-1}_{0,\kappa,\sigma}(1/2) &
m=\begin{cases}
-\frac{\sigma}{\sqrt{2}\,\kappa}\log\frac{1+\kappa^{2}}{2}, & \kappa\le 1,\\[1mm]
\frac{\sigma\kappa}{\sqrt{2}}\log\frac{1+\kappa^{2}}{2\kappa^{2}}, & \kappa>1
\end{cases}
\end{array}
\]
Table 3.3: Moments and related parameters of Y ∼ AL*(θ, κ, σ) with θ = 0.
Proposition 3.2.1 An AL(θ, µ, σ) random variable Y with ch.f. (3.1.1) admits the representation
\[
Y \stackrel{d}{=} \theta + \mu W + \sigma\sqrt{W}\,Z, \qquad (3.2.1)
\]
where Z is standard normal and W is standard exponential, independent of Z.
Proof. Let W have an exponential distribution with p.d.f. e^{−w}, w > 0. Conditioning on W, we can express the ch.f. of the right-hand side of (3.2.1) as follows:
\[
E\left[e^{it(\theta+\mu W+\sigma\sqrt{W}Z)}\right]
= \int_{0}^{\infty} e^{it\theta+it\mu w}\,
E\left[e^{it\sigma\sqrt{w}\,Z}\right] e^{-w}\,dw.
\]
Note that
\[
E\left[e^{it\sigma\sqrt{w}\,Z}\right] = \phi_{Z}(t\sigma\sqrt{w}) = e^{-\frac{1}{2}t^{2}\sigma^{2}w},
\]
where φ_Z(s) = e^{−s²/2} is the ch.f. of a standard normal r.v. Z. Thus,
\[
E\left[e^{it(\theta+\mu W+\sigma\sqrt{W}Z)}\right]
= \int_{0}^{\infty} e^{it\theta}\,
e^{-w\left(1+\frac{1}{2}t^{2}\sigma^{2}-i\mu t\right)}\,dw,
\]
which produces the ch.f. (3.1.1), and the result follows.

Note that in the symmetric case (µ = 0) we obtain the representation of the classical Laplace r.v. discussed in Proposition 2.2.1 (Chapter 2) and the remarks following it.
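Representation (3.2.1) also yields a convenient simulation recipe; a Python sketch (our function names) draws from the mixture and checks the first two moments against E Y = θ + µ and Var Y = µ² + σ²:

```python
import math
import random

def al_mixture_draw(theta, mu, sigma, rng):
    # one draw of theta + mu W + sigma sqrt(W) Z, cf. (3.2.1)
    w = rng.expovariate(1.0)      # standard exponential W
    z = rng.gauss(0.0, 1.0)       # standard normal Z, independent of W
    return theta + mu * w + sigma * math.sqrt(w) * z

def sample_mean_var(theta, mu, sigma, n=200_000, seed=7):
    rng = random.Random(seed)
    xs = [al_mixture_draw(theta, mu, sigma, rng) for _ in range(n)]
    m = sum(xs) / n
    v = sum((x - m) ** 2 for x in xs) / n
    return m, v
```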
3.2.2 Convolution of exponential distributions

We now formally state the representation (3.1.9) in the following result:

Proposition 3.2.2 An AL*(θ, κ, σ) random variable Y with ch.f. (3.1.3) admits the representation (3.1.9), where W₁ and W₂ are i.i.d. standard exponential random variables.

Note that for κ = 1 we obtain the representation of the classical Laplace distribution discussed in Proposition 2.2.2 (Chapter 2) and the remarks following it.
Remark 3.2.1 Denoting H_i = 2W_i, i = 1, 2, we have
\[
Y \stackrel{d}{=} \theta + \frac{\sigma}{2\sqrt{2}}\left(\frac{1}{\kappa}H_{1}-\kappa H_{2}\right), \qquad (3.2.2)
\]
where H₁ and H₂ are i.i.d. chi-square r.v.'s with two degrees of freedom.
Remark 3.2.2 Since a standard exponential r.v. W has the same distribution as −log U, where U is a standard uniform variable, we have the following representation of Y in terms of two i.i.d. standard uniform variables U₁ and U₂:
\[
Y \stackrel{d}{=} \theta + \frac{\sigma}{\sqrt{2}}\log\left(\frac{U_{1}^{\kappa}}{U_{2}^{1/\kappa}}\right). \qquad (3.2.3)
\]
It generalizes a similar representation of the classical Laplace distribution with κ = 1.
Remark 3.2.3 Similarly, we can express an AL r.v. in terms of two i.i.d. Pareto Type I r.v.'s, P₁ and P₂, with density f(x) = 1/x², x ≥ 1. Indeed, as already mentioned in Section 2.2.3, a standard exponential r.v. W has the same distribution as log P₁, so that by (3.1.9) we have
\[
Y \stackrel{d}{=} \theta + \frac{\sigma}{\sqrt{2}}\log\left(\frac{P_{1}^{1/\kappa}}{P_{2}^{\kappa}}\right). \qquad (3.2.4)
\]
A similar representation of the classical Laplace distribution was obtained in Proposition 2.2.4.
Remark 3.2.4 The representation of Proposition 3.2.2 may be expressed alternatively as follows:
\[
Y \stackrel{d}{=} \theta + \frac{\sigma}{\sqrt{2}}\,I\,W, \qquad (3.2.5)
\]
where the r.v.'s I and W are independent, W is a standard exponential variable, while I takes on the values −κ and 1/κ with probabilities κ²/(1+κ²) and 1/(1+κ²), respectively. In the symmetric case with κ = 1 (µ = 0), the random variable I takes on the values ±1 with probabilities 1/2, and (3.2.5) reduces to the representation (2.2.10) of the symmetric Laplace r.v. with the scale parameter s = σ/√2.
3.2.3 Self-decomposability

We have seen in Section 2.4.3 that all symmetric Laplace random variables Y are self-decomposable, that is, for every c ∈ (0, 1) we have
\[
Y \stackrel{d}{=} cY + X,
\]
where X and Y are independent variables. Ramachandran (1997) shows that all AL distributions are self-decomposable as well. In fact, we have the following explicit representation:
Proposition 3.2.3 Let Y ∼ AL*(θ, κ, σ). Then Y is self-decomposable, and for any c ∈ [0, 1] we have
\[
Y \stackrel{d}{=} cY + (1-c)\theta
+ \frac{\sigma}{\sqrt{2}}\left(\frac{1}{\kappa}\delta_{1}W_{1}-\kappa\delta_{2}W_{2}\right), \qquad (3.2.6)
\]
where δ₁, δ₂ are dependent Bernoulli r.v.'s taking on values of either zero or one with the probabilities
\[
P(\delta_{1}=0,\delta_{2}=0)=c^{2}, \qquad P(\delta_{1}=1,\delta_{2}=1)=0,
\]
\[
P(\delta_{1}=1,\delta_{2}=0)=(1-c)\left(c+\frac{1-c}{1+\kappa^{2}}\right),
\]
\[
P(\delta_{1}=0,\delta_{2}=1)=(1-c)\left(c+\frac{(1-c)\kappa^{2}}{1+\kappa^{2}}\right),
\]
W₁ and W₂ are standard exponential variables, and Y, W₁, W₂, and (δ₁, δ₂) are mutually independent.
Proof. The representation (3.2.6) follows directly from the following equality for ch.f.'s:
\[
\frac{\left(1+i\frac{\sigma}{\sqrt{2}}c\kappa t\right)\left(1-i\frac{\sigma}{\sqrt{2}}\frac{c}{\kappa}t\right)}
{\left(1+i\frac{\sigma}{\sqrt{2}}\kappa t\right)\left(1-i\frac{\sigma}{\sqrt{2}}\frac{1}{\kappa}t\right)}
= c^{2}
+(1-c)\left(c+\frac{1-c}{1+\kappa^{2}}\right)\frac{1}{1-i\frac{\sigma}{\sqrt{2}}\frac{1}{\kappa}t}
+(1-c)\left(c+\frac{(1-c)\kappa^{2}}{1+\kappa^{2}}\right)\frac{1}{1+i\frac{\sigma}{\sqrt{2}}\kappa t}.
\]
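The partial-fraction identity in the proof is easy to confirm numerically; a Python sketch (our function names) compares the two sides as complex numbers:

```python
import math

SQRT2 = math.sqrt(2.0)

def chf_ratio_lhs(t, c, kappa, sigma):
    # psi(t)/psi(c t) for the AL* ch.f. with theta = 0 (left side of the identity)
    a = sigma * kappa / SQRT2
    b = sigma / (SQRT2 * kappa)
    return (((1 + 1j * a * c * t) * (1 - 1j * b * c * t))
            / ((1 + 1j * a * t) * (1 - 1j * b * t)))

def chf_mixture_rhs(t, c, kappa, sigma):
    # the three-term mixture on the right side, with the delta-probabilities
    # of Proposition 3.2.3
    a = sigma * kappa / SQRT2
    b = sigma / (SQRT2 * kappa)
    k2 = kappa ** 2
    p10 = (1 - c) * (c + (1 - c) / (1 + k2))        # P(delta1=1, delta2=0)
    p01 = (1 - c) * (c + (1 - c) * k2 / (1 + k2))   # P(delta1=0, delta2=1)
    return c ** 2 + p10 / (1 - 1j * b * t) + p01 / (1 + 1j * a * t)
```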
Remark 3.2.5 Note that in the symmetric case κ = 1, the representation (3.2.6) reduces to that of a symmetric Laplace distribution with s = σ/√2; see Proposition 2.4.4.
Remark 3.2.6 Note the following version of the above representation:
\[
Y \stackrel{d}{=} cY + (1-c)\theta
+ \left(\frac{\delta_{1}}{\kappa}-\delta_{2}\kappa\right)\frac{\sigma}{\sqrt{2}}\,W,
\]
where the δᵢ's are as before, W has the standard exponential distribution, and Y, W, (δ₁, δ₂) are independent.
By taking c = 0 in (3.2.6) we obtain the representation of an AL r.v. Y as a mixture of exponentially distributed random variables:
\[
Y \stackrel{d}{=} \theta + \frac{\sigma}{\sqrt{2}}\left(\frac{1}{\kappa}\delta_{1}W_{1}-\kappa\delta_{2}W_{2}\right), \qquad (3.2.7)
\]
where the zero-one variables δ₁ and δ₂, δ₁ + δ₂ = 1, assume the value one with probabilities 1/(1+κ²) and κ²/(1+κ²), respectively, and are independent of the i.i.d. exponential variables W₁ and W₂. This is essentially the representation from Proposition 3.2.2.
3.2.4 Relation to 2 × 2 normal determinants

We have the following extension of Proposition 2.2.5 to the case of an AL random variable.

Proposition 3.2.4 Let Y ∼ AL*(θ, κ, σ) with θ = 0 and σ = √2, and let (X₁, X₂) and (X₃, X₄) be i.i.d. bivariate normal r.v.'s with mean vector zero and variance-covariance matrix
\[
\Sigma = \frac{1}{2\kappa}
\begin{pmatrix}
1+\kappa^{2} & 1-\kappa^{2}\\
1-\kappa^{2} & 1+\kappa^{2}
\end{pmatrix}. \qquad (3.2.8)
\]
Then,
\[
Y \stackrel{d}{=} X_{1}X_{2}+X_{3}X_{4}. \qquad (3.2.9)
\]
Note that if Y is symmetric Laplace (κ = 1), then Σ is an identity matrix, so that all four variables X₁, X₂, X₃, X₄ are i.i.d. standard normal (see Proposition 2.2.5). For this case the representation (3.2.9) was derived in Mantel and Pasternack (1966) by an appropriate representation in terms of chi-square random variables [see also Farebrother (1986)], and in Mantel (1973) by calculating the appropriate characteristic functions [see also comments in Mantel (1987) and Missiakoulis and Darton (1985)]. Here we prove our generalization for the asymmetric Laplace distribution using appropriate representations in terms of random variables.
Proof. Let Z₁, Z₂, Z₃, Z₄ be i.i.d. standard normal r.v.'s. Note that the Xᵢ's have the following representation:
\[
(X_{1},X_{2}) \stackrel{d}{=}
\left(\frac{Z_{1}-\kappa Z_{3}}{\sqrt{2\kappa}},\ \frac{Z_{1}+\kappa Z_{3}}{\sqrt{2\kappa}}\right), \qquad (3.2.10)
\]
\[
(X_{3},X_{4}) \stackrel{d}{=}
\left(\frac{Z_{2}-\kappa Z_{4}}{\sqrt{2\kappa}},\ \frac{Z_{2}+\kappa Z_{4}}{\sqrt{2\kappa}}\right). \qquad (3.2.11)
\]
Indeed, to see (3.2.10) note that the linear combinations of the Zᵢ's are normal with
\[
Var\left(\frac{Z_{1}-\kappa Z_{3}}{\sqrt{2\kappa}}\right)
= Var\left(\frac{Z_{1}+\kappa Z_{3}}{\sqrt{2\kappa}}\right)
= \frac{1}{2\kappa}(1+\kappa^{2}) \qquad (3.2.12)
\]
and
\[
Cov\left(\frac{Z_{1}-\kappa Z_{3}}{\sqrt{2\kappa}},\ \frac{Z_{1}+\kappa Z_{3}}{\sqrt{2\kappa}}\right)
= \frac{1}{2\kappa}(1-\kappa^{2}), \qquad (3.2.13)
\]
which correspond to the entries of Σ given by (3.2.8). Similar arguments apply to (3.2.11). Next, write
\[
X_{1}X_{2}+X_{3}X_{4} \stackrel{d}{=}
\frac{1}{2\kappa}\left\{(Z_{1}-\kappa Z_{3})(Z_{1}+\kappa Z_{3})+(Z_{2}-\kappa Z_{4})(Z_{2}+\kappa Z_{4})\right\}
= \frac{1}{2\kappa}\left(Z_{1}^{2}-\kappa^{2}Z_{3}^{2}+Z_{2}^{2}-\kappa^{2}Z_{4}^{2}\right)
= \frac{1}{2\kappa}\left(H_{1}-\kappa^{2}H_{2}\right), \qquad (3.2.14)
\]
where
\[
H_{1}=Z_{1}^{2}+Z_{2}^{2} \quad\text{and}\quad H_{2}=Z_{3}^{2}+Z_{4}^{2} \qquad (3.2.15)
\]
are two i.i.d. χ² r.v.'s with two degrees of freedom. Finally, note that Hᵢ =d 2Wᵢ, i = 1, 2, where the Wᵢ's are i.i.d. standard exponential variables, so that (3.2.14) reduces to (3.1.9) and the result follows.
Table 3.4 summarizes the representations studied in this section.
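The construction (3.2.10) of correlated pairs from independent normals can be checked against the entries of Σ in (3.2.8) by simulation; a Python sketch (our function names):

```python
import math
import random

def xy_pair(kappa, rng):
    # (X1, X2) built from two i.i.d. standard normals as in (3.2.10)
    z1, z3 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    d = math.sqrt(2.0 * kappa)
    return (z1 - kappa * z3) / d, (z1 + kappa * z3) / d

def empirical_cov(kappa, n=200_000, seed=11):
    # empirical second moments of the pair; both components have mean zero
    rng = random.Random(seed)
    sxx = sxy = syy = 0.0
    for _ in range(n):
        x, y = xy_pair(kappa, rng)
        sxx += x * x
        sxy += x * y
        syy += y * y
    return sxx / n, sxy / n, syy / n
```

For κ = 2 the theoretical entries are (1 + κ²)/(2κ) = 1.25 on the diagonal and (1 − κ²)/(2κ) = −0.75 off it.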
\[
\begin{array}{ll}
\text{Representation} & \text{Variables}\\[2mm]
\mu W+\sqrt{W}\,Z &
Z\ \text{-- standard normal r.v.};\ W\ \text{-- exponentially distributed r.v.}\\[2mm]
\frac{1}{\sqrt{2}}\left(\frac{1}{\kappa}W_{1}-\kappa W_{2}\right) &
W_{1},W_{2}\ \text{-- standard exponential r.v.'s}\\[2mm]
\frac{1}{2\sqrt{2}}\left(\frac{1}{\kappa}H_{1}-\kappa H_{2}\right) &
H_{1},H_{2}\ \text{-- }\chi^{2}\ \text{r.v.'s with two degrees of freedom}\\[2mm]
\frac{1}{\sqrt{2}}\,IW &
I\ \text{takes on values }-\kappa\ \text{and}\ \frac{1}{\kappa}\ \text{with probabilities}\ \frac{\kappa^{2}}{1+\kappa^{2}}\ \text{and}\ \frac{1}{1+\kappa^{2}};\ W\ \text{-- standard exponential r.v.}\\[2mm]
\frac{1}{\sqrt{2}}\log\left(P_{1}^{1/\kappa}/P_{2}^{\kappa}\right) &
P_{1},P_{2}\ \text{-- Pareto r.v.'s with p.d.f.}\ f(p)=1/p^{2},\ p>1\\[2mm]
\frac{1}{\sqrt{2}}\log\left(U_{1}^{\kappa}/U_{2}^{1/\kappa}\right) &
U_{1},U_{2}\ \text{-- r.v.'s uniformly distributed on}\ [0,1]\\[2mm]
X_{1}X_{2}+X_{3}X_{4} &
(X_{1},X_{2})\ \text{and}\ (X_{3},X_{4})\ \text{are bivariate normal with mean 0 and covariance given by (3.2.8)}\\[2mm]
\frac{1}{\sqrt{2}}\left(\frac{1}{\kappa}\delta_{1}W_{1}-\kappa\delta_{2}W_{2}\right) &
W_{1},W_{2}\ \text{-- standard exponential r.v.'s};\ (\delta_{1},\delta_{2})\ \text{assumes values }(1,0)\ \text{and}\ (0,1)\ \text{with probabilities}\ \frac{1}{1+\kappa^{2}}\ \text{and}\ \frac{\kappa^{2}}{1+\kappa^{2}}
\end{array}
\]
Table 3.4: Summary of the representations of the standard AL(0, µ, 1) [or AL*(0, κ, 1)] random variables. All random variables (or vectors) in each representation are mutually independent.
3.3 Simulation

Random variate generation from an AL distribution is straightforward. Since the AL distribution function has a closed form, and so does its inverse, the inversion method can be applied [see, e.g., Devroye (1986)]. Alternatively, we can use any of the representations discussed in Section 3.2. The representation (3.2.3) in terms of two i.i.d. uniform variables seems to be the most suitable for simulation, as these can be obtained directly. Here is an AL generator based on this representation.

An AL*(θ, κ, σ) generator:
Generate a uniform [0, 1] random variate U₁.
Generate a uniform [0, 1] random variate U₂, independent of U₁.
Set Y ← θ + (σ/√2) log(U₁^κ / U₂^{1/κ}).
RETURN Y.

Remark 3.3.1 To generate an AL(θ, µ, σ) variate, first compute the parameter κ using the relation (3.1.4), and then apply the above algorithm.
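The generator above can be sketched in Python (our function names). Here `kappa_from_mu` is our stand-in for the relation (3.1.4): it solves µ = (σ/√2)(1/κ − κ) for the positive root κ.

```python
import math
import random

SQRT2 = math.sqrt(2.0)

def ral_star(theta, kappa, sigma, rng=random):
    # one AL*(theta, kappa, sigma) variate via the uniform representation (3.2.3)
    u1, u2 = rng.random(), rng.random()
    return theta + sigma / SQRT2 * math.log(u1 ** kappa / u2 ** (1.0 / kappa))

def kappa_from_mu(mu, sigma):
    # positive root of mu = (sigma/sqrt(2)) (1/kappa - kappa); our derivation,
    # standing in for the book's relation (3.1.4)
    return (math.sqrt(mu * mu + 2.0 * sigma * sigma) - mu) / (SQRT2 * sigma)
```

Seeding the generator and checking the sample mean and the mass below θ (which should be κ²/(1+κ²)) gives a quick sanity test.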
3.4 Characterizations and further properties

3.4.1 Infinite divisibility

In Section 2.4.1 of Chapter 2 we discussed the fundamental concept of infinite divisibility and showed that all symmetric Laplace laws are infinitely divisible. Similarly, all AL distributions are infinitely divisible as well, as their ch.f. ψ given by (3.1.3) can be factored as
\[
\psi(t) = \left[e^{i\theta t/n}
\left(\frac{1}{1-i\frac{\sigma}{\sqrt{2}\,\kappa}t}\right)^{1/n}
\left(\frac{1}{1+i\frac{\sigma\kappa}{\sqrt{2}}t}\right)^{1/n}\right]^{n}
= [\psi_{n}(t)]^{n} \qquad (3.4.1)
\]
for each integer n ≥ 1. The ch.f. ψₙ corresponds to the random variable
\[
\frac{\theta}{n} + \frac{\sigma}{\sqrt{2}}\left(\frac{1}{\kappa}G_{1}-\kappa G_{2}\right), \qquad (3.4.2)
\]
where G₁ and G₂ are i.i.d. gamma Γ(1/n, 1) random variables with density
\[
f(x) = \frac{1}{\Gamma(1/n)}\,x^{1/n-1}e^{-x}, \quad x>0. \qquad (3.4.3)
\]
Generalizations of the Laplace distribution such as (3.4.2), whose characteristic functions are powers of the AL ch.f., are known as Bessel function distributions, and will be the subject of Section 4.1 of Chapter 4.
The following result summarizes our discussion.
Proposition 3.4.1 Let Y ∼ AL*(θ, κ, σ). Then Y is infinitely divisible, admitting for each integer n ≥ 1 the representation
\[
Y \stackrel{d}{=} \sum_{i=1}^{n} X_{ni}, \qquad (3.4.4)
\]
where the X_{ni}'s are i.i.d. variables given by (3.4.2).
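The factorization (3.4.1) can be verified numerically; a Python sketch (our function names) compares [ψₙ(t)]ⁿ with ψ(t):

```python
import cmath
import math

SQRT2 = math.sqrt(2.0)

def al_chf(t, theta, kappa, sigma):
    # AL* ch.f. (3.1.3), as factored in (3.4.1)
    return (cmath.exp(1j * theta * t)
            / ((1 - 1j * sigma / (SQRT2 * kappa) * t)
               * (1 + 1j * sigma * kappa / SQRT2 * t)))

def al_chf_nth_factor(t, theta, kappa, sigma, n):
    # psi_n of (3.4.1): the ch.f. of the gamma-convolution summand (3.4.2)
    return (cmath.exp(1j * theta * t / n)
            * (1 - 1j * sigma / (SQRT2 * kappa) * t) ** (-1.0 / n)
            * (1 + 1j * sigma * kappa / SQRT2 * t) ** (-1.0 / n))
```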
Our next result reveals the Lévy-Khinchine representation of an AL characteristic function, which was derived in Takano (1989, 1990).

Proposition 3.4.2 The ch.f. ψ of a Y ∼ AL*(θ, κ, σ) r.v. admits the Lévy-Khinchine representation
\[
\psi(t) = \exp\left\{it\theta + \int_{\mathbb{R}}(e^{itu}-1)\,\lambda(u)\,du\right\}, \qquad (3.4.5)
\]
where
\[
\lambda(u) = \frac{1}{|u|}
\begin{cases}
e^{-\frac{\sqrt{2}\,\kappa}{\sigma}|u|}, & \text{for } u>0,\\[1mm]
e^{-\frac{\sqrt{2}}{\kappa\sigma}|u|}, & \text{for } u<0.
\end{cases}
\qquad (3.4.6)
\]
Proof. Recall that the Lévy measure of an exponential distribution with parameter β > 0 has density e^{−βu}/u, u > 0, i.e.,
\[
\frac{1}{1-it/\beta}
= \exp\left\{\int_{0}^{\infty}(e^{itu}-1)\frac{1}{u}e^{-\beta u}\,du\right\},
\quad \beta>0,\ t\in\mathbb{R}.
\]
Consequently,
\[
\frac{1}{1-i\frac{\sigma\kappa}{\sqrt{2}}t}
= \exp\left\{\int_{0}^{\infty}(e^{ity}-1)\frac{1}{y}e^{-\frac{\sqrt{2}}{\sigma\kappa}y}\,dy\right\},
\quad t\in\mathbb{R}, \qquad (3.4.7)
\]
and
\[
\frac{1}{1-i\frac{\sigma}{\sqrt{2}\,\kappa}t}
= \exp\left\{\int_{0}^{\infty}(e^{itu}-1)\frac{1}{u}e^{-\frac{\sqrt{2}\,\kappa}{\sigma}u}\,du\right\},
\quad t\in\mathbb{R}. \qquad (3.4.8)
\]
Replacing t with −t in (3.4.7) and substituting y = −u, we obtain
\[
\frac{1}{1+i\frac{\sigma\kappa}{\sqrt{2}}t}
= \exp\left\{\int_{-\infty}^{0}(e^{itu}-1)\frac{1}{|u|}e^{-\frac{\sqrt{2}}{\sigma\kappa}|u|}\,du\right\},
\quad t\in\mathbb{R}. \qquad (3.4.9)
\]
The multiplication of the corresponding sides of (3.4.8) and (3.4.9), coupled with (3.1.3), produces (3.4.5)-(3.4.6).
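The representation (3.4.5)-(3.4.6) can be checked by numerical integration; a Python sketch (our function names) compares the Lévy integral against log ψ(t) for θ = 0:

```python
import cmath
import math

SQRT2 = math.sqrt(2.0)

def levy_density(u, kappa, sigma):
    # the Levy density (3.4.6)
    if u > 0:
        return math.exp(-SQRT2 * kappa / sigma * u) / u
    return math.exp(-SQRT2 / (kappa * sigma) * (-u)) / (-u)

def levy_exponent(t, kappa, sigma, L=80.0, m=20000):
    # Simpson approximation of int_R (e^{itu} - 1) lambda(u) du; the integrand
    # is bounded near u = 0, so each half-line is started at a tiny eps
    eps = 1e-9
    total = 0.0 + 0.0j
    for sgn in (1.0, -1.0):
        h = (L - eps) / m
        s = 0.0 + 0.0j
        for i in range(m + 1):
            u = sgn * (eps + i * h)
            w = 1.0 if i in (0, m) else (4.0 if i % 2 else 2.0)
            s += w * (cmath.exp(1j * t * u) - 1.0) * levy_density(u, kappa, sigma)
        total += s * h / 3.0
    return total

def log_al_chf(t, kappa, sigma):
    # log psi(t) for theta = 0, psi as in (3.1.3)
    return -cmath.log((1 - 1j * sigma / (SQRT2 * kappa) * t)
                      * (1 + 1j * sigma * kappa / SQRT2 * t))
```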
In Figure 3.2, we see graphs of the Lévy densities for various specifications of the parameter µ (σ = 1).

[Figure 3.2: nine panels, horizontal axis from −5 to 20, vertical axis from 0.0 to 1.0, labeled from µ = 0, κ = 1 to µ = 10, κ ≈ 0.1.]

Figure 3.2: Densities of the Lévy measures for standard asymmetric Laplace distributions with µ = 0, 0.8, 1.5, 2, 3, 4, 6, 8, 10, which correspond to κ ≈ 1.0, 0.68, 0.50, 0.41, 0.30, 0.24, 0.16, 0.12, 0.1 (the densities of these distributions are illustrated in Figure 3.1).
3.4.2 Geometric infinite divisibility

In Section 2.4.2 of Chapter 2 we discussed the class of geometric infinitely divisible laws, and showed that all symmetric Laplace distributions with mean zero belong to this group. More generally, all AL laws with mode equal to zero are geometric infinitely divisible as well, as shown by the following

Proposition 3.4.3 If Y ∼ AL(0, µ, σ), then Y is geometric infinitely divisible and for all p ∈ (0, 1) we have
\[
Y \stackrel{d}{=} \sum_{i=1}^{\nu_{p}} Y_{p}^{(i)}, \qquad (3.4.10)
\]
where ν_p is a geometric r.v. with mean 1/p, the r.v.'s Y_p^{(i)} are i.i.d. AL(0, pµ, √p σ) for each p, and ν_p and (Y_p^{(i)}) are independent.
Proof. Let f_p be the ch.f. of Y_p^{(i)}. Conditioning on ν_p, we find the ch.f. of the right-hand side of (3.4.10) to be
\[
E\left[e^{it\sum_{i=1}^{\nu_{p}}Y_{p}^{(i)}}\right]
= \sum_{n=1}^{\infty}E\left[e^{it\sum_{i=1}^{n}Y_{p}^{(i)}}\right](1-p)^{n-1}p
= \frac{p\,f_{p}(t)}{1-(1-p)f_{p}(t)}. \qquad (3.4.11)
\]
When we now substitute
\[
f_{p}(t) = \frac{1}{1+\frac{1}{2}p\sigma^{2}t^{2}-i\mu p t},
\]
which is the ch.f. of the AL(0, pµ, √p σ) distribution, into (3.4.11), we obtain the ch.f. of Y given by (3.1.1) with θ = 0.
Remark 3.4.1 If Y ∼ AL*(0, κ, σ), then (3.4.10) holds with Y_p^{(i)} having the AL*(0, κ_p, √p σ) distribution, where
\[
\kappa_{p} = \frac{\sqrt{p\left(\frac{1}{\kappa}-\kappa\right)^{2}+4}
-\sqrt{p}\left(\frac{1}{\kappa}-\kappa\right)}{2}, \qquad (3.4.12)
\]
see Exercise 3.6.17.
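Formula (3.4.12) can be checked against its defining requirement: the summand AL*(0, κ_p, √p σ) must have mean pµ, which amounts to 1/κ_p − κ_p = √p (1/κ − κ). A Python sketch (our function names):

```python
import math

def kappa_p(p, kappa):
    # formula (3.4.12)
    c = 1.0 / kappa - kappa
    return (math.sqrt(p * c * c + 4.0) - math.sqrt(p) * c) / 2.0
```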
3.4.3 Distributional limits of geometric sums

We saw in Section 2.2.7 of Chapter 2 that the class of symmetric Laplace distributions with zero mean coincides with the class of distributional limits as p → 0 of (appropriately normalized) geometric sums
\[
X_{1}+\cdots+X_{\nu_{p}},
\]
where X₁, X₂, … are non-degenerate and symmetric i.i.d. r.v.'s with finite variance, and ν_p is a geometric r.v. with mean 1/p, independent of the Xᵢ's. It turns out that if we omit the assumption of symmetry, then the limiting class coincides with the family of AL distributions.
Proposition 3.4.4 The class of AL distributions with mode equal to zero coincides with the class of non-degenerate distributional limits of
\[
S_{p} = a_{p}\sum_{i=1}^{\nu_{p}}(X_{i}+b_{p}) \qquad (3.4.13)
\]
as p → 0, where X₁, X₂, … are non-degenerate i.i.d. r.v.'s with finite variance, and ν_p is a geometric r.v. with mean 1/p, independent of the Xᵢ's. Moreover, if EXᵢ = µ and Var(Xᵢ) = σ², then the normalizing sequences in (3.4.13) may be taken as
\[
a_{p} = p^{1/2}, \qquad b_{p} = \mu(p^{1/2}-1), \qquad (3.4.14)
\]
in which case S_p converges in distribution to the AL(0, µ, σ) random variable.
Proof. First, we shall show that if Y ∼ AL(0, µ, σ), then Y is the distributional limit of S_p, where the Xᵢ's are i.i.d. r.v.'s with EXᵢ = µ and Var(Xᵢ) = σ², while the normalizing sequences are given by (3.4.14). Thus, we need to show the convergence
\[
p^{1/2}\sum_{j=1}^{\nu_{p}}\left(X_{j}-\mu+p^{1/2}\mu\right)
\stackrel{d}{\longrightarrow} Y, \qquad (3.4.15)
\]
where Y is an AL r.v. with ch.f. ψ given by (3.1.1) with θ = 0. Writing (3.4.15) in terms of ch.f.'s, we obtain
\[
\frac{p\,e^{ip\mu t}\varphi(p^{1/2}t)}{1-(1-p)e^{ip\mu t}\varphi(p^{1/2}t)}
\longrightarrow \psi(t), \qquad (3.4.16)
\]
where φ is the ch.f. of X_j − µ. Taking reciprocals, we can express (3.4.16) as
\[
\frac{1-(1-p)e^{ip\mu t}\varphi(p^{1/2}t)}{p\,e^{ip\mu t}\varphi(p^{1/2}t)}
\longrightarrow 1+\frac{1}{2}\sigma^{2}t^{2}-i\mu t. \qquad (3.4.17)
\]
Note that the factor φ(p^{1/2}t) tends to one as p converges to zero, so that we can write equivalently (splitting the numerator)
\[
\frac{e^{-ip\mu t}-1}{p} + \frac{1-(1-p)\varphi(p^{1/2}t)}{p}
= I + II \longrightarrow 1+\frac{1}{2}\sigma^{2}t^{2}-i\mu t. \qquad (3.4.18)
\]
First, we show that I → −iµt. Indeed, we have:
\[
\frac{e^{-ip\mu t}-1}{p}
= -i\mu t\,\frac{\sin(p\mu t)}{p\mu t} + \frac{\cos(p\mu t)-1}{p\mu t}\,\mu t
\longrightarrow -i\mu t + 0.
\]
To establish the convergence
\[
II = \frac{1-(1-p)\varphi(p^{1/2}t)}{p} \longrightarrow 1+\frac{1}{2}\sigma^{2}t^{2} \qquad (3.4.19)
\]
we use Theorem 8.44 from Breiman (1993). Since W_j = X_j − µ has finite first two moments, the ch.f. of W_j can be written as
\[
\varphi(u) = 1 + iuEW_{j} + \frac{(iu)^{2}}{2}\left(EW_{j}^{2}+\delta(u)\right),
\]
where δ denotes a bounded function of u such that lim_{u→0} δ(u) = 0. Since EW_j = E[X_j − µ] = 0 and EW_j² = E(X_j − µ)² = σ², we apply the above with u = p^{1/2}t to the left-hand side of (3.4.19) to obtain
\[
\frac{t^{2}}{2}\left(\sigma^{2}+\delta(p^{1/2}t)\right) + 1
- \frac{p\,t^{2}}{2}\left(\sigma^{2}+\delta(p^{1/2}t)\right),
\]
which converges to 1 + t²σ²/2 as p → 0. Thus, we have shown the first part of the proposition.
Let us now assume that the variables (3.4.13) converge in distribution to a r.v. Y with ch.f. ψ. Our goal is to show that the r.v. Y has an AL distribution. First, note that being a limit of geometric compounds (3.4.13), the r.v. Y is geometric infinitely divisible, and thus also infinitely divisible; see, e.g., Mohan et al. (1993). Thus, its ch.f. ψ does not vanish. Expressing the convergence in terms of ch.f.'s we have
\[
\frac{p\,f_{p}(t)}{1-(1-p)f_{p}(t)} \longrightarrow \psi(t) \quad\text{for } t\in\mathbb{R}, \qquad (3.4.20)
\]
where
\[
f_{p}(t) = e^{ita_{p}b_{p}}\varphi(a_{p}t)
\]
and φ is the ch.f. of the X_j's. Since the fraction in (3.4.20) converges to a non-zero limit while its numerator converges to zero (since f_p is bounded), we must have
\[
f_{p}(t) \longrightarrow 1 \quad\text{for } t\in\mathbb{R}. \qquad (3.4.21)
\]
We now rewrite (3.4.20) equivalently as
\[
\frac{1}{1+\frac{1}{p}\left(\frac{1}{f_{p}(t)}-1\right)} \longrightarrow \psi(t)
\quad\text{for } t\in\mathbb{R}, \qquad (3.4.22)
\]
so that
\[
\frac{1}{p}\left(\frac{1}{f_{p}(t)}-1\right) \longrightarrow \frac{1}{\psi(t)}-1
\quad\text{for } t\in\mathbb{R}. \qquad (3.4.23)
\]
In view of (3.4.21), we have
\[
\frac{1}{p}\left(f_{p}(t)-1\right) \longrightarrow 1-\frac{1}{\psi(t)}
\quad\text{for } t\in\mathbb{R}. \qquad (3.4.24)
\]
We now let p = 1/n and denote a_n = a_{1/n}, b_n = b_{1/n}, so that (3.4.24) takes the form
\[
n\left(\varphi(a_{n}t)e^{ita_{n}b_{n}}-1\right) \longrightarrow 1-\frac{1}{\psi(t)}
\quad\text{for } t\in\mathbb{R}, \qquad (3.4.25)
\]
where the limit is a continuous function. Thus, by Feller (1971, XVII, Theorem 1) we conclude that
\[
\left(\varphi(a_{n}t)e^{ita_{n}b_{n}}\right)^{n}
\longrightarrow \exp\left\{1-\frac{1}{\psi(t)}\right\}
\quad\text{for } t\in\mathbb{R}. \qquad (3.4.26)
\]
But the left-hand side of (3.4.26) is the ch.f. of
\[
a_{n}\sum_{i=1}^{n}(X_{i}+b_{n}), \qquad (3.4.27)
\]
where the Xᵢ's are i.i.d. with finite variance, and consequently the limit in (3.4.26) must be a normal characteristic function,
\[
\exp\left\{1-\frac{1}{\psi(t)}\right\}
= \exp\left\{i\mu t-\frac{1}{2}\sigma^{2}t^{2}\right\}, \qquad (3.4.28)
\]
where µ ∈ R and σ > 0 are some constants. Solving (3.4.28) for ψ(t) we obtain the AL ch.f. (3.1.1) with θ = 0. The result has been proved.
Remark 3.4.2 If the Xᵢ's are AL(0, µ, σ), then they have mean µ and variance µ² + σ². Consequently, we have the convergence to the AL(0, µ, √(µ²+σ²)) law under the normalization (3.4.14).
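The convergence (3.4.16) is easy to observe numerically for concrete summands; a Python sketch (our function names) with standard exponential X_j, so that µ = σ = 1:

```python
import cmath
import math

def geometric_compound_chf(t, p, mu, phi):
    # ch.f. of p^{1/2} sum_{j <= nu_p} (X_j - mu + p^{1/2} mu), cf. (3.4.16);
    # phi is the ch.f. of X_j - mu
    f = cmath.exp(1j * p * mu * t) * phi(math.sqrt(p) * t)
    return p * f / (1.0 - (1.0 - p) * f)

def centered_exp_chf(u):
    # ch.f. of X - 1 with X standard exponential (mean 1, variance 1)
    return cmath.exp(-1j * u) / (1.0 - 1j * u)

def al_limit_chf(t, mu, sigma):
    # the limiting AL ch.f. (3.1.1) with theta = 0
    return 1.0 / (1.0 + 0.5 * sigma * sigma * t * t - 1j * mu * t)
```

As p decreases, the compound ch.f. approaches the AL limit.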
3.4.4 Stability with respect to geometric summation

As we saw in Section 2.2.6 of Chapter 2, an AL r.v. Y with µ = 0 (symmetric Laplace) is the only symmetric r.v. with a finite second moment satisfying the relation
\[
Y \stackrel{d}{=} a_{p}\sum_{i=1}^{\nu_{p}}(Y_{i}+b_{p}), \qquad (3.4.29)
\]
where ν_p is geometrically distributed with mean 1/p, the Yᵢ's are i.i.d. copies of Y, and ν_p and the Yᵢ's are independent. More generally, all AL r.v.'s satisfy the above relation when the equality in distribution is replaced by convergence in distribution. The following result, which we include here without proof, follows from a more general characterization of geometric stable distributions; see Kozubowski (1994b, Theorem 3.1).
Proposition 3.4.5 Let Y be a random variable with finite variance, and let Y₁, Y₂, … be i.i.d. copies of Y. Then, the following statements are equivalent:
(i) Y ∼ AL(0, µ, σ) with µ² + σ² > 0;
(ii) There exist a_p > 0 and b_p ∈ R such that
\[
a_{p}\sum_{i=1}^{\nu_{p}}(Y_{i}+b_{p}) \stackrel{d}{\longrightarrow} Y, \qquad (3.4.30)
\]
where ν_p is a geometric r.v. with mean 1/p, independent of the Yᵢ's.
Moreover, the normalizing sequences must have the form:
\[
a_{p} = Cp^{1/2}[1+\delta(p)], \qquad b_{p} = [\eta(p)+(p-a_{p})\mu]/a_{p}, \qquad (3.4.31)
\]
where
\[
C = \sqrt{\frac{\sigma^{2}}{\sigma^{2}+\mu^{2}}} \qquad (3.4.32)
\]
and the sequences δ(p) and η(p) converge to zero as p → 0.
3.4.5 Maximum entropy property

In this section we characterize AL laws in terms of their entropy, which was defined in Section 2.4.5. Let us derive the entropy of X having an AL distribution with density (3.1.10).

Proposition 3.4.6 Let X have an AL*(θ, κ, σ) distribution with density f given by (3.1.10). Then, the entropy of X is given by
\[
H(X) = E[-\log f(X)]
= 1 + \log\sigma + \log\left(\frac{1}{\kappa}+\kappa\right) - \frac{1}{2}\log 2. \qquad (3.4.33)
\]
Proof. The calculation is straightforward. Since the value of entropy is not affected by translation, we can assume that θ = 0. By definition, the entropy of X is equal to
\[
-\int_{-\infty}^{0}\left(\log C+\frac{\sqrt{2}}{\kappa\sigma}x\right)Ce^{\frac{\sqrt{2}}{\kappa\sigma}x}\,dx
-\int_{0}^{\infty}\left(\log C-\frac{\sqrt{2}\,\kappa}{\sigma}x\right)Ce^{-\frac{\sqrt{2}\,\kappa}{\sigma}x}\,dx, \qquad (3.4.34)
\]
where
\[
C = \frac{\sqrt{2}}{\sigma}\,\frac{\kappa}{1+\kappa^{2}}. \qquad (3.4.35)
\]
Recalling that for any a > 0 we have
\[
\int_{0}^{\infty}ae^{-ax}\,dx = 1 \quad\text{and}\quad \int_{0}^{\infty}xae^{-ax}\,dx = \frac{1}{a}, \qquad (3.4.36)
\]
we obtain the following expression after integration in (3.4.34):
\[
H(X) = -C\frac{\sigma}{\sqrt{2}}\kappa\log C + C\frac{\sigma}{\sqrt{2}}\kappa
- C\frac{\sigma}{\sqrt{2}\,\kappa}\log C + C\frac{\sigma}{\sqrt{2}\,\kappa}. \qquad (3.4.37)
\]
The substitution of (3.4.35) into (3.4.37) produces (3.4.33).
Remark 3.4.3 Note that for κ = 1, for which the AL distribution becomes a symmetric Laplace distribution, formula (3.4.33) simplifies to
\[
H(X) = 1 + \log\sigma + \frac{1}{2}\log 2, \qquad (3.4.38)
\]
which was derived for the symmetric Laplace distribution in Section 2.1.3 of Chapter 2.
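Formula (3.4.33) can be confirmed by numerical integration of −∫ f log f; a Python sketch (our function names):

```python
import math

SQRT2 = math.sqrt(2.0)

def al_pdf(x, kappa, sigma):
    # AL*(0, kappa, sigma) density (3.1.10)
    c = SQRT2 / sigma * kappa / (1.0 + kappa ** 2)
    if x >= 0.0:
        return c * math.exp(-SQRT2 * kappa / sigma * x)
    return c * math.exp(SQRT2 / (sigma * kappa) * x)

def al_entropy_formula(kappa, sigma):
    # formula (3.4.33)
    return 1.0 + math.log(sigma) + math.log(1.0 / kappa + kappa) - 0.5 * math.log(2.0)

def al_entropy_quadrature(kappa, sigma, L=100.0, m=40000):
    # composite Simpson approximation of -int f log f on [-L, L]
    h = 2.0 * L / m
    s = 0.0
    for i in range(m + 1):
        x = -L + i * h
        w = 1.0 if i in (0, m) else (4.0 if i % 2 else 2.0)
        f = al_pdf(x, kappa, sigma)
        s += -w * f * math.log(f)
    return s * h / 3.0
```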
We saw in Section 2.4.5 that the classical Laplace distribution maximizes the entropy among all distributions with a given first absolute moment and (−∞, ∞) support. It turns out that under the additional stipulation that the mean also be given, the distribution that maximizes the entropy is AL, as shown by Kotz et al. (2000a).
Proposition 3.4.7 Consider the class C of all continuous random variables with non-vanishing densities on (−∞, ∞) and such that
\[
EX = c_{1}\in\mathbb{R} \quad\text{and}\quad E|X| = c_{2}>0 \quad\text{for } X\in\mathcal{C}, \qquad (3.4.39)
\]
where
\[
|c_{1}| < c_{2}. \qquad (3.4.40)
\]
Then, the maximum entropy is attained for the AL r.v. X* with density (3.1.10), where θ = 0,
\[
\kappa = \left(\frac{c_{2}-c_{1}}{c_{2}+c_{1}}\right)^{1/4}, \qquad (3.4.41)
\]
and
\[
\sigma = \frac{1}{\sqrt{2}}\left(c_{2}^{2}-c_{1}^{2}\right)^{1/4}
\left(\sqrt{c_{2}+c_{1}}+\sqrt{c_{2}-c_{1}}\right). \qquad (3.4.42)
\]
Moreover, the maximum entropy is
\[
\max_{X\in\mathcal{C}}H(X) = H(X^{*})
= 2\log\frac{\sqrt{c_{2}+c_{1}}+\sqrt{c_{2}-c_{1}}}{\sqrt{2}} + 1. \qquad (3.4.43)
\]
Proof. Applying Proposition 2.4.6 with a = −∞, b = ∞, h₁(x) = x, and h₂(x) = |x|, we find that the maximum entropy is attained by the density
\[
p(x) = e^{a_{0}}e^{a_{1}x+a_{2}|x|}
= e^{a_{0}}
\begin{cases}
e^{(a_{1}+a_{2})x}, & \text{if } x\ge 0,\\
e^{(a_{1}-a_{2})x}, & \text{if } x<0,
\end{cases}
\qquad (3.4.44)
\]
provided that the function (3.4.44) integrates to one on (−∞, ∞) and satisfies the constraints (3.4.39). Thus, it is enough to find the constants a₀, a₁, and a₂ for which the constraints are satisfied. To this end, first note that the integrability of p implies the following restrictions on a₁ and a₂:
\[
a_{1}+a_{2}<0 \quad\text{and}\quad a_{1}-a_{2}>0, \qquad (3.4.45)
\]
implying that a₂ < 0. Write
\[
a_{1} = \frac{1}{\sqrt{2}\,\sigma}\left(\frac{1}{\kappa}-\kappa\right)\in\mathbb{R}
\quad\text{and}\quad
a_{2} = -\frac{1}{\sqrt{2}\,\sigma}\left(\frac{1}{\kappa}+\kappa\right)\in(-\infty,0) \qquad (3.4.46)
\]
for some σ > 0 and κ > 0, so that the density (3.4.44) takes the form
\[
p(x) = e^{a_{0}}
\begin{cases}
e^{-\frac{\sqrt{2}\,\kappa}{\sigma}x}, & \text{if } x\ge 0,\\[1mm]
e^{\frac{\sqrt{2}}{\kappa\sigma}x}, & \text{if } x<0.
\end{cases}
\qquad (3.4.47)
\]
Comparing (3.4.47) with (3.1.10), we conclude that p must be an AL density, so that
\[
e^{a_{0}} = \frac{\sqrt{2}}{\sigma}\,\frac{\kappa}{1+\kappa^{2}}. \qquad (3.4.48)
\]
Next, using the formulas for the mean and the first absolute moment of the AL distribution with density (3.1.10) with θ = 0, we write the conditions (3.4.39) as
\[
EX = \frac{\sigma}{\sqrt{2}\,\kappa}\,\frac{1-\kappa^{4}}{1+\kappa^{2}} = c_{1} \qquad (3.4.49)
\]
and
\[
E|X| = \frac{\sigma}{\sqrt{2}\,\kappa}\,\frac{1+\kappa^{4}}{1+\kappa^{2}} = c_{2}. \qquad (3.4.50)
\]
Divide the sides of (3.4.49) by the corresponding sides of (3.4.50) to obtain
\[
\frac{1-\kappa^{4}}{1+\kappa^{4}} = \frac{c_{1}}{c_{2}}. \qquad (3.4.51)
\]
Solving the above equation for κ produces (3.4.41). Finally, the substitution of κ given by (3.4.41) into (3.4.50) and solving for σ produces (3.4.42). We thus conclude that the entropy is maximized by the AL law with θ = 0 and κ and σ as specified by (3.4.41)-(3.4.42). The actual value of the maximal entropy follows from Proposition 3.4.6.
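The solution (3.4.41)-(3.4.42) can be verified against the constraints (3.4.49)-(3.4.50) and the maximal entropy (3.4.43); a Python sketch (our function names):

```python
import math

SQRT2 = math.sqrt(2.0)

def maxent_al_params(c1, c2):
    # (3.4.41)-(3.4.42): kappa and sigma of the entropy-maximizing AL law
    # under EX = c1, E|X| = c2, with |c1| < c2
    kappa = ((c2 - c1) / (c2 + c1)) ** 0.25
    sigma = ((c2 * c2 - c1 * c1) ** 0.25 / SQRT2
             * (math.sqrt(c2 + c1) + math.sqrt(c2 - c1)))
    return kappa, sigma

def al_mean_and_absmean(kappa, sigma):
    # (3.4.49) and (3.4.50) for AL*(0, kappa, sigma)
    k4 = kappa ** 4
    base = sigma / (SQRT2 * kappa * (1.0 + kappa ** 2))
    return base * (1.0 - k4), base * (1.0 + k4)

def al_entropy(kappa, sigma):
    # (3.4.33)
    return 1.0 + math.log(sigma) + math.log(1.0 / kappa + kappa) - 0.5 * math.log(2.0)
```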
Remark 3.4.4 Note that if the mean is zero, then κ = 1 and σ = √2 c₂, so that the entropy is maximized by the classical Laplace r.v. with density (1/(2c₂))e^{−|x|/c₂}. In this case the maximal entropy (3.4.43) reduces to (3.4.38).

Remark 3.4.5 If in Proposition 3.4.7 the absolute deviation about the mean is prescribed instead of E|X|, then the entropy is maximized by the symmetric Laplace distribution (Exercise 3.6.18).
3.5 Estimation

In this section we study the problem of estimating the parameters of an AL distribution. Note that our distributions are essentially convolutions of exponential random variables of different signs, and common estimation procedures for mixtures of positive exponential distributions [see, e.g., Mendenhall and Hader (1958), Rider (1961)] are not applicable in this case. We shall focus on the method of maximum likelihood, leaving the discussion of other methods of estimation (e.g., the method of moments) to the exercises. Most of the results presented below are taken from Kotz et al. (2000c).

Let us start with the derivation of the Fisher information matrix, I(θ, κ, σ), corresponding to an AL*(θ, κ, σ) distribution. We have
\[
I(\theta,\kappa,\sigma)
= \left[E\left\{\frac{\partial}{\partial\gamma_{i}}\log f_{\theta,\kappa,\sigma}(X)
\cdot\frac{\partial}{\partial\gamma_{j}}\log f_{\theta,\kappa,\sigma}(X)\right\}\right]_{i,j=1}^{3},
\]
where X has an AL*(θ, κ, σ) distribution with the vector-parameter γ = (θ, κ, σ) and density f_{θ,κ,σ}. Routine calculations (Exercise 3.6.23) produce the matrix:
\[
I(\theta,\kappa,\sigma) =
\begin{pmatrix}
\dfrac{2}{\sigma^{2}} & -\dfrac{\sqrt{2}}{\sigma}\dfrac{2}{1+\kappa^{2}} & 0\\[3mm]
-\dfrac{\sqrt{2}}{\sigma}\dfrac{2}{1+\kappa^{2}} & \dfrac{1}{\kappa^{2}}+\dfrac{4}{(1+\kappa^{2})^{2}} & -\dfrac{1}{\sigma\kappa}\dfrac{1-\kappa^{2}}{1+\kappa^{2}}\\[3mm]
0 & -\dfrac{1}{\sigma\kappa}\dfrac{1-\kappa^{2}}{1+\kappa^{2}} & \dfrac{1}{\sigma^{2}}
\end{pmatrix}. \qquad (3.5.1)
\]
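The matrix (3.5.1) is symmetric and positive definite; a Python sketch (our function names) checks its leading principal minors:

```python
import math

def fisher_information(kappa, sigma):
    # the matrix (3.5.1); its entries do not involve theta
    k2 = kappa ** 2
    i12 = -math.sqrt(2.0) / sigma * 2.0 / (1.0 + k2)
    i23 = -(1.0 - k2) / (sigma * kappa * (1.0 + k2))
    return [[2.0 / sigma ** 2, i12, 0.0],
            [i12, 1.0 / k2 + 4.0 / (1.0 + k2) ** 2, i23],
            [0.0, i23, 1.0 / sigma ** 2]]

def leading_minors(a):
    # the three leading principal minors of a 3x3 matrix
    d1 = a[0][0]
    d2 = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    d3 = (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
          - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
          + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    return d1, d2, d3
```

The second minor simplifies to 2/(σ²κ²), which also confirms the (θ, κ) block.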
3.5.1 Maximum likelihood estimation

Let X₁, …, Xₙ be an i.i.d. random sample from an AL*(θ, κ, σ) distribution with the density f_{θ,κ,σ} given by (3.1.10), and let x₁, …, xₙ be their particular realization. Then, the likelihood function takes the form
\[
L(\theta,\kappa,\sigma)
= \frac{2^{n/2}}{\sigma^{n}}\,\frac{\kappa^{n}}{(1+\kappa^{2})^{n}}
\exp\left\{-\frac{\sqrt{2}\,\kappa}{\sigma}\sum_{j=1}^{n}(x_{j}-\theta)^{+}
-\frac{\sqrt{2}}{\kappa\sigma}\sum_{j=1}^{n}(x_{j}-\theta)^{-}\right\}, \qquad (3.5.2)
\]
where
\[
(x_{i}-\theta)^{+} =
\begin{cases}
x_{i}-\theta & \text{if } x_{i}\ge\theta,\\
0 & \text{if } x_{i}<\theta,
\end{cases}
\qquad (3.5.3)
\]
and
\[
(x_{i}-\theta)^{-} =
\begin{cases}
\theta-x_{i} & \text{if } x_{i}\le\theta,\\
0 & \text{if } x_{i}>\theta.
\end{cases}
\qquad (3.5.4)
\]
Thus, the log-likelihood function is
\[
\log L(\theta,\kappa,\sigma)
= \frac{n}{2}\log 2 - n\log\sigma + n\log\frac{\kappa}{1+\kappa^{2}}
- \frac{\sqrt{2}}{\sigma}D, \qquad (3.5.5)
\]
where
\[
D = D(\theta,\kappa)
= \kappa\sum_{j=1}^{n}(x_{j}-\theta)^{+}
+ \frac{1}{\kappa}\sum_{j=1}^{n}(x_{j}-\theta)^{-}. \qquad (3.5.6)
\]
We shall follow our approach to the symmetric case and consider several cases.
Case 1: The values of κ and σ are known

Here the likelihood function will be maximized by the value of θ that minimizes the function
\[
Q(\theta) = \kappa\sum_{i=1}^{n}(x_{i}-\theta)^{+}
+ \frac{1}{\kappa}\sum_{i=1}^{n}(x_{i}-\theta)^{-}. \qquad (3.5.7)
\]
Let X_{1:n} ≤ ··· ≤ X_{n:n} be the order statistics connected with a random sample of size n from the AL*(θ, κ, σ) distribution, and let x_{1:n} ≤ ··· ≤ x_{n:n} be their particular realization. Consider the set of n + 1 intervals {I₀, …, Iₙ}, where
\[
I_{0} = (-\infty, x_{1:n}], \qquad I_{n} = [x_{n:n}, \infty), \qquad (3.5.8)
\]
and
\[
I_{j} = [x_{j:n}, x_{j+1:n}], \quad j = 1, 2, \ldots, n-1. \qquad (3.5.9)
\]
It can be shown that the function Q is continuous on R and linear on each of the intervals I_j, j = 0, 1, …, n (Exercise 3.6.19). Further, the function Q is decreasing on I₀ and increasing on Iₙ, while on any I_j with 1 ≤ j ≤ n−1 it is
\[
\begin{cases}
\text{decreasing} & \text{if } \frac{j}{n-j}<\kappa^{2},\\[1mm]
\text{constant} & \text{if } \frac{j}{n-j}=\kappa^{2},\\[1mm]
\text{increasing} & \text{if } \frac{j}{n-j}>\kappa^{2}.
\end{cases}
\qquad (3.5.10)
\]
Thus, if the parameter κ is such that
\[
\kappa^{2} = \frac{j}{n-j} \quad\text{for some } j = 1, 2, \ldots, n-1, \qquad (3.5.11)
\]
then the function Q is minimized by any value of θ within the interval [x_{j:n}, x_{j+1:n}]. Consequently, any statistic of the form
\[
pX_{j:n} + (1-p)X_{j+1:n}, \quad p\in[0,1], \qquad (3.5.12)
\]
may be taken as an MLE of the parameter θ in this case. If the condition (3.5.11) does not hold, the function Q attains its global minimum value at the unique θ̂ₙ given by
\[
\hat{\theta}_{n} =
\begin{cases}
X_{1:n} & \text{if } \kappa^{2}<\frac{1}{n-1},\\[1mm]
X_{j:n} & \text{if } \frac{j-1}{n-(j-1)}<\kappa^{2}<\frac{j}{n-j},\ j = 2, 3, \ldots, n-1,\\[1mm]
X_{n:n} & \text{if } \kappa^{2}>n-1.
\end{cases}
\qquad (3.5.13)
\]
We see that, as in the case of the symmetric Laplace distribution, the problem of estimating the location parameter θ of the AL*(θ, κ, σ) distribution admits an explicit solution.
Observe that for large values of n we will have
\[
\frac{j-1}{n-(j-1)} \le \kappa^{2} < \frac{j}{n-j}
\quad\text{for some } j = 2, 3, \ldots, n-1, \qquad (3.5.14)
\]
so that, consistently with the relations (3.5.12)-(3.5.13), the statistic X_{j:n} may be taken as the MLE of θ. Solving the inequalities (3.5.14) for j we obtain the relation
\[
n\frac{\kappa^{2}}{1+\kappa^{2}} < j \le 1 + n\frac{\kappa^{2}}{1+\kappa^{2}}, \qquad (3.5.15)
\]
which is satisfied uniquely by
\[
j = j(n) = [[n\kappa^{2}/(1+\kappa^{2})]] + 1 \qquad (3.5.16)
\]
(the double square bracket [[x]] denotes the integral part of x). The resulting MLE of θ, which is now given by the order statistic
\[
\hat{\theta}_{n} = X_{j(n):n}, \qquad (3.5.17)
\]
is consistent and asymptotically normal [Kotz et al. (2000c)].
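The closed-form MLE (3.5.16)-(3.5.17) can be checked against a direct minimization of Q over the sample points, since Q is piecewise linear and attains its minimum at a data point; a Python sketch (our function names):

```python
import math
import random

SQRT2 = math.sqrt(2.0)

def q_objective(theta, xs, kappa):
    # the function Q of (3.5.7)
    return sum(kappa * (x - theta) if x >= theta else (theta - x) / kappa
               for x in xs)

def theta_mle(xs, kappa):
    # order-statistic MLE (3.5.16)-(3.5.17): X_{j(n):n} with
    # j(n) = [[n kappa^2/(1+kappa^2)]] + 1
    n = len(xs)
    j = int(n * kappa ** 2 / (1.0 + kappa ** 2)) + 1
    return sorted(xs)[j - 1]

def theta_brute(xs, kappa):
    # brute-force search of Q over all sample points
    return min(xs, key=lambda x: q_objective(x, xs, kappa))
```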
Proposition 3.5.1 Let X₁, …, Xₙ be i.i.d. from the AL*(θ, κ, σ) distribution with an unknown value of θ. Then, the MLE of θ given by (3.5.17) is
(i) consistent;
(ii) asymptotically normal, i.e.,
\[
\sqrt{n}\left(\hat{\theta}_{n}-\theta\right)
\stackrel{d}{\longrightarrow} N(0, \sigma^{2}/2); \qquad (3.5.18)
\]
(iii) asymptotically efficient.
Proof. It is well known [see, e.g., David (1981)] that for a continuous distribution with density f the sample quantile
\[
\hat{\xi}_{\lambda,n} = X_{[[\lambda n]]+1:n}, \quad 0<\lambda<1,
\]
converges to the corresponding population quantile ξ_λ, and the asymptotic distribution of
\[
\sqrt{n}\left(\hat{\xi}_{\lambda,n}-\xi_{\lambda}\right)
\]
is normal with mean zero and variance
\[
\frac{\lambda(1-\lambda)}{(f(\xi_{\lambda}))^{2}}. \qquad (3.5.19)
\]
In our case the MLE is a sample quantile with λ = κ²/(1+κ²), the corresponding population quantile ξ_λ is equal to θ (since the above λ coincides with the probability that the relevant asymmetric Laplace variable is less than θ), and
\[
f(\xi_{\lambda}) = f_{\theta,\kappa,\sigma}(\theta)
= \frac{\sqrt{2}}{\sigma}\,\frac{\kappa}{1+\kappa^{2}}. \qquad (3.5.20)
\]
Thus, the consistency and asymptotic normality (3.5.18) follow. To establish asymptotic efficiency, note that the asymptotic variance coincides with the inverse of the Fisher information I(θ) = 2/σ² [cf. (3.5.1)].
The specific form of the MLE for the location parameter provides a characterization of our class of asymmetric Laplace laws. Buczolich and Székely (1989), already mentioned in a remark following Proposition 2.6.3 of Chapter 2, considered the question of when the statistic $\sum_{i=1}^n a_i X_{i:n}$, where $a_i \ge 0$ and $\sum_{i=1}^n a_i = 1$, can be the MLE of the location parameter $\theta$ for a sample $X_1, \dots, X_n$ from a distribution given by a density $f(x)$. A proof of the following result may be found in Buczolich and Székely (1989).
Theorem 3.5.1 The weighted sum $\sum_{i=1}^n a_i X_{i:n}$, where $n \ge 3$, $a_i \ge 0$, and $\sum_{i=1}^n a_i = 1$, can be the MLE for the location parameter $\theta$ if and only if one of the following cases holds:
(i) $a_i = 1/n$ for all $i = 1, \dots, n$,
(ii) $a_1 = p$ and $a_n = 1-p$ for some $p \in (0, 1)$,
(iii) $a_j = p$ and $a_{j+1} = 1-p$ for some $p \in (0, 1)$ and some $j = 1, \dots, n-1$,
(iv) $a_j = 1$ for some $j = 1, \dots, n$.
In the first case the distribution is necessarily Gaussian centered at zero (and the estimator is a sample mean).
In the second case, the distribution is uniform on the interval $[-c(1-p), cp]$ for some $c > 0$ (and the estimator is the midrange).
In the third case, the distribution is necessarily asymmetric Laplace with the skewness parameter $\kappa^2 = j/(n-j)$.
In the fourth case, there is no parametric class to which the density $f$ belongs when $n$ is fixed. However, if the hypothesis holds for infinitely many sample sizes $n = n_r$ and for $j = j_r$ such that $j_r/n_r$ converges to $\alpha$, then the distribution is necessarily asymmetric Laplace with the skewness parameter $\kappa^2 = \alpha/(1-\alpha)$.
Case 2: The values of $\theta$ and $\kappa$ are known
Here, the log-likelihood (3.5.5) leads to the following function of $\sigma$ to be maximized:
\[
Q(\sigma) = C - n\log\sigma - \frac{\sqrt{2}}{\sigma}\,D, \tag{3.5.21}
\]
where the quantities $C = \frac{n}{2}\log 2 + n\log\frac{\kappa}{1+\kappa^2}$ and $D$, given by (3.5.6), do not depend on $\sigma$. By differentiating, we find that $Q$ attains its maximum value at the unique point
\[
\hat{\sigma}_n = \frac{\sqrt{2}}{n}\left[\kappa\sum_{j=1}^n (x_j-\theta)^+ + \frac{1}{\kappa}\sum_{j=1}^n (x_j-\theta)^-\right], \tag{3.5.22}
\]
which is the MLE of $\sigma$. Note that the distribution of $\hat{\sigma}_n$ coincides with that of the sample mean
\[
\hat{\sigma}_n \stackrel{d}{=} \frac{1}{n}\sum_{i=1}^n Y_i, \tag{3.5.23}
\]
where the $Y_i$'s are i.i.d. exponential variables with mean $\sigma$ and variance $\sigma^2$. This follows from the fact that if $X \sim AL^*(\theta,\kappa,\sigma)$, then the variable $Y = g(X)$, where
\[
g(x) =
\begin{cases}
\sqrt{2}\,\kappa\,(x-\theta) & \text{if } x \ge \theta,\\[2pt]
\frac{\sqrt{2}}{\kappa}\,(\theta-x) & \text{for } x < \theta,
\end{cases}
\tag{3.5.24}
\]
is exponentially distributed with the above mean and variance (Exercise 3.6.20).
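The closed form (3.5.22) translates directly into code; a sketch under the same conventions as before (hypothetical function name):

```python
import math

def mle_sigma(x, theta, kappa):
    """The closed-form MLE (3.5.22) of sigma when theta and kappa are known."""
    plus  = sum(max(xi - theta, 0.0) for xi in x)   # sum of (x_j - theta)^+
    minus = sum(max(theta - xi, 0.0) for xi in x)   # sum of (x_j - theta)^-
    return (math.sqrt(2.0) / len(x)) * (kappa * plus + minus / kappa)
```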
The representation (3.5.23) immediately leads to the strong consistency and asymptotic normality of $\hat{\sigma}_n$, as the variables $Y_i$ have finite variance. Since the asymptotic variance coincides with the reciprocal of the Fisher information $I(\sigma) = 1/\sigma^2$ [cf. (3.5.1)], the MLE is also asymptotically efficient [Kotz et al. (2000c)].
Proposition 3.5.2 Let $X_1, \dots, X_n$ be i.i.d. r.v.'s from the $AL^*(\theta,\kappa,\sigma)$ distribution, where the value of $\sigma$ is unknown. Then, the MLE of $\sigma$ is given by (3.5.22) and is
(i) Unbiased;
(ii) Strongly consistent;
(iii) Asymptotically normal, where
\[
\sqrt{n}\,(\hat{\sigma}_n - \sigma) \stackrel{d}{\longrightarrow} N(0, \sigma^2); \tag{3.5.25}
\]
(iv) Asymptotically efficient.
Case 3: The values of $\theta$ and $\sigma$ are known
Here, by (3.5.5), we need to maximize the function
\[
g(y, \alpha, \beta) = \log y - \log(1+y^2) - \alpha y - \frac{\beta}{y} \tag{3.5.26}
\]
with respect to $y \in (0, \infty)$, where
\[
\alpha = \alpha(\theta) = \frac{\sqrt{2}}{\sigma}\,\frac{1}{n}\sum_{j=1}^n (x_j-\theta)^+, \qquad
\beta = \beta(\theta) = \frac{\sqrt{2}}{\sigma}\,\frac{1}{n}\sum_{j=1}^n (x_j-\theta)^-. \tag{3.5.27}
\]
For any fixed $\alpha, \beta > 0$, the derivative of $g$ with respect to $y$ is
\[
h(y, \alpha, \beta) = \frac{\partial}{\partial y}\,g(y, \alpha, \beta)
= \frac{1}{y} - \frac{2y}{1+y^2} + \frac{\beta}{y^2} - \alpha. \tag{3.5.28}
\]
To find the MLE of $\kappa$, we shall study the solutions of the equation
\[
h(y, \alpha, \beta) = 0. \tag{3.5.29}
\]
The relevant properties of the function $h$ are presented in the following lemma [see Kotz et al. (2000c)].
Lemma 3.5.1 For any fixed $\alpha, \beta > 0$ the function $h$ defined in (3.5.28) satisfies
\[
\lim_{y\to 0^+} h(y, \alpha, \beta) = \infty \quad \text{and} \quad \lim_{y\to\infty} h(y, \alpha, \beta) = -\alpha < 0,
\]
and there exists a unique solution $y_0 \in (0, \infty)$ of the equation (3.5.29). Moreover, we have
\[
\sqrt{\beta/\alpha} \le y_0 \le 1 \ \text{ in case } \beta \le \alpha, \qquad
1 \le y_0 \le \sqrt{\beta/\alpha} \ \text{ in case } \beta \ge \alpha. \tag{3.5.30}
\]
Proof. Fix $\alpha, \beta > 0$ and write
\[
h(y, \alpha, \beta) = h_1(y) + h_2(y), \tag{3.5.31}
\]
where
\[
h_1(y) = \frac{1}{y} - \frac{2y}{1+y^2} \quad \text{and} \quad h_2(y) = \frac{\beta}{y^2} - \alpha. \tag{3.5.32}
\]
Since
\[
\frac{d}{dy}\,h_1(y) = -\frac{1}{y^2} - 2\,\frac{1-y^2}{(1+y^2)^2}, \tag{3.5.33}
\]
it is easy to see that the function $h_1$ is decreasing on the interval $(0, y^*)$ and increasing on the interval $(y^*, \infty)$, where
\[
y^* = \sqrt{2+\sqrt{5}} > 1. \tag{3.5.34}
\]
In addition,
\[
\lim_{y\to 0^+} h_1(y) = \infty, \quad h_1(1) = 0, \quad h_1(y^*) < 0, \quad \lim_{y\to\infty} h_1(y) = 0. \tag{3.5.35}
\]
On the other hand, the function $h_2$ is decreasing on $(0, \infty)$ and
\[
\lim_{y\to 0^+} h_2(y) = \infty, \quad h_2(\sqrt{\beta/\alpha}) = 0, \quad h_2(1) = \beta-\alpha, \quad \lim_{y\to\infty} h_2(y) = -\alpha < 0. \tag{3.5.36}
\]
Assume first that $\alpha = \beta$. Then, $h(1) = 0$ and
\[
h(y) = h_1(y) + h_2(y) > 0 \tag{3.5.37}
\]
for $y \in (0, 1)$, while
\[
h(y) = h_1(y) + h_2(y) < 0 \tag{3.5.38}
\]
for $y \in (1, \infty)$. Consequently, $y_0 = 1$ is the unique solution of the equation (3.5.29) satisfying (3.5.30).
Next, assume that $\beta < \alpha$. By the above properties of $h_1$ and $h_2$, we deduce that there must exist a unique
\[
y_0 \in (\sqrt{\beta/\alpha},\ 1) \tag{3.5.39}
\]
such that the relations (3.5.37)-(3.5.38) hold for $y \in (0, y_0)$ and $y \in (y_0, \infty)$, respectively. This $y_0$ must be a unique solution of the equation (3.5.29) satisfying (3.5.30).
Finally, if $\beta > \alpha$, then the result follows from the relation
\[
h(y, \alpha, \beta) = -y^{-2}\,h(1/y, \beta, \alpha) \tag{3.5.40}
\]
and the application of the previous case.
Remark 3.5.1 It is easy to see that the conclusions of Lemma 3.5.1 remain valid if either $\alpha$ or $\beta$ is equal to zero (which occurs when all the observations are located on one side of $\theta$). It is interesting that in this case we still get the MLE's and the corresponding two-tailed AL distribution. On the other hand, we shall see in Case 5 (when $\theta$ is known) that under this condition the maximum likelihood approach would produce an exponential distribution (a one-tailed AL law).
In view of Lemma 3.5.1, we conclude that the likelihood function (3.5.26) is maximized at a unique value of $y$ (the MLE of $\kappa$), which can be obtained by solving equation (3.5.29). The solution does not admit a closed form and must be found numerically. The properties of the MLE are presented in the following result [Kotz et al. (2000c)].
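Lemma 3.5.1 makes the numerical solution straightforward: by (3.5.30) the root is bracketed between $1$ and $\sqrt{\beta/\alpha}$, so bisection applies. A sketch (hypothetical function names; it assumes observations on both sides of $\theta$, so that $\alpha, \beta > 0$):

```python
import math

def h(y, alpha, beta):
    """The derivative (3.5.28) of the profile log-likelihood (3.5.26)."""
    return 1.0/y - 2.0*y/(1.0 + y*y) + beta/(y*y) - alpha

def mle_kappa(x, theta, sigma, tol=1e-12):
    """Sketch of the numerical MLE of kappa (theta, sigma known): the unique
    root of h(y, alpha, beta) = 0, bracketed between 1 and sqrt(beta/alpha)
    by (3.5.30) and located here by bisection."""
    n = len(x)
    c = math.sqrt(2.0) / (sigma * n)
    alpha = c * sum(max(xi - theta, 0.0) for xi in x)   # (3.5.27)
    beta  = c * sum(max(theta - xi, 0.0) for xi in x)
    lo, hi = sorted((1.0, math.sqrt(beta / alpha)))
    if hi - lo < tol:
        return 1.0                    # beta = alpha gives kappa-hat = 1
    flo = h(lo, alpha, beta)          # h changes sign across the bracket
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid, alpha, beta) * flo > 0.0:
            lo, flo = mid, h(mid, alpha, beta)
        else:
            hi = mid
    return 0.5 * (lo + hi)
```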
Proposition 3.5.3 Let $X_1, \dots, X_n$ be i.i.d. r.v.'s from an $AL^*(\theta,\kappa,\sigma)$ distribution where the values of $\theta$ and $\sigma$ are known. Then, the MLE of $\kappa$ is the unique solution $\hat{\kappa}_n$ of the equation (3.5.29), where the function $h$ is defined in (3.5.28) and $\alpha$, $\beta$ are given in (3.5.27). The MLE $\hat{\kappa}_n$ is
(i) Consistent;
(ii) Asymptotically normal and efficient:
\[
\sqrt{n}\,(\hat{\kappa}_n - \kappa) \stackrel{d}{\longrightarrow} N(0, \sigma_\kappa^2), \tag{3.5.41}
\]
where the asymptotic variance
\[
\sigma_\kappa^2 = \frac{\kappa^2(1+\kappa^2)^2}{(1+\kappa^2)^2 + 4\kappa^2} \tag{3.5.42}
\]
coincides with the reciprocal of the Fisher information $I(\kappa)$.
Moreover, for any integer $n \ge 1$ we have
\[
\sqrt{\beta/\alpha} \le \hat{\kappa}_n \le 1 \ \text{ in case } \beta \le \alpha, \qquad
1 \le \hat{\kappa}_n \le \sqrt{\beta/\alpha} \ \text{ in case } \beta \ge \alpha. \tag{3.5.43}
\]
Proof. Consider auxiliary random vectors
\[
Z^{(i)} = [Z^{(i)}_1, Z^{(i)}_2]', \quad i = 1, 2, \dots, n, \tag{3.5.44}
\]
where $Z^{(i)}_1 = (X_i-\theta)^+$ and $Z^{(i)}_2 = (X_i-\theta)^-$, so that
\[
X_i - \theta = [1, -1]\,Z^{(i)}. \tag{3.5.45}
\]
The above $Z^{(i)}$'s admit the representation
\[
Z^{(i)} \stackrel{d}{=} \begin{bmatrix} \delta_{1,i}E_{1,i} \\ \delta_{2,i}E_{2,i} \end{bmatrix},
\]
where the $E_{1,i}$'s are i.i.d. distributed as $\frac{\sigma}{\sqrt{2}}\,\frac{1}{\kappa}\,W$, the $E_{2,i}$'s are i.i.d. distributed as $\frac{\sigma}{\sqrt{2}}\,\kappa\,W$, where $W$ is a standard exponential variable, and the $\delta_{1,i}$, $\delta_{2,i}$ are the 0-1 random variables that appear in the representation (3.2.7). The random vectors $Z^{(i)}$ are i.i.d. with the mean
\[
m_Z = \begin{bmatrix} m_{1,Z} \\ m_{2,Z} \end{bmatrix}
= \frac{\sigma\kappa}{\sqrt{2}\,(1+\kappa^2)}\begin{bmatrix} \kappa^{-2} \\ \kappa^{2} \end{bmatrix} \tag{3.5.46}
\]
and the covariance matrix
\[
\Sigma_Z = \frac{\sigma^2\kappa^2}{2\,(1+\kappa^2)^2}
\begin{bmatrix} (\kappa^{-2}+1)^2 - 1 & -1 \\ -1 & (\kappa^{2}+1)^2 - 1 \end{bmatrix}. \tag{3.5.47}
\]
Clearly, the sequence $\{Z^{(i)}\}$ obeys the Law of Large Numbers and the Central Limit Theorem, so that
\[
\lim_{n\to\infty}\bar{Z}^{(n)} \stackrel{a.s.}{=} m_Z \tag{3.5.48}
\]
and
\[
\lim_{n\to\infty}\sqrt{n}\,(\bar{Z}^{(n)} - m_Z) \stackrel{d}{=} N(0, \Sigma_Z), \tag{3.5.49}
\]
where
\[
\bar{Z}^{(n)} = \frac{1}{n}\sum_{i=1}^n Z^{(i)}
= \left[\frac{1}{n}\sum_{i=1}^n Z^{(i)}_1,\ \frac{1}{n}\sum_{i=1}^n Z^{(i)}_2\right]'. \tag{3.5.50}
\]
Notice that the quantities $\alpha$ and $\beta$ are related to the $Z^{(i)}$'s as follows:
\[
\alpha = \frac{\sqrt{2}}{\sigma}\,\frac{1}{n}\sum_{i=1}^n Z^{(i)}_1 = \frac{\sqrt{2}}{\sigma}\,\bar{Z}^{(n)}_1, \tag{3.5.51}
\]
\[
\beta = \frac{\sqrt{2}}{\sigma}\,\frac{1}{n}\sum_{i=1}^n Z^{(i)}_2 = \frac{\sqrt{2}}{\sigma}\,\bar{Z}^{(n)}_2. \tag{3.5.52}
\]
Since the MLE $\hat{\kappa}_n$ is a unique solution of the equation (3.5.29), it can be written as
\[
\hat{\kappa}_n = H(\alpha, \beta), \tag{3.5.53}
\]
where $H(\cdot,\cdot)$ is a continuous and differentiable function satisfying the equation
\[
h(H(\alpha, \beta), \alpha, \beta) = 0. \tag{3.5.54}
\]
In view of (3.5.51)-(3.5.52), we have
\[
\hat{\kappa}_n = H\!\left(\frac{\sqrt{2}}{\sigma}\,\bar{Z}^{(n)}_1,\ \frac{\sqrt{2}}{\sigma}\,\bar{Z}^{(n)}_2\right). \tag{3.5.55}
\]
To establish the consistency of the MLE given in (3.5.55), note that by (3.5.46), (3.5.47), (3.5.48), and the continuity of $H$, we have
\[
\hat{\kappa}_n \stackrel{a.s.}{\longrightarrow} H\!\left(\frac{\sqrt{2}}{\sigma}\,m_{1,Z},\ \frac{\sqrt{2}}{\sigma}\,m_{2,Z}\right). \tag{3.5.56}
\]
Substituting
\[
\alpha = \frac{\sqrt{2}}{\sigma}\,m_{1,Z} = \frac{1}{\kappa}\,\frac{1}{1+\kappa^2} \quad \text{and} \quad
\beta = \frac{\sqrt{2}}{\sigma}\,m_{2,Z} = \frac{\kappa^3}{1+\kappa^2} \tag{3.5.57}
\]
into (3.5.29) and solving for $y$ we obtain $\kappa$, as can be readily verified.
The asymptotic normality (3.5.41) of $\hat{\kappa}_n$ can be established similarly. In view of (3.5.49), by the standard large sample theory results [see, e.g., Serfling (1980)], it follows that as $n \to \infty$, the variables
\[
\sqrt{n}\left[H\!\left(\frac{\sqrt{2}}{\sigma}\,\bar{Z}^{(n)}_1,\ \frac{\sqrt{2}}{\sigma}\,\bar{Z}^{(n)}_2\right)
- H\!\left(\frac{\sqrt{2}}{\sigma}\,m_{1,Z},\ \frac{\sqrt{2}}{\sigma}\,m_{2,Z}\right)\right] \tag{3.5.58}
\]
converge in distribution to a $N\!\left(0,\ \frac{2}{\sigma^2}\,D\,\Sigma_Z\,D'\right)$ variable, where $D$ is the matrix of partial derivatives of $H$:
\[
D = \left[\frac{\partial H}{\partial\alpha},\ \frac{\partial H}{\partial\beta}\right]_{[\alpha,\beta]=\frac{\sqrt{2}}{\sigma}m_Z}. \tag{3.5.59}
\]
A straightforward but laborious calculation of the derivatives produces
\[
D = \left[-\frac{\kappa^2(1+\kappa^2)^2}{(1+\kappa^2)^2+4\kappa^2},\
\frac{(1+\kappa^2)^2}{(1+\kappa^2)^2+4\kappa^2}\right], \tag{3.5.60}
\]
and we obtain (3.5.41)-(3.5.42). The asymptotic efficiency is obtained by noting that $\sigma_\kappa^2$ given in (3.5.42) is the reciprocal of the Fisher information $I(\kappa)$ given by the middle entry in the Fisher information matrix (3.5.1).
Case 4: The value of $\kappa$ is known
By (3.5.5), we need to maximize the function
\[
Q(\theta, \sigma) = -n\log\sigma - \frac{\sqrt{2}}{\sigma}\,D(\theta, \kappa), \tag{3.5.61}
\]
where $D(\theta, \kappa)$ is given by (3.5.6). We have already established in Case 2 that for any fixed value of $D = D(\theta, \kappa)$ the function (3.5.61) is maximized by the following value of $\sigma$:
\[
\sigma(\theta) = \frac{\sqrt{2}}{n}\,D(\theta, \kappa). \tag{3.5.62}
\]
The corresponding maximum value of $Q$ is
\[
Q(\theta, \sigma(\theta)) = -n\log\left\{\frac{\sqrt{2}}{n}\,D(\theta, \kappa)\right\} - n. \tag{3.5.63}
\]
Since the quantity (3.5.63) is decreasing in $D(\theta, \kappa)$, we need to find the value of $\theta$ that minimizes the latter. Such a value was already obtained in Case 1. Thus, the MLE of $\theta$, denoted $\hat{\theta}_n$, is given by (3.5.12) or (3.5.13), and for large $n$ it can be taken as the order statistic $X_{j(n):n}$ with $j(n)$ given by (3.5.16). The MLE of $\sigma$ is then given by (3.5.62) with $\hat{\theta}_n$ in place of $\theta$, that is,
\[
\hat{\sigma}_n = \frac{\sqrt{2}}{n}\left[\kappa\sum_{j=1}^n (x_j-\hat{\theta}_n)^+ + \frac{1}{\kappa}\sum_{j=1}^n (x_j-\hat{\theta}_n)^-\right]. \tag{3.5.64}
\]
We observe that both estimators are linear combinations of order statistics, as was the case with the corresponding MLE's of the parameters of a symmetric Laplace distribution. Proceeding as in the classical Laplace case, one can show that the MLE $(\hat{\theta}_n, \hat{\sigma}_n)$ is consistent, asymptotically normal, and efficient, with the asymptotic covariance matrix
\[
\Sigma = \begin{bmatrix} \sigma^2/2 & 0 \\ 0 & \sigma^2 \end{bmatrix}, \tag{3.5.65}
\]
cf. (3.5.1). We shall omit a highly technical derivation of this result, which can be found in Kotz et al. (2000c).
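Combining the order-statistic estimator of Case 1 with the plug-in formula (3.5.64) gives a short implementation; a sketch with a hypothetical function name:

```python
import math

def mle_theta_sigma(x, kappa):
    """Case 4 sketch: theta-hat is the order statistic X_{j(n):n} of
    (3.5.16)-(3.5.17), and sigma-hat is (3.5.64) evaluated at theta-hat."""
    xs, n = sorted(x), len(x)
    j = int(math.floor(n * kappa**2 / (1.0 + kappa**2))) + 1
    t = xs[j - 1]
    plus  = sum(max(xi - t, 0.0) for xi in xs)
    minus = sum(max(t - xi, 0.0) for xi in xs)
    return t, (math.sqrt(2.0) / n) * (kappa * plus + minus / kappa)
```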
Case 5: The value of $\theta$ is known
Here, we need to maximize the function
\[
Q(\kappa, \sigma) = \log\kappa - \log(1+\kappa^2) - \log\sigma - [\kappa, 1/\kappa]\,\bar{Z}^{(n)}\big/(\sigma/\sqrt{2}),
\]
where the vector $\bar{Z}^{(n)}$ was defined previously in (3.5.50). We shall proceed by considering three cases:
1. $\theta \le x_{1:n}$,
2. $\theta \ge x_{n:n}$,
3. $x_{1:n} < \theta < x_{n:n}$.
In case 1, all sample values are greater than or equal to $\theta$, so that
\[
(x_i-\theta)^+ = x_i - \theta \quad \text{and} \quad (x_i-\theta)^- = 0 \quad \text{for all } i = 1, 2, \dots, n. \tag{3.5.66}
\]
Thus, the two components of the vector $\bar{Z}^{(n)}$ are
\[
\bar{Z}^{(n)}_1 = \frac{1}{n}\sum_{i=1}^n Z^{(i)}_1 = \frac{1}{n}\sum_{i=1}^n (x_i-\theta)^+ = \bar{x}_n - \theta, \tag{3.5.67}
\]
\[
\bar{Z}^{(n)}_2 = \frac{1}{n}\sum_{i=1}^n Z^{(i)}_2 = \frac{1}{n}\sum_{i=1}^n (x_i-\theta)^- = 0, \tag{3.5.68}
\]
so that the function $Q$ takes the form
\[
Q(\kappa, \sigma) = \log\kappa - \log(1+\kappa^2) - \log\sigma - \frac{\sqrt{2}}{\sigma}\,\kappa\,(\bar{x}_n-\theta). \tag{3.5.69}
\]
Fix $\kappa > 0$ and differentiate (3.5.69) with respect to $\sigma$ to obtain
\[
\frac{\partial Q(\kappa, \sigma)}{\partial\sigma} = -\frac{1}{\sigma} + \frac{\sqrt{2}}{\sigma^2}\,\kappa\,(\bar{x}_n-\theta). \tag{3.5.70}
\]
It is clear that the derivative is positive for $\sigma < \sigma(\kappa)$ and negative for $\sigma > \sigma(\kappa)$, where
\[
\sigma(\kappa) = \sqrt{2}\,\kappa\,(\bar{x}_n-\theta). \tag{3.5.71}
\]
Consequently, for any fixed $\kappa > 0$, the function $Q$ in (3.5.69) is maximized by $\sigma(\kappa)$. Thus, for all $\sigma, \kappa > 0$, we have
\[
Q(\kappa, \sigma) \le Q(\kappa, \sigma(\kappa)) = -\log(1+\kappa^2) - \log\sqrt{2} - \log(\bar{x}_n-\theta) - 1. \tag{3.5.72}
\]
The above function of $\kappa$ is strictly decreasing on $(0, \infty)$ with the least upper bound
\[
\lim_{\kappa\to 0^+} Q(\kappa, \sigma(\kappa)) = -\log\sqrt{2} - \log(\bar{x}_n-\theta) - 1, \tag{3.5.73}
\]
corresponding to the values $\kappa = 0$ and $\sigma = 0$. Since these values are not admissible, formally the MLE's of $\kappa$ and $\sigma$ do not exist in this case. However, as
\[
\kappa \to 0^+ \quad \text{and} \quad \sigma(\kappa) = \sqrt{2}\,\kappa\,(\bar{x}_n-\theta) \to 0^+, \tag{3.5.74}
\]
the $AL^*(\theta, \kappa, \sigma(\kappa))$ distribution converges weakly to the exponential distribution with the density
\[
g(y) = \begin{cases} \frac{1}{\mu}\,e^{-(y-\theta)/\mu} & \text{for } y \ge \theta, \\ 0 & \text{otherwise}, \end{cases} \tag{3.5.75}
\]
where $\mu = \bar{x}_n - \theta$ (Exercise 3.6.24). This is actually the $AL(\theta, \mu, 0)$ distribution. Intuitively, it is certainly plausible to conclude that the underlying distribution is exponential if all sample values happen to be located on one side of the location parameter $\theta$.
Similar considerations lead to the conclusion that in the second case ($\theta \ge x_{n:n}$), where we have
\[
\bar{Z}^{(n)}_1 = \frac{1}{n}\sum_{i=1}^n Z^{(i)}_1 = \frac{1}{n}\sum_{i=1}^n (x_i-\theta)^+ = 0 \tag{3.5.76}
\]
and
\[
\bar{Z}^{(n)}_2 = \frac{1}{n}\sum_{i=1}^n Z^{(i)}_2 = \frac{1}{n}\sum_{i=1}^n (x_i-\theta)^- = \theta - \bar{x}_n, \tag{3.5.77}
\]
we can choose
\[
\sigma(\kappa) = \sqrt{2}\,\kappa^{-1}\,(\theta - \bar{x}_n) \tag{3.5.78}
\]
to ensure that for all $\sigma, \kappa > 0$ we have
\[
Q(\kappa, \sigma) \le Q(\kappa, \sigma(\kappa)) = \log\frac{\kappa^2}{1+\kappa^2} - \log\sqrt{2} - \log(\theta-\bar{x}_n) - 1. \tag{3.5.79}
\]
The above function of $\kappa$ is strictly increasing on $(0, \infty)$ with the limit at infinity
\[
\lim_{\kappa\to\infty} Q(\kappa, \sigma(\kappa)) = -\log\sqrt{2} - \log(\theta-\bar{x}_n) - 1. \tag{3.5.80}
\]
As in the previous case, the maximum likelihood approach formally does not yield a solution (since the values $\kappa = \infty$ and $\sigma = 0$ are not admissible). Not surprisingly, these limiting values of the parameters do correspond to a distribution, as in the previous case, which this time is given by the density
\[
g(y) = \begin{cases} \frac{1}{\mu}\,e^{-(\theta-y)/\mu} & \text{for } y \le \theta, \\ 0 & \text{for } y > \theta, \end{cases} \tag{3.5.81}
\]
where $\mu = \theta - \bar{x}_n$. This is so since the $AL^*(\theta, \kappa, \sigma(\kappa))$ density converges to the density (3.5.81) as $\kappa \to \infty$ (Exercise 3.6.24). Again, we see that when all sample values happen to be on the left side of the location parameter $\theta$, the maximum likelihood approach leads to an exponential distribution.
We now move to the third case, assuming that the value of $\theta$ is strictly between $x_{1:n}$ and $x_{n:n}$, in which case both components of the vector $\bar{Z}^{(n)}$ are non-zero.
Note that the likelihood function converges to zero on the boundary of its domain, so that the existence and uniqueness of the MLE's is guaranteed if the following equations for the derivatives of $Q$ have a unique solution within the domain:
\[
\frac{\partial Q(\kappa,\sigma)}{\partial\sigma} = -\frac{1}{\sigma} + \frac{\sqrt{2}}{\sigma^2}\,[\kappa, 1/\kappa]\,\bar{Z}^{(n)} = 0,
\qquad
\frac{\partial Q(\kappa,\sigma)}{\partial\kappa} = \frac{1}{\kappa} - \frac{2\kappa}{1+\kappa^2} - \frac{\sqrt{2}}{\sigma}\,[1, -1/\kappa^2]\,\bar{Z}^{(n)} = 0.
\tag{3.5.82}
\]
The above equations are equivalent to
\[
[\kappa^2, -\kappa^{-2}]\,\bar{Z}^{(n)} = 0, \qquad \sqrt{2}\,[\kappa, 1/\kappa]\,\bar{Z}^{(n)} = \sigma,
\]
and lead to the following unique and explicit solution for $\kappa$ and $\sigma$:
\[
\hat{\kappa}_n = \sqrt[4]{\frac{[0,1]\,\bar{Z}^{(n)}}{[1,0]\,\bar{Z}^{(n)}}}, \qquad
\hat{\sigma}_n = \sqrt{2}\left[\sqrt[4]{\frac{[0,1]\,\bar{Z}^{(n)}}{[1,0]\,\bar{Z}^{(n)}}},\
\sqrt[4]{\frac{[1,0]\,\bar{Z}^{(n)}}{[0,1]\,\bar{Z}^{(n)}}}\right]\bar{Z}^{(n)}.
\]
Remark 3.5.2 The corresponding MLE of the parameter $\mu$ of the $AL(\theta, \mu, \sigma)$ parameterization is the sample mean of the centered observations:
\[
\hat{\mu}_n = [1, -1]\,\bar{Z}^{(n)} = \frac{1}{n}\sum_{i=1}^n (X_i - \theta).
\]
The above estimators can be written more explicitly as follows:
\[
\hat{\kappa}_n = \sqrt[4]{\frac{\frac{1}{n}\sum_{i=1}^n (x_i-\theta)^-}{\frac{1}{n}\sum_{i=1}^n (x_i-\theta)^+}}, \tag{3.5.83}
\]
\[
\hat{\sigma}_n = \sqrt{2}\,\sqrt[4]{\frac{1}{n}\sum_{i=1}^n (x_i-\theta)^+}\;\sqrt[4]{\frac{1}{n}\sum_{i=1}^n (x_i-\theta)^-}
\times\left[\sqrt{\frac{1}{n}\sum_{i=1}^n (x_i-\theta)^+} + \sqrt{\frac{1}{n}\sum_{i=1}^n (x_i-\theta)^-}\right]. \tag{3.5.84}
\]
The MLE $[\hat{\kappa}_n, \hat{\sigma}_n]'$ is consistent, asymptotically normal, and efficient for the vector-parameter $[\kappa, \sigma]'$; see, e.g., Hartley and Revankar (1974) and Kozubowski and Podgórski (2000).
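Since (3.5.83)-(3.5.84) are in closed form, the computation is immediate; a sketch (hypothetical function name):

```python
import math

def mle_kappa_sigma(x, theta):
    """The closed-form MLEs (3.5.83)-(3.5.84) of (kappa, sigma) when theta
    is known; assumes observations on both sides of theta (otherwise the
    MLEs do not exist -- see Case 5)."""
    n = len(x)
    a = sum(max(xi - theta, 0.0) for xi in x) / n   # mean positive part
    b = sum(max(theta - xi, 0.0) for xi in x) / n   # mean negative part
    kappa = (b / a) ** 0.25
    sigma = math.sqrt(2.0) * (a * b) ** 0.25 * (math.sqrt(a) + math.sqrt(b))
    return kappa, sigma
```

For a symmetric sample the estimate of $\kappa$ is 1 and $\hat{\sigma}$ reduces to $\sqrt{2}$ times the mean absolute deviation from $\theta$, consistent with the symmetric Laplace case.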
Theorem 3.5.2 Let $X_1, \dots, X_n$ be i.i.d. with the $AL^*(\theta,\kappa,\sigma)$ distribution where the value of $\theta$ is known. Then the MLE of $[\kappa, \sigma]'$, $[\hat{\kappa}_n, \hat{\sigma}_n]'$, given by (3.5.83)-(3.5.84), is:
(i) Strongly consistent,
(ii) Asymptotically bivariate normal with the asymptotic covariance matrix
\[
\Sigma_{MLE} = \frac{\sigma^2(1+\kappa^2)^2}{8}
\begin{bmatrix}
\frac{1}{\sigma^2} & \frac{1}{\kappa\sigma}\,\frac{1-\kappa^2}{1+\kappa^2} \\[6pt]
\frac{1}{\kappa\sigma}\,\frac{1-\kappa^2}{1+\kappa^2} & \frac{1}{\kappa^2}\left(1 + \frac{4\kappa^2}{(1+\kappa^2)^2}\right)
\end{bmatrix}, \tag{3.5.85}
\]
(iii) Asymptotically efficient, namely the above asymptotic covariance matrix coincides with the inverse of the Fisher information matrix.
Proof. The result follows from the large sample theory [see, e.g., Serfling (1980)]. Write
\[
[\hat{\kappa}_n, \hat{\sigma}_n] = G(\bar{Z}^{(n)})
= [G_1(\bar{Z}^{(n)}_1, \bar{Z}^{(n)}_2),\ G_2(\bar{Z}^{(n)}_1, \bar{Z}^{(n)}_2)], \tag{3.5.86}
\]
where
\[
G_1(y_1, y_2) = (y_2/y_1)^{1/4} \tag{3.5.87}
\]
and
\[
G_2(y_1, y_2) = \sqrt{2}\,(y_1 y_2)^{1/4}(\sqrt{y_1} + \sqrt{y_2}). \tag{3.5.88}
\]
(i) To establish the consistency of the MLE given in (3.5.86), use the continuity of $G$ together with (3.5.48) to conclude that
\[
\lim_{n\to\infty}[\hat{\kappa}_n, \hat{\sigma}_n] = G\!\left(\lim_{n\to\infty}\bar{Z}^{(n)}\right) = G(m_Z), \tag{3.5.89}
\]
and then verify by substitution that
\[
G(m_Z) = [\kappa, \sigma]. \tag{3.5.90}
\]
(ii) Similarly, we establish the asymptotic normality of the MLE with the asymptotic variance of the form $D\,\Sigma_Z\,D'$, where
\[
D = \left[\frac{\partial G_i}{\partial y_j}\bigg|_{(y_1,y_2)=m_Z}\right]_{i,j=1}^{2} \tag{3.5.91}
\]
is the matrix of partial derivatives of the vector-valued function $G$. We shall skip the laborious calculations leading to the asymptotic variance (3.5.85).
(iii) To prove asymptotic efficiency we need to demonstrate that $\Sigma_{MLE}$ is equal to the inverse of the Fisher information matrix $I(\kappa, \sigma)$. By (3.5.1), the Fisher information matrix is
\[
I(\kappa, \sigma) = \frac{1}{\sigma^2}
\begin{bmatrix}
\frac{\sigma^2}{\kappa^2}\left(1 + \frac{4\kappa^2}{(1+\kappa^2)^2}\right) & -\frac{\sigma}{\kappa}\,\frac{1-\kappa^2}{1+\kappa^2} \\[6pt]
-\frac{\sigma}{\kappa}\,\frac{1-\kappa^2}{1+\kappa^2} & 1
\end{bmatrix}. \tag{3.5.92}
\]
Taking the inverse of the above matrix we obtain (3.5.85).
Case 6: The value of $\sigma$ is known
If the value of $\sigma$ is given, then maximizing the log-likelihood function (3.5.5) is equivalent to maximizing the function
\[
Q(\theta, \kappa) = \log\kappa - \log(1+\kappa^2) - \left[\kappa\,\alpha(\theta) + \frac{1}{\kappa}\,\beta(\theta)\right], \tag{3.5.93}
\]
where $\alpha(\theta)$ and $\beta(\theta)$ were defined previously in (3.5.27). Following Kotz et al. (2000c), we shall proceed by maximizing (3.5.93) with respect to $(\theta, \kappa)$ on the sets
\[
\mathbb{R}\times J_1,\ \mathbb{R}\times J_2,\ \dots,\ \mathbb{R}\times J_n, \tag{3.5.94}
\]
where
\[
J_1 = \left[0,\ \frac{1}{n-1}\right), \qquad J_n = [\,n-1,\ \infty), \tag{3.5.95}
\]
and
\[
J_i = \left[\frac{i-1}{n-(i-1)},\ \frac{i}{n-i}\right), \quad i = 2, 3, \dots, n-1. \tag{3.5.96}
\]
The procedure described below will result in the set of $n$ pairs
\[
(\theta_1, \kappa_1),\ \dots,\ (\theta_n, \kappa_n), \tag{3.5.97}
\]
where the $i$th pair maximizes the function (3.5.93) on the set $\mathbb{R}\times J_i$, $i = 1, 2, \dots, n$. By substituting (3.5.97) into (3.5.93) and comparing the resulting values we would obtain the required MLE's of $\theta$ and $\kappa$.
The process of obtaining each of the pairs in (3.5.97) consists of two steps. First, note that by the results on estimating $\theta$ (see Case 1), the inequality
\[
Q(\theta, \kappa) \le Q(x_{i:n}, \kappa)
= \log\kappa - \log(1+\kappa^2) - \left[\kappa\,\alpha(x_{i:n}) + \frac{1}{\kappa}\,\beta(x_{i:n})\right] \tag{3.5.98}
\]
holds for all $(\theta, \kappa) \in \mathbb{R}\times J_i$. We can now maximize the right-hand side of (3.5.98) with respect to $\kappa \in J_i$ using the results obtained under Case 3 (where the only unknown parameter is $\kappa$). Namely, we conclude that the right-hand side of (3.5.98) is increasing on the interval $(0, \kappa^0_i)$ and decreasing on the interval $(\kappa^0_i, \infty)$, where $\kappa^0_i$ is the unique solution of the equation (3.5.29) [with $\alpha = \alpha(x_{i:n})$ and $\beta = \beta(x_{i:n})$]. Now, the value $\kappa_i$ that maximizes the right-hand side of (3.5.98) is either $\kappa^0_i$ (if $\kappa^0_i \in J_i$) or one of the endpoints of $J_i$ (the left endpoint if it is greater than $\kappa^0_i$, or the right endpoint in case it is less than $\kappa^0_i$). The algorithm below summarizes the process of obtaining the MLE's of $\theta$ and $\kappa$ for this problem.
Computation of the MLE's of $\theta$ and $\kappa$ when $\sigma$ is known.
For $i = 1, 2, \dots, n$, set
\[
\alpha = \frac{\sqrt{2}}{\sigma}\,\frac{1}{n}\sum_{j=i}^{n} (x_{j:n} - x_{i:n}), \qquad
\beta = \frac{\sqrt{2}}{\sigma}\,\frac{1}{n}\sum_{j=1}^{i} (x_{i:n} - x_{j:n}). \tag{3.5.99}
\]
For $i = 1, 2, \dots, n$, solve the equation
\[
\frac{1}{\kappa} - \frac{2\kappa}{1+\kappa^2} + \frac{\beta}{\kappa^2} - \alpha = 0, \tag{3.5.100}
\]
obtaining the unique solution $\kappa^0_i$, which lies between 1 and $\sqrt{\beta/\alpha}$.
Set
\[
\kappa_1 = \begin{cases} \kappa^0_1 & \text{if } \kappa^0_1 \le \frac{1}{n-1},\\[2pt] \frac{1}{n-1} & \text{otherwise}, \end{cases} \tag{3.5.101}
\]
for $i = 2, 3, \dots, n-1$,
\[
\kappa_i = \begin{cases}
\frac{i-1}{n-(i-1)} & \text{if } \kappa^0_i < \frac{i-1}{n-(i-1)},\\[4pt]
\kappa^0_i & \text{if } \frac{i-1}{n-(i-1)} \le \kappa^0_i < \frac{i}{n-i},\\[4pt]
\frac{i}{n-i} & \text{if } \kappa^0_i \ge \frac{i}{n-i},
\end{cases} \tag{3.5.102}
\]
and
\[
\kappa_n = \begin{cases} \kappa^0_n & \text{if } \kappa^0_n \ge n-1,\\[2pt] n-1 & \text{otherwise}. \end{cases} \tag{3.5.103}
\]
For $i = 1, 2, \dots, n$, substitute the two values $\theta_i = x_{i:n}$ and $\kappa_i$ given by (3.5.101)-(3.5.103) into (3.5.93) and choose the pair that results in the maximum value.
The method for estimating $\theta$ and $\kappa$ is more complex compared with the other cases considered so far and may be time consuming for large problems. The consistency as well as the asymptotic normality and efficiency of the estimators may be obtained similarly as in the case of estimating all three parameters.
Case 7: The values of all three parameters are unknown
Let us start by noting that the maximum likelihood estimators and their asymptotic distributions for this case were derived in Hartley and Revankar (1974) and Hinkley and Revankar (1977), although these authors worked in the context of the log-Laplace model and under another parameterization. Our presentation closely follows Kotz et al. (2000c).
We need to maximize the log-likelihood function (3.5.5) with respect to all three parameters, which is equivalent to maximizing the function
\[
Q(\theta, \kappa, \sigma) = -\log\sigma + \log\frac{\kappa}{1+\kappa^2}
- \frac{\sqrt{2}}{\sigma}\left[\kappa\,\alpha(\theta) + \frac{1}{\kappa}\,\beta(\theta)\right], \tag{3.5.104}
\]
where this time
\[
\alpha(\theta) = \frac{1}{n}\sum_{j=1}^n (x_j-\theta)^+ \quad \text{and} \quad
\beta(\theta) = \frac{1}{n}\sum_{j=1}^n (x_j-\theta)^-. \tag{3.5.105}
\]
We shall proceed by first fixing the value of $\theta$ and then applying the results obtained under Case 5 (where the value of $\theta$ is known).
When $\theta \le x_{1:n}$, then by the relation (3.5.72) (see Case 5) we conclude that for any $\kappa, \sigma > 0$
\[
Q(\theta, \kappa, \sigma) \le -\log(1+\kappa^2) - \log\sqrt{2} - \log(\bar{x}_n-\theta) - 1. \tag{3.5.106}
\]
Similarly, when $\theta \ge x_{n:n}$, then by (3.5.79), we will have
\[
Q(\theta, \kappa, \sigma) \le \log\frac{\kappa^2}{1+\kappa^2} - \log\sqrt{2} - \log(\theta-\bar{x}_n) - 1. \tag{3.5.107}
\]
If $x_{1:n} < \theta < x_{n:n}$, then both quantities $\alpha(\theta)$ and $\beta(\theta)$ given in (3.5.105) are positive. Thus, using the results of Case 5, we will have
\[
Q(\theta, \kappa, \sigma) \le Q(\theta, \hat{\kappa}, \hat{\sigma}), \tag{3.5.108}
\]
where the quantities $\hat{\kappa}$ and $\hat{\sigma}$ are the MLE's of $\kappa$ and $\sigma$ (derived under the case when the value of $\theta$ is known) given by (3.5.83)-(3.5.84). Substituting these values into the right-hand side of (3.5.108), we obtain after some algebra
\[
Q(\theta, \kappa, \sigma) \le g(\theta), \tag{3.5.109}
\]
where
\[
g(\theta) = -\log\sqrt{2} - 2\log\left(\sqrt{\alpha(\theta)} + \sqrt{\beta(\theta)}\right). \tag{3.5.110}
\]
Note that for $\theta \in (x_{1:n}, x_{2:n})$ we have
\[
\alpha(\theta) = \frac{1}{n}\sum_{j=2}^n (x_{j:n} - \theta) \quad \text{and} \quad
\beta(\theta) = \frac{1}{n}\,(\theta - x_{1:n}), \tag{3.5.111}
\]
so that
\[
\lim_{\theta\to x_{1:n}^+} \alpha(\theta) = \bar{x}_n - x_{1:n} \quad \text{and} \quad
\lim_{\theta\to x_{1:n}^+} \beta(\theta) = 0. \tag{3.5.112}
\]
Thus,
\[
\lim_{\theta\to x_{1:n}^+} g(\theta) = -\log\sqrt{2} - \log(\bar{x}_n - x_{1:n}). \tag{3.5.113}
\]
The limit in (3.5.113) is larger than the value of $Q(\theta, \kappa, \sigma)$ at any $\theta \le x_{1:n}$, $0 < \kappa$, $0 < \sigma$. Indeed, in view of (3.5.106), for $\theta \le x_{1:n}$ we will have
\[
Q(\theta, \kappa, \sigma) \le -\log\sqrt{2} - \log(\bar{x}_n - x_{1:n}) - 1, \tag{3.5.114}
\]
since here the function $Q$ attains its least upper bound for $\kappa = \sigma = 0$ and $\theta = x_{1:n}$. In view of the above, we can restrict attention to the values $\theta > x_{1:n}$ when maximizing the function $Q(\theta, \kappa, \sigma)$ over $\theta \in \mathbb{R}$, $0 < \kappa$, $0 < \sigma$.
Similar arguments show that
\[
\lim_{\theta\to x_{n:n}^-} g(\theta) = -\log\sqrt{2} - \log(x_{n:n} - \bar{x}_n), \tag{3.5.115}
\]
which is 1 larger than the supremum of the function $Q(\theta, \kappa, \sigma)$ over the values $\theta \ge x_{n:n}$, $0 < \kappa$, $0 < \sigma$ [the supremum is obtained by taking $\kappa \to \infty$ and $\theta = x_{n:n}$ in the right-hand side of (3.5.107)]. Consequently, we can rule out the values $\theta \ge x_{n:n}$ from further consideration.
This leaves us with the problem of maximizing the function $Q(\theta, \kappa, \sigma)$ given by (3.5.104) under the conditions
\[
x_{1:n} < \theta < x_{n:n}, \quad 0 < \kappa < \infty, \quad 0 < \sigma < \infty, \tag{3.5.116}
\]
or equivalently, maximizing the function $g(\theta)$ in (3.5.110) on the set
\[
A = \{\theta : x_{1:n} < \theta < x_{n:n}\}. \tag{3.5.117}
\]
Clearly, this is equivalent to the minimization of the function
\[
h(\theta) = 2\log\left(\sqrt{\alpha(\theta)} + \sqrt{\beta(\theta)}\right) \tag{3.5.118}
\]
with respect to the same values of $\theta$. It turns out that the infimum of the function $h$ on the set $A$ is given by one of the values
\[
h(x_{j:n}), \quad j = 1, 2, \dots, n. \tag{3.5.119}
\]
This follows from the following lemma [see Kotz et al. (2000c) and Exercise 3.6.25].
Lemma 3.5.2 The function $h$ defined in (3.5.118) is continuous on the closed interval $[x_{1:n}, x_{n:n}]$ and concave down on each of the sub-intervals $(x_{j:n}, x_{j+1:n})$, $j = 1, 2, \dots, n-1$.
Consequently, to find the MLE's of $\theta$, $\kappa$, and $\sigma$ we should proceed as follows:
Step 1: Evaluate the $n$ values (3.5.119) and choose a positive integer $r \le n$ such that
\[
h(x_{r:n}) \le h(x_{j:n}), \quad j = 1, 2, \dots, n. \tag{3.5.120}
\]
Step 2: Set $\theta = x_{r:n}$ and find the MLE's of $\kappa$ and $\sigma$ (derived previously under Case 5).
There are three scenarios in Step 2:
If $r = 1$ ($\theta = x_{1:n}$), then as in Case 5, the MLE's do not exist (as the likelihood is maximized by $\kappa = \sigma = 0$), but the likelihood approach leads to the (positive) exponential distribution with density (3.5.75) with $\mu = \bar{x}_n - x_{1:n}$.
If $r = n$ ($\theta = x_{n:n}$), then again formally the MLE's do not exist, but the likelihood approach does lead to the (negative) exponential distribution with density (3.5.81) with $\mu = x_{n:n} - \bar{x}_n$.
If $1 < r < n$, then the MLE's are
\[
\hat{\theta}_n = X_{r:n}, \qquad
\hat{\kappa}_n = \sqrt[4]{\beta(\hat{\theta}_n)}\Big/\sqrt[4]{\alpha(\hat{\theta}_n)}, \qquad
\hat{\sigma}_n = \sqrt{2}\,\sqrt[4]{\alpha(\hat{\theta}_n)}\,\sqrt[4]{\beta(\hat{\theta}_n)}
\left[\sqrt{\alpha(\hat{\theta}_n)} + \sqrt{\beta(\hat{\theta}_n)}\right], \tag{3.5.121}
\]
where
\[
\alpha(\hat{\theta}_n) = \frac{1}{n}\sum_{j=1}^n (x_j-\hat{\theta}_n)^+ \quad \text{and} \quad
\beta(\hat{\theta}_n) = \frac{1}{n}\sum_{j=1}^n (x_j-\hat{\theta}_n)^-. \tag{3.5.122}
\]
Thus, the problem of estimating all three parameters of the $AL^*(\theta,\kappa,\sigma)$ distribution admits a solution that can be determined with ease. The resulting MLE's are consistent, asymptotically normal, and asymptotically efficient, with the asymptotic covariance matrix equal to the inverse of the Fisher information matrix (3.5.1). We refer the reader to Hartley and Revankar (1974) and Hinkley and Revankar (1977) for technical details regarding the asymptotic results on the MLE's.
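The two-step procedure can be sketched as follows (hypothetical function name; the boundary cases $r = 1$ and $r = n$, where the MLE's do not exist, are signalled by returning None rather than fitting the limiting exponential law):

```python
import math

def mle_al(x):
    """Sketch of the full MLE of (theta, kappa, sigma) for the
    AL*(theta, kappa, sigma) law; assumes at least two distinct values.
    Step 1 minimizes h of (3.5.118) over the order statistics; Step 2
    applies the known-theta formulas (3.5.121)-(3.5.122)."""
    xs, n = sorted(x), len(x)

    def ab(t):
        # alpha(theta) and beta(theta) of (3.5.105)
        a = sum(max(xi - t, 0.0) for xi in xs) / n
        b = sum(max(t - xi, 0.0) for xi in xs) / n
        return a, b

    def hfun(t):
        a, b = ab(t)
        return 2.0 * math.log(math.sqrt(a) + math.sqrt(b))

    r = min(range(n), key=lambda i: hfun(xs[i]))   # Step 1
    if r == 0 or r == n - 1:
        return None      # likelihood degenerates to an exponential law
    t = xs[r]                                      # Step 2
    a, b = ab(t)
    kappa = (b / a) ** 0.25
    sigma = math.sqrt(2.0) * (a * b) ** 0.25 * (math.sqrt(a) + math.sqrt(b))
    return t, kappa, sigma
```

Note that even reasonable-looking samples can land in the boundary cases; the interior solution requires the minimum of $h$ over the order statistics to occur strictly inside the sample range.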
3.6 Exercises
The readers may find the 26 exercises below somewhat challenging. Again, we recommend that special attention be paid to these exercises. A number of them deal with the most recent results on asymmetric Laplace distributions.
Exercise 3.6.1 Let $X$ have an asymmetric Laplace distribution with p.d.f. (3.0.1). Derive the mean, median, mode, and variance of $X$.
Exercise 3.6.2 Let $X$ have the skewed Laplace distribution with p.d.f. (3.0.3).
(a) Find the mean and the variance of $X$.
(b) Show that the mode of $X$ and the $\alpha$-quantile of $X$ are both equal to $\theta$.
(c) Show that the characteristic function of $X$ is
\[
\varphi(t) = \alpha(1-\alpha)\,e^{i\theta t}\left[\frac{1}{1-\alpha+it} + \frac{1}{\alpha-it}\right].
\]
What is the moment generating function of $X$?
Exercise 3.6.3 Consider a hyperbolic distribution with density
\[
f(x) = \frac{\sqrt{\alpha^2-\beta^2}}{2\alpha\delta\,K_1(\delta\sqrt{\alpha^2-\beta^2})}\,
e^{-\alpha\sqrt{\delta^2+(x-\theta)^2}+\beta(x-\theta)}, \quad -\infty < x < \infty, \tag{3.6.1}
\]
where
\[
\alpha > 0, \quad 0 \le |\beta| < \alpha, \quad -\infty < \theta < \infty, \quad \delta > 0,
\]
and $K_1(\cdot)$ is the modified Bessel function of the third kind with index 1 (see Appendix A).
(a) Show that as
\[
\delta \to \infty, \qquad \frac{\delta}{\sqrt{\alpha^2-\beta^2}} \to \sigma^2 > 0, \qquad \beta \to 0,
\]
the density (3.6.1) converges (pointwise) to the density of the normal distribution with mean $\theta$ and variance $\sigma^2$.
(b) Show that as $\delta \to 0$, the density (3.6.1) converges (pointwise) to an asymmetric Laplace density
\[
g(x) = C\begin{cases} e^{-(\alpha-\beta)|x-\theta|} & \text{for } x \ge \theta,\\ e^{-(\alpha+\beta)|x-\theta|} & \text{for } x < \theta. \end{cases} \tag{3.6.2}
\]
What is the normalizing constant $C$ in (3.6.2)?
(c) Show that the density (3.6.2) corresponds to the $AL^*(\theta,\kappa,\sigma)$ distribution, where
\[
\sigma = \sqrt{\frac{2}{\alpha^2-\beta^2}} \quad \text{and} \quad \kappa = \sqrt{\frac{\alpha-\beta}{\alpha+\beta}}.
\]
Thus, the latter distribution arises as a limit of hyperbolic distributions [Barndorff-Nielsen (1977)].
Exercise 3.6.4 For $a < 0 < b$ and $n \in \mathbb{N}$ consider a r.v. $X_n$ with p.d.f.
\[
f_n(x) = \frac{n+1}{b-a}\begin{cases}
\left(\frac{x-a}{-a}\right)^n & \text{for } a \le x \le 0,\\[4pt]
\left(\frac{b-x}{b}\right)^n & \text{for } 0 \le x \le b.
\end{cases} \tag{3.6.3}
\]
(a) Show that the function $f_n$ is a genuine probability density function.
(b) Let $a = -nA$ and $b = nB$, where $A, B > 0$. Show that as $n \to \infty$, then for every $x \in \mathbb{R}$ the density $f_n(x)$ converges to
\[
f(x) = \frac{1}{A+B}\begin{cases} e^{-|x|/A} & \text{for } x \le 0,\\ e^{-|x|/B} & \text{for } x \ge 0. \end{cases} \tag{3.6.4}
\]
(c) Show that the function (3.6.4) is the p.d.f. of the $AL^*(\sigma, \kappa)$ distribution with
\[
\sigma = \sqrt{2AB} \quad \text{and} \quad \kappa = \sqrt{A/B}
\]
(cf. Exercise 2.7.56, Chapter 2).
Exercise 3.6.5 Establish the relations (3.1.4) and (3.1.5). Further, show that for every $\sigma > 0$ the functions of $\mu$ and $\kappa$, given by (3.1.4) and (3.1.5), respectively, are strictly decreasing on their domains, and prove the relations given in (3.1.6).
Exercise 3.6.6 Let $f_{\theta,\kappa,\sigma}(x)$ be the density (3.1.10) of an AL distribution.
(a) Show that for any $x \in \mathbb{R}$ we have
\[
f_{\theta,\kappa,\sigma}(\theta+x) = f_{\theta,1/\kappa,\sigma}(\theta-x). \tag{3.6.5}
\]
What is the interpretation of (3.6.5) in terms of random variables?
(b) Show that for $0 < \kappa < 1$ and $x > 0$ we have
\[
f_{\theta,\kappa,\sigma}(\theta+x) > f_{\theta,1/\kappa,\sigma}(\theta+x). \tag{3.6.6}
\]
What happens for $\kappa > 1$? For $\kappa = 1$?
(c) Clearly, when $x \to \infty$, the densities on both sides of (3.6.6) converge to zero. Investigate whether they converge at the same rate, or whether one of them converges to zero faster than the other one.
(d) Repeat parts (a)-(c) using the $AL(\theta, \mu, \sigma)$ parameterization.
Exercise 3.6.7 In this problem we investigate the derivatives of an AL density.
(a) Show that the AL densities (3.1.10) have derivatives of any order (except at $x = \theta$), which are expressed by the following formulas:
\[
f^{(n)}_{\theta,\kappa,\sigma}(x) = \begin{cases}
(-1)^n\left(\frac{\sqrt{2}\,\kappa}{\sigma}\right)^{n+1}\frac{1}{1+\kappa^2}\,e^{-\sqrt{2}\kappa|x-\theta|/\sigma}, & \text{if } x > \theta,\\[6pt]
\left(\frac{\sqrt{2}}{\sigma\kappa}\right)^{n+1}\frac{\kappa^2}{1+\kappa^2}\,e^{-\sqrt{2}|x-\theta|/(\kappa\sigma)}, & \text{if } x < \theta.
\end{cases} \tag{3.6.7}
\]
(b) Find the limits
\[
\lim_{x\to\theta^+}(-1)^n f^{(n)}_{\theta,\kappa,\sigma}(x) \quad \text{and} \quad \lim_{x\to\theta^-} f^{(n)}_{\theta,\kappa,\sigma}(x), \tag{3.6.8}
\]
check for what values of $n$ or the parameters, if any, the two limits in (3.6.8) are equal, and give an interpretation of the equality.
(c) Show that if $0 < \kappa \le 1$ and $x \ge \theta + \sigma n/\sqrt{2}$, where $n$ is a positive integer, then
\[
(-1)^n f^{(n)}_{\theta,\kappa,\sigma}(x) \ge f^{(n)}_{\theta,\kappa,\sigma}(2\theta-x). \tag{3.6.9}
\]
What happens if $\kappa > 1$? If $x < \theta + \sigma n/\sqrt{2}$?
Exercise 3.6.8 Show that the AL density $f$ given by (3.1.10) is completely monotone on $(\theta, \infty)$ and absolutely monotone on $(-\infty, \theta)$ [that is, for any $k = 0, 1, 2, \dots$, we have $(-1)^k f^{(k)}(x) \ge 0$ for $x > \theta$ and $f^{(k)}(x) \ge 0$ for $x < \theta$].
Exercise 3.6.9 Establish formulas (3.1.16) and (3.1.19) for the m.g.f. of an AL distribution.
Exercise 3.6.10 Let $Y \sim AL^*(\theta,\kappa,\sigma)$.
(a) Show that the $a$th absolute moment of $Y-\theta$ is finite for any $a > -1$, and is given by (3.1.26).
(b) Show that the mean absolute deviation of $Y$ is given by (3.1.27).
Exercise 3.6.11 Calculate the $n$th moment about zero of the $AL(\theta,\kappa,\sigma)$ distribution.
Exercise 3.6.12 Let $Y \sim AL^*(\theta,\kappa,\sigma)$.
(a) Show that the coefficients of skewness and kurtosis of $Y$, defined by (2.1.21) and (2.1.22), are given by (3.1.30) and (3.1.31), respectively.
(b) Show that the coefficient of skewness is bounded by 2 in absolute value, and decreases monotonically from 2 to $-2$ as $\kappa$ increases from zero to infinity.
(c) Show that the coefficient of kurtosis varies from three to six.
Exercise 3.6.13 The $\kappa$-criterion is a preliminary selection test useful in reducing the number of plausible models for a given set of data [see, e.g., Elderton (1938), Hirschberg et al. (1989)]. The $\kappa$-criterion is defined as
\[
\kappa = \frac{\beta_1(\beta_2+3)^2}{4(4\beta_2-3\beta_1)(2\beta_2-3\beta_1-6)}, \tag{3.6.10}
\]
where $\beta_1$ is the square of the coefficient of skewness $\gamma_1$ and $\beta_2$ is the (unadjusted) kurtosis $\gamma_2+3$ [cf. (2.1.21)-(2.1.22), Chapter 2] of the underlying probability distribution. It is clear that the $\kappa$-criterion is zero for the symmetric Laplace distribution (as it is for any symmetric distribution with a finite 4th moment, since in this case $\beta_1 = 0$). Derive the $\kappa$-criterion for the $AL(\mu, \sigma)$ distribution (not to be confused with the parameter $\kappa$ of the distribution). What is the range of the $\kappa$-criterion in this case?
Exercise 3.6.14 Let $Y \sim AL^*(\theta,\kappa,\sigma)$. Establish the mode-median-mean inequalities (3.1.35).
Exercise 3.6.15 A common measure of skewness of a probability distribution with distribution function $F$ is given by the limit
\[
\lim_{x\to\infty}\frac{1 - F(x) - F(-x)}{1 - F(x) + F(-x)}.
\]
The above limit is equal to zero if the distribution is symmetric about zero. Show that for an AL distribution with distribution function (3.1.11) the above limit is equal to 1 if $\kappa < 1$ ($\mu > 0$), is equal to $-1$ if $\kappa > 1$ ($\mu < 0$), and is equal to
\[
\frac{e^{\sqrt{2}\,\theta/\sigma} - e^{-\sqrt{2}\,\theta/\sigma}}{e^{\sqrt{2}\,\theta/\sigma} + e^{-\sqrt{2}\,\theta/\sigma}}
\]
for $\kappa = 1$. [Note that for an $AL(0, \mu, \sigma)$ distribution, which is a special case of a geometric stable distribution $GS_\alpha(\sigma/\sqrt{2}, \beta, \mu)$ with $\alpha = 2$ [see (4.4.7)], the above limit is equal to $\mathrm{sign}(\mu)$. Since for geometric stable distributions this limit is equal to $\beta$ [see, e.g., Kozubowski (1994a)], for consistency we set $\beta = \mathrm{sign}(\mu)$ for a GS law with $\alpha = 2$.]
3.6 Exercises 219
Exercise 3.6.16 Show that an AL∗(θ, κ, σ) r.v. Y admits the representation (3.2.5).
Exercise 3.6.17 Let Y ∼ AL∗(0, κ, σ), let Y_p^{(i)}, i ≥ 1, be i.i.d. variables having the AL∗(0, κ_p, σ√p) distribution, where κ_p is given by (3.4.12), and let ν_p be a geometric random variable with P(ν_p = n) = (1 − p)^{n−1} p, n ≥ 1, which is independent of the Y_p^{(i)}'s. Show that for each p ∈ (0, 1) the representation (3.4.10) is valid.
Exercise 3.6.18 Show that if in Proposition 3.4.7 the mean and the mean deviation about the mean are prescribed, that is, if the condition (3.4.39) is replaced by

EX = c₁ ∈ R and E|X − c₁| = c₂ > 0 for X ∈ C,

then the maximum entropy is attained by the classical symmetric Laplace distribution with density f(x) = [1/(2c₂)] e^{−|x−c₁|/c₂} [Kapur (1993)].
Exercise 3.6.19 Let X_{1:n} ≤ ··· ≤ X_{n:n} be the order statistics connected to a random sample of size n from the Laplace AL∗(θ, κ, σ) distribution, where κ and σ are known while θ is to be estimated by the method of maximum likelihood.
(a) Show that the likelihood function is maximized by any θ that minimizes the function Q given by (3.5.7).
(b) Show that the function Q is continuous on R and linear on the intervals I₀, I₁, ..., I_n given by (3.5.8)–(3.5.9).
(c) Show that the function Q is decreasing on I₀, increasing on I_n, and that on I_j (1 ≤ j ≤ n − 1) the behavior of Q is given by (3.5.10).
(d) Conclude that if the condition (3.5.11) holds, then any statistic of the form (3.5.12) is an MLE of θ, and if it does not, then the MLE of θ is given by (3.5.13).
(e) Derive the mean and variance of the above MLE. Check whether the estimator is efficient (i.e., whether its variance attains the Cramér–Rao lower bound).
Exercise 3.6.20 Show that if X ∼ AL∗(θ, κ, σ), then Y = g(X), where the function g is given by (3.5.24), has an exponential distribution with mean σ and variance σ². This is a generalization of the fact that the r.v. |X| is exponential whenever X is a symmetric Laplace variable (with mean 0).
Exercise 3.6.21 Let X₁, ..., X_n be a random sample from the AL(θ, µ, σ) distribution. Derive the method of moments estimators of each of the parameters, assuming that the values of the other two are known. Investigate consistency and asymptotic normality of the estimators. Compare with the corresponding results for the MLE's.
Exercise 3.6.22 Let X₁, ..., X_n be a random sample from the AL(θ, µ, σ) distribution.
(a) Assuming that the value of θ is known (and for convenience set to zero), show that the method of moments estimators (MME's) of µ and σ are given by

µ̃_n = X̄_n = (1/n) Σ_{i=1}^{n} X_i,   σ̃_n = √[(1/n) Σ_{i=1}^{n} X_i² − 2X̄²_n].   (3.6.11)
Further, show that the estimator (µ̃_n, σ̃_n)′ is strongly consistent and that √n[(µ̃_n, σ̃_n)′ − (µ, σ)′] is asymptotically normal with (vector) mean zero and the covariance matrix

Σ_MME = [1/(4σ²)] [[4σ⁴ + 4µ²σ², 2µσ³], [2µσ³, 4µ⁴ + 8µ²σ² + 5σ⁴]].   (3.6.12)

[Kozubowski and Podgórski (2000)].
Hint: Consider an auxiliary sequence of bivariate i.i.d. random vectors V_i = (X_i, X_i²)′. Show that the vector mean and covariance matrix of V_i are

m_V = (µ, 2µ² + σ²)′,   Σ_V = [[σ² + µ², 5µσ² + 4µ³], [5µσ² + 4µ³, 20µ⁴ + 32µ²σ² + 5σ⁴]].

Then use the fact that the Law of Large Numbers and the Central Limit Theorem are valid for the sequence {V_i}.
(b) Derive the MME's for the remaining pairs of the parameters (assuming that the value of the remaining parameter is known) and study their consistency and asymptotic normality.
(c) Investigate the method of moments estimation of all three parameters.
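A quick Monte Carlo check of the MME's in part (a) — a sketch assuming NumPy, with the AL(0, µ, σ) sample generated through the normal–exponential mixture representation of Chapter 3:

```python
import numpy as np

rng = np.random.default_rng(12345)
mu, sigma, n = 0.5, 1.2, 200_000

# AL(0, mu, sigma) sample via the mixture representation X = mu*W + sigma*sqrt(W)*Z,
# with W standard exponential and Z standard normal, independent.
W = rng.exponential(1.0, size=n)
Z = rng.standard_normal(n)
X = mu * W + sigma * np.sqrt(W) * Z

# Method of moments estimators (3.6.11); note E X = mu and E X^2 = 2*mu**2 + sigma**2.
mu_tilde = X.mean()
sigma_tilde = np.sqrt(np.mean(X**2) - 2.0 * mu_tilde**2)
print(mu_tilde, sigma_tilde)  # both close to (mu, sigma)
```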
Exercise 3.6.23 Show that the Fisher information matrix corresponding to the AL∗(θ, κ, σ) distribution is given by (3.5.1).
Exercise 3.6.24 Let X have an AL∗(θ, κ, σ) distribution.
(a) Suppose that σ = σ(κ) = √2κµ for some µ > 0. Show that when κ → 0, the corresponding AL density (3.1.10) converges to the exponential density (3.5.75).
(b) Suppose that σ = σ(κ) = √2κ^{−1}µ for some µ > 0. Show that when κ → ∞, the corresponding AL density (3.1.10) converges to the exponential density (3.5.81).
Exercise 3.6.25 Prove Lemma 3.5.2.
Hint: To establish the concavity, show that h″(θ) < 0 for all θ ∈ (x_{1:n}, x_{n:n}) with θ ≠ x_{j:n}, j = 1, 2, ..., n.
Exercise 3.6.26 Let x_{1:3} < x_{2:3} < x_{3:3} be particular realizations of the order statistics corresponding to a random sample of size n = 3 from the AL∗(θ, κ, σ) distribution. Derive the MLE's of all three parameters. Under what conditions on the x_{j:3}'s is the MLE of θ equal to θ̂₃ = x_{2:3}? When does the maximum likelihood approach lead to an exponential distribution?
4
Related distributions
Symmetric Laplace distributions can be extended in various ways. As we discussed in Chapter 3, skewness may be introduced, leading to asymmetric Laplace laws. Next, one can consider a more general class of distributions whose ch.f.'s are positive powers of Laplace ch.f.'s. These are marginal distributions of the Lévy process {Y(t), t ≥ 0} with independent increments, for which Y(1) has a symmetric or asymmetric Laplace distribution. We term such a process the Laplace motion. Finally, one obtains a wider class of limiting distributions, consisting of geometric stable laws, by allowing for infinite variance of the components in the geometric compounds (2.2.1). More generally, if the random number of components in the summation (2.2.1) is distributed according to a discrete law ν on the positive integers, a wider class of ν-stable laws is obtained as the limiting distributions. This chapter is devoted to a discussion of all such related distributions and random variables.
Barndorff-Nielsen (1977) introduced a general class of hyperbolic distributions [see also Eberlein and Keller (1995) for applications in finance]. The Bessel function distributions which are discussed in this chapter could be studied through the theory of this class. However, hyperbolic distributions do not constitute a direct generalization of Laplace laws. Thus, we decided not to present the following material through this alternative approach, as it would take us "too far" from the classical Laplace distribution.
224 4. Related distributions
4.1 Bessel function distribution
If X₁, ..., X_n are i.i.d. Laplace r.v.'s with mean zero and variance σ², then their sum S_n has the ch.f.

ψ_{S_n}(t) = ∏_{i=1}^{n} ψ_{X_i}(t) = [1 + (1/2)σ²t²]^{−n},  −∞ < t < ∞.   (4.1.1)
By infinite divisibility of the Laplace distribution, the function (4.1.1) is a legitimate ch.f. even when n is not an integer (but is still positive). More generally, taking in (4.1.1) the ch.f. of an asymmetric Laplace distribution with the mode at zero (which is still infinitely divisible), we conclude that the function

ψ(t) = [1 + (1/2)σ²t² − iµt]^{−τ},  −∞ < t < ∞,   (4.1.2)

is a characteristic function for any µ ∈ R and σ, τ ≥ 0. The function (4.1.2) yields an AL ch.f. for τ = 1, a symmetric Laplace ch.f. for τ = 1 and µ = 0 (and a gamma ch.f. for σ = 0). Not surprisingly, it is known in the literature as a generalized (asymmetric) Laplace distribution [see, e.g., Mathai (1993), Kozubowski and Podgórski (1999c)]. Since the corresponding density function can be written in terms of the Bessel function of the third kind (defined in the Appendix), Bessel function distribution is another name for this class [see, e.g., McKay (1932)]. The formula for the density appeared in Pearson et al. (1929) in connection with the distribution of the sample covariance for a random sample drawn from a bivariate normal population [see also Pearson et al. (1932) and Bhattacharyya (1942)]. This distribution arises as a mixture of normal distributions with stochastic variance having a gamma distribution, and so it is also called the variance gamma model; see, e.g., Madan and Seneta (1990). Such mixtures (with mean zero) were introduced in Teichroew (1957), who commented that in some practical problems the variable of interest may be normal with variance varying with time. Rowland and Sichel (1960) applied the generalized Laplace model to logarithms of the ratios of duplicate check-sampling values (of gold ore) in South African gold mines, reporting an excellent fit. Sichel (1973) applied this distribution to modeling the size of diamonds mined in South West Africa. More recently, the variance gamma model became popular among some financial modelers, due to its simplicity, flexibility, and an excellent fit to empirical data; see, e.g., Madan and Seneta (1990), Madan et al. (1998), Levin and Tchernitser (1999), Kozubowski and Podgórski (1999a,c).
4.1.1 Definition and parameterizations
We shall start with a definition, terminology, and some notation. We shall
define a general four-parameter family of distributions, although in the
4.1 Bessel function distribution 225
sequel we will often consider a three-parameter model with the location
parameter fixed at zero.
Definition 4.1.1 A random variable Y is said to have a generalized asymmetric Laplace (GAL) distribution if its ch.f. is given by

ψ(t) = e^{iθt} [1 + (1/2)σ²t² − iµt]^{−τ},  −∞ < t < ∞,   (4.1.3)

where θ, µ ∈ R and σ, τ ≥ 0. We denote such a distribution by GAL(θ, µ, σ, τ) and write Y ∼ GAL(θ, µ, σ, τ).
Remark 4.1.1 The terminology for the above family of distributions is not well established, and various names can be equally justified. First, in McKay (1932) and Johnson et al. (1994) we have two types of Bessel function distributions: the Bessel I-function distribution (not considered here) and the Bessel K-function distribution (which is an alternative name for generalized Laplace distributions). The name Bessel K-function distribution is thus historically well justified. On the other hand, in various contexts a more compact name is handier: we prefer Laplace motion to Bessel K-function motion. In this book we decided to use the terms Bessel function distribution and variance-gamma distribution interchangeably with the name generalized Laplace distribution used in Definition 4.1.1.
While the distribution is well-defined for every θ, µ ∈ R and σ, τ ≥ 0, we have the following special cases. If θ = µ = σ = 0, then ψ(t) = 1 for every t ∈ R, and the distribution is degenerate at 0. For θ = σ = 0 and µ > 0, we have a gamma r.v. with the scale parameter µ and the shape parameter τ (which reduces to an exponential variable for τ = 1). For τ = 1, we obtain an AL distribution, which for µ = 0 and σ > 0 yields a symmetric Laplace distribution with mean θ and variance σ².
The GAL ch.f. (4.1.3) with σ > 0 can be factored similarly to an AL ch.f.,

ψ(t) = e^{iθt} [1/(1 + i(√2/2)σκt)]^{τ} [1/(1 − i(√2/(2κ))σt)]^{τ},   (4.1.4)

where the additional parameter κ > 0 is related to µ and σ as before,

µ = (σ/√2)(1/κ − κ)  and  κ = √2σ/[µ + √(2σ² + µ²)] = [√(2σ² + µ²) − µ]/(√2σ).   (4.1.5)

It will be convenient to express certain properties of the GAL distributions in the (θ, κ, σ, τ)-parameterization, using the notation GAL∗(θ, κ, σ, τ) for the distribution given by (4.1.4). Analogously to the AL case, the parameter κ is scale invariant, while σ is a genuine scale parameter [in the (θ, κ, σ, τ)-parameterization].
The following result extends an analogous property of AL laws (Proposition 3.1.1).
Proposition 4.1.1 Let X ∼ GAL∗(θ, κ, σ, τ) and let c be a non-zero real constant. Then
(i) c + X ∼ GAL∗(c + θ, κ, σ, τ);
(ii) cX ∼ GAL∗(cθ, κ_c, |c|σ, τ), where κ_c = κ^{sign(c)}.
Remark 4.1.2 Note that, in particular, if X ∼ GAL∗(θ, κ, σ, τ), then −X ∼ GAL∗(−θ, 1/κ, σ, τ).
Since θ is a location parameter, we shall often assume θ = 0 and denote the corresponding distribution as either GAL(µ, σ, τ) or GAL∗(κ, σ, τ), depending on the parameterization. For θ = 0 and σ = 1 we shall refer to the GAL distribution as standard and write GAL(µ, τ) and GAL∗(κ, τ), respectively, for the distributions GAL(0, µ, 1, τ) and GAL∗(0, κ, 1, τ). We shall often state our results in terms of standard variables.
Table 4.1 below contains a summary of the notation and special cases.
4.1.2 Representations
A Bessel function random variable admits certain representations analogous to those corresponding to AL random variables. First, we shall consider a mixture representation in terms of a normal distribution with a stochastic mean and variance. Then, we shall discuss a representation as a convolution of two gamma distributions, analogous to the previously considered representations of (symmetric and asymmetric) Laplace r.v.'s in terms of exponential r.v.'s. Finally, we shall discuss the relation between the Bessel function distribution and the sample covariance for bivariate normal random samples.
Mixture of normal distributions
Let Z be a standard normal random variable. Then, for any µ ∈ R and σ > 0, the r.v.

µ + σZ   (4.1.6)

has a normal distribution with mean µ and variance σ². The ch.f. of the latter r.v. is

φ_{µ,σ}(t) = e^{iµt − (1/2)σ²t²},  t ∈ R.   (4.1.7)

Now, suppose that the mean and the variance of the above normal r.v. are multiplied by an independent positive random variable W, and let us write the resulting new random variable Y as the following function of Z and W:

Y = µW + σ√W Z.   (4.1.8)
Case: θ = 0, τ = 1, σ = 0, µ > 0. Distribution: Exponential (with mean µ). Notation: GAL(0, µ, 0, 1), AL(µ, 0), GAL(µ, 0, 1), Γ(1, µ), E(µ). Density: (1/µ) e^{−x/µ} (x > 0).
Case: θ = 0, σ = 0, µ > 0. Distribution: Gamma with parameters α = τ, β = µ. Notation: GAL(0, µ, 0, τ), GAL(µ, 0, τ), Γ(τ, µ). Density: x^{τ−1} e^{−x/µ}/(µ^τ Γ(τ)) (x > 0).
Case: τ = 1, σ > 0, µ = 0. Distribution: Symmetric Laplace. Notation: L(θ, σ), AL(θ, 0, σ), GAL(θ, 0, σ, 1). Density: [1/(√2σ)] e^{−√2|x−θ|/σ} (x ∈ R).
Case: τ = 1, σ > 0, µ ≠ 0. Distribution: Asymmetric Laplace. Notation: AL(θ, µ, σ), GAL(θ, µ, σ, 1). Density: [√2κ/(σ(1 + κ²))] e^{−(√2κ/σ)(x−θ)} for x ≥ θ, and [√2κ/(σ(1 + κ²))] e^{−(√2/(σκ))(θ−x)} for x < θ.
Case: θ = 0, σ = 0, µ = 0, τ = 0. Distribution: Degenerate at 0.
Table 4.1: Special cases and notation for the Bessel function distribution in the GAL(θ, µ, σ, τ) parameterization.
Thus, conditionally on W = w, the random variable Y has a normal distribution with mean µw and variance wσ². To find the marginal distribution of Y, we may find its density by integrating the product of the conditional density of Y | W = w and the marginal density f(w) of W. Alternatively, we may find the ch.f. of Y by conditioning on W. This is exactly how we found mixture representations of this type for the classical as well as the asymmetric Laplace distributions. We shall follow this approach to show that Y given by (4.1.8) has the Bessel function distribution when W is gamma distributed. Indeed, let W have a gamma distribution Γ(α = τ, β = 1) with the density

g(x) = [1/Γ(τ)] x^{τ−1} e^{−x},  x > 0, τ > 0.   (4.1.9)
Conditioning on W, we obtain

ψ_Y(t) = E e^{itY} = E[E(e^{itY} | W)] = ∫₀^∞ E e^{it(µw + σ√w Z)} g(w) dw.

When we put the gamma density (4.1.9) and the normal ch.f. (4.1.7) into the above relation, we get

ψ_Y(t) = ∫₀^∞ φ_{µw, σ√w}(t) g(w) dw = [1/Γ(τ)] ∫₀^∞ w^{τ−1} e^{−w(1 + (1/2)σ²t² − iµt)} dw.

The latter integral can be related to the standard gamma function to produce

ψ_Y(t) = [1/Γ(τ)] Γ(τ) [1 + (1/2)σ²t² − iµt]^{−τ} = [1 + (1/2)σ²t² − iµt]^{−τ},

which we recognize as the GAL(µ, σ, τ) characteristic function. We summarize our findings in the following result, where we consider the more general four-parameter model.
Proposition 4.1.2 A GAL(θ, µ, σ, τ) random variable Y with ch.f. (4.1.3) admits the representation

Y =_d θ + µW + σ√W Z,   (4.1.10)

where Z is standard normal and W is gamma with density (4.1.9).
Remark 4.1.3 Note that in the case τ = 1, where W has the standard exponential distribution, for µ = 0 (and σ = √2, θ = 0) we obtain the representation (2.2.3) of the standard classical Laplace distribution, while for µ ≠ 0 we get the representation (3.2.1) obtained previously for asymmetric Laplace laws.
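Representation (4.1.10) is also a practical simulation recipe: draw W from a gamma law with shape τ and unit scale, then Y given W from a normal law. A sketch assuming NumPy:

```python
import numpy as np

def rgal(theta, mu, sigma, tau, size, rng):
    """Simulate GAL(theta, mu, sigma, tau) via Y = theta + mu*W + sigma*sqrt(W)*Z,
    with W ~ Gamma(tau, 1) and Z standard normal, independent (Proposition 4.1.2)."""
    W = rng.gamma(tau, 1.0, size)
    Z = rng.standard_normal(size)
    return theta + mu * W + sigma * np.sqrt(W) * Z

rng = np.random.default_rng(7)
theta, mu, sigma, tau = 1.0, 0.3, 1.5, 2.5
Y = rgal(theta, mu, sigma, tau, 400_000, rng)
# Since E W = tau and Var W = tau, we expect
# E Y = theta + mu*tau and Var Y = tau*(sigma**2 + mu**2).
print(Y.mean(), Y.var())
```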
The above representation produces the following result, showing that, as the parameter τ converges to infinity, the corresponding Bessel random variable converges in distribution to a normal variable.
Theorem 4.1.1 Let Y_τ ∼ GAL(µ_τ, σ_τ, τ), where

lim_{τ→∞} µ_τ τ = µ₀  and  lim_{τ→∞} σ²_τ τ = σ²₀.

Then Y_τ converges in distribution to a Gaussian r.v. with mean µ₀ and variance σ²₀.
Proof. Let W_τ be a gamma Γ(α = τ, β = 1) random variable. It follows from the form of the relevant characteristic functions that the random variables µ_τ W_τ and σ²_τ W_τ converge in probability to µ₀ and σ²₀, respectively. Thus, the result follows from the representation given in Proposition 4.1.2 by invoking the independence of W and Z.
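Theorem 4.1.1 can be illustrated by simulation: taking µ_τ = µ₀/τ and σ_τ = σ₀/√τ (so that the limits above hold), the moments of the simulated GAL variables approach those of N(µ₀, σ₀²). A sketch assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(11)
mu0, sigma0 = 1.0, 2.0

def sample(tau, size=300_000):
    # GAL(mu0/tau, sigma0/sqrt(tau), tau) via the normal-gamma mixture (4.1.10)
    W = rng.gamma(tau, 1.0, size)
    Z = rng.standard_normal(size)
    return (mu0 / tau) * W + (sigma0 / np.sqrt(tau)) * np.sqrt(W) * Z

for tau in (1.0, 10.0, 100.0):
    Y = sample(tau)
    # mean -> mu0; variance = sigma0**2 + mu0**2/tau -> sigma0**2
    print(tau, Y.mean(), Y.var())
```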
Relation to gamma distribution
We now study the relation between Bessel function and gamma distributions. Let Y have the Bessel function distribution with the ch.f. ψ given by (4.1.3). Note that in the factorization of ψ given by (4.1.4), the third factor corresponds to the r.v. (σ/√2)(1/κ)G₁, while the second factor corresponds to the r.v. −(σ/√2)κG₂, where G₁, G₂ are i.i.d. Γ(α = τ, β = 1) random variables. Thus, we obtain the following result, derived by Press (1967).
Proposition 4.1.3 A GAL∗(θ, κ, σ, τ) random variable Y with ch.f. (4.1.4) admits the representation

Y =_d θ + (σ/√2)[(1/κ)G₁ − κG₂],   (4.1.11)

where G₁ and G₂ are i.i.d. gamma random variables with density (4.1.9).
As before, for the special case τ = 1 the representation (4.1.11) reduces to that of an AL r.v., in which case G₁ and G₂ are standard exponential variables (see Proposition 3.2.2).
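The mixture representation (4.1.10) and the gamma-difference representation (4.1.11) can be checked against each other by simulation, converting between the parameterizations through (4.1.5). A sketch assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(3)
theta, kappa, sigma, tau = 0.0, 2.0, 1.0, 1.5
size = 400_000

# Gamma-difference representation (4.1.11):
G1 = rng.gamma(tau, 1.0, size)
G2 = rng.gamma(tau, 1.0, size)
Y1 = theta + (sigma / np.sqrt(2.0)) * (G1 / kappa - kappa * G2)

# Normal-gamma mixture (4.1.10), with mu recovered from kappa via (4.1.5):
mu = (sigma / np.sqrt(2.0)) * (1.0 / kappa - kappa)
W = rng.gamma(tau, 1.0, size)
Z = rng.standard_normal(size)
Y2 = theta + mu * W + sigma * np.sqrt(W) * Z

print(Y1.mean(), Y2.mean())  # both estimate theta + tau*mu
print(Y1.var(), Y2.var())    # both estimate tau*(mu**2 + sigma**2)
```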
Remark 4.1.4 Writing G_i = −log U_i, where the U_i's have the log-gamma distribution on (0, 1) with p.d.f.

f(u) = [1/Γ(τ)] (−log u)^{τ−1},  u ∈ (0, 1)

[see, e.g., Johnson et al. (1994)], we obtain the representation

Y =_d θ + (σ/√2) log(U₁^κ / U₂^{1/κ}).   (4.1.12)

For κ = 1 the U_i's are standard uniform and we obtain the representation (3.2.3) of AL random variables.
Remark 4.1.5 Similarly, writing G_i = log P_i, we obtain the representation

Y =_d θ + (σ/√2) log(P₁^{1/κ} / P₂^κ).   (4.1.13)

Here, the i.i.d. variables P_i have the density

f(u) = [1/Γ(τ)] (1/u²) (log u)^{τ−1},  u ∈ (1, ∞).

For κ = 1 the P_i's have a Pareto Type I distribution and (4.1.13) reduces to the representation (3.2.4) of AL r.v.'s.
Remark 4.1.6 Recall that if G has a gamma distribution with density (4.1.9), then the r.v. H = 2G has a chi-square distribution with ν = 2τ degrees of freedom, denoted by χ²_ν. Consequently, Y ∼ GAL∗(θ, κ, σ, τ) has the following representation in terms of two i.i.d. χ²_{2τ}-distributed r.v.'s H₁ and H₂:

Y =_d θ + (√2σ/4)[(1/κ)H₁ − κH₂].   (4.1.14)
4.1.3 Self-decomposability
As shown in Proposition 3.2.3 of Chapter 3, every AL∗(θ, κ, σ) r.v. Y is self-decomposable; that is, for every c ∈ (0, 1) it admits the representation

Y =_d cY + (1 − c)θ + V,   (4.1.15)

where the r.v. V can be expressed as

V =_d (σ/√2)[(1/κ)δ₁W₁ − κδ₂W₂].   (4.1.16)

Here, δ₁, δ₂ are r.v.'s taking values of either zero or one with probabilities

P(δ₁ = 0, δ₂ = 0) = c²,  P(δ₁ = 1, δ₂ = 1) = 0,
P(δ₁ = 1, δ₂ = 0) = (1 − c)[c + (1 − c)/(1 + κ²)],
P(δ₁ = 0, δ₂ = 1) = (1 − c)[c + (1 − c)κ²/(1 + κ²)],

W₁ and W₂ are standard exponential variables, and Y, W₁, W₂, and (δ₁, δ₂) are mutually independent. Now, consider a GAL∗(θ, κ, σ, τ) r.v. X, where τ = n is a positive integer. Then,

X =_d θ + Σ_{i=1}^{n} Y_i,   (4.1.17)

where the Y_i's are i.i.d. AL∗(0, κ, σ) random variables. Consequently, since each Y_i admits the representation (4.1.15) with θ = 0,

Y_i =_d cY_i + V_i,   (4.1.18)

where the V_i's are i.i.d. copies of V given by (4.1.16), we obtain

X =_d θ + Σ_{i=1}^{n} Y_i =_d θ + c Σ_{i=1}^{n} Y_i + Σ_{i=1}^{n} V_i = c(θ + Σ_{i=1}^{n} Y_i) + (1 − c)θ + Σ_{i=1}^{n} V_i.   (4.1.19)
Thus, we conclude that X is self-decomposable as well. The following result
summarizes our findings.
Proposition 4.1.4 Let X ∼ GAL∗(θ, κ, σ, n), where n ≥ 1 is an integer. Then X is self-decomposable, and for any c ∈ [0, 1] we have

X =_d cX + (1 − c)θ + Σ_{i=1}^{n} V_i,   (4.1.20)

where the V_i's are i.i.d. variables with the representation (4.1.16).
Remark 4.1.7 The fact that a GAL r.v. with the parameters θ = 0, κ = 1, σ > 0, and τ = n ∈ N has the same distribution as the sum of n i.i.d. symmetric Laplace variables shows that this distribution is stable with respect to a random summation where the number of terms ν_{p,n} has the Pascal distribution:

P(ν_{p,n} = k) = ((k − 1) choose (n − 1)) p^n (1 − p)^{k−n},  k = n, n + 1, ...,  0 < p < 1.   (4.1.21)

More precisely, if the X_i's are i.i.d. with the GAL∗(0, 1, σ, n) distribution and ν_{p,n} is a Pascal r.v. independent of the X_i's, then the relation

p^{1/2} Σ_{i=1}^{ν_{p,n}} X_i =_d X₁   (4.1.22)

holds for all p ∈ (0, 1). Moreover, under the symmetry and finite variance of the X_i's, the stability property (4.1.22) characterizes this class of distributions (recall that with a geometric number of terms, which corresponds to n = 1, we obtain the characterization of symmetric Laplace laws). In addition, the class of GAL∗(0, 1, σ, n) distributions consists of all distributional limits as p → 0 of Pascal compounds

a_p Σ_{i=1}^{ν_{p,n}} (Y_i − b_p)   (4.1.23)

with b_p = 0, where the Y_i's are symmetric i.i.d. variables with finite variance, independent of the Pascal number of terms ν_{p,n}. If the restrictions on symmetry or finite variance are relaxed, we obtain a larger class of Pascal-stable distributions, introduced in Janković (1993b) as the class of distributional limits of (4.1.23).
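The Pascal compounds (4.1.23) are easy to simulate. In the sketch below (assuming NumPy) the summands are symmetric Laplace with variance σ², b_p = 0, and a_p = p^{1/2}; for such summands the variance of the normalized compound equals nσ², the variance of a GAL∗(0, 1, σ, n) law:

```python
import numpy as np

rng = np.random.default_rng(21)
n, p, sigma, reps = 3, 0.05, 1.0, 20_000

# Pascal number of terms on {n, n+1, ...}: n successes plus a negative-binomial
# count of failures, so that E nu = n/p.
nu = n + rng.negative_binomial(n, p, size=reps)

# Sum nu_i i.i.d. symmetric Laplace(0, sigma) terms per replication;
# NumPy's laplace scale b gives variance 2*b**2, so b = sigma/sqrt(2).
total = np.array([rng.laplace(0.0, sigma / np.sqrt(2.0), k).sum() for k in nu])
S = np.sqrt(p) * total

print(S.var())  # close to n * sigma**2 = 3.0
```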
Relation to sample covariance
Pearson et al. (1929) showed analytically that if (X_i, Y_i), i = 1, ..., n, are i.i.d. from a bivariate normal distribution with means µ_X and µ_Y, variances σ²_X and σ²_Y, and correlation coefficient ρ, then the product-moment coefficient

p₁₁ = (1/n) Σ_{i=1}^{n} (X_i − X̄)(Y_i − Ȳ)   (4.1.24)

has the Bessel function distribution. We provide an alternative derivation, utilizing appropriate representations of random variables along with the convolution representation (4.1.11) of the Bessel function distribution. Without loss of generality, we assume that the random sample comes from the standard bivariate normal distribution with means zero, variances equal to one, and correlation (covariance) ρ. The following result shows that the statistic

T_n = n Σ_{i=1}^{n} (X_i − X̄)(Y_i − Ȳ) = n Σ_{i=1}^{n} X_iY_i − (Σ_{i=1}^{n} X_i)(Σ_{i=1}^{n} Y_i)   (4.1.25)

has a Bessel function distribution with appropriate parameters (and consequently so does the statistic p₁₁ defined above).
Proposition 4.1.5 Let X_i and Y_i, i = 1, ..., n, be i.i.d. bivariate normal with zero means, unit variances, and covariance ρ. Then, for any n > 1, the statistic T_n given by (4.1.25) has the Bessel function distribution GAL∗(κ, σ, τ) with

σ = √2 n √(1 − ρ²),  κ = √[(1 − ρ)/(1 + ρ)],  τ = (n − 1)/2.   (4.1.26)
Before proving Proposition 4.1.5 we establish the following lemma.
Lemma 4.1.1 Let x₁, ..., x_n and y₁, ..., y_n be two sets of real numbers, and let x̄ and ȳ be their arithmetic means. Then, for any integer n ≥ 1, we have

n Σ_{i=1}^{n} (x_i − x̄)² = Σ_{1≤i<j≤n} (x_i − x_j)²,   (4.1.27)
n Σ_{i=1}^{n} (x_i − x̄)(y_i − ȳ) = Σ_{1≤i<j≤n} (x_i − x_j)(y_i − y_j).   (4.1.28)
Proof. Since (4.1.27) follows from (4.1.28), we only prove the latter relation. We have the following chain of equalities:

n Σ_{i=1}^{n} (x_i − x̄)(y_i − ȳ) = n Σ_{i=1}^{n} x_iy_i − Σ_{i=1}^{n} x_i Σ_{j=1}^{n} y_j
= n Σ_{i=1}^{n} x_iy_i − Σ_{i=1}^{n} x_iy_i − Σ_{1≤i<j≤n} (x_iy_j + x_jy_i)
= (n − 1) Σ_{i=1}^{n} x_iy_i − Σ_{1≤i<j≤n} (x_iy_j + x_jy_i)
= Σ_{1≤i<j≤n} (x_i − x_j)(y_i − y_j).
We now turn to the proof of Proposition 4.1.5.
Proof. In view of the representation (4.1.11), our goal is to show that

T_n =_d n √(1 − ρ²) [(1/κ)G₁ − κG₂].   (4.1.29)

By Lemma 4.1.1 we have

T_n = Σ_{1≤i<j≤n} a_{i,j},

where a_{i,j} = (X_i − X_j)(Y_i − Y_j). Write

a_{i,j} = (1/4){[b⁺_{i,j}]² − [b⁻_{i,j}]²},

where

b^±_{i,j} = (Y_i − Y_j) ± (X_i − X_j),

so that

T_n = (1/4){Σ_{1≤i<j≤n} [b⁺_{i,j}]² − Σ_{1≤i<j≤n} [b⁻_{i,j}]²}.

Next, note that for all 1 ≤ i < j ≤ n and 1 ≤ k < l ≤ n the variables b⁺_{i,j} and b⁻_{k,l} are independent. Indeed, they are normally distributed and their covariance is equal to zero:

Cov(b⁺_{i,j}, b⁻_{k,l}) = Cov{(Y_i − Y_j) + (X_i − X_j), (Y_k − Y_l) − (X_k − X_l)}
= Cov(Y_i, Y_k) − Cov(Y_i, Y_l) − Cov(Y_i, X_k) + Cov(Y_i, X_l)
− Cov(Y_j, Y_k) + Cov(Y_j, Y_l) + Cov(Y_j, X_k) − Cov(Y_j, X_l)
+ Cov(X_i, Y_k) − Cov(X_i, Y_l) − Cov(X_i, X_k) + Cov(X_i, X_l)
− Cov(X_j, Y_k) + Cov(X_j, Y_l) + Cov(X_j, X_k) − Cov(X_j, X_l)
= δ_{ik} − δ_{il} − ρδ_{ik} + ρδ_{il} − δ_{jk} + δ_{jl} + ρδ_{jk} − ρδ_{jl}
+ ρδ_{ik} − ρδ_{il} − δ_{ik} + δ_{il} − ρδ_{jk} + ρδ_{jl} + δ_{jk} − δ_{jl} = 0,

since δ_{ij} is equal to one if i = j and zero otherwise. Next, write
T_n = (1/4)(W⁺ − W⁻),

where

W⁺ = Σ_{1≤i<j≤n} [b⁺_{i,j}]²  and  W⁻ = Σ_{1≤i<j≤n} [b⁻_{i,j}]²

are independent random variables. Further, we have

b^±_{i,j} = (Y_i ± X_i) − (Y_j ± X_j) = Z^±_i − Z^±_j,

where

Z^±_i = Y_i ± X_i,  i = 1, ..., n.

Note that the Z⁺_i's are i.i.d. normal with mean zero and variance 2(1 + ρ), since

Var(Y_i + X_i) = Var(Y_i) + Var(X_i) + 2 Cov(Y_i, X_i) = 2(1 + ρ).

Similarly, the Z⁻_i's are i.i.d. normal with mean zero and variance 2(1 − ρ). We now express T_n in terms of the Z^±_i's as

T_n = (1/4){Σ_{1≤i<j≤n} [Z⁺_i − Z⁺_j]² − Σ_{1≤i<j≤n} [Z⁻_i − Z⁻_j]²},

and apply Lemma 4.1.1 to conclude that

W⁺ = n Σ_{i=1}^{n} [Z⁺_i − Z̄⁺]²  and  W⁻ = n Σ_{i=1}^{n} [Z⁻_i − Z̄⁻]²,

where Z̄⁺ and Z̄⁻ denote the arithmetic means of the Z⁺_i's and the Z⁻_i's, respectively. Since the Z⁺_i's are i.i.d. normal with mean zero and variance σ²₊ = 2(1 + ρ), we conclude that the statistic

H₁ = (1/n) W⁺/σ²₊ = Σ_{i=1}^{n} [Z⁺_i − Z̄⁺]² / [2(1 + ρ)]
has a chi-square distribution with n − 1 degrees of freedom. The statistic

H₂ = (1/n) W⁻/σ²₋ = Σ_{i=1}^{n} [Z⁻_i − Z̄⁻]² / [2(1 − ρ)]

has the same distribution and is independent of H₁. Finally, we can write

T_n = (1/4){2n(1 + ρ)H₁ − 2n(1 − ρ)H₂} = (n/2){(1 + ρ)H₁ − (1 − ρ)H₂},

which is equivalent to (4.1.29) by the relation between chi-square and gamma distributions. The result has been proved.
Remark 4.1.8 For the special case n = 3 we obtain τ = 1, so that the statistic T₃ has an asymmetric Laplace distribution AL∗(κ, σ) with parameters as in (4.1.26). Equivalently, an AL∗(κ, 1) r.v. Y admits the representation

Y =_d [(X₁ − X̄)(Y₁ − Ȳ) + (X₂ − X̄)(Y₂ − Ȳ) + (X₃ − X̄)(Y₃ − Ȳ)] / [√2 √(1 − ρ²)],

where ρ and κ are related as in (4.1.26) and (X_i, Y_i), i = 1, 2, 3, are i.i.d. bivariate normal variables with vector mean zero, unit variances, and correlation ρ.
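Proposition 4.1.5 (and the n = 3 case above) can be verified by simulation: draws of the statistic T_n from bivariate normal samples should match, in distribution, the gamma-difference form (4.1.29) with the parameters (4.1.26). A sketch assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(5)
n, rho, reps = 3, 0.4, 100_000

# Monte Carlo draws of T_n = n * sum (X_i - Xbar)(Y_i - Ybar):
X = rng.standard_normal((reps, n))
Y = rho * X + np.sqrt(1.0 - rho**2) * rng.standard_normal((reps, n))
Tn = n * ((X - X.mean(axis=1, keepdims=True)) *
          (Y - Y.mean(axis=1, keepdims=True))).sum(axis=1)

# Gamma-difference form (4.1.29) with parameters (4.1.26):
tau = (n - 1) / 2.0
kappa = np.sqrt((1.0 - rho) / (1.0 + rho))
G1 = rng.gamma(tau, 1.0, reps)
G2 = rng.gamma(tau, 1.0, reps)
R = n * np.sqrt(1.0 - rho**2) * (G1 / kappa - kappa * G2)

print(Tn.mean(), R.mean())  # both estimate n*(n-1)*rho
print(Tn.var(), R.var())
```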
4.1.4 Densities
To derive the p.d.f. of a GAL random variable we can either apply the inversion formula to the GAL ch.f. (4.1.2) or exploit the representations (4.1.10) and (4.1.11). Actually, we have already done the latter (for the case σ = 1) in Lemma 2.3.1 of Section 2.3, where we were dealing with functions of Laplace random variables. Thus, the density of a GAL∗(θ, κ, σ, τ) r.v. has the following form for x ≠ θ:

h(x) = [√2 e^{(√2/(2σ))(1/κ − κ)(x − θ)} / (√π σ^{τ+1/2} Γ(τ))] · [√2|x − θ|/(1/κ + κ)]^{τ−1/2} · K_{τ−1/2}((√2/(2σ))(1/κ + κ)|x − θ|),   (4.1.30)

where K_λ is the modified Bessel function of the third kind with index λ, given in Appendix A. A standard GAL density is obtained for θ = 0 and σ = 1. The above density, derived by a variety of methods and under various parameterizations, has appeared in several papers, including Pearson et al. (1929), McKay (1932), Madan et al. (1998), Levin and Tchernitser (1999),
Figure 4.1: Densities of the standard generalized Laplace distributions with τ = 1/4, 1/2, 3/4, 1, 5/4, 3/2, 2, 5/2, and 3. Left: κ = 1, the symmetric case; Right: κ = 2, an asymmetric case.
Kozubowski and Podgórski (1999a). In Figure 4.1 we present a variety of standard GAL densities. Note the behavior of the densities at zero, which will be the subject of Theorem 4.1.2.
Let us note several special cases.
Asymmetric Laplace laws
Consider a standard GAL density with τ = 1. Here the Bessel function has index 1/2, so that it admits the closed form given by (A.0.11) in Appendix A. Thus, the density (4.1.30) takes the form

h(x) = [√2/(Γ(1)√π)] [√2|x|/(1/κ + κ)]^{1/2} e^{(√2/2)(1/κ − κ)x} K_{1/2}((√2/2)(1/κ + κ)|x|)
= [√2/√π] (√2|x|)^{1/2} (1/κ + κ)^{−1/2} e^{(√2/2)(1/κ − κ)x} · √π [√2(1/κ + κ)|x|]^{−1/2} e^{−(√2/2)(1/κ + κ)|x|}
= [√2/(1/κ + κ)] e^{(√2/2)(1/κ − κ)x − (√2/2)(1/κ + κ)|x|},   (4.1.31)

which we recognize as the density of the standard AL∗(0, κ, 1) distribution. Further, in the symmetric case κ = 1, the above reduces to the density of the standard Laplace distribution.
Symmetric case
When κ = 1 and θ = 0, the distribution is symmetric (about zero), since the corresponding characteristic function is real. In this case, the density is given by the following even function of x:

h(x) = [√2/(σ^{τ+1/2} Γ(τ)√π)] (|x|/√2)^{τ−1/2} K_{τ−1/2}(√2|x|/σ),  x ≠ 0.   (4.1.32)

This particular distribution arises as a mixture of normal distributions with mean zero and (stochastic) variance σ²W, where W has the gamma distribution with density (4.1.9); see, e.g., Teichroew (1957), Madan and Seneta (1990), McLeish (1982).
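The symmetric density (4.1.32) can be evaluated directly from the integral representation K_ν(z) = ∫₀^∞ e^{−z cosh t} cosh(νt) dt of the Bessel function. A sketch assuming NumPy (quadrature settings are ours; for τ = 1 the result must reduce to the standard Laplace density):

```python
import math
import numpy as np

def trap(y, x):
    """Simple trapezoidal rule."""
    y, x = np.asarray(y, dtype=float), np.asarray(x, dtype=float)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def bessel_k(nu, z, t_max=30.0, m=4001):
    """K_nu(z) via K_nu(z) = int_0^inf exp(-z cosh t) cosh(nu t) dt, z > 0."""
    t = np.linspace(0.0, t_max, m)
    return trap(np.exp(-z * np.cosh(t)) * np.cosh(nu * t), t)

def h_sym(x, tau, sigma=1.0):
    """Symmetric generalized Laplace density (4.1.32)."""
    ax = abs(x)
    c = math.sqrt(2.0) / (sigma ** (tau + 0.5) * math.gamma(tau) * math.sqrt(math.pi))
    return c * (ax / math.sqrt(2.0)) ** (tau - 0.5) * bessel_k(tau - 0.5, math.sqrt(2.0) * ax / sigma)

# For tau = 1 the formula recovers the standard Laplace density (1/sqrt(2)) e^{-sqrt(2)|x|}:
laplace_exact = math.exp(-math.sqrt(2.0) * 0.8) / math.sqrt(2.0)
print(h_sym(0.8, 1.0), laplace_exact)

# For other tau the density should integrate to (about) one, e.g., tau = 2:
xs = np.concatenate([np.linspace(-12.0, -1e-3, 3000), np.linspace(1e-3, 12.0, 3000)])
total = trap([h_sym(x, 2.0) for x in xs], xs)
print(total)
```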
In our next result we summarize some properties of the densities of the symmetric generalized Laplace distributions. In particular, we show that they are all unimodal for τ ≥ 1, and we study their behavior at the mode.
Theorem 4.1.2 Let h(x; τ) be the density of a symmetric generalized Laplace distribution GAL(0, 1, 1, τ). Then h(x; τ) has the following asymptotic behavior as x → 0⁺:

h(x; τ) = [1/(2^τ √π)] [Γ(1/2 − τ)/Γ(τ)] x^{2τ−1} + o(x^{2τ−1})  for τ ∈ (0, 1/2),
h(x; τ) = −(√2/π) log x + o(log x)  for τ = 1/2,
h(x; τ) = [1/√(2π)] [Γ(τ − 1/2)/Γ(τ)] + o(1)  for τ > 1/2.

Moreover, for x > 0, we have

(∂/∂x) h(x; τ) = −[x/(τ − 1)] h(x; τ − 1),  τ > 1,
and, as x → 0⁺,

(∂/∂x) h(x; τ) =
[(2τ − 1)/(2^τ √π)] [Γ(1/2 − τ)/Γ(τ)] x^{2τ−2} + o(x^{2τ−2})  for τ ∈ (0, 1/2),
−(√2/π) x^{−1} + o(x^{−1})  for τ = 1/2,
−[2^{τ−1}/(sin(π(τ − 1/2)) Γ(2τ − 1))] x^{2τ−2} + o(x^{2τ−2})  for τ ∈ (1/2, 1),
−1 + o(1)  for τ = 1,
−[2^τ (τ − 1/2)/(sin(π(τ − 1/2)) Γ(2τ))] x^{2τ−2} + o(x^{2τ−2})  for τ ∈ (1, 3/2),
(2√2/π) x log(√2x) + o(x log(√2x))  for τ = 3/2,
−[1/√(2π)] [Γ(τ − 1/2)/((τ − 3/2)Γ(τ))] x + o(x)  for τ ∈ (n − 1/2, n + 1/2),
−[1/√(2π)] [Γ(n − 1)/Γ(n + 1/2)] x + o(x)  for τ = n + 1/2,

where in the last two relations n ≥ 2 is an integer.
Proof. Let H(x; τ) = x^{τ−1/2} K_{τ−1/2}(x). We have the following relation, which follows from the form of the density (4.1.32):

h(x; τ) = [2^{1−τ}/(√π Γ(τ))] H(√2x; τ).

The result follows from Properties 6, 9, and 10 of the functions H(x; τ) and K_λ given in Appendix A. The behavior of the density at zero follows from Property 6 (and also Property 10 for τ < 1/2). The recurrence relation follows from Property 9. The behavior of the derivative of h(x; τ) follows from all three properties.
A direct consequence of Theorem 4.1.2 is
Corollary 4.1.1 The density of a symmetric GAL∗(0, 1, σ, τ) distribution with τ > 1 is unimodal with the mode at zero.
Proof. The recurrence relation of Theorem 4.1.2 implies that the derivative of the density is negative for positive arguments. Thus, on (0, ∞) the density is a decreasing function which does not have any maximum, except possibly at zero.
The graphs of the densities in the symmetric case, illustrating their behavior at zero as studied in the above theorem, are presented in Figure 4.1 (the left-hand-side picture). The influence of the parameters on the shape of the densities is perhaps better illustrated by Figure 4.2.
An integer value of τ

We already know that when τ = n is a positive integer, the corresponding GAL r.v. is a sum of n i.i.d. AL random variables (with the same parameters σ and µ (or κ)). In this case the Bessel function K_{n−1/2} admits a closed form [see (A.0.10) in Appendix A], and so does the corresponding standard GAL density with the parameter τ = n ≥ 1:
\[
h(x) = \frac{1}{(n-1)!}\sum_{j=0}^{n-1}\frac{(n-1+j)!}{(n-1-j)!\,j!}\;
\frac{2^{(n-j)/2}\,|x|^{n-1-j}}{(\kappa+1/\kappa)^{n+j}}
\begin{cases}
e^{-\sqrt{2}\,\kappa|x|} & \text{for } x \ge 0,\\[3pt]
e^{-\sqrt{2}\,\frac{1}{\kappa}|x|} & \text{for } x < 0
\end{cases}
\qquad (4.1.33)
\]
[see, e.g., Press (1967), Levin and Tchernitser (1999), Kozubowski and Podgórski (1999a)]. Note that in the symmetric case (κ = 1) the above density simplifies to (2.3.25), considered previously in connection with the distribution of the sum of n i.i.d. Laplace r.v.'s [see also Teichroew (1957), McLeish (1982)]. Also observe that (4.1.33) coincides with (4.1.31) if τ = 1, which is the AL case. Further, here the density (4.1.33) is a mixture of n densities on (−∞, ∞).

[Figure 4.2 appears here: two panels of density curves; the labels include "Standard Gaussian", "Standard Laplace", "Asymmetric Laplace", and the values of τ.]

Figure 4.2: Comparison of the standardized generalized Laplace densities and the standard normal density. Both pictures contain densities with τ = 1/4, 1/2, 3/4, 1, 5/4, 3/2, 2, 5/2, and 3. Right: the symmetric case, κ = 1; Left: an asymmetric case, κ = 2. All densities have the mean equal to zero and variance equal to one.

For j = 0, . . . , n − 1, the jth density has the form
\[
f_{n,j}(x) = p_{n,j}\,g_{n-j,1}(x)\,1_{[0,\infty)}(x) + q_{n,j}\,g_{n-j,\kappa}(-x)\,1_{(-\infty,0)}(x), \qquad (4.1.34)
\]
where g_{α,β} stands for the gamma G(α, β) density, and
\[
p_{n,j} = \frac{p^{n}q^{j}}{p^{n}q^{j} + p^{j}q^{n}}, \qquad
q_{n,j} = 1 - p_{n,j} = \frac{p^{j}q^{n}}{p^{n}q^{j} + p^{j}q^{n}}, \qquad (4.1.35)
\]
with p = 1/(1 + κ²) and q = κ²/(1 + κ²). Under the above notation, the GAL(0, κ, 1, n) density is
\[
h(x) = \sum_{j=0}^{n-1}\frac{(n+j-1)!}{j!\,(n-1)!}\,2^{(n-j)/2}\,\big(p^{n}q^{j} + p^{j}q^{n}\big)\,f_{n,j}(x). \qquad (4.1.36)
\]
This result, taken from Kozubowski and Podgórski (1999a), is a generalization of the exponential mixture representation discussed previously for the AL random variables.
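Because (4.1.33) involves only elementary functions, it is straightforward to evaluate numerically. The following sketch (plain Python, standard library only; the function names are ours, not the book's) implements the standard GAL(0, κ, 1, n) density of (4.1.33) and checks that it integrates to one for a few values of n and κ.

```python
import math

def gal_density(x, n, kappa):
    """Standard GAL(0, kappa, 1, n) density (4.1.33): sum of n i.i.d. AL variables."""
    c = kappa + 1.0 / kappa
    # the exponential factor depends on the sign of x
    expo = math.exp(-math.sqrt(2) * kappa * abs(x)) if x >= 0 \
        else math.exp(-math.sqrt(2) * abs(x) / kappa)
    s = 0.0
    for j in range(n):
        s += (math.factorial(n - 1 + j)
              / (math.factorial(n - 1 - j) * math.factorial(j))
              * 2 ** ((n - j) / 2) * abs(x) ** (n - 1 - j) / c ** (n + j))
    return s * expo / math.factorial(n - 1)

def integrate(f, a, b, m=20000):
    # simple trapezoidal rule on [a, b]
    h = (b - a) / m
    return h * (sum(f(a + i * h) for i in range(1, m)) + 0.5 * (f(a) + f(b)))

for n, kappa in [(1, 2.0), (2, 2.0), (3, 1.0)]:
    total = integrate(lambda x: gal_density(x, n, kappa), -40.0, 40.0)
    print(n, kappa, round(total, 4))
```

The check passes for the asymmetric case (κ ≠ 1) as well as for the symmetric one, which guards against misplacing the factor 2^{(n−j)/2} or the power of κ + 1/κ.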
4.1.5 Moments

Exploiting representations of the K-Bessel function random variables, it is easy to find their moments. This is done in the following result.
Proposition 4.1.6 The moments of a GAL(µ, σ, τ) random variable Y are given by the following relations:
\[
E(Y^{n}) = \frac{1}{\sqrt{\pi}\,\Gamma(\tau)}\sum_{k=0}^{[[n/2]]}\binom{n}{2k}\,\sigma^{2k}\mu^{n-2k}\,2^{k}\,\Gamma(1/2+k)\,\Gamma(\tau+n-k).
\]
In particular, if µ = 0 (symmetric case), then
\[
E(Y^{2m}) = \sigma^{2m}\prod_{i=0}^{m-1}\,[(\tau+i)(2i+1)].
\]
Proof. We exploit the representation (4.1.8) and the following formulas for the moments of a gamma variable W with parameter α = τ and a standard normal random variable Z:
\[
E(W^{s}) = \frac{\Gamma(\tau+s)}{\Gamma(\tau)}, \qquad
E(Z^{2k}) = 2^{k}\,\frac{\Gamma(1/2+k)}{\Gamma(1/2)} = \prod_{i=0}^{k-1}(2i+1).
\]
Since odd moments of the standard normal random variable vanish, we obtain
\[
E(Y^{n}) = \sum_{l=0}^{n}\binom{n}{l}\,\sigma^{l}\mu^{n-l}\,E(Z^{l})\,E(W^{n-l/2})
= \sum_{k=0}^{[[n/2]]}\binom{n}{2k}\,\sigma^{2k}\mu^{n-2k}\,E(Z^{2k})\,E(W^{n-k}),
\]
and the formula follows from a direct application of the expressions for the moments of W and Z.

In the symmetric case all terms except the last one in the above sum vanish, and the conclusion follows from the identity
\[
\Gamma(\tau+k) = \Gamma(\tau)\prod_{i=0}^{k-1}(\tau+i), \qquad k \in \mathbb{N}.
\]
Corollary 4.1.2 The mean of a GAL(µ, σ, τ) random variable Y is equal to
\[
E(Y) = \tau\mu,
\]
and the variance is
\[
Var(Y) = \tau(\mu^{2} + \sigma^{2}).
\]
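Proposition 4.1.6 and Corollary 4.1.2 can be cross-checked mechanically. The sketch below (plain Python; math.gamma and math.comb supply the special functions, and the function name is ours) evaluates the moment sum and confirms that E(Y) = τµ and Var(Y) = τ(µ² + σ²).

```python
import math

def gal_moment(n, mu, sigma, tau):
    """E(Y^n) for a GAL(mu, sigma, tau) random variable, per Proposition 4.1.6."""
    total = 0.0
    for k in range(n // 2 + 1):
        total += (math.comb(n, 2 * k) * sigma ** (2 * k) * mu ** (n - 2 * k)
                  * 2 ** k * math.gamma(0.5 + k) * math.gamma(tau + n - k))
    return total / (math.sqrt(math.pi) * math.gamma(tau))

mu, sigma, tau = 0.7, 1.3, 2.5
m1 = gal_moment(1, mu, sigma, tau)
m2 = gal_moment(2, mu, sigma, tau)
print(m1, tau * mu)                                 # mean: E(Y) = tau*mu
print(m2 - m1 ** 2, tau * (mu ** 2 + sigma ** 2))   # variance: tau*(mu^2 + sigma^2)
```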
4.2 Laplace motion

In this section we study the Laplace motion, a stochastic process which plays the same role in the Laplacian domain as the Brownian motion does among Gaussian processes. The Laplace motions have several interesting properties which distinguish them from their famous Gaussian counterpart. We study here only the most fundamental ones, leaving a more extensive investigation for some future work on processes generated by the Laplace distribution.
The Laplace motions are special cases of Lévy processes. The latter are defined through the class of infinitely divisible distributions, to which Laplace distributions belong. Although the Laplace motions share some common properties with the Brownian motions, including the finite second (or any order) moments and independence and stationarity of increments, their observed features are essentially different. First, their trajectories (paths) are discontinuous at any point and, in fact, they are purely jump functions. In general, they can be asymmetric, including the properties of their paths. The space-scale is not exchangeable with the time-scale, which, even in the symmetric case, requires two different parameters for these scales.

The Laplace motions have several representations which relate them to other processes. First, they can be written as a Brownian motion evaluated at a random time, the latter being the gamma process. In other words, they are Brownian motions subordinated to the gamma process. Alternatively, the Laplace motion can be obtained as a difference of two independent gamma processes. Finally, using a general representation of Lévy processes, we can write them as compound Poisson processes with independent and random jumps having a special form of the distribution (given by the so-called Lévy density). The last characterization gives an insight into the structure of the trajectories and the sizes of jumps, the latter completely characterizing trajectories of pure jump processes.

The finiteness of their moments and their convenient characterizations make Laplace motions an interesting object for future investigation and for developing the theory of Laplacian processes, more or less in the same spirit as the theory of Gaussian processes is developed based on the Brownian motion.
4.2.1 Symmetric Laplace motion

As we already know, the Laplace distributions are infinitely divisible (see, for example, Subsection 2.4.1 in Chapter 2, or also Section 6.9 in this chapter). Thus it is a direct consequence of the general theory of infinitely divisible distributions and processes that we can define the following subclass of Lévy processes [cf. Ferguson and Klass (1972)].
Definition 4.2.1 A stochastic process L(t) is called a symmetric Laplace motion with the space-scale parameter σ and the time-scale parameter ν [in short, LM(σ, ν) process] if

1. It starts at the origin, i.e., L(0) = 0;
2. It has independent and stationary (homogeneous) increments;
3. The increments by the time-scale unit have a symmetric Laplace distribution with the parameter σ, i.e.,
\[
L(t + \nu) - L(t) \overset{d}{=} L(\sigma).
\]

The symmetric Laplace motion LM(1, 1) is called the standard Laplace motion or simply the Laplace motion.

A symmetric Laplace motion Y(t) with drift m is an LM(σ, ν) process L(t) shifted by a linear function mt, i.e.,
\[
Y(t) = mt + L(t).
\]
Remark 4.2.1 The above definition, along with the properties of infinitely divisible distributions, implies the following characteristic function for the increment L(s + t) − L(s) of LM(σ, ν):
\[
\varphi_{t}(u) = \frac{1}{(1 + \sigma^{2}u^{2}/2)^{t/\nu}},
\]
i.e., the increment has the generalized symmetric Laplace distribution (the symmetric K-Bessel function distribution) with the parameters σ and τ = t/ν, which is denoted by GAL(0, σ, τ).
Remark 4.2.2 Recall that the standard Brownian motion {B(t), t > 0} is self-similar with index H = 1/2, that is,
\[
\{B(at),\, t > 0\} \overset{d}{=} \{a^{H}B(t),\, t > 0\} \quad \text{for all } a > 0. \qquad (4.2.1)
\]
In contrast with the Brownian motion, for the Laplace motion the time-scale and the space-scale are no longer exchangeable, and the process is not self-similar. Indeed, for any a > 0 and H > 0 we have
\[
a^{H}L(t) \overset{d}{=} GAL(0,\, a^{H}\sigma,\, t/\nu)
\]
and
\[
L(at) \overset{d}{=} GAL(0,\, \sigma,\, at/\nu),
\]
so the self-similarity property (4.2.1) cannot hold for the Laplace motion L(t).
Remark 4.2.3 As expected, a general Laplace motion with a drift can be defined through the standard Laplace motion L by the expression mt + σL(t/ν).
Let us start a more detailed discussion of the properties of the Laplace motion with the derivation of their moments.

Proposition 4.2.1 Let L(t) be an LM(σ, ν) Laplace motion with drift m. Then
\[
E[L(t)] = mt, \qquad Var[L(t)] = t\sigma^{2}/\nu.
\]
Proof. The result follows from Remark 4.2.1 and Corollary 4.1.2.
It follows immediately that fixing the variance and the mean does not define a Laplace motion completely. Therefore, there are infinitely many Laplace motions LM(σ, ν), each with σ²/ν = 1, having the same covariance structure as the standard Brownian motion, characterized by unit variance at the time equal to one. In Figure 4.3 we present trajectories of the processes with the same covariance structure. We see that sample properties differ significantly for these processes.
4.2.2 Representations

There are several important representations of the Laplace motion. Most of the results presented here were discussed and partially proved in Madan and Seneta (1990).

The first representation relates the Laplace motion to Brownian motion evaluated at an independent random time distributed according to a gamma process. Recall that a stochastic process Γ_t is called a gamma process if it starts at zero, has independent and homogeneous increments, and the distribution of the increment Γ_{t+s} − Γ_t is given by the gamma distribution with the shape parameter s/ν. If ν = 1, we refer to such a process as the standard gamma process.
Theorem 4.2.1 Let B(t) be a Brownian motion with the scale parameter σ and let Γ_t be a gamma process with parameter ν, independent of B(t). Then the process
\[
L(t) = B(\Gamma_{t}), \quad t > 0,
\]
is LM(σ, ν).
[Figure 4.3 appears here: eighteen panels of sample paths on the time interval [0, 30], with values roughly between −5 and 5.]

Figure 4.3: Trajectories of Laplace motions and Brownian motion (three paths for each process). All processes have the same covariance structure, characterized by unit variance at time t = 1. This requirement for the Laplace motion LM(σ, ν) is satisfied by setting σ = √ν. Top: standard Brownian motion vs. standard Laplace motion (ν = 1); Middle: LM(σ = √2, ν = 2) and LM(σ = √2/2, ν = 1/2); Bottom: LM(σ = √5, ν = 5) and LM(σ = 1/2, ν = 1/4).
Proof. That the process L(t) starts at the origin is obvious. The distribution of L(t) can be obtained from the characteristic function
\[
\varphi_{L(t)}(\xi) = E\,e^{iL(t)\xi} = E\big(E\big(e^{iB(\Gamma_{t})\xi}\,\big|\,\Gamma_{t}\big)\big)
= E\,e^{-\Gamma_{t}\sigma^{2}\xi^{2}/2} = \frac{1}{(1 + \sigma^{2}\xi^{2}/2)^{t/\nu}},
\]
which corresponds to the GAL(0, σ, t/ν) distribution. The proof then follows from the general property stating that the composition of two independent processes with independent and homogeneous increments (in this case Brownian motion and the gamma process) is again a process with independent and homogeneous increments [see Bertoin (1996)].
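Theorem 4.2.1 translates directly into a sampler for LM(σ, ν) increments: draw a gamma-distributed random time and evaluate a Brownian increment there. A minimal sketch (plain Python, standard library only; function names and parameter values are ours), with the sample mean and variance compared against Proposition 4.2.1:

```python
import random
import math

def lm_increment(t, sigma, nu, rng):
    """One increment of LM(sigma, nu) over [0, t]: B(Gamma_t), with B a
    Brownian motion of scale sigma and Gamma_t a gamma process (Theorem 4.2.1)."""
    g = rng.gammavariate(t / nu, 1.0)            # shape t/nu, scale 1
    return rng.gauss(0.0, sigma * math.sqrt(g))  # B evaluated at the random time g

rng = random.Random(12345)
t, sigma, nu = 2.0, 1.5, 0.5
sample = [lm_increment(t, sigma, nu, rng) for _ in range(100_000)]
mean = sum(sample) / len(sample)
var = sum(x * x for x in sample) / len(sample) - mean ** 2
# Proposition 4.2.1 (zero drift): E L(t) = 0, Var L(t) = t*sigma^2/nu
print(round(mean, 2), round(var, 2), t * sigma ** 2 / nu)
```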
Another simple representation of the Laplace motion is given in the following theorem.

Theorem 4.2.2 Let Γ_t and Γ̃_t be two independent gamma processes with the same parameter ν. Then the process defined by
\[
L(t) = \frac{\sqrt{2}}{2}\,\sigma\,(\Gamma_{t} - \tilde{\Gamma}_{t}), \quad t > 0,
\]
is LM(σ, ν).

Proof. The process obviously starts at zero and has independent and homogeneous increments, since Γ_t and Γ̃_t are such processes. Thus the thesis follows from Proposition 4.1.3 applied to G_1 = Γ_t and G_2 = Γ̃_t for κ = 1.
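Theorem 4.2.2 yields an even simpler sampler, requiring no Gaussian draws at all: an LM(σ, ν) increment over [0, t] is (√2/2)σ times the difference of two independent gamma variates of shape t/ν. A sketch under the same assumptions as the previous block:

```python
import random
import math

def lm_increment_gamma(t, sigma, nu, rng):
    """LM(sigma, nu) increment over [0, t] as a scaled difference of two
    independent gamma variates of shape t/nu (Theorem 4.2.2)."""
    g1 = rng.gammavariate(t / nu, 1.0)
    g2 = rng.gammavariate(t / nu, 1.0)
    return (math.sqrt(2) / 2) * sigma * (g1 - g2)

rng = random.Random(2001)
t, sigma, nu = 2.0, 1.5, 0.5
sample = [lm_increment_gamma(t, sigma, nu, rng) for _ in range(100_000)]
var = sum(x * x for x in sample) / len(sample)  # mean is zero, so E[X^2] = Var
print(round(var, 2), t * sigma ** 2 / nu)
```

Since Var[(√2/2)σ(Γ_t − Γ̃_t)] = (σ²/2)(2t/ν) = tσ²/ν, both printed values should be close.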
The last representation which we want to discuss here follows from an application of the general result of Ferguson and Klass (1972). It is sometimes described as a Poisson approximation of independent-increments processes.

Recall first the Lévy-Khinchine representation of a symmetric process X(t) with independent and homogeneous increments and with no Gaussian component (Laplace motions are examples of such processes):
\[
\varphi_{X(t)}(u) = \exp\left(\int_{-\infty}^{\infty}[\cos(uz) - 1]\,d\Lambda_{t}(z)\right),
\]
where Λ_t = tΛ and Λ is the Lévy measure of X(1).

Consider the standard classical Laplace motion L(t), i.e., with ν = 1 and σ = √2. By the Lévy-Khinchine representation derived in Proposition 2.4.2, the above representation holds with Λ defined through
\[
\Lambda([-u, u]^{c}) = 2E_{1}(u) = 2\int_{u}^{\infty}\frac{1}{x}\,e^{-x}\,dx.
\]
Here, E_1 stands for the exponential integral function [see, e.g., Abramowitz and Stegun (1965)]. In the following series representation we restrict ourselves to a standard Laplace motion and to the time interval [0, 1].
Theorem 4.2.3 Let L(t) be a standard Laplace motion. Assume that (δ_i) is a Rademacher sequence (i.i.d. symmetric signs), (U_i) is an i.i.d. sequence of random variables distributed uniformly on [0, 1], and (Γ_i) are arrival times in a standard Poisson process. We assume that all three sequences, (δ_i), (U_i), and (Γ_i), are independent. Then the following representation holds for L(t):
\[
L(t) \overset{d}{=} \sum_{i=1}^{\infty}\delta_{i}\,J_{i}\,I_{[0,t)}(U_{i}),
\]
where the series is absolutely convergent with probability one, J_i = E_1^{-1}(Γ_i), and I_{[0,t)}(U_i) is the indicator function of the interval [0, t) evaluated at U_i.
Proof. The proof is a direct consequence of a theorem of Ferguson and Klass (1972, p. 1640). The absolute convergence follows from the fact that ∫₀¹ z dΛ(z) is finite. Consequently, no centering of the terms of the series is needed. By adding random signs to the representation, we obtain the symmetry of the process.
Remark 4.2.4 From the above representation, one can derive properties of trajectories of Laplace motions. First of all, sample paths are pure jump functions (a function is a jump function if its value is equal to the sum of its jumps, or, in other words, if it is increasing only at the jumps). The absolute values of the jumps are given by the J_i's, and are ordered. The largest jump is represented by J_1 = E_1^{-1}(Γ_1), and its distribution is given by
\[
P(J_{1} \le x) = e^{-E_{1}(x)}, \quad x > 0.
\]
Since E_1(x) converges to infinity when x approaches zero, the distribution of the first jump is continuous on [0, ∞) and has the density
\[
f_{J_{1}}(x) = e^{-E_{1}(x)}\,e^{-x}/x.
\]
Using the probability structure of the arrivals of a Poisson process, one can easily derive the conditional distribution of the next jump given the previous ones. Namely, the distribution of J_n given that J_1 = x_1, . . . , J_{n−1} = x_{n−1} has the following c.d.f.:
\[
F(x\,|\,x_{1}, \dots, x_{n-1}) = e^{-E_{1}(x) + E_{1}(x_{n-1})}, \quad 0 < x < x_{n-1} < \dots < x_{1}.
\]
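The quantities in Remark 4.2.4 are easy to compute numerically: for moderate x, E₁ has the rapidly convergent series E₁(x) = −γ − log x + Σ_{k≥1}(−1)^{k+1}x^k/(k·k!), from which the distribution function e^{−E₁(x)} of the largest jump follows. A sketch (plain Python, standard library only; the series truncation at 60 terms is our choice):

```python
import math

def exp_int_e1(x, terms=60):
    """Exponential integral E1(x) via its power series (accurate for moderate x)."""
    euler_gamma = 0.5772156649015329
    s = 0.0
    for k in range(1, terms + 1):
        s += (-1) ** (k + 1) * x ** k / (k * math.factorial(k))
    return -euler_gamma - math.log(x) + s

def largest_jump_cdf(x):
    """P(J1 <= x) = exp(-E1(x)) for the standard Laplace motion (Remark 4.2.4)."""
    return math.exp(-exp_int_e1(x))

# E1 blows up at zero, so the c.d.f. starts near 0 and increases to 1
for x in [0.01, 0.1, 1.0, 5.0]:
    print(x, round(largest_jump_cdf(x), 4))
```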
4.2.3 Asymmetric Laplace motion

The definition and properties of the Laplace motion extend naturally to the asymmetric case. The fact that the asymmetric Laplace distribution AL(µ, σ) is infinitely divisible justifies the following definition.

Definition 4.2.2 A stochastic process L(t) is called an asymmetric Laplace motion with the space-scale parameter σ, the time-scale parameter ν, and centered at µ [and denoted by ALM(µ, σ, ν)] if

1. It starts at the origin, i.e., L(0) = 0;
2. It has independent and stationary (homogeneous) increments;
3. The increments by the time-scale unit have an asymmetric Laplace distribution with the parameters µ and σ, i.e.,
\[
L(t + \nu) - L(t) \overset{d}{=} AL(\mu, \sigma).
\]

An asymmetric Laplace motion with drift m is an ALM(µ, σ, ν) process L(t) shifted by a linear function mt, i.e.,
\[
Y(t) = mt + L(t).
\]
Remark 4.2.5 The above definition and the properties of infinitely divisible distributions lead to the following characteristic function of the increment L(s + t) − L(s) of the ALM(µ, σ, ν) process:
\[
\varphi_{L(t)}(u) = \frac{1}{(1 - i\mu u + \sigma^{2}u^{2}/2)^{t/\nu}},
\]
i.e., the increment has the generalized asymmetric Laplace distribution (the asymmetric Bessel function distribution) with the parameters µ, σ, and τ = t/ν, denoted GAL(µ, σ, τ).

Proposition 4.2.2 Let L(t) be an ALM(µ, σ, ν) Laplace motion with a drift m. Then
\[
E[L(t)] = mt + \mu t/\nu, \qquad Var[L(t)] = t(\mu^{2} + \sigma^{2})/\nu.
\]
Proof. The result follows from Remark 4.2.5 and Corollary 4.1.2.
Below we list representations of the ALM(µ, σ, ν) process, which are direct extensions of the ones obtained for the symmetric Laplace motions.
[Figure 4.4 appears here: twelve panels of sample paths on the time interval [0, 30], with values roughly between −5 and 5.]

Figure 4.4: Trajectories of asymmetric Laplace motions with centering drifts (three paths for each process). All processes are asymmetric, but have the same covariance structure, characterized by unit variance at time t = 1 and the mean zero, i.e., the same as for the symmetric processes of Figure 4.3. This requirement for an asymmetric Laplace motion ALM(µ, σ, ν) with a drift m is satisfied by setting m = −µ/ν and σ = √(ν − µ²), where µ² < ν. Top: Laplace motions with ν = 1 and µ = 0.4 (left), µ = 0.8 (right); Bottom: Laplace motions with ν = 4 and µ = 1 (left), µ = 1.5 (right).
Subordinated Brownian motion

Assume that B(t) is a Brownian motion with scale σ and with drift µ, and that Γ_t is a gamma process with the parameter ν, independent of B(t). Then the following representation for the ALM(µ, σ, ν) process L(t) holds:
\[
L(t) \overset{d}{=} B(\Gamma_{t}), \quad t > 0. \qquad (4.2.2)
\]
Difference of gamma processes

Let Γ_t and Γ̃_t be two independent gamma processes with parameter ν. Let κ = √2σ/(µ + √(2σ² + µ²)). Then we have the following representation of the ALM(µ, σ, ν) process L(t):
\[
L(t) \overset{d}{=} \frac{\sqrt{2}}{2}\,\sigma\left(\frac{1}{\kappa}\,\Gamma_{t} - \kappa\,\tilde{\Gamma}_{t}\right), \quad t > 0. \qquad (4.2.3)
\]
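The particular value of κ in (4.2.3) is exactly what makes the mean come out right: with κ = √2σ/(µ + √(2σ² + µ²)) one has (σ/√2)(1/κ − κ) = µ, so the gamma difference has mean µt/ν, in agreement with Proposition 4.2.2. A quick numerical check of this identity (plain Python; the helper name is ours):

```python
import math

def kappa_of(mu, sigma):
    """Skewness parameter kappa in the gamma-difference representation (4.2.3)."""
    return math.sqrt(2) * sigma / (mu + math.sqrt(2 * sigma ** 2 + mu ** 2))

for mu, sigma in [(0.0, 1.0), (0.5, 1.0), (-1.0, 2.0)]:
    k = kappa_of(mu, sigma)
    # mean contribution per unit gamma shape: (sigma/sqrt(2)) * (1/k - k)
    mean = (sigma / math.sqrt(2)) * (1.0 / k - k)
    print(mu, sigma, round(mean, 10))
```

In the symmetric case µ = 0 this reduces to κ = 1, recovering Theorem 4.2.2.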
Compound Poisson approximation

The series representation of an ALM(µ, σ, ν) process is a direct generalization of the symmetric case, and involves a series which is absolutely convergent almost surely. Let us recall that the Lévy measure Λ of the asymmetric Laplace distribution AL(µ, σ) is given by
\[
\Lambda(u, \infty) = E_{1}(\sqrt{2}\,\kappa u/\sigma), \qquad \Lambda(-\infty, -u) = E_{1}(\sqrt{2}\,u/(\sigma\kappa)), \quad u > 0.
\]
Let us now define Λ₋(x) = E₁(√2x/(σκ)) and Λ₊(x) = E₁(√2xκ/σ), x > 0.
Let L(t) be an asymmetric Laplace motion ALM(µ, σ, 1). Assume that (δ_i) is a Rademacher sequence of i.i.d. symmetric signs, (U_i) is an i.i.d. sequence of random variables distributed uniformly on [0, 1], and (Γ_i) is a sequence of the arrival times in a standard Poisson process. We assume that all three sequences, (δ_i), (U_i), and (Γ_i), are independent. Then the following representation in distribution holds for L(t):
\[
L_{t} \overset{d}{=} \sum_{i=1}^{\infty}\delta_{i}\,J_{i}\,I_{[0,t)}(U_{i}), \qquad (4.2.4)
\]
where the series is absolutely convergent with probability one and J_i = Λ^{-1}_{δ_i}(Γ_i).
4.3 Linnik distribution

The univariate symmetric Linnik distribution with index α ∈ (0, 2] and scale parameter σ > 0 is given by the characteristic function
\[
\psi_{\alpha,\sigma}(t) = \frac{1}{1 + \sigma^{\alpha}|t|^{\alpha}}, \quad t \in \mathbb{R}, \qquad (4.3.1)
\]
and is named after Ju. V. Linnik, who showed that the function (4.3.1) is a bona fide ch.f. for any α ∈ (0, 2]; see Linnik (1953). Since for α = 2 we obtain the symmetric Laplace distribution, the distribution is also known as α-Laplace; see, e.g., Pillai (1985). We shall write L_{α,σ} to denote the distribution given by (4.3.1).
Linnik laws are special cases of strictly geometric stable (geometric stable) distributions, introduced in Klebanov et al. (1984). A random variable Y (and its probability distribution) is called strictly geometric stable if for any p ∈ (0, 1) there is an a_p > 0 such that
\[
a_{p}\sum_{i=1}^{\nu_{p}} Y_{i} \overset{d}{=} Y_{1}, \qquad (4.3.2)
\]
where ν_p is a geometric r.v. with mean 1/p, while the Y_i's are i.i.d. copies of Y, independent of ν_p. Strictly geometric stable laws are a special case of geometric stable laws discussed later in Subsection 4.4.4; they have ch.f. (4.4.7) with either µ = 0 and α ≠ 1, or β = 0 and α = 1. Thus, strictly geometric stable laws form a three-parameter family, and their ch.f. can be written as
\[
\psi_{\alpha,\sigma,\tau}(t) = \frac{1}{1 + \sigma^{\alpha}|t|^{\alpha}\exp(-i\pi\alpha\tau\,\mathrm{sign}(t)/2)}, \quad t \in \mathbb{R}, \qquad (4.3.3)
\]
where α and σ are as before, and τ is such that |τ| ≤ min(1, 2/α − 1).
Since for τ = 0 we obtain the symmetric Linnik distribution (4.3.1), some authors refer to (4.3.3) as a non-symmetric Linnik distribution; see, e.g., Erdogan and Ostrovskii (1997). As we shall see in this section, Linnik distributions share some, but not all, of the properties of the symmetric Laplace distribution. Like symmetric Laplace distributions, Linnik laws are stable with respect to geometric summation, and appear as limit laws of geometric compounds when the summands are symmetric and have an infinite variance. We shall discuss their various characterizations in Section 4.3.1. In Section 4.3.2, while discussing representations of Linnik laws, we shall show that they are mixtures of stable laws, as well as exponential mixtures and scale mixtures of normal distributions. These representations lead directly to integral representations of the Linnik densities, which are discussed in Section 4.3.3, devoted to Linnik densities and distribution functions. Although a closed-form expression for the Linnik density seems to be unavailable, as is the case for stable laws, asymptotic results have been investigated by Kotz et al. (1995). In Section 4.3.4 we shall study moments and the tail behavior of the Linnik laws. We shall show that their tail probabilities are no longer exponential, and that the moments are governed by the parameter α. Unlike Laplace laws, and analogously to stable distributions, the Linnik laws have an infinite variance, while the mean is finite only for 1 < α < 2. In Section 4.3.5 we shall list properties of the Linnik laws, which include unimodality, geometric and classical infinite divisibility, and self-decomposability. Sections 4.3.6 and 4.3.7 are devoted to the problems of simulation and estimation, respectively. For the Linnik laws, the standard methods (which are based on explicit forms of the relevant distribution functions and densities) are not practical. We shall show that the problem of simulation is easily handled by the mixture representations of Linnik laws, and discuss some recent advances in the estimation problem. Section 4.3.8 is devoted to extensions of the Linnik distribution.
4.3.1 Characterizations

In this section we present characterizations of Linnik laws, related mostly to geometric summation. Many results are consequences of the fact that Linnik laws are special cases of strictly geometric stable distributions.

Stability with respect to geometric summation

We have seen in Section 2.2.6 that within the class of symmetric r.v.'s with a finite variance, the classical Laplace r.v. is characterized by the stability property (4.3.2). Anderson (1992) observed that the Linnik distribution is closed under geometric compounding as well, so that (4.3.2) holds with L_{α,σ} distributed Y_i's and a_p = p^{1/α}. In the case α = 1 this result is due to Arnold (1973), and it serves as a foundation for the development of Anderson's (1992) multivariate Linnik distribution. In the subsequent result we show that the stability property (4.3.2) actually characterizes symmetric Linnik distributions within the class of symmetric r.v.'s (not necessarily with finite variance).
Proposition 4.3.1 Let Y, Y₁, Y₂, . . . be symmetric i.i.d. random variables and let ν_p be a geometric random variable with mean 1/p, independent of the Y_i's. The following statements are equivalent:

(i) Y is stable with respect to geometric summation, i.e.,
\[
a_{p}\sum_{i=1}^{\nu_{p}} (Y_{i} + b_{p}) \overset{d}{=} Y \quad \text{for all } p \in (0, 1), \qquad (4.3.4)
\]
where a_p > 0 and b_p ∈ ℝ.

(ii) Y has a symmetric Linnik distribution.

Moreover, the constants a_p and b_p are necessarily of the form a_p = p^{1/α}, b_p = 0.
Proof. First, we show that the Linnik r.v. with ch.f. (4.3.1) satisfies the relation (4.3.4) with the above a_p and b_p. Using the typical conditioning argument, we write the ch.f. of the variable on the left-hand side of (4.3.4) in the following form:
\[
\frac{p\,(1 + p\sigma^{\alpha}|t|^{\alpha})^{-1}}{1 - (1-p)\,(1 + p\sigma^{\alpha}|t|^{\alpha})^{-1}}, \qquad (4.3.5)
\]
and note that it simplifies to (4.3.1), which is the ch.f. of the right-hand side of (4.3.4). To prove the converse, use the corresponding characterization of strictly geometric stable laws [see, e.g., Kozubowski (1994b), Theorem 3.2] and conclude that if a r.v. Y₁ satisfies (4.3.4), it then must be a strictly geometric stable r.v. with ch.f. (4.3.3), and the normalizing constants must be as specified in the statement of the proposition. Since Y₁ is assumed to be symmetric, its ch.f. must be real, implying that the parameter τ in (4.3.3) equals zero, leading to the Linnik ch.f. (4.3.1). This concludes the proof.
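The collapse of (4.3.5) back to (4.3.1) can be confirmed numerically: summing the geometric series gives the ch.f. of p^{1/α}Σ Y_i in closed form, and it agrees with the Linnik ch.f. for every p. A sketch (plain Python; everything is real-valued because symmetric ch.f.s are real):

```python
def linnik_cf(t, alpha, sigma):
    """Symmetric Linnik characteristic function (4.3.1)."""
    return 1.0 / (1.0 + (sigma * abs(t)) ** alpha)

def geometric_compound_cf(t, alpha, sigma, p):
    """Ch.f. of a_p * sum_{i=1}^{nu_p} Y_i with a_p = p**(1/alpha):
    the geometric series sums to p*psi / (1 - (1-p)*psi), as in (4.3.5)."""
    psi = linnik_cf(p ** (1.0 / alpha) * t, alpha, sigma)
    return p * psi / (1.0 - (1.0 - p) * psi)

alpha, sigma = 1.3, 0.8
for p in [0.9, 0.5, 0.01]:
    for t in [-2.0, 0.3, 5.0]:
        assert abs(geometric_compound_cf(t, alpha, sigma, p)
                   - linnik_cf(t, alpha, sigma)) < 1e-12
print("geometric stability verified")
```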
What happens if the relation (4.3.4) holds only for one particular value of p? Then the solution of (4.3.4) consists of a larger class than the class of strictly geometric stable laws; see Lin (1994) for details. However, under certain additional tail conditions, relation (4.3.4) with one particular value of p characterizes symmetric Linnik distributions as well. Specifically, assuming that ψ satisfies the condition
\[
\lim_{t \to 0}\,(1 - \psi(t))/|t|^{\alpha} = \gamma \quad \text{for some } \gamma > 0 \text{ and } 0 < \alpha \le 2, \qquad (4.3.6)
\]
we have the following result.
Proposition 4.3.2 Let Y, Y₁, Y₂, . . . be i.i.d. r.v.'s whose ch.f. ψ satisfies condition (4.3.6). Let p ∈ (0, 1) and let ν_p be a geometric r.v. with mean 1/p, independent of the sequence (Y_i). Then,
\[
a_{p}\sum_{i=1}^{\nu_{p}} Y_{i} \overset{d}{=} Y \qquad (4.3.7)
\]
for some a_p > 0 if and only if a_p = p^{1/α} and Y has a symmetric Linnik distribution.
See Lin (1994) for a proof, and also for a similar characterization of Mittag-Leffler distributions. The result also appeared in Kakosyan et al. (1984) under the additional assumptions that a_p = p^{1/α} and the distribution of Y is non-degenerate and symmetric.

The following characterization of the Linnik distribution is also proved in Lin (1994), as well as in Kakosyan et al. (1984), under the additional assumptions that a_p = (p/q)^{1/α} and the distribution of Y is non-degenerate and symmetric.
Proposition 4.3.3 Let Y₁, Y₂, . . . be i.i.d. r.v.'s whose ch.f. ψ satisfies condition (4.3.6). Let p, q ∈ (0, 1), where p ≠ q, and let ν_p and ν_q be geometric r.v.'s with means 1/p and 1/q, respectively, independent of (Y_i). Then,
\[
a_{p}\sum_{i=1}^{\nu_{p}} Y_{i} \overset{d}{=} \sum_{i=1}^{\nu_{q}} Y_{i} \qquad (4.3.8)
\]
with some a_p ≠ 0 if and only if |a_p|^{α} = p/q and Y has a symmetric Linnik distribution.
We conclude this section by noting that relation (4.3.7) remains valid under a randomization of the parameter p. More precisely, let Y, Y₁, Y₂, . . . be i.i.d. symmetric and non-degenerate r.v.'s whose ch.f. ψ satisfies condition (4.3.6). Let ν_p be a geometric r.v. with mean 1/p, independent of the sequence (Y_i), where p ∈ (0, 1). Further, assume that the parameter p is itself a r.v. with a probability distribution on (0, 1). Then relation (4.3.7) holds with a_p = p^{1/α} if and only if Y has a symmetric Linnik distribution. In addition, if (4.3.7) holds with non-negative r.v.'s and a_p = p, then Y must have an exponential distribution; see Kakosyan et al. (1984) for proofs and further details.
Distributional limits of geometric sums

We have shown in Section 2.2.7 that the classical Laplace distribution arises as the only possible limit of a geometric sum with symmetric i.i.d. components with finite variance. If the condition of finite variance is omitted, we then obtain a characterization of symmetric Linnik distributions.

Proposition 4.3.4 The class of symmetric Linnik distributions coincides with the class of distributional limits of
\[
S_{p} = c_{p}\sum_{i=1}^{\nu_{p}} X_{i} \qquad (4.3.9)
\]
as p → 0, where c_p > 0, the X_i's are symmetric i.i.d. random variables, and ν_p is a geometric random variable with mean 1/p, independent of the X_i's.
Proof. First, note that by Proposition 4.3.1, a symmetric Linnik r.v. X is equal in distribution to the r.v. S_p given by (4.3.9), where c_p = p^{1/α} and the X_i's are i.i.d. copies of X. So it is a distributional limit of S_p as well. Thus, it remains to show that if geometric compounds (4.3.9) with i.i.d. and symmetric X_i's converge in distribution to a r.v. Y, then the latter must have a symmetric Linnik distribution. Our proof consists of showing that the r.v. Y is symmetric and stable with respect to geometric summation [i.e., (4.3.2) holds], and thus it must have a symmetric Linnik distribution by Proposition 4.3.1. First, note that as the r.v.'s X_i are symmetric, their ch.f. is real, so that the ch.f. of S_p must be real, implying that the ch.f. of the limiting r.v. Y is real as well. Consequently, Y has a symmetric distribution. If Y is degenerate at zero, it is (a degenerate) Linnik (with σ = 0) and the result is valid. Assume now that the distribution of Y is not concentrated at zero. It then follows that Y cannot have a degenerate distribution (concentrated at some constant not equal to zero), since then its ch.f. would not be real. Next, fix an arbitrary p₀ ∈ (0, 1) and for any p ∈ (0, p₀) define p'' = p/p₀. Then the geometric r.v. ν_p admits the representation
\[
\nu_{p} = \sum_{i=1}^{\nu_{p_{0}}} \nu^{(i)}_{p''}, \qquad (4.3.10)
\]
where the ν^{(i)}_{p''}'s are i.i.d. geometric r.v.'s with mean 1/p'', while ν_{p₀} is geometric with mean 1/p₀, independent of the ν^{(i)}_{p''}'s (Exercise 4.5.15). This allows us to express S_p in the following manner:
in the following manner:
S
p
= c
p
ν
p
X
i=1
X
i
d
=
c
p
c
p
00
ν
p
0
X
i=1
W
(i)
p
00
, (4.3.11)
where W
(i)
p
00
’s are i.i.d. r.v.’s equal in distribution to S
p
00
= c
p
00
P
ν
p
00
i=1
X
i
.
Now, as p → 0, we note that p'' = p/p₀ also converges to zero (p₀ being fixed!), so that by the assumption we have
\[
W^{(i)}_{p''} = S_{p''} = c_{p''}\sum_{i=1}^{\nu_{p''}} X_{i} \overset{d}{\to} Y_{i}, \quad i = 1, 2, \dots, \qquad (4.3.12)
\]
where the Y_i's are independent copies of Y. Thus, we have the convergence
\[
\sum_{i=1}^{\nu_{p_{0}}} W^{(i)}_{p''} \overset{d}{\to} \sum_{i=1}^{\nu_{p_{0}}} Y_{i}, \qquad (4.3.13)
\]
see Exercise 4.5.16. Since by the assumption S_p →_d Y, where Y is non-degenerate, in view of (4.3.11) and (4.3.13) we conclude that the sequence c_p/c_{p''} must converge to a limit (which may depend on p₀), denoted by a_{p₀}, and we must have
\[
a_{p_{0}}\sum_{i=1}^{\nu_{p_{0}}} Y_{i} \overset{d}{=} Y. \qquad (4.3.14)
\]
Consequently, by Proposition 4.3.1, Y must have a symmetric Linnik distribution, as p₀ is an arbitrary real number in (0, 1). The result has been proved.
Stability with respect to deterministic summation

We saw in Section 2.2.8 that within the class of symmetric distributions with finite variance, the classical Laplace distribution can be characterized by means of the stability property under deterministic summation and random normalization. Omitting the condition of finite variance leads to a characterization of symmetric Linnik laws.

Proposition 4.3.5 Let the variables B_n, where n > 0, have a Beta(1, n) distribution given by (2.2.45). Let 0 < α ≤ 2, and let {Y_i} be a sequence of symmetric i.i.d. random variables. Then the following statements are equivalent:

(i) For all n ≥ 2, Y₁ \overset{d}{=} B^{1/α}_{n−1}(Y₁ + · · · + Y_n).

(ii) Y₁ has a symmetric Linnik distribution.
Proof. The proof is very similar to that of Proposition 2.2.11 for the symmetric Laplace case. Write the right-hand side of the representation in (i) in the form U_n V_n, where
\[
U_{n} = (nB_{n-1})^{1/\alpha} \quad \text{and} \quad V_{n} = \frac{\sum_{i=1}^{n} Y_{i}}{n^{1/\alpha}}, \qquad (4.3.15)
\]
and let n → ∞. Then U_n converges in distribution to the random variable W^{1/α}, where the variable W has a standard exponential distribution. Further, since the product U_n V_n as well as the sequence U_n are convergent, while V_n has a symmetric distribution, we conclude that the sequence V_n must be convergent as well. Moreover, if X is the limit of V_n, then it must have a symmetric stable distribution with ch.f. (4.3.20). Since by the assumption U_n is independent of V_n, the limit of the product U_n V_n is the product of the limits, so that
\[
U_{n}V_{n} \overset{d}{\to} W^{1/\alpha}X. \qquad (4.3.16)
\]
But this is the representation (4.3.19) of Linnik random variables discussed in the next section. The implication (i) ⇒ (ii) follows, since Y₁ must have the same distribution as the limit in (4.3.16).
We now turn to the proof of the implication (ii) ⇒ (i). Multiply both sides of (4.3.21) from Proposition 4.3.8 by B^{1/α}_{n−1} (which is independent of all the other r.v.'s) to obtain
\[
B^{1/\alpha}_{n-1}(Y_{1} + \dots + Y_{n}) \overset{d}{=} (G_{n}B_{n-1})^{1/\alpha}X \qquad (4.3.17)
\]
(with X as above). By Lemma 2.2.2, the product G_n B_{n−1} has the same distribution as a standard exponential r.v. W, so that the right-hand side of (4.3.17) has a Linnik distribution by the representation (4.3.19). The proof is thus complete.
We conclude our discussion on stability with another characterization of
symmetric Linnik laws, derived in Pillai (1985) for a larger class of semi-
α-Laplace distributions, a class that includes all strictly geometric stable
laws.
Proposition 4.3.6 Let Y, Y_1, Y_2, and Y_3 be i.i.d. symmetric Linnik vari-
ables L_{α,σ}. Let p ∈ (0, 1), and let I be an indicator random variable, inde-
pendent of Y, Y_1, Y_2, Y_3, with P(I = 1) = p and P(I = 0) = 1 − p. Then,
the following equality in distribution is valid for any p ∈ (0, 1):

    Y =_d p^{1/α} I Y_1 + (1 − I)(Y_2 + p^{1/α} Y_3).   (4.3.18)

Proof. The result follows by writing the ch.f. of the right-hand side in
(4.3.18), conditioning on the distribution of the r.v. I.
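The conditioning in the proof amounts to the ch.f. identity ψ(t) = p ψ(p^{1/α}t) + (1 − p) ψ(t) ψ(p^{1/α}t), which can be confirmed numerically; a minimal sketch (the helper names are ours), assuming the standard Linnik ch.f. (4.3.1):

```python
import math

def linnik_cf(t, alpha, sigma=1.0):
    # Symmetric Linnik ch.f. (4.3.1): psi(t) = 1 / (1 + sigma^alpha |t|^alpha)
    return 1.0 / (1.0 + (sigma * abs(t)) ** alpha)

def pillai_rhs_cf(t, alpha, p, sigma=1.0):
    # Ch.f. of p^{1/alpha} I Y_1 + (1 - I)(Y_2 + p^{1/alpha} Y_3),
    # obtained by conditioning on the indicator I with P(I = 1) = p
    s = p ** (1.0 / alpha) * t
    return (p * linnik_cf(s, alpha, sigma)
            + (1.0 - p) * linnik_cf(t, alpha, sigma) * linnik_cf(s, alpha, sigma))

# the two sides of (4.3.18) have identical ch.f.'s:
checks = [abs(linnik_cf(t, a) - pillai_rhs_cf(t, a, 0.3))
          for a in (0.7, 1.0, 1.5) for t in (0.1, 1.0, 5.0)]
```

A short algebraic computation shows why: with s = σ^α|t|^α, the right-hand side equals [p(1 + s) + (1 − p)]/[(1 + s)(1 + ps)] = 1/(1 + s).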
4.3.2 Representations
Representations of Linnik random variables were studied by Devroye (1990),
Anderson (1992), Anderson and Arnold (1993), Kotz and Ostrovskii (1996),
and Kozubowski (1998). Devroye (1990) derived the following fundamental
representation of a Linnik r.v. in terms of independent exponential and
symmetric stable random variables, which is analogous to the representa-
tion (2.2.3) of the Laplace distribution.
Proposition 4.3.7 A Linnik r.v. Y with the ch.f. (4.3.1) admits the rep-
resentation

    Y =_d W^{1/α} X,   (4.3.19)

where X is a symmetric stable variable with ch.f.

    φ(t) = exp(−σ^α |t|^α)   (4.3.20)

and W is a standard exponential r.v., independent of X.
The above representation is the special case with n = 1 of the next result,
which describes the distribution of the sum of n i.i.d. Linnik random vari-
ables. It generalizes a similar representation for the case of symmetric Laplace
random variables; see Proposition 2.2.10 of Chapter 2.
Proposition 4.3.8 Let Y_1, Y_2, . . . be i.i.d. Linnik r.v.'s with ch.f. (4.3.1).
Then

    Y_1 + ··· + Y_n =_d G_n^{1/α} X,   (4.3.21)

where X is symmetric stable with ch.f. (4.3.20) and G_n has the gamma
G(n, 1) distribution.
Proof. The result follows by computing the ch.f.'s on both sides of (4.3.21).
By conditioning on G_n, we calculate the ch.f. of G_n^{1/α} X as follows:

    E e^{it G_n^{1/α} X} = ∫_0^∞ E e^{it z^{1/α} X} (z^{n-1}/Γ(n)) e^{−z} dz
                        = ∫_0^∞ φ(t z^{1/α}) (z^{n-1}/Γ(n)) e^{−z} dz,

where φ is the symmetric stable ch.f. (4.3.20). Since

    φ(t z^{1/α}) e^{−z} = e^{−z(σ^α |t|^α + 1)},

a straightforward integration results in [1 + σ^α|t|^α]^{−n}, the nth power of
the Linnik ch.f. (4.3.1), which is the ch.f. of the left-hand side of (4.3.21).
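The gamma-mixture integral in the proof is easy to check by quadrature: since φ(t z^{1/α}) = exp(−σ^α|t|^α z), the mixture must reproduce [1 + σ^α|t|^α]^{−n}. A sketch for σ = 1 (the function name and quadrature choices are ours):

```python
import math

def mixture_cf(t, alpha, n, grid=40000, zmax=60.0):
    # Trapezoidal quadrature of E exp(i t G_n^{1/alpha} X)
    #   = int_0^inf phi(t z^{1/alpha}) z^{n-1} e^{-z} / Gamma(n) dz,
    # where phi(t z^{1/alpha}) = exp(-|t|^alpha z) by (4.3.20) with sigma = 1
    c = abs(t) ** alpha
    h = zmax / grid
    total = 0.0
    for j in range(grid + 1):
        z = j * h
        f = z ** (n - 1) * math.exp(-z * (1.0 + c)) / math.gamma(n)
        total += (0.5 if j in (0, grid) else 1.0) * f
    return total * h

# closed form to compare against: (1 + |t|^alpha) ** (-n)
```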
The representation (4.3.19) allows for obtaining properties of Linnik dis-
tributions from those of stable laws. However, its value for certain appli-
cations may be limited. For instance, the above representation is not very
convenient for simulating Linnik random variates, since stable distribu-
tions do not admit densities or distribution functions in closed form, and
require mixture representations themselves for simulation. Kotz and Ostro-
vskii (1996) and Kozubowski (1998) have studied alternative mixture
representations of the Linnik distribution which allow efficient generation
of the corresponding random variates. Kotz and Ostrovskii (1996) observe
that for any 0 < α < α′ ≤ 2, the ch.f.'s of the Linnik distributions L_{α,1}
and L_{α′,1} satisfy the equation

    ψ_{α,1}(t) = ∫_0^∞ ψ_{α′,1}(t/s) g(s; α, α′) ds,   (4.3.22)

where

    g(s; α, α′) = (α′/π) sin(πα/α′) s^{α-1} / [1 + s^{2α} + 2 s^α cos(πα/α′)]   (4.3.23)

is the density of a non-negative r.v. U_{α,α′}. Kozubowski (1998) notes the
representation

    ψ_{α,1}(t) = ∫_0^∞ ψ_{α′,1}(t s) g(s; α, α′) ds,   (4.3.24)

using the above notation. Representations (4.3.22)-(4.3.24) lead to the con-
clusion that the corresponding Linnik r.v.'s Y_{α,1} and Y_{α′,1} obey the repre-
sentations

    Y_{α,1} =_d Y_{α′,1} · U_{α,α′} =_d Y_{α′,1} / U_{α,α′}.   (4.3.25)
Kozubowski (1998) modifies representations (4.3.25) by introducing a r.v.
W_ρ = U_{α,α′}^α, where ρ = α/α′ < 1, with a folded Cauchy density g_ρ on
(0, ∞) given by

    g_ρ(x) = sin(πρ) / {πρ [x² + 2x cos(πρ) + 1]}.   (4.3.26)

Note that the definition of W_ρ can be extended to the cases ρ = 0 and
ρ = 1 as well by taking weak limits as ρ → 0+ and ρ → 1−, thus arriving
at the density g_0(x) = (1 + x)^{−2} for W_0 and at W_1 ≡ 1 (see Exercise 4.5.19).
The following result is a restatement of (4.3.25) in terms of the r.v. W_ρ [see
Kozubowski (1998)].
Proposition 4.3.9 Let 0 < α < α′ ≤ 2 and ρ = α/α′ < 1. Let W_ρ be a
non-negative r.v. with the density (4.3.26), and let Y_{α′} be a Linnik L_{α′,σ}
r.v., independent of W_ρ. Then, a r.v. Y_{α,σ} with the Linnik L_{α,σ} distribution
admits the representations

    Y_{α,σ} =_d Y_{α′} · W_ρ^{1/α} =_d Y_{α′} / W_ρ^{1/α}.   (4.3.27)

The fact that the representations involve both division and multipli-
cation follows from the reciprocal property of the r.v. W_ρ (see Exercises
4.5.20 and 4.5.21).
Taking α′ = 2, we arrive at the classical Laplace r.v., and the repre-
sentation provides a direct method of simulating Linnik random variates,
discussed in Section 4.3.6. Thus, a Linnik L_{α,σ} r.v. can be thought of as a
Laplace variable with a stochastic variance, and also as a normal variable
with a stochastic variance (since a Laplace distribution is a scale mixture of
normal distributions). In addition, the Laplace r.v. corresponding to α′ = 2
has the representation σIW in accordance with Proposition 2.2.3. Conse-
quently, we obtain the following exponential mixture representation of the
Linnik r.v. L_{α,σ}.
Proposition 4.3.10 Let Y_{α,σ} be a Linnik L_{α,σ} r.v. with any 0 < α ≤ 2,
and let W_ρ be a non-negative r.v. with the density (4.3.26) for ρ = α/2 ≤ 1.
Then,

    Y_{α,σ} =_d σ · I · W · W_ρ^{1/α} =_d σ · I · W / W_ρ^{1/α},   (4.3.28)

where I is an indicator r.v. taking values ±1 with probabilities 1/2 each,
W is a standard exponential variable, and all the variables are independent.

Taking α = 2, the above representation reduces to the representation
(2.2.10) of the Laplace distribution, as W_1 ≡ 1.
Remark 4.3.1 Choosing α = 1 and α′ = 2 and noting that in this case
the r.v. W_{1/2} has a folded standard Cauchy distribution, we arrive at the
representation

    Y_{1,1} =_d exponential · Cauchy =_d |Cauchy| · Laplace,   (4.3.29)

which is essentially a restatement of the well-known result that the density
of a Cauchy variable is of the same form as the characteristic function of the
Laplace, while the characteristic function of a Cauchy variable is of the same
functional form as the density of the Laplace.
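Representation (4.3.29) gives a quick simulation check: for Y = W·C with W standard exponential and C standard Cauchy, E cos(tY) = E e^{−W|t|} = 1/(1 + |t|), the Linnik ch.f. with α = 1. A seeded sketch (the code and sample size are ours):

```python
import math, random

def linnik1_sample(n, seed=12345):
    # Y = W * C: W standard exponential, C standard Cauchy via inversion
    rng = random.Random(seed)
    return [rng.expovariate(1.0) * math.tan(math.pi * (rng.random() - 0.5))
            for _ in range(n)]

ys = linnik1_sample(200_000)
emp_cf = sum(math.cos(y) for y in ys) / len(ys)   # empirical ch.f. at t = 1
# emp_cf should be close to 1 / (1 + 1) = 0.5
```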
Remark 4.3.2 Non-symmetric Linnik distributions with ch.f. (4.3.3) and
more general geometric stable r.v.'s admit similar mixture representations;
see Erdogan and Ostrovskii (1998a), Kozubowski (2000a), and Belinskiy
and Kozubowski (2000) for further details.
4.3.3 Densities and distribution functions
Here we study Linnik distribution functions and densities. There are no
closed form expressions for Linnik distribution functions and densities, ex-
cept for α = 2, which corresponds to the Laplace distribution. However, the
mixture representations of Section 4.3.2 lead to integral as well as asymp-
totic and convergent series representations of Linnik densities and distri-
bution functions, which we present below.
Integral representations
The representation (4.3.19) leads to representations of Linnik densi-
ties and distribution functions through their stable counterparts. Let p_{α,σ}
and F_{α,σ} denote the density and distribution function of the Linnik L_{α,σ}
distribution given by ch.f. (4.3.1). Similarly, let g_{α,σ} and G_{α,σ} denote the
density and distribution function of the corresponding stable law specified
in Proposition 4.3.7.

Proposition 4.3.11 Every Linnik distribution with 0 < α ≤ 2 is abso-
lutely continuous and

    F_{α,σ}(x) = ∫_0^∞ G_{α,σ}(x z^{−1/α}) e^{−z} dz,   (4.3.30)

    p_{α,σ}(x) = ∫_0^∞ z^{−1/α} g_{α,σ}(x z^{−1/α}) e^{−z} dz.   (4.3.31)
The above representations, which are dealt with in Exercise 4.5.22, ap-
peared in Kozubowski (1994a) and Lin (1994). Note that in the case α = 2,
equations (4.3.30)-(4.3.31) produce the distribution function and density
of a symmetric Laplace distribution.

Next, we express the exponential mixture representation (4.3.28) in terms
of the corresponding densities and distribution functions (see Exercise 4.5.23).

Proposition 4.3.12 The distribution function and density of the Linnik
L_{α,1} distribution with 0 < α < 2 admit the following representations for
x > 0:

    F_{α,1}(x) = 1 − [sin(πα/2)/π] ∫_0^∞ v^{α-1} exp(−vx) dv / [1 + v^{2α} + 2v^α cos(πα/2)]   (4.3.32)

and

    p_{α,1}(x) = [sin(πα/2)/π] ∫_0^∞ v^α exp(−v|x|) dv / [1 + v^{2α} + 2v^α cos(πα/2)].   (4.3.33)

For x < 0, use F_{α,1}(x) = 1 − F_{α,1}(−x) and p_{α,1}(−x) = p_{α,1}(x).
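Formula (4.3.33) is well suited to direct numerical evaluation, since the integrand is positive and exponentially damped. A quadrature sketch for σ = 1 (the routine and its grid mapping v = u/(1 − u) are our own choices):

```python
import math

def linnik_pdf(x, alpha, grid=4000):
    # Quadrature of (4.3.33):
    # p(x) = sin(pi a/2)/pi * int_0^inf v^a e^{-v|x|} / (1 + v^{2a} + 2 v^a cos(pi a/2)) dv
    x = abs(x)
    c = math.cos(math.pi * alpha / 2.0)
    h = 1.0 / grid
    total = 0.0
    for j in range(1, grid):                 # interior nodes; endpoint terms vanish
        u = j * h
        v = u / (1.0 - u)
        va = v ** alpha
        f = va * math.exp(-v * x) / (1.0 + va * va + 2.0 * va * c)
        total += f / ((1.0 - u) * (1.0 - u))  # Jacobian dv = du / (1-u)^2
    return math.sin(math.pi * alpha / 2.0) / math.pi * total * h
```

For α = 1 this reduces to (1/π) ∫_0^∞ v e^{−v|x|}/(1 + v²) dv, whose value at x = 1 is approximately 0.1093, which the routine reproduces.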
This representation appears in Erdogan (1995) and, for the case 1 < α < 2,
in Klebanov et al. (1996). Note that the density (4.3.33) can be written
equivalently in the form

    p_{α,1}(x) = [sin(πα/2)/π] ∫_0^∞ v^α exp(−v|x|) dv / |1 + v^α exp(iπα/2)|²,   x ≠ 0,   (4.3.34)

in which form it was originally derived (by the inversion formula and the
Cauchy theorem for complex variables) in Linnik (1953). Indeed, since for
real x we have exp(ix) = cos x + i sin x, the denominator under the integral
in (4.3.34) is equal to

    |1 + v^α cos(πα/2) + i v^α sin(πα/2)|² = (1 + v^α cos(πα/2))² + (v^α sin(πα/2))²,

and coincides with that in (4.3.33).
Remark 4.3.3 Hayfavi (1998) derived another representation of the Lin-
nik density p_{α,1} by a contour integral: for any δ ∈ (0, 1) and α ∈ [δ, 2 − δ],
we have

    p_{α,1}(x) = (1/x) (i/(4α)) ∫_{L(δ)} e^{−z log x} / [Γ(z) sin(πz/α) cos(πz/2)] dz,   x > 0,

where L(δ) is the boundary of the region

    {z : |z| > δ/2, |arg z| < π/4}.
Note that

    lim_{x→0+} p_{α,1}(x) = [sin(πα/2)/π] ∫_0^∞ v^α dv / |1 + v^α exp(iπα/2)|².   (4.3.35)

The integral is divergent for 0 < α ≤ 1 and convergent for 1 < α < 2.
In the latter case

    p_{α,1}(0) = lim_{x→0+} p_{α,1}(x) = [α sin(π/α)]^{−1}.   (4.3.36)

Thus, the limit of p_{α,1}(x) as x → 0+ is finite for 1 < α < 2 and infinite for
0 < α ≤ 1, in which case the densities have an infinite peak at x = 0. On
the interval (0, ∞), the function p_{α,1}(x) is decreasing and its kth derivative
satisfies the relations

    lim_{x→0+} (−1)^k p^{(k)}_{α,1}(x) = ∞,   k = 1, 2, . . .
Figure 4.5: Densities of Linnik distributions with σ = 1 and α's equal to
0.5, 0.75, 1.00, 1.25, 1.50, 1.75, 2.00.
and

    (−1)^k p^{(k)}_{α,1}(x) ≥ 0,   k = 1, 2, . . . .

The latter property implies complete monotonicity of the Linnik density
on (0, ∞) [see, e.g., Kotz et al. (1995)]. Since the characteristic function is
real for all t ∈ R, the density p_{α,1}(x) is an even function of x. Finally, since
the integral on the right-hand side of (4.3.34) is a continuous function of α
for any fixed x, the density p_{α,1}(x) is a continuous function of α ∈ (0, 2).
Figure 4.5 presents graphs of several selected Linnik densities.
Series expansions
We shall briefly discuss asymptotic and convergent series representations
of Linnik distribution functions and densities. We start with the asymptotic
expansions at infinity, due to Kozubowski (1994a), Erdogan (1995), and
Kotz et al. (1995). Let p_α = p_{α,1} be the density and let F_α = F_{α,1} be the
distribution function corresponding to the Linnik characteristic function
(4.3.1) with σ = 1. Consider the densities first. The following asymptotic
relation is valid as x → ∞:

    p_α(±x) ∼ (1/π) Σ_{k=1}^∞ (−1)^{k+1} Γ(kα + 1) sin(kπα/2) x^{−kα−1}.   (4.3.37)
The above asymptotic relation can be written alternatively as follows.

Proposition 4.3.13 The density p_α of a Linnik L_{α,1} distribution has the
following representation for x > 0 and any integer n > 0:

    p_α(±x) = (1/π) Σ_{k=1}^n c_k x^{−kα−1} + R_n(x),   (4.3.38)

where

    c_k = (−1)^{k+1} Γ(kα + 1) sin(kπα/2),
    |R_n(x)| ≤ [α Γ(α(n + 1) + 1) / (π |sin(πα/2)|)] x^{−α(n+1)−1}.
See Kozubowski (1994a) for the proof of Proposition 4.3.13 and Belinskiy
and Kozubowski (2000) for its extension to geometric stable laws.

The approximation of p_α(x) by the finite sum in (4.3.38) should be used
for large values of x, since for fixed n the remainder |R_n(x)| converges to
zero as x → ∞ [with the rate O(x^{−(n+1)α−1})]. In particular, for n = 1, we
have the following asymptotic expansion:

    p_α(±x) ≈ (1/π) Γ(1 + α) sin(πα/2) x^{−1−α},   x → ∞,   (4.3.39)

with the absolute value of the remainder R_1(x) bounded by

    b_1(x, α) = [α Γ(2α + 1) / (π sin(πα/2))] x^{−2α−1}.   (4.3.40)

As an illustration of the asymptotic expansion (4.3.39), in Table 4.2 we
present the values of the approximation, along with the corresponding val-
ues of the bound (4.3.40) and the percent error [equal to the ratio of the
bound (4.3.40) to the approximate value (4.3.39), multiplied by 100%].
Next, we turn to distribution functions. Their asymptotic expansions are
obtained by integration of the corresponding series for the densities. We
have the following asymptotic relation as x → ∞:

    1 − F_α(x) ∼ (1/π) Σ_{k=1}^∞ (−1)^{k+1} Γ(kα) sin(kπα/2) x^{−kα}.   (4.3.41)
      x     α     appr. of p_α(x)   b_1(x, α)      percent error
     10    1/2    6.307831E-3       2.250791E-3    36%
     10    3/2    9.461747E-4       4.051423E-4    42%
     20    1/2    2.230155E-3       5.626977E-4    25%
     20    3/2    1.672616E-4       2.532140E-5    15%
     50    1/2    5.641896E-4       9.003163E-5    16%
     50    3/2    1.692569E-5       6.482277E-7    3.83%
    100    1/2    1.994711E-4       2.250791E-5    11%
    100    3/2    2.992067E-6       4.051423E-8    1.35%
   1000    1/2    6.307831E-6       2.250791E-7    3.57%
   1000    3/2    9.461747E-9       4.051423E-12   0.04%

Table 4.2: The values of the one-term asymptotic expansion of p_α(x), along
with the values of the error bound b_1(x, α) and the corresponding maximal
percent error, for selected values of α and x.
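The entries of Table 4.2 can be reproduced directly from (4.3.39) and (4.3.40); a short sketch (function names are ours):

```python
import math

def p_tail_approx(x, alpha):
    # One-term asymptotic expansion (4.3.39)
    return (math.gamma(1.0 + alpha) * math.sin(math.pi * alpha / 2.0)
            / math.pi * x ** (-1.0 - alpha))

def p_tail_bound(x, alpha):
    # Error bound (4.3.40) on the remainder R_1(x)
    return (alpha * math.gamma(2.0 * alpha + 1.0)
            / (math.pi * math.sin(math.pi * alpha / 2.0)) * x ** (-2.0 * alpha - 1.0))

# the first row of Table 4.2:
row = (p_tail_approx(10, 0.5), p_tail_bound(10, 0.5))  # ~ (6.3078e-3, 2.2508e-3)
```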
Similarly, we get the behavior of the Linnik c.d.f. at −∞:

    F_α(x) ∼ (1/π) Σ_{k=1}^∞ (−1)^{k+1} Γ(kα) sin(kπα/2) |x|^{−kα},   x → −∞.   (4.3.42)

More precisely, we have the following result:

Proposition 4.3.14 The distribution function F_α of a Linnik distribution
L_{α,1} admits the following representation for x > 0 and any integer n > 0:

    1 − F_α(x) = (1/π) Σ_{k=1}^n b_k x^{−kα} + R_n(x),   (4.3.43)

where

    b_k = (−1)^{k+1} Γ(kα) sin(kπα/2),
    |R_n(x)| ≤ [α Γ(α(n + 1)) / (π |sin(πα/2)|)] x^{−α(n+1)}.
See Kozubowski (1994a) for the proof of Proposition 4.3.14.

We now turn our attention to series expansions and asymptotics at zero
for Linnik densities, which were thoroughly studied from a theoretical
standpoint by Kotz et al. (1995). We add here some numerical results. The
structure of such series representations depends on the arithmetic nature
of the parameter α. Three cases ought to be investigated:

(i) 1/α is an integer.

(ii) 1/α is a non-integer rational number.
(iii) α is an irrational number.

In case (i) we have the following representation.

Proposition 4.3.15 Let p_α be the density of a Linnik distribution L_{α,1},
where 0 < α = 1/n < 2 and n is a positive integer. Then,

    p_α(±x) = (1/2) Σ_{k≥1, k/n∉N} (−1)^{k+1} x^{k/n−1} / [Γ(k/n) cos(kπ/(2n))]   (4.3.44)
            + (−1)^{n+1} (1/π) cos x · log(1/x) + (1/2) sin x
            + [(−1)^{n+1}/π] Σ_{k=0}^∞ (−1)^k [Γ′(2k+1)/Γ²(2k+1)] x^{2k}.
See Erdogan (1995) and Kotz et al. (1995) for the proofs. The series rep-
resentation leads to the asymptotic formula, for each n ≥ 2,

    p_α(±x) = (1/2) Σ_{k=1}^{n−1} (−1)^{k+1} x^{k/n−1} / [Γ(k/n) cos(kπ/(2n))]
            + [(−1)^{n+1}/π] log(1/x) + (−1)^n γ/π
            + (−1)^{n+1} n x^{1/n} / [2 Γ(1/n) sin(π/(2n))] + O(|x|^{2/n}),   x → 0,   (4.3.45)
where γ is the Euler constant. Let us note the following two special cases.
For α = 1, which corresponds to the ch.f. ψ_{1,1}(t) = [1 + |t|]^{−1}, we obtain
the representation

    p_1(±x) = (1/π) cos x · log(1/x) + (1/2) sin x
            + (1/π) Σ_{k=0}^∞ (−1)^k [Γ′(2k+1)/Γ²(2k+1)] x^{2k}   (4.3.46)
and the corresponding asymptotic formula

    p_1(±x) = (1/π) log(1/x) − γ/π + (1/2) x − [1/(2π)] x² log(1/x) + O(x²),   x → 0.   (4.3.47)
For α = 1/2, we obtain

    p_{1/2}(x) = [1/√(2|x|)] Σ_{k=0}^∞ (−1)^{[(k+1)/2]} |x|^k / Γ(k + 1/2)
               − (1/π) cos x · log(1/|x|) + (1/2) sin |x|
               − (1/π) Σ_{k=0}^∞ (−1)^k [Γ′(2k+1)/Γ²(2k+1)] |x|^{2k},   (4.3.48)

corresponding to ψ_{1/2,1}(t) = [1 + |t|^{1/2}]^{−1}.
In case (ii), things get a little more complicated, as the expansion
includes several series.

Proposition 4.3.16 Let p_α be the density of a Linnik distribution L_{α,1},
where 0 < α = m/n < 2 and m and n are relatively prime integers greater
than one. Then,

    p_α(±x) = Σ_{k≥1, k/n∉N} (−1)^{k+1} [sin(kπα/2) / (Γ(kα) sin(kπα))] x^{kα−1}   (4.3.49)
            + (1/π) log(1/x) Σ_{t=1}^∞ [(−1)^{(m+n)t} / Γ(mt)] sin(tπnα/2) x^{mt−1}
            + (1/2) Σ_{t=1}^∞ [(−1)^{(m+n)t−1} / Γ(mt)] cos(tπnα/2) x^{mt−1}
            + (1/α) Σ_{j≥1, j/m∉N} (−1)^{j−1} [sin(jπ/2) / (Γ(j) sin(jπ/α))] x^{j−1}
            + (1/π) Σ_{t=1}^∞ (−1)^{(m+n)t} [Γ′(mt) / Γ²(mt)] sin(tπnα/2) x^{mt−1}.
See Erdogan (1995) and Kotz et al. (1995) for the proofs. Rather remark-
ably, under the additional assumption that the number m is even, the series
expansion for p_{m/n} simplifies to

    p_α(±x) = (1/2) Σ_{k=1}^∞ (−1)^{k+1} x^{kα−1} / [Γ(kα) cos(kπα/2)]
            + (1/α) Σ_{k=0}^∞ (−1)^k x^{2k} / [Γ(2k+1) sin(π(2k+1)/α)],   (4.3.50)

where the series on the right-hand side are absolutely convergent. We note
here that the expansion for α = 1/n given in Proposition 4.3.15 follows
from the one with α = m/n by setting m = 1 in (4.3.49) (see Exercise
4.5.24).
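When it applies, the expansion (4.3.50) is straightforward to evaluate numerically. Two built-in consistency checks are available: α = 2 (m = 2, n = 1) must reproduce the standard Laplace density e^{−|x|}/2, and for 1 < α < 2 the k = 0 term of the second series gives the value at zero, [α sin(π/α)]^{−1}, from (4.3.36). A truncation sketch (our own code):

```python
import math

def linnik_pdf_series(x, alpha, terms=60):
    # Truncation of the convergent series (4.3.50)
    # (valid, e.g., for alpha = m/n with m even); requires x != 0
    x = abs(x)
    s1 = sum((-1) ** (k + 1) * x ** (k * alpha - 1.0)
             / (math.gamma(k * alpha) * math.cos(k * math.pi * alpha / 2.0))
             for k in range(1, terms + 1))
    s2 = sum((-1) ** k * x ** (2 * k)
             / (math.gamma(2 * k + 1.0) * math.sin(math.pi * (2 * k + 1) / alpha))
             for k in range(terms))
    return 0.5 * s1 + s2 / alpha
```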
To obtain asymptotic formulas for x → 0 describing the behavior up
to O(|x|^N), it is necessary to select from the right-hand side of (4.3.49)
the terms involving powers of |x| that are less than N, and to add the
term containing log(1/|x|), if available. For example, for α = 3/2, we have
m = 3, n = 2, and

    p_{3/2}(±x) = 4/(3√3) − √(2/π) |x|^{1/2} + [1/(2π)] x² log(1/|x|)
                + [Γ′(3)/(4π)] x² + O(|x|^{7/2})   (4.3.51)
as x → 0. Another remarkable result is that in case (iii), where
α is irrational, the representation of p_α is similar to (4.3.50) rather than
(4.3.49)! Indeed, if α ∈ (0, 2) is not rational of the form α = m/n with an
odd m, we have the representation

    p_α(x) = (1/|x|) lim_{s→∞} { (1/2) Σ_{k=1}^s (−1)^{k+1} |x|^{kα} / [Γ(kα) cos(kπα/2)]
           + (1/α) Σ_{k∈A_s} (−1)^k |x|^{2k+1} / [Γ(2k+1) sin(π(2k+1)/α)] },   (4.3.52)

where A_s denotes the set of nonnegative integers k satisfying the relation
1 ≤ 2k + 1 < α(s + 1/2). In addition, the limit on the right-hand side is uniform
with respect to x on any compact subset of R. Moreover, for almost all (but
not all) irrational values of α the representation (4.3.50) remains valid and
the series converge absolutely and uniformly on any compact set. More
precisely, the "lucky" set of irrational α's is the set (0, 2) \ L, where L is
the set of the so-called Liouville numbers, namely numbers β such that
for any r = 2, 3, 4, . . . there exists a pair of integers p, q ≥ 2 such that

    0 < |β − p/q| < q^{−r}.
It is well known that these numbers are transcendental and that the set of all
Liouville numbers has Lebesgue measure zero. We thus have the following
proposition [see Kotz et al. (1995)].

Proposition 4.3.17 The density p_α of a Linnik distribution L_{α,1}, where
0 < α < 2 is irrational and not Liouville, admits the representation (4.3.50).
Moreover, both series converge absolutely and uniformly on any compact
set.
To construct an α for which both series in (4.3.50) are divergent, we have
to construct a sequence of very rapidly growing integers by the recurrence
relation

    q_{s+1} = (q_s!)^{2q_s},   s = 1, 2, . . . ,

and set

    α = Σ_{k=1}^∞ 1/q_k.

Evidently, since q_s > 2^s for s ≥ 2 and α ∈ (1/2, 1), it is not difficult to
show that these α's are Liouville numbers and that the terms of the form

    (−1)^{k+1} x^{kα−1} / [Γ(αk) cos(παk/2)]

with index k = q_s diverge to ∞ as s → ∞.
4.3.4 Moments and tail behavior
The asymptotic representation (4.3.43) shows that Linnik distributions
have regularly varying tails with index α. More precisely, if the r.v. Y_α
has the Linnik distribution L_{α,1}, then we have

    lim_{x→∞} x^α P(Y_α > x) = Γ(α) sin(πα/2) / π.   (4.3.53)
Consequently, as noticed by Lin (1994), the absolute moments of positive
order p, e(p) = E|Y_{α,σ}|^p, are finite for p < α and infinite for p ≥ α.
The following computational formula for e(p) is useful for estimating the
parameters of the Linnik distribution [see Kozubowski and Panorska (1996),
Proposition 5.3].

Proposition 4.3.18 Let Y ∼ L_{α,σ} with 0 < α ≤ 2. Then, for every 0 <
p < α, we have

    e(p) = E|Y|^p = p(1 − p) σ^p π / [α Γ(2 − p) sin(πp/α) cos(πp/2)].   (4.3.54)
In the case p = 1, we need to set (1 − p)/cos(πp/2) to its limiting value as
p → 1, which is equal to 2/π. Note that for α = 2, we obtain a familiar expression
for the moments of the symmetric Laplace distribution, E|Y|^p = σ^p Γ(p + 1)
(see Exercise 4.5.25). In particular, the first absolute moment (p = 1) is
equal to σ for the Laplace distribution, and

    2σ / [α sin(π/α)]   (4.3.55)

for the Linnik L_{α,σ} distribution. We list a few selected values of E|Y| for the
latter distribution with σ = 1 in Table 4.3 below (the corresponding value
for the standard classical Laplace distribution is equal to 1). We can clearly
see the increase in E|Y| as the parameter α approaches 1. In fact, for each
given σ > 0, the function of α given by (4.3.55) is strictly decreasing on
(1, 2], and converges to infinity as α → 1+. For α = 1, the first absolute
moment of the Linnik distribution is infinite, while for α = 2 it coincides with
its counterpart for the standard classical Laplace distribution.

    α      1.01    1.025   1.05    1.10   1.25   1.50   1.75   2
    E|Y|   63.67   25.49   12.78   6.45   2.72   1.54   1.17   1

Table 4.3: Selected values of E|Y|, where Y has the Linnik distribution
with σ = 1 and various α's.
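Formula (4.3.54), with the p = 1 convention just described, reproduces both Table 4.3 and the Laplace moments σ^p Γ(p + 1) at α = 2; a sketch (the function name is ours):

```python
import math

def linnik_abs_moment(p, alpha, sigma=1.0):
    # e(p) = E|Y|^p from (4.3.54), for 0 < p < alpha;
    # at p = 1, (1 - p)/cos(pi p / 2) is replaced by its limit 2/pi
    factor = 2.0 / math.pi if p == 1.0 else (1.0 - p) / math.cos(math.pi * p / 2.0)
    return (p * factor * sigma ** p * math.pi
            / (alpha * math.gamma(2.0 - p) * math.sin(math.pi * p / alpha)))
```

For instance, linnik_abs_moment(1.0, 1.25) returns roughly 2.722, the Table 4.3 entry for α = 1.25.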
Since the Linnik distribution L_{α,σ} has tails P(Y_{α,σ} > x) asymptotically
equivalent to the power function x^{−α}, it is in the domain of attraction of
the stable distribution with index α. Indeed, for a given sequence X_1, X_2, . . .
of i.i.d. Linnik L_{α,1} random variables, as n → ∞, the sum

    S_n = n^{−1/α} Σ_{i=1}^n X_i

converges in distribution to the stable law with characteristic function
φ(t) = exp(−|t|^α):

    lim_{n→∞} E[e^{itS_n}] = lim_{n→∞} (1 + |t|^α/n)^{−n} = exp(−|t|^α).
We conclude this section with a result on the asymptotic behavior
of fractional absolute moments of the Linnik distribution, which follows from
the tail behavior of geometric stable distributions; see Kozubowski and
Panorska (1996).

Proposition 4.3.19 Let Y ∼ L_{α,σ} with 0 < α ≤ 2. Then,

    lim_{r→α−} (α − r) E|Y|^r = 2α Γ(α) σ^α sin(πα/2) / π.   (4.3.56)
4.3.5 Properties
In this section we collect (somewhat fragmented) further results on sym-
metric Linnik distributions.

Self-decomposability
In Section 2.4.3 we discussed the class L of self-decomposable distributions
and showed that symmetric Laplace distributions belong to this class. It
was shown in Lin (1994) that this property is shared by Linnik distributions
as well.

Proposition 4.3.20 All symmetric Linnik distributions are in class L;
that is, for all c ∈ (0, 1) the Linnik characteristic function ψ_{α,σ} given by
(4.3.1) can be written as

    ψ_{α,σ}(t) = ψ_{α,σ}(ct) φ_c(t),   (4.3.57)

where φ_c is a characteristic function.
Proof. Lukacs (1970) has shown that if p > 1 and g is a ch.f., then the
function (p − 1)/(p − g(t)) is also a characteristic function. Since

    ψ_{α,σ}(t) / ψ_{α,σ}(ct) = (p − 1) / (p − ψ_{α,σ}(ct)) = φ_c(t),

where p = (1 − c^α)^{−1} > 1, we conclude that φ_c is a characteristic function.
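The factorization in the proof is easy to confirm numerically; a small sketch (names ours):

```python
import math

def linnik_cf(t, alpha, sigma=1.0):
    # Linnik ch.f. (4.3.1)
    return 1.0 / (1.0 + (sigma * abs(t)) ** alpha)

def phi_c(t, c, alpha, sigma=1.0):
    # Lukacs form (p - 1)/(p - psi(ct)) with p = (1 - c^alpha)^(-1)
    p = 1.0 / (1.0 - c ** alpha)
    return (p - 1.0) / (p - linnik_cf(c * t, alpha, sigma))

# self-decomposability (4.3.57): psi(t) = psi(ct) * phi_c(t)
err = max(abs(linnik_cf(t, 1.5, 2.0)
              - linnik_cf(0.4 * t, 1.5, 2.0) * phi_c(t, 0.4, 1.5, 2.0))
          for t in (0.1, 0.7, 2.0, 9.0))
```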
Remark 4.3.4 We note also that strictly geometric stable laws are self-
decomposable as well [see, e.g., Kozubowski (1994a)], while geometric stable
r.v.'s with 0 < α < 2 and µ ≠ 0, in general, are not; see Ramachandran
(1997).

As shown by Yamazato (1978), self-decomposability implies unimodality,
so that Linnik distributions are unimodal (with the mode at zero). The
unimodality of Linnik distributions was also proved in Laha (1961). In
conclusion, we note that although general geometric stable laws may not
belong to class L, they are all unimodal (with the mode at zero), as recently
shown by Belinskiy and Kozubowski (2000).
Infinite divisibility
We saw in Section 2.4.1 that the symmetric Laplace distribution is infinitely
divisible, and its characteristic function admits a Lévy-Khinchine represen-
tation with an explicit expression for the Lévy measure. Linnik distribu-
tions are infinitely divisible as well, although the Lévy measure can no
longer be written explicitly. Their Lévy-Khinchine representation follows from
Lemma 7, VI.2 of Bertoin (1996) and the fact that a Linnik random variable
Y_{α,1} can be written as Y = S(W), where W is a standard exponential vari-
able and S(t) is a stable process with independent increments, independent
of W, such that S(1) has the stable law with the characteristic function

    φ(t) = e^{−|t|^α}.   (4.3.58)
Proposition 4.3.21 The ch.f. (4.3.1) of the Linnik distribution L_{α,σ} ad-
mits the representation

    ψ(t) = exp( ∫_R (e^{itu} − 1) dΛ(u) ),   (4.3.59)

where

    dΛ/du (u) = [α/(2|u|)] E exp(−|u/(σX)|^α)
              = (1/σ) ∫_0^∞ g_α(u/(σ w^{1/α})) e^{−w} w^{−(1+1/α)} dw,

where X has the stable distribution (4.3.58) and g_α is the density of X.
Remark 4.3.5 See Kozubowski et al. (1998) for a more detailed discussion
of the Linnik and more general geometric stable Lévy measures and their
asymptotics at zero.
Remark 4.3.6 If α = 2, the Linnik distribution L_{α,σ} reduces to the clas-
sical Laplace distribution CL(0, σ) with mean zero and variance 2σ². In
this case the stable random variable X has ch.f. e^{−t²}, which corresponds
to the normal distribution with mean zero and variance equal to two. Con-
sequently, the density of the Lévy measure is

    dΛ/du (u) = [2/(2|u|)] E exp(−(u/(σX))²)
              = (1/|u|) ∫_{−∞}^∞ e^{−u²/(σ²x²)} [1/(2√π)] e^{−x²/4} dx.   (4.3.60)
Noting that the integrand in (4.3.60) is an even function of x, we obtain after
some algebra

    dΛ/du (u) = (1/√π) (1/|u|) ∫_0^∞ t^{1/2−1} e^{−(t + u²/(4σ²t))} dt.   (4.3.61)

Relating the integral in (4.3.61) to the modified Bessel function K_{1/2},
defined in (A.0.4) (see Appendix A), we obtain

    dΛ/du (u) = (1/√π) (1/|u|) K_{1/2}(|u|/σ) · 2 · [|u|/(2σ)]^{1/2}.   (4.3.62)

Finally, the application of Properties 5 and 10 of the function K_λ results
in

    dΛ/du (u) = (1/|u|) e^{−|u|/σ},   (4.3.63)

which is the density obtained previously for the classical Laplace distribu-
tion (see Proposition 2.4.2 of Chapter 2).
4.3.6 Simulation
Devroye's representation (4.3.19) allows us to generate Linnik variates
from independent stable and exponential variates. However, the generation
of stable distributions requires non-standard methods, as their distribution
functions are not given explicitly; see, e.g., Weron (1996). An alternative
way of computer simulation of Linnik random variables is obtained through
the representation (4.3.27) with α′ = 2. Here, the r.v.'s that appear in the
representation have explicit distribution functions, and thus can conve-
niently be generated by the inversion method. Indeed, the Laplace distri-
bution function is given in Section 2.1.1, while the distribution function of
the r.v. W_ρ has the following form:

    F_ρ(x) = (1/(πρ)) [arctan(x/sin(πρ) + cot(πρ)) − π/2] + 1.   (4.3.64)

Since the inverse function of F_ρ has an explicit form,

    F_ρ^{−1}(x) = sin(πρ) cot(πρ(1 − x)) − cos(πρ),   (4.3.65)
the r.v. W_ρ can be generated by the inversion method. Here is a generator
of a symmetric Linnik L_{α,σ} distribution given by the ch.f. (4.3.1).

A Linnik L_{α,σ} generator:

Generate a random variate Z from the L_{2,1} distribution (standard Laplace
with location 0 and scale 1).
Generate a uniform [0, 1] variate U, independent of Z.
Set ρ ← α/2.
Set W ← sin(πρ) cot(πρU) − cos(πρ).
Set Y ← σ · Z · W^{1/α}.
RETURN Y
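A direct transcription of this generator in Python (the standard Laplace variate is produced as a symmetrized exponential; since U and 1 − U have the same distribution, cot(πρU) in place of cot(πρ(1 − U)) leaves the law of W unchanged; all names are ours):

```python
import math, random

def linnik_variate(alpha, sigma=1.0, rng=random):
    # Z: standard Laplace (location 0, scale 1) as a signed exponential
    z = rng.expovariate(1.0) * (1.0 if rng.random() < 0.5 else -1.0)
    u = 1.0 - rng.random()          # uniform on (0, 1], avoids u = 0
    rho = alpha / 2.0
    # inversion step (4.3.65): W = sin(pi rho) cot(pi rho u) - cos(pi rho)
    w = math.sin(math.pi * rho) / math.tan(math.pi * rho * u) - math.cos(math.pi * rho)
    return sigma * z * w ** (1.0 / alpha)

def w_cdf(x, rho):
    # Distribution function (4.3.64) of W_rho, usable as a sanity check
    return ((math.atan(x / math.sin(math.pi * rho) + 1.0 / math.tan(math.pi * rho))
             - math.pi / 2.0) / (math.pi * rho) + 1.0)
```

A quick check: the empirical value of E cos(tY) should approach the Linnik ch.f. 1/(1 + σ^α|t|^α) as the sample grows.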
More details on generating variates from the Linnik and more general
geometric stable laws can be found in Kozubowski (2000b).
4.3.7 Estimation
This section is devoted to the problem of estimating the parameters α and
σ of the Linnik distribution L_{α,σ}. Since densities and distribution func-
tions of Linnik laws cannot in general be written in closed form, most
estimation methods for Linnik laws suggested in the literature are based
on the characteristic function and its empirical counterpart. Recall that
if X_1, X_2, . . . , X_n are i.i.d. random variables with characteristic function
ψ, then the empirical characteristic function (sample ch.f.) is defined as
follows:

    ψ̂_n(t) = (1/n) Σ_{j=1}^n e^{itX_j}.   (4.3.66)

The above function is the characteristic function of the empirical distri-
bution of the data, which assigns probability 1/n to each observation. By
definition and the strong LLN, it follows that

    E[ψ̂_n(t)] = ψ(t)   and   ψ̂_n(t) → ψ(t) a.s. as n → ∞.   (4.3.67)

Consequently, estimators based on the sample characteristic function are
usually strongly consistent.

Below we present several estimation procedures for the Linnik parameters,
based on a random sample X_1, X_2, . . . , X_n from the Linnik L_{α,σ} dis-
tribution given by the ch.f. ψ = ψ_{α,σ} as specified by (4.3.1). Here, the
characteristic function is real and the distribution is symmetric about zero.
Thus, the real part of the empirical characteristic function,

    η̂_n(t) = (1/n) Σ_{j=1}^n cos(tX_j),   (4.3.68)

can be used in estimation.
Method of moments type estimators
The first method is a special case of the estimation procedure for geometric
stable parameters suggested by Anderson (1992) and Kozubowski (1993).
The method is based on the sample characteristic function (4.3.68) for
the symmetric case and produces computationally simple, consistent, and
asymptotically normal estimators. For convenience, we set λ = σ^α, to be
consistent with the notation used in Kozubowski (1993). Since

    1/ψ(t) = 1 + λ|t|^α,

we have

    υ(t_i) = λ|t_i|^α,   i = 1, 2,   (4.3.69)

where υ(t) = |1/ψ(t) − 1| and t_1 ≠ t_2 are both greater than 0. Solving
equations (4.3.69) for α and λ, we obtain

    α = log[υ(t_1)/υ(t_2)] / log[t_1/t_2],
    λ = exp{ (log|t_1| log[υ(t_2)] − log|t_2| log[υ(t_1)]) / log[t_1/t_2] }.   (4.3.70)
Substituting the sample ch.f. η̂_n(t) for ψ(t) in (4.3.70), we get estimators
of α and λ:

    α̂ = log[υ̂_n(t_1)/υ̂_n(t_2)] / log[t_1/t_2],
    λ̂ = exp{ (log|t_1| log[υ̂_n(t_2)] − log|t_2| log[υ̂_n(t_1)]) / log[t_1/t_2] },

where υ̂_n(t) = |1/η̂_n(t) − 1| is the sample counterpart of υ(t). Since
η̂_n(t) → ψ(t) a.s., also υ̂_n(t) → υ(t) a.s., and the estimators are consistent.
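A sketch of these estimators in Python (our own implementation of (4.3.70), with υ̂_n computed from the real empirical ch.f. (4.3.68); the default t_1, t_2 are arbitrary choices):

```python
import math

def mom_linnik(sample, t1=0.2, t2=0.8):
    # Method-of-moments type estimators (4.3.70);
    # returns (alpha_hat, lambda_hat), where lambda = sigma^alpha
    n = len(sample)
    def upsilon(t):
        eta = sum(math.cos(t * x) for x in sample) / n   # eta_hat_n(t), eq. (4.3.68)
        return abs(1.0 / eta - 1.0)
    u1, u2 = upsilon(t1), upsilon(t2)
    alpha_hat = math.log(u1 / u2) / math.log(t1 / t2)
    lam_hat = math.exp((math.log(t1) * math.log(u2) - math.log(t2) * math.log(u1))
                       / math.log(t1 / t2))
    return alpha_hat, lam_hat
```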
Remark 4.3.7 See Jacques et al. (1999) for an extension of the
method to the case of generalized Linnik laws given by the ch.f.

    ψ_{α,σ,β}(t) = [1/(1 + σ^α|t|^α)]^β,   t ∈ R.
Least-squares estimators
Another estimation procedure based on the sample ch.f. is the regression-
type estimation of Koutrouvelis (1980) adapted to the Linnik case, which
was discussed in Kozubowski (1993) in the more general setting of geomet-
ric stable laws. Again, set λ = σ^α. Taking logarithms of both sides in
the relation

    |1/ψ(t) − 1| = λ|t|^α   (4.3.71)

results in

    log|1/ψ(t) − 1| = log λ + α log|t|.   (4.3.72)

We can now estimate λ and α using the regression of y = log|1/η̂_n(t) − 1|
on x = log|t| via the model

    y_i = δ + α x_i + ε_i,   i = 1, . . . , K,   (4.3.73)

where {t_i}, i = 1, . . . , K, is a suitable sequence of real numbers, δ = log λ,
and ε_i is an error term. Denote these estimators by α̃ and λ̃.

Like the method of moments procedure, the regression-type estimation
presented here produces consistent estimators and is computationally straight-
forward. However, the estimators lack optimality properties, and the
method may not be robust with respect to the choice of the re-
quired constants.
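The regression step is ordinary least squares on K points; a self-contained sketch (the particular grid of t_i below is our own, subjective choice):

```python
import math

def ls_linnik(sample, ts=None):
    # Regression-type estimators from the linear model (4.3.73):
    #   y_i = log|1/eta_hat(t_i) - 1| = delta + alpha * x_i + eps_i,  x_i = log t_i
    if ts is None:
        ts = [0.1 * i for i in range(2, 12)]   # t_1, ..., t_K
    n = len(sample)
    xs, ys = [], []
    for t in ts:
        eta = sum(math.cos(t * v) for v in sample) / n
        xs.append(math.log(t))
        ys.append(math.log(abs(1.0 / eta - 1.0)))
    k = len(ts)
    mx, my = sum(xs) / k, sum(ys) / k
    alpha_t = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
               / sum((x - mx) ** 2 for x in xs))
    delta_t = my - alpha_t * mx                # delta = log lambda
    return alpha_t, math.exp(delta_t)
```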
The minimal distance method
Anderson and Arnold (1993) discuss another estimation method for the Linnik
parameters, based on the empirical characteristic function (4.3.66). They con-
sider estimation of the parameter α of the Linnik distribution with σ = 1,
although the procedure can be generalized to include the scale parameter
as well. The method is based on minimization of the objective function

    I_L(α) = ∫_{−∞}^∞ |ψ̂(t) − (1 + |t|^α)^{−1}|² e^{−t²} dt,   (4.3.74)

where ψ̂ is the empirical characteristic function (4.3.66) based on the ran-
dom sample X_1, X_2, . . . , X_n from the Linnik L_{α,1} distribution. Again, since
the distribution is symmetric, the real part of ψ̂ given by (4.3.68) can be
used, in which case the objective function becomes

    I_L(α) = ∫_{−∞}^∞ |η̂(t) − (1 + |t|^α)^{−1}|² e^{−t²} dt.   (4.3.75)
The weights e^{−t²} are incorporated mainly for mathematical convenience,
as integrals of the form

    ∫_{−∞}^∞ f(t) e^{−t²} dt   (4.3.76)

can be well approximated by the sum

    Σ_{i=1}^m ω_i f(z_i) + R_m   (4.3.77)

(via the so-called Hermite integration). Here, the weights are

    ω_i = 2^{m−1} m! √π / [m H_{m−1}(z_i)]²,   (4.3.78)

and z_i is the ith zero of the mth degree Hermite polynomial H_m(z). The
values of z_i, ω_i, and ω_i e^{z_i²} are presented in Abramowitz and Stegun (1965,
p. 924), who reproduce tables of zeros and weight factors of the first
twenty Hermite polynomials from Salzer et al. (1952).
The objective function in the symmetric case can be well approximated by

Î_L(α) = Σ_{i=1}^{m} ω_i (η̂(z_i) − (1 + |z_i|^α)^{−1})².    (4.3.79)

The values of α̂_L that minimize Î_L(α) are strongly consistent estimators of α. Anderson and Arnold (1993) carried out extensive simulations indicating that this approach provides reasonable estimators.
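The Gauss–Hermite nodes and weights in (4.3.77)–(4.3.78) are available in standard numerical libraries, so the approximate objective (4.3.79) is easy to minimize. A minimal sketch (our own function names; NumPy and SciPy assumed):

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss   # Gauss-Hermite nodes/weights
from scipy.optimize import minimize_scalar

def I_hat(alpha, eta, m=40):
    # Approximate objective (4.3.79): the nodes z_i and weights w_i of
    # hermgauss(m) satisfy  int f(t) exp(-t^2) dt ~ sum_i w_i f(z_i).
    z, w = hermgauss(m)
    return float(np.sum(w * (eta(z) - 1.0 / (1.0 + np.abs(z) ** alpha)) ** 2))

def min_dist_alpha(eta, bounds=(0.1, 2.0), m=40):
    # Minimal-distance estimate of alpha (scale sigma = 1 assumed),
    # minimizing the Gauss-Hermite approximation over a bounded interval.
    res = minimize_scalar(lambda a: I_hat(a, eta, m),
                          bounds=bounds, method="bounded")
    return res.x
```

With the exact ch.f. η(t) = (1 + |t|^α)^{−1} in place of the empirical one, the objective vanishes at the true α, so the minimizer recovers it; with η̂ from a sample one obtains the estimator α̂_L.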
Fractional moment estimation

Here we present the approach to estimation based on the fractional moments of Section 4.3.4, which was considered in Kozubowski (1999). The basis for the method is formula (4.3.54), which expresses the fractional moment E|Y|^p in terms of the parameters α and σ. We can substitute sample fractional moments and solve the resulting equations for the parameters. As noted in Kozubowski (1999), the method is computationally simple, requires minimal implementation effort, and provides accurate estimates even for small sample sizes.

Consider 0 < p < α ≤ 2, and let e(p) = E|Y_1|^p denote the pth absolute moment of L_{α,σ}. Next, choose two values of p, say p_1 and p_2, replace e(p_k) in the fractional moment formula (4.3.54) with its sample counterpart ê(p_k) = (1/n) Σ |Y_i|^{p_k}, k = 1, 2, and solve the resulting equations for α and σ.
As an illustration, assume 1 < α ≤ 2 and take p_1 = 1/2 and p_2 = 1, so that by (4.3.54) we have

ê(1/2) = (1/n) Σ |Y_i|^{1/2} = √(πσ/2) · 1/(α sin(π/(2α)))    (4.3.80)

and

ê(1) = (1/n) Σ |Y_i| = 2σ/(α sin(π/α)).    (4.3.81)

Next, eliminate σ from (4.3.80) and (4.3.81) by squaring both sides of (4.3.80) and dividing the corresponding sides of equation (4.3.81) by those of the resulting equation. This produces an equation for α,

ê(1)/(ê(1/2))² = 4α sin²(π/(2α)) / (π sin(π/α)).    (4.3.82)
As remarked by Kozubowski (1999), finding a numerical solution of (4.3.82) is straightforward, since the right-hand side of (4.3.82) is strictly decreasing in α. Now, we can substitute α̂ into either (4.3.80) or (4.3.81) and solve the resulting equations for σ̂_1 and σ̂_2, obtaining

σ̂_1 = (2/π) α̂² sin²(π/(2α̂)) [ê(1/2)]²,    (4.3.83)

σ̂_2 = (1/2) α̂ sin(π/α̂) ê(1).    (4.3.84)

One can compute the average σ̂ = (σ̂_1 + σ̂_2)/2 to estimate σ. As reported in Kozubowski (1999), the above estimators perform well on simulated data. The results are most accurate when α is close to 2, and generally improve as n increases. Surprisingly, the procedure provides quite satisfactory results even for sample sizes as small as 100. The procedure can easily be adapted to the general strictly geometric stable case as well.
to the general strictly geometric stable case as well.
4.3.8 Extensions

We have already seen that symmetric Linnik distributions form a subclass of strictly geometric stable laws given by ch.f. (4.3.3). Distributions from this three-parameter family share many properties of the Linnik laws; see, for example, Kozubowski (1994ab) and Erdogan (1995). In turn, strictly geometric stable laws form a subclass of geometric stable laws, defined in Section 4.4.4. The latter is a four-parameter family of distributions which are the limiting laws for (normalized) geometric sums with i.i.d. components. More information on geometric stable laws can be found in Kozubowski and Rachev (1999ab).

Since the Linnik distribution is infinitely divisible, any positive power of the Linnik ch.f. (4.3.1) is a well-defined ch.f. corresponding to a real-valued (and symmetric) random variable. The resulting distributions are called generalized Linnik laws; see, e.g., Devroye (1993), Pakes (1998), Erdogan and Ostrovskii (1998b), and Jacques et al. (1999) for more details.
Nonnegative r.v.'s with Laplace–Stieltjes transform

f_{α,c}(s) = 1/(1 + cs^α),  s ≥ 0, α ∈ (0, 1], c > 0,    (4.3.85)

are the Mittag-Leffler distributions, introduced by Pillai (1990). Pakes (1995) considered a more general class of distributions with Laplace–Stieltjes transform

f_{α,c,β}(s) = 1/(1 + cs^α)^β,  s ≥ 0, α ∈ (0, 1], c > 0, β > 0,    (4.3.86)

and referred to them as the positive Linnik laws. Note that the functions (4.3.85) and (4.3.86) ought to be restricted to the case α ∈ (0, 1], since otherwise they are not completely monotone and hence cannot serve as Laplace–Stieltjes transforms.
Replacing s in (4.3.86) by 1 − z, we obtain the function

g_{α,c,β}(z) = 1/(1 + c(1 − z)^α)^β,  |z| ≤ 1, α ∈ (0, 1], c > 0, β > 0,    (4.3.87)

which is the probability generating function of a nonnegative integer-valued r.v. with the discrete Linnik distribution, studied by Devroye (1990) for c = 1 and by Pakes (1995) for c > 0. For β = 1 we obtain the discrete Mittag-Leffler distribution; see Pillai (1990) and Jayakumar and Pillai (1995). Letting β → ∞ (with c replaced by c/β), we arrive in the limit at the probability generating function

h_{α,c}(z) = e^{−c(1−z)^α},  |z| ≤ 1, α ∈ (0, 1], c > 0,    (4.3.88)

which represents a discrete stable distributed r.v.; see Steutel and van Harn (1979) and also Christoph and Schreiber (1998a). We refer the interested reader to Christoph and Schreiber (1998abc) for more information on and further references for these discrete distributions.
4.4 Other cases

4.4.1 Log-Laplace distribution

By analogy with the lognormal, S_U, and S_B systems of distributions [see, e.g., Johnson et al. (1994), Chapters 12 and 14], Johnson (1954) considered the system

X = θ + s log Y (S′_L system),
X = θ + s sinh^{−1} Y (S′_U system),
X = θ + s log(Y/(1 − Y)) (S′_B system),    (4.4.1)

where Y has the standard classical Laplace distribution. The S′_L system of distributions is known as the log-Laplace distributions (in analogy with the lognormal distributions); see Uppuluri (1981), Chipman (1985), Kotz et al. (1985), and Johnson et al. (1994) for further discussion of log-Laplace distributions.
4.4.2 Generalized Laplace distribution

The following generalization of the Laplace distribution was proposed by Subbotin (1923):

f_p(x) = [2p^{1/p} σ_p Γ(1 + 1/p)]^{−1} exp(−(pσ_p^p)^{−1} |x − µ|^p),    (4.4.2)

where µ = E(X) is the location parameter, σ_p = [E(|X − µ|^p)]^{1/p} is the scale parameter, and p > 0 is the shape parameter. The distributions with the above densities form a family called exponential power function distributions; they are also called generalized Laplace distributions, as for p = 1 they reduce to the standard Laplace laws. The estimation of the parameters was treated in a number of papers; for example, the MLE's and their properties were derived in Agrò (1995) [see also Zeckhauser and Thompson (1970)]. The distribution is widely used in Bayesian inference [see, e.g., Box and Tiao (1962), Tiao and Lund (1970)]. Other related papers include Jakuszenkow (1979), Sharma (1984), and Taylor (1992).
4.4.3 Sargan distribution

Consider a symmetric Bessel function distribution GAL(0, σ, τ), where τ = n + 1 is an integer. Here, the Bessel function K_{τ−1/2} = K_{n+1/2} admits a closed form (A.0.10) given in Appendix A, and the density (4.1.32) becomes

f(x) = (1/2) e^{−|x|} Σ_{j=0}^{n} γ_j |x|^j,    (4.4.3)

where

γ_j = (2n − j)! 2^{j−2n} / (n! j! (n − j)!)    (4.4.4)

[cf. equation (4.1.33)]. This distribution corresponds to the sum of n + 1 i.i.d. standard Laplace r.v.'s [for n = 0 we obtain the standard Laplace density (2.1.2)].
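The coefficients (4.4.4) are simple to compute, and normalization of (4.4.3) follows from ∫_0^∞ x^j e^{−x} dx = j!, so that Σ_j γ_j j! = 1. A minimal sketch of both (function names are ours; NumPy assumed):

```python
import math
import numpy as np

def sargan_gamma(n):
    # Coefficients (4.4.4) of the density (4.4.3) of a sum of n+1
    # i.i.d. standard classical Laplace variables.
    return [math.factorial(2 * n - j) * 2.0 ** (j - 2 * n)
            / (math.factorial(n) * math.factorial(j) * math.factorial(n - j))
            for j in range(n + 1)]

def bessel_sum_pdf(x, n):
    # Density (4.4.3): (1/2) e^{-|x|} sum_j gamma_j |x|^j.
    g = sargan_gamma(n)
    ax = np.abs(x)
    return 0.5 * np.exp(-ax) * sum(gj * ax ** j for j, gj in enumerate(g))
```

The identity Σ_j γ_j j! = 1 holds for every n, which is exactly the statement that (4.4.3) integrates to one.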
More generally, if Y_1, . . . , Y_{n+1} are i.i.d. with the general Laplace distribution (2.1.1), then the sample mean, Ȳ, has density

f(x) = (Kα/2) e^{−α|x−θ|} Σ_{j=0}^{n} γ_j α^j |x − θ|^j,    (4.4.5)

where K = 1, the γ_j are as above, and α = (n + 1)/s [see, e.g., Weida (1935)]. The function (4.4.5) is a special case of Sargan densities of order n, which for θ = 0 are given by (4.4.5) with
γ_j ≥ 0, γ_0 = 1, α > 0, K = (Σ_{j=0}^{n} γ_j j!)^{−1}.    (4.4.6)

Sargan densities have been suggested as an alternative to normal distributions in some econometric models, where it is desirable that the relevant distribution function be similar to the normal but computable in closed form; see, e.g., Goldfeld and Quandt (1981), Missiakoulis (1983) (who observes that the density of the arithmetic mean of n + 1 independent Laplace variables is an nth order Sargan density), Kafaei and Schmidt (1985), and Tse (1987).
4.4.4 Geometric stable laws

If the random variables in (2.2.1) have infinite variance, then the geometric compounds no longer converge to an AL law given by (3.1.10) with θ = 0. Instead, the limiting distributions form a broader class of geometric stable (GS) laws. It is a four-parameter family, denoted GS_α(σ, β, µ), conveniently described in terms of the characteristic function

ψ(t) = [1 + σ^α |t|^α ω_{α,β}(t) − iµt]^{−1},    (4.4.7)

where

ω_{α,β}(x) = 1 − iβ sign(x) tan(πα/2), if α ≠ 1,
ω_{α,β}(x) = 1 + iβ(2/π) sign(x) log |x|, if α = 1.    (4.4.8)

The parameter α ∈ (0, 2] is the index that determines the tail of the distribution: P(Y > y) ∼ Cy^{−α} (as y → ∞) for 0 < α < 2. For α = 2 the tail is exponential and the distribution reduces to an AL law, since ω_{2,β} ≡ 1. The parameter β ∈ [−1, 1] is the skewness parameter, while µ ∈ R and σ ≥ 0 control, as usual, the location and scale, respectively. We shall provide a few comments on basic features of GS laws, referring the interested reader to Kozubowski and Rachev (1999ab) for up-to-date information and numerous references on GS laws and their particular cases.

Remark 4.4.1 Special cases of GS laws include the Linnik distribution [discussed in Section 4.3 of this chapter, where β = 0 and µ = 0; see Linnik (1953)] and the Mittag-Leffler distributions, which are GS with β = 1 and either α = 1 and σ = 0 [the exponential distribution] or 0 < α < 1 and µ = 0. The latter are the only nonnegative GS r.v.'s [see, e.g., Pillai (1990), Fujita (1993), Jayakumar and Pillai (1993)]. For applications of Mittag-Leffler laws see, e.g., Weron and Kotulski (1996).
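The reduction to an AL-type law at α = 2 can be checked directly from (4.4.7)–(4.4.8): since tan(π) = 0, the factor ω_{2,β} is identically 1 for any β, and the ch.f. becomes 1/(1 + σ²t² − iµt). A minimal numerical sketch (our own function names; NumPy assumed):

```python
import numpy as np

def omega(t, alpha, beta):
    # omega_{alpha,beta} from (4.4.8).
    t = np.asarray(t, dtype=float)
    if alpha != 1:
        return 1.0 - 1j * beta * np.sign(t) * np.tan(np.pi * alpha / 2)
    return 1.0 + 1j * beta * (2 / np.pi) * np.sign(t) * np.log(np.abs(t))

def gs_chf(t, alpha, sigma, beta, mu):
    # Geometric stable characteristic function (4.4.7).
    t = np.asarray(t, dtype=float)
    return 1.0 / (1.0 + sigma ** alpha * np.abs(t) ** alpha * omega(t, alpha, beta)
                  - 1j * mu * t)
```

For α = 2 the skewness parameter β drops out, as asserted in the text.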
Remark 4.4.2 GS laws share many, but not all, properties of the so-called Paretian stable distributions. In fact, Paretian stable and GS laws are related via their characteristic functions, ϕ and ψ, as shown in Mittnik and Rachev (1991):

ψ(t) = γ(− log ϕ(t)),    (4.4.9)

where γ(x) = 1/(1 + x) is the Laplace transform of the standard exponential distribution. Relation (4.4.9) produces the representation (4.4.7), as well as the mixture representation of a GS random variable Y in terms of independent standardized Paretian stable and exponential r.v.'s, X and W:

Y =_d µW + W^{1/α} σX,  α ≠ 1,
Y =_d µW + WσX + σWβ(2/π) log(Wσ),  α = 1.    (4.4.10)

Note that the above representation reduces to (2.2.3) in the case α = 2 and µ = 0, as then X has the normal distribution with mean zero and variance 2.
Remark 4.4.3 The asymmetric Laplace distribution, which is GS with α = 2, plays among GS laws a role analogous to that of the normal distribution among Paretian stable laws. Namely, AL laws are the only laws in this class with a finite variance. Also, they are limits in the random summation scheme with a geometrically distributed number of terms, just as the normal laws are limits in the ordinary summation scheme. In contrast to the normal distribution, the c.d.f.'s of AL laws have explicit expressions, which makes them by far easier to handle in applications.

Remark 4.4.4 Similarly to Paretian stable laws, the GS laws lack explicit expressions for densities and distribution functions, which handicaps their practical implementation. Moreover, they are "fat-tailed," have stability properties (with respect to random summation), and generalize the central limit theorem (being the only limiting laws for geometric compounds). However, they differ from the stable (and normal) laws in that their densities are more "peaked"; consequently, they resemble Laplace-type distributions while still being heavy-tailed. Unlike Paretian stable densities, GS densities "blow up" at zero if α < 1. Since many financial data are "peaked" and "fat-tailed," they are often consistent with a GS model [see, e.g., Kozubowski and Rachev (1994)].
4.4.5 ν-stable laws

Suppose that the random number of terms ν_p in the summation (2.2.1) is any integer-valued random variable and, as p converges to zero, ν_p approaches infinity (in probability) while pν_p converges in distribution to a r.v. ν with Laplace transform γ. Then, the normalized compounds (2.2.1) converge in distribution to a ν-stable r.v. whose characteristic function is given by (4.4.9) [see, e.g., Gnedenko and Korolev (1996), Klebanov and Rachev (1996), Kozubowski and Panorska (1996)]. The class of ν-stable laws contains GS and generalized AL laws as special cases: if ν_p is geometric with mean 1/p, then pν_p converges to the standard exponential distribution and (4.4.9) leads to (4.4.7). The tail behavior of ν-stable laws is essentially the same as that of stable and GS laws.
4.5 Exercises

Exercise 4.5.1 For any given σ² > 0, let the r.v. X be log-normal with the p.d.f.

f(x|σ²) = (1/(√(2π) σx)) e^{−(log x)²/(2σ²)} for x > 0, and 0 otherwise

[so that given σ², the r.v. log X is N(0, σ²)]. Show that if the quantity σ² is a random variable with the standard exponential distribution, then X has the log-Laplace distribution with the p.d.f.

g(x) = (1/√2) x^{√2−1} for 0 < x < 1, and (1/√2) x^{−√2−1} for x ≥ 1

[so that the r.v. log X is standard Laplace L(0, 1)].
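The mixture in Exercise 4.5.1 can be checked numerically before proving it: integrating f(x|σ²) against the standard exponential density of σ² should reproduce the claimed log-Laplace p.d.f. pointwise. A minimal sketch (our own function names; SciPy assumed):

```python
import math
from scipy.integrate import quad

def lognormal_pdf(x, s2):
    # Conditional density f(x | sigma^2) from Exercise 4.5.1.
    return math.exp(-math.log(x) ** 2 / (2 * s2)) / (math.sqrt(2 * math.pi * s2) * x)

def mixture_pdf(x):
    # Integrate f(x | s2) against the standard exponential density exp(-s2).
    val, _ = quad(lambda s2: lognormal_pdf(x, s2) * math.exp(-s2), 0, math.inf)
    return val

def loglaplace_pdf(x):
    # Claimed closed form g(x) of the log-Laplace density.
    r = math.sqrt(2)
    return x ** (r - 1) / r if x < 1 else x ** (-r - 1) / r
```

The two functions agree at any x > 0, reflecting the identity ∫_0^∞ s^{−1/2} e^{−a/s − s} ds = √π e^{−2√a} that underlies the exercise.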
Exercise 4.5.2 Using the results on symmetric generalized Laplace densities, demonstrate that asymmetric generalized Laplace densities are unimodal. Is the mode of those distributions always at zero?

Exercise 4.5.3 Recall that if X has the standard symmetric Bessel function distribution GAL∗(0, 1, √2, n) with ch.f. (1 + t²)^{−n}, then X has the same distribution as the sum of n i.i.d. standard classical Laplace random variables. Thus, the variable X admits the random sum representation discussed in Proposition 2.3.2 of Chapter 2. Investigate whether a skewed Bessel r.v. GAL∗(0, κ, σ, n) admits a similar representation.

Exercise 4.5.4 Using Theorem 4.1.1, show that under the conditions of this theorem, the corresponding generalized Laplace densities converge to a normal density.

Exercise 4.5.5 Derive the coefficients of skewness and kurtosis for the K-Bessel function distribution, and compare them with the corresponding values for the Laplace and AL laws.
Exercise 4.5.6 Derive estimators of the K-Bessel function distribution parameters by the method of moments, and study their asymptotic properties. You may want to consider several cases as to which of the four parameters are unknown.

Exercise 4.5.7 Consider a sequence of stochastic processes {L_n(t)} and a process B(t). We say that {L_n(t)} has finite-dimensional distributions convergent to the finite-dimensional distributions of B(t) if for each N ∈ N and t_1, . . . , t_N, the sequence of random vectors (L_n(t_1), . . . , L_n(t_N)) converges in distribution to (B(t_1), . . . , B(t_N)). Let L_n(t) be LM(1/√τ_n, 1/τ_n), where τ_n converges to infinity, and let B(t) be a standard Brownian motion. Show that the convergence of finite-dimensional distributions holds in this case.
Exercise 4.5.8 Let X_1, . . . , X_n be i.i.d. with the exponential power function density

g(x) = k/(2sΓ(1/k)) e^{−(|x|/s)^k},  −∞ < x < ∞, s, k > 0,    (4.5.1)

where k is assumed to be known (for k = 1 we obtain the Laplace distribution).

(a) Show that the method of moments estimator of the parameter s² is

δ_1 = Γ(1/k)/(nΓ(3/k)) Σ_{i=1}^{n} X_i².    (4.5.2)

[Jakuszenkow (1979).] Derive the mean and the variance of δ_1. Show that δ_1 is unbiased and consistent for s². Is δ_1 an efficient estimator for s², i.e., does the variance of δ_1 coincide with the Cramér–Rao lower bound?
(b) Show that the MLE of the parameter s^k is

δ_2 = (k/n) Σ_{i=1}^{n} |X_i|^k.    (4.5.3)

[Jakuszenkow (1979).] Show that δ_2 is unbiased and consistent for s^k. Is δ_2 an efficient estimator for s^k?

(c) Show that among all estimators of the form

δ = α Σ_{i=1}^{n} X_i²,  α > 0,    (4.5.4)

the one that minimizes the expected value of the loss function

L(δ, s²) = f(s²)(δ − s²)²,    (4.5.5)
where f is an arbitrary positive function, corresponds to

α* = Γ(3/k)Γ(1/k) / [Γ(5/k)Γ(1/k) + (n − 1)Γ²(3/k)].    (4.5.6)

[Jakuszenkow (1979).] Is the resulting estimator unbiased for s²?

(d) Note that the estimator considered in Part (c) is not a function of the complete and sufficient statistic T = Σ_{i=1}^{n} |X_i|^k. To improve the estimator, consider the class of estimators of the form αT^{2/k}, α > 0, and show that the best estimator [with respect to the loss function (4.5.5)] is obtained for

α* = Γ((n + 2)/k) / Γ((n + 4)/k).    (4.5.7)

[Sharma (1984).]
Exercise 4.5.9 Extend Theorem 4.2.3 to an arbitrary symmetric Laplace motion LM(σ, ν) defined over the interval [0, T].
Exercise 4.5.10 It is well known that there exist essentially different stochastic processes having the same distribution at any fixed time point. Consider the following two processes:

L̃_t = √(Γ_t) σB_t + Γ_t µ + mt

and

L̃̃_t = √(Γ̃_t) σB_t + Γ̃_t µ + mt,

where Γ_t is a gamma process independent of a Brownian motion B_t, while Γ̃_t is a gamma white noise, i.e., for each n ∈ N and t_1, . . . , t_n ∈ R the variables Γ̃_{t_1}, . . . , Γ̃_{t_n} are independent gamma distributed with the shape parameters t_1/ν, . . . , t_n/ν, respectively.

Let L_t be ALM(µ, σ, ν) with a drift m. Show that for each fixed t,

L_t =_d L̃_t =_d L̃̃_t.

Are L̃_t and L̃̃_t Laplace motions? Why?

Hint: Use the representation given in Proposition 4.1.2 to show the first part.
Exercise 4.5.11 Prove the representation 4.2.2 of ALM(µ, σ, ν).
Exercise 4.5.12 Prove the representation 4.2.3 of ALM(µ, σ, ν).
Exercise 4.5.13 Prove the representation 4.2.4 of ALM(µ, σ, ν).
Exercise 4.5.14 Show that the function (4.3.1) is a genuine characteristic function for any 0 < α < 1.

Hint: Proceed by showing that:

(i) ψ_{α,σ}(−t) = ψ_{α,σ}(t), t > 0.
(ii) ψ_{α,σ}(0) = 1.
(iii) lim_{t→∞} ψ_{α,σ}(t) = 0.
(iv) ψ″_{α,σ}(t) > 0 for t > 0, so that ψ_{α,σ} is convex on (0, ∞).

Thus, ψ_{α,σ} is a Pólya-type ch.f.; see, e.g., Lukacs (1970).
Exercise 4.5.15 For any p ∈ (0, 1), let ν_p denote a geometric r.v. with mean 1/p and probability function

P(ν_p = k) = p(1 − p)^{k−1},  k = 1, 2, . . . .

Let p, q ∈ (0, 1), and consider a sequence (ν_p^{(i)}) of i.i.d. geometric random variables with mean 1/p and another geometric r.v. ν_q independent of the sequence. Show that the geometric sum Σ_{i=1}^{ν_q} ν_p^{(i)} has the same probability distribution as ν_{pq} [a geometric r.v. with mean 1/(pq)].

Hint: Write the ch.f. of the geometric sum by conditioning on ν_q.
Exercise 4.5.16 For each n ≥ 1, let Z_n^{(1)}, Z_n^{(2)}, . . . be a sequence of i.i.d. r.v.'s. Assume that for each i we have the convergence Z_n^{(i)} →_d Z^{(i)} as n → ∞, where the Z^{(i)}'s are independent and identically distributed variables. Let ν be any integer-valued r.v. independent of all the other r.v.'s involved. Show that, as n → ∞, the random sum Σ_{i=1}^{ν} Z_n^{(i)} converges in distribution to the random sum Σ_{i=1}^{ν} Z^{(i)}.

Exercise 4.5.17 Prove Proposition 4.3.6.
Exercise 4.5.18 For any 0 < ρ < 1, let f_ρ be the Cauchy density on (−∞, ∞) defined as follows:

f_ρ(x) = sin(πρ) / (π[(x + cos(πρ))² + sin²(πρ)]),  x ∈ R.    (4.5.8)

Show that ∫_0^∞ f_ρ(x)dx = ρ, so that g_ρ(x) = (1/ρ)f_ρ(x) is a density on (0, ∞).
Exercise 4.5.19 For any 0 < ρ < 1, let W_ρ be a positive r.v. with the density g_ρ defined in Exercise 4.5.18. Show that as ρ → 0+, the distribution of W_ρ converges weakly to the distribution given by the density g_0(x) = (1 + x)^{−2}, while as ρ → 1−, the distribution of W_ρ converges weakly to a distribution with unit mass at 1, namely W_1 ≡ 1.
Exercise 4.5.20 For any 0 < ρ < 1, let W_ρ be a positive r.v. with the density g_ρ defined in Exercise 4.5.18. Show that W_ρ has the reciprocal property W_ρ =_d 1/W_ρ.
Exercise 4.5.21 Show that if X is the Pareto Type I random variable with the p.d.f.

f(x) = (1/x) · 1/(log b − log a),  0 < a < x < b,

then Y = 1/X has a distribution of the same type.

Exercise 4.5.22 Prove Proposition 4.3.11.

Exercise 4.5.23 Prove Proposition 4.3.12.
Exercise 4.5.24 Show that setting m = 1 in (4.3.49) produces (4.3.44).

Exercise 4.5.25 Using the well-known identity for non-integer values of z,

Γ(z)Γ(1 − z) = π / sin(πz),

show that for α = 2 the fractional absolute moments of the Linnik distribution given by (4.3.54) coincide with σ^p Γ(p + 1), which are the moments of the symmetric Laplace distribution.

Exercise 4.5.26 Show that the Sargan density (4.4.5) with restrictions (4.4.6) is a bona fide probability density function on (−∞, ∞).
Part II

Multivariate distributions

Preamble

In this part of the monograph, we shall discuss currently available results on multivariate Laplace distributions and their generalizations. The field is relatively unexplored, and the subject matter is quite fresh and somewhat fragmented; thus our account is intentionally concise. In the authors' opinion, some period of digestion is required, and perhaps even essential, to put these results into a proper perspective. Hopefully, a separate monograph will be available on this burgeoning area of statistical distributions in the not-too-distant future.
Multivariate generalizations of the Laplace laws have been considered on various occasions by various authors. The term multivariate Laplace law is still somewhat ambiguous, but recently it has most often been applied to the class of symmetric, elliptically contoured distributions for which the characteristic function is of the form

Φ(t) = 1 / (1 + (1/2) t′Σt).    (4.5.9)

Recall that a r.v. in R^d has an elliptically contoured distribution if its ch.f. has the form

Φ(t) = e^{it′m} φ(t′Σt)    (4.5.10)

for some function φ, where m is a d × 1 vector in R^d and Σ is a d × d nonnegative definite matrix [see, e.g., Fang et al. (1990)].
Probably the simplest multivariate generalization of the Laplace distribution is the distribution of a vector of independent Laplace random variables [see, e.g., Osiewalski and Steel (1993), Marshall and Olkin (1993)]. However, not many properties of the univariate laws extend to this class of distributions. Moreover, it is not invariant under rotations (see, for example, the graph of the bivariate density in Figure 8.7).

Transforming a bivariate normal distribution, Ulrich and Chen (1987) obtained another bivariate distribution with Laplace marginals, noting that there were no "naturally occurring" bivariate Laplace distributions. Much earlier, McGraw and Wagner (1968), in their seminal paper, provided a number of examples of bivariate elliptically contoured distributions, including the multivariate Laplace distribution (4.5.9) and its generalizations [see also Johnson and Kotz (1972), Table 3, p. 297, equation (69), p. 301, and Johnson (1987)]. This multivariate Laplace law also appears in Anderson (1992) as a special case of the multivariate Linnik distribution [also known as the semi-α-Laplace distribution; see Pillai (1985)].

Recently, Ernst (1998) introduced yet another multivariate extension of symmetric Laplace distributions, again via an elliptic contouring. In the one-dimensional case, his class reduces to the univariate symmetric Laplace laws.
Barndorff-Nielsen (1977) introduced the class of so-called hyperbolic distributions, which was later extended to the multivariate case in Blaesild (1981). With an appropriate passage to the limit in their parameters, one can obtain a multivariate and asymmetric extension of the Laplace laws. This is the same class that is introduced below, but here it is studied on its own, independently of the theory of hyperbolic and inverse Gaussian distributions.

This part of the monograph is organized in such a manner that special cases (bivariate and symmetric distributions) are discussed first, albeit rather briefly, prior to the more general cases of multivariate and asymmetric distributions. We believe that this exposition, despite the fact that formally most of the properties follow from the results derived for the general case, allows for faster reference to the important special cases without the need to absorb the more cumbersome notation and description of the general multivariate asymmetric Laplace distributions. Thus the symmetric (elliptically contoured) multivariate distributions are discussed before the general asymmetric ones, and the bivariate cases precede the general multivariate ones. On the other hand, we present proofs for the general setting, omitting explicit proofs in particular cases unless they provide better insight.

While discussing the multivariate Laplace distributions, we shall always consider them to be centered at zero. One can add a location parameter in a natural manner and thus consider, as we did in the previous chapters, a more general class of asymmetric Laplace distributions. However, this complicates the already cumbersome notation in the multivariate case without adding substantially to deeper understanding.
5

Symmetric multivariate Laplace distribution

In this chapter we shall discuss a natural extension of the univariate symmetric Laplace distribution to the multivariate setting. The material discussed here has not, to the best of our knowledge, appeared before in the monographic literature. A comparison with the commonly used multivariate normal distribution would be most instructive.

5.1 Bivariate case

5.1.1 Definition

As in the univariate case, the most direct and simple way to introduce the bivariate symmetric Laplace distributions is through their characteristic functions. Thus the bivariate symmetric Laplace distributions constitute a three-parameter family of two-dimensional distributions with the characteristic functions given by

ψ(t_1, t_2) = (σ_1²t_1²/2 + ρσ_1σ_2t_1t_2 + σ_2²t_2²/2 + 1)^{−1},

where the three parameters σ_1, σ_2, and ρ satisfy

σ_1 ≥ 0, σ_2 ≥ 0, ρ ∈ [−1, 1].

We shall use BSL(σ_1, σ_2, ρ) instead of the full lengthy expression to describe membership in this family.
Note that in this definition, as well as in all others in this part of the book, we do not take into account the location of the distribution, making it always centered at zero. The word "symmetric" in our terminology reflects the fact that our distribution is actually obtained from a one-dimensional distribution spread uniformly along an ellipsoid in two dimensions. Formally, this means that the characteristic function depends on its argument t = (t_1, t_2)′ through t′Σt, where Σ is a certain positive definite matrix, in this case

Σ = [ σ_1²      σ_1σ_2ρ ]
    [ σ_1σ_2ρ   σ_2²    ].    (5.1.1)

In general, for this type of distribution the name elliptically contoured is used, and more appropriately the distribution under consideration should be called the elliptically contoured Laplace distribution.
The following property follows immediately from the definition.

Proposition 5.1.1 A linear combination a_1Y_1 + a_2Y_2 of the coordinates of a BSL(σ_1, σ_2, ρ) random vector Y = (Y_1, Y_2)′ has a one-dimensional symmetric Laplace distribution L(0, σ), where

σ = √(σ_1²a_1² + 2ρσ_1σ_2a_1a_2 + σ_2²a_2²).

In particular, the marginal distributions of a BSL distribution are symmetric Laplace distributions.

The case where σ_1 = σ_2 = 1 and ρ = 0 will be distinguished, and the corresponding distribution will be referred to as the standard bivariate Laplace distribution.
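Proposition 5.1.1 can be verified at the level of characteristic functions, since the ch.f. of a_1Y_1 + a_2Y_2 at t is ψ(a_1t, a_2t). A minimal numerical sketch (function names are ours; we assume the book's univariate L(0, σ) ch.f. 1/(1 + σ²t²/2) from Chapter 2):

```python
import numpy as np

def bsl_chf(t1, t2, s1, s2, rho):
    # BSL(sigma1, sigma2, rho) characteristic function.
    return 1.0 / (1.0 + 0.5 * s1 ** 2 * t1 ** 2
                  + rho * s1 * s2 * t1 * t2
                  + 0.5 * s2 ** 2 * t2 ** 2)

def laplace_chf(t, sigma):
    # Univariate symmetric Laplace L(0, sigma) ch.f.: 1/(1 + sigma^2 t^2 / 2).
    return 1.0 / (1.0 + 0.5 * sigma ** 2 * t ** 2)
```

Evaluating bsl_chf along the ray (a_1t, a_2t) matches laplace_chf with the σ of Proposition 5.1.1, for any choice of coefficients.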
5.1.2 Moments

The moments of the Laplace distribution are easily obtained by differentiating its characteristic function. In particular, we have the following formulas for the mean vector and variance–covariance matrix of a BSL(σ_1, σ_2, ρ) random vector Y:

EY = 0;  Cov(Y) = E(YY′) = [ σ_1²      σ_1σ_2ρ ]
                            [ σ_1σ_2ρ   σ_2²    ].

Note that even if Y is uncorrelated (ρ = 0), Y_1 and Y_2 are not independent (unlike the situation in the case of the bivariate normal distribution).

Remark 5.1.1 One can consider a vector of two independent Laplace random variables and its distribution. By the above property, such a vector does not belong to the multivariate Laplace family. An example of the density of such a random vector can be seen in Figure 8.7.
[Figure 5.1 shows contour plots of Gaussian (top row) and Laplace (bottom row, with Σ = Cov(Y)) bivariate densities, for Cov(Y) = (1, 0; 0, 1) in the left column and Cov(Y) = (1, 0; 0, 0.5) in the right column.]

Figure 5.1: Laplace and Gaussian bivariate densities corresponding to the uncorrelated distributions.
5.1.3 Densities

The formula for the densities is taken from the general case, considered in Section 6.5 of Chapter 6, equation (6.5.3). Namely, we have

g(x, y) = 1/(πσ_1σ_2√(1 − ρ²)) · K_0(√(2(x²/σ_1² − 2ρxy/(σ_1σ_2) + y²/σ_2²)/(1 − ρ²))),
where K_0 is the Bessel function of the third kind given by (A.0.4) or (A.0.5) in Appendix A.

In particular, the density of the standard bivariate Laplace distribution is given by

(1/π) K_0(√(2(x² + y²))).    (5.1.2)
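The density above is straightforward to evaluate with a library Bessel routine. A minimal sketch (our own function name; SciPy assumed), which also exhibits the elliptical symmetry and the monotone decay away from the center:

```python
import numpy as np
from scipy.special import k0   # modified Bessel function K_0

def bsl_pdf(x, y, s1=1.0, s2=1.0, rho=0.0):
    # Bivariate symmetric Laplace density of Section 5.1.3:
    # g(x, y) = K_0( sqrt(2 q) ) / (pi * s1 * s2 * sqrt(1 - rho^2)),
    # where q is the quadratic form in (x, y).
    q = (x**2 / s1**2 - 2 * rho * x * y / (s1 * s2) + y**2 / s2**2) / (1 - rho**2)
    return k0(np.sqrt(2 * q)) / (np.pi * s1 * s2 * np.sqrt(1 - rho**2))
```

With the default arguments this is the standard bivariate density (5.1.2); note that it is antipodally symmetric and, since K_0 is decreasing, strictly decreasing in the radius (and unbounded as (x, y) → 0).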
To compare the Gaussian and Laplace distributions, we present in Figures 5.1 and 5.2 bivariate Laplace and Gaussian densities. Figure 5.1 deals with uncorrelated distributions with the two covariance matrices Σ given by

(1, 0; 0, 1) and (1, 0; 0, 0.5).    (5.1.3)

The graphs present contour lines at levels in the interval (0, 0.5). The densities were cut off above the level of 0.5 (the Laplace densities are unbounded around zero). In order to illustrate both the tails and the behavior around zero, the contour levels were chosen differently in two sub-intervals. From the sub-interval (0, 0.005) we chose 10 equally spaced levels to show contours representing the tails of a distribution, and from the sub-interval (0.005, 0.5) we selected 50 equally spaced levels to present contours of a distribution at its center.

The first two drawings represent Gaussian densities of the distributions with the covariance matrices specified by the values of Σ.
In the next row of pictures we present the densities of the symmetric Laplace random variables having the same covariance matrices. The bivariate parameters of these two distributions are given by Σ = Cov(Y). The one on the left-hand side corresponds to the bivariate standard Laplace random variable with density (5.1.2), for which Σ is the identity matrix.

Analogous graphs are obtained for the correlated densities. In the first two rows, Gaussian and symmetric (elliptically contoured) Laplace distributions are presented with the covariance matrices coinciding with the matrix Σ.

In Figure 5.2, we present the correlated version of the previous graphs. Namely, we consider the covariance matrices Σ given by

(1, 0.5; 0.5, 1) and (1, 0.5; 0.5, 0.5).    (5.1.4)

In the first row the Gaussian distributions are presented with covariance matrices given by (5.1.4). In the second row the corresponding Laplace densities are provided.
5.1.4 Simulation of bivariate Laplace variates

The general algorithm for the simulation of asymmetric multivariate Laplace variables is derived in Section 6.4 of the next chapter. We present here its version for the bivariate symmetric case.
[Figure 5.2 shows contour plots of Gaussian (top row) and Laplace (bottom row, with Σ = Cov(Y)) bivariate densities, for Cov(Y) = (1, 0.5; 0.5, 1) in the left column and Cov(Y) = (1, 0.5; 0.5, 0.5) in the right column.]

Figure 5.2: Laplace and Gaussian bivariate densities corresponding to the correlated distributions.
A BSL(σ_1, σ_2, ρ) generator:

• Generate a bivariate normal variable X with mean zero and covariance matrix Σ given by (5.1.1).
• Generate a standard exponential variable W.
• Set Y ← √W · X.
• RETURN Y.
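The steps above translate directly into a few lines of NumPy; this is a sketch of the stated algorithm (the function name and vectorization over n draws are ours), not the authors' S-Plus code.

```python
import numpy as np

def rbsl(n, s1, s2, rho, rng=None):
    # Simulate n BSL(sigma1, sigma2, rho) vectors via Y = sqrt(W) * X,
    # where X ~ N(0, Sigma) with Sigma as in (5.1.1) and W ~ Exp(1),
    # independent of X.
    rng = np.random.default_rng(rng)
    Sigma = np.array([[s1**2, s1 * s2 * rho],
                      [s1 * s2 * rho, s2**2]])
    X = rng.multivariate_normal(np.zeros(2), Sigma, size=n)
    W = rng.exponential(size=n)
    return np.sqrt(W)[:, None] * X
```

Since E(W) = 1, the mixture leaves the covariance unchanged: the sample covariance of the output should be close to Σ, in line with the moment formulas of Section 5.1.2.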
In the eight figures below we have used this method implemented in the
S-Plus package to s imulate samples from the distr ibutions which are given
by the densities presented on Figures 5.1 a nd 5.2.
Cov(Y) =
1 0
0 1
Cov(Y) =
1 0
0 0.5
Gaussian
-4 -2 0 2 4
-4 -2 0 2 4
-4 -2 0 2 4
-4 -2 0 2 4
Laplace
Σ = Cov(Y )
-4 -2 0 2 4
-4 -2 0 2 4
-4 -2 0 2 4
-4 -2 0 2 4
Figure 5.3: Unco rrelated Laplac e and Gaussian random samples. Monte
Carlo simulation is based on the described algorithm. (The sample size
equals 2000.)
[Figure 5.4: scatter plots in four panels. Columns: Cov(Y) = (1 0.5; 0.5 1) and Cov(Y) = (1 0.5; 0.5 0.5). Rows: Gaussian and Laplace, each with Σ = Cov(Y). Axes range from −4 to 4.]
Figure 5.4: Correlated Laplace and Gaussian random samples. Monte Carlo
simulation is based on the described algorithm. (The sample size equals
2000.)
5.2 General symmetric multivariate case
5.2.1 Definition
A multivariate symmetric Laplace distribution is a direct generalization
of the bivariate case. As before, the word "symmetric" refers to elliptically
contoured or elliptically symmetric distributions and means that the distributions
possess a characteristic function which depends on its variables
only through a quadratic form.
Let $\Sigma$ be a $d \times d$ positive definite matrix of full rank. We shall say that
a $d$-dimensional distribution is multivariate symmetric Laplace with the
parameter $\Sigma$, denoted $SL_d(\Sigma)$, if its characteristic function is of the form
$$\Psi(\mathbf{t}) = \frac{1}{1 + \frac{1}{2}\mathbf{t}'\Sigma\mathbf{t}}. \quad (5.2.1)$$
5.2.2 Moments and densities
It follows directly from the definition that the $SL_d(\Sigma)$ distribution is
centered at zero (the mean is zero) and its covariance matrix is given by $\Sigma$.
From the representation of the density for the general multivariate asymmetric
case we have that the $SL_d(\Sigma)$ density function is of the form
$$g(\mathbf{y}) = \frac{2}{(2\pi)^{d/2}|\Sigma|^{1/2}}\left(\frac{\mathbf{y}'\Sigma^{-1}\mathbf{y}}{2}\right)^{v/2} K_v\left(\sqrt{2\,\mathbf{y}'\Sigma^{-1}\mathbf{y}}\right), \quad (5.2.2)$$
where $v = (2-d)/2$ and $K_v(\cdot)$ is the modified Bessel function of the third
kind given by (A.0.4) or (A.0.5) in Appendix A. This density was derived
in George and Pillai (1988) for the case $\Sigma = 2I$ and in Anderson (1992) as
a special case of the multivariate Linnik density [note that the density (8) of
Anderson (1992) contains an extra factor of 2Q]. Additional properties of
$SL_d(\Sigma)$ are provided in the exercises below. They should be viewed as an integral
part of this chapter.
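The characteristic function (5.2.1) also gives a convenient numerical cross-check of the mixture representation $\mathbf{Y} \stackrel{d}{=} \sqrt{W}\mathbf{X}$; the following sketch (our illustration, not from the text; assumes NumPy) compares a Monte Carlo estimate of $E\,e^{i\mathbf{t}'\mathbf{Y}}$ with the closed form:

```python
import numpy as np

def sl_cf(t, sigma):
    """Symmetric Laplace SL_d characteristic function (5.2.1)."""
    t = np.asarray(t, dtype=float)
    return 1.0 / (1.0 + 0.5 * t @ sigma @ t)

rng = np.random.default_rng(7)
sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
# Simulate Y = sqrt(W) X with W ~ Exp(1) independent of X ~ N(0, Sigma).
n = 400000
x = rng.multivariate_normal(np.zeros(2), sigma, size=n)
y = np.sqrt(rng.exponential(size=n))[:, None] * x
t = np.array([0.8, -0.4])
emp = np.exp(1j * (y @ t)).mean()   # Monte Carlo estimate of E exp(i t'Y)
```

The real part of `emp` should match `sl_cf(t, sigma)` up to sampling error, and the imaginary part should be near zero by symmetry.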
5.3 Exercises
Exercise 5.3.1 Let X = (X
1
, X
2
)
0
have a standard bivariate Laplace dis-
tribution BSL(1, 1, 0). Show that the two random va riables X
1
and X
2
are
uncorrelated but not independent.
Exercise 5.3.2 Let $\mathbf{X} = (X_1, X_2)'$ have a standard bivariate Laplace
distribution BSL(1, 1, 0). Convert to polar coordinates by setting $X_1 = R\cos\theta$, $X_2 = R\sin\theta$ ($R > 0$, $0 < \theta < 2\pi$).
(a) Derive the marginal density function of $R$.
(b) Derive the marginal density function of $\theta$.
(c) Are $R$ and $\theta$ independent?
(d) Repeat parts (a)-(c) under the assumption that $X_1$ and $X_2$ are i.i.d.
with the standard Laplace L(0, 1) distribution.
(e) Repeat parts (a)-(c) under the assumption that $X_1$ and $X_2$ are i.i.d. with
the standard normal distribution.
Exercise 5.3.3 Let $\mathbf{X} = (X_1, X_2)' \sim$ BSL$(\sigma_1, \sigma_2, \rho)$.
(a) Derive the marginal p.d.f.'s of $X_1$ and $X_2$.
(b) Derive the conditional p.d.f. of $X_2$ given $X_1 = x_1$.
Exercise 5.3.4 Let $\mathbf{X} = (X_1, \ldots, X_d)'$ have a symmetric multivariate
Laplace distribution $SL_d(\Sigma)$, and let $\Psi$ be the ch.f. of $\mathbf{X}$.
(a) Verify that the mean vector of $\mathbf{X}$ is $\mathbf{0}$ and the covariance matrix of $\mathbf{X}$
is $\Sigma$.
(b) Using the following expression for the $k$th moment of $\mathbf{X}$,
$$m_k(\mathbf{X}) = \frac{1}{i^k}\left.\frac{d^k \Psi(\mathbf{t})}{d\mathbf{t}^k}\right|_{\mathbf{t}=0}, \quad (5.3.1)$$
show that every moment of $\mathbf{X}$ of odd order vanishes.
(c) Using the following expression for the $k$th cumulant of $\mathbf{X}$,
$$c_k(\mathbf{X}) = \frac{1}{i^k}\left.\frac{d^k \log\Psi(\mathbf{t})}{d\mathbf{t}^k}\right|_{\mathbf{t}=0}, \quad (5.3.2)$$
show that $c_1(\mathbf{X}) = 0$, $c_2(\mathbf{X}) = \Sigma$, $c_3(\mathbf{X}) = 0$, and
$$c_4(\mathbf{X}) = \mathrm{vec}\,\Sigma \otimes \Sigma + (I_{d^3} + (K_{dd} \otimes I_d))(\Sigma \otimes \mathrm{vec}\,\Sigma) \quad (5.3.3)$$
[Kollo (2000)], where $\mathrm{vec}\,A$ is the vec operator of matrix $A$, $A \otimes B$ is the
Kronecker product of matrices $A$ and $B$, and $K_{dd}$ is the vec-permutation
matrix [see, e.g., Harville (1997) or Magnus and Neudecker (1999) for the
matrix notation]. What are the corresponding results for the multivariate
normal vector $\mathbf{X}$ with mean vector zero and covariance matrix $\Sigma$? [You
may wish to consult Kotz et al. (2000)].
Exercise 5.3.5 Recall that if $X$ is a univariate standard classical Laplace
variable with density $p(x) = \frac{1}{2}e^{-|x|}$ ($-\infty < x < \infty$), then the ordinate
$p(X)$ has uniform distribution on (0, 1/2), while the ordinate $p(Z)$ fails to
be uniform for a standard normal variable $Z$ with density $p(x) = \frac{1}{\sqrt{2\pi}}e^{-x^2/2}$
($-\infty < x < \infty$) (see Exercise 2.7.9). However, show that if the variables
$X_1$ and $X_2$ have a bivariate normal distribution with p.d.f.
$$p(x_1, x_2) = \frac{1}{2\pi}e^{-\frac{1}{2}(x_1^2 + x_2^2)}, \quad -\infty < x_1, x_2 < \infty,$$
then the ordinate $p(X_1, X_2)$ is uniform on $(0, 1/(2\pi))$ [Troutt (1991)]. Investigate
the corresponding case of the standard bivariate Laplace distribution with
p.d.f.
$$p(x_1, x_2) = \frac{1}{\pi}K_0\left(\sqrt{2(x_1^2 + x_2^2)}\right), \quad (x_1, x_2) \neq (0, 0).$$
We suggest that you consult Troutt (1991), Kotz and Troutt (1996), or Kotz
et al. (1997).
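The bivariate normal part of this exercise is easy to probe by simulation (our sketch, not from the text; assumes NumPy): the ordinate $p(X_1, X_2)$ of a standard bivariate normal should behave like a uniform draw from $(0, 1/(2\pi))$, whose mean is $1/(4\pi)$:

```python
import numpy as np

rng = np.random.default_rng(11)
n = 500000
x1, x2 = rng.standard_normal((2, n))
# Ordinate of the standard bivariate normal density at the sampled point.
p = np.exp(-0.5 * (x1 ** 2 + x2 ** 2)) / (2 * np.pi)
# Uniform(0, 1/(2*pi)) has mean 1/(4*pi) and supremum 1/(2*pi).
```

The sample mean of `p` should be close to $1/(4\pi)$, and no value can exceed $1/(2\pi)$.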
Exercise 5.3.6 Generalize the results of Exercises 2.7.9 (of Chapter 2)
and 5.3.5 by showing that if $\mathbf{X}$ is a random vector in $\mathbb{R}^d$, $d \geq 1$, with
probability density function
$$f(\mathbf{x}) = c_d\, e^{-(\mathbf{x}'\mathbf{x})^{d/2}},$$
then the random variable $U = f(\mathbf{X})$ has uniform distribution on $(0, c_d)$.
What is the value of $c_d$?
Exercise 5.3.7 Let $\mathbf{Y} = (Y_1, \ldots, Y_d)'$ have a multivariate $AL_d(\mathbf{0}, I_d)$ distribution
in $\mathbb{R}^d$. Show that the random vector
$$\left(\frac{Y_1}{Y_d}, \ldots, \frac{Y_{d-1}}{Y_d}\right)$$
has a multivariate Cauchy distribution with the density
$$\Gamma(d/2)\pi^{-d/2}\left(1 + \sum_{i=1}^{d-1} x_i^2\right)^{-d/2}$$
and is independent of $||\mathbf{Y}|| = (\sum_{i=1}^{d} Y_i^2)^{1/2}$. The above result is actually
a characterization of spherically symmetric distributions [see George and
Pillai (1988)].
6
Asymmetric multivariate Laplace distribution

In this chapter we present the theory of a class of multivariate laws which we
term asymmetric Laplace (AL) distributions [see Kozubowski and Podgórski
(1999bc), Kotz et al. (2000b)]. The class is an extension of both the symmetric
multivariate Laplace distributions and the univariate AL distributions
that were discussed in the previous chapters. This extension retains the
natural, asymmetric and multivariate features of the properties characterizing
these two important subclasses. In particular, the AL distributions
arise as the limiting laws in a random summation scheme with i.i.d. terms
having a finite second moment, where the number of terms in the summation
is geometrically distributed independently of the terms themselves.
This class can be viewed as a subclass of hyperbolic distributions, and some
of its properties are inherited from them. However, to demonstrate the elegant
theoretical structure of the multivariate AL laws, and also for the sake
of simplicity, we prefer direct derivations of the results. Thus we provide
explicit formulas for the probability density and the density of the Lévy
measure. The results presented also include characterizations, mixture representations,
formulas for moments, a simulation algorithm, and a brief
discussion of linear regression models with AL errors.

The multivariate laws discussed below, unlike the laws of Ernst (1998)
already mentioned, have multivariate (and univariate) Laplace marginal
distributions, allow for asymmetry, and in general are not elliptically contoured.
Asymmetric Laplace laws can be defined in various equivalent ways,
which we express in the form of their characterizations and representations.
Their significance comes from the fact that they are the only distributional
limits for (appropriately normalized) random sums of i.i.d. random vectors
(r.v.'s) with finite second moments
$$\mathbf{X}^{(1)} + \cdots + \mathbf{X}^{(\nu_p)}, \quad (6.0.1)$$
where $\nu_p$ has a geometric distribution with mean $1/p$ (independent of
the $\mathbf{X}^{(i)}$'s):
$$P(\nu_p = k) = p(1-p)^{k-1}, \quad k = 1, 2, \ldots, \quad (6.0.2)$$
and $p$ converges to zero [see, e.g., Mittnik and Rachev (1991)]. Thus, these
multivariate laws arise rather naturally. Since sums such as (6.0.1) frequently
appear in many applied problems in biology, economics, insurance
mathematics, reliability, and other fields [see the examples in Kalashnikov
(1997) and references therein], AL distributions should have a wide variety
of applications. In particular, this class seems to be suitable for modeling
heavy tailed asymmetric multivariate data for which one is reluctant to
sacrifice the property of finiteness of moments. (Multivariate stable distributions
are an alternative where this concession has to be made.)
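The geometric compounding scheme (6.0.1)-(6.0.2) is easy to probe numerically. In the sketch below (our illustration, not from the text; assumes NumPy), geometric sums of i.i.d. standard normal terms are normalized by $\sqrt{p}$; the variance then stays at one while the tails become heavier than Gaussian, as expected of the Laplace limit:

```python
import numpy as np

rng = np.random.default_rng(42)
p = 0.02          # geometric parameter; the Laplace limit arises as p -> 0
n = 20000         # number of geometric sums to simulate

nu = rng.geometric(p, size=n)   # nu_p with P(nu_p = k) = p(1-p)^{k-1}
sums = np.array([rng.standard_normal(k).sum() for k in nu])
y = np.sqrt(p) * sums           # normalized geometric sums; Var(y) = 1 exactly
kurt = np.mean(y ** 4) / np.mean(y ** 2) ** 2   # ~6 in the Laplace limit, 3 for Gaussian
```

The sample kurtosis lands well above the Gaussian value 3, illustrating the heavier tails of the limit law.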
From the standpoint of the classical distribution theory, the AL laws
form a subclass of the geometric stable distributions [see, e.g., Rachev and
SenGupta (1992)]. The geometric stable laws approximate geometric compounds
(6.0.1) with arbitrary components, including those with infinite
means [see Kozubowski and Rachev (1999b) for references on multivariate
geometric stable laws]. The geometric stable distributions, similarly
to stable laws, have their tail behavior governed by the index of stability
$\alpha \in (0, 2]$. The AL distributions correspond to the geometric stable subclass
with $\alpha = 2$. Thus, they play an analogous role among the geometric
stable laws as Gaussian distributions do among the stable laws. Like Gaussian
distributions, they have finite moments of all orders, and their theory
is equally elegant and straightforward. However, in spite of the finiteness of moments,
their tails are substantially longer than those of the Gaussian laws;
this, coupled with the fact that they allow for asymmetry, renders them
more flexible and attractive for modeling heavy tailed asymmetric data.
Incidentally, the multivariate AL laws can be obtained as a limiting case
of the generalized hyperbolic distributions introduced by Barndorff-Nielsen
(1977). Consequently, certain properties of AL laws can be deduced from
the corresponding properties of the generalized hyperbolic distributions
by passing to the limit. However, direct proofs for AL laws are often simpler
than their "hyperbolic" counterparts and in addition provide a better
insight into this class, and we have included them in our work. Moreover,
many properties are quite specific to AL laws, such as their convolution
properties in relation to the random summation model. From the latter
point of view, which coincides with our main interest and motivation, the
relation to the generalized hyperbolic laws, although an important one, is
in essence not crucial.
6.1 Bivariate case: definition and basic properties
6.1.1 Definition
The bivariate asymmetric Laplace distributions constitute a five-parameter
family of two-dimensional distributions given by the characteristic function
$$\psi(t_1, t_2) = \frac{1}{1 + \frac{\sigma_1^2 t_1^2}{2} + \rho\sigma_1\sigma_2 t_1 t_2 + \frac{\sigma_2^2 t_2^2}{2} - im_1 t_1 - im_2 t_2},$$
where the five parameters $m_1$, $m_2$, $\sigma_1$, $\sigma_2$, and $\rho$ satisfy
$$m_1 \in \mathbb{R}, \quad m_2 \in \mathbb{R}, \quad \sigma_1 > 0, \quad \sigma_2 > 0, \quad \rho \in [-1, 1].$$
In the sequel, the notation BAL$(m_1, m_2, \sigma_1, \sigma_2, \rho)$ will stand for the asymmetric
bivariate Laplace distribution with the given parameters.
The distribution is no longer elliptically contoured (unless $m_1 = m_2 = 0$),
which justifies using the term "asymmetric distributions". The following
property follows immediately from the definition.
Proposition 6.1.1 A linear combination $a_1 Y_1 + a_2 Y_2$ of the coordinates
of a BAL$(m_1, m_2, \sigma_1, \sigma_2, \rho)$ random vector $\mathbf{Y} = (Y_1, Y_2)'$ has a one-dimensional
AL distribution $AL(\mu, \sigma)$, where
$$\mu = m_1 a_1 + m_2 a_2 \quad \text{and} \quad \sigma = \sqrt{\sigma_1^2 a_1^2 + 2\rho\sigma_1\sigma_2 a_1 a_2 + \sigma_2^2 a_2^2}.$$
As in the symmetric case, the marginal distributions of a BAL distribution are univariate asymmetric Laplace distributions.
6.1.2 Moments
The moments of the BAL distribution are easily obtained by differentiating
its characteristic function. In particular, we have the following formulas
for the means and the elements of the variance-covariance matrix of a
BAL$(m_1, m_2, \sigma_1, \sigma_2, \rho)$ random vector $\mathbf{Y} = (Y_1, Y_2)'$:
$$EY_1 = m_1, \quad EY_2 = m_2; \quad \mathrm{Var}\,Y_1 = \sigma_1^2 + m_1^2, \quad \mathrm{Var}\,Y_2 = \sigma_2^2 + m_2^2,$$
$$\mathrm{Cov}(Y_1, Y_2) = \sigma_1\sigma_2\rho + m_1 m_2.$$
Note that as in the symmetric case, even if the components of $\mathbf{Y}$ are
uncorrelated (i.e., $\sigma_1\sigma_2\rho + m_1 m_2 = 0$), they are not independent. Moreover,
the matrix $\Sigma$ is no longer the variance-covariance matrix of $\mathbf{Y}$ (unless
$\mathbf{m} = \mathbf{0}$).
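These moment formulas can be verified by differentiating the characteristic function numerically at the origin, in the spirit of (5.3.1). The sketch below (our illustration, assuming NumPy; the parameter values are arbitrary) uses central finite differences:

```python
import numpy as np

def psi(t1, t2, m1=0.5, m2=0.25, s1=1.0, s2=0.8, rho=0.3):
    """BAL(m1, m2, s1, s2, rho) characteristic function."""
    return 1.0 / (1.0 + (s1 * t1) ** 2 / 2 + rho * s1 * s2 * t1 * t2
                  + (s2 * t2) ** 2 / 2 - 1j * m1 * t1 - 1j * m2 * t2)

h = 1e-4
# E[Y1] = (1/i) * d psi / d t1 at t = 0 (central difference)
ey1 = ((psi(h, 0) - psi(-h, 0)) / (2j * h)).real
# E[Y1^2] = -d^2 psi / d t1^2 at t = 0 (second central difference)
ey1sq = -((psi(h, 0) - 2 * psi(0, 0) + psi(-h, 0)) / h ** 2).real
var1 = ey1sq - ey1 ** 2   # should match sigma1^2 + m1^2 = 1.25
```

With these parameter values, `ey1` reproduces $m_1 = 0.5$ and `var1` reproduces $\sigma_1^2 + m_1^2 = 1.25$.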
[Figure 6.1: contour plots in four panels of asymmetric Laplace densities with Σ ≠ Cov(Y). Columns: Cov(Y) = (1 0; 0 1) and Cov(Y) = (1 0; 0 0.5). Rows: m = (0.5, 0.5) and m = (0.5, 0.25). Axes range from −4 to 4.]
Figure 6.1: Asymmetric bivariate Laplace densities corresponding to the
uncorrelated distributions. The covariances are the same as in the symmetric
case in Figure 5.1.
6.1.3 Densities
The expression for the densities is obtained from the general case considered
later in this chapter [equation (6.5.3)]. We have
$$g(x, y) = \frac{\exp\left(\left(\left(m_1\frac{\sigma_2}{\sigma_1} - m_2\rho\right)x + \left(m_2\frac{\sigma_1}{\sigma_2} - m_1\rho\right)y\right)\Big/\left(\sigma_1\sigma_2(1-\rho^2)\right)\right)}{\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\cdot K_0\left(C(m_1, m_2, \sigma_1, \sigma_2, \rho)\sqrt{x^2\frac{\sigma_2}{\sigma_1} - 2\rho xy + y^2\frac{\sigma_1}{\sigma_2}}\right),$$
where
$$C(m_1, m_2, \sigma_1, \sigma_2, \rho) = \frac{\sqrt{2\sigma_1\sigma_2(1-\rho^2) + m_1^2\frac{\sigma_2}{\sigma_1} - 2m_1 m_2\rho + m_2^2\frac{\sigma_1}{\sigma_2}}}{\sigma_1\sigma_2(1-\rho^2)}.
$$
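As an independent sanity check (ours, not part of the text), this density can be evaluated numerically, computing $K_0$ from its integral representation $K_0(u) = \int_0^\infty e^{-u\cosh s}\,ds$, and integrated over a grid; the total mass should be close to one. A Python sketch assuming NumPy:

```python
import numpy as np

def k0(u):
    """Modified Bessel function K_0 via K_0(u) = int_0^inf exp(-u cosh s) ds
    (trapezoid rule; adequate for u not extremely close to 0)."""
    s = np.linspace(0.0, 20.0, 2001)
    f = np.exp(-np.outer(np.atleast_1d(u), np.cosh(s)))
    return ((f[:, :-1] + f[:, 1:]) / 2 * np.diff(s)).sum(axis=1)

def bal_pdf(x, y, m1, m2, s1, s2, rho):
    """Bivariate asymmetric Laplace density of section 6.1.3."""
    d = s1 * s2 * (1 - rho ** 2)
    lin = ((m1 * s2 / s1 - m2 * rho) * x + (m2 * s1 / s2 - m1 * rho) * y) / d
    q = x ** 2 * s2 / s1 - 2 * rho * x * y + y ** 2 * s1 / s2
    c = np.sqrt(2 * d + m1 ** 2 * s2 / s1 - 2 * m1 * m2 * rho
                + m2 ** 2 * s1 / s2) / d
    return np.exp(lin) * k0(c * np.sqrt(q)) / (np.pi * d / np.sqrt(1 - rho ** 2))

# Midpoint-rule integration on a grid that avoids the singularity at (0, 0).
h = 0.1
grid = np.arange(-10 + h / 2, 10, h)
total = sum(bal_pdf(grid, yv, 0.5, 0.25, 1.0, 0.8, 0.3).sum() * h * h
            for yv in grid)
```

The computed `total` should be approximately 1, confirming the normalization of the formula.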
In Figure 6.1, we present four different asymmetric bivariate Laplace
densities for which the covariance matrix is exactly the same as for the
symmetric cases of the Gaussian and Laplace distributions presented in Figure
5.1. These densities are still uncorrelated but the matrix $\Sigma$ is no longer
diagonal.
The four graphs deal with various cases when $m_1 \neq 0$ and $m_2 \neq 0$,
and thus the distributions are no longer elliptically contoured (symmetric).
The values of the five parameters are as follows. The two cases in the first
row of Figure 6.1 correspond to $m_1 = m_2 = 1/2$ and $\sigma_1 = \sigma_2 = \sqrt{3}/2$,
$\rho = -1/3$ (the left picture) and $\sigma_1 = \sqrt{3}/2$, $\sigma_2 = 1/2$, $\rho = -\sqrt{3}/3$ (the
right picture). The two cases in the second row of Figure 6.1 correspond to
$m_1 = 1/2$, $m_2 = 1/4$ and $\sigma_1 = \sqrt{3}/2$, $\sigma_2 = \sqrt{15}/4$, $\rho = -\sqrt{5}/15$ (the left
picture) and $\sigma_1 = \sqrt{3}/2$, $\sigma_2 = \sqrt{7}/4$, $\rho = -\sqrt{21}/21$ (the right picture). For
the meaning of the presented contour lines see Section 5.1.
The graphs indicate that even in the uncorrelated case, the Laplace distributions
exhibit a large variety of asymmetric features, a property not
shared by the Gaussian distributions (compare with Figure 5.1).
Similar graphs are obtained for the correlated densities corresponding
to the covariance matrices given in Section 5.1. Figure 6.2 should be compared
with the symmetric case provided in Figure 5.2. In both cases, we
have the same correlation structure. These graphs present densities of four
asymmetric Laplace distributions with the parameters specified as follows:
$$\Sigma = \begin{pmatrix} 0.75 & 0.25 \\ 0.25 & 0.75 \end{pmatrix}, \quad \mathbf{m} = (0.5, 0.5)';$$
$$\Sigma = \begin{pmatrix} 0.75 & 0.25 \\ 0.25 & 0.25 \end{pmatrix}, \quad \mathbf{m} = (0.5, 0.5)';$$
$$\Sigma = \begin{pmatrix} 0.75 & 0.375 \\ 0.375 & 0.9375 \end{pmatrix}, \quad \mathbf{m} = (0.5, 0.25)';$$
$$\Sigma = \begin{pmatrix} 0.75 & 0.375 \\ 0.375 & 0.4375 \end{pmatrix}, \quad \mathbf{m} = (0.5, 0.25)'.$$
Asymmetry of the distributions is clearly noticeable.
[Figure 6.2: contour plots in four panels of asymmetric Laplace densities with Σ ≠ Cov(Y). Columns: Cov(Y) = (1 0.5; 0.5 1) and Cov(Y) = (1 0.5; 0.5 0.5). Rows: m = (0.5, 0.5) and m = (0.5, 0.25). Axes range from −4 to 4.]
Figure 6.2: Laplace asymmetric bivariate densities corresponding to correlated
distributions. The same covariances as in the symmetric case in
Figure 5.2 are used.

6.1.4 Simulation of bivariate asymmetric Laplace variates
The general algorithm for simulating asymmetric multivariate Laplace variables
is derived in Section 6.4 below. In the bivariate case it
takes the following form:
A BAL$(m_1, m_2, \sigma_1, \sigma_2, \rho)$ generator.
• Generate a bivariate normal variable $\mathbf{X}$ with mean zero and covariance matrix $\Sigma$ given by (5.1.1).
• Generate a standard exponential variable $W$.
• Set $\mathbf{Y} \leftarrow \sqrt{W}\cdot\mathbf{X} + \mathbf{m}W$.
• RETURN $\mathbf{Y}$.
Note that compared with the corresponding algorithm for the symmetric
case (see Section 5.1), here we have an extra term $\mathbf{m}W$, which combined
with $\sqrt{W}\mathbf{X}$ leads to an AL variable.
In Figures 6.3 and 6.4, we present scatter plots of simulated samples (based on
Monte Carlo simulation) from the same distributions as those whose densities
are presented in Figures 6.1 and 6.2.
[Figure 6.3: scatter plots in four panels with Σ ≠ Cov(Y). Columns: Cov(Y) = (1 0; 0 1) and Cov(Y) = (1 0; 0 0.5). Rows: m = (0.5, 0.5) and m = (0.5, 0.25). Axes range from −4 to 4.]
Figure 6.3: Uncorrelated asymmetric Laplace random samples. Monte Carlo
simulation based on the algorithm described in the text. (The sample size
equals 2000.)
[Figure 6.4: scatter plots in four panels with Σ ≠ Cov(Y). Columns: Cov(Y) = (1 0.5; 0.5 1) and Cov(Y) = (1 0.5; 0.5 0.5). Rows: m = (0.5, 0.5) and m = (0.5, 0.25). Axes range from −4 to 4.]
Figure 6.4: Correlated asymmetric Laplace random samples. Monte Carlo
simulation based on the algorithm described in the text. (The sample size
equals 2000.)
6.2 General multivariate asymmetric case
6.2.1 Definition
Firstly, we shall provide a definition of multivariate AL laws.
Definition 6.2.1 A random vector $\mathbf{Y}$ in $\mathbb{R}^d$ is said to have a multivariate
asymmetric Laplace distribution (AL) if its characteristic function is given
by
$$\Psi(\mathbf{t}) = \frac{1}{1 + \frac{1}{2}\mathbf{t}'\Sigma\mathbf{t} - i\mathbf{m}'\mathbf{t}}, \quad (6.2.1)$$
where $\mathbf{m} \in \mathbb{R}^d$ and $\Sigma$ is a $d \times d$ non-negative definite symmetric matrix.
We shall use the notation $AL_d(\mathbf{m}, \Sigma)$ to denote the distribution of $\mathbf{Y}$, and
write $\mathbf{Y} \sim AL_d(\mathbf{m}, \Sigma)$. If the matrix $\Sigma$ is positive definite, the distribution
is truly $d$-dimensional and has a probability density function. Otherwise, it
is degenerate and the probability mass of the distribution is concentrated
in a proper linear subspace of the $d$-dimensional space.
For $\mathbf{m} = \mathbf{0}$ the distribution $AL_d(\mathbf{0}, \Sigma)$ reduces to the symmetric multivariate
Laplace law $L_d(\Sigma)$ discussed in Section 5.2 of Chapter 5 (although
more appropriately it should perhaps be called an elliptically contoured
Laplace law).
Remark 6.2.1 The parameter $\mathbf{m} = (m_1, \ldots, m_d)'$ appearing in (6.2.1) is
not a shift parameter: if $\mathbf{Y} \sim AL_d(\mathbf{m}, \Sigma)$ it does not follow that $\mathbf{Y} + \mathbf{n} \sim AL_d(\mathbf{m} + \mathbf{n}, \Sigma)$. In fact, the distribution of $\mathbf{Y} + \mathbf{n}$ is not even AL (unless
$\mathbf{n} = \mathbf{0}$). However, the mean of $\mathbf{Y}$ exists and equals $\mathbf{m}$.
Remark 6.2.2 The class of AL laws is not closed under summation of
independent r.v.'s: if $\mathbf{X}$ and $\mathbf{Y}$ are independent AL r.v.'s, then in general
$\mathbf{X} + \mathbf{Y}$ does not possess an AL law.
6.2.2 Special cases
In the following remarks we shall discuss some special cases of AL laws.
Remark 6.2.3 For $d = 1$ we obtain a univariate $AL(\mu, \sigma)$ distribution
with mean $\mu$ and variance $\sigma^2 + \mu^2$.
Remark 6.2.4 For $d = 2$ the distribution $AL_2(\mathbf{m}, \Sigma)$ with $\mathbf{m} = (m_1, m_2)'$
and $\Sigma$ given by (5.1.1) reduces to the BAL$(m_1, m_2, \sigma_1, \sigma_2, \rho)$ distribution (and
to the BSL$(\sigma_1, \sigma_2, \rho)$ distribution for $\mathbf{m} = \mathbf{0}$).
Remark 6.2.5 Here is an example of a degenerate AL law in $\mathbb{R}^d$. If $Y$ has
a univariate $AL(1, 1)$ law and $\mathbf{m} \in \mathbb{R}^d$, then the r.v. $\mathbf{Y} = \mathbf{m}Y$ has the ch.f.
$$\Psi_{\mathbf{Y}}(\mathbf{t}) = Ee^{i\mathbf{t}'\mathbf{Y}} = \psi_Y(\mathbf{t}'\mathbf{m}) = \frac{1}{1 + \frac{1}{2}\mathbf{t}'(\mathbf{m}\mathbf{m}')\mathbf{t} - i\mathbf{m}'\mathbf{t}}.$$
Thus, $\mathbf{Y} \sim AL_d(\mathbf{m}, \Sigma)$ with $\Sigma = \mathbf{m}\mathbf{m}'$.
Remark 6.2.6 Consider a r.v. $\mathbf{Y} \sim AL_d(\mathbf{m}, 0)$, with the ch.f.
$$\Psi_{\mathbf{Y}}(\mathbf{t}) = \frac{1}{1 - i\mathbf{m}'\mathbf{t}}. \quad (6.2.2)$$
Then, $\mathbf{Y}$ admits the representation $\mathbf{Y} \stackrel{d}{=} \mathbf{m}Z$, where $Z$ is the standard
exponential variable. Indeed, we have
$$\Psi_{\mathbf{Y}}(\mathbf{t}) = Ee^{i\mathbf{t}'\mathbf{Y}} = \psi_Z(\mathbf{t}'\mathbf{m}) = \frac{1}{1 - i\mathbf{m}'\mathbf{t}}.$$
This distribution is related to the Marshall-Olkin exponential distribution
of the r.v.
$$\mathbf{W} = (W_1, \ldots, W_d)',$$
given by its survival function
$$P(W_1 > x_1, \ldots, W_d > x_d) = e^{-\max(x_1, \ldots, x_d)}, \quad x_i \geq 0,\ i = 1, 2, \ldots, d.$$
Since the ch.f. of $\mathbf{W}$ is
$$\Psi_{\mathbf{W}}(\mathbf{t}) = (1 - i(t_1 + \cdots + t_d))^{-1},$$
we have $\mathbf{Y} \stackrel{d}{=} D(\mathbf{m})\cdot\mathbf{W}$, where $D(\mathbf{m})$ is a diagonal matrix with the elements
of the vector $\mathbf{m}$ on its main diagonal.
6.3 Representations
6.3.1 Basic representation
The following result follows directly from the representation of geometric
stable laws discussed in Kozubowski and Panorska (1999).
Theorem 6.3.1 Let $\mathbf{Y} \sim AL_d(\mathbf{m}, \Sigma)$ and let $\mathbf{X} \sim N_d(\mathbf{0}, \Sigma)$. Let $W$ be an
exponentially distributed r.v. with mean 1, independent of $\mathbf{X}$. Then,
$$\mathbf{Y} \stackrel{d}{=} \mathbf{m}W + W^{1/2}\mathbf{X}. \quad (6.3.1)$$
Remark 6.3.1 More general mixtures of normal distributions, where $W$
has a generalized inverse Gaussian distribution, were considered by Barndorff-Nielsen
(1977). A generalized inverse Gaussian distribution with parameters
$(\lambda, \chi, \psi)$, denoted GIG$(\lambda, \chi, \psi)$, has the p.d.f.
$$p(x) = \frac{(\psi/\chi)^{\lambda/2}}{2K_\lambda(\sqrt{\chi\psi})}\,x^{\lambda-1}e^{-\frac{1}{2}(\chi x^{-1} + \psi x)}, \quad x > 0, \quad (6.3.2)$$
where $K_\lambda$ is the modified Bessel function of the third kind (see Appendix
A). The range of the parameters is
$$\chi \geq 0,\ \psi > 0,\ \lambda > 0; \quad \chi > 0,\ \psi > 0,\ \lambda = 0; \quad \chi > 0,\ \psi \geq 0,\ \lambda < 0.$$
Barndorff-Nielsen (1977) considered mixtures of the form
$$\mathbf{Y} \stackrel{d}{=} \boldsymbol{\mu} + \mathbf{m}W + W^{1/2}\mathbf{X}, \quad (6.3.3)$$
where $\mathbf{X}$ is as before, $\mathbf{m} = \Sigma\boldsymbol{\beta}$ with some $d$-dimensional vector $\boldsymbol{\beta}$, and
$W \sim$ GIG$(\lambda, \chi, \psi)$. With the notation $\chi = \delta^2$, $\psi = \xi^2$, and $\alpha^2 = \xi^2 + \boldsymbol{\beta}'\Sigma\boldsymbol{\beta}$,
$\mathbf{Y}$ has a $d$-dimensional generalized hyperbolic distribution with index $\lambda$,
denoted by $H_d(\lambda, \alpha, \boldsymbol{\beta}, \delta, \boldsymbol{\mu}, \Sigma)$ [a hyperbolic distribution is obtained for
$\lambda = 1$; see, e.g., Blaesild (1981)]. Taking the limiting case GIG(1, 0, 2) as
the mixing distribution (which is a standard exponential) and setting $\Sigma\boldsymbol{\beta} = \mathbf{m}$
and $\boldsymbol{\mu} = \mathbf{0}$, so that $\delta^2 = 0$, $\xi^2 = 2$, and $\alpha = \sqrt{2 + \mathbf{m}'\Sigma^{-1}\mathbf{m}}$, we obtain the
mixture $W\mathbf{m} + W^{1/2}\mathbf{X}$, where $\mathbf{X}$ is $N_d(\mathbf{0}, \Sigma)$, independent of $W$, which
has a multivariate AL distribution.
Remark 6.3.2 By Theorem 6.3.1, each component $Y_i$ of an AL r.v. $\mathbf{Y}$
admits the representation
$$Y_i \stackrel{d}{=} m_i W + W^{1/2}\sqrt{\sigma_{ii}}\,X_i, \quad (6.3.4)$$
where $X_i$ is a standard normal variable. This is the representation 3.2.1 obtained
previously for univariate AL laws.
6.3.2 Polar representation
Note that AL laws with $\mathbf{m} = \mathbf{0}$ are elliptically contoured (EC), as their ch.f.
depends on $\mathbf{t}$ only through the quadratic form $\mathbf{t}'\Sigma\mathbf{t}$. The class of elliptically
symmetric distributions consists of EC laws with non-singular $\Sigma$ and the
density
$$f(\mathbf{x}) = k_d|\Sigma|^{-1/2}g[(\mathbf{x} - \mathbf{m})'\Sigma^{-1}(\mathbf{x} - \mathbf{m})], \quad (6.3.5)$$
where $g$ is a one-dimensional real-valued function (independent of $d$) and $k_d$
is a proportionality constant [see, e.g., Fang et al. (1990)]. We shall denote
the laws with the density (6.3.5) by $EC_d(\mathbf{m}, \Sigma, g)$. It is well known that
every r.v. $\mathbf{Y} \sim EC_d(\mathbf{0}, \Sigma, g)$ admits the polar representation
$$\mathbf{Y} \stackrel{d}{=} RH\mathbf{U}^{(d)}, \quad (6.3.6)$$
where $H$ is a $d \times d$ matrix such that $HH' = \Sigma$, $R$ is a positive r.v. independent
of $\mathbf{U}^{(d)}$ (having the distribution of $\sqrt{\mathbf{Y}'\Sigma^{-1}\mathbf{Y}}$), and $\mathbf{U}^{(d)}$ is
a r.v. uniformly distributed on the sphere $S_d$. Thus, $H\mathbf{U}^{(d)}$ is uniformly
distributed on the surface of the hyperellipsoid
$$\{\mathbf{y} \in \mathbb{R}^d : \mathbf{y}'\Sigma^{-1}\mathbf{y} = 1\}.$$
Our next basic result identifies the distribution of $R$ in the class of AL
distributed variables $\mathbf{Y}$ [see Kotz et al. (2000b)].
Proposition 6.3.1 Let $\mathbf{Y} \sim AL_d(\mathbf{0}, \Sigma)$, where $|\Sigma| > 0$. Then, $\mathbf{Y}$ admits
the polar representation (6.3.6), where $H$ is a $d \times d$ matrix such that $HH' = \Sigma$, $\mathbf{U}^{(d)}$ is a r.v. uniformly distributed on the sphere $S_d$, and $R$ is a positive
r.v. independent of $\mathbf{U}^{(d)}$ with the density
$$f_R(x) = \frac{2x^{d/2}K_{d/2-1}(\sqrt{2}x)}{(\sqrt{2})^{d/2-1}\Gamma(d/2)}, \quad x > 0, \quad (6.3.7)$$
where $K_v$ is the modified Bessel function of the third kind defined by (A.0.4)
in Appendix A.
Proof. By Theorem 6.3.1, $\mathbf{Y}$ has the representation (6.3.1) with $\mathbf{m} = \mathbf{0}$.
Write $\Sigma = HH'$, where $H$ is a $d \times d$ non-singular lower triangular matrix
[see, e.g., Devroye (1986), p. 566, for a recipe for constructing such a matrix from
a given non-singular $\Sigma$]. Then, the r.v. $\mathbf{X} \sim N_d(\mathbf{0}, \Sigma)$ in (6.3.1) has the
representation $\mathbf{X} = H\mathbf{N}$, where $\mathbf{N} \sim N_d(\mathbf{0}, I)$. Further, the r.v. $\mathbf{N}$, which is
EC, has the well-known representation $\mathbf{N} \stackrel{d}{=} R_N\mathbf{U}^{(d)}$, where $R_N$ and $\mathbf{U}^{(d)}$
are independent, $\mathbf{U}^{(d)}$ is uniformly distributed on $S_d$, while $R_N$ is positive
with density
$$f_{R_N}(x) = \frac{d\cdot x^{d-1}\exp(-x^2/2)}{2^{d/2}\Gamma(d/2+1)}, \quad x > 0 \quad (6.3.8)$$
(it is distributed as the square root of a chi-squared r.v. with $d$ degrees
of freedom). Therefore, it is sufficient to show that $W^{1/2}R_N$ has density
(6.3.7). To this end, apply the standard transformation theorem to write the
density of $W^{1/2}R_N$ as
$$f_{W^{1/2}R_N}(y) = d\,y\int_0^\infty \frac{x^{d/2-2}\exp\left(-\frac{1}{2}(x + 2y^2/x)\right)}{2^{d/2}\Gamma(d/2+1)}\,dx. \quad (6.3.9)$$
Let $f_{\lambda,\chi,\psi}$ be the GIG density (6.3.2) with $\psi = 1$, $\chi = 2y^2$, and $\lambda = d/2 - 1$.
Then, relation (6.3.9) becomes
$$f_{W^{1/2}R_N}(y) = \frac{2\,d\cdot yK_\lambda(\sqrt{2}y)}{2^{d/2}\Gamma(d/2+1)(\psi/\chi)^{\lambda/2}}\int_0^\infty f_{\lambda,\chi,\psi}(x)\,dx, \quad (6.3.10)$$
which yields (6.3.7) since the function $f_{\lambda,\chi,\psi}$ integrates to one.
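The representation $R \stackrel{d}{=} W^{1/2}R_N$ used in the proof gives a quick numerical handle on (6.3.7). A sketch (our illustration, assuming NumPy): since $E[W] = 1$ and $E[R_N^2] = d$, independence gives $E[R^2] = d$:

```python
import numpy as np

rng = np.random.default_rng(3)
d, n = 4, 500000
w = rng.exponential(size=n)               # W ~ Exp(1)
rn = np.sqrt(rng.chisquare(d, size=n))    # R_N: square root of a chi-square(d)
r = np.sqrt(w) * rn                       # R, with density (6.3.7)
```

The sample mean of $R^2$ should be close to $d = 4$.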
Remark 6.3.3 In case $d = 1$, where the AL law has ch.f. $\psi(t) = (1 + \sigma_{11}t^2/2)^{-1}$, the r.v. $U^{(1)}$ takes on values $\pm 1$ with probabilities 1/2, while
the Bessel function simplifies to
$$K_{1/2}(\sqrt{2}y) = \sqrt{\pi/2}\,\exp(-\sqrt{2}y)/(\sqrt{2}y)^{1/2}$$
[see formula (A.0.11) in Appendix A]. Thus, $R \stackrel{d}{=} (1/\sqrt{2})W$, where $W$ is a
standard exponential variable. Consequently, the right-hand side of (6.3.6)
becomes $\sqrt{\sigma_{11}/2}\cdot WU^{(1)}$, and we obtain the representation of symmetric
Laplace r.v.'s already discussed in Section 2.2 of Chapter 2.
6.3.3 Subordinated Brownian motion
All AL r.v.'s can be interpreted as values of a subordinated Gaussian process.
More precisely, if $\mathbf{Y} \sim AL_d(\mathbf{m}, \Sigma)$, then
$$\mathbf{Y} \stackrel{d}{=} \mathbf{X}(W),$$
where $\mathbf{X}$ is a $d$-dimensional Gaussian process with independent increments,
$\mathbf{X}(0) = \mathbf{0}$, and $\mathbf{X}(1) \sim N_d(\mathbf{m}, \Sigma)$. This follows immediately from evaluating
the characteristic function on the right-hand side through conditioning on
the exponential random variable $W$. Consequently, AL distributions may
be studied via the theory of (stopped) Lévy processes [see Bertoin (1996)].
6.4 Simulation algorithm
The problem of random number generation for symmetric Laplace laws was
posed in Devroye (1986) and reiterated in Johnson (1987): "...variate generation
has not been explicitly worked out for (the bivariate Laplace and
generalized Laplace distributions) in the literature." However, simulation
of generalized hyperbolic random variables was studied earlier by Atkinson
(1982). The algorithms were based on the normal mixture representations
of the distributions under consideration. In this sense, in principle,
the problem of simulation for multivariate AL distributions was resolved.
However, the solution cannot be considered an explicit one, since the
fact that AL distributions can be obtained as a limiting case of hyperbolic
distributions is not commonly known.
To state the simulation algorithm for the general multivariate AL distributions
we shall use the representation (6.3.1). The approach is quite straightforward
[see Kozubowski and Podgórski (1999b)], as both exponential and
multivariate normal variates are relatively easy to generate, and appropriate
procedures are by now implemented in all standard statistical packages.
An $AL_d(\mathbf{m}, \Sigma)$ generator.
• Generate a standard exponential variate $W$.
• Independently of $W$, generate a multivariate normal $N_d(\mathbf{0}, \Sigma)$ variate $\mathbf{N}$.
• Set $\mathbf{Y} \leftarrow \mathbf{m}\cdot W + \sqrt{W}\cdot\mathbf{N}$.
• RETURN $\mathbf{Y}$.
This algorithm and pseudo-random samples of normal and exponential
random variables obtained from the S-Plus package were used to produce
the graphs of bivariate Laplace distributions in Figures 5.3, 5.4, 6.3, and 6.4.
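The steps above can be sketched in a few lines of code (our rendering in Python rather than S-Plus; assumes NumPy):

```python
import numpy as np

def rald(n, m, sigma, rng=None):
    """Draw n variates from AL_d(m, Sigma) via Y = m*W + sqrt(W)*N,
    with W ~ Exp(1) and N ~ N_d(0, Sigma) independent (Theorem 6.3.1)."""
    rng = np.random.default_rng() if rng is None else rng
    m = np.asarray(m, dtype=float)
    w = rng.exponential(size=n)
    nvar = rng.multivariate_normal(np.zeros(len(m)), sigma, size=n)
    return m * w[:, None] + np.sqrt(w)[:, None] * nvar
```

For a large sample, the sample mean should approach $\mathbf{m}$ and the sample covariance should approach $\Sigma + \mathbf{m}\mathbf{m}'$, as derived in Section 6.5.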
6.5 Moments and densities
6.5.1 Mean vector and covariance matrix
The relation between the mean vector $E\mathbf{Y}$, the covariance matrix $\mathrm{Cov}(\mathbf{Y})$
and the parameters $\mathbf{m}$ and $\Sigma$ can easily be obtained from the representation
(6.3.4). We have $EY_i = m_i$, so that
$$E(\mathbf{Y}) = \mathbf{m}.$$
Furthermore, the variance-covariance matrix of $\mathbf{Y}$ is
$$\mathrm{Cov}(\mathbf{Y}) = \Sigma + \mathbf{m}\mathbf{m}'.$$
Indeed, since $E(X_iX_j) = \sigma_{ij}$ and $EW^2 = 2$, we have
$$E(Y_iY_j) = E[(m_iW + W^{1/2}X_i)(m_jW + W^{1/2}X_j)] = m_im_jEW^2 + E(W)E(X_iX_j) = 2m_im_j + \sigma_{ij}.$$
Thus,
$$\mathrm{Cov}(Y_i, Y_j) = E(Y_iY_j) - E(Y_i)E(Y_j) = 2m_im_j + \sigma_{ij} - m_im_j = m_im_j + \sigma_{ij}.$$
6.5.2 Densities in the general case
In this section we study AL densities (assuming that the distribution is
non-singular). The representation given in Theorem 6.3.1, coupled with
conditioning on the exponential variable $W$, produces a relation between
the distribution functions and the densities of AL and multivariate normal
random vectors. Let $G(\cdot)$ and $F(\cdot)$ be the d.f.'s of $AL_d(\mathbf{m}, \Sigma)$ and $N_d(\mathbf{0}, \Sigma)$
r.v.'s, respectively, and let $g(\cdot)$ and $f(\cdot)$ be the corresponding densities.
Corollary 6.5.1 Let $\mathbf{Y} \sim AL_d(\mathbf{m}, \Sigma)$. The distribution function and the
density (if it exists) of $\mathbf{Y}$ can be expressed as follows:
$$G(\mathbf{y}) = \int_0^\infty F(z^{-1/2}\mathbf{y} - z^{1/2}\mathbf{m})e^{-z}\,dz,$$
$$g(\mathbf{y}) = \int_0^\infty f(z^{-1/2}\mathbf{y} - z^{1/2}\mathbf{m})z^{-d/2}e^{-z}\,dz. \quad (6.5.1)$$
We can express an AL density in terms of the modified Bessel function of
the third kind (see the definition in Appendix A). By (6.5.1), the density
of $\mathbf{Y} \sim AL_d(\mathbf{m}, \Sigma)$ becomes
$$g(\mathbf{y}) = (2\pi)^{-d/2}|\Sigma|^{-1/2}\int_0^\infty \exp\left(-\frac{(\mathbf{y} - z\mathbf{m})'\Sigma^{-1}(\mathbf{y} - z\mathbf{m})}{2z} - z\right)z^{-d/2}\,dz. \quad (6.5.2)$$
For $\mathbf{y} = \mathbf{0}$, we arrive at
$$g(\mathbf{0}) = (2\pi)^{-d/2}|\Sigma|^{-1/2}\int_0^\infty \exp\left(-z\left(\tfrac{1}{2}\mathbf{m}'\Sigma^{-1}\mathbf{m} + 1\right)\right)z^{-d/2}\,dz,$$
so that the density blows up at zero unless $d = 1$. For $\mathbf{y} \neq \mathbf{0}$, we can simplify
the exponential part of the integrand and substitute $w = z(1 + \mathbf{m}'\Sigma^{-1}\mathbf{m}/2)$
in (6.5.2) to obtain
$$g(\mathbf{y}) = \frac{e^{\mathbf{y}'\Sigma^{-1}\mathbf{m}}\left(1 + \frac{1}{2}\mathbf{m}'\Sigma^{-1}\mathbf{m}\right)^{d/2-1}}{(2\pi)^{d/2}|\Sigma|^{1/2}}\int_0^\infty \exp\left(-\frac{a^2}{4z} - z\right)z^{-(d-2)/2-1}\,dz,$$
where $a = \sqrt{(2 + \mathbf{m}'\Sigma^{-1}\mathbf{m})(\mathbf{y}'\Sigma^{-1}\mathbf{y})}$. Taking into account the integral
representation (A.0.4) of the corresponding Bessel function (see Appendix
A), we finally obtain the following basic result.
Theorem 6.5.1 The density of $\mathbf{Y} \sim AL_d(\mathbf{m}, \Sigma)$ can be expressed as follows:
$$g(\mathbf{y}) = \frac{2e^{\mathbf{y}'\Sigma^{-1}\mathbf{m}}}{(2\pi)^{d/2}|\Sigma|^{1/2}}\left(\frac{\mathbf{y}'\Sigma^{-1}\mathbf{y}}{2 + \mathbf{m}'\Sigma^{-1}\mathbf{m}}\right)^{v/2}K_v\left(\sqrt{(2 + \mathbf{m}'\Sigma^{-1}\mathbf{m})(\mathbf{y}'\Sigma^{-1}\mathbf{y})}\right), \quad (6.5.3)$$
where $v = (2-d)/2$ and $K_v(u)$ is the modified Bessel function of the third
kind given by (A.0.4) or (A.0.5) in Appendix A.
Remark 6.5.1 The above density is a limiting case of the generalized hyperbolic
density
$$\frac{\xi^\lambda \exp(\boldsymbol{\beta}'(\mathbf{x} - \boldsymbol{\mu}))\,K_{d/2-\lambda}\left(\alpha\sqrt{\delta^2 + (\mathbf{x} - \boldsymbol{\mu})'\Sigma^{-1}(\mathbf{x} - \boldsymbol{\mu})}\right)}{(2\pi)^{d/2}|\Sigma|^{1/2}\delta^\lambda K_\lambda(\delta\xi)\left[\sqrt{\delta^2 + (\mathbf{x} - \boldsymbol{\mu})'\Sigma^{-1}(\mathbf{x} - \boldsymbol{\mu})}\,/\alpha\right]^{d/2-\lambda}} \quad (6.5.4)$$
with $\lambda = 1$, $\xi^2 = 2$, $\delta^2 = 0$, $\boldsymbol{\mu} = \mathbf{0}$, $\boldsymbol{\beta} = \Sigma^{-1}\mathbf{m}$, and $\alpha = \sqrt{2 + \mathbf{m}'\Sigma^{-1}\mathbf{m}}$
(see the remarks following Theorem 6.3.1). Note that in the case $\delta = 0$ we use
the asymptotic relation (A.0.12) given in the Appendix.
6.5.3 Densities in the symmetric case
In the symmetric case ($\mathbf{m} = \mathbf{0}$), we obtain the density (5.2.2) of the $SL_d(\Sigma)$
distribution:
$$g(\mathbf{y}) = 2(2\pi)^{-d/2}|\Sigma|^{-1/2}\left(\mathbf{y}'\Sigma^{-1}\mathbf{y}/2\right)^{v/2}K_v\left(\sqrt{2\,\mathbf{y}'\Sigma^{-1}\mathbf{y}}\right).$$
6.5.4 Densities in the one-dimensional case
If $d = 1$, we have $\Sigma = \sigma_{11} = \sigma^2$ and the ch.f. corresponds to a univariate
$AL(\mu, \sigma)$ distribution with $\sigma^2 = \Sigma$ and $\mu = m$. In this case we have
$v = 1/2$, and the Bessel function simplifies as in (A.0.11). Consequently,
the density becomes
$$g(y) = \frac{1}{\gamma}e^{-\frac{|y|}{\sigma^2}(\gamma - \mu\cdot\mathrm{sign}(y))},$$
where $\gamma = \sqrt{\mu^2 + 2\sigma^2}$, and coincides with the density of a univariate AL
distribution given by (3.1.10) with $\theta = 0$. In the symmetric case ($\mu = 0$), it
gives the density of a univariate Laplace distribution with mean zero and
variance $\sigma^2$.
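The univariate closed form lends itself to a quick numerical check (our sketch, not from the text; assumes NumPy): the density should integrate to one and have mean $\mu$:

```python
import numpy as np

def al1_density(y, mu, s):
    """Univariate AL(mu, sigma) density in the closed form of this section."""
    gamma = np.sqrt(mu ** 2 + 2 * s ** 2)
    return np.exp(-np.abs(y) * (gamma - mu * np.sign(y)) / s ** 2) / gamma

# Midpoint-rule integration; the grid offset avoids the kink at y = 0.
h = 0.001
y = np.arange(-40, 40, h) + h / 2
dens = al1_density(y, 0.5, 1.0)
total = dens.sum() * h        # should be ~1
mean = (y * dens).sum() * h   # should be ~mu = 0.5
```

Both checks follow from direct integration of the two exponential branches of the density.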
6.5.5 Densities in the case of odd dimension
If $d$ is odd, the density can be written in a closed form. Indeed, suppose $d = 2r + 3$, where $r = 0, 1, 2, \dots$, so that $v = (2-d)/2 = -r - 1/2$. Since $K_v(u) = K_{-v}(u)$ and the Bessel function $K_v$ with $v = r + 1/2$ has an explicit form (A.0.10) given in Appendix A, the AL density (6.5.3) becomes
\[
g(y) = \frac{C^r\exp\big(y'\Sigma^{-1}m - C\sqrt{y'\Sigma^{-1}y}\big)}{\big(2\pi\sqrt{y'\Sigma^{-1}y}\big)^{r+1}|\Sigma|^{1/2}}\sum_{k=0}^{r}\frac{(r+k)!}{(r-k)!\,k!}\big(2C\sqrt{y'\Sigma^{-1}y}\big)^{-k}, \quad y \neq 0,
\]
where $v = (2-d)/2$ and $C = \sqrt{2 + m'\Sigma^{-1}m}$.
The density has a particularly simple form in three-dimensional space ($d = 3$), where we have $r = 0$ and
\[
g(y) = \frac{\exp\big(y'\Sigma^{-1}m - C\sqrt{y'\Sigma^{-1}y}\big)}{2\pi\sqrt{y'\Sigma^{-1}y}\,|\Sigma|^{1/2}}, \quad y \neq 0.
\]
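The three-dimensional closed form can be checked against the Bessel form of (6.5.3) directly; the following sketch (an illustration added here, assuming SciPy) uses $K_{-1/2} = K_{1/2}$:

```python
import numpy as np
from scipy.special import kv

Sigma = np.array([[2.0, 0.5, 0.1],
                  [0.5, 1.5, 0.3],
                  [0.1, 0.3, 1.0]])
m = np.array([0.4, -0.2, 0.1])
Sinv = np.linalg.inv(Sigma)
C = np.sqrt(2.0 + m @ Sinv @ m)

def g_bessel(y):
    # general AL density (6.5.3) with d = 3, so v = (2 - d)/2 = -1/2
    Q = np.sqrt(y @ Sinv @ y)
    return (2.0 * np.exp(y @ Sinv @ m) / ((2*np.pi)**1.5 * np.linalg.det(Sigma)**0.5)
            * (Q / C)**(-0.5) * kv(-0.5, Q * C))

def g_closed(y):
    # closed form for d = 3 (r = 0)
    Q = np.sqrt(y @ Sinv @ y)
    return np.exp(y @ Sinv @ m - C * Q) / (2*np.pi * Q * np.linalg.det(Sigma)**0.5)

y = np.array([0.7, -1.2, 0.4])
assert abs(g_bessel(y) - g_closed(y)) < 1e-14
```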
6.6 Unimodality
6.6.1 Unimodality
We already know that all univariate AL distributions are unimodal with the mode at zero. There are many nonequivalent notions of unimodality for probability distributions in $R^d$ [see, e.g., Dharmadhikari and Joag-Dev (1988)]. A natural extension of univariate unimodality is star unimodality in $R^d$, which for a distribution with continuous density $f$ requires that $f$ be non-increasing along rays emanating from zero. Here is an exact criterion for star unimodality due to Dharmadhikari and Joag-Dev (1988).

Criterion 1 A distribution $P$ with continuous density $f$ on $R^d$ is star unimodal about zero if and only if whenever
\[
0 < t < u < \infty \quad \text{and} \quad x \neq 0
\]
then
\[
f(ux) \le f(tx).
\]
It is clear from its statement that the criterion remains valid for densities discontinuous at zero as well. We shall show below that all truly $d$-dimensional AL laws are star unimodal about zero.
Proposition 6.6.1 Let $Y \sim AL_d(m, \Sigma)$ with $|\Sigma| > 0$. Then the distribution of $Y$ is star unimodal about 0.
Proof. Assume that $d > 1$ and let $x \neq 0$. For $t > 0$ define $h(t) = \log g(tx)$, where $g$ is the density of $Y$ given by (6.5.3). Write
\[
h(t) = \log C_1 + C_2 t + v\log t + \log K_v(C_3 t),
\]
where $v = 1 - d/2$ and the constants $C_1$, $C_2$, and $C_3$ are given by
\[
C_1 = \frac{2\,(x'\Sigma^{-1}x)^{v/2}}{(2\pi)^{d/2}|\Sigma|^{1/2}(2 + m'\Sigma^{-1}m)^{v/2}} > 0, \quad
C_2 = m'\Sigma^{-1}x \in R, \quad
C_3 = \sqrt{2 + m'\Sigma^{-1}m}\,\sqrt{x'\Sigma^{-1}x} > 0.
\]
It is required to show that $h$ is a non-increasing function of $t$. The derivative of $h$ with respect to $t$ is
\[
\frac{d}{dt}h(t) = C_2 + \frac{v}{t} + \frac{K'_v(C_3 t)}{K_v(C_3 t)}\,C_3. \qquad (6.6.1)
\]
Use the properties (A.0.8)-(A.0.9) of the Bessel function $K_v$ (listed in Appendix A) to write (6.6.1) as
\[
\frac{d}{dt}h(t) = C_2 - \frac{K_{v-1}(C_3 t)}{K_v(C_3 t)}\,C_3. \qquad (6.6.2)
\]
If $C_2 < 0$, then (6.6.2) implies that $h'(t) \le 0$, since the Bessel function $K_v$ is always positive and $C_3 > 0$. Otherwise, write $\Sigma^{-1} = Q'Q$ and use the Cauchy-Schwarz inequality to conclude that
\[
|C_2| = |(Qm)'(Qx)| \le \|Qm\|\cdot\|Qx\| = \sqrt{m'\Sigma^{-1}m}\,\sqrt{x'\Sigma^{-1}x} < C_3.
\]
Thus, the conclusion $h'(t) \le 0$ will follow if we show that the ratio
\[
\frac{K_{v-1}(C_3 t)}{K_v(C_3 t)}
\]
is greater than or equal to one. Since for any $v$, $K_v(x) = K_{-v}(x)$, this is equivalent to showing that
\[
K_{-v}(C_3 t) \le K_{-v+1}(C_3 t).
\]
This is indeed true, since $-v \ge 0$ (as $d > 1$), and by using Property 3 of Bessel functions listed in Appendix A we obtain the desired inequality.
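The monotonicity along rays can also be observed numerically; the following sketch (added here for illustration, assuming SciPy) evaluates the bivariate AL density along several rays:

```python
import numpy as np
from scipy.special import kv

Sigma = np.array([[1.5, 0.4], [0.4, 1.0]])
m = np.array([0.6, -0.3])
Sinv = np.linalg.inv(Sigma)
Cm = np.sqrt(2.0 + m @ Sinv @ m)

def g(y):
    # AL_2(m, Sigma) density (6.5.3); for d = 2, v = (2 - d)/2 = 0
    Q = np.sqrt(y @ Sinv @ y)
    return np.exp(y @ Sinv @ m) / (np.pi * np.linalg.det(Sigma)**0.5) * kv(0.0, Q * Cm)

# along any ray t -> t*x the density should be non-increasing in t
for x in (np.array([1.0, 0.0]), np.array([-0.5, 1.2]), np.array([0.3, -0.7])):
    ts = np.linspace(0.05, 8.0, 400)
    vals = np.array([g(t * x) for t in ts])
    assert np.all(np.diff(vals) <= 1e-12)
```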
Remark 6.6.1 Any AL r.v. $Y$ is linear unimodal about 0, in the sense that every linear combination $c'Y$ is univariate unimodal about zero [see Definition 2.3 of Dharmadhikari and Joag-Dev (1988)]. This follows from part (iii) of Corollary 6.8.1, since all univariate AL laws are unimodal about zero.
6.6.2 A related representation
A univariate r.v. $Y$ is unimodal about zero if and only if it has the representation $Y \stackrel{d}{=} UX$, where $U$ and $X$ are independent and $U$ is uniformly distributed on $(0, 1)$ [see, e.g., Shepp (1962)]. Similarly, every star unimodal (about 0) r.v. in $R^d$ has the representation $Y \stackrel{d}{=} U^{1/d}X$, where $U$ is as before and is independent of $X$ [see Dharmadhikari and Joag-Dev (1988), Theorem 2.1]. Below we identify the distribution of $X$ in the case of a symmetric AL r.v. $Y$. Let $Y \sim AL_d(0, \Sigma)$ with $|\Sigma| > 0$. From the proof of Proposition 6.3.1 we have the representation $Y \stackrel{d}{=} W^{1/2}R_N H U^{(d)}$, where $H$ is a matrix satisfying $\Sigma = HH'$, $U^{(d)}$ is uniform on the unit sphere $S_d$, $W$ is standard exponential, $R_N$ has the density (6.3.8), and all variables are independent. Note that $R_N \stackrel{d}{=} V^{1/d}$, where $V$ has density
\[
f_V(x) = \frac{\exp(-x^{2/d}/2)}{2^{d/2}\,\Gamma(d/2+1)}, \quad x > 0.
\]
The density of $V$ is unimodal, hence by Shepp (1962) it has the representation $V \stackrel{d}{=} US$ for some $S$ (where $U$ is standard uniform and independent of $S$). It can be shown by routine calculations that the density of $S$ is
\[
f_S(x) = \frac{x^{2/d}\exp(-x^{2/d}/2)}{d\,2^{d/2}\,\Gamma(d/2+1)}, \quad x > 0.
\]
Thus, we have $Y \stackrel{d}{=} U^{1/d}\big(W^{1/2}S^{1/d}HU^{(d)}\big)$. The density of $W^{1/2}S^{1/d}$ is readily obtained as well to be
\[
f_{W^{1/2}S^{1/d}}(x) = \frac{2\,x^{d/2+1}\,K_{d/2}(\sqrt{2}\,x)}{2^{d/4}\,\Gamma(d/2+1)}, \quad x > 0. \qquad (6.6.3)
\]
The following statement summarizes this discussion.

Theorem 6.6.1 Let $Y \sim AL_d(0, \Sigma)$, where $|\Sigma| > 0$ and $\Sigma = HH'$. Then $Y$ admits the representation
\[
Y \stackrel{d}{=} U^{1/d}X,
\]
where $U$ and $X$ are independent, $U$ is uniform on $(0, 1)$, while $X$ is elliptically symmetric with the representation $X \stackrel{d}{=} R_X H U^{(d)}$, where $U^{(d)}$ is uniform on $S_d$ while $R_X$ has density (6.6.3).
6.7 Conditional distributions
6.7.1 Conditional distributions
Below we obtain the conditional distributions of $Y \sim AL_d(m, \Sigma)$ with a non-singular $\Sigma$. The derivation is similar to that for the case of the multivariate generalized hyperbolic distribution; see Blaesild (1981). It turns out that the conditional laws are not AL, but generalized hyperbolic ones. However, the conditional distributions can be AL if $Y$ has a multivariate k-Bessel function distribution (6.9.1), discussed in Section 6.9. The conditional distributions of multivariate AL laws are given in the following result [Kotz et al. (2000b)].
Theorem 6.7.1 Let $Y \sim GAL_d(m, \Sigma, s)$ have ch.f. (6.9.1) [see Section 6.9] with non-singular $\Sigma$. Let $Y' = (Y_1', Y_2')$ be a partition of $Y$ into $r \times 1$ and $k \times 1$ dimensional sub-vectors, respectively. Let $(m_1', m_2')$ and
\[
\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}
\]
be the corresponding partitions of $m$ and $\Sigma$, where $\Sigma_{11}$ is an $r \times r$ matrix. Then:
(i) If $s = 1$ (so that $Y$ is AL), then the conditional distribution of $Y_2$ given $Y_1 = y_1$ is the generalized $k$-dimensional hyperbolic distribution $H_k(\lambda, \alpha, \beta, \delta, \mu, \Delta)$ having the density
\[
p(y_2|y_1) = \frac{\xi^{\lambda}\exp(\beta'(y_2-\mu))\,K_{k/2-\lambda}\big(\alpha\sqrt{\delta^2+(y_2-\mu)'\Delta^{-1}(y_2-\mu)}\big)}{(2\pi)^{k/2}|\Delta|^{1/2}\,\delta^{\lambda}K_{\lambda}(\delta\xi)\,\big[\sqrt{\delta^2+(y_2-\mu)'\Delta^{-1}(y_2-\mu)}\,/\alpha\big]^{k/2-\lambda}}, \qquad (6.7.1)
\]
where $\lambda = 1 - r/2$, $\alpha = \sqrt{\xi^2 + \beta'\Delta\beta}$, $\beta = \Delta^{-1}(m_2 - \Sigma_{21}\Sigma_{11}^{-1}m_1)$, $\delta = \sqrt{y_1'\Sigma_{11}^{-1}y_1}$, $\mu = \Sigma_{21}\Sigma_{11}^{-1}y_1$, $\Delta = \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$, and $\xi = \sqrt{2 + m_1'\Sigma_{11}^{-1}m_1}$;
(ii) If $m_1 = 0$, then the conditional distribution of $Y_2$ given $Y_1 = 0$ is $GAL_k(m_{2\cdot1}, \Sigma_{2\cdot1}, s_{2\cdot1})$, where
\[
s_{2\cdot1} = s - r/2, \quad \Sigma_{2\cdot1} = \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}, \quad m_{2\cdot1} = m_2.
\]
Proof. We shall sketch the proof of part (i) [the proof of part (ii) is similar]. By part (i) of Corollary 6.8.1 with $n = r$, the r.v. $Y_1$ is $AL_r(m_1, \Sigma_{11})$. Write the densities of $Y$ and $Y_1$ according to (6.5.3) and simplify the ratio of the densities utilizing the familiar relations from classical multivariate analysis:
\[
y'\Sigma^{-1}m = y_1'\Sigma_{11}^{-1}m_1 + (m_2 - \Sigma_{21}\Sigma_{11}^{-1}m_1)'\Delta^{-1}(y_2 - \Sigma_{21}\Sigma_{11}^{-1}y_1),
\]
\[
y'\Sigma^{-1}y = y_1'\Sigma_{11}^{-1}y_1 + (y_2 - \Sigma_{21}\Sigma_{11}^{-1}y_1)'\Delta^{-1}(y_2 - \Sigma_{21}\Sigma_{11}^{-1}y_1),
\]
\[
m'\Sigma^{-1}m = m_1'\Sigma_{11}^{-1}m_1 + (m_2 - \Sigma_{21}\Sigma_{11}^{-1}m_1)'\Delta^{-1}(m_2 - \Sigma_{21}\Sigma_{11}^{-1}m_1),
\]
\[
|\Sigma| = |\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}|\cdot|\Sigma_{11}|.
\]
Finally, verify that $\alpha^2 = \beta'\Delta\beta + \xi^2$.
Remark 6.7.1 Note that in view of part (i) of the theorem, the parameter $\lambda$ cannot equal one. Hence, in the case of a multivariate AL distribution no conditional law can be AL. However, in part (ii) we might have $s - r/2 = 1$, in which case we do obtain a conditional AL law for a multivariate generalized AL distribution.
6.7.2 Conditional mean and covariance matrix
Since the conditional distributions of an AL r.v. are generalized hyperbolic distributions, we can derive expressions for the conditional mean vector and covariance matrix via the theory of hyperbolic distributions.
Proposition 6.7.1 Let $Y$ have a GAL law (6.9.1) with a non-singular $\Sigma$. Let $Y$, $m$, and $\Sigma$ be partitioned as in Theorem 6.7.1. Then,
\[
E(Y_2|Y_1 = y_1) = \Sigma_{21}\Sigma_{11}^{-1}y_1 + (m_2 - \Sigma_{21}\Sigma_{11}^{-1}m_1)\,\frac{Q(y_1)}{C}\,R_{1-r/2}(CQ(y_1))
\]
and
\[
Var(Y_2|Y_1 = y_1) = \frac{Q(y_1)}{C}\,(\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12})\,R_{1-r/2}(CQ(y_1))
+ (m_2 - \Sigma_{21}\Sigma_{11}^{-1}m_1)(m_2 - \Sigma_{21}\Sigma_{11}^{-1}m_1)'\,\frac{Q^2(y_1)}{C^2}\,G(y_1),
\]
where $C = \sqrt{2 + m_1'\Sigma_{11}^{-1}m_1}$, $Q(y_1) = \sqrt{y_1'\Sigma_{11}^{-1}y_1}$, $R_s(x) = K_{s+1}(x)/K_s(x)$, and
\[
G(y_1) = R_{1-r/2}(CQ(y_1))\,R_{2-r/2}(CQ(y_1)) - R^2_{1-r/2}(CQ(y_1)).
\]
Proof. Our outline of the proof follows Kotz et al. (2000b). Apply Theorem 6.7.1 and utilize the representation (6.3.3) of the generalized hyperbolic distribution to conclude that $E(Y_2|Y_1 = y_1) = \mu + \Delta\beta\,E(W)$ and $Var(Y_2|Y_1 = y_1) = \Delta\beta(\Delta\beta)'\,Var(W) + \Delta\,E(W)$, where $W$ has the $GIG(\lambda, \delta^2, \xi^2)$ distribution (6.3.2) and $\mu$, $\beta$, $\Delta$, $\delta$, and $\xi$ are as given in Theorem 6.7.1. Then apply the well-known formulas for the moments of $W$, $E(W^r) = (\delta/\xi)^r K_{\lambda+r}(\delta\xi)/K_{\lambda}(\delta\xi)$ [see, e.g., Barndorff-Nielsen and Blaesild (1981)].
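The conditional-density formula and the conditional-mean formula can be cross-checked numerically; the sketch below (added here, assuming SciPy; parameter names follow Theorem 6.7.1) takes $d = 2$ with $r = k = 1$:

```python
import numpy as np
from scipy.special import kv
from scipy.integrate import quad

# illustrative AL_2 parameters and a conditioning value y1 != 0
S11, S12, S22 = 2.0, 0.6, 1.0          # Sigma = [[S11, S12], [S12, S22]]
m1, m2 = 0.3, -0.4
y1 = 1.2

D = S22 - S12**2 / S11                 # Delta (a scalar here)
beta = (m2 - S12 * m1 / S11) / D
xi = np.sqrt(2.0 + m1**2 / S11)
alpha = np.sqrt(xi**2 + beta**2 * D)
delta = abs(y1) / np.sqrt(S11)
mu = S12 * y1 / S11
lam = 1.0 - 0.5                        # lambda = 1 - r/2 with r = 1

def p_cond(y2):
    # generalized hyperbolic conditional density (6.7.1) with k = 1
    q = np.sqrt(delta**2 + (y2 - mu)**2 / D)
    return (xi**lam * np.exp(beta * (y2 - mu)) * kv(0.5 - lam, alpha * q)
            / ((2*np.pi)**0.5 * np.sqrt(D) * delta**lam * kv(lam, delta * xi)
               * (q / alpha)**(0.5 - lam)))

total, _ = quad(p_cond, -50, 50)
mean, _ = quad(lambda y: y * p_cond(y), -50, 50)

# Proposition 6.7.1, with Q(y1) = delta and C = xi in this scalar case
R = lambda s, x: kv(s + 1, x) / kv(s, x)
mean_formula = mu + (m2 - S12 * m1 / S11) * (delta / xi) * R(lam, xi * delta)

assert abs(total - 1.0) < 1e-6
assert abs(mean - mean_formula) < 1e-6
```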
Remark 6.7.2 If $m_1'\Sigma_{11}^{-1}\Sigma_{12} = m_d$, then by Theorem 6.7.1 the conditional distribution of $Y_d$ given $(Y_1, \dots, Y_{d-1})$ is generalized hyperbolic and symmetric about $\mu = \Sigma_{21}\Sigma_{11}^{-1}y_1$ (since $\beta = 0$ in this case), which must be the mean of the conditional distribution. This provides an alternative way of proving the result on linear regression to be discussed below.
6.8 Linear transformations
6.8.1 Linear combinations
In this section we discuss the distribution of AL vectors under linear transformations. The next proposition shows that if $Y \sim AL_d(m, \Sigma)$, then all linear combinations of components of $Y$ are jointly AL.

Proposition 6.8.1 Let $Y = (Y_1, \dots, Y_d)' \sim AL_d(m, \Sigma)$. Let $A$ be an $l \times d$ real matrix. Then the random vector $AY$ is $AL_l(m_A, \Sigma_A)$, where $m_A = Am$ and $\Sigma_A = A\Sigma A'$.
Proof. The assertion follows from the general relation
\[
\Psi_{AY}(t) = E\,e^{i(AY)'t} = E\,e^{iY'A't} = \Psi_Y(A't)
\]
and the fact that the matrix $A\Sigma A'$ is non-negative definite whenever $\Sigma$ is.
Remark 6.8.1 Note that the proof is quite general and applies to any multivariate distribution whose ch.f. depends on $t$ only through the quadratic form $t'\Sigma t$ and the linear function $m't$. Thus, it applies to all elliptically contoured distributions with ch.f. (4.5.10), as well as to the so-called $\nu$-stable laws with ch.f.'s of the form $g(1 + t'\Sigma t - im't)$, where $g$ is a Laplace transform of a positive random variable [see, e.g., Kozubowski and Panorska (1998)].

It follows that all univariate and multivariate marginals, as well as linear combinations of the components of a multivariate AL vector, are AL.
Corollary 6.8.1 Let $Y = (Y_1, \dots, Y_d)' \sim AL_d(m, \Sigma)$, where $\Sigma = (\sigma_{ij})_{i,j=1}^{d}$. Then:
(i) For all $n \le d$, $(Y_1, \dots, Y_n)' \sim AL_n(\tilde m, \tilde\Sigma)$, where $\tilde m = (m_1, \dots, m_n)'$ and $\tilde\Sigma$ is an $n \times n$ matrix with $\tilde\sigma_{ij} = \sigma_{ij}$ for $i, j = 1, \dots, n$;
(ii) For any $b = (b_1, \dots, b_d)' \in R^d$, the r.v. $Y_b = \sum_{k=1}^{d} b_k Y_k$ is univariate $AL(\mu, \sigma)$ with $\sigma = \sqrt{b'\Sigma b}$ and $\mu = m'b$. Further, if $Y$ is symmetric AL, then so is $Y_b$;
(iii) For all $k \le d$, $Y_k \sim AL(\mu, \sigma)$ with $\sigma = \sqrt{\sigma_{kk}}$ and $\mu = m_k$.
Proof. Here is an outline of the proof. For part (i), apply Proposition 6.8.1 with the $n \times d$ matrix $A = (a_{ij})$ such that $a_{ii} = 1$ and $a_{ij} = 0$ for $i \neq j$. For part (ii), apply Proposition 6.8.1 with $l = 1$ and compare the resulting ch.f. with the characteristic function of the univariate asymmetric Laplace distribution. For part (iii), apply part (ii) to the standard basis vectors in $R^d$.
Remark 6.8.2 Corollary 6.8.1, part (ii), implies that the sum $\sum_{k=1}^{d} Y_k$ has an AL distribution if all the $Y_k$'s are components of a multivariate AL r.v. (and thus all the $Y_k$'s are univariate AL r.v.'s). This is in contrast with a sum of i.i.d. AL r.v.'s, which generally does not have an AL distribution.
Remark 6.8.3 Note that if $Y$ has a non-singular AL law (that is, $\Sigma$ is positive definite) and the matrix $A$ is such that $AA'$ is positive definite, then the vector $AY$ has a non-singular AL law as well. In particular, this holds if $A$ is a nonsingular square matrix.
We have shown in Corollary 6.8.1, part (ii), that if $Y$ is an AL r.v. in $R^d$, then all linear combinations of its components are univariate AL r.v.'s. A natural question is whether the converse is true. As of now, we do not have a complete answer to this question. The following result provides a partial answer for the case where all linear combinations are univariate $AL(\mu, \sigma)$ with either $\mu = 0$ (symmetric Laplace distribution) or $\sigma = 0$ (exponential distribution).
Theorem 6.8.1 Let $Y = (Y_1, \dots, Y_d)'$ be a r.v. in $R^d$. If all linear combinations $\sum_{k=1}^{d} c_k Y_k$ have either a symmetric Laplace or an exponential distribution, then $Y$ has an $AL_d(m, \Sigma)$ distribution with either $\Sigma = 0$ or $m = 0$.

Proof. The proof follows from the corresponding result for GS laws [see Kozubowski (1997), Theorem 3.3] and the fact that $AL_d(m, \Sigma)$ distributions with either $\Sigma = 0$ or $m = 0$ are strictly geometric stable.
6.8.2 Linear regression
Interestingly enough, the conditions for linearity of the regression of $Y_d$ on $Y_1, \dots, Y_{d-1}$, where $Y = (Y_1, \dots, Y_d)'$ is AL, coincide with those for multivariate normal laws.
Proposition 6.8.2 Let $Y = (Y_1, \dots, Y_d)' \sim AL_d(m, \Sigma)$. Let $m_1' = (m_1, \dots, m_{d-1})$ and let
\[
\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}
\]
be a partition of $\Sigma$ such that $\Sigma_{11}$ is a $(d-1) \times (d-1)$ matrix. Then
\[
E(Y_d|Y_1, \dots, Y_{d-1}) = a_1 Y_1 + \dots + a_{d-1}Y_{d-1} \quad (a.s.) \qquad (6.8.1)
\]
if and only if
\[
\Sigma_{11}a = \Sigma_{12} \quad \text{and} \quad m_1'a = m_d. \qquad (6.8.2)
\]
Moreover, in the case $|\Sigma| > 0$, condition (6.8.2) is equivalent to
\[
m_1'\Sigma_{11}^{-1}\Sigma_{12} = m_d \quad \text{and} \quad a = (a_1, \dots, a_{d-1})' = \Sigma_{11}^{-1}\Sigma_{12}.
\]
Proof. It is well known that, for a r.v. $Y$ with a finite mean, condition (6.8.1) holds if and only if
\[
\left.\frac{\partial\Psi(t)}{\partial t_d}\right|_{t_d=0} = a_1\left.\frac{\partial\Psi(t)}{\partial t_1}\right|_{t_d=0} + \dots + a_{d-1}\left.\frac{\partial\Psi(t)}{\partial t_{d-1}}\right|_{t_d=0},
\]
where $\Psi$ is the ch.f. of $Y$ [see, e.g., Miller (1978)]. Substitution of the AL ch.f. (6.2.1) into the above equation followed by differentiation results in (6.8.2). In the case $|\Sigma| > 0$, the solution of the first equation in (6.8.2) is $a = \Sigma_{11}^{-1}\Sigma_{12}$, which solves the second equation in (6.8.2) if and only if $m_1'\Sigma_{11}^{-1}\Sigma_{12} = m_d$.
Remark 6.8.4 The regression is always linear for m = 0.
6.9 Infinite divisibility properties
6.9.1 Infinite divisibility
The following result establishes the infinite divisibility of multivariate AL laws and identifies their Lévy measure.

Theorem 6.9.1 Let $Y$ have a non-degenerate $d$-dimensional $AL_d(m, \Sigma)$ law. Then the ch.f. of $Y$ is of the form
\[
\Psi(t) = \exp\left(\int_{R^d}\big(e^{it\cdot x} - 1\big)\,\Lambda(dx)\right)
\]
with
\[
\frac{d\Lambda}{dx}(x) = \frac{2\exp(m'\Sigma^{-1}x)}{(2\pi)^{d/2}|\Sigma|^{1/2}}\left(\frac{Q(x)}{C(\Sigma,m)}\right)^{-d/2}K_{d/2}\big(Q(x)\,C(\Sigma,m)\big),
\]
where
\[
Q(x) = \sqrt{x'\Sigma^{-1}x} \quad \text{and} \quad C(\Sigma, m) = \sqrt{2 + m'\Sigma^{-1}m}.
\]
Proof. Apply Proposition 4.1 from Kozubowski and Rachev (1999b), which identifies the density of a geometric stable Lévy measure, to obtain
\[
\frac{d\Lambda}{dx}(x) = \int_0^{\infty} f\big(z^{-1/2}x - z^{1/2}m\big)\,z^{-d/2-1}e^{-z}\,dz,
\]
where $f(\cdot)$ is the density of the multivariate normal $N_d(0, \Sigma)$ distribution with respect to the $d$-dimensional Lebesgue measure. Next, proceed similarly to the computation of AL densities described in Section 6.5. Alternatively, use the representation of $Y$ through subordinated Brownian motion and Lemma 7, VI.2 of Bertoin (1996), or use the fact that multivariate AL laws are mixtures of normal distributions by generalized gamma convolutions [cf. Exercise 2.7.61] and the corresponding results for the latter laws derived in Takano (1989).
Remark 6.9.1 Note that for any $d$ the density of an AL Lévy measure is unbounded at $x = 0$.
Remark 6.9.2 In the one-dimensional case ($d = 1$), writing $\sigma^2 = \Sigma$, $\mu = m$, and $\kappa = \sqrt{2}\,\sigma/(\mu + \sqrt{\mu^2 + 2\sigma^2})$, we have
\[
\frac{d\Lambda}{dx}(\pm x) = \frac{1}{x}\exp\left(-\frac{\sqrt{2}\,x}{\sigma}\,\kappa^{\pm 1}\right), \quad x > 0,
\]
which is the density (3.4.6) of the Lévy measure of univariate AL laws (see Section 3.4 of Chapter 3).
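The agreement between the general Lévy density of Theorem 6.9.1 at $d = 1$ and this $\kappa$-form can be verified numerically; a sketch (added here, assuming SciPy):

```python
import numpy as np
from scipy.special import kv

sigma, mu = 1.2, 0.7
gamma_ = np.sqrt(mu**2 + 2.0 * sigma**2)
kappa = np.sqrt(2.0) * sigma / (mu + gamma_)

def levy_general(x):
    # Theorem 6.9.1 specialized to d = 1: exponent -d/2 = -1/2 and K_{1/2}
    Q = abs(x) / sigma
    C = gamma_ / sigma                 # C = sqrt(2 + mu^2 / sigma^2)
    return (2.0 * np.exp(mu * x / sigma**2) / (np.sqrt(2.0 * np.pi) * sigma)
            * (Q / C)**(-0.5) * kv(0.5, Q * C))

def levy_univ(x):
    # univariate AL Levy density (3.4.6): (1/|x|) exp(-sqrt(2)|x| kappa^{+-1} / sigma)
    k = kappa if x > 0 else 1.0 / kappa
    return np.exp(-np.sqrt(2.0) * abs(x) * k / sigma) / abs(x)

for x in (-2.5, -0.4, 0.3, 1.8):
    assert abs(levy_general(x) - levy_univ(x)) < 1e-12
```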
6.9.2 Asymmetric Laplace motion
Since multivariate AL laws are infinitely divisible, similarly to the one-dimensional case one can define a Lévy process on $[0, \infty)$ with independent increments, the Laplace motion $\{Y(s), s \ge 0\}$, so that $Y(0) = 0$, $Y(1)$ is given by (6.2.1), while for $s > 0$ the ch.f. of $Y(s)$ is
\[
\Psi(t) = \left(\frac{1}{1 + \frac{1}{2}t'\Sigma t - im't}\right)^{s}, \quad s > 0, \qquad (6.9.1)
\]
[see, e.g., Teichroew (1957)]. Distributions on $R^d$ given by (6.9.1) will be called generalized asymmetric Laplace (GAL), and denoted $GAL_d(m, \Sigma, s)$. For $d = 1$ we obtain the Bessel function distribution studied in Section 4.1 of Chapter 4. A GAL r.v. admits the mixture representation (6.3.1), where $W$ has a gamma distribution with density
\[
g(x) = \frac{x^{s-1}}{\Gamma(s)}\,e^{-x}. \qquad (6.9.2)
\]
The density corresponding to (6.9.1) can be expressed in terms of the Bessel function as follows:
\[
p(x) = \frac{2\exp(m'\Sigma^{-1}x)}{(2\pi)^{d/2}\,\Gamma(s)\,|\Sigma|^{1/2}}\left(\frac{Q(x)}{C(\Sigma,m)}\right)^{s-d/2}K_{s-d/2}\big(Q(x)\,C(\Sigma,m)\big), \qquad (6.9.3)
\]
where
\[
Q(x) = \sqrt{x'\Sigma^{-1}x} \quad \text{and} \quad C(\Sigma,m) = \sqrt{2 + m'\Sigma^{-1}m}.
\]
In the one-dimensional case, Sichel (1973) utilized (6.9.1) for modeling size distributions of diamonds excavated from marine deposits in South West Africa. In financial applications, this process is known as the variance gamma process (see Part III for more details on these and other applications).
Remark 6.9.3 If $\Sigma = I_d$ and $m = 0$, we obtain the symmetric multivariate Bessel density
\[
p(x) = C_d\,(\|x\|/\beta)^{a}\,K_a(\|x\|/\beta), \qquad (6.9.4)
\]
where $\beta = 1/\sqrt{2}$, $a = s - d/2 > -d/2$, and $C_d$ is a normalizing constant independent of $x$ [see Fang et al. (1990), p. 92]. In the special case $a = 0$ and $\beta = \sigma/\sqrt{2}$, Fang et al. (1990) call the distribution corresponding to (6.9.4) a multivariate Laplace distribution. Note that this distribution belongs to our class of Laplace distributions only in the bivariate case ($d = 2$) (Exercise 6.12.14).
Remark 6.9.4 If $\Sigma = I_d$ and $s = \frac{d+1}{2}$, the density (6.9.3) simplifies to
\[
p(x) = \frac{\exp\big(-\sqrt{2 + \|m\|^2}\,\|x\| + m'x\big)}{(2\pi)^{(d-1)/2}\,\Gamma\big(\frac{d+1}{2}\big)\sqrt{2 + \|m\|^2}}, \qquad (6.9.5)
\]
which is a direct generalization of the one-dimensional AL density [see Takano (1989, 1990), and Exercise 6.12.12]. Takano (1989) derived the Lévy measure corresponding to the density (6.9.5) and showed that for $d \ge 2$ these distributions are self-decomposable if and only if $m = 0$ (which is in contrast with the case $d = 1$, since all one-dimensional AL laws are self-decomposable, cf. Proposition 3.2.3).
6.9.3 Geometric infinite divisibility
Like their one-dimensional counterparts, all multivariate AL laws are geometrically infinitely divisible [see, e.g., Kotz et al. (2000b)].

Proposition 6.9.1 Let $Y \sim AL_d(m, \Sigma)$. Then $Y$ is geometrically infinitely divisible, and the relation
\[
Y \stackrel{d}{=} \sum_{i=1}^{\nu_p} Y_p^{(i)} \qquad (6.9.6)
\]
holds for all $p \in (0, 1)$, where the $Y_p^{(i)}$'s are i.i.d. with the $AL_d(mp, p\Sigma)$ distribution, $\nu_p$ is geometrically distributed with mean $1/p$, and $\nu_p$ and $(Y_p^{(i)})$ are independent.

Proof. Write (6.9.6) in terms of ch.f.'s and follow the proof for the one-dimensional case (see Proposition 3.4.3 of Chapter 3).
6.10 Stability properties
In this section we collect various characterizations of the multivariate AL laws which exhibit their stability properties under appropriate summation schemes. The results presented here, unlike the majority of the previous ones, cannot be derived from the theory of generalized hyperbolic distributions, because the latter do not possess any general convolution properties except in some special cases (such as the normal inverse Gaussian case or the normal variance gamma models).

6.10.1 Limits of random sums
Analogously to the one-dimensional case, the multivariate AL laws are the only possible limits of geometric sums (6.0.1) of i.i.d. r.v.'s with finite second moments. Actually, the result below can serve as an alternative definition of this class of distributions.
Proposition 6.10.1 Let $\nu_p$ be a geometrically distributed r.v. with mean $1/p$, where $p \in (0, 1)$. A random vector $Y$ has an AL distribution in $R^d$ if and only if there exists a sequence $\{X^{(i)}\}$ of i.i.d. random vectors in $R^d$ with finite covariance matrix, independent of $\nu_p$, and $a_p > 0$, $b_p \in R^d$, such that
\[
a_p\sum_{j=1}^{\nu_p}\big(X^{(j)} + b_p\big) \stackrel{d}{\to} Y, \quad \text{as } p \to 0. \qquad (6.10.1)
\]
Proof. The result follows from the so-called transfer theorem for random summation [see, e.g., Rosiński (1976)] and its converse [see Szasz (1972)], together with the Central Limit Theorem for i.i.d. r.v.'s with a finite covariance matrix.
Our next result determines the type of normalization which produces convergence in (6.10.1).

Theorem 6.10.1 Let $X^{(j)}$ be i.i.d. random vectors in $R^d$ with mean vector $m$ and covariance matrix $\Sigma$. For $p \in (0, 1)$, let $\nu_p$ be a geometric r.v. with mean $1/p$, independent of the sequence $(X^{(j)})$. Then, as $p \to 0$,
\[
a_p\sum_{j=1}^{\nu_p}\big(X^{(j)} + b_p\big) \stackrel{d}{\to} Y \sim AL_d(m, \Sigma), \qquad (6.10.2)
\]
where $a_p = p^{1/2}$ and $b_p = m(p^{1/2} - 1)$.
Proof. By the Cramér-Wold device [see, e.g., Billingsley (1968)], the convergence (6.10.2) is equivalent to
\[
c'\left(a_p\sum_{j=1}^{\nu_p}\big(X^{(j)} + b_p\big)\right) \stackrel{d}{\to} c'Y
\]
for all vectors $c$ in $R^d$. Writing $W_j = c'X^{(j)}$, $\mu = c'm$, $b_p = (p^{1/2} - 1)\mu$ (the scalar normalization), and $Y = c'Y$, we have
\[
a_p\sum_{j=1}^{\nu_p}(W_j + b_p) \stackrel{d}{\to} Y \sim AL(\mu, \sigma), \quad \text{as } p \to 0. \qquad (6.10.3)
\]
Here the $W_j$'s are i.i.d. variables with mean $\mu$ and variance $\sigma^2 = c'\Sigma c$, and $Y$ is a univariate AL variable with ch.f.
\[
\psi(t) = \frac{1}{1 + \frac{1}{2}\sigma^2 t^2 - i\mu t}.
\]
The convergence (6.10.3) now follows from Proposition 3.4.4 for the univariate AL case [cf. equation (3.4.15)].
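The convergence can be observed through ch.f.'s without any simulation; the sketch below (added here, assuming NumPy and normal summands, which are an illustrative choice not fixed by the theorem) compares the ch.f. of the normalized geometric sum with the limiting AL ch.f. for small $p$:

```python
import numpy as np

mu, sigma = 0.6, 1.4
p = 1e-6
a_p = np.sqrt(p)
b_p = mu * (np.sqrt(p) - 1.0)

def compound_chf(t):
    # ch.f. of a_p * sum_{j=1}^{nu_p} (W_j + b_p) with W_j ~ N(mu, sigma^2):
    # with z = phi_W(a_p t) * exp(i a_p b_p t), it equals p z / (1 - (1-p) z)
    u = a_p * t
    z = np.exp(1j * mu * u - 0.5 * sigma**2 * u**2) * np.exp(1j * u * b_p)
    return p * z / (1.0 - (1.0 - p) * z)

def al_chf(t):
    return 1.0 / (1.0 + 0.5 * sigma**2 * t**2 - 1j * mu * t)

for t in (-3.0, -0.7, 0.4, 2.2):
    assert abs(compound_chf(t) - al_chf(t)) < 1e-4
```

The discrepancy is of order $p$, so it shrinks as $p \to 0$, in line with (6.10.3).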
Next, we study stability properties of AL random vectors.

6.10.2 Stability under random summation
The following stability property is a well-known characterization of $\alpha$-stable random vectors: $X$ is $\alpha$-stable if and only if for any $n \ge 2$ we have the following equality in distribution:
\[
X^{(1)} + \dots + X^{(n)} \stackrel{d}{=} n^{1/\alpha}X + d_n, \qquad (6.10.4)
\]
where the $X^{(i)}$'s are i.i.d. copies of $X$ and $d_n$ is some vector in $R^d$ [see, e.g., Samorodnitsky and Taqqu (1994)].
We have an analogous characterization of AL random vectors with respect to geometric summation [see Kotz et al. (2000b)].
Theorem 6.10.2 Let $Y, Y^{(1)}, Y^{(2)}, \dots$ be i.i.d. r.v.'s in $R^d$ with finite second moments, and let $\nu_p$ be a geometrically distributed random variable independent of the sequence $\{Y^{(i)}, i \ge 1\}$. For each $p \in (0, 1)$, the r.v. $Y$ has the stability property
\[
a_p\sum_{i=1}^{\nu_p}\big(Y^{(i)} + b_p\big) \stackrel{d}{=} Y, \qquad (6.10.5)
\]
with $a_p > 0$ and $b_p \in R^d$, if and only if $Y$ is $AL_d(m, \Sigma)$ with either $\Sigma = 0$ or $m = 0$. The normalizing constants are necessarily of the form
\[
a_p = p^{1/2}, \quad b_p = 0.
\]
The above result follows from the characterization of strictly geometric stable laws given in Theorem 3.1 of Kozubowski (1997) and the fact that the only geometric stable laws with finite second moments are $AL_d(m, \Sigma)$ laws with either $\Sigma = 0$ or $m = 0$.
Remark 6.10.1 Since in general multivariate AL r.v.'s do not satisfy relation (6.10.5), as is the case in the univariate setting, the question arises as to whether
\[
S^{(p)} = a_p\sum_{i=1}^{\nu_p}\big(Y^{(i)} + b_p\big) \stackrel{d}{\to} Y, \quad \text{as } p \to 0, \qquad (6.10.6)
\]
where the $Y^{(i)}$ are i.i.d. copies of $Y$, $\nu_p$ is geometrically distributed and independent of $\{Y^{(i)}, i \ge 1\}$, and $a_p > 0$, $b_p \in R^d$. Note that the convergence (6.10.6) holds for all univariate AL laws (see Proposition 3.4.5), as well as for general geometric stable laws with index $\alpha$ less than two [see Kozubowski (1997)]. It is quite surprising that for $d > 1$, as noted by Kozubowski (1997), in general AL r.v.'s do not satisfy (6.10.6) unless $m = 0$ or $\Sigma = 0$. Indeed, if either $\Sigma = 0$ or $m = 0$, then (6.10.5) is satisfied, and so is (6.10.6). Assume $\Sigma \neq 0$ and suppose that $Y \sim AL_d(m, \Sigma)$ satisfies (6.10.6). Then, for any $c \in R^d$, we have
\[
c'S^{(p)} = a_p\sum_{i=1}^{\nu_p}\big[c'Y^{(i)} + c'b_p\big] \stackrel{d}{\to} Y_c = c'Y \quad \text{as } p \to 0. \qquad (6.10.7)
\]
By Corollary 6.8.1, part (ii), the r.v. $Y_c = c'Y$ is univariate AL with $\sigma = (c'\Sigma c)^{1/2}$ and $\mu = c'm$. After the application of Proposition 3.4.5, we find that (6.10.7) holds with
\[
a_p = Cp^{1/2}(1 + o(1)), \quad \text{where } C = \big[\sigma^2/(\mu^2 + \sigma^2)\big]^{1/2}. \qquad (6.10.8)
\]
Since the normalizing constant $a_p$ in (6.10.7) should be independent of $c$, (6.10.8) implies that $\mu = c'm = 0$ for every $c$, and thus $m = 0$. In the latter case $C = 1$ and (6.10.7) holds with $a_p = p^{1/2}$ and $b_p = 0$.
6.10.3 Stability of deterministic sums
In the next result, taken from Kotz et al. (2000b), we show that a deterministic sum of i.i.d. AL r.v.'s, scaled by an appropriate random variable, has the same distribution as each component of the sum. It is a generalization of a similar characterization of the univariate Laplace distributions; see Proposition 2.2.11 in Chapter 2.

Theorem 6.10.3 Let $B_m$, where $m > 0$, have a Beta(1, m) distribution. Let $\{X^{(i)}\}$ be a sequence of i.i.d. random vectors with finite second moment. Then the following statements are equivalent:
(i) For all $n \ge 2$, $X^{(1)} \stackrel{d}{=} B_{n-1}^{1/2}\big(X^{(1)} + \dots + X^{(n)}\big)$.
(ii) $X^{(1)}$ is $AL_d(m, \Sigma)$ with either $\Sigma = 0$ or $m = 0$.

Proof. The above result follows from the corresponding result for GS laws [see Kozubowski and Rachev (1999b)] and the fact that $AL_d(m, \Sigma)$ distributions with either $\Sigma = 0$ or $m = 0$ are strictly GS. The result for GS laws follows from the results of Pakes (1992ab).
We shall conclude our discussion with yet another stability property of AL laws, which for the one-dimensional case was given in (2.2.28) of Chapter 2 [and noted by Pillai (1985)].

Proposition 6.10.2 Let $Y, Y^{(1)}, Y^{(2)}$, and $Y^{(3)}$ be $AL_d(m, \Sigma)$ r.v.'s with either $\Sigma = 0$ or $m = 0$. Let $p \in (0, 1)$, and let $I$ be an indicator random variable, independent of the $Y^{(i)}$'s, with $P(I = 1) = p$ and $P(I = 0) = 1 - p$. Then the following equality in distribution is valid for any $p \in (0, 1)$:
\[
Y \stackrel{d}{=} p^{1/2}IY^{(1)} + (1 - I)\big(Y^{(2)} + p^{1/2}Y^{(3)}\big). \qquad (6.10.9)
\]
Proof. Let $c \in R^d$. Since $c'Y$, $c'Y^{(1)}$, $c'Y^{(2)}$, and $c'Y^{(3)}$ are univariate $AL(\mu, \sigma)$ with either $\mu = 0$ or $\sigma = 0$ (see Corollary 6.8.1), the result in the one-dimensional case [see equation (2.2.28), and also Pillai (1985)] produces
\[
c'Y \stackrel{d}{=} p^{1/2}Ic'Y^{(1)} + (1 - I)\big(c'Y^{(2)} + p^{1/2}c'Y^{(3)}\big),
\]
or equivalently,
\[
c'Y \stackrel{d}{=} c'\big(p^{1/2}IY^{(1)} + (1 - I)(Y^{(2)} + p^{1/2}Y^{(3)})\big).
\]
The last relation implies (6.10.9).
6.11 Linear regression with Laplace errors
In this final section we study a regression model with Laplace distributed error term. Consider the multiple linear regression model
\[
Y = Xb + e, \qquad (6.11.1)
\]
where $Y$ is a $d \times 1$ random vector of observations, $X$ is a $d \times k$ non-stochastic matrix of rank $k$, $b$ is a $k \times 1$ vector of regression parameters with unknown values, and $e$ is a $d \times 1$ random error term. Assume that $e \sim AL_d(0, \sigma^2 I_d)$, where $I_d$ is a $d \times d$ identity matrix (so that the mean vector and covariance matrix of $e$ are, respectively, 0 and $\sigma^2 I_d$). Although the elements of $e$ are uncorrelated, they are not independent. According to Theorem 6.3.1, $e$ has the representation
\[
e \stackrel{d}{=} W^{1/2}N, \qquad (6.11.2)
\]
where $N \sim N_d(0, \sigma^2 I_d)$ (multivariate normal with mean 0 and covariance matrix $\sigma^2 I_d$), while $W$ is standard exponential (independent of $N$).
6.11.1 Least-squares estimation
The least-squares estimator (LSE) $\hat b$ of $b$ satisfies the normal equations
\[
(X'X)\hat b = X'Y.
\]
If $X$ has full rank, the inverse of $X'X$ exists and $\hat b$ can be expressed as
\[
\hat b = (X'X)^{-1}X'Y, \qquad (6.11.3)
\]
which is the same as in the normal case.
Next, we consider the joint distribution of $\hat b$ and the vector of residuals $\hat e = Y - X\hat b$. In view of (6.11.1) and (6.11.3), we have
\[
\begin{pmatrix} \hat b \\ \hat e \end{pmatrix} = \begin{pmatrix} (X'X)^{-1}X' \\ I_d - X(X'X)^{-1}X' \end{pmatrix}Y = \begin{pmatrix} b \\ 0 \end{pmatrix} + \begin{pmatrix} (X'X)^{-1}X' \\ I_d - X(X'X)^{-1}X' \end{pmatrix}e,
\]
where $e \sim AL_d(0, \sigma^2 I_d)$. Now, since
\[
\begin{pmatrix} \hat b \\ \hat e \end{pmatrix} - \begin{pmatrix} b \\ 0 \end{pmatrix}
\]
is a linear function of $e$, its distribution is AL according to Proposition 6.8.1.
Proposition 6.11.1 Under the model (6.11.1), the least-squares estimator $\hat b$ and the vector of residuals $\hat e = Y - X\hat b$ have the following joint distribution:
\[
\begin{pmatrix} \hat b \\ \hat e \end{pmatrix} - \begin{pmatrix} b \\ 0 \end{pmatrix} \sim AL_{k+d}(0, \Sigma), \qquad (6.11.4)
\]
\[
\Sigma = \sigma^2\begin{pmatrix} (X'X)^{-1} & 0 \\ 0 & I_d - X(X'X)^{-1}X' \end{pmatrix}.
\]
Remark 6.11.1 As in the normal case, it follows that $E(\hat b) = b$ (so that $\hat b$ is unbiased), $E(\hat e) = 0$, $Cov(\hat b) = \sigma^2(X'X)^{-1}$, and $Cov(\hat e) = \sigma^2\big(I_d - X(X'X)^{-1}X'\big)$. However, $\hat b$ and $\hat e$ are uncorrelated, but not independent.
Remark 6.11.2 Note that since $Y_1, \dots, Y_d$ are uncorrelated, $Var(Y_i) = \sigma^2$, and $\hat b$ is unbiased for $b$, the conditions of the Gauss-Markov theorem are fulfilled. Thus, for any $c \in R^k$, the estimator $c'\hat b$ of $c'b$ has the smallest possible variance among all linear estimators of the form $a'Y$ which are unbiased for $c'b$. In particular, for $j = 1, \dots, k$, $\hat b_j$ will have the smallest variance among all linear unbiased estimators of $b_j$.
6.11.2 Estimation of $\sigma^2$
As in the normal case, the estimator $\hat e'\hat e/(d - k)$ is unbiased for $\sigma^2$, which follows from the following result.

Proposition 6.11.2 Under the model (6.11.1), the statistic $\hat e'\hat e$ is distributed as
\[
\sigma^2\cdot W\cdot V,
\]
where $W$ and $V$ are independent, $W$ is standard exponential, while $V$ has a chi-square distribution with $d - k$ degrees of freedom. Moreover, the r.v. $\hat e'\hat e/\sigma^2$ has the following density function:
\[
p(x) = \big(\sqrt{x/2}\big)^{(d-k)/2-1}\,K_{(d-k)/2-1}\big(\sqrt{2x}\big)\big/\Gamma\Big(\frac{d-k}{2}\Big), \quad x > 0. \qquad (6.11.5)
\]
Proof. First, write $\hat e = \big(I_d - X(X'X)^{-1}X'\big)(Xb + e)$, note that the matrix $I_d - X(X'X)^{-1}X'$ is idempotent, and utilize the representation (6.11.2) to obtain
\[
\hat e'\hat e = W\,N'\big(I_d - X(X'X)^{-1}X'\big)N,
\]
where $N$ has a multivariate normal distribution with mean zero and covariance matrix $\sigma^2 I_d$. Now the first part of the Proposition follows, since $N'\big(I_d - X(X'X)^{-1}X'\big)N/\sigma^2$ has a chi-square distribution with $d - k$ degrees of freedom (a standard fact for the regression model (6.11.1) with normally distributed error term).
Next, apply the standard transformation theorem for random variables to obtain the density of $WV$ in the form
\[
p(x) = \frac{x^{(d-k)/2-1}}{2^{(d-k)/2}\,\Gamma\big(\frac{d-k}{2}\big)}\int_0^{\infty} y^{1-(d-k)/2-1}\,e^{-\frac{1}{2}(x/y + 2y)}\,dy.
\]
Finally, utilize the fact that the generalized inverse Gaussian density (6.3.2) with $\chi = x$, $\psi = 2$, and $\lambda = 1 - (d-k)/2$ integrates to one on $(0, \infty)$, so that
\[
\int_0^{\infty} y^{1-(d-k)/2-1}\,e^{-\frac{1}{2}(x/y + 2y)}\,dy = \frac{2K_{(d-k)/2-1}(\sqrt{2x})}{(2/x)^{1/2 - (d-k)/4}},
\]
which produces (6.11.5).
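A quick numerical sanity check of (6.11.5) (a sketch added here, assuming SciPy): the density should integrate to one and have mean $E(WV) = E(W)E(V) = d - k$.

```python
import numpy as np
from scipy.special import kv, gamma
from scipy.integrate import quad

n = 5   # n = d - k degrees of freedom

def p(x):
    # density (6.11.5) of e_hat' e_hat / sigma^2
    return np.sqrt(x / 2.0)**(n/2.0 - 1.0) * kv(n/2.0 - 1.0, np.sqrt(2.0 * x)) / gamma(n/2.0)

total, _ = quad(p, 0, np.inf)
mean, _ = quad(lambda x: x * p(x), 0, np.inf)

assert abs(total - 1.0) < 1e-8
assert abs(mean - n) < 1e-7     # E(WV) = 1 * n
```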
Remark 6.11.3 The above result may be used to obtain confidence intervals for $\sigma$.
Next, we derive the minimal mean squared error estimator of $\sigma^2$. Consider the class of estimators of $\sigma^2$ of the form $\delta_c = c\,\hat e'\hat e$. We know from Proposition 6.11.2 that for $c = 1/(d - k)$ we obtain an unbiased estimator. However, this estimator does not minimize the mean squared error (MSE), defined as
\[
MSE = E(\delta_c - \sigma^2)^2 = Var\,\delta_c + (E\delta_c - \sigma^2)^2.
\]
To find the $c$ that minimizes the MSE, write
\[
MSE = c^2\sigma^4\,Var\big(\hat e'\hat e/\sigma^2\big) + \big(c\,\sigma^2\,E(\hat e'\hat e/\sigma^2) - \sigma^2\big)^2, \qquad (6.11.6)
\]
and compute the mean and variance of $\hat e'\hat e/\sigma^2$ that appear in (6.11.6) utilizing Proposition 6.11.2. Namely, we have
\[
E\big(\hat e'\hat e/\sigma^2\big) = E(WV) = E(W)E(V) = 1\cdot n
\]
and
\[
E\big[(\hat e'\hat e/\sigma^2)^2\big] = E(W^2)E(V^2) = 2\cdot(2n + n^2),
\]
where $n = d - k$, so that
\[
Var\big(\hat e'\hat e/\sigma^2\big) = E(W^2V^2) - [E(WV)]^2 = 4n + n^2.
\]
Consequently, (6.11.6) produces
\[
MSE = \sigma^4\big[c^2(2n^2 + 4n) - 2cn + 1\big].
\]
The minimizing value is easily found to be $c^* = 1/(2(n + 2))$. We summarize our discussion below.

Proposition 6.11.3 Consider the model (6.11.1) and the class of estimators of $\sigma^2$ of the form $c\,\hat e'\hat e$, where $c \in R$. Then the estimator
\[
\frac{\hat e'\hat e}{2(d - k + 2)}
\]
minimizes the MSE.
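The minimization above is a one-dimensional quadratic in $c$ and is easy to confirm numerically (a sketch added here, assuming NumPy; the values of $n$ and $\sigma^2$ are arbitrary):

```python
import numpy as np

n = 7          # n = d - k (illustrative)
sigma2 = 2.3   # illustrative sigma^2

def mse(c):
    # MSE = sigma^4 [c^2 (2 n^2 + 4 n) - 2 c n + 1]
    return sigma2**2 * (c**2 * (2*n**2 + 4*n) - 2*c*n + 1)

c_star = 1.0 / (2.0 * (n + 2))
cs = np.linspace(0.0, 0.5, 100001)
assert abs(cs[np.argmin(mse(cs))] - c_star) < 1e-4
assert mse(c_star) < mse(1.0 / n)   # beats the unbiased choice c = 1/(d - k)
```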
6.11.3 The distributions of standard t and F statistics
When studying the regression model (6.11.1) with multivariate Student-t error term $e$, Zellner (1976) noticed that tests and intervals based on the usual t and F statistics remain valid. He also remarked that his conditioning argument holds for models (6.11.1) whenever the error term is a normal mixture (6.11.2) with a proper distribution of $W$, establishing the validity of the usual t and F statistics.

Proposition 6.11.4 Consider the regression model (6.11.1), where $e \sim AL_d(0, \sigma^2 I_d)$ and $X$ is of full rank. Let $\hat b = (\hat b_1, \dots, \hat b_k)'$ be the least-squares estimator of $b = (b_1, \dots, b_k)'$, and let $s^2 = \hat e'\hat e/(d - k)$. Then:
(i) The statistic
T
i
=
b
b
i
b
i
s
c
ii
, (6.11 .7)
where c
ii
is the ith diagonal element in (X
0
X)
1
, has a t-distribution with
d k degrees of freedom;
(ii) If $b = 0$, then the statistic
\[
F = \frac{(\hat b'X'Y - d\bar Y^2)/(k-1)}{\hat e'\hat e/(d-k)} \qquad (6.11.8)
\]
has an F-distribution with $k - 1$ and $d - k$ degrees of freedom.
(iii) The statistic
\[
\frac{(\hat b - b)'X'X(\hat b - b)/k}{\hat e'\hat e/(d-k)}, \qquad (6.11.9)
\]
which is used in deriving confidence ellipsoids for $b$, has an F-distribution
with $k$ and $d - k$ degrees of freedom. Moreover, a $100(1-\alpha)\%$ confidence
6.11 Linear regression with Laplace errors 333
region for $b$ is given by
\[
(b - \hat b)'X'X(b - \hat b) \le k\,\frac{\hat e'\hat e}{d-k}\,F_{k,d-k}(\alpha), \qquad (6.11.10)
\]
where $F_{k,d-k}(\alpha)$ is the upper $(100\alpha)$th percentile of an F-distribution with
$k$ and $d - k$ degrees of freedom.
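To make formula (6.11.7) concrete, the sketch below computes the least-squares fit and the statistic $T_i$ for a tiny simple-regression data set ($d = 4$ observations, $k = 2$ coefficients). The data values are hypothetical, chosen only for illustration; the code uses nothing beyond the displayed formulas.

```python
# Illustration of (6.11.7): T_i = (b^_i - b_i) / (s * sqrt(c_ii))
# for a toy model y = b1 + b2 * x with d = 4 observations, k = 2.
import math

x = [0.0, 1.0, 2.0, 3.0]     # hypothetical predictor values
y = [1.0, 3.0, 2.0, 4.0]     # hypothetical responses
d, k = len(y), 2

# Normal equations for the design X = [1, x]: closed-form 2x2 solution
sx, sy = sum(x), sum(y)
sxx = sum(v * v for v in x)
sxy = sum(u * v for u, v in zip(x, y))
det = d * sxx - sx * sx
b1 = (sxx * sy - sx * sxy) / det   # intercept estimate
b2 = (d * sxy - sx * sy) / det     # slope estimate

resid = [yi - (b1 + b2 * xi) for xi, yi in zip(x, y)]
s2 = sum(r * r for r in resid) / (d - k)   # s^2 = e^' e^ / (d - k)

# Diagonal elements of (X'X)^{-1}
c11, c22 = sxx / det, d / det

# T statistic for the slope under H0: b2 = 0, compared with t(d-k)
T2 = b2 / math.sqrt(s2 * c22)
print(b1, b2, s2, T2)
```

Under the AL error model of Proposition 6.11.4, $T_2$ would be referred to a t-distribution with $d-k = 2$ degrees of freedom, exactly as in the normal-theory case.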
Remark 6.11.4 Improved confidence ellipsoids were derived in Hwang
and Chen (1986).
6.11.4 Inference from the estimated regression function
After fitting, a regression model can be used for predictions. Let $x_0$ be a
$k \times 1$ vector of predictor variables. Then $x_0$, coupled with $\hat b$, can be used to
estimate the regression function $x_0'b$ as well as the value of the response,
$Y_0$, at $x_0$. It turns out that the confidence intervals for these predictions
coincide with those for the normal case.
Estimating the regression function at $x_0$
Note that since $x_0'b$ is a linear function of $b$, the Gauss-Markov theorem
implies that $x_0'\hat b$ is the BLUE for $x_0'b$, with variance $x_0'(X'X)^{-1}x_0\,\sigma^2$. Moreover,
as in the normal case, the statistic
\[
\frac{x_0'\hat b - x_0'b}{s\sqrt{x_0'(X'X)^{-1}x_0}}, \qquad (6.11.11)
\]
where $s^2 = \hat e'\hat e/(d-k)$, has a t-distribution with $d - k$ degrees of freedom.
Forecasting a new observation at $x_0$
As in the normal model, a new observation $Y_0$ has an unbiased predictor
$x_0'\hat b$. According to the model (6.11.1), we now have
\[
\begin{bmatrix} Y \\ Y_0 \end{bmatrix}
= \begin{bmatrix} X \\ x_0' \end{bmatrix} b
+ \begin{bmatrix} e \\ e_0 \end{bmatrix},
\]
where $[e'\; e_0]' \sim AL_{d+1}(0, \sigma^2 I_{d+1})$. Note that the forecast error, $Y_0 - x_0'\hat b$,
can be expressed as
\[
Y_0 - x_0'\hat b = \begin{bmatrix} -x_0'(X'X)^{-1}X' & 1 \end{bmatrix}
\begin{bmatrix} e \\ e_0 \end{bmatrix},
\]
so that it has a univariate AL distribution with mean zero and variance
$\sigma^2\left(1 + x_0'(X'X)^{-1}x_0\right)$ (see Corollary 6.8.1). It follows that the statistic
\[
\frac{Y_0 - x_0'\hat b}{s\sqrt{1 + x_0'(X'X)^{-1}x_0}}
\]
has a t-distribution with $d - k$ degrees of freedom.
6.11.5 Maximum likelihood estimation
By (6.5.3) and (6.11.1), the likelihood function for the regression model has
the form
\[
p(y|b, \sigma) = \frac{2^{1/2 - d/4}\, K_{d/2-1}\!\left(\sqrt{2}\,\|y - Xb\|/\sigma\right)}{\pi^{d/2}\,\sigma^{1+d/2}\,\|y - Xb\|^{d/2-1}}, \qquad (6.11.12)
\]
where $K_\lambda$ denotes the modified Bessel function of the third kind. Note that
for any fixed value of $\sigma$, the functions $K_{d/2-1}(\sqrt{2}\|y - Xb\|/\sigma)$ and $\|y - Xb\|^{1-d/2}$ are both decreasing in $\|y - Xb\|$ (for $d = 2$, which is the smallest
value of $d$, the latter function is constant). Thus, the maximum occurs
whenever $\|y - Xb\|$ is minimized. Consequently, the maximum likelihood
estimator (MLE) of $b$ coincides with the least-squares estimator (LSE) of
$b$. To find the MLE of $\sigma$, we need to maximize the function
\[
L(\sigma) = \frac{K_{d/2-1}(a/\sigma)}{\pi^{d/2}\,\sigma^{1+d/2}\,a^{d/2-1}}
\]
with respect to $\sigma \in (0, \infty)$, where $a = \sqrt{2}\,\|y - X\hat b\|$ and $\hat b$ is the LSE (and
MLE) of $b$. The logarithmic derivative of $L$ equals
\[
\frac{d}{d\sigma}\log L(\sigma) = -\frac{1 + d/2}{\sigma} - \frac{a}{\sigma^2}\,\frac{K'_{d/2-1}(a/\sigma)}{K_{d/2-1}(a/\sigma)}.
\]
Using Property 4 of Bessel functions from Appendix A, we have
\[
\frac{d}{d\sigma}\log L(\sigma) = \frac{1}{\sigma}\,\frac{a}{\sigma}\left[R_{d/2-1}(a/\sigma) - \frac{d}{a/\sigma}\right], \qquad (6.11.13)
\]
where the function $R_\lambda$ is defined by (A.0.15) in Appendix A. In view of
(6.11.13), the following Lemma 6.11.1 implies the existence of a unique
number $\hat\sigma \in (0, \infty)$ such that the function $\log L(\sigma)$ is strictly increasing on
$(0, \hat\sigma)$ and strictly decreasing on $(\hat\sigma, \infty)$. This number, which is the MLE of
$\sigma$, is the unique solution of the equation
\[
R_{d/2-1}(a/\sigma) = \frac{d}{a/\sigma}. \qquad (6.11.14)
\]
Lemma 6.11.1 Let $d$ be an integer greater than or equal to two.
(i) If $d = 2$, then the function $h_d(x) = xR_{d/2-1}(x)$ is strictly increasing for
$x \in (0, \infty)$ with $\lim_{x\to\infty} h_d(x) = \infty$ and $\lim_{x\to 0^+} h_d(x) = 0$.
(ii) If $d > 2$, then the function $h_d(x) = R_{d/2-1}(x) - d/x$ is strictly increasing
for $x \in (0, \infty)$ with $\lim_{x\to\infty} h_d(x) = 1$ and $\lim_{x\to 0^+} h_d(x) = -\infty$.
Proof. First, consider the case $d = 2$. By Property 13, we have
\[
\frac{d}{dx}\,xR_0(x) = x\left(R_0^2(x) - 1\right).
\]
By Property 11, $R_0(x) > 1$, so that $\frac{d}{dx}xR_0(x) > 0$, showing
that the function $xR_0(x)$ is strictly increasing. Property 11 also produces
$\lim_{x\to\infty} h_d(x) = \infty$. Finally, the limit $\lim_{x\to 0^+} h_d(x) = 0$ follows from the
asymptotic behavior of the Bessel function (Property 6).
Next, consider $d > 2$. Apply Property 12 with $\lambda = d/2 - 1$ to obtain the
following expression for the derivative of $h_d$:
\[
\frac{d}{dx}h_d(x) = \frac{d}{dx}\left(-\frac{2}{x} + \frac{1}{R_{(d-2)/2-1}(x)}\right). \qquad (6.11.15)
\]
Note that for $d > 3$ the function $R_{(d-2)/2-1}(x)$ is decreasing (Property 11),
while for $d = 3$ we have $R_{-1/2}(x) = 1$ (by Property 4). In either case, the
derivative (6.11.15) is positive (as the expression in parentheses is a strictly
increasing function), so that the function $h_d$ is strictly increasing. The rest
of (ii) follows from Properties B1 and A6.
Note that since $R_{d/2-1}(a/\sigma) > 1$ (see Appendix A, Property 11), we must
have $d/(a/\hat\sigma) > 1$, so that the MLE of $\sigma$ satisfies the inequality
\[
\hat\sigma > a/d = \sqrt{2}\,\|y - X\hat b\|/d.
\]
Remark 6.11.5 Recall that the MLE of $\sigma$ under a normally distributed
error term is given by $\tilde\sigma = \|y - X\hat b\|/\sqrt{d}$. Consequently, in the case $d = 2$,
the MLE of $\sigma$ under the model (6.11.1) with AL distributed error term is
greater than the one under the model with normally distributed error term.
Remark 6.11.6 The solution to (6.11.14) must be obtained numerically,
except for a few special cases described below.
Special case $d = 3$. Here, the Bessel function has a closed form (see
Property 5), and we have
\[
R_{d/2-1}(x) = R_{1/2}(x) = 1 + 1/x.
\]
Consequently, the equation (6.11.14) yields the solution
\[
\hat\sigma = a/2 = \|y - X\hat b\|/\sqrt{2},
\]
which is greater than $\tilde\sigma$.
Special case $d = 5$. Here, we use the iterative property (A.0.16) of $R_\lambda$
to write equation (6.11.14) as
\[
1/R_{1/2}(a/\sigma) = 2/(a/\sigma).
\]
Since $R_{1/2}(x) = 1 + 1/x$, we obtain the following quadratic equation
for $\hat\sigma$:
\[
2\hat\sigma^2 + 2\hat\sigma a - a^2 = 0,
\]
whose positive solution is $\hat\sigma = \frac{\sqrt{3}-1}{2}a$. Again, we see that $\hat\sigma \approx 0.366a$
is greater than $\tilde\sigma = a/\sqrt{10} \approx 0.316a$.
Special case $d = 7$. Here, we use the iterative property (A.0.16) of $R_\lambda$
twice to write equation (6.11.14) as
\[
3\sigma/a + 1/R_{1/2}(a/\sigma) = a/(2\sigma).
\]
Since $R_{1/2}(x) = 1 + 1/x$, we obtain the following cubic equation for
$y = \hat\sigma/a$:
\[
y^3 + y^2 + y/6 - 1/6 = 0,
\]
whose real solution is
\[
y = \frac{1}{3}\left[\left(2 + \sqrt{31/8}\right)^{1/3} + \left(2 - \sqrt{31/8}\right)^{1/3} - 1\right].
\]
Consequently, the MLE of $\sigma$ is
\[
\hat\sigma = \frac{a}{3}\left[\left(2 + \sqrt{31/8}\right)^{1/3} + \left(2 - \sqrt{31/8}\right)^{1/3} - 1\right] \approx \frac{a}{3.34}.
\]
Again, we see that $\hat\sigma$ is greater than $\tilde\sigma = a/\sqrt{14} \approx a/3.74$.
6.11.6 Bayesian estimation
Here, we analyze the regression model (6.11.1) with an AL error term and
likelihood function (6.11.12) from the Bayesian point of view. We assume
a diffuse prior distribution for the parameters $b$ and $\sigma^2$:
\[
p(b, \sigma^2) \propto \frac{1}{\sigma^2}, \qquad b \in R^k, \; 0 < \sigma^2 < \infty.
\]
This standard improper distribution assumes that $b$ and $\log\sigma^2$ are uniformly
and independently distributed. Taking into account the likelihood
function (6.11.12), we obtain the joint posterior distribution of $b$ and $\sigma^2$:
\[
p(b, \sigma^2|y) \propto \frac{K_{d/2-1}\!\left(\sqrt{2}\,\|y - Xb\|/\sqrt{\sigma^2}\right)}{(\sigma^2)^{3/2 + d/4}\,\|y - Xb\|^{d/2-1}}. \qquad (6.11.16)
\]
To obtain the marginal posterior p.d.f. of $b$, we integrate (6.11.16) with
respect to $u = \sigma^2$:
\[
p(b|y) \propto \|y - Xb\|^{1-d/2} \int_0^\infty \frac{K_{d/2-1}\!\left(\sqrt{2}\,\|y - Xb\|/\sqrt{u}\right)}{u^{3/2 + d/4}}\,du. \qquad (6.11.17)
\]
The change of variable $z = \|y - Xb\|/\sqrt{u}$ in (6.11.17) leads to
\[
p(b|y) \propto \left(\frac{2}{\|y - Xb\|^2}\right)^{d/2} \int_0^\infty z^{d/2}\,K_{d/2-1}(\sqrt{2}z)\,dz \propto \left(\frac{1}{\|y - Xb\|^2}\right)^{d/2}, \qquad (6.11.18)
\]
as the integral in (6.11.18) is a constant independent of $b$ [the finiteness of
the integral follows from relation (A.0.13); see Appendix A]. Since
\[
\|y - Xb\|^2 = s^2(d-k) + (b - \hat b)'X'X(b - \hat b), \qquad (6.11.19)
\]
where
\[
s^2 = \hat e'\hat e/(d-k) = (Y - X\hat b)'(Y - X\hat b)/(d-k),
\]
we recognize (6.11.18) as a k-dimensional Student-t p.d.f. with $v = d - k$
degrees of freedom [see, e.g., Zellner (1976), Johnson and Kotz (1972)]. The
posterior density of $b$ has the form
\[
p(b|y) = \frac{\Gamma((v+k)/2)\left(1 + v^{-1}(b - \hat b)'R^{-1}(b - \hat b)\right)^{-(v+k)/2}}{(\pi v)^{k/2}\,\Gamma(v/2)\,|R|^{1/2}}, \qquad (6.11.20)
\]
where $R = (X'X)^{-1}s^2$ is a positive-definite matrix. Note that the same
posterior distribution results under the model (6.11.1) with multivariate
normal and Student-t error terms [see Zellner (1976) for the latter]. We also
see that whenever $v = d - k > 1$, the mean of the posterior distribution of $b$
exists and equals $\hat b$. Consequently, the Bayesian estimator of $b$ (under the
squared error loss function and the diffuse prior distribution) coincides with
the MLE and LSE of $b$.
Next, we derive the marginal posterior p.d.f. of $\sigma^2$ by integrating (6.11.16)
with respect to $b$. Setting $u = \sigma^2$, $\delta^2 = s^2(d-k)/u$, $\lambda = 1 - (d-k)/2$, and
using (6.11.19), we obtain after some algebra
\[
p(u|y) \propto \frac{2^{1-d/2}}{u^{d/2+1}} \int_{R^k} \frac{K_{k/2-\lambda}\!\left(\sqrt{2}\sqrt{\delta^2 + (b - \hat b)'(X'X/u)(b - \hat b)}\right)}{\left(\sqrt{\delta^2 + (b - \hat b)'(X'X/u)(b - \hat b)}/\sqrt{2}\right)^{k/2-\lambda}}\,db. \qquad (6.11.21)
\]
We now recognize the integrand in (6.11.21) as the main factor of a k-dimensional
generalized hyperbolic density (6.5.4) with parameters $\lambda$, $\delta$,
$\xi = \alpha = \sqrt{2}$, $\mu = \hat b$, $\beta = 0$, and $\Sigma = (X'X)^{-1}u$. Since the latter density
integrates to one over $R^k$, we evaluate the integral in (6.11.21) and obtain
the following expression after some algebraic manipulations:
\[
p(u|y) \propto \frac{1}{u^{(d-k)/4 + 3/2}}\,K_{(d-k)/2-1}\!\left(\sqrt{2s^2(d-k)/u}\right). \qquad (6.11.22)
\]
After further integration of (6.11.22), which takes into consideration the
integration formula (A.0.13), we finally obtain an exact expression for the
posterior density of $u = \sigma^2$:
\[
p(u|y) = \frac{\left(\sqrt{s^2(d-k)}\right)^{(d-k)/2+1} K_{(d-k)/2-1}\!\left(\sqrt{2s^2(d-k)/u}\right)}{(\sqrt{2})^{(d-k)/2-1}\,(\sqrt{u})^{(d-k)/2+3}\,\Gamma((d-k)/2)}. \qquad (6.11.23)
\]
It can be shown that the r.v. with the above density has the same distribution
as $s^2(d-k)/X$, where $X$ is a r.v. with density (6.11.5). The mean
of this posterior distribution generally does not exist.
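As an independent check (our own sketch, not from the text), one can verify numerically that (6.11.23) integrates to one. The substitution $t = \sqrt{2s^2(d-k)/u}$ used in the derivation reduces the total mass to $(2^{1-m}/\Gamma(m))\int_0^\infty t^m K_{m-1}(t)\,dt$ with $m = (d-k)/2$, independently of $s^2$; for $d - k = 3$ the order $m - 1 = 1/2$ gives the elementary form $K_{1/2}(t) = \sqrt{\pi/(2t)}\,e^{-t}$, so plain Simpson integration suffices.

```python
# Numeric check that the posterior (6.11.23) is a proper density for d-k = 3.
# After t = sqrt(2 s^2 (d-k)/u), total mass = 2^(1-m)/Gamma(m) * Int t^m K_{m-1}(t) dt,
# with m = (d-k)/2 = 3/2 and K_{1/2}(t) = sqrt(pi/(2t)) exp(-t).
import math

def integrand(t, m=1.5):
    k_half = math.sqrt(math.pi / (2.0 * t)) * math.exp(-t)   # K_{1/2}(t)
    return 2.0 ** (1.0 - m) / math.gamma(m) * t ** m * k_half

def total_mass(upper=60.0, n=60000):
    # Composite Simpson rule on (0, upper]; the integrand vanishes at both ends
    h = upper / n
    s = 0.0
    for i in range(n + 1):
        t = max(i * h, 1e-12)
        w = 1 if i in (0, n) else (4 if i % 2 else 2)
        s += w * integrand(t)
    return s * h / 3.0

print(total_mass())   # should be very close to 1
```

The same check can be run for other values of $d - k$ once $K_{m-1}$ is available (e.g., through the half-integer recursion for odd $d - k$).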
6.12 Exercises
Exercise 6.12.1 Let $X \sim AL_d(m, \Sigma)$.
(a) Show that if $m = 0$ (so that $X$ is actually symmetric Laplace), then
any one-dimensional marginal distribution of $X$ is symmetric Laplace.
(b) Show that if every one-dimensional marginal distribution of $X$ is symmetric
Laplace, then $X$ is symmetric Laplace, $X \sim L_d(\Sigma)$.
[Thus, for multivariate AL laws, symmetry is a componentwise property,
which is in contrast with geometric stable laws with index less than 2.]
Exercise 6.12.2 Let $X = (X_1, \dots, X_d)'$ have a multivariate asymmetric
Laplace distribution $AL_d(m, \Sigma)$, and let $\Psi$ be the ch.f. of $X$. Using the
cumulant formula (5.3.2), show that $c_1(X) = m'$, $c_2(X) = \Sigma + mm'$, and
\[
c_3(X) = \operatorname{vec}\Sigma\, m' + m \otimes \Sigma + \Sigma \otimes m + 2\,m^{\otimes 2} m' \qquad (6.12.1)
\]
[Kollo (2000)].
Exercise 6.12.3 Let $X = (X_1, X_2)' \sim BAL(m_1, m_2, \sigma_1, \sigma_2, \rho)$.
(a) Assuming that $m_1 = m_2 = m$, $\sigma_1 = \sigma_2 = \sigma$, and $\rho = 0$, find the p.d.f.'s
of $X_1$, $X_1 + X_2$, $X_1 - X_2$, and $X_2$ given $X_1 = x_1$. What are the conditional
mean and variance of the latter distribution?
(b) Repeat Part (a) for a general BAL r.v. $X$.
Exercise 6.12.4 By considering the appropriate characteristic functions,
prove the "if" part of Theorem 6.10.2. Namely, show that if the $X^{(i)}$ are i.i.d.
with the $L_d(\Sigma)$ distribution and $\nu_p$ is an independent geometric variable
with mean $1/p$, then the equality in distribution
\[
p^{1/\alpha} \sum_{i=1}^{\nu_p} X^{(i)} \stackrel{d}{=} X^{(1)} \qquad (6.12.2)
\]
holds with $\alpha = 2$.
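A quick simulation sketch of (6.12.2) in one dimension ($d = 1$, standard symmetric Laplace; our own illustration): summing a geometric number $\nu_p$ of i.i.d. Laplace variables and rescaling by $p^{1/2}$ should reproduce the original law, which we check here through the first two moments.

```python
# Simulation sketch of geometric stability (6.12.2) for d = 1, alpha = 2:
# p^(1/2) * sum_{i=1}^{nu_p} X_i should again be standard Laplace (variance 2).
import math
import random

random.seed(12345)

def laplace():
    # Standard classical Laplace variate as a difference of two Exp(1) variables
    return random.expovariate(1.0) - random.expovariate(1.0)

def geometric(p):
    # Geometric variable on {1, 2, ...} with mean 1/p
    n = 1
    while random.random() >= p:
        n += 1
    return n

p, reps = 0.1, 50000
samples = []
for _ in range(reps):
    nu = geometric(p)
    samples.append(math.sqrt(p) * sum(laplace() for _ in range(nu)))

mean = sum(samples) / reps
var = sum((s - mean) ** 2 for s in samples) / reps
print(mean, var)   # mean near 0; the Laplace variate used here has variance 2
```

A full distributional check would compare characteristic functions, as the exercise asks; matching the variance (which is preserved exactly by the $p^{1/2}$ scaling) is only a necessary condition.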
Exercise 6.12.5 Establish the implication (ii) $\Rightarrow$ (i) of Theorem 6.10.3.
Exercise 6.12.6 Show that if $X \sim N_d(0, \Sigma)$ and $W$ is an independent
standard exponential variable, then the r.v.
\[
Y = mW + \sqrt{W}\,X,
\]
where $m \in R^d$, has the $AL_d(m, \Sigma)$ distribution.
Hint: use the characteristic functions.
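The representation in Exercise 6.12.6 is easy to exercise numerically. The sketch below (our own illustration, with $d = 2$, $\Sigma = I_2$, and a hypothetical $m$) draws $Y = mW + \sqrt{W}X$ and checks the first two moments against $EY = m$ and $\operatorname{Cov} Y = \Sigma + mm'$ (cf. the second cumulant in Exercise 6.12.2).

```python
# Simulation sketch of Exercise 6.12.6 with d = 2, Sigma = I_2:
# Y = m W + sqrt(W) X, W ~ Exp(1), X ~ N_2(0, I_2), should be AL_2(m, I_2),
# so E Y = m and Cov Y = Sigma + m m'.
import math
import random

random.seed(2021)
m = (0.5, -0.25)    # hypothetical drift vector
reps = 100000

ys = []
for _ in range(reps):
    w = random.expovariate(1.0)
    rw = math.sqrt(w)
    ys.append((m[0] * w + rw * random.gauss(0.0, 1.0),
               m[1] * w + rw * random.gauss(0.0, 1.0)))

mean = [sum(y[i] for y in ys) / reps for i in range(2)]
cov = [[sum((y[i] - mean[i]) * (y[j] - mean[j]) for y in ys) / reps
        for j in range(2)] for i in range(2)]
print(mean, cov)
```

The expected covariance here is $I_2 + mm'$, i.e., diagonal entries $1.25$ and $1.0625$ and off-diagonal entries $-0.125$ for the chosen $m$.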
Exercise 6.12.7 Consider the regression model (6.11.1) from the Bayesian
point of view. Assuming the diffuse prior distribution for the parameters $b$
and $\sigma^2$, derive the posterior density (6.11.16) of the parameters and show
that the marginal posterior densities of $b$ and $\sigma^2$ are given by (6.11.20) and
(6.11.23), respectively.
Exercise 6.12.8 Let $X$ be a r.v. in $R^d$ with the ch.f.
\[
\Phi(t) = E e^{it'X} = u(t) + iv(t) = r(t)e^{i\theta(t)}.
\]
Then, the function
\[
\theta(t) = \tan^{-1}\{v(t)/u(t)\}, \qquad |t| < |r_0|,
\]
where $r_0$ is the zero of $u(t)$ closest to the origin, is called the characteristic
symmetric function of $X$ [see Heathcote et al. (1995)]. For an (elliptically)
symmetric distribution about the point $m$ the above function is linear in $t$
and has been used in testing multivariate symmetry [see Heathcote et al.
(1995)].
Derive the characteristic symmetric function for a r.v. $X$ with the $AL_d(m, \Sigma)$
distribution. Under what conditions on $m$ and $\Sigma$ is the distribution of $X$
symmetric? What is $\theta(t)$ in this case?
Exercise 6.12.9 Let $X = (X_1, \dots, X_d)'$ be a random vector in $R^d$. The
variables $X_1, \dots, X_d$ (the components of $X$) are said to be associated if the
inequality
\[
\operatorname{Cov}[f(X), g(X)] \ge 0
\]
holds for all measurable functions $f$ and $g$ which are non-decreasing in
each coordinate (whenever the covariance is finite). It is well known that if
$X \sim N_d(0, \Sigma)$, then the components of $X$ are associated if and only if they
are positively correlated ($\Sigma \ge 0$) [Pitt (1982)]. Let $X$ have an $AL(m, \Sigma)$
distribution.
(a) Show that if the components of $X$ are associated, then they must be
positively correlated, that is,
\[
\Sigma + mm' \ge 0. \qquad (6.12.3)
\]
(b)** Investigate whether the condition (6.12.3) is also sufficient for the
association of the components of $X$.
Exercise 6.12.10 Let $X$ have a multivariate normal $N_d(m, \Sigma)$ distribution,
where $\Sigma$ is a non-negative definite covariance matrix of rank $r \le d$.
(a) Using the well-known decomposition $\Sigma = CC'$, where $C$ is a $d \times r$
matrix of rank $r$, show that the random vector
\[
CZ + m, \qquad (6.12.4)
\]
where
\[
Z = (Z_1, \dots, Z_r)' \qquad (6.12.5)
\]
is a random vector with standard normal and independent components,
has the same distribution as the vector $X$.
(b) Now let the components of the r.v. (6.12.5) be i.i.d. standard Laplace
variables. Show that the distribution of the r.v. (6.12.4), referred to by
Kalashnikov (1997) as a multivariate Laplace distribution, does not belong
to the class of AL laws. In particular, show that in general the univariate
marginal distributions of the resulting random vector will not be Laplace.
Discuss the similarities and the differences of the resulting distributions
with the AL laws.
Exercise 6.12.11 Let $m \in R^d$ and let $\Sigma$ be a $d \times d$ positive-definite matrix.
Consider an elliptically symmetric distribution in $R^d$ with the density
(6.3.5), where
\[
g(x) = e^{-x^{\lambda/2}}. \qquad (6.12.6)
\]
This distribution is known as the multivariate exponential power distribution
[see, e.g., Fernández et al. (1995)] as well as the multivariate generalized
Laplace distribution [Ernst (1998)]. Haro-López and Smith (1999) refer to
the special case with $\lambda = 1$ as the elliptical Laplace distribution in their
robustness studies and show that it can be obtained as a scale mixture
of multivariate normal distributions. For $d = 1$ we obtain the generalized
Laplace distribution (the exponential power distribution) with density
\[
f(x) = \frac{\lambda}{2s\Gamma(1/\lambda)}\exp\left\{-\left|\frac{x - \mu}{s}\right|^\lambda\right\}. \qquad (6.12.7)
\]
(a) Determine the proportionality constant $k_d$ [see (6.3.5)] in this case.
(b) Set $\lambda = 1$ [in which case (6.12.7) produces the classical symmetric
Laplace distribution] and check whether the marginal distributions corresponding
to (6.3.5) are Laplace.
Exercise 6.12.12 Let $X$ have a general multivariate Bessel distribution
with the ch.f. (6.9.1) and the density (6.9.3).
(a) Show that in the case $\Sigma = I_d$ and $s = \frac{d+1}{2}$, we obtain the density (6.9.5),
which leads to
\[
p(x) = \frac{e^{-\sqrt{2}\|x\|}}{\sqrt{2}\,(2\pi)^{(d-1)/2}\,\Gamma\!\left(\frac{d+1}{2}\right)} \qquad (6.12.8)
\]
if $m = 0$. Compare the latter density with that of the multivariate exponential
power distributions discussed in Exercise 6.12.11.
(b) Show that the densities (6.9.5) and (6.12.8) lead to the AL and Laplace
densities if $d = 1$. What are the parameters in this case?
Exercise 6.12.13 Generalizing elliptically symmetric distributions, Fernández
et al. (1995) introduced a class of v-spherical distributions given by the
density
\[
p(x; m, \tau) = \tau^d g[v\{\tau(x - m)\}], \qquad (6.12.9)
\]
where $v(\cdot)$ is a scalar function such that
- $v(\cdot) > 0$ (with a possible exception on a set of Lebesgue measure
zero),
- $v(ka) = kv(a)$ for all $k \ge 0$ and $a \in R^d$,
$g$ is a non-negative function, and $m \in R^d$ and $\tau^{-1} > 0$ are the location and
scale parameters, respectively. [The functions $v(\cdot)$ and $g(\cdot)$ must be chosen
such that (6.12.9) is a genuine probability density function.] Note that by
choosing
\[
v(a) = \sqrt{a'\Sigma^{-1}a}
\]
we obtain the elliptically symmetric distributions, which with $g$ given by
(6.12.6) are the exponential power distributions [cf. Exercise 6.12.11]. Fernández
et al. (1995) introduced a skewed multivariate generalization of the Laplace
distribution as the special case with $q = 1$ of the skewed multivariate exponential
power distribution, which has density (6.12.9) with
\[
v(a_1, \dots, a_d) = \left[\sum_{i=1}^d \left\{(a_i^+/\gamma)^q + (\gamma a_i^-)^q\right\}\right]^{1/q} \qquad (6.12.10)
\]
and
\[
g(x) = c_d\, e^{-\frac{1}{2}x^q}. \qquad (6.12.11)
\]
[As before, $x^+ = \max(x, 0)$ and $x^- = \max(0, -x)$.]
(a) Show that if $X = (X_1, \dots, X_d)'$, where the $X_i$'s are i.i.d. variables with
the skewed exponential power distribution with the density
\[
f(x) = c\begin{cases} e^{-(x/\gamma)^q/2} & \text{for } x \ge 0, \\ e^{-(\gamma|x|)^q/2} & \text{for } x \le 0, \end{cases} \qquad (6.12.12)
\]
where $\gamma, q > 0$ and
\[
c^{-1} = 2^{1/q}\,\Gamma(1 + 1/q)\,(\gamma + \gamma^{-1}), \qquad (6.12.13)
\]
then the r.v. $X$ has the v-spherical density (6.12.9) with $v$ given by (6.12.10)
and $g$ given by (6.12.11) [Fernández et al. (1995)]. In particular, we see
that the d-dimensional skewed Laplace r.v. of Fernández et al. (1995) is
generated as an i.i.d. sample of size $d$ from a univariate AL distribution.
(b) Derive the mean, the variance, the moments $EX^k$, and the coefficients
of skewness and kurtosis for a random variable $X$ with the density (6.12.12).
Exercise 6.12.14 Let $X$ have a symmetric multivariate Bessel distribution
with density given by (6.9.4). In the special case $a = 0$, Fang et al.
(1990) call it a multivariate Laplace distribution. Here, the density of $X$ is
proportional to
\[
f(x) \propto K_0(\|x\|), \qquad (6.12.14)
\]
where $K_0$ is the modified Bessel function of the third kind and order 0.
(a) Show that the distribution in $R^d$ with the density as in (6.12.14) is
$AL(m, \Sigma)$ only if $d = 2$. What are $m$ and $\Sigma$ in this case?
(b) Show that if $X \stackrel{d}{=} RU^{(d)}$ is the polar representation of a symmetric
multivariate Bessel r.v. in $R^d$ with the density (6.9.4), then the density of
the r.v. $R$ is
\[
g_R(r) = c_r\, r^{a+d-1} K_a(r/\beta), \qquad (6.12.15)
\]
where
\[
c_r^{-1} = 2^{a+d-2}\,\beta^{a+d}\,\Gamma(d/2)\,\Gamma(a + d/2). \qquad (6.12.16)
\]
What is this representation in case $X$ has density (6.12.14)? How does it
compare with that of a symmetric Laplace $L(I_d)$ distribution?
Exercise 6.12.15 A d-dimensional r.v. with the ch.f.
\[
\Psi(t) = \frac{1}{1 + \left(\frac{1}{2}t'\Sigma t\right)^{\alpha/2}}, \qquad t \in R^d, \qquad (6.12.17)
\]
where $0 < \alpha \le 2$ and $\Sigma$ is a non-negative definite matrix, is said to have a
multivariate Linnik distribution [see, e.g., Anderson (1992), Pakes (1992a),
Ostrovskii (1995)]. For $\alpha = 2$ it reduces to the symmetric multivariate
Laplace distribution.
(a) Show that all components of a multivariate Linnik r.v. have univariate
Linnik distributions.
(b) Show that all linear combinations $c'X$, where $X$ has a multivariate
Linnik distribution and $c \in R^d$, are univariate Linnik.
Exercise 6.12.16 By considering the appropriate characteristic functions,
show that if the $X^{(i)}$'s are i.i.d. with the multivariate Linnik distribution (6.12.17)
and $\nu_p$ is an independent geometric variable with mean $1/p$, then the relation
(6.12.2) holds. Thus, multivariate Linnik variables are stable with
respect to geometric summation, as are univariate (symmetric) Linnik and
Laplace as well as multivariate symmetric Laplace variables.
Part III
Applications
Preamble
Laplace distributions found and continue to find applications in a variety of
disciplines which range from image and speech recognition (input distributions)
and ocean engineering (distributions of navigation errors) to finance
(distributions of log-returns of a commodity). By now, they are rapidly becoming
distributions of first choice whenever "something" with heavier
than Gaussian tails is observed in the data. Consequently, there is a large
number of publications scattered in diverse journals and monographs where
Laplace laws are mentioned as the "right" distribution, and it is a daunting
task to "dig out" and report them all.
The asymmetric Laplace distribution as described in this book is quite
a recent invention. It was motivated by probabilistic considerations similar
to those behind the asymmetric (skewed) normal distribution developed by
Azzalini (1985, 1986). It is our belief that natural applications will inevitably
arise. In fact, an application in modeling of foreign currency exchange has
recently been suggested. Several other applications are described in the subsequent
chapters. Similar comments apply to the multivariate generalizations
of Laplace distributions.
In this part of the book, we attempt to present those applications which
we consider, in our subjective judgment, the most interesting and promising.
In our choice we were also restricted by the fact that our book is addressed
to a possibly wide range of potential "clients" of the Laplace distributions.
Again, personal taste might have played an unavoidable but hopefully
not a damaging role. Thus, in order to make the material readable for our
intended audience, we had to present some of the more specialized and narrowly
focused applications in essay form. Readers interested in further details
are directed to the literature cited in the References.
7
Engineering sciences
This is the first chapter in the third part of the book, which deals with
applications of various versions of Laplace distributions in sciences, business,
and various branches of engineering. We shall start with applications
in communication theory, in particular signal processing, which seemed to
dominate the earlier results in the sixties and seventies of the last century.
Next, we shall mention applications in fracture problems discovered in the late
forties, before the appearance of the Weibull distribution, which dominated
this field in the second half of the 20th century. Applications in navigation
problems conclude the chapter.
7.1 Detection in the presence of Laplace noise
Detection of a known constant signal distorted by the presence of random
noise has been discussed in communication theory on various occasions
[see Marks et al. (1978), Dadi and Marks (1987), and references therein].
Using statistical terms, the goal is to test for the presence or absence of a
positive constant signal $s$ in additive random noise. The hypothesis testing
problem in this context is formulated as follows:
\[
H_0: x_i = n_i, \qquad i = 1, 2, \dots, N;
\]
\[
H_1: x_i = s + n_i, \qquad s > 0,
\]
Figure 7.1: General scheme of a detector. [The input $x_i$ passes through a
zero-memory nonlinearity $g(\cdot)$; the test statistic $t = \sum_{i=1}^N g(x_i)$ is compared
with a threshold $T$: if $t > T$, decide $H_1$; otherwise, decide $H_0$.]
where based on the observations $\{x_i,\; i = 1, 2, \dots, N\}$ we are to decide
whether the signal $s$ is absent or present. The quantity $\alpha$ is the probability
of an error of the first type (incorrectly accepting $H_1$); it is also called the
significance level. Similarly, $\beta$, the detection probability or the power function
of the test, is the probability of correctly accepting $H_1$.
In statistical terminology, we are dealing here with a test for location in
the case of a simple hypothesis $H_0$ vs. a simple alternative $H_1$. However, in
communication theory, the problem receives a different formulation which
uses the notion of a detector. This is best represented by the scheme presented
in Figure 7.1.
The detector represented in this figure is defined through the form of
$g$, which is called in this context a zero-memory non-linearity. Also, the
distribution of the noise $n_i$ influences the value of the threshold $T$ for
the test statistic $t = \sum_{i=1}^N g(x_i)$, since the latter has a distribution which
depends on the distribution of the noise.
Various forms of detectors can be proposed by means of an appropriate
definition of $g$. The well-known Neyman-Pearson optimal detector is defined
if the density of the input is known. Its form (as well as its name) follows
from the classical Neyman-Pearson lemma [Neyman and Pearson (1933)],
which maximizes the power of the test. It is easy to observe that in general
the optimal non-linearity should be of the form
\[
g_{opt}(x) = \ln\frac{f_n(x - \theta)}{f_n(x)},
\]
where $f_n$ is the density of the $n_i$ (which are assumed to be i.i.d. random
variables) [see, e.g., Miller and Thomas (1972)].
In the analysis of detector performance, the noise is commonly assumed
to be Gaussian. The assumption is often justified (for example, for ultra-high
frequency (UHF) signals) and results in a mathematically tractable analysis.
However, in many instances, as pointed out by Miller and Thomas (1972),
a non-Gaussian noise assumption is necessary (for example, for extremely
low frequency (ELF) signals).
One form of frequently encountered non-Gaussian noise is the so-called
impulsive noise. Such noise typically possesses much heavier tail behavior
than Gaussian noise. Because of this, Laplace noise has been suggested as
a model for some types of impulsive noise.
Indeed, models of noise based on Laplace distributions appear in engineering
studies on various occasions in the last forty years. Bernstein et al.
(1974) comment on the non-Gaussian nature of ELF atmospheric noise, and
they give a plot of a typical experimentally determined probability density
function associated with such noise which is very similar to a Laplace density.
Mertz (1961) proposed a density for the amplitude of impulsive noise
which in the limiting case results in the density of the Laplace law. Kanefsky
and Thomas (1965) considered a class of generalized Gaussian noises, obtained
by generalizing the Gaussian density to arrive at a variable rate of
exponential decay. The Laplace distribution is within this class of generalized
Gaussian distributions. Also, Duttweiler and Messerschmitt (1976)
refer to the Laplace distribution as a model for the distribution of speech.
For the case of Laplace noise given by the density
\[
f(n) = \frac{\gamma}{2}e^{-\gamma|n|}, \qquad n \in R,\; \gamma > 0,
\]
the Neyman-Pearson optimal detector was found in Miller and Thomas (1972).
Namely, the nonlinearity is of the form
\[
g_{opt}(x) = \begin{cases} \gamma s, & x > s, \\ 2\gamma x - \gamma s, & 0 \le x \le s, \\ -\gamma s, & x < 0. \end{cases}
\]
See also Figure 7.2.
In order to solve the detection problem completely, it remains to find the
distribution of the statistic
\[
t = \sum_{i=1}^N g_{opt}(x_i).
\]
Figure 7.2: Nonlinearity in the optimal detector for Laplace noise. [The
function $g_{opt}(x)$ equals $-\gamma s$ for $x < 0$, rises linearly from $-\gamma s$ to $\gamma s$ on
$[0, s]$, and equals $\gamma s$ for $x > s$.]
This problem was solved in Marks et al. (1978) and results in the following
c.d.f.:
\[
\begin{aligned}
F^{(0)}_N(x) ={}& \frac{1}{2^N}\sum_{k=1}^{N}\binom{N}{k}\sum_{r=0}^{k}(-1)^r\binom{k}{r}\sum_{l=0}^{N-k}\binom{N-k}{l}\\
&\quad\times\left[e^{-(r+l)\gamma s} - e^{-\frac{x + N\gamma s}{2}}\,e_{k-1}\!\left(\frac{x + (N - 2(l+r))\gamma s}{2}\right)\right] u\big(x + (N - 2l - 2r)\gamma s\big)\\
&+ \frac{1}{2^N}\sum_{m=0}^{N}\binom{N}{m} e^{-m\gamma s}\, u\big(x + (N - 2m)\gamma s\big),
\end{aligned}
\]
where $e_k(\cdot)$ is the incomplete exponential function,
\[
e_k(z) = \sum_{i=0}^{k} \frac{z^i}{i!},
\]
and
\[
u(z) = \begin{cases} 0, & z < 0, \\ 1, & z \ge 0. \end{cases}
\]
For the proof of this result and further discussion of testing hypotheses
about the location parameters of Laplace laws, see Part I, Chapter 2.6,
Subsection 2.6.4. Note that in the above formulation we use slightly different
notation, to be consistent with the original paper.
Since we are dealing here with the classical Laplace law, which is symmetric,
the distribution of the statistic $t(\cdot)$ under the alternative $H_1$ is given
by
\[
F^{(1)}_N(t) = 1 - F^{(0)}_N(-t).
\]
The mean and variance of the test statistic are
\[
E_0 t = -E_1 t = N\left(1 - e^{-\gamma s} - \gamma s\right),
\]
\[
\operatorname{Var}_0 t = \operatorname{Var}_1 t = N\left(3 - 2e^{-\gamma s} - e^{-2\gamma s} - 4\gamma s\,e^{-\gamma s}\right).
\]
Cf. Theorem 2.6.2 in Part I, Chapter 2.6.
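The moment formulas above are easy to check by a small Monte Carlo sketch (our own, with hypothetical values $\gamma = 1$, $s = 1$): simulate Laplace noise under $H_0$, apply $g_{opt}$, and compare the sample mean and variance of a single term with the per-sample values $1 - e^{-\gamma s} - \gamma s$ and $3 - 2e^{-\gamma s} - e^{-2\gamma s} - 4\gamma s e^{-\gamma s}$ (the statistic $t$ sums $N$ such terms).

```python
# Monte Carlo sketch for the optimal detector under H0 (Laplace noise):
# per-sample mean 1 - e^{-gs} - gs and variance 3 - 2e^{-gs} - e^{-2gs} - 4 gs e^{-gs}.
import math
import random

random.seed(7)
gamma, s = 1.0, 1.0     # hypothetical illustration values

def g_opt(x):
    # Optimal zero-memory nonlinearity for Laplace noise
    if x > s:
        return gamma * s
    if x >= 0.0:
        return 2.0 * gamma * x - gamma * s
    return -gamma * s

def laplace(rate):
    # Noise with density (gamma/2) e^{-gamma |n|}: difference of two Exp(gamma)
    return random.expovariate(rate) - random.expovariate(rate)

reps = 200000
vals = [g_opt(laplace(gamma)) for _ in range(reps)]
mean = sum(vals) / reps
var = sum((v - mean) ** 2 for v in vals) / reps

gs = gamma * s
mean_theory = 1.0 - math.exp(-gs) - gs
var_theory = 3.0 - 2.0 * math.exp(-gs) - math.exp(-2.0 * gs) - 4.0 * gs * math.exp(-gs)
print(mean, mean_theory, var, var_theory)
```

Multiplying both empirical quantities by $N$ reproduces $E_0 t$ and $\operatorname{Var}_0 t$, since the terms of $t$ are independent.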
In communication theory other detectors besides the optimal one are also
considered. For example, the linear detector is given by $g_{lin}(x) = x$ and
the sign detector is given by $g_{sign}(x) = \operatorname{sign} x$. We refer to Dadi and
Marks (1987) and Marks et al. (1978) for a detailed discussion of the performance
of these detectors under Laplace noise and their limiting
behavior when the sample size $N$ increases without bound.
7.2 Encoding and decoding of analog signals
Another standard problem in communication theory is encoding and decoding
of analog signals. The distribution of such signals depends on their
nature. Among the most important are speech signals. It has been
found that the Laplace distribution accurately models speech signals.
Although it was also discovered that true speech signals are strongly correlated
when measured in time, in many theoretical studies it is often assumed,
in order to avoid complications following from the dependence in
samples, that samples are independent. The theoretical findings have been
compared to the corresponding empirical properties observed in real speech
samples. In one such study, Duttweiler and Messerschmitt (1976) considered
a reduced-bit-rate waveform encoding of analog signals. A concise
account of their findings is presented below (we emphasize the portion
where the Laplace distribution has played a prominent role). For additional
details we refer our reader to the original paper.
The method considered in Duttweiler and Messerschmitt (1976) is called
nearly instantaneous companding (NIC). NIC is distinguished among most
other bit-rate reduction techniques by a performance that is largely insensitive
to the statistics of the input signal. The analysis of this robustness
was carried out in the paper by examining the method for sinusoidal signals,
Gaussian independent samples, Laplace independent samples, and real
speech samples (believed to be dependent Laplace samples). The method
involves grouping samples of some standard encoding (in the study, the so-called
µ255 PCM encoding, assuming n-bit quantization¹) into groups consisting
of N samples. Then, it re-encodes the groups, exploiting in a certain
¹ In PCM encoding by an analog-to-digital converter, each bit represents a fixed voltage
level. So if the least significant bit corresponds to a level of V volts, then the nth bit
manner the information about the samples with the largest magnitude in
the groups to reduce the bit size to n − 2. Next, the encoded signal is
decoded in a complementary NIC decoder to obtain back the n-sized bit
codes. Finally, in order to obtain an analog signal, decoding through an
appropriate decoder (µ255 PCM) is performed.
In order to verify the insensitivity of the technique to the initial distribution
of the signal, the NIC signal-to-quantization-noise ratio (SNR) with
n = 8 and three sets of signal statistics (sinusoidal, Laplace, Gaussian)
was discussed. In Figure 7.3, we present the performance for Gaussian and
Laplacian inputs (we should remember that Laplace inputs are believed to
better approximate the true distribution of speech data). The comparison
of SNR is made with respect to the initial encoding (in our case µ255
PCM).
The performance of the decoder depends on the block size N. At N = 8
the degradation is about 7 dB² with a Laplacian distribution and 6 dB
with a Gaussian one. The Laplacian distribution is characteristic of speech,
but speech samples are strongly correlated. For the simulated NIC with an
actual speech input, the degradation for N = 8 was 3.5 dB.
Another interesting way of presenting SNR data consists of graphing
the SNR versus the average number of bits per sample as the block size N
varies. Two such plots appear in Figure 7.3 (bottom). One assumes independent
Laplacian samples, while the other is based on actual speech. The
maximum advantage of NIC is 3 dB with independent Laplacian samples
and 6 dB with the actual speech. In both cases the maximum advantage
occurs at about N = 10.
7.3 Optimal quantizer in image and speech
compression
The Laplace distribution is commonly encountered in image and speech
compression applications. One of the fundamental problems in this context
consists of finding the so-called optimal quantizer design. Let us first explain
the general idea of such a design.
corresponds to a level of 2^n V volts. To achieve recognizable voice quality, sampling at rates
of 8000 samples per second over a 13-bit range must often be used. To reduce the range
requirement, a logarithmic µ255 data compander can be used to compress speech into an
8-bit word according to the formula y(x) = V log(1 + µx/V)/log(1 + µ), with the value
µ = 255 most often used in telephone applications.
² A decibel is a dimensionless, logarithmic unit equal to one-tenth of the common logarithm of a number expressing a ratio of two powers. In the usual case of input and output quantities in telecommunications, the decibel is a very convenient unit to express a signal-to-noise ratio.
Figure 7.3: SNR versus amplitude with independent Laplace (left) and Gaussian (right) samples; SNR of Laplacian samples versus bits/sample (bottom). Graphs are reproduced from Duttweiler and Messerschmitt (1976) with permission of the IEEE (© 1976 IEEE).
Consider an analog signal which should be converted to a digital one. A quantizer is a method of such an analog-to-digital conversion. Specifically, a scalar quantizer maps each input (a continuous random variable) to its output approximation. The issue is to optimize the quantizer performance subject to some criteria. One such criterion is to minimize the information rate of the quantizer as measured by its output entropy. In another approach, the mean square error of quantization is considered as a measure of performance.
Since Laplace distributions are commonly encountered in practical quantization problems, considerable attention was given to the problem of finding an optimal quantizer for Laplacian input sources. Here we shall discuss mostly the results of Sullivan (1996), but the works by Nitadori (1965), Lanfer (1978), Noll and Zielinski (1979), and Adams and Giesler (1978) are recommended to those interested in the history of the problem.
Let an input variable, which will be subject to quantization, be modeled by a random variable X having a smooth p.d.f. f(x). For convenience and without loss of generality let us assume that f(x) is zero for x < 0. An n-level scalar quantizer, where n is the number of possible values \{y_i^{(n)}\} in the quantized output, is defined as

Q^{(n)}(X) = \sum_{i=0}^{n-1} y_i^{(n)} \, I_{(t_{i+1}^{(n)},\, t_i^{(n)}]}(X),

where I_{(a,b]} is the indicator function of an interval (a, b] and \{t_i^{(n)}\}_{i=0}^{n} are the n + 1 decision thresholds for the quantizer, given by

t_i^{(n)} = \sum_{j=i}^{n-1} \alpha_j, \quad i = 0, \dots, n-1, \qquad t_n^{(n)} = 0.

The quantities \{\alpha_i\}_{i=0}^{n-1} are some positive steps (\alpha_0 = \infty), and the output values are defined through a set of n non-negative reconstruction offsets \{\delta_i\}_{i=0}^{n-1} by

y_i^{(n)} = t_{i+1}^{(n)} + \delta_i.
The distortion measure d(\Delta) is any function of \Delta which increases monotonically and smoothly (although not necessarily symmetrically) as its argument deviates from zero (for example, the mean square error d(\Delta) = |\Delta|^2 is a distortion measure). The expected quantizer distortion is then defined by

D_f^{(n)} = E[d(X - Q^{(n)}(X))] = \sum_{i=0}^{n-1} \int_{t_{i+1}^{(n)}}^{t_i^{(n)}} d(x - y_i^{(n)}) f(x)\,dx,

and the probability of each output y_i^{(n)} is

p_i^{(n)} = \int_{t_{i+1}^{(n)}}^{t_i^{(n)}} f(x)\,dx.
The output probabilities determine the output entropy of the quantizer, a lower bound on the expected bit rate required to encode the output, given by

H_f^{(n)} = -\sum_{i=0}^{n-1} p_i^{(n)} \log_2 p_i^{(n)} \quad [\text{bits per sample}].

We are interested in a quantizer which minimizes the objective function

J_f^{(n)} = D_f^{(n)} + \lambda H_f^{(n)}

for some \lambda \ge 0. Such a quantizer is optimal in the sense that no other scalar quantizer can have lower distortion with equal or lower entropy.
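As a small numerical illustration (our own sketch, not an optimal design), the quantities D_f^{(n)}, p_i^{(n)}, H_f^{(n)} and the objective J_f^{(n)} = D_f^{(n)} + \lambda H_f^{(n)} can be evaluated directly for an exponential source; the cell boundaries and output values below are arbitrary choices for illustration only.

```python
import math

# Sketch: evaluate the objective J = D + lambda*H for a hand-picked 4-level
# quantizer of a standard exponential source f(x) = e^{-x}, x > 0.
# Cell boundaries are listed in increasing order; the book indexes the
# thresholds in decreasing order, which is equivalent.
bounds = [0.0, 0.5, 1.2, 2.5, float("inf")]
outputs = [0.25, 0.8, 1.8, 3.3]          # one reconstruction level per cell

def cell_prob(a, b):
    # P(a < X <= b) for X ~ Exp(1)
    return math.exp(-a) - (0.0 if b == float("inf") else math.exp(-b))

def cell_mse(a, b, y, steps=4000):
    # numerical integral of (x - y)^2 e^{-x} over (a, b]
    b_eff = min(b, a + 40.0)             # truncate the infinite tail
    h = (b_eff - a) / steps
    return sum((a + (k + 0.5) * h - y) ** 2 * math.exp(-(a + (k + 0.5) * h)) * h
               for k in range(steps))

D = sum(cell_mse(a, b, y) for a, b, y in zip(bounds, bounds[1:], outputs))
p = [cell_prob(a, b) for a, b in zip(bounds, bounds[1:])]
H = -sum(pi * math.log2(pi) for pi in p if pi > 0)   # output entropy, bits/sample
lam = 0.1
J = D + lam * H
print(round(D, 4), round(H, 4), round(J, 4))
```

Varying the boundaries and recomputing J is exactly the search that the fast algorithms discussed next avoid.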
In Sullivan (1996), the optimal quantizer, as well as a fast algorithm for its computation, was presented for an exponentially distributed input. In the case of mean squared-error distortion, the solution has an explicit form expressed by the Lambert function W, i.e., the inverse function of f(W) = W e^W, which can be approximated by

W(z) = -1 + q - \frac{1}{3} q^2 + \frac{11}{72} q^3 - \frac{43}{540} q^4 + \dots,

where q = \sqrt{2(ez + 1)}. [See Corless et al. (1996).] This optimal solution \alpha_i is given by

\alpha_{i+1} = \nu_i + W(-\nu_i e^{-\nu_i}),

where

\nu_i = 2 - \frac{\alpha_i e^{-\alpha_i}}{1 - e^{-\alpha_i}}.
The results on the exponential source are then used to derive the optimal quantizer for the Laplace distribution. It is interesting to see how the exponential quantizer can be utilized in this case.
First, let us consider the quantizer step which has an output value \Delta associated with the input value x = 0. The boundaries of the step are defined by two non-negative thresholds t_l and t_r, where t_l + t_r > 0, so that if the input is between -t_l and t_r then the output value is equal to \Delta. The quantizer has the distortion

\eta(t_l, t_r, \Delta) = \int_{-t_l}^{t_r} d(x - \Delta)\, e^{-|x|}/2 \, dx

and the entropy

T(e^{-t_l}/2,\; e^{-t_r}/2),

where

T(p, q) = B(p) + (1 - p) B(q/(1 - p)),
B(p) = -p \log_2 p - (1 - p) \log_2 (1 - p).
The number of output levels to the right of t_r is n_r, and to the left of -t_l it is n_l; thus n = n_r + n_l + 1. Now we define the quantizer as the composition of three subquantizers. First, we have the one defined above for values around zero. Then, for a Laplace random variable X, the variable X - t_r given that X > t_r has an exponential distribution, and so does -(X + t_l) given that X < -t_l. Consequently, we can write
J_L^{(n)} = \eta(t_l, t_r, \Delta) + \lambda T\left( \tfrac{1}{2} e^{-t_l},\; \tfrac{1}{2} e^{-t_r} \right) + \tfrac{1}{2} \left( e^{-t_l} \hat{J}_e^{(n_l)} + e^{-t_r} J_e^{(n_r)} \right),
where J_L^{(n)} stands for the objective function for the Laplace source, J_e^{(n_r)} is the objective function for the exponential source, while \hat{J}_e^{(n_l)} is the objective function of an n_l-level quantizer for an exponential source with the mirrored distortion measure \hat{d}(\Delta) = d(-\Delta). Using the results on the exponential source, it is then enough to find the minimizer of \eta(t_l, t_r, \Delta).
The method of computing these quantizers presented in Sullivan (1996) is non-iterative, which is an improvement over some previous iterative refinement techniques. In addition, it is extremely fast and optimal for a general difference-based distortion measure, as well as for a restricted or unrestricted (asymptotic) number of quantization levels.
7.4 Fracture problems
In Epstein (1947, 1948), a potential application of the Laplace distribution is discussed in relation to the fracturing of materials under applied forces. The statistical models considered assume that the difference between the ideal model and observed values is due to randomly distributed flaws in the body which weaken it. The simplest theory is based on the weakest link concept. It assumes that the strength of a given specimen is determined by the weakest point or, in other words, by the smallest value found in a sample of size n, where n is the number of flaws in the considered material. This relates the problem to extreme value theory. For applications, the term strength can be interpreted in different ways: mechanical strength, electrical strength, resistance of painted specimens to the corrosive effects of the atmosphere, ability to stop the passage of light rays, or the life span of a device which ceases to function when any of a number of vital parts breaks down.
There is a dispute about which distributions of the strength of a flaw are the correct ones. Based on experimental data, the following characteristics of the distribution should be accounted for: some experimenters have observed that the mode of the strength decreases as some function of the logarithm of the size of the specimen; the distribution of strengths of specimens all of the
Table 7.1: Summary of the results from Epstein (1948) on the distribution of strength in the weakest link model, depending on the distribution of the strength of a flaw. Here \eta stands for a standard exponential random variable and \tilde{y} for the mode of the smallest value.

Laplace: density \frac{1}{2\lambda} e^{-|x-\mu|/\lambda};
  smallest value distr. (large n): \mu - \lambda \log\left( \frac{n}{2\eta} \right);
  mode: \tilde{y} = \mu - \lambda \log(n/2);
  mean: \tilde{y} - 0.577\lambda;
  variance: \lambda^2 \pi^2 / 6.

Gaussian: density \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(x-\mu)^2/(2\sigma^2)};
  smallest value distr. (large n): \mu - \sigma\left( \sqrt{2\log n} - \frac{\log\log n + \log 4\pi}{2\sqrt{2\log n}} - \frac{\log \eta}{\sqrt{2\log n}} \right);
  mode: \tilde{y} = \mu - \sigma\left( \sqrt{2\log n} - \frac{\log\log n + \log 4\pi}{2\sqrt{2\log n}} \right);
  mean: \tilde{y} - 0.577\,\sigma/\sqrt{2\log n};
  variance: \frac{\sigma^2 \pi^2}{12 \log n}.

Weibull: density \alpha\beta x^{\beta-1} e^{-\alpha x^\beta};
  smallest value distr. (large n): \left( \frac{\eta}{\alpha n} \right)^{1/\beta};
  mode: \left( \frac{\beta-1}{\alpha n \beta} \right)^{1/\beta};
  mean: \Gamma\left( \frac{\beta+1}{\beta} \right) (n\alpha)^{-1/\beta};
  variance: \left[ \Gamma\left( \frac{\beta+2}{\beta} \right) - \Gamma^2\left( \frac{\beta+1}{\beta} \right) \right] (n\alpha)^{-2/\beta}.
same size appears to be negatively skewed; and in the breakdown of capacitors the sizes of conducting particles (flaws) are distributed according to an exponential law. In the last example, it can be easily shown that the most probable value of the breakdown voltage depends linearly on the logarithm of the area.
Epstein (1947, 1948) considers several common distributions of the strength of a flaw given by a density f(x), including Laplacian, Gaussian, and Weibull densities. Several issues are of interest in this context. First, one would like to know the asymptotic distribution of the smallest value in a sample of size n. Then, for the study of the size effect, it is important how the specimen size (represented by n) affects the distribution of strengths. In particular, one would like to know how the mode, the mean, and the variance of the smallest value depend on the size n. Rather standard arguments lead Epstein to the results which are summarized in Table 7.1.
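The Laplace row of Table 7.1 is easy to verify by simulation; the following sketch (our own check, with illustrative values of n and the number of repetitions) compares the simulated mean and variance of the minimum of n standard Laplace variables with the tabulated expressions.

```python
import math, random

# Monte Carlo check of the Laplace row of Table 7.1: for X_1,...,X_n i.i.d.
# standard Laplace (mu = 0, lambda = 1), the minimum is approximately
# -log(n/2) + log(eta) with eta standard exponential, hence
#   E[min] ~ -log(n/2) - 0.5772   and   Var[min] ~ pi^2/6.
random.seed(7)
n, reps = 500, 3000

def laplace():
    # inverse-CDF sampling of a standard Laplace variate
    u = random.random()
    return math.log(2 * u) if u < 0.5 else -math.log(2 * (1 - u))

mins = [min(laplace() for _ in range(n)) for _ in range(reps)]
mean_hat = sum(mins) / reps
var_hat = sum((m - mean_hat) ** 2 for m in mins) / reps

theory_mean = -math.log(n / 2) - 0.5772   # mode minus Euler's constant
theory_var = math.pi ** 2 / 6
print(round(mean_hat, 3), round(theory_mean, 3),
      round(var_hat, 3), round(theory_var, 3))
```

Note how the variance does not depend on n, in contrast with the Gaussian and Weibull rows.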
From this summary we see that the assumption on the form of the distribution affects in a significant way the properties of the strength of a specimen. Moreover, there are physical data available which follow each of the patterns exhibited by the distributions listed above. For example, the breakdown voltages of capacitors have a distribution of the Laplace type. The derived properties, when put in a physical context, give information on how the size effect depends on the distribution of strengths in the vicinity of flaws. Specifically, the specimens become weaker as the size increases. In the case of the Laplace distribution the strength decreases linearly with log n, for the Gaussian distribution the dependence is through \sqrt{\log n}, while for Weibull distributions the dependence is through negative powers of n. The spread of the distribution remains unchanged in the Laplace case, while in the two other cases it decreases with the specimen size.
Epstein's works, carried out over 50 years ago, generated a vast literature on this subject related to extreme value distributions. For our purposes it is sufficient to note that the Laplace distribution appears on an equal footing with the distributions more popular at that period, such as the Gaussian and Weibull.
7.5 Wind shear data
Barndorff-Nielsen (1979) has proposed the hyperbolic distributions for modeling turbulence³ encountered by an aircraft. The model is quite complicated and difficult to handle when parameter estimation is considered. Kanji (1985), noticing that the Laplace and Gaussian distributions are limiting cases of the hyperbolic distributions, proposed a mixture of these two as a model for wind shear data⁴. Wind shears are encountered by an aircraft during the approach to landing, and their distribution is critical for assessing the effectiveness and safety of aircraft and for training pilots to react correctly when they encounter a wind shear.
Kanji (1985) worked with 24 sets of data on wind shear collected during the last 2 minutes of landing of a passenger aircraft. The measurement represents the gradient of airspeed change against its duration. The basic assumption is that a wind shear forms an individual gust which has a strictly defined form specified by its duration and the magnitude of change of the air velocity. The 120-second flight before touchdown was split into four bands, the first two of 40 seconds length and the last two of 20 seconds length. The histograms of the data suggested that for the early stage (first 40 seconds) of landing the Laplace distribution fits the data well, while for the last 20 seconds the less peaky Gaussian distribution appears to be appropriate. Considering this, Kanji proposed the following mixture model:
p_1(x; \mu, \sigma, \alpha) = \alpha \frac{1}{\sigma\sqrt{2}} e^{-\sqrt{2}|x-\mu|/\sigma} + (1 - \alpha) \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/(2\sigma^2)},  (7.5.1)

which is a mixture of Laplace and Gaussian distributions having the same mean and variance. The proposed estimation procedure starts with the estimation of the mean and the variance for both components in the model,
³ Random changes in wind velocity with insufficient duration to significantly affect an aircraft's flight path.
⁴ A change in wind velocity of sufficient magnitude and duration to significantly affect an aircraft's flight path and require corrective action by the pilot or autopilot.
and then employs the chi-square goodness-of-fit procedure to fit the mixing constant \alpha. The inference led to the following approximate values of \alpha in the four time bands: 0.9, 0.6, 0.5, and 0.3, respectively, confirming that wind shear data lose the Laplacian character of the earlier stages in favor of a Gaussian one at the end of landing. The fit was significant in all but 9 of the 24 cases.
In Jones and McLachlan (1990), a mixture of Laplace and Gaussian distributions was studied in the same context. This time, however, the authors do not assume equal variances of the components, and they demonstrate an appropriately modified estimation procedure leading to even better fits than those obtained by Kanji (1985). Further discussion of the justification and the parameter estimation of the mixture model (7.5.1) can be found in Kapoor and Kanji (1990) and Scallan (1992), respectively.
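As a quick sanity check of (7.5.1) (our own sketch, with arbitrary parameter values), the mixture density can be evaluated and integrated numerically; since both components share the mean µ and variance σ², the mixture has total mass 1 and variance σ² for any α.

```python
import math

# Kanji's Laplace-Gaussian mixture (7.5.1); both components share the same
# mean mu and variance sigma^2, so the mixture does too.
def mixture_pdf(x, mu=0.0, sigma=1.0, alpha=0.5):
    lap = (1.0 / (sigma * math.sqrt(2))) * math.exp(-math.sqrt(2) * abs(x - mu) / sigma)
    gau = (1.0 / (sigma * math.sqrt(2 * math.pi))) * math.exp(-(x - mu) ** 2 / (2 * sigma ** 2))
    return alpha * lap + (1 - alpha) * gau

# crude numerical check of total mass and variance on a wide grid
h = 0.001
xs = [i * h for i in range(-20000, 20001)]
mass = sum(mixture_pdf(x) for x in xs) * h
var = sum(x * x * mixture_pdf(x) for x in xs) * h
print(round(mass, 4), round(var, 4))
```

Only the fourth moment distinguishes the components, which is why the fitted α is driven by the peakedness of the data.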
7.6 Error distributions in navigation
In Anderson and Ellis (1971), we find an interesting discussion of error distributions reported in ocean engineering. The authors analyze fifty-four distributions and conclude that most of them have exponential or even heavier tails, and only a few seem to follow a Gaussian law. In heuristic fashion, they argue that only for equipment made of identical items do the collected data follow the Gaussian law. For example, it was observed that the frequency distributions for a single pilot operating the same set of facilities under similar navigational environments appear to be well modeled by Gaussian laws. If the data are collected by instruments which, although nominally the same, are much more diverse and far from identical, then the data exhibit longer tails. This is due to the variability of the variance across different instruments. In aircraft navigation data it was repeatedly observed that data collected from fairly complex navigation systems show a strong tendency to exhibit exponential tail behavior. The question is then how to reconcile the applicability of these two quite different distributions. The answer given in Anderson and Ellis (1971) is to consider Gaussian distributions with random variances.
As an example, consider two gauges, one new and one older and worn out. The larger variance of the second one will lead to data far from the true value, and the distribution can be closer to a Laplace distribution than to a Gaussian one. The authors suggest the use of distributions with exponential or even heavier tails for navigation data. They derive such distributions by combining observations from a number of Gaussian distributions that cover a range of standard deviations. Of course, various distributions (or patterns, as the authors describe them) of standard deviations will lead to different distributions of the errors (we know that one such possibility is the Laplace distribution, if the distribution of the standard deviation is Rayleigh; see 2.2.5). They note: "In the past, navigation statistics have tended to be a conglomeration of single observations from various origins and there has been no need to examine the range of standard deviations from each origin. Therefore, we do not know the pattern which these standard deviations are likely to follow."
The lack of information on the distribution of the standard deviation prevents the authors from making any strong recommendation on the type of error distribution, except that in some situations they strongly favor "log-tail" (in our terminology, exponential-tail) distributions:
"The navigator will remember that the Gaussian distribution can arise if one observer (without blunders) operates one equipment (without integrators) under one set of stable conditions! If his information is based on a number of diverse sources (or even if it is based on one source and the navigator has a healthy pessimism) the log-tail distribution will be preferable within the limits in which he is likely to be interested."
In conclusion, after studying the difference in quantiles between the Gaussian distribution and an alternative distribution, namely a Gaussian with random variance (although they do not consider the Laplace distribution), they say: "if the Gaussian distribution is assumed for errors, and if the standard deviation is deduced from observations based on a large number of equipments and operators, there will in fact be considerably more extreme results than predicted by the assumption."
The argument they provide in favor of models based on Gaussian mixtures with stochastic variance can easily be extended to other areas of applied research. For this reason Laplace distributions can serve as valuable models in the situations heuristically described above.
In Hsu (1979), a model with the Laplace distribution was investigated and compared with real-life data on navigation errors in aircraft position. The data were collected by the U.S. Federal Aviation Administration over the Central East Pacific Track System. The position errors in the lateral direction (along the tracks) were recorded for the traffic heading to Oakland (3435 data points) and Los Angeles (4147 data points). The following five models were fitted to the data: Gaussian, Laplace, Student's t, a mixture of two Laplace distributions, and a mixture of two Student's t distributions. The best fit, particularly in the tail region, was obtained by the mixture of two Laplace distributions. On the other hand, the Gaussian distribution performed rather poorly. It is worth emphasizing that a model adequately describing the tail behavior is of paramount importance in this application. The simplicity of the models based on Laplace distributions and their empirical adequacy add much to their practical applicability, as illustrated by Hsu (1979) in the application of the proposed Laplace model to the calculation of aircraft mid-air collision risk. This risk is based on the probability of track overlap by two aircraft which take adjacent parallel tracks with some nominal lateral separation in nautical miles. The computation of this distribution (which is the convolution of the navigation error distributions for the two tracks considered) is possible for all models, and it was found that the models other than the mixture of Laplace distributions tend to underestimate the overlap for most of the range of the nominal separation considered.
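The convolution mentioned above has a simple closed form in the pure-Laplace case; the following sketch (our own illustration, not Hsu's exact computation; the scale b and the separation value are arbitrary) checks that the difference of two i.i.d. Laplace(0, b) errors has density (1 + |z|/b) e^{-|z|/b}/(4b), against a direct numerical convolution.

```python
import math

# If the lateral errors on two parallel tracks are i.i.d. Laplace(0, b),
# their difference has the convolution density
#   f(z) = (1 + |z|/b) * exp(-|z|/b) / (4b),
# which we verify numerically at one point.
b = 1.0

def laplace_pdf(x):
    return math.exp(-abs(x) / b) / (2 * b)

def diff_pdf(z):
    a = abs(z) / b
    return (1 + a) * math.exp(-a) / (4 * b)

# numerical convolution at z = 2 (e.g., a nominal separation of 2 scale units)
h = 0.001
z = 2.0
num = sum(laplace_pdf(x * h) * laplace_pdf(z - x * h)
          for x in range(-15000, 15001)) * h
print(round(num, 5), round(diff_pdf(z), 5))
```

The same convolution idea, applied to mixtures of Laplace errors, underlies the overlap probabilities computed by Hsu.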
8 Financial data
An area where the Laplace and related distributions can find most interesting and successful applications is the modeling of financial data. This is due to the fact that traditional models based on the Gaussian distribution are very often not supported by real-life data, mostly because of the long tails and asymmetry present in these data. Because Laplace distributions can account for leptokurtic and skewed data, they are natural candidates to replace Gaussian models and processes. In fact, some activity involving the Laplace distribution can already be observed in this area. The Laplace motion and models based on multivariate Laplace laws have appeared in works on modeling stock market returns, currency exchange rates, and interest rates. In this chapter we present several such applications.
It is important to mention that there exists interesting material on applications of hyperbolic and normal inverse Gaussian distributions to financial data [see, e.g., Eberlein and Keller (1995), Barndorff-Nielsen (1997)]. Since generalized Laplace distributions can be viewed as special cases of hyperbolic distributions, the mentioned work also supports their application to stochastic volatility modeling. In particular, the estimation based on German stock market data in Eberlein and Keller (1995) confirms most of the claims in Section 8.4. We do not report these results, as they are not directly related to the Laplace laws, but we recommend the cited work to those interested in financial modeling.
8.1 Underreported data
Consider a Pareto random variable Y* with p.d.f.

p_1(y*) = \begin{cases} \gamma m^{\gamma} / (y*)^{\gamma+1}, & \text{for } y* \ge m, \\ 0, & \text{for } 0 < y* < m. \end{cases}  (8.1.1)
The Pareto distribution has been found useful for modeling a variety of phenomena, including distributions of incomes, property values, firm or city values, word frequencies, migration, etc. However, as remarked by Hartley and Revankar (1974), in many applications (particularly those dealing with income or property values) one may reasonably expect that the reported values underestimate the true values of a given variable of interest. To account for this, Hartley and Revankar (1974) consider Y* with density (8.1.1) as an unobservable (true) variable, which is related to an observable variable Y via the equation

Y = Y* - U,  (8.1.2)

where the variable U (0 \le U \le Y*) is a positive underreporting error. The goal here is to make inference about the distribution of Y* (that is, to estimate the parameters \gamma and m) based on a random sample from Y. To accomplish this, one needs to relate the p.d.f. of Y to the parameters \gamma and m of Y*. Hartley and Revankar (1974) postulate that the proportion of Y* which is underreported, denoted by

W* = U / Y*,  (8.1.3)

is distributed independently of Y* with the p.d.f.

p_2(w*) = \lambda (1 - w*)^{\lambda - 1}, \quad 0 \le w* \le 1, \ \lambda > 0.  (8.1.4)
Then the observable r.v. Y given by (8.1.2) has the p.d.f.

g(y) = \frac{\gamma}{m} \frac{\lambda}{\lambda + \gamma} \begin{cases} (m/y)^{\gamma+1}, & \text{for } y \ge m, \\ (y/m)^{\lambda-1}, & \text{for } 0 < y < m. \end{cases}  (8.1.5)
We now recognize (8.1.5) as the p.d.f. of a log-Laplace distribution. Indeed, writing X = \log Y and denoting

\sigma = \sqrt{\frac{1}{\lambda\gamma}}, \quad \kappa = \sqrt{\frac{\gamma}{\lambda}}, \quad \theta = \log m,  (8.1.6)
we find that the p.d.f. of X is

h(x) = \frac{1}{\sigma} \frac{1}{\kappa + 1/\kappa} \begin{cases} e^{-(\kappa/\sigma)|x - \theta|}, & \text{for } x \ge \theta, \\ e^{-\frac{1}{\sigma\kappa}|x - \theta|}, & \text{for } x < \theta, \end{cases}  (8.1.7)

which is a three-parameter AL*(\theta, \kappa, \sigma) density [see also Hinkley and Revankar (1977)]. Thus, AL laws have found applications in economics in connection with modeling (underreported) income and similar variables.
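The derivation above is easy to confirm by simulation; in the sketch below (our own, with arbitrary illustrative values of γ, λ, and m) we check that for the observed Y = Y*(1 - W*) the mass of X = log Y to the right of θ = log m equals λ/(λ + γ), as (8.1.5) and (8.1.7) imply.

```python
import math, random

# Underreporting model: Y* is Pareto(gamma, m), the underreported proportion
# W* has density lambda*(1-w)^(lambda-1), and the observed Y = Y*(1 - W*).
# Then X = log Y is asymmetric Laplace with theta = log m and
# P(X >= theta) = lambda / (lambda + gamma).
random.seed(1)
gamma_, lam, m = 2.0, 3.0, 1.5
reps = 20000

def pareto():
    return m * random.random() ** (-1.0 / gamma_)   # inverse-CDF sampling

def w_star():
    return 1.0 - random.random() ** (1.0 / lam)     # P(W* <= w) = 1-(1-w)^lam

xs = [math.log(pareto() * (1.0 - w_star())) for _ in range(reps)]
theta = math.log(m)
right_mass = sum(x >= theta for x in xs) / reps
print(round(right_mass, 3), round(lam / (lam + gamma_), 3))
```

The two exponential tails of X, with rates γ (right) and λ (left), come directly from log Y* and log(1 - W*).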
8.2 Interest rates data
Figure 8.1: Top-left: Histogram of interest rates on 30-year Treasury bonds. Top-right: Non-parametric estimator of the density (thin solid line) vs. the theoretical ones (normal: dashed line; AL: thick solid line). Bottom-left: Empirical c.d.f. vs. normal c.d.f. Bottom-right: Empirical c.d.f. vs. AL c.d.f.
In this section we present an application of AL distributions to modeling interest rates on 30-year Treasury bonds. Klein (1993) studied average daily yield rates of 30-year Treasury bonds from 1977 to 1990, finding that the empirical distribution is too "peaky" and "fat-tailed" to have come from a normal distribution. He rejected the traditional log-normal hypothesis and proposed the Paretian stable hypothesis, which would "account for the observed peaked middle and fat tails". The paper was followed by several discussions, in which some researchers objected to the stable hypothesis and offered alternative models.
Kozubowski and Podgórski (1999a) suggested an AL model for the interest rates, arguing that this relatively simple model is well capable of capturing the peakedness, fat tails, skewness, and high kurtosis observed in the data. These authors considered a data set consisting of interest rates on 30-year Treasury bonds on the last working day of the month (published in Huber's discussion of Klein's paper, p. 156). The data cover the period February 1977 through December 1993. Converting the data to the logarithmic changes, Y_t = \log(i_t / i_{t-1}), where i_t is the interest rate on 30-year Treasury bonds on the last working day of month t, the authors assume that the resulting 202 values of the logarithmic changes Y_i are i.i.d. observations from an AL distribution.
The histogram of the data set appears in Figure 8.1 (top-left). The typical shape of an AL density is apparent: the distribution has a high peak near zero and appears to have tails thicker than those of the normal distribution. Comparisons of the empirical c.d.f. with the normal c.d.f. (Figure 8.1, bottom-left) and the empirical density with the normal density (Figure 8.1, top-right) confirm these findings. We observe a disparity around the center of the distribution due to a high peak in the data. To fit an AL model, one needs to estimate the parameters \mu and \sigma. Kozubowski and Podgórski (1999a) used the maximum likelihood estimators, obtaining

\hat\mu = 0.007178218 and \hat\sigma = 0.294043202,

and then calculated the parameter \kappa and some other related parameters. The resulting values are presented in Table 8.1, along with corresponding empirical counterparts:
1. Sample mean: \frac{1}{n}\sum Y_i.
2. Sample variance: \frac{1}{n}\sum (Y_i - \bar{Y})^2.
3. Sample mean deviation: \frac{1}{n}\sum |Y_i - \bar{Y}|.
4. Sample coefficient of skewness: \hat\gamma_1 = \frac{1}{n}\sum (Y_i - \bar{Y})^3 \Big/ \left( \frac{1}{n}\sum (Y_i - \bar{Y})^2 \right)^{3/2}.
5. Sample kurtosis (adjusted): \hat\gamma_2 = \frac{1}{n}\sum (Y_i - \bar{Y})^4 \Big/ \left( \frac{1}{n}\sum (Y_i - \bar{Y})^2 \right)^{2} - 3.
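For concreteness, the five statistics above can be computed as follows (a generic sketch on a made-up toy sample; the Treasury-bond data themselves are not reproduced here).

```python
import math

# The five sample statistics listed above, computed for a small toy sample.
y = [0.02, -0.01, 0.005, -0.03, 0.015, 0.0, -0.005, 0.01]
n = len(y)
mean = sum(y) / n
var = sum((v - mean) ** 2 for v in y) / n               # sample variance (1/n)
mean_dev = sum(abs(v - mean) for v in y) / n            # sample mean deviation
skew = (sum((v - mean) ** 3 for v in y) / n) / var ** 1.5
kurt = (sum((v - mean) ** 4 for v in y) / n) / var ** 2 - 3   # adjusted kurtosis
print(round(mean, 6), round(var, 6), round(skew, 3), round(kurt, 3))
```

For an AL fit, these empirical values are then matched against their theoretical counterparts, as in Table 8.1.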
Except for a slight discrepancy in skewness, the match between empirical and theoretical values is remarkable. In Figure 8.1 the theoretical AL c.d.f. is compared with the empirical c.d.f. (bottom-right), and the density kernel estimator based on the data is compared with the theoretical densities of the normal and AL distributions with the estimated parameters (top-right). We observe a better agreement with the AL distribution than with the normal one [the figure is taken from Kozubowski and Podgórski (1999a)].
Table 8.1: Theoretical versus empirical moments and related parameters of Y ~ AL(\hat\mu, \hat\sigma).

Parameter                  Theoretical value   Empirical value
Mean                       0.001018163         0.001018163
Variance                   0.001733809         0.001372467
Mean deviation             0.02944785          0.02945773
Mean dev. / Std dev.       0.7072175           0.7582487
Coefficient of skewness    0.07334177          0.2274964
Kurtosis (adjusted)        3.003586            3.599207
Figure 8.2: Japanese Yen (left) and German Deutschemark (right) daily exchange rates, 1/1/80 to 12/7/90.
8.3 Currency exchange rates
We present an application of AL distributions to modeling foreign currency exchange rates, taken from Kozubowski and Podgórski (2000). Following the ideas of Mittnik and Rachev (1993), we may view an exchange rate change as a sum of a large number of small changes, where the sum is taken up to a random time \nu_p (that has a geometric distribution):

exchange rate change = \sum_{i=1}^{\nu_p} (small changes).
The random nature of the time reflects the volatility and unpredictability of the factors which contribute to the establishment of the current exchange rate. Therefore, the AL laws (provided the small changes have finite variance) are very likely to approximate the distribution of the exchange rate change. We may think of \nu_p as the moment when the probabilistic structure governing the exchange rates breaks down. This can be new information, or political, economic, or other events that affect the fundamentals of the exchange market.
[Figure: normal quantile-quantile plot of the Yen data; horizontal axis: Quantiles of Standard Normal (-2 to 2); vertical axis: Yen (-0.04 to 0.04); one outlier at x = 3.6, y = 0.14.]
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
Quantiles of Standard Normal
DM
-2 0 2
-0.04 -0.02 0.0 0.02 0.04
o
o
o
o
o
o
oo
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
oo
oo
o
o
o
oo
o
o
o
o
o
o
o
o
ooo
oo
o
o
o
ooo
o
o
oo
oo
oo
oooo
o
o
oo
oo
o
o
oo
oo
o
oo
o
ooo
oo
o
o
oo
o
oo
o
oo
o
oo
o
oooo
oo
oo
oo
o
o
oo
ooo
oooo
oo
ooo
o
oo
oo
oo
o
oooo
oo
oo
ooo
o
ooo
oo
oooooo
oo
oo
o
oooo
ooo
oooooo
oo
oooo
oo
oo
ooooo
oo
oo
ooooo
oooo
o
oooooooo
ooo
oooo
oo
oo
oo
oooo
ooooooo
ooooo
ooooooo
ooooo
o
oooo
oooo
oooooo
ooooo
oo
ooooo
oooo
oo
ooooo
oooooo
oo
oooo
ooooo
ooooo
oo
oooooo
ooo
oooooooo
ooooo
oooooooo
oooooo
ooooo
oooooo
ooooooooo
ooooo
ooooooo
ooooooo
ooooooo
oooooooooooo
oooooooo
oo
ooooo
ooooooooo
oooooo
oooo
ooooooooo
ooooooooooooooo
oooooo
ooooooo
ooooo
ooooo
oooooooooooo
ooooooo
oooooooooo
ooooooooo
ooo
ooo
ooooooooo
ooooo
ooooooooooooooo
oooooooo
ooooooooooo
ooooooooooooooo
oooooooooo
oo
oooooooooo
oooooooooo
ooooooooo
oooooooo
oooo
ooooooooo
ooooooooo
ooooooooooo
ooooooooooooooo
oooooooooooooo
oooooooooo
oooooooo
ooooooooooo
oooooooooo
oooooooooooooo
oooooo
ooooooo
ooooooooooooo
oooooooooooo
oooooooo
ooooooooooo
ooooooo
ooooooooooooooooooooo
oooooooo
ooooooooooooo
oooooooooooooo
oooooooo
ooooooooooooooooo
oooooooooooooo
ooooooooooo
ooooooooo
oooooooo
oooooooooooo
oooooooooooooooo
ooooo
oooooooooooooo
oooooooooooooooooo
oooooooooooooooooooooo
oooooooooooo
ooooooooooooooooo
ooooooooooo
ooooooooooooooooo
ooooooooo
ooooooooooooooooooo
oooooooooooo
ooooooooooo
ooooooooooooooooooo
oooooooooooo
ooooooooooooooo
ooooooooooo
oooooooooooooo
ooooooooooooooooooooo
oooooooooo
ooooooooooooooo
ooooooooooooooooo
ooooooooo
ooooooooooo
ooooooooooooooooooooooooooo
oooooooooooooooooooooooooo
ooooooo
oooooooo
oooooooooooooooooooooooooooooooooooo
oooooooooooo
ooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooo
oooooooooo
oooooooooooooooooooooooooooo
ooooooooooo
oooooooooooooo
ooooooooooooooooooooooo
oooooooooooooooooooo
oooooooooooooooo
ooooooooooo
oooooooooooooooooooooo
ooooooooo
ooooooo
oooooooooooo
ooooooooooooooooooo
oooooooooo
oooooooooo
ooooooooooooooooo
ooooooooooo
ooooooooo
ooooooooooooo
ooooooooo
oooooooo
ooooooooooooooooo
ooooooooooooooooo
ooooooo
ooooooooo
ooooooooo
oooooooo
ooooooooooooooooooo
oooooo
oooooooooooo
oooooooooooo
ooooooo
ooooooo
oooooooooooooooooo
oooooo
ooooooooooooo
ooooooooooo
ooo
ooooooooo
ooooo
ooooooooooooo
oooooooooo
ooooooooooo
ooooo
ooooooooooooo
oooooooooooo
ooooooooo
oooooooooo
ooooooooooo
ooooo
oooooooooo
ooooooo
ooooooooooo
oooooooooooo
oooooooo
ooooooo
ooooooooooooo
oooooooooooooo
oooooo
ooooooo
oooooooo
oooooo
ooooooo
oooooooo
ooooooooo
ooooooooo
ooooooo
ooooooo
oooooooo
ooooo
oooooo
oooo
oooooo
oooo
oooooo
oooooo
ooooooo
ooooooooo
ooo
ooo
oooooo
ooooooo
ooooooo
ooooooo
ooooo
oooooooooo
oooooo
ooooo
oooooo
oooooooo
oooo
ooooo
oooooo
o
oooooo
oo
oooo
oooo
oooooooo
ooo
oooo
ooo
ooo
ooooo
ooo
oo
oo
ooo
ooooooo
ooooo
ooooo
ooo
oooooo
oooo
oooo
ooooo
oooooo
oooooooo
ooo
oooo
ooo
ooo
ooooo
oooo
ooooo
ooo
oooo
ooooo
ooo
o
oooo
oo
oo
ooo
oo
oo
oo
oooo
oo
o
oooo
ooo
ooooooo
ooooooo
o
ooo
oo
o
ooooo
ooo
oooo
o
oo
oo
oo
o
oooo
oo
o
oo
oooo
ooo
ooooo
ooo
o
oo
oooooo
ooo
ooooo
oo
o
oo
oo
o
ooo
o
ooo
oo
o
o
oo
oo
o
o
o
oo
oo
o
o
oo
o
oo
o
ooo
o
o
oo
oo
oooo
oo
oo
o
o
oo
o
oo
oooo
o
o
o
o
o
oo
oo
oo
o
o
o
o
o
o
o
oo
o
o
oo
oo
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
Quantiles of Asymetric Laplace
Yen
-0.02 0.0 0.02 0.04
-0.04 -0.02 0.0 0.02 0.04
o
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
oo
o
o
oo
o
o
o
o
o
o
o
o
ooo
oo
oo
o
o
o
o
o
oo
o
o
o
o
o
o
oooo
o
oo
o
o
o
oo
o
oooo
ooo
o
o
o
o
o
o
o
o
oo
o
oo
oo
oo
o
o
oo
o
oo
o
o
oo
ooooo
oo
oooo
o
oo
oo
o
oo
o
ooo
o
ooo
oo
o
oo
oo
o
o
o
ooo
ooo
o
ooo
o
ooo
o
o
ooo
oooo
o
o
ooo
ooo
ooooo
ooo
oo
ooooo
o
oo
oooo
o
oooo
oo
oooooo
oooo
ooooo
o
oo
ooo
oooo
o
o
ooo
oo
ooooo
oooo
ooo
o
ooo
ooo
ooooo
ooo
ooooo
oo
oo
ooooo
ooo
ooooo
oooooo
oo
ooo
oo
ooooo
oooo
ooooo
oooooo
oooooooo
oooo
oo
oooooo
oooooooo
oo
ooooo
ooo
ooooooo
oooooo
ooooo
oo
ooo
ooo
ooooo
ooooooo
o
ooooo
oooo
ooooo
oooooooo
oooo
ooooooo
oooo
oooooo
oooooo
oooo
oooo
ooooooo
ooooooo
oooooo
ooooo
o
oooooo
oooooo
oooooooooo
oooo
oooooooo
ooo
ooooooo
ooo
oooooooo
oooo
ooooooooo
oooooo
ooooo
oooooooooooo
oooooo
oooooo
ooooooooo
oooooooo
ooooooo
oooooo
ooooo
oooo
ooooooo
oooooooooo
ooooo
ooooooooooo
ooooooooo
ooooo
ooooooooo
oooooooooo
oooooooooooo
oooooooo
oooooooo
oooooooooooo
ooooooo
ooooooo
ooooo
oooooooo
ooooooo
ooooooooo
oooooo
ooooooooo
ooooooooo
ooooooo
o
oooooooo
oooooooo
ooooo
oooooo
ooooooo
oooooooo
ooooooo
oooooooooooo
ooooooo
oooooooo
oooooooooooooo
ooooooooooooo
ooooooooo
ooooooo
oooooooooooooo
ooooooooo
ooooooooo
oooooo
ooooooooo
ooooooooo
ooooooooo
oooooooooo
oooooooooo
ooooooooooo
ooooooooo
oooooooooooooooo
ooooooooo
oooooooooooo
ooooooooooooooooo
ooooooooo
oooooooo
ooooooooooooo
ooooooooo
oooooooo
ooooooooooo
ooooo
ooooooooo
oooooooooooo
ooooooooooo
oooooooooooo
oooooooooo
ooooooooo
oooooooooo
ooooo
ooooooooooo
oooooooooooooo
ooooo
oooooooooooooooo
ooooooooo
ooooooo
oooooooooooooooo
oooooooo
ooooooooooo
oooooo
oooooooooooooooooooooooo
ooooooooooooooooo
oooooooo
ooooooooooooooo
oooooooooooooo
ooooooooooooooooo
ooooooooooooooo
ooo
ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oo
oooooooooooooooooooooo
oooooooooooo
ooooooooo
oooooooooo
oo
oooooooooooooooooo
oooooooooooooo
ooooooooo
ooooo
ooooooooo
oooooooo
ooooooooo
ooooooooooo
ooooooooo
ooooooooo
ooooooooooooo
ooooooooooo
ooooooo
oooooooooooooo
oooooooooooooo
oooooooooooooo
ooooooo
ooooooooooo
oooooooooooooo
oooooooo
oooooooo
ooooooooooooooooo
ooooo
oooooo
oooooo
ooooooooooo
oooooooooooo
oooooooo
ooooooo
oooo
ooooooooooo
oooooooooooo
ooooo
oooooooo
oooooooo
ooooooo
ooooooo
ooooooooo
ooooooo
ooooooooo
ooooooooo
oooo
oooooo
ooooooo
ooooo
ooooooo
ooooooo
oooooo
ooooooooo
ooooooo
oooooo
oooooo
ooo
oooooooo
oooooooooo
ooooooooo
oooooooo
ooooooooo
ooooooooo
ooooo
ooo
oooooooooo
oooooo
ooooooooo
ooooooooooooo
oooooooo
ooooooooooo
ooooooo
oo
ooooooo
ooooooo
oooooo
oooooooo
oooooo
oooooo
ooo
oooooo
ooooo
ooooooooo
oooo
oooooooo
oooooooooo
oooo
ooooo
ooooooo
ooo
oooo
oo
oo
oooooo
oo
oo
ooooo
oo
oo
oooo
ooooo
ooooooooo
ooo
oooo
oooooooooo
ooo
ooooo
oooooo
oooooo
ooooooooo
oooo
oooo
oooo
ooo
oo
oooooo
ooooooo
o
ooo
ooo
oooooo
o
oooooo
ooooo
ooooooo
ooooo
oooo
ooooo
o
ooo
ooooo
ooo
oo
oooo
oooo
ooooo
ooo
oo
oo
oo
ooooo
oooo
o
ooooo
o
ooooo
ooo
oooooo
ooo
o
oo
oo
ooooo
o
oooo
oooo
oooo
ooo
oooo
o
oo
ooo
oo
oo
o
ooo
oooo
oo
oooo
oooooo
oo
oo
o
ooo
oo
o
o
oo
oo
oo
ooo
oo
oooo
oo
oooo
ooo
oo
o
o
o
oo
oo
oo
oo
ooooo
oo
oooo
oo
ooo
o
ooo
oo
ooo
o
o
o
o
o
ooo
o
o
ooo
oo
oooo
o
o
oo
oo
oo
o
ooo
oo
ooo
oo
oo
oo
o
o
o
oo
o
oo
o
ooo
o
o
o
oo
o
oo
oo
o
o
ooo
oooo
o
o
o
o
oo
o
o
oo
o
o
ooo
oo
o
o
o
o
o
o
oo
o
o
oo
o
oo
oo
o
o
oo
oo
oo
o
oo
o
o
o
o
o
oo
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
oo
o
o
o
Quantiles of Asymetric Laplace
DM
-0.02 0.0 0.02 0.04
-0.04 -0.02 0.0 0.02 0.04
Figure 8.3: Top: Normal quantile plots of Japanese Yen (left) and German
Deutschemark (right) exchange rate data. Bottom: Quantile plots of
Japanese Yen (left) and German Deutschemark (right) exchange rate data
vs. fitted AL distributions.
Kozubowski and Podgórski (2000) fitted AL laws to a bivariate data set
on two currency commodities: the German Deutschemark versus the US
Dollar (DMUS), and the Japanese Yen versus the US Dollar (YUS). The
observations were daily exchange rates from 1/1/80 to 12/7/90 (2853 data
points). [The standard change in the log(rate) from day t to day t + 1 was
used.]
The histograms of the data appear in Figure 8.2, where we observe a
typical shape of an AL density. The distributions have high peaks near zero
and appear to have tails thicker than those of the normal distribution. Normal
quantile plots (QQ plots) in Figure 8.3 (top) confirm these findings.
Observe that the normal plots deviate from a straight line rather substantially.
In order to fit an AL model, we need to estimate the parameters µ
and σ. The maximum likelihood estimators produced

µ̂ = 0.0007558 and σ̂ = 0.00521968

for the German Deutschemark data and

µ̂ = 0.0007272 and σ̂ = 0.0049445

for the Japanese Yen data. The quantile plots of the two data sets against the
theoretical AL distributions are presented in Figure 8.3 (bottom). We see
only slight departures from the straight line. It is evident even to the
naked eye that AL distributions model these data more appropriately than
normal distributions. We refer the reader to Kozubowski and Podgórski
(1999c) for a more in-depth study of modeling the distribution of currency
exchange rates with AL laws.
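The fitting step can be sketched numerically. The following is a minimal illustration, not the authors' actual procedure: it fits the classical symmetric Laplace density by maximum likelihood (location = sample median, scale = mean absolute deviation about it) to synthetic log-returns standing in for the exchange-rate series, which is not reproduced here; the AL fit in the text additionally estimates a skewness parameter.

```python
import numpy as np

def laplace_mle(x):
    """Maximum likelihood for the classical symmetric Laplace density
    f(x) = exp(-|x - mu| / b) / (2 b): mu is the sample median and b
    the mean absolute deviation about it."""
    mu = np.median(x)
    b = np.mean(np.abs(x - mu))
    return mu, b

# Synthetic daily log-returns standing in for the exchange-rate series
# (which is not reproduced here); true values are mu = 0.0005, b = 0.005.
rng = np.random.default_rng(0)
returns = rng.laplace(loc=0.0005, scale=0.005, size=2853)
mu_hat, b_hat = laplace_mle(returns)
```

With 2853 observations, as in the currency data, the estimates recover the true parameters to within a few percent.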
8.4 Share market return models
8.4.1 Introduction
The application of the Laplace motion as defined in Section 4.2, Chapter
4, to modeling share market returns has been investigated in many
recent papers, starting with Clark (1973) (although indirectly), and during
the last decade in Madan and Seneta (1990), Madan and Milne (1991),
Longstaff (1994), Eberlein and Keller (1995), Barndorff-Nielsen (1996, 1997)
(through more general models based on hyperbolic distributions), Madan
et al. (1998), and Geman et al. (2000ab).
It is empirically evident that stock price changes do not follow a normal
distribution. In particular, the sample excess kurtosis for many available
financial data is significantly greater than zero (zero corresponds to the normal
distribution). This deviation from normality implies that the assumptions
of the Central Limit Theorem may not be valid for the individual random effects
making up a price change. One solution, postulated by Mandelbrot (1963),
is to consider individual effects not having finite variance. The resulting
distribution should then belong to the class of stable distributions (a.k.a.
Paretian stable laws). An alternative solution, suggested in Clark (1973),
is to consider a subordinated Gaussian process. Considering cotton futures,
he argues that their prices evolve at different rates during identical time
intervals. This is presumably due to the fact that the number of individual
effects which add together to give the price change during a fixed time
unit, say a day, is random. Thus, a version of the Central Limit Theorem with
a random number of elements should be used to obtain an approximate
distribution of a daily stock price. Clark (1973) describes the rationale behind
these assumptions:

"The different evolution of price series on different days is due to the fact
that information is available to traders at a varying rate. On days when no
new information is available, trading is slow, and the price process evolves
slowly. On days when new information violates old expectations, trading is
brisk, and the price process evolves much faster."
In the economic literature, this argument is described through the assumption
that the business (or economic) time runs randomly relative to
the physical time [see Madan and Seneta (1990), Geman et al. (2000ab)].
This sort of argument leads to the subordinated model of stock prices
S(t) = X(T(t)), where X(t) and T(t) are two independent stochastic processes:
X(t) is the stock price in business time t, and T(t) is business time
at real time t.

If we assume that B(t) = log X(t) is a Brownian motion and that T(t)
is a gamma process, then the process L(t) = log S(t) is a Laplace motion.
In the work of Madan et al. (1998) and some other works oriented
toward applications in finance this process is named the variance gamma
process. This model for security prices enjoys several major advantages
when compared with other models discussed in the literature on the
subject. In particular, it incorporates asymmetry, heavy-tailedness, a continuous
time specification, finite moments of all orders, and an elliptically contoured
multivariate counterpart, and it provides an adequate empirical fit. Additional
features include approximation by a compound Poisson process and representation
as a Brownian motion evaluated at a random time governed by
a gamma process. The last representation can be read as a mathematical
expression of an economic clock ticking in a random fashion. All these
features are direct consequences of the properties of the Laplace motion
studied in Section 4.2, Chapter 4.
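The gamma-subordinated construction just described is easy to simulate. A minimal sketch, not taken from the book (the parameter names theta, sigma, nu follow the variance gamma convention and the values are illustrative): draw a business-time increment from a gamma law with mean dt and variance nu·dt, then run the drifted Brownian motion for that amount of business time.

```python
import numpy as np

def laplace_motion_increments(n, dt, theta, sigma, nu, rng):
    """n increments of a Laplace (variance gamma) motion: a Brownian
    motion with drift theta and volatility sigma, evaluated at a gamma
    business time with E[G] = dt and Var[G] = nu * dt per increment."""
    g = rng.gamma(shape=dt / nu, scale=nu, size=n)  # business-time increments
    z = rng.standard_normal(n)
    return theta * g + sigma * np.sqrt(g) * z

rng = np.random.default_rng(1)
incr = laplace_motion_increments(100_000, dt=1.0, theta=0.0,
                                 sigma=1.0, nu=1.0, rng=rng)
# With theta = 0 and dt = nu = 1 the business time is Exp(1), so each
# increment is symmetric Laplace: mean 0, variance 1, excess kurtosis 3.
```

The sample moments of `incr` reproduce the Laplace signature (excess kurtosis near 3) that Gaussian increments would not show.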
Below we present a brief description of the model and list its basic properties.
8.4.2 Stock market returns
We consider a particular commodity with the stock price S_t at time t. We
assume that {S_t, t ≥ 0} is a random process and the return over the time unit
is given by

R = S_{t+1} / S_t.

Then, the log-return is defined as

L = ln R.    (8.4.1)

In most models, it is assumed that the distribution of R does not depend
on t, so the dependence of R on t is not exhibited in the notation.
More generally, the stochastic process

S(t) = S(0) exp(L_t)

usually represents the stock price S(t) at time t, where the process L_t has
homogeneous increments, i.e., L_{t+s} − L_t =d L_s. Note that [by (8.4.1)] we
have L =d L_1.
The literature on market returns includes a number of models for L_t: the
Brownian motion, symmetric stable processes, normally distributed jumps
at Poisson jump times, models based on the t-distribution, and generalized
beta distributions. A model based on the Laplace motion (the variance
gamma process) can be introduced by assuming that L_t has homogeneous
and independent increments and that L_1 has a shifted generalized Laplace
distribution. Thus,

L_1 =d GAL(a, µ, σ, ν),    (8.4.2)

where the parameters of the generalized Laplace distribution (a, µ, σ, and
ν) and the interest rate r are related through

a = r + (1/ν) ln(1 − µ − σ²/2).

The additional shift ln(1 − µ − σ²/2) is a result of the drift

E[exp(L_t)] = (1 − µ − σ²/2)^{−t/ν}

and is added in order to have E[S(t)] = S(0) e^{rt}.
The asymmetric generalized Laplace distribution (the skewed Bessel K-function
distribution) was, in this context, probably first considered in Longstaff (1994).
He assumes that L_t is a Brownian motion conditioned on a gamma
stochastic variance, with a shift in the mean proportional to this stochastic
variance (without any substantiation of the gamma distribution for the
variance). In the quoted work, the stochastic process was not specified except
for its one-dimensional distributions, which allow for models of L_t other than
the Laplace motion (see Exercise 4.5.10 in Chapter 4).
Madan and Seneta (1990) considered the symmetric Laplace motion,
showing that in this case (µ = 0) the agreement of the Laplace model with
real data is very good. Madan and Seneta (1990) compared the (symmetric)
Laplace motion model with the normal, the stable, and the Press compound
events model (ncp), using a chi-squared goodness-of-fit test statistic on
data on 19 stocks quoted on the Sydney Stock Exchange. For 12 of
the studied stocks, the minimum chi-squared was attained by the Laplace
motion model. The remaining seven cases were best characterized by the
ncp model in five cases and the stable in two cases (and none by the normal
distribution). Thus, the Laplace motion appears to be a good contender as
a model of daily stock returns. The studies of Madan et al. (1998) confirm
this opinion to an even greater extent for the asymmetric Laplace motion.

Madan et al. (1998) studied the empirical prices for the S&P 500 Index
futures traded at the Chicago Mercantile Exchange (CME), obtained from
the Financial Futures Institute in Washington D.C., for the time period
from January 1992 to September 1994. Using the maximum likelihood
approach, the authors fitted these data with the following models: the Brownian
motion (the popular Black-Scholes model), symmetric Laplace motion,
and asymmetric Laplace motion. The three models were considered both
for the statistical process of the stock price and for the risk neutral process,
which was obtained using the data on the three-month Treasury Bill rate
obtained from the Federal Reserve Board in Washington D.C.

For the statistical process of the log-price, it was found that the log-normal
process is strongly rejected in favor of the symmetric Laplace motion,
while the asymmetric Laplace motion makes no significant improvement
in fit over the symmetric one.

The situation is essentially different for the risk neutral process, where
an enhancement of skewness is observed as a result of risk aversion in
equilibrium. For example, the log-normal model is rejected in favor of the
symmetric Laplace motion in 30.8% of the tests, while the analogous rate
for the asymmetric Laplace is 91.6%.
8.5 Option pricing
Once the model for the price change of a commodity is decided upon, it is
important to find an effective and operational formula for the price of an
option. Probably the most important advantage of the Laplace model given
by (8.4.2) is that it allows for a closed form of the price of a European
option on the stock, using the Black-Scholes formula for the Brownian motion
model of price change. The results were obtained in Longstaff (1994) and
Madan et al. (1998).
The standard result on the price of a European call option C(S_0, K, t)
for a strike of K and maturity t, with the initial value of the stock S(0) = S,
is given by

C(S, K, t) = e^{−rt} E[max(S(t) − K, 0)],    (8.5.1)
where the expectation is taken with respect to the risk-neutral density.
Evaluation of the option price (8.5.1) uses the conditional evaluation
given the gamma process in the representation in Theorem 4.2.1. Conditionally
on the value of the random time, we have a standard Brownian
motion model and the Black-Scholes formula can be applied. The European
option price is then obtained by integrating out the gamma process.
Theorem 8.5.1 The European call option price on a stock, when the stock
price is given by the Laplace motion through the condition (8.4.2), is given
by

C(S, K, t) = S · Ψ( d √(1/ν − (α + s)²/2), s(ξ + 1)/√(1/ν − (α + s)²/2), t/ν )
             − K e^{−rt} · Ψ( d √(1/ν − α²/2), ξs/√(1/ν − α²/2), t/ν ),

where

d = (1/s) [ ln(S/K) + rt + (t/ν) · ln( (2 − ν(α + s)²) / (2 − να²) ) ]

and Ψ is the complementary Bessel function given by the following integral
involving the standard normal distribution function Φ:

Ψ(a, b, γ) = ∫₀^∞ Φ( a/√u + b√u ) u^{γ−1} e^{−u} / Γ(γ) du.
The proof of this theorem can be found in Madan et al. (1998). One
can notice the similarity of this formula to the one based on the Black-Scholes
model. The only difference is that the Bessel function is used instead of the
normal distribution. Computationally, the formula is more complex than
the traditional Black-Scholes formula, since it involves a double integral of
elementary functions. It is nevertheless practical, as it was used by Madan
et al. (1998) in their numerical computations on the data discussed in the
previous section.
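The function Ψ has no elementary closed form, but it is a one-dimensional integral and is straightforward to evaluate. A sketch, not from the book: since the e^(−u) factor is exactly the Gauss-Laguerre weight, Ψ can be computed with Gauss-Laguerre quadrature (the quadrature order 64 is an arbitrary choice).

```python
import numpy as np
from math import erf, sqrt, gamma

def norm_cdf(x):
    """Standard normal distribution function Phi."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def psi(a, b, g, n_nodes=64):
    """Psi(a, b, gamma) = E[Phi(a / sqrt(U) + b * sqrt(U))] for
    U ~ Gamma(gamma, 1); the e^(-u) factor in the integrand is the
    Gauss-Laguerre weight, so the quadrature applies directly."""
    u, w = np.polynomial.laguerre.laggauss(n_nodes)
    vals = np.array([norm_cdf(a / sqrt(ui) + b * sqrt(ui)) for ui in u])
    return float(np.sum(w * vals * u ** (g - 1.0)) / gamma(g))

# Sanity check: Phi(0) = 1/2 for every u, so Psi(0, 0, gamma) = 1/2.
```

Positive a or b pushes Φ above 1/2 at every u, so Ψ exceeds 1/2 accordingly, matching the behavior expected of a distribution-function-like quantity.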
Here, for each fit to the three models the option price was computed for
143 weeks. Then the pricing error was computed. For a correct model the
pricing errors should not exhibit any consistent pattern and they should
not be predictable (orthogonality tests were used to determine whether the
prices resulting from a given model are biased). From these studies it
follows that the asymmetric Laplace motion provides an acceptable pricing
which removes the so-called volatility smile so often reported in the financial
literature for Black-Scholes prices. For a detailed description of the
statistical analysis we refer our readers to the above paper.
8.6 Stochastic variance Value-at-Risk models
Research very closely related to the modeling of stock market returns was
presented in Levin and Albanese (1998) and Levin and Tchernitser (1999),
where Value-at-Risk (VaR) models with a multifactor gamma stochastic
variance were recommended and supported by both theoretical results and
real-life data.
Let X be a random risk factor. Assume first that it is modeled by a one-dimensional
random variable. An investment strategy is represented by a
portfolio, say Π(X), which depends on this factor and denotes the return
on investment over some fixed period of time (a day or ten days). The VaR
at the level p ∈ (0, 1) is then defined as the p-quantile of the distribution
of Π(X):

P(Π(X) ≤ VaR) = p.
If the portfolio is a linear function of X, the distribution of the risk factor
X determines the value of the VaR. Usually, the assumption of normality
of X is not supported by real-life data. Figure 8.4 shows that the data
are not well modeled by a normal density. Assuming a Gaussian distribution
may lead to misleading values of the VaR (they are too small in absolute
value when compared to the actual VaR's). The real data exhibit more
peakedness, heavier tails, and often skewness. None of these features can be
modeled accurately by a Gaussian density. (See also Figures 8.6 and 8.5.)
For example, the returns of 3-Month FIBOR presented in Figure 8.4 show:

- skewness equal to 0.98 and kurtosis equal to 49.0 for daily returns;
- skewness equal to 0.46 and kurtosis equal to 5.6 for 10-day returns.
It is not uncommon in financial research to consider a modification of
the normality assumption by allowing for a random variance in the normal
model (see also Section 8.4). In addition, in the work discussed here,
the maximum entropy principle was invoked to determine the distribution
of such a random variance of the risk factor.
More precisely, consider the following assumptions on the distribution of
the risk factor X.

Assumption 8.6.1 Conditionally on V, the distribution of the risk factor
X is normal with the mean µ and variance V, i.e.,

X = √V Z + µ,

where Z is a standard normal variable independent of a positive random
variable V having the mean V_0 = σ_0².
Assumption 8.6.2 The distribution of the variance V ≥ 0 has to satisfy
the maximum entropy principle under the constraint

E(V) = V_0.
Figure 8.4: Comparing histograms of risk factors with the Gaussian model.
Daily and 10-day returns of 3-Month FIBOR. (Courtesy of Alexander
Levin.)
As we already know (see Section 2.4.5), these assumptions lead to the
model with the variance V distributed according to the exponential law, and
thus, by Representation 2.2.3, the unconditional distribution of the risk factor
is given by the Laplace law L(µ, σ_0). Of course, this allows for explicit
computation of the VaR values using the formulas for the quantiles of the
Laplace distribution.
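Written out, the computation is a one-liner. A sketch, assuming the parameterization in which σ² is the variance of the symmetric Laplace law (so the classical exponential-tail scale is b = σ/√2); the numeric values are illustrative.

```python
import math

def laplace_var(p, mu, sigma):
    """p-quantile of the symmetric Laplace law L(mu, sigma), with sigma
    normalized so that Var(X) = sigma**2 (classical scale b = sigma/sqrt(2)).
    For p < 1/2, the usual VaR case, the quantile lies in the lower tail."""
    b = sigma / math.sqrt(2.0)
    if p <= 0.5:
        return mu + b * math.log(2.0 * p)
    return mu - b * math.log(2.0 * (1.0 - p))

# Example: the 1% VaR of a zero-mean risk factor with sigma = 0.005.
var_1pct = laplace_var(0.01, 0.0, 0.005)
```

The exponential tails make the quantile an elementary logarithm, which is precisely why the VaR is explicit in this model.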
Consider, for example, the currency exchange data (also used in Section
8.3). In Figure 8.5, we see that the symmetric Laplace distribution fits
the data far better than the Gaussian distribution.
Remark 8.6.1 In finance, the notion of volatility is commonly used to
describe the square root of the variance (the standard deviation). Note that
if V is exponential, then the volatility √V is distributed according to the
Rayleigh distribution.
It may be reasonable to replace Assumption 8.6.1 by the following one.
Assumption 8.6.3 Conditionally on V, the distribution of the risk factor
X is normal with the mean µ − γV and variance V, i.e.,

X = √V Z − γV + µ,

where Z is a standard normal variable independent of a positive random
variable V having the mean V_0 = σ_0². The parameter γ controls the correlation
between the risk factor X and the stochastic variance V.
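Assumption 8.6.3 is easy to check by simulation. A sketch with illustrative parameter values only, using the sign convention of the displayed equation, under which E[X] = µ − γE[V] and positive γ skews X to the left:

```python
import numpy as np

def simulate_risk_factor(n, mu, gam, sigma0, rng):
    """Assumption 8.6.3 by simulation: conditionally on the stochastic
    variance V ~ exponential with mean sigma0**2, the risk factor X is
    normal with mean mu - gam * V and variance V."""
    v = rng.exponential(scale=sigma0 ** 2, size=n)
    z = rng.standard_normal(n)
    return np.sqrt(v) * z - gam * v + mu

rng = np.random.default_rng(2)
x = simulate_risk_factor(200_000, mu=0.0, gam=1.0, sigma0=1.0, rng=rng)
# E[X] = mu - gam * sigma0**2 = -1 here, and positive gam makes X
# skewed to the left (sample skewness about -1.8 for these values).
```

The sample mean and skewness of `x` confirm the asymmetric Laplace behavior: nonzero mean shift proportional to γ and pronounced skewness, both absent when γ = 0.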
Figure 8.5: Comparison of historical data and their fits by Gaussian and
Laplace densities. (Courtesy of Alexander Levin.)

Figure 8.6: Comparing histograms of risk factors with Gaussian and Laplace
models. Left: Daily returns of the S&P 500 Index, showing that an asymmetric
Laplace distribution (double exponential) fits the data quite well. Right: 10-day
returns of the S&P 500 Index are fitted better by a generalized Laplace
distribution (stochastic variance gamma model). (Courtesy of Alexander
Levin.)
Then, the distribution of the risk factor becomes asymmetric Laplace
AL(µ, −σ_0²γ, σ_0). Again, the VaR can be explicitly computed, as the quantiles
of asymmetric Laplace laws are readily available. We see in Figure 8.6
that the data on the returns of the S&P 500 Index are clearly skewed to the right.
The fit of the asymmetric Laplace on the left graph is far better than the
Gaussian, providing a sound empirical justification of the above model for
risk factor distributions.
So far we have considered a fixed period within which we are modeling
the return of our portfolio. A natural extension is to consider a stochastic
variance model which depends on time. Our previous considerations, which
lead to an exponential distribution for the stochastic variance over a fixed period,
naturally introduce a time factor into the model through a gamma process.

Assumption 8.6.4 The total stochastic variance V(t) follows a gamma
process.
As a consequence of this assumption, the stochastic variance over an arbitrary time interval is distributed according to the gamma law, and the stochastic volatility is distributed according to the Nakagami distribution [which is the distribution of the square root of a gamma distributed variable; see, e.g., Nakagami (1964)]. We know from Chapter 4 that this leads to risk factors distributed according to generalized Laplace distributions (the K-Bessel distributions). In this case, the VaR is no longer expressed in terms of elementary functions, as the K-Bessel distributions involve a modified Bessel function which needs to be inverted to obtain the VaR defined as a quantile of this distribution. Numerical procedures have to be used for computational purposes.
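In practice, the quantile can also be approximated by Monte Carlo directly from the normal-gamma mixture representation of Chapter 4; a rough sketch (the gamma shape/scale arguments and the sample size are illustrative choices, not values from the text):

```python
import math
import random

def gen_laplace_quantile(alpha, shape, scale, mu=0.0, gamma=0.0, n=50_000, seed=7):
    """Monte Carlo alpha-quantile of X = mu + gamma*V + sqrt(V)*Z with
    V ~ Gamma(shape, scale): a generalized (K-Bessel) Laplace risk factor.
    VaR at level alpha for a long position is then -quantile."""
    rng = random.Random(seed)
    xs = sorted(
        mu + gamma * v + math.sqrt(v) * rng.gauss(0.0, 1.0)
        for v in (rng.gammavariate(shape, scale) for _ in range(n))
    )
    return xs[int(alpha * n)]
```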
The available financial data seem to confirm such a model. From Figure 8.6, we observe that the distribution over a longer period of time (10-day vs. daily returns) has a relatively smaller peak in the center, which agrees with the model having gamma distributed stochastic variance. The same observation can be made for the data presented in Figure 8.4.
The above model poses a challenging inferential problem: how to estimate the parameters of distributions based on the generalized Laplace model by exploiting the time scale. For example, the question arises as to which period of time would lead to an asymmetric (but not generalized) Laplace distribution of the risk factor. This problem was partially addressed in Levin and Tchernitser (1999), where an interesting calibration procedure was proposed, allowing for computing the parameters of the model by matching appropriate moments of the distributions for the variance and for the risk factor. As a first step, the method of moments could be used to estimate the parameters.
The next challenge is to extend these models to the case of a multivariate portfolio. Let $X$ be a vector of risk factors and let $\Pi(X)$ be a portfolio depending on these factors. In order to compute the VaR one needs to identify the multidimensional distribution of $X$. Following the successful fit
Figure 8.7: Bivariate distribution based on DEM/USD and JPY/USD data. Top-left: Gaussian model; Top-right: Model based on independent Laplace variables; Bottom-left: Multivariate Laplace model; Bottom-right: Historical distribution. (Courtesy of Alexander Levin.)
of the univariate models, we are looking for distributions which in the one dimensional case reduce to asymmetric Laplace or generalized Laplace distributions. For the bivariate currency exchange data studied in Levin and Tchernitser (1999), three models were examined: the Gaussian, a linear combination of Laplace variables, and the bivariate Laplace (the elliptically contoured Laplace distribution). The two dimensional data on exchange rates of the German Mark and the Japanese Yen vs. the US Dollar were used to verify the proposed models. As seen in Figure 8.7, the most convincing fit is provided by the elliptically contoured K-Bessel distribution, which suggests that multivariate Laplace distributions can very well be useful for multivariate modeling in finance.
8.7 A jump diffusion model for asset pricing with
Laplace distributed jump-sizes
Another model, alternative to the Gaussian one for the price of an asset (a stock or a stock index), was proposed in Kou (2000). In contrast to the variance gamma models discussed in Sections 8.5 and 8.6, which are purely jump processes, it contains both a continuous part, modeled by a geometric Brownian motion, and a jump part, with the logarithm of the jump sizes having a Laplace distribution and the jump times corresponding to the arrival times of a Poisson process. The asset price $S(t)$ is given by the following stochastic differential equation
$$\frac{dS(t)}{S(t)} = \mu\, dt + \sigma\, dW(t) + d\left( \sum_{i=1}^{N(t)} (V_i - 1) \right),$$
where $W(t)$ is a standard Wiener process, $N(t)$ is a Poisson process with rate $\lambda$, and $\{V_i\}$ is a sequence of independent identically distributed nonnegative random variables such that $X = \log V$ has the Laplace distribution $CL(\theta, \eta)$. All the variates are assumed to be independent. The solution to the above equation has the form
$$S(t) = S(0)\, \exp\left( \left(\mu - \frac{1}{2}\sigma^2\right) t + \sigma W(t) \right) \prod_{i=1}^{N(t)} V_i.$$
It is shown in Kou (2000) that the above model has important features observed in financial data (and absent in the standard diffusion models), such as a higher peak and heavier tails, asymmetry, and the volatility smile. Moreover, a closed formula for option pricing is available, although it is somewhat complicated and involves some special functions (the Hh function). We refer the interested reader to the original work.
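The exact solution makes path simulation straightforward: draw the Brownian increment, a Poisson number of jumps, and Laplace-distributed jump logarithms. A sketch, assuming $CL(\theta, \eta)$ denotes the classical Laplace law with location $\theta$ and scale $\eta$:

```python
import math
import random

def kou_terminal_price(s0, mu, sigma, lam, theta, eta, T, rng):
    """One draw of S(T) from the jump-diffusion solution
    S(T) = S(0) exp((mu - sigma^2/2) T + sigma W(T)) * prod V_i,
    where log V_i ~ CL(theta, eta) and the number of jumps is Poisson(lam*T)."""
    # continuous (geometric Brownian motion) part
    log_s = math.log(s0) + (mu - 0.5 * sigma**2) * T + sigma * math.sqrt(T) * rng.gauss(0, 1)
    # jump times arrive as a Poisson process with rate lam
    t = rng.expovariate(lam) if lam > 0 else float("inf")
    while t < T:
        u = rng.random()  # Laplace jump log-size by inversion
        log_s += theta + (eta * math.log(2 * u) if u < 0.5 else -eta * math.log(2 * (1 - u)))
        t += rng.expovariate(lam)
    return math.exp(log_s)
```

With the jump rate and volatility set to zero, the price reduces to the deterministic growth $S(0)e^{\mu T}$, which gives a convenient sanity check.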
8.8 Price changes modeled by Laplace-Weibull mixtures
As we have mentioned on various occasions, the ability to model heavy tails as well as the peak at the center are important advantages of Laplace modeling in finance. Rachev and SenGupta (1993) propose to consider a contaminated Laplace distribution in order to accommodate the possibility of outliers. Namely, the following model is discussed:
$$p(x; \pi, \lambda, \mu, \gamma) = \pi f_1(x; \lambda) + (1 - \pi) f_2(x; \mu, \gamma),$$
where $f_1$ is the $CL(0, 1/\lambda)$ density,
$$f_1(x; \lambda) = (\lambda/2) \exp(-\lambda |x|),$$
and $f_2$ is the density of a symmetric Weibull distribution given by
$$f_2(x; \mu, \gamma) = \frac{\gamma\mu}{2}\, |x|^{\gamma-1} \exp(-\mu |x|^{\gamma}),$$
where $\gamma > 1$, $\mu > 0$, $0 \le \pi \le 1$.
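The mixture density can be coded directly from the formulas above; a minimal sketch (the parameter values used in the normalization check are arbitrary):

```python
import math

def laplace_weibull_mix(x, pi_, lam, mu, gamma):
    """Density p(x) = pi*f1(x; lam) + (1 - pi)*f2(x; mu, gamma):
    a classical Laplace component contaminated by a symmetric Weibull."""
    f1 = 0.5 * lam * math.exp(-lam * abs(x))
    f2 = 0.5 * gamma * mu * abs(x) ** (gamma - 1) * math.exp(-mu * abs(x) ** gamma)
    return pi_ * f1 + (1 - pi_) * f2
```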
Obtaining maximum likelihood estimators for this multiparameter family of distributions is troublesome, mostly because of the presence of the Weibull component. However, the general E-M algorithm can be used for this purpose, and it was successfully applied in Rachev and SenGupta (1993). There are no known results on the asymptotic properties of such estimators.
In the proposed model, the leading term is the Laplace density, with the Weibull density being a possible contaminant. Therefore, it is of interest to test for the no-mixture hypothesis: $\pi = 1$. Various cases, depending on which parameters are known, are discussed in Rachev and SenGupta (1993).
The model was then applied to price changes for real estate data in the city of Paris. Mixture distributions are considered for such data because of the possibility of small changes in the corresponding buyers/investors population due to immigration or emigration. The data consisted of the average prices for one-bedroom apartments in Paris for 61 consecutive months. The data were transformed to $x_i = \log(\xi_{i+1}/\xi_i)$, and then the E-M algorithm yielded the following estimates: $\hat\pi = 0.852$, $\hat\gamma = 5.070$, $\hat\lambda = 7.97$, and $\hat\mu = 45.39$. An initial Monte Carlo study suggests rather good agreement of the estimated model with the observed data.
9
Inventory management and quality
control
Somewhat surprisingly, there are only a few isolated applications of the Laplace distributions related to inventory management problems and quality control. The dominance of the gamma and exponential distributions in this field is still overwhelming. We have collected here a few results which hopefully will be elaborated on by researchers and practitioners in the not too distant future.
9.1 Demand during lead time
The distribution of demand during lead time in inventory control is essential for determining inventory decision variables such as the expected back order, lost sales, protection level, and stock-out risk.
Bagchi et al. (1983) show that, based on theoretical considerations, this distribution ought to be the Hermite distribution [see Johnson et al. (1992)] given by
$$P(W = 0) = p_0 = e^{-a-b},$$
$$P(W = w) = p_w = p_0 \sum_{j=0}^{[[w/2]]} \frac{a^{w-2j}\, b^j}{(w-2j)!\, j!}, \qquad w = 1, 2, 3, \ldots,$$
where $a$ and $b$ are the parameters of the distribution such that $E(W) = a + 2b$ and $\mathrm{Var}(W) = a + 4b$. Indeed, this is the exact distribution of demand during lead time when unit demand is Poisson and lead time is normally distributed. However, in the applied literature [see, e.g., Peterson and Silver (1979)]
$Q_R = \sum_{w=R+1}^{\infty} p_w$ (exact Hermite value and its approximations); percentage error $= 100\,(Q_R - Q_R')/Q_R$, where $Q_R'$ is the approximate value.

R (reorder point) | Hermite $Q_R$ | Laplace approx. | Normal approx. | % error, Laplace | % error, Normal
 7  | .4163 | .4110 | .4449 |   1.27 |  -6.87
 8  | .3098 | .2776 | .3387 |  10.39 |  -9.33
 9  | .2335 | .1875 | .2440 |  12.70 |  -4.50
10  | .1620 | .1267 | .1660 |  21.79 |  -2.47
11  | .1140 | .0856 | .1060 |  24.91 |   7.08
12  | .0741 | .0578 | .0636 |  21.98 |  14.17
13  | .0484 | .0391 | .0358 |  19.21 |  26.03
14  | .0300 | .0264 | .0188 |  12.00 |  37.33
15  | .0186 | .0178 | .0092 |   4.30 |  50.54
16  | .0108 | .0120 | .0043 | -11.11 |  60.19
17  | .0064 | .0081 | .0018 | -26.56 |  71.88

Table 9.1: Approximations to the tail of the Hermite demand during lead time (mean = 7, variance = 13). Source: Bagchi et al. (1983).
it is recommended to utilize the Laplace distribution for this purpose, especially for slow-moving items, or the universal normal approximation. We are thus interested in comparing the normal and Laplace distributions as approximations to the (skewed) Hermite distribution. These approximations are based on the method of moments, with the parameters chosen by equating the means and variances. Bagchi et al. (1983) provide a table comparing the tails
$$Q_R = 1 - P_R = \sum_{w=R+1}^{\infty} p_w$$
of the Hermite distribution with mean 7 and variance 13 (corresponding to Poisson demand with mean equal to one and normal lead time with mean 7 and variance 6, the relations being $E(W) = \lambda\mu$ and $\mathrm{Var}(W) = \lambda\mu + \lambda^2\sigma^2$, where $\lambda(= 1)$ is the mean of the Poisson demand and $\mu$ and $\sigma^2$ are the parameters of the normal lead time) with their normal and Laplace approximations. The results of their findings for this particular case are summarized in Table 9.1.
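The three columns of Table 9.1 are easy to recompute; matching moments gives $a = 1$ and $b = 3$ here ($a + 2b = 7$, $a + 4b = 13$). A sketch (the continuity correction at $R + 1/2$ is our reading of how the approximations were evaluated, not stated explicitly in the text):

```python
import math

def hermite_pmf(w_max, a, b):
    """Hermite probabilities p_w = p_0 * sum_j a^(w-2j) b^j / ((w-2j)! j!)."""
    p0 = math.exp(-a - b)
    return [p0 * sum(a**(w - 2*j) * b**j / (math.factorial(w - 2*j) * math.factorial(j))
                     for j in range(w // 2 + 1))
            for w in range(w_max + 1)]

def approx_tails(R, mean, var):
    """Moment-matched Laplace and normal approximations to Q_R = P(W > R),
    evaluated with a continuity correction at R + 1/2 (our assumption)."""
    x = R + 0.5 - mean
    s = math.sqrt(var / 2.0)                 # Laplace scale: var = 2 s^2
    lap = 0.5 * math.exp(-abs(x) / s) if x >= 0 else 1 - 0.5 * math.exp(-abs(x) / s)
    nrm = 0.5 * (1 - math.erf(x / math.sqrt(2 * var)))
    return lap, nrm
```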
For the normal approximation, the maximum error decreases as the mean increases and increases as the variance increases. For the Laplace approximation, the maximum error seems to increase as the mean increases, but it decreases as the variance increases. The table indicates that the Laplace may approximate the Hermite well in the high percentage points of the right tail. The normal distribution yields better approximations in the middle percentage points. The percentage errors seem to move in opposite directions, with the normal distribution providing a better fit for moderate reorder points and the Laplace substantially dominating at high ones. Further and more detailed investigations may be appropriate.
9.2 Acceptance sampling for Laplace distributed
quality characteristics
In the theory of one-sided acceptance sampling we consider a measured quality characteristic, say $X$, which is compared to an upper specification limit, say $U$, in order to determine whether an item is classified as defective. The quality of the lot of items is then defined as the theoretical proportion $p$ of its defective items, i.e. $p = P(X > U)$. If we have a sample of items from the lot for which quality is expressed in terms of $(X_1, \ldots, X_n)$, and the estimated defective proportion is given by $\hat p$, then the decision rule to accept or reject the whole lot is given by
$$\hat p \le p^* \;\;\text{accept the lot}, \qquad \hat p > p^* \;\;\text{reject the lot},$$
where $p^*$ is a specified acceptance constant.
The theory is well developed if the distribution of $X$ is normal. Sahli et al. (1997) pointed out that using the procedures based on the normality assumption when it is not valid could be quite misleading. The authors report that the procedure which, for the sample size $n = 45$ and under the Gaussian assumption, ensures an acceptance probability of 0.95, gives an acceptance probability of only 0.453 if we replace the Gaussian distribution by the Laplace distribution.
This demonstrates the importance of developing a theory for other than the normal cases. Sahli et al. (1997) present an acceptance procedure for the symmetric Laplace distribution, both in the case when only the center parameter is unknown and in the case when the center and scale parameters are unknown. Below we present a summary of their findings.
In general, we assume that the distribution of $X$ depends on a parameter $\theta$, and we define the lot acceptance probability based on our decision rule by
$$P_a(\theta) = P(\hat p \le p^*).$$
The quality of the lot $p$ is also a function of $\theta$. In the cases when all $\theta$ which give the same $p$ also produce the same value of $P_a$, $P_a$ can be treated as a function of $p$, and the graph of $P_a$ as a function of $p$ is called an operating characteristic (OC) curve. The standard acceptance sampling plan design problem is to give a decision rule with the corresponding OC curve passing
through two given points $(p_1, P_{a_1})$, $(p_2, P_{a_2})$. The problem is solved under the normal assumption by the following acceptance rules:
$$\bar X \le U - \sigma z_{p^*} \quad \text{if the deviation } \sigma \text{ is known},$$
$$\bar X \le U - S z_{p^*} \quad \text{if } \sigma \text{ is unknown and } S^2 \text{ is the sample variance}.$$
Practical ways to choose the sample size $n$ and $p^*$ such that the OC curve passes through the two points $(p_1, P_{a_1})$, $(p_2, P_{a_2})$ are provided by the International Organization for Standardization (1989).
For the Laplace distribution $CL(\theta, \phi)$, we have the following relation between the parameters and the proportion $p$ of the defective items:
$$\theta = U + \phi \ln(2p).$$
The case of $\phi$ known. Let us take the decision rule using the median $\hat\theta$ (which is the MLE of $\theta$):
$$\hat\theta \le \tilde X_U \;\;\text{accept the lot}, \qquad \hat\theta > \tilde X_U \;\;\text{reject the lot}.$$
The issue now is to determine the acceptance constant $\tilde X_U$ and the sample size $n$ such that the OC curve passes through two given points. Note that the function $P_a(p)$ is equal to the cumulative distribution function of the median, which in principle can be explicitly computed, although numerical algorithms have to be used for this purpose. For example, if $\phi = 1$ and $U = 3$, then to ensure $P_a(0.0068) = 0.95$ and $P_a(0.0106) = 0.1$, we obtain $n = 51$ and $\tilde X_U = 1.0360$.
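The OC curve of the median rule is also easy to estimate by direct simulation; a sketch (the repetition count is an arbitrary choice, and the quality levels in the check below are illustrative rather than the book's calibration points):

```python
import math
import random

def laplace_draw(rng, theta, phi):
    """One draw from the classical Laplace CL(theta, phi) by inversion."""
    u = rng.random()
    return theta + (phi * math.log(2 * u) if u < 0.5 else -phi * math.log(2 * (1 - u)))

def oc_point(p, n, accept_const, U=3.0, phi=1.0, reps=2000, seed=11):
    """Monte Carlo estimate of P_a(p) for the median rule: accept iff the
    sample median is <= accept_const, with theta = U + phi*ln(2p)."""
    rng = random.Random(seed)
    theta = U + phi * math.log(2 * p)
    hits = 0
    for _ in range(reps):
        xs = sorted(laplace_draw(rng, theta, phi) for _ in range(n))
        if xs[n // 2] <= accept_const:
            hits += 1
    return hits / reps
```

As expected, the estimated acceptance probability decreases as the lot quality $p$ deteriorates.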
The case of $\phi$ unknown. A reasonable acceptance rule would be
$$\hat p \le p^* \;\;\text{accept the lot}, \qquad \hat p > p^* \;\;\text{reject the lot},$$
where
$$\hat p = \frac{1}{2}\, e^{(\hat\theta - U)/\hat\phi},$$
$\hat\phi$ is the sample mean absolute deviation (the MLE of $\phi$), and $p^*$ is to be determined. This is equivalent to
$$\hat\theta \le U - k\hat\phi \;\;\text{accept the lot}, \qquad \hat\theta > U - k\hat\phi \;\;\text{reject the lot},$$
where $k$ has to be determined. In order to determine the OC curve in this case, one can either consider the exact distributions of the statistics $\hat\phi$ and $\hat\theta$ or apply some asymptotic results (see also Section 2.6). The complexity of the problem was partially analyzed in Sahli et al. (1997).
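The equivalence of the two forms of the rule, with $k = -\ln(2p^*)$, is immediate to verify numerically; a minimal sketch:

```python
import math
import random

def accept_by_phat(theta_hat, phi_hat, U, p_star):
    """Accept iff the estimated defective proportion (1/2)exp((theta_hat-U)/phi_hat) <= p*."""
    return 0.5 * math.exp((theta_hat - U) / phi_hat) <= p_star

def accept_by_k(theta_hat, phi_hat, U, p_star):
    """Equivalent form: accept iff theta_hat <= U - k*phi_hat with k = -ln(2 p*)."""
    return theta_hat <= U - (-math.log(2 * p_star)) * phi_hat
```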
9.3 Steam generator inspection
The exponential distribution has found applications in a variety of fields. Easterling (1978) notices that for heavy-tailed data a model consisting of the sum of an exponential variable and an independent Laplace distributed measurement error can be utilized. In the quoted paper this model is applied to measurements of tube degradation in a steam generator.
The steam generators in pressurized water reactors contain thousands of tubes through which heated water from the reactor flows to be converted into steam. These tubes can erode over time and, if the generator is not inspected and maintained properly, this can lead to leaks which require the plant to be shut down. In order to develop an appropriate inspection plan, an adequate statistical model for the degradation of the tubes has to be developed. In Easterling (1978), the actual degradation (extent of thinning) of a tube $D$, expressed as a percentage of the initial tube wall thickness, is a random variable having an exponential distribution with parameter $\theta$:
$$h(d) = \frac{1}{\theta}\, e^{-d/\theta}.$$
The degradation is measured by a device called an eddy current tester, and it is clear from the available experimental data that the measurements are made with some heavy-tailed and biased errors $E$. The Laplace distribution with density
$$g(e) = \frac{1}{2\phi}\, e^{-|e-\mu|/\phi}$$
seems to be well suited for this sort of data. The measured degradation is then modeled as
$$M = D + E,$$
where $E$ and $D$ are independent and distributed according to the above densities. Then the cumulative distribution function of $M$ is given by
$$P(M \le m) = \begin{cases} \dfrac{\phi}{2(\phi+\theta)}\, e^{(m-\mu)/\phi}, & \text{if } m \le \mu,\\[6pt] 1 - \dfrac{\theta^2}{\theta^2-\phi^2}\, e^{-(m-\mu)/\theta} + \dfrac{\phi}{2(\theta-\phi)}\, e^{-(m-\mu)/\phi}, & \text{if } m > \mu. \end{cases}$$
From the above explicit formula one can derive the conditional moments of $M$ and $D$ [see Easterling (1978)].
A goodness-of-fit analysis of the above model on some experimental data was performed. The model appears to provide an adequate fit. However, as pointed out by Easterling (1978), there is a problem of correctly estimating the variances represented by $\theta$ and $\phi$. Both represent variability in the model, and it is hard to discern whether it comes from the variance of the error or the variance of the degradation.
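The piecewise formula for $P(M \le m)$ is easy to validate against direct simulation of $M = D + E$; a sketch (the parameter values are arbitrary):

```python
import math
import random

def cdf_m(m, theta, phi, mu):
    """P(M <= m) for M = D + E with D ~ Exp(mean theta) and E ~ Laplace(mu, phi),
    assuming theta != phi."""
    x = m - mu
    if x <= 0:
        return phi / (2 * (phi + theta)) * math.exp(x / phi)
    return (1 - theta**2 / (theta**2 - phi**2) * math.exp(-x / theta)
            + phi / (2 * (theta - phi)) * math.exp(-x / phi))
```

Both branches agree at $m = \mu$ (their common value is $\phi/(2(\theta+\phi))$), which is a quick correctness check on the constants.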
9.4 Adjustment of statistical process control
The majority of applications of the Laplace distributions are due to the inadequacy of Gaussian modeling. Along these lines, González et al. (1999) present a rather surprising application of the Laplace distribution: finding an approximate solution to a Gaussian model (which is considered accurate) through exact solutions available for a corresponding Laplace model. Namely, analytical solutions for the average adjustment interval and the mean squared deviation from target of the "bounded adjustment" schemes are found under the assumption that the disturbances are generated from a Laplace distribution. Then the robustness of the solution to the distributional assumptions is demonstrated and used to derive approximate results for the Gaussian case.
Feedback control schemes used in the parts and hybrid industries must often account for the cost of being off target and the costs of adjusting and/or sampling the process. In such a case, feedback adjustment may be implemented by using bounded (dead band) adjustment schemes. In these schemes the disturbances are represented by an integrated moving average (IMA) time series model
$$z_{t+1} - z_t = a_{t+1} - \theta a_t,$$
where $z_0 = a_0 = 0$, the innovations $a_t$ are independent and identically distributed (i.i.d.) normal random variables with mean zero and standard deviation $\sigma_a$, and $0 < \lambda = 1 - \theta \le 1$. The adjustments are given by $x_t = X_t - X_{t-1}$, and their effect is realized at time $t + 1$. The possibility of sampling and adjusting the process occurs only at times that are multiples of $m$.
The corresponding disturbances are given by
$$z_{mt+m} - z_{mt} = u_{mt+m} - \theta_m u_{mt},$$
where $u_{tm}$ are i.i.d. normal random variables with mean zero and standard deviation $\sigma_m$, and $\theta_m$, $\sigma_m$, and $\lambda_m = 1 - \theta_m$ satisfy $\lambda_m^2\sigma_m^2 = m\lambda^2\sigma_a^2$ and $\theta_m\sigma_m^2 = \theta\sigma_a^2$. Optimal bounded adjustment schemes require that an action $X_{tm}$ needed to bring the process back to target is taken every time the minimum mean squared error forecast of the deviation from target exceeds some threshold values $\pm L$. Important parameters for these schemes are the sampling interval $m$, the action limits $\pm L$, and the amount of adjustment required (which depends on the overcompensation $s$ to be produced). Once these parameters are chosen, the average adjustment interval (AAI) and the mean squared deviation (MSD) may be computed by solving certain integral equations. Under the above described disturbances, the equations have the form
$$AAI(x) = m\, h_0(x),$$
$$MSD(x) = \sigma_m^2 + \lambda_m^2\sigma_m^2\left\{ \frac{1 - m}{2m} + g_2(x) \right\},$$
where $x = s/(\lambda_m\sigma_m)$ and $g_2(x) = h_2(x)/h_0(x)$. The functions $h_k(x)$ for $k = 0$ and $2$ are the solutions of the Fredholm integral equation
$$h_k(x) = x^k + \sigma_m \int_{-\Lambda}^{\Lambda} h_k(w)\,\phi\{\sigma_m(w - x)\}\,dw, \qquad (9.4.1)$$
where $\Lambda = L/(\lambda_m\sigma_m)$ and $\phi(\cdot)$ is the density function of the innovations $u_{tm}$. See González et al. (1999) and the references therein.
When the innovations are Gaussian, there is no analytic solution to (9.4.1). However, as shown in González et al. (1999), an analytical solution can be written explicitly if the innovations follow a Laplace distribution. Namely, in the Laplacian case the solutions are
$$h_0(x) = \begin{cases} \Lambda^2 + \Lambda\sqrt{2} + 1 - x^2, & |x| \le \Lambda,\\ \left(1 + \Lambda\sqrt{2}\right) e^{-\sqrt{2}(|x|-\Lambda)}, & |x| > \Lambda, \end{cases}$$
$$h_2(x) = \begin{cases} \Lambda^4/6 + \Lambda^3\sqrt{2}/3 - x^4/6 + x^2, & |x| \le \Lambda,\\ x^2 + \dfrac{\sqrt{2}}{3}\Lambda^3\, e^{-\sqrt{2}(|x|-\Lambda)}, & |x| > \Lambda. \end{cases}$$
These solutions can be used to obtain exact values of the AAI and MSD. The Fredholm equation can also be solved for convolutions of Laplace distributions. The solutions can then be used to approximate the solutions for normal innovations via the Central Limit Theorem. However, as shown in González et al. (1999), the limiting distribution can be approximated quite accurately by simply extrapolating the solutions in the cases of the Laplace distribution and the two-fold convolution of the Laplace distribution. For the corresponding results we refer our reader to the original paper.
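The closed-form solutions $h_0$ and $h_2$ can be coded directly; the sketch below also exercises the fact that each is continuous at the band edge $\pm\Lambda$, as it must be:

```python
import math

SQ2 = math.sqrt(2)

def h0(x, L):
    """Solution h_0 of (9.4.1) for Laplace innovations; AAI(x) = m * h0(x, Lambda)."""
    if abs(x) <= L:
        return L**2 + SQ2 * L + 1 - x**2
    return (1 + SQ2 * L) * math.exp(-SQ2 * (abs(x) - L))

def h2(x, L):
    """Solution h_2 of (9.4.1) for Laplace innovations."""
    if abs(x) <= L:
        return L**4 / 6 + SQ2 * L**3 / 3 - x**4 / 6 + x**2
    return x**2 + (SQ2 / 3) * L**3 * math.exp(-SQ2 * (abs(x) - L))

def g2(x, L):
    """Ratio g_2(x) = h_2(x) / h_0(x) entering the MSD formula."""
    return h2(x, L) / h0(x, L)
```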
9.5 Duplicate check-sampling of the metallic
content
An application of generalized Laplace (Bessel function) distributions was obtained some 40 years ago by Rowland and Sichel (1960) in modeling duplicate measurements of the metallic content in the gold mines of South Africa (but to the best of our knowledge no more recent results are available, at least in the probabilistic and statistical literature). Because such duplicate check-sampling is a common practice in industrial analysis, this approach could be valuable for quality controllers working in other areas as well. In our presentation, we shall restrict ourselves to a description of the model, referring readers interested in quality control to the original paper.
The check measuring is based on duplicate measurements of a specimen in order to gauge the accuracy of quantitative determinations. The two measurements, called the original sample and the check sample, can be used to assess the quality of measurements. In standard applications, it is often
reasonable to assume that the difference of the measurements is normally distributed. However, in cases when the variance of the error depends on the level of the specimen in a measurement, the use of the normal distribution is not appropriate.
This seems to be the case in duplicate measurements of the gold content in gold mines. Namely, the higher the level of the gold content in samples taken in a groove, the larger the variance of the measured content. It was verified in various studies that for double check-sampling in the gold mines the ratios of two measurements have stabilized standard deviations, and thus the ratios, rather than the differences, should be used for statistical purposes.
Let $X$ and $Y$ represent the original and the check sample. From the data collected from mines in South Africa, it was inferred that the distributions of $X$ and $Y$ are identical, and thus the ratio $R = X/Y$ has a distribution which is asymmetric around one. As it is more convenient to use symmetric distributions in deriving control chart limits, the logarithm of the ratio, $L = \log R$, which is distributed symmetrically around zero, is a more suitable variable. The log-normal distribution has a prominent position in mine valuation, and is often used to model the distribution of $R$ if all samples are taken in a small reef area (so that the variance can be assumed constant). If the variances of all such small reef areas were constant, all the ratios obtained in check sampling could be pooled together and would conform to the log-normal law. Unfortunately, the observed data reject such a model. It was observed that the logarithms of the observed ratios, which under the log-normal model should be normally distributed, reveal strongly leptokurtic features. According to Rowland and Sichel (1960), the leptokurtosis is due to the "instability" of the logarithmic variances, which is observed even for samples taken in two neighboring reef areas. Since standard statistical densities used for symmetric leptokurtic distributions, such as the Pearson Type VII distribution (a t-distribution with not necessarily integer-valued degrees of freedom), were rejected by the $\chi^2$-test, the authors resorted to a model which in the terminology of this book is represented by generalized symmetric Laplace (symmetric Bessel function) distributions.
The basis for the model follows the same scheme that was presented earlier in this book: the variable $L$ is normally distributed with a stochastic variance (corresponding to the random choice of the location). The variance is assumed to have a gamma distribution, and $L$ is a product of the square root of the random variance and an independent normal random variable. As a result of these assumptions we obtain the following density of $L$:
$$\gamma(l) = \frac{\sqrt{a/\pi}}{2^{\nu-1/2}\,\Gamma(\nu + 1/2)} \left(\sqrt{2a}\,|l|\right)^{\nu} K_{\nu}\!\left(\sqrt{2a}\,|l|\right),$$
where $a$ and $\nu$ are some positive parameters. This distribution corresponds to the density given by equation (4.1.32), if we take $a = 1/2$ and $\nu = \tau - 1/2$. One should notice that the above density is also well defined for $\nu \in (-1/2, 0]$, although this case was not discussed in the original paper.
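The density $\gamma(l)$ involves only the modified Bessel function $K_\nu$, which can be evaluated from its integral representation $K_\nu(z) = \int_0^\infty e^{-z\cosh t}\cosh(\nu t)\,dt$ with the standard library alone. A sketch (for $\nu = 1/2$ the density reduces to the classical Laplace density, which gives a convenient check):

```python
import math

def bessel_k(nu, z, steps=6000, t_max=30.0):
    """K_nu(z), z > 0, via the integral representation
    K_nu(z) = integral_0^inf exp(-z*cosh(t)) * cosh(nu*t) dt  (trapezoid rule)."""
    h = t_max / steps
    total = 0.5 * math.exp(-z)   # t = 0 endpoint; the t_max endpoint underflows to 0
    for i in range(1, steps + 1):
        t = i * h
        total += math.exp(-z * math.cosh(t)) * math.cosh(nu * t)
    return total * h

def rowland_sichel_density(l, a, nu):
    """The density gamma(l) of L = log(X/Y) above, for l != 0."""
    z = math.sqrt(2 * a) * abs(l)
    coef = math.sqrt(a / math.pi) / (2 ** (nu - 0.5) * math.gamma(nu + 0.5))
    return coef * z ** nu * bessel_k(nu, z)
```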
The derived model fitted the data from various gold mines very well. The formal derivation of the quality control charts based on this model and a discussion of their implementation in mining practice can be found in Rowland and Sichel (1960).
10
Astronomy, biological and
environmental sciences
In this short chapter, miscellaneous applications of Laplace distributions are briefly surveyed. In the first section, we report that the Laplace distribution may in certain instances provide a better fit than the more complicated hyperbolic distribution. The central part of this chapter is devoted to an important application to the area of dose response curves by Uppuluri (1981), which unfortunately has not been investigated further due to the untimely death of the author.
10.1 Sizes of beans, sand particles, and diamonds
Laplace distributions and, more generally, hyperbolic distributions have been considered for modeling the sizes of diamonds, beans, and sand particles. Barndorff-Nielsen (1977) studied the distribution of the logarithm of particle size of wind blown sands. The distribution for which the logarithm of the density function is a hyperbola (or, in higher dimensions, a hyperboloid) is proposed as a model. This was the first occasion on which the class of hyperbolic distributions was introduced. It was also noted that the Laplace distribution is a limiting distribution under an appropriate passage to the limit of the corresponding parameters. For the Laplace distribution, the log-probability function is not a hyperbola but rather two straight half-lines attached at a single point.
The standard distribution in size statistics is the log-normal distribution. However, quite often mixtures of log-normal distributions seem to account better for the long tails of the observed data. Log-hyperbolic distributions (and in particular log-Laplace distributions) are mixtures of log-normal distributions, and both of them have asymptotically linear tails. These two features make them particularly suitable for modeling size data.
The class of one dimensional hyperbolic distributions introduced in Barndorff-Nielsen (1977) can be described in terms of the density
$$f(x; \phi, \gamma, \mu, \delta) = \frac{1}{(\phi^{-1} + \gamma^{-1})\,\delta\sqrt{\phi\gamma}\,K_1(\delta\sqrt{\phi\gamma})} \exp\!\left( -\frac{1}{2}(\phi + \gamma)\sqrt{\delta^2 + (x-\mu)^2} + \frac{1}{2}(\phi - \gamma)(x - \mu) \right).$$
In the limiting case ($\delta \to 0$) we obtain an asymmetric Laplace distribution, while a Gaussian distribution is obtained when $\delta \to \infty$ with $\delta/\phi \to \sigma^2$ [cf. Exercise 3.6.3].
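The Laplace limit is easy to see numerically from the exponent of the density: for small $\delta$, far from $\mu$ its slope tends to $+\phi$ on the left and $-\gamma$ on the right, i.e. the log-density becomes two straight half-lines. A sketch working with the unnormalized log-density:

```python
import math

def log_hyperbolic_kernel(x, phi, gam, mu, delta):
    """Logarithm of the unnormalized hyperbolic density:
    -(1/2)(phi+gam)*sqrt(delta^2 + (x-mu)^2) + (1/2)(phi-gam)*(x-mu)."""
    u = x - mu
    return -0.5 * (phi + gam) * math.hypot(delta, u) + 0.5 * (phi - gam) * u
```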
Hyperbolic distributions provided an excellent fit to the data on sand particles from the studies by Bagnold (1954), as well as to samples of sand from the Danish west coast. It was also suggested that this class of distributions can be applied in other contexts where size data are considered. As an example, the size distribution of diamonds from a large mining area in South West Africa was discussed in Sichel (1973). He noticed that "diamond sizes in the marine deposit of South West Africa are well represented by a two-parameter log-normal distribution provided the stones originate from a small compact mining block, on one and the same beach horizon." However, for larger mining areas deviations from the log-normal distribution are observed. Sichel (1973) introduced a mixture of log-normal distributions which in our terminology would be called generalized asymmetric log-Laplace distributions.
In Blaesild (1981), bivariate hyperbolic distributions are proposed to fit W. Johannsen's historical bivariate data on the length and breadth of beans. These now classical sets of two-dimensional data showing non-normal variations were fitted by a bivariate hyperbolic distribution, providing a reasonable agreement with the data. As the bivariate Laplace distributions constitute a subclass of hyperbolic distributions, it would be of interest to compare the Laplace fit to the more general but also more complicated hyperbolic fit. This was actually done in Fieller (1993), who studied the distribution of sizes of sand particles in relation to archaeological research. Fieller (1993) reported that
"attempts to fit the log-hyperbolic models of Barndorff-Nielsen (1977) proved computationally impossible. Instead, a simpler version, based on the log skew Laplace distribution, proved computationally tractable and most satisfyingly answered the questions quite conclusively."
Similar comments apply to many other investigations of fitting the hyperbolic distribution to empirical data. Barndorff-Nielsen and Blaesild (1982) apply the hyperbolic model to the following six data sets:
1. Grain sizes, aeolian sand deposits;
2. Grain sizes, river bed sediment;
3. Differences between logarithms of duplicate determinations of the content of gold per ore;
4. Differences of streamwise velocity components in a turbulent atmospheric field of large Reynolds numbers;
5. The lengths of beans whose breadths lie in a fixed interval;
6. Personal incomes in Australia, 1962-1963.
In four of these cases (data sets 1, 2, 4, and 6) the resulting distribution is close to the Laplace distribution (in the logarithmic scale we observe almost two straight half-lines instead of a hyperbola), while in the two other cases the data seem to be "more" Gaussian (parabolic log-probability function).
10.2 Pulses in long bright gamma-ray bursts
A somewhat unusual application of the asymmetric Laplace distribution was found in the modeling of the shapes of long bright gamma-ray bursts discussed by Norris et al. (1996). The paper examines the temporal profiles of bursts detected by the burst and transient source experiment in the Compton gamma ray observatory. The most frequently observed pulses are intermediate between asymmetric Laplace and asymmetric Gaussian. The general functional form of the pulse intensity is given by
$$I(t) = \begin{cases} A \exp\!\left(-\left(|t - t_{\max}|/\sigma_r\right)^{\nu}\right), & t < t_{\max},\\ A \exp\!\left(-\left(|t - t_{\max}|/\sigma_d\right)^{\nu}\right), & t > t_{\max}, \end{cases}$$
where $t_{\max}$ is the time of the pulse's maximum intensity $A$, $\sigma_r$ and $\sigma_d$ are the rise ($t < t_{\max}$) and decay ($t > t_{\max}$) time constants, respectively, and $\nu$ is a measure of peakedness. For $\nu = 1$ we obtain an asymmetric Laplace shape, and for $\nu = 2$ the corresponding shape can be described by an asymmetric Gaussian distribution.
The paper focuses on de-convolving the above shapes from the temporal data of the observed gamma-ray bursts. An interactive numerical routine is used to fit pulses in bursts. The most frequently occurring peakedness lies approximately halfway between the Gaussian and Laplacian distributions.
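The pulse profile is a two-line function; a sketch (for $\nu = 1$ the log-intensity decays linearly on each side, with slopes $1/\sigma_r$ and $1/\sigma_d$):

```python
import math

def pulse_intensity(t, A, t_max, sigma_r, sigma_d, nu):
    """Norris et al. pulse shape: Laplace-like for nu = 1, Gaussian-like for nu = 2."""
    sigma = sigma_r if t < t_max else sigma_d   # rise vs. decay time constant
    return A * math.exp(-(abs(t - t_max) / sigma) ** nu)
```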
10.3 Random fluctuations of response rate
In many behavioral systems, one can observe pulse-like responses that recur regularly in time with very low variation. The constant beating of the heart and the responses of the optic nerve of the horseshoe crab, limulus (which is famous for the long trains of action potentials produced when its visual receptor is subjected to a steady light), are just two of many examples observed in nature. These responses, although random, are quite periodic, and their fluctuations are not modeled well by a Poisson process.
McGill (1962) proposes a stochastic model for such responses which accommodates both periodic and random components. This model involves a mechanism which generates regularly spaced excitations which can initiate a response after a random delay. The excitations are not observed, but their periodicity is indirectly seen in a regular pattern of responses.
The general model is rather simple. Suppose that excitations occur at equal non-random time intervals of length τ. At excitation kτ, k = 1, 2, . . . , we have a positive random variable S_k which represents a random delay between the excitation at kτ and the response, which occurs at kτ + S_k. We assume that the S_k’s are i.i.d. random variables having exponential distribution with parameter λ. The goal is to find the distribution of the time between responses. In McGill (1962), this distribution is shown to have the form
f(t) = (λν/(1 − ν)) sinh(λt) for 0 ≤ t ≤ τ,
f(t) = ((1 + ν)/(2ν)) λe^{−λt} for t ≥ τ,
where ν is a constant given by ν = e^{−λτ}
. This distribution is skewed and has its mode at t = τ. Moreover, as λτ increases without bound, ν converges to zero and the distribution asymptotically becomes
f(t) = (λ/2) e^{−λ|t−τ|},
which is the symmetric Laplace distribution. This asymptotic distribution applies to the case when the random component (“noise”) is small relative to the periodic component represented by τ.
The fact that the Laplace distribution arises as the limiting distribution is not surprising. For large τ, the time between consecutive responses is approximately τ + S_2 − S_1, and the difference S_2 − S_1 of two i.i.d. exponential random variables has the symmetric Laplace distribution; the limit thus follows from the representation of the Laplace distribution as a difference of two exponential random variables. As noted by McGill (1962): “This simple point (that Laplace is a difference of exponentials) is ignored in most texts on statistics because, perhaps, no one imagines why anyone else would be interested. Our argument establishes a very good reason for being interested. The difference, and hence the Laplace distribution, provides a characterization of the error in a timing device that is under periodic excitation.”
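The difference-of-exponentials representation is easy to check by simulation; in the sketch below (seed, sample size, and rate λ = 2 are our illustrative choices), the empirical distribution of S_2 − S_1 is compared with the CDF of the symmetric Laplace law with density (λ/2)e^{−λ|d|}:

```python
import math
import random

# Monte Carlo check that S2 - S1, with S1, S2 i.i.d. exponential with rate lam,
# follows the symmetric Laplace law with density (lam/2) * exp(-lam * |d|).
random.seed(7)
lam = 2.0
n = 200_000
diffs = [random.expovariate(lam) - random.expovariate(lam) for _ in range(n)]

def laplace_cdf(d, lam):
    """CDF of the symmetric Laplace distribution with density (lam/2) e^{-lam|d|}."""
    if d < 0:
        return 0.5 * math.exp(lam * d)
    return 1.0 - 0.5 * math.exp(-lam * d)

emp = sum(1 for d in diffs if d <= 0.5) / n   # empirical CDF at d = 0.5
theo = laplace_cdf(0.5, lam)                  # theoretical CDF at d = 0.5
```

With this sample size the empirical and theoretical CDF values agree to about two decimal places, and the sample mean of the differences is close to zero, as symmetry requires.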
The model is then tested on two sets of real-life data: responses of a single fiber of the optic nerve of the horseshoe crab, and interresponse times produced by a bar-pressing rat after a long conditioning period. The data are more leptokurtic than the normal distribution, and the Laplace distribution fits the data quite well.
10.4 Modeling low dose responses
If a random variable Y has the Laplace distribution, then e^Y has the log-Laplace distribution. This distribution was considered in Uppuluri (1981) as a model in the study of the behavior of dose-response curves at low doses. One of the problems in this context is linearity versus nonlinearity of the dose response for radiation carcinogenesis. Since animal experiments can only be performed at reasonably high doses, the problem of extrapolation to low doses becomes viable only under a suitable mathematical model. The following axiomatic approach leads to the model given by the log-Laplace distribution.
Axiom 1 At small doses, the percent increase in the cumulative proportion of deaths is proportional to the percent increase in the dose.
Axiom 2 At larger doses, the percent increase in the cumulative proportion of survivors is proportional to the percent decrease in the dose.
Axiom 3 At zero dose there are no deaths; when the dose is infinite, there are no survivors; and the cumulative proportion of deaths F(x) is a monotonic, nondecreasing function of the dose x.
Under these axioms we obtain that the cumulative distribution function of the dose response has the form
F(x) = F(1) x^µ for 0 ≤ x ≤ 1, and 1 − F(x) = (1 − F(1)) x^{−λ} for x ≥ 1,
for some positive µ and λ.
The log-Laplace distribution corresponding to the classical Laplace distribution is obtained if we additionally assume that λ = µ and F(1) = 1/2. Of course, the log-Laplace distributions corresponding to asymmetric Laplace distributions are also included in the above model.
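As a sanity check of the log-Laplace case λ = µ, F(1) = 1/2, one can verify that the CDF below agrees with P(e^Y ≤ x) for a classical Laplace variable Y (the function name is ours; µ = 1 corresponds to the standard Laplace distribution with unit scale):

```python
import math

def log_laplace_cdf(x, mu):
    """Dose-response CDF under the axioms with lambda = mu and F(1) = 1/2
    (the log-Laplace case): F(x) = x^mu / 2 for 0 <= x <= 1,
    and F(x) = 1 - x^(-mu) / 2 for x >= 1."""
    if x <= 0:
        return 0.0
    if x <= 1.0:
        return 0.5 * x**mu
    return 1.0 - 0.5 * x**(-mu)

# For mu = 1 this is P(e^Y <= x) with Y standard classical Laplace:
# P(Y <= t) = e^t / 2 for t <= 0, so F(1/2) = 1/4 and F(2) = 3/4.
```

Note that the two branches meet continuously at x = 1, where both equal F(1) = 1/2.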
10.5 Multivariate elliptically contoured distributions for repeated measurements
Lindsey (1999) discusses the need for multivariate distributions other than the normal in the analysis of repeated measurements. The main deficiency of normal distributions is their inability to model heavier tails. As an alternative, Lindsey (1999) proposes multivariate exponential power distributions given by the density
f(y; µ, Σ, β) = [nΓ(n/2)] / [π^{n/2} √|Σ| Γ(1 + n/(2β)) 2^{1+n/(2β)}] exp{−(1/2)[(y − µ)^T Σ^{−1} (y − µ)]^β},
also known as the Kotz-type multivariate distribution [cf. Exercise 6.12.11 and Fang et al. (1990)].
For β = 1/2, this represents a certain generalization of the Laplace distribution. However, it is not a multivariate Laplace distribution as discussed in this book.
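A quick numerical check of the density in the univariate case n = 1 (the helper below is ours, written only for illustration): for β = 1 the formula reduces to the normal density, and for β = 1/2 to a Laplace density with scale 2σ.

```python
import math

def exp_power_pdf_1d(y, mu, sigma, beta):
    """n = 1 case of the exponential power density above:
    f(y) = Gamma(1/2) / (pi^(1/2) sigma Gamma(1 + 1/(2 beta)) 2^(1 + 1/(2 beta)))
           * exp(-(1/2) * ((y - mu)^2 / sigma^2)^beta)."""
    const = math.gamma(0.5) / (math.pi**0.5 * sigma
                               * math.gamma(1 + 1/(2*beta)) * 2**(1 + 1/(2*beta)))
    return const * math.exp(-0.5 * (((y - mu)**2) / sigma**2)**beta)

# beta = 1: N(mu, sigma^2) density; beta = 1/2: Laplace density (1/(4 sigma)) e^{-|y|/(2 sigma)}
normal_check = exp_power_pdf_1d(0.3, 0.0, 1.0, 1.0)
laplace_check = exp_power_pdf_1d(1.0, 0.0, 1.0, 0.5)
```

The β = 1/2 case makes the Laplace connection concrete in one dimension, even though, as noted above, the multivariate construction differs from the multivariate Laplace laws of this book.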
As an example, Lindsey (1999) considers blood sugar levels for two treatments of rabbits involving two neutral protamine Hagedorn insulin mixtures. The estimate of β (around 0.40) strongly suggests non-normality, the main reason being the heavy tails exhibited by the data.
This example illustrates the ability of the multivariate exponential power distribution to fit heavy-tailed data. However, as pointed out by Lindsey (1999), it has several unpleasant properties:
The marginal and conditional distributions are more complex elliptically contoured distributions, and not of the exponential power type;
It seems to be difficult to introduce independence between observations.
In view of this, it would be interesting to compare the exponential power distributions with the multivariate Laplace distributions discussed in this book. To quote the author of the discussed paper: “The fact that the multivariate normal distribution is rejected in favor of a more heavily tailed distribution for these data does not imply that this (multivariate exponential power) is the most appropriate distribution for them.”
10.6 ARMA models with Laplace noise in the environmental time series
An ARMA model with Laplace noise was used to fit data on sulphate concentration in Damsleth and El-Shaarawi (1989). The data consisted of 147 weekly measurements of the sulphate concentration in the Turkey Lakes Watershed in Ontario, Canada, from early March 1982 to the end of 1984. The data exhibit some extreme values, and thus there is reasonable doubt about normality of the underlying time series. A standard time series analysis of the data suggests that an AR(1) model may be appropriate. Thus the model considered is
X_t = φX_{t−1} + a_t,
where a_t is random noise. In classical time series theory, the model with Gaussian a_t is typically considered. The Laplace distribution is an alternative that is distinct from the normal distribution.
Computationally, the Laplace case is still straightforward, though sometimes cumbersome. The probability density function of X_t is given by
f(x) = (1/2) Σ_{i=0}^∞ α_i |φ|^{−i} e^{−|x|/|φ|^i},
where
α_i = (−1)^i [∏_{t=1}^i φ^{2t}/(1 − φ^{2t})] / ∏_{t=1}^∞ (1 − φ^{2t}).
The shape of this distribution exhibits “Laplacian features” (a peak and heavy tails) for φ close to zero, and “Gaussian features” for φ close to unity. It is interesting that this density has all derivatives at zero, provided φ ≠ 0. In Damsleth and El-Shaarawi (1989), the bivariate distribution of (X_t, X_{t−1}) is also computed in explicit form.
Both models (Gaussian and Laplacian) were fitted using the maximum likelihood method. The Laplace model fits the data better than the Gaussian one, both before and after logarithmic transformation of the data. Details are presented in the cited paper.
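A minimal simulation sketch of this AR(1) model with Laplace noise (seed, sample size, and φ = 0.5 are our illustrative choices; the noise is generated as a difference of two unit exponentials, using the representation of the Laplace law as a difference of exponentials):

```python
import random

# AR(1) driven by Laplace noise, X_t = phi * X_{t-1} + a_t, where a_t is the
# difference of two unit exponentials (hence standard Laplace, variance 2).
random.seed(11)
phi = 0.5
n = 100_000

def laplace_noise():
    return random.expovariate(1.0) - random.expovariate(1.0)

x, xs = 0.0, []
for _ in range(n):
    x = phi * x + laplace_noise()
    xs.append(x)

mean = sum(xs) / n
var = sum((v - mean)**2 for v in xs) / n   # should be near 2/(1 - phi**2)
```

The sample variance should be close to Var(a_t)/(1 − φ²) = 2/0.75 ≈ 2.67, the stationary variance of the AR(1) process.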
Appendix A
Bessel functions
The Bessel function of the first kind of order λ is given by the convergent series:
J_λ(z) = z^λ Σ_{k=0}^∞ (−1)^k z^{2k} / [2^{2k+λ} k! Γ(λ + k + 1)].  (A.0.1)
In particular,
J_0(z) = Σ_{k=0}^∞ (−1)^k z^{2k} / [2^{2k} (k!)^2] = (1/π) ∫_0^π cos(z cos θ) dθ  (A.0.2)
and
J_1(z) = Σ_{k=0}^∞ (−1)^k z^{2k+1} / [2^{2k+1} k! (k + 1)!] = (1/π) ∫_0^π cos(z sin θ − θ) dθ,  (A.0.3)
see, e.g., Abramowitz and Stegun (1965).
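The series (A.0.1) and the integral form (A.0.2) can be cross-checked numerically; in the sketch below (function names, truncation level, and step count are our choices), a partial sum of the series is compared with a midpoint-rule evaluation of the integral:

```python
import math

def bessel_J_series(lmbda, z, terms=40):
    """Partial sum of the series (A.0.1) for the Bessel function J_lambda."""
    s = 0.0
    for k in range(terms):
        s += (-1)**k * z**(2*k) / (2**(2*k + lmbda) * math.factorial(k)
                                   * math.gamma(lmbda + k + 1))
    return z**lmbda * s

def bessel_J0_integral(z, steps=20000):
    """Midpoint rule for the representation (A.0.2):
    J_0(z) = (1/pi) * integral over (0, pi) of cos(z cos(theta))."""
    h = math.pi / steps
    return sum(math.cos(z * math.cos((i + 0.5) * h)) for i in range(steps)) * h / math.pi
```

The two evaluations of J_0 agree to roughly the accuracy of the quadrature rule, and the series correctly gives J_0(0) = 1.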
Below we collect some results for the modified Bessel function of the third kind with index λ ∈ ℝ, denoted K_λ(·). We refer the reader to Abramowitz and Stegun (1965), Olver (1974), and Watson (1962) for definitions and further properties of these and related special functions.
There are many integral representations of K_λ(u) in the literature. The following representations are relevant to our work. The first can be found in Watson (1962, p. 183), the second appears in Abramowitz and Stegun (1965, p. 376), while the third is given in Olver (1974).

Figure A.1: Graphs of Bessel functions. Left: J_0 (starting at one) and J_1 (starting at the origin); Right: K_0 (the lowest), K_{1/2}, and K_1 (the highest).
K_λ(u) = (1/2) (u/2)^λ ∫_0^∞ t^{−λ−1} exp(−t − u²/(4t)) dt,  u > 0.  (A.0.4)
K_λ(u) = [(u/2)^λ Γ(1/2)/Γ(λ + 1/2)] ∫_1^∞ e^{−ut} (t² − 1)^{λ−1/2} dt,  λ > −1/2.  (A.0.5)
K_λ(u) = ∫_0^∞ e^{−u cosh t} cosh(λt) dt,  λ ∈ ℝ.  (A.0.6)
Property 1 The Bessel function K_λ(u) is a continuous and positive function of λ ≥ 0 and u > 0.
Property 2 If λ ≥ 0 is fixed, then throughout the u interval (0, ∞), the function K_λ(u) is positive and decreasing.
Property 3 If u > 0 is fixed, then throughout the λ interval (0, ∞), the function K_λ(u) is positive and increasing.
Property 4 For any λ ≥ 0 and u > 0, the Bessel function K_λ satisfies the relations
K_λ(u) = K_{−λ}(u),  (A.0.7)
K_{λ+1}(u) = (2λ/u) K_λ(u) + K_{λ−1}(u),  (A.0.8)
K_{λ−1}(u) + K_{λ+1}(u) = −2K′_λ(u).  (A.0.9)
Property 5 For λ = r + 1/2, where r is a non-negative integer, the Bessel function K_λ has the closed form
K_{r+1/2}(u) = √(π/(2u)) e^{−u} Σ_{k=0}^{r} [(r + k)! / ((r − k)! k!)] (2u)^{−k}.  (A.0.10)
In particular, for r = 0, we obtain
K_{1/2}(u) = √(π/(2u)) e^{−u}.  (A.0.11)
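The closed form (A.0.10) can be checked against the integral representation (A.0.6); the sketch below (function names, truncation point, and step count are our choices) compares the two at λ = 1/2 and λ = 3/2:

```python
import math

def bessel_K_integral(lmbda, u, upper=10.0, steps=20000):
    """Midpoint rule for (A.0.6): K_lambda(u) = integral over (0, inf) of
    exp(-u cosh t) cosh(lambda t).  The integrand decays like exp(-u e^t / 2),
    so truncating at t = 10 is harmless for the values of u used here."""
    h = upper / steps
    return sum(math.exp(-u * math.cosh((i + 0.5) * h)) * math.cosh(lmbda * (i + 0.5) * h)
               for i in range(steps)) * h

def bessel_K_half(u):
    """Closed form (A.0.11): K_{1/2}(u) = sqrt(pi/(2u)) * exp(-u)."""
    return math.sqrt(math.pi / (2 * u)) * math.exp(-u)

# Property 5 with r = 1 gives K_{3/2}(u) = sqrt(pi/(2u)) e^{-u} (1 + 1/u),
# so at u = 1 we expect K_{3/2}(1) = 2 * K_{1/2}(1).
```

At u = 1 the quadrature and the closed forms agree to several decimal places.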
Property 6 If λ is fixed, then, as x → 0+,
K_λ(x) ∼ Γ(λ) 2^{λ−1} x^{−λ} (λ > 0),  K_0(x) ∼ log(1/x).  (A.0.12)
Property 7 For any a > 0 and µ, λ such that µ + 1 ± λ > 0, we have
∫_0^∞ x^µ K_λ(ax) dx = 2^{µ−1} a^{−µ−1} Γ((1 + µ + λ)/2) Γ((1 + µ − λ)/2).  (A.0.13)
[See Gradshteyn and Ryzhik (1980).]
Property 8 For any µ > 0 and βu > 0 we have
∫_u^∞ x^{µ−1} (x − u)^{µ−1} e^{−βx} dx = (Γ(µ)/√π) (u/β)^{µ−1/2} e^{−βu/2} K_{µ−1/2}(βu/2).  (A.0.14)
[See Gradshteyn and Ryzhik (1980).]
Property 9 For any ν > 0 we have
[x^ν K_ν(x)]′ = −x^ν K_{ν−1}(x).
[See Olver (1974), (8.05), p. 251, and (10.05), p. 60.]
Property 10 For any ν > 0 we have
K_{−ν}(x) = K_ν(x).
[See Olver (1974), (8.05), p. 251.]
Consider the function
R_λ(x) = K_{λ+1}(x)/K_λ(x).  (A.0.15)
The function R_λ has a number of important properties.
Property 11 For λ ≥ 0 the function R_λ(x) is strictly decreasing in x, with lim_{x→∞} R_λ(x) = 1 and lim_{x→0+} R_λ(x) = ∞.
Property 12 Property 4 of Bessel functions produces the recursive relation
R_λ(x) = 2λ/x + 1/R_{λ−1}(x).  (A.0.16)
Property 13 Property 4 of Bessel functions produces the following expression for the derivative of R_λ:
(d/dx) R_λ(x) = R_λ^2(x) − ((2λ + 1)/x) R_λ(x) − 1.  (A.0.17)
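Properties 11 and 12 are easy to verify numerically. In the sketch below (function names and the evaluation points are ours), K_λ is computed from the integral representation (A.0.6) by the midpoint rule, and the recursion (A.0.16) is checked at λ = 1, x = 2:

```python
import math

def bessel_K(lmbda, u, upper=10.0, steps=20000):
    """Midpoint rule for the representation (A.0.6)."""
    h = upper / steps
    return sum(math.exp(-u * math.cosh((i + 0.5) * h)) * math.cosh(lmbda * (i + 0.5) * h)
               for i in range(steps)) * h

def R(lmbda, x):
    """R_lambda(x) = K_{lambda+1}(x) / K_lambda(x), as in (A.0.15)."""
    return bessel_K(lmbda + 1, x) / bessel_K(lmbda, x)

# Recursion (A.0.16) at lambda = 1, x = 2: R_1(x) = 2/x + 1/R_0(x).
x = 2.0
lhs = R(1.0, x)
rhs = 2 * 1.0 / x + 1.0 / R(0.0, x)
```

One can also observe the monotonicity of Property 11: R_1(1) > R_1(2) > 1, with R_λ(x) approaching 1 as x grows.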
See Jorgensen (1982) for these and other properties of the function R_λ (and of Bessel functions).
References
[1] Abramowitz, M. and Stegun, I.A. (1965). Handbook of Mathematical Functions, Dover Publications, New York.
[2] Adams, Jr., W.C. and Giesler, C.E. (1978). Quantizing characteristics for signals having Laplacian amplitude probability density function, IEEE Trans. Comm. COM-26, 1295-1297.
[3] Agrò, G. (1995). Maximum likelihood estimation for the exponential power function parameters, Comm. Statist. Simulation Comput. 24(2), 523-536.
[4] Ahsanullah, M. and Rahim, M.A. (1973). Simplified estimates of the parameters of the double-exponential distribution based on optimum order statistics from a middle-censored sample, Naval Res. Logist. Quarterly 20, 745-751.
[5] Akahira, M. (1986). On the loss of information for the estimators in the double exponential case, Reported at the Symposium on Inference Based on Incomplete Information with Applications, University of Tsukuba, Kyushu University.
[6] Akahira, M. (1987). Second order asymptotic comparison of estimators of a common parameter in the double exponential case, Ann. Inst. Statist. Math. 39, 25-36.
[7] Akahira, M. (1990). Second order asymptotic comparison of the discretized likelihood estimator with asymptotically efficient estimators in the double exponential case, Metron 48, 5-17.
[8] Akahira, M. and Takeuchi, K. (1990). Loss of information associated with the order statistics and related estimators in the double exponential distribution case, Austral. J. Statist. 32, 281-291.
[9] Akahira, M. and Takeuchi, K. (1993). Second order asymptotic bound for the variance of estimators for the double exponential distribution, in Statistical Science & Data Analysis (eds., K. Matusita et al.), pp. 375-382, VSP Publishers, Amsterdam.
[10] Alamatsaz, M.H. (1993). On characterizations of exponential and gamma distributions, Statist. Probab. Lett. 17, 315-319.
[11] Ali, M.M., Umbach, D. and Hassanein, K.M. (1981). Estimation of quantiles of exponential and double exponential distributions based on two order statistics, Comm. Statist. Theory Methods A10(19), 1921-1932.
[12] Anderson, D.N. (1992). A multivariate Linnik distribution, Statist. Probab. Lett. 14, 333-336.
[13] Anderson, D.N. and Arnold, B.C. (1993). Linnik distributions and processes, J. Appl. Probab. 30, 330-340.
[14] Anderson, E.W. and Ellis, D.M. (1971). Error distributions in navigation, J. Inst. of Navigation 24, 429-442.
[15] Andrews, D.F., Bickel, P.J., Hampel, F.R., Huber, P.J., Rogers, W.H. and Tukey, J.W. (1972). Robust Estimates of Location, Princeton University Press, Princeton.
[16] Antle, C.E. and Bain, L.J. (1969). A property of maximum likelihood estimators of location and scale parameters, SIAM Rev. 11(2), 251-253.
[17] Arnold, B.C. (1973). Some characterizations of the exponential distribution by geometric compounding, SIAM J. Appl. Math. 24(2), 242-244.
[18] Asrabadi, B.R. (1985). The exact confidence interval for the scale parameter and the MVUE of the Laplace distribution, Commun. Statist. Theory Methods 14(3), 713-733.
[19] Atkinson, A.C. (1982). The simulation of generalized inverse Gaussian and hyperbolic random variables, SIAM J. Sci. Statist. Comput. 3(4), 502-515.
[20] Azzalini, A. (1985). A class of distributions that includes the normal ones, Scand. J. Statist. 12, 171-178.
[21] Azzalini, A. (1986). Further results on a class of distributions which includes the normal ones, Statistica 46(2), 199-208.
[22] Azzalini, A. and Capitanio, A. (1999). Statistical applications of the multivariate skew normal distribution, J. Roy. Statist. Soc. Ser. B 61(3), 579-602.
[23] Azzalini, A. and Dalla Valle, A. (1996). The multivariate skew-normal distribution, Biometrika 83(4), 715-726.
[24] Bagchi, U., Hayya, J.C. and Ord, J.K. (1983). The Hermite distribution as a model of demand during lead time for slow-moving items, Decision Sciences 14, 447-466.
[25] Bagnold, R.A. (1954). The Physics of Blown Sand and Desert Dunes, Methuen, London.
[26] Bain, L.J. and Engelhardt, M. (1973). Interval estimation for the two-parameter double-exponential distribution, Technometrics 15(4), 875-887.
[27] Balakrishnan, N. (1988). Recurrence relations among moments of order statistics from two related outlier models, Biometrical J. 30, 741-746.
[28] Balakrishnan, N. (1989). Recurrence relations among moments of order statistics from two related sets of independent and not-identically distributed random variables, Ann. Inst. Statist. Math. 41, 323-329.
[29] Balakrishnan, N. and Ambagaspitiya, R.S. (1988). Relationships among moments of order statistics in samples from two related outlier models and some applications, Comm. Statist. Theory Methods 17(7), 2327-2341.
[30] Balakrishnan, N. and Ambagaspitiya, R.S. (1994). On skewed-Laplace distributions, Report, McMaster University, Hamilton, Ontario, Canada.
[31] Balakrishnan, N. and Chandramouleeswaran, M.P. (1994a). Reliability estimation and tolerance limits for Laplace distribution based on censored samples, Report, McMaster University, Hamilton, Ontario, Canada.
[32] Balakrishnan, N. and Chandramouleeswaran, M.P. (1994b). Prediction intervals for Laplace distribution based on censored samples, Report, McMaster University, Hamilton, Ontario, Canada.
[33] Balakrishnan, N., Chandramouleeswaran, M.P. and Ambagaspitiya, R.S. (1994). BLUE’s of location and scale parameters of Laplace distribution based on Type-II censored samples and associated inference, Report, McMaster University, Hamilton, Ontario, Canada.
[34] Balakrishnan, N., Chandramouleeswaran, M.P. and Govindarajulu, Z. (1994). Inference on parameters of the Laplace distribution based on Type-II censored samples using Edgeworth approximation, Report, McMaster University, Hamilton, Ontario, Canada.
[35] Balakrishnan, N. and Cohen, A.C. (1991). Order Statistics and Inference: Estimation Methods, Academic Press, San Diego.
[36] Balakrishnan, N. and Cutler, C.D. (1994). Maximum likelihood estimation of the Laplace parameters based on Type-II censored samples, in H.A. David Festschrift Volume (eds., D.F. Morrison, H.N. Nagaraja, and P.K. Sen), pp. 145-151, Springer-Verlag, New York.
[37] Balakrishnan, N., Govindarajulu, Z. and Balasubramanian, K. (1993). Relationships between moments of two related sets of order statistics and some extensions, Ann. Inst. Statist. Math. 45, 243-247.
[38] Balakrishnan, N. and Kocherlakota, S. (1986). Effects of nonnormality on X charts: Single assignable cause model, Sankhyā Ser. B 48, 439-444.
[39] Balanda, K.P. (1987). Kurtosis comparisons of the Cauchy and double exponential distributions, Comm. Statist. Theory Methods 16(2), 579-59.
[40] Baringhaus, L. and Grübel, R. (1997). On a class of characterization problems for random convex combinations, Ann. Inst. Statist. Math. 49(3), 555-567.
[41] Barndorff-Nielsen, O.E. (1977). Exponentially decreasing distributions for the logarithm of particle size, Proc. Roy. Soc. London Ser. A 353, 401-419.
[42] Barndorff-Nielsen, O.E. (1979). Models for non-Gaussian variation, with application to turbulence, Proc. Roy. Soc. London Ser. A 368, 501-520.
[43] Barndorff-Nielsen, O.E. (1996). Processes of Normal Inverse Gaussian Type, Research Report 339, Dept. Theor. Statist., Aarhus University.
[44] Barndorff-Nielsen, O.E. (1997). Normal inverse Gaussian distributions and stochastic volatility modelling, Scand. J. Statist. 24, 1-13.
[45] Barndorff-Nielsen, O.E. and Blaesild, P. (1981). Hyperbolic distributions and ramifications: contributions to theory and application, in Statistical Distributions in Scientific Work, Vol. 4 (eds., C. Taillie et al.), pp. 19-44, Reidel, Dordrecht.
[46] Barndorff-Nielsen, O.E. and Blaesild, P. (1982). Hyperbolic distributions, in The Encyclopedia of Statistical Sciences, Vol. 3 (eds., S. Kotz and N.L. Johnson), pp. 700-707, Wiley, New York.
[47] Belinskiy, B.P. and Kozubowski, T.J. (2000). Exponential mixture representation of geometric stable densities, J. Math. Anal. Appl. 246, 465-479.
[48] Bernstein, S.L., Burrows, M.L., Evans, J.E., Griffiths, A.S., McNeill, D.A., Niessen, C.W., Richer, I., White, D.P. and Willim, D.K. (1974). Long-range communications at extremely low frequencies, Proc. IEEE 62, 292-312.
[49] Bertoin, J. (1996). Lévy Processes, University Press, Cambridge.
[50] Bhattacharyya, B.C. (1942). The use of McKay’s Bessel function curves for graduating frequency distributions, Sankhyā 6, 175-182.
[51] Billingsley, P. (1968). Convergence of Probability Measures, Wiley, New York.
[52] Biswas, S. and Sehgal, V.K. (1991). Topics in Statistical Methodology, Wiley, New York.
[53] Black, C.M., Durham, S.D., Lynch, J.D. and Padgett, W.J. (1989). A new probability distribution for the strength of brittle fibers, Fiber-Tex 1989, The Third Conference on Advanced Engineering Fibers and Textile Structures for Composites, NASA Conference Publication 3082, 363-374.
[54] Blaesild, P. (1981). The two-dimensional hyperbolic distribution and related distributions, with an application to Johannsen’s bean data, Biometrika 68, 251-63.
[55] Bondesson, L. (1992). Generalized Gamma Convolutions and Related Classes of Distributions and Densities, Springer, New York.
[56] Bouzar, N. (1999). On geometric stability and Poisson mixtures, Illinois J. Math. 43(3), 520-527.
[57] Box, G.E.P. and Tiao, G.C. (1962). A further look at robustness via Bayes’s theorem, Biometrika 49, 419-432.
[58] Breiman, L. (1993). Probability (2nd ed.), SIAM, Philadelphia.
[59] Brunk, H.D. (1955). Maximum likelihood estimation of monotone parameters, Ann. Math. Statist. 26, 607-616.
[60] Brunk, H.D. (1965). Conditional expectation given a σ-lattice and applications, Ann. Math. Statist. 36, 1339-1350.
[61] Buczolich, Z. and Székely, G. (1989). When is a weighted average of ordered sample elements a maximum likelihood estimator of the location parameter? Adv. Appl. Math. 10, 439-456.
[62] Bunge, J. (1993). Some stability classes for random numbers of random vectors, Comm. Statist. Stochastic Models 9, 247-254.
[63] Bunge, J. (1996). Composition semigroups and random stability, Ann. Probab. 24(3), 1476-1489.
[64] Carlton, A.G. (1946). Estimating the parameters of a rectangular distribution, Ann. Math. Statist. 17, 355-358.
[65] Chan, L.K. (1970). Linear estimation of the location and scale parameters of the Cauchy distribution based on sample quantiles, J. Amer. Statist. Assoc. 65, 851-859.
[66] Chan, L.K. and Chan, N.N. (1969). Estimates of the parameters of the double exponential distribution based on selected order statistics, Bulletin of the Institute of Statistical Research and Training 3, 21-40.
[67] Cheng, S.W. (1978). Linear quantile estimation of parameters of the double exponential distribution, Soochow J. Math. 4, 39-49.
[68] Chew, V. (1968). Some useful alternatives to the normal distribution, Amer. Statist. 22(3), 22-24.
[69] Childs, A. and Balakrishnan, N. (1996). Conditional inference procedures for the Laplace distribution based on type-II right censored samples, Statist. Probab. Lett. 31(1), 31-39.
[70] Childs, A. and Balakrishnan, N. (1997a). Some extensions in the robust estimation of parameters of exponential and double exponential distributions in the presence of multiple outliers, in Handbook of Statistics - Robust Methods, Vol. 15 (eds., C.R. Rao and G.S. Maddala), pp. 201-235, North-Holland, Amsterdam.
[71] Childs, A. and Balakrishnan, N. (1997b). Maximum likelihood estimation of Laplace parameters based on general type-II censored examples, Statist. Papers 38(3), 343-348.
[72] Chipman, J.S. (1985). Theory and measurement of income distribution, in Advances in Econometrics, Vol. 4 (eds., R.L. Basmann and G.F. Rhodes, Jr.), pp. 135-165, JAI Press, Greenwich.
[73] Christoph, G. and Schreiber, K. (1998a). Discrete stable random variables, Statist. Probab. Lett. 37, 243-247.
[74] Christoph, G. and Schreiber, K. (1998b). The generalized discrete Linnik distributions, in Advances in Stochastic Models for Reliability, Quality, and Safety (eds., W. Kahle et al.), pp. 3-18, Birkhauser-Verlag, Boston.
[75] Christoph, G. and Schreiber, K. (1998c). Positive Linnik and discrete Linnik distributions, in Asymptotic Methods in Probability and Statistics with Applications, Proceedings of the Conference in St-Petersburg, June 1998, Birkhauser-Verlag, Boston.
[76] Chu, J.T. and Hotelling, H. (1955). The moments of the sample median, Ann. Math. Statist. 26, 593-606.
[77] Christensen, R. (2000). Unpublished Notes, Department of Statistics, University of New Mexico, Albuquerque, NM.
[78] Cifarelli, D.M. and Regazzini, E. (1976). On a characterization of a class of distributions based on efficiency of certain estimator, Rend. Sem. Mat. Univers. Politecn. Torino (1974-1975) 33, 299-311 (in Italian).
[79] Clark, P.K. (1973). A subordinated stochastic process model with finite variance for speculative prices, Econometrica 41, 135-155.
[80] Conover, W.J., Wehmanen, O. and Ramsey, F.L. (1978). A note on the small-sample power functions for nonparametric tests of location in the double exponential family, J. Amer. Statist. Assoc. 73, 188-190.
[81] Corless, R.M., Gonnet, G.H., Hare, D.E.G., Jeffrey, D.J. and Knuth, D.E. (1996). On the Lambert W function, Adv. Comput. Math. 5, 329-359.
[82] Craig, C.C. (1932). On the distributions of certain statistics, Amer. J. Math. 54, 353-366.
[83] Craig, C.C. (1936). On the frequency function of xy, Ann. Math. Statist. 7, 1-15.
[84] Crum, W.L. (1923). The use of the median in determining seasonal variation, J. Amer. Statist. Assoc. 18, 607-614.
[85] Dadi, M.I. and Marks, R.J., II (1987). Detector relative efficiencies in the presence of Laplace noise, IEEE Trans. Aerospace Electron. Systems 23, 568-582.
[86] Damsleth, E. and El-Shaarawi, A.H. (1989). ARMA models with double-exponentially distributed noise, J. Roy. Statist. Soc. Ser. B 51(1), 61-69.
[87] David, H.A. (1981). Order Statistics (2nd ed.), Wiley, New York.
[88] Devroye, L. (1986). Non-Uniform Random Variate Generation, Springer-Verlag, New York.
[89] Devroye, L. (1990). A note on Linnik distribution, Statist. Probab. Lett. 9, 305-306.
[90] Devroye, L. (1993). A triptych of discrete distributions related to the stable law, Statist. Probab. Lett. 18, 349-351.
[91] Dewald, L.S. and Lewis, P.A.W. (1985). A new Laplace second-order autoregressive time series model-NLAR(2), IEEE Trans. Inform. Theory 31(5), 645-651.
[92] Dharmadhikari, S. and Joag-Dev, K. (1988). Unimodality, Convexity, and Applications, Academic Press, San Diego.
[93] Divanji, G. (1988). On Semi-α-Laplace distributions, J. Indian Statist. Assoc. 26, 31-38.
[94] Dixon, W.J. (1957). Estimates of the mean and standard deviation of a normal population, Ann. Math. Statist. 28, 806-809.
[95] Dixon, W.J. (1960). Simplified estimation from censored normal samples, Ann. Math. Statist. 31, 385-391.
[96] Dreier, I. (1999). Inequalities for real characteristic functions and their moments, Ph.D. Dissertation, Technical University of Dresden, Germany (in German).
[97] Dugué, D. (1951). Sur certains exemples de décompositions en arithmétique des lois de probabilité, Ann. Inst. H. Poincaré 12, 159-169.
[98] Duttweiler, D.L. and Messerschmitt, D.G. (1976). Nearly instantaneous companding for nonuniformly quantized PCM, IEEE Trans. Comm. COM-24(8), 864-873.
[99] Easterling, R.G. (1978). Exponential responses with double exponential measurement error - A model for steam generator inspection, Proceedings of the DOE Statistical Symposium, US Department of Energy, pp. 90-110.
[100] Eberlein, E. and Keller, U. (1995). Hyperbolic distributions in finance, Bernoulli 1(3), 281-299.
[101] Edwards, A.W.F. (1974). Letter to the Editor, Technometrics 16(4), 641-642.
[102] Edwards, L. (1948). The use of normal significance limits when the parent population is of Laplace form, J. Institute of Actuaries Students’ Society 8, 87-99.
[103] Efron, B. (1986). Double exponential families and their use in generalized linear regression, J. Amer. Statist. Assoc. 81, 709-721.
[104] Eisenhart, Ch. (1983). Laws of Error I-III, in Encyclopedia of Statistical Sciences, Vol. 4 (eds., S. Kotz et al.), pp. 530-562, Wiley, New York.
[105] Elderton, W.P. (1938). Frequency Curves and Correlation, Cambridge University Press, Cambridge.
[106] Epstein, B. (1947). Application of the theory of extreme values in fracture problems, J. Amer. Statist. Assoc. 43, 403-412.
[107] Epstein, B. (1948). Statistical aspects of fracture problems, J. Appl. Physics 19, 140-147.
[108] Erdogan, M.B. (1995). Analytic and asymptotic properties of non-symmetric Linnik’s probability densities, PhD Thesis, Bilkent University, Ankara; appeared in J. Fourier Anal. Appl. 5(6), 523-544, 1999.
[109] Erdogan, M.B. and Ostrovskii, I.V. (1997). Non-symmetric Linnik distributions, C. R. Acad. Sci. Paris t. 325, Série I, 511-516.
[110] Erdogan, M.B. and Ostrovskii, I.V. (1998a). On mixture representation of the Linnik density, J. Austral. Math. Soc. Ser. A 64, 317-326.
[111] Erdogan, M.B. and Ostrovskii, I.V. (1998b). Analytic and asymptotic properties of generalized Linnik probability densities, J. Math. Anal. Appl. 217, 555-579.
[112] Ernst, M.D. (1998). A multivariate generalized Laplace distribution, Comput. Statist. 13, 227-232.
[113] Fang, K.T., Kotz, S. and Ng, K.W. (1990). Symmetric Multivariate and Related Distributions, Monographs on Statistics and Probability 36, Chapman-Hall, London.
[114] Farebrother, R.W. (1986). Pitman’s measure of closeness, Amer. Statist. 40, 179-180.
[115] Farison, J.B. (1965). On calculating moments for some common probability laws, IEEE Trans. Inform. Theory 11(4), 586-589.
[116] Fechner, G.T. (1897). Kollektivmasslehre, W. Engelmann, Leipzig.
[117] Feller, W. (1971). Introduction to the Theory of Probability and its Applications, Vol. 2 (2nd ed.), Wiley, New York.
[118] Ferguson, T.S. and Klass, M.J. (1972). A representation of independent increment processes without Gaussian components, Ann. Math. Statist. 43, 1634-1643.
[119] Fernández, C., Osiewalski, J. and Steel, M.F.J. (1995). Modeling and inference with v-distributions, J. Amer. Statist. Assoc. 90, 1331-1340.
[120] Fernández, C. and Steel, M.F.J. (1998). On Bayesian modeling of fat tails and skewness, J. Amer. Statist. Assoc. 93, 359-371.
[121] Fieller, N.R.J. (1993). Archaeostatistics: Old statistics in ancient contexts, Statistician 42, 279-295.
[122] Findeisen, P. (1982). Characterization of the bilateral exponential distribution, Metrika 29, 95-102 (in German).
[123] Fisher, R.A. (1922). On the mathematical foundations of theoretical statistics, Philos. Trans. Roy. Soc. London Ser. A 222, 309-368.
[124] Fisher, R.A. (1925). Theory of statistical estimation, Proc. Camb. Philos. Soc. 22, 700-725.
[125] Fisher, R.A. (1934). Two new properties of mathematical likelihood, Proc. Roy. Soc. London Ser. A 147, 285-307.
[126] Fujita, Y. (1993). A generalization of the results of Pillai, Ann. Inst. Statist. Math. 45(2), 361-365.
[127] Galambos, J. and Kotz, S. (1978). Characterizations of Probability Distributions. A Unified Approach with an Emphasis on Exponential and Related Models, Lecture Notes in Math. 675, Springer, Berlin.
[128] Gallo, F. (1979). On the Laplace first law: Sample distribution of the sum of values and the sum of absolute values of the errors, distribution of the related T, Statistica 39, 443-454.
[129] Geary, R.C. (1935). The ratio of the mean deviation to the standard deviation as a test of normality, Biometrika 27, 310-332.
[130] Geman, H., Madan, D.B. and Yor, M. (2000a). Asset prices are Brownian motion: Only in business time, in Quantitative Analysis in Financial Markets, Volume II (ed., Marco Avellaneda), World Scientific Publishing Company, in press.
[131] Geman, H., Madan, D.B. and Yor, M. (2000 b). Time changes for
evy processes, Mathematical Finance, in press.
[132] George, E.O. and Mudholkar, G.S. (1981). A characterization of the
logistic distribution by a sample median, Ann. Inst. Statist. Math. 33,
Part A, 125-129.
[133] George, E.O. and Rousseau, C.C. (1987). On the logistic midrange,
Ann. Inst. Statist. Math. 39, Part A, 627-635.
[134] George, Sabu and Pillai, R.N. (1988). A characterization of spherical
distributions and some applications of Meijer’s G-function, Proceedings
of the Symposium on Special Functions and Problem-oriented Research
(Trivandrum, 1988), pp. 61-71, Publication 14, Centre Math. Sci.,
Trivandrum, Kerala, India.
[135] Gnedenko, B.V. (1970). Limit theorems for sums of random number
of positive independent random variables, Proc. 6th Berkeley Symp.
on Math. Statist. Probab., Vol. 2, pp. 537-549.
[136] Gnedenko, B.V. and Janjic, S. (1983). A characteristic property of
one class of limit distributions, Math. Nachr. 113, 145-149.
[137] Gnedenko, B.V. and Korolev, V.Yu. (1996). Random Summation:
Limit Theorems and Applications, CRC Press, Boca Raton.
[138] Gokhale, D.V. (1975). Maximum entropy characterizations of some
distributions, in Statistical Distributions in Scientific Work, Vol. 3
(eds., G.P. Patil, S. Kotz and J.K. Ord), pp. 299-305, Reidel, Boston.
[139] Goldfeld, S.M. and Quandt, R.E. (1981). Econometric modelling with
non-normal disturbances, J. Econometrics 17, 141-155.
[140] González, F.J., Puig-Pey, J. and Luceño, A. (1999). Analytical ex-
pressions for the average adjustment interval and mean squared devi-
ation for bounded adjustment schemes, Commun. Statist. Simulation
Comput. 28, 623-635.
[141] Govindarajulu, Z. (1963). Relationships among moments of order
statistics in samples from two related populations, Technometrics 5,
514-518.
[142] Govindarajulu, Z. (1966). Best linear estimates under symmetric cen-
soring of the parameters of a double exponential population, J. Amer.
Statist. Assoc. 61, 248-258. (Correction: 71, 255.)
[143] Gradshteyn, I.S. and Ryzhik, I.M. (1980). Tables of Integrals, Series,
and Products, Academic Press, New York.
[144] Greenwood, J.A., Olkin, I. and Savage, I.R. (1962). Index to Annals of
Mathematical Statistics, Volumes 1-31, 1930-1960, University of Min-
nesota, Minneapolis; St. Paul: North Central Publishing.
[145] Grice, J.V., Bain, L.J. and Engelhardt, M. (1978). Comparison of
conditional and unconditional confidence intervals for the double ex-
ponential distribution, Comm. Statist. Simulation Comput. 7(5), 515-
524.
[146] Gumbel, E.J. (1944). Ranges and midranges, Ann. Math. Statist. 15,
414-422.
[147] Hájek, J. (1969). Nonparametric Statistics, Holden-Day, San Fran-
cisco.
[148] Hald, A. (1995). History of Mathematical Statistics 1750-1930, Wiley,
New York.
[149] Hall, D.L. and Joiner, B.L. (1983). Asymptotic relative efficiencies
of R-estimators of location, Comm. Statist. Theory Methods 12(7),
739-763.
[150] Haro-López, R.A. and Smith, A.F.M. (1999). On robust Bayesian
analysis for location and scale parameters, J. Multivariate Anal. 70,
30-56.
[151] Harris, B. (1966). Theory of Probability, Addison-Wesley, Reading.
[152] Harter, H.L., Moore, A.H. and Curry, T.F. (1979). Adaptive robust
estimation of location and scale parameters of symmetric populations,
Comm. Statist. Theory Methods A8(15), 1473-1491.
[153] Hartley, M.J. and Revankar, N.S. (1974). On the estimation of the
Pareto law from underreported data, J. Econometrics 2, 327-341.
[154] Harville, D.A. (1997). Matrix Algebra From a Statistician’s Perspec-
tive, Springer, New York.
[155] Hausdorff, F. (1901). Beiträge zur Wahrscheinlichkeitsrechnung,
Verhandlungen der Königlich Sächsischen Gesellschaft der Wis-
senschaften, Leipzig, Mathematisch-Physische Classe 53, 152-178.
[156] Hayfavi, A. (1998). An improper integral representation of Linnik’s
probability densities, Tr. J. Math. 22, 235-242.
[157] Heathcote, C.R., Rachev, S.T. and Cheng, B. (1995). Testing multi-
variate symmetry, J. Multivariate Anal. 54, 91-112.
[158] Henze, N. (1986). A probabilistic representation of the skew-normal
distribution, Scand. J. Statist. 13, 271-275.
[159] Hettmansperger, T.P. and Keenan, M.A. (1975). Tailweight, statisti-
cal inference and families of distributions - a brief survey, in Statistical
Distributions in Scientific Work, Vol. 1 (eds., G.P. Patil, S. Kotz and
J.K. Ord), pp. 161-172, Reidel, Dordrecht.
[160] Hinkley, D.V. and Revankar, N.S. (1977). Estimation of the Pareto
law from underreported data, J. Econometrics 5, 1-11.
[161] Hirschberg, J.G., Molina, D.J. and Slottje, D.J. (1989). A selection
criterion for choosing between functional forms of income, Econometric
Rev. 7(2), 183-197.
[162] Hoaglin, D.C., Mosteller, F. and Tukey, J.W. (eds.) (1983). Under-
standing Robust and Exploratory Data Analysis, Wiley, New York.
[163] Hogg, R.V. (1972). More light on the kurtosis and related statistics,
J. Amer. Statist. Assoc. 67, 422-424.
[164] Holla, M.S. and Bhattacharya, S.K. (1968). On a compound Gaussian
distribution, Ann. Inst. Statist. Math. 20, 331-336.
[165] Hombas, V.C. (1986). The double exponential distribution: Using
calculus to find a maximum likelihood estimator, Amer. Statist. 40(2),
178.
[166] Horn, P.S. (1983). A measure for peakedness, Amer. Statist. 37(1),
55-56.
[167] Hsu, D.A. (1979). Long-tailed distributions for position errors in nav-
igation, Appl. Statist. 28, 62-72.
[168] Huang, W.J. and Chen, L.S. (1989). Note on a characterization of
gamma distributions, Statist. Probab. Lett. 8, 485-487.
[169] Huber, P.J. (1981). Robust Statistics, Wiley, New York.
[170] Hwang, J. T. and Chen, J. (1986). Improved confidence sets for the
coefficients of a linear model with spherically symmetric errors, Ann.
Statist. 14(2), 444-460.
[171] Iliescu, D.V. and Vodă, V.Gh. (1973). Proportion-p estimators for
certain distributions, Statistica 33, 309-321.
[172] International Organization for Standardization (1989). Sampling pro-
cedures and charts for inspection by variables for percent nonconform-
ing (ISO 3951), Geneva, Switzerland.
[173] Jacques, C., Rémillard, B. and Theodorescu, R. (1999). Estimation
of Linnik law parameters, Statist. Decisions 17, 213-235.
[174] Jakuszenkow, H. (1978). Estimation of variance in Laplace’s distribu-
tion, Zeszyty Nauk. Politech. Łódz. Matematyka 10, 29-36 (in Polish).
[175] Jakuszenkow, H. (1979). Estimation of the variance in the generalized
Laplace distribution with quadratic loss function, Demonstratio Math.
12(3), 581-591.
[176] Janicki, A. and Weron, A. (1994). Simulation and Chaotic Behavior
of α-Stable Stochastic Processes, Marcel Dekker, New York.
[177] Janjić, S. (1984). On random variables with the same distribution
type as their random sum, Publications de l’Institut Mathématique,
Nouvelle Série 35(49), 161-166.
[178] Janković, S. (1992). Preservation of type under mixing, Teor. Veroy-
atnost. i Primenen. 37, 594-599.
[179] Janković, S. (1993a). Some properties of random variables which are
stable with respect to the random sample size, Stability Problems for
Stochastic Models, Lecture Notes in Math. 1546, 68-75.
[180] Janković, S. (1993b). Enlargement of the class of geometrically in-
finitely divisible random variables, Publ. Inst. Math. (Beograd) (N.S.)
54(68), 126-134.
[181] Jayakumar, K. and Pillai, R.N. (1993). The first-order autoregressive
Mittag-Leffler process, J. Appl. Probab. 30, 462-466.
[182] Jayakumar, K. and Pillai, R.N. (1995). Discrete Mittag-Leffler dis-
tribution, Statist. Probab. Lett. 23, 271-274.
[183] Jaynes, E.T. (1957). Information theory and statistical mechanics I,
Phys. Rev. 106, 620-630.
[184] Johnson, M.E. (1987). Multivariate Statistical Simulation, Wiley,
New York.
[185] Johnson, N.L. (1954). System of frequency curves derived from the
first law of Laplace, Trabajos de Estadistica 5, 283-291.
[186] Johnson, N.L. and Kotz, S. (1970). Continuous Univariate Distribu-
tions - 1, Wiley, New York.
[187] Johnson, N.L. and Kotz, S. (1972). Distributions in Statistics: Con-
tinuous Multivariate Distributions, Wiley, New York.
[188] Johnson, N.L., Kotz, S. and Balakrishnan, N. (1994). Continuous
Univariate Distributions - 1 (2nd ed.), Wiley, New York.
[189] Johnson, N.L., Kotz, S. and Balakrishnan, N. (1995). Continuous
Univariate Distributions - 2 (2nd ed.), Wiley, New York.
[190] Johnson, N.L., Kotz, S. and Kemp, A.W. (1992). Univariate Discrete
Distributions (2nd ed.), Wiley, New York.
[191] Jones, P.N. and McLachlan, G.J. (1990). Laplace-normal mixtures
fitted to wind shear data, J. Appl. Statistics 17, 271-276.
[192] Jorgensen, B. (1982). Statistical Properties of the Generalized Inverse
Gaussian Distribution, Lecture Notes in Statist. 9, Springer-Verlag,
New York.
[193] Joshi, S.N. (1984). Expansion of Bayes risk in the case of double
exponential family, Sankhyā Ser. A 46, 64-74.
[194] Kacki, E. (1965a). Absolute moments of the Laplace distribution,
Prace Matematyczne 10, 89-93 (in Polish).
[195] Kacki, E. (1965b). Certain special cases of the parameter estimation
of a mixture of two Laplace distributions, Zeszyty Naukowe Politech-
niki Lodzkiej, Elektryka 20 (in Polish).
[196] Kacki, E. and Krysicki, W. (1967). Parameter estimation of a mix-
ture of two Laplace distributions (general case), Roczniki Polskiego
Towarzystwa Matematycznego, Seria I: Prace Matematyczne 11, 23-
31 (in German).
[197] Kafaei, M.-A. and Schmidt, P. (1985). On the adequacy of the “Sar-
gan distribution” as an approximation to the normal, Comm. Statist.
Theory Methods 14(3), 509-526.
[198] Kagan, A.M., Linnik, Yu.V. and Rao, C.R. (1973). Characterization
Problems in Mathematical Statistics, Wiley, New York.
[199] Kakosyan, A.V., Klebanov, L.B. and Melamed, I.A. (1984). Char-
acterization of Distributions by the Method of Intensively Monotone
Operators, Lecture Notes in Math. 1088, Springer, Berlin.
[200] Kalashnikov, V. (1997). Geometric Sums: Bounds for Rare Events
with Applications, Kluwer Acad. Publ., Dordrecht.
[201] Kanefsky, M. and Thomas, J.B. (1965). On polarity detection
schemes with non-Gaussian inputs, J. Franklin Institute 280, 120-138.
[202] Kanji, G.K. (1985). A mixture model for wind shear data. J. Appl.
Statistics 12, 49-58.
[203] Kapoor, S. and Kanji, G.K. (1990). Application of the characteriza-
tion theory to the mixture model, J. Appl. Statist. 17, 263-270.
[204] Kappenman, R.F. (1975). Conditional confidence intervals for double
exponential distribution parameters, Technometrics 17(2), 233-235.
[205] Kappenman, R.F. (1977). Tolerance intervals for the double expo-
nential distribution, J. Amer. Statist. Assoc. 72, 908-909.
[206] Kapteyn, J.C. (1903). Skew Frequency Curves, Astronomical Labo-
ratory, Groningen.
[207] Kapur, J.N. (1993). Maximum-Entropy Models in Science and Engi-
neering (revised ed.), Wiley, New York.
[208] Karst, O.J. and Polowy, H. (1963). Sampling properties of the median
of a Laplace distribution, Amer. Math. Monthly 70, 628-636.
[209] Kelker, D. (1971). Infinite divisibility and variance mixtures of the
normal distribution, Ann. Math. Statist. 42(2), 802-808.
[210] Kendall, M.G., Stuart, A. and Ord, J.K. (1994). Kendall’s Advanced
Theory of Statistics, Vol. 1, Distribution theory (6th edition), Halsted
Press (Wiley, Inc.), New York.
[211] Keynes, J.M. (1911). The principal averages and the laws of error
which lead to them, J. Roy. Statist. Soc. 74, New Series, 322-331.
[212] Khan, A.H. and Khan, R.V. (1987). Relations among moments of
order statistics in samples from doubly truncated Laplace and expo-
nential distributions, J. Statist. Res. 21, 35-44.
[213] Klebanov, L.B., Maniya, G.M. and Melamed, I.A. (1984). A problem
of Zolotarev and analogs of infinitely divisible and stable distributions
in a scheme for summing a random number of random variables, The-
ory Probab. Appl. 29, 791-794.
[214] Klebanov, L.B., Melamed, J.A., Mittnik, S. and Rachev, S.T. (1996).
Integral and asymptotic representations of geo-stable densities, Appl.
Math. Lett. 9, 37-40.
[215] Klebanov, L.B. and Rachev, S.T. (1996). Sums of random number
of random variables and their approximations with ν-accompanying
infinitely divisible laws, Serdica Math. J. 22, 471-496.
[216] Klein, G.E. (1993). The sensitivity of cash-flow analysis to the choice
of statistical model for interest rate changes (with discussions), Trans-
actions of the Society of Actuaries XLV, 79-186.
[217] Kollo, T. (2000). Private communication (to S. Kotz).
[218] Kotz, S., Balakrishnan, N. and Johnson, N.L. (2000). Continuous
Multivariate Distributions (2nd ed.), Wiley, New York.
[219] Kotz, S., Fang, K.T. and Liang, J.J. (1997). On multivariate vertical
density representation and its application to random number genera-
tion, Statistics 30, 163-180.
[220] Kotz, S. and Johnson, N.L. (1982). Encyclopedia of Statistical Sci-
ences, Vol. 2, Wiley, New York.
[221] Kotz, S., Johnson, N.L. and Read, C.B. (1985). Log-Laplace distri-
bution, in Encyclopedia of Statistical Sciences, Vol. 5 (eds., S. Kotz,
N.L. Johnson, and C.B. Read), pp. 133-134, Wiley, New York.
[222] Kotz, S., Kozubowski, T.J. and Podgórski, K. (2000a). Maximum
entropy characterization of asymmetric Laplace distribution, Technical
Report No. 361, Department of Statistics and Applied Probability,
University of California, Santa Barbara; to appear in Int. Math. J.
[223] Kotz, S., Kozubowski, T.J. and Podgórski, K. (2000b). An asym-
metric multivariate Laplace distribution, Technical Report No. 367,
Department of Statistics and Applied Probability, University of Cali-
fornia, Santa Barbara.
[224] Kotz, S., Kozubowski, T.J. and Podgórski, K. (2000c). Maximum
likelihood estimation of asymmetric Laplace parameters, preprint.
[225] Kotz, S. and Ostrovskii, I.V. (1996). A mixture representation of the
Linnik distribution, Statist. Probab. Lett. 26, 61-64.
[226] Kotz, S., Ostrovskii, I.V. and Hayfavi, A. (1995). Analytic and
asymptotic properties of Linnik’s probability densities, I and II, J.
Math. Anal. Appl. 193, 353-371 and 497-521.
[227] Kotz, S. and Steutel, F.W. (1988). Note on a characterization of
exponential distributions, Statist. Probab. Lett. 6, 201-203.
[228] Kotz, S. and Troutt, M.D. (1996). On the vertical density representa-
tion and ordering of distributions, Statistics 28, 241-247.
[229] Kou, S.G. (2000). A jump diffusion model for option pricing with
three properties: leptokurtic feature, volatility smile, and analytical
tractability, Preprint, Columbia University.
[230] Koutrouvelis, I.A. (1980). Regression-type estimation of the param-
eters of stable laws, J. Amer. Statist. Assoc. 75, 918-928.
[231] Kozubowski, T.J. (1993). Estimation of the parameters of geometric
stable laws, Technical Report No. 253, Department of Statistics and
Applied Probability, University of California, Santa Barbara; appeared
in Math. Comput. Modelling 29(10-12), 241-253, 1999.
[232] Kozubowski, T.J. (1994a). Representation and properties of geomet-
ric stable laws, in Approximation, Probability, and Related Fields (eds.,
G. Anastassiou and S.T. Rachev), pp. 321-337, Plenum, New York.
[233] Kozubowski, T.J. (1994b). The inner characterization of geometric
stable laws, Statist. Decisions 12, 307-321.
[234] Kozubowski, T.J. (1997). Characterization of multivariate geometric
stable distributions, Statist. Decisions 15, 397-416.
[235] Kozubowski, T.J. (1998). Mixture representation of Linnik distribu-
tion revisited, Statist. Probab. Lett. 38, 157-160.
[236] Kozubowski, T.J. (1999). Fractional moment estimation of Linnik
and Mittag-Leffler parameters, Math. Comput. Modelling, in press.
[237] Kozubowski, T.J. (2000a). Exponential mixture representation of ge-
ometric stable distributions, Ann. Inst. Statist. Math. 52(2), 231-238.
[238] Kozubowski, T.J. (2000b). Computer simulation of geometric stable
random variables, J. Comp. Appl. Math. 116, 221-229.
[239] Kozubowski, T.J. and Panorska, A.K. (1996). On moments and tail
behavior of ν-stable random variables, Statist. Probab. Lett. 29, 307-
315.
[240] Kozubowski, T.J. and Panorska, A.K. (1998). Weak limits for multi-
variate random sums, J. Multivariate Anal. 67, 398-413.
[241] Kozubowski, T.J. and Panorska, A.K. (1999). Multivariate geometric
stable distributions in financial applications, Math. Comput. Modelling
29, 83-92.
[242] Kozubowski, T.J. and Podgórski, K. (1999a). A class of asymmetric
distributions, Actuarial Research Clearing House 1, 113-134.
[243] Kozubowski, T.J. and Podgórski, K. (1999b). A multivariate and asym-
metric generalization of Laplace distribution, Comput. Statist., in press.
[244] Kozubowski, T.J. and Podgórski, K. (1999c). Asymmetric Laplace
laws and modeling financial data, Math. Comput. Modelling, in press.
[245] Kozubowski, T.J. and Podgórski, K. (2000). Asymmetric Laplace dis-
tributions, Math. Sci. 25, 37-46.
[246] Kozubowski, T.J., Podgórski, K. and Samorodnitsky, G. (1998). Tails
of Lévy measure of geometric stable random variables, Extremes 1(3),
367-378.
[247] Kozubowski, T.J. and Rachev, S.T. (1994). The theory of geometric
stable distributions and its use in modeling financial data, European
J. Oper. Res. 74, 310-324.
[248] Kozubowski, T.J. and Rachev, S.T. (1999a). Univariate geometric
stable laws, J. Comput. Anal. Appl. 1(2), 177-217.
[249] Kozubowski, T.J. and Rachev, S.T. (1999b). Multivariate geometric
stable laws, J. Comput. Anal. Appl. 1(4), 349-385.
[250] Krein, M. (1944). On the extrapolation problem of A.N. Kolmogorov,
Doklady Akad. Nauk SSSR 46(8), 339-342 (in Russian).
[251] Krysicki, W. (1966a). An application of the method of moments to
the problem of parameter estimation of a mixture of two Laplace dis-
tributions, Zeszyty Naukowe Politechniki Łódzkiej, Włókiennictwo 14,
3-14 (in Polish).
[252] Krysicki, W. (1966b). Estimation of parameters in a mixture of two
Laplace’s distributions with use of the method of moments. Zeszyty
Naukowe Politechniki Łódzkiej 77, 5-13 (in Polish).
[253] Laha, R.G. (1961). On a class of unimodal distributions, Proc. Amer.
Math. Soc. 12, 181-184.
[254] Lanfer, H. (1978). Maximum signal-to-noise-ratio quantization for
Laplacian-distributed signals, Inform. Syst. Theory in Digital Com-
mun. NTG-Report, VDE-Verlag, Berlin, Germany 65, 52.
[255] Laplace, P.S. (1774). Mémoire sur la probabilité des causes par les
événemens, Mémoires de Mathématique et de Physique 6, 621-656.
English translation Memoir on the Probability of the Causes of Events
in Statistical Science 1(3), 364-378, 1986.
[256] Latta, R.B. (1979). Composition rules for probabilities from paired
comparisons, Ann. Statist. 7, 349-371.
[257] Lehmann, E.L. (1983). Theory of Point Estimation, Wiley, New York.
[258] Lehmann, E.L. and Casella, G. (1998). Theory of Point Estimation
(2nd ed.), Springer, New York.
[259] Levin, A. and Albanese, C. (1998). Bayesian Value-at-Risk: Calibra-
tion and Simulation, Presentation at the SIAM Annual Meeting’98,
Toronto, July 13-17, 1998.
[260] Levin, A. and Tchernitser, A. (1999). Multifactor gamma stochastic
variance Value-at-Risk model, Presentation at the Conference Appli-
cations of Heavy Tailed Distributions in Economics, Engineering, and
Statistics, American University, Washington, DC, June 3-5, 1999.
[261] Lien, D.H.D., Balakrishnan, N. and Balasubramanian, K. (1992). Mo-
ments of order statistics from a non-overlapping mixture model with
applications to truncated Laplace distribution, Comm. Statist. Theory
Methods 21, 1909-1928.
[262] Lin, G.D. (1994). Characterizations of the Laplace and related dis-
tributions via geometric compounding, Sankhyā Ser. A 56, 1-9.
[263] Lindsey, J.K. (1999). Multivariate elliptically contoured distributions
for repeated measurements, Biometrics 55, 1277-1280.
[264] Ling, K.D. (1977). Bayesian predictive distribution for sample from
double exponential distribution, Nanta Mathematica 10(1), 13-19.
[265] Ling, K.D. and Lim, S.K. (1978). On Bayesian predictive distri-
butions for samples from double exponential distribution based on
grouped data, Nanta Mathematica 11(2), 191-201.
[266] Lingappaiah, G.S. (1988). On two-piece double exponential distribu-
tion, J. Korean Statist. Soc. 17(1), 46-55.
[267] Linnik, Ju.V. (1953). Linear forms and statistical criteria, I, II, Ukr.
Mat. Zhurnal 5, 207-290 (in Russian); also in Selected Translations in
Math. Statist. Probab. 3, 1-90 (1963).
[268] Liseo, B. (1990). The skew normal class of densities: inferential as-
pects from a Bayesian viewpoint, Statistica 50, 59-70 (in Italian).
[269] Loh, W.-Y. (1984). Random quotients and robust estimation, Comm.
Statist. Theory Methods 13(22), 2757-2769.
[270] Longstaff, F. A. (1994). Stochastic volatility and option valuation: a
pricing-density approach, Preprint, The Anderson Graduate School of
Management, UCLA.
[271] Lukacs, E. (1955). A characterization of gamma distribution, Ann.
Math. Statist. 26, 88-95.
[272] Lukacs, E. (1957). Remarks concerning characteristic functions, Ann.
Math. Statist. 28, 717-723.
[273] Lukacs, E. (1970). Characteristic Functions, Griffin, London.
[274] Lukacs, E. and Laha, R.G. (1964). Applications of Characteristic
Functions, Hafner Publishing Company, New York.
[275] Madan, D.B., Carr, P. and Chang, E.C. (1998). The variance gamma
process and option pricing, European Finance Review 2, 74-105.
[276] Madan, D.B. and Milne, F. (1991). Option pricing with VG martin-
gale components, Mathematical Finance 1, (4), 33-55.
[277] Madan, D.B. and Seneta, E. (1990). The variance gamma (V.G.)
model for share market returns, J. Business 63, 511-524.
[278] Magnus, J.R. (2000). Estimation of the mean of a univariate normal
distribution with known variance, Preprint, CenTER, Tilburg Univer-
sity.
[279] Magnus, J.R. and Neudecker, H. (1999). Matrix Differential Calculus
with Applications in Statistics and Economics, Wiley, Chichester.
[280] Mandelbrot, B. (1963). The variation of certain speculative prices,
J. Business 36, 394-419.
[281] Manly, B.F.J. (1976). Some examples of double exponential fitness
functions, Heredity 36, 229-234.
[282] Mantel, N. (1973). A characteristic function exercise, Amer. Statist.
27(1), 31.
[283] Mantel, N. (1987). The Laplace distribution and 2 × 2 unit normal
determinants, Amer. Statist. 41(1), 88.
[284] Mantel, N. and Pasternak, B.S. (1966). Light bulb statistics, J. Amer.
Statist. Assoc. 61, 633-639.
[285] Marks, R.J., Wise, G.L., Haldeman, D.G., and Whited, J.L. (1978).
Detection in Laplace noise, IEEE Trans. Aerospace Electron. Systems
AES-14(6), 866-871.
[286] Marshall, A.W. and Olkin, I. (1993). Maximum likelihood character-
izations, Statist. Sinica 3, 157-171.
[287] Mathai, A.M. (1993). Generalized Laplace distribution with applica-
tions, J. Appl. Statist. Sci. 1(2), 169-178.
[288] McAlister, D. (1879). The law of the geometric mean, Proceedings of
the Royal Soc. 29, 367-376.
[289] McGill, W.J. (1962). Random fluctuations of response rate, Psy-
chometrika 27, 3-17.
[290] McGraw, D.K. and Wagner, J.F. (1968). Elliptically symmetric dis-
tributions, IEEE Trans. Inform. Theory 14, 110-120.
[291] McKay, A.T. (1932). A Bessel function distribution, Biometrika 24,
39-44.
[292] McKay, A.T. and Pearson, E.S. (1933). A note on the distribution of
range in samples of n, Biometrika 25, 415-420.
[293] McLeish, D.L. (1982). A robust alternative to the normal distribu-
tion, Canad. J. Statist. 10(2), 89-102.
[294] Mendenhall, W. and Hader, R.J. (1958). Estimation of parameters
of mixed exponentially distributed failure time distributions from cen-
sored life test data, Biometrika 45, 504-520.
[295] Mertz, P. (1961). Model of impulsive noise for data transmission,
IEEE Trans. Comm. CS-9, 130-137.
[296] Miller, G. (1978). Properties of certain symmetric stable distribu-
tions, J. Multivariate Anal. 8(3), 346-360.
[297] Miller, J.H. and Thomas, J.B. (1972). Detectors for discrete-time
signals in non-Gaussian noise, IEEE Trans. Inform. Theory IT-18(2),
241-250.
[298] Milne, R.K. and Yeo, G.F. (1989). Random sum characterizations,
Math. Sci. 14, 120-126.
[299] Missiakoulis, S. (1983). Sargan densities, which one? J. Econometrics
23, 223-233.
[300] Missiakoulis, S. and Darton, R. (1985). The distribution of 2 × 2 unit
normal determinants, Amer. Statist. 39(3), 241.
[301] Mitchell, A.F.S. (1994). A note on posterior moments for a normal
mean with double exponential prior, J. Roy. Statist. Soc. Ser. B 56(4),
605-610.
[302] Mittnik, S. and Rachev, S.T. (1991). Alternative multivariate stable
distributions and their applications to financial modelling, in Stable
Processes and Related Topics (eds., S. Cambanis et al.), pp. 107-119,
Birkhauser, Boston.
[303] Mittnik, S. and Rachev, S.T. (1993). Modeling asset returns with
alternative stable distributions, Econometric Rev. 12(3), 261-330.
[304] Mohan, N.R., Vasudeva, R. and Hebbar, H.V. (1993). On geomet-
rically infinitely divisible laws and geometric domains of attraction,
Sankhyā Ser. A 55, 171-179.
[305] Nakagami, M. (1964). On the intensity distribution and its applica-
tions to signal statistics, Radio Sci. J. Res. 68D(9), 995-1003.
[306] Navarro, J. and Ruiz, J.M. (2000). Personal communication (to S.
Kotz).
[307] Neyman, J. and Pearson, E.S. (1928). On the use and interpreta-
tion of certain test criteria for purposes of statistical inference, I,
Biometrika 20A, 175-240.
[308] Neyman, J. and Pearson, E.S. (1933). On the problem of the most
efficient tests of statistical hypotheses, Phil. Trans. Roy. Soc. London,
Ser. A 231, 289-337.
[309] Nicholson, W.L. (1958). On the distribution of 2 × 2 random normal
determinants, Ann. Math. Statist. 29, 575-580.
[310] Nikias, C.L. and Shao, M. (1995). Signal Processing with Alpha-Stable
Distributions and Applications, Wiley, New York.
[311] Nikitin, Y. (1995). Asymptotic Efficiency of Nonparametric Tests,
Cambridge University Press, Cambridge.
[312] Nitadori, K. (1965). Statistical analysis of ∆PCM, Electron. Com-
mun. in Japan 48, 17-26.
[313] Noll, P. and Zelinski, R. (1979). Comments on “Quantizing charac-
teristics for signals having Laplacian amplitude probability density
function”, IEEE Trans. Comm. COM-27, 1295-1297.
[314] Norris, J.P., Nemiroff, R.J., Bonnell, J.T., Scargle, J.D., Kouveliotou,
C., Paciesas, W.S., Meegan, C.A. and Fishman, G.J. (1996). Attributes
of pulses in long bright gamma-ray bursts, Astrophysical Journal 459,
393-412.
[315] Norton, R.M. (1984). The double exponential distribution: Using cal-
culus to find a maximum likelihood estimator, Amer. Statist. 38(2),
135-136.
[316] Nyquist, H., Rice, S.O. and Riordan, J. (1954). The distribution of
random determinants, Quart. Appl. Math. 12(2), 97-104.
[317] Ogawa, J. (1951). Contribution to the theory of systematic statistics,
I, Osaka Math. J. 3(2), 175-213.
[318] Okubo, T. and Narita, N. (1980). On the distribution of extreme
winds expected in Japan, National Bureau of Standards Special publi-
cation 560-1, 12 pp.
[319] Olver, F.W.J. (1974). Asymptotics and Special Functions, Academic
Press, New York.
[320] Ord, J.K. (1983). Laplace distribution, in Encyclopedia of Statistical
Sciences, Vol. 4 (eds., S. Kotz, N.L. Johnson, and C.B. Read), pp. 473-
475, Wiley, New York.
[321] Osiewalski, J. and Steel, M.F.J. (1993). Robust Bayesian inference in
l_q-spherical models, Biometrika 80(2), 456-460.
[322] Ostrovskii, I.V. (1995). Analytic and asymptotic properties of multi-
variate Linnik’s distribution, Mathematical Physics, Analysis, Geom-
etry 2(3/4), 436-455.
[323] Pace, L. and Salvan, A. (1997). Principles of Statistical Inference,
World Scientific Press, Singapore.
[324] Pakes, A.G. (1992a). A characterization of gamma mixtures of stable
laws motivated by limit theorems, Statist. Neerlandica 2-3, 209-218.
[325] Pakes, A.G. (1992b). On characterizations through mixed sums, Aus-
tral. J. Statist. 34(2), 323-339.
[326] Pakes, A.G. (1995). Characterization of discrete laws via mixed sums
and Markov branching processes, Stochastic Process. Appl. 55, 285-
300.
[327] Pakes, A.G. (1997). The laws of some random series of independent
summands, in Advances in the Theory and Practice of Statistics: A
Volume in Honor of Samuel Kotz (eds., N.L. Johnson and N. Balakr-
ishnan), pp. 499-516, Wiley, New York.
[328] Pakes, A.G. (1998). Mixture representations for symmetric general-
ized Linnik laws, Statist. Probab. Lett. 37, 213-221.
[329] Pearson, E.S. (1935). A comparison of β₂ and Mr. Geary’s wₙ criteria,
Biometrika 27, 333-352.
[330] Pearson, E.S. and Hartley, H.O. (1942). The probability integral of
the range in samples of n observations from a normal population,
Biometrika 32, 301-310.
[331] Pearson, K., Jeffery, G.B. and Elderton, E.M. (1929). On the distri-
bution of the first product moment-coefficient, in samples drawn from
an indefinitely large normal population, Biometrika 21, 164-193.
[332] Pearson, K., Stouffer, S.A. and David, F.N. (1932). Further applica-
tions in statistics of the Tₘ(x) Bessel function, Biometrika 24, 293-350.
[333] Peterson, R. and Silver, E.A. (1979). Decision Systems for Inventory
Management and Production Planning, Wiley, New York.
[334] Pillai, R.N. (1985). Semi-α-Laplace distributions, Comm. Statist.
Theory Methods 14(4), 991-1000.
[335] Pillai, R.N. (1990). On Mittag-Leffler functions and related distribu-
tions, Ann. Inst. Statist. Math. 42(1), 157-161.
[336] Pitt, L. (1982). Positively correlated normal variables are associated,
Ann. Probab. 10(2), 496-499.
[337] Poiraud-Casanova, S. and Thomas-Agnan, C. (2000). About mono-
tone regression quantiles, Statist. Probab. Lett. 48, 101-104.
[338] Press, S.J. (1967). On the sample covariance from a bivariate normal
distribution, Ann. Inst. Statist. Math. 19, 355-361.
[339] Problem 64-13 (1966). SIAM Review 8(1), 108-110.
[340] Rachev, S.T. and SenGupta, A. (1992). Geometric stable distribu-
tions and Laplace-Weibull mixtures, Statist. Decisions 10, 251-271.
[341] Rachev, S.T. and SenGupta, A. (1993). Laplace-Weibull mixtures for
modeling price changes, Management Science 39(8), 1029-1038.
[342] Raghunandanan, K. and Srinivasan, R. (1971). Simplified estima-
tion of parameters in a double exponential distribution, Technometrics
13(3), 689-691.
[343] Ramachandran, B. (1997). On geometric stable laws, a related prop-
erty of stable processes, and stable densities, Ann. Inst. Statist. Math.
49(2), 299-313.
[344] Ramsey, F.L. (1971). Small sample power functions for nonparamet-
ric tests of location in the double exponential family, J. Amer. Statist.
Assoc. 66, 149-151.
[345] Rao, C.R. (1961). Asymptotic efficiency and limiting information,
Proc. Fourth Berkeley Symp. Math. Statist. Probab. 1, 531-545.
[346] Rao, C.R. (1965). Linear Statistical Inference and Its Applications,
Wiley, New York.
[347] Rao, C.R. and Ghosh, J.K. (1971). A note on some translation-
parameter families of densities for which the median is an M.L.E.,
Sankhyā Ser. A 33(1), 91-92.
[348] Rao, A.V., Rao, A.V.D. and Narasimhan, V.L. (1991). Optimum lin-
ear unbiased estimation of the scale parameter by absolute values of
order statistics in the double exponential and double Weibull distri-
butions, Comm. Statist. Simulation Comput. 20(4), 1139-1158.
[349] Rényi, A. (1956). A Poisson-folyamat egy jellemzése, Magyar Tud.
Akad. Mat. Kutató Int. Közl. 1(4), 519-527 (in Hungarian).
[350] Reza, F.M. (1961). An Introduction to Information Theory, McGraw-
Hill, New York.
[351] Rider, P. (1961). The method of moments applied to a mixture of
two exponential distributions, Ann. Math. Statist. 32, 143-147.
[352] Robertson, T. and Waltman, P. (1968). On estimating monotone pa-
rameters, Ann. Math. Statist. 39(3), 1030-1039.
[353] Rohatgi, V.K. (1984). Statistical Inference, Wiley, New York.
[354] Rosenberger, J.L. and Gasko, M. (1983). Comparing location estima-
tors: trimmed means, medians, and trimean, in Understanding Robust
and Exploratory Data Analysis (eds., D.C. Hoaglin, F. Mosteller, and
J.W. Tukey), pp. 297-338, Wiley, New York.
[355] Rosi´nski, J. (1976). Weak compactness of laws of random sums of
identically distributed random vectors in Banach spaces, Colloq. Math.
35, 313-325.
[356] Rowland, R.St.H. and Sichel, H.S. (1960). Statistical quality control
of routine underground sampling, J. S. Afr. Inst. Min. Metall. 60,
251-284.
[357] Sahli, A., Trecourt, P. and Robin, S. (1997). Acceptance sampling by
Laplace distributed variables, Comm. Statist. Theory Methods 26,
2817-2834.
[358] Saleh, A.K.Md., Ali, M.M. and Umbach, D. (1983). Estimation of
the quantile function of a location-scale family of distributions based
on a few selected order statistics, J. Statist. Plann. Inference 8, 75-86.
[359] Salzer, H.E., Zucker, R. and Capuano, R. (1952). Table of the zeros
and weight factors of the first twenty Hermite polynomials, J. Research
NBS 48, 111-116.
[360] Samorodnitsky, G. and Taqqu, M. (1994). Stable Non-Gaussian Ran-
dom Processes, Chapman & Hall, New York.
[361] Sansing, R.C. (1976). The t-statistic for a double exponential distri-
bution, SIAM J. Appl. Math. 31, 634-645.
[362] Sansing, R.C. and Owen, D.B. (1974). The density of the t-statistic
for non-normal distributions, Comm. Statist. 3(2), 139-155.
[363] Sarabia, J.M. (1993). Problem 48, Q¨uestii´o 17(1) (in Spanish).
[364] Sarabia, J.M. (1994). Personal communication (to S. Kotz).
[365] Sarhan, A.E. (1954). Estimation of the mean and standard deviation
by order statistics, Ann. Math. Statist. 25, 317-328.
[366] Sarhan, A.E. (1955). Estimation of the mean and standard deviation
by order statistics, Part III, Ann. Math. Statist. 26, 576-592.
[367] Sarhan, A.E. and Greenberg, B. (1967). Linear estimates for doubly
censored samples from the exponential distribution with observations
also missing from the middle, Bull. Int. Statist. Institute, 36th Session
42(2), 1195-1204.
[368] Sassa, H. (1968). The probability density of a certain statistic in one
sample from the double exponential population, Bull. Tokyo Gakugei
Univ. 19, 85-89 (in Japanese).
[369] Scallan, A.J. (1992). Maximum likelihood estimation for a nor-
mal/Laplace mixture distribution, The Statistician 41, 227-231.
[370] Serfling, R.J. (1980). Approximation Theorems of Mathematical
Statistics, Wiley, New York.
[371] Sharma, D. (1984). On estimating the variance of a generalized
Laplace distribution, Metrika 31, 85-88.
[372] Shepp, L.A. (1962). Symmetric Random Walk, Trans. Amer. Math.
Soc. 104, 144-153.
[373] Shyu, J.-C. and Owen, D.B. (1986a). One-sided tolerance intervals
for the two-parameter double exponential distribution, Comm. Statist.
Simulation Comput. 15(1), 101-119.
[374] Shyu, J.-C. and Owen, D.B. (1986b). Two-sided tolerance intervals
for the two-parameter double exponential distribution, Comm. Statist.
Simulation Comput. 15(2), 479-495.
[375] Shyu, J.-C. and Owen, D.B. (1987). β-expectation tolerance intervals
for the double exponential distribution, Comm. Statist. Simulation
Comput. 16(1), 129-139.
[376] Sichel, H.S. (1973). Statistical valuation of diamondiferous deposits,
J. S. Afr. Inst. Min. Metall. 73, 235-243.
[377] Springer, M.D. (1979). The Algebra of Random Variables, Wiley, New
York.
[378] Srinivasan, R. and Wharton, R.M. (1982). Confidence bands for the
Laplace distribution, J. Statist. Comput. Simulation 14, 89-99.
[379] Steutel, F.W. (1970). Preservation of Infinite Divisibility Under Mix-
ing and Related Topics, Mathematisch Centrum, Amsterdam.
[380] Steutel, F.W. and van Harn, K. (1979). Discrete analogues of self-
decomposability and stability, Ann. Probab. 7, 497-501.
[381] Stigler, S.M. (1986a). Laplace’s 1774 Memoir on Inverse Probability,
Statistical Science 1(3), 359-378.
[382] Stigler, S.M. (1986b). The History of Statistics: The Measurement of
Uncertainty Before 1900, Harvard Univ. Press, Cambridge.
[383] Stoyanov, J. (2000). Krein condition in probabilistic moment prob-
lems, Bernoulli 6(5), 939-949.
[384] Subbotin, M.T. (1923). On the law of frequency of error, Mathe-
maticheskii Sbornik 31, 296-300.
[385] Sugiura, N. and Naing, M.T. (1989). Improved estimators for the
location of double exponential distribution, Comm. Statist. Theory
Methods 18, 541-554.
[386] Sullivan, G.J. (1996). Efficient scalar quantization of exponential and
Laplacian random variables, IEEE Trans. Inform. Theory 42, 1265-
1274.
[387] Szasz, D. (1972). On classes of limit distributions for sums of a ran-
dom number of identically distributed independent random variables,
Theory Probab. Appl. 17, 401-415.
[388] Takano, K. (1988). On the L´evy representation of the characteris-
tic function of the probability distribution Ce^{−|x|}dx, Bull. Fac. Sci.
Ibaraki Univ. 20, 61-65.
[389] Takano, K. (1989). On mixtures of the normal distribution by the
generalized gamma convolutions, Bull. Fac. Sci. Ibaraki Univ. 21, 29-
41.
[390] Takano, K. (1990). Correction and addendum to “On mixtures of
the normal distribution by the generalized gamma convolutions”, Bull.
Fac. Sci. Ibaraki Univ. 22, 50-52.
[391] Takeuchi, K. and Akahira, M. (1976). On the second order asymptotic
efficiencies of estimators, in Proc. Third Japan-USSR Symp. Probab.
Theory (eds., G. Maruyama and J.V. Prokhorov), pp. 604-638; Lecture
Notes in Math. 550, Springer-Verlag, Berlin.
[392] Taylor, J.M.G. (1992). Properties of modelling the error distribution
with an extra shape parameter, Comput. Statist. Data Anal. 13, 33-46.
[393] Teicher, H. (1961). Maximum likelihood characterization of distribu-
tions, Ann. Math. Statist. 32, 1214-1222.
[394] Teichroew, D. (1957). The mixture of normal distributions with dif-
ferent variances, Ann. Math. Statist. 28, 510-512.
[395] Tiao, G.C. and Lund, D.R. (1970). The use of OLUMV estimators
in inference robustness studies of the location parameter of a class of
symmetric distributions, J. Amer. Statist. Assoc. 65, 370-386.
[396] Tomkins, R.J. (1972). A generalization of Kolmogorov’s law of the
iterated logarithm, Proc. Amer. Math. Soc. 32(1), 268-274.
[397] Tricker, A.R. (1984). Effects of rounding on the moments of a prob-
ability distribution, Statistician 33(4), 381-390.
[398] Troutt, M.D. (1991). A theorem on the density of the density ordinate
and an alternative interpretation of the Box-Muller method, Statistics
22, 463-466.
[399] Tse, Y.K. (1987). A note on Sargan densities, J. Econometrics 34,
349-354.
[400] Ulrich, G. and Chen, C.-C. (1987). A bivariate double exponential dis-
tribution and its generalization, ASA Proceedings on Statistical Com-
puting, 127-129.
[401] Umbach, D., Ali, M.M. and Saleh, A.K.Md. (1984). Hypothesis test-
ing for the double exponential distribution based on optimal spacing,
Soochow J. Math. 10, 133-143.
[402] Uppuluri, V.R.R. (1981). Some properties of log-Laplace distribution,
in Statistical Distributions in Scientific Work, Vol 4 (eds., G.P. Patil,
C. Taillie and B. Baldessari), pp. 105-110, Reidel, Dordrecht.
[403] Uthoff, V.A. (1973). The most powerful scale and location invariant
test of the normal versus the double exponential, Ann. Statist. 1, 170-
174.
[404] van der Waerden, B.L. (1952). Order tests for the two-sample problem
and their power, Indag. Math. 14, 453-458. [Correction: Indag. Math.
15, 80.]
[405] Van Eeden, C. (1957). Maximum likelihood estimation of partially
or completely ordered parameters, I and II, Indag. Math. 19, 128-136,
201-211.
[406] van Zwet, W.R. (1964). Convex Transformations of Random Vari-
ables, Math. Centre, Amsterdam.
[407] Watson, G.N. (1962). A Treatise on the Theory of Bessel Functions,
Cambridge University Press, London.
[408] Weida, F.M. (1935). On certain distribution functions when the law
of the universe is Poisson’s first law of error, Ann. Math. Statist. 6,
102-110.
[409] Weron, R. (1996). On the Chambers-Mallows-Stuck method for sim-
ulating skewed stable random variables, Statist. Probab. Lett. 28, 165-
171.
[410] Weron, K. and Kotulski, M. (1996). On the Cole-Cole relaxation func-
tion and related Mittag-Leffler distributions, Physica A 232, 180-188.
[411] Wilcoxon, F. (1945). Individual comparisons by ranking methods,
Biometrics Bulletin 1, 80-83.
[412] Wilson, E.B. (1923). First and second laws of error, J. Amer. Statist.
Assoc. 18, 841-852.
[413] Wilson, E.B. and Hilferty, M.M. (1931). The distribution of chi-
square, Proc. Nat. Acad. Sci. 17, 684-688.
[414] Yamazato, M. (1978). Unimodality of infinitely divisible distribution
functions of class L, Ann. Probab. 6, 523-531.
[415] Yen, V.C. and Moore, A.H. (1988). Modified goodness-of-fit test for
the Laplace distribution, Comm. Statist. Simulation Comput. 17, 275-
281.
[416] Yeo, G.F. and Milne, R.K. (1989). On characterizations of exponen-
tial distributions, Statist. Probab. Lett. 7, 303-305.
[417] Younes, L. (2000). Orthogonal expansions for a continuous ran-
dom variable with statistical applications, Ph.D. Thesis, University
of Barcelona, Barcelona, Spain.
[418] Zeckhauser, R. and Thompson, M. (1970). Linear regression with
non-normal error terms, Rev. Econom. Statist. 52, 280-286.
[419] Zellner, A. (1976). Bayesian and non-Bayesian analysis of the regres-
sion model with multivariate Student-t error terms, J. Amer. Statist.
Assoc. 71, 400-405.
[420] Zolotarev, V.M. (1986). One-Dimensional Stable Distributions, Vol-
ume 65 of Translations of Mathematical Monographs, American Math-
ematical Society.
Index
acceptance sampling, 383, 428
α-Laplace distribution, 249, 255,
287, 410, 426
analog of the t-distribution, 55,
144
Anderson-Darling test, 129
asset pricing, 379
association, 339
asymmetric Laplace distribution,
46, 163–215, 223, 224,
235, 247, 249, 279, 288,
419, 420
absolute moments, 165, 166,
175, 193, 194
alternative parameterization,
167
characteristic function of, 167,
168, 171
characterizations of, 165, 178,
198
coefficient of skewness, 177
coefficient of variation, 176
cumulant generating function,
173, 174
cumulants, 174, 175, 179
density, 169, 170
distribution function, 169, 170
entropy, 192–194
estimation, 195, 196
Fisher information matrix, 195,
198, 200, 203, 205, 209,
210, 215
generalized, see generalized asym-
metric Laplace distribu-
tion
geometric infinite divisibility,
187, 188
infinite divisibility, 166, 190
interquartile range, 178
kurtosis, 177
L´evy density of, 187
likelihood function, 196, 202,
208
mean, 166, 175
mean deviation, 176
median, 177
moment generating function,
173
moments, 175, 178
parameterizations, 167–173
quantiles, 177
simulation, 167, 185
standard, 164, 184, 187
variance, 168, 175
asymptotic relative efficiency, 98
asymptotically most powerful, 135,
136, 138
average adjustment interval, 386
Bayes risk, 107
Bernoulli random variable, 36, 77,
181, 182
Bessel function, 31, 44, 45, 48, 50,
399, 407, 426, 431
Bessel function distribution, 45,
186, 224–280, 280, 323,
423
characteristic function, 224
definition, 224
density, 224, 235
mean, 240
moments, 239, 240
multivariate, see multivariate
Bessel distribution
standard, 226
standard density, 236
variance, 240
best linear unbiased estimator, 77,
103
bilateral exponential distribution,
20, 412
bivariate asymmetric Laplace dis-
tribution, 301
characteristic function, 301,
306
definition, 301
densities, 302, 303, 312
moments, 301
simulation, 305, 306
variance-covariance matrix, 301,
312
bivariate hyperbolic distribution,
392
bivariate Laplace distributions, 287,
289
bivariate normal distribution, 297
Black-Scholes formula, 372, 373
breakdown voltage, 357
brittle fracture distribution, 28
broadened median, 82
Brownian motion, 241, 243–245,
248, 281, 282, 370–373,
379, 412
canonical median, 81, 83
Carleman condition, 140
Cauchy distribution, 22, 52, 80,
130, 141, 146, 258, 283,
298, 406, 408
censored sample, 94, 95, 103, 104,
106–111, 114, 121, 126,
150, 403, 405, 406, 408,
410, 424, 429
chi-square distribution, 16, 28–30,
32, 113, 134, 180, 183,
184, 230, 235, 310, 330,
359, 432
class L, 268, 269, 432
code modulation, 351
coefficient of skewness, 24, 177,
218, 280
coefficient of variation, 24, 176
communication theory, 347, 351
complementary Bessel function, 373
completely monotone density, 62
compound Laplace distribution, 152,
153
compound Poisson process, 241,
370
confidence ellipsoid, 332, 333
contaminated Laplace, 380
control chart, 48, 388, 389
convolution of exponential distri-
butions, 165, 180
cosine distribution, 17, 18, 20
Cram´er-Rao lower bound, 85, 86,
111, 219, 281
Cram´er-von Mises test, 129, 130
Cram´er-Wold device, 326
currency exchange, 345, 363, 367,
369, 375, 378
decibel, 352
demand during lead time, 381, 382
detector, 348, 350, 351
diffuse prior, 336, 337
discrete Laplace distribution, 159
discrete Linnik distribution, 276,
409
discrete Mittag-Leffler distribution,
276, 416
discrete stable distribution, 276,
409
distortion measure, 354, 356
domain of attraction, 267
dose response for radiation car-
cinogenesis, 395
double exponential distribution, 9,
16, 20
double geometric distribution, 159
double Pareto distribution, 51
doubly exponential law, 20
duplicate check-sampling, 387
E-M algorithm, 380
economic clock, 370
Edgeworth expansion, 159
efficiency of an estimator, 104
efficient score function, 147
elliptically contoured K-Bessel dis-
tribution, 379
elliptically contoured distribution,
287, 395, 396, 422
elliptically contoured Laplace dis-
tribution, 290, 292, 340,
379
elliptically symmetric distribution,
295, 340, 341, 423
empirical characteristic function,
272, 273
empirical distribution, 271, 365
encoding of analog signals, 351
entropy, 25, 62, 63, 146, 165, 166,
192, 193, 195, 219, 353,
355, 413, 418, 419
environmental sciences, 391
European option, 372
exponential family, 20, 78, 79, 95
exponential integral, 145, 246
exponential mixture, 239, 250, 258,
259, 407, 420
exponential power distribution, 87,
277, 281, 340, 341, 403
multivariate, see multivariate
exponential power distri-
bution
F-distribution, 143, 332, 333
feedback adjustment, 386
financial data, 164, 279, 363, 369,
377, 379, 420, 421, 424
first law of error, 3
first law of Laplace, 6, 9
Fisher information, 80, 85, 116,
117, 146, 195, 198, 200,
203, 205, 209, 210, 215,
220
folded Cauchy density, 257
fractional moments, 52, 268, 274,
420
fracturing of materials, 356
Fredholm integral equation, 387
functions of order statistics, 69
gamma distribution, 24, 39, 40,
57, 152, 161, 224, 229,
230, 235, 237, 243, 323,
371, 388, 422
characterization of, 40
gamma process, 241, 243, 245, 248,
282, 370, 373, 377
gamma white noise, 282
gamma-ray bursts, 393
Gauss-Markov theorem, 330, 333
generalized asymmetric Laplace dis-
tribution, 224, 225, 280,
323
generalized beta distribution, 371
generalized gamma convolution, 160,
323, 407, 430
generalized Gaussian distribution,
349
generalized hyperbolic distribution,
300, 309, 317–319, 325
generalized inverse Gaussian dis-
tribution, 308, 404, 417
generalized Laplace distribution,
225, 236, 237, 277, 280,
311, 340, 363, 371, 376–
378, 411, 416, 423, 429
generalized Linnik law, 272, 275,
426
generating function, 33, 276
geometric infinitely divisible, 59,
187, 190, 324
geometric mean, 11, 139, 423
geometric random variable, 26
geometric stable distribution, 39,
218, 223, 250, 251, 259,
262, 268, 269, 271–273,
275, 278, 300, 308, 322 ,
327, 338, 407, 418–421,
427
characterization of, 192
geometric summation, 26, 32, 33,
36, 165, 191, 251, 253,
326, 342
Gini mean difference, 139
goodness-of-fit test, 129, 135, 372,
432
harmonic mean, 11, 139
Hermite distribution, 381, 382, 405
Hermite polynomial, 274
homogeneous increments, 371
hyperbolic distribution, 215, 216,
223, 288, 299, 300, 309,
311, 317, 318, 358, 363,
369, 391, 392, 407, 410
generalized, see generalized hy-
perbolic distribution
image and speech compression, 352
impulsive noise, 349
incomplete exponential function,
132
incomplete gamma function, 56,
121
indicator function, 172, 246
infinite divisibility, 32, 56, 57, 144,
157, 166, 185, 322, 324,
418, 429
integrated moving average, 386
interest rates, 363, 365, 366, 418
inventory control, 381
inverse Gaussian distribution, 288,
308, 363, 406
generalized, see generalized in-
verse Gaussian distribu-
tion
jump diffusion model, 379
jump function, 246
κ-criterion, 218
Kolmogorov goodness-of-fit test,
134
Kolmogorov statistic, 135
Kolmogorov-Smirnov test, 134, 138
Kotz-type multivariate distribu-
tion, 396
Krein condition, 140, 430
Kronecker product, 297
kurtosis, 140, 153, 177, 218, 280,
341, 366, 369, 374, 406,
415
Lambert function, 355
Laplace distribution, 17
central absolute moments, 24
central moments, 23, 62
characteristic function, 21, 22
classical, 27–31, 43, 47–50, 52,
53
coefficient of skewness, 24, 25
coefficient of variation, 24
conditional inference, 115, 116,
118–121
confidence intervals, 112, 113,
121, 122
cumulant generating function,
23
cumulative distribution func-
tion, 21
density, 19, 21, 22
discrete, see discrete Laplace
distribution
entropy, 25, 64
Fisher information matrix, 80,
91
generalized, see generalized Laplace
distribution
genesis of, 15
goodness-of-fit tests, 129, 134,
135
joint distribution of order statis-
tics, 68
kurtosis, 24, 25
likelihood function, 81
linear estimation, 107
maximum likelihood estima-
tion, see also maximum
likelihood estimation, 88,
91, 94
mean, 21, 24
mean deviation, 24
mean of the sample median,
74, 82
median, 26
midrange, 69, 71
minimal sufficient statistic, 79
moment generating function,
21–23
moments of order statistics,
74, 75, 77
multivariate, see multivariate
Laplace distribution
orthogonal representation, 31
parameters, 19
quantile estimation, 112
quantiles, 25
sample median, 74
standard, 21
standard classical, 21–23, 26,
27, 32, 58, 60, 66, 68, 75–
77
test for location, 131
variance, 17, 21
variance of the sample me-
dian, 74, 75, 82
Laplace motion, 223, 241–249, 282,
323, 363, 369–373
asymmetric, 241, 247–249
compound Poisson approxi-
mation, 241
covariance structure, 243, 244,
248
L´evy-Khinchine representation,
245
self-similarity property, 242
series representation, 246, 249
space-scale parameter, 242, 247
standard, 243–246
symmetric, 241–246
time-scale parameter, 242, 247
trajectories, 241, 243, 246, 248
with drift, 2 42, 243, 247
Laplace noise, 347, 349, 350, 396,
409, 423
Laplace-Weibull mixture, 380, 427
law of the iterated logarithm, 160,
431
least-squares estimator, 273, 329,
330, 332, 334
leptokurtic, 18, 25, 153, 177, 363,
388, 394, 419
L-estimator, 102
L´evy measure, 58, 59, 186, 187,
245, 249, 269, 299, 322–
324, 420
L´evy process, 56, 223, 241, 311,
323, 407, 413
L´evy-Khinchine representation, 57–
59, 186, 245, 269, 430
likelihood ratio test, 127, 128
limits of geometric sums, 36, 188,
253, 325
linear combinations of order statis-
tics, 205
linear unimodal, 316
Linnik distribution, 42, 249–259,
271, 278, 284, 342, 404,
410, 411, 414, 415, 419,
420, 426
absolute moments, 268
characteristic function, 249,
261, 268
densities, 250, 257, 259–266
discrete, see discrete Linnik
distribution
distribution function, 257, 259,
261–263
estimation, 271–275
exponential mixture represen-
tation, 239, 258, 259
generalized, see generalized Lin-
nik law
infinite divisibility, 250
L´evy-Khinchine representation,
269
mixture representations, 257
multivariate, see multivariate
Linnik distribution
non-symmetric, 250
positive, see positive Linnik
law
scale parameter, 249, 273
series expansions, 263
simulations, 250, 257, 270
tail index, 267
Liouville number, 266
locally most powerful, 136
log-gamma distribution, 229
log-Laplace distribution, 30, 212,
276, 280, 364, 392, 395,
419, 431
log-normal distribution, 11, 280,
365, 388, 391, 392
log-return, 371
logistic distribution, 17, 18, 20,
73, 74, 80, 141, 142, 146,
147, 155, 157, 413
characterization of, 155, 413
Lomax distribution, 51
loss of information, 77, 115–119,
403, 404
Marshall-Olkin exponential distri-
bution, 308
maximum entropy principle, 62,
166, 192, 374
maximum likelihood, 81, 84, 88,
91, 95, 96, 148–150, 195,
202, 207, 208, 212, 219,
221, 334, 366, 369, 372,
380, 397, 403, 408, 415,
419, 423, 425, 429–431
maximum likelihood estimation,
81–94
of location parameters, 95
under censoring, 94
mean deviation, 5, 7, 24, 64, 88,
176
mean squared deviation, 5, 386
mean squared error, 331
median law of error, 12, 13
median test, 135, 136, 138
method of moments, 97, 102, 195,
219, 220, 272, 273, 281,
377, 382, 421, 428
midmean, 82
midrange, 69, 71, 73, 74, 104, 105,
107, 155, 157, 199, 413,
414
Mittag-Leffler distribution, 35, 42,
252, 276, 278, 420, 426
discrete, see discrete Mittag-
Leffler distribution
mixture of exponentially distributed
random variables, 182, 428
mixture of Laplace and Gaussian
distributions, 358, 429
mixture of log-normal distributions,
392
mixture of normal distributions,
26, 161, 165, 224, 226,
237, 250, 258, 308, 323,
418, 430
mixture of stable laws, 250, 426
mixture of two Laplace distribu-
tions, 102, 151, 360, 421
mixture of two Student’s t distri-
butions, 360
mode-median-mean inequalities, 218
modified Bessel function, 215, 399
multivariate asymmetric Laplace
distribution, 299–338
characteristic function, 301,
311
conditional distributions, 317–
319
covariance, 312, 318
definition, 301, 306, 325
densities, 312, 315
geometric infinite divisibility,
324
infinite divisibility, 322
L´evy measure, 299, 322, 323
linear combination of, 301, 316,
319–321
linear regression, 319, 321
marginals, 320
mean, 312, 318
simulation, 299, 305
unimodality, 314, 315
multivariate Bessel distribution, 324,
340, 342
multivariate Cauchy distribution,
298
multivariate exponential power dis-
tribution, 340, 341, 395,
396
multivariate Laplace distribution,
287–338, 340, 342, 396,
419, 420
covariance, 290, 292, 296
density, 287, 290, 292
mean vector, 290
polar representation, 309, 310
simulation, see simulation, mul-
tivariate asymmetric Laplace
distribution
symmetric, see multivariate
symmetric La place dis-
tribution
multivariate Linnik distribution,
287, 296, 342, 404, 426
multivariate symmetric Laplace dis-
tribution, 289–296
Nakagami distribution, 377
navigation, 345, 347, 359, 360, 404,
415
nearly instantaneous companding,
351
Neave-Tukey quick test, 138
neutral Laplace estimator, 154
Neyman-Pearson lemma, 348
Neyman-Pearson optimal detector,
348, 349
non-Gaussian noise, 349, 424
nonparametric tests of location,
135–137
normal characteristic function, 191
normal distribution, 3
characterization of, 86
ν-stable law, 279, 320, 420
ocean engineering, 345, 359
operating characteristic curve, 383
optimal quantizer, 352, 355
option pricing, 373, 419, 422, 423
order statistics, 65–78, 149, 196,
403–406, 408, 410, 413,
418, 422, 427, 428
joint distribution of two, see
joint distribution of or-
der statistics, Laplace dis-
tribution
orthogonal representation, 31
Paretian stable distribution, 33,
279, 364, 365, 370
Pareto distribution, 29, 30, 32, 51,
143, 181, 184, 229, 284,
414, 415
Pascal distribution, 231
Pascal-stable distribution, 231
Pearson Type VII distribution, 152,
388
platykurtic, 25, 153
Poisson approximation, 245, 249
Poisson process, 246, 249, 379, 394
Polya-type characteristic function,
283
positive Linnik law, 276, 409
power function distribution, 143
prediction interval, 107
prior, 339
product of two independent Laplace
variates, 49
product of two independent nor-
mal variables, 50
product-moment coefficient, 231,
426
pure jumps process, 241
quality control, 381, 428
Rademacher sequence, 246, 249
random fluctuations of response
rate, 393, 423
random summation, 165, 299, 300,
325, 326, 413
range, 69, 71, 72
rank sum statistic, 135, 136
rank test, 147
ratio of two independent Laplace
variates, 50
Rayleigh distribution, 27, 375
real estate data, 380
reciprocal property, 258, 283
regression, 299, 319, 321, 322, 329,
331–334, 336, 411, 419,
427, 432
repeated measurements, 395
R-estimator, 147, 414
restricted maximum likelihood, 96
risk neutral pr ocess, 372
S&P 500 Index, 372, 376
sample coefficient of skewness, 366
sample kurtosis, 366
sample mean deviation, 366
sample median, 3, 11–13, 65, 67,
69, 74, 75, 144, 146–150,
155, 409, 413
sample quantile, 110
Sargan distribution, 277, 284, 417
scale and location invariant test,
128, 431
second law of error, 3
self-decomposability, 59, 181, 230,
250, 268, 269, 324, 429
share market returns, 369
sign test, 147
signal-to-quantization noise ratio,
352
simplified linear estimator, 107
size data, 392
skew-normal distribution, 164, 405,
414, 422
skewed exponential power distri-
bution, 341
skewed Laplace distribution, 64,
84, 148, 163, 164, 171,
405
slow-mov ing item, 382
Smirnov one-sided statistic, 135
s-ordering, 141
spacings, 110–112
speech recognition, 345
speech signals, 351
spherically symmetric distribution,
298
stability, 32, 165, 191, 231, 251,
254, 255, 279, 325–328,
429
stable distribution, 33, 156, 256,
279, 300, 342, 369, 372,
432
discrete, see discrete stable
distribution
stable process, 371, 416, 427
standard bivariate Laplace distri-
bution, 290, 292, 296, 297
standard exponential distribution,
28
standard gamma distribution, 44
star unimodality, 315
statistical process control, 386
steam generator inspection, 385,
410
stochastic variance, 26, 64, 165,
224, 226, 237, 258, 360,
374, 375, 377, 388, 421
stock price, 369–373
strictly geometric stable distribu-
tion, 36, 42, 249–252, 255,
269, 275, 321, 327
subordinated Brownian motion, 248,
311, 322
subordinated Gaussian process, 370
subordinated model of stock prices,
370
sufficient statistic, 79, 115, 282
sum of two independent Laplace
r.v.’s, 47
tails, 17, 18, 250, 267
t-distribution, 145, 332–334, 371,
388
testing multivariate symmetry, 339,
414
time series, 386, 396, 410
tolerance factor, 122–126
tolerance interval, 122–126, 418,
429
tolerance limit, 122, 125
transfer theorem, 325
Treasury Bill, 372
Treasury bonds, 365, 366
triangular distribution, 17, 18, 20,
84
t-statistic, 53–55, 144, 145, 428
Tukey’s quick test, 138
two-tailed power distribution, 84
two-piece double exponential, 163,
422
two-tailed exponential distribution,
20
underground sampling, 428
underreported data, 364, 414, 415
uniform distribution, 18
uniformly minimum variance un-
biased estimator, 86
uniformly most powerful, 128
Value-at-Risk, 374, 421
van der Waerden test, 136
variance gamma distribution, 225
variance gamma process, 224, 324,
325, 370, 371, 376, 422,
423
v-distribution, 412
vec operator, 297
vec-permutation matrix, 297
vertical density function, 142, 419
volatility, 363, 368, 375, 377, 406,
422
volatility smile, 373, 379, 419
v-spherical distribution, 341
weakest link, 356, 357
Weibull distribution, 130, 347, 358,
380, 427
Wilcoxon test, 136
wind shear, 358, 359, 417
... There are several well-known multivariate distributions that belong to the class of normal variance-mean mixtures. For example, the multivariate-t distribution (Kotz and Nadarajah, 2004), the generalized hyperbolic distribution (McNeil et al., 2005) and the asymmetric Laplace distribution (Kotz et al., 2001) can all be expressed in terms of a multivariate Gaussian random variable (cf. Section 2.4). ...
... is the Mahalonobis distance between x and µ, ν = (2 − p)/2, µ, α ∈ R p are the location and skewness parameters, respectively, and Σ is a p × p covariance matrix (cf. Kotz et al., 2001). It follows that the density of the MSALDs is found by replacing the component density functions in (1) with the density given in (3). ...
Preprint
Mixtures of shifted asymmetric Laplace distributions were introduced as a tool for model-based clustering that allowed for the direct parameterization of skewness in addition to location and scale. Following common practices, an expectation-maximization algorithm was developed to fit these mixtures. However, adaptations to account for the `infinite likelihood problem' led to fits that gave good classification performance at the expense of parameter recovery. In this paper, we propose a more valuable solution to this problem by developing a novel Bayesian parameter estimation scheme for mixtures of shifted asymmetric Laplace distributions. Through simulation studies, we show that the proposed parameter estimation scheme gives better parameter estimates compared to the expectation-maximization based scheme. In addition, we also show that the classification performance is as good, and in some cases better, than the expectation-maximization based scheme. The performance of both schemes are also assessed using well-known real data sets.
... Вибірки генеруючого випадкового процесу формуються за законом розподілу Лапласа [17]. Ця модель негаусівського збурення характеризується так званими «важкими хвостами», що у свою чергу дає змогу представити складні моделі пасивних завад, прикладом яких є імпульсна завада. ...
Article
Радіолокаційне виявлення сигналів, що містять корисну інформацію про об’єкти спостереження, являє собою комплексний та багатофункціональний процес, що об’єднує розв’язок різних задач, однією з яких є виявлення рухомих цілей на тлі пасивних завад. ЇЇ вирішення ґрунтується на фундаментальній ідеї, яка спирається на застосування ефекту Допплера, відповідно до чого, було розроблено методи селекції рухомих цілей, які полягають в зміні частоти сигналу, відбитого від рухомого об’єкту. Традиційні алгоритми, черезперіодного віднімання пасивних завад систем СРЦ є ефективним, коли завади стаціонарні, проте мають недоліки, викликані завадами зі складним частотним спектром. В результаті для підвищення ефективності систем радіолокаційного виявлення було запропоновано та досліджено адаптивні алгоритми та показано їх переваги. Однак потреба в подальшому опрацюванні та удосконаленні існуючих систем та методів,залишається і досі актуальною, та пояснюється багатьма аспектами одним з яких є те, що системи радіолокаційного виявлення змушені функціонувати в умовах завад різної природи, що не завжди можна описати гаусівською моделлю. Таким чином, у роботі увага приділяється проблемі виявлення радіолокаційних сигналів, відбитих від рухомих цілей, на фоні завад, які описуються негаусівським розподілом, а також досліджується забезпечення стійкості алгоритму виявлення. Випадковість виникнення завади в процесі спостереження, дозволяє для їх математичного представлення використати авторегресійну модель, що генерується збуренням описаними моделлю Лапласа. В результаті виконується синтез локально-оптимального вирішального правила для виявлення сигналу відомої форми на тлі авторегресійної завади, який полягає у знаходженні максимуму відповідно до сигнального параметра і вектора параметрів завади отриманого відношення правдоподібності, що являє собою відношення висунутих щодо отриманої вибірки гіпотез. 
Стійкість алгоритму виявлення забезпечується шляхом оцінки невідомих параметрів шумового процесу при застосуванні емпіричного байєсівського підходу. В процесі роботи також досліджено працездатність синтезованого алгоритму при дії негаусівських завад, що описується моделлю K-розподілу. Для розуміння ефективності запропонованого алгоритму виявлення виконується статистичного моделювання. Результати моделювання підтверджують ефективність синтезованого робастного алгоритму перед неробастним, дієвість якого проявляються при наявності імпульсних завад із збільшенням ймовірності їх виникнення, відповідно він більш стійкий до виникнення хаотичних імпульсних завад, тоді як не робастний алгоритм дає значні похибки, що призводить до погіршення виявлення корисного сигналу.
... Table 7 outlines the various parameters used in our implementation. The higher privacy guarantee provided by Laplace noise generated using gamma distribution [51] and the computationally-efficient XOR operation-based homomorphic encryption [23] makes it an attractive alternative to DDP-BL. ...
Article
Full-text available
Personal data collected from today's wearable sensors contain a rich amount of information that can reveal a user's identity. Differential privacy (DP) is a well-known technique for protecting the privacy of the sensor data being sent to community sensing applications while preserving its statistical properties. However, differential privacy algorithms are computationally expensive, requiring user-level random noise generation which incurs high overheads on wearables with constrained hardware resources. In this paper, we propose SeRaNDiP-which utilizes the inherent random noise existing in wearable sensors for distributed differential privacy. We show how various hardware configuration parameters available in wearable sensors can enable different amounts of inherent sensor noise and ensure distributed differential privacy guarantee for various community sensing applications with varying sizes of populations. Our evaluations of SeRaNDiP on five wearable sensors that are widely used in today's commercial wearables-MPU-9250 accelerometer, ADXL345 accelerometer, BMP 388 barometer, MLP 3115A2 barometer, and MLX90632 body temperature sensor show a 1.4X-1.8X computation/communication speedup and 1.2X-1.5X energy savings against state-of-the-art DP implementation. To the best of our knowledge, SeRaNDiP is the first framework to leverage the inherent random sensor noise for differential privacy preservation in community sensing without any hardware modification.
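The gamma-based generation of Laplace noise mentioned in the excerpt above exploits the infinite divisibility of the Laplace law: Laplace(0, b) noise can be assembled from per-party differences of Gamma(1/n, b) variates, so no single participant has to draw the full noise. A minimal sketch (function and parameter names are illustrative, not the cited papers' API):

```python
import numpy as np

def distributed_laplace_shares(n_parties, b, rng=None):
    """Per-party noise shares G1 - G2 with G1, G2 ~ Gamma(1/n, b).
    Summing the n shares yields exactly Laplace(0, b) noise, because
    n independent Gamma(1/n, b) variates sum to Gamma(1, b) = Exp(b),
    and Laplace(0, b) is the difference of two such exponentials."""
    rng = np.random.default_rng(rng)
    g1 = rng.gamma(1.0 / n_parties, b, size=n_parties)
    g2 = rng.gamma(1.0 / n_parties, b, size=n_parties)
    return g1 - g2

# aggregated noise from 10 parties, scale b = 1
noise = distributed_laplace_shares(10, 1.0, rng=0).sum()
```

The aggregated sum has mean 0 and variance 2b², exactly as a centrally drawn Laplace(0, b) variate would.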
... Inspired by previous works [22,62], we use four methods to generate the centers: (i) UNI, where centers are uniformly sampled from the dataset. (ii) LAP, where centers are sampled from the Laplace distribution [34]. We set the location and scale parameters, i.e., and , to | |/2 and | |/10 respectively, where is the object set. ...
Preprint
Full-text available
Spatial objects often come with textual information, such as Points of Interest (POIs) with their descriptions, which are referred to as geo-textual data. To retrieve such data, spatial keyword queries that take into account both spatial proximity and textual relevance have been extensively studied. Existing indexes designed for spatial keyword queries are mostly built based on the geo-textual data without considering the distribution of queries already received. However, previous studies have shown that utilizing the known query distribution can improve the index structure for future query processing. In this paper, we propose WISK, a learned index for spatial keyword queries, which self-adapts for optimizing querying costs given a query workload. One key challenge is how to utilize both structured spatial attributes and unstructured textual information during learning the index. We first divide the data objects into partitions, aiming to minimize the processing costs of the given query workload. We prove the NP-hardness of the partitioning problem and propose a machine learning model to find the optimal partitions. Then, to achieve more pruning power, we build a hierarchical structure based on the generated partitions in a bottom-up manner with a reinforcement learning-based approach. We conduct extensive experiments on real-world datasets and query workloads with various distributions, and the results show that WISK outperforms all competitors, achieving up to 8x speedup in querying time with comparable storage overhead.
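The LAP center-generation scheme quoted above, with Laplace location |O|/2 and scale |O|/10 over the object set O, might be sketched as follows; treating the draws as object indices clipped to the valid range is an assumption on our part:

```python
import numpy as np

def lap_centers(num_objects, k, rng=None):
    """Sample k partition-center indices from a Laplace distribution
    with location |O|/2 and scale |O|/10, rounded and clipped to the
    index range of the object set O."""
    rng = np.random.default_rng(rng)
    draws = rng.laplace(loc=num_objects / 2, scale=num_objects / 10, size=k)
    return np.clip(np.round(draws), 0, num_objects - 1).astype(int)

# 50 centers over a set of 1000 objects
centers = lap_centers(1000, 50, rng=0)
```

Relative to UNI, this concentrates centers near the middle of the object set, with exponentially decaying probability toward the extremes.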
... The heterogeneity in data populations and associated quantities of interest has inspired the use of a variety of kernels, each with its own advantages and characteristics. Gaussian kernels are particularly popular in various inferential problems, especially those related to density estimation and clustering analysis (Kotz et al., 2001; Bailey et al., 1994; Roeder & Wasserman, 1997; Robert, 1996; Banfield & Raftery, 1993). ...
Preprint
Dirichlet Process mixture models (DPMM) in combination with Gaussian kernels have been an important modeling tool for numerous data domains arising from biological, physical, and social sciences. However, this versatility in applications does not extend to strong theoretical guarantees for the underlying parameter estimates, for which only a logarithmic rate is achieved. In this work, we (re)introduce and investigate a metric, named Orlicz-Wasserstein distance, in the study of the Bayesian contraction behavior for the parameters. We show that despite the overall slow convergence guarantees for all the parameters, posterior contraction for parameters happens at almost polynomial rates in outlier regions of the parameter space. Our theoretical results provide new insight in understanding the convergence behavior of parameters arising from various settings of hierarchical Bayesian nonparametric models. In addition, we provide an algorithm to compute the metric by leveraging Sinkhorn divergences and validate our findings through a simulation study.
... The function G X+Y (x) corresponds to the well-known asymmetric two-sided exponential distribution (asymmetric Laplace distribution), see, for example, [1,6]. Now, passing to old notation with the account of relations ...
Article
Full-text available
New two-sided bounds are proposed for the ruin probability in the classical risk process.
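For reference, the asymmetric two-sided exponential (asymmetric Laplace) density referred to in the excerpt can be written, in one common parametrization with location $\theta$ and rates $\lambda_1, \lambda_2 > 0$ governing the left and right tails, as

```latex
f(x) = \frac{\lambda_1 \lambda_2}{\lambda_1 + \lambda_2}
\begin{cases}
e^{-\lambda_2 (x - \theta)}, & x \ge \theta, \\
e^{\lambda_1 (x - \theta)}, & x < \theta,
\end{cases}
```

which integrates to one and reduces to the classical symmetric Laplace density when $\lambda_1 = \lambda_2$.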
Article
In this paper, the differentially private synchronization control (DPSC) problem is investigated for a class of nonlinear complex networks against possible eavesdropping attacks. The complex network is composed of a number of coupled nodes whose nonlinear dynamics is remotely controlled via a communication link. The initial state of the underlying network serves as the private information to be protected from potential eavesdroppers. First, the concepts of mean-square synchronization and differential privacy are introduced to quantify, respectively, the performance of synchronization and the level of privacy. A new DPSC scheme is then proposed, which consists of a noise generator, a pair of input and output noise injectors, and a synchronization controller, in order to guarantee ultimate mean-square synchronization with preserved differential privacy. In addition, the underlying co-design issue is thoroughly discussed for the input/output noise injector and the synchronization controller. Finally, a series of simulations is carried out to show the effectiveness of the proposed DPSC scheme.
Article
Full-text available
Introduction: Climate change and global warming are among today's major concerns; they have important effects on rain, evaporation, runoff, and ultimately water supply, causing these parameters to intensify or weaken, increasing the occurrence of severe weather events, and reducing available water, which causes irreparable damage. Evapotranspiration is an important component of the hydrological cycle that is affected by various factors such as air temperature, relative humidity, wind speed, and the number of sunshine hours, and these factors are in turn affected by climate change. Given the climate change observed in the country in recent decades, it is important to study the trends of climatic parameters and their role in evapotranspiration in order to apply management methods that reduce evaporation from water resources. The Mann-Kendall trend test is an often-used method for examining changes in a data time series, but it only expresses changes in the center of the series, so Quantile and Bayesian multiple regression methods are used to study the trend of changes in different parts of the data series and to investigate the role of various parameters on a specific parameter. Therefore, the purpose of the present research is to study and compare the trends of evaporation and the climatic parameters affecting it, and to determine the role of these factors in evaporation, using Quantile and Bayesian multiple regression methods at the Hashemabad Gorgan station located in Golestan province. Materials and methods: In the first step, the meteorological time series of evaporation and the factors affecting it, including average temperature, relative humidity, sunshine hours, and wind speed, were prepared for the Hashemabad Gorgan synoptic station over a statistical period of more than 30 years (1984-2018), and seasonal series of these data were formed.
The non-parametric Mann-Kendall test was performed to investigate the trend of changes in evapotranspiration and the factors affecting it, and then Quantile and Bayesian regression were used to investigate the changes in various quantiles of the data series and to determine the role of changes in different values of each meteorological factor on evaporation. Results and discussion: Investigation of the trend in the evaporation data series and the factors affecting it based on the Mann-Kendall test shows a decreasing trend in evaporation only for winter, while the smallest changes in trend across quantiles also occur in this season; in summer the increasing trend in evaporation has the highest slope, ranging from 0.66 to 1.14% for the lower and upper quantiles, respectively. In spring, the increasing slope of 0.75 in the upper quantiles gradually changes to a decreasing slope of -0.72. In autumn, the lower quantiles have a steeper slope, down to -0.66, and in winter the quantiles have the smallest slopes, between -0.14 and -0.08. The results of investigating the factors affecting evaporation also show that changes in temperature have the greatest share in the changes of daily evaporation, and the higher the quantile, the steeper the increasing slope. The relationship between relative humidity and evaporation follows a decreasing slope and ranks second after temperature. Finally, it can be concluded that at the Hashemabad station of Gorgan, the Mann-Kendall trend test could not detect the significant changes that have occurred in the evaporation trend, but these changes are revealed best by the Quantile regression method. In summer, which is the driest and hottest season of the year, the increase in evaporation is more intense; owing to summer cultivation in this region, water consumption in agriculture has increased in recent years and will increase further if this trend continues.
Conclusion: Given that climate change may not have occurred only in the average value of a data series, and that extreme events might have occurred in some parts of the series, it is necessary to study different parts of the data series using methods such as Quantile and Bayesian multiple regression, and the results of this research emphasize this point. Also, given the necessity of studying evapotranspiration when applying management methods for water resources, useful and practical results can be obtained by studying the changes in different ranges of evapotranspiration, especially its extreme and high values, as well as the effects of changes in different ranges of climatic parameters on evaporation.
Article
Eight nonparametric tests are examined in a small-sample setting. Their power functions are compared with the locally most powerful (LMP) linear rank test, primarily to see over which range the LMP test retains good power. All tests are two-sample tests, the populations are governed by the double exponential distribution, and the sample sizes are m = 5, n = 5, and m = 5, n = 4. The LMP test performs uniformly well, even under nonlocal alternatives, while its asymptotically equivalent counterpart, the median test, performs rather poorly with these small sample sizes.
Article
Using a simple characterization of the Linnik distribution, discrete-time processes having a stationary Linnik distribution are constructed. The processes are structurally related to exponential processes introduced by Arnold (1989), Lawrance and Lewis (1981) and Gaver and Lewis (1980). Multivariate versions of the processes are also described. These Linnik models appear to be viable alternatives to stable processes as models for temporal changes in stock prices.
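For context, the Linnik law mentioned above is usually specified through its characteristic function; with index $\alpha \in (0, 2]$ and scale $\sigma > 0$,

```latex
\varphi_{\alpha,\sigma}(t) = \frac{1}{1 + \sigma^{\alpha} |t|^{\alpha}}, \qquad t \in \mathbb{R},
```

which reduces to the characteristic function of the symmetric Laplace distribution when $\alpha = 2$; for $\alpha < 2$ the tails are heavier, which is what makes these processes candidates for modeling stock-price changes.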
Book
1 Introduction.- 2 Basic properties.- 2.1 Moments and cumulants.- 3 Related distributions.- 3.1 Normal approximations.- 3.2 Powers and logarithms of generalized inverse Gaussian variates.- 3.3 Products and quotients of generalized inverse Gaussian variates.- 3.4 A generalized inverse Gaussian Markov process.- 3.5 The generalized hyperbolic distribution.- 4 Maximum likelihood estimation.- 4.1 Estimation for fixed ?.- 4.2 On the asymptotic distribution of the maximum likelihood estimate for fixed ?.- 4.3 The partially maximized log-likelihood for ?, estimation of ?.- 4.4 Estimation of ? when ? and ? are fixed.- 4.5 Estimation of ? when ? and ?>0 are fixed.- 5 Inference.- 5.1 Distribution results.- 5.2 Inference about ?.- 5.3 Inference about ?.- 5.4 One-way analysis of variance.- 5.5 A regression model.- 6 The hazard function. Lifetime models..- 6.1 Description of the hazard function.- 7 Examples.- 7.1 Failures of airconditioning equipment.- 7.2 Pulses along a nerve fibre.- 7.3 Traffic data.- 7.4 Repair time data.- 7.5 Fracture toughness of MIG welds.- References.- List of symbols.
Article
In this chapter, (continuous) positive Linnik and (nonnegative integer valued) discrete Linnik random variables are discussed. Rates of convergence and first terms of both the Edgeworth expansions and the expansions in the exponent of the distribution functions of certain sums of such random variables with nonnegative strictly stable as well as discrete stable limit laws are considered.
Article
The joint density function of the sample mean and sample variance is recursively derived for samples from a population with density function f, where f(x) > 0 everywhere, f is everywhere continuous, and f has certain integral properties. For populations where f does not have these integral properties, this joint density is an approximation. This joint density function is used to derive the density function of the t-statistic for samples from f. The family of generalized normal density functions is used as an example. The approximation for the t-density is given for that family. For some specific members of the family, the true α probabilities for the approximations are tabled and compared to the results of a simulation study.
Article
The first-order autoregressive semi-Mittag-Leffler (SMLAR(1)) process is introduced and its properties are studied. As an illustration, we discuss the special case of the first-order autoregressive Mittag-Leffler (MLAR(1)) process.