Tukey’s Paper After 40 Years
Colin MALLOWS
Avaya Labs
Basking Ridge, NJ 07920
(colinm@avaya.com)
The paper referred to is “The Future of Data Analysis,” published in 1962. Many authors have discussed
it, notably Peter Huber, who in 1995 reviewed the period starting with Hotelling’s 1940 article “The
Teaching of Statistics.” I extend the scope of Huber’s remarks by considering also the period before 1940
and developments since 1995. I ask whether statistics is a science and suggest that to attract bright students
to our subject, we need to show them the excitement and rewards of applied work.
KEY WORDS: Data analysis; Massive data; Statistical science; University College London; Zeroth
problem.
This invited paper and the discussions were organized by Vi-
jay Nair. The paper and the discussions by Andreas Buja and
James Landwehr were originally presented at the conference
on the “Future of Data Analysis” in honor of Jon Kettenring at
Avaya Labs in October 2005.
1. INTRODUCTION
Tukey’s 1962 paper (Tukey 1962) redefined our subject. It
introduced the term “data analysis” as a name for what applied
statisticians do, differentiating this from formal statistical infer-
ence. Tukey said:
Large parts of data analysis are inferential in the sample-to-population sense,
but these are only parts, not the whole. Large parts of data analysis are incisive,
laying bare indications which we could not perceive by simple and direct ex-
amination of the raw data, but these too are parts, not the whole. Some parts of
data analysis ... are allocation, in the sense that they guide us in the distribution
of effort.... Data analysis is a larger and more varied field than inference, or
incisive procedures, or allocation.
In an early section of the paper, Tukey asked: “How can new
data analysis be initiated?” He suggested four ways:
1. We should seek out wholly new questions to be answered.
2. We need to tackle old problems in more realistic frame-
works.
3. We should seek out unfamiliar summaries of observa-
tional material, and establish their useful properties.
4. Still more novelty can come from finding, and evading,
still deeper lying constraints.
Tukey’s paper has been enormously influential. In dis-
cussing it, I am at a great disadvantage, because many distin-
guished authors have addressed this topic, particularly in the
1997 Festschrift celebrating John’s 80th birthday (Brillinger,
Fernholz, and Morgenthaler 1997). Peter Huber gave a review
of developments from Hotelling’s 1940 article “The Teaching
of Statistics” (Hotelling 1940) through Tukey’s 1962 paper, the
Madison (1967), Edmonton (1974), and Berkeley (1983) con-
ferences, and the 1984 David report “Renewing U.S. Mathe-
matics: Critical Resource for the Future.” Huber then presented
his own thoughts: “where are we now in 1995?”, “current view
of the path of statistics,” “where should we go?”, and “where
will we go?”. His final comment was:
Statistics will survive and flourish through the sheer mass of applications in
most diverse fields. But whether the field as such will retain coherence is an al-
together different question; the answer is up to us statisticians and data analysts,
and to the actions we are going to take.
I cannot hope to better this masterly survey of 55 years of de-
velopment. Perhaps the most useful thing I can do is to urge you
to reread Tukey’s paper, and Huber’s commentary. I must also
draw attention to an important 2002 National Science Foun-
dation report Statistics: Challenges and Opportunities for the
Twenty-First Century (National Science Foundation 2002). So
what can I hope to do here? I think the most I can hope to do
is to give my assessment of where we are, and to point to the
areas that seem most crucial to me.
2. UNIVERSITY COLLEGE LONDON
I start by extending the scope of Huber’s review, beginning
with some personal reminiscences. When I first went to Uni-
versity College London (UCL) in 1948 to study mathematics,
I had not heard of statistics (the discipline), although for sev-
eral years my father (who was what in England is called a Po-
lice Chief Inspector; there seems to be no equivalent rank here)
had been responsible for developing a system for recording and
analyzing data on traffic accidents. He introduced a degree of
mechanization into the process by recording the data on edge-
punched cards that could be sorted using a knitting needle. He
also developed some graphical displays. So perhaps some of my
aptitude was inherited. I was very fortunate to find that UCL
had the world’s premier Statistics Department, founded by Karl
Pearson in 1911 as an outgrowth of his Biometric Laboratory,
which he had established in 1895. As a junior in the Mathemat-
ics Department, I was required to take some lectures outside
the department, and, serendipitously, I opted to take some lec-
tures from F. N. David, who succeeded (as she did with many
others) in awakening my interest and enthusiasm for our sub-
ject. I transferred to Statistics for my final year and stayed on to
work for my doctorate. At that time, Egon Pearson was the se-
nior professor; the staff included F. N. David and N. L. Johnson,
who jointly supervised my thesis work, and H. O. Hartley was
there for several years. Later, of course, these three emigrated
to the United States.
I cannot overemphasize the sense of history that was perva-
sive at UCL at that time. The spirit of Karl Pearson was still
Figure 1. John Tukey.
present, although he had been dead for 14 years; the Neyman–
Pearson theory was only 20 years old, but much of the theory
of Normal-theory tests had been worked out; and Bayesian-
ism had been in decline (due to the antipathy of Fisher and
the impact of Neyman and the confidence-interval approach)
for 25 years. This was before Dennis Lindley’s conversion from
frequentist to Bayesian; Jimmie Savage had not yet written his
“Foundations of Statistics” book. Box had not yet applied ex-
perimental design concepts to response surfaces. At that time a
computer was a young woman employed to sit at a mechanical
desk calculator. H. O. Hartley lectured on numerical methods
and showed us UCL’s collection of ancient mechanical calcu-
lating machines. David Cox was at Birkbeck College, just down
the street from UCL, and M. G. Kendall was a professor in the
London School of Economics.
Hanging in the corridors of the department at UCL were
a dozen or so framed aphorisms, quotations from such great
figures of the past as Charles Darwin, Francis Galton, and
Florence Nightingale. These mottoes were all taken down when
Dennis Lindley became professor. In pride of place in the main
classroom was an enormous two-way contingency table, show-
ing heights of fathers and sons, dating from the time of Galton’s
development of regression in the 1880s. The journal Biometrika
was published by UCL; it had been founded in 1901 by Karl
Pearson.
The sense was that this was the place where modern statistics
began; where Galton invented regression, where Karl Pearson
invented the correlation coefficient and his family of distrib-
utions and the chi-squared test, and where Egon Pearson and
Jerzy Neyman invented the theory of hypothesis tests. Gosset
(“Student”) had been an intimate of Karl Pearson and had in-
vented the t-test. It was impossible to forget that much of the
theory developed out of questions raised in biometrics, which
itself was stimulated by Darwin’s theory of 90 years earlier.
Statistics was basically a descriptive subject (as all sciences are
to begin with). So Karl Pearson’s family of distributions was
a major contribution, as was his chi-squared test of goodness
of fit. Experimental design was a subject of use in agriculture.
Egon Pearson was friendly with Walter Shewhart, and had vis-
ited the United States in 1931. Some of the data that we studied
were drawn from Shewhart’s experience.
3. R. A. FISHER
When Karl Pearson died in 1936, the Statistics Department
was split in two, and Fisher became chairman of a new Eu-
genics Department, upstairs from the Statistics Department. He
had moved out some years before I got there. Of course, Fisher
had been stimulated by his study of the many decades of exper-
iments at the agricultural research station, Rothamsted. The
theory of statistics had been given a tremendous boost by his
1922 and 1925 papers in which he set out the concept of a
mathematical specification of a statistical problem, which led
to all of the machinery of sufficiency, efficiency, and so on.
Fisher identified three problems: (1) problems of specification,
(2) problems of estimation, and (3) problems of distribution.
Nowadays we would rephrase the second and third of these
problems as being those of choosing one or more methods
of analysis, and studying these methods. In the three vol-
umes Breakthroughs in Statistics, edited by Kotz and Johnson
(1992, 1997), by my count 3/4 of the papers are devoted to
elaborations of Fisher’s formulation of statistics as a theoreti-
cal subject. They are concerned with the mathematical analysis
of a statistical specification, not of a statistical problem per se.
Fisher’s first problem can be criticized as being too restricted in
scope, because it does not allow for the possibility that the cor-
rect formulation of the problem may not be understood when
the analysis begins. In my Fisher lecture (Mallows 1998), I sug-
gested that we need to consider a “zeroth problem,” in which we
consider what the relevant population is, what the relevant data
are, and how they relate to the purpose of the statistical study.
Later I draw attention to what I call the “fourth problem.”
Fisher’s approach made theoretical statistics into a mathe-
matical subject, and the accident of Normal theory being so el-
egant led to tremendous developments in the theory. Probability
theory was by this time a respectable subject, due to the efforts
in the 1930s of Kolmogorov, Lévy, Cramér, and later Feller and
Loève. There was the sense that the basic concepts of statistical
theory had been discovered and that all that remained was ap-
plying the theory to more sources of data and working out the
details.
Now we realize that this was an illusion. Fisher had created
a mathematical theory that was admirably suited to providing
subjects for doctoral theses containing many theorems.
Figure 2. Sir Ronald Aylmer Fisher.
As Leo Breiman remarked at the Neyman–Kiefer conference
in 1983, if all you have is a hammer, every problem looks like
a nail. (This observation was not original with Breiman, of
course.) So for awhile, every statistical problem looked like a
hypothesis-testing problem. Nowadays we have the Bayesian
sledgehammer, which is guaranteed to make an impression on
every problem, but only after it has been cast into a formal
shape.
4. HAROLD HOTELLING
Hotelling’s 1940 paper, reprinted in Statistical Science
in 1988, argued two things: first, that statistics had assembled
a considerable body of coherent techniques, which meant that
it deserved to be taught in its own department rather than be-
ing dispersed to its applied fields, and second, that because so
much of this body of knowledge was mathematical, these new
statistics departments should be affiliated with departments of
mathematics. This paper was very influential, especially in the
United States. Huber (1997) pointed out that already in 1940
Deming had commented on Hotelling’s paper and had remarked
that:
Some of [Hotelling’s] recommendations might be misunderstood. I take it that
they are not supposed to embody all that there is in the teaching of statistics,
because there are many other neglected phases that ought to be addressed....
The modern student, and too often his teacher, overlook the fact that such a
simple thing as a scatter diagram is a more important tool of prediction than the
correlation coefficient.... Above all, the statistician must be a scientist.
Please remember this last remark; I will return to it later.
One of the results of the mathematization of the subject
was that ambitious and talented young professors, under pres-
sure to publish scholarly research, found that the easiest way
to do so was to write papers that could appear in the Annals
of Mathematical Statistics. Working on applications was time-
consuming and did not readily lead to research papers.
Particularly in this country, this led to an emphasis on math-
ematical derivations and proofs. Asymptotics became a major
industry. The Annals became uninteresting and unintelligible to
most applied statisticians. The great power of Fisher’s formu-
lation was that it allowed statistical theory to develop divorced
from application. By 1962, the time was ripe for a new vision,
which Tukey’s paper provided.
Figure 3. Harold Hotelling.
5. JOHN TUKEY
In his 1962 paper “The Future of Data Analysis,” Tukey ar-
gued that statistics should not be thought of as a subfield of
mathematics, and that statistics is more comprehensive than for-
mal inference. It was this paper that introduced the term “data
analysis,” which has largely replaced the term “applied statis-
tics.”
Here are two more quotations from this eminently quotable
paper:
To the extent that pieces of mathematical statistics fail to contribute, or are not
intended to contribute ... to the practice of data analysis, they must be judged
as pieces of pure mathematics, and criticized according to its purest standards.
Quoting Martin Wilk: We must teach an understanding of why certain sorts of
techniques ... are indeed useful.
Tukey and others (notably George Box, at the 1967 Madison
conference) argued that statisticians should aspire to be first-
rate scientists, rather than second-rate mathematicians. All
commentators have pointed to education as the key; education
in schools, exposing young students to data and its display, ed-
ucation in service courses, and education in graduate schools
of statistics. Students need to learn how to be scientists, not
just technicians. The problem is how do we attract the brightest
students to our subject? We all, I think, understand the excite-
ment of our subject, its intellectual challenge, and the reward
that comes from an insightful analysis of a statistical problem.
How to convey this to a bright student who has some analytical
aptitude but is attracted by the glamor of pure science (or math)
or the promise of riches on Wall Street?
Tukey’s approach can be criticized; his 1977 EDA book
(Tukey 1977) discusses the methods of exploratory data analy-
sis, but says nothing about how to use these methods. (In his
Preface he pointed out that he was presenting examples, not
case studies.)
6. 1962–1995
Between 1962 and 1995 we have the era described by Huber;
refer to his paper for an insightful commentary. Highlights in-
clude a 1967 Conference on the Future of Statistics at Madi-
son [edited by Watts (1968)], a 1974 Conference on Directions
for Mathematical Statistics at Edmonton [edited by Ghurye
(1975)], the 1983 Neyman–Kiefer Conference at Berkeley
[edited by LeCam and Olshen (1985)], and the 1984 David
report titled “Renewing U.S. Mathematics.”
At the three conferences, much attention was paid to the
impact of the computer, which was beginning to be felt. But
more general topics were also considered. At the 1967 con-
ference, Tukey asked: “Is statistics a computing science?” He
commented:
Trying to be certain about uncertainty is a phenomenon of the present century.
... Some would not call actions or problems statistical unless they explicitly
involve the treatment of uncertainty. A few might even claim them not to be
statistical unless a formal optimization, under specific formal hypotheses, ad-
mittedly underlay what was to be done. Such views, if too widely held, would
have very unfortunate consequences for statistics.
At the 1974 Edmonton Conference, Herb Robbins’ paper
“Wither Mathematical Statistics” drew much attention. I quote:
An intense preoccupation with the latest technical minutiae, and indifference
to the social and intellectual forces of tradition and revolutionary change, com-
bine to produce the Mandarinism that some would now say already characterizes
academic statistical theory and is most likely to describe its immediate future.
Peter Huber argued that “the cause of the apparent mis-
ery was that too many problems in mathematical statistics had
reached maturity and were simply being squeezed dry.” Huber
identified data analysis as a promising growth area. He also
contributed an insightful paper to the Neyman–Kiefer memo-
rial conference in 1983. He argued that statistics evolves along
a spiral path, so that “after some time the focus of concern
returns, although in a different track, to an earlier stage of devel-
opment, and takes a fresh look at business left unfinished dur-
ing the last turn.” Thus the current interest in graphics (made
possible by advances in computing technology) revisits prob-
lems addressed in the nineteenth century, after an interlude in
the twentieth century during which methods based primarily on
models (Fisher, Neyman/Pearson, Wald) attracted more inter-
est.
The 1984 David report on “Renewing U.S. Mathematics” is
of interest to us mainly because of what it did not address. Al-
though it drew attention to developments in probability theory,
and pointed to data handling and analysis as one component of
a rise in mathematics usage, it said very little about statistics.
7. SINCE 1995
What has happened since 1995? Here are some high-
lights. In 1996 a report of the CATS subcommittee of the Na-
tional Research Council on Statistical Software Engineering
identified this new interdisciplinary field (National Research
Council/CATS 1996b). Also in 1996, the same subcommittee
published the proceedings of a workshop on massive
datasets (National Research Council/CATS 1996a). There
Arthur Dempster stated that:
one of the major complaints about statistical theory as formulated and taught
in the textbooks is that it is a theory about procedures. It is divorced from the
specific phenomenon.
Both of these reports draw attention to how our subject is evolv-
ing in new directions.
The 1998 National Science Foundation report (“the Odom
report”) was concerned mainly with mathematics; it formu-
lated three primary activities of mathematicians. For statisti-
cians, these are:
Generating concepts (and also methodologies)
Interacting with areas that use statistics
Attracting and developing the next generation of statisti-
cians.
In 2002 JASA published a series of 52 vignettes that were col-
lected in a volume titled Statistics in the 21st Century (Raftery
et al. 2002). These short review articles highlight important ad-
vances and outline potentially fruitful areas of research.
Also in 2002, the National Science Foundation published a
report titled Statistics: Challenges and Opportunities for the
21st Century, a shortened version of which appeared in Statisti-
cal Science in 2004. This report included a short history of the
development of our subject. It identifies the “core” of our sub-
ject as “the subset of statistical activity that is focused inward,
on the subject itself, rather than outward, towards the needs of
statistics in particular scientific domains.” It argued for several
initiatives aimed at ensuring the health of this core.
In a commentary published in Statistical Science in 2004,
Leo Breiman criticized the NSF report’s emphasis on devel-
oping the “core,” claiming (in my language) that this is a
throwback to Hotelling’s vision of statistics as an academic
subject, with research “... focused on the development of statis-
tical models, methods, and related theory ...” (Breiman 2004).
Breiman pointed out that this emphasis denigrates the way most
important advances in statistics have occurred: not by introspec-
tion, but rather by involvement in challenging problems sug-
gested by different disciplines and data. The report argues that
more funding should be given to activity that is focused inward
rather than outward. Breiman says that:
At a time when statistics is beginning to recover from its “overmathematiza-
tion” in the post World War II years and engage in significant applications in
many areas, the report is a step into the past and not into the future.
8. IS STATISTICS A SCIENCE?
Is statistics a science? I still remember the psychic shock
I felt when I first heard the name of the new IMS journal, Sta-
tistical Science. It sounded pretentious. If statistics is a science,
then what is its subject matter? Physicists study “stuff,” biolo-
gists study life, astronomers study the universe—what do sta-
tisticians study? At the 2002 NSF conference, David Cox was
asked to identify “what is statistics.” His answer was that statis-
tics is the discipline concerned with the study of variability, the
study of uncertainty, and the study of decision making in the
face of uncertainty. Similar definitions have been given many
times. This seems to say that what statisticians study is their
methodology, divorced from applications. In my 1997 Fisher
lecture I criticized this kind of definition. Particularly with large
datasets, the kind of uncertainty that statistical techniques know
how to handle is not the primary issue; the difficulty comes
from the complexity of the problem and the fact that we have
not been “given” a specification on which to rely. The definition
I prefer says that:
Statistics concerns the relation of quantitative data to a real-world problem,
often in the presence of variability and uncertainty. It attempts to make precise
and explicit what the data has to say about the problem of interest.
However we view this issue, the question remains: Is sta-
tistics a science? We all agree that statisticians should act like
scientists, but is statistics itself a science? We are like carpen-
ters, with the Neyman–Pearson hammer, Bayesian formalism,
and now also a collection of powerful and delicate computing
and graphical tools, but we need experience to know when and
how to use them.
Perhaps statistics is the science whose domain is inference,
from data to a substantive problem. In his 1962 paper Tukey
said that:
Three constituents will be judged essential [for constituting a science]:
(a1) intellectual content;
(a2) organization into an understandable form;
(a3) reliance on the test of experience as the ultimate standard of validity.
... Data analysis passes all three tests. I would regard it as a science, one defined
by a ubiquitous problem rather than by a concrete subject.
So is statistics (or, if you prefer, data analysis) a science?
If it is, then it is still a science in the preanalytical stage. We
have barely begun to organize our thinking about how statistical
technology is applied to substantive problems. In 1980 Peter
Walley and I pleaded for more work in this direction. Not much
has happened since then.
In my years at Bell Labs I developed a sincere admiration
for engineers, who have to make things work in the real world.
Statistics is nothing more than a trivial exercise if it does not ad-
dress real-world problems. I again quote John Tukey, who said
(in an interview recorded in the 1997 Festschrift), that “statis-
tics is a pure technology.”
But it is a technology that faces new challenges; to deal
with them, we need new ideas, new theories, and new method-
ologies. In 1995 Daryl Pregibon and I gave a definition of a
massive-data problem. This is not just a problem for which we
have massive data; that could be simply a problem where classi-
cal asymptotic theory is all that is needed. I have not seen many
of those. No, a massive data problem is a massive problem for
which we have massive data. (If we have just the problem with-
out the massive data, it is one for which statistics has little to
offer.) In a massive problem, the difficulty is the complexity.
This point was made repeatedly in the 1996 NRC Workshop
report.
9. AN EXAMPLE
Let me discuss a very simple example. For my last 3 years
with AT&T, I was deeply involved in the parity problem. I pub-
lished a summary of some of the technical problems that arose
in Statistical Science in 2002 (Mallows 2002).
The Telecommunications Act of 1996 mandated that incum-
bent local exchange carriers (ILECs, Bell companies) must pro-
vide (on request, for a fair fee) certain services for incoming
competitive companies (CLECs, such as AT&T, trying to enter
the local telephone market), with these services to be “at least
equal in quality to that provided by the local exchange carrier to
itself.” ILECs routinely collect data on many variables to mon-
itor their performance.
The available data is the monthly reports that the ILEC
prepares, and corresponding reports for the CLECs. A nat-
ural approach for a statistician is to postulate that for each
recorded variable, the ILEC data represent a random sample
from some distribution, whereas the CLEC data represent a
sample from a possibly different distribution, the problem be-
ing to test whether these distributions are the same. A permu-
tation test seems perfectly suited for the purpose, because its
null hypothesis is exactly that the two distributions are identi-
cal, without relying on any assumption as to their shape. I tes-
tified to this effect before several state commissions, and my
argument was accepted as valid. I argued that even with as few
as one CLEC observation, on a variable where large values are
bad, if this single CLEC observation is larger than each
of, say, 99 ILEC observations, then this is strong evidence (at
the 1% level) that discrimination against the CLEC customer
has occurred. Such an argument would be easy to present to a
statistics student. Note that there is no possibility of a Bayesian
approach to this problem, because in addition to the fact that we
do not know the distribution shapes, the ILEC will not accept
any nonzero value for the prior probability that discrimination
has occurred. What is needed is a decision procedure that will
determine whether there was or was not discrimination. So a
test procedure seems appropriate.
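To make the permutation argument concrete, here is a minimal sketch in Python (the data, names, and repair-time interpretation are hypothetical illustrations, not taken from the parity proceedings). It tests the null hypothesis that the pooled ILEC and CLEC values are exchangeable, using the difference in means as the test statistic, and it reproduces the one-in-a-hundred argument when a single CLEC value exceeds all 99 ILEC values.

```python
import numpy as np

def permutation_pvalue(ilec, clec, n_perm=9999, seed=0):
    """One-sided permutation test of the null hypothesis that the pooled
    ILEC and CLEC values come from one distribution (are exchangeable).
    Large values are bad, so the alternative is that CLEC values run larger."""
    rng = np.random.default_rng(seed)
    ilec, clec = np.asarray(ilec, float), np.asarray(clec, float)
    pooled = np.concatenate([ilec, clec])
    n_clec = len(clec)
    observed = clec.mean() - ilec.mean()
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # a random relabeling, equally likely under the null
        stat = pooled[-n_clec:].mean() - pooled[:-n_clec].mean()
        if stat >= observed:
            count += 1
    # add-one smoothing keeps the Monte Carlo estimate away from zero
    return (count + 1) / (n_perm + 1)

# Mallows's extreme case: one CLEC value against 99 ILEC values. If the
# CLEC value exceeds all of them, only 1 of the 100 equally likely ranks
# is that extreme, so the exact one-sided p-value is 1/100.
rng = np.random.default_rng(1)
ilec = rng.normal(10.0, 2.0, size=99)  # hypothetical monthly performance metric
clec = [ilec.max() + 1.0]              # a single, clearly worse CLEC value
print(permutation_pvalue(ilec, clec))  # approximately 0.01
```

With a single CLEC observation the Monte Carlo estimate merely approximates the exact rank-based p-value of 1/100; for larger samples the same relabeling logic applies unchanged.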
But the validity of the permutation test depends on the va-
lidity of the model. Suppose that there is a strong day-of-week
effect but that day of the week has not been recorded in the
available data. Suppose in fact that data for Fridays tends to be
larger than for other weekdays, and that the single CLEC value
happened to fall on a Friday. Then we should be comparing this
CLEC value not with the whole set of ILEC data, but only with
the Friday values. Similarly, a CLEC value that occurred on a
Wednesday may not be outlying when viewed against the whole
ILEC dataset but may become so when compared with other
Wednesday data. A careful survey of the methodology that has
been proposed should draw attention to this kind of difficulty.
The key idea seems to be exchangeability; if we ignore (or can-
not observe) day of week, then the observations may appear to
be exchangeable, but if we can see day of week, then they may
not be (see Draper, Hodges, Mallows, and Pregibon 1993). It
is the analyst’s responsibility to determine how to organize the
data so that comparisons are made of like with like. In the par-
ity proceedings, much time was spent arguing about the proper
levels of disaggregation for the data. These discussions were
not illuminated by actual data. The problem of reaggregating
the within-cell statistics to provide an overall criterion was an
interesting technical challenge.
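A toy computation (again with invented numbers) shows how the choice of comparison set changes the conclusion: a single CLEC value taken on a Friday can look suspicious against the pooled ILEC data while being entirely typical among the ILEC’s own Friday values.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical ILEC measurements with a strong, unrecorded Friday effect.
days = np.array(["Mon", "Tue", "Wed", "Thu", "Fri"] * 20)
ilec = rng.normal(10.0, 1.0, size=100) + np.where(days == "Fri", 3.0, 0.0)

clec_value, clec_day = 13.0, "Fri"  # one CLEC observation, taken on a Friday

# Pooled comparison: what fraction of all ILEC values is at least as large?
frac_pooled = (ilec >= clec_value).mean()
# Like-with-like comparison: rank the value only against other Fridays.
frac_friday = (ilec[days == clec_day] >= clec_value).mean()

print(f"pooled ILEC data: {frac_pooled:.2f}")   # roughly 0.10, looks suspicious
print(f"Fridays only:     {frac_friday:.2f}")   # roughly 0.50, a typical Friday
```

Under the pooled view the value sits in the upper tail only because Fridays do; once the comparison is restricted to the Friday stratum, the apparent evidence of discrimination evaporates. Exchangeability holds, at best, within strata.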
It seems that the most important contribution a statistician
could make in this problem is not to design a “valid” test, but
rather to ensure that disaggregation has been carried far enough.
This is not Fisher’s second or third problem, or even his first,
but the zeroth problem! We can also identify the “fourth prob-
lem,” which comes after the statistical analysis has been com-
pleted; it is to interpret the results in terms that are intelligible
to the nonstatistical worker. Here I have little to suggest. I have
no hope of explaining to an innumerate commissioner the pro-
cedure that I proposed, which can be described as a “balanced
averaged adjusted truncated disaggregated modified t.”
All one can hope to do in such a situation is to get the op-
posing technical experts and the commission technical staff to
agree that the procedure makes sense, so that the commission
will have no grounds for objecting.
10. A NEW VISION
I think the time is ripe for a new vision of our subject. What
statisticians do is to exploit the idea of a probability model for
data. In teaching the technical methods of the subject, the model
is assumed to be “given,” not in question. Bayesians go even
further, assuming that the analyst can assign prior probabilities
for every unknown quantity. And decision theory assumes that
the consequences of every possible decision are known. But in
real applications, models are not known; a major part of the
problem lies in setting up an appropriate model. This is Fisher’s
“first problem.”
Two Fisher lectures, in 1988 and 1989, by Erich Lehmann
and David Cox (both printed in Statistical Science in 1990), ad-
dressed this problem. Nowadays students learn in school about
collecting and presenting data, but from our point of view they
are still stuck in the nineteenth century, when the emphasis was
on describing populations and not on statistical inference. Much
university teaching addresses Fisher’s second and third prob-
lems, which deal with what we commonly think of as statisti-
cal theory, the technical tools of our profession. Fisher’s first
problem is setting up the probability model that will be used.
My zeroth problem logically precedes that, considering what
data to use and how it relates to the substantive problem. The
fourth problem is interpreting the results of the statistical analy-
sis. When I learned the Neyman–Pearson theory from Egon
Pearson himself, it was applied only to very simple problems
in which the meaning of a final conclusion that “the effect is
statistically significant” was self-evident. But in anything more
than the simplest problems, the results need to be interpreted,
with all of the necessary caveats.
What we need to do is to attract bright university students to
the excitement of the subject, by showing them how the statis-
tical approach can lead to insights in many fields. To do this,
I suggest that we need to show them how the five stages (zeroth
through fourth) appear in applied problems. A first step might
be to look at the 20 or so JASA vignettes that are concerned with
applications and try to identify common elements. Surely, each
of these applications areas is not completely different from all
the others. What has been done in the past is to organize appli-
cations by the methodologies they use; GLMs, survival analy-
sis, ARIMA models, and so on. But this does not address the
question of how one chooses an appropriate methodology. What
we need is a classification of applications that does this.
I suggest that someone should take on the task of looking
for the common features among the JASA vignettes, and also
among the subject areas listed in the 2002 NSF report, and per-
haps the essays in the collection edited by Tanur et al. (1987)
and the collections of small problems collected by Cox and
Snell (1981), Andrews and Herzberg (1985), and Hand, Daly,
Lunn, McConway, and Ostrowski (1994) with the purpose of
organizing the material so that the excitement of applications
can be taught to university students. I am not the first to sug-
gest this. In his 1962 paper, Tukey stated that “the ideas of data
analysis ought to survive a look at how data is analyzed.”
Moreover, he noted that he once suggested at a statistical
meeting that it might be useful if statisticians looked to see how
data was actually analyzed by many sorts of people. And he
was criticized by a “very eminent and senior statistician” who
said that this idea might have merit, but that young statisticians
should not indulge it too much, because it might distort their
ideas.
In my 1997 Fisher lecture, I quoted from Rubin (1993,
p. 204):
The special training statisticians receive in mapping real problems into formal
probability models, computing inferences from data and models, and exploring
the adequacy of these inferences, is not really part of any other formal disci-
pline, yet is often crucial to the quality of empirical research.
I commented then, and say again now: Would that students
indeed were so trained!
11. CONCLUSION
Statisticians need to be involved with real problems. If our
discipline is to prosper and not have its growth areas taken over
by people interested only in parts of the whole (such as machine
learning, “data mining,” or image analysis), then we must look
for the common elements in these various fields and develop
a framework that encompasses them all. This framework need
not be mathematical; mathematics is seductively easy compared
with data analysis. It should not merely organize the techniques.
The key concept is “statistical thinking.”
ACKNOWLEDGMENTS
A version of this article was presented at a conference in
honor of Jon Kettenring, held at Avaya Research, September 30,
2005. At that conference I remarked that Jon Kettenring and
I are professional brothers, because both of us were guided (on
different continents) by Norman Johnson in our thesis work.
Thanks to several commentators and referees for stimulating
me to polish this article (a little).
[Received January 2006. Revised April 2006.]
REFERENCES
Andrews, D. F., and Herzberg, A. M. (1985), Data, New York: Springer-Verlag.
Breiman, L. (2004), “Comment on the NSF Report on the Future of Statistics,”
Statistical Science, 19, 411.
Brillinger, D., Fernholz, L. T., and Morgenthaler, S. (1997), The Practice of
Data Analysis (the Tukey Festschrift), Princeton, NJ: Princeton University
Press.
Cox, D. R. (1990), “Role of Models in Statistical Analysis (The 1989 Fisher
Lecture),” Statistical Science, 5, 169–174.
Cox, D. R., and Snell, E. J. (1981), Applied Statistics: Principles and Examples,
New York: Chapman & Hall.
Draper, D., Hodges, J. S., Mallows, C. L., and Pregibon, D. (1993), “Exchange-
ability and Data Analysis,” Journal of the Royal Statistical Society,Ser.A,
156, 9–28.
Ghurye, S. G. (1975), “Proceedings of the Conference on Directions for Math-
ematical Statistics (The Edmonton Conference),” special supplement to Ad-
vances in Applied Probability.
Hand, D. J., Daly, F., Lunn, A. D., McConway, K. J., and Ostrowski, E. (1994),
A Handbook of Small Data Sets, London: Chapman & Hall.
Hotelling, H. (1940), “The Teaching of Statistics,” The Annals of Mathematical
Statistics, 11, 457–470; reprinted in Statistical Science (1988), 3, 63–71.
Huber, P. J. (1997), “Speculations on the Path of Statistics,” in The Practice
of Data Analysis, eds. D. R. Brillinger, L. T. Fernholz, and S. Morgenthaler,
Princeton, NJ: Princeton University Press, pp. 175–191.
Kotz, S., and Johnson, N. L. (1992, 1997), Breakthroughs in Statistics,
Vols. 1–3, New York: Springer-Verlag.
LeCam, L., and Olshen, R. A. (eds.) (1985), Proceedings of the Berkeley
Conference in Honor of Jerzy Neyman and Jack Kiefer, Belmont, CA:
Wadsworth.
Lehmann, E. L. (1990), “Model Specification: The Views of Fisher and Ney-
man, and Later Developments (The 1988 Fisher Lecture),” Statistical Sci-
ence, 5, 160–168.
Mallows, C. L. (1998), “The Zeroth Problem (1997 Fisher Lecture),” The Amer-
ican Statistician, 52, 1–9.
(2002), “Parity: Implementing the Telecommunications Act of 1996,”
Statistical Science, 17, 256–285.
Mallows, C. L., and Pregibon, D. (1995), “Some Statistical Principles for Mas-
sive Data Problems,” in Proceedings of the Statistical Computing Section,
American Statistical Association.
Mallows, C. L., and Walley, P. (1980), “A Theory of Data Analysis?” in Pro-
ceedings of the Business and Economics Section, American Statistical Asso-
ciation.
National Research Council/CATS (1992), Combining Information, Washing-
ton, DC: National Academies Press.
(1996a), Massive Data Sets, Washington, DC: National Academies Press.
(1996b), Statistical Software Engineering, Washington, DC: National
Academies Press.
National Science Foundation (2002), Statistics: Challenges and Opportunities
for the 21st Century, Arlington, VA: Author.
Raftery, A. E., Tanner, M. A., and Wells, M. T. (eds.) (2002), Statistics in the
21st Century, London: Chapman & Hall/CRC.
Rubin, D. B. (1993), “The Future of Statistics,” Statistics and Computing, 3,
204.
Tanur, J. M., et al. (1987), Statistics: A Guide to the Unknown (2nd ed.), San
Francisco: Holden-Day.
Tukey, J. W. (1962), “The Future of Data Analysis,” The Annals of Mathemati-
cal Statistics, 33, 1–67.
(1977), Exploratory Data Analysis, Reading, MA: Addison-Wesley.
Watts, D. G. (1968), The Future of Statistics, New York: Academic Press.
Discussion
David R. BRILLINGER
University of California
Berkeley, CA 94720
(brill@stat.Berkeley.edu)
1. INTRODUCTION
It is a total pleasure to be invited to comment on Colin’s
timely paper. In it Colin refers to Bell Labs and AT&T sev-
eral times. Further, the Tukey (JWT) paper lists his affiliations
as Princeton University and Bell Telephone Laboratories, so
I seize an opportunity to celebrate the Labs of the early 1960s
as well as comment on his ideas.
Colin’s paper brings back so many memories of the
1960–1964 period: anecdotes, FFTs, lunches, seminars,
Hamming, Tukey, Hamming–Tukey, golf, learning, visitors,
computing, books, history, open doors, pink paper drafts,
technical reports, rides between Princeton and Murray Hill,
shared offices, AMTSs, chiding, support (personal and finan-
cial), opportunities (both seized and missed), blackboards, air-
conditioning, freedom, confidence, pranks, Tukey anecdotes,
gossip, conferences, unpublished memos, and people who are
no longer with us. Pursuit of excellence was the order of the
day. I could write a page or more on each of these topics, but
this is not the place.
I was at Bell Labs for the summers of 1960, 1961, and then
for the years 1962–1964. I was a summer student at first and
next a Member of Technical Staff (MTS). These were magic
years at a magic place. None of the involved persons with whom
I have used the term have ever disagreed. I can say that every-
thing important about statistics that I ever learned, I learned at
lunch at Murray Hill. The rest of my career has been applying
what I learned.
Colin reviews a place (University College London, 1948–
1958) and people (Fisher, Hotelling, Tukey) in his paper. I will
do the same.
2. THE PEOPLE
Colin is, of course, one of the key influences, drivers, crit-
ics, and contributors to the development of modern data analy-
sis. He is a problem solver with few if any peers. At the Labs
he used to be in his office (with door wide open), at lunch, al-
ways available and always interruptible. The others in the group
with wide-open doors and a thirst for discovery included Martin
Wilk, Ram Gnanadesikan, Bill Williams, Roger Pinkham, and a
stream of visitors. Of course, John Tukey dropped in/appeared
steadily from the management wing of the buildings. The fields
of expertise included sampling, multivariate analysis, time se-
ries, analysis of variance, and the newly defined field of data
analysis. [Gnanadesikan (2001) reminded me that JWT came
up with the term “data analysis” at a party at my house in 1960.
Ram’s paper contains many reminiscences about the Labs and
comments on data analysis.]
Martin Wilk went on to become a Vice President of AT&T
and then Director of Statistics Canada. He was one of the few
people who could cause John Tukey to really focus on the topic
at hand. (JWT was one of the great multiprocessors and typi-
cally focused on several things at a time.) In particular, Martin
could sum up mighty ideas in a pithy phrase or sentence. To give
an example, there was a scorn for significance tests at the Labs.
Martin remarked: “Significance tests are things to do while you
are thinking about what you really want to do.” Both Colin and
Martin went on to write influential papers with Tukey on ex-
ploratory data analysis.
3. THE RESEARCH
The Labs’ researchers’ directions then were not specifically
laid out by the higher-ups; rather, various management and engi-
neering types would drop in with problems. It seemed that few,
if any, in the statistics group could resist these problems, puz-
zles, or datasets. There were expected and unexpected discov-
eries. Terminology was created, graphic displays were basic,
residuals were fodder, engineering and chemical science were
ever present. Gnanadesikan (2001) used the word “synergy” to
describe the milieu.
A theme of my discussion is that the Labs of the early 1960s
were magic years for data analysis. They were also magic years
for the digitization of the engineering sciences. The FFT (fast
Fourier transform) has been mentioned, but also seismic records
and speech were being digitized and an analysis sometimes cul-
minated with an analog record. I mention this because a great
talent that Colin brought to the Statistics Group was skills in
combinatorics and discrete mathematics.
4. TUKEY’S PAPER
“Tukey’s paper” was the first article of the first number of
the Annals of Mathematical Statistics of 1962. The editor at
that time was J. L. Hodges Jr., who was renowned for both the-
oretical and applied statistics work. No thanks are given in the
paper to referees, so perhaps the editor published it on his own
authority. The paper had been received by the Annals on July 1,
1961 and was presented at the IMS Meeting in Seattle in 1961,
so it was out in public.
Tukey’s Foreword to the Collected Works (Jones 1986) is
worth a read. For example, one finds at the beginning: “Besse
Day (Mauss), who spent a year with R. A. Fisher, once told me
that he told her that ‘all he had learned he had learned over
the (then hand-cranked) calculating machine’.” I record this
quote to lead into the remark that JWT was involved in more
than pencil-and-paper data analyses. Tukey’s paper presents an
example. There are several analyses of one particular dataset,
a 36×15 table of the values of some particular multiple regres-
sion coefficients. JWT presents a robust/resistant row/column
fitting procedure. The Foreword is also interesting for JWT’s
comments on Bayesian statistics.
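For readers who have not met it, here is a minimal sketch (in Python, with made-up numbers) of the kind of resistant row/column fit meant here, in the spirit of Tukey’s median polish; the procedure JWT actually applied to the 36×15 table differs in its details.

```python
import numpy as np

def median_polish(table, n_sweeps=10):
    """Fit value ~ grand + row_effect + col_effect to a two-way table by
    alternately sweeping out row and column medians (a resistant analogue
    of least-squares row/column fitting)."""
    resid = np.array(table, dtype=float)
    grand = 0.0
    row = np.zeros(resid.shape[0])
    col = np.zeros(resid.shape[1])
    for _ in range(n_sweeps):
        rmed = np.median(resid, axis=1)   # sweep row medians into row effects
        row += rmed
        resid -= rmed[:, None]
        cmed = np.median(resid, axis=0)   # sweep column medians into col effects
        col += cmed
        resid -= cmed[None, :]
    # move the medians of the effects into the grand term
    for eff in (row, col):
        m = np.median(eff)
        grand += m
        eff -= m
    return grand, row, col, resid

# A small additive table with one gross error; medians resist it, so the
# bad cell is isolated in the residuals instead of distorting the fit.
table = [[10.0, 12.0, 14.0],
         [11.0, 13.0, 15.0],
         [12.0, 14.0, 26.0]]    # 26.0 "should" be 16.0
grand, row, col, resid = median_polish(table)
print(resid.round(2))           # large residual only in the corner cell
```

Because medians rather than means are swept out, the single bad cell ends up standing alone in the residual table instead of leaking into the row and column effects.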
5. COLIN’S PAPER
Colin asks a sequence of questions:
“How do we attract the brightest students to our subject?”
“How to convey this to a bright student, who has some
analytical aptitude, but who is attracted to the glamour of
pure science (or math), or the promise of riches in Wall
Street?”
“Is statistics a science?”
“If statistics is a science, what is its subject matter?”
“What do statisticians study?”
“The question remains, is statistics a science?”
“But is statistics itself a science?”
“So is statistics, or data analysis if you prefer, a science?”
“Surely each of these applications areas is not completely
different from all the others?”
“How does one choose an appropriate methodology?”
6. SOME ANSWERS TO THE QUESTIONS
First off, I am not going to get into the “is it a science?”
discussion, because I just do not think that it matters much.
I am happy to view “statistics/data analysis” as a fine endeavor
that provides much amusement and contributions of insight
and understanding to scientific researchers. I leave the ques-
tion to others, but note that Colin mentions his “sincere admi-
ration for engineers, who have to make things work in the real
world” (I have heard this sentiment phrased as “every engineer-
ing problem has a solution”), and engineering statistics is one
of our subfields (see Technometrics).
However, “how to involve students” is a question dear to my
heart. I do have suggestions:
Get them to read books like the Hoaglin–Mosteller–Tukey
(1983, 1985, 1991) series. (I note Colin’s chiding of JWT’s
EDA book with “his 1977 EDA book discusses the meth-
ods of exploratory data analysis, but says nothing about
how to use these methods.”)
Get them to attend pertinent courses.
Teach pertinent courses.
Get them to attend talks, and get talks presented.
Pay them well.
Raid the computer science departments. (There are lots of
straight computing problems, like how to work out bag-
plots and how to speed up computations, that can lure stu-
dents in.)
My own serious attempt at an original course was Statis-
tics 215a, taught in the fall semesters of 2003 and 2004 here
at Berkeley. The syllabus, book list, and readings are provided
in the Appendix.
Another attempt I made was to use the book of De Veaux,
Velleman, and Bock (2006) as text in a third-year undergrad-
uate course. In it many EDA techniques are illustrated, there
is a chapter on “Regression Wisdom,” and one finds the stric-
ture “Make a picture. Make a picture. Make a picture.” repeated
many times. (This was a Labs mantra.) Students from a broad
group of departments registered for the course and appeared to
grasp the EDA concepts almost immediately.
I am sure others teach such courses. It strikes me that one
does not have to yearn for a reincarnation of that 1960s Labs
environment, because the ideas are out and Tukey-type data
analysis is now the order of the day.
7. SUMMARY
I call this 1960–1964 period “magic years” because the seeds
for high-quality statistical analyses were sown then, and analy-
ses in which electronic computers, graphics, and residuals be-
came paramount. Sadly, one cannot say the same about the
Labs; how the mighty have fallen.
I end with the following note. There was talk at the 1960s
lunches of forming a Society of Data Analysis. My contribution
was to suggest that Tukey could be called “soda pop.”
APPENDIX: STATISTICS 215A “APPLIED STATISTICS
AT AN ADVANCED LEVEL,” UNIVERSITY OF
CALIFORNIA BERKELEY 2003, 2004
Syllabus
Week 1. Stem-and-leaf, 5-number summary, boxplot, par-
allel boxplots, examples
Week 2. EDA vs. CDA vs. DM, magical thinking, scatter
plots, pairs(), bagplot(), spin()
Week 3. Summaries of location, spread vs. level plot, em-
pirical Q–Q plot, smoothing scatterplots, smoothing types
Week 4. The future of data analysis, linear fitting, OLS,
WLS, NLS, multiple OLS, robust/resistant fitting of straight
line
Week 5. Optimization methods, the psi function, residual
analysis, fitting by stages, the x-values
Week 6. Wavelets, NLS, robust/resistant variants, smooth-
ing/nonparametric regression, sensitivity curve, two-way arrays
Week 7. Residuals analysis for two-way array, L1 approxi-
mation, median polish, diagnostic plot, data analysis and statis-
tics: an expository overview
Week 9. Exploratory analysis of variance: terminology,
overlays, ANOVA table, rob/res methods, examples
Week 10. Some principles of data analysis
Week 11. r2, R2, Simpson’s paradox, lurking variables
Week 12. Exploratory time series analysis (ETSA), plotting
time series, methods
Week 13. Data mining, definitions; contrasts with statistics
Week 14. Data mining for time series, for association rules,
market basket analysis.
Book List
Cleveland, W. S. (1994), The Elements of Graphing Data,
Belmont, CA: Wadsworth.
Chambers, J. M., Cleveland, W. S., Kleiner, B., and
Tukey, P. A. (1983), Graphical Methods for Data Analysis,
Duxbury.
Hand, D., Mannila, H., and Smyth, P. (2000), Principles of
Data Mining, Cambridge, MA: MIT Press.
Hastie, T., Tibshirani, R., and Friedman, J. (2001), The Ele-
ments of Statistical Learning, New York: Springer-Verlag.
Hoaglin, D., Mosteller, F., and Tukey, J. (1983), Understand-
ing Robust and Exploratory Data Analysis, New York: Wiley.
(1985), Exploring Data Tables, Trends, and Shapes,
New York: Wiley.
(1991), Fundamentals of Exploratory Analysis of
Variance, New York: Wiley.
Mosteller, F., and Tukey, J. W. (1977), Data Analysis and
Regression, Reading, MA: Addison-Wesley.
Rao, C. R. (2002), Linear Statistical Inference and Its Appli-
cations, New York: Wiley.
Tukey, J. W. (1977), Exploratory Data Analysis, Reading,
MA: Addison-Wesley.
Venables, W. N., and Ripley, B. D. (2002), Modern Applied
Statistics With S–PLUS, New York: Springer-Verlag.
Readings
Breiman, L. (2001), “Statistical Modeling: The Two Cul-
tures,” Statistical Science, 16, 199–231.
Diaconis, P. (1985), “Theories of Data Analysis: From
Magical Thinking Through Classical Statistics,” in Exploring
Data Tables, Trends, and Shapes, eds. D. Hoaglin, F. Mosteller,
and J. Tukey, New York: Wiley, pp. 1–36.
Friedman, J. H. (2001), “The Role of Statistics in the Data
Revolution?” International Statistical Review, 69, 5–10.
Hand, D. J. (1998), “Data Mining: Statistics and More,” The
American Statistician, 52, 112–118.
Mallows, C., and Pregibon, D. (1987), “Some Principles of
Data Analysis,” in Proceedings of the 46th Session ISI, Tokyo,
pp. 267–278.
Mannila, H. (2001), “Theoretical Framework for Data Min-
ing,” SIGKDD, 1, 30–32.
Tukey, J. W. (1980), “We Need Both Exploratory and Confirma-
tory,” in The Collected Works of John W. Tukey, ed. L. V. Jones,
Monterey, CA: Wadsworth & Brooks/Cole, pp. 811–817.
Tukey, J. W. (1962), “The Future of Data Analysis,” in The
Collected Works of John W. Tukey, ed. L. V. Jones, Monterey,
CA: Wadsworth & Brooks/Cole, pp. 391–484.
Tukey, J. W., and Wilk, M. B. (1966), “Data Analysis and
Statistics: An Expository Overview,” in The Collected Works of
John W. Tukey, ed. L. V. Jones, Monterey, CA: Wadsworth &
Brooks/Cole, pp. 549–578.
ADDITIONAL REFERENCES
DeVeaux, R. D., Velleman, P. F., and Bock, D. E. (2006), Introductory Statistics,
Boston: Pearson, Addison-Wesley.
Gnanadesikan, R. (2001), “A Conversation With Ramanathan Gnanadesikan,”
Statistical Science, 16, 295–309.
Hoaglin et al., see the Appendix.
Jones, L. V. (1986), The Collected Works of John W. Tukey, Vols. III and IV,
Monterey, CA: Wadsworth & Brooks/Cole.
Mallows, C., and Tukey, J. W. (1982), “An Overview of Techniques of Data
Analysis, Emphasizing Its Exploratory Aspects,” in The Collected Works
of John Tukey, ed. L.V. Jones, Monterey, CA: Wadsworth & Brooks/Cole,
pp. 891–968.
Discussion
Andreas BUJA
Department of Statistics
University of Pennsylvania
Philadelphia, PA 19104
(buja@wharton.upenn.edu)
Colin Mallows’s discussion of Tukey’s paper gives us an op-
portunity to clarify our thoughts about the state of the field. Be-
fore I enter into a debate with Colin, I will follow his lead by
reminiscing about the past—a more recent past than his, how-
ever.
It used to be that self-identification as a statistician, at parties,
say, produced rambling responses about “the worst class I had
to take in college.” The confession “I’m in statistics” was not
exactly a conversation stopper, but it did not move the conver-
sation in a desirable direction either. This I remember from the
1980s. Did we have a problem back then, and, if so, do we still
have it today?
Recently, I had an experience of the opposite kind, and al-
though it may not (yet?) be typical, it may be an indicator of
a climate change: One day at the start of my train ride home,
a young woman asked whether she could have the seat next to
me. As the train began to roll, we independently opened our
bags and both pulled out very similar looking paper: LaTex
hardcopy. A conversation ensued, part of which was “what do
you do?” When I told her that I am in statistics, she let out a
sigh and said “I wish I had gone into statistics.” Why? She was
a graduate student at Penn in astrophysics, and for her Ph.D. she
analyzed quasar data and needed statistics to look for support
of a theory according to which quasar signals look the same run
forward and backward in time. Moreover, her husband was an
econometrician on Wall Street, who applied statistical analysis
to financial time series for good pay.
These days I feel quite comfortable being a statistician. More
than ever it has what I find attractive: the license to do so many
different things, ranging from pretty math to powerful comput-
ing to applications in almost any science. If you allow me to
reminisce for one more moment, I am still fond of computer
graphics, especially the collaboration with Debby Swayne and
Di Cook, and I have fond memories of a programming error that
I once made way back at Stanford. It amounted to unknowingly
stuffing scrambled data into an algorithm (multidimensional
scaling), which produced amazingly regular and pretty pictures.
They were indeed so puzzling that later they became the inspi-
ration for a very pretty mathematical problem, for which I was
lucky enough to fall into a collaboration with Logan, Reeds,
and Shepp that produced an even prettier solution.
I am also fond of the memories of one of my first consulting
experiences at the University of Washington, which consisted
of co-advising a Ph.D. student in musicology (of all fields) who
had collected two fascinating lab datasets that recorded the re-
sponses of trained musicians to musical stimuli: how they felt
a given tone fragment should be continued, and to what degree
they felt a tone fitted into a chord. The datasets were so rich and
so structured, the likes of which I have not seen since.
In all, I think statistics is one of the most satisfying intellec-
tual environments, great for those who like dabbling in many ar-
eas, and also great for the focused and highly specialized minds.
I am not too worried about finding promising young researchers
in the recent crop of graduates, judging from our last recruiting
season in which we felt we just did not have enough positions
for the available talent.
Having started on an optimistic note, let me continue right
along. Colin discusses the question of whether the field is get-
ting too fragmented. Some of us remember from the 1980s
the call for application and relevance of statistical research.
This call was heeded to such an extent that a later NSF report
called for renewed focus on the “core,” an idea that the late Leo
Breiman found misguided. Do we have a problem? Maybe not.
The focus on the core is not incompatible with relevance and
application. As a community, we should be able to attend to
both. At worst, we will go the way of physics; physicists have
always had a mostly healthy tension between experimentalists
and theorists. Sure they tease each other, but physics remains
the most successful science there is. In statistics, we may ac-
tually have reversed some of the tension between the theoreti-
cal and applied areas. Colin has always bridged the two with
brilliance, but recently people who we may have thought of
as archetypes of theoreticians have also gotten themselves in-
volved in applications: Peter Bickel and Larry Brown come to
mind.
Colin asks whether statistics is a “science.” He criticizes de-
finitions of statistics that refer to the study of methodology, di-
vorced from applications, and he prefers a definition that refers
to application, quantitative data, and meaning in data. I am not
so sure; this seems dangerously close to us being the guys who
know how to use tools. Of course we should know how to use
tools, and the world will appreciate us if we do use them, but
for us on the inside I think any particular application is just that:
particular. I do not think we are interested in microarrays and
transaction data as such, although we feel deeply rewarded if
we help sciences and businesses achieve their substantive goals.
Yet those of us who apply themselves to application areas hope
to find new problems and challenges of a generality that tran-
scend these areas. We hope for conceptual and methodological
innovation and enjoy the successes in applications as beneficial
and essential side effects that keep us grounded and make the
world happy. Our preoccupation with methodological general-
ity is not misguided, because it has long-term benefits in that
generalizations abstracted from one application may pay off in
future applications. Actually, Colin implicitly agrees at the end
of his paper when he urges us to go over the JASA vignettes and
look for common strands of ideas and approaches.
Colin’s statement that “statistics is nothing more than a trivial
exercise if it does not address real-world problems” makes me
somewhat uneasy. I would rather hold that solving real-world
problems is essential for the practice of statistics, but it does
not belong in its definition. Metaphorically, we are tool makers,
not carpenters. We are into tools, not furniture, although we
find it essential to spend a fair amount of time in carpenters’
workshops to see what new tools might be needed for future
furniture making.
As a definition of statistics the field, I propose the following:
Statistics is the science of quantitative methodology for data
analysis and data-based decision making. Not part of the defin-
ition are the concepts of uncertainty and variation, and neither
are applications, just as the definition of physics should not in-
clude the concepts of heat and particles. Another issue: In the
foregoing definition, I apply the word “quantitative” to method-
ology, not to data. Why? Because we are perfectly able to an-
alyze aspects of qualitative data by quantitative means. In this
proposal I also use the phrase “the science of...,” because sta-
tistics is neither just mathematical theory nor just application;
importantly, it includes pondering the meaning of rather deep
concepts. I will return to this in a moment.
Note that any definition of statistics is really meant for our
self-reflection and for people in friendly fields who already have
some notion of statistics. Definitions typically cannot be used
to explain to folks at parties what we do. If they ask, you may
say “for example, we develop methods for telling from your
genes whether you’ll get cancer,” on which you may hear “oh,
so you’re into this bio stuff! That’s hot!,” and you say “not re-
ally, the same methods could be used to predict from your credit
records whether you’ll go bankrupt.” The point is that the ab-
straction level at which we operate may be difficult to convey
to folks on the outside. We on the inside would miss the level
also if we reduced statistics to toying in applications. It is not
too weird if some of us (not all of us!) spend time on tools that
have never seen an application. Take the Bayesians; theirs was a
“trivial exercise” for a long time, “divorced from applications,”
until computing unlocked their tools and unleashed this torrent
of Bayesian modeling that is still on us.
Colin quotes an important Tukey sentence: “Statistics is pure
technology.” I will have some bones to pick with this, too, but
on the whole it jibes with the idea of statistics as tool making.
This sentence is an excellent corrective, especially for first-year
graduate students to whom statistics looks like a proving ground
for math skills, because all they do is solve cute math problems
for the math stat course. It helps to tell them that statistics is
also a place where one invents tools. Indeed, making inventions
may be more important than mathematical theory, although it is
true that the inventions should be mathematically informed.
The quibble I have with Tukey’s quote is that it is too ab-
solute. I would still say that statistics is more science than en-
gineering, mainly because of the depth of the concepts and
insights we bring to bear. Here is a small list of such con-
cepts, many forming natural dualities: randomness, uncertainty,
sample spaces (=sets of hypothetical worlds), probability of
events and plausibility (likelihood) of hypotheses, populations
and samples, sampling (=dataset-to-dataset) variability, priors
and posteriors, structure-and-error or signal-and-noise, causal-
ity and correlation, intervention and observation, exploration
and inference, explanation and prediction, confounding, type 1
and type 2 error and multiplicity, aggregation and disaggrega-
tion (mentioned by Colin), and, on a related note, Simpson’s
paradox. These notions go beyond the purely technological as-
pects of tool making; they are deep, and some have ancestry
in old traditions of philosophy. Indeed, similar to the way the
natural sciences replaced what was formerly the “philosophy
of nature,” statistics appropriated topics that used to belong to
“epistemology.” Again, similar to the natural sciences, statistics
developed some aspects of epistemology beyond anything that
philosophers of the past could have anticipated. Insofar as it
is the business of statistics to ponder the question “how is it
possible to extract knowledge from empirical observations?,”
our field is the legitimate inheritor of the quantifiable aspects of
epistemology.
Some of us may better know a newer strand of philosophy,
called “philosophy of science,” which was initially driven by
intense opposition to traditional philosophy. Among its figure-
heads were such well-known names as Carnap and Popper, both
very capable of quantitative theorizing and therefore more con-
genial to us. Some of Popper’s ideas about “conjectures and
refutations” (one of his book titles) are firmly embedded in
the theory of statistical hypothesis testing; we too teach that
hypotheses can never be verified, they can only be falsified.
Putting it more grandiosely, truth is in those hypotheses that
hold up to repeated testing in the long run.
As a corollary of this ramble, I state that statistics is nei-
ther just technology nor just application—it is science above
all. “Science” still seems to be a badge of honor in our part of
the world, and denying it to statistics would presumably
lower its status. We should note, however, that
science is not a universally appreciated value, and even in acad-
emia there are colleagues who see us as just playing a game. To
them we are a crowd of privileged and largely white males (not
all dead yet), who have instituted rules by which to play games
of science. Rules being arbitrary, the games are arbitrary, and
nothing essential sets us apart from other games being played
the world over—political, religious, literary, artistic, athletic. In
this thinking, our game is of a highbrow variety, but such ex-
ist for all other types of games as well. Why do I go off on
this seemingly unwarranted tangent? Why inveigh gratuitously
against folks to whom we have never spoken? The reason is
that we actually have something to say, and what we have to
say throws light on our field and its role in the sciences.
The major point that we should insist on is that rules are not
arbitrary, and not all games are equal. Indeed, few things are
as empowering to humans as are good conventions. Examples
are the rules and conventions of the sciences that have the
power to produce true theories in the long run. And some of
these rules and conventions are in fact owed to statistics. Statis-
tics contributes to the conduct of science in the following two
ways: (1) It develops and proposes rules for guarding against
the overinterpretation of data, which is the traditional domain
of statistical inference, and (2) it also constantly explores new
language to express quantitative associations in data, which is
the domain of modeling in all its incarnations, Bayesian or fre-
quentist, parametric or nonparametric. Although we may be a
little jaded by the vengeance with which vast areas of the sci-
ences have embraced the use of p values and confidence inter-
vals, the fact remains that these conventions provide protections
that we would not want to miss. As for statistical modeling, the
area of greatest creative effort today, we tend to think of it as
technology. This view covers only the use of models for predic-
tion, however. When using models for interpretation, it is more
useful to think of them as languages that describe quantitative
associations. A difficulty with models is often that there are too
many to choose from, as may be the case with structural equa-
tion models, some types of Bayesian models, and, of course,
nonparametric models from trees to boosting to SVMs. Facing
an embarrassment of riches, we tend to complain, but instead
should we not be happy with the expressive choices that we
have? If there is a danger with the current wealth in modeling,
then it is that we and our colleagues in the substantive fields are
seduced to look for answers in the latest statistical models as
opposed to substantive theory. These are minor problems, how-
ever, and the fact remains that our game is not just any game;
our conventions, vocabulary and expressive power do advance
knowledge.
We heard much about the virtues of being involved in appli-
cations and real-world problems. Sure, applications have driven
some of the recent developments in the field. But one important
driver of the recent history of statistics matters above all. Note
this: What made the bootstrap, Bayesian modeling, nonpara-
metric fitting, and data visualization possible in 1999 but not
1949? Computers! Wasn’t computer technology a more perva-
sive driver of research than any particular area of application?
Maybe “driver” is the wrong word; “enabler” or “catalyst” may
be better, because computers allowed us to do things we al-
ways wanted to do but could not. For me, it is a curious per-
sonal memory that only a quarter-century ago Werner Stuetzle
and Jerry Friedman custom-built hardware for computer graph-
ics in anticipation of the concept of a “workstation.” And to-
day we take it for granted to have access to 4–6 orders of
magnitude more power, all packed into a laptop with 60G of
disk, 1G of memory, and a processor probably more than a
thousand times faster than the SUN chip number 2 they used
on their ORION workstation. And that is only the hardware.
Equally important is software, such as the S and R languages,
the BUGS package, C and Fortran (if we must), database soft-
ware, spreadsheets, and of course LaTeX. Colin talked about
large datasets: they too became possible because of computer
technology. Related technologies produced microarrays, fMRI,
communications networks, the Sloan Digital Sky Survey, and
more.
At this point, one may wonder whether computer technol-
ogy will keep enabling our field, or whether we are slowly
approaching an asymptote. It may indeed be the case that the
future stimuli will be new types of data, such as microarrays,
genomics and proteomics data, image libraries, and transaction
and network data. I expect, however, that computer speed and
increase in storage will continue to play a role, but, realistically
speaking, we have to wait for software standards to emerge so
the manipulation of, for example, sets of images will be as easy
as it is today for standard N × p data matrices. We are probably
sufficiently parochial and computationally limited that we will
not be the leaders in future data infrastructure, although I could
be wrong, in view of the achievements of our colleagues in the
bioconductor project. Something that may and should happen,
as we move with faster computers and greater storage capacity
into larger data problems, is an evolution of our high-level com-
puting tools, the S and R languages, possibly away from the
functional paradigm, which in an early incarnation even John
Chambers called “flamboyant,” namely wasteful. Twenty years
from now, S and R might still be recognizable as the same lan-
guages that they are today, but they will have grown to enable us
to use them for the exploration of the Sloan Digital Sky Survey
data, or the medical records of all U.S. patients.
Looking into the future of the fundamental concepts of statis-
tics may be more difficult. Will there be breakthroughs similar
to robustness, bootstrap, nonparametric fitting, Bayesian mod-
eling, and data visualization? Colin may be on to something
with his zeroth and fourth problems, which address the points
where the rubber meets the road:
1. Before we perform an analysis, how do we make contact
with reality? What data should we use? Are the data at
hand relevant for the questions that we have in mind?
2. After we have performed an analysis, what does it mean in re-
ality? Have we answered any questions at all? Going be-
yond the original questions, did we find surprises?
Do these questions not echo the concerns of epistemology: how
can we extract knowledge from empirical observations? Maybe
there exist fundamental concepts that would elucidate these im-
portant stages of statistical activity. Colin kicked the ball; it is
up to us to keep it rolling. Meanwhile, we might just hold our
heads a little higher than we have been used to doing. Thanks,
Colin, for an inspiring piece!
Discussion
Bradley EFRON
Department of Statistics
Stanford University
Stanford, CA 94305
(brad@stat.stanford.edu)
Colin Mallows’ essay is intriguing, insightful, provocative,
and a little strange. The same description can be applied to
Tukey’s monumental 1962 paper, which turned out to be much
fatter than I remembered when retrieved from the JSTOR
archive (48 sections!). Much of that paper now seems idio-
syncratic, even for Tukey, focusing on, for instance, “FUNOP”
methodology for dealing with outliers, but the general message
is still powerful: Statistical ideas should be ultimately judged by
their utility in applications, not on theoretical grounds. I do not
agree completely with Tukey’s sentiment, but I am grateful to
Mallows for channeling it so forcefully. What follows are some
responses to both Tukey and Mallows, with no real attempt at
logical ordering.
I believe that statistics is an information science (actually,
the first information science), joined now by such fields as
computer science, information theory, optimization (operations
research), and signal processing. As such, it operates at a sec-
ond level, one step removed from direct contact with the nat-
ural world. Is information science real science? Tukey is more
generous than Mallows in his answer. From my point of view,
the question was settled in the 1920s and 1930s by the de-
velopment of the Fisher information bound and the Neyman–
Pearson lemma. These are optimality results, statements of the
best it is possible to do in a given situation, and they moved sta-
tistics beyond its methodology stage into scientific adulthood.
The theory/applications dichotomy looks shallow in this con-
text; every MLE application reflects, at least tacitly, the bless-
ing of the information bound, and similarly for hypothesis tests
and Neyman–Pearson.
Tukey’s paper proposed a return to the world of pure method-
ology. It is easy to forget how radical data analysis (also pro-
posed as a new name for our field) was intended to be. A look
through EDA or Mosteller and Tukey’s green book on regres-
sion analysis reveals almost no theoretical structure at all, not
even basic probability. Electronic computation also plays a sur-
prisingly minor role. Section 47, “The Impact of the Computer”
(one page), has a certain Luddite aspect, with Tukey as John
Henry, carrying out “hand FUNOP” at lightning speed. In
fairness, this relates to small datasets, “36 values,” a forum
where Tukey was unchallenged master, and arguably the fa-
vored setting for the whole data analysis program. Of course,
this is not how data analysis has played out in the twenty-first
century.
At least one of Tukey’s main points has come to pass, and in
spectacular fashion: “We need to tackle old problems in more
realistic frameworks.” As a sometimes-practicing biostatisti-
cian, I have been witness to nearly a complete makeover in the
day-to-day methodology of statistical analysis: Kaplan–Meier,
generalized linear models, proportional hazards, robust meth-
ods, jackknives and bootstraps, Cp (!), GEE, EM, MCMC,....
The computer broke the bottleneck of mathematical tractability
that constrained classical statistics, and statisticians responded
with a ferocious burst of algorithmic-based technology. The-
ory and applications worked together in this creative outburst,
a healthy situation that continues today.
Mathematical statistics was a tired subject in 1962, as Huber
suggests, making Tukey’s call for an applications-oriented ref-
ormation timely as well as exciting. The call was answered. The
ensuing 40 years have seen a rising curve of applied statistical
interest, with a notable upward acceleration in the past decade
but with a twist that Tukey did not foresee: Massive data prob-
lems, in the terminology of Pregibon and Mallows, generated
by new scientific equipment such as microarrays, have moved
center stage. (Tukey himself worked on large-scale problems,
notably in the halothane study, but they are not the main thrust
of “the future of data analysis.”)
I want to talk about one example, at the risk of it soon ap-
pearing as quaint as FUNOP. Figure 1 relates to a microar-
ray experiment comparing two classes of prostate cancer. Here
50 patients in class 1 were compared with 52 patients in class 2,
yielding two-sample t statistics for each of 6,033 genes, say t_i,
i = 1, 2, ..., 6,033. Each t_i has been converted to a putative
z-score,

z_i = Φ^−1(F_100(t_i)), i = 1, 2, ..., 6,033,   (1)

where F_100 is the cdf of a standard Student t variate with
100 degrees of freedom, and Φ is the standard normal cdf.
The z_i would have a N(0,1) null distribution under the classic
Gaussian assumptions.
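Transformation (1) is mechanical to compute; here is a minimal Python sketch, in which the t statistics are simulated stand-ins (not the Singh et al. data) and all variable names are my own:

    # Sketch of transformation (1): map two-sample t statistics to z-scores.
    # The t values below are simulated placeholders, not the actual
    # prostate-cancer statistics.
    import numpy as np
    from scipy.stats import t as t_dist, norm

    rng = np.random.default_rng(0)
    t_stats = rng.standard_t(df=100, size=6033)  # stand-in for the 6,033 genes

    # z_i = Phi^{-1}(F_100(t_i)): uniform under the null, then back to N(0,1)
    z = norm.ppf(t_dist.cdf(t_stats, df=100))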
Panel (a) of the figure shows a histogram of the 6,033
z-values, compared with a permutation null distribution ob-
tained by randomly interchanging the microarrays, recomput-
ing the t statistics, and applying transformation (1). [In fact,
the permutation null is almost perfectly N(0,1); score one for
the classic Gaussian assumptions.] A Q–Q plot of the z-values,
in (b) at right, shows a distribution approximately N(0,1.12)
near the center but with long tails, presumably reflecting some
nonnull genes, the kind that the scientists were looking for.
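The permutation null itself is easy to produce; a schematic sketch under the same illustrative setup (the expression matrix X and the class labels are placeholders of my own, not the study data):

    # Schematic of the permutation null: shuffle the class labels (i.e., the
    # microarrays), recompute the two-sample t statistics, and apply (1).
    # X and labels are illustrative placeholders, not the study data.
    import numpy as np
    from scipy.stats import ttest_ind, t as t_dist, norm

    rng = np.random.default_rng(1)
    X = rng.normal(size=(6033, 102))         # genes x patients, placeholder
    labels = np.array([1] * 50 + [2] * 52)   # class 1 vs. class 2

    def z_values(X, labels):
        t_stats, _ = ttest_ind(X[:, labels == 1], X[:, labels == 2], axis=1)
        return norm.ppf(t_dist.cdf(t_stats, df=100))

    z_null = z_values(X, rng.permutation(labels))  # one permutation draw
    # Repeating this and pooling z_null gives the histogram's reference curve.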
Here are a few of the questions that come to mind:
[Figure 1. Histogram of z-Values From a Prostate Cancer Microarray Study (Singh et al. 2002) (a) and Q–Q Plot of z-Values (b).]
• Which of the genes are “significantly” nonnull? The quotation marks are necessary because the classic definition of significance seems possibly inappropriate in a massive-data context. FDR analysis, another useful new statistical methodology, flags 60 genes at FDR level .1: 32 on the left and 28 on the right. (A small sketch of such an FDR rule follows this list.)
• As in Mallows’ example, is this permutation analysis the correct one?
• How powerful for detecting nonnull genes is the experiment?
• Is N(0,1) the correct null hypothesis, or should we use N(0,1.12), as suggested by the Q–Q plot? (Doing so reduces the number of flagged genes to 16, with only 3 on the left.)
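As a generic illustration of FDR flagging at level .1, here is the Benjamini–Hochberg step-up rule (this is not Efron's local false-discovery method, only a stand-in; z is a vector of z-values as above):

    # Benjamini-Hochberg step-up rule at FDR level q = .1, applied to
    # two-sided p-values computed from the z-values under a N(0,1) null.
    # A generic illustration, not Efron's local-fdr methodology.
    import numpy as np
    from scipy.stats import norm

    def bh_flag(z, q=0.10):
        p = 2 * norm.sf(np.abs(z))                 # two-sided p-values
        m = len(p)
        order = np.argsort(p)
        passed = p[order] <= q * np.arange(1, m + 1) / m
        k = passed.nonzero()[0].max() + 1 if passed.any() else 0
        flags = np.zeros(m, dtype=bool)
        flags[order[:k]] = True                    # reject the k smallest p-values
        return flags

Adopting the wider null suggested by the Q–Q plot amounts to rescaling the z-values before computing the p-values, which is why the flagged count drops.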
I have given some possible answers in earlier work (Efron
2004, 2005). My reason for bringing this up here relates to the
question of theory versus application. Microarray studies have
generated another furious burst of statistical technology, as a
look at the R library “CRAN” will show. But technology by
itself is not enough. Sooner or later, the new applications will
have to be grounded in new theory. This is exactly the point
where statistics proves itself to be a science and not just “a pure
technology,” in Tukey and Mallows’ unfortunate phrase. FDR
theory is a good example of new theory already happening, and
I have no doubt of further progress being close at hand.
The call to applications, appropriate in 1962, seems strange
in today’s massive-data atmosphere. Applications and theory
feed off of each other in an ideal science, and in fact I think
the balance in statistics is rather nice right now. A critical spirit
is natural to statistical training (I like Rubin’s way of saying
this), and that includes a big dose of self-criticism. Mallows’
essay offers a healthy example of the self-critical genre, and
makes some telling points, but we do not want to emulate the
Gilbert and Sullivan character “who praises with enthusiastic
tone, every century but this, and every country but his own.”
ADDITIONAL REFERENCES
Efron, B. (2004), “Large-Scale Simultaneous Hypothesis Testing: The Choice
of a Null Hypothesis,” Journal of the American Statistical Association, 99,
96–104.
——— (2005), “Local False-Discovery Rates,” technical report, Stanford University, available at http://www-stat.stanford.edu/brad/papers/False.pdf.
Singh, D., Febbo, P., Ross, K., Golub, T., and Sellers, R. (2002), “Gene Expression Correlates of Clinical Prostate Cancer Behavior,” Cancer Cell, 1, 203–209.
Discussion
Peter J. HUBER
Klosters, Switzerland
(peterj.huber@bluewin.ch)
When I was asked to comment on Colin Mallows’s paper and
the surrounding issues, my first reaction was: What can I say
that has not been said before? I felt that Mallows’ account was
so insightful and elegant, that to comment on it would be to
detract from it. However, I found a few snippets to add, and
after looking once more at Tukey’s 1962 paper and at reports
on the status and future of statistics, I felt that there are many
things that should be said again, and again. And again.
I comment on four topics: the 2002 National Science Foun-
dation (NSF) report (because it is a step in the wrong direction),
data mining (because it invites programmed self-deception),
models (because Tukey had eschewed them in 1962), and strat-
egy (because I would like to add some alternative facets to Mal-
lows’s five problems).
1. 2002 NSF REPORT ON THE FUTURE
OF STATISTICS
This report sets its theme by defining the field in the follow-
ing words:
Statistics is the discipline concerned with the study of variability, with the study
of uncertainty and with the study of decision making in the face of uncertainty.
Is that all? And does it point in the right direction? Through
the three-fold use of the words “the study of,” it stresses that
statistics is concerned with the theory of the things rather than
with the things themselves. It is a fine description of ivory tower
theoretical statistics, and it pointedly excludes analysis of actual
data. This is like giving a definition of physics that excludes
experimental physics. Why did they not write: “Statistics is the
theory and practice of data analysis,” with some explication of
what is meant by theory and by practice? The report is sticking
to its set theme by concentrating on the supposed theoretical
“core” of statistics, and after seeing that, I gave up. I agree with
Leo Breiman that the report is a step into the past and not into
the future.
Already back in 1940, Deming had gently criticized Hotel-
ling’s paper on the teaching of statistics by pointing out
that “there are many other neglected phases that ought to be
stressed.” He mentioned simple descriptive tools and the in-
terpretation of data that arise from conditions not in statistical
control. His suggestions fell on deaf ears; these items were, and
obviously still are, considered dirty stuff, below the dignity of
a “core” statistician.
Deming’s plea was reiterated more forcefully and in consid-
erably more detail by Tukey in 1962. It is sad to witness the
renewed attempts to keep statistics pure and narrow.
2. DATA MINING
By not paying attention to “dirty stuff,” the statistics com-
munity opened the field wide to others, particularly computer
scientists, who then invented data mining and touted it as a cure-
all for the problems caused by data glut. On the whole, I take
a pretty dim view of data mining (see Huber 1999, p. 636).
Of course, there are interesting and important aspects. But too
many of the so-called “data mining tools” are nothing more than
good old-fashioned methods of statistics, with fancier terminol-
ogy and a glossier wrapping. Unfortunately, what had made
those methods work in the first place—namely, the common
sense judgment of a good old-fashioned statistician applying
them—did not fit into a supposedly fully automated package
and was omitted.
As a consequence, data miners have added a new twist to ly-
ing with statistics: programmed self-deception. Large datasets
typically are heterogeneous, and automated methods inevitably
fall into the pitfalls dug by heterogeneity. The statistics com-
munity may be to blame here, too; how many statistics courses
and texts ever draw attention to Simpson’s paradox? Freed-
man/Pisani/Purves is a laudable exception (see the example on
sex bias in graduate admissions).
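For readers who have not met the paradox, here is a minimal invented example in the spirit of the admissions case (all numbers are made up purely for illustration):

    # Invented numbers illustrating Simpson's paradox: each department admits
    # women at the same rate as men, yet the aggregate rate for women is far
    # lower, because women applied mostly to the harder department.
    #                (applicants, admitted)
    dept_A = {"men": (800, 480), "women": (100, 60)}   # both 60%
    dept_B = {"men": (200, 20),  "women": (700, 70)}   # both 10%

    for group in ("men", "women"):
        n = dept_A[group][0] + dept_B[group][0]
        a = dept_A[group][1] + dept_B[group][1]
        print(group, a / n)   # men: 0.50, women: 0.1625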
The danger inherent in neural network and similar black-box
methods lies not so much in overfitting, as many statisticians
seem to think, but rather in the difficulty of interpreting what is
going on inside the black box. The more opaque the box, the
less likely it is that one will spot potential problems. A case
story from a data analysis exam may illustrate my point. In
that exam, a student found that the best discriminator between
carriers and noncarriers of a certain genetic disease was age.
This was entirely correct but useless; what he had discovered,
but misinterpreted, was that carriers and controls had not been
matched with regard to age. Would we have noticed that if we
had been presented with a black-box discrimination procedure
without explicit identification of the discriminatory variable?
3. MODELS, COMPARISON AND SIMULATIONS
In his 1962 paper Tukey eschews modeling. This is interest-
ing for several reasons. Among special growth areas Tukey sin-
gles out stochastic process data: “Data of the sort today thought
of as generated by a stochastic process is a current challenge
which both deserves more attention, and different sorts of at-
tention, than it is now receiving” (p. 4). Today, most people
would think that data of this sort cries out for modeling (e.g.,
Box–Jenkins, state-space models, Kalman filter). Of course,
these approaches gained prominence only later. But in sec-
tion 27, after discussing spectral aspects of stochastic process
data in some detail, Tukey voices reservations about models:
“If I were actively concerned with the analysis of data from
stochastic processes (other than as related to spectra), I believe
that I should try to seek out techniques of data processing which
were not too closely tied to individual models.” Note that his
EDA (1977) might have been subtitled “data analysis in the ab-
sence of models.”
Tukey’s idiosyncratic dislike of models is curious, because
he regards data analysis as a science (p. 6), and modeling is
a scientific, not a statistical, task. But models are viewed dif-
ferently in the sciences and in (traditional) statistics. In the sci-
ences, insight typically is gained by thinking in models. Models
do not need to exactly represent the situation, only its relevant
features, and this creates problems on the interface between sci-
ence and statistics. Scientists interpret goodness-of-fit statistics
rather differently from statisticians. I remember the puzzlement
of a physicist comparing observed to theoretical spectra, when
his data analysis program had declared a visually lousy fit as
good and a visually perfect fit as poor, and he suspected a bug
in the program. On inspection, it turned out that his program
was perfectly in order; it calculated chi-squared goodness-of-fit
statistics, and in the first case the observational errors were very
large, and in the second case they were negligibly small (and
the test statistic had picked up irrelevant systematic errors in a
large dataset). There is no methodology in traditional statistics
for assessing the adequacy of a model! The only thing orthodox
frequentist theory can do about models is to reject them, and
Bayesian statistics cannot even do that. Nonrejection does not
imply that the model is correct; it does not even tell one whether
it adequately represents the data. On the other hand, a rejected
model might be perfectly adequate.
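The arithmetic behind the physicist’s puzzlement is easy to reproduce; a schematic with invented numbers:

    # Schematic of the goodness-of-fit puzzle: the same systematic misfit
    # yields opposite chi-squared verdicts depending on the size of the
    # observational errors. All numbers are invented for illustration.
    import numpy as np
    from scipy.stats import chi2

    residual = np.full(50, 5.0)        # a visually obvious systematic misfit

    for sigma in (20.0, 0.1):          # large, then negligible, errors
        stat = np.sum((residual / sigma) ** 2)
        print(sigma, stat, chi2.sf(stat, df=50))
    # sigma = 20 : chi2 ~ 3.1, p ~ 1 -> the lousy fit is declared "good"
    # sigma = 0.1: chi2 = 125000, p ~ 0 -> tiny errors expose the systematics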
Rather than downplay and ignore statistical modeling (as
Tukey seems to suggest), I recommend that data analysis should
provide techniques for assessing the adequacy of models. A di-
rect comparison between the data and the model is not good
enough. For example, comparing an estimated spectrum with
the theoretical spectrum is difficult, because it may be impos-
sible to tell whether differences are real or are due to random
errors or processing artifacts. But a more elaborate comparison,
between judiciously chosen results computed from the data and
analogous results computed from an ensemble of simulations
of the model, is a most powerful way to judge the adequacy of
a model.
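Read as a recipe, this is close to what is now called a parametric-bootstrap or simulation-based check; a minimal sketch, with the fitted model and the diagnostic statistic as placeholders of my own choosing:

    # Minimal sketch of model assessment by simulation: compare a statistic
    # computed from the data against the same statistic computed from an
    # ensemble of simulations of the fitted model. Model and statistic here
    # are placeholders, not a recommendation for any particular problem.
    import numpy as np

    rng = np.random.default_rng(2)
    data = rng.normal(size=200)                   # stand-in for real data

    def statistic(x):                             # judiciously chosen diagnostic
        return np.max(np.abs(x - np.median(x)))

    sims = np.array([statistic(rng.normal(size=len(data)))  # model: N(0,1)
                     for _ in range(999)])
    print(np.mean(sims >= statistic(data)))       # Monte Carlo tail probability
    # If the observed statistic sits far in the tail of the simulated ones,
    # the model is inadequate in that particular respect.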
4. STRATEGY
In 1997, in an essay titled “Strategy Issues in Data Analysis,”
I drew parallels to the famous treatise by Clausewitz (1832)
on military strategy. This essay has relevance to several of the
issues raised by Mallows, in particular to his five problems of
Fisher (from zeroth to fourth), and I cull some aphorisms from
it.
The standard statistics texts concentrate on techniques geared
toward small and homogeneous datasets. They are concerned
with the “tactics” of the field, whereas “strategy” deals with
broader issues and with the question of when to use which tech-
nique. The need for strategic thinking in data analysis is im-
posed on us by the advent of ever larger datasets. What really
forces the issue is not the size, but the fact that larger datasets al-
most invariably are less homogeneous and have more complex
internal structure.
Clausewitz commented disdainfully on the theorists of his
time, who “considered only factors that could be mathemati-
cally calculated.” Similarly, theoretical statistics should go be-
yond mathematical statistics.
According to Clausewitz, “war is the continuation of politics
by other means. . . . The political object is the goal, war is the
means of reaching it, and means can never be considered in iso-
lation from their purpose.” If war is the continuation of politics,
then data analysis is the continuation of science.
Data analysis is hard and often tedious work, so do not waste
forces. Concentrate on essentials. Use the simplest approach
that will do the job. Do not demonstrate your data-analytic
prowess with exotic procedures. Remember the KISS principle:
keep it simple and stupid. Do not overanalyze your data. Know
when to stop. If it comes to the worst, accept defeat gracefully.
Data analysis ranges from planning the data collection to pre-
senting the conclusions of the analysis, that is, from Mallows’s
zeroth to fourth problem. The war may be lost already in the
planning stage. Unfortunately, the data analyst rarely has any
control over the earliest phases of an analysis—namely, over
planning and design of the data collection, as well as over the
act of collecting the data (which sorely complicates the zeroth
problem!). There are cases (mostly hushed up and therefore
rarely documented) where multimillion dollar data collections
had to be junked because of poor design. I recounted a case
(a large hail-prevention experiment) where an unusually re-
sourceful and persuasive statistician was able to convince the
sponsor that the data he was supposed to analyze were worth-
less because one had neglected to randomize, and to force an
improved repetition of the experiment.
My recommendation is to plan the data collection with the
subsequent analysis in mind. Clever planning may simplify the
analysis and may make the spotting and correcting of the ever
present errors easier. Be aware of the dangers of gross errors and
of systematic errors, of omitted or duplicated batches of data,
and so on. The meta-data (i.e., the story behind the data, how
they were preprocessed, the precise meaning of the variables,
and so on) are just as important as the data themselves.
To assist with overall planning and preparation of resources,
I proposed a kind of checklist by dividing the actual analysis
into strategically important stages. I illustrated them with the
help of examples. These stages tend to be encountered in the
following order, but, strictly speaking, a linear ordering is not
possible; one naturally and repeatedly cycles between different
actions (I prefer this word to Fisher’s “problems”):
Inspection
Error checking
Modification
Comparison
Modeling and model fitting
Simulation
“What-if” analyses
Interpretation
Presentation of conclusions.
Conceptually, Fisher had been concerned with homogeneous
univariate populations, which was a perfect starting point for
his time. But his three problems do not generalize very well
to more complicated situations. His first two (specification and
estimation) correspond to modeling and model fitting in my
framework. I combine them because they are closely intercon-
nected. With more complex data and more complex analysis
procedures, Fisher’s third problem (distribution) is becoming
too difficult to handle by theory alone, and one must resort to
simulation. Model fitting may involve such techniques as non-
linear weighted least squares, generalizing Fisher’s maximum
likelihood methods. The other items on my list of actions typi-
cally call for judgment rather than mathematics.
The final strategy issue, the presentation of conclusions, is re-
lated to Mallows’ fourth problem, and I should add some alter-
nate considerations to it. The more massive and more complex
the datasets, the more difficult it can be to present the con-
clusions. Also, the conclusions themselves may become massive and correspondingly
difficult to manage. With high-dimensional data (we met ex-
amples in market research and highway quality maintenance
problems), the number of potential questions can explode. We
found that a kind of sophisticated decision support system (i.e.,
a customized software system to generate answers to questions
of the customers) is almost always a better solution than a thick
volume of precomputed tables and graphs.
I do endorse Mallows’ new vision and his conclusion. I per-
sonally think that the best way to teach applied statistics is
through case studies and apprenticeship, particularly through
active participation in substantial projects with real data. But
I know from experience how difficult it is to involve students in
such a fashion during their university education. His suggestion
about the JASA vignettes is an interesting first step in system-
atizing such ideas. But deep immersion in one substantial scien-
tific problem is more important for a future applied statistician
than shallow immersion in many problems. I think it was Nor-
bert Wiener who once claimed that a necessary prerequisite for
successful collaboration between a mathematician and a scien-
tist was that both are knowledgeable about the other’s field of
expertise to such a degree that the scientist is able to suggest
a novel theorem, and the mathematician can suggest a novel
experiment. The full statement can be found in Wiener (1965,
p. 3). It is ascribed there to the physiologist Arturo Rosenblueth.
ADDITIONAL REFERENCES
Huber, P. J. (1997), “Strategy Issues in Data Analysis,” in Proceedings of
the Conference on Statistical Science Honoring the Bicentennial of Stefano
Franscini’s Birth,Monte Verità,Switzerland, eds. C. Malaguerra, S. Morgen-
thaler, and E. Ronchetti, Basel: Birkhäuser-Verlag, pp. 221–238.
(1999), “Massive Datasets Workshop: Four Years After,” Journal of
Computational and Graphical Statistics, 8, 635–652.
Wiener, N. (1965), Cybernetics, Cambridge, MA: MIT Press.
Discussion
James M. LANDWEHR
Avaya Labs
Basking Ridge, NJ 07920
(jml@avaya.com)
In this paper, Mallows shares his interesting insights into
Tukey’s landmark 1962 paper through relating Tukey’s paper
to several relevant statistical contexts: Sir R. A. Fisher’s ear-
lier work, the research environment in London in the late 1940s
and early 1950s, and several events and reports from the last
10 years or so. We benefit from Mallows’ reflections and also
from his views on some important general issues for statistics
today.
One point that I would especially like to note is Mallows’
definition of statistics: “Statistics concerns the relation of quan-
titative data to a real-world problem, often in the presence of
variability and uncertainty. It attempts to make precise and ex-
plicit what the data has to say about the problem of interest.” It
seems to me that this focuses on the essence of our subject and
yet is sufficiently broad to encompass the range of important
problems that we statisticians attack today. Mallows has use-
fully characterized these activities into his five stages of prob-
lems.
Mallows writes early in this paper that “perhaps the most use-
ful thing I can do is to urge you to reread Tukey’s paper...”
Having found Mallows’ advice well worth heeding, I did reread
Tukey’s paper, which is eminently quotable. I would like to take
this opportunity to share a few quotations from Tukey’s paper
beyond those included by Mallows, and also to reflect on them
a bit. Have Tukey’s words been borne out by events over the
last 40 years? Are Tukey’s words relevant today and looking
forward?
The quotations that follow are all taken from Tukey’s 1962
paper on the indicated pages. Italics are from the original paper.
1. WHAT IS DATA ANALYSIS?
“I have come to feel that my central interest is in data analysis, which I take
to include, among other things: procedures for analyzing data, techniques for
interpreting the results of such procedures, ways of planning the gathering of
data to make its analysis easier, more precise or more accurate, and all the
machinery and results of (mathematical) statistics which apply to analyzing
data” (p. 2).
“Many have forgotten that data analysis can, sometimes quite appropriately,
precede probability models, that progress can come from asking what a speci-
fied indicator (=a specified function of the data) may reasonably be regarded as
estimating. Escape from this constraint can do much to promote novelty” (p. 5).
Note the breadth of the definition of data analysis in the first
quote. Through the last phrase Tukey includes mathematical
statistics, perhaps not all of it but the portions that he feels
actually apply to analyzing data. My sense is that this defini-
tion is broader than the way in which the term “data analysis”
has actually come to be used over the years. As for data analy-
sis preceding probability models or being done separately from
probability models, the situation has changed, and that certainly
seems to have come to pass over the years.
2. THE PROCESS OF DATA ANALYSIS,
DANGERS AND GOALS
“There is but one natural chain of growth in dealing with a specific problem of
data analysis, viz:
(a1) recognition of problem,
(a1′) one technique used,
(a2) competing techniques used,
(a3) rough comparisons of efficacy,
(a4) comparison in terms of a precise (and thereby inadequate) criterion,
(a5) optimization in terms of a precise, and similarly inadequate criterion,
(a5′) comparison in terms of several criteria
(Number of primes does not indicate relative order.)
...
(A) Praise and use work which reaches stage (a3), or only stage (a2), or
even stage (a1)...” (p. 7).
Here Tukey laid out his view of the data analysis process,
apparently unique at this level of specification. I am struck by
a few points. He states that any precise criterion is inadequate
(for the purpose of data analysis) but still requires one or several
criteria as key components of the process. The data analysis
process is not totally ad hoc. He refers to comparing efficacy,
not efficiency under some modeling assumptions. He suggests
praising and using work that gets only to the first stage of the
overall process. Tukey wanted real analyses, not just theoretical
investigations of statistical properties of new procedures.
To what extent have we incorporated this paradigm into our
applied work? My sense is that in many serious application en-
vironments it is followed roughly, with plenty of iterations. But
I think it is hard to see this approach very often in textbooks or
in the literature we write for each other, where the goals—for
whatever reasons—generally seem to focus on in-depth treat-
ment of specific pieces of the process rather than providing a
sense of the overall problem solving and data analysis processes
for serious applications. I would be delighted to have someone
convince me that I am wrong about this; specifically, that it is
relatively easy to get a good sense of the data analysis process
from our statistics journals. (And, as a former journal editor my-
self, I realize that journals and authors have multiple purposes,
and this may often not be a realistic one.)
“There is a corresponding danger for data analysis, particularly in its statistical
aspects. This is the view that all statisticians should treat a given set of data in
the same way, just as all British admirals, in the days of sail, maneuvered in
accord with the same principles” (p. 13).
“The most important maxim for data analysis to heed, and one which many
statisticians seem to have shunned, is this: ‘Far better an approximate answer
to the right question, which is often vague, than an exact answer to the wrong
question, which can always be made precise’ (pp. 13–14).
Concerning the first of these quotes, perhaps the view that
Tukey attacked came from the notion that statistics amounts to
solving a precisely defined mathematical problem. My sense
is that we statisticians have indeed gotten away from this no-
tion, as well as from the views that there is only one “correct”
way to analyze a set of data and that all good statisticians must
treat the data the same way. I believe that alternative, sensible
approaches should give basically similar important conclusions
about the data; moreover, if they do not, we should at least be
able to understand and explain to others what aspects of the
approaches lead to the different conclusions. But my sense is
also that others who analyze data from time to time but are not
highly trained in statistics may not see the situation this way,
and that many continue to believe that there must be one and
only one “right” way to analyze some data.
The second statement here is arguably Tukey’s most famous
single quotation (at least about data analysis). Probably we all
agree with it, but probably we all also need to keep it in mind
more than we do.
3. METHODOLOGICAL AREAS: MULTIVARIATE
ANALYSIS, GRAPHICS, AND COMPUTING
“The analysis of multiple-response data has similarly been much discussed, but,
with the exception of psychological uses of factor analysis, we see few exam-
ples of multiple-response data today which make essential use of its multiple-
response character” (p. 4).
“In view of this difficulty of description, it is not surprising that we do not have
a good collection of ideal, or prototype multivariate problems and solutions,
indeed it is doubtful if we have even one (where many are needed). A better
grasp of just what we want from a multivariate situation, and why, could per-
haps come without the aid of better description, but only with painful slowness”
(p. 33).
Applications of multivariate analysis have come a long way,
and I doubt if many statisticians today would agree with the first
statement. Descriptive multivariate statistical techniques, which
Tukey saw as a gap, still provide in my opinion plenty of op-
portunities, even though much progress has been made through
creative and computationally intensive graphics. It is interesting
that Tukey made these statements on data analysis for multivari-
ate problems approximately ten years after T. W. Anderson’s
classic book on multivariate analysis was published.
“The importance of the introduction of half-normal plotting is probably not
quite large enough to be regarded as marking a whole new cycle of data analy-
sis, though this remains to be seen. The half-normal plot itself is important, but
thedevelopments leading out of itare likely tobemany and varied....These
techniques, like the half-normal plot itself, will begin with indication, and will
only later pass on to significance, confidence, or other explicitly probabilistic
types of inference” (p. 42).
Certainly probability plotting methods in general are now
widely used by statisticians and are very powerful tools in mod-
eling, so I would say that Tukey’s prediction has been borne out.
But their integration and use with significance and confidence
procedures has not happened yet to a large degree, in my opin-
ion. I understand that half-normal plots are part of six-sigma
black-belt training programs, but how widely used are prob-
ability plotting methods outside the community of those with
advanced statistical training? I think that use of and comfort
with probability plotting methods is almost a reliable marker
for those with strong data analysis interests and skills, so there
is still a long way to go in terms of advancing and disseminat-
ing this general topic.
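For concreteness, the construction of a half-normal plot takes only a few lines; the effect estimates below are simulated placeholders, not data from any real design:

    # Sketch of a half-normal plot: plot the ordered |effects| against
    # half-normal quantiles; points rising off the line suggest real effects.
    # The effects are simulated placeholders.
    import numpy as np
    from scipy.stats import norm
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(3)
    effects = np.concatenate([rng.normal(size=12), [4.0, -5.0, 6.0]])

    absed = np.sort(np.abs(effects))
    n = len(absed)
    q = norm.ppf(0.5 + 0.5 * (np.arange(1, n + 1) - 0.5) / n)  # half-normal quantiles

    plt.plot(q, absed, "o")
    plt.xlabel("half-normal quantiles")
    plt.ylabel("ordered |effect estimates|")
    plt.show()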
“How vital, and how important, to the matters we have discussed is the rise
of the stored-program electronic computer? In many instances the answer may
surprise many by being ‘important but not vital,’ although in others there is no
doubt but what the computer has been ‘vital’.... On the other hand, there are
situation[s] where the computer makes feasible what would have been wholly
unfeasible. Analysis of highly incomplete medical records is almost sure to
prove an outstanding example” (pp. 63–64).
“Some would say that one should not automate such procedures of examina-
tion, that one should encourage the study of the data. (Which is somehow
discouraged by automation?) To this view there are at least three strong counter-
arguments:
1. Most data analysis is going to be done by people who are not sophisti-
cated data analysts and who have very limited time; if you do not provide
them tools the data will be even less studied. . . .
I look forward to the automation of as many standardizable statistical proce-
dures as possible. When these are available, we can teach the man who will
have access to them the ‘why’ and the ‘which,’ and let the ‘how’ follow along”
(p. 22).
Tukey was not proselytizing for a data analysis approach to
statistics because computing enabled it, nor was he saying that
heavy-duty computing is required for it. Instead, he argued that
data analysis was the correct approach and mindset to take.
Computing just made it easier and more widely available, and
permitted attacking bigger and more realistic problems. Today,
perhaps more data analysis is done by users through Excel than
through all of the mainstream statistics packages. Is that desir-
able or not? I think the answer is that it depends on the expertise
with which it is done, not where or by whom (although these
variables may be correlated).
4. DATA ANALYSIS: WHAT’S NEXT?
“We need to face up to the need for both indication and conclusion in the same
analysis” (p. 62).
Multiple comparisons research topics have progressed, but in
practice I think this simply has not happened nor have we made
the progress that Tukey anticipated 40 years ago. Were his goals
unrealistic?
“We need to face up to the need for a free use of ad hoc and informal procedures
in seeking indications” (p. 62).
We have made more progress, both in techniques and in atti-
tudes, on this front.
...it is natural for indication procedures to grow up before the corresponding
conclusion procedures do so” (p. 62).
How does this statement relate to the computationally inten-
sive Bayesian modeling topics that have developed over the last
decade and more? My sense is that this important new area de-
veloped through a different process than that which Tukey en-
visioned for advances.
“The future of data analysis can involve great progress, the overcoming of real
difficulties, and the provision of a great service to all fields of science and tech-
nology. Will it? That remains to us, to our willingness to take up the rocky
road of real problems in preference to the smooth road of unreal assumptions,
arbitrary criteria, and abstract results without real attachments. Who is for the
challenge?” (p. 64).
These were the closing words of Tukey’s paper. The paper
was a call to action, an attempt to move, if not rock, the foun-
dations of the establishment. It was emotional, not dry—an ex-
hortation. Although containing some technical content, it was
primarily a polemic. It came around 40 years after Fisher’s pa-
pers of the 1920s laying out some of the foundations of math-
ematical statistics. We now are 40-plus years from Tukey’s
paper, roughly the same length of time. We have come a long
way, I believe, as measured against Tukey’s statements. The
range, depth, and technology of serious data-analytic appli-
cations today—whether led by people calling themselves sta-
tisticians or by others—are impressive. But is there today a
corresponding call to rock the establishment, or could there be,
or should there be? If so, what is it?
Thanks again to Mallows for a stimulating paper and for mo-
tivating me to reread Tukey’s paper, and to the editors for invit-
ing this discussion and causing me to think further about the
issues.
TECHNOMETRICS, AUGUST 2006, VOL. 48, NO. 3
... Chambers' greater statistics)-or grants data analysis the status of a standalone field, external but related to statistics, which is considered a narrow part of formal, applied mathematics. Breiman (2001) and Mallows (2006) take the latter stance, calling for the expansion of statistics to include scientific elements and engage with real-world disciplines. This does not entail that statistics is itself a full-bodied science. ...
Article
Full-text available
The modern abundance and prominence of data have led to the development of “data science” as a new field of enquiry, along with a body of epistemological reflections upon its foundations, methods, and consequences. This article provides a systematic analysis and critical review of significant open problems and debates in the epistemology of data science. We propose a partition of the epistemology of data science into the following five domains: (i) the constitution of data science; (ii) the kind of enquiry that it identifies; (iii) the kinds of knowledge that data science generates; (iv) the nature and epistemological significance of “black box” problems; and (v) the relationship between data science and the philosophy of science more generally.
... A seminal article by Tukey (1962) argued for a shift from theoretical statistics to applied statistics, providing the foundation of what we understand as machine learning today (Mallows, 2006). Originating from the field of applied statistics, data science has become the motor for business intelligence's evolution toward enterprise AI (Cao, 2017;Filip et al., 2014). ...
Article
Full-text available
Disagreement and confusion about artificial intelligence (AI) terminology impede researchers, innovators, and practitioners when developing and implementing enterprise applications. The prevailing ambiguities and use of buzzwords are exacerbated by media and vendor marketing hype. This study identifies several ambiguities within and across AI fields and subfields. Combining a systematic review with a sequential mixed-models design, a total of 26,143 publications were reviewed and mapped, making this the largest conceptual study in the AI field. A unified framework is proposed as an Euler diagram to bring about clarity through a "common language" for AI researchers, innovators, and practitioners.
... Chambers' greater statistics) --or grants data analysis the status of a standalone field, external but related to statistics, which is considered a narrow part of formal, applied mathematics. Breiman's (2001) and Mallows (2006) take the latter stance, by calling for the expansion of statistics to include scientific elements and engage with real-world disciplines. This does not entail that statistics is itself a full-bodied science. ...
... Once we had our corpus, we first highlighted all statistics as defined by Mallows (2006). This includes inferential statistics -a piece of information from a portion of the population, which is then extrapolated -and descriptive statistics -information about the whole of the sample set. ...
Article
Full-text available
Statistics are a central part of political communication, yet little is known about how they are used rhetorically by politicians. This article therefore develops a rhetorical understanding of statistics in political debate and explores how they function primarily as strategies of argumentation. Through an analysis of how British politicians use numbers in debates on the National Health Service ‘winter crisis’, it is argued that four tropes underpin the use of statistics as a rhetorical device. The trope of dehistoricisation is said to engender consensus over the facticity of statistical arguments, while the tropes of synecdoche, enthymeme and framing are said to enable contestation over their presentation and meaning. The article concludes that a rhetorical understanding of statistics is vital to elucidating the selective, contestable and strategic ways in which numbers function in political debate, thereby challenging the notion that quantification can be an objective or value-free means of establishing political claims.
... The overriding message in John Tukey's paper "The Future of Data Analysis" (Tukey 1962) is, as Brad Efron so succintly and aptly states: "Statistical ideas should be ultimately judged by their utility in applications, not on theoretical grounds"-see the discussion of "Tukey's Paper After 40 Years" by Mallows (2006). Mallows concludes his paper by saying that the framework that statisticians should develop "need not be mathematical; mathematics is seductively easy compared with data analysis. ...
... The overriding message in John Tukey's paper "The Future of Data Analysis" (Tukey 1962) is, as Brad Efron so succintly and aptly states: "Statistical ideas should be ultimately judged by their utility in applications, not on theoretical grounds"-see the discussion of "Tukey's Paper After 40 Years" by Mallows (2006). Mallows concludes his paper by saying that the framework that statisticians should develop "need not be mathematical; mathematics is seductively easy compared with data analysis. ...
Article
Full-text available
My comments on the paper by Egozcue & Pawlowsky-Glahn published in TEST
Chapter
For many years, the R language has had a reputation as a premier system for interactive data analysis. From a user’s perspective, there are two main reasons for this. First, R is a language designed specifically for working with data, so it has important practical features (e.g. sensible treatment of missing values) that are not found in more general languages. Second, R comes with a vast array of high-quality packages, or libraries, that handle specialized tasks. The packages are contributed by experts in various fields, and tend to be tied closely to the literature—two facts that are relevant in an integrative field such as oceanography. The case for R has grown stronger in recent years, with a general movement to open-source software , and with specialized aspects of oceanographic data analysis becoming available in the oce package. Now is a good time for oceanographers to try R.
Chapter
Ihr volle wissenschaftstheoretisch-philosophische Tiefe erhält die Statistik erst mit der Betrachtung des allgemeinen Induktionsproblems. Man kann sie sogar als den am weitesten ausgearbeiteten theoretisch fundierten als auch praktisch erfolgreichen Versuch auffassen, jenes zu lösen. Die Formulierung von Tukey deutet bereits an, dass die Statistik nicht eine, sondern ein ganzes Spektrum spezieller Lösungen anbietet. Genauso wenig wie es den Stein der Weisen gibt, existiert ein Induktionsprinzip. Vielmehr gibt es eine ganze Reihe von Ansätzen und verschiedenartige Klassen von Argumenten um Verallgemeinerungen zu rechtfertigen.
Article
Data mining is a new discipline lying at the interface of statistics, database technology, pattern recognition, machine learning, and other areas. It is concerned with the secondary analysis of large databases in order to find previously unsuspected relationships which are of interest or value to the database owners. New problems arise, partly as a consequence of the sheer size of the data sets involved, and partly because of issues of pattern matching. However, since statistics provides the intellectual glue underlying the effort, it is important for statisticians to become involved. There are very real opportunities for statisticians to make significant contributions.
Article
The term ‘exchangeability’ was introduced by de Finetti in the context of personal probability to describe a particular sense in which quantities treated as random variables in a probability specification are thought to be similar. We believe, however, that judgments of similarity are more primitive than those of probability and are at the heart of all statistical activities, including those for which probability specifications are absent or contrived. In this paper, we give a definition of exchangeability in a descriptive context, which extends de Finetti’s concept to a wider domain. Our objective is to analyse the logic of judgments of exchangeability (or similarity, or homogeneity), to clarify the roles of context and data analysis in these judgments. We give several examples to illustrate the nature of these judgments in description, inference and prediction. We use this discussion to clarify the extent to which judgments of similarity in inference and prediction can be based on data, and the extent to which they must rely on pure faith. Our discussion is a contribution to the emerging theory of data analysis, the as-yet largely atheoretical and informal process that precedes and supports formal statistical activities.
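For reference, the probabilistic definition that this article extends to descriptive settings can be stated in standard textbook form (this is not a quotation from the paper): random variables are exchangeable when

\[
(X_1,\dots,X_n) \overset{d}{=} (X_{\pi(1)},\dots,X_{\pi(n)})
\quad \text{for every } n \text{ and every permutation } \pi \text{ of } \{1,\dots,n\},
\]

and de Finetti's theorem says that an infinite exchangeable binary sequence is a mixture of i.i.d. Bernoulli sequences:

\[
P(X_1=x_1,\dots,X_n=x_n) \;=\; \int_0^1 \theta^{\sum_i x_i}\,(1-\theta)^{\,n-\sum_i x_i}\, d\mu(\theta)
\]

for some mixing measure \(\mu\) on \([0,1]\). The article's contribution is to argue that the underlying judgment of similarity is meaningful even when no such probability specification is available.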
Article
The nature of data is rapidly changing. Data sets are becoming increasingly large and complex. Modern methodology for analyzing these new types of data is emerging from the fields of Database Management, Artificial Intelligence, Machine Learning, Pattern Recognition, and Data Visualization. So far, Statistics as a field has played a minor role. This paper explores some of the reasons for this, and why statisticians should have an interest in participating in the development of new methods for large and complex data sets.
Article
In 1922 Fisher decoupled the theory of statistics from its applications. He identified three basic problems, the first of which is choosing the form of the specification or model for the data. But there is a problem that logically precedes this: how do the data relate to the problem, and what other data might be relevant? In my view this is what "statistical thinking" should be concerned with. I describe a modest formalism that may help in the development of a "theory of applied statistics," and discuss how probability enters the picture.
Article
The word "strategy" (literally, "the leading of the army") inevitably evokes military associations. One is reminded of Clausewitz's famous treatise "Vom Kriege", the first systematic and comprehensive treatment of the subject, published in 1832 after the author's death. In data analysis, strategy is a relatively recent innovation. A decade ago, in a talk on "Environments for supporting statistical strategy", I had quipped that it was difficult to support something which did not exist (Huber 1986). Today, the joke might no longer be appropriate, but we are still far from a Clausewitz for data analysis.
Article
Data analysis is not a new subject. It has accompanied productive experimentation and observation for hundreds of years. At times, as in the work of Kepler, it has produced dramatic results.
Chapter
Preceding chapters have described various geodetic and modeling tools that can be used to monitor volcano deformation, discussed examples of how those tools have been used to infer what might be happening beneath a few well-studied volcanoes, and explored some emerging links between geodesy and other disciplines in volcanology. In this final chapter, I take a step back to consider some basic questions about the current state of volcano geodesy and try to glimpse its future: a future bright with the promise of real-time, global surveillance, but also clouded by increasing risk as populations continue to encroach on many of the world's dangerous volcanoes.