Math is Music; Statistics is Literature or Why are there
no six year old Novelists?1
Richard D. De Veaux, Williams College and Paul F. Velleman, Cornell University
Almost thirty years ago, something happened that made Introductory Statistics harder to
teach. Students didn’t suddenly become less teachable, nor did professors forget their
craft. It was then that we began to switch from teaching Statistics as a Mathematics
course to teaching the art and craft of Statistics as its own discipline. When Statistics was
viewed as a sub-specialty of Mathematics, students were taught to manipulate formulas
and calculate the “correct” answer to rote exercises. Life for the teacher, both as
instructor and as grader, was easy.
That started changing in the early 1980s. The video series Against All Odds appeared
and David Moore and George McCabe published Introduction to the Practice of
Statistics. Since then, two pioneering committees, one for the MAA and ASA and a
second that produced the Guidelines for Assessment and Instruction in Statistics
Education (GAISE) Report, officially adopted by the ASA, have pushed us all to change
our teaching. And a new generation of texts has appeared, following the advice of these
reports and challenging Statistics teachers to use this new approach.
But why is it more difficult to teach this way? And why is it so important that we do so?
By comparison, let’s look at mathematics. Much of the beauty of mathematics stems
from its axiomatic structure and logical development. That same structure facilitates--in
fact dictates--the order in which the material is taught. It also ensures that the course is
self-contained, so there are no surprises. But modern statistics courses are not like that.
That can frustrate students who were expecting a math class. As a student of one of us
once wrote on the course evaluation form, “This course should be more like a math
course, with everything you need laid out beforehand.”
1 This paper is based on several talks given by the authors at USCOTS.
Mathematics has a long history of prodigies and geniuses, with many of the most famous
luminaries showing their genius at remarkably early ages. We’ve all heard at least one
version of the famous story of young Carl Friedrich Gauss. A web search finds over 100
different retellings of the story, but an article in American Scientist2 identifies a version
actually recounted at Gauss’ funeral. In that version, Gauss, aged 7 and youngest in the
class, summed the numbers from 1 to 100 in seconds, wrote the answer on his slate and
then threw it down on the table mumbling “there it lies” in the local dialect. It was
perhaps an hour later that the teacher discovered that his answer was, in fact, the only
correct one in the room.
Prodigies in math can develop at remarkably early ages because math creates its own
self-consistent and isolated world. Pascal had worked out the first twenty-three
propositions of Euclid by age 12 when his parents, who wanted him to concentrate on
religion, finally relented and presented him with a copy of Euclid’s Elements. Galois
wrote down the essentials of what later became Galois theory the night before a fateful
duel when he was 20, or so the legend has it. In the modern era, Norbert Wiener entered
Tufts at age 11, Charles Fefferman of Princeton was, at 22, the youngest full professor in
American history, and Ruth Lawrence of Hebrew University passed her A-levels in pure
math at age 9 and became the youngest student ever to enroll at Oxford two years later.
Of course mathematics isn’t the only field that shows prodigies. Mozart, Schumann, and
Mendelssohn, among others, were young musical prodigies. Even though his music
matured, it is remarkable that some of the music Mozart wrote at age 5 is still in the
repertoire. Chess prodigies continue to appear. Sergey Karjakin is the youngest
grandmaster ever at 12 years 7 months. The infamous, late Bobby Fischer, who was the
youngest in 1958 when he became a grandmaster at 15 years, 6 months and 1 day, is now
only 19th on that list.
But there are only a few fields that develop prodigies, and all seem to be self-contained.
For example, as Thomas Dulack observed, “There are no child prodigies in literature.”3
Although one might argue that William Cullen Bryant, Thomas Chatterton, H.P.
Lovecraft or Mattie Stepanek qualify as literary prodigies, that list doesn’t have quite the
same panache as the others we’ve cited. It’s no easier to find prodigies in art, poetry,
philosophy, or other endeavors that require life experience.
2 Brian Hayes, “Gauss’s Day of Reckoning,” American Scientist, May-June 2006, Volume 94, Number 3, page 200.
3 http://advance.uconn.edu/2006/060424/06042412.htm
What does any of this have to do with statistics and how can it help us understand why
introductory statistics is so hard to teach? The challenge for the student (and teacher) of
introductory statistics is that, like literature and art, navigating through and making sense
of it requires not just rules and axioms, but life experience and “common sense.”
Although working with elementary statistics requires some mathematical skills, we ask so
much more of the intro stats student than is required by, for example, a first Calculus
course. A student in calc I is not asked to comment on whether a question makes sense,
whether the assumptions are satisfied (is the reservoir from which the water is pouring
really a cone?), to evaluate the consequences of the result, or to write a sentence or two to
communicate the answer to others. But that’s exactly what the modern intro stats course
demands.
The challenge we face is that, unlike calc I, we have a wide variety of skills to teach, and
most of them require judgment in addition to mathematical manipulation. Judgment is
best taught by example and experience, which takes time. But we’re supposed to produce
a student capable of these skills in one term. It would be challenging enough to teach the
definitions, formulas, and skills in the standard first course. To convey, in addition, the
grounds for sound judgment is even more difficult. It should be no wonder that the first
course in statistics is widely acknowledged to be one of the most difficult courses to
teach in the university.
It is not merely that we hope to teach judgment to Sophomores; we are actually asking
our students to change the way they reason about the real world. We call the skills they
must acquire the seven unnatural acts of statistical thinking:4
1. Think Critically. Challenge the data’s credentials, look for biases and lurking
variables.
2. Be Skeptical. Question authority and the current theory. (Well, OK, Sophomores
do find this natural.)
3. Think about variation rather than about center.
4. Focus on what we don’t know. For example, a confidence interval exhibits how
much we don’t know about the parameter.
5. Perfect the Process. Our best conclusion is often a refined question, but that
means a student can’t memorize the “answer.”
6. Think about conditional probabilities and rare events. Humans just don’t do this
well. Ask any gambler. But without this the student can’t understand a P-value.
7. Embrace vague concepts. Symmetry, Center, Outlier, Linear… the list of concepts
fundamental to Statistics but left without firm definitions is quite long. What
diligent student wanting to learn the “right answer” wouldn’t be dismayed?
4 P.F. Velleman, 2003, “Thinking with Data; Seven Unnatural Acts and Ten 400-year-old Aphorisms,”
keynote address to the Beyond the Formula conference, Rochester, NY.
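The sixth act can be made concrete with a small calculation. What follows is only a minimal sketch in Python, with invented numbers (a 1% base rate, a 95% detection rate, and a 5% false-alarm rate) and a hypothetical helper named posterior; none of it comes from a real data set. It shows how far the probability of a condition given a positive signal can sit from the probability of a positive signal given the condition.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Return P(condition | positive signal) by Bayes' rule.

    prior               -- P(condition), the base rate (assumed value)
    sensitivity         -- P(positive | condition) (assumed value)
    false_positive_rate -- P(positive | no condition) (assumed value)
    """
    true_positives = prior * sensitivity
    false_positives = (1 - prior) * false_positive_rate
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # All three inputs are illustrative assumptions, not measurements.
    p = posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
    print(f"P(condition | positive) = {p:.3f}")
```

With these assumed inputs the result is about 0.16, not 0.95. That reversal is the same kind of slip students make when they read a P-value as the probability that the null hypothesis is true.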
How can we help students navigate through these woods? We don’t have definitive
answers to the question, despite our more than 50 years (combined, not each) of teaching
introductory statistics. But we’d like to identify some themes that might help us as a
community to start a conversation about some of the challenges.
We can help students by giving them a structure for problem solving that incorporates the
requirement that they exercise their judgment. In our books we’ve recommended that
students follow the steps that W.E. Deming created over 50 years ago in his advice to
industry: Plan, Do, Check, Act. We’ve substituted Communicate for Act to underscore
the importance of communicating to others the results we see. Students must learn to
communicate their results in plain language and not only in statistical jargon.
As the GAISE report emphasized, we must place more emphasis on the Plan and
Communicate steps. The emphasis of the traditional mathematical course on the Do step
can be largely replaced by relying on technology for the calculations and graphics.
In teaching students to think through the problem, plan their attack on it, and
communicate results, we bring students face-to-face with their real-world knowledge and
experience—with the literature side of their maturing intellect. We owe them an
acknowledgement that we’ve done this. It isn’t fair to emphasize the simplicity of the
calculations or to just provide a bunch of definitions in little boxes. No Comp Lit or
Philosophy teacher would do that, and neither should we.
What guidance should we offer? First, we can note that the judgment often called for in
statistics is one that invites students to state their personal views. (After all, they are the
ones who must be 95% confident in their interval.) But we can offer guidance for their
judgments; they must be guided by the ethical goal of discovering, describing, modeling,
and understanding truth about the world.5
Second, we can remind students that their Introductory Statistics course is related to
every other course they may study. The reason they are taking Statistics (or perhaps, the
reason that it’s required) is that they are accumulating the kind of knowledge about the
real world that will help them write literature and read philosophy, and that kind of
knowledge makes them qualified to make statistical judgments. Of course, by asking
students to call upon what they’ve learned in other courses we are encouraging them to
solidify their knowledge from those courses.
Third, we must actually require students to demonstrate all of the steps of a statistical
analysis, from problem formulation, to communicating the results, to making real-world
recommendations based on what they find. Unfortunately, homework and exam problems that
carry these requirements are harder to write and harder to grade. Training teaching
assistants to reliably grade these efforts can be problematic. Moreover, many statistics
instructors are not trained in statistics, and they too can find this approach challenging.
But the results of teaching a modern course reward both the student and the teacher in spite
of its challenges.
We should also face outward to the academic community. There is a widespread
impression that introductory Statistics can be taught, or even less plausibly, can be
learned, in a single term. Any objective consideration of the breadth and depth of the
concepts and methods covered shows this to be absurdly optimistic. Yet few academic
programs require more than one course, and many of those that require two are cutting
back. We need to argue as a discipline that an introductory Statistics course must cover
more than an introduction to inference for means if it is to teach the reasoning of
Statistics, and that teaching that reasoning (and not just teaching definitions and
formulas) must be its goal. But a more complete course that covers techniques that require
more than rudimentary sophistication, such as inference for regression and multiple
regression, is unlikely to have time to teach judgment, planning, and communication. It
will most likely be pared down to a collection of equations and rules.
5 P.F. Velleman, “Truth, Damn Truth, and Statistics,” (July 2008), Journal of Statistics Education, Volume 16,
Number 2, http://www.amstat.org/publications/jse/v16n2/velleman.html
As a community we need to make it clear that the subject of Statistics deserves both more
respect and more time, not because it covers so many methods but because it should teach
the foundations of reasoning when we have data. Part of the argument might be that,
unlike students in subjects that exhibit prodigies, our students must summon their
real-world knowledge to learn to think statistically, and that the effort by statistics teachers
and students will pay back correspondingly in all that our students do. Math is sometimes
said to be the language of science (and much social science), but statistics should teach
students the structure for what it communicates.
Is the effort to teach the modern course worth it? We believe that it is. Rather than a
collection of techniques or a “cookbook” of situations and formulas, a modern course in
statistics must teach students to reason about the world. Although that makes the course
more difficult to teach and to assess, it will make a difference in students’ lives and serve
them for the rest of their academic careers and beyond.