IN 1954, Darrell Huff, a freelance writer and onetime editor of Better Homes and Gardens, wrote How to Lie with Statistics. It became the best-selling statistics volume of all time. It wasn’t fancy or pretentious. Illustrated with charming New Yorker–style cartoons by Irving Geis, its chapters had breezy titles like “The Gee-Whiz Graph” (Chapter 5) and “How to Statisticulate” (Chapter 9). Popular sources like Time magazine and the Kinsey Reports provided examples of improper statistical usages. To be sure, the book’s offerings were hardly news to collegiate readers, but that wasn’t really the point. Huff’s genius was to bring statistics down to a level where any reader (yes, reader, even you!) could feel sophisticated enough to talk back to the news.
Sixty years on, the sophistication of readers of this type of book has probably increased in statistically significant ways. It’s hard to imagine a celebrity stats-stud like Nate Silver existing in any era but our own, when an all-caps CORRELATION ≠ CAUSATION in the comment thread needs no further explanation. In fact, so well known is this concept that googling the phrase “spurious correlations” leads to a popular blog with calculated tongue-in-cheek correlation coefficients of obviously non-causal variables (e.g., “US spending on science, space, and technology” vs. “Suicides by hanging, strangulation, and suffocation”), all for the sake of entertainment.
Of course, one need only visit the latest round of Facebook infographics to know that statisticulation is a robust art. (Just scan the feed for the politics.) Where statistics are commonplace, the gullible are bound to be misled. And they’re not quarantined among hoi polloi. After the financial troubles of 2008, exposés on the misdeeds of Wall Street quants were ubiquitous; similarly, exposés on statistical misuse in scientific research, most notably one written by the Stanford researcher John Ioannidis, are just as numerous now as they ever were.
Enter Gary Smith. When this Pomona College economics professor elected to write Standard Deviations: Flawed Assumptions, Tortured Data, and Other Ways to Lie with Statistics, his conceit was borrowed more or less wholesale from Darrell Huff. This is a connection I am making, not one — subtitle aside — that’s ever mentioned by Smith. (Did he expect no one to notice his “Garbage In, Gospel Out” is a rewrite of Huff’s “Sample with the Built-in Bias”? That his “Graphical Gaffes” are Huff’s “Gee-Whiz Graphs”? Or that “Apples and Prunes” is just “The Semiattached Figure” redux?) Which isn’t necessarily a bad thing, but surely a hat tip wouldn’t have hurt.
Before we sharpen any pitchforks, though, let’s consider the challenges inherent in this task. After all, an updated catalog of statistical error doesn’t just require that old examples be replaced by new. Today’s sophisticates require the gritty technical details, and so the success of Standard Deviations depends not only on the clarity and interest of its examples, but also on its effectiveness at balancing techy wonkishness with layman accessibility.
So, starting with the easy stuff — what about it? Are the accessible parts any good?
For the most part, yes. Smith zeroes in on public furors and dispatches them with dry efficiency. Many are controversies you may have heard of and wondered, idly, whether any scientific explanation was in store. The book opens, for instance, with an account of Paul, the psychic octopus. Journalists reported that his movement toward one side of his tank or the other could help predict which team would win the World Cup. “What the heck was going on?” asks Smith. “How could a slimy, pea-brained invertebrate know more about soccer than I did?”
It will surprise no one that, in a book on statistical explanation, there are no admissions of the inexplicable. Octopus Paul, a German, usually favored the side of his tank with the German or similarly striped flag, and in those cases the country with the horizontally striped flag usually won. If there is one overarching theme to the book’s grab bag of topics, it’s that results should have reasons, and that sometimes the desire for results, paired with the inborn human drive to find patterns even where there are none, can lead people to make absurd claims.
Smith has some fun dredging up old nonsense on this point. Late in the book, he includes an entertaining chapter on ESP (extrasensory perception), a “science” dependent on exploiting patterns. J. B. Rhine of Duke University was an ESP acolyte in the 1930s. His research was rife with what Smith calls “data grubbing,” the selective use of data to favor one’s desired outcome. When Rhine attempted to correlate sequences of drawn cards with human predictions, he would often claim statistical significance for his results. But these “results” would variously correlate positively or negatively with past or future predictions, and when the selected correlations vanished — well, this he earnestly attributed to a “decline effect.”
“With so many subjects and so many possibilities,” Smith comments, “it should be easy to find patterns, even in random guesses.”
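Smith’s point is easy to check for yourself. What follows is only a rough sketch, not a reconstruction of Rhine’s protocol: it assumes 1,000 hypothetical subjects, 25 cards per run, and five equally likely symbols, with every card and every guess drawn independently at random (a real deck is dealt without replacement). The function names and parameters are mine, not Smith’s.

```python
import math
import random


def binomial_tail(n, k, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def count_lucky_guessers(subjects=1000, cards=25, symbols=5, alpha=0.05, seed=0):
    """Count subjects whose purely random guesses pass a one-sided test.

    All parameters are illustrative assumptions; nobody here has any ESP.
    """
    rng = random.Random(seed)
    lucky = 0
    for _ in range(subjects):
        # A hit occurs when the random guess matches the random card.
        hits = sum(
            rng.randrange(symbols) == rng.randrange(symbols) for _ in range(cards)
        )
        # p-value: chance of at least this many hits from guessing alone.
        if binomial_tail(cards, hits, 1 / symbols) < alpha:
            lucky += 1
    return lucky


if __name__ == "__main__":
    print(count_lucky_guessers(), "of 1000 random guessers look 'significant'")
```

By construction, close to one subject in twenty should clear the 5 percent bar through luck alone, so a researcher free to choose which subjects to write up will rarely come away empty-handed.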
This illustrates another of the book’s recurring themes. Smith makes it clear that data grubbing doesn’t stop with paranormal investigations, and Standard Deviations is at its best when it indicts the practice of current researchers. Here’s how Smith characterizes the know-nothing formation of theory from data:
Ransacking data for patterns is fun and exciting — like playing Sudoku or solving a murder mystery. Examine the data from every angle. Separate the data into categories based on gender, age, and race. Discard data that muddle patterns. Look for something — anything — that is interesting. After a pattern is discovered, start thinking about reasons.
Of course, there is nothing wrong with the use of data to construct an empirical theory. The problem occurs when one uses a data set — the same data set — both as the means for constructing a theory and, once it has been constructed, as the theory’s prefab proof.
Such issues frustrate a variety of studies, sometimes in topics as pressing as whether power lines cause cancer (studies based on restricted data sets implied they did), or as trivial as whether the stock market will do better if the Sports Illustrated swimsuit model is an American (again, historical data indicated it would). In both of these situations, and in many others, Smith shows how using a new data set, a set not used to discover the trend, can reveal the initial correlation to be accidental. (Tellingly, the swimsuit test was created to reveal stock market lunacy — only to be taken seriously by unwary investors.)
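The remedy Smith describes, testing a discovered pattern on data that played no part in finding it, can be sketched in a few lines. This is a toy under stated assumptions, not anything taken from the book: the “market” and all 500 candidate “predictors” are pure Gaussian noise, the sample sizes are arbitrary, and the names `pearson` and `grub_then_retest` are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def grub_then_retest(candidates=500, n_old=20, n_new=200, seed=1):
    """Pick the noise series that best 'explains' the old data, then retest it.

    Returns (correlation on the old data, correlation on the new data).
    Every series is random noise, so any pattern found is accidental.
    """
    rng = random.Random(seed)
    total = n_old + n_new
    target = [rng.gauss(0, 1) for _ in range(total)]
    best_r, best_series = 0.0, None
    for _ in range(candidates):
        series = [rng.gauss(0, 1) for _ in range(total)]
        # Ransack only the first n_old observations for a pattern.
        r = pearson(series[:n_old], target[:n_old])
        if abs(r) > abs(best_r):
            best_r, best_series = r, series
    # Retest the winner on observations it was never selected on.
    fresh_r = pearson(best_series[n_old:], target[n_old:])
    return best_r, fresh_r


if __name__ == "__main__":
    old_r, new_r = grub_then_retest()
    print(f"old data: r = {old_r:+.2f}; new data: r = {new_r:+.2f}")
```

With this many candidates, the winning series should correlate strongly with the target on the 20 old observations, while on the fresh observations the correlation should sit near zero. That is the swimsuit indicator in miniature.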
Though these examples of misuse are compelling, a summary probably makes them sound more elegantly tied together than they are. No one would accuse Smith of too much style. He discusses rocket patterns in wartime London without mentioning Tyrone Slothrop, quotes Milton Friedman unapologetically, and calls state lotteries a “stupidity tax” with no concern for sensitivity or taste. Topics shift from epidemiology to sabermetrics to psychology in short order. Chapters vary wildly in complexity and length. The book’s structure owes more to The Big Book of Amazing Facts than the usual mathematical ziggurat of progressive difficulty.
In his salad of approaches, Smith veers away from grand principles or overarching narrative. On the plus side, this allows him to avoid any obvious traps, with context always determining the worth of any particular data set. Unfortunately, this same pragmatic specificity also allows him to ignore key conceptual issues, which makes certain critiques less effective than they might have been.
For example, Smith begins Chapter 17, “Betting the Bank,” with a wry dissection of one “Superinvestor strategy” recommended by Hume & Associates, an investment advisory firm, in the 1980s. The strategy was based on the fact that the gold-to-silver ratio (i.e., the price of gold divided by the price of silver) had fluctuated between 34 and 38 for most of the prior 15 years, which led to prescriptions based on the belief that, no matter how this ratio might move in the future, it would always be drawn back to this range. History did not bear out this belief, at least not efficiently. One would have had to wait some 26 years to cash in on this strategy, since from 1986 to 2012 the GSR’s average value was 66, well above the Superinvestor expectation. “If there is no underlying reason for the discovered pattern,” Smith explains, “there is no reason for deviations from the pattern to self-correct.”
For contrast, Smith offers a pricing relation with a reason behind it: the corn-to-soybean ratio. For the past 50 years, “it cost about 2.5 times as much to produce a bushel of soybeans as to produce a bushel of corn,” and voilà: the CSR has fluctuated right around 2.5.
On one level, I’ll buy that. But on another level, isn’t this “reason” just another correlation? Aside from “common sense,” that slippery quality, Smith gives us no guidelines for determining which reasons are sensible. The book’s final chapter, “When to Be Persuaded and When to Be Skeptical,” is just a chapter-by-chapter summary of the lessons already covered.
The big problem here, if we want to dig for it, goes deep — all the way down to Hume’s problem of induction (David Hume, not “& Associates”), the precept that our knowledge of the world is fundamentally restricted to the observations we make of correlated quantities within it. In his last chapter, Smith advises, “If a statistical conclusion seems unbelievable, don’t believe it.” But if we accept, ultimately, that things seem “believable” or “unbelievable” based only on the biased data of our experience, we’re forced also to accept that there may be times when we’re unfairly persuaded, and other times when we’re unfairly skeptical. “We need both data and theory,” as another of Smith’s last-chapter summaries instructs us — but first we have to admit that successful theories are invented to fit the data.
We’re stuck with that. But even if Standard Deviations isn’t much as philosophy, it doesn’t really try to be. The book has a raggedy, professorial tone, and the brusque confidence of its dismissals often makes it more enjoyable than the original How to Lie with Statistics. Gary Smith has written a book full of practical criteria and righteous myth-busting, and if you’re willing — as I guess I am — to regard it simply as a compendium of statistical abuse, there’s a lot to be learned from it. In any case, unless everyone reads it, Standard Deviations won’t be the last book of its kind.