Lex maniac

Investigating changes in American English vocabulary over the last 40 years

Tag Archives: statistical analysis


sabermetrics

(1990’s | journalese (sports) | “percentage baseball”)

Few of my few devoted readers being baseball fans, it behooves me to offer some explanation of this odd word. (Don’t you always look for chances to use “behoove” in a sentence?) “Sabermetrics” refers to rigorous statistical analysis, which begins by establishing a reliable set of numbers measuring the performance of single players and entire teams and then reinterpreting them, taking them apart, recombining them, and generating new statistics, thought to be more revealing than the old ones. The word itself is built on an acronym, “saber” being derived from SABR, the Society for American Baseball Research, founded in 1971 as a small organization devoted to using statistics to understand baseball history. Nowadays, sabermetrics attracts more attention as a way of helping executives and managers arrive at the most effective ways to evaluate and use their players, or decide how much they should be paid or what they should be traded for. Now other sports have been bitten by the bug, and the concept may even be familiar to non-fans; many baseball abstainers have heard of Michael Lewis’s book “Moneyball,” an account of the Oakland A’s under general manager Billy Beane, who adopted sabermetric insights wholesale and built a successful team with limited means. (If you missed that, there was a Simpsons episode in 2010.)

The term has always been credited to one of its leading practitioners, Bill James, who has — not single-handedly — revolutionized our understanding of baseball. (Full disclosure: my copy of his “New Historical Baseball Abstract” is pretty much disbound due to wear.) He began a one-man samizdat in the seventies, producing mimeographed collections of statistics and evaluations of major-league players; within a few years, the annual “Baseball Abstract” was picked up by a major publisher. Since then, he has written several compendious reference books that have laid out new frameworks for understanding how baseball works. In 2003 the Boston Red Sox hired him as a special advisor, a post he retains. He has indeed created some very complex and arcane statistics, but they have become common currency in discussions of baseball.

There are two inspiring stories here: James’s rise from outsider devoid of credentials to respected insider; and the triumph of empiricism and scholarship. The first proves that such storybook careers remain possible, but the second, it seems to me, has wider cultural import. The SABR scholars, with little to offer except patient, unremunerated toil, have applied a version of the scientific method to baseball, emphasizing observation, data gathering, and statistical analysis in order to reach well-founded formulas for success. And to a great extent, it has worked. Baseball teams can no longer ignore sabermetrics; the insights of those nerdy statisticians (“statistorians,” as a pre-James pioneer, L. Robert Davids, called them) have become so standard that ignoring them is a form of malpractice. It may give us a flicker of faith that in the face of a rising tide of obscurantism, that kind of work still proves its worth and compels respect, even in a game as anti-intellectual and tradition-bound as baseball.

Like the sciences, sabermetrics ultimately proves itself through successful prediction. Why is it that sabermetrics gets more credit than, say, climate science, despite the fact that the broad claims made by climatologists thirty years ago have been borne out? It’s a much smaller audience, for one thing; most people don’t care enough about baseball to set any store by ingenious statistical hermeneutics, but nearly everyone has an opinion about climate change. Baseball has a very long tradition of statistical study, and there have always been a few “figure Filberts,” as people like James used to be called; outside of baseball, most people don’t understand statistical analysis and don’t hold with it, unless it happens to confirm what they already believed. In baseball, the goal is to win, and winning is clearly defined and easily measured. That is much less true in the greater world, where a lot more people win by casting doubt on human-caused climate change than by taking issue with sabermetricians.

crunch the numbers

(1980’s | computerese? enginese? | “wade through OR digest the figures”)

Some new expressions engulf the landscape, washing over us all and forcing themselves on every ear, if not every lip. When we talk about common expressions, those are usually the kind we mean. There is another kind, though, not so ubiquitous, but unavoidable because it is the preferred, or only, way to refer to a particular action, process, or concept. That kind likewise forces itself on every ear, but without the same unrelenting insistence. “Crunch the numbers” is one of those. It has become inevitable, in a culture devoted to amassing vast reservoirs of data, that we have a word for getting something useful out of all those statistics; once you collect all those numbers, you have to do something with them. There’s really no other word for it, and the phrase has become invariably associated with statistical distillation. The commonplace is formed not only from sheer frequency; if you have no choice but to reach for the same expression every time, it makes its presence felt.

The point of “crunching” the numbers, I think, is that they are reduced in size and complexity, like a mouthful of bran flakes turning into easily swallowed mush. The computer — number-crunching is almost invariably associated with computers, occasionally with calculators — takes a huge, indigestible mass of data and breaks it down. The expression seems to have arisen in the engineering community in the sixties and moved beyond it by the early eighties. It gained ground quickly, and soon no longer required quotation marks or glosses (actually, it was never generally glossed). Some expressions, though slangy and therefore not reproduced in mainstream publications until well after they’ve become ordinary, at least in their field, take hold quickly once they do because they’re easy to grasp and enjoy.

“Crunch the numbers” was at one time the sole property of engineers and programmers; a few more professions may try it on now, accountants and statisticians primarily. The function of the computer, as envisioned in the post-war world, was to do many, many calculations per minute by brute force, placing vast amounts of computing power in one place and letting ’er rip. I haven’t done the research to determine the no doubt lively expressions the tech boys used in the decade or two before “crunch the numbers” came along, or maybe it arose earlier than I think. It seems likely that there was no predictable expression before we started using this one, because we so rarely needed to talk about that volume and density of computation.

“Crunch the numbers” doesn’t share the taint of “massage the numbers,” or “game the system” or “fuzzy math.” A ground-level, first-resort expression must remain neutral, and the phrase is not generally used to question the motives or competence of those doing the crunching. “Run the numbers” is a little different, meaning “execute the formula and get the answer.” It likewise lacks any dubious connotation, despite a superficial resemblance to that staple of urban gambling, “running numbers” (or “playing the numbers”).
