I’ve just finished Ian Ayres’s book Super Crunchers, which I found via Andrew McAfee’s Harvard Business Review blog post, The Future of Decision Making: Less Intuition, More Evidence (discussed in Intuition and the quantitative value investor). Super Crunchers is a fuller treatment of James Montier’s 2006 research report, Painting By Numbers: An Ode To Quant, providing several more anecdotes in support of Montier’s thesis that simple statistical models outperform the best judgements of experts. McAfee discusses one such example in his blog post:
Princeton economist Orley Ashenfelter predicts Bordeaux wine quality (and hence eventual price) using a model he developed that takes into account winter and harvest rainfall and growing season temperature. Massively influential wine critic Robert Parker has called Ashenfelter an “absolute total sham” and his approach “so absurd as to be laughable.” But as Ian Ayres recounts in his great book Supercrunchers, Ashenfelter was right and Parker wrong about the ‘86 vintage, and the way-out-on-a-limb predictions Ashenfelter made about the sublime quality of the ‘89 and ‘90 wines turned out to be spot on.
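At bottom, Ashenfelter’s model is an ordinary least-squares regression of wine quality on a handful of weather variables. Here is a minimal sketch in Python of that kind of model. The vintages and coefficients are entirely made up for illustration; the synthetic “quality” figures are generated from a known linear rule, so the fitted coefficients can be checked against it:

```python
# A sketch of an Ashenfelter-style weather regression (hypothetical data and
# coefficients; Ashenfelter's actual model regressed wine price/quality on
# winter rainfall, growing-season temperature and harvest rainfall).

def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (Gauss-Jordan)."""
    n = len(rows[0]) + 1                      # features plus an intercept
    X = [[1.0] + list(r) for r in rows]
    m = len(X)
    # Build the augmented system (X'X | X'y).
    A = [[sum(X[k][i] * X[k][j] for k in range(m)) for j in range(n)]
         + [sum(X[k][i] * y[k] for k in range(m))] for i in range(n)]
    for col in range(n):                      # elimination with partial pivoting
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(n):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]

# Hypothetical vintages: (winter rain mm, season temp degrees C, harvest rain mm).
weather = [(600, 17.1, 160), (690, 16.7, 80), (500, 17.7, 130),
           (820, 16.3, 110), (560, 17.9, 60), (640, 16.9, 190)]
# Synthetic "quality" from a known linear rule, so the fit is checkable:
# more winter rain and warmth help, harvest rain hurts.
quality = [1.0 + 0.001 * w + 0.06 * t - 0.004 * h for w, t, h in weather]

b0, bw, bt, bh = fit_ols(weather, quality)
print(round(bw, 4), round(bt, 3), round(bh, 4))  # recovers 0.001 0.06 -0.004
```

The point of the exercise is the one Montier and Ayres keep making: once fitted, the model is a fixed, transparent rule that anyone can apply to next year’s weather, with no room for a critic’s mood to creep in.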
Ayres provides a number of stories not covered in Montier’s article, including Don Berwick’s “100,000 lives” campaign, Epagogix’s hit movie predictor, Offermatica’s automated web ad serving software, Continental Airlines’s complaint process, and a statistical algorithm for predicting the outcome of Supreme Court decisions. While seemingly unrelated, all are prediction engines based on quantitative analysis of subjective or qualitative factors.
The Supreme Court decision prediction algorithm is particularly interesting to me, not because I am an ex-lawyer, but because the law is expressed in language, often far from plain, and seemingly irreducible to quantitative analysis. (I believe this is also true of value investment, although numbers play a larger role in that realm, and so it lends itself more readily to quantitative analysis.) According to Andrew Martin and Kevin Quinn, the authors of Competing Approaches to Predicting Supreme Court Decision Making, given just a few variables concerning the politics of a case, they can predict how the US Supreme Court justices will vote.
Ayres discussed the operation of Martin and Quinn’s Supreme Court decision prediction algorithm in How computers routed the experts:
Analysing historical data from 628 cases previously decided by the nine Supreme Court justices at the time, and taking into account six factors, including the circuit court of origin and the ideological direction of that lower court’s ruling, Martin and Quinn developed simple flowcharts that best predicted the votes of the individual justices. For example, they predicted that if a lower court decision was considered “liberal”, Justice Sandra Day O’Connor would vote to reverse it. If the decision was deemed “conservative”, on the other hand, and came from the 2nd, 3rd or Washington DC circuit courts or the Federal circuit, she would vote to affirm.
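The two rules quoted for Justice O’Connor amount to a tiny decision tree. A minimal sketch, using hypothetical function and label names; the excerpt gives only these two branches, so anything else falls through to “unknown”:

```python
# The quoted portion of Martin and Quinn's flowchart for Justice O'Connor,
# expressed as a decision function. Branches beyond the two quoted rules are
# not given in the excerpt, so they fall through to "unknown".

def predict_oconnor(lower_court_direction, circuit):
    """Predict O'Connor's vote from the ideological direction of the lower
    court's ruling ('liberal' or 'conservative') and the circuit of origin."""
    if lower_court_direction == "liberal":
        return "reverse"
    if lower_court_direction == "conservative" and circuit in {"2nd", "3rd", "DC", "Federal"}:
        return "affirm"
    return "unknown"  # remaining branches of the flowchart are not quoted

print(predict_oconnor("liberal", "9th"))      # reverse
print(predict_oconnor("conservative", "DC"))  # affirm
```

Note what is absent: nothing in the function looks at the parties, the briefs, or the legal question itself. That absence is exactly what makes the model’s performance so unsettling.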
Ted Ruger, a law professor at the University of Pennsylvania, approached Martin and Quinn at a seminar and suggested that they test the performance of the algorithm against a group of legal experts:
As the men talked, they decided to run a horse race, to create “a friendly interdisciplinary competition” to compare the accuracy of two different ways to predict the outcome of Supreme Court cases. In one corner stood the predictions of the political scientists and their flow charts, and in the other, the opinions of 83 legal experts – esteemed law professors, practitioners and pundits who would be called upon to predict the justices’ votes for cases in their areas of expertise. The assignment was to predict in advance the votes of the individual justices for every case that was argued in the Supreme Court’s 2002 term.
The outcome?
The experts lost. For every argued case during the 2002 term, the model predicted 75 per cent of the court’s affirm/reverse results correctly, while the legal experts collectively got only 59.1 per cent right. The computer was particularly effective at predicting the crucial swing votes of Justices O’Connor and Anthony Kennedy. The model predicted O’Connor’s vote correctly 70 per cent of the time while the experts’ success rate was only 61 per cent.
Ayres provides a copy of the flowchart in Super Crunchers. Its simplicity is astonishing: there are only six decision points, and none of them relates to the content of the matter. Ayres poses the obvious question:
How can it be that an incredibly stripped-down statistical model outpredicted legal experts with access to detailed information about the cases? Is this result just some statistical anomaly? Does it have to do with idiosyncrasies or the arrogance of the legal profession? The short answer is that Ruger’s test is representative of a much wider phenomenon. Since the 1950s, social scientists have been comparing the predictive accuracies of number crunchers and traditional experts – and finding that statistical models consistently outpredict experts. But now that revelation has become a revolution in which companies, investors and policymakers use analysis of huge datasets to discover empirical correlations between seemingly unrelated things.
Perhaps I’m naive, but for me one of the really surprising implications of Martin and Quinn’s model is that the merits of the legal arguments before the court are largely irrelevant to the decision rendered; it is Ayres’s “seemingly unrelated things” that affect the outcome most. Ayres puts his finger on the point at issue:
The test would implicate some of the most basic questions of what law is. In 1881, Justice Oliver Wendell Holmes created the idea of legal positivism by announcing: “The life of the law has not been logic; it has been experience.” For him, the law was nothing more than “a prediction of what judges in fact will do”. He rejected the view of Harvard’s dean at the time, Christopher Columbus Langdell, who said that “law is a science, and … all the available materials of that science are contained in printed books”.
Martin and Quinn’s model shows Justice Oliver Wendell Holmes to be right: law is nothing more than a prediction of what judges will in fact do. How is this relevant to a deep value investing site? Deep value investing is nothing more than a prediction of what companies and stocks will in fact do. If the relationship holds, seemingly unrelated things will affect the performance of stock prices. Part of the raison d’être of this site is to determine what those things are: to quantify the qualitative factors affecting deep value stock price performance.
This just reiterates what Philip Tetlock found in his studies on political science experts:
http://www.amazon.com/Philip-Tetlock-Better-Forecasters-Hedgehogs/dp/B000V76TYS/
Experts lose to simple statistical algorithms in most cases.
Caution is warranted in taking Ayres’s conclusions at face value, because some of his primary research appears to have been sloppy. At least one of his examples (the Ashenfelter model) is quoted incorrectly. Michael Mauboussin points this out in an interesting article about the importance of checking claims. While this does not necessarily invalidate the conclusions, some skepticism is appropriate.
See: D8618-Mauboussin_SeeForYourself.pdf
That’s a great article. Thank you. Mauboussin is fantastic.
This is an interesting theme – a related review which gives examples of how statistical modeling of experts’ decisions can be helpful in decision making is Swets, Dawes, and Monahan (2000): “Psychological science can improve diagnostic decisions”. (Psychological Science in the Public Interest 1(1): 1-26), freely available for download.
Superb. Thank you. I’ll take a look.