Tue Oct 20 14:18:48 EDT 2015

Forecasting Tournaments

Does Philip Tetlock hold the key to accurate predictions?
Seeing Into the Future

Within all the teams, researchers ran experiments -- for example, pitting individuals against groups -- to see which methods improved accuracy. The essential insight? Prediction accuracy is possible when people participate in a setup that rewards only accuracy -- not the novelty of the explanation, or loyalty to the party line, or the importance of keeping up your reputation. It is under this condition that the "supers," the top 2 percent of each group, emerged.

Every prominent pundit who was invited to participate in the tournament declined. Put it this way, Tetlock says: "If you have an influential column and a Nobel Prize and big speaking engagements and you're in Davos all the time -- if you have all the status cards -- why in God's name would you ever agree to play in a tournament where the best possible outcome is to break even?"

... Now that we know some limitations, and strengths, of forecasters, Tetlock wants to focus on asking the right questions. He hopes to create what Kahneman has called "adversarial collaboration tournaments" -- for instance, bringing together two politically opposed groups to discuss the Iran nuclear deal. One group thinks it's great, one group thinks it's terrible, and each must generate 10 questions that everyone will answer.

The idea is that each side will generate questions with answers that favor their position, and that, with everyone forced to consider all questions, a greater level of understanding will emerge.

Good Judgment™ Open

Compete in the GJ Open against the smartest crowd online -- the one that decisively won the four-year ACE forecasting tournament!



Edge Master Class 2015: Philip Tetlock: A Short Course in Superforecasting

On the weekend of July 30th, Edge convened one of its "Master Classes."

... This year, the psychologist and social scientist Philip E. Tetlock presented the findings from his work on forecasting as part of the Good Judgment Project. In 1984, Tetlock began holding "forecasting tournaments" in which selected candidates were asked questions about the course of events: In the wake of a natural disaster, what policies will be changed in the United States? When will North Korea test nuclear weapons? The candidates work through the questions in teams. They are not necessarily experts, but attentive, shrewd citizens.

... Over the weekend in Napa, Tetlock held five classes, which are being presented by Edge in their entirety (8.5 hours of video and audio) along with accompanying transcripts (61,000 words).

  • Edge Short Course in Superforecasting, Class I

    Forecasting Tournaments: What We Discover When We Start Scoring Accuracy

    If you read Tom Friedman's columns carefully, you'll see that he does make a lot of implicit predictions.

    . . . He's warned us about a number of things and uses the language of could or may -- various things could or might happen. When you ask people what "could" or "might" mean in isolation, they mean anything from about .1 probability to about .85 probability. They have a vast range of possible meanings. This means that it's virtually impossible to assess the empirical track record of Tom Friedman, and Tom Friedman is by no means alone. It's not at all unusual for pundits to make extremely forceful claims about violent nationalist backlashes or impending regime collapses in this or that place, but to riddle them with vague verbiage quantifiers of uncertainty that could mean anything from .1 to .9.

    It is as though high-status pundits have learned a valuable survival skill: they've mastered the art of appearing to go out on a limb without actually going out on a limb. They say dramatic things, but there are vague verbiage quantifiers connected to the dramatic things. It sounds as though they're saying something very compelling and riveting. There's a scenario that's been conjured up in your mind of something either very good or very bad. It's vivid, easily imaginable.

    It turns out, on close inspection, that they're not really saying it's going to happen. They're not specifying the conditions, or a time frame, or a likelihood, so there's no way of assessing accuracy. You could say these pundits are just doing what a rational pundit would do, because they know that they live in a somewhat stochastic world. They know that it's a world that frequently throws surprises at them, so to maintain their credibility with their community of co-believers they need to be vague. It's an essential survival skill. There is considerable truth to that, and forecasting tournaments are a very different way of proceeding. Forecasting tournaments require people to attach explicit probabilities to well-defined outcomes in well-defined time frames, so you can keep score. [A minimal scoring sketch appears after the class summaries below.]

  • Edge Short Course in Superforecasting, Class II

    Tournaments: Prying Open Closed Minds in Unnecessarily Polarized Debates

    Four Performance Drivers in Tournaments:

    • Get the Right People on the Bus

      10-15% boost from screening forecasters on fluid intelligence / active open-mindedness.
    • Benefits of Interaction

      10-20% boost: Forecasters do better working either collaboratively in teams or competitively in prediction markets.
    • Benefits of Training

      10% boost: Cognitive debiasing exercises help.
    • Benefits of Elitist/Extremizing Algorithms

      15-30% boost: give more weight to better forecasters AND then "extremize" to compensate for the conservatism of aggregate forecasts ("super-fox" strategy). [A sketch of this aggregation step appears after the class summaries below.]

    ... the Good Judgment Project outperformed a prediction market inside the intelligence community, which was populated with professional analysts who had classified information, by 25 or 30 percent, which was about the margin by which the superforecasters were outperforming our own prediction market in the external world.

  • Edge Short Course in Superforecasting, Class III

    Counterfactual History: The Elusive Control Groups in Policy Debates

    The U.S. intelligence community does not believe it's appropriate to hold analysts accountable for the accuracy of their forecasts. It believes it's appropriate to hold analysts accountable for the processes by which they reach their conclusions. It's not appropriate to judge them on the basis of the accuracy of their conclusions when their conclusions are about the future.

  • Edge Short Course in Superforecasting, Class IV

    Skillful Backward and Forward Reasoning in Time: Superforecasting Requires "Counterfactualizing"

    A lot of human beings take a certain umbrage at the thought that they're just an algorithm. A lot of superforecasters would not. They would be quite open to the possibility that you could create a statistical model of their judgment policies in particular domains, and that the statistical model might, in the long run, outperform the forecasters themselves because it performs more reliably. They're quite open to that. Most people do not like that idea very much. That's another way in which superforecasters stand out -- not all of them, as Barb points out, since they're a somewhat heterogeneous group, but the best among them would be quite open to that idea. [A sketch of such a judgment-policy model appears after the class summaries below.]

    ... Creativity is one of those buzzwords. I'm a little hesitant to talk about creativity. There are things they do that feel somewhat creative, but what they mostly do is they think very carefully and they think carefully about how they think. They're sensitive to blind spots, and they kick themselves pretty hard when they slip into what they knew was a trap. They slip into it anyway because some of these traps are sneaky and they're hard to avoid.

  • Edge Short Course in Superforecasting, Class V

    Condensing it All Into Four Big Problems and a Killer App Solution

    The first big problem I see is that in virtually all high-stakes policy debates we observe now, the participants are motivated less by pure accuracy goals than by an assortment of other goals. They're motivated by ego defense, defending their past positions; by self-promotion, claiming credit for things (correctly or incorrectly); and by affirming their loyalty to a community of co-believers, because the status of pundits hinges critically on where they stand in their social networks. If you're a liberal or a conservative high-profile pundit, you know that if you take one for the home team, they're going to pick you up and keep you moving along.

    ... The second big point -- we've already talked about attribute substitution. High-stakes partisans want to simplify an otherwise intolerably complicated world. They use attribute substitution a lot: they take hard questions, replace them with easy ones, and act as if the answers to the easy ones are answers to the hard ones.

    ... The third thing we talked a bit about yesterday is rhetorical obfuscation as an essential survival strategy if you're a political pundit. To preserve their self- and public images in an environment that throws up a lot of surprises, which, of course, the political world does, high-stakes partisans have to learn to master an arcane art: the art of appearing to go out on a predictive limb without actually doing it, of appearing to make a forecast without actually making one.

    They say decisive-sounding things about Eurozone collapse or this or that, but there are so many "may's" and "possibly's" and "could's" and so forth that turning it into a probability estimate that can be scored for accuracy is virtually impossible. A, they can't keep score of themselves. B, there is no way to tell ex post which side gets closer to the truth because each side has rhetorically positioned itself in a way that allows it to explain what happened ex post.

    ... Attribute substitution, point four, is not just going on among the debaters; it's going on in the audience as well. Audiences are remarkably forgiving of all these epistemological sins that debaters are committing. There is a tendency to take partisan claims more or less at face value as long as the partisans belong to their community of co-believers.
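
A note on "keeping score," as promised above. Tournaments in the ACE program graded forecasters with proper scoring rules, most prominently the Brier score: the squared error between the stated probability and the 0-or-1 outcome. Here is a minimal sketch in Python; the track record is invented for illustration.

    def brier_score(forecast, outcome):
        """Squared error between a probability forecast and the 0/1 outcome.

        0.0 is perfect; 0.25 is what a flat 50/50 forecast earns; 1.0 is worst.
        """
        return (forecast - outcome) ** 2

    # Invented track record: (probability assigned, whether the event occurred).
    # A vague "could happen" cannot be entered here at all -- which is the point.
    track_record = [(0.85, 1), (0.30, 0), (0.60, 0)]

    mean_brier = sum(brier_score(p, o) for p, o in track_record) / len(track_record)
    print(f"mean Brier score: {mean_brier:.3f}")  # lower is better; here 0.158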
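
The "elitist/extremizing" aggregation from Class II can also be sketched: give better forecasters more weight, then push the pooled probability away from 0.5 to offset the conservatism that averaging introduces. The weights and the extremizing exponent below are illustrative assumptions, not the Good Judgment Project's actual parameters.

    def aggregate(forecasts, weights, a=2.0):
        """Weighted average of probability forecasts, then extremized.

        The transform p**a / (p**a + (1-p)**a) pushes p away from 0.5 when
        a > 1 and leaves it unchanged when a = 1. The exponent a = 2.0 is an
        assumption; in practice it would be tuned on past questions.
        """
        total = sum(weights)
        p = sum(w * f for f, w in zip(forecasts, weights)) / total
        return p ** a / (p ** a + (1 - p) ** a)

    # Three forecasters; the first has the best track record, hence most weight.
    print(round(aggregate([0.70, 0.65, 0.55], weights=[3.0, 1.5, 1.0]), 2))  # 0.79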
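
Finally, the "statistical model of judgment policies" from Class IV is the classic bootstrapping-of-judges idea: fit a simple model to a forecaster's own past judgments as a function of the cues they attend to, then apply that policy mechanically, free of day-to-day noise. The cues and judgments below are invented; the point is the mechanics, not the numbers.

    import numpy as np

    # Invented history: each row holds two cues the forecaster attends to
    # (say, a base rate and an expert-consensus reading) for a past question.
    cues = np.array([[0.20, 0.30],
                     [0.50, 0.70],
                     [0.80, 0.60],
                     [0.40, 0.50],
                     [0.90, 0.85]])
    judgments = np.array([0.25, 0.65, 0.70, 0.45, 0.90])  # what they actually said

    # Fit the judgment policy: a linear model of the forecaster's own ratings.
    X = np.column_stack([np.ones(len(cues)), cues])
    coef, *_ = np.linalg.lstsq(X, judgments, rcond=None)

    # Applied to a new question, the model reproduces the forecaster's policy
    # with perfect consistency -- the reliability edge the transcript describes.
    new_question = np.array([1.0, 0.60, 0.40])
    print(float(np.clip(new_question @ coef, 0.0, 1.0)))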

Addendum: Interview with Phil Tetlock on the Rationally Speaking podcast.


Posted by mjm | Permanent link | Comments