Does Philip Tetlock hold the key to accurate predictions?
Seeing Into the Future
Within all the teams, researchers ran experiments -- for example, pitting individuals against groups -- to see which methods improved accuracy. The essential insight? Prediction accuracy is possible when people participate in a setup that rewards only accuracy -- and not the novelty of the explanation, or loyalty to the party line, or the importance of keeping up your reputation. It is under these conditions that the "supers," the top 2 percent of each group, emerged.
Every prominent pundit who was invited to participate in the
tournament declined. Put it this way, Tetlock says: "If you have an
influential column and a Nobel Prize and big speaking engagements
and you're in Davos all the time -- if you have all the status
cards -- why in God's name would you ever agree to play in a
tournament where the best possible outcome is to break even?"
...
Now that we know some limitations, and strengths, of forecasters,
Tetlock wants to focus on asking the right questions. He hopes to
create what Kahneman has called "adversarial collaboration tournaments"
-- for instance, bringing together two politically opposed groups
to discuss the Iran nuclear deal. One group thinks it's great,
one group thinks it's terrible, and each must generate 10 questions
that everyone will answer.
The idea is that each side will generate questions with answers that favor their position, and that, with everyone forced to consider all questions, a greater level of understanding will emerge.
Compete in the GJ Open against the smartest crowd online -- the one that decisively won the four-year ACE forecasting tournament!
Edge Master Class 2015: Philip Tetlock: A Short Course in Superforecasting
On the weekend of July 30th, Edge convened one of its "Master Classes."
... This year, the psychologist and social scientist Philip E. Tetlock presented the findings based on his work on forecasting as part of the Good Judgment Project. In 1984, Tetlock began holding "forecasting tournaments" in which selected candidates were asked questions about the course of events: In the wake of a natural disaster, what policies will be changed in the United States? When will North Korea test nuclear weapons? Candidates examine the questions in teams. They are not necessarily experts, but attentive, shrewd citizens.
... Over the weekend in Napa, Tetlock held five classes, which are being presented by Edge in their entirety (8.5 hours of video and audio) along with accompanying transcripts (61,000 words).
Forecasting Tournaments: What We Discover When We Start Scoring Accuracy
If you read Tom Friedman's columns carefully, you'll see that he does
make a lot of implicit predictions.
...
He's warned us about a number of things and uses the language
of could or may -- various things could or might happen.
When you ask people what "could" or "might" mean in isolation, the answers range from about .1 probability to about .85 probability. The words have a vast range of possible meanings. This means that it's
virtually impossible to assess the empirical track record of
Tom Friedman, and Tom Friedman is by no means alone.
It's not at all unusual for pundits to make extremely forceful claims about violent nationalist backlashes or impending regime collapses in this or that place, but to riddle them with vague verbiage quantifiers of uncertainty that could mean anything from .1 to .9.
It is as though high status pundits have learned a valuable survival skill,
and that survival skill is they've mastered the art of appearing to go out
on a limb without actually going out on a limb. They say dramatic things
but there are vague verbiage quantifiers connected to the dramatic things.
It sounds as though they're saying something very compelling and riveting.
There's a scenario that's been conjured up in your mind of something
either very good or very bad. It's vivid, easily imaginable.
It turns out, on close inspection they're not really saying that's
going to happen. They're not specifying the conditions, or a time frame,
or likelihood, so there's no way of assessing accuracy. You could say
these pundits are just doing what a rational pundit would do because
they know that they live in a somewhat stochastic world. They know that
it's a world that frequently is going to throw off surprises at them,
so to maintain their credibility with their community of co-believers
they need to be vague. It's an essential survival skill. There is some
considerable truth to that, and forecasting tournaments are a very
different way of proceeding. Forecasting tournaments require people
to attach explicit probabilities to well-defined outcomes in well-defined
time frames so you can keep score.
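Keeping score here means a proper scoring rule; Tetlock's tournaments use the Brier score, the squared gap between the probability you stated and what actually happened. A minimal sketch in Python, with made-up forecasts and outcomes rather than real tournament data (Tetlock's work uses a multi-category variant of the rule; this is the common binary form):

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probabilistic forecasts and binary outcomes.

    forecasts: probabilities assigned to "the event happens" (0.0 to 1.0)
    outcomes:  1 if the event happened, 0 if it did not
    Lower is better: 0.0 is perfect foresight, 0.25 is a permanent "50-50" hedge.
    """
    pairs = list(zip(forecasts, outcomes))
    return sum((p - o) ** 2 for p, o in pairs) / len(pairs)

# Explicit numbers can be scored; "could" or "might" cannot.
print(brier_score([0.7, 0.2, 0.9], [1, 0, 1]))   # ~0.047: sharp and mostly right
print(brier_score([0.5, 0.5, 0.5], [1, 0, 1]))   # 0.250: never wrong, never informative
print(brier_score([1.0, 0.0, 0.0], [1, 0, 1]))   # ~0.333: confident, but wrong once
```

The point of the rule is exactly the one made above: a forecaster who hides behind "may" or "might" cannot be fed into this calculation at all, while anyone who commits to a number can be ranked against everyone else.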
Tournaments: Prying Open Closed Minds in Unnecessarily Polarized Debates
Four Performance Drivers in Tournaments:
... the Good Judgment Project outperformed a prediction market inside the intelligence community, which was populated with professional analysts who had classified information, by 25 or 30 percent, which was about the margin by which the superforecasters were outperforming our own prediction market in the external world.
Counterfactual History: The Elusive Control Groups in Policy Debates
The U.S. intelligence community does not believe it's appropriate to hold analysts accountable for the accuracy of their forecasts. It believes it's appropriate to hold analysts accountable for the processes by which they reach their conclusions. It's not appropriate to judge them on the basis of the accuracy of their conclusions when their conclusions are about the future.
Skillful Backward and Forward Reasoning in Time: Superforecasting Requires "Counterfactualizing"
A lot of human beings take a certain umbrage at the thought
that they're just an algorithm. A lot of superforecasters
would not. They would be quite open to the possibility that you
could create a statistical model of their judgment policies in
particular domains, and that statistical model might, in the long run, outperform the forecasters themselves because it performs more reliably. They're quite open to that. Most people do not like that idea very much. That's another way in which superforecasters stand out; not all of them, as Barb points out, since they're a somewhat heterogeneous group, but the best among them would be quite open to that idea.
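The "statistical model of their judgment policies" alluded to here is in the spirit of the judgmental-bootstrapping literature (Goldberg's and Dawes's linear models of judges): regress a forecaster's own past probability estimates on the cues they appear to weigh, then apply the fitted weights mechanically. A minimal sketch, using hypothetical cue names and synthetic numbers rather than anything from the Good Judgment Project:

```python
import numpy as np

# Hypothetical cues a forecaster might weigh for "will the regime survive the year?"
# Columns: economic stress, elite cohesion, external support (all rescaled to 0-1).
cues = np.array([
    [0.8, 0.3, 0.5],
    [0.2, 0.9, 0.7],
    [0.6, 0.4, 0.2],
    [0.1, 0.8, 0.9],
    [0.7, 0.5, 0.6],
])
# The same forecaster's past probability judgments on those five cases.
judgments = np.array([0.35, 0.80, 0.40, 0.90, 0.55])

# Fit a linear model of the forecaster's policy (least squares with an intercept).
X = np.column_stack([np.ones(len(cues)), cues])
weights, *_ = np.linalg.lstsq(X, judgments, rcond=None)

# Apply the fitted policy to a new case. The model answers instantly and consistently,
# which is the reliability advantage over a tired or distracted human applying the
# same policy unevenly from day to day.
new_case = np.array([1.0, 0.4, 0.6, 0.5])  # intercept term followed by the three cues
model_forecast = float(np.clip(new_case @ weights, 0.0, 1.0))
print(round(model_forecast, 2))
```

The model has no insight the forecaster lacks; it simply applies the forecaster's own weights without noise, which is why the best superforecasters can accept being "just an algorithm" in this narrow sense.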
...
Creativity is one of those buzzwords. I'm a little hesitant to
talk about creativity. There are things they do that feel somewhat
creative, but what they mostly do is they think very carefully and
they think carefully about how they think. They're sensitive to
blind spots, and they kick themselves pretty hard when they slip
into what they knew was a trap. They slip into it anyway because
some of these traps are sneaky and they're hard to avoid.
Condensing it All Into Four Big Problems and a Killer App Solution
The first big problem I see is that in virtually all high stakes
policy debates we observe now, the participants are motivated
less by pure accuracy goals than they are by an assortment of
other goals. They're motivated by ego defense, defending their
past positions, they're motivated by self-promotion, claiming
credit for things (correctly or incorrectly), they're motivated
to affirm their loyalty to a community of co-believers because the
status of pundits hinges critically on where they stand in their
social networks. If you're a liberal or a conservative high
profile pundit, you know that if you take one for the home team,
they're going to pick you up and keep you moving along.
...
The second big point--we've already talked about attribute substitution.
High stakes partisans want to simplify an otherwise intolerably
complicated world. They use attribute substitution a lot. They take
hard questions and replace them with easy ones and they act as if
the answers to the easy ones are answers to the hard ones.
...
The third thing we talked a bit about yesterday is rhetorical obfuscation
as an essential survival strategy if you're a political pundit.
To preserve their self and public images in an environment that
throws up a lot of surprises, which, of course, the political world does,
high stakes partisans have to learn to master an arcane art:
the art of appearing to go out on a predictive limb without actually
doing it, of appearing to be making a forecast without making a forecast.
They say decisive-sounding things about Eurozone collapse or this or that,
but there are so many "may's" and "possibly's" and "could's" and so forth
that turning it into a probability estimate that can be scored for
accuracy is virtually impossible. A, they can't keep score of themselves.
B, there is no way to tell ex post which side gets closer to the truth
because each side has rhetorically positioned itself in a way that allows
it to explain what happened ex post.
...
Attribute substitution, point four, is not just going on among the
debaters, it's going on in the audience as well. Audiences are remarkably
forgiving of all these epistemological sins that debaters are committing.
There is a tendency to take partisan claims more or less at face value
as long as the partisans belong to their community of co-believers.
Addendum: Interview with Phil Tetlock on the Rationally Speaking podcast.