"Economics research is usually not replicable" say (pdf) Andrew Chang and Phillip Li (via).
This problem is not confined to economics: it also seems to be the case in psychology (pdf) and medicine. Perhaps, therefore, the problem is a general(ish) academic one: pressure to publish and win research grants leads researchers to, ahem, tweak their findings.
Nevertheless, it poses a problem for those of us who are consumers of economic research: how should we respond to this?
First, we should remember Dave Giles' Ten Commandments, in particular:
Check whether your "statistically significant" results are also "numerically/economically significant"...
Keep in mind that published results will represent only a fraction of the results that the author obtained, but is not publishing.
Secondly, we should remember the Big Facts. For example, one of the Big Facts in finance is that active equity fund managers rarely (pdf) beat the market (pdf) for very long, at least after fees. This, as much as Campbell Harvey's statistical work (pdf), reminds us to be wary of the hundreds of papers claiming to find factors that beat the market.
To take another example, Pritchett and Summers remind us of another Big Fact (pdf):
The single most robust empirical finding about economic growth is low persistence of growth rates...Episodes of super-rapid growth tend to be of short duration and end in decelerations back to the world average growth rate.
This warns us to be sceptical of findings that national policies, institutions or cultures have significant long-lasting effects on growth.
Thirdly, we should ask: is this paper consistent or not with other findings?
Take this paper which claims that fund managers perform badly after they have suffered a bereavement. This seems a mere curiosum. It's not. It's consistent with experimental evidence that sadness increases present bias, and with another Big Fact - that stock markets do better in winter, perhaps because longer nights in the autumn depress investors and share prices.
One of my favourite examples here is momentum. When I first saw Jegadeesh and Titman's claim that shares that have recently risen tend to carry on rising whilst fallers continue falling, I thought it was an interesting curiosity. But a similar thing has been found in currencies, commodities and even in sports betting. All this suggests that momentum is a reasonably robust fact.
But here we have an inconsistency: how do we reconcile this claim with the Big Fact that fund managers don't often beat stock markets?
This brings me to my fourth principle. We must ask: is there a sound theoretical reason for these findings, one which reconciles them with the Big Facts?
Initially I thought momentum's out-performance was possible because fund managers were looking for accounting-based anomalies rather than behavioural ones. This, though, has become less plausible given the interest in behavioural finance since the 90s. But there is another possibility, pointed out by Victoria Dobrynskaya. It's that momentum stocks often have "bad beta": they carry benchmark risk which makes them unattractive to fund managers, with the result that they are under-priced to reflect such risk.
Now, here's the thing. It is, I suspect, rare for papers to satisfy these criteria. Of course, a lot of findings are worth thinking about. But few are actionable. And those that are actionable are the ones which fit other findings, Big Facts and theory.
My principles, I hope, are a form of informal Bayesianism. They are intended to steer the middle ground between a bigoted cleaving to our own prejudices on the one hand and gee-whizz gullibility on the other.
You might wonder why I've taken my examples from financial economics. It's partly because my ignorance of this field is less profound than others. But it's also to remind you of another Big Fact - that economic research is not, and should not be, about armchair windbaggery but rather about how to help people make better real world decisions.
I think you mean the late Peter Kennedy's 10 commandments for applied econometrics
http://www.stat.columbia.edu/~gelman/stuff_for_blog/KennedyJEconomicSurveys2002.pdf
Posted by: Jonathan | October 08, 2015 at 02:05 PM
"This warns us to be sceptical of findings that national policies, institutions or cultures have significant long-lasting effects on growth."
I don't think that conclusion is warranted. Growth may be erratic if you are looking at short frequencies, but wealth and poverty are extremely persistent.
http://www.res.org.uk/details/mediabrief/4411361/Twin-Peaks-How-The-World-Is-Polarizing-Into-Rich-And-Poor.html
Posted by: Luis Enrique | October 08, 2015 at 02:24 PM
'"Economics research is usually not replicable" say (pdf) Andrew Chang and Phillip Li (via).
This problem is not confined to economics: it also seems to be the case in psychology (pdf) and medicine.'
Chang and Li are mistaking reproducibility for replicability in that paper. There is some of that (irreproducibility) shit in psych. and med. of course but it's not the same problem at all. Irreproducible research is even shoddier than unreplicable research.
Posted by: phayes | October 08, 2015 at 02:30 PM
also I think cock-up over conspiracy is usually the explanation, when it comes to replicability.
Posted by: Luis Enrique | October 08, 2015 at 02:30 PM
"another Big Fact - that stock markets do better in winter, perhaps because longer nights in the autumn depress investors and share prices."
Christmas?
Posted by: theOnlySanePersonOnPlanetEarth | October 08, 2015 at 06:32 PM
Sorry to disagree, Chris, but economists don't know what the word "replicate" means.
A failure to replicate an experiment in medicine, for instance, as damaging as it may be for the science, doesn't mean what you think it means.
Posted by: Magpie | October 09, 2015 at 09:05 AM
and an economist on what replication means
http://www.cgdev.org/publication/meaning-failed-replications-review-and-proposal-working-paper-399
Posted by: Luis Enrique | October 09, 2015 at 10:40 AM
From Dan Hirschman,
https://scatter.wordpress.com/2014/07/16/expermental-vs-statistical-replication/
"... experimental replication vs. statistical replication [reproduction].
Experimental replication is the more obvious kind:
can we run a new experiment using the same methods and produce
a substantially similar result?
Statistical replication [reproduction], on the other hand, asks,
can we take the exact same data, run the same or similar
statistical models, and reproduce the reported results?
In other words, experimental replication is about generalizability,
while statistical replication [reproduction] is about data manipulation and model specification."
"On the one hand, sociology, economics, and political science all have
ongoing issues with statistical replication [reproduction].
The big Reinhart and Rogoff controversy was the result of an attempt to replicate [reproduce] a statistical finding that revealed some unreported shenanigans in how cases were
weighted, and that some cases were simply dropped through error."
Economics fails even the far simpler test of statistical reproduction, which doesn't require running any additional experiment. Meanwhile, psychology and pharmacy cannot experimentally replicate 60-70 percent of published results.
This experimental unreplicability is understandable given classical (frequentist) approaches to statistics.
The paper "Why Most Published Research Findings are False" by John Ioannidis gives a formula for the probability that a published statistical relationship is true, calling it the Positive Predictive Value (PPV): "After a research finding has been claimed based on achieving formal statistical significance, the post-study probability that it is true is the Positive Predictive Value, PPV."
PPV = (1 - beta) R / (R - beta * R + alpha)
    = 1 / [1 + alpha / ((1 - beta) R)]
where
alpha = 0.05 usually -- the probability of a Type I error
beta is the probability of a Type II error -- (1 - beta is the power)
R is the ratio of true relationships to no [false] relationships in that FIELD of tests.
R / (R + 1) is then the pre-study probability that the relationship is true. I will call this the "Background Probability" of a true relationship.
While the researcher/statistician can set alpha = 0.05 and can aim for a power of 0.80 (beta = 0.20), the probabilistic meaning of these numbers is clouded by their frequentist interpretation. What the statistician can't set, and what is never mentioned -- the Background Probability -- is what matters most in most research!
In a field like molecular biology, with alpha = 0.05 and 30 cancer-related genes out of 30,000, the PPV will be at most about 0.02 -- almost no published results would be true!
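For concreteness, here is a minimal sketch of that calculation, assuming Python; the alpha, power and gene counts are the ones quoted above, and the function name is purely illustrative.

```python
# Minimal sketch of the PPV formula as quoted above.
# alpha: Type I error rate; power = 1 - beta; R: ratio of true to false
# relationships in the field. (Function name is illustrative only.)

def ppv(alpha, power, R):
    """Post-study probability that a claimed finding is true."""
    return power * R / (power * R + alpha)

# Molecular-biology example: 30 true cancer-related genes out of 30,000 candidates.
R = 30 / 29_970                                     # ratio of true to false relationships
print(f"{ppv(alpha=0.05, power=0.80, R=R):.3f}")    # roughly 0.016, i.e. about 0.02
```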
Posted by: Jameson Burt | October 09, 2015 at 08:25 PM
"This warns us to be sceptical of findings that national policies, institutions or cultures have significant long-lasting effects on growth."
I think this, Chris, is very dubious. It is true that industrialisation is a one-off thing - and therefore you get 'catch-up' growth - historians have long known that. The fact is, however, that some countries have industrialised, some have not, and some have entered the growth curve, others not. Once on it, some have industrialised faster and have attained higher GNP per capita and more egalitarian outcomes. These are the results of such things as policies, cultures, social and institutional structures, politics and geography. Think about it: why is it that they are concentrated in North Asia, Europe and the US and not Africa or South America?
I would also refer to more discussion in Lars Syll's post:
"First, there is a structural break in the aftermath ofWorldWar II that significantly affects the evolution of the income distribution; another structural break may have occurred in the late 1990s. Second, certain groups of states … show a development that is different from other states … Third, states with poor neighbors show a different development than states with rich neighbors. Moreover, a choice for annual transition periods is shown to be inconsistent with the Markov property. Ignoring these factors may considerably affect the correctness of inferences about the evolution of the regional income distribution drawn from the limiting distribution."
https://larspsyll.wordpress.com/2015/10/09/economic-convergence-and-the-markov-chain-approach/
Posted by: Nanikore | October 11, 2015 at 09:00 AM
To pile on re: growth. "Reversion to the mean" is not a mechanism. We're desperately short of an explanation of why growth rates centre on the rates that they tend to...
As you've said before - no mechanism means it's not a useful theory.
Posted by: Metatone | October 11, 2015 at 02:40 PM
Episodes of super-rapid growth tend to be of short duration and end in decelerations back to the world average growth rate.
This isn't at all surprising, and not because policy doesn't matter. As Scott Sumner put it, policy affects levels, not growth rates (except in transition).
That is, a policy improvement will lead to higher growth rates until the country reaches the new GDP equilibrium for its policy and endowments, at which point it reverts to the global average growth rate. There's no reason to expect a policy change to raise growth rates indefinitely.
The reason for this, as I understand it, is that long-run economic growth is largely a function of technological improvements, availability of capital, and policy. Technology is pretty much globally uniform. Capital markets are global. Policy can affect the willingness of foreign investors to invest in your country, and how effective they can make that capital, but again, this only affects growth rates in transition.
Note that for very large countries, it can take a very long time to reach equilibrium. Over 25 years later, China is still moving at an accelerated growth rate towards the new equilibrium set by abandoning Maoism in favor of merely bad policy.
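A toy numerical illustration of that point, with entirely invented numbers: a one-off reform raises the equilibrium level of GDP, growth runs above the world average only while the economy converges to that new level, and then reverts.

```python
# Toy illustration (invented numbers) of "policy affects levels, not growth rates":
# a one-off reform raises the equilibrium level of GDP; growth is temporarily
# above the world average during the transition, then reverts to it.

world_growth = 0.02          # long-run world average growth rate
gdp = equilibrium = 100.0

for year in range(1, 31):
    equilibrium *= 1 + world_growth   # the equilibrium path grows at the world rate
    if year == 5:
        equilibrium *= 1.5            # reform raises the equilibrium level by 50%
    target = gdp * (1 + world_growth)                   # GDP with world-rate growth alone
    new_gdp = target + 0.10 * (equilibrium - target)    # close 10% of the remaining gap each year
    print(f"year {year:2d}: growth {new_gdp / gdp - 1:6.1%}")
    gdp = new_gdp
```

Growth sits at the world rate until the reform, spikes in the year the equilibrium level jumps, then decays back towards the world rate as the gap closes.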
Posted by: Brandon Berg | October 13, 2015 at 03:52 PM
First time commenter; didn't realize your blog software eats markup. Here's the link to Sumner: http://econlog.econlib.org/archives/2015/08/policy_affects.html
Posted by: Brandon Berg | October 13, 2015 at 03:53 PM