
Category: Social Sciences

Replication, Publication Bias, and Negative Findings

I came across a fascinating read on some of the important problems that plague the scientific process in the social sciences and elsewhere. From an article by Ed Yong in the May 2012 edition of Nature:

Positive results in psychology can behave like rumours: easy to release but hard to dispel. They dominate most journals, which strive to present new, exciting research. Meanwhile, attempts to replicate those studies, especially when the findings are negative, go unpublished, languishing in personal file drawers or circulating in conversations around the water cooler. “There are some experiments that everyone knows don’t replicate, but this knowledge doesn’t get into the literature,” says Wagenmakers. The publication barrier can be chilling, he adds. “I’ve seen students spending their entire PhD period trying to replicate a phenomenon, failing, and quitting academia because they had nothing to show for their time.” (…)

One reason for the excess in positive results for psychology is an emphasis on “slightly freak-show-ish” results, says Chris Chambers, an experimental psychologist at Cardiff University, UK. “High-impact journals often regard psychology as a sort of parlour-trick area,” he says. Results need to be exciting, eye-catching, even implausible. Simmons says that the blame lies partly in the review process. “When we review papers, we’re often making authors prove that their findings are novel or interesting,” he says. “We’re not often making them prove that their findings are true.”

I have briefly discussed the lack of replication in economics here, but in short, the issue is that once a finding is published, there are practically no incentives for people to replicate those findings.

There are two reasons for this. The first is that journals tend to want to publish only novel results, so even if you manage to confirm someone else’s findings, there will be few takers for your study unless you do something significantly different… in which case you’re no longer doing replication.

The second is the tendency to publish only studies in which the authors find support for their hypothesis. This is known as “publication bias.”

For example, suppose I hypothesize that individuals’ consumption increases as their income increases, and suppose I find support for that hypothesis using data on US consumers. This result eventually gets published in a scientific journal. Suppose now that you try to replicate my finding using Canadian data and fail. Few journals would actually be interested in such a result. That is partly because failing to reject the null hypothesis in a statistical test is not surprising (after all, the test is set up to give the null hypothesis that consumption is not associated with income the benefit of the doubt at the 90, 95, or 99 percent confidence level), and partly because, as Yong’s article highlights, it would not exactly be an “exciting, eye-catching” result.
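For readers who want to see the mechanics of that example, here is a minimal sketch in Python, using statsmodels and simulated data (neither the numbers nor the samples come from any actual US or Canadian survey): the same regression is run on a sample where consumption truly rises with income and on a "replication" sample where it does not.

```python
# Minimal sketch of the hypothesis test described above, on simulated data.
# In the first sample, consumption genuinely rises with income; in the
# second, it does not, so the test will in all likelihood fail to reject
# the null of no association.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 500

income = rng.normal(50_000, 15_000, n)

# "Original" sample: consumption genuinely increases with income.
consumption_orig = 5_000 + 0.6 * income + rng.normal(0, 8_000, n)

# "Replication" sample: no true relationship between the two variables.
consumption_repl = 35_000 + rng.normal(0, 8_000, n)

for label, consumption in [("original", consumption_orig),
                           ("replication", consumption_repl)]:
    X = sm.add_constant(income)
    fit = sm.OLS(consumption, X).fit()
    slope, pval = fit.params[1], fit.pvalues[1]
    verdict = "reject" if pval < 0.05 else "fail to reject"
    print(f"{label}: slope = {slope:.3f}, p = {pval:.3f} -> {verdict} "
          "the null of no association at the 5% level")
```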

I am currently dealing with such a “negative finding” in one of my papers, in which I find that land titles in Madagascar, a context where donors have invested hundreds of millions of dollars in various land titling policies, do not have the positive impact on productivity posited by the theoretical literature. Perhaps unsurprisingly, the paper has proven to be a very tough sell.

(HT: David McKenzie.)

Fixing the Peer Review Process by Crowdsourcing It? (Continued)

We call the fallout to any article the “comments,” but since they are often filled with solid arguments, smart corrections and new facts, the thing needs a nobler name. Maybe “gloss.” In the Middle Ages, students often wrote notes in the margins of well-regarded manuscripts. These glosses, along with other forms of marginalia, took on a life of their own, becoming their own form of knowledge, as important as, say, midrash is to Jewish scriptures. The best glosses were compiled into, of course, glossaries and later published — serving as some of the very first dictionaries in Europe.

Any article, journalistic or scientific, that sparks a debate typically winds up looking more like a good manuscript 700 years ago than a magazine piece only 10 years ago. The truth is that every decent article now aspires to become the wiki of its own headline.

Sure, there is still the authority that comes of being a scientist publishing a peer-reviewed paper, or a journalist who’s reported a story in depth, but both such publications are going to be crowd-reviewed, crowd-corrected and, in many cases, crowd-improved. (And sometimes, crowd-overturned.) Granted, it does require curating this discussion, since yahoos and obscenity mavens tend to congregate in comment sections.

That’s from a New York Times op-ed in last weekend’s Sunday Review by Jack Hitt, who is also a frequent contributor to This American Life (here is my favorite This American Life story by Jack Hitt).

Hitt’s point should be taken more seriously by academics. In all fairness, however, in some corners of academia the idea is being taken seriously: the AEJs — the four new journals of the American Economic Association — have a comments section for every published article. (I don’t know why the AEA has not also done so for its flagship journal, the American Economic Review.)

Unfortunately, readers of the AEJs have been slow to embrace that change: few articles appear to have garnered any comments, and a quick look at the latest issue of each AEJ turns up none at all. Perhaps the problem is that one needs to be a member of the AEA to comment.

If those comment threads ever take off, and if other journals start offering similar comment sections, this would be a cheap, quick way of building canonical knowledge within any discipline, as I discussed in my previous post on this topic.

Identifying Causal Relationships vs. Ruling Out All Other Possible Causes

Portrait of Aristotle (Source: Wikimedia Commons.)

I was in Washington last month to present my work on food prices, in which I look at whether food prices cause social unrest, at an event whose goal was to discuss the link between climate change and conflict.

As many readers of this blog know, disentangling causal relationships from mere correlations is the goal of modern science, social or otherwise, and though it is easy to test whether two variables x and y are correlated, it is much more difficult to determine whether x causes y.

So while it is easy to test whether increases in the level of food prices are correlated with episodes of social unrest, it is much more difficult to determine whether food prices cause social unrest.
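To see why correlation alone settles nothing, here is a minimal simulated sketch in Python; the variables are made up purely for illustration and have nothing to do with the actual food-price data:

```python
# Minimal sketch of a spurious correlation: an unobserved common driver z
# moves both x and y, so they are strongly correlated even though x has
# no causal effect on y in the data-generating process below.
import numpy as np

rng = np.random.default_rng(0)
n = 2_000

z = rng.normal(size=n)              # unobserved common driver
x = 2.0 * z + rng.normal(size=n)    # x is driven by z, not by y
y = 3.0 * z + rng.normal(size=n)    # y is driven by z, not by x

print("corr(x, y) =", round(np.corrcoef(x, y)[0, 1], 2))  # strong (about 0.85)
# Yet x has a causal effect of exactly zero on y by construction.
```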

In my work, I try to do so by conditioning food prices on natural disasters. To make a long story short, if you believe that natural disasters affect social unrest only through food prices, then the estimated relationship between food prices and social unrest is purged of any variation that does not flow from food prices to social unrest. In other words, this ensures that the estimated relationship between the two variables is causal. This technique is known as instrumental variables estimation.
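For the technically inclined, here is a minimal sketch of that logic in Python, using simulated data and a hand-rolled two-stage least squares rather than anything from my actual paper: an unobserved factor drives both prices and unrest, so a naive regression is biased, but instrumenting prices with disasters, which are assumed to affect unrest only through prices, recovers the true effect.

```python
# Minimal sketch of instrumental-variables (two-stage least squares)
# estimation on simulated data; variable names mirror the discussion above,
# but none of this is the paper's actual data or model.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 5_000

u = rng.normal(size=n)               # unobserved factor driving both variables
disasters = rng.binomial(1, 0.2, n)  # instrument: shifts prices, affects
                                     # unrest only through prices
food_prices = 1.0 + 2.0 * disasters + 1.5 * u + rng.normal(size=n)
unrest = 0.5 + 1.0 * food_prices + 2.0 * u + rng.normal(size=n)  # true effect = 1.0

# Naive OLS of unrest on food prices is biased because of u.
ols = sm.OLS(unrest, sm.add_constant(food_prices)).fit()

# Stage 1: regress the endogenous regressor (prices) on the instrument.
stage1 = sm.OLS(food_prices, sm.add_constant(disasters)).fit()
prices_hat = stage1.fittedvalues

# Stage 2: regress unrest on the fitted prices from stage 1.
# (The coefficient is consistent; the reported standard errors are not,
# which is why canned 2SLS routines are used in practice.)
stage2 = sm.OLS(unrest, sm.add_constant(prices_hat)).fit()

print("OLS estimate of the price effect:", round(ols.params[1], 2))     # well above 1.0
print("2SLS estimate of the price effect:", round(stage2.params[1], 2))  # close to 1.0
```

In practice one would use a packaged two-stage least squares routine, which also gets the standard errors right, but the two stages above are the mechanics.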

Identifying Causal Relationships vs. Ruling Out All Other Causes

As with almost any other discussion of a social-scientific issue nowadays, the issue of causality came up during one of the discussions we had at that event in Washington. It was at that point that someone implied that it did not make sense to talk of causality by bringing up the following analogy: