Do Data Want to Be Free?

Last updated on November 13, 2011

Yes, says Kim Yi Dionne:

I was brought up in the tradition of sharing your data with others. There has been a lot of focus on the hassles of doing so (anonymizing, cleaning, getting “scooped,” etc.)

But there are a number of reasons to do it. First, only by sharing the data do you allow others to be able to improve upon your ideas (and, hopefully, selfishly, to cite your work). In fact, one study showed that sharing detailed research data is associated with an increased citation rate.

The principle of sharing your data also strikes me as a way to signal that your findings are honest. A professor of mine demonstrated to us in an advanced methods course the difficulty of replicating findings if an author doesn’t think about potential replication when submitting a piece for publication. I decided from that point forward that I would always submit a final paper only after drafting an intelligible do-file and paring down a data file that could be uploaded online for someone else to replicate. A new study in PLoS One finds the willingness to share data is related to the strength of evidence and the quality of reporting results.

I agree with Kim. That is why I post my code and data on my research page as soon as an article is accepted. I do it not only because it can only increase my number of citations, but also because, in an age when referees can easily Google the authors of a given paper, posting the code and data for all of your published papers can send a powerful signal that your empirical results are trustworthy.