Taubes on the Weakness of Observational Studies, and a Methodological Rant

Last updated on January 6, 2012

One caveat is observational studies, where you identify a large cohort of people – say 80,000 people like in the Nurse’s Health Study – and you ask them what they eat. You give them diet and food frequency questionnaires that are almost impossible to fill out and you follow them for 20 years. If you look and see who is healthier, you’ll find out that people who were mostly vegetarians tend to live longer and have less cancer and diabetes than people who get most of their fat and protein from animal products. The assumption by the researchers is that this is causal – that the only difference between mostly vegetarians and mostly meat-eaters is how many vegetables and how much meat they eat.

I’ve argued that this assumption is naïve almost beyond belief. In this case, vegetarians or mostly vegetarian people are more health conscious. That’s why they’ve chosen to eat like this. They’re better educated than the mostly meat-eaters, they’re in a higher socioeconomic bracket, they have better doctors, they have better medical advice, they engage in other health conscious activities like walking, they smoke less. There’s a whole slew of things that goes with vegetarianism and leaning towards a vegetarian diet. You can’t use these observational studies to imply cause and effect. To me, it’s one of the most extreme examples of bad science in the nutrition field.

That’s Gary Taubes in a FiveBooks interview over at The Browser. Taubes is better known for his book Good Calories, Bad Calories, in which he argues that a diet rich in carbohydrates is what makes us fat and, eventually, sick, and makes the case for an alternative diet rich in fats.

I really don’t know what kind of diet is best for weight loss, but I do want to stress Taubes’ point about the weakness of observational studies, even longitudinal ones. It is not uncommon for social science researchers to say “Well, we’ve been following these people over time, so we can use fixed effects to control for unobserved heterogeneity.” That is, they control for whatever remains constant over time for each unit of observation, which is possible only because they observe each unit more than once. I have certainly been guilty of that.
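To make the fixed-effects argument concrete, here is a sketch of the textbook linear panel model. The notation is my own assumption for illustration, not something taken from any particular study: $y_{it}$ is the outcome for unit $i$ in period $t$, $x_{it}$ the variable of interest, $\alpha_i$ everything about unit $i$ that stays constant over time, and $\varepsilon_{it}$ everything else.

```latex
\begin{align}
  y_{it} &= \beta x_{it} + \alpha_i + \varepsilon_{it}, \\
  \bar{y}_i &= \beta \bar{x}_i + \alpha_i + \bar{\varepsilon}_i
    \qquad \text{(unit-level averages over } t\text{)}, \\
  y_{it} - \bar{y}_i &= \beta \,(x_{it} - \bar{x}_i)
    + (\varepsilon_{it} - \bar{\varepsilon}_i).
\end{align}
```

Subtracting the unit-level average wipes out $\alpha_i$, which is all that fixed effects do. Anything unobserved that varies within a unit over time stays in $\varepsilon_{it} - \bar{\varepsilon}_i$, so the estimate of $\beta$ is only as credible as the claim that this remainder is unrelated to $x_{it} - \bar{x}_i$.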

The truth of the matter, however, is that fixed effects rarely solve an endogeneity problem, because they do nothing about unobserved factors that change over time. This is especially true in long panels, that is, longitudinal studies over long periods of time, such as the 20-year study Taubes refers to.

(And don’t get me started on random effects which, unless you have actual experimental data, I find completely unbelievable. Even a Hausman test does not make me buy a random effects model: the test is known for performing poorly, since its null hypothesis is that random effects should be used, and it is much easier to fail to reject that null than it is to reject it…)

Sure, there are certain things that do not change over time. Gender is one of them (although not always; a large enough data set will include people who undergo a sex change operation), height might be another, but there aren’t that many.

The fact that fixed effects are not a cure-all for endogeneity problems is the reason why few people actually buy the results of cross-country regressions nowadays.

Take a bunch of countries, and regress the logarithm of their GDP per capita on some variable of interest (e.g., the amount of foreign aid they receive). Even with country fixed effects, a time trend, a rich set of controls, etc., there is almost no chance that what you’ve estimated is the causal impact of foreign aid on economic growth. There are just too many things that change over time for a given country, and no data set can possibly be rich enough to account for them all.
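A small simulation can illustrate the point. What follows is a minimal sketch on made-up data, not a replication of any actual cross-country study: the function names, the sample sizes, and the data-generating process (a true effect `beta` of 1, a unit-level unobservable `alpha`, and a period-by-period unobservable `w`) are all assumptions chosen for illustration.

```python
import numpy as np


def pooled_slope(y, x):
    """OLS slope of y on x, pooling all unit-period observations."""
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc * yc).sum() / (xc ** 2).sum())


def within_slope(y, x):
    """Fixed-effects (within) slope: demean by unit (rows), then OLS."""
    xd = x - x.mean(axis=1, keepdims=True)
    yd = y - y.mean(axis=1, keepdims=True)
    return float((xd * yd).sum() / (xd ** 2).sum())


def simulate(n_units=2000, n_periods=20, beta=1.0, seed=0):
    """Compare pooled OLS and fixed effects under two hypothetical designs."""
    rng = np.random.default_rng(seed)
    shape = (n_units, n_periods)
    alpha = rng.normal(size=(n_units, 1))  # unobserved, constant within a unit
    w = rng.normal(size=shape)             # unobserved, changes every period

    # Design A: x is confounded only by the time-invariant alpha.
    x_a = alpha + rng.normal(size=shape)
    y_a = beta * x_a + alpha + rng.normal(size=shape)

    # Design B: x is also confounded by the time-varying w.
    x_b = alpha + w + rng.normal(size=shape)
    y_b = beta * x_b + alpha + w + rng.normal(size=shape)

    return {
        "pooled_constant": pooled_slope(y_a, x_a),
        "within_constant": within_slope(y_a, x_a),
        "pooled_varying": pooled_slope(y_b, x_b),
        "within_varying": within_slope(y_b, x_b),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions, fixed effects should recover something close to the true effect of 1 when the only confounder is constant within a unit, but should land near 1.5 once a time-varying confounder enters, because demeaning removes `alpha` and leaves `w` untouched.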

That’s why for such cross-country analyses, my preferences lie with a convincing quasi-experimental design, i.e., a design that involves a plausibly exogenous instrumental variable, in the style of Acemoglu et al.’s (2001) study of the impact of institutions on economic performance (and even that has come under serious fire…)
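For readers who have not seen an instrumental variable at work, here is a minimal sketch on simulated data. It is not the Acemoglu et al. specification: the variable names and the data-generating process are hypothetical, and the instrument `z` is exogenous only because the simulation builds it that way, which is exactly the assumption that is so hard to defend with real data.

```python
import numpy as np


def ols_slope(y, x):
    """OLS slope of y on x."""
    xc = x - x.mean()
    return float((xc * (y - y.mean())).sum() / (xc ** 2).sum())


def iv_slope(y, x, z):
    """Simple IV estimator with one instrument: cov(z, y) / cov(z, x)."""
    zc = z - z.mean()
    return float((zc * (y - y.mean())).sum() / (zc * (x - x.mean())).sum())


def simulate_iv(n=50_000, beta=1.0, seed=0):
    """Return (OLS estimate, IV estimate) on hypothetical confounded data."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)  # unobserved confounder of x and y
    z = rng.normal(size=n)  # instrument: moves x, unrelated to u by construction
    x = z + u + rng.normal(size=n)
    y = beta * x + u + rng.normal(size=n)
    return ols_slope(y, x), iv_slope(y, x, z)


if __name__ == "__main__":
    ols, iv = simulate_iv()
    print(f"OLS: {ols:.3f}  IV: {iv:.3f}")
```

In this setup OLS is biased upward by the confounder, while the IV estimate should sit close to the true effect of 1. The whole result rests on `z` being unrelated to `u`, which a simulation can guarantee and an observational data set cannot.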