
Category: Impact Evaluation

Evaluating the Impact of Policies Using Regression Discontinuity Design, Part 2

I had a long post yesterday on regression discontinuity design (RDD), a statistical method that allows researchers to identify causal relationships even in the absence of randomization.

I split my discussion of RDD into two posts so as to respect my self-imposed rule #3 (“anything longer than 500 words, you split into two posts,” which constitutes an example of RDD in itself). To make a long story short, the assumption behind RDD is that units of observation (e.g., children) immediately above and below some exogenously imposed threshold (e.g., the passing mark on an entrance exam for an elite school) are similar, so that comparing units immediately above and below that threshold allows estimating a causal effect (e.g., the causal effect of attending an elite school).
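To make the idea concrete, here is a minimal sketch of a sharp RDD estimate in the spirit of the elite-school example. Everything here is invented for illustration: the simulated exam scores, the cutoff of 60, the bandwidth of 10, and the true jump of 5 points are all assumptions, not results from any actual study. The estimate is the difference between two local linear fits evaluated at the threshold.

```python
# Hypothetical sharp RDD sketch: entrance-exam score is the running
# variable, passing the cutoff means admission to the elite school,
# and we look for a jump in a later outcome at the cutoff.
import numpy as np

rng = np.random.default_rng(0)
n = 5000
score = rng.uniform(0, 100, n)        # running variable: exam score
cutoff = 60.0                         # exogenously imposed passing mark
elite = score >= cutoff               # treatment: admitted above the cutoff
# outcome varies smoothly with score, plus a true jump of 5 at the cutoff
outcome = 20 + 0.3 * score + 5.0 * elite + rng.normal(0, 2, n)

bandwidth = 10.0                      # only compare units near the cutoff
window = np.abs(score - cutoff) <= bandwidth

def intercept_at_cutoff(x, y):
    """Intercept of an OLS line fit to (x, y), with x centered at the cutoff."""
    slope, intercept = np.polyfit(x, y, 1)
    return intercept

# separate local linear fits on each side, both evaluated at the cutoff
below = window & ~elite
above = window & elite
b0 = intercept_at_cutoff(score[below] - cutoff, outcome[below])
a0 = intercept_at_cutoff(score[above] - cutoff, outcome[above])
rdd_estimate = a0 - b0                # estimated jump at the threshold
print(round(rdd_estimate, 2))         # should land near the true effect of 5
```

The bandwidth choice matters in practice: a narrower window makes the “units just above and below are similar” assumption more credible but leaves fewer observations to estimate with.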

An RDD design is nice to have when eligibility for some treatment (e.g., going to an elite school) depends on a single threshold. Often, however, there will be multiple thresholds, which are aggregated into a single index without any clear indication of what weight is given to each variable. So what are we to do in those cases?

Evaluating the Impact of Policies Using Regression Discontinuity Design, Part 1

Do students in smaller classes perform better than students in larger classes?

The answer might seem obvious. After all, students in smaller classes receive more attention from teachers, and so they should perform better.

We cannot know for sure, however, without looking at actual data on class size and student performance. In order to do so, we could collect data on student performance from various schools whose class sizes vary and look at whether students in smaller classes perform better.

But that wouldn’t be enough to determine whether smaller classes actually cause students to perform better. Correlation is not causation, and it could be the case that high-performing students are assigned to smaller classes composed of similar students. Thus, finding a correlation between class size and student performance would not be an indication that smaller classes cause students to perform better — only that school administrators want to put high-performing students in the same classes.

So how are we to know whether smaller classes actually cause students to perform better? One way could be to create classes of varying sizes (say, classes of 15, 30, 45, and 60 students) and randomly assign students to a given class size at the beginning of the year. Then, we could collect data on student performance on a standardized year-end exam and test whether average student performance is better in smaller than in bigger classes. Unfortunately, such a nice, clean experiment isn’t always feasible.
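The thought experiment above can be sketched in a few lines of simulation. The data-generating process here is entirely made up (the class sizes, the “ability” term, and the assumed 0.1-point penalty per additional student are all illustrative assumptions), but it shows why random assignment works: because class size is assigned independently of ability, a simple comparison of group means identifies the causal effect.

```python
# Toy sketch of the randomized class-size experiment described above.
# All parameters are invented for illustration.
import numpy as np

rng = np.random.default_rng(1)
n = 8000
ability = rng.normal(0, 10, n)                  # unobserved student ability
size = rng.choice([15, 30, 45, 60], size=n)     # random assignment to class size
# assume (purely for illustration) each extra classmate costs 0.1 exam points
score = 70 - 0.1 * size + ability + rng.normal(0, 5, n)

# random assignment breaks the link between size and ability,
# so year-end mean scores by class size estimate the causal effect
for s in [15, 30, 45, 60]:
    print(s, round(score[size == s].mean(), 1))
```

Contrast this with the sorting story in the observational case: if administrators assigned high-ability students to small classes, `ability` and `size` would be correlated and the same comparison of means would be biased.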

Taubes on the Weakness of Observational Studies, and a Methodological Rant

One caveat is observational studies, where you identify a large cohort of people – say 80,000 people like in the Nurses’ Health Study – and you ask them what they eat. You give them diet and food frequency questionnaires that are almost impossible to fill out and you follow them for 20 years. If you look and see who is healthier, you’ll find out that people who were mostly vegetarians tend to live longer and have less cancer and diabetes than people who get most of their fat and protein from animal products. The assumption by the researchers is that this is causal – that the only difference between mostly vegetarians and mostly meat-eaters is how many vegetables and how much meat they eat.

I’ve argued that this assumption is naïve almost beyond belief. In this case, vegetarians or mostly vegetarian people are more health conscious. That’s why they’ve chosen to eat like this. They’re better educated than the mostly meat-eaters, they’re in a higher socioeconomic bracket, they have better doctors, they have better medical advice, they engage in other health conscious activities like walking, they smoke less. There’s a whole slew of things that goes with vegetarianism and leaning towards a vegetarian diet. You can’t use these observational studies to imply cause and effect. To me, it’s one of the most extreme examples of bad science in the nutrition field.

That’s Gary Taubes in a FiveBooks interview over at The Browser. Taubes is best known for his book Good Calories, Bad Calories, in which he argues that a diet rich in carbohydrates is what makes us fat and, eventually, sick, and advocates instead an alternative diet rich in fats.

I really don’t know what kind of diet is best for weight loss, but I do want to stress Taubes’ point about the weakness of observational studies, even longitudinal ones. It is not uncommon for social science researchers to say “Well, we’ve been following these people over time, so we can use fixed effects to control for unobserved heterogeneity.” That is, they control for whatever remains constant within each unit of observation over time, which is possible because they observe each unit more than once. I have certainly been guilty of that.
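For readers unfamiliar with the mechanics, here is a minimal sketch of the fixed-effects (within) estimator that claim refers to: demeaning each unit’s repeated observations wipes out anything constant for that unit over time. The numbers below are invented; the point is only that the within transformation removes bias from a *time-invariant* confounder, which is exactly why it does nothing about confounders that vary over time.

```python
# Minimal within-estimator sketch on simulated panel data.
# alpha is an unobserved, time-invariant trait correlated with x.
import numpy as np

rng = np.random.default_rng(2)
units, periods = 500, 4
alpha = rng.normal(0, 5, units)                          # unit fixed effect
x = rng.normal(0, 1, (units, periods)) + 0.5 * alpha[:, None]
y = 2.0 * x + alpha[:, None] + rng.normal(0, 1, (units, periods))

# pooled OLS is biased upward: x is correlated with the omitted alpha
pooled = np.polyfit(x.ravel(), y.ravel(), 1)[0]

# within transformation: subtract each unit's own mean over time,
# so alpha (constant within a unit) drops out entirely
xd = (x - x.mean(axis=1, keepdims=True)).ravel()
yd = (y - y.mean(axis=1, keepdims=True)).ravel()
fe = (xd @ yd) / (xd @ xd)             # no-intercept OLS on demeaned data

print(round(pooled, 2), round(fe, 2))  # fe should sit near the true 2.0
```

If the confounder instead varied over time (say, a unit becoming more health-conscious mid-panel), it would survive the demeaning, which is precisely the weakness Taubes is gesturing at.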