I had a long post yesterday on regression discontinuity design (RDD), a statistical method that allows researchers to identify causal relationships even in the absence of randomization.
I split my discussion of RDD into two posts so as to respect my self-imposed rule #3 (“anything longer than 500 words, you split into two posts,” which constitutes an example of RDD in itself). To make a long story short, the assumption behind RDD is that units of observation (e.g., children) immediately above and below some exogenously imposed threshold (e.g., the passing mark on an entrance exam for an elite school) are similar, so that comparing units immediately above and below that threshold allows estimating a causal effect (e.g., the causal effect of attending an elite school).
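To make that logic concrete, here is a minimal simulated sketch of a sharp RDD. All of the numbers (the cutoff, the true effect of 2.0, the bandwidth) are made up for illustration; the point is only that a local comparison at the threshold recovers the jump in outcomes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simulated data: exam scores (the running variable) and a
# later outcome. Treatment (elite school) is determined by a passing mark.
n = 10_000
score = rng.uniform(0, 100, n)
cutoff = 60.0
elite = score >= cutoff  # sharp design: treatment switches at the cutoff

# Outcome depends smoothly on score, plus a true jump of 2.0 at the cutoff.
outcome = 0.05 * score + 2.0 * elite + rng.normal(0, 1, n)

# Local linear fit on each side of the cutoff, evaluated at the cutoff;
# the difference in intercepts estimates the causal effect.
bw = 10.0
below = (score >= cutoff - bw) & (score < cutoff)
above = (score >= cutoff) & (score < cutoff + bw)
fit_lo = np.polyval(np.polyfit(score[below], outcome[below], 1), cutoff)
fit_hi = np.polyval(np.polyfit(score[above], outcome[above], 1), cutoff)
print(f"Estimated jump at cutoff: {fit_hi - fit_lo:.2f}")  # close to 2.0
```

The local linear fits on each side absorb the smooth relationship between score and outcome, so the remaining discontinuity at the cutoff is attributed to treatment — which is exactly the "units just above and just below are similar" assumption at work.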
An RDD design is nice to have when eligibility for some treatment (e.g., going to an elite school) is determined by a single threshold. Often, however, eligibility depends on multiple criteria, which are aggregated into a single index with no clear rationale for the weight given to each variable. So what are we to do in those cases?
In a new working paper, Alan de Brauw and Dan Gilligan develop a method to deal with such cases:
Regression discontinuity design (RDD) is a useful tool for evaluating programs when a single variable is used to determine program eligibility. RDD has also been used to evaluate programs when eligibility is based on multiple variables that have been aggregated into a single index using explicit, often arbitrary, weights. In this paper, we show that under specific conditions, regression discontinuity can be used in instances when more than one variable is used to determine eligibility, without assigning explicit weights to map those variables into a single measure. The RDD approach used here groups observations that are common across multiple criteria through the use of a distance metric to create an implicit partition between groups. We apply this model to evaluate the impact of the conditional cash transfer program Comunidades Solidarias Rurales in El Salvador (…).
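To get some intuition for how a distance metric can collapse multiple eligibility criteria into a single running variable, here is an illustrative simulation. This is my own sketch, not de Brauw and Gilligan's actual estimator: I assume two hypothetical criteria with separate cutoffs, require households to fall below both to be eligible, and use the signed distance to the eligibility frontier as the running variable in a standard RDD:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup (not the paper's estimator): two eligibility criteria,
# e.g., a poverty index and an asset index, each with its own cutoff.
# Households qualify only if they fall below BOTH cutoffs.
n = 20_000
poverty = rng.uniform(0, 100, n)
assets = rng.uniform(0, 100, n)
c_pov, c_ast = 40.0, 50.0
eligible = (poverty < c_pov) & (assets < c_ast)

# Standardize each criterion's gap to its cutoff so they are comparable.
g_pov = (poverty - c_pov) / poverty.std()
g_ast = (assets - c_ast) / assets.std()

# Signed distance to the eligibility frontier: negative inside the eligible
# region (distance to the nearest boundary), positive outside (Euclidean
# distance using only the violated criteria).
inside = np.maximum(g_pov, g_ast)  # < 0 exactly when eligible
outside = np.hypot(np.maximum(g_pov, 0), np.maximum(g_ast, 0))
dist = np.where(eligible, inside, outside)

# Simulated outcome with a true program effect of 1.5 for eligible households.
outcome = 10 + 3 * g_pov + 1.5 * eligible + rng.normal(0, 1, n)

# Standard RDD on the single running variable `dist`, with the cutoff at 0:
bw = 0.2
lo = (dist > -bw) & (dist < 0)
hi = (dist >= 0) & (dist < bw)
jump = (np.polyval(np.polyfit(dist[lo], outcome[lo], 1), 0)
        - np.polyval(np.polyfit(dist[hi], outcome[hi], 1), 0))
print(f"Estimated program effect near the frontier: {jump:.2f}")
```

The key move is that no explicit weights are chosen: each household's distance to the eligibility frontier is computed directly from the criteria, and the comparison is between households just inside and just outside that frontier.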
I first became aware of that method when Alan mentioned it during a conversation in which we were discussing the possibility of evaluating an NGO's program in Ethiopia. The program in question was given to some people in the treatment villages (but not to others) according to relatively unclear criteria, so I expressed my doubts as to whether it was even feasible to evaluate the program using RDD. Alan then referred me to the aforementioned paper.
Ultimately, we had to abandon the idea of evaluating that program, because it was not possible to obtain the names of people who almost received the program but who ended up not receiving it, and so it was not possible to exploit an RDD design. That is unfortunate for two reasons: (i) I have never run my own RDD, and it would have been very nice to do so, and (ii) it would have been nice to work with Alan, whom I first met at the first development economics conference I ever attended, back in 2003.