I had a long post yesterday on regression discontinuity design (RDD), a statistical apparatus that allows one to identify causal relationships even in the absence of randomization.
I split my discussion of RDD into two posts so as to respect my self-imposed rule #3 (“anything longer than 500 words, you split into two posts,” which is itself an example of RDD). To make a long story short, RDD assumes that units of observation (e.g., children) immediately above and below some exogenously imposed threshold (e.g., the passing mark on an entrance exam for an elite school) are otherwise similar, so that comparing units immediately above and below that threshold allows estimating a causal effect (e.g., the causal effect of going to an elite school).
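For readers who like to see the idea in code, here is a minimal sketch of a sharp RDD on simulated data. Everything in it is an assumption made up for illustration: the exam scores, the cutoff of 60, the true effect of 5, the bandwidth of 5 points, and the function names `simulate` and `rdd_estimate`. It fits a separate straight line on each side of the cutoff, using only observations within the bandwidth, and takes the gap between the two lines at the cutoff as the estimated effect. This is one common way of doing it, not the only one.

```python
import numpy as np


def simulate(n=50_000, cutoff=60.0, effect=5.0, seed=0):
    """Simulate exam scores and a later outcome (all numbers are hypothetical)."""
    rng = np.random.default_rng(seed)
    score = rng.uniform(0.0, 100.0, n)   # running variable: entrance exam score
    treated = score >= cutoff            # sharp design: admission decided by the cutoff alone
    # The outcome rises smoothly with the score and jumps by `effect` at the cutoff.
    outcome = 20.0 + 0.3 * score + effect * treated + rng.normal(0.0, 2.0, n)
    return score, outcome


def rdd_estimate(score, outcome, cutoff, bandwidth):
    """Local linear estimate of the jump in the outcome at the cutoff."""
    x = score - cutoff                   # center the running variable at the cutoff
    below = (x < 0) & (x >= -bandwidth)
    above = (x >= 0) & (x <= bandwidth)
    # np.polyfit with deg=1 returns [slope, intercept]; because x is centered,
    # each intercept is the fitted outcome exactly at the cutoff.
    _, at_cutoff_from_below = np.polyfit(x[below], outcome[below], 1)
    _, at_cutoff_from_above = np.polyfit(x[above], outcome[above], 1)
    return at_cutoff_from_above - at_cutoff_from_below


score, outcome = simulate()
estimate = rdd_estimate(score, outcome, cutoff=60.0, bandwidth=5.0)
print(f"Estimated jump at the cutoff: {estimate:.2f} (simulated true effect: 5.00)")
```

The bandwidth is the judgment call here: a narrower window makes the "similar units" assumption more believable but leaves fewer observations, so the estimate gets noisier.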
An RDD is nice to have when eligibility for some treatment (e.g., going to an elite school) is determined by a single threshold. Often, however, there will be multiple criteria, each with its own threshold, which are aggregated into a single index without any clear idea as to what weight is given to each variable. So what are we to do in those cases?