
‘Metrics Monday: How to Systematically Think About Selection

Last updated on November 19, 2016

Jeffrey Smith and Arthur Sweetman have a very nice viewpoint article titled “Estimating the Causal Effects of Policies and Programs” in the latest issue of the Canadian Journal of Economics. The article is organized around three points, viz. heterogeneity of treatment effects, the increased focus on internal validity over the past 20 years, and the use of economic theory to guide empirical work.

It is a good read–one that avoids taking some of the more extreme positions often taken in that literature–and I plan on including it as a reading for the advanced econometrics course I teach every other year.

In reading Smith and Sweetman’s paper, I learned how to systematically think about selection into treatment when dealing with observational data. Their discussion can be particularly useful when you have survey data and your units of observation–in my case, that usually means individuals or households–are not randomly assigned to treatment but choose to participate on the basis of both their observable and unobservable characteristics. In that case, if you want to make a causal statement, you have to do the best you can with the data you have.

Smith and Sweetman note that in determining whether to select into a treatment, a unit of observation [math]i[/math] will consider three relevant quantities. Using the notation of the potential outcomes framework:

  1. [math]Y_{0i}[/math], or the value of the outcome variable for [math]i[/math] if she does not take up the treatment.
  2. [math]Y_{1i}[/math], or the value of the outcome variable for [math]i[/math] if she takes up the treatment.
  3. [math]C_{i}[/math], or the cost to [math]i[/math] of taking up the treatment.

It follows that [math]i[/math] will choose to participate if and only if [math]Y_{1i} - Y_{0i} > C_{i}[/math].
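To make that decision rule concrete, here is a quick simulation sketch of my own (not from the paper), with made-up distributions for the potential outcomes and costs. It shows why this kind of selection is a problem for naive comparisons: units with larger gains select into treatment, so the difference in observed means between participants and non-participants overstates the average treatment effect.

```python
# A toy Roy-style selection simulation (my own illustration, not Smith and Sweetman's):
# units take up the treatment if and only if Y1 - Y0 > C.
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

y0 = rng.normal(loc=10.0, scale=2.0, size=n)       # outcome without treatment
y1 = y0 + rng.normal(loc=1.0, scale=2.0, size=n)   # outcome with treatment (heterogeneous gain)
c = rng.normal(loc=1.0, scale=1.0, size=n)         # cost of taking up the treatment

d = (y1 - y0 > c)                                  # selection rule: participate iff Y1 - Y0 > C
y = np.where(d, y1, y0)                            # observed outcome

ate = np.mean(y1 - y0)                             # true average treatment effect (about 1 here)
naive = y[d].mean() - y[~d].mean()                 # participants vs. non-participants

print(f"True ATE:         {ate:.2f}")
print(f"Naive difference: {naive:.2f}")            # well above the ATE: bigger gainers selected in
```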

But then, those three relevant quantities–[math]Y_{1i}[/math], [math]Y_{0i}[/math] and [math]C_{i}[/math]–can be used to think systematically about how units of observation select into treatment. In section 2.1 of their paper, Smith and Sweetman explain how to do so:

  1. Holding [math]Y_{1i}[/math] and [math]C_{i}[/math] fixed, units are decreasingly likely to take up the treatment as [math]Y_{0i}[/math] increases,
  2. Holding [math]C_{i}[/math] fixed, units are increasingly likely to take up the treatment as [math]Y_{1i} - Y_{0i}[/math] increases, and
  3. Holding the potential outcomes [math]Y_{1i}[/math] and [math]Y_{0i}[/math] fixed, units are increasingly likely to take up the treatment as [math]C_{i}[/math] decreases. (The short simulation after this list puts rough numbers on all three.)
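Here is a small follow-on sketch in the same spirit as the one above (again with made-up means and variances of my own): it shifts the mean of [math]Y_{0i}[/math], of [math]Y_{1i}[/math], and of [math]C_{i}[/math] one at a time and reports the resulting take-up rate.

```python
# Take-up rates under the rule Y1 - Y0 > C as the mean of each quantity shifts
# (illustrative parameters only).
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

def take_up_rate(y0_mean=10.0, y1_mean=11.0, c_mean=1.0):
    """Share of units with Y1 - Y0 > C for the given (made-up) means."""
    y0 = rng.normal(y0_mean, 2.0, n)
    y1 = rng.normal(y1_mean, 2.0, n)
    c = rng.normal(c_mean, 1.0, n)
    return np.mean(y1 - y0 > c)

print(f"baseline:       {take_up_rate():.2f}")
print(f"higher Y0 mean: {take_up_rate(y0_mean=11.0):.2f}")  # 1. take-up falls
print(f"higher Y1 mean: {take_up_rate(y1_mean=12.0):.2f}")  # 2. take-up rises
print(f"lower C mean:   {take_up_rate(c_mean=0.0):.2f}")    # 3. take-up rises
```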

If this all seems obvious, it’s because it is. But this was new to me, and so I figured it might be new to a number of readers of this blog.

As I mentioned above, what I like about this approach is that it provides a systematic way to think about how selection might affect your estimate of the average treatment effect [math]E(Y_{1i} - Y_{0i})[/math]. For instance, maybe it is reasonable in your application to expect that the treatment effect [math]Y_{1i} - Y_{0i}[/math] is constant across units (say, because a unit either receives nothing or she gets a $1,000 payment), in which case you only need to worry about controlling for or finding a good proxy for [math]C_{i}[/math]. Or maybe [math]C_{i}[/math] and [math]Y_{0i}[/math] are constant in your application, in which case you only need to worry about why and how [math]Y_{1i}[/math] varies across units.
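For the constant-treatment-effect case, a toy example (again mine, with made-up parameters and variable names) may help fix ideas: if the gain is the same for every unit, selection runs entirely through the cost, so a regression that controls for [math]C_{i}[/math] (or a good proxy for it) recovers the effect even when the naive comparison of means does not.

```python
# Constant treatment effect: the gain is the same tau for everyone, so selection
# operates only through the cost C (toy example with made-up parameters).
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
tau = 1.0                                              # constant treatment effect

c = rng.normal(1.0, 1.0, n)                            # cost of take-up
y0 = 10.0 - 2.0 * c + rng.normal(0.0, 1.0, n)          # baseline outcome correlated with cost
d = (tau > c + rng.normal(0.0, 0.5, n)).astype(float)  # take-up depends on C (plus noise unrelated to Y0)
y = y0 + tau * d                                       # observed outcome

naive = y[d == 1].mean() - y[d == 0].mean()            # biased: low-cost units also have high Y0

X = np.column_stack([np.ones(n), d, c])                # OLS of Y on a constant, D, and C
beta = np.linalg.lstsq(X, y, rcond=None)[0]

print(f"True effect:               {tau:.2f}")
print(f"Naive difference in means: {naive:.2f}")
print(f"OLS, controlling for C:    {beta[1]:.2f}")     # close to tau once C is controlled for
```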

The foregoing is obviously not a magic formula for dealing with selection bias, and even in cases where you can successfully make the case that you have proxied for the relevant quantities, you would still have to worry about other kinds of unobserved heterogeneity, reverse causality, and measurement error. But thinking about selection bias using Smith and Sweetman’s formulation sure beats some of the haphazard discussions of it I have witnessed over the years (many of them written by me!). It can help reassure your readers that there is less in the error term than meets the eye, and it can help you qualify how any remaining selection biases your estimate of the average treatment effect.