Last week, in the first half of this two-part post, I talked about the method developed by Conley et al. (2012) to deal with departures from the assumption of strict exogeneity of an instrumental variable (IV)–that is, to deal with what Conley et al. (2012) refer to as “plausibly exogenous” IVs.
How to deal with an imperfect instrument was an idea whose time apparently had come in 2012: In the same volume of the same journal, Nevo and Rosen (2012) develop an alternative method for dealing with imperfect IVs, which is what I wanted to discuss this week.
Again, imagine you are interested in the effect of treatment [math]D[/math] on outcome [math]Y[/math], with or without controls [math]X[/math]. You are interested in estimating
(1) [math]Y = \beta D + \epsilon,[/math]
from which I am omitting the constant and the controls for brevity. Specifically, you are interested in the causal effect of the endogenous treatment [math]D[/math] on [math]Y[/math], and you have a plausibly exogenous instrument [math]Z[/math].
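To make all of this concrete, here is a minimal simulation sketch in Python (the data-generating process, coefficient values, and seed are all made up for illustration; nothing here comes from Nevo and Rosen’s paper) of a setting where the instrument is imperfect, i.e., slightly correlated with the error term:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Made-up DGP: u is an unobserved confounder that drives both D and Y,
# and it also leaks (mildly) into Z, so Z is only "plausibly" exogenous.
u = rng.normal(size=n)
z = rng.normal(size=n) + 0.1 * u      # imperfect instrument: Corr(Z, eps) != 0
d = 0.5 * z + u + rng.normal(size=n)  # endogenous treatment
eps = u + rng.normal(size=n)          # structural error term
beta_true = 1.0
y = beta_true * d + eps

# With a single regressor and no controls, the OLS and 2SLS estimands
# reduce to simple (co)variance ratios.
beta_ols = np.cov(d, y)[0, 1] / np.var(d, ddof=1)
beta_iv = np.cov(z, y)[0, 1] / np.cov(z, d)[0, 1]
print(beta_ols, beta_iv)  # both overshoot beta_true = 1; 2SLS by less
```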
Intuitively, Nevo and Rosen’s method relaxes the exogeneity assumption–that [math]Corr(Z, \epsilon) = \rho_{Z \epsilon} = 0[/math]–to allow for the possibility that [math]\rho_{Z \epsilon} \neq 0[/math], at the cost of making stronger assumptions about how the endogenous variable and the instrument each relate to the error term. In their own words, the method makes a weaker assumption on the unobservables, but a stronger assumption on the observables.
Specifically, Nevo and Rosen’s method assumes that (i) [math]sgn(\rho_{Z \epsilon}) = sgn(\rho_{D \epsilon})[/math], and (ii) [math]|\rho_{Z \epsilon}| \leq |\rho_{D \epsilon}|[/math]–that is, the instrument is less correlated with the error term than the treatment is.
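One nice thing about a simulation is that, unlike in real data, we observe [math]\epsilon[/math], so we can check directly that the made-up DGP in the snippet above satisfies assumptions (i) and (ii):

```python
# Continuing the simulation above: verify that Nevo and Rosen's assumptions
# (i) and (ii) hold in this made-up DGP (feasible only because we observe eps).
rho_z_eps = np.corrcoef(z, eps)[0, 1]
rho_d_eps = np.corrcoef(d, eps)[0, 1]
print(np.sign(rho_z_eps) == np.sign(rho_d_eps))  # assumption (i): same sign
print(abs(rho_z_eps) <= abs(rho_d_eps))          # assumption (ii): Z "less endogenous"
```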
Again, in Nevo and Rosen’s own words, assumption (ii) is “an intuitive assumption for those applications where … [math]Z[/math] is not necessarily exogenous but is ‘better’ or ‘less endogenous’ than the endogenous regressor.”
Assumptions (i) and (ii) above are assumptions 3 and 4 in Nevo and Rosen’s article; they make a few more regularity assumptions around these, but those really are the central ones.
With just assumption (i) and the regularity assumptions, you get Lemma 1 in the paper: If [math]Cov(D,Z)<0[/math], then the true parameter [math]\beta[/math] lies between the OLS and 2SLS estimates, [math]\beta_{OLS}[/math] and [math]\beta_{2SLS}[/math]. If, however, [math]Cov(D,Z)>0[/math], then [math]\beta \leq \min\{\beta_{OLS},\beta_{2SLS}\}[/math] in cases where [math]Cov(D, \epsilon) \geq 0[/math] and [math]Cov(Z, \epsilon) \geq 0[/math], and [math]\beta \geq \max\{\beta_{OLS},\beta_{2SLS}\}[/math] in cases where [math]Cov(D, \epsilon) \leq 0[/math] and [math]Cov(Z, \epsilon) \leq 0[/math].
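Continuing the simulation above, here is what Lemma 1 buys you in practice (with the caveat that, in real data, you would have to take a stand on the signs of [math]Cov(D, \epsilon)[/math] and [math]Cov(Z, \epsilon)[/math] rather than compute them):

```python
# Lemma 1 in action: the sign of Cov(D, Z) determines whether we get a
# two-sided interval or a one-sided bound on beta.
cov_dz = np.cov(d, z)[0, 1]
if cov_dz < 0:
    lo, hi = sorted([beta_ols, beta_iv])
    print(f"beta in [{lo:.3f}, {hi:.3f}]")
else:
    # In this DGP, Cov(D, Z) > 0 and both Cov(D, eps) and Cov(Z, eps) are
    # positive, so Lemma 1 gives beta <= min(OLS, 2SLS); the mirrored case
    # (both covariances negative) would give beta >= max(OLS, 2SLS).
    print(f"beta <= {min(beta_ols, beta_iv):.3f}")  # true beta = 1 satisfies this
```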
With both assumptions (i) and (ii) (and the regularity assumptions), you get Proposition 1, which generates even better (as in sharper) bounds. This requires a bit more math than I can go into in this post.
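That said, to give a flavor of where the sharper bounds come from: as I read the paper, Nevo and Rosen construct an auxiliary variable [math]V(\lambda) = \sigma_D Z - \lambda \sigma_Z D[/math] (where [math]\sigma_D[/math] and [math]\sigma_Z[/math] are the standard deviations of [math]D[/math] and [math]Z[/math]), which would be a valid instrument at the unknown value [math]\lambda^* = \rho_{Z \epsilon}/\rho_{D \epsilon}[/math], a ratio that assumptions (i) and (ii) confine to [math][0,1)[/math]. Using the [math]\lambda = 1[/math] endpoint as an instrument then delivers an additional bound. Here is a hedged sketch, continuing the simulation above; the bound-direction logic in the comments is my own derivation for this positively-correlated case, not a statement of the paper’s full case analysis:

```python
# Sketch of the V(lambda) idea behind Proposition 1, evaluated at lambda = 1.
# Under assumptions (i) and (ii) with rho_Deps > 0 (as in this DGP),
# Cov(V(1), eps) <= 0, so the direction of the bias of the V(1)-based IV
# estimate (and hence which side of beta it bounds) follows from the sign
# of Cov(V(1), D), which is estimable from the data.
v1 = np.std(d, ddof=1) * z - np.std(z, ddof=1) * d
cov_v1_d = np.cov(v1, d)[0, 1]
beta_iv_v1 = np.cov(v1, y)[0, 1] / cov_v1_d
side = "upper" if cov_v1_d < 0 else "lower"
print(f"IV estimate using V(1): {beta_iv_v1:.3f} ({side} bound on beta)")
```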
Nevo and Rosen then generalize their findings to the case where there are additional regressors (i.e., controls), to the case where there are multiple imperfect instruments, and to the case where there are multiple treatment (i.e., endogenous) variables. They also have a section on inference, since beyond estimating bounds on [math]\beta[/math], it’s also nice to have a confidence interval around them. Finally, they provide an application from the empirical IO literature.
To summarize both this post and last week’s: When you have an instrument that is plausibly but not strictly exogenous (in Conley et al.’s terminology) or imperfect (in Nevo and Rosen’s terminology), all is not lost. This is encouraging for those of us who rarely have the luxury of experimental data and must often rely on observational data. For example, I am currently putting the finishing touches on a paper in which I aim to incorporate one or both of the Conley et al. and Nevo and Rosen approaches in order to show that my 2SLS results are robust to departures from the strict exogeneity assumption.
Still, this does not mean that anything goes and that crappy instruments get a pass. The way I like to think of this (at least in the context of the Conley et al., 2012 approach) is that these methods are useful for cases where there is one possible but unlikely channel through which the exclusion restriction is violated, but not for cases where there are several likely such channels. When the latter happens, it is perhaps best to adopt the view according to which “whereof one cannot speak, thereof one must be silent.”
(This is the 50th post in the Metrics Monday series. At the end of 2010, when I started blogging, I never thought I would end up blogging for so long, let alone writing so much about econometrics. Thank you for reading; here is to 50 more of those posts!)