‘Metrics Monday: Front-Door Criterion Follow-Up (Updated)

My last post in this series, on how to use Pearl’s front-door criterion in a regression context, generated lots of page views as well as lots of commentary on Twitter–enough so that I thought a follow-up post might be useful.

Recall that with outcome Y, treatment X, mechanism M, and an unobserved confounder U affecting both Y and X but not M, the method I outlined in the post is pretty simple:

  1. Regress M on X, get b_{MX}, the coefficient on X.
  2. Regress Y on X and M, get b_{YM}, the coefficient on M.
  3. The product of b_{MX} and b_{YM} is the effect of treatment X on outcome Y estimated by the front-door criterion.
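
Written out as equations, the three steps above can be sketched as follows. The intercepts a_0 and c_0, the coefficient b_{YX}, the error terms e and v, and the symbol \hat{\tau} are notation introduced here for illustration only; they do not appear in the original post.

```latex
\begin{align}
M &= a_0 + b_{MX} X + e, \\
Y &= c_0 + b_{YX} X + b_{YM} M + v, \\
\hat{\tau}_{\text{front-door}} &= \hat{b}_{MX} \times \hat{b}_{YM}.
\end{align}
```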

One of the things that came up on Twitter was whether someone should use the procedure outlined above, or do the following instead:

  1. Regress M on X, get \hat{e}, the residual.
  2. Regress Y on \hat{e}, get b_{Ye}, the coefficient on \hat{e}.
  3. Regress M on X, get b_{MX}, the coefficient on X.
  4. The product of b_{Ye} and b_{MX} is the effect of treatment X on outcome Y estimated by the front-door criterion.
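
One way to see how this procedure relates to the first one is the Frisch-Waugh-Lovell theorem. What follows is a short sketch, assuming every regression includes an intercept (as Stata's -reg- does by default). The subscript i indexes observations, and \hat{a}, the intercept from step 1, is notation introduced here.

```latex
\begin{align}
\hat{e}_i &= M_i - \hat{a} - b_{MX} X_i,
  \qquad \sum_i \hat{e}_i = 0, \quad \sum_i \hat{e}_i X_i = 0, \\
b_{YM} &= \frac{\sum_i \hat{e}_i Y_i}{\sum_i \hat{e}_i^2}
  \quad \text{(Frisch-Waugh-Lovell, from regressing $Y$ on $X$ and $M$)}, \\
b_{Ye} &= \frac{\sum_i (\hat{e}_i - \bar{\hat{e}})(Y_i - \bar{Y})}{\sum_i (\hat{e}_i - \bar{\hat{e}})^2}
  = \frac{\sum_i \hat{e}_i Y_i}{\sum_i \hat{e}_i^2}
  \quad \text{(because $\bar{\hat{e}} = 0$)}.
\end{align}
```

The coefficient on \hat{e} is therefore numerically the same as the coefficient on M from the regression of Y on X and M.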

Note that the two methods yield the exact same treatment effect. Here is a Kerwinian proof by Stata:

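* Start from an empty dataset with 1,000 observations
clear
drop _all
set obs 1000

* Simulate data: u affects treat and outcome, but does not enter mech
set seed 123456789
gen u = rnormal(0,1)
gen treat = u + rnormal(0,1)
gen mech = -0.3 * treat + rnormal(0,1)
gen outcome = 0.5 * mech + u + rnormal(0,1)

* First method: estimate both regressions jointly, then take the product
sureg (mech treat) (outcome mech treat)
nlcom [mech]_b[treat]*[outcome]_b[mech]

* Second method, step 1: residual from regressing mech on treat
reg mech treat
predict e, resid

* Second method, step 2: coefficient on the residual
reg outcome e
matrix a = _b[e]
* Second method, step 3: coefficient on treat
reg mech treat
matrix b = _b[treat]
* Second method, step 4: product of the two coefficients
matrix c = a*b

matrix list c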

Note how the estimates obtained in the line that begins with -nlcom- and in the line that begins with -matrix list- are identical. In terms of implementation, I prefer the (somewhat old-fashioned, I realize) use of seemingly unrelated regression, since it allows the error terms to be correlated across the two component regressions.
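
For readers without Stata, the same check can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names (cov, slope, coef_on_second, front_door_estimates) are made up for this sketch, OLS is computed directly from sample covariances rather than with a regression library, and Python's random number generator will not reproduce Stata's draws, so the numbers will differ from Stata's even with the same seed.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance (divisor n; the divisor cancels in every ratio below)."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


def slope(y, x):
    """OLS slope from regressing y on x with an intercept."""
    return cov(x, y) / cov(x, x)


def coef_on_second(y, x1, x2):
    """OLS coefficient on x2 from regressing y on x1 and x2 with an intercept."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    return (s11 * s2y - s12 * s1y) / (s11 * s22 - s12 ** 2)


def front_door_estimates(seed=123456789, n=1000):
    """Simulate the post's data-generating process and return both estimates."""
    rng = random.Random(seed)
    u = [rng.gauss(0, 1) for _ in range(n)]
    treat = [ui + rng.gauss(0, 1) for ui in u]
    mech = [-0.3 * t + rng.gauss(0, 1) for t in treat]
    outcome = [0.5 * m + ui + rng.gauss(0, 1) for m, ui in zip(mech, u)]

    # First procedure: b_MX from M on X, b_YM from Y on X and M.
    b_mx = slope(mech, treat)
    b_ym = coef_on_second(outcome, treat, mech)
    effect_1 = b_mx * b_ym

    # Second procedure: residualize M on X, regress Y on the residual.
    a_hat = mean(mech) - b_mx * mean(treat)
    e = [m - a_hat - b_mx * t for m, t in zip(mech, treat)]
    b_ye = slope(outcome, e)
    effect_2 = b_ye * b_mx

    return effect_1, effect_2


if __name__ == "__main__":
    effect_1, effect_2 = front_door_estimates()
    print("Y on X and M:   ", effect_1)
    print("Y on residual e:", effect_2)
```

The two printed numbers should agree up to floating-point rounding. Both should also land near -0.3 * 0.5 = -0.15, the effect implied by the simulated data-generating process.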

To reiterate what I talked about at the end of my last post: Caveat emptor. When it comes to observational data, rare is the scenario where one can claim that the mechanism M whereby an endogenous treatment X affects outcome Y is entirely unaffected by the unobserved confounders U that simultaneously affect treatment X and outcome Y. So this post and the previous one are really meant to be illustrative of something that might work in some rare situations more than an encouragement to apply the front-door criterion unthinkingly as a means of identifying a causal relationship on the cheap. In this as in so many things, TINSTAAFL.

On Twitter, Daniel Millimet adds: