My last post in this series, on how to use Pearl’s front-door criterion in a regression context, generated lots of page views as well as lots of commentary on Twitter–enough so that I thought a follow-up post might be useful.
Recall that with outcome Y, treatment X, mechanism M, and an unobserved confounder U affecting both Y and X but not M, the method I outlined in the post is pretty simple:
Regress M on X, get b_{MX}, the coefficient on X.
Regress Y on X and M, get b_{YM}, the coefficient on M.
The product of b_{MX} and b_{YM} is the effect of treatment X on outcome Y estimated by the front-door criterion.
One of the things that came up on Twitter was whether someone should use the procedure outlined above, or do the following instead:
Regress M on X, get \hat{e}, the residual.
Regress Y on \hat{e}, get b_{Ye}, the coefficient on \hat{e}.
Regress M on X, get b_{MX}, the coefficient on X.
The product of b_{Ye} and b_{MX} is the effect of treatment X on outcome Y estimated by the front-door criterion.
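Why should this alternative give the same number as the first procedure? Here is a sketch of the reasoning, which rests on the Frisch-Waugh-Lovell theorem. It assumes that every regression includes a constant, so that the residual \hat{e} has mean zero and is orthogonal to X by construction; the index i over observations is notation added here for illustration.

```latex
% \hat{e}_i is the residual from regressing M on X and a constant,
% so it is orthogonal to both X and the constant.
% First equality: Frisch-Waugh-Lovell applied to the regression of Y on M and X.
% Second equality: the slope of a simple regression of Y on the mean-zero \hat{e}.
\[
b_{YM} \;=\; \frac{\sum_i \hat{e}_i Y_i}{\sum_i \hat{e}_i^2} \;=\; b_{Ye}
\quad\Longrightarrow\quad
b_{MX}\, b_{YM} \;=\; b_{MX}\, b_{Ye}.
\]
```

In words: the coefficient on M in the regression of Y on X and M only uses the part of M that X does not explain, and that part is exactly \hat{e}.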
Note that the two methods yield the exact same treatment effect. Here is a Kerwinian proof by Stata:
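clear
drop _all
set obs 1000
set seed 123456789
gen u = rnormal(0,1)
gen treat = u + rnormal(0,1)
gen mech = -0.3 * treat + rnormal(0,1)
gen outcome = 0.5 * mech + u + rnormal(0,1)

* Method 1: seemingly unrelated regression, then the product of the two coefficients
sureg (mech treat) (outcome mech treat)
nlcom [mech]_b[treat]*[outcome]_b[mech]

* Method 2: residual from regressing mech on treat, then the product of the two coefficients
reg mech treat
predict e, resid
reg outcome e
matrix a = _b[e]
reg mech treat
matrix b = _b[treat]
matrix c = a*b
matrix list c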
Note how the estimates obtained in the line that begins with -nlcom- and in the line that begins with -matrix list- are identical. In terms of implementation, I prefer the (somewhat old-fashioned, I realize) use of seemingly unrelated regression, since it allows the error terms to be correlated across the two component regressions.
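As a further check, one can compare the front-door estimate with the naive regression of outcome on treatment. The sketch below assumes the same data-generating process as in the simulation above, in which the true effect of treat on outcome is -0.3 * 0.5 = -0.15. Because u is left out of the naive regression, its coefficient on treat converges to -0.15 + Cov(u, treat)/Var(treat) = -0.15 + 1/2 = 0.35 rather than to -0.15. The scalar names true_effect and naive are made up for this illustration.

```stata
* Sketch only: same data-generating process as in the simulation above.
clear
set obs 1000
set seed 123456789
gen u = rnormal(0,1)
gen treat = u + rnormal(0,1)
gen mech = -0.3 * treat + rnormal(0,1)
gen outcome = 0.5 * mech + u + rnormal(0,1)

* True effect of treat on outcome implied by the DGP
scalar true_effect = -0.3 * 0.5

* Naive regression: biased, because the confounder u is omitted
reg outcome treat
scalar naive = _b[treat]

* Front-door estimate: product of the coefficient on treat in the mech equation
* and the coefficient on mech in the outcome equation
sureg (mech treat) (outcome mech treat)
nlcom [mech]_b[treat]*[outcome]_b[mech]

display "True effect: " scalar(true_effect)
display "Naive OLS:   " scalar(naive)
```

The -nlcom- line reports the front-door estimate along with a delta-method standard error, which can then be set against the two displayed numbers.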
To reiterate what I talked about at the end of my last post: Caveat emptor. When it comes to observational data, rare is the scenario where one can claim that the mechanism M whereby an endogenous treatment X affects outcome Y is entirely unaffected by the unobserved confounders U that simultaneously affect treatment X and outcome Y. So this post and the previous one are really meant to be illustrative of something that might work in some rare situations more than an encouragement to apply the front-door criterion unthinkingly as a means of identifying a causal relationship on the cheap. In this as in so many things, TINSTAAFL.
(Update: There was a mistake in the original post. Thanks to Peter Hull, Paul Hünermund, Vincent Arel-Bundock, and Daniel Millimet, who provided enlightening comments on the original post, the proper procedure is given at the bottom, along with some ideas for implementation in Stata.)
If you have been reading this blog for a while, you are undoubtedly familiar with the usual methods used by economists to identify causal relations (e.g., randomized controlled trials, instrumental variables, and difference-in-differences).
One method that you may not have heard of, or that you might only have heard of in passing, is Pearl’s (2000) front-door criterion, which Pearl discusses in a more intuitive way in The Book of Why, the popular-press book he has recently published in which he discusses his work on causality (Pearl, 2018). In fact, in The Book of Why, Pearl goes so far as to assert that the use of the front-door criterion might help end the hegemony of randomized controlled trials when it comes to identifying causal impacts!
Consider the following figure, where X denotes a treatment variable, Y denotes the outcome of interest, M denotes a mechanism through which X causes Y, and U represents unobserved confounders.
Let’s ignore M for a minute. A graph like the one in the figure above is a directed acyclic graph, or DAG, a type of graph used by causality researchers to look at the structure underlying a causal model. Here, the identification problem is illustrated by the fact that U affects both X and Y (i.e., there are arrows from U to both X and Y). That is why identifying the causal relationship flowing from X to Y is difficult: because of the presence of U, no correlation between the two can be argued to be causal. In such cases, an economist’s first instinct would often be to find a variable Z which is correlated with X but which is not affected by U, a setup which would allow identifying the causal effect of X on Y, and which you have probably recognized as an instrumental variable (IV) setup.
Pearl, however, came up with a clever way of identifying the causal effect of X on Y which tends to be somewhat less demanding than having to find a credible IV. Looking at the figure above, Pearl’s method involves finding a mechanism M whereby X causes Y, but which is itself not affected by unobserved confounders. (Indeed, notice that there is no arrow from U to M in the figure above.)
That is essentially the idea of the front-door criterion: To find a mechanism M whereby X causes Y but which is not itself affected by unobserved confounders. In an old post, Alex Chinco, an assistant professor of finance at the University of Illinois, explains how even in the presence of self-selection of units into treatment, if you can credibly make the case that treatment intensity is not affected by the unobserved confounders that drive both the uptake of treatment and your outcome of interest, you can identify the effect of the treatment on that outcome.
Intuitively, what the front-door criterion does is kind of like what IV does, except that the variable that purges the variation in X of its correlation with U sits in front of X (hence the name front-door criterion), i.e., between X and Y such that you have X -> M -> Y, instead of behind X (as in a traditional 2SLS setup), where you have Z -> X -> Y.
It took me a while to sit down to write this post, because the idea behind this series of posts is to present things that one can use in a regression context, and whatever I have read from Pearl usually presents the front-door criterion in a simple binary treatment, binary outcome, and binary mechanism example involving smoking as treatment, lung cancer as outcome, and the rate of tar accumulation in the lungs as mechanism, in which case you can recover the treatment effect by multiplying conditional probabilities.
But applied economists usually are interested in examples that involve more than just binary variables, and it took me a while to find a discussion of how to do this in a regression context. Even the recent paper by Glynn and Kashin (2017) comparing front- and back-door criteria does not go into those details. Luckily, the Alex Chinco post I refer to above does. Specifically, Chinco discusses a two-step procedure. In a discussion on Twitter about how to implement this in practice, here is what came up (though reddit user /u/unistata came up with it six months before):
Regress M on X and a constant.
Regress Y on M, X, and a constant.
Multiply the coefficient on X in step 1 by the coefficient on M in step 2.
The result of step 3 is then the front-door criterion estimate of the causal effect of treatment X on outcome Y.
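To see why the product in step 3 recovers the effect of X on Y, it helps to write down a linear version of the DAG above. What follows is only a sketch: the structural equations and the Greek coefficients are assumptions introduced here for illustration, and they are not taken from Pearl or Chinco. Write b_{MX} for the coefficient on X in step 1 and b_{YM} for the coefficient on M in step 2, and assume that U and the three error terms are mutually uncorrelated.

```latex
% Illustrative linear model for the DAG; all Greek letters are assumptions.
% X affects Y only through M, and U affects X and Y but not M.
\begin{align*}
X &= \delta U + \varepsilon_X, \\
M &= \alpha X + \varepsilon_M, \\
Y &= \beta M + \gamma U + \varepsilon_Y.
\end{align*}
% Step 2 uses Frisch-Waugh-Lovell: once X is controlled for,
% the variation left in M is \varepsilon_M, which is uncorrelated with U.
\begin{align*}
\text{Step 1:}\quad & \operatorname{plim} b_{MX}
  = \frac{\operatorname{Cov}(M, X)}{\operatorname{Var}(X)} = \alpha, \\
\text{Step 2:}\quad & \operatorname{plim} b_{YM}
  = \frac{\operatorname{Cov}(Y, \varepsilon_M)}{\operatorname{Var}(\varepsilon_M)} = \beta, \\
\text{Step 3:}\quad & \operatorname{plim} b_{MX}\, b_{YM} = \alpha\beta .
\end{align*}
```

Since a one-unit change in X moves M by \alpha and each unit of M moves Y by \beta, \alpha\beta is the effect of X on Y in this model. Note that the coefficient on X in step 2 has no causal interpretation, because X is there only to absorb the part of M that is correlated with U.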
How would you implement this in Stata? A quick and dirty way to do it would be to estimate
. sureg (m x) (y m x)
. nlcom [m]_b[x]*[y]_b[m]
As always, there is no free lunch, and in order to apply the front-door criterion, one has to make the case that M really is not affected by U the way X and Y are, which might be a difficult case to make. But if you have an application where self-selection into treatment compromises the identification of the causal effect of treatment on your outcome of interest and you can find a variable that measures the intensity of that treatment which is not driven by the same confounding factors as those affecting treatment and outcome, you might have a good case for using the front-door criterion.
Looks like there is a pent-up demand for econometrics, and so I will most likely keep writing about that topic, hopefully more frequently in 2019 than over the last few months, which have kept me busy with teaching. (On that, if you have any topic of an applied econometric nature to suggest, I’m all ears…)
* * *
This has been another good year for me. In terms of research, I have published six new articles, including three on contract farming.
I have traveled internationally to Germany, Finland, Canada (three times), Denmark, Japan, and Italy; and within the US to Urbana-Champaign, IL, Phoenix, AZ, Starkville, MS, Berkeley, CA, Philadelphia, PA, Washington, DC (twice), Ames, IA, Ann Arbor, MI, West Lafayette, IN, and Kona, HI (this last one for a family vacation, thank goodness!)
This year, I had the honor of being elected to the executive board of the Agricultural and Applied Economics Association. At Food Policy, my co-editor and I have seen our impact factor go up once again this year.
Finally, I have received the College of Food, Agricultural, and Natural Resource Sciences’ 2018 Distinguished Teaching Award for Graduate Faculty.
Again this year, the most wonderful part of all this has been to meet so many interesting people, old and new, and to realize further that we are all one.
* * *
Whether you have been reading this blog since its very beginning or you have only started recently, thank you from the bottom of my heart for making time to read what I write.
Happy New Year! I hope 2019 brings you joy, health, and prosperity.