
Fixed Effects and Causal Inference

Last updated on June 1, 2023

That is the title of a new working paper by Dan Millimet and me. If memory serves, the paper grew out of a Twitter exchange in which Dan and I both remarked that, with panel data, adding more rounds of data is not necessarily better if the goal is to identify a causal relationship. The reason is that the amount of stuff, both observed and unobserved, that remains constant over time (in other words, what unit fixed effects control for) shrinks as the data grow to cover a longer time period.

Given that, it is surprising that the fixed effects (FE) estimator has emerged as the default estimator to use when trying to identify a causal relationship with longitudinal data. Even Yair Mundlak, who developed the FE estimator to control for management bias when estimating agricultural production functions, recognized that stuff is time-invariant only over short periods. In his original 1961 article in the Journal of Farm Economics (now the American Journal of Agricultural Economics), he wrote (emphasis added):

[i]nstead of beginning by conceptualizing what we mean by management we shall assume that whatever management is, it does not change considerably over time; and for short periods, say a few years, it can be assumed to remain constant.

In our paper, we show analytically that adding more rounds of data will not necessarily make things better from an identification perspective when using the FE estimator, and we discuss a number of alternatives to the FE estimator. We then show, on the basis of simulations and by replicating earlier work, that the FE estimator rarely does a better job than the alternatives we discuss. Strikingly, even a plain-vanilla first-difference (FD) estimator, which is equivalent to FE only when T = 2, does a much better job than FE at reducing bias in a majority of circumstances.
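To make the FE versus FD comparison concrete, here is a minimal simulation sketch. It is not the design used in the paper: the data-generating process (unit-specific heterogeneity that follows a random walk and is correlated with the regressor), the parameter values, and the function names `simulate_panel`, `fe_estimate`, and `fd_estimate` are all assumptions made for illustration.

```python
import numpy as np


def simulate_panel(n_units, n_periods, beta=1.0, drift_sd=0.5, seed=0):
    """Hypothetical DGP: y_it = beta * x_it + a_it + e_it, with a_it drifting."""
    rng = np.random.default_rng(seed)
    # Unit-specific heterogeneity that is NOT time-invariant: a random walk.
    start = rng.normal(size=(n_units, 1))
    shocks = rng.normal(scale=drift_sd, size=(n_units, n_periods))
    a = start + np.cumsum(shocks, axis=1)
    # The regressor is correlated with the heterogeneity, so a_it confounds.
    x = a + rng.normal(size=(n_units, n_periods))
    y = beta * x + a + rng.normal(size=(n_units, n_periods))
    return x, y


def fe_estimate(x, y):
    """Within (unit fixed effects) estimator: demean by unit, then pooled OLS."""
    xd = x - x.mean(axis=1, keepdims=True)
    yd = y - y.mean(axis=1, keepdims=True)
    return float((xd * yd).sum() / (xd ** 2).sum())


def fd_estimate(x, y):
    """First-difference estimator: difference adjacent periods, then pooled OLS."""
    dx = np.diff(x, axis=1)
    dy = np.diff(y, axis=1)
    return float((dx * dy).sum() / (dx ** 2).sum())


true_beta = 1.0
results = {}
for n_periods in (2, 5, 20):
    x, y = simulate_panel(n_units=5000, n_periods=n_periods, beta=true_beta)
    results[n_periods] = (fe_estimate(x, y), fd_estimate(x, y))
    fe, fd = results[n_periods]
    print(f"T={n_periods:2d}  FE bias={fe - true_beta:+.3f}  FD bias={fd - true_beta:+.3f}")
```

Under these assumptions, the two estimates coincide at T = 2, while for longer panels the FE bias grows with T and the FD bias stays roughly constant, because differencing only requires the heterogeneity to be stable between adjacent periods. A different DGP (for example, one with truly time-invariant heterogeneity and serially uncorrelated errors) could well favor FE, so this is one scenario rather than a general result.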

Here is the abstract:

Across many disciplines, the fixed effects estimator of linear panel data models is the default method to estimate causal effects with nonexperimental data that are not confounded by time-invariant, unit-specific heterogeneity. One feature of the fixed effects estimator, however, is often overlooked in practice: With data over time $t \in \{1,\ldots,T\}$ for each unit of observation $i \in \{1,\ldots,N\}$, the amount of unobserved heterogeneity the researcher can remove with unit fixed effects is weakly decreasing in $T$. Put differently, the set of attributes that are time-invariant is not invariant to the length of the panel. We consider several alternatives to the fixed effects estimator with $T>2$ when relevant unit-specific heterogeneity is not time-invariant, including existing estimators such as the first-difference, twice first-differenced, and interactive fixed effects estimators. We also introduce several novel algorithms based on rolling estimators. In the situations considered here, there is little to be gained and much to lose by using the fixed effects estimator. We recommend reporting the results from multiple linear panel data estimators in applied research.
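
For readers who want the intuition in symbols, here is a short sketch in notation of my choosing for this post, not necessarily the paper's. Suppose the outcome is generated by a linear model in which the unit-specific heterogeneity $\alpha_{it}$ is allowed to vary over time:

$$
y_{it} = \beta x_{it} + \alpha_{it} + \varepsilon_{it}.
$$

The FE estimator applies the within transformation, subtracting unit means $\bar{y}_i = \frac{1}{T}\sum_{t=1}^{T} y_{it}$ (and likewise for the other variables):

$$
y_{it} - \bar{y}_i = \beta\,(x_{it} - \bar{x}_i) + (\alpha_{it} - \bar{\alpha}_i) + (\varepsilon_{it} - \bar{\varepsilon}_i).
$$

The term $\alpha_{it} - \bar{\alpha}_i$ vanishes only if $\alpha_{it} = \alpha_i$ for every $t \in \{1,\ldots,T\}$, a requirement that becomes harder to satisfy as $T$ grows. The first-difference estimator instead works with

$$
\Delta y_{it} = \beta\,\Delta x_{it} + \Delta \alpha_{it} + \Delta \varepsilon_{it},
$$

where $\Delta \alpha_{it} = \alpha_{it} - \alpha_{i,t-1}$, so it only needs the heterogeneity to be approximately constant between adjacent periods rather than over the whole panel.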