
Fixed Effects and Causal Inference

That is the title of a new working paper by Dan Millimet and me. If memory serves, the genesis of this paper was an exchange Dan and I had on Twitter where we both remarked that, with panel data, adding more rounds of data is not necessarily better if the goal is to identify a causal relationship. That is because the amount of stuff, both observed and unobserved, that remains constant over time (in other words, what unit fixed effects control for) shrinks as the data grow to cover a longer time period.
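
To see the logic in symbols (a generic textbook sketch, not the notation of our paper), consider the one-way fixed effects model

y_{it} = \beta x_{it} + \mu_i + \varepsilon_{it},

where \mu_i collects everything about unit i that does not change over time. The within transformation subtracts unit means,

y_{it} - \bar{y}_i = \beta (x_{it} - \bar{x}_i) + (\varepsilon_{it} - \bar{\varepsilon}_i),

so \mu_i is eliminated only if it truly stays constant over the whole sample window. The longer the panel, the more of what was lumped into \mu_i starts varying with t, and whatever varies is no longer absorbed by the fixed effect; it ends up in the error term, where it can confound the estimate of \beta.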

Given that, it is surprising that the fixed effects (FE) estimator has emerged as the default estimator to use when trying to identify a causal relationship with longitudinal data. Even Yair Mundlak, who developed the FE estimator to control for management bias when estimating agricultural production functions, recognized that stuff is only time-invariant over short periods. In his original 1961 article in what was then the Journal of Farm Economics (now the American Journal of Agricultural Economics), he wrote that (emphasis added)

Survey Ordering and the Measurement of Welfare

That’s the title of a new working paper by Wahed Rahman, Jeff Bloem, and me in which we randomly place the module asking survey respondents about their assets either near the beginning (treatment) or at the end (control) of the survey to see whether the latter introduces classical (i.e., noise) or non-classical (i.e., bias) measurement error.
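
To make the distinction concrete (a hypothetical sketch with simulated placeholder data, not the estimation code from the paper): because the module's placement is randomly assigned, a difference in means across arms points to non-classical (bias) measurement error, while a difference in dispersion with similar means points to classical (noise) measurement error.

# Hypothetical sketch: comparing reported asset values across survey-ordering arms.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
early = rng.normal(loc=10.0, scale=2.0, size=500)  # placeholder: asset module near the beginning (treatment)
late = rng.normal(loc=10.0, scale=2.0, size=500)   # placeholder: asset module at the end (control)

# Non-classical (bias) measurement error: do the two arms report different means?
_, p_mean = stats.ttest_ind(early, late, equal_var=False)

# Classical (noise) measurement error: do the two arms report with different variances?
_, p_var = stats.levene(early, late)

print(f"p-value, difference in means: {p_mean:.3f}")
print(f"p-value, difference in variances: {p_var:.3f}")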

On average, we have a null finding. That is, whether we ask respondents about their assets early or late in the survey introduces neither classical nor non-classical measurement error. But we do find some interesting treatment heterogeneity: respondents from larger households (i.e., households with more than four individuals) or with low levels (i.e., fewer than six years) of formal education tend to underreport assets when asked about them later in the survey.

One caveat: We are assuming that this happens because of survey fatigue, and thus that the “right” number of assets and the “right” value of those assets are the ones given by respondents who are asked about them earlier. Unfortunately, we have no way of testing whether survey fatigue is the mechanism here. Still, the null finding on average, combined with the fact that our survey was relatively short (i.e., just over 75 minutes on average), lends credence to the idea that survey fatigue is what drives our sub-sample results.

Here is the abstract:

Social and economic policy and research rely on the accurate measurement of welfare. In nearly all instances, measuring welfare requires collecting data via long household surveys that are cognitively taxing on respondents. This can lead to measurement error, both classical (i.e., noisier responses) and non-classical (i.e., biased responses). We embed a survey ordering experiment in a relatively short survey, lasting just over 75 minutes on average, by asking half of our respondents about their assets near the beginning of the survey (treatment) and asking the remainder of our respondents about their assets at the end of the survey (control). We find no evidence that survey ordering introduces classical or non-classical measurement error in either the number of reported assets or the reported asset value in the full sample. But in sub-samples of respondents who (i) are from larger (i.e., more than four individuals) households, or (ii) have low levels (i.e., fewer than six years) of education, we find evidence of differential reporting due to survey ordering. These results highlight important heterogeneity in response bias which, despite the null effect in the full sample, can be meaningful. For example, for respondents from larger households, placing the asset module near the beginning of the survey leads to a 23 percent increase in the total reported asset value relative to placing the same module at the end of the survey.

Leveling the Playing Field: How to Write a Good Diversity Statement

As I write in the introduction to Doing Economics, paraphrasing William Gibson, a lot of hidden-curriculum information is already here; it’s just not evenly distributed. An exchange I had on Twitter this morning made me realize that a lot of people may not know how to write a good diversity statement. This is my effort to level the playing field along that dimension.

* * *

Increasingly, universities and other institutions hiring researchers ask for personal statements about diversity, equity, inclusion, and belonging (DEIB) from applicants in addition to cover letters, CVs, writing samples, letters of recommendation, teaching evaluations, and so on.

Having sat on search committees, I have seen all sorts of diversity statements, from the extremely bad to the very, very good. As someone who cares about all parts of DEIB, I worry that applicants from some universities are taught how to write a good diversity statement, whereas applicants from other universities are left to figure it out on their own.

So in the interest of leveling the playing field, I thought I would link to what is perhaps the best resource for writing diversity statements, namely UC Berkeley’s Office for Faculty Equity & Welfare’s “Rubric for Assessing Candidate Contributions to DEIB.” Why do I think this is the best resource? Because when I read applications, applicants from the UC system usually have the best DEI statements.

Specifically, that rubric assesses whether applicants discuss three distinct areas: “knowledge and understanding (section 1), track record of activities to date (section 2), and plans for contributing at Berkeley (section 3).”

Obviously, if you are not applying for a position at UC Berkeley, you will want to either skip the third section, or write it for the specific job you are applying for. For example, when I sat down to write my own diversity statement, it had two sections: (i) a discussion of my knowledge and understanding of DEIB, and (ii) my track record of activities contributing to DEIB to date.

The most important section of the webpage I linked to above is the sample rubric, which tells you exactly the kind of thing you should be discussing, and what counts as good versus what does not, so make sure to check it out before writing your own statement.