‘Metrics Monday: Generated Regressors, or Why Regressing on \hat{x} Can Be a Problem

A few weeks ago I was in a meeting with a team of graduate students with whom I am working on a research project. As we were going over their estimation results, I asked a few questions to make sure that those results were sound.

At some point, I asked: “Are there any generated regressors in those regressions?” Hearing no answer, I looked up and saw a bunch of puzzled faces looking back at me. Before I even began explaining, one of the students alluded to how this would make a good blog post.

Suppose you want to estimate the equation

How to Write and Publish Applied Econ Papers

My friend and erstwhile colleague Tim Beatty (UC Davis), who currently serves as editor of the American Journal of Agricultural Economics, and his frequent coauthor Jay Shimshack (UVA), who has served as editor of the Journal of Environmental Economics and Management, have put together an extremely useful set of slides titled “Practical Tips for Writing and Publishing Applied Economics Papers” for a course they have been teaching. You can find those slides here.

‘Metrics Monday: Good Things Come to Those Who Weight–Part I

I was sitting in my office on Friday afternoon when one of our third-year PhD students dropped by with an applied econometric question: “When should I use weights?”

After telling her to go read Solon et al.’s 2015 piece in the JHR symposium on empirical methods, I decided to reread that paper for myself and blog about it this week. In the near future, in part II, I’m hoping to tackle Andrews and Oster’s new NBER working paper on weighting for external validity.

Before I begin, some clarification: throughout this post, I’ll be discussing the use of sampling weights. If you are a Stata user, this refers to that statistical package’s -pweight-, i.e., “weights that denote the inverse of the probability that the observation is included because of the sampling design.” I have never had to rely on -aweight-, -fweight-, or -iweight-, so I wouldn’t know when to use them.

Suppose you oversample a specific group in order to get more precise estimates for that group. For instance, suppose you are interested in the opinion of LGBTQ students. If you randomly sample individuals from a given population of students, you may not have enough LGBTQ respondents in your sample, and so whatever descriptive statistics you come up with for that sub-group might be too noisy. Thus, you may wish to oversample LGBTQ respondents in order to improve precision. What I mean by this is that you would randomly sample respondents from each group (LGBTQ and non-LGBTQ) until you have the right number. So if you target a sample size of n=100 and you’d like 50% of respondents to come from each group, you split the population into two groups (assuming that’s easy to do; in the case of LGBTQ students, it might not be) and sample from each until each group has 50 observations.
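To make the oversampling design concrete, here is a minimal Python sketch. Everything in it is made up for illustration: a population of 10,000 students of whom 500 belong to the oversampled group, an outcome whose mean differs across the two groups, and the helper name `stratified_sample`. Each sampled observation gets a weight equal to the inverse of its probability of being included, which is what a sampling weight is supposed to capture.

```python
import random

rng = random.Random(0)

# Hypothetical population: 500 students in the oversampled group and
# 9,500 others, with an outcome whose mean differs across groups.
N_GROUP, N_OTHER = 500, 9_500
population = (
    [(True, rng.gauss(0.8, 0.1)) for _ in range(N_GROUP)]
    + [(False, rng.gauss(0.4, 0.1)) for _ in range(N_OTHER)]
)


def stratified_sample(population, n_per_group, rng):
    """Draw n_per_group units without replacement from each group.

    Returns a list of (outcome, weight) pairs, where the weight is the
    inverse of the inclusion probability n_per_group / group_size.
    """
    sample = []
    for flag in (True, False):
        stratum = [y for g, y in population if g == flag]
        weight = len(stratum) / n_per_group
        for y in rng.sample(stratum, n_per_group):
            sample.append((y, weight))
    return sample


# n = 100 with 50 respondents per group, as in the text.
sample = stratified_sample(population, 50, rng)

unweighted_mean = sum(y for y, _ in sample) / len(sample)
weighted_mean = sum(y * w for y, w in sample) / sum(w for _, w in sample)
population_mean = sum(y for _, y in population) / len(population)

print(f"population mean: {population_mean:.3f}")
print(f"unweighted mean: {unweighted_mean:.3f}")
print(f"weighted mean:   {weighted_mean:.3f}")
```

In this made-up setup the unweighted sample mean treats the two groups as if each made up half of the population, whereas the weighted mean scales each respondent by the number of students he or she stands in for. This only illustrates descriptive statistics for the whole population; it says nothing about whether to weight when estimating a regression.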