‘Metrics Monday: Interpreting Coefficients I

A few conversations with colleagues who teach econometrics have convinced me that, for all the advanced technical knowledge we impart to students in standard econometrics classes, we often don’t do a very good job of teaching them how to interpret what they are estimating. This is generally the reason why I teach a graduate class on the practice of econometrics (i.e., a so-called cookbook econometrics class) every other year.

More specifically, this leads me to discuss the interpretation of certain types of coefficients for this week’s installment of ‘Metrics Monday. Beyond the (accurate) interpretation of coefficients, I don’t have a grand overarching theme, so what follows is a collection of bullet points more than anything.

Interaction Terms

Suppose you are interested in studying the effect of education on wage in the following modified Mincer equation:

(1) y = a + bE + cX + fE*X + e,

where y denotes a person’s wage, E denotes that person’s education, X denotes their experience, and e is an error term with mean zero. For the purposes of this discussion, let’s ignore the fact that E and y are jointly determined.

What is the effect of education on wage for the average individual in the data here? Too many students would be quick to say that this effect is measured by the coefficient b. In fact, dy/dE = b + fX, because the interaction term fE*X is included in equation (1). (I use d throughout as notation for a partial derivative because I can’t be bothered to paste a curly d everywhere.)
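To make the point concrete, here is a minimal Python sketch of the Mincer equation above. The coefficient values are made up purely for illustration (they are not estimates from any data), and the function names are my own. It shows that the effect of one more year of education changes with experience X, and that the analytical expression b + fX matches a direct difference in predicted wages.

```python
# Hypothetical coefficients for y = a + b*E + c*X + f*E*X (illustration only,
# not estimates from any real data set).
a, b, c, f = 5.0, 1.2, 0.4, -0.03


def wage(E, X):
    """Predicted wage from the modified Mincer equation, ignoring the error term."""
    return a + b * E + c * X + f * E * X


def effect_of_education(X):
    """Analytical marginal effect dy/dE = b + f*X."""
    return b + f * X


# The effect of education is not a single number: it depends on experience X.
for X in (0, 10, 20):
    # Because y is linear in E for a given X, a one-unit difference in E
    # equals the derivative dy/dE.
    difference = wage(13, X) - wage(12, X)
    print(X, effect_of_education(X), difference)
```

Only when X = 0 does the effect reduce to b alone, which is rarely the case of interest.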

Effects at Means or Mean Effects?

Notice that X enters dy/dE above, so a natural question is “What X do I use?” Meaning: You can compute and report dy/dE in two ways:

  1. At the mean of the (relevant) explanatory variables, by reporting dy/dE = b + fX_bar, where X_bar is the (estimation) sample mean of X. With Stata, the effect dy/dE = b + fX is easily recovered with the -lincom- command, which is used for linear combinations. After the regression command -reg y E X EX- you would type -lincom _b[E] + _b[EX]*X_bar-, where you would replace X_bar with the (estimation) sample mean of the variable X. Stata would then report an estimate of the effect of E on y at the mean of X, complete with standard error and t-statistic for the null hypothesis that this effect is zero.
  2. As the mean effect, by reporting (dy/dE)_bar, i.e., the mean of dy/dE, or the sum over all i in the estimation sample of (b + fX_i) divided by the estimation sample size. In Stata, this would mean creating a new variable using the -predictnl- command, which is used for nonlinear combinations.

One complication is that very often, the two effects (the effect computed at the mean of the explanatory variables, and the mean effect) differ substantially. To take an example from some elasticities I have been estimating this week: The elasticity-at-means is 0.81, but the mean elasticity is 3.1. That almost fourfold difference between the two is neither exceptional nor unique to the context I was working on.
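The difference between the two approaches can be sketched in a few lines of Python. Everything here is hypothetical: the coefficients, the five-observation sample, and the function names are invented for illustration, and the elasticity is taken to be (dy/dE) * E / y evaluated at predicted wages. The sketch computes both the marginal effect and the elasticity each way.

```python
from statistics import mean

# Hypothetical coefficients and a tiny made-up sample (illustration only).
a, b, c, f = 1.0, 0.5, 0.2, 0.05
E = [2.0, 8.0, 12.0, 16.0, 20.0]   # education
X = [30.0, 5.0, 20.0, 2.0, 10.0]   # experience


def predicted(e, x):
    """Predicted y from y = a + b*E + c*X + f*E*X."""
    return a + b * e + c * x + f * e * x


def marginal_effect(x):
    """dy/dE = b + f*X."""
    return b + f * x


def elasticity(e, x):
    """Elasticity of y with respect to E: (dy/dE) * E / y."""
    return marginal_effect(x) * e / predicted(e, x)


E_bar, X_bar = mean(E), mean(X)

# 1. Evaluated at the sample means of the explanatory variables.
effect_at_means = marginal_effect(X_bar)
elasticity_at_means = elasticity(E_bar, X_bar)

# 2. Evaluated for each observation, then averaged.
mean_effect = mean(marginal_effect(x) for x in X)
mean_elasticity = mean(elasticity(e, x) for e, x in zip(E, X))

print(effect_at_means, mean_effect)
print(elasticity_at_means, mean_elasticity)
```

In this sketch the two versions of the marginal effect coincide, because b + fX is linear in X and so its average equals its value at X_bar. The two elasticities do not coincide, because the elasticity is a nonlinear function of E and X. Gaps like the one described above therefore come from nonlinear quantities such as elasticities.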

Generally, I prefer to underpromise and overdeliver, so my approach in such cases is to walk in the footsteps of John Boyd, err on the side of caution, and report the more conservative (i.e., smaller in absolute value) of the two estimated effects. That way, if anyone tells me that my estimated effect is “too large,” I can always tell them that the true effect is actually likely to be much larger.

Be There or Be Square

Another application of the foregoing relates to how you interpret variables and their square. For example, when using individual-level data, it is not uncommon to include a person’s age and the square of their age as regressors in order to account for potential nonlinearities (here, U-shaped or inverse U-shaped relationships) between their age and the dependent variable. When studying the accumulation of assets, for example, there usually is an inverse U-shape relationship between age and asset accumulation: People in their late teens and early 20s usually have few assets; people accumulate assets throughout their work life, buying real estate, saving for retirement, etc.; and once they retire, people sell off their assets in order to consume. So suppose you want to study the effect of age A on a person’s assets Y while controlling for a number of variables X, you’d estimate

(2) Y = a + bA + cA^2 + fX + e.

Here, the marginal effect of age is dY/dA = b + 2cA, and not just b.

The age thing is a bit of a no-brainer, and few people make the mistake of only looking at age while ignoring its square.
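For the age example, a short Python sketch with made-up coefficients (b > 0 and c < 0, chosen only to produce an inverse U-shape) shows how the marginal effect b + 2cA changes sign over the life cycle. The turning point -b/(2c), where the marginal effect is zero, is a standard by-product of the quadratic form rather than something discussed above.

```python
# Hypothetical coefficients for Y = a + b*A + c*A**2 + ... (illustration only).
# b > 0 and c < 0 give an inverse U-shaped age profile.
b, c = 4.0, -0.04


def marginal_effect_of_age(A):
    """dY/dA = b + 2*c*A."""
    return b + 2 * c * A


# Age at which dY/dA = 0, i.e., the peak of the inverse U.
turning_point = -b / (2 * c)

for A in (25, 50, 75):
    print(A, marginal_effect_of_age(A))
print(turning_point)
```

With these numbers the effect of age is positive for the young, roughly zero at the turning point, and negative afterwards, so reading b alone would miss most of the story.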

Recently, however, I came across a paper looking at the inverse farm size–productivity relationship where the authors regressed productivity Y (measured in kilograms of output per hectare) on the size of a plot of land H (for hectares) and the square of that plot’s size H^2, so that they estimated

(3) Y = a + bH + cH^2 + e.

Then, in order to test whether there is an inverse relationship between farm size and productivity, they simply looked at whether the estimated b was negative and significantly different from zero. In fact, the proper test was of the null that b + 2cH = 0 against the alternative that b + 2cH ≠ 0, with H set to a relevant plot size such as the sample mean, in which case rejecting the null in favor of b + 2cH < 0 would constitute evidence of an inverse relationship.
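As a rough sketch of what that test involves, the Python snippet below builds the t-statistic for b + 2cH by hand. All the numbers (coefficient estimates, variances, covariance, and the plot size H) are invented for illustration, and H is treated as a fixed number. The key point is that the standard error of the combination depends on the covariance between the two estimated coefficients, not just on the standard error of b.

```python
import math

# Hypothetical estimates for Y = a + b*H + c*H**2 + e (illustration only).
b_hat, c_hat = -120.0, 8.0
# Made-up entries of the estimated covariance matrix of (b_hat, c_hat).
var_b, var_c, cov_bc = 900.0, 9.0, -60.0
# Plot size at which the effect is evaluated, e.g., the sample mean (assumed).
H = 2.5

# Point estimate of the marginal effect dY/dH = b + 2cH.
effect = b_hat + 2 * c_hat * H

# Var(b + 2cH) = Var(b) + 4H^2 Var(c) + 4H Cov(b, c), treating H as fixed.
variance = var_b + 4 * H**2 * var_c + 4 * H * cov_bc
std_error = math.sqrt(variance)

# t-statistic for the null hypothesis that b + 2cH = 0.
t_stat = effect / std_error
print(effect, std_error, t_stat)
```

This is the same calculation that a linear-combination command such as -lincom- carries out after the regression.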