{"id":11579,"date":"2016-01-11T05:00:06","date_gmt":"2016-01-11T11:00:06","guid":{"rendered":"http:\/\/marcfbellemare.com\/wordpress\/?p=11579"},"modified":"2016-01-10T12:06:59","modified_gmt":"2016-01-10T18:06:59","slug":"metrics-monday-there-is-more-than-one-source-of-endogeneity","status":"publish","type":"post","link":"https:\/\/marcfbellemare.com\/wordpress\/11579","title":{"rendered":"&#8216;Metrics Monday: There Is More than One Source of Endogeneity"},"content":{"rendered":"<blockquote><p>&#8220;If you know a good story, tell it from time to time.&#8221; &#8212; Noah Smith.<\/p><\/blockquote>\n<p>Actually, I know two related stories, which I will recount in this post because both stories need to be understood much more widely than they currently are given how often their affiliated problems crop up in the manuscripts I read.<\/p>\n<p>Take the most basic theoretical problem in microeconomics: A producer\u00a0has to choose how much labor\u00a0\u2113 to use in order to maximize its profit from producing and selling some output q whose production is dictated by the production function q = f(\u2113), where f(.) is the technology available to the producer. The output q sells at price p, and labor\u00a0\u2113 sells at wage w.<\/p>\n<p>Setting up the maximization problem, taking the first-order condition, checking that the second-order condition is satisfied, and solving for the profit-maximizing quantity of labor will yield a labor input demand \u2113* = \u2113(p,w). In such a problem, we say that\u00a0\u2113 is an endogenous variable&#8211;it is determined within the context of the problem&#8211;while p and w are exogenous variables&#8211;they are predetermined, that is, they are given, and they do not depend on the problem. (Alternatively, we also say that p, w, and f(.) 
are the primitives of the problem, but that is neither here nor there for the purposes of this discussion).<!--more--><\/p>\n<p>Now suppose you wanted to study the labor allocation decisions\u00a0on farms in a developing country. If you believe the theoretical model above, the least you would want to do is to regress each farm&#8217;s labor allocation\u00a0\u2113 on the price p of the crop grown on that farm and on the wage w that farm pays its workers. It would be a mistake, however, to claim that because p and w are exogenous in the theoretical problem above, you can treat them as exogenous in the empirical problem.<\/p>\n<p>So my first story is this: Endogeneity and exogeneity have vastly different\u00a0theoretical and empirical meanings.<\/p>\n<p>My second story is related: The fact that output price p and the wage w are not caused by\u00a0\u2113 does not make them exogenous. Indeed, there is more than one cause of (statistical) endogeneity.\u00a0In the regression<\/p>\n<p>(1)\u00a0\u2113 = a + bp + cw + e,<\/p>\n<p>statistical endogeneity can bias your estimates of a, b, and c in three ways:<\/p>\n<ol>\n<li><em>Unobserved heterogeneity.<\/em>\u00a0This is also known as the omitted variables problem. Suppose it is more physically demanding to work on a low-quality plot than it is to work on a high-quality plot, and that you have to pay workers accordingly. In this case, your estimate of c is biased because of the correlation between (omitted) soil quality, which is in the error term e, and w.<\/li>\n<li><em>Measurement error.<\/em>\u00a0Suppose the farmers you collected data from tend to\u00a0lie about\u00a0the price at which they sell\u00a0their crop (say, because they wish to under-report their actual income). 
Then the price p you observe is such that p = p* + u, where p* is the real price they receive for their crop, and u is the &#8220;adjustment&#8221; they make to that price when they tell you how much they received for their crop.\u00a0If u is correlated with p&#8211;say, the higher the price a farmer receives, the bigger the lie&#8211;then your estimate of b is biased.<\/li>\n<li><em>Reverse causality or simultaneity.<\/em>\u00a0This is what a lot of people think of as\u00a0<em>the<\/em> source of statistical endogeneity.\u00a0Suppose that, for some reason, the amount of labor a farmer employs on his farm has an effect on the price that farmer receives for his crops or on the wage he has to pay (say, because he has to pay his workers for overtime). This too will bias\u00a0your\u00a0coefficient estimates.<\/li>\n<\/ol>\n<p>Now, none of this should\u00a0be new to people who received their graduate training in the past 10 years. \u00a0But there are still some folks who believe reason #3 is the only source of statistical endogeneity, and who confuse theoretical and statistical endogeneity.<\/p>\n<p>The foregoing, however,\u00a0suggests a systematic way to think through and discuss identification issues\u00a0when writing applied papers: In my own work, I almost always include a point-by-point discussion of whether (i) unobserved heterogeneity, (ii) measurement error, and (iii) reverse causality\/simultaneity are a source of bias\u00a0in the application at hand, and of how I deal with each source of statistical endogeneity. I see such a discussion as second only to the introduction in terms of importance in the grand scheme of\u00a0a research paper, and I think most young researchers would benefit from including such a discussion when using observational data.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>&#8220;If you know a good story, tell it from time to time.&#8221; &#8212; Noah Smith. 
Actually, I know two related stories, which I will recount<\/p>\n<div class=\"more-link-wrapper\"><a class=\"more-link\" href=\"https:\/\/marcfbellemare.com\/wordpress\/11579\">Continue reading<span class=\"screen-reader-text\">&#8216;Metrics Monday: There Is More than One Source of Endogeneity<\/span><\/a><\/div>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-11579","post","type-post","status-publish","format-standard","hentry","category-uncategorized","entry"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p1gPg8-30L","_links":{"self":[{"href":"https:\/\/marcfbellemare.com\/wordpress\/wp-json\/wp\/v2\/posts\/11579","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/marcfbellemare.com\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/marcfbellemare.com\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/marcfbellemare.com\/wordpress\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/marcfbellemare.com\/wordpress\/wp-json\/wp\/v2\/comments?post=11579"}],"version-history":[{"count":13,"href":"https:\/\/marcfbellemare.com\/wordpress\/wp-json\/wp\/v2\/posts\/11579\/revisions"}],"predecessor-version":[{"id":11593,"href":"https:\/\/marcfbellemare.com\/wordpress\/wp-json\/wp\/v2\/posts\/11579\/revisions\/11593"}],"wp:attachment":[{"href":"https:\/\/marcfbellemare.com\/wordpress\/wp-json\/wp\/v2\/media?parent=11579"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/marcfbellemare.com\/wor
dpress\/wp-json\/wp\/v2\/categories?post=11579"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/marcfbellemare.com\/wordpress\/wp-json\/wp\/v2\/tags?post=11579"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}