Rob writes:
I am not an econometrician–I spend my time playing with CGE models–but have to know something about econometrics. Recently I have been reviewing draft papers on a project using detailed tax data in my country–firm-level, matched with individual returns of employees, value-added tax, import duties, etc.–for the period 2009-2014. A massive and rather unusual database.
It is all good work, but I have two concerns. (Note: I will get to Rob’s second concern in next week’s installment of ‘Metrics Mondays. – MFB.) One is about big data. Many of the researchers report t-statistics and other statistics as if the size of the data does not matter. In fact some say they are dealing with the population of firms, in which case my sense is that standard errors say nothing about statistical fit, but maybe about economic significance of relations between means. Even if it is a sample, as n/N becomes closer to 1, sample statistics become problematic.
That is a very interesting question. Let me rephrase it a bit more broadly: What do you do when you are dealing with the population itself rather than with a sample that is representative of a population?
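As an aside, Rob's remark about n/N can be made concrete with the textbook finite population correction. The sketch below is only an illustration, not something taken from the papers Rob is reviewing: it assumes simple random sampling without replacement, and the function name `srs_standard_error` as well as the numbers (a sample standard deviation of 10, a population of 1,000 firms) are made up for the example.

```python
import math


def srs_standard_error(sample_sd, n, N):
    """Design-based standard error of a sample mean.

    Assumes simple random sampling without replacement of n units
    from a finite population of N units (an assumption of this sketch).
    The usual s / sqrt(n) is scaled by sqrt(1 - n/N), the finite
    population correction.
    """
    if not 1 <= n <= N:
        raise ValueError("need 1 <= n <= N")
    fpc = math.sqrt(1.0 - n / N)  # shrinks to 0 as n approaches N
    return sample_sd / math.sqrt(n) * fpc


# Hypothetical numbers: hold the sample SD fixed at 10 and N at 1,000,
# and compare the naive and corrected standard errors as n/N grows.
N = 1000
for n in (10, 100, 500, 900, 1000):
    naive = 10.0 / math.sqrt(n)
    corrected = srs_standard_error(10.0, n, N)
    print(f"n/N = {n / N:.2f}  naive SE = {naive:.4f}  corrected SE = {corrected:.4f}")
```

Under these assumptions the corrected standard error falls to zero when n = N, because there is no sampling variation left once every unit is observed. Whether sampling variation is the right notion of uncertainty in that case is exactly the broader question posed above.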