{"id":11756,"date":"2016-03-03T05:00:38","date_gmt":"2016-03-03T11:00:38","guid":{"rendered":"http:\/\/marcfbellemare.com\/wordpress\/?p=11756"},"modified":"2016-03-03T12:10:31","modified_gmt":"2016-03-03T18:10:31","slug":"testing-thursday-comparing-distributions-redux","status":"publish","type":"post","link":"https:\/\/marcfbellemare.com\/wordpress\/11756","title":{"rendered":"Testing Thursday: Comparing Distributions Redux"},"content":{"rendered":"<p>On the subject of <a href=\"http:\/\/marcfbellemare.com\/wordpress\/11720\">tests used to compare two distributions<\/a>, Varun writes with two questions. His first question is as follows:<\/p>\n<blockquote><p>I teach part of a data analysis course at our institute. With the example of auto.dta that comes with Stata, we found the variable miles per gallon (mpg) not to be normally distributed.<br \/>\nTo find out whether mpg\u00a0is statistically different for domestic (a sub-sample of 52 cases) and foreign (a sub-sample of 22 cases) cars (total sample size of 74), I told students\u00a0there are two nonparametric tests, the Wilcoxon rank-sum test and the two-sample Kolmogorov-Smirnov test. The Wilcoxon rank-sum test tests whether medians are significantly different while the two-sample Kolmogorov-Smirnov\u00a0test tests whether distributions are different between both groups. My students asked me which one they should go for between these two tests\u00a0for mpg in auto.dta.<\/p><\/blockquote>\n<p><!--more--><\/p>\n<p>Answer:\u00a0I have never used the\u00a0Wilcoxon rank-sum test, and so I was not familiar with the procedure. 
\u00a0After digging around, however, I think I can answer the question &#8220;Should one use -ksmirnov- or -ranksum-?&#8221; <a href=\"http:\/\/www.graphpad.com\/guides\/prism\/6\/statistics\/index.htm?stat_choosing_between_the_mann-whit.htm\">These<\/a> guidelines might be helpful:<\/p>\n<blockquote><p>Both the Mann-Whitney [Note: The test conducted by -ranksum- &#8211;MFB.] and the Kolmogorov-Smirnov tests are nonparametric tests to compare two unpaired groups of data. Both compute p-values that test the null hypothesis that the two groups have the same distribution. But they work very differently:<\/p>\n<ul>\n<li>The Mann-Whitney test first ranks all the values from low to high, and then computes a P value that depends on the discrepancy between the mean ranks of the two groups.<\/li>\n<li>The Kolmogorov-Smirnov test compares the cumulative distribution of the two data sets, and computes a P value that depends on the largest discrepancy between distributions.<\/li>\n<\/ul>\n<p>Here are some guidelines for choosing between the two tests:<\/p>\n<ul>\n<li>The KS test is sensitive to any differences in the two distributions. Substantial differences in shape, spread or median will result in a small P value. In contrast, the MW test is mostly sensitive to changes in the median.<\/li>\n<li>The MW test is used more often and is recognized by more people, so choose it if you have no idea which to choose.<\/li>\n<li>The MW test has been extended to handle tied values. The KS test does not handle ties so well. If your data are categorical, so has many ties, don&#8217;t choose the KS test.<\/li>\n<li>Some fields of science tend to prefer the KS test over the MW test. It makes sense to follow the traditions of your field.<\/li>\n<\/ul>\n<\/blockquote>\n<p>In my <a href=\"http:\/\/marcfbellemare.com\/wordpress\/11710\">contract farming data<\/a>, both -ksmirnov- and -ranksum- give a similar answer:<\/p>\n<pre>. 
ksmirnov hunger, by(cf) exact\r\n\r\nTwo-sample Kolmogorov-Smirnov test for equality of distribution functions\r\n\r\n Smaller group       D       P-value  Exact\r\n ----------------------------------------------\r\n 0:                  0.0001    1.000\r\n 1:                 -0.0887    0.010\r\n Combined K-S:       0.0887    0.019      .\r\n\r\nNote: Ties exist in combined dataset;\r\n      there are 21 unique values out of 1182 observations.\r\n\r\n. ranksum hunger, by(cf)\r\n\r\nTwo-sample Wilcoxon rank-sum (Mann-Whitney) test\r\n\r\n          cf |      obs    rank sum    expected\r\n-------------+---------------------------------\r\n           0 |      601    373279.5    355491.5\r\n           1 |      581    325873.5    343661.5\r\n-------------+---------------------------------\r\n    combined |     1182      699153      699153\r\n\r\nunadjusted variance     34423427\r\nadjustment for ties    -399809.6\r\n                      ----------\r\nadjusted variance       34023617\r\n\r\nHo: hunger(cf==0) = hunger(cf==1)\r\n             z =   3.050\r\n    Prob &gt; |z| =   0.0023\r\n\r\n<\/pre>\n<p>In both cases, the null (of equality of distributions) is rejected.<\/p>\n<p>Which test are people more likely to use in economics? My hunch was that the Kolmogorov-Smirnov test was more popular than the Mann-Whitney U test, the test conducted by -ranksum-. That hunch owed a lot to <a href=\"https:\/\/en.wikipedia.org\/wiki\/Law_of_small_numbers\">the Law of Small Numbers<\/a>, however: when I was in grad school, a problem set once asked us to conduct a Kolmogorov-Smirnov test, but we were never asked to conduct a Mann-Whitney test. So I looked on JSTOR. 
It turns out that between 2010 and 2016, there are 108 instances of &#8220;Kolmogorov-Smirnov&#8221; in economics articles on JSTOR versus 101 instances of &#8220;Mann-Whitney,&#8221; so it looks as though there is no clear preference for either, at least in the recent past.<\/p>\n<p>As always, my advice is &#8220;<a href=\"http:\/\/marcfbellemare.com\/wordpress\/10966\">do (and report) both<\/a>.&#8221; This is especially so with nonparametric procedures which, compared to two parametric procedures aimed at getting at the same answer (e.g., probit vs. logit), are more likely to yield different answers by virtue of looking at the same problem very differently.<\/p>\n<p>Varun then asks:<\/p>\n<blockquote><p>Also they asked me: Here, the sample size is greater than\u00a050, while one of the sub-samples is smaller\u00a0than 50. Should we go for the exact p-value option in Stata for a two-sample Kolmogorov-Smirnov\u00a0test?<\/p><\/blockquote>\n<p>Answer: The concern here is only computational: computing the exact p-value can take significantly longer when n &gt; 50. I would always go for the exact p-value, unless it takes an inordinate amount of time. I just ran the test both with and without &#8220;, exact&#8221; in Stata 13 on my computer (a Samsung Ultrabook Series 5 I purchased in 2012 and which is getting on in age), and in both cases, it took a fraction of a second. I suspect that unless your sample size is in the tens of thousands, the time it takes to compute the exact p-value will not be too crazy.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>On the subject of tests used to compare two distributions, Varun writes with two questions. 
His first question is as follows: I teach part of<\/p>\n<div class=\"more-link-wrapper\"><a class=\"more-link\" href=\"https:\/\/marcfbellemare.com\/wordpress\/11756\">Continue reading<span class=\"screen-reader-text\">Testing Thursday: Comparing Distributions Redux<\/span><\/a><\/div>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1],"tags":[],"class_list":["post-11756","post","type-post","status-publish","format-standard","hentry","category-uncategorized","entry"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p1gPg8-33C","_links":{"self":[{"href":"https:\/\/marcfbellemare.com\/wordpress\/wp-json\/wp\/v2\/posts\/11756","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/marcfbellemare.com\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/marcfbellemare.com\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/marcfbellemare.com\/wordpress\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/marcfbellemare.com\/wordpress\/wp-json\/wp\/v2\/comments?post=11756"}],"version-history":[{"count":6,"href":"https:\/\/marcfbellemare.com\/wordpress\/wp-json\/wp\/v2\/posts\/11756\/revisions"}],"predecessor-version":[{"id":11762,"href":"https:\/\/marcfbellemare.com\/wordpress\/wp-json\/wp\/v2\/posts\/11756\/revisions\/11762"}],"wp:attachment":[{"href":"https:\/\/marcfbellemare.com\/wordpress\/wp-json\/wp\/v2\/media?parent=11756"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/marcfbellemare.com\/wordpress\/wp-json\/wp\/v2\/categ
ories?post=11756"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/marcfbellemare.com\/wordpress\/wp-json\/wp\/v2\/tags?post=11756"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}