The New York Times reported yesterday that a new study from Miller, Hemenway, and Azrael claims:
"States with the greatest number of guns in the home also have the highest rates of homicide, a new study finds. . . . " Well, I have spent only a short time looking at the study, but some things are pretty obvious: 1) they excluded the District of Columbia without any explanation; 2) they use other crime rates to explain the homicide rate (by the way, they don't use anything like an arrest or conviction rate, nothing to do with law enforcement); 3) they use purely cross-sectional data, which never allows one to properly control for what may cause differences in crime rates; and 4) data from different years are used without any explanation (for the sake of argument I will use what they did, but it is weird to have the unemployment rate from 2000 explain the homicide rate from 2001 to 2003, etc.). The data for a panel test on this are readily available from the sources used in their paper, though I have only collected the data needed to redo the estimates for 2001 that they use (why is it that these papers, where one can put together the data in an afternoon, get any serious attention?). Why they looked at the CDC data only for 2001 when it is available for many other years is a bit of a puzzle.
Since Miller and Hemenway have refused in the past to let me look at their data, I didn't bother asking this time and simply put the data together myself. The bottom line is that their result comes from two factors: the exclusion of DC and the use of other crime rates to explain the murder rate. Changing these two factors causes their result to go from positive and significant to negative and significant. I also ran these regressions on the robbery rate, and doing so produced a statistically significant negative effect whether or not DC was excluded. Using arrest rate data (not shown) also made the results more significantly negative. If I had the necessary panel data handy, my strong presumption is that it would also reverse their result whether or not DC was included.
It is problematic to include the other crime rates in these regressions, particularly since they must believe that guns cause robbery as well as homicide. The results below indicate that more guns mean fewer robberies (again, this is using their flawed setup, though I believe that this would continue to be observed with panel data).
The general rule when you are doing this type of empirical work is to use all the data available. When I have done my empirical work on guns I have used all the data available for all jurisdictions for all the years available. In this case, the CDC survey data is available for many years after 1995, not just 2001, and they are not using all the jurisdictions. If you selectively pick years or places, you should have a good explanation for why you are doing that, and I don't see any such explanation in the paper. The regressions reported by Miller et al. are also not the type of regression estimates that any economist would run. What I try to show below is how sensitive the results are to what I would consider the most obvious corrections: including all jurisdictions and making the estimates slightly more consistent with the way an economist would look at the problem, without even having to add new variables.
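To make the cross-section versus panel contrast concrete, here is a sketch in generic notation. The symbols are illustrative assumptions, not taken from the Miller et al. paper, and the panel form is a standard two-way fixed-effects setup rather than any specific model estimated in this post.

```latex
% Cross-section: one observation per jurisdiction i
h_i = \alpha + \beta\, g_i + x_i'\gamma + \varepsilon_i

% Panel: jurisdiction i in year t, with jurisdiction and year fixed effects
h_{it} = \beta\, g_{it} + x_{it}'\gamma + \mu_i + \lambda_t + \varepsilon_{it}
```

Here h is the homicide rate, g the gun ownership rate, and x the other controls. In the cross-section, any unmeasured jurisdiction trait that affects homicide is folded into the error term and can bias the estimate of \beta if it is correlated with g. In the panel form, traits that do not change over time are absorbed by \mu_i and nationwide year-to-year changes by \lambda_t.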
In any case, noting that this is purely cross-sectional data and not very useful, here is an attempt to redo their estimates by regressing the homicide rate from 2001 to 2003 on the gun ownership rate from the CDC and the other variables that they use (I wasn't able to find their Gini coefficient, but that is the only variable they used that isn't included). Here are some very simple linear regressions that I put together fairly quickly:
DC excluded (used all the variables in their Table 3, except for the Gini coefficient)
Homcide01to03 = average homicide rate from 2001 to 2003.
I think that the other variables should be clear.
. reg Homcide01to03 gunownershiprate2001 percenturban medianfamilyincome1999 percentbelowpovertylevel percentblack percentsinglefemaleparenthouseho unemploymentrate2000census percentdivorced percentpop18342001 aggrivatedassaultrate2001 robberyrate2001 southerncensusregion alcoholconsumption2001 if notDC==0
      Source |       SS       df       MS              Number of obs =      50
-------------+------------------------------           F( 13,    36) =   21.98
       Model |  275.288226    13  21.1760174           Prob > F      =  0.0000
    Residual |  34.6827793    36  .963410535           R-squared     =  0.8881
-------------+------------------------------           Adj R-squared =  0.8477
       Total |  309.971006    49  6.32593889           Root MSE      =  .98153

------------------------------------------------------------------------------
Homcide01~03 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
gunowne~2001 |   6.158754   2.575103     2.39   0.022     .9362022    11.38131
percenturban |   -1.20992   2.421382    -0.50   0.620     -6.12071     3.70087
medianf~1999 |    .000102    .000079     1.29   0.205    -.0000581    .0002622
percentbel~l |   40.05939   19.33717     2.07   0.046     .8417922    79.27699
percentblack |   .1185185   .0484017     2.45   0.019     .0203554    .2166816
percentsin~o |  -3.773734   39.70597    -0.10   0.925    -84.30117    76.75371
unemployme~s |  -26.08681   26.27778    -0.99   0.327    -79.38061    27.20699
percentdiv~d |   27.83938   17.55642     1.59   0.122     -7.76669    63.44544
per~18342001 |   12.88474   13.88689     0.93   0.360    -15.27917    41.04865
aggriva~2001 |   .0016147   .0016653     0.97   0.339    -.0017627    .0049922
robbery~2001 |   .0243026   .0056717     4.28   0.000     .0127999    .0358053
southernce~n |  -1.351635    .599814    -2.25   0.030    -2.568114   -.1351559
alcohol~2001 |   .0742161   .3756206     0.20   0.844    -.6875778      .83601
       _cons |   -14.5245   5.782964    -2.51   0.017     -26.2529   -2.796107
------------------------------------------------------------------------------

DC excluded (did not include their variables for other crimes)
. reg Homcide01to03 gunownershiprate2001 percenturban medianfamilyincome1999 percentbelowpovertylevel percentblack percentsinglefemaleparenthouseho unemploymentrate2000census percentdivorced percentpop18342001 southerncensusregion alcoholconsumption2001 if notDC==0
      Source |       SS       df       MS              Number of obs =      50
-------------+------------------------------           F( 11,    38) =   14.32
       Model |     249.711    11      22.701           Prob > F      =  0.0000
    Residual |  60.2600055    38  1.58578962           R-squared     =  0.8056
-------------+------------------------------           Adj R-squared =  0.7493
       Total |  309.971006    49  6.32593889           Root MSE      =  1.2593

------------------------------------------------------------------------------
Homcide01~03 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
gunowne~2001 |    2.69241   3.090395     0.87   0.389    -3.563767    8.948587
percenturban |   5.193162   2.623195     1.98   0.055    -.1172174    10.50354
medianf~1999 |   .0000198   .0000975     0.20   0.840    -.0001776    .0002172
percentbel~l |   25.22912    24.1867     1.04   0.303     -23.7343    74.19253
percentblack |   .2104145   .0536538     3.92   0.000     .1017981    .3190309
percentsin~o |   10.48135   48.55617     0.22   0.830    -87.81547    108.7782
unemployme~s |   1.005869   32.85402     0.03   0.976    -65.50361    67.51534
percentdiv~d |   50.45611   21.41619     2.36   0.024     7.101307    93.81091
per~18342001 |   6.999652   17.28577     0.40   0.688    -27.99356    41.99286
southernce~n |  -1.131898   .7236749    -1.56   0.126    -2.596902     .333105
alcohol~2001 |   .0678944   .4816396     0.14   0.889    -.9071341    1.042923
       _cons |  -13.31319   7.321042    -1.82   0.077    -28.13387    1.507483
------------------------------------------------------------------------------

Same as above, but DC is included
. reg Homcide01to03 gunownershiprate2001 percenturban medianfamilyincome1999 percentbelowpovertylevel percentblack percentsinglefemaleparenthouseho unemploymentrate2000census percentdivorced percentpop18342001 southerncensusregion alcoholconsumption2001
      Source |       SS       df       MS              Number of obs =      51
-------------+------------------------------           F( 11,    39) =   31.88
       Model |  1620.08306    11  147.280278           Prob > F      =  0.0000
    Residual |  180.146769    39  4.61914793           R-squared     =  0.8999
-------------+------------------------------           Adj R-squared =  0.8717
       Total |  1800.22983    50  36.0045966           Root MSE      =  2.1492

------------------------------------------------------------------------------
Homcide01~03 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
gunowne~2001 |  -9.199294   4.729762    -1.94   0.059    -18.76614    .3675525
percenturban |  -3.598846   4.131027    -0.87   0.389    -11.95464    4.756945
medianf~1999 |   .0000194   .0001664     0.12   0.908    -.0003172     .000356
percentbel~l |   39.06187   41.19014     0.95   0.349    -44.25305    122.3768
percentblack |   .4766173   .0751993     6.34   0.000     .3245123    .6287222
percentsin~o |  -201.1131   71.71166    -2.80   0.008    -346.1636   -56.06257
unemployme~s |   98.52408   52.70362     1.87   0.069    -8.079052    205.1272
percentdiv~d |   94.91258   35.49413     2.67   0.011     23.11892    166.7062
per~18342001 |    95.1942   23.88845     3.98   0.000     46.87524    143.5132
southernce~n |  -3.159236   1.169235    -2.70   0.010    -5.524236   -.7942356
alcohol~2001 |   1.496186   .7727291     1.94   0.060    -.0668065    3.059178
       _cons |  -25.89853   12.24821    -2.11   0.041    -50.67287   -1.124194
------------------------------------------------------------------------------

DC excluded, not using their selective set of control variables
. reg Homcide01to03 gunownershiprate2001 if notDC==0
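      Source |       SS       df       MS              Number of obs =      50
-------------+------------------------------           F(  1,    48) =    0.00
       Model |   .00402852     1   .00402852           Prob > F      =  0.9802
    Residual |  309.966977    48  6.45764536           R-squared     =  0.0000
-------------+------------------------------           Adj R-squared = -0.0208
       Total |  309.971006    49  6.32593889           Root MSE      =  2.5412

--------------------------------------------------------------------------------------
       Homcide01to03 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------------+----------------------------------------------------------------
gunownershiprate2001 |  -.0743955   2.978593    -0.02   0.980    -6.063259    5.914468
               _cons |   4.707644     1.0878     4.33   0.000     2.520475    6.894813
--------------------------------------------------------------------------------------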

Same with DC included
. reg Homcide01to03 gunownershiprate2001
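      Source |       SS       df       MS              Number of obs =      51
-------------+------------------------------           F(  1,    49) =    5.18
       Model |  172.063659     1  172.063659           Prob > F      =  0.0273
    Residual |  1628.16617    49   33.227881           R-squared     =  0.0956
-------------+------------------------------           Adj R-squared =  0.0771
       Total |  1800.22983    50  36.0045966           Root MSE      =  5.7644

--------------------------------------------------------------------------------------
       Homcide01to03 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------------+----------------------------------------------------------------
gunownershiprate2001 |  -14.46889   6.358312    -2.28   0.027    -27.24639    -1.69138
               _cons |   10.34603   2.299427     4.50   0.000     5.725162     14.9669
--------------------------------------------------------------------------------------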

What it means. Again, this uses purely cross-sectional data, but accepting that: their result depends on excluding DC and including other crime rates to explain the murder rate. Drop the other crime rates and include DC, and the estimates imply more guns, less homicide. Even when DC is excluded, the simple correlation using cross-sectional data is negative, though not at all statistically significant.
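The sign flip can be tabulated directly from the reported output. The following is a minimal sketch in Python that uses only the gun-ownership coefficients and standard errors from the five homicide regressions above. The negative signs follow the reported confidence intervals and the description in the text; the labels and the `t_stat` helper are illustrative names, not part of the original analysis.

```python
# Gun-ownership coefficient and standard error from each homicide regression
# reported above. Signs are as implied by the confidence intervals and the
# accompanying text; labels are illustrative shorthand.
results = [
    ("DC excluded, all controls incl. other crime rates", 6.158754, 2.575103),
    ("DC excluded, controls without other crime rates", 2.69241, 3.090395),
    ("DC included, controls without other crime rates", -9.199294, 4.729762),
    ("DC excluded, no controls", -0.0743955, 2.978593),
    ("DC included, no controls", -14.46889, 6.358312),
]


def t_stat(coef, se):
    """t-statistic for the null hypothesis that the coefficient is zero."""
    return coef / se


for label, coef, se in results:
    print(f"{label:<52} coef={coef:>9.4f}  t={t_stat(coef, se):>6.2f}")
```

Only the first specification, which excludes DC and controls for other crime rates, gives a positive coefficient with a t-statistic above 2.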
Just for the sake of argument, I did the same regressions for robbery (though I only took the time to put together the robbery rates for 2001).
DC Excluded
. reg robberyrate2001 gunownershiprate2001 percenturban percentdivorced medianfamilyincome1999 percentbelowpovertylevel percentsinglefemaleparenthouseho percentblack southerncensusregion percentpop18342001 unemploymentrate2000census alcoholconsumption2001 if notDC==0
      Source |       SS       df       MS              Number of obs =      50
-------------+------------------------------           F( 11,    38) =   14.80
       Model |  151143.145    11  13740.2859           Prob > F      =  0.0000
    Residual |   35287.596    38  928.620948           R-squared     =  0.8107
-------------+------------------------------           Adj R-squared =  0.7559
       Total |  186430.741    49    3804.709           Root MSE      =  30.473

------------------------------------------------------------------------------
robbery~2001 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
gunowne~2001 |   -148.547    74.7843    -1.99   0.054    -299.9399    2.845877
percenturban |   220.1914   63.47854     3.47   0.001     91.68583     348.697
percentdiv~d |   940.7374   518.2491     1.82   0.077    -108.4031    1989.878
medianf~1999 |  -.0024856   .0023595    -1.05   0.299    -.0072621    .0022909
percentbel~l |  -425.7565   585.2927    -0.73   0.471     -1610.62    759.1066
percentsin~o |   99.18109   1175.008     0.08   0.933    -2279.498    2477.861
percentblack |   3.950401   1.298365     3.04   0.004     1.321999    6.578804
southernce~n |   .8315924   17.51217     0.05   0.962    -34.61994    36.28313
per~18342001 |   -100.722   418.2974    -0.24   0.811    -947.5208    746.0768
unemployme~s |   892.2601   795.0325     1.12   0.269    -717.1991    2501.719
alcohol~2001 |  -.6820588   11.65517    -0.06   0.954    -24.27672     22.9126
       _cons |   11.46862   177.1615     0.06   0.949    -347.1761    370.1133
------------------------------------------------------------------------------

DC included
. reg robberyrate2001 gunownershiprate2001 percenturban percentdivorced medianfamilyincome1999 percentbelowpovertylevel percentsinglefemaleparenthouseho percentblack southerncensusregion percentpop18342001 unemploymentrate2000census alcoholconsumption2001
      Source |       SS       df       MS              Number of obs =      51
-------------+------------------------------           F( 11,    39) =   34.80
       Model |  468437.017    11  42585.1833           Prob > F      =  0.0000
    Residual |  47727.1118    39   1223.7721           R-squared     =  0.9075
-------------+------------------------------           Adj R-squared =  0.8815
       Total |  516164.128    50  10323.2826           Root MSE      =  34.982

------------------------------------------------------------------------------
robbery~2001 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
gunowne~2001 |  -269.6794   76.98545    -3.50   0.001    -425.3971   -113.9616
percenturban |   130.6335   67.23995     1.94   0.059    -5.372167    266.6391
percentdiv~d |   1393.584   577.7313     2.41   0.021     225.0122    2562.156
medianf~1999 |  -.0024894   .0027086    -0.92   0.364     -.007968    .0029893
percentbel~l |   -284.852   670.4441    -0.42   0.673    -1640.953    1071.249
percentsin~o |  -2056.182   1167.237    -1.76   0.086    -4417.142    304.7783
percentblack |   6.662021   1.224005     5.44   0.000     4.186237    9.137804
southernce~n |  -19.81946   19.03141    -1.04   0.304    -58.31413     18.6752
per~18342001 |   797.6534   388.8279     2.05   0.047     11.17482    1584.132
unemployme~s |   1885.609   857.8469     2.20   0.034       150.45    3620.768
alcohol~2001 |   13.86693   12.57757     1.10   0.277     -11.5736    39.30746
       _cons |  -116.7293   199.3618    -0.59   0.562    -519.9766    286.5179
------------------------------------------------------------------------------

For robbery, whether you include DC or not, there is a statistically significant negative relationship between the CDC's measure of gun ownership in 2001 and robbery rates in that year (p = 0.054 with DC excluded, p = 0.001 with DC included).
Sorry about the typos. I was working on this pretty late.