Cluster-Robust Standard Errors. What has all this to do with the "More Guns, Less Crime" data? Problem: the default standard errors (SE) reported by Stata, R and Python are correct only under very limited circumstances. Yes, that code will fit a regression model which assumes that the response is normally distributed, and use the Generalized Estimating Equations (GEE) method to provide standard errors that account for the correlation due to clustering within firms.

According to Cameron and Miller ("A Practitioner's Guide to Cluster-Robust Inference"), this clustering leads to incorrect standard errors: it violates the assumption of independence required by many estimation methods and statistical tests, and can lead to Type I and Type II errors. One way to account for clustering is to model the within-cluster dependence explicitly. Hand calculations for clustered standard errors are somewhat complicated (compared to your average statistical formula).

In SAS, clustered standard errors may be estimated as follows:

proc genmod; class identifier; model depvar = indvars; repeated subject=identifier / type=ind; run; quit;

This method is quite general, and allows alternative regression specifications using different link functions.

I agree: if first differencing is applied to remove the fixed effects, then it should also be applied to the dependent variable. Ibragimov, R., & Muller, U.

As this is panel data, you almost certainly have clustering. In a simple time series setting we can use the Newey-West covariance matrix with a bunch of lags, and that will take care of the problem of correlation in the residuals. I have a question about how to correct standard errors when the independent variable has correlation.
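To make the "hand calculation" concrete, here is a minimal sketch of the cluster-robust sandwich estimator for a regression with one regressor and an intercept, written in plain Python. The function name `ols_cluster_se`, the toy data, and the `firm` labels are hypothetical and only for illustration. The finite-sample factor used below, (m/(m-1)) * ((n-1)/(n-k)) with m clusters, is one common convention and is an assumption here, not something the text above specifies.

```python
from collections import defaultdict
from math import sqrt


def ols_cluster_se(x, y, groups, small_sample=True):
    """Fit y = a + b*x by OLS; return (coefs, conventional SEs, clustered SEs)."""
    n, k = len(y), 2
    # Entries of X'X and X'y for the design matrix X = [1, x].
    sx, sxx = sum(x), sum(v * v for v in x)
    sy, sxy = sum(y), sum(xi * yi for xi, yi in zip(x, y))
    det = n * sxx - sx * sx
    bread = [[sxx / det, -sx / det], [-sx / det, n / det]]  # (X'X)^-1
    a = bread[0][0] * sy + bread[0][1] * sxy
    b = bread[1][0] * sy + bread[1][1] * sxy
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]

    # Conventional SEs assume independent, homoskedastic errors.
    s2 = sum(u * u for u in resid) / (n - k)
    se_conv = [sqrt(s2 * bread[i][i]) for i in range(k)]

    # Sum the scores x_i * u_i within each cluster, then form the "meat".
    score = defaultdict(lambda: [0.0, 0.0])
    for xi, ui, g in zip(x, resid, groups):
        score[g][0] += ui
        score[g][1] += xi * ui
    m = len(score)
    if m < 2:
        raise ValueError("need at least two clusters")
    meat = [[sum(s[i] * s[j] for s in score.values()) for j in range(k)]
            for i in range(k)]

    # Sandwich: (X'X)^-1 * meat * (X'X)^-1.
    half = [[sum(bread[i][l] * meat[l][j] for l in range(k)) for j in range(k)]
            for i in range(k)]
    cov = [[sum(half[i][l] * bread[l][j] for l in range(k)) for j in range(k)]
           for i in range(k)]
    # Assumed finite-sample correction; set small_sample=False to skip it.
    scale = (m / (m - 1)) * ((n - 1) / (n - k)) if small_sample else 1.0
    se_clu = [sqrt(scale * cov[i][i]) for i in range(k)]
    return (a, b), se_conv, se_clu


# Hypothetical toy data: 4 firms, 3 observations each.
x = [1.0, 1.2, 0.9, 2.0, 2.1, 1.9, 3.0, 3.2, 2.9, 4.0, 4.1, 3.8]
y = [1.1, 1.3, 1.0, 2.6, 2.7, 2.4, 2.9, 3.0, 2.8, 4.4, 4.6, 4.3]
firm = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
coefs, se_conv, se_clu = ols_cluster_se(x, y, firm)
print("coefficients:", coefs)
print("conventional SE:", se_conv)
print("clustered SE:", se_clu)
```

With every observation in its own cluster and `small_sample=False`, this reduces to the heteroskedasticity-robust (HC0) estimator. In practice a library routine is preferable to hand-rolled matrix algebra; the sketch only shows which pieces go into the calculation.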
The Sampling Design reason for clustering. Consider running a simple Mincer earnings regression of the form:

log(wages) = a + b*(years of schooling) + c*experience + d*experience^2 + e

You present this model, and are deciding whether to cluster the standard errors. First, I'll show how to write a function to obtain clustered standard errors. Cluster sampling involves grouping the population into convenient aggregations. Assume m clusters. Block bootstrap the standard errors with individuals being "blocks".

Regression of dem_ind on log_gdppc (standing for democracy index and logarithm of GDP per capita), with standard errors clustered by country to correct for autocorrelation.

- Wooldridge (2010) "Econometric Analysis of Cross Section and Panel Data", 2nd Edition, MIT Press.

Teachers might be more efficient in some classes than in others, students may be clustered by ability (e.g. special education classes), or some schools might have better access to computers than others. While robust standard errors are often larger than their usual counterparts, this is not necessarily the case, and indeed in this example some robust standard errors are smaller than their conventional counterparts. Even in the second case, Abadie et al. Am I correct in understanding that if you include fixed effects, you should not be clustering at that level? In this case, the clustering correction would increase the standard errors from 0.25 to 1.25.
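The block bootstrap mentioned above (individuals as "blocks") can be sketched as follows: whole individuals are resampled with replacement, the slope is re-estimated on each resample, and the standard deviation of those slopes is taken as the standard error. The names `ols_slope` and `block_bootstrap_se`, the number of replications, the seed, and the toy panel are all hypothetical choices for illustration, and the sketch covers only a single-regressor model.

```python
import random
from collections import defaultdict
from statistics import stdev


def ols_slope(x, y):
    """Slope of the OLS fit of y on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def block_bootstrap_se(x, y, ids, reps=999, seed=0):
    """Bootstrap SE of the slope, resampling whole blocks (e.g. individuals)."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for xi, yi, g in zip(x, y, ids):
        blocks[g].append((xi, yi))
    keys = list(blocks)
    slopes = []
    for _ in range(reps):
        # Draw as many blocks as in the sample, with replacement.
        draw = rng.choices(keys, k=len(keys))
        xs = [xi for g in draw for xi, _ in blocks[g]]
        ys = [yi for g in draw for _, yi in blocks[g]]
        if max(xs) == min(xs):
            continue  # slope undefined when x has no variation; skip this draw
        slopes.append(ols_slope(xs, ys))
    return stdev(slopes)


# Hypothetical toy panel: 5 individuals observed 3 times each.
x = [1.0, 1.5, 2.0, 2.0, 2.5, 3.0, 3.0, 3.5, 4.0, 4.0, 4.5, 5.0, 5.0, 5.5, 6.0]
y = [1.2, 1.4, 2.3, 2.9, 3.1, 3.6, 2.8, 3.3, 3.9, 4.6, 5.0, 5.2, 4.9, 5.8, 6.1]
person = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5]
print("slope:", ols_slope(x, y))
print("block-bootstrap SE:", block_bootstrap_se(x, y, person))
```

Resampling whole blocks keeps each individual's observations together, so any within-individual correlation is preserved in every bootstrap sample. With only a handful of clusters the resulting standard error can be unreliable.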