I have a panel data set in R (time and cross-section) and would like to compute standard errors that are clustered along two dimensions, because my residuals are correlated both ways. Googling around, I found http://thetarzan.wordpress.com/2011/06/11/clustered-standard-errors-in-r/, which provides a function to do this. It seems a bit ad hoc, so I wanted to know: is there a tested package that does this?
I know the sandwich package does HAC standard errors, but it doesn't do double clustering (i.e., clustering along two dimensions).
This is an old question. But seeing as people still appear to be landing on it, I thought I'd provide some modern approaches to multiway clustering in R:
fixest::feols()
library(fixest)
nlswork = haven::read_dta("http://www.stata-press.com/data/r14/nlswork.dta")
est_feols = feols(ln_wage ~ age | race + year, cluster = ~race+year, data = nlswork)
est_feols
## An important feature of fixest: We can _instantaneously_ compute other
## VCOV matrices / SEs on the fly with summary.fixest(). No need to re-run
## the model!
summary(est_feols, se = 'standard') ## vanilla SEs
summary(est_feols, se = 'hetero') ## robust SEs
summary(est_feols, se = 'twoway') ## diff syntax, but same as original model
summary(est_feols, cluster = c('race', 'year')) ## ditto
summary(est_feols, cluster = ~race^year) ## interacted cluster vars
summary(est_feols, cluster = ~ race + year + idcode) ## add third cluster var (not in original model call)
## etc.
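If you need the clustered variance-covariance matrix itself (say, for downstream Wald tests), you can pull it out directly. A minimal sketch, assuming vcov.fixest() accepts the same cluster argument as summary.fixest():

## Extract the two-way clustered VCOV matrix (e.g. for downstream tests).
## Assumes vcov.fixest() takes the same `cluster` argument as summary.fixest().
V_twoway = vcov(est_feols, cluster = ~race + year)
sqrt(diag(V_twoway))  ## should reproduce the clustered SEs from summary(est_feols)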
lfe::felm()
library(lfe)
## Note: the third "| 0" slot means we're not using IV (no instruments)
est_felm = felm(ln_wage ~ age | race + year | 0 | race + year, data = nlswork)
summary(est_felm)
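The multiway-clustered VCOV also lives on the fitted felm object if you want to work with it directly; a quick sketch, where the clustervcv field name is my recollection of lfe's object structure rather than something guaranteed:

## The clustered VCOV should be stored on the felm object (field name assumed).
se_felm = sqrt(diag(est_felm$clustervcv))
se_felm  ## should match the cluster s.e. column in summary(est_felm)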
sandwich
library(sandwich)
library(lmtest)
est_sandwich = lm(ln_wage ~ age + factor(race) + factor(year), data = nlswork)
coeftest(est_sandwich, vcov = vcovCL, cluster = ~ race + year)
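Under the hood, vcovCL() handles multiple cluster dimensions via the Cameron, Gelbach & Miller inclusion-exclusion approach: add the one-way clustered matrices and subtract the matrix clustered on their intersection. Here's a rough sketch of that decomposition, done on a complete-case subset so the cluster vectors line up with the model (approximate only, since the default small-sample adjustments can differ slightly):

## Cameron-Gelbach-Miller decomposition by hand.
dat = na.omit(nlswork[, c("ln_wage", "age", "race", "year")])
m   = lm(ln_wage ~ age + factor(race) + factor(year), data = dat)
V_cgm = vcovCL(m, cluster = dat$race) +
        vcovCL(m, cluster = dat$year) -
        vcovCL(m, cluster = interaction(dat$race, dat$year))
sqrt(diag(V_cgm))["age"]  ## close to the clustered SE from coeftest() above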
Aaaand, just to belabour the point about speed, here's a benchmark of the three approaches (using the same two fixed effects and two-way clustering).
library(microbenchmark)

est_feols    = function() feols(ln_wage ~ age | race + year, cluster = ~race+year, data = nlswork)
est_felm     = function() felm(ln_wage ~ age | race + year | 0 | race + year, data = nlswork)
est_sandwich = function() coeftest(lm(ln_wage ~ age + factor(race) + factor(year), data = nlswork),
                                   vcov = vcovCL, cluster = ~ race + year)

microbenchmark(est_feols(), est_felm(), est_sandwich(), times = 3)
#> Unit: milliseconds
#>            expr       min        lq      mean    median        uq       max neval cld
#>     est_feols()  11.94122  11.96158  12.55835  11.98193  12.86692  13.75191     3 a
#>      est_felm()  87.18064  95.89905 100.69589 104.61746 107.45352 110.28957     3  b
#>  est_sandwich() 176.43502 183.95964 188.50271 191.48425 194.53656 197.58886     3   c
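For what it's worth, the three approaches should give identical point estimates, and the two-way clustered SEs should agree up to each package's default small-sample correction. A quick comparison sketch; the coefficient-table field names below are assumptions on my part, so adjust if your package versions store them differently:

## Point estimates on `age` should be identical; the clustered SEs should agree
## up to differing finite-sample corrections. (Field names below are assumed.)
rbind(
  feols    = summary(est_feols())$coeftable["age", 1:2],
  felm     = summary(est_felm())$coefficients["age", 1:2],
  sandwich = est_sandwich()["age", 1:2]
)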