Regression with Heteroskedasticity Corrected Standard Errors

JJJ · Dec 8, 2010 · Viewed 41.9k times

I would like to find the R implementation that most closely resembles Stata output for fitting a least squares regression with heteroskedasticity-corrected standard errors. Specifically, I would like the corrected standard errors to appear in the "summary" so that I don't have to do additional calculations for my initial round of hypothesis testing. I am looking for a solution that is as "clean" as what EViews and Stata provide.

So far, using the "lmtest" package the best I can come up with is:

library(lmtest)                  # coeftest()
library(car)                     # hccm()
model <- lm(...)
coeftest(model, vcov = hccm)

This gives me the output that I want, but it does not seem to be using coeftest for its stated purpose. I would also have to use the summary with the uncorrected standard errors to read off the R^2, F statistic, etc. I feel there should be a "one line" solution to this problem given how flexible R is.
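
For concreteness, the two-step workflow I am describing looks something like this (a sketch, with a built-in dataset standing in for my actual model):

model <- lm(mpg ~ wt + hp, data = mtcars)  # placeholder model, for illustration only
coeftest(model, vcov = hccm)               # hypothesis tests with corrected SEs
summary(model)$r.squared                   # R^2 still read from the plain, uncorrected summary
summary(model)$fstatistic                  # classical (non-robust) F statistic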

Thanks

Answer

Gavin Simpson · Dec 8, 2010

I think you are on the right track with coeftest in package lmtest. Take a look at the sandwich package, which includes this functionality and is designed to work hand in hand with the lmtest package you have already found.

> ## load sandwich for vcovHC() and lmtest for coeftest()/waldtest()
> library(sandwich)
> library(lmtest)
> # generate linear regression relationship
> # with homoskedastic errors
> x <- sin(1:100)
> y <- 1 + x + rnorm(100)
> ## model fit and HC3 covariance
> fm <- lm(y ~ x)
> vcovHC(fm)
            (Intercept)           x
(Intercept) 0.010809366 0.001209603
x           0.001209603 0.018353076
> coeftest(fm, vcov. = vcovHC)

t test of coefficients:

            Estimate Std. Error t value  Pr(>|t|)    
(Intercept)  1.01973    0.10397  9.8081 3.159e-16 ***
x            0.93992    0.13547  6.9381 4.313e-10 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
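
One thing to keep in mind if the goal is to match Stata output: vcovHC() defaults to the HC3 estimator, while Stata's robust option corresponds, as far as I know, to HC1. You can request that type explicitly; a minimal sketch, reusing fm from above:

## HC1 (rather than the HC3 default) is the small-sample adjustment used by Stata's robust VCE
coeftest(fm, vcov. = vcovHC(fm, type = "HC1"))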

To get the F test, look at function waldtest():

> waldtest(fm, vcov = vcovHC)
Wald test

Model 1: y ~ x
Model 2: y ~ 1
  Res.Df Df      F    Pr(>F)    
1     98                        
2     99 -1 48.137 4.313e-10 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

You could always cook up a simple function to combine these two for you if you wanted the one-liner...

There are lots of examples of linking lmtest and sandwich to do what you want in the Econometric Computing with HC and HAC Covariance Matrix Estimators vignette that comes with the sandwich package.

Edit: A one-liner could be as simple as:

mySummary <- function(model, VCOV) {
    print(coeftest(model, vcov. = VCOV))  # coefficient table with corrected SEs
    print(waldtest(model, vcov = VCOV))   # Wald F test using the same vcov
}

We can use it like this (on the example from above):

> mySummary(fm, vcovHC)

t test of coefficients:

            Estimate Std. Error t value  Pr(>|t|)    
(Intercept)  1.01973    0.10397  9.8081 3.159e-16 ***
x            0.93992    0.13547  6.9381 4.313e-10 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Wald test

Model 1: y ~ x
Model 2: y ~ 1
  Res.Df Df      F    Pr(>F)    
1     98                        
2     99 -1 48.137 4.313e-10 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
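
If you also want the R^2 reported in the same place, as the question asks, the wrapper can pull it from the ordinary summary(); the coefficient estimates and R^2 are unchanged by the covariance correction, so they can safely be read from there. A minimal sketch along those lines (the name myRobustSummary is just illustrative):

myRobustSummary <- function(model, VCOV = vcovHC) {
    print(coeftest(model, vcov. = VCOV))   # coefficients with corrected SEs
    print(waldtest(model, vcov = VCOV))    # Wald F test with the same vcov
    s <- summary(model)
    cat("R-squared:", round(s$r.squared, 4),
        " Adjusted R-squared:", round(s$adj.r.squared, 4), "\n")
    invisible(model)
}

## e.g. myRobustSummary(fm, vcovHC)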