first-difference linear panel model variance in R and Stata

Dimitriy V. Masterov · Sep 26, 2013

I would like a colleague to replicate, using the plm package in R (or some other package), a first-difference linear panel data model that I am estimating in Stata.

In Stata, xtreg does not have a first difference option, so instead I run:

reg D.(y x), nocons cluster(ID)

In R, I am doing:

plm(formula = y ~ -1 + x, data = data, model = "fd", index = c("ID","Period"))

The coefficients match, but the standard errors in R are larger than in Stata. I looked in the plm help and pdf documentation, but I must be missing something.

Answer

Metrics · Sep 27, 2013

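The standard errors differ because you use the cluster() option in Stata, while plm reports conventional (non-clustered) standard errors by default. Without clustering, the two programs agree exactly: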

R:

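library(plm)
# Grunfeld ships with plm, so load the package before the data
data(Grunfeld)
# first-difference model without an intercept (the object name is kept as in the original answer)
grun.re <- plm(inv~-1+value+capital,data=Grunfeld,model="fd")
summary(grun.re)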
Oneway (individual) effect First-Difference Model

Call:
plm(formula = inv ~ -1 + value + capital, data = Grunfeld, model = "fd")

Balanced Panel: n=10, T=20, N=200

Residuals :
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
-202.00  -15.20   -1.76   -1.39    7.95  199.00 

Coefficients :
         Estimate Std. Error t-value  Pr(>|t|)    
value   0.0890628  0.0082341  10.816 < 2.2e-16 ***
capital 0.2786940  0.0471564   5.910  1.58e-08 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Stata:

 reg D.(inv value capital), nocons

      Source |       SS       df       MS              Number of obs =     190
-------------+------------------------------           F(  2,   188) =   70.58
       Model |   259740.92     2   129870.46           Prob > F      =  0.0000
    Residual |  345936.615   188  1840.08838           R-squared     =  0.4288
-------------+------------------------------           Adj R-squared =  0.4228
       Total |  605677.536   190   3187.7765           Root MSE      =  42.896

------------------------------------------------------------------------------
       D.inv |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       value |
         D1. |   .0890628   .0082341    10.82   0.000     .0728197    .1053059
             |
     capital |
         D1. |    .278694   .0471564     5.91   0.000     .1856703    .3717177

If you want to cluster by group in R as well, pass a cluster-robust covariance matrix to coeftest:

R:

library(lmtest) # for coeftest function
coeftest(grun.re,vcov=vcovHC(grun.re,type="HC0",cluster="group"))

t test of coefficients:

        Estimate Std. Error t value  Pr(>|t|)    
value   0.089063   0.013728  6.4878 7.512e-10 ***
capital 0.278694   0.130954  2.1282   0.03462 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Stata:

. reg D.(inv value capital), nocons cluster(firm)

Linear regression                                      Number of obs =     190
                                                       F(  2,     9) =   47.80
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.4288
                                                       Root MSE      =  42.896

                                  (Std. Err. adjusted for 10 clusters in firm)
------------------------------------------------------------------------------
             |               Robust
       D.inv |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       value |
         D1. |   .0890628   .0145088     6.14   0.000     .0562416    .1218841
             |
     capital |
         D1. |    .278694    .138404     2.01   0.075    -.0343976    .5917857
------------------------------------------------------------------------------

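You can see that a slight difference remains. Stata applies a finite-sample correction to the clustered variance and uses 9 (clusters minus one) degrees of freedom for the t statistics, whereas HC0 applies no correction and coeftest uses the residual degrees of freedom, so both the standard errors and the p-values differ a little. For details in R, see the plm manual, page 39.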
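The remaining gap can be closed by hand. As a rough sketch, assuming Stata scales the clustered variance by G/(G-1) * (N-1)/(N-K) (G clusters, N observations after differencing, K coefficients) and takes p-values from a t distribution with G-1 degrees of freedom, the same adjustment can be applied to the HC0 matrix from plm. The object names (fd, adj, vc_stata, se_stata) are only illustrative, and newer plm versions may offer a built-in type for this in vcovHC, which I have not relied on here.

```r
library(plm)
library(lmtest)

data("Grunfeld", package = "plm")

# First-difference model without an intercept, as in the answer above
fd <- plm(inv ~ -1 + value + capital, data = Grunfeld, model = "fd")

G <- length(unique(Grunfeld$firm))  # number of clusters (firms)
N <- length(residuals(fd))          # observations left after differencing
K <- length(coef(fd))               # number of estimated coefficients

# Assumed Stata-style finite-sample factor for clustered variances
adj <- (G / (G - 1)) * ((N - 1) / (N - K))

# Scale the uncorrected cluster-robust covariance matrix
vc_stata <- adj * vcovHC(fd, type = "HC0", cluster = "group")
se_stata <- sqrt(diag(vc_stata))

# Use G - 1 degrees of freedom for the t tests, as Stata does with cluster()
coeftest(fd, vcov. = vc_stata, df = G - 1)
```

With the Grunfeld data the factor is 10/9 * 189/188, roughly 1.117, so the standard errors grow by about 5.7 percent and should land close to the Stata values of .0145088 and .138404 shown above.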