Block bootstrap from subject list

Question 1

Block bootstrap from subject list

r regression plyr statistics-bootstrap

baha-kev · Aug 12, 2012 · Viewed 7.4k times · Source

Answer

Answer

How about something like this:

myfit <- function(x, i) {
   mydata <- do.call("rbind", lapply(i, function(n) subset(Grunfeld, firm==x[n])))
   coefficients(lm(value ~ inv + capital, data = mydata))
}

firms <- unique(Grunfeld$firm)

b0 <- boot(firms, myfit, 999)

Question 2

I'm trying to efficiently implement a block bootstrap technique to get the distribution of regression coefficients. The main outline is as follows.

I have a panel data set, and say firm and year are the indices. For each iteration of the bootstrap, I wish to sample n subjects with replacement. From this sample, I need to construct a new data frame that is an rbind() stack of all the observations for each sampled subject, run the regression, and pull out the coefficients. Repeat for a bunch of iterations, say 100.

Each firm can potentially be selected multiple times, so I need to include it data multiple times in each iteration's data set.
Using a loop and subset approach, like below, seems computationally burdensome.
Note that for my real data frame, n, and the number iterations is much larger than the example below.

My thoughts initially are to break the existing data frame into a list by subject using the split() command. From there, use

sample(unique(df1$subject),n,replace=TRUE)

to get the new list, then perhaps implement quickdf from the plyr package to construct a new data frame.

Example slow code:

require(plm)
data("Grunfeld", package="plm")

firms = unique(Grunfeld$firm)
n = 10
iterations = 100
mybootresults=list()

for(j in 1:iterations){

  v = sample(length(firms),n,replace=TRUE)
  newdata = NULL

  for(i in 1:n){
    newdata = rbind(newdata,subset(Grunfeld, firm == v[i]))
  }

  reg1 = lm(value ~ inv + capital, data = newdata)
  mybootresults[[j]] = coefficients(reg1)

}

mybootresults = as.data.frame(t(matrix(unlist(mybootresults),ncol=iterations)))
names(mybootresults) = names(reg1$coefficients)
mybootresults

  (Intercept)      inv    capital
1    373.8591 6.981309 -0.9801547
2    370.6743 6.633642 -1.4526338
3    528.8436 6.960226 -1.1597901
4    331.6979 6.239426 -1.0349230
5    507.7339 8.924227 -2.8661479
...
...

Block bootstrap from subject list

Answer

Related questions