Predicting standard errors of forecast

Bryan picture Bryan · Dec 7, 2012 · Viewed 7.9k times · Source

I'm a newbie to R, coming from the Stata world. I've just run a linear model (with approx 100 variables, each with 500 data points or so) like so:

RegModel.3 <- lm(ordercount~timecount2+timecount4 .... expeditedrop, data=Dataset)

Now I want to find the standard error of the forecast, like the stdf function in stata, for each of the fitted values.

I've tried the following code:

predict(RegModel.3$fitted.values, new, se.fit=TRUE)

But I'm getting the following error:

Error in UseMethod("predict") : no applicable method for 'predict' applied to an object of class "c('double', 'numeric')"

What am I doing wrong? Also, how do I go about exporting the output using the write.csv command in a manner similar to the way I wrote the coefficients:

write.csv(RegModel.3$coefficients, file='results.csv')

Thanks!

Answer

Ben Bolker picture Ben Bolker · Dec 7, 2012

A reproducible example taken from ?lm:

ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2,10,20, labels=c("Ctl","Trt"))
weight <- c(ctl, trt)
lm.D9 <- lm(weight ~ group)


pp <- predict(lm.D9, interval="prediction",se.fit=TRUE)

(see ?predict.lm for reference) and

write.csv(cbind(pp$fit,stderr=pp$se.fit),file="predintervals.csv")

should do it. (In this case the standard errors are all the same because there is a single categorical predictor and no continuous predictor ...)

PS: Whenever possible it's better to use standard accessors, such as coef() to extract the coefficients or fitted() to extract fitted values, than reaching inside the object with $. Try methods(class="lm") to see what accessors are available.

edit: an illustration that the general approach still works with a large problem:

set.seed(101)
X <- matrix(runif(101*500),nrow=500)
prednames <- paste0("predictor",1:100)
X2 <- setNames(as.data.frame(X),
               c("response",prednames))
form <- reformulate(paste(prednames,collapse="+"),response="response")
fit1 <- lm(form,data=X2)
pp <- predict(fit1,interval="predict",se.fit=TRUE)