Is there a simple command to do leave-one-out cross validation with the lm() function?

stollenm · Oct 31, 2017 · Viewed 15.2k times

Is there a simple command to do leave-one-out cross validation with the lm() function in R?

Specifically, is there a simple command that does what the code below does?

x <- rnorm(1000,3,2)
y <- 2*x + rnorm(1000)

pred_error_sq <- c(0)
for(i in 1:1000) {
  x_i <- x[-i]
  y_i <- y[-i]
  mdl <- lm(y_i ~ x_i) # leave i'th observation out
  y_pred <- predict(mdl, data.frame(x_i = x[i])) # predict i'th observation
  pred_error_sq <- pred_error_sq + (y[i] - y_pred)^2 # cumulate squared prediction errors
}

y_squared <- sum((y-mean(y))^2) # Total variation of the data (a sum, to match the summed squared errors)

R_squared <- 1 - (pred_error_sq/y_squared) # Measure for goodness of fit

Answer

amarchin · Oct 31, 2017

Another solution is to use the caret package:

library(caret)

x <- rnorm(1000, 3, 2) # define x first: data.frame() cannot refer to its own x column when building y
data <- data.frame(x = x, y = 2*x + rnorm(1000))

train(y ~ x, method = "lm", data = data, trControl = trainControl(method = "LOOCV"))

Linear Regression

1000 samples
   1 predictor

No pre-processing
Resampling: Leave-One-Out Cross-Validation
Summary of sample sizes: 999, 999, 999, 999, 999, 999, ...
Resampling results:

  RMSE      Rsquared  MAE
  1.050268  0.940619  0.836808

Tuning parameter 'intercept' was held constant at a value of TRUE
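
For an ordinary lm() fit there is also a shortcut that avoids refitting the model 1000 times: the leave-one-out prediction error for observation i equals its ordinary residual divided by (1 - h_ii), where h_ii is the i'th hat value. The sketch below is not part of the original answer; it uses only base R (residuals() and hatvalues() from stats), and the seed and the names loo_resid, press and R_squared_loo are chosen here purely for illustration.

```r
set.seed(1)                     # arbitrary seed, only for reproducibility
x <- rnorm(1000, 3, 2)
y <- 2*x + rnorm(1000)

mdl <- lm(y ~ x)                # fit once, on all observations

# leave-one-out prediction error for each observation: e_i / (1 - h_ii)
loo_resid <- residuals(mdl) / (1 - hatvalues(mdl))

press <- sum(loo_resid^2)       # cumulated squared leave-one-out prediction errors
R_squared_loo <- 1 - press / sum((y - mean(y))^2)  # same measure as in the question's loop
```

This should give the same value as the explicit loop in the question (once both sums are on the same scale), because the identity is exact for least-squares fits. It does not carry over to models fitted by other methods, where a resampling tool such as caret is the safer route.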