Is there a simple command to do leave-one-out cross validation with the lm()
function in R?
Specifically, is there a simple command that does what the code below does?
x <- rnorm(1000,3,2)
y <- 2*x + rnorm(1000)
pred_error_sq <- c(0)
for (i in 1:1000) {
  x_i <- x[-i]
  y_i <- y[-i]
  mdl <- lm(y_i ~ x_i) # leave i'th observation out
  y_pred <- predict(mdl, data.frame(x_i = x[i])) # predict i'th observation
  pred_error_sq <- pred_error_sq + (y[i] - y_pred)^2 # accumulate squared prediction errors
}
y_squared <- sum((y - mean(y))^2) # total variation of the data (total sum of squares)
R_squared <- 1 - (pred_error_sq / y_squared) # measure of goodness of fit
Another solution is to use the caret package:
library(caret)
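x <- rnorm(1000, 3, 2) # define x first: data.frame() cannot refer to one of its own columns while it is being built
data <- data.frame(x = x, y = 2*x + rnorm(1000))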
train(y ~ x, method = "lm", data = data, trControl = trainControl(method = "LOOCV"))
Linear Regression

1000 samples
   1 predictor

No pre-processing
Resampling: Leave-One-Out Cross-Validation
Summary of sample sizes: 999, 999, 999, 999, 999, 999, ...
Resampling results:

  RMSE      Rsquared  MAE
  1.050268  0.940619  0.836808

Tuning parameter 'intercept' was held constant at a value of TRUE
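If the model is a plain lm() fit, the leave-one-out prediction errors can also be obtained without refitting, because for ordinary least squares the i'th leave-one-out residual equals e_i / (1 - h_ii), where e_i is the ordinary residual and h_ii is the i'th hat value. The following is a minimal sketch in base R using the same simulated data as in the question. The seed and the variable names (fit, loo_resid, press, sst) are my own choices for illustration, not part of the original code.

```r
set.seed(1)                  # arbitrary seed, only for reproducibility
n <- 1000
x <- rnorm(n, 3, 2)
y <- 2 * x + rnorm(n)

fit <- lm(y ~ x)             # fit once on all observations

# Leave-one-out residuals via the hat-value identity (valid for OLS fits)
loo_resid <- residuals(fit) / (1 - hatvalues(fit))

press <- sum(loo_resid^2)            # sum of squared leave-one-out prediction errors
sst   <- sum((y - mean(y))^2)        # total sum of squares

loo_rmse      <- sqrt(mean(loo_resid^2))
loo_r_squared <- 1 - press / sst     # same quantity as R_squared in the question's loop
```

This should agree with the explicit loop up to floating-point error, at the cost of a single fit. The Rsquared that caret reports may differ slightly from loo_r_squared, since caret's default summary computes it as the squared correlation between predictions and observed values rather than as 1 - PRESS/SST.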