Caret train method complains "Something is wrong; all the RMSE metric values are missing"

Fred R. · Jul 28, 2015

On numerous occasions I've been getting this error when trying to fit a gbm or rpart model. I was finally able to reproduce it consistently using publicly available data. I have noticed that the error happens when I use CV (or repeated CV). When I don't use any fit control, I don't get this error. Can someone shed some light on why I keep getting this error consistently?

fitControl= trainControl("repeatedcv", repeats=5)
ds = read.csv("http://www.math.smith.edu/r/data/help.csv")
ds$sub = as.factor(ds$substance)
rpartFit1 <- train(homeless ~ female + i1 + sub + sexrisk + mcs + pcs, 
                   tcControl=fitControl, 
                   method = "rpart", 
                   data=ds)

Answer

StupidWolf · Jun 25, 2020

There is a typo: the argument should be trControl, not tcControl. Because train() does not recognise tcControl, caret passes it on to rpart, and rpart throws an error because it has never had such an option. Every model fit therefore fails, which is why all the metric values are missing.
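The forwarding behaviour described above can be imitated in base R. The sketch below is only an analogy: `inner_fit` and `outer_train` are hypothetical stand-ins, not caret's or rpart's actual code, and the two-fold split is an arbitrary choice for illustration. It shows how a misspelled argument falls into `...`, is handed to the inner function, and makes every fit fail.

```r
# Hypothetical stand-ins; this is NOT caret's or rpart's actual code.
inner_fit <- function(x, control = NULL) {
  mean(x)  # the "model" accepts only `x` and `control`
}

outer_train <- function(x, trControl = NULL, ...) {
  # Split the data into two folds (arbitrary, for illustration only).
  folds <- split(x, rep(1:2, length.out = length(x)))
  # Arguments that match no named formal land in `...` and are forwarded.
  # A failure in a fold is recorded as NA instead of stopping immediately.
  vapply(folds, function(f) {
    tryCatch(inner_fit(f, ...), error = function(e) NA_real_)
  }, numeric(1))
}

ok  <- outer_train(1:6, trControl = list())  # correct name: folds are fitted
bad <- outer_train(1:6, tcControl = list())  # typo: forwarded, every fold fails

print(ok)
print(bad)
```

With the correct name the argument is consumed by the wrapper; with the typo it reaches `inner_fit`, which rejects it, so all per-fold results end up missing.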

I guess this answers your question of why you get this error when you try to use cross-validation in training.
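One way to catch this kind of typo before fitting is to compare the argument names you intend to pass against the formals of the function you are calling. The helper below is a minimal sketch: `unmatched_args`, `closest_formal`, and `wrapper` are hypothetical names introduced here, and `wrapper` merely has the same shape as a function that accepts `trControl` and `...`. It compares exact names only and ignores R's partial matching.

```r
# Hypothetical helper: supplied argument names that are not formals of `fn`.
unmatched_args <- function(fn, supplied) {
  setdiff(supplied, names(formals(fn)))
}

# Hypothetical helper: the formal of `fn` closest to `name` by edit distance.
closest_formal <- function(fn, name) {
  candidates <- setdiff(names(formals(fn)), "...")
  candidates[which.min(utils::adist(name, candidates))]
}

# Stand-in with the same shape as a wrapper that swallows extras in `...`.
wrapper <- function(x, trControl = NULL, method = "rpart", ...) NULL

unmatched  <- unmatched_args(wrapper, c("x", "tcControl", "method"))
suggestion <- closest_formal(wrapper, unmatched)

print(unmatched)
print(suggestion)
```

Any name reported as unmatched will be swallowed by `...` and passed further down, so it is worth checking whether it was meant to be one of the wrapper's own arguments.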

Below is how it should work:

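library(caret)
library(mosaicData)  # provides the HELPrct data set

data(HELPrct)
ds = HELPrct
# repeated cross-validation; trainControl() takes `repeats`, as in the question
fitControl = trainControl(method="repeatedcv", repeats=5)
ds$sub = as.factor(ds$substance)

# trControl is the argument name that train() recognises
rpartFit1 <- train(homeless ~ female + i1 + sub + sexrisk + mcs + pcs, 
                   trControl=fitControl, 
                   method = "rpart", 
                   data=ds[complete.cases(ds),])  # rows without missing values only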

rpartFit1
CART 

117 samples
  6 predictor
  2 classes: 'homeless', 'housed' 

No pre-processing
Resampling: Cross-Validated (10 fold) 
Summary of sample sizes: 105, 105, 105, 106, 105, 106, ... 
Resampling results across tuning parameters:

  cp          Accuracy   Kappa      
  0.00000000  0.5280303  -0.03503032
  0.01190476  0.5280303  -0.03503032
  0.07142857  0.5977273  -0.02970604

Accuracy was used to select the optimal model using the largest value.
The final value used for the model was cp = 0.07142857.