Caret Neural Network Error: "missing values in resampled performance measures"

Matthew Jackson · Jul 28, 2015

I have seen other people report this error, but I have not found a satisfactory answer. Can anyone offer some insight into my problem?

I have some car auction data which I am trying to model to predict the Hammer.Price.

> str(myTrain)
'data.frame':   34375 obs. of  9 variables:
 $ Grade          : int  4 4 4 4 2 3 4 3 3 4 ...
 $ Mileage        : num  150850 113961 71834 57770 43161 ...
 $ Hammer.Price   : num  750 450 1600 4650 4800 ...
 $ New.Price      : num  15051 13795 15051 14475 14475 ...
 $ Year.Introduced: int  1996 1996 1996 1996 1996 1996 1996 1996 1996 1996 ...
 $ Engine.Size    : num  1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 ...
 $ Doors          : int  3 3 3 3 3 3 3 3 3 3 ...
 $ Age            : int  3771 4775 3802 2402 2463 3528 3315 3193 4075 4988 ...
 $ Days.from.Sale : int  1778 1890 2183 1939 1876 1477 1526 1812 1813 1472 ...

myTrain contains a random 70% of the data and myTest the other 30%. I train the model with:

myModel <- train(Hammer.Price ~ ., data = myTrain, method = "nnet")

This results in the following warning:

Warning message: In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, : There were missing values in resampled performance measures.

When I try to predict, all of the results are equal to 1.

myTestPred <- predict(myModel, myTest)

I have previously used this data to train an MLP neural network in SPSS Modeller, but I cannot recreate the results in R. I have tried some of the other neural network packages in caret but always get the same result.

Does anyone understand this better than me?

Answer

Achekroud · Jul 29, 2015

Does it fix the problem if you scale the data before calling train? I have run into this with glmnet and nnet when the variables were not all scaled before fitting the model. Anecdotally, it also helps to make all of your variables numeric.
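As a rough sketch of what scaling could look like here (not a confirmed fix for this data set): the example below builds a small synthetic data frame, toyTrain, as a hypothetical stand-in for myTrain, rescales the outcome to lie between 0 and 1, and lets train standardise the predictors through its preProcess argument. It also passes linout = TRUE through to nnet, on the assumption that a linear output unit is wanted for regression. nnet's default is a logistic output bounded by 1, which may be why every prediction came out as 1. The names toyTrain, priceMax, toyModel, and toyPred are made up for illustration.

```r
library(caret)

set.seed(1)

# Hypothetical stand-in for myTrain: synthetic values, not the real auction data
n <- 300
toyTrain <- data.frame(
  Mileage   = runif(n, 1e4, 2e5),
  New.Price = runif(n, 8000, 30000),
  Age       = runif(n, 500, 5000)
)
price <- with(toyTrain,
              0.4 * New.Price - 0.01 * Mileage - 0.2 * Age + rnorm(n, sd = 100))

# Rescale the outcome to roughly [0, 1]; keep the factor to undo it later
priceMax <- max(price)
toyTrain$Hammer.Price <- price / priceMax

toyModel <- train(Hammer.Price ~ .,
                  data = toyTrain,
                  method = "nnet",
                  preProcess = c("center", "scale"),  # standardise the predictors
                  linout = TRUE,   # passed on to nnet(): linear output unit
                  trace = FALSE)   # passed on to nnet(): silence fitting log

# Predictions come back on the scaled outcome; convert to the original units
toyPred <- predict(toyModel, toyTrain) * priceMax
range(toyPred)
```

If the predictions now vary across rows instead of all being identical, the original problem was most likely the scale of the variables or the bounded output unit.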

You can also try making your resampling explicit, e.g. using

# 10-fold cross-validation, repeated 5 times
myControl <- trainControl(method = "repeatedcv", number = 10, repeats = 5)

and then passing this to train:

myModel <- train(Hammer.Price ~ .,
    data = myTrain,
    method = "nnet",
    trControl = myControl)
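Once the model has been fitted, the per-resample metrics are stored in myModel$resample, which is a reasonable place to look for the missing values. A likely link between the warning and the constant predictions is that R-squared cannot be computed when every prediction is identical. The short sketch below illustrates this with caret's postResample, using the first five Hammer.Price values from the str() output in the question and a made-up constant prediction vector (constPred and perf are hypothetical names).

```r
library(caret)

# First five Hammer.Price values shown in the question's str() output
obs <- c(750, 450, 1600, 4650, 4800)

# Hypothetical predictions that are all equal to 1, as described in the question
constPred <- rep(1, length(obs))

# RMSE is still computable, but Rsquared is NA because the predictions have no variance
perf <- postResample(pred = constPred, obs = obs)
perf
```

If a resample produces constant predictions like these, its Rsquared entry is missing, which would be consistent with the warning from train.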

Without the data it is sometimes difficult to spot the error, sorry.