I have seen other people run into this error before, but I have not found a satisfactory answer. Can anyone offer some insight into my problem?
I have some car auction data which I am trying to model in order to predict Hammer.Price.
> str(myTrain)
'data.frame': 34375 obs. of 9 variables:
$ Grade : int 4 4 4 4 2 3 4 3 3 4 ...
$ Mileage : num 150850 113961 71834 57770 43161 ...
$ Hammer.Price : num 750 450 1600 4650 4800 ...
$ New.Price : num 15051 13795 15051 14475 14475 ...
$ Year.Introduced: int 1996 1996 1996 1996 1996 1996 1996 1996 1996 1996 ...
$ Engine.Size : num 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 ...
$ Doors : int 3 3 3 3 3 3 3 3 3 3 ...
$ Age : int 3771 4775 3802 2402 2463 3528 3315 3193 4075 4988 ...
$ Days.from.Sale : int 1778 1890 2183 1939 1876 1477 1526 1812 1813 1472 ...
myTrain contains a random 70% of the data and myTest the other 30%. I train the model with:
myModel <- train(Hammer.Price ~ ., data = myTrain, method = "nnet")
This results in the following warning:
Warning message: In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, : There were missing values in resampled performance measures.
When I try to predict, all of the results are equal to 1.
myTestPred <- predict(myModel, myTest)
I have previously used this data to train an MLP neural network in SPSS Modeler, but I don't seem to be able to recreate the results in R. I have tried some of the other neural network packages in caret but always get the same result.
Does anyone understand this better than me?
Does it fix the problem if you scale the data before calling train? I have had this problem with glmnet and nnet when the variables were not scaled before fitting the model. It also helps (anecdotally) to make all of your variables numeric.
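As a rough illustration of the scaling step, here is a minimal base R sketch. The data frame is synthetic (the values and the reduced set of columns are made up, since the real auction data is not available), and only the predictors are scaled, using statistics computed on the training set so that the test set is transformed consistently.

```r
# Synthetic stand-in for the auction data (values are invented for illustration).
set.seed(42)
n <- 100
cars <- data.frame(
  Mileage      = runif(n, 10000, 200000),
  Engine.Size  = sample(c(1.2, 1.6, 2.0), n, replace = TRUE),
  Age          = sample(1000:5000, n, replace = TRUE),
  Hammer.Price = runif(n, 300, 8000)
)

# 70/30 split, mirroring the question.
trainIdx <- sample(n, size = round(0.7 * n))
myTrain  <- cars[trainIdx, ]
myTest   <- cars[-trainIdx, ]

# Scale every column except the response.
predictors <- setdiff(names(myTrain), "Hammer.Price")

# Compute centring and scaling statistics on the training set only.
trainMeans <- colMeans(myTrain[predictors])
trainSds   <- sapply(myTrain[predictors], sd)

# Apply the same transformation to both sets.
myTrainScaled <- myTrain
myTestScaled  <- myTest
myTrainScaled[predictors] <- scale(myTrain[predictors], center = trainMeans, scale = trainSds)
myTestScaled[predictors]  <- scale(myTest[predictors],  center = trainMeans, scale = trainSds)
```

If you would rather let caret handle this, train also accepts preProcess = c("center", "scale"), which applies the same kind of transformation inside each resample.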
You can also try making your resampling explicit, e.g. using
myControl <- trainControl(method = "repeatedcv", repeats=5, number = 10)
and then passing this to train:
myModel <- train(Hammer.Price ~ .,
                 data = myTrain,
                 method = "nnet",
                 trControl = myControl)
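One more thing that may be worth checking, although I can't confirm it without the data: nnet uses a logistic output unit by default (linout = FALSE), so its predictions are confined to the interval [0, 1]. With a response like Hammer.Price in the hundreds or thousands, that alone could explain predictions that all sit at 1. The sketch below uses nnet directly on made-up data (the variables x and y are hypothetical) to show the difference.

```r
library(nnet)

# Made-up regression data with a response far outside [0, 1].
set.seed(1)
x <- runif(200)
d <- data.frame(x = x, y = 1000 + 5000 * x + rnorm(200, sd = 50))

# Default: logistic output unit, so predictions cannot leave [0, 1].
fitLogistic <- nnet(y ~ x, data = d, size = 3, trace = FALSE)

# linout = TRUE: linear output unit, suitable for an unbounded numeric response.
fitLinear <- nnet(y ~ x, data = d, size = 3, linout = TRUE,
                  maxit = 500, trace = FALSE)

predLogistic <- predict(fitLogistic, d)
predLinear   <- predict(fitLinear, d)

range(predLogistic)  # stays within [0, 1]
range(predLinear)    # not restricted to [0, 1]
```

With caret, extra arguments given to train are passed on to the underlying model, so adding linout = TRUE to the train call should have the same effect.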
Without the data it is sometimes difficult to spot the error, sorry.