dataset = read.csv('dataset/housing.header.binary.txt')
dataset1 = dataset[6] # highest positive correlation
dataset2 = dataset[13] # strongest negative correlation
dependentVal = dataset[14] # dependent variable
new_dataset = cbind(dataset1,dataset2, dependentVal) # new matrix
#split dataset
#install.packages('caTools')
library(caTools)
set.seed(123) # needed to guarantee that every run produces the same output
split = sample.split(new_dataset$Medv, SplitRatio = 0.75) # sample.split expects the outcome vector, not the whole data frame
train_set = subset(new_dataset, split == TRUE)
test_set = subset(new_dataset, split == FALSE)
#Fitting Decision Tree to training set
#install.packages('rpart')
library(rpart)
classifier = rpart(formula = Medv ~ Rm + Lstat,
data = train_set)
#predicting the test set results
y_pred = predict(classifier, newdata = test_set[3], type ='class')
I want to predict column 3 of test_set, but I keep getting
Error in eval(predvars, data, env) : object 'Rm' not found
even though I specify test_set[3], not test_set[1], which contains Rm.
The column names are as follows: Rm, Lstat, and Medv.
test_set[3] and test_set[2] both give the same error:
Error in eval(predvars, data, env) : object 'Rm' not found
and test_set[1] gives:
Error in eval(predvars, data, env) : object 'Lstat' not found
I have tried the following:
names(test_set) <- c('Rm', 'Lstat','Medv'): I renamed the columns explicitly.
is.data.frame(test_set): I checked whether test_set is a data frame.
I solved the problem with the following code:
y_pred = predict(classifier, newdata = test_set[-3], type ='class')
Quote from the R documentation (https://www.rdocumentation.org/packages/rpart/versions/4.1-13/topics/predict.rpart):
"newdata : data frame containing the values at which predictions are required. The predictors referred to in the right side of formula(object) must be present by name in newdata. If missing, the fitted values are returned."
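To make the quoted rule concrete, here is a minimal sketch of why dropping the response column works while selecting it does not. It assumes the built-in mtcars data as a stand-in for the housing file, with mpg playing the role of Medv and wt and hp playing the roles of Rm and Lstat; the variable names below are only illustrative.

```r
library(rpart)  # recommended package bundled with standard R installs

# Stand-in data (an assumption): mtcars instead of the housing file.
fit <- rpart(mpg ~ wt + hp, data = mtcars)

# Predictors the fitted model expects to find by name in newdata.
needed <- all.vars(delete.response(terms(fit)))

# df[1] keeps ONLY the first column (the response here) ...
only_response <- mtcars[1]
# ... while df[-1] drops it and keeps every other column.
without_response <- mtcars[-1]

# Which required predictors are absent from each candidate newdata?
missing_bad <- setdiff(needed, names(only_response))
missing_ok  <- setdiff(needed, names(without_response))

# Prediction succeeds once the predictors are present by name.
preds <- predict(fit, newdata = without_response)

# With only the response column, predict() stops with an error;
# tryCatch captures the message instead of aborting the script.
err <- tryCatch(
  predict(fit, newdata = only_response),
  error = function(e) conditionMessage(e)
)
print(missing_bad)
print(err)
```

The same setdiff check, run against classifier and test_set[3], should list the predictors that the error message complains about.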