I can't for the life of me figure out how to compute a confusion matrix on rpart.
Here is what I have done:
set.seed(12345)
UBank_rand <- UBank[order(runif(1000)), ]
UBank_train <- UBank_rand[1:900, ]
UBank_test <- UBank_rand[901:1000, ]
dim(UBank_train)
dim(UBank_test)
# Build the formula for the decision tree
UB_tree <- Personal.Loan ~ Experience + Age + Income + ZIP.Code + Family + CCAvg + Education
# Build the decision tree from the training data
UB_rpart <- rpart(UB_tree, data = UBank_train)
Now, I would think I would do something like
table(predict(UB_rpart, UBank_test, UBank_Test$Default))
But that is not giving me a confusion matrix.
You didn't provide a reproducible example, so I'll create a synthetic dataset:
set.seed(144)
df = data.frame(outcome = as.factor(sample(c(0, 1), 100, replace=TRUE)),
                x = rnorm(100))
The predict function for an rpart model with type="class" will return the predicted class for each observation.
library(rpart)
mod = rpart(outcome ~ x, data=df)
pred = predict(mod, type="class")
table(pred)
# pred
# 0 1
# 51 49
Lastly, you can build the confusion matrix by running table between the prediction and true outcome:
table(pred, df$outcome)
# pred  0  1
#    0 36 15
#    1 14 35
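Since the question is about a held-out test set, here is a minimal sketch of the same idea applied to new data. The data frame, the 150/50 split, and the names train, test, pred_test, and conf_mat are all made up for illustration; the only change from the code above is passing the test rows through the newdata argument of predict.

```r
library(rpart)

# Synthetic data standing in for the real table (hypothetical)
set.seed(144)
df <- data.frame(outcome = as.factor(sample(c(0, 1), 200, replace = TRUE)),
                 x = rnorm(200))

# Hold out the last 50 rows as a test set (arbitrary split for illustration)
train <- df[1:150, ]
test  <- df[151:200, ]

mod <- rpart(outcome ~ x, data = train)

# newdata makes predict() score the held-out rows instead of the training rows
pred_test <- predict(mod, newdata = test, type = "class")

# Rows are predicted classes, columns are actual classes
conf_mat <- table(predicted = pred_test, actual = test$outcome)
conf_mat

# Overall accuracy: the diagonal (correct predictions) over all test rows
sum(diag(conf_mat)) / sum(conf_mat)
```

With the objects from the question, the equivalent call would presumably be table(predict(UB_rpart, UBank_test, type="class"), UBank_test$Personal.Loan), using the response named in the formula. This assumes Personal.Loan is stored as a factor; if it is numeric, rpart fits a regression tree and type="class" is not available, so it would need converting with as.factor before fitting.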