Suppose I have a factor variable y with n levels, for which I have both predictions and real outcomes available. How can I construct the confusion matrix?
set.seed(12345)
y_actual = as.factor(sample(c('A','B', 'C', 'D', 'E'), 100, replace = TRUE))
set.seed(12346)
y_predict = as.factor(sample(c('A','B', 'C', 'D', 'E'), 100, replace = TRUE))
This question has already been answered for the case n = 2; see "R: how to make a confusion matrix for a predictive model?"
What I tried
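This is how far I got:
# one row per observation, each carrying a count of 1
ones = data.frame(total = rep(1, 100))
# sum the counts within each (Prediction, Reality) combination
confusion = aggregate(ones, list(Prediction = y_predict, Reality = y_actual), sum)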
confusion
  Prediction Reality total
1          A       A    12
2          B       A     5
3          C       A    15
4          A       B    15
5          B       B     7
6          C       B     8
7          A       C    12
8          B       C    16
9          C       C    10
Now this has to be brought into the shape of a matrix.
Background
The confusion matrix has "actual class" as the horizontal label and "predicted class" as the vertical label. The matrix elements are simply counts, like this:
element (1,1) = number of cases where the actual class is A and the predicted class is A
element (1,2) = number of cases where the actual class is A and the predicted class is B
etc.
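Simply use confusionMatrix from the caret package. Its first argument (data) is the factor of predicted classes and its second (reference) is the factor of true classes:
library(caret)
confusionMatrix(data = y_predict, reference = y_actual)
          Reference
Prediction  A  B  C
         A 12 15 12
         B  5  7 16
         C 15  8 10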
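If caret is not available, the counts themselves can also be obtained with base R. The following is only a minimal sketch, assuming the y_actual and y_predict factors from the question; the names lv, cm, long and cm2 are made up for illustration. It cross-tabulates the two factors with table(), and also shows how a long-format count table (like the aggregate() result in the question) can be turned back into a matrix with xtabs().

```r
# Same simulated data as in the question (5 classes, 100 observations).
set.seed(12345)
y_actual  <- factor(sample(c("A", "B", "C", "D", "E"), 100, replace = TRUE))
set.seed(12346)
y_predict <- factor(sample(c("A", "B", "C", "D", "E"), 100, replace = TRUE))

# Give both factors the same level set so the matrix stays square even if
# a class never occurs in one of the two vectors.
lv <- union(levels(y_actual), levels(y_predict))
y_actual  <- factor(y_actual,  levels = lv)
y_predict <- factor(y_predict, levels = lv)

# Cross-tabulate: rows are actual classes, columns are predicted classes.
cm <- table(Actual = y_actual, Predicted = y_predict)
print(cm)

# Long format (one row per cell), and back to a matrix with xtabs().
long <- as.data.frame(cm, responseName = "total")
cm2  <- xtabs(total ~ Actual + Predicted, data = long)

# Overall accuracy: share of observations on the diagonal.
accuracy <- sum(diag(cm)) / sum(cm)
print(accuracy)
```

Here element (1,2) of cm is the number of cases whose actual class is A and whose predicted class is B, as described in the question. Use t(cm) if you prefer predictions in the rows, which is the orientation caret prints.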