I am working with a multi-class multi-label output from my classifier. There are 14 classes in total, and each instance can have several classes associated with it. For example:
y_true = np.array([[0,0,1], [1,1,0],[0,1,0]])
y_pred = np.array([[0,0,1], [1,0,1],[1,0,0]])
The way I am making my confusion matrix right now:
matrix = confusion_matrix(y_true.argmax(axis=1), y_pred.argmax(axis=1))
print(matrix)
Which gives an output like:
[[ 79 0 0 0 66 0 0 151 1 8 0 0 0 0]
[ 4 0 0 0 11 0 0 27 0 0 0 0 0 0]
[ 14 0 0 0 21 0 0 47 0 1 0 0 0 0]
[ 1 0 0 0 4 0 0 25 0 0 0 0 0 0]
[ 18 0 0 0 50 0 0 63 0 3 0 0 0 0]
[ 4 0 0 0 3 0 0 19 0 0 0 0 0 0]
[ 2 0 0 0 3 0 0 11 0 2 0 0 0 0]
[ 22 0 0 0 20 0 0 138 1 5 0 0 0 0]
[ 12 0 0 0 9 0 0 38 0 1 0 0 0 0]
[ 10 0 0 0 3 0 0 40 0 4 0 0 0 0]
[ 3 0 0 0 3 0 0 14 0 3 0 0 0 0]
[ 0 0 0 0 2 0 0 3 0 0 0 0 0 0]
[ 2 0 0 0 11 0 0 32 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 3 0 0 0 0 0 7]]
Now, I am not sure if the confusion matrix from sklearn is capable of handling multi-label multi-class data. Could someone help me with this?
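You need to generate one binary confusion matrix per label, since what you have is essentially a set of independent binary labels. Calling argmax(axis=1) first reduces every row to a single class index (the first maximum), so the multi-label information is lost before confusion_matrix ever sees it.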
Something along the lines of:
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([[0,0,1], [1,1,0],[0,1,0]])
y_pred = np.array([[0,0,1], [1,0,1],[1,0,0]])
labels = ["A", "B", "C"]

# Build one 2x2 confusion matrix per label (column).
conf_mat_dict = {}
for label_col in range(len(labels)):
    y_true_label = y_true[:, label_col]
    y_pred_label = y_pred[:, label_col]
    # labels=[0, 1] keeps the matrix 2x2 even if a column contains only one value
    conf_mat_dict[labels[label_col]] = confusion_matrix(
        y_true=y_true_label, y_pred=y_pred_label, labels=[0, 1]
    )

for label, matrix in conf_mat_dict.items():
    print("Confusion matrix for label {}:".format(label))
    print(matrix)
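If your scikit-learn version is recent enough (0.21 or later, as far as I know), the same per-label matrices should also be available in one call through sklearn.metrics.multilabel_confusion_matrix. The following is a minimal sketch using the toy arrays from the question; the label names "A", "B", "C" are only illustrative, and for the real data you would pass your 14-column indicator arrays instead.

```python
import numpy as np
from sklearn.metrics import multilabel_confusion_matrix

y_true = np.array([[0, 0, 1], [1, 1, 0], [0, 1, 0]])
y_pred = np.array([[0, 0, 1], [1, 0, 1], [1, 0, 0]])
labels = ["A", "B", "C"]  # illustrative names, one per column

# Result has shape (n_labels, 2, 2); each 2x2 block is [[TN, FP], [FN, TP]].
mcm = multilabel_confusion_matrix(y_true, y_pred)

for label, matrix in zip(labels, mcm):
    tn, fp, fn, tp = matrix.ravel()
    print(f"{label}: TN={tn} FP={fp} FN={fn} TP={tp}")
```

Each block is a binary "this label vs. not this label" matrix, so it can be read the same way as the dictionary entries built in the loop above.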