I have a multiclass classification task. When I run my script based on the scikit example as the follows:
classifier = OneVsRestClassifier(GradientBoostingClassifier(n_estimators=70, max_depth=3, learning_rate=.02))
y_pred = classifier.fit(X_train, y_train).predict(X_test)
cnf_matrix = confusion_matrix(y_test, y_pred)
I get this error:
File "C:\ProgramData\Anaconda2\lib\site-packages\sklearn\metrics\classification.py", line 242, in confusion_matrix
raise ValueError("%s is not supported" % y_type)
ValueError: multilabel-indicator is not supported
I tried to pass the labels=classifier.classes_
to confusion_matrix()
, but it doesn't help.
y_test and y_pred are as the follow:
y_test =
array([[0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 1, 0],
[0, 1, 0, 0, 0, 0],
...,
[0, 0, 0, 0, 0, 1],
[0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 1, 0]])
y_pred =
array([[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
...,
[0, 0, 0, 0, 0, 1],
[0, 0, 0, 0, 0, 1],
[0, 0, 0, 0, 0, 0]])
This worked for me:
y_test_non_category = [ np.argmax(t) for t in y_test ]
y_predict_non_category = [ np.argmax(t) for t in y_predict ]
from sklearn.metrics import confusion_matrix
conf_mat = confusion_matrix(y_test_non_category, y_predict_non_category)
where y_test
and y_predict
are categorical variables like one-hot vectors.