sklearn metrics for multiclass classification

dino  picture dino · Aug 26, 2017 · Viewed 44.6k times · Source

I have performed GaussianNB classification using sklearn. I tried to calculate the metrics using the following code:

print accuracy_score(y_test, y_pred)
print precision_score(y_test, y_pred)

Accuracy score is working correctly but precision score calculation is showing error as:

ValueError: Target is multiclass but average='binary'. Please choose another average setting.

As target is multiclass, can i have the metric scores of precision, recall etc.?

Answer

ml4294 picture ml4294 · Aug 26, 2017

The function call precision_score(y_test, y_pred) is equivalent to precision_score(y_test, y_pred, pos_label=1, average='binary'). The documentation (http://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_score.html) tells us:

'binary':

Only report results for the class specified by pos_label. This is applicable only if targets (y_{true,pred}) are binary.

So the problem is that your labels are not binary, but probably one-hot encoded. Fortunately, there are other options which should work with your data:

precision_score(y_test, y_pred, average=None) will return the precision scores for each class, while

precision_score(y_test, y_pred, average='micro') will return the total ratio of tp/(tp + fp)

The pos_label argument will be ignored if you choose another average option than binary.