I am trying to calculate the F1 score, but in some cases I get warnings when I use sklearn's f1_score method. I have a multilabel classification problem with 5 classes.
import numpy as np
from sklearn.metrics import f1_score, precision_recall_fscore_support
y_true = np.zeros((1,5))
y_true[0,0] = 1 # => label = [[1, 0, 0, 0, 0]]
y_pred = np.zeros((1,5))
y_pred[:] = 1 # => prediction = [[1, 1, 1, 1, 1]]
result_1 = f1_score(y_true=y_true, y_pred=y_pred, labels=None, average="weighted")
print(result_1) # prints 1.0
result_2 = precision_recall_fscore_support(y_true=y_true, y_pred=y_pred, labels=None, average="weighted")
print(result_2) # prints: (1.0, 1.0, 1.0, None) for precision/recall/fbeta_score/support
When I use average="samples" instead of "weighted", I get (0.1, 1.0, 0.1818..., None). Is the "weighted" option not useful for a multilabel problem, or how do I use the f1_score method correctly?
I also get a warning when using average="weighted":
"UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples."
It works if you add slightly more data:
import numpy as np
from sklearn.metrics import f1_score, precision_score, recall_score

y_true = np.array([[1,0,0,0], [1,1,0,0], [1,1,1,1]])
y_pred = np.array([[1,0,0,0], [1,1,1,0], [1,1,1,1]])
recall_score(y_true=y_true, y_pred=y_pred, average='weighted')
>>> 1.0
precision_score(y_true=y_true, y_pred=y_pred, average='weighted')
>>> 0.9285714285714286
f1_score(y_true=y_true, y_pred=y_pred, average='weighted')
>>> 0.95238095238095244
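To see where those weighted numbers come from, here is a minimal sketch (reusing y_true and y_pred from above) with average=None, which exposes the per-label scores that the weighted average combines:

from sklearn.metrics import precision_recall_fscore_support
# average=None returns one score per label; support is the count of true samples per label
p, r, f, s = precision_recall_fscore_support(y_true=y_true, y_pred=y_pred, average=None)
print(p)  # [1.  1.  0.5 1. ]   <- label 2 picked up one false positive
print(r)  # [1. 1. 1. 1.]
print(f)  # [1.  1.  0.66...  1.]
print(s)  # [3 2 1 1]           <- these supports act as the weights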
The data shows we have not missed any true positives, i.e. there are no false negatives (recall_score equals 1). However, we have predicted one false positive in the second observation, which leads to a precision_score of ~0.93.
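As a sanity check, the weighted F1 can be reproduced by hand: weight each label's F1 by its support (number of true samples) and divide by the total support. A quick sketch using the per-label values from above:

import numpy as np
# labels 0, 1, 3 are perfect; label 2 has P=0.5, R=1.0 -> F1 = 2*0.5*1.0/(0.5+1.0)
f1_per_label = np.array([1.0, 1.0, 2 * 0.5 * 1.0 / 1.5, 1.0])
support = np.array([3, 2, 1, 1])
print((f1_per_label * support).sum() / support.sum())  # ~0.95238, matching f1_score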
Since both precision_score and recall_score are nonzero with the weighted parameter, f1_score exists as well. I believe your case is ill-defined due to the lack of information in the example: four of your five labels have no true samples, which is exactly what the warning complains about.
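For completeness, this is what happens in your original one-sample example: labels 1-4 have no true samples (support 0), so their recall is 0/0; sklearn substitutes 0.0 and emits the UndefinedMetricWarning. If you are on scikit-learn 0.22 or newer (an assumption about your setup), you can make that substitution explicit and silence the warning via the zero_division parameter:

import numpy as np
from sklearn.metrics import f1_score

y_true = np.zeros((1, 5))
y_true[0, 0] = 1          # only label 0 has a true sample
y_pred = np.ones((1, 5))  # every label predicted positive

# zero_division=0 sets ill-defined scores to 0 without raising the warning
print(f1_score(y_true, y_pred, average="weighted", zero_division=0))  # 1.0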