I am calculating precision and recall for off-the-shelf algorithms on a dataset that I recently prepared.
It is a binary classification problem, and I am looking to calculate precision, recall and the F-score for each of the classifiers I built.
test_x, test_y, predics, pred_prob, score = CH.buildBinClassifier(data, allAttribs, 0.3, 50, 'logistic')
The buildBinClassifier method basically builds a classifier, fits it to the training data and returns test_x (the features of the test data), test_y (the ground-truth labels), predics (predictions made by the classifier) and pred_prob (prediction probabilities from the LogisticRegression.predict_proba method).
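(For reference: predict_proba returns one probability column per class, while precision_recall_curve expects a 1-D array of scores for the positive class. If pred_prob were the full two-column array, the positive-class scores would be sliced out roughly as below; this is a hypothetical sketch, since buildBinClassifier isn't shown.)
pos_scores = pred_prob[:, 1]  # probability of class 1 for each test sample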
Below is the code for computing the precision-recall curve:
from sklearn.metrics import precision_recall_curve
pr, re, _ = precision_recall_curve(test_y, pred_prob, pos_label=1)
pr
array([ 0.49852507,  0.49704142,  0.49554896,  0.49702381,  0.49850746,
        0.5       ,  0.5015015 ,  0.50301205,  0.50453172,  0.50606061,
        ...,
        0.875     ,  1.        ,  1.        ,  1.        ,  1.        ,
        1.        ,  1.        ,  1.        ,  1.        ])
re
array([ 1.        ,  0.99408284,  0.98816568,  0.98816568,  0.98816568,
        0.98816568,  0.98816568,  0.98816568,  0.98816568,  0.98816568,
        ...,
        0.04142012,  0.04142012,  0.03550296,  0.0295858 ,  0.02366864,
        0.01775148,  0.01183432,  0.00591716,  0.        ])
I do not understand why precision and recall are arrays. Shouldn't they just be single numbers, since precision is calculated as TP / (TP + FP), and recall similarly from its definition as TP / (TP + FN)?
I am aware of calculating the average precision and recall with the following piece of code, but somehow seeing arrays instead of single TP, FP, precision and recall values is making me wonder what is going on.
from sklearn.metrics import precision_recall_fscore_support as prf
precision, recall, fscore, _ = prf(test_y, predics, pos_label=1, average='binary')
Edit: But without the average and pos_label parameters it reports the precision for each class. Could someone explain the difference between the outputs of these two methods?
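For concreteness, here is a minimal sketch of the two call styles on toy labels (toy test_y and predics, not the real data):

from sklearn.metrics import precision_recall_fscore_support as prf
test_y  = [0, 0, 1, 1, 1]
predics = [0, 1, 1, 1, 0]
# average='binary': single numbers, computed for the pos_label class only
prf(test_y, predics, pos_label=1, average='binary')
# -> (0.666..., 0.666..., 0.666..., None)
# average=None (the default): one entry per class, ordered [class 0, class 1]
prf(test_y, predics, average=None)
# -> (array([ 0.5, 0.66666667]), array([ 0.5, 0.66666667]),
#     array([ 0.5, 0.66666667]), array([2, 3]))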
From the sklearn documentation for precision_recall_curve:
Compute precision-recall pairs for different probability thresholds.
Classifier models like logistic regression do not actually output class labels (like "0" or "1"), they output probabilities (like 0.67). These probabilities tell you the likelihood that the input sample is of a particular class, like the positive ("1") class. But you still need to choose a probability threshold so that the algorithm can convert the probability (0.67) into a class ("1").
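As a small illustration (toy numbers, not from the question's data), the conversion is just a comparison against the threshold:

import numpy as np
probs = np.array([0.12, 0.48, 0.67, 0.91])  # hypothetical positive-class probabilities
labels = (probs >= 0.5).astype(int)         # threshold at 0.5 -> array([0, 0, 1, 1])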
If you choose a threshold of 0.5, then all input samples with predicted probabilities greater than 0.5 will be assigned to the positive class. If you choose a different threshold, you get a different split of samples between the positive and negative classes, and therefore different precision and recall scores.
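precision_recall_curve repeats that conversion at every distinct score, recording one precision-recall pair per threshold, which is exactly why you see arrays. The example from the sklearn documentation shows the correspondence:

import numpy as np
from sklearn.metrics import precision_recall_curve

y_true = np.array([0, 0, 1, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8])  # positive-class probabilities
precision, recall, thresholds = precision_recall_curve(y_true, scores)
# thresholds -> array([ 0.35,  0.4 ,  0.8 ])
# precision  -> array([ 0.66666667,  0.5       ,  1.        ,  1.        ])
# recall     -> array([ 1. ,  0.5,  0.5,  0. ])
# A final (precision=1, recall=0) point is appended so the curve is complete,
# which is why precision and recall have one more entry than thresholds.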