Manually calculate AUC

Asked by user9370613 · Jun 14, 2018 · Viewed 8.7k times

How can I obtain the AUC value when I only have fpr and tpr? Here, fpr and tpr are just two floats obtained from these formulas:

my_fpr = fp / (fp + tn)
my_tpr = tp / (tp + fn)
my_roc_auc = auc(my_fpr, my_tpr)
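
(For reference, a minimal sketch of where counts like fp, tp, fn and tn would typically come from, assuming the same model and test set used below and hard 0/1 predictions at a single threshold:)

from sklearn.metrics import confusion_matrix

# Hard class predictions at one fixed threshold (illustrative)
y_pred = model.predict(X_test)

# For binary labels, confusion_matrix returns [[tn, fp], [fn, tp]]
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()

my_fpr = fp / (fp + tn)
my_tpr = tp / (tp + fn)

This yields a single (fpr, tpr) point, i.e. the ROC curve evaluated at one threshold only.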

I know this can't be possible, because fpr and tpr need to be arrays rather than single floats, but I can't figure out how to get them that way. I also know that I can compute the AUC like this:

from sklearn.metrics import roc_curve, auc
import numpy as np

y_predict_proba = model.predict_proba(X_test)
probabilities = np.array(y_predict_proba)[:, 1]  # probability of the positive class
fpr, tpr, _ = roc_curve(y_test, probabilities)
roc_auc = auc(fpr, tpr)

but I want to avoid using predict_proba for certain reasons. So my question is: how can I obtain the AUC when I have fp, tp, fn, tn, fpr and tpr? In other words, is it possible to obtain the AUC without roc_curve?

Answer

ofirdi · May 28, 2019

Yes, it is possible to obtain the AUC without calling roc_curve.

You first need to create the ROC (Receiver Operating Characteristic) curve. To use a ROC curve, your classifier must be able to rank examples so that those with a higher rank are more likely to be positive (e.g. fraudulent). For example, Logistic Regression outputs probabilities, which are scores you can use for ranking. The ROC curve is created by plotting the True Positive Rate (TPR) against the False Positive Rate (FPR) at various threshold settings. As an example:

[Figure: example ROC curve, TPR plotted against FPR at different thresholds]
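
A minimal sketch of building the ROC curve yourself from ranking scores, without calling roc_curve (the function name manual_roc and the variable names are illustrative, not part of any library):

import numpy as np

def manual_roc(y_true, scores):
    # Sweep thresholds from high to low and compute one (fpr, tpr)
    # point per threshold.
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    thresholds = np.sort(np.unique(scores))[::-1]
    P = (y_true == 1).sum()  # number of positives
    N = (y_true == 0).sum()  # number of negatives
    fpr_list, tpr_list = [0.0], [0.0]
    for t in thresholds:
        y_pred = (scores >= t).astype(int)
        tp = ((y_pred == 1) & (y_true == 1)).sum()
        fp = ((y_pred == 1) & (y_true == 0)).sum()
        tpr_list.append(tp / P)
        fpr_list.append(fp / N)
    return np.array(fpr_list), np.array(tpr_list)

Given y_test and the positive-class scores from the question, fpr, tpr = manual_roc(y_test, probabilities) should give essentially the same points as roc_curve (up to how duplicate or collinear points are handled).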

The model's performance is then summarized by the area under the ROC curve (AUC).

[Figure: the area under the ROC curve (AUC)]
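
Once you have the fpr and tpr arrays, the AUC is just a numerical integral of that curve. A minimal sketch using the trapezoidal rule, which is what sklearn.metrics.auc computes as well (manual_auc is an illustrative name):

import numpy as np

def manual_auc(fpr, tpr):
    # Sort by fpr so the curve is integrated left to right,
    # then sum the area of each trapezoid between adjacent points.
    order = np.argsort(fpr)
    fpr = np.asarray(fpr)[order]
    tpr = np.asarray(tpr)[order]
    return np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2.0)

# Illustrative usage with the arrays built above:
# fpr, tpr = manual_roc(y_test, probabilities)
# roc_auc = manual_auc(fpr, tpr)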

You can find a more detailed explanation here.