I am using R v3.3.2 and Caret 6.0.71 (i.e. latest versions) to construct a logistic regression classifier. I am using the confusionMatrix function to create stats for judging its performance.
logRegConfMat <- confusionMatrix(logRegPrediction, valData[,"Seen"])
Accuracy : 0.7239
Sensitivity : 0.3333
Specificity : 0.9213
The target value in my data (Seen) uses 1 for true and 0 for false. I assume the Reference (Ground truth) columns and Prediction (Classifier) rows in the confusion matrix follow the same convention. Therefore my results show:
Question: Why is sensitivity given as 0.3333 and specificity given as 0.9213? I would have thought it was the other way round - see below.
I am reluctant to believe that there is a bug in the R confusionMatrix function, as nothing has been reported and this would be a significant error.
Most references on calculating specificity and sensitivity define them as follows, e.g. www.medcalc.org/calc/diagnostic_test.php
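For reference, the usual textbook definitions can be written as below. The symbols TP, FN, TN and FP (true positives, false negatives, true negatives, false positives) are notation assumed here, not taken from the linked page:

```latex
\[
\mathrm{Sensitivity} = \frac{TP}{TP + FN},
\qquad
\mathrm{Specificity} = \frac{TN}{TN + FP}
\]
```

Both quantities depend on which class is counted as "positive"; swapping the positive class swaps the two values.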
According to the documentation (?confusionMatrix):
"If there are only two factor levels, the first level will be used as the "positive" result."
Hence in your example the positive result will be 0, and the evaluation metrics will be the wrong way around. To override the default behaviour, set the positive argument to the correct value, like so:
confusionMatrix(logRegPrediction, valData[,"Seen"], positive = "1")
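To see the effect without running caret, here is a minimal base-R sketch. The counts are hypothetical, chosen only because they reproduce the figures quoted in the question (the real validation data are not shown), and sens_spec is an illustrative helper, not a caret function:

```r
# Hypothetical counts, picked to be consistent with the reported metrics
# (accuracy 0.7239, sensitivity 0.3333, specificity 0.9213).
reference  <- factor(c(rep(0, 45), rep(1, 89)), levels = c(0, 1))
prediction <- factor(c(rep(0, 15), rep(1, 30),   # reference 0: 15 predicted 0, 30 predicted 1
                       rep(0, 7),  rep(1, 82)),  # reference 1: 7 predicted 0, 82 predicted 1
                     levels = c(0, 1))

# The first factor level is the one treated as "positive" by default.
first_level <- levels(reference)[1]   # "0"

# Illustrative helper: sensitivity and specificity for a chosen positive level.
sens_spec <- function(pred, ref, positive) {
  tp <- sum(pred == positive & ref == positive)  # true positives
  fn <- sum(pred != positive & ref == positive)  # false negatives
  tn <- sum(pred != positive & ref != positive)  # true negatives
  fp <- sum(pred == positive & ref != positive)  # false positives
  c(Sensitivity = tp / (tp + fn), Specificity = tn / (tn + fp))
}

# Treating "0" as positive (the default) reproduces the numbers in the question.
default_metrics <- sens_spec(prediction, reference, positive = "0")
round(default_metrics, 4)    # Sensitivity 0.3333, Specificity 0.9213

# Treating "1" as positive swaps the two values.
corrected_metrics <- sens_spec(prediction, reference, positive = "1")
round(corrected_metrics, 4)  # Sensitivity 0.9213, Specificity 0.3333
```

Accuracy is unaffected by the choice of positive class, which is why only sensitivity and specificity look swapped.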