Caret package - defining Positive result

duvvurum · Oct 30, 2015

While using the caret package for machine learning, I am stuck on how caret picks the "Positive" outcome by default, i.e. the first level of the outcome factor in binary classification problems.

The package documentation says it can be set to the alternative level. Can anybody help me define the positive outcome?

Thank you.

Answer

phiver · Oct 30, 2015

Look at this example, which I extended from the caret examples for confusionMatrix.

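library(caret)

# Reference labels: 86 "normal" followed by 258 "abnormal".
# rev(lvs) makes "abnormal" the first factor level.
lvs <- c("normal", "abnormal")
truth <- factor(rep(lvs, times = c(86, 258)),
                levels = rev(lvs))
# Predictions, in the same order as truth.
pred <- factor(
  c(
    rep(lvs, times = c(54, 32)),
    rep(lvs, times = c(27, 231))),
  levels = rev(lvs))

# Cross-tabulate predictions (rows) against the truth (columns).
xtab <- table(pred, truth)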

str(truth)
Factor w/ 2 levels "abnormal","normal": 2 2 2 2 2 2 2 2 2 2 ...

Because abnormal is the first level, it is the default positive class.

confusionMatrix(xtab)

Confusion Matrix and Statistics

          truth
pred       abnormal normal
  abnormal      231     32
  normal         27     54

               Accuracy : 0.8285          
                 95% CI : (0.7844, 0.8668)
    No Information Rate : 0.75            
    P-Value [Acc > NIR] : 0.0003097       

                  Kappa : 0.5336          
 Mcnemar's Test P-Value : 0.6025370       

            Sensitivity : 0.8953          
            Specificity : 0.6279          
         Pos Pred Value : 0.8783          
         Neg Pred Value : 0.6667          
             Prevalence : 0.7500          
         Detection Rate : 0.6715          
   Detection Prevalence : 0.7645          
      Balanced Accuracy : 0.7616          

       'Positive' Class : abnormal     
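
To see where the by-class figures come from, they can be reproduced by hand from the 2x2 table. The following is a minimal base R sketch, not caret's own code: by_class is a hypothetical helper written for illustration, and the table is rebuilt from the counts printed above so the snippet runs without caret.

```r
# Rebuild the 2x2 table from the counts in the output above
# (rows = predictions, columns = truth; matrix() fills column by column).
xtab <- matrix(c(231, 27, 32, 54), nrow = 2,
               dimnames = list(pred  = c("abnormal", "normal"),
                               truth = c("abnormal", "normal")))

# Hypothetical helper (not part of caret): by-class metrics
# for whichever level is treated as the positive class.
by_class <- function(tab, positive) {
  negative <- setdiff(rownames(tab), positive)
  tp <- tab[positive, positive]  # predicted positive, truly positive
  fn <- tab[negative, positive]  # predicted negative, truly positive
  fp <- tab[positive, negative]  # predicted positive, truly negative
  tn <- tab[negative, negative]  # predicted negative, truly negative
  c(Sensitivity  = tp / (tp + fn),
    Specificity  = tn / (tn + fp),
    PosPredValue = tp / (tp + fp),
    NegPredValue = tn / (tn + fn),
    Prevalence   = (tp + fn) / sum(tab))
}

res <- by_class(xtab, "abnormal")
print(round(res, 4))
```

With "abnormal" as the positive class, sensitivity is 231 / 258 and specificity is 54 / 86, which round to the 0.8953 and 0.6279 reported above.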

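To make normal the positive class, pass positive = "normal" to confusionMatrix. Compare the result with the previous output: the table, accuracy, and kappa are unchanged, but sensitivity and specificity swap, as do the positive and negative predictive values, and prevalence, detection rate, and detection prevalence are recomputed for the new class.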

confusionMatrix(xtab, positive = "normal")

Confusion Matrix and Statistics

          truth
pred       abnormal normal
  abnormal      231     32
  normal         27     54

               Accuracy : 0.8285          
                 95% CI : (0.7844, 0.8668)
    No Information Rate : 0.75            
    P-Value [Acc > NIR] : 0.0003097       

                  Kappa : 0.5336          
 Mcnemar's Test P-Value : 0.6025370       

            Sensitivity : 0.6279          
            Specificity : 0.8953          
         Pos Pred Value : 0.6667          
         Neg Pred Value : 0.8783          
             Prevalence : 0.2500          
         Detection Rate : 0.1570          
   Detection Prevalence : 0.2355          
      Balanced Accuracy : 0.7616          

       'Positive' Class : normal
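
If you would rather not pass positive on every call, another option is to reorder the factor levels so that the class you want comes first. This is a minimal base R sketch, assuming (as described in the question) that caret takes the first level as the default positive class; truth_n is just an illustrative name.

```r
# Same reference labels as in the example above.
lvs <- c("normal", "abnormal")
truth <- factor(rep(lvs, times = c(86, 258)), levels = rev(lvs))
print(levels(truth))      # "abnormal" comes first

# relevel() moves the chosen level to the front; the data are unchanged.
truth_n <- relevel(truth, ref = "normal")
print(levels(truth_n))    # "normal" comes first
print(table(truth_n))     # counts are the same as before
```

The prediction factor should get the same relevel() call before the table is built, so that both factors keep an identical level order.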