ROC curves for Random Forest fit objects using pROC in R, using positive or negative "votes" as the predictor

Forevertrip · Mar 21, 2019 · Viewed 8k times

obese is a binary response variable, where 1 indicates obese and 0 not obese; weight is a continuous predictor.

Using a random forest to classify obese:

library(randomForest)

rf <- randomForest(factor(obese) ~ weight)

gives us a fit object containing:

> summary(rf)
                Length Class  Mode     
call               2   -none- call     
type               1   -none- character
predicted        100   factor numeric  
err.rate        1500   -none- numeric  
confusion          6   -none- numeric  
votes            200   matrix numeric  
oob.times        100   -none- numeric  
classes            2   -none- character
importance         1   -none- numeric  
importanceSD       0   -none- NULL     
localImportance    0   -none- NULL     
proximity          0   -none- NULL     
ntree              1   -none- numeric  
mtry               1   -none- numeric  
forest            14   -none- list     
y                100   factor numeric  
test               0   -none- NULL     
inbag              0   -none- NULL     
terms              3   terms  call  

I believe the votes matrix shows the fraction of votes, from 0 to 1, that the random forest gives for classifying each case into either class; not obese = 0, obese = 1:

> head(rf$votes, 20) 
           0          1
1  0.9318182 0.06818182
2  0.9325843 0.06741573
3  0.2784091 0.72159091
4  0.9040404 0.09595960
5  0.3865979 0.61340206
6  0.9689119 0.03108808
7  0.8187135 0.18128655
8  0.7170732 0.28292683
9  0.6931217 0.30687831
10 0.9831461 0.01685393
11 0.3425414 0.65745856
12 1.0000000 0.00000000
13 0.9728261 0.02717391
14 0.9848485 0.01515152
15 0.8783069 0.12169312
16 0.8553459 0.14465409
17 1.0000000 0.00000000
18 0.3389831 0.66101695
19 0.9316770 0.06832298
20 0.9435897 0.05641026
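
One quick sanity check of this reading (my own addition, reusing the rf object from above): if these are the normalized out-of-bag vote fractions, every row should sum to 1.

# each row of rf$votes should sum to 1, i.e. the two columns are complements
all(abs(rowSums(rf$votes) - 1) < 1e-12)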

Taking those:

votes_2 <- rf$votes[,2]
votes_1 <- rf$votes[,1]

My question is: why do

pROC::plot.roc(obese, votes_1)

and

pROC::plot.roc(obese, votes_2)

produce the same result?

Answer

Calimo · Mar 21, 2019

The first thing to realize is that ROC analysis doesn't care about the exact values of your data. Instead it looks at the ranking of the data points, and at how well the ranks separate the two classes.

Second, as has been mentioned in a comment above, the votes for classes 0 and 1 sum to 1 for each observation. This means that, in terms of ranking, the two columns are equivalent (up to the direction of sorting).
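
A quick way to see that equivalence (my own check, reusing votes_1 and votes_2 as defined above): since votes_2 = 1 - votes_1, ranking by one column is exactly the reverse of ranking by the other.

# votes_2 is 1 - votes_1, so the rank correlation is exactly -1:
# identical ordering of the observations, just sorted the other way round
cor(votes_1, votes_2, method = "spearman")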

The last piece of the puzzle is that pROC doesn't assume that the predictor you provide is the probability of belonging to the positive class. Instead you can pass any kind of score, and the direction of the comparison is detected automatically. This is done silently by default, but you can see what happens by setting the quiet flag to FALSE:

> pROC::roc(obese, votes_1, quiet = FALSE)
Setting levels: control = 0, case = 1
Setting direction: controls < cases

> pROC::roc(obese, votes_2, quiet = FALSE)
Setting levels: control = 0, case = 1
Setting direction: controls > cases

Notice how in the case of votes_2 it detected that the negative class had higher values (based on the median) and set the direction of the comparison accordingly.
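
You can also verify that both calls end up measuring the same thing (a quick check with the same objects as above): once each direction has been set, the two curves and their AUCs are identical.

roc_1 <- pROC::roc(obese, votes_1, quiet = TRUE)
roc_2 <- pROC::roc(obese, votes_2, quiet = TRUE)
pROC::auc(roc_1)  # same AUC as roc_2: same ranking, opposite direction
pROC::auc(roc_2)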

If this is not what you want, you can always set the levels and direction parameters explicitly:

> pROC::roc(obese, votes_2, levels = c(0, 1), direction = "<")

This will result in a "reversed" curve, showing that votes_2 performs worse than random at assigning higher values to the positive class.
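
As a minimal sketch (same objects as above), you can inspect that reversed curve and its AUC; with data like the output shown above, where the cases tend to have lower votes_2, the AUC comes out below 0.5.

rev_roc <- pROC::roc(obese, votes_2, levels = c(0, 1), direction = "<", quiet = TRUE)
plot(rev_roc)       # curve falls below the diagonal when cases tend to score lower
pROC::auc(rev_roc)  # below 0.5 in that situation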