ROC curve in R using ROCR package

r roc
spektra · Jul 13, 2012 · Viewed 70k times

Can someone please explain how to plot a ROC curve with ROCR? I know that I should first run:

prediction(predictions, labels, label.ordering = NULL)

and then:

performance(prediction.obj, measure, x.measure="cutoff", ...)

I am just not clear on what is meant by predictions and labels. I created models with ctree and cforest, and I want the ROC curves for both so I can compare them. In my case the class attribute is y_n, which I suppose should be used for the labels. But what about the predictions? Here are the steps I follow (dataset name: bank_part):

pred<-cforest(y_n~.,bank_part)
tablebank<-table(predict(pred),bank_part$y_n)
prediction(tablebank, bank_part$y_n)

After running the last line I get this error:

Error in prediction(tablebank, bank_part$y_n) : 
Number of cross-validation runs must be equal for predictions and labels.

Thanks in advance!

Here's another example: I have a training dataset (bank_training) and a testing dataset (bank_testing), and I ran a randomForest as below:

bankrf<-randomForest(y~., bank_training, mtry=4, ntree=2,    
keep.forest=TRUE,importance=TRUE) 
bankrf.pred<-predict(bankrf, bank_testing, type='response')

Now bankrf.pred is a factor with levels c("0", "1"). Still, I don't know how to plot the ROC curve, because I get stuck at the prediction step. Here's what I do:

library(ROCR) 
pred<-prediction(bankrf.pred$y, bank_testing$c(0,1) 

But this is still incorrect, because I get the error message:

Error in bankrf.pred$y_n : $ operator is invalid for atomic vectors

Answer

Jeff Allen · Jul 13, 2012

The predictions are the continuous scores your classifier assigns (for example, predicted probabilities); the labels are the binary ground truth for each observation.

So something like the following should work:

> library(ROCR)
> pred <- prediction(c(0.1,.5,.3,.8,.9,.4,.9,.5), c(0,0,0,1,1,1,1,1))
> perf <- performance(pred, "tpr", "fpr")
> plot(perf)

to generate an ROC curve.
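Applied to the randomForest case from the question, the point is that a factor of predicted classes is not a continuous score, so the class probabilities have to be requested instead. The following is only a minimal sketch under assumptions: the data frame `dat` and its columns `x1`, `x2`, `y` are synthetic stand-ins for the bank data, which is not available here, and `ntree = 200` is an arbitrary choice.

```r
library(randomForest)
library(ROCR)

set.seed(42)

# Synthetic stand-in for the bank data (hypothetical columns x1, x2, y)
n <- 400
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- factor(ifelse(dat$x1 + 0.5 * dat$x2 + rnorm(n) > 0, "1", "0"))
train <- dat[1:300, ]
test  <- dat[301:400, ]

rf <- randomForest(y ~ ., data = train, ntree = 200)

# type = "prob" returns a matrix of class probabilities, one column per level,
# instead of the factor of predicted classes returned by type = "response"
prob  <- predict(rf, newdata = test, type = "prob")
score <- prob[, "1"]  # probability of the positive class "1"

# predictions = continuous scores, labels = true classes of the same rows
pred <- prediction(score, test$y)
perf <- performance(pred, "tpr", "fpr")
plot(perf)

# Area under the curve, stored in the y.values slot of the performance object
auc <- performance(pred, "auc")@y.values[[1]]
```

Both arguments to `prediction()` have one entry per test row, which is why passing a contingency table as the first argument fails.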

EDIT: It may be helpful to include reproducible sample code in the question (I'm having a hard time interpreting your comment).

There's no new code here, but... here's a function I use quite often for plotting an ROC:

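# Requires library(ROCR). `truth` holds the true labels, `predicted` the numeric scores.
plotROC <- function(truth, predicted, ...) {
  # abs() is kept from the original; it has no effect on scores that are
  # already non-negative, such as probabilities
  pred <- prediction(abs(predicted), truth)
  perf <- performance(pred, "tpr", "fpr")

  # Extra arguments (col, main, add, ...) are passed through to plot()
  plot(perf, ...)
}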
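Since the question asks about comparing two models, one possible approach is to draw the second curve on top of the first with `add = TRUE`, which ROCR's plot method for performance objects accepts. This is a sketch with made-up data: `truth`, `score_a`, and `score_b` are hypothetical simulated values standing in for the true labels and for the scores of two fitted models such as ctree and cforest.

```r
library(ROCR)

set.seed(1)

# Hypothetical data: binary truth and two sets of noisy scores
truth   <- rbinom(200, 1, 0.5)
score_a <- truth + rnorm(200, sd = 0.7)  # stand-in for the first model
score_b <- truth + rnorm(200, sd = 1.5)  # stand-in for the second model

pred_a <- prediction(score_a, truth)
pred_b <- prediction(score_b, truth)

perf_a <- performance(pred_a, "tpr", "fpr")
perf_b <- performance(pred_b, "tpr", "fpr")

# First curve creates the plot, second is overlaid on the same axes
plot(perf_a, col = "blue")
plot(perf_b, col = "red", add = TRUE)
abline(0, 1, lty = 2)  # diagonal = performance of random guessing
legend("bottomright", legend = c("model A", "model B"),
       col = c("blue", "red"), lty = 1)

# AUC gives a single number per model to accompany the visual comparison
auc_a <- performance(pred_a, "auc")@y.values[[1]]
auc_b <- performance(pred_b, "auc")@y.values[[1]]
```

For a fair comparison, both score vectors should be computed on the same held-out observations.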