I'm trying to use the R survival package, to produce a plot of log(-log(survival))
against log(time)
(This is something sometimes recommended as a way to visually inspect for accelerated lifetime or proportional hazard properties).
The "fun=cloglog
" option in plot.survfit
is not producing what I expect it to.
Using the "gehan" data from the "MASS" library:
first, here is a simple plot of survival against time, for the treatment and control groups:
data(gehan, package="MASS")
gehansurv=Surv(gehan$time, gehan$cens)
plot(survfit(gehansurv ~ gehan$treat), col=c("black", "red"))
OK so far. Now, if I use the fun=cloglog
option, the documentation for plot.survfit makes me think that I'll get a plot of log(-log(survival))
against log(time)
:
fun
an arbitrary function defining a transformation of the survival curve. For example fun=log is an alternative way to draw a log-survival curve (but with the axis labeled with log(S) values), and fun=sqrt would generate a curve on square root scale. Four often used transformations can be specified with a character argument instead: "log" is the same as using the log=T option, "event" plots cumulative events (f(y) = 1-y), "cumhaz" plots the cumulative hazard function (f(y) = -log(y)), and "cloglog" creates a complimentary log-log survival plot (f(y) = log(-log(y)) along with log scale for the x-axis).
However, when I try this, it doesn't seem to use the log(-log(y))
function, because the displayed curve is still decreasing (since the original survival curve is decreasing, and the applied f(y)=log(-log(y))
function is a decreasing function, the resulting log(-log(survival))
curve should be increasing).
Also, the x-axis is not log-scaled:
library(VGAM)
plot(survfit(gehansurv ~ gehan$treat), col=c("black", "red"), fun=cloglog)
I can get what I want by defining my own log(-log())
function and using the log="x"
option:
> myfun=function(p){return(log(-log(p)))}
> plot(survfit(gehansurv ~ gehan$treat), col=c("black", "red"), fun=myfun, log="x")
So: What am I doing wrong above (or how am I misinterpreting the plot.survfit
documentation)?
Supplementary question: how would a "fun="
option change the scaling on the horizontal axis as the documentation claims that "fun=cloglog"
would, when on the face of it the argument to "fun"
is a function applied to the vertical variable?
Put quotes around cloglog for plot.survfit.
library(survival)
library(MASS)
data(gehan)
gehansurv=Surv(gehan$time, gehan$cens)
plot(survfit(gehansurv ~ gehan$treat), col=c("black", "red"), fun="cloglog")