How to predict survival time in Cox's Regression Model in R?

statBeginner · Feb 13, 2015

I have modeled a problem using Cox's regression and now want to predict the estimated survival time for an individual. The model has a list of covariates on which the survival time depends. The fitted model tells us how to calculate P(T>t), which is the survival function (1 - CDF) for a given individual.

I want to predict something slightly different. Given values for the covariates used in the model, I want to predict the estimated number of days that the person would live. As I understand it, this is similar to sampling from the pdf. How can I do this using the survival package in R? Below is a summary of the fit using Cox's regression model.

Call:
coxph(formula = Surv(Time, death) ~ variable1 + variable2 + variable3 + 
variable4 + variable5 + variable6 + variable7 + variable8 + variable9, 
data = DataTest, method = "breslow")

n= 23756, number of events= 23756 

              coef exp(coef) se(coef)      z Pr(>|z|)
variable1  0.02494   1.02526  0.02375  1.050  0.29354
variable2 -0.20715   0.81290  0.02395 -8.650  < 2e-16 ***
variable3  0.12940   1.13814  0.02263  5.717 1.08e-08 ***
variable4  0.02469   1.02500  0.02289  1.079  0.28077
variable5  0.13165   1.14070  0.02235  5.891 3.84e-09 ***
variable6  0.22286   1.24965  0.01534 14.526  < 2e-16 ***
variable7 -0.10513   0.90021  0.02035 -5.167 2.38e-07 ***
variable8 -0.12215   0.88501  0.02243 -5.447 5.13e-08 ***
variable9 -0.04930   0.95189  0.01827 -2.698  0.00697 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

          exp(coef) exp(-coef) lower .95 upper .95
variable1    1.0253     0.9754    0.9786    1.0741
variable2    0.8129     1.2302    0.7756    0.8520
variable3    1.1381     0.8786    1.0888    1.1898
variable4    1.0250     0.9756    0.9800    1.0720
variable5    1.1407     0.8767    1.0918    1.1918
variable6    1.2496     0.8002    1.2126    1.2878
variable7    0.9002     1.1109    0.8650    0.9368
variable8    0.8850     1.1299    0.8470    0.9248
variable9    0.9519     1.0505    0.9184    0.9866

Concordance= 0.543  (se = 0.002 )
Rsquare= 0.022   (max possible= 1 )
Likelihood ratio test= 516.5  on 9 df,   p=0
Wald test            = 503.1  on 9 df,   p=0
Score (logrank) test = 505.1  on 9 df,   p=0

Answer

Mike.Gahan · Feb 13, 2015

Because survival data are usually censored, it is generally more useful to compute a median survival time than a mean expected survival time. You can recover the median survival time for each person in your data by running the following:

# cox.ph.model is the fitted coxph object; printing the result shows a median per row of newdata
survfit(cox.ph.model, newdata = DataTest)
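
To get the medians as numbers rather than reading them off the printed table, something like the following may help. It is only a sketch: it uses the `lung` dataset that ships with the survival package instead of DataTest, and the covariates (`age`, `sex`) and the two individuals in `new_people` are made up for illustration.

```r
library(survival)

# Illustrative fit on the bundled `lung` data (stand-in for the asker's DataTest)
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Two hypothetical individuals, described by the same covariates used in the fit
new_people <- data.frame(age = c(50, 70), sex = c(2, 1))

# One predicted survival curve per row of new_people
sf <- survfit(fit, newdata = new_people)

# Median survival time per individual: the first time each curve drops to 0.5.
# A value of NA would mean the curve never falls that far within the observed follow-up.
med <- as.numeric(quantile(sf, probs = 0.5, conf.int = FALSE))
print(med)
```

The same call with other values of `probs` (for example 0.25 or 0.75) gives other percentiles of the predicted survival time, which can be a useful way to express the spread around the median.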