R coxph() warning: Loglik converged before variable

JMarcelino · Oct 14, 2013 · Viewed 18.6k times

I'm having some trouble using coxph(). I have two categorical variables, Sex and Probable Cause, that I want to use as predictor variables. Sex is just the typical male/female, but Probable Cause has 5 options. I don't understand what the warning message is telling me. Why do the confidence intervals run from 0 to Inf, and why are the p-values so high?

Here's the code and the output:

> my_coxph <- coxph(Surv(tempo, status) ~ factor(Sexo) + factor(Causa.provavel), data = ceabn)
Warning message:
In fitter(X, Y, strats, offset, init, control, weights = weights,  :
Loglik converged before variable  2,3,5,6 ; beta may be infinite. 

> summary(my_coxph)
Call:
coxph(formula = Surv(tempo, status) ~ factor(Sexo) + factor(Causa.provavel), 
data = ceabn)

n= 43, number of events= 31 

                                            coef exp(coef)  se(coef)     z Pr(>|z|)
factor(Sexo)macho                      7.254e-01 2.066e+00 4.873e-01 1.488    0.137
factor(Causa.provavel)caca             2.186e+01 3.107e+09 9.698e+03 0.002    0.998
factor(Causa.provavel)colisao linha MT 1.973e+01 3.703e+08 9.698e+03 0.002    0.998
factor(Causa.provavel)indeterminado    9.407e-01 2.562e+00 1.683e+04 0.000    1.000
factor(Causa.provavel)predacao         2.170e+01 2.655e+09 9.698e+03 0.002    0.998
factor(Causa.provavel)predado          2.276e+01 7.659e+09 9.698e+03 0.002    0.998

                                       exp(coef) exp(-coef) lower .95 upper .95
factor(Sexo)macho                      2.065e+00  4.841e-01    0.7947     5.368
factor(Causa.provavel)caca             3.107e+09  3.219e-10    0.0000       Inf
factor(Causa.provavel)colisao linha MT 3.703e+08  2.701e-09    0.0000       Inf
factor(Causa.provavel)indeterminado    2.562e+00  3.904e-01    0.0000       Inf
factor(Causa.provavel)predacao         2.655e+09  3.766e-10    0.0000       Inf
factor(Causa.provavel)predado          7.659e+09  1.306e-10    0.0000       Inf

Concordance= 0.752  (se = 0.059 )
Rsquare= 0.608   (max possible= 0.987 )
Likelihood ratio test= 40.23  on 6 df,   p=4.105e-07
Wald test            = 7.46  on 6 df,   p=0.2807
Score (logrank) test = 30.48  on 6 df,   p=3.183e-05

Thank you

Answer

IRTFM · Oct 15, 2013

When I asked Terry Therneau (author of pkg:survival) about this several years ago, he said the test that triggers this warning is overly sensitive, so the warning is often a false alarm. You can usually just look at your coefficients to see whether they are infinite or effectively infinite: a coefficient of 20 would be unrealistically large in most instances, but a coefficient less than five would not be.
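As a rough illustration of that eyeball check, the sketch below takes the coefficients printed in the question's summary() output and flags the ones that look effectively infinite. The cutoff of 10 is an assumption chosen only for illustration (it sits between the "less than five" and "about 20" figures mentioned above); it is not a rule from the survival package.

```r
# Coefficients copied by hand from the summary() output in the question.
coefs <- c(
  "Sexo: macho"      = 0.7254,
  "caca"             = 21.86,
  "colisao linha MT" = 19.73,
  "indeterminado"    = 0.9407,
  "predacao"         = 21.70,
  "predado"          = 22.76
)

# Hypothetical cutoff, an assumption for illustration only.
cutoff <- 10

# A Cox coefficient is a log hazard ratio, so exp() gives the hazard ratio.
hazard_ratio <- exp(coefs)
suspect <- abs(coefs) > cutoff

print(data.frame(coef = coefs, hazard_ratio = hazard_ratio, suspect = suspect))

# Positions of the suspect coefficients, for comparison with the warning text.
print(which(suspect))
```

With a fitted model, the same check could start from coef(my_coxph) instead of hand-copied numbers. The standard errors near 1e4 in the output point the same way.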

In your case, however, it seems to be correctly warning you that there may be problems with your data, since you have implausibly large coefficients. A beta coefficient of 2.276e+01 (about 22.8) on the log-hazard scale is just ridiculously high. The estimated relative risk, exp(beta) = 7.659e+09, is in the billions! You should be looking at tabular classifications of your data for problems of complete separation. Did any of your control group (the reference level of Causa.provavel) die, er, have an event?
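A minimal sketch of that tabular check follows. The data frame `toy` and its values are invented for illustration and are not the poster's data; with the real data the equivalent call would presumably be table(ceabn$Causa.provavel, ceabn$status).

```r
# Hypothetical toy data (NOT the poster's ceabn). Level "a" plays the role
# of the reference group and has no events at all.
toy <- data.frame(
  tempo  = c(5, 8, 12, 20, 3, 6, 9, 15, 2, 4, 7, 11),
  status = c(0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0),
  causa  = factor(rep(c("a", "b", "c"), each = 4))
)

# Cross-tabulate factor level against event status (0 = censored, 1 = event).
tab <- table(toy$causa, toy$status, dnn = c("causa", "status"))
print(tab)

# Levels with zero events are the ones to be suspicious of.
no_events <- rownames(tab)[tab[, "1"] == 0]
print(no_events)
```

A row with a zero in the event column (or in the censored column) means the hazard for that level cannot be estimated relative to the others, which is the kind of pattern that pushes a coefficient toward plus or minus infinity.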