What does is.na() applied to non-(list or vector) of type 'NULL' mean?

Vincent · Jun 20, 2013 · Viewed 32.7k times

I want to select a Cox model by forward selection, using a data.frame that contains no NA values. Here is some sample data:

test <- data.frame(
  x_1   = runif(100,0,1),
  x_2   = runif(100,0,5),
  x_3   = runif(100,10,20),
  time  = runif(100,50,200),
  event = c(rep(0,70),rep(1,30))
)

The data are meaningless, but if we try to build a model anyway:

library(survival)  # provides coxph() and Surv()
modeltest <- coxph(Surv(time, event) ~ 1, test)
modeltest.forward <- step(
  modeltest, 
  data      = test, 
  direction = "forward", 
  scope     = list(lower = ~ 1, upper = ~ x_1 + x_2 + x_3)
)

The forward selection stops at the first step with this warning:

In is.na(fit$coefficients) : is.na() applied to non-(list or vector) of type 'NULL'

(three times)

I tried changing the upper model, and even tried upper = ~ 1, but the warning remains. I don't understand: I have no NAs and all my vectors are numeric (I checked). I searched for people with the same issue, but all I could find were problems caused by the name or class of the vectors.

What's wrong with my code?

Answer

Richie Cotton · Dec 14, 2015

The problem in this specific case

The right hand side of your formula is 1, which makes it a null model. coxph calls coxph.fit, which (perhaps lazily) doesn't bother to return coefficients for null models.

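Later, step calls extractAIC, which erroneously assumes that the model object contains an element named coefficients. Since that element is missing, fit$coefficients is NULL, and passing NULL to is.na triggers the warning.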
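
The mechanism can be seen without the survival package. This is only a sketch: null_fit is a plain list standing in for a null-model fit (not a real coxph object), and count_coefs is a hypothetical helper, not a function from survival or stats. It shows how a missing coefficients element could be guarded against before calling is.na.

```r
# A plain list standing in for a null-model fit: it has no "coefficients"
# element. (Illustrative stand-in only; not a real coxph object.)
null_fit <- list(loglik = -120.5)
is.null(null_fit$coefficients)   # TRUE: `$` returns NULL for a missing element

# Hypothetical helper: count the estimated coefficients, treating a missing
# "coefficients" element as zero instead of passing NULL to is.na().
count_coefs <- function(fit) {
  coefs <- fit$coefficients
  if (is.null(coefs)) {
    return(0L)
  }
  sum(!is.na(coefs))
}

count_coefs(null_fit)                                       # 0
count_coefs(list(coefficients = c(x_1 = 0.3, x_2 = NA)))    # 1
```

The explicit is.null check makes the "no coefficients" case a deliberate branch rather than a side effect of how is.na treats NULL.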

The general case

is.na assumes that its input argument is an atomic vector or a matrix or a list or a data.frame. Other data types cause the warning. It happens with NULL, as you've seen:

is.na(NULL)
## logical(0)
## Warning message:
## In is.na(NULL) : is.na() applied to non-(list or vector) of type 'NULL'

One common cause of this problem is trying to access elements of a list, or columns of a data frame, that don't exist.

d <- data.frame(x = c(1, NA, 3))
d$y # "y" doesn't exist in the data frame, but NULL is returned
## NULL
is.na(d$y)
## logical(0)
## Warning message:
## In is.na(d$y) : is.na() applied to non-(list or vector) of type 'NULL'

You can protect against this by checking that the column exists before you manipulate it.

if("y" %in% colnames(d))
{
  d2 <- d[is.na(d$y), ]
}
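
If the same check is needed in several places, it could be wrapped in a small function that fails loudly instead of silently returning NULL. The following is a minimal sketch; na_rows is a hypothetical name, and raising an error for a missing column is one possible design choice among several.

```r
# Hypothetical helper: return the rows of `data` where `column` is NA,
# and stop with a clear message if the column does not exist.
na_rows <- function(data, column) {
  if (!column %in% colnames(data)) {
    stop("Column '", column, "' not found in data frame")
  }
  data[is.na(data[[column]]), , drop = FALSE]
}

d <- data.frame(x = c(1, NA, 3))
na_rows(d, "x")   # the single row where x is NA

# A missing column now produces an error rather than a warning about NULL.
result <- tryCatch(na_rows(d, "y"), error = function(e) conditionMessage(e))
result            # "Column 'y' not found in data frame"
```

An error at the point of the typo is usually easier to trace than a warning emitted later from inside is.na.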

The warning with other data types

You get a similar warning with formulae, functions, expressions, etc.:

is.na(~ NA)
## [1] FALSE FALSE
## Warning message:
## In is.na(~NA) : is.na() applied to non-(list or vector) of type 'language'

is.na(mean)
## [1] FALSE
## Warning message:
## In is.na(mean) : is.na() applied to non-(list or vector) of type 'closure'

is.na(is.na)
## [1] FALSE
## Warning message:
## In is.na(is.na) : is.na() applied to non-(list or vector) of type 'builtin'

is.na(expression(NA))
## [1] FALSE
## Warning message:
## In is.na(expression(NA)) :
##   is.na() applied to non-(list or vector) of type 'expression'
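
To avoid these warnings in your own code, one option is to check the kind of object before calling is.na. This is a sketch under assumptions: can_check_na and checked_is_na are hypothetical helpers, and the test used here (non-NULL, and either atomic or a list) is a conservative approximation of the inputs is.na accepts without complaint.

```r
# Hypothetical helper: TRUE for non-NULL atomic vectors (including matrices)
# and lists (including data frames); FALSE for everything else.
can_check_na <- function(x) {
  !is.null(x) && (is.atomic(x) || is.list(x))
}

# Hypothetical wrapper: report the offending type as an error
# instead of letting is.na() emit a warning.
checked_is_na <- function(x) {
  if (!can_check_na(x)) {
    stop("is.na() is not meaningful for an object of type '", typeof(x), "'")
  }
  is.na(x)
}

can_check_na(c(1, NA, 3))        # TRUE
can_check_na(list(1, NA))        # TRUE
can_check_na(~ NA)               # FALSE
can_check_na(mean)               # FALSE
can_check_na(expression(NA))     # FALSE

msg <- tryCatch(checked_is_na(mean), error = function(e) conditionMessage(e))
msg   # "is.na() is not meaningful for an object of type 'closure'"
```

typeof reports the same type names that appear in the warning text, so the error message points at the same cause.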