NA values not being excluded in `cor`

r na
Charlie · Jul 14, 2015 · Viewed 36.3k times

To simplify, I have a data set which is as follows:

b <- 1:6
# > b
# [1] 1 2 3 4 5 6
jnk <- c(2, 4, 5, NA, 7, 9)
# > jnk
# [1]  2  4  5 NA  7  9

When I try:

cor(b, jnk, na.rm=TRUE)

I get:

> cor(b, jnk, na.rm=T)
  Error in cor(b, jnk, na.rm = T) : unused argument (na.rm = T)

I've also tried na.action = na.exclude, etc. None seem to work. It'd be really helpful to know what the issue is and how I can fix it. Thanks.

Answer

Spacedman · Jul 14, 2015

TL;DR: use this instead:

cor(b, jnk, use="complete.obs")

Read ?cor:

cor(x, y = NULL, use = "everything",
     method = c("pearson", "kendall", "spearman"))

cor has no na.rm argument; it has use, which ?cor describes as:

an optional character string giving a method for computing covariances in the presence of missing values. This must be (an abbreviation of) one of the strings "everything", "all.obs", "complete.obs", "na.or.complete", or "pairwise.complete.obs".

Pick one. Details of what each does are in the Details section of ?cor.
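To see how the options differ on the vectors from the question, here is a minimal sketch. The variable names (r_default, r_allobs, and so on) are just illustrative, and the comments summarise my reading of ?cor, so check the help page for the exact wording.

```r
b <- 1:6
jnk <- c(2, 4, 5, NA, 7, 9)

# Default use = "everything": the NA in jnk propagates, so the result is NA.
r_default <- cor(b, jnk)

# "all.obs" treats missing values as an error; tryCatch captures the
# condition object instead of stopping the script.
r_allobs <- tryCatch(cor(b, jnk, use = "all.obs"), error = function(e) e)

# "complete.obs" drops every case with an NA (here, position 4)
# before computing the correlation.
r_complete <- cor(b, jnk, use = "complete.obs")

# With only two vectors, pairwise deletion removes the same case,
# so it matches "complete.obs".
r_pairwise <- cor(b, jnk, use = "pairwise.complete.obs")

# Manual equivalent: keep only the positions where both values are present.
ok <- complete.cases(b, jnk)
r_manual <- cor(b[ok], jnk[ok])

print(c(default = r_default, complete = r_complete,
        pairwise = r_pairwise, manual = r_manual))
```

For this data the three non-missing results agree (about 0.99). "complete.obs" and "pairwise.complete.obs" can only give different answers when x is a matrix or data frame with NAs in different rows of different columns.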