Matching vector of default values using match.arg() with or without error [R]

SimonG · Mar 9, 2015 · Viewed 8.3k times

I want to write a function that applies one of two different statistical methods to its input. In the process, I noticed some behavior of different functions that I do not understand. The function I want to write should have the following properties:

  • it should have a vector as a default value (so the user can see which methods are available)
  • if the argument is left at the default value, then the first of the two methods should be used
  • if the user manually supplies a vector of methods, then the function should give an error

Basically, I want the function to behave like cor does in R. There, you have a default value method = c("pearson", "kendall", "spearman"), and the function calculates the Pearson correlation if method isn't specified. If the user asks for several methods at once, the function returns an error.

From looking at cor, this appears to be done using match.arg(method). This behavior is illustrated here:

x <- y <- 1:5

cor(x, y, method="pearson")
# = 1
cor(x, y, method="kendall")
# = 1
cor(x, y, method=c("pearson","kendall"))
# gives an error

I tried writing my own function, also using match.arg(method), but I realized that the result is somehow different. Even when choosing a vector for method, the function doesn't terminate with an error, but returns the results of the first method.

This is illustrated here:

myfun <- function(x, method=c("add","multiply")){
  method <- match.arg(method)
  if(method=="add") return(sum(x))
  if(method=="multiply") return(prod(x))
}

x <- 1:5

myfun(x, method="add")
# = 15
myfun(x, method="multiply")
# = 120
myfun(x, method=c("add","multiply"))
# = 15

I don't understand this behavior, and I would be glad if you could help me out here. From my attempts on Google, I realize that it might be related to non-standard evaluation, but I can't put two and two together just yet.

Thanks in advance, your help is much appreciated!

Cheers!

EDIT:

I could also re-phrase my question:

What powerful sorcery does cor do that it returns the Pearson correlation when method is not supplied, but that it returns an error when method = c("pearson", "kendall", "spearman") is explicitly specified?

Answer

shadow · Mar 9, 2015

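If arg and choices are identical in match.arg, then the first element is returned. Otherwise arg has to be of length 1. This is why myfun(x, method=c("add","multiply")) returns 15: the supplied vector is identical to the default, so match.arg cannot tell it apart from an argument that was left unspecified. From the documentation of match.arg: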

Since default argument matching will set arg to choices, this is allowed as an exception to the ‘length one unless several.ok is TRUE’ rule, and returns the first element.

match.arg(c("pearson", "kendall", "spearman"), c("pearson", "kendall", "spearman"))
## [1] "pearson"
match.arg(c("pearson", "kendall"), c("pearson", "kendall", "spearman"))
## Error in match.arg(c("pearson", "kendall"), c("pearson", "kendall", "spearman")) : 
##  'arg' must be of length 1
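
The same rule can be observed inside a function. The following is a minimal sketch that reuses the asker's myfun (rewritten with switch, which is my choice and not part of the original); the variable names res_default, res_explicit and res_reordered are only for illustration. A vector that differs from the default in any way, even just in order, is no longer treated as "unspecified":

```r
# Same behavior as the myfun in the question, written with switch().
myfun <- function(x, method = c("add", "multiply")) {
  method <- match.arg(method)
  switch(method, add = sum(x), multiply = prod(x))
}

x <- 1:5

# Default left untouched: first choice ("add") is used.
res_default <- myfun(x)

# Explicit vector identical to the default: also treated as the default.
res_explicit <- myfun(x, method = c("add", "multiply"))

# Reordered vector: not identical to the choices, so match.arg() errors.
# tryCatch() captures the error message instead of stopping the script.
res_reordered <- tryCatch(
  myfun(x, method = c("multiply", "add")),
  error = function(e) conditionMessage(e)
)
```

Here res_default and res_explicit are both 15, while res_reordered holds the error message from match.arg.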

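You can get your desired behavior by adding a dummy choice to the default. A user-supplied c("add","multiply") is then no longer identical to the choices, so match.arg stops with an error: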

myfun <- function(x, method=c("add","multiply","other.return.error")){
  method <- match.arg(method)
  if("other.return.error" %in% method) stop("this option should not be used")
  if(method=="add") return(sum(x))
  if(method=="multiply") return(prod(x))
}
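
If a dummy choice in the signature is undesirable, another option might be to check explicitly whether the argument was supplied. This is only a sketch under that assumption, not what cor does; the name myfun_strict and the error text are hypothetical. It relies on base R's missing(), which is TRUE when the caller did not pass the argument:

```r
# Hypothetical variant: reject any explicitly supplied vector of length != 1,
# even one that is identical to the default.
myfun_strict <- function(x, method = c("add", "multiply")) {
  if (!missing(method) && length(method) != 1L) {
    stop("'method' must be of length 1")
  }
  # With method missing, match.arg() falls back to the first choice.
  method <- match.arg(method)
  switch(method, add = sum(x), multiply = prod(x))
}

x <- 1:5

strict_default  <- myfun_strict(x)                      # uses "add"
strict_multiply <- myfun_strict(x, method = "multiply")

# Supplying the full default vector by hand now fails as well.
strict_vector <- tryCatch(
  myfun_strict(x, method = c("add", "multiply")),
  error = function(e) conditionMessage(e)
)
```

Unlike the dummy-choice approach, this keeps the default vector limited to the methods that really exist, at the cost of one extra check before match.arg.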