Pass formula to function in R?

RickyB picture RickyB · Feb 4, 2013 · Viewed 9.8k times · Source

Any help with this would be really appreciated. I am using the Lumley survey package and am trying to simplify my code, but have hit a slight snag.

The svymean function from the package is called as follows in my code, where the first argument is a formula indicating which variables I want, and the second argument is that dataset:

svymean(~hq_ehla, FraSvy, na.rm=TRUE)

I'm trying to create a function that will pull out the mean (proportions) and standard errors for categorical variables, so I've made the following function:

stats <- function(repstat, num) {
    estmean <- as.numeric(round(100 * repstat[num], digits=0))
    estse <- round(100 * sqrt(attributes(repstat)$var[num,num]), digits=1)
    return(list(mean=estmean, se=estse))
}

This works, so when I'm pulling out the mean and se of my first category, for example, I use:

stats(svymean(~hq_ehla, FraSvy, na.rm=TRUE), 1)$mean
stats(svymean(~hq_ehla, FraSvy, na.rm=TRUE), 1)$se

What I'd like to be able to do is simplify this to something much shorter, where maybe I'd only have to write:

stats(FraSvy, "hq_ehla", 1)$mean

Or something like that. Problem is that I can't figure out how to pass a formula to a function using a variable name.

Answer

mnel picture mnel · Feb 4, 2013

You can use reformulate to construct your formula and call svymean within your function. Use ... to pass na.rm or other arguments to svymean

stats <- function(terms, data,  num, ...) {
  .formula <- reformulate(terms)
  repstat <- svymean(.formula, data, ...)
  estmean <- as.numeric(round(100 * repstat[num], digits=0))
  estse <- round(100 * sqrt(attributes(repstat)$var[num,num]), digits=1)
  return(list(mean=estmean, se=estse))
}

stats(data = FraSvy, terms = "hq_ehla", 1, na.rm = TRUE)$mean

Have a look at this answer for more details on programmitically create formula objects

Or, you could pass a formula object within the function.

stats2 <- function(formula, data,  num, ...) {

  repstat <- svymean(formula, data, ...)
  estmean <- as.numeric(round(100 * repstat[num], digits=0))
  estse <- round(100 * sqrt(attributes(repstat)$var[num,num]), digits=1)
  return(list(mean=estmean, se=estse))
}


stats2(data = FraSvy, formula = ~hq_ehla, 1, na.rm = TRUE)$mean