Pass a vector of variable names to arrange() in dplyr

rsoren · Oct 22, 2014 · Viewed 15.4k times

I want to pass arrange() {dplyr} a vector of variable names to sort on. Usually I just type in the variables I want, but I'm trying to make a function where the sorting variables can be input as a function parameter.

df <- structure(list(var1 = c(1L, 2L, 2L, 3L, 1L, 1L, 3L, 2L, 4L, 4L
  ), var2 = structure(c(10L, 1L, 8L, 3L, 5L, 4L, 7L, 9L, 2L, 6L
  ), .Label = c("b", "c", "f", "h", "i", "o", "s", "t", "w", "x"
  ), class = "factor"), var3 = c(7L, 5L, 5L, 8L, 5L, 8L, 6L, 7L, 
  5L, 8L), var4 = structure(c(8L, 5L, 1L, 4L, 7L, 4L, 3L, 6L, 9L, 
  2L), .Label = c("b", "c", "d", "e", "f", "h", "i", "w", "y"), 
  class = "factor")), .Names = c("var1", "var2", "var3", "var4"), 
  row.names = c(NA, -10L), class = "data.frame")

# this is the normal way to arrange df with dplyr
df %>% arrange(var3, var4)

# but none of these (below) work for passing a vector of variables
vector_of_vars <- c("var3", "var4")
df %>% arrange(vector_of_vars)
df %>% arrange(get(vector_of_vars))
df %>% arrange(eval(parse(text = paste(vector_of_vars, collapse = ", "))))

Answer

farnsy · Oct 22, 2014

Hadley hasn't made this obvious in the help file; it is only explained in his NSE vignette. The versions of the functions whose names end in an underscore use standard evaluation, so you can pass them vectors of strings and the like.

If I understand your problem correctly, you can just replace arrange() with arrange_() and it will work.

Specifically, pass the vector of strings as the .dots argument when you do it.

> df %>% arrange_(.dots=c("var1","var3"))
   var1 var2 var3 var4
1     1    i    5    i
2     1    x    7    w
3     1    h    8    e
4     2    b    5    f
5     2    t    5    b
6     2    w    7    h
7     3    s    6    d
8     3    f    8    e
9     4    c    5    y
10    4    o    8    c

========== Update March 2018 ==============

Using the standard evaluation versions in dplyr as I have shown here is now considered deprecated. You can read Hadley's programming vignette for the new way. Basically you will use !! to unquote one variable or !!! to unquote a vector of variables inside of arrange().

When you pass those columns, if they are bare, quote them using quo() for one variable or quos() for a vector. Don't use quotation marks. See the answer by Akrun.
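As a rough sketch of the bare-column case, something like the following should work. The `toy` data frame and the names `one_var` and `sort_vars` are made up for illustration and are not the `df` from the question.

```r
library(dplyr)

# Hypothetical stand-in data (not the df from the question)
toy <- data.frame(
  var1 = c(2L, 1L, 2L, 1L),
  var3 = c(5L, 7L, 4L, 5L),
  var4 = c("f", "w", "b", "i"),
  stringsAsFactors = FALSE
)

# One bare column: capture it with quo(), unquote it with !!
one_var <- quo(var3)
by_one <- toy %>% arrange(!!one_var)

# Several bare columns: capture them with quos(), splice them in with !!!
sort_vars <- quos(var1, var3)
by_many <- toy %>% arrange(!!!sort_vars)
```

Note that `var3` and `var1` are typed without quotation marks here; `quo()` and `quos()` capture them as expressions rather than as strings.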

If your columns are already strings, then make them names using rlang::sym() for a single column or rlang::syms() for a vector. See the answer by Christos. You can also use as.name() for a single column. Unfortunately, as of this writing, the information on how to use rlang::sym() has not yet made it into the programming vignette mentioned above (eventually it will be in the section on "variadic quasiquotation" according to his draft).
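For the case in the question, where the sorting variables arrive as a character vector in a function parameter, a minimal sketch might look like this. The helper name `arrange_by_names` and the `toy` data frame are hypothetical; only `arrange()`, `rlang::sym()`, `rlang::syms()` and `as.name()` come from the packages themselves.

```r
library(dplyr)

# Hypothetical helper: sort `data` by the columns named in the
# character vector `vars`
arrange_by_names <- function(data, vars) {
  # syms() turns each string into a symbol; !!! splices them into arrange()
  data %>% arrange(!!!rlang::syms(vars))
}

# Hypothetical stand-in data (not the df from the question)
toy <- data.frame(
  var1 = c(2L, 1L, 2L, 1L),
  var3 = c(5L, 7L, 4L, 5L),
  var4 = c("f", "w", "b", "i"),
  stringsAsFactors = FALSE
)

vector_of_vars <- c("var3", "var4")
sorted <- arrange_by_names(toy, vector_of_vars)

# A single column name held as a string: sym() (or as.name()) with !!
one_name <- "var1"
by_sym  <- toy %>% arrange(!!rlang::sym(one_name))
by_name <- toy %>% arrange(!!as.name(one_name))
```

With this in place, `arrange_by_names(df, c("var3", "var4"))` should give the same ordering as `df %>% arrange(var3, var4)`.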