dplyr - groupby on multiple columns using variable names

Neil picture Neil · Dec 28, 2015 · Viewed 35.6k times · Source

I am working with R Shiny for some exploratory data analysis. I have two checkbox inputs that contain only the user-selected options. The first checkbox input contains only the categorical variables; the second checkbox contains only numeric variables. Next, I apply a groupby on these two selections:

var1 <- input$variable1      # Checkbox with categorical variables
var2 <- input$variable2      # Checkbox with numerical variables

v$data <- dataset %>%
  group_by_(var1) %>%
  summarize_(Sum = interp(~sum(x), x = as.name(var2))) %>%
  arrange(desc(Sum))

When only one categorical variable is selected, this groupby works perfectly. When multiple categorical variables are chosen, this groupby returns an array with column names. How do I pass this array of column names to dplyr's groupby?

Answer

MrFlick picture MrFlick · Dec 28, 2015

If you have a vector of variable names, you should pass them to the .dots= parameter of group_by_. For example:

mtcars %>% 
   group_by_(.dots=c("mpg","hp","wt")) %>% 
   summarize(x=mean(gear))