Select first and last row from grouped data

tospig · Jul 21, 2015 · Viewed 107.8k times

Question

Using dplyr, how do I select the top and bottom observations/rows of grouped data in one statement?

Data & Example

Given a data frame

df <- data.frame(id=c(1,1,1,2,2,2,3,3,3), 
                 stopId=c("a","b","c","a","b","c","a","b","c"), 
                 stopSequence=c(1,2,3,3,1,4,3,1,2))

I can get the top and bottom observations from each group using slice, but only with two separate statements:

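library(dplyr)

# First stop per id: sort by stopSequence, then keep the first row of each group
firstStop <- df %>%
  group_by(id) %>%
  arrange(stopSequence) %>%
  slice(1) %>%
  ungroup()

# Last stop per id: same ordering, but keep the last row of each group
lastStop <- df %>%
  group_by(id) %>%
  arrange(stopSequence) %>%
  slice(n()) %>%
  ungroup()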

Can I combine these two statements into one that selects both the top and bottom observations?

Answer

jeremycg · Jul 21, 2015

There is probably a faster way, but this does it in one statement:

df %>%
  group_by(id) %>%
  arrange(stopSequence) %>%
  filter(row_number()==1 | row_number()==n())
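As a variation on the statement above (a sketch, not part of the original answer), slice can take both positions at once. This assumes a dplyr version where n() is usable inside slice, which the question's own code already relies on. The name firstLast is made up for illustration, and unique() is there so a group with a single row is returned once rather than twice.

```r
library(dplyr)

# Same data frame as in the question
df <- data.frame(id = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                 stopId = c("a", "b", "c", "a", "b", "c", "a", "b", "c"),
                 stopSequence = c(1, 2, 3, 3, 1, 4, 3, 1, 2))

# firstLast is a hypothetical name, not from the original post
firstLast <- df %>%
  arrange(id, stopSequence) %>%     # order rows within each id by stopSequence
  group_by(id) %>%
  slice(unique(c(1, n()))) %>%      # first and last row of each group
  ungroup()

print(firstLast)
```

With the example data this should give six rows, two per id: the row with the lowest stopSequence and the row with the highest.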