Applying a function to every row of a table using dplyr?

Stephen Henderson picture Stephen Henderson · Feb 17, 2014 · Viewed 85.7k times · Source

When working with plyr I often found it useful to use adply for scalar functions that I have to apply to each and every row.

e.g.

data(iris)
library(plyr)
head(
     adply(iris, 1, transform , Max.Len= max(Sepal.Length,Petal.Length))
    )
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species Max.Len
1          5.1         3.5          1.4         0.2  setosa     5.1
2          4.9         3.0          1.4         0.2  setosa     4.9
3          4.7         3.2          1.3         0.2  setosa     4.7
4          4.6         3.1          1.5         0.2  setosa     4.6
5          5.0         3.6          1.4         0.2  setosa     5.0
6          5.4         3.9          1.7         0.4  setosa     5.4

Now I'm using dplyr more, I'm wondering if there is a tidy/natural way to do this? As this is NOT what I want:

library(dplyr)
head(
     mutate(iris, Max.Len= max(Sepal.Length,Petal.Length))
    )
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species Max.Len
1          5.1         3.5          1.4         0.2  setosa     7.9
2          4.9         3.0          1.4         0.2  setosa     7.9
3          4.7         3.2          1.3         0.2  setosa     7.9
4          4.6         3.1          1.5         0.2  setosa     7.9
5          5.0         3.6          1.4         0.2  setosa     7.9
6          5.4         3.9          1.7         0.4  setosa     7.9

Answer

alexwhan picture alexwhan · Jul 14, 2014

As of dplyr 0.2 (I think) rowwise() is implemented, so the answer to this problem becomes:

iris %>% 
  rowwise() %>% 
  mutate(Max.Len= max(Sepal.Length,Petal.Length))

Non rowwise alternative

Five years (!) later this answer still gets a lot of traffic. Since it was given, rowwise is increasingly not recommended, although lots of people seem to find it intuitive. Do yourself a favour and go through Jenny Bryan's Row-oriented workflows in R with the tidyverse material to get a good handle on this topic.

The most straightforward way I have found is based on one of Hadley's examples using pmap:

iris %>% 
  mutate(Max.Len= purrr::pmap_dbl(list(Sepal.Length, Petal.Length), max))

Using this approach, you can give an arbitrary number of arguments to the function (.f) inside pmap.

pmap is a good conceptual approach because it reflects the fact that when you're doing row wise operations you're actually working with tuples from a list of vectors (the columns in a dataframe).