When working with plyr
I often found it useful to use adply
for scalar functions that I have to apply to each and every row.
e.g.
data(iris)
library(plyr)
head(
adply(iris, 1, transform , Max.Len= max(Sepal.Length,Petal.Length))
)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species Max.Len
1 5.1 3.5 1.4 0.2 setosa 5.1
2 4.9 3.0 1.4 0.2 setosa 4.9
3 4.7 3.2 1.3 0.2 setosa 4.7
4 4.6 3.1 1.5 0.2 setosa 4.6
5 5.0 3.6 1.4 0.2 setosa 5.0
6 5.4 3.9 1.7 0.4 setosa 5.4
Now I'm using dplyr
more, I'm wondering if there is a tidy/natural way to do this? As this is NOT what I want:
library(dplyr)
head(
mutate(iris, Max.Len= max(Sepal.Length,Petal.Length))
)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species Max.Len
1 5.1 3.5 1.4 0.2 setosa 7.9
2 4.9 3.0 1.4 0.2 setosa 7.9
3 4.7 3.2 1.3 0.2 setosa 7.9
4 4.6 3.1 1.5 0.2 setosa 7.9
5 5.0 3.6 1.4 0.2 setosa 7.9
6 5.4 3.9 1.7 0.4 setosa 7.9
As of dplyr 0.2 (I think) rowwise()
is implemented, so the answer to this problem becomes:
iris %>%
rowwise() %>%
mutate(Max.Len= max(Sepal.Length,Petal.Length))
rowwise
alternativeFive years (!) later this answer still gets a lot of traffic. Since it was given, rowwise
is increasingly not recommended, although lots of people seem to find it intuitive. Do yourself a favour and go through Jenny Bryan's Row-oriented workflows in R with the tidyverse material to get a good handle on this topic.
The most straightforward way I have found is based on one of Hadley's examples using pmap
:
iris %>%
mutate(Max.Len= purrr::pmap_dbl(list(Sepal.Length, Petal.Length), max))
Using this approach, you can give an arbitrary number of arguments to the function (.f
) inside pmap
.
pmap
is a good conceptual approach because it reflects the fact that when you're doing row wise operations you're actually working with tuples from a list of vectors (the columns in a dataframe).