is there an elegant way to handle NA as 0 (na.rm = TRUE) in dplyr?
data <- data.frame(a=c(1,2,3,4), b=c(4,NA,5,6), c=c(7,8,9,NA))
data %>% mutate(sum = a + b + c)
a b c sum
1 4 7 12
2 NA 8 NA
3 5 9 17
4 6 NA NA
but I like to get
a b c sum
1 4 7 12
2 NA 8 10
3 5 9 17
4 6 NA 10
even if I know that this is not the desired result in many other cases
You could use this:
library(dplyr)
data %>%
#rowwise will make sure the sum operation will occur on each row
rowwise() %>%
#then a simple sum(..., na.rm=TRUE) is enough to result in what you need
mutate(sum = sum(a,b,c, na.rm=TRUE))
Output:
Source: local data frame [4 x 4]
Groups: <by row>
a b c sum
(dbl) (dbl) (dbl) (dbl)
1 1 4 7 12
2 2 NA 8 10
3 3 5 9 17
4 4 6 NA 10