I'm a fan of the revalue function in plyr for substituting strings. It's simple and easy to remember.
However, I've migrated new code to dplyr, which doesn't appear to have a revalue function. What is the accepted idiom in dplyr for doing things previously done with revalue?
There is a recode function available starting with dplyr version 0.5.0, which looks very similar to revalue from plyr.
Example built from the Examples section of the recode documentation:
library(dplyr)
set.seed(16)
x = sample(c("a", "b", "c"), 10, replace = TRUE)
x
[1] "c" "a" "b" "a" "c" "a" "a" "c" "c" "a"
recode(x, a = "Apple", b = "Bear", c = "Car")
[1] "Car" "Apple" "Bear" "Apple" "Car" "Apple" "Apple" "Car" "Car" "Apple"
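If you only define some of the values that you want to recode, by default the rest are filled with NA (this is the behavior as of dplyr 0.5.0; later releases leave unmatched values unchanged when the replacements have the same type as the input).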
recode(x, a = "Apple", c = "Car")
[1] "Car" "Apple" NA "Apple" "Car" "Apple" "Apple" "Car" "Car" "Apple"
This behavior can be changed using the .default argument.
recode(x, a = "Apple", c = "Car", .default = x)
[1] "Car" "Apple" "b" "Apple" "Car" "Apple" "Apple" "Car" "Car" "Apple"
There is also a .missing argument if you want to replace missing values with something else.
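As a minimal sketch of how .missing combines with .default, assuming dplyr is installed: the vector y and the label "Unknown" below are made up for illustration and are not from the documentation.

```r
library(dplyr)

# Hypothetical input containing a missing value (not from the recode docs)
y <- c("a", "b", NA, "c")

# .default = y keeps values that have no replacement ("b" here);
# .missing supplies the replacement for NA entries
res <- recode(y, a = "Apple", c = "Car", .default = y, .missing = "Unknown")
res
```

Here "b" should pass through unchanged and the NA should come back as "Unknown".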