Using dplyr and stringr to replace all values that start with a given string

Tomas Ericsson · May 4, 2017 · Viewed 12.9k times

My data frame:

> df <- data.frame(food = c("fruit banana", "fruit apple", "fruit grape", "bread", "meat"), sold = rnorm(5, 100))
>   df
          food      sold
1 fruit banana  99.47171
2  fruit apple  99.40878
3  fruit grape  99.28727
4        bread  99.15934
5         meat 100.53438

Now I want to replace every value in food that starts with "fruit" with just "fruit", then group by food and summarise sold as the sum of sold.

> df %>%
+     mutate(food = replace(food, str_detect(food, "fruit"), "fruit")) %>% 
+     group_by(food) %>% 
+     summarise(sold = sum(sold))
Source: local data frame [3 x 2]

    food      sold
  (fctr)     (dbl)
1  bread  99.15934
2   meat 100.53438
3     NA 298.16776

Why is this command not working? It gives me NA instead of "fruit".

Answer

PKumar · May 4, 2017

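It works for me, so I think your food column is a factor. "fruit" is not one of its levels, so replace() inserts NA (with an "invalid factor level" warning) instead of the new value.

Pass stringsAsFactors = FALSE when creating the data frame, as below, or run options(stringsAsFactors = FALSE) in your R session to avoid this: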

library(dplyr)
library(stringr)

df <- data.frame(food = c("fruit banana", "fruit apple", "fruit grape", "bread", "meat"), sold = rnorm(5, 100), stringsAsFactors = FALSE)

df %>%
  mutate(food = replace(food, str_detect(food, "fruit"), "fruit")) %>%
  group_by(food) %>%
  summarise(sold = sum(sold))

Output:

 # A tibble: 3 × 2
       food      sold
      <chr>     <dbl>
    1 bread  99.67661
    2 fruit 300.28520
    3  meat  99.88566
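
If you cannot change how the data frame is created, you can convert the column inside the pipeline instead. The following is only a sketch of that idea, not the one correct way to do it. It assumes fixed illustrative values for sold (instead of rnorm()) so the totals are predictable, and it sets stringsAsFactors = TRUE explicitly to reproduce the factor column, because data.frame() in R 4.0.0 and later creates character columns by default. It also anchors the pattern as "^fruit" so that only values starting with "fruit" match, which the unanchored "fruit" does not guarantee.

```r
library(dplyr)
library(stringr)

# Illustrative data: fixed sold values (an assumption, not the original rnorm() draw)
# and an explicit factor column to reproduce the problem on any R version.
df <- data.frame(
  food = c("fruit banana", "fruit apple", "fruit grape", "bread", "meat"),
  sold = c(10, 20, 30, 40, 50),
  stringsAsFactors = TRUE
)

# "fruit" is not among these levels, which is why replace() produced NA.
levels(df$food)

result <- df %>%
  mutate(
    food = as.character(food),                               # factor -> character
    food = if_else(str_detect(food, "^fruit"), "fruit", food) # "^" matches only at the start
  ) %>%
  group_by(food) %>%
  summarise(sold = sum(sold))

result
```

With these made-up numbers, the three fruit rows collapse into a single "fruit" row with sold equal to 60, and no NA group appears.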