I want to create 7 dummy variables -one for each day, using dplyr
So far, I have managed to do it using the sjmisc
package and the to_dummy
function, but I do it in 2 steps -1.Create a df of dummies, 2) append to the original df
#Sample dataframe
mydfdata.frame(x=rep(letters[1:9]),
day=c("Mon","Tues","Wed","Thurs","Fri","Sat","Sun","Fri","Mon"))
#1.Create the 7 dummy variables separately
daysdummy<-sjmisc::to_dummy(mydf$day,suffix="label")
#2. append to dataframe
mydf<-bind_cols(mydf,daysdummy)
> mydf
x day day_Fri day_Mon day_Sat day_Sun day_Thurs day_Tues day_Wed
1 a Mon 0 1 0 0 0 0 0
2 b Tues 0 0 0 0 0 1 0
3 c Wed 0 0 0 0 0 0 1
4 d Thurs 0 0 0 0 1 0 0
5 e Fri 1 0 0 0 0 0 0
6 f Sat 0 0 1 0 0 0 0
7 g Sun 0 0 0 1 0 0 0
8 h Fri 1 0 0 0 0 0 0
9 i Mon 0 1 0 0 0 0 0
My question is whether I can do it in one single workflow using dplyr
and add the to_dummy
into the pipe-workflow- perhaps using mutate
?
*to_dummy
documentation
If you want to do this with the pipe, you can do something like:
library(dplyr)
library(sjmisc)
mydf %>%
to_dummy(day, suffix = "label") %>%
bind_cols(mydf) %>%
select(x, day, everything())
Returns:
# A tibble: 9 x 9 x day day_Fri day_Mon day_Sat day_Sun day_Thurs day_Tues day_Wed <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> 1 a Mon 0. 1. 0. 0. 0. 0. 0. 2 b Tues 0. 0. 0. 0. 0. 1. 0. 3 c Wed 0. 0. 0. 0. 0. 0. 1. 4 d Thurs 0. 0. 0. 0. 1. 0. 0. 5 e Fri 1. 0. 0. 0. 0. 0. 0. 6 f Sat 0. 0. 1. 0. 0. 0. 0. 7 g Sun 0. 0. 0. 1. 0. 0. 0. 8 h Fri 1. 0. 0. 0. 0. 0. 0. 9 i Mon 0. 1. 0. 0. 0. 0. 0.
With dplyr
and tidyr
we could do:
library(dplyr)
library(tidyr)
mydf %>%
mutate(var = 1) %>%
spread(day, var, fill = 0, sep = "_") %>%
left_join(mydf) %>%
select(x, day, everything())
And with base R we could do something like:
as.data.frame.matrix(table(rep(mydf$x, lengths(mydf$day)), unlist(mydf$day)))
Returns:
Fri Mon Sat Sun Thurs Tues Wed a 0 1 0 0 0 0 0 b 0 0 0 0 0 1 0 c 0 0 0 0 0 0 1 d 0 0 0 0 1 0 0 e 1 0 0 0 0 0 0 f 0 0 1 0 0 0 0 g 0 0 0 1 0 0 0 h 1 0 0 0 0 0 0 i 0 1 0 0 0 0 0