I am trying to convert the following format:
mydata <- data.frame(movie = c("Titanic", "Departed"),
                     actor1 = c("Leo", "Jack"),
                     actor2 = c("Kate", "Leo"))
     movie actor1 actor2
1  Titanic    Leo   Kate
2 Departed   Jack    Leo
to binary response variables:
     movie Leo Kate Jack
1  Titanic   1    1    0
2 Departed   1    0    1
I tried the solution described in "Convert row data to binary columns", but I could only get it to work for two variables, not three.
I would really appreciate a clean way to do this.
Here is a solution via tidyr:
library(dplyr)
library(tidyr)
mydata %>%
  gather(actor, name, starts_with("actor")) %>%
  mutate(present = 1) %>%
  select(-actor) %>%
  spread(name, present, fill = 0)
     movie Jack Kate Leo
1 Departed    1    0   1
2  Titanic    0    1   1
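In more recent tidyr releases, gather() and spread() are superseded by pivot_longer() and pivot_wider(). The same reshape can probably be written as in the sketch below, assuming tidyr 1.0.0 or later is installed. The name `result` is only an illustrative variable, not part of the original answer.

```r
library(dplyr)
library(tidyr)

mydata <- data.frame(movie = c("Titanic", "Departed"),
                     actor1 = c("Leo", "Jack"),
                     actor2 = c("Kate", "Leo"))

result <- mydata %>%
  # Stack actor1/actor2 into one row per (movie, actor slot) pair
  pivot_longer(starts_with("actor"), names_to = "actor", values_to = "name") %>%
  # Flag every observed movie/actor combination
  mutate(present = 1) %>%
  # The slot label (actor1/actor2) is no longer needed
  select(-actor) %>%
  # One column per actor name; combinations that never occur become 0
  pivot_wider(names_from = name, values_from = present,
              values_fill = list(present = 0))

result
```

Unlike spread(), pivot_wider() should keep rows and new columns in order of first appearance rather than sorting them, so the layout is expected to follow the input order. Note also that both versions assume an actor is listed at most once per movie.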