I have a list of data indicating attendance at conferences, like this:
Event Participant
ConferenceA John
ConferenceA Joe
ConferenceA Mary
ConferenceB John
ConferenceB Ted
ConferenceC Jessica
I would like to create a binary indicator attendance matrix of the following format:
Event John Joe Mary Ted Jessica
ConferenceA 1 1 1 0 0
ConferenceB 1 0 0 1 0
ConferenceC 0 0 0 0 1
Is there a way to do this in R?
Assuming your data.frame is called "mydf", simply use table:
> table(mydf)
             Participant
Event         Jessica Joe John Mary Ted
  ConferenceA       0   1    1    1   0
  ConferenceB       0   0    1    0   1
  ConferenceC       1   0    0    0   0
If there is a chance that someone is listed for the same conference more than once, table will return a count greater than 1 for that cell. In that case, you can simply recode all values greater than 1 to 1, like this:
temp <- table(mydf)
temp[temp > 1] <- 1
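As a rough sketch of the same idea, the duplicates can also be dropped before tabulating, or the counts can be thresholded in one step. The data frame "dup_df" below is hypothetical and exists only to show a repeated record; it is not part of the original question.

```r
# Hypothetical data: John is listed twice for ConferenceA
dup_df <- data.frame(
  Event = c("ConferenceA", "ConferenceA", "ConferenceB"),
  Participant = c("John", "John", "Ted"),
  stringsAsFactors = FALSE
)

counts <- table(dup_df)           # the ConferenceA/John cell holds 2
binary <- table(unique(dup_df))   # remove duplicate rows first, so every cell is 0 or 1
binary2 <- (counts > 0) + 0L      # or turn any positive count into 1 directly
```

Both variants should still give a table object, just like the recoding approach.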
Note that this returns a table. If you want a data.frame to be returned, use as.data.frame.matrix:
> as.data.frame.matrix(table(mydf))
            Jessica Joe John Mary Ted
ConferenceA       0   1    1    1   0
ConferenceB       0   0    1    0   1
ConferenceC       1   0    0    0   0
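The output above sorts participants alphabetically and keeps the events as row names, whereas the format in the question lists participants in order of first appearance and has an "Event" column. One possible way to get closer to that layout is sketched below; the names "ordered_df" and "out" are placeholders chosen for this example.

```r
# Same data as in the question
mydf <- data.frame(
  Event = c("ConferenceA", "ConferenceA", "ConferenceA",
            "ConferenceB", "ConferenceB", "ConferenceC"),
  Participant = c("John", "Joe", "Mary", "John", "Ted", "Jessica"),
  stringsAsFactors = FALSE
)

# Work on a copy, and fix the column order to order of first appearance
ordered_df <- mydf
ordered_df$Participant <- factor(ordered_df$Participant,
                                 levels = unique(ordered_df$Participant))

out <- as.data.frame.matrix(table(ordered_df))

# Move the row names into an ordinary "Event" column
out <- cbind(Event = rownames(out), out)
rownames(out) <- NULL
out
```

Setting the factor levels explicitly is what controls the column order, since table arranges columns by level.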
In the above, "mydf" is defined as:
mydf <- structure(list(Event = c("ConferenceA", "ConferenceA",
"ConferenceA", "ConferenceB", "ConferenceB", "ConferenceC"),
Participant = c("John", "Joe", "Mary", "John", "Ted", "Jessica")),
.Names = c("Event", "Participant"), class = "data.frame",
row.names = c(NA, -6L))
Please share your data in a similar manner in the future.
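A definition like the one above can usually be generated with dput, which prints R code that recreates an object. A minimal sketch, using a small hypothetical data frame named "small_df":

```r
# Hypothetical example data
small_df <- data.frame(
  Event = c("ConferenceA", "ConferenceB"),
  Participant = c("John", "Ted"),
  stringsAsFactors = FALSE
)

# Prints a structure(...) call that can be pasted into a question
dput(small_df)

# The same text captured as a character vector, e.g. for copying elsewhere
dput_text <- capture.output(dput(small_df))
```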