I have a need that I imagine could be satisfied by aggregate
or reshape
, but I can't quite figure out.
I have a list of names (brand
), and accompanying ID number (id
). This data is in long form, so names can have multiple ID's. I'd like to de-dupicate by the name (brand
) and concatenate the multiple possible id
's into a string separated by a comment.
For example:
brand id
RadioShack 2308
Rag & Bone 4466
Ragu 1830
Ragu 4518
Ralph Lauren 1638
Ralph Lauren 2719
Ralph Lauren 2720
Ralph Lauren 2721
Ralph Lauren 2722
should become:
RadioShack 2308
Rag & Bone 4466
Ragu 1830,4518
Ralph Lauren 1638,2719,2720,2721,2722
How would I accomplish this?
Let's call your data.frame DF
> aggregate(id ~ brand, data = DF, c)
brand id
1 RadioShack 2308
2 Rag & Bone 4466
3 Ragu 1830, 4518
4 Ralph Lauren 1638, 2719, 2720, 2721, 2722
Another alternative using aggregate
is:
result <- aggregate(id ~ brand, data = DF, paste, collapse = ",")
This produces the same result and now id
is not a list
anymore. Thanks to @Frank comment. To see the class
of each column try:
> sapply(result, class)
brand id
"factor" "character"
As mentioned by @DavidArenburg in the comments, another alternative is using the toString
function:
aggregate(id ~ brand, data = DF, toString)