Transpose only certain columns in data.frame

Ken · Mar 21, 2016 · Viewed 12.1k times

Here is the data I have:

           am   group  v1  v2  v3    v4
1  2015-10-31       A 693 803 700   17%
2  2015-10-31       B 524 859 302   77%
3  2015-10-31       C 266 675 86    7%
4  2015-10-31       D 376 455 650   65%
5  2015-11-30       A 618 715 200   38%
6  2015-11-30       B 249 965 215   54%
7  2015-11-30       C 881 106 184   24%
8  2015-11-30       D 033 047 492   46%
9  2015-12-31       A 229 994 720   19%
10 2015-12-31       B 539 543 332   57%
11 2015-12-31       C 100 078 590   24%
12 2015-12-31       D 517 413 716   57%

Question: How can I reshape the data so that

  1. v1-v4 are transposed into rows,
  2. the values of am become the column names, and
  3. each group is repeated once for each of v1-v4?

The result I'd like to produce:

group metric 2015-10-31 2015-11-30 2015-12-31
    A     v1        693        618        229
    A     v2        803        715        994 
    A     v3        700        200        720
    A     v4        17%        38%        19%
    B     v1        524        249        539
    B     v2        859        965        543 
    B     v3        302        215        332
    B     v4        77%        54%        57%
    ...

What I have tried so far:

name <- mydata$am
data <- as.data.frame(t(mydata[, -1]))
colnames(data) <- name

This doesn't handle the group variable the way I want.

Thanks for your help.

Answer

A5C1D2H2I1M1N2O1R2T1 · Mar 21, 2016

The basic idea is to reshape the data to a "long" format first, and then from there to a "wide" format.

Here are a few ways to do this (the data.frame is called mydf in the code below):

melt + dcast

library(data.table) ## or library(reshape2)
dcast(melt(as.data.table(mydf), id.vars = c("am", "group")), 
      group + variable ~ am, value.var = "value")
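
To see what each of the two steps does, the nested call can be split apart. This is only a minimal sketch on four rows taken from the question's data; it assumes v1-v3 are numeric and v4 is a character column, and the names long, wide and metric are illustrative choices, not part of the original answer.

```r
library(data.table)

# Four rows from the question (groups A and B, first two months).
# Assumption: v1-v3 are numeric, v4 is character.
mydf <- data.frame(
  am    = rep(c("2015-10-31", "2015-11-30"), each = 2),
  group = rep(c("A", "B"), times = 2),
  v1    = c(693, 524, 618, 249),
  v2    = c(803, 859, 715, 965),
  v3    = c(700, 302, 200, 215),
  v4    = c("17%", "77%", "38%", "54%"),
  stringsAsFactors = FALSE
)

# Step 1: wide -> long. Each v1..v4 cell becomes its own row,
# identified by am, group and the name of the column it came from.
long <- melt(as.data.table(mydf), id.vars = c("am", "group"),
             variable.name = "metric", value.name = "value")

# Step 2: long -> wide. One row per group/metric pair,
# one column per distinct value of am.
wide <- dcast(long, group + metric ~ am, value.var = "value")
print(wide)
```

Because v4 holds strings such as "17%", melt() coerces the whole value column to character and warns about the mixed types.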

recast

(This is basically the same as above, but in one step.)

library(reshape2)
recast(mydf, group + variable ~ am, id.var = c("am", "group"))

gather + spread

library(dplyr)
library(tidyr)

mydf %>%
  gather(key, value, v1:v4) %>%
  spread(am, value)
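
In tidyr 1.0.0 and later, gather() and spread() are superseded by pivot_longer() and pivot_wider(). A rough equivalent of the pipeline above might look like the sketch below. It assumes numeric v1-v3 and a character v4, uses four rows from the question, and the column names metric and value are my own choice. Unlike gather(), pivot_longer() refuses to combine numeric and character columns, so the values are converted to character explicitly.

```r
library(tidyr)

# Four rows from the question (groups A and B, first two months).
# Assumption: v1-v3 are numeric, v4 is character.
mydf <- data.frame(
  am    = rep(c("2015-10-31", "2015-11-30"), each = 2),
  group = rep(c("A", "B"), times = 2),
  v1    = c(693, 524, 618, 249),
  v2    = c(803, 859, 715, 965),
  v3    = c(700, 302, 200, 215),
  v4    = c("17%", "77%", "38%", "54%"),
  stringsAsFactors = FALSE
)

# Wide -> long. values_transform converts every measure column to
# character so that numeric and "17%"-style values can share a column
# (this needs a tidyr version that provides values_transform).
long <- pivot_longer(mydf, cols = v1:v4,
                     names_to = "metric", values_to = "value",
                     values_transform = list(value = as.character))

# Long -> wide. Each distinct am value becomes a column.
wide <- pivot_wider(long, names_from = am, values_from = value)
print(wide)
```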

reshape

reshape(cbind(mydf[c(1, 2)], stack(mydf[-c(1, 2)])), 
        direction = "wide", idvar = c("group", "ind"), timevar = "am")
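
The reshape() result keeps the helper names produced by stack() (a column called ind, and a "values." prefix on every date column), and its rows come out ordered by metric rather than by group. A small cleanup sketch follows, using four rows from the question and assuming v1-v3 are numeric and v4 is character; the name metric is just the label used in the desired output.

```r
# Four rows from the question (groups A and B, first two months).
# Assumption: v4 is character. A factor column would be ignored by
# stack() with a warning, since stack() only keeps plain vectors.
mydf <- data.frame(
  am    = rep(c("2015-10-31", "2015-11-30"), each = 2),
  group = rep(c("A", "B"), times = 2),
  v1    = c(693, 524, 618, 249),
  v2    = c(803, 859, 715, 965),
  v3    = c(700, 302, 200, 215),
  v4    = c("17%", "77%", "38%", "54%"),
  stringsAsFactors = FALSE
)

# Long format: am and group are recycled alongside the stacked values.
long <- cbind(mydf[c("am", "group")], stack(mydf[c("v1", "v2", "v3", "v4")]))

# Wide format: one column per am value, named "values.<am>".
wide <- reshape(long, direction = "wide",
                idvar = c("group", "ind"), timevar = "am")

# Cleanup: order by group then metric, drop the "values." prefix,
# and rename the stack() helper column.
wide <- wide[order(wide$group, wide$ind), ]
names(wide) <- sub("^values\\.", "", names(wide))
names(wide)[names(wide) == "ind"] <- "metric"
rownames(wide) <- NULL
print(wide)
```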