How to spread or cast multiple value columns in R

Rokmc1050 · Sep 24, 2014 · Viewed 16.7k times

Here is a toy data set for this example:

data <- data.frame(x=rep(c("red","blue","green"),each=4), y=rep(letters[1:4],3), value.1 = 1:12, value.2 = 13:24)

       x y value.1 value.2
1    red a       1      13
2    red b       2      14
3    red c       3      15
4    red d       4      16
5   blue a       5      17
6   blue b       6      18
7   blue c       7      19
8   blue d       8      20
9  green a       9      21
10 green b      10      22
11 green c      11      23
12 green d      12      24

How can I cast or spread the variable y to produce the following wide data.frame?

     x a.value.1 b.value.1 c.value.1 d.value.1 a.value.2 b.value.2 c.value.2 d.value.2
1  blue         5         6         7         8        17        18        19        20
2 green         9        10        11        12        21        22        23        24
3   red         1         2         3         4        13        14        15        16

Answer

akrun · Sep 24, 2014

We could do this using dplyr/tidyr. First, reshape 'data' to 'long' format with gather, which stacks the selected columns (starts_with('value')) into a key/value column pair ('Var'/'val'). Then unite the 'Var' and 'y' columns into a single 'Var1' column, and convert back to 'wide' format with spread.

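 library(dplyr)
 library(tidyr)
 data %>%
      # stack value.1 and value.2 into a key column (Var) and a value column (val)
      gather(Var, val, starts_with("value")) %>%
      # paste Var and y together, e.g. "value.1_a"
      unite(Var1, Var, y) %>%
      # spread each Var1 level into its own column
      spread(Var1, val)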

 #      x value.1_a value.1_b value.1_c value.1_d value.2_a value.2_b   value.2_c
 #1   blue         5         6         7         8        17        18        19
 #2  green         9        10        11        12        21        22        23
 #3    red         1         2         3         4        13        14        15
 #    value.2_d
 #1        20
 #2        24
 #3        16
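
For readers on a newer tidyr (1.0.0 or later), the same reshape can probably be done in one step with pivot_wider, which supersedes gather/spread. This is a minimal sketch, not part of the original answer: the names_glue pattern is my own choice to match the "a.value.1" naming asked for in the question, and the final ordering step is only there to mimic the sorted rows shown above.

```r
library(tidyr)

# Same toy data as in the question
data <- data.frame(
  x = rep(c("red", "blue", "green"), each = 4),
  y = rep(letters[1:4], 3),
  value.1 = 1:12,
  value.2 = 13:24
)

# Spread y across both value columns in a single call.
# names_glue builds each new column name as "<y>.<value column>".
wide <- pivot_wider(
  data,
  names_from  = y,
  values_from = c(value.1, value.2),
  names_glue  = "{y}.{.value}"
)

# pivot_wider keeps rows in order of first appearance, so sort by x
# to get blue, green, red as in the desired output.
wide <- wide[order(wide$x), ]
wide
```

Dropping names_glue should fall back to tidyr's default naming, which puts the value column first (value.1_a and so on), the same pattern as the gather/unite/spread output.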

Update

(After 6 months)

Reshaping multiple value columns to wide is now possible with dcast from data.table v1.9.5, without a preceding melt. This requires the devel version of data.table.

 library(data.table)
 dcast(setDT(data), x~y, value.var=c('value.1', 'value.2'))
 #       x a_value.1 b_value.1 c_value.1 d_value.1 a_value.2 b_value.2 c_value.2
 #1:  blue         5         6         7         8        17        18        19
 #2: green         9        10        11        12        21        22        23
 #3:   red         1         2         3         4        13        14        15
 #   d_value.2
 #1:        20
 #2:        24
 #3:        16
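
As far as I know, multi-column value.var has since shipped in the released data.table, so the devel version should no longer be needed. The sketch below assumes a current release, where the new columns are named with the value column first (value.1_a rather than a_value.1). The renaming step is my own addition to get the "a.value.1" pattern from the question, and dt, wide and old are illustrative names.

```r
library(data.table)

# Same toy data as in the question, built directly as a data.table
dt <- data.table(
  x = rep(c("red", "blue", "green"), each = 4),
  y = rep(letters[1:4], 3),
  value.1 = 1:12,
  value.2 = 13:24
)

# Cast both value columns at once; the result is sorted by x
wide <- dcast(dt, x ~ y, value.var = c("value.1", "value.2"))

# Optional: swap "<value column>_<y>" to "<y>.<value column>".
# Assumes the new column names follow the "<value column>_<y>" pattern.
old <- setdiff(names(wide), "x")
setnames(wide, old, sub("^(.*)_(.*)$", "\\2.\\1", old))
wide
```

If the default underscore-separated names are acceptable, the setnames step can simply be dropped.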