I’m trying to reshape my data from long to wide format using the dcast
function from reshape2.
I want to pass several variables to the value.var
parameter, but R doesn't let me use more than one value there.
Is there another way to do this? I've looked at similar questions but haven't been able to find a comparable example.
Here is my current dataset:
+---------+------+--------+--------------+------------+
| Country | Year | Growth | Unemployment | Population |
+---------+------+--------+--------------+------------+
| A | 2015 | 2 | 8.3 | 40 |
| B | 2015 | 3 | 9.2 | 32 |
| C | 2015 | 2.5 | 9.1 | 30 |
| D | 2015 | 1.5 | 6.1 | 27 |
| A | 2016 | 4 | 8.1 | 42 |
| B | 2016 | 3.5 | 9 | 32.5 |
| C | 2016 | 3.7 | 9 | 31 |
| D | 2016 | 3.1 | 5.3 | 29 |
| A | 2017 | 4.5 | 8.1 | 42.5 |
| B | 2017 | 4.4 | 8.4 | 33 |
| C | 2017 | 4.3 | 8.5 | 30 |
| D | 2017 | 4.2 | 5.2 | 30 |
+---------+------+--------+--------------+------------+
My objective is to spread the Year column across the remaining columns (Growth, Unemployment and Population). I’m using the dcast call below.
data_wide <- dcast(world, Country ~ Year,
                   value.var = c("Growth", "Unemployment", "Population"))
Ideal outcome
+---------+-------------+-------------------+-----------------+-------------+-------------------+-----------------+
| Country | Growth_2015 | Unemployment_2015 | Population_2015 | Growth_2016 | Unemployment_2016 | Population_2016 |
+---------+-------------+-------------------+-----------------+-------------+-------------------+-----------------+
| A | 2 | 8.3 | 40 | 4 | 8.1 | 42 |
| B | 3 | 9.2 | 32 | 3.5 | 9 | 32.5 |
| C | 2.5 | 9.1 | 30 | 3.7 | 9 | 31 |
| D | 1.5 | 6.1 | 27 | 3.1 | 5.3 | 29 |
+---------+-------------+-------------------+-----------------+-------------+-------------------+-----------------+
If you're not married to a dcast solution, I personally find tidyr easier.
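library(tidyr)
df <- df %>%
  # stack Growth, Unemployment and Population into key/value pairs
  gather(key, value, -Country, -Year) %>%
  # join variable name and year, e.g. "Growth_2015" (default sep = "_")
  unite(new.col, c(key, Year)) %>%
  # spread the combined names back out into one column each
  spread(new.col, value)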
Result
Country Growth_2015 Growth_2016 Growth_2017 Population_2015 Population_2016 Population_2017 Unemployment_2015 Unemployment_2016 Unemployment_2017
1 A 2.0 4.0 4.5 40 42.0 42.5 8.3 8.1 8.1
2 B 3.0 3.5 4.4 32 32.5 33.0 9.2 9.0 8.4
3 C 2.5 3.7 4.3 30 31.0 30.0 9.1 9.0 8.5
4 D 1.5 3.1 4.2 27 29.0 30.0 6.1 5.3 5.2
This works by:
1. Stacking all values into one column (gather).
2. Combining the variable name and year columns into a single column (unite).
3. Spreading the new column into wide format (spread).
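For what it's worth, newer tidyr versions (1.0.0 and later) ship pivot_wider(), which as far as I can tell covers the same three steps in a single call. A minimal sketch, assuming the data frame is called world as in the question's dcast call; it is rebuilt by hand here from the table in the question, and world_wide is just an illustrative name:

```r
library(tidyr)

# Rebuild the question's data by hand (the name `world` is taken from the
# question's dcast call; the values are copied from the posted table)
world <- data.frame(
  Country      = rep(c("A", "B", "C", "D"), times = 3),
  Year         = rep(c(2015, 2016, 2017), each = 4),
  Growth       = c(2, 3, 2.5, 1.5, 4, 3.5, 3.7, 3.1, 4.5, 4.4, 4.3, 4.2),
  Unemployment = c(8.3, 9.2, 9.1, 6.1, 8.1, 9, 9, 5.3, 8.1, 8.4, 8.5, 5.2),
  Population   = c(40, 32, 30, 27, 42, 32.5, 31, 29, 42.5, 33, 30, 30)
)

# One row per Country; each value column is split by Year, and the new
# names are built as <variable>_<year> (pivot_wider's default separator is "_")
world_wide <- pivot_wider(
  world,
  names_from  = Year,
  values_from = c(Growth, Unemployment, Population)
)

print(world_wide)
```

The columns come out grouped by variable (Growth_2015, Growth_2016, ...) rather than by year as in the ideal outcome, so reorder them afterwards if the order matters. If you'd rather keep the dcast syntax, data.table's version of dcast accepts a vector for value.var, unlike reshape2's.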