I have ten datasets that have been read from Excel files, using the xlsx library, and stored in tibbles. I want to merge them.
Here are two example datasets. The number of variables differs between datasets, and some variables appear in only one dataset. The values of the person variable never overlap between datasets.
library(tibble)

data1 <- tibble(person = c("A", "B", "C"),
                test1 = as.factor(c(1, 4, 5)),
                test2 = c(14, 25, 10),
                test3 = c(12.5, 16.0, 4),
                test4 = c(16, 23, 21),
                test5 = as.factor(c(49, 36, 52)))
data2 <- tibble(person = c("D", "E", "F"),
                test1 = c(8, 7, 2),
                test3 = c(6.5, 12.0, 19.5),
                test4 = as.factor(c(15, 21, 29)),
                test5 = as.factor(c(54, 51, 36)),
                test6 = c(32, 32, 29),
                test7 = c(13, 11, 10))
The actual datasets usually have ~50 rows and ~200 variables in them. I have tried
all_data <- dplyr::bind_rows(data1,data2)
hoping to get this outcome
# A tibble: 6 x 8
  person test1 test2 test3 test4 test5 test6 test7
  <chr>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 A          1    14  12.5    16    49    NA    NA
2 B          4    25  16.0    23    36    NA    NA
3 C          5    10   4.0    21    52    NA    NA
4 D          8    NA   6.5    15    54    32    13
5 E          7    NA  12.0    21    51    32    11
6 F          2    NA  19.5    29    36    29    10
but instead I get this error
Error in bind_rows_(x, .id) : Column `test1` can't be converted from factor to numeric
I have searched Stack Overflow and found related questions, but most answers center on converting the variables to another class. I don't care which classes my variables have, because I will just write the merged dataset to a CSV or Excel file.
Isn't there some kind of simple workaround?
I think that this should work:
library(plyr)
# rbind.fill() binds the rows and fills columns missing from a dataset with NA
all_data <- rbind.fill(data1, data2)
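
Since the classes don't matter to you, another option is to convert every column to character before binding, so that columns sharing a name always share a type. Below is a minimal sketch of that idea, not a tested solution for your real files: the helper name to_character is made up for illustration, the two small tibbles are cut-down stand-ins for your data, and the output goes to a temporary file rather than a real path.

```r
library(tibble)
library(dplyr)

# Small stand-ins for the real datasets: test1 is a factor in one
# tibble and numeric in the other, and each has a column the other lacks.
data1 <- tibble(person = c("A", "B", "C"),
                test1  = as.factor(c(1, 4, 5)),
                test2  = c(14, 25, 10))
data2 <- tibble(person = c("D", "E", "F"),
                test1  = c(8, 7, 2),
                test6  = c(32, 32, 29))

# Hypothetical helper: convert every column to character so that
# columns with the same name have the same type in every dataset.
to_character <- function(df) {
  df[] <- lapply(df, as.character)
  df
}

# Convert each dataset, then bind the rows; columns that are missing
# from a dataset are filled with NA.
datasets <- list(data1, data2)
all_data <- bind_rows(lapply(datasets, to_character))

# Write the merged result to a CSV file (a temporary path here).
out_file <- tempfile(fileext = ".csv")
write.csv(all_data, out_file, row.names = FALSE)
```

With ten datasets you would put all ten tibbles in the list. Converting factors with as.character keeps the labels (for example "49") rather than the internal integer codes, which is usually what you want when the data goes straight to a file.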