Reputation: 155
I have ten datasets that have been read from Excel files, using the xlsx library, and stored in tibbles. I want to merge them.
Here are example datasets. The number of variables differs between datasets, and some variables appear in only one dataset. The values of the person variable never overlap.
data1 <- tibble(person = c("A","B","C"),
                test1 = as.factor(c(1,4,5)),
                test2 = c(14,25,10),
                test3 = c(12.5,16.0,4),
                test4 = c(16,23,21),
                test5 = as.factor(c(49,36,52)))
data2 <- tibble(person = c("D","E","F"),
                test1 = c(8,7,2),
                test3 = c(6.5,12.0,19.5),
                test4 = as.factor(c(15,21,29)),
                test5 = as.factor(c(54,51,36)),
                test6 = c(32,32,29),
                test7 = c(13,11,10))
The actual datasets usually have ~50 rows and ~200 variables in them. I have tried
all_data <- dplyr::bind_rows(data1,data2)
hoping to get this outcome
# A tibble: 6 x 8
  person test1 test2 test3 test4 test5 test6 test7
  <chr>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 A          1    14  12.5    16    49    NA    NA
2 B          4    25  16.0    23    36    NA    NA
3 C          5    10   4.0    21    52    NA    NA
4 D          8    NA   6.5    15    54    32    13
5 E          7    NA  12.0    21    51    32    11
6 F          2    NA  19.5    29    36    29    10
but instead I get this error
Error in bind_rows_(x, .id) : Column `test1` can't be converted from factor to numeric
I have searched Stack Overflow and found questions about this error, and most answers center on converting the variables to another class. But I don't care which classes my variables have, because I will just write the merged dataset to a CSV or Excel file.
Isn't there some kind of simple workaround?
Upvotes: 14
Views: 29655
Reputation: 2157
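test1 is of class factor in data1 but of class numeric in data2, and bind_rows cannot combine a factor column with a numeric one. The same mismatch affects test4, which is numeric in data1 and a factor in data2. Either convert the mismatched columns to the same class in both data1 and data2 and then use all_data <- dplyr::bind_rows(data1,data2)
or pass the datasets as a list, with fill = TRUE so that columns missing from one dataset are filled with NA:
data.table::rbindlist(list(data1, data2), fill = TRUE)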
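As a minimal sketch of the first option, the factor columns can be turned back into numbers before binding. It assumes dplyr 1.0 or later (for across() and where()) and that every factor really holds numeric labels; the two small tibbles are cut-down stand-ins for the question's data. Note that the conversion goes through as.character(), because calling as.numeric() directly on a factor returns the internal level codes rather than the labels.

```r
library(dplyr)

# Cut-down stand-ins for the question's data: test1 is a factor in one
# tibble and numeric in the other
data1 <- tibble(person = c("A", "B", "C"), test1 = as.factor(c(1, 4, 5)))
data2 <- tibble(person = c("D", "E", "F"), test1 = c(8, 7, 2))

# Convert every factor column to numeric via character, so the labels
# (not the internal level codes) are kept
factors_to_numeric <- function(d) {
  mutate(d, across(where(is.factor), ~ as.numeric(as.character(.x))))
}

all_data <- bind_rows(factors_to_numeric(data1), factors_to_numeric(data2))
```

If some factor holds non-numeric labels, as.numeric() would yield NA with a warning for those values, so this approach only fits data like the example.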
Upvotes: 0
Reputation: 17289
As the files are usually small (several hundred rows) and you simply want to combine them and write the result to a new file, I think we can convert all columns to character, so that the columns common to data1 and data2 have the same type.
library(dplyr)
bind_rows(mutate_all(data1, as.character), mutate_all(data2, as.character))
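The same idea might be extended to all ten datasets by keeping them in a list. This is only a sketch: the list name datasets and its two small tibbles are hypothetical stand-ins for the data read from Excel, and across() assumes dplyr 1.0 or later (mutate_all works as well on older versions).

```r
library(dplyr)

# Hypothetical list standing in for the ten tibbles read from Excel
datasets <- list(
  tibble(person = c("A", "B"), test1 = as.factor(c(1, 4)), test2 = c(14, 25)),
  tibble(person = c("C", "D"), test1 = c(8, 7), test3 = c(6.5, 12))
)

# Convert every column of every tibble to character, then stack them;
# columns missing from a tibble are filled with NA
all_data <- bind_rows(
  lapply(datasets, function(d) mutate(d, across(everything(), as.character)))
)

# Write the merged result to a CSV file (a temporary path is used here)
out_file <- tempfile(fileext = ".csv")
write.csv(all_data, out_file, row.names = FALSE)
```

Because everything is character at this point, the merged tibble is suited to being written out as-is; if numeric columns are needed again in R, something like readr::type_convert() could re-infer the types afterwards.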
Upvotes: 10
Reputation: 268
I think that this should work:
library(plyr)
all_data <- rbind.fill(data1,data2)
Upvotes: 14