Reputation: 87
I would like to merge 2 data frames additively such that
taxonomy A B C
1 rat 0 1 2
2 dog 1 2 3
3 cat 2 3 0
and
taxonomy A D C
1 rat 0 1 9
2 Horse 0 2 6
3 cat 2 0 2
produce
taxonomy A B C D
1 rat 0 1 11 1
2 Horse 0 0 6 2
3 cat 4 3 2 0
4 dog 1 2 3 0
I've tried aggregate, merge, apply, ddply.... with no success...this will be done on 2 data frames with a couple hundred rows and columns
Upvotes: 4
Views: 130
Reputation: 26343
The data.table
equivalent of @avid_useR's answer.
library(data.table)
rbindlist(list(df1, df2), fill = TRUE)[, lapply(.SD, sum, na.rm = TRUE), by = taxonomy]
# taxonomy A B C D
#1: rat 0 1 11 1
#2: dog 1 2 3 0
#3: cat 4 3 2 0
#4: Horse 0 0 6 2
Upvotes: 2
Reputation: 66819
You can do...
> library(reshape2)
> dcast(rbind(melt(DF1), melt(DF2)), taxonomy ~ variable, fun.aggregate = sum)
Using taxonomy as id variables
Using taxonomy as id variables
taxonomy A B C D
1 cat 4 3 2 0
2 dog 1 2 3 0
3 Horse 0 0 6 2
4 rat 0 1 11 1
This sorts the rows and columns alphabetically, but I guess this might be avoidable by using a factor
.
Data:
DF1 = structure(list(taxonomy = c("rat", "dog", "cat"), A = 0:2, B = 1:3,
C = c(2L, 3L, 0L)), .Names = c("taxonomy", "A", "B", "C"), row.names = c(NA,
-3L), class = "data.frame")
DF2 = structure(list(taxonomy = c("rat", "Horse", "cat"), A = c(0L,
0L, 2L), D = c(1L, 2L, 0L), C = c(9L, 6L, 2L)), .Names = c("taxonomy",
"A", "D", "C"), row.names = c(NA, -3L), class = "data.frame")
Upvotes: 1
Reputation: 18661
With bind_rows
from dplyr
:
library(dplyr)
bind_rows(df1, df2) %>%
group_by(taxonomy) %>%
summarize_all(sum, na.rm = TRUE)
Output:
# A tibble: 4 x 5
taxonomy A B C D
<chr> <int> <int> <int> <int>
1 cat 4 3 2 0
2 dog 1 2 3 0
3 Horse 0 0 6 2
4 rat 0 1 11 1
Data:
df1 <- structure(list(taxonomy = c("rat", "dog", "cat"), A = 0:2, B = 1:3,
C = c(2L, 3L, 0L)), .Names = c("taxonomy", "A", "B", "C"), class = "data.frame", row.names = c("1",
"2", "3"))
df2 <- structure(list(taxonomy = c("rat", "Horse", "cat"), A = c(0L,
0L, 2L), D = c(1L, 2L, 0L), C = c(9L, 6L, 2L)), .Names = c("taxonomy",
"A", "D", "C"), class = "data.frame", row.names = c("1", "2",
"3"))
Upvotes: 4