Reputation: 33
I got two data frames like this:
dat1
col n
1 A 1
2 B 1
3 C 2
dat2
col n
1 A 2
2 B 1
3 C 1
4 D 1
and I want to make a data frame like this with dat1 and dat2:
dat3
col n
1 A 3
2 B 2
3 C 3
4 D 1
I'm trying to make data frame (dat3) with dplyr bind_rows, group_by and count, but I can't.
bind_rows(dat1, dat2) %>%
group_by(col)
result:
col n
1 A 1
2 B 1
3 C 2
4 A 2
5 B 1
6 C 1
7 D 1
bind_rows(dat1, dat2) %>%
group_by(col) %>%
count(n)
result:
col n nn
1 A 1 1
2 A 2 1
3 B 1 2
4 C 1 1
5 C 2 1
6 D 1 1
How can I make dat3?
Upvotes: 2
Views: 131
Reputation: 886978
Or in base R
,
aggregate(cbind(Sum = n) ~ col, rbind(df1, df2), FUN = sum)
# col Sum
#1 A 3
#2 B 2
#3 C 3
#4 D 1
df1 <- structure(list(col = c("A", "B", "C"), n = c(1L, 1L, 2L)),
class = "data.frame", row.names = c("1",
"2", "3"))
df2 <- structure(list(col = c("A", "B", "C", "D"), n = c(2L, 1L, 1L,
1L)), class = "data.frame", row.names = c("1", "2", "3", "4"))
Upvotes: 1
Reputation: 160417
Third option, just in case:
psum <- function(..., na.rm = TRUE) {
m <- cbind(...)
apply(m, 1, sum, na.rm = na.rm)
}
full_join(dat1, dat2, by = "col") %>%
mutate(n = psum(n.x, n.y))
# col n.x n.y n
# 1 A 1 2 3
# 2 B 1 1 2
# 3 C 2 1 3
# 4 D NA 1 1
(n.x
and n.y
columns are generated by the join due to same-named columns, they are retained here solely for demonstration. Yes, psum
is a hack here, likely something better out there ...)
Upvotes: 1
Reputation: 4907
data.table
is a superior package to dplyr
. I suggest you try it:
library(data.table)
dat1 <- setDT(dat1); dat2 <- setDT(dat2)
dat3 <- rbindlist(list(dat1, dat2))[, .(n= sum(n)), .(col)]
Upvotes: 0
Reputation: 16178
You should summarise instead of counting:
bind_rows(dat1, dat2) %>%
group_by(col) %>% summarise(Sum = sum(n))
# A tibble: 4 x 2
col Sum
<chr> <dbl>
1 A 3
2 B 2
3 C 3
4 D 1
Upvotes: 1