Reputation: 33
I need to aggregate the rows that share the same values in col2 and col3, so I expect to receive the sums of col4 and col5:
df <- data.frame("col1"="a", "col2"=c("mi", "se", "mi", "se", "ty"),
                 "col3"=c("re", "my", "re", "my", "my"), "col4"=c(1, 2, 3, 4, 5),
                 "col5"=c(1, 2, 3, 4, 5))
agg <- aggregate(df, by=list(df$col1, df$col2), FUN=sum)
The result is an error, though:
Error in Summary.factor(c(1L, 1L), na.rm = FALSE) : ‘sum’ not meaningful for factors
My expected output is
col1 col2 col3 col4 col5
1 a mi re 4 4
2 a se my 6 6
3 a ty my 5 5
Upvotes: 2
Views: 2710
Reputation: 476
Using dplyr:
library(dplyr)
# group on the two key columns, then sum the numeric columns
agg <- df %>%
  group_by(col2, col3) %>%
  summarise(col4 = sum(col4),
            col5 = sum(col5))
# col2 col3 col4 col5
# <fct> <fct> <dbl> <dbl>
# 1 mi re 4 4
# 2 se my 6 6
# 3 ty my 5 5
Is that what you are looking for?
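If col1 should also appear in the result, as in the expected output of the question, one option is to add it to the grouping. The following is only a sketch, assuming dplyr 1.0.0 or later (for across() and the .groups argument); the data frame is rebuilt so that the snippet runs on its own.

```r
library(dplyr)

# Rebuild the example data from the question
df <- data.frame(col1 = "a",
                 col2 = c("mi", "se", "mi", "se", "ty"),
                 col3 = c("re", "my", "re", "my", "my"),
                 col4 = c(1, 2, 3, 4, 5),
                 col5 = c(1, 2, 3, 4, 5))

# Group on all three key columns so that col1 is kept,
# then sum every listed numeric column in a single call
agg <- df %>%
  group_by(col1, col2, col3) %>%
  summarise(across(c(col4, col5), sum), .groups = "drop")

print(agg)
```

Here .groups = "drop" returns an ungrouped tibble, which avoids the message about remaining groups that summarise() otherwise prints.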
Upvotes: 1
Reputation: 72994
Exclude the factor columns by aggregating only on list(col4, col5).
with(df, aggregate(list(col4, col5), by=list(col1, col2, col3), sum))
# Group.1 Group.2 Group.3 c.1..2..3..4..5. c.1..2..3..4..5..1
# 1 a se my 6 6
# 2 a ty my 5 5
# 3 a mi re 4 4
We can get a somewhat nicer output if we name the lists.
with(df, aggregate(list(col4=col4, col5=col5), by=list(col1=col1, col2=col2, col3=col3), sum))
# col1 col2 col3 col4 col5
# 1 a se my 6 6
# 2 a ty my 5 5
# 3 a mi re 4 4
As suggested by @Ronak Shah, we could also use the formula interface:
aggregate(cbind(col4, col5) ~ col1 + col2 + col3, df, sum)
The list method is slightly faster, though.
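To check that the two calls return the same sums, something like the sketch below should work. It sets stringsAsFactors = TRUE explicitly, because R 4.0.0 and later no longer convert strings to factors by default, so the error from the question would look different there. The names by_list and by_formula are only illustrative.

```r
# Rebuild the example data with factor columns, as in the question
df <- data.frame(col1 = "a",
                 col2 = c("mi", "se", "mi", "se", "ty"),
                 col3 = c("re", "my", "re", "my", "my"),
                 col4 = c(1, 2, 3, 4, 5),
                 col5 = c(1, 2, 3, 4, 5),
                 stringsAsFactors = TRUE)

# List interface: only the numeric columns are passed to sum()
by_list <- with(df, aggregate(list(col4 = col4, col5 = col5),
                              by = list(col1 = col1, col2 = col2, col3 = col3),
                              FUN = sum))

# Formula interface: cbind() on the left selects the columns to sum
by_formula <- aggregate(cbind(col4, col5) ~ col1 + col2 + col3,
                        data = df, FUN = sum)

# Put both results in the same row order before comparing
by_list <- by_list[order(by_list$col2), ]
by_formula <- by_formula[order(by_formula$col2), ]

# Compare the aggregated columns of the two results
same_col4 <- isTRUE(all.equal(by_list$col4, by_formula$col4))
same_col5 <- isTRUE(all.equal(by_list$col5, by_formula$col5))
print(c(same_col4, same_col5))
```

Which of the two is faster on a given data set can be measured with system.time() or a benchmarking package; the difference on a five-row example is too small to matter.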
Upvotes: 0