Reputation: 1294
I have tried to look at manuals for aggregate, apply, etc, but I can't find examples where some columns are skipped by the function being applied, while other columns are copied as-is. Example:
> olddf = data.frame(code=c("one","one","two"), val1=c(1,2,3), val2=c(4,5,6), val3=c(7,8,9))
> olddf
code val1 val2 val3
1 one 1 4 7
2 one 2 5 8
3 two 3 6 9
>
How do I aggregate olddf so that I get a new data frame where:
I basically want:
> newdf
code val2 newcol
1 one 9 1.6666
2 two 6 1.5000
I would like to do this in one step rather than defining a separate function to deal with each column / aggregate. Is this possible?
Upvotes: 0
Views: 535
Reputation: 3501
Try data.table
library(data.table)
setDT(olddf)[, .(val2 = sum(val2),
newcol = sum(val3)/sum(val2)), by = code]
# code val2 newcol
# 1: one 9 1.666667
# 2: two 6 1.500000
Upvotes: 2
Reputation: 121
You can do this in two steps, the key is using the aggregate function. In the by() argument, you want to specify how you want to aggregate. In this case, aggregate according to the value in "Code". Then, add third column just by specifying the function of interest.
aggregate <- aggregate(olddf[, c("val2", "val3")], by = list(olddf$code), FUN = sum)
aggregate$newcol <- aggregate$val3/aggregate$val2
Upvotes: 0