rstruck
rstruck

Reputation: 1294

r aggregate dataframe: some columns unchanged, some columns aggregated

I have tried to look at manuals for aggregate, apply, etc, but I can't find examples where some columns are skipped by the function being applied, while other columns are copied as-is. Example:

> olddf = data.frame(code=c("one","one","two"), val1=c(1,2,3), val2=c(4,5,6), val3=c(7,8,9))
> olddf
  code val1 val2 val3
1  one    1    4    7
2  one    2    5    8
3  two    3    6    9
> 

How do I aggregate olddf so that I get a new data frame where:

I basically want:

> newdf
  code val2 newcol
1  one    9 1.6666
2  two    6 1.5000

I would like to do this in one step rather than defining a separate function to deal with each column / aggregate. Is this possible?

Upvotes: 0

Views: 535

Answers (2)

KFB
KFB

Reputation: 3501

Try data.table

library(data.table)
setDT(olddf)[, .(val2 = sum(val2),
                 newcol = sum(val3)/sum(val2)), by = code]
#    code val2   newcol
# 1:  one    9 1.666667
# 2:  two    6 1.500000

Upvotes: 2

JFu
JFu

Reputation: 121

You can do this in two steps, the key is using the aggregate function. In the by() argument, you want to specify how you want to aggregate. In this case, aggregate according to the value in "Code". Then, add third column just by specifying the function of interest.

aggregate <- aggregate(olddf[, c("val2", "val3")], by = list(olddf$code), FUN = sum)
aggregate$newcol <- aggregate$val3/aggregate$val2

Upvotes: 0

Related Questions