M.Adams
M.Adams

Reputation: 145

aggregate a column by sum and another column by mean at the same time

I want to use aggregate function on a date frame but sum one column and take average of another column.

Here is an example data frame

Manager   Category  Amount  SqFt
Joe           Rent     150   500
Alice         Rent     250   700
Joe      Utilities      50   500
Alice    Utilities      75   700

I cannot do something like below. Is there an easy way to do it ?

Avg_CPSF=aggregate(cbind(Amount,SqFt)~Manager,data=aaa,FUN=c(sum,mean)

Eventually I need

Manager  Amount   SqFT
Joe       200      500
Alice     325      700

so that I can calculate Cost per Square Foot by doing Amount/SqFT

Upvotes: 5

Views: 14044

Answers (2)

A5C1D2H2I1M1N2O1R2T1
A5C1D2H2I1M1N2O1R2T1

Reputation: 193517

There are several ways to do this. Here are some that I like (all assuming we're starting with a data.frame named "mydf"):

Using ave and unique

unique(within(mydf, {
  Amount <- ave(Amount, Manager, FUN = sum)
  SqFt <- ave(SqFt, Manager, FUN = mean)
  rm(Category)
}))
#   Manager Amount SqFt
# 1     Joe    200  500
# 2   Alice    325  700

Using data.table:

library(data.table)
DT <- data.table(mydf)
DT[, list(Amount = sum(Amount), SqFt = mean(SqFt)), by = "Manager"]
#    Manager Amount SqFt
# 1:     Joe    200  500
# 2:   Alice    325  700

Using "sqldf":

library(sqldf)
sqldf("select Manager, sum(Amount) `Amount`, 
      avg(SqFt) `SqFt` from mydf group by Manager")

Using aggregate and merge:

merge(aggregate(Amount ~ Manager, mydf, sum), 
      aggregate(SqFt ~ Manager, mydf, mean))

Upvotes: 20

James
James

Reputation: 66834

You can use the summarise function with ddply in the plyr package:

library(plyr)
ddply(mtcars,.(cyl),summarise,TotalWt=sum(wt),AveHP=mean(hp))
  cyl TotalWt     AveHP
1   4  25.143  82.63636
2   6  21.820 122.28571
3   8  55.989 209.21429

Upvotes: 4

Related Questions