Reputation: 323

Sum values from common ID in a dataframe

I have a data frame that looks like:

df <- data.frame(id = c("xx33", "xx33", "xx22", "xx11", "xx11", "xx00"), amount = c(10, 15, 100, 20, 10, 15), date = c("01/02/2013", "01/02/2013", "02/02/2013", "03/03/2013", "03/03/2013", "04/04/2013"))

    id amount       date
1 xx33     10 01/02/2013
2 xx33     15 01/02/2013
3 xx22    100 02/02/2013
4 xx11     20 03/03/2013
5 xx11     10 03/03/2013
6 xx00     15 04/04/2013

I want to group the rows by id, sum the amount for each id, and count the number of occurrences of each id, while carrying along information such as date that is the same within each id (along with any other such variable). So, I want the output to be:

    id sum       date number
1 xx33  25 01/02/2013      2
2 xx22 100 02/02/2013      1
3 xx11  30 03/03/2013      2
4 xx00  15 04/04/2013      1

I've tried

ddply(.data = df, .var = "id", .fun = nrow)

and that returns the number of occurrences of each id, but I can't figure out a way to sum the amounts for the common ids without looping.

Upvotes: 10

Answers (4)

Matthew Plourde

Reputation: 44614

An obligatory base R answer:

unique(transform(df, amount=ave(amount, id, FUN=sum), 
                     count=ave(amount, id, FUN=length)))
#     id amount       date count
# 1 xx33     25 01/02/2013     2
# 3 xx22    100 02/02/2013     1
# 4 xx11     30 03/03/2013     2
# 6 xx00     15 04/04/2013     1
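
To see why the `unique()` call is needed, here is a small sketch of what `ave()` returns on its own, using the `id` and `amount` columns from the question. `ave()` gives back a vector as long as its input, with each element replaced by its group's summary, so after `transform()` the rows within an id become identical and `unique()` collapses them. This relies on every other column (such as `date`) being constant within an id, as the question states; the names `group_sum` and `group_n` are only for illustration.

```r
df <- data.frame(
  id     = c("xx33", "xx33", "xx22", "xx11", "xx11", "xx00"),
  amount = c(10, 15, 100, 20, 10, 15)
)

# Group sums, repeated once for every row of the group
group_sum <- ave(df$amount, df$id, FUN = sum)

# Group sizes, likewise one value per original row
group_n <- ave(df$amount, df$id, FUN = length)

print(group_sum)  # 25 25 100 30 30 15
print(group_n)    # 2 2 1 2 2 1
```

If some column did vary within an id, the rows would no longer be duplicates and `unique()` would keep more than one row for that id.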

Upvotes: 4

Jilber Urbina

Reputation: 61214

Here's a base R solution:

> res <- aggregate(amount ~ id + date, data = df, FUN = sum)
> res$Freq <- as.vector(table(df$id)[as.character(res$id)])  # look up counts by id name
> res
    id       date amount Freq
1 xx33 01/02/2013     25    2
2 xx22 02/02/2013    100    1
3 xx11 03/03/2013     30    2
4 xx00 04/04/2013     15    1

Upvotes: 6

Metrics

Reputation: 15458

Here is the solution using plyr package:

library(plyr)
ddply(df, .(date, id), summarize, sum = sum(amount), number = length(id))
        date   id sum number
1 01/02/2013 xx33  25      2
2 02/02/2013 xx22 100      1
3 03/03/2013 xx11  30      2
4 04/04/2013 xx00  15      1

Upvotes: 9

TheComeOnMan

Reputation: 12905

Using the data.table package:

library(data.table)
dt <- data.table(df)
dt2 <- dt[,list(sumamount = sum(amount), freq = .N), by = c("id","date")]

Output:

> dt2
     id       date sumamount freq
1: xx33 01/02/2013        25    2
2: xx22 02/02/2013       100    1
3: xx11 03/03/2013        30    2
4: xx00 04/04/2013        15    1

Upvotes: 9
