Reputation: 369
I have a data frame like this:
  id no age
1  1  7  23
2  1  2  23
3  2  1  25
4  2  4  25
5  3  6  23
6  3  1  23
and I want to aggregate the data frame by id into a form like this (just sum no for rows that share the same id, but keep age):
  id no age
1  1  9  23
2  2  5  25
3  3  7  23
How can I achieve this in R?
Upvotes: 18
Views: 35595
Reputation: 34763
Even better, use data.table:
library(data.table)
# convert your object to a data.table (by reference) to unlock data.table syntax
setDT(DF)
DF[ , .(sum_no = sum(no), unq_age = unique(age)), by = id]
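For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the question's data and groups by both id and age instead of calling unique(age). That variant is my own assumption about what is wanted, not part of the answer above: it keeps the original column names and still returns one row per group if age ever differs within an id. It assumes the data.table package is installed.

```r
library(data.table)

# Rebuild the example data from the question (DF is the name used in the answer)
DF <- data.frame(
  id  = c(1, 1, 2, 2, 3, 3),
  no  = c(7, 2, 1, 4, 6, 1),
  age = c(23, 23, 25, 25, 23, 23)
)

# Convert to a data.table by reference
setDT(DF)

# Sum no within each (id, age) pair; age is carried along as a grouping column
res <- DF[, .(no = sum(no)), by = .(id, age)]
print(res)
```

The groups come back in order of first appearance, so the rows follow the id order of the input.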
Upvotes: 7
Reputation: 98599
Assuming that your data frame is named df:
aggregate(no ~ id + age, df, sum)
#   id age no
# 1  1  23  9
# 2  3  23  7
# 3  2  25  5
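If you want the rows and columns in the same order as the desired output in the question, a small follow-up step does it. This is only a sketch under the assumption that df holds exactly the example data from the question; the reordering lines are my addition, not something aggregate does for you.

```r
# Rebuild the example data from the question
df <- data.frame(
  id  = c(1, 1, 2, 2, 3, 3),
  no  = c(7, 2, 1, 4, 6, 1),
  age = c(23, 23, 25, 25, 23, 23)
)

# Sum no for every (id, age) combination
res <- aggregate(no ~ id + age, data = df, FUN = sum)

# Sort by id and put the columns back in the original order
res <- res[order(res$id), c("id", "no", "age")]
rownames(res) <- NULL
print(res)
```

Base R only, so nothing needs to be installed.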
Upvotes: 25
Reputation: 4180
Alternatively, you could use ddply from the plyr package:
library(plyr)
ddply(df, .(id, age), summarise, no = sum(no))
In this particular example the results are identical. However, this is not always the case; the difference between the two functions is outlined here. Both functions have their uses and are worth exploring, which is why I felt this alternative should be mentioned.
Upvotes: 4