Reputation: 1281
Here is a simple aggregate call:
dat = read.table(textConnection(
'ID value
1 4
1 7
2 8
2 3
2 3'), header = TRUE)
aggregate(dat,by=list("type"=dat$ID),sum)
I get this output:
type ID value
1 1 2 11
2 2 6 14
I wonder:
1. In the first row, why is the ID 2?
2. In the second row, why is the ID 6?
Upvotes: 0
Views: 145
Reputation: 42639
You requested a sum of each column, aggregated by dat$ID. Using this interface, that will include all columns. dat$ID is simply a vector and thus the ID column is not removed from the aggregated results. The function sum is also applied to ID within each group.
For the first row, you are computing with(dat, sum(ID[dat$ID==1])), or 1+1 = 2.
For the second row, you are computing with(dat, sum(ID[dat$ID==2])), or 2+2+2 = 6.
(It is intentional that I specified dat$ID in each index, rather than ID, as that is what your aggregate call is doing.)
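To check the arithmetic above by hand, here is a minimal sketch that splits the data by ID and sums every column within each group. It assumes the same dat as in the question; the name per_group is only illustrative, and this is not how aggregate is implemented internally, just an equivalent computation.

```r
# Rebuild the example data from the question.
dat <- read.table(text = "ID value
1 4
1 7
2 8
2 3
2 3", header = TRUE)

# Split the rows by ID, then sum every column (ID included) in each group.
# The result is a matrix with one column per group and one row per variable.
per_group <- sapply(split(dat, dat$ID), colSums)
per_group
```

The ID row of per_group holds 2 and 6, matching the ID column in the aggregate output from the question.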
Using the formula interface to aggregate is cleaner, and gives what you seem to want. With this interface, aggregate gives the sum of the value column, with ID as it appears in each aggregated group:
> aggregate(value ~ ID, data=dat, sum)
ID value
1 1 11
2 2 14
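If you prefer to stay with the by interface, one option is to pass only the columns you want summed, so that ID is never handed to sum. This is a sketch under the same example data; the name res is only illustrative.

```r
# Rebuild the example data from the question.
dat <- read.table(text = "ID value
1 4
1 7
2 8
2 3
2 3", header = TRUE)

# Aggregate only the value column; the grouping vector is named ID
# so the grouping column in the result keeps that name.
res <- aggregate(dat["value"], by = list(ID = dat$ID), FUN = sum)
res
```

This should produce the same two-column result as the formula call, with one row per ID.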
Upvotes: 2