Reputation: 920
My data looks like below.
id MONTH DAY VAR1 VAR2 VAR3 ...
1 10 21 5 1 "a"
1 10 21 4 2 "a"
1 10 22 3 4 "b"
1 11 25 5 1 "a"
2 10 10 5 1 "c"
2 10 10 7 5 "c"
2 10 28 5 1 "a"
...
From the data, If id,MONTH,DAY are same such as 1st and second row, I want to get sum VAR1 and mean VAR2, otherwise remain. Therefore my desired result should be
id MONTH DAY VAR1 VAR2 VAR3 ...
1 10 21 9 1.5 "a"
1 10 22 3 4 "b"
1 11 25 5 1 "a"
2 10 10 12 3 "c"
2 10 28 5 1 "a"
...
VAR3 is the same if id,MONTH,DAY are same.
I think I can use aggregate repeatedly for each sum and mean function but it seems to be cumbersome. Is there a simple way of doing by keeping my data format?
Upvotes: 0
Views: 100
Reputation: 6969
You can also easily do multiple calculations within a group with data.table
, e.g.:
DT[, ("VAR1.sum", "VAR2.mean") := .(sum(VAR1), mean(VAR2)), by = .(id, MONTH, DAY)]
Upvotes: 1
Reputation: 21621
Try:
library(dplyr)
df %>%
group_by(id, MONTH, DAY) %>%
summarise(VAR1 = sum(VAR1), VAR2 = mean(VAR2), first(VAR3))
Which gives:
#Source: local data frame [5 x 6]
#Groups: id, MONTH [?]
#
# id MONTH DAY VAR1 VAR2 first(VAR3)
# (int) (int) (int) (int) (dbl) (fctr)
#1 1 10 21 9 1.5 a
#2 1 10 22 3 4.0 b
#3 1 11 25 5 1.0 a
#4 2 10 10 12 3.0 c
#5 2 10 28 5 1.0 a
Upvotes: 1