Reputation: 3722
I'm trying to compute a conditional sum within groups and am having trouble working out how to write it.
After grouping by f, I want to sum the amounts for only the rows that count as type r1 (here, r1_amt >= 50).
Reproducible code:
library(dplyr)
s <- sample(c('one', 'two'), 96, replace = TRUE)
f <- sample(c('a','s','d','f'), 96, replace = TRUE)
r1_amt <- runif(96, 1, 100)
r2_amt <- runif(96, 1, 100)
r3_amt <- runif(96, 1, 100)
x <- tibble(s, f, r1_amt, r2_amt, r3_amt) # data_frame() is deprecated in favour of tibble()
smy <- x %>%
  group_by(f) %>%
  summarise(n = n(), # population in each f group
            num_r1 = sum(r1_amt >= 50)) # number of r1 rows in each f group
I've tried .[r1_amt >= 50]$amt, cumsum(r1_amt >= 50), and sum(ifelse(r1_amt >= 50, r1_amt, 0)), but haven't been able to come up with the grouped numbers.
For example, a given row could have 60 for r1, 40 for r2, and 55 for r3; it should then be included in the summed amount column for r1 and r3 only, if that makes sense.
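One way to get a per-group sum of only the qualifying amounts is to subset each column inside sum(). The following is a minimal sketch, assuming the threshold of 50 used in the attempts above; the output column names sum_r1, sum_r2 and sum_r3 are made up for illustration.

```r
library(dplyr)

set.seed(123)  # only so the sketch is reproducible
x <- tibble(
  f      = sample(c('a', 's', 'd', 'f'), 96, replace = TRUE),
  r1_amt = runif(96, 1, 100),
  r2_amt = runif(96, 1, 100),
  r3_amt = runif(96, 1, 100)
)

smy <- x %>%
  group_by(f) %>%
  summarise(
    n      = n(),                        # rows in each f group
    num_r1 = sum(r1_amt >= 50),          # rows where r1_amt is at least 50
    sum_r1 = sum(r1_amt[r1_amt >= 50]),  # total of those r1 amounts only
    sum_r2 = sum(r2_amt[r2_amt >= 50]),  # same condition, applied to r2
    sum_r3 = sum(r3_amt[r3_amt >= 50])   # same condition, applied to r3
  )
smy
```

Because each condition is evaluated on its own column, a row with 60, 40 and 55 contributes to sum_r1 and sum_r3 but not to sum_r2.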
Upvotes: 1
Views: 172
Reputation: 28825
There may be a cleaner way to do this, but this should work:
x.v2 <- x # temporary copy
x.v2[which(x[,4] != 'r1'), 3] <- 0 # replace values where type != 'r1' with 0
smy <- x.v2 %>%
  group_by(f) %>%
  summarise(n = n(), # population in each f group
            num_r1 = sum(amt)) # sum of values for type == 'r1' in each group f
rm(x.v2) # remove the temporary copy
smy # output for seed = 123 (use set.seed(123) for building data)
# f n num_r1
# 1 a 20 114.1879
# 2 d 28 611.9858
# 3 f 19 351.5366
# 4 s 29 357.8402
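The temporary copy can probably be avoided by subsetting inside summarise(). This is a sketch only, assuming x is in long form with columns f, amt and type, as the code in this answer implies; the sample data below is made up for illustration and will not reproduce the numbers above.

```r
library(dplyr)

set.seed(123)
# Hypothetical long-form data: one amount and one type per row
x <- tibble(
  f    = sample(c('a', 's', 'd', 'f'), 96, replace = TRUE),
  amt  = runif(96, 1, 100),
  type = sample(c('r1', 'r2', 'r3', 'r4'), 96, replace = TRUE)
)

smy <- x %>%
  group_by(f) %>%
  summarise(
    n      = n(),                    # rows in each f group
    num_r1 = sum(amt[type == 'r1'])  # sum of amt where type is 'r1'
  )
smy
```

Subsetting amt by the logical vector type == 'r1' is evaluated within each f group, so no values need to be overwritten first.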
Upvotes: 1
Reputation: 7248
It sounds like what you want to do is just group by both f and type to compute the per-f/type statistics.
res <- x %>% group_by(f, type) %>% summarise(num_type = n(), sum_type = sum(amt))
res
Source: local data frame [16 x 4]
Groups: f [?]
f type num_type sum_type
<chr> <chr> <int> <dbl>
1 a r1 12 616.6610
2 a r2 6 417.5589
3 a r3 9 375.2246
4 a r4 7 346.5796
5 d r1 8 471.1253
...
You can use tidyr to go back to wide form for the sum_type field, but I would only do so for display purposes:
> res %>% spread(type, sum_type)
Source: local data frame [12 x 6]
Groups: f [4]
f num_type r1 r2 r3 r4
* <chr> <int> <dbl> <dbl> <dbl> <dbl>
1 a 6 NA 417.5589 NA NA
2 a 7 NA NA NA 346.5796
3 a 9 NA NA 375.2246 NA
...
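The NA-filled rows above arise because num_type is still in the data when it is spread, so each combination of f and num_type gets its own row. Below is a sketch of one way to get a single row per f, assuming res has the columns f, type, num_type and sum_type. It drops num_type first and uses tidyr's pivot_wider() (available in tidyr 1.0.0 and later) in place of spread(); the sample data is made up for illustration.

```r
library(dplyr)
library(tidyr)

set.seed(123)
# Hypothetical long-form data: one amount and one type per row
x <- tibble(
  f    = sample(c('a', 's', 'd', 'f'), 96, replace = TRUE),
  amt  = runif(96, 1, 100),
  type = sample(c('r1', 'r2', 'r3', 'r4'), 96, replace = TRUE)
)

res <- x %>%
  group_by(f, type) %>%
  summarise(num_type = n(), sum_type = sum(amt)) %>%
  ungroup()

wide <- res %>%
  select(-num_type) %>%  # drop the count so rows collapse to one per f
  pivot_wider(names_from = type, values_from = sum_type)
wide
```

If some f group has no rows of a given type, that cell is NA in the wide result.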
Upvotes: 1