daikonradish
daikonradish

Reputation: 702

calculate cumulative sum of one column based on another column's rank

i have a dataset that looks like this:

amount    rank    category
4000      1       A
200       3       A
1000      2       A
10        4       A
500       1       B
...

I want to calculate the cumulative sum of amount based on ordering of rank, i.e. return:

cum      rank    category
4000     1       A
5000     2       A
5200     3       A
5210     4       A
...

any help would be nice! :)

Upvotes: 0

Views: 1809

Answers (2)

Daniel Gardiner
Daniel Gardiner

Reputation: 1056

A dplyr solution:

library(dplyr)

data = data.frame(amount = c(4000, 200, 1000, 10, 500),
                  rank = c(1, 3, 2, 4, 1),
                  category = c("A", "A", "A", "A","B"))

data %>% arrange(category, rank) %>% 
 group_by(category) %>% mutate(csum = cumsum(amount))

Upvotes: 1

Arun
Arun

Reputation: 118779

A data.table solution:

require(data.table) ## version >= 1.9.0
setDT(dat)          ## converts data.frame to data.table by reference

setkey(dat, category, rank) ## sort first by category, then by rank
dat[, csum := cumsum(amount), by=category]

#    amount rank category csum
# 1:   4000    1        A 4000
# 2:   1000    2        A 5000
# 3:    200    3        A 5200
# 4:     10    4        A 5210
# 5:    500    1        B  500

Upvotes: 1

Related Questions