Duck

Reputation: 39613

Cumulative sum for individuals in a dataframe with R

I have a data frame like this:

data <- data.frame(
  ID    = c("0001", "0002", "0003", "0004", "0004", "0004", "0001", "0001", "0002", "0003"),
  Saldo = c(10, 10, 10, 15, 20, 50, 100, 80, 10, 10),
  place = c("grocery", "market", "market", "cars", "market", "market", "cars", "grocery", "cars", "cars")
)

I am trying to calculate the total of Saldo for each individual in the ID variable. I tried cumsum and apply, but I don't get the result I want. I would like something like this:

    ID Saldo.Total
1 0001         190
2 0002          20
3 0003          20
4 0004          85

Upvotes: 1

Views: 8207

Answers (2)

Marius

Reputation: 60190

I think you may have gotten confused: what you want is not really a cumulative sum, it's just a sum per ID:

library(plyr)
ddply(
  data,
  .(ID),
  summarize,
  Saldo.Total=sum(Saldo)
  )

Output:

    ID Saldo.Total
1 0001         190
2 0002          20
3 0003          20
4 0004          85
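
If you would rather not load plyr, the same per-ID totals can probably be obtained in base R. The following is only a sketch: it rebuilds the relevant columns of the question's data frame, and the names `totals` and `result` are made up for illustration.

```r
# Rebuild the ID and Saldo columns from the question
data <- data.frame(
  ID    = c("0001", "0002", "0003", "0004", "0004", "0004", "0001", "0001", "0002", "0003"),
  Saldo = c(10, 10, 10, 15, 20, 50, 100, 80, 10, 10)
)

# tapply() splits Saldo by ID and applies sum() to each group
totals <- tapply(data$Saldo, data$ID, sum)

# Turn the named result into a data frame shaped like the desired output
result <- data.frame(ID = names(totals), Saldo.Total = as.vector(totals))
result
```

tapply returns one value per ID, with groups ordered by the sorted ID values, so the result lines up with the ddply output above.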

A cumulative sum is the "running total" as you move along the vector, e.g.:

> x = c(1, 2, 3, 4, 5)
> cumsum(x)
[1]  1  3  6 10 15
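
To make the difference concrete, here is a small sketch using the same x. The last element of the running total is the plain sum, which is all the question actually needs. The variable names `running`, `total` and `last_running` are just for illustration.

```r
x <- c(1, 2, 3, 4, 5)

running <- cumsum(x)   # running total: one value per element of x
total   <- sum(x)      # plain sum: a single value

# The final running total equals the plain sum
last_running <- tail(running, 1)
last_running == total
```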

Upvotes: 1

A5C1D2H2I1M1N2O1R2T1

Reputation: 193687

You can use aggregate:

> aggregate(Saldo ~ ID, data, function(x) max(cumsum(x))) ## same as sum(x) here, because no Saldo is negative
    ID Saldo
1 0001   190
2 0002    20
3 0003    20
4 0004    85

If you're really interested in a cumulative sum by ID, try the following:

within(data, {
  Saldo.Total <- ave(Saldo, ID, FUN = cumsum)
})
#     ID Saldo   place Saldo.Total
# 1  0001    10 grocery          10
# 2  0002    10  market          10
# 3  0003    10  market          10
# 4  0004    15    cars          15
# 5  0004    20  market          35
# 6  0004    50  market          85
# 7  0001   100    cars         110
# 8  0001    80 grocery         190
# 9  0002    10    cars          20
# 10 0003    10    cars          20
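
As a quick check of how the two results relate, the sketch below (variable names `last_per_id` and `sum_per_id` are made up for illustration) rebuilds the running total with ave and then takes the last value within each ID. Assuming the rows are processed in their original order, that last value should match the plain per-ID sum.

```r
# Rebuild the ID and Saldo columns from the question
data <- data.frame(
  ID    = c("0001", "0002", "0003", "0004", "0004", "0004", "0001", "0001", "0002", "0003"),
  Saldo = c(10, 10, 10, 15, 20, 50, 100, 80, 10, 10)
)

# Running total within each ID; ave() keeps the original row order
data$Saldo.Total <- ave(data$Saldo, data$ID, FUN = cumsum)

# Last running total per ID, compared with the plain per-ID sum
last_per_id <- tapply(data$Saldo.Total, data$ID, tail, 1)
sum_per_id  <- tapply(data$Saldo, data$ID, sum)
all(last_per_id == sum_per_id)
```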

Upvotes: 5
