user2543622

Reputation: 6796

R cumulative sum based upon other columns

I have a data.frame as shown below. The data is sorted by column txt and then by column val. The summ column is the sum of the current row's val and the summ value from the previous row, provided that the current row and the previous row have the same value in the txt column. How could I do this in R?

txt=c(rep("a",4),rep("b",5),rep("c",3))
val=c(1,2,3,4,1,2,3,4,5,1,2,3)
summ=c(1,3,6,10,1,3,6,10,15,1,3,6)
dd=data.frame(txt,val,summ)
> dd
   txt val summ
1    a   1    1
2    a   2    3
3    a   3    6
4    a   4   10
5    b   1    1
6    b   2    3
7    b   3    6
8    b   4   10
9    b   5   15
10   c   1    1
11   c   2    3
12   c   3    6

Upvotes: 0

Views: 2432

Answers (1)

bgoldst

Reputation: 35324

If by "most earlier" (which in English is more properly written "earliest") you mean the nearest preceding row, which is what your expected output implies, then what you're describing is a cumulative sum. You can apply cumsum() separately to each group of txt with ave():

dd <- data.frame(txt=c(rep("a",4),rep("b",5),rep("c",3)), val=c(1,2,3,4,1,2,3,4,5,1,2,3))
dd$summ <- ave(dd$val, dd$txt, FUN=cumsum)
dd
##    txt val summ
## 1    a   1    1
## 2    a   2    3
## 3    a   3    6
## 4    a   4   10
## 5    b   1    1
## 6    b   2    3
## 7    b   3    6
## 8    b   4   10
## 9    b   5   15
## 10   c   1    1
## 11   c   2    3
## 12   c   3    6
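
To make it clearer what ave() is doing here, the same grouped cumulative sum can be sketched in base R with split(), lapply(), and unsplit(). This is only an illustration of the idea, not how ave() has to be used; the column name summ2 and the variable groups are made up for the example.

```r
# Same data as in the answer above.
dd <- data.frame(
  txt = c(rep("a", 4), rep("b", 5), rep("c", 3)),
  val = c(1, 2, 3, 4, 1, 2, 3, 4, 5, 1, 2, 3)
)

# split() breaks val into one vector per txt group, lapply() runs cumsum()
# on each group, and unsplit() puts the results back in the original row order.
groups <- split(dd$val, dd$txt)
dd$summ2 <- unsplit(lapply(groups, cumsum), dd$txt)  # summ2 is a hypothetical name

# Compare with the ave() one-liner.
all.equal(dd$summ2, ave(dd$val, dd$txt, FUN = cumsum))
```

Because the grouping is done on the values of txt rather than on row position, neither version relies on rows of the same group being adjacent; within each group the running total follows the row order of the data.frame.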

Upvotes: 3
