Reputation: 699
I have the following data frame:
d <- data.frame(x = c(6,6,6,17,17,17,68,68,68), w = c(4,2,1,0,5,1,0,2,7))
Now I want to add a new column sw
such that
for each x=6
, sw=4+2+1+0+5+1+0+2+7=22
,
for each x=17
, sw=0+5+1+0+2+7=15
,
for each x=68
, sw=0+2+7=9
.
That is,
d
x w sw
1 6 4 22
2 6 2 22
3 6 1 22
4 17 0 15
5 17 5 15
6 17 1 15
7 68 0 9
8 68 2 9
9 68 7 9
For this example, I can do the following:
j = sum(d$w)
k = sum(d$w[d$x!=6])
l = sum(d$w[d$x==68])
d$sw <- c(rep(j,3),rep(k,3),rep(l,3))
But for 100 different values of x
, I can't do such coding, even I may not know what values is x
taking because I will generate the x=round(rexp(1000,1/100))
.
Upvotes: 2
Views: 588
Reputation: 38520
Here is a third base R solution that first calculates the values and then merges them to the original data.frame.
merge(df, within(aggregate(w ~ x, data=df, sum), sw <- rev(cumsum(rev(w))))[-2], by="x")
x w sw
1 6 4 22
2 6 2 22
3 6 1 22
4 17 0 15
5 17 5 15
6 17 1 15
7 68 0 9
8 68 2 9
9 68 7 9
aggregate(w ~ x, data=df, sum)
calculates the sum of w for each x group, Then sw <- rev(cumsum(rev(w)))
creates the cumulative sum using two revs
first to get the correct calculation and second to put it in the proper order. These functions are wrapped in within
so a copy of this data.frame is returned, and [-2]
drops the now extraneous w term calculated by aggregate
. This aggregated data.frame is then merged onto the original data.frame.
data
df <-
structure(list(x = c(6L, 6L, 6L, 17L, 17L, 17L, 68L, 68L, 68L
), w = c(4L, 2L, 1L, 0L, 5L, 1L, 0L, 2L, 7L)), .Names = c("x",
"w"), row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9"
), class = "data.frame")
Upvotes: 1
Reputation: 51622
Another idea via base R using Reduce
with accumulate = TRUE
argument, i.e.
ind <- aggregate(w ~ x, d, length)$w #get lengths of each group
rslts <- rev(unlist(lapply(Reduce(`rbind`, rev(split(d, d$x)), accumulate = TRUE),
function(i) sum(i$w))))
d$sw <- rep(rslts, ind)
d
# x w sw
#1 6 4 22
#2 6 2 22
#3 6 1 22
#4 17 0 15
#5 17 5 15
#6 17 1 15
#7 68 0 9
#8 68 2 9
#9 68 7 9
Upvotes: 2
Reputation: 887991
We can do this with base R
v1 <- sapply(unique(d$x), function(x) rep(sum(d$w[match(x, d$x):nrow(d)]), length(x)))
d$sw <- rep(v1, table(d$x))
Upvotes: 2