Reputation: 960
I have the following dataset and would like to add a new column, colY. How can I achieve this? The table below shows how colY is calculated for each row, within each GROUP.
GROUP ID colX colY
1 1 0.8 =0.8*(1+0.7*(1+0.6))
1 2 0.7 =0.7*(1+0.6)
1 3 0.6 =0.6
2 1 1.3 =1.3*(1+1.2*(1+1.1*(1+1.0)))
2 2 1.2 =1.2*(1+1.1*(1+1.0))
2 3 1.1 =1.1*(1+1.0)
2 4 1.0 =1.0
Preferably in data.table syntax. Thank you!
Upvotes: 0
Views: 293
Reputation: 25225
Here is an option using Rcpp with data.table:
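library(Rcpp)
library(data.table)
# Backward recurrence: res[n-1] = v[n-1], res[i] = v[i] * (1 + res[i+1])
cppFunction('NumericVector fun(NumericVector v) {
    int n = v.size();
    NumericVector res(n);
    if (n == 0) return res;  // guard against empty input
    res[n-1] = v[n-1];
    for (int i = n-2; i >= 0; i--) {
        res[i] = v[i] * (1 + res[i+1]);
    }
    return res;
}')
# DT is a data.table with columns GROUP, ID and colX
DT[, colY := fun(colX), by = GROUP]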
output:
GROUP ID colX colY
1: 1 1 0.8 1.696
2: 1 2 0.7 1.120
3: 1 3 0.6 0.600
4: 2 1 1.3 6.292
5: 2 2 1.2 3.840
6: 2 3 1.1 2.200
7: 2 4 1.0 1.000
Upvotes: 1
Reputation: 2419
Check this
runsum <- function(x) {
  len <- length(x)
  b <- numeric(len)
  # b[i] = x[i] + x[i]*x[i+1] + ... + x[i]*...*x[len]
  for (i in seq_len(len)) {
    b[i] <- sum(cumprod(x[i:len]))
  }
  b
}
dt[, colY := runsum(colX), by = GROUP]
Result:
GROUP ID colX colY
1: 1 1 0.8 1.696
2: 1 2 0.7 1.120
3: 1 3 0.6 0.600
4: 2 1 1.3 6.292
5: 2 2 1.2 3.840
6: 2 3 1.1 2.200
7: 2 4 1.0 1.000
Data:
library(data.table)
dt <- fread("GROUP ID colX
1 1 0.8
1 2 0.7
1 3 0.6
2 1 1.3
2 2 1.2
2 3 1.1
2 4 1.0 ")
There are probably better ways to write runsum, but I haven't found one yet, so the custom function here is only meant to show the basic idea. Any improvements are welcome.
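One possible loop-free alternative, offered as a sketch rather than a definitive answer: the nested formula in the question is a right-to-left recurrence, y[n] = x[n] and y[i] = x[i] * (1 + y[i+1]), which base R's Reduce can accumulate directly with right = TRUE. The helper name nested is made up for illustration, and the data is rebuilt inline so the snippet runs on its own.

```r
library(data.table)

# Hypothetical helper: y[n] = x[n], y[i] = x[i] * (1 + y[i + 1])
nested <- function(x) {
  if (length(x) == 0L) return(numeric(0))  # empty groups give an empty result
  # With right = TRUE, the function receives (current element, accumulated value)
  Reduce(function(x_i, acc) x_i * (1 + acc), x,
         accumulate = TRUE, right = TRUE)
}

# Same data as in the question
dt <- data.table(
  GROUP = c(1, 1, 1, 2, 2, 2, 2),
  ID    = c(1, 2, 3, 1, 2, 3, 4),
  colX  = c(0.8, 0.7, 0.6, 1.3, 1.2, 1.1, 1.0)
)

dt[, colY := nested(colX), by = GROUP]
print(dt)
```

This does one multiplication per row instead of recomputing a cumulative product for every starting position, so it should scale linearly with group size. It assumes the rows are already ordered by ID within each GROUP.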
Upvotes: 1