josumezo
josumezo

Reputation: 15

create column in R data frame adding values in several columns above the value in a different column

This is my problem I have a dataframe like this

df=data.frame(v1=c(24,15, 0, 7,36,10), c1=c(22,15,0,0,28,11), v2=c(0,10,0,19,0,0), c2=c(0,7,0,22,0,0), v3=c(54,22,28,55,62,38), c3=c(44,23,22,66,71,44), cut=c(20,35,40,15,40,30))

That looks like this

  v1 c1 v2 c2 v3 c3 cut
1 24 22  0  0 54 44  20
2 15 15 10  7 22 23  35
3  0  0  0  0 28 22  40
4  7  0 19 22 55 66  15
5 36 28  0  0 62 71  40
6 10 11  0  0 38 44  30

I need to create a new column with adds the values of columns v1,v2,v3 (there are more in the real case), but only when they are above the value in the column "cut"

So, in this case the column "sum" would be:

sum
78
 0
 0
64
62
38

I hope this is not obvious.

Upvotes: 0

Views: 7070

Answers (1)

Jake Burkhead
Jake Burkhead

Reputation: 6535

R> df$sum = apply(df[c(grep("v", names(df), value = TRUE), "cut")], 1, function(y) sum(y[y > y["cut"]]) )
R> df
  v1 c1 v2 c2 v3 c3 cut sum
1 24 22  0  0 54 44  20  78
2 15 15 10  7 22 23  35   0
3  0  0  0  0 28 22  40   0
4  7  0 19 22 55 66  15  74
5 36 28  0  0 62 71  40  62
6 10 11  0  0 38 44  30  38

Here's a mapply approach

R> df$sum = do.call(mapply, c(function(...) { y = c(...); sum(y[y > y["cut"]]) }, df[c(grep("v", names(df), value = TRUE), "cut")]) )
R> df
  v1 c1 v2 c2 v3 c3 cut sum
1 24 22  0  0 54 44  20  78
2 15 15 10  7 22 23  35   0
3  0  0  0  0 28 22  40   0
4  7  0 19 22 55 66  15  74
5 36 28  0  0 62 71  40  62
6 10 11  0  0 38 44  30  38

Upvotes: 2

Related Questions