SBS
SBS

Reputation: 87

Column addition in based on condition of another column using R

I have data frame mydata as follows,

 ID TS        TB    TC
  1 1.7360     -1    0
  2 1.7302     -1    0.254
  3 1.7244      0    0.624
  4 1.7232      0    0.254
  5 1.7208      0    1.25
  6 1.7208      0    0
  7 1.7208      0    0
  8 1.7023      0    0
  9 1.6814      0    0
 10 1.6768      1    0
 11 1.6746      0    6.25
 12 1.6503      0    0.2547
 13 1.6258      0    0.987
 14 1.6190      0    0.3654
 15 1.6154      0    0.6251
 16 1.6258      0    0.369
 17 1.6397      0    0
 18 1.6443      0    0
 19 1.6491      0    0
 20 1.6503      0    0

Now I need to add/subtract values of TS and TC and create another new column TSC with a condition that if there is -1 in column TB then TSC == TS + TC until there is another +1 or -1 in TB, else if there is 1 in column TB then TSC == TS - TC until there is another +1 or -1 in TB. Also there are possibilities of having 0 at the begining of the data frame in the column B then Tsc == TS +TC.

Below is my final result

 ID       TS       TB     TC      TSC
 1  1.73602       -1    0       1.736020823
 2  1.73023       -1    0.254   1.984238239
 3  1.72445        0    0.624   2.348455656
 4  1.72326        0    0.254   1.977269484
 5  1.72089        0    1.25    2.970897142
 6  1.72089        0    0       1.720897142
 7  1.72089        0    0       1.720897142
 8  1.70236        0    0       1.70236322
 9  1.68145        0    0       1.681456955
10  1.67686        1    0       1.676860542
11  1.67463        0    6.25    4.575363528
12  1.65031        0    0.2547  1.395619965
13  1.62585        0    0.987   0.638855188
14  1.61903        0    0.3654  1.253634704
15  1.61547        0    0.6251  0.990376191
16  1.62585        0    0.369   1.256855188
17  1.63979        0    0       1.639792697
18  1.64438        0    0       1.64438911
19  1.64913        0    0       1.649133794
20  1.65031        0    0       1.650319965

Upvotes: 0

Views: 273

Answers (1)

Spacedman
Spacedman

Reputation: 94192

By constructing a vector of +1 for adding and -1 for subtracting its easy. So how do we construct that?

First constructing a vector replacing all the zeroes with the previous non-zero. By using rle we can construct a vector of how many times -1, 1, or 0 appears, and then replace the zeroes with the previous non-zero, then inverse that and get a vector of 1 and -1 only.

Stick a -1 on the start to satisfy the "if starts with zero, subtract" condition:

> rx = rle(c(-1,mydata$TB))

Now see where the zeroes are:

> wz = which(rx$values==0)

Set the zeroes to whatever the previous values were.

> rx$values[wz]=rx$values[wz-1]

Now expand to a vector of 1 and -1, chopping off the first to get rid of that -1 we started with. Also, make it +1 for adding and -1 for subtracting:

> mydata$TBop = -inverse.rle(rx)[-1]

Then do the operation:

> mydata$TSC=mydata$TS + mydata$TBop*mydata$TC
> mydata
    ID     TS TB     TC TBop     TSC
 1:  1 1.7360 -1 0.0000    1  1.7360
 2:  2 1.7302 -1 0.2540    1  1.9842
 3:  3 1.7244  0 0.6240    1  2.3484
 4:  4 1.7232  0 0.2540    1  1.9772
 5:  5 1.7208  0 1.2500    1  2.9708
 6:  6 1.7208  0 0.0000    1  1.7208
 7:  7 1.7208  0 0.0000    1  1.7208
 8:  8 1.7023  0 0.0000    1  1.7023
 9:  9 1.6814  0 0.0000    1  1.6814
10: 10 1.6768  1 0.0000   -1  1.6768
11: 11 1.6746  0 6.2500   -1 -4.5754
12: 12 1.6503  0 0.2547   -1  1.3956
13: 13 1.6258  0 0.9870   -1  0.6388
14: 14 1.6190  0 0.3654   -1  1.2536
15: 15 1.6154  0 0.6251   -1  0.9903
16: 16 1.6258  0 0.3690   -1  1.2568
17: 17 1.6397  0 0.0000   -1  1.6397
18: 18 1.6443  0 0.0000   -1  1.6443
19: 19 1.6491  0 0.0000   -1  1.6491
20: 20 1.6503  0 0.0000   -1  1.6503

Probably best to write a testable function for the operator:

Top <-
function(x){
 rx = rle(c(-1,x))
 wz = which(rx$values==0)
 rx$values[wz] = rx$values[wz-1]
 -inverse.rle(rx)[-1]
}

Then you can check simple examples:

> Top(0) # add
[1] 1
> Top(-1) # add
[1] 1
> Top(1) # subtract
[1] -1

> Top(c(0,-1,1,-1,0,0,0))
[1]  1  1 -1  1  1  1  1   # add, add, sub, add, add, add, add

> Top(mydata$TB)
[1]  1  1  1  1  1  1  1  1  1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1

Then your solution is a one-liner:

> mydata$TSC = mydata$TS + Top(mydata$TB) * mydata$TC

Upvotes: 1

Related Questions