How to add values in one column that have '0' in another column

Question

I have a dataset with Time.Interval, Net.Chg, and Tick.Count columns. Net.Chg can be positive, negative, or zero. Based on the sign of Net.Chg, I want to sum the values in Tick.Count separately for the positive, negative, and zero rows, grouped by date.

Time.Interval    Net.Chg    Tick.Count
2-Jan-17         NA         NA
19:15 - 19:16    -0.0047    7
19:16 - 19:17    0          8
19:17 - 19:18    0.0025     10
3-Jan-17         NA         NA
03:45 - 03:46    0          1
03:54 - 03:55    -0.0002    2
19:43 - 19:44    -0.0008    4
20:01 - 20:02    0.0025     2
4-Jan-17         NA         NA
00:54 - 00:55    -0.0007    2
01:10 - 01:11    0.0005     1
01:11 - 01:12    0          1

Time.Interval <- c('2-Jan-17 _00:00:00.000000', '19:15 - 19:16', '19:16 - 19:17', '19:17 - 19:18', '3-Jan-17 _00:00:00.000000', '03:45 - 03:46', '03:54 - 03:55', '19:43 - 19:44', '20:01 - 20:02', '4-Jan-17 _00:00:00.000000', '00:54 - 00:55', '01:10 - 01:11', '01:11 - 01:12')
Net.Chg <- c(NA, -0.0047, 0, 0.0025, NA, 0, -0.0002, -0.0008, 0.0025, NA, -0.0007, 0.0005, 0)
Tick.Count <-  c(NA, 7, 8, 10, NA, 1, 2, 4, 2, NA, 2, 1, 1)
data <- data.frame(Time.Interval, Net.Chg, Tick.Count)

The required output is

pos = sum of "Tick.Count" if Net.Chg > 0
neg = sum of "Tick.Count" if Net.Chg < 0
UnChng = sum of "Tick.Count" if Net.Chg == 0
OF = pos - neg

I tried the following code

DF <- dd %>% group_by(grp = cumsum(str_detect(Time.Interval, "[A-Z]"))) %>% summarise(Time.Interval = anydate(first(Time.Interval)), pos = sum((Net.Chg > 0)* Tick.Count, na.rm = T),  neg = sum((Net.Chg < 0) * Tick.Count, na.rm = T), unChg = sum(Net.Chg ==0 * Tick.Count, na.rm=T), OF = sum(sign(Net.Chg) * Tick.Count, na.rm = TRUE))

This code gives correct values for pos, neg, and OF, but the unChg value is wrong.

The current output is

Time.Interval      pos    Neg     UnChng     OF
02Jan2017          10     7       4           3      
03Jan2017          2      6       5          -4
04Jan2017          1      2       4          -1

while the actual output should be

Time.Interval      pos    Neg     UnChng     OF
02Jan2017          10     7       8           3      
03Jan2017          2      6       1          -4
04Jan2017          1      2       1          -1

I also tried sum(Net.Chg ==0 + Tick.Count, na.rm=T) and length(Net.Chg ==0 * Tick.Count), but neither worked.

Rorschach · Accepted Answer

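Avoid == when comparing floating-point numbers, because of precision error. R has all.equal for tolerance-based comparison (identical tests exact equality), or you can just check against a small tolerance. Note also that * binds tighter than ==, so Net.Chg ==0 * Tick.Count is evaluated as Net.Chg == (0 * Tick.Count); the comparison has to be wrapped in parentheses before it is multiplied, e.g.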

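library(dplyr)    # group_by, summarise, first
library(stringr)  # str_detect
library(anytime)  # anydate

DF <- dd %>%
    # date rows contain a letter, so each one starts a new group
    group_by(grp = cumsum(str_detect(Time.Interval, "[A-Z]"))) %>%
    summarise(Time.Interval = anydate(first(Time.Interval)),
        pos = sum((Net.Chg > 0) * Tick.Count, na.rm = TRUE),
        neg = sum((Net.Chg < 0) * Tick.Count, na.rm = TRUE),
        # parentheses make the comparison run before the multiplication
        unChg = sum((abs(Net.Chg) < 1e-15) * Tick.Count, na.rm = TRUE),
        OF = sum(sign(Net.Chg) * Tick.Count, na.rm = TRUE))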
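
To see what the parentheses change, here is a minimal base-R sketch on the sample data from the question. It is only an illustration, not the dplyr pipeline itself: the names grp, tol, wrong and result are made up for this example, tapply stands in for group_by/summarise, and the tolerance of 1e-15 is an assumption carried over from the code above.

```r
# Sample data from the question; date rows carry NA in the numeric columns.
Time.Interval <- c('2-Jan-17 _00:00:00.000000', '19:15 - 19:16', '19:16 - 19:17',
                   '19:17 - 19:18', '3-Jan-17 _00:00:00.000000', '03:45 - 03:46',
                   '03:54 - 03:55', '19:43 - 19:44', '20:01 - 20:02',
                   '4-Jan-17 _00:00:00.000000', '00:54 - 00:55', '01:10 - 01:11',
                   '01:11 - 01:12')
Net.Chg <- c(NA, -0.0047, 0, 0.0025, NA, 0, -0.0002, -0.0008, 0.0025,
             NA, -0.0007, 0.0005, 0)
Tick.Count <- c(NA, 7, 8, 10, NA, 1, 2, 4, 2, NA, 2, 1, 1)

# A new group starts at every row that contains a letter (the date rows).
grp <- cumsum(grepl("[A-Za-z]", Time.Interval))

# `*` binds tighter than `==`, so this compares Net.Chg with (0 * Tick.Count);
# the sum then only counts the zero rows instead of adding their tick counts.
wrong <- tapply(Net.Chg == 0 * Tick.Count, grp, sum, na.rm = TRUE)

# Parenthesised tolerance check: TRUE/FALSE becomes 1/0 before multiplying.
tol <- 1e-15  # assumed tolerance, hypothetical name
pos   <- tapply((Net.Chg > 0) * Tick.Count, grp, sum, na.rm = TRUE)
neg   <- tapply((Net.Chg < 0) * Tick.Count, grp, sum, na.rm = TRUE)
unChg <- tapply((abs(Net.Chg) < tol) * Tick.Count, grp, sum, na.rm = TRUE)

result <- data.frame(grp = names(unChg),
                     pos = as.vector(pos),
                     neg = as.vector(neg),
                     unChg = as.vector(unChg),
                     OF = as.vector(pos - neg))
print(wrong)   # 1, 1, 1: a count of zero rows per date
print(result)  # unChg is 8, 1, 1 for the three dates
```

On this sample the unparenthesised version returns 1 for every date, while the parenthesised one gives the 8, 1, 1 the question asks for.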

It's also considered bad practice to use T instead of TRUE as the former can be set to any value.
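
A short sketch of why that matters, using a made-up vector x: T is an ordinary variable that happens to be bound to TRUE, so a stray assignment silently changes what na.rm = T means, whereas TRUE is a reserved word.

```r
x <- c(1, 2, NA)

T <- FALSE                    # legal: T is just a variable, TRUE by default
masked <- sum(x, na.rm = T)   # na.rm is now FALSE, so the NA propagates
rm(T)                         # drop the mask; T resolves to base::T again

safe <- sum(x, na.rm = TRUE)  # TRUE is reserved and cannot be reassigned

print(masked)  # NA
print(safe)    # 3
```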
