Michael

Reputation: 11

Remove fake negative value in R

I need to calculate the total amount for each user in the data set. The problem is that even when a user's amount is already 0, negative events keep being subtracted, which produces fake negative values (an amount can never be less than 0, so this is a bug).

Every + or - corresponds to a real event. However, once the amount hits 0, the result should not go below 0 no matter how many negative events follow. For example, if the first 10 events are negative (a plain sum would reach -1000), followed by one positive event of +200 and then one negative event of -100, the final result should be 100.

Here is an example; the final total amount for this user should be 200.

userdata <- read.table(text="
 ID  Amount UserID Date     Hour
 1   500    2      3/3/2018 0:00
 2  -200    2      3/4/2018 0:00
 3  -250    2      3/5/2018 0:00
 4  -500    2      3/8/2018 0:00
 5   100    2      3/8/2018 0:00
 6  -50     2      3/8/2018 0:00
 7   250    2      3/8/2018 0:00
 8  -100    2      3/8/2018 0:00
", header=TRUE, stringsAsFactors=FALSE)

I need a way to calculate these amounts correctly.

Upvotes: 0

Views: 203

Answers (2)

Uwe

Reputation: 42544

If I understand correctly, the total amount is computed by a cumulative sum which never becomes negative.

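While AkselA's recursive filter computes the totals by looping over the elements of the vector, the approach below iteratively corrects the cumulative sum whenever it becomes negative: it finds the first negative entry and shifts that entry and everything after it up so that the entry becomes 0, then repeats. Note that the order of the elements matters, as in a time series.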

nonneg_cumsum <- function(x) {
  n <- length(x)
  y <- cumsum(x)
  repeat {
    i <- head(which(y < 0), 1L)
    if (length(i) < 1) return(y)
    y[i:n] <- y[i:n] - y[i]
  }
}

nonneg_cumsum(userdata$Amount)
[1] 500 300  50   0 100  50 300 200

For comparison, here is the output of the regular cumsum() function:

cumsum(userdata$Amount)
[1]  500  300   50 -450 -350 -400 -150 -250
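The iterative correction above can also be written without an explicit loop. This is only a sketch and not part of the original answer: each correction effectively subtracts the lowest running total seen so far, so the floored total should equal the cumulative sum minus its running minimum, capped at 0. The helper name floored_cumsum and the standalone amount vector are made up for illustration, and the rows are assumed to be in chronological order.

```r
# Amounts from the question, assumed to be in chronological order.
amount <- c(500, -200, -250, -500, 100, -50, 250, -100)

# floored_cumsum is a hypothetical helper name, not taken from the answers.
floored_cumsum <- function(x) {
  s <- cumsum(x)
  # cummin(s) is the lowest running total so far; pmin(0, .) keeps the
  # correction at zero until the running total first dips below zero.
  s - pmin(0, cummin(s))
}

floored_cumsum(amount)
# [1] 500 300  50   0 100  50 300 200
```

For the scenario described in the question (a run of negative events down to -1000, then +200, then -100), the same function ends at 100.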

Upvotes: 1

AkselA

Reputation: 8836

I think we can solve this with a recursive filter, a sort of conditional cumulative sum.

# Isolate the vector we're interested in and prepend a zero
y <- c(0, userdata$Amount)

# run a for loop
for (i in 2:length(y)) {   # For every position in the vector, 
    y[i] <- y[i-1] + y[i]  # add the previous to the present.
    if (y[i] < 0) {        # If the resulting sum is less than zero,
        y[i] <- 0          # replace it with zero
    }
}

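# Or, equivalently but maybe a bit more elegantly
# (use this instead of the loop above, starting again from the fresh vector)
y <- c(0, userdata$Amount)
for (i in 2:length(y)) { 
    y[i] <- max(0, y[i-1] + y[i])
}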

y[-1]
# [1] 500 300  50   0 100  50 300 200  

tail(y, 1)
# [1] 200
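Since the question asks for a total per user, here is a rough sketch of how the same recursion could be applied group by group. It is not part of the original answer: the tx data frame, its second user, and the helper name running_balance are invented for illustration, and the rows are assumed to be sorted chronologically within each user. Reduce() with accumulate = TRUE plays the role of the for loop, and ave() applies it per UserID.

```r
# Hypothetical two-user data set; rows assumed chronological within each user.
tx <- data.frame(
  UserID = c(2, 2, 2, 3, 3, 3),
  Amount = c(500, -700, 100, -50, 80, -30)
)

# running_balance is a made-up helper: add each amount to the previous
# balance and clamp at zero, starting from a balance of 0.
running_balance <- function(x) {
  Reduce(function(acc, amt) max(0, acc + amt), x, init = 0, accumulate = TRUE)[-1]
}

# Running balance per user, in the original row order.
tx$Balance <- ave(tx$Amount, tx$UserID, FUN = running_balance)
tx$Balance
# [1] 500   0 100   0  80  50

# Final balance per user: the last running balance in each group.
final <- tapply(tx$Balance, tx$UserID, function(b) b[length(b)])
final
#   2   3 
# 100  50 
```

The [-1] drops the starting balance of 0 that Reduce() prepends when init is supplied, the same way y[-1] drops the prepended zero in the loop version.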

Upvotes: 1
