Reputation: 11
I need to calculate the total amount for each user in the data set but the problem is even when the amount for the user is 0 it keeps subtracting and generating fake negative values(there can't be amount less then 0 its bug).
For every + or - there are real events. However, when the amount hits 0, no matter how many negative events appear the result should not go below 0, and if we have first 10 negative events like going to -1000 and after that we have one positive +200 and after that one negative -100, I need the final result to be 100.
Here is example, the final total amount for that user should be 200.
userdata <- read.table(text="
ID Amount UserID Date Hour
1 500 2 3/3/2018 0:00
2 -200 2 3/4/2018 0:00
3 -250 2 3/5/2018 0:00
4 -500 2 3/8/2018 0:00
5 100 2 3/8/2018 0:00
6 -50 2 3/8/2018 0:00
7 250 2 3/8/2018 0:00
8 -100 2 3/8/2018 0:00
", header=TRUE, stringsAsFactors=FALSE)
I need a way to correctly calculate that amounts.
Upvotes: 0
Views: 203
Reputation: 42544
If I understand correctly, the total amount is computed by a cumulative sum which never becomes negative.
While AkselA's recursive filter computes the totals by looping over the elements of the vector, the approach below iteratively corrects the cumulative sum whenever it becomes negative. Note that the order of elements is important, e.g., time series.
nonneg_cumsum <- function(x) {
n <- length(x)
y <- cumsum(x)
repeat {
i <- head(which(y < 0), 1L)
if (length(i) < 1) return(y)
y[i:n] <- y[i:n] - y[i]
}
}
nonneg_cumsum(userdata$Amount)
[1] 500 300 50 0 100 50 300 200
For comparison, here is the output of the regular cumsum()
function:
cumsum(userdata$Amount)
[1] 500 300 50 -450 -350 -400 -150 -250
Upvotes: 1
Reputation: 8836
I think we can solve this with a recursive filter, a sort of conditional cumulative sum.
# Isolate the vector we're interested in and prepend a zero
y <- c(0, userdata$Amount)
# run a for loop
for (i in 2:length(y)) { # For every position in the vector,
y[i] <- y[i-1] + y[i] # add the previous to the present.
if (y[i] < 0) { # If the resulting sum is less than zero,
y[i] <- 0 # replace it with zero
}
}
# Or equivalent, but maybe a bit more elegant
for (i in 2:length(y)) {
y[i] <- max(c(0, y[i-1] + y[i]))
}
y[-1]
# [1] 500 300 50 0 100 50 300 200
tail(y, 1)
# 200
Upvotes: 1