Eric Tim
Eric Tim

Reputation: 53

How do I fix a cumulative sum in R when dealing with negative values

I have data based on a user's behavior on chess.com. I'm trying to get data on total puzzles done by a user. Unfortunately, players can reset their puzzles, causing their overall number of puzzles completed (LifeTimeCumulative) to go down. It can go all the way to zero, or to any other number. My data looks like the data below. How do I go from the LifetimeCumulative Column, where sometimes the number goes down, to LifeTimeCumulativeCorrect. Notice that in LifeTimeCumulativeCorrect, if the LifeTimeCumulative goes down, then LifeTimeCumulativeCorrect does not go down. Basically, LifeTimeCumulativeCorrect is summing only the positive increases in LifeTimeCumulative.

library(dplyr)
library(lubridate)

set.seed(1)

observations = 100

dateStart = as.Date("2022-1-1")
dateEnd   = as.Date("2022-1-04")
dates = seq( dateStart, dateEnd, by = "day")

players = c("A","B","C")

df = expand.grid( player=players, date=dates )

df = df %>% 
  arrange( player, date ) %>% 
  mutate( LifeTimeCumulative = sample.int( 3, nrow(df), replace = TRUE) - 1 )


LifeTimeCumulativeCorrect <- c(0,2,2,3,0,2,2,2,1,2,2,2)

df$LifeTimeCumulativeCorrect <- LifeTimeCumulativeCorrect


Data

Upvotes: 2

Views: 329

Answers (1)

Gregor Thomas
Gregor Thomas

Reputation: 145775

We can calculate the increments with diff and use pmax to set all negative increments to 0. Then we cumsum this corrected increment:

df %>%
  group_by(player) %>%
  mutate(
    increment = c(first(LifeTimeCumulative), pmax(diff(LifeTimeCumulative), 0)),
    corrected = cumsum(increment)
  ) %>%
  ungroup()
# # A tibble: 12 × 6
#    player date       LifeTimeCumulative LifeTimeCumulativeCorrect increment corrected
#    <fct>  <date>                  <dbl>                     <dbl>     <dbl>     <dbl>
#  1 A      2022-01-01                  0                         0         0         0
#  2 A      2022-01-02                  2                         2         2         2
#  3 A      2022-01-03                  0                         2         0         2
#  4 A      2022-01-04                  1                         3         1         3
#  5 B      2022-01-01                  0                         0         0         0
#  6 B      2022-01-02                  2                         2         2         2
#  7 B      2022-01-03                  2                         2         0         2
#  8 B      2022-01-04                  1                         2         0         2
#  9 C      2022-01-01                  1                         1         1         1
# 10 C      2022-01-02                  2                         2         1         2
# 11 C      2022-01-03                  2                         2         0         2
# 12 C      2022-01-04                  0                         2         0         2  

Upvotes: 3

Related Questions