Reputation: 2044
I am trying to use dplyr
and it's mutate
and lag
functions to create a burndown based on another column and the previous value in the column I'm creating.
I have seen an answer here that is asking for the same thing but the OP answered it themselves with using a loop. I am hoping there is a more eloquent solution to within dplyr
.
sample (imagine df has 1000's rows):
df <- data.frame(SCHED_SHIP_DATE = c("2019-09-10","2019-09-11","2019-09-12","2019-09-13"), quantity = c(156,52,136,285), stringsAsFactors = FALSE)
start <- 4000
temp3 <- df %>%
arrange(SCHED_SHIP_DATE) %>%
mutate(burndown = ifelse(row_number() == 1, start, NA)) %>%
mutate(burndown = lag(burndown, default = burndown[1]) - quantity)
I get the following output (NA
's after the it's done one row of mutate):
> temp3
SCHED_SHIP_DATE quantity burndown
1 2019-09-10 156 3844
2 2019-09-11 52 3948
3 2019-09-12 136 NA
4 2019-09-13 285 NA
When I am expecting this:
> temp3
SCHED_SHIP_DATE quantity burndown
1 2019-09-10 156 3844
2 2019-09-11 52 3948
3 2019-09-12 136 3812
4 2019-09-13 285 3527
Upvotes: 0
Views: 488
Reputation: 11140
I think your desired output is a bit wrong. Here's what you probably need -
df %>%
arrange(SCHED_SHIP_DATE) %>%
mutate(
burndown = 4000 - cumsum(quantity)
# burndown = pmax(4000 - cumsum(quantity), 0) # use this if you don't want -ve burndown
)
SCHED_SHIP_DATE quantity burndown
1 2019-09-10 156 3844
2 2019-09-11 52 3792
3 2019-09-12 136 3656
4 2019-09-13 285 3371
Upvotes: 0