biviz
biviz

Reputation: 173

R: cumulative total at a daily level

I have the following dataset:

enter image description here

I want to measure the cumulative total at a daily level. So the result look something like:

enter image description here

I can use dplyr's cumsum function but the count for "missing days" won't show up. As an example, the date 1/3/18 does not exist in the original dataframe. I want this missed date to be in the resultant dataframe and its cumulative sum should be the same as the last known date i.e. 1/2/18 with the sum being 5.

Any help is appreciated! I am new to the language.

Upvotes: 0

Views: 1257

Answers (1)

r2evans
r2evans

Reputation: 160447

I'll use this second data.frame to fill out the missing dates:

daterange <- data.frame(Date = seq(min(x$Date), max(x$Date), by = "1 day"))

Base R:

transform(merge(x, daterange, all = TRUE),
          Count = cumsum(ifelse(is.na(Count), 0, Count)))
#          Date Count
# 1  2018-01-01     2
# 2  2018-01-02     5
# 3  2018-01-03     5
# 4  2018-01-04     5
# 5  2018-01-05    10
# 6  2018-01-06    10
# 7  2018-01-07    10
# 8  2018-01-08    11
# ...
# 32 2018-02-01    17

dplyr

library(dplyr)
x %>%
  right_join(daterange) %>%
  mutate(Count = cumsum(if_else(is.na(Count), 0, Count)))

Data:

x <- data.frame(Date = as.Date(c("1/1/18", "1/2/18", "1/5/18", "1/8/18", "2/1/18"), format="%m/%d/%y"),
                Count = c(2,3,5,1,6))

Upvotes: 1

Related Questions