patL

Reputation: 2299

Regular time interval in R

I have a time series dataset (ts) with one row per day and the number of sells for that day.

ts
## A tibble: 40 x 2
#        dates sells
#       <date> <int>
# 1 2014-09-01    32
# 2 2014-09-02     8
# 3 2014-09-03    39
# 4 2014-09-04    38
# 5 2014-09-05     1
# 6 2014-09-06    28
# 7 2014-09-07    33
# 8 2014-09-08    21
# 9 2014-09-09    29
#10 2014-09-10    33
## ... with 30 more rows

I want to sum sells over regular intervals, for example every four days.

In this case, the output for the first 8 days would be:

## A tibble: 2 x 1
#  value
#  <dbl>
#1   117
#2    83

I know this is easy to do with resample from pandas in Python, but I can't work out how to do it in R.

My data:

ts <- structure(list(dates = structure(c(16314, 16315, 16316, 16317, 
16318, 16319, 16320, 16321, 16322, 16323, 16324, 16325, 16326, 
16327, 16328, 16329, 16330, 16331, 16332, 16333, 16334, 16335, 
16336, 16337, 16338, 16339, 16340, 16341, 16342, 16343, 16344, 
16345, 16346, 16347, 16348, 16349, 16350, 16351, 16352, 16353
), class = "Date"), sells = c(32L, 8L, 39L, 38L, 1L, 28L, 33L, 
21L, 29L, 33L, 13L, 32L, 10L, 15L, 19L, 3L, 17L, 35L, 29L, 10L, 
27L, 14L, 30L, 11L, 24L, 31L, 10L, 27L, 32L, 23L, 25L, 2L, 22L, 
4L, 18L, 22L, 15L, 16L, 23L, 3L)), .Names = c("dates", "sells"
), row.names = c(NA, -40L), class = c("tbl_df", "tbl", "data.frame"
))

Thank you.

Upvotes: 1

Views: 128

Answers (2)

G. Grothendieck

Reputation: 269854

Switching to a time series representation makes this particularly simple:

library(zoo)

z <- read.zoo(ts)
z4 <- rollapplyr(z, 4, by = 4, sum)

giving the following time series, indexed by the ending date of each 4-day interval:

> z4
2014-09-04 2014-09-08 2014-09-12 2014-09-16 2014-09-20 2014-09-24 2014-09-28 
       117         83        107         47         91         82         92 
2014-10-02 2014-10-06 2014-10-10 
        82         66         57 

(To convert the output to a data frame, use fortify.zoo(z4); if you just want the sequence of sums as a plain vector, use coredata(z4).)

library(ggplot2)
autoplot(z4)
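
The two conversions mentioned above can be sketched on a smaller series. This is a minimal illustration, not part of the original answer: it rebuilds only the first eight days of the question's data by hand, and the names dates_demo, sells_demo, z_demo and z4_demo are made up for the example.

```r
library(zoo)

# First eight days of the question's data, typed in by hand (illustrative names).
dates_demo <- as.Date("2014-09-01") + 0:7
sells_demo <- c(32L, 8L, 39L, 38L, 1L, 28L, 33L, 21L)
z_demo <- zoo(sells_demo, order.by = dates_demo)

# Non-overlapping, right-aligned 4-day windows: width 4, advancing 4 rows at a time.
z4_demo <- rollapplyr(z_demo, 4, by = 4, FUN = sum)

# Plain vector of the sums, dropping the date index.
sums <- coredata(z4_demo)

# Data frame with the dates in an Index column and the sums in a second column.
df <- fortify.zoo(z4_demo)

print(sums)
print(df)
```

Here sums holds 117 and 83, matching the first two values in the output above, and the index of z4_demo holds the last day of each window (2014-09-04 and 2014-09-08).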


Upvotes: 1

akrun

Reputation: 887431

In R, one option is to use cut.Date inside group_by to bin the dates into 4-day intervals and then sum 'sells' within each bin:

library(dplyr)
out <- ts %>%
         group_by(interval = cut(dates, breaks = '4 day')) %>% 
         summarise(value = sum(sells))
head(out, 2)
# A tibble: 2 x 2
#   interval  value
#  <fctr>     <int>
#1 2014-09-01   117
#2 2014-09-05    83

Upvotes: 4
