S Das

Reputation: 3391

Summing Frequency based on Time Stamp Interval

I have the following dataset, which gives a frequency count for each timestamp.

a <- read.table(header=TRUE, text="
Time Freq
7:00:36    3
7:00:55    0
7:02:18    8
7:02:54    3
7:04:20    6
7:04:36    0
7:05:52    4
7:06:17    0
7:07:47    3
7:08:03    0
                   ")
a  
      Time Freq
1  7:00:36    3
2  7:00:55    0
3  7:02:18    8
4  7:02:54    3
5  7:04:20    6
6  7:04:36    0
7  7:05:52    4
8  7:06:17    0
9  7:07:47    3
10 7:08:03    0

str(a)
'data.frame':   10 obs. of  2 variables:
 $ Time: Factor w/ 10 levels "7:00:36","7:00:55",..: 1 2 3 4 5 6 7 8 9 10
 $ Freq: int  3 0 8 3 6 0 4 0 3 0

a$Time <- as.POSIXct(strptime(a$Time, "%H:%M:%OS"))

str(a)
'data.frame':   10 obs. of  2 variables:
 $ Time: POSIXct, format: "2016-05-09 07:00:36" "2016-05-09 07:00:55" "2016-05-09 07:02:18" "2016-05-09 07:02:54" ...
 $ Freq: int  3 0 8 3 6 0 4 0 3 0

I want to sum the frequencies over fixed time intervals, for example every 2 minutes. The desired result would look like this:

           interval frequency
1 07:00:01-07:02:00         3
2 07:02:01-07:04:00        11
3 07:04:01-07:06:00        10
4 07:06:01-07:08:00         3
5 07:08:01-07:10:00         0

Here's my attempt:

library(dplyr)
intrvl <- 2

summary <- a %>%
  mutate(interval = floor((as.numeric(Time - min(Time)))/intrvl)+1) %>%
  group_by(interval, add = TRUE) %>%
  summarize(starttime = min(Time),
            frequency = n()) %>%
  select(-interval)
summary
Source: local data frame [10 x 2]

             starttime frequency
                (time)     (int)
1  2016-05-09 07:00:36         1
2  2016-05-09 07:00:55         1
3  2016-05-09 07:02:18         1
4  2016-05-09 07:02:54         1
5  2016-05-09 07:04:20         1
6  2016-05-09 07:04:36         1
7  2016-05-09 07:05:52         1
8  2016-05-09 07:06:17         1
9  2016-05-09 07:07:47         1
10 2016-05-09 07:08:03         1

Upvotes: 1

Views: 430

Answers (1)

lmo

Reputation: 38520

This base R approach works: cut bins the timestamps into 2-minute intervals, and aggregate sums Freq within each bin.

a$Time <- as.POSIXct(strptime(a$Time, "%H:%M:%OS"))

# get a factor variable that contains separate levels for every 2 minute interval
a$interval <- cut(a$Time, breaks="2 min")
# aggregate the data, summing the frequencies
aggregate(Freq ~ interval, data=a, FUN=sum)
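
With breaks="2 min", the interval levels are labelled by each bin's start time rather than in the "07:00:01-07:02:00" style shown in the question. If those labels matter, one possible sketch is to build the break points explicitly and pass custom labels to cut. The names brks and lbls are made up for this illustration, the date is taken from the question's str() output, and the UTC time zone is an assumption made only so the result is reproducible.

```r
# Rebuild the question's data so the sketch is self-contained.
a <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Time Freq
7:00:36    3
7:00:55    0
7:02:18    8
7:02:54    3
7:04:20    6
7:04:36    0
7:05:52    4
7:06:17    0
7:07:47    3
7:08:03    0
")

# Assumption: pin the date and time zone so the result does not depend on
# when or where the code runs.
a$Time <- as.POSIXct(paste("2016-05-09", a$Time),
                     format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Break points every 2 minutes, from the first whole minute to past the last row.
start <- as.POSIXct(trunc(min(a$Time), units = "mins"))
brks  <- seq(from = start, to = max(a$Time) + 120, by = "2 min")

# Labels such as "07:00:01-07:02:00": one second after the lower break,
# up to and including the upper break.
n    <- length(brks)
lbls <- paste(format(brks[-n] + 1, "%H:%M:%S"),
              format(brks[-1], "%H:%M:%S"), sep = "-")

# right = TRUE makes each bin (lower, upper], which matches those labels.
a$interval <- cut(a$Time, breaks = brks, labels = lbls, right = TRUE)

res <- aggregate(Freq ~ interval, data = a, FUN = sum)
names(res)[2] <- "frequency"
res
# frequency per interval: 3, 11, 10, 3, 0
```

Because the bins are closed on the right here, a timestamp falling exactly on the first break (07:00:00) would be left unassigned. With whole-second data like the question's, the sums are the same as with the default left-closed bins.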

Upvotes: 1
