Separating timestamped data into bins

Question

I have a text file containing timestamps with associated button presses. I loaded it into R using RStudio. The button presses are stored as strings.

52 right 08:16:23
53     a 08:16:23
54    up 08:16:24
55     a 08:16:24
56     b 08:16:24
57     a 08:16:24
58     a 08:16:24
59 right 08:16:24
60     a 08:16:24

The timestamps came as separate date and time fields in my text file, and I have converted them into POSIXct timestamps.

I want to break the data into equally spaced bins based on time and count the frequency of each button within each bin.

There are only a handful of buttons, but there are a lot of different, non-unique timestamps.

Ideally I'd like intervals as small as one minute, and a solution that lets me change the granularity would be great.

IRTFM · Accepted Answer

Let's assume you have a data.frame named "dat" with the time strings in a column named "V3", as in the one I created from your text, and the POSIXct timestamps in a column named "time". With your data, seq.POSIXct with an interval of one minute creates only a single point, and cut cannot handle that, so I started substituting different time values. In the process I discovered that my initial attempt with seq.POSIXct returned NA for the upper values: the sequence stops short of the maximum whenever the seconds in the max time are higher than the seconds in the min time. To fix that I added 60 seconds to the max. I used one minute as the interval for this demonstration. You should be able to generalize the code in the obvious locations.

# Initial failed attempt with your data
> grp <- cut(dat$time, breaks=seq(min(dat$time), max(dat$time), by="1 min"), include.lowest=TRUE) 
Error in cut.default(unclass(x), unclass(breaks), labels = labels, right = right,  : 
  'breaks' are not unique

 # Better data, more challenging, allows better testing

dat$grp <- cut(dat$time, breaks=seq(min(dat$time), 
                                      max(dat$time)+60, by="1 min"), 
                           include.lowest=TRUE,right=TRUE)

> dat
  V1    V2       V3                time                 grp
1 52 right 08:16:23 2016-04-17 08:16:23 2016-04-17 08:15:24
2 53     a 08:16:23 2016-04-17 08:16:23 2016-04-17 08:15:24
3 54    up 08:17:59 2016-04-17 08:17:59 2016-04-17 08:17:24
4 55     a 08:18:45 2016-04-17 08:18:45 2016-04-17 08:18:24
5 56     b 08:20:53 2016-04-17 08:20:53 2016-04-17 08:20:24
6 57     a 08:20:01 2016-04-17 08:20:01 2016-04-17 08:19:24
7 58     a  08:17:5 2016-04-17 08:17:05 2016-04-17 08:16:24
8 59 right 08:18:24 2016-04-17 08:18:24 2016-04-17 08:17:24
9 60     a 08:14:24 2016-04-17 08:14:24 2016-04-17 08:14:24
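
To see why 60 seconds is added to the maximum, here is a minimal sketch using the smallest and largest timestamps from the table above. The variable names and the "UTC" time zone are assumptions for illustration, not part of the original code.

```r
# Assumed illustration values: the min and max times from the sample data
lo <- as.POSIXct("2016-04-17 08:14:24", tz = "UTC")
hi <- as.POSIXct("2016-04-17 08:20:53", tz = "UTC")

# Without padding, the sequence stops at the last whole minute below hi
short_breaks <- seq(lo, hi, by = "1 min")
max(short_breaks) < hi             # the largest timestamp is not covered

# cut() returns NA for values that lie outside the breaks
is.na(cut(hi, breaks = short_breaks))

# Padding the upper end by one interval puts a break above the max
padded_breaks <- seq(lo, hi + 60, by = "1 min")
max(padded_breaks) >= hi
```

For an interval other than one minute, the padding would need to be at least one interval long rather than 60 seconds.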

You can get the counts by group with table:

> table(dat$grp)

2016-04-17 08:14:24 2016-04-17 08:15:24 2016-04-17 08:16:24 2016-04-17 08:17:24 
                  1                   2                   1                   2 
2016-04-17 08:18:24 2016-04-17 08:19:24 2016-04-17 08:20:24 
                  1                   1                   1

See ?table for additional options about handling missing values.
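
The question also asks for counts per button and for adjustable granularity. One possible way to get both is sketched below. The helper name count_by_bin, its arguments, and the "UTC" time zone are hypothetical and not part of the original answer. The sample vectors are rebuilt from the table shown earlier.

```r
# Sample data rebuilt from the answer's table (time zone is an assumption)
times <- as.POSIXct(paste("2016-04-17",
                          c("08:16:23", "08:16:23", "08:17:59", "08:18:45",
                            "08:20:53", "08:20:01", "08:17:05", "08:18:24",
                            "08:14:24")), tz = "UTC")
buttons <- c("right", "a", "up", "a", "b", "a", "a", "right", "a")

# Hypothetical helper: cross-tabulate time bins against buttons.
# `by` is any interval string accepted by seq() for date-times, e.g. "5 min".
count_by_bin <- function(times, buttons, by = "1 min") {
  # Pad the upper end by one interval so the largest timestamp gets a bin
  end <- seq(max(times), by = by, length.out = 2)[2]
  brks <- seq(min(times), end, by = by)
  grp <- cut(times, breaks = brks, include.lowest = TRUE, right = TRUE)
  # Two-way table: one row per bin, one column per button
  table(bin = grp, button = buttons)
}

counts_1min <- count_by_bin(times, buttons, by = "1 min")
counts_5min <- count_by_bin(times, buttons, by = "5 min")
print(counts_1min)
print(counts_5min)
```

As in the code above, each bin is labelled with its lower break and is closed on the right, so a press exactly on a break falls into the earlier bin (except the very first timestamp, which include.lowest keeps in the first bin).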