Reputation: 401
I have a text file containing timestamps with associated button presses. I loaded it into R using RStudio. The button presses are formatted as strings.
52 right 08:16:23
53 a 08:16:23
54 up 08:16:24
55 a 08:16:24
56 b 08:16:24
57 a 08:16:24
58 a 08:16:24
59 right 08:16:24
60 a 08:16:24
The timestamps have been converted into POSIXct timestamps, but came in separate date and time fields in my text file.
I want to break the data into equally spaced bins based on time and count the frequency of each button within these.
There are a handful of buttons and there are a lot of different nonunique timestamps.
Ideally I'd like intervals as small as one minute, and a solution that lets me change the granularity would be great.
Upvotes: 0
Views: 169
Reputation: 263362
Let's assume you have a data.frame named "dat" with the time string in a column named "V3", as in the one I created from your text, and the converted POSIXct value in a column named "time". With your original data, seq.POSIXct with an interval of one minute creates only a single point, and cut cannot handle that, so I changed the times to different values. In the process I discovered that my initial attempt with seq.POSIXct returned NA for the upper values, because the sequence stops short of the maximum whenever the seconds in the max time are higher than those in the min time, so I added 60 seconds to the max. I used one minute as the interval for this demonstration. You should be able to generalize the code in the obvious locations.
# Initial failed attempt with your data
> grp <- cut(dat$time, breaks=seq(min(dat$time), max(dat$time), by="1 min"), include.lowest=TRUE)
Error in cut.default(unclass(x), unclass(breaks), labels = labels, right = right, :
'breaks' are not unique
# Better data, more challenging, allows better testing
dat$grp <- cut(dat$time,
               breaks = seq(min(dat$time), max(dat$time) + 60, by = "1 min"),
               include.lowest = TRUE, right = TRUE)
> dat
  V1    V2       V3                time                 grp
1 52 right 08:16:23 2016-04-17 08:16:23 2016-04-17 08:15:24
2 53     a 08:16:23 2016-04-17 08:16:23 2016-04-17 08:15:24
3 54    up 08:17:59 2016-04-17 08:17:59 2016-04-17 08:17:24
4 55     a 08:18:45 2016-04-17 08:18:45 2016-04-17 08:18:24
5 56     b 08:20:53 2016-04-17 08:20:53 2016-04-17 08:20:24
6 57     a 08:20:01 2016-04-17 08:20:01 2016-04-17 08:19:24
7 58     a  08:17:5 2016-04-17 08:17:05 2016-04-17 08:16:24
8 59 right 08:18:24 2016-04-17 08:18:24 2016-04-17 08:17:24
9 60     a 08:14:24 2016-04-17 08:14:24 2016-04-17 08:14:24
You can get the counts by group with table:
> table(dat$grp)
2016-04-17 08:14:24 2016-04-17 08:15:24 2016-04-17 08:16:24 2016-04-17 08:17:24
                  1                   2                   1                   2
2016-04-17 08:18:24 2016-04-17 08:19:24 2016-04-17 08:20:24
                  1                   1                   1
See ?table for additional options about handling missing values.
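The question also asks for counts per button within each bin and for adjustable granularity. A minimal sketch of one way to get both, assuming the button names are in "V2" and the POSIXct times are in "time": the data below is my reconstruction of the demonstration values, and count_buttons and bin_width are hypothetical names. It passes an interval string to cut instead of a seq() of break points, so the bins start on whole minutes rather than at the earliest timestamp.

```r
# Reconstruction of the demonstration data (assumed, not the asker's file)
dat <- data.frame(
  V2 = c("right", "a", "up", "a", "b", "a", "a", "right", "a"),
  V3 = c("08:16:23", "08:16:23", "08:17:59", "08:18:45", "08:20:53",
         "08:20:01", "08:17:05", "08:18:24", "08:14:24"),
  stringsAsFactors = FALSE
)
dat$time <- as.POSIXct(paste("2016-04-17", dat$V3), tz = "UTC")

# Hypothetical helper: bin_width is an interval string understood by
# cut() for date-times, such as "1 min", "5 min" or "1 hour"
count_buttons <- function(d, bin_width = "1 min") {
  bin <- cut(d$time, breaks = bin_width)
  # Two-way table: one row per time bin, one column per button
  table(bin = bin, button = d$V2)
}

per_minute <- count_buttons(dat, "1 min")
per_5_min  <- count_buttons(dat, "5 min")
print(per_5_min)
```

Changing the granularity is then just a matter of passing a different string. Bins with no presses in them still show up as rows of zeros, because cut returns a factor that keeps the empty levels.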
Upvotes: 1
Reputation: 948
These functions may be of interest to you:
The answer depends on whether the time is recognized by R. If not, you can use chron( ... ) on your time variable. Please see: http://www.stat.berkeley.edu/~s133/dates.html
c <- cut(time_variable, number_of_bins)
This should get the max and min of the time variable, divide the range by the number of bins, then assign each of the times to the appropriate bin.
table(c)
This will return the frequency in each bin.
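To make those steps concrete, here is a rough sketch that spells them out with explicit break points instead of passing the bin count straight to cut. The sample times are made up for illustration, and building the breaks with seq(..., length.out = ) is my assumption about one way to do it, not necessarily what the call above does internally.

```r
# Made-up sample times, not taken from the question
time_variable <- as.POSIXct(
  c("2016-04-17 08:14:24", "2016-04-17 08:16:23",
    "2016-04-17 08:17:59", "2016-04-17 08:20:54"),
  tz = "UTC"
)
number_of_bins <- 3

# number_of_bins equal-width intervals need number_of_bins + 1 break points
breaks <- seq(min(time_variable), max(time_variable),
              length.out = number_of_bins + 1)

# include.lowest = TRUE closes the last interval, so the maximum is kept
bins <- cut(time_variable, breaks = breaks, include.lowest = TRUE)

# Frequency in each bin
bin_counts <- table(bins)
print(bin_counts)
```

With explicit breaks, each bin is labelled by its start time, which makes it easier to line the counts up with the clock.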
Upvotes: 1