Reputation: 216
I'm trying to graph requests per second using our Apache log files. I've massaged the log down to a simple listing of timestamps, one entry per request.
04:02:28
04:02:28
04:02:28
04:02:29
...
I can't quite figure out how to get R to recognize these as times and aggregate them per second. Thanks for any help.
Upvotes: 2
Views: 1638
Reputation: 20282
It seems to me that since you already have time-stamps at one-second granularity, all you need to do is a frequency count of the time-stamps, plotted in the original time order. Say timeStamps is your array of time-stamps; then you would do:
plot(c( table( timeStamps ) ) )
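I'm assuming you want to plot the number of log messages in each one-second interval over a certain period. I'm also assuming that the HMS time-stamps all fall within one day, so that sorting them as strings keeps them in time order. Note that the table function produces a frequency count of its argument, and that seconds with no requests at all do not appear in its output.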
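As a rough sketch of the frequency-count idea, the snippet below also fills in seconds that had no requests, so gaps show up as zeros in the plot. The sample values in timeStamps are made up for illustration, and the variable names (secs, allSecs, perSecond) are not from the original post. It assumes all time-stamps fall within a single day.

```r
# Hypothetical sample input: one "HH:MM:SS" string per request
timeStamps <- c("04:02:28", "04:02:28", "04:02:28", "04:02:29", "04:02:31")

# Convert each time-stamp to seconds since midnight
parts <- do.call(rbind, strsplit(timeStamps, ":", fixed = TRUE))
secs <- as.integer(parts[, 1]) * 3600L +
  as.integer(parts[, 2]) * 60L +
  as.integer(parts[, 3])

# List every second in the observed range so idle seconds count as zero
allSecs <- seq(min(secs), max(secs))
perSecond <- as.vector(table(factor(secs, levels = allSecs)))

# Rebuild "HH:MM:SS" labels for the x axis
labels <- sprintf("%02d:%02d:%02d",
                  allSecs %/% 3600L, (allSecs %% 3600L) %/% 60L, allSecs %% 60L)

plot(allSecs, perSecond, type = "h", xaxt = "n",
     xlab = "time", ylab = "requests per second")
axis(1, at = allSecs, labels = labels)
```

Using a factor with explicit levels is what makes table report a zero for 04:02:30 here; a plain table(timeStamps) would skip that second entirely.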
Upvotes: 1
Reputation: 5390
I'm not exactly sure what the correct way to do this is, but here is one possible approach that may help you.
1. Instead of strings, get the data from the database as UNIX timestamps, which give the number of seconds since 1970-01-01.
2. Use hist(data) to plot a histogram, for example. Alternatively, use the melt command from the reshape2 package together with cast (named dcast in reshape2) to create a data frame where one column is the timestamp and another column holds the number of transactions at that time.
3. Use as.POSIXlt(your.unix.timestamps, origin="1970-01-01", tz="GMT") to convert the timestamps to datetime structures that R understands.
4. Then add labels to the plot by passing the data from point 3 through format.
Here's an example:
# original data
data.timestamps = c(1297977452, 1297977452, 1297977453, 1297977454, 1297977454, 1297977454, 1297977455, 1297977455)
data.unique.timestamps = unique(data.timestamps)
# get the labels
data.labels = format(as.POSIXlt(data.unique.timestamps, origin="1970-01-01", tz="GMT"), "%H:%M:%S")
# one bin per second, centred on each timestamp, so the bars line up with the labels
data.breaks = seq(min(data.timestamps) - 0.5, max(data.timestamps) + 0.5, by=1)
# plot the histogram without axes
hist(data.timestamps, breaks=data.breaks, axes=FALSE)
# add axes manually
axis(2)
axis(1, at=data.unique.timestamps, labels=data.labels)
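The data-frame alternative mentioned in point 2 can also be sketched in base R, without reshape2. This is only one possible way to do it; the column names timestamp, requests and time are chosen here for illustration and are not from the original answer.

```r
# Same sample data as in the example above
data.timestamps <- c(1297977452, 1297977452, 1297977453, 1297977454,
                     1297977454, 1297977454, 1297977455, 1297977455)

# Count requests per timestamp and turn the result into a data frame
counts <- as.data.frame(table(timestamp = data.timestamps),
                        responseName = "requests",
                        stringsAsFactors = FALSE)

# table() stores the timestamps as strings, so convert them back to numbers
counts$timestamp <- as.numeric(counts$timestamp)

# Add a human-readable time column (GMT, as in the original example)
counts$time <- format(as.POSIXct(counts$timestamp, origin = "1970-01-01", tz = "GMT"),
                      "%H:%M:%S")

print(counts)
```

The resulting data frame has one row per distinct second, which makes it straightforward to pass to plot or to another plotting package.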
-- Hope this helps
Upvotes: 1
Reputation: 179518
The lubridate package makes working with dates and times very easy.
Here is an example, using the hms() function of lubridate. hms converts a character string into a period with separate components for hours, minutes and seconds (shown as a data frame with one column each in the version used here). There are similar functions such as mdy (month-day-year), dmy (day-month-year), ms (minutes-seconds)... you get the point.
library(lubridate)
data <- c("04:02:28", "04:02:28", "04:02:28", "04:02:29")
times <- hms(data)
times$second
[1] 28 28 28 29
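At this point, times holds the hours, minutes and seconds as separate components, and you can isolate any one you wish. In the lubridate version used for this answer it is a straightforward data frame; newer releases return a Period object instead, so the str output may look different: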
str(times)
Classes 'period' and 'data.frame': 4 obs. of 6 variables:
$ year : num 0 0 0 0
$ month : num 0 0 0 0
$ day : num 0 0 0 0
$ hour : num 4 4 4 4
$ minute: num 2 2 2 2
$ second: num 28 28 28 29
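To get from the parsed times to a per-second count, one option is to collapse each period into seconds since midnight and tabulate the result. The sketch below assumes a lubridate version that provides period_to_seconds(); the variable names secs and perSecond are illustrative only.

```r
library(lubridate)

data <- c("04:02:28", "04:02:28", "04:02:28", "04:02:29")

# Collapse hours, minutes and seconds into seconds since midnight
secs <- period_to_seconds(hms(data))

# Count how many requests fall on each second
perSecond <- table(secs)

# Plot one vertical line per second, with its height showing the request count
plot(as.numeric(names(perSecond)), as.vector(perSecond), type = "h",
     xlab = "seconds since midnight", ylab = "requests per second")
```

As with any plain table call, seconds without requests are simply absent from perSecond rather than reported as zero.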
Upvotes: 3