I'm completely new to R, and I have been tasked with writing a script that plots the protocols used by a simulated network of users as histograms by a) identifying the protocols they use and b) splitting everything into 5-second intervals and generating a graph for each protocol used.
Currently we have
data$bucket <- cut(as.numeric(format(data$DateTime, "%H%M")),
                   c(0, 600, 2000, 2359),
                   labels = c("00:00-06:00", "06:00-20:00", "20:00-23:59")) #Split date into dates that are needed to be
to split the times into 3 zones for another function. How should the code be changed to use 5-second intervals?
Sorry if the question isn't very clear, and thank you.
Upvotes: 0
Views: 138
Reputation: 8267
The histogram function hist() can aggregate and/or plot all by itself, so you really don't need cut().
Let's create 1,000 random time stamps across one hour:
set.seed(1)
foo <- as.POSIXct("2014-12-17 00:00:00")+runif(1000)*60*60
(Look at ?POSIXct on how R treats POSIX time objects. In particular, note that "+" assumes you want to add seconds, which is why I am multiplying by 60^2.)
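As a small illustration of that point, here is a minimal sketch of the seconds-based arithmetic. The names t0, t1 and t2 are made up for this example, and the time zone is fixed to UTC only to keep the result reproducible:

```r
# Adding a plain number to a POSIXct value shifts it by that many seconds.
t0 <- as.POSIXct("2014-12-17 00:00:00", tz = "UTC")  # tz is an assumption for reproducibility
t1 <- t0 + 5          # five seconds later
t2 <- t0 + 60 * 60    # one hour later (60 seconds * 60 minutes)

# difftime() reports the gap in the requested units.
print(difftime(t1, t0, units = "secs"))
print(difftime(t2, t0, units = "mins"))
```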
Next, define the breakpoints in 5 second intervals:
breaks <- seq(as.POSIXct("2014-12-17 00:00:00"),
              as.POSIXct("2014-12-17 01:00:00"), by="5 sec")
(This time, look at ?seq.POSIXt.)
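If you do still want a bucket column like the one in the question, the same kind of break points can be handed to cut(). This is only a sketch: it rebuilds the example data with a fixed UTC time zone (an assumption, to keep it reproducible), and the name bucket is made up for illustration.

```r
set.seed(1)
start  <- as.POSIXct("2014-12-17 00:00:00", tz = "UTC")  # tz fixed only for reproducibility
foo    <- start + runif(1000) * 60 * 60                   # 1,000 time stamps within one hour
breaks <- seq(start, start + 60 * 60, by = "5 sec")       # 721 break points -> 720 buckets

# cut() accepts a vector of POSIXct break points and returns a factor
# with one level per 5-second bucket, labelled by the bucket's start.
bucket <- cut(foo, breaks = breaks)

# table() then gives the number of time stamps per bucket.
print(head(table(bucket)))
```

The factor could be stored as data$bucket in the same way as the original three-zone column.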
Now we can plot the histogram. Note how we assign the output of hist() to an object bar:
bar <- hist(foo,breaks)
(If you don't want the plot, but only the bucket counts, use plot=FALSE.)
?hist tells you that hist() (invisibly) returns an object that includes the counts per bucket. We can look at these by accessing the counts component of bar:
bar$counts
[1] 1 2 0 1 0 1 1 2 3 3 0 ...
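To get one set of counts per protocol, as the question asks, the same idea can be applied to each protocol's time stamps separately. The sketch below uses a made-up data frame: the column names DateTime and Protocol and the protocol values are assumptions, not taken from your data.

```r
set.seed(1)
start <- as.POSIXct("2014-12-17 00:00:00", tz = "UTC")  # tz fixed only for reproducibility

# Hypothetical data: column names and protocol values are assumptions.
data <- data.frame(
  DateTime = start + runif(1000) * 60 * 60,
  Protocol = sample(c("TCP", "UDP", "ICMP"), 1000, replace = TRUE)
)

breaks <- seq(start, start + 60 * 60, by = "5 sec")

# Split the time stamps by protocol, then count per 5-second bucket.
# plot = FALSE returns the histogram object without drawing anything.
counts_by_protocol <- lapply(
  split(data$DateTime, data$Protocol),
  function(times) hist(times, breaks = breaks, plot = FALSE)$counts
)

str(counts_by_protocol)
```

Dropping plot = FALSE (and passing a title via main) inside the function should draw one histogram per protocol instead of only returning the counts.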
Upvotes: 2