Reputation: 23
I have a reasonable amount of time data, and I'd like to put it in a frequency graph, where the X-axis would be several intervals of time and the Y-axis would be the amount of data I've collected in each interval.
Let's suppose I have this list:
[10:17:55, 10:37:40, 10:40:26, 10:48:18, 11:00:17, 11:01:12, 11:06:58, 11:09:20, 11:43:41, 11:48:24, 11:49:14, 12:07:31, 12:10:52, 12:10:52, 12:19:00, 12:19:00, 12:19:43, 12:20:55, 12:38:27, 12:38:27, 12:55:09, 12:55:10, 12:57:31, 12:57:31, 13:04:16, 13:04:16, 13:06:51, 13:06:51, 14:55:06, 14:56:10, 15:01:30, 15:28:42, 15:29:17, 15:35:33, 15:58:32, 16:05:07, 16:09:16, 16:10:36, 16:32:57, 16:32:57, 16:34:32, 16:38:16, 17:43:27, 17:53:01, 17:56:14, 18:08:21, 18:17:23, 18:37:23, 18:37:23, 18:43:13, 18:43:13, 18:51:43, 18:51:43, 19:05:39, 19:05:39]
And I'd like to plot a histogram showing how many values fall in intervals of 1h or 30 minutes (still deciding), such as:
10h - 11h: 4
11h - 12h: 7
.
.
.
19h - 20h: 2
But with all of that represented in a graph. I know only the very basics of plotting a histogram in R and couldn't figure out how to do this. I've seen some answers that plot across days, which doesn't really apply here, because these values were collected on different days... Can you guys help me?
EDIT: Here's a dput() of the list:
structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L,
13L, 13L, 14L, 14L, 15L, 16L, 17L, 17L, 18L, 19L, 20L, 20L, 21L,
21L, 22L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L,
33L, 33L, 34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L, 41L, 42L, 42L,
43L, 43L, 44L, 44L), .Label = c("10:17:55", "10:37:40", "10:40:26",
"10:48:18", "11:00:17", "11:01:12", "11:06:58", "11:09:20", "11:43:41",
"11:48:24", "11:49:14", "12:07:31", "12:10:52", "12:19:00", "12:19:43",
"12:20:55", "12:38:27", "12:55:09", "12:55:10", "12:57:31", "13:04:16",
"13:06:51", "14:55:06", "14:56:10", "15:01:30", "15:28:42", "15:29:17",
"15:35:33", "15:58:32", "16:05:07", "16:09:16", "16:10:36", "16:32:57",
"16:34:32", "16:38:16", "17:43:27", "17:53:01", "17:56:14", "18:08:21",
"18:17:23", "18:37:23", "18:43:13", "18:51:43", "19:05:39"), class = "factor")
Upvotes: 2
Views: 4340
Reputation: 263342
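There are range, trunc and seq methods for POSIXt and Date objects. Assuming you assign that structure object to a name such as tms, the following converts it to POSIXct (as.POSIXct fills in today's date when the format has none, so times collected on different days all land on the same day), truncates the range of the times to whole hours, builds a sequence of breaks spanning those hours, and bins the times in 30-minute intervals: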
> tms <- as.POSIXct(tms, format="%H:%M:%S")
> brks <- trunc(range(tms), "hours")
Warning message:
In if (isdst == -1) { :
the condition has length > 1 and only the first element will be used
> hist(tms, breaks=seq(brks[1], brks[2]+3600, by="30 min") )
Notice that the hist method for POSIXt objects handles the x-axis labeling.
You could also check whether the latest time falls within the first half hour after the second element of brks. This would be the code to avoid a blank final bin when targeting half-hour bins:
hist(tms, breaks=seq(brks[1],
brks[2]+ if( as.numeric( difftime(max(tms), brks[2], units="mins") ) < 30) # difference forced to minutes
{1800} else{3600},
by="30 min")
)
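If you also want the counts themselves (the "10h - 11h: 4" style table from the question), a minimal sketch using cut() and table() could look like the following. The short times vector is only a sample of the question's data, the names start, end, brks and counts are my own for illustration, and tz = "UTC" is an assumption made to keep daylight-saving shifts out of the bins.

```r
# Sample of the question's times, kept short for illustration
times <- c("10:17:55", "10:37:40", "11:00:17", "11:43:41", "11:49:14", "12:07:31")
tms <- as.POSIXct(times, format = "%H:%M:%S", tz = "UTC")

# Hourly break points from the hour of the earliest time
# to the hour after the latest time
start <- as.POSIXct(trunc(min(tms), "hours"))
end   <- as.POSIXct(trunc(max(tms), "hours")) + 3600
brks  <- seq(start, end, by = "hour")

# cut() assigns each time to a left-closed bin [h, h + 1);
# table() then counts the members of every bin, including empty ones
counts <- table(cut(tms, breaks = brks,
                    labels = format(head(brks, -1), "%Hh"),
                    right = FALSE))
print(counts)

# Bar chart of the per-hour counts
barplot(counts, xlab = "Hour of day", ylab = "Frequency")
```

Changing by = "hour" to by = "30 min" (and the label format to something like "%H:%M") should give half-hour bins in the same way.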
Upvotes: 3
Reputation: 1053
Here is the method I used to obtain what it is you are after.
This will work for hours and half hours. Not the prettiest, but I think it serves your purpose. You will need to do some massaging of the axes so they display the information you desire. Hopefully that helps!
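# `times` is assumed to hold the "HH:MM:SS" values from the question
parsed <- strptime( times , format = "%H:%M:%S" )
hours <- as.numeric( format( parsed , "%H" ) )
# right = FALSE makes each bin [h, h + 1), so every 10:xx time is counted under 10h - 11h
hist( hours , breaks = seq( min( hours ) , max( hours ) + 1 ) , right = FALSE )
half_hours <- hours + ( as.numeric( format( parsed , "%M" ) ) / 60 )
hist( half_hours , breaks = seq( min( hours ) , max( hours ) + 1 , by = 0.5 ) , right = FALSE )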
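The same numeric idea can be generalised to any bin width by working in minutes since midnight, which also sidesteps the date entirely. This is only a sketch: bin_counts is a hypothetical helper name of mine, not a base R function, and the times vector below is a small sample of the question's data.

```r
# Hypothetical helper: counts per bin of `width_min` minutes, ignoring the date
bin_counts <- function(times, width_min = 30) {
  lt <- strptime(as.character(times), format = "%H:%M:%S", tz = "UTC")
  mins <- lt$hour * 60 + lt$min + lt$sec / 60   # minutes since midnight
  first <- floor(min(mins) / width_min)         # index of the first occupied bin
  last  <- floor(max(mins) / width_min)         # index of the last occupied bin
  # A factor with every bin index as a level keeps empty bins in the table
  idx <- factor(floor(mins / width_min), levels = first:last)
  counts <- table(idx)
  starts <- (first:last) * width_min            # bin start, minutes since midnight
  names(counts) <- sprintf("%02d:%02d", starts %/% 60, starts %% 60)
  counts
}

times <- c("10:17:55", "10:37:40", "11:00:17", "11:43:41", "11:49:14", "12:07:31")
half_hourly <- bin_counts(times, width_min = 30)
print(half_hourly)
barplot(half_hourly, xlab = "Bin start", ylab = "Frequency")
```

Calling bin_counts(times, width_min = 60) gives the hourly version, so you can switch between the two while deciding.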
Upvotes: 1