How to analyze time in R

For several months I have noted down the time that I woke up each morning. What I have now is a database that contains timestamps with the time of day in 24-hour format, e.g. 2014-11-29 04:23:00, which I can trim to something like 04:23.

I want to plot the distribution of my wake up times. The x-axis will be time of day, the y-axis will be frequency. All very simple, except:

What I'm breaking my head over right now is how to handle the x-axis scale. Since there are 60 minutes to an hour, I could:

1. Create a scale of minutes in a day, where the time 04:23 would be transformed to minute 263. This would be easy on my calculations, but unintuitive to read. Of course I could transform back those minutes easily.
2. Use a hundred-minute hour. Since the axis in my plot would only be labelled every full hour, this would be both easy to calculate and easy to read. But if I want to see the mean or other calculated data in 60-minute time, I'd have to re-transform it, which might cause inaccuracies. But I guess these would be minor.
3. Let R handle times.
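
To make the minutes-in-a-day option concrete, here is a minimal base R sketch of the conversion in both directions. The helper names to_minutes and to_clock are hypothetical (they are not from any package), and the input is assumed to be "HH:MM:SS" strings like the ones in this question.

```r
# Hypothetical helpers, base R only; input is assumed to be "HH:MM:SS" strings.

# Clock string -> minutes since midnight (seconds are ignored)
to_minutes <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts, function(p) as.numeric(p[1]) * 60 + as.numeric(p[2]), numeric(1))
}

# Minutes since midnight -> "HH:MM" string (rounded to the nearest minute)
to_clock <- function(m) {
  m <- round(m)
  sprintf("%02d:%02d", as.integer(m %/% 60), as.integer(m %% 60))
}

wake <- c("04:23:00", "00:13:00", "23:58:00")
mins <- to_minutes(wake)
mins            # 263 13 1438
to_clock(mins)  # "04:23" "00:13" "23:58"
```

Because to_clock rounds before formatting, a computed value such as a mean can be passed through it directly to get a readable label.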

Since the only thing that I don't know how to do is the third option, my question is:

How can I use times as data in R? And what is the best way to do so?

Here is a sample vector of times, if you want to try something:

t <- c("00:13:00", "00:30:00", "00:36:00", "00:45:00", "00:48:00", "01:08:00", "01:14:00", "01:15:00", "01:25:00", "02:06:00", "02:07:00", "02:22:00", "02:23:00", "02:36:00", "02:37:00", "02:55:00", "03:08:00", "03:10:00", "03:11:00", "03:13:00", "03:15:00", "03:23:00", "03:35:00", "03:55:00", "03:57:00", "03:58:00", "04:03:00", "04:06:00", "04:15:00", "04:21:00", "04:21:00", "04:22:00", "04:43:00", "04:48:00", "04:51:00", "04:58:00", "05:00:00", "05:02:00", "05:03:00", "05:17:00", "05:25:00", "05:34:00", "05:38:00", "05:45:00", "05:46:00", "05:50:00", "05:52:00", "06:10:00", "06:11:00", "06:13:00", "06:23:00", "06:26:00", "22:18:00", "23:27:00", "23:40:00", "23:53:00", "23:54:00", "23:58:00")

I have tried to plot the times using the chron library, but for some reason the labelling of the x-axis reverts to 0 to 1 when the range is the full 24 hours (it shows times when the graph is only a few hours wide), and the hist function refuses to use any graphical parameters (plot remains FALSE even when I explicitly set it to TRUE):

library(chron)
t <- times(c("00:13:00", "00:30:00", "00:36:00", "00:45:00", "00:48:00", "01:08:00", "01:14:00", "01:15:00", "01:25:00", "02:06:00", "02:07:00", "02:22:00", "02:23:00", "02:36:00", "02:37:00", "02:55:00", "03:08:00", "03:10:00", "03:11:00", "03:13:00", "03:15:00", "03:23:00", "03:35:00", "03:55:00", "03:57:00", "03:58:00", "04:03:00", "04:06:00", "04:15:00", "04:21:00", "04:21:00", "04:22:00", "04:43:00", "04:48:00", "04:51:00", "04:58:00", "05:00:00", "05:02:00", "05:03:00", "05:17:00", "05:25:00", "05:34:00", "05:38:00", "05:45:00", "05:46:00", "05:50:00", "05:52:00", "06:10:00", "06:11:00", "06:13:00", "06:23:00", "06:26:00", "22:18:00", "23:27:00", "23:40:00", "23:53:00", "23:54:00", "23:58:00"))
hist(t, probability = TRUE, col = "gray")
lines(density(t), col = "blue", lwd = 2)
lines(density(t, adjust = 2), lty = "dotted", col = "darkgreen", lwd = 2)

Warning message:
In hist.default(t, probability = TRUE, col = "gray", plot = FALSE) :
  arguments ‘freq’, ‘col’ are not made use of

Upvotes: 2

Answers (3)

Tim

Reputation: 7464

Have you considered using some arbitrary "zero" point? It could be some minimum value or the average waking-up time. I imagine that what you are interested in is the differences between times, so "zero" could be an arbitrary point in time that serves as an anchor for comparison.
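
As a rough illustration of the anchor idea, the sketch below measures minutes from an assumed zero point of noon instead of midnight, so that times just before and just after midnight end up next to each other. The anchor value and the two example times are assumptions chosen for illustration, not values taken from the question.

```r
# Hypothetical example values: 23:30 and 00:30 as minutes since midnight
mins <- c(23 * 60 + 30, 0 * 60 + 30)

# Assumed zero point: noon. Any time far from the bulk of the data would do.
anchor <- 12 * 60

# Minutes elapsed since the anchor, wrapped to a 24-hour day
shifted <- (mins - anchor) %% 1440

# The plain mean lands at noon, far from both observations
naive_mean <- mean(mins)                            # 720, i.e. 12:00

# Averaging on the shifted scale and shifting back gives midnight
anchored_mean <- (mean(shifted) + anchor) %% 1440   # 0, i.e. 00:00
```

This only works when the data leave a clear gap somewhere in the day to place the anchor in, which seems to be the case for the sample vector (nothing between 06:26 and 22:18).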

Upvotes: 0

Rusan Kax

Reputation: 1894

library(ggplot2)

# Generate random times (between 4:00 AM and 7:59 AM) as a proxy for your data.
# The first argument to c() is not a date-time, so the POSIXct class is dropped
# and Random_times ends up holding plain seconds since 1970-01-01.
Random_times <- c()
for (i in 1:600) {
  Random_times <- c(Random_times, as.POSIXct(strptime(paste(sample(4:7, 1), ":", sample(0:59, 1), ":", "00", sep = ""), "%H:%M:%S")))
}

# As absolute times (restore the date-time class from the seconds)
P_random_times <- as.POSIXct(Random_times, origin = "1970-01-01")
qplot(P_random_times) + xlim(c(strptime("03:00", "%H:%M"), strptime("10:00", "%H:%M")))

# Or as minutes from the minimum wake time
P_times <- difftime(P_random_times, min(P_random_times), units = "mins")
qplot(as.numeric(P_times))

[Image: histogram with time by R]
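
For comparison, here is a base R sketch (no ggplot2) that applies the same parse-then-plot idea to a few of the strings from the question and labels the x-axis with clock times. The choice of tz = "UTC", the one-hour bins, and the tick spacing are assumptions made for this example.

```r
# A few values copied from the question's sample vector
wake <- c("00:13:00", "04:21:00", "04:21:00", "04:22:00", "05:00:00", "23:58:00")

# Parse as date-times; the date part defaults to the current day and is not used.
# tz = "UTC" is an assumption, chosen to sidestep daylight-saving gaps.
x <- as.POSIXct(wake, format = "%H:%M:%S", tz = "UTC")

# Decimal hour of the day, used only as the plotting coordinate
hours <- as.numeric(format(x, "%H")) + as.numeric(format(x, "%M")) / 60

# One bin per hour; right = FALSE puts 05:00 into the 05:00-06:00 bin
h <- hist(hours, breaks = 0:24, right = FALSE, plot = FALSE)

# Draw the bars without axes, then add clock-time labels on the x-axis
plot(h, axes = FALSE, col = "gray", xlab = "Time of day", main = "Wake-up times")
ticks <- seq(0L, 24L, by = 4L)
axis(1, at = ticks, labels = sprintf("%02d:00", ticks))
axis(2)
```

Keeping the data numeric and changing only the axis labels means that means and densities can still be computed on hours directly.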

Upvotes: 3

Marcin

Reputation: 8044

The forecast package (http://cran.r-project.org/web/packages/forecast/index.html) will help you.

Upvotes: 0
