Hendy

Reputation: 10604

Time data in R for logged durations in H M S format (but H can be > 24)

I have a data set like this:

> dput(data)
structure(list(Run = c("Dur 2", "Dur 3", "Dur 4", "Dur 5", "Dur 7", 
"Dur 8", "Dur 9"), reference = c("00h 00m 32s", "00h 00m 31s", 
"00h 05m 46s", "00h 03m 51s", "00h 06m 49s", "00h 06m 47s", "00h 08m 56s"
), test30 = c("00h 00m 44s", "00h 00m 41s", "00h 21m 54s", "00h 13m 37s", 
"00h 28m 48s", "00h 22m 54s", "10h 02m 12s"), test31 = c("00h 00m 39s", 
"00h 00m 45s", "00h 40m 10s", "00h 23m 07s", "00h 35m 23s", "00h 47m 42s", 
"25h 37m 05s"), test32 = c("00h 01m 05s", "00h 01m 13s", "00h 55m 02s", 
"00h 28m 54s", "01h 03m 17s", "01h 02m 08s", "39h 04m 39s")), .Names = c("Run", 
"reference", "test30", "test31", "test32"), class = "data.frame", row.names = c(NA, 
-7L))

I tried to get it into plottable format like so:

library(reshape2)
library(scales)

# melt the data and convert the time strings to POSIXct format
data_melted <- melt(data, id.var = "Run")
data_melted$value <- as.POSIXct(data_melted$value, format = "%Hh %Mm %Ss")

I'm getting NAs for the final durations of Dur 9, presumably because POSIXct expects actual clock times, where the hour must be between 0 and 23.

What's the recommended way to deal with logged data like this that isn't rolling over into days once H > 24?

Do I need to manually check it for such instances and create a new string representing the days (which would then seem to require that I create an arbitrary start day and increment the day if H > 24)? Or is there a package better suited for strict time data vs. assuming all time data is logged according to an actual time stamp?

Many thanks!

Upvotes: 1

Views: 470

Answers (1)

mnel

Reputation: 115382

You can use colsplit from the reshape2 package to split each string into columns for hours, minutes and seconds, then build a difftime object that can be added to a date.

library(reshape2)

# mdd is assumed to be the melted data with the original strings still in
# `value` (i.e. before any as.POSIXct conversion)
mdd <- melt(data, id.var = "Run")

# gsub('s', '', mdd[['value']]) removes the trailing s from each value;
# we then split on `[hm]` (i.e. h or m), which returns a data.frame with
# 3 integer columns
times <- colsplit(gsub('s','',mdd[['value']]), '[hm]', names = c('h','m','s'))

# the input is already numeric, so as.difftime only needs the units
seconds <- as.difftime(with(times, h*60*60 + m*60 + s), units = 'secs')
seconds
Time differences in secs
 [1]     32     31    346    231    409    407    536     44     41   1314    817   1728   1374  36132     39     45
[17]   2410   1387   2123   2862  92225     65     73   3302   1734   3797   3728 140679

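You don't need to do the arithmetic yourself: with Map you can convert each column to a difftime in its own units, and with Reduce you can sum the three (adding difftimes with different units gives a result in seconds).

Reduce('+', Map(as.difftime, times, units = c('hours','mins','secs')))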
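
If you'd rather not depend on colsplit, the same conversion can be sketched in base R. This is only a minimal sketch, not part of the approach above: hms_to_difftime is a hypothetical helper name, and it assumes every string looks like "HHh MMm SSs" (anything else becomes NA).

```r
# Hypothetical helper: parse "HHh MMm SSs" strings (hours may exceed 24)
# into a difftime measured in seconds, using base R only.
hms_to_difftime <- function(x) {
  pattern <- "^ *([0-9]+)h +([0-9]+)m +([0-9]+)s *$"
  x <- as.character(x)
  ok <- grepl(pattern, x)            # strings that match the expected layout

  # extract the i-th captured number; non-matching strings stay NA
  part <- function(i) {
    out <- rep(NA_real_, length(x))
    out[ok] <- as.numeric(sub(pattern, paste0("\\", i), x[ok]))
    out
  }

  secs <- part(1) * 3600 + part(2) * 60 + part(3)
  as.difftime(secs, units = "secs")
}

durations <- c("00h 00m 32s", "10h 02m 12s", "25h 37m 05s", "39h 04m 39s")
d <- hms_to_difftime(durations)

as.numeric(d)                    # 32 36132 92225 140679
as.numeric(d, units = "hours")   # the same durations expressed in hours
```

On the melted data this would be applied to the original strings, e.g. mdd$value <- hms_to_difftime(mdd$value). The numeric seconds (or hours) can then be plotted directly, without inventing an arbitrary start date.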

Upvotes: 2
