Reputation: 1005
I'm working with a data frame that has a variable called "Duration" with values in the format:
1h 7m 46s
0h 16m 41s
...and so on. The column is read in as a "factor" by default, and I'm wondering how to convert it to an actual duration. I'd like to be able to compute averages and sums of the durations.
Upvotes: 3
Views: 4045
Reputation: 10204
You can extract the hours, minutes, and seconds as follows:
x <- c('1h 7m 46s','0h 16m 41s')
hours <- as.numeric(gsub('^(?:.* )?([0-9]+)h.*$','\\1',x))
minutes <- as.numeric(gsub('^.* ([0-9]+)m.*$','\\1',x))
seconds <- as.numeric(gsub('^.* ([0-9]+)s.*$','\\1',x))
duration_seconds <- seconds + 60*minutes + 60*60*hours
The pattern for minutes reads as follows: start of string (^), zero or more (*) characters (.), followed by a space ( ), followed by one or more (+) digits ([0-9]), followed by the letter m (m), followed by zero or more (*) characters (.) up to the end of the string ($).
Bonus: the (?:.* )? in the regex for hours is an optional (?) non-capturing group ((?: )), which consumes zero or more (*) characters (.) followed by a space ( ). Because (?:.* )? is a non-capturing group, \\1 still refers to the number string.
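To tie this back to the question, here is a minimal sketch that wraps the extraction above in a helper and applies it to a factor column. The data frame df and the function name to_seconds are made up for illustration, and perl = TRUE is an addition of this sketch (the calls above omit it).

```r
# Hypothetical data frame standing in for the asker's data.
df <- data.frame(Duration = factor(c("1h 7m 46s", "0h 16m 41s")))

# to_seconds() is an illustrative helper name, not an existing R function.
to_seconds <- function(x) {
  x <- as.character(x)  # convert the factor to plain strings first
  # perl = TRUE is this sketch's choice; the original calls do not set it.
  hours   <- as.numeric(gsub("^(?:.* )?([0-9]+)h.*$", "\\1", x, perl = TRUE))
  minutes <- as.numeric(gsub("^.* ([0-9]+)m.*$", "\\1", x, perl = TRUE))
  seconds <- as.numeric(gsub("^.* ([0-9]+)s.*$", "\\1", x, perl = TRUE))
  seconds + 60 * minutes + 60 * 60 * hours
}

df$duration_seconds <- to_seconds(df$Duration)
total_seconds <- sum(df$duration_seconds)    # 4066 + 1001 = 5067
mean_seconds  <- mean(df$duration_seconds)   # 5067 / 2 = 2533.5
```

The result is a plain numeric vector, so sum() and mean() work directly. The patterns assume every value has all three parts in the h, m, s order shown in the question; a value missing one of them would likely come back as NA.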
Upvotes: 5
Reputation: 736
Consider converting the times to strings with an as.character() cast. Once your times are strings, you can convert them into date-time (POSIXlt) objects with strptime, e.g.
> s <- "1h 7m 46s"
> tfmt <- "%Hh %Mm %Ss"
> t1 <- strptime(s, format=tfmt)
> s <- "0h 16m 41s"
> t2 <- strptime(s, format=tfmt)
Having the data in this format is handy, as there are then tools for working with it:
> t1
[1] "2015-01-30 01:07:46"
> t2
[1] "2015-01-30 00:16:41"
> t1 - t2
Time difference of 51.08333 mins
> difftime(t1, t2, units="secs")
Time difference of 3065 secs
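If what you want are durations in seconds rather than clock times, one possible variation is to parse against a fixed reference date and measure from its midnight. This is a sketch under assumptions: the vector x, the choice of 1970-01-01 as the reference date, and the UTC time zone are illustrative and not part of the answer above. Since %H covers hours 0 to 23, a value of 24h or more would be expected to come back as NA with this approach.

```r
x <- c("1h 7m 46s", "0h 16m 41s")

# Prefix an arbitrary fixed date so the result does not depend on today's date.
parsed <- strptime(paste("1970-01-01", x),
                   format = "%Y-%m-%d %Hh %Mm %Ss", tz = "UTC")

# Midnight of the same reference date, in the same time zone.
origin <- as.POSIXct("1970-01-01 00:00:00", tz = "UTC")

# Elapsed time since midnight is the duration itself.
durations <- difftime(as.POSIXct(parsed), origin, units = "secs")
secs <- as.numeric(durations)

total_secs <- sum(secs)   # 4066 + 1001 = 5067
mean_secs  <- mean(secs)  # 5067 / 2 = 2533.5
```

For a factor column, pass as.character(df$Duration) in place of x (df being whatever your data frame is called).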
Upvotes: 4