Reputation: 13
How do I convert a column that mixes hours and minutes into an integer number of minutes? For example, given a data frame with
df$duration = ["1 h 10 min", "120 min",...]
duration
----------
1 h 10 min
120 min
to
df$duration = [70, 120, ...]
result
------
70
120
Upvotes: 1
Views: 215
Reputation: 8215
Use the lubridate package, but first clean the data a little so that every value has the same format (here, by prepending "0 h" to entries with no hour component).
> library(lubridate)
> df <- data.frame(duration = c("1 h 10 min", "120 min"), stringsAsFactors = FALSE)
> # prepend "0 h" to entries that have no hour component
> no_h <- !grepl("h", df$duration)
> df$duration[no_h] <- paste("0 h", df$duration[no_h])
> # parse as hour/minute periods, then convert to total minutes
> df$period <- hm(df$duration)
> df$minute <- hour(df$period) * 60 + minute(df$period)
> df
     duration    period minute
1  1 h 10 min 1H 10M 0S     70
2 0 h 120 min   120M 0S    120
Upvotes: 2
Reputation: 32548
duration = c("1 h 10 min", "120 min")
sapply(strsplit(duration, " "), function(x){
  # x holds the tokens, e.g. c("1", "h", "10", "min") or c("120", "min")
  if (length(x) == 4){
    # hours and minutes: convert only the numeric tokens
    sum(as.numeric(x[c(1, 3)]) * c(60, 1))
  }else{
    # minutes only
    as.numeric(x[1])
  }
})
#[1]  70 120
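Note that splitting on spaces as above treats a two-token value such as "2 h" as minutes. A minimal base-R sketch that looks at the unit instead might look like the following; the helper name `duration_to_minutes` is hypothetical, and it assumes every value is written as a number followed by "h" and/or a number followed by "min", with no missing values.

```r
# Hypothetical helper (not part of the answer above): total minutes from
# strings such as "1 h 10 min", "120 min" or "2 h".
# Assumes no NA values and the units "h" and "min" only.
duration_to_minutes <- function(s) {
  # Return the number that precedes `unit` in each string, or 0 if absent.
  grab <- function(unit) {
    pattern <- paste0("[0-9]+(?= *", unit, "\\b)")
    m <- regexpr(pattern, s, perl = TRUE)   # -1 where there is no match
    out <- numeric(length(s))               # defaults to 0
    out[m > 0] <- as.numeric(regmatches(s, m))
    out
  }
  60 * grab("h") + grab("min")
}

duration_to_minutes(c("1 h 10 min", "120 min", "2 h"))
#[1]  70 120 120
```

Because the number is located by its unit rather than by its position, hour-only values are converted correctly and nothing non-numeric is passed to as.numeric.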
Upvotes: 1
Reputation: 93851
Here's one option:
library(stringr)
d = c("1 h 10 min", "120 min", "2 h", "12 h 53 min")
# replace NA (unit not present in the string) with 0
na_to_0 = function(x) {x[is.na(x)] = 0; x}
to_minutes = function(s) {
  # extract e.g. "12 h" or "53 min", strip the unit, convert to numeric
  hr = na_to_0(60 * as.numeric(str_replace(str_extract(s, "[0-9]{1,2} h"), " h", "")))
  mins = na_to_0(as.numeric(str_replace(str_extract(s, "[0-9]{1,3} min"), " min", "")))
  hr + mins
}
to_minutes(d)
[1]  70 120 120 773
Upvotes: 1