TheRimalaya

Reputation: 4592

Parse Time of irregular format in R

I have some times as follows:

  [1] "9.58"      "19.19"     "43.03"     "1:40.91"   "2:11.96"   "3:26.00"  
  [7] "3:43.13"   "4:44.79"   "7:20.67"   "12:37.35"  "26:17.53"  "26:44"    

Some of them have only seconds, given as decimals. Others also have minutes and hours, separated by ":".

I want all of them in one unit (seconds, minutes, or hours). How can I do this in R?

Upvotes: 0

Views: 975

Answers (3)

akrun

Reputation: 886948

A couple more ways, after first converting the strings into a unified hh:mm:ss format with sub:

data1 <- sub("^([^:]+:[^:]+)$", "00:\\1", sub("^([0-9]*\\.*[0-9]*)$", "00:00:\\1", data))
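
To see what the two nested sub calls do, here is a small sketch on a few illustrative strings (the vector x and the name x_unified are made up for this example; the regular expressions are the ones used above). The inner call prefixes seconds-only values with "00:00:", and the outer call prefixes minutes:seconds values with "00:". Strings that already have three fields are left alone.

```r
# Illustrative input (not the original data): seconds only,
# minutes:seconds, and hours:minutes:seconds
x <- c("9.58", "1:40.91", "26:44", "1:01:02.1")

# Inner sub: strings with no colon get "00:00:" prepended
# Outer sub: strings with exactly one colon get "00:" prepended
x_unified <- sub("^([^:]+:[^:]+)$", "00:\\1",
                 sub("^([0-9]*\\.*[0-9]*)$", "00:00:\\1", x))

x_unified
# [1] "00:00:9.58" "00:1:40.91" "00:26:44"   "1:01:02.1"
```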

1) Using chron: convert 'data1' to a times object, coerce it to numeric (a fraction of a day), and multiply by the number of seconds in a day, i.e. 86400

library(chron)
60*60*24*as.numeric(times(data1))
#[1]    9.58   19.19   43.03  100.91  131.96  206.00
#[7]  223.13  284.79  440.67  757.35 1577.53 1604.00

2) Using lubridate: convert 'data1' to a Period object with hms, then change it to seconds with period_to_seconds

library(lubridate)
period_to_seconds(hms(data1))
#[1]    9.58   19.19   43.03  100.91  131.96  206.00
#[7]  223.13  284.79  440.67  757.35 1577.53 1604.00

Upvotes: 1

Aurèle

Reputation: 12819

I'm always very reluctant to parse dates and times by hand: I trust my own code much less than the tested work of others who built dedicated tools.

So I would use lubridate, for instance:

library(lubridate)

data <-
  c("9.58", "19.19", "43.03", "1:40.91", "2:11.96", "3:26.00", 
   "3:43.13", "4:44.79", "7:20.67", "12:37.35", "26:17.53", "26:44")

difftime(parse_date_time(data, orders = c("%H %M %OS", "%M %OS", "%OS")), 
         parse_date_time("0", orders = "%S"))

# Time differences in secs
#  [1]    9.580002   19.190002   43.029999  100.910004  131.959999
#  [6]  206.000000  223.129997  284.790001  440.669998  757.349998
# [11] 1577.529999 1604.000000

lubridate offers the useful option of supplying multiple parsing formats that are tried successively (equivalent to c("%H:%M:%OS", "%M:%OS", "%OS") here; note that the : separator can be omitted from the orders, as in the code above, which makes parsing more robust when the input data is poorly formatted).
My solution is still somewhat "hacky": I wasn't able to parse the strings directly as difftimes, only as POSIXct, so I subtracted a parsed time of 0 to get difftimes.

Upvotes: 3

Therkel

Reputation: 1438

You could split the strings on the colon separator : with strsplit and convert the pieces into seconds.

have <- c("9.58","1:40.91","1:01:02.1")

have_split <- strsplit(have,":")   ## List of times split

convert <- function(x){
    x <- as.numeric(x)
    if(length(x) == 1){               ## Has only seconds
        x
    } else if(length(x) == 2){        ## Has minutes and seconds
        x[1]*60+x[2]
    } else if(length(x) == 3){        ## Has hours, minutes and seconds
        x[1]*60^2+x[2]*60+x[3]
    }
}

sapply(have_split,convert)
## [1]    9.58  100.91 3662.10
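
The same split-and-weight idea can be written without the if/else branches. This is only a sketch, and to_seconds is a hypothetical helper name, not part of any package. It reverses the pieces so that seconds come first, then multiplies them by 1, 60, 3600. Dividing the result gives minutes or hours, which covers the "one unit" part of the question.

```r
# Hypothetical helper: convert "ss", "mm:ss" or "hh:mm:ss" strings to seconds
to_seconds <- function(x) {
    parts <- strsplit(x, ":", fixed = TRUE)
    vapply(parts, function(p) {
        p <- as.numeric(p)
        ## rev() puts seconds first, so the weights are 60^0, 60^1, 60^2
        sum(rev(p) * 60^(seq_along(p) - 1))
    }, numeric(1))
}

have <- c("9.58", "1:40.91", "1:01:02.1")

secs <- to_seconds(have)
secs
## [1]    9.58  100.91 3662.10

mins  <- secs / 60     ## same values in minutes
hours <- secs / 3600   ## same values in hours
```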

Upvotes: 1
