How to clean a time column in R

I have a time column in R as:

22:34:47
06:23:15
7:35:15
5:45

How can I convert all the time values in the column to hh:mm:ss format? I tried as_date(a$time, tz = NULL), but it does not give me the format I want.

Upvotes: 0

Views: 650

Answers (3)

Kevin Arseneau

Reputation: 6264

Using an approach with dplyr and hms verbs.

library(dplyr)
library(hms)

a <- tibble(time = c("22:34:47", "06:23:15", "7:35:15", "5:45"))

a %>%
  mutate(
    time = case_when(
      is.na(parse_hms(time)) ~ parse_hm(time),
      TRUE ~ parse_hms(time)
    )
  )

# # A tibble: 4 x 1
#  time  
#  <time>
# 1 22:34 
# 2 06:23 
# 3 07:35 
# 4 05:45

Note that case_when could be replaced with dplyr's if_else (base ifelse would drop the hms class and return plain numbers). The conditional is needed because parse_hms returns NA for values without seconds.

If you want the output to be a POSIXct value instead, you can adapt the previous solution as follows.

a %>%
  mutate(
    time = case_when(
      is.na(parse_hms(time)) ~ as.POSIXct(parse_hm(time)),
      TRUE ~ as.POSIXct(parse_hms(time))
    )
  )

# # A tibble: 4 x 1
#   time               
#   <dttm>             
# 1 1970-01-01 22:34:47
# 2 1970-01-01 06:23:15
# 3 1970-01-01 07:35:15
# 4 1970-01-01 05:45:00

Note this will set the date to origin, which is 1970-01-01 by default.
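If only the clock time is wanted, the placeholder date can be dropped again by formatting the value. A minimal base R sketch, assuming a POSIXct value in UTC like the ones above (the name t is just for illustration):

```r
# A single date-time on the origin date, as produced by the conversion above.
t <- as.POSIXct("1970-01-01 05:45:00", tz = "UTC")

# Keep only the time of day as an hh:mm:ss string.
time_only <- format(t, "%H:%M:%S")
time_only
#[1] "05:45:00"
```

The result is a character vector, so it is suitable for display or export rather than further time arithmetic.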

Upvotes: 1

thelatemail

Reputation: 93938

Nothing a bit of formatting can't take care of:

x <- c("22:34:47","06:23:15","7:35:15","5:45")

format(
  pmax(
     as.POSIXct(x, format="%T", tz="UTC"),
     as.POSIXct(x, format="%R", tz="UTC"), na.rm=TRUE
  ),
  "%T"
)
#[1] "22:34:47" "06:23:15" "07:35:15" "05:45:00"

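pmax keeps the later of the two parsed times for each element, so a value parsed with seconds (hh:mm:ss) is taken in preference to the same value parsed as just hh:mm. With na.rm=TRUE, a value that only matches one format falls back to that parse.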

You could get functional if you wanted to get a similar result with less typing, and more opportunity for turning it into a repeatable function.

do.call(pmax, c(lapply(c("%T","%R"), as.POSIXct, x=x, tz="UTC"), na.rm=TRUE))
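As a rough sketch of what such a repeatable function might look like, the same idea can be wrapped up in base R. The name normalise_times and its formats argument are hypothetical, chosen here only for illustration:

```r
# Hypothetical helper: try each format in turn and keep the most complete parse.
normalise_times <- function(x, formats = c("%T", "%R")) {
  # Parse the input once per format; values that do not match come back as NA.
  parsed <- lapply(formats, function(f) as.POSIXct(x, format = f, tz = "UTC"))
  # pmax() keeps the latest time per element, so hh:mm:ss wins over hh:mm:00.
  best <- do.call(pmax, c(parsed, na.rm = TRUE))
  # Return plain hh:mm:ss strings.
  format(best, "%T")
}

x <- c("22:34:47", "06:23:15", "7:35:15", "5:45")
normalise_times(x)
#[1] "22:34:47" "06:23:15" "07:35:15" "05:45:00"
```

Passing the format explicitly as format = f avoids relying on positional argument matching inside as.POSIXct, and further formats can be supplied through the formats argument.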

Upvotes: 1

akrun

Reputation: 887901

Here is an option with parse_date_time from lubridate, which can try multiple formats:

library(lubridate)
format(parse_date_time(time, c("HMS", "HM"), tz = "GMT"), "%H:%M:%S")
#[1] "22:34:47" "06:23:15" "07:35:15" "05:45:00"

data

time <- c("22:34:47", "06:23:15", "7:35:15", "5:45")

Upvotes: 2
