Reputation: 267
In R version 3.6.0, I want to standardize a character column to a date format. The column has dates in three different forms like below:
time <- c("12/31/2019", "2020-01-01 00:00:10.555", "1517075645.917")
t <- data.frame(time)
I can see "1517075645.917" is a unix timestamp:
as.POSIXct(1517075645.917, origin = "1970-01-01", tz = "UTC")
# [1] "2018-01-27 17:54:05 UTC"
I can standardize the first and second element as below:
# version 1.5.6
library(lubridate)
parse_date_time(c("12/31/2019", "2020-01-01 00:00:10.555"), c("mdy", "ymd HMS"))
# [1] "2019-12-31 00:00:00 UTC" "2020-01-01 00:00:10 UTC
How do I standardize the whole column including the character "1517075645.917" as I cannot find a parsing format like "mdy" or "ymd HMS" for it?
Upvotes: 2
Views: 149
Reputation: 389325
You can use if_else
and apply the function based on pattern in the data.
library(dplyr)
library(lubridate)
t %>%
mutate(time1 = suppressWarnings(if_else(grepl('^\\d+\\.\\d+$', time),
as.POSIXct(as.numeric(time), origin = "1970-01-01", tz = "UTC"),
parse_date_time(time, c("mdy", "ymd HMS")))))
# time time1
#1 12/31/2019 2019-12-31 00:00:00
#2 2020-01-01 00:00:10.555 2020-01-01 00:00:10
#3 1517075645.917 2018-01-27 17:54:05
Upvotes: 0
Reputation: 11546
Does this work:
library(anytime)
library(stringr)
library(dplyr)
t %>%
mutate(time1 = case_when(str_detect(time, '\\d{10}\\.\\d{3}') ~ time, TRUE ~ NA_character_ )) %>%
mutate(time1 = as.POSIXct(as.numeric(time1), origin = '1970-01-01', tz = 'UTC')) %>%
mutate(time = anytime(time)) %>% mutate(time = coalesce(time, time1)) %>% select(-time1)
time
1 2019-12-31 00:00:00
2 2020-01-01 00:00:10
3 2018-01-27 23:24:05
>
Upvotes: 1