Reputation: 11
I'm working with a date-time column (dttm) in a Spark data frame, using sparklyr and dplyr. Here is my issue.
Each row of the column in question is in this format:
I want to split this date-time column (dttm) into two columns:
As a first step, I used regexp_replace inside mutate to create the time column:
spark_df %>% mutate(time = regexp_replace(date_and_time, "^[^_]* ", ""))
Here is what I obtain in my new column "time":
So the code nearly works; the only issue is that the first two digits are converted to 00.
Upvotes: 1
Views: 775
Reputation: 6222
This may be a good starting point, even if it doesn't fully solve your problem.
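library(sparklyr)
library(dplyr)
sc <- spark_connect(master = "local")  # assumption: a local Spark installation
dates <- data.frame(date =
  c("2018-06-11 22:06:45", "2018-06-11 22:07:45", "2019-06-11 22:06:45"))
tbl <- copy_to(sc, dates)
# Cast the string to a timestamp, then derive the day and the time from it
tbl %>% mutate(new_date = as.POSIXct(date)) %>%
  mutate(day = as.Date(new_date),
         time = paste0(hour(new_date), ":", minute(new_date), ":",
                       second(new_date)))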
#   date                new_date            day        time
#   <chr>               <dttm>              <date>     <chr>
# 1 2018-06-11 22:06:45 2018-06-11 12:06:45 2018-06-11 22:6:45
# 2 2018-06-11 22:07:45 2018-06-11 12:07:45 2018-06-11 22:7:45
# 3 2019-06-11 22:06:45 2019-06-11 12:06:45 2019-06-11 22:6:45
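Note that building the time with paste0() drops leading zeros, as the 22:6:45 values above show. If you need a zero-padded time, one possible variation is to let Spark format the timestamp itself. This is only a sketch under a few assumptions: a local Spark installation (2.2 or later, for to_timestamp), the same toy `dates` data as above, and the fact that dplyr passes functions it does not recognise (here to_timestamp, to_date and date_format) straight through to Spark SQL.

```r
library(sparklyr)
library(dplyr)

# Assumption: Spark is installed locally (e.g. via spark_install()).
sc <- spark_connect(master = "local")

# Same toy data as above; strings are kept as character, not factor.
dates <- data.frame(
  date = c("2018-06-11 22:06:45", "2018-06-11 22:07:45", "2019-06-11 22:06:45"),
  stringsAsFactors = FALSE
)
tbl <- copy_to(sc, dates, "dates", overwrite = TRUE)

# to_timestamp(), to_date() and date_format() are Spark SQL functions,
# not R functions: dplyr forwards them untranslated to Spark.
result <- tbl %>%
  mutate(new_date = to_timestamp(date)) %>%
  mutate(day  = to_date(new_date),                    # date part only
         time = date_format(new_date, "HH:mm:ss")) %>% # zero-padded time
  select(date, day, time) %>%
  collect()

spark_disconnect(sc)
```

Because the formatting happens inside Spark, the time column should come back as strings such as 22:06:45 rather than 22:6:45. In your own data you would replace `date` with your column name (date_and_time in the question).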
Upvotes: 1