Joaquim

Reputation: 11

Split Date and time variable with sparklyr

I'm trying to work with a date-time variable (dttm) in a Spark data frame, using sparklyr and dplyr. Here is my issue.

Each row of the column in question is in this format:

I want to split this date-time column (dttm) into two columns:

First, I used regexp_replace inside mutate to create the time column:

spark_df %>% mutate(time = regexp_replace(date_and_time, "^[^_]* ", ""))

Here is what I obtain in my new column "time":

So the code is nearly working; the only issue is that the first two digits are converted to 00.

Upvotes: 1

Views: 775

Answers (1)

kangaroo_cliff

Reputation: 6222

This may be a good starting point, even if it doesn't fully solve your problem.

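library(sparklyr)
library(dplyr)

# Assumes sc is an existing Spark connection, e.g. created with spark_connect()
dates <- data.frame(date =
    c("2018-06-11 22:06:45", "2018-06-11 22:07:45", "2019-06-11 22:06:45"))
tbl <- copy_to(sc, dates)

# Cast the string to a timestamp, then derive the date part and the time part
tbl %>% mutate(new_date = as.POSIXct(date)) %>%
    mutate(day = as.Date(new_date),
           time = paste0(hour(new_date), ":", minute(new_date), ":",
                         second(new_date)))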

#   date                new_date            day        time
#   <chr>               <dttm>              <date>     <chr>
# 1 2018-06-11 22:06:45 2018-06-11 12:06:45 2018-06-11 22:6:45
# 2 2018-06-11 22:07:45 2018-06-11 12:07:45 2018-06-11 22:7:45
# 3 2019-06-11 22:06:45 2019-06-11 12:06:45 2019-06-11 22:6:45
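
Note that the time column in the output above is not zero-padded (22:6:45 rather than 22:06:45). One possible way around that is to let Spark format the time itself. The sketch below is not tested against the asker's data: it assumes a local Spark installation reachable via spark_connect(master = "local"), a string column named date_and_time as in the question, and that sparklyr passes the Spark SQL functions to_date, to_timestamp and date_format through to Spark unchanged. The sample rows are made up for illustration.

```r
library(sparklyr)
library(dplyr)

# Assumption: Spark is installed locally; adjust master for a real cluster
sc <- spark_connect(master = "local")

# Hypothetical sample data in the "yyyy-MM-dd HH:mm:ss" layout
dates <- data.frame(
  date_and_time = c("2018-06-11 22:06:45", "2018-06-11 02:07:05"),
  stringsAsFactors = FALSE
)
tbl <- copy_to(sc, dates, overwrite = TRUE)

# to_date(), to_timestamp() and date_format() are Spark SQL functions;
# dplyr does not know them, so they are handed to Spark as-is
result <- tbl %>%
  mutate(
    day  = to_date(date_and_time),
    time = date_format(to_timestamp(date_and_time), "HH:mm:ss")
  )

# Bring the small result back into R to inspect it
out <- collect(result)
print(out)
```

Because the "HH:mm:ss" pattern is applied by Spark, each component should come back as two digits, so no paste0 of hour, minute and second is needed.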

Upvotes: 1
