Reputation: 3764
I am trying to use the Hive UDF date_format() to extract the day of the week, but it only returns NA. Let's look at an example:
sc <- sparklyr::spark_connect(master = "local")
df <- dplyr::copy_to(
sc,
data.frame(date = as.POSIXct("2020-01-01")),
"df"
)
df
# # Source: spark<df> [?? x 1]
# date
# <dttm>
# 1 2019-12-31 23:00:00
# Extracting the year works fine...
dplyr::mutate_at(
.tbl = df,
.vars = "date",
.funs = ~date_format(., "yyyy")
)
# # Source: spark<?> [?? x 1]
# date
# <chr>
# 1 2020
# But extracting the day of the week does not...
dplyr::mutate_at(
.tbl = df,
.vars = "date",
.funs = ~date_format(., "E")
)
# # Source: spark<?> [?? x 1]
# date
# <chr>
# 1 NA
Any help would be appreciated. Some system information:
Upvotes: 1
Views: 263
Reputation: 3824
My attempt used mutate instead. If you want to change the column in place, replace DoW with date.
library(tidyverse)
library(sparklyr)
sc <- spark_connect(master = "local")
df <- dplyr::copy_to(sc, data.frame(date = as.POSIXct("2020-01-01")), "df")
df %>% mutate(DoW=date_format(date, "E"))
# # Source: spark<?> [?? x 2]
#   date                DoW
#   <dttm>              <chr>
# 1 2019-12-31 23:00:00 Wed
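If the "E" pattern still returns NA in your setup, one alternative that might be worth trying is Spark SQL's dayofweek(), which returns the day as an integer rather than a formatted string. This is only a sketch under assumptions: it assumes Spark 2.3 or later (where dayofweek() is available) and the same local connection as above. The column name dow_num is made up for illustration.

```r
library(dplyr)
library(sparklyr)

sc <- spark_connect(master = "local")
df <- copy_to(
  sc,
  data.frame(date = as.POSIXct("2020-01-01")),
  "df",
  overwrite = TRUE
)

# dayofweek() is not an R function; dbplyr passes unknown functions
# through to Spark SQL unchanged, just like date_format().
# Assumption: Spark >= 2.3. Spark numbers the days 1 = Sunday ... 7 = Saturday.
res <- df %>%
  mutate(dow_num = dayofweek(date)) %>% # dow_num is a hypothetical column name
  collect()

print(res)

spark_disconnect(sc)
```

Since the result is numeric, it does not depend on how the Spark version in use interprets text patterns such as "E". You can map the number to a label on the R side after collect() if you need one.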
Upvotes: 1