Reputation: 21459
I'm connecting to Postgres 15.1 in R 4.3.0 using DBI 1.2.2 and dbplyr 2.3.2.
Here is a statement that works:
> my_table %>% mutate(date = as.Date(created_at)) %>% head %>% pull(date)
[1] "2023-09-10" "2023-11-29" "2023-12-03" "2024-04-12" "2023-12-07" "2024-04-27"
But this doesn't:
> my_table %>% mutate(date = as.Date(created_at, tz="Asia/Dhaka")) %>% head %>% pull(date)
Error in `purrr::pmap()`:
ℹ In index: 1.
ℹ With name: date.
Caused by error in `as.Date()`:
! unused argument (tz = "Asia/Dhaka")
Run `rlang::last_trace()` to see where the error occurred.
But this does:
> my_table %>% collect %>% mutate(date = as.Date(created_at, tz="Asia/Dhaka")) %>% head %>% pull(date)
[1] "2023-09-10" "2023-11-30" "2023-12-03" "2024-04-13" "2023-12-07" "2024-04-27"
That leads me to believe that dbplyr
doesn't know how to use the "tz" argument of as.Date.
Can I get the database to return dates in a given timezone, or do I have to collect all my datasets?
I've seen mention of timezone_out
for Postgres's dbConnect, but I haven't been able to get it to do what I think it should.
Upvotes: 1
Views: 58
Reputation: 3535
You may find the timezone
argument to be useful. See here. You may want to check the data types in the underlying PostgreSQL table. Things will be much more predictable if this is (or is cast to) TIMESTAMP WITH TIME ZONE
(perhaps using ::TIMESTAMPTZ
?).
conn <- dbConnect(Postgres(),
host = "dbhost",
user = "dbuser",
dbname = "dbname",
timezone = "America/Chicago"
)
Upvotes: 0
Reputation: 21459
It worked for me to set the time zone on the connection, as from this answer.
con <- DBI::dbConnect(RPostgres::Postgres(), dbname = dbname)
# Setting the timezone will make as.Date work in this time zone
DBI::dbExecute(con, paste0("SET TIME ZONE 'America/Chicago'"))
my_table <- tbl(con, "my_table")
my_table %>% mutate(date = as.Date(created_at)) %>% head %>% select(created_at, date)
# Source: SQL [6 x 2]
# Database: postgres [me@mydb]
created_at date
<dttm> <date>
1 2024-03-06 20:07:25 2024-03-06
2 2024-05-18 22:35:13 2024-05-18
3 2023-11-28 18:02:03 2023-11-28
4 2024-04-23 18:20:55 2024-04-23
5 2023-12-01 00:45:38 2023-11-30
6 2023-07-10 19:17:10 2023-07-10
Line 5 shows a difference between UTC date (on the left) and America/Chicago date (on the right). America/Chicago is currently -0600, so I'd expect the date to be one day earlier, which it is.
===
More detail on why I think timezone_out
won't work: it does something, but I don't think I can get it to display the right date without collect.
con <- DBI::dbConnect(RPostgres::Postgres(), dbname = dbname, timezone_out="America/Chicago")
my_table <- tbl(con, "my_table")
my_table %>% mutate(date = as.Date(created_at)) %>% head %>% select(created_at, date)
# Source: SQL [6 x 2]
# Database: postgres [me@mydb]
created_at date
<dttm> <date>
1 2024-03-06 14:07:25 2024-03-06
2 2024-05-18 17:35:13 2024-05-18
3 2023-11-28 12:02:03 2023-11-28
4 2024-04-23 13:20:55 2024-04-23
5 2023-11-30 18:45:38 2023-12-01
6 2023-07-10 14:17:10 2023-07-10
It's displaying created_at in my laptop timezone ("CST"):
> my_table %>% head(1) %>% pull(created_at)
[1] "2024-03-06 14:07:25 CST"
but it's the same UTC timestamp:
> my_table %>% head(1) %>% pull(created_at) %>% as.POSIXct(tz="UTC")
[1] "2024-03-06 20:07:25 UTC"
However, I think date
field is the UTC date (see line 5, which should be 2023-11-30 but is 2023-12-01), and I can't shift it without collect.
Upvotes: 0