Reputation: 325
I've run into a problem with managing time zones with POSIXct in R. I have set the TZ option globally as "Europe/London", but since we switched back to GMT, as.POSIXct no longer converts the numeric vector back to the right time.
Digging into why, I found that differences in time zone can be caused by the object type used to set the origin date.
For example:
# Date time is set as 1 second after 1970-01-01
as.POSIXct(1, origin = "1970-01-01")
# [1] "1970-01-01 01:00:01 BST"
# Same numeric value, but one hour less now that the origin is set using a POSIXct
as.POSIXct(1, origin = as.POSIXct("1970-01-01"))
# [1] "1970-01-01 00:00:01 BST"
The first value doesn't really make sense: it is reported in BST even though the query was run outside British Summer Time, while the system was on GMT (see results below):
Sys.timezone()
# [1] "Europe/London"
Sys.time()
# [1] "2018-10-31 11:05:36 GMT"
Even when you explicitly state the time zone at each stage, the hour difference still persists:
as.POSIXct(1, origin = "1970-01-01", tz = "Europe/London")
# [1] "1970-01-01 01:00:01 BST"
as.POSIXct(1, origin = as.POSIXct("1970-01-01", tz = "Europe/London"), tz = "Europe/London")
# [1] "1970-01-01 00:00:01 BST"
To make matters worse, the documentation from ?as.POSIXct is pretty vague about the management of time zones, specifically:
If a time zone is needed and that specified is invalid on your system, what happens is system-specific but attempts to set it will probably be ignored.
Given this, I have a series of questions:
1) Why does as.POSIXct(1, origin = "1970-01-01", tz = "Europe/London") add an hour, even though the origin date would be parsed as a GMT time and the time zone has been set explicitly?
2) What is the best method of ensuring that your time zone is consistent when converting from numeric in R?
3) What is the best practice for managing time zones in R? Is there a good reference, especially for POSIXct date types?
Upvotes: 4
Views: 496
Reputation: 23598
You've run into a bit of history with question 1. The outcomes for BST, GMT and UTC are shown below. UTC and GMT should be (and are) the same. Now, why do you get BST with the first line of code?
That is because the UK was on BST for the whole of 1970. In fact, the UK was on BST continuously from 1968-02-18 to 1971-10-31, which means R is correct to return "1970-01-01 01:00:01 BST" when you supply the time zone "Europe/London". See Wikipedia for more info on this.
Times:
as.POSIXct(1, origin = "1970-01-01", tz = "Europe/London")
# [1] "1970-01-01 01:00:01 BST"
as.POSIXct(1, origin = "1970-01-01", tz = "GMT")
# [1] "1970-01-01 00:00:01 GMT"
as.POSIXct(1, origin = "1970-01-01", tz = "UTC")
# [1] "1970-01-01 00:00:01 UTC"
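The one-hour gap between the two origins in the question can be checked by looking at the numbers underneath the printed times. This is a minimal sketch, assuming your system's time zone database carries the 1968-1971 history for "Europe/London"; the variable names are only for illustration. A character origin is parsed as midnight GMT, whereas a POSIXct origin created with tz = "Europe/London" is midnight London local time, which was an hour earlier in absolute terms.

```r
# Origin given as a string: parsed as midnight GMT, i.e. epoch second 0.
a <- as.POSIXct(1, origin = "1970-01-01", tz = "Europe/London")

# Origin given as a POSIXct: midnight London local time, which was one hour
# ahead of GMT in January 1970, so it sits at epoch second -3600.
o <- as.POSIXct("1970-01-01", tz = "Europe/London")
b <- as.POSIXct(1, origin = o, tz = "Europe/London")

as.numeric(a)  # 1
as.numeric(o)  # -3600
as.numeric(b)  # -3599

# Same instant as `a`, shown in UTC instead of London time.
format(a, tz = "UTC", usetz = TRUE)  # "1970-01-01 00:00:01 UTC"
```

So both results are consistent: they are different instants, one hour apart, each printed in London time.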
Q2: First you need to know which time zone the dates are from. Then either keep working in that time zone or change the time zone to your local time zone. Or strip the time zone from the date-time object, which would force everything to UTC.
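For numeric input specifically, one possible approach (a sketch, not the only way) is to pin the origin to UTC explicitly and treat local time purely as a display choice. The vector secs below is hypothetical example data, assumed to hold seconds since the Unix epoch.

```r
# Hypothetical epoch seconds (assumed to be counted from 1970-01-01 00:00:00 UTC).
secs <- c(0, 1, 1540983936)

# Pin the origin to UTC explicitly so it cannot depend on the session time zone.
origin_utc <- as.POSIXct("1970-01-01 00:00:00", tz = "UTC")
utc <- as.POSIXct(secs, origin = origin_utc, tz = "UTC")

# The round trip back to numeric recovers the input.
as.numeric(utc)

# Display-only conversion to local time; the stored values are unchanged.
format(utc, tz = "Europe/London", usetz = TRUE)
```

Keeping the stored values in UTC and converting only when printing avoids surprises from historical or seasonal offsets in the local zone.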
I would say use lubridate's force_tz and with_tz functions to handle the time zones. But since you don't want lubridate, set your local time zone to whatever you need. I tend to use Sys.setenv(TZ = "UTC") if I'm working with stock data, so xts objects don't complain when I have a different local time.
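If you want to stay in base R, the two lubridate behaviours can be approximated as follows. This is a rough sketch under the assumption that you only need the basic cases; the variable names are illustrative, and lubridate's own functions handle more edge cases (such as clock times that do not exist in the target zone).

```r
x <- as.POSIXct("2018-07-01 12:00:00", tz = "UTC")

# Same instant, displayed in another zone (roughly what with_tz does):
# only the "tzone" attribute changes, the underlying number does not.
shown <- x
attr(shown, "tzone") <- "Europe/London"

# Same clock reading, reinterpreted in another zone (roughly what force_tz does):
# the underlying number changes when the two zones' offsets differ.
forced <- as.POSIXct(format(x, "%Y-%m-%d %H:%M:%S"), tz = "Europe/London")

as.numeric(shown) == as.numeric(x)   # TRUE
as.numeric(forced) - as.numeric(x)   # -3600 (London was on BST, UTC+1, that day)
```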
Q3: There is a bit on time zones in R for Data Science, and there is an SO post on time zones.
Upvotes: 2