Reputation: 401
This might sound like a duplicate, but I have gone through many POSIXct-related questions and did not come across this one. If you do find one, I would really appreciate being pointed in that direction. as.POSIXct is behaving very oddly in my case. See the example below:
options(digits.secs = 3)
test_time <- "2017-01-26 23:00:00.010"
test_time <- as.POSIXct(test_time, format = "%Y-%m-%d %H:%M:%OS")
This returns:
"2017-01-26 23:00:00.00"
Next, I tried the following option, and it returns NA. I have no idea why it behaves like this, when all I need is for it to convert to "2017-01-26 23:00:00.010".
test_time <- "2017-01-26 23:00:00.010"
test_time <- as.POSIXct(test_time, format = "%Y-%m-%d %H:%M:%OS3")
Now it works fine when I do this:
as.POSIXlt(strptime(test_time,format = "%Y-%m-%d %H:%M:%OS"), format = "%Y-%m-%d %H:%M:%OS")
But for my purposes I need this as a POSIXct object, because some libraries I am working with only accept POSIXct objects. Converting the POSIXlt to POSIXct results in the same problem as before. Is there an issue with my system settings? The date is also not one of the daylight saving time transitions that could throw an error. Why would it work with one format and not the others? Any leads or suggestions are welcome!
Running on Windows 10 64-bit
Upvotes: 0
Views: 905
Reputation: 2950
The issue here has to do with the maximum precision that POSIXct can handle. It is backed by a double under the hood, representing the number of seconds since the epoch, midnight on 1970-01-01 UTC. Fractional seconds are represented as the fractional part of that double, e.g. 63.02 represents 1970-01-01 00:01:03.02 UTC.
options(digits = 22, digits.secs = 3)
.POSIXct(63.02, tz = "UTC")
#> [1] "1970-01-01 00:01:03.02 UTC"
63.02
#> [1] 63.02000000000000312639
Now, when working with doubles there are limits to the precision that they can represent exactly. You can see this with the above example; typing in 63.02 in the console doesn't return exactly the same number, and instead returns something close, but with some extra bits at the end.
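To get a feel for how coarse a double becomes as the number grows, here is a minimal sketch. The `ulp()` helper is hypothetical (it is not a base R function) and assumes a positive, finite input; it returns the gap between neighbouring doubles at a given magnitude.

```r
# Hypothetical helper: spacing between adjacent doubles near x
# (assumes x is positive and finite; doubles carry a 52-bit fraction)
ulp <- function(x) 2^(floor(log2(x)) - 52)

ulp(63.02)       # roughly 7.1e-15, so milliseconds fit with room to spare
ulp(1485489600)  # roughly 2.4e-07, the grid for present-day epoch seconds

# At that magnitude, the double closest to ...600.010 sits just below it
x <- 1485489600.010
(x - floor(x)) < 0.01
#> [1] TRUE
```

The gap of about 2.4e-07 seconds is far smaller than a millisecond, so the fraction is not lost; it just cannot land exactly on .010.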
So now let's take a look at your example. If we start as "low level" as possible, the first thing as.POSIXct() does is call strptime(), which returns a POSIXlt object. That keeps each "field" of the date-time as a separate element (i.e. year is kept separate from month, day, second, etc). We can see that it parsed correctly and our sec field holds 0.01.
# `digits.secs` to print 3 fractional digits (has no effect on parsing)
# `digits` to print up to 22 significant digits for double values
options(digits.secs = 3, digits = 22)
x <- "2017-01-26 23:00:00.010"
# looks good
lt <- strptime(x, format = "%Y-%m-%d %H:%M:%OS", tz = "America/New_York")
lt
#> [1] "2017-01-26 23:00:00.01 EST"
# This is a POSIXlt, which is a list holding fields like year,month,day,...
class(lt)
#> [1] "POSIXlt" "POSIXt"
# sure enough...
lt$sec
#> [1] 0.01000000000000000020817
But now convert that to POSIXct. At this point, the individual fields are collapsed into a single double, which might have precision issues.
# now convert to POSIXct (i.e. a single double holding all the info)
# looks like we lost the fractional seconds?
ct <- as.POSIXct(lt)
ct
#> [1] "2017-01-26 23:00:00.00 EST"
# no, they are still there, but the precision in the `double` data type
# isn't enough to be able to represent this exactly as `1485489600.010`
unclass(ct)
#> [1] 1485489600.009999990463
#> attr(,"tzone")
#> [1] "America/New_York"
So the fractional part of the double value in ct is close to .010, but the double can't represent it exactly and holds a value slightly less than .010, which gets (I presume) rounded down when the POSIXct is printed, making it look like you lost the fractional seconds.
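If you have to stay in base R, one possible workaround is sketched below. It assumes the fractional seconds are only ever dropped at print time, as the value above suggests. `%OS3` is documented in ?strptime as an output specification, which would also explain the NA when it is used for parsing, so the sketch parses with `%OS` and only uses `%OS3` when formatting. The `fmt_ms()` helper is hypothetical: it nudges the value by half a millisecond before formatting, so that a value sitting just below .010 still prints as .010.

```r
x  <- "2017-01-26 23:00:00.010"
ct <- as.POSIXct(x, format = "%Y-%m-%d %H:%M:%OS", tz = "America/New_York")

# The fractional part is still stored, just slightly below 0.010
frac <- as.numeric(ct) %% 1
frac

# Ask for three digits explicitly on output; the result may still show
# the value cut short, so compare it with the nudged version below
format(ct, "%Y-%m-%d %H:%M:%OS3")

# Hypothetical helper: add half a millisecond before formatting so the
# printed value is effectively rounded to the nearest millisecond
fmt_ms <- function(t) format(t + 0.0005, "%Y-%m-%d %H:%M:%OS3")
fmt_ms(ct)
#> [1] "2017-01-26 23:00:00.010"
```

Note that this only changes how the value is displayed. The POSIXct object itself still holds the slightly-too-small double, so comparisons at millisecond resolution may need a tolerance.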
Because these issues are so troublesome, I recommend using the low level API of the clock package (note that I wrote this package). It has support for fractional seconds up to nanoseconds without loss of precision (by using a different data structure than POSIXct). https://clock.r-lib.org/
library(clock)
x <- "2017-01-26 23:00:00.010"
nt <- naive_time_parse(x, format = "%Y-%m-%d %H:%M:%S", precision = "millisecond")
nt
#> <time_point<naive><millisecond>[1]>
#> [1] "2017-01-26 23:00:00.010"
# If you need it in a time zone
as_zoned_time(nt, zone = "America/New_York")
#> <zoned_time<millisecond><America/New_York>[1]>
#> [1] "2017-01-26 23:00:00.010-05:00"
Upvotes: 1