Reputation: 59970
I have found some strange behaviour in as.POSIXlt that I am unable to explain, and I am hoping someone else can. While investigating this question I found that the fractional part of a second is sometimes rounded incorrectly.
For example, the numbers below are times in microseconds since the epoch, so the last 6 digits are the fractional part of the second and the fraction on the first number should be .645990.
# Generate sequence of integers to represent date/times
times <- seq( 1366039619645990 , length.out = 11 )
options(scipen=20)
times
[1] 1366039619645990 1366039619645991 1366039619645992 1366039619645993 1366039619645994 1366039619645995
[7] 1366039619645996 1366039619645997 1366039619645998 1366039619645999 1366039619646000
# Convert to date/time with microseconds
options(digits.secs = 6 )
as.POSIXlt( times/1e6, tz="EST", origin="1970-01-01") + 5e-7
[1] "2013-04-15 10:26:59.645990 EST" "2013-04-15 10:26:59.645991 EST" "2013-04-15 10:26:59.645992 EST"
[4] "2013-04-15 10:26:59.645993 EST" "2013-04-15 10:26:59.645994 EST" "2013-04-15 10:26:59.645995 EST"
[7] "2013-04-15 10:26:59.645996 EST" "2013-04-15 10:26:59.645997 EST" "2013-04-15 10:26:59.645998 EST"
[10] "2013-04-15 10:26:59.645999 EST" "2013-04-15 10:26:59.646000 EST"
I found that I have to add a small increment, equal to half the minimum change in time (5e-7 seconds), to get the correct representation of the fractional part of a second; otherwise rounding errors occur. This works fine when I run as.POSIXlt on a sequence of numbers as above. However, if I convert a single number, namely the one that should end in .645999, the output is truncated to .645 and I do not know why!
# Now just convert the date/time that should end in .645999
as.POSIXlt( times[10]/1e6, tz="EST", origin="1970-01-01") + 5e-7
[1] "2013-04-15 10:26:59.645 EST"
Compare the 10th element of the vector returned by as.POSIXlt with the single-element equivalent above. What is happening?
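A note on why the half-microsecond offset is needed at all: a double holding roughly 1.37e9 seconds cannot store microseconds exactly. The sketch below estimates the spacing between adjacent doubles at that magnitude; the variable names are my own, and the formula assumes ordinary IEEE 754 doubles with a 52-bit fraction.

```r
# One of the timestamps from the question, converted to seconds
x <- 1366039619645999 / 1e6

# Gap between adjacent doubles near x (assumes IEEE 754 double precision)
spacing <- 2^(floor(log2(x)) - 52)
spacing

# Show the stored value with more digits than a microsecond
sprintf("%.10f", x)
```

The spacing comes out at about a quarter of a microsecond, so a stored value can sit slightly below the intended microsecond. Since the fractional seconds appear to be truncated rather than rounded on output, such a value would print one microsecond short unless a small offset is added.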
Session info:
R version 2.15.2 (2012-10-26)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] raster_2.0-41 sp_1.0-5
loaded via a namespace (and not attached):
[1] grid_2.15.2 lattice_0.20-13 tools_2.15.2
Upvotes: 1
Views: 237
Reputation: 59970
This seems to be a rounding issue, whereby significant digits of the fractional second are discarded. The offending(?) code is in the format method for objects of class POSIXlt, namely format.POSIXlt, which is used by print.POSIXlt.
If we use the two values below as an example, format.POSIXlt uses the following test, which I have wrapped in an sapply to show the absolute difference between the seconds and the seconds rounded to successively more digits.
np <- 6L  # the value of getOption("digits.secs") in this session
secs <- c( 59.645998 , 59.645999 )
sapply( seq_len(np) - 1L , function(x) abs(secs - round(secs, x)) )
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0.354002 0.045998 0.004002 0.000002 0.000002 0.000002
[2,] 0.354001 0.045999 0.004001 0.000001 0.000001 0.000001
As you can see, when the seconds end in .xxx999, rounding to 3 or more digits leaves a difference of 0.000001, which affects the printing as follows:
# the number of digits used for the fractional seconds is obtained here
np <- getOption("digits.secs")
# and the number of digits to be printed is controlled in this loop
for (i in seq_len(np) - 1L) if (all(abs(secs - round(secs, i)) < 0.000001)) {
    np <- i
    break
}
This is because the difference that prints as 0.000001 above is actually slightly less than 1e-6 in floating point, so the test in the loop succeeds:
sprintf( "%.20f" , abs(secs[2] - round(secs[2],5)))
[1] "0.00000099999999991773"
# In turn this is used to control the printing of the fractional seconds
if (np == 0L)
"%Y-%m-%d %H:%M:%S"
else paste0("%Y-%m-%d %H:%M:%OS", np)
So the fractional seconds get truncated to only 3 decimal places because the test in the loop already succeeds at i = 3. I think if the test value in the for loop were set to 5e-7 this issue would disappear.
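To experiment with the threshold, here is a minimal sketch that wraps the loop quoted above in a function. The name pick_digits and its tol argument are my own, not part of base R; tol stands in for the hard-coded 0.000001.

```r
# Hypothetical helper mirroring the digit-selection loop quoted above.
# `tol` replaces the hard-coded 0.000001 so other thresholds can be tried.
pick_digits <- function(secs, np = 6L, tol = 1e-6) {
  for (i in seq_len(np) - 1L) {
    if (all(abs(secs - round(secs, i)) < tol)) {
      np <- i
      break
    }
  }
  np
}

# The single value from the question, including its 5e-7 offset
single <- 59.645999 + 5e-7
# The whole sequence from the question, with the same offset
vec <- seq(59.645990, by = 1e-6, length.out = 11) + 5e-7

pick_digits(single)                 # 3: digits are dropped
pick_digits(vec)                    # 6: all() fails, so nothing is dropped
pick_digits(59.645999, tol = 5e-7)  # 6: the tighter threshold keeps all digits
```

Note that the test is wrapped in all(), so a vector only loses digits when every element passes the test at the same number of digits.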
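When the result is a vector, the same code most likely still runs. The test is wrapped in all(), so the digits are only dropped if every element is within the threshold of its rounded value; in the vector from the question most elements are not, so all six digits are kept for every element.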
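If the aim is simply to always see six digits, one possible workaround is to bypass the automatic digit selection by giving format() an explicit "%OS6" specification. This is only a sketch under the question's settings (same origin, time zone and 5e-7 offset); I have not checked it on every platform.

```r
# Single timestamp from the question, with the half-microsecond offset
x <- as.POSIXct(1366039619645999 / 1e6, origin = "1970-01-01", tz = "EST") + 5e-7

# Ask for six fractional digits explicitly instead of relying on digits.secs
out <- format(x, "%Y-%m-%d %H:%M:%OS6")
out
```

Because the number of digits is fixed in the format string, the loop in format.POSIXlt that reduces np should not come into play here.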
Upvotes: 2
Reputation: 8753
I haven't got a proper answer (I'll keep looking into it), but I thought this was interesting:
times <- seq( 1366039619645990 , length.out = 11 )
# Convert to date/time with microseconds
options(digits.secs = 6 )
test <- as.POSIXlt( times/1e6, tz="EST", origin="1970-01-01") + 5e-7
test1 <- test  # template with the same class and time zone; every element is overwritten below
for(i in 1:11)
test1[i] <- as.POSIXlt(times[i]/1e6, tz="EST", origin="1970-01-01") + 5e-7
> identical(test, test1)
[1] TRUE
BTW, in single statements I got the same result as you...
> test
[1] "2013-04-15 10:26:59.645990 EST" "2013-04-15 10:26:59.645991 EST" "2013-04-15 10:26:59.645992 EST"
[4] "2013-04-15 10:26:59.645993 EST" "2013-04-15 10:26:59.645994 EST" "2013-04-15 10:26:59.645995 EST"
[7] "2013-04-15 10:26:59.645996 EST" "2013-04-15 10:26:59.645997 EST" "2013-04-15 10:26:59.645998 EST"
[10] "2013-04-15 10:26:59.645999 EST" "2013-04-15 10:26:59.646000 EST"
> test[10]
[1] "2013-04-15 10:26:59.645 EST"
> as.POSIXlt( times[10]/1e6, tz="EST", origin="1970-01-01") + 5e-7
[1] "2013-04-15 10:26:59.645 EST"
Looking at the last two statements, it seems that this issue is mainly related to displaying a single value rather than a vector. But even in this case it would be a truncation, probably via floor, not a rounding.
Upvotes: 1