Simon O'Hanlon

Reputation: 59970

Rounding error with microseconds using as.POSIXlt

I have found some strange behaviour in as.POSIXlt that I am unable to explain, and I am hoping someone else can. While investigating this question, I found that the fractional part of a second is sometimes rounded incorrectly.

For example, the numbers below are microseconds since the epoch, so the last 6 digits are the fractional part of the second. The fraction of a second for the first number should therefore be .645990.
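As a minimal sketch of that encoding (the variable names are illustrative and not from the original post), integer division and modulo by 1e6 split one of these values into whole seconds and microseconds. The value is below 2^53, so a double holds it exactly:

```r
# One timestamp in microseconds since the epoch (first value of the sequence)
us <- 1366039619645990

whole_secs <- us %/% 1e6   # whole seconds since 1970-01-01
micro_part <- us %% 1e6    # microseconds within that second

whole_secs   # 1366039619
micro_part   # 645990
```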

# Generate sequence of integers to represent date/times
times <- seq( 1366039619645990 , length.out = 11 )
options(scipen=20)
times
 [1] 1366039619645990 1366039619645991 1366039619645992 1366039619645993 1366039619645994 1366039619645995
 [7] 1366039619645996 1366039619645997 1366039619645998 1366039619645999 1366039619646000

# Convert to date/time with microseconds 
options(digits.secs = 6 )
as.POSIXlt( times/1e6, tz="EST", origin="1970-01-01") + 5e-7
 [1] "2013-04-15 10:26:59.645990 EST" "2013-04-15 10:26:59.645991 EST" "2013-04-15 10:26:59.645992 EST"
 [4] "2013-04-15 10:26:59.645993 EST" "2013-04-15 10:26:59.645994 EST" "2013-04-15 10:26:59.645995 EST"
 [7] "2013-04-15 10:26:59.645996 EST" "2013-04-15 10:26:59.645997 EST" "2013-04-15 10:26:59.645998 EST"
[10] "2013-04-15 10:26:59.645999 EST" "2013-04-15 10:26:59.646000 EST"

I found that I have to add a small increment, equal to half the smallest change in time (5e-7 seconds), to get the correct representation of the fractional part of a second; otherwise rounding errors occur. This works fine if I run as.POSIXlt on a sequence of numbers as above. However, if I convert a single number, namely the one that should end in .645999, the result is truncated to .645 and I do not know why!
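The half-increment offset can be checked numerically. A double near 1.4e9 resolves only about 2.4e-7 seconds, so dividing by 1e6 may leave a value slightly below the intended microsecond. The sketch below (names are illustrative) adds 5e-7 and truncates back to whole microseconds, which should recover every original integer:

```r
times <- seq(1366039619645990, length.out = 11)

secs <- times / 1e6                       # seconds since the epoch, inexact in binary
recovered <- floor((secs + 5e-7) * 1e6)   # offset, then truncate to whole microseconds

all(recovered == times)                   # TRUE: the offset absorbs the representation error
```

The offset works because the accumulated floating-point error at this magnitude stays well below half a microsecond.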

# Now just convert the date/time that should end in .645999
as.POSIXlt( times[10]/1e6, tz="EST", origin="1970-01-01") + 5e-7
[1] "2013-04-15 10:26:59.645 EST"

Compare the 10th element in the vector returned by as.POSIXlt with the single element equivalent above. What is happening?

Session info:

R version 2.15.2 (2012-10-26)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] raster_2.0-41 sp_1.0-5     

loaded via a namespace (and not attached):
[1] grid_2.15.2     lattice_0.20-13 tools_2.15.2

Upvotes: 1

Views: 237

Answers (2)

Simon O'Hanlon

Reputation: 59970

This seems to be a rounding issue, whereby significant digits of the fractional second are discarded. The offending(?) code is in the format method for objects of class POSIXlt, namely format.POSIXlt, which is used by print.POSIXlt.

Taking the two values below as an example, format.POSIXlt uses the following expression, which I have wrapped in an sapply to show the absolute difference between the seconds and the seconds rounded to successively more digits.

secs <- c( 59.645998 , 59.645999 )
np <- 6L  # the value of getOption("digits.secs") set in the question
sapply( seq_len(np) - 1L , function(x) abs(secs - round(secs, x)) )
         [,1]     [,2]     [,3]     [,4]     [,5]     [,6]
[1,] 0.354002 0.045998 0.004002 0.000002 0.000002 0.000002
[2,] 0.354001 0.045999 0.004001 0.000001 0.000001 0.000001

As you can see, when the seconds end in .xxx999, rounding to 3 or more digits leaves a difference of 0.000001, which affects the printing thus:

# the number of digits used for the fractional seconds is gotten here
np <- getOption("digits.secs")

# and the length of digits to be printed is controlled in this loop
for (i in seq_len(np) - 1L) {
    if (all(abs(secs - round(secs, i)) < 0.000001)) {
        np <- i
        break
    }
}

This is because the difference displayed as 0.000001 is actually slightly smaller than 0.000001 when evaluated in the above method:

sprintf( "%.20f" , abs(secs[2] - round(secs[2],5)))
[1] "0.00000099999999991773"

# In turn, np is used to control the printing of the fractional seconds
if (np == 0L) "%Y-%m-%d %H:%M:%S" else paste0("%Y-%m-%d %H:%M:%OS", np)

So the fractional seconds get truncated to only 3 decimal places because the difference falls just under the 0.000001 threshold tested in the loop. I think that if the threshold in the for loop were set to 5e-7, this issue would disappear.
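To experiment with that suggestion, here is a minimal sketch that copies the loop into a standalone function with the threshold exposed as an argument. The function digits_needed and its tol parameter are hypothetical names for illustration and are not part of base R:

```r
# Hypothetical helper mirroring the digit-selection loop quoted above,
# with the threshold exposed as `tol` (not part of base R).
digits_needed <- function(secs, np = 6L, tol = 1e-6) {
  for (i in seq_len(np) - 1L) {
    if (all(abs(secs - round(secs, i)) < tol)) {
      return(i)   # first digit count at which rounding is "close enough"
    }
  }
  np              # no shorter digit count passed the test
}

digits_needed(59.645999)              # threshold from the quoted loop
digits_needed(59.645999, tol = 5e-7)  # 6: the microsecond digits are kept
```

With the smaller threshold, a difference of roughly 0.000001 no longer counts as negligible, so the loop never shortens the output for this value.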

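When the result returned is a vector, the behaviour differs, most likely because the all() in the loop requires every element to pass the test before the number of digits is reduced; a different print method need not be involved.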
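The vector case can be probed with the same test. This is a sketch of one possible explanation, not a confirmed diagnosis: since the loop wraps its test in all(), a single element that fails at a given digit count keeps the full digits.secs precision for the whole vector. The variable names are illustrative:

```r
# The eleven seconds values from the question's sequence
secs_vec <- 59.645990 + (0:10) * 1e-6

# For each candidate digit count, does every element pass the threshold?
passes <- sapply(0:5, function(i) all(abs(secs_vec - round(secs_vec, i)) < 1e-6))

any(passes)   # FALSE: no shorter digit count suits the whole vector
```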

Upvotes: 2

Michele

Reputation: 8753

I haven't got a proper answer (I'll keep looking into it), but I thought this was interesting:

times <- seq( 1366039619645990 , length.out = 11 )
# Convert to date/time with microseconds
options(digits.secs = 6 )

test <- as.POSIXlt( times/1e6, tz="EST", origin="1970-01-01") + 5e-7

test1 <- test[0]  # empty vector with the same class and time zone as test
for(i in 1:11)
  test1[i] <- as.POSIXlt(times[i]/1e6, tz="EST", origin="1970-01-01") + 5e-7

> identical(test, test1)
[1] TRUE

BTW, in single statements I got the same result as you...

> test
 [1] "2013-04-15 10:26:59.645990 EST" "2013-04-15 10:26:59.645991 EST" "2013-04-15 10:26:59.645992 EST"
 [4] "2013-04-15 10:26:59.645993 EST" "2013-04-15 10:26:59.645994 EST" "2013-04-15 10:26:59.645995 EST"
 [7] "2013-04-15 10:26:59.645996 EST" "2013-04-15 10:26:59.645997 EST" "2013-04-15 10:26:59.645998 EST"
[10] "2013-04-15 10:26:59.645999 EST" "2013-04-15 10:26:59.646000 EST"
> test[10]
[1] "2013-04-15 10:26:59.645 EST"
> as.POSIXlt( times[10]/1e6, tz="EST", origin="1970-01-01") + 5e-7
[1] "2013-04-15 10:26:59.645 EST"

Looking at the last two statements, it seems that this issue is mainly related to displaying a single value rather than a vector. But even in this case it would be a truncation, probably via floor, not a rounding.
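One way to sidestep the automatic choice of digits, sketched under the assumption that an explicit format string is acceptable, is to request six fractional digits with %OS6. The time zone is set to UTC here only to keep the example portable:

```r
# The single value from the question, with the same half-microsecond offset
x <- as.POSIXct(1366039619645999 / 1e6 + 5e-7, tz = "UTC", origin = "1970-01-01")

# %OS6 asks for six fractional digits regardless of the digits.secs heuristic
out <- format(x, "%Y-%m-%d %H:%M:%OS6")
out
```

Because %OSn truncates rather than rounds on output, the half-microsecond offset used in the question is still needed to land on the intended digit.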

Upvotes: 1
