statquant
statquant

Reputation: 14360

Accurately converting from character->POSIXct->character with sub millisecond datetimes

I have a character datetime column in a file. I load the file (into a data.table) and do things that require the column to be converted to POSIXct. I then need to write the POSIXct value back to file, but the datetime will not be the same (because it is printed incorrectly).

This print/formatting issue is well known and has been discussed several times. I've read some posts describing this issue. The most authoritative answers I found are given in response to this question. The answers to that question provide two functions (myformat.POSIXct and form) that are supposed to solve this issue, but they do not seem to work on this example:

x <- "04-Jan-2013 17:22:08.139"
options("digits.secs"=6)
form(as.POSIXct(x,format="%d-%b-%Y %H:%M:%OS"),format="%d-%b-%Y %H:%M:%OS3")
[1] "04-Jan-2013 17:22:08.138"
form(as.POSIXct(x,format="%d-%b-%Y %H:%M:%OS"),format="%d-%b-%Y %H:%M:%OS4")
[1] "04-Jan-2013 17:22:08.1390"
myformat.POSIXct(as.POSIXct(x,format="%d-%b-%Y %H:%M:%OS"),digits=3)
[1] "2013-01-04 17:22:08.138"
myformat.POSIXct(as.POSIXct(x,format="%d-%b-%Y %H:%M:%OS"),digits=4)
[1] "2013-01-04 17:22:08.1390"

My sessionInfo:

R version 2.15.2 (2012-10-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C                        
[5] LC_TIME=C                              

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
[1] fasttime_1.0-0   data.table_1.8.9 bit64_0.9-2      bit_1.1-9
[5] sas7bdat_0.3     chron_2.3-43     vimcom_0.9-6    

loaded via a namespace (and not attached):
[1] tools_2.15.2

Upvotes: 7

Views: 10153

Answers (4)

Martin M&#228;chler
Martin M&#228;chler

Reputation: 4765

Two things:

1) @statquant is right (and the otherwise known experts @Joshua Ulrich and @Dirk Eddelbuettel are wrong), and @Aaron in his comment, but that will not be important for the main question here:

POSIXlt by design is definitely more accurate in storing times than POSIXct: As its seconds are always in [0, 60), it has a granularity of about 6e-15, i.e., 6 femtoseconds which would be dozens of million times less granular than POSIXct.

However, this is not very relevant here (and for current R): Almost all operations, notably numeric ones, use the Ops group method (yes, not known to beginners, but well documented), just look at Ops.POSIXt which indeed trashes the extra precision by first coercing to POSIXct. In addition, the format()/print() ing uses 6 decimals after the "." at most, and hence also does not distinguish between the internally higher precision of POSIXlt and the "only" 100 nanosecond granularity of POSIXct.
(For the above reason, both Dirk and Joshua were lead to their wrong assertion: For all simple practical uses, the precision of *lt and *ct is made the same).

2) I do tend to agree that we (R Core) should improve the format()ing and hence print()ing of such fractions of seconds POSIXt objects (still after the bug fix mentioned by @Aaron above).
But then I may be wrong, and "we" have got it right, by some definition of "right" ;-)

Upvotes: 3

Aaron - mostly inactive
Aaron - mostly inactive

Reputation: 37754

So I guess you do need a little fudge factor added to my suggestion here: https://stackoverflow.com/a/7730759/210673. This seems to work but perhaps might include other bugs; test carefully and think about what it's doing before using for anything important.

myformat.POSIXct <- function(x, digits=0) {
  x2 <- round(unclass(x), digits)
  attributes(x2) <- attributes(x)
  x <- as.POSIXlt(x2)
  x$sec <- round(x$sec, digits) + 10^(-digits-1)
  format.POSIXlt(x, paste("%Y-%m-%d %H:%M:%OS",digits,sep=""))
}

Upvotes: 5

Dirk is no longer here
Dirk is no longer here

Reputation: 368201

When you write

My understanding is that POSIXct representation is less precise than the POSIXlt representation

you are plain wrong.

It is the same representation for both -- down to milliseconds on Windows, and down to (almost) microseconds on the other OSs. Did you read help(DateTimeClasses) ?

As for your last question, yes the development version of my RcppBDT package uses Boost Date.Time and can go all the way to nanoseconds if your OS supports it and you turned the proper representation on. But it does replace POSIXct, and does not yet support vectors of time objects.

Edit: Regarding your follow-up question:

R> one <- Sys.time(); two <- Sys.time(); two - one
Time difference of 7.43866e-05 secs
R>
R> as.POSIXlt(two) - as.POSIXlt(one)
Time difference of 7.43866e-05 secs
R> 
R> one    # options("digits.sec"=6) on my box
[1] "2013-03-13 07:30:57.757937 CDT"
R> 

Edit 2: I think you are simply experiencing that floating point representation on computers is inexact:

R> print(as.numeric(as.POSIXct("04-Jan-2013 17:22:08.138",
+                   format="%d-%b-%Y %H:%M:%OS")), digits=18)
[1] 1357341728.13800001
R> print(as.numeric(as.POSIXct("04-Jan-2013 17:22:08.139",
+                   format="%d-%b-%Y %H:%M:%OS")), digits=18)
[1] 1357341728.13899994
R> 

The difference is not precisely 1/1000 as you assumed.

Upvotes: 3

Joshua Ulrich
Joshua Ulrich

Reputation: 176648

As the answers to the questions you linked to already say, how a value is printed/formatted is not the same as what the actual value is. This is just a printed representation issue.

R> as.POSIXct('2011-10-11 07:49:36.3')-as.POSIXlt('2011-10-11 07:49:36.3')
Time difference of 0 secs
R> as.POSIXct('2011-10-11 07:49:36.2')-as.POSIXlt('2011-10-11 07:49:36.3')
Time difference of -0.0999999 secs

Your understanding that POSIXct is less precise than POSIXlt is incorrect. You're also incorrect in saying that you can't include a POSIXlt object as a column in a data.frame.

R> x <- data.frame(date=Sys.time())
R> x$date <- as.POSIXlt(x$date)
R> str(x)
'data.frame':   1 obs. of  1 variable:
 $ date: POSIXlt, format: "2013-03-13 07:38:48"

Upvotes: 4

Related Questions