Reputation: 14360
I have a character datetime column in a file. I load the file (into a data.table) and do things that require the column to be converted to POSIXct. I then need to write the POSIXct value back to file, but the datetime will not be the same (because it is printed incorrectly).
This print/formatting issue is well known and has been discussed several times. I've read some posts describing this issue. The most authoritative answers I found are given in response to this question. The answers to that question provide two functions (myformat.POSIXct and form) that are supposed to solve this issue, but they do not seem to work on this example:
x <- "04-Jan-2013 17:22:08.139"
options("digits.secs"=6)
form(as.POSIXct(x,format="%d-%b-%Y %H:%M:%OS"),format="%d-%b-%Y %H:%M:%OS3")
[1] "04-Jan-2013 17:22:08.138"
form(as.POSIXct(x,format="%d-%b-%Y %H:%M:%OS"),format="%d-%b-%Y %H:%M:%OS4")
[1] "04-Jan-2013 17:22:08.1390"
myformat.POSIXct(as.POSIXct(x,format="%d-%b-%Y %H:%M:%OS"),digits=3)
[1] "2013-01-04 17:22:08.138"
myformat.POSIXct(as.POSIXct(x,format="%d-%b-%Y %H:%M:%OS"),digits=4)
[1] "2013-01-04 17:22:08.1390"
My sessionInfo:
R version 2.15.2 (2012-10-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=C
attached base packages:
[1] stats graphics grDevices datasets utils methods base
other attached packages:
[1] fasttime_1.0-0 data.table_1.8.9 bit64_0.9-2 bit_1.1-9
[5] sas7bdat_0.3 chron_2.3-43 vimcom_0.9-6
loaded via a namespace (and not attached):
[1] tools_2.15.2
Upvotes: 7
Views: 10153
Reputation: 4765
Two things:
1) @statquant is right (and the otherwise well-known experts @Joshua Ulrich and @Dirk Eddelbuettel are wrong), as is @Aaron in his comment, but that will not be important for the main question here:
POSIXlt by design is definitely more accurate in storing times than POSIXct: as its seconds are always in [0, 60), it has a granularity of about 6e-15, i.e., 6 femtoseconds, which is dozens of millions of times finer than that of POSIXct.
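The granularity figures above can be checked with a rough sketch, assuming times are stored as IEEE 754 doubles. The helper name `double_spacing` is made up for illustration and is not part of R.

```r
# Spacing between adjacent doubles near a positive value x,
# assuming IEEE 754 double precision (52 explicit mantissa bits).
double_spacing <- function(x) 2^(floor(log2(x)) - 52)

ct_secs <- 1357341728   # seconds since the epoch in early 2013, as POSIXct stores them
lt_secs <- 59           # POSIXlt keeps the seconds within the minute separately

spacing_ct <- double_spacing(ct_secs)  # 2^-22, roughly 2.4e-07 seconds
spacing_lt <- double_spacing(lt_secs)  # 2^-47, roughly 7e-15 seconds
spacing_ct / spacing_lt                # 2^25, i.e. about 33.5 million
```

Under that assumption, the ratio between the two spacings is a power of two in the tens of millions, which matches the order of magnitude quoted above.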
However, this is not very relevant here (and for current R): almost all operations, notably numeric ones, use the Ops group method (yes, not known to beginners, but well documented); just look at Ops.POSIXt, which indeed trashes the extra precision by first coercing to POSIXct. In addition, format()ing/print()ing uses at most 6 decimals after the ".", and hence also does not distinguish between the internally higher precision of POSIXlt and the "only" sub-microsecond (a few hundred nanoseconds) granularity of POSIXct.
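A small sketch of that coercion, using addition as the example operation; the object names are illustrative only. It just inspects the class of the result and makes no claim about how any particular R version prints it.

```r
lt <- as.POSIXlt("2013-01-04 17:22:08.139", tz = "UTC")
shifted <- lt + 1   # arithmetic on a POSIXlt goes through the POSIXt methods

class(lt)           # "POSIXlt" "POSIXt"
class(shifted)      # "POSIXct" "POSIXt": the list representation is gone
```

Once the result is a POSIXct, any extra precision held in the `sec` component of the POSIXlt has been folded into a single double.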
(For the above reason, both Dirk and Joshua were led to their wrong assertion: for all simple practical uses, the precision of *lt and *ct is made the same.)
2) I do tend to agree that we (R Core) should improve the format()ing and hence print()ing of such fractions of seconds in POSIXt objects (even after the bug fix mentioned by @Aaron above).
But then I may be wrong, and "we" have got it right, by some definition of "right" ;-)
Upvotes: 3
Reputation: 37754
So I guess you do need a little fudge factor added to my suggestion here: https://stackoverflow.com/a/7730759/210673. This seems to work but perhaps might include other bugs; test carefully and think about what it's doing before using for anything important.
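myformat.POSIXct <- function(x, digits=0) {
    # Round the underlying numeric time, then restore class and tzone attributes
    x2 <- round(unclass(x), digits)
    attributes(x2) <- attributes(x)
    x <- as.POSIXlt(x2)
    # Fudge factor: nudge the seconds up so truncation in format() keeps the rounded digit
    x$sec <- round(x$sec, digits) + 10^(-digits-1)
    format.POSIXlt(x, paste("%Y-%m-%d %H:%M:%OS",digits,sep=""))
}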
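An alternative sketch that avoids the fudge factor by doing the rounding in whole milliseconds and printing the fraction separately. The name `format_ms` and its arguments are hypothetical, it is fixed at three fractional digits, and it assumes the time zone is passed explicitly (UTC by default here); test it on your own data before relying on it.

```r
# Hypothetical helper: format a POSIXct with exactly three fractional digits,
# rounding (not truncating) to the nearest millisecond.
format_ms <- function(x, fmt = "%Y-%m-%d %H:%M:%S", tz = "UTC") {
  ms_total <- round(as.numeric(x) * 1000)  # whole milliseconds since the epoch
  whole    <- ms_total %/% 1000            # whole seconds
  ms       <- ms_total %% 1000             # leftover milliseconds, 0..999
  base <- format(as.POSIXct(whole, origin = "1970-01-01", tz = tz), fmt, tz = tz)
  sprintf("%s.%03d", base, as.integer(ms))
}

x <- as.POSIXct("2013-01-04 17:22:08.139", tz = "UTC")
format_ms(x)   # "2013-01-04 17:22:08.139"
```

Passing `fmt = "%d-%b-%Y %H:%M:%S"` should give the layout used in the question, with the month abbreviation depending on the LC_TIME locale.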
Upvotes: 5
Reputation: 368201
When you write "My understanding is that POSIXct representation is less precise than the POSIXlt representation", you are plain wrong.
It is the same representation for both -- down to milliseconds on Windows, and down to (almost) microseconds on the other OSs. Did you read help(DateTimeClasses)?
As for your last question, yes the development version of my RcppBDT package uses Boost Date.Time and can go all the way to nanoseconds if your OS supports it and you turned the proper representation on. But it does replace POSIXct, and does not yet support vectors of time objects.
Edit: Regarding your follow-up question:
R> one <- Sys.time(); two <- Sys.time(); two - one
Time difference of 7.43866e-05 secs
R>
R> as.POSIXlt(two) - as.POSIXlt(one)
Time difference of 7.43866e-05 secs
R>
R> one # options("digits.secs"=6) on my box
[1] "2013-03-13 07:30:57.757937 CDT"
R>
Edit 2: I think you are simply experiencing that floating point representation on computers is inexact:
R> print(as.numeric(as.POSIXct("04-Jan-2013 17:22:08.138",
+ format="%d-%b-%Y %H:%M:%OS")), digits=18)
[1] 1357341728.13800001
R> print(as.numeric(as.POSIXct("04-Jan-2013 17:22:08.139",
+ format="%d-%b-%Y %H:%M:%OS")), digits=18)
[1] 1357341728.13899994
R>
The difference is not precisely 1/1000 as you assumed.
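To make the consequence concrete, here is a minimal sketch that looks only at the stored fractional part, assuming the time is parsed in UTC (the variable names are illustrative). It shows why a formatter that truncates to three digits lands on .138 while one that rounds lands on .139.

```r
x <- as.POSIXct("2013-01-04 17:22:08.139", tz = "UTC")
secs <- as.numeric(x)
frac <- secs - floor(secs)   # fractional seconds as actually stored

frac < 0.139                 # TRUE: the nearest double sits just below .139
floor(frac * 1000)           # 138: truncating to milliseconds drops to .138
round(frac * 1000)           # 139: rounding recovers the intended value
```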
Upvotes: 3
Reputation: 176648
As the answers to the questions you linked to already say, how a value is printed/formatted is not the same as what the actual value is. This is just a printed representation issue.
R> as.POSIXct('2011-10-11 07:49:36.3')-as.POSIXlt('2011-10-11 07:49:36.3')
Time difference of 0 secs
R> as.POSIXct('2011-10-11 07:49:36.2')-as.POSIXlt('2011-10-11 07:49:36.3')
Time difference of -0.0999999 secs
Your understanding that POSIXct is less precise than POSIXlt is incorrect. You're also incorrect in saying that you can't include a POSIXlt object as a column in a data.frame.
R> x <- data.frame(date=Sys.time())
R> x$date <- as.POSIXlt(x$date)
R> str(x)
'data.frame': 1 obs. of 1 variable:
$ date: POSIXlt, format: "2013-03-13 07:38:48"
Upvotes: 4