Reputation: 572
I imagine this has to do with R's data structures and the answer will be quick, but I haven't yet found one so here goes:
as.character(9875987598759875)
[1] "9875987598759876"
library(crayon)
chr(9875987598759875)
[1] "9875987598759876"
toString(9875987598759875)
[1] "9875987598759876"
What gives? How should I be making this conversion more safely?
Upvotes: 1
Views: 272
Reputation: 226087
.Machine$integer.max indicates that the largest integer R can store is 2147483647 (this could conceivably vary across platforms, but it's very unlikely to). Any number larger than that is automatically converted to floating point, with the attendant imprecision/round-off error. (Unlike in Python, which expensively but magically converts integer variables to an arbitrary-length representation as necessary.)
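To be precise about where the round-off starts: a double can hold every whole number exactly up to 2^53 (about 9.007e15), and only beyond that do gaps appear between representable integers, which is why 9875987598759875 comes back as ...876. A quick base-R illustration:
print(2^53, digits=22)
## [1] 9007199254740992
2^53 + 1 == 2^53   # the "+ 1" is silently lost to rounding
## [1] TRUE
9875987598759875 > 2^53   # so this literal is rounded before as.character() ever sees it
## [1] TRUE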
If you install the bit64 package you can use 64-bit integers, with (presumably) exactness up to 2^63 - 1:
print(2^63-1, digits=22)
[1] 9223372036854775808
(The printed value is off by one, 2^63 rather than 2^63 - 1, because that expression is still being evaluated in double precision; the true integer64 maximum is 9223372036854775807.)
If you start with a character string, you can safely do round-trip conversion to integer64 and back:
library(bit64)
cc <- "9875987598759875"
x <- as.integer64(cc)
identical(cc,as.character(x))
## [1] TRUE
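The same round trip holds right at the integer64 boundary, and (going by bit64's documented overflow checking, not verified here) arithmetic that would overflow is flagged as NA with a warning rather than silently wrapping:
big <- as.integer64("9223372036854775807")   # 2^63 - 1, entered as a string
as.character(big)
## [1] "9223372036854775807"
is.na(big + 1L)   # overflow -> NA (plus a warning)
## [1] TRUE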
However, typically once you've read a number into R as a regular (double-precision) number it's too late to get the lost digits back. You can use colClasses="integer64" with read.table()/read.csv()/etc. to read values in as integer64; I believe the file-reading functions from readr and data.table also have integer64-handling capabilities.
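As a concrete sketch of the data.table route (the file name and column name here are invented for illustration; fread()'s integer64 argument is what controls the behaviour):
library(bit64)
library(data.table)

writeLines(c("id", "9875987598759875"), "big_ids.csv")   # toy one-column file

dt <- fread("big_ids.csv", integer64 = "integer64")   # keep over-wide integers exact
class(dt$id)
## [1] "integer64"
as.character(dt$id)
## [1] "9875987598759875"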
For many applications, if you're not actually planning on doing anything numerical with these digit-strings, it's safest and easiest to make sure you import them as character in the first place ...
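Reusing the toy file from the sketch above, that is just:
df <- read.csv("big_ids.csv", colClasses = "character")   # keep the digits as text
df$id
## [1] "9875987598759875"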
Upvotes: 3