Ben G

Reputation: 4328

as.integer() on an int64 dataframe produces unexpected result

I was reviewing some code and came across this odd result. If you have a dataframe with one value of type integer and you coerce it to integer, you get what you would expect:

library(dplyr)

tibble(x = as.integer(c(1))) %>% as.integer()

[1] 1

But if it's of type int64, you get something weird:

library(bit64)

tibble(x = as.integer64(c(1))) %>% as.integer()

[1] 0

What gives? I assume it has something to do with the int64 class. But why would I get zero? Is this just bad error handling?

Update

OK, there's a hint as to what's going on when you call dput on the int64 dataframe:

structure(list(x = structure(4.94065645841247e-324, 
                             class = "integer64")), 
          row.names = c(NA, -1L), 
          class = c("tbl_df", "tbl", "data.frame"))

So as.integer() is rightly converting 4.94065645841247e-324 to zero. But why is that what's stored in the DF?

Also, to show that this is not specific to constructing the value with bit64, I get a very similar structure on the actual df I get back from my database:

structure(list(max = structure(2.78554211125295e-320,
                               class = "integer64")),
          class = "data.frame", 
          row.names = c(NA, -1L))

Upvotes: 6

Views: 1260

Answers (1)

SmokeyShakers

Reputation: 3412

I think this is a limitation of bit64. bit64 provides the S3 method as.integer.integer64 to convert from int64 to int, but that method is only dispatched on integer64 vectors. Base as.integer can be applied to other objects, such as a data.frame, but it knows nothing about the integer64 class, so it cannot convert int64 to int correctly there.

So after loading bit64, as.integer will actually call as.integer.integer64 on all int64 vectors, but not on a data.frame or tibble.
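As a minimal sketch of the point above (assuming bit64 is installed; the names df, payload, col_int and is_i64 are made up for illustration), converting the column rather than the whole data frame lets the integer64 method run. Stripping the class also shows why dput prints a tiny double: bit64 keeps the 64 integer bits inside a double, so the integer 1 reads back as a denormal number when treated as a plain double.

```r
library(bit64)

# A one-column data frame holding an integer64 value, as in the question.
df <- data.frame(x = as.integer64(1))

# integer64 stores its 64 bits in a double, so removing the class exposes
# the same bits reinterpreted as a double (a tiny positive number).
payload <- unclass(df$x)

# Calling as.integer on the column itself dispatches to as.integer.integer64.
col_int <- as.integer(df$x)

# Convert every integer64 column of the data frame, one column at a time.
is_i64 <- vapply(df, is.integer64, logical(1))
df[is_i64] <- lapply(df[is_i64], as.integer)
```

With dplyr, the same idea would be to convert inside mutate(), for example mutate(x = as.integer(x)), instead of piping the whole tibble into as.integer().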

Upvotes: 1
