When using the bit64 package, summing a vector of NAs together with another vector of integers yields an inaccurate result. Depending on whether the NA vector is passed first or last, the result is either twice the correct answer or 0, respectively.
Notice that converting the NA vector away from integer64 removes the issue.
However, when experimenting with other small values in place of y, the results were just as strange. For example, with x summing to 40 and the other argument being 35 (correct total 75):
40 + 35 = 70 but
35 + 40 = 80
Any thoughts as to what is going on?
EXAMPLE:
library(bit64)
x <- as.integer64(c(20, 20))
y <- as.integer64(c(NA, NA))
sum(y, x, na.rm=TRUE)
# integer64
# [1] 80 # <~~~ Twice the correct value
sum(x, y, na.rm=TRUE)
# integer64
# [1] 0 # <~~~~ Incorrect 0. Should be 40.
## Removing the NAs does not help.
y <- y[!is.na(y)]
## A vector of 0's gives the same issue
y <- as.integer64(c(0, 0))
## Same results
sum(y, x, na.rm=TRUE)
# integer64
# [1] 80
sum(x, y, na.rm=TRUE)
# integer64
# [1] 0
## Converting to numeric does away with the issue (but is not a viable workaround, for obvious reasons)
y <- as.numeric(y)
sum(y, x, na.rm=TRUE)
# [1] 1.97626e-322
sum.integer64(y, x, na.rm=TRUE)
# integer64
# [1] 40
sum(x, y, na.rm=TRUE)
# integer64
# [1] 40
Give y a single non-NA value, and the results are also very out of place:
y <- as.integer64(c(35, NA, NA))
sum.integer64(x, if (!all(is.na(y))) removeNA(y), na.rm=TRUE)
sum.integer64(x, y[[1]], na.rm=TRUE)
sum.integer64(y[[1]], x, na.rm=TRUE)
## No NA's present
sum.integer64(as.integer64(35), x)
# integer64
# [1] 80
sum.integer64(x, as.integer64(35))
# integer64
# [1] 70
Not an answer, but an exploration. Hope it might help you.
Here is the sum.integer64 function from the bit64 package:
function (..., na.rm = FALSE)
{
l <- list(...)
ret <- double(1)
if (length(l) == 1) {
.Call("sum_integer64", l[[1]], na.rm, ret)
oldClass(ret) <- "integer64"
ret
}
else {
ret <- sapply(l, function(e) {
if (is.integer64(e)) {
.Call("sum_integer64", e, na.rm, ret)
ret
}
else {
as.integer64(sum(e, na.rm = na.rm))
}
})
oldClass(ret) <- "integer64"
sum(ret, na.rm = na.rm)
}
}
Here is your example:
library(bit64)
x <- as.integer64(c(20, 20))
y <- as.integer64(c(NA, NA))
na.rm <- TRUE
l <- list(y, x)
ret <- double(1)
ret
#[1] 0
# We use the sapply function as in the function:
ret <- sapply(l, function(e) { .Call("sum_integer64", e, na.rm, ret) })
oldClass(ret) <- "integer64"
ret
#integer64
#[1] 40 40 <-- twice the value "40"
sum(ret, na.rm = na.rm)
# integer64
#[1] 80 <-- twice the expected value, as you said
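The duplicated 40 would be consistent with the C routine writing its result into ret in place, so that every element collected by sapply refers to the same storage and shows whatever was written last. That is an assumption about the package internals, not something verified here. A minimal base R sketch of that pattern, using an environment (the names buf and sum_into_buf are hypothetical) as a stand-in for the shared buffer:

```r
# Stand-in for the shared result buffer. An environment has reference
# semantics, so every handle to it sees the most recent write.
buf <- new.env()
buf$value <- 0

# Hypothetical routine: overwrites the shared buffer and returns the
# buffer itself rather than a copy of the number.
sum_into_buf <- function(e) {
  buf$value <- sum(e, na.rm = TRUE)
  buf
}

inputs <- list(c(NA_real_, NA_real_), c(20, 20))

# Collect the handles first and read them afterwards:
# both report the last value written.
handles <- lapply(inputs, sum_into_buf)
aliased <- vapply(handles, function(h) h$value, numeric(1))
aliased
# [1] 40 40

# Copy the number out right after each call: each result is preserved.
copied <- vapply(inputs, function(e) sum_into_buf(e)$value, numeric(1))
copied
# [1]  0 40
```

Under the same assumption, reversing the order of inputs would give 0 0, which matches the 0 seen for sum(x, y, na.rm=TRUE).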
Here we decompose the calculation, for each vector:
ret <- double(1)
ret2 <- NULL
ret2[1] <- .Call("sum_integer64", y, na.rm, ret)
ret2[2] <- .Call("sum_integer64", x, na.rm, ret)
oldClass(ret2) <- "integer64"
ret2
#integer64
#[1] 0 40 <-- the value "40" only once, and "0" because of the NAs
sum(ret2, na.rm = na.rm)
#integer64
#[1] 40 <- expected value
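Since each vector sums correctly when handled on its own, one workaround worth trying is to combine all arguments into a single integer64 vector before summing, so that only the single-argument branch of sum.integer64 is used. This is a sketch, not a confirmed fix, and sum64 is a hypothetical helper name:

```r
library(bit64)

x <- as.integer64(c(20, 20))
y <- as.integer64(c(NA, NA))

# Hypothetical helper: coerce every argument to integer64, concatenate
# them with c(), and sum the one combined vector.
sum64 <- function(..., na.rm = FALSE) {
  parts <- lapply(list(...), as.integer64)
  sum(do.call(c, parts), na.rm = na.rm)
}

res_yx <- sum64(y, x, na.rm = TRUE)
res_xy <- sum64(x, y, na.rm = TRUE)
res_yx
res_xy
```

Coercing each argument first also covers the case where a plain numeric vector is passed ahead of an integer64 one, which would otherwise dispatch to the base c() and drop the class.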