Reputation: 1594
I am trying to deal with percentages in R and I am running into a strange issue. When I convert the values of a vector to fractions of the vector's sum,
they often do not add up to exactly one.
Minimal example:
data <- rnorm(1000)*100
max <- 50
# chunk i covers elements i*max+1 .. (i+1)*max
unlist(lapply(0:(1000/max-1),
  function(i)
    sum(
      data[(i*max+1):((i+1)*max)]
      /
      sum(data[(i*max+1):((i+1)*max)])
    )
))-1
It should give a vector of zeros; however, I am getting this:
[1] 0.000000e+00 0.000000e+00 -1.110223e-16 -1.110223e-16 0.000000e+00 -1.110223e-16 0.000000e+00 0.000000e+00 0.000000e+00
[10] 0.000000e+00 0.000000e+00 2.220446e-16 0.000000e+00 -4.440892e-16 0.000000e+00 0.000000e+00 0.000000e+00 4.440892e-16
[19] -1.110223e-16 0.000000e+00
Any idea for a remedy?
Upvotes: 1
Views: 382
Reputation: 263362
They are off by an insignificant amount. These differences are inherent in floating point arithmetic; if you want to change how they are displayed, you can use the format function or one of its cousins like sprintf or formatC. This is really an instance of FAQ 7.31. If you do want help with formatting, you should describe a particular application. If you want to coerce the values so that you see zeroes, you can also use round():
# chunk i covers elements i*max+1 .. (i+1)*max
round( unlist(lapply(0:(1000/max-1),
  function(i)
    sum(
      data[(i*max+1):((i+1)*max)]
      /
      sum(data[(i*max+1):((i+1)*max)])
    )
))-1 , digits=4)
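The display-only route mentioned above (format, sprintf, formatC) could look roughly like the following. This is a minimal sketch, not a recommendation for any particular application: the seed, the vector `x`, and the name `dev` are assumptions made up for illustration, and four decimals is an arbitrary choice.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
x <- rnorm(50) * 100              # illustrative data, same shape as one chunk
dev <- sum(x / sum(x)) - 1        # zero or a tiny rounding error

# Each call returns a character string; the stored double is unchanged.
sprintf("%.4f", dev)              # may show a leading minus for tiny negatives
formatC(dev, format = "f", digits = 4)
format(round(dev, 4), nsmall = 4)
```

Formatting only affects what is printed, whereas round() changes the value that later computations see.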
Upvotes: 4
Reputation: 174803
A more important question is why do you think these should be exactly 0?
You are using floating point arithmetic and not all numbers can be represented exactly in your computer. This is covered by (or related to) R FAQ 7.31, which explains the phenomenon.
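A small illustration of the phenomenon, as a sketch that assumes standard IEEE 754 double precision (which R uses on common platforms). The variable names are only for illustration.

```r
# Decimal fractions such as 0.1 have no exact binary representation,
# so exact equality tests on computed doubles can fail.
x <- 0.1 + 0.2
x == 0.3                      # FALSE on IEEE 754 doubles
isTRUE(all.equal(x, 0.3))     # TRUE: equal within a tolerance

# Machine epsilon: the gap between 1 and the next larger double.
eps <- .Machine$double.eps    # 2.220446e-16 on typical platforms

# The deviations shown in the question are small multiples of eps / 2.
abs(-1.110223e-16) / eps      # approximately 0.5
```

In other words, the reported values differ from zero only in the last bit or two of the result.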
You can either ignore it (for all intents and purposes, these values are 0):
> all.equal(tmp, rep(0, length(tmp))) ## tmp contains your numbers
[1] TRUE
or learn to deal with it accordingly for your particular operation. One way is to just round them to some extent:
> round(tmp, 2)
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> round(tmp, 3)
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> round(tmp, 4)
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> round(tmp, 5)
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
But it does depend on what you want to do with these numbers.
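If the goal is to test whether proportions add up to one, comparing within a tolerance is usually safer than testing for exact equality. The sketch below uses a hypothetical helper, `sums_to_one`, which is not part of base R; the tolerance of 1e-8 and the sample data are assumptions chosen for illustration.

```r
# Hypothetical helper: TRUE when the proportions sum to 1 within `tol`.
sums_to_one <- function(p, tol = 1e-8) {
  abs(sum(p) - 1) < tol
}

set.seed(42)                      # arbitrary seed for reproducibility
x <- runif(50)                    # positive values, so sum(x) is well away from 0
p <- x / sum(x)

sums_to_one(p)                    # TRUE even if sum(p) == 1 is FALSE
isTRUE(all.equal(sum(p), 1))      # base R equivalent, default tolerance about 1.5e-8
```

Wrapping all.equal() in isTRUE() matters because all.equal() returns a character description rather than FALSE when the values differ.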
Upvotes: 4