Anycorn
Anycorn

Reputation: 51475

converge to zero via underflow

please ignore this post, I misread algorithm, so the question is not relevant. However, I cannot close post anymore. Please vote to close

I have been using certain algorithm from numerical recipes, which converges to zero via underflow:

// all types are the same floating type
sum = 0
for (i in 0,N)
   sum += abs(V[i]);

my question, how does it happen? how does sum of small positive floating-point numbers converge to underflow/zero?

is there some condition where 0 + f = 0 , f > 0?

algorithm in question is Jacoby, http://www.mpi-hd.mpg.de/astrophysik/HEA/internal/Numerical_Recipes/f11-1.pdf, page 460. It is quite possible I misunderstand how the convergence is achieved, if so, please correct me.

thank you

Upvotes: 3

Views: 233

Answers (3)

MSN
MSN

Reputation: 54614

If V is an array of doubles and sum is a float (or single), you can certainly have values that are > 0 but when added to sum produces 0 if they are smaller than the smallest non-zero denormalized value representable in a float.

How do you know sum is actually zero and not just really really close? Are all bits set to zero?

EDIT: after reading the actual application, the underflow to zero remark is probably referring to repeated rotations around various axes to determine the eigenvalues and eigenvectors of a matrix. In that case, the algorithm only works if you can assume that repeated multiplications of very small numbers will clamp or underflow to zero. The actual sum won't underflow itself, however.

Upvotes: 2

cape1232
cape1232

Reputation: 1009

What are the types you are using? If f is a float and d1 and d2 are doubles, you get this.

double d1 = std::numeric_limits<double>::min();
double d2 = std::numeric_limits<double>::min();
float f = d1 + d2;
if (f == 0.0) std::cout << "yes";
else std::cout << "no";

This produces "yes".

Upvotes: 1

Roland Illig
Roland Illig

Reputation: 41625

I would be very surprised if that was possible using IEEE 754 arithmetics. The point is that IEEE 754 states that intermediate results are infinitely precise and then rounded to the destination data type.

So if you have sum + V[i], that value will always be greater or equal to sum. Rounding down to the next representable number will either produce sum or a number greater than sum.

Of course, there is nothing in the original question that prevents sum from being negative in the first place. In that case the answer would be trivial.

In IEEE 754 arithmetics, there is no number f such that 0 + f = 0 and at the same time f > 0.

Upvotes: 0

Related Questions