song xs

Reputation: 81

Is floating point math deterministic for all Intel/AMD CPUs?

Suppose I have already compiled a binary that does some floating-point calculation and outputs the result. If I provide the same input on different executions, can I assume that the result will be exactly the same (bit-identical)? Does the binary always produce a deterministic result for every instruction (ADDPS, FMADD, ... or other SSE/AVX floating-point instructions) on all kinds of x86_64 CPUs? If not, is there an example of an instruction or architecture where it does not?

Upvotes: 1

Views: 144

Answers (2)

wim

Reputation: 3998

It depends on your binary executable.

A software developer and/or compiler may choose to use different code paths, depending on the instruction set support of the actual CPU and/or OS (runtime CPU dispatching). x86-64 only mandates SSE and SSE2 support. Modern CPUs may support instruction sets such as AVX2/FMA and AVX-512. These instruction sets may help to improve the performance and/or the accuracy of floating point operations. But, for example, the result of computing a*b+c with a single vfmadd132ss instruction is not necessarily bit-identical with the result of separate mul and add instructions (vmulss and vaddss), because the fused instruction rounds once while the separate pair rounds twice. Note that library calls may also cause (unexpected) runtime CPU dispatching.
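To make the FMA point concrete, here is a minimal sketch in Python rather than assembly. It does not execute vfmadd132ss; it only emulates the single rounding of a fused multiply-add in double precision by computing a*b+c exactly with rationals and rounding once. The helper names `separate_mul_add` and `fused_multiply_add` and the input values are my own choices for illustration.

```python
from fractions import Fraction

def separate_mul_add(a, b, c):
    # Two roundings: the product is rounded to a double,
    # then the sum is rounded again.
    return a * b + c

def fused_multiply_add(a, b, c):
    # Emulated FMA (hypothetical helper, not a hardware instruction):
    # compute a*b+c exactly with rationals, then round once
    # when converting the exact value back to a double.
    return float(Fraction(a) * Fraction(b) + Fraction(c))

# Inputs chosen so that the exact product 1 - 2**-60 is not representable
# as a double and rounds to 1.0.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

separate = separate_mul_add(a, b, c)   # product rounds to 1.0, sum is 0.0
fused = fused_multiply_add(a, b, c)    # exact result -2**-60 survives

print(separate.hex(), fused.hex())
```

With these inputs the two-step version returns 0.0 while the single-rounding version returns -2**-60, so a binary that takes the FMA path on one CPU and the mul+add path on another can print different results for the same input.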

Moreover, approximation instructions such as the approximate reciprocal square root vrsqrtss are not bit-identical across AMD and Intel processors.

The basic floating point instructions, such as add, sub, mul, div, fma and sqrt, are deterministic: IEEE 754 requires their results to be correctly rounded. With an identical code path but different processors, the outcome should be identical if only these instructions are executed.
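If you want to check this on your own machines, compare bit patterns rather than printed decimals or `==`. Below is a small sketch, assuming the results are IEEE 754 doubles; `float_bits` and `bit_identical` are hypothetical helper names, not part of any library.

```python
import struct

def float_bits(x):
    # Reinterpret the 8 bytes of an IEEE 754 double
    # as an unsigned 64-bit integer.
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def bit_identical(x, y):
    # True only if both doubles have exactly the same bit pattern.
    return float_bits(x) == float_bits(y)

result = 0.1 + 0.2
# Log this hex value on each machine and diff the logs.
print(hex(float_bits(result)))

# == is not a bit-level comparison: +0.0 and -0.0 compare equal
# even though their sign bits differ.
print(0.0 == -0.0, bit_identical(0.0, -0.0))
```

Logging the hex pattern of each result on every target CPU and diffing the logs will show any divergence, including differences in the sign of zero that an ordinary comparison would hide.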

Upvotes: 3

JM Arnold

Reputation: 69

[One more attempt...]

In addition to @wim's answer above:

Another reference, from 2016, is my report comparing the rsqrt and rcp instructions on Intel and AMD processors: https://github.com/jeff-arnold/math_routines/blob/main/rsqrt_rcp/docs/rsqrt_rcp.pdf. It shows that the rsqrt and rcp instructions may give different results for the same arguments on Intel and AMD processors, and that these differences may affect the result of an application. It deduces the underlying mechanisms of these instructions and shows how they differ on those two processors.

See also https://members.loria.fr/PZimmermann/papers/accuracy.pdf which is a (continuing) study of the accuracy of various implementations of math library functions. The last paragraph of the introduction is relevant to the original question, explaining that a given library run on different hardware may give different results because of runtime dispatching (i.e., different code paths executed based on the underlying hardware) and, for some particular instructions (e.g., rsqrt and rcp), their execution on different hardware may give different results.

Upvotes: 2
