Reputation: 2165
I have compiled the same Fortran libraries and code in both aarch64 and x86_64. It is a model that runs algorithms across n-dimensional arrays / matrices. The ARM CPU is the Amazon Graviton2. AMD & Intel options in AWS produce identical results when the code is compiled and run for x86_64.
I'm using gcc / g++ / gfortran / mpich (all version 8.3.0, from Debian Buster's main repos) with the following flags:
-O2 -ftree-vectorize -funroll-loops -w -ffree-form -ffree-line-length-none -fconvert=big-endian -frecord-marker=4
It all compiles and runs fine; however, I notice that the results in the model output differ very slightly. It seems to be a matter of precision or rounding, as most values are the same between outputs. However, there are (seemingly) random values throughout the output where it looks like the code compiled for one architecture rounded down or truncated while the other rounded up.
The output is stored as NetCDF (using NetCDF-Fortran version 4.5.3) and the md5sum of the files is the same across x86_64 CPUs but differs on aarch64.
Any ideas of why this might be happening? Or any flags I can use during compilation to ensure that I get identical results across architectures?
The values I'm looking at have a precision of 5 decimal places, e.g. 123.12345.
Here is a snippet from a diff of the output where you can see that most values are identical, but a few seem to have been rounded differently (I've marked the differing values with **):
657c657
< 18.83633, 18.83212, 18.82778, **18.82337**, 18.81886, 18.81425, 18.80956,
---
> 18.83633, 18.83212, 18.82778, **18.82336**, 18.81886, 18.81425, 18.80956,
1151c1151
< 17.35448, 17.37331, 17.39206, 17.41071, 17.42931, **17.4478**, 17.46622,
---
> 17.35448, 17.37331, 17.39206, 17.41071, 17.42931, **17.44779**, 17.46622,
1711c1711
< 19.77562, 19.77532, 19.77493, 19.77445, 19.77386, 19.77319, **19.77241**,
---
> 19.77562, 19.77532, 19.77493, 19.77445, 19.77386, 19.77319, **19.77242**,
2130c2130
< 20.06532, 20.06839, **20.07135**, 20.07423, 20.07702, 20.0797, 20.0823,
---
> 20.06532, 20.06839, **20.07136**, 20.07423, 20.07702, 20.0797, 20.0823,
2140c2140
< 20.04788, 20.04424, 20.04047, **20.03661**, 20.03268, 20.02863, 20.02448,
---
> 20.04788, 20.04424, 20.04047, **20.03662**, 20.03268, 20.02863, 20.02448,
2600c2600
< 11.54104, 11.57732, 11.61352, 11.6497, 11.68579, **11.72186**, 11.75784,
---
> 11.54104, 11.57732, 11.61352, 11.6497, 11.68579, **11.72185**, 11.75784,
Upvotes: 4
Views: 3929
Reputation: 6105
If the code only uses basic arithmetic operations such as +, -, *, / and sqrt, and the compiler is in IEEE 754 conformance mode, the output should be bit-identical regardless of the CPU used. This conformance mode is usually the default setting.
Otherwise the issue is probably caused by a compiler or CPU bug.
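One way to verify this on both machines is to compare raw bit patterns instead of rounded decimal output. A minimal sketch (a standalone test program, not part of your model) using Fortran's transfer intrinsic:

program bit_compare
  implicit none
  real(kind=8) :: x
  ! Basic operations and sqrt are correctly rounded under IEEE 754,
  ! so this bit pattern should match across architectures.
  x = sqrt(2.0d0) / 3.0d0
  write(*,'(A,Z16.16)')  'bits  = ', transfer(x, 1_8)
  write(*,'(A,ES24.17)') 'value = ', x
end program bit_compare

If the hex output already differs for something this simple, the compiler is not in conformance mode (or there is a genuine bug); if it matches, the divergence comes from somewhere else, such as the math library.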
Options such as -ffast-math put the compiler in non-IEEE 754 conformance mode. The compiler then uses mathematical equivalence rules to optimize the code, and these rewrites are not necessarily numerically equivalent (e.g., ((a*a)*a)*a -> (a*a)*(a*a) and the like). If this is the case, and the compiler optimizes the ARM code differently from the x86_64 code, that may be the explanation.
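To see why such re-association matters, here is a minimal sketch. It uses a sum rather than the product above, because the effect is then guaranteed to show: the two groupings perform the same mathematical operations yet round to different doubles.

program reassoc
  implicit none
  real(kind=8) :: eps, left, right
  eps   = 1.0d-16
  left  = (1.0d0 + eps) + eps   ! each addition rounds back down to 1.0
  right = 1.0d0 + (eps + eps)   ! 2e-16 now rounds up to the next double
  print *, left == right        ! prints F
  write(*,'(2ES24.17)') left, right
end program reassoc

Without -ffast-math the compiler must respect the parentheses, so both architectures agree on each result; with it, either grouping may be chosen on either architecture.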
Also, if the code uses functions such as sin, cos, exp, atan2 and the like, the output will only be bit-identical if the exact same run-time library is used. This is because these functions are not correctly rounded, and their results typically carry a tiny error (which may be amplified in the calculation and show up in the way you observe it).
It might also be that x86_64 uses special CPU instructions for these functions while ARM uses a software implementation, or vice versa. Note that even when such functions are implemented in the CPU/FPU, they are still not correctly rounded, and very likely different algorithms are used.
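A quick way to check whether the math library is the culprit is to print the bit patterns of a few library calls on both machines and diff the output (a minimal sketch; the input values are arbitrary):

program libm_probe
  implicit none
  real(kind=8) :: x
  x = 0.7d0
  ! These results come from the run-time math library (or CPU instructions)
  ! and are not guaranteed to be correctly rounded, so the bits may differ
  ! between libm versions and architectures.
  write(*,'(A,Z16.16)') 'sin   bits: ', transfer(sin(x), 1_8)
  write(*,'(A,Z16.16)') 'exp   bits: ', transfer(exp(x), 1_8)
  write(*,'(A,Z16.16)') 'atan2 bits: ', transfer(atan2(x, 1.3d0), 1_8)
end program libm_probe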
TL;DR: check the compiler flags for -ffast-math, or try adding -fno-fast-math at the end of the options.
EDIT: As @Rob mentioned in the comments, another flag that could be added is -ffp-contract=off. In gcc it defaults to 'fast' (independently of -ffast-math), which may generate FMA instructions even when not explicitly requested. This also breaks IEEE 754 conformance.
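For example, the flag set from the question with both options appended (in gcc, later options override earlier ones, so adding them at the end is sufficient):
-O2 -ftree-vectorize -funroll-loops -w -ffree-form -ffree-line-length-none -fconvert=big-endian -frecord-marker=4 -fno-fast-math -ffp-contract=off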
Upvotes: 4