jcox

Reputation: 959

.net float rounding errors at compile-time vs runtime

I was recently setting up some data for a test case checking rounding errors on the float data type, and ran into some unexpected results. I expected that cases t2 and t3 would produce the same result as t1, but that is not the case on my machine. Can anyone tell me why?

I suspect the reason for the difference is that t2 and t3 are evaluated at compile time, but I'm surprised that the compiler completely ignores my attempts to force it to use an intermediate float data type during evaluation. Is there some part of the C# standard that mandates evaluating constants with the largest available data type, regardless of the one specified?

This is on a Win7 64-bit Intel machine running .NET 4.5.2.

  float temp_t1 = 1/(3.0f);
  double t1 = (double)temp_t1;

  const float temp_t2 = 1/(3.0f);
  double t2 = (double)temp_t2;

  double t3 = (double)(float)(1/(3.0f));

  System.Console.WriteLine( t1 ); //prints 0.333333343267441
  System.Console.WriteLine( t2 ); //prints 0.333333333333333
  System.Console.WriteLine( t3 ); //prints 0.333333333333333

Upvotes: 1

Views: 273

Answers (2)

Mike Zboray

Reputation: 40818

People often have questions about the consistency of floating point calculations. There are almost no guarantees given by the .NET Framework on this point. To quote Eric Lippert:

The C# compiler, the jitter and the runtime all have broad latitude to give you more accurate results than are required by the specification, at any time, at a whim -- they are not required to choose to do so consistently and in fact they do not.

In this particular case, the answer is straightforward. The raw IL for a release build:

  IL_0000: ldc.r4 0.333333343
  IL_0005: conv.r8
  IL_0006: ldc.r8 0.33333333333333331
  IL_000f: stloc.0
  IL_0010: ldc.r8 0.33333333333333331
  IL_0019: stloc.1
  IL_001a: call void [mscorlib]System.Console::WriteLine(float64)
  IL_001f: ldloc.0
  IL_0020: call void [mscorlib]System.Console::WriteLine(float64)
  IL_0025: ldloc.1
  IL_0026: call void [mscorlib]System.Console::WriteLine(float64)
  IL_002b: ret

All arithmetic here is done by the compiler. In the Roslyn compiler, the fact that temp_t1 is a variable causes the compiler to emit IL that loads a 4-byte float and then converts it to a double. I believe this is consistent with previous versions. In the other two cases, the compiler does all the arithmetic at double precision and stores those results. It is not surprising that the second and third cases don't differ, because the compiler does not retain local constants in the IL.
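
If you want all three values to match t1, the single-precision division has to survive to runtime so the JIT performs it, rather than being folded by the compiler. A minimal sketch of one way to do that (the helper name is illustrative, not part of the original code):

  static float Third(float denominator)
  {
      // 'denominator' is only known at runtime, so the compiler cannot
      // constant-fold this division; it is performed in single precision.
      return 1 / denominator;
  }

  double t4 = (double)Third(3.0f);
  System.Console.WriteLine( t4 ); // expected to print 0.333333343267441, like t1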

Upvotes: 2

Frank Hileman

Reputation: 1239

C# floating point behavior is based on the underlying CPU, using IEEE 754 formats. If you really want to see what is going on, you need to look at the numbers in their binary form, by converting them to bytes. When you print them, they are converted from base 2 to base 10, and a lot of processing happens along the way.
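
A minimal sketch of that kind of inspection, using BitConverter to dump the raw IEEE 754 bytes (GetBytes returns them in the machine's byte order, little-endian on x86/x64, so the bytes below read back-to-front):

  float f = 1/(3.0f);
  double d = 1/(3.0);

  // Print the raw IEEE 754 encodings as hex bytes.
  System.Console.WriteLine( System.BitConverter.ToString( System.BitConverter.GetBytes(f) ) );
  // AB-AA-AA-3E  (0x3EAAAAAB, the single-precision 1/3)
  System.Console.WriteLine( System.BitConverter.ToString( System.BitConverter.GetBytes(d) ) );
  // 55-55-55-55-55-55-D5-3F  (0x3FD5555555555555, the double-precision 1/3)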

Here is what I suspect is happening. Your first computation (temp_t1) uses single-precision floating point, with 23 bits for the mantissa. I suspect, but did not confirm, that temp_t2 and t2 were transformed by an optimization component in the compiler, such that temp_t2 was not computed in single precision but rather in double precision, and t2 picked up that value.

More information regarding floating point behavior: https://msdn.microsoft.com/en-us/library/aa691146(v=vs.71).aspx

Upvotes: -1
