mortal

Reputation: 235

How to parse signed zero?

Is it possible to parse signed zero? I tried several approaches, but none of them gives the proper result:

float test1 = Convert.ToSingle("-0.0");
float test2 = float.Parse("-0.0");
float test3;
float.TryParse("-0.0", out test3);

If I initialize the value directly, it works fine:

float test4 = -0.0f;

So the problem seems to be in the parsing procedures of C#. I hope somebody can tell me whether there is some option or workaround for that.

The difference can only be seen by converting to binary:

var bin = BitConverter.GetBytes(test4);
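
For anyone who would rather not read the byte array by eye, here is a minimal sketch of a sign-bit check. The class and method names (`SignedZeroInspection`, `HasSignBit`, `IsNegativeZero`) are made up for illustration; only `BitConverter.GetBytes` and `BitConverter.ToInt32` are framework calls.

```csharp
using System;

public static class SignedZeroInspection
{
    // Hypothetical helper: reinterprets the 4 bytes of the float as an int.
    // The sign bit is the top bit, so the int is negative exactly when it is set.
    public static bool HasSignBit(float value)
    {
        int bits = BitConverter.ToInt32(BitConverter.GetBytes(value), 0);
        return bits < 0;
    }

    // Hypothetical helper: true only for -0.0f, which compares equal to zero
    // but still carries the sign bit.
    public static bool IsNegativeZero(float value)
    {
        return value == 0f && HasSignBit(value);
    }
}
```

With this, `SignedZeroInspection.IsNegativeZero(test4)` is true for the directly initialized value, and the same call can be applied to `test1`, `test2` and `test3` to see whether the parser kept the sign.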

Upvotes: 20

Views: 3309

Answers (3)

TheGeneral

Reputation: 81523

Updated Results

Summary

Mode            : Release
Test Framework  : .NET Framework 4.7.1
Benchmarks runs : 100 times (averaged/scale)

Tests limited to 10 digits
Name            |      Time |    Range | StdDev |      Cycles | Pass
-----------------------------------------------------------------------
Mine Unchecked  |  9.645 ms | 0.259 ms |   0.30 |  32,815,064 | Yes
Mine Unchecked2 | 10.863 ms | 1.337 ms |   0.35 |  36,959,457 | Yes
Mine Safe       | 11.908 ms | 0.993 ms |   0.53 |  40,541,885 | Yes
float.Parse     | 26.973 ms | 0.525 ms |   1.40 |  91,755,742 | Yes
Evk             | 31.513 ms | 1.515 ms |   7.96 | 103,288,681 | Base


Tests limited to 38 digits
Name            |      Time |    Range | StdDev |      Cycles | Pass
-----------------------------------------------------------------------
Mine Unchecked  | 17.694 ms | 0.276 ms |   0.50 |  60,178,511 | No
Mine Unchecked2 | 23.980 ms | 0.417 ms |   0.34 |  81,641,998 | Yes
Mine Safe       | 25.078 ms | 0.124 ms |   0.63 |  85,306,389 | Yes
float.Parse     | 36.985 ms | 0.052 ms |   1.60 | 125,929,286 | Yes
Evk             | 39.159 ms | 0.406 ms |   3.26 | 133,043,100 | Base


Tests limited to 98 digits (way over the range of a float)
Name            |      Time |    Range | StdDev |      Cycles | Pass
-----------------------------------------------------------------------
Mine Unchecked2 | 46.780 ms | 0.580 ms |   0.57 | 159,272,055 | Yes
Mine Safe       | 48.048 ms | 0.566 ms |   0.63 | 163,601,133 | Yes
Mine Unchecked  | 48.528 ms | 1.056 ms |   0.58 | 165,238,857 | No
float.Parse     | 55.935 ms | 1.461 ms |   0.95 | 190,456,039 | Yes
Evk             | 56.636 ms | 0.429 ms |   1.75 | 192,531,045 | Base

As the tables show, Mine Unchecked is fine for shorter numbers. However, because it uses a power of 10 at the end of the calculation to scale the fractional part, it doesn't work for longer digit combinations. Since it only ever needs powers of 10, the power lookup is just a big switch statement, which makes it marginally faster.

Background

Because of the various comments I got, and the work I put into this, I thought I’d rewrite this post with the most accurate benchmarks I could get, and all the logic behind them.

When this question first came up, I had already written my own benchmark framework, and I often like writing a quick parser for these things. Using unsafe code, 9 times out of 10 I can get this stuff faster than the framework equivalent.

At first this was easy: just write some simple logic to parse numbers with decimal places, and I did pretty well. However, the initial results weren’t as accurate as they could have been, because my test data used only the ‘f’ format specifier, which turns higher-precision numbers into short formats containing only 0’s.

In the end I just couldn’t write a reliable parser to deal with exponent notation, i.e. 1.2324234233E+23. The only way I could get the maths to work was using BigInteger and lots of hacks to force the right precision into a floating-point value. This turned out to be super slow. I even went to the IEEE float specs and tried to do the maths to construct the value bit by bit. This wasn’t that hard, but the formula has loops in it and was complicated to get right. In the end I had to give up on exponent notation.

So this is what I ended up with.

My testing framework runs on a list of 10000 floats as strings, which is generated for each test run and shared across the tests. A test run just goes through each test (remember, it’s the same data for each test), adds up the results, then averages them. This is about as good as it can get. I can increase the runs to 1000 or more, but the results don’t really change. Because we are testing a method that takes basically one variable (a string representation of a float), there is no point scaling this, as it's not set based. However, I can tweak the input to cater for different lengths of floats, i.e. strings that are 10, 20, right up to 98 digits long. Remember that a float's exponent only goes up to 38 anyway.
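
The framework itself is not shown in this post. As a rough sketch of the idea only (same input for every candidate, timed over several runs, then averaged), something like the following would do. `MiniBenchmark` and `AverageMilliseconds` are hypothetical names, and `Stopwatch` is an assumption about how the timing could be taken, not a description of the actual framework.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class MiniBenchmark
{
    // Hypothetical harness: parses every input once per run and returns
    // the mean wall-clock time of a run in milliseconds.
    public static double AverageMilliseconds(
        IList<string> inputs, Func<string, float> parse, int runs)
    {
        double total = 0;
        float sink = 0; // accumulate results so the calls are not optimized away

        for (int run = 0; run < runs; run++)
        {
            var watch = Stopwatch.StartNew();
            for (int i = 0; i < inputs.Count; i++)
                sink += parse(inputs[i]);
            watch.Stop();
            total += watch.Elapsed.TotalMilliseconds;
        }

        GC.KeepAlive(sink);
        return total / runs;
    }
}
```

Each parser under test would be passed in as the `parse` delegate together with the same `inputs` list, so the numbers are comparable across candidates.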

To check the results I used the following. I have previously written a unit test that covers every conceivable float, and all versions pass, except for the variation where I use powers to calculate the decimal part of the number.

Note that my framework only tests one result set, and this check is not part of the framework.

private bool Action(List<float> floats, List<float> list)
{
   if (floats.Count != list.Count)
      return false; // sanity check

   for (int i = 0; i < list.Count; i++)
   {
      // NaN is a special case, as there is more than one possible bit value
      // for it
      if (floats[i] != list[i] && !float.IsNaN(floats[i]) && !float.IsNaN(list[i]))
         return false;
   }

   return true;
}
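
One caveat with a `!=` comparison: under IEEE 754 rules, -0.0f and 0.0f compare as equal, so a check like the one above would not notice a parser that drops the sign of zero. A minimal sketch of a stricter, bit-level comparison follows. `FloatVerification`, `SameFloat` and `AllSame` are hypothetical names, and unlike the check above this version requires both values to be NaN before treating them as a match.

```csharp
using System;
using System.Collections.Generic;

public static class FloatVerification
{
    // Hypothetical helper: NaN matches NaN regardless of payload;
    // everything else must match bit for bit, so -0.0f differs from 0.0f.
    public static bool SameFloat(float expected, float actual)
    {
        if (float.IsNaN(expected) || float.IsNaN(actual))
            return float.IsNaN(expected) && float.IsNaN(actual);

        return BitConverter.ToInt32(BitConverter.GetBytes(expected), 0)
            == BitConverter.ToInt32(BitConverter.GetBytes(actual), 0);
    }

    // Hypothetical helper: element-wise comparison of two result lists.
    public static bool AllSame(IList<float> expected, IList<float> actual)
    {
        if (expected.Count != actual.Count)
            return false; // sanity check

        for (int i = 0; i < expected.Count; i++)
            if (!SameFloat(expected[i], actual[i]))
                return false;

        return true;
    }
}
```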

In this case I'm testing against 3 types of input, as shown below.

Setup

// numberDecimalDigits specifies how long the output will be
private static NumberFormatInfo GetNumberFormatInfo(int numberDecimalDigits)
{
   return new NumberFormatInfo
               {
                  NumberDecimalSeparator = ".",
                  NumberDecimalDigits = numberDecimalDigits
               };
}

// generate a random float by creating an int and reinterpreting its bits as a float via pointers

private static unsafe string GetRadomFloatString(IFormatProvider formatInfo)
{
   var val = Rand.Next(0, int.MaxValue);
   if (Rand.Next(0, 2) == 1)
      val *= -1;
   var f = *(float*)&val;
   return f.ToString("f", formatInfo);
}

Test Data 1

// limits the output to 10 decimal digits
// because of that, it has to check for truncated values and
// regenerate them
public static List<string> GenerateInput10(int scale)
{
   var result = new List<string>(scale);
   while (result.Count < scale)
   {
      var val = GetRadomFloatString(GetNumberFormatInfo(10));
      if (val != "0.0000000000")
         result.Add(val);
   }

   result.Insert(0, (-0f).ToString("f", CultureInfo.InvariantCulture));
   result.Insert(0, "-0");
   result.Insert(0, "0.00");
   result.Insert(0, float.NegativeInfinity.ToString("f", CultureInfo.InvariantCulture));
   result.Insert(0, float.PositiveInfinity.ToString("f", CultureInfo.InvariantCulture));
   return result;
}

Test Data 2

// basically the max value for a float
public static List<string> GenerateInput38(int scale)
{

   var result = Enumerable.Range(1, scale)
                           .Select(x => GetRadomFloatString(GetNumberFormatInfo(38)))
                           .ToList();

   result.Insert(0, (-0f).ToString("f", CultureInfo.InvariantCulture));
   result.Insert(0, "-0");
   result.Insert(0, float.NegativeInfinity.ToString("f", CultureInfo.InvariantCulture));
   result.Insert(0, float.PositiveInfinity.ToString("f", CultureInfo.InvariantCulture));
   return result;
}

Test Data 3

// Let's take this to the limit
public static List<string> GenerateInput98(int scale)
{

   var result = Enumerable.Range(1, scale)
                           .Select(x => GetRadomFloatString(GetNumberFormatInfo(98)))
                           .ToList();

   result.Insert(0, (-0f).ToString("f", CultureInfo.InvariantCulture));
   result.Insert(0, "-0");
   result.Insert(0, float.NegativeInfinity.ToString("f", CultureInfo.InvariantCulture));
   result.Insert(0, float.PositiveInfinity.ToString("f", CultureInfo.InvariantCulture));
   return result;
}

These are the tests I used

Evk

private float ParseMyFloat(string value)
{
   var result = float.Parse(value, CultureInfo.InvariantCulture);
   if (result == 0f && value.TrimStart()
                              .StartsWith("-"))
   {
      result = -0f;
   }
   return result;
}

Mine Safe

I call it safe because it tries to check for invalid strings.

[MethodImpl(MethodImplOptions.AggressiveInlining)]
private unsafe float ParseMyFloat(string value)
{
   double result = 0, dec = 0;

   if (value[0] == 'N' && value == "NaN") return float.NaN;
   if (value[0] == 'I' && value == "Infinity") return float.PositiveInfinity;
   if (value[0] == '-' && value[1] == 'I' && value == "-Infinity") return float.NegativeInfinity;

   fixed (char* ptr = value)
   {
      char* l, e;
      char* start = ptr, length = ptr + value.Length;

      if (*ptr == '-') start++;

      // check the bound before dereferencing the pointer
      for (l = start; l < length && *l >= '0' && *l <= '9'; l++)
         result = result * 10 + *l - 48;


      if (*l == '.')
      {
         char* r;
         for (r = length - 1; r > l && *r >= '0' && *r <= '9'; r--)
            dec = (dec + (*r - 48)) / 10;

         if (l != r)
            throw new FormatException($"Invalid float : {value}");
      }
      else if (l != length)
         throw new FormatException($"Invalid float : {value}");

      result += dec;

      return *ptr == '-' ? (float)result * -1 : (float)result;
   }
}

Mine Unchecked

This fails for larger strings, but is OK for smaller ones.

[MethodImpl(MethodImplOptions.AggressiveInlining)]
private unsafe float ParseMyFloat(string value)
{
   if (value[0] == 'N' && value == "NaN") return float.NaN;
   if (value[0] == 'I' && value == "Infinity") return float.PositiveInfinity;
   if (value[0] == '-' && value[1] == 'I' && value == "-Infinity") return float.NegativeInfinity;

   fixed (char* ptr = value)
   {
      var point = 0;
      double result = 0, dec = 0;

      char* c, start = ptr, length = ptr + value.Length;

      if (*ptr == '-') start++;   

      for (c = start; c < length && *c != '.'; c++)
         result = result * 10 + *c - 48;

      if (*c == '.')
      {
         point = (int)(length - 1 - c);
         for (c++; c < length; c++)
            dec = dec * 10 + *c - 48;
      }

      // MyPow is just a massive switch statement
      if (dec > 0)
         result += dec / MyPow(point);

      return *ptr == '-' ? (float)result * -1 : (float)result;
   }
}

Mine Unchecked2

[MethodImpl(MethodImplOptions.AggressiveInlining)]
private unsafe float ParseMyFloat(string value)
{

   if (value[0] == 'N' && value == "NaN") return float.NaN;
   if (value[0] == 'I' && value == "Infinity") return float.PositiveInfinity;
   if (value[0] == '-' && value[1] == 'I' && value == "-Infinity") return float.NegativeInfinity;


   fixed (char* ptr = value)
   {
      double result = 0, dec = 0;

      char* c, start = ptr, length = ptr + value.Length;

      if (*ptr == '-') start++;

      for (c = start; c < length && *c != '.'; c++)
         result = result * 10 + *c - 48;     

      // this division seems unsafe for a double,
      // however I have tested it with every float and it works
      if (*c == '.')
         for (var d = length - 1; d > c; d--)
            dec = (dec + (*d - 48)) / 10;

      result += dec;

      return *ptr == '-' ? (float)result * -1 : (float)result;
   }
}

float.Parse

float.Parse(t, CultureInfo.InvariantCulture)

Original Answer

Assuming you don't need a TryParse method, I managed to use pointers and custom parsing to achieve what I think you want.

The benchmark uses a list of 1,000,000 random floats and runs each version 100 times; all versions use the same data.

Test Framework : .NET Framework 4.7.1

Scale : 1000000
Name             |        Time |     Delta |  Deviation |       Cycles
----------------------------------------------------------------------
Mine Unchecked2  |   45.585 ms |  1.283 ms |       1.70 |  155,051,452
Mine Unchecked   |   46.388 ms |  1.812 ms |       1.17 |  157,751,710
Mine Safe        |   46.694 ms |  2.651 ms |       1.07 |  158,697,413
float.Parse      |  173.229 ms |  4.795 ms |       5.41 |  589,297,449
Evk              |  287.931 ms |  7.447 ms |      11.96 |  979,598,364

Chopped for brevity

Note: none of these versions can deal with extended formats, NaN, +Infinity, or -Infinity. However, it wouldn't be hard to implement them with little overhead.

I have checked this pretty well, though I must admit I haven't written any unit tests, so use at your own risk.

Disclaimer: I think Evk's StartsWith version could probably be optimized further; however, it will still be (at best) slightly slower than float.Parse.

Upvotes: 9

Evk

Reputation: 101543

I think there is no way to force float.Parse (or Convert.ToSingle) to respect negative zero. That is just how it works (the sign is ignored in this case). So you have to check for it yourself, for example:

string target = "-0.0";            
float result = float.Parse(target, CultureInfo.InvariantCulture);
if (result == 0f && target.TrimStart().StartsWith("-"))
    result = -0f;
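
Since the question also tries float.TryParse, the same check can be wrapped around it. This is only a sketch: `SignedZeroParser` and `TryParseSigned` are hypothetical names, and the `NumberStyles` value is chosen on the assumption that it should match what float.Parse uses by default.

```csharp
using System;
using System.Globalization;

public static class SignedZeroParser
{
    // Hypothetical wrapper: parses with the invariant culture and, when the
    // result is zero and the trimmed input starts with '-', restores the sign.
    public static bool TryParseSigned(string text, out float result)
    {
        if (!float.TryParse(text,
                            NumberStyles.Float | NumberStyles.AllowThousands,
                            CultureInfo.InvariantCulture,
                            out result))
            return false;

        if (result == 0f && text.TrimStart().StartsWith("-", StringComparison.Ordinal))
            result = -0f;

        return true;
    }
}
```

Called as `SignedZeroParser.TryParseSigned("-0.0", out value)`, it returns true and leaves a zero with the sign bit set in `value`; invalid input returns false, as with float.TryParse.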

If we look at the source code for coreclr, we'll see (skipping all irrelevant parts):

private static bool NumberBufferToDouble(ref NumberBuffer number, ref double value)
{
    double d = NumberToDouble(ref number);
    uint e = DoubleHelper.Exponent(d);
    ulong m = DoubleHelper.Mantissa(d);

    if (e == 0x7FF)
    {
        return false;
    }

    if (e == 0 && m == 0)
    {
        d = 0; // < relevant part
    }

    value = d;
    return true;
}

As you can see, if the mantissa and exponent are both zero, the value is explicitly assigned 0. So there is no way you can change that.
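
To see why that branch catches negative zero as well, here is a small sketch that pulls a double apart. `DoubleBits` and its members are hypothetical stand-ins written for this illustration, not the coreclr `DoubleHelper` source; they assume the standard IEEE 754 layout of 1 sign bit, 11 exponent bits and 52 mantissa bits.

```csharp
using System;

public static class DoubleBits
{
    // Hypothetical helper: the 11 exponent bits (bits 52..62).
    public static uint Exponent(double d)
    {
        long bits = BitConverter.DoubleToInt64Bits(d);
        return (uint)((bits >> 52) & 0x7FF);
    }

    // Hypothetical helper: the low 52 mantissa bits.
    public static ulong Mantissa(double d)
    {
        long bits = BitConverter.DoubleToInt64Bits(d);
        return unchecked((ulong)bits) & 0x000FFFFFFFFFFFFFUL;
    }

    // Hypothetical helper: the sign bit is the top bit of the 64-bit pattern.
    public static bool IsNegative(double d)
    {
        return BitConverter.DoubleToInt64Bits(d) < 0;
    }
}
```

For `-0.0`, both `Exponent` and `Mantissa` return 0 and only `IsNegative` is true, so a test of the form `e == 0 && m == 0` matches negative zero and the following `d = 0` replaces it with positive zero.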

The full .NET Framework implementation has NumberBufferToDouble as an InternalCall (implemented in native C/C++), but I assume it does something similar.

Upvotes: 17

Gaurang Dave

Reputation: 4046

You can try this:

string target = "-0.0";  
decimal result= (decimal.Parse(target,
                 System.Globalization.NumberStyles.AllowParentheses |
                 System.Globalization.NumberStyles.AllowLeadingWhite |
                 System.Globalization.NumberStyles.AllowTrailingWhite |
                 System.Globalization.NumberStyles.AllowThousands |
                 System.Globalization.NumberStyles.AllowDecimalPoint |
                 System.Globalization.NumberStyles.AllowLeadingSign));

Upvotes: 3
