John Vulconshinz

Reputation: 1148

Basic String to Floating-point Conversion

I am attempting to write my own character string to float conversion function. It is basically a cheap ripoff of strtof(), but I can't get it to mimic strtof() exactly. I do not expect my function to mimic strtof() exactly, but I want to know why it differs where it does. I have tested a few different strings, and I found that the following strings produce different values when given to my function than when given to strtof(), with both results printed using printf("%.38f").

  1. 1234.5678
  2. 44444.44444
  3. 333.333
  4. 777.777

Why does this happen? (Also, feel free to point out any other mistakes, or to tell me about other strings that produce different values; there is no way I can find them all.)

#include <stdlib.h>
#include <stdio.h>
#include <float.h>
#include <math.h>

int dec_to_f(char *dec, float *f)
{
int i = 0;
float tmp_f = 0;

if(dec == NULL) return 1;

if(f == NULL) return 2;

if(dec[i] == '\000') return 3;

if(dec[i] == '-')
{
    i++;

    if(dec[i] == '\000') return 3;

    for(; dec[i] != '\000'; i++)
    {
        if(dec[i] == '.')
        {
            float dec_place = 10;
            int power_of_ten = 1;

            for(i++; dec[i] != '\000'; i++, power_of_ten++, dec_place *= 10)
            {
                if(dec[i] >= '0' && dec[i] <= '9')
                {
                    if(power_of_ten > FLT_MAX_10_EXP) return 4;
                    else tmp_f -= (dec[i] - '0') / dec_place;
                }
                else return 5;
            }

            break;
        }

        if(dec[i] >= '0' && dec[i] <= '9')
        {
            tmp_f = tmp_f * 10 - (dec[i] - '0');
            if(!isfinite(tmp_f)) return 6;
        }
        else return 5;
    }
}
else
{
    if(dec[i] == '+')
    {
        if(dec[i+1] == '\000') return 3;
        else i++;
    }

    for(; dec[i] != '\000'; i++)
    {
        if(dec[i] == '.')
        {
            float dec_place = 10;
            int power_of_ten = 1;

            for(i++; dec[i] != '\000'; i++, power_of_ten++, dec_place *= 10)
            {
                if(dec[i] >= '0' && dec[i] <= '9')
                {
                    if(power_of_ten > FLT_MAX_10_EXP) return 7;
                    else tmp_f += (dec[i] - '0') / dec_place;
                }
                else return 5;
            }

            break;
        }

        if(dec[i] >= '0' && dec[i] <= '9')
        {   
            tmp_f = tmp_f * 10 + (dec[i] - '0');
            if(!isfinite(tmp_f)) return 8;
        }
        else return 5;
    }
}

*f = tmp_f;
return 0;
}

int main()
{
printf("FLT_MIN = %.38f\n", FLT_MIN);
printf("FLT_MAX = %f\n", FLT_MAX);
float f = 0;
int return_value = 0;
char str[256];

printf("INPUT = ");
scanf("%s", str);

return_value = dec_to_f(str, &f);

printf("return_value = %i\nstr = \"%s\"\nf = %.38f\nstrtof = %.38f\n", return_value, str, f, strtof(str, NULL));
}

Upvotes: 1

Views: 327

Answers (4)

Rick Regan

Reputation: 3512

The short answer is: You can't use floats or doubles to convert to floats or doubles. You need arithmetic of higher precision, either "big floats" or "big integers".

The longer answer is in David Gay's paper (cited in other answers) and David Gay's implementation of that paper.

The even longer answer is on my Web site, where I explain David Gay's code in a series of detailed articles.

If you don't care about how to get conversions right, and just want to understand why yours went wrong, read my article Quick and Dirty Decimal to Floating-Point Conversion. It shows a small program like yours, which seems like it should work, but doesn't. Then see my article Decimal to Floating-Point Needs Arbitrary Precision to understand why.

Upvotes: 2

chux

Reputation: 153457

@Eric Postpischil and @Nahuel Fouilleul have provided good info. I'll add some more thoughts that don't fit well as a comment.

1) Text to FP needs to be evaluated in the other direction: rather than going from the most significant digit to the least, form the result from the least significant digit to the most. Ignore leading zeros. This best preserves the subtle effects of your least significant text digits. As you go right to left, maintain a power_of_10 to multiply by at the end.

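power_of_10 = 1.0;
...
loop()  // over the digits, right to left
  // tmp_f = tmp_f * 10 + (dec[i] - '0');
  tmp_f = tmp_f/10 + (dec[i] - '0');
  power_of_10 *= 10.0;
...
// power_of_10 counted one factor of 10 per integer digit; the leading digit is already in the ones place
tmp_f *= power_of_10 / 10;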

2) Upon reaching the decimal point '.' (going right to left), reset your power_of_10 to 1.0.

3) Fold your - and + code into one.

4) Use "%.9e" to compare results.

5) Use nextafterf(x, 0.99*x) and nextafterf(x, 1.01*x) to bracket acceptable results.

6) A typical float has a precision of about 1 part in 2^23 (~7 decimal digits). As OP's results are already closing in on that, the overall conversion is OK; it just needs to parse in the reverse direction.
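Points 1) to 3) can be put together roughly as follows. This is only a sketch: the name dec_to_f_rtl and its return codes are made up for illustration, it still does all arithmetic in float (so it is not correctly rounded in general), and whether it lands closer to strtof() than the left-to-right version for a given input is something to measure rather than assume.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: parses [+-]digits[.digits] from right to left.
   Returns 0 on success and a nonzero code on malformed input.
   Very long integer parts can overflow power_of_10; a real version
   would need to check for that. */
int dec_to_f_rtl(const char *dec, float *out)
{
    if (dec == NULL || out == NULL) return 1;

    /* One sign check replaces the duplicated '-' and '+' branches. */
    int negative = 0;
    if (*dec == '-' || *dec == '+') {
        negative = (*dec == '-');
        dec++;
    }

    size_t len = strlen(dec);
    if (len == 0) return 2;

    float acc = 0.0f;          /* digits folded in so far */
    float power_of_10 = 1.0f;  /* one factor of 10 per integer digit */
    int seen_point = 0;
    int digits = 0;

    for (size_t i = len; i-- > 0; ) {          /* right to left */
        char c = dec[i];
        if (c == '.') {
            if (seen_point) return 3;          /* second decimal point */
            seen_point = 1;
            power_of_10 = 1.0f;                /* start counting integer digits */
        } else if (c >= '0' && c <= '9') {
            acc = acc / 10.0f + (float)(c - '0');
            power_of_10 *= 10.0f;
            digits++;
        } else {
            return 4;                          /* not a digit */
        }
    }
    if (digits == 0) return 2;

    /* The leading digit already sits in the ones place of acc,
       so scale by one power of ten less than was counted. */
    acc *= power_of_10 / 10.0f;
    *out = negative ? -acc : acc;
    return 0;
}
```

For example, dec_to_f_rtl("12.5", &f) stores 12.5 in f. To compare against strtof() as suggested in 4) and 5), print both values with "%.9e" and check whether the result lies between nextafterf(x, 0.99*x) and nextafterf(x, 1.01*x) of the reference value x.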

Upvotes: 1

Eric Postpischil

Reputation: 222660

Converting decimal to binary or vice-versa with correct rounding is complicated, requires detailed knowledge of floating-point arithmetic, and requires care.

There are a number of reasons why conversion is hard. Two of them are:

  • When calculations are performed with floating-point, those calculations often experience rounding errors. If the computations are not carefully designed, those rounding errors will affect the final results.
  • Some inputs will be very close to a rounding point, a point where rounding changes because the two nearest representable values are almost equally distant. As an example, consider 1.30000001192092895507812x. If that x is 4, the result should be 1.2999999523162841796875. If it is 6, the result should be 1.30000007152557373046875. Yet the digit x is well beyond the number of decimal digits that 32-bit binary floating-point can distinguish. It is even beyond the number of digits that 64-bit can distinguish. So you cannot use ordinary arithmetic to perform these conversions. You need some form of extended-precision arithmetic.

(In fact, consider 1.30000001192092895507812500000000…x. If x is a non-zero digit after any number of zeros in that numeral, then the conversion should round upward. If there is no non-zero digit, then the conversion should round downward. This means there is no limit to how many digits you must examine in order to determine the correctly rounded result. Fortunately, there are limits to the amount of arithmetic you must do, aside from scanning digits, as shown in the paper.)
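The rounding-point example above can be tried directly. The sketch below assumes the C library's strtof() and strtod() are correctly rounded (glibc's are, as far as I know); the helper names and the two string constants are just for illustration. It converts the x = 4 and x = 6 inputs once directly with strtof() and once through double followed by a cast.

```c
#include <stdlib.h>

/* The two inputs from the example: just below and just above the midpoint
   1.300000011920928955078125 between two adjacent float values. */
static const char *const below_midpoint = "1.300000011920928955078124";
static const char *const above_midpoint = "1.300000011920928955078126";

/* Hypothetical helper: convert directly with the C library's strtof(). */
float direct(const char *s)
{
    return strtof(s, NULL);
}

/* Hypothetical helper: convert to double first, then narrow to float.
   The last digit is far below double precision, so both inputs become
   the same double (the midpoint itself) before the cast. */
float via_double(const char *s)
{
    return (float)strtod(s, NULL);
}
```

With a correctly rounded library, direct() returns 1.2999999523162841796875 for the first string and 1.30000007152557373046875 for the second, while via_double() returns the same float for both, because the digit that decides the rounding is already lost in the double. That is the sense in which ordinary arithmetic, even in 64-bit, is not enough.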

Upvotes: 3

Nahuel Fouilleul

Reputation: 19315

Looking at the source of strtof/strtod, I see that it computes the value as a double and then casts it to float.

Replacing float by double gives the same result as strtof:

#include <stdlib.h>
#include <stdio.h>
#include <float.h>
#include <math.h>
int dec_to_f(char *dec, float *f)
{
int i = 0;
double tmp_f = 0;
if(dec == NULL) return 1;
if(f == NULL) return 2;
if(dec[i] == '\000') return 3;
if(dec[i] == '-')
{
    i++;
    if(dec[i] == '\000') return 3;
    for(; dec[i] != '\000'; i++)
    {
        if(dec[i] == '.')
        {
            double dec_place = 10;
            int power_of_ten = 1;
            for(i++; dec[i] != '\000'; i++, power_of_ten++, dec_place *= 10)
            {
                if(dec[i] >= '0' && dec[i] <= '9')
                {
                    if(power_of_ten > FLT_MAX_10_EXP) return 4;
                    else tmp_f -= (dec[i] - '0') / dec_place;
                }
                else return 5;
            }
            break;
        }
        if(dec[i] >= '0' && dec[i] <= '9')
        {
            tmp_f = tmp_f * 10 - (dec[i] - '0');
            if(!isfinite(tmp_f)) return 6;
        }
        else return 5;
    }
}
else
{
    if(dec[i] == '+')
    {
        if(dec[i+1] == '\000') return 3;
        else i++;
    }
    for(; dec[i] != '\000'; i++)
    {
        if(dec[i] == '.')
        {
            double dec_place = 10;
            int power_of_ten = 1;
            for(i++; dec[i] != '\000'; i++, power_of_ten++, dec_place *= 10)
            {
                if(dec[i] >= '0' && dec[i] <= '9')
                {
                    if(power_of_ten > FLT_MAX_10_EXP) return 7;
                    else tmp_f += (dec[i] - '0') / dec_place;
                }
                else return 5;
            }
            break;
        }
        if(dec[i] >= '0' && dec[i] <= '9')
        {   
            tmp_f = tmp_f * 10 + (dec[i] - '0');
            if(!isfinite(tmp_f)) return 8;
        }
        else return 5;
    }
}
*f = (float)tmp_f;
return 0;
}
int main()
{
printf("FLT_MIN = %.38f\n", FLT_MIN);
printf("FLT_MAX = %f\n", FLT_MAX);
float f = 0;
int return_value = 0;
char str[256];
printf("INPUT = ");
scanf("%s", str);
return_value = dec_to_f(str, &f);
printf("return_value = %i\nstr = \"%s\"\nf = %.38f\nstrtof = %.38f\n", return_value, str, f, strtof(str, NULL));
}

Upvotes: 2
