Reputation: 71

Number precision error in C

Here is the code I wrote:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    double num;
    int tmp;
    printf("enter a number!\n");
    scanf("%lf",&num);
    tmp=num*10000;
    printf(" temp=%d\n",tmp);

    return 0; 
}

When I enter the number 1441.1441, the result I get is 14411440 instead of 14411441, which is obviously the correct result of multiplying my input by 10000. Can someone help me figure out this problem?

Upvotes: 5

Answers (4)

paxdiablo

Reputation: 881633

Since the vast majority of real numbers cannot actually be represented exactly, you'll probably find that 1441.1441 is actually stored as something like 1441.14409999_blah_blah_blah. You can find that out by inserting:

printf ("%.50lf\n", num);

immediately after the scanf and seeing (trailing zeroes removed):

1441.14409999999998035491444170475006103515625

Now that's actually the correct (i.e., closest) value based on your input. The next higher representable double from there gives you:

1441.144100000000207728589884936809539794921875

The error with the first value is:

0.00000000000001964508555829524993896484375
               ^ ~ 2 x 10^-14

while the error with the second is:

0.000000000000207728589884936809539794921875
              ^ ~ 2 x 10^-13

and you can see the latter error is about 10 times as much.

When you multiply that by 10000 and try to shoehorn it into an int, it gets rounded down (truncated). That's because the (C11) standard has this to say in 6.3.1.4:

When a finite value of real floating type is converted to an integer type other than _Bool, the fractional part is discarded (i.e., the value is truncated toward zero).

One thing you can try is to change your shoehorning line into:

tmp = num * 10000 + 0.5;

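which effectively turns the truncation into a rounding operation. I think that will work for non-negative values (negative values would need - 0.5 instead, because the conversion truncates toward zero), but you may want to test it (and keep an eye on it) just in case.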

Upvotes: 11

Daniel Fischer

Reputation: 183888

For the general principle, paxdiablo's answer contains the relevant parts. Most terminating decimal fractions cannot be exactly represented as binary floating point numbers, hence the value of the floating point variable is a little smaller or larger than the mathematical value of the number representation in the given string, so when you want to get the appropriate integer value after scaling, you should round and not truncate.

But in the specific example here, we have a different scenario. The closest IEEE754 double precision (64-bit binary) value to 1441.1441 is

1441.14409999999998035491444170475006103515625

which is indeed a little smaller than 1441.1441. But if that value is multiplied by 10000 as an IEEE754 double precision value, the result is exactly

14411441

What happens here is that, as is allowed per 5.2.4.2.2 paragraph 9

Except for assignment and cast (which remove all extra range and precision), the values yielded by operators with floating operands and values subject to the usual arithmetic conversions and of floating constants are evaluated to a format whose range and precision may be greater than required by the type.

(emphasis mine), the product is evaluated with a greater precision than required by the type (probably the x87 80-bit format), yielding a slightly smaller value, and when the result of the multiplication is converted to int, the fractional part is discarded, and you get 14411440.

scanf("%lf",&num);

The value is stored in num, so it must have exactly the precision of double.

tmp=num*10000;

The product num * 10000 is neither stored nor cast to double, so it may have greater precision, resulting in a smaller or larger value than the closest double value. That value is then truncated to obtain the int.

If you stored the product in a double variable

num *= 10000;
tmp = num;

or cast it to double before converting to int,

tmp = (double)(num * 10000);

you ought to get the result 14411441 for the input 1441.1441 (but note that not all compilers always honour the requirement of converting to the exact required precision when casting or storing - violating the standard - so there's no guarantee that that will produce 14411441 with all optimisation settings).

Since many 64-bit platforms perform floating-point arithmetic using SSE instructions rather than the x87 coprocessor, the observed behaviour is less likely to appear on 64-bit systems than on 32-bit systems.

Upvotes: 1

son of the northern darkness

Reputation: 709

It looks like scanf is using float precision internally. I briefly checked that 1441.1441 is represented in float as 1441.1440. In general, you shouldn't rely on precision in floating point operations.

Upvotes: -4

0x90

Reputation: 40982

Try rounding it like this:

float a = 3.14;

int i = (int)(a+0.5);

In your case:

 double num;
 int tmp;
 printf("enter a number!\n");
 scanf("%lf",&num);
 tmp=(int)(num*10000 + 0.5);
 printf(" temp=%d\n",tmp);

Upvotes: 0
