Harsh Vardhan

Reputation: 209

Why does a C floating-point type modify the actual input of 125.1 to 125.099998 on output?

I wrote the following program:

    #include <stdio.h>

    int main(void)
    {
        float f;
        printf("\nInput a floating-point no.: ");
        scanf("%f", &f);
        printf("\nOutput: %f\n", f);
        return 0;
    }

I am on Ubuntu and used GCC to compile the above program. Here is my sample run and output I want to inquire about:

Input a floating-point no.: 125.1
Output: 125.099998

Why does the precision change?

Upvotes: 1

Views: 1186

Answers (7)

Harsh Vardhan

Reputation: 209

Thank you all for your answers. Although almost all of you pointed me in the right direction, I could not understand the exact reason for this behavior. So I did a bit of research in addition to reading the pages you pointed me to. Here is my understanding of this behavior:

Single precision floating point numbers typically use 4 bytes of storage on x86/x86-64 architectures. However, not all 32 bits (4 bytes = 32 bits) are used to store the magnitude of the number.

To be stored as a single precision floating point type, the input value is expressed in the following notation (somewhat similar to scientific notation):

(-1)^s x 1.m x 2^(e-127), where
  s = sign of the number, range:{0,1} - takes up 1 bit
  m = mantissa (fractional portion) of the number - takes up 23 bits
  e = exponent of the number offset by 127, range:{0,..,255} - takes up 8 bits

and then stored in memory as

0th byte 1st byte 2nd byte 3rd byte
mmmmmmmm mmmmmmmm emmmmmmm seeeeeee

Therefore the decimal number 125.1 is first converted to binary form, limited to 24 significant bits: the leading 1 is implicit, so the mantissa needs no more than 23 stored bits. After conversion to binary form:

125.1 = 1111101.00011001100110011

NOTE: 0.1 in decimal needs infinitely many bits in binary, so the fractional part is cut to 17 bits here; together with the 7 bits of the integer part, the complete representation does not exceed 24 bits.

Now converting it into the specified notation we get:

125.1 = 1.111101 00011001100110011 x 2^6
      = (-1)^0 x 1.111101 00011001100110011 x 2^(133-127)

which implies

s = 0
m = 11110100011001100110011
e = 133 = 10000101

Therefore, 125.1 will be stored in memory as:

0th byte 1st byte 2nd byte 3rd byte
mmmmmmmm mmmmmmmm emmmmmmm seeeeeee
00110011 00110011 11111010 01000010

When the value is passed to the printf() function, the output is generated by converting the binary form back to decimal form. On little-endian machines such as x86/x86-64, the least significant byte is stored at the lowest address, so the bytes are read from the highest address down to the lowest:

3rd byte 2nd byte 1st byte 0th byte
seeeeeee emmmmmmm mmmmmmmm mmmmmmmm
01000010 11111010 00110011 00110011

Next, it is converted back into the notation given earlier:

(-1)^0 x 1.111101 00011001100110011 x 2^(133-127)

On simplifying the above representation further:

= 1.111101 00011001100110011 x 2^6
= 1111101.00011001100110011

And finally converting it to decimal:

= 125.0999984741210938

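but printf's %f conversion prints six digits after the decimal point by default, therefore the output is rounded to 125.099998. (A single precision float carries only about 7 significant decimal digits, so the digits beyond that do not reflect the 125.1 that was typed.)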

Upvotes: 6

Andrew Langrick

Reputation: 126

Think about a fixed point representation first.

2^3=8 2^2=4 2^1=2 2^0=1 2^-1=1/2 2^-2=1/4 2^-3=1/8 2^-4=1/16

If we want to represent a fraction then we set the bits to the right of the point, so 5.5 is represented as 01011000.

But if we want to represent 5.6, there is no exact representation with these eight bits. The closest we can get is 01011010 == 5.625, which is off by 0.025; the nearest value below, 01011001 == 5.5625, is off by 0.0375.

1/2 + 1/8 = 0.625

2^-1 + 2^-3

Upvotes: 2

Alexander

Reputation: 10386

Most decimal numbers have no exact floating-point representation, because floats have limited precision. When converting a number from text to a float (with scanf or otherwise), you move into a different number system, and precision may be lost. The same goes for converting a float to a string: you decide how many digits you want. You can't know "how many digits there are" in a float before converting it to text or to another format that can keep that information. This all has to do with how floats are stored:

significant_digits * base^exponent

Upvotes: 0

ysth

Reputation: 98398

The normal type used for floating point in C is double, not float. When passed to printf, your float is implicitly converted to a double, and because float is less precise, the difference between 125.1 and the closest representable number is more apparent (printf's default precision is better suited to doubles). Try this instead:

#include<stdio.h>
int main(void)
{
    double f;
    printf("\nInput a floating-point no.: ");
    scanf("%lf",&f);
    printf("\nOutput: %f\n",f);
    return 0;
}

Upvotes: -1

flolo

Reputation: 15486

If I tell you to write 1/3 down as a decimal number, you realize there are numbers that have no finite representation. .1 is the exact representation of 1/10, so this problem does not appear there, BUT only in decimal representation. In binary representation, .1 is one of those numbers that require infinitely many digits. Since your number must be cut off somewhere, something is lost.

Upvotes: 0

user822715

Reputation: 27

Because it's the closest representation of 125.1. Remember that single-precision floating-point numbers are just 32 bits.

Upvotes: 0

Aasmund Eldhuset

Reputation: 37950

Because the number 125.1 is impossible to represent exactly with floating-point numbers. This happens in most programming languages. Use e.g. printf("%.1f", f); if you want to print the number with one decimal, but be warned: the number itself is not exactly equal to 125.1.

Upvotes: 12
