Stanley Sathler

Reputation: 460

Why is an "unsigned int" NOT different from an EOF - can it store negative values?

I am trying to read a bitmap file, byte by byte, and I have a loop that runs until EOF is reached. To do that, I have a variable declared as unsigned int that stores each byte. The loop stops when this variable is equal to EOF.

The interesting point is: if I declare my variable as unsigned int it works. However, if I declare my variable as unsigned short int, the loop runs forever, because it never finds the EOF.

#include <stdio.h>

int main()
{
    FILE *file;
    unsigned int currentByte;

    file = fopen("/home/stanley/Desktop/x.bmp", "rb");

    while ((currentByte = fgetc(file)) != EOF) {
        printf("%u\n", currentByte); /* %u matches the unsigned int argument */
    }

    fclose(file);
    return 0;
}

The code above is what I am writing. If the file is 90 bytes long, 90 values are printed on the screen.

However, for some reason, when I change it to unsigned short int currentByte, the loop keeps running forever. It is as if currentByte was never equal to EOF.

I read somewhere that EOF holds a negative value (-1). But if EOF is negative, why does the loop work when I use unsigned int and why does it break when I use unsigned short int? In theory, shouldn't the problem be related to unsigned itself rather than to short? It is unsigned that cannot store negative values.

I'm sorry if this is a very silly question. I'm trying to understand better how bits and bytes work, and some concepts are still unfamiliar to me.

I'm compiling it on the following environment:

Thanks in advance. :)

Upvotes: 3

Views: 229

Answers (3)

Thomas Jager

Reputation: 5265

If the size of int is greater than the size of short, then you will encounter this problem.

Let's assume that EOF is of type int and contains the value -1. For the sake of example, let's also assume that int is a 32-bit value, while short is a 16-bit value.

In this case, if fgetc returns EOF, it will have a value of 0xFFFFFFFF when taken as an unsigned int. When comparing it to EOF (type int), the signed integer -1 will be converted to the unsigned value 0xFFFFFFFF. These two values are equal, so the comparison works as expected.

However, an EOF returned by fgetc, taken as an unsigned short, will have a value of 0xFFFF. Because unsigned short is smaller than int, when this value is compared to EOF, the unsigned short 0xFFFF is promoted to an int with a value of 0x0000FFFF (extra digits shown for clarity). Since -1 is not equal to 0xFFFF as a 32-bit value, the two never compare equal, and the loop will not stop.
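The two cases above can be sketched as a pair of small functions. This is an illustration only: the function names are made up for this example, the casts spell out conversions that the compiler would otherwise apply implicitly, and the unsigned short result assumes int is wider than short (for example 32-bit int and 16-bit short).

```c
#include <stdio.h>

/* Hypothetical helper: stores an fgetc-style result in an unsigned int
 * and reports whether it still compares equal to EOF. */
int matches_eof_as_uint(int fgetc_result)
{
    unsigned int stored = (unsigned int)fgetc_result; /* -1 wraps to UINT_MAX */

    /* In "stored != EOF" the int EOF is converted to unsigned int,
     * so both sides are UINT_MAX when fgetc_result was EOF. */
    return stored == (unsigned int)EOF;
}

/* Hypothetical helper: same idea, but the value is kept in an unsigned short. */
int matches_eof_as_ushort(int fgetc_result)
{
    unsigned short stored = (unsigned short)fgetc_result; /* -1 wraps to USHRT_MAX */

    /* In "stored != EOF" the unsigned short is promoted to int, which
     * yields a positive value when int is wider than short. */
    return (int)stored == EOF;
}
```

Under the stated assumption, matches_eof_as_uint(EOF) returns 1 and matches_eof_as_ushort(EOF) returns 0, which is why only the unsigned int loop terminates.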

The fact that fgetc returns int hints that you should keep it as that type, as you'll otherwise discard some information or cause confusion in comparisons.

Upvotes: 3

P.W

Reputation: 26800

When you convert a signed integer to an unsigned integer (which happens when EOF is assigned to an unsigned integer variable), a value that is out of range is brought into range by adding UINT_MAX + 1. So if EOF is -1, the value becomes UINT_MAX.

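UINT_MAX fits only in an unsigned int, not in an unsigned short. Assigning EOF to an unsigned short reduces it modulo USHRT_MAX + 1 instead, giving USHRT_MAX. Where int is wider than short, that value is promoted to a positive int for the comparison and never equals EOF, so whether the loop terminates depends on the widths the implementation gives short and int.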
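The values involved can be checked with a short sketch. The helper names below are hypothetical and exist only for this illustration. UINT_MAX and USHRT_MAX come from <limits.h>.

```c
#include <limits.h>

/* Hypothetical helper: converts an int to unsigned int,
 * which reduces the value modulo UINT_MAX + 1. */
unsigned int as_uint(int value)
{
    return (unsigned int)value;
}

/* Hypothetical helper: converts an int to unsigned short,
 * which reduces the value modulo USHRT_MAX + 1. */
unsigned short as_ushort(int value)
{
    return (unsigned short)value;
}

/* Returns 1 when -1 maps to the maximum of each unsigned type. */
int minus_one_becomes_max(void)
{
    return as_uint(-1) == UINT_MAX && as_ushort(-1) == USHRT_MAX;
}
```

Here minus_one_becomes_max() returns 1. The unsigned short result is USHRT_MAX rather than UINT_MAX, which is the value that later fails to match EOF once it is promoted to a wider int.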

Note that the fgetc function returns an int, so you must use an int variable to store its value.

Upvotes: 2

R.. GitHub STOP HELPING ICE

Reputation: 215257

You should use the type int to match what fgetc returns, not unsigned int. The reason the loop stop condition works with unsigned int is not that the value is ever negative, but that, when the != operator is used with unsigned and signed operands of the same rank, the signed operand is converted to unsigned before the comparison. Assigning the EOF result of fgetc to currentByte and converting EOF to unsigned both produce the same result, and thus they compare equal.
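A minimal sketch of the loop with an int variable follows. The function name count_bytes is an assumption made for this example and is not part of the original program. It counts bytes instead of printing them so that the result is easy to check.

```c
#include <stdio.h>

/* Hypothetical helper: reads a stream to its end and returns
 * the number of bytes that were read. */
long count_bytes(FILE *stream)
{
    int currentByte;   /* int, matching the return type of fgetc */
    long count = 0;

    /* fgetc returns each byte as a value in 0..255, or EOF (negative),
     * so a data byte of 0xFF can never be mistaken for EOF. */
    while ((currentByte = fgetc(stream)) != EOF) {
        count++;
    }
    return count;
}
```

The stream would come from fopen with mode "rb", as in the question, and it is worth checking the fopen result for NULL before passing it in.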

Upvotes: 2
