Reputation: 460
I am trying to read a bitmap file byte by byte, and I have a loop that runs until EOF is reached. To do that, I have a variable declared as unsigned int that stores each byte. The loop stops when this variable is equal to EOF.
The interesting point is: if I declare my variable as unsigned int, it works. However, if I declare it as unsigned short int, the loop runs forever, because it never finds EOF.
#include <stdio.h>

int main()
{
    FILE *file;
    unsigned int currentByte;

    file = fopen("/home/stanley/Desktop/x.bmp", "rb");
    while ((currentByte = fgetc(file)) != EOF) {
        printf("%d \n", currentByte);
    }
    fclose(file);
    return 0;
}
The code above is what I am writing. If the file is 90 bytes long, 90 bytes are printed on the screen.
However, for some reason, when I change the declaration to unsigned short int currentByte, the loop keeps running forever. It is as if currentByte were never equal to EOF.
I read somewhere that EOF holds a negative value (-1). But if EOF is negative, why does it work with unsigned int and fail with unsigned short int? In theory, shouldn't the problem be related to unsigned itself rather than to short? It is unsigned that cannot store negative values.
I'm sorry if this is a very silly question. I'm trying to better understand how bits and bytes work, and some concepts are still unfamiliar to me.
I'm compiling it on the following environment:
Thanks in advance. :)
Upvotes: 3
Views: 229
Reputation: 5265
If the size of int is greater than the size of short, then you will encounter this problem.
Let's assume that EOF is of type int and contains the value -1. For the sake of example, let's also assume that int is a 32-bit value, while short is a 16-bit value.
In this case, if fgetc returns EOF, it will have a value of 0xFFFFFFFF when stored as an unsigned int. When comparing it to EOF (type int), the signed integer -1 will be converted to the unsigned value 0xFFFFFFFF. These two values are equal, so the comparison works as expected.
However, an EOF returned by fgetc and stored as an unsigned short will have a value of 0xFFFF. Because unsigned short is smaller than int, when this value is compared to EOF, the unsigned short 0xFFFF is converted to an int with a value of 0x0000FFFF (extra digits shown for clarity). Since -1 is not equal to 0xFFFF as a 32-bit value, the comparison never finds them equal, and the loop will not stop.
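The two comparisons described above can be sketched in a few lines of C. The helper names matches_eof_as_uint and matches_eof_as_ushort are hypothetical, introduced only for illustration, and the result of the second one assumes that int is wider than short, as in the 32-bit/16-bit example.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: store a value in an unsigned int, then compare to EOF.
 * EOF (an int) is converted to unsigned int for the comparison, so both
 * sides go through the same conversion. */
int matches_eof_as_uint(int value)
{
    unsigned int stored = (unsigned int)value;
    return stored == EOF;
}

/* Hypothetical helper: store a value in an unsigned short, then compare to EOF.
 * When int is wider than short, the stored value is promoted to a
 * non-negative int, which can never equal the negative EOF. */
int matches_eof_as_ushort(int value)
{
    unsigned short stored = (unsigned short)value;
    return stored == EOF;
}
```

Calling both helpers with EOF from a main function should show the first returning 1, and the second returning 0 on a platform where short is narrower than int.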
The fact that fgetc returns int hints that you should keep it as that type, as you'll otherwise discard some information or cause confusion in comparisons.
Upvotes: 3
Reputation: 26800
When you convert a signed integer to an unsigned integer (which happens when EOF is assigned to an unsigned integer variable), a negative value is converted by adding UINT_MAX + 1. So if EOF is -1, this value becomes UINT_MAX.
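UINT_MAX fits only in an unsigned int, not in an unsigned short (when short is narrower than int). The conversion itself is well defined: the value is reduced modulo one more than the maximum of the destination type, so storing -1 in an unsigned short yields USHRT_MAX rather than UINT_MAX.
Whether that stored value later compares equal to EOF depends on the implementation's widths of short and int, and so the behavior of the program depends on them too.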
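As a small worked sketch of that arithmetic: the C standard specifies conversion to an unsigned type as reduction modulo one more than the destination type's maximum, so -1 wraps to the maximum of whichever unsigned type receives it. The function names to_uint and to_ushort are hypothetical, chosen only for this illustration.

```c
#include <limits.h>

/* Hypothetical helper: -1 + (UINT_MAX + 1) == UINT_MAX */
unsigned int to_uint(int value)
{
    return (unsigned int)value;
}

/* Hypothetical helper: -1 + (USHRT_MAX + 1) == USHRT_MAX */
unsigned short to_ushort(int value)
{
    return (unsigned short)value;
}
```

The two results differ whenever short is narrower than int, which is why the same EOF value ends up as two different bit patterns depending on the variable's type.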
Note that the fgetc function returns an int, so you should use an int variable to store its value.
Upvotes: 2
Reputation: 215257
You should use the type int to match what fgetc returns, not unsigned int. The loop's stop condition works with unsigned int not because the value is ever negative, but because, when the != operator is applied to an unsigned and a signed operand of the same rank, the signed operand is converted to unsigned before the comparison. Assigning the EOF result of fgetc to currentByte and converting EOF to unsigned both produce the same value, and thus they compare equal.
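A minimal sketch of the loop with an int variable, as the answers recommend. The function name print_bytes and its byte-count return value are assumptions added for illustration; they are not part of the original program.

```c
#include <stdio.h>

/* Hypothetical helper: prints every byte of an already opened stream
 * and returns how many bytes were read. */
long print_bytes(FILE *file)
{
    int currentByte;   /* int keeps EOF distinct from the data byte 0xFF */
    long count = 0;

    while ((currentByte = fgetc(file)) != EOF) {
        printf("%d\n", currentByte);
        count++;
    }
    return count;
}
```

A caller would open the file with fopen in "rb" mode, check the result against NULL, pass it to print_bytes, and then call fclose. With int, every byte value from 0 to 255 stays non-negative, so only the real end of file (or a read error) ends the loop.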
Upvotes: 2