Reputation: 283
This code snippet is excerpted from a Linux book. If it is not appropriate to post it here, please let me know and I will delete it. Thanks.
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
char buf[30];
char *p;
int i;
unsigned int index = 0;
//unsigned long index = 0;
printf("index-1 = %lx (sizeof %d)\n", index-1, sizeof(index-1));
for(i = 'A'; i <= 'Z'; i++)
buf[i - 'A'] = i;
p = &buf[1];
printf("%c: buf=%p p=%p p[-1]=%p\n", p[index-1], buf, p, &p[index-1]);
return 0;
}
In a 32-bit OS environment: this program works fine whether index is declared as unsigned int or unsigned long.
In a 64-bit OS environment: the same program dumps core if index is declared as unsigned int. However, if I change only the type of index from unsigned int to a) unsigned long or b) unsigned short, the program works fine too.
The book only says that on 64-bit the core dump is caused by a non-negative number. But I do not understand why unsigned long and unsigned short work while unsigned int does not.
What confuses me is that
p + (0u - 1) == p + UINT_MAX
when index is unsigned int,
but
p + (0ul - 1) == &p[-1]
when index is unsigned long.
This is where I am stuck.
If anyone can explain the details, I would highly appreciate it!
Here are some results from my 32-bit machine (RHEL5.10, gcc version 4.1.2 20080704) and my 64-bit machine (RHEL6.3, gcc version 4.4.6 20120305).
I am not sure whether the gcc version makes any difference here, so I include that information as well.
On 32 bit:
I tried two changes:
1) Modify unsigned int index = 0 to unsigned short index = 0.
2) Modify unsigned int index = 0 to unsigned char index = 0.
In both cases the program runs without problems.
index-1 = ffffffff (sizeof 4)
A: buf=0xbfbdd5da p=0xbfbdd5db p[-1]=0xbfbdd5da
It seems that index is promoted to a 4-byte type because of the -1.
On 64 bit:
I tried three changes:
1) Modify unsigned int index = 0 to unsigned char index = 0.
It works!
index-1 = ffffffff (sizeof 4)
A: buf=0x7fffef304ae0 p=0x7fffef304ae1 p[-1]=0x7fffef304ae0
2) Modify unsigned int index = 0 to unsigned short index = 0.
It works!
index-1 = ffffffff (sizeof 4)
A: buf=0x7fff48233170 p=0x7fff48233171 p[-1]=0x7fff48233170
3) Modify unsigned int index = 0 to unsigned long index = 0.
It works!
index-1 = ffffffff (sizeof 8)
A: buf=0x7fffb81d6c20 p=0x7fffb81d6c21 p[-1]=0x7fffb81d6c20
BUT only unsigned int index = 0 runs into the core dump at the last printf.
index-1 = ffffffff (sizeof 4)
Segmentation fault (core dumped)
Upvotes: 1
Views: 3652
Reputation: 5543
Arithmetic on unsigned values is always defined, in terms of wrap-around. E.g. (unsigned)-1 is the same as UINT_MAX. So an expression like p + (0u-1) is equivalent to p + UINT_MAX (&p[0u-1] is equivalent to &*(p + (0u-1)) and p + (0u-1)).
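The wrap-around rule can be checked directly. This is only a small sketch; the function names are made up for illustration and do not come from the question's code.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: true when 0u - 1 wraps around to UINT_MAX,
   which is what the C rules for unsigned arithmetic require. */
bool zero_minus_one_is_uint_max(void)
{
    unsigned int index = 0;
    return index - 1 == UINT_MAX;
}

/* Hypothetical helper: converting -1 to unsigned int gives UINT_MAX too. */
bool cast_minus_one_is_uint_max(void)
{
    return (unsigned int)-1 == UINT_MAX;
}

/* Hypothetical helper: the same holds for unsigned long and ULONG_MAX. */
bool zero_minus_one_is_ulong_max(void)
{
    unsigned long index = 0;
    return index - 1 == ULONG_MAX;
}
```

All three functions should return true on any conforming implementation, regardless of whether the platform is 32-bit or 64-bit.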
Maybe this is easier to understand if we replace the pointers with unsigned integer types. Consider:
uint32_t p32; // say, this is a 32-bit "pointer"
uint64_t p64; // a 64-bit "pointer"
Assuming 16, 32, and 64 bits for short, int, and long, respectively (entries on the same line are equal):
p32 + (unsigned short)-1  =  p32 + USHRT_MAX  =  p32 + (UINT_MAX>>16)
p32 + (0u-1)              =  p32 + UINT_MAX   =  p32 - 1
p32 + (0ul-1)             =  p32 + ULONG_MAX  =  p32 + UINT_MAX  =  p32 - 1
p64 + (0u-1)              =  p64 + UINT_MAX
p64 + (0ul-1)             =  p64 + ULONG_MAX  =  p64 - 1
You can always replace operands of addition, subtraction and multiplication on unsigned types by something congruent modulo the maximum value + 1. For example,
-1 ≡ 0xffffffff (mod 2^32)
(0xffffffff is 2^32 - 1 or UINT_MAX), and also
0xffffffffffffffff ≡ 0xffffffff (mod 2^32)
(for a 32-bit unsigned type you can always truncate to the least-significant 8 hex digits).
Your examples:
32-bit
unsigned short index = 0;
In index - 1, index is promoted to int. The result has type int and value -1 (which is negative). Same for unsigned char.
64-bit
unsigned char index = 0;
unsigned short index = 0;
Same as for 32-bit. index is promoted to int, index - 1 is negative.
unsigned long index = 0;
The output
index-1 = ffffffff (sizeof 8)
is weird: it is your only correct use of %lx, but it looks like you printed it with %x (expecting 4 bytes). On my 64-bit computer (with 64-bit long) and with %lx I get:
index-1 = ffffffffffffffff (sizeof 8)
0xffffffffffffffff is -1 modulo 2^64.
unsigned index = 0;
An int cannot hold every value an unsigned int can, so in index - 1 nothing is promoted to int. The result has type unsigned int and value UINT_MAX, which is -1 reduced modulo 2^32 (a positive value, 0xffffffff, since the type is unsigned). For 32-bit addresses, adding this value is the same as subtracting one:
   bfbdd5db         bfbdd5db
+  ffffffff      -         1
= 1bfbdd5da
=  bfbdd5da      =  bfbdd5da
(Note the wrap-around/truncation.) For 64-bit addresses, however:
  00007fff b81d6c21
+          ffffffff
= 00008000 b81d6c20
with no wrap-around. This is trying to access an invalid address, so you get a segfault.
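The two additions above can be reproduced with fixed-width integers standing in for the addresses. This is a sketch under the assumption that a pointer behaves like a plain unsigned integer of the same width; add_offset32 and add_offset64 are illustrative names, not part of the original program.

```c
#include <stdint.h>

/* Hypothetical model of a 32-bit address: the sum is reduced modulo 2^32,
   so adding 0xffffffff lands one below the starting address. */
uint32_t add_offset32(uint32_t addr, uint32_t offset)
{
    return (uint32_t)(addr + offset);
}

/* Hypothetical model of a 64-bit address: the sum is reduced modulo 2^64.
   A 32-bit offset of 0xffffffff is zero-extended by the caller's conversion,
   so the result is almost 4 GiB above the starting address. */
uint64_t add_offset64(uint64_t addr, uint64_t offset)
{
    return addr + offset;
}
```

With the addresses from the question, add_offset32(0xbfbdd5db, UINT32_MAX) gives 0xbfbdd5da, while add_offset64(0x00007fffb81d6c21, UINT32_MAX) gives 0x00008000b81d6c20. Only an offset of UINT64_MAX moves the 64-bit address back by one.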
Maybe have a look at 2’s complement on Wikipedia.
Under my 64-bit Linux, using a specifier expecting a 32-bit value while passing a 64-bit type (and the other way round) seems to “work”; only the 32 least-significant bits are read. But use the correct ones. lx expects an unsigned long, unmodified x an unsigned int, and hx an unsigned short (an unsigned short is promoted to int when passed to printf, because it is passed as a variable argument and therefore undergoes the default argument promotions). The length modifier for size_t is z, as in %zu:
printf("index-1 = %lx (sizeof %zu)\n", (unsigned long)(index-1), sizeof(index-1));
(The conversion to unsigned long doesn’t change the value of an unsigned int, unsigned short, or unsigned char expression.)
sizeof(index-1) could also have been written as sizeof(+index): the only thing that affects the size of the expression is the integer promotions (the first step of the usual arithmetic conversions), which unary + also applies.
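The difference between the promoted and the non-promoted case can be made visible with a few small functions. This is a sketch that assumes int can represent every unsigned short value (true for the 16-bit short and 32-bit int discussed here); the function names are invented for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* unsigned short: index is promoted to int before the subtraction,
   so index - 1 is the int value -1 and is not positive. */
bool ushort_minus_one_is_positive(void)
{
    unsigned short index = 0;
    return index - 1 > 0;
}

/* unsigned int: nothing is promoted to int, so index - 1 wraps around
   to UINT_MAX, which is positive. */
bool uint_minus_one_is_positive(void)
{
    unsigned int index = 0;
    return index - 1 > 0;
}

/* Unary + applies the integer promotions, so the result has the
   size of int rather than the size of unsigned short. */
size_t ushort_promoted_size(void)
{
    unsigned short index = 0;
    return sizeof(+index);
}
```

Under the stated assumption, the first function returns false, the second returns true, and the third returns sizeof(int). That matches the "sizeof 4" in the question's output for unsigned short and unsigned char.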
Upvotes: -1
Reputation: 9208
One other problem in your code is in your printf():
printf("index-1 = %lx (sizeof %d)\n", index-1, sizeof(index-1));
Let's simplify:
int i = 100;
printf("%lx", i - 100);
You are telling printf that a long is coming, but in reality you are passing an int. clang gives you the correct warning (I think gcc should also emit it). See:
test1.c:6:19: warning: format specifies type 'unsigned long' but the argument has type 'int' [-Wformat]
printf("%lx", i - 100);
~~~ ^~~~~~~
%x
1 warning generated.
The solution is simple: you need to pass a long to printf or tell printf to print an int:
printf("%lx", (long)(i-100) );
printf("%x", i-100);
You got lucky on 32-bit and your app did not crash. Porting it to 64-bit revealed a bug in your code, and you can now fix it.
Upvotes: 1
Reputation: 45684
Do not lie to the compiler!
Passing printf an int where it expects a long (%ld) is undefined behavior.
(Creating a pointer that points outside any valid object, other than one past its end, is UB too...)
Correct the format specifiers and the pointer arithmetic (that includes indexing as a special case) and everything will work.
UB includes "It works as expected" as well as "Catastrophic failure".
BTW: If you politely ask your compiler for all warnings, it will warn you. Use -Wall -Wextra -pedantic or similar.
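For illustration, here is one possible way to apply both corrections to the code from the question. It is a sketch, not the book's fix: the choice of ptrdiff_t for the index and the function name element_before are my own assumptions.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical rewrite of the question's main body: fills buf with 'A'..'Z'
   and returns the element just before p = &buf[1]. The index is signed
   (ptrdiff_t), so index - 1 really is -1 on 32-bit and 64-bit targets. */
char element_before(void)
{
    char buf[30];
    char *p;
    int i;
    ptrdiff_t index = 0;

    for (i = 'A'; i <= 'Z'; i++)
        buf[i - 'A'] = (char)i;
    p = &buf[1];

    /* %td matches ptrdiff_t, %zu matches size_t, %p expects void *. */
    printf("index-1 = %td (sizeof %zu)\n", index - 1, sizeof(index - 1));
    printf("%c: buf=%p p=%p p[-1]=%p\n",
           p[index - 1], (void *)buf, (void *)p, (void *)&p[index - 1]);

    return p[index - 1];
}
```

Calling element_before() from main should return 'A', and the code should compile without format warnings under -Wall -Wextra -pedantic.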
Upvotes: 1