Reputation: 105
I was browsing through a webpage which had some c FAQ's, I found this statement made.
Similarly, if a has 10 elements and ip points to a[3], you can't compute or access ip + 10 or ip - 5. (There is one special case: you can, in this case, compute, but not access, a pointer to the nonexistent element just beyond the end of the array, which in this case is &a[10].
I was confused by the statement
you can't compute ip + 10
I can understand accessing the element out of bounds is undefined, but computing!!!.
I wrote the following snippet which computes (let me know if this is what the website meant by computing) a pointer out-of-bounds.
#include <stdio.h>
int main()
{
int a[10], i;
int *p;
for (i = 0; i<10; i++)
a[i] = i;
p = &a[3];
printf("p = %p and p+10 = %p\n", p, p+10);
return 0;
}
$ ./a.out
p = 0xbfa53bbc and p+10 = 0xbfa53be4
We can see that p + 10 is pointing to 10 elements(40 bytes) past p. So what exactly does the statement made in the webpage mean. Did I mis-interpret something.
Even in K&R (A.7.7) this statement is made:
The result of the + operator is the sum of the operands. A pointer to an object in an array and a value of any integral type may be added. ... The sum is a pointer of the same type as the original pointer, and points to another object in the same array, appropriately offset from the original object. Thus if P is a pointer to an object in an array, the expression P+1 is a pointer to the next object in the array. If the sum pointer points outside the bounds of the array, except at the first location beyond the high end, the result is undefined.
What does being "undefined" mean. Does this mean the sum will be undefined, or does it only mean when we dereference it the behavior is undefined. Is the operation undefined even when we do not dereference it and just calculate the pointer to element out-of-bounds.
Upvotes: 6
Views: 3930
Reputation: 400512
Undefined behavior means exactly that: absolutely anything could happen. It could succeed silently, it could fail silently, it could crash your program, it could blue screen your OS, or it could erase your hard drive. Some of these are not very likely, but all of them are permissible behaviors as far as the C language standard is concerned.
In this particular case, yes, the C standard is saying that even computing the address of a pointer outside of valid array bounds, without dereferencing it, is undefined behavior. The reason it says this is that there are some arcane systems where doing such a calculation could result in a fault of some sort. For example, you might have an array at the very end of addressable memory, and constructing a pointer beyond that would cause an overflow in a special address register which generates a trap or fault. The C standard wants to permit this behavior in order to be as portable as possible.
In reality, though, you'll find that constructing such an invalid address without dereferencing it has well-defined behavior on the vast majority of systems you'll come across in common usage. Creating an invalid memory address will have no ill effects unless you attempt to dereference it. But of course, it's better to avoid creating those invalid addresses so that your code will work perfectly even on those arcane systems.
Upvotes: 11
Reputation: 13624
The web page wording is confusing, but technically correct. The C99 language specification (section 6.5.6) discusses additive expressions, including pointer arithmetic. Subitem 8 specifically states that computing a pointer one past the end of an array shall not cause an overflow, but beyond that the behavior is undefined.
In a more practical sense, C compilers will generally let you get away with it, but what you do with the resulting value is up to you. If you try to dereference the resulting pointer to a value, as K&R states, the behavior is undefined.
Undefined, in programming terms, means "Don't do that." Basically, it means the specification that defines how the language works does not define an appropriate behavior in that situation. As a result, theoretically anything can happen. Generally all that happens is you have a silent or noisy (segfault) bug in your program, but many programmers like to joke about other possible results from causing undefined behavior, like deleting all of your files.
Upvotes: 4
Reputation: 92874
The behaviour would be undefined in the following case
int a[3];
(a + 10) ; // this is UB too as you are computing &a[10]
*(a+10) = 10; // Ewwww!!!!
Upvotes: 2