jdm

Reputation: 10130

Accessing an array out of bounds, but returning earlier - UB?

I have code that calculates an array index, and if it is valid accesses that array item. Something like:

int b = rowCount() - 1;
if (b == -1) return;
const BlockInfo& bi = blockInfo[b];

I am worried that this might be triggering undefined behavior. For example, the compiler might assume that b is always non-negative, since I use it to index the array, so it will optimize the if clause away.

Under which circumstances is it safe to "access" an array out-of-bounds, when you do nothing with the invalid result? Does it change if blockInfo is not an actual array, but a container like a vector? If this is unsafe, could I fix it by putting the access in an else clause?

if (b == -1) {
    return;
} else {
    const BlockInfo& bi = blockInfo[b];
}

Lastly, are there compiler flags in the spirit of -fno-strict-aliasing or -fno-delete-null-pointer-checks that make the compiler "do the obvious thing" and prevent any unwanted behavior?

For clarification: My concern stems from a different issue, where you intend to test whether a pointer is non-null before accessing it. The compiler turns this around and reasons that, since you are dereferencing it, it cannot have been null! Something like this (untested):

void someFunc(struct MyStruct *s) {
    if (s != NULL) {
        cout << s->someField << endl;
        delete s;
    }
}

I recall hearing that simply forming an out-of-bounds array access is UB in C++. Thus the compiler could legally assume the array index is not out of bounds, and remove checks to the contrary.

Upvotes: 3

Views: 319

Answers (2)

Asteroids With Wings

Reputation: 17464

There is no access to blockInfo[-1] in your program. Your code specifically prohibits that.


For example, the compiler might assume that b is always non-negative, since I use it to index the array, so it will optimize the if clause away.

No, it cannot do that, precisely because -1 may or may not be a valid index. The language does let you pass -1 as an index. For a container such as std::vector, it is first converted to the unsigned size_type (typically std::size_t) with the well-defined wrap-around logic that comes with doing so. For a pointer, blockInfo[-1] means *(blockInfo - 1), which is fine whenever the pointer refers to an element past the first one of some array. So there is not, and cannot be, any rule whereby the compiler is permitted to assume that you will never pass int -1 as an index.

Even if there were, it'd still make no sense to let the compiler completely ignore the if statement. If it could, if our if statements were not reliable, every program in the world would be unsafe! There'd be no way to enforce any of your operations' preconditions.


The compiler may only skip or re-order things when it can prove that doing so results in a well-defined program with the same behaviour as your original instructions, given any possible input.

In fact, this is where UB comes from: where proving correctness is really difficult, that's usually where the standard throws compilers a bone and says something is "undefined" and the compiler can just do whatever it likes.

One interesting example of this is kind of the opposite of your case, where a check is [erroneously] placed after the access, and the compiler therefore assumes the check passes, whether it actually did or not:

void foo(char* ptr)
{
   char x = *ptr;
   if (ptr)
      bar();
   else
      baz();
}

The function foo may call bar() even if ptr is null! That might sound unlikely to you, but it actually does happen (e.g. this crash in a widely-used library).
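
For contrast, a sketch of the same function with the check placed before the dereference, which is the shape your own code already has. The bodies of bar() and baz() below are hypothetical stand-ins that only count calls, and fooChecked is a made-up name.

```cpp
// Hypothetical stand-ins for bar() and baz(); they only record which one ran.
int barCalls = 0;
int bazCalls = 0;
void bar() { ++barCalls; }
void baz() { ++bazCalls; }

// The null check comes first, and the dereference happens only on the
// branch where ptr is known to be non-null, so the compiler has nothing
// to infer from the dereference about the other branch.
void fooChecked(const char* ptr)
{
   if (ptr) {
      char x = *ptr;
      (void)x;  // silence the unused-variable warning
      bar();
   } else {
      baz();
   }
}
```

With this ordering, a null ptr reaches baz() and never touches *ptr.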


could I fix it by putting the access in an else clause?

Those two pieces of code are semantically equivalent; it's the same program.
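
As a rough illustration, here are both shapes side by side. This is only a sketch under assumptions: BlockInfo is given a single hypothetical field, blockInfo is assumed to be a std::vector, and the functions return a value (with -1 as a made-up "no rows" marker) so the result can be observed.

```cpp
#include <vector>

struct BlockInfo { int size; };  // hypothetical layout, for illustration only

// Early-return form, as in the first snippet of the question.
int lastBlockSizeEarlyReturn(const std::vector<BlockInfo>& blockInfo) {
    int b = static_cast<int>(blockInfo.size()) - 1;
    if (b == -1) return -1;               // nothing to access
    const BlockInfo& bi = blockInfo[b];   // only reached when b >= 0
    return bi.size;
}

// if/else form, as in the second snippet of the question.
int lastBlockSizeElse(const std::vector<BlockInfo>& blockInfo) {
    int b = static_cast<int>(blockInfo.size()) - 1;
    if (b == -1) {
        return -1;
    } else {
        const BlockInfo& bi = blockInfo[b];
        return bi.size;
    }
}
```

In both functions the indexing expression is evaluated only when b is not -1, so neither performs an out-of-bounds access on an empty vector.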


Lastly, are there compiler flags in the spirit of -fno-strict-aliasing or -fno-delete-null-pointer-checks that make the compiler "do the obvious thing" and prevent any unwanted behavior?

The compiler already does the obvious thing, as long as "obvious" is "according to the C++ standard".
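
This is not a compiler flag, but if what you want is for a bad index to be caught rather than be UB, a bounds-checked accessor is one option. A minimal sketch, assuming blockInfo is (or can be) a std::vector; tryRead is a hypothetical helper name.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns true and stores the element in `out` if index
// i is in range, false if std::vector::at rejected it.
bool tryRead(const std::vector<int>& v, std::size_t i, int& out) {
    try {
        out = v.at(i);  // at() throws std::out_of_range when i >= v.size()
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

Unlike operator[], at() has defined behaviour for an out-of-range index, at the cost of a check on every call.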

Upvotes: 2

Potatoswatter

Reputation: 137900

the compiler might assume

If the compiler proceeds from a wrong assumption, then it's wrong and defective.

Under which circumstances is it safe to "access" an array out-of-bounds, when you do nothing with the invalid result?

It is never safe to access an array out of bounds, because that produces UB before you have a chance to use or not-use the result. However, an untaken branch in the code doesn't count as an access, as in your first or second examples. So, if I understand your last question, there's no need for a special flag.

Upvotes: 1
