dhein

Reputation: 6555

Undefined behavior at its best: is it a boundary violation, bad pointer arithmetic, or just ignoring aliasing?

I have been working with C99 for some weeks now, focusing on undefined behavior. I wanted to test some strange code while trying to respect the rules. The result was this code:

(Please forgive the variable names; I had eaten a clown.)

#include <stdio.h>  // printf
#include <stdlib.h> // abs

int main(int arg, char** argv)
{

    unsigned int uiDiffOfVars;
    int LegalPointerCast1, LegalPointerCast2, signedIntToRespectTheRules;
    char StartVar; // only used to have an address from where we can move on
    char *TheAccesingPointer;
    int iTargetOfPointeracces;

    iTargetOfPointeracces= 0x55555555;

    TheAccesingPointer = (char *) &StartVar;
    LegalPointerCast2 = (int) &StartVar;
    LegalPointerCast1 = (int) &iTargetOfPointeracces;

    if ((0x80000000 & LegalPointerCast2) != (0x80000000 & LegalPointerCast1))
    {
        // As I'm not sure how
        // "A pointer is converted to other than an integer or pointer type (6.5.4)." treats unsigned integers,
        // I'm checking this way.
        printf ("try it on next machine!\r\n");
        return 1;
    }


    if ((abs (LegalPointerCast1) > abs (LegalPointerCast2)))
        uiDiffOfVars = abs (LegalPointerCast1) - abs (LegalPointerCast2);
    else
        uiDiffOfVars = abs (LegalPointerCast2) - abs (LegalPointerCast1);

    LegalPointerCast2 = (int) TheAccesingPointer;
    signedIntToRespectTheRules = abs ((int) uiDiffOfVars);

    if ((abs (LegalPointerCast1) > abs (LegalPointerCast2)))
        TheAccesingPointer = (char *)(LegalPointerCast2 + signedIntToRespectTheRules);
    else
        TheAccesingPointer = (char *)(LegalPointerCast2 - signedIntToRespectTheRules);

    printf ("%c\r\n", *TheAccesingPointer); // Will the output be a 'U'?

    return 0;
}

So this code is undefined behavior at its best. I get different results, even though I'm not accessing any memory area that I don't own, nor accessing any uninitialized memory (afaik).

The first critical rule was that I'm not allowed to add to or subtract from pointers in a way that makes them leave their array bounds. But I'm allowed to cast a pointer to an integer, and with that I can calculate as I want, can't I?

My second assumption was that, since I'm allowed to assign a valid address to a pointer, it is a valid operation to assign this calculated address to a pointer. Since I'm working with a char pointer, there is also no violation of the strict aliasing rules, as a char* is allowed to alias anything.

So which rule is broken, such that this causes UB?

Are single variables also to be understood as "arrays", and am I breaking this rule?

— Addition or subtraction of a pointer into, or just beyond, an array object and an integer type produces a result that does not point into, or just beyond, the same array object (6.5.6).

If so, am I also allowed to do this?

int var;
int *ptr;
ptr = &var;
ptr = ptr + 1;

Because the result is almost certainly undefined behavior. Compiled with MSVC2010 it prints the expected "U", but on FreeBSD using clang and gcc I get pretty funny and different results each time, depending on the optimization level (which in my eyes shouldn't happen as long as the behavior is defined).

So any ideas what is causing these nasal dragons?

Upvotes: 0

Views: 285

Answers (2)

supercat

Reputation: 81257

An implementation may only define uintptr_t and intptr_t if it can guarantee two things:

  1. The act of converting a valid or null pointer to one of those types will yield defined behavior;

  2. If some value q of that type is numerically equal to the result of such a conversion, and if the object identified by the converted pointer still exists, converting q back to the original pointer type will yield a pointer which compares equal to the original.

If uintptr_t is a 64-bit unsigned integer type, code may convert any valid pointer to uintptr_t and operate upon it just like any other 64-bit unsigned integer, without regard for the size of the original object or anything else. On the other hand, converting the result of such conversion back to the pointer type would, from the point of view of the Standard, only yield defined behavior in cases where the resulting number matched the result of an earlier conversion from a still-valid pointer to uintptr_t.

Note, btw, that many implementations document the relationship between pointers and uintptr_t values to an extent far beyond what the Standard requires, but that does not imply that code making use of such knowledge will actually work. For example, given the code:

static int x,y;
int test(void)
{
  int *p = outsideFunction(&x);
  y=1;
  *p=5;
  return y;
}

some implementations might document means by which a programmer could ascertain the relative displacements of x and y. Even such implementations, however, might generate code that assumes that the write to *p can't possibly affect y, since it is a static object which never has its address taken.

Upvotes: 0

Bryan Olivier

Reputation: 5307

You are basically running into 6.3.2.3 (Pointers), paragraph 5, in the conversion from int to char* in the assignment to TheAccesingPointer:

An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.

The use of all the abs calls makes it very dependent on the actual implementation what happens. Basically it will only work if the addresses of StartVar and iTargetOfPointeracces are non-negative once converted to int; if both have bit 31 set, abs reverses their order and the code steps away from the target instead of towards it. If you drop all occurrences of abs I think you will get 'U' on most if not all architectures and with most if not all compilers.

Ironically this is not undefined behavior but implementation-defined behavior. But when you don't get 'U', TheAccesingPointer is not pointing to an entity of the referenced type; most likely it is not pointing to an entity at all.

If it is not pointing to an entity then (of course) you will run into undefined behavior when dereferencing it in the printf, following 6.5.3.2, paragraph 4:

The unary * operator denotes indirection. If the operand points to a function, the result is a function designator; if it points to an object, the result is an lvalue designating the object. If the operand has type "pointer to type", the result has type "type". If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined.

Let's elaborate two scenarios where all addresses on the stack have bit 31 set, which is quite common under Linux.

Scenario A: Suppose &StartVar < &iTargetOfPointeracces then

  abs(LegalPointerCast1) - abs(LegalPointerCast2)
= LegalPointerCast2 - LegalPointerCast1 (by both < 0)
= (char*)(&StartVar) - (char*)(&iTargetOfPointeracces)
< 0 (by &StartVar < &iTargetOfPointeracces)
So abs(LegalPointerCast1) > abs(LegalPointerCast2) is false and both ifs take the else branch:
uiDiffOfVars = abs(LegalPointerCast2) - abs(LegalPointerCast1)
= (char*)(&iTargetOfPointeracces) - (char*)(&StartVar)
and signedIntToRespectTheRules = uiDiffOfVars (by (int)uiDiffOfVars > 0)
thus TheAccesingPointer
= (char *)(&StartVar - ((char*)(&iTargetOfPointeracces) - (char*)(&StartVar)))
= (char *)(2*(char*)&StartVar - (char*)(&iTargetOfPointeracces))

So in this scenario the pointer moves the right distance in the wrong direction, and you will not get 'U'.

Scenario B: Suppose &StartVar > &iTargetOfPointeracces then

  abs(LegalPointerCast1) - abs(LegalPointerCast2)
= LegalPointerCast2 - LegalPointerCast1 (by both < 0)
= (char*)(&StartVar) - (char*)(&iTargetOfPointeracces)
> 0 (by &StartVar > &iTargetOfPointeracces)
So abs(LegalPointerCast1) > abs(LegalPointerCast2) is true and both ifs take the first branch:
uiDiffOfVars = (char*)(&StartVar) - (char*)(&iTargetOfPointeracces)
and signedIntToRespectTheRules = uiDiffOfVars (by (int)uiDiffOfVars > 0)
thus TheAccesingPointer
= (char *)(&StartVar + (char*)(&StartVar) - (char*)(&iTargetOfPointeracces))
= (char *)(2*(char*)&StartVar - (char*)(&iTargetOfPointeracces))

In both scenarios it is very unlikely that TheAccesingPointer is pointing to the intended entity, so undefined behavior is triggered in dereferencing this pointer. By contrast, if bit 31 is clear in both addresses, abs changes nothing, the computed pointer equals &iTargetOfPointeracces in either order, and you get 'U'. So my point is that the calculation of TheAccesingPointer is implementation-defined, where the above calculations are very common. If the computed pointer is not pointing to iTargetOfPointeracces, as in scenarios A and B, undefined behavior is triggered.

Different optimization levels may result in a different order and spacing of StartVar and iTargetOfPointeracces on the stack, which changes where the computed pointer lands, and that may explain the different results for different optimization levels.

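As for single variables: for pointer arithmetic they do count as an array of length one (C99 6.5.6, paragraph 7), so ptr + 1 in your second snippet is a valid one-past-the-end pointer as long as you never dereference it.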

Upvotes: 2
