Tomi Kyöstilä
Tomi Kyöstilä

Reputation: 1279

Is GCC broken when taking the address of an argument on ARM7TDMI?

My C code snippet takes the address of an argument and stores it in a volatile memory location (preprocessed code):

void foo(unsigned int x) {
    *(volatile unsigned int*)(0x4000000 + 0xd4) = (unsigned int)(&x);
}

int main() {
    foo(1);
    while(1);
}

I used an SVN version of GCC for compiling this code. At the end of function foo I would expect to have the value 1 stored in the stack and, at 0x40000d4, an address pointing to that value. When I compile without optimizations using the flag -O0, I get the expected ARM7TMDI assembly output (commented for your convenience):

        .align  2
        .global foo
        .type   foo, %function
foo:
        @ Function supports interworking.
        @ args = 0, pretend = 0, frame = 8
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        sub     sp, sp, #8
        str     r0, [sp, #4]     @ 3. Store the argument on the stack
        mov     r3, #67108864
        add     r3, r3, #212
        add     r2, sp, #4       @ 4. Address of the stack variable
        str     r2, [r3, #0]     @ 5. Store the address at 0x40000d4
        add     sp, sp, #8
        bx      lr
        .size   foo, .-foo
        .align  2
        .global main
        .type   main, %function
main:
        @ Function supports interworking.
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        stmfd   sp!, {r4, lr}
        mov     r0, #1           @ 1. Pass the argument in register 0
        bl      foo              @ 2. Call function foo
.L4:
        b       .L4
        .size   main, .-main
        .ident  "GCC: (GNU) 4.4.0 20080820 (experimental)"

It clearly stores the argument first on the stack and from there stores it at 0x40000d4. When I compile with optimizations using -O1, I get something unexpected:

        .align  2
        .global foo
        .type   foo, %function
foo:
        @ Function supports interworking.
        @ args = 0, pretend = 0, frame = 8
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        sub     sp, sp, #8
        mov     r2, #67108864
        add     r3, sp, #4        @ 3. Address of *something* on the stack
        str     r3, [r2, #212]    @ 4. Store the address at 0x40000d4
        add     sp, sp, #8
        bx      lr
        .size   foo, .-foo
        .align  2
        .global main
        .type   main, %function
main:
        @ Function supports interworking.
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        stmfd   sp!, {r4, lr}
        mov     r0, #1           @ 1. Pass the argument in register 0
        bl      foo              @ 2. Call function foo
.L4:
        b       .L4
        .size   main, .-main
        .ident  "GCC: (GNU) 4.4.0 20080820 (experimental)"

This time the argument is never stored on the stack even though something from the stack is still stored at 0x40000d4.

Is this just expected/undefined behaviour? Have I done something wrong or have I in fact found a Compiler Bug™?

Upvotes: 5

Views: 1585

Answers (11)

David Cary
David Cary

Reputation: 5500

Tomi Kyöstilä wrote

development for the Game Boy Advance. I was reading about its DMA system and I experimented with it by creating single-color tile bitmaps. The idea was to have the indexed color be passed as an argument to a function which would use DMA to fill a tile with that color. The source address for the DMA transfer is stored at 0x40000d4.

That's a perfectly valid thing for you to do, and I can see how the (unexpected) code you got with the -O1 optimization wouldn't work.

I see the (expected) code you got with the -O0 optimization does what you expect -- it puts value of the color you want on the stack, and a pointer to that color in the DMA transfer register.

However, even the (expected) code you got with the -O0 optimization wouldn't work, either. By the time the DMA hardware gets around to taking that pointer and using it to read the desired color, that value on the stack has (probably) long been overwritten by other subroutines or interrupt handlers or both. And so both the expected and the unexpected code result in the same thing -- the DMA is (probably) going to fetch the wrong color.

I think you really intended to store the color value in some location where it stays safe until the DMA is finished reading it. So a global variable, or a function-local static variable such as

// Warning: Three Star Programmer at work

// Warning: untested code.
void foo(unsigned int x) {
    static volatile unsigned int color = x; // "static" so it's not on the stack
    volatile unsigned int** dma_register =
        (volatile unsigned int**)(0x4000000 + 0xd4);
    *dma_register = &color;
}

int main() {
    foo(1);
    while(1);
}

Does that work for you?

You see I use "volatile" twice, because I want to force two values to be written in that particular order.

Upvotes: 1

Evan Teran
Evan Teran

Reputation: 90533

One thing to note is that according to the standard, casts are r-values. GCC used to allow it, but in recent versions has become a bit of a standards stickler.

I don't know if it will make a difference, but you should try this:

void foo(unsigned int x) {
    volatile unsigned int* ptr = (unsigned int*)(0x4000000 + 0xd4);
    *ptr = (unsigned int)(&x);
}

int main() {
    foo(1);
    while(1);
}

Also, I doubt you intended it, but you are storing the address of the function local x (which is a copy of the int you passed). You likely want to make foo take an "unsigned int *" and pass the address of what you really want to store.

So I feel a more proper solution would be this:

void foo(unsigned int *x) {
    volatile unsigned int* ptr = (unsigned int*)(0x4000000 + 0xd4);
    *ptr = (unsigned int)(x);
}

int main() {
    int x = 1;
    foo(&x);
    while(1);
}

EDIT: finally, if you code breaks with optimizations it is usually a sign that your code is doing something wrong.

Upvotes: 3

old_timer
old_timer

Reputation: 71586

I think Even T. has the answer. You passed in a variable, you cannot take the address of that variable inside the function, you can take the address of a copy of that variable though, btw that variable is typically a register so it doesnt have an address. Once you leave that function its all gone, the calling function loses it. If you need the address in the function you have to pass by reference not pass by value, send the address. It looks to me that the bug is in your code, not gcc.

BTW, using *(volatile blah *)0xabcd or any other method to try to program registers is going to bite you eventually. gcc and most other compilers have this uncanny way of knowing exactly the worst time to strike.

Say the day you change from this

*(volatile unsigned int *)0x12345 = someuintvariable;

to

*(volatile unsigned int *)0x12345 = 0x12;

A good compiler will realize that you are only storing 8 bits and there is no reason to waste a 32 bit store for that, depending on the architecture you specified, or the default architecture for that compiler that day, so it is within its rights to optimize that to an strb instead of an str.

After having been burned by gcc and others with this dozens of times I have resorted to forcing the issue:

.globl PUT32
PUT32:
   str r1,[r0]
   bx lr





   PUT32(0x12345,0x12);

Costs a few extra clock cycles but my code continues to work yesterday, today, and will work tomorrow with any optimization flag. Not having to re-visit old code and sleeping peacefully through the night is worth a few extra clock cycles here and there.

Also if your code breaks when you compile for release instead of compile for debug, that also means it is most likely a bug in your code.

Upvotes: 0

zaphod
zaphod

Reputation: 4721

Once you return from foo(), x is gone, and any pointers to it are invalid. Subsequently using such a pointer results in what the C standard likes to call "undefined behavior," which means the compiler is absolutely allowed to assume you won't dereference it, or (if you insist on doing it anyway) need not produce code that does anything remotely like what you might expect. If you want the pointer to x to remain valid after foo() returns, you must not allocate x on foo's stack, period -- even if you know that in principle, nothing has any reason to clobber it -- because that just isn't allowed in C, no matter how often it happens to do what you expect.

The simplest solution might be to make x a local variable in main() (or in whatever other function has a sufficiently long-lived scope) and to pass the address in to foo. You could also make x a global variable, or allocate it on the heap using malloc(), or set aside memory for it in some more exotic way. You can even try to figure out where the top of the stack is in some (hopefully) more portable way and explicitly store your data in some part of the stack, if you're sure you won't be needing for anything else and you're convinced that's what you really need to do. But the method you've been using to do that isn't sufficiently reliable, as you've discovered.

Upvotes: 7

flolo
flolo

Reputation: 15526

In general I would say, that it is a valid optimization. If you want to look deeper into it, you could compile with -da This generates a .c.Number.Passname, where you can have a look at the rtl (intermediate representation within the gcc). There you can see which pass makes which optimization (and maybe disable just the one, you dont want to have)

Upvotes: 0

Adam Davis
Adam Davis

Reputation: 93615

So you're putting the address of a local stack variable into the DMA controller to be used, and then you're returning from the function where the stack variable is available?

While this might work with your main() example (since you aren't writing on the stack again) it won't work in a 'real' program later - that value will be overwritten before or while DMA is accessing it when another function is called and the stack is used again.

You need to have a structure, or a global variable you can use to store this value while the DMA accesses it - otherwise it's just going to get clobbered!

-Adam

Upvotes: 4

Ben Combee
Ben Combee

Reputation: 17437

I actually don't think the compiler is wrong, although this is an odd case.

From a code analysis point-of-view, it sees you storing the address of a variable, but that address is never dereferenced and you don't jump outside of the function to external code that could use that address you stored. When you exit the function, the address of the stack is now considered bogus, since its the address of a variable that no longer exists.

The "volatile" keyword really doesn't do much in C, especially with regards to multiple threads or hardware. It just tells the compiler that it has to do the access. However, since there's no users of the value of x according to the data flow, there's no reason to store the "1" on the stack.

It probably would work if you wrote

void foo(unsigned int x) {
    volatile int y = x;
    *(volatile unsigned int*)(0x4000000 + 0xd4) = (unsigned int)(&y);
}

although it still may be illegal code, since the address of y is considered invalid as soon as foo returns, but the nature of the DMA system would be to reference that location independently of the program flow.

Upvotes: 4

Alan H
Alan H

Reputation: 3111

Not an answer, but just some more info for you. We are running 3.4.5 20051201 (Red Hat 3.4.5-2) at my day job.

We have also noticed some of our code (which I can't post here) stops working when we add the -O1 flag. Our solution was to remove the flag for now :(

Upvotes: 0

Tomi Kyöstilä
Tomi Kyöstilä

Reputation: 1279

sparkes wrote

If you think you have found a bug in GCC the mailing lists will be glad you dropped by but generally they find some hole in your knowledge is to blame and mock mercilessly :(

I figured I'd try my luck here first before going to the GCC mailing list to show my incompetence :)


Adam Davis wrote

Out of curiosity, what are you trying to accomplish?

I was trying out development for the Game Boy Advance. I was reading about its DMA system and I experimented with it by creating single-color tile bitmaps. The idea was to have the indexed color be passed as an argument to a function which would use DMA to fill a tile with that color. The source address for the DMA transfer is stored at 0x40000d4.


Will Dean wrote

Personally, I might do try this to see if it compiled better:

void foo(unsigned int x) 
{
    volatile unsigned int* pArg = &x;
    *(volatile unsigned int*)(0x4000000 + 0xd4) = (unsigned int)pArg;
}

With -O0 that works as well and with -O1 that is optimized to the exact same -O1 assembly I've posted in my question.

Upvotes: 0

Will Dean
Will Dean

Reputation: 39530

I'm darned if I can find a reference at the moment, but I'm 99% sure that you are always supposed to be able to take the address of an argument, and it's up to the compiler to finesse the details of calling conventions, register usage, etc.

Indeed, I would have thought it to be such a common requirement that it's hard to see there can be general problem in this - I wonder if it's something about the volatile pointers which have upset the optimisation.

Personally, I might do try this to see if it compiled better:

void foo(unsigned int x) 
{
    volatile unsigned int* pArg = &x;
    *(volatile unsigned int*)(0x4000000 + 0xd4) = (unsigned int)pArg;
}

Upvotes: 1

sparkes
sparkes

Reputation: 19523

Is this just expected/undefined behaviour? Have I done something wrong or have I in fact found a Compiler Bug™?

No bug just the defined behaviour that optimisation options can produce odd code which might not work :)

EDIT:

If you think you have found a bug in GCC the mailing lists will be glad you dropped by but generally they find some hole in your knowledge is to blame and mock mercilessly :(

In this case I think it's probably the -O options attempting shortcuts that break your code that need working around.

Upvotes: -2

Related Questions