anekix

Reputation: 2563

if statement in assembly output of C code

I have this simple piece of code in C:

#include <stdio.h>

void test() {}

int main()
{
    if (2 < 3) {

        int zz = 10;
    }
    return 0;
}

When I look at the assembly output of this code:

test():
  pushq %rbp
  movq %rsp, %rbp
  nop
  popq %rbp
  ret
main:
  pushq %rbp
  movq %rsp, %rbp
  movl $10, -4(%rbp) // 10 is stored into zz's slot on the stack
  movl $0, %eax
  popq %rbp
  ret

I got the assembly from here (default options). I can't see where the instruction for the conditional check is.

Upvotes: 0

Views: 813

Answers (6)

Peter Cordes

Reputation: 364180

The interesting thing here is that gcc and clang optimize away the if() even at -O0, unlike some other compilers (ICC and MSVC).

gcc -O0 doesn't mean no optimization; it means no extra optimization beyond what's needed to compile at all. gcc still has to transform the function logic through a couple of internal representations (GIMPLE and Register Transfer Language) before emitting asm. gcc doesn't have a special "dumb mode" where it slavishly transliterates every part of every C expression to asm.

Even a super-simple one-pass compiler like TCC does minor optimizations within an expression (or even a statement), like realizing that an always-true condition doesn't require branching.

gcc -O0 is the default, which you obviously used because the dead store to zz isn't optimized away.

gcc -O0 aims to compile quickly, and to give consistent debugging results.

  • Every C variable exists in memory, whether it's ever used or not.

  • Nothing is kept in registers across C statements (except variables declared register; -O0 is the only time that keyword does anything), i.e. everything is spilled and reloaded between separate C statements. So you can modify any C variable with a debugger while single-stepping. See also Why does clang produce inefficient asm with -O0 (for this simple floating point sum)? (This is why benchmarking at -O0 is nonsense: writing the same code with fewer, larger expressions is faster only at -O0, not with real settings like -O3.)

    Other interesting consequences: constant-propagation doesn't work, see Why does integer division by -1 (negative one) result in FPE? for a case where gcc uses div for a variable set to a constant, vs. something simpler for a literal constant.

  • Every statement is compiled independently, so you can even jump to a different source line (within the same function) using GDB and get consistent results. (Unlike in optimized code where that would be likely to crash or give nonsense, and definitely not match the C abstract machine).
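
As a rough illustration of the spill/reload point in the list above, here is a minimal sketch with two made-up functions (the names are hypothetical, not from the question). They compute the same value, but at -O0 one would expect gcc to store t to the stack after the first statement and reload it in the second, while the single-expression version can keep the intermediate in a register. The exact instructions depend on compiler version and target, so compare them on Godbolt rather than treating this as guaranteed output.

```c
#include <assert.h>

/* Two statements: at -O0 the temporary t is a real stack variable,
   so it is expected to be stored after the first statement and
   reloaded by the second. */
int scaled_sum_split(int a, int b, int c)
{
    int t = a + b;
    return t * c;
}

/* One expression: the intermediate a + b has no C variable of its own,
   so even at -O0 it can stay in a register. */
int scaled_sum_single(int a, int b, int c)
{
    return (a + b) * c;
}

/* Both versions have the same observable behaviour. */
void check_scaled_sum(void)
{
    assert(scaled_sum_split(1, 2, 3) == 9);
    assert(scaled_sum_single(1, 2, 3) == 9);
}
```

With optimization enabled (for example -O2), both functions should compile to the same code, which is why timing differences between them at -O0 say nothing about real builds.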

Given all those requirements for gcc -O0 behaviour, if (2 < 3) can still be optimized to zero asm instructions. The behaviour doesn't depend on the value of any variable, and it's a single statement. There's no way it can ever be not-taken, so the simplest way to compile it is no instructions: fall-through into the { body } of the if.
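
To see where the line is drawn, a small sketch (function names are made up for illustration) contrasting a condition built only from literals with one that reads a variable. For the second function, gcc -O0 is expected to emit a load of a from the stack followed by a compare and a conditional jump, because a debugger could change a before the if statement runs. That is an assumption about typical output, worth checking for your compiler version.

```c
#include <assert.h>

/* Condition uses only literals: gcc and clang emit no compare or
   branch for it, even at -O0. */
int literal_condition(void)
{
    int zz = 0;
    if (2 < 3) {
        zz = 10;
    }
    return zz;
}

/* Condition reads a variable: at -O0 the value of a lives in memory,
   so the compiler is expected to reload it and emit cmp + jump. */
int variable_condition(void)
{
    int a = 2;
    int zz = 0;
    if (a < 3) {
        zz = 10;
    }
    return zz;
}

/* Same result either way; only the generated asm differs. */
void check_conditions(void)
{
    assert(literal_condition() == 10);
    assert(variable_condition() == 10);
}
```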

Note that gcc -O0's rules / restrictions go far beyond the C as-if rule that the machine-code for a function merely has to implement all externally-visible behaviour of the C source. gcc -O3 optimizes the whole function down to just

main:                 # with optimization
    xor    eax, eax
    ret

because it doesn't care about keeping asm for every C statement.


Other compilers:

See all 4 of the major x86 compilers on Godbolt.

clang is similar to gcc, but with a dead store of 0 to another spot on the stack, as well as the 10 for zz. clang -O0 is often closer to a transliteration of C into asm, for example it will use div for x / 2 instead of a shift, while gcc uses a multiplicative inverse for division by a constant even at -O0. But in this case, clang also decides that no instructions are sufficient for an always-true condition.
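
For readers who haven't met the term, here is a minimal sketch of what a multiplicative inverse for division by a constant looks like, written out by hand in C. The function names are hypothetical, and the constant shown is the standard one for unsigned 32-bit division by 10; the constant and instruction sequence a particular gcc version actually emits may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Plain division by a constant; the compiler decides how to implement it. */
uint32_t div10_plain(uint32_t x)
{
    return x / 10;
}

/* The same quotient via a fixed-point multiplicative inverse:
   0xCCCCCCCD is ceil(2^35 / 10), so (x * 0xCCCCCCCD) >> 35 equals
   x / 10 for every 32-bit unsigned x. The multiply is done in 64 bits
   so the product cannot overflow. */
uint32_t div10_inverse(uint32_t x)
{
    return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}

/* Spot-check a few values, including the boundaries. */
void check_div10(void)
{
    assert(div10_inverse(0u) == div10_plain(0u));
    assert(div10_inverse(9u) == div10_plain(9u));
    assert(div10_inverse(10u) == div10_plain(10u));
    assert(div10_inverse(UINT32_MAX) == div10_plain(UINT32_MAX));
}
```

The point of the trick is that a multiply plus a shift is much cheaper than a div instruction on most x86 CPUs.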

ICC and MSVC both emit asm for the branch, but instead of the mov $2, %ecx / cmp $3, %ecx you might expect, they both actually do 0 != 1 for no apparent reason:

# ICC18
    pushq     %rbp                                          #6.1
    movq      %rsp, %rbp                                    #6.1
    subq      $16, %rsp                                     #6.1

    movl      $0, %eax                                      #7.5
    cmpl      $1, %eax                                      #7.5
    je        ..B1.3        # Prob 100%                     #7.5

    movl      $10, -16(%rbp)                                #9.16
..B1.3:                         # Preds ..B1.2 ..B1.1
    movl      $0, %eax                                      #11.12
    leave                                                   #11.12
    ret                                                     #11.12

MSVC uses the xor-zeroing peephole optimization even without optimization enabled.

It's slightly interesting to look at which local / peephole optimizations compilers do even at -O0, but it doesn't tell you anything fundamental about C language rules or your code, it just tells you about compiler internals and the tradeoffs the compiler devs chose between spending time looking for simple optimizations vs. compiling even faster in no-optimization mode.

The asm is never intended to faithfully represent the C source in any kind of way that would let a decompiler reconstruct it. Just to implement equivalent logic.

Upvotes: 4

klutt

Reputation: 31366

It's simple. It is not there. The compiler optimized it away.

Here is the assembly when compiling with gcc without optimization:

    .file   "k.c"
    .text
    .globl  test
    .type   test, @function
test:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    nop
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size   test, .-test
    .globl  main
    .type   main, @function
main:
.LFB1:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    movl    $10, -4(%rbp)
    movl    $0, %eax
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE1:
    .size   main, .-main
    .ident  "GCC: (Debian 6.3.0-18) 6.3.0 20170516"
    .section    .note.GNU-stack,"",@progbits

and here it is with optimization:

    .file   "k.c"
    .text
    .p2align 4,,15
    .globl  test
    .type   test, @function
test:
.LFB11:
    .cfi_startproc
    rep ret
    .cfi_endproc
.LFE11:
    .size   test, .-test
    .section    .text.startup,"ax",@progbits
    .p2align 4,,15
    .globl  main
    .type   main, @function
main:
.LFB12:
    .cfi_startproc
    xorl    %eax, %eax
    ret
    .cfi_endproc
.LFE12:
    .size   main, .-main
    .ident  "GCC: (Debian 6.3.0-18) 6.3.0 20170516"
    .section    .note.GNU-stack,"",@progbits

As you can see, not only is the comparison optimized away, almost the whole of main is optimized away, since it does not produce anything visible. The variable zz is never used. The only observable thing your code does is return 0.
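
A hedged sketch of how observability changes what the optimizer may delete (the function names are made up for illustration). In the first function the store to zz is dead, as in the question. In the second the value reaches the return statement, so 10 has to survive in some form. In the third, zz is volatile; accesses to volatile objects count as observable behaviour in C, so compilers are expected to keep the store even at -O3.

```c
#include <assert.h>

/* zz is never read: an optimizing compiler may delete the store. */
int dead_store(void)
{
    if (2 < 3) {
        int zz = 10;
        (void)zz;   /* only silences an unused-variable warning */
    }
    return 0;
}

/* zz feeds the return value, so the 10 must survive, typically
   folded into a constant return. */
int live_value(void)
{
    int zz = 0;
    if (2 < 3) {
        zz = 10;
    }
    return zz;
}

/* volatile makes the accesses themselves observable, so the store
   (and the read below) are expected to stay in the output. */
int volatile_store(void)
{
    if (2 < 3) {
        volatile int zz = 10;
        (void)zz;   /* this is a real read of a volatile object */
    }
    return 0;
}

/* The return values are the same at every optimization level. */
void check_observable(void)
{
    assert(dead_store() == 0);
    assert(live_value() == 10);
    assert(volatile_store() == 0);
}
```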

Upvotes: 2

P.P

Reputation: 121387

The condition if (2<3) is always true. So a decent compiler will detect this and generate code as if the condition didn't exist. In fact, if you compile it with -O3 on godbolt.org, the compiler generates just:

test():
  rep ret
main:
  xor eax, eax
  ret

This is again valid because a compiler is allowed to optimise and transform the code as long as the observable behaviour is preserved.

Upvotes: 0

You don't see it, because it isn't there. The compiler was able to perform analysis, and rather easily see that this branch will always be entered.

Instead of emitting a check that will do nothing but waste CPU cycles, it emits an easily optimized version of the code.

A C program is not a sequence of instructions for the CPU to perform. That's what the emitted machine code is. A C program is a description of the behavior your compiled program should have. A compiler is free to translate it in almost any way it wants, so long as you get that behavior.

It's known as "the as-if rule".

Upvotes: 7

notan

Reputation: 389

if (2<3)

is always true, therefore the compiler emits no opcode for it.

Upvotes: 0

J_P

Reputation: 781

2 is always less than 3, so the compiler knows the result of 2<3 is always true and there is no need for an if decision in the assembly.

The goal of the optimization is to generate faster and smaller code.

Upvotes: 1
