user8552461

Decrementing stack by 24 when only 8 bytes are needed?

I have the C code:

long fib(long n) {
  if (n < 2) return 1;
  return fib(n-1) + fib(n-2);
}

int main(int argc, char** argv) {
    return 0;
}

which I compiled by running gcc -O0 -fno-optimize-sibling-calls -S file.c, yielding the following unoptimized assembly code:

    .file   "long.c"
    .text
    .globl  fib
    .type   fib, @function
fib:
.LFB5:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    pushq   %rbx
    subq    $24, %rsp
    .cfi_offset 3, -24
    movq    %rdi, -24(%rbp)
    cmpq    $1, -24(%rbp)
    jg  .L2
    movl    $1, %eax
    jmp .L3
.L2:
    movq    -24(%rbp), %rax
    subq    $1, %rax
    movq    %rax, %rdi
    call    fib
    movq    %rax, %rbx
    movq    -24(%rbp), %rax
    subq    $2, %rax
    movq    %rax, %rdi
    call    fib
    addq    %rbx, %rax
.L3:
    addq    $24, %rsp
    popq    %rbx
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE5:
    .size   fib, .-fib
    .globl  main
    .type   main, @function
main:
.LFB6:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    movl    %edi, -4(%rbp)
    movq    %rsi, -16(%rbp)
    movl    $0, %eax
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE6:
    .size   main, .-main
    .ident  "GCC: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0"
    .section    .note.GNU-stack,"",@progbits

My question is:

Why does the code decrement the stack pointer by 24 (subq $24, %rsp)? As I see it, we store only one element on the stack after the initial two pushes: the first argument n, passed in %rdi. So why don't we just decrement the stack pointer by 8 and then move n to -8(%rbp)? That is:

subq    $8, %rsp
movq    %rdi, -8(%rbp)

Upvotes: 0

Views: 169

Answers (1)

Eric Postpischil

Reputation: 222744

GCC does not fully optimize with -O0, not even its stack use. (This may aid in debugging by making some of its use of the stack more transparent to humans. For example, objects a, b, and c may share a single stack location with -O3 if their active lifetimes (defined by uses in the program, not by the model of lifetime in the C standard) do not overlap, but may have separately reserved places in the stack with -O0, which makes it easier for a human to see where a, b, and c are used in the assembly code. The wasted 16 bytes may be a side effect of this, as that space may be reserved for some purpose that this small function did not happen to use, such as saving certain registers if needed.)

Changing optimization to -O3 results in GCC subtracting only eight from the stack pointer.

Upvotes: 3
