edhu

Reputation: 451

Assembly code generalization

I encountered the following code in my computer architecture class:

void mystery( long A[], long B[], long n )
{
  long i;
  for ( i = 0; i < n; i++ ) {
    B[i] = A[n-(i+1)];
  }
}

My professor showed the corresponding assembly code that GCC generates on an Ubuntu machine, and he seemed to be confused by it as well:

mystery:
    pushq   %rbp
    movq    %rsp, %rbp
    movq    %rdi, -24(%rbp)
    movq    %rsi, -32(%rbp)
    movq    %rdx, -40(%rbp)
    movq    $0, -8(%rbp)
    jmp .L2
.L3:
    movq    -8(%rbp), %rax
    leaq    0(,%rax,8), %rdx
    movq    -32(%rbp), %rax
    addq    %rax, %rdx
    movq    -8(%rbp), %rax
    notq    %rax
    movq    %rax, %rcx
    movq    -40(%rbp), %rax
    addq    %rcx, %rax
    leaq    0(,%rax,8), %rcx
    movq    -24(%rbp), %rax
    addq    %rcx, %rax
    movq    (%rax), %rax
    movq    %rax, (%rdx)
    addq    $1, -8(%rbp)
.L2:
    movq    -8(%rbp), %rax
    cmpq    -40(%rbp), %rax
    jl  .L3
    popq    %rbp
    ret

But I can't understand why the compiler generates this code. It appears that A, B, and n are stored on the stack, yet the stack pointer %rsp never changes its value. Also, the slot at -16(%rbp) seems to be allocated, but no value is ever written to it. Is there any reason GCC behaves this way?

Upvotes: 0

Views: 195

Answers (1)

Vittorio Romeo

Reputation: 93264

Compiler Explorer (godbolt.org) is a great tool for looking at the assembly generated by various compilers with different flags. Here's what g++ 7 with -O2 produces for your code:

mystery(long*, long*, long):
        test    rdx, rdx
        jle     .L1
        lea     rax, [rdi-8+rdx*8]
        sub     rdi, 8
.L3:
        mov     rdx, QWORD PTR [rax]
        sub     rax, 8
        add     rsi, 8
        mov     QWORD PTR [rsi-8], rdx
        cmp     rax, rdi
        jne     .L3
.L1:
        rep ret
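
To make the optimized listing easier to read, here is a rough C sketch of what the -O2 loop appears to do: instead of recomputing n-(i+1) on every iteration, it walks A backwards with one pointer and B forwards with another. The name mystery_ptr and the self-check helper are made up for illustration, and the sketch is not an instruction-for-instruction translation (the assembly compares against A - 1, while the C version stops at A to stay within defined pointer arithmetic).

```c
#include <assert.h>

/* The function from the question, unchanged. */
void mystery(long A[], long B[], long n)
{
    long i;
    for (i = 0; i < n; i++) {
        B[i] = A[n - (i + 1)];
    }
}

/* Hypothetical hand-written counterpart of the -O2 loop: walk A backwards
   with one pointer and B forwards with another, no index arithmetic. */
void mystery_ptr(const long *A, long *B, long n)
{
    const long *src;
    if (n <= 0)              /* test rdx, rdx / jle .L1 */
        return;
    src = A + n;             /* one past the last element of A */
    while (src != A) {       /* plays the role of cmp rax, rdi / jne .L3 */
        *B++ = *--src;       /* load from A going down, store to B going up */
    }
}

/* Small self-check: both versions must fill B identically. */
void check_mystery_ptr(void)
{
    long a[5] = {10, 20, 30, 40, 50};
    long b1[5] = {0}, b2[5] = {0};
    int k;
    mystery(a, b1, 5);
    mystery_ptr(a, b2, 5);
    for (k = 0; k < 5; k++)
        assert(b1[k] == b2[k]);
}
```

Calling check_mystery_ptr() from a main function should pass silently if the two versions agree.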

To answer your question: compiling with optimizations disabled usually produces unexpected or less sensible output. "Why?" is a difficult question to answer, as it depends heavily on how the compiler is implemented.
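
As a reading aid, the unoptimized listing in the question can be transcribed into C almost line by line. The sketch below is an interpretation, not GCC's internal view, and the name mystery_unoptimized is made up. Two things it tries to show: every variable lives in a stack slot and is reloaded on each use, and the index n-(i+1) is computed as n + ~i, since ~i equals -(i+1) in two's complement. On x86-64 Linux the ABI lets a leaf function use a small area below %rsp (the red zone) without adjusting %rsp, which is consistent with the stores at negative offsets from %rbp.

```c
#include <assert.h>

/* Hypothetical C transcription of the -O0 listing in the question. */
void mystery_unoptimized(long A[], long B[], long n)
{
    long i = 0;                 /* movq $0, -8(%rbp) */
    goto test;                  /* jmp .L2 */
body:                           /* .L3 */
    {
        long *dst = B + i;      /* leaq 0(,%rax,8),%rdx ; addq B  -> &B[i] */
        long idx = n + ~i;      /* notq ; addq n  -> ~i == -(i+1), idx == n-(i+1) */
        long *src = A + idx;    /* leaq 0(,%rax,8),%rcx ; addq A  -> &A[idx] */
        *dst = *src;            /* movq (%rax),%rax ; movq %rax,(%rdx) */
        i += 1;                 /* addq $1, -8(%rbp) */
    }
test:                           /* .L2 */
    if (i < n)                  /* cmpq -40(%rbp),%rax ; jl .L3 */
        goto body;
}

/* Small self-check: the transcription must reverse A into B. */
void check_mystery_unoptimized(void)
{
    long a[3] = {7, 8, 9};
    long b[3] = {0, 0, 0};
    mystery_unoptimized(a, b, 3);
    assert(b[0] == 9 && b[1] == 8 && b[2] == 7);
}
```

The scaling by 8 in the leaq instructions is sizeof(long) on this platform; in the C version, pointer arithmetic does that scaling implicitly.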

Here's a screenshot showing a comparison of -O2, -O0 and -Ofast:

[Image: comparison]

Try it out here: https://godbolt.org/g/pQ637a

Upvotes: 1
