Reputation: 6140
I have converted a C++ program to assembly with a high optimization level:
#include <iostream>
using namespace std;
int main()
{
    float sum=0;
    for(int i = 0; i < 10; i++)
        sum += 1.0f/float(i+1);
    cout<<sum<<endl;
    return 0;
}
via
g++ -O3 -S main.cpp
g++ -O3 main.cpp && ./a.out
The result is
2.92897
But when I look at the assembly, I cannot see where this number is located. There should be either a loop or (if it was unrolled) the final result, 2.92897. But I cannot find either in the following code:
.file "main.cpp"
.section .text.startup,"ax",@progbits
.p2align 4,,15
.globl main
.type main, @function
main:
.LFB1561:
.cfi_startproc
subq $8, %rsp
.cfi_def_cfa_offset 16
movl $_ZSt4cout, %edi
movsd .LC0(%rip), %xmm0
call _ZNSo9_M_insertIdEERSoT_
movq %rax, %rdi
call _ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
xorl %eax, %eax
addq $8, %rsp
.cfi_def_cfa_offset 8
ret
.cfi_endproc
.LFE1561:
.size main, .-main
.p2align 4,,15
.type _GLOBAL__sub_I_main, @function
_GLOBAL__sub_I_main:
.LFB2048:
.cfi_startproc
subq $8, %rsp
.cfi_def_cfa_offset 16
movl $_ZStL8__ioinit, %edi
call _ZNSt8ios_base4InitC1Ev
movl $__dso_handle, %edx
movl $_ZStL8__ioinit, %esi
movl $_ZNSt8ios_base4InitD1Ev, %edi
addq $8, %rsp
.cfi_def_cfa_offset 8
jmp __cxa_atexit
.cfi_endproc
.LFE2048:
.size _GLOBAL__sub_I_main, .-_GLOBAL__sub_I_main
.section .init_array,"aw"
.align 8
.quad _GLOBAL__sub_I_main
.local _ZStL8__ioinit
.comm _ZStL8__ioinit,1,1
.section .rodata.cst8,"aM",@progbits,8
.align 8
.LC0:
.long 0
.long 1074228871
.hidden __dso_handle
.ident "GCC: (Ubuntu 7.2.0-1ubuntu1~16.04) 7.2.0"
.section .note.GNU-stack,"",@progbits
I suspected .LC0 and 1074228871. But such a conversion, done with another piece of code, gives me 2.11612, which is a different number.
So, where is the calculation or the result in the assembly code?
Upvotes: 2
Views: 168
Reputation: 364248
The loop wasn't just unrolled, it was optimized away completely by constant-propagation. That's why main has no branching other than call.
movsd .LC0(%rip), %xmm0
(MOV Scalar Double) loads the 8-byte FP arg to cout<<sum from a static constant in .rodata, which is how most compilers normally handle FP constants.
At .LC0, we find:
.LC0:
.long 0
.long 1074228871
These pseudo-instructions assemble to 8 bytes of data. This is the integer representation of the bit pattern that means 2.92897... in IEEE 754 double precision (binary64). x86 is little-endian for FP as well as integer, so the 0 in the first (low) 4 bytes is the bottom of the significand (aka mantissa).
There's an interactive single-precision converter at https://www.h-schmidt.net/FloatConverter/IEEE754.html, but IDK of one for double where you could plug in the integer value of the bit-pattern and see it decoded as a double.
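Lacking such a tool for double, a few lines of C++ can stand in for it. This is only a sketch: it assumes the usual binary64 double on a little-endian x86 host, and it uses memcpy for the type-pun:
#include <cstdint>
#include <cstring>
#include <cstdio>

int main()
{
    // The two .long values, in the order gcc emitted them (low half first).
    uint32_t words[2] = { 0, 1074228871 };
    double d;
    std::memcpy(&d, words, sizeof d);       // reinterpret those 8 bytes as a double
    std::printf("%.17g\n", d);              // 2.9289684295654297 on little-endian x86

    // Same value via the 64-bit integer form of the whole bit pattern.
    uint64_t bits = (uint64_t)1074228871 << 32;
    std::memcpy(&d, &bits, sizeof d);
    std::printf("%g\n", d);                 // 2.92897, matching the program's output
}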
But such a conversion, done with another piece of code, gives me 2.11612, which is a different number.
You linked to code which type-puns the upper half of the bit-pattern to float (violating C++ pointer-aliasing rules, BTW; use memcpy for type-punning). You'd get the right answer if you took 1074228871ULL << 32 and type-punned that to double.
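For what it's worth, the same memcpy approach reproduces the stray number. This is a hypothetical reconstruction of what the linked code effectively does, not that code itself (which used a cast rather than memcpy):
#include <cstdint>
#include <cstring>
#include <cstdio>

int main()
{
    uint32_t hi = 1074228871;          // only the high half of the 8-byte pattern
    float f;
    std::memcpy(&f, &hi, sizeof f);    // treat those 32 bits as a float bit pattern
    std::printf("%g\n", f);            // prints about 2.1161 - the wrong interpretation
}
Shifting the value up by 32 bits first and punning the result to double, as in the earlier sketch, gives 2.9289684295654297 instead.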
clang puts asm comments on FP constants to show their value in decimal, but gcc doesn't. e.g. from the Godbolt compiler explorer: clang5.0 -O3 optimizes the loop away to the same constant, but represents it slightly differently in asm:
.LCPI0_0:
.quad 4613777869364002816 # double 2.9289684295654297
# exactly equivalent to what gcc emits,
# just different syntax for the same 8 bytes
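If you want to convince yourself that clang's .quad and gcc's pair of .long directives really do describe the same 8 bytes, a one-line compile-time check will do; the literals are just the ones from the two listings above:
#include <cstdint>

// gcc's low .long is 0 and its high .long is 1074228871;
// assembled little-endian, that's the same 64-bit pattern clang's .quad names.
static_assert(((uint64_t)1074228871 << 32 | 0) == 4613777869364002816ULL,
              "same bit pattern, different asm syntax");

int main() {}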
It's just bytes, and decimal integer is what gcc always does for all constants in compiler-generated asm, even though this is near useless for humans (much worse even than hex).
I'm not sure if GAS syntax even handles FP constants; NASM does. But as I said, it's all just bytes.
Upvotes: 5