Reputation: 4027
So I have two functions: one just casts from double to int64_t, the other calls std::round:
std::int64_t my_cast(double d)
{
    auto t = static_cast<std::int64_t>(d);
    return t;
}

std::int64_t my_round(double d)
{
    auto t = std::round(d);
    return t;
}
They work correctly: cast(3.64) = 3 and round(3.64) = 4. But when I look at the assembly, they seem to be doing the same thing. So I am wondering how they get different results.
$ g++ -std=c++1y -c -O3 ./round.cpp -o ./round.o
$ objdump -dS ./round.o
./round.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <_Z7my_castd>:
0: f2 48 0f 2c c0 cvttsd2si %xmm0,%rax
5: c3 retq
6: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
d: 00 00 00
0000000000000010 <_Z8my_roundd>:
10: 48 83 ec 08 sub $0x8,%rsp
14: e8 00 00 00 00 callq 19 <_Z7my_castd+0x19> <========!!!
19: 48 83 c4 08 add $0x8,%rsp
1d: f2 48 0f 2c c0 cvttsd2si %xmm0,%rax
22: c3 retq
Disassembly of section .text.startup:
0000000000000030 <_GLOBAL__sub_I__Z7my_castd>:
30: 48 83 ec 08 sub $0x8,%rsp
34: bf 00 00 00 00 mov $0x0,%edi
39: e8 00 00 00 00 callq 3e <_GLOBAL__sub_I__Z7my_castd+0xe>
3e: ba 00 00 00 00 mov $0x0,%edx
43: be 00 00 00 00 mov $0x0,%esi
48: bf 00 00 00 00 mov $0x0,%edi
4d: 48 83 c4 08 add $0x8,%rsp
51: e9 00 00 00 00 jmpq 56 <_Z8my_roundd+0x46>
I am not sure what the callq at offset 14 is for, but even with that, my_cast and my_round both seem to be doing just a cvttsd2si, which I believe is a conversion with truncation.
However, as I mentioned earlier, the two functions produce different (correct) values for the same input (say 3.64).
What is happening?
Upvotes: 24
Views: 12432
Reputation: 2373
When dumping an object file with objdump -d, it is quite important to add the option -r, which tells the utility to also dump relocations:
$ objdump -dr round.o
...
0000000000000010 <_Z8my_roundd>:
10: 48 83 ec 28 sub $0x28,%rsp
14: e8 00 00 00 00 callq 19 <_Z8my_roundd+0x9>
15: R_X86_64_PC32 _ZSt5roundd
19: 48 83 c4 28 add $0x28,%rsp
1d: f2 48 0f 2c c0 cvttsd2si %xmm0,%rax
Now, notice the new line that appeared. That is a relocation entry embedded in the object file. It instructs the linker to add the distance between _Z8my_roundd+0x9 and _ZSt5roundd to the value found at offset 15.
The e8 at offset 14 is the opcode for a relative call. The following 4 bytes must contain the IP-relative offset to the function being called (with the IP, at the moment of execution, pointing to the next instruction). Because the compiler cannot know that distance, it leaves those bytes filled with zeroes and emits a relocation so that the linker can fill them in later.
When disassembling without the -r option, relocations are ignored, and that creates the illusion that the function _Z8my_roundd makes a call into the middle of itself.
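To make the arithmetic concrete, here is a minimal sketch of how the 4-byte displacement of an e8 call relates to the call site and its target. The helper names call_rel32 and call_target are made up for illustration, and the sketch ignores how a real linker derives the value from the relocation record; it only shows the "relative to the next instruction" rule described above.

```cpp
#include <cstdint>

// Hypothetical helper: the displacement stored in the 4 bytes after the
// e8 opcode is the target address minus the address of the NEXT instruction.
std::int32_t call_rel32(std::uint64_t call_addr, std::uint64_t target_addr)
{
    // An e8 call is 5 bytes long: 1 opcode byte + 4 displacement bytes.
    const std::uint64_t next_insn = call_addr + 5;
    return static_cast<std::int32_t>(target_addr - next_insn);
}

// Hypothetical helper: where a call with the given displacement lands.
std::uint64_t call_target(std::uint64_t call_addr, std::int32_t rel32)
{
    return call_addr + 5 + static_cast<std::int64_t>(rel32);
}
```

With the unpatched, all-zero displacement from the object file, call_target(0x14, 0) gives 0x19, which is exactly the bogus "callq 19" the disassembler prints. If the linker were to place the callee at, say, 0x100 (an address chosen purely for illustration), call_rel32(0x14, 0x100) would give 0xe7.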
Upvotes: 13
Reputation: 41301
Assembly output is more useful (g++ ... -S && cat round.s):
...
_Z7my_castd:
.LFB225:
.cfi_startproc
cvttsd2siq %xmm0, %rax
ret
.cfi_endproc
...
_Z8my_roundd:
.LFB226:
.cfi_startproc
subq $8, %rsp
.cfi_def_cfa_offset 16
call round <<< This is what callq 19 means
addq $8, %rsp
.cfi_def_cfa_offset 8
cvttsd2siq %xmm0, %rax
ret
.cfi_endproc
As you can see, my_round calls std::round and then executes the cvttsd2siq instruction. This is because std::round(double) returns a double, so its result still has to be converted to int64_t. That conversion is what cvttsd2siq is doing in both of your functions.
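The same two steps can be spelled out in source form. This is a small sketch, not the asker's code: the function names are made up, and round_direct uses std::llround, a standard alternative that rounds and converts in one call, which the question did not use.

```cpp
#include <cmath>
#include <cstdint>

// Conversion only: truncates toward zero. On x86-64 this typically
// compiles to a single cvttsd2si.
std::int64_t truncate_only(double d)
{
    return static_cast<std::int64_t>(d);
}

// Two steps: std::round returns a double that already holds an integral
// value, so the truncating conversion afterwards discards nothing.
std::int64_t round_then_convert(double d)
{
    double rounded = std::round(d);             // e.g. 3.64 becomes 4.0
    return static_cast<std::int64_t>(rounded);  // 4.0 becomes 4
}

// std::llround rounds to nearest (halfway cases away from zero) and
// returns long long directly.
std::int64_t round_direct(double d)
{
    return std::llround(d);
}
```

For an input of 3.64, truncate_only gives 3 while round_then_convert and round_direct both give 4, even though the first two end with the same truncating conversion.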
Upvotes: 19
Reputation: 18902
With g++ you can get a higher-level view of what's happening using the -fdump-tree-optimized switch:
$ g++ -std=c++1y -c -O3 -fdump-tree-optimized ./round.cpp
That produces a round.cpp.165t.optimized file:
;; Function int64_t my_cast(double) (_Z7my_castd, funcdef_no=224, decl_uid=4743$
int64_t my_cast(double) (double d)
{
long int t;
<bb 2>:
t_2 = (long int) d_1(D);
return t_2;
}
;; Function int64_t my_round(double) (_Z8my_roundd, funcdef_no=225, decl_uid=47$
int64_t my_round(double) (double d)
{
double t;
int64_t _3;
<bb 2>:
t_2 = round (d_1(D));
_3 = (int64_t) t_2;
return _3;
}
Here the differences are quite clear (and the call to the round function is glaring).
Upvotes: 18