user1095108

Reputation: 14603

floating point conversions and performance

I am aware of the errors that can occur when doing conversions between floating point numbers and integers, but what about performance (please disregard the accuracy issues)?

Does performance, in general, suffer if I do n-ary operations on operands of differing arithmetic types, that is, on differing floating point types (e.g. float and double) and on floating point/integer combinations (e.g. float and int)? Are there rules of thumb, such as keeping all operands of the same type?

P.S.: I am asking because I'm writing an expression template library and would like to know whether to allow binary operations on vectors containing values of differing arithmetic types.
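
For concreteness, here is a minimal sketch of the kind of mixed-type operation I have in mind (the free function and its common_type-based result type are only illustrative, not the actual library design):

#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical element-wise addition of two vectors with differing
// element types; the result element type is chosen via std::common_type,
// so float + int yields float, float + double yields double, etc.
template <typename A, typename B>
std::vector<typename std::common_type<A, B>::type>
add(std::vector<A> const& a, std::vector<B> const& b)
{
    typedef typename std::common_type<A, B>::type R;
    std::vector<R> r(a.size());
    for (std::size_t i = 0; i != a.size(); ++i)
        r[i] = a[i] + b[i]; // the implicit conversions happen here
    return r;
}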

Upvotes: 6

Views: 4683

Answers (3)

WeirdlyCheezy

Reputation: 704

I suspect the answer to this question is going to vary by target architecture, because the conversions can (but might not) occur in hardware. For example, consider the following code, which causes some interconversions between int and float:

int main (int argc, char** argv)
{
    int precoarced = 35;
    // precoarced gets converted to floating point for the addition
    float result = 0.5 + precoarced;

    // and now we force it back to int
    return (int)result;

    // I wonder what the disassembly looks like in different environments?
}

When I compiled this with g++ (I'm on Ubuntu, x86) with default settings and disassembled it with gdb, I got:

   0x00000000004004b4 <+0>: push   %rbp
   0x00000000004004b5 <+1>: mov    %rsp,%rbp
   0x00000000004004b8 <+4>: mov    %edi,-0x14(%rbp)
   0x00000000004004bb <+7>: mov    %rsi,-0x20(%rbp)
   0x00000000004004bf <+11>:    movl   $0x23,-0x8(%rbp)
   0x00000000004004c6 <+18>:    cvtsi2sdl -0x8(%rbp),%xmm0
   0x00000000004004cb <+23>:    movsd  0x10d(%rip),%xmm1        # 0x4005e0
   0x00000000004004d3 <+31>:    addsd  %xmm1,%xmm0
   0x00000000004004d7 <+35>:    unpcklpd %xmm0,%xmm0
   0x00000000004004db <+39>:    cvtpd2ps %xmm0,%xmm0
   0x00000000004004df <+43>:    movss  %xmm0,-0x4(%rbp)
   0x00000000004004e4 <+48>:    movss  -0x4(%rbp),%xmm0
   0x00000000004004e9 <+53>:    cvttss2si %xmm0,%eax
   0x00000000004004ed <+57>:    pop    %rbp
   0x00000000004004ee <+58>:    retq   

Note the instructions with a cvt-prefixed mnemonic; these are the conversion instructions. So in this case the conversion happens in hardware in a handful of instructions, and depending on how many cycles those instructions cost, it can be reasonably fast. But again, a different architecture (or a different compiler) could change the story.

Edit: On a fun side note, there's an extra conversion in there due to me accidentally specifying 0.5 instead of 0.5f. That's why the cvtpd2ps op is in there.
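
For comparison, here is the variant with the float literal. With 0.5f the arithmetic should stay in single precision, so the double round-trip (and the cvtpd2ps) should disappear, leaving only the int -> float and float -> int conversions; the exact codegen still depends on compiler and flags, so treat this as a sketch:

int main (int argc, char** argv)
{
    int precoarced = 35;
    // 0.5f keeps the literal (and the addition) in float,
    // so no double is involved
    float result = 0.5f + precoarced;
    return (int)result;
}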

Edit: x86 has had FP support for a long time (since the 80s), so C++ compilers targeting x86 will generally make use of the hardware (unless the compiler is seriously behind the times). Thanks Hot Licks for pointing this out.

Upvotes: 5

alecov

Reputation: 5172

Usually there are some performance penalties, although they are negligible compared to other costs. They come mainly from moving data between integer and floating-point registers, plus possible ABI issues.

The answer to such questions is almost always the same: in doubt? Benchmark it. Performance is hard to predict from theory alone.
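
A rough sketch of such a benchmark, assuming a simple element-wise loop (the sizes are arbitrary and the printed sums only keep the optimizer from discarding the work; real measurements need repetitions, warm-up, and a look at the generated code):

#include <chrono>
#include <cstddef>
#include <cstdio>
#include <vector>

int main()
{
    const std::size_t n = 10000000;
    std::vector<float> f(n, 1.5f);
    std::vector<int>   i(n, 2);

    float s1 = 0.0f, s2 = 0.0f;

    auto t0 = std::chrono::steady_clock::now();
    for (std::size_t k = 0; k != n; ++k)
        s1 += f[k] * 2.0f;     // float * float, no conversion
    auto t1 = std::chrono::steady_clock::now();
    for (std::size_t k = 0; k != n; ++k)
        s2 += f[k] * i[k];     // float * int, int converted every iteration
    auto t2 = std::chrono::steady_clock::now();

    std::printf("same-type:  %f ms\n",
                std::chrono::duration<double, std::milli>(t1 - t0).count());
    std::printf("mixed-type: %f ms\n",
                std::chrono::duration<double, std::milli>(t2 - t1).count());
    std::printf("(sums: %f %f)\n", s1, s2);
}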

Upvotes: 2

Hot Licks

Reputation: 47729

On most machines, conversions between float and int formats are fairly fast, being assisted by features of the floating-point hardware.

However, one should, of course, make some effort to express literals in the "correct" format, if only for documentation purposes. And explicit casts don't hurt either, for the same reason.
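
A small illustration of both points (the function and its names are made up):

float scale(float x, int n)
{
    // 0.5f rather than 0.5: the literal matches the operand type,
    // so the expression stays in float instead of being promoted to double
    float half = 0.5f * x;

    // the explicit cast documents that the int -> float conversion is intended
    return half * static_cast<float>(n);
}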

Upvotes: 2
