Johan
Johan

Reputation: 76567

Efficient three valued compare

For unsigned integers getting the result of

if (a>b) => 1
if (a=b) => 0
if (a<b) => -1

Can be optimized into the branchless version

return ((a > b) - (a < b))

This can be written into x86 assembly like so:

4829D1           cmp rcx,rdx
0F94C1           setz cl    
19C0             sbb eax,eax    
83D8FF           sbb eax,-$01    
D3E8             shr eax,cl

13 bytes in total

Is there a way to do this without branches in less than 5 instructions or in fewer bytes?

Upvotes: 2

Views: 116

Answers (2)

zx485
zx485

Reputation: 29022

A solution with less bytes (11 bytes) and one less instruction (4 instructions) which may be faster:

483bca          cmp     rcx,rdx
1bc0            sbb     eax,eax
483bd1          cmp     rdx,rcx
83d000          adc     eax,0

This can be improved upon to 10 bytes if you have a spare register known to be null.

...
11d8            adc     eax,ebx

Upvotes: 2

Kerrek SB
Kerrek SB

Reputation: 477040

Clang 3.7 produces the following x86-64 machine code for me:

0:   48 39 f7                cmp    rcx,rdx
3:   0f 97 c0                seta   al
6:   0f b6 c0                movzx  eax,al
9:   83 d8 00                sbb    eax,0x0

That's four instructions for the computation; the result is in eax.

This can be improved upon:

0:   31 C0                   xor eax,eax
2:   48 39 D1                cmp rcx,rdx
5:   0F 97 C0                setnbe al
8:   83 D8 00                sbb eax,0x0

11 bytes.

Upvotes: 4

Related Questions