There are two ways to check if the number is divisible by 2: x % 2 == 1 (x & 1) == 1 Which of the two is more efficient?

java

JavaDeveloper

Reputation: 5660

Is binary operation more efficient than modulo?

There are two ways to check if the number is divisible by 2:

x % 2 == 1
(x & 1) == 1

Which of the two is more efficient?

Upvotes: 5

Answers (6)

peawormsworth

Reputation: 1220

More binary operation choices:

x>>1<<1!=x

x>>1<<1^x

Tested in python:

for x in range(-5,6):
    print(x, x%2, (x&1)==1, x>>1<<1^x, x>>1<<1!=x)

Upvotes: 0

OldCurmudgeon

Reputation: 65811

Actually neither of those expressions test divisibility by two (other than in the negative). They actually both resolve to true if x is odd.

There are many other ways of testing even/oddness (e.g. ((x / 2) * 2) == x)) but none of them have the optimal properties of x & 1 solely because no compiler could possibly get it wrong and use a divide.

Most modern compilers would compile x % 2 to the same code as x & 1 but a particularly stupid one could implement x % 2 using a divide operation so it could be less efficient.

The argument as to which is better is a different story. A rookie/tired programmer may not recognize x & 1 as a test for odd numbers but x % 2 would be clearer so there is an argument that x % 2 would be the better option.

Me - I'd go for if ( Maths.isEven(x) ) making it absolutely clear what I mean. IMHO Efficiency comes way down the list, well past clarity and readability.

public class Maths {

    // Method is final to encourage compiler to inline if it is bright enough.
    public static final boolean isEven(long n) {
        /* All even numbers have their lowest-order bit set to 0.
         * This `should` therefore be the most efficient way to recognise
         * even numbers.
         * Also works for negative numbers.
         */
        return (n & 1) == 0;
    }
}

Upvotes: 1

Marco13

Reputation: 54639

There will hardly be a noticable difference in practice. Particularly, it's hard to imagine a case where such an instruction will be the actual bottleneck.

_{(Some nitpicking: The "binary" operation should rather be called bitwise operation, and the "modulo" operation actually is a remainder operation)}

From a more theoretical point of view, one could assume that the binary operation is more efficient than the remainder operation, for reasons that already have been pointed out in other answers.

However, back to the practical point of view again: The JIT will almost certainly come for the rescue. Considering the following (very simple) test:

class BitwiseVersusMod
{
    public static void main(String args[])
    {
        for (int i=0; i<10; i++)
        {
            for (int n=100000; n<=100000000; n*=10)
            {
                long s0 = runTestBitwise(n);
                System.out.println("Bitwise sum "+s0);

                long s1 = runTestMod(n);
                System.out.println("Mod    sum "+s1);
            }
        }
    }


    private static long runTestMod(int n)
    {
        long sum = 0;
        for (int i=0; i<n; i++)
        {
            if (i % 2 == 1)
            {
                sum += i;
            }
        }
        return sum;
    }

    private static long runTestBitwise(int n)
    {
        long sum = 0;
        for (int i=0; i<n; i++)
        {
            if ((i & 1) == 1)
            {
                sum += i;
            }
        }
        return sum;
    }
}

Running it with a Hotspot Disassembler VM using

java -server -XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading -XX:+LogCompilation -XX:+PrintAssembly BitwiseVersusMod

creates the JIT disassembly log.

Indeed, for the first invocations of the modulo version, it creates the following disassembly:

  ...
  0x00000000027dcae6: cmp    $0xffffffff,%ecx
  0x00000000027dcae9: je     0x00000000027dcaf2
  0x00000000027dcaef: cltd   
  0x00000000027dcaf0: idiv   %ecx               ;*irem
                                                ; - BitwiseVersusMod::runTestMod@11 (line 26)
                                                ; implicit exception: dispatches to 0x00000000027dcc18
  0x00000000027dcaf2: cmp    $0x1,%edx
  0x00000000027dcaf5: movabs $0x54fa0888,%rax   ;   {metadata(method data for {method} {0x0000000054fa04b0} &apos;runTestMod&apos; &apos;(I)J&apos; in &apos;BitwiseVersusMod&apos;)}
  0x00000000027dcaff: movabs $0xb0,%rdx
  ....

where the irem instruction is translated into an idiv, which is considered to be rather expensive.

In contrast to that, the binary version uses an and instruction for the decision, as expected:

  ....
  0x00000000027dc58c: nopl   0x0(%rax)
  0x00000000027dc590: mov    %rsi,%rax
  0x00000000027dc593: and    $0x1,%eax
  0x00000000027dc596: cmp    $0x1,%eax
  0x00000000027dc599: movabs $0x54fa0768,%rax   ;   {metadata(method data for {method} {0x0000000054fa0578} &apos;runTestBitwise&apos; &apos;(I)J&apos; in &apos;BitwiseVersusMod&apos;)}
  0x00000000027dc5a3: movabs $0xb0,%rbx
  ....

However, for the final, optimized version, the generated code is more similar for both versions. In both cases, the compiler does a lot of loop unrolling, but the core of the methods can still be identified: For the bitwise version, it generates an unrolled loop containing the following instructions:

  ...
  0x00000000027de2c7: mov    %r10,%rax
  0x00000000027de2ca: mov    %r9d,%r11d
  0x00000000027de2cd: add    $0x4,%r11d         ;*iinc
                                                ; - BitwiseVersusMod::runTestBitwise@21 (line 37)

  0x00000000027de2d1: mov    %r11d,%r8d
  0x00000000027de2d4: and    $0x1,%r8d
  0x00000000027de2d8: cmp    $0x1,%r8d
  0x00000000027de2dc: jne    0x00000000027de2e7  ;*if_icmpne
                                                ; - BitwiseVersusMod::runTestBitwise@13 (line 39)

  0x00000000027de2de: movslq %r11d,%r10
  0x00000000027de2e1: add    %rax,%r10          ;*ladd
                                                ; - BitwiseVersusMod::runTestBitwise@19 (line 41)
  ...

There is still the and instruction for testing the lowest bit. But for the modulo version, the core of the unrolled loop is

  ...
  0x00000000027e3a0a: mov    %r11,%r10
  0x00000000027e3a0d: mov    %ebx,%r8d
  0x00000000027e3a10: add    $0x2,%r8d          ;*iinc
                                                ; - BitwiseVersusMod::runTestMod@21 (line 24)

  0x00000000027e3a14: test   %r8d,%r8d
  0x00000000027e3a17: jl     0x00000000027e3a2e  ;*irem
                                                ; - BitwiseVersusMod::runTestMod@11 (line 26)

  0x00000000027e3a19: mov    %r8d,%r11d
  0x00000000027e3a1c: and    $0x1,%r11d
  0x00000000027e3a20: cmp    $0x1,%r11d
  0x00000000027e3a24: jne    0x00000000027e3a2e  ;*if_icmpne
                                                ; - BitwiseVersusMod::runTestMod@13 (line 26)
  ...

I have to admit that I can not fully understand (at least, not in reasonable time) what exactly it is doing there. But in any case: The irem bytecode instruction is also implemented with an and assembly instruction, and there is no longer any idiv instruction in the resulting machine code.

So to repeat the first statement from this answer: There will hardly be a noticable difference in practice. Not only because the cost of a single instruction will hardly ever be the bottleneck, but also because you never know which instructions actually will be executed, and in this particular case, you have to assume that they will basically be equal.

Upvotes: 1

Robert Harvey

Reputation: 180788

The bit operation is almost certainly faster.

Division/modulus is a generalized operation which must work for any divisor you provide, not just 2. It must also check for underflow, range errors and division by zero, and maintain a remainder, all of which takes time.

The bit operation just does a bit "and" operation, which in this case just so happens to correspond to division by two. It might actually use just a single processor operation to execute.

Upvotes: 5

khelwood

Reputation: 59112

Either the & expression will be faster or they will be the same speed. Last time I tried, they were the same speed when I used a literal 2 (because the compiler could optimise it) but % was slower if the 2 was in a variable. The expression x % 2 == 1 as a test for odd numbers does not work for negative x. So there's at least one reason to prefer &.

Upvotes: 2

Louis Newstrom

Reputation: 172

The binary operation is faster. The mod operation has to calculate a division in order to get the remainder.

Upvotes: 0

Is binary operation more efficient than modulo?

Answers (6)

Related Questions