calccrypto
calccrypto

Reputation: 8991

Performance of operator| vs operators+

Is there any major difference between | and + that would affect a code's performance in the long run? or are both O(1)? the code i am working with is something like this:

uint64_t dostuff(uint64_t a,uint64_t b){
        // the max values of the inputs are 2^32 - 1

        // lots of stuff involving boolean operators
        // that have no way of being substituted by 
        // arithmetic operators

        return (a << 32) + b;
        //or
        return (a << 32) | b;
}

the code will be used many times, so i want to speed it up as much as possible.

Upvotes: 2

Views: 322

Answers (8)

Simon
Simon

Reputation: 784

This is platform-specific (and likely compiler-specific). On the SPU's on the PS3, dynamic OR's are quite expensive if I remember correctly. I'm not sure of the numbers but I think that it ends up by dividing it into multiple operations, causing the cost to expand to several instructions. On x86/x64 or most modern CISC it is quite likely that either one is just one instruction and is very unlikely to cause any pipeline stalls or other costly operations.

Edit: The reason for the cost is because the Cell processor only has one general-purpose-register which means that it can't load both variables into standard registers and perform the optimization. Instead the values have to be loaded into the altivec-register set where the operation has to be done, the result then has to be fetched from the altivec registers into the gpr by a mask in order to retrieve the result.

If you are pushing these operations onto a PS3 or the GPU on any modern computer, you might want to look into how those processors behave. The GPU's might also have similar issues since they're also RISC-processors dedicated towards SIMD-operations.

Upvotes: 0

Jerry Coffin
Jerry Coffin

Reputation: 490108

If there's any advantage, it's going to be in favor of the or. In reality, however, there's unlikely to be any difference on any reasonably modern CPU (or even anything but a really ancient one).

Basically, an or just sets the bit, and that's all. One two-input or gate is all that's needed, so you get exactly one gate of propagation delay.

An adder is a bit more complex: computing the current bit requires a three-input XOR. An XOR is normally composed to two levels of gates. In addition, it generates a carry, that has to be used as an input to the adder for the next bit. A "ripple carry adder", therefore, needs as many clock cycles as there are bits being added. There are cleverer ways of handling the problem where you handle carries separately from the rest of the addition, so you get a lower propagation delay, but in the worst case, even these don't help.

Most of that only matters if you're designing a CPU yourself though. If you're using a typical CPU, the gates in the functional units are running fast enough that it can/will do a full add in one clock cycle. Some reasonably recent ones can even do two adds per clock cycle in a single functional unit.

Upvotes: 1

Chris A.
Chris A.

Reputation: 6887

Both are certainly O(1) since O(1) means a constant. They are probably not the same constant. Big Oh notation is meant to understand asymptotic behavior independent of constants.

Oh yeah, one more thing. Always profile before you optimize. You'll find out very quickly that time isn't being spent where you think. Always!

Upvotes: 3

Thomas Matthews
Thomas Matthews

Reputation: 57688

The | and '+` are different mathematical operations.
Given the equations:

  unsigned int y = 2 + 2;
  unsigned int z = 2 | 2;

will yield different answers.

Technically, the `|' operation is faster since it only uses OR gates inside the processor. The addition operation requires more gates.

The performance gained by using '|' over '+' is usually wasted by the time required to fetch data into and out of the processor. In otherwords, the net performance is negligible. (The time difference is usually in the range of nanoseconds.)

However, the maintenance time between the two forms may be greater. When one is needing arithmetic rather than bit twiddling (or vice versa), trying to find this runtime error can be great.

Use the proper operator for the proper purpose. Give the testing and maintenance groups a break. This kind of micro-optimization is not worthwhile.

Upvotes: 0

Lee Louviere
Lee Louviere

Reputation: 5262

Use |.

+ can only add to the operation time par obvious reasons.

Upvotes: 2

Jack
Jack

Reputation: 133577

The best answer here is not trying to predict which one is better but benchmark it or check the assembly code. I would guess that both will be optimized to the same instruction and in any case the number of CPU cycles taken by both could be equal.

But I strongly suggest you to check ASM and benchmark both solutions.

Upvotes: 1

Yakov Galka
Yakov Galka

Reputation: 72479

No performance difference on any modern computer.

The two operators have different meaning though. If the bit is already set, | will do nothing, but + will clear the bit and all the following non-zero bits and set the next zero bit to 1.

Upvotes: 5

salezica
salezica

Reputation: 76919

Both are a single instruction. As for electronic propagation times, no idea which one is faster.

You can test for speed yourself, I guess, but seeing as the difference will probably be linear (if detectable at all), and affected by noisy factors, it may be a bit difficult.

Upvotes: 1

Related Questions