Reputation: 45

Modulus optimization in a program

I have seen that many people prefer to use in code:

 while(i<1000000){
       ret+=a[i];
       i++;
       if(ret >= MOD) 
          ret -= MOD;
   }

instead of making ret%MOD in the final step.

What is the difference between these two and how both these are equal?

How it is making an optimize our code?

Upvotes: 0

Answers (6)

Spektre

Reputation: 51845

Yes booth compute the same thing but:

operation % needs integer division which is more time costly then - and if

but on modern parallel machines (mean more pipelines by that not cores)
the CPU do more tasks at once unless they depend on each other or brunching occurs
that is why on modern machines is the % variant usually faster (if stalls the pipelines)

There are still platforms where the -=,if variant is faster

like MCU's so when you know you have just single CPU/MCU pipeline
or have very slow division then use this variant
you should always measure the result ties during optimization process
in your case you want to call just single mod per whole loop so it should be faster but check the later text ...

Compilers

modern compilers optimize code for your target platform and usually detect this and use the right choice
so you should not be consumed by the low level optimization instead of by programing the task functionality
but not all compilers are such for many platforms there are still used older compilers
also in some rare cases the optimizations are preferred to be turned off
because it could destroy specific desired timing, instruction patterns, or even functionality of the task ...
in such cases there is no choice and this knowledge suddenly comes handy

now the differences of your cases from algorithmic side:

while(i<1000000){ ret+=a[i]; i++; if(ret>=MOD) ret-=MOD; }

the sub result is still around modulo MOD
that mean you do not need more bits then used for max(a[i])+MOD*N where N depends on a[i]
if the sum(a[i]) will go to bignums then this will have more speed due to no need to increase sub-result bit-width

while(i<1000000){ ret+=a[i]; i++; } ret%=MOD;

this could overflow if variable ret can not hold the non modulo result

while(i<1000000){ ret+=a[i]; i++; ret%=MOD; }

this is how it should be for bigger non modulo results

if (ret>=MOD) ret-=MOD; is not modulo operation

it is just iteration of it.
more safe is while (ret>=MOD) ret-=MOD;
but if you know that the sub-result is not increasing too much (so it will not overflow in any few iterations) then if is OK
but in that case you should add while or modulo after the loop to ensure correct result

Upvotes: 0

japreiss

Reputation: 11251

Modulo requires integer division which is usually the slowest integer math operation on a CPU. Long ago before pipelines and branch prediction, this code was probably reliably faster than modulo. Nowadays branches can be very slow so its benefit is far from certain. If the values in a are always much smaller than MOD, it's probably still a win because the branch will be skipped most iterations and the branch predictor will mostly guess right. If they are not smaller, it's uncertain. You would need to benchmark both.

If you can write the program such that MOD is always a power of 2, you could use bit masking which is much faster than either.

If I saw this pattern in code that wasn't 1) from 1978 or 2) accompanied by a comment explaining how the author benchmarked it and found it was faster than modulo on the current compiler, typical user CPU, and a realistic data input, I'd roll my eyes hard.

Upvotes: 1

Ali Akber

Reputation: 3800

Let an example :

13 MOD 10 what it actually do is, give you the reminder after dividing 13 by 10.

that is : 13 - (10 * (int)(13/10)) = 13 - ( 10 * 1 ) = 3

so if a[i] <= mod then it will work good. but if a[i] > mod then see, what happens

let a[]= {15,15,15}

mod=7

in first step
ret = 0 + 15
ret = 15 - 7 = 8

2nd step
ret = 8 + 15 = 23
ret = 23 - 7 = 16

3rd step
ret = 16 + 15
ret = 31 - 7 = 24

So your final result is 24, but it should be 3.

you have to do :

while (ret >= MOD)
    ret -= MOD;

if you want to use subtraction instead of mod..

And obviously sub is better than mod in respect to time... because mod is really time consuming :(

Upvotes: 2

Paul R

Reputation: 212979

The conditional test and subtraction is typically less expensive than a modulus operation, especially if the sum does not frequently exceed MOD. A modulus operation is effectively an integer division instruction, which typically has a latency which is an order of magnitude greater than that of compare/subtract. Having said that, unless the loop is a known performance bottleneck then you should just code for clarity and robustness.

Upvotes: 1

Mario

Reputation: 36497

Basically you can't tell without trying. There are two possible outcomes (considering my note further down below):

The compiler optimizes the code in some way that both solutions use either a conditional jump or a modulo operation. This does not only depend on how "bright" the compiler is, but it also has to consider the target architecture's available instruction set (but to be honest, it would be odd not having a modulo operation).
The compiler doesn't optimize the code (most probable for non-optimizing debug builds).

The basic difference that - as mentioned already - the solution with the if() will use one conditional jump, which - again depending on your architecture - might slow you down a bit, since the compiler can't prefetch the next instruction without evaluating the jump condition first.

One further note:

Either using a modulo operation or your if() statement actually isn't equal (depending on the actual values), simply due to the fact that ret % MOD would result in the following equal code:

while (ret >= MOD)
    ret -= MOD;

Imagine a[i] being bigger than MOD and the new sum being bigger than two times MOD. In that case you'd end up with a ret bigger than MOD, something that won't happen when using modulo.

Upvotes: 2

Ed Heal

Reputation: 60007

It is best not to try to optimise code unless you have a performance problem. Then find out where it is actually having the problems

An to answer you question the two are the same - but you need to check with the particular hardware/compiler to check.

Upvotes: 1

Modulus optimization in a program

Answers (6)

Related Questions