Reputation: 836
I'm writing a multi-precision library in C++, using a base of 2^64, and currently I'm working on the mod
operation. I'm using Algorithm D described in Donald E. Knuth's 1998 edition of "The Art Of Computer Programming" Vol. 2, Section 4.3.1, for division, which yields a quotient and a remainder. For the mod
operation, I'm performing a division, throwing away the quotient in the end. Although Knuth's Algorithm D is very fast if implemented in C++ with some ASM enhancements for the partial division and the concurrent multi-precision multiplication/subtraction in each step, I'm not sure if there is a better way, since throwing away a painstakingly computed result doesn't seem efficient to me.
Unfortunately, it's not possible to get rid of the partial division in Algorithm D, because the partial quotient is required to compute the remainder, by subtracting the product of the partial quotient and the divisor from the dividend iteratively.
I've searched the Internet for alternative solutions, and found the influential papers written by Paul Barrett and Peter L. Montgomery. However, the fancy tricks they use seem to pay off only if lots of mod
operations are performed in a row with the same modulus, since they involve heavy precomputations. This is the case in complex operations like modular exponentiation, where the mod
of several squares and products is required for a single modulus. Barrett starts with the basic definition of the remainder, r = a - b * (a / b)
, and changes the division to a multiplication with the reciprocal of b
. Then he presents an efficient way to compute this multiplication, which pays off if the reciprocal is computed once for several similar computations. Montgomery transforms the operands into a completely different residue system, where modular arithmetic is cheap, but for the price of transformations to and fro.
Additionally, Both algorithms introduce some restrictions, which need to be met for correct operation. Montgomery, for instance, usually requires the operands to be odd, which is the case in RSA calculations with primes, but which cannot be assumed in the general case. Outside these restrictions, even more overhead for normalizations is required.
So what I need, is an efficient one-shot mod
function without overhead and special restrictions. Hence my question is: Is it possible to compute a remainder without computing the quotient in the first place, in a way that is more efficient than division?
Upvotes: 1
Views: 380
Reputation: 7895
One suggestion would be to write a simple function that would calculate A%B=C
and store the A
, B
and C
values into an array, then store all the results into a vector. Then print them out to see the relationships of all of the inputs and output values.
There is one thing that can be done to simplify some of this work and that is to know some of the properties of the mod function. These two statements will help you out with the function.
0 mod N = 0 N mod 0 = undefined
Since 0 mod N = 0
we can put a test case for A
and if it is 0
we can just use that to populate our array. Likewise if B
= 0
we can populate our array's C
value with -1
just to represent undefined because you can not perform A mod 0
as the compilation will fail due to division by 0.
I wrote this function to do just that; then I run it through a loop for both A
& B
from [0,15]
.
#include <array>
#include <vector>
#include <iostream>
std::array<int, 3> calculateMod(int A, int B) {
std::array<int, 3 > res;
if (A == 0) {
res = std::array<int, 3>{ 0, B, 0 };
}
else if (B == 0) {
res = std::array<int, 3>{ A, 0, -1 };
}
else {
res = std::array<int, 3>{ A, B, A%B };
}
return res;
}
int main() {
std::vector<std::array<int, 3>> results;
int N = 15;
for (int A = 0; A <= N; A++) {
for (int B = 0; B <= N; B++) {
results.push_back(calculateMod(A, B));
}
}
// Now print out the results in a table form:
int i = 0; // Index for formatting output
for (auto& res : results) {
std::cout << res[0] << " % " << res[1] << " = " << res[2] << '\n';
// just for formatting output data to make it easier to read.
i++;
if ( i > N ) {
std::cout << '\n';
i = 0;
}
}
return 0;
}
Here is it's output:
0 % 0 = 0
0 % 1 = 0
0 % 2 = 0
0 % 3 = 0
0 % 4 = 0
0 % 5 = 0
0 % 6 = 0
0 % 7 = 0
0 % 8 = 0
0 % 9 = 0
0 % 10 = 0
0 % 11 = 0
0 % 12 = 0
0 % 13 = 0
0 % 14 = 0
0 % 15 = 0
1 % 0 = -1
1 % 1 = 0
1 % 2 = 1
1 % 3 = 1
1 % 4 = 1
1 % 5 = 1
1 % 6 = 1
1 % 7 = 1
1 % 8 = 1
1 % 9 = 1
1 % 10 = 1
1 % 11 = 1
1 % 12 = 1
1 % 13 = 1
1 % 14 = 1
1 % 15 = 1
2 % 0 = -1
2 % 1 = 0
2 % 2 = 0
2 % 3 = 2
2 % 4 = 2
2 % 5 = 2
2 % 6 = 2
2 % 7 = 2
2 % 8 = 2
2 % 9 = 2
2 % 10 = 2
2 % 11 = 2
2 % 12 = 2
2 % 13 = 2
2 % 14 = 2
2 % 15 = 2
3 % 0 = -1
3 % 1 = 0
3 % 2 = 1
3 % 3 = 0
3 % 4 = 3
3 % 5 = 3
3 % 6 = 3
3 % 7 = 3
3 % 8 = 3
3 % 9 = 3
3 % 10 = 3
3 % 11 = 3
3 % 12 = 3
3 % 13 = 3
3 % 14 = 3
3 % 15 = 3
4 % 0 = -1
4 % 1 = 0
4 % 2 = 0
4 % 3 = 1
4 % 4 = 0
4 % 5 = 4
4 % 6 = 4
4 % 7 = 4
4 % 8 = 4
4 % 9 = 4
4 % 10 = 4
4 % 11 = 4
4 % 12 = 4
4 % 13 = 4
4 % 14 = 4
4 % 15 = 4
5 % 0 = -1
5 % 1 = 0
5 % 2 = 1
5 % 3 = 2
5 % 4 = 1
5 % 5 = 0
5 % 6 = 5
5 % 7 = 5
5 % 8 = 5
5 % 9 = 5
5 % 10 = 5
5 % 11 = 5
5 % 12 = 5
5 % 13 = 5
5 % 14 = 5
5 % 15 = 5
6 % 0 = -1
6 % 1 = 0
6 % 2 = 0
6 % 3 = 0
6 % 4 = 2
6 % 5 = 1
6 % 6 = 0
6 % 7 = 6
6 % 8 = 6
6 % 9 = 6
6 % 10 = 6
6 % 11 = 6
6 % 12 = 6
6 % 13 = 6
6 % 14 = 6
6 % 15 = 6
7 % 0 = -1
7 % 1 = 0
7 % 2 = 1
7 % 3 = 1
7 % 4 = 3
7 % 5 = 2
7 % 6 = 1
7 % 7 = 0
7 % 8 = 7
7 % 9 = 7
7 % 10 = 7
7 % 11 = 7
7 % 12 = 7
7 % 13 = 7
7 % 14 = 7
7 % 15 = 7
8 % 0 = -1
8 % 1 = 0
8 % 2 = 0
8 % 3 = 2
8 % 4 = 0
8 % 5 = 3
8 % 6 = 2
8 % 7 = 1
8 % 8 = 0
8 % 9 = 8
8 % 10 = 8
8 % 11 = 8
8 % 12 = 8
8 % 13 = 8
8 % 14 = 8
8 % 15 = 8
9 % 0 = -1
9 % 1 = 0
9 % 2 = 1
9 % 3 = 0
9 % 4 = 1
9 % 5 = 4
9 % 6 = 3
9 % 7 = 2
9 % 8 = 1
9 % 9 = 0
9 % 10 = 9
9 % 11 = 9
9 % 12 = 9
9 % 13 = 9
9 % 14 = 9
9 % 15 = 9
10 % 0 = -1
10 % 1 = 0
10 % 2 = 0
10 % 3 = 1
10 % 4 = 2
10 % 5 = 0
10 % 6 = 4
10 % 7 = 3
10 % 8 = 2
10 % 9 = 1
10 % 10 = 0
10 % 11 = 10
10 % 12 = 10
10 % 13 = 10
10 % 14 = 10
10 % 15 = 10
11 % 0 = -1
11 % 1 = 0
11 % 2 = 1
11 % 3 = 2
11 % 4 = 3
11 % 5 = 1
11 % 6 = 5
11 % 7 = 4
11 % 8 = 3
11 % 9 = 2
11 % 10 = 1
11 % 11 = 0
11 % 12 = 11
11 % 13 = 11
11 % 14 = 11
11 % 15 = 11
12 % 0 = -1
12 % 1 = 0
12 % 2 = 0
12 % 3 = 0
12 % 4 = 0
12 % 5 = 2
12 % 6 = 0
12 % 7 = 5
12 % 8 = 4
12 % 9 = 3
12 % 10 = 2
12 % 11 = 1
12 % 12 = 0
12 % 13 = 12
12 % 14 = 12
12 % 15 = 12
13 % 0 = -1
13 % 1 = 0
13 % 2 = 1
13 % 3 = 1
13 % 4 = 1
13 % 5 = 3
13 % 6 = 1
13 % 7 = 6
13 % 8 = 5
13 % 9 = 4
13 % 10 = 3
13 % 11 = 2
13 % 12 = 1
13 % 13 = 0
13 % 14 = 13
13 % 15 = 13
14 % 0 = -1
14 % 1 = 0
14 % 2 = 0
14 % 3 = 2
14 % 4 = 2
14 % 5 = 4
14 % 6 = 2
14 % 7 = 0
14 % 8 = 6
14 % 9 = 5
14 % 10 = 4
14 % 11 = 3
14 % 12 = 2
14 % 13 = 1
14 % 14 = 0
14 % 15 = 14
15 % 0 = -1
15 % 1 = 0
15 % 2 = 1
15 % 3 = 0
15 % 4 = 3
15 % 5 = 0
15 % 6 = 3
15 % 7 = 1
15 % 8 = 7
15 % 9 = 6
15 % 10 = 5
15 % 11 = 4
15 % 12 = 3
15 % 13 = 2
15 % 14 = 1
15 % 15 = 0
From the data above we can see that if A == B
the result will be 0
. We can also see that if B > A
then B == A
. Finally we can see that there are patterns between odd
and even
values of A
while B < A
. If you can understand these patterns then most of it becomes algebraic manipulation. From here the next step would be to create an algorithm that would take all of this data and convert it to its binary equivalence.
I chose the value of N
above as 15
for a reason. This is due to the binary representation of all the possible combinations of binary digits before they repeat again. We know that a single byte of data is 8 bits; we know that the values from [0,15] will fit into half of that; for example:
binary byte: hex decimal
0000 0000 0x00 0
...
0000 1111 0xFF 15
After these 15 different sequences of 0s and 1s these patterns will repeat. So by taking the table above you can convert these into binary representations. Now once you examine the representations of A & B
inputs with their C
outputs in binary and understand the 3 properties of the results that I had mentioned above; you should be able to design an algorithm to quickly compute the modulo of any A
B
combination quite easily. One trick to remember is that there are 3 other things to take into consideration. The first is what user eerokia
had stated:
"In particular, modulo with a power of 2 can be replaced by a bitwise operations."
The next beyond that is are the values even or odd as the even and odd cases do present different patters of A mod B
when B < A
.
I have give you some tools of information to start out on, but the rest I'll leave up to you including the task of converting the A
, B
, and C
values into their binary representations.
Once you know the binary patterns of both the A
and B
inputs according to their C
outputs and you understand the truth tables of the logical gates - operators such as And - &
, Or - |
, Nand - (!&)
, Nor - (!|)
, Xor - ^
Xnor - (!^)
and Not - !
as well as the compliment (~)
. You should be able to design your algorithm with efficiency.
Upvotes: -1