jnm2

Reputation: 8354

Can I parallelize huge integer additions?

I need to add two unsigned 256-megabit integers more than two billion times. Since carries are essential to addition and cannot be determined until the lower-order bits have been added, are there any performance gains to be had from multicore CPU features, for example by splitting each number into multiple parts and dealing with the carries later?

Upvotes: 2

Views: 192

Answers (3)

Olof Forshell

Reputation: 3264

If you mouse over the tags you will find information about the number of members interested in each subject:

  • multicore - 118
  • multi-threading - 2.1k
  • C - 11.7k
  • C++ - 17k
  • performance - 1.3k
  • spinlock - 5
  • atomic - 18

You can choose one subject and then narrow down the number of questions by adding another/others.

Your original post was about more efficient addition using multicore, so multicore would be one tag to select. Since multi-threading is part of any OS or application that uses multicore, select that too. You now have 117 questions. You might want to favor questions from users with more points rather than fewer. Look at the tags on individual questions and avoid those with C#, Java and .NET, since those subjects are more about code production efficiency than code execution efficiency.

Other concepts you can search for are affinity, critical section, saturation, memory lock/barrier, thread-safe, rdtsc.

One thing you might keep in mind is that the practicalities of writing really fast code have very much to do with trial and error, getting your feet wet or whatever you might want to call it. What you can find here are hints about what you can try, what you would want to look out for.

As to my original answer with GMP, I recommend you check out the author's page. It contains information on things such as sustained instruction throughput on different x86 architectures, division by a constant integer using integer multiplication, and winning the Simon Singh Code Book challenge. There is also performance information about GMP itself.

Upvotes: 0

Olof Forshell

Reputation: 3264

Why don't you use the GMP library?

Upvotes: 0

Nic Foster

Reputation: 2894

You can definitely split this up into many pieces. For example, take these two numbers:

  12345
+ 67890

Now we'll split them after the third digit, between the hundreds and tens columns. This gives us

  123      45
+ 678    + 90

Calculate the results of each

  123      45
+ 678    + 90
-------------
  801     135

For the left-hand pair you need to know how many digits you chopped off, in this case two, so append two zeros to 801, giving you 80100. Then add 135 to it, and you have 80235.

You could do this with much larger numbers, and with as many splits as you like. Each piece can be added without waiting for the others; carries between pieces are dealt with when the pieces are recombined.

Of course, when you recombine large numbers you're still left with large additions. You could probably figure out how many digits have carried, and just add that small amount to your left-hand number.

For instance, in the example above, the number on the right went from two columns to three, with the result being 135. The extra column is the amount to be carried, which can be added to your 801. This lets you add a small number to the left-hand part and then simply concatenate the two parts as you would strings.

45 and 90 both took up two columns, which added made 135. We take any extra columns generated, in this case, just the 1, and add it to our left-hand number, 801.

801 + 1 = 802   
802 concatenated with 35 = 80235
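
The split, carry, and concatenate steps above can be sketched in a few lines. This is only an illustration of the arithmetic, written in Python for brevity; the function name `split_add` and its `low_digits` parameter are made up for this example.

```python
def split_add(a: int, b: int, low_digits: int) -> int:
    """Add two non-negative integers by splitting each at `low_digits` decimal digits."""
    base = 10 ** low_digits

    # Split each operand into a high part and a low part.
    a_high, a_low = divmod(a, base)
    b_high, b_low = divmod(b, base)

    # The two partial sums do not depend on each other.
    high_sum = a_high + b_high
    low_sum = a_low + b_low

    # Anything beyond `low_digits` columns in the low sum is the carry.
    carry, low_result = divmod(low_sum, base)

    # Add the carry to the high part, then "concatenate" the two parts.
    return (high_sum + carry) * base + low_result


print(split_add(12345, 67890, 2))  # 80235
```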

If you want something extremely efficient, you could look up how 32-bit processors add 64-bit or larger numbers. I expect they do something similar: add the two 32-bit halves separately and carry over from the least significant half to the most significant.
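
As a rough model of that idea, the sketch below adds two 64-bit values using only 32-bit-wide pieces and an explicit carry. On x86 the same pattern is the `ADD` / `ADC` instruction pair; the Python function name here is invented for the example, and the result wraps around like unsigned 64-bit arithmetic.

```python
MASK32 = 0xFFFFFFFF


def add64_as_two_32bit_halves(a: int, b: int) -> int:
    """Add two unsigned 64-bit values using 32-bit halves and an explicit carry."""
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32

    # Add the low halves first; bit 32 of the result is the carry.
    lo = a_lo + b_lo
    carry = lo >> 32
    lo &= MASK32

    # Add the high halves plus the carry; discard overflow past 64 bits.
    hi = (a_hi + b_hi + carry) & MASK32

    return (hi << 32) | lo


print(hex(add64_as_two_32bit_halves(0x00000000FFFFFFFF, 1)))  # 0x100000000
```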

As for parallelization, split each number into 32-bit pieces and pair up the corresponding pieces to be added together. Then determine how many threads the CPU can run at once, divide your list of pairs by that number, and give each thread its share. When the results are calculated, put them in a completed section.

Carrying from the least significant piece to the most significant once you get all the results back is the tricky part, as adding even a single 1 to a piece can cause it to roll over and carry into the next piece as well.
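
One way around the roll-over problem is a carry-select scheme, which is not what the answer above describes but a known variation on it: each thread adds its chunk twice, once assuming a carry-in of 0 and once assuming 1, and a cheap sequential pass then picks the right version of each chunk. The sketch below is an assumption-laden illustration in Python. All names (`add_chunk`, `parallel_add`, `chunk_limbs`, `workers`) are hypothetical, and Python threads will not give a real speedup for this work; a serious version would use 64-bit limbs in C or a library such as GMP.

```python
from concurrent.futures import ThreadPoolExecutor

LIMB_BITS = 32
LIMB_MASK = (1 << LIMB_BITS) - 1


def to_limbs(n, count):
    """Split a non-negative int into `count` 32-bit limbs, least significant first."""
    return [(n >> (LIMB_BITS * i)) & LIMB_MASK for i in range(count)]


def from_limbs(limbs):
    """Rebuild an int from limbs stored least significant first."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))


def add_chunk(a_limbs, b_limbs, carry_in):
    """Ripple-add two equal-length limb lists; return (sum_limbs, carry_out)."""
    out, carry = [], carry_in
    for x, y in zip(a_limbs, b_limbs):
        s = x + y + carry
        out.append(s & LIMB_MASK)
        carry = s >> LIMB_BITS
    return out, carry


def add_chunk_both(a_limbs, b_limbs):
    """Compute the chunk sum for both possible carry-in values."""
    return add_chunk(a_limbs, b_limbs, 0), add_chunk(a_limbs, b_limbs, 1)


def parallel_add(a, b, chunk_limbs=4, workers=4):
    """Add equal-length limb lists a and b; return (sum_limbs, carry_out)."""
    starts = range(0, len(a), chunk_limbs)
    chunks_a = [a[i:i + chunk_limbs] for i in starts]
    chunks_b = [b[i:i + chunk_limbs] for i in starts]

    # Phase 1 (parallel): every chunk is added independently of the others.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        candidates = list(pool.map(add_chunk_both, chunks_a, chunks_b))

    # Phase 2 (sequential, one step per chunk): select the version of each
    # chunk that matches the carry coming out of the chunk below it.
    result, carry = [], 0
    for with_carry0, with_carry1 in candidates:
        limbs, carry = with_carry1 if carry else with_carry0
        result.extend(limbs)
    return result, carry


# Worst case for roll-over: a single 1 ripples through every limb.
x, y = 2**256 - 1, 1
limbs, carry_out = parallel_add(to_limbs(x, 8), to_limbs(y, 8), chunk_limbs=2)
print(from_limbs(limbs) + (carry_out << 256) == x + y)  # True
```

The cost is that every chunk is added twice, in exchange for a final pass that only touches one carry per chunk instead of rippling through individual limbs.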

Upvotes: 2
