Reputation:
I'm trying to transfer this prime sieving function to use Parallel.For so it can utilize multiple cores.
However, when I run it, the value of the b variable seems to randomly jump or even not change at all, especially for higher values of To.
static List<long> Sieve(long To)
{
long f = 0;
To /= 2;
List<long> Primes = new List<long>();
bool[] Trues = new bool[To];
if (To > 0)
Primes.Add(2);
long b = 0;
Parallel.For(1L, To, a =>
{
b++;
if (Trues[b])
return;
f = 2 * b + 1;
Primes.Add(f);
for (long j = f + b; j < To; j += f)
Trues[j] = true;
});
return Primes;
}
What's going on, and how can I stop that from happening?
Upvotes: 1
Views: 180
Reputation: 6132
The problem you are facing here is called race conditions
, it's what happens when multiple CPU cores load the same variable into their respective cache, work on it, then write the value back to RAM. Obviously, the value that's written back to RAM may already be old in the meantime (like when a core loads the variable right before it's overwritten with another value)
First of all: I wouldn't use b++
but int i = Interlocked.Increment(ref b);
instead. Interlocked.Increment ensures that no 2 threads attempt to increment the same value at the same time. The result is the incremented value which will be saved into the variable i
. This is very important, because you will need that value to remain constant for every iteration of your for-loop, which would be impossible otherwise, since other threads will be incrementing this variable.
Next is your variable f
and a
(defined as the For-iterator). Forget f
, use a
instead.
f = 2 * b + 1; // wrong
a = 2 * b + 1; // correct
Lastly: System.Collections.Generic.List is NOT, I repeat (because it's important) NOT thread safe. See http://msdn.microsoft.com/en-us/library/6sh2ey19.aspx for more details.
Primes.Add(f); // will likely break something
lock (Primes) // LOCK the list
{
Primes.Add(a); // don't forget, we're using 'a' instead of 'f' now
}
The lock
keyword accepts only reference-type variables as argument, that is because locking a variable does NOT prevent another thread from accessing it. Instead, you can imagine it as setting a flag on top of the reference, in order to signal other threads I'm working here, please do not disturb!
Of course, if another thread attempts to access Primes
without asking to lock it beforehand, the thread will still be able to access the variable.
You should've learned all of this though, since the Parallel Prime Sieve is one of the most common beginner exercises when first learning about multithreading.
EDIT:
After all the above steps are done, the program shouldn't run amok; however this does not mean that the solution will be correct, or that you'll have gained a speedup, since many of your threads will be doing duplicate work.
Assume Thread a
is given the responsibility to mark every multiple of 3, while Thread n
is given the responsibility to mark the multiples of 9. When run sequentially, by the time Thread n
begins processing the multiples of 9, it will see that 9 is already a multiple of another (prime) number. However, since your code is now parallel, there is no guarantee that 9 will be marked by the time Thread n
begins its work. Not to mention that - since 9 may not be marked - might be added to the list of prime numbers.
Because of this, you have to sequentially find all prime numbers between 1 and the square root of To
. Why the square root of To
? That's something you'll have to find out yourself.
Once you have found all prime numbers from 1 to the square root of To
, you can start your parallel prime sieve in order to find the rest of the primes, using all primes found previously.
One last noteworthy point would be, that Primes
should be built only after Trues
has been filled. That's because:
1. Your threads will only have to process the multiples of 2 to the square root of To
, thus will in the current implementation not add any more elements to Primes
beyond the root.
2. If you choose to have your threads go beyond the root, you'll face the problem, that one of your threads will add a non-prime number to Primes
shortly before another thread marks that number as non-prime, which is not what you want.
3. Even in the event that you were lucky and all elements of Primes
are indeed all prime numbers between 1 and To
, they may not necessarily be in order, requiring Primes
to be sorted first.
Upvotes: 1
Reputation: 29233
Welcome to the wonderful world of multithreading.
Right off the bat, I can see that every iteration of your loop does a b++
and then uses b
throughout its course. This means that every iteration of your loop will be modifying the value of b
in the midst of all other iterations.
What you probably want to do is use the a
variable made available in your inline function, which does exactly what you seem to be trying to do with b
. On the off chance that this is not the case, then you should look into locking b
and copying its value to a local (to each iteration) variable before doing stuff to it.
Try this instead and let me know if it's what you wanted to do:
static List<long> Sieve(long To)
{
To /= 2;
List<long> Primes = new List<long>();
if (To > 0)
Primes.Add(2);
Parallel.For(1L, To, a =>
{
long f = 2 * a + 1;
Primes.Add(f);
});
Primes.Sort();
return Primes;
}
Upvotes: 1
Reputation: 171226
b
is shared across threads. What do you expect to happen if multiple threads bang on that poor variable at once?
It seems like b
and a
are always equal in your code (or differing by one). Use a
. And synchronize access to all other shared state (like the list).
Upvotes: 1