Reputation: 56
I am trying to make a parallel version of prime factorization in Java, and have a working program. However, the program is very slow – much slower than my sequential version. I do understand this may have to do with thread optimization and the sheer number of threads (I tried using ExecutorService but I don't think I really got the hang of it). Anyone who can point out a way to optimize this piece of code?
class HovedFaktorArbeider implements Runnable //Class for first step factorization
{
int fra;
int til;
int id;
List<Long> tempFaktorer; //Must be long since we'll stumble upon primes larger than int.MAX_VALUE accidentally
long[] lokaleFaktorer;
CyclicBarrier faktorBarrier = new CyclicBarrier(k+1);
ExecutorService executorService = Executors.newFixedThreadPool(2);
long g;
public HovedFaktorArbeider(int fra, int til, int id)
{
this.fra = fra;
this.til = til;
this.id = id;
}
public void run()
{
try
{
//int tempK = 2; //Erstatter bare k for aa sjekke
long tempN = (long) n;
g = (tempN*tempN)-100;
long tall = g + (long) fra; //The long numbers to be factorized
for(int i = fra; i <= til; i++) //For the number split
{
int tempFra = 0;
int tempTil = 0;
int rest = 0;
int faktor = 0;
if(primes.size() % k > 0) //If not perfect division
{
rest = (primes.size() % k); //Saving remainder
faktor = (primes.size()/k); //Saving divisor, spreading remainder
rest--; //Decrement remainder
}
else //If perfect divison
faktor = primes.size()/k;
tempTil = faktor; //Setting end value
tempFaktorer = new ArrayList<Long>();
for(int ii = 0; ii < k; ii++) //For the prime number split
{
//executorService.submit(new HjelpeFaktorArbeider(tempFra, tempTil, tall, ii));
new Thread(new HjelpeFaktorArbeider(tempFra, tempTil, tall, ii)).start();
tempFra = tempTil + 1; //Set new value for start
if(rest > 0)
{
tempTil += faktor + 1; //Spreading remainder
rest--;
}
else
tempTil += faktor; //New end value
}
faktorBarrier.await();
lokaleFaktorer = new long[tempFaktorer.size()];
for(int j = 0; j < tempFaktorer.size(); j++)
{
lokaleFaktorer[j] = tempFaktorer.get(j);
}
faktorer[i] = lokaleFaktorer; //TNB: i does not start at 0, so placement should be correct
tall++;
}
mainBarrier.await();
executorService.shutdown();
}
catch(Exception e){e.printStackTrace();}
}
public synchronized void oppdaterTempFaktorer(long tall)
{
tempFaktorer.add(tall);
}
class HjelpeFaktorArbeider implements Runnable //Class for second step factorization
{
int fra;
int til;
int id;
long tall;
public HjelpeFaktorArbeider(int fra, int til, long tall, int id)
{
this.fra = fra;
this.til = til;
this.id = id;
this.tall = tall;
}
public void run()
{
try
{
long t = tall;
for(int i = fra; i < til; i++)
{
int primtall = primes.get(i);
while(t % primtall == 0)
{
try
{
oppdaterTempFaktorer((long) primtall);
}
catch(Exception e){e.printStackTrace();}
t = t/primtall;
}
}
runBarrier.await();
if(t > 1 && id == 0 && !tempFaktorer.contains(t))
{
try
{
oppdaterTempFaktorer(t);
}
catch(Exception e){e.printStackTrace();}
}
faktorBarrier.await();
}
catch(Exception e){e.printStackTrace();}
}
}
}
As of now, I have a list of primes found through Erastothenes' Sieve, a limit n and cores k (8 on my machine).
P.S.: the reason why there are two Runnable classes, is that I need each and every factorization to be executed by multiple threads.
Upvotes: 1
Views: 690
Reputation: 8137
If you are interested in concurrency I suggest you look at other concurrency models such as Futures
or Actors
. Threads and locks are SOooo0o 2007.
As to the question of how to make parallel code faster well that is the billion dollar question which has many approaches. It's entirely dependent on your problem and your architecture.
So to start with your problem isn't a fantastic problem to make use of parallel architectures as you can see because for instance say you have 9 threads, and you check the numbers 2,3,4,5,6,7,8,9,10, that is already pretty bad because you didn't have to check 4,6,8,9,10 as 2 & 3 would have shown you those to be pretty bad (ok you probably wouldn't check even numbers but just to prove my point). So some problems can not make very good use of a parallel code as you can see here.
As to your question about how to efficiently parallelise Java code, the most well accepted answer is don't. Programmers these days are more concerned with not blocking threads which generally revolve around the don't call me we'll call you
principal. But how to write efficient parallel code for mathematical problem is an overly broad computer science question.
But in general your question is too broad.
Upvotes: 2