Reputation: 103
I'm trying to parallelize a for loop in C++. The idea is that given an array of asteroids, I calculate the gravity force the asteroids create between them. Each asteroid has its mass and position.
I want to parallelize this loop, but the problem is that there is a segmentation fault when a thread want to access to any asteroid that is used by another thread to calculate the force.
this is my code:
//For each asteroid calculate forzes acting
for(unsigned long j=0; j<asteroids.size(); j++){
vector<double>forces(2);
{
#pragma omp parallel num_threads(4)
#pragma omp for
//I start in x instead of 0 to avoid redundance calculation
for(unsigned long x=j; x <asteroids.size(); x++){
//Avoid calculations on itself
if(asteroids[j].getX() != asteroids[x].getX() && asteroids[j].getY() != asteroids[x].getY()){
forces = asteroids[j].calculateAsteroidMov(asteroids[x], gravity, dmin);
}
asteroids[x].invertForze(forces[0], forces[1]);
}
}
for(unsigned long j=0; j<asteroids.size(); j++){
asteroids[j].updatePosition(t, width, height);
}
}
And here is the calculateAstoidMov is:
std::vector<double> Asteroid::calculateAsteroidMov(Asteroid neighbour, double gravity, double dmin){
//Distance between
double xdist = x - neighbour.getX();
double ydist = y - neighbour.getY();
double dist = sqrt( xdist*xdist + ydist*ydist );
double xforze = 0;
double yforze = 0;
if(dist > dmin){
double slope = ydist / xdist;
if(slope > 1 || slope < -1){
slope -= trunc(slope);
}
double alfa = atan(slope);
xforze = ((gravity * mass * neighbour.getMass()) / (dist*dist));
yforze = ((gravity * mass * neighbour.getMass()) / (dist*dist));
if(xforze > 200){
xforze = 200;
}else if(yforze > 200){
yforze = 200;
}
xforze *= cos(alfa);
yforze *= sin(alfa);
sumxforze += xforze;
sumyforze += yforze;
}
std::vector<double> forces = {xforze, yforze};
return forces;
}
And the updatePosition()
void Asteroid::updatePosition(double t, double width, double height){
//Spped update
vx += (sumxforze/mass) * t;
vy += (sumyforze/mass) * t;
//Position update
x += vx * t;
y += vy * t;
}
How I should parallelize the loop that calculate the force? I hope it is clear...
Upvotes: 1
Views: 449
Reputation: 405
The problem is you write the forces vector from different threads at the same time. You can move it's declaration to the inner for loop, so concurrency will not be a problem. I also assume that, you shouldn't call the invertForze
when you don't calculate forces
.
for(unsigned long j=0; j<asteroids.size(); j++){
{
#pragma omp parallel num_threads(4)
#pragma omp for
//I start in x instead of 0 to avoid redundance calculation
for(unsigned long x=j; x <asteroids.size(); x++){
//Avoid calculations on itself
if(asteroids[j].getX() != asteroids[x].getX() && asteroids[j].getY() != asteroids[x].getY()){
vector<double> forces = asteroids[j].calculateAsteroidMov(asteroids[x], gravity, dmin);
asteroids[x].invertForze(forces[0], forces[1]);
}
}
}
for(unsigned long j=0; j<asteroids.size(); j++){
asteroids[j].updatePosition(t, width, height);
}
}
Upvotes: 0
Reputation:
There are two ways to tackle this issue.
Maintain two list of asteroids, one you read from, and one you write to. Many threads can safely read from the same asteroid, and you are guaranteed each thread writes to an area of memory the others don't have access to.
Depdending on what invertForze()
does, this might also give you the benefit of making the whole process order-independant.
Break up your asteroid field into sub-fields of interacting asteroids, and paralellize on a per-island basis, instead of on a per-asteroid basis.
This is the approach most modern physics engine use, because they use the assumption that islands tend to remain the same frame after frame, but it is a lot more complicated to put into place compared to a simple double-buffered solution.
Upvotes: 1