Reputation: 779
I've written a fairly complex program which simulates the behavior of small cell-like organisms in a particle system.
The simulation goes through a few stages where it calculates the next position for each object in the simulation. A 'motion' member function is called for each particle and organism in the simulation, which updates its displacement vectors (in the case of the organism, it handles AI and motion). The render function then iterates through each object and draws the appropriate shape at each object's new displacement.
The displacement calculations for each object are done several hundred times before a frame is rendered, which allows the simulation to run much faster.
In this process, it is impossible to calculate the next position of the objects without knowing the current position (due to the serial nature of the process), so I cannot give a 'block' of frames to calculate to several different threads - each frame relies on the calculation of the previous one.
As I mentioned before, the new position of each object is calculated by iterating through the objects and calling each object's member function, which calculates the new position for that particular object. I was wondering whether this process could be done in parallel: calculate one quarter of the objects on one thread, another quarter on a second thread, and so on. Is this approach possible, and would it speed up the calculations for each frame when there is a very large number of objects in the simulation?
Upvotes: 0
Views: 1448
Reputation: 24269
You are probably going to want to use some form of double buffering: that is, you're going to want a set of cells that are your pristine, source state, and a set of cells into which your application writes the results of calculations.
As you process the simulation, you will read from the first buffer and write to the second. When the pass is completed, you swap.
typedef Cell World[9][9]; // World is a 9x9 matrix of Cells
World buffers[2]; // 2 buffers.
World* src = &buffers[0];
World* dst = &buffers[1];
PopulateWorld(src);
while (running) {
    // Read only from src, write only to dst, then swap roles.
    PerformTransformations((const World*)src, (/*!const*/ World*)dst);
    std::swap(src, dst);
}
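To tie this back to the "one quarter of the objects per thread" idea in the question, here is a minimal sketch of a double-buffered step split across threads. The names `Particle`, `Advance`, `ParallelStep` and `Simulate` are made up for illustration, and the straight-line motion in `Advance` is only a stand-in for the real 'motion' function. Each thread may read anything in the source buffer but writes only its own slice of the destination buffer, so no locks are needed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical particle state; the real simulation's objects will differ.
struct Particle {
    double x = 0.0, y = 0.0;    // position
    double vx = 0.0, vy = 0.0;  // velocity
};

// Compute the next state of particle i from the read-only source buffer.
// Stand-in for the 'motion' member function described in the question.
Particle Advance(const std::vector<Particle>& src, std::size_t i, double dt) {
    Particle next = src[i];
    next.x += next.vx * dt;
    next.y += next.vy * dt;
    return next;
}

// One simulation step: every thread reads from src and writes only its own
// contiguous slice of dst.
void ParallelStep(const std::vector<Particle>& src, std::vector<Particle>& dst,
                  double dt, unsigned threadCount) {
    assert(src.size() == dst.size());
    threadCount = std::max(1u, threadCount);
    const std::size_t n = src.size();
    const std::size_t chunk = (n + threadCount - 1) / threadCount;  // round up

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < threadCount; ++t) {
        const std::size_t begin = std::min(n, t * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        if (begin == end) break;  // fewer objects than threads
        workers.emplace_back([&src, &dst, begin, end, dt] {
            for (std::size_t i = begin; i < end; ++i)
                dst[i] = Advance(src, i, dt);
        });
    }
    for (std::thread& w : workers) w.join();  // the step ends when all slices are done
}

// Run several steps in sequence, swapping the two buffers after each one.
std::vector<Particle> Simulate(std::vector<Particle> state, int steps,
                               double dt, unsigned threadCount) {
    std::vector<Particle> scratch(state.size());
    for (int s = 0; s < steps; ++s) {
        ParallelStep(state, scratch, dt, threadCount);
        std::swap(state, scratch);
    }
    return state;
}
```

The steps stay serial; only the work inside a step is divided. Since you run several hundred steps per frame, starting fresh threads for every step will likely cost more than it saves unless the object count is large, so a persistent pool of worker threads is probably the better fit in practice.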
Alternatively, you could marshal/encapsulate the transformable properties of each cell into their own structs/classes so that each cell has a pair and you simply swap between the two.
struct Cell {
    struct Data {
        Matrix3d position;
        Matrix3d velocity;
    };
    Data m_data[2];
    static void DetermineWhichBuffersToUse(size_t runNo, size_t& srcNo, size_t& dstNo) {
        // when runNo is even, use m_data[0] as src,
        // when runNo is odd, use m_data[1] as src.
        srcNo = (runNo & 1);
        dstNo = 1 - srcNo;
    }
    ...
};
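As a rough sketch of how a cell might use its own pair of buffers during a pass: `Vec2`, `Step` and `Current` below are hypothetical names added for illustration (with `Vec2` standing in for `Matrix3d`), and the motion rule is just position plus velocity times time step.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for Matrix3d so the sketch compiles on its own.
struct Vec2 {
    double x = 0.0, y = 0.0;
};

struct Cell {
    struct Data {
        Vec2 position;
        Vec2 velocity;
    };
    Data m_data[2];

    // Even runs read m_data[0] and write m_data[1]; odd runs do the reverse.
    static void DetermineWhichBuffersToUse(std::size_t runNo,
                                           std::size_t& srcNo, std::size_t& dstNo) {
        srcNo = (runNo & 1);
        dstNo = 1 - srcNo;
    }

    // Hypothetical update: read the source slot, write the destination slot.
    void Step(std::size_t runNo, double dt) {
        std::size_t srcNo = 0, dstNo = 0;
        DetermineWhichBuffersToUse(runNo, srcNo, dstNo);
        assert(srcNo != dstNo);  // never read and write the same slot
        const Data& src = m_data[srcNo];
        Data& dst = m_data[dstNo];
        dst.velocity = src.velocity;
        dst.position.x = src.position.x + src.velocity.x * dt;
        dst.position.y = src.position.y + src.velocity.y * dt;
    }

    // The state that run number runNo will read, i.e. the result of run runNo - 1.
    const Data& Current(std::size_t runNo) const { return m_data[runNo & 1]; }
};
```

With this layout there is nothing to swap: incrementing the run counter flips which slot is the source. The renderer would read `Current(runNo)` once the pass for `runNo - 1` has completed.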
Another option would be to use a message-passing pipeline, whereby you marshal all of the cells into requests to process data and have the computations done by worker threads, each of which outputs a message with the resulting dataset back to the parent thread.
The parent thread sends out all of the messages and then reaps back all of the results and writes them. This solution is more worthy of investigation if you are planning to scale the simulation up across multiple systems, in which case you might want to look into something like ZeroMQ for the message passing library.
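For a single process, the send/reap pattern can be sketched with nothing more than the standard library. Everything here (`Channel`, `Request`, `Result`, `Worker`, `StepByMessages`) is a made-up illustration, not ZeroMQ's API, and it uses one-dimensional positions to keep the example short.

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Minimal blocking queue used as a one-way message channel.
template <typename T>
class Channel {
public:
    void Send(T msg) {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_queue.push(std::move(msg));
        }
        m_ready.notify_one();
    }
    T Receive() {  // blocks until a message is available
        std::unique_lock<std::mutex> lock(m_mutex);
        m_ready.wait(lock, [this] { return !m_queue.empty(); });
        T msg = std::move(m_queue.front());
        m_queue.pop();
        return msg;
    }
private:
    std::mutex m_mutex;
    std::condition_variable m_ready;
    std::queue<T> m_queue;
};

// Hypothetical messages: a request carries a copy of one cell's state.
struct Request { std::size_t index; double position; double velocity; bool stop; };
struct Result  { std::size_t index; double position; };

// Worker loop: take a request, compute, send the result back.
void Worker(Channel<Request>& in, Channel<Result>& out, double dt) {
    for (;;) {
        Request req = in.Receive();
        if (req.stop) return;
        out.Send(Result{req.index, req.position + req.velocity * dt});
    }
}

// Parent: send one request per cell, reap every result, write them back.
void StepByMessages(std::vector<double>& positions,
                    const std::vector<double>& velocities,
                    double dt, unsigned workerCount) {
    assert(positions.size() == velocities.size());
    workerCount = std::max(1u, workerCount);
    Channel<Request> requests;
    Channel<Result> results;
    std::vector<std::thread> workers;
    for (unsigned w = 0; w < workerCount; ++w)
        workers.emplace_back(Worker, std::ref(requests), std::ref(results), dt);

    for (std::size_t i = 0; i < positions.size(); ++i)
        requests.Send(Request{i, positions[i], velocities[i], false});
    for (std::size_t i = 0; i < positions.size(); ++i) {
        Result r = results.Receive();
        positions[r.index] = r.position;
    }
    for (unsigned w = 0; w < workerCount; ++w)
        requests.Send(Request{0, 0.0, 0.0, true});  // one stop message per worker
    for (std::thread& t : workers) t.join();
}
```

One message per cell is almost certainly too fine-grained for real use; batching many cells into each request would cut the queueing overhead. The point of the sketch is only the shape: workers never touch shared simulation state, they only see the copies carried in the messages.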
Upvotes: 1