Levi H

Reputation: 3586

Threading is slow and unpredictable?

I've created the basis of a ray tracer. Here's my test function for drawing the scene:

public void Trace(int start, int jump, Sphere testSphere)
{
    for (int x = start; x < scene.SceneWidth; x += jump)
    {
        for (int y = 0; y < scene.SceneHeight; y++)
        {
            Ray fired = Ray.FireThroughPixel(scene, x, y);

            if (testSphere.Intersects(fired))
                sceneRenderer.SetPixel(x, y, Color.Red);
            else
                sceneRenderer.SetPixel(x, y, Color.Black);
        }
    }
}

SetPixel simply sets a value in a one-dimensional array of colours. If I call the function directly, it runs at a constant 55 fps. If I do:

Thread t1 = new Thread(() => Trace(0, 1, testSphere));
t1.Start();
t1.Join();

It runs at a constant 50 fps, which is fine and understandable. But when I do:

Thread t1 = new Thread(() => Trace(0, 2, testSphere));
Thread t2 = new Thread(() => Trace(1, 2, testSphere));

t1.Start();
t2.Start();

t1.Join();
t2.Join();

The frame rate is all over the place: it moves rapidly between 30 and 40 fps and sometimes leaves that range, going up to 50 or down to 20. It's not constant at all. Why is it running slower than it would if I ran the whole thing on a single thread? I'm running on a quad-core i5 2500K.

Upvotes: 5

Views: 319

Answers (3)

Armin

Reputation: 1062

Experiment: swap x and y so that x is iterated in the inner loop and y in the outer loop, and always distribute the load across threads by rows (y), never by columns (x).

My assumption is based on the fact that bitmaps are almost always stored with ascending memory addresses in the x-direction. If that is the case, your current memory access pattern is hard on the CPU caches, especially when multiple threads are used.
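The experiment above could be sketched roughly as follows. This is only an illustration, not the asker's actual code: `RowBandTracer`, `Shade`, `TraceRows` and `Render` are hypothetical names, `Shade` is a stand-in for the ray/sphere test, and the buffer is assumed to be row-major (`index = y * width + x`), which the question does not confirm.

```csharp
using System;
using System.Threading;

public static class RowBandTracer
{
    // Hypothetical stand-in for Ray.FireThroughPixel + Intersects:
    // returns 1 ("red") inside a centred circle, 0 ("black") outside.
    static int Shade(int x, int y, int width, int height)
    {
        int dx = x - width / 2, dy = y - height / 2;
        int r = Math.Min(width, height) / 4;
        return dx * dx + dy * dy <= r * r ? 1 : 0;
    }

    // Fills rows [rowStart, rowEnd). y is the outer loop and x the inner loop,
    // so consecutive writes go to consecutive addresses (assuming row-major storage).
    static void TraceRows(int[] pixels, int width, int height, int rowStart, int rowEnd)
    {
        for (int y = rowStart; y < rowEnd; y++)
        {
            int rowOffset = y * width;
            for (int x = 0; x < width; x++)
                pixels[rowOffset + x] = Shade(x, y, width, height);
        }
    }

    // Splits the image into one contiguous band of rows per thread.
    public static int[] Render(int width, int height, int threadCount)
    {
        var pixels = new int[width * height];
        var threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            // Declared inside the loop so each lambda captures its own values.
            int rowStart = height * i / threadCount;
            int rowEnd = height * (i + 1) / threadCount;
            threads[i] = new Thread(() => TraceRows(pixels, width, height, rowStart, rowEnd));
            threads[i].Start();
        }
        foreach (var t in threads) t.Join();
        return pixels;
    }
}
```

With contiguous bands, two threads can only touch the same cache line where their bands meet, rather than on nearly every write as with interleaved columns.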

Upvotes: 0

Nick Butler

Reputation: 24433

This is difficult to answer without profiling your app, but I would suspect false sharing.

Both of your threads are writing to a shared memory structure, which will cause cache lines to be invalidated repeatedly between the cores.

The easy way to test this would be to create a separate output array for each thread. The rendered image doesn't have to be correct; just look at the frame rates.
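A minimal sketch of that test, under some assumptions: `SeparateBufferTest`, `Shade` and `TimeFrame` are hypothetical names, `Shade` stands in for the ray/sphere test, and the output is assumed to be a row-major `int[]` rather than the asker's colour array. The loop keeps the column-interleaved order from the question so that only the output buffer changes.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class SeparateBufferTest
{
    // Hypothetical stand-in for the per-pixel ray/sphere test.
    static int Shade(int x, int y, int width, int height)
    {
        int dx = x - width / 2, dy = y - height / 2;
        int r = Math.Min(width, height) / 4;
        return dx * dx + dy * dy <= r * r ? 1 : 0;
    }

    // Same traversal as the question's Trace, but writing to a caller-supplied buffer.
    public static void Trace(int start, int jump, int[] output, int width, int height)
    {
        for (int x = start; x < width; x += jump)
            for (int y = 0; y < height; y++)
                output[y * width + x] = Shade(x, y, width, height);
    }

    // Renders one frame on two threads and returns the elapsed milliseconds.
    // With separateBuffers == true each thread writes to its own array,
    // so the two threads never write to the same cache line.
    public static double TimeFrame(int width, int height, bool separateBuffers)
    {
        var first = new int[width * height];
        var second = separateBuffers ? new int[width * height] : first;
        var t1 = new Thread(() => Trace(0, 2, first, width, height));
        var t2 = new Thread(() => Trace(1, 2, second, width, height));
        var sw = Stopwatch.StartNew();
        t1.Start(); t2.Start();
        t1.Join(); t2.Join();
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }
}
```

Calling `TimeFrame` many times with `true` and with `false` and comparing the averages (discarding the first few calls, which include JIT warm-up) should indicate whether the shared array is the problem.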

I wrote an article about this a while back: "Concurrency Hazards: False Sharing"

Upvotes: 3

JimmyMcHoover

Reputation: 864

Threading usually isn't the way to go with rendering. I don't know exactly what is executed within the thread, but it's possible that creating and joining the threads costs more time than you gain from the parallel calculation. It also depends on the number of processor cores.
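One way to check whether thread creation and joining is a significant cost is to time it in isolation. The following is only a rough sketch; `ThreadOverhead` and `AverageStartJoinMicroseconds` are hypothetical names, and the threads deliberately do no work.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class ThreadOverhead
{
    // Creates, starts and joins two empty threads per iteration, mirroring the
    // per-frame pattern in the question, and returns the average cost in microseconds.
    public static double AverageStartJoinMicroseconds(int iterations)
    {
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            var t1 = new Thread(() => { });
            var t2 = new Thread(() => { });
            t1.Start(); t2.Start();
            t1.Join(); t2.Join();
        }
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds * 1000.0 / iterations;
    }
}
```

At 50 fps a frame takes 20 ms, so the measured figure can be compared directly against that budget to see whether thread start-up alone could explain the drop.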

Upvotes: 0
