user176168
user176168

Reputation: 1360

Compute shader: read data written in one thread from another?

Can somebody tell me whether the following compute shader is possible with DirectX 11?

I want the first thread in a Dispatch that accesses an element in a buffer (g_positionsGrid) to set (compare exchange) that element with a temporary value to signify that it is taking some action.

In this case the temp value is 0xffffffff and the first thread is going to go continue on and allocate a value from a structed append buffer (g_positions) and assign it to that element.

So all fine so far but the other threads in the dispatch can come in inbetween the compare exchange and the allocation of the first thread and so need to wait until the allocation index is available. I do this with a busy wait ie the while loop.

However sadly this just locks up the GPU as I'm assuming that the value written by the first thread is not propogated through to the other threads stuck in the while loop.

Is there any way to get those threads to see that value?

Thanks for any help!

RWStructuredBuffer<float3> g_positions : register(u1);
RWBuffer<uint> g_positionsGrid : register(u2);

void AddPosition( uint address, float3 pos )
{
    uint token = 0; 

    // Assign a temp value to signify first thread has accessed this particular element
    InterlockedCompareExchange(g_positionsGrid[address], 0, 0xffffffff, token);

    if(token == 0)
    {
        //If first thread in here allocate index and assign value which
        //hopefully the other threads will pick up
        uint index = g_positions.IncrementCounter();
        g_positionsGrid[address] = index;
        g_positions[index].m_position = pos;
    }
    else
    {
        if(token == 0xffffffff)
        {
            uint index = g_positionsGrid[address];

            //This never meets its condition
            [allow_uav_condition]
            while(index == 0xffffffff) 
            { 
                //For some reason this thread never gets the assignment
                //from the first thread assigned above
                index = g_positionsGrid[address]; 
            }

            g_positions[index].m_position = pos;
        }
        else
        {
            //Just assign value as the first thread has already allocated a valid slot 
            g_positions[token].m_position = pos;

        }
    }
}

Upvotes: 2

Views: 4325

Answers (1)

Drop
Drop

Reputation: 13003

Thread sync in DirectCompute is very easy, but comparing to same features to CPU threading is very unflexible. AFAIK, the only way to sync data between threads in compute shader is to use groupshared memory and GroupMemoryBarrierWithGroupSync(). That means, that you can:

  • create small temporary buffer in groupshared memory
  • calculate value
  • write to groupshared buffer
  • synchronize threads with GroupMemoryBarrierWithGroupSync()
  • read from groupshared from another thread and use it somehow

To implement all this stuff, you need proper array indices. But where you can take it from? In DirectCompute values passed in Dispatch and system values that you can get in shader (SV_GroupIndex, SV_DispatchThreadID, SV_GroupThreadID, SV_GroupID) related. Using that values you can calculate indices to assess you buffers.

Compute shaders are not well documented, and there is no easy way, but to find out more info at least you can:

As of your code. Well, probably you can redesign it a little.

  1. It is always good to all threads do the same task. Symmetric loading. Actually, you can not assign different tasks for you threads as you do it in CPU code.

  2. If your data first need some preprocessing, and further processing, you may want to divide it to differrent Dispatch() calls (different shaders) that you will call in sequence:

    • preprocessShader reads from buffer inputData and writes to preprocessedData
    • calculateShader feads from preprocessedData and writes to finalData

    In this case you can drop out any slow thread sync and slow group shared memory.

  3. Look at "Thread reduction" trick mentioned above.

Hope it helps! And happy coding!

Upvotes: 1

Related Questions