Here is my situation: I have a RWTexture2D<float4> out, which will always be in state D3D12_RESOURCE_STATE_UNORDERED_ACCESS, and another RWTexture2D<float4> tex (initialized in state D3D12_RESOURCE_STATE_COPY_SOURCE). My render loop is like this:
1. Clear out to zero using ClearUnorderedAccessViewFloat.
2. Call DispatchRays, which will read from and write to out.
3. Transition tex from D3D12_RESOURCE_STATE_COPY_SOURCE to D3D12_RESOURCE_STATE_UNORDERED_ACCESS.
4. Run a pass that reads from out and stores the result in tex.
5. Transition tex from D3D12_RESOURCE_STATE_UNORDERED_ACCESS to D3D12_RESOURCE_STATE_COPY_SOURCE.
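To make the ordering I'm after concrete, here is a minimal, API-free C++ model of the loop above. It is only a sketch under assumptions: State, Texture and FrameModel are hypothetical stand-ins and not D3D12 types, the "post pass" name for step 4 is made up, and the model assumes that a UAV barrier is what orders the clear before DispatchRays (which is exactly the point I'm unsure about). It also only tracks write-then-access hazards, nothing else.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for the two resource states used in the loop.
enum class State { CopySource, UnorderedAccess };

// Hypothetical model of a texture: its current state and whether an earlier
// UAV write has not yet been synchronized by a barrier.
struct Texture {
    explicit Texture(State s) : state(s) {}
    State state;
    bool unsynced_uav_write = false;
};

struct FrameModel {
    Texture out{State::UnorderedAccess};
    Texture tex{State::CopySource};
    std::vector<std::string> log;  // recorded commands, in submission order

    // A UAV read or write needs the UAV state and no unsynchronized earlier write.
    void uav_access(Texture& t, bool writes, const std::string& what) {
        if (t.state != State::UnorderedAccess)
            throw std::logic_error(what + ": resource is not in the UAV state");
        if (t.unsynced_uav_write)
            throw std::logic_error(what + ": earlier UAV write not synchronized");
        if (writes) t.unsynced_uav_write = true;
    }

    // Models a UAV barrier: earlier UAV writes become safe to access.
    void uav_barrier(Texture& t, const std::string& name) {
        t.unsynced_uav_write = false;
        log.push_back("uav barrier " + name);
    }

    // Models a transition barrier; it also synchronizes earlier writes.
    void transition(Texture& t, State from, State to, const std::string& name) {
        if (t.state != from)
            throw std::logic_error("transition " + name + ": unexpected state");
        t.state = to;
        t.unsynced_uav_write = false;
        log.push_back("transition " + name);
    }

    void run_frame() {
        uav_access(out, true, "clear out");      // step 1: clear writes out
        log.push_back("clear out");
        uav_barrier(out, "out");                 // clear must finish before rays
        uav_access(out, true, "DispatchRays");   // step 2: reads and writes out
        log.push_back("DispatchRays");
        uav_barrier(out, "out");                 // rays must finish before step 4
        transition(tex, State::CopySource, State::UnorderedAccess, "tex");  // step 3
        uav_access(out, false, "post pass");     // step 4: reads out ...
        uav_access(tex, true, "post pass");      // ... and writes tex
        log.push_back("post pass");
        transition(tex, State::UnorderedAccess, State::CopySource, "tex");  // step 5
        assert(tex.state == State::CopySource);  // frame ends where it started
    }
};
```

Calling run_frame() on a FrameModel records seven commands per frame and never blocks on the CPU; removing either uav_barrier call makes the next uav_access throw, which is the hazard I currently avoid by waiting for the GPU.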
Currently I'm waiting for the GPU to finish (using a method of the form wait_for_gpu below) after (1.) and again after (2.). I've noticed that the performance of this is rather poor. So my question is: How can I do this better?
I guess it can be made far more efficient by using (resource) barriers. Besides ResourceBarrier there is now another method, Barrier (see https://microsoft.github.io/DirectX-Specs/d3d/D3D12EnhancedBarriers.html), and I'm quite lost about how I should use them here.
I've tried something like the following:
// Zero-initialize so that no field is left indeterminate.
D3D12_RESOURCE_BARRIER resource_barrier = {};
resource_barrier.Flags = D3D12_RESOURCE_BARRIER_FLAG_NONE;
resource_barrier.Type = D3D12_RESOURCE_BARRIER_TYPE_UAV;
resource_barrier.UAV.pResource = out;
command_list->ResourceBarrier(1, &resource_barrier);
But does that really do what I want? I only want to make sure that, before the ray generation shader is invoked by DispatchRays, the stores from ClearUnorderedAccessViewFloat have finished and it is safe to read those values.
Using Barrier instead, it might be better to specify a D3D12_BARRIER_GROUP with a D3D12_TEXTURE_BARRIER and SyncBefore = D3D12_BARRIER_SYNC_CLEAR_UNORDERED_ACCESS_VIEW (though this enum value doesn't seem to be available in my version of d3d12.h) and SyncAfter = D3D12_BARRIER_SYNC_RAYTRACING.
Unfortunately, I personally think that the documentation is quite poor and I have no idea what would be the best way to use these things here. So, any help is highly appreciated.
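void wait_for_gpu()
{
    // Signal the fence once the GPU has executed everything submitted so far.
    command_queue->Signal(fence, fence_values[frame_index]);
    // Block the CPU until the fence reaches that value.
    fence->SetEventOnCompletion(fence_values[frame_index], fence_event);
    WaitForSingleObjectEx(fence_event, INFINITE, FALSE);
    ++fence_values[frame_index];
}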