xaxazak

Reputation: 830

Vulkan: Concurrent host-writes and device reads to separate parts of same VkMemory

To transfer my static data to the GPU, I'm thinking of having a single staging VkMemory object (ballpark 64MB) and using it as a rotating queue. However, I have multiple threads producing content (e.g. rendering glyphs, loading files, procedural generation), and I'd like each of them to be able to upload its data entirely by itself (i.e. write the data and submit the Vulkan transfer commands).

I'm intending to keep the entire staging VkMemory permanently mapped (if this is dumb please say so) at least during loading (but perhaps longer if I want to stream data).

To achieve the above, once a thread's data is fully written/flushed to staging I'd like it to be able to immediately submit GPU transfer commands.

However, that means the GPU will be reading from one part of the VkMemory while other threads may be writing/flushing to it.

AFAIK I will also need to use image memory barriers for the transition from VK_IMAGE_LAYOUT_PREINITIALIZED to VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL.

I couldn't find anything in the spec explicitly saying this is legal or illegal, only that care should be taken to ensure synchronization. I didn't find enough detail to be sure one way or the other.

NOTE: The staging queue will need to ensure transfers have completed before overwriting anything. I intend to keep a complementary queue of VkFences for this.


Questions:

  1. Is this OK?
  2. Do I need to align each separate object to a page boundary, or to something else?
  3. Am I correct in assuming that the image memory barrier (above) won't require the device to write to staging memory?

Upvotes: 1

Views: 493

Answers (1)

ratchet freak

Reputation: 48196

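  1. Yes. What the spec requires is that accesses to the same region are synchronized: the host must not write to a range the device is still reading, and must not read or write a range the device is still writing. It places no such requirement on separate regions of the same memory object.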

  2. If the memory is not host-coherent, you must align the blocks being read from or written to a multiple of VkPhysicalDeviceLimits::nonCoherentAtomSize.

Source: the note after the declaration of vkMapMemory in the Vulkan spec:

vkMapMemory does not check whether the device memory is currently in use before returning the host-accessible pointer. The application must guarantee that any previously submitted command that writes to this range has completed before the host reads from or writes to that range, and that any previously submitted command that reads from that range has completed before the host writes to that region (see here for details on fulfilling such a guarantee). If the device memory was allocated without the VK_MEMORY_PROPERTY_HOST_COHERENT_BIT set, these guarantees must be made for an extended range: the application must round down the start of the range to the nearest multiple of VkPhysicalDeviceLimits::nonCoherentAtomSize, and round the end of the range up to the nearest multiple of VkPhysicalDeviceLimits::nonCoherentAtomSize.

  3. A layout transition may write to the memory; however, barriers do their own syncing with regard to previous and subsequent memory accesses.
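
To make the rounding rule from the quoted note (point 2) concrete, here is a minimal sketch in plain C. It assumes you pass in the value of VkPhysicalDeviceLimits::nonCoherentAtomSize and the size of the whole allocation yourself; StagingRange, expand_to_atom and ranges_overlap are hypothetical names for illustration, not part of the Vulkan API. The clamp at the end of the allocation reflects my understanding that a flushed range may end exactly at the end of the memory object instead of on an atom multiple; check that against the valid usage rules for VkMappedMemoryRange.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: a byte range inside one mapped allocation. */
typedef struct {
    uint64_t offset;
    uint64_t size;
} StagingRange;

/* Expand [offset, offset + size) outward to multiples of `atom`.
 * `atom` stands in for VkPhysicalDeviceLimits::nonCoherentAtomSize and
 * `allocation_size` for the size of the whole memory object. */
static StagingRange expand_to_atom(uint64_t offset, uint64_t size,
                                   uint64_t atom, uint64_t allocation_size)
{
    assert(atom != 0);
    assert(offset + size <= allocation_size);

    uint64_t begin = (offset / atom) * atom;                    /* round start down */
    uint64_t end = ((offset + size + atom - 1) / atom) * atom;  /* round end up */
    if (end > allocation_size) {
        end = allocation_size;  /* assumption: clamp to the end of the allocation */
    }

    StagingRange r = { begin, end - begin };
    return r;
}

/* Returns 1 if two already-expanded ranges share any bytes, 0 otherwise.
 * A writer thread and an in-flight transfer should only run concurrently
 * when their expanded ranges do not overlap. */
static int ranges_overlap(StagingRange a, StagingRange b)
{
    return a.offset < b.offset + b.size && b.offset < a.offset + a.size;
}
```

With an atom size of 64, a block at offset 100 of size 50 expands to offset 64, size 128. A second block starting at 192 is safe to use alongside it, while one starting at 150 is not, because both would share the atom that begins at 128. Sizing and placing every block in the rotating queue on atom multiples avoids the problem entirely.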

Upvotes: 1
