iOS Metal: The fastest way to read read-only data?

Question

Situation: In a Metal kernel function, every thread in a threadgroup reads the same value at the same time. The kernel pseudocode:

kernel void foo(device   int2*   ranges,  
                constant float3& readonlyBuffer,  
                device   float*  results,  
                uint lno [[ threadgroup_position_in_grid ]])  
{  
  float acc = 0.0;  

  for(int i=ranges[lno].x; i



The problem: To optimize buffer reads, I changed readonlyBuffer's address space qualifier from device to constant. This had no measurable effect on kernel performance, even though Apple's documentation suggests it should:


  The constant address space is optimized for multiple instances executing a graphics or kernel function accessing the same location in the buffer.


Questions:


How can I improve memory read times of my constant buffer?
Can I move the buffer (or at least a part of it) to an on-chip cache (something like Constant Buffer Preloading (page 24))?

Taylor · Accepted Answer

In your example code, indexing into readonlyBuffer would generate a compiler error.

If readonlyBuffer is instead declared as a pointer, the compiler cannot statically determine its size, so it cannot move the data into the constant memory space.

If readonlyBuffer is small (you only have 4KB of constant memory to work with), put it into a struct as follows:

struct ReadonlyBuffer {
    float3 values[MAX_BUFFER_SIZE];
};

Then pass the struct to the kernel by constant reference:

kernel void foo(device   int2*   ranges,  
                constant ReadonlyBuffer& readonlyBuffer,  
                device   float*  results,  
                uint lno [[ threadgroup_position_in_grid ]])
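To pick a value for MAX_BUFFER_SIZE, it can help to work out how many elements fit in the 4KB mentioned above. The following is a minimal host-side C++ sketch, not part of the original answer. It assumes the 4KB figure holds for your target GPU and that nothing else competes for that space. Float3, kConstantBudgetBytes, kMaxBufferSize and fitsConstantBudget are hypothetical names introduced for illustration; Float3 stands in for the shader's float3, which occupies 16 bytes in the Metal Shading Language.

```cpp
#include <cstddef>

// Hypothetical host-side stand-in for MSL's float3, which occupies 16 bytes
// (three floats padded up to a 16-byte boundary).
struct alignas(16) Float3 {
    float x, y, z;
};

// Constant-memory budget as stated in the answer (an assumption here;
// verify it for the GPU you are targeting).
constexpr std::size_t kConstantBudgetBytes = 4 * 1024;

// Largest element count whose array still fits in the assumed budget.
constexpr std::size_t kMaxBufferSize = kConstantBudgetBytes / sizeof(Float3);

// Mirrors the layout of the shader-side ReadonlyBuffer struct.
struct ReadonlyBuffer {
    Float3 values[kMaxBufferSize];
};

// Returns true if an array of elementCount Float3 values fits in the budget.
constexpr bool fitsConstantBudget(std::size_t elementCount) {
    return elementCount * sizeof(Float3) <= kConstantBudgetBytes;
}

// Compile-time checks: fail the build if the layout assumptions break.
static_assert(sizeof(Float3) == 16, "Float3 must match the size of MSL float3");
static_assert(sizeof(ReadonlyBuffer) <= kConstantBudgetBytes,
              "ReadonlyBuffer exceeds the assumed constant-memory budget");
```

Under these assumptions, MAX_BUFFER_SIZE can be at most 256, since 4096 / 16 = 256. Anything larger would need to stay in device memory or be split up.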

Finally, run a GPU trace ("Capture GPU Frame") and ensure you don't get the following error:

The Compiler was not able to Preload your Buffer. Kernel Function, Buffer Index: 1.

For more info about Buffer Preloading, see: https://developer.apple.com/videos/play/wwdc2016/606/?time=408
