Reputation: 2566
I'll preface this by saying that this is my first time ever touching shader code, so I'm sure there will be plenty wrong with my wording.
I'm working on a SpriteKit game and want to be able to generate sprites instead of manually creating/changing/adding assets. My idea was to have a "base" asset, make note of all the different colors that exist within the asset, then programmatically create uniforms representing both the asset's colors and the colors that I want to replace them with. From there, my shader could compare each rgb value with the uniforms representing the asset's colors and determine which color to replace it with. Essentially, I'm building a mapping from old colors to new colors and performing a color swap.
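To make the comparison step concrete, here is a minimal CPU-side sketch in Swift of the per-pixel logic described above. It is only an illustration: in the real setup this comparison would run in the fragment shader against the generated uniforms, and the names `RGB`, `SwapEntry`, `swapColor`, and the `tolerance` parameter are hypothetical, not part of SpriteKit.

```swift
// Hypothetical color type: components in the 0...1 range.
struct RGB: Equatable {
    var r: Float
    var g: Float
    var b: Float
}

// One mapping entry: a color found in the base asset and its replacement.
struct SwapEntry {
    var base: RGB
    var new: RGB
}

// Returns the replacement for `pixel` if it matches a base color within
// `tolerance` on every channel; otherwise returns `pixel` unchanged.
func swapColor(_ pixel: RGB, using mapping: [SwapEntry], tolerance: Float = 0.001) -> RGB {
    for entry in mapping {
        let matches = abs(pixel.r - entry.base.r) <= tolerance
            && abs(pixel.g - entry.base.g) <= tolerance
            && abs(pixel.b - entry.base.b) <= tolerance
        if matches {
            return entry.new
        }
    }
    return pixel
}
```

Each entry corresponds to one "u_base_"/"u_new_" pair, so every swappable color costs two uniforms.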
This works fine as long as I use fewer than 30 uniforms in my "mapping." When I reach 30, I get the following error (note that the "u_base_" and "u_new_" names are my generated uniforms):
program_source:49:1883: error: 'buffer' attribute parameter is out of bounds: must be between 0 and 30 fragment float4 SKShader_FragFunc( texture2d u_texture [[texture(0)]], const device float *u_time [[buffer(0)]], const device float *u_path_length [[buffer(1)]], const device float4 * u_base_skinDark [[buffer(2)]],const device float4 * u_base_eyesColor [[buffer(3)]],const device float4 * u_new_shirtMediumLight [[buffer(4)]],const device float4 * u_base_eyesDark [[buffer(5)]],const device float4 * u_new_shoes [[buffer(6)]],const device float4 * u_new_hairDark [[buffer(7)]],const device float4 * u_new_eyesWhite [[buffer(8)]],const device float4 * u_new_skinMediumDark [[buffer(9)]],const device float4 * u_base_skinMediumLight [[buffer(10)]],const device float2 * u_sprite_size [[buffer(11)]],const device float4 * u_base_shirtMediumLight [[buffer(12)]],const device float4 * u_new_eyesColor [[buffer(13)]],const device float4 * u_base_shirtDark [[buffer(14)]],const device float4 * u_base_shirtMediumDark [[buffer(15)]],const device float4 * u_new_hairMedium [[buffer(16)]],const device float4 * u_base_eyesBrow [[buffer(17)]],const device float4 * u_base_hairDark [[buffer(18)]],const device float4 * u_new_shirtMediumDark [[buffer(19)]],const device float4 * u_new_skinDark [[buffer(20)]],const device float4 * u_new_eyesDark [[buffer(21)]],const device float4 * u_new_eyesBrow [[buffer(22)]],const device float4 * u_new_hairLight [[buffer(23)]],const device float4 * u_new_skinLight [[buffer(24)]],const device float4 * u_base_skinLight [[buffer(25)]],const device float4 * u_new_skinMediumLight [[buffer(26)]],const device float4 * u_base_hairLight [[buffer(27)]],const device float4 * u_base_shoes [[buffer(28)]],const device float4 * u_base_eyesWhite [[buffer(29)]],const device float4 * u_new_shirtDark [[buffer(30)]],const device float4 * u_base_skinMediumDark [[buffer(31)]],const device float4 * u_base_hairMedium [[buffer(32)]], SKShader_VertexOut interpolated [[stage_in]]) {
I've found almost no references to the error message, so finding a fix has been difficult. I've since discovered that this isn't really the right way to use shaders, since these types of comparisons aren't very performant, but I'd still like to understand the error. Is there any way to increase the size of this buffer? Is this just a limitation of SpriteKit? Also, is using a "swap texture" the only way to accomplish this color swap?
Upvotes: 2
Views: 578
Reputation: 66
I won't be able to answer your specific questions regarding SpriteKit, but I have encountered the same error while working on a Metal project.
Depending on your hardware you may be able to use more than 31 buffers (i.e. the limit you are running into) by utilising argument buffers, e.g. for Tier 1 (from about_argument_buffers):
Tier 1 Limits
The following resource limits are defined as the maximum combined number of resources set within an argument buffer and set individually, per graphics or compute function. For example, if a kernel function uses 4 individual textures and one argument buffer with 8 textures, the total number of textures for that kernel function is 12.
In iOS and tvOS, the maximum entries in each function argument table are:
31 buffers (on A11 and later, 96 buffers)
31 textures* (on A11 and later, 96 textures)
16 samplers
*Writable textures are not supported within an argument buffer.
In macOS, the maximum entries in each function argument table are:
64 buffers
128 textures
16 samplers
On macOS with Tier 1 limits (e.g. on an M1), you should be able to use a total of 64 buffers by utilising argument buffers.
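As a worked example of how these limits add up, the sketch below hard-codes the Tier 1 buffer counts quoted above and checks whether a given function fits. The `Platform` enum and both function names are hypothetical helpers for illustration, not Metal API, and counting the argument buffer itself as one individually set buffer is my assumption.

```swift
// Tier 1 buffer limits per function argument table, as quoted above.
enum Platform {
    case iOSBeforeA11   // also tvOS
    case iOSA11OrLater  // also tvOS
    case macOS
}

func tier1BufferLimit(on platform: Platform) -> Int {
    switch platform {
    case .iOSBeforeA11: return 31
    case .iOSA11OrLater: return 96
    case .macOS: return 64
    }
}

// Buffers set individually and buffers referenced from an argument buffer
// count towards the same combined total.
func fitsTier1(individual: Int, inArgumentBuffer: Int, on platform: Platform) -> Bool {
    return individual + inArgumentBuffer <= tier1BufferLimit(on: platform)
}

// A kernel with 2 individually set buffers plus an argument buffer
// holding 40 buffers: 42 in total.
print(fitsTier1(individual: 2, inArgumentBuffer: 40, on: .macOS))
print(fitsTier1(individual: 2, inArgumentBuffer: 40, on: .iOSBeforeA11))
```

With these numbers the check passes on macOS (42 is within 64) and fails for the pre-A11 case (42 exceeds 31).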
Example .metal shader code:
#include <metal_stdlib>
using namespace metal;
// Argument buffer layout: each member is addressed by its [[ id(n) ]] index.
struct DataBuffers {
    const device float* buffer0 [[ id(0) ]];
    const device float* buffer1 [[ id(1) ]];
    const device float* buffer2 [[ id(2) ]];
    const device float* buffer3 [[ id(3) ]];
    const device float* buffer4 [[ id(4) ]];
    const device float* buffer5 [[ id(5) ]];
    const device float* buffer6 [[ id(6) ]];
    const device float* buffer7 [[ id(7) ]];
    const device float* buffer8 [[ id(8) ]];
    const device float* buffer9 [[ id(9) ]];
    const device float* buffer10 [[ id(10) ]];
    const device float* buffer11 [[ id(11) ]];
    const device float* buffer12 [[ id(12) ]];
    const device float* buffer13 [[ id(13) ]];
    const device float* buffer14 [[ id(14) ]];
    const device float* buffer15 [[ id(15) ]];
    const device float* buffer16 [[ id(16) ]];
    const device float* buffer17 [[ id(17) ]];
    const device float* buffer18 [[ id(18) ]];
    const device float* buffer19 [[ id(19) ]];
    const device float* buffer20 [[ id(20) ]];
    const device float* buffer21 [[ id(21) ]];
    const device float* buffer22 [[ id(22) ]];
    const device float* buffer23 [[ id(23) ]];
    const device float* buffer24 [[ id(24) ]];
    const device float* buffer25 [[ id(25) ]];
    const device float* buffer26 [[ id(26) ]];
    const device float* buffer27 [[ id(27) ]];
    const device float* buffer28 [[ id(28) ]];
    const device float* buffer29 [[ id(29) ]];
    const device float* buffer30 [[ id(30) ]];
    const device float* buffer31 [[ id(31) ]];
    const device float* buffer32 [[ id(32) ]];
    const device float* buffer33 [[ id(33) ]];
    const device float* buffer34 [[ id(34) ]];
    const device float* buffer35 [[ id(35) ]];
    const device float* buffer36 [[ id(36) ]];
    const device float* buffer37 [[ id(37) ]];
    const device float* buffer38 [[ id(38) ]];
    const device float* buffer39 [[ id(39) ]];
};
kernel void argument_buffer_test(
    device float* out [[ buffer(0) ]],
    const device DataBuffers& data_buffers [[ buffer(1) ]],
    uint id [[ thread_position_in_grid ]]
) {
    // Simply copy one of the input buffers.
    out[id] = data_buffers.buffer11[id];
}
Corresponding .swift file:
import Foundation
import MetalKit
// Number of samples.
let N = 10
// Number of buffers in the struct as defined in the .metal file.
let structBuffers = 40
// Setup.
let device = MTLCreateSystemDefaultDevice()!
let commandQueue = device.makeCommandQueue()!
let library = device.makeDefaultLibrary()!
let function = library.makeFunction(name: "argument_buffer_test")!
let pipeline = try! device.makeComputePipelineState(function: function)
let threadsPerThreadgroup = MTLSize(width: pipeline.threadExecutionWidth, height: 1, depth: 1)
// The argument encoder writes the buffer references into the argument buffer bound at index 1.
let argumentEncoder = function.makeArgumentEncoder(bufferIndex: 1)
let argumentBuffer = device.makeBuffer(length: argumentEncoder.encodedLength, options: [])!
argumentEncoder.setArgumentBuffer(argumentBuffer, offset: 0)
// Dummy data.
var inputBuffer: MTLBuffer
var inputBuffers = [MTLBuffer]()
var dummyData: [CFloat]
for i in 0..<structBuffers {
    dummyData = Array<CFloat>(repeating: Float(i), count: N)
    inputBuffer = device.makeBuffer(
        bytes: &dummyData,
        length: MemoryLayout<CFloat>.stride * dummyData.count,
        options: []
    )!
    inputBuffers.append(inputBuffer)
    argumentEncoder.setBuffer(inputBuffer, offset: 0, index: i)
}
let outputBuffer = device.makeBuffer(length: MemoryLayout<CFloat>.stride * N, options: [])!
// Encode commands.
let commandBuffer = commandQueue.makeCommandBuffer()!
let encoder = commandBuffer.makeComputeCommandEncoder()!
encoder.setComputePipelineState(pipeline)
encoder.setBuffer(outputBuffer, offset: 0, index: 0)
// The argument buffer is represented as a single buffer here,
// with references to the individual input buffers created above.
encoder.setBuffer(argumentBuffer, offset: 0, index: 1)
// When using argument buffers, every buffer they reference must also be
// declared with useResource so that it is resident for this pass.
for i in 0..<inputBuffers.count {
    encoder.useResource(inputBuffers[i], usage: .read)
}
let threadsPerGrid = MTLSize(width: N, height: 1, depth: 1)
encoder.dispatchThreads(threadsPerGrid, threadsPerThreadgroup: threadsPerThreadgroup)
encoder.endEncoding()
commandBuffer.commit()
commandBuffer.waitUntilCompleted()
// Print output.
print(Array(
    UnsafeBufferPointer<CFloat>(
        start: outputBuffer.contents().bindMemory(to: CFloat.self, capacity: N),
        count: N
    )
))
Another good resource is Chapter 26 from Metal by Tutorials: https://github.com/raywenderlich/met-materials/tree/editions/3.0.
Upvotes: 1