Reputation: 2869
When trying to use Metal to rapidly draw pixel buffers to the screen from memory, we create MTLBuffer objects using MTLDevice.makeBuffer(bytesNoCopy:..) to allow the GPU to read the pixels directly from memory without having to copy them. Shared memory is really a must-have for achieving good pixel transfer performance.
The catch is that makeBuffer requires a page-aligned memory address and a page-aligned length. Those requirements are not only in the documentation; they are also enforced using runtime assertions.
The code I am writing has to deal with a variety of incoming resolutions and pixel formats, and occasionally I get unaligned buffers or unaligned lengths. After researching this I discovered a hack that allows me to use shared memory for those instances.
Basically, I round the unaligned buffer address down to the nearest page boundary and use the offset parameter of makeTexture to ensure that the GPU starts reading from the right place. Then I add that offset to the length and round the result up to a multiple of the page size. That extra memory should be valid (because memory is mapped in whole pages, so the rest of each page is mapped too), and I think it's safe to assume the GPU isn't writing to or corrupting that memory.
Here is the code I'm using to allocate shared buffers from unaligned buffers:
import Metal

extension MTLDevice {
    func makeTextureFromUnalignedBuffer(textureDescriptor: MTLTextureDescriptor, bufferPtr: UnsafeMutableRawPointer, bufferLength: UInt, bytesPerRow: Int) -> MTLTexture? {
        let pageSize = UInt(getpagesize())
        let pageSizeBitmask = pageSize - 1

        // Split the pointer into its offset within the page and the page-aligned base address.
        let address = UInt(bitPattern: bufferPtr)
        let offset = address & pageSizeBitmask
        assert(bytesPerRow % 64 == 0 && offset % 64 == 0, "Supplied bufferPtr and bytesPerRow must be aligned on a 64-byte boundary!")

        // Include the leading offset, then round up to a whole number of pages.
        let calculatedBufferLength = (bufferLength + offset + pageSizeBitmask) & ~pageSizeBitmask

        // Both the pointer conversion and makeBuffer can return nil, so unwrap them instead of forcing.
        guard let alignedBufferAddr = UnsafeMutableRawPointer(bitPattern: address & ~pageSizeBitmask),
              let buffer = makeBuffer(bytesNoCopy: alignedBufferAddr, length: Int(calculatedBufferLength), options: .storageModeShared, deallocator: nil) else {
            return nil
        }
        return buffer.makeTexture(descriptor: textureDescriptor, offset: Int(offset), bytesPerRow: bytesPerRow)
    }
}
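To make the rounding easier to check, here is a minimal sketch of just the arithmetic, using plain integers instead of real pointers so it runs without Metal. The names AlignedSpan and alignedSpan are hypothetical helpers invented for this illustration, and the 16 KB page size is an assumed example value rather than something queried from the system.

```swift
/// Hypothetical helper type (not part of Metal): the smallest page-aligned
/// range that contains an arbitrary byte range.
struct AlignedSpan: Equatable {
    let base: UInt    // address rounded down to a page boundary
    let offset: UInt  // distance from `base` to the first valid byte
    let length: UInt  // offset + original length, rounded up to whole pages
}

/// Hypothetical helper: computes the aligned span for `length` bytes at `address`.
func alignedSpan(address: UInt, length: UInt, pageSize: UInt) -> AlignedSpan {
    precondition(pageSize > 0 && pageSize & (pageSize - 1) == 0,
                 "page size must be a power of two")
    let mask = pageSize - 1
    let base = address & ~mask      // clear the low bits: round down
    let offset = address & mask     // keep only the low bits
    let paddedLength = (offset + length + mask) & ~mask  // round up
    return AlignedSpan(base: base, offset: offset, length: paddedLength)
}

// Assumed example values: a 10,000-byte buffer that starts 64 bytes into a 16 KB page.
let span = alignedSpan(address: 0x1_0000_4040, length: 10_000, pageSize: 16_384)
print(span.base == 0x1_0000_4000, span.offset, span.length) // true 64 16384
```

The offset is what would be passed to makeTexture, while the base and padded length are what makeBuffer would see.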
I've tested this on numerous different buffers and it seems to work perfectly (only tested on iOS, not on macOS). My question is: Is this approach safe? Any obvious reasons why this wouldn't work?
Then again, if it is safe, why were the requirements imposed in the first place? Why isn't the API just doing this for us?
Upvotes: 11
Views: 1729
Reputation: 2869
I have submitted an Apple TSI (Technical Support Incident) for this question, and the answer is basically yes, it is safe. Here is the exact response in case anyone is interested:
After discussing your approach with engineering we concluded that it was valid and safe. Some noteworthy quotes:
“The framework shouldn’t care about the fact that the user doesn’t own the entire page, because it shouldn’t ever read before the offset where the valid data begins.”
“It really shouldn’t [care], but in general if the developer can use page-allocators rather than malloc for their incoming images, that would be nice.”
As to why the alignment constraints/assertions are in place:
“Typically mapping memory you don’t own into another address space is a bit icky, even if it works in practice. This is one reason why we required mapping to be page aligned, because the hardware really is mapping (and gaining write access) to the entire page.”
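For cases where you control the allocation of the incoming images, the engineers' suggestion to use a page allocator could look roughly like the sketch below. It assumes an anonymous mmap mapping is an acceptable page allocator for this purpose; allocatePageAligned is a hypothetical helper name, not an Apple API.

```swift
import Foundation

/// Hypothetical helper: maps `byteCount` bytes, rounded up to whole pages,
/// so that both the address and the length are page aligned.
func allocatePageAligned(byteCount: Int) -> (pointer: UnsafeMutableRawPointer, length: Int)? {
    precondition(byteCount > 0, "byteCount must be positive")
    let pageSize = Int(getpagesize())
    let length = (byteCount + pageSize - 1) & ~(pageSize - 1)
    let result = mmap(nil, length, PROT_READ | PROT_WRITE, MAP_ANON | MAP_PRIVATE, -1, 0)
    // mmap reports failure with (void *)-1 rather than a null pointer.
    guard let pointer = result, pointer != UnsafeMutableRawPointer(bitPattern: -1) else {
        return nil
    }
    return (pointer, length)
}

// Assumed example size: one 1920x1080 frame at 4 bytes per pixel.
if let block = allocatePageAligned(byteCount: 1920 * 1080 * 4) {
    let pageSize = UInt(getpagesize())
    print(UInt(bitPattern: block.pointer) % pageSize == 0) // address is page aligned
    print(block.length % Int(pageSize) == 0)               // length is a whole number of pages
    munmap(block.pointer, block.length)                    // release the mapping when done
}
```

A block allocated this way could be passed to makeBuffer(bytesNoCopy:length:options:deallocator:) without any rounding, with a deallocator closure that calls munmap once Metal is finished with the buffer.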
Upvotes: 10