Reputation: 1395
How can I get the RGB (or any other format) pixel value from a CVPixelBufferRef? Ive tried many approaches but no success yet.
func captureOutput(captureOutput: AVCaptureOutput!,
didOutputSampleBuffer sampleBuffer: CMSampleBuffer!,
fromConnection connection: AVCaptureConnection!) {
let pixelBuffer: CVPixelBufferRef = CMSampleBufferGetImageBuffer(sampleBuffer)!
CVPixelBufferLockBaseAddress(pixelBuffer, 0)
let baseAddress = CVPixelBufferGetBaseAddress(pixelBuffer)
//Get individual pixel values here
CVPixelBufferUnlockBaseAddress(pixelBuffer, 0)
}
Upvotes: 21
Views: 23194
Reputation: 4426
Here is a method for getting the individual rgb values from a BGRA pixel buffer. Note: Your buffer must be locked before calling this.
func pixelFrom(x: Int, y: Int, movieFrame: CVPixelBuffer) -> (UInt8, UInt8, UInt8) {
let baseAddress = CVPixelBufferGetBaseAddress(movieFrame)
let bytesPerRow = CVPixelBufferGetBytesPerRow(movieFrame)
let buffer = baseAddress!.assumingMemoryBound(to: UInt8.self)
let index = x*4 + y*bytesPerRow
let b = buffer[index]
let g = buffer[index+1]
let r = buffer[index+2]
return (r, g, b)
}
Upvotes: 14
Reputation: 833
Swift 5
I had the same problem and ended up with the following solution. My CVPixelBuffer
had dimensionality 68 x 68
, which can be inspected by
CVPixelBufferLockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
print(CVPixelBufferGetWidth(pixelBuffer))
print(CVPixelBufferGetHeight(pixelBuffer))
You also have to know the bytes per row:
print(CVPixelBufferGetBytesPerRow(pixelBuffer))
which in my case was 320.
Furthermore, you need to know the data type of your pixel buffer, which was Float32
for me.
I then constructed a byte buffer and read the bytes consecutively as follows (remember to lock the base address as shown above):
var byteBuffer = unsafeBitCast(CVPixelBufferGetBaseAddress(pixelBuffer), to: UnsafeMutablePointer<Float32>.self)
var pixelArray: Array<Array<Float>> = Array(repeating: Array(repeating: 0, count: 68), count: 68)
for row in 0...67{
for col in 0...67{
pixelArray[row][col] = byteBuffer.pointee
byteBuffer = byteBuffer.successor()
}
byteBuffer = byteBuffer.advanced(by: 12)
}
CVPixelBufferUnlockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
You might wonder about the part byteBuffer = byteBuffer.advanced(by: 12)
. The reason why we have to do this is as follows.
We know that we have 320 bytes per row. However, our buffer has width 68 and the data type is Float32
, e.g. 4 bytes per value. That means that we virtually only have 272
bytes per row, followed by zero-padding. This zero-padding probably has memory layout reasons.
We, therefore, have to skip the last 48 bytes in each row which is done by byteBuffer = byteBuffer.advanced(by: 12)
(12*4 = 48
).
This approach is somewhat different from other solutions as we use pointers to the next byteBuffer
. However, I find this easier and more intuitive.
Upvotes: 5
Reputation: 620
Update for Swift3:
let pixelBuffer: CVPixelBufferRef = CMSampleBufferGetImageBuffer(sampleBuffer)!
CVPixelBufferLockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0));
let int32Buffer = unsafeBitCast(CVPixelBufferGetBaseAddress(pixelBuffer), to: UnsafeMutablePointer<UInt32>.self)
let int32PerRow = CVPixelBufferGetBytesPerRow(pixelBuffer)
// Get BGRA value for pixel (43, 17)
let luma = int32Buffer[17 * int32PerRow + 43]
CVPixelBufferUnlockBaseAddress(pixelBuffer, 0)
Upvotes: 7
Reputation: 78985
baseAddress
is an unsafe mutable pointer or more precisely a UnsafeMutablePointer<Void>
. You can easily access the memory once you have converted the pointer away from Void
to a more specific type:
// Convert the base address to a safe pointer of the appropriate type
let byteBuffer = UnsafeMutablePointer<UInt8>(baseAddress)
// read the data (returns value of type UInt8)
let firstByte = byteBuffer[0]
// write data
byteBuffer[3] = 90
Make sure you use the correct type (8, 16 or 32 bit unsigned int). It depends on the video format. Most likely it's 8 bit.
Update on buffer formats:
You can specify the format when you initialize the AVCaptureVideoDataOutput
instance. You basically have the choice of:
If you're interested in the color values and speed (or rather maximum frame rate) is not an issue, then go for the simpler BGRA format. Otherwise take one of the more efficient native video formats.
If you have two planes, you must get the base address of the desired plane (see video format example):
Video format example
let pixelBuffer: CVPixelBufferRef = CMSampleBufferGetImageBuffer(sampleBuffer)!
CVPixelBufferLockBaseAddress(pixelBuffer, 0)
let baseAddress = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0)
let bytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0)
let byteBuffer = UnsafeMutablePointer<UInt8>(baseAddress)
// Get luma value for pixel (43, 17)
let luma = byteBuffer[17 * bytesPerRow + 43]
CVPixelBufferUnlockBaseAddress(pixelBuffer, 0)
BGRA example
let pixelBuffer: CVPixelBufferRef = CMSampleBufferGetImageBuffer(sampleBuffer)!
CVPixelBufferLockBaseAddress(pixelBuffer, 0)
let baseAddress = CVPixelBufferGetBaseAddress(pixelBuffer)
let int32PerRow = CVPixelBufferGetBytesPerRow(pixelBuffer)
let int32Buffer = UnsafeMutablePointer<UInt32>(baseAddress)
// Get BGRA value for pixel (43, 17)
let luma = int32Buffer[17 * int32PerRow + 43]
CVPixelBufferUnlockBaseAddress(pixelBuffer, 0)
Upvotes: 24