CosmicVarion

Reputation: 146

Performant Multi ROI Image Color Average on iOS

Core Image's CIAreaAverage filter makes it easy to average the RGB color of a whole CIImage. For example:

let options = [CIContextOption.workingColorSpace: kCFNull as Any]
let context = CIContext(options: options)

let parameters = [
    kCIInputImageKey: inputImage, // assume this exists
    kCIInputExtentKey: CIVector(cgRect: inputImage.extent)
]

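let filter = CIFilter(name: "CIAreaAverage", parameters: parameters)! // the initializer is failable, so unwrap it

// Render the 1x1 output into four Float32 values (RGBA): 4 floats x 4 bytes = 16 bytes per row
var bitmap = [Float32](repeating: 0, count: 4)
context.render(filter.outputImage!, toBitmap: &bitmap, rowBytes: 16, bounds: CGRect(x: 0, y: 0, width: 1, height: 1), format: .RGBAf, colorSpace: nil)

let rAverage = bitmap[0]
let gAverage = bitmap[1]
let bAverage = bitmap[2] // blue is index 2; index 3 holds alpha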
...

However, suppose one does not want to average the whole CIImage. Breaking the image into regions of interest (ROIs) by varying the input extent (see kCIInputExtentKey above) and running one CIAreaAverage filter per ROI introduces many sequential steps, which decreases performance drastically. The filters cannot be chained, of course, since each output is a single 4-component color average (see bitmap above). Another way of describing the goal might be "average downsampling".

For example, let's say you have a 1080p image (1920x1080), and you want a 10x10 color average matrix from it. You would be performing 100 CIAreaAverage operations for 100 different input extents, each corresponding to a 192x108 pixel ROI for which you want the R, G, B, and perhaps A averages. But that is 100 sequential CIAreaAverage operations, which is not performant.
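To make the 10x10 example concrete, here is a minimal sketch of how the 100 input extents could be generated. The function name `roiGrid` is hypothetical, and the sketch assumes the grid divides the image evenly.

```swift
import Foundation

/// Splits `extent` into a columns x rows grid of equally sized ROIs,
/// listed row by row starting at the extent's origin.
func roiGrid(extent: CGRect, columns: Int, rows: Int) -> [CGRect] {
    let roiWidth = extent.width / CGFloat(columns)
    let roiHeight = extent.height / CGFloat(rows)
    var rois: [CGRect] = []
    rois.reserveCapacity(columns * rows)
    for row in 0..<rows {
        for column in 0..<columns {
            rois.append(CGRect(x: extent.minX + CGFloat(column) * roiWidth,
                               y: extent.minY + CGFloat(row) * roiHeight,
                               width: roiWidth,
                               height: roiHeight))
        }
    }
    return rois
}

// 1920x1080 split into 10x10 gives 100 ROIs of 192x108 pixels each.
let rois = roiGrid(extent: CGRect(x: 0, y: 0, width: 1920, height: 1080), columns: 10, rows: 10)
```

Each of these rects would be wrapped in CIVector(cgRect:) and passed as kCIInputExtentKey to its own CIAreaAverage filter, which is exactly the sequential cost described above.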

Perhaps the next thing one might think to do is some sort of parallel for loop, e.g., a DispatchQueue.concurrentPerform(iterations:, execute:) with one iteration per ROI. However, I am not seeing a performance gain from this. (Note that CIContext is thread-safe, but CIFilter is not.)
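For reference, the parallel loop I tried has roughly the following shape. This is only a sketch: `averageColors` is a hypothetical name, and the `average` closure stands in for the per-ROI work (for example, building a fresh CIAreaAverage filter and rendering it with a shared CIContext).

```swift
import Foundation

/// Runs `average` once per ROI index in parallel and collects the results in order.
/// `average` is a hypothetical stand-in for the per-ROI work.
func averageColors(roiCount: Int, average: (Int) -> [Float]) -> [[Float]] {
    var results = [[Float]](repeating: [], count: roiCount)
    results.withUnsafeMutableBufferPointer { buffer in
        // Each iteration writes only its own slot, so no locking is needed.
        DispatchQueue.concurrentPerform(iterations: roiCount) { index in
            buffer[index] = average(index)
        }
    }
    return results
}
```

Because CIFilter is not thread-safe, the closure would have to create its own filter instance on every iteration rather than share one.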

The next logical idea might be to create a custom CIFilter; let's call it CIMultiAreaAverage. However, it's not obvious how to write a CIKernel that can examine a source pixel's location and map it to a particular destination pixel. You need some buffer of information, such as a per-ROI color sum, or you need to treat the destination pixel as that buffer. The simplest approach might be to accumulate a per-ROI, per-channel sum into a destination with an integer type and, once that is rendered to a bitmap, turn it into an average by casting to float and dividing by the number of pixels in the ROI.

I wish I had access to the source code for CIAreaAverage. To encapsulate the full functionality in the CIFilter, you might have to go further and write what is really a custom Metal shader. So perhaps someone with some expertise can explain how to accomplish this with a Metal shader.

Another option might be to use vDSP/vImage to perform these ROI operations. It seems easy to create the necessary vImage_Buffers per ROI, but for performance I'd want to make sure they reference the image data in place rather than copy it. Beyond that, I'm not sure which vDSP mean function to use, or how to apply it to a vImage_Buffer as if it were an array, if that's even possible. It sounds like this might be the most performant option.
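As a rough sketch of the vDSP idea, assuming the pixels are already available as an interleaved RGBA Float32 array (that conversion is not shown), vDSP_meanv with a stride of 4 can average one channel of one ROI row in place. The function name `roiMean` and its parameters are hypothetical.

```swift
import Accelerate

/// Per-channel mean over a rectangular ROI of an interleaved RGBA Float32 image.
/// `imageWidth` is the full image width in pixels; the ROI is given in pixels.
func roiMean(pixels: [Float], imageWidth: Int,
             x: Int, y: Int, width: Int, height: Int) -> [Float] {
    let channels = 4
    var sums = [Float](repeating: 0, count: channels)
    pixels.withUnsafeBufferPointer { buffer in
        guard let base = buffer.baseAddress else { return }
        for row in y..<(y + height) {
            // First float of the ROI's first pixel on this row; no copy is made.
            let rowStart = base + (row * imageWidth + x) * channels
            for channel in 0..<channels {
                var rowMean: Float = 0
                // A stride of 4 visits only this channel's values along the row.
                vDSP_meanv(rowStart + channel, vDSP_Stride(channels),
                           &rowMean, vDSP_Length(width))
                sums[channel] += rowMean
            }
        }
    }
    // All rows have the same length, so the mean of row means is the ROI mean.
    return sums.map { $0 / Float(height) }
}
```

This issues one vDSP call per row and channel, so it would still need measuring against the Core Image approaches.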

What does SO think?

Upvotes: 1

Views: 415

Answers (1)

Frank Rupprecht

Reputation: 10383

Here is what Apple is doing in CIAreaAverage:

[Image: filter graph for CIAreaAverage]

I don't know why they follow two different paths, but this is what I think is happening:

The path on the left is a stepwise reduction of the input pixels into a smaller output. The kernel _areaAvg8 reduces a group of (up to) 8x8 pixels into one output pixel by calculating their average value. _areaAvg2 does the same for 2x2 pixels and _horizAvg2 for 2x1. So the image is reduced in multiple steps, each one further reducing the output of the previous step, until the last step produces one final pixel that contains the average of all pixels of the input.

For the right side, I assume that the CIAreaAverageProcessor is a CIImageProcessingKernel that uses Metal Performance Shaders, specifically I assume MPSImageReduceRowMean and MPSImageReduceColumnMean, to do the same. Why they have those two paths with the switch on top I do not know.

For your use case, I suggest you implement something similar to the left path, but stop somewhere in the middle, depending on the size of your desired output.
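To illustrate the idea of stopping in the middle, here is a CPU-only reference sketch of one reduction step for a single channel. It is not how Core Image implements it; `blockAverage` is a hypothetical helper that assumes the image dimensions are exact multiples of the reduction factor.

```swift
/// Averages each factor x factor block of a single-channel, row-major image.
/// Assumes width and height are exact multiples of `factor`.
func blockAverage(_ pixels: [Float], width: Int, height: Int, factor: Int) -> [Float] {
    let outWidth = width / factor
    let outHeight = height / factor
    var output = [Float](repeating: 0, count: outWidth * outHeight)
    for outY in 0..<outHeight {
        for outX in 0..<outWidth {
            var sum: Float = 0
            for dy in 0..<factor {
                for dx in 0..<factor {
                    sum += pixels[(outY * factor + dy) * width + (outX * factor + dx)]
                }
            }
            output[outY * outWidth + outX] = sum / Float(factor * factor)
        }
    }
    return output
}

// A 4x4 image reduced in two 2x steps. Stopping after the first step
// leaves a 2x2 grid of per-region averages instead of one global average.
let image = (0..<16).map { Float($0) }
let half = blockAverage(image, width: 4, height: 4, factor: 2)
let full = blockAverage(half, width: 2, height: 2, factor: 2)
```

Because every block in a step has the same size, averaging the averages gives the same result as averaging the underlying pixels directly, which is why the reduction can be split into steps.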

To improve performance, you can make use of the bilinear sampling that is provided by the graphics hardware basically for free: When you sample the input image at a coordinate in the middle of 4 pixels, you already get an average of these 4 color values. That means for an 8x8 reduction, you only need 4 x 4 = 16 sample operations (instead of 64). This kernel could look something like this:

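#include <metal_stdlib>
#include <CoreImage/CoreImage.h> // provides coreimage::sampler and coreimage::destination
using namespace metal;

extern "C" float4 areaAvg8(coreimage::sampler src, coreimage::destination dest) {
    // dest.coord() is the center of the output pixel; scaled by 8 it becomes the center of its 8x8 source block
    float2 center = dest.coord() * 8.0; // assuming that src is 8x larger than dest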
    float4 sum = src.sample(src.transform(center + float2(-3.0, -3.0)))
               + src.sample(src.transform(center + float2(-1.0, -3.0)))
               + src.sample(src.transform(center + float2( 1.0, -3.0)))
               + src.sample(src.transform(center + float2( 3.0, -3.0)))
               + src.sample(src.transform(center + float2(-3.0, -1.0)))
               + src.sample(src.transform(center + float2(-1.0, -1.0)))
               + src.sample(src.transform(center + float2( 1.0, -1.0)))
               + src.sample(src.transform(center + float2( 3.0, -1.0)))
               + src.sample(src.transform(center + float2(-3.0,  1.0)))
               + src.sample(src.transform(center + float2(-1.0,  1.0)))
               + src.sample(src.transform(center + float2( 1.0,  1.0)))
               + src.sample(src.transform(center + float2( 3.0,  1.0)))
               + src.sample(src.transform(center + float2(-3.0,  3.0)))
               + src.sample(src.transform(center + float2(-1.0,  3.0)))
               + src.sample(src.transform(center + float2( 1.0,  3.0)))
               + src.sample(src.transform(center + float2( 3.0,  3.0)));
    return sum / 16.0;
}
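On the Swift side, applying the areaAvg8 kernel could look roughly like the sketch below. It assumes the kernel was compiled into the app's default.metallib with the Core Image Metal build flags (-fcikernel for the compiler, -cikernel for the linker), that the input extent starts at (0, 0), and that its dimensions are multiples of 8. `KernelLoadError`, `loadAreaAvg8Kernel`, `sourceROI`, and `reduceBy8` are hypothetical names.

```swift
import CoreImage

struct KernelLoadError: Error {}

/// Loads the areaAvg8 kernel from the app's default Metal library (assumed location).
func loadAreaAvg8Kernel() throws -> CIKernel {
    guard let url = Bundle.main.url(forResource: "default", withExtension: "metallib") else {
        throw KernelLoadError()
    }
    let data = try Data(contentsOf: url)
    return try CIKernel(functionName: "areaAvg8", fromMetalLibraryData: data)
}

/// Source region needed to render `destRect` when each output pixel covers
/// a factor x factor block of input pixels.
func sourceROI(for destRect: CGRect, factor: CGFloat = 8) -> CGRect {
    CGRect(x: destRect.origin.x * factor,
           y: destRect.origin.y * factor,
           width: destRect.width * factor,
           height: destRect.height * factor)
}

/// Applies one 8x reduction pass; the output is 1/8 the input size in each dimension.
func reduceBy8(_ input: CIImage, kernel: CIKernel) -> CIImage? {
    let outputExtent = CGRect(x: 0, y: 0,
                              width: input.extent.width / 8,
                              height: input.extent.height / 8)
    return kernel.apply(extent: outputExtent,
                        roiCallback: { _, rect in sourceROI(for: rect) },
                        // Linear sampling is what makes each sample a 2x2 average.
                        arguments: [input.samplingLinear()])
}
```

A 1920x1080 input would come out of one pass as 240x135. Further passes, possibly with smaller kernels, would be needed to reach the final grid size, and the result can then be rendered to a bitmap in a single call.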

Upvotes: 1
