I am using AVFoundation to capture video frames, process them with OpenCV and display the result in a UIImageView on the new iPad. The OpenCV processing does the following ("inImg" is the video frame):
cv::Mat testROI = inImg.rowRange(0,100);
testROI = testROI.colRange(0,10);
testROI.setTo(255); // this is a BGRA frame.
However, instead of getting a vertical white bar (100 rows x 10 cols) in the top-left corner of the frame, I got 100 stair-like horizontal lines running from the top-right corner to the bottom-left, each 10 pixels long.
After some investigation, I realized that the displayed frame seems to be 8 pixels wider than the cv::Mat (i.e. the 9th pixel of the 2nd row is right below the 1st pixel of the 1st row).
The video frame itself is shown correctly (no displacement between rows). The problem appears when the AVCaptureSession.sessionPreset is AVCaptureSessionPresetMedium (frame rows=480, cols=360) but does not appear when it is AVCaptureSessionPresetHigh (frame rows=640, cols=480).
There are 360 cols shown in full screen. (I tried traversing and modifying the cv::Mat pixel by pixel. Pixels 1-360 were shown correctly, 361-368 disappeared, and 369 was shown right under pixel 1.)
I tried combinations of imageview.contentMode (UIViewContentModeScaleAspectFill and UIViewContentModeScaleAspectFit) and imageview.clipsToBounds (YES/NO) but had no luck.
What could be the problem? Thank you very much.
I use the following code to create the AVCaptureSession:
NSArray* devices = [AVCaptureDevice devicesWithMediaType:AVMediaTypeVideo];
if ([devices count] == 0) {
    NSLog(@"No video capture devices found");
    return NO;
}
for (AVCaptureDevice *device in devices) {
    if ([device position] == AVCaptureDevicePositionFront) {
        _captureDevice = device;
    }
}
NSError* error_exp = nil;
if ([_captureDevice lockForConfiguration:&error_exp]) {
    [_captureDevice setWhiteBalanceMode:AVCaptureWhiteBalanceModeContinuousAutoWhiteBalance];
    [_captureDevice unlockForConfiguration];
}
// Create the capture session
_captureSession = [[AVCaptureSession alloc] init];
_captureSession.sessionPreset = AVCaptureSessionPresetMedium;
// Create device input
NSError *error = nil;
AVCaptureDeviceInput *input = [[AVCaptureDeviceInput alloc] initWithDevice:_captureDevice error:&error];
// Create and configure device output
_videoOutput = [[AVCaptureVideoDataOutput alloc] init];
dispatch_queue_t queue = dispatch_queue_create("cameraQueue", NULL);
[_videoOutput setSampleBufferDelegate:self queue:queue];
_videoOutput.alwaysDiscardsLateVideoFrames = YES;
OSType format = kCVPixelFormatType_32BGRA;
_videoOutput.videoSettings = [NSDictionary dictionaryWithObject:[NSNumber numberWithUnsignedInt:format] forKey:(id)kCVPixelBufferPixelFormatTypeKey];
// Connect up inputs and outputs
if ([_captureSession canAddInput:input]) {
    [_captureSession addInput:input];
}
if ([_captureSession canAddOutput:_videoOutput]) {
    [_captureSession addOutput:_videoOutput];
}
AVCaptureConnection * captureConnection = [_videoOutput connectionWithMediaType:AVMediaTypeVideo];
if (captureConnection.isVideoMinFrameDurationSupported)
    captureConnection.videoMinFrameDuration = CMTimeMake(1, 60);
if (captureConnection.isVideoMaxFrameDurationSupported)
    captureConnection.videoMaxFrameDuration = CMTimeMake(1, 60);
if (captureConnection.supportsVideoMirroring)
    [captureConnection setVideoMirrored:NO];
[captureConnection setVideoOrientation:AVCaptureVideoOrientationPortraitUpsideDown];
When a frame is received, the following is called:
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
    @autoreleasepool {
        CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
        OSType format = CVPixelBufferGetPixelFormatType(pixelBuffer);
        CGRect videoRect = CGRectMake(0.0f, 0.0f, CVPixelBufferGetWidth(pixelBuffer), CVPixelBufferGetHeight(pixelBuffer));
        AVCaptureConnection *currentConnection = [[_videoOutput connections] objectAtIndex:0];
        AVCaptureVideoOrientation videoOrientation = [currentConnection videoOrientation];
        CGImageRef quartzImage;
        // For color mode a 4-channel cv::Mat is created from the BGRA data
        CVPixelBufferLockBaseAddress(pixelBuffer, 0);
        void *baseaddress = CVPixelBufferGetBaseAddress(pixelBuffer);
        cv::Mat mat(videoRect.size.height, videoRect.size.width, CV_8UC4, baseaddress, 0);
        if ([self doFrame]) { // a flag to switch processing ON/OFF
            [self processFrame:mat videoRect:videoRect videoOrientation:videoOrientation]; // "processFrame" is the opencv function shown above
        }
        CIImage *ciImage = [CIImage imageWithCVPixelBuffer:pixelBuffer];
        quartzImage = [self.context createCGImage:ciImage fromRect:ciImage.extent];
        CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);
        UIImage *image = [UIImage imageWithCGImage:quartzImage scale:1.0 orientation:UIImageOrientationUp];
        [self.imageView performSelectorOnMainThread:@selector(setImage:) withObject:image waitUntilDone:YES];
    }
}
I assume you're using the constructor Mat(int _rows, int _cols, int _type, void* _data, size_t _step=AUTO_STEP), that AUTO_STEP is 0, and that with this value OpenCV assumes the row stride is width*bytesPerPixel.
That assumption is generally wrong: it's very common to align rows to some larger boundary. In this case, 360 is not a multiple of 16 but 368 is, which strongly suggests that rows are aligned to 16-pixel boundaries (perhaps to assist algorithms that process in 16×16 blocks?). Pass the buffer's actual row stride as the step instead:
cv::Mat mat(videoRect.size.height, videoRect.size.width, CV_8UC4, baseaddress, CVPixelBufferGetBytesPerRow(pixelBuffer));
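To see why a wrong step produces exactly the staircase described in the question, here is a minimal standalone sketch of the offset arithmetic. It uses neither OpenCV nor AVFoundation. FrameLayout, makeLayout and landsAt are hypothetical names made up for illustration, and the 64-byte row alignment is only an assumption inferred from the 368-pixel stride, not something the frameworks document.

```cpp
#include <cassert>
#include <cstddef>

// Bytes per BGRA pixel (the layout used by kCVPixelFormatType_32BGRA / CV_8UC4).
constexpr std::size_t kBytesPerPixel = 4;

// Hypothetical description of a padded frame; not an AVFoundation or OpenCV type.
struct FrameLayout {
    std::size_t width;        // visible pixels per row
    std::size_t height;       // number of rows
    std::size_t bytesPerRow;  // real stride, including any padding
};

// A (row, col) position in the real, padded buffer.
struct Position {
    std::size_t row;
    std::size_t col;
};

// Round value up to the next multiple of alignment.
std::size_t alignUp(std::size_t value, std::size_t alignment) {
    assert(alignment > 0);
    return (value + alignment - 1) / alignment * alignment;
}

// Build a layout whose rows are padded to alignmentBytes (an assumed value).
FrameLayout makeLayout(std::size_t width, std::size_t height,
                       std::size_t alignmentBytes) {
    return FrameLayout{width, height,
                       alignUp(width * kBytesPerPixel, alignmentBytes)};
}

// Where does a write to (row, col) really land if the writer computes
// byte offsets with assumedStride instead of the buffer's own bytesPerRow?
Position landsAt(const FrameLayout& layout, std::size_t row, std::size_t col,
                 std::size_t assumedStride) {
    const std::size_t offset = row * assumedStride + col * kBytesPerPixel;
    return Position{offset / layout.bytesPerRow,
                    (offset % layout.bytesPerRow) / kBytesPerPixel};
}
```

With makeLayout(360, 480, 64) the stride comes out as 1472 bytes, i.e. 368 pixels. If offsets are computed with the tight stride 360 * 4 = 1440, a write to row 1, column 0 really lands at row 0, column 360 (inside the invisible padding), and a write to row 2, column 0 lands at row 1, column 352. Each row drifts another 8 pixels to the left, which matches the staircase. Calling landsAt with layout.bytesPerRow as the stride puts every write where it was intended.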
