Reputation: 362
I need to recognize rectangles in frames from a captured video. I use the following method to display a rectangle on top of an observed image.
func displayRect(for observation: VNRectangleObservation) {
    DispatchQueue.main.async { [weak self] in
        guard let size = self?.imageView.frame.size else { return }
        guard let origin = self?.imageView.frame.origin else { return }
        let transform = CGAffineTransform(scaleX: size.width, y: size.height)
        let rect = observation.boundingBox.applying(transform)
            .applying(CGAffineTransform(scaleX: 1.0, y: -1.0))
            .applying(CGAffineTransform(translationX: 0.0, y: size.height))
            .applying(CGAffineTransform(translationX: -origin.x, y: -origin.y))
        let path = UIBezierPath(rect: rect)
        let layer = CAShapeLayer()
        layer.path = path.cgPath
        layer.fillRule = kCAFillRuleEvenOdd
        layer.fillColor = UIColor.red.withAlphaComponent(0.2).cgColor
        self?.overlay.sublayers = nil
        self?.overlay.addSublayer(layer)
    }
}
This works just fine with images taken from the camera, but for frames from captured video the rectangle is off. In fact, it looks like it (and thus the entire coordinate system for the image) is off by 90 degrees. Please see the screenshots below.
Am I missing something about video frames that could cause the observation's boundingBox property to be in an entirely different coordinate system?
Below is my implementation of the captureOutput delegate method.
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    guard let buffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
    // Also tried converting to CGImage and creating the handler from that, but it made no difference
    let handler = VNImageRequestHandler(cvPixelBuffer: buffer, options: [:])
    let request = VNDetectRectanglesRequest()
    request.minimumAspectRatio = VNAspectRatio(0.2)
    request.maximumAspectRatio = VNAspectRatio(1.0)
    request.minimumSize = Float(0.3)
    try? handler.perform([request])
    // Note: Only ever captures one rectangle, so calling `first` is not the issue.
    guard let observations = request.results as? [VNRectangleObservation],
          let observation = observations.first else {
        return removeShapeLayer()
    }
    // displayRect(for:) as declared above takes only the observation
    displayRect(for: observation)
}
Upvotes: 2
Views: 1977
Reputation: 13276
The issue is that you're not passing the orientation of the buffer to the VNImageRequestHandler, so it is treating the video as landscape. Then, when it returns that rect, you place it over the video, which is being displayed in portrait.
You'll either need to pass the orientation to the VNImageRequestHandler, or modify (rotate) the returned rectangle to take that into account.
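As a rough sketch of the first option, the handler can be created with the `VNImageRequestHandler(cvPixelBuffer:orientation:options:)` initializer. The helper name `detectRectangle(in:)` and the constant `assumedBufferOrientation` are hypothetical, not from the question. The value `.right` is an assumption that commonly fits the back camera with the device held in portrait; other cameras or device orientations may need a different value, so verify it against your own capture setup.

```swift
import AVFoundation
import ImageIO
import Vision

// Assumption: back camera, device in portrait, buffers delivered in the
// sensor's landscape orientation. Adjust for your own configuration.
let assumedBufferOrientation: CGImagePropertyOrientation = .right

// Hypothetical helper: runs rectangle detection on one video frame and
// tells Vision how the pixel buffer is oriented.
func detectRectangle(in sampleBuffer: CMSampleBuffer) -> VNRectangleObservation? {
    guard let buffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return nil }

    // Passing the orientation lets Vision report boundingBox in the same
    // upright coordinate space as the portrait preview.
    let handler = VNImageRequestHandler(cvPixelBuffer: buffer,
                                        orientation: assumedBufferOrientation,
                                        options: [:])

    // Same request settings as in the question.
    let request = VNDetectRectanglesRequest()
    request.minimumAspectRatio = VNAspectRatio(0.2)
    request.maximumAspectRatio = VNAspectRatio(1.0)
    request.minimumSize = Float(0.3)

    do {
        try handler.perform([request])
    } catch {
        // Surface the failure instead of silently dropping it with `try?`.
        print("Rectangle detection failed: \(error)")
        return nil
    }
    return request.results?.first as? VNRectangleObservation
}
```

Inside `captureOutput`, the returned observation could then be handed to `displayRect(for:)` as before, with no extra rotation of the rectangle.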
Upvotes: 4