Reputation: 29
I want to create an ARKit app using Xcode. It should recognize a generic rectangle without the user pressing a button, and once the rectangle is detected it should trigger a certain function.
How can I do that?
Upvotes: 2
Views: 519
Reputation: 4077
You do not need ARKit to recognise rectangles; Vision alone is enough.
To recognise generic rectangles, use VNDetectRectanglesRequest.
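A minimal sketch of that approach, assuming you already have a CVPixelBuffer from the camera feed or from ARFrame.capturedImage (the detectRectangles helper and the threshold values are illustrative, not part of any existing project):

import Vision

func detectRectangles(in pixelBuffer: CVPixelBuffer) {
    let request = VNDetectRectanglesRequest { request, error in
        guard let results = request.results as? [VNRectangleObservation],
              let rectangle = results.first else {
            print("No rectangles found.")
            return
        }
        // Corner points are normalized (0...1) image coordinates.
        print(rectangle.topLeft, rectangle.topRight,
              rectangle.bottomLeft, rectangle.bottomRight)
        // Call your own function here once a rectangle has been detected.
    }
    request.minimumConfidence = 0.8   // tune to taste
    request.maximumObservations = 1   // one rectangle is enough here

    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    do {
        try handler.perform([request])
    } catch {
        print(error)
    }
}

Calling this on every incoming camera frame (or every Nth frame) gives you detection without any button press.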
Upvotes: 1
Reputation: 58043
As you rightly wrote, you need to use the Vision or Core ML frameworks in your project along with ARKit. You also need a pre-trained machine learning model (an .mlmodel file) to classify the input data and recognize your generic rectangle.
To create a learning model, use one of the following tools: TensorFlow, Turi, Caffe, or Keras.
With an .mlmodel that contains classification tags, Vision requests return results as VNClassificationObservation objects, which identify what was found in the captured scene. So, if the image's corresponding tag is found during the recognition process in an ARSKView, an ARAnchor is created (and a SpriteKit/SceneKit object can be placed onto this ARAnchor).
Here's a code snippet showing how it works:
import UIKit
import ARKit
import Vision
import SpriteKit
.................................................................
// file – ARBridge.swift
class ARBridge {
    static let shared = ARBridge()
    var anchorsToIdentifiers = [ARAnchor: String]()
}
.................................................................
// file – Scene.swift
DispatchQueue.global(qos: .background).async {
    do {
        // Wrap the Core ML classifier (Inceptionv3 here) for use with Vision.
        let model = try VNCoreMLModel(for: Inceptionv3().model)
        let request = VNCoreMLRequest(model: model, completionHandler: { request, error in
            DispatchQueue.main.async {
                guard let results = request.results as? [VNClassificationObservation],
                      let result = results.first else {
                    print("No results.")
                    return
                }
                // Place an anchor 0.75 m in front of the camera.
                var translation = matrix_identity_float4x4
                translation.columns.3.z = -0.75
                let transform = simd_mul(currentFrame.camera.transform, translation)
                let anchor = ARAnchor(transform: transform)
                // Remember which classification tag belongs to this anchor.
                ARBridge.shared.anchorsToIdentifiers[anchor] = result.identifier
                sceneView.session.add(anchor: anchor)
            }
        })
        // Run the classification request on the current camera image.
        let handler = VNImageRequestHandler(cvPixelBuffer: currentFrame.capturedImage, options: [:])
        try handler.perform([request])
    } catch {
        print(error)
    }
}
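The snippet above assumes currentFrame and sceneView are already in scope. A minimal sketch of one way to trigger it without a button press, assuming the standard ARKit + SpriteKit template (the runVisionRequest helper is hypothetical and would wrap the code above):

override func update(_ currentTime: TimeInterval) {
    guard let sceneView = self.view as? ARSKView,
          let currentFrame = sceneView.session.currentFrame else {
        return
    }
    // Running Vision on every frame is expensive; in a real app, throttle it
    // (e.g. run the request only every Nth frame or when the previous one finished).
    runVisionRequest(for: currentFrame, in: sceneView)
}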
.................................................................
// file – ViewController.swift
func view(_ view: ARSKView, nodeFor anchor: ARAnchor) -> SKNode? {
    guard let identifier = ARBridge.shared.anchorsToIdentifiers[anchor] else {
        return nil
    }
    // Show the recognized tag as a text label attached to the anchor.
    let labelNode = SKLabelNode(text: identifier)
    labelNode.horizontalAlignmentMode = .center
    labelNode.verticalAlignmentMode = .center
    labelNode.fontName = UIFont.boldSystemFont(ofSize: 24).fontName
    return labelNode
}
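For this ARSKViewDelegate method to be called, the view controller must be the scene view's delegate and run an AR session. A minimal sketch, assuming the outlet and class names of the standard template:

class ViewController: UIViewController, ARSKViewDelegate {

    @IBOutlet var sceneView: ARSKView!

    override func viewDidLoad() {
        super.viewDidLoad()
        sceneView.delegate = self   // so view(_:nodeFor:) gets called
        sceneView.presentScene(Scene(size: sceneView.bounds.size))
    }

    override func viewWillAppear(_ animated: Bool) {
        super.viewWillAppear(animated)
        sceneView.session.run(ARWorldTrackingConfiguration())
    }
}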
You can also download two of Apple's sample projects written by Vision engineers:
Recognizing Objects in Live Capture
Classifying Images with Vision and Core ML
Hope this helps.
Upvotes: 0