Vision and ARKit frameworks in Xcode project

I want to create an ARKit app in Xcode. It should recognize a generic rectangle without the user pressing a button, and once the rectangle is recognized it should trigger a certain function.

How can I do this?

Upvotes: 2

Views: 519

Answers (2)

Maxim Volgin

Reputation: 4077

You do not need ARKit to recognise rectangles; Vision alone is enough.

To recognise generic rectangles, use VNDetectRectanglesRequest.
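
A minimal sketch of that approach, assuming you feed it the camera's pixel buffer (for example ARFrame.capturedImage or an AVCaptureSession sample buffer); the detectRectangles function name is just illustrative:

import Vision
import CoreVideo

// Run a rectangle-detection request on a single pixel buffer.
func detectRectangles(in pixelBuffer: CVPixelBuffer) {
    let request = VNDetectRectanglesRequest { request, error in
        guard let rectangles = request.results as? [VNRectangleObservation] else { return }
        for rectangle in rectangles {
            // boundingBox is in normalized image coordinates (origin at bottom-left)
            print("Found rectangle at \(rectangle.boundingBox)")
            // Trigger your custom function here
        }
    }
    // Optional tuning
    request.maximumObservations = 1
    request.minimumConfidence = 0.8

    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    try? handler.perform([request])
}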

Upvotes: 1

Andy Jazz

Reputation: 58043

As you rightly wrote, you need to use the Vision or Core ML framework in your project along with ARKit. You also have to create a pre-trained machine learning model (an .mlmodel file) that classifies the input data in order to recognize your generic rectangle.

To create a learning model, use one of the following tools: TensorFlow, Turi Create, Caffe, or Keras.
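
Apple's Create ML framework is another option not listed above; purely as a rough illustration (the folder paths and model name are hypothetical), training and exporting an image classifier in Swift can look roughly like this:

import CreateML
import Foundation

// Sketch using Create ML (macOS playground or command-line tool).
// "TrainingImages" is a hypothetical folder whose subfolders are named after class labels.
let trainingDir = URL(fileURLWithPath: "/path/to/TrainingImages")
let classifier = try MLImageClassifier(trainingData: .labeledDirectories(at: trainingDir))

// Export the .mlmodel file, then drag it into your Xcode project.
try classifier.write(to: URL(fileURLWithPath: "/path/to/RectangleClassifier.mlmodel"))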

Using an .mlmodel with classification tags inside it, Vision requests return results as VNClassificationObservation objects (or VNRecognizedObjectObservation objects for object-detection models), which identify what was found in the captured scene. So, if the recognition process in the ARSKView yields a matching tag, an ARAnchor is created, and an SK/SCN object can be placed at that ARAnchor.

Here's a code snippet showing how it works:

import UIKit
import ARKit
import Vision
import SpriteKit

.................................................................

// file – ARBridge.swift
// Shared helper that maps each ARAnchor to the classification label that produced it
class ARBridge {
    static let shared = ARBridge()
    var anchorsToIdentifiers = [ARAnchor: String]()
}

.................................................................

// file – Scene.swift
// Assumes `sceneView` (the hosting ARSKView) and `currentFrame` (its session's
// current ARFrame) are available in the enclosing scope.
DispatchQueue.global(qos: .background).async {
    do {
        // Adding Inceptionv3.mlmodel to the project auto-generates the Inceptionv3 class
        let model = try VNCoreMLModel(for: Inceptionv3().model)
        let request = VNCoreMLRequest(model: model, completionHandler: { (request, error) in
            DispatchQueue.main.async {
                guard let results = request.results as? [VNClassificationObservation],
                      let result = results.first else {
                    print("No results.")
                    return
                }
                // Place an anchor 0.75 m in front of the camera and remember its label
                var translation = matrix_identity_float4x4
                translation.columns.3.z = -0.75
                let transform = simd_mul(currentFrame.camera.transform, translation)
                let anchor = ARAnchor(transform: transform)
                ARBridge.shared.anchorsToIdentifiers[anchor] = result.identifier
                sceneView.session.add(anchor: anchor)
            }
        })
        let handler = VNImageRequestHandler(cvPixelBuffer: currentFrame.capturedImage, options: [:])
        try handler.perform([request])
    } catch {
        print(error)
    }
}
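
For context, that snippet assumes it runs where sceneView and currentFrame are in scope; a hypothetical wrapper (not part of the original answer) could be:

import SpriteKit
import ARKit

// Hypothetical context: one place the background-queue snippet above could run from.
class Scene: SKScene {
    override func touchesBegan(_ touches: Set<UITouch>, with event: UIEvent?) {
        guard let sceneView = self.view as? ARSKView,
              let currentFrame = sceneView.session.currentFrame else { return }
        // Run the Vision request from the snippet above here, using
        // currentFrame.capturedImage and currentFrame.camera.transform.
    }
}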

.................................................................

// file – ViewController.swift
// ARSKViewDelegate method: supplies an SKNode for each ARAnchor added above
func view(_ view: ARSKView, nodeFor anchor: ARAnchor) -> SKNode? {
    guard let identifier = ARBridge.shared.anchorsToIdentifiers[anchor] else {
        return nil
    }
    let labelNode = SKLabelNode(text: identifier)
    labelNode.horizontalAlignmentMode = .center
    labelNode.verticalAlignmentMode = .center
    labelNode.fontName = UIFont.boldSystemFont(ofSize: 24).fontName
    return labelNode
}
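
To wire this up, the ARSKView's delegate has to be set and an AR session started. A minimal, assumed setup following Xcode's ARKit SpriteKit template (the "Scene" file name is illustrative):

import UIKit
import ARKit
import SpriteKit

// Assumed setup: present the SpriteKit scene in an ARSKView and run a world-tracking session.
class ViewController: UIViewController, ARSKViewDelegate {
    @IBOutlet var sceneView: ARSKView!

    override func viewDidLoad() {
        super.viewDidLoad()
        sceneView.delegate = self
        if let scene = SKScene(fileNamed: "Scene") {
            sceneView.presentScene(scene)
        }
    }

    override func viewWillAppear(_ animated: Bool) {
        super.viewWillAppear(animated)
        sceneView.session.run(ARWorldTrackingConfiguration())
    }

    // The view(_:nodeFor:) delegate method shown above belongs in this class.
}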

You can also download two of Apple's sample projects written by Vision engineers:

Recognizing Objects in Live Capture

Classifying Images with Vision and Core ML

Hope this helps.

Upvotes: 0
