RealityKit and Vision – How to call RayCast API

This question has also been asked on the Apple Developer Forums, but so far I have not seen any response there.

The question is really: after finding a point of interest in a frame from an ARSession, how do I convert that into a 3D world coordinate?

How I got a point:

import ARKit
import Vision

// Run a hand-pose request on the camera image of the current ARFrame.
let handler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage, orientation: .up, options: [:])
let handPoseRequest = VNDetectHumanHandPoseRequest()
....
try handler.perform([handPoseRequest])
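
For reference, the elided part extracts a recognized point roughly like this (a sketch only, using the index fingertip as an example joint; any joint name would do):

if let observation = handPoseRequest.results?.first,
   let indexTip = try? observation.recognizedPoint(.indexTip),
   indexTip.confidence > 0.5 {
    // indexTip.x and indexTip.y are normalized [0, 1] Vision coordinates,
    // with the origin at the bottom-left of the image.
    print(indexTip.location)
}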

Then I need to raycast from the 2D point, derived from ARFrame.capturedImage, into a 3D world coordinate:

fileprivate func convertVNPointTo3D(_ point: VNRecognizedPoint,
                                    _ session: ARSession,
                                    _ frame: ARFrame,
                                    _ viewSize: CGSize) -> Transform? {
    // Attempt to scale the recognized point from the camera image
    // resolution to the view size. (This is the calculation that does
    // not produce correct results.)
    let pointX = (point.x / Double(frame.camera.imageResolution.width)) * Double(viewSize.width)
    let pointY = (point.y / Double(frame.camera.imageResolution.height)) * Double(viewSize.height)

    let query = frame.raycastQuery(from: CGPoint(x: pointX, y: pointY),
                                   allowing: .estimatedPlane,
                                   alignment: .any)
    let results = session.raycast(query)

    if let first = results.first {
        return Transform(matrix: first.worldTransform)
    } else {
        return nil
    }
}

According to the API, the raycast query expects a UI point. However, I do not know how coordinates from capturedImage are converted to UI points; the calculation I used above is not correct.
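
Here is a sketch (untested) of the kind of conversion I believe the API expects, assuming ARFrame.displayTransform(for:viewportSize:) is the right mapping from normalized image space to normalized view space. Vision's origin is bottom-left, so the y value is flipped first:

import ARKit
import Vision

func viewPoint(for point: VNRecognizedPoint,
               frame: ARFrame,
               viewSize: CGSize,
               orientation: UIInterfaceOrientation) -> CGPoint {
    // Normalized image coordinates with a top-left origin.
    let normalized = CGPoint(x: point.x, y: 1 - point.y)
    // Map normalized image space into normalized view space.
    let inView = normalized.applying(frame.displayTransform(for: orientation,
                                                            viewportSize: viewSize))
    // Scale to actual view points.
    return CGPoint(x: inView.x * viewSize.width, y: inView.y * viewSize.height)
}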

Thanks.

Upvotes: 2

Views: 591

Answers (1)

The issue was the image orientation. In my case, using the iPad back camera in portrait orientation, I needed to use .downMirrored (instead of .up).

let handler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage, orientation: .downMirrored, options: [:])

Once the orientation is correct, the point values from image recognition can be used DIRECTLY for the raycast.
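
Putting it together, a minimal end-to-end sketch (assuming the index fingertip as the target joint, and that raycastQuery(from:) accepts the normalized coordinates directly, as described above):

import ARKit
import RealityKit
import Vision

func worldTransform(for frame: ARFrame, in session: ARSession) throws -> Transform? {
    // Orientation corrected for the iPad back camera in portrait.
    let handler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage,
                                        orientation: .downMirrored,
                                        options: [:])
    let request = VNDetectHumanHandPoseRequest()
    try handler.perform([request])

    guard let observation = request.results?.first,
          let indexTip = try? observation.recognizedPoint(.indexTip),
          indexTip.confidence > 0.5 else { return nil }

    // With the orientation corrected, the normalized point is passed
    // straight to the raycast query.
    let query = frame.raycastQuery(from: CGPoint(x: indexTip.x, y: indexTip.y),
                                   allowing: .estimatedPlane,
                                   alignment: .any)
    return session.raycast(query).first.map { Transform(matrix: $0.worldTransform) }
}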

Upvotes: 1
