I'm working with the implicit-depth repo by Niantic Labs, but I've run into an issue. The repo does not share the code for projecting 3D points or depth values onto the 2D camera image plane. I'm trying to project a 3D object into a scene using pyrender, ARKit intrinsics, and a camera pose matrix, but the object is not rendered correctly: it ends up outside the image plane (checked with the 3D viewer).
Here's what I'm doing:
Intrinsics: Using ARKit-provided values (fx, fy, cx, cy), where focal lengths are in pixels, and the principal point (cx, cy) is offset from the top-left corner of the image.
Pose: Using the 4x4 pose matrix provided by ARKit (simd_float4x4) for the camera's position and orientation in the world.
Rendering: Using pyrender to render the object with IntrinsicsCamera and the given pose matrix.
I've also tried normalizing the intrinsics (cx, cy) relative to the image dimensions, but the alignment issue persists.
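For reference, this is the projection convention I'm assuming (a minimal sketch of my own, not code from the repo): ARKit's frame.camera.transform is camera-to-world with OpenGL-style axes (camera looks down -z, y up), so to project a world point I invert the pose and flip y/z into the usual CV pixel convention:

import numpy as np

def project_point(p_world, pose_c2w, fx, fy, cx, cy):
    # pose_c2w: 4x4 camera-to-world matrix (ARKit convention:
    # camera looks down -z, y up). fx, fy, cx, cy are in pixels,
    # with the origin at the top-left corner of the image.
    p_cam = np.linalg.inv(pose_c2w) @ np.append(p_world, 1.0)  # world -> camera
    x, y, z = p_cam[:3]
    # OpenGL-style camera (y up, -z forward) -> CV camera (y down, +z forward)
    x_cv, y_cv, z_cv = x, -y, -z  # z_cv > 0 for points in front of the camera
    u = fx * x_cv / z_cv + cx
    v = fy * y_cv / z_cv + cy
    return u, v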
Is there anything I'm missing when rendering ARKit camera pose data?
Here is the code I'm using for ARKit capture:
private func createFrameMetadata(
    frame: ARFrame,
    depthFileName: String,
    rgbFileName: String,
    depthResolution: [Int],
    objPosition: [Float]
) -> [String: Any] {
    let timestamp = frame.timestamp

    // Pinhole intrinsics for the landscape sensor image, in pixels
    let intrinsics = frame.camera.intrinsics
    let fx = intrinsics.columns.0.x
    let fy = intrinsics.columns.1.y
    let cx = intrinsics.columns.2.x
    let cy = intrinsics.columns.2.y

    let widthRGB = CVPixelBufferGetWidth(frame.capturedImage)
    let heightRGB = CVPixelBufferGetHeight(frame.capturedImage)

    // Camera-to-world transform, flattened column by column (column-major)
    let transform = frame.camera.transform
    let pose4x4: [Float] = [
        transform.columns.0.x, transform.columns.0.y, transform.columns.0.z, transform.columns.0.w,
        transform.columns.1.x, transform.columns.1.y, transform.columns.1.z, transform.columns.1.w,
        transform.columns.2.x, transform.columns.2.y, transform.columns.2.z, transform.columns.2.w,
        transform.columns.3.x, transform.columns.3.y, transform.columns.3.z, transform.columns.3.w
    ]

    return [
        "timestamp": timestamp,
        "intrinsics": [fx, fy, cx, cy, 0],
        "depth": depthFileName,
        "image": rgbFileName,
        "pose4x4": pose4x4,
        "resolution": [widthRGB, heightRGB],
        "depthResolution": depthResolution,
        "objPosition": objPosition // [x, y, z]
    ]
}
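For completeness, a minimal sketch of how this metadata can be read back on the Python side (assuming the dictionary above is serialized to JSON; frame_metadata.json is just a placeholder name):

import json
import numpy as np

with open('frame_metadata.json') as f:  # hypothetical file name
    meta = json.load(f)

fx, fy, cx, cy = meta['intrinsics'][:4]
# pose4x4 was written column by column, so reshape row-wise and transpose
pose = np.array(meta['pose4x4'], dtype=np.float32).reshape(4, 4).T
width, height = meta['resolution']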
And the Python code (note that pose4x4 is stored column-major, hence the reshape-and-transpose below):
import numpy as np
import trimesh
import pyrender
import matplotlib.pyplot as plt

# Load the object and scale it down to scene units (metres)
fuze_trimesh = trimesh.load('./3D_models/teapot.obj')
fuze_trimesh.apply_scale(0.006)
mesh = pyrender.Mesh.from_trimesh(fuze_trimesh)

scene = pyrender.Scene()
scene.add(mesh)

# Intrinsics and pose come from ARKit (real captured data)
fx = 1350.447509765625
fy = 1350.447509765625
cx = 956.402587890625
cy = 728.2923583984375

# pose4x4 as captured by the Swift code above (column-major)
pose = np.array([
    0.96495628356933594, 0.037732210010290146, 0.25968354940414429, 0,
    -0.00015643320512026548, 0.98969066143035889, -0.14322151243686676, 0,
    -0.26241043210029602, 0.13816189765930176, 0.95501405000686646, 0,
    -0.17561252415180206, -0.0080626010894775391, 0.22842416167259216, 1
])

camera = pyrender.IntrinsicsCamera(fx=fx, fy=fy, cx=cx, cy=cy)

# Reshape row-wise, then transpose to undo the column-major flattening
pose = np.reshape(pose, (4, 4))
pose = np.transpose(pose)

camera_pose = pose  # no extra landscape/portrait rotation applied
scene.add(camera, pose=camera_pose)

light = pyrender.DirectionalLight(color=np.array([1.0, 1.0, 1.0]), intensity=5.0)
scene.add(light, pose=camera_pose)

# Render offscreen at the RGB resolution and display the result
r = pyrender.OffscreenRenderer(1920, 1440)
color, depth = r.render(scene)
plt.imshow(color)
plt.show()
r.delete()

pyrender.Viewer(scene)  # interactive 3D view used to inspect the scene
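As a sanity check (again assuming ARKit's OpenGL-style camera axes, which is my own assumption), I can project the world origin, where the teapot sits, by hand and see whether it should land inside the image at all:

# Sanity check: project the world origin (teapot location) into the image
p_cam = np.linalg.inv(camera_pose) @ np.array([0.0, 0.0, 0.0, 1.0])
x, y, z = p_cam[:3]  # z < 0 means the point is in front of the camera
u = fx * x / -z + cx
v = fy * -y / -z + cy
print(u, v)  # should fall inside [0, 1920) x [0, 1440) if the camera sees the origin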