abhimuralidharan

Reputation: 5939

Using iPhone TrueDepth sensor to detect a real face vs photo?

How can I use the depth data captured using iPhone true-depth Camera to distinguish between a real human 3D face and a photograph of the same? The requirement is to use it for authentication.

What I did: Created a sample app to get a continuous stream of AVDepthData of what is in front of the camera.
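
Roughly, the capture setup looks like this (a simplified sketch; class and queue names are just for illustration):

import AVFoundation

// Simplified sketch: front TrueDepth camera -> AVCaptureDepthDataOutput
// -> delegate callback delivering a continuous stream of AVDepthData.
final class DepthStreamer: NSObject, AVCaptureDepthDataOutputDelegate {

    private let session = AVCaptureSession()
    private let depthOutput = AVCaptureDepthDataOutput()
    private let depthQueue = DispatchQueue(label: "depth.queue")

    func start() {
        guard let device = AVCaptureDevice.default(.builtInTrueDepthCamera,
                                                   for: .video,
                                                   position: .front),
              let input = try? AVCaptureDeviceInput(device: device) else { return }

        session.beginConfiguration()
        if session.canAddInput(input) { session.addInput(input) }
        if session.canAddOutput(depthOutput) {
            session.addOutput(depthOutput)
            depthOutput.isFilteringEnabled = true          // temporally smoothed depth
            depthOutput.setDelegate(self, callbackQueue: depthQueue)
        }
        session.commitConfiguration()
        session.startRunning()
    }

    // Called continuously with fresh AVDepthData frames.
    func depthDataOutput(_ output: AVCaptureDepthDataOutput,
                         didOutput depthData: AVDepthData,
                         timestamp: CMTime,
                         connection: AVCaptureConnection) {
        // depthData.depthDataMap is the per-pixel depth/disparity CVPixelBuffer.
    }
}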

Upvotes: 17

Views: 2716

Answers (2)

Andy Jazz

Reputation: 58493

Theory

The TrueDepth sensor lets iPhone X through iPhone 14 generate a high-quality ZDepth channel in addition to the RGB channels captured by the regular selfie camera. The ZDepth channel lets us visually tell whether we're looking at a real human face or just a photo: in the ZDepth channel a human face appears as a gradient, while a photo is almost a solid color, because every pixel on the photo's plane is equidistant from the camera.


AVFoundation

At the moment the AVFoundation API has no Bool-type instance property telling you whether it's a real face or a photo, but AVFoundation's capture subsystem provides the AVDepthData class – a container for per-pixel distance data (a depth map) captured by the camera device. A depth map describes, at each pixel, the distance to an object in meters.

@available(iOS 11.0, *)
open class AVDepthData: NSObject {

    open var depthDataType: OSType { get }
    open var depthDataMap: CVPixelBuffer { get }
    open var isDepthDataFiltered: Bool { get }
    open var depthDataAccuracy: AVDepthDataAccuracy { get }
}

A pixel buffer is capable of containing the depth data's per-pixel depth or disparity map.

var depthDataMap: CVPixelBuffer { get }
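
For illustration, here is a rough sketch (not Apple sample code) of how you could read that depth map and measure how much the depth values vary – a printed photo gives an almost flat map, while a real face shows several centimeters of relief. In practice you would restrict the check to the detected face region, and the threshold is just an assumption to tune.

import AVFoundation

// Rough sketch: measure the spread of depth values in a depth map.
// A flat photo yields almost constant depth; a real face shows relief.
// In practice, restrict the loop to the detected face bounding box
// so the background doesn't dominate the result.
func depthSpread(of depthData: AVDepthData) -> Float {
    // Work with 32-bit float depth values regardless of the native format.
    let converted = depthData.converting(toDepthDataType: kCVPixelFormatType_DepthFloat32)
    let buffer = converted.depthDataMap

    CVPixelBufferLockBaseAddress(buffer, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(buffer, .readOnly) }

    guard let base = CVPixelBufferGetBaseAddress(buffer) else { return 0 }
    let width = CVPixelBufferGetWidth(buffer)
    let height = CVPixelBufferGetHeight(buffer)
    let rowBytes = CVPixelBufferGetBytesPerRow(buffer)

    var minDepth = Float.greatestFiniteMagnitude
    var maxDepth = -Float.greatestFiniteMagnitude

    for y in 0..<height {
        let row = base.advanced(by: y * rowBytes).assumingMemoryBound(to: Float32.self)
        for x in 0..<width {
            let d = row[x]
            if d.isFinite && d > 0 {                 // skip invalid pixels
                minDepth = min(minDepth, d)
                maxDepth = max(maxDepth, d)
            }
        }
    }
    return maxDepth > minDepth ? maxDepth - minDepth : 0
}

// Usage – the 0.03 m (3 cm) threshold is an assumption to tune experimentally:
// let looksLikeRealFace = depthSpread(of: depthData) > 0.03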

ARKit

ARKit's heart beats thanks to AVFoundation and CoreMotion sessions (to a certain extent it also uses Vision). You can of course use this framework for human face detection, but remember that ARKit is a computationally intensive module due to its "heavy-metal" tracking subsystem. For successful detection of a real face (not a photo), use ARFaceAnchor, which lets you register the head's motion and orientation at 60 fps, and facial blendshapes, which let you register the user's facial expressions in real time. To make it impossible to simulate facial expressions with a video instead of a real person, have the AR app issue random text commands asking the user to show a particular facial expression.
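
As a rough sketch of that idea (the class name, the expression list and the 0.7 threshold are illustrative assumptions, not a ready-made recipe):

import ARKit

// Sketch of a blendshape-based liveness challenge.
final class LivenessChecker: NSObject, ARSessionDelegate {

    private let session = ARSession()

    // Randomly chosen expression the user is asked to perform.
    private let challenge = [ARFaceAnchor.BlendShapeLocation.mouthSmileLeft,
                             .eyeBlinkLeft, .jawOpen, .browInnerUp].randomElement()!

    func start() {
        guard ARFaceTrackingConfiguration.isSupported else { return }
        session.delegate = self
        session.run(ARFaceTrackingConfiguration())
        print("Please show: \(challenge.rawValue)")      // e.g. "jawOpen"
    }

    func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
        guard let face = anchors.compactMap({ $0 as? ARFaceAnchor }).first else { return }

        // Blendshape coefficients run from 0.0 (neutral) to 1.0 (fully expressed).
        let value = face.blendShapes[challenge]?.floatValue ?? 0
        if value > 0.7 {                                 // threshold is an assumption
            print("Challenge \(challenge.rawValue) passed – looks like a live face")
        }
    }
}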


Vision

Implement Apple Vision and CoreML techniques to recognize and classify a human face contained in a CVPixelBuffer. But remember, you need a ZDepth-to-RGB conversion in order to work with Apple Vision – AI/ML mobile frameworks don't work with depth-map data directly at the moment. When you want to use RGBD data for authentication and there are only one or two users' faces to recognize, the model-training task becomes considerably simpler: all you have to do is create an mlmodel for Vision containing many variations of ZDepth facial images.

You can use Apple's Create ML app to generate lightweight and effective mlmodel files.
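
As a rough illustration (FaceDepthClassifier is a hypothetical Create ML image classifier, and the snippet assumes the depth map has already been converted to an RGB pixel buffer), classification with Vision could look like this:

import Vision
import CoreML

// Sketch: classify a ZDepth frame that has already been converted to RGB.
// FaceDepthClassifier is a hypothetical mlmodel trained on "real face" vs
// "photo" depth images; replace it with your own model.
func classifyDepthFrame(_ rgbDepthBuffer: CVPixelBuffer,
                        completion: @escaping (String, Float) -> Void) {
    do {
        let coreMLModel = try FaceDepthClassifier(configuration: MLModelConfiguration()).model
        let visionModel = try VNCoreMLModel(for: coreMLModel)

        let request = VNCoreMLRequest(model: visionModel) { request, _ in
            guard let top = (request.results as? [VNClassificationObservation])?.first else { return }
            completion(top.identifier, top.confidence)   // e.g. ("realFace", 0.97)
        }

        let handler = VNImageRequestHandler(cvPixelBuffer: rgbDepthBuffer, options: [:])
        try handler.perform([request])
    } catch {
        print("Classification failed: \(error)")
    }
}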

Useful links

To visualize depth data coming from the TrueDepth camera, use the following sample code. You can find sample code for detecting and classifying images using Vision here and here. Also, look at this post to find out how to convert AVDepthData to a regular RGB pattern.

Upvotes: 12

Tanya Rao

Reputation: 11

You can make use of AVCaptureMetadataOutput and AVCaptureDepthDataOutput to detect a face and then take the required action.
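
For example, a minimal sketch of pairing the two outputs on the same capture session (configuration details are assumptions and would need tuning):

import AVFoundation

// Sketch: add face-metadata and depth outputs to an existing session.
func addFaceAndDepthOutputs(to session: AVCaptureSession,
                            delegate: AVCaptureMetadataOutputObjectsDelegate & AVCaptureDepthDataOutputDelegate,
                            queue: DispatchQueue) {
    let metadataOutput = AVCaptureMetadataOutput()
    if session.canAddOutput(metadataOutput) {
        session.addOutput(metadataOutput)
        metadataOutput.setMetadataObjectsDelegate(delegate, queue: queue)
        // .face delivers AVMetadataFaceObject instances with the face bounds.
        metadataOutput.metadataObjectTypes = [.face]
    }

    let depthOutput = AVCaptureDepthDataOutput()
    if session.canAddOutput(depthOutput) {
        session.addOutput(depthOutput)
        depthOutput.setDelegate(delegate, callbackQueue: queue)
    }
    // In the delegates: when a face is reported, inspect the depth values inside
    // its bounding box to decide whether the face is flat (photo) or has relief.
}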

Upvotes: -2
