jegadeesh

Reputation: 945

Augmented reality - ARKit - detecting objects dynamically (without marker/iBeacon) in an environment

What are the ways to identify a particular object in a room, and the position of the user, for indoor navigation with AR? I understand that we can use beacons and markers to identify an object or the user's location in a room.

Without using them, what are the alternatives for finding the user's location and identifying objects for an AR experience? I am exploring AR for indoor navigation on iOS devices (currently focusing on ARKit). If we use Core Location for user positioning, the accuracy is low. In a small shop, using Core Location or any map-related service leads to user/product mispositioning and a poor experience for users. Are there any other ways/solutions to this?

Upvotes: 0

Views: 1981

Answers (1)

Clay

Reputation: 1761

The obvious alternative way to detect objects visually in a scene would be to use the Core ML framework with ARKit. A basic app is already available on GitHub:

CoreML-in-ARKit
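
For illustration, here is a minimal sketch (not the linked project's exact code) of feeding ARKit camera frames through a Core ML classifier via the Vision framework. The Inceptionv3 model name assumes the .mlmodel has been added to the Xcode project:

```swift
import ARKit
import Vision

// Minimal sketch: classify the current ARKit camera frame with a Core ML
// model via Vision. "Inceptionv3" is assumed to be an .mlmodel in the project.
final class FrameClassifier {
    private let request: VNCoreMLRequest

    init() throws {
        let model = try VNCoreMLModel(for: Inceptionv3().model)
        request = VNCoreMLRequest(model: model)
        request.imageCropAndScaleOption = .centerCrop
    }

    // Call from session(_:didUpdate:) or on a timer with the latest frame.
    func classify(frame: ARFrame, completion: @escaping (String, Float) -> Void) {
        let handler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage, options: [:])
        DispatchQueue.global(qos: .userInitiated).async {
            try? handler.perform([self.request])
            if let best = (self.request.results as? [VNClassificationObservation])?.first {
                completion(best.identifier, best.confidence)
            }
        }
    }
}
```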

You can also obtain the worldPosition of those objects relative to the session's starting origin and plot their x, z coordinates (an indoor map) based on the SCNNode label position. It's not going to be that accurate, but it's a basic object identification and positioning system.
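
A rough sketch of that idea, assuming the app has an ARSCNView called sceneView (the helper name placeLabel is mine, not from the linked project): hit-test the screen centre when a classification arrives, attach a text node at the resulting world position, and read its x/z for the map.

```swift
import ARKit
import SceneKit

// Hit-test the centre of the screen, drop a labelled SCNNode at the resulting
// world position, and read its x/z as a point on a simple top-down indoor map.
func placeLabel(_ text: String, in sceneView: ARSCNView) {
    let center = CGPoint(x: sceneView.bounds.midX, y: sceneView.bounds.midY)
    guard let hit = sceneView.hitTest(center, types: [.featurePoint]).first else { return }

    let node = SCNNode(geometry: SCNText(string: text, extrusionDepth: 0.01))
    node.scale = SCNVector3(0.005, 0.005, 0.005)
    node.simdTransform = hit.worldTransform
    sceneView.scene.rootNode.addChildNode(node)

    // x/z relative to the session origin: the 2D coordinates for the indoor map.
    print("\(text) at map position x: \(node.worldPosition.x), z: \(node.worldPosition.z)")
}
```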

Edit:

One limitation of using an out-of-the-box Core ML image classifier like Inceptionv3.mlmodel is that it only detects the dominant generic object in a frame, from a set of generic categories such as trees, animals, food, vehicles, people, and so on.

You mention doing object recognition (image classification) inside a retail shop. This will need a custom image classifier that can, for example, discriminate between different iPhone models (iPhone 7, iPhone 8, or iPhone X) rather than merely determining that it's a smartphone.

To create your own object recogniser (image classifier) for ARKit, follow this tutorial written by Hunter Ward:

https://medium.com/@hunter.ley.ward/create-your-own-object-recognizer-ml-on-ios-7f8c09b461a1

The code is available on GitHub:

https://github.com/hanleyweng/Gesture-Recognition-101-CoreML-ARKit

Note: If you need to create a custom classifier for hundreds of items in a retail shop, keep in mind that Ward recommends around 60 images per class, which for 100 classes would total around 60 x 100 = 6,000 images. To generate the Core ML model, Ward uses a Microsoft Cognitive Service called "Custom Vision", which currently has a limit of 1,000 images, so if you need more than 1,000 images you will have to find another way to create the model.
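
Once you have a custom-trained model (exported from Custom Vision or elsewhere) and add it to the project, it slots into the same Vision request shown above. A hypothetical sketch, where the RetailProducts class name is an assumption standing in for your generated model class:

```swift
import Vision
import CoreML

// Swap the custom classifier into the Vision pipeline in place of Inceptionv3.
// "RetailProducts" is a hypothetical .mlmodel name, not a real asset.
func makeCustomRequest() throws -> VNCoreMLRequest {
    let model = try VNCoreMLModel(for: RetailProducts().model)
    let request = VNCoreMLRequest(model: model) { request, _ in
        guard let best = (request.results as? [VNClassificationObservation])?.first else { return }
        print("Detected \(best.identifier) (\(Int(best.confidence * 100))%)")
    }
    request.imageCropAndScaleOption = .centerCrop
    return request
}
```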

Upvotes: 3
