Reputation: 945
What are the ways to identify a particular object in a room, and the position of the user, for indoor navigation with AR? I understand that we can use beacons and markers to identify an object or the location of the user in a room.
Without using them, what are the alternatives for finding the user's location and identifying an object for an AR experience? I am exploring AR for indoor navigation on iOS devices (currently focusing on ARKit). If we use Core Location for user positioning, the accuracy is low. In a small shop, Core Location or any map-related service will misplace the user or the products, which leads to a poor experience. Are there other ways to solve this?
Upvotes: 0
Views: 1981
Reputation: 1761
The obvious alternative for detecting objects visually in a scene is to use the Core ML framework together with ARKit. A basic app is already available on GitHub.
You can also obtain the worldPosition of those objects relative to the starting origin and plot them on an x,z coordinate system (an indoor map) based on the position of each SCNNode label. It's not going to be that accurate, but it gives you a basic object identification and positioning system.
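As a minimal sketch of the x,z plotting idea, assuming ARKit's default gravity-aligned world coordinates (y up, the camera initially looking along -z): the type and function names below (MapPoint, floorPlanPoint, pixelPoint) are hypothetical, not part of ARKit or SceneKit. In a real app the three inputs would come from a label node's worldPosition.

```swift
import Foundation

/// A 2D point, used both for floor-plan metres and for map-image pixels.
/// (Hypothetical helper type, not an ARKit or SceneKit type.)
struct MapPoint: Equatable {
    let x: Double
    let y: Double
}

/// Projects a 3D world position onto the floor plane by dropping the
/// height (y) component. z is negated so that "forward" from the
/// session's starting pose points up the map.
func floorPlanPoint(worldX: Double, worldY: Double, worldZ: Double) -> MapPoint {
    return MapPoint(x: worldX, y: -worldZ)
}

/// Converts floor-plan metres into pixel coordinates on a map image.
/// `originPixel` is where the AR session's starting point sits on the image;
/// image y grows downward, so the map's y axis is flipped.
func pixelPoint(for point: MapPoint,
                pixelsPerMetre: Double,
                originPixel: MapPoint) -> MapPoint {
    return MapPoint(x: originPixel.x + point.x * pixelsPerMetre,
                    y: originPixel.y - point.y * pixelsPerMetre)
}

// Example: a label 1.5 m to the right of and 2 m in front of the start pose.
let onFloor = floorPlanPoint(worldX: 1.5, worldY: 0.9, worldZ: -2.0)
let onImage = pixelPoint(for: onFloor,
                         pixelsPerMetre: 100,
                         originPixel: MapPoint(x: 50, y: 400))
print(onFloor, onImage)
```

Positions found this way are only as good as ARKit's tracking, and they are relative to wherever the session started, so the starting point still has to be tied to the shop's floor plan somehow.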
Edit:
One limitation of using an out-of-the-box Core ML image classifier such as Inceptionv3.mlmodel is that it only detects the dominant object in the image, drawn from a set of generic categories such as trees, animals, food, vehicles, people, and more.
You mention doing object recognition (image classification) inside a retail shop. This will need a custom image classifier that can, for example, distinguish between iPhone models (iPhone 7, iPhone 8 or iPhone X) rather than merely determining that it's a smartphone.
To create your own object recogniser (image classifier) for ARKit, follow this tutorial written by Hunter Ward:
https://medium.com/@hunter.ley.ward/create-your-own-object-recognizer-ml-on-ios-7f8c09b461a1
The code is available on GitHub:
https://github.com/hanleyweng/Gesture-Recognition-101-CoreML-ARKit
Note: If you need to create a custom classifier for hundreds of items in a retail shop, Ward recommends around 60 images per class, which for 100 items totals around 60 x 100 = 6000 images. To generate the Core ML model, Ward uses a Microsoft Cognitive Service called “Custom Vision”, which currently has a limit of 1000 images. So if you need more than 1000 images, you will have to find another way to create the model.
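The arithmetic above can be checked with a small sketch. TrainingBudget is a hypothetical helper, and the numbers are simply the ones quoted in this answer (60 images per class, a 1000-image limit), not values read from any service.

```swift
/// Rough training-set budget for a custom image classifier.
/// (Hypothetical helper; all figures are supplied by the caller.)
struct TrainingBudget {
    let classCount: Int
    let imagesPerClass: Int
    let serviceImageLimit: Int

    /// Total images needed to cover every class.
    var totalImages: Int { classCount * imagesPerClass }

    /// Whether the whole training set fits under the service's limit.
    var fitsWithinLimit: Bool { totalImages <= serviceImageLimit }

    /// How many classes could be fully covered without exceeding the limit.
    var maxClassesWithinLimit: Int { serviceImageLimit / imagesPerClass }
}

let shop = TrainingBudget(classCount: 100, imagesPerClass: 60, serviceImageLimit: 1000)
print(shop.totalImages, shop.fitsWithinLimit, shop.maxClassesWithinLimit)
```

With these figures, only 16 product classes would fit under the limit at 60 images each.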
Upvotes: 3