daVincere

Reputation: 170

Training a model to achieve DLib facial-landmark-style feature points for a hand and its landmarks

[I'm a noob in Machine Learning and OpenCV]
Below are the results, i.e., the 68 facial landmarks that you get on applying DLib's facial landmarks model, which can be found here.

It mentions in this script that the model was trained on the iBUG 300-W face landmark dataset.
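For context, this is roughly how that model is applied with dlib's Python bindings (a minimal sketch; the paths are placeholders):

```python
import dlib

PREDICTOR_PATH = "shape_predictor_68_face_landmarks.dat"  # the model linked above

detector = dlib.get_frontal_face_detector()   # HOG-based face detector
predictor = dlib.shape_predictor(PREDICTOR_PATH)

img = dlib.load_rgb_image("face.jpg")         # placeholder image
for rect in detector(img):                    # one rectangle per detected face
    shape = predictor(img, rect)              # 68 landmark points for this face
    points = [(shape.part(i).x, shape.part(i).y)
              for i in range(shape.num_parts)]
    print(points)
```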

Now, I wish to create a similar model that maps a hand's landmarks. I have the hand dataset here.

What I don't get is:
1. how am I supposed to train the model on those positions? Would I have to manually mark each joint in every single image or is there an optimised way for this?
2. In DLib's model, each facial landmark position has a particular index, e.g., one eyebrow is points 22, 23, 24, 25, and 26. At what point would they have been given those values?
3. Would training those images on DLib's shape predictor training script suffice or would I have to train the model on other frameworks (like Tensorflow + Keras) too?

Upvotes: 3

Views: 5733

Answers (2)

daVincere

Reputation: 170

@thachnb's answer covers everything.

  1. Labelling has to be done manually. With DLib, imglab comes with the source and can easily be built with CMake. It allows for:
    • labelling with boxes
    • annotating/denoting parts (features) of an object of interest, e.g., the face is the object and the landmark positions are its parts

Amazon Mechanical Turk was recommended to me a lot during my research.

  2. The features are given those values during labelling. You can be creative with the naming conventions as long as you are consistent.
  3. As only landmark/feature-point training is required here, DLib's pre-built shape predictor trainer would suffice; a minimal sketch follows below. DLib has pretty in-depth documentation, so it won't be hard to follow along.
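A rough training sketch with dlib's Python bindings, assuming imglab was used to produce a hands_training.xml in dlib's annotation format (bounding boxes plus named part points). The file names and option values are placeholders, adapted from dlib's shape predictor training example:

```python
import dlib

TRAIN_XML = "hands_training.xml"   # placeholder: annotation XML written by imglab
MODEL_OUT = "hand_predictor.dat"   # placeholder: output model file

options = dlib.shape_predictor_training_options()
options.oversampling_amount = 300  # heavy oversampling helps small datasets
options.nu = 0.05                  # regularization; smaller values generalize better
options.tree_depth = 2
options.be_verbose = True

dlib.train_shape_predictor(TRAIN_XML, MODEL_OUT, options)

# Mean landmark error over the training set; use a held-out XML in practice.
print(dlib.test_shape_predictor(TRAIN_XML, MODEL_OUT))
```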

Additionally, you might need some good hand dataset resources for this. Below are some great suggestions:

Hope this works out for others.

Upvotes: 3

thachnb

Reputation: 1533

  1. how am I supposed to train the model on those positions? Would I have to manually mark each joint in every single image or is there an optimised way for this?

    -> Yes, you should do it all manually: detect the hand location and define how many points you need to describe the shape.

  2. In DLib's model, each facial landmark position has a particular index, e.g., one eyebrow is points 22, 23, 24, 25, and 26. At what point would they have been given those values?

    -> They are assigned during the labelling/learning step. For example, if you want 3 points for each finger and 2 more points for the wrist, you have 15 + 2 = 17 points in total. It then depends on how you defined which points belong to which finger, e.g. point[0] to point[2] belong to the thumb, and so on (see the sketch after this list).

  3. Would training those images on DLib's shape predictor training script suffice or would I have to train the model on other frameworks (like Tensorflow + Keras) too?

    -> With dlib you can do everything.
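To make the indexing concrete, here is a minimal sketch of the hypothetical 17-point convention described above, applied with a predictor trained as in the other answer. All file names, part groupings, and the fixed bounding box are assumptions:

```python
import dlib

# Hypothetical part-index convention: the indices mean whatever you decided
# during labelling; here, 3 points per finger plus 2 wrist points = 17.
HAND_PARTS = {
    "thumb":  range(0, 3),
    "index":  range(3, 6),
    "middle": range(6, 9),
    "ring":   range(9, 12),
    "pinky":  range(12, 15),
    "wrist":  range(15, 17),
}

predictor = dlib.shape_predictor("hand_predictor.dat")  # placeholder model
img = dlib.load_rgb_image("hand.jpg")                   # placeholder image

# A shape predictor needs a bounding box; in practice it would come from a
# hand detector. The whole image is used here only to keep the sketch short.
rect = dlib.rectangle(0, 0, img.shape[1] - 1, img.shape[0] - 1)

shape = predictor(img, rect)
for finger, idxs in HAND_PARTS.items():
    print(finger, [(shape.part(i).x, shape.part(i).y) for i in idxs])
```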

Upvotes: 4
