Using Kinect to pick up 2 hand gestures in C++
I am trying to build a program that takes just the hand inputs from the Kinect.
I need to acquire three things:
- the Kinect depth video stream, with OpenGL output drawn on top of it
- recognition of just two simple hand gestures, open hand and closed fist; I will write a function to reduce each hand's state to a Boolean
- left and right hand positions; if it is possible to track more than two hands, that would be great
Basically, I want to perform a click-and-drag mouse operation with open- and closed-hand motions on the Kinect. Let's start with just one hand; if more than two hands are possible, I will learn that later.
From what I have read so far, the Kinect SDK can do this without any extra libraries, so I should be able to build my application with just the Kinect SDK and OpenGL.
I have heard there are tons of examples for this online, but all I have found so far are in C#, not C++. The other components of my program are in C++, and I want to stay with C++ if possible.
Answers (1)
There are essentially two layers:
- Interaction Stream (C++ or managed)
- Interaction Controls (managed only, WPF-specific)
The WPF controls are implemented in terms of the interaction stream.
If you are using a UI framework other than WPF, you will need to do the following:
- Implement the "interaction client" interface. This interface has a
single method, GetInteractionInfoAtLocation. This method will be
called repeatedly by the interaction stream as it tracks the user's
hand movements. Each time it is called, it is your responsibility to
return the "interaction info" (InteractionInfo in managed,
NUI_INTERACTION_INFO in C++) for the given user, hand, and position.
Essentially, this is how the interaction stream performs hit-testing
on the controls within your user interface.
- Create an instance of the interaction stream, supplying it a
reference to your interaction client implementation.
- Start the Kinect sensor's depth and skeleton streams.
- For each depth and skeleton frame produced by the sensor streams,
pass the frame's data to the appropriate method (ProcessDepth or
ProcessSkeleton) of the interaction stream. As the interaction stream
processes the input frames from the sensor, it will produce
interaction frames for your code to consume. In C++, call the
interaction stream's GetNextFrame method to retrieve each such frame.
In managed code, you can either call OpenNextFrame, or subscribe to
the InteractionFrameReady event.
- Read the data from each interaction frame to find out what the user
is doing. Each frame has a timestamp and a collection of user info
structures, each of which has a user tracking ID and a collection of
hand info structures, which provide information about each hand's
  position, state, and grip/ungrip events.
You can find a complete sample here.