Reputation: 163
Does anyone have any idea what technique I should use to make the displayed video shift left, right, up and down, as in the video below? I want to achieve this with a Kinect, but with a different idea.
Thanks in advance.
http://www.youtube.com/watch?v=V2hxaijuZ6w
Upvotes: 2
Views: 689
Reputation: 8866
EDIT:
Now that I'm awake, I'll go into better detail about this (it apparently took me a week to wake up).
So the Winscape project connects a real and a virtual world by giving windows from the real world into a virtual world. It does this by acting as if the real world were part of the virtual world, then changing the display of the monitors (disguised to look like windows) to replicate the view a person would see if they existed in the virtual world.
Imagine your virtual world. It doesn't necessarily have an end to it, but there's a point where you stop trying to render stuff into it, so let's say the world is enclosed in a box that contains all the rendered elements. Now what Winscape does is make it appear that the virtual world actually exists in the real world, and that you can see it through the monitors.
First step is obviously to create your virtual world. For starters, I'd suggest just creating a literal box. Make each wall a different color, or put color gradients on the walls. Make something simple. If you haven't already decided on a 3D framework to handle this, I'd suggest XNA. It's C#, which works with the Kinect SDK, and it's got a ton of tutorials online to help you. Once you've created your world, use XNA to place a camera inside the box and add some simple controls to rotate the camera. This will allow you to look around the box from the inside, to make sure the rendering is working as expected.
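To make that concrete, here's a minimal sketch of the colored-box-plus-rotatable-camera idea, assuming XNA 4.0. `BoxWorldGame`, the wall coordinates, and the colors are all illustrative, and only one of the six walls is built out:

```csharp
using Microsoft.Xna.Framework;
using Microsoft.Xna.Framework.Graphics;
using Microsoft.Xna.Framework.Input;

// Sketch: a camera at the centre of a colored box, rotated with the arrow keys.
public class BoxWorldGame : Game
{
    GraphicsDeviceManager graphics;
    BasicEffect effect;
    VertexPositionColor[] wall;
    float yaw, pitch;

    public BoxWorldGame()
    {
        graphics = new GraphicsDeviceManager(this);
    }

    protected override void LoadContent()
    {
        effect = new BasicEffect(GraphicsDevice) { VertexColorEnabled = true };
        // One wall of the box as two triangles; repeat for the other five walls.
        wall = new[]
        {
            new VertexPositionColor(new Vector3(-5, -5, -5), Color.Red),
            new VertexPositionColor(new Vector3(-5,  5, -5), Color.Green),
            new VertexPositionColor(new Vector3( 5,  5, -5), Color.Blue),
            new VertexPositionColor(new Vector3(-5, -5, -5), Color.Red),
            new VertexPositionColor(new Vector3( 5,  5, -5), Color.Blue),
            new VertexPositionColor(new Vector3( 5, -5, -5), Color.Yellow),
        };
    }

    protected override void Update(GameTime gameTime)
    {
        // Simple controls to rotate the camera and look around the box.
        KeyboardState k = Keyboard.GetState();
        float dt = (float)gameTime.ElapsedGameTime.TotalSeconds;
        if (k.IsKeyDown(Keys.Left))  yaw   += dt;
        if (k.IsKeyDown(Keys.Right)) yaw   -= dt;
        if (k.IsKeyDown(Keys.Up))    pitch += dt;
        if (k.IsKeyDown(Keys.Down))  pitch -= dt;
        base.Update(gameTime);
    }

    protected override void Draw(GameTime gameTime)
    {
        GraphicsDevice.Clear(Color.Black);
        Vector3 forward = Vector3.Transform(Vector3.Forward,
            Matrix.CreateFromYawPitchRoll(yaw, pitch, 0));
        effect.View = Matrix.CreateLookAt(Vector3.Zero, forward, Vector3.Up);
        effect.Projection = Matrix.CreatePerspectiveFieldOfView(
            MathHelper.PiOver4, GraphicsDevice.Viewport.AspectRatio, 0.1f, 100f);
        foreach (EffectPass pass in effect.CurrentTechnique.Passes)
        {
            pass.Apply();
            GraphicsDevice.DrawUserPrimitives(PrimitiveType.TriangleList, wall, 0, 2);
        }
        base.Draw(gameTime);
    }
}
```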
Once you've done that, you need to decide where to put your windows. These will be the viewpoints into your 3D scene. To demonstrate this concept, here's a picture I took from an XNA camera tutorial.
Note that, if you read the actual tutorial, they won't say the exact same thing as me because I'm just hijacking the picture to demonstrate my meaning. So, the (0,0,0) point is where your "eye" is. The pink rectangle would represent your window. Looking at the window, four lines are drawn from the eye to the corners of the pink window. These four lines are extended forward until they collide with the background, creating the green rectangle. This would be the rectangle that your eye can see through the window.
Note that XNA will actually handle a LOT of this for you. You simply need to create a camera in your virtual scene and move it around, doing some math to aim it directly at your window. You'll want that camera to be positioned in the virtual space in a way that represents your location in the real world. You can do this by using the Kinect to get your real-world coordinates in relation to itself, then configuring your application to know where your Kinect is in relation to your windows. Combining that data, you can get the location of your eyes in relation to your monitors in the real world, and since the monitors are represented by the windows in the virtual world, you can figure out where you exist in the virtual world. So place the virtual camera where your head is in the virtual world, point it at the windows, and do some magic to make sure only the window is viewed by the camera.
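That last bit of "magic" is an off-axis (asymmetric) projection, which XNA exposes directly as Matrix.CreatePerspectiveOffCenter. Here's a sketch of the idea, assuming the virtual window lies in the z = 0 plane of the virtual world, the viewer stands at positive z, and the scene sits behind the window at negative z; WindowCamera and its parameter names are mine, not part of any API:

```csharp
using Microsoft.Xna.Framework;

// Build a view/projection pair whose frustum passes exactly through the
// window rectangle, so the camera sees only what the window reveals.
static void WindowCamera(Vector3 eye, Vector2 windowMin, Vector2 windowMax,
                         float near, float far,
                         out Matrix view, out Matrix projection)
{
    // Look straight at the window plane so the view axes stay aligned with it.
    view = Matrix.CreateLookAt(eye, new Vector3(eye.X, eye.Y, 0), Vector3.Up);

    // Scale the window edges (relative to the eye) down to the near plane;
    // assumes the eye is in front of the window, i.e. eye.Z > 0.
    float scale = near / eye.Z;
    projection = Matrix.CreatePerspectiveOffCenter(
        (windowMin.X - eye.X) * scale,   // left
        (windowMax.X - eye.X) * scale,   // right
        (windowMin.Y - eye.Y) * scale,   // bottom
        (windowMax.Y - eye.Y) * scale,   // top
        near, far);
}
```

As the tracked head moves, you rebuild both matrices every frame, and the monitor behaves like a hole in the wall rather than a camera that pans.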
Original semi-lucid rant:
Okay, I'm going to take a shot at this (it's almost 1 AM, so let me know if I did a less than brilliant job and I'll come back to it when I wake up).
First, it'll involve quite a bit of math that I'm just going to skim over. You have, essentially, three layers.
Person ---- "Windows" (Monitors) ---- Scene
The scene, of course, doesn't really exist. You have to kind of incorporate the person into a virtual world where the scene, which is really just a flat image, exists behind a wall. The only way the person can see said scene is through the windows in the wall, which in reality is faked by monitors.
So, here comes the math. The Kinect can calculate where you're standing in the room, and more importantly, where your head is. From this you can get a general sense of where your eyes are. You'll need to take this point (your eyes) and translate it into the coordinates you're using in your virtual world. Then, calculate what those eyes should be able to see through the virtual windows. You can do this by projecting lines from the eyes to each corner of a window, all the way through until it hits the "scene" picture. Each window will correspond to a rectangular area on the background picture. This rectangle is what needs to be drawn to the screen.
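That line-projection step is just similar triangles. Here's a sketch, assuming the window sits in the z = 0 plane, the eye is at positive z, and the flat scene picture hangs parallel to the window at z = -sceneDistance; ProjectThroughWindow is a hypothetical helper, not an existing API:

```csharp
using Microsoft.Xna.Framework;

// Extend the line from the eye through one window corner until it hits the
// scene plane, and return where it lands on that plane.
static Vector2 ProjectThroughWindow(Vector3 eye, Vector2 corner, float sceneDistance)
{
    // Parameter t where the ray eye -> corner crosses z = -sceneDistance.
    float t = (eye.Z + sceneDistance) / eye.Z;   // assumes eye.Z > 0
    return new Vector2(eye.X + t * (corner.X - eye.X),
                       eye.Y + t * (corner.Y - eye.Y));
}

// Two opposite window corners give the rectangle of the picture to draw:
// Vector2 sceneMin = ProjectThroughWindow(eye, windowMin, sceneDistance);
// Vector2 sceneMax = ProjectThroughWindow(eye, windowMax, sceneDistance);
```

Note the parallax falls out for free: as the eye moves right, the visible rectangle on the background slides left, exactly like a real window.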
The trickiest part is going to be setting up the virtual world to nearly perfectly mimic the real world. Essentially, a lot of configuration ("okay, this window is 1.5 meters above the Kinect.. and .25 meters to its left.."). I'm also not sure how far back you should put the scene picture. If I think of something, I'll come back to this, but you can certainly just try it out and figure out a distance that works well for your setup.
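That configuration can be as simple as a handful of hard-coded offsets, one per window. A sketch, with made-up names and values:

```csharp
using Microsoft.Xna.Framework;

// Where one window sits relative to the Kinect, in meters:
// e.g. 1.5 m above the sensor and .25 m to its left (illustrative values).
static readonly Vector3 WindowCenterFromKinect = new Vector3(-0.25f, 1.5f, 0f);

// Convert a head position reported in Kinect coordinates into coordinates
// relative to the window centre, which is what the projection math expects.
// Depending on how the sensor is mounted you may also need to flip or swap
// axes here; check your SDK's documentation for its coordinate conventions.
static Vector3 HeadToWindowSpace(Vector3 headFromKinect)
{
    return headFromKinect - WindowCenterFromKinect;
}
```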
Oh wait, now I know why I couldn't figure out the distance. It's because that example is using a 3D simulation. Pretty nifty. So you'd just need to figure out where you want to place your windows in the simulation or whatnot.
Upvotes: 3
Reputation: 51857
There are multiple techniques based on what setup you want to use (Kinect SDK, libfreenect, OpenNI, etc.) and how accurate you want this to be.
OpenNI, for example, has a function called GetCoM which returns the centre of mass for a user (it doesn't need to track a skeleton at this point), which can be used for this. It looks like OpenNI was used in the video, but with an old version. The newer version allows skeleton tracking without the 'psi' (ψ) pose.
Note that it doesn't look like the video takes the user's head direction into account. The body could point in one direction and the head in another, for example. G. Fanelli and his team have done quite a bit of research in this area. For the Kinect, check out Real Time Head Pose Estimation from Consumer Depth Cameras.
I've played a bit with the Kinect SDK and a Kinect for Windows and noticed there's a Face Tracker included.
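Even without the Face Tracker, the skeleton stream's head joint is enough for this kind of view-dependent rendering. A minimal sketch, assuming the Kinect for Windows SDK 1.x:

```csharp
using Microsoft.Kinect;

// Read the head joint's position (in meters, relative to the sensor)
// from the Kinect skeleton stream.
class HeadTracker
{
    public void Start()
    {
        KinectSensor sensor = KinectSensor.KinectSensors[0];
        sensor.SkeletonStream.Enable();
        sensor.SkeletonFrameReady += OnSkeletonFrameReady;
        sensor.Start();
    }

    void OnSkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
    {
        using (SkeletonFrame frame = e.OpenSkeletonFrame())
        {
            if (frame == null) return;
            Skeleton[] skeletons = new Skeleton[frame.SkeletonArrayLength];
            frame.CopySkeletonDataTo(skeletons);
            foreach (Skeleton s in skeletons)
            {
                if (s.TrackingState != SkeletonTrackingState.Tracked) continue;
                SkeletonPoint head = s.Joints[JointType.Head].Position;
                // head.X / head.Y / head.Z would feed the view calculation.
            }
        }
    }
}
```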
In the end, based on how loose or precise you want the tracking to be and what your ideal setup is (maximum distance covered, content used, etc.), you can figure out which SDK/library will suit you best. Also, I imagine this depends a bit on your experience with programming, in which case also look for wrappers that are easier to tackle (e.g. Unity, MaxMSP/Jitter, Processing, openFrameworks, etc.)
Upvotes: 2