Anshul

Reputation: 11

Simple video stream on the Oculus Rift DK2

I have an Oculus Rift fitted with an eye tracker, which I use to display real-time images based on the eye tracker input. I wanted to get your opinion on the way I'm handling display to the Rift (is it right?).

I have a basic image of simple text that is modified with OpenCV based on the gaze coordinates from the eye tracker. Every time the eye tracker outputs gaze coordinates (at 60 Hz) I get a new image, so it is as if I'm working with a webcam stream. I have a working program, but since I'm new to OpenGL I was hoping someone could go over the following steps to see whether I'm missing something. I'll only include the main chunks of code, with comments, below:

1) First, I create and bind a texture. matToTexture is a function that converts the cv::Mat image to an OpenGL texture (a sketch of such a helper follows the snippet).

    tex = matToTexture(result, GL_NEAREST, GL_NEAREST, GL_CLAMP);
    glBindTexture(GL_TEXTURE_2D, tex);
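
For context, matToTexture is essentially a thin wrapper around glTexImage2D. A minimal sketch of such a helper, assuming the cv::Mat holds tightly packed 8-bit BGR pixels, could look like this:

    GLuint matToTexture(const cv::Mat &mat, GLenum minFilter,
                        GLenum magFilter, GLenum wrapFilter)
    {
        GLuint textureID;
        glGenTextures(1, &textureID);
        glBindTexture(GL_TEXTURE_2D, textureID);

        //Filtering and wrapping as requested by the caller
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, minFilter);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, magFilter);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, wrapFilter);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, wrapFilter);

        //cv::Mat stores pixels row by row as BGR
        glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, mat.cols, mat.rows, 0,
                     GL_BGR, GL_UNSIGNED_BYTE, mat.ptr());

        return textureID;
    }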

2) Then I make the eye render buffers and setup VR components, get eye poses, etc.:

    for (int eye=0; eye<2; eye++)
    {
        idealSize = ovrHmd_GetFovTextureSize(hmd, (ovrEyeType)eye, hmd->DefaultEyeFov[eye], 1.0f);
        EyeRenderTexture[eye] = tex;
        //EyeRenderViewport[eye].Pos.x = 0;
        EyeRenderViewport[0].Pos.x = 0;
        EyeRenderViewport[1].Pos.x = idealSize.w/2;
        EyeRenderViewport[eye].Pos.y = 0;
        EyeRenderViewport[eye].Size = idealSize;
    }

    //Setup VR components
    ovrGLConfig oglcfg;
    oglcfg.OGL.Header.API               = ovrRenderAPI_OpenGL;
    oglcfg.OGL.Header.BackBufferSize.w  = hmd->Resolution.w;
    oglcfg.OGL.Header.BackBufferSize.h  = hmd->Resolution.h;
    oglcfg.OGL.Header.Multisample       = 1;
    oglcfg.OGL.Window                   = handle;
    oglcfg.OGL.DC                       = GetDC(handle);

    if (!ovrHmd_ConfigureRendering(hmd, &oglcfg.Config,
                                   ovrDistortionCap_Vignette |
                                   ovrDistortionCap_TimeWarp | ovrDistortionCap_Overdrive,
                                   hmd->DefaultEyeFov, EyeRenderDesc))
        return(1);

    //Getting eye pose outside the loop since our pose will remain static
    ovrVector3f useHmdToEyeViewOffset[2] = {EyeRenderDesc[0].HmdToEyeViewOffset, EyeRenderDesc[1].HmdToEyeViewOffset};
    ovrHmd_GetEyePoses(hmd, 0, useHmdToEyeViewOffset, EyeRenderPose, NULL);

    glGenTextures(1, &textureID);

    //Changing eye tracking location from 1920x1080 to 2364x1461 since that is
    //the optimal buffer size
    float x_scale = static_cast<float>(image.cols)/static_cast<float>(hmd->Resolution.w);
    float y_scale = static_cast<float>(image.rows)/static_cast<float>(hmd->Resolution.h);

    //x_adjusted and y_adjusted store the new adjusted x,y values
    float x_adjusted, y_adjusted;
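
Not shown in this excerpt: inside the loop the two scale factors are applied to the incoming gaze point before OpenCV stamps the scotoma onto the working copy. As a rough sketch (gaze_x, gaze_y and scotoma_radius are placeholders for values that come from the eye tracker and the experiment setup):

    x_adjusted = gaze_x * x_scale;
    y_adjusted = gaze_y * y_scale;

    //Draw the scotoma at the adjusted gaze position (thickness -1 = filled)
    cv::circle(result, cv::Point(cvRound(x_adjusted), cvRound(y_adjusted)),
               scotoma_radius, cv::Scalar(0, 0, 0), -1);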

3) Finally, I have the rendering while loop:

    while(1)
    {
        //Changing the texture dynamically because the result image is changing
        //with eye tracker input
        tex = matToTexture(result, GL_NEAREST, GL_NEAREST, GL_CLAMP);
        glBindTexture(GL_TEXTURE_2D, tex);

        for (int eye = 0; eye<2; eye++)
        {
            projection[eye] = ovrMatrix4f_Projection(EyeRenderDesc[eye].Fov, 1, 1000, 1);
            glMatrixMode(GL_PROJECTION);
            glLoadTransposeMatrixf(projection[eye].M[0]);
            EyeRenderTexture[eye] = tex;
            glMatrixMode(GL_MODELVIEW);
            glLoadIdentity();
            glTranslatef(EyeRenderDesc[eye].HmdToEyeViewOffset.x, -EyeRenderDesc[eye].HmdToEyeViewOffset.y, EyeRenderDesc[eye].HmdToEyeViewOffset.z);

            //Distortion Rendering
            eyeTexture[eye].OGL.Header.API              = ovrRenderAPI_OpenGL;
            //eyeTexture[eye].OGL.Header.TextureSize    = idealSize;
            eyeTexture[eye].OGL.Header.TextureSize.h    = idealSize.h;
            eyeTexture[eye].OGL.Header.TextureSize.w    = 2*idealSize.w;
            eyeTexture[eye].OGL.Header.RenderViewport.Size  = idealSize;
            eyeTexture[0].OGL.Header.RenderViewport.Pos.x = 0;
            eyeTexture[1].OGL.Header.RenderViewport.Pos.x = idealSize.w;
            eyeTexture[eye].OGL.Header.RenderViewport.Pos.y = 0;
            eyeTexture[eye].OGL.TexId                   = EyeRenderTexture[eye];
        }

        ovrHmd_EndFrame(hmd, EyeRenderPose, &eyeTexture[0].Texture);

        //restoring result back to original so that new scotoma position
        //can be added onto it
        image.copyTo(result);

        // Clear the screen to black
        glClearColor(0.0f, 0.0f, 0.0f, 1.0f);
        glClear(GL_COLOR_BUFFER_BIT);

        //Exiting loop if 'q' is pressed
        if (quit == 1) break;
    }

    glDeleteTextures(1, &tex);

So I just have one OpenGL texture that I modify each frame. I've read about framebuffers and how they can be used, but even after reading a bunch of sources I'm still confused about whether they should be used for this particular application. Is what I'm doing okay? If there is a source you can recommend for learning more about OpenGL for this 2D application, I would appreciate it.

QUESTION: Both screens currently see the exact same image. Why is that? Shouldn't both eyes see slightly different images? Did I not set up the eye textures/viewports correctly? If needed, I can upload the entire code.

When I wrote this code I was using SDK 0.5.0, but I have since upgraded to the 0.8.0 beta.

Thank you!

Upvotes: 0

Views: 937

Answers (1)

derhass

Reputation: 45332

QUESTION: Both the screens currently see the same exact image. Why is that? Shouldn't both eyes see slightly different images?

The Oculus SDK will not create the stereoscopic image pair for you. It does just two things that are of concern here:

  1. provide you with the correct projection parameters as well as the viewpoint/orientation

  2. post-process the image pair for display on the Rift (warping, correction for chromatic aberration).

While you do query the projection matrix and the viewpoint position from the SDK, you are not actually doing anything with them. You just set them as the OpenGL projection and modelview matrices without ever rendering anything with them.

The code is supposed to render into the textures (one per eye), providing a different perspective of a 3D world for each eye, and finally use ovrHmd_EndFrame to do the post-processing and to render to the actual window.
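
As a rough sketch of that per-eye rendering step, using the names from the question and assuming one render texture per eye, each backed by a hypothetical framebuffer object eyeFbo[eye] (the question's code instead packs both views side by side into a single texture):

    ovrHmd_BeginFrame(hmd, 0);

    for (int eye = 0; eye < 2; eye++)
    {
        //Render into this eye's texture instead of feeding the monoscopic
        //texture straight to the distortion pass
        glBindFramebuffer(GL_FRAMEBUFFER, eyeFbo[eye]);
        glViewport(0, 0, idealSize.w, idealSize.h);
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

        //Per-eye projection from the SDK...
        ovrMatrix4f proj = ovrMatrix4f_Projection(EyeRenderDesc[eye].Fov, 0.1f, 1000.0f, 1);
        glMatrixMode(GL_PROJECTION);
        glLoadTransposeMatrixf(&proj.M[0][0]);

        //...and a per-eye view offset, which is what makes the two images differ
        glMatrixMode(GL_MODELVIEW);
        glLoadIdentity();
        glTranslatef(EyeRenderDesc[eye].HmdToEyeViewOffset.x,
                     EyeRenderDesc[eye].HmdToEyeViewOffset.y,
                     EyeRenderDesc[eye].HmdToEyeViewOffset.z);

        drawScene();   //hypothetical: whatever 3D content is to be shown
    }

    glBindFramebuffer(GL_FRAMEBUFFER, 0);
    ovrHmd_EndFrame(hmd, EyeRenderPose, &eyeTexture[0].Texture);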

However, you just provide your monoscopic input texture, completely skip the rendering step, and directly post-process it.

You cannot automatically deduce a stereoscopic image pair from a single monoscopic image.

From your comments:

I understand that won't be real 3D, but I would like to know, for example, how a monoscopic computer game image can be modified to show slightly different views for the left vs. the right eye.

That depends on how you define "monoscopic game". Most of these games actually use a 3D data representation and render it to the screen, creating a 2D projection. In such a case, for stereoscopic output, the rendering has to be done twice, with different projection and view matrices.

Another way would be to use the monoscopic image and the depth buffer to create another view, by reprojecting the 3D points (which we get from the depth buffer) to a slightly different view configuration and filling in all the holes.
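
To make the reprojection idea concrete, here is a toy sketch using a simple pinhole-camera model (not production reprojection code); the focal length f, principal point (cx, cy) and horizontal baseline b are assumed parameters:

    struct Pixel { float u, v; };

    //Move one pixel with known depth into a view shifted horizontally by b
    Pixel reprojectToShiftedEye(float u, float v, float depth,
                                float f, float cx, float cy, float b)
    {
        //Back-project the pixel to a 3D point in camera space
        float x = (u - cx) * depth / f;
        float y = (v - cy) * depth / f;
        float z = depth;

        //Shift the viewpoint horizontally by the eye offset
        float xShifted = x - b;

        //Project back into the shifted camera; v is unchanged for a purely
        //horizontal shift
        Pixel p;
        p.u = f * xShifted / z + cx;
        p.v = f * y / z + cy;
        return p;
    }

Doing this for every pixel leaves gaps where previously occluded geometry becomes visible, which is the hole-filling problem mentioned above.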

However, none of that applies to

I have a new image so it is as if I'm working with a webcam stream.

If you only have a monoscopic webcam, you have no direct way to get the 3D information required for rendering a different perspective of the scene. There are some approaches to reconstructing such information using the temporal coherence of video streams; see the Wikipedia article on structure from motion. But that is extremely limited and not applicable to any real-time use.

Upvotes: 1
