How can I avoid inconsistent audio playback using PortAudio and OpenCV?

Question

I'm using OpenCV (for object recognition) combined with PortAudio to play sounds based on video input. Essentially, my goal is to play a sine wave tone of a certain pitch/frequency at different rates. It works, but the outcome is very unpredictable. Sometimes audio playback works (the program runs slowly, but it works); other times no audio plays at all. In a nutshell, this is the flow of my program:

Start webcam feed -> Acquire webcam image -> Choose region in image -> Return to video feed -> while(frame exists) -> Track object position -> Initialize PortAudio tools -> Play sound based on position -> Terminate PortAudio tools

I can't seem to figure out why audio playback is inconsistent. Do you all have any tips? I've been reading around, and my thinking is that this is a latency issue, but I'm really not experienced in the matter. When I use PortAudio without OpenCV, no latency issues occur, so I know it has to do with combining the two. Any help is appreciated.

while (frame)
{
    cvCopyImage(frame, drawImg);

    // process
    track(frame);

    // get result
    CvRect r;
    float  confidence;
    bool   valid;
    /* getRoi tells us if the region being tracked on the screen
     * is the same region that we chose prior to entering this while loop
     */
    getRoi(&r, &confidence, &valid); 

    // show
    cvDrawRect(drawImg, cvPoint(r.x, r.y), 
        cvPoint(r.x + r.width - 1, r.y + r.height - 1),
        valid ? cvScalar(0, 255, 0) : cvScalar(0, 255, 255),
        2
    );
    writeLogo(drawImg,"USC-IRIS");
    int xpos = r.x;
    int ypos = r.y;



    cvShowImage("Tracking", drawImg);
    cout << "valid " << valid << endl;
    cout << "conf val " << confidence << endl;
    cout << "xpos, ypos " << xpos << ", " << ypos << endl;
            //If the region on the screen is the region we chose
            //then we should play specific sounds
    if(valid){

        sI->soundWrite(xpos, ypos);
        float freq = sI->getFreq();
        int amp = sI->getAmp();
        float pulse = sI->getPulse();

        switch(amp){
            case 0:
                //printf("Hear sound in both ears.\n");
                data.targetBalance = .5;
                break;
            case 1:
                //printf("Hear sound in left ear.\n");
                data.targetBalance = 0;
                break;
            case 2:
                //printf("Hear sound in right ear.\n");
                data.targetBalance = 1;
                break;
            default:
                //printf("Incorrect value for amp (left/right sound indicator)");
                data.targetBalance = .5;
                break;
        }



        err = Pa_Initialize(); //scan for available devices i.e. audio jack, headphones
        if(err != paNoError) {
            printf("init\n");
            goto error;
        }
        //open the sound stream for processing
        err =  Pa_OpenDefaultStream( &stream, 0, 2, paFloat32, SAMPLE_RATE, 
            256, patestCallback, &data ); //open the sound stream for processing
        if( err != paNoError ) {
            printf("open\n");
            goto error;
        }

        //start the stream (i.e. play sound) if no errors
        err = Pa_StartStream(stream);
        if(err != paNoError) {
            printf("start\n");
            goto error;
        }

        //check which ear(s) the sound should be played to



        //hold that tone for a certain amount of time (pulse*200 millisec)
        Pa_Sleep(pulse*200);
        cout << "pulse: " << pulse <<  endl << "freq: " << freq << endl;
        cout << "amp: " << amp << endl;

        //stop the stream (i.e. stop playing sound)
        err = Pa_StopStream(stream);
        if(err != paNoError) {
            printf("stop\n");
            goto error;
        }

        err = Pa_CloseStream( stream );
        if( err != paNoError ) {
            printf("close\n");
            goto error;
        }

        err = Pa_Terminate();
        if( err != paNoError ) {
            printf("term\n");
            goto error;
        }
    }
    int key = cvWaitKey(1);
    // write
    if (output_txt)
        fprintf(output_txt, "%d %d %d %d\n", r.x, r.y, r.width, r.height);
    if (output_avi)
        cvWriteFrame(output_avi, drawImg);

    // next
    if (key == 'q'||key=='Q')
        break;
    frame = cvQueryFrame(capture);
}

nmante · Accepted Answer

It turns out the inconsistent audio playback was caused by a segment of code not shown in my question: the stream callback, reproduced below. I believe the error lies in the first if statement and the last for loop of this function. I think the variable framesToCalc wasn't being calculated correctly, so the first for loop wasn't placing any data into the outputBuffer/out variable. The final loop then zeroed out the remaining, unused buffer space. Hence no sound: the whole buffer was zeroed. My solution was to remove the first if/else and the last for loop, and to run the first for loop from i = 0 to framesPerBuffer. Now it works perfectly.

static int patestCallback(const void *inputBuffer, void *outputBuffer, unsigned long framesPerBuffer, const PaStreamCallbackTimeInfo *timeInfo, PaStreamCallbackFlags statusFlags, void *userData){
paTestData *data = (paTestData*)userData;
SAMPLE_t *out = (SAMPLE_t *)outputBuffer;
int i;
int framesToCalc;
int finished = 0;
(void) inputBuffer; 
int left_phase = data->left_phase;
int right_phase = data->right_phase;


if( data->framesToGo < framesPerBuffer )
{
    framesToCalc = data->framesToGo;
    data->framesToGo = 0;
    finished = 1;
}
else
{
    framesToCalc = framesPerBuffer;
    data->framesToGo -= framesPerBuffer;
}

for( i=0; i<framesToCalc; i++ )
{
    if( data->currentBalance < data->targetBalance )
    {
        data->currentBalance += BALANCE_DELTA;
    }
    else if( data->currentBalance > data->targetBalance )
    {
        data->currentBalance -= BALANCE_DELTA;
    }
    left_phase += (LEFT_FREQ / SAMPLE_RATE);
    right_phase += (RIGHT_FREQ / SAMPLE_RATE);
    if( fabs(data->currentBalance - .5)  < .001){
        //left_phase += (double)(LEFT_FREQ / SAMPLE_RATE);
        if( left_phase > 1.0) left_phase -= 1.0;

        *out++ = DOUBLE_TO_SAMPLE( AMPLITUDE * sin( (left_phase * M_PI * 2. )));

        //right_phase += (double)(RIGHT_FREQ / SAMPLE_RATE);
        if( right_phase > 1.0) right_phase -= 1.0;
        *out++ = DOUBLE_TO_SAMPLE( AMPLITUDE * sin( (right_phase * M_PI * 2. )));
    }else{
        //left_phase += (double)(LEFT_FREQ / SAMPLE_RATE);
        if( left_phase > 1.0) left_phase -= 1.0;

        *out++ = DOUBLE_TO_SAMPLE( AMPLITUDE * sin( (left_phase * M_PI * 2. ))*(1.0 - data->currentBalance));

        //right_phase += (double)(RIGHT_FREQ / SAMPLE_RATE);
        if( right_phase > 1.0) right_phase -= 1.0;
        *out++ = DOUBLE_TO_SAMPLE( AMPLITUDE * sin( (right_phase * M_PI * 2. ))*data->currentBalance);
    }

}
    // zero remainder of final buffer
    for( ; i<(int)framesPerBuffer; i++ )
    {
        *out++ = SAMPLE_ZERO; //left
        *out++ = SAMPLE_ZERO; //right
    }
    data->left_phase = left_phase;
    data->right_phase = right_phase;
    return finished;
}
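
For anyone who wants to see the fix in isolation, here is a minimal sketch of the buffer-filling logic after the change described above: no framesToGo countdown, no zero-fill loop, and the main loop runs over all framesPerBuffer frames. It is not my exact code. ToneState, fillStereoSine and the k* constants are hypothetical stand-ins for paTestData, the callback body and the SAMPLE_RATE / AMPLITUDE / BALANCE_DELTA macros, and I'm assuming float samples (paFloat32) with phases stored as doubles.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-ins for the macros used in the original program.
constexpr double kSampleRate   = 44100.0;
constexpr double kAmplitude    = 0.8;
constexpr double kBalanceDelta = 0.001;
constexpr double kTwoPi        = 6.283185307179586;

// Hypothetical replacement for paTestData. Phases are fractions of a
// cycle in [0, 1) and persist between calls so the tone stays continuous.
struct ToneState {
    double leftPhase      = 0.0;
    double rightPhase     = 0.0;
    double currentBalance = 0.5;   // 0 = left only, 1 = right only
    double targetBalance  = 0.5;
    double leftFreq       = 440.0; // Hz
    double rightFreq      = 440.0; // Hz
};

// Writes framesPerBuffer interleaved stereo frames (L, R, L, R, ...).
// Every frame of the buffer is filled on every call.
void fillStereoSine(float* out, unsigned long framesPerBuffer, ToneState& s) {
    assert(out != nullptr);
    for (unsigned long i = 0; i < framesPerBuffer; ++i) {
        // Glide the balance toward its target one small step per frame.
        if (s.currentBalance < s.targetBalance) {
            s.currentBalance += kBalanceDelta;
        } else if (s.currentBalance > s.targetBalance) {
            s.currentBalance -= kBalanceDelta;
        }

        // Advance each phase by one sample and wrap it back into [0, 1).
        s.leftPhase += s.leftFreq / kSampleRate;
        if (s.leftPhase >= 1.0) s.leftPhase -= 1.0;
        s.rightPhase += s.rightFreq / kSampleRate;
        if (s.rightPhase >= 1.0) s.rightPhase -= 1.0;

        // Centered balance plays both channels at full amplitude;
        // otherwise each channel is scaled by its share of the balance.
        double leftGain  = 1.0;
        double rightGain = 1.0;
        if (std::fabs(s.currentBalance - 0.5) >= 0.001) {
            leftGain  = 1.0 - s.currentBalance;
            rightGain = s.currentBalance;
        }

        *out++ = static_cast<float>(kAmplitude * std::sin(s.leftPhase * kTwoPi) * leftGain);
        *out++ = static_cast<float>(kAmplitude * std::sin(s.rightPhase * kTwoPi) * rightGain);
    }
}
```

In a real PortAudio callback you would presumably cast outputBuffer to float*, call something like this with the framesPerBuffer argument, and return paContinue so the stream keeps running until you stop it yourself.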
