Mengü Demir

Reputation: 21

CUDA: How do I store persistent data?

I want to store background image data on the device in CUDA. Later, as I read each new scene from a video source, I want to send it to the GPU as a foreground image and subtract it from the background image. I don't want to resend the background image to the GPU for every scene. How can I do this?

Upvotes: 2

Views: 1553

Answers (2)

phoad

Reputation: 1871

Here is a simple example..

int main(int argc, char **argv) {
    uint *hostBackground, *hostForeground; //new uint[]..
    uint *background, *foreground;

First initialize your background and foreground data..

    cudaMalloc(&background, ..); //cudaMalloc takes the address of the pointer
    cudaMalloc(&foreground, ..);

then load background data

    cudaMemcpy(background, hostBackground, ..); //copy to device..

then read the foreground data

    while (applicationRuns) {
        readImage(hostForeground); //read image..
        cudaMemcpy(foreground, hostForeground, ..); //copy to device

        //operate on foreground..
        subtract_kernel<<<blocks, threads>>>(foreground, background, width, height); //grid dimensions first, then block dimensions

        cudaMemcpy(hostForeground, foreground, ..); //copy to host

        //use hostForeground
    }

free them up

    cudaFree(foreground);
    cudaFree(background);
}

Here is a simple subtract kernel..

__global__ void subtract_kernel(uint *foreground, uint *background, int width, int height)
{
    int idx = threadIdx.x + blockDim.x * blockIdx.x;
    int idy = threadIdx.y + blockDim.y * blockIdx.y;

    if (idx < width && idy < height)
       foreground[idx + idy * width] -= background[idx + idy * width]; //clamp may be required..
}

I do suggest using a library for such simple operations; a BLAS library (e.g. cuBLAS) or Thrust would be good options.
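With Thrust, for example, the device-resident background and the per-frame subtraction could look roughly like this (just a sketch, assuming `hostBackground` and `hostForeground` are plain host buffers of `width * height` unsigned ints):

    #include <thrust/device_vector.h>
    #include <thrust/transform.h>
    #include <thrust/functional.h>
    #include <thrust/copy.h>

    //upload the background once; the device_vector keeps it resident on the GPU
    thrust::device_vector<unsigned int> background(hostBackground,
                                                   hostBackground + width * height);
    thrust::device_vector<unsigned int> foreground(width * height);

    while (applicationRuns) {
        readImage(hostForeground);
        //copy only the new frame to the device
        thrust::copy(hostForeground, hostForeground + width * height, foreground.begin());
        //foreground[i] -= background[i], element-wise, on the GPU (same clamping caveat as above)
        thrust::transform(foreground.begin(), foreground.end(),
                          background.begin(), foreground.begin(),
                          thrust::minus<unsigned int>());
        //copy the result back to the host
        thrust::copy(foreground.begin(), foreground.end(), hostForeground);
    }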

Upvotes: 1

harrism

Reputation: 27899

Store the background image in a device memory array (i.e. on the GPU). Then when you read the foreground image use cudaMemcpy to copy it to another device memory array. Then launch a kernel that takes the two device memory arrays as arguments and performs the image subtraction. Should be simple.

Assuming you use default context creation and this is all running in the same CPU thread, you don't have to worry about doing anything specific to keep your CUDA context "intact", as Bart commented. However, if you do any CPU multithreading, you will need to do some context management.
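A minimal sketch of that flow (assuming `uint` images of `width * height` pixels, host buffers `hostBackground`/`hostForeground`, launch dimensions `blocks`/`threads`, and a subtraction kernel like the `subtract_kernel` in the other answer):

    uint *d_background, *d_foreground;
    size_t bytes = width * height * sizeof(uint);

    cudaMalloc(&d_background, bytes);
    cudaMalloc(&d_foreground, bytes);

    //the background is uploaded once and stays resident in device memory
    cudaMemcpy(d_background, hostBackground, bytes, cudaMemcpyHostToDevice);

    while (applicationRuns) {
        //only the new frame is transferred each iteration
        cudaMemcpy(d_foreground, hostForeground, bytes, cudaMemcpyHostToDevice);
        subtract_kernel<<<blocks, threads>>>(d_foreground, d_background, width, height);
        cudaMemcpy(hostForeground, d_foreground, bytes, cudaMemcpyDeviceToHost);
    }

    cudaFree(d_foreground);
    cudaFree(d_background);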

Upvotes: 3
