Multiple GPUs in CUDA - Working code before, but not any more

Question

I've recently run into trouble using multiple NVIDIA GPUs in a CUDA application. The attached code reproduces the problem consistently on my system in both Visual Studio 2013 and 2015 (Windows 7, CUDA 9.2, NVIDIA driver 398.26, 1x GTX 1080 and 1x GTX 960). I am building for the correct compute capabilities for my cards (5.2 and 6.1).

Specifically, after the first GPU has been initialized, I am unable to get any function calls on the second GPU to work. The error code is consistently "cudaErrorMemoryAllocation". It fails under the NVIDIA profiler and in both debug and release builds. I can initialize the GPUs in either order and reproduce the problem.

This problem came up when trying to scale my current application, which is a large pipeline of image processing algorithms. There can be several independent instances of this pipeline, and due to memory limitations, multiple cards will be required. The main reason I'm so confused by this issue is that I've had it working before: I have a Visual Profiler session from a couple of years ago that shows these same cards behaving as expected. The only difference I'm aware of is that it was on CUDA 8.0.

Any ideas?

#include "cuda_runtime.h"
#include "cuda.h"

#include <conio.h>   // _getch (Windows-only)
#include <iostream>  // std::cout
#include <thread>    // std::thread

// Function for each thread to run
void gpuThread(int gpuIdx, bool* result)
{
    cudaSetDevice(gpuIdx); // Set gpu index

    // Create an int array on CPU
    int* hostMemory = new int[1000000];
    for (int i = 0; i < 1000000; i++)
        hostMemory[i] = i;

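    // Allocate and copy to GPU
    int* gpuMemory = nullptr;
    cudaMalloc(&gpuMemory, 1000000 * sizeof(int));
    cudaMemcpy(gpuMemory, hostMemory, 1000000 * sizeof(int), cudaMemcpyHostToDevice);

    // Synchronize, then read back the last error recorded by the runtime
    cudaDeviceSynchronize();
    cudaError_t error = cudaGetLastError();

    // Release both buffers on every path (cudaFree(nullptr) is a no-op)
    cudaFree(gpuMemory);
    delete[] hostMemory;

    // cudaSuccess is the runtime API constant; CUDA_SUCCESS belongs to the driver API
    result[0] = (error == cudaSuccess);
}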

int main()
{
    bool result1 = false;
    bool result2 = false;

    std::thread t1(gpuThread, 0, &result1);
    std::thread t2(gpuThread, 1, &result2);

    t1.join();  // Wait for both threads to complete
    t2.join();

    if (!result1 || !result2) // Verify our threads returned success
        std::cout << "Failed\n";
    else
        std::cout << "Passed\n";

    std::cout << "Press a key to exit!\n";
    _getch();

    return 0;
}
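
The listing above only reads cudaGetLastError once at the end, so it cannot show which call was the first to fail on the second GPU. Below is a minimal sketch of per-call checking, assuming the same runtime API calls as the listing. The names cudaOk and uploadToDevice are hypothetical helpers introduced here for illustration, not part of the original code, and this is a diagnostic aid rather than a fix.

```cuda
#include <cuda_runtime.h>

#include <cstdio>

// Hypothetical helper (not in the original listing): prints the name and
// description of a failed runtime call and returns whether it succeeded.
inline bool cudaOk(cudaError_t status, const char* what)
{
    if (status == cudaSuccess)
        return true;
    std::fprintf(stderr, "%s failed: %s (%s)\n", what,
                 cudaGetErrorName(status), cudaGetErrorString(status));
    return false;
}

// Hypothetical variant of gpuThread's body: same steps, but it stops at the
// first call that fails so the log names the exact point of failure.
bool uploadToDevice(int gpuIdx, const int* hostMemory, size_t count)
{
    if (!cudaOk(cudaSetDevice(gpuIdx), "cudaSetDevice"))
        return false;

    int* gpuMemory = nullptr;
    if (!cudaOk(cudaMalloc(&gpuMemory, count * sizeof(int)), "cudaMalloc"))
        return false;

    bool ok = cudaOk(cudaMemcpy(gpuMemory, hostMemory, count * sizeof(int),
                                cudaMemcpyHostToDevice), "cudaMemcpy")
           && cudaOk(cudaDeviceSynchronize(), "cudaDeviceSynchronize");

    cudaFree(gpuMemory); // release the device buffer on both paths
    return ok;
}
```

Calling uploadToDevice(0, ...) and uploadToDevice(1, ...) from the two threads in place of the unchecked calls would report, for example, whether cudaSetDevice or cudaMalloc is the first call to return an error.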

Brian M · Accepted Answer

After a day of uninstalling and reinstalling programs, it appears that this is an issue with the 398.26 driver. The newer version, 399.07, works as expected.
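
When comparing a working and a failing setup like this, it can help to log what the runtime itself reports. The sketch below is one way to do that, under the assumption that the standard runtime queries are available. formatCudaVersion and printCudaEnvironment are hypothetical names introduced for illustration. Note that cudaDriverGetVersion returns the latest CUDA version the installed driver supports (for example 9020 for CUDA 9.2), not the display driver number such as 398.26 or 399.07, which has to be read from a tool such as nvidia-smi.

```cuda
#include <cuda_runtime.h>

#include <cstdio>
#include <string>

// Decodes the integer the runtime API uses for versions
// (1000 * major + 10 * minor), e.g. 9020 becomes "9.2".
std::string formatCudaVersion(int version)
{
    return std::to_string(version / 1000) + "." +
           std::to_string((version % 1000) / 10);
}

// Hypothetical diagnostic: prints the CUDA versions reported by the driver
// and the runtime, then one line per visible device.
// Returns false if any of the queries fails.
bool printCudaEnvironment()
{
    int driverVersion = 0, runtimeVersion = 0, deviceCount = 0;
    if (cudaDriverGetVersion(&driverVersion) != cudaSuccess ||
        cudaRuntimeGetVersion(&runtimeVersion) != cudaSuccess ||
        cudaGetDeviceCount(&deviceCount) != cudaSuccess)
        return false;

    std::printf("Driver supports CUDA %s, runtime is CUDA %s, %d device(s)\n",
                formatCudaVersion(driverVersion).c_str(),
                formatCudaVersion(runtimeVersion).c_str(), deviceCount);

    for (int i = 0; i < deviceCount; ++i)
    {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, i) != cudaSuccess)
            return false;
        std::printf("  [%d] %s, compute capability %d.%d, %zu MiB\n", i,
                    prop.name, prop.major, prop.minor,
                    prop.totalGlobalMem / (1024 * 1024));
    }
    return true;
}
```

Running printCudaEnvironment() once at the start of main, before the threads are launched, gives a record of the environment that can be kept alongside the driver number when a regression like this one appears.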
