Adam S.
Adam S.

Reputation: 1271

Thrust issue with CUDA 6 managed memory

I am running into an issue where when trying to use cudaMallocManaged() and thrust in the same CUDA 6 application thrust fails, even when thrust is not using any of the managed memory. Simply having an unused managed variable is enough to cause thrust to fail. I have created the following reproducer which I am testing on an NVIDIA Jetson TK1 running CUDA 6.0:

#include "thrust/device_ptr.h"
#include "thrust/sort.h"

__global__ void calculate_hash(uint *hash_values, uint *particle_ids, int length)
{
    int i = blockIdx.x*blockDim.x + threadIdx.x;

    if(i >= length)
        return;

    hash_values[i] =  1;
    particle_ids[i] = i;
}

void hash_particles_gpu(uint *d_hash_values, uint *d_particle_ids, int length)
{
    int block_size = 256;
    int num_blocks = ceil(length/(float)block_size);

    calculate_hash<<<num_blocks, block_size>>>(d_hash_values, d_particle_ids, length);  

    cudaDeviceSynchronize();

    thrust::device_ptr<uint> keys(d_hash_values);
    thrust::device_ptr<uint> values(d_particle_ids);
    thrust::sort_by_key(keys, keys+length, values);
}

int main(int argc, char *argv[])
{
    int length = 15;
    int bytes;

    #ifdef BROKE
    int *m_int;
    cudaMallocManaged((void**)&m_int, sizeof(int));
    #endif

    // Allocate uint hash value array
    bytes = length*sizeof(unsigned int);
    unsigned int * hash_values;
    cudaMalloc((void**)&hash_values, bytes);    

    // Allocate uint particle ID array
    bytes = length*sizeof(unsigned int);
    unsigned int *particle_ids;
    cudaMalloc((void**)&particle_ids, bytes);

    hash_particles_gpu(hash_values, particle_ids, length);
}

When I compile and run:

$ nvcc -DBROKE -DTHRUST_DEBUG example.cu -o broke.exe
$ nvcc -DTHRUST_DEBUG example.cu -o fixed.exe
$ ./fixed.exe
$ ./broke.exe
terminate called after throwing an instance of 'thrust::system::system_error'
  what():  synchronize: RakingReduction: unknown error
Abort

I have checked to make sure I do not have any errors up until this point and everything seems fine until I call sort_by_key. Any idea what is going on?

Upvotes: 1

Views: 516

Answers (1)

Adam S.
Adam S.

Reputation: 1271

Thanks for the comments. I flashed the latest Linux for Tegra, 19.3, and it works now with Cuda 6.0. Looks like NVIDIA had a driver issue with L4T 19.2.

Upvotes: 1

Related Questions