JMan Mousey

Reputation: 371

How to sort gpu data into separate lists owned by cpu objects with thrust and cuda?

I am new to Thrust and am hoping for some guidance on a parallel sorting scenario. I have one very large GPU list (1 million+ elements), and I want to distribute its elements into a set of CPU-side containers, each of which owns a device_vector. In other words, the single GPU list should end up split across the device_vectors owned by the CPU containers.

class GpuObject
{
public:
    int someData;
    int otherValue;
};

class CpuContainer
{
public:
    thrust::device_vector<GpuObject>* SortedGpuList;
};

std::vector<CpuContainer*> Containers;

for( int i = 0; i<100; i++ )
{
      Containers.push_back(new CpuContainer());
}

thrust::device_vector<GpuObject>* completeGpuList;

__device__ __host__
void sortIntoContainers( .... )
{
    // ... possible to sort completeGpuList into Containers[i]->SortedGpuList based on GpuObject.someData ?
}

My first stab was to create a device_vector of ints, equal in size to completeGpuList, where each entry says which container the corresponding GpuObject maps to. I fill it using thrust::transform with a functor whose operator() returns the container ID for a GpuObject. I then call sort_by_key on completeGpuList, using this container ID list as the keys. But how can I efficiently copy the entries into the containers after the sort without looping through the lists?
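For reference, the approach described above might look roughly like the sketch below. The ContainerIdOf functor, its modulo rule, and the groupByContainer helper are placeholders made up for illustration, since the real mapping from someData to a container is not shown here.

```cuda
#include <thrust/device_vector.h>
#include <thrust/transform.h>
#include <thrust/sort.h>

struct GpuObject
{
    int someData;
    int otherValue;
};

// Hypothetical mapping rule (an assumption): container ID = someData modulo
// the number of containers, shifted into the range [0, numContainers).
struct ContainerIdOf
{
    int numContainers;

    __host__ __device__
    int operator()(const GpuObject& obj) const
    {
        int id = obj.someData % numContainers;
        return id < 0 ? id + numContainers : id;
    }
};

// Hypothetical helper: reorders `objects` in place so that objects with the
// same container ID are adjacent, and returns the sorted container IDs.
thrust::device_vector<int> groupByContainer(thrust::device_vector<GpuObject>& objects,
                                            int numContainers)
{
    // One container ID per object.
    thrust::device_vector<int> containerIds(objects.size());
    thrust::transform(objects.begin(), objects.end(),
                      containerIds.begin(),
                      ContainerIdOf{numContainers});

    // Sort the IDs and permute the objects along with them.
    thrust::sort_by_key(containerIds.begin(), containerIds.end(), objects.begin());
    return containerIds;
}
```

After this call, all objects that belong to the same container sit in one contiguous range of the list, which is the starting point for the copy step asked about here.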

Upvotes: 1

Views: 147

Answers (1)

Tom

Reputation: 3336

How about storing all the vectors in one larger matrix? The values to sort go into the matrix, and the other fields stay in the objects. For example, use a single float* matrix that holds 1M vectors of 50 elements each. Vector i then starts at offset 50 * i, i.e. at (matrix + 50 * i). This is a common way of managing many vectors.
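As a rough sketch of that layout, assuming the vectors are stored back to back in one device_vector (the ColumnMatrix name and its members are made up for illustration and are not part of Thrust):

```cuda
#include <cstddef>
#include <thrust/device_vector.h>

// Hypothetical wrapper: numCols vectors of numRows floats each,
// stored one after another in a single flat device allocation.
struct ColumnMatrix
{
    int numRows;                        // elements per vector, e.g. 50
    int numCols;                        // number of vectors, e.g. 1M
    thrust::device_vector<float> data;  // numRows * numCols values

    ColumnMatrix(int rows, int cols)
        : numRows(rows),
          numCols(cols),
          data(static_cast<std::size_t>(rows) * static_cast<std::size_t>(cols))
    {
    }

    // Raw device pointer to the first element of vector i (matrix + numRows * i).
    float* column(int i)
    {
        return thrust::raw_pointer_cast(data.data())
               + static_cast<std::size_t>(i) * static_cast<std::size_t>(numRows);
    }
};
```

With numRows = 50, column(i) is the same address as (matrix + 50 * i) in the description above.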

You can then sort the elements with 'thrust::sort_by_key'. Before each sort, reset the 'keys' matrix to [0, 1, ..., 49, 0, 1, ..., 49, ..., 0, 1, ..., 49] with a simple kernel. A kernel such as 'sort_columns_withIndices' below can then sort each vector. After sorting, the keys hold the original indices of the sorted elements, so they can be used to look up the matching objects.
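The key-reset step could look something like the following. This is only a sketch: the names reset_keys and resetKeys and the block size of 256 are assumptions, and it expects the keys matrix to use the same flat layout as the values.

```cuda
#include <cuda_runtime.h>

// Writes 0, 1, ..., numRows - 1 into each of the numCols consecutive
// vectors of the flat keys matrix.
__global__ void reset_keys(int* keys, int numRows, int numCols)
{
    // 64-bit index so that numRows * numCols may exceed the int range.
    long long idx = static_cast<long long>(blockDim.x) * blockIdx.x + threadIdx.x;
    long long total = static_cast<long long>(numRows) * numCols;
    if (idx < total)
    {
        keys[idx] = static_cast<int>(idx % numRows);
    }
}

// Hypothetical host-side helper that launches reset_keys on device memory.
void resetKeys(int* d_keys, int numRows, int numCols)
{
    const long long total = static_cast<long long>(numRows) * numCols;
    if (total <= 0)
    {
        return;  // nothing to do, and a zero-block launch would be invalid
    }
    const int threads = 256;
    const unsigned int blocks = static_cast<unsigned int>((total + threads - 1) / threads);
    reset_keys<<<blocks, threads>>>(d_keys, numRows, numCols);
}
```

The launch is asynchronous, so later work on the same stream (such as the sort) runs after it, while host code that reads the keys directly should synchronize first.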

#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <thrust/reduce.h>
#include <thrust/execution_policy.h>
#include <thrust/functional.h>



extern "C"
__global__ void sort_columns_withIndices(float* values, int* keys, int numRows, int numCols, int descending)
{
    // One thread per column; column i occupies [i * numRows, (i + 1) * numRows).
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < numCols)
    {
        // 'values' is what gets compared; 'keys' is permuted alongside it.
        if (descending > 0) {
            thrust::sort_by_key(thrust::device, values + i * numRows, values + (i + 1) * numRows, keys + i * numRows, thrust::greater<float>());
        } else {
            thrust::sort_by_key(thrust::device, values + i * numRows, values + (i + 1) * numRows, keys + i * numRows, thrust::less<float>());
        }
    }
}

Upvotes: 2
