Reputation: 6447
I'm trying to program something simple in order to learn NVIDIA CUDA/Thrust. I'm a total noob. What I'm trying to do is use find_if with a custom predicate. For the moment my predicate just returns true for everything, so I expect to get all of the input back. Eventually I want to search for strings: I would initialize the functor with some string X and then let the GPU find all strings that match.
I'm confused on several points here.
I try to fill up a device_vector full of pointers to my strings and then run it against my MemCmp predicate.
First off, does the device_vector "know" to copy my string over from main memory over to GPU memory or does it just copy a pointer value?
Secondly, at the line "count = d_inputVector.end() - iter;" it returns 12, the number of items between the iterator returned by find_if and the end of the vector. Isn't this wrong? If I try iter - d_inputVector.begin() it returns zero, which doesn't get me anywhere.
Finally, is my method of getting at the results of my little program correct? Am I to copy memory using thrust::copy into a host_vector and would a loop like the one at the end suffice to view the results?
Any suggestions are greatly appreciated. Thanks,
mj
struct MemCmp
{
    __host__ __device__
    bool operator()(char *data)
    {
        bool rv = false;
        rv = true;
        return rv;
    }
};
....
// I initialize a device_vector and then copy pointers from main memory into the device_vector.
thrust::device_vector<char*> d_inputVector( itemCount );
for( int i=0; i<itemCount; i++ ){
    d_inputVector[i] = inputData[i];
}
thrust::device_vector<char*>::iterator iter;
iter = thrust::find_if( d_inputVector.begin(), d_inputVector.end(), MemCmp() );
// this is the count that I think is wrong.
count = d_inputVector.end() - iter;
thrust::host_vector<char*> results( count );
thrust::copy( d_inputVector.begin(), iter, results.begin() );
for( thrust::host_vector<char *>::iterator it = results.begin(); it != results.end(); it++ ){
    char* foo = *it;
}
Upvotes: 0
Views: 1268
Reputation: 15734
find_if is not a good function for finding all strings that match. It only finds the first element that matches. Take a look at copy_if.
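As a rough illustration of that suggestion, here is a minimal sketch of thrust::copy_if collecting every match rather than only the first. It uses int elements to keep the sketch short; EqualsTarget and find_all are hypothetical names made up for this example, not part of Thrust.

```cuda
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>

// Hypothetical predicate: true for elements equal to a target chosen on the host.
struct EqualsTarget
{
    int target;

    __host__ __device__
    explicit EqualsTarget(int t) : target(t) {}

    __host__ __device__
    bool operator()(int x) const { return x == target; }
};

// Hypothetical helper: returns every element of `input` equal to `target`.
thrust::host_vector<int> find_all(const thrust::host_vector<int>& input, int target)
{
    thrust::device_vector<int> d_in = input;        // one host-to-device copy
    thrust::device_vector<int> d_out(d_in.size());  // worst case: everything matches

    // copy_if returns an iterator one past the last element it wrote.
    thrust::device_vector<int>::iterator out_end =
        thrust::copy_if(d_in.begin(), d_in.end(), d_out.begin(), EqualsTarget(target));

    d_out.resize(out_end - d_out.begin());          // keep only the matches
    return thrust::host_vector<int>(d_out);         // one device-to-host copy
}
```

The output vector is sized for the worst case and trimmed afterwards, because the number of matches is not known until copy_if has run.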
First off, does the device_vector "know" to copy my string from main memory over to GPU memory or does it just copy a pointer value?
It copies only the pointer values. Those are host addresses, so you will end up with pointers that have no meaning on the GPU.
In C++, to run on the CPU, you would use std::string to store your strings. So it would be a std::vector<std::string>. Matters are complicated by the fact that there is no device implementation of string, so you can't copy those to the GPU.
In addition, many of the STL algorithms (I'm guessing it's the same with Thrust) require that the objects that are elements in the vector have a working copy constructor and assignment operator. The compiler supplies those for the basic types, but not for an array of char.
So, your simple exercise to learn CUDA/Thrust may not turn out to be that simple. I think you would need a C++ class that encapsulates a fixed-size array of chars and implements device functions for the necessary operators.
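One possible shape for such a class is sketched below. Everything in it is an assumption for illustration: the names FixedString, MatchesTarget and count_matches are invented here, and the 32-byte capacity is arbitrary. Longer strings are silently truncated in this sketch.

```cuda
#include <thrust/count.h>
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>

// Hypothetical fixed-size string. It holds its characters inline, so copying
// the object copies the text itself instead of a host pointer.
struct FixedString
{
    enum { kCapacity = 32 };   // assumed maximum length, including the terminator
    char data[kCapacity];

    __host__ __device__
    FixedString() { data[0] = '\0'; }

    // Copies at most kCapacity - 1 characters and always null-terminates.
    __host__ __device__
    explicit FixedString(const char* s)
    {
        int i = 0;
        for (; i < kCapacity - 1 && s[i] != '\0'; ++i)
            data[i] = s[i];
        data[i] = '\0';
    }

    // Compares character by character up to the terminator.
    __host__ __device__
    bool operator==(const FixedString& other) const
    {
        for (int i = 0; i < kCapacity; ++i) {
            if (data[i] != other.data[i]) return false;
            if (data[i] == '\0') return true;   // both strings ended here
        }
        return true;
    }
};

// Hypothetical predicate initialized on the host with the string to look for.
struct MatchesTarget
{
    FixedString target;

    __host__ __device__
    explicit MatchesTarget(const FixedString& t) : target(t) {}

    __host__ __device__
    bool operator()(const FixedString& s) const { return s == target; }
};

// Hypothetical helper: counts on the device how many elements equal `wanted`.
long count_matches(const thrust::host_vector<FixedString>& input, const char* wanted)
{
    thrust::device_vector<FixedString> d_in = input;   // copies the characters, not pointers
    return static_cast<long>(
        thrust::count_if(d_in.begin(), d_in.end(), MatchesTarget(FixedString(wanted))));
}
```

The member functions are marked __host__ __device__ so the same type can be filled in on the CPU and compared inside the GPU algorithm.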
Also, moving a vector with many items over to device memory in that way is very inefficient, because each assignment you do to the device_vector causes a separate copy from host to device memory to be executed in the background. Instead, populate a host_vector and then assign the host_vector to the device_vector. Then, only a single copy from host to device memory is executed.
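A minimal sketch of that pattern, with a made-up helper name (upload_squares) and int data chosen only for illustration:

```cuda
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>

// Hypothetical helper: builds the data on the host first, then moves it to
// the device in one transfer instead of one transfer per element.
thrust::device_vector<int> upload_squares(int n)
{
    thrust::host_vector<int> h(n);
    for (int i = 0; i < n; ++i)
        h[i] = i * i;                   // plain host memory writes

    thrust::device_vector<int> d = h;   // single bulk host-to-device copy
    return d;
}
```

Writing d[i] = value from host code inside the loop instead would compile, but each such assignment goes through its own small transfer.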
Secondly, at the line "count = d_inputVector.end() - iter;" it returns 12, the number of items between the iterator returned by find_if and the end of the vector. Isn't this wrong? If I try iter - d_inputVector.begin() it returns zero, which doesn't get me anywhere.
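The expression should be iter - d_inputVector.begin(), and it should return 0 because the first element in the vector matches the predicate. That value is the index of the first match, not a count of matches. Your expression d_inputVector.end() - iter counts the elements from the first match to the end of the vector, which here is the whole vector.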
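To make the iterator arithmetic concrete, here is a small sketch with int data. IsNegative and first_negative_index are hypothetical names used only for this example.

```cuda
#include <thrust/device_vector.h>
#include <thrust/find.h>

// Hypothetical predicate for the example.
struct IsNegative
{
    __host__ __device__
    bool operator()(int x) const { return x < 0; }
};

// Hypothetical helper: index of the first negative element,
// or d.size() when no element matches.
int first_negative_index(const thrust::device_vector<int>& d)
{
    thrust::device_vector<int>::const_iterator it =
        thrust::find_if(d.begin(), d.end(), IsNegative());

    // Distance from the start of the vector to the first match.
    return static_cast<int>(it - d.begin());
}
```

When nothing matches, find_if returns the end iterator, so the helper returns the vector's size; callers should check for that value before using the index.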
Finally, is my method of getting at the results of my little program correct? Am I to copy memory using thrust::copy into a host_vector and would a loop like the one at the end suffice to view the results?
After you have created a device_vector with your results, simply assign it to a host_vector to move it to host memory in a single operation.
thrust::host_vector<char*> H = D;
Upvotes: 2