Reputation: 2104
I'm a total newcomer to OpenCL trying to find out the pros and cons of OpenCL and hashes.
Say, for example, I have a trivial hash function:
    public static uint GetHash(string str)
    {
        uint s = 21; // seed
        foreach (char ch in str)
            s = (s + (uint)ch) * 10;
        return s;
    }
(I know this is a horrible hash, but it's just an example)
Now let's say I wish to calculate the hash of all permutations of the characters a-zA-Z0-9_ out to a length of 50, so for example:
a
b
...
_
aa
ab
...
__
Obviously this is a huge number (63^50) of hashes I need to calculate, so I decide to use OpenCL and GPU computing.
My question is: are there any pitfalls that OpenCL/GPU computing brings? I've read the following:
This makes me question the effectiveness of GPU computing in this case, because it seems to me that I'll need to use one of these approaches:
Are those conclusions accurate? If not, why, and is there anything else to watch out for?
Upvotes: 1
Views: 845
Reputation: 23248
Slow is a relative term. But generally, you want to avoid transferring huge amounts of data to and from the GPU, or to put it another way, you have to make the cost of the data transfer 'worth it' by doing a decent amount of calculations on the GPU before you transfer the result back off.
So, looking at your problem as you have currently stated it (as I understand it), you want to:
This will run poorly, because the calculation of the hash is computationally fairly trivial and the majority of time will be spent performing data transfer.
You definitely want to generate the string permutations on the GPU, which avoids the cost of (2). Splitting the work into work items should not be too hard: take a base string, e.g. 'aaaa', let each work item derive its own suffix characters from its global ID (say, 4 suffix characters; you could use one NDRange dimension per character, but OpenCL only guarantees 3 dimensions, so beyond that you would decode a flattened index), calculate the hash in that work item, and write it to the output. Depending on the hash function, you could also make huge savings if the hash of the prefix 'aaaa' can be precalculated once and reused.
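To make the work-item split concrete, here is a minimal sketch in plain C of what a single work item might do. It is not a real kernel: `hash_step`, `hash_from`, `hash_candidate` and `MAX_SUFFIX_LEN` are names I made up for illustration, and `gid` stands in for what `get_global_id(0)` would return in OpenCL. It assumes the example hash from the question, whose state after the prefix can simply be carried forward.

```c
#include <assert.h>
#include <stdint.h>

#define ALPHABET "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_"
#define ALPHABET_SIZE 63u
#define MAX_SUFFIX_LEN 10u /* hypothetical limit; 63^10 still fits in 64 bits */

/* One step of the example hash: s = (s + ch) * 10, wrapping at 32 bits. */
static uint32_t hash_step(uint32_t s, char ch) {
    return (s + (uint32_t)(unsigned char)ch) * 10u;
}

/* Continue hashing a NUL-terminated string from an existing state. */
static uint32_t hash_from(uint32_t state, const char *str) {
    for (; *str; ++str) state = hash_step(state, *str);
    return state;
}

/* What one work item would do: decode its id into suffix_len base-63
   digits (most significant first), then continue from the prefix state
   that was computed once up front. suffix_out needs suffix_len + 1 bytes. */
static uint32_t hash_candidate(uint32_t prefix_state, uint64_t gid,
                               unsigned suffix_len, char *suffix_out) {
    assert(suffix_len <= MAX_SUFFIX_LEN);
    for (unsigned i = suffix_len; i-- > 0;) {
        suffix_out[i] = ALPHABET[gid % ALPHABET_SIZE];
        gid /= ALPHABET_SIZE;
    }
    suffix_out[suffix_len] = '\0';
    return hash_from(prefix_state, suffix_out);
}
```

You would compute `hash_from(21u, "aaaa")` once on the host, pass it in as a kernel argument, and every work item then only hashes its own 4 suffix characters. Whether a prefix state can be reused like this depends entirely on the hash function; it happens to work for this one because the state is a single running value.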
But I suspect this approach would still bottleneck on transferring the generated hashes back to the host. If there is something you need to do with the hashes afterwards, such as checking for equality against a known hash, you can do that on the GPU as well. That avoids all those costly data transfers, because all you need to write to global memory and read back is the single matching string/hash (or maybe a few matches), not 63^50 results.
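As a rough illustration of the "compare on the device, only write matches" idea, here is a sketch in plain C that simulates it with a loop in place of the NDRange. `MatchBuffer`, `check_candidate`, `search` and the two limits are hypothetical names of mine, not part of any OpenCL API. In a real kernel the counter increment would have to be atomic (e.g. `atomic_inc` on a `__global` counter), which I only note in a comment here.

```c
#include <assert.h>
#include <stdint.h>

#define ALPHABET "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_"
#define ALPHABET_SIZE 63u
#define MAX_LEN 10u     /* hypothetical cap so ids fit in 64 bits */
#define MAX_MATCHES 8u  /* hypothetical size of the small result buffer */

/* The only data that would travel back to the host. */
typedef struct {
    uint32_t count;            /* matches found (may exceed MAX_MATCHES) */
    uint64_t ids[MAX_MATCHES]; /* candidate indices of the first few matches */
} MatchBuffer;

/* Body of a hypothetical kernel: one call per work item. */
static void check_candidate(uint64_t gid, unsigned len, uint32_t target,
                            MatchBuffer *out) {
    char buf[MAX_LEN];
    uint64_t rest = gid;
    assert(len <= MAX_LEN);
    /* Decode the id into base-63 digits, most significant first. */
    for (unsigned i = len; i-- > 0;) {
        buf[i] = ALPHABET[rest % ALPHABET_SIZE];
        rest /= ALPHABET_SIZE;
    }
    /* Example hash from the question, seed 21. */
    uint32_t s = 21u;
    for (unsigned i = 0; i < len; ++i)
        s = (s + (uint32_t)(unsigned char)buf[i]) * 10u;
    if (s == target) {
        uint32_t slot = out->count++; /* on a GPU: an atomic increment */
        if (slot < MAX_MATCHES) out->ids[slot] = gid;
    }
}

/* Stand-in for enqueueing 63^len work items and waiting for them. */
static void search(unsigned len, uint32_t target, MatchBuffer *out) {
    uint64_t total = 1;
    for (unsigned i = 0; i < len; ++i) total *= ALPHABET_SIZE;
    out->count = 0;
    for (uint64_t gid = 0; gid < total; ++gid)
        check_candidate(gid, len, target, out);
}
```

The host would then read back just the `MatchBuffer` (a few dozen bytes) and turn each stored id back into a string. Because the example hash collides easily, more than one candidate can match the same target, which is why the buffer has room for a few ids rather than one.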
Upvotes: 1