Reputation: 21
I am implementing the Boyer–Moore algorithm in C with OpenMP, using the gcc compiler. My serial code works fine, but after parallelizing it with #pragma omp parallel for the output is no longer correct: I get different keyword counts on different runs, and the keyword offsets are wrong as well.
Are there any special rules for the #pragma omp parallel for directive?
This is the code of the for loop:
#pragma omp parallel
{
#pragma omp for
for(k=0;k<=s.st_size;k+=chunksize-plen)
{
fseek(fp,k,SEEK_SET);
fread(data, chunksize, sizeof(unsigned char), fp);
data[chunksize]='\0';
boyermoore(data,pattern,k,&c);
}
}
Upvotes: 0
Views: 880
Reputation: 12784
First, you need to minimize the amount of data that is shared within the parallel region. For this code, I think you need to make data private, and if it's a memory buffer, make sure it's separate for each thread:
#pragma omp parallel private(data)
{
    data = malloc( chunksize ); /* or whatever size it should be */
    #pragma omp for
    for( /* ... */ ) { /* ... the loop from the question ... */ }
    free(data);
}
Alternatively, you might allocate the buffer for all threads at once before the parallel region, and use omp_get_thread_num to find out which part of the buffer should be used:
buffer = malloc ( chunksize * omp_get_max_threads() );
#pragma omp parallel private(data)
{
data = (char*)buffer + omp_get_thread_num()*chunksize;
/* ... */
}
Maybe pattern and c should be private as well; it's hard to say without seeing the code of the boyermoore function.
Second, you need to synchronize access to any shared variables (unless those are only for reading). In particular, any operations with a file should be synchronized, e.g. with the help of #pragma omp critical:
#pragma omp critical
{
fseek(fp,k,SEEK_SET);
fread(data, chunksize, sizeof(unsigned char), fp);
}
Third, you need to make sure that the boyermoore function and any functions it calls are thread-safe and can be executed concurrently without data races. Typically, any access to shared state (e.g. global variables) needs to be synchronized. Also, if any pointers are passed to the function (e.g. &c seems suspicious), you need to make sure that either they point to different memory in each thread or the modifications of that memory are synchronized as well.
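Putting the three points together, here is a rough sketch of how the loop could look. It is not the asker's code: count_matches is a hypothetical naive stand-in for boyermoore that returns its count instead of writing through &c, parallel_count and the critical-section name file_io are made up for illustration, and the chunks overlap by plen-1 bytes (rather than the question's chunksize-plen step) so that, under this counting scheme, a match lying across a chunk boundary is counted exactly once.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for boyermoore(): naive scan that RETURNS the number
   of matches in data[0..len) instead of updating a shared counter. */
static long count_matches(const unsigned char *data, size_t len,
                          const char *pattern, size_t plen)
{
    long n = 0;
    if (plen == 0 || len < plen)
        return 0;
    for (size_t i = 0; i + plen <= len; i++)
        if (memcmp(data + i, pattern, plen) == 0)
            n++;
    return n;
}

/* Counts occurrences of pattern in the first fsize bytes of fp.
   Returns -1 if chunksize is too small for the pattern. */
long parallel_count(FILE *fp, long fsize, long chunksize, const char *pattern)
{
    long plen = (long)strlen(pattern);
    long step = chunksize - (plen - 1);   /* consecutive chunks overlap by plen-1 bytes */
    long total = 0;

    if (plen == 0 || step <= 0)
        return -1;

    /* reduction gives every thread its own copy of total and sums them at the end */
    #pragma omp parallel reduction(+:total)
    {
        /* declared inside the region, so each thread owns its buffer */
        unsigned char *data = malloc((size_t)chunksize);

        #pragma omp for
        for (long k = 0; k < fsize; k += step) {
            size_t got = 0;
            if (data != NULL) {   /* a real program should report allocation failure */
                /* the stream position is shared: seek and read as one unit */
                #pragma omp critical(file_io)
                {
                    if (fseek(fp, k, SEEK_SET) == 0)
                        got = fread(data, 1, (size_t)chunksize, fp);
                }
                /* searching the private buffer needs no synchronization */
                total += count_matches(data, got, pattern, (size_t)plen);
            }
        }
        free(data);
    }
    return total;
}
```

Compile with gcc -fopenmp. Note that the critical section serializes all file reads, so only the searching itself runs in parallel; whether that pays off depends on how expensive the search is compared to the I/O.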
Upvotes: 0
Reputation: 340466
These statements are not safely parallelizable (is that a word?):
fseek(fp,k,SEEK_SET);
fread(data, chunksize, sizeof(unsigned char), fp);
One thread setting the file position at the same time as another, or immediately before a read in another thread, will wreak havoc. Not to mention that each thread would be reading into the same data buffer.
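One possible way around the shared file position, assuming a POSIX system, is pread, which takes the offset as an argument and does not move the file position at all. The sketch below is only an illustration: read_chunk_at and sum_bytes_parallel are hypothetical names, and summing the bytes is just a placeholder for the real search. The descriptor could come from fileno(fp) or open.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Reads up to chunksize bytes starting at offset into a caller-owned buffer.
   Returns the number of bytes read (short at end of file) or -1 on error. */
ssize_t read_chunk_at(int fd, unsigned char *buf, size_t chunksize, off_t offset)
{
    size_t done = 0;
    while (done < chunksize) {
        /* pread does not touch the shared file position */
        ssize_t n = pread(fd, buf + done, chunksize - done, offset + (off_t)done);
        if (n < 0)
            return -1;
        if (n == 0)          /* end of file */
            break;
        done += (size_t)n;
    }
    return (ssize_t)done;
}

/* Placeholder workload: sums all bytes of the file, chunk by chunk.
   Each thread reads into its own buffer, so no critical section is needed. */
long long sum_bytes_parallel(int fd, long fsize, long chunksize)
{
    long long sum = 0;
    if (chunksize <= 0)
        return -1;

    #pragma omp parallel reduction(+:sum)
    {
        unsigned char *buf = malloc((size_t)chunksize);   /* per-thread buffer */

        #pragma omp for
        for (long k = 0; k < fsize; k += chunksize) {
            ssize_t got = buf ? read_chunk_at(fd, buf, (size_t)chunksize, (off_t)k) : -1;
            for (ssize_t i = 0; i < got; i++)
                sum += buf[i];
        }
        free(buf);
    }
    return sum;
}
```

If the file was written through a FILE* first, call fflush before reading through the descriptor, since pread bypasses the stdio buffer.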
Upvotes: 1