Lois2B

Reputation: 121

Why does the second call of the same function execute forever?

I wrote a function that uses a parallel for with a static schedule to do some calculations and then returns to main. When I call the same function a second time, it runs forever and I have to abort the program.

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <omp.h>
#include <time.h>

int thread_count;

void work(int x) {
    int divisor = 0;
    for (int i=1; i<=x; i++) {
        if ((x%i) == 0) {
            divisor++;
        }
    }
}

void initialize(int *codes, int n) {
    thread_count = 4;

    srand(time(NULL));
    for (int i=0; i<n; i++) {
        codes[i] = rand() % 10000;
    }
}

double get_difference(double *times, int n) {
    double min, max;
    min = max = times[0];
    for (int i=1; i<n; i++) {
        if (times[i] > max) {
            max = times[i];
        }
        if (times[i] < min) {
            min = times[i];
        }
    }
    return (max-min);
}

void my_function(int *a, double *times, int n, int thread_count) {
    long i;
    #pragma omp parallel
    {
      #pragma omp parallel for num_threads(thread_count) \
        shared(a, n) private(i) schedule(static, 1)
        for (i=0; i<n; i++) {
            work(a[i]);
        }
        double wtime = omp_get_wtime();
        printf( "Time taken by thread %d is %f\n", omp_get_thread_num(), wtime);
        times[omp_get_thread_num()] = wtime;
     }
}

void odd_even(int *a, int n) {
   int phase, i, tmp;

   # pragma omp parallel num_threads(thread_count) \
      default(none) shared(a, n) private(i, tmp, phase)
   for (phase = 0; phase < n; phase++) {
      if (phase % 2 == 0)
        # pragma omp for 
         for (i = 1; i < n; i += 2) {
            if (a[i-1] < a[i]) {
               tmp = a[i-1];
               a[i-1] = a[i];
               a[i] = tmp;
            }
         }
      else
         #pragma omp for 
         for (i = 1; i < n-1; i += 2) {
            if (a[i] < a[i+1]) {
               tmp = a[i+1];
               a[i+1] = a[i];
               a[i] = tmp;
            }
         }
   }
}

And in my main I make the calls:

int main(int argc, char *argv[]) {
    int n = atoi(argv[1]);
    int arr[n];
    double times[thread_count];
    initialize(arr, n);
    odd_even(arr, n);
    my_function(arr, times, n, thread_count);
    double difference = get_difference(times, thread_count);
    printf("Difference is %f\n", difference);

    // my_function(arr, times, n, thread_count);
    // difference = get_difference(times, thread_count);
    // printf("Difference is %f\n", difference);
}

I added some prints to standard output. On the first call, each thread's timestamp is printed within a couple of seconds, but on the second call the program keeps running forever and nothing is printed.

I tried both a block distribution (schedule chunk size of n/thread_count) and a block-cyclic distribution (chunk size of 1), but I get the same issue either way.

I also tried duplicating the function and calling two different functions with the same content one after the other, but that doesn't work either.

I don't change any of the variables and data between the two calls, so why is the second function call not executing properly?

Upvotes: 1

Views: 114

Answers (1)

dreamcrash

Reputation: 51513

There are some issues with your code. First, in my_function the loop iterations are not being distributed among the threads as you intended. Because you used the combined parallel for construct inside an existing parallel region, and assuming nested parallelism is disabled (the default in most implementations), each thread created by the outer parallel region executes the whole loop sequentially by itself. Consequently, for n = 6 and 4 threads, you would have the following block of code:

for (i=0; i<n; i++) {
    work(a[i]);
}

being executed once by each of the 4 threads, so work is called 6 x 4 = 24 times in total (i.e., the number of loop iterations multiplied by the number of threads). For a more in-depth explanation, check this SO thread about a similar issue. Nevertheless, the image below visualizes the essentials:

enter image description here
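
The duplication described above can be checked with a small counting experiment. This is only a sketch: count_nested and count_workshared are hypothetical helper names that are not part of the original code, and the shared counter is incremented atomically purely so the total can be read afterwards.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: combined "parallel for" nested inside a parallel
   region, as in the question. Every thread of the outer team ends up
   running all n iterations. */
long count_nested(int n, int threads) {
    assert(n >= 0 && threads > 0);
    long calls = 0;
    #pragma omp parallel num_threads(threads)
    {
        #pragma omp parallel for num_threads(threads)
        for (int i = 0; i < n; i++) {
            #pragma omp atomic
            calls++;
        }
    }
    return calls;
}

/* Hypothetical helper: plain "omp for" inside the parallel region.
   The n iterations are divided among the threads of the one team. */
long count_workshared(int n, int threads) {
    assert(n >= 0 && threads > 0);
    long calls = 0;
    #pragma omp parallel num_threads(threads)
    {
        #pragma omp for
        for (int i = 0; i < n; i++) {
            #pragma omp atomic
            calls++;
        }
    }
    return calls;
}
```

When the runtime actually grants 4 threads, count_nested(6, 4) should return 24 while count_workshared(6, 4) returns 6. If the runtime grants fewer threads, the nested count is still n times the real team size.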

So change my_function to:

void my_function(int *a, double *times, int n, int thread_count) {
    #pragma omp parallel num_threads(thread_count) shared(a)
    {
        #pragma omp for schedule(static, 1)
        for (long i=0; i<n; i++) {
            work(a[i]);
        }
        double wtime = omp_get_wtime();
        printf( "Time taken by thread %d is %f\n", omp_get_thread_num(), wtime);
        times[omp_get_thread_num()] = wtime;
    }
}

Second, the variable thread_count is being used before it is properly initialized:

double times[thread_count];
initialize(arr, n);

Change this to:

initialize(arr, n);
double times[thread_count];

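This second problem causes undefined behavior, which can lead to unforeseen problems: thread_count is a global, so it is still 0 when times is declared, which makes times a zero-length variable-length array that the threads then write past.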

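Another issue, which you may or may not be aware of, is that, ironically, the function work does not actually do anything meaningful: it counts divisors but never returns or stores the result, so a compiler is free to optimize the loop away.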
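
One possible way to make the work observable is to return the count and combine the results with a reduction. This is a sketch under assumptions rather than the original author's method: count_divisors and total_divisors are hypothetical names introduced only for illustration.

```c
#include <assert.h>

/* Hypothetical variant of work(): returns the divisor count instead of
   discarding it, so the computation has a visible result. */
int count_divisors(int x) {
    int divisors = 0;
    for (int i = 1; i <= x; i++) {
        if (x % i == 0) {
            divisors++;
        }
    }
    return divisors;
}

/* Hypothetical driver: sums the divisor counts of a[0..n-1]. The
   reduction clause gives each thread a private partial sum and adds
   the partial sums together at the end of the loop. */
long total_divisors(const int *a, int n, int threads) {
    assert(n >= 0 && threads > 0);
    long total = 0;
    #pragma omp parallel for num_threads(threads) schedule(static, 1) \
        reduction(+:total)
    for (int i = 0; i < n; i++) {
        total += count_divisors(a[i]);
    }
    return total;
}
```

Because the sum is returned to the caller, the calls to count_divisors cannot simply be dropped as dead code.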

Finally, calling double wtime = omp_get_wtime(); on its own will not tell you how long a thread worked. According to the OpenMP documentation:

The omp_get_wtime routine returns elapsed wall clock time in seconds.

Therefore, to measure the time spent in some block of code, you can do the following:

double begin = omp_get_wtime();
// block of code
double end = omp_get_wtime();

and use the expression end - begin to get the time spent in seconds. In your case:

void my_function(int *a, double *times, int n, int thread_count) {
    #pragma omp parallel num_threads(thread_count) shared(a)
    {
        double begin = omp_get_wtime();
        #pragma omp for schedule(static, 1)
        for (long i=0; i<n; i++) {
            work(a[i]);
        }
        double end = omp_get_wtime();
        double time = end - begin;
        printf( "Time taken by thread %d is %f\n", omp_get_thread_num(), time);
        times[omp_get_thread_num()] = time;
    }
}
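
One caveat when timing this way: an omp for loop ends with an implicit barrier unless nowait is specified, so an end timestamp taken right after the loop also includes the time a thread spent waiting for the slowest one. The sketch below adds nowait so that each thread records only its own share of the loop. The names timed_spread and count_divisors are hypothetical, the serial fallbacks exist only so the sketch builds without OpenMP, and the times array is assumed to have room for threads entries.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
#include <time.h>
/* Assumed serial fallbacks so the sketch also builds without OpenMP. */
static double omp_get_wtime(void) { return (double)clock() / CLOCKS_PER_SEC; }
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Hypothetical stand-in for work() that returns its result. */
static int count_divisors(int x) {
    int divisors = 0;
    for (int i = 1; i <= x; i++) {
        if (x % i == 0) divisors++;
    }
    return divisors;
}

/* Records each thread's own loop time in times[] and returns the
   difference between the slowest and the fastest thread. */
double timed_spread(const int *a, int n, double *times, int threads) {
    assert(n >= 0 && threads > 0);
    int team = 1;  /* number of threads the runtime actually provided */
    #pragma omp parallel num_threads(threads)
    {
        volatile long sink = 0;  /* keeps the calls from being optimized away */
        #pragma omp single
        team = omp_get_num_threads();
        /* The implicit barrier after "single" lines the threads up here. */
        double begin = omp_get_wtime();
        #pragma omp for schedule(static, 1) nowait
        for (int i = 0; i < n; i++) {
            sink += count_divisors(a[i]);
        }
        /* nowait: no barrier above, so this is the thread's own finish time. */
        double end = omp_get_wtime();
        times[omp_get_thread_num()] = end - begin;
    }
    double min = times[0], max = times[0];
    for (int t = 1; t < team; t++) {
        if (times[t] > max) max = times[t];
        if (times[t] < min) min = times[t];
    }
    return max - min;
}
```

Without nowait, all threads leave the loop together, so the recorded times would be nearly identical and the difference computed by a function like get_difference would say little about load balance.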

Upvotes: 2
