A.nechi

Reputation: 541

Multi-threaded program is blocking

Hi, I'm trying to write a multi-threaded program in C that uses 4 threads to run a computation on an array of floats. I began by creating the 4 threads and passing each one arguments that define which part of the array it works on. At that point the program worked fine.

Next, I changed the routine to use only load and store instructions (256-bit Intel intrinsics). Now the program never finishes, although the thread routines seem to complete their work.

void *routine(void *thread_info)
{
   int n;
   unsigned t_start,t_stop;
   unsigned ind1, ind2, ind3;
   float *arr_in , *arr_out;
   struct thread_data *mydata;

   mydata = (struct thread_data*) thread_info;
   t_start = mydata->start;
   t_stop  = mydata->stop;
   arr_in  = mydata->input;
   arr_out = mydata->output;

   for (n = t_start; n < t_stop; n += 8)
   {  
      ind1 = 256 + n;
      ind2 = 512 + n;

      vec_a = _mm256_load_ps((float *) (&arr_in[n   ]) );
      vec_b = _mm256_load_ps((float *) (&arr_in[ind1]) );
      vec_c = _mm256_load_ps((float *) (&arr_in[ind2]) );

      _mm256_store_ps((float *) (&arr_out[n   ]), (vec_a) );
      _mm256_store_ps((float *) (&arr_out[ind1]), (vec_b) );
      _mm256_store_ps((float *) (&arr_out[ind2]), (vec_c) );
    }   
    printf("EXECUTION FINISHED ===== Thread : %u \n",t_start);
    pthread_exit(NULL);
}

void foo(float* in,float* out)
{
   unsigned t,i=0;
   for(t=0;t<256;t+=64)
   {
      thread_data_array[i].start    = t;
      thread_data_array[i].stop = t+QUARTER;
      thread_data_array[i].input    = in;
      thread_data_array[i].output   = out;
      pthread_create(&threads[i],NULL,routine,(void*)&thread_data_array[i]);
      i++;
   }
   pthread_exit(NULL);
}

int main()
{
   float *data1;
   float *data2;

   posix_memalign((void**)&data1, 32, 1024 * sizeof(float));
   posix_memalign((void**)&data2, 32, 1024 * sizeof(float));

   Load_inputs(reals,imags);//load data into the two arrays
   foo(data1,data2);
   printf("PROGRAM EXECUTION FINISHED");
   return EXIT_SUCCESS;
}

It compiles without errors, but running it gives the following output:

EXECUTION FINISHED ===== Thread : 0 
EXECUTION FINISHED ===== Thread : 64 
EXECUTION FINISHED ===== Thread : 128 
EXECUTION FINISHED ===== Thread : 192

The program does not terminate, and the PROGRAM EXECUTION FINISHED line is never printed.

Upvotes: 0

Views: 97

Answers (1)

mnistic

Reputation: 11020

In your foo function, you call pthread_exit(NULL);, which immediately terminates the main thread (foo is called from the main thread). This is why you are not seeing "PROGRAM EXECUTION FINISHED" in the output: the main thread never gets the chance to print it because it was terminated in foo. What you want to do instead is join the threads with pthread_join, which makes the main thread wait for the other threads to finish.

Upvotes: 5
