Giannis

Reputation: 75

pthread_join hangs depending on the value of a global constant

I have written this code using pthreads. The goal is to build an array X[N][D] and fill it with random values. You can read the elements of this array as the coordinates of N points in D dimensions.

In the next step I try to compute an array distances[N], which holds the distance between the last (Nth) element and every other element. The distance calculation is done with pthreads.

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <math.h>
#include <time.h>   // for time(), used to seed srand()

#define N 10
#define D 2         //works for any d
#define NUM_THREADS 8


//double *distances;
//int global_index = 0;
pthread_mutex_t lock;
double *X;

typedef struct
{
    //int thread_id;
    double *distances;
    int *global_index ;
    pthread_mutex_t lock;
    double *X;

}parms;

void *threadDistance(void *arg)
{
    parms *data = (parms *) arg;
    double *distances = data->distances;
    double *X = data->X;
    int *global_idx = data -> global_index;

    int idx,j;
    //long id = (long)arg;
    pthread_mutex_lock(&lock);

    while(*global_idx<N)
    {
        //printf("Thread #%ld , is calculating\n", id);
        idx = *(global_idx);
        (*global_idx)++;
        pthread_mutex_unlock(&lock);
        for(j=0 ; j<D; j++)
        {
            distances[idx] = pow(X[(j+1)*N-1]-X[j*N+idx], 2);
            //printf("dis[%d]= ", dis);
            //printf("%f\n",distances[idx]);
        }
        //printf("global : %d\n", *global_idx);
    }


    pthread_exit(NULL);


}

void calcDistance(double * X, int n, int d)
{
    int i;
    int temp=0;
    pthread_t threads[NUM_THREADS];
    double *distances = malloc(n * sizeof(double));

    parms arg;
    arg.X = X;
    arg.distances = distances;
    arg.global_index = &temp;

    for (i=0 ; i<NUM_THREADS ; i++)
    {
        pthread_create(&threads[i], NULL, threadDistance, (void *) &arg);
    }

    for(i = 0 ; i<NUM_THREADS; i++)
    {
        pthread_join(threads[i], NULL);
    }

    /*----print dstances[] array-------*/
    printf("--------\n");
    for(int i = 0; i<N; i++)
    {
        printf("%f\n", distances[i]);
    }
    /*------------*/
    free(distances);
}

int main()
{

    srand(time(NULL));

    //allocate the proper space for X
    X = malloc(D*N*(sizeof(double)));

    //fill X with numbers in space (0,1)
    for(int i = 0 ; i<N ; i++)
    {
        for(int j=0; j<D; j++)
        {
            X[i+j*N] = (double) (rand()  / (RAND_MAX + 2.0));
        }

    }

    calcDistance(X, N, D);


    return 0;
}

The problem is that the code runs to completion only when N=100000. For any other value of N it just hangs, and I have traced the hang to the pthread_join() call. First of all, I cannot understand why the hang depends on the value of N.

Secondly, I have tried printing the value of global_index with printf() (the call is commented out in this sample). As soon as I uncomment the printf("global : %d\n", *global_idx); line, the program stops hanging, regardless of the value of N.

It seems crazy to me that the difference between hanging and not hanging comes down to changes that look so unrelated.

Upvotes: 0

Views: 107

Answers (1)

user3629249

Reputation: 16540

Regarding:

pthread_mutex_lock(&lock); 
while(*global_idx<N) 
{  
    // ... 
    pthread_mutex_unlock(&lock); 

The mutex is locked once, before the loop, but unlocked on every iteration. The result is that after the first iteration of the loop the mutex is always unlocked, so the shared index is read and incremented without protection. I suggest moving the call to pthread_mutex_lock() inside the loop, at the top.

After making that correction, I set N to 10000 and recompiled. The result was a segmentation fault, so the mishandling of the mutex is not the only problem.

Regarding:

"First of all I cannot understand why the hang depends on the value of N."

It seems the program is actually crashing with a segmentation fault, not hanging.
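
To make the suggestion concrete, here is a minimal sketch of a worker loop in which the bounds check, the read of the shared index and its increment all happen under the same lock, and every path out of the critical section unlocks it. It is not a verified diagnosis of the posted program. It rests on two readings of the original that are assumptions: a thread that finds the loop condition false on its first test returns while still holding the mutex, which would block any thread that starts later and could explain why the behaviour changes with N and with timing; and testing the condition outside the lock lets several threads claim indices at or past N, which could explain the segmentation fault. The names work_t, worker, calc_distances and next_index are illustrative and not from the question. The sketch also sums the squared differences over all D coordinates, where the posted loop overwrites distances[idx] on each pass; that the sum is what was intended is another assumption.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define NUM_THREADS 8

/* Shared state handed to every worker (illustrative names). */
typedef struct {
    const double *X;      /* column-major: X[j * n + i] is coordinate j of point i */
    double *distances;    /* output array with n entries */
    int n, d;
    int next_index;       /* next unclaimed point, protected by lock */
    pthread_mutex_t lock;
} work_t;

static void *worker(void *arg)
{
    work_t *w = arg;

    for (;;) {
        /* Claim an index: check and increment under one lock,
           and unlock before leaving the critical section on every path. */
        pthread_mutex_lock(&w->lock);
        int idx = w->next_index;
        if (idx < w->n)
            w->next_index++;
        pthread_mutex_unlock(&w->lock);

        if (idx >= w->n)
            break;                      /* no work left */

        /* Squared distance between point idx and the last point. */
        double sum = 0.0;
        for (int j = 0; j < w->d; j++) {
            double diff = w->X[j * w->n + (w->n - 1)] - w->X[j * w->n + idx];
            sum += diff * diff;
        }
        w->distances[idx] = sum;        /* each idx is written by exactly one thread */
    }
    return NULL;
}

/* Fills distances[0..n-1]. Returns 0 on success, -1 if setup failed
   or no thread could be created. */
int calc_distances(const double *X, int n, int d, double *distances)
{
    assert(n > 0 && d > 0);

    work_t w = { .X = X, .distances = distances, .n = n, .d = d, .next_index = 0 };
    if (pthread_mutex_init(&w.lock, NULL) != 0)
        return -1;

    pthread_t threads[NUM_THREADS];
    int created = 0;
    for (int i = 0; i < NUM_THREADS; i++) {
        if (pthread_create(&threads[created], NULL, worker, &w) != 0)
            break;
        created++;
    }
    /* Join only the threads that were actually started. */
    for (int i = 0; i < created; i++)
        pthread_join(threads[i], NULL);

    pthread_mutex_destroy(&w.lock);
    return created > 0 ? 0 : -1;
}
```

Calling calc_distances(X, N, D, distances) with the X built in main() would replace the body of calcDistance(). Build with the -pthread option. Like the posted code, the sketch stores squared distances and does not take the square root.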

Upvotes: 1
