mitweyl

Reputation: 87

Confusion about using pthread_join(thread, NULL)

When we create a thread with pthread_create, should we call pthread_join immediately afterwards?

For example, I have the following two versions of the code, but I do not know why the first one does not work.

For the 1st version, the output is not deterministic.

#include<iostream>
#include<pthread.h>
#include<cstring>
#include<cstdlib>
#define ROW 3
#define COL 3
using namespace std;
typedef struct {
int row;
int col;
} para;
void print(double * para)
{
    for(int i=0;i<3;i++)
   {
           for(int j=0;j<3;j++)
           {
                    cout<<*(para+3*i+j)<<"\t";
            }
            cout<<endl;
        }

}
double mat[9]={1,2,3,4,5,6,7,8,9};
double * result=(double *) malloc(9*sizeof(double));
void * mul(void * arg)
{
        para * temp=(para *) arg;
        int row=temp->row;
        int col=temp->col;
        double sum=0;
        for(int i=0;i<3;i++)
        {
                double a=*(mat+row*3+i);
                double b=*(mat+i+3*col);
                sum+=a*b;
        }
        *(result+row*3+col)=sum;
        return NULL;
}
int main()
{
        pthread_t thread[9];
        for(int i=0;i<9;i++)
        {
                   para M;
                M.row=i/3;
                M.col=i%3;
                pthread_create(&thread[i],NULL,mul,&M);

        }
        for(int i=0;i<9;i++)
        {
                 pthread_join(thread[i],NULL);            
        }

        print(result);
}

With the 2nd version, the output is correct.

#include<iostream>
#include<pthread.h>
#include<cstring>
#include<cstdlib>
#define ROW 3
#define COL 3
using namespace std;
typedef struct {
int row;
int col;
} para;
void print(double * para)
{
    for(int i=0;i<3;i++)
   {
           for(int j=0;j<3;j++)
           {
                    cout<<*(para+3*i+j)<<"\t";
            }
            cout<<endl;
        }

}
double mat[9]={1,2,3,4,5,6,7,8,9};
double * result=(double *) malloc(9*sizeof(double));
void * mul(void * arg)
{
        para * temp=(para *) arg;
        int row=temp->row;
        int col=temp->col;
        double sum=0;
        for(int i=0;i<3;i++)
        {
                double a=*(mat+row*3+i);
                double b=*(mat+i+3*col);
                sum+=a*b;
        }
        *(result+row*3+col)=sum;
        return NULL;
}
int main()
{
        pthread_t thread[9];
        for(int i=0;i<9;i++)
        {
                   para M;
                M.row=i/3;
                M.col=i%3;
                pthread_create(&thread[i],NULL,mul,&M);
                pthread_join(thread[i],NULL); 
        }   
        print(result);
}

What is the difference between these two usages, and what is wrong with the first version?

Upvotes: 1

Views: 5828

Answers (2)

Loki Astari

Reputation: 264361

The first version starts nine threads.
Then once all threads have been created it waits for them all to finish before exiting.
Thus you get nine threads running in parallel.

The second version starts nine threads.
But after each thread is started it waits for the thread to exit before continuing.
Thus you get nine threads running serially.

Unfortunately the first version is also broken.
The data object passed to the thread (the 4th argument, &M) is an automatic variable local to the loop body. It can go out of scope, and its storage can be overwritten with the next iteration's values, before the thread has read it.

Fix like this:

    pthread_t thread[9];
    para      M[9];
    for(int i=0;i<9;i++)
    {
            M[i].row = i/3;
            M[i].col = i%3;
            pthread_create(&thread[i],NULL,mul,&M[i]);

    }
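
For completeness, here is a minimal sketch of the whole program flow with that fix applied, assuming the join loop from the first version is kept after the create loop. The names Cell, cells and multiply_parallel are mine, not from the question, and the indexing inside mul is copied unchanged from the question's code.

```cpp
#include <pthread.h>

// Per-thread argument: which cell of the 3x3 result this thread computes.
// (Hypothetical name; plays the role of `para` in the question.)
struct Cell {
    int row;
    int col;
};

static const double mat[9] = {1, 2, 3, 4, 5, 6, 7, 8, 9};
static double result[9];

// Thread body, using the same indexing as the question's mul().
void *mul(void *arg)
{
    const Cell *cell = static_cast<const Cell *>(arg);
    double sum = 0;
    for (int i = 0; i < 3; i++)
        sum += mat[cell->row * 3 + i] * mat[i + 3 * cell->col];
    result[cell->row * 3 + cell->col] = sum;
    return nullptr;
}

// Starts up to nine threads, then joins every thread that was started.
// Returns 0 on success, or the error code from pthread_create.
int multiply_parallel()
{
    pthread_t thread[9];
    Cell cells[9];  // one slot per thread, alive until all joins return
    int started = 0;
    int rc = 0;
    for (; started < 9; started++) {
        cells[started].row = started / 3;
        cells[started].col = started % 3;
        rc = pthread_create(&thread[started], nullptr, mul, &cells[started]);
        if (rc != 0)
            break;
    }
    for (int i = 0; i < started; i++)
        pthread_join(thread[i], nullptr);
    return rc;
}
```

Because cells lives in the same scope as both loops, every thread's argument stays valid and unmodified until that thread has been joined. Call multiply_parallel() from main and print result afterwards.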

Upvotes: 5

rkhayrov

Reputation: 10260

All your threads share the same memory region as their parameter because you repeatedly pass a pointer to the same stack-allocated variable M. You change M in the main loop while worker threads are running, leading to non-deterministic results. In the 2nd version you have essentially made your code sequential, because pthread_join waits for each thread to terminate before you start the next one. That is why it works correctly.
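
One way to give each thread its own memory region, sketched here as an illustration rather than as the only fix, is to allocate the argument on the heap per thread and let the thread free it. Task, square_worker and run_square_workers are hypothetical names, and the work done (squaring an index) is just a stand-in for the matrix cell computation.

```cpp
#include <pthread.h>
#include <vector>

// Hypothetical per-thread argument: each thread owns its own copy.
struct Task {
    int index;  // which output slot to fill
    int *out;   // shared output array; each thread writes a distinct slot
};

// Thread body: does its work, then frees the argument it was given.
void *square_worker(void *arg)
{
    Task *task = static_cast<Task *>(arg);
    task->out[task->index] = task->index * task->index;
    delete task;
    return nullptr;
}

// Starts `count` threads, each with its own heap-allocated Task, then
// joins them all. Returns 0 on success, or the pthread_create error code.
int run_square_workers(int *out, int count)
{
    std::vector<pthread_t> threads;
    int rc = 0;
    for (int i = 0; i < count; i++) {
        Task *task = new Task{i, out};
        pthread_t tid;
        rc = pthread_create(&tid, nullptr, square_worker, task);
        if (rc != 0) {
            delete task;  // the thread never started, so free it here
            break;
        }
        threads.push_back(tid);
    }
    for (pthread_t tid : threads)
        pthread_join(tid, nullptr);
    return rc;
}
```

Since no two threads are handed the same Task, the main loop can move on to the next iteration without touching memory a running thread is still reading.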

Upvotes: 0
