Reputation: 87
When we create a thread with pthread_create, should we call pthread_join immediately afterwards?
For example, I have the following two versions of a program, and I do not understand why the first one does not work.
With the 1st version, the output is not deterministic.
#include<iostream>
#include<pthread.h>
#include<cstring>
#include<cstdlib>
#define ROW 3
#define COL 3
using namespace std;
typedef struct {
int row;
int col;
} para;
void print(double * para)
{
for(int i=0;i<3;i++)
{
for(int j=0;j<3;j++)
{
cout<<*(para+3*i+j)<<"\t";
}
cout<<endl;
}
}
double mat[9]={1,2,3,4,5,6,7,8,9};
double * result=(double *) malloc(9*sizeof(double));
void * mul(void * arg)
{
para * temp=(para *) arg;
int row=temp->row;
int col=temp->col;
double sum=0;
for(int i=0;i<3;i++)
{
double a=*(mat+row*3+i);
double b=*(mat+i+3*col);
sum+=a*b;
}
*(result+row*3+col)=sum;
return NULL;
}
int main()
{
pthread_t thread[9];
for(int i=0;i<9;i++)
{
para M;
M.row=i/3;
M.col=i%3;
pthread_create(&thread[i],NULL,mul,&M);
}
for(int i=0;i<9;i++)
{
pthread_join(thread[i],NULL);
}
print(result);
}
With the 2nd version, the output is correct.
#include<iostream>
#include<pthread.h>
#include<cstring>
#include<cstdlib>
#define ROW 3
#define COL 3
using namespace std;
typedef struct {
int row;
int col;
} para;
void print(double * para)
{
for(int i=0;i<3;i++)
{
for(int j=0;j<3;j++)
{
cout<<*(para+3*i+j)<<"\t";
}
cout<<endl;
}
}
double mat[9]={1,2,3,4,5,6,7,8,9};
double * result=(double *) malloc(9*sizeof(double));
void * mul(void * arg)
{
para * temp=(para *) arg;
int row=temp->row;
int col=temp->col;
double sum=0;
for(int i=0;i<3;i++)
{
double a=*(mat+row*3+i);
double b=*(mat+i+3*col);
sum+=a*b;
}
*(result+row*3+col)=sum;
return NULL;
}
int main()
{
pthread_t thread[9];
for(int i=0;i<9;i++)
{
para M;
M.row=i/3;
M.col=i%3;
pthread_create(&thread[i],NULL,mul,&M);
pthread_join(thread[i],NULL);
}
print(result);
}
What is the difference between these two usages? And why does the first version give wrong results?
Upvotes: 1
Views: 5828
Reputation: 264361
The first version starts nine threads.
Then once all threads have been created it waits for them all to finish before exiting.
Thus you get nine threads running in parallel.
The second version starts nine threads.
But after each thread is started it waits for the thread to exit before continuing.
Thus you get nine threads running serially.
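Unfortunately the first version is also broken.
The data object passed to the thread (the 4th argument, &M) is an automatic variable declared inside the loop body. It goes out of scope at the end of each iteration, potentially before the thread has finished reading it.
Fix it by giving each thread its own element of an array that outlives the loop: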
pthread_t thread[9];
para M[9];
for(int i=0;i<9;i++)
{
M[i].row = i/3;
M[i].col = i%3;
pthread_create(&thread[i],NULL,mul,&M[i]);
}
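For completeness, here is a minimal sketch of how that fix might look together with the join loop from the first version. The names Para and run_all and the error handling are my own additions, not from the question. The indexing inside mul is copied from the question as written (it pairs row `row` with row `col` of mat), so adjust it if a plain matrix product is intended.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical stand-in for the question's `para` struct.
struct Para { int row; int col; };

static const double mat[9] = {1, 2, 3, 4, 5, 6, 7, 8, 9};
static double result[9];

static void* mul(void* arg) {
    const Para* p = static_cast<const Para*>(arg);
    assert(p->row >= 0 && p->row < 3 && p->col >= 0 && p->col < 3);
    double sum = 0;
    for (int i = 0; i < 3; ++i) {
        // Same indexing as the question's mul().
        sum += mat[p->row * 3 + i] * mat[i + 3 * p->col];
    }
    // Each thread writes a different cell, so no locking is needed here.
    result[p->row * 3 + p->col] = sum;
    return nullptr;
}

// Starts nine threads in parallel and waits for all of them.
// Returns 0 on success, otherwise the first pthread_create error code.
int run_all() {
    pthread_t thread[9];
    Para M[9];  // one slot per thread; stays alive until every join below returns
    int started = 0;
    int err = 0;
    for (int i = 0; i < 9; ++i) {
        M[i].row = i / 3;
        M[i].col = i % 3;
        err = pthread_create(&thread[i], nullptr, mul, &M[i]);
        if (err != 0) break;
        ++started;
    }
    // Join only after all threads have been created, so they can overlap.
    for (int i = 0; i < started; ++i) {
        pthread_join(thread[i], nullptr);
    }
    return err;
}
```

Call run_all() from main and print result afterwards. Because M is declared outside the creation loop and each thread gets its own element, no thread can see another thread's row and col.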
Upvotes: 5
Reputation: 10260
All your threads share the same memory region as their parameter because you repeatedly pass a pointer to the same stack-allocated variable M. You change M in the main loop while worker threads are running, leading to non-deterministic results. In the 2nd version you've essentially made your code sequential, because pthread_join waits for each thread to terminate before you start the next one. That's why it works correctly.
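One way to avoid the shared parameter, as a rough sketch only: allocate a separate argument object for every thread and let the thread free it. The names Para, mul_owned and run_all_owned are made up for illustration, and the indexing is kept as in the question's mul.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical stand-in for the question's `para` struct.
struct Para { int row; int col; };

static const double mat[9] = {1, 2, 3, 4, 5, 6, 7, 8, 9};
static double result[9];

// The thread takes ownership of its argument and deletes it when done.
static void* mul_owned(void* arg) {
    Para* p = static_cast<Para*>(arg);
    assert(p != nullptr);
    double sum = 0;
    for (int i = 0; i < 3; ++i) {
        // Same indexing as the question's mul().
        sum += mat[p->row * 3 + i] * mat[i + 3 * p->col];
    }
    result[p->row * 3 + p->col] = sum;
    delete p;
    return nullptr;
}

// Returns 0 on success, otherwise the first pthread_create error code.
int run_all_owned() {
    pthread_t thread[9];
    int started = 0;
    int err = 0;
    for (int i = 0; i < 9; ++i) {
        // A fresh object per thread: later iterations cannot overwrite it.
        Para* p = new Para{i / 3, i % 3};
        err = pthread_create(&thread[i], nullptr, mul_owned, p);
        if (err != 0) {
            delete p;  // the thread never started, so we still own p
            break;
        }
        ++started;
    }
    for (int i = 0; i < started; ++i) {
        pthread_join(thread[i], nullptr);
    }
    return err;
}
```

With this layout the main loop never touches an argument after handing it to a thread, so the threads can still run in parallel without racing on M.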
Upvotes: 0