Reputation: 3355
This is my test C program:
#include <pthread.h>
#include <unistd.h>
#include <stdio.h>
__thread int var = 0;
void* worker(void* arg);
int main()
{
pthread_t pid1, pid2;
pthread_create(&pid1, NULL, worker, (void*)0);
pthread_create(&pid2, NULL, worker, (void*)1);
printf("-----------1----------\n");
pthread_join(pid1, NULL);
sleep(1);
printf("-----------2----------\n");
pthread_join(pid2, NULL);
return 0;
}
void* worker(void* arg)
{
int idx = (int)arg;
int i;
for (i = 0; i < 10; ++i) {
printf("thread: %d ++var = %d\n",
idx,
++var);
}
}
I compile it as follows:
$ gcc -g -Wall -pthread 1.c -lpthread -o test
1.c: In function ‘worker’:
1.c:27:15: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
27 | int idx = (int)arg;
| ^
1.c:35:1: warning: control reaches end of non-void function [-Wreturn-type]
35 | }
| ^
The output of a run is below; the last line appears 1 second later. But I cannot understand why "thread: 1" comes before "thread: 0".
$ ./test
-----------1----------
thread: 1 ++var = 1
thread: 1 ++var = 2
thread: 1 ++var = 3
thread: 1 ++var = 4
thread: 1 ++var = 5
thread: 1 ++var = 6
thread: 1 ++var = 7
thread: 1 ++var = 8
thread: 1 ++var = 9
thread: 1 ++var = 10
thread: 0 ++var = 1
thread: 0 ++var = 2
thread: 0 ++var = 3
thread: 0 ++var = 4
thread: 0 ++var = 5
thread: 0 ++var = 6
thread: 0 ++var = 7
thread: 0 ++var = 8
thread: 0 ++var = 9
thread: 0 ++var = 10
-----------2----------
Upvotes: 1
Views: 112
Reputation: 23822
Imagine a multicore system where each thread is assigned to a different processor. The threads execute their code more or less independently of each other, as they should if we want a faster program: if one thread had to wait for the other to finish before it could begin, using two threads would be pointless.
What the pthread_join calls guarantee is that the program will not advance past them until both threads have finished their respective work.
If you want to control the execution flow, either don't use more than one thread or synchronize the execution yourself. In this case you could use a mutex; it would look something like:
#include <pthread.h>
#include <stdio.h>
void* worker(void* arg);
typedef struct { // shared data
int idx;
int var;
pthread_mutex_t* mutex_ptr;
} Data;
int main()
{
pthread_mutex_t mutex;
pthread_mutex_init(&mutex, NULL);
Data data = {.idx = 0, .var = 0, .mutex_ptr = &mutex};
pthread_t pid1, pid2;
pthread_create(&pid1, NULL, worker, &data);
pthread_create(&pid2, NULL, worker, &data);
pthread_join(pid1, NULL);
pthread_join(pid2, NULL);
pthread_mutex_destroy(&mutex);
}
void* worker(void* arg)
{
Data* data = (Data*)arg;
int i;
pthread_mutex_lock(data->mutex_ptr);
printf("\n----------%d-----------\n", data->idx);
for (i = 0; i < 10; ++i) {
printf("thread: %d ++var = %d\n",
data->idx,
++data->var);
}
data->idx++;
data->var = 0;
pthread_mutex_unlock(data->mutex_ptr);
return NULL; // return type of worker is void* so it must return a pointer
}
In this live sample you can see the difference between using and not using a mutex. I had to add a larger loop to really see it.
Expected output with mutex:
----------0-----------
thread: 0 ++var = 1
thread: 0 ++var = 2
thread: 0 ++var = 3
thread: 0 ++var = 4
thread: 0 ++var = 5
thread: 0 ++var = 6
thread: 0 ++var = 7
thread: 0 ++var = 8
thread: 0 ++var = 9
thread: 0 ++var = 10
----------1-----------
thread: 1 ++var = 1
thread: 1 ++var = 2
thread: 1 ++var = 3
thread: 1 ++var = 4
thread: 1 ++var = 5
thread: 1 ++var = 6
thread: 1 ++var = 7
thread: 1 ++var = 8
thread: 1 ++var = 9
thread: 1 ++var = 10
Without the mutex the lines can appear in any order, e.g.:
----------0-----------
thread: 0 ++var = 1
thread: 0 ++var = 2
thread: 0 ++var = 3
thread: 0 ++var = 4
thread: 1 ++var = 1
thread: 0 ++var = 5
----------1-----------
thread: 0 ++var = 6
thread: 1 ++var = 2
thread: 0 ++var = 7
thread: 1 ++var = 3
thread: 0 ++var = 8
thread: 1 ++var = 4
...
...
...
Note that the synchronized example gains little to no speed from using threads: because of the synchronization we introduced, the program behaves much as it would with a single thread. In exchange, we control access to the shared data (note that the mutex is shared too). This is only an example; keep in mind that the critical section, i.e. what's inside the lock, should contain only the accesses that really need protection, and loops inside it should be avoided when possible.
Upvotes: 2
Reputation: 141493
why "thread: 1" comes before "thread: 0"?
It's just that thread 1 gets CPU time first and is so fast that it is able to print everything.
Most probably, main() has CPU time until it calls pthread_join, at which point it yields the processor and the scheduler kicks in. The scheduler then decides to give CPU time to thread 1, probably because it is the last one; this is an arbitrary choice. The thread is fast enough to print everything before the scheduler gets a chance to re-schedule the CPU.
The threads are unsequenced with respect to each other: you can't expect any particular order, except that the outputs of individual printf calls should not mix, i.e. printf itself is thread safe. One thread printing before the other is as unsequenced as any other result.
warning: cast from pointer to integer of different size
Cast with (uintptr_t)arg to silence the warning; uintptr_t is declared in <stdint.h>.
warning: control reaches end of non-void function
It is a very serious warning and results in undefined behavior. In particular, lately I explored how this exact problem may result in an endless loop with very similar code to yours. Add return NULL; at the end of the function.
Upvotes: 1