E V Ravi

Reputation: 141

Unable to understand pthread_create() behaviour in the following program?

#include <stdio.h>
#include <pthread.h>

void *thread_func(void *arg)
{
        printf("hello, world \n");
        return 0;
}

int main(void)
{
        pthread_t t1, t2;

        pthread_create(&t1, NULL, thread_func, NULL);
        pthread_create(&t2, NULL, thread_func, NULL);

        printf("t1 = %d\n",t1);
        printf("t2 = %d\n",t2);

        return 0;
}

The above program creates two threads, each of which prints "hello, world".

So, as I understand it, "hello, world" should be printed at most 2 times.

However, when I run the program multiple times back to back, "hello, world" is sometimes printed more than 2 times. How can it be printed an unexpected number of times?

Here are the sample outputs:

[rr@ar ~]$ ./a.out
t1 = 1290651392
t2 = 1282258688
hello, world
hello, world


[rr@ar ~]$ ./a.out
t1 = 1530119936
t2 = 1521727232
hello, world
hello, world
hello, world

As shown above, after running the program many times, "hello, world" is printed 3 times. Can anyone explain how it got printed 3 times?

Upvotes: 5

Views: 155

Answers (3)

Luis Colorado

Reputation: 12668

Well, let me show a scenario where this can happen. You probably know (if you don't, please read the appropriate manual page) that printf() is one of the standard library functions that is not thread safe (there is a list in the pthread_<something>, somewhere), and you probably also know that printf(3) stores its data in a buffer before issuing the write(2) system call that actually writes the data to stdout.

  1. Thread A (I deliberately chose labels different from yours so that the two threads are interchangeable; Thread A can be either thread) makes a printf() call that puts a complete "Hello, world\n" message in the buffer and prepares to write(2) it, because stdout is a tty device and a \n ends the output string.
  2. Thread B takes control and makes a second printf(3) call on the same data (filling the same buffer with a second copy of "Hello, world\n") and, for the same reason, prepares and fully executes a write(2) syscall of the whole buffer (which now contains two messages), flushing the buffer. This makes "Hello, world\n" appear twice.
  3. Thread A, which has been blocked in the first write(2) system call (two threads cannot make simultaneous write(2) calls to the same inode; this is guaranteed by the kernel), flushes its own view of the buffer (probably stored on its stack, and covering only the first message, ending at the end of the first "Hello, world\n") and so writes one more "Hello, world\n" message.

Final score: three "Hello, world\n" messages at the terminal.

NOTE

Getting three messages is unlikely in practice: one thread has to overtake the other in the short window between printf filling the buffer and deciding to flush it, and then be the first to enter the blocking write(2) call (as explained above, the kernel does not allow two threads to be in a write(2) call on the same file at the same time).

Upvotes: 2

Jean-Baptiste Yunès

Reputation: 36401

You have run into a thread-safety problem. I ran your code several times on Linux 16.04 and it produces many different outputs; the one with 3 hello world messages is rare, but it does occur. More often there is no output at all, which means that main terminates before the threads are able to finish their output. Sometimes partial output is produced, like:

t1=xxxx
t2=yyyy
he

That means that main is exiting while only one thread has been able to push some characters into the stdout buffer. Remember that a normal return from main is equivalent to a call to exit, which flushes stdio buffers.

While I am unable to really understand what happens behind the scenes when you observe 3 messages, I suspect there is a race condition that lets main flush a buffer that is currently being flushed by one of the threads. Without examining the source code of printf very carefully, it is hard to say more. A possible (rough) scenario would look like:

  1. thread1 fills the buffer and starts flushing it, but is preempted at the very beginning of the flush
  2. main exits, so it starts its own flush, completes it (producing hello world), and is preempted at the very end
  3. thread1 finishes its flushing producing hello world
  4. thread2 produces hello world
  5. main gets the CPU and terminates the process.

printf is not defined as thread-safe, which means that implementors may or may not implement it as such (probably not in most cases). So, as with any function that uses a shared resource, you need a mutex to prevent concurrent access to the buffer and the like.

In your case, the 3-output problem should be roughly solved by joining the threads in main, which prevents main from exiting/flushing before the threads terminate. But be aware that this will not solve other concurrency problems (two threads accessing the same buffer...).

Upvotes: 4

Marian

Reputation: 7472

When the main program terminates, its sub-threads are terminated as well.

It can happen that both sub-threads are executed before the main task finishes. In this case you see two "hello worlds" and the output as you show in the question.

Also it can happen that the main program finishes before one or both threads print the output. In this case you can see one or no "hello world" at all.

I do not see any way that a single run of this program could print it 3 times. I suppose that you are executing the program in a loop and the output of two runs is mixed together. ADDED: For example, imagine the following scenario: in RUN1, main prints the two numbers, then the sub-threads are scheduled and print one "hello world" each, then RUN1's main is scheduled back and the program finishes. Next, RUN2 is launched. This time both sub-threads are scheduled before the main program prints the numbers.

So you see an output like:

t1=346236763               (RUN1 - main)
t2=876237623               (RUN1 - main)
hello, world               (RUN1 - subthread)
hello, world               (RUN1 - subthread)
hello, world               (RUN2 - subthread)
hello, world               (RUN2 - subthread)
t1=3786768623              (RUN2 - main)
t2=7843473478              (RUN2 - main)

This output can be wrongly interpreted as if 4 "hello worlds" had been written by a single run.

Upvotes: 0
