Reputation: 93
I know that threads share the address space, but do not share their stacks. Isn't that contradictory? How can it be true that they share an address space when they do not share their stacks? The stack is part of the address space, isn't it?
I would assume that threads share the heap, data, and code segments but not the stack segment. To me, all of them are part of the process address space.
Can someone clarify please? Thanks!!
Upvotes: 8
Views: 2427
Reputation: 49
It is possible to have threads with separate stack address spaces, but it depends on two factors: how threads are implemented and which limitations the operating system imposes:
If they are implemented exclusively in user space without any kernel help (like the first thread libraries in old Unix OSes), they share the stack address space. The only difference is where each thread's stack starts.
If the operating system provides special syscalls for building threads (e.g. the cthreads of Mach 3 based kernels), or threads are built on special fork-like syscalls such as Linux's clone, or on non-POSIX syscalls (as on Windows), they can share most of the address space while having different anonymous memory for the stack segment.
Note that in the first case (user-space threads), threads share everything, even the same PID, and there is no real separation between threads. If one thread blocks or kills the process, all threads are blocked or killed (there is no truly separate execution). Of course, in this case the stack address space is shared by all the threads in the same PID.
In the other cases (with OS support), the degree of isolation depends on two things: the threads library and the kernel facilities. If the library does not use the kernel's mechanisms for creating processes with different combinations of shared resources (like Linux clone), the threads will certainly share the stack. If the library is advanced and supports such an exotic feature, it may separate the stacks.
But separating stacks into different address spaces introduces a big problem: you cannot share variables on the stack among threads. At first glance this does not seem like a big problem, and you may even think it is an advantage. But that is not true; in fact, sharing stack variables among several threads is a very common use case (e.g. in scientific code). Here follows a parallelized for in OpenMP (source: https://www.openmp.org/wp-content/uploads/openmp-examples-4.5.0.pdf):
    void simple(int n, float *a, float *b)
    {
        int i;

    #pragma omp parallel for
        for (i=1; i<n; i++) /* i is private by default */
            b[i] = (a[i] + a[i-1]) / 2.0;
    }
As you can see, the b and a vectors are passed as pointers during the call. You have no guarantee that they reside in a "shared address space"; they may well live on the caller's stack. If this OpenMP library is linked against a thread library where threads have stacks in different address spaces, this OpenMP parallelization will fail. That is a really bad start for a threading library: it breaks one of the introductory examples of OpenMP.
So, for compatibility, the most common choice is never to place stacks in different address spaces, even though it is possible to implement in most modern operating systems.
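To make the point concrete without OpenMP, here is a minimal sketch of the same averaging loop using plain POSIX threads. It assumes a pthreads implementation where all threads share one address space (the usual case); the names `struct slice`, `average_slice`, and `average_on_callers_stack` are made up for illustration. The arrays `a` and `b` live on the calling thread's stack, and the workers read and write them through ordinary pointers.

```c
#include <assert.h>
#include <pthread.h>

#define N 8

/* Hypothetical work descriptor: fill b[begin..end) from a. */
struct slice {
    const float *a;
    float *b;
    int begin;
    int end;
};

static void *average_slice(void *arg)
{
    struct slice *s = arg;
    for (int i = s->begin; i < s->end; i++)
        s->b[i] = (s->a[i] + s->a[i - 1]) / 2.0f;
    return NULL;
}

/* a, b and the work descriptors all live on this function's stack. */
float average_on_callers_stack(void)
{
    float a[N], b[N] = {0};
    for (int i = 0; i < N; i++)
        a[i] = (float)i;

    /* Two workers, each handling a disjoint half of indices 1..N-1. */
    struct slice work[2] = {
        { a, b, 1, N / 2 },
        { a, b, N / 2, N },
    };
    pthread_t tid[2];

    for (int t = 0; t < 2; t++) {
        int rc = pthread_create(&tid[t], NULL, average_slice, &work[t]);
        assert(rc == 0);
    }
    for (int t = 0; t < 2; t++)
        pthread_join(tid[t], NULL);

    return b[N - 1]; /* written by the second worker */
}
```

Calling `average_on_callers_stack()` from `main` only works because the workers can address the caller's stack; with stacks in separate address spaces the pointers in `work` would be meaningless to them.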
Upvotes: 1
Reputation: 223719
Yes, threads have the same address space but do not share stacks. Anything that one thread sees in memory another thread can see, and at the same address, but each thread's stack is in a different place in the address space, so they can each call functions independently without interfering with each other.
Take the following program as an example:
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <pthread.h>

    void *foo(void *arg)
    {
        int *n = arg;
        printf("in thread, arg=%p, value=%d, &n=%p\n", arg, *n, (void *)&n);
        return NULL;
    }

    int main()
    {
        int x = 4;
        printf("in main, x=%d, &x=%p\n", x, (void *)&x);

        pthread_t tid;
        pthread_create(&tid, NULL, foo, &x);
        sleep(3);
        pthread_join(tid, NULL);
        return 0;
    }
The main function passes the address of a local variable, which lives on the stack of the main thread, to another thread. The thread is able to dereference that pointer and read the value of the variable.
On my system it outputs the following:
in main, x=4, &x=0x7fff2142985c
in thread, arg=0x7fff2142985c, value=4, &n=0x7f6abaa90f08
Here you can see that both the main thread and the child thread see the same address and value for x in the main function. You can also see that the address of variable n in foo, which lives in the stack of the child thread, is very far away from the address of x in main (roughly 637GB apart).
This demonstrates that both threads can read the same memory with the same addresses and that each thread has its own stack.
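As a complementary sketch, the following starts two threads that run the same function and checks that the same local variable ends up at a different address in each. The function names and the `gate` mutex are assumptions of this example, not part of the program above; the mutex only keeps the first thread alive until the second has been created, so both stacks exist at the same time.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Held by the creator so neither thread can exit before both exist. */
static pthread_mutex_t gate = PTHREAD_MUTEX_INITIALIZER;

static void *note_local_address(void *arg)
{
    int local = 0;
    uintptr_t *out = arg;          /* points into the creator's stack */

    *out = (uintptr_t)&local;      /* address inside this thread's stack */

    /* Block until the creator has started both threads. */
    pthread_mutex_lock(&gate);
    pthread_mutex_unlock(&gate);
    return NULL;
}

/* Returns 1 if the two threads placed `local` at different addresses. */
int threads_have_distinct_stacks(void)
{
    uintptr_t addr1 = 0, addr2 = 0;
    pthread_t t1, t2;
    int rc;

    pthread_mutex_lock(&gate);
    rc = pthread_create(&t1, NULL, note_local_address, &addr1);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, note_local_address, &addr2);
    assert(rc == 0);
    pthread_mutex_unlock(&gate);

    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    return addr1 != 0 && addr2 != 0 && addr1 != addr2;
}
```

Both threads write their result into variables on the creating thread's stack, which again relies on the shared address space, while each `local` sits in its own thread's stack region.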
Upvotes: 9
Reputation: 1869
The stacks are in the same address space, but each thread is not supposed to touch the stacks that belong to other threads. Why?
Well, simply because threads could not possibly share the stack, as it would inevitably lead to concurrent modification of the stack, with data corruption and crashes.
Consider the following: Thread 1 pushes something onto the stack and then thread 2 does the same. Now thread 1 pops the stack and will get the data that thread 2 pushed. We have undefined behavior.
With regard to the heap, the allocator synchronizes memory allocation internally, so threads can call it concurrently without corrupting the heap. In a simple implementation only one thread can allocate at any given time, which prevents concurrency problems but can also make memory allocation a bottleneck; many modern allocators reduce this contention with per-thread caches or arenas.
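A small sketch of what that synchronization buys you: several threads call `malloc` and `free` at the same time without any locking of their own. The thread count, round count, and function names are arbitrary choices for this example, and it says nothing about how fast a particular allocator is.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define NTHREADS 4
#define ROUNDS 1000

/* Each thread allocates, fills, checks and frees its own blocks. */
static void *alloc_loop(void *arg)
{
    long *ok = arg;                /* per-thread success counter */

    for (int i = 0; i < ROUNDS; i++) {
        unsigned char *p = malloc(64);
        if (p == NULL)
            continue;
        memset(p, i & 0xff, 64);
        if (p[0] == (i & 0xff) && p[63] == (i & 0xff))
            (*ok)++;
        free(p);
    }
    return NULL;
}

/* Returns the number of allocations that succeeded and held their data. */
long concurrent_malloc_ok(void)
{
    pthread_t tid[NTHREADS];
    long ok[NTHREADS] = {0};
    long total = 0;

    for (int t = 0; t < NTHREADS; t++) {
        int rc = pthread_create(&tid[t], NULL, alloc_loop, &ok[t]);
        assert(rc == 0);
    }
    for (int t = 0; t < NTHREADS; t++) {
        pthread_join(tid[t], NULL);
        total += ok[t];
    }
    return total;
}
```

The application code takes no lock around `malloc`; whatever serialization is needed happens inside the allocator.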
Apart from the thread context, which is unique to each thread, and includes the stack pointer, user data can also be stored on a per-thread basis using something called thread local storage (TLS).
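A minimal sketch of thread local storage, assuming a C11 compiler that supports `_Thread_local` (GCC and Clang do); the variable and function names are invented for this example. Each thread gets its own copy of `counter`, so increments in one thread are invisible to the others.

```c
#include <assert.h>
#include <pthread.h>

/* One independent copy of this variable exists per thread. */
static _Thread_local int counter = 0;

static void *bump(void *arg)
{
    int *result = arg;

    for (int i = 0; i < 5; i++)
        counter++;                 /* touches only this thread's copy */
    *result = counter;
    return NULL;
}

/* Returns 1 if every thread saw its own private counter. */
int tls_counter_is_per_thread(void)
{
    int r1 = 0, r2 = 0;
    pthread_t t1, t2;
    int rc;

    counter = 100;                 /* the calling thread's copy */

    rc = pthread_create(&t1, NULL, bump, &r1);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, bump, &r2);
    assert(rc == 0);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    /* Each new thread started from 0; the caller's value is untouched. */
    return r1 == 5 && r2 == 5 && counter == 100;
}
```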
Upvotes: 0
Reputation: 4192
Old textbooks frequently describe them as such, but thread stacks on modern operating systems aren't some "special" component of the process address space. They are memory mappings, just like any other mappings, created by mmap.
The primordial thread (the first thread in a process) may obtain its stack in a special way, but the rest of the threads have theirs allocated normally by the user-space threading library (often with an mmap call). A stack can usually be manipulated by user space, and sometimes even completely replaced with another memory allocation.
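As a rough sketch of that idea, the code below maps an anonymous region with `mmap`, hands it to a new thread via the POSIX call `pthread_attr_setstack`, and checks that one of the thread's local variables really lives inside that region. The 1 MiB size and the function names are assumptions of this example, and `MAP_ANONYMOUS` is widely available but not strictly part of older POSIX versions.

```c
#define _GNU_SOURCE                /* exposes MAP_ANONYMOUS on some libcs */
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/mman.h>

#define STACK_SIZE (1024 * 1024)   /* assumed size: 1 MiB */

static void *report_local_address(void *arg)
{
    int local = 0;
    uintptr_t *out = arg;

    *out = (uintptr_t)&local;      /* somewhere on this thread's stack */
    return NULL;
}

/* Returns 1 if the thread's local variable lies in our own mapping. */
int thread_runs_on_mmap_stack(void)
{
    void *stack = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    assert(stack != MAP_FAILED);

    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    assert(rc == 0);
    /* Takes the lowest address of the region and its size. */
    rc = pthread_attr_setstack(&attr, stack, STACK_SIZE);
    assert(rc == 0);

    uintptr_t addr = 0;
    pthread_t tid;
    rc = pthread_create(&tid, &attr, report_local_address, &addr);
    assert(rc == 0);
    pthread_join(tid, NULL);
    pthread_attr_destroy(&attr);

    int inside = addr >= (uintptr_t)stack &&
                 addr < (uintptr_t)stack + STACK_SIZE;

    munmap(stack, STACK_SIZE);     /* safe: the thread has been joined */
    return inside;
}
```

From the kernel's point of view this region is just another anonymous mapping; it becomes "a stack" only because the thread's stack pointer is set to point into it.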
Most operating systems don't even check that a thread actually uses the stack it claims to use; here is a description of a recently implemented security mitigation technique that implements such checking to defend against exploits.
Upvotes: 0