Reputation: 232
The man page of sem_init() says "Initializing a semaphore that has already been initialized results in undefined behavior." Why is that and what exactly will happen on Linux?
This doesn't make sense to me, because when you call sem_init() for the first time, the (uninitialized) sem_t could have exactly the same content as an initialized sem_t. If the manual is correct, then sem_init() simply doesn't work.
Upvotes: 4
Views: 2959
Reputation: 215261
On Linux, where semaphores are implemented without any system resources, sem_init just fills in the sem_t structure members, and so nothing bad will happen if it's called more than once. However, in general much worse things could happen.
If the sem_t is just a dummy object containing a pointer to an allocated object (note: this can't work for process-shared semaphores), you would leak memory by calling sem_init multiple times.
Similarly, if sem_t just contained a reference (like a file descriptor number) to a kernel-managed resource, you would leak these kernel resources by calling sem_init more than once.
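Both leak scenarios can be sketched with a toy model. Everything here is hypothetical (toy_sem_t, toy_sem_init, toy_sem_destroy, toy_slots_in_use and the fixed table are made up for illustration and are not how glibc or any real library implements sem_t); the table simply stands in for heap memory or kernel-managed resources.

```c
#include <assert.h>

/* Hypothetical toy, not any real sem_t: the handle is only an index into a
   fixed table that stands in for library- or kernel-managed resources. */
#define TOY_SLOTS 4

typedef struct { int slot; } toy_sem_t;

static struct { int in_use; unsigned value; } toy_table[TOY_SLOTS];

/* Claims a free slot and records its index in the handle. */
int toy_sem_init(toy_sem_t *s, unsigned value)
{
    /* The old contents of *s may be garbage, so init cannot inspect them
       to detect an earlier init; it always claims a fresh slot. */
    for (int i = 0; i < TOY_SLOTS; i++) {
        if (!toy_table[i].in_use) {
            toy_table[i].in_use = 1;
            toy_table[i].value = value;
            s->slot = i;
            return 0;
        }
    }
    return -1; /* table exhausted */
}

/* Releases the slot the handle currently refers to. */
int toy_sem_destroy(toy_sem_t *s)
{
    assert(s->slot >= 0 && s->slot < TOY_SLOTS);
    toy_table[s->slot].in_use = 0;
    return 0;
}

/* Counts slots that are still claimed. */
int toy_slots_in_use(void)
{
    int n = 0;
    for (int i = 0; i < TOY_SLOTS; i++)
        n += toy_table[i].in_use;
    return n;
}
```

In this model, calling toy_sem_init twice on the same handle and then toy_sem_destroy once leaves one slot claimed forever, because the second init overwrote the only reference to the first slot.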
Even worse, if the library implementation maintained a linked list of all instantiated semaphores using prev/next pointers inside the sem_t object (also not possible for process-shared case), you would corrupt this list by calling sem_init on a sem_t that's already part of the list.
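A minimal sketch of that corruption, again with a hypothetical toy type rather than any real sem_t: toy_sem_init pushes the object onto the front of a global list, and toy_list_length (also made up) walks the list with a bound so that a cycle cannot hang the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy: every initialized semaphore is linked into one
   global doubly linked list through pointers stored in the object. */
typedef struct toy_sem {
    struct toy_sem *prev, *next;
    unsigned value;
} toy_sem_t;

static toy_sem_t *toy_head = NULL;

/* Pushes s onto the front of the list, assuming s is not linked yet. */
void toy_sem_init(toy_sem_t *s, unsigned value)
{
    assert(s != NULL);
    s->value = value;
    s->prev = NULL;
    s->next = toy_head;
    if (toy_head != NULL)
        toy_head->prev = s;
    toy_head = s;
}

/* Returns the number of nodes, or -1 if more than `limit` nodes are
   visited, which in this sketch indicates a cycle. */
int toy_list_length(int limit)
{
    int n = 0;
    for (toy_sem_t *p = toy_head; p != NULL; p = p->next)
        if (++n > limit)
            return -1;
    return n;
}
```

If the object at the head of the list is initialized a second time, its next pointer is set to itself, so the walk never reaches the end and every other semaphore becomes unreachable from the list.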
The standard for POSIX semaphores allows a wide variety of implementation types that might be needed to support implementations on different types of systems (e.g. machines without atomic compare-and-swap instruction, bare-metal with no kernel, ...) so it leaves the behavior undefined so as not to impose requirements that might limit implementation choices.
Upvotes: 3
Reputation: 46607
Why is that
Think of it from an API designer's point of view. A semaphore can be seen as an abstract object that is created, used, and eventually disposed of.
Now the task is to map it to C (or any other language). The semaphore implementation will need to acquire resources, possibly resources that are maintained by the operating system. The life cycle above makes a lot of sense.
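As a rough sketch of that life cycle with the POSIX calls, assuming a Linux system with unnamed semaphores available (lifecycle_demo is a made-up name for this illustration):

```c
#include <semaphore.h>

/* Create, use, dispose, and only then initialize again.
   Returns the value of the re-initialized semaphore, or -1 on error. */
int lifecycle_demo(void)
{
    sem_t sem;
    int value = -1;

    if (sem_init(&sem, 0, 1) != 0)     /* create: thread-shared, value 1 */
        return -1;
    if (sem_wait(&sem) != 0)           /* use: 1 -> 0 */
        return -1;
    if (sem_post(&sem) != 0)           /* use: 0 -> 1 */
        return -1;
    if (sem_destroy(&sem) != 0)        /* dispose */
        return -1;

    /* Initializing again is fine here because the semaphore was destroyed. */
    if (sem_init(&sem, 0, 3) != 0)
        return -1;
    if (sem_getvalue(&sem, &value) != 0)
        value = -1;
    sem_destroy(&sem);
    return value;
}
```

The second sem_init is not a double initialization, since sem_destroy ended the first life cycle before it. On older glibc versions this may need to be linked with -pthread.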
The API is finalized, and a first implementation is made. Many corner cases or extra requirements come up soon. For example, whether sem_init can be called multiple times, given that the current implementation makes it trivial to allow it. Another one (maybe) is that it should be possible to select whether semaphores are shared between threads or between processes.
In each case, the API designer will have to weigh the trade-offs:
In this case, it seems allowing for double initialization would get a no by most of these criteria. So the decision is made to not allow it. It probably still works with your particular implementation, compiler, system or even the majority of implementations, compilers, systems.
How to convey that? Well, you call it undefined behaviour in the manual and everybody knows not to do it. People with good working intuition for the environment can easily guess what the behaviour might be. Only a fool would rely on it, though.
the (uninitialized) sem_t could have exactly the same content as an initialized sem_t
That is true. However, let us say the sem_t holds a pointer to a piece of heap memory that sem_init allocates using malloc. It is perfectly possible for an uninitialized sem_t to happen to contain the exact same pointer value, but the resource it corresponds to would not exist.
Upvotes: 2