Cheiron

Reputation: 3736

Implementing mutexes from ground up

Let's say I am writing an OS. And I am no cheater, so I start from the very bottom. After a while you have processes, a pre-emptive scheduler and malloc. And now you need mutexes and semaphores.

The starting point is code that can atomically lock a mutex, or atomically increase or decrease a semaphore. Let's say I have this code.
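
For illustration, here is a minimal sketch of what such a primitive might look like, assuming GCC's __atomic builtins (which compile down to the LDREX/STREX exclusive-access instructions on a Cortex-M4). The type and function names are placeholders, not any real API:

    #include <stdbool.h>
    #include <stdint.h>

    typedef struct {
        uint32_t locked;   /* 0 = free, 1 = taken */
    } mutex_t;

    /* Try to take the mutex exactly once; never blocks. */
    static bool mutex_try_lock(mutex_t *m)
    {
        uint32_t expected = 0;
        return __atomic_compare_exchange_n(&m->locked, &expected, 1,
                                           false, /* strong CAS */
                                           __ATOMIC_ACQUIRE,
                                           __ATOMIC_RELAXED);
    }

    /* Release without waking anyone; waiting comes later. */
    static void mutex_release(mutex_t *m)
    {
        __atomic_store_n(&m->locked, 0, __ATOMIC_RELEASE);
    }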

Then comes the next step: taking the mutex. Let's say we have two processes fighting for one mutex. Process A is first and snatches the mutex before B can touch it. So now B has to wait. My question is specifically about handling this wait phase. I can see the following options:

  1. Keep process B on the scheduler; every time it gets a timeslice it checks whether it can lock the mutex yet. If not, it reschedules and waits for the next timeslice. This method works, except that it wastes CPU time on a process that can't do anything (see the sketch after this list).
  2. Introduce a new process, call him Kernel. He is the all-knowing one and has access to everything. If a process can't lock the mutex it drops into waiting and gets no more timeslices. When a process releases the mutex it notifies the kernel, and the kernel gets a timeslice later. In that timeslice, it searches for processes waiting for the mutex and wakes up the highest-priority one that was waiting for it.
  3. Again, assuming a process falls into WAIT if it can't get the mutex: on mutex release, the releasing process has to go through the list of processes and see which of them were waiting for the mutex. Then it wakes up the one with the highest priority. I dislike this one because I really do not want to give every process write access to scheduler memory. I plan on using an MPU to prevent exactly that, to detect segfaults, etc. Implementing the MPU part correctly will be much harder if I use this method.
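
For reference, here is the sketch of option 1 mentioned above, reusing the hypothetical mutex_try_lock() from earlier and assuming some yield() syscall that gives up the rest of the timeslice:

    /* Option 1: busy-wait across timeslices. Correct, but B burns
     * a full timeslice on every failed attempt. */
    static void mutex_lock_polling(mutex_t *m)
    {
        while (!mutex_try_lock(m)) {
            yield();   /* hypothetical syscall: give up the CPU */
        }
    }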

So that's what I can see. I like 2 the most, but it still feels like a lot of overhead. I would love to hear your input on the question: how do you implement waiting for a locked mutex to be released?

Some more background: I am actually implementing this OS, on a Cortex-M4 CPU. I know I won't be able to beat FreeRTOS etc. It's about the learning experience.

Upvotes: 1

Views: 153

Answers (1)

Martin James

Reputation: 24887

Typically, this is what happens:

Thread A attempts to get the mutex by making a syscall that references it. No other thread has the mutex, so its call succeeds. A runs on...

Thread B attempts to get the mutex by making a syscall that references it. The mutex is taken, so B is taken out of the list of running threads and inserted into a queue in the mutex struct. The kernel returns to some other thread that is ready. Thread B is now just dead code and stack.

Thread A releases the mutex by making a syscall that references it. The mutex's thread queue is checked for entries, and it has one. Thread B is popped off the queue and added to the list of ready threads. The scheduler/dispatcher decides which set of ready threads to run on the cores. Thread B may be in this set, depending upon the scheduler algorithm and the stored state data on all the other threads. Thread B may run immediately on another core than thread A, it may immediately preempt thread A (which would go from running to ready), or it may not run at all because the available cores are in use running other, higher-priority threads.
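
A sketch of what that kernel-side logic might look like in C. All the names here (tcb_t, current_thread, block_current_and_reschedule, make_ready) are illustrative assumptions, not any real kernel's API, and the wait queue is plain FIFO for simplicity; a priority-ordered queue would wake the highest-priority waiter instead:

    typedef struct tcb {
        struct tcb *next;   /* intrusive link: ready list or mutex queue */
        /* ... saved context, priority, etc. ... */
    } tcb_t;

    typedef struct {
        tcb_t *owner;       /* NULL while the mutex is free */
        tcb_t *wait_head;   /* FIFO queue of blocked threads */
        tcb_t *wait_tail;
    } kmutex_t;

    extern tcb_t *current_thread;                   /* hypothetical */
    extern void block_current_and_reschedule(void); /* hypothetical */
    extern void make_ready(tcb_t *t);               /* hypothetical */

    /* Runs in kernel state, entered via syscall. */
    void sys_mutex_lock(kmutex_t *m)
    {
        if (m->owner == NULL) {          /* uncontended: take it and go */
            m->owner = current_thread;
            return;
        }
        /* Contended: append the caller to the mutex's queue... */
        current_thread->next = NULL;
        if (m->wait_tail)
            m->wait_tail->next = current_thread;
        else
            m->wait_head = current_thread;
        m->wait_tail = current_thread;
        /* ...and stop scheduling it until unlock hands it the mutex. */
        block_current_and_reschedule();
    }

    void sys_mutex_unlock(kmutex_t *m)
    {
        tcb_t *t = m->wait_head;         /* the O(1) queue pop */
        if (t == NULL) {
            m->owner = NULL;             /* nobody waiting */
            return;
        }
        m->wait_head = t->next;
        if (m->wait_head == NULL)
            m->wait_tail = NULL;
        m->owner = t;                    /* hand ownership straight over */
        make_ready(t);                   /* may preempt the caller */
    }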

No 'kernel process' is required, just kernel code and data.

No 'timeslices' are required. Timeslices are not relevant to this functionality.

The 'searches for processes waiting for the mutex' step is a queue pop, O(1).

Edit after comments:

'this means that every process has access to all memory related to the scheduler'

Not all the time: the syscall switches to kernel state and back again on return. There is no need for the thread in user state to have unrestricted access to kernel memory.
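
On a Cortex-M4 specifically, that state change is the SVC instruction: the unprivileged thread executes svc, the SVCall exception handler runs privileged and touches the kernel data, then returns. A hypothetical user-side stub (the SVC number and kmutex_t are assumptions carried over from the sketch above):

    /* User-state stub: trap into the kernel to lock the mutex.
     * The SVCall handler picks the argument up from r0. */
    static inline void mutex_lock_user(kmutex_t *m)
    {
        register kmutex_t *arg __asm__("r0") = m;
        __asm__ volatile ("svc #1" : "+r"(arg) : : "memory");
    }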

'You need mutexes or semaphores to start waiting for mutexes or semaphores (to ensure the thread safety of the queue)'

No. Typically, moving threads between containers means removing and inserting only a pointer to the thread control block. Such an operation is very fast, and the very occasional contention can be handled with an atomic spinlock, an interrupt-disable, or more complex serialization/protection mechanisms involving inter-core comms drivers.
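
On a single-core Cortex-M4 the interrupt-disable variant is the usual choice. A sketch assuming the CMSIS intrinsics __get_PRIMASK(), __disable_irq() and __set_PRIMASK() (provided by your device's CMSIS headers); the pointer-sized queue operation from above would sit between the two calls:

    #include <stdint.h>

    static inline uint32_t critical_enter(void)
    {
        uint32_t primask = __get_PRIMASK();  /* remember current mask */
        __disable_irq();
        return primask;
    }

    static inline void critical_exit(uint32_t primask)
    {
        __set_PRIMASK(primask);  /* re-enable only if it was enabled */
    }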

Upvotes: 2
