Reputation: 49

Implementation of a semaphore in C

I am working on a simple implementation of a semaphore in C, and while my implementation is working (as a binary semaphore), I have a question regarding its validity.

My concern stems from my definition of my wait function:

void Wait(int semid) {
    char *shmPtr;

    shmPtr = shmat(semid, NULL, 0);

    if(shmPtr == (void *) -1){
        printf("Could not attach to semaphore...\n");
        exit(1);
    } 

    //Wait for the value in shared memory to 
    //equal 0, then set it equal to 1,
    //detach and return
    while( (*shmPtr) != 0);

    (*shmPtr) = 1;

    if(shmdt(shmPtr) < 0) {
        printf("Cannot detach from semaphore...");
    }

    return;

}

My question lies with the while( (*shmPtr) != 0) loop. Let's say we have 2 processes that are waiting. A third process changes the value of the semaphore to 0 (ignore the fact that this is a binary implementation of a semaphore).

My concern is that if process 1 evaluates the condition of the while loop to be false, and then the CPU context switches to process 2 before setting the semaphore value equal to 1, both processes will enter the critical section.

Is there a better way to implement the waiting functionality? I've seen a lot of people using pthread_cond_wait, but that uses a mutex which essentially defeats the purpose of the semaphore implementation.

Thank you

EDIT: Adding Wikipedia's implementation of TestAndSet in C to reference in the comments

#define LOCKED 1

int TestAndSet(int* lockPtr) {
    int oldValue;

    oldValue = *lockPtr;
    *lockPtr = LOCKED;
    // -- End of atomic segment --

    return oldValue;
}

Upvotes: 1

Answers (3)

justin

Reputation: 53

just making my reply 5 years later :)...

as the many comments here have stated, you need an idea of atomicity, but there's a "catch 22" dependency here. semaphores are solutions for synchronization problems, but the correct implementation of a semaphore requires a synchronization solution (e.g. "atomic" CPU instructions, disable interrupts/scheduler).

so imo using pthread_cond_wait doesn't defeat the purpose of you implementing a semaphore for a learning exercise. you need to operate on some ground truths with operations that guarantee certain behavior. an api/library that can do that isn't fundamentally different to the "proper" methods as aforementioned, if all you're looking to do is to implement how a semaphore data structure is handled between two processes.
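
to make that concrete, here's a minimal sketch of a counting semaphore built on a pthread mutex plus condition variable. the names (csem_t, csem_init, csem_wait, csem_post) are made up for illustration, and as written it only covers threads inside one process; sharing it between processes would, as far as i know, also need the struct placed in shared memory and the mutex/condvar created with PTHREAD_PROCESS_SHARED attributes.

```c
#include <pthread.h>

/* Hypothetical type and function names, for illustration only. */
typedef struct {
    pthread_mutex_t lock;     /* protects count */
    pthread_cond_t  nonzero;  /* signalled whenever count is incremented */
    unsigned        count;    /* number of available "permits" */
} csem_t;

/* Returns 0 on success, -1 if the mutex or condvar could not be created. */
int csem_init(csem_t *s, unsigned initial) {
    s->count = initial;
    if (pthread_mutex_init(&s->lock, NULL) != 0)
        return -1;
    if (pthread_cond_init(&s->nonzero, NULL) != 0) {
        pthread_mutex_destroy(&s->lock);
        return -1;
    }
    return 0;
}

void csem_wait(csem_t *s) {
    pthread_mutex_lock(&s->lock);
    /* Re-check in a loop: condition variables may wake up spuriously. */
    while (s->count == 0)
        pthread_cond_wait(&s->nonzero, &s->lock);
    s->count--;  /* test and decrement happen under the same lock */
    pthread_mutex_unlock(&s->lock);
}

void csem_post(csem_t *s) {
    pthread_mutex_lock(&s->lock);
    s->count++;
    pthread_cond_signal(&s->nonzero);  /* wake one waiter, if any */
    pthread_mutex_unlock(&s->lock);
}
```

the point is that the check and the update of count both happen while the mutex is held, so the window described in the question (test passes, context switch, other process also passes) can't occur.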

so i think the more interesting question is how does an OS manage these semaphores? or to widen the scope, how does any software maintain the coherency of a resource that is shared between multiple processes?

for your implementation, you could instead create a separate "semaphore handler" thread where its sole responsibility is to manage a global list of semaphores that protect a common, shared resource. other threads that want to hold one of these semaphores must first "request" this handler thread, and will spin until it's able to hold it (or time out-exit).

non-coincidentally, this emulates the architectural split between what is the "kernel" and what is "userspace" for an operating system.

Upvotes: 0

Gabriel Staples

Reputation: 53115

I don't know how to do it on a PC (if you find out, please do come back and post your own answer), but what you need is what I call "atomic access guards." In other words, you need a mechanism to force atomic access to a given variable for a set amount of time. What that means is that you essentially force all threads/processes to pause for a moment, while 1 and only 1 thread gets access to a variable. It then does its thing with the variable (ex: reads, modifies, writes to it), then re-enables the other threads when done. In this way, you guarantee atomic access to that variable by that thread during those operations. Now, all race conditions are solved.

In C, this is highly architecture-dependent I believe, and relies on C functions written in inline assembly code via something like the __asm keyword, and/or it relies on setting bits in specific hardware registers to certain values in order to enforce certain behavior. Example of using the __asm keyword: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.100748_0606_00_en/ddx1471430827125.html.

Sample of inline assembly code wrapped up in a C function:

int add(int i, int j)
{
  int res = 0;
  __asm ("ADD %[result], %[input_i], %[input_j]"
    : [result] "=r" (res)
    : [input_i] "r" (i), [input_j] "r" (j)
  );
  return res;
}

Once you have your "atomic access guard" functions to give you atomic access, you can do something like the following:

// atomic access guard ON

// Do whatever you want here: it's all atomic now!
// Read, modify, write, etc.
// - CAUTION: NO OTHER THREADS CAN RUN DURING THIS TIME, SO GET OUT OF THIS QUICKLY

// atomic access guard OFF

On single-core systems, such as the microcontrollers I'm familiar with (STM32 and AVR/Arduino), atomic access is ensured simply by turning off all interrupts. Ex: on ARM-core STM32 microcontrollers, do it as follows by using the necessary CMSIS (ARM-provided) functions:

// Read PRIMASK register, check interrupt status before you disable them 
// Returns 0 if they are enabled, or non-zero if disabled 
uint32_t prim = __get_PRIMASK();

// Disable interrupts 
__disable_irq();

// Do some stuff here which can not be interrupted 

// Enable interrupts back, but only if they were previously enabled (prevents nesting problems)
if (!prim) 
{
    __enable_irq();
}

Source: https://stm32f4-discovery.net/2015/06/how-to-properly-enabledisable-interrupts-in-arm-cortex-m/

If using FreeRTOS (Free Real-Time Operating System), do the following:

taskENTER_CRITICAL(); // This supports nested calls, and ends up calling `portDISABLE_INTERRUPTS()` anyway.

// do your atomic access here

taskEXIT_CRITICAL();

See: https://www.freertos.org/taskENTER_CRITICAL_taskEXIT_CRITICAL.html

If using AVR-core microcontrollers, such as the ATmega328 (basic Arduino Uno processor), do the following:

uint8_t SREG_bak = SREG; //save global interrupt state
cli(); //clear (disable) interrupts

//atomic variable access guaranteed here

SREG = SREG_bak; //restore interrupt state

See my answer here: https://stackoverflow.com/a/39693278/4561887

So now you need to go do some research (and please do post back), on how to enforce such a principle in C on your given operating system and/or architecture, and/or via some special calls to your kernel or something. This may even require that you write your own inline assembly to do this stuff, then wrap it up in a C function to call.

I look forward to seeing how you accomplish it.

Update: I just did some digging in the FreeRTOS source code to see how they disable interrupts, and here's what I found for the ARM Cortex M3 processor, such as STM32 microcontrollers, when using the GCC compiler:

From "FreeRTOSv9.0.0/FreeRTOS/Source/portable/GCC/ARM_CM3/portmacro.h":

#define portDISABLE_INTERRUPTS() vPortRaiseBASEPRI()

portFORCE_INLINE static void vPortRaiseBASEPRI( void )
{
uint32_t ulNewBASEPRI;

    __asm volatile
    (
        "   mov %0, %1                                              \n" \
        "   msr basepri, %0                                         \n" \
        "   isb                                                     \n" \
        "   dsb                                                     \n" \
        :"=r" (ulNewBASEPRI) : "i" ( configMAX_SYSCALL_INTERRUPT_PRIORITY )
    );
}

Upvotes: 0

Bernd Elkemann

Reputation: 23560

As Tom commented, for semaphores to be correct you need an atomic test-and-set or a compare-and-swap (compare-exchange).
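
For threads inside a single process, a minimal sketch of such a compare-and-swap based wait could look like the following. It assumes a C11 compiler with <stdatomic.h>; the names spin_sem_t, spin_sem_wait, spin_sem_trywait and spin_sem_signal are hypothetical, and the 0 = free / 1 = taken convention is taken from the question.

```c
#include <stdatomic.h>

/* Hypothetical names; 0 = free, 1 = taken, as in the question. */
typedef struct {
    atomic_int value;
} spin_sem_t;

void spin_sem_init(spin_sem_t *s) {
    atomic_init(&s->value, 0);
}

/* Spin until the value is atomically changed from 0 to 1. */
void spin_sem_wait(spin_sem_t *s) {
    int expected = 0;
    while (!atomic_compare_exchange_weak(&s->value, &expected, 1)) {
        /* A failed exchange stores the current value in expected; reset it. */
        expected = 0;
    }
}

/* Non-blocking variant: returns 1 if acquired, 0 if already taken. */
int spin_sem_trywait(spin_sem_t *s) {
    int expected = 0;
    return atomic_compare_exchange_strong(&s->value, &expected, 1);
}

void spin_sem_signal(spin_sem_t *s) {
    atomic_store(&s->value, 0);
}
```

The test (is it 0?) and the set (make it 1) are one indivisible operation here, so two waiters cannot both observe 0 and proceed. Note that this is a busy-wait, so it burns CPU while waiting.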

But that is not all. Since you are using shared memory (i.e., memory accessed by multiple processes), the atomic operations provided by C11 (Link) are not sufficient.

Since you are calling POSIX functions anyway, I assume that you have access to POSIX semaphores.

"POSIX semaphores allow processes and threads to synchronize their actions." (Link)

Upvotes: 1
