burnedWood

Reputation: 93

Bizarre issue with POSIX semaphores

There are two semaphores and two processes. "p1sem" is the semaphore that process1 waits on and process2 posts; "p2sem" works the other way around. Each process creates the semaphores with an initial value of 0 if they don't already exist.

process1 runs two sessions, each of which opens and then deletes both semaphores. process2 runs only one session, but is supposed to be launched a second time to complete the second session of process1. This is what the code looks like:

process2:

#include <semaphore.h>
#include <fcntl.h>

int main(){
    sem_t *waitSem = sem_open("p2sem", O_CREAT, 0666, 0);
    sem_t *postSem = sem_open("p1sem", O_CREAT, 0666, 0);

    // tic for a cycle
    sem_post(postSem);
    sem_wait(waitSem);

    // close and exit
    sem_close(waitSem);
    sem_close(postSem);
    sem_unlink("p2sem");
    sem_unlink("p1sem");
}

process1:

#include <semaphore.h>
#include <fcntl.h>

int main() {
    for(int i = 0; i < 2; i++) {

        // create sems
        sem_t *waitSem    = sem_open("p1sem", O_CREAT, 0666, 0);
        sem_t *postSem   = sem_open("p2sem", O_CREAT, 0666, 0);

        // tic for a cycle
        sem_post(postSem);
        sem_wait(waitSem);

        // close and exit
        sem_close(waitSem);
        sem_close(postSem);
        sem_unlink("p2sem");
        sem_unlink("p1sem");
    }
}

For process1 to continue, process2 has to post p1sem. The same goes for process2, which can only continue once process1 posts p2sem. It might sound like some kind of deadlock, but the posts happen before the waits, so that shouldn't be a problem.

When process1 starts first and then I call process2 twice, things work fine.

However, if process2 starts first, the first session works fine, but when process2 is launched again, both processes hang. As far as I understand, there is no reason for that to happen. When I debug, the semaphore values at the moment of the hang are opposite in the two processes (i.e., p1sem has value 0 in process1 but value 1 in process2, and likewise for p2sem). I have attached a picture of my gdb session: look at the __align field for the value of the semaphores. I think the large positive number represents -1, i.e. minus the number of processes waiting on that semaphore, at least according to http://man7.org/linux/man-pages/man3/sem_getvalue.3.html. You can also experiment by calling sem_post and sem_wait from gdb, but whichever process you call them from, the semaphore values seen by the other process don't change.
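As an alternative to reading __align in gdb, the value can be queried through sem_getvalue(). The following is only a minimal sketch: read_sem_value is a hypothetical helper, not part of the code above, and it creates the semaphore with value 0 if the name does not exist yet. The linked man page notes that on Linux sem_getvalue() reports 0, not a negative count, while processes are blocked on the semaphore.

```c
#include <fcntl.h>
#include <semaphore.h>
#include <stdio.h>

/* Hypothetical helper: returns the current value of the named semaphore,
 * or -1 on error. Creates the semaphore with value 0 if it does not exist. */
int read_sem_value(const char *name) {
    sem_t *sem = sem_open(name, O_CREAT, 0666, 0);
    if (sem == SEM_FAILED) {
        perror("sem_open");
        return -1;
    }

    int value = -1;
    if (sem_getvalue(sem, &value) == -1) {
        perror("sem_getvalue");
        value = -1;
    }

    sem_close(sem);
    return value;
}
```

Calling read_sem_value("p1sem") from a separate small program shows the value of whatever semaphore is currently linked under that name, which may differ from the one a hung process still has open.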

/dev/shm contains sem.p1sem and sem.p2sem. If people want to test this, then before each new run you have to remove those semaphores with rm sem.p1sem sem.p2sem
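The same cleanup can also be done from C with sem_unlink() instead of rm. This is a rough sketch; remove_stale_sems is a made-up name, and it treats "no such semaphore" (ENOENT) as success so it can be run unconditionally before a test.

```c
#include <errno.h>
#include <semaphore.h>
#include <stdio.h>

/* Hypothetical helper: unlinks the two semaphore names used in the question.
 * Returns 0 if neither name exists afterwards, -1 on any other error. */
int remove_stale_sems(void) {
    const char *names[] = {"p1sem", "p2sem"};
    int rc = 0;

    for (int i = 0; i < 2; i++) {
        /* A missing name is fine: there is simply nothing to clean up. */
        if (sem_unlink(names[i]) == -1 && errno != ENOENT) {
            perror(names[i]);
            rc = -1;
        }
    }
    return rc;
}
```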

Does anyone understand this?

[Image: screenshot of the gdb session showing the semaphore values]

Upvotes: 1

Views: 589

Answers (1)

Jean-Baptiste Yunès

Reputation: 36401

If you launch "proc2" and then "proc1", the sequence looks like this:

  • "proc1" is scheduled, so it unlinks the semaphores at the end of its first session and then creates new ones for its second session,
  • then "proc2" is scheduled and unlinks the newly created semaphores,
  • thus a new instantiation of "proc2" creates new semaphores unrelated to the previous ones. This is specified by sem_open:

If a process makes repeated calls to sem_open(), with the same name argument, the same descriptor is returned for each successful call, unless sem_unlink() has been called on the semaphore in the interim.
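The quoted rule can be observed inside a single process. The sketch below is only an illustration: unlink_then_reopen and its parameters are hypothetical. It opens a name, unlinks it, opens the same name again, and posts only to the second handle.

```c
#include <fcntl.h>
#include <semaphore.h>

/* Hypothetical demo: stores the value seen through the handle opened before
 * sem_unlink() in *old_value and the value seen through the handle opened
 * afterwards in *new_value. Returns 0 on success, -1 on error. */
int unlink_then_reopen(const char *name, int *old_value, int *new_value) {
    sem_unlink(name);                      /* start from a clean name */

    sem_t *old_sem = sem_open(name, O_CREAT, 0666, 0);
    if (old_sem == SEM_FAILED) return -1;

    sem_unlink(name);                      /* name removed; old_sem stays usable */

    sem_t *new_sem = sem_open(name, O_CREAT, 0666, 0);
    if (new_sem == SEM_FAILED) {
        sem_close(old_sem);
        return -1;
    }

    sem_post(new_sem);                     /* post only to the new semaphore */
    sem_getvalue(old_sem, old_value);
    sem_getvalue(new_sem, new_value);

    sem_close(old_sem);
    sem_close(new_sem);
    sem_unlink(name);
    return 0;
}
```

If the two handles referred to the same semaphore, both values would be 1. Because the name was unlinked in between, the post is visible only through the second handle, which matches the "opposite values" seen in gdb.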

"proc1" is then blocked waiting on a semaphore that is no longer accessible to any other process.
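One possible way around this, sketched under the assumption that "proc1" outlives every run of "proc2" and that no stale semaphores are left from an earlier run, is to keep sem_unlink() out of the cycle entirely. run_cycles is a hypothetical helper, and for brevity it does not retry sem_wait() when it is interrupted by a signal.

```c
#include <fcntl.h>
#include <semaphore.h>

/* Hypothetical helper: opens both semaphores once, runs `rounds` post/wait
 * cycles, and closes them. It deliberately never calls sem_unlink().
 * Returns 0 on success, -1 on error. */
int run_cycles(const char *wait_name, const char *post_name, int rounds) {
    sem_t *wait_sem = sem_open(wait_name, O_CREAT, 0666, 0);
    if (wait_sem == SEM_FAILED) return -1;

    sem_t *post_sem = sem_open(post_name, O_CREAT, 0666, 0);
    if (post_sem == SEM_FAILED) {
        sem_close(wait_sem);
        return -1;
    }

    int rc = 0;
    for (int i = 0; i < rounds && rc == 0; i++) {
        /* Post first, then wait, as in the original code. */
        if (sem_post(post_sem) == -1 || sem_wait(wait_sem) == -1) rc = -1;
    }

    sem_close(wait_sem);
    sem_close(post_sem);
    return rc;
}
```

With this, "proc1" would call run_cycles("p1sem", "p2sem", 2) and unlink both names once after it returns, while each run of "proc2" would call run_cycles("p2sem", "p1sem", 1) and never unlink. Both sides then keep talking to the same pair of semaphores regardless of which one starts first.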

Upvotes: 2
