pthread_mutex_lock_full assertion failed error

Question

I am writing a pthread application. The parent thread creates mutexes that are shared across the threads. For some reason, the application fails with the following assertion:

../nptl/pthread_mutex_lock.c:428: __pthread_mutex_lock_full: Assertion `e != ESRCH || !robust' failed.

The application captures high-speed network traffic using a packet_mmap based approach, with multiple threads that are each associated with a socket. I am not sure why this happens. It occurs during testing, and I cannot reproduce it every time. I have searched a lot but could not find the cause. Thanks for your help.

The error appears to be related to a file read. When the line that reads the file is commented out, the error does not occur. This is the line:

fread(this->bit_array, sizeof(int), this->m , fp);

where bit_array is a dynamically allocated integer array and m is the number of elements in the array.
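Since commenting out this call makes the failure disappear, one thing worth ruling out is that the read writes past the end of bit_array and damages neighbouring memory. Nothing above confirms that this is what happens; the following is only a minimal sketch under that assumption. The helper name read_ints_checked and its capacity parameter are hypothetical, not part of the original code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads exactly `count` ints from `fp` into `dst`.
   `capacity` is the number of ints that `dst` was allocated for.
   Returns 0 on success, -1 if the read would overflow or comes up short. */
int read_ints_checked(int *dst, size_t capacity, size_t count, FILE *fp)
{
    /* Passing a null buffer or stream is a programming error. */
    assert(dst != NULL && fp != NULL);

    if (count > capacity)
        return -1;              /* refuse to write past the end of dst */

    size_t got = fread(dst, sizeof(int), count, fp);

    /* A short read means end of file or an I/O error. */
    return got == count ? 0 : -1;
}
```

The original call would then become something like read_ints_checked(this->bit_array, allocated_count, this->m, fp), where allocated_count stands for whatever element count was used when the array was allocated.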

Thanks.

RKou · Accepted Answer

In glibc 2.31, the assertion comes from the following part of the pthread_mutex_lock() source code:

    oldval = atomic_compare_and_exchange_val_acq (&mutex->__data.__lock,
                                                  newval, 0);

    if (oldval != 0)
      {
        /* The mutex is locked.  The kernel will now take care of
           everything.  */
        int private = (robust
                       ? PTHREAD_ROBUST_MUTEX_PSHARED (mutex)
                       : PTHREAD_MUTEX_PSHARED (mutex));
        int e = futex_lock_pi ((unsigned int *) &mutex->__data.__lock,
                               NULL, private);
        if (e == ESRCH || e == EDEADLK)
          {
            assert (e != EDEADLK
                    || (kind != PTHREAD_MUTEX_ERRORCHECK_NP
                        && kind != PTHREAD_MUTEX_RECURSIVE_NP));
            /* ESRCH can happen only for non-robust PI mutexes where
               the owner of the lock died.  */
            assert (e != ESRCH || !robust);

            /* Delay the thread indefinitely.  */
            while (1)
              lll_timedwait (&(int){0}, 0, 0 /* ignored */, NULL,
                             private);
          }

        oldval = mutex->__data.__lock;

        assert (robust || (oldval & FUTEX_OWNER_DIED) == 0);
      }

In the above code, atomic_compare_and_exchange_val_acq() atomically reads the current value of the lock word. The value is different from 0, meaning that the mutex is already locked, so the thread calls futex_lock_pi() to let the kernel handle the wait. That call returned ESRCH: the kernel could not find the thread recorded as the owner, because the owner died while still holding the mutex and the mutex was not released when that thread ended. The assertion is then triggered.

If you can modify the source code, you may need to add the "robust" attribute to the mutex (pthread_mutexattr_setrobust()) so that the system releases it automatically when the owner dies. This is error prone, though: the corresponding critical section may not have reached a sane point when the owner died, so it may leave some work unfinished.
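As a rough illustration of the robust attribute, here is a minimal sketch in which a thread ends while holding the lock and the next locker recovers it. The names demo_mtx, die_holding_lock and robust_mutex_demo are made up for this example, and it uses a plain robust mutex without priority inheritance, which may differ from the mutexes in your application.

```c
#define _POSIX_C_SOURCE 200809L  /* for the robust mutex functions */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t demo_mtx;   /* hypothetical mutex for this demo */

/* Owner thread: takes the lock and ends without releasing it. */
static void *die_holding_lock(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&demo_mtx);
    return NULL;
}

/* Returns the result of the first lock attempt made after the owner
   thread has ended. */
int robust_mutex_demo(void)
{
    pthread_mutexattr_t attr;
    pthread_t owner;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    assert(rc == 0);
    rc = pthread_mutex_init(&demo_mtx, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    rc = pthread_create(&owner, NULL, die_holding_lock, NULL);
    assert(rc == 0);
    pthread_join(owner, NULL);

    rc = pthread_mutex_lock(&demo_mtx);
    if (rc == EOWNERDEAD) {
        /* The lock is held by the caller now, but the data it protects
           may be half-updated. Repair that data here, then mark the
           mutex usable again. */
        pthread_mutex_consistent(&demo_mtx);
    }
    if (rc == 0 || rc == EOWNERDEAD)
        pthread_mutex_unlock(&demo_mtx);

    pthread_mutex_destroy(&demo_mtx);
    return rc;
}
```

The repair step before pthread_mutex_consistent() is the hard part, since only the application knows how to bring the protected data back to a sane state.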

So, it would be better to find out why a thread may die without unlocking the mutex. Either the thread terminated because of an error, or you forgot to release the mutex on one of its termination paths.
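One way to guard against a forgotten unlock on a termination path is a cleanup handler, which also runs when the thread calls pthread_exit() or is cancelled. This is only a sketch: work_mtx, unlock_on_exit, worker and lock_is_free_after_worker are hypothetical names, and the fail flag stands in for whatever error condition ends your thread early.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t work_mtx = PTHREAD_MUTEX_INITIALIZER;

/* Cleanup handler: releases the mutex passed as argument. */
static void unlock_on_exit(void *arg)
{
    pthread_mutex_unlock((pthread_mutex_t *)arg);
}

/* Worker: locks the mutex, then either ends early or finishes normally.
   The mutex is released in both cases. */
static void *worker(void *arg)
{
    int fail = *(int *)arg;

    pthread_mutex_lock(&work_mtx);
    pthread_cleanup_push(unlock_on_exit, &work_mtx);

    if (fail)
        pthread_exit(NULL);     /* early exit: the handler still runs */

    /* ... critical section work would go here ... */

    pthread_cleanup_pop(1);     /* normal path: pop and run the handler */
    return NULL;
}

/* Runs one worker and returns 1 if the mutex is free afterwards. */
int lock_is_free_after_worker(int fail)
{
    pthread_t t;
    int rc = pthread_create(&t, NULL, worker, &fail);
    assert(rc == 0);
    pthread_join(t, NULL);

    rc = pthread_mutex_trylock(&work_mtx);
    if (rc == 0)
        pthread_mutex_unlock(&work_mtx);
    return rc == 0;
}
```

Note that pthread_cleanup_push() and pthread_cleanup_pop() must appear as a pair in the same block of the same function. A handler like this does not help if the thread is killed by a crash that takes down the whole process.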
