Baris

Reputation: 43

How to lock a variable number of mutexes in a dynamic container?

Similar questions:

Calling std::lock () with std::vector <mutex*>
Locking an array of std::mutex using a std::lock_guard array

Summary: The 2nd one shows locking statically sized arrays, and the 1st one refers to Boost's boost::lock(), which accepts two iterators. However, boost::lock, like std::lock, does not provide RAII on its own. I am looking for a solution like the one below

std::lock(m1, m2);
std::lock_guard l1{m1, std::adopt_lock};
std::lock_guard l2{m2, std::adopt_lock};

or, if ownership transfer is needed,

std::unique_lock<std::mutex> l1{m1, std::defer_lock};
std::unique_lock<std::mutex> l2{m2, std::defer_lock};
std::lock(l1, l2);

or with std::scoped_lock

std::scoped_lock lk{ m1, m2 };

However, the mutexes that need to be locked belong to a number of objects that is known only at runtime. Hence, I cannot use std::lock or std::scoped_lock.

std::vector<std::mutex> mutexes{};
mutexes.push_back(std::move(m1));           // ERROR: std::mutex is neither copyable nor movable
mutexes.push_back(std::move(m2));           // ERROR
std::scoped_lock<std::mutex> lk{ mutexes }; // ERROR: scoped_lock needs each mutex (and their count) at compile time

Project:

I have a project with a DAG structure at its center. The DAG contains many-to-one and one-to-many relationships and (as it should) supports traversal in both the ancestor and the descendant directions. In order to prevent deadlocks, I try to implement all algorithms in only one direction (descendant) as much as possible. However, at some points the code inevitably traverses in the ancestor direction. Actually, as I will explain with an example below, always locking in the same direction does not solve the deadlock problem completely. The only solution is locking the DAG node together with the related nodes (e.g. the descendant DAG nodes if the traversal is in the descendant direction). For example, in the descendant direction:

DAGNode DN{};
std::vector<DAGNode*> descendants{ DN.get_descendants() }; // pointers, since a DAGNode owns a mutex and cannot be copied

// Assume we have two descendants
auto& des1{ *descendants[0] };
auto& des2{ *descendants[1] };
std::scoped_lock lk{ DN.m, des1.m, des2.m };

The code for the ancestor direction is similar.

Because the DAG supports both many-to-one and one-to-many relations, it strongly requires that the locking be performed together, as above. To me, a solution like defining a total order does not look reasonable precisely because of those many-to-one and one-to-many relations. For example, consider the nodes A1, A2, B1, B2, B3, where A1 is the ancestor of B1 and B2, and A2 is the ancestor of B2 and B3. So when A1 is involved in an algorithm, (A1, B1, B2) must be locked, while (A2, B2, B3) must be locked for A2. Hence, B2 must be locked in both traversals, the one starting from A1 and the one starting from A2. This is a result of the many-to-one relation, but one-to-many relations cause similar situations. As this simple example shows, traversals in the same direction may still deadlock (because of B2). Hence, the only solution is locking all the mutexes at once, as illustrated below.
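To make this concrete, here is a minimal sketch of what locking each group atomically means for the example above (Node and the variable names are placeholders, not my actual classes):

#include <mutex>
#include <thread>

struct Node { std::mutex m; };

int main() {
    Node A1, A2, B1, B2, B3;

    // Each algorithm acquires its whole group at once, so the overlap on B2
    // cannot produce a circular wait.
    std::thread t1([&] { std::scoped_lock lk{ A1.m, B1.m, B2.m }; /* work on A1, B1, B2 */ });
    std::thread t2([&] { std::scoped_lock lk{ A2.m, B2.m, B3.m }; /* work on A2, B2, B3 */ });

    t1.join();
    t2.join();
}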

Edit: This project is a geometrical application. I can fix the number of ancestor nodes as below in order to get compile-time information about the number of ancestors. For example, a line object is formed by two points, so the DAG node corresponding to a line is a DAGNode<2> object.

template<size_t N>
class DAGNode {
    std::mutex _m;
    std::array<IDAGNode*, N> _ancestors;
    ...
};

class Line {
    DAGNode<2> _DN;
    ...
};

However, the descendants are not fixed for any object. For example, a line can be used by many surfaces or other objects.
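So, assuming DAGNode implements the IDAGNode interface, the node ends up with a fixed-size ancestor array but a runtime-sized descendant container, roughly like this (a sketch, not my exact class):

#include <array>
#include <cstddef>
#include <mutex>
#include <vector>

class IDAGNode {}; // node interface, details omitted

template<std::size_t N>
class DAGNode : public IDAGNode {
    std::mutex _m;
    std::array<IDAGNode*, N> _ancestors;  // fixed at compile time (e.g. 2 for a line)
    std::vector<IDAGNode*> _descendants;  // size known only at runtime
};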

In summary, my questions are:

  1. C++ does not support locking a dynamic number of mutexes (?)
  2. Boost does, but not in an RAII manner (?)
  3. If the first two are correct, how can I manage the many-to-one and one-to-many relationships of the DAG in multithreaded code while preventing deadlocks?
  4. (Although it does not seem reasonable to me) are there any other methods/approaches, like applying a total order?

Edit: The following questions are off-topic, as commented by @JaMIT:

  5. I have reviewed well-known books like C++ Concurrency in Action, Effective Modern C++, etc. Can you suggest other sources?
  6. Is there any open-source library I can use for this situation?

Thanks a lot.

Upvotes: 3

Views: 121

Answers (2)

Baris

Reputation: 43

First of all, thanks @sehe for the suggestions, especially the one about coarser-grained locking. I performed a couple of case studies on locking multiple DAG nodes. Beyond a certain number of nodes, the synchronized execution requires manual tuning with tools like this_thread::sleep_for, which can easily fail under an untested condition. So I ended up locking the individual DAG nodes (fine-grained case) with the hard-coded function below if the number of nodes is less than a constant value (e.g. 8); otherwise, I lock the DAG itself (coarse-grained case).

#include <mutex>
#include <stdexcept>
#include <vector>

// Returns -1 if every lock was acquired (the convention of std::try_lock),
// otherwise the 0-based index of the first lock that could not be acquired.
template <class Lockable>
int lock_mutexes_in_dynamic_container__helper(std::vector<std::unique_lock<Lockable>>& ls) {
    if (ls.empty())
        return -1;
    else if (ls.size() == 1)
        return ls[0].try_lock() ? -1 : 0;
    else if (ls.size() == 2)
        return std::try_lock(ls[0], ls[1]);
    else if (ls.size() == 3)
        return std::try_lock(ls[0], ls[1], ls[2]);
    else if (ls.size() == 4)
        return std::try_lock(ls[0], ls[1], ls[2], ls[3]);
    else if (ls.size() == 5)
        return std::try_lock(ls[0], ls[1], ls[2], ls[3], ls[4]);
    else if (ls.size() == 6)
        return std::try_lock(ls[0], ls[1], ls[2], ls[3], ls[4], ls[5]);
    else if (ls.size() == 7)
        return std::try_lock(ls[0], ls[1], ls[2], ls[3], ls[4], ls[5], ls[6]);
    else if (ls.size() == 8)
        return std::try_lock(ls[0], ls[1], ls[2], ls[3], ls[4], ls[5], ls[6], ls[7]);
    else if (ls.size() == 9)
        return std::try_lock(ls[0], ls[1], ls[2], ls[3], ls[4], ls[5], ls[6], ls[7], ls[8]);
    else if (ls.size() == 10)
        return std::try_lock(ls[0], ls[1], ls[2], ls[3], ls[4], ls[5], ls[6], ls[7], ls[8], ls[9]);
    else if (ls.size() == 11)
        return std::try_lock(ls[0], ls[1], ls[2], ls[3], ls[4], ls[5], ls[6], ls[7], ls[8], ls[9], ls[10]);
    else if (ls.size() == 12)
        return std::try_lock(ls[0], ls[1], ls[2], ls[3], ls[4], ls[5], ls[6], ls[7], ls[8], ls[9], ls[10], ls[11]);
    else if (ls.size() == 13)
        return std::try_lock(ls[0], ls[1], ls[2], ls[3], ls[4], ls[5], ls[6], ls[7], ls[8], ls[9], ls[10], ls[11], ls[12]);
    else if (ls.size() == 14)
        return std::try_lock(ls[0], ls[1], ls[2], ls[3], ls[4], ls[5], ls[6], ls[7], ls[8], ls[9], ls[10], ls[11], ls[12], ls[13]);
    else if (ls.size() == 15)
        return std::try_lock(ls[0], ls[1], ls[2], ls[3], ls[4], ls[5], ls[6], ls[7], ls[8], ls[9], ls[10], ls[11], ls[12], ls[13], ls[14]);
    else if (ls.size() == 16)
        return std::try_lock(ls[0], ls[1], ls[2], ls[3], ls[4], ls[5], ls[6], ls[7], ls[8], ls[9], ls[10], ls[11], ls[12], ls[13], ls[14], ls[15]);
    else
        throw std::invalid_argument("too many locks: fall back to the coarse-grained DAG lock");
}

I used std::try_lock instead of std::lock because I need to block the thread (i.e. with this_thread::sleep_for) for a certain time; the sleep time varies among the algorithms. This way, I try to escape from deadlock-like situations. A wrapper function calls the function above, putting the thread to sleep if the locking fails and otherwise returning ownership of the unique_locks, roughly as sketched below.
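A minimal sketch of that wrapper, built on the hard-coded helper above; the function name, the sleep-time parameter and the retry policy are simplified placeholders, not my exact code.

#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

template <class Lockable>
std::vector<std::unique_lock<Lockable>>
lock_mutexes_in_dynamic_container(std::vector<Lockable*> const& mutexes,
                                  std::chrono::milliseconds sleep_time)
{
    std::vector<std::unique_lock<Lockable>> ls;
    ls.reserve(mutexes.size());
    for (auto* m : mutexes)
        ls.emplace_back(*m, std::defer_lock); // associate without locking yet

    // std::try_lock releases the locks it already acquired when one fails,
    // so the whole group can simply be retried after a sleep.
    while (lock_mutexes_in_dynamic_container__helper(ls) != -1)
        std::this_thread::sleep_for(sleep_time);

    return ls; // the caller adopts the locked unique_locks (RAII)
}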

However, this locking scheme is not actually fine-grained. As I explained in the question, when a DAG node is modified, its ancestors and descendants also need to be updated, so they must be locked together with the node being updated. Besides, in my geometry application the nodes hold state information (e.g. up-to-date) which is crucial for the application. Because of this state, an update on a DAG node must be propagated through all the descendant nodes in the sub-DAG rooted at that node. Hence, many of the DAG's procedures perform a traversal in the descendant direction. No matter which approach I follow (DFS, BFS or some special algorithm), the number of DAG nodes to be locked grows very large very quickly, which forces the lock onto the DAG itself.

When I started working on this project, I had questions about the performance/memory trade-offs. Persistent data structures generally require more space due to the immutability of the data. In the case of a DAG, immutability means copying the whole DAG on every action because of the bidirectional relationship (ancestor and descendant) between the nodes. For a singly-linked list, a partial copy can be limited to the nodes between the head and the node being modified; the other nodes are not affected by the modification. However, in the case of a doubly-linked list, all nodes need to be copied, and the DAG behaves like a doubly-linked list. In other words, when a DAG node needs a modification, all nodes need to be copied, which is inefficient in terms of memory. I could not find an efficient way to eliminate the need to copy the whole DAG, in the way that, for example, bitmapped vector tries (BVTs) solve the copy problem for an immutable vector. Hence, although the persistent DAG is superior in terms of concurrency (lock-free), it consumes too much space.

The copy problem can be solved by traditional methods: wrapping the DAG nodes in a class that stores the versions of the node. The wrapper also stores the version ids (a vector of unsigned chars) of the ancestors and descendants, and the DAG stores the wrapper objects instead of the nodes. The wrapper holds a shared pointer to the node. With this approach, only the nodes being modified need to be copied. For example, three nodes require an update if an ancestor of a node is modified: the node being modified, the old ancestor and the new ancestor. For the other (non-modified) nodes, the state data and the relations (i.e. ancestor/descendant) do not change; only the versions of the related nodes are incremented, which can be done through the ancestor/descendant version-id members of the wrapper. Hence, for the modified nodes the nodes themselves are copied, while for the non-modified nodes only the wrappers are copied. As the wrappers are copied, they still point to the same nodes, so ownership of the nodes is expressed with a shared pointer. With this approach, copying DAG nodes is turned into copying vectors of unsigned chars, which is easier and consumes less space. However, the traceability of the code decreases, a new reference count is introduced, and strong exception safety must be achieved for the version ids.
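A rough sketch of that wrapper (the names are only for illustration; IDAGNode is the node interface mentioned in the question):

#include <memory>
#include <vector>

class IDAGNode; // the node interface from the question

class DAGNodeWrapper {
    std::shared_ptr<IDAGNode> _node;                 // re-pointed only when the node itself is modified
    unsigned char _version{};                        // version of the wrapped node
    std::vector<unsigned char> _ancestor_versions;   // version ids of the ancestor nodes
    std::vector<unsigned char> _descendant_versions; // version ids of the descendant nodes
    // copying the wrapper copies only the version vectors and the shared pointer
};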

In summary, for my DAG I could not find a way to implement the lock-based approach with fine granularity. On the FP side, I could not find a method to convert the full copy of the DAG into a kind of partial copy (as BVTs do).

My decision is to use the immutable DAG despite the high memory usage. The copy constructors of the DAG and the DAG nodes are obviously crucial in this approach. I will use contiguous memory (most probably a vector) for the nodes in order to speed up the copy operation. Additionally, I will use multi-tasking to copy the vector that stores the nodes.
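For illustration, a rough sketch of the kind of parallel copy I mean, assuming the nodes live in a std::vector of copyable, default-constructible node objects (the function name and the task count are arbitrary):

#include <algorithm>
#include <cstddef>
#include <future>
#include <vector>

// Chunked parallel copy of the node storage.
template <class Node>
std::vector<Node> copy_nodes_in_parallel(std::vector<Node> const& src, std::size_t n_tasks = 4)
{
    std::vector<Node> dst(src.size());
    std::size_t const chunk = (src.size() + n_tasks - 1) / n_tasks;
    std::vector<std::future<void>> tasks;
    for (std::size_t begin = 0; begin < src.size(); begin += chunk) {
        std::size_t const end = std::min(begin + chunk, src.size());
        tasks.push_back(std::async(std::launch::async, [&src, &dst, begin, end] {
            std::copy(src.begin() + begin, src.begin() + end, dst.begin() + begin);
        }));
    }
    for (auto& t : tasks)
        t.get(); // rethrows any exception from the copy tasks
    return dst;
}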

Upvotes: 0

sehe

Reputation: 393613

  1. correct

  2. not as a one-stop API, but you can:

    #include <boost/thread.hpp>
    #include <mutex>
    
    struct bulk_guard {
        std::list<std::unique_lock<std::mutex>> guards_;
    
        template <typename Lockables> explicit bulk_guard(Lockables& lockables) {
            boost::lock(lockables.begin(), lockables.end());
            for (auto& lockable : lockables)
                guards_.emplace_back(lockable, std::adopt_lock);
        }
    };
    
    #include <array>
    int main() {
        std::array<std::mutex, 10> mxs;
    
        bulk_guard bulk(mxs);
    }
    
  3. It's a hard problem that I can usually avoid by locking more coarse-grained groups (like aggregates in DDD, but I recognize these entities in all kinds of software; usually it's a unit of distribution (e.g. "tasks") or ownership (e.g. a filesystem, directory, process/thread, etc.)).

  4. I believe that already underlies the boost::lock algorithm. That is reasonable. On most implementations (including POSIX), lockable implementations are not movable, which means the C++ object identity ("address", colloquially) can serve as a total order; see the sketch below.
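    For illustration, a minimal sketch of that idea (not Boost's implementation, just the address-based total order): sort the mutex addresses with std::less, then lock in that order and keep RAII ownership in unique_locks.

    #include <algorithm>
    #include <functional>
    #include <mutex>
    #include <vector>

    std::vector<std::unique_lock<std::mutex>> lock_in_address_order(std::vector<std::mutex*> mxs) {
        // std::less yields a total order over pointers, even to unrelated objects
        std::sort(mxs.begin(), mxs.end(), std::less<std::mutex*>{});
        mxs.erase(std::unique(mxs.begin(), mxs.end()), mxs.end()); // never lock the same mutex twice

        std::vector<std::unique_lock<std::mutex>> guards;
        guards.reserve(mxs.size());
        for (auto* m : mxs)
            guards.emplace_back(*m); // locks here; released when the returned guards are destroyed
        return guards;
    }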

  5. I also reviewed that book. I think it is one of the best resources, although it seems to center a lot on task-based concurrency, and you seem to be coming at it more from an actor-based concurrency paradigm. I'm not big on that (I think it tends to be antithetical to performance, which reminds me of the many failed promises of generalized actor-oriented concurrency in e.g. CORBA and COM+, complete with MTS).

    Something tells me you can probably find more on that perspective in Eiffel/Smalltalk literature, but there's probably a reason it isn't as popular anymore.

Upvotes: 2
