Sevaro

Reputation: 47

How do I avoid large number of locks?

Let arr be declared as follows.

std::vector<std::vector<int>> arr(10'000'000);

My code in serial is something like

for (const auto [x, y] : XY) {
  arr[x].push_back(y);
}

I use OpenMP and define an array of locks as follows.

std::vector<omp_lock_t> locks(10'000'000);
for (auto& l : locks) omp_init_lock(&l);  // each lock must be initialized before use

With locks, I use

#pragma omp parallel for schedule(dynamic)
for (const auto [x, y] : XY) {
  omp_set_lock(&locks[x]);
  arr[x].push_back(y);
  omp_unset_lock(&locks[x]);
}    

Such an approach works on my machine (Windows Subsystem for Linux). But then I found the following post

c/c++ maximum number of mutexes allowed in Linux

which raises the concern that I may have used too many locks, and that my program may not work on other platforms (those with a limit on the number of locks allowed).

I wonder if there is another way that gives me the same locking granularity as above but does not have an upper limit on the number of locks. Can I use something like compare-and-swap?

Upvotes: 1

Views: 459

Answers (2)

Daniel Langr

Reputation: 23517

According to the OpenMP documentation, omp_lock_t represents "a simple lock". I suppose it is some kind of spinlock. You, therefore, don't need to worry about limits on the number of mutexes. (Mutexes require some interaction with the kernel / scheduler, which is likely the reason their count is restricted.)

omp_lock_t uses a single memory location, which is fine for a spinlock. For instance, this live demo shows that omp_lock_t takes only 4 bytes, while std::mutex takes 40 bytes: https://godbolt.org/z/sr845h8hx.
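
If you want to verify this on your own toolchain, a minimal check along the lines of that demo (the exact numbers depend on the compiler, standard library, and OpenMP runtime; compile with OpenMP enabled, e.g. -fopenmp):

    #include <cstdio>
    #include <mutex>
    #include <omp.h>

    int main() {
        // Compare the memory footprint of an OpenMP lock and a std::mutex.
        std::printf("sizeof(omp_lock_t) = %zu\n", sizeof(omp_lock_t));
        std::printf("sizeof(std::mutex) = %zu\n", sizeof(std::mutex));
    }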

Note that a spinlock can be implemented with a single byte, or even a single bit (BTS on x86_64). Therefore, you could compress your locks even further in memory. However, I'd be careful with such an approach since it can cause significant false sharing.

I think your solution is fine. Since the operation in the critical section is very fast, and the probability that multiple threads access the same element at the same moment is likely low, a spinlock is a suitable choice in my opinion.

EDIT

As pointed out in the comments, the term simple lock does not necessarily mean that the lock is actually a spinlock; it's up to the OpenMP implementation to decide which kind of lock it uses. If you want to be sure that you are using a true spinlock (which I still believe is suitable here), you can either write your own or use a library that provides one. (Note that an efficient spinlock implementation is not that simple. For instance, it needs to spin on a load/read operation rather than on exchange/test-and-set, which naive implementations often do.)
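
For illustration, a minimal sketch of such a spinlock in standard C++, using one byte of state per lock and spinning on a plain load (the class name is mine; a production version would add a pause/yield hint in the spin loop and padding against the false sharing mentioned above):

    #include <atomic>
    #include <cstdint>

    // Minimal test-and-test-and-set (TTAS) spinlock: one byte of state per lock.
    class SpinLock {
        std::atomic<std::uint8_t> state_{0};
    public:
        void lock() {
            for (;;) {
                // Try to grab the lock with a single atomic exchange.
                if (state_.exchange(1, std::memory_order_acquire) == 0)
                    return;
                // Otherwise spin on plain loads, which keeps the cache line
                // in shared state instead of bouncing it between cores.
                while (state_.load(std::memory_order_relaxed) != 0)
                    ; // a pause/yield hint could go here
            }
        }
        void unlock() { state_.store(0, std::memory_order_release); }
    };

At one byte per lock, 10,000,000 of these occupy about 10 MB.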

Upvotes: 2

Jim Cownie

Reputation: 2869

A couple of possible solutions

  1. If you are on a modern Intel processor that supports Transactional Synchronization Extensions (TSX), you could use a single, speculative lock. (See [Two Small OpenMP Features You May Have Missed][1], which shows a very similar use case.) A minimal sketch of this follows after point 2's code below.

  2. You could use a smaller array of locks and use a hash to map from the array index into the lock-set. Something like this (untested and uncompiled!)

    // Allocate and initialise the lock array, wherever you have that!
    enum { ln2NumLocks = 10,
           NumLocks = 1 << ln2NumLocks };
    omp_lock_t locks[NumLocks];
    for (int i = 0; i < NumLocks; i++)
        omp_init_lock(&locks[i]);

    // Hash from array index to lock index. What you have here
    // probably doesn't matter too much. It doesn't need to be
    // a crypto-hash!
    int mapToLockIdx(int arrayIdx) {
        return ((arrayIdx >> ln2NumLocks) ^ arrayIdx) & (NumLocks - 1);
    }

    // Your loop is then something like this
    #pragma omp parallel for schedule(dynamic)
    for (const auto [x, y] : XY) {
        auto lock = &locks[mapToLockIdx(x)];
        omp_set_lock(lock);
        arr[x].push_back(y);
        omp_unset_lock(lock);
    }
    


  [1]: https://www.openmp.org/wp-content/uploads/SC18-BoothTalks-Cownie.pdf
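
Regarding point 1, a minimal sketch (also untested) of the speculative-lock variant using the OpenMP 4.5 lock-hint API could look like the fragment below. omp_lock_hint_speculative is only a hint: whether hardware transactional memory is actually used depends on the OpenMP runtime and the CPU, and the code remains correct either way.

    // One lock for the whole array, initialised with the speculative hint.
    omp_lock_t lock;
    omp_init_lock_with_hint(&lock, omp_lock_hint_speculative);

    #pragma omp parallel for schedule(dynamic)
    for (const auto [x, y] : XY) {
        omp_set_lock(&lock);    // typically elided speculatively under TSX
        arr[x].push_back(y);
        omp_unset_lock(&lock);
    }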

Upvotes: 0
