Reputation: 45424

How to lock-free update index to maximum over a range?

The problem is best explained with some simple code.

struct foo
{
    static constexpr auto N=8;
    double data[N];            //  initialised at construction
    int max;                   //  index to maximum: data[max] is largest value

    // if value < data[index]:
    //   – update data[index] = value
    //   - update max
    void update(int index, double value)
    {
        if(value >= data[index])
            return;
        data[index] = value;
        if(index==max)         // max unaffected if index!=max
            for(index=0; index!=N; ++index)
                if(data[index] > data[max])
                    max = index;
    }
};

Now, I want to make foo::update() thread safe, i.e. allow concurrent calls from different threads, where participating threads cannot call with the same index. One way is to add a mutex or simple spinlock (the contention can be presumed low) to foo:

struct foo
{
    static constexpr auto N=8;
    std::atomic_flag lock = ATOMIC_FLAG_INIT;
    double data[N];
    int max;

    // index is unique to each thread
    // if value < data[index]:
    //   – update data[index] = value
    //   - update max
    void update(int index, double value)
    {
        if(value >= data[index])
            return;
        while(lock.test_and_set(std::memory_order_acquire));  // aquire spinlock
          data[index] = value;
          if(index==max)
              for(index=0; index!=N; ++index)
                  if(data[index] > data[max])
                      max = index;
        lock.clear(std::memory_order_release);                // release spinlock
    }
};

However, how can I implement foo::update() lock-free (you may consider data and max to be atomic)?

NOTE: this is a simpler version of the original post, without relation to the tree structure.

Upvotes: 1

Answers (1)

tony

Reputation: 3887

So, IIUC, the array only gets new values if they are lower than what is already there (and I won't worry about how the initial values got there), and if the current max is lowered, find a new max.

Some of this is not too hard. But some is... harder.

So the "if value < data[index] then write data" needs to be in a CAS-loop. Something like:

auto oldval = data[index].load(memory_order_relaxed);
do
   if (value <= oldval) return;
while ( ! data[index].compare_exchange_weak(oldval, value) );
// (note that oldval is updated to data[index] each time comp-exch fails)

So now data[index] has the new lower value. Awesome. And relatively easy. Now about max.

First question - Is it OK for max to ever be wrong? Because it may currently be wrong (in our scenario, where we update data[index] before handling max).

Can it be wrong in some ways, not others? ie let's say our data is just two entries:

data[2] = { 3, 7 };

And we want to do update(1, 2) ie change the 7 to a 2. (And thus update max!)

Scenario A: set data first, then max:

data[1] = 2;
pause(); // ie scheduler pauses this thread
max = 0; // data[0]==3 is now max

If another thread comes in at pause(), then data[max] is wrong: 2 instead of 3 :-(

Scenario B: set max first:

max = 0; // it will be "shortly"?
pause();
data[1] = 2;

Now a thread could read data[max] as 3 while 7 is still in data. But 7 is going to become 2 "soon", so is that OK? Is it "less wrong" than scenario A? Depends on usage? (ie if the important thing is "which is max" we have that right. But if max was the only important thing, why store all the data at all?)

It seems odd to ask "is wrong OK", but in some lock-free situations it actually is a valid question. To me B has a chance to be OK for some uses, whereas A doesn't.

Also, and this is important:

data[max] is always wrong, even in the perfect algorithm

By this I mean you need to realize that data[max], as soon as you read it is already out of date - if you are living in a lockfree world. Because it may have changed as soon as you read it. (Also because data and max change independently. But even if you had a function getMaxValue() it would be out of date as soon as it returns.)

Is that OK? Because, if not, you obviously need a lock. But if it is OK, we can use it to our advantage - we might be able to return an answer which we know is somewhat incorrect / out-of-date, but no more incorrect than what you could tell from the outside.

If neither scenario is OK, then you must update max and data[index] at the same time. Which is hard since they don't fit into a lock-free sized chunk.

So instead you can add a layer of indirection:

struct DataAndMax { double data[N]; int max; };
DataAndMax * ptr;

Whenever you need to update max, you need to make a whole new DataAndMax struct (ie allocate a new one), somehow fill it all out nicely, and then atomically swap ptr to the new struct. And if some other thread changed ptr while you were preparing the new data, then you would need to start over, since you need their new data in your data.

And if ptr has changed twice, then it may look like it hasn't changed, when it really has: Let's say ptr currently has value 0xA000 and a 2nd thread allocates a new DataAndStruct at 0xB000, and sets ptr to 0xB000, and frees the old one at 0xA000. Now yet another thread (3rd) comes in, allocates yet another DataAndStruct - and low and behold the allocator gives you back 0xA000 (why not, it was just freed!). So this 3rd thread sets ptr to 0xA000.

And this all happens while you are trying to set ptr to 0xC000. All you see is that ptr was 0xA000, and later still is 0xA000, so you think it (and its data) hasn't changed. Yet it has - it went from 0xA000 to 0xB000 (when you weren't looking) back to 0xA000 - the address is the same, but the data is different. This is called the ABA problem.

Now, if you knew the max number of threads, you could pre-allocate:

DataAndMax dataBufs[NUM_THREADS];
DataAndMax * ptr; // current DataAndMax

And then never allocate/delete and never have ABA problems. Or there's other ways to avoid ABA.

Let's go back, and think about how we are going to - no matter what - return a max value that is potentially out of date. Can we use that?

So you come in, and first check if the index you are about to write to is the important one or not:

if (index != max) {
    // we are not touching max,
    // so nothing fancy here!
    data[index] value;
    return;
}
// else do it the hard way:
//...

But this is already wrong. After the if and before the set, max may have changed. Does every set need to update max!?!?

So, if N is small, you could just linear search for max. It may be wrong if someone makes an update while searching, but remember - it could also be wrong if someone makes an update right after searching or right after "insert magic here". So searching, other than possibly being slow, is as correct as any algorithm. You will find something that was, for a moment, max. If N == 8, I would use searching. Definitely. You can search 8 entries using memory_order_relaxed possibly, and that will be faster than trying to maintain anything using stronger atomic ops.

I have other ideas:

More bookkeeping? Store maxValue separately?

double data[N];
double maxValue;
int indexOfMax;


bool wasMax = false;
if (index == indexOfMax)
    wasMax = true;
data[index] = value; 
if (wasMax || index == indexOfMax)
    findMax(&indexOfMax, &maxValue); // linear search

That would probably need a CAS-loop somewhere. Still linear search, but maybe less often?

Maybe you need extra data at each entry? Not sure yet.

Hmmmm.

This is not simple. Thus if there is a correct algorithm (and I think there is, within some constraints) it is unlikely to not have bugs. ie a correct algorithm might actually exist, but you don't find it - what you instead find is an algorithm that just looks correct.

Upvotes: 1

How to lock-free update index to maximum over a range?

Answers (1)

Related Questions