Reputation: 13075
Why does std::atomic version of the code still fail? (callback changes when refCount is non-zero, and doStop (was) false.
I had a multi-threaded piece of code, which wasn't behaving correctly, and tried to fix it.
However my fix, remained unreliable, but I don't understand why.
Original code Thread A (Use callback) :-
if( !IsUpdating ) {
IncrementReference();
if( !IsUpdating && GetCallBackPointer() ) {
cb = GetCallBackPointer();
cb();
}
DecrementReference();
}
Original code Thread B - (modify callback)
IsUpdating = true;
while( ReferencesUsingCallback ) {
Sleep( 10 );
}
callback = newValue;
IsUpdating = false;
The idea being that if ReferencesUsingCallback is not 0, then the modify callback thread will not be allowed to change the value of the callback.
There is "protection" for the race condition by doing test, AddRef and test. Hoping that the test could not fail again.
Unfortunately the code didn't work, and my assumption was that this was due to some cache coherency issue.
I have since used std::atomic to try and deliver the test case again, and it still can fail. The std::atomic version is the 'AtomicLockedData'. The platform is Windows on Intel i7
Complete code :-
#include <thread>
#include <mutex>
#include <atomic>
#include <chrono>
#define FAILED_LIMIT 5
#define LOOP_SIZE 1000000000LL
void Function()
{
}
typedef void (*CallbackFunction)(void);
int FailedCount;
__int64 counter = 0;
class lockedData {
public:
lockedData() : value(nullptr), value2(nullptr)
{
doStop = 0;
usageCount = 0;
}
long usageCount;
long doStop;
volatile CallbackFunction value;
void * value2;
int Use()
{
return usageCount++;
}
int UnUse()
{
return usageCount--;
}
int Usage() const
{
return usageCount;
}
void SetStop()
{
doStop = 1;
}
void UnStop()
{
doStop = 0;
}
bool IsStopped()
{
return doStop != 0;
}
void StoreData(CallbackFunction pData )
{
value = pData;
}
CallbackFunction ReadData()
{
return value;
}
};
class AtomicLockedData {
public:
AtomicLockedData() : value(nullptr), value2(nullptr)
{
doStop = false;
usageCount = 0;
}
std::atomic<int> usageCount;
std::atomic<bool> doStop;
std::atomic<CallbackFunction> value;
void * value2;
int Use()
{
return usageCount++;
}
int UnUse()
{
return usageCount--;
}
int Usage() const
{
return usageCount.load();
}
void SetStop()
{
doStop.store( true);
}
void UnStop()
{
doStop.store( false );
}
bool IsStopped()
{
return doStop.load() == true;
}
void StoreData(CallbackFunction pData)
{
value.store( pData );
}
CallbackFunction ReadData()
{
return value.load();
}
};
template < class lockData >
int UpdateState( lockData & aLock, CallbackFunction pData, void * pData2 )
{
aLock.SetStop();
while(aLock.Usage() > 0 )
std::this_thread::sleep_for( std::chrono::milliseconds(10) );
aLock.value = pData;
aLock.UnStop();
return 0;
}
template <class lockData >
int ReadState( lockData * aLock, int fib)
{
if (!aLock->IsStopped()) {
aLock->Use();
CallbackFunction val = aLock->ReadData();
if (!aLock->IsStopped() && val) {
fibonacci(fib);
CallbackFunction pTest = const_cast<CallbackFunction>( aLock->ReadData());
if (pTest == 0) {
FailedCount++; // shouldn't be able to change value if use count is non-zero
printf("Failed\n");
}
else {
pTest();
}
}
aLock->UnUse();
}
return 0;
}
unsigned __int64 fibonacci(size_t n)
{
if (n < 3) return 1;
return fibonacci(n - 1) + fibonacci(n - 2);
}
template< class lockData > void ThreadA( lockData * lkData , int fib )
{
void * pData2 = new char[200];
while (FailedCount < FAILED_LIMIT) {
UpdateState< lockData>(*lkData, Function, pData2);
fibonacci(fib);
UpdateState< lockData>(*lkData, NULL, NULL);
fibonacci(fib);
}
}
template< class lockData > void ThreadB(lockData & lkData, int fib )
{
while (FailedCount < FAILED_LIMIT && counter < LOOP_SIZE) {
ReadState(&lkData, fib);
ReadState(&lkData, fib);
ReadState(&lkData, fib);
ReadState(&lkData, fib);
ReadState(&lkData, fib);
ReadState(&lkData, fib);
ReadState(&lkData, fib);
ReadState(&lkData, fib);
ReadState(&lkData, fib);
ReadState(&lkData, fib);
counter++;
}
}
template <class lockType >
void TestLock()
{
counter = 0;
FailedCount = 0;
lockType lk;
std::thread thr(ThreadA<lockType>, &lk, 3);
ThreadB(lk, 3);
thr.join();
printf("Failed %d times for %I64d iterations", FailedCount, counter);
}
int main(int argc, char ** argv)
{
TestLock< lockedData >();
TestLock< AtomicLockedData >();
return 0;
}
Upvotes: 1
Views: 430
Reputation: 13075
Thank you for the other answers, but they don't seem to answer the question.
The code posted was buggy, because it read the value of the callback function before the second check.
template <class lockData >
int ReadState( lockData * aLock, int fib)
{
if (!aLock->IsStopped()) {
aLock->Use();
CallbackFunction val;
if (!aLock->IsStopped() && (val = aLock->ReadData() ) ) {
fibonacci(fib);
CallbackFunction pTest = const_cast<CallbackFunction>( aLock->ReadData());
if (pTest == 0) {
FailedCount++; // shouldn't be able to change value if use count is non-zero
printf("Failed\n");
}
else {
pTest();
}
}
aLock->UnUse();
}
return 0;
}
The atomic version of this code doesn't fail, but the non-atomic version does. Adding volatile to the non-atomic version (fix data races?) doesn't help.
The code is intended to ensure that the read doesn't use the callback value if it is currently being updated. It was hoped to do this wuthout locks, favoring the ReadState solution, and so adding locks would have worked but be pointless.
Before executing ReadState, we check that the update is not run. I am not sure this helps.
After incrementing the usage count, the IsStopped
is checked. That ensures that UpdateState will be blocked from further operation until the usage is 0.
So the remaining race is when ReadState calls increment, after UpdateState has tested the usage count.
The fix is to ensure val is read after checking IsStopped
. The correct way round, if the UpdateState
thread had missed the increment, and was still executing, then the ReadState
would bail and try later. Otherwise, we know IsStopped
is false, and UpdateState
will not make changes until usage is 0, then we can read the value, and it will not be changed.
Reading val
before IsStopped
created a problem, where value could be changed between the read and the second test (pTest), and IsStopped set to 0, which caused the failures.
Upvotes: 0
Reputation: 114539
The lines
if (!aLock->IsStopped()) {
aLock->Use();
look strange.
After IsStopped()
returned false
it's possible that the state will move to stopped before you call Use()
(thus you may get to Use()
a stopped lock).
The solution is having the return value of Use
to communicate failure in case it's a forbidden operation instead of doing a check followed by a Use()
.
Upvotes: 2
Reputation: 146940
Hoping that the test could not fail again.
It most assuredly can, and will if you have a sufficient test. Double-checked locking is well-known to be unsafe.
Your usage of the atomics is not, in as of itself, atomic. Therefore, your operation is not atomic.
Or to put it another way, atomicity is not like const
, it does not propagate implicitly. You can't simply have a safe operation by writing it with atomic variables. You must write an entirely atomic operation as well as simply using atomic variables under the hood.
If you are not up to the task of writing an atomic algorithm based on atomic primitives, you must use a mutex to make it atomic.
Furthermore, your non-atomic code is not only unsafe in terms of concurrency, but undefined behaviour as well, as there are data races there. It's also undefined behaviour because the variables you're using are non-volatile, so they can be assumed by the compiler to not change externally and the compiler may optimize based on this fact. Throw out this code immediately; it is unusable.
Upvotes: 1