Reputation: 23052
My thread synchronisation "style" seems to be throwing helgrind off. Here's a simple program that reproduces the problem:
#include <thread>
#include <atomic>
#include <iostream>
int main()
{
    std::atomic<bool> isReady(false);
    int i = 1;
    std::thread t([&isReady, &i]()
    {
        i = 2;
        isReady = true;
    });
    while (!isReady)
        std::this_thread::yield();
    i = 3;
    t.join();
    std::cout << i;
    return 0;
}
As far as I can tell, the above is a perfectly well-formed program. However, when I run helgrind using the following command I get errors:
valgrind --tool=helgrind ./a.out
The output of this is:
==6247== Helgrind, a thread error detector
==6247== Copyright (C) 2007-2015, and GNU GPL'd, by OpenWorks LLP et al.
==6247== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==6247== Command: ./a.out
==6247==
==6247== ---Thread-Announcement------------------------------------------
==6247==
==6247== Thread #1 is the program's root thread
==6247==
==6247== ---Thread-Announcement------------------------------------------
==6247==
==6247== Thread #2 was created
==6247== at 0x56FBB1E: clone (clone.S:74)
==6247== by 0x4E46189: create_thread (createthread.c:102)
==6247== by 0x4E47EC3: pthread_create@@GLIBC_2.2.5 (pthread_create.c:679)
==6247== by 0x4C34BB7: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==6247== by 0x5115DC2: std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>, void (*)()) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
==6247== by 0x4010EF: std::thread::thread<main::{lambda()#1}>(main::{lambda()#1}&&) (in /home/arman/a.out)
==6247== by 0x400F93: main (in /home/arman/a.out)
==6247==
==6247== ----------------------------------------------------------------
==6247==
==6247== Possible data race during read of size 1 at 0xFFF00035B by thread #1
==6247== Locks held: none
==6247== at 0x4022C3: std::atomic<bool>::operator bool() const (in /home/arman/a.out)
==6247== by 0x400F9F: main (in /home/arman/a.out)
==6247==
==6247== This conflicts with a previous write of size 1 by thread #2
==6247== Locks held: none
==6247== at 0x40233D: std::__atomic_base<bool>::operator=(bool) (in /home/arman/a.out)
==6247== by 0x40228E: std::atomic<bool>::operator=(bool) (in /home/arman/a.out)
==6247== by 0x400F4A: main::{lambda()#1}::operator()() const (in /home/arman/a.out)
==6247== by 0x40204D: void std::_Bind_simple<main::{lambda()#1} ()>::_M_invoke<>(std::_Index_tuple<>) (in /home/arman/a.out)
==6247== by 0x401FA3: std::_Bind_simple<main::{lambda()#1} ()>::operator()() (in /home/arman/a.out)
==6247== by 0x401F33: std::thread::_Impl<std::_Bind_simple<main::{lambda()#1} ()> >::_M_run() (in /home/arman/a.out)
==6247== by 0x5115C7F: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
==6247== by 0x4C34DB6: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==6247== Address 0xfff00035b is on thread #1's stack
==6247== in frame #1, created by main (???:)
==6247==
==6247== ----------------------------------------------------------------
==6247==
==6247== Possible data race during write of size 4 at 0xFFF00035C by thread #1
==6247== Locks held: none
==6247== at 0x400FAE: main (in /home/arman/a.out)
==6247==
==6247== This conflicts with a previous write of size 4 by thread #2
==6247== Locks held: none
==6247== at 0x400F35: main::{lambda()#1}::operator()() const (in /home/arman/a.out)
==6247== by 0x40204D: void std::_Bind_simple<main::{lambda()#1} ()>::_M_invoke<>(std::_Index_tuple<>) (in /home/arman/a.out)
==6247== by 0x401FA3: std::_Bind_simple<main::{lambda()#1} ()>::operator()() (in /home/arman/a.out)
==6247== by 0x401F33: std::thread::_Impl<std::_Bind_simple<main::{lambda()#1} ()> >::_M_run() (in /home/arman/a.out)
==6247== by 0x5115C7F: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
==6247== by 0x4C34DB6: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==6247== by 0x4E476F9: start_thread (pthread_create.c:333)
==6247== by 0x56FBB5C: clone (clone.S:109)
==6247== Address 0xfff00035c is on thread #1's stack
==6247== in frame #0, created by main (???:)
==6247==
3==6247==
==6247== For counts of detected and suppressed errors, rerun with: -v
==6247== Use --history-level=approx or =none to gain increased speed, at
==6247== the cost of reduced accuracy of conflicting-access information
==6247== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
Helgrind seems to be picking up my while loop as a race condition. How am I supposed to form this program to avoid helgrind throwing out false positives?
Upvotes: 9
Views: 7292
Reputation: 171273
The problem is that Helgrind doesn't understand GCC's atomic builtins, so doesn't realise that they are race-free and impose ordering on the program.
There are ways to annotate your code to help Helgrind, see http://valgrind.org/docs/manual/hg-manual.html#hg-manual.effective-use (but I'm not sure how to use them here; I already tried what sbabbi shows and it only solves part of the problem).
I would avoid yielding in a busy loop anyway; it's a poor form of synchronization. It could be done with a condition variable like so:
#include <thread>
#include <mutex>
#include <iostream>
#include <condition_variable>
int main()
{
    bool isReady(false);  // plain bool: every access is made with mx held
    std::mutex mx;
    std::condition_variable cv;
    int i = 1;
    std::thread t([&isReady, &i, &mx, &cv]()
    {
        i = 2;
        std::unique_lock<std::mutex> lock(mx);
        isReady = true;
        cv.notify_one();
    });
    {
        // Block until the worker has set isReady; the predicate guards against spurious wakeups.
        std::unique_lock<std::mutex> lock(mx);
        cv.wait(lock, [&] { return isReady; });
    }
    i = 3;
    t.join();
    std::cout << i;
    return 0;
}
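The same mutex-plus-condition-variable handshake can be wrapped in a small reusable helper. What follows is only a sketch: OneShotEvent and run_handshake are hypothetical names that do not appear in the original code, and the helper has not been checked under Helgrind here.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper: a flag that can be set once and waited on.
class OneShotEvent
{
public:
    // Mark the event as set and wake any waiting threads.
    void set()
    {
        std::lock_guard<std::mutex> lock(mx_);
        ready_ = true;
        cv_.notify_all();
    }

    // Block until set() has been called; returns at once if it already was.
    void wait()
    {
        std::unique_lock<std::mutex> lock(mx_);
        cv_.wait(lock, [this] { return ready_; });
    }

private:
    std::mutex mx_;
    std::condition_variable cv_;
    bool ready_ = false;
};

// Same sequence as the program above: the worker writes 2, then the caller writes 3.
int run_handshake()
{
    OneShotEvent ready;
    int i = 1;
    std::thread t([&ready, &i]()
    {
        i = 2;
        ready.set();
    });
    ready.wait();  // the mutex orders the worker's write to i before this point
    i = 3;
    t.join();      // join before ready and i go out of scope
    return i;
}
```

Calling run_handshake() from main and printing the result mirrors the original program. Keeping the flag, mutex and condition variable in one object makes it harder to touch the flag without holding the lock.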
Upvotes: 10
Reputation: 11181
Valgrind cannot know that the while (!isReady) loop (along with the default memory_order_seq_cst ordering on the store and load) implies that the statement i = 2 happens before i = 3.
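The ordering that the loop relies on can be made visible in the source by spelling out the memory orders. This is a minimal sketch, assuming that a release store paired with an acquire load is all this particular handshake needs; run_release_acquire is a hypothetical name. It documents the intent for human readers and, as far as I know, is not expected to change what Helgrind reports.

```cpp
#include <atomic>
#include <thread>

// Hypothetical wrapper around the original program's logic, returning the final value of i.
int run_release_acquire()
{
    std::atomic<bool> isReady(false);
    int i = 1;
    std::thread t([&isReady, &i]()
    {
        i = 2;
        // Release: the write to i above is published together with the flag.
        isReady.store(true, std::memory_order_release);
    });
    // Acquire: once this load observes true, the worker's write to i is visible here.
    while (!isReady.load(std::memory_order_acquire))
        std::this_thread::yield();
    i = 3;  // ordered after i = 2 by the release/acquire pair
    t.join();
    return i;
}
```

Calling run_release_acquire() from main and printing the result reproduces the behaviour of the original program.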
You have to explicitly state this invariant by using the valgrind ANNOTATE_HAPPENS_BEFORE and ANNOTATE_HAPPENS_AFTER macros:
#include <valgrind/drd.h>
#include <thread>
#include <atomic>
#include <iostream>
int main()
{
    std::atomic<bool> isReady(false);
    int i = 1;
    std::thread t([&isReady, &i]()
    {
        i = 2;
        ANNOTATE_HAPPENS_BEFORE(&isReady);
        isReady = true;
    });
    while (!isReady)
        std::this_thread::yield();
    ANNOTATE_HAPPENS_AFTER(&isReady);
    i = 3;
    t.join();
    std::cout << i;
    return 0;
}
Here we are saying that the line at ANNOTATE_HAPPENS_BEFORE always happens before the line at ANNOTATE_HAPPENS_AFTER. We know that from inspecting the program logic, but valgrind cannot prove it for you.
This program produces:
valgrind --tool=helgrind ./a.out
==714== Helgrind, a thread error detector
==714== Copyright (C) 2007-2015, and GNU GPL'd, by OpenWorks LLP et al.
==714== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==714== Command: ./val
==714==
==714== ---Thread-Announcement------------------------------------------
==714==
==714== Thread #1 is the program's root thread
==714==
==714== ---Thread-Announcement------------------------------------------
==714==
==714== Thread #2 was created
==714== at 0x59E169E: clone (in /usr/lib/libc-2.23.so)
==714== by 0x4E421D9: create_thread (in /usr/lib/libpthread-2.23.so)
==714== by 0x4E43C42: pthread_create@@GLIBC_2.2.5 (in /usr/lib/libpthread-2.23.so)
==714== by 0x4C316F3: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==714== by 0x4C327D7: pthread_create@* (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==714== by 0x5113DB4: __gthread_create (gthr-default.h:662)
==714== by 0x5113DB4: std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) (thread.cc:163)
==714== by 0x40109C: std::thread::thread<main::{lambda()#1}>(main::{lambda()#1}&&) (in /home/ennio/val)
==714== by 0x400F55: main (in /home/ennio/val)
==714==
==714== ----------------------------------------------------------------
==714==
==714== Possible data race during read of size 1 at 0xFFF00061F by thread #1
==714== Locks held: none
==714== at 0x401585: std::atomic<bool>::operator bool() const (in /home/ennio/val)
==714== by 0x400F61: main (in /home/ennio/val)
==714==
==714== This conflicts with a previous write of size 1 by thread #2
==714== Locks held: none
==714== at 0x4015D5: std::__atomic_base<bool>::operator=(bool) (in /home/ennio/val)
==714== by 0x401550: std::atomic<bool>::operator=(bool) (in /home/ennio/val)
==714== by 0x400F1B: main::{lambda()#1}::operator()() const (in /home/ennio/val)
==714== by 0x40146F: void std::_Bind_simple<main::{lambda()#1} ()>::_M_invoke<>(std::_Index_tuple<>) (in /home/ennio/val)
==714== by 0x40140C: std::_Bind_simple<main::{lambda()#1} ()>::operator()() (in /home/ennio/val)
==714== by 0x4013EB: std::thread::_State_impl<std::_Bind_simple<main::{lambda()#1} ()> >::_M_run() (in /home/ennio/val)
==714== by 0x5113A9E: execute_native_thread_routine (thread.cc:83)
==714== by 0x4C318E7: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==714== Address 0xfff00061f is on thread #1's stack
==714== in frame #1, created by main (???:)
==714==
3==714==
==714== For counts of detected and suppressed errors, rerun with: -v
==714== Use --history-level=approx or =none to gain increased speed, at
==714== the cost of reduced accuracy of conflicting-access information
==714== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
To remove the error on isReady itself, I assume a suppression file on __atomic_base<bool>::operator= would be sufficient.
Upvotes: 2