Reputation: 239
I'm using Valgrind --tool=drd to check my application that uses Boost::thread. Basically, the application populates a set of "Book" values with "Kehai" values based on inputs through a socket connection.
On a seperate thread, a user can connect and get the books send to them.
Its fairly simple, so i figured using a boost::mutex::scoped_lock on the location that serializes the book and the location that clears out the book data should be suffice to prevent any race conditions. Here is the code:
void Book::clear()
{
boost::mutex::scoped_lock lock(dataMutex);
for(int i =NUM_KEHAI-1; i >= 0; --i)
{
bid[i].clear();
ask[i].clear();
}
}
int Book::copyChangedKehaiToString(char* dst) const
{
boost::mutex::scoped_lock lock(dataMutex);
sprintf(dst, "%-4s%-13s",market.c_str(),meigara.c_str());
int loc = 17;
for(int i = 0; i < Book::NUM_KEHAI; ++i)
{
if(ask[i].changed > 0)
{
sprintf(dst+loc,"A%i%-21s%-21s%-21s%-8s%-4s",i,ask[i].price.c_str(),ask[i].volume.c_str(),ask[i].number.c_str(),ask[i].postTime.c_str(),ask[i].status.c_str());
loc += 77;
}
}
for(int i = 0; i < Book::NUM_KEHAI; ++i)
{
if(bid[i].changed > 0)
{
sprintf(dst+loc,"B%i%-21s%-21s%-21s%-8s%-4s",i,bid[i].price.c_str(),bid[i].volume.c_str(),bid[i].number.c_str(),bid[i].postTime.c_str(),bid[i].status.c_str());
loc += 77;
}
}
return loc;
}
The clear() function and the copyChangedKehaiToString() function are called in the datagetting thread and data sending thread,respectively. Also, as a note, the class Book:
struct Book
{
private:
Book(const Book&); Book& operator=(const Book&);
public:
static const int NUM_KEHAI=10;
struct Kehai;
friend struct Book::Kehai;
struct Kehai
{
private:
Kehai& operator=(const Kehai&);
public:
std::string price;
std::string volume;
std::string number;
std::string postTime;
std::string status;
int changed;
Kehai();
void copyFrom(const Kehai& other);
Kehai(const Kehai& other);
inline void clear()
{
price.assign("");
volume.assign("");
number.assign("");
postTime.assign("");
status.assign("");
changed = -1;
}
};
std::vector<Kehai> bid;
std::vector<Kehai> ask;
tm recTime;
mutable boost::mutex dataMutex;
Book();
void clear();
int copyChangedKehaiToString(char * dst) const;
};
When using valgrind --tool=drd, i get race condition errors such as the one below:
==26330== Conflicting store by thread 1 at 0x0658fbb0 size 4
==26330== at 0x653AE68: std::string::_M_mutate(unsigned int, unsigned int, unsigned int) (in /usr/lib/libstdc++.so.6.0.8)
==26330== by 0x653AFC9: std::string::_M_replace_safe(unsigned int, unsigned int, char const*, unsigned int) (in /usr/lib/libstdc++.so.6.0.8)
==26330== by 0x653B064: std::string::assign(char const*, unsigned int) (in /usr/lib/libstdc++.so.6.0.8)
==26330== by 0x653B134: std::string::assign(char const*) (in /usr/lib/libstdc++.so.6.0.8)
==26330== by 0x8055D64: Book::Kehai::clear() (Book.h:50)
==26330== by 0x8094A29: Book::clear() (Book.cpp:78)
==26330== by 0x808537E: RealKernel::start() (RealKernel.cpp:86)
==26330== by 0x804D15A: main (main.cpp:164)
==26330== Allocation context: BSS section of /usr/lib/libstdc++.so.6.0.8
==26330== Other segment start (thread 2)
==26330== at 0x400BB59: pthread_mutex_unlock (drd_pthread_intercepts.c:633)
==26330== by 0xC59565: pthread_mutex_unlock (in /lib/libc-2.5.so)
==26330== by 0x805477C: boost::mutex::unlock() (mutex.hpp:56)
==26330== by 0x80547C9: boost::unique_lock<boost::mutex>::~unique_lock() (locks.hpp:340)
==26330== by 0x80949BA: Book::copyChangedKehaiToString(char*) const (Book.cpp:134)
==26330== by 0x80937EE: BookSerializer::serializeBook(Book const&, std::string const&) (BookSerializer.cpp:41)
==26330== by 0x8092D05: BookSnapshotManager::getSnaphotDataList() (BookSnapshotManager.cpp:72)
==26330== by 0x8088179: SnapshotServer::getDataList() (SnapshotServer.cpp:246)
==26330== by 0x808870F: SnapshotServer::run() (SnapshotServer.cpp:183)
==26330== by 0x808BAF5: boost::_mfi::mf0<void, RealThread>::operator()(RealThread*) const (mem_fn_template.hpp:49)
==26330== by 0x808BB4D: void boost::_bi::list1<boost::_bi::value<RealThread*> >::operator()<boost::_mfi::mf0<void, RealThread>, boost::_bi::list0>(boost::_bi::type<void>, boost::_mfi::mf0<void, RealThread>&, boost::_bi::list0&, int) (bind.hpp:253)
==26330== by 0x808BB90: boost::_bi::bind_t<void, boost::_mfi::mf0<void, RealThread>, boost::_bi::list1<boost::_bi::value<RealThread*> > >::operator()() (bind_template.hpp:20)
==26330== Other segment end (thread 2)
==26330== at 0x400B62A: pthread_mutex_lock (drd_pthread_intercepts.c:580)
==26330== by 0xC59535: pthread_mutex_lock (in /lib/libc-2.5.so)
==26330== by 0x80546B8: boost::mutex::lock() (mutex.hpp:51)
==26330== by 0x805473B: boost::unique_lock<boost::mutex>::lock() (locks.hpp:349)
==26330== by 0x8054769: boost::unique_lock<boost::mutex>::unique_lock(boost::mutex&) (locks.hpp:227)
==26330== by 0x8094711: Book::copyChangedKehaiToString(char*) const (Book.cpp:113)
==26330== by 0x80937EE: BookSerializer::serializeBook(Book const&, std::string const&) (BookSerializer.cpp:41)
==26330== by 0x808870F: SnapshotServer::run() (SnapshotServer.cpp:183)
==26330== by 0x808BAF5: boost::_mfi::mf0<void, RealThread>::operator()(RealThread*) const (mem_fn_template.hpp:49)
==26330== by 0x808BB4D: void boost::_bi::list1<boost::_bi::value<RealThread*> >::operator()<boost::_mfi::mf0<void, RealThread>, boost::_bi::list0>(boost::_bi::type<void>, boost::_mfi::mf0<void, RealThread>&, boost::_bi::list0&, int) (bind.hpp:253)
For the life of me, i can't figure out where the race condition is. As far as I can tell, clearing the kehai is done only after having taken the mutex, and the same holds true with copying it to a string. Does anyone have any ideas what could be causing this, or where I should look?
Thank you kindly.
Upvotes: 6
Views: 2574
Reputation: 184
This report may be ignored safely. It is triggered by how std::string is implemented in libstdc++. This issue has been solved in the libstdc++ version included with gcc 4.4.4 or later. See also GCC bugzilla item #40518 for the details.
Upvotes: 2
Reputation: 12943
STL is supposed to be thread-safe in the sense that it is not a problem to use them with thread if you lock correctly or if you just perform multi-thread reads
Surpirse! Yes, it's supposed, but let me tell you a store about what actually happened.
I had a multit-hreaded application. There was a datastructure with strings (std::string
). It was locked with a critical section.
Other objects eventually needed to take strings from there. They made a copy of those strings in the following manner:
// Take a string
std::string str;
{
Autolock l(g_CritSect);
str = g_SomeStr;
}
The same strategy was used to adjust those strings:
// Put a string
std::string str;
{
Autolock l(g_CritSect);
g_SomeStr = str;
}
And, guess what? Crashes!
But why? Because the string's assignment statement doesn't really produce a copy of the memory block holding the string. Instead the same memory block is reused (referenced).
Well, this is not necessarily bad. But the bad thing was that std::string implemented reference counting of the strings not in a thread-safe way. It used regular arithmetic ++ and -- instead of InterlockedIncrement
and etc.
What happened is that the str
object references the same string. And then it eventually dereferences it in its destructor (or when explicitly assigned to another string). And this happens outside of the locked region.
So that those strings may not be used in a multi-threaded application. And it's almost impossible to implement correct locking to work this around, because the actual referenced data passes silently from object to object.
And that implementation of STL was declared thread-safe.
Upvotes: 0
Reputation: 12943
After your post I took time to learn about Valgrind and how its output should be read.
I can see the following:
You invoke Book::clear
which in turns calls Book::Kehai::clear
, where you assign a value to a string. Inside the std::string::assign
the STL does something which stores some value at the address 0x0658fbb0.
Meanwhile the other thread has accessed the same memory location, hence this situation is considered a race condition.
Now look at the "context" of the other thread. Valgrind doesn't show its exact stack location, however it shows between which "segments" it has occured. According to Valgrind a segment is a consecutive block of memory accesses bounded by synchronization operations.
We see that this block starts with pthread_mutex_unlock
and ends at pthread_mutex_lock
.
Means - the same memory location was accessed when your mutex was not locked, and that thread was somewhere outside of your two functions.
Now, look at the conflicting memory location information:
Allocation context: BSS section of /usr/lib/libstdc++.so.6.0.8
The BSS means that it's a global/static variable. And it's defined somewhere inside the libstdc.
Conclusion:
This race condition has nothing to do with your data structures. It's related to STL. One thread performs something to an std::string
(assigns it to an empty string to be exact), whereas the other thread probably does something STL-related as well.
BTW I remember several years ago I've written a multi-threaded application, and there were problems with std::string
there. As I found out - the STL implementation (which was a Dunkimware) actually implemented string as reference-counted, whereas the reference counting was not thread-safe.
Maybe this is what happens to you as well? Perhaps you should set some compiler flag/option when building a multi-threaded application?
Upvotes: 6
Reputation: 239
Nevermind. I'm an idiot and managed to forget that C++ strings are mutable. I changed the code to use c-style strings and my race condition issues went away.
On an aside for anyone reading this post, does anyone know a good immutable string library for C++? I thought boost had one, but i haven't been able to find anything conclusive on it.
Thanks.
Upvotes: 0