Reputation: 422
I have an unordered_map in code, which has a fixed number of elements initialized at the startup of my program
std::unorderd_map<int, bool> flags;
//Initialize all needed values in the map
// Start thread T1
// Start thread T2
all entries initialized to false in the beginning. There are two threads (T1 & T2). T1 can alter the value of each entry in the map. T2 just reads the entries periodically. In this case, do I need to use a lock? The total number of elements in the unordered_map stays the same throughout the life of the program.
Upvotes: 3
Views: 987
Reputation: 106126
std::unorderd_map<int, bool> flags;
...In this case, do I need to use a lock?
Yes... you'd need a lock around the entire unordered_map
. After all - even if you just had...
bool flag;
...you'd need locking. Jerry Coffin quotes the relevant verse in his answer - reproduced here for convenience: §[intro.races]/21.2
The execution of a program contains a data race if it contains two potentially concurrent conflicting actions, at least one of which is not atomic, and neither happens before the other, except for the special case for signal handlers described below. Any such data race results in undefined behavior.
Alternatively, you could use:
std::unorderd_map<int, std::atomic_bool> flags;
[container.requirements.dataraces] guarantees you can safely do concurrent find
operations to lookup the atomic<bool>
elements. std::unordered_map<...>::find
returns an iterator
by value, so each thread will have an independent object to dereference - there's no potential race there. Then reads/writes to the atomic_bool
from different threads can obviously be done safely - that's the whole point of atomic
.
Upvotes: 0
Reputation: 490188
Yes, this causes a problem. The wording from the standard (§ [intro.races]/21.2) is:
The execution of a program contains a data race if it contains two potentially concurrent conflicting actions, at least one of which is not atomic, and neither happens before the other, except for the special case for signal handlers described below. Any such data race results in undefined behavior.
So, if you only read from multiple threads, you don't get "conflicting actions", but if even one thread writes, you need to synchronize the access.
In the specific case of unordered_map
, I can see a situation where writing and reading concurrently could lead to a really serious problem.
An unordered_map uses a hash table with collision chaining. That means if two (or more) keys hash to the same value, they're all stored in the same "bucket", which is basically a linked-list of the values. So, looking something up in an unordered_map
can involve traversing a linked list.
Experience shows that in a typical case, we can usually expect some elements in a collection to be accessed more often than others, often following something like the commonly known 80:20 rule (20% of the elements will account for 80% of the accesses).
That being the case, one way to optimize a hash table is to try to keep the most commonly accessed elements at the heads of their respective linked lists. To do that, writing a value could not only modify the value itself, but also move its node to (or toward) the beginning of the linked list. So, your reading thread could try to traverse the linked list just as the writing thread is trying to move a node forward in the linked list, so the linked list is completely corrupt at the time (e.g., contains a next
pointer that hasn't been initialized yet).
So even when you don't do any insertion of deletion, and change the value in the map from bool
to std::atomic<bool>
, you still have a potential race condition. You really need to use a lock to protect access to the map itself.
Upvotes: 4
Reputation: 16660
So many ways to answer this question. Almost all of them are "yes, this is a problem".
If T1 inserts or removes elements from the map, then the map could change underneath T2 while it is doing lookups. If you're very lucky, this would cause a consistent crash.
Assuming that T1 only updates existing values in the map, then T2 could read a value that is in the process of being updated, and could get an unusual (as in, not true, and not false) value for the boolean. This could be avoided by using an unorderd_map<int, atomic<bool>>
.
Assuming you avoid both the above problems, you will probably still have data races. A lock of some kind will avoid that.
[Added] Actually, a lock will avoid all three of these problems.
Upvotes: 3
Reputation: 86
Please use a lock. Even though T2 is reading data, you don't want T2 to be reading at the same time T1 is modifying/writing into the map. If that occurs you can have a race condition with undefined behavior where T2 reads the wrong item from the map.
Upvotes: 0
Reputation: 596632
Since T1 is writing new values to the unordered_map
, you need to protect the values from concurrent access, yes. Whether you use a std::mutex
, std::atomic
, etc, that is up to you to decide. But you need some kind of synchronization mechanism between the writing and the reading. Otherwise, T1 could be writing a new value at the exact same moment that T2 is trying to read that same value, which will cause race issues. T2 could easily end up receiving stale or incomplete data.
Upvotes: 2
Reputation: 1
Nope, if you're only reading and you're not changing the data in T2, then you should be fine. However, you might want to look into synchronizing the threads so T2 doesn't read right before T1 writes.
Upvotes: -4