Reputation: 57
Copy-On-Write is considered a good practice in concurrent scenarios. However, I am not clear on how it differs from simply putting a lock / synchronized on the write method. Could anyone help explain this?
Copy-On-Write:
public V put(K key, V value) {
synchronized (this) {
Map<K, V> newMap = new HashMap<K, V>(internalMap);
V val = newMap.put(key, value);
internalMap = newMap;
return val;
}
}
Direct lock / synchronized:
public V put(K key, V value) {
synchronized (this) {
return internalMap.put(key, value);
}
}
For write threads, the two examples behave the same: writers are mutually excluded.
For read threads, in Copy-On-Write, reads that happen after "internalMap = newMap" has run will see the updated value. With the direct lock, reads that happen after "internalMap.put(key, value)" has run will see the updated value, which is more or less the same.
So why do we promote Copy-On-Write? Why do we have to "copy" on write?
Upvotes: 5
Views: 814
Reputation: 26882
Both a lock and copy-on-write achieve (practically) the same functionality. Neither is inherently better than the other.
In general, copy-on-write performs better when there are lots of reads but very few writes. This is because, on average, reads are cheaper than with a lock (where readers must also acquire the lock to read a plain HashMap safely), while writes are more expensive due to the copying. When you have a lot of writes, it is usually better to use a lock.
Why the writes are more expensive is probably obvious (you have to copy the whole map on every write, duh). The reason reads are cheaper is as follows:
volatile Map<K, V> internalMap = new HashMap<>();
Reading the internalMap does not require acquiring a lock (for more details, see Difference between volatile and synchronized in Java). Once threads have obtained a reference to the internalMap, they can just keep working on that copy (e.g. iterating through the entries) without coordinating with other threads, because it is guaranteed that it won't be mutated. As many threads as necessary can work off a single copy (snapshot) of the map.
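To make the lock-free read path concrete, here is a minimal sketch that combines the volatile field above with the put method from the question. The class name CopyOnWriteMap and the get and snapshot methods are my own additions for illustration, not something from the question or from a library.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical class for illustration; not a JDK or library type.
class CopyOnWriteMap<K, V> {
    // volatile: a reader always sees the most recently published map.
    private volatile Map<K, V> internalMap = new HashMap<>();

    // Writers are serialized; each write copies the map and publishes the copy.
    public synchronized V put(K key, V value) {
        Map<K, V> newMap = new HashMap<>(internalMap);
        V previous = newMap.put(key, value);
        internalMap = newMap; // volatile write publishes the new snapshot
        return previous;
    }

    // Readers take no lock: one volatile read, then a plain HashMap lookup.
    public V get(K key) {
        return internalMap.get(key);
    }

    // A published map is never mutated again, so this read-only view
    // stays stable no matter how many writes happen afterwards.
    public Map<K, V> snapshot() {
        return Collections.unmodifiableMap(internalMap);
    }
}
```

Note that get never touches the monitor used by put, so any number of readers can proceed in parallel, even while a writer is busy copying.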
To explain by analogy, imagine an author is drafting an article and they have a few people working as their fact checkers. With a lock, only one of them can work on the draft at a time. With copy-on-write, the author posts an immutable snapshot (copy) somewhere, which the fact checkers can grab to do their work. While they work, they can read the snapshot as needed (rather than interrupting the author every time they forget parts of the article).
Java's locks have improved over the years, so the difference is small, but under extreme conditions, not having to acquire locks or coordinate between threads can result in higher throughput.
Upvotes: 5
Reputation: 10814
One benefit in this example is that you get snapshot semantics in the copy-on-write case: every map obtained through internalMap is effectively immutable and will not change once obtained. This can be beneficial when you have many concurrent read operations traversing internalMap and only occasional updates.
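A small sketch of what snapshot semantics buys you, assuming the same copy-on-write put as in the question. The class SnapshotDemo and its methods are hypothetical names for illustration. The write is triggered from inside the loop only to keep the example deterministic; in practice it would come from another thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; names are illustrative only.
class SnapshotDemo {
    private volatile Map<String, Integer> internalMap = new HashMap<>();

    // Copy-on-write update, as in the question.
    synchronized void put(String key, Integer value) {
        Map<String, Integer> newMap = new HashMap<>(internalMap);
        newMap.put(key, value);
        internalMap = newMap;
    }

    int size() {
        return internalMap.size();
    }

    // Sums the values of one snapshot while writes happen mid-iteration.
    int sumWhileWriting() {
        Map<String, Integer> snapshot = internalMap; // single volatile read
        int sum = 0;
        for (Map.Entry<String, Integer> e : snapshot.entrySet()) {
            // Replaces internalMap with a new copy; the map being iterated
            // is untouched, so the iterator never sees a modification.
            put("added-during-iteration", 100);
            sum += e.getValue();
        }
        return sum;
    }
}
```

With the direct-lock version, the same loop would be mutating the very HashMap it is iterating over, which typically fails with a ConcurrentModificationException unless the whole traversal also holds the lock.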
Upvotes: 6