Reputation: 1473
I have a question about using maps in a multithreaded application. Suppose we have the following scenario:
List<Map<String, Object>>
which is deserialized by Jackson JSON. As you can see, the map is modified only by a single thread, but then it "becomes" read-only (nothing changes, it is just not modified anymore) and is passed to another thread. Next, when I looked into the implementations of HashMap (also TreeMap) and ConcurrentHashMap, the latter has volatile fields while the first two do not. So, which implementation of Map should I use in this case? Is ConcurrentHashMap overkill, or must it be used because of the inter-thread transfer?
My simple tests show that I can use HashMap/TreeMap when they are modified synchronously and it works, but my conclusion or my test code may be wrong:
import java.util.concurrent.CountDownLatch
import java.util.concurrent.ThreadLocalRandom

def map = new TreeMap() // or HashMap
def start = new CountDownLatch(1) // released by the main thread once all workers are ready
def threads = (1..5)
println("Threads: " + threads)
def created = new CountDownLatch(threads.size())
def completed = new CountDownLatch(threads.size())
threads.each { i ->
    new Thread({
        def from = i * 10
        def to = from + 10
        def local = (from..to)
        println(Thread.currentThread().name + " " + local)
        created.countDown()
        start.await()
        println('Mutating by ' + local)
        local.each { number ->
            // every write to the shared map happens under the map's monitor
            synchronized (map) {
                map.put(number, ThreadLocalRandom.current().nextInt())
            }
            println(Thread.currentThread().name + ' added ' + number + ': ' + map.keySet())
        }
        println 'Done: ' + Thread.currentThread().name
        completed.countDown()
    }).start()
}
created.await()
start.countDown()
completed.await() // wait until every worker has finished
println('Completed:')
map.each { e ->
    println('' + e.key + ': ' + e.value)
}
The main thread spawns 5 child threads which update a common map synchronously. When they complete, the main thread successfully sees all updates made by the child threads.
Upvotes: 1
Views: 774
Reputation: 9328
This question has a broad scope.
You say :
[A] map is modified only by single thread, but then it "becomes" read-only
The tricky part is the word "then". When you, the programmer, say "then", you refer to "clock time", e.g. I've done this, now do that. But for an incredibly wide variety of reasons, the computer does not "think" (execute code) this way. What happened before and what happens after need to be "synchronized manually" for the computer to see the world the way we see it.
This is how the Java Memory Model expresses it: if you want your objects to behave predictably in a concurrent environment, you have to make sure that you establish "happens-before" boundaries.
There are a few things that establish happens-before relationships in Java code. Simplifying a bit, and just to name a few:
- start(): when a thread t1 starts t2, everything that t1 did before starting t2 is visible to t2. The reverse holds for join().
- synchronized, object monitors: every action made by a thread inside a synchronized block is visible to another thread that later synchronizes on the same instance.
- java.util.concurrent classes, e.g. Locks and Semaphore, of course, but also collections: if you put an element into a concurrent collection, the thread that pulls it out has a happens-before with respect to the thread that put it in.
So back to your phrase:
then it "becomes" read-only
It does become read-only. But for the computer to see it, you have to give a meaning to "then", which is: you have to put a happens-before relationship in your code.
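To make one of these rules concrete, here is a minimal sketch of the start()/join() case with a plain HashMap. The class and method names are made up for illustration and are not from your code; it only assumes that the main thread does not touch the map between start() and join().

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; names are illustrative only.
class StartJoinVisibility {

    // Returns the value the calling thread reads after the worker has finished.
    static int run() {
        // Plain, non-thread-safe map, populated before the worker is started.
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);

        // Thread.start() happens-before every action in the started thread,
        // so the worker is guaranteed to see "a" -> 1.
        Thread worker = new Thread(() -> map.put("b", map.get("a") + 1));
        worker.start();

        try {
            // All actions in the worker happen-before join() returns,
            // so the write of "b" is visible to the caller afterwards.
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return map.get("b");
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 2
    }
}
```

No volatile field and no concurrent map is involved; the ordering comes entirely from start() and join().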
Later on you state :
And then puts list into blocking queue
A java.util.concurrent queue? How neat is that! It just so happens that a thread pulling an object out of a concurrent queue has a "happens-before" relationship with respect to the thread that put said object into the queue.
You have established the relationship. All mutations made (before) by the thread that put the object into the queue are safely visible to the one that pulls it out. You do not need a ConcurrentHashMap in this case (if no other thread mutates the same data, of course).
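A rough sketch of that hand-off might look like the following. I am assuming a LinkedBlockingQueue and a made-up "value" key, since your actual queue type and map contents are not shown; the important part is that the producer stops touching the list once it has been put into the queue.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical demo class; the queue type and the "value" key are assumptions.
class QueueHandoff {

    // Returns the sum the consumer computed from the handed-off maps.
    static int run() {
        BlockingQueue<List<Map<String, Object>>> queue = new LinkedBlockingQueue<>();
        int[] total = new int[1]; // written by the consumer, read by the caller after join()

        Thread consumer = new Thread(() -> {
            try {
                // take() happens-after the producer's put(), so every write the
                // producer made to the list and its maps is visible here.
                List<Map<String, Object>> received = queue.take();
                for (Map<String, Object> row : received) {
                    total[0] += (Integer) row.get("value");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        // Producer side: build and mutate plain HashMaps on a single thread.
        List<Map<String, Object>> rows = new ArrayList<>();
        for (int i = 1; i <= 3; i++) {
            Map<String, Object> row = new HashMap<>();
            row.put("value", i);
            rows.add(row);
        }

        try {
            queue.put(rows);  // from here on the producer no longer modifies rows
            consumer.join();  // makes the consumer's write to total[0] visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return total[0];
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 6
    }
}
```

If the producer kept modifying the maps after put(), this guarantee would no longer cover those later writes.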
Your sample code does not use a queue. And it mutates a single map from multiple threads (not the other way around, as your scenario describes). So it's just... not the same. But either way, your code's fine.
Threads accessing the map do it like so :
synchronized (map) {
map.put(number, ThreadLocalRandom.current().nextInt())
}
The synchronized block provides 1) mutual exclusion of the threads and 2) a happens-before. So each thread that enters the synchronized block sees everything that "happened before" in any other thread that also synchronized on the map (which is all of them).
So no problem here.
And then your main thread does :
completed.await()
println('Completed:')
map.each { e ->
println('' + e.key + ': ' + e.value)
}
The thing that saves you here is completed.await(). This establishes a happens-before with every thread that called countDown(), which is all of them. So your main thread sees everything that was done by the worker threads. All is fine.
Except... we often forget to check the bootstrap of threads. The first time a worker synchronizes on the map instance, nobody did it before. How can we be sure that the workers see a map instance that is fully initialized and ready?
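Well, for two reasons:
- The map is created before the call to thread.start(), which establishes a happens-before. This would be enough.
- Each worker also waits on start.await() before touching the map, and the main thread only calls start.countDown() after the map has been created, which establishes a second one.
You're doubly safe.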
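For reference, the same pattern can be sketched in plain Java. This is a simplified rendering under my own assumptions (non-overlapping key ranges, squares instead of random values, no start latch), not a translation of your exact test.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class; a simplified Java version of the pattern discussed above.
class LatchVisibility {

    // Returns the number of entries the calling thread sees once all workers are done.
    static int run() {
        // Created before any worker is started, so Thread.start() publishes it safely.
        Map<Integer, Integer> map = new TreeMap<>();
        int workers = 5;
        CountDownLatch completed = new CountDownLatch(workers);

        for (int i = 0; i < workers; i++) {
            final int from = i * 10;
            new Thread(() -> {
                for (int n = from; n < from + 10; n++) {
                    // Mutual exclusion plus happens-before between the workers.
                    synchronized (map) {
                        map.put(n, n * n);
                    }
                }
                // Everything done before countDown() is visible after await() returns.
                completed.countDown();
            }).start();
        }

        try {
            completed.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // No lock is needed for this read: all workers have finished writing.
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 50
    }
}
```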
Upvotes: 1
Reputation: 77186
The java.util.concurrent classes have special guarantees regarding sequencing:
Memory consistency effects: As with other concurrent collections, actions in a thread prior to placing an object into a BlockingQueue happen-before actions subsequent to the access or removal of that element from the BlockingQueue in another thread.
This means that you are free to use any kind of mutable object and manipulate it as you wish, then put it into the queue. When it's retrieved, all of the manipulations you've applied will be visible.
(Note more generally that the kind of test you demonstrated can only prove lack of safety; in most real-world cases, unsynchronized code works fine 99% of the time. It's that last 1% that bites you.)
Upvotes: 2