Synchronized HashMap vs ConcurrentHashMap write test

Question

I learn java.util.concurrency and I have found one article about performance (http://www.javamex.com/tutorials/concurrenthashmap_scalability.shtml). And I decided to repeat one little part of those performance test for studying purpose. I have written the write test for HashMap and ConcurrentHashMap. I have two questions about it:

Is it true, that for the best performance I should use number threads equal number CPU core?
I understand, that performance vary form platform to platform. But the result has shown that HashMap a little more faster then ConcurrentHashMap. I think it should be the same or vise versa. Maybe I have made a mistake in my code.

Any criticism is welcome.

package Concurrency;

import java.util.concurrent.*;
import java.util.*;

class Writer2 implements Runnable {
    private Map map;
    private static int index;
    private int nIteration; 
    private Random random = new Random();
    char[] chars = "abcdefghijklmnopqrstuvwxyz".toCharArray();
    private final CountDownLatch latch;

    public Writer2(Map map, int nIteration, CountDownLatch latch) {
        this.map = map;
        this.nIteration = nIteration;
        this.latch = latch;
    }

    private synchronized String getNextString() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 5; i++) {
            char c = chars[random.nextInt(chars.length)];
            sb.append(c);
        }
        sb.append(index);
        if(map.containsKey(sb.toString()))
            System.out.println("dublicate:" + sb.toString());
        return sb.toString();
    }

    private synchronized int getNextInt() { return index++; }

    @Override
    public void run() {
        while(nIteration-- > 0) {
            map.put(getNextString(), getNextInt());
        }
        latch.countDown();
    }
}

public class FourtyTwo {
    static final int nIteration = 100000;
    static final int nThreads = 4;
    static Long testMap(Map map) throws InterruptedException{
        String name = map.getClass().getSimpleName(); 
        CountDownLatch latch = new CountDownLatch(nThreads);
        long startTime = System.currentTimeMillis();
            ExecutorService exec = Executors.newFixedThreadPool(nThreads);
            for(int i = 0; i < nThreads; i++)
                exec.submit(new Writer2(map, nIteration, latch));
            latch.await();  
            exec.shutdown();
        long endTime = System.currentTimeMillis();
        System.out.format(name + ": that took %,d milliseconds %n", (endTime - startTime));
        return (endTime - startTime);
    }
    public static void main(String[] args) throws InterruptedException {
        ArrayList result = new ArrayList() {
            @Override
            public String toString() {
                Long result = 0L;
                Long size = new Long(this.size());
                for(Long i : this)
                    result += i;
                return String.valueOf(result/size);
            }
        }; 

        Map map1 = Collections.synchronizedMap(new HashMap());
        Map map2 = new ConcurrentHashMap<>();

        System.out.println("Rinning test...");
        for(int i = 0; i < 5; i++) {
            //result.add(testMap(map1)); 
            result.add(testMap(map2));
        }
        System.out.println("Average time:" + result + " milliseconds");

    }

}

/*
OUTPUT:
ConcurrentHashMap: that took 5 727 milliseconds 
ConcurrentHashMap: that took 2 349 milliseconds 
ConcurrentHashMap: that took 9 530 milliseconds 
ConcurrentHashMap: that took 25 931 milliseconds 
ConcurrentHashMap: that took 1 056 milliseconds 
Average time:8918 milliseconds

SynchronizedMap: that took 6 471 milliseconds 
SynchronizedMap: that took 2 444 milliseconds 
SynchronizedMap: that took 9 678 milliseconds 
SynchronizedMap: that took 10 270 milliseconds 
SynchronizedMap: that took 7 206 milliseconds 
Average time:7213 milliseconds

*/

DanO · Accepted Answer

One

How many threads varies, not by CPU, but by what you are doing. If, for example, what you are doing with your threads is highly disk intensive, your CPU isn't likely to be maxed out, so doing 8 threads may just cause heavy thrashing. If, however, you have huge amounts of disk activity, followed by heavy computation, followed by more disk activity, you would benefit from staggering the threads, splitting out your activities and grouping them, as well as using more threads. You would, for example, in such a case, likely want to group together file activity that uses a single file, but maybe not activity where you are pulling from a bunch of files (unless they are written contiguously on the disk). Of course, if you overthink disk IO, you could seriously hurt your performance, but I'm making a point of saying that you shouldn't just shirk it, either. In such a program, I would probably have threads dedicated to disk IO, threads dedicated to CPU work. Divide and conquer. You'd have fewer IO threads and more CPU threads.

It is common for a synchronous server to run many more threads than cores/CPUs because most of those threads either do work for only a short time or don't do much CPU intensive work. It's not useful to have 500 threads, though, if you will only ever have 2 clients and the context switching of those excess threads hampers performance. It's a balancing act that often requires a little bit of tuning.

In short

Think about what you are doing
- Network activity is light,so more threads are generally good
- CPU intensive things don't do much good if you have 2x more of those threads than cores... usually a little more than 1x or a little less than 1x is optimum, but you have to test, test, test
- Having 10 disk IO intensive threads may hurt all 10 threads, just like having 30 CPU intensive threads... the thrashing hurts them all
Try to spread out the pain
- See if it helps to spread out the CPU, IO, etc, work or if clustering is better... it will depend on what you are doing
Try to group things up
- If you can, separate out your disk, IO, and network tasks and give them their own threads that are tuned to those tasks

Two

In general, thread-unsafe methods run faster. Similarly using localized synchronization runs faster than synchronizing the entire method. As such, HashMap is normally significantly faster than ConcurrentHashMap. Another example would be StringBuffer compared to StringBuilder. StringBuffer is synchronized and is not only slower, but the synchronization is heavier (more code, etc); it should rarely be used. StringBuilder, however, is unsafe if you have multiple threads hitting it. With that said, StringBuffer and ConcurrentHashMap can race, too. Being "thread-safe" doesn't mean that you can just use it without thought, particularly the way that these two classes operate. For example, you can still have a race condition if you are reading and writing at the same time (say, using contains(Object) as you are doing a put or remove). If you want to prevent such things, you have to use your own class or synchronize your calls to your ConcurrentHashMap.

I generally use the non-concurrent maps and collections and just use my own locks where I need them. You'll find that it's much faster that way and the control is great. Atomics (e.g. AtomicInteger) are nice sometimes, but really not generally useful for what I do. Play with the classes, play with synchronization, and you'll find that you can master than more efficiently than the shotgun approach of ConcurrentHashMap, StringBuffer, etc. You can have race conditions whether or not you use those classes if you don't do it right... but if you do it yourself, you can also be much more efficient and more careful.

Example

Note that we have a new Object that we are locking on. Use this instead of synchronized on a method.

public final class Fun {
    private final Object lock = new Object();

    /*
     * (non-Javadoc)
     *
     * @see java.util.Map#clear()
     */
    @Override
    public void clear() {
        // Doing things...
        synchronized (this.lock) {
            // Where we do sensitive work
        }
    }

    /*
     * (non-Javadoc)
     *
     * @see java.util.Map#put(java.lang.Object, java.lang.Object)
     */
    @Override
    public V put(final K key, @Nullable final V value) {
        // Doing things...
        synchronized (this.lock) {
            // Where we do sensitive work
        }
        // Doing things...
    }
}

And From Your Code...

I might not put that sb.append(index) in the lock or might have a separate lock for index calls, but...

    private final Object lock = new Object();

    private String getNextString() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 5; i++) {
            char c = chars[random.nextInt(chars.length)];
            sb.append(c);
        }
        synchronized (lock) {
            sb.append(index);
            if (map.containsKey(sb.toString()))
                System.out.println("dublicate:" + sb.toString());
        }
        return sb.toString();
    }

    private int getNextInt() {
        synchronized (lock) {
            return index++;
        }
    }

Synchronized HashMap vs ConcurrentHashMap write test

Answers (2)

One

Two

Example

And From Your Code...

Related Questions