Milos Gregor
Milos Gregor

Reputation: 950

Multi Key Maps - performance comparison

Context

Our application stores lot's of data in memory in many different kinds of maps to allow fast lookups. To keep it simple (and not considering primitive maps) it's always a map with one or more keys. Performance is a big requirement for us.

Problem

I wanted to find the most performant map implementation and as suggested here, I compared these implementations:

  1. Map of Maps (Nested Maps) based on java.util.HashMap specifically for 3 keys :

    Map<K1, Map<K2, Map<K3, V>>>
    
  2. Wrapper Key (Tuples as keys) in a java.util.HashMap

    Map<Triple<K1, K2, K3>, V>
    
  3. Tuples as keys in a net.openhft.koloboke.collect.map.hash.HashObjObjMap which according to this should be (one of) the fastest map.

    HashObjObjMap<Triple<K1, K2, K3>, V>
    

Expectations

  1. Nested Maps will have fastest GET and slowest PUT.
  2. Koloboke hash map will be faster than jdk HashMap.

Results

Benchmark                                                Mode  Cnt   Score   Error  Units
TupleVsNestedMapsBenchmark.benchGetFromNestedMap         avgt   20  11.586 ± 0.205  ns/op
TupleVsNestedMapsBenchmark.benchGetFromTupleKolobokeMap  avgt   20  18.619 ± 0.113  ns/op
TupleVsNestedMapsBenchmark.benchGetFromTupleMap          avgt   20   8.985 ± 0.085  ns/op
TupleVsNestedMapsBenchmark.benchPutToNestedMap           avgt   20  15.106 ± 0.142  ns/op
TupleVsNestedMapsBenchmark.benchPutToTupleKolobokeMap    avgt   20  22.533 ± 0.335  ns/op
TupleVsNestedMapsBenchmark.benchPutToTupleMap            avgt   20   8.884 ± 0.084  ns/op

Benchmark

@OutputTimeUnit(TimeUnit.NANOSECONDS)
@BenchmarkMode(Mode.AverageTime)
@OperationsPerInvocation(100000)
@Fork(1)
@Warmup(iterations = 10)
@Measurement(iterations = 20)
public class TupleVsNestedMapsBenchmark {

public static final int N = 10000;

static ObjObjObjObjHashMap<String, String, String, Integer> sourceNestedMap = new ObjObjObjObjHashMap<>();
static Map<Triple<String, String, String>, Integer> sourceTupleMap = new HashMap<>();
static HashObjObjMap<Triple<String, String, String>, Integer> sourceTupleKMap = HashObjObjMaps.newMutableMap();

static {
    for (int i = 0; i < N; i++) {
        sourceNestedMap.put("a-" + i, "b-" + i, "c-" + i, i);
        sourceTupleMap.put(ImmutableTriple.of("a-" + i, "b-" + i, "c-" + i), i);
        sourceTupleKMap.put(ImmutableTriple.of("a-" + i, "b-" + i, "c-" + i), i);
    }
}

@Benchmark
public List<Integer> benchGetFromNestedMap() {
    return benchmarkGet(sourceNestedMap::get);
}

@Benchmark
public List<Integer> benchGetFromTupleMap() {
    return benchmarkGet(((key1, key2, key3) -> sourceTupleMap.get(ImmutableTriple.of(key1, key2, key3))));
}

@Benchmark
public List<Integer> benchGetFromTupleKolobokeMap() {
    return benchmarkGet(((key1, key2, key3) -> sourceTupleKMap.get(ImmutableTriple.of(key1, key2, key3))));
}

@Benchmark
public ObjObjObjObjHashMap<String, String, String, Integer> benchPutToNestedMap() {
    ObjObjObjObjHashMap<String, String, String, Integer> map = new ObjObjObjObjHashMap<>();

    benchmarkPut(map::put);

    return map;
}

@Benchmark
public Map<Triple<String, String, String>, Integer> benchPutToTupleMap() {
    Map<Triple<String, String, String>, Integer> map = new HashMap<>();

    benchmarkPut((key1, key2, key3, value) -> map.put(ImmutableTriple.of(key1, key2, key3), value));

    return map;
}

@Benchmark
public Map<Triple<String, String, String>, Integer> benchPutToTupleKolobokeMap() {
    HashObjObjMap<Triple<String, String, String>, Integer> map = HashObjObjMaps.newMutableMap();

    benchmarkPut((key1, key2, key3, value) -> map.put(ImmutableTriple.of(key1, key2, key3), value));

    return map;
}

private List<Integer> benchmarkGet(MapValueSupplier<Integer> mapValueSupplier) {
    List<Integer> result = new ArrayList<>(N);
    for (int i = 0; i < N; i++) {
        result.add(mapValueSupplier.supply("a-" + i, "b-" + i, "c-" + i));

    }
    return result;
}

private void benchmarkPut(PutValueFunction<Integer> putValueFunction) {
    for (int i = 0; i < N; i++) {
        putValueFunction.apply("a-" + i, "b-" + i, "c-" + i, i);
    }
}

private interface MapValueSupplier<T> {

    T supply(String key1, String key2, String key3);
}

private interface PutValueFunction<T> {

    void apply(String key1, String key2, String key3, T value);
}
}

Note: please, don't suggest to use primitive maps. The Integer as (value) is just an example of a cheap object.

Questions

  1. why is the koloboke map 2.5 times slower that jdk map?
  2. why are not nested maps faster? (I would expect that the allocation overhead for the tuple key object will be bigger.)
  3. Or is my benchmark wrong? Then, how can I improve that?

Update

Based on the good advices from @leventov, I changed the Benchmark and tried also the Triple implementation which caches the hashcode (and has a better distribution) - tests are named as Tuple2.

@State(Scope.Thread)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@BenchmarkMode(Mode.AverageTime)
@OperationsPerInvocation(TupleVsNestedMapsBenchmark.TOTAL_OPS)
@Fork(1)
@Warmup(iterations = 5)
@Measurement(iterations = 20)
public class TupleVsNestedMapsBenchmark {

static final int N = 30;
static final int TOTAL_OPS = N * N * N;

private ObjObjObjObjHashMap<String, String, String, Integer> sourceNestedMap;
private Map<Triple<String, String, String>, Integer> sourceTupleMap;
private HashObjObjMap<Triple<String, String, String>, Integer> sourceTupleKMap;
private Map<Triple<String, String, String>, Integer> sourceTuple2Map;
private HashObjObjMap<Triple<String, String, String>, Integer> sourceTuple2KMap;
private String[] keys;

@Setup
public void init() {
    sourceNestedMap = new ObjObjObjObjHashMap<>();
    sourceTupleMap = new HashMap<>(TOTAL_OPS);
    sourceTupleKMap = HashObjObjMaps.newMutableMap(TOTAL_OPS);
    sourceTuple2Map = new HashMap<>(TOTAL_OPS);
    sourceTuple2KMap = HashObjObjMaps.newMutableMap(TOTAL_OPS);
    keys = new String[N];
    for (int i = 0; i < N; i++) {
        keys[i] = "k" + i;
    }
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            for (int k = 0; k < N; k++) {
                sourceNestedMap.put(keys[i], keys[j], keys[k], i);
                sourceTupleMap.put(ImmutableTriple.of(keys[i], keys[j], keys[k]), i); 
                sourceTupleKMap.put(ImmutableTriple.of(keys[i], keys[j], keys[k]), i); 
                sourceTuple2Map.put(ImmutableTriple2.of(keys[i], keys[j], keys[k]), i);
                sourceTuple2KMap.put(ImmutableTriple2.of(keys[i], keys[j], keys[k]), i);
            }
        }
    }
}

@Benchmark
public List<Integer> benchGetFromNestedMap() {
    return benchmarkGet(sourceNestedMap::get);
}

@Benchmark
public List<Integer> benchGetFromTupleMap() {
    return benchmarkGet(((key1, key2, key3) -> sourceTupleMap.get(ImmutableTriple.of(key1, key2, key3))));
}

@Benchmark
public List<Integer> benchGetFromTupleKolobokeMap() {
    return benchmarkGet(((key1, key2, key3) -> sourceTupleKMap.get(ImmutableTriple.of(key1, key2, key3))));
}

@Benchmark
public List<Integer> benchGetFromTuple2Map() {
    return benchmarkGet(((key1, key2, key3) -> sourceTuple2Map.get(ImmutableTriple2.of(key1, key2, key3))));
}

@Benchmark
public List<Integer> benchGetFromTuple2KolobokeMap() {
    return benchmarkGet(((key1, key2, key3) -> sourceTuple2KMap.get(ImmutableTriple2.of(key1, key2, key3))));
}

@Benchmark
public ObjObjObjObjHashMap<String, String, String, Integer> benchPutToNestedMap() {
    ObjObjObjObjHashMap<String, String, String, Integer> map = new ObjObjObjObjHashMap<>();
    benchmarkPut(map::put);
    return map;
}

@Benchmark
public Map<Triple<String, String, String>, Integer> benchPutToTupleMap() {
    Map<Triple<String, String, String>, Integer> map = new HashMap<>();
    benchmarkPut((key1, key2, key3, value) -> map.put(ImmutableTriple.of(key1, key2, key3), value));
    return map;
}

@Benchmark
public Map<Triple<String, String, String>, Integer> benchPutToTupleKolobokeMap() {
    HashObjObjMap<Triple<String, String, String>, Integer> map = HashObjObjMaps.newMutableMap();
    benchmarkPut((key1, key2, key3, value) -> map.put(ImmutableTriple.of(key1, key2, key3), value));
    return map;
}

@Benchmark
public Map<Triple<String, String, String>, Integer> benchPutToTuple2Map() {
    Map<Triple<String, String, String>, Integer> map = new HashMap<>();
    benchmarkPut((key1, key2, key3, value) -> map.put(ImmutableTriple2.of(key1, key2, key3), value));
    return map;
}

@Benchmark
public Map<Triple<String, String, String>, Integer> benchPutToTuple2KolobokeMap() {
    HashObjObjMap<Triple<String, String, String>, Integer> map = HashObjObjMaps.newMutableMap();
    benchmarkPut((key1, key2, key3, value) -> map.put(ImmutableTriple2.of(key1, key2, key3), value));
    return map;
}

private List<Integer> benchmarkGet(MapValueSupplier<Integer> mapValueSupplier) {
    List<Integer> result = new ArrayList<>(TOTAL_OPS);
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            for (int k = 0; k < N; k++) {
                Integer value = mapValueSupplier.supply(keys[i], keys[j], keys[k]);
                result.add(value);
            }
        }
    }
    return result;
}

private void benchmarkPut(PutValueFunction<Integer> putValueFunction) {
    Integer value = 1;
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            for (int k = 0; k < N; k++) {
                putValueFunction.apply(keys[i], keys[j], keys[k], value);
            }
        }
    }
}

private interface MapValueSupplier<T> {

    T supply(String key1, String key2, String key3);
}

private interface PutValueFunction<T> {

    void apply(String key1, String key2, String key3, T value);
}
}

and the results are this:

Benchmark                                                 Mode  Cnt      Score      Error  Units
TupleVsNestedMapsBenchmark.benchGetFromNestedMap          avgt   20     24.524 ±    0.144  ns/op
TupleVsNestedMapsBenchmark.benchGetFromTuple2KolobokeMap  avgt   20     65.604 ±    1.135  ns/op
TupleVsNestedMapsBenchmark.benchGetFromTuple2Map          avgt   20     22.653 ±    0.745  ns/op
TupleVsNestedMapsBenchmark.benchGetFromTupleKolobokeMap   avgt   20  34824.901 ± 1718.183  ns/op
TupleVsNestedMapsBenchmark.benchGetFromTupleMap           avgt   20   2565.835 ±   57.402  ns/op
TupleVsNestedMapsBenchmark.benchPutToNestedMap            avgt   20     43.160 ±    0.340  ns/op
TupleVsNestedMapsBenchmark.benchPutToTuple2KolobokeMap    avgt   20    237.300 ±    3.362  ns/op
TupleVsNestedMapsBenchmark.benchPutToTuple2Map            avgt   20     40.952 ±    0.535  ns/op
TupleVsNestedMapsBenchmark.benchPutToTupleKolobokeMap     avgt   20  52315.769 ±  399.769  ns/op
TupleVsNestedMapsBenchmark.benchPutToTupleMap             avgt   20   3205.538 ±   44.306  ns/op

Summary

Upvotes: 5

Views: 3897

Answers (3)

leventov
leventov

Reputation: 15303

[Answer to the updated question.]

Well, there are still problems with the benchmarks:

  • When making State lifecycle, you should pass the state object to the benhcmark method, as a parameter (see my code below).
  • Benchmarking put()s should be done differently: 1) in a @Setup method, the collection should be created (with sufficient capacity or size argument) 2) in another @Setup(Level.Invocation) method, you should call collection.clear() 3) measure pure put()s in the benchmark method

  • You still do a lot of allocations in benchmarking method. This might be your case, but it hides collection performance contribution.

So, what I've written:

package tests;

import net.openhft.koloboke.collect.map.hash.HashObjObjMap;
import net.openhft.koloboke.collect.map.hash.HashObjObjMaps;
import org.apache.commons.lang3.tuple.Triple;
import org.openjdk.jmh.annotations.*;

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

@OutputTimeUnit(TimeUnit.NANOSECONDS)
@BenchmarkMode(Mode.AverageTime)
@Fork(1)
@Threads(1)
@Warmup(iterations = 10)
@Measurement(iterations = 20)
@State(Scope.Thread)
public class SoMultiMap {

    public static final int N = Integer.getInteger("runs", 100000);

    private static final double kbk = Double.parseDouble(System.getProperty("kbk", "1.0"));

    static class ImmutableTriple<L, M, R> extends Triple<L, M, R> {
        public final L left;
        public final M middle;
        public final R right;
        private int h;

        public static <L, M, R> ImmutableTriple<L, M, R> of(L left, M middle, R right) {
            return new ImmutableTriple(left, middle, right);
        }

        public ImmutableTriple(L left, M middle, R right) {
            this.left = left;
            this.middle = middle;
            this.right = right;
        }

        public L getLeft() {
            return this.left;
        }

        public M getMiddle() {
            return this.middle;
        }

        public R getRight() {
            return this.right;
        }

        private int innerHash() {
            int h = left.hashCode();
            h *= 1000003;
            h += middle.hashCode();
            h *= 1000003;
            h += right.hashCode();
            return h * 1000003;
        }

        @Override
        public int hashCode() {
            return h != 0 ? h : (h = innerHash());
        }

        @Override
        public boolean equals(Object obj) {
            if (!(obj instanceof ImmutableTriple))
                return super.equals(obj);
            ImmutableTriple triple = (ImmutableTriple) obj;
            if (h != 0 && triple.h != 0 && h != triple.h)
                return false;
            return super.equals(obj);
        }
    }

    ImmutableTriple<String, String, String>[] keys = new ImmutableTriple[N];
    Integer[] values = new Integer[N];
    Map<Triple<String, String, String>, Integer> sourceTupleMap;
    HashObjObjMap<Triple<String, String, String>, Integer> sourceTupleKMap;

    @Setup
    public void fill() {
        sourceTupleMap = new HashMap<>((int) (N / 0.75));
        sourceTupleKMap = HashObjObjMaps.newUpdatableMap((int) (N * kbk));
        for (int i = 0; i < N; i++) {
            keys[i] = ImmutableTriple.of("a-" + i, "b-" + i, "c-" + i);
            values[i] = i;
            sourceTupleKMap.put(keys[i], values[i]);
            sourceTupleMap.put(keys[i], values[i]);
        }
    }

    @Benchmark
    public int tupleHashMapGet(SoMultiMap st) {
        ImmutableTriple<String, String, String>[] keys = st.keys;
        Map<Triple<String, String, String>, Integer> map = st.sourceTupleMap;
        int s = 0;
        for (int i = 0; i < N; i++) {
            s += map.get(keys[i]);
        }
        return s;
    }

    @Benchmark
    public int tupleKolobokeGet(SoMultiMap st) {
        ImmutableTriple<String, String, String>[] keys = st.keys;
        HashObjObjMap<Triple<String, String, String>, Integer> map = st.sourceTupleKMap;
        int s = 0;
        for (int i = 0; i < N; i++) {
            s += map.get(keys[i]);
        }
        return s;
    }

    public static void main(String[] args) {
        SoMultiMap st = new SoMultiMap();
        st.fill();
        st.tupleKolobokeGet(st);
        st.tupleHashMapGet(st);
    }
}

Now what interesting, is results:

With Java 7u55:

HashMap:  65 +- 6 ns/op
Koloboke: 46 +- 2

With Java 8u51:

HashMap:  42 +- 0.5
Koloboke: 49 +- 1

So, we have some VM change, something in between, that made HashMap substantially faster, and Koloboke maps - slightly slower. This requires investigation, for which I don't have time right now. See https://github.com/OpenHFT/Koloboke/issues/42

Also, note a couple of things:

  • Run benchmarks on Server VM
  • Disable CPU scaling during the run
  • close heavy apps (browser, Intellij, etc.), unless you have 16+ hardware threads

Upvotes: 8

leventov
leventov

Reputation: 15303

Triple as an abstraction is OK (at least, I don't see obviously better alternative, you can override Apache Commons' Triple abstract class to define better hashCode() function.

final class ImmutableTriple<L, M, R> extends Triple<L, M, R> {
    public final L left;
    public final M middle;
    public final R right;
    private int h;

    public static <L, M, R> ImmutableTriple<L, M, R> of(L left, M middle, R right) {
        return new ImmutableTriple(left, middle, right);
    }

    public ImmutableTriple(L left, M middle, R right) {
        this.left = left;
        this.middle = middle;
        this.right = right;
    }

    public L getLeft() {
        return this.left;
    }

    public M getMiddle() {
        return this.middle;
    }

    public R getRight() {
        return this.right;
    }

    private int innerHash() {
        int h = left.hashCode();
        h *= 1000003;
        h += middle.hashCode();
        h *= 1000003;
        h += right.hashCode();
        return (int) LongHashFunction.murmur_3().hashInt(h);
    }

    @Override
    public int hashCode() {
        return h != 0 ? h : (h = innerHash());
    }
}

Upvotes: 1

leventov
leventov

Reputation: 15303

The list of problems with your benchmarks:

  • initialization is done in static area, should be done with @Setup methods and @States
  • Heavy allocation within the benchmark, and building strings! What you are actually measuring?
  • Note a mistake - N is 10K, but operationsPerInvocation is 100K, so the actual times are quite depressive
  • Poor String hash code + very poor Triple hash code, leads to some clustering in hash tables
  • When testing nested vs tuple, note that you have choosen all parts of all keys to be unique, i. e. all nested maps are maps with a single key. This isn't probably what you wanted

Upvotes: 2

Related Questions