Reputation: 950
Our application stores lot's of data in memory in many different kinds of maps to allow fast lookups. To keep it simple (and not considering primitive maps) it's always a map with one or more keys. Performance is a big requirement for us.
I wanted to find the most performant map implementation and as suggested here, I compared these implementations:
Map of Maps (Nested Maps) based on java.util.HashMap specifically for 3 keys :
Map<K1, Map<K2, Map<K3, V>>>
Wrapper Key (Tuples as keys) in a java.util.HashMap
Map<Triple<K1, K2, K3>, V>
Tuples as keys in a net.openhft.koloboke.collect.map.hash.HashObjObjMap which according to this should be (one of) the fastest map.
HashObjObjMap<Triple<K1, K2, K3>, V>
Benchmark Mode Cnt Score Error Units
TupleVsNestedMapsBenchmark.benchGetFromNestedMap avgt 20 11.586 ± 0.205 ns/op
TupleVsNestedMapsBenchmark.benchGetFromTupleKolobokeMap avgt 20 18.619 ± 0.113 ns/op
TupleVsNestedMapsBenchmark.benchGetFromTupleMap avgt 20 8.985 ± 0.085 ns/op
TupleVsNestedMapsBenchmark.benchPutToNestedMap avgt 20 15.106 ± 0.142 ns/op
TupleVsNestedMapsBenchmark.benchPutToTupleKolobokeMap avgt 20 22.533 ± 0.335 ns/op
TupleVsNestedMapsBenchmark.benchPutToTupleMap avgt 20 8.884 ± 0.084 ns/op
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@BenchmarkMode(Mode.AverageTime)
@OperationsPerInvocation(100000)
@Fork(1)
@Warmup(iterations = 10)
@Measurement(iterations = 20)
public class TupleVsNestedMapsBenchmark {
public static final int N = 10000;
static ObjObjObjObjHashMap<String, String, String, Integer> sourceNestedMap = new ObjObjObjObjHashMap<>();
static Map<Triple<String, String, String>, Integer> sourceTupleMap = new HashMap<>();
static HashObjObjMap<Triple<String, String, String>, Integer> sourceTupleKMap = HashObjObjMaps.newMutableMap();
static {
for (int i = 0; i < N; i++) {
sourceNestedMap.put("a-" + i, "b-" + i, "c-" + i, i);
sourceTupleMap.put(ImmutableTriple.of("a-" + i, "b-" + i, "c-" + i), i);
sourceTupleKMap.put(ImmutableTriple.of("a-" + i, "b-" + i, "c-" + i), i);
}
}
@Benchmark
public List<Integer> benchGetFromNestedMap() {
return benchmarkGet(sourceNestedMap::get);
}
@Benchmark
public List<Integer> benchGetFromTupleMap() {
return benchmarkGet(((key1, key2, key3) -> sourceTupleMap.get(ImmutableTriple.of(key1, key2, key3))));
}
@Benchmark
public List<Integer> benchGetFromTupleKolobokeMap() {
return benchmarkGet(((key1, key2, key3) -> sourceTupleKMap.get(ImmutableTriple.of(key1, key2, key3))));
}
@Benchmark
public ObjObjObjObjHashMap<String, String, String, Integer> benchPutToNestedMap() {
ObjObjObjObjHashMap<String, String, String, Integer> map = new ObjObjObjObjHashMap<>();
benchmarkPut(map::put);
return map;
}
@Benchmark
public Map<Triple<String, String, String>, Integer> benchPutToTupleMap() {
Map<Triple<String, String, String>, Integer> map = new HashMap<>();
benchmarkPut((key1, key2, key3, value) -> map.put(ImmutableTriple.of(key1, key2, key3), value));
return map;
}
@Benchmark
public Map<Triple<String, String, String>, Integer> benchPutToTupleKolobokeMap() {
HashObjObjMap<Triple<String, String, String>, Integer> map = HashObjObjMaps.newMutableMap();
benchmarkPut((key1, key2, key3, value) -> map.put(ImmutableTriple.of(key1, key2, key3), value));
return map;
}
private List<Integer> benchmarkGet(MapValueSupplier<Integer> mapValueSupplier) {
List<Integer> result = new ArrayList<>(N);
for (int i = 0; i < N; i++) {
result.add(mapValueSupplier.supply("a-" + i, "b-" + i, "c-" + i));
}
return result;
}
private void benchmarkPut(PutValueFunction<Integer> putValueFunction) {
for (int i = 0; i < N; i++) {
putValueFunction.apply("a-" + i, "b-" + i, "c-" + i, i);
}
}
private interface MapValueSupplier<T> {
T supply(String key1, String key2, String key3);
}
private interface PutValueFunction<T> {
void apply(String key1, String key2, String key3, T value);
}
}
Note: please, don't suggest to use primitive maps. The Integer as (value) is just an example of a cheap object.
Based on the good advices from @leventov, I changed the Benchmark and tried also the Triple implementation which caches the hashcode (and has a better distribution) - tests are named as Tuple2.
@State(Scope.Thread)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@BenchmarkMode(Mode.AverageTime)
@OperationsPerInvocation(TupleVsNestedMapsBenchmark.TOTAL_OPS)
@Fork(1)
@Warmup(iterations = 5)
@Measurement(iterations = 20)
public class TupleVsNestedMapsBenchmark {
static final int N = 30;
static final int TOTAL_OPS = N * N * N;
private ObjObjObjObjHashMap<String, String, String, Integer> sourceNestedMap;
private Map<Triple<String, String, String>, Integer> sourceTupleMap;
private HashObjObjMap<Triple<String, String, String>, Integer> sourceTupleKMap;
private Map<Triple<String, String, String>, Integer> sourceTuple2Map;
private HashObjObjMap<Triple<String, String, String>, Integer> sourceTuple2KMap;
private String[] keys;
@Setup
public void init() {
sourceNestedMap = new ObjObjObjObjHashMap<>();
sourceTupleMap = new HashMap<>(TOTAL_OPS);
sourceTupleKMap = HashObjObjMaps.newMutableMap(TOTAL_OPS);
sourceTuple2Map = new HashMap<>(TOTAL_OPS);
sourceTuple2KMap = HashObjObjMaps.newMutableMap(TOTAL_OPS);
keys = new String[N];
for (int i = 0; i < N; i++) {
keys[i] = "k" + i;
}
for (int i = 0; i < N; i++) {
for (int j = 0; j < N; j++) {
for (int k = 0; k < N; k++) {
sourceNestedMap.put(keys[i], keys[j], keys[k], i);
sourceTupleMap.put(ImmutableTriple.of(keys[i], keys[j], keys[k]), i);
sourceTupleKMap.put(ImmutableTriple.of(keys[i], keys[j], keys[k]), i);
sourceTuple2Map.put(ImmutableTriple2.of(keys[i], keys[j], keys[k]), i);
sourceTuple2KMap.put(ImmutableTriple2.of(keys[i], keys[j], keys[k]), i);
}
}
}
}
@Benchmark
public List<Integer> benchGetFromNestedMap() {
return benchmarkGet(sourceNestedMap::get);
}
@Benchmark
public List<Integer> benchGetFromTupleMap() {
return benchmarkGet(((key1, key2, key3) -> sourceTupleMap.get(ImmutableTriple.of(key1, key2, key3))));
}
@Benchmark
public List<Integer> benchGetFromTupleKolobokeMap() {
return benchmarkGet(((key1, key2, key3) -> sourceTupleKMap.get(ImmutableTriple.of(key1, key2, key3))));
}
@Benchmark
public List<Integer> benchGetFromTuple2Map() {
return benchmarkGet(((key1, key2, key3) -> sourceTuple2Map.get(ImmutableTriple2.of(key1, key2, key3))));
}
@Benchmark
public List<Integer> benchGetFromTuple2KolobokeMap() {
return benchmarkGet(((key1, key2, key3) -> sourceTuple2KMap.get(ImmutableTriple2.of(key1, key2, key3))));
}
@Benchmark
public ObjObjObjObjHashMap<String, String, String, Integer> benchPutToNestedMap() {
ObjObjObjObjHashMap<String, String, String, Integer> map = new ObjObjObjObjHashMap<>();
benchmarkPut(map::put);
return map;
}
@Benchmark
public Map<Triple<String, String, String>, Integer> benchPutToTupleMap() {
Map<Triple<String, String, String>, Integer> map = new HashMap<>();
benchmarkPut((key1, key2, key3, value) -> map.put(ImmutableTriple.of(key1, key2, key3), value));
return map;
}
@Benchmark
public Map<Triple<String, String, String>, Integer> benchPutToTupleKolobokeMap() {
HashObjObjMap<Triple<String, String, String>, Integer> map = HashObjObjMaps.newMutableMap();
benchmarkPut((key1, key2, key3, value) -> map.put(ImmutableTriple.of(key1, key2, key3), value));
return map;
}
@Benchmark
public Map<Triple<String, String, String>, Integer> benchPutToTuple2Map() {
Map<Triple<String, String, String>, Integer> map = new HashMap<>();
benchmarkPut((key1, key2, key3, value) -> map.put(ImmutableTriple2.of(key1, key2, key3), value));
return map;
}
@Benchmark
public Map<Triple<String, String, String>, Integer> benchPutToTuple2KolobokeMap() {
HashObjObjMap<Triple<String, String, String>, Integer> map = HashObjObjMaps.newMutableMap();
benchmarkPut((key1, key2, key3, value) -> map.put(ImmutableTriple2.of(key1, key2, key3), value));
return map;
}
private List<Integer> benchmarkGet(MapValueSupplier<Integer> mapValueSupplier) {
List<Integer> result = new ArrayList<>(TOTAL_OPS);
for (int i = 0; i < N; i++) {
for (int j = 0; j < N; j++) {
for (int k = 0; k < N; k++) {
Integer value = mapValueSupplier.supply(keys[i], keys[j], keys[k]);
result.add(value);
}
}
}
return result;
}
private void benchmarkPut(PutValueFunction<Integer> putValueFunction) {
Integer value = 1;
for (int i = 0; i < N; i++) {
for (int j = 0; j < N; j++) {
for (int k = 0; k < N; k++) {
putValueFunction.apply(keys[i], keys[j], keys[k], value);
}
}
}
}
private interface MapValueSupplier<T> {
T supply(String key1, String key2, String key3);
}
private interface PutValueFunction<T> {
void apply(String key1, String key2, String key3, T value);
}
}
and the results are this:
Benchmark Mode Cnt Score Error Units
TupleVsNestedMapsBenchmark.benchGetFromNestedMap avgt 20 24.524 ± 0.144 ns/op
TupleVsNestedMapsBenchmark.benchGetFromTuple2KolobokeMap avgt 20 65.604 ± 1.135 ns/op
TupleVsNestedMapsBenchmark.benchGetFromTuple2Map avgt 20 22.653 ± 0.745 ns/op
TupleVsNestedMapsBenchmark.benchGetFromTupleKolobokeMap avgt 20 34824.901 ± 1718.183 ns/op
TupleVsNestedMapsBenchmark.benchGetFromTupleMap avgt 20 2565.835 ± 57.402 ns/op
TupleVsNestedMapsBenchmark.benchPutToNestedMap avgt 20 43.160 ± 0.340 ns/op
TupleVsNestedMapsBenchmark.benchPutToTuple2KolobokeMap avgt 20 237.300 ± 3.362 ns/op
TupleVsNestedMapsBenchmark.benchPutToTuple2Map avgt 20 40.952 ± 0.535 ns/op
TupleVsNestedMapsBenchmark.benchPutToTupleKolobokeMap avgt 20 52315.769 ± 399.769 ns/op
TupleVsNestedMapsBenchmark.benchPutToTupleMap avgt 20 3205.538 ± 44.306 ns/op
Upvotes: 5
Views: 3897
Reputation: 15303
[Answer to the updated question.]
Well, there are still problems with the benchmarks:
State
lifecycle, you should pass the state object to the benhcmark method, as a parameter (see my code below).Benchmarking put()
s should be done differently: 1) in a @Setup method, the collection should be created (with sufficient capacity
or size
argument)
2) in another @Setup(Level.Invocation)
method, you should call collection.clear()
3) measure pure put()
s in the benchmark method
You still do a lot of allocations in benchmarking method. This might be your case, but it hides collection performance contribution.
So, what I've written:
package tests;
import net.openhft.koloboke.collect.map.hash.HashObjObjMap;
import net.openhft.koloboke.collect.map.hash.HashObjObjMaps;
import org.apache.commons.lang3.tuple.Triple;
import org.openjdk.jmh.annotations.*;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@BenchmarkMode(Mode.AverageTime)
@Fork(1)
@Threads(1)
@Warmup(iterations = 10)
@Measurement(iterations = 20)
@State(Scope.Thread)
public class SoMultiMap {
public static final int N = Integer.getInteger("runs", 100000);
private static final double kbk = Double.parseDouble(System.getProperty("kbk", "1.0"));
static class ImmutableTriple<L, M, R> extends Triple<L, M, R> {
public final L left;
public final M middle;
public final R right;
private int h;
public static <L, M, R> ImmutableTriple<L, M, R> of(L left, M middle, R right) {
return new ImmutableTriple(left, middle, right);
}
public ImmutableTriple(L left, M middle, R right) {
this.left = left;
this.middle = middle;
this.right = right;
}
public L getLeft() {
return this.left;
}
public M getMiddle() {
return this.middle;
}
public R getRight() {
return this.right;
}
private int innerHash() {
int h = left.hashCode();
h *= 1000003;
h += middle.hashCode();
h *= 1000003;
h += right.hashCode();
return h * 1000003;
}
@Override
public int hashCode() {
return h != 0 ? h : (h = innerHash());
}
@Override
public boolean equals(Object obj) {
if (!(obj instanceof ImmutableTriple))
return super.equals(obj);
ImmutableTriple triple = (ImmutableTriple) obj;
if (h != 0 && triple.h != 0 && h != triple.h)
return false;
return super.equals(obj);
}
}
ImmutableTriple<String, String, String>[] keys = new ImmutableTriple[N];
Integer[] values = new Integer[N];
Map<Triple<String, String, String>, Integer> sourceTupleMap;
HashObjObjMap<Triple<String, String, String>, Integer> sourceTupleKMap;
@Setup
public void fill() {
sourceTupleMap = new HashMap<>((int) (N / 0.75));
sourceTupleKMap = HashObjObjMaps.newUpdatableMap((int) (N * kbk));
for (int i = 0; i < N; i++) {
keys[i] = ImmutableTriple.of("a-" + i, "b-" + i, "c-" + i);
values[i] = i;
sourceTupleKMap.put(keys[i], values[i]);
sourceTupleMap.put(keys[i], values[i]);
}
}
@Benchmark
public int tupleHashMapGet(SoMultiMap st) {
ImmutableTriple<String, String, String>[] keys = st.keys;
Map<Triple<String, String, String>, Integer> map = st.sourceTupleMap;
int s = 0;
for (int i = 0; i < N; i++) {
s += map.get(keys[i]);
}
return s;
}
@Benchmark
public int tupleKolobokeGet(SoMultiMap st) {
ImmutableTriple<String, String, String>[] keys = st.keys;
HashObjObjMap<Triple<String, String, String>, Integer> map = st.sourceTupleKMap;
int s = 0;
for (int i = 0; i < N; i++) {
s += map.get(keys[i]);
}
return s;
}
public static void main(String[] args) {
SoMultiMap st = new SoMultiMap();
st.fill();
st.tupleKolobokeGet(st);
st.tupleHashMapGet(st);
}
}
Now what interesting, is results:
With Java 7u55:
HashMap: 65 +- 6 ns/op
Koloboke: 46 +- 2
With Java 8u51:
HashMap: 42 +- 0.5
Koloboke: 49 +- 1
So, we have some VM change, something in between, that made HashMap
substantially faster, and Koloboke
maps - slightly slower. This requires investigation, for which I don't have time right now. See https://github.com/OpenHFT/Koloboke/issues/42
Also, note a couple of things:
Upvotes: 8
Reputation: 15303
Triple as an abstraction is OK (at least, I don't see obviously better alternative, you can override Apache Commons' Triple
abstract class to define better hashCode()
function.
final class ImmutableTriple<L, M, R> extends Triple<L, M, R> {
public final L left;
public final M middle;
public final R right;
private int h;
public static <L, M, R> ImmutableTriple<L, M, R> of(L left, M middle, R right) {
return new ImmutableTriple(left, middle, right);
}
public ImmutableTriple(L left, M middle, R right) {
this.left = left;
this.middle = middle;
this.right = right;
}
public L getLeft() {
return this.left;
}
public M getMiddle() {
return this.middle;
}
public R getRight() {
return this.right;
}
private int innerHash() {
int h = left.hashCode();
h *= 1000003;
h += middle.hashCode();
h *= 1000003;
h += right.hashCode();
return (int) LongHashFunction.murmur_3().hashInt(h);
}
@Override
public int hashCode() {
return h != 0 ? h : (h = innerHash());
}
}
Upvotes: 1
Reputation: 15303
The list of problems with your benchmarks:
@Setup
methods and @State
sN
is 10K, but operationsPerInvocation
is 100K, so the actual times are quite depressiveString
hash code + very poor Triple
hash code, leads to some clustering in hash tablesUpvotes: 2