Reputation: 5444
I have a case with "property maps", one attached to each object in a very large collection. I am only interested in the overhead of the map itself, not of the attached keys and values.
Many of the maps have a small number of properties, i.e. < 5, and many have none. For empty maps I use a singleton "empty instance" as an optimization.
For speed, it seems that at a count of <= 5 a TreeMap is equal to or better than a HashMap for retrieval, and it "auto-trims" when nodes are removed. I would think a HashMap pays, for each entry, for a share of the hash/bucket array (including the slots the load factor keeps full of null entry handles) plus the Entry object's overhead and its handles to the key and value (and any internal fields).
Also, I would think a TreeMap pays only the per-Entry object overhead plus the key, value, and link handles in each entry. The smallest possible structure would be a binary search over an array: one array object, trimmed to size, holding just the handles to the keys and the values.
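For concreteness, a lookup over that layout might look like the sketch below (my illustration, not tested code): keys at even indices, kept sorted, values at odd indices.

// Hypothetical lookup over one trimmed Object[] holding interleaved entries:
// keys at even indices (kept sorted), values at odd indices.
@SuppressWarnings("unchecked")
static <K extends Comparable<K>, V> V get(Object[] kv, K key) {
    int lo = 0, hi = kv.length / 2 - 1;
    while (lo <= hi) {
        int mid = (lo + hi) >>> 1;
        int cmp = key.compareTo((K) kv[2 * mid]); // one compare per probe
        if (cmp == 0) return (V) kv[2 * mid + 1];
        if (cmp < 0) hi = mid - 1;
        else lo = mid + 1;
    }
    return null; // key absent
}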
Has anyone actually measured the full overhead of each of the Java map types? For my bazillions of objects with properties it would make a significant difference in the 1-to-5-property case, without impacting access speed.
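One way to measure this empirically, assuming the OpenJDK JOL (Java Object Layout) library is available:

import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import org.openjdk.jol.info.GraphLayout;

public class MapFootprint {
    public static void main(String[] args) {
        String[] keys = new String[5];
        String[] vals = new String[5];
        for (int i = 0; i < 5; i++) { keys[i] = "k" + i; vals[i] = "v" + i; }

        Map<String, String> hm = new HashMap<>();
        Map<String, String> tm = new TreeMap<>();
        for (int i = 0; i < 5; i++) {
            hm.put(keys[i], vals[i]); // share the same key/value objects
            tm.put(keys[i], vals[i]);
        }
        // Deep retained sizes include the keys and values, but since both maps
        // reference the same objects, the difference isolates the map overhead.
        System.out.println("HashMap: " + GraphLayout.parseInstance(hm).totalSize());
        System.out.println("TreeMap: " + GraphLayout.parseInstance(tm).totalSize());
    }
}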
Upvotes: 2
Views: 506
Reputation: 5444
OK - I looked at the Java source, and yes, TreeMap does use a fair amount of memory per node. So I made two map implementations and tested them.
The first is a simple singly linked list: the parent has only two fields, MyMap { Entry root; int size; }, and the Entry is { key, value, next }. For sizes <= 8 it turned out faster than HashMap, was easy to code, and got linearly faster at the smaller sizes.
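A minimal sketch of such a list map, matching the shape described above (the method bodies are my reconstruction, not the author's code):

// Singly linked list map: the parent holds only a head reference and a count.
class MyMap<K, V> {
    static class Entry<K, V> {
        final K key;
        V value;
        Entry<K, V> next;
        Entry(K key, V value, Entry<K, V> next) {
            this.key = key; this.value = value; this.next = next;
        }
    }

    Entry<K, V> root;
    int size;

    V get(K key) {
        for (Entry<K, V> e = root; e != null; e = e.next)
            if (key.equals(e.key)) return e.value; // linear scan; cheap at tiny sizes
        return null;
    }

    V put(K key, V value) {
        for (Entry<K, V> e = root; e != null; e = e.next)
            if (key.equals(e.key)) { V old = e.value; e.value = value; return old; }
        root = new Entry<>(key, value, root); // prepend; no buckets, no resizing
        size++;
        return null;
    }
}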
I also did the interleaved key/value array with binary search, which performed worse than the simple list: the extra overhead of all the Comparable.compare() invocations on the keys ate all the gains, so it performed no faster than HashMap at smaller sizes and got much worse as it got bigger.
So for my case the simplest thing wins. I simply replace the list with a HashMap when it grows beyond 8 entries, and if the HashMap drops below a size of 5 I revert back to the list map.
And you are right about putting. For this application there is probably only one put for several hundred gets, and most of the puts just replace the value in an existing entry rather than discard it.
Upvotes: 0
Reputation: 46482
Assuming you want to optimize really hard and need a full-fledged Map, you can encapsulate an Object[] in your own AbstractMap subclass and store all keys and values in it, interleaved. You need to implement a single method, entrySet(), for which you can use an AbstractSet; but then it gets worse, as all the mutating methods throw by default, so you'd have to inspect them all... which may take maybe a whole day.
With Guava's testing support you could ensure that your map really works. Without really profound testing, I'd recommend sticking with your own tiny interface, as Map is pretty huge.
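A minimal read-only sketch of that idea (the class name ArrayMap and the layout are my illustration): keys live at even indices, values at odd ones, and AbstractMap's default get() scans the entry set, so entrySet() is the one method needed for lookups.

import java.util.AbstractMap;
import java.util.AbstractSet;
import java.util.Iterator;
import java.util.Set;

// One trimmed array: [k0, v0, k1, v1, ...].
class ArrayMap<K, V> extends AbstractMap<K, V> {
    private final Object[] kv;

    ArrayMap(Object[] kv) { this.kv = kv; }

    @Override public Set<Entry<K, V>> entrySet() {
        return new AbstractSet<Entry<K, V>>() {
            @Override public int size() { return kv.length / 2; }

            @Override public Iterator<Entry<K, V>> iterator() {
                return new Iterator<Entry<K, V>>() {
                    private int i = 0;

                    @Override public boolean hasNext() { return i < kv.length; }

                    @SuppressWarnings("unchecked")
                    @Override public Entry<K, V> next() {
                        // Assumes hasNext() was checked, as in a for-each loop.
                        Entry<K, V> e = new SimpleImmutableEntry<>((K) kv[i], (V) kv[i + 1]);
                        i += 2;
                        return e;
                    }
                };
            }
        };
    }
}

All mutators inherited from AbstractMap still throw, so this variant is effectively immutable.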
Upvotes: 2
Reputation: 15313
If your property maps are not updated actively, but are built once and then queried for a more or less long time, I would propose a Java-specific memory-saving architecture:
import java.util.HashMap;

public interface DynamicMap<K, V> {
    DynamicMap<K, V> add(K key, V value);
    V get(K key);
}

class WrappingDynamicMap<K, V> implements DynamicMap<K, V> {
    DynamicMap<K, V> delegate;

    public void put(K key, V value) { delegate = add(key, value); }

    @Override public DynamicMap<K, V> add(K key, V value) {
        if (delegate == null) return new Map1<>(key, value);
        return delegate.add(key, value);
    }

    @Override public V get(K key) {
        return delegate != null ? delegate.get(key) : null;
    }
}

// Map1 holds exactly one entry; each MapN subclass adds one more key/value pair.
class Map1<K, V> implements DynamicMap<K, V> {
    K k1; V v1;

    Map1(K k1, V v1) { this.k1 = k1; this.v1 = v1; }

    // Replaces the value in place if the key is already present.
    boolean putThis(K key, V value) {
        if (key.equals(k1)) { v1 = value; return true; }
        return false;
    }

    @Override public DynamicMap<K, V> add(K key, V value) {
        if (putThis(key, value)) return this;
        return new Map2<>(this, key, value);
    }

    @Override public V get(K key) { return key.equals(k1) ? v1 : null; }
}

class Map2<K, V> extends Map1<K, V> {
    K k2; V v2;

    Map2(Map1<K, V> m, K k2, V v2) {
        super(m.k1, m.v1);
        this.k2 = k2;
        this.v2 = v2;
    }

    @Override boolean putThis(K key, V value) {
        if (super.putThis(key, value)) return true;
        if (key.equals(k2)) { v2 = value; return true; }
        return false;
    }

    @Override public DynamicMap<K, V> add(K key, V value) {
        if (putThis(key, value)) return this;
        HashMap<K, V> hm = new HashMap<>();
        hm.put(k1, v1);
        hm.put(k2, v2);
        hm.put(key, value);
        return new BigMap<>(hm);
    }

    @Override public V get(K key) {
        V v = super.get(key);
        return v != null ? v : (key.equals(k2) ? v2 : null);
    }
}

// Map3, Map4, Map5 follow the same pattern.

class BigMap<K, V> implements DynamicMap<K, V> {
    private HashMap<K, V> impl;

    BigMap(HashMap<K, V> impl) { this.impl = impl; }

    @Override public DynamicMap<K, V> add(K key, V value) {
        impl.put(key, value);
        return this;
    }

    @Override public V get(K key) { return impl.get(key); }
}
If you need the classic Map interface, use WrappingDynamicMap with put(). If you can afford to reassign the property-map field within the object on each map put, you can use the DynamicMap implementations directly with add(); that would save one dereference on each query and 16-24 bytes of heap for each property map.
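For illustration, direct use without the wrapper could look like this (the Node class and its fields are hypothetical):

// Hypothetical holder object that stores the DynamicMap directly and
// reassigns the field on every put.
class Node {
    DynamicMap<String, Object> props; // null means "no properties yet"

    void setProp(String key, Object value) {
        props = (props == null) ? new Map1<>(key, value) : props.add(key, value);
    }

    Object getProp(String key) {
        return props == null ? null : props.get(key);
    }
}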
Advantages of this approach: it should be smaller, and could be faster, than HashMap/TreeMap on sizes 1..3-5, depending on your key's equals() complexity.
Disadvantages: removes are not shown above, though I hope it's clear how removes could be implemented for Map1/Map2/../BigMap. BigMap could also monitor impl.size() and turn back to Map5 eventually.
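As a sketch of those removes (my assumption: a remove method added to DynamicMap that, like add(), returns the replacement map, with null standing for the empty map):

// In Map1: removing the only key yields the empty map (null here).
@Override public DynamicMap<K, V> remove(K key) {
    return key.equals(k1) ? null : this;
}

// In Map2: dropping either key downgrades to a Map1 holding the survivor.
@Override public DynamicMap<K, V> remove(K key) {
    if (key.equals(k1)) return new Map1<>(k2, v2);
    if (key.equals(k2)) return new Map1<>(k1, v1);
    return this;
}

// In BigMap: delegate to the HashMap; this is also where impl.size()
// could be checked to shrink back to Map5.
@Override public DynamicMap<K, V> remove(K key) {
    impl.remove(key);
    return this;
}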
Additional opportunities: a more compact custom map could be used in place of HashMap, with its size stored as a byte.
Upvotes: 2
Reputation: 198371
https://code.google.com/p/memory-measurer/wiki/ElementCostInDataStructures
TreeMap and HashMap use approximately the same amount of memory per entry, though it's true that TreeMap has a slightly smaller constant overhead for the map itself.
Upvotes: 3