Reputation: 568
I'm creating a memoization cache with the following characteristics:
What would have superior performance, or under what conditions would one solution be favored over the other?
ThreadLocal HashMap:
class MyCache {
private static class LocalMyCache {
final Map<K,V> map = new HashMap<K,V>();
V get(K key) {
V val = map.get(key);
if (val == null) {
val = computeVal(key);
map.put(key, val);
}
return val;
}
}
private final ThreadLocal<LocalMyCache> localCaches = new ThreadLocal<LocalMyCache>() {
protected LocalMyCache initialValue() {
return new LocalMyCache();
}
};
public V get(K key) {
return localCaches.get().get(key);
}
}
ConcurrentHashMap:
class MyCache {
private final ConcurrentHashMap<K,V> map = new ConcurrentHashMap<K,V>();
public V get(K key) {
V val = map.get(key);
if (val == null) {
val = computeVal(key);
map.put(key, val);
}
return val;
}
}
I figure the ThreadLocal solution would initially be slower if there a lot of threads because of all the cache misses per thread, but over thousands of reads, the amortized cost would be lower than the ConcurrentHashMap solution. Is my intuition correct?
Or is there an even better solution?
Upvotes: 9
Views: 16870
Reputation: 8156
use ThreadLocal as cache is a not good practice
In most containers, threads are reused via thread pools and thus are never gc. this would lead something wired
use ConcurrentHashMap you have to manage it in order to prevent mem leak
if you insist, i suggest using week or soft ref and evict after rich maxsize
if you are finding a in mem cache solution ( do not reinventing the wheel ) try guava cache http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/cache/CacheBuilder.html
Upvotes: 4
Reputation: 310985
The performance question is irrelevant, as the solutions are not equivalent.
The ThreadLocal hash map isn't shared between threads, so the question of thread safety doesn't even arise, but it also doesn't meet your specification, which doesn't say anything about each thread having its own cache.
The requirement for thread safety implies that a single cache is shared among all threads, which rules out ThreadLocal completely.
Upvotes: 1
Reputation: 328737
Note that your ConcurrentHashMap implementation is not thread safe and could lead to one item being computed twice. It is actually quite complicated to get it right if you store the results directly without using explicit locking, which you certainly want to avoid if performance is a concern.
It is worth noting that ConcurrentHashMap is highly scalable and works well under high contention. I don't know if ThreadLocal would perform better.
Apart from using a library, you could take some inspiration from Java Concurrency in Practice Listing 5.19. The idea is to save a Future<V>
in your map instead of a V
. That helps a lot in making the whole method thread safe while staying efficient (lock-free). I paste the implementation below for reference but the chapter is worth reading to understand that every detail counts.
public interface Computable<K, V> {
V compute(K arg) throws InterruptedException;
}
public class Memoizer<K, V> implements Computable<K, V> {
private final ConcurrentMap<K, Future<V>> cache = new ConcurrentHashMap<K, Future<V>>();
private final Computable<K, V> c;
public Memoizer(Computable<K, V> c) {
this.c = c;
}
public V compute(final K arg) throws InterruptedException {
while (true) {
Future<V> f = cache.get(arg);
if (f == null) {
Callable<V> eval = new Callable<V>() {
public V call() throws InterruptedException {
return c.compute(arg);
}
};
FutureTask<V> ft = new FutureTask<V>(eval);
f = cache.putIfAbsent(arg, ft);
if (f == null) {
f = ft;
ft.run();
}
}
try {
return f.get();
} catch (CancellationException e) {
cache.remove(arg, f);
} catch (ExecutionException e) {
throw new RuntimeException(e.getCause());
}
}
}
}
Upvotes: 2
Reputation: 45443
The lookup speed is probably similar in both solutions. If there are no other concerns, I'd prefer ThreadLocal, since the best solution to multi-threading problems is single-threading.
However, your main problem is you don't want concurrent calculations for the same key; so there should be a lock per key; such locks can usually be implemented by ConcurrentHashMap.
So my solution would be
class LazyValue
{
K key;
volatile V value;
V getValue() { lazy calculation, doubled-checked locking }
}
static ConcurrentHashMap<K, LazyValue> centralMap = ...;
static
{
for every key
centralMap.put( key, new LazyValue(key) );
}
static V lookup(K key)
{
V value = localMap.get(key);
if(value==null)
localMap.put(key, value=centralMap.get(key).getValue())
return value;
}
Upvotes: 1
Reputation: 533660
this computation is very expensive
I assume this is the reason you created the cache and this should be your primary concern.
While the speed of the solutions might be slightly different << 100 ns, I suspect it is more important that you be able to share results between threads. i.e. ConcurrentHashMap is likely to be the best for your application is it is likely to save you more CPU time in the long run.
In short, the speed of you solution is likely to be tiny compared to the cost of computing the same thing multiple times (for multiple threads)
Upvotes: 3
Reputation: 6229
Given that it's relatively easy to implement both of these, I would suggest you try them both and test at steady state load to see which one performs the best for your application.
My guess is that the the ConcurrentHashMap
will be a little faster since it does not have to make native calls to Thread.currentThread()
like a ThreadLocal
does. However, this may depend on the objects you are storing and how efficient their hash coding is.
I may also be worthwhile trying to tune the concurrent map's concurrencyLevel
to the number of threads you need. It defaults to 16.
Upvotes: 1