Reputation: 29497
Please note: if the cache systems mentioned in this question work so differently from one another that a general answer is nearly impossible, then I would narrow this question down to anything that is JCache (JSR-107) compliant.
The major players in the distributed cache game, for Java at least, are EhCache, Hazelcast and Infinispan.
First of all, my understanding of a distributed cache is that it is a cache that lives inside a running JVM process, but that is constantly synchronizing its in-memory contents across other multiple JVM processes running elsewhere. Hence Process 1 (P1) is running on Machine 1 (M1), P2 is running on M2 and P3 is running on M3. An instance of the same distributed cache is running on all 3 processes, but they somehow all know about each other and are able to keep their caches synchronized with one another.
I believe EhCache accomplishes this inter-process synchrony via JGroups. Not sure what the others are using.
Furthermore, my understanding is that these configurations are limiting because, for each node/instance/process, you have to configure it and tell it about the other nodes/instances/processes in the system, so they can all sync their caches with one another. Something like this:
    <cacheConfig>
        <peers>
            <instance uri="myapp01:12345" />
            <instance uri="myapp02:12345" />
            <instance uri="myapp03:12345" />
        </peers>
    </cacheConfig>
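To make the model described so far concrete, here is a minimal sketch of it as a toy, in-process simulation. It assumes each node keeps a full local copy and pushes every write to a fixed list of peers. `ReplicatedNode` and `StaticPeersDemo` are hypothetical names made up for illustration; this is not the API of EhCache, Hazelcast or Infinispan, and no real networking is involved.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of one cache instance living inside one JVM process.
// Hypothetical class, not part of any real caching library.
class ReplicatedNode {
    private final String name;
    private final Map<String, String> local = new HashMap<>();
    // Fixed peer list, playing the role of the <peers> element in a static config.
    private final List<ReplicatedNode> peers = new ArrayList<>();

    ReplicatedNode(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }

    void addPeer(ReplicatedNode peer) {
        peers.add(peer);
    }

    // Write locally, then push the same entry to every configured peer.
    void put(String key, String value) {
        local.put(key, value);
        for (ReplicatedNode peer : peers) {
            peer.applyReplicated(key, value);
        }
    }

    // Called by a peer: store the entry without forwarding it again.
    void applyReplicated(String key, String value) {
        local.put(key, value);
    }

    String get(String key) {
        return local.get(key);
    }
}

class StaticPeersDemo {
    public static void main(String[] args) {
        ReplicatedNode p1 = new ReplicatedNode("myapp01");
        ReplicatedNode p2 = new ReplicatedNode("myapp02");
        ReplicatedNode p3 = new ReplicatedNode("myapp03");

        // Every node is told about every other node up front.
        p1.addPeer(p2); p1.addPeer(p3);
        p2.addPeer(p1); p2.addPeer(p3);
        p3.addPeer(p1); p3.addPeer(p2);

        p1.put("user:42", "alice");
        System.out.println(p3.get("user:42")); // prints "alice"

        // A node started later is unknown to the others and misses the update.
        ReplicatedNode p4 = new ReplicatedNode("myapp04");
        p1.put("user:43", "bob");
        System.out.println(p4.get("user:43")); // prints "null"
    }
}
```

The last two lines show the limitation in question: with a fixed peer list, a node that appears after the configuration was written never receives updates.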
So to begin with, if anything I have stated is incorrect or misleading, please start by correcting me!
Assuming I'm more or less on track, then I'm confused how distributed caches could possibly work in an elastic/cloud environment where nodes are regulated by auto-scalers. One minute, load is peaking and there are 50 VMs serving your app. Hence, you would need 50 "peer instances" defined in your config. Then the next minute, load dwindles to a crawl and you only need 2 or 3 load-balanced nodes. Since the number of "peer instances" is always changing, there's no way to configure your system properly in a static config file.
So I ask: How do distributed caches work in the cloud if there is never a static number of processes/instances running?
Upvotes: 1
Views: 594
Reputation: 6094
One way to handle that problem is to have an external (almost static) caching cluster that holds the data, while your application (or the frontend servers) uses clients to connect to that cluster. You can still scale the caching cluster up and down as needed, but most of the time you'll need fewer nodes in the caching cluster than you'll need frontend servers.
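As a rough illustration of that topology, here is a toy in-process sketch. It assumes a small, fixed set of cache nodes behind a single entry point and any number of short-lived frontend clients. `CacheCluster` and `FrontendClient` are hypothetical names invented for this example; a real deployment would talk to the cluster over the network through the chosen product's own client library, and the simple hash-modulo routing here only stands in for whatever partitioning that product actually uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy stand-in for an external caching cluster with a small number of nodes.
// Hypothetical class, not part of any real caching library.
class CacheCluster {
    private final List<Map<String, String>> nodes = new ArrayList<>();

    CacheCluster(int nodeCount) {
        if (nodeCount <= 0) {
            throw new IllegalArgumentException("nodeCount must be positive");
        }
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new ConcurrentHashMap<>());
        }
    }

    // Pick the node that owns a key (simple hash-modulo routing, assumed for illustration).
    private Map<String, String> nodeFor(String key) {
        return nodes.get(Math.floorMod(key.hashCode(), nodes.size()));
    }

    void put(String key, String value) {
        nodeFor(key).put(key, value);
    }

    String get(String key) {
        return nodeFor(key).get(key);
    }
}

// A frontend server holds no cache data; it only knows how to reach the cluster.
class FrontendClient {
    private final CacheCluster cluster;

    FrontendClient(CacheCluster cluster) {
        this.cluster = cluster;
    }

    void put(String key, String value) {
        cluster.put(key, value);
    }

    String get(String key) {
        return cluster.get(key);
    }
}

class ClientServerDemo {
    public static void main(String[] args) {
        CacheCluster cluster = new CacheCluster(3);

        // Load peaks: 50 frontends come up, none of them listed in any cache config.
        List<FrontendClient> frontends = new ArrayList<>();
        for (int i = 0; i < 50; i++) {
            frontends.add(new FrontendClient(cluster));
        }
        frontends.get(0).put("session:abc", "cart=3");

        // Load drops: most frontends go away, the cached data is unaffected.
        frontends.subList(2, frontends.size()).clear();
        System.out.println(frontends.get(1).get("session:abc")); // prints "cart=3"
    }
}
```

Because the frontends hold no cache state and only need the cluster's entry point, the auto-scaler can add or remove them freely without touching the cache configuration.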
Upvotes: 1