sg0710

Reputation: 51

Chronicle Map vs Redis vs Koloboke

We have a system in which the same dataset (key-value pairs) is used across 50 servers. The dataset receives approximately 1,000 updates per hour, and these have to be replicated across all 50 servers. A master system receives the updates and is responsible for propagating them to the other servers. Currently we sync the entire dataset (not the incremental updates) to all the servers every hour in the form of files. This data is then loaded into immutable Koloboke maps. Each server handles around 25,000 requests per second, and each request does 30 lookups into this map. The average response latency for requests on these servers must stay at or below about 3 milliseconds, so the in-memory Koloboke maps serve us well in maintaining this response time.
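To make the current setup concrete, here is a minimal sketch of the hourly reload as an atomic snapshot swap. It uses plain `java.util` maps rather than Koloboke, and the class and method names (`SnapshotStore`, `publish`, `lookup`) are made up for illustration; this is an assumption about the shape of the code, not our actual implementation.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical holder: readers always see one complete, read-only snapshot.
final class SnapshotStore {
    private final AtomicReference<Map<String, String>> current =
            new AtomicReference<>(Collections.emptyMap());

    // Called once per sync: copy the freshly loaded data, then publish it atomically.
    // The defensive copy means later changes to `loaded` do not leak into the snapshot.
    void publish(Map<String, String> loaded) {
        current.set(Collections.unmodifiableMap(new HashMap<>(loaded)));
    }

    // Hot path: one volatile read followed by an ordinary hash lookup.
    String lookup(String key) {
        return current.get().get(key);
    }

    int size() {
        return current.get().size();
    }
}
```

Requests that are in flight during a reload keep reading the old map until the new one is published, so a lookup never observes a half-loaded dataset.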

However, our current system to sync this data across servers causes issues:

1) More often than not, the sync of this critical data fails on one of the servers, resulting in revenue losses.

2) Since this data is stored in memory, it is not persistent, and we need to reload it every time the server restarts and with every hourly update, which affects the startup time of the application.

To make this more efficient, I have explored Redis, Chronicle Map, and the mutable maps in the Koloboke library, but I have run into limitations with all of them:

Redis: Redis supports replication and persistence. However, using its benchmarking utility, I found that the number of lookups it can support is only slightly above our average use case (0.8-1.1 million requests per second vs. our 0.75 million lookups per second). Moreover, calls to Redis would be made over the network, which would hurt our average response time of 3 ms.
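For reference, the 0.75 million figure is just requests per second times lookups per request. A small sketch of that arithmetic follows; `LookupBudget` and its methods are hypothetical names, and the inputs are the numbers quoted in this question.

```java
// Hypothetical helper for the back-of-the-envelope throughput comparison.
final class LookupBudget {
    // Lookups per second on one server = requests per second * lookups per request.
    static long lookupsPerSecond(long requestsPerSecond, long lookupsPerRequest) {
        return requestsPerSecond * lookupsPerRequest;
    }

    // Ratio of measured capacity to demand; values near 1.0 mean almost no headroom.
    static double headroom(long capacityPerSecond, long demandPerSecond) {
        return (double) capacityPerSecond / demandPerSecond;
    }
}
```

With 25,000 requests per second and 30 lookups each, `lookupsPerSecond` gives 750,000. Against the benchmark range of 800,000 to 1,100,000, `headroom` comes out at roughly 1.07 to 1.47, which leaves little margin above the average load.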

Chronicle Maps: On exploring this further, I found that Chronicle Map supports replication and persistence and can serve up to 30 million requests per second. At first glance it seemed like a good option, but I later found that it does not work well with multimaps, which we generate in our application. Moreover, it stores data off-heap, so the cost of deserializing the data would incur a performance hit.

Koloboke: Its performance is good and serves our use case, but it does not support replication or persistence.

I couldn't find anything that supports all of our use cases. I am looking for suggestions from this community that can help us build this system efficiently without any severe performance impact. Any help on this would be much appreciated! Thank you!

Upvotes: 5

Views: 2583

Answers (1)

pgupta

Reputation: 5415

This use case can be easily handled in Aerospike. Aerospike can be configured to run exactly how you are running these servers, and it will update all servers for you when you make an update once to the server cluster. At first glance, your read latency requirements are also reasonable for Aerospike. In addition, Aerospike can serve data from RAM while simultaneously storing it on SSDs or HDDs for persistence. This seems like a tailor-made case for Aerospike. You can run a proof of concept for free using Aerospike Community Edition. Or, if you want a 3-month Enterprise Edition trial license and help from the Aerospike Solution Architecture team, reach out to Aerospike Sales. To do this successfully, you must set up the Aerospike cluster correctly for both data capacity and data latency. If you misconfigure it, you may wrongly dismiss a solution that would work for you.

Upvotes: 5
