hmims

Reputation: 539

What is the most efficient way to query using the Aerospike client?

I have a set (set1)

Bins :

bin1 (PK = key1)

bin2 (PK = key1)

bin3 (PK = key2)

bin4 (PK = key2)

Which of the two approaches below is more efficient for querying data through the Aerospike client, in terms of query time, CPU usage, and failure cases (1 client call vs. 2 client calls)?

Approach 1: Make 1 get call using the Aerospike client with bins = [bin1, bin2, bin3, bin4] and keys = [key1, key2].

Approach 2: Make 2 Aerospike client get calls. The first call has bins = [bin1, bin2] and keys = [key1], and the second call has bins = [bin3, bin4] and keys = [key2].

I find Approach 2 cleaner, since in Approach 1 we will try to get the record for all combinations (e.g., bin1 with key2 as the primary key), which is extra computation, and the primary key set can be large. The disadvantage of Approach 2 is that it takes two Aerospike client calls.

Upvotes: 1

Views: 1083

Answers (1)

Ronen Botzer

Reputation: 7117

A. Batch reads vs. multiple single reads

This is kind of a false choice. Yes, you could make one batch call for [key1, key2] (Approach 1), but you shouldn't specify bin1, bin2, bin3, bin4; just get the full records without selecting bins. Or you could make two independent get() calls, one for key1 and one for key2 (Approach 2).

However, there's no reason you need to read key1, wait for the result, then read key2. You can read them with a synchronous get(key1) in one thread, and a synchronous get(key2) in another thread. The Java client can handle multi-threaded use. Alternatively, you can async get(key1) and immediately async get(key2).
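The idea of not waiting for key1 before reading key2 can be sketched roughly as follows. This is only an illustration in Python using the standard library: `FAKE_STORE` and `get_record` are hypothetical stand-ins for a cluster and a blocking single-record read, not Aerospike client APIs.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for the cluster; not the Aerospike API.
FAKE_STORE = {
    "key1": {"bin1": 1, "bin2": 2},
    "key2": {"bin3": 3, "bin4": 4},
}


def get_record(key):
    """Stand-in for a blocking single-record read (assumed, for illustration)."""
    return FAKE_STORE.get(key)


def read_concurrently(keys):
    """Submit one independent read per key, then collect the results."""
    with ThreadPoolExecutor(max_workers=len(keys)) as pool:
        # All reads are submitted before any result is awaited.
        futures = {key: pool.submit(get_record, key) for key in keys}
        return {key: future.result() for key, future in futures.items()}


records = read_concurrently(["key1", "key2"])
```

With a real client, each worker would issue its own synchronous read, so the total wait is roughly that of the slower read rather than the sum of both.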

Batch reads (such as in Approach 1) are not as efficient as single reads when the batch holds no more records than there are nodes in the cluster. The records are evenly distributed, so if you have a 4-node cluster and make a batch request with 4 keys, you end up with parallel sub-batches of roughly 1 record per node. The overhead associated with batch reads isn't worth it in that case. See more about batch index in the docs and the knowledge base FAQ - batch-index tuning parameters. The FAQ - Differences between getting single record versus batch should answer your question.
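To see why a small batch degenerates into tiny sub-batches, here is a rough sketch of the per-node split. The routing function `node_for_key` is a hypothetical simplification (a CRC32 modulo the node count); a real client routes through the key digest and its partition map.

```python
import zlib
from collections import defaultdict


def node_for_key(key, n_nodes):
    """Hypothetical routing: map a key to a node index (illustration only)."""
    return zlib.crc32(key.encode()) % n_nodes


def split_into_sub_batches(keys, n_nodes):
    """Group the keys of one batch request by the node that would serve them."""
    sub_batches = defaultdict(list)
    for key in keys:
        sub_batches[node_for_key(key, n_nodes)].append(key)
    return dict(sub_batches)


sub_batches = split_into_sub_batches(["key1", "key2", "key3", "key4"], n_nodes=4)
for node, node_keys in sorted(sub_batches.items()):
    print(f"node {node}: {len(node_keys)} key(s)")
```

With only a handful of keys, each node receives at most a few of them, so the batch machinery is set up for sub-batches that are barely larger than single reads.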

B. The number of records in an Aerospike database doesn't impact read performance!

You are worried that "the primary key set can be large". That is not a problem at all for Aerospike. In fact, one of the best things about Aerospike is that getting a single record from a database with 1 million records or one with 1 trillion records is pretty much the same big-O computational cost.

Each record has a 64 byte metadata entry in the primary index. The primary index is spread evenly across the nodes of the cluster, because data distribution in Aerospike is extremely even. Each node stores an even share of the partitions, out of 4096 logical partitions for each namespace in the cluster. The partitions are represented as a collection of red-black binary trees (sprigs) with a hash table leading to the correct sprig.

To find any record, the client hashes its key into a 20-byte digest. Using 12 bits of the digest, the client finds the partition ID, looks it up in the partition map it holds locally, and finds the correct node. Reading the record is now a single hop to the correct node. On that node, a service thread picks up the call from a channel of the network card and looks it up in the correct partition (again, finding the partition ID from the digest is a simple O(1) operation). It hops directly to the correct sprig (also O(1)) and then does a simple O(log n) binary tree lookup for the record's metadata. Now the service thread knows exactly where to find the record in storage, with a single read IO. I explained this read flow in more detail here (though in version 4.7 transaction queues and threads were removed; the service thread does all the work).
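The client-side part of this flow (key to digest, digest to partition ID, partition ID to node) can be sketched as below. Several things here are assumptions made for illustration: SHA-1 stands in as a 20-byte hash (Aerospike uses RIPEMD-160 over the set name and key), the choice of which 12 bits to take is assumed, and `NODES` and `PARTITION_MAP` are hypothetical.

```python
import hashlib

N_PARTITIONS = 4096                # 2**12 logical partitions per namespace
PARTITION_MASK = N_PARTITIONS - 1  # 0xFFF, keeps 12 bits


def digest_for(set_name, user_key):
    """Stand-in 20-byte digest; SHA-1 is used here only because it is 20 bytes."""
    return hashlib.sha1(f"{set_name}:{user_key}".encode()).digest()


def partition_id(digest):
    """Take 12 bits of the digest (which bits is an assumption in this sketch)."""
    return int.from_bytes(digest[:2], "little") & PARTITION_MASK


# Hypothetical partition map: partition ID -> owning node, held by the client.
NODES = ["nodeA", "nodeB", "nodeC", "nodeD"]
PARTITION_MAP = [NODES[pid % len(NODES)] for pid in range(N_PARTITIONS)]


def node_for(set_name, user_key):
    """O(1) routing: hash, mask, and one list lookup, whatever the data size."""
    return PARTITION_MAP[partition_id(digest_for(set_name, user_key))]


digest = digest_for("set1", "key1")
pid = partition_id(digest)
target = node_for("set1", "key1")
```

Nothing in this routing depends on how many records the cluster holds, which is why the read is always a single hop.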

Another point is that the time spent looking up record metadata in the index is orders of magnitude less than getting the record from storage.

So, the number of records in the cluster doesn't change how long it takes to read a random record, from a data set of any size.
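As a back-of-the-envelope check, the depth of the tree walk can be estimated from the number of records per sprig. The figure of 256 sprigs per partition is an assumption for this sketch (the sprig count is configurable), and the estimate ignores replication and the exact balance of a red-black tree.

```python
import math

N_PARTITIONS = 4096
SPRIGS_PER_PARTITION = 256  # assumed value for illustration; configurable


def approx_tree_depth(total_records):
    """Rough number of comparisons to find one record's index entry."""
    per_sprig = total_records / (N_PARTITIONS * SPRIGS_PER_PARTITION)
    return max(1, math.ceil(math.log2(per_sprig + 1)))


depth_million = approx_tree_depth(10**6)
depth_trillion = approx_tree_depth(10**12)
print(depth_million, depth_trillion)
```

Under these assumptions the estimate is about 1 comparison for a million records and about 20 for a trillion. Both are a handful of in-memory steps, which is small next to the storage read that follows.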

I wrote an article Aerospike Modeling: User Profile Store that shows how this fact is leveraged to make sub-millisecond reads at millions of transactions-per-second from a petabyte scale data store.

Upvotes: 1
