RavenMan

Reputation: 1933

Aerospike - Store a *small quantity* of large values

Scenario
Let's say I am storing up to 5 byte arrays, each 50kB, per user.

Possible Implementations:

1) One byte array per record, indexed by a secondary key.
Pros: Fast read/write.
Cons: Each secondary-index query returns multiple records (up to 5 per query). Bad for horizontal scaling if the byte arrays are frequently accessed.

2) All byte arrays in a single record, in separate bins (see the first sketch after this list).
Pros: Fast read.
Neutral: Blocksize (write-block-size) must be greater than 250kB (5 × 50kB).
Cons: Slow write (changing one byte array means rewriting the whole record, i.e. all byte arrays).

3) Store byte arrays in a LLIST LDT
Pros: Avoid the cons of solution (1) and (2)
Cons: LDTs are generally slow

4) Store each byte array in a separate record, keyed by a UUID. Store the list of UUIDs in another (meta) record.
Pros: Writing one byte array does not require rewriting all of them. No secondary-index cardinality concern. Avoids use of LDT.
Cons: A client read is two-stage: get the list of UUIDs from the meta record, then batch-get the records for those UUIDs (very slow?). A sketch of this two-stage read follows the list below.

5) Store each byte array as a separate record, using a pre-determined primary key scheme (e.g. userid_index: 123_0, 123_1, 123_2, 123_3, 123_4).
Pros: Avoids the two-stage read.
Cons: Theoretical collision possibility with another user (e.g. user1_index1 and user2_index2 producing the same hash). I know this is (very, very) low-probability, but avoidance is still preferred (imagine one user being able to read another user's byte array due to a collision).
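
To make option #2 concrete, here is a minimal sketch using the Aerospike Java client. The cluster address, namespace ("test"), set name ("userblobs") and bin names are illustrative assumptions, not values from the question:

```java
import com.aerospike.client.AerospikeClient;
import com.aerospike.client.Bin;
import com.aerospike.client.Key;
import com.aerospike.client.Record;

public class OneRecordPerUser {
    public static void main(String[] args) {
        // Assumed cluster address; namespace/set/bin names are placeholders.
        AerospikeClient client = new AerospikeClient("127.0.0.1", 3000);

        // One record per user; one bin per byte array.
        Key key = new Key("test", "userblobs", 123);
        byte[] blob0 = new byte[50 * 1024];
        byte[] blob1 = new byte[50 * 1024];
        client.put(null, key, new Bin("blob0", blob0), new Bin("blob1", blob1));

        // A read can project just the bin you need, but every write still
        // rewrites the whole record (Aerospike records are copy-on-write).
        Record rec = client.get(null, key, "blob1");
        byte[] fetched = (byte[]) rec.getValue("blob1");

        client.close();
    }
}
```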
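
And here is a sketch of the two-stage read in option #4, again with assumed namespace/set/bin names: the user's meta record holds a list of UUIDs, and the blob records are then fetched with a single batch get from the client.

```java
import com.aerospike.client.AerospikeClient;
import com.aerospike.client.Bin;
import com.aerospike.client.Key;
import com.aerospike.client.Record;

import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

public class TwoStageRead {
    public static void main(String[] args) {
        AerospikeClient client = new AerospikeClient("127.0.0.1", 3000);

        // Write side: each byte array lives in its own record keyed by a UUID,
        // and the user's meta record keeps the list of those UUIDs.
        String uuid = UUID.randomUUID().toString();
        client.put(null, new Key("test", "blobs", uuid),
                new Bin("data", new byte[50 * 1024]));

        List<String> uuids = new ArrayList<>();
        uuids.add(uuid);
        client.put(null, new Key("test", "usermeta", 123), new Bin("uuids", uuids));

        // Stage 1: read the UUID list from the meta record.
        Record meta = client.get(null, new Key("test", "usermeta", 123));
        List<?> ids = meta.getList("uuids");

        // Stage 2: fetch all blob records in one batch read.
        Key[] keys = new Key[ids.size()];
        for (int i = 0; i < ids.size(); i++) {
            keys[i] = new Key("test", "blobs", (String) ids.get(i));
        }
        Record[] blobs = client.get(null, keys);

        client.close();
    }
}
```

The extra cost of option #4 is that stage 2 cannot start until stage 1 returns; the batch read itself is a single client call.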

My Evaluation
For balanced read/write or high-read/low-write situations, use #2 (one record, multiple bins). A rewrite is more costly, but this avoids the other cons (LDT penalty, two-stage read).

For a high-(re)write/low-read situation, use #3 (LDT). This avoids having to rewrite all byte arrays when one of them is updated, since Aerospike records are copy-on-write.

Question
Which implementation is preferable, given the current data pattern (small quantity, large objects)? Do you agree with my evaluation (above)?

Upvotes: 4

Views: 247

Answers (1)

Meher

Reputation: 2939

Here is some input (full disclosure: I do work at Aerospike).

Do avoid #3. Do not use LDTs; the feature is definitely not as mature as the rest of the platform, especially when it comes to performance and reliability during cluster rebalance (migrations), when nodes leave or join the cluster.

I would try to stick as much as possible to basic key/value transactions; that should always be the fastest and most scalable approach. As you pointed out, option #1 would not scale. Secondary indices also have a memory overhead and currently do not allow for fast restart (an Enterprise Edition feature anyway).

You are also correct about #2 being costly for high write loads, especially if you are always going to update just one bin...

So, this leaves options #4 and #5. For option #5, the collision will not happen in practice. You can go over the math; it simply will not happen, and if it did, you would get famous and could publish a paper :) (there may even be a prize for having found a collision). Also, note that you have the option to store the key along with the record, which gives you a 'key check' on writes that should be very cheap (since records are read anyway before being written). Option #4 would work as well; it just does an extra read (which should be super fast).
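
As a sketch of that key-check idea with the Aerospike Java client (namespace, set and bin names are assumptions): build the deterministic userid_index key from option #5 and set sendKey on the write policy, so the original key is stored with the record and checked on subsequent writes in addition to the digest.

```java
import com.aerospike.client.AerospikeClient;
import com.aerospike.client.Bin;
import com.aerospike.client.Key;
import com.aerospike.client.policy.WritePolicy;

public class DeterministicKeyWithKeyCheck {
    public static void main(String[] args) {
        AerospikeClient client = new AerospikeClient("127.0.0.1", 3000);

        // sendKey = true stores the original key alongside the record, giving
        // the 'key check' mentioned above on top of the record digest.
        WritePolicy policy = new WritePolicy();
        policy.sendKey = true;

        // Deterministic key of the form <userId>_<index> (user 123, slot 0).
        Key key = new Key("test", "userblobs", "123_0");
        client.put(policy, key, new Bin("data", new byte[50 * 1024]));

        client.close();
    }
}
```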

It all depends on where you want the extra bit of complexity. If you have that luxury, do some simple benchmarking between the two options before deciding.

Upvotes: 4
