Dean

Reputation: 6354

Optimal key hashes in redis

I've been reading the article on Redis's website about using HSET instead of SET to achieve much more optimized memory usage: http://redis.io/topics/memory-optimization

The problem I have is that my keys are much longer than those in the article. They are of the form: u:<USER_ID of 9-16 characters>

Examples:

u:123456789
u:123456789abcdefg

Where is the optimal place to split the key for use with HSET?

I had read here that ideally you would have no more than 1000 items in each "sub-set". So in that case I would split off the last 3 characters, resulting in a theoretical maximum of 1000 items (suffixes 000-999) in each set.

Unfortunately the user IDs are not that predictable and I cannot guarantee a good spread. For example, if I split as HSET u:123456 789, I could not guarantee that all the other user IDs beginning with 123456 would exist to fill the set, meaning memory would not be that well optimized.

What should I do?

Upvotes: 1

Views: 2041

Answers (1)

Agis

Reputation: 33626

Update: It seems you are misunderstanding the article. It is not about "HSET vs SET". It is about saving memory when using the aggregate data types (hashes, sets, sorted sets, etc.), not single strings.

In your case it does not make sense to use a hash instead of a string, since you want each key to map to a single string.

You need to have the whole user ID in your key if you want the guarantee that you won't have collisions (i.e. two users mapping to the same key).

However you could shard the hash and have your users within multiple hashes.

For example, user IDs beginning with 1-3 would go into the first hash, 4-6 into the second, 7-9 into the third, etc. You could use a hash function to decide which hash each user is stored in and retrieved from.
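A minimal sketch of that bucketing idea in Python, using CRC32 as the hash function (the shard count and key naming here are illustrative assumptions, not anything prescribed by Redis):

```python
import zlib

# Hypothetical shard count; tune it so each hash stays under your
# hash-max-ziplist-entries setting for your expected number of users.
NUM_SHARDS = 1024

def shard_key(user_id: str) -> tuple[str, str]:
    """Map a user ID to a (hash key, field) pair.

    The full user ID is kept as the field, so two users can never
    collide; the CRC32 bucket only decides which hash the entry
    lives in.
    """
    bucket = zlib.crc32(user_id.encode("utf-8")) % NUM_SHARDS
    return f"u:{bucket}", user_id

# With a redis-py client r (assumed available), storing and fetching
# a user would then look like:
#   key, field = shard_key("123456789abcdefg")
#   r.hset(key, field, value)
#   r.hget(key, field)
```

Because CRC32 spreads values roughly uniformly, you get an even distribution across the shards even when the user IDs themselves are unpredictable, which addresses the "good spread" concern in the question.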

If you make sure that your hashes:

  1. contain no more than a certain number of entries (see the hash-max-ziplist-entries setting) and,
  2. contain only keys and values with sizes within the hash-max-ziplist-value setting (64 by default)

then you get a memory saving because your hashes are stored in a much more memory-efficient way, using a data structure called ziplist.

Note: these settings live in the redis.conf file.
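For illustration, the relevant lines in redis.conf might look like this (the values are examples to adjust for your own entry counts and value sizes, not recommendations):

```
hash-max-ziplist-entries 1024
hash-max-ziplist-value 64
```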

Upvotes: 2
