Reputation: 373
I want to store user profiles in Redis, as I have to frequently read multiple users' profiles. There are two options I see at present:
Option 1: store a separate hash key per user's profile
Option 2: use a single hash key to store all users' profiles
Please tell me which option is best, considering the following:
Upvotes: 5
Views: 3256
Reputation: 2939
Option 1, but don't use a hash; use a single string key per user, like SET profile:4d094f58c96767d7a0099d49 {...}. That is, SET, not HSET.
Option 2 (fetching many profiles in one command with HMGET), but only if your user base is not very big. Otherwise it can be too hard for the server to serve you the result.
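For illustration, here is a minimal sketch of a batched read under each option using the redis-py client; the key names and ids are assumptions, not something fixed by Redis:

import json
import redis

r = redis.Redis(host='localhost', port=6379, db=0)
ids = ['4d094f58c96767d7a0099d49', '4d094f58c96767d7a0099d4a']  # hypothetical ids

# Option 1: one string key per profile; MGET batches the reads
raw = r.mget([f'profile:{i}' for i in ids])

# Option 2: one big hash keyed by user id; HMGET batches the reads
raw = r.hmget('profiles', ids)

profiles = [json.loads(v) for v in raw if v is not None]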
Option 3 is to break your user data into hash buckets determined by a hash of the user id. This works well if you have many users and do batches often. Like this:
HSET profiles:<bucket> <id> {json object}
HGET profiles:<bucket> <id>
HGETALL profiles:<bucket>
The last one fetches a whole bucket of profiles; I don't recommend letting a bucket grow beyond about 1 MB in total. This works well with sequential ids, but not so well with hashed ids, because buckets can grow too much. If you used it with hashed ids and a bucket grew so large that it slows down your Redis, you can fall back to HSCAN (as in option 2) or redistribute objects across more buckets with a new hash function.
My recommendation, if I understood your situation right, is to use the third option with sequential ids and a bucket range of 100. And if you are aiming at high volumes of data, plan for a cluster from day one.
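A minimal sketch of this third option with redis-py, assuming sequential integer ids and the bucket range of 100 suggested above:

import json
import redis

r = redis.Redis(host='localhost', port=6379, db=0)
BUCKET_SIZE = 100  # assumed bucket range, per the recommendation above

def bucket_of(user_id):
    # sequential ids map to contiguous buckets: 0-99 -> 0, 100-199 -> 1, ...
    return user_id // BUCKET_SIZE

def save_profile(user_id, profile):
    # HSET profiles:<bucket> <id> {json object}
    r.hset(f'profiles:{bucket_of(user_id)}', user_id, json.dumps(profile))

def load_profile(user_id):
    # HGET profiles:<bucket> <id>
    raw = r.hget(f'profiles:{bucket_of(user_id)}', user_id)
    return json.loads(raw) if raw is not None else None

def load_bucket(bucket):
    # HGETALL profiles:<bucket> -- a whole bucket in one round trip
    return {int(f): json.loads(v) for f, v in r.hgetall(f'profiles:{bucket}').items()}

def iter_bucket(bucket):
    # HSCAN-based fallback for buckets that grew too large for HGETALL
    for f, v in r.hscan_iter(f'profiles:{bucket}'):
        yield int(f), json.loads(v)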
Upvotes: 0
Reputation: 3573
As Sergio Tulentsev pointed out, it's not good to store all the users' data (especially if the dataset is huge) inside one single hash by any means.
Storing the users' data as individual keys is also not preferred if you're looking for memory optimization, as pointed out in this blog post.
Reading the users' data with a pagination mechanism demands a database rather than a simple caching system like Redis, so it's recommended to use a NoSQL database such as MongoDB for this.
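For instance, a paginated read from MongoDB might look like this with pymongo; the database and collection names here are assumptions:

from pymongo import MongoClient

client = MongoClient('localhost', 27017)
users = client.appdb.users  # hypothetical database and collection

def get_profiles_page(page_no, page_size=20):
    # skip/limit pagination over the users collection
    return list(users.find().skip(page_no * page_size).limit(page_size))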
But reading from the database each time is a costly operation, especially if you're reading a lot of records.
Hence the best solution is to cache the most active users' data in Redis to eliminate the database fetch overhead.
I recommend looking into walrus.
It basically follows this pattern:
@cache.cached(timeout=expiry_in_secs)
def function_name(param1, param2, ..., param_n):
    # perform database fetch
    # return user data
This ensures that frequently accessed or requested user data is in Redis, and the function automatically returns the value from Redis rather than making the database call. The key also expires after the configured timeout, so stale entries don't linger.
You set it up as follows:
from walrus import Database

db = Database(host='localhost', port=6379, db=0)
cache = db.cache()
where host can take the domain name of a Redis server or cluster running remotely, and cache provides the cached decorator used in the pattern above.
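For example, a cached profile lookup could look like this; the function name, timeout, and the pymongo query are illustrative assumptions, not part of walrus:

@cache.cached(timeout=600)  # keep the result in Redis for 10 minutes
def get_user_profile(user_id):
    # hypothetical database fetch; users is a pymongo collection
    return users.find_one({'_id': user_id})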
Hope this helps.
Upvotes: 3
Reputation: 749
Option #1.
If you "can't have a dataset larger than memory", you can look into Partitioning, as the Redis FAQ suggests. In the Redis FAQ you can also check other limits, such as the "maximum number of keys a single Redis instance can hold" or the "Redis memory footprint".
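As a rough sketch of client-side partitioning (one common scheme; the two local nodes and the CRC-modulo choice are assumptions, not the only way to do it):

import json
import zlib
import redis

# hypothetical pool of Redis nodes; grow it as the dataset grows
nodes = [
    redis.Redis(host='localhost', port=6379),
    redis.Redis(host='localhost', port=6380),
]

def node_for(key):
    # a stable hash of the key picks the owning node
    return nodes[zlib.crc32(key.encode()) % len(nodes)]

def set_profile(user_id, profile):
    key = f'profile:{user_id}'
    node_for(key).set(key, json.dumps(profile))

def get_profile(user_id):
    key = f'profile:{user_id}'
    raw = node_for(key).get(key)
    return json.loads(raw) if raw is not None else None

Note that plain modulo hashing makes adding nodes painful, since most keys move; consistent hashing or Redis Cluster are the usual answers at scale.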
Upvotes: 2