Reputation: 70068
I want to use redis to store a large set of user_ids and with each of these ids, a "group id" to which that user was previously assigned:
User_ID | Group_ID
1043 | 2
2403 | 1
The number of user_ids is fairly large (~ 10 million); the number of unique group ids is about 3 - 5.
My purpose for this LuT is routine:
find the group id for a given user; and
return a list of other users (of specified length) with the same group id as that given user
There might be an idiomatic way to do this in redis or at least a way that's most efficient. If so i would like to know what it is. Here's a simplified version of my working implementation (using the python client):
# assume a redis server is already running
# create some model data:
import numpy as NP
NUM_REG_USERS = 100
user_id = NP.random.randint(1000, 9999, NUM_REG_USERS)
cluster_id = NP.random.randint(1, 4, NUM_REG_USERS)
D = zip(cluster_id, user_id)
from redis import Redis
# r = Redis()
# populate the redis LuT:
for t in D :
r.sadd( t[0], t[1] )
# the queries:
# is user_id 1034 in Group 1?
r.sismember("1", 1034)
# return 10 users in the same Group 1 as user_id 1034:
r.smembers("1")[:10] # assume user_id 1034 is in group 1
So i have implemented this LuT using ordinary redis sets; each set is keyed to a Group ID (1, 2, or 3), so there are three sets in total.
Is this the most efficient way store this data given the type of queries i want to run against it?
Upvotes: 3
Views: 2090
Reputation: 16174
Using sets is a good basic approach, though there are a couple of things in there you may want to change:
Unless you store the group ID for each a user somewhere you will need 5 round trips to get the group for a particular user - the operation itself is O(1), but you still need to consider latency. Usually it is fairly easy to do this without too much effort - you have lots of other properties stored for each user, so it is trivial to add one for group id.
You probably want SRANDMEMBER rather than SMEMBERS - I think SMEMBERS will return the same 10 items from your million item set every time.
Upvotes: 1