Reputation: 548
I would like to implement following using Redis:
Not sure how to implement this.
LPUSH
and LTRIM
- I can maintain fixed size - however, I would not be able to check for duplicate easily.SET
- which allows me to check for duplicate very easily before processing incoming activity - however, I am not sure how to limit the size of a SET
.Which data structure I should use - which would allow me to easily check for duplicate and also maintain the fix size.
I am using StackExchange.Redis
library.
Upvotes: 2
Views: 896
Reputation: 50122
TL;DR neither - use the Sorted Set instead and set the elements' (activity ids) score to the timestamp (i.e. Unix epoch).
As you've pointed out, Lists are a bad choice for detecting duplicates because you'll be doing it in O(N) complexity. Sets are perfect for exact duplicate detection (actually, Hashes can be used as well), and you can call SPOP
with count
of Y whenever the set's cardinality (SCARD
) exceeds your X, or in pseudo code:
y = SCARD key - x
if y > 0 then
SPOP key y
end
However, SPOP
is random and you've mentioned that you have a timestamp. In many cases, it is more practical to limit the set's size by retaining only the newest elements instead of being undeterministic. For that, use the Sorted Set and ZREMRANGEBYRANK
will let you keep the set's cardinality (ZCARD
) at an arbitrary X value by throwing away the oldest activities. Note that ranking is from lowest to highest so you'll need use a rank range from from 0 (the first, lowest-scored element) -X (the Xth+1 element from the last, highest-scored element), i.e. only this:
ZREMRANGEBYRANK key 0 -x
Upvotes: 4