Counting/Querying events with multiple keys and dates in Redis

Question

I'm trying to figure out how to structure my data in Redis. I want to count events that carry two parameters and then query Redis for those counts by date. For example: events come in with two parameters, let's call them site and event type, along with the time the event occurred. I then need to query Redis for how many events occurred over a span of dates, grouped by site and event type.

Here's a brief example data set:

Oct 3, 2012:
   site A / event A
   site A / event A
   site B / event A

Oct 4, 2012:
   site B / event B
   site A / event A
   site B / event A

... and so on.

My query should return the total number of events per site and event type over the date span, which will be five weeks. For the example above, the result would be something like:

   site A / event A ==> 3 events
   site B / event A ==> 2 events
   site B / event B ==> 1 event

I have looked at Redis sorted sets, hashes, and so on. Sorted sets seem like the best fit, but querying them with the ZUNIONSTORE command looks awkward because the events span five weeks: with one sorted set per day, that makes for at least 35 key arguments to ZUNIONSTORE.

Any hints, thoughts, ideas, etc?

Thanks so much for your help.

Didier Spezia · Accepted Answer

Unlike a typical RDBMS or MongoDB, Redis has no rich query language. With those stores, you accumulate the raw data and then run a query to calculate statistics. Redis is not suited to this model.

With Redis, you are supposed to calculate your statistics on-the-fly and store them directly instead of the raw data.

For instance, supposing we are only interested in statistics over a range of weeks, I would structure the data as follows:

Because all the criteria are discrete, simple hash objects can be used instead of zsets.
Use one hash object per week.
In each hash object, keep one counter per (site, event) pair. Optionally, add one counter per site and/or one counter per event.

So when an event occurs, I would pipeline the following commands to Redis:

hincrby W32 site_A:event_A 1 
hincrby W32 site_A:* 1 
hincrby W32 *:event_A 1

Please note there is no need to initialize those counters. HINCRBY will create them (and the hash object) if they do not exist.
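As a rough illustration of the write path above, here is a minimal Python sketch that builds the three HINCRBY commands for one event. The helper names (week_key, event_commands) are hypothetical, and deriving the key from the ISO week number is an assumption; the answer only shows keys of the form W32.

```python
from datetime import date

def week_key(day):
    """Hash key for the week containing `day`, e.g. 'W40'.

    Assumption: the key is 'W' plus the ISO week number.
    """
    return "W{:02d}".format(day.isocalendar()[1])

def event_commands(site, event, day):
    """Return the three HINCRBY commands to pipeline for one event."""
    key = week_key(day)
    return [
        ("HINCRBY", key, "{}:{}".format(site, event), 1),  # per site/event pair
        ("HINCRBY", key, "{}:*".format(site), 1),          # per site only
        ("HINCRBY", key, "*:{}".format(event), 1),         # per event only
    ]

# Example: one event for site_A / event_A on Oct 3, 2012.
cmds = event_commands("site_A", "event_A", date(2012, 10, 3))
for cmd in cmds:
    print(*cmd)
```

With a client such as redis-py, the same three increments could be queued on a pipeline (pipe.hincrby(key, field, 1) for each, then pipe.execute()) so they go out in a single round trip. Note that a week number alone repeats every year, so a year prefix may be worth adding if the data is kept longer than that.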

To retrieve the statistics for one week:

hgetall W32

The result contains the counters per site/event pair, per site only, and per event only.

To retrieve the statistics for several weeks, pipeline the following commands:

hgetall W32
hgetall W33
hgetall W34
hgetall W35
hgetall W36

and perform the aggregation on the client side (quite simple if the language supports associative arrays such as maps or dictionaries).
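The client-side aggregation could look something like the following Python sketch. It assumes each HGETALL reply has already been turned into a dictionary mapping field names to counts (Redis returns the counts as strings); the aggregate function and the sample values for the second week are made up for illustration.

```python
from collections import Counter

def aggregate(weekly_hashes):
    """Sum the counters of several weekly hashes, field by field."""
    totals = Counter()
    for week in weekly_hashes:
        for field, count in week.items():
            totals[field] += int(count)  # HGETALL values arrive as strings
    return dict(totals)

# First week: the counts from the question's example. Second week: invented.
week_1 = {"site_A:event_A": "3", "site_B:event_A": "2", "site_B:event_B": "1"}
week_2 = {"site_A:event_A": "4", "site_B:event_B": "2"}

totals = aggregate([week_1, week_2])
for field in sorted(totals):
    print(field, "==>", totals[field])
```

A field missing from a given week simply contributes nothing, so weeks with no events (or hashes that do not exist yet, which HGETALL returns as empty) need no special handling.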
