Reputation: 1015
I am interested in building a time series application on Redis. Some of my data arrives late, as historical backfill, so the data has to be mutable. I currently have around 100 million events.
The data access needs to be flexible. In order of importance, the dimensions are:
type [sale commission impression] datetype [min hour day] clientid brandid producttypeid productid
I would first ask for the type of data (sale, commission, or impression), then the aggregation level (second, minute, hour, or day). Then I have a hierarchy of event data: by client, brand, product type, and each specific product.
Is there a way to build a key like:
sale:hour:clientid:brandid:producttypeid:productid
and then query on a partial of that key like
sale:hour:clientid:brandid
Or would I need to build buckets for each?
Upvotes: 1
Views: 578
Reputation: 577
Redis Labs now offers RedisTimeSeries, which has some interesting features. For your use case, we have labels, which you can filter by later to retrieve data from the relevant time series. It also supports several aggregations.
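As a rough sketch of how labels and aggregation could map onto the question's hierarchy, the Python below only builds the argument lists for `TS.ADD` and `TS.MRANGE`; it does not talk to a server. The key name, label names (`type`, `client`, `brand`, `producttype`, `product`), and ID values are assumptions made up for illustration, not something RedisTimeSeries prescribes.

```python
from typing import Dict, List


def ts_add_command(key: str, timestamp_ms: int, value: float,
                   labels: Dict[str, str]) -> List[str]:
    """Build arguments for: TS.ADD key timestamp value LABELS label value ..."""
    args = ["TS.ADD", key, str(timestamp_ms), str(value), "LABELS"]
    for name, label_value in labels.items():
        args += [name, label_value]
    return args


def ts_mrange_command(from_ms: int, to_ms: int, aggregator: str,
                      bucket_ms: int, filters: Dict[str, str]) -> List[str]:
    """Build arguments for: TS.MRANGE from to AGGREGATION agg bucket FILTER l=v ..."""
    args = ["TS.MRANGE", str(from_ms), str(to_ms),
            "AGGREGATION", aggregator, str(bucket_ms), "FILTER"]
    args += [f"{name}={value}" for name, value in filters.items()]
    return args


HOUR_MS = 60 * 60 * 1000

# Hypothetical series: one key per product, labelled with its full hierarchy.
add_args = ts_add_command(
    "sale:c42:b7:t3:p1001", 1503907200000, 19.99,
    {"type": "sale", "client": "c42", "brand": "b7",
     "producttype": "t3", "product": "p1001"},
)

# Hourly sums over one day for every series matching type, client and brand.
mrange_args = ts_mrange_command(
    1503878400000, 1503878400000 + 24 * HOUR_MS, "sum", HOUR_MS,
    {"type": "sale", "client": "c42", "brand": "b7"},
)
```

With a client such as redis-py, each list could be sent with `execute_command(*args)`, assuming the server has the RedisTimeSeries module loaded. Filtering on a subset of labels plays the role of the partial key `sale:hour:clientid:brandid`.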
Upvotes: 1
Reputation: 4677
You could use a Redis sorted set. Check out the section "Lexicographical indexes" in this article.
To add an element to a sorted set, you use the ZADD command, like this:
ZADD key score member
But in your case all columns are strings, and the score must be a number, so the score is of no use here. We always set the score to 0 so that it has no influence: when all members have the same score, the sorted set orders them lexicographically by member.
Then we create a composite index with the sorted set, much like in a relational database:
ZADD myset 0 sale:hour:clientid:brandid:producttypeid:productid:real_value
If you want to query by sale:hour, you could do the following. (Let's assume sale is a unique ID of six characters and hour is a datetime in the format YYYY-MM-ddThh, and that you want the entries where sale is FG63dF and hour is 2017-08-28T08.)
ZRANGEBYLEX myset [FG63dF:2017-08-28T08 (FG63dF:2017-08-28T09
This returns that hour's data for the given sale ID. Queries on other prefixes work the same way. As for ( and [, they mark the range bound as exclusive or inclusive, respectively.
If you want to query with all columns specified, do it like this:
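To make the range semantics concrete, here is a minimal in-memory sketch in Python that mimics what `ZRANGEBYLEX` does when every member has score 0. It is a stand-in for illustration, not a Redis client; the `LexIndex` class and the sample members are hypothetical.

```python
import bisect
from typing import List


class LexIndex:
    """In-memory stand-in for a sorted set whose members all have score 0."""

    def __init__(self) -> None:
        self._members: List[str] = []

    def add(self, member: str) -> None:
        # Like ZADD key 0 member: keep members unique and in sorted order.
        i = bisect.bisect_left(self._members, member)
        if i == len(self._members) or self._members[i] != member:
            self._members.insert(i, member)

    def range_by_lex(self, min_bound: str, max_bound: str) -> List[str]:
        # Like ZRANGEBYLEX key min max, with "[", "(", "-" and "+" bounds.
        lo = self._position(min_bound, is_min=True)
        hi = self._position(max_bound, is_min=False)
        return self._members[lo:hi]

    def _position(self, bound: str, is_min: bool) -> int:
        if bound == "-":
            return 0
        if bound == "+":
            return len(self._members)
        kind, value = bound[0], bound[1:]
        if kind not in ("[", "("):
            raise ValueError("bound must start with '[' or '(' or be '-'/'+'")
        inclusive = kind == "["
        # Inclusive min and exclusive max both cut just before `value`;
        # exclusive min and inclusive max both cut just after it.
        if inclusive == is_min:
            return bisect.bisect_left(self._members, value)
        return bisect.bisect_right(self._members, value)


index = LexIndex()
# Python compares code points while Redis compares bytes; for ASCII they agree.
for member in [
    "FG63dF:2017-08-28T09:c1:b1:t1:p1:3.50",
    "FG63dF:2017-08-28T08:c1:b2:t1:p7:5.00",
    "ZZ99zz:2017-08-28T08:c1:b1:t1:p1:1.00",
    "FG63dF:2017-08-28T08:c1:b1:t1:p1:19.99",
]:
    index.add(member)

hour_rows = index.range_by_lex("[FG63dF:2017-08-28T08", "(FG63dF:2017-08-28T09")
```

Here `hour_rows` holds only the two `FG63dF:2017-08-28T08:...` members: every member starting with that prefix sorts at or after the inclusive lower bound and strictly before the exclusive upper bound.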
ZRANGEBYLEX myset [FG63dF:2017-08-28T08:<client-id>:<brand-id>:<producttypeid>:<product-id> +
The '+' stands for positive infinity, meaning no upper bound, so this returns every member from that prefix to the end of the set. You can also give an explicit upper bound or add a LIMIT clause to restrict the result. Refer to the ZRANGEBYLEX documentation for details.
Something to note: every column you use in the composite index (sale, hour, and so on) must have a consistent format and the same padding. For example, if some brandid values are four-digit numbers and others have five digits, you must left-pad the four-digit numbers, e.g. 7878 => 07878.
Also, since we use the colon to separate the columns, the column values themselves must not contain a colon.
For a more detailed how-to, refer to the article I mentioned at the start.
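The padding and separator rules can be enforced when the member string is built. The following Python helpers are a small sketch under the assumption that numeric IDs are non-negative and have a known maximum width; the names `pad_id` and `build_member` are hypothetical.

```python
from typing import Sequence

SEPARATOR = ":"


def pad_id(value: int, width: int) -> str:
    """Left-pad a non-negative numeric ID with zeros to a fixed width."""
    if value < 0:
        raise ValueError("IDs are assumed to be non-negative")
    text = str(value).zfill(width)
    if len(text) > width:
        raise ValueError(f"{value} does not fit in {width} digits")
    return text


def build_member(columns: Sequence[str]) -> str:
    """Join column values into one index member, rejecting embedded separators."""
    for column in columns:
        if SEPARATOR in column:
            raise ValueError(f"column {column!r} must not contain {SEPARATOR!r}")
    return SEPARATOR.join(columns)


member = build_member(["FG63dF", "2017-08-28T08", pad_id(7878, 5)])
```

Without padding, the string "7878" sorts after "12345", which would break range queries; after padding, "07878" sorts before "12345" as expected.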
Upvotes: 1
Reputation: 8119
I think Couchbase, with its views (indexes), would be a much better NoSQL choice for your use case. You'd be able to store each item as a single entry and then slice it any way you want with views. However, views are only eventually consistent, so the data might be a little stale (a few seconds, up to a minute or so). If you can live with that, this is a much better tool for you. It's also persistent, which is harder to get with Redis.
Upvotes: 0