Abhiram

Reputation: 333

Datastore for aggregations

What is a preferred datastore for fast aggregation of data? I pull data from other systems regularly, and the datastore should support queries like:

Right now, I'm using a custom data model in Redis: the data is fetched into memory, and the aggregates are then run over it. The problem with this model is that it is closely tied to my pivots (columns). Any additional pivot will cause my data to explode, leading to huge memory consumption on my Redis boxes.

I've explored Elasticsearch, but Elasticsearch queries with aggregations take longer than 200ms for the kind of data that I have.

Are there any other alternatives? I'm also looking at Aerospike now. Can someone shed some light on how Aerospike aggregations work in this scenario?

Upvotes: 0

Views: 371

Answers (1)

sunil

Reputation: 3567

Aerospike supports aggregations on top of secondary index queries. It seems most of your queries are pivoted on user. You can build a secondary index on userid and query for all the data corresponding to a user. You can then apply the aggregation logic and filter the records by the desired time range. You need to do this because Aerospike does not yet support multiple where clauses, so you cannot query for a user and a time range at the same time.
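To make the query-then-filter idea concrete, here is a minimal sketch in plain Python. It is not Aerospike client code: the record fields (`userid`, `ts`, `amount`) and the helpers `records_for_user` and `sum_in_range` are hypothetical names invented for the illustration. In Aerospike, the equivalent logic would sit in a stream UDF applied to the result of the secondary index query.

```python
from typing import Any, Dict, Iterable, Iterator

Record = Dict[str, Any]


def records_for_user(records: Iterable[Record], userid: str) -> Iterator[Record]:
    """Stand-in for a secondary index query on userid (hypothetical helper)."""
    return (r for r in records if r["userid"] == userid)


def sum_in_range(stream: Iterable[Record], start_ts: int, end_ts: int,
                 field: str = "amount") -> int:
    """Filter the stream to [start_ts, end_ts) and sum one field.

    The time filter is applied after the user lookup because the
    query itself can only carry a single predicate.
    """
    in_range = (r for r in stream if start_ts <= r["ts"] < end_ts)
    return sum(r[field] for r in in_range)


# Made-up sample data, only for the illustration.
SAMPLE = [
    {"userid": "u1", "ts": 100, "amount": 5},
    {"userid": "u1", "ts": 200, "amount": 7},
    {"userid": "u2", "ts": 150, "amount": 11},
    {"userid": "u1", "ts": 900, "amount": 13},
]

total = sum_in_range(records_for_user(SAMPLE, "u1"), 100, 300)
```

Note that the time filter touches every record of the user, so the cost grows with how much history each user has.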

Your queries 1 and 2 can be handled by writing an aggregation UDF that runs on a secondary index query on userid, as described above.

I am not very clear about your query 3. Aerospike does not provide group by, sum, count, etc. as native queries, but you can always write an aggregation UDF to achieve the same result. http://www.aerospike.com/docs/guide/aggregation.html
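As a rough illustration of what a hand-written group by with count and sum looks like, here is a small Python sketch. It only mimics the general shape of a distributed aggregation (a per-partition aggregate step followed by a merge of partial results) and is not Aerospike's UDF API. The field names (`page`, `ms`) and the function names are assumptions made up for the example.

```python
from collections import defaultdict
from functools import reduce
from typing import Any, Dict, Iterable

Record = Dict[str, Any]
Groups = Dict[str, Dict[str, int]]


def aggregate_partition(records: Iterable[Record], key: str, field: str) -> Groups:
    """Group one partition's records by `key`, tracking count and sum of `field`."""
    groups: Groups = defaultdict(lambda: {"count": 0, "sum": 0})
    for r in records:
        g = groups[r[key]]
        g["count"] += 1
        g["sum"] += r[field]
    return dict(groups)


def merge(a: Groups, b: Groups) -> Groups:
    """Combine two partial results without mutating either input."""
    out = {k: dict(v) for k, v in a.items()}
    for k, v in b.items():
        if k in out:
            out[k]["count"] += v["count"]
            out[k]["sum"] += v["sum"]
        else:
            out[k] = dict(v)
    return out


# Made-up data standing in for records held on two different nodes.
NODE_A = [{"page": "home", "ms": 120}, {"page": "cart", "ms": 300}]
NODE_B = [{"page": "home", "ms": 80}]

partials = [aggregate_partition(p, "page", "ms") for p in (NODE_A, NODE_B)]
result = reduce(merge, partials)
```

Keeping count and sum as separate fields means an average can be derived at the end, which would not be possible if each partition returned only its own average.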

Upvotes: 1
