mkorszun

Reputation: 4571

What is the most efficient way to store time series in Riak with heavy reads

My current approach:

My concern is the efficiency of queries made over a specific time window for a given application. Currently, to get a time series for a specific time window and eventually make some reductions, I have to run a map/reduce over the whole "time_metric/APPLICATION_KEY" bucket, which, from what I have found, is not a recommended use case for Riak MapReduce.

My question: what would be the best db structure for this kind of system, and how can I query it efficiently?

Upvotes: 3

Views: 2463

Answers (2)

Alex Moore

Reputation: 3455

Adding onto @macintux's answer.

Basho has had a few customers that have used Riak for time series metrics. Boundary has a nice tech talk about how they use Riak with their network monitoring software. They roll up data into different chunks of time (1m, 5m, 15m) for analysis. They also have a series of blog posts about lessons learned while implementing this system.

Kivra also has a good slide deck about how they use time-series data with Riak.

You could roll your data up into blocks of some arbitrary time length, read the range you need by issuing regular K/V gets, and then reconstruct the larger picture / reduce in your application.
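A minimal sketch of that approach, assuming 5-minute blocks and a hypothetical `app/metric/timestamp` key scheme (neither is prescribed by Riak; the actual client GETs are left out, only the key computation is shown):

```python
from datetime import datetime, timezone

BLOCK_SECS = 5 * 60  # roll-up granularity: 5-minute blocks (an assumption)

def block_keys(app_key, metric, start, end, block_secs=BLOCK_SECS):
    """Return the deterministic K/V keys covering [start, end).

    Keys look like 'my_app/cpu/1370088000' -- one per block -- so a
    time-window read is a handful of plain GETs, no MapReduce needed.
    """
    first = int(start.timestamp()) // block_secs * block_secs
    last = int(end.timestamp())
    return ["%s/%s/%d" % (app_key, metric, t)
            for t in range(first, last, block_secs)]

# A 30-minute window maps to six 5-minute block keys:
keys = block_keys("my_app", "cpu",
                  datetime(2013, 6, 1, 12, 0, tzinfo=timezone.utc),
                  datetime(2013, 6, 1, 12, 30, tzinfo=timezone.utc))
```

Because the keys are computable from the query window alone, the read path never has to list or scan the bucket; you fetch each block and merge client-side.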

Upvotes: 4

macintux

Reputation: 894

If you have spare computing power and you know in advance what keys you need, you certainly can use Riak's MapReduce, but often retrieving the keys and running your processing on the client will be as fast (and won't strain your cluster).

Some general ideas:

  • Roll up your data into larger blocks
    • If you're concerned about losing data if your client crashes while buffering it, you can always store the data as it arrives
    • Similar idea: store the data as it arrives, then retrieve it and roll it up at certain intervals
      • You can automatically expire data once you're confident it is being reliably stored in larger blocks, using either the Bitcask or Memory backends
      • The Memory backend is quite useful (RAM permitting) for any data that only needs to be stored for a limited period of time
  • Related: don't be afraid to store multiple copies of your data to make reading/reporting easier later
    • Multiple chunks of time (5- and 15-minute blocks, for example)
    • Multiple report formats
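For the automatic-expiry point above: both backends support a TTL in Riak's app.config, so raw samples can age out on their own once they've been rolled up (the values below are illustrative, not recommendations):

```erlang
%% app.config -- expire raw samples after one hour (illustrative value)
{bitcask, [
    {expiry_secs, 3600}
]},

%% or, if the raw data can live in RAM only:
{memory_backend, [
    {ttl, 3600}
]}
```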
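The "store as it arrives, then roll it up" and "multiple chunks of time" ideas above can be sketched as a client-side aggregation step (a minimal sketch; the count/sum/min/max block format and the 5- and 15-minute granularities are assumptions, and writing each rolled-up block back to Riak is left out):

```python
from collections import defaultdict

def roll_up(samples, block_secs):
    """Group (epoch_secs, value) samples into fixed-size time blocks,
    keeping count/sum/min/max per block so averages can be derived later."""
    blocks = defaultdict(lambda: {"count": 0, "sum": 0.0,
                                  "min": float("inf"), "max": float("-inf")})
    for ts, value in samples:
        b = blocks[ts // block_secs * block_secs]
        b["count"] += 1
        b["sum"] += value
        b["min"] = min(b["min"], value)
        b["max"] = max(b["max"], value)
    return dict(blocks)

# Raw samples as they arrived; roll the same data up at two granularities
# (the "multiple copies" idea) and store each copy under its own key.
samples = [(0, 1.0), (60, 3.0), (299, 5.0), (300, 7.0)]
five_min = roll_up(samples, 5 * 60)      # blocks at t=0 and t=300
fifteen_min = roll_up(samples, 15 * 60)  # single block at t=0
```

Running the same raw data through several granularities is cheap compared to re-reading it later, which is why storing multiple copies tends to pay off for read-heavy reporting.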

Having said all that, if you're doing straight key/value requests (ideally you can always compute the keys you need, rather than relying on indexing or searching), Riak can support very heavy traffic loads, so I wouldn't spend too much time building alternative storage mechanisms unless you know you're going to face latency problems.

Upvotes: 3
