Reputation: 1554
I work for a company which receives data from smart meters. The data in the live stream can be up to 2 days old, and may be back-filled later when errors occur (gaps etc.). We currently store it for 5 years, typically. The data is then pulled into an SSAS cube and aggregated at 1-minute, 5-minute, 30-minute, 1-hour, 1-day, 1-week, and 1-month intervals. For each of these aggregations the Min, Max, and Avg are also stored. Building this cube is slow and does not currently scale, since it draws its data from a single source.
I think that an RRD-style database per data point, driven by the data push, would be a better fit. However, I have several questions about RRD (examples would be most welcome).
Upvotes: 1
Views: 1149
Reputation: 53478
An RRA is a round-robin archive; it defines a number of data points and their resolution. So, assuming a 5-minute sample rate, you can define:
RRA:AVERAGE:0.5:1:2000
RRA:AVERAGE:0.5:12:2400
These will hold about a week at 5m resolution and 100 days at 1h resolution. You could quite easily extend your 5m-resolution RRA, although it will make your RRD bigger. The question is: do you actually need to? The whole point of RRDs is the automatic archiving matched to graphing resolution. When you look at a year's worth of stats, you can't render 5m resolution anyway. With 5m samples at one sample per pixel, a 1600px-wide graph covers only about 5.5 days.
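To make the retention arithmetic concrete, here is a minimal Python sketch that works out the resolution and total time span of an RRA definition. It assumes a base step of 300 seconds (the 5-minute sample rate mentioned above); the helper names rra_retention and graph_span are made up for illustration and are not part of rrdtool.

```python
from datetime import timedelta

STEP_SECONDS = 300  # assumption: one sample every 5 minutes


def rra_retention(rra: str, step_seconds: int = STEP_SECONDS) -> tuple[timedelta, timedelta]:
    """Return (resolution, total span) for an 'RRA:CF:xff:steps:rows' definition."""
    _tag, _cf, _xff, steps, rows = rra.split(":")
    # Each row consolidates `steps` base samples, so it covers steps * step seconds.
    resolution = timedelta(seconds=step_seconds * int(steps))
    # The archive keeps `rows` such rows before it wraps around.
    return resolution, resolution * int(rows)


def graph_span(width_px: int, step_seconds: int = STEP_SECONDS) -> timedelta:
    """Time covered by a graph that plots one base-step sample per pixel."""
    return timedelta(seconds=width_px * step_seconds)


if __name__ == "__main__":
    for rra in ("RRA:AVERAGE:0.5:1:2000", "RRA:AVERAGE:0.5:12:2400"):
        resolution, span = rra_retention(rra)
        print(f"{rra}: one row per {resolution}, covering {span}")
    print("1600px graph at one sample per pixel:", graph_span(1600))
```

The same calculation can be turned around to size an RRA: divide the period you want to keep by steps * step to get the number of rows you need.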
You can use rrdtool dump to extract the content of the RRD in XML form, which you can also modify directly and then load back with rrdtool restore. If you need to do this with any real frequency, I'd suggest using something other than rrdtool.
Upvotes: 2