dhirajforyou

Reputation: 462

whisper aggregation not working for older data points

carbon storage scheme

[default]  
pattern = .*  
retentions = 5m:15d,15m:1y,1h:10y,1d:100y

storage-aggregation :

[all_sum]  
pattern = .*  
xFilesFactor = 0.1  
aggregationMethod = sum  

Now I am feeding entries like this:

echo "rec.test 25 $(date --date="-6 minute" +%s)" | nc localhost 2003  
echo "rec.test 50 $(date --date="-3 minute" +%s)" | nc localhost 2003  
echo "rec.test 100 $(date +%s)" | nc localhost 2003  
echo "rec.test 1 $(date --date="-1 year" +%s)" | nc localhost 2003  
echo "rec.test 4 $(date --date="-1 year minute" +%s)" | nc localhost 2003  
echo "rec.test 6 $(date --date="-1 year -1 minute" +%s)" | nc localhost 2003  
echo "rec.test 8 $(date --date="-1 year -2 minute" +%s)" | nc localhost 2003  

In the Grafana graph, I can see the aggregation (sum) for the recently fed values. But the values from 1 year ago are not aggregated. In fact, only one value, 8 (the latest entry in the 1-hour window), is shown instead of 4+6+8=18.

What could be missing from the configuration?

Upvotes: 1

Views: 599

Answers (2)

Gavrila Florin

Reputation: 1

Same problem here, and I have no access to the graphite / whisper settings because it is a prod environment. You can aggregate the data externally and then send it to the graphite data port. https://github.com/floringavrila/graphite-feeder

Upvotes: 0

kamaradclimber

Reputation: 2489

There is a buffer mechanism in carbon-aggregator that stores values received during the finest retention period and emits the aggregated value.

In your example, 5m:15d means that the buffer will store all points received in the last 5 minutes and periodically emit their sum to carbon-cache (which writes it to the whisper file).

That explains the normal workflow of points in graphite.

Example:

  Metrics received:
  hello.world 42  1427615689 (15 minutes ago)
  hello.world 1   1427615869 (12 minutes ago)
  hello.world 1   1427615929 (11 minutes ago)
  hello.world 314 1427616049 (9 minutes ago)
  hello.world 1   1427616051 (~9 minutes ago)

will write 2 points in whisper file:

1427615689 44 (42+1+1)
1427615989 315 (314+1)
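
The numbers above can be reproduced with a small Python sketch. This is only an illustration, not carbon's code: `sum_by_window` is a made-up helper, and the windows are anchored at the first timestamp to match the example, whereas carbon itself, as far as I know, aligns intervals to multiples of the resolution, so real bucket boundaries can differ.

```python
RESOLUTION = 300  # seconds; the "5m" of the 5m:15d retention


def sum_by_window(points, resolution=RESOLUTION):
    """Sum (timestamp, value) pairs per window of `resolution` seconds.

    Hypothetical helper: windows start at the earliest timestamp, which
    mirrors the simplified example above rather than carbon's alignment.
    """
    if not points:
        return []
    anchor = min(ts for ts, _ in points)
    sums = {}
    for ts, value in points:
        # Start of the window this point falls into.
        start = anchor + ((ts - anchor) // resolution) * resolution
        sums[start] = sums.get(start, 0) + value
    return sorted(sums.items())


received = [
    (1427615689, 42),
    (1427615869, 1),
    (1427615929, 1),
    (1427616049, 314),
    (1427616051, 1),
]
written = sum_by_window(received)
print(written)  # [(1427615689, 44), (1427615989, 315)]
```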

However, a buffer is dropped when its first point is older than a given threshold.

The threshold is computed so that late points can still be aggregated (for example, points arriving a few seconds after the normal 5-minute window), but this has to stop somewhere (otherwise all points would have to be kept in carbon-aggregator's memory forever). This threshold is resolution * settings['MAX_AGGREGATION_INTERVALS'], where MAX_AGGREGATION_INTERVALS defaults to 5.
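
As a quick check of the arithmetic, here is a minimal Python sketch of that threshold for the 5m resolution. The helpers `buffer_threshold` and `is_too_late` are hypothetical names for illustration, and the strict "older than" comparison is an assumption, not something taken from carbon's source.

```python
RESOLUTION = 5 * 60            # seconds; finest retention in the question (5m)
MAX_AGGREGATION_INTERVALS = 5  # default value quoted above


def buffer_threshold(resolution=RESOLUTION, intervals=MAX_AGGREGATION_INTERVALS):
    """Age in seconds after which a buffer is considered dropped."""
    return resolution * intervals


def is_too_late(point_ts, received_ts,
                resolution=RESOLUTION, intervals=MAX_AGGREGATION_INTERVALS):
    """True if a point arrives after its buffer would have been dropped.

    Assumption: a strict comparison against the threshold.
    """
    return received_ts - point_ts > buffer_threshold(resolution, intervals)


threshold = buffer_threshold()
print(threshold, "seconds =", threshold // 60, "minutes")  # 1500 seconds = 25 minutes
```

With these numbers, a point stamped one year in the past is far beyond the 25-minute window, so `is_too_late` returns True for it.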

In your case, any point received more than 25 minutes after the timestamp it carries will find that its buffer has been deleted. Graphite will then create a new buffer and emit "the aggregated" value to whisper, overwriting the correct value.

In the previous example, if you send a point:

hello.world 100  1427615690 (~15 minutes ago)

more than 25 minutes after the time of emission, it will overwrite the value in whisper. You'll get:

1427615689 100 (100)
1427615989 315 (314+1)
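
The overwrite can be mimicked with a toy Python model. It assumes a plain dict standing in for the whisper file and a made-up function `emit_late_point`; neither reflects carbon's or whisper's real API, it only shows why the earlier sum is lost.

```python
# State of the (toy) whisper file after the normal aggregation above.
whisper_before = {1427615689: 44, 1427615989: 315}


def emit_late_point(store, slot, value):
    """Return a copy of `store` after a late point lands in `slot`.

    The original buffer is gone, so a fresh buffer holding only the late
    point is created, and its "sum" replaces whatever the slot held.
    """
    fresh_buffer = [value]
    updated = dict(store)
    updated[slot] = sum(fresh_buffer)
    return updated


# hello.world 100 1427615690 arrives late and falls into the first slot.
whisper_after = emit_late_point(whisper_before, 1427615689, 100)
print(whisper_after)  # {1427615689: 100, 1427615989: 315}
```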

Late points are a corner case of graphite's buffer design (and of most time series databases). If you know that some points can arrive late, you can try increasing the MAX_AGGREGATION_INTERVALS setting, but I would recommend storing them elsewhere first and reconciling them offline with what is stored in graphite.

Upvotes: 1
