Prashant
Prashant

Reputation: 1064

Calculating rates in InfluxDB

I have measurement table with 3 columns (time, value, result). time and value are the base of what a timeseries db requires. The result column tag could have one of the following values (success / hard-fail / soft-fail / unknown).

I'd like to track the success rate (success / total) within a given time window (Since I'll be consuming this on grafana, the time window may change and should support dynamic time range query)


Things I've tried on Influx 0.9:

1: Using Grafana: 

Grafana has a percentage stack option to display the count values. Unfortunately it doesn't display the percentage value. It just shows the graph with the distribution. I'm interested in generating a percentage number and I don't think Grafana has that capability.


2: Continuous queries - single table:

I tried setting up a continuous query to aggregate the count of successes and total count in a new measurement table.

CREATE CONTINUOUS QUERY count_total ON metrics BEGIN SELECT SUM(value) as total INTO "metric_agg" FROM "original_metric" WHERE time > now() - 1h GROUP BY time(5m) END

CREATE CONTINUOUS QUERY count_success ON metrics BEGIN SELECT SUM(value) as success INTO "metric_agg" FROM "original_metric" WHERE result = 'success' and time > now() - 1h GROUP BY time(5m) END

Given that these were two different CQ's writing to the same tables, some values don't perfectly match on the new metric_agg table, causing the grafana query to fail. I feel I might be trying something wrong here. Pelase let me know if you see the error

UPDATE: Not sure if the following would work, but wanted to drop it in here after looking at beckettsean's response. I'll try this out soon.

Create 2 CQ dumping aggregates values under separate fields and run a division:

CREATE CQ count_total ON metrics BEGIN SELECT SUM(value) as total INTO "metric_agg" FROM "original_metric" WHERE time > now() - 1h GROUP BY time(5m) END

CREATE CQ count_success ON metrics BEGIN SELECT SUM(value) as success INTO "metric_agg" FROM "original_metric" WHERE result = 'success' and time > now() - 1h GROUP BY time(5m) END

Grafana: select success / total from metric_agg where $timeFilter


3: Continuous queries - single table - overwrites:

I tried inserting "dummy" 0 values on each CQ above, to make sure I have a value in each row for each column. This actually overwrites the original value with a 0 in most cases and generates incorrect values leading to incorrect rates.


4: Continuous queries - multiple tables:

I'm now trying to write each CQ in its own table and hoping to merge the aggregated tables in a Grafana query. However "merge" on influx doesn't seem to generate a single series.

SELECT sum("success") as success, sum("total") as total FROM merge /metric_result_*/ WHERE time > now() - 2h GROUP BY time(5m) fill(0)

Generates 2 series (metric_result_success and metric_result_total .. same as where the CQ's were writing their data).


I guess what I'm trying to ask is: What's the best way to calculate rates on influx 0.9+? CQ's seem very brittle and I'd love to avoid them, if possible.

Upvotes: 5

Views: 6105

Answers (2)

Prashant
Prashant

Reputation: 1064

For anyone reading this now (Influx 1.0 beta was recently launched).

The Influx team fixed an issue** with their 0.10 release that made "2: Continuous queries - single table:" approach work for me. NOTE: I'm writing the values in different tags, so there is no overwrite. Just keep writing aggregate values onto this new table and in grafana I can use a query that divides the columns/tags I'm interested in.

Alternatively you can use Kapacitor as well to do the same.

**For anyone curious, the issue Influx fixed in 0.10 was before 0.10 adding a new tag value at an existing timestamp, would delete old tags / feilds and just write this new information (causing us to delete info).

Upvotes: 0

beckettsean
beckettsean

Reputation: 1836

2: Continuous queries - single table:

This approach is not recommended, as the CQs are both writing to the same series. There are no tags to differentiate the results, so one CQ is overwriting the points from the other CQ. Points are uniquely identified by their measurement name, tag set, and timestamp. The fields are not part of the uniqueness. Each CQ is writing to the same measurement metric_agg. Since there is no GROUP BY <tag> or GROUP BY * the tag set for both CQs is empty. That just leaves the timestamp to differentiate the two, and since they have the same GROUP BY time() clause, they will write points with identical timestamps.

4: Continuous queries - multiple tables:

There is no way to calculate values across measurements (what you are calling "tables"). Series within a measurement can be merged, but not across measurements.

That's a long way of saying that what you request is not yet possible within the InfluxDB query language alone. To calculate the time spent in a particular state you will need to process the query results outside of InfluxDB.

Upvotes: 1

Related Questions