Influxdb speed up query over long time periods with group by

Question

i write sensor data every second to an influxdb database. Displaying weekly, monthly or yearly summaries in grafana is quite slow since it needs to query many thousand values.

To speed things up, i was thinking about using a cron job to run a queries like

select mean(sensor1) into data_avg_1h from data where time > start and time <= end group by time(1h)

select mean(sensor1) into data_avg_1d from data where time > start and time <= end group by time(1d)

select mean(sensor1) into data_avg_1w from data where time > start and time <= end group by time(1w)

This would mean i need more storage, but queries run much faster.

Is this a bodge job or acceptable and is there a more clever way to do something like that?

robert · Accepted Answer

Yes. It is perfectly ok and it is also recommended to downsample the data like you have mentioned in the question.

However, instead of using a cronjob it will be better to use Continuous query feature of InfluxDB to achieve the same result.

Downsampling & Contious Query Documentation.

Please be aware that when storing the average value for short period, if you want to calculate the average for a longer period from this downsampled data you will have to calculate the weighted average. Otherwise, you will calculating the average of average which, may not be equal to the average value calculated from the Original data.

This is because, each downsampled average value might be having different number of datapoints.

So while calculating the mean on regular interval store the number of data points received in that interval. This way you will be able to calculate the weighted average.

Influxdb speed up query over long time periods with group by

Answers (1)

Related Questions