Reputation: 732
I'm making a test using InfluxDB for time series of stock market data with minutes of high, low, open and close prices. My objective is to aggregate the data so each day is divided in N batches (candles), N is an integer and there is no problem if some minutes get ignored by a batch that cannot be filled.
For example:
A trading day presents market data for each minute from 9AM to 5PM, or 480 minutes. If I want to aggregate those data points in 13 candles per day, I have 13 candles of 36 minutes each and the last 12 minutes are ignored.
As I could not find an easy way for extracting date components, as month of year, minute of day and so on, I created a pre processed measure that represents the offset of each minute since market opening, so each datapoint looks like:
date symbol open high low close offset
2017-01-01 09:00:00 XPTO 1.0 1.3 0.9 1.1 0
2017-01-01 09:01:00 XPTO 1.0 1.3 0.9 1.1 1
2017-01-01 09:02:00 XPTO 1.0 1.3 0.9 1.1 2
...
In popular relational databases it is trivial to aggregate in those buckets by querying (pseudo SQL below):
SELECT ...,FLOOR(offset / FLOOR(480 / n)) AS candle_id FROM prices WHERE symbol = 'XPTO' GROUP BY DAY(date), candle_id
But I couldn't figure it out how to do it in InfluxDB, since there is no functional FLOOR function.
For simple time aggregation influxDB has shown much better results than PostgreSQL and MySQL, in the order of ten times faster to respond. So it would be very nice to achieve the same work with it.
BONUS: actually the market opening time depends on the semester, so it opens at 9AM from January to June and at 10AM from July to December, so the bucket size changes depending on it.
Upvotes: 1
Views: 494
Reputation: 2699
Sadly, you can't. InfluxDB only group by tag, not by field or functions.
You have two options:
WHERE offset / 480 / n BETWEEN x AND y
bucket = 1
But perhaps the heatmap plugin of Grafana will represent your data correctly.
Upvotes: 0