Marcus Hughes
Marcus Hughes

Reputation: 5503

Querying the sum of last data of a point in time

I hope I can explain this well enough and my usecase. I have a bunch of data split by the column algo, each field has a different value and my goal is to get a sum of those values which are reported up to every 5 minutes.

This can sort of be achieved with the following:

> use testdb
Using database testdb
> SELECT sum("value") FROM "balance" group by time(5m) order by time desc limit 20
name: balance
-------------
time                    sum
1516268400000000000
1516268100000000000
1516267800000000000
1516267500000000000
1516267200000000000
1516266900000000000
1516266600000000000
1516266300000000000     0.00675834
1516266000000000000     0.00675797
1516265700000000000     0.00675691

This presents two issues.

  1. As you can see, I do not have entries after 1516266600000000000. For me this is expected behaviour, every subsequent sum should be of the closest entry it can find. All the blanks here should be 0.00675834.
  2. If any algo lacks a data point, the sum will not include that data.

Visualisation of what I am trying to achieve:

Time  | Algo a | Algo b | Algo c
------|--------|--------|---------
5     | 10     |        | 34
4     | 10     | 23     | 31
3     | 9      | 23     |
2     |        |        | 20
1     | 4      | 23     | 20

If I wanted to select by time 5, it would be the sum of 10, 23 (because that's the closest previous entry), and 34. 67.

If I wanted to select by time 3, it would be 9, 23, and 20. 52.

Summary

So first part of the question: How would I obtain such values?

The second part of the question: How do I extend that to find the delta of my two examples to output 15?

Example Dataset

> select * from "balance"
name: balance
-------------
time                    algo            value           mode
1514205000000000000     AlgoA           0.00000778      debug
1514205000000000000     AlgoB           0.00004702      debug
1514205000000000000     AlgoC           0.00002564      debug
1514205000000000000     AlgoD           0.00000335      debug
1514205000000000000     AlgoE           0.00014048      debug
1514205000000000000     AlgoG           0.00050431      debug
1514205000000000000     AlgoH           0.00000164      debug
1514205300000000000     AlgoA           0.00000778      debug
1514205300000000000     AlgoB           0.00004702      debug
1514205300000000000     AlgoC           0.00002564      debug
1514205300000000000     AlgoD           0.00000335      debug
1514205300000000000     AlgoF           0.00002664      debug
1514205300000000000     AlgoG           0.00050431      debug
1514205300000000000     AlgoH           0.00000164      debug
1514205600000000000     AlgoA           0.00000778      debug
1514205600000000000     AlgoB           0.00004702      debug
1514205600000000000     AlgoC           0.00002564      debug
1514205600000000000     AlgoD           0.00000335      debug
1514205600000000000     AlgoE           0.00014048      debug
1514205600000000000     AlgoF           0.00002664      debug
1514205600000000000     AlgoG           0.00050431      debug
1514205900000000000     AlgoA           0.00000778      debug
1514205900000000000     AlgoB           0.00004702      debug
1514205900000000000     AlgoC           0.00002564      debug
1514205900000000000     AlgoD           0.00000335      debug
1514205900000000000     AlgoE           0.00014048      debug
1514205900000000000     AlgoF           0.00002664      debug

Upvotes: 0

Views: 1347

Answers (1)

Yuri G
Yuri G

Reputation: 1213

Ok, let me come up with what it seems to be, and you'd confirm or correct these assumptions.

It looks like:

  1. You have a time series once in 5 min.

  2. Values in them are divided by algos

  3. There's expected to have 0 to 1 points per algo per series

  4. What you need - is the sum of all algos within this 5min series.

  5. Although, if there's no point within a series for particular algo - it should be substituted for last available value from previous series.

Is that correct?

UPDATE So it turns out to be correct.

Then try the following.

First, since you expect no more then one value with particular algo tag per series, you can safely apply LAST() to it as you group by time. The trick here is to use the fill() option for time grouping, like that:

SELECT LAST("value"), algo FROM "balance" WHERE time > now() - 6h GROUP BY time(5m) fill(previous)

That would, obviously, give you desired previous point value, if there's none in the series.

Next step, sum over the result of this query:

SELECT SUM(lv) FROM (SELECT LAST("value") as lv FROM "balance" WHERE time > now() - 6h GROUP BY time(5m) fill(previous)) WHERE time > now() - 6h GROUP BY time(5m)

That is going to give you answer to your first question. I hope. :-)

Upvotes: 1

Related Questions