Reputation: 1909
I have a bucket where one field is a boolean
I'd like to count the number of true and the number of false for each hour
from(bucket: "xxx")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> window(every: 1h)
|> filter(fn: (r) => r["_measurement"] == "xxx")
|> filter(fn: (r) => r["_field"] == "myBoolField")
|> group(columns: ["_stop"])
Because this is issued from a cron that runs every minute (more or less), this will give something like :
table _start _stop _time _value otherfield1 otherfield2
0 2021-05-18T19:00:00 2021-05-18T20:00 2021-05-18T19:01 false xxx xxx
0 2021-05-18T19:00:00 2021-05-18T20:00 2021-05-18T19:02 true xxx xxx
0 2021-05-18T19:00:00 2021-05-18T20:00 2021-05-18T19:03 true xxx xxx
...
1 2021-05-18T20:00:00 2021-05-18T21:00 2021-05-18T20:01 false xxx xxx
1 2021-05-18T20:00:00 2021-05-18T21:00 2021-05-18T20:02 false xxx xxx
1 2021-05-18T20:00:00 2021-05-18T21:00 2021-05-18T20:03 false xxx xxx
...
Now, I'd like to count the total, the number of false and the number of true for each hour (so for each table) but without losing/dropping the other fields
So I'd like a structure like
table _stop _value nbFalse nbTrue otherfield1 otherfield2
0 2021-05-18T20:00 59 1 58 xxx xxx
1 2021-05-18T21:00 55 4 51 xxx xxx
I've tried many combinations of pivot, count, ... without success
From my understanding, the correct way to do is
drop _start and _time
duplicate _value into nbTrue and nbFalse
re-aggregate by _stop to keep only true in nbTrue and false in nbFalse
count the three columns _value, nbTrue and nbFalse
|> drop(columns: ["_start", "_time"]) |> duplicate(column: "_value", as: "nbTrue") |> duplicate(column: "_value", as: "nbFalse")
but I am stucked at step 3...
Upvotes: 2
Views: 1902
Reputation: 1909
Answer from @dskalec should work, I did not test it directly because, in the end, I needed to aggregate more than just the boolean field
here is my query, you can see that it is using the same aggregate+reduce (I just use a pivot before to have more than one field to aggregate)
from(bucket: "rt")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "xxx")
|> pivot(
rowKey:["_time"],
columnKey: ["_field"],
valueColumn: "_value"
)
|> window(every: 1h, createEmpty: true)
|> reduce(fn: (r, accumulator) => ({
nb: accumulator.nb + 1,
field1: accumulator.field1 + r["field1"], //field1 is an int
field2: if r["field2"] then accumulator.field2 + 1 else accumulator.field2, //field2 is a boolean
}),
identity: {nb: 0, field1: 0, field2: 0}
)
|> duplicate(column: "_stop", as: "_time")
|> drop(columns: ["_start", "_stop"])
Upvotes: 1
Reputation: 326
Didn't test it, but I have something similar to this on my mind:
from(bucket: "xxx")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "xxx")
|> filter(fn: (r) => r["_field"] == "myBoolField")
|> aggregateWindow(
every: 1h,
fn: (column, tables=<-) => tables |> reduce(
identity: {true_count: 0.0},
fn: (r, accumulator) => ({
true_count:
if r._value == true then accumulator.true_count + 1.0
else accumulator.true_count + 0.0
})
)
)
I got this from the docs and adjusted it a bit, I think it should get you what you need.
Upvotes: 2