Reputation: 1046
I have recently started working with Kusto. I am stuck with a use case where i need to confirm the approach i am taking is right.
I have data in the following format
In the above example, if the status is 1 and if the time frame is equal to 15 seconds then i need to assume it as 1 occurrence.
So in this case 2 occurrence of status. My approach was
if the current and next rows status is equal to 1 then take the time difference and do row_cum_sum and break it if the next(STATUS)!=0.
Even though the approach is giving me correct output, I am assuming the performance can slow down once the size is increased.
I am looking for an alternative approach if any. Also adding the complete scenario to reproduce this with a sample data.
.create-or-alter function with (folder = "Tests", skipvalidation = "true") InsertFakeTrue() {
range LoopTime from ago(365d) to now() step 6s
| project TIME=LoopTime,STATUS=toint(1)
}
.create-or-alter function with (folder = "Tests", skipvalidation = "true") InsertFakeFalse() {
range LoopTime from ago(365d) to now() step 29s
| project TIME=LoopTime,STATUS=toint(0)
}
.set-or-append FAKEDATA <| InsertFakeTrue();
.set-or-append FAKEDATA <| InsertFakeFalse();
FAKEDATA
| order by TIME asc
| serialize
| extend cstatus=STATUS
| extend nstatus=next(STATUS)
| extend WindowRowSum=row_cumsum(iff(nstatus ==1 and cstatus ==1, datetime_diff('second',next(TIME),TIME),0),cstatus !=1)
| extend windowCount=iff(nstatus !=1 or isnull(next(TIME)), iff(WindowRowSum ==15, 1,iff(WindowRowSum >15,(WindowRowSum/15)+((WindowRowSum%15)/15),0)),0 )
| summarize IDLE_COUNT=sum(windowCount)
Upvotes: 2
Views: 1103
Reputation: 271
The approach in the question is the way to achieve such calculations in Kusto and given that the logic requires sorting is also efficient (as long as the sorted data can reside on a single machine).
Regarding union operator - it runs in parallel by default, you can control the concurrency and spread using hints, see: union operator
Upvotes: 3