Reputation: 805
Kusto allows me to create summarize statistics sliced on some column based on the top on rows of a table ordered by some rule. For example, if I want to compute the average Score of each Location using the last 100 rows, I can write
T | top 100 by Time | summarize avg(Score) by Location
But I want the top to apply to each bin of the summarize. For example, I want to compute the average of the last 10 Scores available for each Location. I don't see how I can do this. At first I thought I might be able to use top-nested by it only returns the top n Locations; I want to return records for all Locations, and use the top n of each to compute an aggregate. I am willing to do this via something that effectively repeats a query for each Location, but I need the final result to be in one table with columns Location and AvgOfLast10Scores.
Upvotes: 2
Views: 1124
Reputation: 3704
I recommend using partition by
then summarize
:
let T = datatable(Timestamp:datetime, Location:string, Score: int)
[
"01/01/2021", "Texas", 1,
"01/02/2021", "Texas", 2,
"01/03/2021", "Texas", 4,
"01/03/2021", "Alabama", 1,
"01/04/2021", "Alabama", 3,
"01/05/2021", "Alabama", 4,
];
T
| partition by Location (top 2 by Timestamp)
| summarize avg(Score) by Location
Results:
You might also be able to use top-nested
, but I find top-nested confusing. You can read more about partition in the MS documentation
Upvotes: 2
Reputation: 7608
Here is one option (in this example it will take the last two measurements):
datatable(Time:datetime, Location: string, Score:int)[datetime(2021-01-10), "a", 6,
datetime(2021-01-10 01:00), "a", 7,
datetime(2021-01-10 02:00), "a", 8,
datetime(2021-01-10 03:00), "a", 10,
datetime(2021-01-10), "b", 10,
datetime(2021-01-10 01:00), "b", 20,
datetime(2021-01-10 02:00), "b", 30,
datetime(2021-01-10 03:00), "b", 40]
| order by Location asc, Time desc
| extend rowNumber = row_number(0, Location != prev(Location))
| where rowNumber <= 1
| summarize avg(Score) by Location
Upvotes: 1